[PATCH] nvme-pci: free irq properly when cannot create adminq

Tong Zhang posted 1 patch 2 years, 8 months ago
[PATCH] nvme-pci: free irq properly when cannot create adminq
Posted by Tong Zhang 2 years, 8 months ago
nvme_pci_configure_admin_queue() can return -ENODEV. In that case we
need to free the IRQ vectors properly; otherwise the following warnings
can be triggered:

[    5.286752] WARNING: CPU: 0 PID: 33 at kernel/irq/irqdomain.c:253 irq_domain_remove+0x12d/0x140
[    5.290547] Call Trace:
[    5.290626]  <TASK>
[    5.290695]  msi_remove_device_irq_domain+0xc9/0xf0
[    5.290843]  msi_device_data_release+0x15/0x80
[    5.290978]  release_nodes+0x58/0x90
[    5.293788] WARNING: CPU: 0 PID: 33 at kernel/irq/msi.c:276 msi_device_data_release+0x76/0x80
[    5.297573] Call Trace:
[    5.297651]  <TASK>
[    5.297719]  release_nodes+0x58/0x90
[    5.297831]  devres_release_all+0xef/0x140
[    5.298339]  device_unbind_cleanup+0x11/0xc0
[    5.298479]  really_probe+0x296/0x320

Fixes: a6ee7f19ebfd ("nvme-pci: call nvme_pci_configure_admin_queue from nvme_pci_enable")
Signed-off-by: Tong Zhang <ztong0001@gmail.com>
---
 drivers/nvme/host/pci.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index f0f8027644bb..1fc2a2e130ab 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2584,8 +2584,13 @@ static int nvme_pci_enable(struct nvme_dev *dev)
 	pci_enable_pcie_error_reporting(pdev);
 	pci_save_state(pdev);
 
-	return nvme_pci_configure_admin_queue(dev);
+	result = nvme_pci_configure_admin_queue(dev);
+	if (result)
+		goto free_irq;
+	return result;
 
+ free_irq:
+	pci_free_irq_vectors(pdev);
  disable:
 	pci_disable_device(pdev);
 	return result;
-- 
2.25.1
Re: [PATCH] nvme-pci: free irq properly when cannot create adminq
Posted by Keith Busch 2 years, 8 months ago
On Wed, Dec 28, 2022 at 10:05:49PM -0800, Tong Zhang wrote:
> nvme_pci_configure_admin_queue could return -ENODEV, in this case, we
> will need to free IRQ properly. Otherwise following warning could be
> triggered
> 
> [    5.286752] WARNING: CPU: 0 PID: 33 at kernel/irq/irqdomain.c:253 irq_domain_remove+0x12d/0x140
> [    5.290547] Call Trace:
> [    5.290626]  <TASK>
> [    5.290695]  msi_remove_device_irq_domain+0xc9/0xf0
> [    5.290843]  msi_device_data_release+0x15/0x80
> [    5.290978]  release_nodes+0x58/0x90
> [    5.293788] WARNING: CPU: 0 PID: 33 at kernel/irq/msi.c:276 msi_device_data_release+0x76/0x80
> [    5.297573] Call Trace:
> [    5.297651]  <TASK>
> [    5.297719]  release_nodes+0x58/0x90
> [    5.297831]  devres_release_all+0xef/0x140
> [    5.298339]  device_unbind_cleanup+0x11/0xc0
> [    5.298479]  really_probe+0x296/0x320
> 
> Fixes: a6ee7f19ebfd ("nvme-pci: call nvme_pci_configure_admin_queue from nvme_pci_enable")

Right. It's really only needed when called from probe, since reset_work
handles the cleanup itself when the call comes from there, but this is
safe for both cases.

> @@ -2584,8 +2584,13 @@ static int nvme_pci_enable(struct nvme_dev *dev)
>  	pci_enable_pcie_error_reporting(pdev);
>  	pci_save_state(pdev);
>  
> -	return nvme_pci_configure_admin_queue(dev);
> +	result = nvme_pci_configure_admin_queue(dev);
> +	if (result)
> +		goto free_irq;
> +	return result;

Since you're already in this function, you should also add a "goto
disable" if pci_alloc_irq_vectors() fails. Right now it just returns
with the pci device still enabled, and it won't get disabled from probe.
  
> + free_irq:
> +	pci_free_irq_vectors(pdev);
>   disable:
>  	pci_disable_device(pdev);
>  	return result;
> --
Re: [PATCH] nvme-pci: free irq properly when cannot create adminq
Posted by Tong Zhang 2 years, 8 months ago
On Thu, Dec 29, 2022 at 10:04 AM Keith Busch <kbusch@kernel.org> wrote:
>
> On Wed, Dec 28, 2022 at 10:05:49PM -0800, Tong Zhang wrote:
> > nvme_pci_configure_admin_queue could return -ENODEV, in this case, we
> > will need to free IRQ properly. Otherwise following warning could be
> > triggered
> >
> > [    5.286752] WARNING: CPU: 0 PID: 33 at kernel/irq/irqdomain.c:253 irq_domain_remove+0x12d/0x140
> > [    5.290547] Call Trace:
> > [    5.290626]  <TASK>
> > [    5.290695]  msi_remove_device_irq_domain+0xc9/0xf0
> > [    5.290843]  msi_device_data_release+0x15/0x80
> > [    5.290978]  release_nodes+0x58/0x90
> > [    5.293788] WARNING: CPU: 0 PID: 33 at kernel/irq/msi.c:276 msi_device_data_release+0x76/0x80
> > [    5.297573] Call Trace:
> > [    5.297651]  <TASK>
> > [    5.297719]  release_nodes+0x58/0x90
> > [    5.297831]  devres_release_all+0xef/0x140
> > [    5.298339]  device_unbind_cleanup+0x11/0xc0
> > [    5.298479]  really_probe+0x296/0x320
> >
> > Fixes: a6ee7f19ebfd ("nvme-pci: call nvme_pci_configure_admin_queue from nvme_pci_enable")
>
> Right. It's really only needed when called from probe as the reset_work
> handles the cleanup when called from there, but this is safe for both
> cases.
>
> > @@ -2584,8 +2584,13 @@ static int nvme_pci_enable(struct nvme_dev *dev)
> >       pci_enable_pcie_error_reporting(pdev);
> >       pci_save_state(pdev);
> >
> > -     return nvme_pci_configure_admin_queue(dev);
> > +     result = nvme_pci_configure_admin_queue(dev);
> > +     if (result)
> > +             goto free_irq;
> > +     return result;
>
> Since you're already in this function, you should also add a "goto
> disable" if pci_alloc_irq_vectors() fails. Right now it just returns
> with the pci device still enabled, and it won't get disabled from probe.
>

Thank you Keith! I have added this fix and sent a v2.
- Tong

> > + free_irq:
> > +     pci_free_irq_vectors(pdev);
> >   disable:
> >       pci_disable_device(pdev);
> >       return result;
> > --
[PATCH v2] nvme-pci: fix error handling in nvme_pci_enable()
Posted by Tong Zhang 2 years, 8 months ago
There are two issues in nvme_pci_enable():
1) If pci_alloc_irq_vectors() fails, the device is left enabled. Fix this by
adding a goto disable statement.
2) nvme_pci_configure_admin_queue() can return -ENODEV. In that case we
need to free the IRQ vectors properly; otherwise the following warnings
can be triggered:

[    5.286752] WARNING: CPU: 0 PID: 33 at kernel/irq/irqdomain.c:253 irq_domain_remove+0x12d/0x140
[    5.290547] Call Trace:
[    5.290626]  <TASK>
[    5.290695]  msi_remove_device_irq_domain+0xc9/0xf0
[    5.290843]  msi_device_data_release+0x15/0x80
[    5.290978]  release_nodes+0x58/0x90
[    5.293788] WARNING: CPU: 0 PID: 33 at kernel/irq/msi.c:276 msi_device_data_release+0x76/0x80
[    5.297573] Call Trace:
[    5.297651]  <TASK>
[    5.297719]  release_nodes+0x58/0x90
[    5.297831]  devres_release_all+0xef/0x140
[    5.298339]  device_unbind_cleanup+0x11/0xc0
[    5.298479]  really_probe+0x296/0x320

Fixes: a6ee7f19ebfd ("nvme-pci: call nvme_pci_configure_admin_queue from nvme_pci_enable")
Co-developed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Tong Zhang <ztong0001@gmail.com>
---
v2: handle pci_alloc_irq_vectors() error

 drivers/nvme/host/pci.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index f0f8027644bb..3255e7a6f643 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2530,7 +2530,7 @@ static int nvme_pci_enable(struct nvme_dev *dev)
 	 */
 	result = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_ALL_TYPES);
 	if (result < 0)
-		return result;
+		goto disable;
 
 	dev->ctrl.cap = lo_hi_readq(dev->bar + NVME_REG_CAP);
 
@@ -2584,8 +2584,13 @@ static int nvme_pci_enable(struct nvme_dev *dev)
 	pci_enable_pcie_error_reporting(pdev);
 	pci_save_state(pdev);
 
-	return nvme_pci_configure_admin_queue(dev);
+	result = nvme_pci_configure_admin_queue(dev);
+	if (result)
+		goto free_irq;
+	return result;
 
+ free_irq:
+	pci_free_irq_vectors(pdev);
  disable:
 	pci_disable_device(pdev);
 	return result;
-- 
2.25.1
Re: [PATCH v2] nvme-pci: fix error handling in nvme_pci_enable()
Posted by Christoph Hellwig 2 years, 8 months ago
Thanks,

applied to nvme-6.2.
Re: [PATCH v2] nvme-pci: fix error handling in nvme_pci_enable()
Posted by Keith Busch 2 years, 8 months ago
Looks good.

Reviewed-by: Keith Busch <kbusch@kernel.org>
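
The unwind order that v2 sets up in nvme_pci_enable() can be sketched in user space roughly as follows. This is only an illustration, not the driver code: every mock_* name and struct mock_dev are hypothetical stand-ins for the PCI calls discussed above, and the error values are assumed for the example. The point is that each label undoes one step and falls through to the next, so a failure after IRQ allocation frees the vectors and then disables the device, while a failure of the allocation itself only disables the device.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device state; not a kernel structure. */
struct mock_dev {
	bool enabled;		/* set by mock_enable_device() */
	bool irq_allocated;	/* set by mock_alloc_irq_vectors() */
	bool fail_alloc_irq;	/* force the IRQ allocation step to fail */
	bool fail_adminq;	/* force the admin queue step to fail */
};

static int mock_enable_device(struct mock_dev *d)
{
	d->enabled = true;
	return 0;
}

static void mock_disable_device(struct mock_dev *d)
{
	d->enabled = false;
}

static int mock_alloc_irq_vectors(struct mock_dev *d)
{
	if (d->fail_alloc_irq)
		return -ENOMEM;	/* assumed error code for the sketch */
	d->irq_allocated = true;
	return 1;		/* number of vectors allocated */
}

static void mock_free_irq_vectors(struct mock_dev *d)
{
	/* Freeing vectors that were never allocated would be a bug here. */
	assert(d->irq_allocated);
	d->irq_allocated = false;
}

static int mock_configure_admin_queue(struct mock_dev *d)
{
	return d->fail_adminq ? -ENODEV : 0;
}

/* Mirrors the shape of the v2 error handling: labels unwind in reverse order. */
static int mock_pci_enable(struct mock_dev *d)
{
	int result;

	result = mock_enable_device(d);
	if (result)
		return result;

	result = mock_alloc_irq_vectors(d);
	if (result < 0)
		goto disable;	/* nothing to free yet, but the device is enabled */

	result = mock_configure_admin_queue(d);
	if (result)
		goto free_irq;	/* vectors exist and must be released first */
	return 0;

free_irq:
	mock_free_irq_vectors(d);
disable:
	mock_disable_device(d);
	return result;
}
```

Calling mock_pci_enable() on a struct mock_dev with fail_adminq set should return -ENODEV and leave both enabled and irq_allocated false; with fail_alloc_irq set it should return the allocation error and leave the device disabled.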