[PATCH v2] fbnic: close fw_log race between users and teardown

Chengfeng Ye posted 1 patch 1 month, 2 weeks ago
There is a newer version of this series
.../net/ethernet/meta/fbnic/fbnic_fw_log.c    |  3 ---
drivers/net/ethernet/meta/fbnic/fbnic_pci.c   | 19 ++++++++++++-------
2 files changed, 12 insertions(+), 10 deletions(-)
[PATCH v2] fbnic: close fw_log race between users and teardown
Posted by Chengfeng Ye 1 month, 2 weeks ago
Fixes a theoretical race on fw_log between the teardown path and fw_log
write functions.

fw_log is written inside fbnic_fw_log_write() and can be reached from
the mailbox handler fbnic_fw_msix_intr(), but fw_log is freed before
IRQ/MBX teardown during cleanup, resulting in a potential data race of
dereferencing a freed/null variable.

Possible Interleaving scenario:
  CPU0: fbnic_fw_log_write()
          if (fbnic_fw_log_ready())   // true
          ... preempt ...
  CPU1: fbnic_fw_log_free()
          vfree(log->data_start);
          log->data_start = NULL;
  CPU0: continues, walks log->entries or writes to log->data_start

The initialization also has an incorrect order problem, as the fw_log
is currently allocated after MBX setup during initialization.
Fix the problems by adjusting the synchronization order to put
initialization in place before the mailbox is enabled, and not cleared
until after the mailbox has been disabled.

Signed-off-by: Chengfeng Ye <dg573847474@gmail.com>
---
Changes in v2:
  - Adjust synchronization instead of using lock protection
  - Also fix problem in initialization
--
 .../net/ethernet/meta/fbnic/fbnic_fw_log.c    |  3 ---
 drivers/net/ethernet/meta/fbnic/fbnic_pci.c   | 19 ++++++++++++-------
 2 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_fw_log.c b/drivers/net/ethernet/meta/fbnic/fbnic_fw_log.c
index 85a883dba385..d8a9a7d7c237 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_fw_log.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_fw_log.c
@@ -51,8 +51,6 @@ int fbnic_fw_log_init(struct fbnic_dev *fbd)
 	log->data_start = data;
 	log->data_end = data + FBNIC_FW_LOG_SIZE;
 
-	fbnic_fw_log_enable(fbd, true);
-
 	return 0;
 }
 
@@ -63,7 +61,6 @@ void fbnic_fw_log_free(struct fbnic_dev *fbd)
 	if (!fbnic_fw_log_ready(fbd))
 		return;
 
-	fbnic_fw_log_disable(fbd);
 	INIT_LIST_HEAD(&log->entries);
 	log->size = 0;
 	vfree(log->data_start);
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_pci.c b/drivers/net/ethernet/meta/fbnic/fbnic_pci.c
index 9240673c7533..e92187bc1c0f 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_pci.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_pci.c
@@ -307,11 +307,17 @@ static int fbnic_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 		goto free_irqs;
 	}
 
+	err = fbnic_fw_log_init(fbd);
+	if (err)
+		dev_warn(fbd->dev,
+			 "Unable to initialize firmware log buffer: %d\n",
+			 err);
+
 	err = fbnic_fw_request_mbx(fbd);
 	if (err) {
 		dev_err(&pdev->dev,
 			"Firmware mailbox initialization failure\n");
-		goto free_irqs;
+		goto free_fw_log;
 	}
 
 	/* Send the request to enable the FW logging to host. Note if this
@@ -319,11 +325,7 @@ static int fbnic_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	 * possible the FW is just too old to support the logging and needs
 	 * to be updated.
 	 */
-	err = fbnic_fw_log_init(fbd);
-	if (err)
-		dev_warn(fbd->dev,
-			 "Unable to initialize firmware log buffer: %d\n",
-			 err);
+	fbnic_fw_log_enable(fbd, true);
 
 	fbnic_devlink_register(fbd);
 	fbnic_devlink_otp_check(fbd, "error detected during probe");
@@ -370,6 +372,8 @@ static int fbnic_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	  * firmware updates for fixes.
 	  */
 	return 0;
+free_fw_log:
+	fbnic_fw_log_free(fbd);
 free_irqs:
 	fbnic_free_irqs(fbd);
 err_destroy_health:
@@ -404,8 +408,9 @@ static void fbnic_remove(struct pci_dev *pdev)
 	fbnic_hwmon_unregister(fbd);
 	fbnic_dbg_fbd_exit(fbd);
 	fbnic_devlink_unregister(fbd);
-	fbnic_fw_log_free(fbd);
+	fbnic_fw_log_disable(fbd);
 	fbnic_fw_free_mbx(fbd);
+	fbnic_fw_log_free(fbd);
 	fbnic_free_irqs(fbd);
 
 	fbnic_devlink_health_destroy(fbd);
-- 
2.25.1
Re: [PATCH v2] fbnic: close fw_log race between users and teardown
Posted by Jakub Kicinski 1 month, 2 weeks ago
On Wed, 11 Feb 2026 15:00:22 +0000 Chengfeng Ye wrote:
> Fixes a theoretical race on fw_log between the teardown path and fw_log
> write functions.
> 
> fw_log is written inside fbnic_fw_log_write() and can be reached from
> the mailbox handler fbnic_fw_msix_intr(), but fw_log is freed before
> IRQ/MBX teardown during cleanup, resulting in a potential data race of
> dereferencing a freed/null variable.
> 
> Possible Interleaving scenario:
>   CPU0: fbnic_fw_log_write()
>           if (fbnic_fw_log_ready())   // true
>           ... preempt ...
>   CPU1: fbnic_fw_log_free()
>           vfree(log->data_start);
>           log->data_start = NULL;
>   CPU0: continues, walks log->entries or writes to log->data_start

Could you be more specific about the entry points for CPU0 and CPU1?

> The initialization also has an incorrect order problem, as the fw_log
> is currently allocated after MBX setup during initialization.
> Fix the problems by adjusting the synchronization order to put
> initialization in place before the mailbox is enabled, and not cleared
> until after the mailbox has been disabled.

Since this is a fix please add a Fixes tag pointing to where the buggy
code was added.
-- 
pw-bot: cr
Re: [PATCH v2] fbnic: close fw_log race between users and teardown
Posted by Chengfeng Ye 1 month, 2 weeks ago
On Thu, Feb 12, 2026 at 2:46 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Wed, 11 Feb 2026 15:00:22 +0000 Chengfeng Ye wrote:
> > Fixes a theoretical race on fw_log between the teardown path and fw_log
> > write functions.
> >
> > fw_log is written inside fbnic_fw_log_write() and can be reached from
> > the mailbox handler fbnic_fw_msix_intr(), but fw_log is freed before
> > IRQ/MBX teardown during cleanup, resulting in a potential data race of
> > dereferencing a freed/null variable.
> >
> > Possible Interleaving scenario:
> >   CPU0: fbnic_fw_log_write()
> >           if (fbnic_fw_log_ready())   // true
> >           ... preempt ...
> >   CPU1: fbnic_fw_log_free()
> >           vfree(log->data_start);
> >           log->data_start = NULL;
> >   CPU0: continues, walks log->entries or writes to log->data_start
>
> Could you be more specific about the entry points for CPU0 and CPU1?
>
> > The initialization also has an incorrect order problem, as the fw_log
> > is currently allocated after MBX setup during initialization.
> > Fix the problems by adjusting the synchronization order to put
> > initialization in place before the mailbox is enabled, and not cleared
> > until after the mailbox has been disabled.
>
> Since this is a fix please add a Fixes tag pointing to where the buggy
> code was added.
> --
> pw-bot: cr

No problem, with issues resolved in the v3 patch just sent.

Thanks,
Chengfeng