[PATCH] fbnic: close fw_log race between users and teardown

Chengfeng Ye posted 1 patch 11 hours ago
.../net/ethernet/meta/fbnic/fbnic_debugfs.c   |  8 ++++---
.../net/ethernet/meta/fbnic/fbnic_fw_log.c    | 21 +++++++++++++------
2 files changed, 20 insertions(+), 9 deletions(-)
[PATCH] fbnic: close fw_log race between users and teardown
Posted by Chengfeng Ye 11 hours ago
Fixes a theoretical race on fw_log between the teardown path and fw_log
show/write functions.

The fw_log null check was performed outside the lock in both
fbnic_dbg_fw_log_show() and fbnic_fw_log_write() (called from debugfs
paths or mailbox threaded IRQ handler). Concurrent teardown in
fbnic_fw_log_free() could clear and free the log buffer after the check
because there is no proper synchronization, leading to list traversal or
buffer access on freed memory.

fbnic_fw_log_write() can be reached from the mailbox handler
fbnic_fw_msix_intr(), but fbnic_fw_log_free() runs before IRQ/MBX
teardown. fbnic_dbg_fw_log_show() runs via debugfs and seems to be
not synchronized with removal either.

Possible Interleaving scenario:
  CPU0: fbnic_dbg_fw_log_show() or fbnic_fw_log_write()
          if (fbnic_fw_log_ready())   // true
          ... preempt ...
  CPU1: fbnic_fw_log_free()
          vfree(log->data_start);
          log->data_start = NULL;
  CPU0: continues, walks log->entries or writes to log->data_start

Move readiness checks under the fw_log spinlock, and clear the log state
under the same lock in fbnic_fw_log_free() before freeing the buffer.
This makes readers/writers mutually exclusive with teardown.

Signed-off-by: Chengfeng Ye <dg573847474@gmail.com>
---
 .../net/ethernet/meta/fbnic/fbnic_debugfs.c   |  8 ++++---
 .../net/ethernet/meta/fbnic/fbnic_fw_log.c    | 21 +++++++++++++------
 2 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_debugfs.c b/drivers/net/ethernet/meta/fbnic/fbnic_debugfs.c
index b7238dd967fe..4171dde590fc 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_debugfs.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_debugfs.c
@@ -176,11 +176,13 @@ static int fbnic_dbg_fw_log_show(struct seq_file *s, void *v)
 	struct fbnic_fw_log_entry *entry;
 	unsigned long flags;
 
-	if (!fbnic_fw_log_ready(fbd))
-		return -ENXIO;
-
 	spin_lock_irqsave(&fbd->fw_log.lock, flags);
 
+	if (!fbnic_fw_log_ready(fbd)) {
+		spin_unlock_irqrestore(&fbd->fw_log.lock, flags);
+		return -ENXIO;
+	}
+
 	list_for_each_entry_reverse(entry, &fbd->fw_log.entries, list) {
 		seq_printf(s, FBNIC_FW_LOG_FMT, entry->index,
 			   (entry->timestamp / (MSEC_PER_SEC * 60 * 60 * 24)),
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_fw_log.c b/drivers/net/ethernet/meta/fbnic/fbnic_fw_log.c
index 85a883dba385..d371932435e5 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_fw_log.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_fw_log.c
@@ -59,16 +59,24 @@ int fbnic_fw_log_init(struct fbnic_dev *fbd)
 void fbnic_fw_log_free(struct fbnic_dev *fbd)
 {
 	struct fbnic_fw_log *log = &fbd->fw_log;
-
-	if (!fbnic_fw_log_ready(fbd))
-		return;
+	unsigned long flags;
+	void *data;
 
 	fbnic_fw_log_disable(fbd);
+
+	spin_lock_irqsave(&log->lock, flags);
+	if (!fbnic_fw_log_ready(fbd)) {
+		spin_unlock_irqrestore(&log->lock, flags);
+		return;
+	}
+	data = log->data_start;
 	INIT_LIST_HEAD(&log->entries);
 	log->size = 0;
-	vfree(log->data_start);
 	log->data_start = NULL;
 	log->data_end = NULL;
+	spin_unlock_irqrestore(&log->lock, flags);
+
+	vfree(data);
 }
 
 int fbnic_fw_log_write(struct fbnic_dev *fbd, u64 index, u32 timestamp,
@@ -80,13 +88,14 @@ int fbnic_fw_log_write(struct fbnic_dev *fbd, u64 index, u32 timestamp,
 	unsigned long flags;
 	void *entry_end;
 
+	spin_lock_irqsave(&log->lock, flags);
+
 	if (!fbnic_fw_log_ready(fbd)) {
+		spin_unlock_irqrestore(&log->lock, flags);
 		dev_err(fbd->dev, "Firmware sent log entry without being requested!\n");
 		return -ENOSPC;
 	}
 
-	spin_lock_irqsave(&log->lock, flags);
-
 	if (list_empty(&log->entries)) {
 		entry = log->data_start;
 	} else {
-- 
2.25.1
Re: [PATCH] fbnic: close fw_log race between users and teardown
Posted by Lee Trager an hour ago
On 2/9/26 8:38 AM, Chengfeng Ye wrote:

> Fixes a theoretical race on fw_log between the teardown path and fw_log
> show/write functions.
>
> The fw_log null check was performed outside the lock in both
> fbnic_dbg_fw_log_show() and fbnic_fw_log_write() (called from debugfs
> paths or mailbox threaded IRQ handler).
This was done intentionally. Initially the log buffer would only be 
allocated if DebugFS was enabled. If a bug in firmware caused a log to 
be sent the driver can reject it without the lock.
> Concurrent teardown in
> fbnic_fw_log_free() could clear and free the log buffer after the check
> because there is no proper synchronization, leading to list traversal or
> buffer access on freed memory.

fbnic_fw_log_free() is only called when the driver is removed, after 
DebugFS has been disabled. Before freeing the buffer the driver sends an 
explicit message to firmware to stop sending new message.

>
> fbnic_fw_log_write() can be reached from the mailbox handler
> fbnic_fw_msix_intr(), but fbnic_fw_log_free() runs before IRQ/MBX
> teardown. fbnic_dbg_fw_log_show() runs via debugfs and seems to be
> not synchronized with removal either.
>
> Possible Interleaving scenario:
>    CPU0: fbnic_dbg_fw_log_show() or fbnic_fw_log_write()
>            if (fbnic_fw_log_ready())   // true
>            ... preempt ...
>    CPU1: fbnic_fw_log_free()
>            vfree(log->data_start);
>            log->data_start = NULL;
>    CPU0: continues, walks log->entries or writes to log->data_start
>
> Move readiness checks under the fw_log spinlock, and clear the log state
> under the same lock in fbnic_fw_log_free() before freeing the buffer.
> This makes readers/writers mutually exclusive with teardown.
I can see how this makes things safer but not sure the extra locking is 
truly needed. So far we have not seen issues in test or prod during 
driver removal.

@Alex - Up to you if you think we should always lock when touching the 
buffer.

Lee