.../net/ethernet/meta/fbnic/fbnic_debugfs.c | 8 ++++--- .../net/ethernet/meta/fbnic/fbnic_fw_log.c | 21 +++++++++++++------ 2 files changed, 20 insertions(+), 9 deletions(-)
Fixes a theoretical race on fw_log between the teardown path and fw_log
show/write functions.
The fw_log null check was performed outside the lock in both
fbnic_dbg_fw_log_show() and fbnic_fw_log_write() (called from debugfs
paths or mailbox threaded IRQ handler). Concurrent teardown in
fbnic_fw_log_free() could clear and free the log buffer after the check
because there is no proper synchronization, leading to list traversal or
buffer access on freed memory.
fbnic_fw_log_write() can be reached from the mailbox handler
fbnic_fw_msix_intr(), but fbnic_fw_log_free() runs before IRQ/MBX
teardown. fbnic_dbg_fw_log_show() runs via debugfs and seems to be
not synchronized with removal either.
Possible Interleaving scenario:
CPU0: fbnic_dbg_fw_log_show() or fbnic_fw_log_write()
if (fbnic_fw_log_ready()) // true
... preempt ...
CPU1: fbnic_fw_log_free()
vfree(log->data_start);
log->data_start = NULL;
CPU0: continues, walks log->entries or writes to log->data_start
Move readiness checks under the fw_log spinlock, and clear the log state
under the same lock in fbnic_fw_log_free() before freeing the buffer.
This makes readers/writers mutually exclusive with teardown.
Signed-off-by: Chengfeng Ye <dg573847474@gmail.com>
---
.../net/ethernet/meta/fbnic/fbnic_debugfs.c | 8 ++++---
.../net/ethernet/meta/fbnic/fbnic_fw_log.c | 21 +++++++++++++------
2 files changed, 20 insertions(+), 9 deletions(-)
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_debugfs.c b/drivers/net/ethernet/meta/fbnic/fbnic_debugfs.c
index b7238dd967fe..4171dde590fc 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_debugfs.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_debugfs.c
@@ -176,11 +176,13 @@ static int fbnic_dbg_fw_log_show(struct seq_file *s, void *v)
struct fbnic_fw_log_entry *entry;
unsigned long flags;
- if (!fbnic_fw_log_ready(fbd))
- return -ENXIO;
-
spin_lock_irqsave(&fbd->fw_log.lock, flags);
+ if (!fbnic_fw_log_ready(fbd)) {
+ spin_unlock_irqrestore(&fbd->fw_log.lock, flags);
+ return -ENXIO;
+ }
+
list_for_each_entry_reverse(entry, &fbd->fw_log.entries, list) {
seq_printf(s, FBNIC_FW_LOG_FMT, entry->index,
(entry->timestamp / (MSEC_PER_SEC * 60 * 60 * 24)),
diff --git a/drivers/net/ethernet/meta/fbnic/fbnic_fw_log.c b/drivers/net/ethernet/meta/fbnic/fbnic_fw_log.c
index 85a883dba385..d371932435e5 100644
--- a/drivers/net/ethernet/meta/fbnic/fbnic_fw_log.c
+++ b/drivers/net/ethernet/meta/fbnic/fbnic_fw_log.c
@@ -59,16 +59,24 @@ int fbnic_fw_log_init(struct fbnic_dev *fbd)
void fbnic_fw_log_free(struct fbnic_dev *fbd)
{
struct fbnic_fw_log *log = &fbd->fw_log;
-
- if (!fbnic_fw_log_ready(fbd))
- return;
+ unsigned long flags;
+ void *data;
fbnic_fw_log_disable(fbd);
+
+ spin_lock_irqsave(&log->lock, flags);
+ if (!fbnic_fw_log_ready(fbd)) {
+ spin_unlock_irqrestore(&log->lock, flags);
+ return;
+ }
+ data = log->data_start;
INIT_LIST_HEAD(&log->entries);
log->size = 0;
- vfree(log->data_start);
log->data_start = NULL;
log->data_end = NULL;
+ spin_unlock_irqrestore(&log->lock, flags);
+
+ vfree(data);
}
int fbnic_fw_log_write(struct fbnic_dev *fbd, u64 index, u32 timestamp,
@@ -80,13 +88,14 @@ int fbnic_fw_log_write(struct fbnic_dev *fbd, u64 index, u32 timestamp,
unsigned long flags;
void *entry_end;
+ spin_lock_irqsave(&log->lock, flags);
+
if (!fbnic_fw_log_ready(fbd)) {
+ spin_unlock_irqrestore(&log->lock, flags);
dev_err(fbd->dev, "Firmware sent log entry without being requested!\n");
return -ENOSPC;
}
- spin_lock_irqsave(&log->lock, flags);
-
if (list_empty(&log->entries)) {
entry = log->data_start;
} else {
--
2.25.1
On 2/9/26 8:38 AM, Chengfeng Ye wrote: > Fixes a theoretical race on fw_log between the teardown path and fw_log > show/write functions. > > The fw_log null check was performed outside the lock in both > fbnic_dbg_fw_log_show() and fbnic_fw_log_write() (called from debugfs > paths or mailbox threaded IRQ handler). This was done intentionally. Initially the log buffer would only be allocated if DebugFS was enabled. If a bug in firmware caused a log to be sent the driver can reject it without the lock. > Concurrent teardown in > fbnic_fw_log_free() could clear and free the log buffer after the check > because there is no proper synchronization, leading to list traversal or > buffer access on freed memory. fbnic_fw_log_free() is only called when the driver is removed, after DebugFS has been disabled. Before freeing the buffer the driver sends an explicit message to firmware to stop sending new message. > > fbnic_fw_log_write() can be reached from the mailbox handler > fbnic_fw_msix_intr(), but fbnic_fw_log_free() runs before IRQ/MBX > teardown. fbnic_dbg_fw_log_show() runs via debugfs and seems to be > not synchronized with removal either. > > Possible Interleaving scenario: > CPU0: fbnic_dbg_fw_log_show() or fbnic_fw_log_write() > if (fbnic_fw_log_ready()) // true > ... preempt ... > CPU1: fbnic_fw_log_free() > vfree(log->data_start); > log->data_start = NULL; > CPU0: continues, walks log->entries or writes to log->data_start > > Move readiness checks under the fw_log spinlock, and clear the log state > under the same lock in fbnic_fw_log_free() before freeing the buffer. > This makes readers/writers mutually exclusive with teardown. I can see how this makes things safer but not sure the extra locking is truly needed. So far we have not seen issues in test or prod during driver removal. @Alex - Up to you if you think we should always lock when touching the buffer. Lee
© 2016 - 2026 Red Hat, Inc.