[PATCH] xfs: annotate lockless b_flags read in xfs_buf_lock

Cen Zhang posted 1 patch 6 days ago
fs/xfs/xfs_buf.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
[PATCH] xfs: annotate lockless b_flags read in xfs_buf_lock
Posted by Cen Zhang 6 days ago
xfs_buf_lock() reads bp->b_flags before acquiring the buffer semaphore
to check whether a stale, pinned buffer needs a log force:

    if (atomic_read(&bp->b_pin_count) && (bp->b_flags & XBF_STALE))

This races with xfs_trans_dirty_buf(), which modifies b_flags while
the buffer is locked by a transaction on another CPU.

The pre-semaphore check is a performance hint: if a stale pinned
buffer is detected, forcing the log avoids a long wait on the
semaphore.  Either outcome of the race is benign -- a false positive
triggers a harmless log force, and a false negative simply means the
caller blocks on the semaphore and the log force happens later.

Annotate the lockless read with READ_ONCE().

Fixes: ed3b4d6cdc81 ("xfs: Improve scalability of busy extent tracking")
Cc: stable@vger.kernel.org
Signed-off-by: Cen Zhang <zzzccc427@gmail.com>
---
 fs/xfs/xfs_buf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index d2f3c50d80e7..6819477307bd 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -988,7 +988,7 @@ xfs_buf_lock(
 {
 	trace_xfs_buf_lock(bp, _RET_IP_);
 
-	if (atomic_read(&bp->b_pin_count) && (bp->b_flags & XBF_STALE))
+	if (atomic_read(&bp->b_pin_count) && (READ_ONCE(bp->b_flags) & XBF_STALE))
 		xfs_log_force(bp->b_mount, 0);
 	down(&bp->b_sema);
 
-- 
2.34.1
Re: [PATCH] xfs: annotate lockless b_flags read in xfs_buf_lock
Posted by Dave Chinner 3 days, 13 hours ago
On Fri, Mar 27, 2026 at 09:11:52PM +0800, Cen Zhang wrote:
> xfs_buf_lock() reads bp->b_flags before acquiring the buffer semaphore
> to check whether a stale, pinned buffer needs a log force:
> 
>     if (atomic_read(&bp->b_pin_count) && (bp->b_flags & XBF_STALE))
> 
> This races with xfs_trans_dirty_buf(), which modifies b_flags while
> the buffer is locked by a transaction on another CPU.
> 
> The pre-semaphore check is a performance hint: if a stale pinned
> buffer is detected, forcing the log avoids a long wait on the
> semaphore.  Either outcome of the race is benign -- a false positive
> triggers a harmless log force, and a false negative simply means the
> caller blocks on the semaphore and the log force happens later.
> 
> Annotate the lockless read with READ_ONCE().

No. READ_ONCE should not be used to annotate a benign data access
race. data_race() should be used because all it does is turn off
KASAN for that access, and unlike READ_ONCE(), there is no code
change when KASAN is not enabled.

But, in reality, the race condition here is more than just the
b_flags access.  There is a big comment above the function explaining what
the check does, and that the buffer pinned check that precedes the
b_flags check can race with journal completion, too. SO, if we lose
the race and trigger the log force, then nothing bad happens, and
latency is no worse than if we won the race and triggered a log
force.

IOWs, adding READ_ONCE() to these check doesn't actually "annotate"
anything useful or informative - it's not paired with WRITE_ONCE()
anywhere (nor would we want to be doing that), and so it's just a
random, uncommented macro in the code that makes things harder to
read.

-Dave.
-- 
Dave Chinner
dgc@kernel.org
Re: [PATCH] xfs: annotate lockless b_flags read in xfs_buf_lock
Posted by Cen Zhang 3 days, 11 hours ago
Hi Dave,

> No. READ_ONCE should not be used to annotate a benign data access
> race. data_race() should be used because all it does is turn off
> KASAN for that access, and unlike READ_ONCE(), there is no code
> change when KASAN is not enabled.

Thank you for the review.  You're right -- data_race() is the correct
annotation for a known-benign race.  READ_ONCE() adds an unnecessary
compiler barrier and, without a paired WRITE_ONCE() on the write side,
is not the right pattern here.

> But, in reality, the race condition here is more than just the
> b_flags access.  There is a big comment above the function explaining what
> the check does, and that the buffer pinned check that precedes the
> b_flags check can race with journal completion, too.

Agreed.  The atomic_read() on b_pin_count is already KCSAN-safe, so
only the b_flags access needs a data_race() annotation to suppress
the KCSAN report.  The commit message in v2 now acknowledges that the
entire pre-semaphore check is racy by design.

I'll send a v2 using data_race().

Thanks,
Cen