[PATCH RESEND] locking/spinlock/debug: Fix data-race in do_raw_write_lock

Posted by A. Sverdlin 1 month, 1 week ago
From: Alexander Sverdlin <alexander.sverdlin@siemens.com>

KCSAN reports:

BUG: KCSAN: data-race in do_raw_write_lock / do_raw_write_lock

write (marked) to 0xffff800009cf504c of 4 bytes by task 1102 on cpu 1:
 do_raw_write_lock+0x120/0x204
 _raw_write_lock_irq
 do_exit
 call_usermodehelper_exec_async
 ret_from_fork

read to 0xffff800009cf504c of 4 bytes by task 1103 on cpu 0:
 do_raw_write_lock+0x88/0x204
 _raw_write_lock_irq
 do_exit
 call_usermodehelper_exec_async
 ret_from_fork

value changed: 0xffffffff -> 0x00000001

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 1103 Comm: kworker/u4:1 6.1.111

Commit 1a365e822372 ("locking/spinlock/debug: Fix various data races") has
addressed most of these races, but seems to be inconsistent and incomplete.

In do_raw_write_lock(), only the debug_write_lock_after() part has been
converted to WRITE_ONCE(); the debug_write_lock_before() part still reads
lock->owner and lock->owner_cpu with plain accesses. Convert these reads
to READ_ONCE().

Cc: stable@vger.kernel.org
Fixes: 1a365e822372 ("locking/spinlock/debug: Fix various data races")
Reported-by: Adrian Freihofer <adrian.freihofer@siemens.com>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com>
---
There are still some inconsistencies remaining IMO:
- lock->magic is sometimes accessed with READ_ONCE() even though it's only
being plain-written;
- debug_spin_unlock() and debug_write_unlock() both do WRITE_ONCE() on
lock->owner and lock->owner_cpu, but examine them with plain read accesses.
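
For illustration only, the marked-access pairing this patch aims for can
be sketched in userspace C. This is not kernel code: READ_ONCE() and
WRITE_ONCE() below are simplified volatile-cast stand-ins for the kernel
macros, and struct demo_rwlock, the demo_*() helpers and the DEMO_NO_*
values are hypothetical names that only mimic the owner/owner_cpu
bookkeeping in spinlock_debug.c.

```c
#include <stddef.h>

/*
 * Simplified stand-ins for the kernel macros (assumption: the real ones
 * do additional type checking). __typeof__ is a GNU C extension.
 */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical "nobody owns the lock" markers for this sketch. */
#define DEMO_NO_OWNER ((void *)-1L)
#define DEMO_NO_CPU   ((unsigned int)-1)

/* Hypothetical, stripped-down debug state of a write lock. */
struct demo_rwlock {
	void *owner;            /* current holder, or DEMO_NO_OWNER */
	unsigned int owner_cpu; /* CPU of the holder, or DEMO_NO_CPU */
};

/*
 * Recursion check done before the lock is taken. The caller does not
 * hold the lock yet, so another CPU may be updating both fields
 * concurrently: read them with READ_ONCE().
 * Returns 1 if the caller (or its CPU) already appears as owner.
 */
static int demo_write_lock_before(struct demo_rwlock *lock,
				  void *self, unsigned int cpu)
{
	if (READ_ONCE(lock->owner) == self)
		return 1;
	if (READ_ONCE(lock->owner_cpu) == cpu)
		return 1;
	return 0;
}

/* Bookkeeping after the lock is taken: marked writes. */
static void demo_write_lock_after(struct demo_rwlock *lock,
				  void *self, unsigned int cpu)
{
	WRITE_ONCE(lock->owner_cpu, cpu);
	WRITE_ONCE(lock->owner, self);
}

/* Bookkeeping before the lock is released: marked writes. */
static void demo_write_unlock(struct demo_rwlock *lock)
{
	WRITE_ONCE(lock->owner, DEMO_NO_OWNER);
	WRITE_ONCE(lock->owner_cpu, DEMO_NO_CPU);
}
```

The point of the sketch is only that every lockless read of a field is
marked whenever the concurrent writes to it are marked. The 4-byte value
change 0xffffffff -> 0x00000001 in the report would be consistent with
owner_cpu going from its unowned value to CPU 1, but that reading is an
assumption.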

 kernel/locking/spinlock_debug.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/locking/spinlock_debug.c b/kernel/locking/spinlock_debug.c
index 87b03d2e41dbb..2338b3adfb55f 100644
--- a/kernel/locking/spinlock_debug.c
+++ b/kernel/locking/spinlock_debug.c
@@ -184,8 +184,8 @@ void do_raw_read_unlock(rwlock_t *lock)
 static inline void debug_write_lock_before(rwlock_t *lock)
 {
 	RWLOCK_BUG_ON(lock->magic != RWLOCK_MAGIC, lock, "bad magic");
-	RWLOCK_BUG_ON(lock->owner == current, lock, "recursion");
-	RWLOCK_BUG_ON(lock->owner_cpu == raw_smp_processor_id(),
+	RWLOCK_BUG_ON(READ_ONCE(lock->owner) == current, lock, "recursion");
+	RWLOCK_BUG_ON(READ_ONCE(lock->owner_cpu) == raw_smp_processor_id(),
 							lock, "cpu recursion");
 }
 
-- 
2.47.1
Re: [PATCH RESEND] locking/spinlock/debug: Fix data-race in do_raw_write_lock
Posted by Paul E. McKenney 1 month, 1 week ago
On Tue, Aug 26, 2025 at 12:27:27PM +0200, A. Sverdlin wrote:
> From: Alexander Sverdlin <alexander.sverdlin@siemens.com>
> 
> KCSAN reports:
> 
> BUG: KCSAN: data-race in do_raw_write_lock / do_raw_write_lock
> 
> write (marked) to 0xffff800009cf504c of 4 bytes by task 1102 on cpu 1:
>  do_raw_write_lock+0x120/0x204
>  _raw_write_lock_irq
>  do_exit
>  call_usermodehelper_exec_async
>  ret_from_fork
> 
> read to 0xffff800009cf504c of 4 bytes by task 1103 on cpu 0:
>  do_raw_write_lock+0x88/0x204
>  _raw_write_lock_irq
>  do_exit
>  call_usermodehelper_exec_async
>  ret_from_fork
> 
> value changed: 0xffffffff -> 0x00000001
> 
> Reported by Kernel Concurrency Sanitizer on:
> CPU: 0 PID: 1103 Comm: kworker/u4:1 6.1.111
> 
> Commit 1a365e822372 ("locking/spinlock/debug: Fix various data races") has
> addressed most of these races, but seems to be inconsistent and incomplete.
> 
> In do_raw_write_lock(), only the debug_write_lock_after() part has been
> converted to WRITE_ONCE(); the debug_write_lock_before() part still reads
> lock->owner and lock->owner_cpu with plain accesses. Convert these reads
> to READ_ONCE().
> 
> Cc: stable@vger.kernel.org
> Fixes: 1a365e822372 ("locking/spinlock/debug: Fix various data races")
> Reported-by: Adrian Freihofer <adrian.freihofer@siemens.com>
> Acked-by: Waiman Long <longman@redhat.com>
> Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com>

Reviewed-by: Paul E. McKenney <paulmck@kernel.org>

Re: [PATCH RESEND] locking/spinlock/debug: Fix data-race in do_raw_write_lock
Posted by Boqun Feng 3 weeks, 4 days ago
On Tue, Aug 26, 2025 at 04:44:30AM -0700, Paul E. McKenney wrote:
> On Tue, Aug 26, 2025 at 12:27:27PM +0200, A. Sverdlin wrote:
> > From: Alexander Sverdlin <alexander.sverdlin@siemens.com>
> > 
> > KCSAN reports:
> > 
> > BUG: KCSAN: data-race in do_raw_write_lock / do_raw_write_lock
> > 
> > write (marked) to 0xffff800009cf504c of 4 bytes by task 1102 on cpu 1:
> >  do_raw_write_lock+0x120/0x204
> >  _raw_write_lock_irq
> >  do_exit
> >  call_usermodehelper_exec_async
> >  ret_from_fork
> > 
> > read to 0xffff800009cf504c of 4 bytes by task 1103 on cpu 0:
> >  do_raw_write_lock+0x88/0x204
> >  _raw_write_lock_irq
> >  do_exit
> >  call_usermodehelper_exec_async
> >  ret_from_fork
> > 
> > value changed: 0xffffffff -> 0x00000001
> > 
> > Reported by Kernel Concurrency Sanitizer on:
> > CPU: 0 PID: 1103 Comm: kworker/u4:1 6.1.111
> > 
> > Commit 1a365e822372 ("locking/spinlock/debug: Fix various data races") has
> > addressed most of these races, but seems to be inconsistent and incomplete.
> > 
> > In do_raw_write_lock(), only the debug_write_lock_after() part has been
> > converted to WRITE_ONCE(); the debug_write_lock_before() part still reads
> > lock->owner and lock->owner_cpu with plain accesses. Convert these reads
> > to READ_ONCE().
> > 
> > Cc: stable@vger.kernel.org
> > Fixes: 1a365e822372 ("locking/spinlock/debug: Fix various data races")
> > Reported-by: Adrian Freihofer <adrian.freihofer@siemens.com>
> > Acked-by: Waiman Long <longman@redhat.com>
> > Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com>
> 
> Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
> 

Thank you, I will queue this for future testing and reviews.

Alexander, is there any link to a KCSAN splat that we can use? Thanks!

Regards,
Boqun
