[PATCH] locking/spinlock/debug: Fix data-race in do_raw_write_lock

A. Sverdlin posted 1 patch 1 year ago
There is a newer version of this series
From: Alexander Sverdlin <alexander.sverdlin@siemens.com>

KCSAN reports:

BUG: KCSAN: data-race in do_raw_write_lock / do_raw_write_lock

write (marked) to 0xffff800009cf504c of 4 bytes by task 1102 on cpu 1:
 do_raw_write_lock+0x120/0x204
 _raw_write_lock_irq
 do_exit
 call_usermodehelper_exec_async
 ret_from_fork

read to 0xffff800009cf504c of 4 bytes by task 1103 on cpu 0:
 do_raw_write_lock+0x88/0x204
 _raw_write_lock_irq
 do_exit
 call_usermodehelper_exec_async
 ret_from_fork

value changed: 0xffffffff -> 0x00000001

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 1103 Comm: kworker/u4:1 6.1.111

Commit 1a365e822372 ("locking/spinlock/debug: Fix various data races")
addressed most of these races, but the conversion is incomplete.

In do_raw_write_lock(), only the debug_write_lock_after() part was
converted to WRITE_ONCE(); debug_write_lock_before() still reads
lock->owner and lock->owner_cpu with plain accesses. Convert those reads
to READ_ONCE().

Fixes: 1a365e822372 ("locking/spinlock/debug: Fix various data races")
Reported-by: Adrian Freihofer <adrian.freihofer@siemens.com>
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@siemens.com>
---
There are still some inconsistencies remaining IMO:
- lock->magic is sometimes accessed with READ_ONCE() even though it's only
being plain-written;
- debug_spin_unlock() and debug_write_unlock() both do WRITE_ONCE() on
lock->owner and lock->owner_cpu, but examine them with plain read accesses.
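
For illustration only, here is a minimal user-space sketch of the pairing
that this patch aims for: every lockless read of the owner fields is
marked, matching the marked writes. It is not the kernel code. The
struct toy_rwlock, the toy_*() helpers, the OWNER_*_INIT constants and
the volatile-cast macros are hypothetical stand-ins for rwlock_t, the
debug_write_*() helpers and the kernel's READ_ONCE()/WRITE_ONCE(), and
the sketch relies on the GCC/Clang __typeof__ extension.

```c
#include <stddef.h>

/*
 * Stand-ins for the kernel macros (assumption: simplified versions).
 * The volatile access makes the compiler emit exactly one load or
 * store, so it cannot tear, fuse or re-read the value.
 */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical "no owner" values for this sketch. */
#define OWNER_INIT     ((void *)-1L)
#define OWNER_CPU_INIT ((unsigned int)-1)

/* Hypothetical, reduced model of the debug fields of an rwlock. */
struct toy_rwlock {
	unsigned int owner_cpu;
	void *owner;
};

/*
 * Check done before the lock is taken. Another CPU may be updating the
 * owner fields at this moment, so both reads are marked.
 * Returns nonzero if the caller (task "me" on "cpu") looks like the owner.
 */
static inline int toy_write_lock_before(struct toy_rwlock *lock,
					void *me, unsigned int cpu)
{
	return READ_ONCE(lock->owner) == me ||
	       READ_ONCE(lock->owner_cpu) == cpu;
}

/* Record the new owner once the lock is held; writes are marked. */
static inline void toy_write_lock_after(struct toy_rwlock *lock,
					void *me, unsigned int cpu)
{
	WRITE_ONCE(lock->owner_cpu, cpu);
	WRITE_ONCE(lock->owner, me);
}

/* Reset the owner fields on unlock; writes are marked. */
static inline void toy_write_unlock(struct toy_rwlock *lock)
{
	WRITE_ONCE(lock->owner, OWNER_INIT);
	WRITE_ONCE(lock->owner_cpu, OWNER_CPU_INIT);
}
```

A caller would run toy_write_lock_before() before acquiring the lock,
toy_write_lock_after() once it is held, and toy_write_unlock() on
release. The sketch only shows the access marking; it does not take any
real lock and does not reproduce the race itself.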

 kernel/locking/spinlock_debug.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/locking/spinlock_debug.c b/kernel/locking/spinlock_debug.c
index 87b03d2e41dbb..2338b3adfb55f 100644
--- a/kernel/locking/spinlock_debug.c
+++ b/kernel/locking/spinlock_debug.c
@@ -184,8 +184,8 @@ void do_raw_read_unlock(rwlock_t *lock)
 static inline void debug_write_lock_before(rwlock_t *lock)
 {
 	RWLOCK_BUG_ON(lock->magic != RWLOCK_MAGIC, lock, "bad magic");
-	RWLOCK_BUG_ON(lock->owner == current, lock, "recursion");
-	RWLOCK_BUG_ON(lock->owner_cpu == raw_smp_processor_id(),
+	RWLOCK_BUG_ON(READ_ONCE(lock->owner) == current, lock, "recursion");
+	RWLOCK_BUG_ON(READ_ONCE(lock->owner_cpu) == raw_smp_processor_id(),
 							lock, "cpu recursion");
 }
 
-- 
2.47.1
Re: [PATCH] locking/spinlock/debug: Fix data-race in do_raw_write_lock
Posted by Waiman Long 1 year ago
On 12/5/24 12:01 PM, A. Sverdlin wrote:
>   static inline void debug_write_lock_before(rwlock_t *lock)
>   {
>   	RWLOCK_BUG_ON(lock->magic != RWLOCK_MAGIC, lock, "bad magic");
> -	RWLOCK_BUG_ON(lock->owner == current, lock, "recursion");
> -	RWLOCK_BUG_ON(lock->owner_cpu == raw_smp_processor_id(),
> +	RWLOCK_BUG_ON(READ_ONCE(lock->owner) == current, lock, "recursion");
> +	RWLOCK_BUG_ON(READ_ONCE(lock->owner_cpu) == raw_smp_processor_id(),
>   							lock, "cpu recursion");
>   }
>   

LGTM

Acked-by: Waiman Long <longman@redhat.com>