[PATCH] xen/spinlock: move debug helpers inside the locked regions

Roger Pau Monne posted 1 patch 3 years, 8 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/xen tags/patchew/20200729111330.64549-1-roger.pau@citrix.com
xen/common/spinlock.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
[PATCH] xen/spinlock: move debug helpers inside the locked regions
Posted by Roger Pau Monne 3 years, 8 months ago
Debug helpers such as lock profiling or the invariant pCPU assertions
must strictly be performed inside the exclusive locked region, or else
races might happen.

Note the issue was not strictly introduced by the pointed commit in
the Fixes tag, since lock stats where already incremented before the
barrier, but that commit made it more apparent as manipulating the cpu
field could happen outside of the locked regions and thus trigger the
BUG_ON. This is only enabled on debug builds, and thus releases are
not affected.

Fixes: 80cba391a35 ('spinlocks: in debug builds store cpu holding the lock')
Reported-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/common/spinlock.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c
index 17f4519fc7..ce3106e2d3 100644
--- a/xen/common/spinlock.c
+++ b/xen/common/spinlock.c
@@ -170,9 +170,9 @@ void inline _spin_lock_cb(spinlock_t *lock, void (*cb)(void *), void *data)
             cb(data);
         arch_lock_relax();
     }
+    arch_lock_acquire_barrier();
     got_lock(&lock->debug);
     LOCK_PROFILE_GOT;
-    arch_lock_acquire_barrier();
 }
 
 void _spin_lock(spinlock_t *lock)
@@ -198,9 +198,9 @@ unsigned long _spin_lock_irqsave(spinlock_t *lock)
 
 void _spin_unlock(spinlock_t *lock)
 {
-    arch_lock_release_barrier();
     LOCK_PROFILE_REL;
     rel_lock(&lock->debug);
+    arch_lock_release_barrier();
     add_sized(&lock->tickets.head, 1);
     arch_lock_signal();
     preempt_enable();
@@ -249,15 +249,15 @@ int _spin_trylock(spinlock_t *lock)
         preempt_enable();
         return 0;
     }
+    /*
+     * cmpxchg() is a full barrier so no need for an
+     * arch_lock_acquire_barrier().
+     */
     got_lock(&lock->debug);
 #ifdef CONFIG_DEBUG_LOCK_PROFILE
     if (lock->profile)
         lock->profile->time_locked = NOW();
 #endif
-    /*
-     * cmpxchg() is a full barrier so no need for an
-     * arch_lock_acquire_barrier().
-     */
     return 1;
 }
 
-- 
2.27.0


Re: [PATCH] xen/spinlock: move debug helpers inside the locked regions
Posted by Julien Grall 3 years, 8 months ago
Hi Roger,

On 29/07/2020 12:13, Roger Pau Monne wrote:
> Debug helpers such as lock profiling or the invariant pCPU assertions
> must strictly be performed inside the exclusive locked region, or else
> races might happen.
> 
> Note the issue was not strictly introduced by the pointed commit in
> the Fixes tag, since lock stats where already incremented before the
> barrier, but that commit made it more apparent as manipulating the cpu
> field could happen outside of the locked regions and thus trigger the
> BUG_ON.

 From the wording, it is not entirely clear which BUG_ON() you are 
referring to. I am guessing, it is the one in rel_lock(). Am I correct?

Otherwise, the change looks good to me.

Cheers,

> This is only enabled on debug builds, and thus releases are
> not affected.
> 
> Fixes: 80cba391a35 ('spinlocks: in debug builds store cpu holding the lock')
> Reported-by: Igor Druzhinin <igor.druzhinin@citrix.com>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
>   xen/common/spinlock.c | 12 ++++++------
>   1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/xen/common/spinlock.c b/xen/common/spinlock.c
> index 17f4519fc7..ce3106e2d3 100644
> --- a/xen/common/spinlock.c
> +++ b/xen/common/spinlock.c
> @@ -170,9 +170,9 @@ void inline _spin_lock_cb(spinlock_t *lock, void (*cb)(void *), void *data)
>               cb(data);
>           arch_lock_relax();
>       }
> +    arch_lock_acquire_barrier();
>       got_lock(&lock->debug);
>       LOCK_PROFILE_GOT;
> -    arch_lock_acquire_barrier();
>   }
>   
>   void _spin_lock(spinlock_t *lock)
> @@ -198,9 +198,9 @@ unsigned long _spin_lock_irqsave(spinlock_t *lock)
>   
>   void _spin_unlock(spinlock_t *lock)
>   {
> -    arch_lock_release_barrier();
>       LOCK_PROFILE_REL;
>       rel_lock(&lock->debug);
> +    arch_lock_release_barrier();
>       add_sized(&lock->tickets.head, 1);
>       arch_lock_signal();
>       preempt_enable();
> @@ -249,15 +249,15 @@ int _spin_trylock(spinlock_t *lock)
>           preempt_enable();
>           return 0;
>       }
> +    /*
> +     * cmpxchg() is a full barrier so no need for an
> +     * arch_lock_acquire_barrier().
> +     */
>       got_lock(&lock->debug);
>   #ifdef CONFIG_DEBUG_LOCK_PROFILE
>       if (lock->profile)
>           lock->profile->time_locked = NOW();
>   #endif
> -    /*
> -     * cmpxchg() is a full barrier so no need for an
> -     * arch_lock_acquire_barrier().
> -     */
>       return 1;
>   }
>   
> 

-- 
Julien Grall

Re: [PATCH] xen/spinlock: move debug helpers inside the locked regions
Posted by Roger Pau Monné 3 years, 8 months ago
On Wed, Jul 29, 2020 at 02:37:44PM +0100, Julien Grall wrote:
> Hi Roger,
> 
> On 29/07/2020 12:13, Roger Pau Monne wrote:
> > Debug helpers such as lock profiling or the invariant pCPU assertions
> > must strictly be performed inside the exclusive locked region, or else
> > races might happen.
> > 
> > Note the issue was not strictly introduced by the pointed commit in
> > the Fixes tag, since lock stats where already incremented before the
> > barrier, but that commit made it more apparent as manipulating the cpu
> > field could happen outside of the locked regions and thus trigger the
> > BUG_ON.
> 
> From the wording, it is not entirely clear which BUG_ON() you are referring
> to. I am guessing, it is the one in rel_lock(). Am I correct?

Yes, that's right. Expanding to:

"...  and thus trigger the BUG_ON in rel_lock()." would be better.

> Otherwise, the change looks good to me.

Thanks.

Re: [PATCH] xen/spinlock: move debug helpers inside the locked regions
Posted by Julien Grall 3 years, 8 months ago
Hi Roger,

On 29/07/2020 14:50, Roger Pau Monné wrote:
> On Wed, Jul 29, 2020 at 02:37:44PM +0100, Julien Grall wrote:
>> Hi Roger,
>>
>> On 29/07/2020 12:13, Roger Pau Monne wrote:
>>> Debug helpers such as lock profiling or the invariant pCPU assertions
>>> must strictly be performed inside the exclusive locked region, or else
>>> races might happen.
>>>
>>> Note the issue was not strictly introduced by the pointed commit in
>>> the Fixes tag, since lock stats where already incremented before the
>>> barrier, but that commit made it more apparent as manipulating the cpu
>>> field could happen outside of the locked regions and thus trigger the
>>> BUG_ON.
>>
>>  From the wording, it is not entirely clear which BUG_ON() you are referring
>> to. I am guessing, it is the one in rel_lock(). Am I correct?
> 
> Yes, that's right. Expanding to:
> 
> "...  and thus trigger the BUG_ON in rel_lock()." would be better.

Looks good to me. With that:

Reviewed-by: Julien Grall <jgrall@amazon.com>

I am happy to do the update on commit if there is no more comments.

Cheers,

-- 
Julien Grall

Re: [PATCH] xen/spinlock: move debug helpers inside the locked regions
Posted by Julien Grall 3 years, 8 months ago

On 29/07/2020 15:57, Julien Grall wrote:
> Hi Roger,
> 
> On 29/07/2020 14:50, Roger Pau Monné wrote:
>> On Wed, Jul 29, 2020 at 02:37:44PM +0100, Julien Grall wrote:
>>> Hi Roger,
>>>
>>> On 29/07/2020 12:13, Roger Pau Monne wrote:
>>>> Debug helpers such as lock profiling or the invariant pCPU assertions
>>>> must strictly be performed inside the exclusive locked region, or else
>>>> races might happen.
>>>>
>>>> Note the issue was not strictly introduced by the pointed commit in
>>>> the Fixes tag, since lock stats where already incremented before the
>>>> barrier, but that commit made it more apparent as manipulating the cpu
>>>> field could happen outside of the locked regions and thus trigger the
>>>> BUG_ON.
>>>
>>>  From the wording, it is not entirely clear which BUG_ON() you are 
>>> referring
>>> to. I am guessing, it is the one in rel_lock(). Am I correct?
>>
>> Yes, that's right. Expanding to:
>>
>> "...  and thus trigger the BUG_ON in rel_lock()." would be better.
> 
> Looks good to me. With that:
> 
> Reviewed-by: Julien Grall <jgrall@amazon.com>
> 
> I am happy to do the update on commit if there is no more comments.

Committed.

Thank you!

Cheers,

-- 
Julien Grall