[v2] xen/x86: lock cacheline for add_sized()

[Xen-devel] [PATCH v2] xen/x86: lock cacheline for add_sized()

Posted by Juergen Gross 4 years, 8 months ago

add_sized() should use an atomic update of the memory word, as it is
used by spin_unlock().

Ticket locks are using a read-modify-write operation on parts of the
lockword for unlocking, while trying to lock is done by an atomic
update of the complete lockword. This requires the unlock operation to
be atomic, too, as otherwise the unlock might not write back the
correct data.

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 xen/include/asm-x86/atomic.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/include/asm-x86/atomic.h b/xen/include/asm-x86/atomic.h
index 682bcf91b1..897f661191 100644
--- a/xen/include/asm-x86/atomic.h
+++ b/xen/include/asm-x86/atomic.h
@@ -21,7 +21,7 @@ static inline void name(volatile type *addr, type val) \
 #define build_add_sized(name, size, type, reg) \
     static inline void name(volatile type *addr, type val)              \
     {                                                                   \
-        asm volatile("add" size " %1,%0"                                \
+        asm volatile("lock add" size " %1,%0"                           \
                      : "=m" (*addr)                                     \
                      : reg (val));                                      \
     }
-- 
2.16.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2] xen/x86: lock cacheline for add_sized()

Posted by Jan Beulich 4 years, 8 months ago

On 07.08.2019 09:32, Juergen Gross wrote:
> add_sized() should use an atomic update of the memory word, as it is
> used by spin_unlock().
> 
> Ticket locks are using a read-modify-write operation on parts of the
> lockword for unlocking, while trying to lock is done by an atomic
> update of the complete lockword. This requires the unlock operation to
> be atomic, too, as otherwise the unlock might not write back the
> correct data.

I have to take back my reply to v1, and hence I'm afraid that
if the change is really needed the description is still
insufficient. Let's look at both sides: Acquire is a LOCKed
read-modify-write of the full word, with the additional
property that the value written back to the low half is
unchanged from the value read. Release is

	read low half
	increment low half
	write low half

Since the low half doesn't change during any acquire (including
attempts, i.e. try-lock), it doesn't matter if it races with
the above sequence. It can freely happen between any two of the
three steps.

Therefore what I'm really after is (a) clarification whether
the issue you mean to fix was observed in practice and (b) a
concrete scenario where things would go wrong.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v2] xen/x86: lock cacheline for add_sized()

Posted by Juergen Gross 4 years, 8 months ago

On 07.08.19 11:10, Jan Beulich wrote:
> On 07.08.2019 09:32, Juergen Gross wrote:
>> add_sized() should use an atomic update of the memory word, as it is
>> used by spin_unlock().
>>
>> Ticket locks are using a read-modify-write operation on parts of the
>> lockword for unlocking, while trying to lock is done by an atomic
>> update of the complete lockword. This requires the unlock operation to
>> be atomic, too, as otherwise the unlock might not write back the
>> correct data.
> 
> I have to take back my reply to v1, and hence I'm afraid that
> if the change is really needed the description is still
> insufficient. Let's look at both sides: Acquire is a LOCKed
> read-modify-write of the full word, with the additional
> property that the value written back to the low half is
> unchanged from the value read. Release is
> 
>      read low half
>      increment low half
>      write low half
> 
> Since the low half doesn't change during any acquire (including
> attempts, i.e. try-lock), it doesn't matter if it races with
> the above sequence. It can freely happen between any two of the
> three steps.

Hmm, sounds sensible.

> Therefore what I'm really after is (a) clarification whether
> the issue you mean to fix was observed in practice and (b) a
> concrete scenario where things would go wrong.

I have seen a crash due to nmi watchdog with my core scheduling series.
It really looked like no one was holding the lock, but two cpus were
waiting for it. I couldn't find another explanation for that behavior,
but there might be another reason for it.

The problem occurred a short time after onlining a cpu when all cpus
tried to gather for time calibration and two of the cpus were not
joining, as they were waiting for the lock with IRQs disabled.

I have seen this scenario only once, but there are other strange hangs
in the same situation, so it might be that the bug I'm still looking for
is the reason for the problem I'm trying to solve with this patch.

Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel