[Xen-devel] [PATCH v8 3/5] xen: don't process rcu callbacks when holding a rcu_read_lock()

Juergen Gross posted 5 patches 5 years, 10 months ago
Maintainers: Stefano Stabellini <sstabellini@kernel.org>, Jan Beulich <jbeulich@suse.com>, Julien Grall <julien@xen.org>, Wei Liu <wl@xen.org>, "Roger Pau Monné" <roger.pau@citrix.com>, Ian Jackson <ian.jackson@eu.citrix.com>, Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>, George Dunlap <george.dunlap@citrix.com>, Andrew Cooper <andrew.cooper3@citrix.com>
[Xen-devel] [PATCH v8 3/5] xen: don't process rcu callbacks when holding a rcu_read_lock()
Posted by Juergen Gross 5 years, 10 months ago
Some keyhandlers are calling process_pending_softirqs() while holding
a rcu_read_lock(). This is wrong, as process_pending_softirqs() might
activate rcu calls which should not happen inside a rcu_read_lock().

For that purpose modify process_pending_softirqs() to not allow rcu
callback processing when a rcu_read_lock() is being held.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
V3:
- add RCU_SOFTIRQ to ignore in process_pending_softirqs_norcu()
  (Roger Pau Monné)

V5:
- block rcu processing depending on rch_read_lock() being held or not
  (Jan Beulich)
---
 xen/common/softirq.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/xen/common/softirq.c b/xen/common/softirq.c
index b83ad96d6c..00d676b62c 100644
--- a/xen/common/softirq.c
+++ b/xen/common/softirq.c
@@ -29,6 +29,7 @@ static void __do_softirq(unsigned long ignore_mask)
 {
     unsigned int i, cpu;
     unsigned long pending;
+    bool rcu_allowed = !(ignore_mask & (1ul << RCU_SOFTIRQ));
 
     for ( ; ; )
     {
@@ -38,7 +39,7 @@ static void __do_softirq(unsigned long ignore_mask)
          */
         cpu = smp_processor_id();
 
-        if ( rcu_pending(cpu) )
+        if ( rcu_allowed && rcu_pending(cpu) )
             rcu_check_callbacks(cpu);
 
         if ( ((pending = (softirq_pending(cpu) & ~ignore_mask)) == 0)
@@ -53,9 +54,16 @@ static void __do_softirq(unsigned long ignore_mask)
 
 void process_pending_softirqs(void)
 {
+    unsigned long ignore_mask = (1ul << SCHEDULE_SOFTIRQ) |
+                                (1ul << SCHED_SLAVE_SOFTIRQ);
+
+    /* Block RCU processing in case of rcu_read_lock() held. */
+    if ( preempt_count() )
+        ignore_mask |= 1ul << RCU_SOFTIRQ;
+
     ASSERT(!in_irq() && local_irq_is_enabled());
     /* Do not enter scheduler as it can preempt the calling context. */
-    __do_softirq((1ul << SCHEDULE_SOFTIRQ) | (1ul << SCHED_SLAVE_SOFTIRQ));
+    __do_softirq(ignore_mask);
 }
 
 void do_softirq(void)
-- 
2.16.4


Re: [Xen-devel] [PATCH v8 3/5] xen: don't process rcu callbacks when holding a rcu_read_lock()
Posted by Igor Druzhinin 5 years, 10 months ago
On 26/03/2020 09:19, Juergen Gross wrote:
> Some keyhandlers are calling process_pending_softirqs() while holding
> a rcu_read_lock(). This is wrong, as process_pending_softirqs() might
> activate rcu calls which should not happen inside a rcu_read_lock().
> 
> For that purpose modify process_pending_softirqs() to not allow rcu
> callback processing when a rcu_read_lock() is being held.
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> ---
> V3:
> - add RCU_SOFTIRQ to ignore in process_pending_softirqs_norcu()
>   (Roger Pau Monné)
> 
> V5:
> - block rcu processing depending on rch_read_lock() being held or not
>   (Jan Beulich)

Juergen,

Our BVT revealed a likely problem with this commit in that form.
Since 12509bbeb9e ("rwlocks: call preempt_disable() when taking a rwlock")
preemption is disabled after taking cpu_maps which will block RCU
callback processing inside rcu_barrier itself. This will result in
all system hang on boot after 540d4d60378 ("cpu: sync any remaining
RCU callbacks before CPU up/down").

Igor

Re: [Xen-devel] [PATCH v8 3/5] xen: don't process rcu callbacks when holding a rcu_read_lock()
Posted by Jürgen Groß 5 years, 10 months ago
On 27.03.20 00:24, Igor Druzhinin wrote:
> On 26/03/2020 09:19, Juergen Gross wrote:
>> Some keyhandlers are calling process_pending_softirqs() while holding
>> a rcu_read_lock(). This is wrong, as process_pending_softirqs() might
>> activate rcu calls which should not happen inside a rcu_read_lock().
>>
>> For that purpose modify process_pending_softirqs() to not allow rcu
>> callback processing when a rcu_read_lock() is being held.
>>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>> ---
>> V3:
>> - add RCU_SOFTIRQ to ignore in process_pending_softirqs_norcu()
>>    (Roger Pau Monné)
>>
>> V5:
>> - block rcu processing depending on rch_read_lock() being held or not
>>    (Jan Beulich)
> 
> Juergen,
> 
> Our BVT revealed a likely problem with this commit in that form.
> Since 12509bbeb9e ("rwlocks: call preempt_disable() when taking a rwlock")
> preemption is disabled after taking cpu_maps which will block RCU
> callback processing inside rcu_barrier itself. This will result in

Why would that block RCU callback processing?

RCU callbacks should be blocked only if a rcu lock is being held.

Did I miss something in my patches?

> all system hang on boot after 540d4d60378 ("cpu: sync any remaining
> RCU callbacks before CPU up/down").


Juergen

Re: [Xen-devel] [PATCH v8 3/5] xen: don't process rcu callbacks when holding a rcu_read_lock()
Posted by Jan Beulich 5 years, 10 months ago
On 27.03.2020 09:10, Jürgen Groß wrote:
> On 27.03.20 00:24, Igor Druzhinin wrote:
>> On 26/03/2020 09:19, Juergen Gross wrote:
>>> Some keyhandlers are calling process_pending_softirqs() while holding
>>> a rcu_read_lock(). This is wrong, as process_pending_softirqs() might
>>> activate rcu calls which should not happen inside a rcu_read_lock().
>>>
>>> For that purpose modify process_pending_softirqs() to not allow rcu
>>> callback processing when a rcu_read_lock() is being held.
>>>
>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>>> ---
>>> V3:
>>> - add RCU_SOFTIRQ to ignore in process_pending_softirqs_norcu()
>>>    (Roger Pau Monné)
>>>
>>> V5:
>>> - block rcu processing depending on rch_read_lock() being held or not
>>>    (Jan Beulich)
>>
>> Juergen,
>>
>> Our BVT revealed a likely problem with this commit in that form.
>> Since 12509bbeb9e ("rwlocks: call preempt_disable() when taking a rwlock")
>> preemption is disabled after taking cpu_maps which will block RCU
>> callback processing inside rcu_barrier itself. This will result in
> 
> Why would that block RCU callback processing?
> 
> RCU callbacks should be blocked only if a rcu lock is being held.
> 
> Did I miss something in my patches?

Igor, are you perhaps running without "rcu: add assertions to debug
build"? I think this actually fixes what you describe. Without it
rcu_barrier(), in its second loop, calling process_pending_softirqs(),
would cause the RCU softirq to not be invoked anymore with preemption
disabled. Of course the title of this change doesn't reflect this at
all.

Jürgen, as an aside, while looking at the code again, I think the
comment near the end of process_pending_softirqs() would now rather
belong at its very beginning; should have spotted this while
reviewing.

Jan

Re: [Xen-devel] [PATCH v8 3/5] xen: don't process rcu callbacks when holding a rcu_read_lock()
Posted by Igor Druzhinin 5 years, 10 months ago
On 27/03/2020 08:35, Jan Beulich wrote:
> On 27.03.2020 09:10, Jürgen Groß wrote:
>> On 27.03.20 00:24, Igor Druzhinin wrote:
>>> On 26/03/2020 09:19, Juergen Gross wrote:
>>>> Some keyhandlers are calling process_pending_softirqs() while holding
>>>> a rcu_read_lock(). This is wrong, as process_pending_softirqs() might
>>>> activate rcu calls which should not happen inside a rcu_read_lock().
>>>>
>>>> For that purpose modify process_pending_softirqs() to not allow rcu
>>>> callback processing when a rcu_read_lock() is being held.
>>>>
>>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>>>> ---
>>>> V3:
>>>> - add RCU_SOFTIRQ to ignore in process_pending_softirqs_norcu()
>>>>    (Roger Pau Monné)
>>>>
>>>> V5:
>>>> - block rcu processing depending on rch_read_lock() being held or not
>>>>    (Jan Beulich)
>>>
>>> Juergen,
>>>
>>> Our BVT revealed a likely problem with this commit in that form.
>>> Since 12509bbeb9e ("rwlocks: call preempt_disable() when taking a rwlock")
>>> preemption is disabled after taking cpu_maps which will block RCU
>>> callback processing inside rcu_barrier itself. This will result in
>>
>> Why would that block RCU callback processing?
>>
>> RCU callbacks should be blocked only if a rcu lock is being held.
>>
>> Did I miss something in my patches?
> 
> Igor, are you perhaps running without "rcu: add assertions to debug
> build"? I think this actually fixes what you describe. Without it
> rcu_barrier(), in its second loop, calling process_pending_softirqs(),
> would cause the RCU softirq to not be invoked anymore with preemption
> disabled. Of course the title of this change doesn't reflect this at
> all.

Yes, that explains it - I indeed skipped that patch from backporting to
our tree. Thanks, for the quick catch!

Igor


Re: [Xen-devel] [PATCH v8 3/5] xen: don't process rcu callbacks when holding a rcu_read_lock()
Posted by Jürgen Groß 5 years, 10 months ago
On 27.03.20 09:35, Jan Beulich wrote:
> On 27.03.2020 09:10, Jürgen Groß wrote:
>> On 27.03.20 00:24, Igor Druzhinin wrote:
>>> On 26/03/2020 09:19, Juergen Gross wrote:
>>>> Some keyhandlers are calling process_pending_softirqs() while holding
>>>> a rcu_read_lock(). This is wrong, as process_pending_softirqs() might
>>>> activate rcu calls which should not happen inside a rcu_read_lock().
>>>>
>>>> For that purpose modify process_pending_softirqs() to not allow rcu
>>>> callback processing when a rcu_read_lock() is being held.
>>>>
>>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>>>> ---
>>>> V3:
>>>> - add RCU_SOFTIRQ to ignore in process_pending_softirqs_norcu()
>>>>     (Roger Pau Monné)
>>>>
>>>> V5:
>>>> - block rcu processing depending on rch_read_lock() being held or not
>>>>     (Jan Beulich)
>>>
>>> Juergen,
>>>
>>> Our BVT revealed a likely problem with this commit in that form.
>>> Since 12509bbeb9e ("rwlocks: call preempt_disable() when taking a rwlock")
>>> preemption is disabled after taking cpu_maps which will block RCU
>>> callback processing inside rcu_barrier itself. This will result in
>>
>> Why would that block RCU callback processing?
>>
>> RCU callbacks should be blocked only if a rcu lock is being held.
>>
>> Did I miss something in my patches?
> 
> Igor, are you perhaps running without "rcu: add assertions to debug
> build"? I think this actually fixes what you describe. Without it
> rcu_barrier(), in its second loop, calling process_pending_softirqs(),
> would cause the RCU softirq to not be invoked anymore with preemption
> disabled. Of course the title of this change doesn't reflect this at
> all.

Right. This explains why I don't see the hang on my test system.

> 
> Jürgen, as an aside, while looking at the code again, I think the
> comment near the end of process_pending_softirqs() would now rather
> belong at its very beginning; should have spotted this while
> reviewing.

Oh, indeed. Will send a patch.


Juergen