[Xen-devel] [PATCH v2] xen: make sure stop_machine_run() is always called in a tasklet

Juergen Gross posted 1 patch 4 years, 1 month ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/xen tags/patchew/20200228071922.3983-1-jgross@suse.com
There is a newer version of this series
xen/arch/x86/microcode.c  | 54 +++++++++++++++++++++++++++++------------------
xen/common/stop_machine.c |  1 +
2 files changed, 35 insertions(+), 20 deletions(-)
[Xen-devel] [PATCH v2] xen: make sure stop_machine_run() is always called in a tasklet
Posted by Juergen Gross 4 years, 1 month ago
With core scheduling active it is mandatory for stop_machine_run() to
be called in idle context only (so either during boot or in a tasklet),
as otherwise a scheduling deadlock would occur: stop_machine_run()
does a cpu rendezvous by activating a tasklet on all other cpus. In
case stop_machine_run() was not called in an idle vcpu it would block
scheduling the idle vcpu on its siblings with core scheduling being
active, resulting in a hang.

Put a BUG_ON() into stop_machine_run() to test for being called in an
idle vcpu only and adapt the missing call site (ucode loading) to use a
tasklet for calling stop_machine_run().

Signed-off-by: Juergen Gross <jgross@suse.com>
---
V2:
- rephrase commit message (Julien Grall)
---
 xen/arch/x86/microcode.c  | 54 +++++++++++++++++++++++++++++------------------
 xen/common/stop_machine.c |  1 +
 2 files changed, 35 insertions(+), 20 deletions(-)

diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index 35c1d36cdc..4cf4e66b18 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -562,30 +562,18 @@ static int do_microcode_update(void *patch)
     return ret;
 }
 
-int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
+struct ucode_buf {
+    unsigned int len;
+    char buffer[];
+};
+
+static long microcode_update_helper(void *data)
 {
     int ret;
-    void *buffer;
+    struct ucode_buf *buffer = data;
     unsigned int cpu, updated;
     struct microcode_patch *patch;
 
-    if ( len != (uint32_t)len )
-        return -E2BIG;
-
-    if ( microcode_ops == NULL )
-        return -EINVAL;
-
-    buffer = xmalloc_bytes(len);
-    if ( !buffer )
-        return -ENOMEM;
-
-    ret = copy_from_guest(buffer, buf, len);
-    if ( ret )
-    {
-        xfree(buffer);
-        return -EFAULT;
-    }
-
     /* cpu_online_map must not change during update */
     if ( !get_cpu_maps() )
     {
@@ -607,7 +595,7 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
         return -EPERM;
     }
 
-    patch = parse_blob(buffer, len);
+    patch = parse_blob(buffer->buffer, buffer->len);
     xfree(buffer);
     if ( IS_ERR(patch) )
     {
@@ -700,6 +688,32 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
     return ret;
 }
 
+int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
+{
+    int ret;
+    struct ucode_buf *buffer;
+
+    if ( len != (uint32_t)len )
+        return -E2BIG;
+
+    if ( microcode_ops == NULL )
+        return -EINVAL;
+
+    buffer = xmalloc_flex_struct(struct ucode_buf, buffer, len);
+    if ( !buffer )
+        return -ENOMEM;
+
+    ret = copy_from_guest(buffer->buffer, buf, len);
+    if ( ret )
+    {
+        xfree(buffer);
+        return -EFAULT;
+    }
+    buffer->len = len;
+
+    return continue_hypercall_on_cpu(0, microcode_update_helper, buffer);
+}
+
 static int __init microcode_init(void)
 {
     /*
diff --git a/xen/common/stop_machine.c b/xen/common/stop_machine.c
index 33d9602217..fe7f7d4447 100644
--- a/xen/common/stop_machine.c
+++ b/xen/common/stop_machine.c
@@ -74,6 +74,7 @@ int stop_machine_run(int (*fn)(void *), void *data, unsigned int cpu)
     int ret;
 
     BUG_ON(!local_irq_is_enabled());
+    BUG_ON(!is_idle_vcpu(current));
 
     /* cpu_online_map must not change. */
     if ( !get_cpu_maps() )
-- 
2.16.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2] xen: make sure stop_machine_run() is always called in a tasklet
Posted by Jan Beulich 4 years, 1 month ago
On 28.02.2020 08:19, Juergen Gross wrote:
> With core scheduling active it is mandatory for stop_machine_run() to
> be called in idle context only (so either during boot or in a tasklet),
> as otherwise a scheduling deadlock would occur: stop_machine_run()
> does a cpu rendezvous by activating a tasklet on all other cpus. In
> case stop_machine_run() was not called in an idle vcpu it would block
> scheduling the idle vcpu on its siblings with core scheduling being
> active, resulting in a hang.
> 
> Put a BUG_ON() into stop_machine_run() to test for being called in an
> idle vcpu only and adapt the missing call site (ucode loading) to use a
> tasklet for calling stop_machine_run().
> 
> Signed-off-by: Juergen Gross <jgross@suse.com>
> ---
> V2:
> - rephrase commit message (Julien Grall)
> ---
>  xen/arch/x86/microcode.c  | 54 +++++++++++++++++++++++++++++------------------
>  xen/common/stop_machine.c |  1 +
>  2 files changed, 35 insertions(+), 20 deletions(-)

There's no mention anywhere of a connection to your RCU series,
nor to rcu_barrier(). Yet the change puts a new restriction also
on its use, and iirc this was mentioned in prior discussion. Did
I miss anything?

> @@ -700,6 +688,32 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
>      return ret;
>  }
>  
> +int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
> +{
> +    int ret;
> +    struct ucode_buf *buffer;
> +
> +    if ( len != (uint32_t)len )
> +        return -E2BIG;
> +
> +    if ( microcode_ops == NULL )
> +        return -EINVAL;
> +
> +    buffer = xmalloc_flex_struct(struct ucode_buf, buffer, len);
> +    if ( !buffer )
> +        return -ENOMEM;
> +
> +    ret = copy_from_guest(buffer->buffer, buf, len);
> +    if ( ret )
> +    {
> +        xfree(buffer);
> +        return -EFAULT;
> +    }
> +    buffer->len = len;
> +
> +    return continue_hypercall_on_cpu(0, microcode_update_helper, buffer);
> +}

Andrew, just to clarify - were you okay with Jürgen's response regarding
this re-introduction of continue_hypercall_on_cpu() here?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2] xen: make sure stop_machine_run() is always called in a tasklet
Posted by Jürgen Groß 4 years, 1 month ago
On 28.02.20 09:27, Jan Beulich wrote:
> On 28.02.2020 08:19, Juergen Gross wrote:
>> With core scheduling active it is mandatory for stop_machine_run() to
>> be called in idle context only (so either during boot or in a tasklet),
>> as otherwise a scheduling deadlock would occur: stop_machine_run()
>> does a cpu rendezvous by activating a tasklet on all other cpus. In
>> case stop_machine_run() was not called in an idle vcpu it would block
>> scheduling the idle vcpu on its siblings with core scheduling being
>> active, resulting in a hang.
>>
>> Put a BUG_ON() into stop_machine_run() to test for being called in an
>> idle vcpu only and adapt the missing call site (ucode loading) to use a
>> tasklet for calling stop_machine_run().
>>
>> Signed-off-by: Juergen Gross <jgross@suse.com>
>> ---
>> V2:
>> - rephrase commit message (Julien Grall)
>> ---
>>   xen/arch/x86/microcode.c  | 54 +++++++++++++++++++++++++++++------------------
>>   xen/common/stop_machine.c |  1 +
>>   2 files changed, 35 insertions(+), 20 deletions(-)
> 
> There's no mention anywhere of a connection to your RCU series,
> nor to rcu_barrier(). Yet the change puts a new restriction also
> on its use, and iirc this was mentioned in prior discussion. Did
> I miss anything?

Basically this patch makes the restriction explicit. Without the patch
stop_machine_run() being called outside of a tasklet would just hang
with core scheduling being active. Now it will catch those violations
early even with core scheduling inactive.

Note that currently there are no violations of this restriction anywhere
in the hypervisor other than the one addressed by this patch.


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2] xen: make sure stop_machine_run() is always called in a tasklet
Posted by Jan Beulich 4 years, 1 month ago
On 28.02.2020 09:58, Jürgen Groß wrote:
> On 28.02.20 09:27, Jan Beulich wrote:
>> On 28.02.2020 08:19, Juergen Gross wrote:
>>> With core scheduling active it is mandatory for stop_machine_run() to
>>> be called in idle context only (so either during boot or in a tasklet),
>>> as otherwise a scheduling deadlock would occur: stop_machine_run()
>>> does a cpu rendezvous by activating a tasklet on all other cpus. In
>>> case stop_machine_run() was not called in an idle vcpu it would block
>>> scheduling the idle vcpu on its siblings with core scheduling being
>>> active, resulting in a hang.
>>>
>>> Put a BUG_ON() into stop_machine_run() to test for being called in an
>>> idle vcpu only and adapt the missing call site (ucode loading) to use a
>>> tasklet for calling stop_machine_run().
>>>
>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>> ---
>>> V2:
>>> - rephrase commit message (Julien Grall)
>>> ---
>>>   xen/arch/x86/microcode.c  | 54 +++++++++++++++++++++++++++++------------------
>>>   xen/common/stop_machine.c |  1 +
>>>   2 files changed, 35 insertions(+), 20 deletions(-)
>>
>> There's no mention anywhere of a connection to your RCU series,
>> nor to rcu_barrier(). Yet the change puts a new restriction also
>> on its use, and iirc this was mentioned in prior discussion. Did
>> I miss anything?
> 
> Basically this patch makes the restriction explicit. Without the patch
> stop_machine_run() being called outside of a tasklet would just hang
> with core scheduling being active. Now it will catch those violations
> early even with core scheduling inactive.
> 
> Note that currently there are no violations of this restriction anywhere
> in the hypervisor other than the one addressed by this patch.

Well, there is a connection to core scheduling. Without it, i.e.
prior to 4.13, there was no restriction on the placement of
rcu_barrier() invocations. According to what you say above, the
restriction was implicitly introduced with core scheduling. It
should imo be made explicit by attaching a comment, which would
(again imo) best be done here because now you also make this
case crash without core scheduling in use.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2] xen: make sure stop_machine_run() is always called in a tasklet
Posted by Jürgen Groß 4 years, 1 month ago
On 28.02.20 10:15, Jan Beulich wrote:
> On 28.02.2020 09:58, Jürgen Groß wrote:
>> On 28.02.20 09:27, Jan Beulich wrote:
>>> On 28.02.2020 08:19, Juergen Gross wrote:
>>>> With core scheduling active it is mandatory for stop_machine_run() to
>>>> be called in idle context only (so either during boot or in a tasklet),
>>>> as otherwise a scheduling deadlock would occur: stop_machine_run()
>>>> does a cpu rendezvous by activating a tasklet on all other cpus. In
>>>> case stop_machine_run() was not called in an idle vcpu it would block
>>>> scheduling the idle vcpu on its siblings with core scheduling being
>>>> active, resulting in a hang.
>>>>
>>>> Put a BUG_ON() into stop_machine_run() to test for being called in an
>>>> idle vcpu only and adapt the missing call site (ucode loading) to use a
>>>> tasklet for calling stop_machine_run().
>>>>
>>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>>> ---
>>>> V2:
>>>> - rephrase commit message (Julien Grall)
>>>> ---
>>>>    xen/arch/x86/microcode.c  | 54 +++++++++++++++++++++++++++++------------------
>>>>    xen/common/stop_machine.c |  1 +
>>>>    2 files changed, 35 insertions(+), 20 deletions(-)
>>>
>>> There's no mention anywhere of a connection to your RCU series,
>>> nor to rcu_barrier(). Yet the change puts a new restriction also
>>> on its use, and iirc this was mentioned in prior discussion. Did
>>> I miss anything?
>>
>> Basically this patch makes the restriction explicit. Without the patch
>> stop_machine_run() being called outside of a tasklet would just hang
>> with core scheduling being active. Now it will catch those violations
>> early even with core scheduling inactive.
>>
>> Note that currently there are no violations of this restriction anywhere
>> in the hypervisor other than the one addressed by this patch.
> 
> Well, there is a connection to core scheduling. Without it, i.e.
> prior to 4.13, there was no restriction on the placement of
> rcu_barrier() invocations. According to what you say above, the
> restriction was implicitly introduced with core scheduling. It
> should imo be made explicit by attaching a comment, which would
> (again imo) best be done here because now you also make this
> case crash without core scheduling in use.

Okay, I'll add a comment to stop_machine_run() and to rcu_barrier(). The
rcu_barrier() comment will be then removed by my rcu series again.


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2] xen: make sure stop_machine_run() is always called in a tasklet
Posted by Jan Beulich 4 years, 1 month ago
On 28.02.2020 10:18, Jürgen Groß wrote:
> On 28.02.20 10:15, Jan Beulich wrote:
>> On 28.02.2020 09:58, Jürgen Groß wrote:
>>> On 28.02.20 09:27, Jan Beulich wrote:
>>>> On 28.02.2020 08:19, Juergen Gross wrote:
>>>>> With core scheduling active it is mandatory for stop_machine_run() to
>>>>> be called in idle context only (so either during boot or in a tasklet),
>>>>> as otherwise a scheduling deadlock would occur: stop_machine_run()
>>>>> does a cpu rendezvous by activating a tasklet on all other cpus. In
>>>>> case stop_machine_run() was not called in an idle vcpu it would block
>>>>> scheduling the idle vcpu on its siblings with core scheduling being
>>>>> active, resulting in a hang.
>>>>>
>>>>> Put a BUG_ON() into stop_machine_run() to test for being called in an
>>>>> idle vcpu only and adapt the missing call site (ucode loading) to use a
>>>>> tasklet for calling stop_machine_run().
>>>>>
>>>>> Signed-off-by: Juergen Gross <jgross@suse.com>
>>>>> ---
>>>>> V2:
>>>>> - rephrase commit message (Julien Grall)
>>>>> ---
>>>>>    xen/arch/x86/microcode.c  | 54 +++++++++++++++++++++++++++++------------------
>>>>>    xen/common/stop_machine.c |  1 +
>>>>>    2 files changed, 35 insertions(+), 20 deletions(-)
>>>>
>>>> There's no mention anywhere of a connection to your RCU series,
>>>> nor to rcu_barrier(). Yet the change puts a new restriction also
>>>> on its use, and iirc this was mentioned in prior discussion. Did
>>>> I miss anything?
>>>
>>> Basically this patch makes the restriction explicit. Without the patch
>>> stop_machine_run() being called outside of a tasklet would just hang
>>> with core scheduling being active. Now it will catch those violations
>>> early even with core scheduling inactive.
>>>
>>> Note that currently there are no violations of this restriction anywhere
>>> in the hypervisor other than the one addressed by this patch.
>>
>> Well, there is a connection to core scheduling. Without it, i.e.
>> prior to 4.13, there was no restriction on the placement of
>> rcu_barrier() invocations. According to what you say above, the
>> restriction was implicitly introduced with core scheduling. It
>> should imo be made explicit by attaching a comment, which would
>> (again imo) best be done here because now you also make this
>> case crash without core scheduling in use.
> 
> Okay, I'll add a comment to stop_machine_run() and to rcu_barrier(). The
> rcu_barrier() comment will be then removed by my rcu series again.

Right - the alternative then would be to call out a dependency of
this patch on that series.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel