[PATCH 6/6] x86/idle: Fix buggy "x86/mwait-idle: enable interrupts before C1 on Xeons"

Andrew Cooper posted 6 patches 4 months ago
There is a newer version of this series
[PATCH 6/6] x86/idle: Fix buggy "x86/mwait-idle: enable interrupts before C1 on Xeons"
Posted by Andrew Cooper 4 months ago
The check of this_softirq_pending must be performed with irqs disabled, but
this property was broken by an attempt to optimise entry/exit latency.

Commit c227233ad64c in Linux (which we copied into Xen) was fixed up by
edc8fc01f608 in Linux, which we have so far missed.

Going to sleep without waking on interrupts is nonsensical outside of
play_dead(), so overload this to select between two possible MWAITs, the
second using the STI shadow to cover MWAIT for exactly the same reason as we
do in safe_halt().

Fixes: b17e0ec72ede ("x86/mwait-idle: enable interrupts before C1 on Xeons")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Anthony PERARD <anthony.perard@vates.tech>
CC: Michal Orzel <michal.orzel@amd.com>
CC: Julien Grall <julien@xen.org>
CC: Stefano Stabellini <sstabellini@kernel.org>
---
 xen/arch/x86/acpi/cpu_idle.c  | 16 +++++++++++++++-
 xen/arch/x86/cpu/mwait-idle.c |  8 ++------
 2 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/xen/arch/x86/acpi/cpu_idle.c b/xen/arch/x86/acpi/cpu_idle.c
index de020dfee87f..dc8b7111a181 100644
--- a/xen/arch/x86/acpi/cpu_idle.c
+++ b/xen/arch/x86/acpi/cpu_idle.c
@@ -80,6 +80,13 @@ static always_inline void mwait(unsigned int eax, unsigned int ecx)
                    :: "a" (eax), "c" (ecx) );
 }
 
+static always_inline void sti_mwait_cli(unsigned int eax, unsigned int ecx)
+{
+    /* STI shadow covers MWAIT. */
+    asm volatile ( "sti; mwait; cli"
+                   :: "a" (eax), "c" (ecx) );
+}
+
 #define GET_HW_RES_IN_NS(msr, val) \
     do { rdmsrl(msr, val); val = tsc_ticks2ns(val); } while( 0 )
 #define GET_MC6_RES(val)  GET_HW_RES_IN_NS(0x664, val)
@@ -461,12 +468,19 @@ void mwait_idle_with_hints(unsigned int eax, unsigned int ecx)
 
     monitor(this_softirq_pending, 0, 0);
 
+    ASSERT(!local_irq_is_enabled());
+
     if ( !*this_softirq_pending )
     {
         struct cpu_info *info = get_cpu_info();
 
         spec_ctrl_enter_idle(info);
-        mwait(eax, ecx);
+
+        if ( ecx & MWAIT_ECX_INTERRUPT_BREAK )
+            mwait(eax, ecx);
+        else
+            sti_mwait_cli(eax, ecx);
+
         spec_ctrl_exit_idle(info);
     }
 
diff --git a/xen/arch/x86/cpu/mwait-idle.c b/xen/arch/x86/cpu/mwait-idle.c
index 5c16f5ad3a82..b65d6ae9ddc5 100644
--- a/xen/arch/x86/cpu/mwait-idle.c
+++ b/xen/arch/x86/cpu/mwait-idle.c
@@ -946,12 +946,8 @@ static void cf_check mwait_idle(void)
 
 	update_last_cx_stat(power, cx, before);
 
-	if (cx->irq_enable_early)
-		local_irq_enable();
-
-	mwait_idle_with_hints(cx->address, MWAIT_ECX_INTERRUPT_BREAK);
-
-	local_irq_disable();
+	mwait_idle_with_hints(cx->address,
+                              cx->irq_enable_early ? 0 : MWAIT_ECX_INTERRUPT_BREAK);
 
 	after = alternative_call(cpuidle_get_tick);
 
-- 
2.39.5


Re: [PATCH 6/6] x86/idle: Fix buggy "x86/mwait-idle: enable interrupts before C1 on Xeons"
Posted by Jan Beulich 3 months, 4 weeks ago
On 02.07.2025 16:41, Andrew Cooper wrote:
> @@ -461,12 +468,19 @@ void mwait_idle_with_hints(unsigned int eax, unsigned int ecx)
>  
>      monitor(this_softirq_pending, 0, 0);
>  
> +    ASSERT(!local_irq_is_enabled());
> +
>      if ( !*this_softirq_pending )
>      {
>          struct cpu_info *info = get_cpu_info();
>  
>          spec_ctrl_enter_idle(info);
> -        mwait(eax, ecx);
> +
> +        if ( ecx & MWAIT_ECX_INTERRUPT_BREAK )
> +            mwait(eax, ecx);
> +        else
> +            sti_mwait_cli(eax, ecx);

Actually, I'm curious: It seems quite likely that you did consider an
alternative resulting in assembly code like this:

	test	$MWAIT_ECX_INTERRUPT_BREAK, %cl
	jz	0f
	sti
0:
	monitor
	cli

CLI being a relatively cheap operation aiui, is there anything wrong or
undesirable with this (smaller overall) alternative?

Jan
Re: [PATCH 6/6] x86/idle: Fix buggy "x86/mwait-idle: enable interrupts before C1 on Xeons"
Posted by Andrew Cooper 3 months, 4 weeks ago
On 03/07/2025 10:43 am, Jan Beulich wrote:
> On 02.07.2025 16:41, Andrew Cooper wrote:
>> @@ -461,12 +468,19 @@ void mwait_idle_with_hints(unsigned int eax, unsigned int ecx)
>>  
>>      monitor(this_softirq_pending, 0, 0);
>>  
>> +    ASSERT(!local_irq_is_enabled());
>> +
>>      if ( !*this_softirq_pending )
>>      {
>>          struct cpu_info *info = get_cpu_info();
>>  
>>          spec_ctrl_enter_idle(info);
>> -        mwait(eax, ecx);
>> +
>> +        if ( ecx & MWAIT_ECX_INTERRUPT_BREAK )
>> +            mwait(eax, ecx);
>> +        else
>> +            sti_mwait_cli(eax, ecx);
> Actually, I'm curious: It seems quite likely that you did consider an
> alternative resulting in assembly code like this:
>
> 	test	$MWAIT_ECX_INTERRUPT_BREAK, %cl
> 	jz	0f
> 	sti
> 0:
> 	monitor
> 	cli
>
> CLI being a relatively cheap operation aiui, is there anything wrong or
> undesirable with this (smaller overall) alternative?

Other than it needing to be mwait?  The overheads aren't interesting;
they're nothing compared to going idle.

What does matter is that such a transformation cannot exist in mwait()
itself, as that breaks acpi_dead_idle(), and if we turn this mwait()
into inline asm, there's only a single caller of mwait() left.

~Andrew

Re: [PATCH 6/6] x86/idle: Fix buggy "x86/mwait-idle: enable interrupts before C1 on Xeons"
Posted by Jan Beulich 3 months, 4 weeks ago
On 03.07.2025 14:10, Andrew Cooper wrote:
> On 03/07/2025 10:43 am, Jan Beulich wrote:
>> On 02.07.2025 16:41, Andrew Cooper wrote:
>>> @@ -461,12 +468,19 @@ void mwait_idle_with_hints(unsigned int eax, unsigned int ecx)
>>>  
>>>      monitor(this_softirq_pending, 0, 0);
>>>  
>>> +    ASSERT(!local_irq_is_enabled());
>>> +
>>>      if ( !*this_softirq_pending )
>>>      {
>>>          struct cpu_info *info = get_cpu_info();
>>>  
>>>          spec_ctrl_enter_idle(info);
>>> -        mwait(eax, ecx);
>>> +
>>> +        if ( ecx & MWAIT_ECX_INTERRUPT_BREAK )
>>> +            mwait(eax, ecx);
>>> +        else
>>> +            sti_mwait_cli(eax, ecx);
>> Actually, I'm curious: It seems quite likely that you did consider an
>> alternative resulting in assembly code like this:
>>
>> 	test	$MWAIT_ECX_INTERRUPT_BREAK, %cl
>> 	jz	0f
>> 	sti
>> 0:
>> 	monitor
>> 	cli
>>
>> CLI being a relatively cheap operation aiui, is there anything wrong or
>> undesirable with this (smaller overall) alternative?
> 
> Other than it needing to be mwait?

Oops.

>  The overheads aren't interesting;
> they're nothing compared to going idle.
> 
> What does matter is that such a transformation cannot exist in mwait()
> itself, as that breaks acpi_dead_idle(), and if we turn this mwait()
> into inline asm, there's only a single caller of mwait() left.

I was rather think of it living in something derived from sti_mwait_cli(),
which could then be called unconditionally from mwait_idle_with_hints().

Jan

Re: [PATCH 6/6] x86/idle: Fix buggy "x86/mwait-idle: enable interrupts before C1 on Xeons"
Posted by Jan Beulich 3 months, 4 weeks ago
On 02.07.2025 16:41, Andrew Cooper wrote:
> The check of this_softirq_pending must be performed with irqs disabled, but
> this property was broken by an attempt to optimise entry/exit latency.
> 
> Commit c227233ad64c in Linux (which we copied into Xen) was fixed up by
> edc8fc01f608 in Linux, which we have so far missed.
> 
> Going to sleep without waking on interrupts is nonsensical outside of
> play_dead(), so overload this to select between two possible MWAITs, the
> second using the STI shadow to cover MWAIT for exactly the same reason as we
> do in safe_halt().
> 
> Fixes: b17e0ec72ede ("x86/mwait-idle: enable interrupts before C1 on Xeons")
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
with ...

> --- a/xen/arch/x86/cpu/mwait-idle.c
> +++ b/xen/arch/x86/cpu/mwait-idle.c
> @@ -946,12 +946,8 @@ static void cf_check mwait_idle(void)
>  
>  	update_last_cx_stat(power, cx, before);
>  
> -	if (cx->irq_enable_early)
> -		local_irq_enable();
> -
> -	mwait_idle_with_hints(cx->address, MWAIT_ECX_INTERRUPT_BREAK);
> -
> -	local_irq_disable();
> +	mwait_idle_with_hints(cx->address,
> +                              cx->irq_enable_early ? 0 : MWAIT_ECX_INTERRUPT_BREAK);

... indentation here switched to Linux style (to match the rest of the file).

Jan