[RFC PATCH v4 5/8] x86/smp native_play_dead: Prefer cpuidle_play_dead() over mwait_play_dead()

Patryk Wlazlyn posted 8 patches 1 year, 2 months ago
There is a newer version of this series
[RFC PATCH v4 5/8] x86/smp native_play_dead: Prefer cpuidle_play_dead() over mwait_play_dead()
Posted by Patryk Wlazlyn 1 year, 2 months ago
The algorithm based on cpuid leaf 0x5 in the mwait_play_dead(),
for looking up the mwait hint for the deepest cstate may calculate the
wrong hint on recent Intel x86 platforms.

Prefer cpuidle_play_dead() over the generic mwait_play_dead() loop and
only fallback to the later in case of missing appropriate enter_dead()
handler.

Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com>
---
 arch/x86/kernel/smpboot.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index d0464c7a0af5..2627b56fb9bc 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1413,9 +1413,10 @@ void native_play_dead(void)
 	play_dead_common();
 	tboot_shutdown(TB_SHUTDOWN_WFS);
 
+	/* The first successful play_dead() will not return */
+	cpuidle_play_dead();
 	mwait_play_dead();
-	if (cpuidle_play_dead())
-		hlt_play_dead();
+	hlt_play_dead();
 }
 
 #else /* ... !CONFIG_HOTPLUG_CPU */
-- 
2.47.0
Re: [RFC PATCH v4 5/8] x86/smp native_play_dead: Prefer cpuidle_play_dead() over mwait_play_dead()
Posted by Rafael J. Wysocki 1 year, 2 months ago
On Mon, Nov 25, 2024 at 2:21 PM Patryk Wlazlyn
<patryk.wlazlyn@linux.intel.com> wrote:
>
> The algorithm based on cpuid leaf 0x5 in the mwait_play_dead(),
> for looking up the mwait hint for the deepest cstate may calculate the
> wrong hint on recent Intel x86 platforms.
>
> Prefer cpuidle_play_dead() over the generic mwait_play_dead() loop and
> only fallback to the later in case of missing appropriate enter_dead()
> handler.
>
> Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com>
> ---
>  arch/x86/kernel/smpboot.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index d0464c7a0af5..2627b56fb9bc 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -1413,9 +1413,10 @@ void native_play_dead(void)
>         play_dead_common();
>         tboot_shutdown(TB_SHUTDOWN_WFS);
>
> +       /* The first successful play_dead() will not return */
> +       cpuidle_play_dead();
>         mwait_play_dead();
> -       if (cpuidle_play_dead())
> -               hlt_play_dead();
> +       hlt_play_dead();
>  }
>
>  #else /* ... !CONFIG_HOTPLUG_CPU */
> --

If you first make intel_idle provide :enter_dead() for all CPUs on all
platforms and implement it by calling mwait_play_dead_with_hint(), you
won't need mwait_play_dead() any more.
Re: [RFC PATCH v4 5/8] x86/smp native_play_dead: Prefer cpuidle_play_dead() over mwait_play_dead()
Posted by Patryk Wlazlyn 1 year, 2 months ago
> If you first make intel_idle provide :enter_dead() for all CPUs on all
> platforms and implement it by calling mwait_play_dead_with_hint(), you
> won't need mwait_play_dead() any more.
Crossed my mind, but because mwait_play_dead doesn't filter on Intel
vendor specifically, I thought that I might break some edge case.


I am all for simplifying.
Re: [RFC PATCH v4 5/8] x86/smp native_play_dead: Prefer cpuidle_play_dead() over mwait_play_dead()
Posted by Rafael J. Wysocki 1 year, 2 months ago
On Mon, Nov 25, 2024 at 3:43 PM Patryk Wlazlyn
<patryk.wlazlyn@linux.intel.com> wrote:
>
> > If you first make intel_idle provide :enter_dead() for all CPUs on all
> > platforms and implement it by calling mwait_play_dead_with_hint(), you
> > won't need mwait_play_dead() any more.
> Crossed my mind, but because mwait_play_dead doesn't filter on Intel
> vendor specifically,

In practice, it does.

The vendor check in it is equivalent to "if Intel".

> I thought that I might break some edge case.
>
>
> I am all for simplifying.
Re: [RFC PATCH v4 5/8] x86/smp native_play_dead: Prefer cpuidle_play_dead() over mwait_play_dead()
Posted by Patryk Wlazlyn 1 year, 2 months ago
>>> If you first make intel_idle provide :enter_dead() for all CPUs on all
>>> platforms and implement it by calling mwait_play_dead_with_hint(), you
>>> won't need mwait_play_dead() any more.
>> Crossed my mind, but because mwait_play_dead doesn't filter on Intel
>> vendor specifically,
>
> In practice, it does.
>
> The vendor check in it is equivalent to "if Intel".

Actually, what about INTEL_IDLE=n?
We might hit acpi_idle, which would call mwait_play_dead_with_hint() now, but
if we don't, don't we want to try mwait_play_dead before hlt or is it too
unrealistic to happen?
Re: [RFC PATCH v4 5/8] x86/smp native_play_dead: Prefer cpuidle_play_dead() over mwait_play_dead()
Posted by Rafael J. Wysocki 1 year, 2 months ago
On Tue, Nov 26, 2024 at 12:56 PM Patryk Wlazlyn
<patryk.wlazlyn@linux.intel.com> wrote:
>
> >>> If you first make intel_idle provide :enter_dead() for all CPUs on all
> >>> platforms and implement it by calling mwait_play_dead_with_hint(), you
> >>> won't need mwait_play_dead() any more.
> >> Crossed my mind, but because mwait_play_dead doesn't filter on Intel
> >> vendor specifically,
> >
> > In practice, it does.
> >
> > The vendor check in it is equivalent to "if Intel".
>
> Actually, what about INTEL_IDLE=n?
> We might hit acpi_idle, which would call mwait_play_dead_with_hint() now, but
> if we don't, don't we want to try mwait_play_dead before hlt or is it too
> unrealistic to happen?

In that case the hint to use would not be known anyway, so
hlt_play_dead() is the right choice IMV.
Re: [RFC PATCH v4 5/8] x86/smp native_play_dead: Prefer cpuidle_play_dead() over mwait_play_dead()
Posted by Patryk Wlazlyn 1 year, 2 months ago
>>>>> If you first make intel_idle provide :enter_dead() for all CPUs on all
>>>>> platforms and implement it by calling mwait_play_dead_with_hint(), you
>>>>> won't need mwait_play_dead() any more.
>>>> Crossed my mind, but because mwait_play_dead doesn't filter on Intel
>>>> vendor specifically,
>>>
>>> In practice, it does.
>>>
>>> The vendor check in it is equivalent to "if Intel".
>>
>> Actually, what about INTEL_IDLE=n?
>> We might hit acpi_idle, which would call mwait_play_dead_with_hint() now, but
>> if we don't, don't we want to try mwait_play_dead before hlt or is it too
>> unrealistic to happen?
>
> In that case the hint to use would not be known anyway, so
> hlt_play_dead() is the right choice IMV.

Fair, it's not known, but we could fallback to the old, cpuid leaf 0x5 based
algorithm, which works on a lot of hardware.

That being said, I think it's cleaner to get rid of the old algorithm entirely
and rely on idle drivers to do the right thing from this point forward.

If we could bring the CPU out of the mwait loop into the hlt loop somehow (via
interrupt for example) we could remove the old kexec hack altogether.
Re: [RFC PATCH v4 5/8] x86/smp native_play_dead: Prefer cpuidle_play_dead() over mwait_play_dead()
Posted by Rafael J. Wysocki 1 year, 2 months ago
On Tue, Nov 26, 2024 at 2:10 PM Patryk Wlazlyn
<patryk.wlazlyn@linux.intel.com> wrote:
>
> >>>>> If you first make intel_idle provide :enter_dead() for all CPUs on all
> >>>>> platforms and implement it by calling mwait_play_dead_with_hint(), you
> >>>>> won't need mwait_play_dead() any more.
> >>>> Crossed my mind, but because mwait_play_dead doesn't filter on Intel
> >>>> vendor specifically,
> >>>
> >>> In practice, it does.
> >>>
> >>> The vendor check in it is equivalent to "if Intel".
> >>
> >> Actually, what about INTEL_IDLE=n?
> >> We might hit acpi_idle, which would call mwait_play_dead_with_hint() now, but
> >> if we don't, don't we want to try mwait_play_dead before hlt or is it too
> >> unrealistic to happen?
> >
> > In that case the hint to use would not be known anyway, so
> > hlt_play_dead() is the right choice IMV.
>
> Fair, it's not known, but we could fallback to the old, cpuid leaf 0x5 based
> algorithm, which works on a lot of hardware.
>
> That being said, I think it's cleaner to get rid of the old algorithm entirely
> and rely on idle drivers to do the right thing from this point forward.
>
> If we could bring the CPU out of the mwait loop into the hlt loop somehow (via
> interrupt for example) we could remove the old kexec hack altogether.

Well, the problem is that CPUs may be woken up from MWAIT waits for
various reasons and the reason for a given wakeup is not known a
priori.
Re: [RFC PATCH v4 5/8] x86/smp native_play_dead: Prefer cpuidle_play_dead() over mwait_play_dead()
Posted by Peter Zijlstra 1 year, 2 months ago
On Mon, Nov 25, 2024 at 02:20:25PM +0100, Patryk Wlazlyn wrote:
> The algorithm based on cpuid leaf 0x5 in the mwait_play_dead(),
> for looking up the mwait hint for the deepest cstate may calculate the
> wrong hint on recent Intel x86 platforms.
> 
> Prefer cpuidle_play_dead() over the generic mwait_play_dead() loop and
> only fallback to the later in case of missing appropriate enter_dead()
> handler.

So 7/8 should go first, such that picking cpuidle doesn't end up in some
weird state.

> 
> Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com>
> ---
>  arch/x86/kernel/smpboot.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index d0464c7a0af5..2627b56fb9bc 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -1413,9 +1413,10 @@ void native_play_dead(void)
>  	play_dead_common();
>  	tboot_shutdown(TB_SHUTDOWN_WFS);
>  
> +	/* The first successful play_dead() will not return */
> +	cpuidle_play_dead();
>  	mwait_play_dead();
> -	if (cpuidle_play_dead())
> -		hlt_play_dead();
> +	hlt_play_dead();
>  }
>  
>  #else /* ... !CONFIG_HOTPLUG_CPU */
> -- 
> 2.47.0
>