Commit bf5835bcdb96 ("intel_idle: Disable IBRS during long idle")
disables IBRS when the CPU enters long idle. However, when a CPU becomes
offline, the IBRS bit is still set when X86_FEATURE_KERNEL_IBRS is
enabled. That will impact the performance of a sibling CPU. Mitigate
this performance impact by clearing all the mitigation bits in SPEC_CTRL
MSR when offline and restoring the value of the MSR when it becomes
online again.
Signed-off-by: Waiman Long <longman@redhat.com>
---
arch/x86/kernel/smpboot.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 352f0ce1ece4..5ff82fef413c 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -84,6 +84,7 @@
#include <asm/hw_irq.h>
#include <asm/stackprotector.h>
#include <asm/sev.h>
+#include <asm/nospec-branch.h>
/* representing HT siblings of each logical CPU */
DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_map);
@@ -1838,12 +1839,24 @@ void __noreturn hlt_play_dead(void)
void native_play_dead(void)
{
+ u64 spec_ctrl = spec_ctrl_current();
+
+ if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS)) {
+ this_cpu_write(x86_spec_ctrl_current, 0);
+ native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
+ }
+
play_dead_common();
tboot_shutdown(TB_SHUTDOWN_WFS);
mwait_play_dead();
if (cpuidle_play_dead())
hlt_play_dead();
+
+ if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS)) {
+ native_wrmsrl(MSR_IA32_SPEC_CTRL, spec_ctrl);
+ this_cpu_write(x86_spec_ctrl_current, spec_ctrl);
+ }
}
#else /* ... !CONFIG_HOTPLUG_CPU */
--
2.31.1
On Tue, Jun 20, 2023 at 10:06:22AM -0400, Waiman Long wrote:
> Commit bf5835bcdb96 ("intel_idle: Disable IBRS during long idle")
> disables IBRS when the CPU enters long idle. However, when a CPU becomes
> offline, the IBRS bit is still set when X86_FEATURE_KERNEL_IBRS is
> enabled. That will impact the performance of a sibling CPU. Mitigate
> this performance impact by clearing all the mitigation bits in SPEC_CTRL
> MSR when offline and restoring the value of the MSR when it becomes
> online again.
>
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
> arch/x86/kernel/smpboot.c | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index 352f0ce1ece4..5ff82fef413c 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -84,6 +84,7 @@
> #include <asm/hw_irq.h>
> #include <asm/stackprotector.h>
> #include <asm/sev.h>
> +#include <asm/nospec-branch.h>
>
> /* representing HT siblings of each logical CPU */
> DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_map);
> @@ -1838,12 +1839,24 @@ void __noreturn hlt_play_dead(void)
>
> void native_play_dead(void)
> {
> + u64 spec_ctrl = spec_ctrl_current();
> +
> + if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS)) {
> + this_cpu_write(x86_spec_ctrl_current, 0);
> + native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
> + }
> +
> play_dead_common();
> tboot_shutdown(TB_SHUTDOWN_WFS);
>
> mwait_play_dead();
> if (cpuidle_play_dead())
> hlt_play_dead();
> +
> + if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS)) {
> + native_wrmsrl(MSR_IA32_SPEC_CTRL, spec_ctrl);
> + this_cpu_write(x86_spec_ctrl_current, spec_ctrl);
> + }
> }
play_dead() is marked __noreturn
On 6/21/23 03:23, Peter Zijlstra wrote:
> On Tue, Jun 20, 2023 at 10:06:22AM -0400, Waiman Long wrote:
>> Commit bf5835bcdb96 ("intel_idle: Disable IBRS during long idle")
>> disables IBRS when the CPU enters long idle. However, when a CPU becomes
>> offline, the IBRS bit is still set when X86_FEATURE_KERNEL_IBRS is
>> enabled. That will impact the performance of a sibling CPU. Mitigate
>> this performance impact by clearing all the mitigation bits in SPEC_CTRL
>> MSR when offline and restoring the value of the MSR when it becomes
>> online again.
>>
>> Signed-off-by: Waiman Long <longman@redhat.com>
>> ---
>> arch/x86/kernel/smpboot.c | 13 +++++++++++++
>> 1 file changed, 13 insertions(+)
>>
>> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
>> index 352f0ce1ece4..5ff82fef413c 100644
>> --- a/arch/x86/kernel/smpboot.c
>> +++ b/arch/x86/kernel/smpboot.c
>> @@ -84,6 +84,7 @@
>> #include <asm/hw_irq.h>
>> #include <asm/stackprotector.h>
>> #include <asm/sev.h>
>> +#include <asm/nospec-branch.h>
>>
>> /* representing HT siblings of each logical CPU */
>> DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_map);
>> @@ -1838,12 +1839,24 @@ void __noreturn hlt_play_dead(void)
>>
>> void native_play_dead(void)
>> {
>> + u64 spec_ctrl = spec_ctrl_current();
>> +
>> + if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS)) {
>> + this_cpu_write(x86_spec_ctrl_current, 0);
>> + native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
>> + }
>> +
>> play_dead_common();
>> tboot_shutdown(TB_SHUTDOWN_WFS);
>>
>> mwait_play_dead();
>> if (cpuidle_play_dead())
>> hlt_play_dead();
>> +
>> + if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS)) {
>> + native_wrmsrl(MSR_IA32_SPEC_CTRL, spec_ctrl);
>> + this_cpu_write(x86_spec_ctrl_current, spec_ctrl);
>> + }
>> }
> play_dead() is marked __noreturn
There are different versions of play_dead() in the kernel. Some of them
are indeed marked __noreturn like the non-SMP one in
arch/x86/kernel/process.c. The native_play_dead() that I am patching
isn't one of those.
Cheers,
Longman
On Wed, Jun 21, 2023 at 09:59:52AM -0400, Waiman Long wrote:
>
> On 6/21/23 03:23, Peter Zijlstra wrote:
> > On Tue, Jun 20, 2023 at 10:06:22AM -0400, Waiman Long wrote:
> > > Commit bf5835bcdb96 ("intel_idle: Disable IBRS during long idle")
> > > disables IBRS when the CPU enters long idle. However, when a CPU becomes
> > > offline, the IBRS bit is still set when X86_FEATURE_KERNEL_IBRS is
> > > enabled. That will impact the performance of a sibling CPU. Mitigate
> > > this performance impact by clearing all the mitigation bits in SPEC_CTRL
> > > MSR when offline and restoring the value of the MSR when it becomes
> > > online again.
> > >
> > > Signed-off-by: Waiman Long <longman@redhat.com>
> > > ---
> > > arch/x86/kernel/smpboot.c | 13 +++++++++++++
> > > 1 file changed, 13 insertions(+)
> > >
> > > diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> > > index 352f0ce1ece4..5ff82fef413c 100644
> > > --- a/arch/x86/kernel/smpboot.c
> > > +++ b/arch/x86/kernel/smpboot.c
> > > @@ -84,6 +84,7 @@
> > > #include <asm/hw_irq.h>
> > > #include <asm/stackprotector.h>
> > > #include <asm/sev.h>
> > > +#include <asm/nospec-branch.h>
> > > /* representing HT siblings of each logical CPU */
> > > DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_map);
> > > @@ -1838,12 +1839,24 @@ void __noreturn hlt_play_dead(void)
> > > void native_play_dead(void)
> > > {
> > > + u64 spec_ctrl = spec_ctrl_current();
> > > +
> > > + if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS)) {
> > > + this_cpu_write(x86_spec_ctrl_current, 0);
> > > + native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
> > > + }
> > > +
> > > play_dead_common();
> > > tboot_shutdown(TB_SHUTDOWN_WFS);
> > > mwait_play_dead();
> > > if (cpuidle_play_dead())
> > > hlt_play_dead();
> > > +
> > > + if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS)) {
> > > + native_wrmsrl(MSR_IA32_SPEC_CTRL, spec_ctrl);
> > > + this_cpu_write(x86_spec_ctrl_current, spec_ctrl);
> > > + }
> > > }
> > play_dead() is marked __noreturn
>
> There are different versions of play_dead() in the kernel. Some of them are
> indeed marked __noreturn like the non-SMP one in arch/x86/kernel/process.c.
> The native_play_dead() that I am patching isn't one of those.
mostly by accident I think, hlt_play_dead() is, so I'm thinking
everybody (all compiler and objtool) managed to figure out
native_play_dead() is __noreturn too.
On 6/21/23 10:36, Peter Zijlstra wrote:
> On Wed, Jun 21, 2023 at 09:59:52AM -0400, Waiman Long wrote:
>> On 6/21/23 03:23, Peter Zijlstra wrote:
>>> On Tue, Jun 20, 2023 at 10:06:22AM -0400, Waiman Long wrote:
>>>> Commit bf5835bcdb96 ("intel_idle: Disable IBRS during long idle")
>>>> disables IBRS when the CPU enters long idle. However, when a CPU becomes
>>>> offline, the IBRS bit is still set when X86_FEATURE_KERNEL_IBRS is
>>>> enabled. That will impact the performance of a sibling CPU. Mitigate
>>>> this performance impact by clearing all the mitigation bits in SPEC_CTRL
>>>> MSR when offline and restoring the value of the MSR when it becomes
>>>> online again.
>>>>
>>>> Signed-off-by: Waiman Long <longman@redhat.com>
>>>> ---
>>>> arch/x86/kernel/smpboot.c | 13 +++++++++++++
>>>> 1 file changed, 13 insertions(+)
>>>>
>>>> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
>>>> index 352f0ce1ece4..5ff82fef413c 100644
>>>> --- a/arch/x86/kernel/smpboot.c
>>>> +++ b/arch/x86/kernel/smpboot.c
>>>> @@ -84,6 +84,7 @@
>>>> #include <asm/hw_irq.h>
>>>> #include <asm/stackprotector.h>
>>>> #include <asm/sev.h>
>>>> +#include <asm/nospec-branch.h>
>>>> /* representing HT siblings of each logical CPU */
>>>> DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_sibling_map);
>>>> @@ -1838,12 +1839,24 @@ void __noreturn hlt_play_dead(void)
>>>> void native_play_dead(void)
>>>> {
>>>> + u64 spec_ctrl = spec_ctrl_current();
>>>> +
>>>> + if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS)) {
>>>> + this_cpu_write(x86_spec_ctrl_current, 0);
>>>> + native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
>>>> + }
>>>> +
>>>> play_dead_common();
>>>> tboot_shutdown(TB_SHUTDOWN_WFS);
>>>> mwait_play_dead();
>>>> if (cpuidle_play_dead())
>>>> hlt_play_dead();
>>>> +
>>>> + if (cpu_feature_enabled(X86_FEATURE_KERNEL_IBRS)) {
>>>> + native_wrmsrl(MSR_IA32_SPEC_CTRL, spec_ctrl);
>>>> + this_cpu_write(x86_spec_ctrl_current, spec_ctrl);
>>>> + }
>>>> }
>>> play_dead() is marked __noreturn
>> There are different versions of play_dead() in the kernel. Some of them are
>> indeed marked __noreturn like the non-SMP one in arch/x86/kernel/process.c.
>> The native_play_dead() that I am patching isn't one of those.
> mostly by accident I think, hlt_play_dead() is, so I'm thinking
> everybody (all compiler and objtool) managed to figure out
> native_play_dead() is __noreturn too.
Well, hlt_play_dead() is only called if cpuidle_play_dead() returns an
error which is not the typical case. My testing does confirm that this
patch is able to keep the IBRS bit off when a CPU is offline via its
online sysfs file.
Cheers,
Longman
On Wed, Jun 21, 2023 at 10:44:23AM -0400, Waiman Long wrote: > Well, hlt_play_dead() is only called if cpuidle_play_dead() returns an error > which is not the typical case. My testing does confirm that this patch is > able to keep the IBRS bit off when a CPU is offline via its online sysfs > file. The point is; your re-enable IBRS hunk at the end is dead-code. It should never ever run and having it is confusing.
On 6/21/23 10:48, Peter Zijlstra wrote: > On Wed, Jun 21, 2023 at 10:44:23AM -0400, Waiman Long wrote: > >> Well, hlt_play_dead() is only called if cpuidle_play_dead() returns an error >> which is not the typical case. My testing does confirm that this patch is >> able to keep the IBRS bit off when a CPU is offline via its online sysfs >> file. > The point is; your re-enable IBRS hunk at the end is dead-code. It > should never ever run and having it is confusing. What I meant is that hlt_play_dead() should never be called unless there is some serious problem with the system and native_play_dead() does return in normal usage. Cheers, Longman
On Wed, Jun 21, 2023 at 10:51:33AM -0400, Waiman Long wrote: > > On 6/21/23 10:48, Peter Zijlstra wrote: > > On Wed, Jun 21, 2023 at 10:44:23AM -0400, Waiman Long wrote: > > > > > Well, hlt_play_dead() is only called if cpuidle_play_dead() returns an error > > > which is not the typical case. My testing does confirm that this patch is > > > able to keep the IBRS bit off when a CPU is offline via its online sysfs > > > file. > > The point is; your re-enable IBRS hunk at the end is dead-code. It > > should never ever run and having it is confusing. > > What I meant is that hlt_play_dead() should never be called unless there is > some serious problem with the system and native_play_dead() does return in > normal usage. This is all through arch_cpu_idle_dead() which is __noreturn. And no, none of this must ever return. If you want an offline CPU to come back, you re-init.
On 6/21/23 10:54, Peter Zijlstra wrote: > On Wed, Jun 21, 2023 at 10:51:33AM -0400, Waiman Long wrote: >> On 6/21/23 10:48, Peter Zijlstra wrote: >>> On Wed, Jun 21, 2023 at 10:44:23AM -0400, Waiman Long wrote: >>> >>>> Well, hlt_play_dead() is only called if cpuidle_play_dead() returns an error >>>> which is not the typical case. My testing does confirm that this patch is >>>> able to keep the IBRS bit off when a CPU is offline via its online sysfs >>>> file. >>> The point is; your re-enable IBRS hunk at the end is dead-code. It >>> should never ever run and having it is confusing. >> What I meant is that hlt_play_dead() should never be called unless there is >> some serious problem with the system and native_play_dead() does return in >> normal usage. > This is all through arch_cpu_idle_dead() which is __noreturn. And no, > none of this must ever return. > > If you want an offline CPU to come back, you re-init. Yes, you are right. I thought it will return. I will update the patch accordingly. Thanks, Longman
© 2016 - 2026 Red Hat, Inc.