arch/x86/include/asm/processor.h | 2 +- arch/x86/kernel/process.c | 1 + 2 files changed, 2 insertions(+), 1 deletion(-)
From: Li RongQing <lirongqing@baidu.com>
x86 KVM guests with MWAIT can load cpuidle-haltpoll driver, and will
cause performance degradation, so override prefer_mwait_c1_over_halt
to a new value, aviod loading cpuidle-haltpoll driver
Signed-off-by: Li RongQing <lirongqing@baidu.com>
---
arch/x86/include/asm/processor.h | 2 +-
arch/x86/kernel/process.c | 1 +
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 67c9d73..6bc94fd 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -658,7 +658,7 @@ extern void amd_e400_c1e_apic_setup(void);
extern unsigned long boot_option_idle_override;
enum idle_boot_override {IDLE_NO_OVERRIDE=0, IDLE_HALT, IDLE_NOMWAIT,
- IDLE_POLL};
+ IDLE_POLL, IDLE_PREF_MWAIT};
extern void enable_sep_cpu(void);
extern int sysenter_setup(void);
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index c21b734..a16985c 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -889,6 +889,7 @@ void select_idle_routine(const struct cpuinfo_x86 *c)
} else if (prefer_mwait_c1_over_halt(c)) {
pr_info("using mwait in idle threads\n");
x86_idle = mwait_idle;
+ boot_option_idle_override = IDLE_PREF_MWAIT;
} else if (cpu_feature_enabled(X86_FEATURE_TDX_GUEST)) {
pr_info("using TDX aware idle routine\n");
x86_idle = tdx_safe_halt;
--
2.9.4
Li! On Fri, Dec 02 2022 at 11:37, lirongqing@baidu.com wrote: > From: Li RongQing <lirongqing@baidu.com> > > x86 KVM guests with MWAIT can load cpuidle-haltpoll driver, and will > cause performance degradation, so override prefer_mwait_c1_over_halt > to a new value, aviod loading cpuidle-haltpoll driver Neither the subject line nor the above makes any sense to me. prefer_mwait_c1_over_halt() is a function which is invoked and when it returns true then the execution ends up in the code path you are patching. > @@ -889,6 +889,7 @@ void select_idle_routine(const struct cpuinfo_x86 *c) > } else if (prefer_mwait_c1_over_halt(c)) { > pr_info("using mwait in idle threads\n"); > x86_idle = mwait_idle; > + boot_option_idle_override = IDLE_PREF_MWAIT; What you do is setting boot_option_idle_override to a new value, but that has nothing to do with prefer_mwait_c1_over_halt() at all. So how are you overriding that function to a new value? But that's just a word smithing problem. The real and way worse problem is that you pick a variable, which has the purpose to capture the idle override on the kernel command line, and modify it as you see fit, just to prevent that driver from loading. select_idle_routine() reads it to check whether there was a command line override or not. But it is not supposed to write it. Why? Have you checked what else evaluates that variable? Obviously not, because a simple grep would have told you: drivers/cpuidle/cpuidle-haltpoll.c: if (boot_option_idle_override != IDLE_NO_OVERRIDE) drivers/idle/intel_idle.c: if (boot_option_idle_override != IDLE_NO_OVERRIDE) Congratulations! Your patch breaks the default setup of every recent Intel system on the planet because it not only prevents the cpuidle-haltpoll, but also the intel_idle driver from loading. Seriously. It's not too much asked to do at least a simple grep and look at all _nine_ places which evaluate boot_option_idle_override. Thanks, tglx
> -----Original Message-----
> From: Thomas Gleixner <tglx@linutronix.de>
> Sent: Saturday, December 3, 2022 2:48 AM
> To: Li,Rongqing <lirongqing@baidu.com>; mingo@redhat.com; bp@alien8.de;
> dave.hansen@linux.intel.com; x86@kernel.org; peterz@infradead.org;
> tony.luck@intel.com; wyes.karny@amd.com; linux-kernel@vger.kernel.org;
> rafael.j.wysocki@intel.com
> Subject: Re: [PATCH][RFC] x86: override prefer_mwait_c1_over_halt to avoid
> loading cpuidle-haltpoll driver
>
> Li!
>
> On Fri, Dec 02 2022 at 11:37, lirongqing@baidu.com wrote:
> > From: Li RongQing <lirongqing@baidu.com>
> >
> > x86 KVM guests with MWAIT can load cpuidle-haltpoll driver, and will
> > cause performance degradation, so override prefer_mwait_c1_over_halt
> > to a new value, aviod loading cpuidle-haltpoll driver
>
> Neither the subject line nor the above makes any sense to me.
>
> prefer_mwait_c1_over_halt() is a function which is invoked and when it returns
> true then the execution ends up in the code path you are patching.
>
> > @@ -889,6 +889,7 @@ void select_idle_routine(const struct cpuinfo_x86 *c)
> > } else if (prefer_mwait_c1_over_halt(c)) {
> > pr_info("using mwait in idle threads\n");
> > x86_idle = mwait_idle;
> > + boot_option_idle_override = IDLE_PREF_MWAIT;
>
> What you do is setting boot_option_idle_override to a new value, but that has
> nothing to do with prefer_mwait_c1_over_halt() at all.
>
> So how are you overriding that function to a new value?
>
> But that's just a word smithing problem.
>
> The real and way worse problem is that you pick a variable, which has the
> purpose to capture the idle override on the kernel command line, and modify it
> as you see fit, just to prevent that driver from loading.
>
> select_idle_routine() reads it to check whether there was a command line
> override or not. But it is not supposed to write it. Why?
>
> Have you checked what else evaluates that variable? Obviously not, because a
> simple grep would have told you:
>
> drivers/cpuidle/cpuidle-haltpoll.c: if (boot_option_idle_override !=
> IDLE_NO_OVERRIDE)
> drivers/idle/intel_idle.c: if (boot_option_idle_override !=
> IDLE_NO_OVERRIDE)
>
> Congratulations!
>
> Your patch breaks the default setup of every recent Intel system on the planet
> because it not only prevents the cpuidle-haltpoll, but also the intel_idle driver
> from loading.
>
> Seriously. It's not too much asked to do at least a simple grep and look at all
> _nine_ places which evaluate boot_option_idle_override.
>
Sorry for the careless
Thanks for the review, I will send a new version, which export a function to tell haltpoll driver whether or not mwait is used, like below
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 67c9d73..159ef33 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -862,4 +862,6 @@ bool arch_is_platform_page(u64 paddr);
#define arch_is_platform_page arch_is_platform_page
#endif
+bool is_mwait_idle(void);
+
#endif /* _ASM_X86_PROCESSOR_H */
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index c21b734..330972c 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -896,6 +896,12 @@ void select_idle_routine(const struct cpuinfo_x86 *c)
x86_idle = default_idle;
}
+bool is_mwait_idle(void)
+{
+ return x86_idle == mwait_idle;
+}
+EXPORT_SYMBOL_GPL(is_mwait_idle);
+
void amd_e400_c1e_apic_setup(void)
{
if (boot_cpu_has_bug(X86_BUG_AMD_APIC_C1E)) {
diff --git a/drivers/cpuidle/cpuidle-haltpoll.c b/drivers/cpuidle/cpuidle-haltpoll.c
index 3a39a7f..8cf1ddf 100644
--- a/drivers/cpuidle/cpuidle-haltpoll.c
+++ b/drivers/cpuidle/cpuidle-haltpoll.c
@@ -17,6 +17,7 @@
#include <linux/sched/idle.h>
#include <linux/kvm_para.h>
#include <linux/cpuidle_haltpoll.h>
+#include <linux/processor.h>
static bool force __read_mostly;
module_param(force, bool, 0444);
@@ -111,6 +112,9 @@ static int __init haltpoll_init(void)
if (!kvm_para_available() || !haltpoll_want())
return -ENODEV;
+ if (is_mwait_idle())
+ return -ENODEV;
+
cpuidle_poll_state_init(drv);
ret = cpuidle_register_driver(drv);
Thanks
-Li
© 2016 - 2025 Red Hat, Inc.