Commit 7e98b7192046 ("x86, idle: Use static_cpu_has() for CLFLUSH
workaround, add barriers") adds barriers, justified with:
... and add memory barriers around it since the documentation is explicit
that CLFLUSH is only ordered with respect to MFENCE.
The SDM currently states:
Executions of the CLFLUSH instruction are ordered with respect to each
other and with respect to writes, locked read-modify-write instructions,
and fence instructions[1].
With footnote 1 reading:
Earlier versions of this manual specified that executions of the CLFLUSH
instruction were ordered only by the MFENCE instruction. All processors
implementing the CLFLUSH instruction also order it relative to the other
operations enumerated above.
i.e. The SDM was incorrect at the time, and barriers should not have been
inserted. Double checking the original AAI65 errata (not available from
intel.com any more) shows no mention of barriers either.
Additionally, drop the static_cpu_has_bug() and use a plain alternative.
The workaround is a single instruction, with identical address setup to the
MONITOR instruction.
Link: https://web.archive.org/web/20090219054841/http://download.intel.com/design/xeon/specupdt/32033601.pdf
Fixes: 7e98b7192046 ("x86, idle: Use static_cpu_has() for CLFLUSH workaround, add barriers")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@redhat.com>
CC: Borislav Petkov <bp@alien8.de>
CC: Dave Hansen <dave.hansen@linux.intel.com>
CC: x86@kernel.org
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: linux-kernel@vger.kernel.org
diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
index ce857ef54cf1..dff9e7d854ed 100644
--- a/arch/x86/include/asm/mwait.h
+++ b/arch/x86/include/asm/mwait.h
@@ -116,13 +116,11 @@ static __always_inline void __sti_mwait(unsigned long eax, unsigned long ecx)
static __always_inline void mwait_idle_with_hints(unsigned long eax, unsigned long ecx)
{
if (static_cpu_has_bug(X86_BUG_MONITOR) || !current_set_polling_and_test()) {
- if (static_cpu_has_bug(X86_BUG_CLFLUSH_MONITOR)) {
- mb();
- clflush((void *)¤t_thread_info()->flags);
- mb();
- }
+ const void *addr = ¤t_thread_info()->flags;
- __monitor((void *)¤t_thread_info()->flags, 0, 0);
+ alternative_input("", "clflush (%[addr])", X86_BUG_CLFLUSH_MONITOR,
+ [addr] "a" (addr));
+ __monitor(addr, 0, 0);
if (!need_resched()) {
if (ecx & 1) {
On 4/2/25 02:10, Andrew Cooper wrote:
> i.e. The SDM was incorrect at the time, and barriers should not have been
> inserted. Double checking the original AAI65 errata (not available from
> intel.com any more) shows no mention of barriers either.
There's a near copy-and-paste of that code here:
> static __cpuidle void mwait_idle(void)
> {
> if (!current_set_polling_and_test()) {
> if (this_cpu_has(X86_BUG_CLFLUSH_MONITOR)) {
> mb(); /* quirk */
> clflush((void *)¤t_thread_info()->flags);
> mb(); /* quirk */
> }
Any reason it can't get the same treatment?
On 02/04/2025 3:01 pm, Dave Hansen wrote:
> On 4/2/25 02:10, Andrew Cooper wrote:
>> i.e. The SDM was incorrect at the time, and barriers should not have been
>> inserted. Double checking the original AAI65 errata (not available from
>> intel.com any more) shows no mention of barriers either.
> There's a near copy-and-paste of that code here:
>
>> static __cpuidle void mwait_idle(void)
>> {
>> if (!current_set_polling_and_test()) {
>> if (this_cpu_has(X86_BUG_CLFLUSH_MONITOR)) {
>> mb(); /* quirk */
>> clflush((void *)¤t_thread_info()->flags);
>> mb(); /* quirk */
>> }
> Any reason it can't get the same treatment?
Oh, yes that should get the same treatment. (Sorry, too may cross-ports
between Xen and Linux trying to untangle this mess).
I'll submit a v2.
~Andrew
On Wed, Apr 02, 2025 at 10:10:17AM +0100, Andrew Cooper wrote:
> Commit 7e98b7192046 ("x86, idle: Use static_cpu_has() for CLFLUSH
> workaround, add barriers") adds barriers, justified with:
>
> ... and add memory barriers around it since the documentation is explicit
> that CLFLUSH is only ordered with respect to MFENCE.
>
> The SDM currently states:
>
> Executions of the CLFLUSH instruction are ordered with respect to each
> other and with respect to writes, locked read-modify-write instructions,
> and fence instructions[1].
>
> With footnote 1 reading:
>
> Earlier versions of this manual specified that executions of the CLFLUSH
> instruction were ordered only by the MFENCE instruction. All processors
> implementing the CLFLUSH instruction also order it relative to the other
> operations enumerated above.
>
> i.e. The SDM was incorrect at the time, and barriers should not have been
> inserted. Double checking the original AAI65 errata (not available from
> intel.com any more) shows no mention of barriers either.
>
> Additionally, drop the static_cpu_has_bug() and use a plain alternative.
> The workaround is a single instruction, with identical address setup to the
> MONITOR instruction.
>
> Link: https://web.archive.org/web/20090219054841/http://download.intel.com/design/xeon/specupdt/32033601.pdf
> Fixes: 7e98b7192046 ("x86, idle: Use static_cpu_has() for CLFLUSH workaround, add barriers")
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
> ---
> CC: Thomas Gleixner <tglx@linutronix.de>
> CC: Ingo Molnar <mingo@redhat.com>
> CC: Borislav Petkov <bp@alien8.de>
> CC: Dave Hansen <dave.hansen@linux.intel.com>
> CC: x86@kernel.org
> CC: "H. Peter Anvin" <hpa@zytor.com>
> CC: linux-kernel@vger.kernel.org
>
> diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
> index ce857ef54cf1..dff9e7d854ed 100644
> --- a/arch/x86/include/asm/mwait.h
> +++ b/arch/x86/include/asm/mwait.h
> @@ -116,13 +116,11 @@ static __always_inline void __sti_mwait(unsigned long eax, unsigned long ecx)
> static __always_inline void mwait_idle_with_hints(unsigned long eax, unsigned long ecx)
> {
> if (static_cpu_has_bug(X86_BUG_MONITOR) || !current_set_polling_and_test()) {
> - if (static_cpu_has_bug(X86_BUG_CLFLUSH_MONITOR)) {
> - mb();
> - clflush((void *)¤t_thread_info()->flags);
> - mb();
> - }
> + const void *addr = ¤t_thread_info()->flags;
>
> - __monitor((void *)¤t_thread_info()->flags, 0, 0);
> + alternative_input("", "clflush (%[addr])", X86_BUG_CLFLUSH_MONITOR,
> + [addr] "a" (addr));
> + __monitor(addr, 0, 0);
>
> if (!need_resched()) {
> if (ecx & 1) {
LGTM.
Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
The following commit has been merged into the x86/mm branch of tip:
Commit-ID: 90a22a5f841490790ecb17166633582681d44945
Gitweb: https://git.kernel.org/tip/90a22a5f841490790ecb17166633582681d44945
Author: Andrew Cooper <andrew.cooper3@citrix.com>
AuthorDate: Wed, 02 Apr 2025 10:10:17 +01:00
Committer: Ingo Molnar <mingo@kernel.org>
CommitterDate: Wed, 02 Apr 2025 11:54:51 +02:00
x86/idle: Remove mb() barriers for X86_BUG_CLFLUSH_MONITOR in mwait_idle_with_hints()
The following commit, 12 years ago:
7e98b7192046 ("x86, idle: Use static_cpu_has() for CLFLUSH workaround, add barriers")
added barriers around the CLFLUSH in mwait_idle_with_hints(), justified with:
... and add memory barriers around it since the documentation is explicit
that CLFLUSH is only ordered with respect to MFENCE.
The SDM currently states:
Executions of the CLFLUSH instruction are ordered with respect to each
other and with respect to writes, locked read-modify-write instructions,
and fence instructions.
https://web.archive.org/web/20090219054841/http://download.intel.com/design/xeon/specupdt/32033601.pdf
With footnote 1 reading:
Earlier versions of this manual specified that executions of the CLFLUSH
instruction were ordered only by the MFENCE instruction. All processors
implementing the CLFLUSH instruction also order it relative to the other
operations enumerated above.
I.e. The SDM was incorrect at the time, and barriers should not have been
inserted. Double checking the original AAI65 errata (not available from
intel.com any more) shows no mention of barriers either.
Additionally, drop the static_cpu_has_bug() and use a plain alternative().
The workaround is a single instruction, with identical address setup to the
MONITOR instruction.
Fixes: 7e98b7192046 ("x86, idle: Use static_cpu_has() for CLFLUSH workaround, add barriers")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20250402091017.1249019-1-andrew.cooper3@citrix.com
---
arch/x86/include/asm/mwait.h | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
index ce857ef..54dc313 100644
--- a/arch/x86/include/asm/mwait.h
+++ b/arch/x86/include/asm/mwait.h
@@ -116,13 +116,10 @@ static __always_inline void __sti_mwait(unsigned long eax, unsigned long ecx)
static __always_inline void mwait_idle_with_hints(unsigned long eax, unsigned long ecx)
{
if (static_cpu_has_bug(X86_BUG_MONITOR) || !current_set_polling_and_test()) {
- if (static_cpu_has_bug(X86_BUG_CLFLUSH_MONITOR)) {
- mb();
- clflush((void *)¤t_thread_info()->flags);
- mb();
- }
+ const void *addr = ¤t_thread_info()->flags;
- __monitor((void *)¤t_thread_info()->flags, 0, 0);
+ alternative_input("", "clflush (%[addr])", X86_BUG_CLFLUSH_MONITOR, [addr] "a" (addr));
+ __monitor(addr, 0, 0);
if (!need_resched()) {
if (ecx & 1) {
On 02/04/2025 11:10 am, tip-bot2 for Andrew Cooper wrote:
> The following commit has been merged into the x86/mm branch of tip:
>
> Commit-ID: 90a22a5f841490790ecb17166633582681d44945
> Gitweb: https://git.kernel.org/tip/90a22a5f841490790ecb17166633582681d44945
> Author: Andrew Cooper <andrew.cooper3@citrix.com>
> AuthorDate: Wed, 02 Apr 2025 10:10:17 +01:00
> Committer: Ingo Molnar <mingo@kernel.org>
> CommitterDate: Wed, 02 Apr 2025 11:54:51 +02:00
>
> x86/idle: Remove mb() barriers for X86_BUG_CLFLUSH_MONITOR in mwait_idle_with_hints()
>
> The following commit, 12 years ago:
>
> 7e98b7192046 ("x86, idle: Use static_cpu_has() for CLFLUSH workaround, add barriers")
>
> added barriers around the CLFLUSH in mwait_idle_with_hints(), justified with:
>
> ... and add memory barriers around it since the documentation is explicit
> that CLFLUSH is only ordered with respect to MFENCE.
>
> The SDM currently states:
>
> Executions of the CLFLUSH instruction are ordered with respect to each
> other and with respect to writes, locked read-modify-write instructions,
> and fence instructions.
>
> https://web.archive.org/web/20090219054841/http://download.intel.com/design/xeon/specupdt/32033601.pdf
This link isn't the SDM. Its ...
>
> With footnote 1 reading:
>
> Earlier versions of this manual specified that executions of the CLFLUSH
> instruction were ordered only by the MFENCE instruction. All processors
> implementing the CLFLUSH instruction also order it relative to the other
> operations enumerated above.
>
> I.e. The SDM was incorrect at the time, and barriers should not have been
> inserted. Double checking the original AAI65 errata (not available from
> intel.com any more) shows no mention of barriers either.
... this errata link, no longer hosted on intel.com.
~Andrew
© 2016 - 2026 Red Hat, Inc.