The ordering in should_flush_tlb() is entirely non-obvious and is only
correct because x86 is TSO. Clarify the situation by replacing two
WRITE_ONCE()s with smp_store_release(), which on x86 is cosmetic.
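For reference, on x86 a release store needs no fence instruction at
all; __smp_store_release() is roughly (paraphrasing
arch/x86/include/asm/barrier.h):

	#define __smp_store_release(p, v)		\
	do {						\
		compiletime_assert_atomic_type(*p);	\
		barrier();				\
		WRITE_ONCE(*p, v);		\
	} while (0)

TSO never reorders a store with an earlier load or store, so only
compiler reordering needs to be inhibited.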
Additionally, clarify the comment on should_flush_tlb().

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
arch/x86/mm/tlb.c | 30 ++++++++++++++++++++++--------
1 file changed, 22 insertions(+), 8 deletions(-)
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -910,8 +910,10 @@ void switch_mm_irqs_off(struct mm_struct
* Indicate that CR3 is about to change. nmi_uaccess_okay()
* and others are sensitive to the window where mm_cpumask(),
* CR3 and cpu_tlbstate.loaded_mm are not all in sync.
+ *
+ * Also, see should_flush_tlb().
*/
- WRITE_ONCE(this_tlbstate->loaded_mm, LOADED_MM_SWITCHING);
+ smp_store_release(&this_tlbstate->loaded_mm, LOADED_MM_SWITCHING);
barrier();

/* Start receiving IPIs and then read tlb_gen (and LAM below) */
@@ -938,10 +940,11 @@ void switch_mm_irqs_off(struct mm_struct
trace_tlb_flush(TLB_FLUSH_ON_TASK_SWITCH, 0);
}

- /* Make sure we write CR3 before loaded_mm. */
- barrier();
-
- WRITE_ONCE(this_tlbstate->loaded_mm, next);
+ /*
+ * Make sure we write CR3 before loaded_mm.
+ * See nmi_uaccess_okay() and should_flush_tlb().
+ */
+ smp_store_release(&this_tlbstate->loaded_mm, next);
WRITE_ONCE(this_tlbstate->loaded_mm_asid, ns.asid);

cpu_tlbstate_update_lam(new_lam, mm_untag_mask(next));
@@ -1280,9 +1283,20 @@ static bool should_flush_tlb(int cpu, vo
struct flush_tlb_info *info = data;

/*
- * Order the 'loaded_mm' and 'is_lazy' against their
- * write ordering in switch_mm_irqs_off(). Ensure
- * 'is_lazy' is at least as new as 'loaded_mm'.
+ * switch_mm_irqs_off()                         should_flush_tlb()
+ * WRITE_ONCE(is_lazy, false);                  loaded_mm = READ_ONCE(loaded_mm);
+ * smp_store_release(loaded_mm, SWITCHING);     smp_rmb();
+ * mov-cr3
+ * smp_store_release(loaded_mm, next)
+ *                                              if (READ_ONCE(is_lazy))
+ *                                                      return false;
+ *
+ * Where smp_rmb() pairs with either smp_store_release() to ensure
+ * that observing loaded_mm to be either SWITCHING or next also
+ * implies observing is_lazy == false.
+ *
+ * If this were not so, it would be possible to falsely return false
+ * and miss sending an invalidation IPI.
*/
smp_rmb();
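
[ Illustration, not part of the patch: a userspace C11 analogue of the
  pairing the new comment describes. The names (switcher, flusher,
  SWITCHING, next_mm) are made up for the sketch; C11 release stores
  stand in for smp_store_release() and an acquire fence for smp_rmb(). ]

#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define SWITCHING ((void *)-1L)		/* models LOADED_MM_SWITCHING */

static _Atomic(void *) loaded_mm;	/* NULL models the "old" mm */
static atomic_bool is_lazy = true;
static int next_mm;			/* stand-in for the next mm_struct */

static void *switcher(void *arg)	/* models switch_mm_irqs_off() */
{
	(void)arg;
	atomic_store_explicit(&is_lazy, false, memory_order_relaxed);
	atomic_store_explicit(&loaded_mm, SWITCHING, memory_order_release);
	/* ... mov-cr3 would happen here ... */
	atomic_store_explicit(&loaded_mm, &next_mm, memory_order_release);
	return NULL;
}

static void *flusher(void *arg)		/* models should_flush_tlb() */
{
	void *mm = atomic_load_explicit(&loaded_mm, memory_order_relaxed);

	atomic_thread_fence(memory_order_acquire);	/* smp_rmb() */

	/*
	 * The acquire fence pairs with the release stores: reading
	 * SWITCHING or &next_mm above guarantees the is_lazy = false
	 * store is visible, so this branch can never be taken.
	 */
	if (atomic_load_explicit(&is_lazy, memory_order_relaxed) &&
	    (mm == SWITCHING || mm == &next_mm))
		fprintf(stderr, "ordering violated\n");
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, switcher, NULL);
	pthread_create(&b, NULL, flusher, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return 0;
}

Build with cc -pthread; the "ordering violated" message can never
print, which is exactly the guarantee should_flush_tlb() relies on to
not miss sending an invalidation IPI.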
On 5/20/25 03:55, Peter Zijlstra wrote:
> The ordering in should_flush_tlb() is entirely non-obvious and is only
> correct because x86 is TSO. Clarify the situation by replacing two
> WRITE_ONCE()s with smp_store_release(), which on x86 is cosmetic.
>
> Additionally, clarify the comment on should_flush_tlb().

Thanks for clarifying the ordering in those comments. It's much
appreciated by us mere mortals!

Oh, and there are a few we's that snuck into the comments. But whoever
applies this can fix them up.

Acked-by: Dave Hansen <dave.hansen@linux.intel.com>