[PATCH v14 05/13] x86/mm: use INVLPGB in flush_tlb_all

Rik van Riel posted 13 patches 9 months, 3 weeks ago
Posted by Rik van Riel 9 months, 3 weeks ago
The flush_tlb_all() function is not used a whole lot, but we might
as well use broadcast TLB flushing there, too.

Signed-off-by: Rik van Riel <riel@surriel.com>
Tested-by: Manali Shukla <Manali.Shukla@amd.com>
Tested-by: Brendan Jackman <jackmanb@google.com>
Tested-by: Michael Kelley <mhklinux@outlook.com>
---
 arch/x86/mm/tlb.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index f44a03bca41c..a6cd61d5f423 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -1064,7 +1064,6 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
 	mmu_notifier_arch_invalidate_secondary_tlbs(mm, start, end);
 }
 
-
 static void do_flush_tlb_all(void *info)
 {
 	count_vm_tlb_event(NR_TLB_REMOTE_FLUSH_RECEIVED);
@@ -1074,6 +1073,15 @@ static void do_flush_tlb_all(void *info)
 void flush_tlb_all(void)
 {
 	count_vm_tlb_event(NR_TLB_REMOTE_FLUSH);
+
+	/* First try (faster) hardware-assisted TLB invalidation. */
+	if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) {
+		guard(preempt)();
+		invlpgb_flush_all();
+		return;
+	}
+
+	/* Fall back to the IPI-based invalidation. */
 	on_each_cpu(do_flush_tlb_all, NULL, 1);
 }
 
-- 
2.47.1
Re: [PATCH v14 05/13] x86/mm: use INVLPGB in flush_tlb_all
Posted by Borislav Petkov 9 months, 3 weeks ago
On Tue, Feb 25, 2025 at 10:00:40PM -0500, Rik van Riel wrote:
> The flush_tlb_all() function is not used a whole lot, but we might
> as well use broadcast TLB flushing there, too.
> 
> Signed-off-by: Rik van Riel <riel@surriel.com>
> Tested-by: Manali Shukla <Manali.Shukla@amd.com>
> Tested-by: Brendan Jackman <jackmanb@google.com>
> Tested-by: Michael Kelley <mhklinux@outlook.com>
> ---
>  arch/x86/mm/tlb.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)

Edits ontop:

--- /tmp/current.patch	2025-02-28 23:18:51.670490799 +0100
+++ /tmp/0001-x86-mm-Use-INVLPGB-in-flush_tlb_all.patch	2025-02-28 23:17:48.590844991 +0100
@@ -1,22 +1,23 @@
+From 5bdf59c0589b71328bd340ea48a00917def62dc0 Mon Sep 17 00:00:00 2001
 From: Rik van Riel <riel@surriel.com>
 Date: Tue, 25 Feb 2025 22:00:40 -0500
-Subject: x86/mm: Use INVLPGB in flush_tlb_all
+Subject: [PATCH] x86/mm: Use INVLPGB in flush_tlb_all()
 
-The flush_tlb_all() function is not used a whole lot, but we might
-as well use broadcast TLB flushing there, too.
+The flush_tlb_all() function is not used a whole lot, but it might as
+well use broadcast TLB flushing there, too.
+
+  [ bp: Massage, restore balanced if-else branches in the function,
+    comment some. ]
 
 Signed-off-by: Rik van Riel <riel@surriel.com>
 Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
-Tested-by: Manali Shukla <Manali.Shukla@amd.com>
-Tested-by: Brendan Jackman <jackmanb@google.com>
-Tested-by: Michael Kelley <mhklinux@outlook.com>
 Link: https://lore.kernel.org/r/20250226030129.530345-6-riel@surriel.com
 ---
- arch/x86/mm/tlb.c | 10 +++++++++-
- 1 file changed, 9 insertions(+), 1 deletion(-)
+ arch/x86/mm/tlb.c | 17 +++++++++++++++--
+ 1 file changed, 15 insertions(+), 2 deletions(-)
 
 diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
-index f44a03bca41c..a6cd61d5f423 100644
+index 5c44b94ad5af..f49627e02311 100644
 --- a/arch/x86/mm/tlb.c
 +++ b/arch/x86/mm/tlb.c
 @@ -1064,7 +1064,6 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
@@ -27,22 +28,29 @@ index f44a03bca41c..a6cd61d5f423 100644
  static void do_flush_tlb_all(void *info)
  {
  	count_vm_tlb_event(NR_TLB_REMOTE_FLUSH_RECEIVED);
-@@ -1074,6 +1073,15 @@ static void do_flush_tlb_all(void *info)
+@@ -1074,7 +1073,21 @@ static void do_flush_tlb_all(void *info)
  void flush_tlb_all(void)
  {
  	count_vm_tlb_event(NR_TLB_REMOTE_FLUSH);
+-	on_each_cpu(do_flush_tlb_all, NULL, 1);
 +
 +	/* First try (faster) hardware-assisted TLB invalidation. */
 +	if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) {
++		/*
++		 * TLBSYNC at the end needs to make sure all flushes done
++		 * on the current CPU have been executed system-wide.
++		 * Therefore, make sure nothing gets migrated
++		 * in-between but disable preemption as it is cheaper.
++		 */
 +		guard(preempt)();
 +		invlpgb_flush_all();
-+		return;
++	} else {
++		/* Fall back to the IPI-based invalidation. */
++		on_each_cpu(do_flush_tlb_all, NULL, 1);
 +	}
-+
-+	/* Fall back to the IPI-based invalidation. */
- 	on_each_cpu(do_flush_tlb_all, NULL, 1);
  }
  
+ /* Flush an arbitrarily large range of memory with INVLPGB. */
 -- 

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Re: [PATCH v14 05/13] x86/mm: use INVLPGB in flush_tlb_all
Posted by Dave Hansen 9 months, 3 weeks ago
On 2/25/25 19:00, Rik van Riel wrote:
>  void flush_tlb_all(void)
>  {
>  	count_vm_tlb_event(NR_TLB_REMOTE_FLUSH);
> +
> +	/* First try (faster) hardware-assisted TLB invalidation. */
> +	if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) {
> +		guard(preempt)();
> +		invlpgb_flush_all();
> +		return;
> +	}

We haven't talked at all about the locking rules for
invlpgb_flush_all(). It was used once in this series without any
explicit preempt twiddling. I assume that was because it was used in a
path where preempt is disabled.

If it does need a universal rule about preempt, can we please add an:

	lockdep_assert_preemption_disabled()

along with a comment about why it needs preempt disabled?
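What Dave is asking for would look roughly like this (a sketch only, not part of the posted series; the comment wording is illustrative):

```c
/* Flush all mappings, including globals, for all PCIDs. */
static inline void invlpgb_flush_all(void)
{
	/*
	 * The trailing TLBSYNC only waits for INVLPGBs issued from the
	 * current CPU, so the caller must not migrate between the
	 * INVLPGB and the TLBSYNC.
	 */
	lockdep_assert_preemption_disabled();
	__invlpgb(0, 0, 0, 1, 0, INVLPGB_INCLUDE_GLOBAL);
	__tlbsync();
}
```

The follow-up below takes a slightly different approach, pushing a guard(preempt)() inside the helper so callers do not have to think about it at all.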

Also, the previous code did:

	if (cpu_feature_enabled(X86_FEATURE_INVLPGB))
		invlpgb_foo();
	else
		old_foo();

Is there a reason to break with that pattern? It would be nice to be
consistent.
Re: [PATCH v14 05/13] x86/mm: use INVLPGB in flush_tlb_all
Posted by Borislav Petkov 9 months, 3 weeks ago
On Fri, Feb 28, 2025 at 11:18:04AM -0800, Dave Hansen wrote:
> We haven't talked at all about the locking rules for
> invlpgb_flush_all(). It was used once in this series without any
> explicit preempt twiddling. I assume that was because it was used in a
> path where preempt is disabled.
> 
> If it does need a universal rule about preempt, can we please add an:
> 
> 	lockdep_assert_preemption_disabled()
> 
> along with a comment about why it needs preempt disabled?

So, after talking on IRC last night, below is what I think we should do ontop.

More specifically:

- I've pushed the preemption guard inside the functions which do
  INVLPGB+TLBSYNC so that callers do not have to think about it.

- invlpgb_kernel_range_flush() I still don't like: there we have to rely on
  cant_migrate() in __tlbsync(). I'd like for all of them to be nicely packed
  but don't have an idea yet how to do that cleanly...

- document what it means for bits rax[0:2] to be clear when issuing INVLPGB


That ok?

Anything I've missed?

If not, I'll integrate this into the patches.

Thx.

diff --git a/arch/x86/include/asm/tlb.h b/arch/x86/include/asm/tlb.h
index 45d9c7687d61..0d90ceeb472b 100644
--- a/arch/x86/include/asm/tlb.h
+++ b/arch/x86/include/asm/tlb.h
@@ -39,6 +39,10 @@ static inline void invlpg(unsigned long addr)
  * the first page, while __invlpgb gets the more human readable number of
  * pages to invalidate.
  *
+ * The bits in rax[0:2] determine respectively which components of the address
+ * (VA, PCID, ASID) get compared when flushing. If none of these bits are
+ * set, *any*
+ * address in the specified range matches.
+ *
  * TLBSYNC is used to ensure that pending INVLPGB invalidations initiated from
  * this CPU have completed.
  */
@@ -60,10 +64,10 @@ static inline void __invlpgb(unsigned long asid, unsigned long pcid,
 static inline void __tlbsync(void)
 {
 	/*
-	 * tlbsync waits for invlpgb instructions originating on the
-	 * same CPU to have completed. Print a warning if we could have
-	 * migrated, and might not be waiting on all the invlpgbs issued
-	 * during this TLB invalidation sequence.
+	 * TLBSYNC waits for INVLPGB instructions originating on the same CPU
+	 * to have completed. Print a warning if the task has been migrated,
+	 * and might not be waiting on all the INVLPGBs issued during this TLB
+	 * invalidation sequence.
 	 */
 	cant_migrate();
 
@@ -106,6 +110,13 @@ static inline void invlpgb_flush_single_pcid_nosync(unsigned long pcid)
 /* Flush all mappings, including globals, for all PCIDs. */
 static inline void invlpgb_flush_all(void)
 {
+	/*
+	 * TLBSYNC at the end needs to make sure all flushes done on the
+	 * current CPU have been executed system-wide. Therefore, make
+	 * sure nothing gets migrated in-between but disable preemption
+	 * as it is cheaper.
+	 */
+	guard(preempt)();
 	__invlpgb(0, 0, 0, 1, 0, INVLPGB_INCLUDE_GLOBAL);
 	__tlbsync();
 }
@@ -119,10 +130,7 @@ static inline void invlpgb_flush_addr_nosync(unsigned long addr, u16 nr)
 /* Flush all mappings for all PCIDs except globals. */
 static inline void invlpgb_flush_all_nonglobals(void)
 {
-	/*
-	 * @addr=0 means both rax[1] (valid PCID) and rax[2] (valid ASID) are clear
-	 * so flush *any* PCID and ASID.
-	 */
+	guard(preempt)();
 	__invlpgb(0, 0, 0, 1, 0, 0);
 	__tlbsync();
 }
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index f49627e02311..8cd084bc3d98 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -1075,19 +1075,11 @@ void flush_tlb_all(void)
 	count_vm_tlb_event(NR_TLB_REMOTE_FLUSH);
 
 	/* First try (faster) hardware-assisted TLB invalidation. */
-	if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) {
-		/*
-		 * TLBSYNC at the end needs to make sure all flushes done
-		 * on the current CPU have been executed system-wide.
-		 * Therefore, make sure nothing gets migrated
-		 * in-between but disable preemption as it is cheaper.
-		 */
-		guard(preempt)();
+	if (cpu_feature_enabled(X86_FEATURE_INVLPGB))
 		invlpgb_flush_all();
-	} else {
+	else
 		/* Fall back to the IPI-based invalidation. */
 		on_each_cpu(do_flush_tlb_all, NULL, 1);
-	}
 }
 
 /* Flush an arbitrarily large range of memory with INVLPGB. */

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Re: [PATCH v14 05/13] x86/mm: use INVLPGB in flush_tlb_all
Posted by Rik van Riel 9 months, 3 weeks ago
On Sat, 2025-03-01 at 13:20 +0100, Borislav Petkov wrote:
> 
> So, after talking on IRC last night, below is what I think we should
> do ontop.

This all looks great! Thank you.

-- 
All Rights Reversed.