[PATCH v9 06/12] x86/mm: use INVLPGB for kernel TLB flushes

Rik van Riel posted 12 patches 10 months, 1 week ago
There is a newer version of this series
[PATCH v9 06/12] x86/mm: use INVLPGB for kernel TLB flushes
Posted by Rik van Riel 10 months, 1 week ago
Use broadcast TLB invalidation for kernel addresses when available.

Remove the need to send IPIs for kernel TLB flushes.

Signed-off-by: Rik van Riel <riel@surriel.com>
Tested-by: Manali Shukla <Manali.Shukla@amd.com>
---
 arch/x86/mm/tlb.c | 28 +++++++++++++++++++++++++++-
 1 file changed, 27 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 36939b104561..227e972b6fbc 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -1086,6 +1086,30 @@ void flush_tlb_all(void)
 	on_each_cpu(do_flush_tlb_all, NULL, 1);
 }
 
+static bool broadcast_kernel_range_flush(struct flush_tlb_info *info)
+{
+	unsigned long addr;
+	unsigned long nr;
+
+	if (!IS_ENABLED(CONFIG_X86_BROADCAST_TLB_FLUSH))
+		return false;
+
+	if (!cpu_feature_enabled(X86_FEATURE_INVLPGB))
+		return false;
+
+	if (info->end == TLB_FLUSH_ALL) {
+		invlpgb_flush_all();
+		return true;
+	}
+
+	for (addr = info->start; addr < info->end; addr += nr << PAGE_SHIFT) {
+		nr = min((info->end - addr) >> PAGE_SHIFT, invlpgb_count_max);
+		invlpgb_flush_addr_nosync(addr, nr);
+	}
+	tlbsync();
+	return true;
+}
+
 static void do_kernel_range_flush(void *info)
 {
 	struct flush_tlb_info *f = info;
@@ -1105,7 +1129,9 @@ void flush_tlb_kernel_range(unsigned long start, unsigned long end)
 	info = get_flush_tlb_info(NULL, start, end, PAGE_SHIFT, false,
 				  TLB_GENERATION_INVALID);
 
-	if (info->end == TLB_FLUSH_ALL)
+	if (broadcast_kernel_range_flush(info))
+		; /* Fall through. */
+	else if (info->end == TLB_FLUSH_ALL)
 		on_each_cpu(do_flush_tlb_all, NULL, 1);
 	else
 		on_each_cpu(do_kernel_range_flush, info, 1);
-- 
2.47.1
Re: [PATCH v9 06/12] x86/mm: use INVLPGB for kernel TLB flushes
Posted by Brendan Jackman 10 months, 1 week ago
On Thu, 6 Feb 2025 at 05:45, Rik van Riel <riel@surriel.com> wrote:
> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> index 36939b104561..227e972b6fbc 100644
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -1086,6 +1086,30 @@ void flush_tlb_all(void)
>         on_each_cpu(do_flush_tlb_all, NULL, 1);
>  }
>
> +static bool broadcast_kernel_range_flush(struct flush_tlb_info *info)
> +{
> +       unsigned long addr;
> +       unsigned long nr;
> +
> +       if (!IS_ENABLED(CONFIG_X86_BROADCAST_TLB_FLUSH))
> +               return false;
> +
> +       if (!cpu_feature_enabled(X86_FEATURE_INVLPGB))
> +               return false;

I think that first conditional can be shunted off into the header

diff --git a/arch/x86/include/asm/disabled-features.h
b/arch/x86/include/asm/disabled-features.h
index c492bdc97b05..61376b4e4fa7 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -129,6 +129,12 @@
 #define DISABLE_SEV_SNP                (1 << (X86_FEATURE_SEV_SNP & 31))
 #endif

+#ifdef CONFIG_X86_BROADCAST_TLB_FLUSH
+#define DISABLE_INVLPGB        0
+#else
+#define DISABLE_INVLPGB        (1 << (X86_FEATURE_INVLPGB & 31))
+#endif
+
 /*
  * Make sure to add features to the correct mask
  */
@@ -146,7 +152,7 @@
 #define DISABLED_MASK11
(DISABLE_RETPOLINE|DISABLE_RETHUNK|DISABLE_UNRET| \
                         DISABLE_CALL_DEPTH_TRACKING|DISABLE_USER_SHSTK)
 #define DISABLED_MASK12        (DISABLE_FRED|DISABLE_LAM)
-#define DISABLED_MASK13        0
+#define DISABLED_MASK13        (DISABLE_INVLPGB)
 #define DISABLED_MASK14        0
 #define DISABLED_MASK15        0
 #define DISABLED_MASK16
(DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP| \
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 227e972b6fbc..b8a8665359a2 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -1091,9 +1091,6 @@ static bool broadcast_kernel_range_flush(struct
flush_tlb_info *info)
        unsigned long addr;
        unsigned long nr;

-       if (!IS_ENABLED(CONFIG_X86_BROADCAST_TLB_FLUSH))
-               return false;
-
        if (!cpu_feature_enabled(X86_FEATURE_INVLPGB))
                return false;

With !CPU_SUP_AMD and the above, broadcast_kernel_range_flush
disappears from tlb.o. (Caveat - I didn't actually read the disasm I
just made it noinline and checked the call disappeared).

It's actually more lines of code but now they're off in a boilerplate
header and it's consistent with the other flags that do this.
Re: [PATCH v9 06/12] x86/mm: use INVLPGB for kernel TLB flushes
Posted by Rik van Riel 10 months, 1 week ago
On Fri, 2025-02-07 at 17:03 +0100, Brendan Jackman wrote:
> On Thu, 6 Feb 2025 at 05:45, Rik van Riel <riel@surriel.com> wrote:
> > diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> > index 36939b104561..227e972b6fbc 100644
> > --- a/arch/x86/mm/tlb.c
> > +++ b/arch/x86/mm/tlb.c
> > @@ -1086,6 +1086,30 @@ void flush_tlb_all(void)
> >         on_each_cpu(do_flush_tlb_all, NULL, 1);
> >  }
> > 
> > +static bool broadcast_kernel_range_flush(struct flush_tlb_info
> > *info)
> > +{
> > +       unsigned long addr;
> > +       unsigned long nr;
> > +
> > +       if (!IS_ENABLED(CONFIG_X86_BROADCAST_TLB_FLUSH))
> > +               return false;
> > +
> > +       if (!cpu_feature_enabled(X86_FEATURE_INVLPGB))
> > +               return false;
> 
> I think that first conditional can be shunted off into the header
> 
> diff --git a/arch/x86/include/asm/disabled-features.h
> b/arch/x86/include/asm/disabled-features.h
> index c492bdc97b05..61376b4e4fa7 100644
> --- a/arch/x86/include/asm/disabled-features.h
> +++ b/arch/x86/include/asm/disabled-features.h
> @@ -129,6 +129,12 @@
>  #define DISABLE_SEV_SNP                (1 << (X86_FEATURE_SEV_SNP &
> 31))
>  #endif
> 
> +#ifdef CONFIG_X86_BROADCAST_TLB_FLUSH
> +#define DISABLE_INVLPGB        0
> +#else
> +#define DISABLE_INVLPGB        (1 << (X86_FEATURE_INVLPGB & 31))
> +#endif
> +

I'm adding this for v10, with the disabled-features.h
stuff in patch 5, and removing the no longer needed
tests from each subsequent patch.

-- 
All Rights Reversed.
Re: [PATCH v9 06/12] x86/mm: use INVLPGB for kernel TLB flushes
Posted by Rik van Riel 10 months, 1 week ago
On Fri, 2025-02-07 at 17:03 +0100, Brendan Jackman wrote:
> O
> With !CPU_SUP_AMD and the above, broadcast_kernel_range_flush
> disappears from tlb.o. (Caveat - I didn't actually read the disasm I
> just made it noinline and checked the call disappeared).
> 
> It's actually more lines of code but now they're off in a boilerplate
> header and it's consistent with the other flags that do this.
> 
What compiler did you use?

While I like the cleanup in principle, I
don't want to saddle people with older
compilers with extra code they don't need.

-- 
All Rights Reversed.
Re: [PATCH v9 06/12] x86/mm: use INVLPGB for kernel TLB flushes
Posted by Brendan Jackman 10 months, 1 week ago
On Fri, 7 Feb 2025 at 21:51, Rik van Riel <riel@surriel.com> wrote:
>
> On Fri, 2025-02-07 at 17:03 +0100, Brendan Jackman wrote:
> > O
> > With !CPU_SUP_AMD and the above, broadcast_kernel_range_flush
> > disappears from tlb.o. (Caveat - I didn't actually read the disasm I
> > just made it noinline and checked the call disappeared).
> >
> > It's actually more lines of code but now they're off in a boilerplate
> > header and it's consistent with the other flags that do this.
> >
> What compiler did you use?
>
> While I like the cleanup in principle, I
> don't want to saddle people with older
> compilers with extra code they don't need.

I used a pretty fresh Clang but I'd be very surprised if it needs a
fancy compiler. Compared to

if (IS_ENABLED(CONFIG_FOO))

I think all we have with the disabled-features.h magic is

- An extra __builtin_constant_p - I did a quick search and I can find
GCC release notes referring to this at least back to 4.7 (2012) [0].
Note also this doesn't create any code.

- An extra bit of constant folding to turn the (x & y) into
true/false. This seems like something compilers have been good at for
a long time. And if someone's happy with a compiler so old that it
can't do this, I dunno but they probably don't mind a few extra
instructions.

[1] https://gcc.gnu.org/gcc-4.7/changes.html