[PATCH 2/3] x86/mm: Fix early boot use of INVPLGB

Rik van Riel posted 3 patches 8 months, 1 week ago
Posted by Rik van Riel 8 months, 1 week ago
Use of the INVLPGB instruction is done based off the X86_FEATURE_INVLPGB
CPU feature, which is provided directly by the hardware.

If invlpgb_kernel_range_flush is called before the kernel has read
the value of invlpgb_count_max from the hardware, the normally
bounded loop can become an infinite loop if invlpgb_count_max is
initialized to zero.

Fix that issue by initializing invlpgb_count_max to 1.

This way INVLPGB at early boot time will be a little bit slower
than normal (with initialized invlpgb_count_max), and not an
instant hang at bootup time.

Signed-off-by: Rik van Riel <riel@surriel.com>
Fixes: b7aa05cbdc52 ("x86/mm: Add INVLPGB support code")
Cc: stable@kernel.org
---
 arch/x86/kernel/cpu/amd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 93da466dfe2c..b2ad8d13211a 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -31,7 +31,7 @@
 
 #include "cpu.h"
 
-u16 invlpgb_count_max __ro_after_init;
+u16 invlpgb_count_max __ro_after_init = 1;
 
 static inline int rdmsrq_amd_safe(unsigned msr, u64 *p)
 {
-- 
2.49.0
Re: [PATCH 2/3] x86/mm: Fix early boot use of INVPLGB
Posted by Dave Hansen 8 months, 1 week ago
On 6/2/25 06:30, Rik van Riel wrote:
> Use of the INVLPGB instruction is done based off the X86_FEATURE_INVLPGB
> CPU feature, which is provided directly by the hardware.
> 
> If invlpgb_kernel_range_flush is called before the kernel has read
> the value of invlpgb_count_max from the hardware, the normally
> bounded loop can become an infinite loop if invlpgb_count_max is
> initialized to zero.
> 
> Fix that issue by initializing invlpgb_count_max to 1.
> 
> This way INVLPGB at early boot time will be a little bit slower
> than normal (with initialized invlpgb_count_max), and not an
> instant hang at bootup time.

The INVLPGB instruction has limits on how many invalidations it can
perform at once. That limit is enumerated in CPUID, read by the kernel,
and stored in 'invlpgb_count_max'. Ranged invalidations (like
invlpgb_kernel_range_flush()) break up their work so that they
do not exceed the limit.

However, early boot code currently attempts to do ranged invalidations
before populating 'invlpgb_count_max'. There's a for() loop which is
basically:

	for (...; addr < end; addr += invlpgb_count_max*PAGE_SIZE)

It doesn't make much progress when invlpgb_count_max==0.

... then the rest

---

BTW, how was this code even _working_ without this patch? Are the early
boot ranged invalidations infrequent or something?

Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
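
For illustration, the loop behaviour described in this thread can be
modelled in userspace. This is only a sketch, not the kernel code:
count_batches() and its max_iters guard are hypothetical names made up
for the example, and PAGE_SIZE is assumed to be 4096. It walks a range
in steps of count_max pages, the way the for() loop above does, and
gives up once the guard is hit so that a zero stride is reported
instead of hanging.

```c
#include <stdint.h>

#define PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

/*
 * Hypothetical model of a ranged flush loop: count how many batches
 * it takes to cover [start, end) when each batch covers count_max
 * pages. Returns -1 if the loop has not finished after max_iters
 * iterations, which is what happens when the stride is zero.
 */
static long count_batches(unsigned long start, unsigned long end,
			  uint16_t count_max, long max_iters)
{
	unsigned long addr;
	long iters = 0;

	for (addr = start; addr < end;
	     addr += (unsigned long)count_max * PAGE_SIZE) {
		/* A real kernel loop has no such guard; it would spin. */
		if (++iters >= max_iters)
			return -1;
	}

	return iters;
}
```

With count_max == 0 the stride is zero, addr never advances, and the
model reports -1. With count_max == 1 the same range is covered one
page per batch, which is the "a little bit slower" case from the
changelog.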