[PATCH v5 1/7] x86/mm: Decouple MAX_PHYSMEM_BITS from LA57 state

Ard Biesheuvel posted 7 patches 7 months ago
Posted by Ard Biesheuvel 7 months ago
From: Ard Biesheuvel <ardb@kernel.org>

As the Intel SDM states, MAXPHYADDR is up to 52 bits when running in
long mode, and this is independent from the number of levels of paging.
I.e., it is permitted for a 4-level hierarchy to use 52-bit output
addresses in the descriptors, both for next-level tables and for the
mappings themselves.

So set MAX_PHYSMEM_BITS to 52 in all cases for x86_64, and drop the
MAX_POSSIBLE_PHYSMEM_BITS definition, which becomes redundant as a
result.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/include/asm/pgtable_64_types.h | 2 --
 arch/x86/include/asm/sparsemem.h        | 2 +-
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 4604f924d8b8..1481b234465a 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -56,8 +56,6 @@ extern unsigned int ptrs_per_p4d;
 #define P4D_SIZE		(_AC(1, UL) << P4D_SHIFT)
 #define P4D_MASK		(~(P4D_SIZE - 1))
 
-#define MAX_POSSIBLE_PHYSMEM_BITS	52
-
 /*
  * 3rd level page
  */
diff --git a/arch/x86/include/asm/sparsemem.h b/arch/x86/include/asm/sparsemem.h
index 3918c7a434f5..550b6d73ae22 100644
--- a/arch/x86/include/asm/sparsemem.h
+++ b/arch/x86/include/asm/sparsemem.h
@@ -26,7 +26,7 @@
 # endif
 #else /* CONFIG_X86_32 */
 # define SECTION_SIZE_BITS	27 /* matt - 128 is convenient right now */
-# define MAX_PHYSMEM_BITS	(pgtable_l5_enabled() ? 52 : 46)
+# define MAX_PHYSMEM_BITS	52
 #endif
 
 #endif /* CONFIG_SPARSEMEM */
-- 
2.49.0.1101.gccaa498523-goog
Re: [PATCH v5 1/7] x86/mm: Decouple MAX_PHYSMEM_BITS from LA57 state
Posted by Kirill A. Shutemov 7 months ago
On Tue, May 20, 2025 at 12:41:40PM +0200, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
> 
> As the Intel SDM states, MAXPHYADDR is up to 52 bits when running in
> long mode, and this is independent from the number of levels of paging.
> I.e., it is permitted for a 4-level hierarchy to use 52-bit output
> addresses in the descriptors, both for next-level tables and for the
> mappings themselves.
> 
> So set MAX_PHYSMEM_BITS to 52 in all cases for x86_64, and drop the
> MAX_POSSIBLE_PHYSMEM_BITS definition, which becomes redundant as a
> result.

I think it will backfire.

We only have a 46-bit window in the memory layout if 4-level paging is
enabled. Currently, we truncate PA to whatever fits into 46 bits.

I expect to see weird failures if you try to boot with this patch in
4-level paging mode on a machine with > 64 TiB of memory.

If we want to go down this path, it might be useful to refuse to boot
altogether in 4-level paging mode if there's anything in the memory map
above 46 bits.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov
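
For illustration only, the arithmetic behind the concern above (46 bits
covers 64 TiB, 52 bits covers 4 PiB) and the suggested "refuse to boot"
check can be sketched as a small user-space C fragment. This is not kernel
code: struct mem_range, phys_limit(), memmap_fits() and the two
PHYS_BITS_* constants are hypothetical names made up for this sketch, and
the 46-bit figure is simply the 4-level window quoted in the reply.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for this sketch, not kernel definitions. */
#define PHYS_BITS_4LEVEL 46	/* window quoted for 4-level paging */
#define PHYS_BITS_MAX    52	/* MAXPHYADDR ceiling cited in the patch */

/* One entry of a simplified, made-up memory map. */
struct mem_range {
	uint64_t start;	/* first byte of the range */
	uint64_t size;	/* length in bytes */
};

/* Number of bytes addressable with the given number of physical bits. */
static uint64_t phys_limit(unsigned int bits)
{
	return UINT64_C(1) << bits;
}

/* True if every range ends at or below 2^bits (overflow-safe). */
static bool memmap_fits(const struct mem_range *map, size_t n,
			unsigned int bits)
{
	uint64_t limit = phys_limit(bits);

	for (size_t i = 0; i < n; i++) {
		if (map[i].size > limit ||
		    map[i].start > limit - map[i].size)
			return false;
	}
	return true;
}
```

Under these assumptions, a boot-time check along the lines suggested above
would amount to calling memmap_fits(map, n, PHYS_BITS_4LEVEL) when 5-level
paging is off and bailing out if it returns false.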
Re: [PATCH v5 1/7] x86/mm: Decouple MAX_PHYSMEM_BITS from LA57 state
Posted by Ard Biesheuvel 7 months ago
On Tue, 20 May 2025 at 12:59, Kirill A. Shutemov <kirill@shutemov.name> wrote:
>
> On Tue, May 20, 2025 at 12:41:40PM +0200, Ard Biesheuvel wrote:
> > From: Ard Biesheuvel <ardb@kernel.org>
> >
> > As the Intel SDM states, MAXPHYADDR is up to 52 bits when running in
> > long mode, and this is independent from the number of levels of paging.
> > I.e., it is permitted for a 4-level hierarchy to use 52-bit output
> > addresses in the descriptors, both for next-level tables and for the
> > mappings themselves.
> >
> > So set MAX_PHYSMEM_BITS to 52 in all cases for x86_64, and drop the
> > MAX_POSSIBLE_PHYSMEM_BITS definition, which becomes redundant as a
> > result.
>
> I think it will backfire.
>
> We only have a 46-bit window in the memory layout if 4-level paging is
> enabled. Currently, we truncate PA to whatever fits into 46 bits.
>

This is the linear map, right?

I assumed that this affected MMIO mappings too, but it seems x86 does
not rely on MAX_PHYSMEM_BITS for that.

> I expect to see weird failures if you try to boot with this patch in
> 4-level paging mode on a machine with > 64 TiB of memory.
>
> If we want to go down this path, it might be useful to refuse to boot
> altogether in 4-level paging mode if there's anything in the memory map
> above 46 bits.
>

Agreed - if RAM does not fit, it makes no sense to limp on. I assumed
this limit applied to any physical address.

I'll withdraw the patch - it was just an unrelated thing I spotted, so
that shouldn't affect the rest of the series.