Recently _pgd_alloc() was switched from using __get_free_pages() to
pagetable_alloc_noprof(), which might return a compound page in case
the allocation order is larger than 0.

On x86 this will be the case if CONFIG_MITIGATION_PAGE_TABLE_ISOLATION
is set, even if PTI has been disabled at runtime.

When running as a Xen PV guest (this will always disable PTI), using
a compound page for a PGD will result in VM_BUG_ON_PGFLAGS being
triggered when the Xen code tries to pin the PGD.

Fix the Xen issue, and avoid the unneeded 8k allocation for a PGD
when PTI is disabled, by using a variable holding the PGD allocation
order in case CONFIG_MITIGATION_PAGE_TABLE_ISOLATION is set.

Reported-by: Petr Vaněk <arkamar@atlas.cz>
Fixes: a9b3c355c2e6 ("asm-generic: pgalloc: provide generic __pgd_{alloc,free}")
Cc: stable@vger.kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
---
arch/x86/include/asm/pgalloc.h | 7 ++++++-
arch/x86/mm/pgtable.c | 4 ++++
arch/x86/mm/pti.c | 3 +++
3 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/pgalloc.h b/arch/x86/include/asm/pgalloc.h
index a33147520044..754f95bddf98 100644
--- a/arch/x86/include/asm/pgalloc.h
+++ b/arch/x86/include/asm/pgalloc.h
@@ -34,8 +34,13 @@ static inline void paravirt_release_p4d(unsigned long pfn) {}
* Instead of one PGD, we acquire two PGDs. Being order-1, it is
* both 8k in size and 8k-aligned. That lets us just flip bit 12
* in a pointer to swap between the two 4k halves.
+ *
+ * As PTI can be runtime disabled (either via boot parameter or due to
+ * running as a Xen PV guest), store the actually needed allocation
+ * order in a global variable.
*/
-#define PGD_ALLOCATION_ORDER 1
+#define PGD_ALLOCATION_ORDER pgd_allocation_order
+extern unsigned int pgd_allocation_order;
#else
#define PGD_ALLOCATION_ORDER 0
#endif
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index a05fcddfc811..f61b2d6be311 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -12,6 +12,10 @@ phys_addr_t physical_mask __ro_after_init = (1ULL << __PHYSICAL_MASK_SHIFT) - 1;
EXPORT_SYMBOL(physical_mask);
#endif
+#ifdef CONFIG_MITIGATION_PAGE_TABLE_ISOLATION
+unsigned int pgd_allocation_order = 0;
+#endif
+
pgtable_t pte_alloc_one(struct mm_struct *mm)
{
return __pte_alloc_one(mm, GFP_PGTABLE_USER);
diff --git a/arch/x86/mm/pti.c b/arch/x86/mm/pti.c
index 5f0d579932c6..44b7120c63e3 100644
--- a/arch/x86/mm/pti.c
+++ b/arch/x86/mm/pti.c
@@ -38,6 +38,7 @@
#include <asm/desc.h>
#include <asm/sections.h>
#include <asm/set_memory.h>
+#include <asm/pgalloc.h>
#undef pr_fmt
#define pr_fmt(fmt) "Kernel/User page tables isolation: " fmt
@@ -97,6 +98,8 @@ void __init pti_check_boottime_disable(void)
if (pti_mode == PTI_AUTO && !boot_cpu_has_bug(X86_BUG_CPU_MELTDOWN))
return;
+ pgd_allocation_order = 1;
+
setup_force_cpu_cap(X86_FEATURE_PTI);
}
--
2.43.0
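
The comment in pgalloc.h notes that an order-1 PGD allocation is both 8k in size and 8k-aligned, so flipping bit 12 of a pointer swaps between the two 4k halves. A minimal user-space sketch of that arithmetic follows. The demo_* names and the fixed 4k page size are assumptions made for illustration; they are not the kernel's own helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4k pages, so bit 12 selects the 4k half of an 8k block. */
#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  ((uintptr_t)1 << DEMO_PAGE_SHIFT)

/* Hypothetical helper: true if addr is aligned to 8k (two 4k pages). */
static inline int demo_is_8k_aligned(uintptr_t addr)
{
	return (addr & (2 * DEMO_PAGE_SIZE - 1)) == 0;
}

/* Hypothetical helper: set bit 12 to reach the second 4k half. */
static inline uintptr_t demo_user_half(uintptr_t kernel_pgd)
{
	return kernel_pgd | DEMO_PAGE_SIZE;
}

/* Hypothetical helper: clear bit 12 to get back to the first 4k half. */
static inline uintptr_t demo_kernel_half(uintptr_t user_pgd)
{
	return user_pgd & ~DEMO_PAGE_SIZE;
}
```

The trick only works because the block is 8k-aligned: bit 12 of the base address is then guaranteed to be clear. With an order-0 (4k) allocation there is no second half at all, which is why the order should only be 1 when PTI is actually in use.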
On Thu, Apr 17, 2025 at 04:48:08PM +0200, Juergen Gross wrote:
> Recently _pgd_alloc() was switched from using __get_free_pages() to
> pagetable_alloc_noprof(), which might return a compound page in case
> the allocation order is larger than 0.
>
> On x86 this will be the case if CONFIG_MITIGATION_PAGE_TABLE_ISOLATION
> is set, even if PTI has been disabled at runtime.
>
> When running as a Xen PV guest (this will always disable PTI), using
> a compound page for a PGD will result in VM_BUG_ON_PGFLAGS being
> triggered when the Xen code tries to pin the PGD.
>
> Fix the Xen issue, and avoid the unneeded 8k allocation for a PGD
> when PTI is disabled, by using a variable holding the PGD allocation
> order in case CONFIG_MITIGATION_PAGE_TABLE_ISOLATION is set.
>
> Reported-by: Petr Vaněk <arkamar@atlas.cz>
> Fixes: a9b3c355c2e6 ("asm-generic: pgalloc: provide generic __pgd_{alloc,free}")
> Cc: stable@vger.kernel.org
> Signed-off-by: Juergen Gross <jgross@suse.com>
I have runtime tested this patch, and it fixes the reported issue. The
following trailers can be appended to the commit message (as per [1]):
Closes: https://lore.kernel.org/lkml/202541612720-Z_-deOZTOztMXHBh-arkamar@atlas.cz/
Tested-by: Petr Vaněk <arkamar@atlas.cz>
Cheers,
Petr
[1] https://docs.kernel.org/process/5.Posting.html#patch-formatting-and-changelogs
On 4/17/25 07:48, Juergen Gross wrote:
> -#define PGD_ALLOCATION_ORDER 1
> +#define PGD_ALLOCATION_ORDER pgd_allocation_order
> +extern unsigned int pgd_allocation_order;
> #else
> #define PGD_ALLOCATION_ORDER 0
> #endif
Instead of hiding a variable behind a macro-looking name and a bunch of
#ifdefs, can we please fix this properly?
static inline unsigned int pgd_allocation_order(void)
{
	if (cpu_feature_enabled(X86_FEATURE_PTI))
		return 1;
	return 0;
}
and then s/PGD_ALLOCATION_ORDER/pgd_allocation_order()/.
Wouldn't that be a billion times better?
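
For illustration only, here is a user-space sketch of what the function-based suggestion above amounts to. The demo_pti_enabled flag is a hypothetical stand-in for cpu_feature_enabled(X86_FEATURE_PTI), and the 4k page size is assumed; none of the demo_* names exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: stand-in for cpu_feature_enabled(X86_FEATURE_PTI). */
static bool demo_pti_enabled;

/* Order 1 (two pages) only when PTI is really active, else order 0. */
static inline unsigned int demo_pgd_allocation_order(void)
{
	return demo_pti_enabled ? 1 : 0;
}

/* Size in bytes of a PGD allocation, assuming 4k pages. */
static inline size_t demo_pgd_alloc_size(void)
{
	return (size_t)4096 << demo_pgd_allocation_order();
}
```

Deriving the order from the feature state at each call means there is no separate variable to keep in sync with the PTI setting, and the name reads as a function call rather than as a compile-time constant.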