[PATCH v2 01/22] x86/mm: split out preallocate_sub_pgd()

Brendan Jackman posted 22 patches 2 weeks ago
[PATCH v2 01/22] x86/mm: split out preallocate_sub_pgd()
Posted by Brendan Jackman 2 weeks ago
This code will be needed elsewhere in a following patch. Split out the
trivial code move for easy review.

This changes the logging slightly: instead of panic() directly reporting
the level of the failure, there is now a generic panic message which
will be preceded by a separate warn that reports the level of the
failure. This is a simple way to have this helper suit the needs of its
new user as well as the existing one.

Other than logging, no functional change intended.

Signed-off-by: Brendan Jackman <jackmanb@google.com>
---
 arch/x86/include/asm/pgalloc.h | 33 +++++++++++++++++++++++++++++++
 arch/x86/mm/init_64.c          | 44 +++++++-----------------------------------
 2 files changed, 40 insertions(+), 37 deletions(-)

diff --git a/arch/x86/include/asm/pgalloc.h b/arch/x86/include/asm/pgalloc.h
index c88691b15f3c6..3541b86c9c6b0 100644
--- a/arch/x86/include/asm/pgalloc.h
+++ b/arch/x86/include/asm/pgalloc.h
@@ -2,6 +2,7 @@
 #ifndef _ASM_X86_PGALLOC_H
 #define _ASM_X86_PGALLOC_H
 
+#include <linux/printk.h>
 #include <linux/threads.h>
 #include <linux/mm.h>		/* for struct page */
 #include <linux/pagemap.h>
@@ -128,6 +129,38 @@ static inline void __pud_free_tlb(struct mmu_gather *tlb, pud_t *pud,
 	___pud_free_tlb(tlb, pud);
 }
 
+/* Allocate a pagetable pointed to by the top hardware level. */
+static inline int preallocate_sub_pgd(struct mm_struct *mm, unsigned long addr)
+{
+	const char *lvl;
+	p4d_t *p4d;
+	pud_t *pud;
+
+	lvl = "p4d";
+	p4d = p4d_alloc(mm, pgd_offset_pgd(mm->pgd, addr), addr);
+	if (!p4d)
+		goto failed;
+
+	if (pgtable_l5_enabled())
+		return 0;
+
+	/*
+	 * On 4-level systems, the P4D layer is folded away and
+	 * the above code does no preallocation.  Below, go down
+	 * to the pud _software_ level to ensure the second
+	 * hardware level is allocated on 4-level systems too.
+	 */
+	lvl = "pud";
+	pud = pud_alloc(mm, p4d, addr);
+	if (!pud)
+		goto failed;
+	return 0;
+
+failed:
+	pr_warn_ratelimited("Failed to preallocate %s\n", lvl);
+	return -ENOMEM;
+}
+
 #if CONFIG_PGTABLE_LEVELS > 4
 static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, p4d_t *p4d)
 {
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index df2261fa4f985..79806386dc42f 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1318,46 +1318,16 @@ static void __init register_page_bootmem_info(void)
 static void __init preallocate_vmalloc_pages(void)
 {
 	unsigned long addr;
-	const char *lvl;
 
 	for (addr = VMALLOC_START; addr <= VMEMORY_END; addr = ALIGN(addr + 1, PGDIR_SIZE)) {
-		pgd_t *pgd = pgd_offset_k(addr);
-		p4d_t *p4d;
-		pud_t *pud;
-
-		lvl = "p4d";
-		p4d = p4d_alloc(&init_mm, pgd, addr);
-		if (!p4d)
-			goto failed;
-
-		if (pgtable_l5_enabled())
-			continue;
-
-		/*
-		 * The goal here is to allocate all possibly required
-		 * hardware page tables pointed to by the top hardware
-		 * level.
-		 *
-		 * On 4-level systems, the P4D layer is folded away and
-		 * the above code does no preallocation.  Below, go down
-		 * to the pud _software_ level to ensure the second
-		 * hardware level is allocated on 4-level systems too.
-		 */
-		lvl = "pud";
-		pud = pud_alloc(&init_mm, p4d, addr);
-		if (!pud)
-			goto failed;
+		if (preallocate_sub_pgd(&init_mm, addr)) {
+			/*
+			 * The pages have to be there now or they will be
+			 * missing in process page-tables later.
+			 */
+			panic("Failed to pre-allocate pagetables for vmalloc area\n");
+		}
 	}
-
-	return;
-
-failed:
-
-	/*
-	 * The pages have to be there now or they will be missing in
-	 * process page-tables later.
-	 */
-	panic("Failed to pre-allocate %s pages for vmalloc area\n", lvl);
 }
 
 void __init arch_mm_preinit(void)

-- 
2.51.2
Re: [PATCH v2 01/22] x86/mm: split out preallocate_sub_pgd()
Posted by Borislav Petkov 1 week, 3 days ago
On Fri, Mar 20, 2026 at 06:23:25PM +0000, Brendan Jackman wrote:
> This code will be needed elsewhere in a following patch. Split out the
> trivial code move for easy review.
> 
> This changes the logging slightly: instead of panic() directly reporting
> the level of the failure, there is now a generic panic message which
> will be preceded by a separate warn that reports the level of the
> failure. This is a simple way to have this helper suit the needs of its
> new user as well as the existing one.

"Describe your changes in imperative mood, e.g. “make xyzzy do frotz” instead
of “[This patch] makes xyzzy do frotz” or “[I] changed xyzzy to do frotz”, as
if you are giving orders to the codebase to change its behaviour."

> Other than logging, no functional change intended.
> 
> Signed-off-by: Brendan Jackman <jackmanb@google.com>
> ---
>  arch/x86/include/asm/pgalloc.h | 33 +++++++++++++++++++++++++++++++
>  arch/x86/mm/init_64.c          | 44 +++++++-----------------------------------
>  2 files changed, 40 insertions(+), 37 deletions(-)
> 
> diff --git a/arch/x86/include/asm/pgalloc.h b/arch/x86/include/asm/pgalloc.h
> index c88691b15f3c6..3541b86c9c6b0 100644
> --- a/arch/x86/include/asm/pgalloc.h
> +++ b/arch/x86/include/asm/pgalloc.h

Why in a header?

That function is kinda bigger than the rest of the oneliners there...

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Re: [PATCH v2 01/22] x86/mm: split out preallocate_sub_pgd()
Posted by Brendan Jackman 1 week, 2 days ago
On Tue Mar 24, 2026 at 3:27 PM UTC, Borislav Petkov wrote:
> On Fri, Mar 20, 2026 at 06:23:25PM +0000, Brendan Jackman wrote:
>> This code will be needed elsewhere in a following patch. Split out the
>> trivial code move for easy review.
>> 
>> This changes the logging slightly: instead of panic() directly reporting
>> the level of the failure, there is now a generic panic message which
>> will be preceded by a separate warn that reports the level of the
>> failure. This is a simple way to have this helper suit the needs of its
>> new user as well as the existing one.
>
> "Describe your changes in imperative mood, e.g. “make xyzzy do frotz” instead
> of “[This patch] makes xyzzy do frotz” or “[I] changed xyzzy to do frotz”, as
> if you are giving orders to the codebase to change its behaviour."

Yeah there are a bunch of comments in this series where I've written "we
do XYZ" instead of "do XYZ", I've set up an AI prompt [0] to try and
catch it for me before I send you guys crazy with yet another series
that makes the same mistake over and over.

[0] https://lore.kernel.org/all/DHA6G1E2H5P4.2D7JKTRKIBE3U@google.com/

TBH for this particular case it seems borderline? The referent of the
"this" is the previous paragraph, it's trying to show that the logging
change is a side effect of the code movement rather than a part of
the patch's intent. But there are more explicit ways to carry that
message so I'll rewrite it.

>> Other than logging, no functional change intended.
>> 
>> Signed-off-by: Brendan Jackman <jackmanb@google.com>
>> ---
>>  arch/x86/include/asm/pgalloc.h | 33 +++++++++++++++++++++++++++++++
>>  arch/x86/mm/init_64.c          | 44 +++++++-----------------------------------
>>  2 files changed, 40 insertions(+), 37 deletions(-)
>> 
>> diff --git a/arch/x86/include/asm/pgalloc.h b/arch/x86/include/asm/pgalloc.h
>> index c88691b15f3c6..3541b86c9c6b0 100644
>> --- a/arch/x86/include/asm/pgalloc.h
>> +++ b/arch/x86/include/asm/pgalloc.h
>
> Why in a header?
>
> That function is kinda bigger than the rest of the oneliners there...

Hm, I suspect I wanted to avoid creating new ifdefs in the C file? Seems
harmless though. Looks fine in arch/x86/mm/pgtable.c.
Re: [PATCH v2 01/22] x86/mm: split out preallocate_sub_pgd()
Posted by Dave Hansen 2 weeks ago
On 3/20/26 11:23, Brendan Jackman wrote:
> -		/*
> -		 * The goal here is to allocate all possibly required
> -		 * hardware page tables pointed to by the top hardware
> -		 * level.

This comment is pretty important, IMNHO, and you zapped it.

The problem here is that the per-MM carved out space is PGD-sized. You
want to make sure there are page tables allocated for that space. But,
if you say "go allocate a p4d" then that will collapse down to doing
nothing on a 4-level system.

So, this is effectively:

	Go allocate a p4d or pud, depending on if it's 4 or 5 level.
	Basically, always allocate the level that the hardware PGD
	points to.

Could we put a comment to that effect around somewhere, please?
Re: [PATCH v2 01/22] x86/mm: split out preallocate_sub_pgd()
Posted by Brendan Jackman 1 week, 4 days ago
On Fri Mar 20, 2026 at 7:42 PM UTC, Dave Hansen wrote:
> On 3/20/26 11:23, Brendan Jackman wrote:
>> -		/*
>> -		 * The goal here is to allocate all possibly required
>> -		 * hardware page tables pointed to by the top hardware
>> -		 * level.
>
> This comment is pretty important, IMNHO, and you zapped it.
>
> The problem here is that the per-MM carved out space is PGD-sized. You
> want to make sure there are page tables allocated for that space. But,
> if you say "go allocate a p4d" then that will collapse down to doing
> nothing on a 4-level system.
>
> So, this is effectively:
>
> 	Go allocate a p4d or pud, depending on if it's 4 or 5 level.
> 	Basically, always allocate the level that the hardware PGD
> 	points to.
>
> Could we put a comment to that effect around somewhere, please?

Hm I kinda thought the comments I left in there captured all this stuff,
but yeah I can see this is a bit of a weird function so more commentary
makes sense. How about I just put a few more words into the top comment:

/* 
 * Allocate all possibly requried hardware page tables pointed to ths
 * top hardware level. In other words, allocate a p4d on 5-level or a
 * pud on 4-level.
 */

And then just leave the internal one as it is:

	/*
	 * On 4-level systems, the P4D layer is folded away and
	 * the above code does no preallocation.  Below, go down
	 * to the pud _software_ level to ensure the second
	 * hardware level is allocated on 4-level systems too.
	 */