[PATCH v2] mm: fix KASAN build error due to p*d_populate_kernel()

Harry Yoo posted 1 patch 1 month, 1 week ago
There is a newer version of this series
include/linux/pgalloc.h | 33 +++++++++++++++++++--------------
1 file changed, 19 insertions(+), 14 deletions(-)
[PATCH v2] mm: fix KASAN build error due to p*d_populate_kernel()
Posted by Harry Yoo 1 month, 1 week ago
KASAN unconditionally references kasan_early_shadow_{p4d,pud}.
However, these global variables may not exist depending on the number of
page table levels. For example, if CONFIG_PGTABLE_LEVELS=3, both
variables do not exist. Although KASAN may refernce non-existent
variables, it didn't break builds because calls to {pgd,p4d}_populate()
are optimized away at compile time.

However, {pgd,p4d}_populate_kernel() is defined as a function regardless
of the number of page table levels, so the compiler may not optimize
them away. In this case, the following linker error occurs:

ld.lld: error: undefined symbol: kasan_early_shadow_p4d
>>> referenced by init.c:260 (/home/hyeyoo/mm-new/mm/kasan/init.c:260)
>>>               mm/kasan/init.o:(kasan_populate_early_shadow) in archive vmlinux.a
>>> referenced by init.c:260 (/home/hyeyoo/mm-new/mm/kasan/init.c:260)
>>>               mm/kasan/init.o:(kasan_populate_early_shadow) in archive vmlinux.a
>>> did you mean: kasan_early_shadow_pmd
>>> defined in: vmlinux.a(mm/kasan/init.o)

ld.lld: error: undefined symbol: kasan_early_shadow_pud
>>> referenced by init.c:263 (/home/hyeyoo/mm-new/mm/kasan/init.c:263)
>>>               mm/kasan/init.o:(kasan_populate_early_shadow) in archive vmlinux.a
>>> referenced by init.c:263 (/home/hyeyoo/mm-new/mm/kasan/init.c:263)
>>>               mm/kasan/init.o:(kasan_populate_early_shadow) in archive vmlinux.a
>>> referenced by init.c:200 (/home/hyeyoo/mm-new/mm/kasan/init.c:200)
>>>               mm/kasan/init.o:(zero_p4d_populate) in archive vmlinux.a
>>> referenced 1 more times

Therefore, to allow calls to {pgd,p4d}_populate_kernel() to be optimized
out at compile time, define {pgd,p4d}_populate_kernel() as macros.
This way, when pgd_populate() or p4d_populate() are simply empty macros,
the corresponding *_populate_kernel() functions can also be optimized
away.

Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
---

v1 -> v2: added comment per Lorenzo's comment.

 include/linux/pgalloc.h | 33 +++++++++++++++++++--------------
 1 file changed, 19 insertions(+), 14 deletions(-)

diff --git a/include/linux/pgalloc.h b/include/linux/pgalloc.h
index 290ab864320f..9174fa59bbc5 100644
--- a/include/linux/pgalloc.h
+++ b/include/linux/pgalloc.h
@@ -5,20 +5,25 @@
 #include <linux/pgtable.h>
 #include <asm/pgalloc.h>
 
-static inline void pgd_populate_kernel(unsigned long addr, pgd_t *pgd,
-				       p4d_t *p4d)
-{
-	pgd_populate(&init_mm, pgd, p4d);
-	if (ARCH_PAGE_TABLE_SYNC_MASK & PGTBL_PGD_MODIFIED)
-		arch_sync_kernel_mappings(addr, addr);
-}
+/*
+ * {pgd,p4d}_populate_kernel() are defined as macros to allow
+ * compile-time optimization based on the configured page table levels.
+ * Without this, linking may fail because callers (e.g., KASAN) may rely
+ * on calls to these functions being optimized away when passing symbols
+ * that exist only for certain page table levels.
+ */
+#define pgd_populate_kernel(addr, pgd, p4d)				\
+	do {								\
+		pgd_populate(&init_mm, pgd, p4d);			\
+		if (ARCH_PAGE_TABLE_SYNC_MASK & PGTBL_PGD_MODIFIED)	\
+			arch_sync_kernel_mappings(addr, addr);		\
+	} while (0)
 
-static inline void p4d_populate_kernel(unsigned long addr, p4d_t *p4d,
-				       pud_t *pud)
-{
-	p4d_populate(&init_mm, p4d, pud);
-	if (ARCH_PAGE_TABLE_SYNC_MASK & PGTBL_P4D_MODIFIED)
-		arch_sync_kernel_mappings(addr, addr);
-}
+#define p4d_populate_kernel(addr, p4d, pud)				\
+	do {								\
+		p4d_populate(&init_mm, p4d, pud);			\
+		if (ARCH_PAGE_TABLE_SYNC_MASK & PGTBL_P4D_MODIFIED)	\
+			arch_sync_kernel_mappings(addr, addr);		\
+	} while (0)
 
 #endif /* _LINUX_PGALLOC_H */
-- 
2.43.0
Re: [PATCH v2] mm: fix KASAN build error due to p*d_populate_kernel()
Posted by Dave Hansen 1 month, 1 week ago
On 8/21/25 04:57, Harry Yoo wrote:
> However, {pgd,p4d}_populate_kernel() is defined as a function regardless
> of the number of page table levels, so the compiler may not optimize
> them away. In this case, the following linker error occurs:

This part of the changelog confused me. I think it's focusing on the
wrong thing.

The code that's triggering this is literally:

>                         pgd_populate(&init_mm, pgd,
>                                         lm_alias(kasan_early_shadow_p4d));

It sure _looks_ like it's unconditionally referencing the
'kasan_early_shadow_p4d' symbol. I think it's wrong to hide that with
macro magic and just assume that the macros won't reference it.

If a symbol isn't being defined, it shouldn't be referenced in C code.:q

The right way to do it is to have an #ifdef in a header that avoids
compiling in the reference to the symbol.

But just changing the 'static inline' to a #define seems like a fragile
hack to me.
Re: [PATCH v2] mm: fix KASAN build error due to p*d_populate_kernel()
Posted by Harry Yoo 1 month, 1 week ago
On Thu, Aug 21, 2025 at 10:36:12AM -0700, Dave Hansen wrote:
> On 8/21/25 04:57, Harry Yoo wrote:
> > However, {pgd,p4d}_populate_kernel() is defined as a function regardless
> > of the number of page table levels, so the compiler may not optimize
> > them away. In this case, the following linker error occurs:

Hi, thanks for taking a look, Dave!

First of all, this is a fix-up patch of a mm-hotfixes patch series that
fixes a bug (I should have explained that in the changelog) [1].

[1] https://lore.kernel.org/linux-mm/20250818020206.4517-1-harry.yoo@oracle.com

I think we can continue discussing it and perhaps do that as part of
a follow-up series, because the current patch series need to be backported
to -stable and your suggestion to improve existing code doesn't require
-stable backports.

Does that sound fine?

> This part of the changelog confused me. I think it's focusing on the
> wrong thing.
> 
> The code that's triggering this is literally:
> 
> >                         pgd_populate(&init_mm, pgd,
> >                                         lm_alias(kasan_early_shadow_p4d));
> 
> It sure _looks_ like it's unconditionally referencing the
> 'kasan_early_shadow_p4d' symbol. I think it's wrong to hide that with
> macro magic and just assume that the macros won't reference it.
> 
> If a symbol isn't being defined, it shouldn't be referenced in C code.:q

A fair point, and that's what KASAN code has been doing for years.

> The right way to do it is to have an #ifdef in a header that avoids
> compiling in the reference to the symbol.

You mean defining some wrapper functions for p*d_populate_kernel() in
KASAN with different implementations based on ifdeffery?

Just to clarify, what should be the exact ifdeffery to cover these cases?
#if CONFIG_PGTABLE_LEVELS == 4 and 5, or
#ifdef __PAGETABLE_P4D_FOLDED and __PAGETABLE_PUD_FOLDED ?

I have no strong opinion on this, let's hear what KASAN folks think.

> But just changing the 'static inline' to a #define seems like a fragile
> hack to me.

At least that's what KASAN has relied on p*d_populate() to do...

-- 
Cheers,
Harry / Hyeonggon
Re: [PATCH v2] mm: fix KASAN build error due to p*d_populate_kernel()
Posted by Dave Hansen 1 month, 1 week ago
On 8/21/25 18:11, Harry Yoo wrote:
> On Thu, Aug 21, 2025 at 10:36:12AM -0700, Dave Hansen wrote:
>> On 8/21/25 04:57, Harry Yoo wrote:
>>> However, {pgd,p4d}_populate_kernel() is defined as a function regardless
>>> of the number of page table levels, so the compiler may not optimize
>>> them away. In this case, the following linker error occurs:
> 
> Hi, thanks for taking a look, Dave!
> 
> First of all, this is a fix-up patch of a mm-hotfixes patch series that
> fixes a bug (I should have explained that in the changelog) [1].
> 
> [1] https://lore.kernel.org/linux-mm/20250818020206.4517-1-harry.yoo@oracle.com
> 
> I think we can continue discussing it and perhaps do that as part of
> a follow-up series, because the current patch series need to be backported
> to -stable and your suggestion to improve existing code doesn't require
> -stable backports.
> 
> Does that sound fine?
> 
>> This part of the changelog confused me. I think it's focusing on the
>> wrong thing.
>>
>> The code that's triggering this is literally:
>>
>>>                         pgd_populate(&init_mm, pgd,
>>>                                         lm_alias(kasan_early_shadow_p4d));
>>
>> It sure _looks_ like it's unconditionally referencing the
>> 'kasan_early_shadow_p4d' symbol. I think it's wrong to hide that with
>> macro magic and just assume that the macros won't reference it.
>>
>> If a symbol isn't being defined, it shouldn't be referenced in C code.:q
> 
> A fair point, and that's what KASAN code has been doing for years.
> 
>> The right way to do it is to have an #ifdef in a header that avoids
>> compiling in the reference to the symbol.
> 
> You mean defining some wrapper functions for p*d_populate_kernel() in
> KASAN with different implementations based on ifdeffery?

That would work.

So would something like:

#if CONFIG_PGTABLE_LEVELS >= 4
extern p4d_t kasan_early_shadow_p4d[MAX_PTRS_PER_P4D];
#else
#define kasan_early_shadow_p4d NULL
#endif

> Just to clarify, what should be the exact ifdeffery to cover these cases?
> #if CONFIG_PGTABLE_LEVELS == 4 and 5, or
> #ifdef __PAGETABLE_P4D_FOLDED and __PAGETABLE_PUD_FOLDED ?
> 
> I have no strong opinion on this, let's hear what KASAN folks think.

I think CONFIG_PGTABLE_LEVELS works, but in the end I'm not picky about
the specific #ifdefs that work.
Re: [PATCH v2] mm: fix KASAN build error due to p*d_populate_kernel()
Posted by Andrey Ryabinin 1 month, 1 week ago

On 8/22/25 7:08 PM, Dave Hansen wrote:
> On 8/21/25 18:11, Harry Yoo wrote:
>> On Thu, Aug 21, 2025 at 10:36:12AM -0700, Dave Hansen wrote:
>>> On 8/21/25 04:57, Harry Yoo wrote:
>>>> However, {pgd,p4d}_populate_kernel() is defined as a function regardless
>>>> of the number of page table levels, so the compiler may not optimize
>>>> them away. In this case, the following linker error occurs:
>>
>> Hi, thanks for taking a look, Dave!
>>
>> First of all, this is a fix-up patch of a mm-hotfixes patch series that
>> fixes a bug (I should have explained that in the changelog) [1].
>>
>> [1] https://lore.kernel.org/linux-mm/20250818020206.4517-1-harry.yoo@oracle.com
>>
>> I think we can continue discussing it and perhaps do that as part of
>> a follow-up series, because the current patch series need to be backported
>> to -stable and your suggestion to improve existing code doesn't require
>> -stable backports.
>>
>> Does that sound fine?
>>
>>> This part of the changelog confused me. I think it's focusing on the
>>> wrong thing.
>>>
>>> The code that's triggering this is literally:
>>>
>>>>                         pgd_populate(&init_mm, pgd,
>>>>                                         lm_alias(kasan_early_shadow_p4d));
>>>
>>> It sure _looks_ like it's unconditionally referencing the
>>> 'kasan_early_shadow_p4d' symbol. I think it's wrong to hide that with
>>> macro magic and just assume that the macros won't reference it.
>>>
>>> If a symbol isn't being defined, it shouldn't be referenced in C code.:q
>>
>> A fair point, and that's what KASAN code has been doing for years.
>>
>>> The right way to do it is to have an #ifdef in a header that avoids
>>> compiling in the reference to the symbol.
>>
>> You mean defining some wrapper functions for p*d_populate_kernel() in
>> KASAN with different implementations based on ifdeffery?
> 
> That would work.
> 
> So would something like:
> 
> #if CONFIG_PGTABLE_LEVELS >= 4
> extern p4d_t kasan_early_shadow_p4d[MAX_PTRS_PER_P4D];
> #else
> #define kasan_early_shadow_p4d NULL
> #endif
> 

This won't work. It will fix the linker error, but will introduce runtime bug instead:

lm_alias(kasan_early_shadow_p4d) -> __va(__phys_addr_symbol(NULL))

On arm64:

phys_addr_t __phys_addr_symbol(unsigned long x)
	VIRTUAL_BUG_ON(x < (unsigned long) KERNEL_START ||
		       x > (unsigned long) KERNEL_END);

And NULL is < KERNEL_START.

Since __phys_addr_symbol() isn't pure or const,  compiler has no right to eliminate such
call even though the return value is unused.
Re: [PATCH v2] mm: fix KASAN build error due to p*d_populate_kernel()
Posted by Andrey Ryabinin 1 month, 1 week ago

On 8/22/25 3:11 AM, Harry Yoo wrote:
> On Thu, Aug 21, 2025 at 10:36:12AM -0700, Dave Hansen wrote:
>> On 8/21/25 04:57, Harry Yoo wrote:
>>> However, {pgd,p4d}_populate_kernel() is defined as a function regardless
>>> of the number of page table levels, so the compiler may not optimize
>>> them away. In this case, the following linker error occurs:
> 
> Hi, thanks for taking a look, Dave!
> 
> First of all, this is a fix-up patch of a mm-hotfixes patch series that
> fixes a bug (I should have explained that in the changelog) [1].
> 
> [1] https://lore.kernel.org/linux-mm/20250818020206.4517-1-harry.yoo@oracle.com
> 
> I think we can continue discussing it and perhaps do that as part of
> a follow-up series, because the current patch series need to be backported
> to -stable and your suggestion to improve existing code doesn't require
> -stable backports.
> 
> Does that sound fine?
> 
>> This part of the changelog confused me. I think it's focusing on the
>> wrong thing.
>>
>> The code that's triggering this is literally:
>>
>>>                         pgd_populate(&init_mm, pgd,
>>>                                         lm_alias(kasan_early_shadow_p4d));
>>
>> It sure _looks_ like it's unconditionally referencing the
>> 'kasan_early_shadow_p4d' symbol. I think it's wrong to hide that with
>> macro magic and just assume that the macros won't reference it.
>>
>> If a symbol isn't being defined, it shouldn't be referenced in C code.:q


That's not exactly the case for the kernel. It historically relied on being
compiled with optimization and compiler being able to eliminate unused references.
AFAIR BUILD_BUG_ON() works like that, there are also plenty of code like

if  (IS_ENABLED(CONFIG_SOMETHING))
	ptr = &something;
else
	ptr = &something_else; 

e.g. irq_remaping_prepare();


> 
> A fair point, and that's what KASAN code has been doing for years.
> 
>> The right way to do it is to have an #ifdef in a header that avoids
>> compiling in the reference to the symbol.
> 
> You mean defining some wrapper functions for p*d_populate_kernel() in
> KASAN with different implementations based on ifdeffery?
> 
> Just to clarify, what should be the exact ifdeffery to cover these cases?
> #if CONFIG_PGTABLE_LEVELS == 4 and 5, or
> #ifdef __PAGETABLE_P4D_FOLDED and __PAGETABLE_PUD_FOLDED ?
> 

I think ifdef should be the same as for symbol, so '#if CONFIG_PGTABLE_LEVELS > 4'
for *_p4d and '#if CONFIG_PGTABLE_LEVELS > 3' for *_pud


> I have no strong opinion on this, let's hear what KASAN folks think.
> 

So, I think we have following options:

1. Macros as you did.
2. Hide references in function under  '#if CONFIG_PGTABLE_LEVELS > x', like Dave suggested.
3. It should be enough to just add if in code like
            if (CONFIG_PGTABLE_LEVELS > 4)
		pgd_populate_kernel(addr, pgd,
                                          lm_alias(kasan_early_shadow_p4d));
Compiler should be able to optimize it away.

4. I guess that the link error is due to enabled CONFIG_DEBUG_VIRTUAL=y
lm_alias() ends up with __phys_addr_symbol() function call which compiler can't optimize away.
Technically we can declare __phys_addr_symbol() with __attribute__((pure)), so compiler will
be able to optimize away this call, because the result should be unused.
But I'm not sure we really want that, because it's debug function and even if the result is unused
we might want to still have a check if symbol address is correct.


I would probably prefer 3rd option, but I don't really have very strong opinion, so either way is fine.
Re: [PATCH v2] mm: fix KASAN build error due to p*d_populate_kernel()
Posted by Harry Yoo 1 month, 1 week ago
On Fri, Aug 22, 2025 at 06:02:40PM +0200, Andrey Ryabinin wrote:
> On 8/22/25 3:11 AM, Harry Yoo wrote:
> > On Thu, Aug 21, 2025 at 10:36:12AM -0700, Dave Hansen wrote:
> >> On 8/21/25 04:57, Harry Yoo wrote:
> >>> However, {pgd,p4d}_populate_kernel() is defined as a function regardless
> >>> of the number of page table levels, so the compiler may not optimize
> >>> them away. In this case, the following linker error occurs:
> > 
> > Hi, thanks for taking a look, Dave!
> > 
> > First of all, this is a fix-up patch of a mm-hotfixes patch series that
> > fixes a bug (I should have explained that in the changelog) [1].
> > 
> > [1] https://lore.kernel.org/linux-mm/20250818020206.4517-1-harry.yoo@oracle.com
> > 
> > I think we can continue discussing it and perhaps do that as part of
> > a follow-up series, because the current patch series need to be backported
> > to -stable and your suggestion to improve existing code doesn't require
> > -stable backports.
> > 
> > Does that sound fine?
> > 
> >> This part of the changelog confused me. I think it's focusing on the
> >> wrong thing.
> >>
> >> The code that's triggering this is literally:
> >>
> >>>                         pgd_populate(&init_mm, pgd,
> >>>                                         lm_alias(kasan_early_shadow_p4d));
> >>
> >> It sure _looks_ like it's unconditionally referencing the
> >> 'kasan_early_shadow_p4d' symbol. I think it's wrong to hide that with
> >> macro magic and just assume that the macros won't reference it.
> >>
> >> If a symbol isn't being defined, it shouldn't be referenced in C code.:q
>  
> That's not exactly the case for the kernel. It historically relied on being
> compiled with optimization and compiler being able to eliminate unused references.
> AFAIR BUILD_BUG_ON() works like that, there are also plenty of code like
> 
> if  (IS_ENABLED(CONFIG_SOMETHING))
> 	ptr = &something;
> else
> 	ptr = &something_else; 
> 
> e.g. irq_remaping_prepare();

Agreed. I've seen this pattern in many places.

> > A fair point, and that's what KASAN code has been doing for years.
> > 
> >> The right way to do it is to have an #ifdef in a header that avoids
> >> compiling in the reference to the symbol.
> > 
> > You mean defining some wrapper functions for p*d_populate_kernel() in
> > KASAN with different implementations based on ifdeffery?
> > 
> > Just to clarify, what should be the exact ifdeffery to cover these cases?
> > #if CONFIG_PGTABLE_LEVELS == 4 and 5, or
> > #ifdef __PAGETABLE_P4D_FOLDED and __PAGETABLE_PUD_FOLDED ?
> > 
> 
> I think ifdef should be the same as for symbol, so '#if CONFIG_PGTABLE_LEVELS > 4'
> for *_p4d and '#if CONFIG_PGTABLE_LEVELS > 3' for *_pud

Right.

> > I have no strong opinion on this, let's hear what KASAN folks think.
> > 
> 
> So, I think we have following options:
> 
> 1. Macros as you did.
> 2. Hide references in function under  '#if CONFIG_PGTABLE_LEVELS > x', like Dave suggested.
> 3. It should be enough to just add if in code like
>             if (CONFIG_PGTABLE_LEVELS > 4)
> 		pgd_populate_kernel(addr, pgd,
>                                           lm_alias(kasan_early_shadow_p4d));
> Compiler should be able to optimize it away.
> 
> 4. I guess that the link error is due to enabled CONFIG_DEBUG_VIRTUAL=y
> lm_alias() ends up with __phys_addr_symbol() function call which compiler can't optimize away.
> Technically we can declare __phys_addr_symbol() with __attribute__((pure)), so compiler will
> be able to optimize away this call, because the result should be unused.
> But I'm not sure we really want that, because it's debug function and even if the result is unused
> we might want to still have a check if symbol address is correct.
> 
> 
> I would probably prefer 3rd option, but I don't really have very strong opinion, so either way is fine.

I also prefer 3rd option (but 1st or 2nd is also fine to me)

That's two votes, I'll do 2nd option in the follow-up series unless
Dave or somebody else objects?

-- 
Cheers,
Harry / Hyeonggon