[PATCH] mm: introduce ARCH_PAGE_TABLE_SYNC_MASK_VMALLOC to sync kernel mapping conditionally
Posted by alexjlzheng@gmail.com 2 weeks ago
From: Jinliang Zheng <alexjlzheng@tencent.com>

After commit 6eb82f994026 ("x86/mm: Pre-allocate P4D/PUD pages for
vmalloc area"), we don't need to synchronize kernel mappings in the
vmalloc area on x86_64.

And commit 58a18fe95e83 ("x86/mm/64: Do not sync vmalloc/ioremap
mappings") actually does this.

But commit 6659d0279980 ("x86/mm/64: define ARCH_PAGE_TABLE_SYNC_MASK
and arch_sync_kernel_mappings()") breaks this.

This patch introduces ARCH_PAGE_TABLE_SYNC_MASK_VMALLOC to avoid
unnecessary kernel mappings synchronization of the vmalloc area.

Fixes: 6659d0279980 ("x86/mm/64: define ARCH_PAGE_TABLE_SYNC_MASK and arch_sync_kernel_mappings()")
Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
---
 arch/arm/include/asm/page.h                 | 3 ++-
 arch/x86/include/asm/pgtable-2level_types.h | 3 ++-
 arch/x86/include/asm/pgtable-3level_types.h | 3 ++-
 include/linux/pgtable.h                     | 4 ++++
 mm/memory.c                                 | 2 +-
 mm/vmalloc.c                                | 6 +++---
 6 files changed, 14 insertions(+), 7 deletions(-)
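
A minimal sketch (not part of the diff; simplified, with the x86_64
behaviour stated from memory) of how the two masks split things after
this patch. x86_64 keeps ARCH_PAGE_TABLE_SYNC_MASK for its other sync
paths but does not define the vmalloc variant, so it falls back to 0:

/* Generic fallback added to include/linux/pgtable.h by this patch: */
#ifndef ARCH_PAGE_TABLE_SYNC_MASK_VMALLOC
#define ARCH_PAGE_TABLE_SYNC_MASK_VMALLOC 0
#endif

static void sync_after_vmap(pgtbl_mod_mask mask,
			    unsigned long start, unsigned long end)
{
	/*
	 * On x86_64 the vmalloc mask is 0, so the compiler drops this
	 * branch; on 32-bit x86 and arm (with CONFIG_VMAP_STACK) it
	 * aliases ARCH_PAGE_TABLE_SYNC_MASK and the sync still runs.
	 */
	if (mask & ARCH_PAGE_TABLE_SYNC_MASK_VMALLOC)
		arch_sync_kernel_mappings(start, end);
}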

diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
index ef11b721230e..764afc1d0aba 100644
--- a/arch/arm/include/asm/page.h
+++ b/arch/arm/include/asm/page.h
@@ -167,7 +167,8 @@ extern void copy_page(void *to, const void *from);
 #else
 #include <asm/pgtable-2level-types.h>
 #ifdef CONFIG_VMAP_STACK
-#define ARCH_PAGE_TABLE_SYNC_MASK	PGTBL_PMD_MODIFIED
+#define ARCH_PAGE_TABLE_SYNC_MASK		PGTBL_PMD_MODIFIED
+#define ARCH_PAGE_TABLE_SYNC_MASK_VMALLOC	ARCH_PAGE_TABLE_SYNC_MASK
 #endif
 #endif
 
diff --git a/arch/x86/include/asm/pgtable-2level_types.h b/arch/x86/include/asm/pgtable-2level_types.h
index 54690bd4ddbe..650b12c25c0c 100644
--- a/arch/x86/include/asm/pgtable-2level_types.h
+++ b/arch/x86/include/asm/pgtable-2level_types.h
@@ -18,7 +18,8 @@ typedef union {
 } pte_t;
 #endif	/* !__ASSEMBLER__ */
 
-#define ARCH_PAGE_TABLE_SYNC_MASK	PGTBL_PMD_MODIFIED
+#define ARCH_PAGE_TABLE_SYNC_MASK		PGTBL_PMD_MODIFIED
+#define ARCH_PAGE_TABLE_SYNC_MASK_VMALLOC	ARCH_PAGE_TABLE_SYNC_MASK
 
 /*
  * Traditional i386 two-level paging structure:
diff --git a/arch/x86/include/asm/pgtable-3level_types.h b/arch/x86/include/asm/pgtable-3level_types.h
index 580b09bf6a45..272d946a3c7d 100644
--- a/arch/x86/include/asm/pgtable-3level_types.h
+++ b/arch/x86/include/asm/pgtable-3level_types.h
@@ -27,7 +27,8 @@ typedef union {
 } pmd_t;
 #endif	/* !__ASSEMBLER__ */
 
-#define ARCH_PAGE_TABLE_SYNC_MASK	PGTBL_PMD_MODIFIED
+#define ARCH_PAGE_TABLE_SYNC_MASK		PGTBL_PMD_MODIFIED
+#define ARCH_PAGE_TABLE_SYNC_MASK_VMALLOC	ARCH_PAGE_TABLE_SYNC_MASK
 
 /*
  * PGDIR_SHIFT determines what a top-level page table entry can map
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 2b80fd456c8b..53b97c5773ba 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -1476,6 +1476,10 @@ static inline void modify_prot_commit_ptes(struct vm_area_struct *vma, unsigned
 #define ARCH_PAGE_TABLE_SYNC_MASK 0
 #endif
 
+#ifndef ARCH_PAGE_TABLE_SYNC_MASK_VMALLOC
+#define ARCH_PAGE_TABLE_SYNC_MASK_VMALLOC 0
+#endif
+
 /*
  * There is no default implementation for arch_sync_kernel_mappings(). It is
  * relied upon the compiler to optimize calls out if ARCH_PAGE_TABLE_SYNC_MASK
diff --git a/mm/memory.c b/mm/memory.c
index 0ba4f6b71847..cd2488043f8f 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3170,7 +3170,7 @@ static int __apply_to_page_range(struct mm_struct *mm, unsigned long addr,
 			break;
 	} while (pgd++, addr = next, addr != end);
 
-	if (mask & ARCH_PAGE_TABLE_SYNC_MASK)
+	if (mask & ARCH_PAGE_TABLE_SYNC_MASK_VMALLOC)
 		arch_sync_kernel_mappings(start, start + size);
 
 	return err;
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 5edd536ba9d2..2fe2480de5dc 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -311,7 +311,7 @@ static int vmap_range_noflush(unsigned long addr, unsigned long end,
 			break;
 	} while (pgd++, phys_addr += (next - addr), addr = next, addr != end);
 
-	if (mask & ARCH_PAGE_TABLE_SYNC_MASK)
+	if (mask & ARCH_PAGE_TABLE_SYNC_MASK_VMALLOC)
 		arch_sync_kernel_mappings(start, end);
 
 	return err;
@@ -484,7 +484,7 @@ void __vunmap_range_noflush(unsigned long start, unsigned long end)
 		vunmap_p4d_range(pgd, addr, next, &mask);
 	} while (pgd++, addr = next, addr != end);
 
-	if (mask & ARCH_PAGE_TABLE_SYNC_MASK)
+	if (mask & ARCH_PAGE_TABLE_SYNC_MASK_VMALLOC)
 		arch_sync_kernel_mappings(start, end);
 }
 
@@ -629,7 +629,7 @@ static int vmap_small_pages_range_noflush(unsigned long addr, unsigned long end,
 			break;
 	} while (pgd++, addr = next, addr != end);
 
-	if (mask & ARCH_PAGE_TABLE_SYNC_MASK)
+	if (mask & ARCH_PAGE_TABLE_SYNC_MASK_VMALLOC)
 		arch_sync_kernel_mappings(start, end);
 
 	return err;
-- 
2.49.0
Re: [PATCH] mm: introduce ARCH_PAGE_TABLE_SYNC_MASK_VMALLOC to sync kernel mapping conditionally
Posted by Harry Yoo 2 weeks ago
On Wed, Sep 17, 2025 at 11:48:29PM +0800, alexjlzheng@gmail.com wrote:
> From: Jinliang Zheng <alexjlzheng@tencent.com>
> 
> After commit 6eb82f994026 ("x86/mm: Pre-allocate P4D/PUD pages for
> vmalloc area"), we don't need to synchronize kernel mappings in the
> vmalloc area on x86_64.

Right.

> And commit 58a18fe95e83 ("x86/mm/64: Do not sync vmalloc/ioremap
> mappings") actually does this.

Right.

> But commit 6659d0279980 ("x86/mm/64: define ARCH_PAGE_TABLE_SYNC_MASK
> and arch_sync_kernel_mappings()") breaks this.

Good point.

> This patch introduces ARCH_PAGE_TABLE_SYNC_MASK_VMALLOC to avoid
> unnecessary kernel mappings synchronization of the vmalloc area.
> 
> Fixes: 6659d0279980 ("x86/mm/64: define ARCH_PAGE_TABLE_SYNC_MASK and arch_sync_kernel_mappings()")

The commit is getting backported to -stable kernels.

Do you think this can cause a visible performance regression from
user point of view, or it's just a nice optimization to have?
(and any data to support?)

> Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
> ---
>  arch/arm/include/asm/page.h                 | 3 ++-
>  arch/x86/include/asm/pgtable-2level_types.h | 3 ++-
>  arch/x86/include/asm/pgtable-3level_types.h | 3 ++-
>  include/linux/pgtable.h                     | 4 ++++
>  mm/memory.c                                 | 2 +-
>  mm/vmalloc.c                                | 6 +++---
>  6 files changed, 14 insertions(+), 7 deletions(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 0ba4f6b71847..cd2488043f8f 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3170,7 +3170,7 @@ static int __apply_to_page_range(struct mm_struct *mm, unsigned long addr,
>  			break;
>  	} while (pgd++, addr = next, addr != end);
>  
> -	if (mask & ARCH_PAGE_TABLE_SYNC_MASK)
> +	if (mask & ARCH_PAGE_TABLE_SYNC_MASK_VMALLOC)
>  		arch_sync_kernel_mappings(start, start + size);

But vmalloc is not the only user of apply_to_page_range()?

-- 
Cheers,
Harry / Hyeonggon
Re: [PATCH] mm: introduce ARCH_PAGE_TABLE_SYNC_MASK_VMALLOC to sync kernel mapping conditionally
Posted by Jinliang Zheng 2 weeks ago
On Thu, 18 Sep 2025 01:41:04 +0900, harry.yoo@oracle.com wrote:
> On Wed, Sep 17, 2025 at 11:48:29PM +0800, alexjlzheng@gmail.com wrote:
> > From: Jinliang Zheng <alexjlzheng@tencent.com>
> > 
> > After commit 6eb82f994026 ("x86/mm: Pre-allocate P4D/PUD pages for
> > vmalloc area"), we don't need to synchronize kernel mappings in the
> > vmalloc area on x86_64.
> 
> Right.
> 
> > And commit 58a18fe95e83 ("x86/mm/64: Do not sync vmalloc/ioremap
> > mappings") actually does this.
> 
> Right.
> 
> > But commit 6659d0279980 ("x86/mm/64: define ARCH_PAGE_TABLE_SYNC_MASK
> > and arch_sync_kernel_mappings()") breaks this.
> 
> Good point.
> 
> > This patch introduces ARCH_PAGE_TABLE_SYNC_MASK_VMALLOC to avoid
> > unnecessary kernel mappings synchronization of the vmalloc area.
> > 
> > Fixes: 6659d0279980 ("x86/mm/64: define ARCH_PAGE_TABLE_SYNC_MASK and arch_sync_kernel_mappings()")
> 
> The commit is getting backported to -stable kernels.
> 
> Do you think this can cause a visible performance regression from
> user point of view, or it's just a nice optimization to have?
> (and any data to support?)

Haha, when I woke up in bed this morning, I suddenly realized that I
might have pushed a worthless patch and wasted everyone's precious time.

Sorry for that. :-(

After commit 6eb82f994026 ("x86/mm: Pre-allocate P4D/PUD pages for vmalloc area"),
p4d_alloc_track()/pud_alloc_track() in vmalloc() and apply_to_page_range() should
never set PGTBL_PGD_MODIFIED (5-level page tables) or PGTBL_P4D_MODIFIED (4-level
page tables) in the mask, thereby bypassing the call to
arch_sync_kernel_mappings(). Right?
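
To make the "never set" part concrete: the track helpers only flip a
PGTBL_*_MODIFIED bit when they actually instantiate a new table. A
slightly abridged sketch of p4d_alloc_track() (from
include/linux/pgtable.h, IIRC):

static inline p4d_t *p4d_alloc_track(struct mm_struct *mm, pgd_t *pgd,
				     unsigned long address,
				     pgtbl_mod_mask *mod_mask)
{
	/*
	 * With the vmalloc range pre-populated since 6eb82f994026,
	 * pgd_none() is false here for vmalloc addresses, so
	 * PGTBL_PGD_MODIFIED is never ORed into the mask for them.
	 */
	if (unlikely(pgd_none(*pgd))) {
		if (__p4d_alloc(mm, pgd, address))
			return NULL;
		*mod_mask |= PGTBL_PGD_MODIFIED;
	}

	return p4d_offset(pgd, address);
}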

thanks,
Jinliang Zheng. :)

> 
> > Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
> > ---
> >  arch/arm/include/asm/page.h                 | 3 ++-
> >  arch/x86/include/asm/pgtable-2level_types.h | 3 ++-
> >  arch/x86/include/asm/pgtable-3level_types.h | 3 ++-
> >  include/linux/pgtable.h                     | 4 ++++
> >  mm/memory.c                                 | 2 +-
> >  mm/vmalloc.c                                | 6 +++---
> >  6 files changed, 14 insertions(+), 7 deletions(-)
> > 
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 0ba4f6b71847..cd2488043f8f 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -3170,7 +3170,7 @@ static int __apply_to_page_range(struct mm_struct *mm, unsigned long addr,
> >  			break;
> >  	} while (pgd++, addr = next, addr != end);
> >  
> > -	if (mask & ARCH_PAGE_TABLE_SYNC_MASK)
> > +	if (mask & ARCH_PAGE_TABLE_SYNC_MASK_VMALLOC)
> >  		arch_sync_kernel_mappings(start, start + size);
> 
> But vmalloc is not the only user of apply_to_page_range()?
> 
> -- 
> Cheers,
> Harry / Hyeonggon
Re: [PATCH] mm: introduce ARCH_PAGE_TABLE_SYNC_MASK_VMALLOC to sync kernel mapping conditionally
Posted by Harry Yoo 2 weeks ago
On Thu, Sep 18, 2025 at 09:31:30AM +0800, Jinliang Zheng wrote:
> On Thu, 18 Sep 2025 01:41:04 +0900, harry.yoo@oracle.com wrote:
> > On Wed, Sep 17, 2025 at 11:48:29PM +0800, alexjlzheng@gmail.com wrote:
> > > From: Jinliang Zheng <alexjlzheng@tencent.com>
> > > 
> > > After commit 6eb82f994026 ("x86/mm: Pre-allocate P4D/PUD pages for
> > > vmalloc area"), we don't need to synchronize kernel mappings in the
> > > vmalloc area on x86_64.
> > 
> > Right.
> > 
> > > And commit 58a18fe95e83 ("x86/mm/64: Do not sync vmalloc/ioremap
> > > mappings") actually does this.
> > 
> > Right.
> > 
> > > But commit 6659d0279980 ("x86/mm/64: define ARCH_PAGE_TABLE_SYNC_MASK
> > > and arch_sync_kernel_mappings()") breaks this.
> > 
> > Good point.
> > 
> > > This patch introduces ARCH_PAGE_TABLE_SYNC_MASK_VMALLOC to avoid
> > > unnecessary kernel mappings synchronization of the vmalloc area.
> > > 
> > > Fixes: 6659d0279980 ("x86/mm/64: define ARCH_PAGE_TABLE_SYNC_MASK and arch_sync_kernel_mappings()")
> > 
> > The commit is getting backported to -stable kernels.
> > 
> > Do you think this can cause a visible performance regression from
> > user point of view, or it's just a nice optimization to have?
> > (and any data to support?)
> 
> Haha, when I woke up in bed this morning, I suddenly realized that I
> might have pushed a worthless patch and wasted everyone's precious time.
> 
> Sorry for that. :-(

It's okay!

> After commit 6eb82f994026 ("x86/mm: Pre-allocate P4D/PUD pages for vmalloc area"),
> p4d_alloc_track()/pud_alloc_track() in vmalloc() and apply_to_page_range() should
> never set PGTBL_PGD_MODIFIED (5-level page tables) or PGTBL_P4D_MODIFIED (4-level
> page tables) in the mask, thereby bypassing the call to
> arch_sync_kernel_mappings(). Right?

Yeah, I was confused about it too ;)

I think you're right. Because the vmalloc area is already populated,
p4d_alloc_track() / pud_alloc_track() won't set
PGTBL_PGD_MODIFIED or PGTBL_P4D_MODIFIED in the mask.
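
For completeness, the pre-population from commit 6eb82f994026 boils
down to something like this (heavily simplified sketch of
preallocate_vmalloc_pages() in arch/x86/mm/init_64.c; the exact range
bounds and error strings are elided here):

static void __init preallocate_vmalloc_pages(void)
{
	unsigned long addr;

	for (addr = VMALLOC_START; addr <= VMALLOC_END;
	     addr = ALIGN(addr + 1, PGDIR_SIZE)) {
		pgd_t *pgd = pgd_offset_k(addr);
		p4d_t *p4d;

		/* Instantiate the P4D (and, with 4-level tables, the
		 * PUD) once at boot, so vmalloc never populates
		 * top-level entries at runtime. */
		p4d = p4d_alloc(&init_mm, pgd, addr);
		if (!p4d)
			panic("Failed to pre-allocate p4d pages for vmalloc area\n");

		if (!pgtable_l5_enabled() &&
		    !pud_alloc(&init_mm, p4d, addr))
			panic("Failed to pre-allocate pud pages for vmalloc area\n");
	}
}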

> thanks,
> Jinliang Zheng. :)
> 
> > 
> > > Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
> > > ---
> > >  arch/arm/include/asm/page.h                 | 3 ++-
> > >  arch/x86/include/asm/pgtable-2level_types.h | 3 ++-
> > >  arch/x86/include/asm/pgtable-3level_types.h | 3 ++-
> > >  include/linux/pgtable.h                     | 4 ++++
> > >  mm/memory.c                                 | 2 +-
> > >  mm/vmalloc.c                                | 6 +++---
> > >  6 files changed, 14 insertions(+), 7 deletions(-)
> > > 
> > > diff --git a/mm/memory.c b/mm/memory.c
> > > index 0ba4f6b71847..cd2488043f8f 100644
> > > --- a/mm/memory.c
> > > +++ b/mm/memory.c
> > > @@ -3170,7 +3170,7 @@ static int __apply_to_page_range(struct mm_struct *mm, unsigned long addr,
> > >  			break;
> > >  	} while (pgd++, addr = next, addr != end);
> > >  
> > > -	if (mask & ARCH_PAGE_TABLE_SYNC_MASK)
> > > +	if (mask & ARCH_PAGE_TABLE_SYNC_MASK_VMALLOC)
> > >  		arch_sync_kernel_mappings(start, start + size);
> > 
> > But vmalloc is not the only user of apply_to_page_range()?
> > 
> > -- 
> > Cheers,
> > Harry / Hyeonggon

-- 
Cheers,
Harry / Hyeonggon
Re: [PATCH] mm: introduce ARCH_PAGE_TABLE_SYNC_MASK_VMALLOC to sync kernel mapping conditionally
Posted by Harry Yoo 2 weeks ago
On Thu, Sep 18, 2025 at 01:41:04AM +0900, Harry Yoo wrote:
> On Wed, Sep 17, 2025 at 11:48:29PM +0800, alexjlzheng@gmail.com wrote:
> > From: Jinliang Zheng <alexjlzheng@tencent.com>
> > 
> > After commit 6eb82f994026 ("x86/mm: Pre-allocate P4D/PUD pages for
> > vmalloc area"), we don't need to synchronize kernel mappings in the
> > vmalloc area on x86_64.
> 
> Right.
> 
> > And commit 58a18fe95e83 ("x86/mm/64: Do not sync vmalloc/ioremap
> > mappings") actually does this.
> 
> Right.
> 
> > But commit 6659d0279980 ("x86/mm/64: define ARCH_PAGE_TABLE_SYNC_MASK
> > and arch_sync_kernel_mappings()") breaks this.
> 
> Good point.
> 
> > This patch introduces ARCH_PAGE_TABLE_SYNC_MASK_VMALLOC to avoid
> > unnecessary kernel mappings synchronization of the vmalloc area.
> > 
> > Fixes: 6659d0279980 ("x86/mm/64: define ARCH_PAGE_TABLE_SYNC_MASK and arch_sync_kernel_mappings()")
>
> The commit is getting backported to -stable kernels.

Just to be clear, "the commit" I mentioned above was commit
6659d0279980 ("x86/mm/64: define ARCH_PAGE_TABLE_SYNC_MASK and
arch_sync_kernel_mappings()"), and I was not saying this patch is
going to be backported to -stable.

If you intend to have this patch backported, the `Cc: <stable@vger.kernel.org>`
tag is required for it to be picked up by the -stable kernels.
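
For example, in the tag block of the patch:

	Fixes: 6659d0279980 ("x86/mm/64: define ARCH_PAGE_TABLE_SYNC_MASK and arch_sync_kernel_mappings()")
	Cc: <stable@vger.kernel.org>
	Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>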

> Do you think this can cause a visible performance regression from
> user point of view, or it's just a nice optimization to have?
> (and any data to support?)

And that's why I was asking if you think this needs to be backported :)

> > Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
> > ---
> >  arch/arm/include/asm/page.h                 | 3 ++-
> >  arch/x86/include/asm/pgtable-2level_types.h | 3 ++-
> >  arch/x86/include/asm/pgtable-3level_types.h | 3 ++-
> >  include/linux/pgtable.h                     | 4 ++++
> >  mm/memory.c                                 | 2 +-
> >  mm/vmalloc.c                                | 6 +++---
> >  6 files changed, 14 insertions(+), 7 deletions(-)
> > 
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 0ba4f6b71847..cd2488043f8f 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -3170,7 +3170,7 @@ static int __apply_to_page_range(struct mm_struct *mm, unsigned long addr,
> >  			break;
> >  	} while (pgd++, addr = next, addr != end);
> >  
> > -	if (mask & ARCH_PAGE_TABLE_SYNC_MASK)
> > +	if (mask & ARCH_PAGE_TABLE_SYNC_MASK_VMALLOC)
> >  		arch_sync_kernel_mappings(start, start + size);
> 
> But vmalloc is not the only user of apply_to_page_range()?
> 
> -- 
> Cheers,
> Harry / Hyeonggon

-- 
Cheers,
Harry / Hyeonggon