This patchset accelerates ioremap, vmalloc, and vmap when the memory
is physically fully or partially contiguous. Two techniques are used:

1. Avoid page table zigzag when setting PTEs/PMDs for multiple memory
   segments
2. Use batched mappings wherever possible in both vmalloc and ARM64
   layers

Patches 1–2 extend ARM64 vmalloc CONT-PTE mapping to support multiple
CONT-PTE regions instead of just one.
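
As a rough illustration of what "multiple CONT-PTE regions instead of
just one" could mean for the map-size decision, here is a minimal
userspace sketch. It is not the code from the series. The constants
assume a 4 KiB granule (so CONT_PTE_SIZE is 64 KiB), and the names
map_size_single() and map_size_batched() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed arm64 geometry for a 4 KiB granule; other granules differ. */
#define PAGE_SHIFT      12
#define PAGE_SIZE       (UINT64_C(1) << PAGE_SHIFT)
#define CONT_PTE_SHIFT  16
#define CONT_PTE_SIZE   (UINT64_C(1) << CONT_PTE_SHIFT)

static int is_aligned(uint64_t x, uint64_t a)
{
	return (x & (a - 1)) == 0;
}

/* The range, the virtual address and the physical address must all
 * allow at least one naturally aligned CONT-PTE block. */
static int cont_pte_eligible(uint64_t addr, uint64_t end, uint64_t pfn,
			     unsigned int max_page_shift)
{
	assert(end > addr);
	return max_page_shift >= CONT_PTE_SHIFT &&
	       end - addr >= CONT_PTE_SIZE &&
	       is_aligned(addr, CONT_PTE_SIZE) &&
	       is_aligned(pfn << PAGE_SHIFT, CONT_PTE_SIZE);
}

/* Hypothetical "before" model: at most one CONT-PTE block per call. */
uint64_t map_size_single(uint64_t addr, uint64_t end, uint64_t pfn,
			 unsigned int max_page_shift)
{
	if (!cont_pte_eligible(addr, end, pfn, max_page_shift))
		return PAGE_SIZE;
	return CONT_PTE_SIZE;
}

/* Hypothetical "after" model: as many whole CONT-PTE blocks as fit. */
uint64_t map_size_batched(uint64_t addr, uint64_t end, uint64_t pfn,
			  unsigned int max_page_shift)
{
	if (!cont_pte_eligible(addr, end, pfn, max_page_shift))
		return PAGE_SIZE;
	return (end - addr) & ~(CONT_PTE_SIZE - 1);
}
```

For a 224 KiB range that is aligned in both address spaces, the
"single" model hands back one 64 KiB block per call, while the
"batched" model hands back 192 KiB (three blocks) in one go.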
Patches 3–4 extend vmap_small_pages_range_noflush() to support page
shifts other than PAGE_SHIFT. This allows mapping multiple memory
segments for vmalloc() without zigzagging page tables.
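
To make the "zigzag" cost concrete, the following userspace sketch
counts how many upper-level table entries (PGD, PUD, PMD) are looked
up when a range is mapped chunk by chunk with a fresh top-down walk
each time, versus in a single walk over the whole range. It is only a
counting model under an assumed 4 KiB, four-level layout; the function
names are hypothetical and nothing here is taken from the patches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: 4 KiB pages with four translation levels. */
#define PMD_SHIFT   21
#define PUD_SHIFT   30
#define PGDIR_SHIFT 39

static const unsigned int upper_shifts[] = { PGDIR_SHIFT, PUD_SHIFT, PMD_SHIFT };
#define NR_UPPER (sizeof(upper_shifts) / sizeof(upper_shifts[0]))

/* Number of entries at one level that cover [start, end). */
static uint64_t entries_covering(uint64_t start, uint64_t end, unsigned int shift)
{
	assert(end > start);
	return ((end - 1) >> shift) - (start >> shift) + 1;
}

/* Upper-level entries touched by one top-down walk over [start, end). */
static uint64_t upper_entries(uint64_t start, uint64_t end)
{
	uint64_t total = 0;

	for (size_t i = 0; i < NR_UPPER; i++)
		total += entries_covering(start, end, upper_shifts[i]);
	return total;
}

/* Zigzag: every chunk restarts from the top of the page table. */
uint64_t upper_lookups_zigzag(uint64_t start, uint64_t nr_chunks,
			      unsigned int chunk_shift)
{
	uint64_t chunk = UINT64_C(1) << chunk_shift;
	uint64_t total = 0;

	for (uint64_t i = 0; i < nr_chunks; i++)
		total += upper_entries(start + i * chunk, start + (i + 1) * chunk);
	return total;
}

/* Single walk: each upper-level entry is visited once for the range. */
uint64_t upper_lookups_single_walk(uint64_t start, uint64_t nr_chunks,
				   unsigned int chunk_shift)
{
	return upper_entries(start, start + (nr_chunks << chunk_shift));
}
```

In this model, mapping 1 MB as sixteen 64 KiB chunks inside one PMD
entry costs 48 upper-level lookups with per-chunk walks and 3 with a
single walk. Leaf-level work is the same in both cases and is not
counted.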
Patches 5–8 add huge vmap support for contiguous pages. This not only
improves performance but also enables PMD or CONT-PTE mapping for the
vmapped area, reducing TLB pressure.
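
One way to picture batching for vmap() is to scan the page array for
runs of physically contiguous pages and map each run in one step. The
sketch below does this over a plain array of PFNs. It is an assumption
made for illustration: the series itself talks about compound pages
and vm_area alignment, which this model ignores, and the helper names
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Length of the run of consecutive PFNs that starts at index i. */
size_t contig_run_len(const uint64_t *pfns, size_t n, size_t i)
{
	size_t len = 1;

	assert(i < n);
	while (i + len < n && pfns[i + len] == pfns[i] + len)
		len++;
	return len;
}

/* Number of mapping steps if every contiguous run is mapped as one batch. */
size_t count_batches(const uint64_t *pfns, size_t n)
{
	size_t batches = 0;

	for (size_t i = 0; i < n; i += contig_run_len(pfns, n, i))
		batches++;
	return batches;
}
```

With PFNs {100, 101, 102, 103, 200, 300, 301}, the scan finds runs of
4, 1 and 2 pages, so seven per-page mappings become three batches.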

Many thanks to Xueyuan Chen for his substantial testing efforts
on RK3588 boards.

On the RK3588 8-core ARM64 SoC, with tasks pinned to CPU2 and
the performance CPUfreq policy enabled, Xueyuan’s tests report:

* ioremap(1 MB): 1.2× faster
* vmalloc(1 MB) mapping time (excluding allocation) with
  VM_ALLOW_HUGE_VMAP: 1.5× faster
* vmap(): 5.6× faster when memory includes some order-8 pages,
  with no regression observed for order-0 pages

Barry Song (Xiaomi) (8):
  arm64/hugetlb: Extend batching of multiple CONT_PTE in a single PTE
    setup
  arm64/vmalloc: Allow arch_vmap_pte_range_map_size to batch multiple
    CONT_PTE
  mm/vmalloc: Extend vmap_small_pages_range_noflush() to support larger
    page_shift sizes
  mm/vmalloc: Eliminate page table zigzag for huge vmalloc mappings
  mm/vmalloc: map contiguous pages in batches for vmap() if possible
  mm/vmalloc: align vm_area so vmap() can batch mappings
  mm/vmalloc: Coalesce same page_shift mappings in vmap to avoid pgtable
    zigzag
  mm/vmalloc: Stop scanning for compound pages after encountering small
    pages in vmap

arch/arm64/include/asm/vmalloc.h | 6 +-
arch/arm64/mm/hugetlbpage.c | 10 ++
mm/vmalloc.c | 178 +++++++++++++++++++++++++------
3 files changed, 161 insertions(+), 33 deletions(-)
--
2.39.3 (Apple Git-146)
On 08/04/26 8:21 am, Barry Song (Xiaomi) wrote:
> This patchset accelerates ioremap, vmalloc, and vmap when the memory
> is physically fully or partially contiguous. Two techniques are used:
[...]
> 3 files changed, 161 insertions(+), 33 deletions(-)
>

Hi Barry, have you got the chance to work on v2?
On Mon, Apr 27, 2026 at 11:05 PM Dev Jain <dev.jain@arm.com> wrote:
>
> On 08/04/26 8:21 am, Barry Song (Xiaomi) wrote:
[...]
>
> Hi Barry, have you got the chance to work on v2?

Hi Dev, thanks for the ping.

Yes, I’m getting Wen Jiang (cc’d) to send v2 within the next few days.
The patchset is basically ready, but still under testing.
On 28/04/26 8:46 am, Barry Song wrote:
> On Mon, Apr 27, 2026 at 11:05 PM Dev Jain <dev.jain@arm.com> wrote:
[...]
>> Hi Barry, have you got the chance to work on v2?
>
> Hi Dev, thanks for the ping.
>
> Yes, I’m getting Wen Jiang (cc’d) to send v2 within the next few days.
> The patchset is basically ready, but still under testing.

Thanks!
On 08/04/26 8:21 am, Barry Song (Xiaomi) wrote:
> This patchset accelerates ioremap, vmalloc, and vmap when the memory
> is physically fully or partially contiguous. Two techniques are used:
[...]
> 3 files changed, 161 insertions(+), 33 deletions(-)
>

On Linux VM on Apple M3, running mm-selftests:

./run_vmtests.sh -t "hugetlb"

TAP version 13
# -----------------------
# running ./hugepage-mmap
# -----------------------
# TAP version 13
# 1..1
# # Returned address is 0xffffe7c00000

[ 30.884630] kernel BUG at mm/page_table_check.c:86!
[ 30.884701] Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
[ 30.886803] Modules linked in:
[ 30.887217] CPU: 3 UID: 0 PID: 1869 Comm: hugepage-mmap Not tainted 7.0.0-rc5+ #86 PREEMPT
[ 30.888218] Hardware name: linux,dummy-virt (DT)
[ 30.889413] pstate: a1400005 (NzCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
[ 30.889901] pc : page_table_check_clear.part.0+0x128/0x1a0
[ 30.890337] lr : page_table_check_clear.part.0+0x7c/0x1a0
[ 30.890714] sp : ffff800084da3ad0
[ 30.890946] x29: ffff800084da3ad0 x28: 0000000000000001 x27: 0010000000000001
[ 30.891434] x26: 0040000000000040 x25: ffffa06bb8fb9000 x24: 00000000ffffffff
[ 30.891932] x23: 0000000000000001 x22: 0000000000000000 x21: ffffa06bb8997810
[ 30.892514] x20: 0000000000113e39 x19: 0000000000113e38 x18: 0000000000000000
[ 30.893007] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[ 30.893500] x14: ffffa06bb7013780 x13: 0000fffff7f90fff x12: 0000000000000000
[ 30.893990] x11: 1fffe0001a1282c1 x10: ffff0000d094160c x9 : ffffa06bb568a858
[ 30.894479] x8 : ffff5f95c8474000 x7 : 0000000000000000 x6 : ffff00017fffc500
[ 30.894973] x5 : ffff000191208fc0 x4 : 0000000000000000 x3 : 0000000000004000
[ 30.895449] x2 : 0000000000000000 x1 : 00000000ffffffff x0 : ffff0000c071f1b8
[ 30.895875] Call trace:
[ 30.896027] page_table_check_clear.part.0+0x128/0x1a0 (P)
[ 30.896369] page_table_check_clear+0xc8/0x138
[ 30.896776] __page_table_check_ptes_set+0xe4/0x1e8
[ 30.897073] __set_ptes_anysz+0x2e4/0x308
[ 30.897327] set_huge_pte_at+0xec/0x210
[ 30.897561] hugetlb_no_page+0x1ec/0x8e0
[ 30.897807] hugetlb_fault+0x188/0x740
[ 30.898036] handle_mm_fault+0x294/0x2c0
[ 30.898283] do_page_fault+0x120/0x748
[ 30.898539] do_translation_fault+0x68/0x90
[ 30.898796] do_mem_abort+0x4c/0xa8
[ 30.899011] el0_da+0x2c/0x90
[ 30.899205] el0t_64_sync_handler+0xd0/0xe8
[ 30.899461] el0t_64_sync+0x198/0x1a0
[ 30.899688] Code: 91001021 b8f80022 51000441 36fffd41 (d4210000)
[ 30.900053] ---[ end trace 0000000000000000 ]---

The bug is at

BUG_ON(atomic_dec_return(&ptc->file_map_count) < 0);

My tree is mm-unstable, commit 3fa44141e0bb.
On Wed, Apr 8, 2026 at 5:14 PM Dev Jain <dev.jain@arm.com> wrote:
>
>
>
> On 08/04/26 8:21 am, Barry Song (Xiaomi) wrote:
> > This patchset accelerates ioremap, vmalloc, and vmap when the memory
> > is physically fully or partially contiguous. Two techniques are used:
> >
> > 1. Avoid page table zigzag when setting PTEs/PMDs for multiple memory
> > segments
> > 2. Use batched mappings wherever possible in both vmalloc and ARM64
> > layers
> >
> > Patches 1–2 extend ARM64 vmalloc CONT-PTE mapping to support multiple
> > CONT-PTE regions instead of just one.
> >
> > Patches 3–4 extend vmap_small_pages_range_noflush() to support page
> > shifts other than PAGE_SHIFT. This allows mapping multiple memory
> > segments for vmalloc() without zigzagging page tables.
> >
> > Patches 5–8 add huge vmap support for contiguous pages. This not only
> > improves performance but also enables PMD or CONT-PTE mapping for the
> > vmapped area, reducing TLB pressure.
> >
> > Many thanks to Xueyuan Chen for his substantial testing efforts
> > on RK3588 boards.
> >
> > On the RK3588 8-core ARM64 SoC, with tasks pinned to CPU2 and
> > the performance CPUfreq policy enabled, Xueyuan’s tests report:
> >
> > * ioremap(1 MB): 1.2× faster
> > * vmalloc(1 MB) mapping time (excluding allocation) with
> > VM_ALLOW_HUGE_VMAP: 1.5× faster
> > * vmap(): 5.6× faster when memory includes some order-8 pages,
> > with no regression observed for order-0 pages
> >
> > Barry Song (Xiaomi) (8):
> > arm64/hugetlb: Extend batching of multiple CONT_PTE in a single PTE
> > setup
> > arm64/vmalloc: Allow arch_vmap_pte_range_map_size to batch multiple
> > CONT_PTE
> > mm/vmalloc: Extend vmap_small_pages_range_noflush() to support larger
> > page_shift sizes
> > mm/vmalloc: Eliminate page table zigzag for huge vmalloc mappings
> > mm/vmalloc: map contiguous pages in batches for vmap() if possible
> > mm/vmalloc: align vm_area so vmap() can batch mappings
> > mm/vmalloc: Coalesce same page_shift mappings in vmap to avoid pgtable
> > zigzag
> > mm/vmalloc: Stop scanning for compound pages after encountering small
> > pages in vmap
> >
> > arch/arm64/include/asm/vmalloc.h | 6 +-
> > arch/arm64/mm/hugetlbpage.c | 10 ++
> > mm/vmalloc.c | 178 +++++++++++++++++++++++++------
> > 3 files changed, 161 insertions(+), 33 deletions(-)
> >
>
> On Linux VM on Apple M3, running mm-selftests:
Dev, thanks for your report. Sorry for the silly typo;
Xueyuan’s vmalloc/vmap tests don’t trigger that case yet.

It should be fixed by:
diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index bf31c11ebd3b..25b9fce1ec6a 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -110,7 +110,7 @@ static inline int num_contig_ptes(unsigned long size, size_t *pgsize)
 		contig_ptes = CONT_PTES;
 		break;
 	default:
-		if (size < CONT_PMD_SIZE && size > 0 &&
+		if (size < PMD_SIZE && size > 0 &&
 		    IS_ALIGNED(size, CONT_PTE_SIZE)) {
 			contig_ptes = size >> PAGE_SHIFT;
 			*pgsize = PAGE_SIZE;
@@ -365,7 +365,7 @@ pte_t arch_make_huge_pte(pte_t entry, unsigned int shift, vm_flags_t flags)
 	case CONT_PTE_SIZE:
 		return pte_mkcont(entry);
 	default:
-		if (pagesize < CONT_PMD_SIZE && pagesize > 0 &&
+		if (pagesize < PMD_SIZE && pagesize > 0 &&
 		    IS_ALIGNED(pagesize, CONT_PTE_SIZE))
 			return pte_mkcont(entry);
>
> ./run_vmtests.sh -t "hugetlb"
>
> TAP version 13
> # -----------------------
> # running ./hugepage-mmap
> # -----------------------
> # TAP version 13
> # 1..1
> # # Returned address is 0xffffe7c00000
>
>
>
> [ 30.884630] kernel BUG at mm/page_table_check.c:86!
> [ 30.884701] Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
> [ 30.886803] Modules linked in:
> [ 30.887217] CPU: 3 UID: 0 PID: 1869 Comm: hugepage-mmap Not tainted 7.0.0-rc5+ #86 PREEMPT
> [ 30.888218] Hardware name: linux,dummy-virt (DT)
> [ 30.889413] pstate: a1400005 (NzCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
> [ 30.889901] pc : page_table_check_clear.part.0+0x128/0x1a0
> [ 30.890337] lr : page_table_check_clear.part.0+0x7c/0x1a0
> [ 30.890714] sp : ffff800084da3ad0
> [ 30.890946] x29: ffff800084da3ad0 x28: 0000000000000001 x27: 0010000000000001
> [ 30.891434] x26: 0040000000000040 x25: ffffa06bb8fb9000 x24: 00000000ffffffff
> [ 30.891932] x23: 0000000000000001 x22: 0000000000000000 x21: ffffa06bb8997810
> [ 30.892514] x20: 0000000000113e39 x19: 0000000000113e38 x18: 0000000000000000
> [ 30.893007] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
> [ 30.893500] x14: ffffa06bb7013780 x13: 0000fffff7f90fff x12: 0000000000000000
> [ 30.893990] x11: 1fffe0001a1282c1 x10: ffff0000d094160c x9 : ffffa06bb568a858
> [ 30.894479] x8 : ffff5f95c8474000 x7 : 0000000000000000 x6 : ffff00017fffc500
> [ 30.894973] x5 : ffff000191208fc0 x4 : 0000000000000000 x3 : 0000000000004000
> [ 30.895449] x2 : 0000000000000000 x1 : 00000000ffffffff x0 : ffff0000c071f1b8
> [ 30.895875] Call trace:
> [ 30.896027] page_table_check_clear.part.0+0x128/0x1a0 (P)
> [ 30.896369] page_table_check_clear+0xc8/0x138
> [ 30.896776] __page_table_check_ptes_set+0xe4/0x1e8
> [ 30.897073] __set_ptes_anysz+0x2e4/0x308
> [ 30.897327] set_huge_pte_at+0xec/0x210
> [ 30.897561] hugetlb_no_page+0x1ec/0x8e0
> [ 30.897807] hugetlb_fault+0x188/0x740
> [ 30.898036] handle_mm_fault+0x294/0x2c0
> [ 30.898283] do_page_fault+0x120/0x748
> [ 30.898539] do_translation_fault+0x68/0x90
> [ 30.898796] do_mem_abort+0x4c/0xa8
> [ 30.899011] el0_da+0x2c/0x90
> [ 30.899205] el0t_64_sync_handler+0xd0/0xe8
> [ 30.899461] el0t_64_sync+0x198/0x1a0
> [ 30.899688] Code: 91001021 b8f80022 51000441 36fffd41 (d4210000)
> [ 30.900053] ---[ end trace 0000000000000000 ]---
>
>
>
> The bug is at
>
> BUG_ON(atomic_dec_return(&ptc->file_map_count) < 0);
>
> My tree is mm-unstable, commit 3fa44141e0bb.
>
Thanks
Barry
On 08/04/26 4:21 pm, Barry Song wrote:
> On Wed, Apr 8, 2026 at 5:14 PM Dev Jain <dev.jain@arm.com> wrote:
>>
>>
>>
>> On 08/04/26 8:21 am, Barry Song (Xiaomi) wrote:
>>> This patchset accelerates ioremap, vmalloc, and vmap when the memory
>>> is physically fully or partially contiguous. Two techniques are used:
>>>
>>> 1. Avoid page table zigzag when setting PTEs/PMDs for multiple memory
>>> segments
>>> 2. Use batched mappings wherever possible in both vmalloc and ARM64
>>> layers
>>>
>>> Patches 1–2 extend ARM64 vmalloc CONT-PTE mapping to support multiple
>>> CONT-PTE regions instead of just one.
>>>
>>> Patches 3–4 extend vmap_small_pages_range_noflush() to support page
>>> shifts other than PAGE_SHIFT. This allows mapping multiple memory
>>> segments for vmalloc() without zigzagging page tables.
>>>
>>> Patches 5–8 add huge vmap support for contiguous pages. This not only
>>> improves performance but also enables PMD or CONT-PTE mapping for the
>>> vmapped area, reducing TLB pressure.
>>>
>>> Many thanks to Xueyuan Chen for his substantial testing efforts
>>> on RK3588 boards.
>>>
>>> On the RK3588 8-core ARM64 SoC, with tasks pinned to CPU2 and
>>> the performance CPUfreq policy enabled, Xueyuan’s tests report:
>>>
>>> * ioremap(1 MB): 1.2× faster
>>> * vmalloc(1 MB) mapping time (excluding allocation) with
>>> VM_ALLOW_HUGE_VMAP: 1.5× faster
>>> * vmap(): 5.6× faster when memory includes some order-8 pages,
>>> with no regression observed for order-0 pages
>>>
>>> Barry Song (Xiaomi) (8):
>>> arm64/hugetlb: Extend batching of multiple CONT_PTE in a single PTE
>>> setup
>>> arm64/vmalloc: Allow arch_vmap_pte_range_map_size to batch multiple
>>> CONT_PTE
>>> mm/vmalloc: Extend vmap_small_pages_range_noflush() to support larger
>>> page_shift sizes
>>> mm/vmalloc: Eliminate page table zigzag for huge vmalloc mappings
>>> mm/vmalloc: map contiguous pages in batches for vmap() if possible
>>> mm/vmalloc: align vm_area so vmap() can batch mappings
>>> mm/vmalloc: Coalesce same page_shift mappings in vmap to avoid pgtable
>>> zigzag
>>> mm/vmalloc: Stop scanning for compound pages after encountering small
>>> pages in vmap
>>>
>>> arch/arm64/include/asm/vmalloc.h | 6 +-
>>> arch/arm64/mm/hugetlbpage.c | 10 ++
>>> mm/vmalloc.c | 178 +++++++++++++++++++++++++------
>>> 3 files changed, 161 insertions(+), 33 deletions(-)
>>>
>>
>> On Linux VM on Apple M3, running mm-selftests:
>
> Dev, thanks for your report. Sorry for the silly typo;
> Xueyuan’s vmalloc/vmap tests don’t trigger that case yet.
>
> It should be fixed by:
>
> diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
> index bf31c11ebd3b..25b9fce1ec6a 100644
> --- a/arch/arm64/mm/hugetlbpage.c
> +++ b/arch/arm64/mm/hugetlbpage.c
> @@ -110,7 +110,7 @@ static inline int num_contig_ptes(unsigned long size, size_t *pgsize)
> contig_ptes = CONT_PTES;
> break;
> default:
> - if (size < CONT_PMD_SIZE && size > 0 &&
> + if (size < PMD_SIZE && size > 0 &&
> IS_ALIGNED(size, CONT_PTE_SIZE)) {
> contig_ptes = size >> PAGE_SHIFT;
> *pgsize = PAGE_SIZE;
> @@ -365,7 +365,7 @@ pte_t arch_make_huge_pte(pte_t entry, unsigned int shift, vm_flags_t flags)
> case CONT_PTE_SIZE:
> return pte_mkcont(entry);
> default:
> - if (pagesize < CONT_PMD_SIZE && pagesize > 0 &&
> + if (pagesize < PMD_SIZE && pagesize > 0 &&
> IS_ALIGNED(pagesize, CONT_PTE_SIZE))
> return pte_mkcont(entry);
Yeah, indeed. The problem was that a PMD-sized chunk was being treated
as 512 PTEs rather than as a single PMD. This fixes it.
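
The effect of the changed bound can be reproduced with a small
userspace model. This is a simplified sketch, not the kernel function:
it assumes a 4 KiB granule (PMD_SIZE of 2 MiB, CONT_PTE_SIZE of
64 KiB) and assumes that a PMD_SIZE request reaches the default branch,
as the description above implies. The names num_entries_model(),
entries_with_typo() and entries_with_fix() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed arm64 geometry for a 4 KiB granule. */
#define PAGE_SHIFT     12
#define PAGE_SIZE      (UINT64_C(1) << PAGE_SHIFT)
#define CONT_PTES      16
#define CONT_PTE_SIZE  (CONT_PTES * PAGE_SIZE)
#define PMD_SIZE       (UINT64_C(1) << 21)
#define CONT_PMDS      16
#define CONT_PMD_SIZE  (CONT_PMDS * PMD_SIZE)

/*
 * Simplified model of the size -> (entry count, entry size) decision.
 * 'limit' is the exclusive upper bound used by the default branch:
 * CONT_PMD_SIZE models the typo, PMD_SIZE models the fix.
 */
int num_entries_model(uint64_t size, uint64_t limit, uint64_t *pgsize)
{
	*pgsize = size;	/* default: one entry spanning the whole size */

	if (size == CONT_PMD_SIZE) {
		*pgsize = PMD_SIZE;
		return CONT_PMDS;
	}
	if (size == CONT_PTE_SIZE) {
		*pgsize = PAGE_SIZE;
		return CONT_PTES;
	}
	if (size > 0 && size < limit && (size % CONT_PTE_SIZE) == 0) {
		/* Several CONT-PTE blocks batched as individual PTEs. */
		*pgsize = PAGE_SIZE;
		return (int)(size >> PAGE_SHIFT);
	}
	return 1;
}

/* Entry count with the original bound (size < CONT_PMD_SIZE). */
int entries_with_typo(uint64_t size)
{
	uint64_t pgsize;

	return num_entries_model(size, CONT_PMD_SIZE, &pgsize);
}

/* Entry count with the corrected bound (size < PMD_SIZE). */
int entries_with_fix(uint64_t size)
{
	uint64_t pgsize;

	return num_entries_model(size, PMD_SIZE, &pgsize);
}
```

In this model, a PMD_SIZE request yields 512 page-sized entries with
the original bound and a single PMD-sized entry with the corrected
one, while a 128 KiB request yields 32 PTEs either way.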
>
>>
>> ./run_vmtests.sh -t "hugetlb"
>>
>> TAP version 13
>> # -----------------------
>> # running ./hugepage-mmap
>> # -----------------------
>> # TAP version 13
>> # 1..1
>> # # Returned address is 0xffffe7c00000
>>
>>
>>
>> [ 30.884630] kernel BUG at mm/page_table_check.c:86!
>> [ 30.884701] Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
>> [ 30.886803] Modules linked in:
>> [ 30.887217] CPU: 3 UID: 0 PID: 1869 Comm: hugepage-mmap Not tainted 7.0.0-rc5+ #86 PREEMPT
>> [ 30.888218] Hardware name: linux,dummy-virt (DT)
>> [ 30.889413] pstate: a1400005 (NzCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
>> [ 30.889901] pc : page_table_check_clear.part.0+0x128/0x1a0
>> [ 30.890337] lr : page_table_check_clear.part.0+0x7c/0x1a0
>> [ 30.890714] sp : ffff800084da3ad0
>> [ 30.890946] x29: ffff800084da3ad0 x28: 0000000000000001 x27: 0010000000000001
>> [ 30.891434] x26: 0040000000000040 x25: ffffa06bb8fb9000 x24: 00000000ffffffff
>> [ 30.891932] x23: 0000000000000001 x22: 0000000000000000 x21: ffffa06bb8997810
>> [ 30.892514] x20: 0000000000113e39 x19: 0000000000113e38 x18: 0000000000000000
>> [ 30.893007] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
>> [ 30.893500] x14: ffffa06bb7013780 x13: 0000fffff7f90fff x12: 0000000000000000
>> [ 30.893990] x11: 1fffe0001a1282c1 x10: ffff0000d094160c x9 : ffffa06bb568a858
>> [ 30.894479] x8 : ffff5f95c8474000 x7 : 0000000000000000 x6 : ffff00017fffc500
>> [ 30.894973] x5 : ffff000191208fc0 x4 : 0000000000000000 x3 : 0000000000004000
>> [ 30.895449] x2 : 0000000000000000 x1 : 00000000ffffffff x0 : ffff0000c071f1b8
>> [ 30.895875] Call trace:
>> [ 30.896027] page_table_check_clear.part.0+0x128/0x1a0 (P)
>> [ 30.896369] page_table_check_clear+0xc8/0x138
>> [ 30.896776] __page_table_check_ptes_set+0xe4/0x1e8
>> [ 30.897073] __set_ptes_anysz+0x2e4/0x308
>> [ 30.897327] set_huge_pte_at+0xec/0x210
>> [ 30.897561] hugetlb_no_page+0x1ec/0x8e0
>> [ 30.897807] hugetlb_fault+0x188/0x740
>> [ 30.898036] handle_mm_fault+0x294/0x2c0
>> [ 30.898283] do_page_fault+0x120/0x748
>> [ 30.898539] do_translation_fault+0x68/0x90
>> [ 30.898796] do_mem_abort+0x4c/0xa8
>> [ 30.899011] el0_da+0x2c/0x90
>> [ 30.899205] el0t_64_sync_handler+0xd0/0xe8
>> [ 30.899461] el0t_64_sync+0x198/0x1a0
>> [ 30.899688] Code: 91001021 b8f80022 51000441 36fffd41 (d4210000)
>> [ 30.900053] ---[ end trace 0000000000000000 ]---
>>
>>
>>
>> The bug is at
>>
>> BUG_ON(atomic_dec_return(&ptc->file_map_count) < 0);
>>
>> My tree is mm-unstable, commit 3fa44141e0bb.
>>
>
> Thanks
> Barry