From nobody Sat Feb 7 05:44:27 2026
From: Usama Arif
To: ziy@nvidia.com, Andrew Morton, David Hildenbrand, lorenzo.stoakes@oracle.com, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev, kas@kernel.org, baohua@kernel.org, dev.jain@arm.com, baolin.wang@linux.alibaba.com, npache@redhat.com, Liam.Howlett@oracle.com, ryan.roberts@arm.com, vbabka@suse.cz, lance.yang@linux.dev, linux-kernel@vger.kernel.org, kernel-team@meta.com, Usama Arif
Subject: [RFC 01/12] mm: add PUD THP ptdesc and rmap support
Date: Sun, 1 Feb 2026 16:50:18 -0800
Message-ID: <20260202005451.774496-2-usamaarif642@gmail.com>
In-Reply-To: <20260202005451.774496-1-usamaarif642@gmail.com>
References: <20260202005451.774496-1-usamaarif642@gmail.com>

For page table management, PUD THPs need to pre-deposit page tables that
will be used when the huge page is later split. When a PUD THP is
allocated, we cannot know in advance when or why it might need to be
split (COW, partial unmap, reclaim), but we need page tables ready for
that eventuality. Similar to how PMD THPs deposit a single PTE table,
PUD THPs deposit a PMD table which itself contains deposited PTE
tables - a two-level deposit.

This commit adds the deposit/withdraw infrastructure and a new
pud_huge_pmd field in ptdesc to store the deposited PMD.
The deposited PMD tables are stored as a singly-linked stack using only
page->lru.next as the link pointer. A doubly-linked list using the
standard list_head mechanism would cause memory corruption: list_del()
poisons both lru.next (offset 8) and lru.prev (offset 16), but lru.prev
overlaps with ptdesc->pmd_huge_pte at offset 16. Since deposited PMD
tables have their own deposited PTE tables stored in pmd_huge_pte,
poisoning lru.prev would corrupt the PTE table list and cause crashes
when withdrawing PTE tables during split. PMD THPs don't have this
problem because their deposited PTE tables don't have sub-deposits.
Using only lru.next avoids the overlap entirely.

For reverse mapping, PUD THPs need the same rmap support that PMD THPs
have. The page_vma_mapped_walk() function is extended to recognize and
handle PUD-mapped folios during rmap traversal. A new TTU_SPLIT_HUGE_PUD
flag tells the unmap path to split PUD THPs before proceeding, since
there is no PUD-level migration entry format - the split converts the
single PUD mapping into individual PTE mappings that can be migrated or
swapped normally.
Signed-off-by: Usama Arif
---
 include/linux/huge_mm.h  |  5 +++
 include/linux/mm.h       | 19 ++++++++
 include/linux/mm_types.h |  5 ++-
 include/linux/pgtable.h  |  8 ++++
 include/linux/rmap.h     |  7 ++-
 mm/huge_memory.c         |  8 ++++
 mm/internal.h            |  3 ++
 mm/page_vma_mapped.c     | 35 +++++++++++++++
 mm/pgtable-generic.c     | 83 ++++++++++++++++++++++++++++++++++
 mm/rmap.c                | 96 +++++++++++++++++++++++++++++++++++++---
 10 files changed, 260 insertions(+), 9 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index a4d9f964dfdea..e672e45bb9cc7 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -463,10 +463,15 @@ void __split_huge_pud(struct vm_area_struct *vma, pud_t *pud,
 		unsigned long address);
 
 #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+void split_huge_pud_locked(struct vm_area_struct *vma, pud_t *pud,
+		unsigned long address);
 int change_huge_pud(struct mmu_gather *tlb, struct vm_area_struct *vma,
 		pud_t *pudp, unsigned long addr, pgprot_t newprot,
 		unsigned long cp_flags);
 #else
+static inline void
+split_huge_pud_locked(struct vm_area_struct *vma, pud_t *pud,
+		unsigned long address) {}
 static inline int
 change_huge_pud(struct mmu_gather *tlb, struct vm_area_struct *vma,
 		pud_t *pudp, unsigned long addr, pgprot_t newprot,
diff --git a/include/linux/mm.h b/include/linux/mm.h
index ab2e7e30aef96..a15e18df0f771 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3455,6 +3455,22 @@ static inline bool pagetable_pmd_ctor(struct mm_struct *mm,
  * considered ready to switch to split PUD locks yet; there may be places
  * which need to be converted from page_table_lock.
  */
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+static inline struct page *pud_pgtable_page(pud_t *pud)
+{
+	unsigned long mask = ~(PTRS_PER_PUD * sizeof(pud_t) - 1);
+
+	return virt_to_page((void *)((unsigned long)pud & mask));
+}
+
+static inline struct ptdesc *pud_ptdesc(pud_t *pud)
+{
+	return page_ptdesc(pud_pgtable_page(pud));
+}
+
+#define pud_huge_pmd(pud) (pud_ptdesc(pud)->pud_huge_pmd)
+#endif
+
 static inline spinlock_t *pud_lockptr(struct mm_struct *mm, pud_t *pud)
 {
 	return &mm->page_table_lock;
@@ -3471,6 +3487,9 @@ static inline spinlock_t *pud_lock(struct mm_struct *mm, pud_t *pud)
 static inline void pagetable_pud_ctor(struct ptdesc *ptdesc)
 {
 	__pagetable_ctor(ptdesc);
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+	ptdesc->pud_huge_pmd = NULL;
+#endif
 }
 
 static inline void pagetable_p4d_ctor(struct ptdesc *ptdesc)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 78950eb8926dc..26a38490ae2e1 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -577,7 +577,10 @@ struct ptdesc {
 		struct list_head pt_list;
 		struct {
 			unsigned long _pt_pad_1;
-			pgtable_t pmd_huge_pte;
+			union {
+				pgtable_t pmd_huge_pte; /* For PMD tables: deposited PTE */
+				pgtable_t pud_huge_pmd; /* For PUD tables: deposited PMD list */
+			};
 		};
 	};
 	unsigned long __page_mapping;
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 2f0dd3a4ace1a..3ce733c1d71a2 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -1168,6 +1168,14 @@ extern pgtable_t pgtable_trans_huge_withdraw(struct mm_struct *mm, pmd_t *pmdp);
 #define arch_needs_pgtable_deposit() (false)
 #endif
 
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+extern void pgtable_trans_huge_pud_deposit(struct mm_struct *mm, pud_t *pudp,
+		pmd_t *pmd_table);
+extern pmd_t *pgtable_trans_huge_pud_withdraw(struct mm_struct *mm, pud_t *pudp);
+extern void pud_deposit_pte(pmd_t *pmd_table, pgtable_t pgtable);
+extern pgtable_t pud_withdraw_pte(pmd_t *pmd_table);
+#endif
+
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 /*
  * This is an implementation of pmdp_establish() that is only suitable for an
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index daa92a58585d9..08cd0a0eb8763 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -101,6 +101,7 @@ enum ttu_flags {
 					 * do a final flush if necessary */
 	TTU_RMAP_LOCKED		= 0x80,	/* do not grab rmap lock:
 					 * caller holds it */
+	TTU_SPLIT_HUGE_PUD	= 0x100, /* split huge PUD if any */
 };
 
 #ifdef CONFIG_MMU
@@ -473,6 +474,8 @@ void folio_add_anon_rmap_ptes(struct folio *, struct page *, int nr_pages,
 	folio_add_anon_rmap_ptes(folio, page, 1, vma, address, flags)
 void folio_add_anon_rmap_pmd(struct folio *, struct page *,
 		struct vm_area_struct *, unsigned long address, rmap_t flags);
+void folio_add_anon_rmap_pud(struct folio *, struct page *,
+		struct vm_area_struct *, unsigned long address, rmap_t flags);
 void folio_add_new_anon_rmap(struct folio *, struct vm_area_struct *,
 		unsigned long address, rmap_t flags);
 void folio_add_file_rmap_ptes(struct folio *, struct page *, int nr_pages,
@@ -933,6 +936,7 @@ struct page_vma_mapped_walk {
 	pgoff_t pgoff;
 	struct vm_area_struct *vma;
 	unsigned long address;
+	pud_t *pud;
 	pmd_t *pmd;
 	pte_t *pte;
 	spinlock_t *ptl;
@@ -970,7 +974,7 @@ static inline void page_vma_mapped_walk_done(struct page_vma_mapped_walk *pvmw)
 static inline void
 page_vma_mapped_walk_restart(struct page_vma_mapped_walk *pvmw)
 {
-	WARN_ON_ONCE(!pvmw->pmd && !pvmw->pte);
+	WARN_ON_ONCE(!pvmw->pud && !pvmw->pmd && !pvmw->pte);
 
 	if (likely(pvmw->ptl))
 		spin_unlock(pvmw->ptl);
@@ -978,6 +982,7 @@ page_vma_mapped_walk_restart(struct page_vma_mapped_walk *pvmw)
 		WARN_ON_ONCE(1);
 
 	pvmw->ptl = NULL;
+	pvmw->pud = NULL;
 	pvmw->pmd = NULL;
 	pvmw->pte = NULL;
 }
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 40cf59301c21a..3128b3beedb0a 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2933,6 +2933,14 @@ void __split_huge_pud(struct vm_area_struct *vma, pud_t *pud,
 	spin_unlock(ptl);
 	mmu_notifier_invalidate_range_end(&range);
 }
+
+void split_huge_pud_locked(struct vm_area_struct *vma, pud_t *pud,
+		unsigned long address)
+{
+	VM_WARN_ON_ONCE(!IS_ALIGNED(address, HPAGE_PUD_SIZE));
+	if (pud_trans_huge(*pud))
+		__split_huge_pud_locked(vma, pud, address);
+}
 #else
 void __split_huge_pud(struct vm_area_struct *vma, pud_t *pud,
 		unsigned long address)
diff --git a/mm/internal.h b/mm/internal.h
index 9ee336aa03656..21d5c00f638dc 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -545,6 +545,9 @@ int user_proactive_reclaim(char *buf,
  * in mm/rmap.c:
  */
 pmd_t *mm_find_pmd(struct mm_struct *mm, unsigned long address);
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+pud_t *mm_find_pud(struct mm_struct *mm, unsigned long address);
+#endif
 
 /*
  * in mm/page_alloc.c
diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index b38a1d00c971b..d31eafba38041 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -146,6 +146,18 @@ static bool check_pmd(unsigned long pfn, struct page_vma_mapped_walk *pvmw)
 	return true;
 }
 
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+/* Returns true if the two ranges overlap.  Careful to not overflow. */
+static bool check_pud(unsigned long pfn, struct page_vma_mapped_walk *pvmw)
+{
+	if ((pfn + HPAGE_PUD_NR - 1) < pvmw->pfn)
+		return false;
+	if (pfn > pvmw->pfn + pvmw->nr_pages - 1)
+		return false;
+	return true;
+}
+#endif
+
 static void step_forward(struct page_vma_mapped_walk *pvmw, unsigned long size)
 {
 	pvmw->address = (pvmw->address + size) & ~(size - 1);
@@ -188,6 +200,10 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 	pud_t *pud;
 	pmd_t pmde;
 
+	/* The only possible pud mapping has been handled on last iteration */
+	if (pvmw->pud && !pvmw->pmd)
+		return not_found(pvmw);
+
 	/* The only possible pmd mapping has been handled on last iteration */
 	if (pvmw->pmd && !pvmw->pte)
 		return not_found(pvmw);
@@ -234,6 +250,25 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 			continue;
 		}
 
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+		/* Check for PUD-mapped THP */
+		if (pud_trans_huge(*pud)) {
+			pvmw->pud = pud;
+			pvmw->ptl = pud_lock(mm, pud);
+			if (likely(pud_trans_huge(*pud))) {
+				if (pvmw->flags & PVMW_MIGRATION)
+					return not_found(pvmw);
+				if (!check_pud(pud_pfn(*pud), pvmw))
+					return not_found(pvmw);
+				return true;
+			}
+			/* PUD was split under us, retry at PMD level */
+			spin_unlock(pvmw->ptl);
+			pvmw->ptl = NULL;
+			pvmw->pud = NULL;
+		}
+#endif
+
 		pvmw->pmd = pmd_offset(pud, pvmw->address);
 		/*
 		 * Make sure the pmd value isn't cached in a register by the
diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c
index d3aec7a9926ad..2047558ddcd79 100644
--- a/mm/pgtable-generic.c
+++ b/mm/pgtable-generic.c
@@ -195,6 +195,89 @@ pgtable_t pgtable_trans_huge_withdraw(struct mm_struct *mm, pmd_t *pmdp)
 }
 #endif
 
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+/*
+ * Deposit page tables for PUD THP.
+ * Called with PUD lock held.  Stores PMD tables in a singly-linked stack
+ * via pud_huge_pmd, using only pmd_page->lru.next as the link pointer.
+ *
+ * IMPORTANT: We use only lru.next (offset 8) for linking, NOT the full
+ * list_head.  This is because lru.prev (offset 16) overlaps with
+ * ptdesc->pmd_huge_pte, which stores the PMD table's deposited PTE tables.
+ * Using list_del() would corrupt pmd_huge_pte with LIST_POISON2.
+ *
+ * PTE tables should be deposited into the PMD using pud_deposit_pte().
+ */
+void pgtable_trans_huge_pud_deposit(struct mm_struct *mm, pud_t *pudp,
+		pmd_t *pmd_table)
+{
+	pgtable_t pmd_page = virt_to_page(pmd_table);
+
+	assert_spin_locked(pud_lockptr(mm, pudp));
+
+	/* Push onto stack using only lru.next as the link */
+	pmd_page->lru.next = (struct list_head *)pud_huge_pmd(pudp);
+	pud_huge_pmd(pudp) = pmd_page;
+}
+
+/*
+ * Withdraw the deposited PMD table for PUD THP split or zap.
+ * Called with PUD lock held.
+ * Returns NULL if no more PMD tables are deposited.
+ */
+pmd_t *pgtable_trans_huge_pud_withdraw(struct mm_struct *mm, pud_t *pudp)
+{
+	pgtable_t pmd_page;
+
+	assert_spin_locked(pud_lockptr(mm, pudp));
+
+	pmd_page = pud_huge_pmd(pudp);
+	if (!pmd_page)
+		return NULL;
+
+	/* Pop from stack - lru.next points to next PMD page (or NULL) */
+	pud_huge_pmd(pudp) = (pgtable_t)pmd_page->lru.next;
+
+	return page_address(pmd_page);
+}
+
+/*
+ * Deposit a PTE table into a standalone PMD table (not yet in page table hierarchy).
+ * Used for PUD THP pre-deposit.  The PMD table's pmd_huge_pte stores a linked list.
+ * No lock assertion since the PMD isn't visible yet.
+ */
+void pud_deposit_pte(pmd_t *pmd_table, pgtable_t pgtable)
+{
+	struct ptdesc *ptdesc = virt_to_ptdesc(pmd_table);
+
+	/* FIFO - add to front of list */
+	if (!ptdesc->pmd_huge_pte)
+		INIT_LIST_HEAD(&pgtable->lru);
+	else
+		list_add(&pgtable->lru, &ptdesc->pmd_huge_pte->lru);
+	ptdesc->pmd_huge_pte = pgtable;
+}
+
+/*
+ * Withdraw a PTE table from a standalone PMD table.
+ * Returns NULL if no more PTE tables are deposited.
+ */
+pgtable_t pud_withdraw_pte(pmd_t *pmd_table)
+{
+	struct ptdesc *ptdesc = virt_to_ptdesc(pmd_table);
+	pgtable_t pgtable;

+	pgtable = ptdesc->pmd_huge_pte;
+	if (!pgtable)
+		return NULL;
+	ptdesc->pmd_huge_pte = list_first_entry_or_null(&pgtable->lru,
+							struct page, lru);
+	if (ptdesc->pmd_huge_pte)
+		list_del(&pgtable->lru);
+	return pgtable;
+}
+#endif /* CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
+
 #ifndef __HAVE_ARCH_PMDP_INVALIDATE
 pmd_t pmdp_invalidate(struct vm_area_struct *vma, unsigned long address,
 		pmd_t *pmdp)
diff --git a/mm/rmap.c b/mm/rmap.c
index 7b9879ef442d9..69acabd763da4 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -811,6 +811,32 @@ pmd_t *mm_find_pmd(struct mm_struct *mm, unsigned long address)
 	return pmd;
 }
 
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+/*
+ * Returns the actual pud_t* where we expect 'address' to be mapped from, or
+ * NULL if it doesn't exist.  No guarantees / checks on what the pud_t*
+ * represents.
+ */
+pud_t *mm_find_pud(struct mm_struct *mm, unsigned long address)
+{
+	pgd_t *pgd;
+	p4d_t *p4d;
+	pud_t *pud = NULL;
+
+	pgd = pgd_offset(mm, address);
+	if (!pgd_present(*pgd))
+		goto out;
+
+	p4d = p4d_offset(pgd, address);
+	if (!p4d_present(*p4d))
+		goto out;
+
+	pud = pud_offset(p4d, address);
+out:
+	return pud;
+}
+#endif
+
 struct folio_referenced_arg {
 	int mapcount;
 	int referenced;
@@ -1415,11 +1441,7 @@ static __always_inline void __folio_add_anon_rmap(struct folio *folio,
 		SetPageAnonExclusive(page);
 		break;
 	case PGTABLE_LEVEL_PUD:
-		/*
-		 * Keep the compiler happy, we don't support anonymous
-		 * PUD mappings.
-		 */
-		WARN_ON_ONCE(1);
+		SetPageAnonExclusive(page);
 		break;
 	default:
 		BUILD_BUG();
@@ -1503,6 +1525,31 @@ void folio_add_anon_rmap_pmd(struct folio *folio, struct page *page,
 #endif
 }
 
+/**
+ * folio_add_anon_rmap_pud - add a PUD mapping to a page range of an anon folio
+ * @folio:	The folio to add the mapping to
+ * @page:	The first page to add
+ * @vma:	The vm area in which the mapping is added
+ * @address:	The user virtual address of the first page to map
+ * @flags:	The rmap flags
+ *
+ * The page range of folio is defined by [first_page, first_page + HPAGE_PUD_NR)
+ *
+ * The caller needs to hold the page table lock, and the page must be locked in
+ * the anon_vma case: to serialize mapping,index checking after setting.
+ */
+void folio_add_anon_rmap_pud(struct folio *folio, struct page *page,
+		struct vm_area_struct *vma, unsigned long address, rmap_t flags)
+{
+#if defined(CONFIG_TRANSPARENT_HUGEPAGE) && \
+    defined(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD)
+	__folio_add_anon_rmap(folio, page, HPAGE_PUD_NR, vma, address, flags,
+			      PGTABLE_LEVEL_PUD);
+#else
+	WARN_ON_ONCE(true);
+#endif
+}
+
 /**
  * folio_add_new_anon_rmap - Add mapping to a new anonymous folio.
  * @folio: The folio to add the mapping to.
@@ -1934,6 +1981,20 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 	}
 
 	if (!pvmw.pte) {
+		/*
+		 * Check for PUD-mapped THP first.
+		 * If we have a PUD mapping and TTU_SPLIT_HUGE_PUD is set,
+		 * split the PUD to PMD level and restart the walk.
+		 */
+		if (pvmw.pud && pud_trans_huge(*pvmw.pud)) {
+			if (flags & TTU_SPLIT_HUGE_PUD) {
+				split_huge_pud_locked(vma, pvmw.pud, pvmw.address);
+				flags &= ~TTU_SPLIT_HUGE_PUD;
+				page_vma_mapped_walk_restart(&pvmw);
+				continue;
+			}
+		}
+
 		if (folio_test_anon(folio) && !folio_test_swapbacked(folio)) {
 			if (unmap_huge_pmd_locked(vma, pvmw.address, pvmw.pmd, folio))
 				goto walk_done;
@@ -2325,6 +2386,27 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 	mmu_notifier_invalidate_range_start(&range);
 
 	while (page_vma_mapped_walk(&pvmw)) {
+		/* Handle PUD-mapped THP first */
+		if (!pvmw.pte && !pvmw.pmd) {
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+			/*
+			 * PUD-mapped THP: skip migration to preserve the huge
+			 * page.  Splitting would defeat the purpose of PUD THPs.
+			 * Return false to indicate migration failure, which
+			 * will cause alloc_contig_range() to try a different
+			 * memory region.
+			 */
+			if (pvmw.pud && pud_trans_huge(*pvmw.pud)) {
+				page_vma_mapped_walk_done(&pvmw);
+				ret = false;
+				break;
+			}
+#endif
+			/* Unexpected state: !pte && !pmd but not a PUD THP */
+			page_vma_mapped_walk_done(&pvmw);
+			break;
+		}
+
 		/* PMD-mapped THP migration entry */
 		if (!pvmw.pte) {
 			__maybe_unused unsigned long pfn;
@@ -2607,10 +2689,10 @@ void try_to_migrate(struct folio *folio, enum ttu_flags flags)
 
 	/*
 	 * Migration always ignores mlock and only supports TTU_RMAP_LOCKED and
-	 * TTU_SPLIT_HUGE_PMD, TTU_SYNC, and TTU_BATCH_FLUSH flags.
+	 * TTU_SPLIT_HUGE_PMD, TTU_SPLIT_HUGE_PUD, TTU_SYNC, and TTU_BATCH_FLUSH flags.
 	 */
 	if (WARN_ON_ONCE(flags & ~(TTU_RMAP_LOCKED | TTU_SPLIT_HUGE_PMD |
-				   TTU_SYNC | TTU_BATCH_FLUSH)))
+				   TTU_SPLIT_HUGE_PUD | TTU_SYNC | TTU_BATCH_FLUSH)))
 		return;
 
 	if (folio_is_zone_device(folio) &&
-- 
2.47.3
From: Usama Arif
To: ziy@nvidia.com, Andrew Morton, David Hildenbrand, lorenzo.stoakes@oracle.com, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev, kas@kernel.org, baohua@kernel.org, dev.jain@arm.com, baolin.wang@linux.alibaba.com, npache@redhat.com, Liam.Howlett@oracle.com, ryan.roberts@arm.com, vbabka@suse.cz, lance.yang@linux.dev, linux-kernel@vger.kernel.org, kernel-team@meta.com, Usama Arif
Subject: [RFC 02/12] mm/thp: add mTHP stats infrastructure for PUD THP
Date: Sun, 1 Feb 2026 16:50:19 -0800
Message-ID: <20260202005451.774496-3-usamaarif642@gmail.com>
In-Reply-To: <20260202005451.774496-1-usamaarif642@gmail.com>
References: <20260202005451.774496-1-usamaarif642@gmail.com>

Extend the mTHP (multi-size THP) statistics infrastructure to support
PUD-sized transparent huge pages.

The mTHP framework tracks statistics for each supported THP size through
per-order counters exposed via sysfs. To add PUD THP support, PUD_ORDER
must be included in the set of tracked orders.
With this change, PUD THP events (allocations, faults, splits, swaps)
are tracked and exposed through the existing sysfs interface at
/sys/kernel/mm/transparent_hugepage/hugepages-1048576kB/stats/. This
provides visibility into PUD THP behavior for debugging and performance
analysis.

Signed-off-by: Usama Arif
---
 include/linux/huge_mm.h | 42 +++++++++++++++++++++++++++++++++++++----
 mm/huge_memory.c        |  3 ++-
 2 files changed, 40 insertions(+), 5 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index e672e45bb9cc7..5509ba8555b6e 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -76,7 +76,13 @@ extern struct kobj_attribute thpsize_shmem_enabled_attr;
  * and including PMD_ORDER, except order-0 (which is not "huge") and order-1
  * (which is a limitation of the THP implementation).
  */
-#define THP_ORDERS_ALL_ANON	((BIT(PMD_ORDER + 1) - 1) & ~(BIT(0) | BIT(1)))
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+#define THP_ORDERS_ALL_ANON_PUD	BIT(PUD_ORDER)
+#else
+#define THP_ORDERS_ALL_ANON_PUD	0
+#endif
+#define THP_ORDERS_ALL_ANON	(((BIT(PMD_ORDER + 1) - 1) & ~(BIT(0) | BIT(1))) | \
+				 THP_ORDERS_ALL_ANON_PUD)
 
 /*
  * Mask of all large folio orders supported for file THP.  Folios in a DAX
@@ -146,18 +152,46 @@ enum mthp_stat_item {
 };
 
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && defined(CONFIG_SYSFS)
+
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+#define MTHP_STAT_COUNT		(PMD_ORDER + 2)
+#define MTHP_STAT_PUD_INDEX	(PMD_ORDER + 1)	/* PUD uses last index */
+#else
+#define MTHP_STAT_COUNT		(PMD_ORDER + 1)
+#endif
+
 struct mthp_stat {
-	unsigned long stats[ilog2(MAX_PTRS_PER_PTE) + 1][__MTHP_STAT_COUNT];
+	unsigned long stats[MTHP_STAT_COUNT][__MTHP_STAT_COUNT];
 };
 
 DECLARE_PER_CPU(struct mthp_stat, mthp_stats);
 
+static inline int mthp_stat_order_to_index(int order)
+{
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+	if (order == PUD_ORDER)
+		return MTHP_STAT_PUD_INDEX;
+#endif
+	return order;
+}
+
 static inline void mod_mthp_stat(int order, enum mthp_stat_item item, int delta)
 {
-	if (order <= 0 || order > PMD_ORDER)
+	int index;
+
+	if (order <= 0)
+		return;
+
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+	if (order != PUD_ORDER && order > PMD_ORDER)
+		return;
+#else
+	if (order > PMD_ORDER)
+		return;
+#endif
 
-	this_cpu_add(mthp_stats.stats[order][item], delta);
+	index = mthp_stat_order_to_index(order);
+	this_cpu_add(mthp_stats.stats[index][item], delta);
 }
 
 static inline void count_mthp_stat(int order, enum mthp_stat_item item)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 3128b3beedb0a..d033624d7e1f2 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -598,11 +598,12 @@ static unsigned long sum_mthp_stat(int order, enum mthp_stat_item item)
 {
 	unsigned long sum = 0;
 	int cpu;
+	int index = mthp_stat_order_to_index(order);
 
 	for_each_possible_cpu(cpu) {
 		struct mthp_stat *this = &per_cpu(mthp_stats, cpu);
 
-		sum += this->stats[order][item];
+		sum += this->stats[index][item];
 	}
 
 	return sum;
-- 
2.47.3
ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 55B031D7E42 for ; Mon, 2 Feb 2026 00:55:32 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.52 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1769993734; cv=none; b=dBT5PLVgHw7/ERzGEqKQ+ni12q0562uDvgEusut/GOsKRCvdVP911N9ncEwbvovofD8yga2AVzViHbTQxMyeSUtc9+VjuQD1nk3CINhGX3XhSetAXY/PEQeUr+Rmg/H8k4UgQOTaf1aRnHxwwLTIb0thgSwATc2z1rnZZtqoGFI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1769993734; c=relaxed/simple; bh=UACVXmZBEo1X51u/TXlj9W3+QUpZUFZby5VTXcET5M0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=LrtemYdKzo3XyRGk1g2sIvzWiy6A4VP6dJh2krauqbsgkwJZfc3qXz+Av7oaqXVujXO89h/LHr/knWL/LUNRU/sSkoPx3uts70QhIZPkJRT+QCdvO4SYpcPaI5T19VbfLPveTJ20TqKLiV0onRJTz1Ll+80U0SK9DDaY/6knwTc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=kHnBq2ua; arc=none smtp.client-ip=209.85.160.52 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="kHnBq2ua" Received: by mail-oa1-f52.google.com with SMTP id 586e51a60fabf-40959ad00f2so2806872fac.3 for ; Sun, 01 Feb 2026 16:55:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1769993731; x=1770598531; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=ebhAyHaFyMRLprG6lnpimbBHIluq1sCYCiZGgVEBcbg=; 
From: Usama Arif
To: ziy@nvidia.com, Andrew Morton, David Hildenbrand,
    lorenzo.stoakes@oracle.com, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev,
    kas@kernel.org, baohua@kernel.org, dev.jain@arm.com,
    baolin.wang@linux.alibaba.com, npache@redhat.com,
    Liam.Howlett@oracle.com, ryan.roberts@arm.com, vbabka@suse.cz,
    lance.yang@linux.dev, linux-kernel@vger.kernel.org,
    kernel-team@meta.com, Usama Arif
Subject: [RFC 03/12] mm: thp: add PUD THP allocation and fault handling
Date: Sun, 1 Feb 2026 16:50:20 -0800
Message-ID: <20260202005451.774496-4-usamaarif642@gmail.com>
In-Reply-To: <20260202005451.774496-1-usamaarif642@gmail.com>
References: <20260202005451.774496-1-usamaarif642@gmail.com>

Add the page fault handling path for anonymous PUD THPs, following the
same design as the existing PMD THP fault handlers.

When a process faults on memory in an anonymous VMA that is PUD-aligned
and large enough, the fault handler checks whether PUD THP is enabled
and attempts to allocate a 1GB folio. The allocation uses
folio_alloc_gigantic(). If the allocation succeeds, the folio is mapped
at the faulting PUD entry.

Before installing the PUD mapping, page tables are pre-deposited for
future use. A PUD THP will eventually need to be split - whether due to
copy-on-write after fork, partial munmap, mprotect on a subregion, or
memory reclaim. At split time, we need 512 PTE tables (one for each PMD
entry) plus the PMD table itself. Allocating 513 page tables during the
split could fail, leaving the system unable to proceed.
By depositing them at fault time, when memory pressure is typically
lower, we guarantee that the split will always succeed.

The write-protect fault handler triggers when a process tries to write
to a PUD THP that is mapped read-only (typically after fork). Rather
than implementing PUD-level COW, which would require copying 1GB of
data, the handler splits the PUD down to PTE level and retries the
fault. The retry then handles COW at PTE level, copying only the single
4KB page being written.

Signed-off-by: Usama Arif
---
 include/linux/huge_mm.h |   2 +
 mm/huge_memory.c        | 260 ++++++++++++++++++++++++++++++++++++++--
 mm/memory.c             |   8 +-
 3 files changed, 258 insertions(+), 12 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 5509ba8555b6e..a292035c0270f 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -8,6 +8,7 @@
 #include 
 
 vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf);
+vm_fault_t do_huge_pud_anonymous_page(struct vm_fault *vmf);
 int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 		  pmd_t *dst_pmd, pmd_t *src_pmd, unsigned long addr,
 		  struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma);
@@ -25,6 +26,7 @@ static inline void huge_pud_set_accessed(struct vm_fault *vmf, pud_t orig_pud)
 #endif
 
 vm_fault_t do_huge_pmd_wp_page(struct vm_fault *vmf);
+vm_fault_t do_huge_pud_wp_page(struct vm_fault *vmf);
 bool madvise_free_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 			   pmd_t *pmd, unsigned long addr, unsigned long next);
 int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma, pmd_t *pmd,
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index d033624d7e1f2..7613caf1e7c30 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1294,6 +1294,70 @@ static struct folio *vma_alloc_anon_folio_pmd(struct vm_area_struct *vma,
 	return folio;
 }
 
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+static struct folio *vma_alloc_anon_folio_pud(struct vm_area_struct *vma,
+					      unsigned long addr)
+{
+	gfp_t gfp = vma_thp_gfp_mask(vma);
+	const int order = HPAGE_PUD_ORDER;
+	struct folio *folio = NULL;
+	/*
+	 * Contiguous allocation via alloc_contig_range() migrates existing
+	 * pages out of the target range. __GFP_NOMEMALLOC would allow using
+	 * memory reserves for migration destination pages, but THP is an
+	 * optional performance optimization and should not deplete reserves
+	 * that may be needed for critical allocations. Remove it.
+	 * alloc_contig_range_noprof (__alloc_contig_verify_gfp_mask) will
+	 * cause this to fail without it.
+	 */
+	gfp_t contig_gfp = gfp & ~__GFP_NOMEMALLOC;
+
+	folio = folio_alloc_gigantic(order, contig_gfp, numa_node_id(), NULL);
+
+	if (unlikely(!folio)) {
+		count_vm_event(THP_FAULT_FALLBACK);
+		count_mthp_stat(order, MTHP_STAT_ANON_FAULT_FALLBACK);
+		return NULL;
+	}
+
+	VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
+	if (mem_cgroup_charge(folio, vma->vm_mm, gfp)) {
+		folio_put(folio);
+		count_vm_event(THP_FAULT_FALLBACK);
+		count_vm_event(THP_FAULT_FALLBACK_CHARGE);
+		count_mthp_stat(order, MTHP_STAT_ANON_FAULT_FALLBACK);
+		count_mthp_stat(order, MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE);
+		return NULL;
+	}
+	folio_throttle_swaprate(folio, gfp);
+
+	/*
+	 * When a folio is not zeroed during allocation (__GFP_ZERO not used)
+	 * or user folios require special handling, folio_zero_user() is used to
+	 * make sure that the page corresponding to the faulting address will be
+	 * hot in the cache after zeroing.
+	 */
+	if (user_alloc_needs_zeroing())
+		folio_zero_user(folio, addr);
+	/*
+	 * The memory barrier inside __folio_mark_uptodate makes sure that
+	 * folio_zero_user writes become visible before the set_pud_at()
+	 * write.
+	 */
+	__folio_mark_uptodate(folio);
+
+	/*
+	 * Set the large_rmappable flag so that the folio can be properly
+	 * removed from the deferred_split list when freed.
+	 * folio_alloc_gigantic() doesn't set this flag (unlike __folio_alloc),
+	 * so we must set it explicitly.
+	 */
+	folio_set_large_rmappable(folio);
+
+	return folio;
+}
+#endif /* CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
+
 void map_anon_folio_pmd_nopf(struct folio *folio, pmd_t *pmd,
 			     struct vm_area_struct *vma, unsigned long haddr)
 {
@@ -1318,6 +1382,40 @@ static void map_anon_folio_pmd_pf(struct folio *folio, pmd_t *pmd,
 	count_memcg_event_mm(vma->vm_mm, THP_FAULT_ALLOC);
 }
 
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+static pud_t maybe_pud_mkwrite(pud_t pud, struct vm_area_struct *vma)
+{
+	if (likely(vma->vm_flags & VM_WRITE))
+		pud = pud_mkwrite(pud);
+	return pud;
+}
+
+static void map_anon_folio_pud_nopf(struct folio *folio, pud_t *pud,
+		struct vm_area_struct *vma, unsigned long haddr)
+{
+	pud_t entry;
+
+	entry = folio_mk_pud(folio, vma->vm_page_prot);
+	entry = maybe_pud_mkwrite(pud_mkdirty(entry), vma);
+	folio_add_new_anon_rmap(folio, vma, haddr, RMAP_EXCLUSIVE);
+	folio_add_lru_vma(folio, vma);
+	set_pud_at(vma->vm_mm, haddr, pud, entry);
+	update_mmu_cache_pud(vma, haddr, pud);
+	deferred_split_folio(folio, false);
+}
+
+static void map_anon_folio_pud_pf(struct folio *folio, pud_t *pud,
+		struct vm_area_struct *vma, unsigned long haddr)
+{
+	map_anon_folio_pud_nopf(folio, pud, vma, haddr);
+	add_mm_counter(vma->vm_mm, MM_ANONPAGES, HPAGE_PUD_NR);
+	count_vm_event(THP_FAULT_ALLOC);
+	count_mthp_stat(HPAGE_PUD_ORDER, MTHP_STAT_ANON_FAULT_ALLOC);
+	count_memcg_event_mm(vma->vm_mm, THP_FAULT_ALLOC);
+}
+#endif /* CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
+
 static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf)
 {
 	unsigned long haddr = vmf->address & HPAGE_PMD_MASK;
@@ -1513,6 +1611,161 @@ vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf)
 	return __do_huge_pmd_anonymous_page(vmf);
 }
 
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+/* Number of PTE tables needed for PUD THP split: 512 */
+#define NR_PTE_TABLES_FOR_PUD	(HPAGE_PUD_NR / HPAGE_PMD_NR)
+
+/*
+ * Allocate page tables for PUD THP pre-deposit.
+ */
+static bool alloc_pud_predeposit_ptables(struct mm_struct *mm,
+					 unsigned long haddr,
+					 pmd_t **pmd_table_out,
+					 int *nr_pte_deposited)
+{
+	pmd_t *pmd_table;
+	pgtable_t pte_table;
+	struct ptdesc *pmd_ptdesc;
+	int i;
+
+	*pmd_table_out = NULL;
+	*nr_pte_deposited = 0;
+
+	pmd_table = pmd_alloc_one(mm, haddr);
+	if (!pmd_table)
+		return false;
+
+	/* Initialize the pmd_huge_pte field for PTE table storage */
+	pmd_ptdesc = virt_to_ptdesc(pmd_table);
+	pmd_ptdesc->pmd_huge_pte = NULL;
+
+	/* Allocate and deposit 512 PTE tables into the PMD table */
+	for (i = 0; i < NR_PTE_TABLES_FOR_PUD; i++) {
+		pte_table = pte_alloc_one(mm);
+		if (!pte_table)
+			goto fail;
+		pud_deposit_pte(pmd_table, pte_table);
+		(*nr_pte_deposited)++;
+	}
+
+	*pmd_table_out = pmd_table;
+	return true;
+
+fail:
+	/* Free any PTE tables we deposited */
+	while ((pte_table = pud_withdraw_pte(pmd_table)) != NULL)
+		pte_free(mm, pte_table);
+	pmd_free(mm, pmd_table);
+	return false;
+}
+
+/*
+ * Free pre-allocated page tables if the PUD THP fault fails.
+ */
+static void free_pud_predeposit_ptables(struct mm_struct *mm,
+					pmd_t *pmd_table)
+{
+	pgtable_t pte_table;
+
+	if (!pmd_table)
+		return;
+
+	while ((pte_table = pud_withdraw_pte(pmd_table)) != NULL)
+		pte_free(mm, pte_table);
+	pmd_free(mm, pmd_table);
+}
+
+vm_fault_t do_huge_pud_anonymous_page(struct vm_fault *vmf)
+{
+	struct vm_area_struct *vma = vmf->vma;
+	unsigned long haddr = vmf->address & HPAGE_PUD_MASK;
+	struct folio *folio;
+	pmd_t *pmd_table = NULL;
+	int nr_pte_deposited = 0;
+	vm_fault_t ret = 0;
+	int i;
+
+	/* Check VMA bounds and alignment */
+	if (!thp_vma_suitable_order(vma, haddr, PUD_ORDER))
+		return VM_FAULT_FALLBACK;
+
+	ret = vmf_anon_prepare(vmf);
+	if (ret)
+		return ret;
+
+	folio = vma_alloc_anon_folio_pud(vma, vmf->address);
+	if (unlikely(!folio))
+		return VM_FAULT_FALLBACK;
+
+	/*
+	 * Pre-allocate page tables for future PUD split.
+	 * We need 1 PMD table and 512 PTE tables.
+	 */
+	if (!alloc_pud_predeposit_ptables(vma->vm_mm, haddr,
+					  &pmd_table, &nr_pte_deposited)) {
+		folio_put(folio);
+		return VM_FAULT_FALLBACK;
+	}
+
+	vmf->ptl = pud_lock(vma->vm_mm, vmf->pud);
+	if (unlikely(!pud_none(*vmf->pud)))
+		goto release;
+
+	ret = check_stable_address_space(vma->vm_mm);
+	if (ret)
+		goto release;
+
+	/* Deliver the page fault to userland */
+	if (userfaultfd_missing(vma)) {
+		spin_unlock(vmf->ptl);
+		folio_put(folio);
+		free_pud_predeposit_ptables(vma->vm_mm, pmd_table);
+		ret = handle_userfault(vmf, VM_UFFD_MISSING);
+		VM_BUG_ON(ret & VM_FAULT_FALLBACK);
+		return ret;
+	}
+
+	/* Deposit page tables for future PUD split */
+	pgtable_trans_huge_pud_deposit(vma->vm_mm, vmf->pud, pmd_table);
+	map_anon_folio_pud_pf(folio, vmf->pud, vma, haddr);
+	mm_inc_nr_pmds(vma->vm_mm);
+	for (i = 0; i < nr_pte_deposited; i++)
+		mm_inc_nr_ptes(vma->vm_mm);
+	spin_unlock(vmf->ptl);
+
+	return 0;
+release:
+	spin_unlock(vmf->ptl);
+	folio_put(folio);
+	free_pud_predeposit_ptables(vma->vm_mm, pmd_table);
+	return ret;
+}
+#else
+vm_fault_t do_huge_pud_anonymous_page(struct vm_fault *vmf)
+{
+	return VM_FAULT_FALLBACK;
+}
+#endif /* CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
+
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+vm_fault_t do_huge_pud_wp_page(struct vm_fault *vmf)
+{
+	struct vm_area_struct *vma = vmf->vma;
+
+	/*
+	 * For now, split PUD to PTE level on write fault.
+	 * This is the simplest approach for COW handling.
+	 */
+	__split_huge_pud(vma, vmf->pud, vmf->address);
+	return VM_FAULT_FALLBACK;
+}
+#else
+vm_fault_t do_huge_pud_wp_page(struct vm_fault *vmf)
+{
+	return VM_FAULT_FALLBACK;
+}
+#endif /* CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
+
 struct folio_or_pfn {
 	union {
 		struct folio *folio;
@@ -1646,13 +1899,6 @@ vm_fault_t vmf_insert_folio_pmd(struct vm_fault *vmf, struct folio *folio,
 EXPORT_SYMBOL_GPL(vmf_insert_folio_pmd);
 
 #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
-static pud_t maybe_pud_mkwrite(pud_t pud, struct vm_area_struct *vma)
-{
-	if (likely(vma->vm_flags & VM_WRITE))
-		pud = pud_mkwrite(pud);
-	return pud;
-}
-
 static vm_fault_t insert_pud(struct vm_area_struct *vma, unsigned long addr,
 		pud_t *pud, struct folio_or_pfn fop, pgprot_t prot, bool write)
 {
diff --git a/mm/memory.c b/mm/memory.c
index 87cf4e1a6f866..e5f86c1d2aded 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -6142,9 +6142,9 @@ static vm_fault_t create_huge_pud(struct vm_fault *vmf)
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) &&			\
 	defined(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD)
 	struct vm_area_struct *vma = vmf->vma;
-	/* No support for anonymous transparent PUD pages yet */
+
 	if (vma_is_anonymous(vma))
-		return VM_FAULT_FALLBACK;
+		return do_huge_pud_anonymous_page(vmf);
 	if (vma->vm_ops->huge_fault)
 		return vma->vm_ops->huge_fault(vmf, PUD_ORDER);
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
@@ -6158,9 +6158,8 @@ static vm_fault_t wp_huge_pud(struct vm_fault *vmf, pud_t orig_pud)
 	struct vm_area_struct *vma = vmf->vma;
 	vm_fault_t ret;
 
-	/* No support for anonymous transparent PUD pages yet */
 	if (vma_is_anonymous(vma))
-		goto split;
+		return do_huge_pud_wp_page(vmf);
 	if (vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) {
 		if (vma->vm_ops->huge_fault) {
 			ret = vma->vm_ops->huge_fault(vmf, PUD_ORDER);
@@ -6168,7 +6167,6 @@ static vm_fault_t wp_huge_pud(struct vm_fault *vmf, pud_t orig_pud)
 			return ret;
 		}
 	}
-split:
 	/* COW or write-notify not handled on PUD level: split pud.*/
 	__split_huge_pud(vma, vmf->pud, vmf->address);
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE && CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
-- 
2.47.3

From nobody Sat Feb 7 05:44:27 2026
From: Usama Arif
To: ziy@nvidia.com, Andrew Morton, David Hildenbrand,
    lorenzo.stoakes@oracle.com, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev,
    kas@kernel.org, baohua@kernel.org, dev.jain@arm.com,
    baolin.wang@linux.alibaba.com, npache@redhat.com,
    Liam.Howlett@oracle.com, ryan.roberts@arm.com, vbabka@suse.cz,
    lance.yang@linux.dev, linux-kernel@vger.kernel.org,
    kernel-team@meta.com, Usama Arif
Subject: [RFC 04/12] mm: thp: implement PUD THP split to PTE level
Date: Sun, 1 Feb 2026 16:50:21 -0800
Message-ID: <20260202005451.774496-5-usamaarif642@gmail.com>
In-Reply-To: <20260202005451.774496-1-usamaarif642@gmail.com>
References: <20260202005451.774496-1-usamaarif642@gmail.com>

Implement the split operation that converts a PUD THP mapping into
individual PTE mappings.

A PUD THP maps 1GB of memory with a single page table entry. When the
mapping needs to be broken - for COW, partial unmap, permission changes,
or reclaim - it must be split into smaller mappings. Unlike PMD THPs,
which split into 512 PTEs at a single level, PUD THPs require a
two-level split: the single PUD entry becomes 512 PMD entries, each
pointing to a PTE table containing 512 PTEs, for a total of 262144 page
table entries.
The split uses page tables that were pre-deposited when the PUD THP was
first allocated. This guarantees the split cannot fail due to memory
allocation failure, which is critical since splits often happen under
memory pressure during reclaim. The deposited PMD table is installed in
the PUD entry, and each PMD slot receives one of the 512 deposited PTE
tables. Each PTE is populated to map one 4KB page of the original 1GB
folio. Page flags from the original PUD entry (dirty, accessed,
writable, soft-dirty) are propagated to each PTE so that no information
is lost. The rmap is updated to remove the single PUD-level mapping
entry and add 262144 PTE-level mapping entries.

The split goes directly to PTE level rather than stopping at PMD level,
because the kernel's rmap infrastructure assumes that PMD-level mappings
are for PMD-sized folios. If we mapped a PUD-sized folio at PMD level
(512 PMD entries for one folio), the rmap accounting would break: it
would see 512 "large" mappings for a folio that should have far more.
Going straight to PTE level avoids this problem entirely.

Signed-off-by: Usama Arif
---
 mm/huge_memory.c | 181 ++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 173 insertions(+), 8 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 7613caf1e7c30..39b8212b5abd4 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3129,12 +3129,82 @@ int zap_huge_pud(struct mmu_gather *tlb, struct vm_area_struct *vma,
 	return 1;
 }
 
+/*
+ * Structure to hold page tables for PUD split.
+ * Tables are withdrawn from the pre-deposit made at fault time.
+ */
+struct pud_split_ptables {
+	pmd_t *pmd_table;
+	pgtable_t *pte_tables;	/* Array of 512 PTE tables */
+	int nr_pte_tables;	/* Number of PTE tables in array */
+};
+
+/*
+ * Withdraw pre-deposited page tables from PUD THP.
+ * Tables are always deposited at fault time in do_huge_pud_anonymous_page().
+ * Returns true if successful, false if no tables deposited.
+ */
+static bool withdraw_pud_split_ptables(struct mm_struct *mm, pud_t *pud,
+				       struct pud_split_ptables *tables)
+{
+	pmd_t *pmd_table;
+	pgtable_t pte_table;
+	int i;
+
+	tables->pmd_table = NULL;
+	tables->pte_tables = NULL;
+	tables->nr_pte_tables = 0;
+
+	/* Try to withdraw the deposited PMD table */
+	pmd_table = pgtable_trans_huge_pud_withdraw(mm, pud);
+	if (!pmd_table)
+		return false;
+
+	tables->pmd_table = pmd_table;
+
+	/* Allocate array to hold PTE table pointers */
+	tables->pte_tables = kmalloc_array(NR_PTE_TABLES_FOR_PUD,
+					   sizeof(pgtable_t), GFP_ATOMIC);
+	if (!tables->pte_tables)
+		goto fail;
+
+	/* Withdraw PTE tables from the PMD table */
+	for (i = 0; i < NR_PTE_TABLES_FOR_PUD; i++) {
+		pte_table = pud_withdraw_pte(pmd_table);
+		if (!pte_table)
+			goto fail;
+		tables->pte_tables[i] = pte_table;
+		tables->nr_pte_tables++;
+	}
+
+	return true;
+
+fail:
+	/* Put back any tables we withdrew */
+	for (i = 0; i < tables->nr_pte_tables; i++)
+		pud_deposit_pte(pmd_table, tables->pte_tables[i]);
+	kfree(tables->pte_tables);
+	pgtable_trans_huge_pud_deposit(mm, pud, pmd_table);
+	tables->pmd_table = NULL;
+	tables->pte_tables = NULL;
+	tables->nr_pte_tables = 0;
+	return false;
+}
+
 static void __split_huge_pud_locked(struct vm_area_struct *vma, pud_t *pud,
 				    unsigned long haddr)
 {
+	bool dirty = false, young = false, write = false;
+	struct pud_split_ptables tables = { 0 };
+	struct mm_struct *mm = vma->vm_mm;
+	rmap_t rmap_flags = RMAP_NONE;
+	bool anon_exclusive = false;
+	bool soft_dirty = false;
 	struct folio *folio;
+	unsigned long addr;
 	struct page *page;
 	pud_t old_pud;
+	int i, j;
 
 	VM_BUG_ON(haddr & ~HPAGE_PUD_MASK);
 	VM_BUG_ON_VMA(vma->vm_start > haddr, vma);
@@ -3145,20 +3215,115 @@ static void __split_huge_pud_locked(struct vm_area_struct *vma, pud_t *pud,
 
 	old_pud = pudp_huge_clear_flush(vma, haddr, pud);
 
-	if (!vma_is_dax(vma))
+	if (!vma_is_anonymous(vma)) {
+		if (!vma_is_dax(vma))
+			return;
+
+		page = pud_page(old_pud);
+		folio = page_folio(page);
+
+		if (!folio_test_dirty(folio) && pud_dirty(old_pud))
+			folio_mark_dirty(folio);
+		if (!folio_test_referenced(folio) && pud_young(old_pud))
+			folio_set_referenced(folio);
+		folio_remove_rmap_pud(folio, page, vma);
+		folio_put(folio);
+		add_mm_counter(mm, mm_counter_file(folio), -HPAGE_PUD_NR);
 		return;
+	}
+
+	/*
+	 * Anonymous PUD split: split directly to PTE level.
+	 *
+	 * We cannot create PMD huge entries pointing to portions of a larger
+	 * folio because the kernel's rmap infrastructure assumes PMD mappings
+	 * are for PMD-sized folios only (see __folio_rmap_sanity_checks).
+	 * Instead, we create a PMD table with 512 entries, each pointing to
+	 * a PTE table with 512 PTEs.
+	 *
+	 * Tables are always deposited at fault time in
+	 * do_huge_pud_anonymous_page().
+	 */
+	if (!withdraw_pud_split_ptables(mm, pud, &tables)) {
+		WARN_ON_ONCE(1);
+		return;
+	}
 
 	page = pud_page(old_pud);
 	folio = page_folio(page);
 
-	if (!folio_test_dirty(folio) && pud_dirty(old_pud))
-		folio_mark_dirty(folio);
-	if (!folio_test_referenced(folio) && pud_young(old_pud))
-		folio_set_referenced(folio);
+	dirty = pud_dirty(old_pud);
+	write = pud_write(old_pud);
+	young = pud_young(old_pud);
+	soft_dirty = pud_soft_dirty(old_pud);
+	anon_exclusive = PageAnonExclusive(page);
+
+	if (dirty)
+		folio_set_dirty(folio);
+
+	/*
+	 * Add references for each page that will have its own PTE.
+	 * The original folio has one reference. After the split, each of the
+	 * 262144 PTEs will eventually be unmapped, each calling folio_put().
+	 */
+	folio_ref_add(folio, HPAGE_PUD_NR - 1);
+
+	/*
+	 * Add PTE-level rmap for all pages at once.
+	 */
+	if (anon_exclusive)
+		rmap_flags |= RMAP_EXCLUSIVE;
+	folio_add_anon_rmap_ptes(folio, page, HPAGE_PUD_NR,
+				 vma, haddr, rmap_flags);
+
+	/* Remove PUD-level rmap */
 	folio_remove_rmap_pud(folio, page, vma);
-	folio_put(folio);
-	add_mm_counter(vma->vm_mm, mm_counter_file(folio),
-		       -HPAGE_PUD_NR);
+
+	/*
+	 * Create 512 PMD entries, each pointing to a PTE table.
+	 * Each PTE table has 512 PTEs pointing to individual pages.
+	 */
+	addr = haddr;
+	for (i = 0; i < (HPAGE_PUD_NR / HPAGE_PMD_NR); i++) {
+		pmd_t *pmd_entry = tables.pmd_table + i;
+		pgtable_t pte_table = tables.pte_tables[i];
+		pte_t *pte;
+		struct page *subpage_base = page + i * HPAGE_PMD_NR;
+
+		/* Populate the PTE table */
+		pte = page_address(pte_table);
+		for (j = 0; j < HPAGE_PMD_NR; j++) {
+			struct page *subpage = subpage_base + j;
+			pte_t entry;
+
+			entry = mk_pte(subpage, vma->vm_page_prot);
+			if (write)
+				entry = pte_mkwrite(entry, vma);
+			if (dirty)
+				entry = pte_mkdirty(entry);
+			if (young)
+				entry = pte_mkyoung(entry);
+			if (soft_dirty)
+				entry = pte_mksoft_dirty(entry);
+
+			set_pte_at(mm, addr + j * PAGE_SIZE, pte + j, entry);
+		}
+
+		/* Set the PMD to point to the PTE table */
+		pmd_populate(mm, pmd_entry, pte_table);
+		addr += HPAGE_PMD_SIZE;
+	}
+
+	/*
+	 * Memory barrier ensures all PMD entries are visible before
+	 * installing the PMD table in the PUD.
+	 */
+	smp_wmb();
+
+	/* Install the PMD table in the PUD */
+	pud_populate(mm, pud, tables.pmd_table);
+
+	/* Free the temporary array holding PTE table pointers */
+	kfree(tables.pte_tables);
 }
 
 void __split_huge_pud(struct vm_area_struct *vma, pud_t *pud,
-- 
2.47.3

From nobody Sat Feb 7 05:44:27 2026
From: Usama Arif <usamaarif642@gmail.com>
To: ziy@nvidia.com, Andrew Morton, David Hildenbrand,
    lorenzo.stoakes@oracle.com, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev,
    kas@kernel.org, baohua@kernel.org, dev.jain@arm.com,
    baolin.wang@linux.alibaba.com, npache@redhat.com,
    Liam.Howlett@oracle.com, ryan.roberts@arm.com, vbabka@suse.cz,
    lance.yang@linux.dev, linux-kernel@vger.kernel.org, kernel-team@meta.com
Subject: [RFC 05/12] mm: thp: add reclaim and migration support for PUD THP
Date: Sun, 1 Feb 2026 16:50:22 -0800
Message-ID: <20260202005451.774496-6-usamaarif642@gmail.com>
In-Reply-To: <20260202005451.774496-1-usamaarif642@gmail.com>
References: <20260202005451.774496-1-usamaarif642@gmail.com>

Enable the memory reclaim and migration paths to handle PUD THPs
correctly by splitting them before proceeding.

Memory reclaim needs to unmap pages before they can be reclaimed. For
PUD THPs, the unmap path now passes TTU_SPLIT_HUGE_PUD when unmapping
PUD-sized folios. This triggers the PUD split during the unmap phase,
converting the single PUD mapping into 262144 PTE mappings.
Reclaim then proceeds normally with the individual pages. This follows
the same pattern used for PMD THPs with TTU_SPLIT_HUGE_PMD.

When migration encounters a PUD-sized folio, it now splits the folio
first using the standard folio split mechanism. The resulting smaller
folios (or individual pages) can then be migrated normally. This matches
how PMD THPs are handled when PMD migration is not supported on a given
architecture.

The split-before-migrate approach means PUD THPs will be broken up
during NUMA balancing or memory compaction. While this loses the TLB
benefit of the large mapping, it allows these memory management
operations to proceed. Future work could add PUD-level migration entries
to preserve the mapping through migration.

Signed-off-by: Usama Arif <usamaarif642@gmail.com>
---
 include/linux/huge_mm.h | 11 ++++++
 mm/huge_memory.c        | 83 +++++++++++++++++++++++++++++++++++++----
 mm/migrate.c            | 17 +++++++++
 mm/vmscan.c             |  2 +
 4 files changed, 105 insertions(+), 8 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index a292035c0270f..8b2bffda4b4f3 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -559,6 +559,17 @@ static inline bool folio_test_pmd_mappable(struct folio *folio)
 	return folio_order(folio) >= HPAGE_PMD_ORDER;
 }
 
+/**
+ * folio_test_pud_mappable - Can we map this folio with a PUD?
+ * @folio: The folio to test
+ *
+ * Return: true - @folio can be PUD-mapped, false - @folio cannot be PUD-mapped.
+ */
+static inline bool folio_test_pud_mappable(struct folio *folio)
+{
+	return folio_order(folio) >= HPAGE_PUD_ORDER;
+}
+
 vm_fault_t do_huge_pmd_numa_page(struct vm_fault *vmf);
 
 vm_fault_t do_huge_pmd_device_private(struct vm_fault *vmf);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 39b8212b5abd4..87b2c21df4a49 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2228,9 +2228,17 @@ int copy_huge_pud(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 		goto out_unlock;
 
 	/*
-	 * TODO: once we support anonymous pages, use
-	 * folio_try_dup_anon_rmap_*() and split if duplicating fails.
+	 * For anonymous pages, split to PTE level.
+	 * This simplifies fork handling - we don't need to duplicate
+	 * the complex anon rmap at PUD level.
 	 */
+	if (vma_is_anonymous(vma)) {
+		spin_unlock(src_ptl);
+		spin_unlock(dst_ptl);
+		__split_huge_pud(vma, src_pud, addr);
+		return -EAGAIN;
+	}
+
 	if (is_cow_mapping(vma->vm_flags) && pud_write(pud)) {
 		pudp_set_wrprotect(src_mm, addr, src_pud);
 		pud = pud_wrprotect(pud);
@@ -3099,11 +3107,29 @@ int zap_huge_pud(struct mmu_gather *tlb, struct vm_area_struct *vma,
 {
 	spinlock_t *ptl;
 	pud_t orig_pud;
+	pmd_t *pmd_table;
+	pgtable_t pte_table;
+	int nr_pte_tables = 0;
 
 	ptl = __pud_trans_huge_lock(pud, vma);
 	if (!ptl)
 		return 0;
 
+	/*
+	 * Withdraw any deposited page tables before clearing the PUD.
+	 * These need to be freed and their counters decremented.
+	 */
+	pmd_table = pgtable_trans_huge_pud_withdraw(tlb->mm, pud);
+	if (pmd_table) {
+		while ((pte_table = pud_withdraw_pte(pmd_table)) != NULL) {
+			pte_free(tlb->mm, pte_table);
+			mm_dec_nr_ptes(tlb->mm);
+			nr_pte_tables++;
+		}
+		pmd_free(tlb->mm, pmd_table);
+		mm_dec_nr_pmds(tlb->mm);
+	}
+
 	orig_pud = pudp_huge_get_and_clear_full(vma, addr, pud, tlb->fullmm);
 	arch_check_zapped_pud(vma, orig_pud);
 	tlb_remove_pud_tlb_entry(tlb, pud, addr);
@@ -3114,14 +3140,15 @@ int zap_huge_pud(struct mmu_gather *tlb, struct vm_area_struct *vma,
 		struct page *page = NULL;
 		struct folio *folio;
 
-		/* No support for anonymous PUD pages or migration yet */
-		VM_WARN_ON_ONCE(vma_is_anonymous(vma) ||
-				!pud_present(orig_pud));
+		VM_WARN_ON_ONCE(!pud_present(orig_pud));
 
 		page = pud_page(orig_pud);
 		folio = page_folio(page);
 		folio_remove_rmap_pud(folio, page, vma);
-		add_mm_counter(tlb->mm, mm_counter_file(folio), -HPAGE_PUD_NR);
+		if (vma_is_anonymous(vma))
+			add_mm_counter(tlb->mm, MM_ANONPAGES, -HPAGE_PUD_NR);
+		else
+			add_mm_counter(tlb->mm, mm_counter_file(folio), -HPAGE_PUD_NR);
 
 		spin_unlock(ptl);
 		tlb_remove_page_size(tlb, page, HPAGE_PUD_SIZE);
@@ -3729,15 +3756,53 @@ static inline void split_huge_pmd_if_needed(struct vm_area_struct *vma, unsigned
 		split_huge_pmd_address(vma, address, false);
 }
 
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+static void split_huge_pud_address(struct vm_area_struct *vma, unsigned long address)
+{
+	pud_t *pud = mm_find_pud(vma->vm_mm, address);
+
+	if (!pud)
+		return;
+
+	__split_huge_pud(vma, pud, address);
+}
+
+static inline void split_huge_pud_if_needed(struct vm_area_struct *vma, unsigned long address)
+{
+	/*
+	 * If the new address isn't PUD-aligned and it could previously
+	 * contain a PUD huge page: check if we need to split it.
+	 */
+	if (!IS_ALIGNED(address, HPAGE_PUD_SIZE) &&
+	    range_in_vma(vma, ALIGN_DOWN(address, HPAGE_PUD_SIZE),
+			 ALIGN(address, HPAGE_PUD_SIZE)))
+		split_huge_pud_address(vma, address);
+}
+#else
+static inline void split_huge_pud_if_needed(struct vm_area_struct *vma, unsigned long address)
+{
+}
+#endif /* CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
+
 void vma_adjust_trans_huge(struct vm_area_struct *vma,
 			   unsigned long start,
 			   unsigned long end,
 			   struct vm_area_struct *next)
 {
-	/* Check if we need to split start first. */
+	/* Check if we need to split PUD THP at start first. */
+	split_huge_pud_if_needed(vma, start);
+
+	/* Check if we need to split PUD THP at end. */
+	split_huge_pud_if_needed(vma, end);
+
+	/* If we're incrementing next->vm_start, we might need to split it. */
+	if (next)
+		split_huge_pud_if_needed(next, end);
+
+	/* Check if we need to split PMD THP at start. */
 	split_huge_pmd_if_needed(vma, start);
 
-	/* Check if we need to split end next. */
+	/* Check if we need to split PMD THP at end. */
 	split_huge_pmd_if_needed(vma, end);
 
 	/* If we're incrementing next->vm_start, we might need to split it. */
@@ -3752,6 +3817,8 @@ static void unmap_folio(struct folio *folio)
 
 	VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
 
+	if (folio_test_pud_mappable(folio))
+		ttu_flags |= TTU_SPLIT_HUGE_PUD;
 	if (folio_test_pmd_mappable(folio))
 		ttu_flags |= TTU_SPLIT_HUGE_PMD;
 
diff --git a/mm/migrate.c b/mm/migrate.c
index 4688b9e38cd2f..2d3d2f5585d14 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1859,6 +1859,23 @@ static int migrate_pages_batch(struct list_head *from,
 			 * we will migrate them after the rest of the
 			 * list is processed.
 			 */
+			/*
+			 * PUD-sized folios cannot be migrated directly,
+			 * but can be split. Try to split them first and
+			 * migrate the resulting smaller folios.
+			 */
+			if (folio_test_pud_mappable(folio)) {
+				nr_failed++;
+				stats->nr_thp_failed++;
+				if (!try_split_folio(folio, split_folios, mode)) {
+					stats->nr_thp_split++;
+					stats->nr_split++;
+					continue;
+				}
+				stats->nr_failed_pages += nr_pages;
+				list_move_tail(&folio->lru, ret_folios);
+				continue;
+			}
 			if (!thp_migration_supported() && is_thp) {
 				nr_failed++;
 				stats->nr_thp_failed++;
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 619691aa43938..868514a770bf2 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1348,6 +1348,8 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
 			enum ttu_flags flags = TTU_BATCH_FLUSH;
 			bool was_swapbacked = folio_test_swapbacked(folio);
 
+			if (folio_test_pud_mappable(folio))
+				flags |= TTU_SPLIT_HUGE_PUD;
 			if (folio_test_pmd_mappable(folio))
 				flags |= TTU_SPLIT_HUGE_PMD;
 			/*
-- 
2.47.3

From nobody Sat Feb 7 05:44:27 2026
From: Usama Arif <usamaarif642@gmail.com>
To: ziy@nvidia.com, Andrew Morton, David Hildenbrand,
    lorenzo.stoakes@oracle.com, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev,
    kas@kernel.org, baohua@kernel.org, dev.jain@arm.com,
    baolin.wang@linux.alibaba.com, npache@redhat.com,
    Liam.Howlett@oracle.com, ryan.roberts@arm.com, vbabka@suse.cz,
    lance.yang@linux.dev, linux-kernel@vger.kernel.org, kernel-team@meta.com
Subject: [RFC 06/12] selftests/mm: add PUD THP basic allocation test
Date: Sun, 1 Feb 2026 16:50:23 -0800
Message-ID: <20260202005451.774496-7-usamaarif642@gmail.com>
In-Reply-To: <20260202005451.774496-1-usamaarif642@gmail.com>
References: <20260202005451.774496-1-usamaarif642@gmail.com>
Add a selftest for PUD-level THPs (1GB THPs) with test infrastructure
and a basic allocation test.

The test uses the kselftest harness FIXTURE/TEST_F framework. A shared
fixture allocates a 2GB anonymous mapping and computes a PUD-aligned
address within it. Helper functions read THP counters from /proc/vmstat
and mTHP statistics from sysfs.

The basic allocation test verifies the fundamental PUD THP allocation
path by touching a PUD-aligned region and checking that the mTHP
anon_fault_alloc counter increments, confirming a 1GB folio was
allocated.

Signed-off-by: Usama Arif <usamaarif642@gmail.com>
---
 tools/testing/selftests/mm/Makefile       |   1 +
 tools/testing/selftests/mm/pud_thp_test.c | 161 ++++++++++++++++++++++
 2 files changed, 162 insertions(+)
 create mode 100644 tools/testing/selftests/mm/pud_thp_test.c

diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
index eaf9312097f7b..ab79f1693941a 100644
--- a/tools/testing/selftests/mm/Makefile
+++ b/tools/testing/selftests/mm/Makefile
@@ -88,6 +88,7 @@ TEST_GEN_FILES += pagemap_ioctl
 TEST_GEN_FILES += pfnmap
 TEST_GEN_FILES += process_madv
 TEST_GEN_FILES += prctl_thp_disable
+TEST_GEN_FILES += pud_thp_test
 TEST_GEN_FILES += thuge-gen
 TEST_GEN_FILES += transhuge-stress
 TEST_GEN_FILES += uffd-stress
diff --git a/tools/testing/selftests/mm/pud_thp_test.c b/tools/testing/selftests/mm/pud_thp_test.c
new file mode 100644
index 0000000000000..6f0c02c6afd3a
--- /dev/null
+++ b/tools/testing/selftests/mm/pud_thp_test.c
@@ -0,0 +1,161 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Test program for PUD-level Transparent Huge Pages (1GB anonymous THP)
+ *
+ * Prerequisites:
+ * - Kernel with PUD THP support (CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD)
+ * - THP enabled: echo always > /sys/kernel/mm/transparent_hugepage/enabled
+ * - PUD THP enabled: echo always > /sys/kernel/mm/transparent_hugepage/hugepages-1048576kB/enabled
+ */
+
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <signal.h>
+#include <sys/mman.h>
+#include <sys/types.h>
+#include <sys/wait.h>
+
+#include "kselftest_harness.h"
+
+#define PUD_SIZE	(1UL << 30)	/* 1GB */
+#define PMD_SIZE	(1UL << 21)	/* 2MB */
+#define PAGE_SIZE	(1UL << 12)	/* 4KB */
+
+#define TEST_REGION_SIZE	(2 * PUD_SIZE)	/* 2GB to ensure PUD alignment */
+
+/* Get PUD-aligned address within a region */
+static inline void *pud_align(void *addr)
+{
+	return (void *)(((unsigned long)addr + PUD_SIZE - 1) & ~(PUD_SIZE - 1));
+}
+
+/* Read vmstat counter */
+static unsigned long read_vmstat(const char *name)
+{
+	FILE *fp;
+	char line[256];
+	unsigned long value = 0;
+
+	fp = fopen("/proc/vmstat", "r");
+	if (!fp)
+		return 0;
+
+	while (fgets(line, sizeof(line), fp)) {
+		if (strncmp(line, name, strlen(name)) == 0 &&
+		    line[strlen(name)] == ' ') {
+			sscanf(line + strlen(name), " %lu", &value);
+			break;
+		}
+	}
+	fclose(fp);
+	return value;
+}
+
+/* Read mTHP stats for PUD order (1GB = 1048576kB) */
+static unsigned long read_mthp_stat(const char *stat_name)
+{
+	char path[256];
+	char buf[64];
+	int fd;
+	ssize_t ret;
+	unsigned long value = 0;
+
+	snprintf(path, sizeof(path),
+		 "/sys/kernel/mm/transparent_hugepage/hugepages-1048576kB/stats/%s",
+		 stat_name);
+	fd = open(path, O_RDONLY);
+	if (fd < 0)
+		return 0;
+	ret = read(fd, buf, sizeof(buf) - 1);
+	close(fd);
+	if (ret <= 0)
+		return 0;
+	buf[ret] = '\0';
+	sscanf(buf, "%lu", &value);
+	return value;
+}
+
+/* Check if PUD THP is enabled */
+static int pud_thp_enabled(void)
+{
+	char buf[64];
+	int fd;
+	ssize_t ret;
+
+	fd = open("/sys/kernel/mm/transparent_hugepage/hugepages-1048576kB/enabled", O_RDONLY);
+	if (fd < 0)
+		return 0;
+	ret = read(fd, buf, sizeof(buf) - 1);
+	close(fd);
+	if (ret <= 0)
+		return 0;
+	buf[ret] = '\0';
+
+	/* Check if [always] or [madvise] is set */
+	if (strstr(buf, "[always]") || strstr(buf, "[madvise]"))
+		return 1;
+	return 0;
+}
+
+/*
+ * Main fixture for PUD THP tests
+ * Allocates a 2GB region and provides a PUD-aligned pointer within it
+ */
+FIXTURE(pud_thp)
+{
+	void *mem;	/* Base mmap allocation */
+	void *aligned;	/* PUD-aligned pointer within mem */
+	unsigned long mthp_alloc_before;
+	unsigned long split_before;
+};
+
+FIXTURE_SETUP(pud_thp)
+{
+	if (!pud_thp_enabled())
+		SKIP(return, "PUD THP not enabled in sysfs");
+
+	self->mthp_alloc_before = read_mthp_stat("anon_fault_alloc");
+	self->split_before = read_vmstat("thp_split_pud");
+
+	self->mem = mmap(NULL, TEST_REGION_SIZE, PROT_READ | PROT_WRITE,
+			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+	ASSERT_NE(self->mem, MAP_FAILED);
+
+	self->aligned = pud_align(self->mem);
+}
+
+FIXTURE_TEARDOWN(pud_thp)
+{
+	if (self->mem && self->mem != MAP_FAILED)
+		munmap(self->mem, TEST_REGION_SIZE);
+}
+
+/*
+ * Test: Basic PUD THP allocation
+ * Verifies that touching a PUD-aligned region allocates a PUD THP
+ */
+TEST_F(pud_thp, basic_allocation)
+{
+	unsigned long mthp_alloc_after;
+
+	/* Touch memory to trigger page fault and PUD THP allocation */
+	memset(self->aligned, 0xAB, PUD_SIZE);
+
+	mthp_alloc_after = read_mthp_stat("anon_fault_alloc");
+
+	/*
+	 * If mTHP allocation counter increased, a PUD THP was allocated.
+	 */
+	if (mthp_alloc_after <= self->mthp_alloc_before)
+		SKIP(return, "PUD THP not allocated");
+
+	TH_LOG("PUD THP allocated (anon_fault_alloc: %lu -> %lu)",
+	       self->mthp_alloc_before, mthp_alloc_after);
+}
+
+TEST_HARNESS_MAIN
-- 
2.47.3

From nobody Sat Feb 7 05:44:27 2026
From: Usama Arif <usamaarif642@gmail.com>
To: ziy@nvidia.com, Andrew Morton, David Hildenbrand,
    lorenzo.stoakes@oracle.com, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev,
    kas@kernel.org, baohua@kernel.org, dev.jain@arm.com,
    baolin.wang@linux.alibaba.com, npache@redhat.com,
    Liam.Howlett@oracle.com, ryan.roberts@arm.com, vbabka@suse.cz,
    lance.yang@linux.dev, linux-kernel@vger.kernel.org, kernel-team@meta.com
Subject: [RFC 07/12] selftests/mm: add PUD THP read/write access test
Date: Sun, 1 Feb 2026 16:50:24 -0800
Message-ID: <20260202005451.774496-8-usamaarif642@gmail.com>
In-Reply-To: <20260202005451.774496-1-usamaarif642@gmail.com>
References: <20260202005451.774496-1-usamaarif642@gmail.com>

Add a test that verifies data integrity across a 1GB PUD THP region by
writing patterns at page boundaries and reading them back.
Signed-off-by: Usama Arif <usamaarif642@gmail.com>
---
 tools/testing/selftests/mm/pud_thp_test.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/tools/testing/selftests/mm/pud_thp_test.c b/tools/testing/selftests/mm/pud_thp_test.c
index 6f0c02c6afd3a..7a1f0b0f81468 100644
--- a/tools/testing/selftests/mm/pud_thp_test.c
+++ b/tools/testing/selftests/mm/pud_thp_test.c
@@ -158,4 +158,27 @@ TEST_F(pud_thp, basic_allocation)
 	       self->mthp_alloc_before, mthp_alloc_after);
 }
 
+/*
+ * Test: Read/write access patterns
+ * Verifies data integrity across the entire 1GB region
+ */
+TEST_F(pud_thp, read_write_access)
+{
+	unsigned long *ptr = (unsigned long *)self->aligned;
+	size_t i;
+	int errors = 0;
+
+	/* Write pattern - sample every page to reduce test time */
+	for (i = 0; i < PUD_SIZE / sizeof(unsigned long); i += PAGE_SIZE / sizeof(unsigned long))
+		ptr[i] = i ^ 0xDEADBEEFUL;
+
+	/* Verify pattern */
+	for (i = 0; i < PUD_SIZE / sizeof(unsigned long); i += PAGE_SIZE / sizeof(unsigned long)) {
+		if (ptr[i] != (i ^ 0xDEADBEEFUL))
+			errors++;
+	}
+
+	ASSERT_EQ(errors, 0);
+}
+
 TEST_HARNESS_MAIN
-- 
2.47.3

From nobody Sat Feb 7 05:44:27 2026
From: Usama Arif <usamaarif642@gmail.com>
To: ziy@nvidia.com, Andrew Morton, David Hildenbrand,
    lorenzo.stoakes@oracle.com, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev,
    kas@kernel.org, baohua@kernel.org, dev.jain@arm.com,
    baolin.wang@linux.alibaba.com, npache@redhat.com,
    Liam.Howlett@oracle.com, ryan.roberts@arm.com, vbabka@suse.cz,
    lance.yang@linux.dev, linux-kernel@vger.kernel.org, kernel-team@meta.com
Subject: [RFC 08/12] selftests/mm: add PUD THP fork COW test
Date: Sun, 1 Feb 2026 16:50:25 -0800
Message-ID: <20260202005451.774496-9-usamaarif642@gmail.com>
In-Reply-To: <20260202005451.774496-1-usamaarif642@gmail.com>
References: <20260202005451.774496-1-usamaarif642@gmail.com>

Add a test that allocates a PUD THP, forks a child process, and has
the child write to the shared memory. This triggers the copy-on-write
path which must split the PUD THP. The test verifies that both parent
and child see correct data after the split.

Signed-off-by: Usama Arif <usamaarif642@gmail.com>
---
 tools/testing/selftests/mm/pud_thp_test.c | 44 +++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/tools/testing/selftests/mm/pud_thp_test.c b/tools/testing/selftests/mm/pud_thp_test.c
index 7a1f0b0f81468..27a509cd477d5 100644
--- a/tools/testing/selftests/mm/pud_thp_test.c
+++ b/tools/testing/selftests/mm/pud_thp_test.c
@@ -181,4 +181,48 @@ TEST_F(pud_thp, read_write_access)
 	ASSERT_EQ(errors, 0);
 }
 
+/*
+ * Test: Fork and copy-on-write
+ * Verifies that COW correctly splits the PUD THP and isolates parent/child
+ */
+TEST_F(pud_thp, fork_cow)
+{
+	unsigned long *ptr = (unsigned long *)self->aligned;
+	unsigned char *bytes = (unsigned char *)self->aligned;
+	pid_t pid;
+	int status;
+	unsigned long split_after;
+
+	/* Initialize memory with known pattern */
+	memset(self->aligned, 0xCC, PUD_SIZE);
+
+	pid = fork();
+	ASSERT_GE(pid, 0);
+
+	if (pid == 0) {
+		/* Child: write to trigger COW */
+		ptr[0] = 0x12345678UL;
+
+		/* Verify write succeeded and rest of memory unchanged */
+		if (ptr[0] != 0x12345678UL)
+			_exit(1);
+		if (bytes[PAGE_SIZE] != 0xCC)
+			_exit(2);
+
+		_exit(0);
+	}
+
+	/* Parent: wait for child */
+	waitpid(pid, &status, 0);
+	ASSERT_TRUE(WIFEXITED(status));
+	ASSERT_EQ(WEXITSTATUS(status), 0);
+
+	/* Verify parent memory unchanged (COW should have given child a copy) */
+	ASSERT_EQ(bytes[0], 0xCC);
+
+	split_after = read_vmstat("thp_split_pud");
+	TH_LOG("Fork COW completed (thp_split_pud: %lu -> %lu)",
+	       self->split_before, split_after);
+}
+
 TEST_HARNESS_MAIN
-- 
2.47.3
From: Usama Arif
To: ziy@nvidia.com, Andrew Morton, David Hildenbrand,
    lorenzo.stoakes@oracle.com, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev,
    kas@kernel.org, baohua@kernel.org, dev.jain@arm.com,
    baolin.wang@linux.alibaba.com, npache@redhat.com,
    Liam.Howlett@oracle.com, ryan.roberts@arm.com, vbabka@suse.cz,
    lance.yang@linux.dev, linux-kernel@vger.kernel.org,
    kernel-team@meta.com, Usama Arif
Subject: [RFC 09/12] selftests/mm: add PUD THP partial munmap test
Date: Sun, 1 Feb 2026 16:50:26 -0800
Message-ID: <20260202005451.774496-10-usamaarif642@gmail.com>
In-Reply-To: <20260202005451.774496-1-usamaarif642@gmail.com>
References: <20260202005451.774496-1-usamaarif642@gmail.com>

Add a test that allocates a PUD THP and unmaps a 2MB region from the
middle. Since the PUD can no longer cover the entire region, it must be
split. The test verifies that memory before and after the hole remains
accessible with correct data.
Signed-off-by: Usama Arif
---
 tools/testing/selftests/mm/pud_thp_test.c | 31 +++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/tools/testing/selftests/mm/pud_thp_test.c b/tools/testing/selftests/mm/pud_thp_test.c
index 27a509cd477d5..8d4cb0e60f7f7 100644
--- a/tools/testing/selftests/mm/pud_thp_test.c
+++ b/tools/testing/selftests/mm/pud_thp_test.c
@@ -225,4 +225,35 @@ TEST_F(pud_thp, fork_cow)
 	       self->split_before, split_after);
 }
 
+/*
+ * Test: Partial munmap triggers split
+ * Verifies that unmapping part of a PUD THP splits it correctly
+ */
+TEST_F(pud_thp, partial_munmap)
+{
+	unsigned long *ptr = (unsigned long *)self->aligned;
+	unsigned long *after_hole;
+	unsigned long split_after;
+	int ret;
+
+	/* Touch memory to allocate PUD THP */
+	memset(self->aligned, 0xDD, PUD_SIZE);
+
+	/* Unmap a 2MB region in the middle - should trigger PUD split */
+	ret = munmap((char *)self->aligned + PUD_SIZE / 2, PMD_SIZE);
+	ASSERT_EQ(ret, 0);
+
+	split_after = read_vmstat("thp_split_pud");
+
+	/* Verify memory before the hole is still accessible and correct */
+	ASSERT_EQ(ptr[0], 0xDDDDDDDDDDDDDDDDUL);
+
+	/* Verify memory after the hole is still accessible and correct */
+	after_hole = (unsigned long *)((char *)self->aligned + PUD_SIZE / 2 + PMD_SIZE);
+	ASSERT_EQ(*after_hole, 0xDDDDDDDDDDDDDDDDUL);
+
+	TH_LOG("Partial munmap completed (thp_split_pud: %lu -> %lu)",
+	       self->split_before, split_after);
+}
+
 TEST_HARNESS_MAIN
-- 
2.47.3

From nobody Sat Feb 7 05:44:27 2026
From: Usama Arif
To: ziy@nvidia.com, Andrew Morton, David Hildenbrand,
    lorenzo.stoakes@oracle.com, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev,
    kas@kernel.org, baohua@kernel.org, dev.jain@arm.com,
    baolin.wang@linux.alibaba.com, npache@redhat.com,
    Liam.Howlett@oracle.com, ryan.roberts@arm.com, vbabka@suse.cz,
    lance.yang@linux.dev, linux-kernel@vger.kernel.org,
    kernel-team@meta.com, Usama Arif
Subject: [RFC 10/12] selftests/mm: add PUD THP mprotect split test
Date: Sun, 1 Feb 2026 16:50:27 -0800
Message-ID: <20260202005451.774496-11-usamaarif642@gmail.com>
In-Reply-To: <20260202005451.774496-1-usamaarif642@gmail.com>
References: <20260202005451.774496-1-usamaarif642@gmail.com>

Add a test that changes permissions on a portion of a PUD THP using
mprotect(). Since different parts of the mapping now have different
permissions, the PUD must be split. The test verifies correct behavior
after the permission change.
Signed-off-by: Usama Arif
---
 tools/testing/selftests/mm/pud_thp_test.c | 26 +++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/tools/testing/selftests/mm/pud_thp_test.c b/tools/testing/selftests/mm/pud_thp_test.c
index 8d4cb0e60f7f7..b59eb470adbba 100644
--- a/tools/testing/selftests/mm/pud_thp_test.c
+++ b/tools/testing/selftests/mm/pud_thp_test.c
@@ -256,4 +256,30 @@ TEST_F(pud_thp, partial_munmap)
 	       self->split_before, split_after);
 }
 
+/*
+ * Test: mprotect triggers split
+ * Verifies that changing protection on part of a PUD THP splits it
+ */
+TEST_F(pud_thp, mprotect_split)
+{
+	volatile unsigned char *p = (unsigned char *)self->aligned;
+	unsigned long split_after;
+	int ret;
+
+	/* Touch memory to allocate PUD THP */
+	memset(self->aligned, 0xEE, PUD_SIZE);
+
+	/* Change protection on a 2MB region - should trigger PUD split */
+	ret = mprotect((char *)self->aligned + PMD_SIZE, PMD_SIZE, PROT_READ);
+	ASSERT_EQ(ret, 0);
+
+	split_after = read_vmstat("thp_split_pud");
+
+	/* Verify memory still readable */
+	ASSERT_EQ(*p, 0xEE);
+
+	TH_LOG("mprotect split completed (thp_split_pud: %lu -> %lu)",
+	       self->split_before, split_after);
+}
+
 TEST_HARNESS_MAIN
-- 
2.47.3

From nobody Sat Feb 7 05:44:27 2026
From: Usama Arif
To: ziy@nvidia.com, Andrew Morton, David Hildenbrand,
    lorenzo.stoakes@oracle.com, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev,
    kas@kernel.org, baohua@kernel.org, dev.jain@arm.com,
    baolin.wang@linux.alibaba.com, npache@redhat.com,
    Liam.Howlett@oracle.com, ryan.roberts@arm.com, vbabka@suse.cz,
    lance.yang@linux.dev, linux-kernel@vger.kernel.org,
    kernel-team@meta.com, Usama Arif
Subject: [RFC 11/12] selftests/mm: add PUD THP reclaim test
Date: Sun, 1 Feb 2026 16:50:28 -0800
Message-ID: <20260202005451.774496-12-usamaarif642@gmail.com>
In-Reply-To: <20260202005451.774496-1-usamaarif642@gmail.com>
References: <20260202005451.774496-1-usamaarif642@gmail.com>

Add a test that uses MADV_PAGEOUT to advise the kernel to page out the
PUD THP memory. This exercises the reclaim path, which must split the
PUD THP before reclaiming the individual pages.

Signed-off-by: Usama Arif
---
 tools/testing/selftests/mm/pud_thp_test.c | 33 +++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/tools/testing/selftests/mm/pud_thp_test.c b/tools/testing/selftests/mm/pud_thp_test.c
index b59eb470adbba..961fdc489d8a2 100644
--- a/tools/testing/selftests/mm/pud_thp_test.c
+++ b/tools/testing/selftests/mm/pud_thp_test.c
@@ -28,6 +28,10 @@
 
 #define TEST_REGION_SIZE (2 * PUD_SIZE) /* 2GB to ensure PUD alignment */
 
+#ifndef MADV_PAGEOUT
+#define MADV_PAGEOUT 21
+#endif
+
 /* Get PUD-aligned address within a region */
 static inline void *pud_align(void *addr)
 {
@@ -282,4 +286,33 @@ TEST_F(pud_thp, mprotect_split)
 	       self->split_before, split_after);
 }
 
+/*
+ * Test: Reclaim via MADV_PAGEOUT
+ * Verifies that reclaim path correctly handles PUD THPs
+ */
+TEST_F(pud_thp, reclaim_pageout)
+{
+	volatile unsigned char *p;
+	unsigned long split_after;
+	int ret;
+
+	/* Touch memory to allocate PUD THP */
+	memset(self->aligned, 0xAA, PUD_SIZE);
+
+	/* Try to reclaim the pages */
+	ret = madvise(self->aligned, PUD_SIZE, MADV_PAGEOUT);
+	if (ret < 0 && errno == EINVAL)
+		SKIP(return, "MADV_PAGEOUT not supported");
+	ASSERT_EQ(ret, 0);
+
+	split_after = read_vmstat("thp_split_pud");
+
+	/* Touch memory again to verify it's still accessible */
+	p = (unsigned char *)self->aligned;
+	(void)*p; /* Read to bring pages back if swapped */
+
+	TH_LOG("Reclaim completed (thp_split_pud: %lu -> %lu)",
+	       self->split_before, split_after);
+}
+
 TEST_HARNESS_MAIN
-- 
2.47.3

From nobody Sat Feb 7 05:44:27 2026
From: Usama Arif
To: ziy@nvidia.com, Andrew Morton, David Hildenbrand,
    lorenzo.stoakes@oracle.com, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev,
    kas@kernel.org, baohua@kernel.org, dev.jain@arm.com,
    baolin.wang@linux.alibaba.com, npache@redhat.com,
    Liam.Howlett@oracle.com, ryan.roberts@arm.com, vbabka@suse.cz,
    lance.yang@linux.dev, linux-kernel@vger.kernel.org,
    kernel-team@meta.com, Usama Arif
Subject: [RFC 12/12] selftests/mm: add PUD THP migration test
Date: Sun, 1 Feb 2026 16:50:29 -0800
Message-ID: <20260202005451.774496-13-usamaarif642@gmail.com>
In-Reply-To: <20260202005451.774496-1-usamaarif642@gmail.com>
References: <20260202005451.774496-1-usamaarif642@gmail.com>

Add a test that uses mbind() to change the NUMA memory policy, which
triggers migration. The kernel must split PUD THPs before migration,
since there is no PUD-level migration entry support. The test verifies
data integrity after the migration attempt.
Signed-off-by: Usama Arif
---
 tools/testing/selftests/mm/pud_thp_test.c | 42 +++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/tools/testing/selftests/mm/pud_thp_test.c b/tools/testing/selftests/mm/pud_thp_test.c
index 961fdc489d8a2..7e227f29e69fb 100644
--- a/tools/testing/selftests/mm/pud_thp_test.c
+++ b/tools/testing/selftests/mm/pud_thp_test.c
@@ -32,6 +32,14 @@
 #define MADV_PAGEOUT 21
 #endif
 
+#ifndef MPOL_BIND
+#define MPOL_BIND 2
+#endif
+
+#ifndef MPOL_MF_MOVE
+#define MPOL_MF_MOVE (1 << 1)
+#endif
+
 /* Get PUD-aligned address within a region */
 static inline void *pud_align(void *addr)
 {
@@ -315,4 +323,38 @@ TEST_F(pud_thp, reclaim_pageout)
 	       self->split_before, split_after);
 }
 
+/*
+ * Test: Migration via mbind
+ * Verifies that migration path correctly handles PUD THPs by splitting
+ */
+TEST_F(pud_thp, migration_mbind)
+{
+	unsigned char *bytes = (unsigned char *)self->aligned;
+	unsigned long nodemask = 1UL; /* Node 0 */
+	unsigned long split_after;
+	int ret;
+
+	/* Touch memory to allocate PUD THP */
+	memset(self->aligned, 0xBB, PUD_SIZE);
+
+	/* Try to migrate by changing NUMA policy */
+	ret = syscall(__NR_mbind, self->aligned, PUD_SIZE, MPOL_BIND, &nodemask,
+		      sizeof(nodemask) * 8, MPOL_MF_MOVE);
+	/*
+	 * mbind may fail with EINVAL (single node) or EIO (migration failed),
+	 * which is acceptable - we just want to exercise the migration path.
+	 */
+	if (ret < 0 && errno != EINVAL && errno != EIO)
+		TH_LOG("mbind returned unexpected error: %s", strerror(errno));
+
+	split_after = read_vmstat("thp_split_pud");
+
+	/* Verify data integrity */
+	ASSERT_EQ(bytes[0], 0xBB);
+	ASSERT_EQ(bytes[PUD_SIZE - 1], 0xBB);
+
+	TH_LOG("Migration completed (thp_split_pud: %lu -> %lu)",
+	       self->split_before, split_after);
+}
+
 TEST_HARNESS_MAIN
-- 
2.47.3