From nobody Wed Jun 17 01:42:34 2026 Received: from mail-pl1-f175.google.com (mail-pl1-f175.google.com [209.85.214.175]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6F92A384225 for ; Tue, 21 Apr 2026 09:25:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.175 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776763522; cv=none; b=gba2yDO1aELPJEFbgAH/gp8IzFZy7Ej1logKk5oaYTyoqO8XgtQ4g1erSgEYMRcQvlquEs4t9iGfLXlggPfQHuOVoXs0UF9La7fPPm2vdbJnFe8K5p3W+QGtZdnfZ9tJs7A/SfVvN0pXNFWiFrV9wGpgBDy4X65+dY6fBQF6WyU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776763522; c=relaxed/simple; bh=6O9w10R9PumjBOXlRGOZcGuAOn27IfvE7Y5dMyNW/9E=; h=From:To:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=JImAjMkW6cvIUDJ2BaQsf17UByd60oBJR5JvIU9bdGyB8yI+ZVFSuHB2H6hiNWrUFaPjrLGarPdHEhVaJt3dpEqoMAy7ohsCB2fWTHPoZgxKoArdfiRveAJ1P561l99OshaQtAVZ1Yg8voTXv4e6FcoHQaTp2NyvItEKXLn3+Jg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=HtR2eQnv; arc=none smtp.client-ip=209.85.214.175 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="HtR2eQnv" Received: by mail-pl1-f175.google.com with SMTP id d9443c01a7336-2b2589c26e3so35930915ad.1 for ; Tue, 21 Apr 2026 02:25:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1776763520; x=1777368320; darn=vger.kernel.org; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=jQRro4iFet5g6RlukJUR9qjzAO0LrmO0ikVc7Fh+rVg=; b=HtR2eQnv7/B0c9HHAvYO7J7Yyo9UA5/tOYscCx+Pw8s+g7SDtwkSpqheO95yZWvTBo fkeRGZgBXR81MKu2ZDMklv1xEa26t7gvRhqIvTk9ZileGUY3JoQR6EelaIGlDDs/2AUc 3+p+u0h7M8EOpPTUde8Hlu8nexjU6qeABsfvm+pP8u4pprXPYpjKWaYqy0FWRszyfKJC R0JY+hTteMhHdEdxDI4qNmamKo7MVXo7HFZob9fNMnTpYSci6ztGrcsXoLF98tX0736s 0+B14WPiETfmEx0rqUh9jrvJNN9AsXSQor1pJY8HJoJOGo7l5z75/tqZuQBcVrabWaX1 2ZeQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1776763520; x=1777368320; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-gg:x-gm-message-state:from:to :cc:subject:date:message-id:reply-to; bh=jQRro4iFet5g6RlukJUR9qjzAO0LrmO0ikVc7Fh+rVg=; b=s1OU+Bc1siixRbxSI4FbDHry+RrLRWPnPMhlzY3reEh44/PEFo/SHowVxYguajlufl D05NbNruyIcjwEHTNqppFPffqX4JdM0tPd9jycxC+RTfsex9BOodBJ5pkJqVULFLifMk czn58GVEKcnaTO3M20B6NZAM3/kMwKEf9RARopkgqvBJAsBqZaUsLOrV6eWVpbVRbvfU EvTIBFXNvy72IGxPLK0hGWp44hudELU2Xvp2RSg2owRqcb/mrW1q9MduAG1oxf+WUtXe b/M1lNSEL1/fD5e8FJIrYyMIbBPy+yJFxXtse0klQWsAc17ACof6QRUO1JCIEamHLpBk qXEA== X-Forwarded-Encrypted: i=1; AFNElJ+mp7F2/PhEet1BHftWD/sYV8NKoKYTAEhVWmLxBieaoRsj2cHWHNbPbk5nLEkGw+9C7gHDFleVO0VA24w=@vger.kernel.org X-Gm-Message-State: AOJu0YwtuUg2kpyEOQCFAO7KCyFRcsaNoNAXsOlX4sDTJtwOyDHq601e tc5Rp7iBxugCpLZ8R8elcW+HnD7AkJiu5mUfvqbk1fODCvxvfGwyv3+nyiyQPuTr2YY= X-Gm-Gg: AeBDieseiqXmL1PLXSSHSDzSF7Rptw6dGtfi1cQ2lWf07vYe+4C5wgwz9lN1bsxs7Yl jSYZOPL/O4trDnCgbRLy2Cvf0A7S5PTffmmHl6Lbk0a7IVz0lBBYMaDY43vOS90JHyNdzKj0xN0 FHbseensL2ItyNZMUbFNwBV84H6oGt2krmL0jyvrADhnBY6cAEzS+ipKiqm63Ph669KNgchqZTH 2vHraJwazAC25nhsza3vp1FHDZDH7Br5WF9oYP52fw0oWyP1wFO8UO3ABtx/GpgUWvc4DJOd8Yl 7kQtdikLMHa3U2gy6Mi9sVAXvv2Fb7AuYnbn+A2RVg8r8QZ9Iomdq3P8OCBKpLp+YTLZ73d9S/T YcTga8E7YtUOLaN5rZ6kEyxurxaH0FRBERZTRuQ+kANCju6axscyappAgAvZahQo7JI6RtN2iXP 
cECzrdkhVcjgPEHfN9ss2Xg9sLgUECNBjR609Z3qQM9i/A3uasxqMsUNo43aCM2OfDckeT X-Received: by 2002:a17:902:9001:b0:2b4:5893:3f3c with SMTP id d9443c01a7336-2b5f9ef4c3fmr117016875ad.13.1776763519593; Tue, 21 Apr 2026 02:25:19 -0700 (PDT) Received: from L6YN4KR4K9.bytedance.net ([61.213.176.6]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-2b5fa9ff39csm131965105ad.4.2026.04.21.02.25.10 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Tue, 21 Apr 2026 02:25:19 -0700 (PDT) From: Yunhui Cui To: akpm@linux-foundation.org, alex@ghiti.fr, andrew+kernel@donnellan.id.au, andreyknvl@gmail.com, anup@brainfault.org, aou@eecs.berkeley.edu, apopple@nvidia.com, ardb@kernel.org, atish.patra@linux.dev, baolin.wang@linux.alibaba.com, cuiyunhui@bytedance.com, david@kernel.org, debug@rivosinc.com, djordje.todorovic@htecgroup.com, dvyukov@google.com, elver@google.com, glider@google.com, ilias.apalodimas@linaro.org, junhui.liu@pigmoral.tech, kasan-dev@googlegroups.com, kees@kernel.org, kevin.brodsky@arm.com, kvm-riscv@lists.infradead.org, kvm@vger.kernel.org, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org, liu.xuemei1@zte.com.cn, ljs@kernel.org, namcao@linutronix.de, osalvador@suse.de, palmer@dabbelt.com, pjw@kernel.org, rmclure@linux.ibm.com, rostedt@goodmis.org, rppt@kernel.org, ryabinin.a.a@gmail.com, surenb@google.com, vincenzo.frascino@arm.com, vishal.moola@gmail.com, wangruikang@iscas.ac.cn, zhangchunyan@iscas.ac.cn Subject: [PATCH 1/7] riscv: mm: split raw and public PTE helpers Date: Tue, 21 Apr 2026 17:24:51 +0800 Message-Id: <20260421092457.37649-2-cuiyunhui@bytedance.com> X-Mailer: git-send-email 2.39.2 (Apple Git-143) In-Reply-To: <20260421092457.37649-1-cuiyunhui@bytedance.com> References: <20260421092457.37649-1-cuiyunhui@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: 
text/plain; charset="utf-8" Introduce raw PTE helpers prefixed with double underscores for callers that need direct access to the underlying PTE encoding. These __* helpers form private low-level primitives, while the existing helpers remain the public core-MM-facing API that RISC-V can later wrap with additional architecture-specific semantics without exposing those details to generic callers. Switch kernel internal page table users in early boot, KASAN, EFI, hibernate and pageattr to use the private raw helpers directly. No functional change intended. Signed-off-by: Yunhui Cui --- arch/riscv/include/asm/kfence.h | 4 +- arch/riscv/include/asm/pgtable.h | 87 ++++++++++++++++++++++++++++---- arch/riscv/kernel/efi.c | 4 +- arch/riscv/kernel/hibernate.c | 2 +- arch/riscv/mm/fault.c | 4 +- arch/riscv/mm/init.c | 8 +-- arch/riscv/mm/kasan_init.c | 14 ++--- arch/riscv/mm/pageattr.c | 12 ++--- arch/riscv/mm/pgtable.c | 19 ++++++- 9 files changed, 117 insertions(+), 37 deletions(-) diff --git a/arch/riscv/include/asm/kfence.h b/arch/riscv/include/asm/kfenc= e.h index d08bf7fb3aee6..2bcaeff1167c6 100644 --- a/arch/riscv/include/asm/kfence.h +++ b/arch/riscv/include/asm/kfence.h @@ -18,9 +18,9 @@ static inline bool kfence_protect_page(unsigned long addr= , bool protect) pte_t *pte =3D virt_to_kpte(addr); =20 if (protect) - set_pte(pte, __pte(pte_val(ptep_get(pte)) & ~_PAGE_PRESENT)); + __set_pte(pte, __pte(pte_val(__ptep_get(pte)) & ~_PAGE_PRESENT)); else - set_pte(pte, __pte(pte_val(ptep_get(pte)) | _PAGE_PRESENT)); + __set_pte(pte, __pte(pte_val(__ptep_get(pte)) | _PAGE_PRESENT)); =20 preempt_disable(); local_flush_tlb_kernel_range(addr, addr + PAGE_SIZE); diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgta= ble.h index a1a7c6520a095..4de1f40fa77ea 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -602,11 +602,18 @@ static inline int pte_same(pte_t pte_a, pte_t pte_b) * a page table are directly modified. 
Thus, the following hook is * made available. */ -static inline void set_pte(pte_t *ptep, pte_t pteval) +static inline void __set_pte(pte_t *ptep, pte_t pteval) { WRITE_ONCE(*ptep, pteval); } =20 +#define __set_pte __set_pte + +static inline void set_pte(pte_t *ptep, pte_t pteval) +{ + __set_pte(ptep, pteval); +} + void flush_icache_pte(struct mm_struct *mm, pte_t pte); =20 static inline void __set_pte_at(struct mm_struct *mm, pte_t *ptep, pte_t p= teval) @@ -619,8 +626,8 @@ static inline void __set_pte_at(struct mm_struct *mm, p= te_t *ptep, pte_t pteval) =20 #define PFN_PTE_SHIFT _PAGE_PFN_SHIFT =20 -static inline void set_ptes(struct mm_struct *mm, unsigned long addr, - pte_t *ptep, pte_t pteval, unsigned int nr) +static inline void __set_ptes(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t pteval, unsigned int nr) { page_table_check_ptes_set(mm, addr, ptep, pteval, nr); =20 @@ -632,31 +639,61 @@ static inline void set_ptes(struct mm_struct *mm, uns= igned long addr, pte_val(pteval) +=3D 1 << _PAGE_PFN_SHIFT; } } -#define set_ptes set_ptes + +#define __set_ptes __set_ptes + +static inline void set_ptes(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t pteval, unsigned int nr) +{ + __set_ptes(mm, addr, ptep, pteval, nr); +} + +static inline void __pte_clear(struct mm_struct *mm, + unsigned long addr, pte_t *ptep) +{ + __set_pte_at(mm, ptep, __pte(0)); +} =20 static inline void pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep) { - __set_pte_at(mm, ptep, __pte(0)); + __pte_clear(mm, addr, ptep); +} + +#define __ptep_get __ptep_get +static inline pte_t __ptep_get(pte_t *ptep) +{ + return READ_ONCE(*ptep); +} + +#define __ptep_get_lockless __ptep_get_lockless +static inline pte_t __ptep_get_lockless(pte_t *ptep) +{ + return __ptep_get(ptep); } =20 #define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS /* defined in mm/pgtable.c */ extern int ptep_set_access_flags(struct vm_area_struct *vma, unsigned long= address, pte_t *ptep, pte_t 
entry, int dirty); +int __ptep_set_access_flags(struct vm_area_struct *vma, + unsigned long address, pte_t *ptep, + pte_t entry, int dirty); #define __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG /* defined in mm/pgtable.c */ bool ptep_test_and_clear_young(struct vm_area_struct *vma, unsigned long address, pte_t *ptep); +bool __ptep_test_and_clear_young(struct vm_area_struct *vma, + unsigned long address, pte_t *ptep); =20 #define __HAVE_ARCH_PTEP_GET_AND_CLEAR -static inline pte_t ptep_get_and_clear(struct mm_struct *mm, - unsigned long address, pte_t *ptep) +static inline pte_t +__ptep_get_and_clear(struct mm_struct *mm, unsigned long address, pte_t *p= tep) { #ifdef CONFIG_SMP pte_t pte =3D __pte(xchg(&ptep->pte, 0)); #else pte_t pte =3D *ptep; =20 - set_pte(ptep, __pte(0)); + __set_pte(ptep, __pte(0)); #endif =20 page_table_check_pte_clear(mm, address, pte); @@ -664,9 +701,16 @@ static inline pte_t ptep_get_and_clear(struct mm_struc= t *mm, return pte; } =20 -#define __HAVE_ARCH_PTEP_SET_WRPROTECT -static inline void ptep_set_wrprotect(struct mm_struct *mm, - unsigned long address, pte_t *ptep) +#define __ptep_get_and_clear __ptep_get_and_clear + +static inline pte_t ptep_get_and_clear(struct mm_struct *mm, + unsigned long address, pte_t *ptep) +{ + return __ptep_get_and_clear(mm, address, ptep); +} + +static inline void +__ptep_set_wrprotect(struct mm_struct *mm, unsigned long address, pte_t *p= tep) { pte_t read_pte =3D READ_ONCE(*ptep); /* @@ -679,6 +723,27 @@ static inline void ptep_set_wrprotect(struct mm_struct= *mm, ((pte_val(read_pte) & ~(unsigned long)_PAGE_WRITE) | _PAGE_READ)); } =20 +#define __ptep_set_wrprotect __ptep_set_wrprotect + +#define __HAVE_ARCH_PTEP_SET_WRPROTECT +static inline void ptep_set_wrprotect(struct mm_struct *mm, + unsigned long address, pte_t *ptep) +{ + __ptep_set_wrprotect(mm, address, ptep); +} + +static inline pte_t __ptep_clear_flush(struct vm_area_struct *vma, + unsigned long address, + pte_t *ptep) +{ + pte_t pte =3D 
__ptep_get_and_clear(vma->vm_mm, address, ptep); + + if (pte_accessible(vma->vm_mm, pte)) + flush_tlb_page(vma, address); + + return pte; +} + #define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH static inline bool ptep_clear_flush_young(struct vm_area_struct *vma, unsigned long address, pte_t *ptep) diff --git a/arch/riscv/kernel/efi.c b/arch/riscv/kernel/efi.c index b64bf1624a052..673eca7705ba5 100644 --- a/arch/riscv/kernel/efi.c +++ b/arch/riscv/kernel/efi.c @@ -60,7 +60,7 @@ int __init efi_create_mapping(struct mm_struct *mm, efi_m= emory_desc_t *md) static int __init set_permissions(pte_t *ptep, unsigned long addr, void *d= ata) { efi_memory_desc_t *md =3D data; - pte_t pte =3D ptep_get(ptep); + pte_t pte =3D __ptep_get(ptep); unsigned long val; =20 if (md->attribute & EFI_MEMORY_RO) { @@ -72,7 +72,7 @@ static int __init set_permissions(pte_t *ptep, unsigned l= ong addr, void *data) val =3D pte_val(pte) & ~_PAGE_EXEC; pte =3D __pte(val); } - set_pte(ptep, pte); + __set_pte(ptep, pte); =20 return 0; } diff --git a/arch/riscv/kernel/hibernate.c b/arch/riscv/kernel/hibernate.c index 982843828adb7..0360a6f3e1bf2 100644 --- a/arch/riscv/kernel/hibernate.c +++ b/arch/riscv/kernel/hibernate.c @@ -186,7 +186,7 @@ static int temp_pgtable_map_pte(pmd_t *dst_pmdp, pmd_t = *src_pmdp, unsigned long pte_t pte =3D READ_ONCE(*src_ptep); =20 if (pte_present(pte)) - set_pte(dst_ptep, __pte(pte_val(pte) | pgprot_val(prot))); + __set_pte(dst_ptep, __pte(pte_val(pte) | pgprot_val(prot))); } while (dst_ptep++, src_ptep++, start +=3D PAGE_SIZE, start < end); =20 return 0; diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c index 04ed6f8acae4f..fe8b11a8ad143 100644 --- a/arch/riscv/mm/fault.c +++ b/arch/riscv/mm/fault.c @@ -69,7 +69,7 @@ static void show_pte(unsigned long addr) if (!ptep) goto out; =20 - pte =3D ptep_get(ptep); + pte =3D READ_ONCE(*ptep); pr_cont(", pte=3D%016lx", pte_val(pte)); pte_unmap(ptep); out: @@ -231,7 +231,7 @@ static inline void vmalloc_fault(struct pt_regs 
*regs, = int code, unsigned long a * silently loop forever. */ pte_k =3D pte_offset_kernel(pmd_k, addr); - if (!pte_present(ptep_get(pte_k))) { + if (!pte_present(__ptep_get(pte_k))) { no_context(regs, addr); return; } diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c index decd7df40fa42..86321b093d252 100644 --- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -376,9 +376,9 @@ void __set_fixmap(enum fixed_addresses idx, phys_addr_t= phys, pgprot_t prot) ptep =3D &fixmap_pte[pte_index(addr)]; =20 if (pgprot_val(prot)) - set_pte(ptep, pfn_pte(phys >> PAGE_SHIFT, prot)); + __set_pte(ptep, pfn_pte(phys >> PAGE_SHIFT, prot)); else - pte_clear(&init_mm, addr, ptep); + __pte_clear(&init_mm, addr, ptep); local_flush_tlb_page(addr); } =20 @@ -1558,11 +1558,11 @@ static void __meminit remove_pte_mapping(pte_t *pte= _base, unsigned long addr, un next =3D end; =20 ptep =3D pte_base + pte_index(addr); - pte =3D ptep_get(ptep); + pte =3D __ptep_get(ptep); if (!pte_present(*ptep)) continue; =20 - pte_clear(&init_mm, addr, ptep); + __pte_clear(&init_mm, addr, ptep); if (is_vmemmap) free_vmemmap_storage(pte_page(pte), PAGE_SIZE, altmap); } diff --git a/arch/riscv/mm/kasan_init.c b/arch/riscv/mm/kasan_init.c index c4a2a9e5586e7..0c2f5e8e48063 100644 --- a/arch/riscv/mm/kasan_init.c +++ b/arch/riscv/mm/kasan_init.c @@ -39,9 +39,9 @@ static void __init kasan_populate_pte(pmd_t *pmd, unsigne= d long vaddr, unsigned ptep =3D pte_offset_kernel(pmd, vaddr); =20 do { - if (pte_none(ptep_get(ptep))) { + if (pte_none(__ptep_get(ptep))) { phys_addr =3D memblock_phys_alloc(PAGE_SIZE, PAGE_SIZE); - set_pte(ptep, pfn_pte(PFN_DOWN(phys_addr), PAGE_KERNEL)); + __set_pte(ptep, pfn_pte(PFN_DOWN(phys_addr), PAGE_KERNEL)); memset(__va(phys_addr), KASAN_SHADOW_INIT, PAGE_SIZE); } } while (ptep++, vaddr +=3D PAGE_SIZE, vaddr !=3D end); @@ -327,8 +327,8 @@ asmlinkage void __init kasan_early_init(void) KASAN_SHADOW_END - (1UL << (64 - KASAN_SHADOW_SCALE_SHIFT))); =20 for (i =3D 0; i < 
PTRS_PER_PTE; ++i) - set_pte(kasan_early_shadow_pte + i, - pfn_pte(virt_to_pfn(kasan_early_shadow_page), PAGE_KERNEL)); + __set_pte(kasan_early_shadow_pte + i, + pfn_pte(virt_to_pfn(kasan_early_shadow_page), PAGE_KERNEL)); =20 for (i =3D 0; i < PTRS_PER_PMD; ++i) set_pmd(kasan_early_shadow_pmd + i, @@ -523,9 +523,9 @@ void __init kasan_init(void) kasan_mem_to_shadow((const void *)MODULES_VADDR + SZ_2G)); =20 for (i =3D 0; i < PTRS_PER_PTE; i++) - set_pte(&kasan_early_shadow_pte[i], - mk_pte(virt_to_page(kasan_early_shadow_page), - __pgprot(_PAGE_PRESENT | _PAGE_READ | + __set_pte(&kasan_early_shadow_pte[i], + mk_pte(virt_to_page(kasan_early_shadow_page), + __pgprot(_PAGE_PRESENT | _PAGE_READ | _PAGE_ACCESSED))); =20 memset(kasan_early_shadow_page, KASAN_SHADOW_INIT, PAGE_SIZE); diff --git a/arch/riscv/mm/pageattr.c b/arch/riscv/mm/pageattr.c index 3f76db3d27699..e0271e2a0b295 100644 --- a/arch/riscv/mm/pageattr.c +++ b/arch/riscv/mm/pageattr.c @@ -68,10 +68,10 @@ static int pageattr_pmd_entry(pmd_t *pmd, unsigned long= addr, static int pageattr_pte_entry(pte_t *pte, unsigned long addr, unsigned long next, struct mm_walk *walk) { - pte_t val =3D ptep_get(pte); + pte_t val =3D __ptep_get(pte); =20 val =3D __pte(set_pageattr_masks(pte_val(val), walk)); - set_pte(pte, val); + __set_pte(pte, val); =20 return 0; } @@ -121,7 +121,7 @@ static int __split_linear_mapping_pmd(pud_t *pudp, =20 ptep_new =3D (pte_t *)page_address(pte_page); for (i =3D 0; i < PTRS_PER_PTE; ++i, ++ptep_new) - set_pte(ptep_new, pfn_pte(pfn + i, prot)); + __set_pte(ptep_new, pfn_pte(pfn + i, prot)); =20 smp_wmb(); =20 @@ -406,14 +406,14 @@ static int debug_pagealloc_set_page(pte_t *pte, unsig= ned long addr, void *data) { int enable =3D *(int *)data; =20 - unsigned long val =3D pte_val(ptep_get(pte)); + unsigned long val =3D pte_val(__ptep_get(pte)); =20 if (enable) val |=3D _PAGE_PRESENT; else val &=3D ~_PAGE_PRESENT; =20 - set_pte(pte, __pte(val)); + __set_pte(pte, __pte(val)); =20 return 0; } @@ 
-466,5 +466,5 @@ bool kernel_page_present(struct page *page) return true; =20 pte =3D pte_offset_kernel(pmd, addr); - return pte_present(ptep_get(pte)); + return pte_present(__ptep_get(pte)); } diff --git a/arch/riscv/mm/pgtable.c b/arch/riscv/mm/pgtable.c index 9c4427d0b1874..9131a78fe15c4 100644 --- a/arch/riscv/mm/pgtable.c +++ b/arch/riscv/mm/pgtable.c @@ -8,6 +8,13 @@ int ptep_set_access_flags(struct vm_area_struct *vma, unsigned long address, pte_t *ptep, pte_t entry, int dirty) +{ + return __ptep_set_access_flags(vma, address, ptep, entry, dirty); +} + +int __ptep_set_access_flags(struct vm_area_struct *vma, + unsigned long address, pte_t *ptep, + pte_t entry, int dirty) { if (riscv_has_extension_unlikely(RISCV_ISA_EXT_SVVPTC)) { if (!pte_same(ptep_get(ptep), entry)) { @@ -32,11 +39,19 @@ int ptep_set_access_flags(struct vm_area_struct *vma, bool ptep_test_and_clear_young(struct vm_area_struct *vma, unsigned long address, pte_t *ptep) { - if (!pte_young(ptep_get(ptep))) + return __ptep_test_and_clear_young(vma, address, ptep); +} +EXPORT_SYMBOL_GPL(ptep_test_and_clear_young); + +bool __ptep_test_and_clear_young(struct vm_area_struct *vma, + unsigned long address, pte_t *ptep) +{ + if (!pte_young(__ptep_get(ptep))) return false; + return test_and_clear_bit(_PAGE_ACCESSED_OFFSET, &pte_val(*ptep)); } -EXPORT_SYMBOL_GPL(ptep_test_and_clear_young); +EXPORT_SYMBOL_GPL(__ptep_test_and_clear_young); =20 #ifdef CONFIG_64BIT pud_t *pud_offset(p4d_t *p4d, unsigned long address) --=20 2.39.5 From nobody Wed Jun 17 01:42:34 2026 Received: from mail-pl1-f175.google.com (mail-pl1-f175.google.com [209.85.214.175]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D9C4139F162 for ; Tue, 21 Apr 2026 09:25:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.175 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; 
s=arc-20240116; t=1776763531; cv=none; b=XAeWEQuo0xaRl0mEpIf0/qJ4SmT7BF7c4PboWJSKRSE+dqBcxA2uijG8ppoYi2Sc2b4wATKlVpDL0gB+nHTo1s7IHleLfH1Qy7fsnxTaO7cOoL+w5Lg4GJ9PbHSgOhg8zMPRW91yOR1b0mtVvgOm4g1GCl0P4YAuMv+gg1Lbcik= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776763531; c=relaxed/simple; bh=qUIa/9sDsAgCTGzA9WinapTYHc7hlNf+Cle2ugEYNbY=; h=From:To:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=T98Jg2yq8edM551fv2n+2KbdvNUAeK0W32UPczxo5D3m3JVj6MIC4ImUpB+YI3ebUvJAcypxVL7YY85IjR77MAxNkpd7ASP1Qe37wcBlJKeELXQnZ4Pf0N6Uxghh545blXr+wv2Bgz6G9gtMUGLSn+kqrJejXP+o03iaK8EDvz0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=A7tYf/Xq; arc=none smtp.client-ip=209.85.214.175 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="A7tYf/Xq" Received: by mail-pl1-f175.google.com with SMTP id d9443c01a7336-2b0afa0210bso18640075ad.2 for ; Tue, 21 Apr 2026 02:25:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1776763529; x=1777368329; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=hAU2czezOYKOKIblUParrQgVvl9OeyW0IiuDcmMed3s=; b=A7tYf/Xq5WVKWO5Pi9ZYHlN05OrDhUVlDEXf3HM18VqP2j96Ob53D7zMfOvxUYvOHP cqCYgo+sj9oe2723eY+EwqZKC7WDu7791TxGJ0ISXOe8Dp+3Dg8H/eumxGWwCdfg3Psv LWnhx76MP4dEAgxJTv31YcWe62UbcWdtc61Bts9pnAaesyllUMm5fmgIIgRww9JIjjxu +TYFHmMPI4/w+FEcUJk/PFP4Uld7CwMLnv1aJ5cWnwwaGYfCki2zIa4y4Hvif8m+MB1O 
Jf7Do2eU0sK5qWZtykmT5LbIqiTSOT08A/SMRiXpUjTeYXrTWqJEJ+EKTTNeanTsmC1h Et8Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1776763529; x=1777368329; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-gg:x-gm-message-state:from:to :cc:subject:date:message-id:reply-to; bh=hAU2czezOYKOKIblUParrQgVvl9OeyW0IiuDcmMed3s=; b=Aw4fC732OEl/1BfKwYDps7hdoMCBf42iOEnKrHjwfh9joYNyljwyvOPHr52iWJcDAA aUTsVMGm9clE2PVznLFGWmm2QX971mlqqowIJaKBRpPHDALiKtMsRqmGgUbUNirFKjga wDv0T7Jro5wL2Z4EBcWWUmaBJRx6bhVQUqUyBPTX4lzfMkfvb1aMKKXutlqLmHeWedCg xk68wNnc1aWuEYxhmRD4gagpY90zpK0yN+xPPJFBCJMm1VwsQ/yKKK1hs12E+7l/U9sk RUoKI4p6xI3QX9HkAmYxiOB1BJnkfIVS5zT+P7MWQA7ylwkThXWrp1snGv9vSoZllYBE cf/g== X-Forwarded-Encrypted: i=1; AFNElJ9qn6bobwCs2hOQhAd+J31+7sCQaKwxZ02Sc9jONrkdd72rGtfPBKRqBIfSdBveRzn8si5APgeg5XUbgRE=@vger.kernel.org X-Gm-Message-State: AOJu0YzVFSekZ2AzcMZRYzcGfj09NvqKh2cXIqaUzOZYThBDko+s72JF pUwl1O1PIsBi53WB6UkiaCCXViNdevwWj/HzKxAsjHDKqlG+hYetFNdJxSk8MhENfXo= X-Gm-Gg: AeBDietRZKTtTwHDU1t+in8YjBEptulO5rQQ+VrJeGahCpdBHcoOL/B+pCHUV3zXUBl IMlAYWBmq8hMVXCbSkovMgfbLGFuaa41jJFYRp/wl3PTEQGEjqrfzsaU78rIwmta9X5h+Yo0UXd 52VLP5R6JB32q7664U/yq/s9gEq3kTO7ua4TWnD5bP+0/vLKfG8dbxap8m8Z8ASWB9lSjbsUB3l Q0Nv+Di8dc09l0enBWjhjFn6wKwT6tQg8NCPV4MliZ0FJh1ULbUdxdnYFKeHimBWsMxwKELArgX 7qq20fTyYqvCY4aAUVd0LOqIZsOVCiHg0RNJG7BZjcu+Jpi1cXF1GP+iEysrYDr3Z7YQelZpzNq IQnrKefcHZ8m7v8Y1C8zLfLmd6FWixbflVf9cYVWJhsZGyO0zJfp0Kbo5R5HWm+5RQF+Gvqyp0k FnftV8tZWt4igSOfI7t2nGU2kwah9MxUYSyNSFo5ufg8oBH+PMc/xVxx0n/Zp3T6niNH2r X-Received: by 2002:a17:902:bd44:b0:2b2:5da8:14be with SMTP id d9443c01a7336-2b5f9fcd9eemr128057035ad.41.1776763529026; Tue, 21 Apr 2026 02:25:29 -0700 (PDT) Received: from L6YN4KR4K9.bytedance.net ([61.213.176.6]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-2b5fa9ff39csm131965105ad.4.2026.04.21.02.25.19 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Tue, 21 Apr 2026 02:25:28 -0700 (PDT) From: Yunhui 
Cui To: akpm@linux-foundation.org, alex@ghiti.fr, andrew+kernel@donnellan.id.au, andreyknvl@gmail.com, anup@brainfault.org, aou@eecs.berkeley.edu, apopple@nvidia.com, ardb@kernel.org, atish.patra@linux.dev, baolin.wang@linux.alibaba.com, cuiyunhui@bytedance.com, david@kernel.org, debug@rivosinc.com, djordje.todorovic@htecgroup.com, dvyukov@google.com, elver@google.com, glider@google.com, ilias.apalodimas@linaro.org, junhui.liu@pigmoral.tech, kasan-dev@googlegroups.com, kees@kernel.org, kevin.brodsky@arm.com, kvm-riscv@lists.infradead.org, kvm@vger.kernel.org, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org, liu.xuemei1@zte.com.cn, ljs@kernel.org, namcao@linutronix.de, osalvador@suse.de, palmer@dabbelt.com, pjw@kernel.org, rmclure@linux.ibm.com, rostedt@goodmis.org, rppt@kernel.org, ryabinin.a.a@gmail.com, surenb@google.com, vincenzo.frascino@arm.com, vishal.moola@gmail.com, wangruikang@iscas.ac.cn, zhangchunyan@iscas.ac.cn Subject: [PATCH 2/7] riscv/kvm: use raw PTE helpers for G-stage leaf PTEs Date: Tue, 21 Apr 2026 17:24:52 +0800 Message-Id: <20260421092457.37649-3-cuiyunhui@bytedance.com> X-Mailer: git-send-email 2.39.2 (Apple Git-143) In-Reply-To: <20260421092457.37649-1-cuiyunhui@bytedance.com> References: <20260421092457.37649-1-cuiyunhui@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Use the raw RISC-V PTE helpers when KVM G-stage code needs to inspect or update the exact leaf entry encoding. This keeps G-stage page tables independent from the public PTE wrappers that will gain Svnapot-aware behavior. No functional change intended. 
Signed-off-by: Yunhui Cui --- arch/riscv/kvm/gstage.c | 48 ++++++++++++++++++++++------------------- arch/riscv/kvm/mmu.c | 4 ++-- 2 files changed, 28 insertions(+), 24 deletions(-) diff --git a/arch/riscv/kvm/gstage.c b/arch/riscv/kvm/gstage.c index d9fe8be2a1516..fda235092533a 100644 --- a/arch/riscv/kvm/gstage.c +++ b/arch/riscv/kvm/gstage.c @@ -88,7 +88,7 @@ bool kvm_riscv_gstage_get_leaf(struct kvm_gstage *gstage,= gpa_t addr, *ptep_level =3D current_level; ptep =3D (pte_t *)gstage->pgd; ptep =3D &ptep[gstage_pte_index(gstage, addr, current_level)]; - while (ptep && pte_val(ptep_get(ptep))) { + while (ptep && pte_val(__ptep_get(ptep))) { if (gstage_pte_leaf(ptep)) { *ptep_level =3D current_level; *ptepp =3D ptep; @@ -98,7 +98,7 @@ bool kvm_riscv_gstage_get_leaf(struct kvm_gstage *gstage,= gpa_t addr, if (current_level) { current_level--; *ptep_level =3D current_level; - ptep =3D (pte_t *)gstage_pte_page_vaddr(ptep_get(ptep)); + ptep =3D (pte_t *)gstage_pte_page_vaddr(__ptep_get(ptep)); ptep =3D &ptep[gstage_pte_index(gstage, addr, current_level)]; } else { ptep =3D NULL; @@ -138,18 +138,19 @@ int kvm_riscv_gstage_set_pte(struct kvm_gstage *gstag= e, if (gstage_pte_leaf(ptep)) return -EEXIST; =20 - if (!pte_val(ptep_get(ptep))) { + if (!pte_val(__ptep_get(ptep))) { if (!pcache) return -ENOMEM; next_ptep =3D kvm_mmu_memory_cache_alloc(pcache); if (!next_ptep) return -ENOMEM; - set_pte(ptep, pfn_pte(PFN_DOWN(__pa(next_ptep)), - __pgprot(_PAGE_TABLE))); + __set_pte(ptep, + pfn_pte(PFN_DOWN(__pa(next_ptep)), + __pgprot(_PAGE_TABLE))); } else { if (gstage_pte_leaf(ptep)) return -EEXIST; - next_ptep =3D (pte_t *)gstage_pte_page_vaddr(ptep_get(ptep)); + next_ptep =3D (pte_t *)gstage_pte_page_vaddr(__ptep_get(ptep)); } =20 current_level--; @@ -157,7 +158,7 @@ int kvm_riscv_gstage_set_pte(struct kvm_gstage *gstage, } =20 if (pte_val(*ptep) !=3D pte_val(map->pte)) { - set_pte(ptep, map->pte); + __set_pte(ptep, map->pte); if (gstage_pte_leaf(ptep)) 
gstage_tlb_flush(gstage, current_level, map->addr); } @@ -170,13 +171,13 @@ static void kvm_riscv_gstage_update_pte_prot(struct k= vm_gstage *gstage, u32 leve { pte_t new_pte; =20 - if (pgprot_val(pte_pgprot(ptep_get(ptep))) =3D=3D pgprot_val(prot)) + if (pgprot_val(pte_pgprot(__ptep_get(ptep))) =3D=3D pgprot_val(prot)) return; =20 - new_pte =3D pfn_pte(pte_pfn(ptep_get(ptep)), prot); + new_pte =3D pfn_pte(pte_pfn(__ptep_get(ptep)), prot); new_pte =3D pte_mkdirty(new_pte); =20 - set_pte(ptep, new_pte); + __set_pte(ptep, new_pte); =20 gstage_tlb_flush(gstage, level, addr); } @@ -255,7 +256,8 @@ int kvm_riscv_gstage_map_page(struct kvm_gstage *gstage, if (ptep_level > out_map->level) { kvm_riscv_gstage_split_huge(gstage, pcache, gpa, out_map->level, true); - } else if (ALIGN_DOWN(PFN_PHYS(pte_pfn(ptep_get(ptep))), page_size) =3D= =3D hpa) { + } else if (ALIGN_DOWN(PFN_PHYS(pte_pfn(__ptep_get(ptep))), + page_size) =3D=3D hpa) { kvm_riscv_gstage_update_pte_prot(gstage, ptep_level, gpa, ptep, prot); return 0; } @@ -301,16 +303,16 @@ int kvm_riscv_gstage_split_huge(struct kvm_gstage *gs= tage, while(current_level > target_level) { ptep =3D (pte_t *)&next_ptep[gstage_pte_index(gstage, addr, current_leve= l)]; =20 - if (!pte_val(ptep_get(ptep))) + if (!pte_val(__ptep_get(ptep))) break; =20 if (!gstage_pte_leaf(ptep)) { - next_ptep =3D (pte_t *)gstage_pte_page_vaddr(ptep_get(ptep)); + next_ptep =3D (pte_t *)gstage_pte_page_vaddr(__ptep_get(ptep)); current_level--; continue; } =20 - huge_pte =3D pte_val(ptep_get(ptep)); + huge_pte =3D pte_val(__ptep_get(ptep)); =20 ret =3D gstage_level_to_page_size(gstage, current_level - 1, &child_page= _size); if (ret) @@ -322,11 +324,12 @@ int kvm_riscv_gstage_split_huge(struct kvm_gstage *gs= tage, =20 for (i =3D 0; i < PTRS_PER_PTE; i++) { child_pte =3D make_child_pte(huge_pte, i, child_page_size); - set_pte((pte_t *)&next_ptep[i], __pte(child_pte)); + __set_pte((pte_t *)&next_ptep[i], __pte(child_pte)); } =20 - set_pte(ptep, 
pfn_pte(PFN_DOWN(__pa(next_ptep)), - __pgprot(_PAGE_TABLE))); + __set_pte(ptep, + pfn_pte(PFN_DOWN(__pa(next_ptep)), + __pgprot(_PAGE_TABLE))); =20 if (flush) gstage_tlb_flush(gstage, current_level, addr); @@ -351,18 +354,18 @@ void kvm_riscv_gstage_op_pte(struct kvm_gstage *gstag= e, gpa_t addr, =20 WARN_ON(addr & (page_size - 1)); =20 - if (!pte_val(ptep_get(ptep))) + if (!pte_val(__ptep_get(ptep))) return; =20 if (ptep_level && !gstage_pte_leaf(ptep)) { - next_ptep =3D (pte_t *)gstage_pte_page_vaddr(ptep_get(ptep)); + next_ptep =3D (pte_t *)gstage_pte_page_vaddr(__ptep_get(ptep)); next_ptep_level =3D ptep_level - 1; ret =3D gstage_level_to_page_size(gstage, next_ptep_level, &next_page_si= ze); if (ret) return; =20 if (op =3D=3D GSTAGE_OP_CLEAR) - set_pte(ptep, __pte(0)); + __set_pte(ptep, __pte(0)); for (i =3D 0; i < PTRS_PER_PTE; i++) kvm_riscv_gstage_op_pte(gstage, addr + i * next_page_size, &next_ptep[i], next_ptep_level, op); @@ -371,9 +374,10 @@ void kvm_riscv_gstage_op_pte(struct kvm_gstage *gstage= , gpa_t addr, } else { old_pte =3D *ptep; if (op =3D=3D GSTAGE_OP_CLEAR) - set_pte(ptep, __pte(0)); + __set_pte(ptep, __pte(0)); else if (op =3D=3D GSTAGE_OP_WP) - set_pte(ptep, __pte(pte_val(ptep_get(ptep)) & ~_PAGE_WRITE)); + __set_pte(ptep, + __pte(pte_val(__ptep_get(ptep)) & ~_PAGE_WRITE)); if (pte_val(*ptep) !=3D pte_val(old_pte)) gstage_tlb_flush(gstage, ptep_level, addr); } diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c index 2d3def024270c..f338ef08a6d13 100644 --- a/arch/riscv/kvm/mmu.c +++ b/arch/riscv/kvm/mmu.c @@ -262,7 +262,7 @@ bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range = *range) &ptep, &ptep_level)) return false; =20 - return ptep_test_and_clear_young(NULL, 0, ptep); + return __ptep_test_and_clear_young(NULL, 0, ptep); } =20 bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range) @@ -282,7 +282,7 @@ bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_r= ange *range) &ptep, &ptep_level)) return false; =20 - 
return pte_young(ptep_get(ptep)); + return pte_young(__ptep_get(ptep)); } =20 static bool fault_supports_gstage_huge_mapping(struct kvm_memory_slot *mem= slot, --=20 2.39.5 From nobody Wed Jun 17 01:42:34 2026 Received: from mail-pl1-f170.google.com (mail-pl1-f170.google.com [209.85.214.170]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9A1BE351C36 for ; Tue, 21 Apr 2026 09:25:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.170 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776763543; cv=none; b=A6mSQ7LiGSfZjncyr+LUOzStKaIeH/p8vGaLQoEcJBmX8s27AvTRvWZ9dBQ51OEdNtCrQlmyOJgkuLTdkvSBBkNF6CJeGRyGod1kdf8aywVRCjGvBK+9fRmrCPysBZlawvg8GCq0yuwu9XGrFbaEEAXkLOYbyTYqUA8tFvPYayA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776763543; c=relaxed/simple; bh=HQOGKbyG1jukjMC5CkaRgoKkdrfRVJKKytRiflUvgY0=; h=From:To:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=JqpcVxNmYeruiJz1ODw36ar4xPtZ1d16Q0PoLuUjkLohR5tp/CK7YP71qxHlEqF6MBsXRkjt2NqDe0TtCUSs1KLtlMfB1Dz6UEJ5vmJIUlrJcQ+n8OClneyaBQPETEgg3YChfzGEdfPu7RyOSyoYCM/WwPGSr3y+RT7FByNmVfY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=B37AwPPE; arc=none smtp.client-ip=209.85.214.170 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="B37AwPPE" Received: by mail-pl1-f170.google.com with SMTP id d9443c01a7336-2b299b3c739so18248015ad.3 for ; 
Tue, 21 Apr 2026 02:25:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1776763539; x=1777368339; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=e3Aiqpt/ykXwmcx+XDDP/rmLrEEbX7Jgl7wd7KeN5Vc=; b=B37AwPPEQpmKAdGFaeQMU9XWxmd9m/ZRvNXJ+yXZzTwrIg1EnAB0CURVNE834xbPMb kzQRSPgd//ZAvlJXYh/FPZVLEylY1TawPi7S2M5VB3WJ4Gmd4UQRlN1wlH+c6g+EeNkN eWZlUFuBXZyocDz5cDJI8S6mhcCsP6ymWv6XxWp8wFB7AYsztZ1nF7DbSxY+6dtAbIcU T+i/rQq3s79y8WxVGnhSeFAvx6IdY0GtzGPIrpxTH4QuC8rJtVEUYx0PDW5RbzehISu5 D00eytCKTBTxoLTxchZJ9zZy1/XL86Blsf+MNgGU4OLXHAx9vTWJxi8VkUc6f9G1zD2c Iz/Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1776763539; x=1777368339; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-gg:x-gm-message-state:from:to :cc:subject:date:message-id:reply-to; bh=e3Aiqpt/ykXwmcx+XDDP/rmLrEEbX7Jgl7wd7KeN5Vc=; b=AEmnHVDq3H1DG5XnCHrH5Ahu1dQUO3xKsgfpvBUO7PI2lbkYJG1t7I+hJHkQVSTAkG i8MnHD85376a9sZbuB0XrwwkTJyHl+Iy2DFKhYCUQiE1NfmxJyH/TCtaBz7dxmQ00ssp JTh4m5XfDD91zcOBk+RDZTsWY3OtXTNsfiRyRqXvAjWLiFATOM70JLmkjqwX0Iy9C2HH 118v4nqNB56sQDc4xUyD+paTzOWWOLdOMopvz79xr2RCkGjPMBOsOKd8BHsKcqSDJ6OA iDxDDI4paTx5TkFKx33B3GB6cYVWszmbmZTrMKZQ+vRjxRiRd+RJrWADvkSnkljupvuV y6gg== X-Forwarded-Encrypted: i=1; AFNElJ/me7F7AvRYqxK52PHo+t3rDG9H/4BPzixjreKr/gUpVqS1pqRI1VqUVbcpt/3v9K08jlJmSkaHg2axsNw=@vger.kernel.org X-Gm-Message-State: AOJu0YzwRqOZFQH2TQzpStRfszOcGUXcsi647vTEcOIJ3L0cI8uCueFw Epnr8QoYFks06jXUxEb7pXrbYoZNxtS4w9Z/M5+Eqr8OiUHCTD9lqk2GspIDK0GB9A8= X-Gm-Gg: AeBDies3fRsboW2tm7bftFPBrN8TGN4xky93cuiNA8FB1ZwWuQ3igNPMi5erex2hczO yeGl7zSDKJjSIuTlaWmKpFvVbmjKbgvyQUQ9nmwNNPnWWXr+mnmJMF5D6Rh1wF6jKGPhsD6YhO3 yKw50YohG576MW7DhnfulCGjjDw274ij5cShoE+5EGNZtUQG0XMVnjt3HDgxxrYtMCtBgAN4FXC N8pfmMIue/xGF0dyIeAKsju7BqhPcSnEWc6/92vimUg/FfM61GjIUhgyYoBsiVJtK58/p/wBV7t 
nl0cqaMukHmMBVKqjR3I1jphKIYEicQJXisLOAkCZcj6w8x0ZMxbW5gPo5UTYUPGywjKSlAc+pf hSKDGP9PtbUv5e+TBEn3MQzOgBG7k3lpPzYTkvAI3av7lJn6AhTp3yN7e9G1fcL0zo9mz0HO5fa X9xcHjIcr4ZiUPYSXvTiAL/ofhDF40Owb9GhO9w98CUxdWgN4jE+D/d26j4GUOP7tRnUCM X-Received: by 2002:a17:902:f54d:b0:2b4:5b1a:d09c with SMTP id d9443c01a7336-2b5f9edb4cbmr186430275ad.15.1776763538472; Tue, 21 Apr 2026 02:25:38 -0700 (PDT) Received: from L6YN4KR4K9.bytedance.net ([61.213.176.6]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-2b5fa9ff39csm131965105ad.4.2026.04.21.02.25.29 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Tue, 21 Apr 2026 02:25:38 -0700 (PDT) From: Yunhui Cui To: akpm@linux-foundation.org, alex@ghiti.fr, andrew+kernel@donnellan.id.au, andreyknvl@gmail.com, anup@brainfault.org, aou@eecs.berkeley.edu, apopple@nvidia.com, ardb@kernel.org, atish.patra@linux.dev, baolin.wang@linux.alibaba.com, cuiyunhui@bytedance.com, david@kernel.org, debug@rivosinc.com, djordje.todorovic@htecgroup.com, dvyukov@google.com, elver@google.com, glider@google.com, ilias.apalodimas@linaro.org, junhui.liu@pigmoral.tech, kasan-dev@googlegroups.com, kees@kernel.org, kevin.brodsky@arm.com, kvm-riscv@lists.infradead.org, kvm@vger.kernel.org, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org, liu.xuemei1@zte.com.cn, ljs@kernel.org, namcao@linutronix.de, osalvador@suse.de, palmer@dabbelt.com, pjw@kernel.org, rmclure@linux.ibm.com, rostedt@goodmis.org, rppt@kernel.org, ryabinin.a.a@gmail.com, surenb@google.com, vincenzo.frascino@arm.com, vishal.moola@gmail.com, wangruikang@iscas.ac.cn, zhangchunyan@iscas.ac.cn Subject: [PATCH 3/7] riscv: mm: add Svnapot-aware contiguous PTE wrappers Date: Tue, 21 Apr 2026 17:24:53 +0800 Message-Id: <20260421092457.37649-4-cuiyunhui@bytedance.com> X-Mailer: git-send-email 2.39.2 (Apple Git-143) In-Reply-To: <20260421092457.37649-1-cuiyunhui@bytedance.com> References: <20260421092457.37649-1-cuiyunhui@bytedance.com> Precedence: bulk 
X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add Svnapot-aware wrappers around the public PTE helpers so core MM callers can operate on contiguous mappings without learning the NAPOT encoding details. Introduce contpte.c to handle folding, unfolding and accessed/dirty state aggregation for contiguous PTE blocks. Keep the raw __* helpers unchanged so NAPOT-aware callers can continue to access the underlying PTE encoding directly, and centralize the public Svnapot-aware wrappers under a single CONFIG_RISCV_ISA_SVNAPOT block with simple alias fallbacks for the non-Svnapot case. Signed-off-by: Yunhui Cui --- arch/riscv/include/asm/pgtable.h | 288 +++++++++++++++++-- arch/riscv/mm/Makefile | 1 + arch/riscv/mm/contpte.c | 479 +++++++++++++++++++++++++++++++ arch/riscv/mm/pgtable.c | 39 ++- 4 files changed, 769 insertions(+), 38 deletions(-) create mode 100644 arch/riscv/mm/contpte.c diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgta= ble.h index 4de1f40fa77ea..722483d4df37f 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -11,6 +11,10 @@ =20 #include =20 +#ifndef __ASSEMBLER__ +#include +#endif + #ifndef CONFIG_MMU #ifdef CONFIG_RELOCATABLE #define KERNEL_LINK_ADDR UL(0) @@ -301,6 +305,12 @@ static inline unsigned long pte_napot(pte_t pte) return 0; } =20 +static inline pte_t pte_mknapot(pte_t pte, unsigned int order) +{ + (void)order; + return pte; +} + #endif /* CONFIG_RISCV_ISA_SVNAPOT */ =20 /* Yields the page frame number (PFN) of a page table entry */ @@ -339,6 +349,11 @@ static inline int pte_present(pte_t pte) return (pte_val(pte) & (_PAGE_PRESENT | _PAGE_PROT_NONE)); } =20 +static inline bool pte_present_napot(pte_t pte) +{ + return pte_present(pte) && pte_napot(pte); +} + #define pte_accessible pte_accessible static inline unsigned long 
pte_accessible(struct mm_struct *mm, pte_t a) { @@ -392,6 +407,23 @@ static inline int pte_special(pte_t pte) return pte_val(pte) & _PAGE_SPECIAL; } =20 +static inline pte_t pte_mknonnapot(pte_t pte, unsigned long addr) +{ + unsigned long pfn; + unsigned long offset; + pgprot_t prot; + + if (!pte_present_napot(pte)) + return pte; + + offset =3D (addr & (napot_cont_size(napot_cont_order(pte)) - 1)) >> + PAGE_SHIFT; + pfn =3D pte_pfn(pte) + offset; + prot =3D __pgprot((pte_val(pte) & ~_PAGE_PFN_MASK) & ~_PAGE_NAPOT); + + return pfn_pte(pfn, prot); +} + /* static inline pte_t pte_rdprotect(pte_t pte) */ =20 static inline pte_t pte_wrprotect(pte_t pte) @@ -642,24 +674,12 @@ static inline void __set_ptes(struct mm_struct *mm, u= nsigned long addr, =20 #define __set_ptes __set_ptes =20 -static inline void set_ptes(struct mm_struct *mm, unsigned long addr, - pte_t *ptep, pte_t pteval, unsigned int nr) -{ - __set_ptes(mm, addr, ptep, pteval, nr); -} - static inline void __pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep) { __set_pte_at(mm, ptep, __pte(0)); } =20 -static inline void pte_clear(struct mm_struct *mm, - unsigned long addr, pte_t *ptep) -{ - __pte_clear(mm, addr, ptep); -} - #define __ptep_get __ptep_get static inline pte_t __ptep_get(pte_t *ptep) { @@ -672,6 +692,47 @@ static inline pte_t __ptep_get_lockless(pte_t *ptep) return __ptep_get(ptep); } =20 +static inline void __clear_young_dirty_pte(struct vm_area_struct *vma, + unsigned long addr, pte_t *ptep, + pte_t pte, cydp_t flags) +{ + pte_t old_pte; + + do { + old_pte =3D pte; + + if (flags & CYDP_CLEAR_YOUNG) + pte =3D pte_mkold(pte); + if (flags & CYDP_CLEAR_DIRTY) + pte =3D pte_mkclean(pte); + + pte_val(pte) =3D cmpxchg_relaxed(&pte_val(*ptep), + pte_val(old_pte), + pte_val(pte)); + } while (pte_val(pte) !=3D pte_val(old_pte)); +} + +static inline void __clear_young_dirty_ptes(struct vm_area_struct *vma, + unsigned long addr, pte_t *ptep, + unsigned int nr, cydp_t flags) +{ + pte_t pte; + + 
for (;;) { + pte =3D __ptep_get(ptep); + + if (flags =3D=3D (CYDP_CLEAR_YOUNG | CYDP_CLEAR_DIRTY)) + __set_pte(ptep, pte_mkclean(pte_mkold(pte))); + else + __clear_young_dirty_pte(vma, addr, ptep, pte, flags); + + if (--nr =3D=3D 0) + break; + ptep++; + addr +=3D PAGE_SIZE; + } +} + #define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS /* defined in mm/pgtable.c */ extern int ptep_set_access_flags(struct vm_area_struct *vma, unsigned long= address, pte_t *ptep, pte_t entry, int dirty); @@ -703,12 +764,6 @@ __ptep_get_and_clear(struct mm_struct *mm, unsigned lo= ng address, pte_t *ptep) =20 #define __ptep_get_and_clear __ptep_get_and_clear =20 -static inline pte_t ptep_get_and_clear(struct mm_struct *mm, - unsigned long address, pte_t *ptep) -{ - return __ptep_get_and_clear(mm, address, ptep); -} - static inline void __ptep_set_wrprotect(struct mm_struct *mm, unsigned long address, pte_t *p= tep) { @@ -725,13 +780,6 @@ __ptep_set_wrprotect(struct mm_struct *mm, unsigned lo= ng address, pte_t *ptep) =20 #define __ptep_set_wrprotect __ptep_set_wrprotect =20 -#define __HAVE_ARCH_PTEP_SET_WRPROTECT -static inline void ptep_set_wrprotect(struct mm_struct *mm, - unsigned long address, pte_t *ptep) -{ - __ptep_set_wrprotect(mm, address, ptep); -} - static inline pte_t __ptep_clear_flush(struct vm_area_struct *vma, unsigned long address, pte_t *ptep) @@ -744,9 +792,8 @@ static inline pte_t __ptep_clear_flush(struct vm_area_s= truct *vma, return pte; } =20 -#define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH -static inline bool ptep_clear_flush_young(struct vm_area_struct *vma, - unsigned long address, pte_t *ptep) +static inline bool __ptep_clear_flush_young(struct vm_area_struct *vma, + unsigned long address, pte_t *ptep) { /* * This comment is borrowed from x86, but applies equally to RISC-V: @@ -763,9 +810,192 @@ static inline bool ptep_clear_flush_young(struct vm_a= rea_struct *vma, * shouldn't really matter because there's no real memory * pressure for swapout to react to. 
] */ - return ptep_test_and_clear_young(vma, address, ptep); + return __ptep_test_and_clear_young(vma, address, ptep); +} + +#define __ptep_clear_flush_young __ptep_clear_flush_young + +#ifdef CONFIG_RISCV_ISA_SVNAPOT + +/* + * The Svnapot helpers transparently manage napot-encoded PTEs for the pub= lic + * core-MM-facing API below. The napot bit is a private implementation det= ail + * of those public helpers. Callers that need direct access to the underly= ing + * PTE encoding must use the low-level __* helpers instead. + */ +void __napotpte_try_fold(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t pte); +void __napotpte_try_unfold(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t pte); +pte_t napotpte_ptep_get(pte_t *ptep, pte_t orig_pte); +pte_t napotpte_ptep_get_lockless(pte_t *ptep); +void napotpte_set_ptes(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t pte, unsigned int nr); +void napotpte_clear_young_dirty_ptes(struct vm_area_struct *vma, + unsigned long addr, pte_t *ptep, + unsigned int nr, cydp_t flags); +bool napotpte_ptep_set_access_flags(struct vm_area_struct *vma, + unsigned long address, pte_t *ptep, + pte_t entry, int dirty); +bool napotpte_ptep_test_and_clear_young(struct vm_area_struct *vma, + unsigned long address, pte_t *ptep); +bool napotpte_ptep_clear_flush_young(struct vm_area_struct *vma, + unsigned long address, pte_t *ptep); + +static __always_inline bool riscv_pte_present_napot(pte_t pte) +{ + return riscv_has_extension_unlikely(RISCV_ISA_EXT_SVNAPOT) && + pte_present_napot(pte); +} + +static __always_inline void +napotpte_try_fold(struct mm_struct *mm, unsigned long addr, pte_t *ptep, + pte_t pte) +{ + const unsigned long contmask =3D napot_pte_num(NAPOT_CONT64KB_ORDER) - 1; + bool valign =3D ((addr >> PAGE_SHIFT) & contmask) =3D=3D contmask; + + if (unlikely(valign)) { + bool palign =3D (pte_pfn(pte) & contmask) =3D=3D contmask; + + if (unlikely(palign && pte_present(pte) && !pte_napot(pte) && 
+ !pte_special(pte))) + __napotpte_try_fold(mm, addr, ptep, pte); + } +} + +static __always_inline void +napotpte_try_unfold(struct mm_struct *mm, unsigned long addr, pte_t *ptep, + pte_t pte) +{ + if (unlikely(pte_present_napot(pte))) + __napotpte_try_unfold(mm, addr, ptep, pte); +} + +#define set_ptes set_ptes +static inline void set_ptes(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t pteval, unsigned int nr) +{ + pteval =3D pte_mknonnapot(pteval, addr); + + if (likely(nr =3D=3D 1)) { + napotpte_try_unfold(mm, addr, ptep, __ptep_get(ptep)); + __set_ptes(mm, addr, ptep, pteval, 1); + napotpte_try_fold(mm, addr, ptep, pteval); + return; + } + + napotpte_set_ptes(mm, addr, ptep, pteval, nr); +} + +static inline void pte_clear(struct mm_struct *mm, + unsigned long addr, pte_t *ptep) +{ + napotpte_try_unfold(mm, addr, ptep, __ptep_get(ptep)); + __pte_clear(mm, addr, ptep); +} + +#define ptep_get ptep_get +static inline pte_t ptep_get(pte_t *ptep) +{ + pte_t pte =3D __ptep_get(ptep); + + if (likely(!pte_present_napot(pte))) + return pte; + + return napotpte_ptep_get(ptep, pte); +} + +#define ptep_get_lockless ptep_get_lockless +static inline pte_t ptep_get_lockless(pte_t *ptep) +{ + pte_t pte =3D __ptep_get_lockless(ptep); + + if (likely(!pte_present_napot(pte))) + return pte; + + return napotpte_ptep_get_lockless(ptep); +} + +static inline pte_t ptep_get_and_clear(struct mm_struct *mm, + unsigned long address, pte_t *ptep) +{ + napotpte_try_unfold(mm, address, ptep, __ptep_get(ptep)); + + return __ptep_get_and_clear(mm, address, ptep); +} + +#define clear_young_dirty_ptes clear_young_dirty_ptes +static inline void clear_young_dirty_ptes(struct vm_area_struct *vma, + unsigned long addr, pte_t *ptep, + unsigned int nr, cydp_t flags) +{ + napotpte_clear_young_dirty_ptes(vma, addr, ptep, nr, flags); +} + +#define __HAVE_ARCH_PTEP_SET_WRPROTECT +static inline void ptep_set_wrprotect(struct mm_struct *mm, + unsigned long address, pte_t *ptep) +{ + 
__ptep_set_wrprotect(mm, address, ptep); } =20 +#define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH +static inline bool ptep_clear_flush_young(struct vm_area_struct *vma, + unsigned long address, pte_t *ptep) +{ + pte_t orig_pte =3D __ptep_get(ptep); + + if (likely(!riscv_pte_present_napot(orig_pte))) + return __ptep_clear_flush_young(vma, address, ptep); + + return napotpte_ptep_clear_flush_young(vma, address, ptep); +} + +#else /* CONFIG_RISCV_ISA_SVNAPOT */ + +static __always_inline bool riscv_pte_present_napot(pte_t pte) +{ + return false; +} + +static inline bool napotpte_ptep_set_access_flags(struct vm_area_struct *v= ma, + unsigned long address, + pte_t *ptep, pte_t entry, + int dirty) +{ + return false; +} + +static inline bool +napotpte_ptep_test_and_clear_young(struct vm_area_struct *vma, + unsigned long address, + pte_t *ptep) +{ + return false; +} + +static inline bool +napotpte_ptep_clear_flush_young(struct vm_area_struct *vma, + unsigned long address, + pte_t *ptep) +{ + return false; +} + +#define set_ptes __set_ptes +#define pte_clear __pte_clear +#define ptep_get __ptep_get +#define ptep_get_lockless __ptep_get_lockless +#define ptep_get_and_clear __ptep_get_and_clear +#define clear_young_dirty_ptes __clear_young_dirty_ptes +#define __HAVE_ARCH_PTEP_SET_WRPROTECT +#define ptep_set_wrprotect __ptep_set_wrprotect +#define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH +#define ptep_clear_flush_young __ptep_clear_flush_young + +#endif /* CONFIG_RISCV_ISA_SVNAPOT */ + #define pgprot_nx pgprot_nx static inline pgprot_t pgprot_nx(pgprot_t _prot) { diff --git a/arch/riscv/mm/Makefile b/arch/riscv/mm/Makefile index b916a68d324ad..5855f923b83ec 100644 --- a/arch/riscv/mm/Makefile +++ b/arch/riscv/mm/Makefile @@ -17,6 +17,7 @@ obj-$(CONFIG_MMU) +=3D extable.o fault.o pageattr.o pgtab= le.o tlbflush.o obj-y +=3D cacheflush.o obj-y +=3D context.o obj-y +=3D pmem.o +obj-$(CONFIG_RISCV_ISA_SVNAPOT) +=3D contpte.o =20 obj-$(CONFIG_HUGETLB_PAGE) +=3D hugetlbpage.o obj-$(CONFIG_PTDUMP) 
+=3D ptdump.o diff --git a/arch/riscv/mm/contpte.c b/arch/riscv/mm/contpte.c new file mode 100644 index 0000000000000..f73af7d9b099a --- /dev/null +++ b/arch/riscv/mm/contpte.c @@ -0,0 +1,479 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include +#include +#include +#include +#include +#include +#include +#include + +#include + +static inline bool napot_hw_supported(void) +{ + return riscv_has_extension_unlikely(RISCV_ISA_EXT_SVNAPOT); +} + +static inline bool mm_is_user(struct mm_struct *mm) +{ + if (unlikely(mm_is_efi(mm))) + return false; + + return mm !=3D &init_mm; +} + +static inline unsigned int napotpte_order(void) +{ + return NAPOT_CONT64KB_ORDER; +} + +static inline unsigned long napotpte_size(void) +{ + return napot_cont_size(napotpte_order()); +} + +static inline unsigned int napotpte_pte_num(void) +{ + return napot_pte_num(napotpte_order()); +} + +static inline unsigned long napotpte_mask(void) +{ + return napotpte_size() - 1; +} + +static inline unsigned long napot_align_addr(unsigned long addr) +{ + return ALIGN_DOWN(addr, napotpte_size()); +} + +static inline pte_t *napot_align_ptep(pte_t *ptep) +{ + return PTR_ALIGN_DOWN(ptep, napotpte_pte_num() * sizeof(*ptep)); +} + +static inline pte_t pte_mask_ad(pte_t pte) +{ + return pte_mkold(pte_mkclean(pte)); +} + +static inline unsigned long pte_protval_no_pfn_no_napot(pte_t pte) +{ + return (pte_val(pte) & ~_PAGE_PFN_MASK) & ~_PAGE_NAPOT; +} + +static inline void napotpte_clear_young_dirty_pte(pte_t *ptep, cydp_t flag= s) +{ + pte_t old_pte, new_pte; + unsigned long old_val, new_val; + + do { + old_pte =3D READ_ONCE(*ptep); + new_pte =3D old_pte; + if (flags & CYDP_CLEAR_YOUNG) + new_pte =3D pte_mkold(new_pte); + if (flags & CYDP_CLEAR_DIRTY) + new_pte =3D pte_mkclean(new_pte); + + old_val =3D pte_val(old_pte); + new_val =3D pte_val(new_pte); + } while (cmpxchg_relaxed(&pte_val(*ptep), old_val, new_val) !=3D old_val); +} + +static inline pte_t napotpte_subpte(pte_t *ptep, pte_t pte) +{ + unsigned long 
pfn; + pgprot_t prot; + + if (!pte_present_napot(pte)) + return pte; + + pfn =3D pte_pfn(pte) + (ptep - napot_align_ptep(ptep)); + prot =3D __pgprot(pte_protval_no_pfn_no_napot(pte)); + + return pfn_pte(pfn, prot); +} + +static inline pte_t +__napot_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, pte_t= *ptep) +{ + pte_t pte; + + pte =3D __pte(atomic_long_xchg((atomic_long_t *)ptep, 0)); + page_table_check_pte_clear(mm, addr, pte); + + return pte; +} + +static void napotpte_convert(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t target) +{ + unsigned long start_addr, end; + pte_t *start_ptep; + pte_t ptent, pte; + unsigned int i, nr; + + start_addr =3D napot_align_addr(addr); + start_ptep =3D napot_align_ptep(ptep); + nr =3D napotpte_pte_num(); + end =3D start_addr + napotpte_size(); + + for (i =3D 0; i < nr; i++) { + ptent =3D __napot_ptep_get_and_clear(mm, start_addr + i * PAGE_SIZE, + start_ptep + i); + if (pte_dirty(ptent)) + target =3D pte_mkdirty(target); + if (pte_young(ptent)) + target =3D pte_mkyoung(target); + } + + flush_tlb_mm_range(mm, start_addr, end, PAGE_SIZE); + + page_table_check_ptes_set(mm, start_addr, start_ptep, target, nr); + if (pte_napot(target)) { + for (i =3D 0; i < nr; i++) + __set_pte_at(mm, start_ptep + i, target); + return; + } + + for (i =3D 0; i < nr; i++) { + pte =3D pfn_pte(pte_pfn(target) + i, + __pgprot(pte_protval_no_pfn_no_napot(target))); + if (pte_dirty(target)) + pte =3D pte_mkdirty(pte); + if (pte_young(target)) + pte =3D pte_mkyoung(pte); + __set_pte_at(mm, start_ptep + i, pte); + } +} + +static inline bool napotpte_is_consistent(pte_t pte, pte_t orig_pte) +{ + return pte_present_napot(pte) && + pte_val(pte_mask_ad(pte)) =3D=3D pte_val(pte_mask_ad(orig_pte)); +} + +void __napotpte_try_fold(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t pte) +{ + struct page *page; + struct folio *folio; + unsigned long folio_start, folio_end; + unsigned long cont_start, cont_end; + unsigned long 
pfn; + pgprot_t prot; + pte_t expected, cur; + pte_t *start; + unsigned int i, nr; + + if (!napot_hw_supported() || !mm_is_user(mm)) + return; + + if (!pte_present(pte) || pte_napot(pte) || pte_special(pte)) + return; + + page =3D pte_page(pte); + folio =3D page_folio(page); + folio_start =3D addr - (page - &folio->page) * PAGE_SIZE; + folio_end =3D folio_start + folio_nr_pages(folio) * PAGE_SIZE; + cont_start =3D napot_align_addr(addr); + cont_end =3D cont_start + napotpte_size(); + if (folio_start > cont_start || folio_end < cont_end) + return; + + nr =3D napotpte_pte_num(); + start =3D napot_align_ptep(ptep); + + pfn =3D ALIGN_DOWN(pte_pfn(pte), nr); + prot =3D pte_pgprot(pte_mask_ad(pte)); + expected =3D pfn_pte(pfn, prot); + + for (i =3D 0; i < nr; i++) { + cur =3D READ_ONCE(start[i]); + if (pte_val(pte_mask_ad(cur)) !=3D pte_val(expected)) + return; + pte_val(expected) +=3D 1UL << _PAGE_PFN_SHIFT; + } + + expected =3D pte_mknapot(pfn_pte(pfn, prot), napotpte_order()); + napotpte_convert(mm, addr, ptep, expected); +} +EXPORT_SYMBOL(__napotpte_try_fold); + +void __napotpte_try_unfold(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t pte) +{ + pte_t target; + pgprot_t prot; + + if (!napot_hw_supported() || !mm_is_user(mm)) + return; + + prot =3D __pgprot(pte_protval_no_pfn_no_napot(pte)); + target =3D pfn_pte(pte_pfn(pte), prot); + + napotpte_convert(mm, addr, ptep, target); +} +EXPORT_SYMBOL(__napotpte_try_unfold); + +pte_t napotpte_ptep_get(pte_t *ptep, pte_t orig_pte) +{ + pte_t pte, cur; + pte_t *start; + unsigned int i, nr; + + if (!napot_hw_supported() || !pte_present_napot(orig_pte)) + return orig_pte; + + pte =3D orig_pte; + start =3D napot_align_ptep(ptep); + nr =3D napotpte_pte_num(); + + for (i =3D 0; i < nr; i++) { + cur =3D READ_ONCE(start[i]); + if (!napotpte_is_consistent(cur, orig_pte)) + return napotpte_subpte(ptep, orig_pte); + if (pte_dirty(cur)) + pte =3D pte_mkdirty(pte); + if (pte_young(cur)) + pte =3D pte_mkyoung(pte); + } + + 
return napotpte_subpte(ptep, pte); +} +EXPORT_SYMBOL(napotpte_ptep_get); + +pte_t napotpte_ptep_get_lockless(pte_t *orig_ptep) +{ + pte_t orig_pte, pte; + pte_t *ptep; + unsigned int i, nr; + + if (!napot_hw_supported()) + return READ_ONCE(*orig_ptep); + + nr =3D napotpte_pte_num(); + +retry: + orig_pte =3D READ_ONCE(*orig_ptep); + if (!pte_present_napot(orig_pte)) + return orig_pte; + + ptep =3D napot_align_ptep(orig_ptep); + + for (i =3D 0; i < nr; i++, ptep++) { + pte =3D READ_ONCE(*ptep); + + if (!napotpte_is_consistent(pte, orig_pte)) + goto retry; + + if (pte_dirty(pte)) { + orig_pte =3D pte_mkdirty(orig_pte); + for (; i < nr; i++, ptep++) { + pte =3D READ_ONCE(*ptep); + + if (!napotpte_is_consistent(pte, orig_pte)) + goto retry; + + if (pte_young(pte)) { + orig_pte =3D pte_mkyoung(orig_pte); + break; + } + } + break; + } + + if (pte_young(pte)) { + orig_pte =3D pte_mkyoung(orig_pte); + i++; + ptep++; + for (; i < nr; i++, ptep++) { + pte =3D READ_ONCE(*ptep); + + if (!napotpte_is_consistent(pte, orig_pte)) + goto retry; + + if (pte_dirty(pte)) { + orig_pte =3D pte_mkdirty(orig_pte); + break; + } + } + break; + } + } + + return napotpte_subpte(orig_ptep, orig_pte); +} +EXPORT_SYMBOL(napotpte_ptep_get_lockless); + +void napotpte_set_ptes(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, pte_t pte, unsigned int nr) +{ + unsigned long next, end; + unsigned long pfn, size, boundary; + pgprot_t prot; + unsigned int chunk, i; + pte_t cur; + + if (!napot_hw_supported() || !mm_is_user(mm)) { + __set_ptes(mm, addr, ptep, pte, nr); + return; + } + + size =3D napotpte_size(); + end =3D addr + ((unsigned long)nr << PAGE_SHIFT); + pfn =3D pte_pfn(pte); + prot =3D __pgprot(pte_protval_no_pfn_no_napot(pte)); + + do { + boundary =3D (addr + size) & ~napotpte_mask(); + next =3D (boundary - 1 < end - 1) ? 
boundary : end; + chunk =3D (next - addr) >> PAGE_SHIFT; + + cur =3D pfn_pte(pfn, prot); + if (((addr | next | (pfn << PAGE_SHIFT)) & napotpte_mask()) =3D=3D 0) { + cur =3D pte_mknapot(cur, napotpte_order()); + page_table_check_ptes_set(mm, addr, ptep, cur, chunk); + for (i =3D 0; i < chunk; i++) + __set_pte_at(mm, ptep + i, cur); + } else { + __set_ptes(mm, addr, ptep, cur, chunk); + } + + addr =3D next; + ptep +=3D chunk; + pfn +=3D chunk; + } while (addr !=3D end); +} +EXPORT_SYMBOL(napotpte_set_ptes); + +void napotpte_clear_young_dirty_ptes(struct vm_area_struct *vma, + unsigned long addr, pte_t *ptep, + unsigned int nr, cydp_t flags) +{ + struct mm_struct *mm; + unsigned long start, end; + unsigned int total; + + mm =3D vma->vm_mm; + if (!napot_hw_supported() || !mm_is_user(mm)) { + for (;;) { + if (flags =3D=3D CYDP_CLEAR_YOUNG) + __ptep_test_and_clear_young(vma, addr, ptep); + else + napotpte_clear_young_dirty_pte(ptep, flags); + if (--nr =3D=3D 0) + break; + ptep++; + addr +=3D PAGE_SIZE; + } + return; + } + + start =3D addr; + end =3D start + nr * PAGE_SIZE; + + if (pte_present_napot(READ_ONCE(*(ptep + nr - 1)))) + end =3D ALIGN(end, napotpte_size()); + + if (pte_present_napot(READ_ONCE(*ptep))) { + start =3D napot_align_addr(start); + ptep =3D napot_align_ptep(ptep); + } + + total =3D (end - start) >> PAGE_SHIFT; + for (; total; total--, ptep++, start +=3D PAGE_SIZE) + napotpte_clear_young_dirty_pte(ptep, flags); +} +EXPORT_SYMBOL(napotpte_clear_young_dirty_ptes); + +bool napotpte_ptep_set_access_flags(struct vm_area_struct *vma, + unsigned long address, pte_t *ptep, + pte_t entry, int dirty) +{ + pte_t orig_pte, raw_pte, napot_pte; + pte_t *start; + pgprot_t prot; + unsigned long start_addr; + unsigned int i, nr; + bool changed; + + raw_pte =3D READ_ONCE(*ptep); + if (!napot_hw_supported() || !pte_present_napot(raw_pte)) + return false; + + orig_pte =3D ptep_get(ptep); + if (pte_val(orig_pte) =3D=3D pte_val(entry)) + return false; + + if 
(pte_write(orig_pte) !=3D pte_write(entry)) { + __napotpte_try_unfold(vma->vm_mm, address, ptep, raw_pte); + entry =3D pte_mknonnapot(entry, address); + + return ptep_set_access_flags(vma, address, ptep, entry, dirty); + } + + prot =3D pte_pgprot(entry); + napot_pte =3D pfn_pte(pte_pfn(raw_pte), prot); + napot_pte =3D pte_mknapot(napot_pte, napotpte_order()); + + start =3D napot_align_ptep(ptep); + start_addr =3D napot_align_addr(address); + nr =3D napotpte_pte_num(); + changed =3D false; + + page_table_check_ptes_set(vma->vm_mm, start_addr, start, napot_pte, nr); + for (i =3D 0; i < nr; i++) { + if (!pte_same(READ_ONCE(start[i]), napot_pte)) { + __set_pte_at(vma->vm_mm, start + i, napot_pte); + changed =3D true; + } + } + + if (changed) + flush_tlb_range(vma, start_addr, start_addr + napotpte_size()); + + return changed; +} +EXPORT_SYMBOL(napotpte_ptep_set_access_flags); + +bool napotpte_ptep_test_and_clear_young(struct vm_area_struct *vma, + unsigned long address, pte_t *ptep) +{ + pte_t *start; + unsigned int i, nr; + bool young; + + if (!napot_hw_supported() || !pte_present_napot(READ_ONCE(*ptep))) + return false; + + start =3D napot_align_ptep(ptep); + nr =3D napotpte_pte_num(); + young =3D false; + + for (i =3D 0; i < nr; i++) + young |=3D test_and_clear_bit(_PAGE_ACCESSED_OFFSET, + &pte_val(start[i])); + + return young; +} +EXPORT_SYMBOL(napotpte_ptep_test_and_clear_young); + +bool napotpte_ptep_clear_flush_young(struct vm_area_struct *vma, + unsigned long address, pte_t *ptep) +{ + unsigned long start_addr; + bool young; + + young =3D napotpte_ptep_test_and_clear_young(vma, address, ptep); + if (!young) + return false; + + start_addr =3D napot_align_addr(address); + flush_tlb_range(vma, start_addr, start_addr + napotpte_size()); + + return true; +} +EXPORT_SYMBOL(napotpte_ptep_clear_flush_young); diff --git a/arch/riscv/mm/pgtable.c b/arch/riscv/mm/pgtable.c index 9131a78fe15c4..85ff49286f91c 100644 --- a/arch/riscv/mm/pgtable.c +++ 
b/arch/riscv/mm/pgtable.c @@ -9,6 +9,14 @@ int ptep_set_access_flags(struct vm_area_struct *vma, unsigned long address, pte_t *ptep, pte_t entry, int dirty) { + pte_t raw_pte; + + entry =3D pte_mknonnapot(entry, address); + raw_pte =3D READ_ONCE(*ptep); + if (riscv_pte_present_napot(raw_pte)) + return napotpte_ptep_set_access_flags(vma, address, ptep, entry, + dirty); + return __ptep_set_access_flags(vma, address, ptep, entry, dirty); } =20 @@ -16,19 +24,26 @@ int __ptep_set_access_flags(struct vm_area_struct *vma, unsigned long address, pte_t *ptep, pte_t entry, int dirty) { - if (riscv_has_extension_unlikely(RISCV_ISA_EXT_SVVPTC)) { - if (!pte_same(ptep_get(ptep), entry)) { - __set_pte_at(vma->vm_mm, ptep, entry); - /* Here only not svadu is impacted */ - flush_tlb_page(vma, address); - return true; - } + pte_t raw_pte; + bool changed; + + entry =3D pte_mknonnapot(entry, address); + raw_pte =3D READ_ONCE(*ptep); + if (riscv_pte_present_napot(raw_pte)) + return false; =20 + changed =3D !pte_same(raw_pte, entry); + if (!changed) return false; + + __set_pte_at(vma->vm_mm, ptep, entry); + + if (riscv_has_extension_unlikely(RISCV_ISA_EXT_SVVPTC)) { + /* Here only not svadu is impacted */ + flush_tlb_page(vma, address); + return true; } =20 - if (!pte_same(ptep_get(ptep), entry)) - __set_pte_at(vma->vm_mm, ptep, entry); /* * update_mmu_cache will unconditionally execute, handling both * the case that the PTE changed and the spurious fault case. 
@@ -39,6 +54,12 @@ int __ptep_set_access_flags(struct vm_area_struct *vma, bool ptep_test_and_clear_young(struct vm_area_struct *vma, unsigned long address, pte_t *ptep) { + pte_t raw_pte; + + raw_pte =3D READ_ONCE(*ptep); + if (riscv_pte_present_napot(raw_pte)) + return napotpte_ptep_test_and_clear_young(vma, address, ptep); + return __ptep_test_and_clear_young(vma, address, ptep); } EXPORT_SYMBOL_GPL(ptep_test_and_clear_young); --=20 2.39.5 From nobody Wed Jun 17 01:42:34 2026 Received: from mail-pj1-f47.google.com (mail-pj1-f47.google.com [209.85.216.47]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7DAB3359A66 for ; Tue, 21 Apr 2026 09:25:48 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.47 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776763549; cv=none; b=Ce14Hmg6ZX10RxRGBG9sKY1YQvgje7NDFxjiW1ganhW+Rlecpdg6BRUffnRINK19QTfQFb7okhoamZMYYetgrhVAiNlahuxwZgbgb58fTbNNXhZRLlUnVaDm3h7ZKAM5zQ40EhCp6k8+WzR7//XetR4wZqS94cC5Vdh7HGPpZrM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776763549; c=relaxed/simple; bh=85CI3Uesj7nk7m45yHMK35fkc3qMQwMDld+L1RWxK1c=; h=From:To:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=NSQcyqxNnZDtnApQKeiymIlXzX3EbTWzXvTdFEBikazP41cx8Ers9mOOphvCy5Q/PAsxosB3FoUgqfoBNScSgY79A2Tio/odRp6uuWrkSkUPQE/MXwHYz7ir8DSzqIqJgmYonlg0JjNXJNBdBUDoxMoDdJ0Mney5J4kjaLWs0M0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=LMntBeNG; arc=none smtp.client-ip=209.85.216.47 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass 
smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="LMntBeNG" Received: by mail-pj1-f47.google.com with SMTP id 98e67ed59e1d1-3614826eca4so3436678a91.1 for ; Tue, 21 Apr 2026 02:25:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1776763548; x=1777368348; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=pTjwhKG7KzQh08aYS/X3NKGI2n2kHnjJW0npzktt0MY=; b=LMntBeNG1+gCYXxuv22qJqBRUnyPqe4UeMy/Sq3QEawnuI+pF6SnMj0YoGSKrrNp4D VDZv2DszFs9p+P39PBZw7iNE/oPrMbdbrGT5m9kycc2sGQw5NX9prxkdU41PX81b6e63 Y/E4/rqLb0RgfPYBAbEKXMtixh3L4aBAOBrxyO9XBDUk/kXG3YyfKp/4VDBfF7e0UWb1 ZRTXIhyS8Ctf+ysGvUy/nejzBwAKEmua4TfedatnzX0jKAcOlx2Pi3oAqkhJqxXkl4Xo vC0rEvCXMn1+BOdgeuu5K7jF5KusgL96jroPp5A5C4MWcekqA5UUfDrXrxrICCFEfdiK qRnQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1776763548; x=1777368348; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-gg:x-gm-message-state:from:to :cc:subject:date:message-id:reply-to; bh=pTjwhKG7KzQh08aYS/X3NKGI2n2kHnjJW0npzktt0MY=; b=M0FV9hP4d575Vdh6v+VghhQ735Mogg8u723818xrwpDAM5BmAgssng88RiPxjJYSVs q/sHdt3QelCJGL8T/1IHSlLHFX8raaoomjKFAbMPcqnqEMyyWoAiAA1ovdeg3c8e212B ykgn8S+KWEs1C+xzPYs92AF4C7YSTRKz32Aa3At+3GTttkJQnVItDGbKtezsLiKSf2TY djmpe2t/ug5i1Q+WX5onSKjkxlBoKj3O510sRUu1vOW614QzAN9CqH5AtQA1do/QugYc x3LMM4oMCGqUTgXoW/5FuFK3oRx9O0H3d4E1FgCUboa3Dbm/mLdnEkAvMWDFlzLAxlwu DnNg== X-Forwarded-Encrypted: i=1; AFNElJ8Xga0IXuz/ShBgIUnuCJ3s2YyjT9e8C/hBE97Mcq0ZAmIZAZ67Xcbi17zGuSgaVkZiCd7xUDnZGyGvT3g=@vger.kernel.org X-Gm-Message-State: AOJu0YyzNAXfeksNm7LykP37DoM5RD46tz4FPFPie0Lv0w9C8JfOwEay XT9HgwP499qMCtr6FTyt1YXIjFJIidxC4YIrnjUb/cje4UcLANn02KdttVgTRuhK1l8= X-Gm-Gg: 
AeBDieu7+BFGVqykEpcUykhQ2/fXQc3qjudyNS7oB6/j4Bk/FNOo6I+jEEXXnuxgWC8 p1C5Im82UdaB9Oyaj9c56gYU+g0+uot+F7ScUZti1BuLQy4efVxjFb9oHV93P21bl7YkfO5LA6z r4FKvtUBa+yzM66tyC07Ty4tPtjLaxcZnzaOc0tctaOLdfImjQs8xKA9PBFSeUZEB16FbTEQcgf k8JNWUVc8ig2HdPSr6XgJ+REjtFjNXzyxla2VUUlVI1I7+XKwJ0iNONiaOSxnLYCtCw7MesMcMY jrNGlatvig8y4+bNXd956vVwY0KM/UlwexUEc12xsMjpE/f4zLiop3gROvWECC3wHij6t9cIZpc cAtpqS8G6aESq35KbD6WT6fWWDQSOtARbYfxMO4MP7shUMTaU/tOQzRxGOlNimrp0ELb7uAU7Wk pBKI0qwGatkc+PvPyY7yvyd2RLdKtCHKRNRU61iD8AahJ47xZbSMSvkc2Yk+beIMFZCGWq X-Received: by 2002:a17:90b:2dc1:b0:35e:30bc:96e2 with SMTP id 98e67ed59e1d1-361403f1901mr17724201a91.10.1776763547742; Tue, 21 Apr 2026 02:25:47 -0700 (PDT) Received: from L6YN4KR4K9.bytedance.net ([61.213.176.6]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-2b5fa9ff39csm131965105ad.4.2026.04.21.02.25.38 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Tue, 21 Apr 2026 02:25:47 -0700 (PDT) From: Yunhui Cui To: akpm@linux-foundation.org, alex@ghiti.fr, andrew+kernel@donnellan.id.au, andreyknvl@gmail.com, anup@brainfault.org, aou@eecs.berkeley.edu, apopple@nvidia.com, ardb@kernel.org, atish.patra@linux.dev, baolin.wang@linux.alibaba.com, cuiyunhui@bytedance.com, david@kernel.org, debug@rivosinc.com, djordje.todorovic@htecgroup.com, dvyukov@google.com, elver@google.com, glider@google.com, ilias.apalodimas@linaro.org, junhui.liu@pigmoral.tech, kasan-dev@googlegroups.com, kees@kernel.org, kevin.brodsky@arm.com, kvm-riscv@lists.infradead.org, kvm@vger.kernel.org, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org, liu.xuemei1@zte.com.cn, ljs@kernel.org, namcao@linutronix.de, osalvador@suse.de, palmer@dabbelt.com, pjw@kernel.org, rmclure@linux.ibm.com, rostedt@goodmis.org, rppt@kernel.org, ryabinin.a.a@gmail.com, surenb@google.com, vincenzo.frascino@arm.com, vishal.moola@gmail.com, wangruikang@iscas.ac.cn, zhangchunyan@iscas.ac.cn Subject: [PATCH 4/7] riscv: hugetlb: switch NAPOT mappings to raw PTE 
helpers Date: Tue, 21 Apr 2026 17:24:54 +0800 Message-Id: <20260421092457.37649-5-cuiyunhui@bytedance.com> X-Mailer: git-send-email 2.39.2 (Apple Git-143) In-Reply-To: <20260421092457.37649-1-cuiyunhui@bytedance.com> References: <20260421092457.37649-1-cuiyunhui@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Use raw PTE helpers in hugetlb code to operate directly on the underlying PTE entries. This lets hugetlb manage NAPOT folding/unfolding explicitly instead of going through Svnapot-aware public wrappers. Add explicit NAPOT unfolding in set_huge_pte_at() before replacing an existing NAPOT mapping with non-NAPOT entries. No functional change intended. Signed-off-by: Yunhui Cui --- arch/riscv/mm/hugetlbpage.c | 55 +++++++++++++++++++++++-------------- 1 file changed, 35 insertions(+), 20 deletions(-) diff --git a/arch/riscv/mm/hugetlbpage.c b/arch/riscv/mm/hugetlbpage.c index a6d217112cf46..65a89b4fdad8b 100644 --- a/arch/riscv/mm/hugetlbpage.c +++ b/arch/riscv/mm/hugetlbpage.c @@ -7,7 +7,7 @@ pte_t huge_ptep_get(struct mm_struct *mm, unsigned long add= r, pte_t *ptep) { unsigned long pte_num; int i; - pte_t orig_pte =3D ptep_get(ptep); + pte_t orig_pte =3D __ptep_get(ptep); =20 if (!pte_present(orig_pte) || !pte_napot(orig_pte)) return orig_pte; @@ -15,7 +15,7 @@ pte_t huge_ptep_get(struct mm_struct *mm, unsigned long a= ddr, pte_t *ptep) pte_num =3D napot_pte_num(napot_cont_order(orig_pte)); =20 for (i =3D 0; i < pte_num; i++, ptep++) { - pte_t pte =3D ptep_get(ptep); + pte_t pte =3D __ptep_get(ptep); =20 if (pte_dirty(pte)) orig_pte =3D pte_mkdirty(orig_pte); @@ -74,7 +74,7 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, =20 out: if (pte) { - pte_t pteval =3D ptep_get_lockless(pte); + pte_t pteval =3D __ptep_get_lockless(pte); =20 WARN_ON_ONCE(pte_present(pteval) && !pte_huge(pteval)); } 
@@ -153,12 +153,12 @@ static pte_t get_clear_contig(struct mm_struct *mm, pte_t pte, tmp_pte; bool present; =20 - pte =3D ptep_get_and_clear(mm, addr, ptep); + pte =3D __ptep_get_and_clear(mm, addr, ptep); present =3D pte_present(pte); while (--ncontig) { ptep++; addr +=3D PAGE_SIZE; - tmp_pte =3D ptep_get_and_clear(mm, addr, ptep); + tmp_pte =3D __ptep_get_and_clear(mm, addr, ptep); if (present) { if (pte_dirty(tmp_pte)) pte =3D pte_mkdirty(pte); @@ -210,7 +210,7 @@ static void clear_flush(struct mm_struct *mm, unsigned long i, saddr =3D addr; =20 for (i =3D 0; i < ncontig; i++, addr +=3D pgsize, ptep++) - ptep_get_and_clear(mm, addr, ptep); + __ptep_get_and_clear(mm, addr, ptep); =20 flush_tlb_range(&vma, saddr, addr); } @@ -250,25 +250,40 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long sz) { size_t pgsize; + pte_t orig_pte; + pte_t pteval; int i, pte_num; =20 pte_num =3D num_contig_ptes_from_size(sz, &pgsize); =20 if (!pte_present(pte)) { - for (i =3D 0; i < pte_num; i++, ptep++, addr +=3D pgsize) - set_ptes(mm, addr, ptep, pte, 1); + for (i =3D 0; i < pte_num; i++, ptep++, addr +=3D pgsize) { + pteval =3D pte_mknonnapot(pte, addr); + orig_pte =3D __ptep_get(ptep); + + if (pte_present_napot(orig_pte)) + __napotpte_try_unfold(mm, addr, ptep, orig_pte); + + __set_ptes(mm, addr, ptep, pteval, 1); + } return; } =20 if (!pte_napot(pte)) { - set_ptes(mm, addr, ptep, pte, 1); + pteval =3D pte_mknonnapot(pte, addr); + orig_pte =3D __ptep_get(ptep); + + if (pte_present_napot(orig_pte)) + __napotpte_try_unfold(mm, addr, ptep, orig_pte); + + __set_ptes(mm, addr, ptep, pteval, 1); return; } =20 clear_flush(mm, addr, ptep, pgsize, pte_num); =20 for (i =3D 0; i < pte_num; i++, ptep++, addr +=3D pgsize) - set_pte_at(mm, addr, ptep, pte); + __set_ptes(mm, addr, ptep, pte, 1); } =20 int huge_ptep_set_access_flags(struct vm_area_struct *vma, @@ -283,7 +298,7 @@ int huge_ptep_set_access_flags(struct vm_area_struct *v= ma, int i, pte_num; =20 if (!pte_napot(pte)) - 
return ptep_set_access_flags(vma, addr, ptep, pte, dirty); + return __ptep_set_access_flags(vma, addr, ptep, pte, dirty); =20 order =3D napot_cont_order(pte); pte_num =3D napot_pte_num(order); @@ -307,11 +322,11 @@ pte_t huge_ptep_get_and_clear(struct mm_struct *mm, pte_t *ptep, unsigned long sz) { size_t pgsize; - pte_t orig_pte =3D ptep_get(ptep); + pte_t orig_pte =3D __ptep_get(ptep); int pte_num; =20 if (!pte_napot(orig_pte)) - return ptep_get_and_clear(mm, addr, ptep); + return __ptep_get_and_clear(mm, addr, ptep); =20 pte_num =3D num_contig_ptes_from_size(sz, &pgsize); =20 @@ -322,13 +337,13 @@ void huge_ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, pte_t *ptep) { - pte_t pte =3D ptep_get(ptep); + pte_t pte =3D __ptep_get(ptep); unsigned long order; pte_t orig_pte; int i, pte_num; =20 if (!pte_napot(pte)) { - ptep_set_wrprotect(mm, addr, ptep); + __ptep_set_wrprotect(mm, addr, ptep); return; } =20 @@ -347,11 +362,11 @@ pte_t huge_ptep_clear_flush(struct vm_area_struct *vm= a, unsigned long addr, pte_t *ptep) { - pte_t pte =3D ptep_get(ptep); + pte_t pte =3D __ptep_get(ptep); int pte_num; =20 if (!pte_napot(pte)) - return ptep_clear_flush(vma, addr, ptep); + return __ptep_clear_flush(vma, addr, ptep); =20 pte_num =3D napot_pte_num(napot_cont_order(pte)); =20 @@ -364,18 +379,18 @@ void huge_pte_clear(struct mm_struct *mm, unsigned long sz) { size_t pgsize; - pte_t pte =3D ptep_get(ptep); + pte_t pte =3D __ptep_get(ptep); int i, pte_num; =20 if (!pte_napot(pte)) { - pte_clear(mm, addr, ptep); + __pte_clear(mm, addr, ptep); return; } =20 pte_num =3D num_contig_ptes_from_size(sz, &pgsize); =20 for (i =3D 0; i < pte_num; i++, addr +=3D pgsize, ptep++) - pte_clear(mm, addr, ptep); + __pte_clear(mm, addr, ptep); } =20 static bool is_napot_size(unsigned long size) --=20 2.39.5 From nobody Wed Jun 17 01:42:34 2026 Received: from mail-pl1-f171.google.com (mail-pl1-f171.google.com [209.85.214.171]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 
(128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BCDAF384225 for ; Tue, 21 Apr 2026 09:25:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.171 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776763559; cv=none; b=WJals990zBtSNa4njo5gJ85+5SlAUpl4Hfa6pcPcXZYsXGeOvo8PAf44pyFM97KBv3q1WnURNg689sIvkQzyI3UmGRHKCA52ByEh5xe3Eni2nIDNgyMFOGyPLp8T91o/Ub/eMHF5Kiy0T06d83COlbSaCCcSt6D4uqBBAmUhUn8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776763559; c=relaxed/simple; bh=mHZEN9VsRGqjqCssbvNKqiiprdqF3SpE8VAB/Dd5Wtg=; h=From:To:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=rOGAX1NtVjLcwxctFihGg274s3KB9JVg6X/5WDde2rLK7HwF4vsy1yFkPHNpbPtI2oOROCupVg7CzxNE6PUgCXqb/siUBm/sWGSPzY2ODnEds6viIQ7rqfCVeajoLe0Jr4qHMAyjsaL6DF0boj1h2huAJbKtC9htWyn4DxVK0z4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=lYThRDqh; arc=none smtp.client-ip=209.85.214.171 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="lYThRDqh" Received: by mail-pl1-f171.google.com with SMTP id d9443c01a7336-2ad4d639db3so21555715ad.0 for ; Tue, 21 Apr 2026 02:25:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1776763557; x=1777368357; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; 
bh=CqGRaYikd48k8jEHjjdiykQw70hWXJqzRcIUQ0MV2mg=; b=lYThRDqhzeUmWdXf2XYH+8RL82uBbJnDFkstNVFWbtBtaYV9HzuYdqdPM0ufRTjavq 06Y4Yiri4ZtQ2GcD/5kHFN5neReyU55H7/GIlZn/SZq8gqMrW8bWKw0i892zGi+Cf9NY xuioPEopBdzdB9X2jmL9XZscgssLU4ZwnTlBHQycJiOgx1nOJOA5i5t1Hri5EIQE2hHR oFy9BLjRynWM3rbal//Q7oSrCYvJjYb8MQWex3qGFGbJcVc/u6IsaD0QNnoshOIoHLHi kD3l2nHp7NsSVSmhTsVcRPG9RHT2MjPyr1112gkLS8zHj60sY3zlQ6UB3WpoD4oVjC+1 aySw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1776763557; x=1777368357; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-gg:x-gm-message-state:from:to :cc:subject:date:message-id:reply-to; bh=CqGRaYikd48k8jEHjjdiykQw70hWXJqzRcIUQ0MV2mg=; b=ndGGEk+fp15cR+FOfgoWgseyix1GvvKjgZx9RC7CchfxtGnVrVqw7ecCS7eYPhM9my dhah+X+dwAo5v7rBgaC5PO7vyn3m+YvIS/ot/uJ/Bxjz+g1hqllpbv+pLZqswkvEKxkO mbbLUPVMQfKSjNNrPs9SFkV0rFypoZs+8FARMyRz7/x94+YWwFxMw/8wb6o3mv4/0J0E fSXhUegTkYdfIyUk3Rmb0fWB3v+RfNhSBaFr53hPIiHr2b/WchYhzioZNKlo+HGj0S6t UQc23r06cXRiwoA60qGj3+LUxwjDI7NN54KFYSsnYYOeS9WzpLrgM5O3Z4EM6KIPqgRG ftyg== X-Forwarded-Encrypted: i=1; AFNElJ+YjIzjmUgMhr5Fv3jKLuXZAS7eVUSwkn78TW5ASoIWo0CBilCNtnNq+rKn3DPZjCX/TZt4K+hYe+Ah/rc=@vger.kernel.org X-Gm-Message-State: AOJu0YxtWz5c9W0EH9JgZyp17GRrIBcDjOOP7qTEEOgpu2rXJEpCHP7p y5TH/4gYbvbZ3CECClm+ZB2zjKYEwzStfQpWjl78U3JrAW5ywfupsbnenZyKr78desw= X-Gm-Gg: AeBDietTsKVK5nQMGZ8G/fQPDlkKw8P3je81/4XK6j4ZNsm2OoUf4cw07VkNaBazBqe JQiapK2O4aPx5SLRG7CNk25Npc0sAwu9UbJC0tZWMO6SuVBDifQRhx+XTPeCD+Abi45bDjwLRn7 bFatWxRcEv7bQUwb//dO7kU8u/WH2zVT0bX1tKPrm++hc9nEGKk5kucfgx9ci7jtR16+I8L2f5y vpFfEH7tBh9RycJysb20rdyxQyEdH6NzC/jt4mKFmD18ySsVEbtDNqIKpgcqN7JdEvLgY8ifQ2q Rn2OD2oHIE7eBve5E0UZ9supdILOXWePFSXYkhlYNrButt+05voGwVFG/mxJhQoAlkr1AikTnpH v06AOiw43i8WLjS0YDoKfzJVMVKpSsH/4I/i53sio/Jm4M/XrqIR2UqffTN3Nb3mCF2OOAz7gXq nvAz3dbYYrrs92kvdLMcPpVAkL4tntVvqzksVocY/u0pIAXL2kjERVbt5qOI7RJHywJTKt X-Received: by 2002:a17:903:1b30:b0:2b4:586d:2e5c with SMTP id 
d9443c01a7336-2b5f9ecbaa1mr174163805ad.2.1776763557022; Tue, 21 Apr 2026 02:25:57 -0700 (PDT) Received: from L6YN4KR4K9.bytedance.net ([61.213.176.6]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-2b5fa9ff39csm131965105ad.4.2026.04.21.02.25.48 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Tue, 21 Apr 2026 02:25:56 -0700 (PDT) From: Yunhui Cui To: akpm@linux-foundation.org, alex@ghiti.fr, andrew+kernel@donnellan.id.au, andreyknvl@gmail.com, anup@brainfault.org, aou@eecs.berkeley.edu, apopple@nvidia.com, ardb@kernel.org, atish.patra@linux.dev, baolin.wang@linux.alibaba.com, cuiyunhui@bytedance.com, david@kernel.org, debug@rivosinc.com, djordje.todorovic@htecgroup.com, dvyukov@google.com, elver@google.com, glider@google.com, ilias.apalodimas@linaro.org, junhui.liu@pigmoral.tech, kasan-dev@googlegroups.com, kees@kernel.org, kevin.brodsky@arm.com, kvm-riscv@lists.infradead.org, kvm@vger.kernel.org, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org, liu.xuemei1@zte.com.cn, ljs@kernel.org, namcao@linutronix.de, osalvador@suse.de, palmer@dabbelt.com, pjw@kernel.org, rmclure@linux.ibm.com, rostedt@goodmis.org, rppt@kernel.org, ryabinin.a.a@gmail.com, surenb@google.com, vincenzo.frascino@arm.com, vishal.moola@gmail.com, wangruikang@iscas.ac.cn, zhangchunyan@iscas.ac.cn Subject: [PATCH 5/7] riscv: add contiguous PTE range clearing helpers Date: Tue, 21 Apr 2026 17:24:55 +0800 Message-Id: <20260421092457.37649-6-cuiyunhui@bytedance.com> X-Mailer: git-send-email 2.39.2 (Apple Git-143) In-Reply-To: <20260421092457.37649-1-cuiyunhui@bytedance.com> References: <20260421092457.37649-1-cuiyunhui@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add Svnapot-aware implementations of clear_full_ptes() and get_and_clear_full_ptes() so full PTE batches can 
be cleared without losing the required unfold semantics for NAPOT mappings. Signed-off-by: Yunhui Cui --- arch/riscv/include/asm/pgtable.h | 75 ++++++++++++++++++++++++- arch/riscv/mm/contpte.c | 96 ++++++++++++++++++++++++++++++++ 2 files changed, 170 insertions(+), 1 deletion(-) diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgta= ble.h index 722483d4df37f..3e6516b5a4587 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -657,7 +657,6 @@ static inline void __set_pte_at(struct mm_struct *mm, p= te_t *ptep, pte_t pteval) } =20 #define PFN_PTE_SHIFT _PAGE_PFN_SHIFT - static inline void __set_ptes(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pteval, unsigned int nr) { @@ -764,6 +763,47 @@ __ptep_get_and_clear(struct mm_struct *mm, unsigned lo= ng address, pte_t *ptep) =20 #define __ptep_get_and_clear __ptep_get_and_clear =20 +static inline void __clear_full_ptes(struct mm_struct *mm, unsigned long a= ddr, + pte_t *ptep, unsigned int nr, int full) +{ + (void)full; + + for (;;) { + __ptep_get_and_clear(mm, addr, ptep); + if (--nr =3D=3D 0) + break; + ptep++; + addr +=3D PAGE_SIZE; + } +} + +#define __clear_full_ptes __clear_full_ptes + +static inline pte_t __get_and_clear_full_ptes(struct mm_struct *mm, + unsigned long addr, + pte_t *ptep, + unsigned int nr, + int full) +{ + pte_t pte, tmp_pte; + + (void)full; + + pte =3D __ptep_get_and_clear(mm, addr, ptep); + while (--nr) { + ptep++; + addr +=3D PAGE_SIZE; + tmp_pte =3D __ptep_get_and_clear(mm, addr, ptep); + if (pte_dirty(tmp_pte)) + pte =3D pte_mkdirty(pte); + if (pte_young(tmp_pte)) + pte =3D pte_mkyoung(pte); + } + + return pte; +} + +#define __get_and_clear_full_ptes __get_and_clear_full_ptes static inline void __ptep_set_wrprotect(struct mm_struct *mm, unsigned long address, pte_t *p= tep) { @@ -831,6 +871,11 @@ pte_t napotpte_ptep_get(pte_t *ptep, pte_t orig_pte); pte_t napotpte_ptep_get_lockless(pte_t *ptep); void 
napotpte_set_ptes(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte, unsigned int nr); +void napotpte_clear_full_ptes(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, unsigned int nr, int full); +pte_t napotpte_get_and_clear_full_ptes(struct mm_struct *mm, + unsigned long addr, pte_t *ptep, + unsigned int nr, int full); void napotpte_clear_young_dirty_ptes(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep, unsigned int nr, cydp_t flags); @@ -933,6 +978,32 @@ static inline void clear_young_dirty_ptes(struct vm_ar= ea_struct *vma, napotpte_clear_young_dirty_ptes(vma, addr, ptep, nr, flags); } =20 +#define clear_full_ptes clear_full_ptes +static inline void clear_full_ptes(struct mm_struct *mm, unsigned long add= r, + pte_t *ptep, unsigned int nr, int full) +{ + if (likely(nr =3D=3D 1)) { + napotpte_try_unfold(mm, addr, ptep, __ptep_get(ptep)); + __clear_full_ptes(mm, addr, ptep, nr, full); + return; + } + + napotpte_clear_full_ptes(mm, addr, ptep, nr, full); +} + +#define get_and_clear_full_ptes get_and_clear_full_ptes +static inline pte_t get_and_clear_full_ptes(struct mm_struct *mm, + unsigned long addr, pte_t *ptep, + unsigned int nr, int full) +{ + if (likely(nr =3D=3D 1)) { + napotpte_try_unfold(mm, addr, ptep, __ptep_get(ptep)); + return __get_and_clear_full_ptes(mm, addr, ptep, nr, full); + } + + return napotpte_get_and_clear_full_ptes(mm, addr, ptep, nr, full); +} + #define __HAVE_ARCH_PTEP_SET_WRPROTECT static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long address, pte_t *ptep) @@ -989,6 +1060,8 @@ napotpte_ptep_clear_flush_young(struct vm_area_struct = *vma, #define ptep_get_lockless __ptep_get_lockless #define ptep_get_and_clear __ptep_get_and_clear #define clear_young_dirty_ptes __clear_young_dirty_ptes +#define clear_full_ptes __clear_full_ptes +#define get_and_clear_full_ptes __get_and_clear_full_ptes #define __HAVE_ARCH_PTEP_SET_WRPROTECT #define ptep_set_wrprotect __ptep_set_wrprotect #define 
__HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH diff --git a/arch/riscv/mm/contpte.c b/arch/riscv/mm/contpte.c index f73af7d9b099a..77c2a4dbd3dda 100644 --- a/arch/riscv/mm/contpte.c +++ b/arch/riscv/mm/contpte.c @@ -107,6 +107,38 @@ __napot_ptep_get_and_clear(struct mm_struct *mm, unsig= ned long addr, pte_t *ptep return pte; } =20 +static void __napot_clear_full_ptes(struct mm_struct *mm, unsigned long ad= dr, + pte_t *ptep, unsigned int nr) +{ + for (;;) { + __napot_ptep_get_and_clear(mm, addr, ptep); + if (--nr =3D=3D 0) + break; + ptep++; + addr +=3D PAGE_SIZE; + } +} + +static pte_t __napot_get_and_clear_full_ptes(struct mm_struct *mm, + unsigned long addr, pte_t *ptep, + unsigned int nr) +{ + pte_t pte, tmp_pte; + + pte =3D __napot_ptep_get_and_clear(mm, addr, ptep); + while (--nr) { + ptep++; + addr +=3D PAGE_SIZE; + tmp_pte =3D __napot_ptep_get_and_clear(mm, addr, ptep); + if (pte_dirty(tmp_pte)) + pte =3D pte_mkdirty(pte); + if (pte_young(tmp_pte)) + pte =3D pte_mkyoung(pte); + } + + return pte; +} + static void napotpte_convert(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t target) { @@ -202,6 +234,33 @@ void __napotpte_try_fold(struct mm_struct *mm, unsigne= d long addr, } EXPORT_SYMBOL(__napotpte_try_fold); =20 +static void napotpte_try_unfold_range(struct mm_struct *mm, + unsigned long addr, pte_t *ptep, + unsigned int nr) +{ + unsigned long next; + pte_t pte; + unsigned int chunk; + + while (nr) { + pte =3D READ_ONCE(*ptep); + if (pte_present_napot(pte)) { + __napotpte_try_unfold(mm, addr, ptep, pte); + next =3D napot_align_addr(addr) + napotpte_size(); + chunk =3D (next - addr) >> PAGE_SHIFT; + } else { + chunk =3D 1; + } + + if (chunk > nr) + chunk =3D nr; + + ptep +=3D chunk; + addr +=3D chunk * PAGE_SIZE; + nr -=3D chunk; + } +} + void __napotpte_try_unfold(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte) { @@ -349,6 +408,43 @@ void napotpte_set_ptes(struct mm_struct *mm, unsigned = long addr, } EXPORT_SYMBOL(napotpte_set_ptes); 
=20 +void napotpte_clear_full_ptes(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, unsigned int nr, int full) +{ + (void)full; + + if (!napot_hw_supported() || !mm_is_user(mm)) { + __napot_clear_full_ptes(mm, addr, ptep, nr); + return; + } + + /* + * Unlike arm64 contpte, a Svnapot PTE block stores identical + * napot-encoded entries across the whole block rather than per-page + * PFNs. Batch zap paths must therefore unfold the whole covered range + * so the core MM later sees ordinary per-page PTEs for rmap/rss/tlb + * batching. + */ + napotpte_try_unfold_range(mm, addr, ptep, nr); + __napot_clear_full_ptes(mm, addr, ptep, nr); +} +EXPORT_SYMBOL(napotpte_clear_full_ptes); + +pte_t napotpte_get_and_clear_full_ptes(struct mm_struct *mm, + unsigned long addr, pte_t *ptep, + unsigned int nr, int full) +{ + (void)full; + + if (!napot_hw_supported() || !mm_is_user(mm)) + return __napot_get_and_clear_full_ptes(mm, addr, ptep, nr); + + napotpte_try_unfold_range(mm, addr, ptep, nr); + + return __napot_get_and_clear_full_ptes(mm, addr, ptep, nr); +} +EXPORT_SYMBOL(napotpte_get_and_clear_full_ptes); + void napotpte_clear_young_dirty_ptes(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep, unsigned int nr, cydp_t flags) --=20 2.39.5 From nobody Wed Jun 17 01:42:34 2026 Received: from mail-pj1-f54.google.com (mail-pj1-f54.google.com [209.85.216.54]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4EE453A3E7F for ; Tue, 21 Apr 2026 09:26:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.54 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776763568; cv=none; b=dtsEA08vwSGMjcZplTQmU1MOcsc51pe87y6sNRBt0VJPOG1YqokK9pDfy8q38j5QMWmA6bDhHdCNfGsMf8RS522e1sNR51gVY2AgVG7fP/I8T8+o+8//Yom4Pksu4XBblsiEdWMrgDubWwPBhm0RWlvpvTW9auKYWsX2SylMz0M= ARC-Message-Signature: i=1; a=rsa-sha256; 
d=subspace.kernel.org; s=arc-20240116; t=1776763568; c=relaxed/simple; bh=6PIN/dG7wFqQzqUa3HZ7Md7FkGfW//HU1jgeoeBIV1E=; h=From:To:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=q7k2CPxceMV7fUBHbUn65bTie3qzmfHkRpBDJBJCAo6VzCY2oZJbv8vT4Zr0mv3VNuq/Os+T1zGwYW1ldCvfooxyUNm3/LpXpjBet9h1ppbicIzFVzSJ8be/6GUDUt4MkQL8jabBt3LjhS4wuulEnjQhhVE6Nsl2qic4T/hj+XE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=aLgBMTtq; arc=none smtp.client-ip=209.85.216.54 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="aLgBMTtq" Received: by mail-pj1-f54.google.com with SMTP id 98e67ed59e1d1-356337f058aso2599638a91.2 for ; Tue, 21 Apr 2026 02:26:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1776763566; x=1777368366; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=FJWvxx2SvBNX2CVsFvwsTk0IK3zSjjAIlJH4BJFUonM=; b=aLgBMTtq5BDkSduB8coXYpoodGKG9tuSocK8BDlmgDOrw8Kqzo+gYzNq0/wtkZubha hjZgW9QOzntpnZDp2LqV4LtUGHX38VDzo4tkVZxUukxYFKvbORQpL4jxUYpklFTCAZJU XVnlqo4Hk0YfY2Fv73Bk8KXyyEhO5ZdBJ61pnldTgG68w2/71TWZm8HFm54AH+x8Y9o3 LgLMbfIRHCIyqfkAhh37nUdvqcx3O7HFhlPBIxpq0rGb1nqvrXNE4oQjDdgzcCGqUjYu TqYgmCuEisedUau/et0KIf3OaMaCfY2cCHDXjQiQO1RAnTTu1e5fWpck+8S+bt09L5XV 4h/A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1776763566; x=1777368366; h=content-transfer-encoding:mime-version:references:in-reply-to 
:message-id:date:subject:to:from:x-gm-gg:x-gm-message-state:from:to :cc:subject:date:message-id:reply-to; bh=FJWvxx2SvBNX2CVsFvwsTk0IK3zSjjAIlJH4BJFUonM=; b=YOyMneX0b7x7b0cGqXt+VZmWtYi/iKlY0dydSdaWz4l6SKEKEXWmMAc7ARLwtKGMNt KRaF/QTj2IWMYrJmVzBhPyPM3bWSHuRvBOe/yigl7OHCf3TNnXP5XRlLYjxcHIfjW6Ll 1ASXhMRRZ89z/qaeXOUaF/5H+Eb79khwuPbFKbRIP2SfUiJx5uirvA/7o2uyYHePFDwy xgzWh7BkJt6KeTm+cujeDuOFZE2qBDQltYph4WjHdc/MBiWeacc8vAPZzSiZe7L1tCTL YJ4WngG8I8pO31nMR2hWAnntktNoP7H/S/hiNVgO7IJGk3rtvSugAb2AasohBBR/1r+4 QKyA== X-Forwarded-Encrypted: i=1; AFNElJ+KQNw8Vyz9AzClVw57ixPuFPvL7bCrOeKsd3so7JKOsGj8rXGfjNKPU75HDs6G5Jsbc6TSnQXvhvc4E9w=@vger.kernel.org X-Gm-Message-State: AOJu0YzKAWRmi3NoSj/aCC1JosuA2dDfrLDrIP6dJsrCt7lVYJORGsGm AhlcF6Mi7udxjaMEyTNPfuG+99oWMawzg+zt7VeHXNzLgOa2X0cCalIqifJvxHeBXU8= X-Gm-Gg: AeBDievPsG7w1xUPfBOdNrnorF3GnLy7R53eZxdjbUtVH02It6Cyb/NWy2CWWc8P/cn ZA2aTmSga8UV6VLGIl7Eephi+zmt0L9sZW4KZ/5RXJvc0gRBJcu95Hwxwnc8cvPHj1z27qv4oln vyHAU8VUb1GU13jQZpfcaZcw8Jn7g2SL6bsGz2FCWIaSYeH+7Yo4dpTG0NAsNevlbZvmcUOxlw4 43d/y1lrXEE1LBeB3RHh+yoyQAYTkHiPXu/8nBdSlkdiZTOsHcHU/7xy6Z3BGAe18Hmy8TQVxqU nxmBfOG+6t37Qsw1Fd/yJBu3m8syTD90VNYBY6WnwRLVuMQPCL3/xLYj2EiDcP3IHAXuK22fJLD keP49EwQPEW3yhbjmzke2HMj853b8opaxSq7VOUmtnM+1/OjOMrwKcXYLJ/wnLnXiJm31BnCvAz MCvD2R2KSPMHj/x9gc43pgbqlwUZvxB40tc6JsTU2nWWPVSqIUMO7s8LgEW2AndvhbdwgEwZtfW +NVsow= X-Received: by 2002:a17:90b:5865:b0:35d:a557:e41 with SMTP id 98e67ed59e1d1-3614046f978mr18558407a91.14.1776763566317; Tue, 21 Apr 2026 02:26:06 -0700 (PDT) Received: from L6YN4KR4K9.bytedance.net ([61.213.176.6]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-2b5fa9ff39csm131965105ad.4.2026.04.21.02.25.57 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Tue, 21 Apr 2026 02:26:05 -0700 (PDT) From: Yunhui Cui To: akpm@linux-foundation.org, alex@ghiti.fr, andrew+kernel@donnellan.id.au, andreyknvl@gmail.com, anup@brainfault.org, aou@eecs.berkeley.edu, apopple@nvidia.com, ardb@kernel.org, atish.patra@linux.dev, baolin.wang@linux.alibaba.com, 
cuiyunhui@bytedance.com, david@kernel.org, debug@rivosinc.com, djordje.todorovic@htecgroup.com, dvyukov@google.com, elver@google.com, glider@google.com, ilias.apalodimas@linaro.org, junhui.liu@pigmoral.tech, kasan-dev@googlegroups.com, kees@kernel.org, kevin.brodsky@arm.com, kvm-riscv@lists.infradead.org, kvm@vger.kernel.org, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org, liu.xuemei1@zte.com.cn, ljs@kernel.org, namcao@linutronix.de, osalvador@suse.de, palmer@dabbelt.com, pjw@kernel.org, rmclure@linux.ibm.com, rostedt@goodmis.org, rppt@kernel.org, ryabinin.a.a@gmail.com, surenb@google.com, vincenzo.frascino@arm.com, vishal.moola@gmail.com, wangruikang@iscas.ac.cn, zhangchunyan@iscas.ac.cn Subject: [PATCH 6/7] riscv: batch write-protect contiguous PTE ranges Date: Tue, 21 Apr 2026 17:24:56 +0800 Message-Id: <20260421092457.37649-7-cuiyunhui@bytedance.com> X-Mailer: git-send-email 2.39.2 (Apple Git-143) In-Reply-To: <20260421092457.37649-1-cuiyunhui@bytedance.com> References: <20260421092457.37649-1-cuiyunhui@bytedance.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Hook wrprotect_ptes() into the Svnapot contpte helpers so write protection can preserve fully covered NAPOT blocks and only unfold partial ranges at the edges. 
Signed-off-by: Yunhui Cui --- arch/riscv/include/asm/pgtable.h | 38 +++++++++++++++++++++++++++-- arch/riscv/mm/contpte.c | 42 ++++++++++++++++++++++++++++++++ 2 files changed, 78 insertions(+), 2 deletions(-) diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgta= ble.h index 3e6516b5a4587..db82253efb218 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -813,13 +813,30 @@ __ptep_set_wrprotect(struct mm_struct *mm, unsigned l= ong address, pte_t *ptep) * shadow stack memory is XWR =3D 010 and thus clearing _PAGE_WRITE will = lead to * encoding 000b which is wrong encoding with V =3D 1. This should lead t= o page fault * but we dont want this wrong configuration to be set in page tables. + * Keep the entry readable when clearing write permissions so we don't cr= eate + * an invalid present encoding. */ atomic_long_set((atomic_long_t *)ptep, - ((pte_val(read_pte) & ~(unsigned long)_PAGE_WRITE) | _PAGE_READ)); + (pte_val(read_pte) & ~(unsigned long)_PAGE_WRITE) | + _PAGE_READ); } =20 #define __ptep_set_wrprotect __ptep_set_wrprotect =20 +static inline void __wrprotect_ptes(struct mm_struct *mm, + unsigned long address, + pte_t *ptep, unsigned int nr) +{ + for (;;) { + __ptep_set_wrprotect(mm, address, ptep); + if (--nr =3D=3D 0) + break; + ptep++; + address +=3D PAGE_SIZE; + } +} + +#define __wrprotect_ptes __wrprotect_ptes static inline pte_t __ptep_clear_flush(struct vm_area_struct *vma, unsigned long address, pte_t *ptep) @@ -879,6 +896,8 @@ pte_t napotpte_get_and_clear_full_ptes(struct mm_struct= *mm, void napotpte_clear_young_dirty_ptes(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep, unsigned int nr, cydp_t flags); +void napotpte_wrprotect_ptes(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, unsigned int nr); bool napotpte_ptep_set_access_flags(struct vm_area_struct *vma, unsigned long address, pte_t *ptep, pte_t entry, int dirty); @@ -1004,11 +1023,25 @@ static inline pte_t 
get_and_clear_full_ptes(struct = mm_struct *mm, return napotpte_get_and_clear_full_ptes(mm, addr, ptep, nr, full); } =20 +#define wrprotect_ptes wrprotect_ptes +static inline void wrprotect_ptes(struct mm_struct *mm, + unsigned long address, pte_t *ptep, + unsigned int nr) +{ + if (likely(nr =3D=3D 1)) { + napotpte_try_unfold(mm, address, ptep, __ptep_get(ptep)); + __ptep_set_wrprotect(mm, address, ptep); + return; + } + + napotpte_wrprotect_ptes(mm, address, ptep, nr); +} + #define __HAVE_ARCH_PTEP_SET_WRPROTECT static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long address, pte_t *ptep) { - __ptep_set_wrprotect(mm, address, ptep); + wrprotect_ptes(mm, address, ptep, 1); } =20 #define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH @@ -1062,6 +1095,7 @@ napotpte_ptep_clear_flush_young(struct vm_area_struct= *vma, #define clear_young_dirty_ptes __clear_young_dirty_ptes #define clear_full_ptes __clear_full_ptes #define get_and_clear_full_ptes __get_and_clear_full_ptes +#define wrprotect_ptes __wrprotect_ptes #define __HAVE_ARCH_PTEP_SET_WRPROTECT #define ptep_set_wrprotect __ptep_set_wrprotect #define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH diff --git a/arch/riscv/mm/contpte.c b/arch/riscv/mm/contpte.c index 77c2a4dbd3dda..077ffa49e89d9 100644 --- a/arch/riscv/mm/contpte.c +++ b/arch/riscv/mm/contpte.c @@ -261,6 +261,30 @@ static void napotpte_try_unfold_range(struct mm_struct= *mm, } } =20 +static void napotpte_try_unfold_partial(struct mm_struct *mm, + unsigned long addr, pte_t *ptep, + unsigned int nr) +{ + pte_t pte; + + if (ptep !=3D napot_align_ptep(ptep) || nr < napotpte_pte_num()) { + pte =3D READ_ONCE(*ptep); + if (pte_present_napot(pte)) + __napotpte_try_unfold(mm, addr, ptep, pte); + } + + if (ptep + nr !=3D napot_align_ptep(ptep + nr)) { + unsigned long last_addr; + pte_t *last_ptep; + + last_addr =3D addr + PAGE_SIZE * (nr - 1); + last_ptep =3D ptep + nr - 1; + pte =3D READ_ONCE(*last_ptep); + if (pte_present_napot(pte)) + __napotpte_try_unfold(mm, 
last_addr, last_ptep, pte); + } +} + void __napotpte_try_unfold(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte) { @@ -485,6 +509,24 @@ void napotpte_clear_young_dirty_ptes(struct vm_area_st= ruct *vma, } EXPORT_SYMBOL(napotpte_clear_young_dirty_ptes); =20 +void napotpte_wrprotect_ptes(struct mm_struct *mm, unsigned long addr, + pte_t *ptep, unsigned int nr) +{ + unsigned int i; + + if (!napot_hw_supported() || !mm_is_user(mm)) { + for (i =3D 0; i < nr; i++, ptep++, addr +=3D PAGE_SIZE) + __ptep_set_wrprotect(mm, addr, ptep); + return; + } + + napotpte_try_unfold_partial(mm, addr, ptep, nr); + + for (i =3D 0; i < nr; i++, ptep++, addr +=3D PAGE_SIZE) + __ptep_set_wrprotect(mm, addr, ptep); +} +EXPORT_SYMBOL(napotpte_wrprotect_ptes); + bool napotpte_ptep_set_access_flags(struct vm_area_struct *vma, unsigned long address, pte_t *ptep, pte_t entry, int dirty) --=20 2.39.5 From nobody Wed Jun 17 01:42:34 2026 Received: from mail-pj1-f53.google.com (mail-pj1-f53.google.com [209.85.216.53]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 688E53A3E7F for ; Tue, 21 Apr 2026 09:26:16 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.53 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776763578; cv=none; b=Jfev3YfxI5ox2JrzFt5kknBfBzeGSUEv1U2IfPNtWuf5U0KQ8jdlghngAoGbflba6P9jv8ypqHxNY6hmNqspkI9NTXpLk3BjW8bAUldbscPGJlwBexWbsGsaBkrmXoMgoOZWNU+UGCGM3cnimjzgns8PWbKcRTaMIIiYBNbFn2U= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776763578; c=relaxed/simple; bh=NqIa/AVjalXxIFvZZGUvR0dRYemXOYMzQmU4EZu8VMY=; h=From:To:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=ALIRYMRK0hjc4UBBWT56Oe0fU3cfOiJzY5ipz/HUiu3ns68VPKR1x8FOCuN9DsekR8irUQyLYknys2nPkLnHz6fk7YzJmLMQeOpUWCvH649huOlhcF6olqfG5uF7Yzi2EnAoG2SuQXafqfLguDkHzCMYFUu4EPlgp/lvTHG2/0k= 
From: Yunhui Cui
To: akpm@linux-foundation.org, alex@ghiti.fr, andrew+kernel@donnellan.id.au,
	andreyknvl@gmail.com, anup@brainfault.org, aou@eecs.berkeley.edu,
	apopple@nvidia.com, ardb@kernel.org, atish.patra@linux.dev,
	baolin.wang@linux.alibaba.com, cuiyunhui@bytedance.com, david@kernel.org,
	debug@rivosinc.com, djordje.todorovic@htecgroup.com, dvyukov@google.com,
	elver@google.com, glider@google.com, ilias.apalodimas@linaro.org,
	junhui.liu@pigmoral.tech, kasan-dev@googlegroups.com, kees@kernel.org,
	kevin.brodsky@arm.com, kvm-riscv@lists.infradead.org, kvm@vger.kernel.org,
	linux-efi@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org,
	liu.xuemei1@zte.com.cn, ljs@kernel.org, namcao@linutronix.de,
	osalvador@suse.de, palmer@dabbelt.com, pjw@kernel.org,
	rmclure@linux.ibm.com, rostedt@goodmis.org, rppt@kernel.org,
	ryabinin.a.a@gmail.com, surenb@google.com, vincenzo.frascino@arm.com,
	vishal.moola@gmail.com, wangruikang@iscas.ac.cn, zhangchunyan@iscas.ac.cn
Subject: [PATCH 7/7] riscv: add Svnapot-aware pte_batch_hint support
Date: Tue, 21 Apr 2026 17:24:57 +0800
Message-Id: <20260421092457.37649-8-cuiyunhui@bytedance.com>
X-Mailer: git-send-email 2.39.2 (Apple Git-143)
In-Reply-To: <20260421092457.37649-1-cuiyunhui@bytedance.com>
References: <20260421092457.37649-1-cuiyunhui@bytedance.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Provide an Svnapot-specific pte_batch_hint() implementation so callers
can batch over a contiguous NAPOT range without re-reading each PTE.
Keep the public wrapper in pgtable.h and leave the
!CONFIG_RISCV_ISA_SVNAPOT case on the existing single-entry fallback.

Signed-off-by: Yunhui Cui
---
 arch/riscv/include/asm/pgtable.h | 19 +++++++++++++++++-
 arch/riscv/mm/contpte.c          | 33 ++++++++++++++++++++++++++++++++
 2 files changed, 51 insertions(+), 1 deletion(-)

diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index db82253efb218..264af77392c6e 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -872,6 +872,13 @@ static inline bool __ptep_clear_flush_young(struct vm_area_struct *vma,
 
 #define __ptep_clear_flush_young __ptep_clear_flush_young
 
+static inline unsigned int __pte_batch_hint(pte_t *ptep, pte_t pte)
+{
+	return 1;
+}
+
+#define __pte_batch_hint __pte_batch_hint
+
 #ifdef CONFIG_RISCV_ISA_SVNAPOT
 
 /*
@@ -886,6 +893,7 @@ void __napotpte_try_unfold(struct mm_struct *mm, unsigned long addr,
 			   pte_t *ptep, pte_t pte);
 pte_t napotpte_ptep_get(pte_t *ptep, pte_t orig_pte);
 pte_t napotpte_ptep_get_lockless(pte_t *ptep);
+unsigned int napotpte_pte_batch_hint(pte_t *ptep);
 void napotpte_set_ptes(struct mm_struct *mm, unsigned long addr,
 		       pte_t *ptep, pte_t pte, unsigned int nr);
 void napotpte_clear_full_ptes(struct mm_struct *mm, unsigned long addr,
@@ -1056,6 +1064,15 @@ static inline bool ptep_clear_flush_young(struct vm_area_struct *vma,
 	return napotpte_ptep_clear_flush_young(vma, address, ptep);
 }
 
+#define pte_batch_hint pte_batch_hint
+static inline unsigned int pte_batch_hint(pte_t *ptep, pte_t pte)
+{
+	if (!pte_present(pte))
+		return 1;
+
+	return napotpte_pte_batch_hint(ptep);
+}
+
 #else /* CONFIG_RISCV_ISA_SVNAPOT */
 
 static __always_inline bool riscv_pte_present_napot(pte_t pte)
@@ -1100,9 +1117,9 @@ napotpte_ptep_clear_flush_young(struct vm_area_struct *vma,
 #define ptep_set_wrprotect __ptep_set_wrprotect
 #define __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH
 #define ptep_clear_flush_young __ptep_clear_flush_young
+#define pte_batch_hint __pte_batch_hint
 
 #endif /* CONFIG_RISCV_ISA_SVNAPOT */
-
 #define pgprot_nx pgprot_nx
 static inline pgprot_t pgprot_nx(pgprot_t _prot)
 {
diff --git a/arch/riscv/mm/contpte.c b/arch/riscv/mm/contpte.c
index 077ffa49e89d9..134b8c401cabc 100644
--- a/arch/riscv/mm/contpte.c
+++ b/arch/riscv/mm/contpte.c
@@ -187,6 +187,12 @@ static inline bool napotpte_is_consistent(pte_t pte, pte_t orig_pte)
 	       pte_val(pte_mask_ad(pte)) == pte_val(pte_mask_ad(orig_pte));
 }
 
+static inline bool napotpte_is_batch_consistent(pte_t pte, pte_t orig_pte)
+{
+	return pte_present_napot(pte) &&
+	       pte_val(pte_mkold(pte)) == pte_val(pte_mkold(orig_pte));
+}
+
 void __napotpte_try_fold(struct mm_struct *mm, unsigned long addr,
 			 pte_t *ptep, pte_t pte)
 {
@@ -391,6 +397,33 @@ pte_t napotpte_ptep_get_lockless(pte_t *orig_ptep)
 }
 EXPORT_SYMBOL(napotpte_ptep_get_lockless);
 
+unsigned int napotpte_pte_batch_hint(pte_t *ptep)
+{
+	pte_t orig_pte, pte;
+	pte_t *start;
+	unsigned int i, nr, off;
+
+	if (!napot_hw_supported())
+		return 1;
+
+	orig_pte = READ_ONCE(*ptep);
+	if (!pte_present_napot(orig_pte))
+		return 1;
+
+	start = napot_align_ptep(ptep);
+	nr = napotpte_pte_num();
+	off = ptep - start;
+
+	for (i = off; i < nr; i++) {
+		pte = READ_ONCE(start[i]);
+		if (!napotpte_is_batch_consistent(pte, orig_pte))
+			return 1;
+	}
+
+	return nr - off;
+}
+EXPORT_SYMBOL(napotpte_pte_batch_hint);
+
 void napotpte_set_ptes(struct mm_struct *mm, unsigned long addr,
 		       pte_t *ptep, pte_t pte, unsigned int nr)
 {
-- 
2.39.5
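
Reader's note on patch 7/7: the value returned by
napotpte_pte_batch_hint() can be modeled in user space. The sketch below
is not kernel code and is not taken from the series. It assumes a
16-entry NAPOT range (64 KiB with 4 KiB pages) and the RISC-V PTE bit
positions V = bit 0, A = bit 6, N = bit 63. The names batch_hint(),
present_napot(), mkold() and NAPOT_NR are hypothetical stand-ins for the
patch's helpers, and the table is assumed to start on a NAPOT boundary.
It returns the distance from the given index to the end of its NAPOT
range when every remaining entry is a present NAPOT PTE that equals the
first one once the accessed bit is ignored, and 1 otherwise.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t pte_t;

#define PTE_V    (1ULL << 0)   /* valid */
#define PTE_A    (1ULL << 6)   /* accessed */
#define PTE_N    (1ULL << 63)  /* Svnapot N bit */
#define NAPOT_NR 16u           /* assumption: 64 KiB range of 4 KiB pages */

/* Hypothetical stand-in for pte_present_napot(). */
static inline int present_napot(pte_t pte)
{
	return (pte & PTE_V) && (pte & PTE_N);
}

/* Hypothetical stand-in for pte_mkold(): drop the accessed bit. */
static inline pte_t mkold(pte_t pte)
{
	return pte & ~PTE_A;
}

/*
 * Model of the batch hint. table[0] is assumed to sit on a NAPOT
 * boundary, so aligning the index aligns the entry.
 */
static inline unsigned int batch_hint(const pte_t *table, size_t idx)
{
	pte_t orig = table[idx];
	size_t start = idx & ~(size_t)(NAPOT_NR - 1);
	size_t off = idx - start;
	size_t i;

	if (!present_napot(orig))
		return 1;

	/* Only the entries from idx to the end of the range are checked. */
	for (i = off; i < NAPOT_NR; i++) {
		pte_t pte = table[start + i];

		if (!present_napot(pte) || mkold(pte) != mkold(orig))
			return 1;
	}

	return (unsigned int)(NAPOT_NR - off);
}
```

With a fully consistent range, batch_hint(table, 5) yields 11, the
number of entries left in the 16-entry range. A single entry in the
remainder that has lost its N bit drops the hint back to 1, which is the
same single-entry fallback the !CONFIG_RISCV_ISA_SVNAPOT build uses.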