From nobody Sat Jun 20 17:33:36 2026
From: Oscar Salvador
To: Andrew Morton
Cc: David Hildenbrand, Michal Hocko, Vlastimil Babka, Muchun Song,
 Lorenzo Stoakes, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 Oscar Salvador
Subject: [RFC PATCH 1/7] mm: Add softleaf_from_pud
Date: Sun, 12 Apr 2026 19:42:38 +0200
Message-ID: <20260412174244.133715-2-osalvador@suse.de>
X-Mailer: git-send-email 2.49.0
In-Reply-To: <20260412174244.133715-1-osalvador@suse.de>
References: <20260412174244.133715-1-osalvador@suse.de>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

We want to be able to operate on HugeTLB pages as we do with normal
pages, which means we must stop pretending everything is a pte in the
HugeTLB world and be able to operate on the right entry level.
Since we can have HugeTLB as PUD entries, we need the infrastructure
that allows us to operate on them, so add softleaf_from_pud(), and the
infrastructure that comes with it.

Signed-off-by: Oscar Salvador
---
 arch/arm64/include/asm/pgtable.h             | 12 +++++
 arch/loongarch/include/asm/pgtable.h         |  1 +
 arch/powerpc/include/asm/book3s/64/pgtable.h |  7 +++
 arch/s390/include/asm/pgtable.h              | 38 ++++++++++++++++
 arch/x86/include/asm/pgtable.h               | 48 ++++++++++++++++++++
 arch/x86/include/asm/pgtable_64.h            |  2 +
 include/asm-generic/pgtable_uffd.h           | 15 ++++++
 include/linux/leafops.h                      | 33 ++++++++++++++
 include/linux/pgtable.h                      | 37 +++++++++++++++
 9 files changed, 193 insertions(+)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index b3e58735c49b..e42ad56a86d4 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -596,6 +596,13 @@ static inline int pmd_protnone(pmd_t pmd)
 #define pmd_mkyoung(pmd) pte_pmd(pte_mkyoung(pmd_pte(pmd)))
 #define pmd_mkinvalid(pmd) pte_pmd(pte_mkinvalid(pmd_pte(pmd)))
 #ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+#define pud_uffd_wp(pud) pte_uffd_wp(pud_pte(pud))
+#define pud_mkuffd_wp(pud) pte_pud(pte_mkuffd_wp(pud_pte(pud)))
+#define pud_clear_uffd_wp(pud) pte_pud(pte_clear_uffd_wp(pud_pte(pud)))
+#define pud_swp_uffd_wp(pud) pte_swp_uffd_wp(pud_pte(pud))
+#define pud_swp_mkuffd_wp(pud) pte_pud(pte_swp_mkuffd_wp(pud_pte(pud)))
+#define pud_swp_clear_uffd_wp(pud) \
+	pte_pud(pte_swp_clear_uffd_wp(pud_pte(pud)))
 #define pmd_uffd_wp(pmd) pte_uffd_wp(pmd_pte(pmd))
 #define pmd_mkuffd_wp(pmd) pte_pmd(pte_mkuffd_wp(pmd_pte(pmd)))
 #define pmd_clear_uffd_wp(pmd) pte_pmd(pte_clear_uffd_wp(pmd_pte(pmd)))
@@ -1528,6 +1535,11 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma,
 #define __swp_entry_to_pmd(swp) __pmd((swp).val)
 #endif /* CONFIG_ARCH_ENABLE_THP_MIGRATION */
 
+#ifdef CONFIG_HUGETLB_PAGE
+#define __pud_to_swp_entry(pud) ((swp_entry_t) { pud_val(pud) })
+#define __swp_entry_to_pud(swp) __pud((swp).val)
+#endif
+
 /*
  * Ensure that there are not more swap files than can be encoded in the kernel
  * PTEs.
diff --git a/arch/loongarch/include/asm/pgtable.h b/arch/loongarch/include/asm/pgtable.h
index c33b3bcb733e..eba6d20f007f 100644
--- a/arch/loongarch/include/asm/pgtable.h
+++ b/arch/loongarch/include/asm/pgtable.h
@@ -335,6 +335,7 @@ static inline pte_t mk_swap_pte(unsigned long type, unsigned long offset)
 #define __swp_entry_to_pmd(x) __pmd((x).val | _PAGE_HUGE)
 #define __pte_to_swp_entry(pte) ((swp_entry_t) { pte_val(pte) })
 #define __pmd_to_swp_entry(pmd) ((swp_entry_t) { pmd_val(pmd) })
+#define __pud_to_swp_entry(pud) ((swp_entry_t) { pud_val(pud) })
 
 static inline bool pte_swp_exclusive(pte_t pte)
 {
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 1a91762b455d..476781c59d5f 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -1065,6 +1065,13 @@ static inline pte_t *pmdp_ptep(pmd_t *pmd)
 #define pmd_swp_soft_dirty(pmd) pte_swp_soft_dirty(pmd_pte(pmd))
 #define pmd_swp_clear_soft_dirty(pmd) pte_pmd(pte_swp_clear_soft_dirty(pmd_pte(pmd)))
 #endif
+
+#ifdef CONFIG_HUGETLB_PAGE
+#define pud_swp_mksoft_dirty(pud) pte_pud(pte_swp_mksoft_dirty(pud_pte(pud)))
+#define pud_swp_soft_dirty(pud) pte_swp_soft_dirty(pud_pte(pud))
+#define pud_swp_clear_soft_dirty(pud) pte_pud(pte_swp_clear_soft_dirty(pud_pte(pud)))
+#endif
+
 #endif /* CONFIG_HAVE_ARCH_SOFT_DIRTY */
 
 #ifdef CONFIG_NUMA_BALANCING
diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 1c3c3be93be9..0d1d571215c4 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -901,11 +901,31 @@ static inline pmd_t pmd_clear_soft_dirty(pmd_t pmd)
 	return clear_pmd_bit(pmd, __pgprot(_SEGMENT_ENTRY_SOFT_DIRTY));
 }
 
+static inline int pud_soft_dirty(pud_t pud)
+{
+	return pud_val(pud) & _REGION3_ENTRY_SOFT_DIRTY;
+}
+
+static inline pud_t pud_mksoft_dirty(pud_t pud)
+{
+	return set_pud_bit(pud, __pgprot(_REGION3_ENTRY_SOFT_DIRTY));
+}
+
+static inline pud_t pud_clear_soft_dirty(pud_t pud)
+{
+	return clear_pud_bit(pud, __pgprot(_REGION3_ENTRY_SOFT_DIRTY));
+}
+
 #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
 #define pmd_swp_soft_dirty(pmd) pmd_soft_dirty(pmd)
 #define pmd_swp_mksoft_dirty(pmd) pmd_mksoft_dirty(pmd)
 #define pmd_swp_clear_soft_dirty(pmd) pmd_clear_soft_dirty(pmd)
 #endif
+#ifdef CONFIG_HUGETLB_PAGE
+#define pud_swp_soft_dirty(pud) pud_soft_dirty(pud)
+#define pud_swp_mksoft_dirty(pud) pud_mksoft_dirty(pud)
+#define pud_swp_clear_soft_dirty(pud) pud_clear_soft_dirty(pud)
+#endif
 
 /*
  * query functions pte_write/pte_dirty/pte_young only work if
@@ -1901,6 +1921,24 @@ static inline unsigned long __swp_offset_rste(swp_entry_t entry)
  * requires conversion of the swap type and offset, and not all the possible
  * PTE bits.
  */
+static inline swp_entry_t __pud_to_swp_entry(pud_t pud)
+{
+	swp_entry_t arch_entry;
+	pte_t pte;
+
+	arch_entry = __rste_to_swp_entry(pud_val(pud));
+	pte = mk_swap_pte(__swp_type_rste(arch_entry), __swp_offset_rste(arch_entry));
+	return __pte_to_swp_entry(pte);
+}
+
+static inline pud_t __swp_entry_to_pud(swp_entry_t arch_entry)
+{
+	pud_t pud;
+
+	pud = __pud(mk_swap_rste(__swp_type(arch_entry), __swp_offset(arch_entry)));
+	return pud;
+}
+
 static inline swp_entry_t __pmd_to_swp_entry(pmd_t pmd)
 {
 	swp_entry_t arch_entry;
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 1662c5a8f445..a68ff339cd56 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -656,6 +656,23 @@ static inline pud_t pud_mkwrite(pud_t pud)
 	return pud_clear_saveddirty(pud);
 }
 
+#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+static inline int pud_uffd_wp(pud_t pud)
+{
+	return pud_flags(pud) & _PAGE_UFFD_WP;
+}
+
+static inline pud_t pud_mkuffd_wp(pud_t pud)
+{
+	return pud_wrprotect(pud_set_flags(pud, _PAGE_UFFD_WP));
+}
+
+static inline pud_t pud_clear_uffd_wp(pud_t pud)
+{
+	return pud_clear_flags(pud, _PAGE_UFFD_WP);
+}
+#endif
+
 #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
 static inline int pte_soft_dirty(pte_t pte)
 {
@@ -1557,6 +1574,22 @@ static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd)
 	return pmd_clear_flags(pmd, _PAGE_SWP_SOFT_DIRTY);
 }
 #endif
+#ifdef CONFIG_HUGETLB_PAGE
+static inline pud_t pud_swp_mksoft_dirty(pud_t pud)
+{
+	return pud_set_flags(pud, _PAGE_SWP_SOFT_DIRTY);
+}
+
+static inline int pud_swp_soft_dirty(pud_t pud)
+{
+	return pud_flags(pud) & _PAGE_SWP_SOFT_DIRTY;
+}
+
+static inline pud_t pud_swp_clear_soft_dirty(pud_t pud)
+{
+	return pud_clear_flags(pud, _PAGE_SWP_SOFT_DIRTY);
+}
+#endif
 #endif
 
 #ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
@@ -1589,6 +1622,21 @@ static inline pmd_t pmd_swp_clear_uffd_wp(pmd_t pmd)
 {
 	return pmd_clear_flags(pmd, _PAGE_SWP_UFFD_WP);
 }
+
+static inline pud_t pud_swp_mkuffd_wp(pud_t pud)
+{
+	return pud_set_flags(pud, _PAGE_SWP_UFFD_WP);
+}
+
+static inline int pud_swp_uffd_wp(pud_t pud)
+{
+	return pud_flags(pud) & _PAGE_SWP_UFFD_WP;
+}
+
+static inline pud_t pud_swp_clear_uffd_wp(pud_t pud)
+{
+	return pud_clear_flags(pud, _PAGE_SWP_UFFD_WP);
+}
 #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
 
 static inline u16 pte_flags_pkey(unsigned long pte_flags)
diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
index f06e5d6a2747..0cf02ddd3d4b 100644
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -236,8 +236,10 @@ static inline void native_pgd_clear(pgd_t *pgd)
 
 #define __pte_to_swp_entry(pte) ((swp_entry_t) { pte_val((pte)) })
 #define __pmd_to_swp_entry(pmd) ((swp_entry_t) { pmd_val((pmd)) })
+#define __pud_to_swp_entry(pud) ((swp_entry_t) { pud_val((pud)) })
 #define __swp_entry_to_pte(x) (__pte((x).val))
 #define __swp_entry_to_pmd(x) (__pmd((x).val))
+#define __swp_entry_to_pud(x) (__pud((x).val))
 
 extern void cleanup_highmap(void);
 
diff --git a/include/asm-generic/pgtable_uffd.h b/include/asm-generic/pgtable_uffd.h
index 0d85791efdf7..59c9d6762ec8 100644
--- a/include/asm-generic/pgtable_uffd.h
+++ b/include/asm-generic/pgtable_uffd.h
@@ -78,6 +78,21 @@ static inline pmd_t pmd_swp_clear_uffd_wp(pmd_t pmd)
 {
 	return pmd;
 }
+
+static inline pud_t pud_swp_mkuffd_wp(pud_t pud)
+{
+	return pud;
+}
+
+static inline int pud_swp_uffd_wp(pud_t pud)
+{
+	return 0;
+}
+
+static inline pud_t pud_swp_clear_uffd_wp(pud_t pud)
+{
+	return pud;
+}
 #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
 
 #endif /* _ASM_GENERIC_PGTABLE_UFFD_H */
diff --git a/include/linux/leafops.h b/include/linux/leafops.h
index a9ff94b744f2..122ac50aeb09 100644
--- a/include/linux/leafops.h
+++ b/include/linux/leafops.h
@@ -117,6 +117,39 @@ static inline softleaf_t softleaf_from_pmd(pmd_t pmd)
 
 #endif
 
+#ifdef CONFIG_HUGETLB_PAGE
+/**
+ * softleaf_from_pud() - Obtain a leaf entry from a PUD entry.
+ * @pud: PUD entry.
+ *
+ * If @pud is present (therefore not a leaf entry) the function returns an empty
+ * leaf entry. Otherwise, it returns a leaf entry.
+ *
+ * Returns: Leaf entry.
+ */
+static inline softleaf_t softleaf_from_pud(pud_t pud)
+{
+	softleaf_t arch_entry;
+
+	if (pud_present(pud) || pud_none(pud))
+		return softleaf_mk_none();
+
+	if (pud_swp_soft_dirty(pud))
+		pud = pud_swp_clear_soft_dirty(pud);
+	if (pud_swp_uffd_wp(pud))
+		pud = pud_swp_clear_uffd_wp(pud);
+	arch_entry = __pud_to_swp_entry(pud);
+
+	/* Temporary until swp_entry_t eliminated. */
+	return swp_entry(__swp_type(arch_entry), __swp_offset(arch_entry));
+}
+#else
+static inline softleaf_t softleaf_from_pud(pud_t pud)
+{
+	return softleaf_mk_none();
+}
+#endif
+
 /**
  * softleaf_is_none() - Is the leaf entry empty?
  * @entry: Leaf entry.
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index a50df42a893f..1abd9c52a4f2 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -1761,6 +1761,22 @@ static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd)
 	return pmd;
 }
 #endif
+#ifndef CONFIG_HUGETLB_PAGE
+static inline pud_t pud_swp_mksoft_dirty(pud_t pud)
+{
+	return pud;
+}
+
+static inline int pud_swp_soft_dirty(pud_t pud)
+{
+	return 0;
+}
+
+static inline pud_t pud_swp_clear_soft_dirty(pud_t pud)
+{
+	return pud;
+}
+#endif
 #else /* !CONFIG_HAVE_ARCH_SOFT_DIRTY */
 static inline int pte_soft_dirty(pte_t pte)
 {
@@ -1821,6 +1837,21 @@ static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd)
 {
 	return pmd;
 }
+
+static inline pud_t pud_swp_mksoft_dirty(pud_t pud)
+{
+	return pud;
+}
+
+static inline int pud_swp_soft_dirty(pud_t pud)
+{
+	return 0;
+}
+
+static inline pud_t pud_swp_clear_soft_dirty(pud_t pud)
+{
+	return pud;
+}
 #endif
 
 #ifndef __HAVE_PFNMAP_TRACKING
@@ -2369,4 +2400,10 @@ pgprot_t vm_get_page_prot(vm_flags_t vm_flags) \
 } \
 EXPORT_SYMBOL(vm_get_page_prot);
 
+#ifdef CONFIG_HUGETLB_PAGE
+#ifndef __pud_to_swp_entry
+#define __pud_to_swp_entry(pud) ((swp_entry_t) { pud_val(pud) })
+#endif
+#endif
+
 #endif /* _LINUX_PGTABLE_H */
-- 
2.35.3

From nobody Sat Jun 20 17:33:36 2026
From: Oscar Salvador
To: Andrew Morton
Cc: David Hildenbrand, Michal Hocko, Vlastimil Babka, Muchun Song,
 Lorenzo Stoakes, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 Oscar Salvador
Subject: [RFC PATCH 2/7] mm: Add {pmd,pud}_huge_lock helper
Date: Sun, 12 Apr 2026 19:42:39 +0200
Message-ID: <20260412174244.133715-3-osalvador@suse.de>
X-Mailer: git-send-email 2.49.0
In-Reply-To: <20260412174244.133715-1-osalvador@suse.de>
References: <20260412174244.133715-1-osalvador@suse.de>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

HugeTLB and THP use the same lock for pud and pmd, so create two helpers
that can be directly used by both of them, as they will be used in the
generic pagewalkers.

Signed-off-by: Oscar Salvador
---
 include/linux/mm_inline.h | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index fa2d6ba811b5..3ac77b50e91f 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -655,4 +655,36 @@ static inline size_t num_pages_contiguous(struct page **pages, size_t nr_pages)
 	return i;
 }
 
+static inline spinlock_t *pmd_huge_lock(pmd_t *pmd, struct vm_area_struct *vma)
+{
+	spinlock_t *ptl;
+
+	if (pmd_present(*pmd) || !pmd_none(*pmd)) {
+		ptl = pmd_lock(vma->vm_mm, pmd);
+		if (pmd_present(*pmd) && pmd_leaf(*pmd))
+			return ptl;
+		else if (!pmd_present(*pmd) && !pmd_none(*pmd))
+			return ptl;
+		spin_unlock(ptl);
+	}
+
+	return NULL;
+}
+
+static inline spinlock_t *pud_huge_lock(pud_t *pud, struct vm_area_struct *vma)
+{
+	spinlock_t *ptl;
+
+	if (pud_present(*pud) || !pud_none(*pud)) {
+		ptl = pud_lock(vma->vm_mm, pud);
+		if (pud_present(*pud) && pud_leaf(*pud))
+			return ptl;
+		else if (!pud_present(*pud) && !pud_none(*pud))
+			return ptl;
+		spin_unlock(ptl);
+	}
+
+	return NULL;
+}
+
 #endif
-- 
2.35.3

From nobody Sat Jun 20 17:33:36 2026
From: Oscar Salvador
To: Andrew Morton
Cc: David Hildenbrand, Michal Hocko, Vlastimil Babka, Muchun Song,
 Lorenzo Stoakes, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 Oscar Salvador
Subject: [RFC PATCH 3/7] mm: Implement folio_pmd_batch
Date: Sun, 12 Apr 2026 19:42:40 +0200
Message-ID: <20260412174244.133715-4-osalvador@suse.de>
X-Mailer: git-send-email 2.49.0
In-Reply-To: <20260412174244.133715-1-osalvador@suse.de>
References: <20260412174244.133715-1-osalvador@suse.de>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

HugeTLB can be mapped as contiguous PMDs, so we need a way to be able to
batch them as we do for contiguous PTEs.
Implement folio_pmd_batch in order to do that.

Signed-off-by: Oscar Salvador
---
 arch/arm64/include/asm/pgtable.h | 19 ++++++++
 include/linux/pgtable.h          | 30 +++++++++++++
 mm/internal.h                    | 75 +++++++++++++++++++++++++++++++-
 3 files changed, 123 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index e42ad56a86d4..5b5490505b94 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -170,6 +170,8 @@ static inline pteval_t __phys_to_pte_val(phys_addr_t phys)
 	(__boundary - 1 < (end) - 1) ? __boundary : (end); \
 })
 
+#define pmd_valid_cont(pmd) (pmd_valid(pmd) && pmd_cont(pmd))
+
 #define pte_hw_dirty(pte) (pte_write(pte) && !pte_rdonly(pte))
 #define pte_sw_dirty(pte) (!!(pte_val(pte) & PTE_DIRTY))
 #define pte_dirty(pte) (pte_sw_dirty(pte) || pte_hw_dirty(pte))
@@ -670,6 +672,12 @@ static inline pgprot_t pmd_pgprot(pmd_t pmd)
 	return __pgprot(pmd_val(pfn_pmd(pfn, __pgprot(0))) ^ pmd_val(pmd));
 }
 
+#define pmd_advance_pfn pmd_advance_pfn
+static inline pmd_t pmd_advance_pfn(pmd_t pmd, unsigned long nr)
+{
+	return pfn_pmd(pmd_pfn(pmd) + nr, pmd_pgprot(pmd));
+}
+
 #define pud_pgprot pud_pgprot
 static inline pgprot_t pud_pgprot(pud_t pud)
 {
@@ -1645,6 +1653,17 @@ extern void modify_prot_commit_ptes(struct vm_area_struct *vma, unsigned long ad
 				    pte_t *ptep, pte_t old_pte, pte_t pte,
 				    unsigned int nr);
 
+#ifdef CONFIG_HUGETLB_PAGE
+#define pmd_batch_hint pmd_batch_hint
+static inline unsigned int pmd_batch_hint(pmd_t *pmdp, pmd_t pmd)
+{
+	if (!pmd_valid_cont(pmd))
+		return 1;
+
+	return CONT_PMDS - (((unsigned long)pmdp >> 3) & (CONT_PMDS - 1));
+}
+#endif
+
 #ifdef CONFIG_ARM64_CONTPTE
 
 /*
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 1abd9c52a4f2..6f01d5ed73f6 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -358,6 +358,36 @@ static inline void lazy_mmu_mode_pause(void) {}
 static inline void lazy_mmu_mode_resume(void) {}
 #endif
 
+#ifndef pmd_batch_hint
+/**
+ * pmd_batch_hint - Number of PMD entries that can be added to batch without scanning.
+ * @pmdp: Page table pointer for the entry.
+ * @pmd: Page table entry.
+ *
+ * Some architectures know that a set of contiguous pmds all map the same
+ * contiguous memory with the same permissions. In this case, it can provide a
+ * hint to aid pmd batching without the core code needing to scan every pmd.
+ *
+ * An architecture implementation may ignore the PMD accessed state. Further,
+ * the dirty state must apply atomically to all the PMDs described by the hint.
+ *
+ * May be overridden by the architecture, else pmd_batch_hint is always 1.
+ */
+static inline unsigned int pmd_batch_hint(pmd_t *pmdp, pmd_t pmd)
+{
+	return 1;
+}
+#endif
+
+#ifndef pmd_advance_pfn
+static inline pmd_t pmd_advance_pfn(pmd_t pmd, unsigned long nr)
+{
+	return __pmd(pmd_val(pmd) + (nr << PFN_PTE_SHIFT));
+}
+#endif
+
+#define pmd_next_pfn(pmd) pmd_advance_pfn(pmd, 1)
+
 #ifndef pte_batch_hint
 /**
  * pte_batch_hint - Number of pages that can be added to batch without scanning.
diff --git a/mm/internal.h b/mm/internal.h
index cb0af847d7d9..8fa0681ff2af 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -269,7 +269,7 @@ static inline int anon_vma_prepare(struct vm_area_struct *vma)
 	return __anon_vma_prepare(vma);
 }
 
-/* Flags for folio_pte_batch(). */
+/* Flags for folio_{pmd,pte}_batch(). */
 typedef int __bitwise fpb_t;
 
 /* Compare PTEs respecting the dirty bit. */
@@ -293,6 +293,79 @@ typedef int __bitwise fpb_t;
  */
 #define FPB_MERGE_YOUNG_DIRTY ((__force fpb_t)BIT(4))
 
+static inline pmd_t __pmd_batch_clear_ignored(pmd_t pmd, fpb_t flags)
+{
+	if (!(flags & FPB_RESPECT_DIRTY))
+		pmd = pmd_mkclean(pmd);
+	if (likely(!(flags & FPB_RESPECT_SOFT_DIRTY)))
+		pmd = pmd_clear_soft_dirty(pmd);
+	if (likely(!(flags & FPB_RESPECT_WRITE)))
+		pmd = pmd_wrprotect(pmd);
+	return pmd_mkold(pmd);
+}
+
+/**
+ * folio_pmd_batch - detect a PMD batch for a large folio.
+ * - The only user of this is hugetlb for contiguous + * PMDs + **/ +static inline int folio_pmd_batch(struct folio *folio, pmd_t *pmdp, pmd_t = *pmdentp, + int max_nr, fpb_t flags, bool *any_writable, + bool *any_young, bool *any_dirty) +{ + pmd_t expected_pmd, pmd =3D *pmdentp; + bool writable, young, dirty; + int nr, cur_nr; + + if (any_writable) + *any_writable =3D !!pmd_write(*pmdentp); + if (any_young) + *any_young =3D !!pmd_young(*pmdentp); + if (any_dirty) + *any_dirty =3D !!pmd_dirty(*pmdentp); + + VM_WARN_ON_FOLIO(!pmd_present(pmd), folio); + VM_WARN_ON_FOLIO(!folio_test_large(folio) || max_nr < 1, folio); + VM_WARN_ON_FOLIO(page_folio(pfn_to_page(pmd_pfn(pmd))) !=3D folio, folio); + + /* Limit max_nr to the actual remaining PFNs in the folio we could batch.= */ + max_nr =3D min_t(unsigned long, max_nr, + (folio_pfn(folio) + folio_nr_pages(folio) - + pmd_pfn(pmd)) >> (PMD_SHIFT - PAGE_SHIFT)); + + nr =3D pmd_batch_hint(pmdp, pmd); + expected_pmd =3D __pmd_batch_clear_ignored(pmd_advance_pfn(pmd, nr << (PM= D_SHIFT - PAGE_SHIFT)), flags); + pmdp =3D pmdp + nr; + + while (nr < max_nr) { + pmd =3D pmdp_get(pmdp); + if (any_writable) + writable =3D !!pmd_write(pmd); + if (any_young) + young =3D !!pmd_young(pmd); + if (any_dirty) + dirty =3D !!pmd_dirty(pmd); + pmd =3D __pmd_batch_clear_ignored(pmd, flags); + + if (!pmd_same(pmd, expected_pmd)) + break; + + if (any_writable) + *any_writable |=3D writable; + if (any_young) + *any_young |=3D young; + if (any_dirty) + *any_dirty |=3D dirty; + + cur_nr =3D pmd_batch_hint(pmdp, pmd); + expected_pmd =3D pmd_advance_pfn(expected_pmd, cur_nr << (PMD_SHIFT - PA= GE_SHIFT)); + pmdp +=3D cur_nr; + nr +=3D cur_nr; + } + + return min(nr, max_nr); +} + static inline pte_t __pte_batch_clear_ignored(pte_t pte, fpb_t flags) { if (!(flags & FPB_RESPECT_DIRTY)) --=20 2.35.3 From nobody Sat Jun 20 17:33:36 2026 Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) (using TLSv1.2 with cipher 
ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9EEB836308E for ; Sun, 12 Apr 2026 17:43:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.131 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776015809; cv=none; b=Q+hipma54VmA4QDzRoNRtM12kaw2CKG+Oj2o6iivzbCHAsK3sZToVyZBFc5R2H0vF4wUL0IuGgMMSMQaI7oprAuC0/zWl2E/NJauMKs1ti/dvLHzLp0xBMyNHjuAzHSratQA/zWiIup+UrH13AzAEYMjDGHlWE73Fxs7OG6Po8o= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776015809; c=relaxed/simple; bh=R9NErneZaY1W3P+yLzJOeb96fF8z1yHRp4UowTtpdhE=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=Oawr3ZNd2fTHFRs1uJWOOK6YOkd2aX6XPT+VzUHX2DPDS9uN4qagicoi6mvKfioZDsW4V1DWUqRa3V3nhhy6K3FQq55vrhzDmtAiXMBuMtTCyfu6IpmAOBBFcVk5B9NqMVg/tM2LFLGwrOjaLntFEVylIcRj/35Ji8wz119elVY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de; spf=pass smtp.mailfrom=suse.de; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=XRSZFB5X; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=6YgjNOx+; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=XRSZFB5X; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=6YgjNOx+; arc=none smtp.client-ip=195.135.223.131 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="XRSZFB5X"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="6YgjNOx+"; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="XRSZFB5X"; dkim=permerror (0-bit key) header.d=suse.de 
header.i=@suse.de header.b="6YgjNOx+" Received: from imap1.dmz-prg2.suse.org (unknown [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 4E1385BD87; Sun, 12 Apr 2026 17:43:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1776015783; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=/mcbLvQVD9OoXomXT4xdcrVMVpfN32lb1agD07R5EgA=; b=XRSZFB5XNF9TRp6FFMT6B/556yhQ/VDNq19vPy4f1ucdzTNwr5WgNMVuyEhmcD9QhRViGJ mdCRfNrH+tLWWnxwn4atmEzFwL5u/MoAmyQDmFpXLNiyDimujYxIvgbwHB5u/ZGi6zKtwl //cWqpeLTkazKHYOGVOXUkMwRWHZs4U= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1776015783; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=/mcbLvQVD9OoXomXT4xdcrVMVpfN32lb1agD07R5EgA=; b=6YgjNOx+IyZnD5HiARoLdIWr2wl0X2+hObbsbVLzDIr48vBiW4nRZPW63zVqs+JDZeE8T/ FNh289HGieudCBCQ== Authentication-Results: smtp-out2.suse.de; none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1776015783; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=/mcbLvQVD9OoXomXT4xdcrVMVpfN32lb1agD07R5EgA=; b=XRSZFB5XNF9TRp6FFMT6B/556yhQ/VDNq19vPy4f1ucdzTNwr5WgNMVuyEhmcD9QhRViGJ mdCRfNrH+tLWWnxwn4atmEzFwL5u/MoAmyQDmFpXLNiyDimujYxIvgbwHB5u/ZGi6zKtwl //cWqpeLTkazKHYOGVOXUkMwRWHZs4U= DKIM-Signature: v=1; 
a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1776015783; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=/mcbLvQVD9OoXomXT4xdcrVMVpfN32lb1agD07R5EgA=; b=6YgjNOx+IyZnD5HiARoLdIWr2wl0X2+hObbsbVLzDIr48vBiW4nRZPW63zVqs+JDZeE8T/ FNh289HGieudCBCQ== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id BEFD34AA48; Sun, 12 Apr 2026 17:43:02 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id AGkfLKbZ22miRQAAD6G6ig (envelope-from ); Sun, 12 Apr 2026 17:43:02 +0000 From: Oscar Salvador To: Andrew Morton Cc: David Hildenbrand , Michal Hocko , Vlastimil Babka , Muchun Song , Lorenzo Stoakes , linux-kernel@vger.kernel.org, linux-mm@kvack.org, Oscar Salvador , David Hildenbrand Subject: [RFC PATCH 4/7] mm: Implement pt_range_walk Date: Sun, 12 Apr 2026 19:42:41 +0200 Message-ID: <20260412174244.133715-5-osalvador@suse.de> X-Mailer: git-send-email 2.49.0 In-Reply-To: <20260412174244.133715-1-osalvador@suse.de> References: <20260412174244.133715-1-osalvador@suse.de> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Spam-Score: -3.30 X-Spam-Level: X-Spamd-Result: default: False [-3.30 / 50.00]; BAYES_HAM(-3.00)[100.00%]; MID_CONTAINS_FROM(1.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; FUZZY_RATELIMITED(0.00)[rspamd.com]; RCVD_VIA_SMTP_AUTH(0.00)[]; 
MIME_TRACE(0.00)[0:+]; TO_DN_SOME(0.00)[]; ARC_NA(0.00)[]; RCPT_COUNT_SEVEN(0.00)[10]; RCVD_TLS_ALL(0.00)[]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; FROM_HAS_DN(0.00)[]; R_RATELIMIT(0.00)[to_ip_from(RLd9dsuofksntgrby8c3fm48h6)]; FROM_EQ_ENVFROM(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[]; RCVD_COUNT_TWO(0.00)[2]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.de:mid,suse.de:email]
X-Spam-Flag: NO

Implement pt_range_walk, a pagewalk API that handles locking and batching
itself and returns a struct describing the part of the address space that
is backed by the vma.
It walks the address range provided and returns whatever it finds there
(softleaf entries, folios, etc.) together with information about the entry
itself: whether it is dirty, shared or present, the size of the entry, the
page table level of the entry, the number of batched entries, etc.

It defines the following types:

 #define PT_TYPE_NONE
 #define PT_TYPE_FOLIO
 #define PT_TYPE_MARKER
 #define PT_TYPE_PFN
 #define PT_TYPE_SWAP
 #define PT_TYPE_MIGRATION
 #define PT_TYPE_DEVICE
 #define PT_TYPE_HWPOISON
 #define PT_TYPE_ALL

and it lets the caller be explicit about which types it is interested in.
If it finds a type that the caller did not ask for, it keeps scanning the
address range until the next requested type is found or the range is
exhausted.

We have three functions:

 .pt_range_walk_start()
 .pt_range_walk_next()
 .pt_range_walk_done()

pt_range_walk_start() starts scanning the range and returns the first type
it finds. We then keep calling pt_range_walk_next() until we get PTW_DONE,
which means we exhausted the range. Once that happens we have to call
pt_range_walk_done() to clean up the pt_range_walk internal state, such as
locking.
An example below: =C2=B4=C2=B4=C2=B4=C2=B4 pt_type_flags_t flags =3D PT_TYPE_ALL; type =3D pt_range_walk_start(&ptw, vma, start, vma->vm_end, flags); while (type !=3D PTW_DONE) { do_something type =3D pt_range_walk_next(&ptw, vma, start, vma->vm_end, flags= ); } pt_range_walk_done(&ptw); =C2=B4=C2=B4=C2=B4=C2=B4 The API manages locking within the interface, and also batching, which means that it can handle contiguous ptes (or pmds in the case of hugetlb) itself. Suggested-by: David Hildenbrand Signed-off-by: Oscar Salvador --- arch/arm64/include/asm/pgtable.h | 1 + include/linux/mm.h | 2 + include/linux/pagewalk.h | 104 ++++++++ mm/memory.c | 22 ++ mm/pagewalk.c | 400 +++++++++++++++++++++++++++++++ 5 files changed, 529 insertions(+) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgta= ble.h index 5b5490505b94..9f8cca8880e0 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -642,6 +642,7 @@ static inline pmd_t pmd_mkspecial(pmd_t pmd) #define pmd_pfn(pmd) ((__pmd_to_phys(pmd) & PMD_MASK) >> PAGE_SHIFT) #define pfn_pmd(pfn,prot) __pmd(__phys_to_pmd_val((phys_addr_t)(pfn) << PA= GE_SHIFT) | pgprot_val(prot)) =20 +#define pud_dirty(pud) pte_dirty(pud_pte(pud)) #define pud_young(pud) pte_young(pud_pte(pud)) #define pud_mkyoung(pud) pte_pud(pte_mkyoung(pud_pte(pud))) #define pud_write(pud) pte_write(pud_pte(pud)) diff --git a/include/linux/mm.h b/include/linux/mm.h index 5be3d8a8f806..c4e7fc558476 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2829,6 +2829,8 @@ struct folio *vm_normal_folio_pmd(struct vm_area_stru= ct *vma, unsigned long addr, pmd_t pmd); struct page *vm_normal_page_pmd(struct vm_area_struct *vma, unsigned long = addr, pmd_t pmd); +struct folio *vm_normal_folio_pud(struct vm_area_struct *vma, + unsigned long addr, pud_t pud); struct page *vm_normal_page_pud(struct vm_area_struct *vma, unsigned long = addr, pud_t pud); =20 diff --git a/include/linux/pagewalk.h 
b/include/linux/pagewalk.h index 88e18615dd72..8662468b4a3f 100644 --- a/include/linux/pagewalk.h +++ b/include/linux/pagewalk.h @@ -204,4 +204,108 @@ struct folio *folio_walk_start(struct folio_walk *fw, vma_pgtable_walk_end(__vma); \ } while (0) =20 +typedef int __bitwise pt_type_flags_t; + +/* + * Types we are interested in returning. Those which are not explicitly set + * will be silently ignored by keep walking the page tables. + */ +#define PT_TYPE_NONE ((__force pt_type_flags_t)BIT(0)) +#define PT_TYPE_FOLIO ((__force pt_type_flags_t)BIT(1)) +#define PT_TYPE_MARKER ((__force pt_type_flags_t)BIT(2)) +#define PT_TYPE_PFN ((__force pt_type_flags_t)BIT(3)) +#define PT_TYPE_SWAP ((__force pt_type_flags_t)BIT(4)) +#define PT_TYPE_MIGRATION ((__force pt_type_flags_t)BIT(5)) +#define PT_TYPE_DEVICE ((__force pt_type_flags_t)BIT(6)) +#define PT_TYPE_HWPOISON ((__force pt_type_flags_t)BIT(7)) +#define PT_TYPE_ALL (PT_TYPE_NONE | PT_TYPE_FOLIO | PT_TYPE_MARKER | \ + PT_TYPE_PFN | PT_TYPE_SWAP | PT_TYPE_MIGRATION | \ + PT_TYPE_DEVICE | PT_TYPE_HWPOISON) + +enum pt_range_walk_level { + PTW_PUD_LEVEL, + PTW_PMD_LEVEL, + PTW_PTE_LEVEL, +}; + +enum pt_range_walk_type { + PTW_ABORT, + PTW_DONE, + PTW_NONE, + PTW_FOLIO, + PTW_MARKER, + PTW_PFN, + PTW_SWAP, + PTW_MIGRATION, + PTW_DEVICE, + PTW_HWPOISON, +}; + +/** + * struct pt_range_walk - pt_range_walk() + * @page: exact folio page referenced (if applicable) + * @folio: folio mapped (if any) + * @nr_entries: number of contiguous entries of the same type + * @size: stores nr_batched * entry_size + * @softleaf_entry: softleaf entry (if any) + * @writable: whether it is writable + * @young: whether it is young + * @dirty: whether it is dirty + * @present: whether it is present in the page tables + * @vma_locked: whether we are holding the vma lock + * @pmd_shared: only used for hugetlb + * @curr_addr: current addr we are operating on + * @next_addr: next addr to be used walk the page tables + * @level: page table level + * 
@pte: copy of the entry value (PTW_PTE_LEVEL). + * @pmd: copy of the entry value (PTW_PMD_LEVEL). + * @pud: copy of the entry value (PTW_PUD_LEVEL). + * @mm: the mm_struct we are walking + * @vma: the vma we are walking + * @ptl: pointer to the page table lock. + */ + +struct pt_range_walk { + struct page *page; + struct folio *folio; + int nr_entries; + unsigned long size; + softleaf_t softleaf_entry; + bool writable; + bool young; + bool dirty; + bool present; + bool vma_locked; + bool pmd_shared; + unsigned long curr_addr; + unsigned long next_addr; + enum pt_range_walk_level level; + union { + pte_t *ptep; + pud_t *pudp; + pmd_t *pmdp; + }; + union { + pte_t pte; + pud_t pud; + pmd_t pmd; + }; + struct mm_struct *mm; + struct vm_area_struct *vma; + spinlock_t *ptl; +}; + +enum pt_range_walk_type pt_range_walk(struct pt_range_walk *ptw, + struct vm_area_struct *vma, + unsigned long addr, unsigned long end, + pt_type_flags_t flags); +enum pt_range_walk_type pt_range_walk_start(struct pt_range_walk *ptw, + struct vm_area_struct *vma, + unsigned long addr, unsigned long end, + pt_type_flags_t flags); +enum pt_range_walk_type pt_range_walk_next(struct pt_range_walk *ptw, + struct vm_area_struct *vma, + unsigned long addr, unsigned long end, + pt_type_flags_t flags); +void pt_range_walk_done(struct pt_range_walk *ptw); #endif /* _LINUX_PAGEWALK_H */ diff --git a/mm/memory.c b/mm/memory.c index 07778814b4a8..e016bc7a49d9 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -850,6 +850,28 @@ struct page *vm_normal_page_pud(struct vm_area_struct = *vma, return __vm_normal_page(vma, addr, pud_pfn(pud), pud_special(pud), pud_val(pud), PGTABLE_LEVEL_PUD); } + +/** + * vm_normal_folio_pud() - Get the "struct folio" associated with a PUD + * @vma: The VMA mapping the @pud. + * @addr: The address where the @pud is mapped. + * @pud: The PUD. + * + * Get the "struct folio" associated with a PUD. See __vm_normal_page() + * for details on "normal" and "special" mappings. 
+ * + * Return: Returns the "struct folio" if this is a "normal" mapping. Retur= ns + * NULL if this is a "special" mapping. + */ +struct folio *vm_normal_folio_pud(struct vm_area_struct *vma, + unsigned long addr, pud_t pud) +{ + struct page *page =3D vm_normal_page_pud(vma, addr, pud); + + if (page) + return page_folio(page); + return NULL; +} #endif =20 /** diff --git a/mm/pagewalk.c b/mm/pagewalk.c index a94c401ab2cf..4c5c28fdccd4 100644 --- a/mm/pagewalk.c +++ b/mm/pagewalk.c @@ -1029,3 +1029,403 @@ struct folio *folio_walk_start(struct folio_walk *f= w, fw->ptl =3D ptl; return page_folio(page); } + +enum pt_range_walk_type pt_range_walk(struct pt_range_walk *ptw, + struct vm_area_struct *vma, + unsigned long addr, unsigned long end, + pt_type_flags_t flags) +{ + pgd_t *pgdp; + p4d_t *p4dp; + pud_t *pudp, pud; + pmd_t *pmdp, pmd; + pte_t *ptep, pte; + int nr_batched =3D 1; + spinlock_t *ptl =3D NULL; + unsigned long entry_size; + struct page *page; + struct folio *folio; + enum pt_range_walk_type ret_type =3D PTW_DONE; + bool writable, young, dirty; + unsigned long curr_addr, next_addr =3D ptw->next_addr ? 
ptw->next_addr : = addr; + + if (WARN_ON_ONCE(next_addr < vma->vm_start || next_addr >=3D vma->vm_end)) + return ret_type; + + mmap_assert_locked(ptw->mm); + + if (ptw->ptl) { + spin_unlock(ptw->ptl); + ptw->ptl =3D NULL; + } + + if (ptw->level =3D=3D PTW_PTE_LEVEL && ptw->ptep) { + pte_unmap(ptw->ptep); + ptw->ptep =3D NULL; + } + + if (!ptw->vma_locked) { + vma_pgtable_walk_begin(vma); + ptw->vma_locked =3D true; + ptw->vma =3D vma; + } + +keep_walking: + ret_type =3D PTW_DONE; + folio =3D NULL; + page =3D NULL; + writable =3D young =3D dirty =3D false; + ptw->present =3D false; + ptw->pmd_shared =3D false; + ptw->folio =3D NULL; + ptw->page =3D NULL; + + curr_addr =3D next_addr; + if (ptl) { + spin_unlock(ptl); + ptl =3D NULL; + } + /* + * If we keep walking the page tables because we are not interested + * in the type we found, make sure to check whether we reached the end. + */ + if (curr_addr >=3D end) { + ptw->next_addr =3D next_addr; + return ret_type; + } +again: + pgdp =3D pgd_offset(ptw->mm, curr_addr); + next_addr =3D pgd_addr_end(curr_addr, end); + + if (pgd_none_or_clear_bad(pgdp)) + /* PTW_ABORT? */ + goto keep_walking; + + next_addr =3D p4d_addr_end(curr_addr, end); + p4dp =3D p4d_offset(pgdp, curr_addr); + if (p4d_none_or_clear_bad(p4dp)) + /* PTW_ABORT? */ + goto keep_walking; + + entry_size =3D PUD_SIZE; + ptw->level =3D PTW_PUD_LEVEL; + next_addr =3D pud_addr_end(curr_addr, end); + pudp =3D pud_offset(p4dp, curr_addr); + pud =3D pudp_get(pudp); + if (pud_none(pud)) { + if (!(flags & PT_TYPE_NONE)) + goto keep_walking; + ret_type =3D PTW_NONE; + goto found; + } + /* + * For now, there are no architectures which supports pgd or p4d + * leafs, pud is the first level that can be a leaf. 
+ */ + if (IS_ENABLED(CONFIG_PGTABLE_HAS_HUGE_LEAVES) && + (!pud_present(pud) || pud_leaf(pud))) { + ptl =3D pud_huge_lock(pudp, vma); + if (!ptl) + goto again; + + pud =3D pudp_get(pudp); + ptw->pudp =3D pudp; + ptw->pud =3D pud; + if (pud_none(pud)) { + if (!(flags & PT_TYPE_NONE)) + goto keep_walking; + ret_type =3D PTW_NONE; + } else if (pud_present(pud) && !pud_leaf(pud)) { + spin_unlock(ptl); + ptl =3D NULL; + goto pmd_table; + } else if (pud_present(pud)) { + /* + * We do not support PUD-device or pud-PFNMAP, so + * if it is present, we must have a folio (Tm). + */ + page =3D vm_normal_page_pud(vma, curr_addr, pud); + if (!page || !(flags & PT_TYPE_FOLIO)) + goto keep_walking; + + ret_type =3D PTW_FOLIO; + folio =3D page_folio(page); + ptw->present =3D true; + dirty =3D !!pud_dirty(pud); + young =3D !!pud_young(pud); + writable =3D !!pud_write(pud); + } else if (!pud_none(pud)) { + /* PUD-hugetlbs can have special swap entries */ + const softleaf_t entry =3D softleaf_from_pud(pud); + + ptw->softleaf_entry =3D entry; + + if (softleaf_is_marker(entry)) { + if (!(flags & PT_TYPE_MARKER)) + goto keep_walking; + ret_type =3D PTW_MARKER; + } else if (softleaf_has_pfn(entry)) { + if (softleaf_is_migration(entry)) { + if (!(flags & PT_TYPE_MIGRATION)) + goto keep_walking; + ret_type =3D PTW_MIGRATION; + } else if (softleaf_is_hwpoison(entry)) { + if (!(flags & PT_TYPE_HWPOISON)) + goto keep_walking; + ret_type =3D PTW_HWPOISON; + } + + page =3D softleaf_to_page(entry); + if (page) + folio =3D page_folio(page); + } + } else { + /* We found nothing, keep going */ + goto keep_walking; + } + + /* We found a type */ + goto found; + } +pmd_table: + entry_size =3D PMD_SIZE; + ptw->level =3D PTW_PMD_LEVEL; + next_addr =3D pmd_addr_end(curr_addr, end); + pmdp =3D pmd_offset(pudp, curr_addr); + pmd =3D pmdp_get_lockless(pmdp); + if (pmd_none(pmd)) { + if (!(flags & PT_TYPE_NONE)) + goto keep_walking; + ret_type =3D PTW_NONE; + goto found; + } + + if 
(IS_ENABLED(CONFIG_PGTABLE_HAS_HUGE_LEAVES) && + (!pmd_present(pmd) || pmd_leaf(pmd))) { + ptl =3D pmd_huge_lock(pmdp, vma); + if (!ptl) + goto again; + + pmd =3D pmdp_get(pmdp); + ptw->pmdp =3D pmdp; + ptw->pmd =3D pmd; + if (pmd_none(pmd)) { + if (!(flags & PT_TYPE_NONE)) + goto keep_walking; + ret_type =3D PTW_NONE; + } else if (pmd_present(pmd) && !pmd_leaf(pmd)) { + spin_unlock(ptl); + ptl =3D NULL; + goto pte_table; + } else if (pmd_present(pmd)) { + page =3D vm_normal_page_pmd(vma, curr_addr, pmd); + if (page) { + if (!(flags & PT_TYPE_FOLIO)) + goto keep_walking; + ret_type =3D PTW_FOLIO; + folio =3D page_folio(page); + if (folio_size(folio) > entry_size) { + /* We can batch */ + int max_nr =3D folio_size(folio) / entry_size; + + nr_batched =3D folio_pmd_batch(folio, pmdp, &pmd, + max_nr, 0, + &writable, + &young, + &dirty); + } else { + dirty =3D !!pmd_dirty(pmd); + young =3D !!pmd_young(pmd); + writable =3D !!pmd_write(pmd); + } + } else if (!page && (is_huge_zero_pmd(pmd) || + vma->vm_flags & VM_PFNMAP)) { + if (!(flags & PT_TYPE_PFN)) + goto keep_walking; + /* Create a subtype to differentiate them? 
*/ + ret_type =3D PTW_PFN; + } else if (!page) { + goto keep_walking; + } + ptw->present =3D true; + next_addr +=3D (nr_batched * entry_size) - entry_size; + } else if (!pmd_none(pmd)) { + const softleaf_t entry =3D softleaf_from_pmd(pmd); + + ptw->softleaf_entry =3D entry; + + if (softleaf_is_marker(entry)) { + if (!(flags & PT_TYPE_MARKER)) + goto keep_walking; + ret_type =3D PTW_MARKER; + } else if (softleaf_has_pfn(entry)) { + if (softleaf_is_migration(entry)) { + if (!(flags & PT_TYPE_MIGRATION)) + goto keep_walking; + ret_type =3D PTW_MIGRATION; + } else if (softleaf_is_hwpoison(entry)) { + if (!(flags & PT_TYPE_HWPOISON)) + goto keep_walking; + ret_type =3D PTW_HWPOISON; + } else if (softleaf_is_device_private(entry) || + softleaf_is_device_exclusive(entry)) { + if (!(flags & PT_TYPE_DEVICE)) + goto not_found; + ptw->present =3D true; + ret_type =3D PTW_DEVICE; + } + page =3D softleaf_to_page(entry); + if (page) + folio =3D page_folio(page); + } + } else { + /* We found nothing, keep going */ + goto keep_walking; + } + + if (ret_type !=3D PTW_NONE && is_vm_hugetlb_page(vma) && + hugetlb_pmd_shared((pte_t *)pmdp)) + ptw->pmd_shared =3D true; + + goto found; + } +pte_table: + entry_size =3D PAGE_SIZE; + ptw->level =3D PTW_PTE_LEVEL; + next_addr =3D curr_addr + PAGE_SIZE; + ptep =3D pte_offset_map_lock(vma->vm_mm, pmdp, curr_addr, &ptl); + if (!ptep) + goto again; + + pte =3D ptep_get(ptep); + ptw->ptep =3D ptep; + ptw->pte =3D pte; + if (pte_none(pte)) { + if (!(flags & PT_TYPE_NONE)) + goto not_found; + ret_type =3D PTW_NONE; + } else if (pte_present(pte)) { + page =3D vm_normal_page(vma, curr_addr, pte); + if (page) { + if (!(flags & PT_TYPE_FOLIO)) + goto not_found; + ret_type =3D PTW_FOLIO; + folio =3D page_folio(page); + if (folio_test_large(folio)) { + /* We can batch */ + unsigned long end_addr =3D pmd_addr_end(curr_addr, end); + int max_nr =3D (end_addr - curr_addr) >> PAGE_SHIFT; + + nr_batched =3D folio_pte_batch_flags(folio, vma, ptep, &pte, max_nr, 
+ FPB_MERGE_WRITE | FPB_MERGE_YOUNG_DIRTY); + } + } else if (!page && (is_zero_pfn(pte_pfn(pte)) || + vma->vm_flags & VM_PFNMAP)) { + if (!(flags & PT_TYPE_PFN)) + goto not_found; + ret_type =3D PTW_PFN; + } + + dirty =3D !!pte_dirty(pte); + young =3D !!pte_young(pte); + writable =3D !!pte_write(pte); + ptw->present =3D true; + next_addr +=3D (nr_batched * entry_size) - entry_size; + } else if (!pte_none(pte)) { + const softleaf_t entry =3D softleaf_from_pte(pte); + + ptw->softleaf_entry =3D entry; + + if (softleaf_is_marker(entry)) { + if (!(flags & PT_TYPE_MARKER)) + goto not_found; + ret_type =3D PTW_MARKER; + } else if (softleaf_is_swap(entry)) { + unsigned long end_addr =3D pmd_addr_end(curr_addr, end); + int max_nr =3D (end_addr - curr_addr) >> PAGE_SHIFT; + + if (!(flags & PT_TYPE_SWAP)) + goto not_found; + + nr_batched =3D swap_pte_batch(ptep, max_nr, pte); + next_addr +=3D (nr_batched * entry_size) - entry_size; + ret_type =3D PTW_SWAP; + } else if (softleaf_has_pfn(entry)) { + if (softleaf_is_migration(entry)) { + if (!(flags & PT_TYPE_MIGRATION)) + goto not_found; + ret_type =3D PTW_MIGRATION; + } else if (softleaf_is_hwpoison(entry)) { + if (!(flags & PT_TYPE_HWPOISON)) + goto not_found; + ret_type =3D PTW_HWPOISON; + } else if (softleaf_is_device_private(entry) || + softleaf_is_device_exclusive(entry)) { + if (!(flags & PT_TYPE_DEVICE)) + goto not_found; + ptw->present =3D true; + ret_type =3D PTW_DEVICE; + } + page =3D softleaf_to_page(entry); + if (page) + folio =3D page_folio(page); + } + } else { +not_found: + /* We found nothing, keep going */ + pte_unmap_unlock(ptep, ptl); + ptw->ptep =3D NULL; + ptl =3D NULL; + goto keep_walking; + } + +found: + /* Fill in remaining ptw struct before returning */ + ptw->ptl =3D ptl; + ptw->curr_addr =3D curr_addr; + ptw->next_addr =3D next_addr; + ptw->writable =3D writable; + ptw->young =3D young; + ptw->dirty =3D dirty; + ptw->nr_entries =3D nr_batched; + ptw->size =3D nr_batched * entry_size; + if (folio) { + 
ptw->folio =3D folio; + ptw->page =3D page + ((curr_addr & (entry_size - 1)) >> PAGE_SHIFT); + } + return ret_type; +} + +enum pt_range_walk_type pt_range_walk_start(struct pt_range_walk *ptw, + struct vm_area_struct *vma, + unsigned long addr, unsigned long end, + pt_type_flags_t flags) +{ + if (!ptw->mm) + return PTW_DONE; + if (addr >=3D end) + return PTW_DONE; + return pt_range_walk(ptw, vma, addr, end, flags); +} + +enum pt_range_walk_type pt_range_walk_next(struct pt_range_walk *ptw, + struct vm_area_struct *vma, + unsigned long addr, unsigned long end, + pt_type_flags_t flags) +{ + /* We went through the complete range */ + if (ptw->next_addr >=3D end) + return PTW_DONE; + return pt_range_walk(ptw, vma, addr, end, flags); +} + +void pt_range_walk_done(struct pt_range_walk *ptw) +{ + if (ptw->ptl) + spin_unlock(ptw->ptl); + if (ptw->level =3D=3D PTW_PTE_LEVEL && ptw->ptep) + pte_unmap(ptw->ptep); + if (ptw->vma_locked) + vma_pgtable_walk_end(ptw->vma); + cond_resched(); +} --=20 2.35.3 From nobody Sat Jun 20 17:33:36 2026 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.223.130]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 93DE636308A for ; Sun, 12 Apr 2026 17:43:05 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.130 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776015787; cv=none; b=AiYvHxEVK+vVsG5UkS0gRtqcll/uVVQ1zXSKam2I6jxjBVUldqxWusdGHpslDAsemBhKxFi5Iq4Id6hA3424tJHsCtAm8yThian11OqntzvfhTjkjhygrGTs4VLukSSYFPumAl+Zy4o2bH+Biy1E4IumZqCQtmZ3OVHKsqhCCtc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776015787; c=relaxed/simple; bh=VMbUxGgTOpjDIw17PBfrEwsXl3Ga2WAfLiuByGZtzbA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=TDiZidQoQUS0COkyofqk+DnNpMQHjzCxP7PQqcOWzDd+eKkxShQsmqkfGZJBxP6pIq+SOmtprO9yOqPUR5DIK5FsriTNrNS3KXLugAN3R/xzNP90a1MqW3DD2YBu6JgzMz03n/R3XxI46XoRLVmQVEvBz6u0U//T2b1SW1lbQEg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de; spf=pass smtp.mailfrom=suse.de; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=wuvo7ilM; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=OHh8wmKt; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=wuvo7ilM; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=OHh8wmKt; arc=none smtp.client-ip=195.135.223.130 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="wuvo7ilM"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="OHh8wmKt"; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="wuvo7ilM"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="OHh8wmKt" Received: from imap1.dmz-prg2.suse.org (imap1.dmz-prg2.suse.org [IPv6:2a07:de40:b281:104:10:150:64:97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id D65036A851; Sun, 12 Apr 2026 17:43:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1776015783; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=lBbANHRonp7Gey8phh5lpLom/zoIGcl2OxsXcSKEgng=; 
b=wuvo7ilM/mAIP444FSEDE9Lq9V6cvn43YPU1/eEsHkYF9e4hVwqqA5mPwBlH3DMywAouOS 4w1yyt4biDf3atzwZTJUn0ljlYjnG5DUkHXulfLD7M/MK33VCQgvDVPoTtzGvX8obPCI1+ +SOYnroavRBCHA8f/xIhiWBZk0NGDmA= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1776015783; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=lBbANHRonp7Gey8phh5lpLom/zoIGcl2OxsXcSKEgng=; b=OHh8wmKtuaEOmva08p2CCnWEuCU/BiE0rP5xmFoYMdPknJc+DbwBpRM179KMCxpakRFVv9 LLjkd0HhBuM5/ABA== Authentication-Results: smtp-out1.suse.de; dkim=pass header.d=suse.de header.s=susede2_rsa header.b=wuvo7ilM; dkim=pass header.d=suse.de header.s=susede2_ed25519 header.b=OHh8wmKt DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1776015783; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=lBbANHRonp7Gey8phh5lpLom/zoIGcl2OxsXcSKEgng=; b=wuvo7ilM/mAIP444FSEDE9Lq9V6cvn43YPU1/eEsHkYF9e4hVwqqA5mPwBlH3DMywAouOS 4w1yyt4biDf3atzwZTJUn0ljlYjnG5DUkHXulfLD7M/MK33VCQgvDVPoTtzGvX8obPCI1+ +SOYnroavRBCHA8f/xIhiWBZk0NGDmA= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1776015783; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=lBbANHRonp7Gey8phh5lpLom/zoIGcl2OxsXcSKEgng=; b=OHh8wmKtuaEOmva08p2CCnWEuCU/BiE0rP5xmFoYMdPknJc+DbwBpRM179KMCxpakRFVv9 LLjkd0HhBuM5/ABA== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by 
imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 5E2964AA4A; Sun, 12 Apr 2026 17:43:03 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id cDt8FKfZ22miRQAAD6G6ig (envelope-from ); Sun, 12 Apr 2026 17:43:03 +0000 From: Oscar Salvador To: Andrew Morton Cc: David Hildenbrand , Michal Hocko , Vlastimil Babka , Muchun Song , Lorenzo Stoakes , linux-kernel@vger.kernel.org, linux-mm@kvack.org, Oscar Salvador Subject: [RFC PATCH 5/7] mm: Make /proc/pid/smaps use the new generic pagewalk API Date: Sun, 12 Apr 2026 19:42:42 +0200 Message-ID: <20260412174244.133715-6-osalvador@suse.de> X-Mailer: git-send-email 2.49.0 In-Reply-To: <20260412174244.133715-1-osalvador@suse.de> References: <20260412174244.133715-1-osalvador@suse.de> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Rspamd-Action: no action X-Rspamd-Server: rspamd2.dmz-prg2.suse.org X-Spamd-Result: default: False [-3.01 / 50.00]; BAYES_HAM(-3.00)[100.00%]; MID_CONTAINS_FROM(1.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_MISSING_CHARSET(0.50)[]; R_DKIM_ALLOW(-0.20)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; MX_GOOD(-0.01)[]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.de:dkim,suse.de:mid,suse.de:email]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; MIME_TRACE(0.00)[0:+]; TO_MATCH_ENVRCPT_ALL(0.00)[]; FUZZY_RATELIMITED(0.00)[rspamd.com]; RCPT_COUNT_SEVEN(0.00)[9]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; FROM_EQ_ENVFROM(0.00)[]; RCVD_COUNT_TWO(0.00)[2]; RCVD_TLS_ALL(0.00)[]; R_RATELIMIT(0.00)[to_ip_from(RLnwjaipjwg7testigjitbjsak)]; DKIM_TRACE(0.00)[suse.de:+] X-Rspamd-Queue-Id: D65036A851 X-Spam-Flag: NO X-Spam-Score: -3.01 X-Spam-Level: Content-Type: text/plain; charset="utf-8" Have 
/proc/pid/smaps make use of the new generic API, and remove the code which was using the old one. Signed-off-by: Oscar Salvador --- fs/proc/task_mmu.c | 309 ++++++++++++--------------------------------- 1 file changed, 84 insertions(+), 225 deletions(-) diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index e091931d7ca1..afbcdb11ad80 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -915,7 +915,7 @@ static void smaps_page_accumulate(struct mem_size_stats= *mss, =20 static void smaps_account(struct mem_size_stats *mss, struct page *page, bool compound, bool young, bool dirty, bool locked, - bool present) + bool present, int ssize) { struct folio *folio =3D page_folio(page); int i, nr =3D compound ? compound_nr(page) : 1; @@ -923,6 +923,11 @@ static void smaps_account(struct mem_size_stats *mss, = struct page *page, bool exclusive; int mapcount; =20 + if (ssize) { + nr =3D ssize / PAGE_SIZE; + size =3D ssize; + } + /* * First accumulate quantities that depend only on |size| and the type * of the compound page. 
@@ -988,150 +993,6 @@ static void smaps_account(struct mem_size_stats *mss,= struct page *page, } } =20 -#ifdef CONFIG_SHMEM -static int smaps_pte_hole(unsigned long addr, unsigned long end, - __always_unused int depth, struct mm_walk *walk) -{ - struct mem_size_stats *mss =3D walk->private; - struct vm_area_struct *vma =3D walk->vma; - - mss->swap +=3D shmem_partial_swap_usage(walk->vma->vm_file->f_mapping, - linear_page_index(vma, addr), - linear_page_index(vma, end)); - - return 0; -} -#else -#define smaps_pte_hole NULL -#endif /* CONFIG_SHMEM */ - -static void smaps_pte_hole_lookup(unsigned long addr, struct mm_walk *walk) -{ -#ifdef CONFIG_SHMEM - if (walk->ops->pte_hole) { - /* depth is not used */ - smaps_pte_hole(addr, addr + PAGE_SIZE, 0, walk); - } -#endif -} - -static void smaps_pte_entry(pte_t *pte, unsigned long addr, - struct mm_walk *walk) -{ - struct mem_size_stats *mss =3D walk->private; - struct vm_area_struct *vma =3D walk->vma; - bool locked =3D !!(vma->vm_flags & VM_LOCKED); - struct page *page =3D NULL; - bool present =3D false, young =3D false, dirty =3D false; - pte_t ptent =3D ptep_get(pte); - - if (pte_present(ptent)) { - page =3D vm_normal_page(vma, addr, ptent); - young =3D pte_young(ptent); - dirty =3D pte_dirty(ptent); - present =3D true; - } else if (pte_none(ptent)) { - smaps_pte_hole_lookup(addr, walk); - } else { - const softleaf_t entry =3D softleaf_from_pte(ptent); - - if (softleaf_is_swap(entry)) { - int mapcount; - - mss->swap +=3D PAGE_SIZE; - mapcount =3D swp_swapcount(entry); - if (mapcount >=3D 2) { - u64 pss_delta =3D (u64)PAGE_SIZE << PSS_SHIFT; - - do_div(pss_delta, mapcount); - mss->swap_pss +=3D pss_delta; - } else { - mss->swap_pss +=3D (u64)PAGE_SIZE << PSS_SHIFT; - } - } else if (softleaf_has_pfn(entry)) { - if (softleaf_is_device_private(entry)) - present =3D true; - page =3D softleaf_to_page(entry); - } - } - - if (!page) - return; - - smaps_account(mss, page, false, young, dirty, locked, present); -} - -#ifdef 
CONFIG_TRANSPARENT_HUGEPAGE -static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr, - struct mm_walk *walk) -{ - struct mem_size_stats *mss =3D walk->private; - struct vm_area_struct *vma =3D walk->vma; - bool locked =3D !!(vma->vm_flags & VM_LOCKED); - struct page *page =3D NULL; - bool present =3D false; - struct folio *folio; - - if (pmd_none(*pmd)) - return; - if (pmd_present(*pmd)) { - page =3D vm_normal_page_pmd(vma, addr, *pmd); - present =3D true; - } else if (unlikely(thp_migration_supported())) { - const softleaf_t entry =3D softleaf_from_pmd(*pmd); - - if (softleaf_has_pfn(entry)) - page =3D softleaf_to_page(entry); - } - if (IS_ERR_OR_NULL(page)) - return; - folio =3D page_folio(page); - if (folio_test_anon(folio)) - mss->anonymous_thp +=3D HPAGE_PMD_SIZE; - else if (folio_test_swapbacked(folio)) - mss->shmem_thp +=3D HPAGE_PMD_SIZE; - else if (folio_is_zone_device(folio)) - /* pass */; - else - mss->file_thp +=3D HPAGE_PMD_SIZE; - - smaps_account(mss, page, true, pmd_young(*pmd), pmd_dirty(*pmd), - locked, present); -} -#else -static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr, - struct mm_walk *walk) -{ -} -#endif - -static int smaps_pte_range(pmd_t *pmd, unsigned long addr, unsigned long e= nd, - struct mm_walk *walk) -{ - struct vm_area_struct *vma =3D walk->vma; - pte_t *pte; - spinlock_t *ptl; - - ptl =3D pmd_trans_huge_lock(pmd, vma); - if (ptl) { - smaps_pmd_entry(pmd, addr, walk); - spin_unlock(ptl); - goto out; - } - - pte =3D pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl); - if (!pte) { - walk->action =3D ACTION_AGAIN; - return 0; - } - for (; addr !=3D end; pte++, addr +=3D PAGE_SIZE) - smaps_pte_entry(pte, addr, walk); - pte_unmap_unlock(pte - 1, ptl); -out: - cond_resched(); - return 0; -} - static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct = *vma) { /* @@ -1228,58 +1089,6 @@ static void show_smap_vma_flags(struct seq_file *m, = struct vm_area_struct *vma) seq_putc(m, '\n'); } =20 -#ifdef 
CONFIG_HUGETLB_PAGE -static int smaps_hugetlb_range(pte_t *pte, unsigned long hmask, - unsigned long addr, unsigned long end, - struct mm_walk *walk) -{ - struct mem_size_stats *mss =3D walk->private; - struct vm_area_struct *vma =3D walk->vma; - struct folio *folio =3D NULL; - bool present =3D false; - spinlock_t *ptl; - pte_t ptent; - - ptl =3D huge_pte_lock(hstate_vma(vma), walk->mm, pte); - ptent =3D huge_ptep_get(walk->mm, addr, pte); - if (pte_present(ptent)) { - folio =3D page_folio(pte_page(ptent)); - present =3D true; - } else { - const softleaf_t entry =3D softleaf_from_pte(ptent); - - if (softleaf_has_pfn(entry)) - folio =3D softleaf_to_folio(entry); - } - - if (folio) { - /* We treat non-present entries as "maybe shared". */ - if (!present || folio_maybe_mapped_shared(folio) || - hugetlb_pmd_shared(pte)) - mss->shared_hugetlb +=3D huge_page_size(hstate_vma(vma)); - else - mss->private_hugetlb +=3D huge_page_size(hstate_vma(vma)); - } - spin_unlock(ptl); - return 0; -} -#else -#define smaps_hugetlb_range NULL -#endif /* HUGETLB_PAGE */ - -static const struct mm_walk_ops smaps_walk_ops =3D { - .pmd_entry =3D smaps_pte_range, - .hugetlb_entry =3D smaps_hugetlb_range, - .walk_lock =3D PGWALK_RDLOCK, -}; - -static const struct mm_walk_ops smaps_shmem_walk_ops =3D { - .pmd_entry =3D smaps_pte_range, - .hugetlb_entry =3D smaps_hugetlb_range, - .pte_hole =3D smaps_pte_hole, - .walk_lock =3D PGWALK_RDLOCK, -}; - /* * Gather mem stats from @vma with the indicated beginning * address @start, and keep them in @mss. @@ -1287,40 +1096,90 @@ static const struct mm_walk_ops smaps_shmem_walk_op= s =3D { * Use vm_start of @vma as the beginning address if @start is 0. 
*/ static void smap_gather_stats(struct vm_area_struct *vma, - struct mem_size_stats *mss, unsigned long start) + struct mem_size_stats *mss, + unsigned long start) { - const struct mm_walk_ops *ops =3D &smaps_walk_ops; - - /* Invalid start */ - if (start >=3D vma->vm_end) - return; + struct pt_range_walk ptw =3D { + .mm =3D vma->vm_mm + }; + enum pt_range_walk_type type; + pt_type_flags_t flags =3D PT_TYPE_ALL; =20 - if (vma->vm_file && shmem_mapping(vma->vm_file->f_mapping)) { - /* - * For shared or readonly shmem mappings we know that all - * swapped out pages belong to the shmem object, and we can - * obtain the swap value much more efficiently. For private - * writable mappings, we might have COW pages that are - * not affected by the parent swapped out pages of the shmem - * object, so we have to distinguish them during the page walk. - * Unless we know that the shmem object (or the part mapped by - * our VMA) has no swapped out pages at all. - */ - unsigned long shmem_swapped =3D shmem_swap_usage(vma); + if (!start) + start =3D vma->vm_start; + + flags &=3D ~(PT_TYPE_NONE|PT_TYPE_PFN); + + type =3D pt_range_walk_start(&ptw, vma, start, vma->vm_end, flags); + while (type !=3D PTW_DONE) { + bool locked =3D !!(vma->vm_flags & VM_LOCKED); + bool compound =3D false, account =3D false; + unsigned long swap_size; + int mapcount; + + switch (type) { + case PTW_FOLIO: + case PTW_MIGRATION: + case PTW_HWPOISON: + case PTW_DEVICE: + /* + * We either have a folio because vm_normal_folio was + * successful, or because we had a special swap entry + * and could retrieve it with softleaf_to_page. 
+ */ + if (is_vm_hugetlb_page(vma)) { + /* HugeTLB */ + unsigned long size =3D huge_page_size(hstate_vma(ptw.vma)); + + if (!ptw.present || folio_maybe_mapped_shared(ptw.folio) || + ptw.pmd_shared) + mss->shared_hugetlb +=3D size; + else + mss->private_hugetlb +=3D size; + } else { + account =3D true; + if (ptw.level =3D=3D PTW_PMD_LEVEL) { + /* THP */ + compound =3D true; + if (folio_test_anon(ptw.folio)) + mss->anonymous_thp +=3D ptw.size; + else if (folio_test_swapbacked(ptw.folio)) + mss->shmem_thp +=3D ptw.size; + else if (folio_is_zone_device(ptw.folio)) + /* pass */; + else + mss->file_thp +=3D ptw.size; + } else if (ptw.level =3D=3D PTW_PTE_LEVEL && ptw.nr_entries > 1) { + compound =3D true; + } + } + break; + case PTW_SWAP: + account =3D true; + swap_size =3D PAGE_SIZE * ptw.nr_entries; + mss->swap +=3D swap_size; + mapcount =3D swp_swapcount(ptw.softleaf_entry); + if (mapcount >=3D 2) { + u64 pss_delta =3D (u64)swap_size << PSS_SHIFT; =20 - if (!start && (!shmem_swapped || (vma->vm_flags & VM_SHARED) || - !(vma->vm_flags & VM_WRITE))) { - mss->swap +=3D shmem_swapped; - } else { - ops =3D &smaps_shmem_walk_ops; + do_div(pss_delta, mapcount); + mss->swap_pss +=3D pss_delta; + } else { + mss->swap_pss +=3D (u64)swap_size << PSS_SHIFT; + } + break; + default: + /* Ooops */ + break; } + + if (account && ptw.folio) + smaps_account(mss, ptw.page, compound, ptw.young, + ptw.dirty, locked, ptw.present, ptw.size); + type =3D pt_range_walk_next(&ptw, vma, start, vma->vm_end, flags); } =20 - /* mmap_lock is held in m_start */ - if (!start) - walk_page_vma(vma, ops, mss); - else - walk_page_range(vma->vm_mm, start, vma->vm_end, ops, mss); + pt_range_walk_done(&ptw); } =20 #define SEQ_PUT_DEC(str, val) \ --=20 2.35.3 From nobody Sat Jun 20 17:33:36 2026 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.223.130]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with 
ESMTPS id 51BEF36308A for ; Sun, 12 Apr 2026 17:43:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.130 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776015793; cv=none; b=V9ZiTLypxyJ3W5QgtEX3kjl2jXSxfSJv70TAnCdQUBjDAB1WXN0esvYc+RTvSIy4UoKKVeUNLzqzUZ3c8XYUono/MQNPhUObzOfuJ/RtsS5zkIwUhR9KuQYUp6+CGH2rbn+yzTVJjIkaalYEoX1a1bHEC2rKxGra6mYqnEvF1J8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776015793; c=relaxed/simple; bh=2Vlcas1wNcl6ytpRNwjp9UiKZiso+oUMTl+9F88P3Ak=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=e1uZ6/1Y0vIP+8lcMf1WlqrmpcjIQkE550ccnpLIKVwTdJFPGQ8VpHxoUE2KPjaANizjz8+a9xRCjTaao00F8A8ma+6hXoyf6GFIrV+G27wTPd0+eNiYQkpWhRe/KJruIDu+OuFIkuqFzJjaTsATLtHIFeLMdGxfXORCq+i502g= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de; spf=pass smtp.mailfrom=suse.de; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=dgU1NRMv; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=6hljExyn; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=dgU1NRMv; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=6hljExyn; arc=none smtp.client-ip=195.135.223.130 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="dgU1NRMv"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="6hljExyn"; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="dgU1NRMv"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="6hljExyn" Received: from imap1.dmz-prg2.suse.org (imap1.dmz-prg2.suse.org [IPv6:2a07:de40:b281:104:10:150:64:97]) 
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 689BC6A86F; Sun, 12 Apr 2026 17:43:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1776015784; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0M6XziQ2IXEafjsL1pugEtyiZe4RP/OzEyf3hhTlfvw=; b=dgU1NRMvEc63f/Bc89kah5Wzjf0TVlp4Orq4tR2lDDIA5j6UKBuH2WJeu73Nx4WSwGq8sy QIRAiKDBuqn/+OH54GaIYH+YDbcvgzbaf9w9TBWeIURhiDgm4oKiFVfVRLkPAjaEFzsACu 8vVEYUFHLdWE9Jp2SUCoAmogNwIUcxM= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1776015784; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0M6XziQ2IXEafjsL1pugEtyiZe4RP/OzEyf3hhTlfvw=; b=6hljExynyYqJN6CMan72X4pFgo78XwzTtblhO8UTu2wLkwG/Y0xvlv/CjZ4HRK1SqJ6qiD 3e+rCuO8kDVQ0pAA== Authentication-Results: smtp-out1.suse.de; dkim=pass header.d=suse.de header.s=susede2_rsa header.b=dgU1NRMv; dkim=pass header.d=suse.de header.s=susede2_ed25519 header.b=6hljExyn DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1776015784; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0M6XziQ2IXEafjsL1pugEtyiZe4RP/OzEyf3hhTlfvw=; b=dgU1NRMvEc63f/Bc89kah5Wzjf0TVlp4Orq4tR2lDDIA5j6UKBuH2WJeu73Nx4WSwGq8sy QIRAiKDBuqn/+OH54GaIYH+YDbcvgzbaf9w9TBWeIURhiDgm4oKiFVfVRLkPAjaEFzsACu 8vVEYUFHLdWE9Jp2SUCoAmogNwIUcxM= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; 
s=susede2_ed25519; t=1776015784; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0M6XziQ2IXEafjsL1pugEtyiZe4RP/OzEyf3hhTlfvw=; b=6hljExynyYqJN6CMan72X4pFgo78XwzTtblhO8UTu2wLkwG/Y0xvlv/CjZ4HRK1SqJ6qiD 3e+rCuO8kDVQ0pAA== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id E66624AA48; Sun, 12 Apr 2026 17:43:03 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id 0PW4NafZ22miRQAAD6G6ig (envelope-from ); Sun, 12 Apr 2026 17:43:03 +0000 From: Oscar Salvador To: Andrew Morton Cc: David Hildenbrand , Michal Hocko , Vlastimil Babka , Muchun Song , Lorenzo Stoakes , linux-kernel@vger.kernel.org, linux-mm@kvack.org, Oscar Salvador Subject: [RFC PATCH 6/7] mm: Make /proc/pid/numa_maps use the new generic pagewalk API Date: Sun, 12 Apr 2026 19:42:43 +0200 Message-ID: <20260412174244.133715-7-osalvador@suse.de> X-Mailer: git-send-email 2.49.0 In-Reply-To: <20260412174244.133715-1-osalvador@suse.de> References: <20260412174244.133715-1-osalvador@suse.de> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Rspamd-Action: no action X-Rspamd-Server: rspamd2.dmz-prg2.suse.org X-Spamd-Result: default: False [-3.01 / 50.00]; BAYES_HAM(-3.00)[100.00%]; MID_CONTAINS_FROM(1.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_MISSING_CHARSET(0.50)[]; R_DKIM_ALLOW(-0.20)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; MX_GOOD(-0.01)[]; 
DBL_BLOCKED_OPENRESOLVER(0.00)[suse.de:dkim,suse.de:mid,suse.de:email]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; MIME_TRACE(0.00)[0:+]; TO_MATCH_ENVRCPT_ALL(0.00)[]; FUZZY_RATELIMITED(0.00)[rspamd.com]; RCPT_COUNT_SEVEN(0.00)[9]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; FROM_EQ_ENVFROM(0.00)[]; RCVD_COUNT_TWO(0.00)[2]; RCVD_TLS_ALL(0.00)[]; R_RATELIMIT(0.00)[to_ip_from(RLnwjaipjwg7testigjitbjsak)]; DKIM_TRACE(0.00)[suse.de:+] X-Rspamd-Queue-Id: 689BC6A86F X-Spam-Flag: NO X-Spam-Score: -3.01 X-Spam-Level: Content-Type: text/plain; charset="utf-8" Have /proc/pid/numa_maps make use of the new generic API, and remove the code which was using the old one. Signed-off-by: Oscar Salvador --- fs/proc/task_mmu.c | 136 +++++++++++---------------------------------- 1 file changed, 32 insertions(+), 104 deletions(-) diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index afbcdb11ad80..776e7a6baf00 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -3048,108 +3048,6 @@ static struct page *can_gather_numa_stats(pte_t pte= , struct vm_area_struct *vma, return page; } =20 -#ifdef CONFIG_TRANSPARENT_HUGEPAGE -static struct page *can_gather_numa_stats_pmd(pmd_t pmd, - struct vm_area_struct *vma, - unsigned long addr) -{ - struct page *page; - int nid; - - if (!pmd_present(pmd)) - return NULL; - - page =3D vm_normal_page_pmd(vma, addr, pmd); - if (!page) - return NULL; - - if (PageReserved(page)) - return NULL; - - nid =3D page_to_nid(page); - if (!node_isset(nid, node_states[N_MEMORY])) - return NULL; - - return page; -} -#endif - -static int gather_pte_stats(pmd_t *pmd, unsigned long addr, - unsigned long end, struct mm_walk *walk) -{ - struct numa_maps *md =3D walk->private; - struct vm_area_struct *vma =3D walk->vma; - spinlock_t *ptl; - pte_t *orig_pte; - pte_t *pte; - -#ifdef CONFIG_TRANSPARENT_HUGEPAGE - ptl =3D pmd_trans_huge_lock(pmd, vma); - if (ptl) { - struct page *page; - - page =3D
can_gather_numa_stats_pmd(*pmd, vma, addr); - if (page) - gather_stats(page, md, pmd_dirty(*pmd), - HPAGE_PMD_SIZE/PAGE_SIZE); - spin_unlock(ptl); - return 0; - } -#endif - orig_pte =3D pte =3D pte_offset_map_lock(walk->mm, pmd, addr, &ptl); - if (!pte) { - walk->action =3D ACTION_AGAIN; - return 0; - } - do { - pte_t ptent =3D ptep_get(pte); - struct page *page =3D can_gather_numa_stats(ptent, vma, addr); - if (!page) - continue; - gather_stats(page, md, pte_dirty(ptent), 1); - - } while (pte++, addr +=3D PAGE_SIZE, addr !=3D end); - pte_unmap_unlock(orig_pte, ptl); - cond_resched(); - return 0; -} -#ifdef CONFIG_HUGETLB_PAGE -static int gather_hugetlb_stats(pte_t *pte, unsigned long hmask, - unsigned long addr, unsigned long end, struct mm_walk *walk) -{ - pte_t huge_pte; - struct numa_maps *md; - struct page *page; - spinlock_t *ptl; - - ptl =3D huge_pte_lock(hstate_vma(walk->vma), walk->mm, pte); - huge_pte =3D huge_ptep_get(walk->mm, addr, pte); - if (!pte_present(huge_pte)) - goto out; - - page =3D pte_page(huge_pte); - - md =3D walk->private; - gather_stats(page, md, pte_dirty(huge_pte), 1); -out: - spin_unlock(ptl); - return 0; -} - -#else -static int gather_hugetlb_stats(pte_t *pte, unsigned long hmask, - unsigned long addr, unsigned long end, struct mm_walk *walk) -{ - return 0; -} -#endif - -static const struct mm_walk_ops show_numa_ops =3D { - .hugetlb_entry =3D gather_hugetlb_stats, - .pmd_entry =3D gather_pte_stats, - .walk_lock =3D PGWALK_RDLOCK, -}; - /* * Display pages allocated per node and memory policy via /proc. 
*/ @@ -3161,9 +3059,15 @@ static int show_numa_map(struct seq_file *m, void *v) struct numa_maps *md =3D &numa_priv->md; struct file *file =3D vma->vm_file; struct mm_struct *mm =3D vma->vm_mm; + struct pt_range_walk ptw =3D { + .mm =3D mm + }; + enum pt_range_walk_type type; + pt_type_flags_t flags; char buffer[64]; struct mempolicy *pol; pgoff_t ilx; + int nr_pages; int nid; =20 if (!mm) @@ -3194,8 +3098,32 @@ static int show_numa_map(struct seq_file *m, void *v) if (is_vm_hugetlb_page(vma)) seq_puts(m, " huge"); =20 - /* mmap_lock is held by m_start */ - walk_page_vma(vma, &show_numa_ops, md); + flags =3D PT_TYPE_FOLIO; + type =3D pt_range_walk_start(&ptw, vma, vma->vm_start, vma->vm_end, flags= ); + while (type !=3D PTW_DONE) { + + if (!ptw.folio || !ptw.page || PageReserved(ptw.page)) + goto not_found; + + nid =3D page_to_nid(ptw.page); + if (!node_isset(nid, node_states[N_MEMORY])) + goto not_found; + + if (is_vm_hugetlb_page(vma)) + /* + * As opposed to THP, HugeTLB counts the entire huge + * page as one unit size. 
+ */ + nr_pages =3D ptw.nr_entries; + else + nr_pages =3D ptw.size / PAGE_SIZE; + + gather_stats(ptw.page, md, ptw.dirty, nr_pages); +not_found: + type =3D pt_range_walk_next(&ptw, vma, vma->vm_start, vma->vm_end, flags= ); + + } + pt_range_walk_done(&ptw); =20 if (!md->pages) goto out; --=20 2.35.3 From nobody Sat Jun 20 17:33:36 2026 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.223.130]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5D0DB375F7C for ; Sun, 12 Apr 2026 17:43:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=195.135.223.130 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776015801; cv=none; b=YAw41D0THj08/ReSPmbmMHVPqnD8jHSD8vY7fRnvLKwU9iedhYYdQs0DsotosBY8rkeoh5W5mNWoGSx1KsBnmoL83LSI1hoAHuythT6LrwRu98pbLe7FylZNRrmY3BXVdp+KFRQdu18T4IJNQrk8FTjb8nuoECJtcRMerW6tQq0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1776015801; c=relaxed/simple; bh=9Wk6RrkhKPwj/7G3muIfnLZox/riGdW4QFEbcD/XhZU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=oyjoFOesi8upjbAK+WStdksrZsTtLuD3timyCNMj4b7Q2uNbU+jL7OCYj227M2bTVzsajVYWyDZtYeVd5wP6XgJ/LlKfOwDGMEqNQlA6v9WNTsu08ODY8wZG084gHzyzfAZOmkaVD7Ylgxb5+785DSXVnPfA9YWW8M0IbaPNlPE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de; spf=pass smtp.mailfrom=suse.de; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=KTOS5NuL; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=zNAQHat7; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b=o3fYM61i; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b=MYt7rd32; arc=none smtp.client-ip=195.135.223.130 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=suse.de 
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=suse.de Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="KTOS5NuL"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="zNAQHat7"; dkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de header.b="o3fYM61i"; dkim=permerror (0-bit key) header.d=suse.de header.i=@suse.de header.b="MYt7rd32" Received: from imap1.dmz-prg2.suse.org (unknown [10.150.64.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id EF8C46A871; Sun, 12 Apr 2026 17:43:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1776015785; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=sFdvXCfTfqe+GIJUnVe1dTR0RpJ/i5W7x3fsTp932Uw=; b=KTOS5NuL4ag9rZ0dUn49H1eSqO5c58rSdgTQNerEOUw/e8dnHHAsf7ozUezxeed9T5Pvpf 4YD57Pyowi/WBbxT/3wHKATN3RTagl8ga7w1u0PiiuZQzQg3QEsj/EEM1LQzz+p5djde5j TjpvX48LgrJxXLN4pmwobX4cXGp4m3M= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1776015785; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=sFdvXCfTfqe+GIJUnVe1dTR0RpJ/i5W7x3fsTp932Uw=; b=zNAQHat7URyXcf54NYyEmUJWII2OWBWKRsurpazDSWJYr8Jc/VkEktJiNBoHQrEvIpkZIb pHZX0FZfMGJCcACA== Authentication-Results: smtp-out1.suse.de; none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_rsa; t=1776015784; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: 
content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=sFdvXCfTfqe+GIJUnVe1dTR0RpJ/i5W7x3fsTp932Uw=; b=o3fYM61iqk6X2r2qWto3FR5Soks6aVxNovdYrARmrr1XrOUq4DikSt7jTeSGGlhrIpN6gM En8s6zYAhIsQr/n2/FKHjSnjQV2Ahouvn72+eTsAob8vIiwwiWiRYduzlhs/dJ7rigdS7R DFQ/rHXfxeu1JeDjgkYJbERoGskwr6g= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de; s=susede2_ed25519; t=1776015784; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=sFdvXCfTfqe+GIJUnVe1dTR0RpJ/i5W7x3fsTp932Uw=; b=MYt7rd32gxaTTKESI+hVkX3kSC5o9khTJMpYQOb2hJGXXm+zXnd0dwBa7irWyykbKaxnRL z6r/Wvf2M1HpaxBw== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 7908C4AA4A; Sun, 12 Apr 2026 17:43:04 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id OHnoGqjZ22miRQAAD6G6ig (envelope-from ); Sun, 12 Apr 2026 17:43:04 +0000 From: Oscar Salvador To: Andrew Morton Cc: David Hildenbrand , Michal Hocko , Vlastimil Babka , Muchun Song , Lorenzo Stoakes , linux-kernel@vger.kernel.org, linux-mm@kvack.org, Oscar Salvador Subject: [RFC PATCH 7/7] mm: Make /proc/pid/pagemap use the new generic pagewalk API Date: Sun, 12 Apr 2026 19:42:44 +0200 Message-ID: <20260412174244.133715-8-osalvador@suse.de> X-Mailer: git-send-email 2.49.0 In-Reply-To: <20260412174244.133715-1-osalvador@suse.de> References: <20260412174244.133715-1-osalvador@suse.de> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Spamd-Result: 
default: False [-2.80 / 50.00]; BAYES_HAM(-3.00)[100.00%]; NEURAL_HAM_LONG(-1.00)[-1.000]; MID_CONTAINS_FROM(1.00)[]; R_MISSING_CHARSET(0.50)[]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; TO_MATCH_ENVRCPT_ALL(0.00)[]; ARC_NA(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; FROM_HAS_DN(0.00)[]; TO_DN_SOME(0.00)[]; MIME_TRACE(0.00)[0:+]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.de:mid,suse.de:email]; FUZZY_RATELIMITED(0.00)[rspamd.com]; RCPT_COUNT_SEVEN(0.00)[9]; RCVD_COUNT_TWO(0.00)[2]; FROM_EQ_ENVFROM(0.00)[]; DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519]; R_RATELIMIT(0.00)[to_ip_from(RLd9dsuofksntgrby8c3fm48h6)]; RCVD_TLS_ALL(0.00)[] X-Spam-Flag: NO X-Spam-Score: -2.80 X-Spam-Level: Content-Type: text/plain; charset="utf-8" Have /proc/pid/pagemap make use of the new generic API, and remove the code which was using the old one. Signed-off-by: Oscar Salvador --- arch/x86/include/asm/pgtable.h | 4 + arch/x86/mm/pgtable.c | 18 +- fs/proc/task_mmu.c | 906 +++++++++++++++------------------ include/linux/leafops.h | 13 + include/linux/pgtable.h | 30 ++ mm/pgtable-generic.c | 10 + 6 files changed, 481 insertions(+), 500 deletions(-) diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index a68ff339cd56..1d18f6177784 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -1400,6 +1400,10 @@ static inline pud_t pudp_establish(struct vm_area_st= ruct *vma, } #endif =20 +#define __HAVE_ARCH_PUDP_INVALIDATE_AD +extern pud_t pudp_invalidate_ad(struct vm_area_struct *vma, + unsigned long address, pud_t *pudp); + #define __HAVE_ARCH_PMDP_INVALIDATE_AD extern pmd_t pmdp_invalidate_ad(struct vm_area_struct *vma, unsigned long address, pmd_t *pmdp); diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c index 2e5ecfdce73c..828f5ca9195e 100644 --- a/arch/x86/mm/pgtable.c +++ b/arch/x86/mm/pgtable.c @@ -530,8 +530,22 @@ pmd_t pmdp_invalidate_ad(struct vm_area_struct *vma, u= nsigned long address, } 
 #endif
 
-#if defined(CONFIG_TRANSPARENT_HUGEPAGE) && \
-	defined(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD)
+#if (defined(CONFIG_TRANSPARENT_HUGEPAGE) && \
+	defined(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD)) || \
+	defined CONFIG_HUGETLB_PAGE
+
+pud_t pudp_invalidate_ad(struct vm_area_struct *vma, unsigned long address,
+			 pud_t *pudp)
+{
+	VM_WARN_ON_ONCE(!pud_present(*pudp));
+
+	/*
+	 * No flush is necessary. Once an invalid PUD is established, the PUD's
+	 * access and dirty bits cannot be updated.
+	 */
+	return pudp_establish(vma, address, pudp, pud_mkinvalid(*pudp));
+}
+
 pud_t pudp_invalidate(struct vm_area_struct *vma, unsigned long address,
 		      pud_t *pudp)
 {
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 776e7a6baf00..6b6d5a39cd5a 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1856,192 +1856,6 @@ static pagemap_entry_t pte_to_pagemap_entry(struct pagemapread *pm,
 	return make_pme(frame, flags);
 }
 
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
-static int pagemap_pmd_range_thp(pmd_t *pmdp, unsigned long addr,
-		unsigned long end, struct vm_area_struct *vma,
-		struct pagemapread *pm)
-{
-	unsigned int idx = (addr & ~PMD_MASK) >> PAGE_SHIFT;
-	u64 flags = 0, frame = 0;
-	pmd_t pmd = *pmdp;
-	struct page *page = NULL;
-	struct folio *folio = NULL;
-	int err = 0;
-
-	if (vma->vm_flags & VM_SOFTDIRTY)
-		flags |= PM_SOFT_DIRTY;
-
-	if (pmd_none(pmd))
-		goto populate_pagemap;
-
-	if (pmd_present(pmd)) {
-		page = pmd_page(pmd);
-
-		flags |= PM_PRESENT;
-		if (pmd_soft_dirty(pmd))
-			flags |= PM_SOFT_DIRTY;
-		if (pmd_uffd_wp(pmd))
-			flags |= PM_UFFD_WP;
-		if (pm->show_pfn)
-			frame = pmd_pfn(pmd) + idx;
-	} else if (thp_migration_supported()) {
-		const softleaf_t entry = softleaf_from_pmd(pmd);
-		unsigned long offset;
-
-		if (pm->show_pfn) {
-			if (softleaf_has_pfn(entry))
-				offset = softleaf_to_pfn(entry) + idx;
-			else
-				offset = swp_offset(entry) + idx;
-			frame = swp_type(entry) |
-				(offset << MAX_SWAPFILES_SHIFT);
- } - flags |=3D PM_SWAP; - if (pmd_swp_soft_dirty(pmd)) - flags |=3D PM_SOFT_DIRTY; - if (pmd_swp_uffd_wp(pmd)) - flags |=3D PM_UFFD_WP; - VM_WARN_ON_ONCE(!pmd_is_migration_entry(pmd)); - page =3D softleaf_to_page(entry); - } - - if (page) { - folio =3D page_folio(page); - if (!folio_test_anon(folio)) - flags |=3D PM_FILE; - } - -populate_pagemap: - for (; addr !=3D end; addr +=3D PAGE_SIZE, idx++) { - u64 cur_flags =3D flags; - pagemap_entry_t pme; - - if (folio && (flags & PM_PRESENT) && - __folio_page_mapped_exclusively(folio, page)) - cur_flags |=3D PM_MMAP_EXCLUSIVE; - - pme =3D make_pme(frame, cur_flags); - err =3D add_to_pagemap(&pme, pm); - if (err) - break; - if (pm->show_pfn) { - if (flags & PM_PRESENT) - frame++; - else if (flags & PM_SWAP) - frame +=3D (1 << MAX_SWAPFILES_SHIFT); - } - } - return err; -} -#endif /* CONFIG_TRANSPARENT_HUGEPAGE */ - -static int pagemap_pmd_range(pmd_t *pmdp, unsigned long addr, unsigned lon= g end, - struct mm_walk *walk) -{ - struct vm_area_struct *vma =3D walk->vma; - struct pagemapread *pm =3D walk->private; - spinlock_t *ptl; - pte_t *pte, *orig_pte; - int err =3D 0; - -#ifdef CONFIG_TRANSPARENT_HUGEPAGE - ptl =3D pmd_trans_huge_lock(pmdp, vma); - if (ptl) { - err =3D pagemap_pmd_range_thp(pmdp, addr, end, vma, pm); - spin_unlock(ptl); - return err; - } -#endif - - /* - * We can assume that @vma always points to a valid one and @end never - * goes beyond vma->vm_end. 
- */ - orig_pte =3D pte =3D pte_offset_map_lock(walk->mm, pmdp, addr, &ptl); - if (!pte) { - walk->action =3D ACTION_AGAIN; - return err; - } - for (; addr < end; pte++, addr +=3D PAGE_SIZE) { - pagemap_entry_t pme; - - pme =3D pte_to_pagemap_entry(pm, vma, addr, ptep_get(pte)); - err =3D add_to_pagemap(&pme, pm); - if (err) - break; - } - pte_unmap_unlock(orig_pte, ptl); - - cond_resched(); - - return err; -} - -#ifdef CONFIG_HUGETLB_PAGE -/* This function walks within one hugetlb entry in the single call */ -static int pagemap_hugetlb_range(pte_t *ptep, unsigned long hmask, - unsigned long addr, unsigned long end, - struct mm_walk *walk) -{ - struct pagemapread *pm =3D walk->private; - struct vm_area_struct *vma =3D walk->vma; - u64 flags =3D 0, frame =3D 0; - spinlock_t *ptl; - int err =3D 0; - pte_t pte; - - if (vma->vm_flags & VM_SOFTDIRTY) - flags |=3D PM_SOFT_DIRTY; - - ptl =3D huge_pte_lock(hstate_vma(vma), walk->mm, ptep); - pte =3D huge_ptep_get(walk->mm, addr, ptep); - if (pte_present(pte)) { - struct folio *folio =3D page_folio(pte_page(pte)); - - if (!folio_test_anon(folio)) - flags |=3D PM_FILE; - - if (!folio_maybe_mapped_shared(folio) && - !hugetlb_pmd_shared(ptep)) - flags |=3D PM_MMAP_EXCLUSIVE; - - if (huge_pte_uffd_wp(pte)) - flags |=3D PM_UFFD_WP; - - flags |=3D PM_PRESENT; - if (pm->show_pfn) - frame =3D pte_pfn(pte) + - ((addr & ~hmask) >> PAGE_SHIFT); - } else if (pte_swp_uffd_wp_any(pte)) { - flags |=3D PM_UFFD_WP; - } - - for (; addr !=3D end; addr +=3D PAGE_SIZE) { - pagemap_entry_t pme =3D make_pme(frame, flags); - - err =3D add_to_pagemap(&pme, pm); - if (err) - break; - if (pm->show_pfn && (flags & PM_PRESENT)) - frame++; - } - - spin_unlock(ptl); - cond_resched(); - - return err; -} -#else -#define pagemap_hugetlb_range NULL -#endif /* HUGETLB_PAGE */ - -static const struct mm_walk_ops pagemap_ops =3D { - .pmd_entry =3D pagemap_pmd_range, - .pte_hole =3D pagemap_pte_hole, - .hugetlb_entry =3D pagemap_hugetlb_range, - .walk_lock =3D 
PGWALK_RDLOCK, -}; - /* * /proc/pid/pagemap - an array mapping virtual pages to pfns * @@ -2070,99 +1884,6 @@ static const struct mm_walk_ops pagemap_ops =3D { * determine which areas of memory are actually mapped and llseek to * skip over unmapped regions. */ -static ssize_t pagemap_read(struct file *file, char __user *buf, - size_t count, loff_t *ppos) -{ - struct mm_struct *mm =3D file->private_data; - struct pagemapread pm; - unsigned long src; - unsigned long svpfn; - unsigned long start_vaddr; - unsigned long end_vaddr; - int ret =3D 0, copied =3D 0; - - if (!mm || !mmget_not_zero(mm)) - goto out; - - ret =3D -EINVAL; - /* file position must be aligned */ - if ((*ppos % PM_ENTRY_BYTES) || (count % PM_ENTRY_BYTES)) - goto out_mm; - - ret =3D 0; - if (!count) - goto out_mm; - - /* do not disclose physical addresses: attack vector */ - pm.show_pfn =3D file_ns_capable(file, &init_user_ns, CAP_SYS_ADMIN); - - pm.len =3D (PAGEMAP_WALK_SIZE >> PAGE_SHIFT); - pm.buffer =3D kmalloc_array(pm.len, PM_ENTRY_BYTES, GFP_KERNEL); - ret =3D -ENOMEM; - if (!pm.buffer) - goto out_mm; - - src =3D *ppos; - svpfn =3D src / PM_ENTRY_BYTES; - end_vaddr =3D mm->task_size; - - /* watch out for wraparound */ - start_vaddr =3D end_vaddr; - if (svpfn <=3D (ULONG_MAX >> PAGE_SHIFT)) { - unsigned long end; - - ret =3D mmap_read_lock_killable(mm); - if (ret) - goto out_free; - start_vaddr =3D untagged_addr_remote(mm, svpfn << PAGE_SHIFT); - mmap_read_unlock(mm); - - end =3D start_vaddr + ((count / PM_ENTRY_BYTES) << PAGE_SHIFT); - if (end >=3D start_vaddr && end < mm->task_size) - end_vaddr =3D end; - } - - /* Ensure the address is inside the task */ - if (start_vaddr > mm->task_size) - start_vaddr =3D end_vaddr; - - ret =3D 0; - while (count && (start_vaddr < end_vaddr)) { - int len; - unsigned long end; - - pm.pos =3D 0; - end =3D (start_vaddr + PAGEMAP_WALK_SIZE) & PAGEMAP_WALK_MASK; - /* overflow ? 
*/ - if (end < start_vaddr || end > end_vaddr) - end =3D end_vaddr; - ret =3D mmap_read_lock_killable(mm); - if (ret) - goto out_free; - ret =3D walk_page_range(mm, start_vaddr, end, &pagemap_ops, &pm); - mmap_read_unlock(mm); - start_vaddr =3D end; - - len =3D min(count, PM_ENTRY_BYTES * pm.pos); - if (copy_to_user(buf, pm.buffer, len)) { - ret =3D -EFAULT; - goto out_free; - } - copied +=3D len; - buf +=3D len; - count -=3D len; - } - *ppos +=3D copied; - if (!ret || ret =3D=3D PM_END_OF_BUFFER) - ret =3D copied; - -out_free: - kfree(pm.buffer); -out_mm: - mmput(mm); -out: - return ret; -} =20 static int pagemap_open(struct inode *inode, struct file *file) { @@ -2267,6 +1988,23 @@ static void make_uffd_wp_pte(struct vm_area_struct *= vma, } } =20 +#ifdef CONFIG_HUGETLB_PAGE +static void make_uffd_wp_pud(struct vm_area_struct *vma, + unsigned long addr, pud_t *pudp) +{ + pud_t old, pud =3D *pudp; + + if (pud_present(pud)) { + old =3D pudp_invalidate_ad(vma, addr, pudp); + pud =3D pud_mkuffd_wp(old); + set_pud_at(vma->vm_mm, addr, pudp, pud); + } else if (pud_is_migration_entry(pud)) { + pud =3D pud_swp_mkuffd_wp(pud); + set_pud_at(vma->vm_mm, addr, pudp, pud); + } +} +#endif + #ifdef CONFIG_TRANSPARENT_HUGEPAGE static unsigned long pagemap_thp_category(struct pagemap_scan_private *p, struct vm_area_struct *vma, @@ -2539,216 +2277,6 @@ static int pagemap_scan_output(unsigned long catego= ries, return ret; } =20 -static int pagemap_scan_thp_entry(pmd_t *pmd, unsigned long start, - unsigned long end, struct mm_walk *walk) -{ -#ifdef CONFIG_TRANSPARENT_HUGEPAGE - struct pagemap_scan_private *p =3D walk->private; - struct vm_area_struct *vma =3D walk->vma; - unsigned long categories; - spinlock_t *ptl; - int ret =3D 0; - - ptl =3D pmd_trans_huge_lock(pmd, vma); - if (!ptl) - return -ENOENT; - - categories =3D p->cur_vma_category | - pagemap_thp_category(p, vma, start, *pmd); - - if (!pagemap_scan_is_interesting_page(categories, p)) - goto out_unlock; - - ret =3D 
pagemap_scan_output(categories, p, start, &end); - if (start =3D=3D end) - goto out_unlock; - - if (~p->arg.flags & PM_SCAN_WP_MATCHING) - goto out_unlock; - if (~categories & PAGE_IS_WRITTEN) - goto out_unlock; - - /* - * Break huge page into small pages if the WP operation - * needs to be performed on a portion of the huge page. - */ - if (end !=3D start + HPAGE_SIZE) { - spin_unlock(ptl); - split_huge_pmd(vma, pmd, start); - pagemap_scan_backout_range(p, start, end); - /* Report as if there was no THP */ - return -ENOENT; - } - - make_uffd_wp_pmd(vma, start, pmd); - flush_tlb_range(vma, start, end); -out_unlock: - spin_unlock(ptl); - return ret; -#else /* !CONFIG_TRANSPARENT_HUGEPAGE */ - return -ENOENT; -#endif -} - -static int pagemap_scan_pmd_entry(pmd_t *pmd, unsigned long start, - unsigned long end, struct mm_walk *walk) -{ - struct pagemap_scan_private *p =3D walk->private; - struct vm_area_struct *vma =3D walk->vma; - unsigned long addr, flush_end =3D 0; - pte_t *pte, *start_pte; - spinlock_t *ptl; - int ret; - - ret =3D pagemap_scan_thp_entry(pmd, start, end, walk); - if (ret !=3D -ENOENT) - return ret; - - ret =3D 0; - start_pte =3D pte =3D pte_offset_map_lock(vma->vm_mm, pmd, start, &ptl); - if (!pte) { - walk->action =3D ACTION_AGAIN; - return 0; - } - - lazy_mmu_mode_enable(); - - if ((p->arg.flags & PM_SCAN_WP_MATCHING) && !p->vec_out) { - /* Fast path for performing exclusive WP */ - for (addr =3D start; addr !=3D end; pte++, addr +=3D PAGE_SIZE) { - pte_t ptent =3D ptep_get(pte); - - if ((pte_present(ptent) && pte_uffd_wp(ptent)) || - pte_swp_uffd_wp_any(ptent)) - continue; - make_uffd_wp_pte(vma, addr, pte, ptent); - if (!flush_end) - start =3D addr; - flush_end =3D addr + PAGE_SIZE; - } - goto flush_and_return; - } - - if (!p->arg.category_anyof_mask && !p->arg.category_inverted && - p->arg.category_mask =3D=3D PAGE_IS_WRITTEN && - p->arg.return_mask =3D=3D PAGE_IS_WRITTEN) { - for (addr =3D start; addr < end; pte++, addr +=3D PAGE_SIZE) { - 
unsigned long next =3D addr + PAGE_SIZE; - pte_t ptent =3D ptep_get(pte); - - if ((pte_present(ptent) && pte_uffd_wp(ptent)) || - pte_swp_uffd_wp_any(ptent)) - continue; - ret =3D pagemap_scan_output(p->cur_vma_category | PAGE_IS_WRITTEN, - p, addr, &next); - if (next =3D=3D addr) - break; - if (~p->arg.flags & PM_SCAN_WP_MATCHING) - continue; - make_uffd_wp_pte(vma, addr, pte, ptent); - if (!flush_end) - start =3D addr; - flush_end =3D next; - } - goto flush_and_return; - } - - for (addr =3D start; addr !=3D end; pte++, addr +=3D PAGE_SIZE) { - pte_t ptent =3D ptep_get(pte); - unsigned long categories =3D p->cur_vma_category | - pagemap_page_category(p, vma, addr, ptent); - unsigned long next =3D addr + PAGE_SIZE; - - if (!pagemap_scan_is_interesting_page(categories, p)) - continue; - - ret =3D pagemap_scan_output(categories, p, addr, &next); - if (next =3D=3D addr) - break; - - if (~p->arg.flags & PM_SCAN_WP_MATCHING) - continue; - if (~categories & PAGE_IS_WRITTEN) - continue; - - make_uffd_wp_pte(vma, addr, pte, ptent); - if (!flush_end) - start =3D addr; - flush_end =3D next; - } - -flush_and_return: - if (flush_end) - flush_tlb_range(vma, start, addr); - - lazy_mmu_mode_disable(); - pte_unmap_unlock(start_pte, ptl); - - cond_resched(); - return ret; -} - -#ifdef CONFIG_HUGETLB_PAGE -static int pagemap_scan_hugetlb_entry(pte_t *ptep, unsigned long hmask, - unsigned long start, unsigned long end, - struct mm_walk *walk) -{ - struct pagemap_scan_private *p =3D walk->private; - struct vm_area_struct *vma =3D walk->vma; - unsigned long categories; - spinlock_t *ptl; - int ret =3D 0; - pte_t pte; - - if (~p->arg.flags & PM_SCAN_WP_MATCHING) { - /* Go the short route when not write-protecting pages. 
*/ - - pte =3D huge_ptep_get(walk->mm, start, ptep); - categories =3D p->cur_vma_category | pagemap_hugetlb_category(pte); - - if (!pagemap_scan_is_interesting_page(categories, p)) - return 0; - - return pagemap_scan_output(categories, p, start, &end); - } - - i_mmap_lock_write(vma->vm_file->f_mapping); - ptl =3D huge_pte_lock(hstate_vma(vma), vma->vm_mm, ptep); - - pte =3D huge_ptep_get(walk->mm, start, ptep); - categories =3D p->cur_vma_category | pagemap_hugetlb_category(pte); - - if (!pagemap_scan_is_interesting_page(categories, p)) - goto out_unlock; - - ret =3D pagemap_scan_output(categories, p, start, &end); - if (start =3D=3D end) - goto out_unlock; - - if (~categories & PAGE_IS_WRITTEN) - goto out_unlock; - - if (end !=3D start + HPAGE_SIZE) { - /* Partial HugeTLB page WP isn't possible. */ - pagemap_scan_backout_range(p, start, end); - p->arg.walk_end =3D start; - ret =3D 0; - goto out_unlock; - } - - make_uffd_wp_huge_pte(vma, start, ptep, pte); - flush_hugetlb_tlb_range(vma, start, end); - -out_unlock: - spin_unlock(ptl); - i_mmap_unlock_write(vma->vm_file->f_mapping); - - return ret; -} -#else -#define pagemap_scan_hugetlb_entry NULL -#endif - static int pagemap_scan_pte_hole(unsigned long addr, unsigned long end, int depth, struct mm_walk *walk) { @@ -2773,13 +2301,6 @@ static int pagemap_scan_pte_hole(unsigned long addr,= unsigned long end, return ret; } =20 -static const struct mm_walk_ops pagemap_scan_ops =3D { - .test_walk =3D pagemap_scan_test_walk, - .pmd_entry =3D pagemap_scan_pmd_entry, - .pte_hole =3D pagemap_scan_pte_hole, - .hugetlb_entry =3D pagemap_scan_hugetlb_entry, -}; - static int pagemap_scan_get_args(struct pm_scan_arg *arg, unsigned long uarg) { @@ -2877,6 +2398,135 @@ static long pagemap_scan_flush_buffer(struct pagema= p_scan_private *p) return n; } =20 +static unsigned long pagemap_set_category(struct pagemap_scan_private *p, + struct pt_range_walk *ptw, + enum pt_range_walk_type type) +{ + unsigned long categories =3D 0; + + if 
(ptw->present) { + categories |=3D PAGE_IS_PRESENT; + + if (type =3D=3D PTW_FOLIO && !PageAnon(ptw->page)) + categories |=3D PAGE_IS_FILE; + if (type =3D=3D PTW_PFN) + categories |=3D PAGE_IS_PFNZERO; + } else { + categories |=3D PAGE_IS_SWAPPED; + } + + switch (ptw->level) { + case PTW_PUD_LEVEL: + if (ptw->present) { + if (!pud_uffd_wp(ptw->pud)) + categories |=3D PAGE_IS_WRITTEN; + if (pud_soft_dirty(ptw->pud)) + categories |=3D PAGE_IS_SOFT_DIRTY; + } else { + if (!pud_swp_uffd_wp(ptw->pud)) + categories |=3D PAGE_IS_WRITTEN; + if (pud_swp_soft_dirty(ptw->pud)) + categories |=3D PAGE_IS_SOFT_DIRTY; + } + break; + case PTW_PMD_LEVEL: + if (ptw->present) { + if (!pmd_uffd_wp(ptw->pmd)) + categories |=3D PAGE_IS_WRITTEN; + if (pmd_soft_dirty(ptw->pmd)) + categories |=3D PAGE_IS_SOFT_DIRTY; + } else { + const softleaf_t entry =3D softleaf_from_pmd(ptw->pmd); + + if (softleaf_has_pfn(entry) && + !folio_test_anon(softleaf_to_folio(entry))) + categories |=3D PAGE_IS_FILE; + if (!pmd_swp_uffd_wp(ptw->pmd)) + categories |=3D PAGE_IS_WRITTEN; + if (pmd_swp_soft_dirty(ptw->pmd)) + categories |=3D PAGE_IS_SOFT_DIRTY; + } + break; + case PTW_PTE_LEVEL: + if (ptw->present) { + if (!pte_uffd_wp(ptw->pte)) + categories |=3D PAGE_IS_WRITTEN; + if (pte_soft_dirty(ptw->pte)) + categories |=3D PAGE_IS_SOFT_DIRTY; + } else { + if (!pte_swp_uffd_wp_any(ptw->pte)) + categories |=3D PAGE_IS_WRITTEN; + if (pte_swp_soft_dirty(ptw->pte)) + categories |=3D PAGE_IS_SOFT_DIRTY; + } + break; + } + + return categories; +} + +static int pagemap_scan_walk(struct vm_area_struct *vma, struct pagemap_sc= an_private *p, + unsigned long addr) +{ + int ret =3D 0; + struct pt_range_walk ptw =3D { + .mm =3D vma->vm_mm + }; + enum pt_range_walk_type type; + pt_type_flags_t flags =3D PT_TYPE_ALL; + +keep_walking: + type =3D pt_range_walk_start(&ptw, vma, addr, vma->vm_end, flags); + while (type !=3D PTW_DONE) { + unsigned long categories =3D p->cur_vma_category | + pagemap_set_category(p, &ptw, type); + 
unsigned long curr_addr =3D ptw.curr_addr; + + if (pagemap_scan_is_interesting_page(categories, p)) { + unsigned long end; + + end =3D ptw.next_addr; + + if (~p->arg.flags & PM_SCAN_WP_MATCHING) + goto keep_walking; + if (~categories & PAGE_IS_WRITTEN) + goto keep_walking; + + ret =3D pagemap_scan_output(categories, p, curr_addr, &end); + if (curr_addr =3D=3D end) + goto out; + + if (end !=3D curr_addr + HPAGE_SIZE) { + if (is_vm_hugetlb_page(ptw.vma)) { + /* Partial HugeTLB page WP isn't possible. */ + pagemap_scan_backout_range(p, curr_addr, end); + p->arg.walk_end =3D curr_addr; + ret =3D 0; + goto keep_walking; + } + if (ptw.level =3D=3D PTW_PMD_LEVEL) { + pt_range_walk_done(&ptw); + split_huge_pmd(ptw.vma, ptw.pmdp, curr_addr); + pagemap_scan_backout_range(p, curr_addr, end); + /* Relaunch now that we split the pmd */ + goto keep_walking; + } + } + + if (ptw.level =3D=3D PTW_PUD_LEVEL) + make_uffd_wp_pud(ptw.vma, curr_addr, ptw.pudp); + if (ptw.level =3D=3D PTW_PMD_LEVEL) + make_uffd_wp_pmd(ptw.vma, curr_addr, ptw.pmdp); + if (ptw.level =3D=3D PTW_PTE_LEVEL) + make_uffd_wp_pte(ptw.vma, curr_addr, ptw.ptep, ptw.pte); + } + type =3D pt_range_walk_next(&ptw, vma, vma->vm_start, vma->vm_end, flags= ); + } +out: + pt_range_walk_done(&ptw); + return ret; +} + static long do_pagemap_scan(struct mm_struct *mm, unsigned long uarg) { struct pagemap_scan_private p =3D {0}; @@ -2897,6 +2547,7 @@ static long do_pagemap_scan(struct mm_struct *mm, uns= igned long uarg) for (walk_start =3D p.arg.start; walk_start < p.arg.end; walk_start =3D p.arg.walk_end) { struct mmu_notifier_range range; + unsigned long next; long n_out; =20 if (fatal_signal_pending(current)) { @@ -2915,8 +2566,21 @@ static long do_pagemap_scan(struct mm_struct *mm, un= signed long uarg) mmu_notifier_invalidate_range_start(&range); } =20 - ret =3D walk_page_range(mm, walk_start, p.arg.end, - &pagemap_scan_ops, &p); + do { + struct vm_area_struct *vma =3D find_vma(mm, walk_start); + + if (vma) { + ret =3D 
pagemap_scan_walk(vma, &p, walk_start); + if (ret) + break; + walk_start =3D min(p.arg.end, vma->vm_end); + next =3D walk_start; + } else { + walk_start =3D p.arg.end; + next =3D p.arg.end; + } + + } while (next < p.arg.end); =20 if (p.arg.flags & PM_SCAN_WP_MATCHING) mmu_notifier_invalidate_range_end(&range); @@ -2950,6 +2614,251 @@ static long do_pagemap_scan(struct mm_struct *mm, u= nsigned long uarg) return ret; } =20 +static int pagemap_read_walk_range(struct vm_area_struct *vma, unsigned lo= ng start, + struct pagemapread *pm) +{ + int err =3D 0; + struct pt_range_walk ptw =3D { + .mm =3D vma->vm_mm + }; + enum pt_range_walk_type type; + pt_type_flags_t wflags =3D PT_TYPE_ALL; + pte_t *ptep; + + wflags &=3D ~(PT_TYPE_NONE|PT_TYPE_PFN); + + type =3D pt_range_walk_start(&ptw, vma, start, vma->vm_end, wflags); + while (type !=3D PTW_DONE) { + unsigned long end; + u64 frame =3D 0, flags =3D 0; + struct page *page =3D NULL; + struct folio *folio =3D NULL; + + end =3D 0; + switch (ptw.level) { + case PTW_PUD_LEVEL: + end =3D pud_addr_end(start, vma->vm_end); + if (vma->vm_flags & VM_SOFTDIRTY) + flags |=3D PM_SOFT_DIRTY; + + if (pud_present(ptw.pud)) { + page =3D pud_page(ptw.pud); + folio =3D page_folio(page); + flags |=3D PM_PRESENT; + + if (!folio_test_anon(folio)) + flags |=3D PM_FILE; + + if (pm->show_pfn) { + unsigned long hmask =3D huge_page_mask(hstate_vma(vma)); + + frame =3D pud_pfn(ptw.pud) + + ((start & ~hmask) >> PAGE_SHIFT); + } + } else if (pud_swp_uffd_wp(ptw.pud)) { + flags |=3D PM_UFFD_WP; + } + break; + case PTW_PMD_LEVEL: + unsigned int idx =3D (start & ~PMD_MASK) >> PAGE_SHIFT; + + end =3D pmd_addr_end(start, vma->vm_end); + if (vma->vm_flags & VM_SOFTDIRTY) + flags |=3D PM_SOFT_DIRTY; + + if (pmd_present(ptw.pmd)) { + page =3D pmd_page(ptw.pmd); + flags |=3D PM_PRESENT; + + if (pmd_soft_dirty(ptw.pmd)) + flags |=3D PM_SOFT_DIRTY; + if (pmd_uffd_wp(ptw.pmd)) + flags |=3D PM_UFFD_WP; + if (pm->show_pfn) + frame =3D pmd_pfn(ptw.pmd) + idx; + } 
else if (thp_migration_supported() || IS_ENABLED(CONFIG_HUGETLB_PAGE)= ) { + const softleaf_t entry =3D softleaf_from_pmd(ptw.pmd); + unsigned long offset; + + if (pm->show_pfn) { + if (softleaf_has_pfn(entry)) + offset =3D softleaf_to_pfn(entry) + idx; + else + offset =3D swp_offset(entry) + idx; + frame =3D swp_type(entry) | + (offset << MAX_SWAPFILES_SHIFT); + } + + if (!is_vm_hugetlb_page(vma)) + flags |=3D PM_SWAP; + if (pmd_swp_soft_dirty(ptw.pmd)) + flags |=3D PM_SOFT_DIRTY; + if (pmd_swp_uffd_wp(ptw.pmd)) + flags |=3D PM_UFFD_WP; + + VM_WARN_ON_ONCE(!pmd_is_migration_entry(ptw.pmd)); + page =3D softleaf_to_page(entry); + } + + if (page) { + folio =3D page_folio(page); + if (!folio_test_anon(folio)) + flags |=3D PM_FILE; + } + + break; + case PTW_PTE_LEVEL: + end =3D pmd_addr_end(start, vma->vm_end); + break; + } + + if (ptw.level =3D=3D PTW_PTE_LEVEL) { + ptep =3D ptw.ptep; + for (; start < end; ptep++, start +=3D PAGE_SIZE) { + pagemap_entry_t pme; + + pme =3D pte_to_pagemap_entry(pm, vma, start, ptep_get(ptep)); + err =3D add_to_pagemap(&pme, pm); + ptw.next_addr =3D start + PAGE_SIZE; + if (err) + break; + } + } else { + for (; start !=3D end; start +=3D PAGE_SIZE) { + u64 cur_flags =3D flags; + pagemap_entry_t pme; + + if (folio && (flags & PM_PRESENT) && + __folio_page_mapped_exclusively(folio, page)) + cur_flags |=3D PM_MMAP_EXCLUSIVE; + + pme =3D make_pme(frame, cur_flags); + err =3D add_to_pagemap(&pme, pm); + if (err) + break; + if (pm->show_pfn) { + if (flags & PM_PRESENT) + frame++; + else if (flags & PM_SWAP) + frame +=3D (1 << MAX_SWAPFILES_SHIFT); + } + } + } + type =3D pt_range_walk_next(&ptw, vma, vma->vm_start, vma->vm_end, wflag= s); + } + pt_range_walk_done(&ptw); + + return err; +} + +static ssize_t pagemap_read(struct file *file, char __user *buf, + size_t count, loff_t *ppos) +{ + struct mm_struct *mm =3D file->private_data; + struct pagemapread pm; + unsigned long src; + unsigned long svpfn; + unsigned long start_vaddr; + unsigned 
long end_vaddr; + int ret =3D 0, copied =3D 0; + + if (!mm || !mmget_not_zero(mm)) + goto out; + + ret =3D -EINVAL; + /* file position must be aligned */ + if ((*ppos % PM_ENTRY_BYTES) || (count % PM_ENTRY_BYTES)) + goto out_mm; + + ret =3D 0; + if (!count) + goto out_mm; + + /* do not disclose physical addresses: attack vector */ + pm.show_pfn =3D file_ns_capable(file, &init_user_ns, CAP_SYS_ADMIN); + + pm.len =3D (PAGEMAP_WALK_SIZE >> PAGE_SHIFT); + pm.buffer =3D kmalloc_array(pm.len, PM_ENTRY_BYTES, GFP_KERNEL); + ret =3D -ENOMEM; + if (!pm.buffer) + goto out_mm; + + src =3D *ppos; + svpfn =3D src / PM_ENTRY_BYTES; + end_vaddr =3D mm->task_size; + + /* watch out for wraparound */ + start_vaddr =3D end_vaddr; + if (svpfn <=3D (ULONG_MAX >> PAGE_SHIFT)) { + unsigned long end; + + ret =3D mmap_read_lock_killable(mm); + if (ret) + goto out_free; + start_vaddr =3D untagged_addr_remote(mm, svpfn << PAGE_SHIFT); + mmap_read_unlock(mm); + + end =3D start_vaddr + ((count / PM_ENTRY_BYTES) << PAGE_SHIFT); + if (end >=3D start_vaddr && end < mm->task_size) + end_vaddr =3D end; + } + + /* Ensure the address is inside the task */ + if (start_vaddr > mm->task_size) + start_vaddr =3D end_vaddr; + + ret =3D 0; + + while (count && (start_vaddr < end_vaddr)) { + int len; + unsigned long end; + unsigned long next; + + pm.pos =3D 0; + end =3D (start_vaddr + PAGEMAP_WALK_SIZE) & PAGEMAP_WALK_MASK; + if (end < start_vaddr || end > end_vaddr) + end =3D end_vaddr; + ret =3D mmap_read_lock_killable(mm); + if (ret) + goto out_free; + + do { + struct vm_area_struct *vma =3D find_vma(mm, start_vaddr); + + if (vma) { + ret =3D pagemap_read_walk_range(vma, start_vaddr, &pm); + if (ret) + goto out_err; + start_vaddr =3D min(end, vma->vm_end); + next =3D start_vaddr; + } else { + next =3D end; + } + } while (next < end); +out_err: + mmap_read_unlock(mm); + + len =3D min(count, PM_ENTRY_BYTES * pm.pos); + if (copy_to_user(buf, pm.buffer, len)) { + ret =3D -EFAULT; + goto out_free; + } + copied 
+=3D len; + buf +=3D len; + count -=3D len; + } + *ppos +=3D copied; + if (!ret || ret =3D=3D PM_END_OF_BUFFER) + ret =3D copied; + +out_free: + kfree(pm.buffer); +out_mm: + mmput(mm); +out: + return ret; +} + static long do_pagemap_cmd(struct file *file, unsigned int cmd, unsigned long arg) { @@ -2972,6 +2881,7 @@ const struct file_operations proc_pagemap_operations = =3D { .unlocked_ioctl =3D do_pagemap_cmd, .compat_ioctl =3D do_pagemap_cmd, }; + #endif /* CONFIG_PROC_PAGE_MONITOR */ =20 #ifdef CONFIG_NUMA diff --git a/include/linux/leafops.h b/include/linux/leafops.h index 122ac50aeb09..6444625c6fbb 100644 --- a/include/linux/leafops.h +++ b/include/linux/leafops.h @@ -618,6 +618,19 @@ static inline bool pmd_is_device_private_entry(pmd_t p= md) =20 #endif /* CONFIG_ZONE_DEVICE && CONFIG_ARCH_ENABLE_THP_MIGRATION */ =20 +#ifdef CONFIG_HUGETLB_PAGE +/** + * pud_is_migration_entry() - Does this PUD entry encode a migration entry? + * @pud: PUD entry. + * + * Returns: true if the PUD encodes a migration entry, otherwise false. + */ +static inline bool pud_is_migration_entry(pud_t pud) +{ + return softleaf_is_migration(softleaf_from_pud(pud)); +} +#endif + /** * pmd_is_migration_entry() - Does this PMD entry encode a migration entry? * @pmd: PMD entry. 
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 6f01d5ed73f6..6f8e83a5bb08 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -1229,11 +1229,21 @@ static inline pmd_t generic_pmdp_establish(struct vm_area_struct *vma,
 }
 #endif
 
+#ifndef __HAVE_ARCH_PUDP_INVALIDATE
+extern pud_t pudp_invalidate(struct vm_area_struct *vma, unsigned long address,
+			     pud_t *pudp);
+#endif
+
 #ifndef __HAVE_ARCH_PMDP_INVALIDATE
 extern pmd_t pmdp_invalidate(struct vm_area_struct *vma, unsigned long address,
 			    pmd_t *pmdp);
 #endif
 
+#ifndef __HAVE_ARCH_PUDP_INVALIDATE_AD
+extern pud_t pudp_invalidate_ad(struct vm_area_struct *vma,
+				unsigned long address, pud_t *pudp);
+#endif
+
 #ifndef __HAVE_ARCH_PMDP_INVALIDATE_AD
 
 /*
@@ -1776,6 +1786,21 @@ static inline pgprot_t pgprot_modify(pgprot_t oldprot, pgprot_t newprot)
 
 #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
 #ifndef CONFIG_ARCH_ENABLE_THP_MIGRATION
+static inline pud_t pud_swp_mksoft_dirty(pud_t pud)
+{
+	return pud;
+}
+
+static inline int pud_swp_soft_dirty(pud_t pud)
+{
+	return 0;
+}
+
+static inline pud_t pud_swp_clear_soft_dirty(pud_t pud)
+{
+	return pud;
+}
+
 static inline pmd_t pmd_swp_mksoft_dirty(pmd_t pmd)
 {
 	return pmd;
@@ -1818,6 +1843,11 @@ static inline int pmd_soft_dirty(pmd_t pmd)
 	return 0;
 }
 
+static inline int pud_soft_dirty(pud_t pud)
+{
+	return 0;
+}
+
 static inline pte_t pte_mksoft_dirty(pte_t pte)
 {
 	return pte;
diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c
index af7966169d69..f390c93b98b2 100644
--- a/mm/pgtable-generic.c
+++ b/mm/pgtable-generic.c
@@ -206,6 +206,16 @@ pmd_t pmdp_invalidate(struct vm_area_struct *vma, unsigned long address,
 }
 #endif
 
+#ifndef __HAVE_ARCH_PUDP_INVALIDATE_AD
+pud_t pudp_invalidate_ad(struct vm_area_struct *vma, unsigned long address,
+			 pud_t *pudp)
+
+{
+	VM_WARN_ON_ONCE(!pud_present(*pudp));
+	return pudp_invalidate(vma, address, pudp);
+}
+#endif
+
 #ifndef __HAVE_ARCH_PMDP_INVALIDATE_AD
 pmd_t pmdp_invalidate_ad(struct vm_area_struct *vma, unsigned long address,
 			 pmd_t *pmdp)
-- 
2.35.3
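
For anyone who wants to sanity-check the converted read path from userspace, here is a minimal sketch (not part of the patch) that decodes the 64-bit entries pagemap_read() hands back. The PME_* macros and the pme_*/pagemap_read_entry helpers are hypothetical names made up for this illustration; the bit positions are assumed to follow the kernel's pagemap documentation and to mirror the PM_PRESENT, PM_SWAP, PM_FILE, PM_UFFD_WP, PM_MMAP_EXCLUSIVE and PM_SOFT_DIRTY flags used above.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed layout of one pagemap entry (hypothetical userspace names). */
#define PME_PFN_MASK   ((UINT64_C(1) << 55) - 1)  /* bits 0-54: PFN if present */
#define PME_SOFT_DIRTY (UINT64_C(1) << 55)
#define PME_MMAP_EXCL  (UINT64_C(1) << 56)
#define PME_UFFD_WP    (UINT64_C(1) << 57)
#define PME_FILE       (UINT64_C(1) << 61)
#define PME_SWAP       (UINT64_C(1) << 62)
#define PME_PRESENT    (UINT64_C(1) << 63)

static inline int pme_present(uint64_t e)    { return (e & PME_PRESENT) != 0; }
static inline int pme_swapped(uint64_t e)    { return (e & PME_SWAP) != 0; }
static inline int pme_file(uint64_t e)       { return (e & PME_FILE) != 0; }
static inline int pme_uffd_wp(uint64_t e)    { return (e & PME_UFFD_WP) != 0; }
static inline int pme_exclusive(uint64_t e)  { return (e & PME_MMAP_EXCL) != 0; }
static inline int pme_soft_dirty(uint64_t e) { return (e & PME_SOFT_DIRTY) != 0; }

/* The low bits only hold a PFN for present pages; report 0 otherwise. */
static inline uint64_t pme_pfn(uint64_t e)
{
	return pme_present(e) ? (e & PME_PFN_MASK) : 0;
}

/*
 * Read the entry covering @vaddr from /proc/self/pagemap.
 * Returns 0 on success, -1 on any failure.
 */
static inline int pagemap_read_entry(const void *vaddr, uint64_t *entry)
{
	long page_size = sysconf(_SC_PAGESIZE);
	off_t off;
	ssize_t n;
	int fd;

	assert(entry != NULL);
	if (page_size <= 0)
		return -1;

	fd = open("/proc/self/pagemap", O_RDONLY);
	if (fd < 0)
		return -1;

	/* One 8-byte entry per virtual page, indexed by virtual page number. */
	off = (off_t)((uintptr_t)vaddr / (uintptr_t)page_size) *
	      (off_t)sizeof(uint64_t);
	n = pread(fd, entry, sizeof(*entry), off);
	close(fd);

	return n == (ssize_t)sizeof(*entry) ? 0 : -1;
}
```

Comparing the flags of a few touched addresses (PTE-mapped, THP and hugetlb) before and after applying the series should give identical results if the conversion preserves behaviour. Note that the PFN field reads as zero without CAP_SYS_ADMIN, matching the pm.show_pfn check kept in pagemap_read().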