From nobody Sun Apr 5 16:29:49 2026
From: Pedro Falcato
To: Andrew Morton, "Liam R. Howlett", Lorenzo Stoakes
Cc: Pedro Falcato, Vlastimil Babka, Jann Horn, David Hildenbrand, Dev Jain, Luke Yang, jhladky@redhat.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 1/2] mm/mprotect: move softleaf code out of the main function
Date: Thu, 2 Apr 2026 15:16:27 +0100
Message-ID: <20260402141628.3367596-2-pfalcato@suse.de>
In-Reply-To: <20260402141628.3367596-1-pfalcato@suse.de>
References: <20260402141628.3367596-1-pfalcato@suse.de>

Move the softleaf handling in change_pte_range() out into a separate
function. This makes change_pte_range() a good bit smaller and lessens
the cognitive load of reading through it.
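The shape of the refactor — hoisting a loop-body branch, with its `pages++`/`continue` bookkeeping, into a helper that returns how many entries it changed — can be sketched in isolation. This is a toy model with hypothetical stand-in types, not the actual mm code:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for pte/softleaf state; not kernel types. */
enum toy_kind { TOY_PRESENT, TOY_MARKER, TOY_MIGRATION };

struct toy_entry {
	enum toy_kind kind;
	int writable;
};

/*
 * Mirrors the role of change_softleaf_pte(): handle one non-present
 * entry and return the number of entries changed (0 or 1), instead of
 * mutating a `pages` counter owned by the caller's loop.
 */
static long toy_change_nonpresent(struct toy_entry *e)
{
	switch (e->kind) {
	case TOY_MARKER:	/* nothing to change, like poison/guard markers */
		return 0;
	case TOY_MIGRATION:	/* "be safe and disable write" */
		if (e->writable) {
			e->writable = 0;
			return 1;
		}
		return 0;
	default:
		return 0;
	}
}

/* The walk loop now just accumulates the helper's return value. */
static long toy_change_range(struct toy_entry *tab, size_t n)
{
	long pages = 0;

	for (size_t i = 0; i < n; i++) {
		if (tab[i].kind == TOY_PRESENT) {
			tab[i].writable = 0;	/* present fast path */
			pages++;
		} else {
			pages += toy_change_nonpresent(&tab[i]);
		}
	}
	return pages;
}
```

The payoff is the same as in the patch: the loop body shrinks to a single accumulation, and the `continue`-heavy special-case logic lives behind one function boundary.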
Reviewed-by: Lorenzo Stoakes (Oracle)
Acked-by: David Hildenbrand (Arm)
Tested-by: Luke Yang
Signed-off-by: Pedro Falcato
---
 mm/mprotect.c | 127 ++++++++++++++++++++++++++------------------------
 1 file changed, 67 insertions(+), 60 deletions(-)

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 9681f055b9fc..5929ce792c7b 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -211,6 +211,72 @@ static void set_write_prot_commit_flush_ptes(struct vm_area_struct *vma,
 	commit_anon_folio_batch(vma, folio, page, addr, ptep, oldpte, ptent, nr_ptes, tlb);
 }
 
+static long change_softleaf_pte(struct vm_area_struct *vma,
+		unsigned long addr, pte_t *pte, pte_t oldpte, unsigned long cp_flags)
+{
+	const bool uffd_wp = cp_flags & MM_CP_UFFD_WP;
+	const bool uffd_wp_resolve = cp_flags & MM_CP_UFFD_WP_RESOLVE;
+	softleaf_t entry = softleaf_from_pte(oldpte);
+	pte_t newpte;
+
+	if (softleaf_is_migration_write(entry)) {
+		const struct folio *folio = softleaf_to_folio(entry);
+
+		/*
+		 * A protection check is difficult so
+		 * just be safe and disable write
+		 */
+		if (folio_test_anon(folio))
+			entry = make_readable_exclusive_migration_entry(swp_offset(entry));
+		else
+			entry = make_readable_migration_entry(swp_offset(entry));
+		newpte = swp_entry_to_pte(entry);
+		if (pte_swp_soft_dirty(oldpte))
+			newpte = pte_swp_mksoft_dirty(newpte);
+	} else if (softleaf_is_device_private_write(entry)) {
+		/*
+		 * We do not preserve soft-dirtiness. See
+		 * copy_nonpresent_pte() for explanation.
+		 */
+		entry = make_readable_device_private_entry(swp_offset(entry));
+		newpte = swp_entry_to_pte(entry);
+		if (pte_swp_uffd_wp(oldpte))
+			newpte = pte_swp_mkuffd_wp(newpte);
+	} else if (softleaf_is_marker(entry)) {
+		/*
+		 * Ignore error swap entries unconditionally,
+		 * because any access should sigbus/sigsegv
+		 * anyway.
+		 */
+		if (softleaf_is_poison_marker(entry) ||
+		    softleaf_is_guard_marker(entry))
+			return 0;
+		/*
+		 * If this is uffd-wp pte marker and we'd like
+		 * to unprotect it, drop it; the next page
+		 * fault will trigger without uffd trapping.
+		 */
+		if (uffd_wp_resolve) {
+			pte_clear(vma->vm_mm, addr, pte);
+			return 1;
+		}
+		return 0;
+	} else {
+		newpte = oldpte;
+	}
+
+	if (uffd_wp)
+		newpte = pte_swp_mkuffd_wp(newpte);
+	else if (uffd_wp_resolve)
+		newpte = pte_swp_clear_uffd_wp(newpte);
+
+	if (!pte_same(oldpte, newpte)) {
+		set_pte_at(vma->vm_mm, addr, pte, newpte);
+		return 1;
+	}
+	return 0;
+}
+
 static long change_pte_range(struct mmu_gather *tlb,
 		struct vm_area_struct *vma, pmd_t *pmd, unsigned long addr,
 		unsigned long end, pgprot_t newprot, unsigned long cp_flags)
@@ -317,66 +383,7 @@ static long change_pte_range(struct mmu_gather *tlb,
 				pages++;
 			}
 		} else {
-			softleaf_t entry = softleaf_from_pte(oldpte);
-			pte_t newpte;
-
-			if (softleaf_is_migration_write(entry)) {
-				const struct folio *folio = softleaf_to_folio(entry);
-
-				/*
-				 * A protection check is difficult so
-				 * just be safe and disable write
-				 */
-				if (folio_test_anon(folio))
-					entry = make_readable_exclusive_migration_entry(
-							swp_offset(entry));
-				else
-					entry = make_readable_migration_entry(swp_offset(entry));
-				newpte = swp_entry_to_pte(entry);
-				if (pte_swp_soft_dirty(oldpte))
-					newpte = pte_swp_mksoft_dirty(newpte);
-			} else if (softleaf_is_device_private_write(entry)) {
-				/*
-				 * We do not preserve soft-dirtiness. See
-				 * copy_nonpresent_pte() for explanation.
-				 */
-				entry = make_readable_device_private_entry(
-							swp_offset(entry));
-				newpte = swp_entry_to_pte(entry);
-				if (pte_swp_uffd_wp(oldpte))
-					newpte = pte_swp_mkuffd_wp(newpte);
-			} else if (softleaf_is_marker(entry)) {
-				/*
-				 * Ignore error swap entries unconditionally,
-				 * because any access should sigbus/sigsegv
-				 * anyway.
-				 */
-				if (softleaf_is_poison_marker(entry) ||
-				    softleaf_is_guard_marker(entry))
-					continue;
-				/*
-				 * If this is uffd-wp pte marker and we'd like
-				 * to unprotect it, drop it; the next page
-				 * fault will trigger without uffd trapping.
-				 */
-				if (uffd_wp_resolve) {
-					pte_clear(vma->vm_mm, addr, pte);
-					pages++;
-				}
-				continue;
-			} else {
-				newpte = oldpte;
-			}
-
-			if (uffd_wp)
-				newpte = pte_swp_mkuffd_wp(newpte);
-			else if (uffd_wp_resolve)
-				newpte = pte_swp_clear_uffd_wp(newpte);
-
-			if (!pte_same(oldpte, newpte)) {
-				set_pte_at(vma->vm_mm, addr, pte, newpte);
-				pages++;
-			}
+			pages += change_softleaf_pte(vma, addr, pte, oldpte, cp_flags);
 		}
 	} while (pte += nr_ptes, addr += nr_ptes * PAGE_SIZE, addr != end);
 	lazy_mmu_mode_disable();
-- 
2.53.0

From nobody Sun Apr 5 16:29:49 2026
From: Pedro Falcato
To: Andrew Morton, "Liam R. Howlett", Lorenzo Stoakes
Cc: Pedro Falcato, Vlastimil Babka, Jann Horn, David Hildenbrand, Dev Jain, Luke Yang, jhladky@redhat.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 2/2] mm/mprotect: special-case small folios when applying write permissions
Date: Thu, 2 Apr 2026 15:16:28 +0100
Message-ID: <20260402141628.3367596-3-pfalcato@suse.de>
In-Reply-To: <20260402141628.3367596-1-pfalcato@suse.de>
References: <20260402141628.3367596-1-pfalcato@suse.de>

The common order-0 case is important enough to deserve its own branch,
which avoids the hairy, large loop logic that the CPU does not seem to
handle particularly well. While at it, encourage the compiler to inline
the batch-PTE logic and resolve constant branches by adding
__always_inline strategically.
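The __always_inline approach pays off because each call site can pass compile-time-constant flags, so the untaken branches inside the inlined body fold away. A minimal user-space sketch of that effect, with hypothetical names (gcc/clang attribute syntax, not the kernel helpers themselves):

```c
#include <assert.h>

/* Portable spelling of the kernel's __always_inline. */
#define always_inline_fn inline __attribute__((always_inline))

/* Hypothetical flag bits standing in for MM_CP_UFFD_WP{,_RESOLVE}. */
#define CP_FLAG_WP	(1UL << 0)
#define CP_FLAG_RESOLVE	(1UL << 1)

/*
 * Because this helper is always inlined, a call site that passes a
 * constant cp_flags lets the compiler delete the branches entirely --
 * the effect the patch is after with change_present_ptes() and friends.
 */
static always_inline_fn unsigned long apply_flags(unsigned long pte,
						  unsigned long cp_flags)
{
	if (cp_flags & CP_FLAG_WP)
		pte |= 0x100UL;		/* hypothetical "uffd-wp" software bit */
	else if (cp_flags & CP_FLAG_RESOLVE)
		pte &= ~0x100UL;
	return pte;
}

/* Constant-flag call sites compile down to straight-line bit ops. */
static unsigned long set_wp(unsigned long pte)
{
	return apply_flags(pte, CP_FLAG_WP);
}

static unsigned long clear_wp(unsigned long pte)
{
	return apply_flags(pte, CP_FLAG_RESOLVE);
}
```

The design choice is the usual trade: __always_inline duplicates the helper body at each call site, which is worth it here only because constant propagation then specializes each copy.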
Suggested-by: David Hildenbrand (Arm)
Reviewed-by: Lorenzo Stoakes (Oracle)
Tested-by: Luke Yang
Signed-off-by: Pedro Falcato
---
 mm/mprotect.c | 91 ++++++++++++++++++++++++++++++++-------------------
 1 file changed, 57 insertions(+), 34 deletions(-)

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 5929ce792c7b..98da856e3a52 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -117,9 +117,9 @@ static int mprotect_folio_pte_batch(struct folio *folio, pte_t *ptep,
 }
 
 /* Set nr_ptes number of ptes, starting from idx */
-static void prot_commit_flush_ptes(struct vm_area_struct *vma, unsigned long addr,
-		pte_t *ptep, pte_t oldpte, pte_t ptent, int nr_ptes,
-		int idx, bool set_write, struct mmu_gather *tlb)
+static __always_inline void prot_commit_flush_ptes(struct vm_area_struct *vma,
+		unsigned long addr, pte_t *ptep, pte_t oldpte, pte_t ptent,
+		int nr_ptes, int idx, bool set_write, struct mmu_gather *tlb)
 {
 	/*
 	 * Advance the position in the batch by idx; note that if idx > 0,
@@ -143,7 +143,7 @@ static void prot_commit_flush_ptes(struct vm_area_struct *vma, unsigned long add
  * !PageAnonExclusive() pages, starting from start_idx. Caller must enforce
  * that the ptes point to consecutive pages of the same anon large folio.
  */
-static int page_anon_exclusive_sub_batch(int start_idx, int max_len,
+static __always_inline int page_anon_exclusive_sub_batch(int start_idx, int max_len,
 		struct page *first_page, bool expected_anon_exclusive)
 {
 	int idx;
@@ -169,7 +169,7 @@ static int page_anon_exclusive_sub_batch(int start_idx, int max_len,
  * pte of the batch. Therefore, we must individually check all pages and
  * retrieve sub-batches.
  */
-static void commit_anon_folio_batch(struct vm_area_struct *vma,
+static __always_inline void commit_anon_folio_batch(struct vm_area_struct *vma,
 		struct folio *folio, struct page *first_page, unsigned long addr, pte_t *ptep,
 		pte_t oldpte, pte_t ptent, int nr_ptes, struct mmu_gather *tlb)
 {
@@ -188,7 +188,7 @@ static void commit_anon_folio_batch(struct vm_area_struct *vma,
 	}
 }
 
-static void set_write_prot_commit_flush_ptes(struct vm_area_struct *vma,
+static __always_inline void set_write_prot_commit_flush_ptes(struct vm_area_struct *vma,
 		struct folio *folio, struct page *page, unsigned long addr, pte_t *ptep,
 		pte_t oldpte, pte_t ptent, int nr_ptes, struct mmu_gather *tlb)
 {
@@ -277,6 +277,45 @@ static long change_softleaf_pte(struct vm_area_struct *vma,
 	return 0;
 }
 
+static __always_inline void change_present_ptes(struct mmu_gather *tlb,
+		struct vm_area_struct *vma, unsigned long addr, pte_t *ptep,
+		int nr_ptes, unsigned long end, pgprot_t newprot,
+		struct folio *folio, struct page *page, unsigned long cp_flags)
+{
+	const bool uffd_wp_resolve = cp_flags & MM_CP_UFFD_WP_RESOLVE;
+	const bool uffd_wp = cp_flags & MM_CP_UFFD_WP;
+	pte_t ptent, oldpte;
+
+	oldpte = modify_prot_start_ptes(vma, addr, ptep, nr_ptes);
+	ptent = pte_modify(oldpte, newprot);
+
+	if (uffd_wp)
+		ptent = pte_mkuffd_wp(ptent);
+	else if (uffd_wp_resolve)
+		ptent = pte_clear_uffd_wp(ptent);
+
+	/*
+	 * In some writable, shared mappings, we might want
+	 * to catch actual write access -- see
+	 * vma_wants_writenotify().
+	 *
+	 * In all writable, private mappings, we have to
+	 * properly handle COW.
+	 *
+	 * In both cases, we can sometimes still change PTEs
+	 * writable and avoid the write-fault handler, for
+	 * example, if a PTE is already dirty and no other
+	 * COW or special handling is required.
+	 */
+	if ((cp_flags & MM_CP_TRY_CHANGE_WRITABLE) &&
+	    !pte_write(ptent))
+		set_write_prot_commit_flush_ptes(vma, folio, page,
+				addr, ptep, oldpte, ptent, nr_ptes, tlb);
+	else
+		prot_commit_flush_ptes(vma, addr, ptep, oldpte, ptent,
+				nr_ptes, /* idx = */ 0, /* set_write = */ false, tlb);
+}
+
 static long change_pte_range(struct mmu_gather *tlb,
 		struct vm_area_struct *vma, pmd_t *pmd, unsigned long addr,
 		unsigned long end, pgprot_t newprot, unsigned long cp_flags)
@@ -287,7 +326,6 @@ static long change_pte_range(struct mmu_gather *tlb,
 	bool is_private_single_threaded;
 	bool prot_numa = cp_flags & MM_CP_PROT_NUMA;
 	bool uffd_wp = cp_flags & MM_CP_UFFD_WP;
-	bool uffd_wp_resolve = cp_flags & MM_CP_UFFD_WP_RESOLVE;
 	int nr_ptes;
 
 	tlb_change_page_size(tlb, PAGE_SIZE);
@@ -308,7 +346,6 @@ static long change_pte_range(struct mmu_gather *tlb,
 		int max_nr_ptes = (end - addr) >> PAGE_SHIFT;
 		struct folio *folio = NULL;
 		struct page *page;
-		pte_t ptent;
 
 		/* Already in the desired state. */
 		if (prot_numa && pte_protnone(oldpte))
@@ -334,34 +371,20 @@ static long change_pte_range(struct mmu_gather *tlb,
 
 			nr_ptes = mprotect_folio_pte_batch(folio, pte, oldpte, max_nr_ptes, flags);
 
-			oldpte = modify_prot_start_ptes(vma, addr, pte, nr_ptes);
-			ptent = pte_modify(oldpte, newprot);
-
-			if (uffd_wp)
-				ptent = pte_mkuffd_wp(ptent);
-			else if (uffd_wp_resolve)
-				ptent = pte_clear_uffd_wp(ptent);
-
 			/*
-			 * In some writable, shared mappings, we might want
-			 * to catch actual write access -- see
-			 * vma_wants_writenotify().
-			 *
-			 * In all writable, private mappings, we have to
-			 * properly handle COW.
-			 *
-			 * In both cases, we can sometimes still change PTEs
-			 * writable and avoid the write-fault handler, for
-			 * example, if a PTE is already dirty and no other
-			 * COW or special handling is required.
+			 * Optimize for the small-folio common case by
+			 * special-casing it here. Compiler constant propagation
+			 * plus copious amounts of __always_inline does wonders.
 			 */
-			if ((cp_flags & MM_CP_TRY_CHANGE_WRITABLE) &&
-			    !pte_write(ptent))
-				set_write_prot_commit_flush_ptes(vma, folio, page,
-					addr, pte, oldpte, ptent, nr_ptes, tlb);
-			else
-				prot_commit_flush_ptes(vma, addr, pte, oldpte, ptent,
-					nr_ptes, /* idx = */ 0, /* set_write = */ false, tlb);
+			if (likely(nr_ptes == 1)) {
+				change_present_ptes(tlb, vma, addr, pte, 1,
+						end, newprot, folio, page, cp_flags);
+			} else {
+				change_present_ptes(tlb, vma, addr, pte,
+						nr_ptes, end, newprot, folio, page,
+						cp_flags);
+			}
+			pages += nr_ptes;
 		} else if (pte_none(oldpte)) {
 			/*
-- 
2.53.0