From nobody Sun Jun 21 10:07:16 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DFBFFC433EF for ; Tue, 29 Mar 2022 16:46:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239793AbiC2QsY (ORCPT ); Tue, 29 Mar 2022 12:48:24 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53484 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239782AbiC2QsR (ORCPT ); Tue, 29 Mar 2022 12:48:17 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 1C2B5A6E3E for ; Tue, 29 Mar 2022 09:46:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1648572392; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=o2w/fJglRUuWKtszG5JddRy/AwNMJzuTrZhnPm9ugtA=; b=hKBA8V6QWKeD4MnFJo8tiy8ErnEqzmNto3wePZvTRlYM5R4OdYgxoS+fH2r+dF3rIzIgGV wAcKRNoKy4HP5tVIJEOi2gsj7ZSp0O/CwDBo60FvDjosr+e8EjJZKI+2GQffbeHW3VA9GY p7H0H1XfLxp/ZSUhJbDkz5zZfteI1ao= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-85-sqKJpRupPP2WGYuGU09DYA-1; Tue, 29 Mar 2022 12:46:23 -0400 X-MC-Unique: sqKJpRupPP2WGYuGU09DYA-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 1A27A29DD99C; Tue, 29 Mar 2022 16:45:28 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.194.134]) by smtp.corp.redhat.com (Postfix) with ESMTP id F0C4E1121314; Tue, 29 Mar 2022 16:44:33 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: Andrew Morton , Hugh Dickins , Linus Torvalds , David Rientjes , Shakeel Butt , John Hubbard , Jason Gunthorpe , Mike Kravetz , Mike Rapoport , Yang Shi , "Kirill A . Shutemov" , Matthew Wilcox , Vlastimil Babka , Jann Horn , Michal Hocko , Nadav Amit , Rik van Riel , Roman Gushchin , Andrea Arcangeli , Peter Xu , Donald Dutile , Christoph Hellwig , Oleg Nesterov , Jan Kara , Liang Zhang , Pedro Gomes , Oded Gabbay , Catalin Marinas , Will Deacon , Michael Ellerman , Benjamin Herrenschmidt , Paul Mackerras , Heiko Carstens , Vasily Gorbik , Alexander Gordeev , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , Gerald Schaefer , linux-mm@kvack.org, x86@kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, David Hildenbrand Subject: [PATCH v2 1/8] mm/swap: remember PG_anon_exclusive via a swp pte bit Date: Tue, 29 Mar 2022 18:43:22 +0200 Message-Id: <20220329164329.208407-2-david@redhat.com> In-Reply-To: <20220329164329.208407-1-david@redhat.com> References: <20220329164329.208407-1-david@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 2.78 on 10.11.54.3 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Currently, we clear PG_anon_exclusive in try_to_unmap() and forget about it. We do this, to keep fork() logic on swap entries easy and efficient: for example, if we wouldn't clear it when unmapping, we'd have to lookup the page in the swapcache for each and every swap entry during fork() and clear PG_anon_exclusive if set. Instead, we want to store that information directly in the swap pte, protected by the page table lock, similarly to how we handle SWP_MIGRATION_READ_EXCLUSIVE for migration entries. However, for actual swap entries, we don't want to mess with the swap type (e.g., still one bit) because it overcomplicates swap code. In try_to_unmap(), we already reject to unmap in case the page might be pinned, because we must not lose PG_anon_exclusive on pinned pages ever. Checking if there are other unexpected references reliably *before* completely unmapping a page is unfortunately not really possible: THP heavily overcomplicate the situation. Once fully unmapped it's easier -- we, for example, make sure that there are no unexpected references *after* unmapping a page before starting writeback on that page. So, we currently might end up unmapping a page and clearing PG_anon_exclusive if that page has additional references, for example, due to a FOLL_GET. do_swap_page() has to re-determine if a page is exclusive, which will easily fail if there are other references on a page, most prominently GUP references via FOLL_GET. This can currently result in memory corruptions when taking a FOLL_GET | FOLL_WRITE reference on a page even when fork() is never involved: try_to_unmap() will succeed, and when refaulting the page, it cannot be marked exclusive and will get replaced by a copy in the page tables on the next write access, resulting in writes via the GUP reference to the page being lost. In an ideal world, everybody that uses GUP and wants to modify page content, such as O_DIRECT, would properly use FOLL_PIN. However, that conversion will take a while. It's easier to fix what used to work in the past (FOLL_GET | FOLL_WRITE) remembering PG_anon_exclusive. In addition, by remembering PG_anon_exclusive we can further reduce unnecessary COW in some cases, so it's the natural thing to do. So let's transfer the PG_anon_exclusive information to the swap pte and store it via an architecture-dependant pte bit; use that information when restoring the swap pte in do_swap_page() and unuse_pte(). During fork(), we simply have to clear the pte bit and are done. Of course, there is one corner case to handle: swap backends that don't support concurrent page modifications while the page is under writeback. Special case these, and drop the exclusive marker. Add a comment why that is just fine (also, reuse_swap_page() would have done the same in the past). In the future, we'll hopefully have all architectures support __HAVE_ARCH_PTE_SWP_EXCLUSIVE, such that we can get rid of the empty stubs and the define completely. Then, we can also convert SWP_MIGRATION_READ_EXCLUSIVE. For architectures it's fairly easy to support: either simply use a yet unused pte bit that can be used for swap entries, steal one from the arch type bits if they exceed 5, or steal one from the offset bits. Note: R/O FOLL_GET references were never really reliable, especially when taking one on a shared page and then writing to the page (e.g., GUP after fork()). FOLL_GET, including R/W references, were never really reliable once fork was involved (e.g., GUP before fork(), GUP during fork()). KSM steps back in case it stumbles over unexpected references and is, therefore, fine. Signed-off-by: David Hildenbrand Acked-by: Vlastimil Babka --- include/linux/pgtable.h | 29 ++++++++++++++++++++++ include/linux/swapops.h | 2 ++ mm/memory.c | 55 ++++++++++++++++++++++++++++++++++++++--- mm/rmap.c | 19 ++++++++------ mm/swapfile.c | 13 +++++++++- 5 files changed, 105 insertions(+), 13 deletions(-) diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index f4f4077b97aa..53750224e176 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -1003,6 +1003,35 @@ static inline pgprot_t pgprot_modify(pgprot_t oldpro= t, pgprot_t newprot) #define arch_start_context_switch(prev) do {} while (0) #endif =20 +/* + * When replacing an anonymous page by a real (!non) swap entry, we clear + * PG_anon_exclusive from the page and instead remember whether the flag w= as + * set in the swp pte. During fork(), we have to mark the entry as !exclus= ive + * (possibly shared). On swapin, we use that information to restore + * PG_anon_exclusive, which is very helpful in cases where we might have + * additional (e.g., FOLL_GET) references on a page and wouldn't be able to + * detect exclusivity. + * + * These functions don't apply to non-swap entries (e.g., migration, hwpoi= son, + * ...). + */ +#ifndef __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + return pte; +} + +static inline int pte_swp_exclusive(pte_t pte) +{ + return false; +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + return pte; +} +#endif + #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY #ifndef CONFIG_ARCH_ENABLE_THP_MIGRATION static inline pmd_t pmd_swp_mksoft_dirty(pmd_t pmd) diff --git a/include/linux/swapops.h b/include/linux/swapops.h index 06280fc1c99b..32d517a28969 100644 --- a/include/linux/swapops.h +++ b/include/linux/swapops.h @@ -26,6 +26,8 @@ /* Clear all flags but only keep swp_entry_t related information */ static inline pte_t pte_swp_clear_flags(pte_t pte) { + if (pte_swp_exclusive(pte)) + pte =3D pte_swp_clear_exclusive(pte); if (pte_swp_soft_dirty(pte)) pte =3D pte_swp_clear_soft_dirty(pte); if (pte_swp_uffd_wp(pte)) diff --git a/mm/memory.c b/mm/memory.c index 14618f446139..9060cc7f2123 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -792,6 +792,11 @@ copy_nonpresent_pte(struct mm_struct *dst_mm, struct m= m_struct *src_mm, &src_mm->mmlist); spin_unlock(&mmlist_lock); } + /* Mark the swap entry as shared. */ + if (pte_swp_exclusive(*src_pte)) { + pte =3D pte_swp_clear_exclusive(*src_pte); + set_pte_at(src_mm, addr, src_pte, pte); + } rss[MM_SWAPENTS]++; } else if (is_migration_entry(entry)) { page =3D pfn_swap_entry_to_page(entry); @@ -3559,6 +3564,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) struct page *page =3D NULL, *swapcache; struct swap_info_struct *si =3D NULL; rmap_t rmap_flags =3D RMAP_NONE; + bool exclusive =3D false; swp_entry_t entry; pte_t pte; int locked; @@ -3724,6 +3730,46 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) BUG_ON(!PageAnon(page) && PageMappedToDisk(page)); BUG_ON(PageAnon(page) && PageAnonExclusive(page)); =20 + /* + * Check under PT lock (to protect against concurrent fork() sharing + * the swap entry concurrently) for certainly exclusive pages. + */ + if (!PageKsm(page)) { + /* + * Note that pte_swp_exclusive() =3D=3D false for architectures + * without __HAVE_ARCH_PTE_SWP_EXCLUSIVE. + */ + exclusive =3D pte_swp_exclusive(vmf->orig_pte); + if (page !=3D swapcache) { + /* + * We have a fresh page that is not exposed to the + * swapcache -> certainly exclusive. + */ + exclusive =3D true; + } else if (exclusive && PageWriteback(page) && + !(swp_swap_info(entry)->flags & SWP_STABLE_WRITES)) { + /* + * This is tricky: not all swap backends support + * concurrent page modifications while under writeback. + * + * So if we stumble over such a page in the swapcache + * we must not set the page exclusive, otherwise we can + * map it writable without further checks and modify it + * while still under writeback. + * + * For these problematic swap backends, simply drop the + * exclusive marker: this is perfectly fine as we start + * writeback only if we fully unmapped the page and + * there are no unexpected references on the page after + * unmapping succeeded. After fully unmapped, no + * further GUP references (FOLL_GET and FOLL_PIN) can + * appear, so dropping the exclusive marker and mapping + * it only R/O is fine. + */ + exclusive =3D false; + } + } + /* * Remove the swap entry and conditionally try to free up the swapcache. * We're already holding a reference on the page but haven't mapped it @@ -3738,11 +3784,12 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) pte =3D mk_pte(page, vma->vm_page_prot); =20 /* - * Same logic as in do_wp_page(); however, optimize for fresh pages - * that are certainly not shared because we just allocated them without - * exposing them to the swapcache. + * Same logic as in do_wp_page(); however, optimize for pages that are + * certainly not shared either because we just allocated them without + * exposing them to the swapcache or because the swap entry indicates + * exclusivity. */ - if (!PageKsm(page) && (page !=3D swapcache || page_count(page) =3D=3D 1))= { + if (!PageKsm(page) && (exclusive || page_count(page) =3D=3D 1)) { if (vmf->flags & FAULT_FLAG_WRITE) { pte =3D maybe_mkwrite(pte_mkdirty(pte), vma); vmf->flags &=3D ~FAULT_FLAG_WRITE; diff --git a/mm/rmap.c b/mm/rmap.c index 4de07234cbcf..c8c257d94962 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1656,14 +1656,15 @@ static bool try_to_unmap_one(struct folio *folio, s= truct vm_area_struct *vma, break; } /* - * Note: We *don't* remember yet if the page was mapped - * exclusively in the swap entry, so swapin code has - * to re-determine that manually and might detect the - * page as possibly shared, for example, if there are - * other references on the page or if the page is under - * writeback. We made sure that there are no GUP pins - * on the page that would rely on it, so for GUP pins - * this is fine. + * Note: We *don't* remember if the page was mapped + * exclusively in the swap pte if the architecture + * doesn't support __HAVE_ARCH_PTE_SWP_EXCLUSIVE. In + * that case, swapin code has to re-determine that + * manually and might detect the page as possibly + * shared, for example, if there are other references on + * the page or if the page is under writeback. We made + * sure that there are no GUP pins on the page that + * would rely on it, so for GUP pins this is fine. */ if (list_empty(&mm->mmlist)) { spin_lock(&mmlist_lock); @@ -1674,6 +1675,8 @@ static bool try_to_unmap_one(struct folio *folio, str= uct vm_area_struct *vma, dec_mm_counter(mm, MM_ANONPAGES); inc_mm_counter(mm, MM_SWAPENTS); swp_pte =3D swp_entry_to_pte(entry); + if (anon_exclusive) + swp_pte =3D pte_swp_mkexclusive(swp_pte); if (pte_soft_dirty(pteval)) swp_pte =3D pte_swp_mksoft_dirty(swp_pte); if (pte_uffd_wp(pteval)) diff --git a/mm/swapfile.c b/mm/swapfile.c index a7847324d476..7279b2d2d71d 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -1804,7 +1804,18 @@ static int unuse_pte(struct vm_area_struct *vma, pmd= _t *pmd, inc_mm_counter(vma->vm_mm, MM_ANONPAGES); get_page(page); if (page =3D=3D swapcache) { - page_add_anon_rmap(page, vma, addr, RMAP_NONE); + rmap_t rmap_flags =3D RMAP_NONE; + + /* + * See do_swap_page(): PageWriteback() would be problematic. + * However, we do a wait_on_page_writeback() just before this + * call and have the page locked. + */ + VM_BUG_ON_PAGE(PageWriteback(page), page); + if (pte_swp_exclusive(*pte)) + rmap_flags |=3D RMAP_EXCLUSIVE; + + page_add_anon_rmap(page, vma, addr, rmap_flags); } else { /* ksm created a completely new copy */ page_add_new_anon_rmap(page, vma, addr); lru_cache_add_inactive_or_unevictable(page, vma); --=20 2.35.1 From nobody Sun Jun 21 10:07:16 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D0B7CC433EF for ; Tue, 29 Mar 2022 16:46:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239832AbiC2Qs3 (ORCPT ); Tue, 29 Mar 2022 12:48:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53930 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239775AbiC2QsX (ORCPT ); Tue, 29 Mar 2022 12:48:23 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 2151D6211F for ; Tue, 29 Mar 2022 09:46:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1648572400; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=jc+cEdEl7LGBIs1sXVyEIKh1m/fhthwqXZBcWFtK9j4=; b=dFKv8dbEwXMEPfy1kBM7zpYImT7ijIWdc3fRf261yUSXk9rwAkAaktviDWvWame0CFDRc8 Tu70E1NDEHQudsNrRlT03QkRgzdWbZRg+bnI3D9NP9ZEwmG6q0HXZSHOUEwoWxo/VJO0hY BMhl26Wv2J2N15T8mf/PdWcqnTQ5KLs= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-178-UimuhVBmOH6cfHeWB8eWKg-1; Tue, 29 Mar 2022 12:46:36 -0400 X-MC-Unique: UimuhVBmOH6cfHeWB8eWKg-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 02E0C3C11C82; Tue, 29 Mar 2022 16:46:19 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.194.134]) by smtp.corp.redhat.com (Postfix) with ESMTP id 20837112132D; Tue, 29 Mar 2022 16:45:27 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: Andrew Morton , Hugh Dickins , Linus Torvalds , David Rientjes , Shakeel Butt , John Hubbard , Jason Gunthorpe , Mike Kravetz , Mike Rapoport , Yang Shi , "Kirill A . Shutemov" , Matthew Wilcox , Vlastimil Babka , Jann Horn , Michal Hocko , Nadav Amit , Rik van Riel , Roman Gushchin , Andrea Arcangeli , Peter Xu , Donald Dutile , Christoph Hellwig , Oleg Nesterov , Jan Kara , Liang Zhang , Pedro Gomes , Oded Gabbay , Catalin Marinas , Will Deacon , Michael Ellerman , Benjamin Herrenschmidt , Paul Mackerras , Heiko Carstens , Vasily Gorbik , Alexander Gordeev , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , Gerald Schaefer , linux-mm@kvack.org, x86@kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, David Hildenbrand Subject: [PATCH v2 2/8] mm/debug_vm_pgtable: add tests for __HAVE_ARCH_PTE_SWP_EXCLUSIVE Date: Tue, 29 Mar 2022 18:43:23 +0200 Message-Id: <20220329164329.208407-3-david@redhat.com> In-Reply-To: <20220329164329.208407-1-david@redhat.com> References: <20220329164329.208407-1-david@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 2.78 on 10.11.54.3 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Let's test that __HAVE_ARCH_PTE_SWP_EXCLUSIVE works as expected. Signed-off-by: David Hildenbrand Acked-by: Vlastimil Babka --- mm/debug_vm_pgtable.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c index db2abd9e415b..55f1a8dc716f 100644 --- a/mm/debug_vm_pgtable.c +++ b/mm/debug_vm_pgtable.c @@ -837,6 +837,19 @@ static void __init pmd_soft_dirty_tests(struct pgtable= _debug_args *args) { } static void __init pmd_swap_soft_dirty_tests(struct pgtable_debug_args *ar= gs) { } #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ =20 +static void __init pte_swap_exclusive_tests(struct pgtable_debug_args *arg= s) +{ +#ifdef __HAVE_ARCH_PTE_SWP_EXCLUSIVE + pte_t pte =3D pfn_pte(args->fixed_pte_pfn, args->page_prot); + + pr_debug("Validating PTE swap exclusive\n"); + pte =3D pte_swp_mkexclusive(pte); + WARN_ON(!pte_swp_exclusive(pte)); + pte =3D pte_swp_clear_exclusive(pte); + WARN_ON(pte_swp_exclusive(pte)); +#endif /* __HAVE_ARCH_PTE_SWP_EXCLUSIVE */ +} + static void __init pte_swap_tests(struct pgtable_debug_args *args) { swp_entry_t swp; @@ -1288,6 +1301,8 @@ static int __init debug_vm_pgtable(void) pte_swap_soft_dirty_tests(&args); pmd_swap_soft_dirty_tests(&args); =20 + pte_swap_exclusive_tests(&args); + pte_swap_tests(&args); pmd_swap_tests(&args); =20 --=20 2.35.1 From nobody Sun Jun 21 10:07:16 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8B71CC433F5 for ; Tue, 29 Mar 2022 16:47:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239845AbiC2Qsm (ORCPT ); Tue, 29 Mar 2022 12:48:42 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55162 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239799AbiC2Qsi (ORCPT ); Tue, 29 Mar 2022 12:48:38 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 74BB4E33AE for ; Tue, 29 Mar 2022 09:46:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1648572414; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=qCOLd7ScB3TWjSGdY4CbToIDZJ4hYbd0NeSKJa+Yxms=; b=a0f++STMOOv+ZcXg2j47wSQnvZ6anv4Ur4QWloYRXjJ3MWaF8H5Lc6/Vd/83K4B861SCOe Y+N0JG0bm7+tCp//T2VOp1IbwE8xHw3I01T+v1adcmWnQn7m7PfDbLiAkJnJ2IiAoud/Al pzX+pDJdy7jliqmpgoZYycGaBg5BTi8= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-498-ujdDMkxTNKC-pLfA0LZt3w-1; Tue, 29 Mar 2022 12:46:52 -0400 X-MC-Unique: ujdDMkxTNKC-pLfA0LZt3w-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id F395D804198; Tue, 29 Mar 2022 16:46:49 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.194.134]) by smtp.corp.redhat.com (Postfix) with ESMTP id 8487B111F3B0; Tue, 29 Mar 2022 16:46:18 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: Andrew Morton , Hugh Dickins , Linus Torvalds , David Rientjes , Shakeel Butt , John Hubbard , Jason Gunthorpe , Mike Kravetz , Mike Rapoport , Yang Shi , "Kirill A . Shutemov" , Matthew Wilcox , Vlastimil Babka , Jann Horn , Michal Hocko , Nadav Amit , Rik van Riel , Roman Gushchin , Andrea Arcangeli , Peter Xu , Donald Dutile , Christoph Hellwig , Oleg Nesterov , Jan Kara , Liang Zhang , Pedro Gomes , Oded Gabbay , Catalin Marinas , Will Deacon , Michael Ellerman , Benjamin Herrenschmidt , Paul Mackerras , Heiko Carstens , Vasily Gorbik , Alexander Gordeev , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , Gerald Schaefer , linux-mm@kvack.org, x86@kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, David Hildenbrand Subject: [PATCH v2 3/8] x86/pgtable: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE Date: Tue, 29 Mar 2022 18:43:24 +0200 Message-Id: <20220329164329.208407-4-david@redhat.com> In-Reply-To: <20220329164329.208407-1-david@redhat.com> References: <20220329164329.208407-1-david@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 2.78 on 10.11.54.3 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Let's use bit 3 to remember PG_anon_exclusive in swap ptes. Signed-off-by: David Hildenbrand --- arch/x86/include/asm/pgtable.h | 16 ++++++++++++++++ arch/x86/include/asm/pgtable_64.h | 4 +++- arch/x86/include/asm/pgtable_types.h | 5 +++++ 3 files changed, 24 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index 62ab07e24aef..e42e668153e9 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -1292,6 +1292,22 @@ static inline void update_mmu_cache_pud(struct vm_ar= ea_struct *vma, { } =20 +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + return pte_set_flags(pte, _PAGE_SWP_EXCLUSIVE); +} + +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_flags(pte) & _PAGE_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + return pte_clear_flags(pte, _PAGE_SWP_EXCLUSIVE); +} + #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY static inline pte_t pte_swp_mksoft_dirty(pte_t pte) { diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtab= le_64.h index 56d0399a0cd1..e479491da8d5 100644 --- a/arch/x86/include/asm/pgtable_64.h +++ b/arch/x86/include/asm/pgtable_64.h @@ -186,7 +186,7 @@ static inline void native_pgd_clear(pgd_t *pgd) * * | ... | 11| 10| 9|8|7|6|5| 4| 3|2| 1|0| <- bit number * | ... |SW3|SW2|SW1|G|L|D|A|CD|WT|U| W|P| <- bit names - * | TYPE (59-63) | ~OFFSET (9-58) |0|0|X|X| X| X|F|SD|0| <- swp entry + * | TYPE (59-63) | ~OFFSET (9-58) |0|0|X|X| X| E|F|SD|0| <- swp entry * * G (8) is aliased and used as a PROT_NONE indicator for * !present ptes. We need to start storing swap entries above @@ -203,6 +203,8 @@ static inline void native_pgd_clear(pgd_t *pgd) * F (2) in swp entry is used to record when a pagetable is * writeprotected by userfaultfd WP support. * + * E (3) in swp entry is used to rememeber PG_anon_exclusive. + * * Bit 7 in swp entry should be 0 because pmd_present checks not only P, * but also L and G. * diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pg= table_types.h index 40497a9020c6..54a8f370046d 100644 --- a/arch/x86/include/asm/pgtable_types.h +++ b/arch/x86/include/asm/pgtable_types.h @@ -83,6 +83,11 @@ #define _PAGE_SOFT_DIRTY (_AT(pteval_t, 0)) #endif =20 +/* + * We borrow bit 3 to remember PG_anon_exclusive. + */ +#define _PAGE_SWP_EXCLUSIVE _PAGE_PWT + /* * Tracking soft dirty bit when a page goes to a swap is tricky. * We need a bit which can be stored in pte _and_ not conflict --=20 2.35.1 From nobody Sun Jun 21 10:07:16 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9409AC4332F for ; Tue, 29 Mar 2022 16:48:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239877AbiC2Qty (ORCPT ); Tue, 29 Mar 2022 12:49:54 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60650 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232641AbiC2Qtv (ORCPT ); Tue, 29 Mar 2022 12:49:51 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 7298C1EF5EF for ; Tue, 29 Mar 2022 09:48:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1648572487; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=UamAqAXbUMgJ82o9Zip1ZNkayiY40lIOnL1ikWZpgY8=; b=fij4pOGoM7R4KTXF20XM0vKVX7pSLAvPYbD0zBf2jjp9DkcCJZEapGpjvtRfCCLjq0vrrp oMSd++XoWqeyFupunX7ZAWwSzztNLE6WTprS3XC8nSThcVXSOuxe4Gm2CJaLmpnDKSvOa/ /q730gCFyoys3Va+CJVf9flsWG8cjys= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-553-mI4tm1dZOHGbCA7GF14ToA-1; Tue, 29 Mar 2022 12:47:35 -0400 X-MC-Unique: mI4tm1dZOHGbCA7GF14ToA-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 4B64C3C13A08; Tue, 29 Mar 2022 16:47:33 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.194.134]) by smtp.corp.redhat.com (Postfix) with ESMTP id 5D5041121330; Tue, 29 Mar 2022 16:46:50 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: Andrew Morton , Hugh Dickins , Linus Torvalds , David Rientjes , Shakeel Butt , John Hubbard , Jason Gunthorpe , Mike Kravetz , Mike Rapoport , Yang Shi , "Kirill A . Shutemov" , Matthew Wilcox , Vlastimil Babka , Jann Horn , Michal Hocko , Nadav Amit , Rik van Riel , Roman Gushchin , Andrea Arcangeli , Peter Xu , Donald Dutile , Christoph Hellwig , Oleg Nesterov , Jan Kara , Liang Zhang , Pedro Gomes , Oded Gabbay , Catalin Marinas , Will Deacon , Michael Ellerman , Benjamin Herrenschmidt , Paul Mackerras , Heiko Carstens , Vasily Gorbik , Alexander Gordeev , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , Gerald Schaefer , linux-mm@kvack.org, x86@kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, David Hildenbrand Subject: [PATCH v2 4/8] arm64/pgtable: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE Date: Tue, 29 Mar 2022 18:43:25 +0200 Message-Id: <20220329164329.208407-5-david@redhat.com> In-Reply-To: <20220329164329.208407-1-david@redhat.com> References: <20220329164329.208407-1-david@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 2.78 on 10.11.54.3 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Let's use one of the type bits: core-mm only supports 5, so there is no need to consume 6. Note that we might be able to reuse bit 1, but reusing bit 1 turned out problematic in the past for PROT_NONE handling; so let's play safe and use another bit. Reviewed-by: Catalin Marinas Signed-off-by: David Hildenbrand --- arch/arm64/include/asm/pgtable-prot.h | 1 + arch/arm64/include/asm/pgtable.h | 23 ++++++++++++++++++++--- 2 files changed, 21 insertions(+), 3 deletions(-) diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm= /pgtable-prot.h index b1e1b74d993c..62e0ebeed720 100644 --- a/arch/arm64/include/asm/pgtable-prot.h +++ b/arch/arm64/include/asm/pgtable-prot.h @@ -14,6 +14,7 @@ * Software defined PTE bits definition. */ #define PTE_WRITE (PTE_DBM) /* same as DBM (51) */ +#define PTE_SWP_EXCLUSIVE (_AT(pteval_t, 1) << 2) /* only for swp ptes */ #define PTE_DIRTY (_AT(pteval_t, 1) << 55) #define PTE_SPECIAL (_AT(pteval_t, 1) << 56) #define PTE_DEVMAP (_AT(pteval_t, 1) << 57) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgta= ble.h index 94e147e5456c..ad9b221963d4 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -402,6 +402,22 @@ static inline pgprot_t mk_pmd_sect_prot(pgprot_t prot) return __pgprot((pgprot_val(prot) & ~PMD_TABLE_BIT) | PMD_TYPE_SECT); } =20 +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + return set_pte_bit(pte, __pgprot(PTE_SWP_EXCLUSIVE)); +} + +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & PTE_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + return clear_pte_bit(pte, __pgprot(PTE_SWP_EXCLUSIVE)); +} + #ifdef CONFIG_NUMA_BALANCING /* * See the comment in include/linux/pgtable.h @@ -909,12 +925,13 @@ static inline pmd_t pmdp_establish(struct vm_area_str= uct *vma, /* * Encode and decode a swap entry: * bits 0-1: present (must be zero) - * bits 2-7: swap type + * bits 2: remember PG_anon_exclusive + * bits 3-7: swap type * bits 8-57: swap offset * bit 58: PTE_PROT_NONE (must be zero) */ -#define __SWP_TYPE_SHIFT 2 -#define __SWP_TYPE_BITS 6 +#define __SWP_TYPE_SHIFT 3 +#define __SWP_TYPE_BITS 5 #define __SWP_OFFSET_BITS 50 #define __SWP_TYPE_MASK ((1 << __SWP_TYPE_BITS) - 1) #define __SWP_OFFSET_SHIFT (__SWP_TYPE_BITS + __SWP_TYPE_SHIFT) --=20 2.35.1 From nobody Sun Jun 21 10:07:16 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 41635C433EF for ; Tue, 29 Mar 2022 16:48:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239902AbiC2Qu3 (ORCPT ); Tue, 29 Mar 2022 12:50:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34466 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238991AbiC2Qu0 (ORCPT ); Tue, 29 Mar 2022 12:50:26 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 58DC8BC30 for ; Tue, 29 Mar 2022 09:48:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1648572520; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=TxaQVn2NE+iTp5vXuPi54qPiuWLB/mqPzeAeVN+34/k=; b=h2kPPwTyjFki9EukskbFzAvQi59y2Oxld29YU8JsbM2kwOH7QnE/AHj9RG5/YUJdNrAoiV gTfL0qphMreNzo6g52KHvohiQX3v6/WjuEcmOTEtqiEdQnYHts80eT+63pItvFZRGT+B4Y H9UjImJIuxP/xDaHamWTSolZDcLCbbw= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-58-Xxnwvdp8Oy6WfSdhCutofQ-1; Tue, 29 Mar 2022 12:48:33 -0400 X-MC-Unique: Xxnwvdp8Oy6WfSdhCutofQ-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id DE8122800947; Tue, 29 Mar 2022 16:48:31 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.194.134]) by smtp.corp.redhat.com (Postfix) with ESMTP id AA9B01121330; Tue, 29 Mar 2022 16:47:33 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: Andrew Morton , Hugh Dickins , Linus Torvalds , David Rientjes , Shakeel Butt , John Hubbard , Jason Gunthorpe , Mike Kravetz , Mike Rapoport , Yang Shi , "Kirill A . Shutemov" , Matthew Wilcox , Vlastimil Babka , Jann Horn , Michal Hocko , Nadav Amit , Rik van Riel , Roman Gushchin , Andrea Arcangeli , Peter Xu , Donald Dutile , Christoph Hellwig , Oleg Nesterov , Jan Kara , Liang Zhang , Pedro Gomes , Oded Gabbay , Catalin Marinas , Will Deacon , Michael Ellerman , Benjamin Herrenschmidt , Paul Mackerras , Heiko Carstens , Vasily Gorbik , Alexander Gordeev , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , Gerald Schaefer , linux-mm@kvack.org, x86@kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, David Hildenbrand Subject: [PATCH v2 5/8] s390/pgtable: cleanup description of swp pte layout Date: Tue, 29 Mar 2022 18:43:26 +0200 Message-Id: <20220329164329.208407-6-david@redhat.com> In-Reply-To: <20220329164329.208407-1-david@redhat.com> References: <20220329164329.208407-1-david@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 2.78 on 10.11.54.3 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Bit 52 and bit 55 don't have to be zero: they only trigger a translation-specifiation exception if the PTE is marked as valid, which is not the case for swap ptes. Document which bits are used for what, and which ones are unused. Signed-off-by: David Hildenbrand --- arch/s390/include/asm/pgtable.h | 17 ++++++++--------- 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtabl= e.h index 9df679152620..3982575bb586 100644 --- a/arch/s390/include/asm/pgtable.h +++ b/arch/s390/include/asm/pgtable.h @@ -1712,18 +1712,17 @@ static inline int has_transparent_hugepage(void) /* * 64 bit swap entry format: * A page-table entry has some bits we have to treat in a special way. - * Bits 52 and bit 55 have to be zero, otherwise a specification - * exception will occur instead of a page translation exception. The - * specification exception has the bad habit not to store necessary - * information in the lowcore. - * Bits 54 and 63 are used to indicate the page type. + * Bits 54 and 63 are used to indicate the page type. Bit 53 marks the pte + * as invalid. * A swap pte is indicated by bit pattern (pte & 0x201) =3D=3D 0x200 - * This leaves the bits 0-51 and bits 56-62 to store type and offset. - * We use the 5 bits from 57-61 for the type and the 52 bits from 0-51 - * for the offset. - * | offset |01100|type |00| + * | offset |X11XX|type |S0| * |0000000000111111111122222222223333333333444444444455|55555|55566|66| * |0123456789012345678901234567890123456789012345678901|23456|78901|23| + * + * Bits 0-51 store the offset. + * Bits 57-61 store the type. + * Bit 62 (S) is used for softdirty tracking. + * Bits 52, 55 and 56 (X) are unused. */ =20 #define __SWP_OFFSET_MASK ((1UL << 52) - 1) --=20 2.35.1 From nobody Sun Jun 21 10:07:16 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 673E4C433FE for ; Tue, 29 Mar 2022 16:49:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239910AbiC2QvQ (ORCPT ); Tue, 29 Mar 2022 12:51:16 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37686 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S238991AbiC2QvO (ORCPT ); Tue, 29 Mar 2022 12:51:14 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 948D5CF4AD for ; Tue, 29 Mar 2022 09:49:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1648572570; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=sC+ymRkt7JVDUa+jsWm6tHAksM2n+k+zaIL6QWBa32I=; b=Bm2RHli0iCGxiPAKcT5SdZnhs9TiP+EITXkK49j/50xYIcS3NNAaxY/wTU/73T4IdCG2d0 o9XsMKezQkde1BKvwSlg/12ruc1CfBAmPlecpqALMQ4AFQ2oz/OEgKw3dAbUm+3EDccdzc ppMbTrs6mdi12GSy7gFv7VJsaJRHSEg= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-41-roYzQxbCMv-hrUjMfAj--g-1; Tue, 29 Mar 2022 12:49:29 -0400 X-MC-Unique: roYzQxbCMv-hrUjMfAj--g-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 5ED81811E9B; Tue, 29 Mar 2022 16:49:27 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.194.134]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3BA641121314; Tue, 29 Mar 2022 16:48:32 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: Andrew Morton , Hugh Dickins , Linus Torvalds , David Rientjes , Shakeel Butt , John Hubbard , Jason Gunthorpe , Mike Kravetz , Mike Rapoport , Yang Shi , "Kirill A . Shutemov" , Matthew Wilcox , Vlastimil Babka , Jann Horn , Michal Hocko , Nadav Amit , Rik van Riel , Roman Gushchin , Andrea Arcangeli , Peter Xu , Donald Dutile , Christoph Hellwig , Oleg Nesterov , Jan Kara , Liang Zhang , Pedro Gomes , Oded Gabbay , Catalin Marinas , Will Deacon , Michael Ellerman , Benjamin Herrenschmidt , Paul Mackerras , Heiko Carstens , Vasily Gorbik , Alexander Gordeev , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , Gerald Schaefer , linux-mm@kvack.org, x86@kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, David Hildenbrand Subject: [PATCH v2 6/8] s390/pgtable: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE Date: Tue, 29 Mar 2022 18:43:27 +0200 Message-Id: <20220329164329.208407-7-david@redhat.com> In-Reply-To: <20220329164329.208407-1-david@redhat.com> References: <20220329164329.208407-1-david@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 2.78 on 10.11.54.3 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Let's use bit 52, which is unused. Signed-off-by: David Hildenbrand --- arch/s390/include/asm/pgtable.h | 23 +++++++++++++++++++++-- 1 file changed, 21 insertions(+), 2 deletions(-) diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtabl= e.h index 3982575bb586..a397b072a580 100644 --- a/arch/s390/include/asm/pgtable.h +++ b/arch/s390/include/asm/pgtable.h @@ -181,6 +181,8 @@ static inline int is_module_addr(void *addr) #define _PAGE_SOFT_DIRTY 0x000 #endif =20 +#define _PAGE_SWP_EXCLUSIVE _PAGE_LARGE /* SW pte exclusive swap bit */ + /* Set of bits not changed in pte_modify */ #define _PAGE_CHG_MASK (PAGE_MASK | _PAGE_SPECIAL | _PAGE_DIRTY | \ _PAGE_YOUNG | _PAGE_SOFT_DIRTY) @@ -826,6 +828,22 @@ static inline int pmd_protnone(pmd_t pmd) } #endif =20 +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline int pte_swp_exclusive(pte_t pte) +{ + return pte_val(pte) & _PAGE_SWP_EXCLUSIVE; +} + +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + return set_pte_bit(pte, __pgprot(_PAGE_SWP_EXCLUSIVE)); +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + return clear_pte_bit(pte, __pgprot(_PAGE_SWP_EXCLUSIVE)); +} + static inline int pte_soft_dirty(pte_t pte) { return pte_val(pte) & _PAGE_SOFT_DIRTY; @@ -1715,14 +1733,15 @@ static inline int has_transparent_hugepage(void) * Bits 54 and 63 are used to indicate the page type. Bit 53 marks the pte * as invalid. * A swap pte is indicated by bit pattern (pte & 0x201) =3D=3D 0x200 - * | offset |X11XX|type |S0| + * | offset |E11XX|type |S0| * |0000000000111111111122222222223333333333444444444455|55555|55566|66| * |0123456789012345678901234567890123456789012345678901|23456|78901|23| * * Bits 0-51 store the offset. + * Bit 52 (E) is used to remember PG_anon_exclusive. * Bits 57-61 store the type. * Bit 62 (S) is used for softdirty tracking. - * Bits 52, 55 and 56 (X) are unused. + * Bits 55 and 56 (X) are unused. */ =20 #define __SWP_OFFSET_MASK ((1UL << 52) - 1) --=20 2.35.1 From nobody Sun Jun 21 10:07:16 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A4F2DC433F5 for ; Tue, 29 Mar 2022 16:50:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234382AbiC2QwA (ORCPT ); Tue, 29 Mar 2022 12:52:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40950 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239915AbiC2Qv5 (ORCPT ); Tue, 29 Mar 2022 12:51:57 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 509A72527B1 for ; Tue, 29 Mar 2022 09:50:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1648572613; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=bMfZOVQVS5I2zZPcY9Fdh37ujSfx4fPUK1QRVEaCA5k=; b=IQsrZ4Kw+TkBVsrNIwn344Cw/0fK62+C3VUJplg8lEt66504Fb+E5iAaw0gKG9eyLCs/VP JLQGRZUhubt4Ka9aL6uf/IVJ6HASqP3RN1MQPo8qejogs+qyXXj4Xh0yjbZYnRcrHGEXPi y0PV813Wg2nY4bSEnCgZ8cK8QJkd8vc= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-57-w3EaO9epNH-b7CkKN7fb2w-1; Tue, 29 Mar 2022 12:50:09 -0400 X-MC-Unique: w3EaO9epNH-b7CkKN7fb2w-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 719C31C1782F; Tue, 29 Mar 2022 16:50:07 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.194.134]) by smtp.corp.redhat.com (Postfix) with ESMTP id BC9701121314; Tue, 29 Mar 2022 16:49:27 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: Andrew Morton , Hugh Dickins , Linus Torvalds , David Rientjes , Shakeel Butt , John Hubbard , Jason Gunthorpe , Mike Kravetz , Mike Rapoport , Yang Shi , "Kirill A . Shutemov" , Matthew Wilcox , Vlastimil Babka , Jann Horn , Michal Hocko , Nadav Amit , Rik van Riel , Roman Gushchin , Andrea Arcangeli , Peter Xu , Donald Dutile , Christoph Hellwig , Oleg Nesterov , Jan Kara , Liang Zhang , Pedro Gomes , Oded Gabbay , Catalin Marinas , Will Deacon , Michael Ellerman , Benjamin Herrenschmidt , Paul Mackerras , Heiko Carstens , Vasily Gorbik , Alexander Gordeev , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , Gerald Schaefer , linux-mm@kvack.org, x86@kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, David Hildenbrand Subject: [PATCH v2 7/8] powerpc/pgtable: remove _PAGE_BIT_SWAP_TYPE for book3s Date: Tue, 29 Mar 2022 18:43:28 +0200 Message-Id: <20220329164329.208407-8-david@redhat.com> In-Reply-To: <20220329164329.208407-1-david@redhat.com> References: <20220329164329.208407-1-david@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 2.78 on 10.11.54.3 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" The swap type is simply stored in bits 0x1f of the swap pte. Let's simplify by just getting rid of _PAGE_BIT_SWAP_TYPE. It's not like that we can simply change it: _PAGE_SWP_SOFT_DIRTY would suddenly fall into _RPAGE_RSV1, which isn't possible and would make the BUILD_BUG_ON(_PAGE_HPTEFLAGS & _PAGE_SWP_SOFT_DIRTY) angry. While at it, make it clearer which bit we're actually using for _PAGE_SWP_SOFT_DIRTY by just using the proper define and introduce and use SWP_TYPE_MASK. Signed-off-by: David Hildenbrand --- arch/powerpc/include/asm/book3s/64/pgtable.h | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/in= clude/asm/book3s/64/pgtable.h index 875730d5af40..8e98375d5c4a 100644 --- a/arch/powerpc/include/asm/book3s/64/pgtable.h +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h @@ -13,7 +13,6 @@ /* * Common bits between hash and Radix page table */ -#define _PAGE_BIT_SWAP_TYPE 0 =20 #define _PAGE_EXEC 0x00001 /* execute permission */ #define _PAGE_WRITE 0x00002 /* write access allowed */ @@ -751,17 +750,16 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t ne= wprot) * Don't have overlapping bits with _PAGE_HPTEFLAGS \ * We filter HPTEFLAGS on set_pte. \ */ \ - BUILD_BUG_ON(_PAGE_HPTEFLAGS & (0x1f << _PAGE_BIT_SWAP_TYPE)); \ + BUILD_BUG_ON(_PAGE_HPTEFLAGS & SWP_TYPE_MASK); \ BUILD_BUG_ON(_PAGE_HPTEFLAGS & _PAGE_SWP_SOFT_DIRTY); \ } while (0) =20 #define SWP_TYPE_BITS 5 -#define __swp_type(x) (((x).val >> _PAGE_BIT_SWAP_TYPE) \ - & ((1UL << SWP_TYPE_BITS) - 1)) +#define SWP_TYPE_MASK ((1UL << SWP_TYPE_BITS) - 1) +#define __swp_type(x) ((x).val & SWP_TYPE_MASK) #define __swp_offset(x) (((x).val & PTE_RPN_MASK) >> PAGE_SHIFT) #define __swp_entry(type, offset) ((swp_entry_t) { \ - ((type) << _PAGE_BIT_SWAP_TYPE) \ - | (((offset) << PAGE_SHIFT) & PTE_RPN_MASK)}) + (type) | (((offset) << PAGE_SHIFT) & PTE_RPN_MASK)}) /* * swp_entry_t must be independent of pte bits. We build a swp_entry_t from * swap type and offset we get from swap and convert that to pte to find a @@ -774,7 +772,7 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newp= rot) #define __swp_entry_to_pmd(x) (pte_pmd(__swp_entry_to_pte(x))) =20 #ifdef CONFIG_MEM_SOFT_DIRTY -#define _PAGE_SWP_SOFT_DIRTY (1UL << (SWP_TYPE_BITS + _PAGE_BIT_SWAP_TYP= E)) +#define _PAGE_SWP_SOFT_DIRTY _PAGE_NON_IDEMPOTENT #else #define _PAGE_SWP_SOFT_DIRTY 0UL #endif /* CONFIG_MEM_SOFT_DIRTY */ --=20 2.35.1 From nobody Sun Jun 21 10:07:16 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 62246C433FE for ; Tue, 29 Mar 2022 16:50:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239940AbiC2QwJ (ORCPT ); Tue, 29 Mar 2022 12:52:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41538 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239951AbiC2QwG (ORCPT ); Tue, 29 Mar 2022 12:52:06 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 7F7B5258FCF for ; Tue, 29 Mar 2022 09:50:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1648572621; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=TwO5MaKuynUSMbsOinBzVosboPLVHwPyacLiL1/kyJE=; b=GDLSsqbtQDhsuxqqXQFRQdezY1YgYbMjOTZN5kjiNKYvYQMhW14GOMb2NJqk0Yg8vJN3Jr wyLmO3q7srVxVJNj50837qmmML1CZi9Ao1dEcWySFtrT5wWTisYbW8hwDwISALJ/U8gDR3 Cpu2fL3AqReqhJ5NSjXuFdbtOMPFYzk= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-344-ZGoj0HAdOw6IJYJ1nLpi-Q-1; Tue, 29 Mar 2022 12:50:17 -0400 X-MC-Unique: ZGoj0HAdOw6IJYJ1nLpi-Q-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id AFE7D29AA2F0; Tue, 29 Mar 2022 16:50:15 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.194.134]) by smtp.corp.redhat.com (Postfix) with ESMTP id CA3611121314; Tue, 29 Mar 2022 16:50:07 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: Andrew Morton , Hugh Dickins , Linus Torvalds , David Rientjes , Shakeel Butt , John Hubbard , Jason Gunthorpe , Mike Kravetz , Mike Rapoport , Yang Shi , "Kirill A . Shutemov" , Matthew Wilcox , Vlastimil Babka , Jann Horn , Michal Hocko , Nadav Amit , Rik van Riel , Roman Gushchin , Andrea Arcangeli , Peter Xu , Donald Dutile , Christoph Hellwig , Oleg Nesterov , Jan Kara , Liang Zhang , Pedro Gomes , Oded Gabbay , Catalin Marinas , Will Deacon , Michael Ellerman , Benjamin Herrenschmidt , Paul Mackerras , Heiko Carstens , Vasily Gorbik , Alexander Gordeev , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , Gerald Schaefer , linux-mm@kvack.org, x86@kernel.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, David Hildenbrand Subject: [PATCH v2 8/8] powerpc/pgtable: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE for book3s Date: Tue, 29 Mar 2022 18:43:29 +0200 Message-Id: <20220329164329.208407-9-david@redhat.com> In-Reply-To: <20220329164329.208407-1-david@redhat.com> References: <20220329164329.208407-1-david@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 2.78 on 10.11.54.3 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Right now, the last 5 bits (0x1f) of the swap entry are used for the type and the bit before that (0x20) is used for _PAGE_SWP_SOFT_DIRTY. We cannot use 0x40, as that collides with _RPAGE_RSV1 -- contained in _PAGE_HPTEFLAGS. The next candidate would be _RPAGE_SW3 (0x200) -- which is used for _PAGE_SOFT_DIRTY for !swp ptes. So let's just use _PAGE_SOFT_DIRTY for _PAGE_SWP_SOFT_DIRTY (to make it easier to grasp) and use 0x20 now for _PAGE_SWP_EXCLUSIVE. Signed-off-by: David Hildenbrand --- arch/powerpc/include/asm/book3s/64/pgtable.h | 21 +++++++++++++++++++- 1 file changed, 20 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/in= clude/asm/book3s/64/pgtable.h index 8e98375d5c4a..eecff2036869 100644 --- a/arch/powerpc/include/asm/book3s/64/pgtable.h +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h @@ -752,6 +752,7 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newp= rot) */ \ BUILD_BUG_ON(_PAGE_HPTEFLAGS & SWP_TYPE_MASK); \ BUILD_BUG_ON(_PAGE_HPTEFLAGS & _PAGE_SWP_SOFT_DIRTY); \ + BUILD_BUG_ON(_PAGE_HPTEFLAGS & _PAGE_SWP_EXCLUSIVE); \ } while (0) =20 #define SWP_TYPE_BITS 5 @@ -772,11 +773,13 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t ne= wprot) #define __swp_entry_to_pmd(x) (pte_pmd(__swp_entry_to_pte(x))) =20 #ifdef CONFIG_MEM_SOFT_DIRTY -#define _PAGE_SWP_SOFT_DIRTY _PAGE_NON_IDEMPOTENT +#define _PAGE_SWP_SOFT_DIRTY _PAGE_SOFT_DIRTY #else #define _PAGE_SWP_SOFT_DIRTY 0UL #endif /* CONFIG_MEM_SOFT_DIRTY */ =20 +#define _PAGE_SWP_EXCLUSIVE _PAGE_NON_IDEMPOTENT + #ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY static inline pte_t pte_swp_mksoft_dirty(pte_t pte) { @@ -794,6 +797,22 @@ static inline pte_t pte_swp_clear_soft_dirty(pte_t pte) } #endif /* CONFIG_HAVE_ARCH_SOFT_DIRTY */ =20 +#define __HAVE_ARCH_PTE_SWP_EXCLUSIVE +static inline pte_t pte_swp_mkexclusive(pte_t pte) +{ + return __pte_raw(pte_raw(pte) | cpu_to_be64(_PAGE_SWP_EXCLUSIVE)); +} + +static inline int pte_swp_exclusive(pte_t pte) +{ + return !!(pte_raw(pte) & cpu_to_be64(_PAGE_SWP_EXCLUSIVE)); +} + +static inline pte_t pte_swp_clear_exclusive(pte_t pte) +{ + return __pte_raw(pte_raw(pte) & cpu_to_be64(~_PAGE_SWP_EXCLUSIVE)); +} + static inline bool check_pte_access(unsigned long access, unsigned long pt= ev) { /* --=20 2.35.1