Date: Wed, 19 Jul 2023 22:47:43 -0000
From: "tip-bot2 for Rick Edgecombe"
Sender: tip-bot2@linutronix.de
Reply-to: linux-kernel@vger.kernel.org
To: linux-tip-commits@vger.kernel.org
Subject: [tip: x86/shstk] mm: Make pte_mkwrite() take a VMA
Cc: David Hildenbrand, Rick Edgecombe, Dave Hansen,
    "Mike Rapoport (IBM)", x86@kernel.org, linux-kernel@vger.kernel.org
MIME-Version: 1.0
Message-ID: <168980686308.28540.13139038542941174691.tip-bot2@tip-bot2>
Content-Type: text/plain; charset="utf-8"
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

The following commit has been merged into the x86/shstk branch of tip:

Commit-ID:     161e393c0f63592a3b95bdd8b55752653763fc6d
Gitweb:        https://git.kernel.org/tip/161e393c0f63592a3b95bdd8b55752653763fc6d
Author:        Rick Edgecombe
AuthorDate:    Mon, 12 Jun 2023 17:10:29 -07:00
Committer:     Rick Edgecombe
CommitterDate: Tue, 11 Jul 2023 14:12:15 -07:00

mm: Make pte_mkwrite() take a VMA

The x86 Shadow stack feature includes a new type of memory called shadow
stack. This shadow stack memory has some unusual properties, which require
some core mm changes to function properly.

One of these unusual properties is that shadow stack memory is writable,
but only in limited ways. These limits are applied via a specific PTE
bit combination. Nevertheless, the memory is writable, and core mm code
will need to apply the writable permissions in the typical paths that
call pte_mkwrite(). Future patches will make pte_mkwrite() take a VMA, so
that the x86 implementation of it can know whether to create regular
writable or shadow stack mappings.

But there are a couple of challenges to this. Modifying the signatures of
each arch pte_mkwrite() implementation would be error prone because some
are generated with macros and would need to be re-implemented.
Also, some pte_mkwrite() callers operate on kernel memory without a VMA.

So this can be done in a three-step process. First pte_mkwrite() can be
renamed to pte_mkwrite_novma() in each arch, with a generic pte_mkwrite()
added that just calls pte_mkwrite_novma(). Next callers without a VMA can
be moved to pte_mkwrite_novma(). And lastly, pte_mkwrite() and all callers
can be changed to take/pass a VMA.

Previous work renamed pte_mkwrite() to pte_mkwrite_novma() and converted
callers that don't have a VMA to use pte_mkwrite_novma(). So now change
pte_mkwrite() to take a VMA and change the remaining callers to pass a VMA.
Apply the same changes for pmd_mkwrite().

No functional change.

Suggested-by: David Hildenbrand
Signed-off-by: Rick Edgecombe
Signed-off-by: Dave Hansen
Reviewed-by: Mike Rapoport (IBM)
Acked-by: David Hildenbrand
Link: https://lore.kernel.org/all/20230613001108.3040476-4-rick.p.edgecombe%40intel.com
---
 Documentation/mm/arch_pgtable_helpers.rst |  6 ++++--
 include/linux/mm.h                        |  2 +-
 include/linux/pgtable.h                   |  4 ++--
 mm/debug_vm_pgtable.c                     | 12 ++++++------
 mm/huge_memory.c                          | 10 +++++-----
 mm/memory.c                               |  4 ++--
 mm/migrate.c                              |  2 +-
 mm/migrate_device.c                       |  2 +-
 mm/mprotect.c                             |  2 +-
 mm/userfaultfd.c                          |  2 +-
 10 files changed, 24 insertions(+), 22 deletions(-)

diff --git a/Documentation/mm/arch_pgtable_helpers.rst b/Documentation/mm/arch_pgtable_helpers.rst
index 69ce1f2..c82e3ee 100644
--- a/Documentation/mm/arch_pgtable_helpers.rst
+++ b/Documentation/mm/arch_pgtable_helpers.rst
@@ -46,7 +46,8 @@ PTE Page Table Helpers
 +---------------------------+--------------------------------------------------+
 | pte_mkclean               | Creates a clean PTE                              |
 +---------------------------+--------------------------------------------------+
-| pte_mkwrite               | Creates a writable PTE                           |
+| pte_mkwrite               | Creates a writable PTE of the type specified by  |
+|                           | the VMA.                                         |
 +---------------------------+--------------------------------------------------+
 | pte_mkwrite_novma         | Creates a writable PTE, of the conventional type |
 |                           | of writable.                                     |
@@ -121,7 +122,8 @@ PMD Page Table Helpers
 +---------------------------+--------------------------------------------------+
 | pmd_mkclean               | Creates a clean PMD                              |
 +---------------------------+--------------------------------------------------+
-| pmd_mkwrite               | Creates a writable PMD                           |
+| pmd_mkwrite               | Creates a writable PMD of the type specified by  |
+|                           | the VMA.                                         |
 +---------------------------+--------------------------------------------------+
 | pmd_mkwrite_novma         | Creates a writable PMD, of the conventional type |
 |                           | of writable.                                     |
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 2dd73e4..d40fa0f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1277,7 +1277,7 @@ void free_compound_page(struct page *page);
 static inline pte_t maybe_mkwrite(pte_t pte, struct vm_area_struct *vma)
 {
 	if (likely(vma->vm_flags & VM_WRITE))
-		pte = pte_mkwrite(pte);
+		pte = pte_mkwrite(pte, vma);
 	return pte;
 }
 
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index e6ea6e0..9462f4a 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -516,14 +516,14 @@ extern pud_t pudp_huge_clear_flush(struct vm_area_struct *vma,
 #endif
 
 #ifndef pte_mkwrite
-static inline pte_t pte_mkwrite(pte_t pte)
+static inline pte_t pte_mkwrite(pte_t pte, struct vm_area_struct *vma)
 {
 	return pte_mkwrite_novma(pte);
 }
 #endif
 
 #if defined(CONFIG_ARCH_WANT_PMD_MKWRITE) && !defined(pmd_mkwrite)
-static inline pmd_t pmd_mkwrite(pmd_t pmd)
+static inline pmd_t pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma)
 {
 	return pmd_mkwrite_novma(pmd);
 }
diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index ee119e3..b457ca1 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -109,10 +109,10 @@ static void __init pte_basic_tests(struct pgtable_debug_args *args, int idx)
 	WARN_ON(!pte_same(pte, pte));
 	WARN_ON(!pte_young(pte_mkyoung(pte_mkold(pte))));
 	WARN_ON(!pte_dirty(pte_mkdirty(pte_mkclean(pte))));
-	WARN_ON(!pte_write(pte_mkwrite(pte_wrprotect(pte))));
+	WARN_ON(!pte_write(pte_mkwrite(pte_wrprotect(pte), args->vma)));
 	WARN_ON(pte_young(pte_mkold(pte_mkyoung(pte))));
 	WARN_ON(pte_dirty(pte_mkclean(pte_mkdirty(pte))));
-	WARN_ON(pte_write(pte_wrprotect(pte_mkwrite(pte))));
+	WARN_ON(pte_write(pte_wrprotect(pte_mkwrite(pte, args->vma))));
 	WARN_ON(pte_dirty(pte_wrprotect(pte_mkclean(pte))));
 	WARN_ON(!pte_dirty(pte_wrprotect(pte_mkdirty(pte))));
 }
@@ -156,7 +156,7 @@ static void __init pte_advanced_tests(struct pgtable_debug_args *args)
 	pte = pte_mkclean(pte);
 	set_pte_at(args->mm, args->vaddr, args->ptep, pte);
 	flush_dcache_page(page);
-	pte = pte_mkwrite(pte);
+	pte = pte_mkwrite(pte, args->vma);
 	pte = pte_mkdirty(pte);
 	ptep_set_access_flags(args->vma, args->vaddr, args->ptep, pte, 1);
 	pte = ptep_get(args->ptep);
@@ -202,10 +202,10 @@ static void __init pmd_basic_tests(struct pgtable_debug_args *args, int idx)
 	WARN_ON(!pmd_same(pmd, pmd));
 	WARN_ON(!pmd_young(pmd_mkyoung(pmd_mkold(pmd))));
 	WARN_ON(!pmd_dirty(pmd_mkdirty(pmd_mkclean(pmd))));
-	WARN_ON(!pmd_write(pmd_mkwrite(pmd_wrprotect(pmd))));
+	WARN_ON(!pmd_write(pmd_mkwrite(pmd_wrprotect(pmd), args->vma)));
 	WARN_ON(pmd_young(pmd_mkold(pmd_mkyoung(pmd))));
 	WARN_ON(pmd_dirty(pmd_mkclean(pmd_mkdirty(pmd))));
-	WARN_ON(pmd_write(pmd_wrprotect(pmd_mkwrite(pmd))));
+	WARN_ON(pmd_write(pmd_wrprotect(pmd_mkwrite(pmd, args->vma))));
 	WARN_ON(pmd_dirty(pmd_wrprotect(pmd_mkclean(pmd))));
 	WARN_ON(!pmd_dirty(pmd_wrprotect(pmd_mkdirty(pmd))));
 	/*
@@ -256,7 +256,7 @@ static void __init pmd_advanced_tests(struct pgtable_debug_args *args)
 	pmd = pmd_mkclean(pmd);
 	set_pmd_at(args->mm, vaddr, args->pmdp, pmd);
 	flush_dcache_page(page);
-	pmd = pmd_mkwrite(pmd);
+	pmd = pmd_mkwrite(pmd, args->vma);
 	pmd = pmd_mkdirty(pmd);
 	pmdp_set_access_flags(args->vma, vaddr, args->pmdp, pmd, 1);
 	pmd = READ_ONCE(*args->pmdp);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index eb36783..23c2aa6 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -551,7 +551,7 @@ __setup("transparent_hugepage=", setup_transparent_hugepage);
 pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma)
 {
 	if (likely(vma->vm_flags & VM_WRITE))
-		pmd = pmd_mkwrite(pmd);
+		pmd = pmd_mkwrite(pmd, vma);
 	return pmd;
 }
 
@@ -1572,7 +1572,7 @@ out_map:
 	pmd = pmd_modify(oldpmd, vma->vm_page_prot);
 	pmd = pmd_mkyoung(pmd);
 	if (writable)
-		pmd = pmd_mkwrite(pmd);
+		pmd = pmd_mkwrite(pmd, vma);
 	set_pmd_at(vma->vm_mm, haddr, vmf->pmd, pmd);
 	update_mmu_cache_pmd(vma, vmf->address, vmf->pmd);
 	spin_unlock(vmf->ptl);
@@ -1925,7 +1925,7 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 	/* See change_pte_range(). */
 	if ((cp_flags & MM_CP_TRY_CHANGE_WRITABLE) && !pmd_write(entry) &&
 	    can_change_pmd_writable(vma, addr, entry))
-		entry = pmd_mkwrite(entry);
+		entry = pmd_mkwrite(entry, vma);
 
 	ret = HPAGE_PMD_NR;
 	set_pmd_at(mm, addr, pmd, entry);
@@ -2243,7 +2243,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 		} else {
 			entry = mk_pte(page + i, READ_ONCE(vma->vm_page_prot));
 			if (write)
-				entry = pte_mkwrite(entry);
+				entry = pte_mkwrite(entry, vma);
 			if (anon_exclusive)
 				SetPageAnonExclusive(page + i);
 			if (!young)
@@ -3287,7 +3287,7 @@ void remove_migration_pmd(struct page_vma_mapped_walk *pvmw, struct page *new)
 	if (pmd_swp_soft_dirty(*pvmw->pmd))
 		pmde = pmd_mksoft_dirty(pmde);
 	if (is_writable_migration_entry(entry))
-		pmde = pmd_mkwrite(pmde);
+		pmde = pmd_mkwrite(pmde, vma);
 	if (pmd_swp_uffd_wp(*pvmw->pmd))
 		pmde = pmd_mkuffd_wp(pmde);
 	if (!is_migration_entry_young(entry))
diff --git a/mm/memory.c b/mm/memory.c
index 01f39e8..f093c73 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4119,7 +4119,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 	entry = mk_pte(&folio->page, vma->vm_page_prot);
 	entry = pte_sw_mkyoung(entry);
 	if (vma->vm_flags & VM_WRITE)
-		entry = pte_mkwrite(pte_mkdirty(entry));
+		entry = pte_mkwrite(pte_mkdirty(entry), vma);
 
 	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address,
 			&vmf->ptl);
@@ -4808,7 +4808,7 @@ out_map:
 	pte = pte_modify(old_pte, vma->vm_page_prot);
 	pte = pte_mkyoung(pte);
 	if (writable)
-		pte = pte_mkwrite(pte);
+		pte = pte_mkwrite(pte, vma);
 	ptep_modify_prot_commit(vma, vmf->address, vmf->pte, old_pte, pte);
 	update_mmu_cache(vma, vmf->address, vmf->pte);
 	pte_unmap_unlock(vmf->pte, vmf->ptl);
diff --git a/mm/migrate.c b/mm/migrate.c
index 24baad2..18f58b7 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -220,7 +220,7 @@ static bool remove_migration_pte(struct folio *folio,
 		if (folio_test_dirty(folio) && is_migration_entry_dirty(entry))
 			pte = pte_mkdirty(pte);
 		if (is_writable_migration_entry(entry))
-			pte = pte_mkwrite(pte);
+			pte = pte_mkwrite(pte, vma);
 		else if (pte_swp_uffd_wp(old_pte))
 			pte = pte_mkuffd_wp(pte);
 
diff --git a/mm/migrate_device.c b/mm/migrate_device.c
index 8365158..df280aa 100644
--- a/mm/migrate_device.c
+++ b/mm/migrate_device.c
@@ -623,7 +623,7 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate,
 		}
 		entry = mk_pte(page, vma->vm_page_prot);
 		if (vma->vm_flags & VM_WRITE)
-			entry = pte_mkwrite(pte_mkdirty(entry));
+			entry = pte_mkwrite(pte_mkdirty(entry), vma);
 	}
 
 	ptep = pte_offset_map_lock(mm, pmdp, addr, &ptl);
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 6f658d4..b342e01 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -185,7 +185,7 @@ static long change_pte_range(struct mmu_gather *tlb,
 			if ((cp_flags & MM_CP_TRY_CHANGE_WRITABLE) &&
 			    !pte_write(ptent) &&
 			    can_change_pte_writable(vma, addr, ptent))
-				ptent = pte_mkwrite(ptent);
+				ptent = pte_mkwrite(ptent, vma);
 
 			ptep_modify_prot_commit(vma, addr, pte, oldpte, ptent);
 			if (pte_needs_flush(oldpte, ptent))
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index a2bf37e..b322ac5 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -72,7 +72,7 @@ int mfill_atomic_install_pte(pmd_t *dst_pmd,
 	if (page_in_cache && !vm_shared)
 		writable = false;
 	if (writable)
-		_dst_pte = pte_mkwrite(_dst_pte);
+		_dst_pte = pte_mkwrite(_dst_pte, dst_vma);
 	if (flags & MFILL_ATOMIC_WP)
 		_dst_pte = pte_mkuffd_wp(_dst_pte);
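
Illustration only, not part of the commit: the generic pte_mkwrite() in
include/linux/pgtable.h above ignores its new VMA argument, and the commit
message says an arch implementation can later use it to choose between
regular writable and shadow stack mappings. The userspace sketch below
mocks that idea under loud assumptions. The pte_t and vm_area_struct
types are simplified stand-ins, the MOCK_* bits, the flag values, and the
names generic_pte_mkwrite(), arch_pte_mkwrite() and
mock_pte_mkwrite_shstk() are hypothetical and do not reflect any real
PTE encoding or kernel implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel types (assumption, not kernel code). */
typedef struct { uint64_t val; } pte_t;
struct vm_area_struct { unsigned long vm_flags; };

/* Illustrative flag values; treat them as placeholders. */
#define VM_WRITE        0x2UL
#define VM_SHADOW_STACK 0x100UL

/* Hypothetical PTE bits, not a real hardware encoding. */
#define MOCK_PTE_RW     (1ULL << 1)
#define MOCK_PTE_SHSTK  (1ULL << 2)

/* Conventional writable PTE, usable by callers that have no VMA. */
static inline pte_t pte_mkwrite_novma(pte_t pte)
{
	pte.val |= MOCK_PTE_RW;
	return pte;
}

/* Hypothetical helper that marks a PTE as a shadow stack mapping. */
static inline pte_t mock_pte_mkwrite_shstk(pte_t pte)
{
	pte.val |= MOCK_PTE_SHSTK;
	return pte;
}

/* Mirrors the generic fallback in the patch: the VMA is accepted but unused. */
static inline pte_t generic_pte_mkwrite(pte_t pte, struct vm_area_struct *vma)
{
	(void)vma;
	return pte_mkwrite_novma(pte);
}

/* Sketch of an arch override that picks the writable type from the VMA. */
static inline pte_t arch_pte_mkwrite(pte_t pte, struct vm_area_struct *vma)
{
	if (vma->vm_flags & VM_SHADOW_STACK)
		return mock_pte_mkwrite_shstk(pte);
	return pte_mkwrite_novma(pte);
}

/* Small self-check showing how the two variants differ. */
void check_sketch(void)
{
	struct vm_area_struct normal = { .vm_flags = VM_WRITE };
	struct vm_area_struct shstk = { .vm_flags = VM_SHADOW_STACK };
	pte_t clean = { 0 };

	/* The generic version never looks at the VMA. */
	assert(generic_pte_mkwrite(clean, &shstk).val == MOCK_PTE_RW);
	/* The arch version branches on the VMA type. */
	assert(arch_pte_mkwrite(clean, &normal).val == MOCK_PTE_RW);
	assert(arch_pte_mkwrite(clean, &shstk).val == MOCK_PTE_SHSTK);
}
```

Callers such as maybe_mkwrite() stay unchanged in shape under this
pattern: they pass the VMA they already hold, and only the arch helper
decides which kind of writable entry to build.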