From: Pasha Tatashin
To: pasha.tatashin@soleen.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-doc@vger.kernel.org, akpm@linux-foundation.org, rientjes@google.com, pjt@google.com, weixugc@google.com, gthelen@google.com, mingo@redhat.com, corbet@lwn.net, will@kernel.org, rppt@kernel.org, keescook@chromium.org, tglx@linutronix.de, peterz@infradead.org, masahiroy@kernel.org, samitolvanen@google.com, dave.hansen@linux.intel.com, x86@kernel.org, frederic@kernel.org, hpa@zytor.com, aneesh.kumar@linux.ibm.com, jirislaby@kernel.org, songmuchun@bytedance.com, qydwhotmail@gmail.com, hughd@google.com
Subject: [PATCH v3 1/4] mm: change page type prior to adding page table entry
Date: Tue, 21 Dec 2021 15:46:47 +0000
Message-Id: <20211221154650.1047963-2-pasha.tatashin@soleen.com>
In-Reply-To: <20211221154650.1047963-1-pasha.tatashin@soleen.com>
References: <20211221154650.1047963-1-pasha.tatashin@soleen.com>

There are a few places where we
first update the entry in the user page table, and later change the struct
page to indicate that this is an anonymous or file page.

In most places, however, we first configure the page metadata and then
insert entries into the page table. Page table check will use the
information from struct page to verify the type of the entry being inserted.

Change the order in all places to first update struct page, and later to
update the page table.

This means that we first do calls that may change the type of page (anon or
file):

	page_move_anon_rmap
	page_add_anon_rmap
	do_page_add_anon_rmap
	page_add_new_anon_rmap
	page_add_file_rmap
	hugepage_add_anon_rmap
	hugepage_add_new_anon_rmap

And after that do calls that add entries to the page table:

	set_huge_pte_at
	set_pte_at

Signed-off-by: Pasha Tatashin
---
 mm/hugetlb.c  | 6 +++---
 mm/memory.c   | 9 +++++----
 mm/migrate.c  | 5 ++---
 mm/swapfile.c | 4 ++--
 4 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index a1baa198519a..61895cc01d09 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4684,8 +4684,8 @@ hugetlb_install_page(struct vm_area_struct *vma, pte_t *ptep, unsigned long addr
 		     struct page *new_page)
 {
 	__SetPageUptodate(new_page);
-	set_huge_pte_at(vma->vm_mm, addr, ptep, make_huge_pte(vma, new_page, 1));
 	hugepage_add_new_anon_rmap(new_page, vma, addr);
+	set_huge_pte_at(vma->vm_mm, addr, ptep, make_huge_pte(vma, new_page, 1));
 	hugetlb_count_add(pages_per_huge_page(hstate_vma(vma)), vma->vm_mm);
 	ClearHPageRestoreReserve(new_page);
 	SetHPageMigratable(new_page);
@@ -5259,10 +5259,10 @@ static vm_fault_t hugetlb_cow(struct mm_struct *mm, struct vm_area_struct *vma,
 		/* Break COW */
 		huge_ptep_clear_flush(vma, haddr, ptep);
 		mmu_notifier_invalidate_range(mm, range.start, range.end);
-		set_huge_pte_at(mm, haddr, ptep,
-				make_huge_pte(vma, new_page, 1));
 		page_remove_rmap(old_page, true);
 		hugepage_add_new_anon_rmap(new_page, vma, haddr);
+		set_huge_pte_at(mm, haddr, ptep,
+				make_huge_pte(vma, new_page, 1));
 		SetHPageMigratable(new_page);
 		/* Make the old page be freed below */
 		new_page = old_page;
diff --git a/mm/memory.c b/mm/memory.c
index 71e475d440b0..11cb28a2ca54 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -720,8 +720,6 @@ static void restore_exclusive_pte(struct vm_area_struct *vma,
 	else if (is_writable_device_exclusive_entry(entry))
 		pte = maybe_mkwrite(pte_mkdirty(pte), vma);
 
-	set_pte_at(vma->vm_mm, address, ptep, pte);
-
 	/*
 	 * No need to take a page reference as one was already
 	 * created when the swap entry was made.
@@ -735,6 +733,8 @@ static void restore_exclusive_pte(struct vm_area_struct *vma,
 	 */
 	WARN_ON_ONCE(!PageAnon(page));
 
+	set_pte_at(vma->vm_mm, address, ptep, pte);
+
 	if (vma->vm_flags & VM_LOCKED)
 		mlock_vma_page(page);
 
@@ -3635,8 +3635,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 		pte = pte_mkuffd_wp(pte);
 		pte = pte_wrprotect(pte);
 	}
-	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte);
-	arch_do_swap_page(vma->vm_mm, vma, vmf->address, pte, vmf->orig_pte);
 	vmf->orig_pte = pte;
 
 	/* ksm created a completely new copy */
@@ -3647,6 +3645,9 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 		do_page_add_anon_rmap(page, vma, vmf->address, exclusive);
 	}
 
+	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte);
+	arch_do_swap_page(vma->vm_mm, vma, vmf->address, pte, vmf->orig_pte);
+
 	swap_free(entry);
 	if (mem_cgroup_swap_full(page) ||
 	    (vma->vm_flags & VM_LOCKED) || PageMlocked(page))
diff --git a/mm/migrate.c b/mm/migrate.c
index c9296d63878d..f943a2d99de7 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -237,20 +237,19 @@ static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma,
 
 			pte = pte_mkhuge(pte);
 			pte = arch_make_huge_pte(pte, shift, vma->vm_flags);
-			set_huge_pte_at(vma->vm_mm, pvmw.address, pvmw.pte, pte);
 			if (PageAnon(new))
 				hugepage_add_anon_rmap(new, vma, pvmw.address);
 			else
 				page_dup_rmap(new, true);
+			set_huge_pte_at(vma->vm_mm, pvmw.address, pvmw.pte, pte);
 		} else
 #endif
 		{
-			set_pte_at(vma->vm_mm, pvmw.address, pvmw.pte, pte);
-
 			if (PageAnon(new))
 				page_add_anon_rmap(new, vma, pvmw.address, false);
 			else
 				page_add_file_rmap(new, false);
+			set_pte_at(vma->vm_mm, pvmw.address, pvmw.pte, pte);
 		}
 		if (vma->vm_flags & VM_LOCKED && !PageTransCompound(new))
 			mlock_vma_page(new);
diff --git a/mm/swapfile.c b/mm/swapfile.c
index e59e08ef46e1..e64207e2ef1d 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1917,14 +1917,14 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
 	dec_mm_counter(vma->vm_mm, MM_SWAPENTS);
 	inc_mm_counter(vma->vm_mm, MM_ANONPAGES);
 	get_page(page);
-	set_pte_at(vma->vm_mm, addr, pte,
-		   pte_mkold(mk_pte(page, vma->vm_page_prot)));
 	if (page == swapcache) {
 		page_add_anon_rmap(page, vma, addr, false);
 	} else { /* ksm created a completely new copy */
 		page_add_new_anon_rmap(page, vma, addr, false);
 		lru_cache_add_inactive_or_unevictable(page, vma);
 	}
+	set_pte_at(vma->vm_mm, addr, pte,
+		   pte_mkold(mk_pte(page, vma->vm_page_prot)));
 	swap_free(entry);
 out:
 	pte_unmap_unlock(pte, ptl);
-- 
2.34.1.307.g9b7440fafd-goog

From: Pasha Tatashin
Subject: [PATCH v3 2/4] mm: ptep_clear() page table helper
Date: Tue, 21 Dec 2021 15:46:48 +0000
Message-Id: <20211221154650.1047963-3-pasha.tatashin@soleen.com>
In-Reply-To: <20211221154650.1047963-1-pasha.tatashin@soleen.com>
References: <20211221154650.1047963-1-pasha.tatashin@soleen.com>

We have ptep_get_and_clear() and ptep_get_and_clear_full() helpers to
clear a PTE from user page tables, but there is no variant that simply
clears a present PTE from user page tables without using the low-level
pte_clear(), which can be either native or paravirtualised.

Add a new ptep_clear() that can be used in common code to clear PTEs from
the page table. We will need this call later in order to add a hook for
page table check.
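
As a rough illustration of the override pattern (a userspace sketch, not
kernel code): the generic ptep_clear() is only compiled when the
architecture has not defined __HAVE_ARCH_PTEP_CLEAR, and it simply forwards
to pte_clear(). The pte_t and mm_struct stand-ins and the clear_one_pte()
caller below are hypothetical and exist only so the sketch builds on its
own.

```c
#include <stdint.h>

/* Hypothetical userspace stand-ins for the kernel types. */
typedef struct { uint64_t val; } pte_t;
struct mm_struct { int unused; };

/*
 * Stand-in for the low-level pte_clear(). In the kernel this may be a
 * native or a paravirtualised operation; here it just zeroes the entry.
 */
static inline void pte_clear(struct mm_struct *mm, unsigned long addr,
			     pte_t *ptep)
{
	(void)mm;
	(void)addr;
	ptep->val = 0;
}

/*
 * Generic fallback: only built when the architecture does not provide
 * its own ptep_clear() and announce it via __HAVE_ARCH_PTEP_CLEAR.
 */
#ifndef __HAVE_ARCH_PTEP_CLEAR
static inline void ptep_clear(struct mm_struct *mm, unsigned long addr,
			      pte_t *ptep)
{
	pte_clear(mm, addr, ptep);
}
#endif

/* Hypothetical caller: clears one entry and returns what is left in it. */
static uint64_t clear_one_pte(uint64_t initial)
{
	struct mm_struct mm = { 0 };
	pte_t pte = { .val = initial };

	ptep_clear(&mm, 0x1000UL, &pte);
	return pte.val;
}
```

Having common code call ptep_clear() rather than pte_clear() directly gives
a single place where a later hook can be added.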
Signed-off-by: Pasha Tatashin
---
 Documentation/vm/arch_pgtable_helpers.rst |  6 ++++--
 include/linux/pgtable.h                   |  8 ++++++++
 mm/debug_vm_pgtable.c                     |  2 +-
 mm/khugepaged.c                           | 12 ++----------
 4 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/Documentation/vm/arch_pgtable_helpers.rst b/Documentation/vm/arch_pgtable_helpers.rst
index 552567d863b8..fbe06ec75370 100644
--- a/Documentation/vm/arch_pgtable_helpers.rst
+++ b/Documentation/vm/arch_pgtable_helpers.rst
@@ -66,9 +66,11 @@ PTE Page Table Helpers
 +---------------------------+--------------------------------------------------+
 | pte_mknotpresent          | Invalidates a mapped PTE                         |
 +---------------------------+--------------------------------------------------+
-| ptep_get_and_clear        | Clears a PTE                                     |
+| ptep_clear                | Clears a PTE                                     |
 +---------------------------+--------------------------------------------------+
-| ptep_get_and_clear_full   | Clears a PTE                                     |
+| ptep_get_and_clear        | Clears and returns PTE                           |
++---------------------------+--------------------------------------------------+
+| ptep_get_and_clear_full   | Clears and returns PTE (batched PTE unmap)       |
 +---------------------------+--------------------------------------------------+
 | ptep_test_and_clear_young | Clears young from a PTE                          |
 +---------------------------+--------------------------------------------------+
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index e24d2c992b11..bc8713a76e03 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -258,6 +258,14 @@ static inline int pmdp_clear_flush_young(struct vm_area_struct *vma,
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 #endif
 
+#ifndef __HAVE_ARCH_PTEP_CLEAR
+static inline void ptep_clear(struct mm_struct *mm, unsigned long addr,
+			      pte_t *ptep)
+{
+	pte_clear(mm, addr, ptep);
+}
+#endif
+
 #ifndef __HAVE_ARCH_PTEP_GET_AND_CLEAR
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
 				       unsigned long address,
diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index 228e3954b90c..cd48a34c87a4 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -652,7 +652,7 @@ static void __init pte_clear_tests(struct pgtable_debug_args *args)
 	set_pte_at(args->mm, args->vaddr, args->ptep, pte);
 	flush_dcache_page(page);
 	barrier();
-	pte_clear(args->mm, args->vaddr, args->ptep);
+	ptep_clear(args->mm, args->vaddr, args->ptep);
 	pte = ptep_get(args->ptep);
 	WARN_ON(!pte_none(pte));
 }
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index ed0fa6368706..7720189a2da7 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -756,11 +756,7 @@ static void __collapse_huge_page_copy(pte_t *pte, struct page *page,
 				 * ptl mostly unnecessary.
 				 */
 				spin_lock(ptl);
-				/*
-				 * paravirt calls inside pte_clear here are
-				 * superfluous.
-				 */
-				pte_clear(vma->vm_mm, address, _pte);
+				ptep_clear(vma->vm_mm, address, _pte);
 				spin_unlock(ptl);
 			}
 		} else {
@@ -774,11 +770,7 @@ static void __collapse_huge_page_copy(pte_t *pte, struct page *page,
 			 * inside page_remove_rmap().
 			 */
 			spin_lock(ptl);
-			/*
-			 * paravirt calls inside pte_clear here are
-			 * superfluous.
-			 */
-			pte_clear(vma->vm_mm, address, _pte);
+			ptep_clear(vma->vm_mm, address, _pte);
 			page_remove_rmap(src_page, false);
 			spin_unlock(ptl);
 			free_page_and_swap_cache(src_page);
-- 
2.34.1.307.g9b7440fafd-goog

From: Pasha Tatashin
Subject: [PATCH v3 3/4] mm: page table check
Date: Tue, 21 Dec 2021 15:46:49 +0000
Message-Id: <20211221154650.1047963-4-pasha.tatashin@soleen.com>
In-Reply-To: <20211221154650.1047963-1-pasha.tatashin@soleen.com>
References: <20211221154650.1047963-1-pasha.tatashin@soleen.com>

Check user page table entries at the time they are added and removed.

This allows memory corruption issues related to double mapping to be
caught synchronously.

When a pte for an anonymous page is added into the page table, we verify
that this pte does not already point to a file backed page, and vice versa:
if a file backed page is being added, we verify that this page does not
have an anonymous mapping.

We also enforce that only read-only sharing is allowed for anonymous pages
(i.e. COW after fork). All other sharing must be for file pages.

Page table check helps to protect against and debug cases where "struct
page" metadata became corrupted for some reason, for example, when refcnt
or mapcount become invalid.

Signed-off-by: Pasha Tatashin
---
 Documentation/vm/index.rst            |   1 +
 Documentation/vm/page_table_check.rst |  56 ++++++
 MAINTAINERS                           |   9 +
 arch/Kconfig                          |   3 +
 include/linux/page_table_check.h      | 147 ++++++++++++++
 mm/Kconfig.debug                      |  24 +++
 mm/Makefile                           |   1 +
 mm/page_alloc.c                       |   4 +
 mm/page_ext.c                         |   4 +
 mm/page_table_check.c                 | 270 ++++++++++++++++++++++++++
 10 files changed, 519 insertions(+)
 create mode 100644 Documentation/vm/page_table_check.rst
 create mode 100644 include/linux/page_table_check.h
 create mode 100644 mm/page_table_check.c

diff --git a/Documentation/vm/index.rst b/Documentation/vm/index.rst
index 6f5ffef4b716..43bb54d897d9 100644
--- a/Documentation/vm/index.rst
+++ b/Documentation/vm/index.rst
@@ -31,6 +31,7 @@ algorithms. If you are looking for advice on simply allocating memory, see the
    page_migration
    page_frags
    page_owner
+   page_table_check
    remap_file_pages
    slub
    split_page_table_lock
diff --git a/Documentation/vm/page_table_check.rst b/Documentation/vm/page_table_check.rst
new file mode 100644
index 000000000000..81f521ff7ea7
--- /dev/null
+++ b/Documentation/vm/page_table_check.rst
@@ -0,0 +1,56 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _page_table_check:
+
+================
+Page Table Check
+================
+
+Introduction
+============
+
+Page table check hardens the kernel by ensuring that some types of memory
+corruption are prevented.
+
+Page table check performs extra verifications at the time when new pages become
+accessible from the userspace by getting their page table entries (PTEs, PMDs,
+etc.) added into the table.
+
+In case of detected corruption, the kernel is crashed. There is a small
+performance and memory overhead associated with the page table check. Therefore,
+it is disabled by default, but can be optionally enabled on systems where the
+extra hardening outweighs the performance costs. Also, because page table check
+is synchronous, it can help with debugging double map memory corruption issues,
+by crashing the kernel when the wrong mapping occurs instead of later, which is
+often the case with memory corruption bugs.
+
+Double mapping detection logic
+==============================
+
++-------------------+-------------------+-------------------+------------------+
+| Current Mapping   | New mapping       | Permissions       | Rule             |
++===================+===================+===================+==================+
+| Anonymous         | Anonymous         | Read              | Allow            |
++-------------------+-------------------+-------------------+------------------+
+| Anonymous         | Anonymous         | Read / Write      | Prohibit         |
++-------------------+-------------------+-------------------+------------------+
+| Anonymous         | Named             | Any               | Prohibit         |
++-------------------+-------------------+-------------------+------------------+
+| Named             | Anonymous         | Any               | Prohibit         |
++-------------------+-------------------+-------------------+------------------+
+| Named             | Named             | Any               | Allow            |
++-------------------+-------------------+-------------------+------------------+
+
+Enabling Page Table Check
+=========================
+
+Build kernel with:
+
+- PAGE_TABLE_CHECK=y
+  Note, it can only be enabled on platforms where ARCH_SUPPORTS_PAGE_TABLE_CHECK
+  is available.
+
+- Boot with 'page_table_check=on' kernel parameter.
+
+Optionally, build kernel with PAGE_TABLE_CHECK_ENFORCED in order to have page
+table check enabled without the extra kernel parameter.
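
The detection rules in the table above can be modelled with two per-page
counters, one for anonymous mappings and one for named (file) mappings.
The sketch below is a simplified userspace model under that assumption,
not the code from this patch: ptc_model, map_type and ptc_model_map() are
hypothetical names, and the model returns false where the real check would
crash the kernel.

```c
#include <stdbool.h>

/* Hypothetical mapping kinds, matching the "Anonymous"/"Named" columns. */
enum map_type { MAP_ANON, MAP_FILE };

/* Hypothetical per-page state: how many mappings of each kind exist. */
struct ptc_model {
	int anon_map_count;
	int file_map_count;
};

/*
 * Try to add one mapping to a page. Returns true when the rule table
 * allows it (and records the mapping), false when it is prohibited.
 */
static bool ptc_model_map(struct ptc_model *p, enum map_type type,
			  bool writable)
{
	if (type == MAP_ANON) {
		/* Named -> Anonymous: prohibited for any permissions. */
		if (p->file_map_count > 0)
			return false;
		/* Anonymous -> Anonymous: only read sharing is allowed. */
		if (p->anon_map_count > 0 && writable)
			return false;
		p->anon_map_count++;
		return true;
	}

	/* Anonymous -> Named: prohibited for any permissions. */
	if (p->anon_map_count > 0)
		return false;

	/* Named -> Named: allowed for any permissions. */
	p->file_map_count++;
	return true;
}
```

Each row of the table corresponds to one branch: the first mapping of a
page is always accepted, and later mappings are judged against the counts
left behind by earlier ones.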
diff --git a/MAINTAINERS b/MAINTAINERS
index 4403b348851d..16bc8cdc1492 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14485,6 +14485,15 @@ F:	include/net/page_pool.h
 F:	include/trace/events/page_pool.h
 F:	net/core/page_pool.c
 
+PAGE TABLE CHECK
+M:	Pasha Tatashin
+M:	Andrew Morton
+L:	linux-mm@kvack.org
+S:	Maintained
+F:	Documentation/vm/page_table_check.rst
+F:	include/linux/page_table_check.h
+F:	mm/page_table_check.c
+
 PANASONIC LAPTOP ACPI EXTRAS DRIVER
 M:	Kenneth Chan
 L:	platform-driver-x86@vger.kernel.org
diff --git a/arch/Kconfig b/arch/Kconfig
index 75ad877c5c48..fdba59052abc 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1307,6 +1307,9 @@ config HAVE_ARCH_PFN_VALID
 config ARCH_SUPPORTS_DEBUG_PAGEALLOC
 	bool
 
+config ARCH_SUPPORTS_PAGE_TABLE_CHECK
+	bool
+
 config ARCH_SPLIT_ARG64
 	bool
 	help
diff --git a/include/linux/page_table_check.h b/include/linux/page_table_check.h
new file mode 100644
index 000000000000..38cace1da7b6
--- /dev/null
+++ b/include/linux/page_table_check.h
@@ -0,0 +1,147 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Copyright (c) 2021, Google LLC.
+ * Pasha Tatashin
+ */
+#ifndef __LINUX_PAGE_TABLE_CHECK_H
+#define __LINUX_PAGE_TABLE_CHECK_H
+
+#ifdef CONFIG_PAGE_TABLE_CHECK
+#include
+
+extern struct static_key_true page_table_check_disabled;
+extern struct page_ext_operations page_table_check_ops;
+
+void __page_table_check_zero(struct page *page, unsigned int order);
+void __page_table_check_pte_clear(struct mm_struct *mm, unsigned long addr,
+				  pte_t pte);
+void __page_table_check_pmd_clear(struct mm_struct *mm, unsigned long addr,
+				  pmd_t pmd);
+void __page_table_check_pud_clear(struct mm_struct *mm, unsigned long addr,
+				  pud_t pud);
+void __page_table_check_pte_set(struct mm_struct *mm, unsigned long addr,
+				pte_t *ptep, pte_t pte);
+void __page_table_check_pmd_set(struct mm_struct *mm, unsigned long addr,
+				pmd_t *pmdp, pmd_t pmd);
+void __page_table_check_pud_set(struct mm_struct *mm, unsigned long addr,
+				pud_t *pudp, pud_t pud);
+
+static inline void page_table_check_alloc(struct page *page, unsigned int order)
+{
+	if (static_branch_likely(&page_table_check_disabled))
+		return;
+
+	__page_table_check_zero(page, order);
+}
+
+static inline void page_table_check_free(struct page *page, unsigned int order)
+{
+	if (static_branch_likely(&page_table_check_disabled))
+		return;
+
+	__page_table_check_zero(page, order);
+}
+
+static inline void page_table_check_pte_clear(struct mm_struct *mm,
+					      unsigned long addr, pte_t pte)
+{
+	if (static_branch_likely(&page_table_check_disabled))
+		return;
+
+	__page_table_check_pte_clear(mm, addr, pte);
+}
+
+static inline void page_table_check_pmd_clear(struct mm_struct *mm,
+					      unsigned long addr, pmd_t pmd)
+{
+	if (static_branch_likely(&page_table_check_disabled))
+		return;
+
+	__page_table_check_pmd_clear(mm, addr, pmd);
+}
+
+static inline void page_table_check_pud_clear(struct mm_struct *mm,
+					      unsigned long addr, pud_t pud)
+{
+	if (static_branch_likely(&page_table_check_disabled))
+		return;
+
+	__page_table_check_pud_clear(mm, addr, pud);
+}
+
+static inline void page_table_check_pte_set(struct mm_struct *mm,
+					    unsigned long addr, pte_t *ptep,
+					    pte_t pte)
+{
+	if (static_branch_likely(&page_table_check_disabled))
+		return;
+
+	__page_table_check_pte_set(mm, addr, ptep, pte);
+}
+
+static inline void page_table_check_pmd_set(struct mm_struct *mm,
+					    unsigned long addr, pmd_t *pmdp,
+					    pmd_t pmd)
+{
+	if (static_branch_likely(&page_table_check_disabled))
+		return;
+
+	__page_table_check_pmd_set(mm, addr, pmdp, pmd);
+}
+
+static inline void page_table_check_pud_set(struct mm_struct *mm,
+					    unsigned long addr, pud_t *pudp,
+					    pud_t pud)
+{
+	if (static_branch_likely(&page_table_check_disabled))
+		return;
+
+	__page_table_check_pud_set(mm, addr, pudp, pud);
+}
+
+#else
+
+static inline void page_table_check_alloc(struct page *page, unsigned int order)
+{
+}
+
+static inline void page_table_check_free(struct page *page, unsigned int order)
+{
+}
+
+static inline void page_table_check_pte_clear(struct mm_struct *mm,
+					      unsigned long addr, pte_t pte)
+{
+}
+
+static inline void page_table_check_pmd_clear(struct mm_struct *mm,
+					      unsigned long addr, pmd_t pmd)
+{
+}
+
+static inline void page_table_check_pud_clear(struct mm_struct *mm,
+					      unsigned long addr, pud_t pud)
+{
+}
+
+static inline void page_table_check_pte_set(struct mm_struct *mm,
+					    unsigned long addr, pte_t *ptep,
+					    pte_t pte)
+{
+}
+
+static inline void page_table_check_pmd_set(struct mm_struct *mm,
+					    unsigned long addr, pmd_t *pmdp,
+					    pmd_t pmd)
+{
+}
+
+static inline void page_table_check_pud_set(struct mm_struct *mm,
+					    unsigned long addr, pud_t *pudp,
+					    pud_t pud)
+{
+}
+
+#endif /* CONFIG_PAGE_TABLE_CHECK */
+#endif /* __LINUX_PAGE_TABLE_CHECK_H */
diff --git a/mm/Kconfig.debug b/mm/Kconfig.debug
index 1e73717802f8..5bd5bb097252 100644
--- a/mm/Kconfig.debug
+++ b/mm/Kconfig.debug
@@ -62,6 +62,30 @@ config PAGE_OWNER
 
 	  If unsure, say N.
=20 +config PAGE_TABLE_CHECK + bool "Check for invalid mappings in user page tables" + depends on ARCH_SUPPORTS_PAGE_TABLE_CHECK + select PAGE_EXTENSION + help + Check that anonymous page is not being mapped twice with read write + permissions. Check that anonymous and file pages are not being + erroneously shared. Since the checking is performed at the time + entries are added and removed to user page tables, leaking, corruption + and double mapping problems are detected synchronously. + + If unsure say "n". + +config PAGE_TABLE_CHECK_ENFORCED + bool "Enforce the page table checking by default" + depends on PAGE_TABLE_CHECK + help + Always enable page table checking. By default the page table checking + is disabled, and can be optionally enabled via page_table_check=3Don + kernel parameter. This config enforces that page table check is always + enabled. + + If unsure say "n". + config PAGE_POISONING bool "Poison pages after freeing" help diff --git a/mm/Makefile b/mm/Makefile index 7919cd7f13f2..588d3113f3b0 100644 --- a/mm/Makefile +++ b/mm/Makefile @@ -114,6 +114,7 @@ obj-$(CONFIG_GENERIC_EARLY_IOREMAP) +=3D early_ioremap.o obj-$(CONFIG_CMA) +=3D cma.o obj-$(CONFIG_MEMORY_BALLOON) +=3D balloon_compaction.o obj-$(CONFIG_PAGE_EXTENSION) +=3D page_ext.o +obj-$(CONFIG_PAGE_TABLE_CHECK) +=3D page_table_check.o obj-$(CONFIG_CMA_DEBUGFS) +=3D cma_debug.o obj-$(CONFIG_SECRETMEM) +=3D secretmem.o obj-$(CONFIG_CMA_SYSFS) +=3D cma_sysfs.o diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 77253ea6031e..edfd6c81af82 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -64,6 +64,7 @@ #include #include #include +#include #include #include #include @@ -1308,6 +1309,7 @@ static __always_inline bool free_pages_prepare(struct= page *page, if (memcg_kmem_enabled() && PageMemcgKmem(page)) __memcg_kmem_uncharge_page(page, order); reset_page_owner(page, order); + page_table_check_free(page, order); return false; } =20 @@ -1347,6 +1349,7 @@ static __always_inline bool 
free_pages_prepare(struct page *page,
 	page_cpupid_reset_last(page);
 	page->flags &= ~PAGE_FLAGS_CHECK_AT_PREP;
 	reset_page_owner(page, order);
+	page_table_check_free(page, order);
 
 	if (!PageHighMem(page)) {
 		debug_check_no_locks_freed(page_address(page),
@@ -2421,6 +2424,7 @@ inline void post_alloc_hook(struct page *page, unsigned int order,
 	}
 
 	set_page_owner(page, order, gfp_flags);
+	page_table_check_alloc(page, order);
 }
 
 static void prep_new_page(struct page *page, unsigned int order, gfp_t gfp_flags,
diff --git a/mm/page_ext.c b/mm/page_ext.c
index 6242afb24d84..bee3240604dc 100644
--- a/mm/page_ext.c
+++ b/mm/page_ext.c
@@ -8,6 +8,7 @@
 #include
 #include
 #include
+#include
 
 /*
  * struct page extension
@@ -75,6 +76,9 @@ static struct page_ext_operations *page_ext_ops[] = {
 #if defined(CONFIG_PAGE_IDLE_FLAG) && !defined(CONFIG_64BIT)
 	&page_idle_ops,
 #endif
+#ifdef CONFIG_PAGE_TABLE_CHECK
+	&page_table_check_ops,
+#endif
 };
 
 unsigned long page_ext_size = sizeof(struct page_ext);
diff --git a/mm/page_table_check.c b/mm/page_table_check.c
new file mode 100644
index 000000000000..7504e7caa2a1
--- /dev/null
+++ b/mm/page_table_check.c
@@ -0,0 +1,270 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Copyright (c) 2021, Google LLC.
+ * Pasha Tatashin
+ */
+#include
+#include
+
+#undef pr_fmt
+#define pr_fmt(fmt)	"page_table_check: " fmt
+
+struct page_table_check {
+	atomic_t anon_map_count;
+	atomic_t file_map_count;
+};
+
+static bool __page_table_check_enabled __initdata =
+				IS_ENABLED(CONFIG_PAGE_TABLE_CHECK_ENFORCED);
+
+DEFINE_STATIC_KEY_TRUE(page_table_check_disabled);
+EXPORT_SYMBOL(page_table_check_disabled);
+
+static int __init early_page_table_check_param(char *buf)
+{
+	if (!buf)
+		return -EINVAL;
+
+	if (strcmp(buf, "on") == 0)
+		__page_table_check_enabled = true;
+	else if (strcmp(buf, "off") == 0)
+		__page_table_check_enabled = false;
+
+	return 0;
+}
+
+early_param("page_table_check", early_page_table_check_param);
+
+static bool __init need_page_table_check(void)
+{
+	return __page_table_check_enabled;
+}
+
+static void __init init_page_table_check(void)
+{
+	if (!__page_table_check_enabled)
+		return;
+	static_branch_disable(&page_table_check_disabled);
+}
+
+struct page_ext_operations page_table_check_ops = {
+	.size = sizeof(struct page_table_check),
+	.need = need_page_table_check,
+	.init = init_page_table_check,
+};
+
+static struct page_table_check *get_page_table_check(struct page_ext *page_ext)
+{
+	BUG_ON(!page_ext);
+	return (void *)(page_ext) + page_table_check_ops.offset;
+}
+
+static inline bool pte_user_accessible_page(pte_t pte)
+{
+	return (pte_val(pte) & _PAGE_PRESENT) && (pte_val(pte) & _PAGE_USER);
+}
+
+static inline bool pmd_user_accessible_page(pmd_t pmd)
+{
+	return pmd_leaf(pmd) && (pmd_val(pmd) & _PAGE_PRESENT) &&
+		(pmd_val(pmd) & _PAGE_USER);
+}
+
+static inline bool pud_user_accessible_page(pud_t pud)
+{
+	return pud_leaf(pud) && (pud_val(pud) & _PAGE_PRESENT) &&
+		(pud_val(pud) & _PAGE_USER);
+}
+
+/*
+ * An entry is removed from the page table: decrement that page's counters,
+ * verify its type, and check that the counters do not become negative.
+ */
+static void page_table_check_clear(struct mm_struct *mm, unsigned long addr,
+				   unsigned long pfn, unsigned long pgcnt)
+{
+	struct page_ext *page_ext;
+	struct page *page;
+	bool anon;
+	int i;
+
+	if (!pfn_valid(pfn))
+		return;
+
+	page = pfn_to_page(pfn);
+	page_ext = lookup_page_ext(page);
+	anon = PageAnon(page);
+
+	for (i = 0; i < pgcnt; i++) {
+		struct page_table_check *ptc = get_page_table_check(page_ext);
+
+		if (anon) {
+			BUG_ON(atomic_read(&ptc->file_map_count));
+			BUG_ON(atomic_dec_return(&ptc->anon_map_count) < 0);
+		} else {
+			BUG_ON(atomic_read(&ptc->anon_map_count));
+			BUG_ON(atomic_dec_return(&ptc->file_map_count) < 0);
+		}
+		page_ext = page_ext_next(page_ext);
+	}
+}
+
+/*
+ * A new entry is added to the page table: increment that page's counters,
+ * verify that it is of the correct type and is not being mapped with a
+ * different type to a different process.
+ */
+static void page_table_check_set(struct mm_struct *mm, unsigned long addr,
+				 unsigned long pfn, unsigned long pgcnt,
+				 bool rw)
+{
+	struct page_ext *page_ext;
+	struct page *page;
+	bool anon;
+	int i;
+
+	if (!pfn_valid(pfn))
+		return;
+
+	page = pfn_to_page(pfn);
+	page_ext = lookup_page_ext(page);
+	anon = PageAnon(page);
+
+	for (i = 0; i < pgcnt; i++) {
+		struct page_table_check *ptc = get_page_table_check(page_ext);
+
+		if (anon) {
+			BUG_ON(atomic_read(&ptc->file_map_count));
+			BUG_ON(atomic_inc_return(&ptc->anon_map_count) > 1 && rw);
+		} else {
+			BUG_ON(atomic_read(&ptc->anon_map_count));
+			BUG_ON(atomic_inc_return(&ptc->file_map_count) < 0);
+		}
+		page_ext = page_ext_next(page_ext);
+	}
+}
+
+/*
+ * The page is on a free list or is being allocated: verify that the counters
+ * are zero, and crash if they are not.
+ */
+void __page_table_check_zero(struct page *page, unsigned int order)
+{
+	struct page_ext *page_ext = lookup_page_ext(page);
+	int i;
+
+	BUG_ON(!page_ext);
+	for (i = 0; i < (1 << order); i++) {
+		struct page_table_check *ptc = get_page_table_check(page_ext);
+
+		BUG_ON(atomic_read(&ptc->anon_map_count));
+		BUG_ON(atomic_read(&ptc->file_map_count));
+		page_ext = page_ext_next(page_ext);
+	}
+}
+
+void __page_table_check_pte_clear(struct mm_struct *mm, unsigned long addr,
+				  pte_t pte)
+{
+	if (&init_mm == mm)
+		return;
+
+	if (pte_user_accessible_page(pte)) {
+		page_table_check_clear(mm, addr, pte_pfn(pte),
+				       PAGE_SIZE >> PAGE_SHIFT);
+	}
+}
+EXPORT_SYMBOL(__page_table_check_pte_clear);
+
+void __page_table_check_pmd_clear(struct mm_struct *mm, unsigned long addr,
+				  pmd_t pmd)
+{
+	if (&init_mm == mm)
+		return;
+
+	if (pmd_user_accessible_page(pmd)) {
+		page_table_check_clear(mm, addr, pmd_pfn(pmd),
+				       PMD_PAGE_SIZE >> PAGE_SHIFT);
+	}
+}
+EXPORT_SYMBOL(__page_table_check_pmd_clear);
+
+void __page_table_check_pud_clear(struct mm_struct *mm, unsigned long addr,
+				  pud_t pud)
+{
+	if (&init_mm == mm)
+		return;
+
+	if (pud_user_accessible_page(pud)) {
+		page_table_check_clear(mm, addr, pud_pfn(pud),
+				       PUD_PAGE_SIZE >> PAGE_SHIFT);
+	}
+}
+EXPORT_SYMBOL(__page_table_check_pud_clear);
+
+void __page_table_check_pte_set(struct mm_struct *mm, unsigned long addr,
+				pte_t *ptep, pte_t pte)
+{
+	pte_t old_pte;
+
+	if (&init_mm == mm)
+		return;
+
+	old_pte = *ptep;
+	if (pte_user_accessible_page(old_pte)) {
+		page_table_check_clear(mm, addr, pte_pfn(old_pte),
+				       PAGE_SIZE >> PAGE_SHIFT);
+	}
+
+	if (pte_user_accessible_page(pte)) {
+		page_table_check_set(mm, addr, pte_pfn(pte),
+				     PAGE_SIZE >> PAGE_SHIFT,
+				     pte_write(pte));
+	}
+}
+EXPORT_SYMBOL(__page_table_check_pte_set);
+
+void __page_table_check_pmd_set(struct mm_struct *mm, unsigned long addr,
+				pmd_t *pmdp, pmd_t pmd)
+{
+	pmd_t old_pmd;
+
+	if (&init_mm == mm)
+		return;
+
+	old_pmd =
*pmdp;
+	if (pmd_user_accessible_page(old_pmd)) {
+		page_table_check_clear(mm, addr, pmd_pfn(old_pmd),
+				       PMD_PAGE_SIZE >> PAGE_SHIFT);
+	}
+
+	if (pmd_user_accessible_page(pmd)) {
+		page_table_check_set(mm, addr, pmd_pfn(pmd),
+				     PMD_PAGE_SIZE >> PAGE_SHIFT,
+				     pmd_write(pmd));
+	}
+}
+EXPORT_SYMBOL(__page_table_check_pmd_set);
+
+void __page_table_check_pud_set(struct mm_struct *mm, unsigned long addr,
+				pud_t *pudp, pud_t pud)
+{
+	pud_t old_pud;
+
+	if (&init_mm == mm)
+		return;
+
+	old_pud = *pudp;
+	if (pud_user_accessible_page(old_pud)) {
+		page_table_check_clear(mm, addr, pud_pfn(old_pud),
+				       PUD_PAGE_SIZE >> PAGE_SHIFT);
+	}
+
+	if (pud_user_accessible_page(pud)) {
+		page_table_check_set(mm, addr, pud_pfn(pud),
+				     PUD_PAGE_SIZE >> PAGE_SHIFT,
+				     pud_write(pud));
+	}
+}
+EXPORT_SYMBOL(__page_table_check_pud_set);
-- 
2.34.1.307.g9b7440fafd-goog

From nobody Wed Jul 1 13:27:41 2026
From: Pasha Tatashin
To: pasha.tatashin@soleen.com, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, linux-doc@vger.kernel.org,
	akpm@linux-foundation.org, rientjes@google.com, pjt@google.com,
	weixugc@google.com, gthelen@google.com, mingo@redhat.com,
	corbet@lwn.net, will@kernel.org, rppt@kernel.org,
	keescook@chromium.org, tglx@linutronix.de, peterz@infradead.org,
	masahiroy@kernel.org, samitolvanen@google.com,
	dave.hansen@linux.intel.com, x86@kernel.org, frederic@kernel.org,
	hpa@zytor.com, aneesh.kumar@linux.ibm.com, jirislaby@kernel.org,
	songmuchun@bytedance.com, qydwhotmail@gmail.com, hughd@google.com
Subject: [PATCH v3 4/4] x86: mm: add x86_64 support for page table check
Date: Tue, 21 Dec 2021 15:46:50 +0000
Message-Id: <20211221154650.1047963-5-pasha.tatashin@soleen.com>
In-Reply-To: <20211221154650.1047963-1-pasha.tatashin@soleen.com>
References: <20211221154650.1047963-1-pasha.tatashin@soleen.com>

Add page table check hooks into routines that modify user page tables.

Signed-off-by: Pasha Tatashin
---
 arch/x86/Kconfig               |  1 +
 arch/x86/include/asm/pgtable.h | 29 +++++++++++++++++++++++++++--
 2 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 5eac1e3610e9..cc91c639acfb 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -104,6 +104,7 @@ config X86
 	select ARCH_SUPPORTS_ACPI
 	select ARCH_SUPPORTS_ATOMIC_RMW
 	select ARCH_SUPPORTS_DEBUG_PAGEALLOC
+	select ARCH_SUPPORTS_PAGE_TABLE_CHECK	if X86_64
 	select ARCH_SUPPORTS_NUMA_BALANCING	if X86_64
 	select ARCH_SUPPORTS_KMAP_LOCAL_FORCE_MAP	if NR_CPUS <= 4096
 	select ARCH_SUPPORTS_LTO_CLANG
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index dea9fe8a56cc..8a9432fb3802 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -27,6 +27,7 @@
 #include
 #include
 #include
+#include
 
 extern pgd_t early_top_pgt[PTRS_PER_PGD];
 bool __init __early_make_pgtable(unsigned long address, pmdval_t pmd);
@@ -1007,18 +1008,21 @@ static inline pud_t native_local_pudp_get_and_clear(pud_t *pudp)
 static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
 			      pte_t *ptep, pte_t pte)
 {
+	page_table_check_pte_set(mm, addr, ptep, pte);
 	set_pte(ptep, pte);
 }
 
 static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr,
 			      pmd_t *pmdp, pmd_t pmd)
 {
+	page_table_check_pmd_set(mm, addr, pmdp, pmd);
 	set_pmd(pmdp, pmd);
 }
 
 static inline void set_pud_at(struct mm_struct *mm, unsigned long addr,
 			      pud_t *pudp, pud_t pud)
 {
+	page_table_check_pud_set(mm, addr, pudp, pud);
 	native_set_pud(pudp, pud);
 }
 
@@ -1049,6 +1053,7 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
 				       pte_t *ptep)
 {
 	pte_t pte = native_ptep_get_and_clear(ptep);
+	page_table_check_pte_clear(mm, addr, pte);
 	return pte;
 }
 
@@ -1064,12 +1069,23 @@ static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm,
 		 * care about updates and native needs no locking
 		 */
 		pte =
native_local_ptep_get_and_clear(ptep);
+		page_table_check_pte_clear(mm, addr, pte);
 	} else {
 		pte = ptep_get_and_clear(mm, addr, ptep);
 	}
 	return pte;
 }
 
+#define __HAVE_ARCH_PTEP_CLEAR
+static inline void ptep_clear(struct mm_struct *mm, unsigned long addr,
+			      pte_t *ptep)
+{
+	if (IS_ENABLED(CONFIG_PAGE_TABLE_CHECK))
+		ptep_get_and_clear(mm, addr, ptep);
+	else
+		pte_clear(mm, addr, ptep);
+}
+
 #define __HAVE_ARCH_PTEP_SET_WRPROTECT
 static inline void ptep_set_wrprotect(struct mm_struct *mm,
 				      unsigned long addr, pte_t *ptep)
@@ -1110,14 +1126,22 @@ static inline int pmd_write(pmd_t pmd)
 static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm, unsigned long addr,
 				       pmd_t *pmdp)
 {
-	return native_pmdp_get_and_clear(pmdp);
+	pmd_t pmd = native_pmdp_get_and_clear(pmdp);
+
+	page_table_check_pmd_clear(mm, addr, pmd);
+
+	return pmd;
 }
 
 #define __HAVE_ARCH_PUDP_HUGE_GET_AND_CLEAR
 static inline pud_t pudp_huge_get_and_clear(struct mm_struct *mm,
 					unsigned long addr, pud_t *pudp)
 {
-	return native_pudp_get_and_clear(pudp);
+	pud_t pud = native_pudp_get_and_clear(pudp);
+
+	page_table_check_pud_clear(mm, addr, pud);
+
+	return pud;
 }
 
 #define __HAVE_ARCH_PMDP_SET_WRPROTECT
@@ -1138,6 +1162,7 @@ static inline int pud_write(pud_t pud)
 static inline pmd_t pmdp_establish(struct vm_area_struct *vma,
 		unsigned long address, pmd_t *pmdp, pmd_t pmd)
 {
+	page_table_check_pmd_set(vma->vm_mm, address, pmdp, pmd);
 	if (IS_ENABLED(CONFIG_SMP)) {
 		return xchg(pmdp, pmd);
 	} else {
-- 
2.34.1.307.g9b7440fafd-goog
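
Note for readers of the series: the per-page counter rules that
page_table_check_set(), page_table_check_clear() and
__page_table_check_zero() enforce can be modelled in user space. The sketch
below is only an illustration and is not part of the patch. struct ptc_model
and the ptc_model_*() helpers are hypothetical names, plain ints stand in for
atomic_t, and a false return stands in for the BUG_ON() the kernel code would
hit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for struct page_table_check. */
struct ptc_model {
	int anon_map_count;
	int file_map_count;
};

/*
 * Model of page_table_check_set(): returns false where the patch would
 * BUG_ON(). The counter is incremented before the comparison, mirroring
 * atomic_inc_return().
 */
static bool ptc_model_set(struct ptc_model *ptc, bool anon, bool rw)
{
	if (anon) {
		if (ptc->file_map_count)
			return false;	/* anon page already mapped as file */
		if (++ptc->anon_map_count > 1 && rw)
			return false;	/* writable mapping of an already mapped anon page */
	} else {
		if (ptc->anon_map_count)
			return false;	/* file page already mapped as anon */
		if (++ptc->file_map_count < 0)
			return false;
	}
	return true;
}

/* Model of page_table_check_clear(): a counter must never go negative. */
static bool ptc_model_clear(struct ptc_model *ptc, bool anon)
{
	if (anon) {
		if (ptc->file_map_count)
			return false;
		if (--ptc->anon_map_count < 0)
			return false;
	} else {
		if (ptc->anon_map_count)
			return false;
		if (--ptc->file_map_count < 0)
			return false;
	}
	return true;
}

/* Model of __page_table_check_zero(): both counters must be zero. */
static bool ptc_model_is_zero(const struct ptc_model *ptc)
{
	return ptc->anon_map_count == 0 && ptc->file_map_count == 0;
}
```

In this model, two read-only mappings of an anonymous page pass, while adding
a writable mapping to an anonymous page that is already mapped fails, which
matches the "> 1 && rw" condition in the patch.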