From: Nadav Amit
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Andrew Morton, Mike Rapoport, Axel Rasmussen, Nadav Amit, Andrea Arcangeli, Andrew Cooper, Andy Lutomirski, Dave Hansen, David Hildenbrand, Peter Xu, Peter Zijlstra, Thomas Gleixner, Will Deacon, Yu Zhao, Nick Piggin
Subject: [RFC PATCH 01/14] userfaultfd: set dirty and young on writeprotect
Date: Mon, 18 Jul 2022 05:01:59 -0700
Message-Id: <20220718120212.3180-2-namit@vmware.com>
In-Reply-To: <20220718120212.3180-1-namit@vmware.com>

From: Nadav Amit

When userfaultfd makes a PTE writable, it can now change the PTE
directly, in some cases, without triggering a page-fault first. Yet,
doing so might leave the PTE that was write-unprotected as old and
clean. At least on x86, this would cause an overhead of more than 500
cycles when the PTE is first accessed.
Use MM_CP_WILL_NEED to set the PTE as young and dirty when userfaultfd
gets a hint that the page is likely to be used. Avoid changing the PTE
to young and dirty in other cases, to avoid excessive writeback and
messing with the page reclamation logic.

Cc: Andrea Arcangeli
Cc: Andrew Cooper
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Dave Hansen
Cc: David Hildenbrand
Cc: Peter Xu
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Will Deacon
Cc: Yu Zhao
Cc: Nick Piggin
---
 include/linux/mm.h | 2 ++
 mm/mprotect.c      | 9 ++++++++-
 mm/userfaultfd.c   | 8 ++++++--
 3 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 9cc02a7e503b..4afd75ce5875 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1988,6 +1988,8 @@ extern unsigned long move_page_tables(struct vm_area_struct *vma,
 /* Whether this change is for write protecting */
 #define MM_CP_UFFD_WP			(1UL << 2) /* do wp */
 #define MM_CP_UFFD_WP_RESOLVE		(1UL << 3) /* Resolve wp */
+/* Whether to try to mark entries as dirty as they are to be written */
+#define MM_CP_WILL_NEED			(1UL << 4)
 #define MM_CP_UFFD_WP_ALL		(MM_CP_UFFD_WP | \
					 MM_CP_UFFD_WP_RESOLVE)

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 996a97e213ad..34c2dfb68c42 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -82,6 +82,7 @@ static unsigned long change_pte_range(struct mmu_gather *tlb,
	bool prot_numa = cp_flags & MM_CP_PROT_NUMA;
	bool uffd_wp = cp_flags & MM_CP_UFFD_WP;
	bool uffd_wp_resolve = cp_flags & MM_CP_UFFD_WP_RESOLVE;
+	bool will_need = cp_flags & MM_CP_WILL_NEED;

	tlb_change_page_size(tlb, PAGE_SIZE);

@@ -172,6 +173,9 @@ static unsigned long change_pte_range(struct mmu_gather *tlb,
				ptent = pte_clear_uffd_wp(ptent);
			}

+			if (will_need)
+				ptent = pte_mkyoung(ptent);
+
			/*
			 * In some writable, shared mappings, we might want
			 * to catch actual write access -- see
@@ -187,8 +191,11 @@ static unsigned long change_pte_range(struct mmu_gather *tlb,
			 */
			if ((cp_flags &
			     MM_CP_TRY_CHANGE_WRITABLE) && !pte_write(ptent) &&
-			    can_change_pte_writable(vma, addr, ptent))
+			    can_change_pte_writable(vma, addr, ptent)) {
				ptent = pte_mkwrite(ptent);
+				if (will_need)
+					ptent = pte_mkdirty(ptent);
+			}

			ptep_modify_prot_commit(vma, addr, pte, oldpte, ptent);
			if (pte_needs_flush(oldpte, ptent))

diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 954c6980b29f..e0492f5f06a0 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -749,6 +749,7 @@ int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start,
	bool enable_wp = uffd_flags & UFFD_FLAGS_WP;
	struct vm_area_struct *dst_vma;
	unsigned long page_mask;
+	unsigned long cp_flags;
	struct mmu_gather tlb;
	pgprot_t newprot;
	int err;
@@ -795,9 +796,12 @@ int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start,
	else
		newprot = vm_get_page_prot(dst_vma->vm_flags);

+	cp_flags = enable_wp ? MM_CP_UFFD_WP : MM_CP_UFFD_WP_RESOLVE;
+	if (uffd_flags & (UFFD_FLAGS_ACCESS_LIKELY|UFFD_FLAGS_WRITE_LIKELY))
+		cp_flags |= MM_CP_WILL_NEED;
+
	tlb_gather_mmu(&tlb, dst_mm);
-	change_protection(&tlb, dst_vma, start, start + len, newprot,
-			  enable_wp ?
			  MM_CP_UFFD_WP : MM_CP_UFFD_WP_RESOLVE);
+	change_protection(&tlb, dst_vma, start, start + len, newprot, cp_flags);
	tlb_finish_mmu(&tlb);

	err = 0;
-- 
2.25.1
From: Nadav Amit
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Andrew Morton, Mike Rapoport, Axel Rasmussen, Nadav Amit, Andrea Arcangeli, Andrew Cooper, Andy Lutomirski, Dave Hansen, David Hildenbrand, Peter Xu, Peter Zijlstra, Thomas Gleixner, Will Deacon, Yu Zhao, Nick Piggin
Subject: [RFC PATCH 02/14] userfaultfd: try to map write-unprotected pages
Date: Mon, 18 Jul 2022 05:02:00 -0700
Message-Id: <20220718120212.3180-3-namit@vmware.com>
In-Reply-To: <20220718120212.3180-1-namit@vmware.com>

From: Nadav Amit

When using the userfaultfd write-(un)protect ioctl, try to change the
PTE to be writable. This saves a page-fault afterwards.
Cc: Andrea Arcangeli
Cc: Andrew Cooper
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Dave Hansen
Cc: David Hildenbrand
Cc: Peter Xu
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Will Deacon
Cc: Yu Zhao
Cc: Nick Piggin
Signed-off-by: Nadav Amit
Acked-by: Peter Xu
---
 mm/userfaultfd.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index e0492f5f06a0..6013b217e9f3 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -799,6 +799,8 @@ int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start,
	cp_flags = enable_wp ? MM_CP_UFFD_WP : MM_CP_UFFD_WP_RESOLVE;
	if (uffd_flags & (UFFD_FLAGS_ACCESS_LIKELY|UFFD_FLAGS_WRITE_LIKELY))
		cp_flags |= MM_CP_WILL_NEED;
+	if (!enable_wp && (uffd_flags & UFFD_FLAGS_WRITE_LIKELY))
+		cp_flags |= MM_CP_TRY_CHANGE_WRITABLE;

	tlb_gather_mmu(&tlb, dst_mm);
	change_protection(&tlb, dst_vma, start, start + len, newprot, cp_flags);
-- 
2.25.1
From: Nadav Amit
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Andrew Morton, Mike Rapoport, Axel Rasmussen, Nadav Amit, Andrea Arcangeli, Andrew
Cooper, Andy Lutomirski, Dave Hansen, David Hildenbrand, Peter Xu, Peter Zijlstra, Thomas Gleixner, Will Deacon, Yu Zhao, Nick Piggin
Subject: [RFC PATCH 03/14] mm/mprotect: allow exclusive anon pages to be writable
Date: Mon, 18 Jul 2022 05:02:01 -0700
Message-Id: <20220718120212.3180-4-namit@vmware.com>
In-Reply-To: <20220718120212.3180-1-namit@vmware.com>

From: Nadav Amit

Anonymous pages might have the dirty bit clear, but this should not
prevent mprotect from making them writable if they are exclusive.
Therefore, skip the dirty check for exclusive anonymous pages.

Cc: Andrea Arcangeli
Cc: Andrew Cooper
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Dave Hansen
Cc: David Hildenbrand
Cc: Peter Xu
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Will Deacon
Cc: Yu Zhao
Cc: Nick Piggin
Signed-off-by: Nadav Amit
---
 mm/mprotect.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 34c2dfb68c42..da5b9bf8204f 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -45,7 +45,7 @@ static inline bool can_change_pte_writable(struct vm_area_struct *vma,
	VM_BUG_ON(!(vma->vm_flags & VM_WRITE) || pte_write(pte));

-	if (pte_protnone(pte) || !pte_dirty(pte))
+	if (pte_protnone(pte))
		return false;

	/* Do we need write faults for softdirty tracking?
	 */
@@ -66,7 +66,8 @@ static inline bool can_change_pte_writable(struct vm_area_struct *vma,
		page = vm_normal_page(vma, addr, pte);
		if (!page || !PageAnon(page) || !PageAnonExclusive(page))
			return false;
-	}
+	} else if (!pte_dirty(pte))
+		return false;

	return true;
}
-- 
2.25.1
From: Nadav Amit
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Andrew Morton, Mike Rapoport, Axel Rasmussen, Nadav Amit, Andrea Arcangeli, Andrew Cooper, Andy Lutomirski, Dave Hansen, David Hildenbrand, Peter Xu, Peter Zijlstra, Thomas Gleixner, Will Deacon, Yu Zhao, Nick Piggin
Subject: [RFC PATCH 04/14] mm/mprotect: preserve write with MM_CP_TRY_CHANGE_WRITABLE
Date: Mon, 18 Jul 2022 05:02:02 -0700
Message-Id: <20220718120212.3180-5-namit@vmware.com>
In-Reply-To: <20220718120212.3180-1-namit@vmware.com>
From: Nadav Amit

When MM_CP_TRY_CHANGE_WRITABLE is used, change_pte_range() tries to set
PTEs as writable. Yet, writable PTEs might still become read-only, due
to various limitations of the logic that determines whether a PTE can
become writable (see can_change_pte_writable()). It is much easier to
keep the writable bit set when MM_CP_TRY_CHANGE_WRITABLE is used than to
first clear it and then figure out whether it can be set again.

Preserve the write-bit when MM_CP_TRY_CHANGE_WRITABLE is used, similarly
to the way it is done with NUMA.

Cc: Andrea Arcangeli
Cc: Andrew Cooper
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Dave Hansen
Cc: David Hildenbrand
Cc: Peter Xu
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Will Deacon
Cc: Yu Zhao
Cc: Nick Piggin
Signed-off-by: Nadav Amit
---
 mm/mprotect.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/mm/mprotect.c b/mm/mprotect.c
index da5b9bf8204f..92bfb17dcb8a 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -84,6 +84,7 @@ static unsigned long change_pte_range(struct mmu_gather *tlb,
	bool uffd_wp = cp_flags & MM_CP_UFFD_WP;
	bool uffd_wp_resolve = cp_flags & MM_CP_UFFD_WP_RESOLVE;
	bool will_need = cp_flags & MM_CP_WILL_NEED;
+	bool try_change_writable = cp_flags & MM_CP_TRY_CHANGE_WRITABLE;

	tlb_change_page_size(tlb, PAGE_SIZE);

@@ -114,7 +115,8 @@ static unsigned long change_pte_range(struct mmu_gather *tlb,
		oldpte = *pte;
		if (pte_present(oldpte)) {
			pte_t ptent;
-			bool preserve_write = prot_numa && pte_write(oldpte);
+			bool preserve_write = (prot_numa || try_change_writable) &&
+					      pte_write(oldpte);

			/*
			 * Avoid trapping faults against the zero or KSM
@@ -190,8 +192,7 @@ static unsigned long change_pte_range(struct mmu_gather *tlb,
			 * example, if a PTE is already dirty and no other
			 * COW or special handling is required.
			 */
-			if ((cp_flags & MM_CP_TRY_CHANGE_WRITABLE) &&
-			    !pte_write(ptent) &&
+			if (try_change_writable && !pte_write(ptent) &&
			    can_change_pte_writable(vma, addr, ptent)) {
				ptent = pte_mkwrite(ptent);
				if (will_need)
-- 
2.25.1
From: Nadav Amit
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Andrew Morton, Mike Rapoport, Axel Rasmussen, Nadav Amit, Andrea Arcangeli, Andrew Cooper, Andy Lutomirski, Dave Hansen, David Hildenbrand, Peter Xu, Peter Zijlstra, Thomas Gleixner, Will Deacon, Yu Zhao, Nick Piggin, x86@kernel.org
Subject: [RFC PATCH 05/14] x86/mm: check exec permissions on fault
Date: Mon, 18 Jul 2022 05:02:03 -0700
Message-Id: <20220718120212.3180-6-namit@vmware.com>
In-Reply-To: <20220718120212.3180-1-namit@vmware.com>

From: Nadav Amit

access_error() currently does not check for execution
permission violation. As a result, spurious page-faults due to an
execution permission violation cause SIGSEGV.

This has not been an issue so far, but the next patches avoid TLB
flushes on permission promotion, which can lead to this scenario.
nodejs, for instance, crashes when the TLB flush is avoided on
permission promotion.

Add a check to prevent access_error() from mistakenly reporting
spurious page-faults due to instruction fetch as an access error. The
"instruction fetch" and "write" bits in the hardware error code are
assumed to be mutually exclusive; however, to be on the safe side,
especially if hypervisors misbehave, assert this is the case and warn
otherwise.

Cc: Andrea Arcangeli
Cc: Andrew Cooper
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Dave Hansen
Cc: David Hildenbrand
Cc: Peter Xu
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Will Deacon
Cc: Yu Zhao
Cc: Nick Piggin
Cc: x86@kernel.org
Signed-off-by: Nadav Amit
---
 arch/x86/mm/fault.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index fe10c6d76bac..00013c1fac3f 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1107,10 +1107,28 @@ access_error(unsigned long error_code, struct vm_area_struct *vma)
			       (error_code & X86_PF_INSTR), foreign))
		return 1;

-	if (error_code & X86_PF_WRITE) {
+	if (error_code & (X86_PF_WRITE | X86_PF_INSTR)) {
+		/*
+		 * CPUs are not expected to set the two error code bits
+		 * together, but to ensure that hypervisors do not misbehave,
+		 * run an additional sanity check.
+		 */
+		if ((error_code & (X86_PF_WRITE|X86_PF_INSTR)) ==
+		    (X86_PF_WRITE|X86_PF_INSTR)) {
+			WARN_ON_ONCE(1);
+			return 1;
+		}
+
		/* write, present and write, not present: */
-		if (unlikely(!(vma->vm_flags & VM_WRITE)))
+		if ((error_code & X86_PF_WRITE) &&
+		    unlikely(!(vma->vm_flags & VM_WRITE)))
+			return 1;
+
+		/* exec, present and exec, not present: */
+		if ((error_code & X86_PF_INSTR) &&
+		    unlikely(!(vma->vm_flags & VM_EXEC)))
			return 1;
+		return 0;
	}

-- 
2.25.1
From: Nadav Amit
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Andrew Morton, Mike Rapoport, Axel Rasmussen, Nadav Amit, Andrea Arcangeli, Andrew Cooper, Andy Lutomirski, Dave Hansen, David Hildenbrand, Peter Xu, Peter Zijlstra, Thomas Gleixner, Will Deacon, Yu Zhao, Nick Piggin
Subject: [RFC PATCH 06/14] mm/rmap: avoid flushing on page_vma_mkclean_one() when possible
Date: Mon, 18 Jul 2022 05:02:04 -0700
Message-Id: <20220718120212.3180-7-namit@vmware.com>
In-Reply-To: <20220718120212.3180-1-namit@vmware.com>
From: Nadav Amit

x86 can avoid a TLB flush when write-protecting clean writable entries.
page_vma_mkclean_one() does not take advantage of this behavior; adapt
it to do so.

Cc: Andrea Arcangeli
Cc: Andrew Cooper
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Dave Hansen
Cc: David Hildenbrand
Cc: Peter Xu
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Will Deacon
Cc: Yu Zhao
Cc: Nick Piggin
Signed-off-by: Nadav Amit
---
 mm/rmap.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index 83172ee0ea35..23997c387858 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -961,17 +961,25 @@ static int page_vma_mkclean_one(struct page_vma_mapped_walk *pvmw)

		address = pvmw->address;
		if (pvmw->pte) {
-			pte_t entry;
+			pte_t entry, oldpte;
			pte_t *pte = pvmw->pte;

			if (!pte_dirty(*pte) && !pte_write(*pte))
				continue;

			flush_cache_page(vma, address, pte_pfn(*pte));
-			entry = ptep_clear_flush(vma, address, pte);
-			entry = pte_wrprotect(entry);
+			oldpte = ptep_modify_prot_start(pvmw->vma, address,
+							pte);
+
+			entry = pte_wrprotect(oldpte);
			entry = pte_mkclean(entry);
-			set_pte_at(vma->vm_mm, address, pte, entry);
+
+			if (pte_needs_flush(oldpte, entry) ||
+			    mm_tlb_flush_pending(vma->vm_mm))
+				flush_tlb_page(vma, address);
+
+			ptep_modify_prot_commit(vma, address, pte, oldpte,
+						entry);
			ret = 1;
		} else {
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
-- 
2.25.1
18 Jul 2022 15:37:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57554 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236184AbiGRThQ (ORCPT ); Mon, 18 Jul 2022 15:37:16 -0400 Received: from mail-pf1-x42c.google.com (mail-pf1-x42c.google.com [IPv6:2607:f8b0:4864:20::42c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 074C92CDED for ; Mon, 18 Jul 2022 12:37:14 -0700 (PDT) Received: by mail-pf1-x42c.google.com with SMTP id g126so11565451pfb.3 for ; Mon, 18 Jul 2022 12:37:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=ii9tnumcH6f4uVFiJaNvbNbCHj/eOFwFpwV99AWDIWY=; b=gF1O4Ap8c6u3Wl7EJAnk5cpari+SSbCrCwIHnkKi2iGutzFYOra9oMYtovijltSliX f9n+3+zOazFlTlGbgoP7cASWhUOA3yRUCNf36n4asDLBRZ2n5F6V77g8Ih0ng07spWdH HCMUUeGMN6x1yYhcSmRTYhGUaieuitiLXWFPOh9CH2CMEqYcyByqPimLVW+wP36dQPZe 1RJE9T2eaZawBkz5E8JuXZAG/5xKtgo8+wNL/iuOrcsSXLKNEDQJjqpz/5Yuv3v3GVFQ DvCcYoxTnP9tWp+lBd1uyqN2PAo5ihDVfdOn4Uj1GiSnYTj0YlvWIZ5h14BKB8GCHPjA 0gFA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=ii9tnumcH6f4uVFiJaNvbNbCHj/eOFwFpwV99AWDIWY=; b=zNosqQGRRV4CaRWURanoiewy4STvLM3KxyKHHT8F6Hsf8HCScyF+cQ3A2t8VEiOceI QZY5H10tGPVIEX5UdpZX+XKlTNm0SWt7bxaxipLM8I0hHhLNNRSc7CIa/hpYw2MGJU2e 6SVIIyNp4KmNg0Ufmsk7oeyXD4mazPfbNcYEitOVeB50srGE2EiUU4Dp9LT3x2hJWnav ds960yZ/NKDnjROufVOZQyn43lPtj1xn+XpVJ3fL0bDB+88WPkAe3/a+y1PXckyBPHoT wnCKML5fdoWdWhCnS2yhK9Za3NHtjidRuGRNspCldqha8ypXQP5LWBlN4+bqau1YVeLg SHyw== X-Gm-Message-State: AJIora966Nwt47D7+YDU9q5ZbBx+WjFQz39REMn4yP0wquFAmJTBQ4g0 bxkHX9pc2LSshNWIDL+YlOg= X-Google-Smtp-Source: AGRyM1tECueM0uV0ckJl2HFWgTNipKEQZcrDzMSchFkRcqY0hBmCA7xorYY92YcZXwgaVycH49hQXg== X-Received: by 
2002:a05:6a00:3388:b0:52a:c018:6cdf with SMTP id cm8-20020a056a00338800b0052ac0186cdfmr29910792pfb.55.1658173033302; Mon, 18 Jul 2022 12:37:13 -0700 (PDT) Received: from sc2-haas01-esx0118.eng.vmware.com ([66.170.99.1]) by smtp.gmail.com with ESMTPSA id q6-20020a170902a3c600b0016bc4a6ce28sm9907887plb.98.2022.07.18.12.37.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 18 Jul 2022 12:37:12 -0700 (PDT) From: Nadav Amit X-Google-Original-From: Nadav Amit To: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org, Andrew Morton , Mike Rapoport , Axel Rasmussen , Nadav Amit , Andrea Arcangeli , Andrew Cooper , Andy Lutomirski , Dave Hansen , David Hildenbrand , Peter Xu , Peter Zijlstra , Thomas Gleixner , Will Deacon , Yu Zhao , Nick Piggin Subject: [RFC PATCH 07/14] mm: do fix spurious page-faults for instruction faults Date: Mon, 18 Jul 2022 05:02:05 -0700 Message-Id: <20220718120212.3180-8-namit@vmware.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220718120212.3180-1-namit@vmware.com> References: <20220718120212.3180-1-namit@vmware.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Nadav Amit The next patches might cause spurious instruction faults on x86. To prevent them from occurring too much, call flush_tlb_fix_spurious_fault() for page-faults on code fetching as well. The callee is expected to do a full flush, or whatever is necessary to avoid further TLB flushes. 
Cc: Andrea Arcangeli
Cc: Andrew Cooper
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Dave Hansen
Cc: David Hildenbrand
Cc: Peter Xu
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Will Deacon
Cc: Yu Zhao
Cc: Nick Piggin
Signed-off-by: Nadav Amit
---
 mm/memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index 31ec3f0071a2..152a47876c36 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4924,7 +4924,7 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf)
 		 * This still avoids useless tlb flushes for .text page faults
 		 * with threads.
 		 */
-		if (vmf->flags & FAULT_FLAG_WRITE)
+		if (vmf->flags & (FAULT_FLAG_WRITE|FAULT_FLAG_INSTRUCTION))
 			flush_tlb_fix_spurious_fault(vmf->vma, vmf->address);
 	}
 unlock:
-- 
2.25.1

From nobody Sat Apr 18 04:19:35 2026
From: Nadav Amit
Subject: [RFC PATCH 08/14] x86/mm: introduce flush_tlb_fix_spurious_fault
Date: Mon, 18 Jul 2022 05:02:06 -0700
Message-Id: <20220718120212.3180-9-namit@vmware.com>
In-Reply-To: <20220718120212.3180-1-namit@vmware.com>
References: <20220718120212.3180-1-namit@vmware.com>

The next patches introduce relaxed TLB flushes for x86, which would
require a full TLB flush upon a spurious page-fault. If a spurious
page-fault occurs on x86, check whether the local TLB generation is out
of sync and perform a TLB flush if needed.

Cc: Andrea Arcangeli
Cc: Andrew Cooper
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Dave Hansen
Cc: David Hildenbrand
Cc: Peter Xu
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Will Deacon
Cc: Yu Zhao
Cc: Nick Piggin
Signed-off-by: Nadav Amit
---
 arch/x86/include/asm/pgtable.h |  4 +++-
 arch/x86/mm/tlb.c              | 17 +++++++++++++++++
 2 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 44e2d6f1dbaa..1fbdaff1bb7a 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1079,7 +1079,9 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm,
 	clear_bit(_PAGE_BIT_RW, (unsigned long *)&ptep->pte);
 }
 
-#define flush_tlb_fix_spurious_fault(vma, address) do { } while (0)
+extern void flush_tlb_fix_spurious_fault(struct vm_area_struct *vma,
+					 unsigned long address);
+#define flush_tlb_fix_spurious_fault flush_tlb_fix_spurious_fault
 
 #define mk_pmd(page, pgprot)   pfn_pmd(page_to_pfn(page), (pgprot))
 
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index d400b6d9d246..ff3bcc55435e 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -955,6 +955,23 @@ static void put_flush_tlb_info(void)
 #endif
 }
 
+void flush_tlb_fix_spurious_fault(struct vm_area_struct *vma,
+				  unsigned long address)
+{
+	u32 loaded_mm_asid = this_cpu_read(cpu_tlbstate.loaded_mm_asid);
+	u64 mm_tlb_gen = atomic64_read(&vma->vm_mm->context.tlb_gen);
+	u64 local_tlb_gen = this_cpu_read(cpu_tlbstate.ctxs[loaded_mm_asid].tlb_gen);
+	struct flush_tlb_info *info;
+
+	if (local_tlb_gen == mm_tlb_gen)
+		return;
+
+	preempt_disable();
+	info = get_flush_tlb_info(NULL, 0, TLB_FLUSH_ALL, 0, false, 0);
+	flush_tlb_func(info);
+	preempt_enable();
+}
+
 void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
 				unsigned long end, unsigned int stride_shift,
 				bool freed_tables)
-- 
2.25.1

From nobody Sat Apr 18 04:19:35 2026
From: Nadav Amit
Subject: [RFC PATCH 09/14] mm: introduce relaxed TLB flushes
Date: Mon, 18 Jul 2022 05:02:07 -0700
Message-Id: <20220718120212.3180-10-namit@vmware.com>
In-Reply-To: <20220718120212.3180-1-namit@vmware.com>
References: <20220718120212.3180-1-namit@vmware.com>

Introduce the concept of strict and relaxed TLB flushes. A relaxed TLB
flush is one that can be skipped, at the cost of degraded performance.
It is up to arch code (in the next patches) to handle relaxed flushes
correctly; one such policy is to flush the local TLB eagerly and remote
TLBs lazily. Track whether a flush is strict in the mmu_gather struct
and introduce the constants required for tracking.

Cc: Andrea Arcangeli
Cc: Andrew Cooper
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Dave Hansen
Cc: David Hildenbrand
Cc: Peter Xu
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Will Deacon
Cc: Yu Zhao
Cc: Nick Piggin
Signed-off-by: Nadav Amit
---
 arch/x86/include/asm/tlbflush.h |  41 ++++++------
 include/asm-generic/tlb.h       | 114 ++++++++++++++++++--------------
 include/linux/mm_types.h        |   6 ++
 mm/huge_memory.c                |   7 +-
 mm/hugetlb.c                    |   2 +-
 mm/mmu_gather.c                 |   1 +
 mm/mprotect.c                   |   8 ++-
 mm/rmap.c                       |   2 +-
 8 files changed, 107 insertions(+), 74 deletions(-)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 4af5579c7ef7..77d4810e5a5d 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -259,7 +259,7 @@ static inline void arch_tlbbatch_add_mm(struct arch_tlbflush_unmap_batch *batch,
 
 extern void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch);
 
-static inline bool pte_flags_need_flush(unsigned long oldflags,
+static inline enum pte_flush_type pte_flags_flush_type(unsigned long oldflags,
 					unsigned long newflags,
 					bool ignore_access)
 {
@@ -290,71 +290,72 @@ static inline bool pte_flags_need_flush(unsigned long oldflags,
 		diff &= ~_PAGE_ACCESSED;
 
 	/*
-	 * Did any of the 'flush_on_clear' flags was clleared set from between
-	 * 'oldflags' and 'newflags'?
+	 * Were any of the 'flush_on_clear' flags cleared between 'oldflags'
+	 * and 'newflags'?
 	 */
 	if (diff & oldflags & flush_on_clear)
-		return true;
+		return PTE_FLUSH_STRICT;
 
 	/* Flush on modified flags. */
 	if (diff & flush_on_change)
-		return true;
+		return PTE_FLUSH_STRICT;
 
 	/* Ensure there are no flags that were left behind */
 	if (IS_ENABLED(CONFIG_DEBUG_VM) &&
 	    (diff & ~(flush_on_clear | software_flags | flush_on_change))) {
 		VM_WARN_ON_ONCE(1);
-		return true;
+		return PTE_FLUSH_STRICT;
 	}
 
-	return false;
+	return PTE_FLUSH_NONE;
 }
 
 /*
- * pte_needs_flush() checks whether permissions were demoted and require a
- * flush. It should only be used for userspace PTEs.
+ * pte_flush_type() checks whether permissions were demoted or promoted and
+ * whether a strict or relaxed TLB flush is needed. It should only be used on
+ * userspace PTEs.
  */
-static inline bool pte_needs_flush(pte_t oldpte, pte_t newpte)
+static inline enum pte_flush_type pte_flush_type(pte_t oldpte, pte_t newpte)
 {
 	/* !PRESENT -> * ; no need for flush */
 	if (!(pte_flags(oldpte) & _PAGE_PRESENT))
-		return false;
+		return PTE_FLUSH_NONE;
 
 	/* PFN changed ; needs flush */
 	if (pte_pfn(oldpte) != pte_pfn(newpte))
-		return true;
+		return PTE_FLUSH_STRICT;
 
 	/*
 	 * check PTE flags; ignore access-bit; see comment in
 	 * ptep_clear_flush_young().
 	 */
-	return pte_flags_need_flush(pte_flags(oldpte), pte_flags(newpte),
+	return pte_flags_flush_type(pte_flags(oldpte), pte_flags(newpte),
 				    true);
 }
-#define pte_needs_flush pte_needs_flush
+#define pte_flush_type pte_flush_type
 
 /*
- * huge_pmd_needs_flush() checks whether permissions were demoted and require a
+ * huge_pmd_flush_type() checks whether permissions were demoted and require a
  * flush. It should only be used for userspace huge PMDs.
  */
-static inline bool huge_pmd_needs_flush(pmd_t oldpmd, pmd_t newpmd)
+static inline enum pte_flush_type huge_pmd_flush_type(pmd_t oldpmd, pmd_t newpmd)
 {
 	/* !PRESENT -> * ; no need for flush */
 	if (!(pmd_flags(oldpmd) & _PAGE_PRESENT))
-		return false;
+		return PTE_FLUSH_NONE;
 
 	/* PFN changed ; needs flush */
 	if (pmd_pfn(oldpmd) != pmd_pfn(newpmd))
-		return true;
+		return PTE_FLUSH_STRICT;
 
 	/*
 	 * check PMD flags; do not ignore access-bit; see
 	 * pmdp_clear_flush_young().
 	 */
-	return pte_flags_need_flush(pmd_flags(oldpmd), pmd_flags(newpmd),
+	return pte_flags_flush_type(pmd_flags(oldpmd), pmd_flags(newpmd),
 				    false);
 }
-#define huge_pmd_needs_flush huge_pmd_needs_flush
+#define huge_pmd_flush_type huge_pmd_flush_type
 
 #endif /* !MODULE */
 
diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index ff3e82553a76..07b3eb8caf63 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -289,6 +289,11 @@ struct mmu_gather {
 	unsigned int		vma_exec : 1;
 	unsigned int		vma_huge : 1;
 
+	/*
+	 * whether we made flushing strict (added protection) or changed
+	 * mappings.
+	 */
+	unsigned int		strict : 1;
 	unsigned int		batch_count;
 
 #ifndef CONFIG_MMU_GATHER_NO_GATHER
@@ -325,6 +330,7 @@ static inline void __tlb_reset_range(struct mmu_gather *tlb)
 	tlb->cleared_pmds = 0;
 	tlb->cleared_puds = 0;
 	tlb->cleared_p4ds = 0;
+	tlb->strict = 0;
 	/*
 	 * Do not reset mmu_gather::vma_* fields here, we do not
 	 * call into tlb_start_vma() again to set them if there is an
@@ -518,31 +524,43 @@ static inline void tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vm
  * and set corresponding cleared_*.
  */
 static inline void tlb_flush_pte_range(struct mmu_gather *tlb,
-				     unsigned long address, unsigned long size)
+				     unsigned long address, unsigned long size,
+				     bool strict)
 {
 	__tlb_adjust_range(tlb, address, size);
 	tlb->cleared_ptes = 1;
+	if (strict)
+		tlb->strict = 1;
 }
 
 static inline void tlb_flush_pmd_range(struct mmu_gather *tlb,
-				     unsigned long address, unsigned long size)
+				     unsigned long address, unsigned long size,
+				     bool strict)
 {
 	__tlb_adjust_range(tlb, address, size);
 	tlb->cleared_pmds = 1;
+	if (strict)
+		tlb->strict = 1;
 }
 
 static inline void tlb_flush_pud_range(struct mmu_gather *tlb,
-				     unsigned long address, unsigned long size)
+				     unsigned long address, unsigned long size,
+				     bool strict)
 {
 	__tlb_adjust_range(tlb, address, size);
 	tlb->cleared_puds = 1;
+	if (strict)
+		tlb->strict = 1;
 }
 
 static inline void tlb_flush_p4d_range(struct mmu_gather *tlb,
-				     unsigned long address, unsigned long size)
+				     unsigned long address, unsigned long size,
+				     bool strict)
 {
 	__tlb_adjust_range(tlb, address, size);
 	tlb->cleared_p4ds = 1;
+	if (strict)
+		tlb->strict = 1;
 }
 
 #ifndef __tlb_remove_tlb_entry
@@ -556,24 +574,24 @@ static inline void tlb_flush_p4d_range(struct mmu_gather *tlb,
  * so we can later optimise away the tlb invalidate.   This helps when
  * userspace is unmapping already-unmapped pages, which happens quite a lot.
  */
-#define tlb_remove_tlb_entry(tlb, ptep, address)		\
-	do {							\
-		tlb_flush_pte_range(tlb, address, PAGE_SIZE);	\
-		__tlb_remove_tlb_entry(tlb, ptep, address);	\
+#define tlb_remove_tlb_entry(tlb, ptep, address)			\
+	do {								\
+		tlb_flush_pte_range(tlb, address, PAGE_SIZE, true);	\
+		__tlb_remove_tlb_entry(tlb, ptep, address);		\
 	} while (0)
 
-#define tlb_remove_huge_tlb_entry(h, tlb, ptep, address)	\
-	do {							\
-		unsigned long _sz = huge_page_size(h);		\
-		if (_sz >= P4D_SIZE)				\
-			tlb_flush_p4d_range(tlb, address, _sz);	\
-		else if (_sz >= PUD_SIZE)			\
-			tlb_flush_pud_range(tlb, address, _sz);	\
-		else if (_sz >= PMD_SIZE)			\
-			tlb_flush_pmd_range(tlb, address, _sz);	\
-		else						\
-			tlb_flush_pte_range(tlb, address, _sz);	\
-		__tlb_remove_tlb_entry(tlb, ptep, address);	\
+#define tlb_remove_huge_tlb_entry(h, tlb, ptep, address)		\
+	do {								\
+		unsigned long _sz = huge_page_size(h);			\
+		if (_sz >= P4D_SIZE)					\
+			tlb_flush_p4d_range(tlb, address, _sz, true);	\
+		else if (_sz >= PUD_SIZE)				\
+			tlb_flush_pud_range(tlb, address, _sz, true);	\
+		else if (_sz >= PMD_SIZE)				\
+			tlb_flush_pmd_range(tlb, address, _sz, true);	\
+		else							\
+			tlb_flush_pte_range(tlb, address, _sz, true);	\
+		__tlb_remove_tlb_entry(tlb, ptep, address);		\
 	} while (0)
 
 /**
@@ -586,7 +604,7 @@ static inline void tlb_flush_p4d_range(struct mmu_gather *tlb,
 
 #define tlb_remove_pmd_tlb_entry(tlb, pmdp, address)			\
 	do {								\
-		tlb_flush_pmd_range(tlb, address, HPAGE_PMD_SIZE);	\
+		tlb_flush_pmd_range(tlb, address, HPAGE_PMD_SIZE, true);\
 		__tlb_remove_pmd_tlb_entry(tlb, pmdp, address);		\
 	} while (0)
 
@@ -600,7 +618,7 @@ static inline void tlb_flush_p4d_range(struct mmu_gather *tlb,
 
 #define tlb_remove_pud_tlb_entry(tlb, pudp, address)			\
 	do {								\
-		tlb_flush_pud_range(tlb, address, HPAGE_PUD_SIZE);	\
+		tlb_flush_pud_range(tlb, address, HPAGE_PUD_SIZE, true);\
 		__tlb_remove_pud_tlb_entry(tlb, pudp, address);		\
 	} while (0)
 
@@ -623,52 +641,52 @@ static inline void tlb_flush_p4d_range(struct mmu_gather *tlb,
 */
 
 #ifndef pte_free_tlb
-#define pte_free_tlb(tlb, ptep, address)			\
-	do {							\
-		tlb_flush_pmd_range(tlb, address, PAGE_SIZE);	\
-		tlb->freed_tables = 1;				\
-		__pte_free_tlb(tlb, ptep, address);		\
+#define pte_free_tlb(tlb, ptep, address)			\
+	do {							\
+		tlb_flush_pmd_range(tlb, address, PAGE_SIZE, true); \
+		tlb->freed_tables = 1;				\
+		__pte_free_tlb(tlb, ptep, address);		\
 	} while (0)
 #endif
 
 #ifndef pmd_free_tlb
-#define pmd_free_tlb(tlb, pmdp, address)			\
-	do {							\
-		tlb_flush_pud_range(tlb, address, PAGE_SIZE);	\
-		tlb->freed_tables = 1;				\
-		__pmd_free_tlb(tlb, pmdp, address);		\
+#define pmd_free_tlb(tlb, pmdp, address)			\
+	do {							\
+		tlb_flush_pud_range(tlb, address, PAGE_SIZE, true); \
+		tlb->freed_tables = 1;				\
+		__pmd_free_tlb(tlb, pmdp, address);		\
 	} while (0)
 #endif
 
 #ifndef pud_free_tlb
-#define pud_free_tlb(tlb, pudp, address)			\
-	do {							\
-		tlb_flush_p4d_range(tlb, address, PAGE_SIZE);	\
-		tlb->freed_tables = 1;				\
-		__pud_free_tlb(tlb, pudp, address);		\
+#define pud_free_tlb(tlb, pudp, address)			\
+	do {							\
+		tlb_flush_p4d_range(tlb, address, PAGE_SIZE, true); \
+		tlb->freed_tables = 1;				\
+		__pud_free_tlb(tlb, pudp, address);		\
 	} while (0)
 #endif
 
 #ifndef p4d_free_tlb
-#define p4d_free_tlb(tlb, pudp, address)			\
-	do {							\
-		__tlb_adjust_range(tlb, address, PAGE_SIZE);	\
-		tlb->freed_tables = 1;				\
-		__p4d_free_tlb(tlb, pudp, address);		\
+#define p4d_free_tlb(tlb, pudp, address)			\
+	do {							\
+		__tlb_adjust_range(tlb, address, PAGE_SIZE);	\
+		tlb->freed_tables = 1;				\
+		__p4d_free_tlb(tlb, pudp, address);		\
 	} while (0)
 #endif
 
-#ifndef pte_needs_flush
-static inline bool pte_needs_flush(pte_t oldpte, pte_t newpte)
+#ifndef pte_flush_type
+static inline enum pte_flush_type pte_flush_type(pte_t oldpte, pte_t newpte)
 {
-	return true;
+	return PTE_FLUSH_STRICT;
 }
 #endif
 
-#ifndef huge_pmd_needs_flush
-static inline bool huge_pmd_needs_flush(pmd_t oldpmd, pmd_t newpmd)
+#ifndef huge_pmd_flush_type
+static inline enum pte_flush_type huge_pmd_flush_type(pmd_t oldpmd, pmd_t newpmd)
 {
-	return true;
+	return PTE_FLUSH_STRICT;
 }
 #endif
 
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 6b961a29bf26..8825f1314a28 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -698,6 +698,12 @@ extern void tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm);
 extern void tlb_gather_mmu_fullmm(struct mmu_gather *tlb, struct mm_struct *mm);
 extern void tlb_finish_mmu(struct mmu_gather *tlb);
 
+enum pte_flush_type {
+	PTE_FLUSH_NONE		= 0,	/* not necessary */
+	PTE_FLUSH_STRICT	= 1,	/* required */
+	PTE_FLUSH_RELAXED	= 2,	/* can cause spurious page-faults */
+};
+
 struct vm_fault;
 
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 60d742c33de3..09e6608a6431 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1713,6 +1713,7 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 	bool prot_numa = cp_flags & MM_CP_PROT_NUMA;
 	bool uffd_wp = cp_flags & MM_CP_UFFD_WP;
 	bool uffd_wp_resolve = cp_flags & MM_CP_UFFD_WP_RESOLVE;
+	enum pte_flush_type flush_type;
 
 	tlb_change_page_size(tlb, HPAGE_PMD_SIZE);
 
@@ -1815,8 +1816,10 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 	ret = HPAGE_PMD_NR;
 	set_pmd_at(mm, addr, pmd, entry);
 
-	if (huge_pmd_needs_flush(oldpmd, entry))
-		tlb_flush_pmd_range(tlb, addr, HPAGE_PMD_SIZE);
+	flush_type = huge_pmd_flush_type(oldpmd, entry);
+	if (flush_type != PTE_FLUSH_NONE)
+		tlb_flush_pmd_range(tlb, addr, HPAGE_PMD_SIZE,
+				    flush_type == PTE_FLUSH_STRICT);
 
 	BUG_ON(vma_is_anonymous(vma) && !preserve_write && pmd_write(entry));
 unlock:
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 6621d3fe4991..9a667237a69a 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5022,7 +5022,7 @@ static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct
 		ptl = huge_pte_lock(h, mm, ptep);
 		if (huge_pmd_unshare(mm, vma, &address, ptep)) {
 			spin_unlock(ptl);
-			tlb_flush_pmd_range(tlb, address & PUD_MASK, PUD_SIZE);
+			tlb_flush_pmd_range(tlb, address & PUD_MASK, PUD_SIZE, true);
 			force_flush = true;
 			continue;
 		}
diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c
index a71924bd38c0..9a8bd2f23543 100644
--- a/mm/mmu_gather.c
+++ b/mm/mmu_gather.c
@@ -348,6 +348,7 @@ void tlb_finish_mmu(struct mmu_gather *tlb)
 		tlb->fullmm = 1;
 		__tlb_reset_range(tlb);
 		tlb->freed_tables = 1;
+		tlb->strict = 1;
 	}
 
 	tlb_flush_mmu(tlb);
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 92bfb17dcb8a..ead20dc66d34 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -117,6 +117,7 @@ static unsigned long change_pte_range(struct mmu_gather *tlb,
 			pte_t ptent;
 			bool preserve_write = (prot_numa || try_change_writable) &&
 					      pte_write(oldpte);
+			enum pte_flush_type flush_type;
 
 			/*
 			 * Avoid trapping faults against the zero or KSM
@@ -200,8 +201,11 @@ static unsigned long change_pte_range(struct mmu_gather *tlb,
 			}
 
 			ptep_modify_prot_commit(vma, addr, pte, oldpte, ptent);
-			if (pte_needs_flush(oldpte, ptent))
-				tlb_flush_pte_range(tlb, addr, PAGE_SIZE);
+
+			flush_type = pte_flush_type(oldpte, ptent);
+			if (flush_type != PTE_FLUSH_NONE)
+				tlb_flush_pte_range(tlb, addr, PAGE_SIZE,
+						flush_type == PTE_FLUSH_STRICT);
 			pages++;
 		} else if (is_swap_pte(oldpte)) {
 			swp_entry_t entry = pte_to_swp_entry(oldpte);
diff --git a/mm/rmap.c b/mm/rmap.c
index 23997c387858..62f4b2a4f067 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -974,7 +974,7 @@ static int page_vma_mkclean_one(struct page_vma_mapped_walk *pvmw)
 			entry = pte_wrprotect(oldpte);
 			entry = pte_mkclean(entry);
 
-			if (pte_needs_flush(oldpte, entry) ||
+			if (pte_flush_type(oldpte, entry) != PTE_FLUSH_NONE ||
 			    mm_tlb_flush_pending(vma->vm_mm))
 				flush_tlb_page(vma, address);
 
-- 
2.25.1

From nobody Sat Apr 18 04:19:35 2026
From: Nadav Amit
Subject: [RFC PATCH 10/14] x86/mm: introduce relaxed TLB flushes
Date: Mon, 18 Jul 2022 05:02:08 -0700
Message-Id: <20220718120212.3180-11-namit@vmware.com>
In-Reply-To: <20220718120212.3180-1-namit@vmware.com>
References: <20220718120212.3180-1-namit@vmware.com>

Introduce relaxed TLB flushes on x86. When protection is removed from
PTEs (i.e., PTEs become writable or executable), relaxed TLB flushes
are used. Relaxed TLB flushes do flush the local TLB, but do not flush
remote TLBs. If a spurious page-fault is later encountered, and the
local TLB generation is found to be out of sync with the mm's TLB
generation, a full TLB flush takes place to prevent further spurious
page-faults.
Cc: Andrea Arcangeli
Cc: Andrew Cooper
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Dave Hansen
Cc: David Hildenbrand
Cc: Peter Xu
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Will Deacon
Cc: Yu Zhao
Cc: Nick Piggin
Signed-off-by: Nadav Amit
---
 arch/x86/include/asm/tlb.h      | 3 ++-
 arch/x86/include/asm/tlbflush.h | 9 +++++----
 arch/x86/kernel/alternative.c   | 2 +-
 arch/x86/kernel/ldt.c           | 3 ++-
 arch/x86/mm/tlb.c               | 4 ++--
 5 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/tlb.h b/arch/x86/include/asm/tlb.h
index 1bfe979bb9bc..51c85136f9a8 100644
--- a/arch/x86/include/asm/tlb.h
+++ b/arch/x86/include/asm/tlb.h
@@ -20,7 +20,8 @@ static inline void tlb_flush(struct mmu_gather *tlb)
 		end = tlb->end;
 	}
 
-	flush_tlb_mm_range(tlb->mm, start, end, stride_shift, tlb->freed_tables);
+	flush_tlb_mm_range(tlb->mm, start, end, stride_shift, tlb->freed_tables,
+			   tlb->strict);
 }
 
 /*
diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 77d4810e5a5d..230cd1d24fe6 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -220,23 +220,24 @@ void flush_tlb_multi(const struct cpumask *cpumask,
 #endif
 
 #define flush_tlb_mm(mm) \
-	flush_tlb_mm_range(mm, 0UL, TLB_FLUSH_ALL, 0UL, true)
+	flush_tlb_mm_range(mm, 0UL, TLB_FLUSH_ALL, 0UL, true, true)
 
 #define flush_tlb_range(vma, start, end) \
 	flush_tlb_mm_range((vma)->vm_mm, start, end, \
 			   ((vma)->vm_flags & VM_HUGETLB) \
 			   ?
 			     huge_page_shift(hstate_vma(vma)) \
-			   : PAGE_SHIFT, false)
+			   : PAGE_SHIFT, false, true)
 
 extern void flush_tlb_all(void);
 extern void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
 				unsigned long end, unsigned int stride_shift,
-				bool freed_tables);
+				bool freed_tables, bool strict);
 extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
 
 static inline void flush_tlb_page(struct vm_area_struct *vma, unsigned long a)
 {
-	flush_tlb_mm_range(vma->vm_mm, a, a + PAGE_SIZE, PAGE_SHIFT, false);
+	flush_tlb_mm_range(vma->vm_mm, a, a + PAGE_SIZE, PAGE_SHIFT, false,
+			   true);
 }
 
 static inline u64 inc_mm_tlb_gen(struct mm_struct *mm)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index e257f6c80372..48945a47fd76 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -1099,7 +1099,7 @@ static void *__text_poke(text_poke_f func, void *addr, const void *src, size_t l
 	 */
 	flush_tlb_mm_range(poking_mm, poking_addr, poking_addr +
			   (cross_page_boundary ?
 			    2 : 1) * PAGE_SIZE,
-			   PAGE_SHIFT, false);
+			   PAGE_SHIFT, false, true);
 
 	if (func == text_poke_memcpy) {
 		/*
diff --git a/arch/x86/kernel/ldt.c b/arch/x86/kernel/ldt.c
index 525876e7b9f4..7c7bc97324bc 100644
--- a/arch/x86/kernel/ldt.c
+++ b/arch/x86/kernel/ldt.c
@@ -372,7 +372,8 @@ static void unmap_ldt_struct(struct mm_struct *mm, struct ldt_struct *ldt)
 	}
 
 	va = (unsigned long)ldt_slot_va(ldt->slot);
-	flush_tlb_mm_range(mm, va, va + nr_pages * PAGE_SIZE, PAGE_SHIFT, false);
+	flush_tlb_mm_range(mm, va, va + nr_pages * PAGE_SIZE, PAGE_SHIFT, false,
+			   true);
 }
 
 #else /* !CONFIG_PAGE_TABLE_ISOLATION */
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index ff3bcc55435e..ec5033d28a97 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -974,7 +974,7 @@ void flush_tlb_fix_spurious_fault(struct vm_area_struct *vma,
 
 void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
 				unsigned long end, unsigned int stride_shift,
-				bool freed_tables)
+				bool freed_tables, bool strict)
 {
 	struct flush_tlb_info *info;
 	u64 new_tlb_gen;
@@ -1000,7 +1000,7 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
 	 * a local TLB flush is needed. Optimize this use-case by calling
 	 * flush_tlb_func_local() directly in this case.
 	 */
-	if (cpumask_any_but(mm_cpumask(mm), cpu) < nr_cpu_ids) {
+	if (strict && cpumask_any_but(mm_cpumask(mm), cpu) < nr_cpu_ids) {
 		flush_tlb_multi(mm_cpumask(mm), info);
 	} else if (mm == this_cpu_read(cpu_tlbstate.loaded_mm)) {
 		lockdep_assert_irqs_enabled();
-- 
2.25.1

From: Nadav Amit
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Andrew Morton, Mike Rapoport, Axel Rasmussen, Nadav Amit, Andrea Arcangeli, Andrew Cooper, Andy Lutomirski, Dave Hansen, David Hildenbrand, Peter Xu, Peter Zijlstra, Thomas Gleixner, Will Deacon, Yu Zhao, Nick Piggin
Subject: [RFC PATCH 11/14] x86/mm: use relaxed TLB flushes when protection is removed
Date: Mon, 18 Jul 2022 05:02:09 -0700
Message-Id: <20220718120212.3180-12-namit@vmware.com>
In-Reply-To: <20220718120212.3180-1-namit@vmware.com>
References: <20220718120212.3180-1-namit@vmware.com>
From: Nadav Amit

When checking x86 PTE flags to determine whether a TLB flush is needed,
also determine whether a relaxed TLB flush is sufficient. If protection
is removed (NX cleared or W set), indicate that a relaxed TLB flush
suffices.

Cc: Andrea Arcangeli
Cc: Andrew Cooper
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Dave Hansen
Cc: David Hildenbrand
Cc: Peter Xu
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Will Deacon
Cc: Yu Zhao
Cc: Nick Piggin
Signed-off-by: Nadav Amit
---
 arch/x86/include/asm/tlbflush.h | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 230cd1d24fe6..4f98735ab07a 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -271,18 +271,23 @@ static inline enum pte_flush_type pte_flags_flush_type(unsigned long oldflags,
 	 * dirty/access bit if needed without a fault.
 	 */
 	const pteval_t flush_on_clear = _PAGE_DIRTY | _PAGE_PRESENT |
-					_PAGE_ACCESSED;
+					_PAGE_ACCESSED | _PAGE_RW;
+	const pteval_t flush_on_set = _PAGE_NX;
+	const pteval_t flush_on_set_relaxed = _PAGE_RW;
+	const pteval_t flush_on_clear_relaxed = _PAGE_NX;
 	const pteval_t software_flags = _PAGE_SOFTW1 | _PAGE_SOFTW2 |
 					_PAGE_SOFTW3 | _PAGE_SOFTW4;
-	const pteval_t flush_on_change = _PAGE_RW | _PAGE_USER | _PAGE_PWT |
+	const pteval_t flush_on_change = _PAGE_USER | _PAGE_PWT |
 			  _PAGE_PCD | _PAGE_PSE | _PAGE_GLOBAL | _PAGE_PAT |
 			  _PAGE_PAT_LARGE | _PAGE_PKEY_BIT0 | _PAGE_PKEY_BIT1 |
-			  _PAGE_PKEY_BIT2 | _PAGE_PKEY_BIT3 | _PAGE_NX;
+			  _PAGE_PKEY_BIT2 | _PAGE_PKEY_BIT3;
 	unsigned long diff = oldflags ^ newflags;
 
 	BUILD_BUG_ON(flush_on_clear & software_flags);
 	BUILD_BUG_ON(flush_on_clear & flush_on_change);
 	BUILD_BUG_ON(flush_on_change & software_flags);
+	BUILD_BUG_ON(flush_on_change & flush_on_clear_relaxed);
+	BUILD_BUG_ON(flush_on_change & flush_on_set_relaxed);
 
 	/* Ignore software flags */
 	diff &= ~software_flags;
@@ -301,9 +306,16 @@
 static inline enum pte_flush_type pte_flags_flush_type(unsigned long oldflags,
 	if (diff & flush_on_change)
 		return PTE_FLUSH_STRICT;
 
+	if (diff & oldflags & flush_on_clear_relaxed)
+		return PTE_FLUSH_RELAXED;
+
+	if (diff & newflags & flush_on_set_relaxed)
+		return PTE_FLUSH_RELAXED;
+
 	/* Ensure there are no flags that were left behind */
 	if (IS_ENABLED(CONFIG_DEBUG_VM) &&
-	    (diff & ~(flush_on_clear | software_flags | flush_on_change))) {
+	    (diff & ~(flush_on_clear | flush_on_set |
+		      software_flags | flush_on_change))) {
 		VM_WARN_ON_ONCE(1);
 		return PTE_FLUSH_STRICT;
 	}
-- 
2.25.1

From: Nadav Amit
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Andrew Morton, Mike Rapoport, Axel Rasmussen, Nadav Amit, Andrea Arcangeli, Andrew Cooper, Andy Lutomirski, Dave Hansen, David Hildenbrand, Peter Xu, Peter Zijlstra, Thomas Gleixner, Will Deacon, Yu Zhao, Nick Piggin
Subject: [RFC PATCH 12/14] x86/tlb: no flush on PTE change from RW->RO when PTE is clean
Date: Mon, 18 Jul 2022 05:02:10 -0700
Message-Id: <20220718120212.3180-13-namit@vmware.com>
X-Mailer:
git-send-email 2.25.1
In-Reply-To: <20220718120212.3180-1-namit@vmware.com>
References: <20220718120212.3180-1-namit@vmware.com>

From: Nadav Amit

On x86 it is possible to skip a TLB flush when a RW entry becomes RO
and the PTE is clean. Add logic to detect this case and skip the flush.

Cc: Andrea Arcangeli
Cc: Andrew Cooper
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Dave Hansen
Cc: David Hildenbrand
Cc: Peter Xu
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Will Deacon
Cc: Yu Zhao
Cc: Nick Piggin
Signed-off-by: Nadav Amit
---
 arch/x86/include/asm/tlbflush.h | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 4f98735ab07a..58c95e36b098 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -271,8 +271,9 @@ static inline enum pte_flush_type pte_flags_flush_type(unsigned long oldflags,
 	 * dirty/access bit if needed without a fault.
 	 */
 	const pteval_t flush_on_clear = _PAGE_DIRTY | _PAGE_PRESENT |
-					_PAGE_ACCESSED | _PAGE_RW;
+					_PAGE_ACCESSED;
 	const pteval_t flush_on_set = _PAGE_NX;
+	const pteval_t flush_on_special = _PAGE_RW;
 	const pteval_t flush_on_set_relaxed = _PAGE_RW;
 	const pteval_t flush_on_clear_relaxed = _PAGE_NX;
 	const pteval_t software_flags = _PAGE_SOFTW1 | _PAGE_SOFTW2 |
@@ -302,6 +303,17 @@ static inline enum pte_flush_type pte_flags_flush_type(unsigned long oldflags,
 	if (diff & oldflags & flush_on_clear)
 		return PTE_FLUSH_STRICT;
 
+	/*
+	 * Were any of the 'flush_on_set' flags set between 'oldflags' and
+	 * 'newflags'?
+	 */
+	if (diff & newflags & flush_on_set)
+		return PTE_FLUSH_STRICT;
+
+	/* On RW->RO, a flush is needed if the old entry is dirty */
+	if ((diff & oldflags & _PAGE_RW) && (oldflags & _PAGE_DIRTY))
+		return PTE_FLUSH_STRICT;
+
 	/* Flush on modified flags. */
 	if (diff & flush_on_change)
 		return PTE_FLUSH_STRICT;
@@ -314,7 +326,7 @@ static inline enum pte_flush_type pte_flags_flush_type(unsigned long oldflags,
 
 	/* Ensure there are no flags that were left behind */
 	if (IS_ENABLED(CONFIG_DEBUG_VM) &&
-	    (diff & ~(flush_on_clear | flush_on_set |
+	    (diff & ~(flush_on_clear | flush_on_set | flush_on_special |
 		      software_flags | flush_on_change))) {
 		VM_WARN_ON_ONCE(1);
 		return PTE_FLUSH_STRICT;
-- 
2.25.1

From: Nadav Amit
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Andrew Morton, Mike Rapoport, Axel Rasmussen, Nadav Amit, Andrea Arcangeli, Andrew Cooper, Andy Lutomirski, Dave Hansen, David Hildenbrand, Peter Xu, Peter Zijlstra, Thomas Gleixner, Will Deacon, Yu Zhao, Nick Piggin
Subject: [RFC PATCH 13/14] mm/mprotect: do not check flush type if a strict one is needed
Date: Mon, 18 Jul 2022 05:02:11
-0700
Message-Id: <20220718120212.3180-14-namit@vmware.com>
In-Reply-To: <20220718120212.3180-1-namit@vmware.com>
References: <20220718120212.3180-1-namit@vmware.com>

From: Nadav Amit

Once a strict TLB flush has been determined to be needed, it is likely
that other PTEs will also need one, and there is little benefit in not
extending the flushed range. Skip checking which TLB flush type is
needed once a strict flush is already required.

Cc: Andrea Arcangeli
Cc: Andrew Cooper
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Dave Hansen
Cc: David Hildenbrand
Cc: Peter Xu
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Will Deacon
Cc: Yu Zhao
Cc: Nick Piggin
Signed-off-by: Nadav Amit
---
 mm/huge_memory.c | 4 +++-
 mm/mprotect.c    | 4 +++-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 09e6608a6431..b32b7da0f6f7 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1816,7 +1816,9 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 	ret = HPAGE_PMD_NR;
 	set_pmd_at(mm, addr, pmd, entry);
 
-	flush_type = huge_pmd_flush_type(oldpmd, entry);
+	flush_type = PTE_FLUSH_STRICT;
+	if (!tlb->strict)
+		flush_type = huge_pmd_flush_type(oldpmd, entry);
 	if (flush_type != PTE_FLUSH_NONE)
 		tlb_flush_pmd_range(tlb, addr, HPAGE_PMD_SIZE,
 				    flush_type == PTE_FLUSH_STRICT);
diff --git a/mm/mprotect.c b/mm/mprotect.c
index ead20dc66d34..cf775f6c8c08 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -202,7 +202,9 @@ static unsigned long change_pte_range(struct mmu_gather *tlb,
 
 			ptep_modify_prot_commit(vma, addr, pte, oldpte, ptent);
 
-			flush_type = pte_flush_type(oldpte, ptent);
+			flush_type = PTE_FLUSH_STRICT;
+			if (!tlb->strict)
+				flush_type = pte_flush_type(oldpte,
 							    ptent);
 			if (flush_type != PTE_FLUSH_NONE)
 				tlb_flush_pte_range(tlb, addr, PAGE_SIZE,
 					flush_type == PTE_FLUSH_STRICT);
-- 
2.25.1

From: Nadav Amit
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Andrew Morton, Mike Rapoport, Axel Rasmussen, Nadav Amit, Andrea Arcangeli, Andrew Cooper, Andy Lutomirski, Dave Hansen, David Hildenbrand, Peter Xu, Peter Zijlstra, Thomas Gleixner, Will Deacon, Yu Zhao, Nick Piggin
Subject: [RFC PATCH 14/14] mm: conditional check of pfn in pte_flush_type
Date: Mon, 18 Jul 2022 05:02:12 -0700
Message-Id: <20220718120212.3180-15-namit@vmware.com>
In-Reply-To: <20220718120212.3180-1-namit@vmware.com>
References: <20220718120212.3180-1-namit@vmware.com>

From: Nadav Amit

Checking whether the PFNs in two PTEs are the same takes a surprisingly
large number of instructions.
Yet in fact, in most cases the caller of pte_flush_type() already knows
whether the PFN changed. For instance, mprotect() does not change the
PFN, but only modifies the protection flags. Add an argument to
pte_flush_type() to indicate whether the PFN should be checked. Keep
checking it under mm debugging to catch callers that wrongly assume
the PFN is unchanged.

Cc: Andrea Arcangeli
Cc: Andrew Cooper
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Dave Hansen
Cc: David Hildenbrand
Cc: Peter Xu
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Will Deacon
Cc: Yu Zhao
Cc: Nick Piggin
Signed-off-by: Nadav Amit
---
 arch/x86/include/asm/tlbflush.h | 14 ++++++++++----
 include/asm-generic/tlb.h       |  6 ++++--
 mm/huge_memory.c                |  2 +-
 mm/mprotect.c                   |  2 +-
 mm/rmap.c                       |  2 +-
 5 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 58c95e36b098..50349861fdc9 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -340,14 +340,17 @@ static inline enum pte_flush_type pte_flags_flush_type(unsigned long oldflags,
  * whether a strict or relaxed TLB flush is need. It should only be used on
  * userspace PTEs.
  */
-static inline enum pte_flush_type pte_flush_type(pte_t oldpte, pte_t newpte)
+static inline enum pte_flush_type pte_flush_type(pte_t oldpte, pte_t newpte,
+						 bool check_pfn)
 {
 	/* !PRESENT -> * ; no need for flush */
 	if (!(pte_flags(oldpte) & _PAGE_PRESENT))
 		return PTE_FLUSH_NONE;
 
 	/* PFN changed ; needs flush */
-	if (pte_pfn(oldpte) != pte_pfn(newpte))
+	if (!check_pfn)
+		VM_BUG_ON(pte_pfn(oldpte) != pte_pfn(newpte));
+	else if (pte_pfn(oldpte) != pte_pfn(newpte))
 		return PTE_FLUSH_STRICT;
 
 	/*
@@ -363,14 +366,17 @@ static inline enum pte_flush_type pte_flush_type(pte_t oldpte, pte_t newpte)
  * huge_pmd_flush_type() checks whether permissions were demoted and require a
  * flush. It should only be used for userspace huge PMDs.
  */
-static inline enum pte_flush_type huge_pmd_flush_type(pmd_t oldpmd, pmd_t newpmd)
+static inline enum pte_flush_type huge_pmd_flush_type(pmd_t oldpmd, pmd_t newpmd,
+						      bool check_pfn)
 {
 	/* !PRESENT -> * ; no need for flush */
 	if (!(pmd_flags(oldpmd) & _PAGE_PRESENT))
 		return PTE_FLUSH_NONE;
 
 	/* PFN changed ; needs flush */
-	if (pmd_pfn(oldpmd) != pmd_pfn(newpmd))
+	if (!check_pfn)
+		VM_BUG_ON(pmd_pfn(oldpmd) != pmd_pfn(newpmd));
+	else if (pmd_pfn(oldpmd) != pmd_pfn(newpmd))
 		return PTE_FLUSH_STRICT;
 
 	/*
diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index 07b3eb8caf63..aee9da6cc5d5 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -677,14 +677,16 @@ static inline void tlb_flush_p4d_range(struct mmu_gather *tlb,
 #endif
 
 #ifndef pte_flush_type
-static inline struct pte_flush_type pte_flush_type(pte_t oldpte, pte_t newpte)
+static inline struct pte_flush_type pte_flush_type(pte_t oldpte, pte_t newpte,
+						   bool check_pfn)
 {
 	return PTE_FLUSH_STRICT;
 }
 #endif
 
 #ifndef huge_pmd_flush_type
-static inline bool huge_pmd_flush_type(pmd_t oldpmd, pmd_t newpmd)
+static inline bool huge_pmd_flush_type(pmd_t oldpmd, pmd_t newpmd,
+				       bool check_pfn)
 {
 	return PTE_FLUSH_STRICT;
 }
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index b32b7da0f6f7..92a7b3ca317f 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1818,7 +1818,7 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 
 	flush_type = PTE_FLUSH_STRICT;
 	if (!tlb->strict)
-		flush_type = huge_pmd_flush_type(oldpmd, entry);
+		flush_type = huge_pmd_flush_type(oldpmd, entry, false);
 	if (flush_type != PTE_FLUSH_NONE)
 		tlb_flush_pmd_range(tlb, addr, HPAGE_PMD_SIZE,
 				    flush_type == PTE_FLUSH_STRICT);
diff --git a/mm/mprotect.c b/mm/mprotect.c
index cf775f6c8c08..78081d7f4edf 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -204,7 +204,7 @@ static unsigned long change_pte_range(struct mmu_gather *tlb,
 
 			flush_type =
 				     PTE_FLUSH_STRICT;
 			if (!tlb->strict)
-				flush_type = pte_flush_type(oldpte, ptent);
+				flush_type = pte_flush_type(oldpte, ptent, false);
 			if (flush_type != PTE_FLUSH_NONE)
 				tlb_flush_pte_range(tlb, addr, PAGE_SIZE,
 					flush_type == PTE_FLUSH_STRICT);
diff --git a/mm/rmap.c b/mm/rmap.c
index 62f4b2a4f067..63261619b607 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -974,7 +974,7 @@ static int page_vma_mkclean_one(struct page_vma_mapped_walk *pvmw)
 		entry = pte_wrprotect(oldpte);
 		entry = pte_mkclean(entry);
 
-		if (pte_flush_type(oldpte, entry) != PTE_FLUSH_NONE ||
+		if (pte_flush_type(oldpte, entry, false) != PTE_FLUSH_NONE ||
 		    mm_tlb_flush_pending(vma->vm_mm))
 			flush_tlb_page(vma, address);
 
-- 
2.25.1