From: Suren Baghdasaryan <surenb@google.com>
Subject: [PATCH 01/41] maple_tree: Be more cautious about dead nodes
Date: Mon, 9 Jan 2023 12:52:56 -0800
Message-ID: <20230109205336.3665937-2-surenb@google.com>
In-Reply-To: <20230109205336.3665937-1-surenb@google.com>
From: Liam Howlett

ma_pivots() and ma_data_end() may be called with a dead node. Ensure
that the node isn't dead before using the returned values.

This is necessary for RCU mode of the maple tree.

Fixes: 54a611b60590 ("Maple Tree: add new data structure")
Signed-off-by: Liam Howlett
Signed-off-by: Suren Baghdasaryan
---
 lib/maple_tree.c | 53 +++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 43 insertions(+), 10 deletions(-)

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 26e2045d3cda..ff9f04e0150d 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -540,6 +540,7 @@ static inline bool ma_dead_node(const struct maple_node *node)
 
 	return (parent == node);
 }
+
 /*
  * mte_dead_node() - check if the @enode is dead.
  * @enode: The encoded maple node
@@ -621,6 +622,8 @@ static inline unsigned int mas_alloc_req(const struct ma_state *mas)
  * @node - the maple node
  * @type - the node type
  *
+ * In the event of a dead node, this array may be %NULL
+ *
  * Return: A pointer to the maple node pivots
  */
 static inline unsigned long *ma_pivots(struct maple_node *node,
@@ -1091,8 +1094,11 @@ static int mas_ascend(struct ma_state *mas)
 	a_type = mas_parent_enum(mas, p_enode);
 	a_node = mte_parent(p_enode);
 	a_slot = mte_parent_slot(p_enode);
-	pivots = ma_pivots(a_node, a_type);
 	a_enode = mt_mk_node(a_node, a_type);
+	pivots = ma_pivots(a_node, a_type);
+
+	if (unlikely(ma_dead_node(a_node)))
+		return 1;
 
 	if (!set_min && a_slot) {
 		set_min = true;
@@ -1398,6 +1404,9 @@ static inline unsigned char ma_data_end(struct maple_node *node,
 {
 	unsigned char offset;
 
+	if (!pivots)
+		return 0;
+
 	if (type == maple_arange_64)
 		return ma_meta_end(node, type);
 
@@ -1433,6 +1442,9 @@ static inline unsigned char mas_data_end(struct ma_state *mas)
 		return ma_meta_end(node, type);
 
 	pivots = ma_pivots(node, type);
+	if (unlikely(ma_dead_node(node)))
+		return 0;
+
 	offset = mt_pivots[type] - 1;
 	if (likely(!pivots[offset]))
 		return ma_meta_end(node, type);
@@ -4504,6 +4516,9 @@ static inline int mas_prev_node(struct ma_state *mas, unsigned long min)
 	node = mas_mn(mas);
 	slots = ma_slots(node, mt);
 	pivots = ma_pivots(node, mt);
+	if (unlikely(ma_dead_node(node)))
+		return 1;
+
 	mas->max = pivots[offset];
 	if (offset)
 		mas->min = pivots[offset - 1] + 1;
@@ -4525,6 +4540,9 @@ static inline int mas_prev_node(struct ma_state *mas, unsigned long min)
 		slots = ma_slots(node, mt);
 		pivots = ma_pivots(node, mt);
 		offset = ma_data_end(node, mt, pivots, mas->max);
+		if (unlikely(ma_dead_node(node)))
+			return 1;
+
 		if (offset)
 			mas->min = pivots[offset - 1] + 1;
 
@@ -4573,6 +4591,7 @@ static inline int mas_next_node(struct ma_state *mas, struct maple_node *node,
 	struct maple_enode *enode;
 	int level = 0;
 	unsigned char offset;
+	unsigned char node_end;
 	enum maple_type mt;
 	void __rcu **slots;
 
@@ -4596,7 +4615,11 @@ static inline int mas_next_node(struct ma_state *mas, struct maple_node *node,
 		node = mas_mn(mas);
 		mt = mte_node_type(mas->node);
 		pivots = ma_pivots(node, mt);
-	} while (unlikely(offset == ma_data_end(node, mt, pivots, mas->max)));
+		node_end = ma_data_end(node, mt, pivots, mas->max);
+		if (unlikely(ma_dead_node(node)))
+			return 1;
+
+	} while (unlikely(offset == node_end));
 
 	slots = ma_slots(node, mt);
 	pivot = mas_safe_pivot(mas, pivots, ++offset, mt);
@@ -4612,6 +4635,9 @@ static inline int mas_next_node(struct ma_state *mas, struct maple_node *node,
 		mt = mte_node_type(mas->node);
 		slots = ma_slots(node, mt);
 		pivots = ma_pivots(node, mt);
+		if (unlikely(ma_dead_node(node)))
+			return 1;
+
 		offset = 0;
 		pivot = pivots[0];
 	}
@@ -4658,16 +4684,18 @@ static inline void *mas_next_nentry(struct ma_state *mas,
 		return NULL;
 	}
 
-	pivots = ma_pivots(node, type);
 	slots = ma_slots(node, type);
-	mas->index = mas_safe_min(mas, pivots, mas->offset);
-	if (ma_dead_node(node))
+	pivots = ma_pivots(node, type);
+	count = ma_data_end(node, type, pivots, mas->max);
+	if (unlikely(ma_dead_node(node)))
 		return NULL;
 
+	mas->index = mas_safe_min(mas, pivots, mas->offset);
+	if (unlikely(ma_dead_node(node)))
+		return NULL;
 	if (mas->index > max)
 		return NULL;
 
-	count = ma_data_end(node, type, pivots, mas->max);
 	if (mas->offset > count)
 		return NULL;
 
@@ -4815,6 +4843,11 @@ static inline void *mas_prev_nentry(struct ma_state *mas, unsigned long limit,
 
 	slots = ma_slots(mn, mt);
 	pivots = ma_pivots(mn, mt);
+	if (unlikely(ma_dead_node(mn))) {
+		mas_rewalk(mas, index);
+		goto retry;
+	}
+
 	if (offset == mt_pivots[mt])
 		pivot = mas->max;
 	else
@@ -6613,11 +6646,11 @@ static inline void *mas_first_entry(struct ma_state *mas, struct maple_node *mn,
 	while (likely(!ma_is_leaf(mt))) {
 		MT_BUG_ON(mas->tree, mte_dead_node(mas->node));
 		slots = ma_slots(mn, mt);
-		pivots = ma_pivots(mn, mt);
-		max = pivots[0];
 		entry = mas_slot(mas, slots, 0);
+		pivots = ma_pivots(mn, mt);
 		if (unlikely(ma_dead_node(mn)))
 			return NULL;
+		max = pivots[0];
 		mas->node = entry;
 		mn = mas_mn(mas);
 		mt = mte_node_type(mas->node);
@@ -6637,13 +6670,13 @@ static inline void *mas_first_entry(struct ma_state *mas, struct maple_node *mn,
 	if (likely(entry))
 		return entry;
 
-	pivots = ma_pivots(mn, mt);
-	mas->index = pivots[0] + 1;
 	mas->offset = 1;
 	entry = mas_slot(mas, slots, 1);
+	pivots = ma_pivots(mn, mt);
 	if (unlikely(ma_dead_node(mn)))
 		return NULL;
 
+	mas->index = pivots[0] + 1;
 	if (mas->index > limit)
 		goto none;
 
-- 
2.39.0

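For readers following the series, the discipline these hunks enforce (load what you need from the node, then re-check for death before trusting it; patch 06 later supplies the smp_rmb() that pins the ordering) can be shown in miniature. A userspace C11 sketch; every name here is illustrative, none of it is maple-tree API:

    #include <stdatomic.h>

    struct node {
        _Atomic(struct node *) parent;      /* points to itself once dead */
        _Atomic unsigned long pivot[3];
    };

    /* Reader: load the data first, then validate the node before using it. */
    static int read_pivot(struct node *n, int slot, unsigned long *out)
    {
        unsigned long v = atomic_load_explicit(&n->pivot[slot],
                                               memory_order_relaxed);

        /* Keep the pivot load ordered before the parent load
         * (the C11 analogue of smp_rmb()). */
        atomic_thread_fence(memory_order_acquire);
        if (atomic_load_explicit(&n->parent, memory_order_relaxed) == n)
            return -1;  /* node died under us: discard v, retry the walk */

        *out = v;
        return 0;
    }
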
From: Suren Baghdasaryan <surenb@google.com>
Subject: [PATCH 02/41] maple_tree: Detect dead nodes in mas_start()
Date: Mon, 9 Jan 2023 12:52:57 -0800
Message-ID: <20230109205336.3665937-3-surenb@google.com>

From: Liam Howlett

When initially starting a search, the root node may already be in the
process of being replaced in RCU mode. Detect and restart the walk if
this is the case.

This is necessary for RCU mode of the maple tree.

Fixes: 54a611b60590 ("Maple Tree: add new data structure")
Signed-off-by: Liam Howlett
Signed-off-by: Suren Baghdasaryan
---
 lib/maple_tree.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index ff9f04e0150d..a748938ad2e9 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -1359,11 +1359,15 @@ static inline struct maple_enode *mas_start(struct ma_state *mas)
 		mas->depth = 0;
 		mas->offset = 0;
 
+retry:
 		root = mas_root(mas);
 		/* Tree with nodes */
 		if (likely(xa_is_node(root))) {
 			mas->depth = 1;
 			mas->node = mte_safe_root(root);
+			if (mte_dead_node(mas->node))
+				goto retry;
+
 			return NULL;
 		}
 
-- 
2.39.0

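The fix has the classic RCU re-read shape: snapshot the root, and if a concurrent writer has already retired it, discard the snapshot and start over. A self-contained sketch of that loop, again with illustrative names only:

    #include <stdatomic.h>
    #include <stddef.h>

    struct node { _Atomic(struct node *) parent; };

    static _Atomic(struct node *) tree_root;

    static int node_is_dead(struct node *n)
    {
        /* dead nodes point their parent pointer at themselves */
        return atomic_load_explicit(&n->parent, memory_order_acquire) == n;
    }

    /* Snapshot the root; if a writer already retired it, resnapshot. */
    static struct node *start_walk(void)
    {
        struct node *root;

        do {
            root = atomic_load_explicit(&tree_root, memory_order_acquire);
        } while (root != NULL && node_is_dead(root));

        return root;
    }
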
From: Suren Baghdasaryan <surenb@google.com>
Subject: [PATCH 03/41] maple_tree: Fix freeing of nodes in rcu mode
Date: Mon, 9 Jan 2023 12:52:58 -0800
Message-ID: <20230109205336.3665937-4-surenb@google.com>

From: Liam Howlett

The walk to destroy the nodes was not always setting the node type and
would result in a destroy method potentially using the values as nodes.
Avoid this by setting the correct node types.

This is necessary for the RCU mode of the maple tree.

Fixes: 54a611b60590 ("Maple Tree: add new data structure")
Signed-off-by: Liam Howlett
Signed-off-by: Suren Baghdasaryan
---
 lib/maple_tree.c | 73 ++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 62 insertions(+), 11 deletions(-)

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index a748938ad2e9..a11eea943f8d 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -897,6 +897,44 @@ static inline void ma_set_meta(struct maple_node *mn, enum maple_type mt,
 	meta->end = end;
 }
 
+/*
+ * mas_clear_meta() - clear the metadata information of a node, if it exists
+ * @mas: The maple state
+ * @mn: The maple node
+ * @mt: The maple node type
+ * @offset: The offset of the highest sub-gap in this node.
+ * @end: The end of the data in this node.
+ */
+static inline void mas_clear_meta(struct ma_state *mas, struct maple_node *mn,
+				  enum maple_type mt)
+{
+	struct maple_metadata *meta;
+	unsigned long *pivots;
+	void __rcu **slots;
+	void *next;
+
+	switch (mt) {
+	case maple_range_64:
+		pivots = mn->mr64.pivot;
+		if (unlikely(pivots[MAPLE_RANGE64_SLOTS - 2])) {
+			slots = mn->mr64.slot;
+			next = mas_slot_locked(mas, slots,
+					       MAPLE_RANGE64_SLOTS - 1);
+			if (unlikely((mte_to_node(next) && mte_node_type(next))))
+				return; /* The last slot is a node, no metadata */
+		}
+		fallthrough;
+	case maple_arange_64:
+		meta = ma_meta(mn, mt);
+		break;
+	default:
+		return;
+	}
+
+	meta->gap = 0;
+	meta->end = 0;
+}
+
 /*
  * ma_meta_end() - Get the data end of a node from the metadata
  * @mn: The maple node
@@ -5448,20 +5486,22 @@ static inline int mas_rev_alloc(struct ma_state *mas, unsigned long min,
  * mas_dead_leaves() - Mark all leaves of a node as dead.
  * @mas: The maple state
  * @slots: Pointer to the slot array
+ * @type: The maple node type
  *
  * Must hold the write lock.
  *
 * Return: The number of leaves marked as dead.
 */
 static inline
-unsigned char mas_dead_leaves(struct ma_state *mas, void __rcu **slots)
+unsigned char mas_dead_leaves(struct ma_state *mas, void __rcu **slots,
+			      enum maple_type mt)
 {
 	struct maple_node *node;
 	enum maple_type type;
 	void *entry;
 	int offset;
 
-	for (offset = 0; offset < mt_slot_count(mas->node); offset++) {
+	for (offset = 0; offset < mt_slots[mt]; offset++) {
 		entry = mas_slot_locked(mas, slots, offset);
 		type = mte_node_type(entry);
 		node = mte_to_node(entry);
@@ -5480,14 +5520,13 @@ unsigned char mas_dead_leaves(struct ma_state *mas, void __rcu **slots)
 
 static void __rcu **mas_dead_walk(struct ma_state *mas, unsigned char offset)
 {
-	struct maple_node *node, *next;
+	struct maple_node *next;
 	void __rcu **slots = NULL;
 
 	next = mas_mn(mas);
 	do {
-		mas->node = ma_enode_ptr(next);
-		node = mas_mn(mas);
-		slots = ma_slots(node, node->type);
+		mas->node = mt_mk_node(next, next->type);
+		slots = ma_slots(next, next->type);
 		next = mas_slot_locked(mas, slots, offset);
 		offset = 0;
 	} while (!ma_is_leaf(next->type));
@@ -5551,11 +5590,14 @@ static inline void __rcu **mas_destroy_descend(struct ma_state *mas,
 		node = mas_mn(mas);
 		slots = ma_slots(node, mte_node_type(mas->node));
 		next = mas_slot_locked(mas, slots, 0);
-		if ((mte_dead_node(next)))
+		if ((mte_dead_node(next))) {
+			mte_to_node(next)->type = mte_node_type(next);
 			next = mas_slot_locked(mas, slots, 1);
+		}
 
 		mte_set_node_dead(mas->node);
 		node->type = mte_node_type(mas->node);
+		mas_clear_meta(mas, node, node->type);
 		node->piv_parent = prev;
 		node->parent_slot = offset;
 		offset = 0;
@@ -5575,13 +5617,18 @@ static void mt_destroy_walk(struct maple_enode *enode, unsigned char ma_flags,
 
 	MA_STATE(mas, &mt, 0, 0);
 
-	if (mte_is_leaf(enode))
+	mas.node = enode;
+	if (mte_is_leaf(enode)) {
+		node->type = mte_node_type(enode);
 		goto free_leaf;
+	}
 
+	ma_flags &= ~MT_FLAGS_LOCK_MASK;
 	mt_init_flags(&mt, ma_flags);
 	mas_lock(&mas);
 
-	mas.node = start = enode;
+	mte_to_node(enode)->ma_flags = ma_flags;
+	start = enode;
 	slots = mas_destroy_descend(&mas, start, 0);
 	node = mas_mn(&mas);
 	do {
@@ -5589,7 +5636,8 @@ static void mt_destroy_walk(struct maple_enode *enode, unsigned char ma_flags,
 		unsigned char offset;
 		struct maple_enode *parent, *tmp;
 
-		node->slot_len = mas_dead_leaves(&mas, slots);
+		node->type = mte_node_type(mas.node);
+		node->slot_len = mas_dead_leaves(&mas, slots, node->type);
 		if (free)
 			mt_free_bulk(node->slot_len, slots);
 		offset = node->parent_slot + 1;
@@ -5613,7 +5661,8 @@ static void mt_destroy_walk(struct maple_enode *enode, unsigned char ma_flags,
 	} while (start != mas.node);
 
 	node = mas_mn(&mas);
-	node->slot_len = mas_dead_leaves(&mas, slots);
+	node->type = mte_node_type(mas.node);
+	node->slot_len = mas_dead_leaves(&mas, slots, node->type);
 	if (free)
 		mt_free_bulk(node->slot_len, slots);
 
@@ -5623,6 +5672,8 @@ static void mt_destroy_walk(struct maple_enode *enode, unsigned char ma_flags,
 free_leaf:
 	if (free)
 		mt_free_rcu(&node->rcu);
+	else
+		mas_clear_meta(&mas, node, node->type);
 }
 
 /*
-- 
2.39.0

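The failure mode is easy to state in miniature: a teardown walker that trusts a per-node type field will interpret arbitrary slot values as child pointers if that field was never written. A toy analogue of the invariant the patch restores (not the maple-tree layout):

    #include <stdlib.h>

    enum node_type { NODE_LEAF, NODE_INTERNAL };

    struct node {
        enum node_type type;    /* must be valid before the destroy walk */
        struct node *slot[4];
        int nr_slots;
    };

    static void destroy(struct node *n)
    {
        /* If ->type were stale, a leaf's slot[] values (arbitrary user
         * entries) would be dereferenced here as if they were nodes. */
        if (n->type == NODE_INTERNAL) {
            for (int i = 0; i < n->nr_slots; i++) {
                if (n->slot[i])
                    destroy(n->slot[i]);
            }
        }
        free(n);
    }
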
From: Suren Baghdasaryan <surenb@google.com>
Subject: [PATCH 04/41] maple_tree: remove extra smp_wmb() from mas_dead_leaves()
Date: Mon, 9 Jan 2023 12:52:59 -0800
Message-ID: <20230109205336.3665937-5-surenb@google.com>

From: Liam Howlett

The call to mte_set_dead_node() before the smp_wmb() already calls
smp_wmb(), so this is not needed.

This is an optimization for the RCU mode of the maple tree.

Fixes: 54a611b60590 ("Maple Tree: add new data structure")
Signed-off-by: Liam Howlett
Signed-off-by: Suren Baghdasaryan
---
 lib/maple_tree.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index a11eea943f8d..d85291b19f86 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -5510,7 +5510,6 @@ unsigned char mas_dead_leaves(struct ma_state *mas, void __rcu **slots,
 			break;
 
 		mte_set_node_dead(entry);
-		smp_wmb(); /* Needed for RCU */
 		node->type = type;
 		rcu_assign_pointer(slots[offset], node);
 	}
-- 
2.39.0

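The reasoning, restated as a C11 sketch under the commit's assumption that the dead-marking helper already issues its own barrier: the rcu_assign_pointer()-style publication at the end is itself a release store, so every write before it is already ordered. Names are illustrative:

    #include <stdatomic.h>

    struct node { _Atomic(struct node *) parent; int type; };

    /* The dead-marking helper already publishes with release semantics... */
    static void set_node_dead(struct node *n)
    {
        atomic_store_explicit(&n->parent, n, memory_order_release);
    }

    /* ...and the slot update below is itself a release store, so the dead
     * mark and the ->type write are ordered before publication without any
     * extra fence in between. */
    static void replace_slot(_Atomic(struct node *) *slot, struct node *dead,
                             struct node *repl)
    {
        set_node_dead(dead);
        repl->type = 1;
        atomic_store_explicit(slot, repl, memory_order_release);
    }
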
From: Suren Baghdasaryan <surenb@google.com>
Subject: [PATCH 05/41] maple_tree: Fix write memory barrier of nodes once dead for RCU mode
Date: Mon, 9 Jan 2023 12:53:00 -0800
Message-ID: <20230109205336.3665937-6-surenb@google.com>

From: "Liam R. Howlett"

During the development of the maple tree, the strategy of freeing
multiple nodes changed and, in the process, the pivots were reused to
store pointers to dead nodes. To ensure the readers see accurate
pivots, the writers need to mark the nodes as dead and call smp_wmb()
to ensure any readers can identify the node as dead before using the
pivot values.
There were two places where the old method of marking the node as dead
without smp_wmb() was still being used, which resulted in RCU readers
seeing the wrong pivot value before seeing the node was dead. Fix this
race condition by using mte_set_node_dead(), which has the smp_wmb()
call, to ensure the race is closed.

Add a WARN_ON() to the ma_free_rcu() call to ensure all nodes being
freed are marked as dead, and that there are no other call paths
besides the two updated paths.

This is necessary for the RCU mode of the maple tree.

Fixes: 54a611b60590 ("Maple Tree: add new data structure")
Signed-off-by: Liam R. Howlett
Signed-off-by: Suren Baghdasaryan
---
 lib/maple_tree.c                 |  7 +++++--
 tools/testing/radix-tree/maple.c | 16 ++++++++++++++++
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index d85291b19f86..8066fb1e8ec9 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -179,7 +179,7 @@ static void mt_free_rcu(struct rcu_head *head)
  */
 static void ma_free_rcu(struct maple_node *node)
 {
-	node->parent = ma_parent_ptr(node);
+	WARN_ON(node->parent != ma_parent_ptr(node));
 	call_rcu(&node->rcu, mt_free_rcu);
 }
 
@@ -1775,8 +1775,10 @@ static inline void mas_replace(struct ma_state *mas, bool advanced)
 		rcu_assign_pointer(slots[offset], mas->node);
 	}
 
-	if (!advanced)
+	if (!advanced) {
+		mte_set_node_dead(old_enode);
 		mas_free(mas, old_enode);
+	}
 }
 
 /*
@@ -4217,6 +4219,7 @@ static inline bool mas_wr_node_store(struct ma_wr_state *wr_mas)
 done:
 	mas_leaf_set_meta(mas, newnode, dst_pivots, maple_leaf_64, new_end);
 	if (in_rcu) {
+		mte_set_node_dead(mas->node);
 		mas->node = mt_mk_node(newnode, wr_mas->type);
 		mas_replace(mas, false);
 	} else {
diff --git a/tools/testing/radix-tree/maple.c b/tools/testing/radix-tree/maple.c
index 81fa7ec2e66a..2539ad6c4777 100644
--- a/tools/testing/radix-tree/maple.c
+++ b/tools/testing/radix-tree/maple.c
@@ -108,6 +108,7 @@ static noinline void check_new_node(struct maple_tree *mt)
 	MT_BUG_ON(mt, mn->slot[1] != NULL);
 	MT_BUG_ON(mt, mas_allocated(&mas) != 0);
 
+	mn->parent = ma_parent_ptr(mn);
 	ma_free_rcu(mn);
 	mas.node = MAS_START;
 	mas_nomem(&mas, GFP_KERNEL);
@@ -160,6 +161,7 @@ static noinline void check_new_node(struct maple_tree *mt)
 		MT_BUG_ON(mt, mas_allocated(&mas) != i);
 		MT_BUG_ON(mt, !mn);
 		MT_BUG_ON(mt, not_empty(mn));
+		mn->parent = ma_parent_ptr(mn);
 		ma_free_rcu(mn);
 	}
 
@@ -192,6 +194,7 @@ static noinline void check_new_node(struct maple_tree *mt)
 		MT_BUG_ON(mt, not_empty(mn));
 		MT_BUG_ON(mt, mas_allocated(&mas) != i - 1);
 		MT_BUG_ON(mt, !mn);
+		mn->parent = ma_parent_ptr(mn);
 		ma_free_rcu(mn);
 	}
 
@@ -210,6 +213,7 @@ static noinline void check_new_node(struct maple_tree *mt)
 			mn = mas_pop_node(&mas);
 			MT_BUG_ON(mt, not_empty(mn));
 			MT_BUG_ON(mt, mas_allocated(&mas) != j - 1);
+			mn->parent = ma_parent_ptr(mn);
 			ma_free_rcu(mn);
 		}
 		MT_BUG_ON(mt, mas_allocated(&mas) != 0);
@@ -233,6 +237,7 @@ static noinline void check_new_node(struct maple_tree *mt)
 			MT_BUG_ON(mt, mas_allocated(&mas) != i - j);
 			mn = mas_pop_node(&mas);
 			MT_BUG_ON(mt, not_empty(mn));
+			mn->parent = ma_parent_ptr(mn);
 			ma_free_rcu(mn);
 			MT_BUG_ON(mt, mas_allocated(&mas) != i - j - 1);
 		}
@@ -269,6 +274,7 @@ static noinline void check_new_node(struct maple_tree *mt)
 		mn = mas_pop_node(&mas); /* get the next node. */
 		MT_BUG_ON(mt, mn == NULL);
 		MT_BUG_ON(mt, not_empty(mn));
+		mn->parent = ma_parent_ptr(mn);
 		ma_free_rcu(mn);
 	}
 	MT_BUG_ON(mt, mas_allocated(&mas) != 0);
@@ -294,6 +300,7 @@ static noinline void check_new_node(struct maple_tree *mt)
 		mn = mas_pop_node(&mas2); /* get the next node. */
 		MT_BUG_ON(mt, mn == NULL);
 		MT_BUG_ON(mt, not_empty(mn));
+		mn->parent = ma_parent_ptr(mn);
 		ma_free_rcu(mn);
 	}
 	MT_BUG_ON(mt, mas_allocated(&mas2) != 0);
@@ -334,10 +341,12 @@ static noinline void check_new_node(struct maple_tree *mt)
 	MT_BUG_ON(mt, mas_allocated(&mas) != MAPLE_ALLOC_SLOTS + 2);
 	mn = mas_pop_node(&mas);
 	MT_BUG_ON(mt, not_empty(mn));
+	mn->parent = ma_parent_ptr(mn);
 	ma_free_rcu(mn);
 	for (i = 1; i <= MAPLE_ALLOC_SLOTS + 1; i++) {
 		mn = mas_pop_node(&mas);
 		MT_BUG_ON(mt, not_empty(mn));
+		mn->parent = ma_parent_ptr(mn);
 		ma_free_rcu(mn);
 	}
 	MT_BUG_ON(mt, mas_allocated(&mas) != 0);
@@ -375,6 +384,7 @@ static noinline void check_new_node(struct maple_tree *mt)
 	mas_node_count(&mas, i); /* Request */
 	mas_nomem(&mas, GFP_KERNEL); /* Fill request */
 	mn = mas_pop_node(&mas); /* get the next node. */
+	mn->parent = ma_parent_ptr(mn);
 	ma_free_rcu(mn);
 	mas_destroy(&mas);
 
@@ -382,10 +392,13 @@ static noinline void check_new_node(struct maple_tree *mt)
 	mas_node_count(&mas, i); /* Request */
 	mas_nomem(&mas, GFP_KERNEL); /* Fill request */
 	mn = mas_pop_node(&mas); /* get the next node. */
+	mn->parent = ma_parent_ptr(mn);
 	ma_free_rcu(mn);
 	mn = mas_pop_node(&mas); /* get the next node. */
+	mn->parent = ma_parent_ptr(mn);
 	ma_free_rcu(mn);
 	mn = mas_pop_node(&mas); /* get the next node. */
+	mn->parent = ma_parent_ptr(mn);
 	ma_free_rcu(mn);
 	mas_destroy(&mas);
 }
@@ -35369,6 +35382,7 @@ static noinline void check_prealloc(struct maple_tree *mt)
 	MT_BUG_ON(mt, allocated != 1 + height * 3);
 	mn = mas_pop_node(&mas);
 	MT_BUG_ON(mt, mas_allocated(&mas) != allocated - 1);
+	mn->parent = ma_parent_ptr(mn);
 	ma_free_rcu(mn);
 	MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0);
 	mas_destroy(&mas);
@@ -35386,6 +35400,7 @@ static noinline void check_prealloc(struct maple_tree *mt)
 	mas_destroy(&mas);
 	allocated = mas_allocated(&mas);
 	MT_BUG_ON(mt, allocated != 0);
+	mn->parent = ma_parent_ptr(mn);
 	ma_free_rcu(mn);
 
 	MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL) != 0);
@@ -35756,6 +35771,7 @@ void farmer_tests(void)
 	tree.ma_root = mt_mk_node(node, maple_leaf_64);
 	mt_dump(&tree);
 
+	node->parent = ma_parent_ptr(node);
 	ma_free_rcu(node);
 
 	/* Check things that will make lockdep angry */
-- 
2.39.0

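Reduced to its core, the writer-side rule being enforced: the dead mark and its barrier must be published before the node's memory, pivots included, is reused. A minimal C11 sketch of that ordering, with a release fence standing in for smp_wmb() and illustrative names throughout:

    #include <stdatomic.h>

    struct node {
        _Atomic(struct node *) parent;
        _Atomic unsigned long pivot[3];
    };

    static void retire_node(struct node *n, struct node *dead_list)
    {
        /* Mark dead first: dead nodes point their parent at themselves. */
        atomic_store_explicit(&n->parent, n, memory_order_relaxed);

        /* Make the mark visible before the pivot is reused (the C11
         * analogue of smp_wmb() in mte_set_node_dead()). */
        atomic_thread_fence(memory_order_release);

        /* Only now may pivots be reused, e.g. to chain retired nodes; a
         * reader that still sees the node alive is guaranteed to have
         * read a pre-reuse pivot. */
        atomic_store_explicit(&n->pivot[0], (unsigned long)dead_list,
                              memory_order_relaxed);
    }
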
From: Suren Baghdasaryan <surenb@google.com>
Subject: [PATCH 06/41] maple_tree: Add smp_rmb() to dead node detection
Date: Mon, 9 Jan 2023 12:53:01 -0800
Message-ID: <20230109205336.3665937-7-surenb@google.com>

From: "Liam R. Howlett"
Howlett" Add an smp_rmb() before reading the parent pointer to ensure that anything read from the node prior to the parent pointer hasn't been reordered ahead of this check. The is necessary for RCU mode. Fixes: 54a611b60590 ("Maple Tree: add new data structure") Signed-off-by: Liam R. Howlett Signed-off-by: Suren Baghdasaryan --- lib/maple_tree.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/lib/maple_tree.c b/lib/maple_tree.c index 8066fb1e8ec9..80ca28b656d3 100644 --- a/lib/maple_tree.c +++ b/lib/maple_tree.c @@ -535,9 +535,11 @@ static inline struct maple_node *mte_parent(const stru= ct maple_enode *enode) */ static inline bool ma_dead_node(const struct maple_node *node) { - struct maple_node *parent =3D (void *)((unsigned long) - node->parent & ~MAPLE_NODE_MASK); + struct maple_node *parent; =20 + /* Do not reorder reads from the node prior to the parent check */ + smp_rmb(); + parent =3D (void *)((unsigned long) node->parent & ~MAPLE_NODE_MASK); return (parent =3D=3D node); } =20 @@ -552,6 +554,8 @@ static inline bool mte_dead_node(const struct maple_eno= de *enode) struct maple_node *parent, *node; =20 node =3D mte_to_node(enode); + /* Do not reorder reads from the node prior to the parent check */ + smp_rmb(); parent =3D mte_parent(enode); return (parent =3D=3D node); } --=20 2.39.0 From nobody Mon Sep 15 21:41:10 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 141E9C5479D for ; Mon, 9 Jan 2023 20:54:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229618AbjAIUyg (ORCPT ); Mon, 9 Jan 2023 15:54:36 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60016 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237209AbjAIUyA (ORCPT ); Mon, 9 Jan 2023 15:54:00 -0500 Received: from mail-yw1-x114a.google.com (mail-yw1-x114a.google.com [IPv6:2607:f8b0:4864:20::114a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 41B8675D0D for ; Mon, 9 Jan 2023 12:53:59 -0800 (PST) Received: by mail-yw1-x114a.google.com with SMTP id 00721157ae682-487b0bf1117so104130847b3.5 for ; Mon, 09 Jan 2023 12:53:59 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=TT+2XsH00KwTX585kTrU5AriHJozMAXcs0iFxeAd0aM=; b=Czdnd0b+5bkyGxVk+9aViJOW1TJ5m6k5XpyXdj4R+Q3dGGEBK2eP5deQvySgSh0iz/ GZ1iYooFrZ5CipGLrqMEQsrY/Zn4kUyQA8m4xIFN7mfEerPKCDkU1AubZLdIfDfth4hu d8dG+6QDcTUs7A9Y052NxeL9TA5a3QdfxlEt4CBoSlbUsLAUBEORkFyS1sdo7qsKHieD EXHjCZRBRMhYEWoMQiXcySJyrWxVf33iALFZt6xsCbFcNo15dZaOnJt+J48Z17ETAj7O 45DKwd6DY6kCK0yPohqOb3E3TN0P5VYKNR4qSRkuKInUqGltfhkGbi1MmOGm2N8pUB+H +TEQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=TT+2XsH00KwTX585kTrU5AriHJozMAXcs0iFxeAd0aM=; b=bC7B/rJ3bVdZfmD7YYZZtl92FWvCZHlyc5d7T6uwJwdSoDaAH/3/GspsE7gH/oJE3Y YNMRv8y17rWecdG/EefZGbFAhpm1grM2Sr1FZ9GZ3a9BAsy40pDUq1b8eVXFQ3ozmq0C +VviG7ozHW8h4GXs9OX8aVUt7RhtG5St1bP2Lis68XORHnS/WERPnOVi1B/O1diGDSOL rkQzJsIVu+p6qNdCtLlaTBEX6lIiO2yk1mJn+u3r5Rqkm47lnNu/IRtB+0HClHkj07ZO 
From: Suren Baghdasaryan <surenb@google.com>
Subject: [PATCH 07/41] mm: Enable maple tree RCU mode by default.
Date: Mon, 9 Jan 2023 12:53:02 -0800
Message-ID: <20230109205336.3665937-8-surenb@google.com>

From: "Liam R. Howlett"

Use the maple tree in RCU mode for VMA tracking. This is necessary for
the use of per-VMA locking. RCU mode is enabled by default but disabled
when exiting an mm and for the new tree during a fork.

Also enable RCU for the tree used in munmap operations to ensure the
nodes remain valid for readers.

Signed-off-by: Liam R. Howlett
Signed-off-by: Suren Baghdasaryan
---
 include/linux/mm_types.h | 3 ++-
 kernel/fork.c            | 3 +++
 mm/mmap.c                | 4 +++-
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 3b8475007734..4b6bce73fbb4 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -810,7 +810,8 @@ struct mm_struct {
 		unsigned long cpu_bitmap[];
 };
 
-#define MM_MT_FLAGS	(MT_FLAGS_ALLOC_RANGE | MT_FLAGS_LOCK_EXTERN)
+#define MM_MT_FLAGS	(MT_FLAGS_ALLOC_RANGE | MT_FLAGS_LOCK_EXTERN | \
+			 MT_FLAGS_USE_RCU)
 extern struct mm_struct init_mm;
 
 /* Pointer magic because the dynamic array size confuses some compilers. */
diff --git a/kernel/fork.c b/kernel/fork.c
index 9f7fe3541897..58aab6c889a4 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -617,6 +617,7 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
 	if (retval)
 		goto out;
 
+	mt_clear_in_rcu(mas.tree);
 	mas_for_each(&old_mas, mpnt, ULONG_MAX) {
 		struct file *file;
 
@@ -703,6 +704,8 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
 	retval = arch_dup_mmap(oldmm, mm);
 loop_out:
 	mas_destroy(&mas);
+	if (!retval)
+		mt_set_in_rcu(mas.tree);
 out:
 	mmap_write_unlock(mm);
 	flush_tlb_mm(oldmm);
diff --git a/mm/mmap.c b/mm/mmap.c
index 87d929316d57..9db37adfc00a 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2304,7 +2304,8 @@ do_mas_align_munmap(struct ma_state *mas, struct vm_area_struct *vma,
 	int count = 0;
 	int error = -ENOMEM;
 	MA_STATE(mas_detach, &mt_detach, 0, 0);
-	mt_init_flags(&mt_detach, MT_FLAGS_LOCK_EXTERN);
+	mt_init_flags(&mt_detach, mas->tree->ma_flags &
+		      (MT_FLAGS_LOCK_MASK | MT_FLAGS_USE_RCU));
 	mt_set_external_lock(&mt_detach, &mm->mmap_lock);
 
 	if (mas_preallocate(mas, vma, GFP_KERNEL))
@@ -3091,6 +3092,7 @@ void exit_mmap(struct mm_struct *mm)
 	 */
 	set_bit(MMF_OOM_SKIP, &mm->flags);
 	mmap_write_lock(mm);
+	mt_clear_in_rcu(&mm->mm_mt);
 	free_pgtables(&tlb, &mm->mm_mt, vma, FIRST_USER_ADDRESS,
 		      USER_PGTABLES_CEILING);
 	tlb_finish_mmu(&tlb);
-- 
2.39.0

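In usage terms, RCU mode lets lookups run under rcu_read_lock() alone while a writer modifies the tree under its own lock. A sketch of the reader side this enables; mtree_load() and the rcu_*() calls are real kernel APIs, the wrapper itself is illustrative:

    /* Look up an entry without taking the tree's (external) lock. */
    static void *lookup_under_rcu(struct maple_tree *mt, unsigned long index)
    {
        void *entry;

        rcu_read_lock();
        entry = mtree_load(mt, index);  /* nodes are freed via call_rcu() */
        rcu_read_unlock();

        return entry;
    }
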
09EdWC3pfY1xNhblOW9226aQmFtJKb+iyDY1D8nXDt5S4L1BFQasBfLF0kPA6M+eGlKT YLmA== X-Gm-Message-State: AFqh2kofjfIYk3POQw957Ax06ZxYq33OW7AWmlrgmG4lOL0irTVgKzxN FR5z5lnE5aO3a7tD61R5fRo8ynZ//EI= X-Google-Smtp-Source: AMrXdXt3ngSzeTd4BjcWtYkKZUmKlJ+coRJjsu8zn30hseAAXL6r6J79qyIJjR/uVl5OkeApgAgECG74aLI= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:9393:6f7a:d410:55ca]) (user=surenb job=sendgmr) by 2002:a25:9f8b:0:b0:705:cde7:2363 with SMTP id u11-20020a259f8b000000b00705cde72363mr8137319ybq.81.1673297640913; Mon, 09 Jan 2023 12:54:00 -0800 (PST) Date: Mon, 9 Jan 2023 12:53:03 -0800 In-Reply-To: <20230109205336.3665937-1-surenb@google.com> Mime-Version: 1.0 References: <20230109205336.3665937-1-surenb@google.com> X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog Message-ID: <20230109205336.3665937-9-surenb@google.com> Subject: [PATCH 08/41] mm: introduce CONFIG_PER_VMA_LOCK From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: michel@lespinasse.org, jglisse@google.com, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mgorman@techsingularity.net, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, peterz@infradead.org, ldufour@linux.ibm.com, laurent.dufour@fr.ibm.com, paulmck@kernel.org, luto@kernel.org, songliubraving@fb.com, peterx@redhat.com, david@redhat.com, dhowells@redhat.com, hughd@google.com, bigeasy@linutronix.de, kent.overstreet@linux.dev, punit.agrawal@bytedance.com, lstoakes@gmail.com, peterjung1337@gmail.com, rientjes@google.com, axelrasmussen@google.com, joelaf@google.com, minchan@google.com, jannh@google.com, shakeelb@google.com, tatashin@google.com, edumazet@google.com, gthelen@google.com, gurua@google.com, arjunroy@google.com, soheil@google.com, hughlynch@google.com, leewalsh@google.com, posk@google.com, linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, x86@kernel.org, linux-kernel@vger.kernel.org, kernel-team@android.com, surenb@google.com Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" This configuration variable will be used to build the support for VMA locking during page fault handling. This is enabled by default on supported architectures with SMP and MMU set. The architecture support is needed since the page fault handler is called from the architecture's page faulting code which needs modifications to handle faults under VMA lock. Signed-off-by: Suren Baghdasaryan --- mm/Kconfig | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/mm/Kconfig b/mm/Kconfig index ff7b209dec05..0aeca3794972 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -1183,6 +1183,19 @@ config LRU_GEN_STATS This option has a per-memcg and per-node memory overhead. # } =20 +config ARCH_SUPPORTS_PER_VMA_LOCK + def_bool n + +config PER_VMA_LOCK + bool "Per-vma locking support" + default y + depends on ARCH_SUPPORTS_PER_VMA_LOCK && MMU && SMP + help + Allow per-vma locking during page fault handling. + + This feature allows locking each virtual memory area separately when + handling page faults instead of taking mmap_lock. 
From: Suren Baghdasaryan
Date: Mon, 9 Jan 2023 12:53:04 -0800
Message-ID: <20230109205336.3665937-10-surenb@google.com>
In-Reply-To: <20230109205336.3665937-1-surenb@google.com>
Subject: [PATCH 09/41] mm: rcu safe VMA freeing

From: Michel Lespinasse

This prepares for page fault handling under VMA lock: VMAs will be
looked up under protection of an rcu read lock instead of the usual
mmap read lock.

Signed-off-by: Michel Lespinasse
Signed-off-by: Suren Baghdasaryan
---
 include/linux/mm_types.h | 13 ++++++++++---
 kernel/fork.c            | 13 +++++++++++++
 2 files changed, 23 insertions(+), 3 deletions(-)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 4b6bce73fbb4..d5cdec1314fe 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -535,9 +535,16 @@ struct anon_vma_name {
 struct vm_area_struct {
 	/* The first cache line has the info for VMA tree walking. */
 
-	unsigned long vm_start;		/* Our start address within vm_mm. */
-	unsigned long vm_end;		/* The first byte after our end address
-					   within vm_mm. */
+	union {
+		struct {
+			/* VMA covers [vm_start; vm_end) addresses within mm */
+			unsigned long vm_start;
+			unsigned long vm_end;
+		};
+#ifdef CONFIG_PER_VMA_LOCK
+		struct rcu_head vm_rcu;	/* Used for deferred freeing. */
+#endif
+	};
 
 	struct mm_struct *vm_mm;	/* The address space we belong to. */
 
diff --git a/kernel/fork.c b/kernel/fork.c
index 58aab6c889a4..5986817f393c 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -479,10 +479,23 @@ struct vm_area_struct *vm_area_dup(struct vm_area_struct *orig)
 	return new;
 }
 
+#ifdef CONFIG_PER_VMA_LOCK
+static void __vm_area_free(struct rcu_head *head)
+{
+	struct vm_area_struct *vma = container_of(head, struct vm_area_struct,
+						  vm_rcu);
+	kmem_cache_free(vm_area_cachep, vma);
+}
+#endif
+
 void vm_area_free(struct vm_area_struct *vma)
 {
 	free_anon_vma_name(vma);
+#ifdef CONFIG_PER_VMA_LOCK
+	call_rcu(&vma->vm_rcu, __vm_area_free);
+#else
 	kmem_cache_free(vm_area_cachep, vma);
+#endif
 }
 
 static void account_kernel_stack(struct task_struct *tsk, int account)
-- 
2.39.0
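The deferred free matters because a reader may still hold a VMA pointer
obtained under rcu_read_lock() when the VMA is unmapped. A minimal sketch of
such a reader, assuming a hypothetical find_vma_under_rcu() lookup helper
that this patch does not introduce:

#include <linux/mm.h>
#include <linux/rcupdate.h>

/* Hypothetical RCU-safe lookup; introduced only later in the series. */
struct vm_area_struct *find_vma_under_rcu(struct mm_struct *mm,
					  unsigned long addr);

static unsigned long read_vma_start(struct mm_struct *mm, unsigned long addr)
{
	struct vm_area_struct *vma;
	unsigned long start = 0;

	rcu_read_lock();
	vma = find_vma_under_rcu(mm, addr);
	if (vma)
		/* Safe even against a racing unmap: vm_area_free() now
		 * defers the kmem_cache_free() through call_rcu(). */
		start = vma->vm_start;
	rcu_read_unlock();
	return start;
}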
From: Suren Baghdasaryan
Date: Mon, 9 Jan 2023 12:53:05 -0800
Message-ID: <20230109205336.3665937-11-surenb@google.com>
In-Reply-To: <20230109205336.3665937-1-surenb@google.com>
Subject: [PATCH 10/41] mm: move mmap_lock assert function definitions

Move the mmap_lock assert function definitions up so that they can be
used by other mmap_lock routines.

Signed-off-by: Suren Baghdasaryan
---
 include/linux/mmap_lock.h | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
index 96e113e23d04..e49ba91bb1f0 100644
--- a/include/linux/mmap_lock.h
+++ b/include/linux/mmap_lock.h
@@ -60,6 +60,18 @@ static inline void __mmap_lock_trace_released(struct mm_struct *mm, bool write)
 
 #endif /* CONFIG_TRACING */
 
+static inline void mmap_assert_locked(struct mm_struct *mm)
+{
+	lockdep_assert_held(&mm->mmap_lock);
+	VM_BUG_ON_MM(!rwsem_is_locked(&mm->mmap_lock), mm);
+}
+
+static inline void mmap_assert_write_locked(struct mm_struct *mm)
+{
+	lockdep_assert_held_write(&mm->mmap_lock);
+	VM_BUG_ON_MM(!rwsem_is_locked(&mm->mmap_lock), mm);
+}
+
 static inline void mmap_init_lock(struct mm_struct *mm)
 {
 	init_rwsem(&mm->mmap_lock);
@@ -150,18 +162,6 @@ static inline void mmap_read_unlock_non_owner(struct mm_struct *mm)
 	up_read_non_owner(&mm->mmap_lock);
 }
 
-static inline void mmap_assert_locked(struct mm_struct *mm)
-{
-	lockdep_assert_held(&mm->mmap_lock);
-	VM_BUG_ON_MM(!rwsem_is_locked(&mm->mmap_lock), mm);
-}
-
-static inline void mmap_assert_write_locked(struct mm_struct *mm)
-{
-	lockdep_assert_held_write(&mm->mmap_lock);
-	VM_BUG_ON_MM(!rwsem_is_locked(&mm->mmap_lock), mm);
-}
-
 static inline int mmap_lock_is_contended(struct mm_struct *mm)
 {
 	return rwsem_is_contended(&mm->mmap_lock);
-- 
2.39.0
From: Suren Baghdasaryan
Date: Mon, 9 Jan 2023 12:53:06 -0800
Message-ID: <20230109205336.3665937-12-surenb@google.com>
In-Reply-To: <20230109205336.3665937-1-surenb@google.com>
Subject: [PATCH 11/41] mm: export dump_mm()
mmap_assert_write_locked() will be used in the next patch to ensure that
the VMA write lock is taken only while mmap_lock is held exclusively.
Because mmap_assert_write_locked() uses dump_mm(), and the VMA write lock
can be taken from inside a module, dump_mm() needs to be exported.

Signed-off-by: Suren Baghdasaryan
---
 mm/debug.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mm/debug.c b/mm/debug.c
index 7f8e5f744e42..b6e9e53469d1 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -215,6 +215,7 @@ void dump_mm(const struct mm_struct *mm)
 		mm->def_flags, &mm->def_flags
 	);
 }
+EXPORT_SYMBOL(dump_mm);
 
 static bool page_init_poisoning __read_mostly = true;
 
-- 
2.39.0
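To illustrate the dependency chain: a module that marks a VMA write-locked
(the way the next patch's vma_write_lock() does) runs
mmap_assert_write_locked(), whose VM_BUG_ON_MM() dumps the mm via dump_mm()
on failure, hence the export. A sketch under that assumption:

#include <linux/mm.h>
#include <linux/mmap_lock.h>

static void module_touch_vma(struct vm_area_struct *vma)
{
	/* Inlined into the module; on failure VM_BUG_ON_MM() calls
	 * dump_mm(), which must therefore be visible to modules. */
	mmap_assert_write_locked(vma->vm_mm);
}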
From: Suren Baghdasaryan
Date: Mon, 9 Jan 2023 12:53:07 -0800
Message-ID: <20230109205336.3665937-13-surenb@google.com>
In-Reply-To: <20230109205336.3665937-1-surenb@google.com>
Subject: [PATCH 12/41] mm: add per-VMA lock and helper functions to control it

Introduce a per-VMA rw_semaphore to be used during page fault handling
instead of mmap_lock. Because there are cases when multiple VMAs need
to be exclusively locked during VMA tree modifications, instead of the
usual lock/unlock pattern we mark a VMA as locked by taking the per-VMA
lock exclusively and setting vma->vm_lock_seq to the current
mm->mm_lock_seq. When the mmap_write_lock holder is done with all
modifications and drops mmap_lock, it increments mm->mm_lock_seq,
effectively unlocking all VMAs marked as locked.

The VMA lock is placed on a cache line boundary so that its 'count'
field falls into the first cache line while the rest of the fields fall
into the second cache line. This lets the 'count' field be cached with
other frequently accessed fields and used quickly in the uncontended
case, while 'owner' and the other fields used in the contended case do
not invalidate the first cache line while waiting on the lock.

Signed-off-by: Suren Baghdasaryan
---
 include/linux/mm.h        | 80 +++++++++++++++++++++++++++++++++++++++
 include/linux/mm_types.h  |  8 ++++
 include/linux/mmap_lock.h | 13 +++++++
 kernel/fork.c             |  4 ++
 mm/init-mm.c              |  3 ++
 5 files changed, 108 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index f3f196e4d66d..ec2c4c227d51 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -612,6 +612,85 @@ struct vm_operations_struct {
 					  unsigned long addr);
 };
 
+#ifdef CONFIG_PER_VMA_LOCK
+static inline void vma_init_lock(struct vm_area_struct *vma)
+{
+	init_rwsem(&vma->lock);
+	vma->vm_lock_seq = -1;
+}
+
+static inline void vma_write_lock(struct vm_area_struct *vma)
+{
+	int mm_lock_seq;
+
+	mmap_assert_write_locked(vma->vm_mm);
+
+	/*
+	 * current task is holding mmap_write_lock, both vma->vm_lock_seq and
+	 * mm->mm_lock_seq can't be concurrently modified.
+	 */
+	mm_lock_seq = READ_ONCE(vma->vm_mm->mm_lock_seq);
+	if (vma->vm_lock_seq == mm_lock_seq)
+		return;
+
+	down_write(&vma->lock);
+	vma->vm_lock_seq = mm_lock_seq;
+	up_write(&vma->lock);
+}
+
+/*
+ * Try to read-lock a vma. The function is allowed to occasionally yield false
+ * locked result to avoid performance overhead, in which case we fall back to
+ * using mmap_lock. The function should never yield false unlocked result.
+ */
+static inline bool vma_read_trylock(struct vm_area_struct *vma)
+{
+	/* Check before locking. A race might cause false locked result. */
+	if (vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq))
+		return false;
+
+	if (unlikely(down_read_trylock(&vma->lock) == 0))
+		return false;
+
+	/*
+	 * Overflow might produce false locked result.
+	 * False unlocked result is impossible because we modify and check
+	 * vma->vm_lock_seq under vma->lock protection and mm->mm_lock_seq
+	 * modification invalidates all existing locks.
+	 */
+	if (unlikely(vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq))) {
+		up_read(&vma->lock);
+		return false;
+	}
+	return true;
+}
+
+static inline void vma_read_unlock(struct vm_area_struct *vma)
+{
+	up_read(&vma->lock);
+}
+
+static inline void vma_assert_write_locked(struct vm_area_struct *vma)
+{
+	mmap_assert_write_locked(vma->vm_mm);
+	/*
+	 * current task is holding mmap_write_lock, both vma->vm_lock_seq and
+	 * mm->mm_lock_seq can't be concurrently modified.
+	 */
+	VM_BUG_ON_VMA(vma->vm_lock_seq != READ_ONCE(vma->vm_mm->mm_lock_seq), vma);
+}
+
+#else /* CONFIG_PER_VMA_LOCK */
+
+static inline void vma_init_lock(struct vm_area_struct *vma) {}
+static inline void vma_write_lock(struct vm_area_struct *vma) {}
+static inline bool vma_read_trylock(struct vm_area_struct *vma)
+		{ return false; }
+static inline void vma_read_unlock(struct vm_area_struct *vma) {}
+static inline void vma_assert_write_locked(struct vm_area_struct *vma) {}
+
+#endif /* CONFIG_PER_VMA_LOCK */
+
 static inline void vma_init(struct vm_area_struct *vma, struct mm_struct *mm)
 {
 	static const struct vm_operations_struct dummy_vm_ops = {};
@@ -620,6 +699,7 @@ static inline void vma_init(struct vm_area_struct *vma, struct mm_struct *mm)
 	vma->vm_mm = mm;
 	vma->vm_ops = &dummy_vm_ops;
 	INIT_LIST_HEAD(&vma->anon_vma_chain);
+	vma_init_lock(vma);
 }
 
 static inline void vma_set_anonymous(struct vm_area_struct *vma)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index d5cdec1314fe..5f7c5ca89931 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -555,6 +555,11 @@ struct vm_area_struct {
 	pgprot_t vm_page_prot;
 	unsigned long vm_flags;		/* Flags, see mm.h. */
 
+#ifdef CONFIG_PER_VMA_LOCK
+	int vm_lock_seq;
+	struct rw_semaphore lock;
+#endif
+
 	/*
 	 * For areas with an address space and backing store,
 	 * linkage into the address_space->i_mmap interval tree.
@@ -680,6 +685,9 @@ struct mm_struct {
 					  * init_mm.mmlist, and are protected
 					  * by mmlist_lock
 					  */
+#ifdef CONFIG_PER_VMA_LOCK
+		int mm_lock_seq;
+#endif
 
 
 		unsigned long hiwater_rss; /* High-watermark of RSS usage */
diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
index e49ba91bb1f0..40facd4c398b 100644
--- a/include/linux/mmap_lock.h
+++ b/include/linux/mmap_lock.h
@@ -72,6 +72,17 @@ static inline void mmap_assert_write_locked(struct mm_struct *mm)
 	VM_BUG_ON_MM(!rwsem_is_locked(&mm->mmap_lock), mm);
 }
 
+#ifdef CONFIG_PER_VMA_LOCK
+static inline void vma_write_unlock_mm(struct mm_struct *mm)
+{
+	mmap_assert_write_locked(mm);
+	/* No races during update due to exclusive mmap_lock being held */
+	WRITE_ONCE(mm->mm_lock_seq, mm->mm_lock_seq + 1);
+}
+#else
+static inline void vma_write_unlock_mm(struct mm_struct *mm) {}
+#endif
+
 static inline void mmap_init_lock(struct mm_struct *mm)
 {
 	init_rwsem(&mm->mmap_lock);
@@ -114,12 +125,14 @@ static inline bool mmap_write_trylock(struct mm_struct *mm)
 static inline void mmap_write_unlock(struct mm_struct *mm)
 {
 	__mmap_lock_trace_released(mm, true);
+	vma_write_unlock_mm(mm);
 	up_write(&mm->mmap_lock);
 }
 
 static inline void mmap_write_downgrade(struct mm_struct *mm)
 {
 	__mmap_lock_trace_acquire_returned(mm, false, true);
+	vma_write_unlock_mm(mm);
 	downgrade_write(&mm->mmap_lock);
 }
 
diff --git a/kernel/fork.c b/kernel/fork.c
index 5986817f393c..c026d75108b3 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -474,6 +474,7 @@ struct vm_area_struct *vm_area_dup(struct vm_area_struct *orig)
 		 */
 		*new = data_race(*orig);
 		INIT_LIST_HEAD(&new->anon_vma_chain);
+		vma_init_lock(new);
 		dup_anon_vma_name(orig, new);
 	}
 	return new;
 }
@@ -1145,6 +1146,9 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
 	seqcount_init(&mm->write_protect_seq);
 	mmap_init_lock(mm);
 	INIT_LIST_HEAD(&mm->mmlist);
+#ifdef CONFIG_PER_VMA_LOCK
+	WRITE_ONCE(mm->mm_lock_seq, 0);
+#endif
 	mm_pgtables_bytes_init(mm);
 	mm->map_count = 0;
 	mm->locked_vm = 0;
diff --git a/mm/init-mm.c b/mm/init-mm.c
index c9327abb771c..33269314e060 100644
--- a/mm/init-mm.c
+++ b/mm/init-mm.c
@@ -37,6 +37,9 @@ struct mm_struct init_mm = {
 	.page_table_lock =  __SPIN_LOCK_UNLOCKED(init_mm.page_table_lock),
 	.arg_lock	=  __SPIN_LOCK_UNLOCKED(init_mm.arg_lock),
 	.mmlist		= LIST_HEAD_INIT(init_mm.mmlist),
+#ifdef CONFIG_PER_VMA_LOCK
+	.mm_lock_seq	= 0,
+#endif
 	.user_ns	= &init_user_ns,
 	.cpu_bitmap	= CPU_BITS_NONE,
 #ifdef CONFIG_IOMMU_SVA
-- 
2.39.0
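A rough sketch of how a page fault handler is expected to consume these
helpers later in the series; lookup_vma_rcu() is a placeholder for an
RCU-safe lookup, not an API added by this patch:

#include <linux/mm.h>

/* Placeholder: some RCU-safe VMA lookup provided by a later patch. */
struct vm_area_struct *lookup_vma_rcu(struct mm_struct *mm, unsigned long addr);

static vm_fault_t fault_fast_path(struct mm_struct *mm, unsigned long addr,
				  unsigned int flags, struct pt_regs *regs)
{
	struct vm_area_struct *vma;
	vm_fault_t ret;

	rcu_read_lock();
	vma = lookup_vma_rcu(mm, addr);
	if (!vma || !vma_read_trylock(vma)) {
		/* Contended or racing with an update: retry under mmap_lock. */
		rcu_read_unlock();
		return VM_FAULT_RETRY;
	}
	rcu_read_unlock();

	ret = handle_mm_fault(vma, addr, flags, regs);
	vma_read_unlock(vma);
	return ret;
}

Note the asymmetry the commit message describes: readers take and drop
vma->lock for real, while writers touch it at most once per
mmap_write_lock() critical section and "unlock" every marked VMA in bulk
via the mm_lock_seq increment in mmap_write_unlock().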
From: Suren Baghdasaryan
Date: Mon, 9 Jan 2023 12:53:08 -0800
Message-ID: <20230109205336.3665937-14-surenb@google.com>
In-Reply-To: <20230109205336.3665937-1-surenb@google.com>
Subject: [PATCH 13/41] mm: introduce vma->vm_flags modifier functions

To keep VMA locking correct when vm_flags are modified, add modifier
functions to be used whenever the flags are updated.
Signed-off-by: Suren Baghdasaryan
---
 include/linux/mm.h       | 38 ++++++++++++++++++++++++++++++++++++++
 include/linux/mm_types.h |  8 +++++++-
 2 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index ec2c4c227d51..35cf0a6cbcc2 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -702,6 +702,44 @@ static inline void vma_init(struct vm_area_struct *vma, struct mm_struct *mm)
 	vma_init_lock(vma);
 }
 
+/* Use when VMA is not part of the VMA tree and needs no locking */
+static inline
+void init_vm_flags(struct vm_area_struct *vma, unsigned long flags)
+{
+	WRITE_ONCE(vma->vm_flags, flags);
+}
+
+/* Use when VMA is part of the VMA tree and needs appropriate locking */
+static inline
+void reset_vm_flags(struct vm_area_struct *vma, unsigned long flags)
+{
+	vma_write_lock(vma);
+	init_vm_flags(vma, flags);
+}
+
+static inline
+void set_vm_flags(struct vm_area_struct *vma, unsigned long flags)
+{
+	vma_write_lock(vma);
+	vma->vm_flags |= flags;
+}
+
+static inline
+void clear_vm_flags(struct vm_area_struct *vma, unsigned long flags)
+{
+	vma_write_lock(vma);
+	vma->vm_flags &= ~flags;
+}
+
+static inline
+void mod_vm_flags(struct vm_area_struct *vma,
+		  unsigned long set, unsigned long clear)
+{
+	vma_write_lock(vma);
+	vma->vm_flags |= set;
+	vma->vm_flags &= ~clear;
+}
+
 static inline void vma_set_anonymous(struct vm_area_struct *vma)
 {
 	vma->vm_ops = NULL;
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 5f7c5ca89931..0d27edd3e63a 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -553,7 +553,13 @@ struct vm_area_struct {
 	 * See vmf_insert_mixed_prot() for discussion.
 	 */
 	pgprot_t vm_page_prot;
-	unsigned long vm_flags;		/* Flags, see mm.h. */
+
+	/*
+	 * Flags, see mm.h.
+	 * WARNING! Do not modify directly to keep correct VMA locking.
+	 * Use {init|reset|set|clear|mod}_vm_flags() functions instead.
+	 */
+	unsigned long vm_flags;
 
 #ifdef CONFIG_PER_VMA_LOCK
 	int vm_lock_seq;
-- 
2.39.0
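The conversion pattern the rest of the series applies tree-wide, shown as
an illustrative fragment (the flag choices are arbitrary):

#include <linux/mm.h>

static void example_conversion(struct vm_area_struct *vma)
{
	/* Before (direct writes, invisible to per-VMA locking):
	 *	vma->vm_flags |= VM_DONTCOPY;
	 *	vma->vm_flags &= ~VM_MAYWRITE;
	 */

	/* After: each helper write-locks the VMA before updating. */
	set_vm_flags(vma, VM_DONTCOPY);
	clear_vm_flags(vma, VM_MAYWRITE);

	/* Or both in one locked update: set VM_DONTCOPY, clear VM_MAYWRITE. */
	mod_vm_flags(vma, VM_DONTCOPY, VM_MAYWRITE);
}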
From: Suren Baghdasaryan
Date: Mon, 9 Jan 2023 12:53:09 -0800
Message-ID: <20230109205336.3665937-15-surenb@google.com>
In-Reply-To: <20230109205336.3665937-1-surenb@google.com>
Subject: [PATCH 14/41] mm: replace VM_LOCKED_CLEAR_MASK with VM_LOCKED_MASK

To simplify the usage of VM_LOCKED_CLEAR_MASK in clear_vm_flags(),
replace it with the VM_LOCKED_MASK bitmask and convert all users.

Signed-off-by: Suren Baghdasaryan
---
 include/linux/mm.h | 4 ++--
 kernel/fork.c      | 2 +-
 mm/hugetlb.c       | 4 ++--
 mm/mlock.c         | 6 +++---
 mm/mmap.c          | 6 +++---
 mm/mremap.c        | 2 +-
 6 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 35cf0a6cbcc2..2b16d45b75a6 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -416,8 +416,8 @@ extern unsigned int kobjsize(const void *objp);
 /* This mask defines which mm->def_flags a process can inherit its parent */
 #define VM_INIT_DEF_MASK	VM_NOHUGEPAGE
 
-/* This mask is used to clear all the VMA flags used by mlock */
-#define VM_LOCKED_CLEAR_MASK	(~(VM_LOCKED | VM_LOCKONFAULT))
+/* This mask represents all the VMA flag bits used by mlock */
+#define VM_LOCKED_MASK	(VM_LOCKED | VM_LOCKONFAULT)
 
 /* Arch-specific flags to clear when updating VM flags on protection change */
 #ifndef VM_ARCH_CLEAR
diff --git a/kernel/fork.c b/kernel/fork.c
index c026d75108b3..1591dd8a0745 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -674,7 +674,7 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
 			tmp->anon_vma = NULL;
 		} else if (anon_vma_fork(tmp, mpnt))
 			goto fail_nomem_anon_vma_fork;
-		tmp->vm_flags &= ~(VM_LOCKED | VM_LOCKONFAULT);
+		clear_vm_flags(tmp, VM_LOCKED_MASK);
 		file = tmp->vm_file;
 		if (file) {
 			struct address_space *mapping = file->f_mapping;
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index db895230ee7e..24861cbfa2b1 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6950,8 +6950,8 @@ static unsigned long page_table_shareable(struct vm_area_struct *svma,
 	unsigned long s_end = sbase + PUD_SIZE;
 
 	/* Allow segments to share if only one is marked locked */
-	unsigned long vm_flags = vma->vm_flags & VM_LOCKED_CLEAR_MASK;
-	unsigned long svm_flags = svma->vm_flags & VM_LOCKED_CLEAR_MASK;
+	unsigned long vm_flags = vma->vm_flags & ~VM_LOCKED_MASK;
+	unsigned long svm_flags = svma->vm_flags & ~VM_LOCKED_MASK;
 
 	/*
 	 * match the virtual addresses, permission and the alignment of the
diff --git a/mm/mlock.c b/mm/mlock.c
index 7032f6dd0ce1..06aa9e204fac 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -490,7 +490,7 @@ static int apply_vma_lock_flags(unsigned long start, size_t len,
 		prev = mas_prev(&mas, 0);
 
 	for (nstart = start ; ; ) {
-		vm_flags_t newflags = vma->vm_flags & VM_LOCKED_CLEAR_MASK;
+		vm_flags_t newflags = vma->vm_flags & ~VM_LOCKED_MASK;
 
 		newflags |= flags;
 
@@ -662,7 +662,7 @@ static int apply_mlockall_flags(int flags)
 	struct vm_area_struct *vma, *prev = NULL;
 	vm_flags_t to_add = 0;
 
-	current->mm->def_flags &= VM_LOCKED_CLEAR_MASK;
+	current->mm->def_flags &= ~VM_LOCKED_MASK;
 	if (flags & MCL_FUTURE) {
 		current->mm->def_flags |= VM_LOCKED;
 
@@ -682,7 +682,7 @@ static int apply_mlockall_flags(int flags)
 	mas_for_each(&mas, vma, ULONG_MAX) {
 		vm_flags_t newflags;
 
-		newflags = vma->vm_flags & VM_LOCKED_CLEAR_MASK;
+		newflags = vma->vm_flags & ~VM_LOCKED_MASK;
 		newflags |= to_add;
 
 		/* Ignore errors */
diff --git a/mm/mmap.c b/mm/mmap.c
index 9db37adfc00a..5c4b608edde9 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2721,7 +2721,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 		if ((vm_flags & VM_SPECIAL) || vma_is_dax(vma) ||
 					is_vm_hugetlb_page(vma) ||
 					vma == get_gate_vma(current->mm))
-			vma->vm_flags &= VM_LOCKED_CLEAR_MASK;
+			clear_vm_flags(vma, VM_LOCKED_MASK);
 		else
 			mm->locked_vm += (len >> PAGE_SHIFT);
 	}
@@ -3392,8 +3392,8 @@ static struct vm_area_struct *__install_special_mapping(
 	vma->vm_start = addr;
 	vma->vm_end = addr + len;
 
-	vma->vm_flags = vm_flags | mm->def_flags | VM_DONTEXPAND | VM_SOFTDIRTY;
-	vma->vm_flags &= VM_LOCKED_CLEAR_MASK;
+	init_vm_flags(vma, (vm_flags | mm->def_flags |
+			    VM_DONTEXPAND | VM_SOFTDIRTY) & ~VM_LOCKED_MASK);
 	vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
 
 	vma->vm_ops = ops;
diff --git a/mm/mremap.c b/mm/mremap.c
index fe587c5d6591..5f6f9931bff1 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -686,7 +686,7 @@ static unsigned long move_vma(struct vm_area_struct *vma,
 
 	if (unlikely(!err && (flags & MREMAP_DONTUNMAP))) {
 		/* We always clear VM_LOCKED[ONFAULT] on the old vma */
-		vma->vm_flags &= VM_LOCKED_CLEAR_MASK;
+		clear_vm_flags(vma, VM_LOCKED_MASK);
 
 		/*
 		 * anon_vma links of the old vma is no longer needed after its page
-- 
2.39.0
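The rename is a pure complement, so every old "& VM_LOCKED_CLEAR_MASK"
becomes "& ~VM_LOCKED_MASK". A standalone sanity check of that identity
(the flag values mirror include/linux/mm.h; this is not kernel code):

#include <assert.h>

#define VM_LOCKED		0x00002000UL
#define VM_LOCKONFAULT		0x00080000UL

#define VM_LOCKED_CLEAR_MASK	(~(VM_LOCKED | VM_LOCKONFAULT))	/* old */
#define VM_LOCKED_MASK		(VM_LOCKED | VM_LOCKONFAULT)	/* new */

int main(void)
{
	unsigned long flags = VM_LOCKED | 0x1UL;

	/* Clearing via the old mask equals clearing via the complement
	 * of the new mask. */
	assert((flags & VM_LOCKED_CLEAR_MASK) == (flags & ~VM_LOCKED_MASK));
	return 0;
}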
From: Suren Baghdasaryan
Date: Mon, 9 Jan 2023 12:53:10 -0800
Message-ID: <20230109205336.3665937-16-surenb@google.com>
In-Reply-To: <20230109205336.3665937-1-surenb@google.com>
Subject: [PATCH 15/41] mm: replace vma->vm_flags direct modifications with modifier calls

Replace direct modifications to vma->vm_flags with calls to the modifier
functions so that flag changes can be tracked and VMA locking correctness
can be kept.
Signed-off-by: Suren Baghdasaryan
---
 arch/arm/kernel/process.c                          |  2 +-
 arch/ia64/mm/init.c                                |  8 ++++----
 arch/loongarch/include/asm/tlb.h                   |  2 +-
 arch/powerpc/kvm/book3s_xive_native.c              |  2 +-
 arch/powerpc/mm/book3s64/subpage_prot.c            |  2 +-
 arch/powerpc/platforms/book3s/vas-api.c            |  2 +-
 arch/powerpc/platforms/cell/spufs/file.c           | 14 +++++++-------
 arch/s390/mm/gmap.c                                |  3 +--
 arch/x86/entry/vsyscall/vsyscall_64.c              |  2 +-
 arch/x86/kernel/cpu/sgx/driver.c                   |  2 +-
 arch/x86/kernel/cpu/sgx/virt.c                     |  2 +-
 arch/x86/mm/pat/memtype.c                          |  6 +++---
 arch/x86/um/mem_32.c                               |  2 +-
 drivers/acpi/pfr_telemetry.c                       |  2 +-
 drivers/android/binder.c                           |  3 +--
 drivers/char/mspec.c                               |  2 +-
 drivers/crypto/hisilicon/qm.c                      |  2 +-
 drivers/dax/device.c                               |  2 +-
 drivers/dma/idxd/cdev.c                            |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c            |  2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_chardev.c           |  4 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c          |  4 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_events.c            |  4 ++--
 drivers/gpu/drm/amd/amdkfd/kfd_process.c           |  4 ++--
 drivers/gpu/drm/drm_gem.c                          |  2 +-
 drivers/gpu/drm/drm_gem_dma_helper.c               |  3 +--
 drivers/gpu/drm/drm_gem_shmem_helper.c             |  2 +-
 drivers/gpu/drm/drm_vm.c                           |  8 ++++----
 drivers/gpu/drm/etnaviv/etnaviv_gem.c              |  2 +-
 drivers/gpu/drm/exynos/exynos_drm_gem.c            |  4 ++--
 drivers/gpu/drm/gma500/framebuffer.c               |  2 +-
 drivers/gpu/drm/i810/i810_dma.c                    |  2 +-
 drivers/gpu/drm/i915/gem/i915_gem_mman.c           |  4 ++--
 drivers/gpu/drm/mediatek/mtk_drm_gem.c             |  2 +-
 drivers/gpu/drm/msm/msm_gem.c                      |  2 +-
 drivers/gpu/drm/omapdrm/omap_gem.c                 |  3 +--
 drivers/gpu/drm/rockchip/rockchip_drm_gem.c        |  3 +--
 drivers/gpu/drm/tegra/gem.c                        |  5 ++---
 drivers/gpu/drm/ttm/ttm_bo_vm.c                    |  3 +--
 drivers/gpu/drm/virtio/virtgpu_vram.c              |  2 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c           |  2 +-
 drivers/gpu/drm/xen/xen_drm_front_gem.c            |  3 +--
 drivers/hsi/clients/cmt_speech.c                   |  2 +-
 drivers/hwtracing/intel_th/msu.c                   |  2 +-
 drivers/hwtracing/stm/core.c                       |  2 +-
 drivers/infiniband/hw/hfi1/file_ops.c              |  4 ++--
 drivers/infiniband/hw/mlx5/main.c                  |  4 ++--
 drivers/infiniband/hw/qib/qib_file_ops.c           | 13 ++++++-------
 drivers/infiniband/hw/usnic/usnic_ib_verbs.c       |  2 +-
 drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.c    |  2 +-
 .../media/common/videobuf2/videobuf2-dma-contig.c  |  2 +-
 drivers/media/common/videobuf2/videobuf2-vmalloc.c |  2 +-
 drivers/media/v4l2-core/videobuf-dma-contig.c      |  2 +-
 drivers/media/v4l2-core/videobuf-dma-sg.c          |  4 ++--
 drivers/media/v4l2-core/videobuf-vmalloc.c         |  2 +-
 drivers/misc/cxl/context.c                         |  2 +-
 drivers/misc/habanalabs/common/memory.c            |  2 +-
 drivers/misc/habanalabs/gaudi/gaudi.c              |  4 ++--
 drivers/misc/habanalabs/gaudi2/gaudi2.c            |  8 ++++----
 drivers/misc/habanalabs/goya/goya.c                |  4 ++--
 drivers/misc/ocxl/context.c                        |  4 ++--
 drivers/misc/ocxl/sysfs.c                          |  2 +-
 drivers/misc/open-dice.c                           |  6 +++---
 drivers/misc/sgi-gru/grufile.c                     |  4 ++--
 drivers/misc/uacce/uacce.c                         |  2 +-
 drivers/sbus/char/oradax.c                         |  2 +-
 drivers/scsi/cxlflash/ocxl_hw.c                    |  2 +-
 drivers/scsi/sg.c                                  |  2 +-
 drivers/staging/media/atomisp/pci/hmm/hmm_bo.c     |  2 +-
 drivers/staging/media/deprecated/meye/meye.c       |  4 ++--
 .../media/deprecated/stkwebcam/stk-webcam.c        |  2 +-
 drivers/target/target_core_user.c                  |  2 +-
 drivers/uio/uio.c                                  |  2 +-
 drivers/usb/core/devio.c                           |  3 +--
 drivers/usb/mon/mon_bin.c                          |  3 +--
 drivers/vdpa/vdpa_user/iova_domain.c               |  2 +-
 drivers/vfio/pci/vfio_pci_core.c                   |  2 +-
 drivers/vhost/vdpa.c                               |  2 +-
 drivers/video/fbdev/68328fb.c                      |  2 +-
 drivers/video/fbdev/core/fb_defio.c                |  4 ++--
 drivers/xen/gntalloc.c                             |  2 +-
 drivers/xen/gntdev.c                               |  4 ++--
 drivers/xen/privcmd-buf.c                          |  2 +-
 drivers/xen/privcmd.c                              |  4 ++--
 fs/aio.c                                           |  2 +-
 fs/cramfs/inode.c                                  |  2 +-
 fs/erofs/data.c                                    |  2 +-
 fs/exec.c                                          |  4 ++--
 fs/ext4/file.c                                     |  2 +-
 fs/fuse/dax.c                                      |  2 +-
 fs/hugetlbfs/inode.c                               |  4 ++--
 fs/orangefs/file.c                                 |  3 +--
 fs/proc/task_mmu.c                                 |  2 +-
 fs/proc/vmcore.c                                   |  3 +--
 fs/userfaultfd.c                                   | 12 ++++++------
 fs/xfs/xfs_file.c                                  |  2 +-
 include/linux/mm.h                                 |  2 +-
 kernel/bpf/ringbuf.c                               |  4 ++--
 kernel/bpf/syscall.c                               |  4 ++--
 kernel/events/core.c                               |  2 +-
 kernel/kcov.c                                      |  2 +-
 kernel/relay.c                                     |  2 +-
 mm/madvise.c                                       |  2 +-
 mm/memory.c                                        |  6 +++---
 mm/mlock.c                                         |  6 +++---
 mm/mmap.c                                          | 10 +++++-----
 mm/mprotect.c                                      |  2 +-
 mm/mremap.c                                        |  6 +++---
 mm/nommu.c                                         | 11 ++++++-----
 mm/secretmem.c                                     |  2 +-
 mm/shmem.c                                         |  2 +-
 mm/vmalloc.c                                       |  2 +-
 net/ipv4/tcp.c                                     |  4 ++--
 security/selinux/selinuxfs.c                       |  6 +++---
 sound/core/oss/pcm_oss.c                           |  2 +-
 sound/core/pcm_native.c                            |  9 +++++----
 sound/soc/pxa/mmp-sspa.c                           |  2 +-
 sound/usb/usx2y/us122l.c                           |  4 ++--
 sound/usb/usx2y/usX2Yhwdep.c                       |  2 +-
 sound/usb/usx2y/usx2yhwdeppcm.c                    |  2 +-
 120 files changed, 194 insertions(+), 205 deletions(-)

diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index f811733a8fc5..ec65f3ea3150 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -316,7 +316,7 @@ static int __init gate_vma_init(void)
 	gate_vma.vm_page_prot = PAGE_READONLY_EXEC;
 	gate_vma.vm_start = 0xffff0000;
 	gate_vma.vm_end	= 0xffff0000 + PAGE_SIZE;
-	gate_vma.vm_flags = VM_READ | VM_EXEC | VM_MAYREAD | VM_MAYEXEC;
+	init_vm_flags(&gate_vma, VM_READ | VM_EXEC | VM_MAYREAD | VM_MAYEXEC);
 	return 0;
 }
 arch_initcall(gate_vma_init);
diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
index fc4e4217e87f..d355e0ce28ab 100644
--- a/arch/ia64/mm/init.c
+++ b/arch/ia64/mm/init.c
@@ -109,7 +109,7 @@ ia64_init_addr_space (void)
 		vma_set_anonymous(vma);
 		vma->vm_start = current->thread.rbs_bot & PAGE_MASK;
 		vma->vm_end = vma->vm_start + PAGE_SIZE;
-		vma->vm_flags = VM_DATA_DEFAULT_FLAGS|VM_GROWSUP|VM_ACCOUNT;
+		init_vm_flags(vma, VM_DATA_DEFAULT_FLAGS|VM_GROWSUP|VM_ACCOUNT);
 		vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
 		mmap_write_lock(current->mm);
 		if (insert_vm_struct(current->mm, vma)) {
@@ -127,8 +127,8 @@ ia64_init_addr_space (void)
 		vma_set_anonymous(vma);
 		vma->vm_end = PAGE_SIZE;
 		vma->vm_page_prot = __pgprot(pgprot_val(PAGE_READONLY) | _PAGE_MA_NAT);
-		vma->vm_flags = VM_READ | VM_MAYREAD | VM_IO |
-				VM_DONTEXPAND | VM_DONTDUMP;
+		init_vm_flags(vma, VM_READ | VM_MAYREAD | VM_IO |
+			      VM_DONTEXPAND | VM_DONTDUMP);
 		mmap_write_lock(current->mm);
 		if (insert_vm_struct(current->mm, vma)) {
 			mmap_write_unlock(current->mm);
@@ -272,7 +272,7 @@ static int __init gate_vma_init(void)
 	vma_init(&gate_vma, NULL);
 	gate_vma.vm_start = FIXADDR_USER_START;
 	gate_vma.vm_end = FIXADDR_USER_END;
-	gate_vma.vm_flags = VM_READ | VM_MAYREAD | VM_EXEC | VM_MAYEXEC;
+	init_vm_flags(&gate_vma, VM_READ | VM_MAYREAD | VM_EXEC | VM_MAYEXEC);
 	gate_vma.vm_page_prot = __pgprot(__ACCESS_BITS | _PAGE_PL_3 | _PAGE_AR_RX);
 
 	return 0;
diff --git a/arch/loongarch/include/asm/tlb.h b/arch/loongarch/include/asm/tlb.h
index dd24f5898f65..51e35b44d105 100644
--- a/arch/loongarch/include/asm/tlb.h
+++ b/arch/loongarch/include/asm/tlb.h
@@ -149,7 +149,7 @@ static inline void tlb_flush(struct mmu_gather *tlb)
 	struct vm_area_struct vma;
 
 	vma.vm_mm = tlb->mm;
-	vma.vm_flags = 0;
+	init_vm_flags(&vma, 0);
 	if (tlb->fullmm) {
 		flush_tlb_mm(tlb->mm);
 		return;
diff --git a/arch/powerpc/kvm/book3s_xive_native.c b/arch/powerpc/kvm/book3s_xive_native.c
index 4f566bea5e10..7976af0f5ff8 100644
--- a/arch/powerpc/kvm/book3s_xive_native.c
+++ b/arch/powerpc/kvm/book3s_xive_native.c
@@ -324,7 +324,7 @@ static int kvmppc_xive_native_mmap(struct kvm_device *dev,
 		return -EINVAL;
 	}
 
-	vma->vm_flags |= VM_IO | VM_PFNMAP;
+	set_vm_flags(vma, VM_IO | VM_PFNMAP);
 	vma->vm_page_prot = pgprot_noncached_wc(vma->vm_page_prot);
 
 	/*
diff --git a/arch/powerpc/mm/book3s64/subpage_prot.c b/arch/powerpc/mm/book3s64/subpage_prot.c
index d73b3b4176e8..72948cdb1911 100644
--- a/arch/powerpc/mm/book3s64/subpage_prot.c
+++ b/arch/powerpc/mm/book3s64/subpage_prot.c
@@ -156,7 +156,7 @@ static void subpage_mark_vma_nohuge(struct mm_struct *mm, unsigned long addr,
 	 * VM_NOHUGEPAGE and split them.
 	 */
 	for_each_vma_range(vmi, vma, addr + len) {
-		vma->vm_flags |= VM_NOHUGEPAGE;
+		set_vm_flags(vma, VM_NOHUGEPAGE);
 		walk_page_vma(vma, &subpage_walk_ops, NULL);
 	}
 }
diff --git a/arch/powerpc/platforms/book3s/vas-api.c b/arch/powerpc/platforms/book3s/vas-api.c
index eb5bed333750..a81615768fff 100644
--- a/arch/powerpc/platforms/book3s/vas-api.c
+++ b/arch/powerpc/platforms/book3s/vas-api.c
@@ -525,7 +525,7 @@ static int coproc_mmap(struct file *fp, struct vm_area_struct *vma)
 	pfn = paste_addr >> PAGE_SHIFT;
 
 	/* flags, page_prot from cxl_mmap(), except we want cachable */
-	vma->vm_flags |= VM_IO | VM_PFNMAP;
+	set_vm_flags(vma, VM_IO | VM_PFNMAP);
 	vma->vm_page_prot = pgprot_cached(vma->vm_page_prot);
 
 	prot = __pgprot(pgprot_val(vma->vm_page_prot) | _PAGE_DIRTY);
diff --git a/arch/powerpc/platforms/cell/spufs/file.c b/arch/powerpc/platforms/cell/spufs/file.c
index 62d90a5e23d1..784fa39a484a 100644
--- a/arch/powerpc/platforms/cell/spufs/file.c
+++ b/arch/powerpc/platforms/cell/spufs/file.c
@@ -291,7 +291,7 @@ static int spufs_mem_mmap(struct file *file, struct vm_area_struct *vma)
 	if (!(vma->vm_flags & VM_SHARED))
 		return -EINVAL;
 
-	vma->vm_flags |= VM_IO | VM_PFNMAP;
+	set_vm_flags(vma, VM_IO | VM_PFNMAP);
 	vma->vm_page_prot = pgprot_noncached_wc(vma->vm_page_prot);
 
 	vma->vm_ops = &spufs_mem_mmap_vmops;
@@ -381,7 +381,7 @@ static int spufs_cntl_mmap(struct file *file, struct vm_area_struct *vma)
 	if (!(vma->vm_flags & VM_SHARED))
 		return -EINVAL;
 
-	vma->vm_flags |= VM_IO | VM_PFNMAP;
+	set_vm_flags(vma, VM_IO | VM_PFNMAP);
 	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
 
 	vma->vm_ops = &spufs_cntl_mmap_vmops;
@@ -1043,7 +1043,7 @@ static int spufs_signal1_mmap(struct file *file, struct vm_area_struct *vma)
 	if (!(vma->vm_flags & VM_SHARED))
 		return -EINVAL;
 
-	vma->vm_flags |= VM_IO | VM_PFNMAP;
+	set_vm_flags(vma, VM_IO | VM_PFNMAP);
 	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
 
 	vma->vm_ops = &spufs_signal1_mmap_vmops;
@@ -1179,7 +1179,7 @@ static int spufs_signal2_mmap(struct file *file, struct vm_area_struct *vma)
 	if (!(vma->vm_flags & VM_SHARED))
 		return -EINVAL;
 
-	vma->vm_flags |= VM_IO | VM_PFNMAP;
+	set_vm_flags(vma, VM_IO | VM_PFNMAP);
 	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
 
 	vma->vm_ops = &spufs_signal2_mmap_vmops;
@@ -1302,7 +1302,7 @@ static int spufs_mss_mmap(struct file *file, struct vm_area_struct *vma)
 	if (!(vma->vm_flags & VM_SHARED))
 		return -EINVAL;
 
-	vma->vm_flags |= VM_IO | VM_PFNMAP;
+	set_vm_flags(vma, VM_IO | VM_PFNMAP);
 	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
 
 	vma->vm_ops = &spufs_mss_mmap_vmops;
@@ -1364,7 +1364,7 @@ static int spufs_psmap_mmap(struct file *file, struct vm_area_struct *vma)
 	if (!(vma->vm_flags & VM_SHARED))
 		return -EINVAL;
 
-	vma->vm_flags |= VM_IO | VM_PFNMAP;
VM_PFNMAP; + set_vm_flags(vma, VM_IO | VM_PFNMAP); vma->vm_page_prot =3D pgprot_noncached(vma->vm_page_prot); =20 vma->vm_ops =3D &spufs_psmap_mmap_vmops; @@ -1424,7 +1424,7 @@ static int spufs_mfc_mmap(struct file *file, struct v= m_area_struct *vma) if (!(vma->vm_flags & VM_SHARED)) return -EINVAL; =20 - vma->vm_flags |=3D VM_IO | VM_PFNMAP; + set_vm_flags(vma, VM_IO | VM_PFNMAP); vma->vm_page_prot =3D pgprot_noncached(vma->vm_page_prot); =20 vma->vm_ops =3D &spufs_mfc_mmap_vmops; diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c index 74e1d873dce0..3811d6c86d09 100644 --- a/arch/s390/mm/gmap.c +++ b/arch/s390/mm/gmap.c @@ -2522,8 +2522,7 @@ static inline void thp_split_mm(struct mm_struct *mm) VMA_ITERATOR(vmi, mm, 0); =20 for_each_vma(vmi, vma) { - vma->vm_flags &=3D ~VM_HUGEPAGE; - vma->vm_flags |=3D VM_NOHUGEPAGE; + mod_vm_flags(vma, VM_NOHUGEPAGE, VM_HUGEPAGE); walk_page_vma(vma, &thp_split_walk_ops, NULL); } mm->def_flags |=3D VM_NOHUGEPAGE; diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c b/arch/x86/entry/vsyscal= l/vsyscall_64.c index 4af81df133ee..e2a1626d86d8 100644 --- a/arch/x86/entry/vsyscall/vsyscall_64.c +++ b/arch/x86/entry/vsyscall/vsyscall_64.c @@ -391,7 +391,7 @@ void __init map_vsyscall(void) } =20 if (vsyscall_mode =3D=3D XONLY) - gate_vma.vm_flags =3D VM_EXEC; + init_vm_flags(&gate_vma, VM_EXEC); =20 BUILD_BUG_ON((unsigned long)__fix_to_virt(VSYSCALL_PAGE) !=3D (unsigned long)VSYSCALL_ADDR); diff --git a/arch/x86/kernel/cpu/sgx/driver.c b/arch/x86/kernel/cpu/sgx/dri= ver.c index aa9b8b868867..42c0bded93b6 100644 --- a/arch/x86/kernel/cpu/sgx/driver.c +++ b/arch/x86/kernel/cpu/sgx/driver.c @@ -95,7 +95,7 @@ static int sgx_mmap(struct file *file, struct vm_area_str= uct *vma) return ret; =20 vma->vm_ops =3D &sgx_vm_ops; - vma->vm_flags |=3D VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP | VM_IO; + set_vm_flags(vma, VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP | VM_IO); vma->vm_private_data =3D encl; =20 return 0; diff --git a/arch/x86/kernel/cpu/sgx/virt.c b/arch/x86/kernel/cpu/sgx/virt.c index 6a77a14eee38..0774a0bfeb28 100644 --- a/arch/x86/kernel/cpu/sgx/virt.c +++ b/arch/x86/kernel/cpu/sgx/virt.c @@ -105,7 +105,7 @@ static int sgx_vepc_mmap(struct file *file, struct vm_a= rea_struct *vma) =20 vma->vm_ops =3D &sgx_vepc_vm_ops; /* Don't copy VMA in fork() */ - vma->vm_flags |=3D VM_PFNMAP | VM_IO | VM_DONTDUMP | VM_DONTCOPY; + set_vm_flags(vma, VM_PFNMAP | VM_IO | VM_DONTDUMP | VM_DONTCOPY); vma->vm_private_data =3D vepc; =20 return 0; diff --git a/arch/x86/mm/pat/memtype.c b/arch/x86/mm/pat/memtype.c index 46de9cf5c91d..9e490a372896 100644 --- a/arch/x86/mm/pat/memtype.c +++ b/arch/x86/mm/pat/memtype.c @@ -999,7 +999,7 @@ int track_pfn_remap(struct vm_area_struct *vma, pgprot_= t *prot, =20 ret =3D reserve_pfn_range(paddr, size, prot, 0); if (ret =3D=3D 0 && vma) - vma->vm_flags |=3D VM_PAT; + set_vm_flags(vma, VM_PAT); return ret; } =20 @@ -1065,7 +1065,7 @@ void untrack_pfn(struct vm_area_struct *vma, unsigned= long pfn, } free_pfn_range(paddr, size); if (vma) - vma->vm_flags &=3D ~VM_PAT; + clear_vm_flags(vma, VM_PAT); } =20 /* @@ -1075,7 +1075,7 @@ void untrack_pfn(struct vm_area_struct *vma, unsigned= long pfn, */ void untrack_pfn_moved(struct vm_area_struct *vma) { - vma->vm_flags &=3D ~VM_PAT; + clear_vm_flags(vma, VM_PAT); } =20 pgprot_t pgprot_writecombine(pgprot_t prot) diff --git a/arch/x86/um/mem_32.c b/arch/x86/um/mem_32.c index cafd01f730da..bfd2c320ad25 100644 --- a/arch/x86/um/mem_32.c +++ b/arch/x86/um/mem_32.c @@ -16,7 +16,7 @@ static int __init 
gate_vma_init(void) vma_init(&gate_vma, NULL); gate_vma.vm_start =3D FIXADDR_USER_START; gate_vma.vm_end =3D FIXADDR_USER_END; - gate_vma.vm_flags =3D VM_READ | VM_MAYREAD | VM_EXEC | VM_MAYEXEC; + init_vm_flags(&gate_vma, VM_READ | VM_MAYREAD | VM_EXEC | VM_MAYEXEC); gate_vma.vm_page_prot =3D PAGE_READONLY; =20 return 0; diff --git a/drivers/acpi/pfr_telemetry.c b/drivers/acpi/pfr_telemetry.c index 27fb6cdad75f..9e339c705b5b 100644 --- a/drivers/acpi/pfr_telemetry.c +++ b/drivers/acpi/pfr_telemetry.c @@ -310,7 +310,7 @@ pfrt_log_mmap(struct file *file, struct vm_area_struct = *vma) return -EROFS; =20 /* changing from read to write with mprotect is not allowed */ - vma->vm_flags &=3D ~VM_MAYWRITE; + clear_vm_flags(vma, VM_MAYWRITE); =20 pfrt_log_dev =3D to_pfrt_log_dev(file); =20 diff --git a/drivers/android/binder.c b/drivers/android/binder.c index 880224ec6abb..dd6c99223b8c 100644 --- a/drivers/android/binder.c +++ b/drivers/android/binder.c @@ -5572,8 +5572,7 @@ static int binder_mmap(struct file *filp, struct vm_a= rea_struct *vma) proc->pid, vma->vm_start, vma->vm_end, "bad vm_flags", -EPERM); return -EPERM; } - vma->vm_flags |=3D VM_DONTCOPY | VM_MIXEDMAP; - vma->vm_flags &=3D ~VM_MAYWRITE; + mod_vm_flags(vma, VM_DONTCOPY | VM_MIXEDMAP, VM_MAYWRITE); =20 vma->vm_ops =3D &binder_vm_ops; vma->vm_private_data =3D proc; diff --git a/drivers/char/mspec.c b/drivers/char/mspec.c index f8231e2e84be..57bd36a28f95 100644 --- a/drivers/char/mspec.c +++ b/drivers/char/mspec.c @@ -206,7 +206,7 @@ mspec_mmap(struct file *file, struct vm_area_struct *vm= a, refcount_set(&vdata->refcnt, 1); vma->vm_private_data =3D vdata; =20 - vma->vm_flags |=3D VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(vma, VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP); if (vdata->type =3D=3D MSPEC_UNCACHED) vma->vm_page_prot =3D pgprot_noncached(vma->vm_page_prot); vma->vm_ops =3D &mspec_vm_ops; diff --git a/drivers/crypto/hisilicon/qm.c b/drivers/crypto/hisilicon/qm.c index 007ac7a69ce7..57ecdb5c97fb 100644 --- a/drivers/crypto/hisilicon/qm.c +++ b/drivers/crypto/hisilicon/qm.c @@ -2363,7 +2363,7 @@ static int hisi_qm_uacce_mmap(struct uacce_queue *q, return -EINVAL; } =20 - vma->vm_flags |=3D VM_IO; + set_vm_flags(vma, VM_IO); =20 return remap_pfn_range(vma, vma->vm_start, phys_base >> PAGE_SHIFT, diff --git a/drivers/dax/device.c b/drivers/dax/device.c index 5494d745ced5..6e9726dfaa7e 100644 --- a/drivers/dax/device.c +++ b/drivers/dax/device.c @@ -308,7 +308,7 @@ static int dax_mmap(struct file *filp, struct vm_area_s= truct *vma) return rc; =20 vma->vm_ops =3D &dax_vm_ops; - vma->vm_flags |=3D VM_HUGEPAGE; + set_vm_flags(vma, VM_HUGEPAGE); return 0; } =20 diff --git a/drivers/dma/idxd/cdev.c b/drivers/dma/idxd/cdev.c index e13e92609943..51cf836cf329 100644 --- a/drivers/dma/idxd/cdev.c +++ b/drivers/dma/idxd/cdev.c @@ -201,7 +201,7 @@ static int idxd_cdev_mmap(struct file *filp, struct vm_= area_struct *vma) if (rc < 0) return rc; =20 - vma->vm_flags |=3D VM_DONTCOPY; + set_vm_flags(vma, VM_DONTCOPY); pfn =3D (base + idxd_get_wq_portal_full_offset(wq->id, IDXD_PORTAL_LIMITED)) >> PAGE_SHIFT; vma->vm_page_prot =3D pgprot_noncached(vma->vm_page_prot); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/= amdgpu/amdgpu_gem.c index bb7350ea1d75..70b08a0d13cd 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c @@ -257,7 +257,7 @@ static int amdgpu_gem_object_mmap(struct drm_gem_object= *obj, struct vm_area_str */ if 
(is_cow_mapping(vma->vm_flags) && !(vma->vm_flags & VM_ACCESS_FLAGS)) - vma->vm_flags &=3D ~VM_MAYWRITE; + clear_vm_flags(vma, VM_MAYWRITE); =20 return drm_gem_ttm_mmap(obj, vma); } diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd= /amdkfd/kfd_chardev.c index 6d291aa6386b..7beb8dd6a5e6 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -2879,8 +2879,8 @@ static int kfd_mmio_mmap(struct kfd_dev *dev, struct = kfd_process *process, =20 address =3D dev->adev->rmmio_remap.bus_addr; =20 - vma->vm_flags |=3D VM_IO | VM_DONTCOPY | VM_DONTEXPAND | VM_NORESERVE | - VM_DONTDUMP | VM_PFNMAP; + set_vm_flags(vma, VM_IO | VM_DONTCOPY | VM_DONTEXPAND | VM_NORESERVE | + VM_DONTDUMP | VM_PFNMAP); =20 vma->vm_page_prot =3D pgprot_noncached(vma->vm_page_prot); =20 diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c b/drivers/gpu/drm/am= d/amdkfd/kfd_doorbell.c index cd4e61bf0493..6cbe47cf9be5 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c @@ -159,8 +159,8 @@ int kfd_doorbell_mmap(struct kfd_dev *dev, struct kfd_p= rocess *process, address =3D kfd_get_process_doorbells(pdd); if (!address) return -ENOMEM; - vma->vm_flags |=3D VM_IO | VM_DONTCOPY | VM_DONTEXPAND | VM_NORESERVE | - VM_DONTDUMP | VM_PFNMAP; + set_vm_flags(vma, VM_IO | VM_DONTCOPY | VM_DONTEXPAND | VM_NORESERVE | + VM_DONTDUMP | VM_PFNMAP); =20 vma->vm_page_prot =3D pgprot_noncached(vma->vm_page_prot); =20 diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c b/drivers/gpu/drm/amd/= amdkfd/kfd_events.c index 729d26d648af..95cd20056cea 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c @@ -1052,8 +1052,8 @@ int kfd_event_mmap(struct kfd_process *p, struct vm_a= rea_struct *vma) pfn =3D __pa(page->kernel_address); pfn >>=3D PAGE_SHIFT; =20 - vma->vm_flags |=3D VM_IO | VM_DONTCOPY | VM_DONTEXPAND | VM_NORESERVE - | VM_DONTDUMP | VM_PFNMAP; + set_vm_flags(vma, VM_IO | VM_DONTCOPY | VM_DONTEXPAND | VM_NORESERVE + | VM_DONTDUMP | VM_PFNMAP); =20 pr_debug("Mapping signal page\n"); pr_debug(" start user address =3D=3D 0x%08lx\n", vma->vm_start); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd= /amdkfd/kfd_process.c index 51b1683ac5c1..b40f4b122918 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c @@ -1978,8 +1978,8 @@ int kfd_reserved_mem_mmap(struct kfd_dev *dev, struct= kfd_process *process, return -ENOMEM; } =20 - vma->vm_flags |=3D VM_IO | VM_DONTCOPY | VM_DONTEXPAND - | VM_NORESERVE | VM_DONTDUMP | VM_PFNMAP; + set_vm_flags(vma, VM_IO | VM_DONTCOPY | VM_DONTEXPAND + | VM_NORESERVE | VM_DONTDUMP | VM_PFNMAP); /* Mapping pages to user process */ return remap_pfn_range(vma, vma->vm_start, PFN_DOWN(__pa(qpd->cwsr_kaddr)), diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c index b8db675e7fb5..6ea7bcaa592b 100644 --- a/drivers/gpu/drm/drm_gem.c +++ b/drivers/gpu/drm/drm_gem.c @@ -1047,7 +1047,7 @@ int drm_gem_mmap_obj(struct drm_gem_object *obj, unsi= gned long obj_size, goto err_drm_gem_object_put; } =20 - vma->vm_flags |=3D VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(vma, VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP); vma->vm_page_prot =3D pgprot_writecombine(vm_get_page_prot(vma->vm_flags= )); vma->vm_page_prot =3D pgprot_decrypted(vma->vm_page_prot); } diff --git a/drivers/gpu/drm/drm_gem_dma_helper.c b/drivers/gpu/drm/drm_gem= _dma_helper.c index 
1e658c448366..41f241b9a581 100644 --- a/drivers/gpu/drm/drm_gem_dma_helper.c +++ b/drivers/gpu/drm/drm_gem_dma_helper.c @@ -530,8 +530,7 @@ int drm_gem_dma_mmap(struct drm_gem_dma_object *dma_obj= , struct vm_area_struct * * the whole buffer. */ vma->vm_pgoff -=3D drm_vma_node_start(&obj->vma_node); - vma->vm_flags &=3D ~VM_PFNMAP; - vma->vm_flags |=3D VM_DONTEXPAND; + mod_vm_flags(vma, VM_DONTEXPAND, VM_PFNMAP); =20 if (dma_obj->map_noncoherent) { vma->vm_page_prot =3D vm_get_page_prot(vma->vm_flags); diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_g= em_shmem_helper.c index b602cd72a120..a5032dfac492 100644 --- a/drivers/gpu/drm/drm_gem_shmem_helper.c +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c @@ -633,7 +633,7 @@ int drm_gem_shmem_mmap(struct drm_gem_shmem_object *shm= em, struct vm_area_struct if (ret) return ret; =20 - vma->vm_flags |=3D VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(vma, VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP); vma->vm_page_prot =3D vm_get_page_prot(vma->vm_flags); if (shmem->map_wc) vma->vm_page_prot =3D pgprot_writecombine(vma->vm_page_prot); diff --git a/drivers/gpu/drm/drm_vm.c b/drivers/gpu/drm/drm_vm.c index f024dc93939e..8867bb6c40e3 100644 --- a/drivers/gpu/drm/drm_vm.c +++ b/drivers/gpu/drm/drm_vm.c @@ -476,7 +476,7 @@ static int drm_mmap_dma(struct file *filp, struct vm_ar= ea_struct *vma) =20 if (!capable(CAP_SYS_ADMIN) && (dma->flags & _DRM_DMA_USE_PCI_RO)) { - vma->vm_flags &=3D ~(VM_WRITE | VM_MAYWRITE); + clear_vm_flags(vma, VM_WRITE | VM_MAYWRITE); #if defined(__i386__) || defined(__x86_64__) pgprot_val(vma->vm_page_prot) &=3D ~_PAGE_RW; #else @@ -492,7 +492,7 @@ static int drm_mmap_dma(struct file *filp, struct vm_ar= ea_struct *vma) =20 vma->vm_ops =3D &drm_vm_dma_ops; =20 - vma->vm_flags |=3D VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(vma, VM_DONTEXPAND | VM_DONTDUMP); =20 drm_vm_open_locked(dev, vma); return 0; @@ -560,7 +560,7 @@ static int drm_mmap_locked(struct file *filp, struct vm= _area_struct *vma) return -EINVAL; =20 if (!capable(CAP_SYS_ADMIN) && (map->flags & _DRM_READ_ONLY)) { - vma->vm_flags &=3D ~(VM_WRITE | VM_MAYWRITE); + clear_vm_flags(vma, VM_WRITE | VM_MAYWRITE); #if defined(__i386__) || defined(__x86_64__) pgprot_val(vma->vm_page_prot) &=3D ~_PAGE_RW; #else @@ -628,7 +628,7 @@ static int drm_mmap_locked(struct file *filp, struct vm= _area_struct *vma) default: return -EINVAL; /* This should never happen. 
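 *
 * The clear_vm_flags() calls above and the set_vm_flags() call just
 * below, like the rest of this patch, replace open-coded vm_flags
 * arithmetic with wrapper calls. The wrappers themselves are not
 * defined in this patch, so the following is only a minimal sketch
 * consistent with the call sites, assuming the helpers sit next to the
 * vm_flags definitions in include/linux/mm.h:
 */

static inline void set_vm_flags(struct vm_area_struct *vma,
				unsigned long flags)
{
	vma->vm_flags |= flags;		/* turn the requested bits on */
}

static inline void clear_vm_flags(struct vm_area_struct *vma,
				  unsigned long flags)
{
	vma->vm_flags &= ~flags;	/* turn the requested bits off */
}

/*
 * Routing every update through one place means locking checks can be
 * added later without touching hundreds of callers again.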
*/ } - vma->vm_flags |=3D VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(vma, VM_DONTEXPAND | VM_DONTDUMP); =20 drm_vm_open_locked(dev, vma); return 0; diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.c b/drivers/gpu/drm/etnavi= v/etnaviv_gem.c index c5ae5492e1af..9a5a317038a4 100644 --- a/drivers/gpu/drm/etnaviv/etnaviv_gem.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.c @@ -130,7 +130,7 @@ static int etnaviv_gem_mmap_obj(struct etnaviv_gem_obje= ct *etnaviv_obj, { pgprot_t vm_page_prot; =20 - vma->vm_flags |=3D VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(vma, VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP); =20 vm_page_prot =3D vm_get_page_prot(vma->vm_flags); =20 diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c b/drivers/gpu/drm/exyn= os/exynos_drm_gem.c index 3e493f48e0d4..c330d415729c 100644 --- a/drivers/gpu/drm/exynos/exynos_drm_gem.c +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c @@ -274,7 +274,7 @@ static int exynos_drm_gem_mmap_buffer(struct exynos_drm= _gem *exynos_gem, unsigned long vm_size; int ret; =20 - vma->vm_flags &=3D ~VM_PFNMAP; + clear_vm_flags(vma, VM_PFNMAP); vma->vm_pgoff =3D 0; =20 vm_size =3D vma->vm_end - vma->vm_start; @@ -368,7 +368,7 @@ static int exynos_drm_gem_mmap(struct drm_gem_object *o= bj, struct vm_area_struct if (obj->import_attach) return dma_buf_mmap(obj->dma_buf, vma, 0); =20 - vma->vm_flags |=3D VM_IO | VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(vma, VM_IO | VM_DONTEXPAND | VM_DONTDUMP); =20 DRM_DEV_DEBUG_KMS(to_dma_dev(obj->dev), "flags =3D 0x%x\n", exynos_gem->flags); diff --git a/drivers/gpu/drm/gma500/framebuffer.c b/drivers/gpu/drm/gma500/= framebuffer.c index 8d5a37b8f110..471d5b3c1535 100644 --- a/drivers/gpu/drm/gma500/framebuffer.c +++ b/drivers/gpu/drm/gma500/framebuffer.c @@ -139,7 +139,7 @@ static int psbfb_mmap(struct fb_info *info, struct vm_a= rea_struct *vma) */ vma->vm_ops =3D &psbfb_vm_ops; vma->vm_private_data =3D (void *)fb; - vma->vm_flags |=3D VM_IO | VM_MIXEDMAP | VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(vma, VM_IO | VM_MIXEDMAP | VM_DONTEXPAND | VM_DONTDUMP); return 0; } =20 diff --git a/drivers/gpu/drm/i810/i810_dma.c b/drivers/gpu/drm/i810/i810_dm= a.c index 9fb4dd63342f..bced8c30709e 100644 --- a/drivers/gpu/drm/i810/i810_dma.c +++ b/drivers/gpu/drm/i810/i810_dma.c @@ -102,7 +102,7 @@ static int i810_mmap_buffers(struct file *filp, struct = vm_area_struct *vma) buf =3D dev_priv->mmap_buffer; buf_priv =3D buf->dev_private; =20 - vma->vm_flags |=3D VM_DONTCOPY; + set_vm_flags(vma, VM_DONTCOPY); =20 buf_priv->currently_mapped =3D I810_BUF_MAPPED; =20 diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i91= 5/gem/i915_gem_mman.c index 0ad44f3868de..71b9e0485cb9 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c @@ -979,7 +979,7 @@ int i915_gem_mmap(struct file *filp, struct vm_area_str= uct *vma) i915_gem_object_put(obj); return -EINVAL; } - vma->vm_flags &=3D ~VM_MAYWRITE; + clear_vm_flags(vma, VM_MAYWRITE); } =20 anon =3D mmap_singleton(to_i915(dev)); @@ -988,7 +988,7 @@ int i915_gem_mmap(struct file *filp, struct vm_area_str= uct *vma) return PTR_ERR(anon); } =20 - vma->vm_flags |=3D VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP | VM_IO; + set_vm_flags(vma, VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP | VM_IO); =20 /* * We keep the ref on mmo->obj, not vm_file, but we require diff --git a/drivers/gpu/drm/mediatek/mtk_drm_gem.c b/drivers/gpu/drm/media= tek/mtk_drm_gem.c index 47e96b0289f9..427089733b87 100644 --- a/drivers/gpu/drm/mediatek/mtk_drm_gem.c +++ 
b/drivers/gpu/drm/mediatek/mtk_drm_gem.c @@ -158,7 +158,7 @@ static int mtk_drm_gem_object_mmap(struct drm_gem_objec= t *obj, * dma_alloc_attrs() allocated a struct page table for mtk_gem, so clear * VM_PFNMAP flag that was set by drm_gem_mmap_obj()/drm_gem_mmap(). */ - vma->vm_flags |=3D VM_IO | VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(vma, VM_IO | VM_DONTEXPAND | VM_DONTDUMP); vma->vm_page_prot =3D pgprot_writecombine(vm_get_page_prot(vma->vm_flags)= ); vma->vm_page_prot =3D pgprot_decrypted(vma->vm_page_prot); =20 diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c index 1dee0d18abbb..8aff3ae909af 100644 --- a/drivers/gpu/drm/msm/msm_gem.c +++ b/drivers/gpu/drm/msm/msm_gem.c @@ -1012,7 +1012,7 @@ static int msm_gem_object_mmap(struct drm_gem_object = *obj, struct vm_area_struct { struct msm_gem_object *msm_obj =3D to_msm_bo(obj); =20 - vma->vm_flags |=3D VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(vma, VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP); vma->vm_page_prot =3D msm_gem_pgprot(msm_obj, vm_get_page_prot(vma->vm_fl= ags)); =20 return 0; diff --git a/drivers/gpu/drm/omapdrm/omap_gem.c b/drivers/gpu/drm/omapdrm/o= map_gem.c index cf571796fd26..9c0e7d6a3784 100644 --- a/drivers/gpu/drm/omapdrm/omap_gem.c +++ b/drivers/gpu/drm/omapdrm/omap_gem.c @@ -543,8 +543,7 @@ int omap_gem_mmap_obj(struct drm_gem_object *obj, { struct omap_gem_object *omap_obj =3D to_omap_bo(obj); =20 - vma->vm_flags &=3D ~VM_PFNMAP; - vma->vm_flags |=3D VM_MIXEDMAP; + mod_vm_flags(vma, VM_MIXEDMAP, VM_PFNMAP); =20 if (omap_obj->flags & OMAP_BO_WC) { vma->vm_page_prot =3D pgprot_writecombine(vm_get_page_prot(vma->vm_flags= )); diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_gem.c b/drivers/gpu/drm/= rockchip/rockchip_drm_gem.c index 6edb7c52cb3d..735b64bbdcf2 100644 --- a/drivers/gpu/drm/rockchip/rockchip_drm_gem.c +++ b/drivers/gpu/drm/rockchip/rockchip_drm_gem.c @@ -251,8 +251,7 @@ static int rockchip_drm_gem_object_mmap(struct drm_gem_= object *obj, * We allocated a struct page table for rk_obj, so clear * VM_PFNMAP flag that was set by drm_gem_mmap_obj()/drm_gem_mmap(). */ - vma->vm_flags |=3D VM_IO | VM_DONTEXPAND | VM_DONTDUMP; - vma->vm_flags &=3D ~VM_PFNMAP; + mod_vm_flags(vma, VM_IO | VM_DONTEXPAND | VM_DONTDUMP, VM_PFNMAP); =20 vma->vm_page_prot =3D pgprot_writecombine(vm_get_page_prot(vma->vm_flags)= ); vma->vm_page_prot =3D pgprot_decrypted(vma->vm_page_prot); diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c index 979e7bc902f6..6cdc6c45ef27 100644 --- a/drivers/gpu/drm/tegra/gem.c +++ b/drivers/gpu/drm/tegra/gem.c @@ -574,7 +574,7 @@ int __tegra_gem_mmap(struct drm_gem_object *gem, struct= vm_area_struct *vma) * and set the vm_pgoff (used as a fake buffer offset by DRM) * to 0 as we want to map the whole buffer. 
*/ - vma->vm_flags &=3D ~VM_PFNMAP; + clear_vm_flags(vma, VM_PFNMAP); vma->vm_pgoff =3D 0; =20 err =3D dma_mmap_wc(gem->dev->dev, vma, bo->vaddr, bo->iova, @@ -588,8 +588,7 @@ int __tegra_gem_mmap(struct drm_gem_object *gem, struct= vm_area_struct *vma) } else { pgprot_t prot =3D vm_get_page_prot(vma->vm_flags); =20 - vma->vm_flags |=3D VM_MIXEDMAP; - vma->vm_flags &=3D ~VM_PFNMAP; + mod_vm_flags(vma, VM_MIXEDMAP, VM_PFNMAP); =20 vma->vm_page_prot =3D pgprot_writecombine(prot); } diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_v= m.c index 5a3e4b891377..0861e6e33964 100644 --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c @@ -468,8 +468,7 @@ int ttm_bo_mmap_obj(struct vm_area_struct *vma, struct = ttm_buffer_object *bo) =20 vma->vm_private_data =3D bo; =20 - vma->vm_flags |=3D VM_PFNMAP; - vma->vm_flags |=3D VM_IO | VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(vma, VM_PFNMAP | VM_IO | VM_DONTEXPAND | VM_DONTDUMP); return 0; } EXPORT_SYMBOL(ttm_bo_mmap_obj); diff --git a/drivers/gpu/drm/virtio/virtgpu_vram.c b/drivers/gpu/drm/virtio= /virtgpu_vram.c index 6b45b0429fef..5498a1dbef63 100644 --- a/drivers/gpu/drm/virtio/virtgpu_vram.c +++ b/drivers/gpu/drm/virtio/virtgpu_vram.c @@ -46,7 +46,7 @@ static int virtio_gpu_vram_mmap(struct drm_gem_object *ob= j, return -EINVAL; =20 vma->vm_pgoff -=3D drm_vma_node_start(&obj->vma_node); - vma->vm_flags |=3D VM_MIXEDMAP | VM_DONTEXPAND; + set_vm_flags(vma, VM_MIXEDMAP | VM_DONTEXPAND); vma->vm_page_prot =3D vm_get_page_prot(vma->vm_flags); vma->vm_page_prot =3D pgprot_decrypted(vma->vm_page_prot); vma->vm_ops =3D &virtio_gpu_vram_vm_ops; diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c b/drivers/gpu/drm/vmw= gfx/vmwgfx_ttm_glue.c index 265f7c48d856..8c8015528b6f 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_ttm_glue.c @@ -97,7 +97,7 @@ int vmw_mmap(struct file *filp, struct vm_area_struct *vm= a) =20 /* Use VM_PFNMAP rather than VM_MIXEDMAP if not a COW mapping */ if (!is_cow_mapping(vma->vm_flags)) - vma->vm_flags =3D (vma->vm_flags & ~VM_MIXEDMAP) | VM_PFNMAP; + mod_vm_flags(vma, VM_PFNMAP, VM_MIXEDMAP); =20 ttm_bo_put(bo); /* release extra ref taken by ttm_bo_mmap_obj() */ =20 diff --git a/drivers/gpu/drm/xen/xen_drm_front_gem.c b/drivers/gpu/drm/xen/= xen_drm_front_gem.c index 4c95ebcdcc2d..18a93ad4aa1f 100644 --- a/drivers/gpu/drm/xen/xen_drm_front_gem.c +++ b/drivers/gpu/drm/xen/xen_drm_front_gem.c @@ -69,8 +69,7 @@ static int xen_drm_front_gem_object_mmap(struct drm_gem_o= bject *gem_obj, * vm_pgoff (used as a fake buffer offset by DRM) to 0 as we want to map * the whole buffer. 
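 *
 * Sites like the xen hunk below, which used to clear one bit and set
 * two others in back-to-back statements, collapse into a single
 * mod_vm_flags() call. Judging only from the argument order used
 * throughout this patch (bits to set first, bits to clear second, as in
 * the s390 gmap.c hunk earlier), a minimal sketch of that helper would
 * be:
 */

static inline void mod_vm_flags(struct vm_area_struct *vma,
				unsigned long set, unsigned long clear)
{
	vma->vm_flags |= set;		/* e.g. VM_MIXEDMAP | VM_DONTEXPAND */
	vma->vm_flags &= ~clear;	/* e.g. VM_PFNMAP */
}

/*
 * Folding the pair into one call also keeps the set and clear halves
 * from drifting apart in later edits.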
*/ - vma->vm_flags &=3D ~VM_PFNMAP; - vma->vm_flags |=3D VM_MIXEDMAP | VM_DONTEXPAND; + mod_vm_flags(vma, VM_MIXEDMAP | VM_DONTEXPAND, VM_PFNMAP); vma->vm_pgoff =3D 0; =20 /* diff --git a/drivers/hsi/clients/cmt_speech.c b/drivers/hsi/clients/cmt_spe= ech.c index 8069f795c864..952a31e742a1 100644 --- a/drivers/hsi/clients/cmt_speech.c +++ b/drivers/hsi/clients/cmt_speech.c @@ -1264,7 +1264,7 @@ static int cs_char_mmap(struct file *file, struct vm_= area_struct *vma) if (vma_pages(vma) !=3D 1) return -EINVAL; =20 - vma->vm_flags |=3D VM_IO | VM_DONTDUMP | VM_DONTEXPAND; + set_vm_flags(vma, VM_IO | VM_DONTDUMP | VM_DONTEXPAND); vma->vm_ops =3D &cs_char_vm_ops; vma->vm_private_data =3D file->private_data; =20 diff --git a/drivers/hwtracing/intel_th/msu.c b/drivers/hwtracing/intel_th/= msu.c index 6c8215a47a60..a6f178bf3ded 100644 --- a/drivers/hwtracing/intel_th/msu.c +++ b/drivers/hwtracing/intel_th/msu.c @@ -1659,7 +1659,7 @@ static int intel_th_msc_mmap(struct file *file, struc= t vm_area_struct *vma) atomic_dec(&msc->user_count); =20 vma->vm_page_prot =3D pgprot_noncached(vma->vm_page_prot); - vma->vm_flags |=3D VM_DONTEXPAND | VM_DONTCOPY; + set_vm_flags(vma, VM_DONTEXPAND | VM_DONTCOPY); vma->vm_ops =3D &msc_mmap_ops; return ret; } diff --git a/drivers/hwtracing/stm/core.c b/drivers/hwtracing/stm/core.c index 2712e699ba08..9a59e61c4194 100644 --- a/drivers/hwtracing/stm/core.c +++ b/drivers/hwtracing/stm/core.c @@ -715,7 +715,7 @@ static int stm_char_mmap(struct file *file, struct vm_a= rea_struct *vma) pm_runtime_get_sync(&stm->dev); =20 vma->vm_page_prot =3D pgprot_noncached(vma->vm_page_prot); - vma->vm_flags |=3D VM_IO | VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(vma, VM_IO | VM_DONTEXPAND | VM_DONTDUMP); vma->vm_ops =3D &stm_mmap_vmops; vm_iomap_memory(vma, phys, size); =20 diff --git a/drivers/infiniband/hw/hfi1/file_ops.c b/drivers/infiniband/hw/= hfi1/file_ops.c index f5f9269fdc16..7294f2d33bc6 100644 --- a/drivers/infiniband/hw/hfi1/file_ops.c +++ b/drivers/infiniband/hw/hfi1/file_ops.c @@ -403,7 +403,7 @@ static int hfi1_file_mmap(struct file *fp, struct vm_ar= ea_struct *vma) ret =3D -EPERM; goto done; } - vma->vm_flags &=3D ~VM_MAYWRITE; + clear_vm_flags(vma, VM_MAYWRITE); addr =3D vma->vm_start; for (i =3D 0 ; i < uctxt->egrbufs.numbufs; i++) { memlen =3D uctxt->egrbufs.buffers[i].len; @@ -528,7 +528,7 @@ static int hfi1_file_mmap(struct file *fp, struct vm_ar= ea_struct *vma) goto done; } =20 - vma->vm_flags =3D flags; + reset_vm_flags(vma, flags); hfi1_cdbg(PROC, "%u:%u type:%u io/vf:%d/%d, addr:0x%llx, len:%lu(%lu), flags:0x%lx\n", ctxt, subctxt, type, mapio, vmf, memaddr, memlen, diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5= /main.c index c669ef6e47e7..538318c809b3 100644 --- a/drivers/infiniband/hw/mlx5/main.c +++ b/drivers/infiniband/hw/mlx5/main.c @@ -2087,7 +2087,7 @@ static int mlx5_ib_mmap_clock_info_page(struct mlx5_i= b_dev *dev, =20 if (vma->vm_flags & (VM_WRITE | VM_EXEC)) return -EPERM; - vma->vm_flags &=3D ~VM_MAYWRITE; + clear_vm_flags(vma, VM_MAYWRITE); =20 if (!dev->mdev->clock_info) return -EOPNOTSUPP; @@ -2311,7 +2311,7 @@ static int mlx5_ib_mmap(struct ib_ucontext *ibcontext= , struct vm_area_struct *vm =20 if (vma->vm_flags & VM_WRITE) return -EPERM; - vma->vm_flags &=3D ~VM_MAYWRITE; + clear_vm_flags(vma, VM_MAYWRITE); =20 /* Don't expose to user-space information it shouldn't have */ if (PAGE_SIZE > 4096) diff --git a/drivers/infiniband/hw/qib/qib_file_ops.c b/drivers/infiniband/= hw/qib/qib_file_ops.c index 
3937144b2ae5..16ef80df4b7f 100644 --- a/drivers/infiniband/hw/qib/qib_file_ops.c +++ b/drivers/infiniband/hw/qib/qib_file_ops.c @@ -733,7 +733,7 @@ static int qib_mmap_mem(struct vm_area_struct *vma, str= uct qib_ctxtdata *rcd, } =20 /* don't allow them to later change with mprotect */ - vma->vm_flags &=3D ~VM_MAYWRITE; + clear_vm_flags(vma, VM_MAYWRITE); } =20 pfn =3D virt_to_phys(kvaddr) >> PAGE_SHIFT; @@ -769,7 +769,7 @@ static int mmap_ureg(struct vm_area_struct *vma, struct= qib_devdata *dd, phys =3D dd->physaddr + ureg; vma->vm_page_prot =3D pgprot_noncached(vma->vm_page_prot); =20 - vma->vm_flags |=3D VM_DONTCOPY | VM_DONTEXPAND; + set_vm_flags(vma, VM_DONTCOPY | VM_DONTEXPAND); ret =3D io_remap_pfn_range(vma, vma->vm_start, phys >> PAGE_SHIFT, vma->vm_end - vma->vm_start, @@ -810,8 +810,7 @@ static int mmap_piobufs(struct vm_area_struct *vma, * don't allow them to later change to readable with mprotect (for when * not initially mapped readable, as is normally the case) */ - vma->vm_flags &=3D ~VM_MAYREAD; - vma->vm_flags |=3D VM_DONTCOPY | VM_DONTEXPAND; + mod_vm_flags(vma, VM_DONTCOPY | VM_DONTEXPAND, VM_MAYREAD); =20 /* We used PAT if wc_cookie =3D=3D 0 */ if (!dd->wc_cookie) @@ -852,7 +851,7 @@ static int mmap_rcvegrbufs(struct vm_area_struct *vma, goto bail; } /* don't allow them to later change to writable with mprotect */ - vma->vm_flags &=3D ~VM_MAYWRITE; + clear_vm_flags(vma, VM_MAYWRITE); =20 start =3D vma->vm_start; =20 @@ -944,7 +943,7 @@ static int mmap_kvaddr(struct vm_area_struct *vma, u64 = pgaddr, * Don't allow permission to later change to writable * with mprotect. */ - vma->vm_flags &=3D ~VM_MAYWRITE; + clear_vm_flags(vma, VM_MAYWRITE); } else goto bail; len =3D vma->vm_end - vma->vm_start; @@ -955,7 +954,7 @@ static int mmap_kvaddr(struct vm_area_struct *vma, u64 = pgaddr, =20 vma->vm_pgoff =3D (unsigned long) addr >> PAGE_SHIFT; vma->vm_ops =3D &qib_file_vm_ops; - vma->vm_flags |=3D VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(vma, VM_DONTEXPAND | VM_DONTDUMP); ret =3D 1; =20 bail: diff --git a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c b/drivers/infinib= and/hw/usnic/usnic_ib_verbs.c index 6e8c4fbb8083..6f9237c2a26b 100644 --- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c +++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c @@ -672,7 +672,7 @@ int usnic_ib_mmap(struct ib_ucontext *context, usnic_dbg("\n"); =20 us_ibdev =3D to_usdev(context->device); - vma->vm_flags |=3D VM_IO; + set_vm_flags(vma, VM_IO); vma->vm_page_prot =3D pgprot_noncached(vma->vm_page_prot); vfid =3D vma->vm_pgoff; usnic_dbg("Page Offset %lu PAGE_SHIFT %u VFID %u\n", diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.c b/drivers/infi= niband/hw/vmw_pvrdma/pvrdma_verbs.c index 19176583dbde..7f1b7b5dd3f4 100644 --- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.c +++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.c @@ -408,7 +408,7 @@ int pvrdma_mmap(struct ib_ucontext *ibcontext, struct v= m_area_struct *vma) } =20 /* Map UAR to kernel space, VM_LOCKED? 
*/ - vma->vm_flags |=3D VM_DONTCOPY | VM_DONTEXPAND; + set_vm_flags(vma, VM_DONTCOPY | VM_DONTEXPAND); vma->vm_page_prot =3D pgprot_noncached(vma->vm_page_prot); if (io_remap_pfn_range(vma, start, context->uar.pfn, size, vma->vm_page_prot)) diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/driver= s/media/common/videobuf2/videobuf2-dma-contig.c index 5f1175f8b349..e66ae399749e 100644 --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c @@ -293,7 +293,7 @@ static int vb2_dc_mmap(void *buf_priv, struct vm_area_s= truct *vma) return ret; } =20 - vma->vm_flags |=3D VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(vma, VM_DONTEXPAND | VM_DONTDUMP); vma->vm_private_data =3D &buf->handler; vma->vm_ops =3D &vb2_common_vm_ops; =20 diff --git a/drivers/media/common/videobuf2/videobuf2-vmalloc.c b/drivers/m= edia/common/videobuf2/videobuf2-vmalloc.c index 959b45beb1f3..edb47240ec17 100644 --- a/drivers/media/common/videobuf2/videobuf2-vmalloc.c +++ b/drivers/media/common/videobuf2/videobuf2-vmalloc.c @@ -185,7 +185,7 @@ static int vb2_vmalloc_mmap(void *buf_priv, struct vm_a= rea_struct *vma) /* * Make sure that vm_areas for 2 buffers won't be merged together */ - vma->vm_flags |=3D VM_DONTEXPAND; + set_vm_flags(vma, VM_DONTEXPAND); =20 /* * Use common vm_area operations to track buffer refcount. diff --git a/drivers/media/v4l2-core/videobuf-dma-contig.c b/drivers/media/= v4l2-core/videobuf-dma-contig.c index f2c439359557..c030823185ba 100644 --- a/drivers/media/v4l2-core/videobuf-dma-contig.c +++ b/drivers/media/v4l2-core/videobuf-dma-contig.c @@ -314,7 +314,7 @@ static int __videobuf_mmap_mapper(struct videobuf_queue= *q, } =20 vma->vm_ops =3D &videobuf_vm_ops; - vma->vm_flags |=3D VM_DONTEXPAND; + set_vm_flags(vma, VM_DONTEXPAND); vma->vm_private_data =3D map; =20 dev_dbg(q->dev, "mmap %p: q=3D%p %08lx-%08lx (%lx) pgoff %08lx buf %d\n", diff --git a/drivers/media/v4l2-core/videobuf-dma-sg.c b/drivers/media/v4l2= -core/videobuf-dma-sg.c index 234e9f647c96..9adac4875f29 100644 --- a/drivers/media/v4l2-core/videobuf-dma-sg.c +++ b/drivers/media/v4l2-core/videobuf-dma-sg.c @@ -630,8 +630,8 @@ static int __videobuf_mmap_mapper(struct videobuf_queue= *q, map->count =3D 1; map->q =3D q; vma->vm_ops =3D &videobuf_vm_ops; - vma->vm_flags |=3D VM_DONTEXPAND | VM_DONTDUMP; - vma->vm_flags &=3D ~VM_IO; /* using shared anonymous pages */ + /* using shared anonymous pages */ + mod_vm_flags(vma, VM_DONTEXPAND | VM_DONTDUMP, VM_IO); vma->vm_private_data =3D map; dprintk(1, "mmap %p: q=3D%p %08lx-%08lx pgoff %08lx bufs %d-%d\n", map, q, vma->vm_start, vma->vm_end, vma->vm_pgoff, first, last); diff --git a/drivers/media/v4l2-core/videobuf-vmalloc.c b/drivers/media/v4l= 2-core/videobuf-vmalloc.c index 9b2443720ab0..48d439ccd414 100644 --- a/drivers/media/v4l2-core/videobuf-vmalloc.c +++ b/drivers/media/v4l2-core/videobuf-vmalloc.c @@ -247,7 +247,7 @@ static int __videobuf_mmap_mapper(struct videobuf_queue= *q, } =20 vma->vm_ops =3D &videobuf_vm_ops; - vma->vm_flags |=3D VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(vma, VM_DONTEXPAND | VM_DONTDUMP); vma->vm_private_data =3D map; =20 dprintk(1, "mmap %p: q=3D%p %08lx-%08lx (%lx) pgoff %08lx buf %d\n", diff --git a/drivers/misc/cxl/context.c b/drivers/misc/cxl/context.c index acaa44809c58..17562e4efcb2 100644 --- a/drivers/misc/cxl/context.c +++ b/drivers/misc/cxl/context.c @@ -220,7 +220,7 @@ int cxl_context_iomap(struct cxl_context *ctx, struct v= m_area_struct *vma) pr_devel("%s: mmio 
physical: %llx pe: %i master:%i\n", __func__, ctx->psn_phys, ctx->pe , ctx->master); =20 - vma->vm_flags |=3D VM_IO | VM_PFNMAP; + set_vm_flags(vma, VM_IO | VM_PFNMAP); vma->vm_page_prot =3D pgprot_noncached(vma->vm_page_prot); vma->vm_ops =3D &cxl_mmap_vmops; return 0; diff --git a/drivers/misc/habanalabs/common/memory.c b/drivers/misc/habanal= abs/common/memory.c index 5e9ae7600d75..ad8eae764b9b 100644 --- a/drivers/misc/habanalabs/common/memory.c +++ b/drivers/misc/habanalabs/common/memory.c @@ -2082,7 +2082,7 @@ static int hl_ts_mmap(struct hl_mmap_mem_buf *buf, st= ruct vm_area_struct *vma, v { struct hl_ts_buff *ts_buff =3D buf->private; =20 - vma->vm_flags |=3D VM_DONTEXPAND | VM_DONTDUMP | VM_DONTCOPY | VM_NORESER= VE; + set_vm_flags(vma, VM_DONTEXPAND | VM_DONTDUMP | VM_DONTCOPY | VM_NORESERV= E); return remap_vmalloc_range(vma, ts_buff->user_buff_address, 0); } =20 diff --git a/drivers/misc/habanalabs/gaudi/gaudi.c b/drivers/misc/habanalab= s/gaudi/gaudi.c index 9f5e208701ba..4186f04da224 100644 --- a/drivers/misc/habanalabs/gaudi/gaudi.c +++ b/drivers/misc/habanalabs/gaudi/gaudi.c @@ -4236,8 +4236,8 @@ static int gaudi_mmap(struct hl_device *hdev, struct = vm_area_struct *vma, { int rc; =20 - vma->vm_flags |=3D VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP | - VM_DONTCOPY | VM_NORESERVE; + set_vm_flags(vma, VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP | + VM_DONTCOPY | VM_NORESERVE); =20 rc =3D dma_mmap_coherent(hdev->dev, vma, cpu_addr, (dma_addr - HOST_PHYS_BASE), size); diff --git a/drivers/misc/habanalabs/gaudi2/gaudi2.c b/drivers/misc/habanal= abs/gaudi2/gaudi2.c index e793fb2bdcbe..7311c3053944 100644 --- a/drivers/misc/habanalabs/gaudi2/gaudi2.c +++ b/drivers/misc/habanalabs/gaudi2/gaudi2.c @@ -5538,8 +5538,8 @@ static int gaudi2_mmap(struct hl_device *hdev, struct= vm_area_struct *vma, { int rc; =20 - vma->vm_flags |=3D VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP | - VM_DONTCOPY | VM_NORESERVE; + set_vm_flags(vma, VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP | + VM_DONTCOPY | VM_NORESERVE); =20 #ifdef _HAS_DMA_MMAP_COHERENT =20 @@ -10116,8 +10116,8 @@ static int gaudi2_block_mmap(struct hl_device *hdev= , struct vm_area_struct *vma, =20 address =3D pci_resource_start(hdev->pdev, SRAM_CFG_BAR_ID) + offset_in_b= ar; =20 - vma->vm_flags |=3D VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP | - VM_DONTCOPY | VM_NORESERVE; + set_vm_flags(vma, VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP | + VM_DONTCOPY | VM_NORESERVE); =20 rc =3D remap_pfn_range(vma, vma->vm_start, address >> PAGE_SHIFT, block_size, vma->vm_page_prot); diff --git a/drivers/misc/habanalabs/goya/goya.c b/drivers/misc/habanalabs/= goya/goya.c index 0f083fcf81a6..5e2aaa26ea29 100644 --- a/drivers/misc/habanalabs/goya/goya.c +++ b/drivers/misc/habanalabs/goya/goya.c @@ -2880,8 +2880,8 @@ static int goya_mmap(struct hl_device *hdev, struct v= m_area_struct *vma, { int rc; =20 - vma->vm_flags |=3D VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP | - VM_DONTCOPY | VM_NORESERVE; + set_vm_flags(vma, VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP | + VM_DONTCOPY | VM_NORESERVE); =20 rc =3D dma_mmap_coherent(hdev->dev, vma, cpu_addr, (dma_addr - HOST_PHYS_BASE), size); diff --git a/drivers/misc/ocxl/context.c b/drivers/misc/ocxl/context.c index 9eb0d93b01c6..e6f941248e93 100644 --- a/drivers/misc/ocxl/context.c +++ b/drivers/misc/ocxl/context.c @@ -180,7 +180,7 @@ static int check_mmap_afu_irq(struct ocxl_context *ctx, if ((vma->vm_flags & VM_READ) || (vma->vm_flags & VM_EXEC) || !(vma->vm_flags & 
VM_WRITE)) return -EINVAL; - vma->vm_flags &=3D ~(VM_MAYREAD | VM_MAYEXEC); + clear_vm_flags(vma, VM_MAYREAD | VM_MAYEXEC); return 0; } =20 @@ -204,7 +204,7 @@ int ocxl_context_mmap(struct ocxl_context *ctx, struct = vm_area_struct *vma) if (rc) return rc; =20 - vma->vm_flags |=3D VM_IO | VM_PFNMAP; + set_vm_flags(vma, VM_IO | VM_PFNMAP); vma->vm_page_prot =3D pgprot_noncached(vma->vm_page_prot); vma->vm_ops =3D &ocxl_vmops; return 0; diff --git a/drivers/misc/ocxl/sysfs.c b/drivers/misc/ocxl/sysfs.c index 25c78df8055d..9398246cac79 100644 --- a/drivers/misc/ocxl/sysfs.c +++ b/drivers/misc/ocxl/sysfs.c @@ -134,7 +134,7 @@ static int global_mmio_mmap(struct file *filp, struct k= object *kobj, (afu->config.global_mmio_size >> PAGE_SHIFT)) return -EINVAL; =20 - vma->vm_flags |=3D VM_IO | VM_PFNMAP; + set_vm_flags(vma, VM_IO | VM_PFNMAP); vma->vm_page_prot =3D pgprot_noncached(vma->vm_page_prot); vma->vm_ops =3D &global_mmio_vmops; vma->vm_private_data =3D afu; diff --git a/drivers/misc/open-dice.c b/drivers/misc/open-dice.c index c61be3404c6f..9f9438b5b075 100644 --- a/drivers/misc/open-dice.c +++ b/drivers/misc/open-dice.c @@ -96,13 +96,13 @@ static int open_dice_mmap(struct file *filp, struct vm_= area_struct *vma) =20 /* Ensure userspace cannot acquire VM_WRITE + VM_SHARED later. */ if (vma->vm_flags & VM_WRITE) - vma->vm_flags &=3D ~VM_MAYSHARE; + clear_vm_flags(vma, VM_MAYSHARE); else if (vma->vm_flags & VM_SHARED) - vma->vm_flags &=3D ~VM_MAYWRITE; + clear_vm_flags(vma, VM_MAYWRITE); =20 /* Create write-combine mapping so all clients observe a wipe. */ vma->vm_page_prot =3D pgprot_writecombine(vma->vm_page_prot); - vma->vm_flags |=3D VM_DONTCOPY | VM_DONTDUMP; + set_vm_flags(vma, VM_DONTCOPY | VM_DONTDUMP); return vm_iomap_memory(vma, drvdata->rmem->base, drvdata->rmem->size); } =20 diff --git a/drivers/misc/sgi-gru/grufile.c b/drivers/misc/sgi-gru/grufile.c index 7ffcfc0bb587..8b777286d3b2 100644 --- a/drivers/misc/sgi-gru/grufile.c +++ b/drivers/misc/sgi-gru/grufile.c @@ -101,8 +101,8 @@ static int gru_file_mmap(struct file *file, struct vm_a= rea_struct *vma) vma->vm_end & (GRU_GSEG_PAGESIZE - 1)) return -EINVAL; =20 - vma->vm_flags |=3D VM_IO | VM_PFNMAP | VM_LOCKED | - VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(vma, VM_IO | VM_PFNMAP | VM_LOCKED | + VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP); vma->vm_page_prot =3D PAGE_SHARED; vma->vm_ops =3D &gru_vm_ops; =20 diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c index 905eff1f840e..f57e91cdb0f6 100644 --- a/drivers/misc/uacce/uacce.c +++ b/drivers/misc/uacce/uacce.c @@ -229,7 +229,7 @@ static int uacce_fops_mmap(struct file *filep, struct v= m_area_struct *vma) if (!qfr) return -ENOMEM; =20 - vma->vm_flags |=3D VM_DONTCOPY | VM_DONTEXPAND | VM_WIPEONFORK; + set_vm_flags(vma, VM_DONTCOPY | VM_DONTEXPAND | VM_WIPEONFORK); vma->vm_ops =3D &uacce_vm_ops; vma->vm_private_data =3D q; qfr->type =3D type; diff --git a/drivers/sbus/char/oradax.c b/drivers/sbus/char/oradax.c index 21b7cb6e7e70..a096734daad0 100644 --- a/drivers/sbus/char/oradax.c +++ b/drivers/sbus/char/oradax.c @@ -389,7 +389,7 @@ static int dax_devmap(struct file *f, struct vm_area_st= ruct *vma) /* completion area is mapped read-only for user */ if (vma->vm_flags & VM_WRITE) return -EPERM; - vma->vm_flags &=3D ~VM_MAYWRITE; + clear_vm_flags(vma, VM_MAYWRITE); =20 if (remap_pfn_range(vma, vma->vm_start, ctx->ca_buf_ra >> PAGE_SHIFT, len, vma->vm_page_prot)) diff --git a/drivers/scsi/cxlflash/ocxl_hw.c b/drivers/scsi/cxlflash/ocxl_h= w.c index 
631eda2d467e..d386c25c2699 100644 --- a/drivers/scsi/cxlflash/ocxl_hw.c +++ b/drivers/scsi/cxlflash/ocxl_hw.c @@ -1167,7 +1167,7 @@ static int afu_mmap(struct file *file, struct vm_area= _struct *vma) (ctx->psn_size >> PAGE_SHIFT)) return -EINVAL; =20 - vma->vm_flags |=3D VM_IO | VM_PFNMAP; + set_vm_flags(vma, VM_IO | VM_PFNMAP); vma->vm_page_prot =3D pgprot_noncached(vma->vm_page_prot); vma->vm_ops =3D &ocxlflash_vmops; return 0; diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index ff9854f59964..7438adfe3bdc 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -1288,7 +1288,7 @@ sg_mmap(struct file *filp, struct vm_area_struct *vma) } =20 sfp->mmap_called =3D 1; - vma->vm_flags |=3D VM_IO | VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(vma, VM_IO | VM_DONTEXPAND | VM_DONTDUMP); vma->vm_private_data =3D sfp; vma->vm_ops =3D &sg_mmap_vm_ops; out: diff --git a/drivers/staging/media/atomisp/pci/hmm/hmm_bo.c b/drivers/stagi= ng/media/atomisp/pci/hmm/hmm_bo.c index 5e53eed8ae95..df1c944e5058 100644 --- a/drivers/staging/media/atomisp/pci/hmm/hmm_bo.c +++ b/drivers/staging/media/atomisp/pci/hmm/hmm_bo.c @@ -1072,7 +1072,7 @@ int hmm_bo_mmap(struct vm_area_struct *vma, struct hm= m_buffer_object *bo) vma->vm_private_data =3D bo; =20 vma->vm_ops =3D &hmm_bo_vm_ops; - vma->vm_flags |=3D VM_IO | VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(vma, VM_IO | VM_DONTEXPAND | VM_DONTDUMP); =20 /* * call hmm_bo_vm_open explicitly. diff --git a/drivers/staging/media/deprecated/meye/meye.c b/drivers/staging= /media/deprecated/meye/meye.c index 5d87efd9b95c..2505e64d7119 100644 --- a/drivers/staging/media/deprecated/meye/meye.c +++ b/drivers/staging/media/deprecated/meye/meye.c @@ -1476,8 +1476,8 @@ static int meye_mmap(struct file *file, struct vm_are= a_struct *vma) } =20 vma->vm_ops =3D &meye_vm_ops; - vma->vm_flags &=3D ~VM_IO; /* not I/O memory */ - vma->vm_flags |=3D VM_DONTEXPAND | VM_DONTDUMP; + /* not I/O memory */ + mod_vm_flags(vma, VM_DONTEXPAND | VM_DONTDUMP, VM_IO); vma->vm_private_data =3D (void *) (offset / gbufsize); meye_vm_open(vma); =20 diff --git a/drivers/staging/media/deprecated/stkwebcam/stk-webcam.c b/driv= ers/staging/media/deprecated/stkwebcam/stk-webcam.c index 787edb3d47c2..196d1034f104 100644 --- a/drivers/staging/media/deprecated/stkwebcam/stk-webcam.c +++ b/drivers/staging/media/deprecated/stkwebcam/stk-webcam.c @@ -779,7 +779,7 @@ static int v4l_stk_mmap(struct file *fp, struct vm_area= _struct *vma) ret =3D remap_vmalloc_range(vma, sbuf->buffer, 0); if (ret) return ret; - vma->vm_flags |=3D VM_DONTEXPAND; + set_vm_flags(vma, VM_DONTEXPAND); vma->vm_private_data =3D sbuf; vma->vm_ops =3D &stk_v4l_vm_ops; sbuf->v4lbuf.flags |=3D V4L2_BUF_FLAG_MAPPED; diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core= _user.c index 2940559c3086..9fd64259904c 100644 --- a/drivers/target/target_core_user.c +++ b/drivers/target/target_core_user.c @@ -1928,7 +1928,7 @@ static int tcmu_mmap(struct uio_info *info, struct vm= _area_struct *vma) { struct tcmu_dev *udev =3D container_of(info, struct tcmu_dev, uio_info); =20 - vma->vm_flags |=3D VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(vma, VM_DONTEXPAND | VM_DONTDUMP); vma->vm_ops =3D &tcmu_vm_ops; =20 vma->vm_private_data =3D udev; diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c index 43afbb7c5ab9..08802744f3b7 100644 --- a/drivers/uio/uio.c +++ b/drivers/uio/uio.c @@ -713,7 +713,7 @@ static const struct vm_operations_struct uio_logical_vm= _ops =3D { =20 static int uio_mmap_logical(struct vm_area_struct *vma) { - 
vma->vm_flags |=3D VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(vma, VM_DONTEXPAND | VM_DONTDUMP); vma->vm_ops =3D &uio_logical_vm_ops; return 0; } diff --git a/drivers/usb/core/devio.c b/drivers/usb/core/devio.c index 837f3e57f580..d9aefa259883 100644 --- a/drivers/usb/core/devio.c +++ b/drivers/usb/core/devio.c @@ -279,8 +279,7 @@ static int usbdev_mmap(struct file *file, struct vm_are= a_struct *vma) } } =20 - vma->vm_flags |=3D VM_IO; - vma->vm_flags |=3D (VM_DONTEXPAND | VM_DONTDUMP); + set_vm_flags(vma, VM_IO | VM_DONTEXPAND | VM_DONTDUMP); vma->vm_ops =3D &usbdev_vm_ops; vma->vm_private_data =3D usbm; =20 diff --git a/drivers/usb/mon/mon_bin.c b/drivers/usb/mon/mon_bin.c index 094e812e9e69..9b2d48a65fdf 100644 --- a/drivers/usb/mon/mon_bin.c +++ b/drivers/usb/mon/mon_bin.c @@ -1272,8 +1272,7 @@ static int mon_bin_mmap(struct file *filp, struct vm_= area_struct *vma) if (vma->vm_flags & VM_WRITE) return -EPERM; =20 - vma->vm_flags &=3D ~VM_MAYWRITE; - vma->vm_flags |=3D VM_DONTEXPAND | VM_DONTDUMP; + mod_vm_flags(vma, VM_DONTEXPAND | VM_DONTDUMP, VM_MAYWRITE); vma->vm_private_data =3D filp->private_data; mon_bin_vma_open(vma); return 0; diff --git a/drivers/vdpa/vdpa_user/iova_domain.c b/drivers/vdpa/vdpa_user/= iova_domain.c index e682bc7ee6c9..39dcce2e455b 100644 --- a/drivers/vdpa/vdpa_user/iova_domain.c +++ b/drivers/vdpa/vdpa_user/iova_domain.c @@ -512,7 +512,7 @@ static int vduse_domain_mmap(struct file *file, struct = vm_area_struct *vma) { struct vduse_iova_domain *domain =3D file->private_data; =20 - vma->vm_flags |=3D VM_DONTDUMP | VM_DONTEXPAND; + set_vm_flags(vma, VM_DONTDUMP | VM_DONTEXPAND); vma->vm_private_data =3D domain; vma->vm_ops =3D &vduse_domain_mmap_ops; =20 diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_c= ore.c index 26a541cc64d1..86eb3fc9ffb4 100644 --- a/drivers/vfio/pci/vfio_pci_core.c +++ b/drivers/vfio/pci/vfio_pci_core.c @@ -1799,7 +1799,7 @@ int vfio_pci_core_mmap(struct vfio_device *core_vdev,= struct vm_area_struct *vma * See remap_pfn_range(), called from vfio_pci_fault() but we can't * change vm_flags within the fault handler. Set them now. 
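 *
 * Comments like this one are the point of the whole conversion:
 * vm_flags may only be written with the mmap_lock held for writing,
 * never from a fault handler. One plausible hardened form of the setter
 * (an assumption here; this patch does not show the helper bodies)
 * would simply assert that before touching the field:
 */

static inline void set_vm_flags(struct vm_area_struct *vma,
				unsigned long flags)
{
	/* writers must hold the mmap_lock in write mode */
	mmap_assert_write_locked(vma->vm_mm);
	vma->vm_flags |= flags;
}

/*
 * With the assertion centralized, every converted call site in this
 * patch would inherit the check for free.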
*/ - vma->vm_flags |=3D VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(vma, VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP); vma->vm_ops =3D &vfio_pci_mmap_ops; =20 return 0; diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c index ec32f785dfde..7b81994a7d02 100644 --- a/drivers/vhost/vdpa.c +++ b/drivers/vhost/vdpa.c @@ -1315,7 +1315,7 @@ static int vhost_vdpa_mmap(struct file *file, struct = vm_area_struct *vma) if (vma->vm_end - vma->vm_start !=3D notify.size) return -ENOTSUPP; =20 - vma->vm_flags |=3D VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(vma, VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP); vma->vm_ops =3D &vhost_vdpa_vm_ops; return 0; } diff --git a/drivers/video/fbdev/68328fb.c b/drivers/video/fbdev/68328fb.c index 7db03ed77c76..a794a740af10 100644 --- a/drivers/video/fbdev/68328fb.c +++ b/drivers/video/fbdev/68328fb.c @@ -391,7 +391,7 @@ static int mc68x328fb_mmap(struct fb_info *info, struct= vm_area_struct *vma) #ifndef MMU /* this is uClinux (no MMU) specific code */ =20 - vma->vm_flags |=3D VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(vma, VM_DONTEXPAND | VM_DONTDUMP); vma->vm_start =3D videomemory; =20 return 0; diff --git a/drivers/video/fbdev/core/fb_defio.c b/drivers/video/fbdev/core= /fb_defio.c index c730253ab85c..af0bfaa2d014 100644 --- a/drivers/video/fbdev/core/fb_defio.c +++ b/drivers/video/fbdev/core/fb_defio.c @@ -232,9 +232,9 @@ static const struct address_space_operations fb_deferre= d_io_aops =3D { int fb_deferred_io_mmap(struct fb_info *info, struct vm_area_struct *vma) { vma->vm_ops =3D &fb_deferred_io_vm_ops; - vma->vm_flags |=3D VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(vma, VM_DONTEXPAND | VM_DONTDUMP); if (!(info->flags & FBINFO_VIRTFB)) - vma->vm_flags |=3D VM_IO; + set_vm_flags(vma, VM_IO); vma->vm_private_data =3D info; return 0; } diff --git a/drivers/xen/gntalloc.c b/drivers/xen/gntalloc.c index a15729beb9d1..ee4a8958dc68 100644 --- a/drivers/xen/gntalloc.c +++ b/drivers/xen/gntalloc.c @@ -525,7 +525,7 @@ static int gntalloc_mmap(struct file *filp, struct vm_a= rea_struct *vma) =20 vma->vm_private_data =3D vm_priv; =20 - vma->vm_flags |=3D VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(vma, VM_DONTEXPAND | VM_DONTDUMP); =20 vma->vm_ops =3D &gntalloc_vmops; =20 diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c index 4d9a3050de6a..6d5bb1ebb661 100644 --- a/drivers/xen/gntdev.c +++ b/drivers/xen/gntdev.c @@ -1055,10 +1055,10 @@ static int gntdev_mmap(struct file *flip, struct vm= _area_struct *vma) =20 vma->vm_ops =3D &gntdev_vmops; =20 - vma->vm_flags |=3D VM_DONTEXPAND | VM_DONTDUMP | VM_MIXEDMAP; + set_vm_flags(vma, VM_DONTEXPAND | VM_DONTDUMP | VM_MIXEDMAP); =20 if (use_ptemod) - vma->vm_flags |=3D VM_DONTCOPY; + set_vm_flags(vma, VM_DONTCOPY); =20 vma->vm_private_data =3D map; if (map->flags) { diff --git a/drivers/xen/privcmd-buf.c b/drivers/xen/privcmd-buf.c index dd5bbb6e1b6b..037547918630 100644 --- a/drivers/xen/privcmd-buf.c +++ b/drivers/xen/privcmd-buf.c @@ -156,7 +156,7 @@ static int privcmd_buf_mmap(struct file *file, struct v= m_area_struct *vma) vma_priv->file_priv =3D file_priv; vma_priv->users =3D 1; =20 - vma->vm_flags |=3D VM_IO | VM_DONTEXPAND; + set_vm_flags(vma, VM_IO | VM_DONTEXPAND); vma->vm_ops =3D &privcmd_buf_vm_ops; vma->vm_private_data =3D vma_priv; =20 diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c index 1edf45ee9890..4c8cfc6f86d8 100644 --- a/drivers/xen/privcmd.c +++ b/drivers/xen/privcmd.c @@ -934,8 +934,8 @@ static int privcmd_mmap(struct file *file, 
struct vm_ar= ea_struct *vma) { /* DONTCOPY is essential for Xen because copy_page_range doesn't know * how to recreate these mappings */ - vma->vm_flags |=3D VM_IO | VM_PFNMAP | VM_DONTCOPY | - VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(vma, VM_IO | VM_PFNMAP | VM_DONTCOPY | + VM_DONTEXPAND | VM_DONTDUMP); vma->vm_ops =3D &privcmd_vm_ops; vma->vm_private_data =3D NULL; =20 diff --git a/fs/aio.c b/fs/aio.c index 562916d85cba..db821fb1e92d 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -390,7 +390,7 @@ static const struct vm_operations_struct aio_ring_vm_op= s =3D { =20 static int aio_ring_mmap(struct file *file, struct vm_area_struct *vma) { - vma->vm_flags |=3D VM_DONTEXPAND; + set_vm_flags(vma, VM_DONTEXPAND); vma->vm_ops =3D &aio_ring_vm_ops; return 0; } diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c index 61ccf7722fc3..874a17a1b8d9 100644 --- a/fs/cramfs/inode.c +++ b/fs/cramfs/inode.c @@ -408,7 +408,7 @@ static int cramfs_physmem_mmap(struct file *file, struc= t vm_area_struct *vma) * unpopulated ptes via cramfs_read_folio(). */ int i; - vma->vm_flags |=3D VM_MIXEDMAP; + set_vm_flags(vma, VM_MIXEDMAP); for (i =3D 0; i < pages && !ret; i++) { vm_fault_t vmf; unsigned long off =3D i * PAGE_SIZE; diff --git a/fs/erofs/data.c b/fs/erofs/data.c index f57f921683d7..e6413ced2bb1 100644 --- a/fs/erofs/data.c +++ b/fs/erofs/data.c @@ -429,7 +429,7 @@ static int erofs_file_mmap(struct file *file, struct vm= _area_struct *vma) return -EINVAL; =20 vma->vm_ops =3D &erofs_dax_vm_ops; - vma->vm_flags |=3D VM_HUGEPAGE; + set_vm_flags(vma, VM_HUGEPAGE); return 0; } #else diff --git a/fs/exec.c b/fs/exec.c index ab913243a367..5e1631e109a8 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -270,7 +270,7 @@ static int __bprm_mm_init(struct linux_binprm *bprm) BUILD_BUG_ON(VM_STACK_FLAGS & VM_STACK_INCOMPLETE_SETUP); vma->vm_end =3D STACK_TOP_MAX; vma->vm_start =3D vma->vm_end - PAGE_SIZE; - vma->vm_flags =3D VM_SOFTDIRTY | VM_STACK_FLAGS | VM_STACK_INCOMPLETE_SET= UP; + init_vm_flags(vma, VM_SOFTDIRTY | VM_STACK_FLAGS | VM_STACK_INCOMPLETE_SE= TUP); vma->vm_page_prot =3D vm_get_page_prot(vma->vm_flags); =20 err =3D insert_vm_struct(mm, vma); @@ -834,7 +834,7 @@ int setup_arg_pages(struct linux_binprm *bprm, } =20 /* mprotect_fixup is overkill to remove the temporary stack flags */ - vma->vm_flags &=3D ~VM_STACK_INCOMPLETE_SETUP; + clear_vm_flags(vma, VM_STACK_INCOMPLETE_SETUP); =20 stack_expand =3D 131072UL; /* randomly 32*4k (or 2*64k) pages */ stack_size =3D vma->vm_end - vma->vm_start; diff --git a/fs/ext4/file.c b/fs/ext4/file.c index 7ac0a81bd371..baeb385b07c7 100644 --- a/fs/ext4/file.c +++ b/fs/ext4/file.c @@ -801,7 +801,7 @@ static int ext4_file_mmap(struct file *file, struct vm_= area_struct *vma) file_accessed(file); if (IS_DAX(file_inode(file))) { vma->vm_ops =3D &ext4_dax_vm_ops; - vma->vm_flags |=3D VM_HUGEPAGE; + set_vm_flags(vma, VM_HUGEPAGE); } else { vma->vm_ops =3D &ext4_file_vm_ops; } diff --git a/fs/fuse/dax.c b/fs/fuse/dax.c index e23e802a8013..599969edc869 100644 --- a/fs/fuse/dax.c +++ b/fs/fuse/dax.c @@ -860,7 +860,7 @@ int fuse_dax_mmap(struct file *file, struct vm_area_str= uct *vma) { file_accessed(file); vma->vm_ops =3D &fuse_dax_vm_ops; - vma->vm_flags |=3D VM_MIXEDMAP | VM_HUGEPAGE; + set_vm_flags(vma, VM_MIXEDMAP | VM_HUGEPAGE); return 0; } =20 diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c index 790d2727141a..d63a392985a7 100644 --- a/fs/hugetlbfs/inode.c +++ b/fs/hugetlbfs/inode.c @@ -132,7 +132,7 @@ static int hugetlbfs_file_mmap(struct file *file, struc= t vm_area_struct 
*vma) * way when do_mmap unwinds (may be important on powerpc * and ia64). */ - vma->vm_flags |=3D VM_HUGETLB | VM_DONTEXPAND; + set_vm_flags(vma, VM_HUGETLB | VM_DONTEXPAND); vma->vm_ops =3D &hugetlb_vm_ops; =20 ret =3D seal_check_future_write(info->seals, vma); @@ -813,7 +813,7 @@ static long hugetlbfs_fallocate(struct file *file, int = mode, loff_t offset, * as input to create an allocation policy. */ vma_init(&pseudo_vma, mm); - pseudo_vma.vm_flags =3D (VM_HUGETLB | VM_MAYSHARE | VM_SHARED); + init_vm_flags(&pseudo_vma, VM_HUGETLB | VM_MAYSHARE | VM_SHARED); pseudo_vma.vm_file =3D file; =20 for (index =3D start; index < end; index++) { diff --git a/fs/orangefs/file.c b/fs/orangefs/file.c index 167fa43b24f9..0f668db6bcf3 100644 --- a/fs/orangefs/file.c +++ b/fs/orangefs/file.c @@ -389,8 +389,7 @@ static int orangefs_file_mmap(struct file *file, struct= vm_area_struct *vma) "orangefs_file_mmap: called on %pD\n", file); =20 /* set the sequential readahead hint */ - vma->vm_flags |=3D VM_SEQ_READ; - vma->vm_flags &=3D ~VM_RAND_READ; + mod_vm_flags(vma, VM_SEQ_READ, VM_RAND_READ); =20 file_accessed(file); vma->vm_ops =3D &orangefs_file_vm_ops; diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index e35a0398db63..4d651777c8a5 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -1302,7 +1302,7 @@ static ssize_t clear_refs_write(struct file *file, co= nst char __user *buf, mas_for_each(&mas, vma, ULONG_MAX) { if (!(vma->vm_flags & VM_SOFTDIRTY)) continue; - vma->vm_flags &=3D ~VM_SOFTDIRTY; + clear_vm_flags(vma, VM_SOFTDIRTY); vma_set_page_prot(vma); } =20 diff --git a/fs/proc/vmcore.c b/fs/proc/vmcore.c index 09a81e4b1273..858e4e804f85 100644 --- a/fs/proc/vmcore.c +++ b/fs/proc/vmcore.c @@ -582,8 +582,7 @@ static int mmap_vmcore(struct file *file, struct vm_are= a_struct *vma) if (vma->vm_flags & (VM_WRITE | VM_EXEC)) return -EPERM; =20 - vma->vm_flags &=3D ~(VM_MAYWRITE | VM_MAYEXEC); - vma->vm_flags |=3D VM_MIXEDMAP; + mod_vm_flags(vma, VM_MIXEDMAP, VM_MAYWRITE | VM_MAYEXEC); vma->vm_ops =3D &vmcore_mmap_ops; =20 len =3D 0; diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 98ac37e34e3d..f46252544924 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -618,7 +618,7 @@ static void userfaultfd_event_wait_completion(struct us= erfaultfd_ctx *ctx, for_each_vma(vmi, vma) { if (vma->vm_userfaultfd_ctx.ctx =3D=3D release_new_ctx) { vma->vm_userfaultfd_ctx =3D NULL_VM_UFFD_CTX; - vma->vm_flags &=3D ~__VM_UFFD_FLAGS; + clear_vm_flags(vma, __VM_UFFD_FLAGS); } } mmap_write_unlock(mm); @@ -652,7 +652,7 @@ int dup_userfaultfd(struct vm_area_struct *vma, struct = list_head *fcs) octx =3D vma->vm_userfaultfd_ctx.ctx; if (!octx || !(octx->features & UFFD_FEATURE_EVENT_FORK)) { vma->vm_userfaultfd_ctx =3D NULL_VM_UFFD_CTX; - vma->vm_flags &=3D ~__VM_UFFD_FLAGS; + clear_vm_flags(vma, __VM_UFFD_FLAGS); return 0; } =20 @@ -733,7 +733,7 @@ void mremap_userfaultfd_prep(struct vm_area_struct *vma, } else { /* Drop uffd context if remap feature not enabled */ vma->vm_userfaultfd_ctx =3D NULL_VM_UFFD_CTX; - vma->vm_flags &=3D ~__VM_UFFD_FLAGS; + clear_vm_flags(vma, __VM_UFFD_FLAGS); } } =20 @@ -895,7 +895,7 @@ static int userfaultfd_release(struct inode *inode, str= uct file *file) prev =3D vma; } =20 - vma->vm_flags =3D new_flags; + reset_vm_flags(vma, new_flags); vma->vm_userfaultfd_ctx =3D NULL_VM_UFFD_CTX; } mmap_write_unlock(mm); @@ -1463,7 +1463,7 @@ static int userfaultfd_register(struct userfaultfd_ct= x *ctx, * the next vma was merged into the current one and * the current one has not 
been updated yet. */ - vma->vm_flags =3D new_flags; + reset_vm_flags(vma, new_flags); vma->vm_userfaultfd_ctx.ctx =3D ctx; =20 if (is_vm_hugetlb_page(vma) && uffd_disable_huge_pmd_share(vma)) @@ -1651,7 +1651,7 @@ static int userfaultfd_unregister(struct userfaultfd_= ctx *ctx, * the next vma was merged into the current one and * the current one has not been updated yet. */ - vma->vm_flags =3D new_flags; + reset_vm_flags(vma, new_flags); vma->vm_userfaultfd_ctx =3D NULL_VM_UFFD_CTX; =20 skip: diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 595a5bcf46b9..bf777fed0dd4 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -1429,7 +1429,7 @@ xfs_file_mmap( file_accessed(file); vma->vm_ops =3D &xfs_file_vm_ops; if (IS_DAX(inode)) - vma->vm_flags |=3D VM_HUGEPAGE; + set_vm_flags(vma, VM_HUGEPAGE); return 0; } =20 diff --git a/include/linux/mm.h b/include/linux/mm.h index 2b16d45b75a6..594e835bad9c 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -3756,7 +3756,7 @@ static inline int seal_check_future_write(int seals, = struct vm_area_struct *vma) * VM_MAYWRITE as we still want them to be COW-writable. */ if (vma->vm_flags & VM_SHARED) - vma->vm_flags &=3D ~(VM_MAYWRITE); + clear_vm_flags(vma, VM_MAYWRITE); } =20 return 0; diff --git a/kernel/bpf/ringbuf.c b/kernel/bpf/ringbuf.c index 80f4b4d88aaf..d2c967cc2873 100644 --- a/kernel/bpf/ringbuf.c +++ b/kernel/bpf/ringbuf.c @@ -269,7 +269,7 @@ static int ringbuf_map_mmap_kern(struct bpf_map *map, s= truct vm_area_struct *vma if (vma->vm_pgoff !=3D 0 || vma->vm_end - vma->vm_start !=3D PAGE_SIZE) return -EPERM; } else { - vma->vm_flags &=3D ~VM_MAYWRITE; + clear_vm_flags(vma, VM_MAYWRITE); } /* remap_vmalloc_range() checks size and offset constraints */ return remap_vmalloc_range(vma, rb_map->rb, @@ -290,7 +290,7 @@ static int ringbuf_map_mmap_user(struct bpf_map *map, s= truct vm_area_struct *vma */ return -EPERM; } else { - vma->vm_flags &=3D ~VM_MAYWRITE; + clear_vm_flags(vma, VM_MAYWRITE); } /* remap_vmalloc_range() checks size and offset constraints */ return remap_vmalloc_range(vma, rb_map->rb, vma->vm_pgoff + RINGBUF_PGOFF= ); diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 64131f88c553..db19094c7ac7 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -882,10 +882,10 @@ static int bpf_map_mmap(struct file *filp, struct vm_= area_struct *vma) /* set default open/close callbacks */ vma->vm_ops =3D &bpf_map_default_vmops; vma->vm_private_data =3D map; - vma->vm_flags &=3D ~VM_MAYEXEC; + clear_vm_flags(vma, VM_MAYEXEC); if (!(vma->vm_flags & VM_WRITE)) /* disallow re-mapping with PROT_WRITE */ - vma->vm_flags &=3D ~VM_MAYWRITE; + clear_vm_flags(vma, VM_MAYWRITE); =20 err =3D map->ops->map_mmap(map, vma); if (err) diff --git a/kernel/events/core.c b/kernel/events/core.c index d56328e5080e..6745460dcf49 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -6573,7 +6573,7 @@ static int perf_mmap(struct file *file, struct vm_are= a_struct *vma) * Since pinned accounting is per vm we cannot allow fork() to copy our * vma. 
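 *
 * The userfaultfd hunks above, and the madvise/mlock conversions below,
 * replace whole-word assignments (vma->vm_flags = new_flags) with
 * reset_vm_flags(). Unlike set/clear/mod, this overwrites every flag at
 * once; a minimal sketch, assuming it is defined alongside the other
 * helpers, would be:
 */

static inline void reset_vm_flags(struct vm_area_struct *vma,
				  unsigned long flags)
{
	vma->vm_flags = flags;	/* replace the entire flags word */
}

/*
 * Note that mm/mlock.c below converts a WRITE_ONCE() store to
 * reset_vm_flags(), so the real helper is presumably expected to
 * publish the new value at least as safely as WRITE_ONCE() did.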
*/ - vma->vm_flags |=3D VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(vma, VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP); vma->vm_ops =3D &perf_mmap_vmops; =20 if (event->pmu->event_mapped) diff --git a/kernel/kcov.c b/kernel/kcov.c index e5cd09fd8a05..27fc1e26e1e1 100644 --- a/kernel/kcov.c +++ b/kernel/kcov.c @@ -489,7 +489,7 @@ static int kcov_mmap(struct file *filep, struct vm_area= _struct *vma) goto exit; } spin_unlock_irqrestore(&kcov->lock, flags); - vma->vm_flags |=3D VM_DONTEXPAND; + set_vm_flags(vma, VM_DONTEXPAND); for (off =3D 0; off < size; off +=3D PAGE_SIZE) { page =3D vmalloc_to_page(kcov->area + off); res =3D vm_insert_page(vma, vma->vm_start + off, page); diff --git a/kernel/relay.c b/kernel/relay.c index ef12532168d9..085aa8707bc2 100644 --- a/kernel/relay.c +++ b/kernel/relay.c @@ -91,7 +91,7 @@ static int relay_mmap_buf(struct rchan_buf *buf, struct v= m_area_struct *vma) return -EINVAL; =20 vma->vm_ops =3D &relay_file_mmap_ops; - vma->vm_flags |=3D VM_DONTEXPAND; + set_vm_flags(vma, VM_DONTEXPAND); vma->vm_private_data =3D buf; =20 return 0; diff --git a/mm/madvise.c b/mm/madvise.c index a56a6d17e201..5b74321bcac9 100644 --- a/mm/madvise.c +++ b/mm/madvise.c @@ -179,7 +179,7 @@ static int madvise_update_vma(struct vm_area_struct *vm= a, /* * vm_flags is protected by the mmap_lock held in write mode. */ - vma->vm_flags =3D new_flags; + reset_vm_flags(vma, new_flags); if (!vma->vm_file || vma_is_anon_shmem(vma)) { error =3D replace_anon_vma_name(vma, anon_name); if (error) diff --git a/mm/memory.c b/mm/memory.c index aad226daf41b..2fabf89b2be9 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1951,7 +1951,7 @@ int vm_insert_pages(struct vm_area_struct *vma, unsig= ned long addr, if (!(vma->vm_flags & VM_MIXEDMAP)) { BUG_ON(mmap_read_trylock(vma->vm_mm)); BUG_ON(vma->vm_flags & VM_PFNMAP); - vma->vm_flags |=3D VM_MIXEDMAP; + set_vm_flags(vma, VM_MIXEDMAP); } /* Defer page refcount checking till we're about to map that page. 
*/ return insert_pages(vma, addr, pages, num, vma->vm_page_prot); @@ -2009,7 +2009,7 @@ int vm_insert_page(struct vm_area_struct *vma, unsign= ed long addr, if (!(vma->vm_flags & VM_MIXEDMAP)) { BUG_ON(mmap_read_trylock(vma->vm_mm)); BUG_ON(vma->vm_flags & VM_PFNMAP); - vma->vm_flags |=3D VM_MIXEDMAP; + set_vm_flags(vma, VM_MIXEDMAP); } return insert_page(vma, addr, page, vma->vm_page_prot); } @@ -2475,7 +2475,7 @@ int remap_pfn_range_notrack(struct vm_area_struct *vm= a, unsigned long addr, vma->vm_pgoff =3D pfn; } =20 - vma->vm_flags |=3D VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(vma, VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP); =20 BUG_ON(addr >=3D end); pfn -=3D addr >> PAGE_SHIFT; diff --git a/mm/mlock.c b/mm/mlock.c index 06aa9e204fac..4807e91aaa8b 100644 --- a/mm/mlock.c +++ b/mm/mlock.c @@ -380,7 +380,7 @@ static void mlock_vma_pages_range(struct vm_area_struct= *vma, */ if (newflags & VM_LOCKED) newflags |=3D VM_IO; - WRITE_ONCE(vma->vm_flags, newflags); + reset_vm_flags(vma, newflags); =20 lru_add_drain(); walk_page_range(vma->vm_mm, start, end, &mlock_walk_ops, NULL); @@ -388,7 +388,7 @@ static void mlock_vma_pages_range(struct vm_area_struct= *vma, =20 if (newflags & VM_IO) { newflags &=3D ~VM_IO; - WRITE_ONCE(vma->vm_flags, newflags); + reset_vm_flags(vma, newflags); } } =20 @@ -456,7 +456,7 @@ static int mlock_fixup(struct vm_area_struct *vma, stru= ct vm_area_struct **prev, =20 if ((newflags & VM_LOCKED) && (oldflags & VM_LOCKED)) { /* No work to do, and mlocking twice would be wrong */ - vma->vm_flags =3D newflags; + reset_vm_flags(vma, newflags); } else { mlock_vma_pages_range(vma, start, end, newflags); } diff --git a/mm/mmap.c b/mm/mmap.c index 5c4b608edde9..fa994ae903d9 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -2607,7 +2607,7 @@ unsigned long mmap_region(struct file *file, unsigned= long addr, =20 vma->vm_start =3D addr; vma->vm_end =3D end; - vma->vm_flags =3D vm_flags; + init_vm_flags(vma, vm_flags); vma->vm_page_prot =3D vm_get_page_prot(vm_flags); vma->vm_pgoff =3D pgoff; =20 @@ -2736,7 +2736,7 @@ unsigned long mmap_region(struct file *file, unsigned= long addr, * then new mapped in-place (which must be aimed as * a completely new data area). 
*/ - vma->vm_flags |=3D VM_SOFTDIRTY; + set_vm_flags(vma, VM_SOFTDIRTY); =20 vma_set_page_prot(vma); =20 @@ -2959,7 +2959,7 @@ static int do_brk_flags(struct ma_state *mas, struct = vm_area_struct *vma, anon_vma_interval_tree_pre_update_vma(vma); } vma->vm_end =3D addr + len; - vma->vm_flags |=3D VM_SOFTDIRTY; + set_vm_flags(vma, VM_SOFTDIRTY); mas_store_prealloc(mas, vma); =20 if (vma->anon_vma) { @@ -2979,7 +2979,7 @@ static int do_brk_flags(struct ma_state *mas, struct = vm_area_struct *vma, vma->vm_start =3D addr; vma->vm_end =3D addr + len; vma->vm_pgoff =3D addr >> PAGE_SHIFT; - vma->vm_flags =3D flags; + init_vm_flags(vma, flags); vma->vm_page_prot =3D vm_get_page_prot(flags); mas_set_range(mas, vma->vm_start, addr + len - 1); if (mas_store_gfp(mas, vma, GFP_KERNEL)) @@ -2992,7 +2992,7 @@ static int do_brk_flags(struct ma_state *mas, struct = vm_area_struct *vma, mm->data_vm +=3D len >> PAGE_SHIFT; if (flags & VM_LOCKED) mm->locked_vm +=3D (len >> PAGE_SHIFT); - vma->vm_flags |=3D VM_SOFTDIRTY; + set_vm_flags(vma, VM_SOFTDIRTY); validate_mm(mm); return 0; =20 diff --git a/mm/mprotect.c b/mm/mprotect.c index 908df12caa26..79adae74c094 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -633,7 +633,7 @@ mprotect_fixup(struct mmu_gather *tlb, struct vm_area_s= truct *vma, * vm_flags and vm_page_prot are protected by the mmap_lock * held in write mode. */ - vma->vm_flags =3D newflags; + reset_vm_flags(vma, newflags); if (vma_wants_manual_pte_write_upgrade(vma)) mm_cp_flags |=3D MM_CP_TRY_CHANGE_WRITABLE; vma_set_page_prot(vma); diff --git a/mm/mremap.c b/mm/mremap.c index 5f6f9931bff1..2ccdd1561f5b 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -661,7 +661,7 @@ static unsigned long move_vma(struct vm_area_struct *vm= a, =20 /* Conceal VM_ACCOUNT so old reservation is not undone */ if (vm_flags & VM_ACCOUNT && !(flags & MREMAP_DONTUNMAP)) { - vma->vm_flags &=3D ~VM_ACCOUNT; + clear_vm_flags(vma, VM_ACCOUNT); excess =3D vma->vm_end - vma->vm_start - old_len; if (old_addr > vma->vm_start && old_addr + old_len < vma->vm_end) @@ -716,9 +716,9 @@ static unsigned long move_vma(struct vm_area_struct *vm= a, =20 /* Restore VM_ACCOUNT if one or two pieces of vma left */ if (excess) { - vma->vm_flags |=3D VM_ACCOUNT; + set_vm_flags(vma, VM_ACCOUNT); if (split) - find_vma(mm, vma->vm_end)->vm_flags |=3D VM_ACCOUNT; + set_vm_flags(find_vma(mm, vma->vm_end), VM_ACCOUNT); } =20 return new_addr; diff --git a/mm/nommu.c b/mm/nommu.c index 214c70e1d059..b3154357ced5 100644 --- a/mm/nommu.c +++ b/mm/nommu.c @@ -173,7 +173,7 @@ static void *__vmalloc_user_flags(unsigned long size, g= fp_t flags) mmap_write_lock(current->mm); vma =3D find_vma(current->mm, (unsigned long)ret); if (vma) - vma->vm_flags |=3D VM_USERMAP; + set_vm_flags(vma, VM_USERMAP); mmap_write_unlock(current->mm); } =20 @@ -991,7 +991,8 @@ static int do_mmap_private(struct vm_area_struct *vma, =20 atomic_long_add(total, &mmap_pages_allocated); =20 - region->vm_flags =3D vma->vm_flags |=3D VM_MAPPED_COPY; + set_vm_flags(vma, VM_MAPPED_COPY); + region->vm_flags =3D vma->vm_flags; region->vm_start =3D (unsigned long) base; region->vm_end =3D region->vm_start + len; region->vm_top =3D region->vm_start + (total << PAGE_SHIFT); @@ -1088,7 +1089,7 @@ unsigned long do_mmap(struct file *file, region->vm_flags =3D vm_flags; region->vm_pgoff =3D pgoff; =20 - vma->vm_flags =3D vm_flags; + init_vm_flags(vma, vm_flags); vma->vm_pgoff =3D pgoff; =20 if (file) { @@ -1152,7 +1153,7 @@ unsigned long do_mmap(struct file *file, vma->vm_end =3D start + len; =20 if
(pregion->vm_flags & VM_MAPPED_COPY) - vma->vm_flags |=3D VM_MAPPED_COPY; + set_vm_flags(vma, VM_MAPPED_COPY); else { ret =3D do_mmap_shared_file(vma); if (ret < 0) { @@ -1632,7 +1633,7 @@ int remap_pfn_range(struct vm_area_struct *vma, unsig= ned long addr, if (addr !=3D (pfn << PAGE_SHIFT)) return -EINVAL; =20 - vma->vm_flags |=3D VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(vma, VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP); return 0; } EXPORT_SYMBOL(remap_pfn_range); diff --git a/mm/secretmem.c b/mm/secretmem.c index 04c3ac9448a1..334b85714bd7 100644 --- a/mm/secretmem.c +++ b/mm/secretmem.c @@ -128,7 +128,7 @@ static int secretmem_mmap(struct file *file, struct vm_= area_struct *vma) if (mlock_future_check(vma->vm_mm, vma->vm_flags | VM_LOCKED, len)) return -EAGAIN; =20 - vma->vm_flags |=3D VM_LOCKED | VM_DONTDUMP; + set_vm_flags(vma, VM_LOCKED | VM_DONTDUMP); vma->vm_ops =3D &secretmem_vm_ops; =20 return 0; diff --git a/mm/shmem.c b/mm/shmem.c index c301487be5fb..2096bbdc955f 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -2289,7 +2289,7 @@ static int shmem_mmap(struct file *file, struct vm_ar= ea_struct *vma) return ret; =20 /* arm64 - allow memory tagging on RAM-based files */ - vma->vm_flags |=3D VM_MTE_ALLOWED; + set_vm_flags(vma, VM_MTE_ALLOWED); =20 file_accessed(file); /* This is anonymous shared memory if it is unlinked at the time of mmap = */ diff --git a/mm/vmalloc.c b/mm/vmalloc.c index ca71de7c9d77..da02ec9c650f 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -3657,7 +3657,7 @@ int remap_vmalloc_range_partial(struct vm_area_struct= *vma, unsigned long uaddr, size -=3D PAGE_SIZE; } while (size > 0); =20 - vma->vm_flags |=3D VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(vma, VM_DONTEXPAND | VM_DONTDUMP); =20 return 0; } diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index c567d5e8053e..30158585c688 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -1890,10 +1890,10 @@ int tcp_mmap(struct file *file, struct socket *sock, { if (vma->vm_flags & (VM_WRITE | VM_EXEC)) return -EPERM; - vma->vm_flags &=3D ~(VM_MAYWRITE | VM_MAYEXEC); + clear_vm_flags(vma, VM_MAYWRITE | VM_MAYEXEC); =20 /* Instruct vm_insert_page() to not mmap_read_lock(mm) */ - vma->vm_flags |=3D VM_MIXEDMAP; + set_vm_flags(vma, VM_MIXEDMAP); =20 vma->vm_ops =3D &tcp_vm_ops; return 0; diff --git a/security/selinux/selinuxfs.c b/security/selinux/selinuxfs.c index 0a6894cdc54d..9037deb5979e 100644 --- a/security/selinux/selinuxfs.c +++ b/security/selinux/selinuxfs.c @@ -262,7 +262,7 @@ static int sel_mmap_handle_status(struct file *filp, if (vma->vm_flags & VM_WRITE) return -EPERM; /* disallow mprotect() turns it into writable */ - vma->vm_flags &=3D ~VM_MAYWRITE; + clear_vm_flags(vma, VM_MAYWRITE); =20 return remap_pfn_range(vma, vma->vm_start, page_to_pfn(status), @@ -506,13 +506,13 @@ static int sel_mmap_policy(struct file *filp, struct = vm_area_struct *vma) { if (vma->vm_flags & VM_SHARED) { /* do not allow mprotect to make mapping writable */ - vma->vm_flags &=3D ~VM_MAYWRITE; + clear_vm_flags(vma, VM_MAYWRITE); =20 if (vma->vm_flags & VM_WRITE) return -EACCES; } =20 - vma->vm_flags |=3D VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(vma, VM_DONTEXPAND | VM_DONTDUMP); vma->vm_ops =3D &sel_mmap_policy_ops; =20 return 0; diff --git a/sound/core/oss/pcm_oss.c b/sound/core/oss/pcm_oss.c index ac2efeb63a39..52473e2acd07 100644 --- a/sound/core/oss/pcm_oss.c +++ b/sound/core/oss/pcm_oss.c @@ -2910,7 +2910,7 @@ static int snd_pcm_oss_mmap(struct file *file, struct= vm_area_struct *area) } /* set 
VM_READ access as well to fix memset() routines that do reads before writes (to improve performance) */ - area->vm_flags |=3D VM_READ; + set_vm_flags(area, VM_READ); if (substream =3D=3D NULL) return -ENXIO; runtime =3D substream->runtime; diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c index 9c122e757efe..f716bdb70afe 100644 --- a/sound/core/pcm_native.c +++ b/sound/core/pcm_native.c @@ -3675,8 +3675,9 @@ static int snd_pcm_mmap_status(struct snd_pcm_substre= am *substream, struct file return -EINVAL; area->vm_ops =3D &snd_pcm_vm_ops_status; area->vm_private_data =3D substream; - area->vm_flags |=3D VM_DONTEXPAND | VM_DONTDUMP; - area->vm_flags &=3D ~(VM_WRITE | VM_MAYWRITE); + mod_vm_flags(area, VM_DONTEXPAND | VM_DONTDUMP, + VM_WRITE | VM_MAYWRITE); + return 0; } =20 @@ -3712,7 +3713,7 @@ static int snd_pcm_mmap_control(struct snd_pcm_substr= eam *substream, struct file return -EINVAL; area->vm_ops =3D &snd_pcm_vm_ops_control; area->vm_private_data =3D substream; - area->vm_flags |=3D VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(area, VM_DONTEXPAND | VM_DONTDUMP); return 0; } =20 @@ -3828,7 +3829,7 @@ static const struct vm_operations_struct snd_pcm_vm_o= ps_data_fault =3D { int snd_pcm_lib_default_mmap(struct snd_pcm_substream *substream, struct vm_area_struct *area) { - area->vm_flags |=3D VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(area, VM_DONTEXPAND | VM_DONTDUMP); if (!substream->ops->page && !snd_dma_buffer_mmap(snd_pcm_get_dma_buf(substream), area)) return 0; diff --git a/sound/soc/pxa/mmp-sspa.c b/sound/soc/pxa/mmp-sspa.c index fb5a4390443f..fdd72d9bb46c 100644 --- a/sound/soc/pxa/mmp-sspa.c +++ b/sound/soc/pxa/mmp-sspa.c @@ -404,7 +404,7 @@ static int mmp_pcm_mmap(struct snd_soc_component *compo= nent, struct snd_pcm_substream *substream, struct vm_area_struct *vma) { - vma->vm_flags |=3D VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(vma, VM_DONTEXPAND | VM_DONTDUMP); vma->vm_page_prot =3D pgprot_noncached(vma->vm_page_prot); return remap_pfn_range(vma, vma->vm_start, substream->dma_buffer.addr >> PAGE_SHIFT, diff --git a/sound/usb/usx2y/us122l.c b/sound/usb/usx2y/us122l.c index e558931cce16..b51db622a69b 100644 --- a/sound/usb/usx2y/us122l.c +++ b/sound/usb/usx2y/us122l.c @@ -224,9 +224,9 @@ static int usb_stream_hwdep_mmap(struct snd_hwdep *hw, } =20 area->vm_ops =3D &usb_stream_hwdep_vm_ops; - area->vm_flags |=3D VM_DONTDUMP; + set_vm_flags(area, VM_DONTDUMP); if (!read) - area->vm_flags |=3D VM_DONTEXPAND; + set_vm_flags(area, VM_DONTEXPAND); area->vm_private_data =3D us122l; atomic_inc(&us122l->mmap_count); out: diff --git a/sound/usb/usx2y/usX2Yhwdep.c b/sound/usb/usx2y/usX2Yhwdep.c index c29da0341bc5..3abe6d891f98 100644 --- a/sound/usb/usx2y/usX2Yhwdep.c +++ b/sound/usb/usx2y/usX2Yhwdep.c @@ -61,7 +61,7 @@ static int snd_us428ctls_mmap(struct snd_hwdep *hw, struc= t file *filp, struct vm } =20 area->vm_ops =3D &us428ctls_vm_ops; - area->vm_flags |=3D VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(area, VM_DONTEXPAND | VM_DONTDUMP); area->vm_private_data =3D hw->private_data; return 0; } diff --git a/sound/usb/usx2y/usx2yhwdeppcm.c b/sound/usb/usx2y/usx2yhwdeppc= m.c index 767a227d54da..22ce93b2fb24 100644 --- a/sound/usb/usx2y/usx2yhwdeppcm.c +++ b/sound/usb/usx2y/usx2yhwdeppcm.c @@ -706,7 +706,7 @@ static int snd_usx2y_hwdep_pcm_mmap(struct snd_hwdep *h= w, struct file *filp, str return -ENODEV; =20 area->vm_ops =3D &snd_usx2y_hwdep_pcm_vm_ops; - area->vm_flags |=3D VM_DONTEXPAND | VM_DONTDUMP; + set_vm_flags(area, VM_DONTEXPAND | VM_DONTDUMP); 
area->vm_private_data =3D hw->private_data; return 0; } --=20 2.39.0 From nobody Mon Sep 15 21:41:10 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AC701C54EBD for ; Mon, 9 Jan 2023 20:55:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237620AbjAIUzp (ORCPT ); Mon, 9 Jan 2023 15:55:45 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34634 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237607AbjAIUzV (ORCPT ); Mon, 9 Jan 2023 15:55:21 -0500 Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com [IPv6:2607:f8b0:4864:20::1149]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D3F548110C for ; Mon, 9 Jan 2023 12:54:22 -0800 (PST) Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-4c57d19de6fso76641347b3.20 for ; Mon, 09 Jan 2023 12:54:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=N7mRvZOR9OSs0kq6Wz0vbjqUGqop/n2M19/OmQJBzkI=; b=kK6VGqQrZF8SG+/SjIkBNYvs6AD0XP75PAW8W4oCooA8s3XRGaAuXa1jS1HIr83jUq PfI+R4O3x61YrRtYzJolldiJm8hvHC8SxT9psAXnN9HRUclq2Ip2r4U5z9IkvWwPkq1G R8EmUXB5Ci5VhDTGcvMKuR9ocC5TZKh6cUC/3KU3F2jjjEENLe5/WyQt1N1DazjgpxS4 yljAKeS+kCfpqMArc2lBS+96rZHSjRTZ1IN5B+phw/v3GaAeI+j/XkkH3A++TUlQdWQv Z4M4yuHxGDrl8gjpdpiZsATGjMq3MZe3RurXo4cwDj7NtnBXJ6X3cbtci2E7Lsi2JjQl 1UiQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=N7mRvZOR9OSs0kq6Wz0vbjqUGqop/n2M19/OmQJBzkI=; b=nEUCk+pyS4koPMNY1PZXCXyvazvvcyQOBTpIsqcv7/RMfnTcMRK9Otz5NeVqLOB1+V Puv80WK1xq40Cqo4TU8JcUx7f5plvoY0cp6IZhxppv6YJxhlRokLqzqlDyh1kSnCZ/Ld QZGYTEiyjEkfMkSNuUvMIMgP0OWZPPJw2in9Iz5Co1uAuFV1ziAVXVBvCPnLSooZeEFZ u0Ta2uhz3zY6gd1+5jPSFec32amrktg0hd9KeecY8kSoR9g7Ah51ALP3+ZPmqsJwkueq pntl+3B9P0Z3lpuG3j/TkWDpBZjH4r+rWeZJwDuXhWE3FdI0Bp+puDmvSfguMzec7Nrp nOdA== X-Gm-Message-State: AFqh2kpnoUkXo9ITcQuUFFryBUSYLFrd5CVqH1Cj1WEyK6sg2eTqndBM 9q3NijcrR0CMLOb4S6aTrdnpoIP8q+o= X-Google-Smtp-Source: AMrXdXshiyJrN6EY9CYZGhpA9uWntRCGsbXmXt4bSfqvBPsR2AoGwmZNzha8yCXqkiCEpIRfWXrumcAFYxY= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:9393:6f7a:d410:55ca]) (user=surenb job=sendgmr) by 2002:a81:65d7:0:b0:39a:afeb:f519 with SMTP id z206-20020a8165d7000000b0039aafebf519mr799450ywb.146.1673297662050; Mon, 09 Jan 2023 12:54:22 -0800 (PST) Date: Mon, 9 Jan 2023 12:53:11 -0800 In-Reply-To: <20230109205336.3665937-1-surenb@google.com> Mime-Version: 1.0 References: <20230109205336.3665937-1-surenb@google.com> X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog Message-ID: <20230109205336.3665937-17-surenb@google.com> Subject: [PATCH 16/41] mm: replace vma->vm_flags indirect modification in ksm_madvise From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: michel@lespinasse.org, jglisse@google.com, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mgorman@techsingularity.net, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, peterz@infradead.org, ldufour@linux.ibm.com, laurent.dufour@fr.ibm.com, paulmck@kernel.org, luto@kernel.org, songliubraving@fb.com, 
peterx@redhat.com, david@redhat.com, dhowells@redhat.com, hughd@google.com, bigeasy@linutronix.de, kent.overstreet@linux.dev, punit.agrawal@bytedance.com, lstoakes@gmail.com, peterjung1337@gmail.com, rientjes@google.com, axelrasmussen@google.com, joelaf@google.com, minchan@google.com, jannh@google.com, shakeelb@google.com, tatashin@google.com, edumazet@google.com, gthelen@google.com, gurua@google.com, arjunroy@google.com, soheil@google.com, hughlynch@google.com, leewalsh@google.com, posk@google.com, linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, x86@kernel.org, linux-kernel@vger.kernel.org, kernel-team@android.com, surenb@google.com Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Replace indirect modifications to vma->vm_flags with calls to modifier functions to be able to track flag changes and to keep vma locking correctness. Add a BUG_ON check in ksm_madvise() to catch indirect vm_flags modification attempts. Signed-off-by: Suren Baghdasaryan --- arch/powerpc/kvm/book3s_hv_uvmem.c | 5 ++++- arch/s390/mm/gmap.c | 5 ++++- mm/khugepaged.c | 2 ++ mm/ksm.c | 2 ++ 4 files changed, 12 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_h= v_uvmem.c index 1d67baa5557a..325a7a47d348 100644 --- a/arch/powerpc/kvm/book3s_hv_uvmem.c +++ b/arch/powerpc/kvm/book3s_hv_uvmem.c @@ -393,6 +393,7 @@ static int kvmppc_memslot_page_merge(struct kvm *kvm, { unsigned long gfn =3D memslot->base_gfn; unsigned long end, start =3D gfn_to_hva(kvm, gfn); + unsigned long vm_flags; int ret =3D 0; struct vm_area_struct *vma; int merge_flag =3D (merge) ? MADV_MERGEABLE : MADV_UNMERGEABLE; @@ -409,12 +410,14 @@ static int kvmppc_memslot_page_merge(struct kvm *kvm, ret =3D H_STATE; break; } + vm_flags =3D vma->vm_flags; ret =3D ksm_madvise(vma, vma->vm_start, vma->vm_end, - merge_flag, &vma->vm_flags); + merge_flag, &vm_flags); if (ret) { ret =3D H_STATE; break; } + reset_vm_flags(vma, vm_flags); start =3D vma->vm_end; } while (end > vma->vm_end); =20 diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c index 3811d6c86d09..e47387f8be6d 100644 --- a/arch/s390/mm/gmap.c +++ b/arch/s390/mm/gmap.c @@ -2587,14 +2587,17 @@ int gmap_mark_unmergeable(void) { struct mm_struct *mm =3D current->mm; struct vm_area_struct *vma; + unsigned long vm_flags; int ret; VMA_ITERATOR(vmi, mm, 0); =20 for_each_vma(vmi, vma) { + vm_flags =3D vma->vm_flags; ret =3D ksm_madvise(vma, vma->vm_start, vma->vm_end, - MADV_UNMERGEABLE, &vma->vm_flags); + MADV_UNMERGEABLE, &vm_flags); if (ret) return ret; + reset_vm_flags(vma, vm_flags); } mm->def_flags &=3D ~VM_MERGEABLE; return 0; diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 5cb401aa2b9d..5376246a3052 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -352,6 +352,8 @@ struct attribute_group khugepaged_attr_group =3D { int hugepage_madvise(struct vm_area_struct *vma, unsigned long *vm_flags, int advice) { + /* vma->vm_flags can be changed only using modifier functions */ + BUG_ON(vm_flags =3D=3D &vma->vm_flags); switch (advice) { case MADV_HUGEPAGE: #ifdef CONFIG_S390 diff --git a/mm/ksm.c b/mm/ksm.c index dd02780c387f..d05c41b289db 100644 --- a/mm/ksm.c +++ b/mm/ksm.c @@ -2471,6 +2471,8 @@ int ksm_madvise(struct vm_area_struct *vma, unsigned = long start, struct mm_struct *mm =3D vma->vm_mm; int err; =20 + /* vma->vm_flags can be changed only using modifier functions */ + BUG_ON(vm_flags 
=3D=3D &vma->vm_flags); switch (advice) { case MADV_MERGEABLE: /* --=20 2.39.0 From nobody Mon Sep 15 21:41:10 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 327BCC54EBD for ; Mon, 9 Jan 2023 20:56:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237333AbjAIU4H (ORCPT ); Mon, 9 Jan 2023 15:56:07 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33136 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237546AbjAIUzY (ORCPT ); Mon, 9 Jan 2023 15:55:24 -0500 Received: from mail-pl1-x649.google.com (mail-pl1-x649.google.com [IPv6:2607:f8b0:4864:20::649]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3DF5E81D7F for ; Mon, 9 Jan 2023 12:54:25 -0800 (PST) Received: by mail-pl1-x649.google.com with SMTP id k18-20020a170902c41200b001896d523dc8so6993492plk.19 for ; Mon, 09 Jan 2023 12:54:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=hDqG76VNjmKFCkYaVQCHk1jK9PV7CZBX+MdyRpSUw7Y=; b=oTTbEHXXf074UTT/LXfKIzTlYAOVSZBWjz/z2pi8uw2sgTsKOn0pf4M+FUmb3fhKEp t8utYb+BYUrtOmcMbBRfjQmj3HjZCnvt7nX8XXsh4x467hUw4mc17N9vBA4v3IH0gXYu Tf0NgJ8TRnq+dnC575YF2XUlugz8igLSTMm592zbLFSlzWIJOjCEMLUlmrt7z2QrvV2T gekww8A2UJVHgfp1zwqjj/KBiOxEMFrYzziSkpJrGHAD1djB4N2zU8kmwkE7PnEOrNUu hxvDlopUQbh5t04B3B6cW3kLOVcLSOY+TAkcMQBqHoc5+w1BV2sS39QIbB6FPKvgrwld s2sA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=hDqG76VNjmKFCkYaVQCHk1jK9PV7CZBX+MdyRpSUw7Y=; b=CaFAor7JdrXfi8+vNs4KOkeelfkQptUs/UWVR9Fxwg0aucomzDTtjMNI3Y3DvdgWNP wzvuEukT8nnTy8wxW/1SyUpLihHe5bGhAcj2aK+0nPOzhzVdjtubNfuRBXKHRDvrNsCw n+sL8/gh6kaw6SZusaXG5Oku0/C3H5c+y3hR2O8UmmwkUgnrk7HDvLCZCQAz8UpQAgaz fd8U2KdCJ7JH8U5IDQpH+a6coo2gMTBoRQqVSHf3iRQQ1lo632dKWRlK7caQyjbRajMO 14KN3mv278DxYYus2mId4lAItBmXuOjqyNNM6hX7ywzP2jiZqoKOvFDk0nf/Q5Ct9g4C pVWw== X-Gm-Message-State: AFqh2kp8e+dQFCnFn+dErp4nus9y36gqot+LlmJaNI2ltTrcoZE+v1dZ kMhsV//g+Iu3++8W2NWbiIro7Hd+yrw= X-Google-Smtp-Source: AMrXdXvx+5mjUKCd2siyFSRbOdBiHgQPNh6ZnBtsj8lXWONeXJTph7ewao1odYnxoExvO1xzjoIbjT4zuzo= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:9393:6f7a:d410:55ca]) (user=surenb job=sendgmr) by 2002:a17:90a:638b:b0:221:52e3:1f56 with SMTP id f11-20020a17090a638b00b0022152e31f56mr5166916pjj.225.1673297664681; Mon, 09 Jan 2023 12:54:24 -0800 (PST) Date: Mon, 9 Jan 2023 12:53:12 -0800 In-Reply-To: <20230109205336.3665937-1-surenb@google.com> Mime-Version: 1.0 References: <20230109205336.3665937-1-surenb@google.com> X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog Message-ID: <20230109205336.3665937-18-surenb@google.com> Subject: [PATCH 17/41] mm/mmap: move VMA locking before anon_vma_lock_write call From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: michel@lespinasse.org, jglisse@google.com, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mgorman@techsingularity.net, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, peterz@infradead.org, ldufour@linux.ibm.com, laurent.dufour@fr.ibm.com, paulmck@kernel.org, luto@kernel.org, 
songliubraving@fb.com, peterx@redhat.com, david@redhat.com, dhowells@redhat.com, hughd@google.com, bigeasy@linutronix.de, kent.overstreet@linux.dev, punit.agrawal@bytedance.com, lstoakes@gmail.com, peterjung1337@gmail.com, rientjes@google.com, axelrasmussen@google.com, joelaf@google.com, minchan@google.com, jannh@google.com, shakeelb@google.com, tatashin@google.com, edumazet@google.com, gthelen@google.com, gurua@google.com, arjunroy@google.com, soheil@google.com, hughlynch@google.com, leewalsh@google.com, posk@google.com, linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, x86@kernel.org, linux-kernel@vger.kernel.org, kernel-team@android.com, surenb@google.com Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Move VMA flag modification (which now implies VMA locking) before anon_vma_lock_write to match the locking order of page fault handler. Signed-off-by: Suren Baghdasaryan --- mm/mmap.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/mmap.c b/mm/mmap.c index fa994ae903d9..53d885e70a54 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -2953,13 +2953,13 @@ static int do_brk_flags(struct ma_state *mas, struc= t vm_area_struct *vma, if (mas_preallocate(mas, vma, GFP_KERNEL)) goto unacct_fail; =20 + set_vm_flags(vma, VM_SOFTDIRTY); vma_adjust_trans_huge(vma, vma->vm_start, addr + len, 0); if (vma->anon_vma) { anon_vma_lock_write(vma->anon_vma); anon_vma_interval_tree_pre_update_vma(vma); } vma->vm_end =3D addr + len; - set_vm_flags(vma, VM_SOFTDIRTY); mas_store_prealloc(mas, vma); =20 if (vma->anon_vma) { --=20 2.39.0 From nobody Mon Sep 15 21:41:10 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 44D31C54EBD for ; Mon, 9 Jan 2023 20:56:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237757AbjAIU4L (ORCPT ); Mon, 9 Jan 2023 15:56:11 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34828 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237601AbjAIUz0 (ORCPT ); Mon, 9 Jan 2023 15:55:26 -0500 Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5D7DC85C89 for ; Mon, 9 Jan 2023 12:54:28 -0800 (PST) Received: by mail-yb1-xb49.google.com with SMTP id n203-20020a2572d4000000b0078f09db9888so10253981ybc.18 for ; Mon, 09 Jan 2023 12:54:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=mttq8CcSxh6pk/PDvvqo/9XfYX/awSV+YKJHUec4DWQ=; b=EbLgVpiNWauCxYsPhZzluNh8i7Jc9od95XU7q4baD/bZuTFbACugKjaBqwn3ZlwjPK DyA2dJROokooe1X9hWF9cqxx7R8dzaLC4TQ2VgG1K6ZzCrV6Cn+bNgWtTm8qKV2EPlQI C3EgQy0lx8995KoeqPGehRxKjsY9DJj6pO/ydpCP34y4VCDsA5lFDA9xaQObsm54nagL gf/rqX49ePSI64VKn0w7yUB28hwRSupVRZyvn8G3inMqKMvhkMvaFS5iHqspfhqeeQGW sJQNxcx4h1OW/3tP5B8wZxgrbXFTXvC0zscvDZLigjE3sBlwiKuCEIrG4zHiBAmIFXhN sp1Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; 
bh=mttq8CcSxh6pk/PDvvqo/9XfYX/awSV+YKJHUec4DWQ=; b=5WJWiABhMZgyLmuOtlK7476/RvLzIHRadz5DxklX2oNNchRgDf7QOrcl0lXIaZM+rR 7O24E38Nz2hBt7D+DiRjiCO+1yiKZDKuzWP9ukYeejYHZajSfiHQdDsKU8TQBEZ7HgBW apwQwp1VnW7eRKAGg1XNscCrG+AOmXQ42AcJdYvIY9V3HaTX4krx165n1zmvN8git6eq vwP0VngLOFBFdeyl/QJoXnzPX8m2iTucptrA4TyGz1N1XqzBYAQa+P6m8WYVw1UiGhzC Ux0N6YN/hxCCgIYmHF8jyTxndRI/dpTieLuhPMg1EOjQsk/qNRfHvp4BeeJkx2l1Im3e pfsQ== X-Gm-Message-State: AFqh2kr+c5dm96RGEXRTWU5v36UFKfwg817i4Mzjp6MmS1mYFm3LGt0a vXuLccAOiUf8JRBm3tkW8mzcDf/sL3Q= X-Google-Smtp-Source: AMrXdXsB8fbMiTP1DYP5J5XEU88bTBTK0YYISNp2DJzOoqciXF/+cmYxWMxqmdOzbjRBqH/+xJjk8vQehVo= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:9393:6f7a:d410:55ca]) (user=surenb job=sendgmr) by 2002:a25:850e:0:b0:6f8:42d8:2507 with SMTP id w14-20020a25850e000000b006f842d82507mr8097864ybk.110.1673297667509; Mon, 09 Jan 2023 12:54:27 -0800 (PST) Date: Mon, 9 Jan 2023 12:53:13 -0800 In-Reply-To: <20230109205336.3665937-1-surenb@google.com> Mime-Version: 1.0 References: <20230109205336.3665937-1-surenb@google.com> X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog Message-ID: <20230109205336.3665937-19-surenb@google.com> Subject: [PATCH 18/41] mm/khugepaged: write-lock VMA while collapsing a huge page From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: michel@lespinasse.org, jglisse@google.com, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mgorman@techsingularity.net, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, peterz@infradead.org, ldufour@linux.ibm.com, laurent.dufour@fr.ibm.com, paulmck@kernel.org, luto@kernel.org, songliubraving@fb.com, peterx@redhat.com, david@redhat.com, dhowells@redhat.com, hughd@google.com, bigeasy@linutronix.de, kent.overstreet@linux.dev, punit.agrawal@bytedance.com, lstoakes@gmail.com, peterjung1337@gmail.com, rientjes@google.com, axelrasmussen@google.com, joelaf@google.com, minchan@google.com, jannh@google.com, shakeelb@google.com, tatashin@google.com, edumazet@google.com, gthelen@google.com, gurua@google.com, arjunroy@google.com, soheil@google.com, hughlynch@google.com, leewalsh@google.com, posk@google.com, linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, x86@kernel.org, linux-kernel@vger.kernel.org, kernel-team@android.com, surenb@google.com Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Protect VMA from concurrent page fault handler while collapsing a huge page. Page fault handler needs a stable PMD to use PTL and relies on per-VMA lock to prevent concurrent PMD changes. pmdp_collapse_flush(), set_huge_pmd() and collapse_and_free_pmd() can modify a PMD, which will not be detected by a page fault handler without proper locking. 
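For reference, the locking scheme vma_write_lock() relies on can be modelled in user space. The sketch below is an illustration only, not the kernel code: the pthread locks and all *_model names are stand-ins, and only the vm_lock_seq/mm_lock_seq comparison mirrors what this series does. The writer, which already holds mmap_lock for writing, marks the VMA write-locked by copying the mm-wide sequence number; a page fault taking the read side backs off whenever the sequence numbers match:

	#include <pthread.h>
	#include <stdbool.h>

	struct mm_model {
		pthread_rwlock_t mmap_lock;	/* stand-in for mmap_lock */
		unsigned long mm_lock_seq;	/* bumped on write unlock */
	};

	struct vma_model {
		struct mm_model *mm;
		pthread_rwlock_t vm_lock;	/* stand-in for the per-VMA lock */
		unsigned long vm_lock_seq;
	};

	/* Writer side; caller must hold vma->mm->mmap_lock for writing. */
	static void vma_write_lock_model(struct vma_model *vma)
	{
		if (vma->vm_lock_seq == vma->mm->mm_lock_seq)
			return;		/* already write-locked in this cycle */
		/* wait for readers to drain, then mark the VMA locked */
		pthread_rwlock_wrlock(&vma->vm_lock);
		vma->vm_lock_seq = vma->mm->mm_lock_seq;
		pthread_rwlock_unlock(&vma->vm_lock);
	}

	/* Reader side (page fault); runs without mmap_lock. */
	static bool vma_read_trylock_model(struct vma_model *vma)
	{
		if (pthread_rwlock_tryrdlock(&vma->vm_lock) != 0)
			return false;	/* writer is draining readers now */
		if (vma->vm_lock_seq == vma->mm->mm_lock_seq) {
			pthread_rwlock_unlock(&vma->vm_lock);
			return false;	/* VMA is write-locked: fall back */
		}
		return true;
	}

	/* Dropping mmap_lock releases every VMA marked in this cycle. */
	static void mmap_write_unlock_model(struct mm_model *mm)
	{
		mm->mm_lock_seq++;
		pthread_rwlock_unlock(&mm->mmap_lock);
	}

Because the sequence number is bumped on write unlock, dropping mmap_lock releases every VMA marked during that write cycle without touching them individually; a fault that raced into the collapse above either fails the trylock or retries under mmap_lock.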
Signed-off-by: Suren Baghdasaryan --- mm/khugepaged.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 5376246a3052..d8d0647f0c2c 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -1032,6 +1032,7 @@ static int collapse_huge_page(struct mm_struct *mm, u= nsigned long address, if (result !=3D SCAN_SUCCEED) goto out_up_write; =20 + vma_write_lock(vma); anon_vma_lock_write(vma->anon_vma); =20 mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, NULL, mm, @@ -1503,6 +1504,9 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, uns= igned long addr, goto drop_hpage; } =20 + /* Lock the vma before taking i_mmap and page table locks */ + vma_write_lock(vma); + /* * We need to lock the mapping so that from here on, only GUP-fast and * hardware page walks can access the parts of the page tables that @@ -1690,6 +1694,7 @@ static int retract_page_tables(struct address_space *= mapping, pgoff_t pgoff, result =3D SCAN_PTE_UFFD_WP; goto unlock_next; } + vma_write_lock(vma); collapse_and_free_pmd(mm, vma, addr, pmd); if (!cc->is_khugepaged && is_target) result =3D set_huge_pmd(vma, addr, pmd, hpage); --=20 2.39.0 From nobody Mon Sep 15 21:41:10 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AC119C5479D for ; Mon, 9 Jan 2023 20:56:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237713AbjAIU4S (ORCPT ); Mon, 9 Jan 2023 15:56:18 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60380 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237640AbjAIUzd (ORCPT ); Mon, 9 Jan 2023 15:55:33 -0500 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 54C23848F7 for ; Mon, 9 Jan 2023 12:54:30 -0800 (PST) Received: by mail-yb1-xb4a.google.com with SMTP id w15-20020a05690204ef00b007b966ba4410so7175454ybs.5 for ; Mon, 09 Jan 2023 12:54:30 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=ves4Zx76RBv25M18zzcj7gVloTdTuNMdyKmYC5DDvz0=; b=MKIFYUVg3C+KcUmk/DiJLvwQZvDLVNMlr8DQ0MCXFdkv6hN2eBfUWDzxIZ7BfVHp8I 9ZQKQbnc3cjtIsfRVDcXo71Ba38rTMbcUdq5jPPQ/O/qUYTP0dyF6taFqjbQZ9FRkBHu 4AhRyAaM/Xs1rNZHH4s644beTjAgsiLfyufSEIgttXRExgdKqL3ufxpDIm4mlvORyhtg 8KPqhXKz4czDblD/OO3yCfwh5TQTP8fNUj9/04RnIZEyuuEXUuTlBqAldxaXV0tj/mD/ bYzPmov2AWSWqL0UrgwZuK2hqucg36ElYTON5SUI9h/73LoxJPTGa7ERIj/8Aw4bHycM nzHg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=ves4Zx76RBv25M18zzcj7gVloTdTuNMdyKmYC5DDvz0=; b=cKW5U6PtxhwR348hIPI16LT/msHQ3L7R4hiIO1BA0MCRzrTnS6IvePIpht9FiXpOax TFRWUWo6MzLbl1eQ2BoqA/GvbU2bQxkDmLUvg8U91I49DHmsuCo3UapsyREKyNPC7uHO NoYnDL8MxTHrOiNmdgFluTitEbl33jWVsY9Ci4i47ci6rAdhyaYKEJB3iIMF+uFHYri4 vqWvSKUEKzAWlAoKlVQfF7ffMFfn0dfuGAu3TRWFaBp9OCxaLGgcpRzTNDf75BffHk97 hnAC70rOy0OTPXCah4QeAun9aW0WSqBZwMZDKtJ3HeGmC20M4E1G0JZ7Ft3/zHah893G oGsw== X-Gm-Message-State: AFqh2koVPSfy7cFw3J/Al13Q5l+X4YxnOUwStW73V6MoKOHFGuFcWEcc teWWgDxX1vK1oLfe8ZoRcOJRn2IyLiU= X-Google-Smtp-Source: 
AMrXdXvMc8uw91kFZQ5TZdRrbkHWe3EKuKErf5QuBh5uD1fMXXAng4XpjPLPHBE1E+VnZgCLzgnV8hcsb8Y= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:9393:6f7a:d410:55ca]) (user=surenb job=sendgmr) by 2002:a81:a013:0:b0:4a9:884a:20c4 with SMTP id x19-20020a81a013000000b004a9884a20c4mr4780346ywg.139.1673297669408; Mon, 09 Jan 2023 12:54:29 -0800 (PST) Date: Mon, 9 Jan 2023 12:53:14 -0800 In-Reply-To: <20230109205336.3665937-1-surenb@google.com> Mime-Version: 1.0 References: <20230109205336.3665937-1-surenb@google.com> X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog Message-ID: <20230109205336.3665937-20-surenb@google.com> Subject: [PATCH 19/41] mm/mmap: write-lock VMAs before merging, splitting or expanding them From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: michel@lespinasse.org, jglisse@google.com, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mgorman@techsingularity.net, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, peterz@infradead.org, ldufour@linux.ibm.com, laurent.dufour@fr.ibm.com, paulmck@kernel.org, luto@kernel.org, songliubraving@fb.com, peterx@redhat.com, david@redhat.com, dhowells@redhat.com, hughd@google.com, bigeasy@linutronix.de, kent.overstreet@linux.dev, punit.agrawal@bytedance.com, lstoakes@gmail.com, peterjung1337@gmail.com, rientjes@google.com, axelrasmussen@google.com, joelaf@google.com, minchan@google.com, jannh@google.com, shakeelb@google.com, tatashin@google.com, edumazet@google.com, gthelen@google.com, gurua@google.com, arjunroy@google.com, soheil@google.com, hughlynch@google.com, leewalsh@google.com, posk@google.com, linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, x86@kernel.org, linux-kernel@vger.kernel.org, kernel-team@android.com, surenb@google.com Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Decisions about whether VMAs can be merged, split or expanded must be made while the VMAs are protected from changes which can affect that decision. For example, vma_merge() uses vma->anon_vma in its decision whether the VMA can be merged. Meanwhile, the page fault handler changes vma->anon_vma during a COW operation. Write-lock all VMAs which might be affected by a merge or split operation before deciding how such operations should be performed. Expansion may not strictly require this; the locking is added out of caution, since otherwise mmap_region() and vm_brk_flags() might modify VMAs without locking them. Signed-off-by: Suren Baghdasaryan --- mm/mmap.c | 23 ++++++++++++++++++++--- 1 file changed, 20 insertions(+), 3 deletions(-) diff --git a/mm/mmap.c b/mm/mmap.c index 53d885e70a54..f6ca4a87f9e2 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -254,8 +254,11 @@ SYSCALL_DEFINE1(brk, unsigned long, brk) */ mas_set(&mas, oldbrk); next =3D mas_find(&mas, newbrk - 1 + PAGE_SIZE + stack_guard_gap); - if (next && newbrk + PAGE_SIZE > vm_start_gap(next)) - goto out; + if (next) { + vma_write_lock(next); + if (newbrk + PAGE_SIZE > vm_start_gap(next)) + goto out; + } =20 brkvma =3D mas_prev(&mas, mm->start_brk); /* Ok, looks good - let it rip. */ @@ -1017,10 +1020,17 @@ struct vm_area_struct *vma_merge(struct mm_struct *= mm, if (vm_flags & VM_SPECIAL) return NULL; =20 + if (prev) + vma_write_lock(prev); next =3D find_vma(mm, prev ?
prev->vm_end : 0); mid =3D next; - if (next && next->vm_end =3D=3D end) /* cases 6, 7, 8 */ + if (next) + vma_write_lock(next); + if (next && next->vm_end =3D=3D end) { /* cases 6, 7, 8 */ next =3D find_vma(mm, next->vm_end); + if (next) + vma_write_lock(next); + } =20 /* verify some invariant that must be enforced by the caller */ VM_WARN_ON(prev && addr <=3D prev->vm_start); @@ -2198,6 +2208,7 @@ int __split_vma(struct mm_struct *mm, struct vm_area_= struct *vma, int err; validate_mm_mt(mm); =20 + vma_write_lock(vma); if (vma->vm_ops && vma->vm_ops->may_split) { err =3D vma->vm_ops->may_split(vma, addr); if (err) @@ -2564,6 +2575,8 @@ unsigned long mmap_region(struct file *file, unsigned= long addr, =20 /* Attempt to expand an old mapping */ /* Check next */ + if (next) + vma_write_lock(next); if (next && next->vm_start =3D=3D end && !vma_policy(next) && can_vma_merge_before(next, vm_flags, NULL, file, pgoff+pglen, NULL_VM_UFFD_CTX, NULL)) { @@ -2573,6 +2586,8 @@ unsigned long mmap_region(struct file *file, unsigned= long addr, } =20 /* Check prev */ + if (prev) + vma_write_lock(prev); if (prev && prev->vm_end =3D=3D addr && !vma_policy(prev) && (vma ? can_vma_merge_after(prev, vm_flags, vma->anon_vma, file, pgoff, vma->vm_userfaultfd_ctx, NULL) : @@ -2942,6 +2957,8 @@ static int do_brk_flags(struct ma_state *mas, struct = vm_area_struct *vma, if (security_vm_enough_memory_mm(mm, len >> PAGE_SHIFT)) return -ENOMEM; =20 + if (vma) + vma_write_lock(vma); /* * Expand the existing vma if possible; Note that singular lists do not * occur after forking, so the expand will only happen on new VMAs. --=20 2.39.0 From nobody Mon Sep 15 21:41:10 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E42DBC6379F for ; Mon, 9 Jan 2023 20:56:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237767AbjAIU4Y (ORCPT ); Mon, 9 Jan 2023 15:56:24 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60774 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237715AbjAIUze (ORCPT ); Mon, 9 Jan 2023 15:55:34 -0500 Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 957307458B for ; Mon, 9 Jan 2023 12:54:32 -0800 (PST) Received: by mail-yb1-xb49.google.com with SMTP id y66-20020a25c845000000b00733b5049b6fso10232847ybf.3 for ; Mon, 09 Jan 2023 12:54:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=FDedOk4BDhI0jXVeiXMnt29MYEOAHvgEHolCeEtDVh8=; b=m9qNPQfvAaSk/cZYvrILeQolkFuo8oX0rPOdz+ZeBZe0viaaMWSs2xM9wojEerSzR9 Ehc0liLdN4Rf9t9ITHbgyUo/TpjLKodzNOt3cYyWGhisrg0tQj8O7DOpS0TtK3tPQLQG qUB/3EWr0qIxJ9v9dQ2Qpf6kyj6P77wOGaZ+XxUB3NlRJ4nvQo0f4D36W3MPMO4ktFMc ysLVV5w6Mp+QnxSVtHOQDDwbxZ7kpIUBpnDdTFd6J6q0MuwMTyvve19XzRy6+EFEsY7p 5CEU/qiYVWTESVV3KMWaGXvUZnEBFFh/Tdv+49ubumcalCQjcVJZu4o8Eo8hhtY4lcfd RRGQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=FDedOk4BDhI0jXVeiXMnt29MYEOAHvgEHolCeEtDVh8=; 
b=OUuTXtNvUDVWu6R5zX/GeqhSGq96BSGhtqdFA7FfooxyTv4V/fh4aO4L/eu/lutw6b bMqKwAGut14W14Q0gxn7Hpw3oxWEigGQ2rIVh6lHFwTGNO9Vg74YMMUU+SFSF14HSIJ4 IvcfK1Jykh/37DOicMA/cF4QjAhwWBzXUK/gAZBX7Fgjyj5LvidiF+tNyG4EDT3FrfqT 6tJj2Dyttp7k/R4sO1ZBi0rvaj81jZO60/x2VzhsjU6cJOBb1byw1swSxYYgUMmntjpd ZTswJMZPj+ltefoDVfNz8/oNeiWalrupZk54p7PpA3bYfy/xIf8U6wgebfqOW8KIiVhc t6QA== X-Gm-Message-State: AFqh2krKGAPZSFn31H5N0Y67TD+OEGt/BsZaciihR/3M2rn61hOOGc0w uXTsvpJcQXVo51M9nE8XknvwvuV1EPE= X-Google-Smtp-Source: AMrXdXvGyOxiOv+MuSWvovzrd0so3mEpN1Edvcx+Z2HBXoeLvz+P6Hw4SXaxw2akuo9mk6+zHwkliVPrxQU= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:9393:6f7a:d410:55ca]) (user=surenb job=sendgmr) by 2002:a25:1489:0:b0:6fb:ef1e:bb1a with SMTP id 131-20020a251489000000b006fbef1ebb1amr6925224ybu.168.1673297671808; Mon, 09 Jan 2023 12:54:31 -0800 (PST) Date: Mon, 9 Jan 2023 12:53:15 -0800 In-Reply-To: <20230109205336.3665937-1-surenb@google.com> Mime-Version: 1.0 References: <20230109205336.3665937-1-surenb@google.com> X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog Message-ID: <20230109205336.3665937-21-surenb@google.com> Subject: [PATCH 20/41] mm/mmap: write-lock VMAs in vma_adjust From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: michel@lespinasse.org, jglisse@google.com, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mgorman@techsingularity.net, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, peterz@infradead.org, ldufour@linux.ibm.com, laurent.dufour@fr.ibm.com, paulmck@kernel.org, luto@kernel.org, songliubraving@fb.com, peterx@redhat.com, david@redhat.com, dhowells@redhat.com, hughd@google.com, bigeasy@linutronix.de, kent.overstreet@linux.dev, punit.agrawal@bytedance.com, lstoakes@gmail.com, peterjung1337@gmail.com, rientjes@google.com, axelrasmussen@google.com, joelaf@google.com, minchan@google.com, jannh@google.com, shakeelb@google.com, tatashin@google.com, edumazet@google.com, gthelen@google.com, gurua@google.com, arjunroy@google.com, soheil@google.com, hughlynch@google.com, leewalsh@google.com, posk@google.com, linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, x86@kernel.org, linux-kernel@vger.kernel.org, kernel-team@android.com, surenb@google.com Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" vma_adjust modifies a VMA and possibly its neighbors. Write-lock them before making the modifications. Signed-off-by: Suren Baghdasaryan --- mm/mmap.c | 17 ++++++++++++++++- 1 file changed, 16 insertions(+), 1 deletion(-) diff --git a/mm/mmap.c b/mm/mmap.c index f6ca4a87f9e2..1e2154137631 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -614,6 +614,12 @@ inline int vma_expand(struct ma_state *mas, struct vm_= area_struct *vma, * The following helper function should be used when such adjustments * are necessary. The "insert" vma (if any) is to be inserted * before we drop the necessary locks. + * 'expand' vma is always locked before it's passed to __vma_adjust() + * from vma_merge() because vma should not change from the moment + * can_vma_merge_{before|after} decision is made. + * 'insert' vma is used only by __split_vma() and it's always a brand + * new vma which is not yet added into mm's vma tree, therefore no need + * to lock it. 
*/ int __vma_adjust(struct vm_area_struct *vma, unsigned long start, unsigned long end, pgoff_t pgoff, struct vm_area_struct *insert, @@ -633,6 +639,10 @@ int __vma_adjust(struct vm_area_struct *vma, unsigned = long start, MA_STATE(mas, &mm->mm_mt, 0, 0); struct vm_area_struct *exporter =3D NULL, *importer =3D NULL; =20 + vma_write_lock(vma); + if (next) + vma_write_lock(next); + if (next && !insert) { if (end >=3D next->vm_end) { /* @@ -676,8 +686,11 @@ int __vma_adjust(struct vm_area_struct *vma, unsigned = long start, * If next doesn't have anon_vma, import from vma after * next, if the vma overlaps with it. */ - if (remove_next =3D=3D 2 && !next->anon_vma) + if (remove_next =3D=3D 2 && !next->anon_vma) { exporter =3D next_next; + if (exporter) + vma_write_lock(exporter); + } =20 } else if (end > next->vm_start) { /* @@ -850,6 +863,8 @@ int __vma_adjust(struct vm_area_struct *vma, unsigned l= ong start, if (remove_next =3D=3D 2) { remove_next =3D 1; next =3D next_next; + if (next) + vma_write_lock(next); goto again; } } --=20 2.39.0 From nobody Mon Sep 15 21:41:10 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B147EC5479D for ; Mon, 9 Jan 2023 20:56:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237771AbjAIU4d (ORCPT ); Mon, 9 Jan 2023 15:56:33 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60384 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237721AbjAIUzg (ORCPT ); Mon, 9 Jan 2023 15:55:36 -0500 Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com [IPv6:2607:f8b0:4864:20::1149]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 38A8487F17 for ; Mon, 9 Jan 2023 12:54:35 -0800 (PST) Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-4c1456d608cso104273017b3.15 for ; Mon, 09 Jan 2023 12:54:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=1sz3d8Q6zHAextbGhc5zNdx6/hdXqibZ5+zjxQEPtcY=; b=dyW75mchIm8T3L3V0Aa5O5essWjp4aFupYS1LFKIISIPYObh/dim0X80nsutKOS2m/ 7vd6AcHVv3F7nL+fFOG4vgOpiulnyZtSLux8AIxMn/Q09ghgqU9hpb0yvL9ry+wKM1Wm UvSXVlZK7VJVWgc1kVtRM87LVA0ol6j3197yqrcZsN3y1TKaMHMTXNPpOvk38N+aLEKf U9xTtNcWgL4MRS7DZ4GXXUuetaMZa2ezppHTDMHc58pzPdKSGBN9qWFGFMwk1DYAuR6m mxxVO9djRkO0qH7ZMi64I+oMQN0AQu1sYjF82YaLSmfJI9fTdyDigl7Wn8SgruW4bLXZ WU7A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=1sz3d8Q6zHAextbGhc5zNdx6/hdXqibZ5+zjxQEPtcY=; b=2ek9xVPg5kTOse9s7DJeUBms3KlFaULCDJFWo3rGk1EOtLUtJ13x5aoAp6Py8+7tLk XEiGY4i3N46zeLJe0M35P5Ny+7xSGXQ4KBAn5OfDinW4elurR8QpTJPtPdiRCYFysLSE EGVz+A+/lcUVTtThybM/uQmTytdmEcuu6GWBy/dbmvV1S60GKfN3bYGxDaU9O400Yr2l e72l8WFD37cOoEb0rxkJT0gf9IyyWIrtgsaxDAA0rA2KRpCr5YNUgpKGWhZin0V+Lf4B x8GFt8NjBuiEjnM+WoeHVL2SYEj0DaHZYniFrJB4IIp7GRBEh36XO1b9D08Be0D9Ihtm 1PSw== X-Gm-Message-State: AFqh2koEsPZXj1SVJJ+ptcFwACtnk8wQrjxZaU3mkJk95vm44CIr+BMl 7l7PMjKMTMptk5S1A2Jh8v9Nkf0Y+fg= X-Google-Smtp-Source: AMrXdXvYJ0MKAPgXdwWnp9E6tLdKAmG2zH31TOsESiFv0D3eqJovHEz4lb9T2t60zhYoFlt31PHaEzcWVdQ= X-Received: from 
surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:9393:6f7a:d410:55ca]) (user=surenb job=sendgmr) by 2002:a25:830b:0:b0:731:a583:5571 with SMTP id s11-20020a25830b000000b00731a5835571mr8144189ybk.320.1673297674437; Mon, 09 Jan 2023 12:54:34 -0800 (PST) Date: Mon, 9 Jan 2023 12:53:16 -0800 In-Reply-To: <20230109205336.3665937-1-surenb@google.com> Mime-Version: 1.0 References: <20230109205336.3665937-1-surenb@google.com> X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog Message-ID: <20230109205336.3665937-22-surenb@google.com> Subject: [PATCH 21/41] mm/mmap: write-lock VMAs affected by VMA expansion From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: michel@lespinasse.org, jglisse@google.com, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mgorman@techsingularity.net, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, peterz@infradead.org, ldufour@linux.ibm.com, laurent.dufour@fr.ibm.com, paulmck@kernel.org, luto@kernel.org, songliubraving@fb.com, peterx@redhat.com, david@redhat.com, dhowells@redhat.com, hughd@google.com, bigeasy@linutronix.de, kent.overstreet@linux.dev, punit.agrawal@bytedance.com, lstoakes@gmail.com, peterjung1337@gmail.com, rientjes@google.com, axelrasmussen@google.com, joelaf@google.com, minchan@google.com, jannh@google.com, shakeelb@google.com, tatashin@google.com, edumazet@google.com, gthelen@google.com, gurua@google.com, arjunroy@google.com, soheil@google.com, hughlynch@google.com, leewalsh@google.com, posk@google.com, linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, x86@kernel.org, linux-kernel@vger.kernel.org, kernel-team@android.com, surenb@google.com Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" vma_expand changes VMA boundaries and might result in freeing an adjacent VMA. Write-lock affected VMAs to prevent concurrent page faults. 
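The ordering is the point here: both the VMA being grown and the neighbor being removed are marked write-locked before any boundary changes become visible to a lockless page fault. A minimal sketch of that ordering, using hypothetical stand-ins rather than the real helpers:

	#include <stdbool.h>

	struct vma_sketch {
		unsigned long vm_start;
		unsigned long vm_end;
		bool write_locked;	/* stand-in for the per-VMA lock state */
	};

	static void vma_write_lock_sketch(struct vma_sketch *vma)
	{
		vma->write_locked = true;	/* real code: mark vm_lock_seq */
	}

	static void delete_vma_sketch(struct vma_sketch *vma)
	{
		(void)vma;	/* real code: free once readers have drained */
	}

	/* Grow vma to [start, end); remove_next means next is swallowed. */
	static void vma_expand_sketch(struct vma_sketch *vma,
				      struct vma_sketch *next,
				      unsigned long start, unsigned long end,
				      bool remove_next)
	{
		vma_write_lock_sketch(vma);		/* lock before resizing */
		vma->vm_start = start;
		vma->vm_end = end;
		if (remove_next) {
			vma_write_lock_sketch(next);	/* lock before freeing */
			delete_vma_sketch(next);
		}
	}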
Signed-off-by: Suren Baghdasaryan --- mm/mmap.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/mm/mmap.c b/mm/mmap.c index 1e2154137631..ff02cb51e7e7 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -544,6 +544,7 @@ inline int vma_expand(struct ma_state *mas, struct vm_a= rea_struct *vma, if (mas_preallocate(mas, vma, GFP_KERNEL)) goto nomem; =20 + vma_write_lock(vma); vma_adjust_trans_huge(vma, start, end, 0); =20 if (file) { @@ -590,6 +591,7 @@ inline int vma_expand(struct ma_state *mas, struct vm_a= rea_struct *vma, } =20 if (remove_next) { + vma_write_lock(next); if (file) { uprobe_munmap(next, next->vm_start, next->vm_end); fput(file); --=20 2.39.0 From nobody Mon Sep 15 21:41:10 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 828E1C61DB3 for ; Mon, 9 Jan 2023 20:57:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237808AbjAIU5A (ORCPT ); Mon, 9 Jan 2023 15:57:00 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33136 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237607AbjAIU4C (ORCPT ); Mon, 9 Jan 2023 15:56:02 -0500 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BB8808B535 for ; Mon, 9 Jan 2023 12:54:37 -0800 (PST) Received: by mail-yb1-xb4a.google.com with SMTP id z17-20020a25e311000000b00719e04e59e1so10332272ybd.10 for ; Mon, 09 Jan 2023 12:54:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=1FpUUiRmr7dBuhWWRB46HjSP7TgY2quKdzRuNf/hqHM=; b=AsrsBvZP9aQHF3L5Nl/u1fB+D2bm7e0YjzSCvgbBaPzVvDqUAEV/bKN3f/FWWFnJld y54pIfCVv+Sf28j1GT6GnDmjTV3QxG0dtIi870ieojG97R1YrLaN1+Hnnh1RqNC6IKhV FDsy8k4m0XBTPaqC3JVLB2amGI721PZ/e3uTJFIVXTdjA/KJiKwGIaizGxyvcp5J8QaW 5xEAO0mZOLLdDxM+ya/sdbkzi1vVwSyWReOOtmKkNqv4WGZ0uHxmN44Y8NrueULgHAw2 BBJ3cZtaXuduG2TgX3HtPnChX2RwWmECLeYtrImIURhb/vK80eEmATn6FHpKHV2WHJD9 N1QA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=1FpUUiRmr7dBuhWWRB46HjSP7TgY2quKdzRuNf/hqHM=; b=7mh5zTs6gpnI607jpHI6uZiwptDQdvYWBqlhWBXMRyjtvvdf2pSxsNBdRuOVN5iJao wBU1NlY53WWFpm3cFGCEcF/soXHTbUcX3CacPXCc2Y1j3woLPQvTXLRwR6mN6GnRSWLv 8P1GXmiRm7XNTOCE0km5CZeyT7UoOecIUj9qXpVWWdWrwGJ6i2BFcI36IvGvJ/IEw2mn bVtLWwTv+Bg7SM5Hq6Z2Rr7NrtbXkNxWoGOPZPgvNWTQwuOp5X4JKnzugW8R+yyLAXsv c92LzA0jfb6b7vPzSgrxL20dZOWGuDyawkF+2eqjiSLHJ0CKQ25MN75lTo2ZYkxU7ALA sN5A== X-Gm-Message-State: AFqh2kr/s7x7pc30V35IN1Wibl8TKOnuSAhSXXSIwXcNixKsPOwFfOr5 Jhj5rmM77W2R9Rqx+e11XS7hCvToHN8= X-Google-Smtp-Source: AMrXdXtAgZu+pAoAaYuQQAsM5E55paYeoDQZrstt2b6LAIgKd1Hk/SubADrA5GRLvEWllXjlmSi56GzDLGU= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:9393:6f7a:d410:55ca]) (user=surenb job=sendgmr) by 2002:a25:4646:0:b0:799:3955:201f with SMTP id t67-20020a254646000000b007993955201fmr3367922yba.94.1673297676537; Mon, 09 Jan 2023 12:54:36 -0800 (PST) Date: Mon, 9 Jan 2023 12:53:17 -0800 In-Reply-To: <20230109205336.3665937-1-surenb@google.com> Mime-Version: 1.0 References: 
<20230109205336.3665937-1-surenb@google.com> X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog Message-ID: <20230109205336.3665937-23-surenb@google.com> Subject: [PATCH 22/41] mm/mremap: write-lock VMA while remapping it to a new address range From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: michel@lespinasse.org, jglisse@google.com, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mgorman@techsingularity.net, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, peterz@infradead.org, ldufour@linux.ibm.com, laurent.dufour@fr.ibm.com, paulmck@kernel.org, luto@kernel.org, songliubraving@fb.com, peterx@redhat.com, david@redhat.com, dhowells@redhat.com, hughd@google.com, bigeasy@linutronix.de, kent.overstreet@linux.dev, punit.agrawal@bytedance.com, lstoakes@gmail.com, peterjung1337@gmail.com, rientjes@google.com, axelrasmussen@google.com, joelaf@google.com, minchan@google.com, jannh@google.com, shakeelb@google.com, tatashin@google.com, edumazet@google.com, gthelen@google.com, gurua@google.com, arjunroy@google.com, soheil@google.com, hughlynch@google.com, leewalsh@google.com, posk@google.com, linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, x86@kernel.org, linux-kernel@vger.kernel.org, kernel-team@android.com, surenb@google.com Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Write-lock the VMA before copying it, and write-lock the new VMA that copy_vma() produces. Signed-off-by: Suren Baghdasaryan Reviewed-by: Laurent Dufour --- mm/mmap.c | 1 + mm/mremap.c | 1 + 2 files changed, 2 insertions(+) diff --git a/mm/mmap.c b/mm/mmap.c index ff02cb51e7e7..da1908730828 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -3261,6 +3261,7 @@ struct vm_area_struct *copy_vma(struct vm_area_struct= **vmap, get_file(new_vma->vm_file); if (new_vma->vm_ops && new_vma->vm_ops->open) new_vma->vm_ops->open(new_vma); + vma_write_lock(new_vma); if (vma_link(mm, new_vma)) goto out_vma_link; *need_rmap_locks =3D false; diff --git a/mm/mremap.c b/mm/mremap.c index 2ccdd1561f5b..d24a79bcb1a1 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -622,6 +622,7 @@ static unsigned long move_vma(struct vm_area_struct *vm= a, return -ENOMEM; } =20 + vma_write_lock(vma); new_pgoff =3D vma->vm_pgoff + ((old_addr - vma->vm_start) >> PAGE_SHIFT); new_vma =3D copy_vma(&vma, new_addr, new_len, new_pgoff, &need_rmap_locks); --=20 2.39.0 From nobody Mon Sep 15 21:41:10 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7F109C5479D for ; Mon, 9 Jan 2023 20:57:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237822AbjAIU5R (ORCPT ); Mon, 9 Jan 2023 15:57:17 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34804 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237745AbjAIU4I (ORCPT ); Mon, 9 Jan 2023 15:56:08 -0500 Received: from mail-pl1-x64a.google.com (mail-pl1-x64a.google.com [IPv6:2607:f8b0:4864:20::64a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0AD478B510 for ; Mon, 9 Jan 2023 12:54:40 -0800 (PST) Received: by mail-pl1-x64a.google.com with SMTP id z3-20020a170903018300b0018fb8ca1688so6973059plg.5 for ; Mon, 09 Jan 2023 12:54:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
From nobody Mon Sep 15 21:41:10 2025
Date: Mon, 9 Jan 2023 12:53:18 -0800
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Subject: [PATCH 23/41] mm: write-lock VMAs before removing them from VMA tree
Message-ID: <20230109205336.3665937-24-surenb@google.com>
In-Reply-To: <20230109205336.3665937-1-surenb@google.com>
References: <20230109205336.3665937-1-surenb@google.com>

Write-locking VMAs before isolating them ensures that page fault
handlers don't operate on isolated VMAs.
Signed-off-by: Suren Baghdasaryan
---
 mm/mmap.c  | 2 ++
 mm/nommu.c | 5 +++++
 2 files changed, 7 insertions(+)

diff --git a/mm/mmap.c b/mm/mmap.c
index da1908730828..be289e0b693b 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -448,6 +448,7 @@ void vma_mas_store(struct vm_area_struct *vma, struct ma_state *mas)
  */
 void vma_mas_remove(struct vm_area_struct *vma, struct ma_state *mas)
 {
+	vma_write_lock(vma);
 	trace_vma_mas_szero(mas->tree, vma->vm_start, vma->vm_end - 1);
 	mas->index = vma->vm_start;
 	mas->last = vma->vm_end - 1;
@@ -2300,6 +2301,7 @@ int split_vma(struct mm_struct *mm, struct vm_area_struct *vma,
 static inline int munmap_sidetree(struct vm_area_struct *vma,
 				   struct ma_state *mas_detach)
 {
+	vma_write_lock(vma);
 	mas_set_range(mas_detach, vma->vm_start, vma->vm_end - 1);
 	if (mas_store_gfp(mas_detach, vma, GFP_KERNEL))
 		return -ENOMEM;
diff --git a/mm/nommu.c b/mm/nommu.c
index b3154357ced5..7ae91337ef14 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -552,6 +552,7 @@ void vma_mas_store(struct vm_area_struct *vma, struct ma_state *mas)
 
 void vma_mas_remove(struct vm_area_struct *vma, struct ma_state *mas)
 {
+	vma_write_lock(vma);
 	mas->index = vma->vm_start;
 	mas->last = vma->vm_end - 1;
 	mas_store_prealloc(mas, NULL);
@@ -1551,6 +1552,10 @@ void exit_mmap(struct mm_struct *mm)
 	mmap_write_lock(mm);
 	for_each_vma(vmi, vma) {
 		cleanup_vma_from_mm(vma);
+		/*
+		 * No need to lock the VMA because this is the only mm user
+		 * and no page fault handler can race with it.
+		 */
 		delete_vma(mm, vma);
 		cond_resched();
 	}
-- 
2.39.0
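The invariant this patch establishes: a VMA must be write-locked before it
leaves the tree. A condensed sketch of the pattern (illustrative only;
munmap_sidetree() above is the real instance):

	/*
	 * Lock first, then detach. A fault racing with this either fails
	 * vma_read_trylock() on the now-locked VMA, or already holds the
	 * read lock, in which case vma_write_lock() waits for it to finish.
	 */
	static int isolate_vma_sketch(struct vm_area_struct *vma,
				      struct ma_state *mas_detach)
	{
		vma_write_lock(vma);
		mas_set_range(mas_detach, vma->vm_start, vma->vm_end - 1);
		return mas_store_gfp(mas_detach, vma, GFP_KERNEL) ? -ENOMEM : 0;
	}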
From nobody Mon Sep 15 21:41:10 2025
Date: Mon, 9 Jan 2023 12:53:19 -0800
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Subject: [PATCH 24/41] mm: conditionally write-lock VMA in free_pgtables
Message-ID: <20230109205336.3665937-25-surenb@google.com>
In-Reply-To: <20230109205336.3665937-1-surenb@google.com>
References: <20230109205336.3665937-1-surenb@google.com>

Normally free_pgtables needs to lock affected VMAs except for the case
when VMAs were isolated under VMA write-lock. munmap() does just that,
isolating while holding appropriate locks and then downgrading mmap_lock
and dropping per-VMA locks before freeing page tables.
Add a parameter to free_pgtables and unmap_region for this scenario.
Signed-off-by: Suren Baghdasaryan
---
 mm/internal.h |  2 +-
 mm/memory.c   |  6 +++++-
 mm/mmap.c     | 18 ++++++++++++------
 3 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index bcf75a8b032d..5ea4ff1a70e7 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -87,7 +87,7 @@ void folio_activate(struct folio *folio);
 
 void free_pgtables(struct mmu_gather *tlb, struct maple_tree *mt,
 		   struct vm_area_struct *start_vma, unsigned long floor,
-		   unsigned long ceiling);
+		   unsigned long ceiling, bool lock_vma);
 void pmd_install(struct mm_struct *mm, pmd_t *pmd, pgtable_t *pte);
 
 struct zap_details;
diff --git a/mm/memory.c b/mm/memory.c
index 2fabf89b2be9..9ece18548db1 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -348,7 +348,7 @@ void free_pgd_range(struct mmu_gather *tlb,
 
 void free_pgtables(struct mmu_gather *tlb, struct maple_tree *mt,
 		   struct vm_area_struct *vma, unsigned long floor,
-		   unsigned long ceiling)
+		   unsigned long ceiling, bool lock_vma)
 {
 	MA_STATE(mas, mt, vma->vm_end, vma->vm_end);
 
@@ -366,6 +366,8 @@ void free_pgtables(struct mmu_gather *tlb, struct maple_tree *mt,
 		 * Hide vma from rmap and truncate_pagecache before freeing
 		 * pgtables
 		 */
+		if (lock_vma)
+			vma_write_lock(vma);
 		unlink_anon_vmas(vma);
 		unlink_file_vma(vma);
 
@@ -380,6 +382,8 @@ void free_pgtables(struct mmu_gather *tlb, struct maple_tree *mt,
 		    && !is_vm_hugetlb_page(next)) {
 			vma = next;
 			next = mas_find(&mas, ceiling - 1);
+			if (lock_vma)
+				vma_write_lock(vma);
 			unlink_anon_vmas(vma);
 			unlink_file_vma(vma);
 		}
diff --git a/mm/mmap.c b/mm/mmap.c
index be289e0b693b..0d767ce043af 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -78,7 +78,7 @@ core_param(ignore_rlimit_data, ignore_rlimit_data, bool, 0644);
 static void unmap_region(struct mm_struct *mm, struct maple_tree *mt,
 		struct vm_area_struct *vma, struct vm_area_struct *prev,
 		struct vm_area_struct *next, unsigned long start,
-		unsigned long end);
+		unsigned long end, bool lock_vma);
 
 static pgprot_t vm_pgprot_modify(pgprot_t oldprot, unsigned long vm_flags)
 {
@@ -2202,7 +2202,7 @@ static inline void remove_mt(struct mm_struct *mm, struct ma_state *mas)
 static void unmap_region(struct mm_struct *mm, struct maple_tree *mt,
 		struct vm_area_struct *vma, struct vm_area_struct *prev,
 		struct vm_area_struct *next,
-		unsigned long start, unsigned long end)
+		unsigned long start, unsigned long end, bool lock_vma)
 {
 	struct mmu_gather tlb;
 
@@ -2211,7 +2211,8 @@ static void unmap_region(struct mm_struct *mm, struct maple_tree *mt,
 	update_hiwater_rss(mm);
 	unmap_vmas(&tlb, mt, vma, start, end);
 	free_pgtables(&tlb, mt, vma, prev ? prev->vm_end : FIRST_USER_ADDRESS,
-		      next ? next->vm_start : USER_PGTABLES_CEILING);
+		      next ? next->vm_start : USER_PGTABLES_CEILING,
+		      lock_vma);
 	tlb_finish_mmu(&tlb);
 }
 
@@ -2468,7 +2469,11 @@ do_mas_align_munmap(struct ma_state *mas, struct vm_area_struct *vma,
 		mmap_write_downgrade(mm);
 	}
 
-	unmap_region(mm, &mt_detach, vma, prev, next, start, end);
+	/*
+	 * We can free page tables without locking the vmas because they were
+	 * isolated before we downgraded mmap_lock and dropped per-vma locks.
+	 */
+	unmap_region(mm, &mt_detach, vma, prev, next, start, end, !downgrade);
 	/* Statistics and freeing VMAs */
 	mas_set(&mas_detach, start);
 	remove_mt(mm, &mas_detach);
@@ -2785,7 +2790,8 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 		vma->vm_file = NULL;
 
 		/* Undo any partial mapping done by a device driver. */
-		unmap_region(mm, mas.tree, vma, prev, next, vma->vm_start, vma->vm_end);
+		unmap_region(mm, mas.tree, vma, prev, next, vma->vm_start, vma->vm_end,
+			     true);
 	if (file && (vm_flags & VM_SHARED))
 		mapping_unmap_writable(file->f_mapping);
 free_vma:
@@ -3130,7 +3136,7 @@ void exit_mmap(struct mm_struct *mm)
 	mmap_write_lock(mm);
 	mt_clear_in_rcu(&mm->mm_mt);
 	free_pgtables(&tlb, &mm->mm_mt, vma, FIRST_USER_ADDRESS,
-		      USER_PGTABLES_CEILING);
+		      USER_PGTABLES_CEILING, true);
 	tlb_finish_mmu(&tlb);
 
 	/*
-- 
2.39.0
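In caller terms, the new flag answers one question: were these VMAs already
write-locked when they were isolated? A condensed sketch with an assumed
helper name (not part of the patch):

	/*
	 * Sketch, assuming the signatures from the hunks above: munmap()
	 * locks VMAs at isolation time and may drop mmap_lock before the
	 * teardown, so it passes lock_vma == false; paths that never
	 * isolated under per-VMA locks (the mmap_region() error path,
	 * exit_mmap()) must pass true.
	 */
	static void teardown_sketch(struct mmu_gather *tlb, struct maple_tree *mt,
				    struct vm_area_struct *vma, bool already_locked)
	{
		free_pgtables(tlb, mt, vma, FIRST_USER_ADDRESS,
			      USER_PGTABLES_CEILING,
			      /* lock_vma = */ !already_locked);
	}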
From nobody Mon Sep 15 21:41:10 2025
Date: Mon, 9 Jan 2023 12:53:20 -0800
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Subject: [PATCH 25/41] mm/mmap: write-lock adjacent VMAs if they can grow into unmapped area
Message-ID: <20230109205336.3665937-26-surenb@google.com>
In-Reply-To: <20230109205336.3665937-1-surenb@google.com>
References: <20230109205336.3665937-1-surenb@google.com>

While unmapping VMAs, adjacent VMAs might be able to grow into the area
being unmapped. In such cases write-lock adjacent VMAs to prevent this
growth.

Signed-off-by: Suren Baghdasaryan
---
 mm/mmap.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 0d767ce043af..30c7d1c5206e 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2461,11 +2461,13 @@ do_mas_align_munmap(struct ma_state *mas, struct vm_area_struct *vma,
 	 * down_read(mmap_lock) and collide with the VMA we are about to unmap.
 	 */
 	if (downgrade) {
-		if (next && (next->vm_flags & VM_GROWSDOWN))
+		if (next && (next->vm_flags & VM_GROWSDOWN)) {
+			vma_write_lock(next);
 			downgrade = false;
-		else if (prev && (prev->vm_flags & VM_GROWSUP))
+		} else if (prev && (prev->vm_flags & VM_GROWSUP)) {
+			vma_write_lock(prev);
 			downgrade = false;
-		else
+		} else
 			mmap_write_downgrade(mm);
 	}
 
-- 
2.39.0
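A hypothetical helper expressing the same decision (an illustrative
restructuring of the hunk above, not part of the patch):

	/* Lock any neighbour that could expand into the freed range. */
	static bool may_downgrade_mmap_lock(struct vm_area_struct *prev,
					    struct vm_area_struct *next)
	{
		if (next && (next->vm_flags & VM_GROWSDOWN)) {
			vma_write_lock(next);	/* could grow down into hole */
			return false;
		}
		if (prev && (prev->vm_flags & VM_GROWSUP)) {
			vma_write_lock(prev);	/* could grow up into hole */
			return false;
		}
		return true;	/* no growable neighbours; downgrade is safe */
	}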
From nobody Mon Sep 15 21:41:10 2025
Date: Mon, 9 Jan 2023 12:53:21 -0800
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Subject: [PATCH 26/41] kernel/fork: assert no VMA readers during its destruction
Message-ID: <20230109205336.3665937-27-surenb@google.com>
In-Reply-To: <20230109205336.3665937-1-surenb@google.com>
References: <20230109205336.3665937-1-surenb@google.com>

Assert that there are no holders of the VMA lock for reading when the
VMA is about to be destroyed.

Signed-off-by: Suren Baghdasaryan
---
 include/linux/mm.h | 8 ++++++++
 kernel/fork.c      | 2 ++
 2 files changed, 10 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 594e835bad9c..c464fc8a514c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -680,6 +680,13 @@ static inline void vma_assert_write_locked(struct vm_area_struct *vma)
 	VM_BUG_ON_VMA(vma->vm_lock_seq != READ_ONCE(vma->vm_mm->mm_lock_seq), vma);
 }
 
+static inline void vma_assert_no_reader(struct vm_area_struct *vma)
+{
+	VM_BUG_ON_VMA(rwsem_is_locked(&vma->lock) &&
+		      vma->vm_lock_seq != READ_ONCE(vma->vm_mm->mm_lock_seq),
+		      vma);
+}
+
 #else /* CONFIG_PER_VMA_LOCK */
 
 static inline void vma_init_lock(struct vm_area_struct *vma) {}
@@ -688,6 +695,7 @@ static inline bool vma_read_trylock(struct vm_area_struct *vma)
 	{ return false; }
 static inline void vma_read_unlock(struct vm_area_struct *vma) {}
 static inline void vma_assert_write_locked(struct vm_area_struct *vma) {}
+static inline void vma_assert_no_reader(struct vm_area_struct *vma) {}
 
 #endif /* CONFIG_PER_VMA_LOCK */
 
diff --git a/kernel/fork.c b/kernel/fork.c
index 1591dd8a0745..6d9f14e55ecf 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -485,6 +485,8 @@ static void __vm_area_free(struct rcu_head *head)
 {
 	struct vm_area_struct *vma = container_of(head, struct vm_area_struct,
 						  vm_rcu);
+	/* The vma should either have no lock holders or be write-locked. */
+	vma_assert_no_reader(vma);
 	kmem_cache_free(vm_area_cachep, vma);
 }
 #endif
-- 
2.39.0
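The assert relies on the read side holding the rwsem only across a short
critical section. A sketch of that contract, using the primitives whose stubs
appear in the hunk above (assumed shape, kernel context):

	static void reader_contract_sketch(struct vm_area_struct *vma)
	{
		if (!vma_read_trylock(vma))
			return;	/* VMA is write-locked; fall back to mmap_lock */
		/*
		 * Lockless fault work happens here; vma->lock is held for
		 * read only within this window, so it must be released
		 * before the VMA can be freed -- otherwise
		 * vma_assert_no_reader() fires in __vm_area_free().
		 */
		vma_read_unlock(vma);
	}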
From nobody Mon Sep 15 21:41:10 2025
Date: Mon, 9 Jan 2023 12:53:22 -0800
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Subject: [PATCH 27/41] mm/mmap: prevent pagefault handler from racing with mmu_notifier registration
Message-ID: <20230109205336.3665937-28-surenb@google.com>
In-Reply-To: <20230109205336.3665937-1-surenb@google.com>
References: <20230109205336.3665937-1-surenb@google.com>

Page fault handlers might need to fire MMU notifications while a new
notifier is being registered. Modify mm_take_all_locks to write-lock all
VMAs and prevent this race with fault handlers that would hold VMA
locks. VMAs are locked before i_mmap_rwsem and anon_vma to keep the same
locking order as in page fault handlers.

Signed-off-by: Suren Baghdasaryan
---
 mm/mmap.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/mmap.c b/mm/mmap.c
index 30c7d1c5206e..a256deca0bc0 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -3566,6 +3566,7 @@ static void vm_lock_mapping(struct mm_struct *mm, struct address_space *mapping)
  * of mm/rmap.c:
  *   - all hugetlbfs_i_mmap_rwsem_key locks (aka mapping->i_mmap_rwsem for
  *     hugetlb mapping);
+ *   - all vmas marked locked
  *   - all i_mmap_rwsem locks;
  *   - all anon_vma->rwseml
  *
@@ -3591,6 +3592,7 @@ int mm_take_all_locks(struct mm_struct *mm)
 	mas_for_each(&mas, vma, ULONG_MAX) {
 		if (signal_pending(current))
 			goto out_unlock;
+		vma_write_lock(vma);
 		if (vma->vm_file && vma->vm_file->f_mapping &&
 				is_vm_hugetlb_page(vma))
 			vm_lock_mapping(mm, vma->vm_file->f_mapping);
@@ -3677,6 +3679,7 @@ void mm_drop_all_locks(struct mm_struct *mm)
 		if (vma->vm_file && vma->vm_file->f_mapping)
 			vm_unlock_mapping(vma->vm_file->f_mapping);
 	}
+	vma_write_unlock_mm(mm);
 
 	mutex_unlock(&mm_all_locks_mutex);
 }
-- 
2.39.0
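Condensed, the registration-side lock ordering after this change
(an illustrative sketch; error paths and the mapping/anon_vma loops elided):

	static void take_all_locks_order_sketch(struct mm_struct *mm)
	{
		struct vm_area_struct *vma;
		MA_STATE(mas, &mm->mm_mt, 0, 0);

		mmap_assert_write_locked(mm);
		mas_for_each(&mas, vma, ULONG_MAX)
			vma_write_lock(vma);	/* 1: per-VMA locks, the order
						 *    page faults take them */
		/* 2: i_mmap_rwsem locks, 3: anon_vma rwsems -- unchanged */
	}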
From nobody Mon Sep 15 21:41:10 2025
Date: Mon, 9 Jan 2023 12:53:23 -0800
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Subject: [PATCH 28/41] mm: introduce lock_vma_under_rcu to be used from arch-specific code
Message-ID: <20230109205336.3665937-29-surenb@google.com>
In-Reply-To: <20230109205336.3665937-1-surenb@google.com>
References: <20230109205336.3665937-1-surenb@google.com>

Introduce the lock_vma_under_rcu function to look up and lock a VMA
during page fault handling. If the VMA is not found, cannot be locked,
or changes after being locked, the function returns NULL. The lookup is
performed under RCU protection to prevent the found VMA from being
destroyed before the VMA lock is acquired. VMA lock statistics are
updated according to the results. For now only anonymous VMAs can be
searched this way; in all other cases the function returns NULL.
Signed-off-by: Suren Baghdasaryan
---
 include/linux/mm.h |  3 +++
 mm/memory.c        | 51 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 54 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index c464fc8a514c..d0fddf6a1de9 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -687,6 +687,9 @@ static inline void vma_assert_no_reader(struct vm_area_struct *vma)
 		      vma);
 }
 
+struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
+					  unsigned long address);
+
 #else /* CONFIG_PER_VMA_LOCK */
 
 static inline void vma_init_lock(struct vm_area_struct *vma) {}
diff --git a/mm/memory.c b/mm/memory.c
index 9ece18548db1..a658e26d965d 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5242,6 +5242,57 @@ vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
 }
 EXPORT_SYMBOL_GPL(handle_mm_fault);
 
+#ifdef CONFIG_PER_VMA_LOCK
+/*
+ * Lookup and lock a VMA under RCU protection. Returned VMA is guaranteed to be
+ * stable and not isolated. If the VMA is not found or is being modified the
+ * function returns NULL.
+ */
+struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
+					  unsigned long address)
+{
+	MA_STATE(mas, &mm->mm_mt, address, address);
+	struct vm_area_struct *vma, *validate;
+
+	rcu_read_lock();
+	vma = mas_walk(&mas);
+retry:
+	if (!vma)
+		goto inval;
+
+	/* Only anonymous vmas are supported for now */
+	if (!vma_is_anonymous(vma))
+		goto inval;
+
+	if (!vma_read_trylock(vma))
+		goto inval;
+
+	/* Check since vm_start/vm_end might change before we lock the VMA */
+	if (unlikely(address < vma->vm_start || address >= vma->vm_end)) {
+		vma_read_unlock(vma);
+		goto inval;
+	}
+
+	/* Check if the VMA got isolated after we found it */
+	mas.index = address;
+	validate = mas_walk(&mas);
+	if (validate != vma) {
+		vma_read_unlock(vma);
+		count_vm_vma_lock_event(VMA_LOCK_MISS);
+		/* The area was replaced with another one. */
+		vma = validate;
+		goto retry;
+	}
+
+	rcu_read_unlock();
+	return vma;
+inval:
+	rcu_read_unlock();
+	count_vm_vma_lock_event(VMA_LOCK_ABORT);
+	return NULL;
+}
+#endif /* CONFIG_PER_VMA_LOCK */
+
 #ifndef __PAGETABLE_P4D_FOLDED
 /*
  * Allocate p4d page table.
-- 
2.39.0
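Sketch of the intended arch-side caller (an assumed shape -- the real arch
wiring lands later in the series, and FAULT_FLAG_VMA_LOCK is only added in
patch 30):

	static vm_fault_t do_page_fault_per_vma(struct mm_struct *mm,
						unsigned long address,
						unsigned int flags,
						struct pt_regs *regs)
	{
		struct vm_area_struct *vma;
		vm_fault_t fault;

		vma = lock_vma_under_rcu(mm, address);
		if (!vma)
			return VM_FAULT_RETRY;	/* fall back to mmap_lock path */

		fault = handle_mm_fault(vma, address,
					flags | FAULT_FLAG_VMA_LOCK, regs);
		vma_read_unlock(vma);
		return fault;
	}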
From nobody Mon Sep 15 21:41:10 2025
Date: Mon, 9 Jan 2023 12:53:24 -0800
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Subject: [PATCH 29/41] mm: fall back to mmap_lock if vma->anon_vma is not yet set
Message-ID: <20230109205336.3665937-30-surenb@google.com>
In-Reply-To: <20230109205336.3665937-1-surenb@google.com>
References: <20230109205336.3665937-1-surenb@google.com>
When vma->anon_vma is not set, the page fault handler will set it by
either reusing the anon_vma of an adjacent VMA if the VMAs are
compatible or by allocating a new one. find_mergeable_anon_vma() walks
the VMA tree to find a compatible adjacent VMA, and that requires not
only the faulting VMA to be stable but also the tree structure and the
other VMAs inside that tree. Therefore locking just the faulting VMA is
not enough for this search. Fall back to taking mmap_lock when
vma->anon_vma is not set. This situation happens only on the first page
fault and should not affect overall performance.

Signed-off-by: Suren Baghdasaryan
---
 mm/memory.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/mm/memory.c b/mm/memory.c
index a658e26d965d..2560524ad7f4 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5264,6 +5264,10 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
 	if (!vma_is_anonymous(vma))
 		goto inval;
 
+	/* find_mergeable_anon_vma uses adjacent vmas which are not locked */
+	if (!vma->anon_vma)
+		goto inval;
+
 	if (!vma_read_trylock(vma))
 		goto inval;
 
-- 
2.39.0
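To make the "first fault only" point concrete, a small userspace sketch (on a
kernel carrying this series with arch support; the mapping size and accesses
are illustrative):

	#include <stdlib.h>
	#include <sys/mman.h>

	int main(void)
	{
		/* Two pages, so one VMA can take two separate faults. */
		char *p = mmap(NULL, 2 * 4096, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		if (p == MAP_FAILED)
			return EXIT_FAILURE;

		p[0] = 1;	/* first fault: anon_vma == NULL -> mmap_lock path */
		p[4096] = 2;	/* second fault: anon_vma set -> per-VMA lock path */

		munmap(p, 2 * 4096);
		return EXIT_SUCCESS;
	}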
From nobody Mon Sep 15 21:41:10 2025
Date: Mon, 9 Jan 2023 12:53:25 -0800
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Subject: [PATCH 30/41] mm: add FAULT_FLAG_VMA_LOCK flag
Message-ID: <20230109205336.3665937-31-surenb@google.com>
In-Reply-To: <20230109205336.3665937-1-surenb@google.com>
References: <20230109205336.3665937-1-surenb@google.com>

Add a new flag to distinguish page faults handled under protection of
per-vma lock.
Signed-off-by: Suren Baghdasaryan
Reviewed-by: Laurent Dufour
---
 include/linux/mm.h       | 3 ++-
 include/linux/mm_types.h | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index d0fddf6a1de9..2e3be1d45371 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -467,7 +467,8 @@ static inline bool fault_flag_allow_retry_first(enum fault_flag flags)
 	{ FAULT_FLAG_USER,		"USER" }, \
 	{ FAULT_FLAG_REMOTE,		"REMOTE" }, \
 	{ FAULT_FLAG_INSTRUCTION,	"INSTRUCTION" }, \
-	{ FAULT_FLAG_INTERRUPTIBLE,	"INTERRUPTIBLE" }
+	{ FAULT_FLAG_INTERRUPTIBLE,	"INTERRUPTIBLE" }, \
+	{ FAULT_FLAG_VMA_LOCK,		"VMA_LOCK" }
 
 /*
  * vm_fault is filled by the pagefault handler and passed to the vma's
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 0d27edd3e63a..fce9113d979c 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -1103,6 +1103,7 @@ enum fault_flag {
 	FAULT_FLAG_INTERRUPTIBLE =	1 << 9,
 	FAULT_FLAG_UNSHARE =		1 << 10,
 	FAULT_FLAG_ORIG_PTE_VALID =	1 << 11,
+	FAULT_FLAG_VMA_LOCK =		1 << 12,
 };
 
 typedef unsigned int __bitwise zap_flags_t;
-- 
2.39.0
From nobody Mon Sep 15 21:41:10 2025
Date: Mon, 9 Jan 2023 12:53:26 -0800
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Subject: [PATCH 31/41] mm: prevent do_swap_page from handling page faults under VMA lock
Message-ID: <20230109205336.3665937-32-surenb@google.com>
In-Reply-To: <20230109205336.3665937-1-surenb@google.com>
References: <20230109205336.3665937-1-surenb@google.com>

Due to the possibility of do_swap_page dropping mmap_lock, abort fault
handling under VMA lock and retry holding mmap_lock. This can be
handled more gracefully in the future.
Signed-off-by: Suren Baghdasaryan
Reviewed-by: Laurent Dufour
---
 mm/memory.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/mm/memory.c b/mm/memory.c
index 2560524ad7f4..20806bc8b4eb 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3707,6 +3707,11 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	if (!pte_unmap_same(vmf))
 		goto out;
 
+	if (vmf->flags & FAULT_FLAG_VMA_LOCK) {
+		ret = VM_FAULT_RETRY;
+		goto out;
+	}
+
 	entry = pte_to_swp_entry(vmf->orig_pte);
 	if (unlikely(non_swap_entry(entry))) {
 		if (is_migration_entry(entry)) {
-- 
2.39.0
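The VM_FAULT_RETRY here surfaces to the arch handler, which redoes the fault
the classic way. A hedged sketch of that fallback, building on the
hypothetical do_page_fault_per_vma() shown after patch 28 (real handlers also
deal with FAULT_FLAG_ALLOW_RETRY semantics, elided here):

	static vm_fault_t fault_with_fallback(struct mm_struct *mm,
					      unsigned long address,
					      unsigned int flags,
					      struct pt_regs *regs)
	{
		struct vm_area_struct *vma;
		vm_fault_t fault;

		fault = do_page_fault_per_vma(mm, address, flags, regs);
		if (!(fault & VM_FAULT_RETRY))
			return fault;

		/* Slow path: the swap (and other unsupported) cases land here. */
		mmap_read_lock(mm);
		vma = find_vma(mm, address);
		if (!vma || vma->vm_start > address) {
			mmap_read_unlock(mm);
			return VM_FAULT_SIGSEGV;
		}
		fault = handle_mm_fault(vma, address, flags, regs);
		mmap_read_unlock(mm);
		return fault;
	}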
From nobody Mon Sep 15 21:41:10 2025
Date: Mon, 9 Jan 2023 12:53:27 -0800
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Subject: [PATCH 32/41] mm: prevent userfaults from being handled under per-vma lock
Message-ID: <20230109205336.3665937-33-surenb@google.com>
In-Reply-To: <20230109205336.3665937-1-surenb@google.com>
References: <20230109205336.3665937-1-surenb@google.com>

Due to the possibility of handle_userfault dropping mmap_lock, avoid
fault handling under VMA lock and retry holding mmap_lock. This can be
handled more gracefully in the future.

Signed-off-by: Suren Baghdasaryan
Suggested-by: Peter Xu
---
 mm/memory.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/mm/memory.c b/mm/memory.c
index 20806bc8b4eb..12508f4d845a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5273,6 +5273,13 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
 	if (!vma->anon_vma)
 		goto inval;
 
+	/*
+	 * Due to the possibility of userfault handler dropping mmap_lock, avoid
+	 * it for now and fall back to page fault handling under mmap_lock.
+	 */
+	 */
+	if (userfaultfd_armed(vma))
+		goto inval;
+
 	if (!vma_read_trylock(vma))
 		goto inval;

--
2.39.0
From nobody Mon Sep 15 21:41:10 2025
Date: Mon, 9 Jan 2023 12:53:28 -0800
In-Reply-To: <20230109205336.3665937-1-surenb@google.com>
References: <20230109205336.3665937-1-surenb@google.com>
Message-ID: <20230109205336.3665937-34-surenb@google.com>
Subject: [PATCH 33/41] mm: introduce per-VMA lock statistics
From: Suren Baghdasaryan
To: akpm@linux-foundation.org

Add a new CONFIG_PER_VMA_LOCK_STATS config option to dump extra
statistics about handling page faults under VMA lock.

Signed-off-by: Suren Baghdasaryan
---
 include/linux/vm_event_item.h | 6 ++++++
 include/linux/vmstat.h        | 6 ++++++
 mm/Kconfig.debug              | 8 ++++++++
 mm/vmstat.c                   | 6 ++++++
 4 files changed, 26 insertions(+)

diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
index 7f5d1caf5890..8abfa1240040 100644
--- a/include/linux/vm_event_item.h
+++ b/include/linux/vm_event_item.h
@@ -149,6 +149,12 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
 #ifdef CONFIG_X86
 	DIRECT_MAP_LEVEL2_SPLIT,
 	DIRECT_MAP_LEVEL3_SPLIT,
+#endif
+#ifdef CONFIG_PER_VMA_LOCK_STATS
+	VMA_LOCK_SUCCESS,
+	VMA_LOCK_ABORT,
+	VMA_LOCK_RETRY,
+	VMA_LOCK_MISS,
 #endif
 	NR_VM_EVENT_ITEMS
 };
diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index 19cf5b6892ce..fed855bae6d8 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -125,6 +125,12 @@ static inline void vm_events_fold_cpu(int cpu)
 #define count_vm_tlb_events(x, y) do { (void)(y); } while (0)
 #endif

+#ifdef CONFIG_PER_VMA_LOCK_STATS
+#define count_vm_vma_lock_event(x) count_vm_event(x)
+#else
+#define count_vm_vma_lock_event(x) do {} while (0)
+#endif
+
 #define __count_zid_vm_events(item, zid, delta) \
 	__count_vm_events(item##_NORMAL - ZONE_NORMAL + zid, delta)

diff --git a/mm/Kconfig.debug b/mm/Kconfig.debug
index fca699ad1fb0..32a93b064590 100644
--- a/mm/Kconfig.debug
+++ b/mm/Kconfig.debug
@@ -207,3 +207,11 @@ config PTDUMP_DEBUGFS
 	  kernel.

 	  If in doubt, say N.
+
+
+config PER_VMA_LOCK_STATS
+	bool "Statistics for per-vma locks"
+	depends on PER_VMA_LOCK
+	default y
+	help
+	  Statistics for per-vma locks.
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 1ea6a5ce1c41..4f1089a1860e 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1399,6 +1399,12 @@ const char * const vmstat_text[] = {
 	"direct_map_level2_splits",
 	"direct_map_level3_splits",
 #endif
+#ifdef CONFIG_PER_VMA_LOCK_STATS
+	"vma_lock_success",
+	"vma_lock_abort",
+	"vma_lock_retry",
+	"vma_lock_miss",
+#endif
 #endif /* CONFIG_VM_EVENT_COUNTERS || CONFIG_MEMCG */
 };
 #endif /* CONFIG_PROC_FS || CONFIG_SYSFS || CONFIG_NUMA || CONFIG_MEMCG */
--
2.39.0
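With CONFIG_PER_VMA_LOCK_STATS=y the four counters above surface as
vma_lock_* lines in /proc/vmstat. A minimal userspace reader, given here
as an illustrative sketch rather than as part of the series:

	#include <stdio.h>
	#include <string.h>

	int main(void)
	{
		char line[256];
		FILE *f = fopen("/proc/vmstat", "r");

		if (!f) {
			perror("fopen");
			return 1;
		}
		/* print vma_lock_success/abort/retry/miss counters */
		while (fgets(line, sizeof(line), f))
			if (!strncmp(line, "vma_lock_", strlen("vma_lock_")))
				fputs(line, stdout);
		fclose(f);
		return 0;
	}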
From nobody Mon Sep 15 21:41:10 2025
Date: Mon, 9 Jan 2023 12:53:29 -0800
In-Reply-To: <20230109205336.3665937-1-surenb@google.com>
References: <20230109205336.3665937-1-surenb@google.com>
Message-ID: <20230109205336.3665937-35-surenb@google.com>
Subject: [PATCH 34/41] x86/mm: try VMA lock-based page fault handling first
From: Suren Baghdasaryan
To: akpm@linux-foundation.org

Attempt VMA lock-based page fault handling first, and fall back to the
existing mmap_lock-based handling if that fails.

Signed-off-by: Suren Baghdasaryan
---
 arch/x86/Kconfig    |  1 +
 arch/x86/mm/fault.c | 36 ++++++++++++++++++++++++++++++++++++
 2 files changed, 37 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 3604074a878b..3647f7bdb110 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -27,6 +27,7 @@ config X86_64
 	# Options that are inherently 64-bit kernel only:
 	select ARCH_HAS_GIGANTIC_PAGE
 	select ARCH_SUPPORTS_INT128 if CC_HAS_INT128
+	select ARCH_SUPPORTS_PER_VMA_LOCK
 	select ARCH_USE_CMPXCHG_LOCKREF
 	select HAVE_ARCH_SOFT_DIRTY
 	select MODULES_USE_ELF_RELA
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 7b0d4ab894c8..983266e7c49b 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -19,6 +19,7 @@
 #include <linux/uaccess.h>		/* faulthandler_disabled() */
 #include <linux/efi.h>			/* efi_crash_gracefully_on_page_fault()*/
 #include <linux/mm_types.h>
+#include <linux/mm.h>			/* find_and_lock_vma() */

 #include <asm/cpufeature.h>		/* boot_cpu_has, ... */
 #include <asm/traps.h>			/* dotraplinkage, ... */
@@ -1354,6 +1355,38 @@ void do_user_addr_fault(struct pt_regs *regs,
 	}
 #endif

+#ifdef CONFIG_PER_VMA_LOCK
+	if (!(flags & FAULT_FLAG_USER) || atomic_read(&mm->mm_users) == 1)
+		goto lock_mmap;
+
+	vma = lock_vma_under_rcu(mm, address);
+	if (!vma)
+		goto lock_mmap;
+
+	if (unlikely(access_error(error_code, vma))) {
+		vma_read_unlock(vma);
+		goto lock_mmap;
+	}
+	fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
+	vma_read_unlock(vma);
+
+	if (!(fault & VM_FAULT_RETRY)) {
+		count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
+		goto done;
+	}
+	count_vm_vma_lock_event(VMA_LOCK_RETRY);
+
+	/* Quick path to respond to signals */
+	if (fault_signal_pending(fault, regs)) {
+		if (!user_mode(regs))
+			kernelmode_fixup_or_oops(regs, error_code, address,
+						 SIGBUS, BUS_ADRERR,
+						 ARCH_DEFAULT_PKEY);
+		return;
+	}
+lock_mmap:
+#endif /* CONFIG_PER_VMA_LOCK */
+
 	/*
 	 * Kernel-mode access to the user address space should only occur
 	 * on well-defined single instructions listed in the exception
@@ -1454,6 +1487,9 @@ void do_user_addr_fault(struct pt_regs *regs,
 	}

 	mmap_read_unlock(mm);
+#ifdef CONFIG_PER_VMA_LOCK
+done:
+#endif
 	if (likely(!(fault & VM_FAULT_ERROR)))
 		return;

--
2.39.0
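The same fast-path shape recurs in the arm64 and powerpc patches that
follow. Stripped of arch details it reads roughly as below; this is a
condensed sketch, not code from the series, and access_not_permitted()
is a placeholder for the per-arch check (access_error() on x86, the
vm_flags test on arm64, the pkey/access checks on powerpc):

	#ifdef CONFIG_PER_VMA_LOCK
		/* Per-VMA lock fast path: no mmap_lock taken on success */
		vma = lock_vma_under_rcu(mm, address);
		if (!vma)
			goto lock_mmap;			/* fall back to mmap_lock */
		if (access_not_permitted(vma)) {	/* placeholder, per-arch */
			vma_read_unlock(vma);
			goto lock_mmap;
		}
		fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
		vma_read_unlock(vma);
		if (!(fault & VM_FAULT_RETRY)) {
			count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
			goto done;			/* handled without mmap_lock */
		}
		count_vm_vma_lock_event(VMA_LOCK_RETRY);	/* retry under mmap_lock */
	lock_mmap:
	#endif
		/* ... existing mmap_lock-based slow path ... */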
From nobody Mon Sep 15 21:41:10 2025
Date: Mon, 9 Jan 2023 12:53:30 -0800
In-Reply-To: <20230109205336.3665937-1-surenb@google.com>
References: <20230109205336.3665937-1-surenb@google.com>
Message-ID: <20230109205336.3665937-36-surenb@google.com>
Subject: [PATCH 35/41] arm64/mm: try VMA lock-based page fault handling first
From: Suren Baghdasaryan
To: akpm@linux-foundation.org

Attempt VMA lock-based page fault handling first, and fall back to the
existing mmap_lock-based handling if that fails.
Signed-off-by: Suren Baghdasaryan
---
 arch/arm64/Kconfig    |  1 +
 arch/arm64/mm/fault.c | 36 ++++++++++++++++++++++++++++++++++++
 2 files changed, 37 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 03934808b2ed..829fa6d14a36 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -95,6 +95,7 @@ config ARM64
 	select ARCH_SUPPORTS_INT128 if CC_HAS_INT128
 	select ARCH_SUPPORTS_NUMA_BALANCING
 	select ARCH_SUPPORTS_PAGE_TABLE_CHECK
+	select ARCH_SUPPORTS_PER_VMA_LOCK
 	select ARCH_WANT_COMPAT_IPC_PARSE_VERSION if COMPAT
 	select ARCH_WANT_DEFAULT_BPF_JIT
 	select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 596f46dabe4e..833fa8bab291 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -535,6 +535,9 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
 	unsigned long vm_flags;
 	unsigned int mm_flags = FAULT_FLAG_DEFAULT;
 	unsigned long addr = untagged_addr(far);
+#ifdef CONFIG_PER_VMA_LOCK
+	struct vm_area_struct *vma;
+#endif

 	if (kprobe_page_fault(regs, esr))
 		return 0;
@@ -585,6 +588,36 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,

 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, addr);

+#ifdef CONFIG_PER_VMA_LOCK
+	if (!(mm_flags & FAULT_FLAG_USER) || atomic_read(&mm->mm_users) == 1)
+		goto lock_mmap;
+
+	vma = lock_vma_under_rcu(mm, addr);
+	if (!vma)
+		goto lock_mmap;
+
+	if (!(vma->vm_flags & vm_flags)) {
+		vma_read_unlock(vma);
+		goto lock_mmap;
+	}
+	fault = handle_mm_fault(vma, addr & PAGE_MASK,
+				mm_flags | FAULT_FLAG_VMA_LOCK, regs);
+	vma_read_unlock(vma);
+
+	if (!(fault & VM_FAULT_RETRY)) {
+		count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
+		goto done;
+	}
+	count_vm_vma_lock_event(VMA_LOCK_RETRY);
+
+	/* Quick path to respond to signals */
+	if (fault_signal_pending(fault, regs)) {
+		if (!user_mode(regs))
+			goto no_context;
+		return 0;
+	}
+lock_mmap:
+#endif /* CONFIG_PER_VMA_LOCK */
 	/*
 	 * As per x86, we may deadlock here. However, since the kernel only
 	 * validly references user space from well defined areas of the code,
@@ -628,6 +661,9 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
 	}
 	mmap_read_unlock(mm);

+#ifdef CONFIG_PER_VMA_LOCK
+done:
+#endif
 	/*
 	 * Handle the "normal" (no error) case first.
 	 */
--
2.39.0

From nobody Mon Sep 15 21:41:10 2025
Date: Mon, 9 Jan 2023 12:53:31 -0800
In-Reply-To: <20230109205336.3665937-1-surenb@google.com>
References: <20230109205336.3665937-1-surenb@google.com>
Message-ID: <20230109205336.3665937-37-surenb@google.com>
Subject: [PATCH 36/41] powerpc/mm: try VMA lock-based page fault handling first
From: Suren Baghdasaryan
To: akpm@linux-foundation.org

From: Laurent Dufour

Attempt VMA lock-based page fault handling first, and fall back to the
existing mmap_lock-based handling if that fails.
Copied from "x86/mm: try VMA lock-based page fault handling first"

Signed-off-by: Laurent Dufour
Signed-off-by: Suren Baghdasaryan
---
 arch/powerpc/mm/fault.c                | 41 ++++++++++++++++++++++++++
 arch/powerpc/platforms/powernv/Kconfig |  1 +
 arch/powerpc/platforms/pseries/Kconfig |  1 +
 3 files changed, 43 insertions(+)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 2bef19cc1b98..f92f8956d5f2 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -469,6 +469,44 @@ static int ___do_page_fault(struct pt_regs *regs, unsigned long address,
 	if (is_exec)
 		flags |= FAULT_FLAG_INSTRUCTION;

+#ifdef CONFIG_PER_VMA_LOCK
+	if (!(flags & FAULT_FLAG_USER) || atomic_read(&mm->mm_users) == 1)
+		goto lock_mmap;
+
+	vma = lock_vma_under_rcu(mm, address);
+	if (!vma)
+		goto lock_mmap;
+
+	if (unlikely(access_pkey_error(is_write, is_exec,
+				       (error_code & DSISR_KEYFAULT), vma))) {
+		int rc = bad_access_pkey(regs, address, vma);
+
+		vma_read_unlock(vma);
+		return rc;
+	}
+
+	if (unlikely(access_error(is_write, is_exec, vma))) {
+		int rc = bad_access(regs, address);
+
+		vma_read_unlock(vma);
+		return rc;
+	}
+
+	fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
+	vma_read_unlock(vma);
+
+	if (!(fault & VM_FAULT_RETRY)) {
+		count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
+		goto done;
+	}
+	count_vm_vma_lock_event(VMA_LOCK_RETRY);
+
+	if (fault_signal_pending(fault, regs))
+		return user_mode(regs) ? 0 : SIGBUS;
+
+lock_mmap:
+#endif /* CONFIG_PER_VMA_LOCK */
+
 	/* When running in the kernel we expect faults to occur only to
 	 * addresses in user space.  All other faults represent errors in the
 	 * kernel and should generate an OOPS.  Unfortunately, in the case of an
@@ -545,6 +583,9 @@ static int ___do_page_fault(struct pt_regs *regs, unsigned long address,

 	mmap_read_unlock(current->mm);

+#ifdef CONFIG_PER_VMA_LOCK
+done:
+#endif
 	if (unlikely(fault & VM_FAULT_ERROR))
 		return mm_fault_error(regs, address, fault);

diff --git a/arch/powerpc/platforms/powernv/Kconfig b/arch/powerpc/platforms/powernv/Kconfig
index ae248a161b43..70a46acc70d6 100644
--- a/arch/powerpc/platforms/powernv/Kconfig
+++ b/arch/powerpc/platforms/powernv/Kconfig
@@ -16,6 +16,7 @@ config PPC_POWERNV
 	select PPC_DOORBELL
 	select MMU_NOTIFIER
 	select FORCE_SMP
+	select ARCH_SUPPORTS_PER_VMA_LOCK
 	default y

 config OPAL_PRD
diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig
index a3b4d99567cb..e036a04ff1ca 100644
--- a/arch/powerpc/platforms/pseries/Kconfig
+++ b/arch/powerpc/platforms/pseries/Kconfig
@@ -21,6 +21,7 @@ config PPC_PSERIES
 	select HOTPLUG_CPU
 	select FORCE_SMP
 	select SWIOTLB
+	select ARCH_SUPPORTS_PER_VMA_LOCK
 	default y

 config PARAVIRT
--
2.39.0
From nobody Mon Sep 15 21:41:10 2025
Date: Mon, 9 Jan 2023 12:53:32 -0800
In-Reply-To: <20230109205336.3665937-1-surenb@google.com>
References: <20230109205336.3665937-1-surenb@google.com>
Message-ID: <20230109205336.3665937-38-surenb@google.com>
Subject: [PATCH 37/41] mm: introduce mod_vm_flags_nolock
From: Suren Baghdasaryan
To: akpm@linux-foundation.org

In cases when VMA flags are modified after the VMA has been isolated and
mmap_lock downgraded, the flag modifications do not require per-VMA
locking, and an attempt to lock the VMA would trigger an assertion
because the mmap write lock is not held. Introduce mod_vm_flags_nolock
for use in such situations.
Signed-off-by: Suren Baghdasaryan
---
 include/linux/mm.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 2e3be1d45371..7d436a5027cc 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -743,6 +743,14 @@ void clear_vm_flags(struct vm_area_struct *vma, unsigned long flags)
 	vma->vm_flags &= ~flags;
 }

+static inline
+void mod_vm_flags_nolock(struct vm_area_struct *vma,
+			 unsigned long set, unsigned long clear)
+{
+	vma->vm_flags |= set;
+	vma->vm_flags &= ~clear;
+}
+
 static inline
 void mod_vm_flags(struct vm_area_struct *vma,
 		  unsigned long set, unsigned long clear)
--
2.39.0
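For contrast, a sketch of the intended split between the locked and
unlocked helpers; set_flags/clear_flags are placeholder values and this
is illustration only, mirroring how the next patch uses the helper from
untrack_pfn():

	/* mmap write lock held: the locked helper asserts and updates */
	mod_vm_flags(vma, set_flags, clear_flags);

	/* VMA already isolated and mmap_lock downgraded: write-locking the
	 * VMA would trip the assertion, so use the _nolock variant */
	mod_vm_flags_nolock(vma, set_flags, clear_flags);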
From nobody Mon Sep 15 21:41:10 2025
Date: Mon, 9 Jan 2023 12:53:33 -0800
In-Reply-To: <20230109205336.3665937-1-surenb@google.com>
References: <20230109205336.3665937-1-surenb@google.com>
Message-ID: <20230109205336.3665937-39-surenb@google.com>
Subject: [PATCH 38/41] mm: avoid assertion in untrack_pfn
From: Suren Baghdasaryan
To: akpm@linux-foundation.org

untrack_pfn can be called after the VMA was isolated and mmap_lock
downgraded. An attempt to lock the affected VMA would cause an assertion;
therefore use mod_vm_flags_nolock in such situations.

Signed-off-by: Suren Baghdasaryan
---
 arch/x86/mm/pat/memtype.c | 10 +++++++---
 include/linux/mm.h        |  2 +-
 include/linux/pgtable.h   |  5 +++--
 mm/memory.c               | 15 ++++++++-------
 mm/memremap.c             |  4 ++--
 mm/mmap.c                 |  4 ++--
 6 files changed, 23 insertions(+), 17 deletions(-)

diff --git a/arch/x86/mm/pat/memtype.c b/arch/x86/mm/pat/memtype.c
index 9e490a372896..f71c8381430b 100644
--- a/arch/x86/mm/pat/memtype.c
+++ b/arch/x86/mm/pat/memtype.c
@@ -1045,7 +1045,7 @@ void track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot, pfn_t pfn)
  * can be for the entire vma (in which case pfn, size are zero).
  */
 void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,
-		 unsigned long size)
+		 unsigned long size, bool lock_vma)
 {
 	resource_size_t paddr;
 	unsigned long prot;
@@ -1064,8 +1064,12 @@ void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,
 		size = vma->vm_end - vma->vm_start;
 	}
 	free_pfn_range(paddr, size);
-	if (vma)
-		clear_vm_flags(vma, VM_PAT);
+	if (vma) {
+		if (lock_vma)
+			clear_vm_flags(vma, VM_PAT);
+		else
+			mod_vm_flags_nolock(vma, 0, VM_PAT);
+	}
 }

 /*
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 7d436a5027cc..3158f33e268c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2135,7 +2135,7 @@ void zap_page_range_single(struct vm_area_struct *vma, unsigned long address,
 			   unsigned long size, struct zap_details *details);
 void unmap_vmas(struct mmu_gather *tlb, struct maple_tree *mt,
 		struct vm_area_struct *start_vma, unsigned long start,
-		unsigned long end);
+		unsigned long end, bool lock_vma);

 struct mmu_notifier_range;

diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 1159b25b0542..eaa831bd675d 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -1214,7 +1214,8 @@ static inline int track_pfn_copy(struct vm_area_struct *vma)
  * can be for the entire vma (in which case pfn, size are zero).
  */
 static inline void untrack_pfn(struct vm_area_struct *vma,
-			       unsigned long pfn, unsigned long size)
+			       unsigned long pfn, unsigned long size,
+			       bool lock_vma)
 {
 }

@@ -1232,7 +1233,7 @@ extern void track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
 			     pfn_t pfn);
 extern int track_pfn_copy(struct vm_area_struct *vma);
 extern void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,
-			unsigned long size);
+			unsigned long size, bool lock_vma);
 extern void untrack_pfn_moved(struct vm_area_struct *vma);
 #endif

diff --git a/mm/memory.c b/mm/memory.c
index 12508f4d845a..5c7d5eaa60d8 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1610,7 +1610,7 @@ void unmap_page_range(struct mmu_gather *tlb,
 static void unmap_single_vma(struct mmu_gather *tlb,
 		struct vm_area_struct *vma, unsigned long start_addr,
 		unsigned long end_addr,
-		struct zap_details *details)
+		struct zap_details *details, bool lock_vma)
 {
 	unsigned long start = max(vma->vm_start, start_addr);
 	unsigned long end;
@@ -1625,7 +1625,7 @@ static void unmap_single_vma(struct mmu_gather *tlb,
 		uprobe_munmap(vma, start, end);

 	if (unlikely(vma->vm_flags & VM_PFNMAP))
-		untrack_pfn(vma, 0, 0);
+		untrack_pfn(vma, 0, 0, lock_vma);

 	if (start != end) {
 		if (unlikely(is_vm_hugetlb_page(vma))) {
@@ -1672,7 +1672,7 @@ static void unmap_single_vma(struct mmu_gather *tlb,
  */
 void unmap_vmas(struct mmu_gather *tlb, struct maple_tree *mt,
 		struct vm_area_struct *vma, unsigned long start_addr,
-		unsigned long end_addr)
+		unsigned long end_addr, bool lock_vma)
 {
 	struct mmu_notifier_range range;
 	struct zap_details details = {
@@ -1686,7 +1686,8 @@ void unmap_vmas(struct mmu_gather *tlb, struct maple_tree *mt,
 				start_addr, end_addr);
 	mmu_notifier_invalidate_range_start(&range);
 	do {
-		unmap_single_vma(tlb, vma, start_addr, end_addr, &details);
+		unmap_single_vma(tlb, vma, start_addr, end_addr, &details,
+				 lock_vma);
 	} while ((vma = mas_find(&mas, end_addr - 1)) != NULL);
 	mmu_notifier_invalidate_range_end(&range);
 }
@@ -1715,7 +1716,7 @@ void zap_page_range(struct vm_area_struct *vma, unsigned long start,
 	update_hiwater_rss(vma->vm_mm);
 	mmu_notifier_invalidate_range_start(&range);
 	do {
-		unmap_single_vma(&tlb, vma, start, range.end, NULL);
+		unmap_single_vma(&tlb, vma, start, range.end, NULL, false);
 	} while ((vma = mas_find(&mas, end - 1)) != NULL);
 	mmu_notifier_invalidate_range_end(&range);
 	tlb_finish_mmu(&tlb);
@@ -1750,7 +1751,7 @@ void zap_page_range_single(struct vm_area_struct *vma, unsigned long address,
 	 * unmap 'address-end' not 'range.start-range.end' as range
 	 * could have been expanded for hugetlb pmd sharing.
 	 */
-	unmap_single_vma(&tlb, vma, address, end, details);
+	unmap_single_vma(&tlb, vma, address, end, details, false);
 	mmu_notifier_invalidate_range_end(&range);
 	tlb_finish_mmu(&tlb);
 }
@@ -2519,7 +2520,7 @@ int remap_pfn_range(struct vm_area_struct *vma, unsigned long addr,

 	err = remap_pfn_range_notrack(vma, addr, pfn, size, prot);
 	if (err)
-		untrack_pfn(vma, pfn, PAGE_ALIGN(size));
+		untrack_pfn(vma, pfn, PAGE_ALIGN(size), true);
 	return err;
 }
 EXPORT_SYMBOL(remap_pfn_range);
diff --git a/mm/memremap.c b/mm/memremap.c
index 08cbf54fe037..2f88f43d4a01 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -129,7 +129,7 @@ static void pageunmap_range(struct dev_pagemap *pgmap, int range_id)
 	}
 	mem_hotplug_done();

-	untrack_pfn(NULL, PHYS_PFN(range->start), range_len(range));
+	untrack_pfn(NULL, PHYS_PFN(range->start), range_len(range), true);
 	pgmap_array_delete(range);
 }

@@ -276,7 +276,7 @@ static int pagemap_range(struct dev_pagemap *pgmap, struct mhp_params *params,
 	if (!is_private)
 		kasan_remove_zero_shadow(__va(range->start), range_len(range));
 err_kasan:
-	untrack_pfn(NULL, PHYS_PFN(range->start), range_len(range));
+	untrack_pfn(NULL, PHYS_PFN(range->start), range_len(range), true);
 err_pfn_remap:
 	pgmap_array_delete(range);
 	return error;
diff --git a/mm/mmap.c b/mm/mmap.c
index a256deca0bc0..332af383f7cd 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2209,7 +2209,7 @@ static void unmap_region(struct mm_struct *mm, struct maple_tree *mt,
 	lru_add_drain();
 	tlb_gather_mmu(&tlb, mm);
 	update_hiwater_rss(mm);
-	unmap_vmas(&tlb, mt, vma, start, end);
+	unmap_vmas(&tlb, mt, vma, start, end, lock_vma);
 	free_pgtables(&tlb, mt, vma, prev ? prev->vm_end : FIRST_USER_ADDRESS,
 		      next ? next->vm_start : USER_PGTABLES_CEILING,
 		      lock_vma);
@@ -3127,7 +3127,7 @@ void exit_mmap(struct mm_struct *mm)
 	tlb_gather_mmu_fullmm(&tlb, mm);
 	/* update_hiwater_rss(mm) here? but nobody should be looking */
 	/* Use ULONG_MAX here to ensure all VMAs in the mm are unmapped */
-	unmap_vmas(&tlb, &mm->mm_mt, vma, 0, ULONG_MAX);
+	unmap_vmas(&tlb, &mm->mm_mt, vma, 0, ULONG_MAX, false);
 	mmap_read_unlock(mm);

 	/*
--
2.39.0
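The lock_vma flag threaded through unmap_vmas()/unmap_single_vma() down
to untrack_pfn() encodes one bit of context: whether the VMAs being torn
down may still need their per-VMA write lock taken before their flags
change. A condensed, illustrative view of the two kinds of call sites
(not verbatim from the patch):

	/* munmap-style path: take per-VMA locks before modifying flags */
	unmap_vmas(&tlb, mt, vma, start, end, /* lock_vma = */ true);

	/* exit_mmap()/downgraded paths: VMAs already isolated, skip the
	 * per-VMA lock so untrack_pfn() -> mod_vm_flags_nolock() does not
	 * trip the write-lock assertion */
	unmap_vmas(&tlb, &mm->mm_mt, vma, 0, ULONG_MAX, /* lock_vma = */ false);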
From nobody Mon Sep 15 21:41:10 2025
Date: Mon, 9 Jan 2023 12:53:34 -0800
In-Reply-To: <20230109205336.3665937-1-surenb@google.com>
References: <20230109205336.3665937-1-surenb@google.com>
Message-ID: <20230109205336.3665937-40-surenb@google.com>
Subject: [PATCH 39/41] kernel/fork: throttle call_rcu() calls in vm_area_free
From: Suren Baghdasaryan
To: akpm@linux-foundation.org

call_rcu() can take a long time when callback offloading is enabled. Its
use in vm_area_free() can cause regressions in the exit path when multiple
VMAs are being freed. To minimize that impact, place VMAs into a list and
free them in groups using one call_rcu() call per group.

Signed-off-by: Suren Baghdasaryan
---
 include/linux/mm.h       |  1 +
 include/linux/mm_types.h | 19 +++++++++--
 kernel/fork.c            | 68 +++++++++++++++++++++++++++++++++++-----
 mm/init-mm.c             |  3 ++
 mm/mmap.c                |  1 +
 5 files changed, 82 insertions(+), 10 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 3158f33e268c..50c7a6dd9c7a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -250,6 +250,7 @@ void setup_initial_init_mm(void *start_code, void *end_code,
 struct vm_area_struct *vm_area_alloc(struct mm_struct *);
 struct vm_area_struct *vm_area_dup(struct vm_area_struct *);
 void vm_area_free(struct vm_area_struct *);
+void drain_free_vmas(struct mm_struct *mm);

 #ifndef CONFIG_MMU
 extern struct rb_root nommu_region_tree;
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index fce9113d979c..c0e6c8e4700b 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -592,8 +592,18 @@ struct vm_area_struct {
 	/* Information about our backing store: */
 	unsigned long vm_pgoff;		/* Offset (within vm_file) in PAGE_SIZE
					   units */
-	struct file * vm_file;		/* File we map to (can be NULL). */
-	void * vm_private_data;		/* was vm_pte (shared mem) */
+	union {
+		struct {
+			/* File we map to (can be NULL). */
+			struct file *vm_file;
+
+			/* was vm_pte (shared mem) */
+			void *vm_private_data;
+		};
+#ifdef CONFIG_PER_VMA_LOCK
+		struct list_head vm_free_list;
+#endif
+	};

 #ifdef CONFIG_ANON_VMA_NAME
 	/*
@@ -693,6 +703,11 @@ struct mm_struct {
 		 */
 #ifdef CONFIG_PER_VMA_LOCK
 		int mm_lock_seq;
+		struct {
+			struct list_head head;
+			spinlock_t lock;
+			int size;
+		} vma_free_list;
 #endif


diff --git a/kernel/fork.c b/kernel/fork.c
index 6d9f14e55ecf..97f2b751f88d 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -481,26 +481,75 @@ struct vm_area_struct *vm_area_dup(struct vm_area_struct *orig)
 }

 #ifdef CONFIG_PER_VMA_LOCK
-static void __vm_area_free(struct rcu_head *head)
+static inline void __vm_area_free(struct vm_area_struct *vma)
 {
-	struct vm_area_struct *vma = container_of(head, struct vm_area_struct,
-						  vm_rcu);
 	/* The vma should either have no lock holders or be write-locked.
 	 */
 	vma_assert_no_reader(vma);
 	kmem_cache_free(vm_area_cachep, vma);
 }
-#endif
+
+static void vma_free_rcu_callback(struct rcu_head *head)
+{
+	struct vm_area_struct *first_vma;
+	struct vm_area_struct *vma, *vma2;
+
+	first_vma = container_of(head, struct vm_area_struct, vm_rcu);
+	list_for_each_entry_safe(vma, vma2, &first_vma->vm_free_list, vm_free_list)
+		__vm_area_free(vma);
+	__vm_area_free(first_vma);
+}
+
+void drain_free_vmas(struct mm_struct *mm)
+{
+	struct vm_area_struct *first_vma;
+	LIST_HEAD(to_destroy);
+
+	spin_lock(&mm->vma_free_list.lock);
+	list_splice_init(&mm->vma_free_list.head, &to_destroy);
+	mm->vma_free_list.size = 0;
+	spin_unlock(&mm->vma_free_list.lock);
+
+	if (list_empty(&to_destroy))
+		return;
+
+	first_vma = list_first_entry(&to_destroy, struct vm_area_struct, vm_free_list);
+	/* Remove the head which is allocated on the stack */
+	list_del(&to_destroy);
+
+	call_rcu(&first_vma->vm_rcu, vma_free_rcu_callback);
+}
+
+#define VM_AREA_FREE_LIST_MAX	32
+
+void vm_area_free(struct vm_area_struct *vma)
+{
+	struct mm_struct *mm = vma->vm_mm;
+	bool drain;
+
+	free_anon_vma_name(vma);
+
+	spin_lock(&mm->vma_free_list.lock);
+	list_add(&vma->vm_free_list, &mm->vma_free_list.head);
+	mm->vma_free_list.size++;
+	drain = mm->vma_free_list.size > VM_AREA_FREE_LIST_MAX;
+	spin_unlock(&mm->vma_free_list.lock);
+
+	if (drain)
+		drain_free_vmas(mm);
+}
+
+#else /* CONFIG_PER_VMA_LOCK */
+
+void drain_free_vmas(struct mm_struct *mm) {}

 void vm_area_free(struct vm_area_struct *vma)
 {
 	free_anon_vma_name(vma);
-#ifdef CONFIG_PER_VMA_LOCK
-	call_rcu(&vma->vm_rcu, __vm_area_free);
-#else
 	kmem_cache_free(vm_area_cachep, vma);
-#endif
 }

+#endif /* CONFIG_PER_VMA_LOCK */
+
 static void account_kernel_stack(struct task_struct *tsk, int account)
 {
 	if (IS_ENABLED(CONFIG_VMAP_STACK)) {
@@ -1150,6 +1199,9 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
 	INIT_LIST_HEAD(&mm->mmlist);
 #ifdef CONFIG_PER_VMA_LOCK
 	WRITE_ONCE(mm->mm_lock_seq, 0);
+	INIT_LIST_HEAD(&mm->vma_free_list.head);
+	spin_lock_init(&mm->vma_free_list.lock);
+	mm->vma_free_list.size = 0;
 #endif
 	mm_pgtables_bytes_init(mm);
 	mm->map_count = 0;
diff --git a/mm/init-mm.c b/mm/init-mm.c
index 33269314e060..b53d23c2d7a3 100644
--- a/mm/init-mm.c
+++ b/mm/init-mm.c
@@ -39,6 +39,9 @@ struct mm_struct init_mm = {
 	.mmlist		= LIST_HEAD_INIT(init_mm.mmlist),
 #ifdef CONFIG_PER_VMA_LOCK
 	.mm_lock_seq	= 0,
+	.vma_free_list.head = LIST_HEAD_INIT(init_mm.vma_free_list.head),
+	.vma_free_list.lock = __SPIN_LOCK_UNLOCKED(init_mm.vma_free_list.lock),
+	.vma_free_list.size = 0,
 #endif
 	.user_ns	= &init_user_ns,
 	.cpu_bitmap	= CPU_BITS_NONE,
diff --git a/mm/mmap.c b/mm/mmap.c
index 332af383f7cd..a0d5d3af1d95 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -3159,6 +3159,7 @@ void exit_mmap(struct mm_struct *mm)
 	trace_exit_mmap(mm);
 	__mt_destroy(&mm->mm_mt);
 	mmap_write_unlock(mm);
+	drain_free_vmas(mm);
 	vm_unacct_memory(nr_accounted);
 }
--
2.39.0
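One subtlety in the batching above deserves a note: only the batch
head's embedded rcu_head is handed to call_rcu(); the callback then
walks the list threaded through the remaining VMAs' vm_free_list
entries. A stripped-down model of the idiom, with a hypothetical obj
type for illustration only:

	struct obj {
		struct list_head node;	/* links batch members together */
		struct rcu_head rcu;	/* only the batch head's rcu is used */
	};

	static void batch_free_cb(struct rcu_head *head)
	{
		struct obj *first = container_of(head, struct obj, rcu);
		struct obj *o, *tmp;

		/* free every other member reachable from the head's node... */
		list_for_each_entry_safe(o, tmp, &first->node, node)
			kfree(o);
		/* ...then the head itself */
		kfree(first);
	}

This amortizes the cost of call_rcu(), which can be expensive with
callback offloading enabled, across up to VM_AREA_FREE_LIST_MAX VMAs.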
lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237453AbjAIU6W (ORCPT ); Mon, 9 Jan 2023 15:58:22 -0500 Received: from mail-yw1-x114a.google.com (mail-yw1-x114a.google.com [IPv6:2607:f8b0:4864:20::114a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CAE288E9BB for ; Mon, 9 Jan 2023 12:55:23 -0800 (PST) Received: by mail-yw1-x114a.google.com with SMTP id 00721157ae682-4597b0ff5e9so105296977b3.10 for ; Mon, 09 Jan 2023 12:55:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=KsiQb/dE9yymRpEr0UBz2Qq4KLrRHehGb+mEU6hVzX0=; b=sWtobSCBvoHV5TPkE/zYtWszgUW3jVRDiVjTt3sAcjFBaXlqjfdYiQ6AHG86rnNt0b mg+e+VoU+a89JO+rslq34FfyzRkAoGg7DuVDlMH1H1ePQ32MuB2+77t0gAtkJ+y+wX6p B/589/jsqLZApvm6BoQPTAcfoFYkEfWsjxaKMUYIa10AIrZvQul2s59ukdBVjUu/Ibrp egNG28U4wT5FZXec8pRJXGar6bKM6QyHpkRg0GkDbw6/Hl7olgQ7YiTkujC6hmd+0b7u 2fP91n1+O4wNRIiFMGmQ1g/TPPehatod7VMoh0dY0z9ogD+TX+MKDQ5sfW6QhS4fa9cZ O/Tw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=KsiQb/dE9yymRpEr0UBz2Qq4KLrRHehGb+mEU6hVzX0=; b=mE5jrXaVUYv/0RnvlEGxu2MomIHk2di19OwyAYbkCv2q0MczPgfCPNP6Ihj6SLWyCO 7NrDWmRSxUrqqFOhThCqapEUhbhAYMUVdSmKX3K+yWouothbdHCPOq7ZLtMPEE7Sblfh h988FW5a9SYKbG7EeY1tGqjuDY53YxR8ogVmnX0lxgSP+4Dtgv1nCx/oHheXt9o05QP7 5SR/Kv3DihFexE4Wz/Lp/ZXir2NBvGmUb9351Eicpiv+CWvn52tCBNzKxNIkdno81gk0 hIpfupqpSpeMWfrIScxCBWf+e5cqJJqcjH11V00/Y7s74y2bDH3vKGao4sejejhgMsAF WIiA== X-Gm-Message-State: AFqh2krNHOcLxaqn+b6kUfa33I1p/EKysLMj8Eg6e7cZ3mlJnTaqriY0 hsA4cpEiIqfA/FdkwbFi9sZAHFn9UPY= X-Google-Smtp-Source: AMrXdXvB4GUb8flK/fogRdz9aeHP/yrxt0+V7dhaMMs1drQdYkqQuuAWfxuEu+0du5vQqeydWK6w8e3+WKU= X-Received: from surenb-desktop.mtv.corp.google.com ([2620:15c:211:200:9393:6f7a:d410:55ca]) (user=surenb job=sendgmr) by 2002:a25:abc6:0:b0:755:29c5:63e with SMTP id v64-20020a25abc6000000b0075529c5063emr6840356ybi.142.1673297721512; Mon, 09 Jan 2023 12:55:21 -0800 (PST) Date: Mon, 9 Jan 2023 12:53:35 -0800 In-Reply-To: <20230109205336.3665937-1-surenb@google.com> Mime-Version: 1.0 References: <20230109205336.3665937-1-surenb@google.com> X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog Message-ID: <20230109205336.3665937-41-surenb@google.com> Subject: [PATCH 40/41] mm: separate vma->lock from vm_area_struct From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: michel@lespinasse.org, jglisse@google.com, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mgorman@techsingularity.net, dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com, peterz@infradead.org, ldufour@linux.ibm.com, laurent.dufour@fr.ibm.com, paulmck@kernel.org, luto@kernel.org, songliubraving@fb.com, peterx@redhat.com, david@redhat.com, dhowells@redhat.com, hughd@google.com, bigeasy@linutronix.de, kent.overstreet@linux.dev, punit.agrawal@bytedance.com, lstoakes@gmail.com, peterjung1337@gmail.com, rientjes@google.com, axelrasmussen@google.com, joelaf@google.com, minchan@google.com, jannh@google.com, shakeelb@google.com, tatashin@google.com, edumazet@google.com, gthelen@google.com, gurua@google.com, arjunroy@google.com, soheil@google.com, hughlynch@google.com, leewalsh@google.com, posk@google.com, linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, x86@kernel.org, 
 linux-kernel@vger.kernel.org, kernel-team@android.com, surenb@google.com

vma->lock being part of the vm_area_struct causes a performance regression
during page faults: under contention its count and owner fields are
constantly updated, and having other parts of vm_area_struct that the page
fault path reads placed next to them causes constant cache line bouncing.
Fix that by moving the lock outside of the vm_area_struct.

All attempts to keep vma->lock inside vm_area_struct in a separate cache
line still produced a performance regression, especially on NUMA machines.
The smallest regression was achieved when the lock was placed in the fourth
cache line, but that bloats vm_area_struct to 256 bytes.

Considering the performance and memory impact, a separate lock looks like
the best option. It increases the memory footprint of each VMA, but that
will be addressed in the next patch.

Note that after this change vma_init() does not allocate or initialize
vma->lock anymore. A number of drivers allocate a pseudo VMA on the stack
but never use the VMA's lock, therefore it does not need to be allocated.
Future drivers that do need the VMA lock should use
vm_area_alloc()/vm_area_free() to allocate it.

Signed-off-by: Suren Baghdasaryan
---
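To make the driver guidance above concrete, here is a hedged sketch of the
two ways a VMA may now be obtained. The example_* functions are invented for
this illustration and are not part of the series; only vma_init() and
vm_area_alloc() are real kernel APIs:

/*
 * Hypothetical driver code: a stack VMA set up with vma_init() has no
 * vm_lock, so it must never be locked; a VMA that will be inserted into
 * the tree and locked must come from vm_area_alloc().
 */
static void example_stack_vma(struct mm_struct *mm)
{
        struct vm_area_struct pseudo_vma;

        vma_init(&pseudo_vma, mm);      /* vm_lock is NOT allocated here */
        /* use only for lookups/address math; never vma_write_lock() it */
}

static struct vm_area_struct *example_lockable_vma(struct mm_struct *mm)
{
        struct vm_area_struct *vma = vm_area_alloc(mm); /* allocates vm_lock */

        if (!vma)
                return NULL;
        /* ... set up the VMA and insert it into the tree; locking is valid ... */
        return vma;
}

The stack case keeps working precisely because vma_init() no longer touches
the lock at all.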
 include/linux/mm.h       | 25 ++++++------
 include/linux/mm_types.h |  6 ++-
 kernel/fork.c            | 82 ++++++++++++++++++++++++++++------------
 3 files changed, 74 insertions(+), 39 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 50c7a6dd9c7a..d40bf8a5e19e 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -615,11 +615,6 @@ struct vm_operations_struct {
 };
 
 #ifdef CONFIG_PER_VMA_LOCK
-static inline void vma_init_lock(struct vm_area_struct *vma)
-{
-	init_rwsem(&vma->lock);
-	vma->vm_lock_seq = -1;
-}
 
 static inline void vma_write_lock(struct vm_area_struct *vma)
 {
@@ -635,9 +630,9 @@ static inline void vma_write_lock(struct vm_area_struct *vma)
 	if (vma->vm_lock_seq == mm_lock_seq)
 		return;
 
-	down_write(&vma->lock);
+	down_write(&vma->vm_lock->lock);
 	vma->vm_lock_seq = mm_lock_seq;
-	up_write(&vma->lock);
+	up_write(&vma->vm_lock->lock);
 }
 
 /*
@@ -651,17 +646,17 @@ static inline bool vma_read_trylock(struct vm_area_struct *vma)
 	if (vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq))
 		return false;
 
-	if (unlikely(down_read_trylock(&vma->lock) == 0))
+	if (unlikely(down_read_trylock(&vma->vm_lock->lock) == 0))
 		return false;
 
 	/*
 	 * Overflow might produce false locked result.
 	 * False unlocked result is impossible because we modify and check
-	 * vma->vm_lock_seq under vma->lock protection and mm->mm_lock_seq
+	 * vma->vm_lock_seq under vma->vm_lock protection and mm->mm_lock_seq
 	 * modification invalidates all existing locks.
 	 */
 	if (unlikely(vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq))) {
-		up_read(&vma->lock);
+		up_read(&vma->vm_lock->lock);
 		return false;
 	}
 	return true;
@@ -669,7 +664,7 @@ static inline bool vma_read_trylock(struct vm_area_struct *vma)
 
 static inline void vma_read_unlock(struct vm_area_struct *vma)
 {
-	up_read(&vma->lock);
+	up_read(&vma->vm_lock->lock);
 }
 
 static inline void vma_assert_write_locked(struct vm_area_struct *vma)
@@ -684,7 +679,7 @@ static inline void vma_assert_write_locked(struct vm_area_struct *vma)
 
 static inline void vma_assert_no_reader(struct vm_area_struct *vma)
 {
-	VM_BUG_ON_VMA(rwsem_is_locked(&vma->lock) &&
+	VM_BUG_ON_VMA(rwsem_is_locked(&vma->vm_lock->lock) &&
 		      vma->vm_lock_seq != READ_ONCE(vma->vm_mm->mm_lock_seq),
 		      vma);
 }
@@ -694,7 +689,6 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
 
 #else /* CONFIG_PER_VMA_LOCK */
 
-static inline void vma_init_lock(struct vm_area_struct *vma) {}
 static inline void vma_write_lock(struct vm_area_struct *vma) {}
 static inline bool vma_read_trylock(struct vm_area_struct *vma)
 	{ return false; }
@@ -704,6 +698,10 @@ static inline void vma_assert_no_reader(struct vm_area_struct *vma) {}
 
 #endif /* CONFIG_PER_VMA_LOCK */
 
+/*
+ * WARNING: vma_init does not initialize vma->vm_lock.
+ * Use vm_area_alloc()/vm_area_free() if vma needs locking.
+ */
 static inline void vma_init(struct vm_area_struct *vma, struct mm_struct *mm)
 {
 	static const struct vm_operations_struct dummy_vm_ops = {};
@@ -712,7 +710,6 @@ static inline void vma_init(struct vm_area_struct *vma, struct mm_struct *mm)
 	vma->vm_mm = mm;
 	vma->vm_ops = &dummy_vm_ops;
 	INIT_LIST_HEAD(&vma->anon_vma_chain);
-	vma_init_lock(vma);
 }
 
 /* Use when VMA is not part of the VMA tree and needs no locking */
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index c0e6c8e4700b..faa61b400f9b 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -526,6 +526,10 @@ struct anon_vma_name {
 	char name[];
 };
 
+struct vma_lock {
+	struct rw_semaphore lock;
+};
+
 /*
  * This struct describes a virtual memory area. There is one of these
  * per VM-area/task. A VM area is any part of the process virtual memory
@@ -563,7 +567,7 @@ struct vm_area_struct {
 
 #ifdef CONFIG_PER_VMA_LOCK
 	int vm_lock_seq;
-	struct rw_semaphore lock;
+	struct vma_lock *vm_lock;
 #endif
 
 	/*
diff --git a/kernel/fork.c b/kernel/fork.c
index 97f2b751f88d..95db6a521cf1 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -451,40 +451,28 @@ static struct kmem_cache *vm_area_cachep;
 /* SLAB cache for mm_struct structures (tsk->mm) */
 static struct kmem_cache *mm_cachep;
 
-struct vm_area_struct *vm_area_alloc(struct mm_struct *mm)
-{
-	struct vm_area_struct *vma;
+#ifdef CONFIG_PER_VMA_LOCK
 
-	vma = kmem_cache_alloc(vm_area_cachep, GFP_KERNEL);
-	if (vma)
-		vma_init(vma, mm);
-	return vma;
-}
+/* SLAB cache for vm_area_struct.lock */
+static struct kmem_cache *vma_lock_cachep;
 
-struct vm_area_struct *vm_area_dup(struct vm_area_struct *orig)
+static bool vma_init_lock(struct vm_area_struct *vma)
 {
-	struct vm_area_struct *new = kmem_cache_alloc(vm_area_cachep, GFP_KERNEL);
+	vma->vm_lock = kmem_cache_alloc(vma_lock_cachep, GFP_KERNEL);
+	if (!vma->vm_lock)
+		return false;
 
-	if (new) {
-		ASSERT_EXCLUSIVE_WRITER(orig->vm_flags);
-		ASSERT_EXCLUSIVE_WRITER(orig->vm_file);
-		/*
-		 * orig->shared.rb may be modified concurrently, but the clone
-		 * will be reinitialized.
-		 */
-		*new = data_race(*orig);
-		INIT_LIST_HEAD(&new->anon_vma_chain);
-		vma_init_lock(new);
-		dup_anon_vma_name(orig, new);
-	}
-	return new;
+	init_rwsem(&vma->vm_lock->lock);
+	vma->vm_lock_seq = -1;
+
+	return true;
 }
 
-#ifdef CONFIG_PER_VMA_LOCK
 static inline void __vm_area_free(struct vm_area_struct *vma)
 {
 	/* The vma should either have no lock holders or be write-locked. */
 	vma_assert_no_reader(vma);
+	kmem_cache_free(vma_lock_cachep, vma->vm_lock);
 	kmem_cache_free(vm_area_cachep, vma);
 }
 
@@ -540,6 +528,7 @@ void vm_area_free(struct vm_area_struct *vma)
 
 #else /* CONFIG_PER_VMA_LOCK */
 
+static bool vma_init_lock(struct vm_area_struct *vma) { return true; }
 void drain_free_vmas(struct mm_struct *mm) {}
 
 void vm_area_free(struct vm_area_struct *vma)
@@ -550,6 +539,48 @@ void vm_area_free(struct vm_area_struct *vma)
 
 #endif /* CONFIG_PER_VMA_LOCK */
 
+struct vm_area_struct *vm_area_alloc(struct mm_struct *mm)
+{
+	struct vm_area_struct *vma;
+
+	vma = kmem_cache_alloc(vm_area_cachep, GFP_KERNEL);
+	if (!vma)
+		return NULL;
+
+	vma_init(vma, mm);
+	if (!vma_init_lock(vma)) {
+		kmem_cache_free(vm_area_cachep, vma);
+		return NULL;
+	}
+
+	return vma;
+}
+
+struct vm_area_struct *vm_area_dup(struct vm_area_struct *orig)
+{
+	struct vm_area_struct *new;
+
+	new = kmem_cache_alloc(vm_area_cachep, GFP_KERNEL);
+	if (!new)
+		return NULL;
+
+	ASSERT_EXCLUSIVE_WRITER(orig->vm_flags);
+	ASSERT_EXCLUSIVE_WRITER(orig->vm_file);
+	/*
+	 * orig->shared.rb may be modified concurrently, but the clone
+	 * will be reinitialized.
+	 */
+	*new = data_race(*orig);
+	if (!vma_init_lock(new)) {
+		kmem_cache_free(vm_area_cachep, new);
+		return NULL;
+	}
+	INIT_LIST_HEAD(&new->anon_vma_chain);
+	dup_anon_vma_name(orig, new);
+
+	return new;
+}
+
 static void account_kernel_stack(struct task_struct *tsk, int account)
 {
 	if (IS_ENABLED(CONFIG_VMAP_STACK)) {
@@ -3138,6 +3169,9 @@ void __init proc_caches_init(void)
 			NULL);
 
 	vm_area_cachep = KMEM_CACHE(vm_area_struct, SLAB_PANIC|SLAB_ACCOUNT);
+#ifdef CONFIG_PER_VMA_LOCK
+	vma_lock_cachep = KMEM_CACHE(vma_lock, SLAB_PANIC|SLAB_ACCOUNT);
+#endif
 	mmap_init();
 	nsproxy_cache_init();
}
-- 
2.39.0
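For orientation (not part of the patch): KMEM_CACHE() derives the cache's
name, object size, and alignment from the struct itself, so the new hunk in
proc_caches_init() expands to roughly the following.

        /* What KMEM_CACHE(vma_lock, SLAB_PANIC|SLAB_ACCOUNT) boils down to: */
        vma_lock_cachep = kmem_cache_create("vma_lock",
                                            sizeof(struct vma_lock),
                                            __alignof__(struct vma_lock),
                                            SLAB_PANIC | SLAB_ACCOUNT, NULL);

SLAB_PANIC makes boot fail loudly if the cache cannot be created, and
SLAB_ACCOUNT charges allocations to the caller's memory cgroup, matching the
flags already used for vm_area_cachep.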
From nobody Mon Sep 15 21:41:10 2025
Date: Mon, 9 Jan 2023 12:53:36 -0800
In-Reply-To: <20230109205336.3665937-1-surenb@google.com>
References: <20230109205336.3665937-1-surenb@google.com>
Message-ID: <20230109205336.3665937-42-surenb@google.com>
Subject: [PATCH 41/41] mm: replace rw_semaphore with atomic_t in vma_lock
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Cc: michel@lespinasse.org, jglisse@google.com, mhocko@suse.com,
 vbabka@suse.cz, hannes@cmpxchg.org, mgorman@techsingularity.net,
 dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com,
 peterz@infradead.org, ldufour@linux.ibm.com, laurent.dufour@fr.ibm.com,
 paulmck@kernel.org, luto@kernel.org, songliubraving@fb.com,
 peterx@redhat.com, david@redhat.com, dhowells@redhat.com, hughd@google.com,
 bigeasy@linutronix.de, kent.overstreet@linux.dev,
 punit.agrawal@bytedance.com, lstoakes@gmail.com, peterjung1337@gmail.com,
 rientjes@google.com, axelrasmussen@google.com, joelaf@google.com,
 minchan@google.com, jannh@google.com, shakeelb@google.com,
 tatashin@google.com, edumazet@google.com, gthelen@google.com,
 gurua@google.com, arjunroy@google.com, soheil@google.com,
 hughlynch@google.com, leewalsh@google.com, posk@google.com,
 linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org,
 linuxppc-dev@lists.ozlabs.org, x86@kernel.org,
 linux-kernel@vger.kernel.org, kernel-team@android.com, surenb@google.com

rw_semaphore is a sizable structure of 40 bytes and consumes considerable
space in each vm_area_struct. However, vma_lock has two important properties
that allow replacing the rw_semaphore with a simpler structure:

1. Readers never wait. They try to take the vma_lock and fall back to
   mmap_lock if that fails.
2. Only one writer at a time will ever try to write-lock a vma_lock,
   because writers first take mmap_lock in write mode.

Because of these requirements, full rw_semaphore functionality is not
needed and we can replace rw_semaphore with an atomic variable.
When a reader takes the read lock, it increments the atomic, unless the
value is negative. If that fails, read-locking is aborted and mmap_lock is
used instead.

When the writer takes the write lock, it resets the atomic value to -1 if
the current value is 0 (no readers). Since all writers take mmap_lock in
write mode first, there can be only one writer at a time. If there are
readers, the writer will place itself into a wait queue using the new
mm_struct.vma_writer_wait waitqueue head. The last reader to release the
vma_lock will signal the writer to wake up.

vm_lock_seq is also moved into vma_lock, and together with the atomic_t they
pack nicely into 8 bytes, bringing the per-VMA overhead of vma_lock from 44
down to 16 bytes (a 40-byte rw_semaphore plus a 4-byte vm_lock_seq before;
an 8-byte vma_lock object plus the 8-byte vm_lock pointer after):

slabinfo before the changes:
 ... : ...
 vm_area_struct ... 152 53 2 : ...

slabinfo with vma_lock:
 ... : ...
 rw_semaphore ... 8 512 1 : ...
 vm_area_struct ... 160 51 2 : ...

(the three numbers in each row are objsize, objperslab and pagesperslab)

Assuming 40000 vm_area_structs, memory consumption would be:

baseline: 6040kB
vma_lock (vm_area_structs + vma_lock): 6280kB + 316kB = 6596kB
Total increase: 556kB

(the figures follow from slab packing: 40000/53 two-page slabs is about
6040kB, 40000/51 two-page slabs about 6280kB, and 40000/512 one-page slabs
of 8-byte locks about 316kB)

atomic_t might overflow if there are many competing readers, therefore
vma_read_trylock() implements an overflow check and, if overflow occurs,
restores the previous value and exits with a failure to lock.

Signed-off-by: Suren Baghdasaryan
---
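The reader/writer protocol is compact enough to model outside the kernel.
Below is a minimal standalone sketch using C11 atomics and a pthread condvar
standing in for mm->vma_writer_wait. The vlock_* names are invented for this
illustration; the kernel version additionally couples the writer path to
mm_lock_seq and adds the overflow check shown in the diff:

#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

struct vlock {
        atomic_int count;       /* > 0 readers, -1 write-locked, 0 unlocked */
};

static pthread_mutex_t wait_mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t writer_wait = PTHREAD_COND_INITIALIZER;

/* Reader side: never blocks; the caller falls back to the heavy lock. */
static bool vlock_read_trylock(struct vlock *l)
{
        int c = atomic_load(&l->count);

        do {    /* models atomic_inc_unless_negative() */
                if (c < 0)
                        return false;   /* write-locked: refuse to wait */
        } while (!atomic_compare_exchange_weak(&l->count, &c, c + 1));
        return true;
}

static void vlock_read_unlock(struct vlock *l)
{
        /* The last reader out signals a possibly waiting writer. */
        if (atomic_fetch_sub(&l->count, 1) == 1) {
                pthread_mutex_lock(&wait_mtx);
                pthread_cond_signal(&writer_wait);
                pthread_mutex_unlock(&wait_mtx);
        }
}

/*
 * Writer side: the caller is assumed to hold an outer mutex (the
 * mmap_lock equivalent), so at most one thread ever runs this.
 */
static void vlock_write_lock(struct vlock *l)
{
        int expected = 0;

        while (!atomic_compare_exchange_strong(&l->count, &expected, -1)) {
                /* Readers present: sleep until the last one signals us. */
                pthread_mutex_lock(&wait_mtx);
                if (atomic_load(&l->count) != 0)
                        pthread_cond_wait(&writer_wait, &wait_mtx);
                pthread_mutex_unlock(&wait_mtx);
                expected = 0;   /* cmpxchg updated it on failure */
        }
}

static void vlock_write_unlock(struct vlock *l)
{
        atomic_store(&l->count, 0);
}

Because the writer is unique, a plain cmpxchg from 0 to -1 plus a single
wait queue is enough; there is no reader queue at all, which is what lets
the whole lock shrink to one atomic_t plus the sequence number.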
*/ - if (unlikely(vma->vm_lock_seq =3D=3D READ_ONCE(vma->vm_mm->mm_lock_seq)))= { - up_read(&vma->vm_lock->lock); + if (unlikely(vma->vm_lock->lock_seq =3D=3D READ_ONCE(vma->vm_mm->mm_lock_= seq))) { + if (atomic_dec_and_test(&vma->vm_lock->count)) + wake_up(&vma->vm_mm->vma_writer_wait); return false; } return true; @@ -664,7 +676,8 @@ static inline bool vma_read_trylock(struct vm_area_stru= ct *vma) =20 static inline void vma_read_unlock(struct vm_area_struct *vma) { - up_read(&vma->vm_lock->lock); + if (atomic_dec_and_test(&vma->vm_lock->count)) + wake_up(&vma->vm_mm->vma_writer_wait); } =20 static inline void vma_assert_write_locked(struct vm_area_struct *vma) @@ -674,13 +687,13 @@ static inline void vma_assert_write_locked(struct vm_= area_struct *vma) * current task is holding mmap_write_lock, both vma->vm_lock_seq and * mm->mm_lock_seq can't be concurrently modified. */ - VM_BUG_ON_VMA(vma->vm_lock_seq !=3D READ_ONCE(vma->vm_mm->mm_lock_seq), v= ma); + VM_BUG_ON_VMA(vma->vm_lock->lock_seq !=3D READ_ONCE(vma->vm_mm->mm_lock_s= eq), vma); } =20 static inline void vma_assert_no_reader(struct vm_area_struct *vma) { - VM_BUG_ON_VMA(rwsem_is_locked(&vma->vm_lock->lock) && - vma->vm_lock_seq !=3D READ_ONCE(vma->vm_mm->mm_lock_seq), + VM_BUG_ON_VMA(atomic_read(&vma->vm_lock->count) > 0 && + vma->vm_lock->lock_seq !=3D READ_ONCE(vma->vm_mm->mm_lock_seq), vma); } =20 diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index faa61b400f9b..a6050c38ca2e 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -527,7 +527,13 @@ struct anon_vma_name { }; =20 struct vma_lock { - struct rw_semaphore lock; + /* + * count > 0 =3D=3D> read-locked with 'count' number of readers + * count < 0 =3D=3D> write-locked + * count =3D 0 =3D=3D> unlocked + */ + atomic_t count; + int lock_seq; }; =20 /* @@ -566,7 +572,6 @@ struct vm_area_struct { unsigned long vm_flags; =20 #ifdef CONFIG_PER_VMA_LOCK - int vm_lock_seq; struct vma_lock *vm_lock; #endif =20 @@ -706,6 +711,7 @@ struct mm_struct { * by mmlist_lock */ #ifdef CONFIG_PER_VMA_LOCK + struct wait_queue_head vma_writer_wait; int mm_lock_seq; struct { struct list_head head; diff --git a/kernel/fork.c b/kernel/fork.c index 95db6a521cf1..b221ad182d98 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -461,9 +461,8 @@ static bool vma_init_lock(struct vm_area_struct *vma) vma->vm_lock =3D kmem_cache_alloc(vma_lock_cachep, GFP_KERNEL); if (!vma->vm_lock) return false; - - init_rwsem(&vma->vm_lock->lock); - vma->vm_lock_seq =3D -1; + atomic_set(&vma->vm_lock->count, 0); + vma->vm_lock->lock_seq =3D -1; =20 return true; } @@ -1229,6 +1228,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm= , struct task_struct *p, mmap_init_lock(mm); INIT_LIST_HEAD(&mm->mmlist); #ifdef CONFIG_PER_VMA_LOCK + init_waitqueue_head(&mm->vma_writer_wait); WRITE_ONCE(mm->mm_lock_seq, 0); INIT_LIST_HEAD(&mm->vma_free_list.head); spin_lock_init(&mm->vma_free_list.lock); diff --git a/mm/init-mm.c b/mm/init-mm.c index b53d23c2d7a3..0088e31e5f7e 100644 --- a/mm/init-mm.c +++ b/mm/init-mm.c @@ -38,6 +38,8 @@ struct mm_struct init_mm =3D { .arg_lock =3D __SPIN_LOCK_UNLOCKED(init_mm.arg_lock), .mmlist =3D LIST_HEAD_INIT(init_mm.mmlist), #ifdef CONFIG_PER_VMA_LOCK + .vma_writer_wait =3D + __WAIT_QUEUE_HEAD_INITIALIZER(init_mm.vma_writer_wait), .mm_lock_seq =3D 0, .vma_free_list.head =3D LIST_HEAD_INIT(init_mm.vma_free_list.head), .vma_free_list.lock =3D __SPIN_LOCK_UNLOCKED(init_mm.vma_free_list.lock), --=20 2.39.0