From nobody Sat Feb  7 23:50:16 2026
Received: from mail-oa1-f73.google.com (mail-oa1-f73.google.com
 [209.85.160.73])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 22BD5212F97
	for <linux-kernel@vger.kernel.org>; Fri,  6 Dec 2024 22:52:09 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
 arc=none smtp.client-ip=209.85.160.73
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1733525532; cv=none;
 b=DS9P4xnIb+TRLwnay5oNn+q3JWZOoI38gmdh1hT+3RGOnfIa+IK6/OT+iQ4BqlDJF6FlPUpgzyUpzAo1HF3dGLRcZG0VDQ/wZSpD3O2c/tqmPcMIASmwY6HbYWOcmIhJF7vnuxExXLkX+hEKa1kYEzN9t1YV7/0CVgzxkIbxcdU=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1733525532; c=relaxed/simple;
	bh=2j1BzXPRuP+UhOQHpNJNISGx3R6QvDIWF84DWtljS3M=;
	h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From:
	 To:Cc:Content-Type;
 b=JXhSqG+gKbhucOWdeAHTVle191OzWCZfLA1w5L9FM85fzwaSTUadQdp4K4wZ/qReQwOrsZ45gCsJomJlpdl5hJrV+5AsWwVBurZY/3TK+I6NXntUzOZ611hpso/b2HWdCZdW22j9swTDdh2J8GVtd6Bs0ANDQWYh9dfkeI4q6M4=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
 dmarc=pass (p=reject dis=none) header.from=google.com;
 spf=pass smtp.mailfrom=flex--surenb.bounces.google.com;
 dkim=pass (2048-bit key) header.d=google.com header.i=@google.com
 header.b=cvxbFSAz; arc=none smtp.client-ip=209.85.160.73
Authentication-Results: smtp.subspace.kernel.org;
 dmarc=pass (p=reject dis=none) header.from=google.com
Authentication-Results: smtp.subspace.kernel.org;
 spf=pass smtp.mailfrom=flex--surenb.bounces.google.com
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=google.com header.i=@google.com
 header.b="cvxbFSAz"
Received: by mail-oa1-f73.google.com with SMTP id
 586e51a60fabf-29e4d2c28ffso2316052fac.3
        for <linux-kernel@vger.kernel.org>;
 Fri, 06 Dec 2024 14:52:09 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=google.com; s=20230601; t=1733525529; x=1734130329;
 darn=vger.kernel.org;
        h=cc:to:from:subject:message-id:references:mime-version:in-reply-to
         :date:from:to:cc:subject:date:message-id:reply-to;
        bh=2XVlPHpEZtu9xIw4Qap8alaO3+SGpEYwVQX0iy/v7kQ=;
        b=cvxbFSAzZLVRrz8SUlDqgxhO83eb/7TedstGzq+UVNvkXX5ZepC6ZLryjDTCEzs4e8
         xp/INd9YzpTgg2DQ31jIx9RNC2xe6cHmntP046FqHuXZ3WHQhC6wxRZviw6cAzNAG/4N
         KFjGVJZCf5w6BxI8qsexkwuDcl1sqoq7lYxJtDrWabgkCGnc5J+bB0f6oI3Adibg0LTT
         YU3iaqA/favaw5L3HkT3h0tdsvnLBJV2yhyAVC39oTxQPDhbu5/PIv8asP00cAPLh4R8
         ra7uLfARMuaB75oJ2sR/+cb68b76/IPRKCA5SUQD+asic7ucQwE3Bq3b+Fa/h9jTok4G
         6LWg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20230601; t=1733525529; x=1734130329;
        h=cc:to:from:subject:message-id:references:mime-version:in-reply-to
         :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to;
        bh=2XVlPHpEZtu9xIw4Qap8alaO3+SGpEYwVQX0iy/v7kQ=;
        b=jNxIN77tGvWZ06EP2oK7GtjEtlmjQlUq1RAufVOb22NevRTNAqB83eHz+E42upVrZ0
         Mg4eWIFFuaNnR8Z6DqKpwyoX0zSkNBQjHRNqVeOUEBrg28wbtmAVpI/dajp+TEghbcxV
         4dWPwG3quAS5FH1MoXAVO0MR9oect/v7nCEwtP11OTVSF0xvEIRmo0GwEoNgivJ6h9zQ
         HaEwfP045Y1DwgFIy8leUpq6jZ5RBeoZdn6bi3cd1DqoGl/BL13Fa5DpJk1qfpdbHnnU
         6c5CpyBdxj2WCn5fAHtDWDhTJMdVp4v/OdWPn4hrYA2uUtMmPZMUMssLcXAoJ42TkC74
         ncvA==
X-Forwarded-Encrypted: i=1;
 AJvYcCVd7XOZA1WI6wP4wJ+WsXWCdObnZ5altnDy2c1X5gPaklFngXAezyOv4e2eiejlF+bnqAOJ/ByDIjMWLzc=@vger.kernel.org
X-Gm-Message-State: AOJu0YwTPNocbf3qoMAP1VPFIPCHep8GUET63NufF/6ybR8XCvAJXH53
	oHfDAHDlMmKpSgiTTJ0H3bngd5A29Lcfj0pKiIi7NZ3LPPDpOQ2SUKrkF6S6wHuu5k03fejIgI4
	wGA==
X-Google-Smtp-Source: 
 AGHT+IE3XhAvDScGGmsUTdbbULESWT7yEme/wNWVflifCM9618rjWWcRcIta7LlU8v32EA39xFZwUHCJlPc=
X-Received: from oacpd2.prod.google.com
 ([2002:a05:6871:7a82:b0:297:2483:5994])
 (user=surenb job=prod-delivery.src-stubby-dispatcher) by
 2002:a05:6808:e8c:b0:3ea:5161:f71
 with SMTP id 5614622812f47-3eb19ce52a7mr4865569b6e.20.1733525529219; Fri, 06
 Dec 2024 14:52:09 -0800 (PST)
Date: Fri,  6 Dec 2024 14:51:58 -0800
In-Reply-To: <20241206225204.4008261-1-surenb@google.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id: <linux-kernel.vger.kernel.org>
List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org>
Mime-Version: 1.0
References: <20241206225204.4008261-1-surenb@google.com>
X-Mailer: git-send-email 2.47.0.338.g60cca15819-goog
Message-ID: <20241206225204.4008261-2-surenb@google.com>
Subject: [PATCH v5 1/6] mm: introduce vma_start_read_locked{_nested} helpers
From: Suren Baghdasaryan <surenb@google.com>
To: akpm@linux-foundation.org
Cc: willy@infradead.org, liam.howlett@oracle.com, lorenzo.stoakes@oracle.com,
	mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mjguzik@gmail.com,
	oliver.sang@intel.com, mgorman@techsingularity.net, david@redhat.com,
	peterx@redhat.com, oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org,
	brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com, hughd@google.com,
	minchan@google.com, jannh@google.com, shakeel.butt@linux.dev,
	souravpanda@google.com, pasha.tatashin@soleen.com, corbet@lwn.net,
	linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	kernel-team@android.com, surenb@google.com
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Introduce helper functions which can be used to read-lock a VMA when
holding mmap_lock for read. Replace direct accesses to vma->vm_lock
with these new helpers.

Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
---
 include/linux/mm.h | 24 ++++++++++++++++++++++++
 mm/userfaultfd.c   | 22 +++++-----------------
 2 files changed, 29 insertions(+), 17 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 187e42339d8e..c4a001972223 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -734,6 +734,30 @@ static inline bool vma_start_read(struct vm_area_struc=
t *vma)
 	return true;
 }
=20
+/*
+ * Use only while holding mmap read lock which guarantees that locking wil=
l not
+ * fail (nobody can concurrently write-lock the vma). vma_start_read() sho=
uld
+ * not be used in such cases because it might fail due to mm_lock_seq over=
flow.
+ * This functionality is used to obtain vma read lock and drop the mmap re=
ad lock.
+ */
+static inline void vma_start_read_locked_nested(struct vm_area_struct *vma=
, int subclass)
+{
+	mmap_assert_locked(vma->vm_mm);
+	down_read_nested(&vma->vm_lock->lock, subclass);
+}
+
+/*
+ * Use only while holding mmap read lock which guarantees that locking wil=
l not
+ * fail (nobody can concurrently write-lock the vma). vma_start_read() sho=
uld
+ * not be used in such cases because it might fail due to mm_lock_seq over=
flow.
+ * This functionality is used to obtain vma read lock and drop the mmap re=
ad lock.
+ */
+static inline void vma_start_read_locked(struct vm_area_struct *vma)
+{
+	mmap_assert_locked(vma->vm_mm);
+	down_read(&vma->vm_lock->lock);
+}
+
 static inline void vma_end_read(struct vm_area_struct *vma)
 {
 	rcu_read_lock(); /* keeps vma alive till the end of up_read */
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 8e16dc290ddf..bc9a66ec6a6e 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -84,16 +84,8 @@ static struct vm_area_struct *uffd_lock_vma(struct mm_st=
ruct *mm,
=20
 	mmap_read_lock(mm);
 	vma =3D find_vma_and_prepare_anon(mm, address);
-	if (!IS_ERR(vma)) {
-		/*
-		 * We cannot use vma_start_read() as it may fail due to
-		 * false locked (see comment in vma_start_read()). We
-		 * can avoid that by directly locking vm_lock under
-		 * mmap_lock, which guarantees that nobody can lock the
-		 * vma for write (vma_start_write()) under us.
-		 */
-		down_read(&vma->vm_lock->lock);
-	}
+	if (!IS_ERR(vma))
+		vma_start_read_locked(vma);
=20
 	mmap_read_unlock(mm);
 	return vma;
@@ -1491,14 +1483,10 @@ static int uffd_move_lock(struct mm_struct *mm,
 	mmap_read_lock(mm);
 	err =3D find_vmas_mm_locked(mm, dst_start, src_start, dst_vmap, src_vmap);
 	if (!err) {
-		/*
-		 * See comment in uffd_lock_vma() as to why not using
-		 * vma_start_read() here.
-		 */
-		down_read(&(*dst_vmap)->vm_lock->lock);
+		vma_start_read_locked(*dst_vmap);
 		if (*dst_vmap !=3D *src_vmap)
-			down_read_nested(&(*src_vmap)->vm_lock->lock,
-					 SINGLE_DEPTH_NESTING);
+			vma_start_read_locked_nested(*src_vmap,
+						SINGLE_DEPTH_NESTING);
 	}
 	mmap_read_unlock(mm);
 	return err;
--=20
2.47.0.338.g60cca15819-goog
From nobody Sat Feb  7 23:50:16 2026
Received: from mail-pj1-f73.google.com (mail-pj1-f73.google.com
 [209.85.216.73])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id ADC74212FAD
	for <linux-kernel@vger.kernel.org>; Fri,  6 Dec 2024 22:52:11 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
 arc=none smtp.client-ip=209.85.216.73
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1733525533; cv=none;
 b=tNbVCWLOZqmxUBqV9Kwv4GKSUduFAcTUxNoVWZNI7YGDR/sr3cX+GTpnQQeYJAi+W34udfrt6OQXUjZSH68qL1x/ElPycoL56qANi1dMwIEwRTiKsi+EV8KFExrPSC+DYAam69MDQqd9avxBIe2gWTkKVGSpeaeKXUvwqCc2iJw=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1733525533; c=relaxed/simple;
	bh=yZ8WrAPC+u3xQ+9DFVXyDSsUiIJb0rdw94bWQHV4ShE=;
	h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From:
	 To:Cc:Content-Type;
 b=gg382s3PCFDoTafxZJRNyrYzkPlPVKeuTcnwKQyhg/Q+b++lonPUEW1znSDFoOr6S4p3OQrh8MxTg/6KTKqN6AMJ9xS1yhnynlAPVCdFDf7+KPT/Rs+7ZdmGwh7MzDx8ciNZ7l2Tn5YJBW457nwqARDfJNb0qIrwntckhZGRpOY=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
 dmarc=pass (p=reject dis=none) header.from=google.com;
 spf=pass smtp.mailfrom=flex--surenb.bounces.google.com;
 dkim=pass (2048-bit key) header.d=google.com header.i=@google.com
 header.b=bJwZ5Hri; arc=none smtp.client-ip=209.85.216.73
Authentication-Results: smtp.subspace.kernel.org;
 dmarc=pass (p=reject dis=none) header.from=google.com
Authentication-Results: smtp.subspace.kernel.org;
 spf=pass smtp.mailfrom=flex--surenb.bounces.google.com
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=google.com header.i=@google.com
 header.b="bJwZ5Hri"
Received: by mail-pj1-f73.google.com with SMTP id
 98e67ed59e1d1-2ef7fbd99a6so275870a91.1
        for <linux-kernel@vger.kernel.org>;
 Fri, 06 Dec 2024 14:52:11 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=google.com; s=20230601; t=1733525531; x=1734130331;
 darn=vger.kernel.org;
        h=cc:to:from:subject:message-id:references:mime-version:in-reply-to
         :date:from:to:cc:subject:date:message-id:reply-to;
        bh=rT8Mwg3YVngq88fQDqUxdC7bmsa9/fkDJKquyXN45js=;
        b=bJwZ5HriGjjdNPgISWnRim3Cn9vFNVGBbNNdY8udrwIak7RdLIUYDurNrj1wfn68IE
         eM2NcQJTMP+JFbxo0YqA4geVzPpANwUy/4Srp13g88sNFTSZL5zO6u9c++edonGdtFHT
         qBh3iVNVzmwUBm3csVXiUjeSQ+RPFPi44ep5IwG6hVwRhY2Nmfz2sdYg8K2MKPlURfMw
         lvbJlVfFF9UriGey4M5XLAmECgtOSsrKP8UN0zc3yS9yWuf4JLSXaJBfOI7r+zSSW5IU
         qA8Z+igkAAVdY4Q3lgtL5mqxSqGK58Q3eroPG/gw5wVy0TfcsVf/0Yfxs/t8jFjR+jF1
         bfMQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20230601; t=1733525531; x=1734130331;
        h=cc:to:from:subject:message-id:references:mime-version:in-reply-to
         :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to;
        bh=rT8Mwg3YVngq88fQDqUxdC7bmsa9/fkDJKquyXN45js=;
        b=swr1NuWqegS5qrCyMVclAicr9tP6oCSeyj/CzE7nhfPYoboez679d9w3/ubJJGaki1
         PSWPmuVaddrGIVrIW3RoW0eeJ4veipy9ZE6zdVC9fRNit2pzJsGK23J4mfz2860jr69O
         sAE6KGjKTJr3HXMIMHoObYiAuVSmgE7iE8r4mf3Ss0/X5+wFfJ/YLr9WNk9Pfe7GeJ9c
         uAayeRlUxWdJUE3yFIph2NzBXicQvr/0xF0wmsRfbU8eLY77gq2HcevCnZ0+RziRUPf/
         +5SBz4ADRX7OYCXYqm+vjGqIZgMc/CIsI4CX0T0KO/bXW65uhkMgTYJFK2G5aZefQJ28
         5q0w==
X-Forwarded-Encrypted: i=1;
 AJvYcCXUN6FBcsFcIeSOBKoXXLsXSA+inikwJ/kvU1OsIm5XFrte0QXD/PpisawtkiN4hKxiT3jUqpeLI4MNdmg=@vger.kernel.org
X-Gm-Message-State: AOJu0YzqC37+nvwrXTV0mtVnSBrqiqo89FYItUyXIP/3KJR3EAqvZsGB
	btltcFEzAL6sJRXqsEKjg0Cnc9r8O2BEPXg27fdAwj2dyiJ2RbkUKMyVNhSCgWbV2MKBwoI7eck
	amw==
X-Google-Smtp-Source: 
 AGHT+IFlxdPNqWtYnG8mmAY9quNf9jbf25IzCiv9883/eI2U6H6bU+wM6Ogc0koGE2fA7VPMtISD4bXqCdc=
X-Received: from pjl13.prod.google.com ([2002:a17:90b:2f8d:b0:2e0:aba3:662a])
 (user=surenb job=prod-delivery.src-stubby-dispatcher) by
 2002:a17:90a:ca97:b0:2ee:6d08:7936
 with SMTP id 98e67ed59e1d1-2ef6a6c10cemr7355055a91.20.1733525531141; Fri, 06
 Dec 2024 14:52:11 -0800 (PST)
Date: Fri,  6 Dec 2024 14:51:59 -0800
In-Reply-To: <20241206225204.4008261-1-surenb@google.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id: <linux-kernel.vger.kernel.org>
List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org>
Mime-Version: 1.0
References: <20241206225204.4008261-1-surenb@google.com>
X-Mailer: git-send-email 2.47.0.338.g60cca15819-goog
Message-ID: <20241206225204.4008261-3-surenb@google.com>
Subject: [PATCH v5 2/6] mm: move per-vma lock into vm_area_struct
From: Suren Baghdasaryan <surenb@google.com>
To: akpm@linux-foundation.org
Cc: willy@infradead.org, liam.howlett@oracle.com, lorenzo.stoakes@oracle.com,
	mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mjguzik@gmail.com,
	oliver.sang@intel.com, mgorman@techsingularity.net, david@redhat.com,
	peterx@redhat.com, oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org,
	brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com, hughd@google.com,
	minchan@google.com, jannh@google.com, shakeel.butt@linux.dev,
	souravpanda@google.com, pasha.tatashin@soleen.com, corbet@lwn.net,
	linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	kernel-team@android.com, surenb@google.com
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Back when per-vma locks were introduced, vm_lock was moved out of
vm_area_struct in [1] because of the performance regression caused by
false cacheline sharing. Recent investigation [2] revealed that the
regression is limited to a rather old Broadwell microarchitecture and
even there it can be mitigated by disabling adjacent cacheline
prefetching, see [3].
Splitting a single logical structure into multiple ones leads to more
complicated management, extra pointer dereferences and overall less
maintainable code. When that split-away part is a lock, it complicates
things even further. With no performance benefits, there are no reasons
for this split. Merging the vm_lock back into vm_area_struct also allows
vm_area_struct to use SLAB_TYPESAFE_BY_RCU later in this patchset.
Move vm_lock back into vm_area_struct, aligning it at the cacheline
boundary and changing the cache to be cacheline-aligned as well.
With a kernel compiled using defconfig, this causes VMA memory
consumption to grow from 160 (vm_area_struct) + 40 (vm_lock) bytes to
256 bytes:

    slabinfo before:
     <name>           ... <objsize> <objperslab> <pagesperslab> : ...
     vma_lock         ...     40  102    1 : ...
     vm_area_struct   ...    160   51    2 : ...

    slabinfo after moving vm_lock:
     <name>           ... <objsize> <objperslab> <pagesperslab> : ...
     vm_area_struct   ...    256   32    2 : ...

Aggregate VMA memory consumption per 1000 VMAs grows from 50 to 64 pages,
which is 5.5MB per 100000 VMAs. Note that the size of this structure is
dependent on the kernel configuration and typically the original size is
higher than 160 bytes. Therefore these calculations are close to the
worst case scenario. A more realistic vm_area_struct usage before this
change is:

     <name>           ... <objsize> <objperslab> <pagesperslab> : ...
     vma_lock         ...     40  102    1 : ...
     vm_area_struct   ...    176   46    2 : ...

Aggregate VMA memory consumption per 1000 VMAs grows from 54 to 64 pages,
which is 3.9MB per 100000 VMAs.
This memory consumption growth can be addressed later by optimizing the
vm_lock.
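
The page counts above can be reproduced with a small sketch. This is
illustrative userspace C, not kernel code: slab_pages(), pages_before()
and pages_after() are hypothetical names, and the only inputs are the
<objperslab> and <pagesperslab> columns quoted above.

```c
/*
 * Pages consumed by n objects in a slab cache, using the <objperslab>
 * and <pagesperslab> columns from slabinfo. Partially filled slabs are
 * counted whole, hence the round-up.
 */
static unsigned long slab_pages(unsigned long n, unsigned long objperslab,
				unsigned long pagesperslab)
{
	return (n + objperslab - 1) / objperslab * pagesperslab;
}

/* Separate vma_lock cache (40 bytes, 102 per 1-page slab) plus VMAs. */
static unsigned long pages_before(unsigned long n, unsigned long vma_objs)
{
	return slab_pages(n, 102, 1) + slab_pages(n, vma_objs, 2);
}

/* Single 256-byte vm_area_struct, 32 per 2-page slab. */
static unsigned long pages_after(unsigned long n)
{
	return slab_pages(n, 32, 2);
}
```

Assuming 4 KiB pages, pages_before(1000, 51) gives 50 and
pages_after(1000) gives 64. Scaling the 14-page difference by 100 gives
1400 pages, about 5.5MB per 100000 VMAs. With 46 objects per slab the
starting figure is 54 pages and the difference is about 3.9MB.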

[1] https://lore.kernel.org/all/20230227173632.3292573-34-surenb@google.com/
[2] https://lore.kernel.org/all/ZsQyI%2F087V34JoIt@xsang-OptiPlex-9020/
[3] https://lore.kernel.org/all/CAJuCfpEisU8Lfe96AYJDZ+OM4NoPmnw9bP53cT_kbf=
P_pR+-2g@mail.gmail.com/

Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
---
 include/linux/mm.h               | 28 ++++++++++--------
 include/linux/mm_types.h         |  6 ++--
 kernel/fork.c                    | 49 ++++----------------------------
 tools/testing/vma/vma_internal.h | 33 +++++----------------
 4 files changed, 32 insertions(+), 84 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index c4a001972223..ee71a504ef88 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -696,6 +696,12 @@ static inline void vma_numab_state_free(struct vm_area=
_struct *vma) {}
 #endif /* CONFIG_NUMA_BALANCING */
=20
 #ifdef CONFIG_PER_VMA_LOCK
+static inline void vma_lock_init(struct vm_area_struct *vma)
+{
+	init_rwsem(&vma->vm_lock.lock);
+	vma->vm_lock_seq =3D UINT_MAX;
+}
+
 /*
  * Try to read-lock a vma. The function is allowed to occasionally yield f=
alse
  * locked result to avoid performance overhead, in which case we fall back=
 to
@@ -713,7 +719,7 @@ static inline bool vma_start_read(struct vm_area_struct=
 *vma)
 	if (READ_ONCE(vma->vm_lock_seq) =3D=3D READ_ONCE(vma->vm_mm->mm_lock_seq.=
sequence))
 		return false;
=20
-	if (unlikely(down_read_trylock(&vma->vm_lock->lock) =3D=3D 0))
+	if (unlikely(down_read_trylock(&vma->vm_lock.lock) =3D=3D 0))
 		return false;
=20
 	/*
@@ -728,7 +734,7 @@ static inline bool vma_start_read(struct vm_area_struct=
 *vma)
 	 * This pairs with RELEASE semantics in vma_end_write_all().
 	 */
 	if (unlikely(vma->vm_lock_seq =3D=3D raw_read_seqcount(&vma->vm_mm->mm_lo=
ck_seq))) {
-		up_read(&vma->vm_lock->lock);
+		up_read(&vma->vm_lock.lock);
 		return false;
 	}
 	return true;
@@ -743,7 +749,7 @@ static inline bool vma_start_read(struct vm_area_struct=
 *vma)
 static inline void vma_start_read_locked_nested(struct vm_area_struct *vma=
, int subclass)
 {
 	mmap_assert_locked(vma->vm_mm);
-	down_read_nested(&vma->vm_lock->lock, subclass);
+	down_read_nested(&vma->vm_lock.lock, subclass);
 }
=20
 /*
@@ -755,13 +761,13 @@ static inline void vma_start_read_locked_nested(struc=
t vm_area_struct *vma, int
 static inline void vma_start_read_locked(struct vm_area_struct *vma)
 {
 	mmap_assert_locked(vma->vm_mm);
-	down_read(&vma->vm_lock->lock);
+	down_read(&vma->vm_lock.lock);
 }
=20
 static inline void vma_end_read(struct vm_area_struct *vma)
 {
 	rcu_read_lock(); /* keeps vma alive till the end of up_read */
-	up_read(&vma->vm_lock->lock);
+	up_read(&vma->vm_lock.lock);
 	rcu_read_unlock();
 }
=20
@@ -790,7 +796,7 @@ static inline void vma_start_write(struct vm_area_struc=
t *vma)
 	if (__is_vma_write_locked(vma, &mm_lock_seq))
 		return;
=20
-	down_write(&vma->vm_lock->lock);
+	down_write(&vma->vm_lock.lock);
 	/*
 	 * We should use WRITE_ONCE() here because we can have concurrent reads
 	 * from the early lockless pessimistic check in vma_start_read().
@@ -798,7 +804,7 @@ static inline void vma_start_write(struct vm_area_struc=
t *vma)
 	 * we should use WRITE_ONCE() for cleanliness and to keep KCSAN happy.
 	 */
 	WRITE_ONCE(vma->vm_lock_seq, mm_lock_seq);
-	up_write(&vma->vm_lock->lock);
+	up_write(&vma->vm_lock.lock);
 }
=20
 static inline void vma_assert_write_locked(struct vm_area_struct *vma)
@@ -810,7 +816,7 @@ static inline void vma_assert_write_locked(struct vm_ar=
ea_struct *vma)
=20
 static inline void vma_assert_locked(struct vm_area_struct *vma)
 {
-	if (!rwsem_is_locked(&vma->vm_lock->lock))
+	if (!rwsem_is_locked(&vma->vm_lock.lock))
 		vma_assert_write_locked(vma);
 }
=20
@@ -843,6 +849,7 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_str=
uct *mm,
=20
 #else /* CONFIG_PER_VMA_LOCK */
=20
+static inline void vma_lock_init(struct vm_area_struct *vma) {}
 static inline bool vma_start_read(struct vm_area_struct *vma)
 		{ return false; }
 static inline void vma_end_read(struct vm_area_struct *vma) {}
@@ -877,10 +884,6 @@ static inline void assert_fault_locked(struct vm_fault=
 *vmf)
=20
 extern const struct vm_operations_struct vma_dummy_vm_ops;
=20
-/*
- * WARNING: vma_init does not initialize vma->vm_lock.
- * Use vm_area_alloc()/vm_area_free() if vma needs locking.
- */
 static inline void vma_init(struct vm_area_struct *vma, struct mm_struct *=
mm)
 {
 	memset(vma, 0, sizeof(*vma));
@@ -889,6 +892,7 @@ static inline void vma_init(struct vm_area_struct *vma,=
 struct mm_struct *mm)
 	INIT_LIST_HEAD(&vma->anon_vma_chain);
 	vma_mark_detached(vma, false);
 	vma_numab_state_init(vma);
+	vma_lock_init(vma);
 }
=20
 /* Use when VMA is not part of the VMA tree and needs no locking */
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 706b3c926a08..be3551654325 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -700,8 +700,6 @@ struct vm_area_struct {
 	 * slowpath.
 	 */
 	unsigned int vm_lock_seq;
-	/* Unstable RCU readers are allowed to read this. */
-	struct vma_lock *vm_lock;
 #endif
=20
 	/*
@@ -754,6 +752,10 @@ struct vm_area_struct {
 	struct vma_numab_state *numab_state;	/* NUMA Balancing state */
 #endif
 	struct vm_userfaultfd_ctx vm_userfaultfd_ctx;
+#ifdef CONFIG_PER_VMA_LOCK
+	/* Unstable RCU readers are allowed to read this. */
+	struct vma_lock vm_lock ____cacheline_aligned_in_smp;
+#endif
 } __randomize_layout;
=20
 #ifdef CONFIG_NUMA
diff --git a/kernel/fork.c b/kernel/fork.c
index 37055b4c30fb..21660a9ad97a 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -436,35 +436,6 @@ static struct kmem_cache *vm_area_cachep;
 /* SLAB cache for mm_struct structures (tsk->mm) */
 static struct kmem_cache *mm_cachep;
=20
-#ifdef CONFIG_PER_VMA_LOCK
-
-/* SLAB cache for vm_area_struct.lock */
-static struct kmem_cache *vma_lock_cachep;
-
-static bool vma_lock_alloc(struct vm_area_struct *vma)
-{
-	vma->vm_lock =3D kmem_cache_alloc(vma_lock_cachep, GFP_KERNEL);
-	if (!vma->vm_lock)
-		return false;
-
-	init_rwsem(&vma->vm_lock->lock);
-	vma->vm_lock_seq =3D UINT_MAX;
-
-	return true;
-}
-
-static inline void vma_lock_free(struct vm_area_struct *vma)
-{
-	kmem_cache_free(vma_lock_cachep, vma->vm_lock);
-}
-
-#else /* CONFIG_PER_VMA_LOCK */
-
-static inline bool vma_lock_alloc(struct vm_area_struct *vma) { return tru=
e; }
-static inline void vma_lock_free(struct vm_area_struct *vma) {}
-
-#endif /* CONFIG_PER_VMA_LOCK */
-
 struct vm_area_struct *vm_area_alloc(struct mm_struct *mm)
 {
 	struct vm_area_struct *vma;
@@ -474,10 +445,6 @@ struct vm_area_struct *vm_area_alloc(struct mm_struct =
*mm)
 		return NULL;
=20
 	vma_init(vma, mm);
-	if (!vma_lock_alloc(vma)) {
-		kmem_cache_free(vm_area_cachep, vma);
-		return NULL;
-	}
=20
 	return vma;
 }
@@ -496,10 +463,7 @@ struct vm_area_struct *vm_area_dup(struct vm_area_stru=
ct *orig)
 	 * will be reinitialized.
 	 */
 	data_race(memcpy(new, orig, sizeof(*new)));
-	if (!vma_lock_alloc(new)) {
-		kmem_cache_free(vm_area_cachep, new);
-		return NULL;
-	}
+	vma_lock_init(new);
 	INIT_LIST_HEAD(&new->anon_vma_chain);
 	vma_numab_state_init(new);
 	dup_anon_vma_name(orig, new);
@@ -511,7 +475,6 @@ void __vm_area_free(struct vm_area_struct *vma)
 {
 	vma_numab_state_free(vma);
 	free_anon_vma_name(vma);
-	vma_lock_free(vma);
 	kmem_cache_free(vm_area_cachep, vma);
 }
=20
@@ -522,7 +485,7 @@ static void vm_area_free_rcu_cb(struct rcu_head *head)
 						  vm_rcu);
=20
 	/* The vma should not be locked while being destroyed. */
-	VM_BUG_ON_VMA(rwsem_is_locked(&vma->vm_lock->lock), vma);
+	VM_BUG_ON_VMA(rwsem_is_locked(&vma->vm_lock.lock), vma);
 	__vm_area_free(vma);
 }
 #endif
@@ -3190,11 +3153,9 @@ void __init proc_caches_init(void)
 			sizeof(struct fs_struct), 0,
 			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_ACCOUNT,
 			NULL);
-
-	vm_area_cachep =3D KMEM_CACHE(vm_area_struct, SLAB_PANIC|SLAB_ACCOUNT);
-#ifdef CONFIG_PER_VMA_LOCK
-	vma_lock_cachep =3D KMEM_CACHE(vma_lock, SLAB_PANIC|SLAB_ACCOUNT);
-#endif
+	vm_area_cachep =3D KMEM_CACHE(vm_area_struct,
+			SLAB_HWCACHE_ALIGN|SLAB_NO_MERGE|SLAB_PANIC|
+			SLAB_ACCOUNT);
 	mmap_init();
 	nsproxy_cache_init();
 }
diff --git a/tools/testing/vma/vma_internal.h b/tools/testing/vma/vma_inter=
nal.h
index b973b3e41c83..568c18d24d53 100644
--- a/tools/testing/vma/vma_internal.h
+++ b/tools/testing/vma/vma_internal.h
@@ -270,10 +270,10 @@ struct vm_area_struct {
 	/*
 	 * Can only be written (using WRITE_ONCE()) while holding both:
 	 *  - mmap_lock (in write mode)
-	 *  - vm_lock->lock (in write mode)
+	 *  - vm_lock.lock (in write mode)
 	 * Can be read reliably while holding one of:
 	 *  - mmap_lock (in read or write mode)
-	 *  - vm_lock->lock (in read or write mode)
+	 *  - vm_lock.lock (in read or write mode)
 	 * Can be read unreliably (using READ_ONCE()) for pessimistic bailout
 	 * while holding nothing (except RCU to keep the VMA struct allocated).
 	 *
@@ -282,7 +282,7 @@ struct vm_area_struct {
 	 * slowpath.
 	 */
 	unsigned int vm_lock_seq;
-	struct vma_lock *vm_lock;
+	struct vma_lock vm_lock;
 #endif
=20
 	/*
@@ -459,17 +459,10 @@ static inline struct vm_area_struct *vma_next(struct =
vma_iterator *vmi)
 	return mas_find(&vmi->mas, ULONG_MAX);
 }
=20
-static inline bool vma_lock_alloc(struct vm_area_struct *vma)
+static inline void vma_lock_init(struct vm_area_struct *vma)
 {
-	vma->vm_lock =3D calloc(1, sizeof(struct vma_lock));
-
-	if (!vma->vm_lock)
-		return false;
-
-	init_rwsem(&vma->vm_lock->lock);
+	init_rwsem(&vma->vm_lock.lock);
 	vma->vm_lock_seq =3D UINT_MAX;
-
-	return true;
 }
=20
 static inline void vma_assert_write_locked(struct vm_area_struct *);
@@ -492,6 +485,7 @@ static inline void vma_init(struct vm_area_struct *vma,=
 struct mm_struct *mm)
 	vma->vm_ops =3D &vma_dummy_vm_ops;
 	INIT_LIST_HEAD(&vma->anon_vma_chain);
 	vma_mark_detached(vma, false);
+	vma_lock_init(vma);
 }
=20
 static inline struct vm_area_struct *vm_area_alloc(struct mm_struct *mm)
@@ -502,10 +496,6 @@ static inline struct vm_area_struct *vm_area_alloc(str=
uct mm_struct *mm)
 		return NULL;
=20
 	vma_init(vma, mm);
-	if (!vma_lock_alloc(vma)) {
-		free(vma);
-		return NULL;
-	}
=20
 	return vma;
 }
@@ -518,10 +508,7 @@ static inline struct vm_area_struct *vm_area_dup(struc=
t vm_area_struct *orig)
 		return NULL;
=20
 	memcpy(new, orig, sizeof(*new));
-	if (!vma_lock_alloc(new)) {
-		free(new);
-		return NULL;
-	}
+	vma_lock_init(new);
 	INIT_LIST_HEAD(&new->anon_vma_chain);
=20
 	return new;
@@ -691,14 +678,8 @@ static inline void mpol_put(struct mempolicy *)
 {
 }
=20
-static inline void vma_lock_free(struct vm_area_struct *vma)
-{
-	free(vma->vm_lock);
-}
-
 static inline void __vm_area_free(struct vm_area_struct *vma)
 {
-	vma_lock_free(vma);
 	free(vma);
 }
=20
--=20
2.47.0.338.g60cca15819-goog
From nobody Sat Feb  7 23:50:16 2026
Received: from mail-pj1-f73.google.com (mail-pj1-f73.google.com
 [209.85.216.73])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9368C213E64
	for <linux-kernel@vger.kernel.org>; Fri,  6 Dec 2024 22:52:13 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
 arc=none smtp.client-ip=209.85.216.73
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1733525535; cv=none;
 b=TQfoqZccY2Y+ewyxI15gZsEe1ZRnbywes/yjDe0F1spfwXdONbESBapozIHnwSaQto1wcQ1/LiTGa0PMWoab+EweKgZfhMMuka1swgXDfF1vvQCffQ7Vf+MrzAWXpUkcuV6WwpmlRMjYJpzjyijWN+CFxEWRhgA2SwAPc1xL0oE=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1733525535; c=relaxed/simple;
	bh=XcgvtnPc0ZtjSlSdejhrnXMo9S3SS+IQQAWD3AyzvMI=;
	h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From:
	 To:Cc:Content-Type;
 b=QCY5egwjgazOu3C8GFgjpz6PdlL8SAMaRvj7asQh8MboyfRUyrFJn+feYorsuI8QlgR+liBLOxqUHRvrEk2a1sB6nOLPmaZgmoYHDzGpaxKX4ZqQYBss7xXS9BGRKwZ1zmIPdERi1NdVixqbA/Kyl8S19NKNbP1hPS3BZdagkXM=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
 dmarc=pass (p=reject dis=none) header.from=google.com;
 spf=pass smtp.mailfrom=flex--surenb.bounces.google.com;
 dkim=pass (2048-bit key) header.d=google.com header.i=@google.com
 header.b=1xafacCX; arc=none smtp.client-ip=209.85.216.73
Authentication-Results: smtp.subspace.kernel.org;
 dmarc=pass (p=reject dis=none) header.from=google.com
Authentication-Results: smtp.subspace.kernel.org;
 spf=pass smtp.mailfrom=flex--surenb.bounces.google.com
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=google.com header.i=@google.com
 header.b="1xafacCX"
Received: by mail-pj1-f73.google.com with SMTP id
 98e67ed59e1d1-2ee86953aeaso2515896a91.2
        for <linux-kernel@vger.kernel.org>;
 Fri, 06 Dec 2024 14:52:13 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=google.com; s=20230601; t=1733525533; x=1734130333;
 darn=vger.kernel.org;
        h=cc:to:from:subject:message-id:references:mime-version:in-reply-to
         :date:from:to:cc:subject:date:message-id:reply-to;
        bh=/Y40baPUGWax4wYRQAtfppaknEh3um9zBwyBL9xV1Y4=;
        b=1xafacCXcH8ZKA3IN1fS+FompP40vgnyM3qlXuDjILXY0yTwKBF3RfXNpdvWR+zXLa
         rnA4E5a5bcKYxP9ePw1PyDoF8IsmpI/jRXj3VKgJx3Se5FEasW48lpqwJNVGAjP+vde9
         jwhisL3eTVbcmSdeUbMXeH4SHMfMtfdHNAZXMpBQzUeABFs28BC7rVdNMFRKoebmG52U
         bg7mpIfBShyJjcuMCRrinwpcUvvH9P/FJQgVcTUUDXptHRcfjpqQPCufPJrUNxGHM/Xi
         4GLZL5cfKBH0ZmA9aQoOFTsjo9PwrYNsFTzNn1uuoWIyy59kFh0sjZI5x7d2JjkyNwH5
         exNQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20230601; t=1733525533; x=1734130333;
        h=cc:to:from:subject:message-id:references:mime-version:in-reply-to
         :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to;
        bh=/Y40baPUGWax4wYRQAtfppaknEh3um9zBwyBL9xV1Y4=;
        b=p9SJzCvn+5HJr1TwKGmZ96/rJUJNSZJ5i2mR6GC9qxiSmJadGgZL5Bt6oYOqpkPbrm
         NQ9knhDN/3tBdJCDxZczlbNa8ULaAsPmmZMhVEffZv/xPdT2MaUmj4QPXmuAOFAPlE7V
         x9R0/uJPnZcmJzsoPJpcrkskxwCVHvK4wW19NoNcM44AGQSAmnVBiMhEy4ZnfMEn0dXj
         KrwWbwM2vgD8YpMkAZN5TMgf8T+skcijMhTCkHCcqDw+E1rYSTmkiN291Da2obbqFCpT
         jE9pBiwf0iqiOgbGLpEhVxHMZucSkQ+FdKEDzHDwVMyb/JLd4xw5+2pGSTX5e4TGnbfr
         jFfQ==
X-Forwarded-Encrypted: i=1;
 AJvYcCV6zUJZhr03c/pOdN15orVlhujck5wiS29o91Nx0IQbIjVrAiMqpX3TiPzOETTe/RmFvL3UWKi8aTd7YxQ=@vger.kernel.org
X-Gm-Message-State: AOJu0YzNytxtlIjnnSqf71IQNngXvpKfOTHA82tep9hcE5L2xaMXWFEw
	EfJ+UjE2E3a4MsOpqdGLQ1RKLS6ZgLatutmuXC9FNqrN1dc7Cqu+CKydYWMc3eHy1qjOSCep8pI
	RAg==
X-Google-Smtp-Source: 
 AGHT+IHCd2OHlcBdUGfCCxX6+VS5I4FF5b82AKepICjOdPKRrERfe6SXg3caG61pXASigp7AKy6oYeA6ikU=
X-Received: from pjbmf7.prod.google.com ([2002:a17:90b:1847:b0:2e1:8750:2b46])
 (user=surenb job=prod-delivery.src-stubby-dispatcher) by
 2002:a17:90b:2d43:b0:2ee:ba84:5cac
 with SMTP id 98e67ed59e1d1-2ef6955f863mr7322434a91.7.1733525533068; Fri, 06
 Dec 2024 14:52:13 -0800 (PST)
Date: Fri,  6 Dec 2024 14:52:00 -0800
In-Reply-To: <20241206225204.4008261-1-surenb@google.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id: <linux-kernel.vger.kernel.org>
List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org>
Mime-Version: 1.0
References: <20241206225204.4008261-1-surenb@google.com>
X-Mailer: git-send-email 2.47.0.338.g60cca15819-goog
Message-ID: <20241206225204.4008261-4-surenb@google.com>
Subject: [PATCH v5 3/6] mm: mark vma as detached until it's added into vma
 tree
From: Suren Baghdasaryan <surenb@google.com>
To: akpm@linux-foundation.org
Cc: willy@infradead.org, liam.howlett@oracle.com, lorenzo.stoakes@oracle.com,
	mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mjguzik@gmail.com,
	oliver.sang@intel.com, mgorman@techsingularity.net, david@redhat.com,
	peterx@redhat.com, oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org,
	brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com, hughd@google.com,
	minchan@google.com, jannh@google.com, shakeel.butt@linux.dev,
	souravpanda@google.com, pasha.tatashin@soleen.com, corbet@lwn.net,
	linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	kernel-team@android.com, surenb@google.com
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

The current implementation does not set the detached flag when a VMA is
first allocated. This does not represent the real state of the VMA,
which is detached until it is added into the mm's VMA tree. Fix this by
marking new VMAs as detached and resetting the detached flag only after
the VMA is added into a tree.

Introduce vma_mark_attached() to make the API more readable and to
simplify a possible future cleanup in which vma->vm_mm might be used to
indicate a detached vma and vma_mark_attached() would need an additional
mm parameter.

Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
---
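Note for readers: the lifecycle described in the commit message can be
sketched as a small userspace model. This is only an illustration under
stated assumptions, not the kernel code: struct toy_vma, struct toy_tree
and every toy_* helper are hypothetical stand-ins, and the write_locked
field merely models the vma write-lock being held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct vm_area_struct; not the kernel type. */
struct toy_vma {
	unsigned long start, end;
	bool detached;
	bool write_locked;	/* models the vma write-lock being held */
};

/* Hypothetical stand-in for the mm's VMA tree: a single slot. */
struct toy_tree {
	struct toy_vma *slot;
};

/* A freshly initialized vma is not in any tree, so it starts detached. */
static void toy_vma_init(struct toy_vma *vma, unsigned long start,
			 unsigned long end)
{
	vma->start = start;
	vma->end = end;
	vma->write_locked = false;
	vma->detached = true;
}

static void toy_vma_mark_attached(struct toy_vma *vma)
{
	vma->detached = false;
}

static void toy_vma_mark_detached(struct toy_vma *vma)
{
	/* Detaching requires the write lock, as in the patch. */
	assert(vma->write_locked);
	vma->detached = true;
}

/* Storing into the tree is the only place that clears the flag. */
static void toy_tree_store(struct toy_tree *tree, struct toy_vma *vma)
{
	tree->slot = vma;
	toy_vma_mark_attached(vma);
}

/* Removal takes the (modelled) write lock before marking detached. */
static void toy_tree_remove(struct toy_tree *tree, struct toy_vma *vma)
{
	vma->write_locked = true;
	toy_vma_mark_detached(vma);
	tree->slot = NULL;
}
```

In this model a vma reports detached from toy_vma_init() until
toy_tree_store() runs, which mirrors why the patch moves the
vma_mark_attached() call into the vma_iter_store*() helpers.
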
 include/linux/mm.h               | 27 ++++++++++++++++++++-------
 kernel/fork.c                    |  4 ++++
 mm/memory.c                      |  2 +-
 mm/vma.c                         |  6 +++---
 mm/vma.h                         |  2 ++
 tools/testing/vma/vma_internal.h | 17 ++++++++++++-----
 6 files changed, 42 insertions(+), 16 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index ee71a504ef88..2bf38c1e9cca 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -820,12 +820,21 @@ static inline void vma_assert_locked(struct vm_area_s=
truct *vma)
 		vma_assert_write_locked(vma);
 }
=20
-static inline void vma_mark_detached(struct vm_area_struct *vma, bool deta=
ched)
+static inline void vma_mark_attached(struct vm_area_struct *vma)
+{
+	vma->detached =3D false;
+}
+
+static inline void vma_mark_detached(struct vm_area_struct *vma)
 {
 	/* When detaching vma should be write-locked */
-	if (detached)
-		vma_assert_write_locked(vma);
-	vma->detached =3D detached;
+	vma_assert_write_locked(vma);
+	vma->detached =3D true;
+}
+
+static inline bool is_vma_detached(struct vm_area_struct *vma)
+{
+	return vma->detached;
 }
=20
 static inline void release_fault_lock(struct vm_fault *vmf)
@@ -856,8 +865,8 @@ static inline void vma_end_read(struct vm_area_struct *=
vma) {}
 static inline void vma_start_write(struct vm_area_struct *vma) {}
 static inline void vma_assert_write_locked(struct vm_area_struct *vma)
 		{ mmap_assert_write_locked(vma->vm_mm); }
-static inline void vma_mark_detached(struct vm_area_struct *vma,
-				     bool detached) {}
+static inline void vma_mark_attached(struct vm_area_struct *vma) {}
+static inline void vma_mark_detached(struct vm_area_struct *vma) {}
=20
 static inline struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *=
mm,
 		unsigned long address)
@@ -890,7 +899,10 @@ static inline void vma_init(struct vm_area_struct *vma=
, struct mm_struct *mm)
 	vma->vm_mm =3D mm;
 	vma->vm_ops =3D &vma_dummy_vm_ops;
 	INIT_LIST_HEAD(&vma->anon_vma_chain);
-	vma_mark_detached(vma, false);
+#ifdef CONFIG_PER_VMA_LOCK
+	/* vma is not locked, can't use vma_mark_detached() */
+	vma->detached =3D true;
+#endif
 	vma_numab_state_init(vma);
 	vma_lock_init(vma);
 }
@@ -1085,6 +1097,7 @@ static inline int vma_iter_bulk_store(struct vma_iter=
ator *vmi,
 	if (unlikely(mas_is_err(&vmi->mas)))
 		return -ENOMEM;
=20
+	vma_mark_attached(vma);
 	return 0;
 }
=20
diff --git a/kernel/fork.c b/kernel/fork.c
index 21660a9ad97a..71990f46aa4e 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -465,6 +465,10 @@ struct vm_area_struct *vm_area_dup(struct vm_area_stru=
ct *orig)
 	data_race(memcpy(new, orig, sizeof(*new)));
 	vma_lock_init(new);
 	INIT_LIST_HEAD(&new->anon_vma_chain);
+#ifdef CONFIG_PER_VMA_LOCK
+	/* vma is not locked, can't use vma_mark_detached() */
+	new->detached =3D true;
+#endif
 	vma_numab_state_init(new);
 	dup_anon_vma_name(orig, new);
=20
diff --git a/mm/memory.c b/mm/memory.c
index f823906a4a0f..b252f19b28c9 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -6372,7 +6372,7 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_s=
truct *mm,
 		goto inval;
=20
 	/* Check if the VMA got isolated after we found it */
-	if (vma->detached) {
+	if (is_vma_detached(vma)) {
 		vma_end_read(vma);
 		count_vm_vma_lock_event(VMA_LOCK_MISS);
 		/* The area was replaced with another one */
diff --git a/mm/vma.c b/mm/vma.c
index a06747845cac..cdc63728f47f 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -327,7 +327,7 @@ static void vma_complete(struct vma_prepare *vp, struct=
 vma_iterator *vmi,
=20
 	if (vp->remove) {
 again:
-		vma_mark_detached(vp->remove, true);
+		vma_mark_detached(vp->remove);
 		if (vp->file) {
 			uprobe_munmap(vp->remove, vp->remove->vm_start,
 				      vp->remove->vm_end);
@@ -1220,7 +1220,7 @@ static void reattach_vmas(struct ma_state *mas_detach)
=20
 	mas_set(mas_detach, 0);
 	mas_for_each(mas_detach, vma, ULONG_MAX)
-		vma_mark_detached(vma, false);
+		vma_mark_attached(vma);
=20
 	__mt_destroy(mas_detach->tree);
 }
@@ -1295,7 +1295,7 @@ static int vms_gather_munmap_vmas(struct vma_munmap_s=
truct *vms,
 		if (error)
 			goto munmap_gather_failed;
=20
-		vma_mark_detached(next, true);
+		vma_mark_detached(next);
 		nrpages =3D vma_pages(next);
=20
 		vms->nr_pages +=3D nrpages;
diff --git a/mm/vma.h b/mm/vma.h
index 295d44ea54db..32d99b2963df 100644
--- a/mm/vma.h
+++ b/mm/vma.h
@@ -156,6 +156,7 @@ static inline int vma_iter_store_gfp(struct vma_iterato=
r *vmi,
 	if (unlikely(mas_is_err(&vmi->mas)))
 		return -ENOMEM;
=20
+	vma_mark_attached(vma);
 	return 0;
 }
=20
@@ -385,6 +386,7 @@ static inline void vma_iter_store(struct vma_iterator *=
vmi,
=20
 	__mas_set_range(&vmi->mas, vma->vm_start, vma->vm_end - 1);
 	mas_store_prealloc(&vmi->mas, vma);
+	vma_mark_attached(vma);
 }
=20
 static inline unsigned long vma_iter_addr(struct vma_iterator *vmi)
diff --git a/tools/testing/vma/vma_internal.h b/tools/testing/vma/vma_inter=
nal.h
index 568c18d24d53..0cdc5f8c3d60 100644
--- a/tools/testing/vma/vma_internal.h
+++ b/tools/testing/vma/vma_internal.h
@@ -465,13 +465,17 @@ static inline void vma_lock_init(struct vm_area_struc=
t *vma)
 	vma->vm_lock_seq =3D UINT_MAX;
 }
=20
+static inline void vma_mark_attached(struct vm_area_struct *vma)
+{
+	vma->detached =3D false;
+}
+
 static inline void vma_assert_write_locked(struct vm_area_struct *);
-static inline void vma_mark_detached(struct vm_area_struct *vma, bool deta=
ched)
+static inline void vma_mark_detached(struct vm_area_struct *vma)
 {
 	/* When detaching vma should be write-locked */
-	if (detached)
-		vma_assert_write_locked(vma);
-	vma->detached =3D detached;
+	vma_assert_write_locked(vma);
+	vma->detached =3D true;
 }
=20
 extern const struct vm_operations_struct vma_dummy_vm_ops;
@@ -484,7 +488,8 @@ static inline void vma_init(struct vm_area_struct *vma,=
 struct mm_struct *mm)
 	vma->vm_mm =3D mm;
 	vma->vm_ops =3D &vma_dummy_vm_ops;
 	INIT_LIST_HEAD(&vma->anon_vma_chain);
-	vma_mark_detached(vma, false);
+	/* vma is not locked, can't use vma_mark_detached() */
+	vma->detached =3D true;
 	vma_lock_init(vma);
 }
=20
@@ -510,6 +515,8 @@ static inline struct vm_area_struct *vm_area_dup(struct=
 vm_area_struct *orig)
 	memcpy(new, orig, sizeof(*new));
 	vma_lock_init(new);
 	INIT_LIST_HEAD(&new->anon_vma_chain);
+	/* vma is not locked, can't use vma_mark_detached() */
+	new->detached =3D true;
=20
 	return new;
 }
--=20
2.47.0.338.g60cca15819-goog
From nobody Sat Feb  7 23:50:16 2026
Received: from mail-pl1-f202.google.com (mail-pl1-f202.google.com
 [209.85.214.202])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id A0F83213E81
	for <linux-kernel@vger.kernel.org>; Fri,  6 Dec 2024 22:52:15 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
 arc=none smtp.client-ip=209.85.214.202
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1733525537; cv=none;
 b=lI5MeuiNPZmGz6WxM33+oiKLpmOiXIdiDJ1ebmbxGa6uSv+G6A6TzXoad87SPcwMRYHP4aNl5Q0YGlYdP79fW4SeECaOWZsYzh9j+wAgda91ZwBxXHShDouHFg7xbkQWAmavY4HbKHQu/mnKw4Tz7hWrOwx21MEtzCLyJnTnADk=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1733525537; c=relaxed/simple;
	bh=+Y7GyVaJVqChVxJXlxPgNwWTN8ML6a/45dk/ORefJis=;
	h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From:
	 To:Cc:Content-Type;
 b=mgybtyFldZAWE62ajQRhNaKu+zulCCxF2zY/ohOAlhfoSLY1zGX6N4iOskACFiRNR2zACJaEfVW90tqL9d0akdsqTF9F8AtjddujiwaZTPxMI1wDTGcets0eNNZ9Hmzl3SBgyVzIDiFM1GWi6uDWaNJjN4r1ADiITyeCRrGuJa4=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
 dmarc=pass (p=reject dis=none) header.from=google.com;
 spf=pass smtp.mailfrom=flex--surenb.bounces.google.com;
 dkim=pass (2048-bit key) header.d=google.com header.i=@google.com
 header.b=H0eTDpMW; arc=none smtp.client-ip=209.85.214.202
Authentication-Results: smtp.subspace.kernel.org;
 dmarc=pass (p=reject dis=none) header.from=google.com
Authentication-Results: smtp.subspace.kernel.org;
 spf=pass smtp.mailfrom=flex--surenb.bounces.google.com
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=google.com header.i=@google.com
 header.b="H0eTDpMW"
Received: by mail-pl1-f202.google.com with SMTP id
 d9443c01a7336-2153861c470so24349505ad.2
        for <linux-kernel@vger.kernel.org>;
 Fri, 06 Dec 2024 14:52:15 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=google.com; s=20230601; t=1733525535; x=1734130335;
 darn=vger.kernel.org;
        h=cc:to:from:subject:message-id:references:mime-version:in-reply-to
         :date:from:to:cc:subject:date:message-id:reply-to;
        bh=7xWkv895t2vkmivZEcKc0VmEj0Py63DFJG/LYdHvb3Y=;
        b=H0eTDpMWTpNe9+PmAf6rdiLr2nAVZpFt9a904UXcC3aDI/OyjdXD66vW2wcClUoEqd
         6c1gMfJlSMTiGJueleHSfCm3Z1M+tH18VFNFoyMXS0qvAflIyKPTtxSuKyu3iPP6Tbnu
         deQ1LrF8NeO7RZDgyj3p82SbPdAmKw6xDrJ5HW6egatu8uMbtVT8VRyRRdlFn8lrIqED
         dy1i/IlOMvK5mpNFRJSEWFzwlIniU88R+8fIBXyVd+SMn9f9Lmz1ji2TPMTJDCnxHxNq
         3Fso37oz5paQdLFKc2UHqjdHZ19DLG1jGs/zOJQXJzLcLrL9WmsZZQeJZDycLZkuQEz3
         Gn7Q==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20230601; t=1733525535; x=1734130335;
        h=cc:to:from:subject:message-id:references:mime-version:in-reply-to
         :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to;
        bh=7xWkv895t2vkmivZEcKc0VmEj0Py63DFJG/LYdHvb3Y=;
        b=tAyh/rlESZACQougv+LAYseLwouhTyYZ2ezpzNvUV10XB3ZN8pFnrAskfOuE2l95Vt
         0miTTnzlmeOXjxgfOgWy9DMgqXsd+R7a2YzF+SiaxCToghDEg965Rkt095MY++bV9U7B
         fO7rJWZ8qFrI3bazJuHk2b1x9FYbHz6ZVUAo3P2eoE35BAQDICCQ6SOD8jdZBR6KWIDm
         FnOcIVY/RwYmvs44VVcPJfOFRc6gnLej12KcXCIfxkN8zA6JcGy7nLDbQfDL0xPjFyUB
         kqVaDhHyPUTPPbEgzIVFBQkwGJUhRCwq9aSzxYiEfbCce4Wt1I45cvOg4XnY1EO/xJAO
         f56w==
X-Forwarded-Encrypted: i=1;
 AJvYcCX8fPb/nisSLsjYMlnECK/hN+tfvqlPGjUKJMeuzGWd7GyQ7UxuY15RskUdKHwpN6scZ3VWqCIF8JGEQms=@vger.kernel.org
X-Gm-Message-State: AOJu0YxjPxXyQAN6Gur4CEXyej6spm7TQL3/Y4Uly+SoDfQfbdoY3/7u
	eInzrx4jNJpN5dQBkK4qXvHFEeAaiS43cBXl7GoJAgiy3ZfDQvYyZnP6mVkp4sty1r0JUVHaOv5
	glA==
X-Google-Smtp-Source: 
 AGHT+IE5HYCr7W0ZnBekrRI/mwxgnpYPkvPquVys2NZfqfzmdtEe1gkkS7IzDbSltL7UMNDRR9lW/VjGZAs=
X-Received: from plbkd6.prod.google.com ([2002:a17:903:13c6:b0:20c:526b:44a8])
 (user=surenb job=prod-delivery.src-stubby-dispatcher) by
 2002:a17:902:fc86:b0:215:9091:4f57
 with SMTP id d9443c01a7336-21614dac170mr62965795ad.43.1733525534992; Fri, 06
 Dec 2024 14:52:14 -0800 (PST)
Date: Fri,  6 Dec 2024 14:52:01 -0800
In-Reply-To: <20241206225204.4008261-1-surenb@google.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id: <linux-kernel.vger.kernel.org>
List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org>
Mime-Version: 1.0
References: <20241206225204.4008261-1-surenb@google.com>
X-Mailer: git-send-email 2.47.0.338.g60cca15819-goog
Message-ID: <20241206225204.4008261-5-surenb@google.com>
Subject: [PATCH v5 4/6] mm: make vma cache SLAB_TYPESAFE_BY_RCU
From: Suren Baghdasaryan <surenb@google.com>
To: akpm@linux-foundation.org
Cc: willy@infradead.org, liam.howlett@oracle.com, lorenzo.stoakes@oracle.com,
	mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mjguzik@gmail.com,
	oliver.sang@intel.com, mgorman@techsingularity.net, david@redhat.com,
	peterx@redhat.com, oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org,
	brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com, hughd@google.com,
	minchan@google.com, jannh@google.com, shakeel.butt@linux.dev,
	souravpanda@google.com, pasha.tatashin@soleen.com, corbet@lwn.net,
	linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	kernel-team@android.com, surenb@google.com
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

To enable SLAB_TYPESAFE_BY_RCU for the vma cache, we need to ensure that
object reuse before the RCU grace period is over will be detected inside
lock_vma_under_rcu().

lock_vma_under_rcu() enters an RCU read section, finds the vma at the
given address, locks the vma and checks whether it got detached or
remapped to cover a different address range. These last checks ensure
that the vma was not modified after we found it but before we locked it.

vma reuse introduces several new possibilities:
1. vma can be reused after it was found but before it is locked;
2. vma can be reused and reinitialized (including changing its vm_mm)
   while being locked in vma_start_read();
3. vma can be reused and reinitialized after it was found but before
   it is locked, then attached at a new address or to a new mm while
   read-locked.

For case #1, the current checks detect the cases when:
- vma was reused but not yet added into the tree (detached check);
- vma was reused at a different address range (address check).
We are missing a check of vm_mm to ensure the reused vma was not
attached to a different mm. This patch adds the missing check.

For case #2, we pass mm to vma_start_read() to prevent access to an
unstable vma->vm_mm. This might lead to vma_start_read() returning
a false locked result, but that is not critical if it is rare because
it only leads to a retry under mmap_lock.

For case #3, we enforce the order in which the vma->detached flag and
vm_start/vm_end/vm_mm are set and checked. A vma gets attached after
vm_start/vm_end/vm_mm are set, and lock_vma_under_rcu() should check
vma->detached before checking vm_start/vm_end/vm_mm. This is required
because attaching a vma happens without the vma write-lock, as opposed
to detaching, which requires the vma write-lock. This patch adds the
memory barriers inside is_vma_detached() and vma_mark_attached() needed
to order reads and writes of vma->detached against
vm_start/vm_end/vm_mm.

After these provisions, SLAB_TYPESAFE_BY_RCU is added to vm_area_cachep.
This facilitates vm_area_struct reuse and minimizes the number of
call_rcu() calls.

Signed-off-by: Suren Baghdasaryan <surenb@google.com>
---
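Note for readers: the ordering argument for case #3 can be sketched in
userspace with C11 atomics. This is a rough model under stated
assumptions, not the kernel implementation: struct toy_mm, struct
toy_vma, toy_attach() and toy_lookup_ok() are hypothetical names, the
C11 release/acquire fences are a somewhat stronger analogue of
smp_wmb()/smp_rmb(), and the vma read lock taken by vma_start_read() is
left out entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins; not the kernel's structures. */
struct toy_mm { int id; };

struct toy_vma {
	struct toy_mm *mm;
	unsigned long start, end;
	atomic_bool detached;
};

/* Writer: set all attributes first, then publish by clearing detached. */
static void toy_attach(struct toy_vma *vma, struct toy_mm *mm,
		       unsigned long start, unsigned long end)
{
	assert(start < end);
	vma->mm = mm;
	vma->start = start;
	vma->end = end;
	/* Plays the role of smp_wmb() in vma_mark_attached(). */
	atomic_thread_fence(memory_order_release);
	/* Plays the role of WRITE_ONCE(vma->detached, false). */
	atomic_store_explicit(&vma->detached, false, memory_order_relaxed);
}

/* Reader: check detached first, then validate mm and address range. */
static bool toy_lookup_ok(struct toy_vma *vma, struct toy_mm *mm,
			  unsigned long address)
{
	/* Plays the role of READ_ONCE(vma->detached). */
	bool detached = atomic_load_explicit(&vma->detached,
					     memory_order_relaxed);
	/* Plays the role of smp_rmb() in is_vma_detached(). */
	atomic_thread_fence(memory_order_acquire);

	if (detached)
		return false;	/* recycled but not yet reattached */
	if (vma->mm != mm)
		return false;	/* reattached to a different mm */
	if (address < vma->start || address >= vma->end)
		return false;	/* reattached at a different range */
	return true;
}
```

If the reader observes detached as false, the fence pairing means it
also observes the mm and range values written before the flag was
cleared, so the mm and address checks run against the attributes of the
attachment it saw rather than stale ones.
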
 include/linux/mm.h               |  36 +++++--
 include/linux/mm_types.h         |  10 +-
 include/linux/slab.h             |   6 --
 kernel/fork.c                    | 157 +++++++++++++++++++++++++------
 mm/memory.c                      |  15 ++-
 mm/vma.c                         |   2 +-
 tools/testing/vma/vma_internal.h |   7 +-
 7 files changed, 179 insertions(+), 54 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 2bf38c1e9cca..3568bcbc7c81 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -257,7 +257,7 @@ struct vm_area_struct *vm_area_alloc(struct mm_struct *=
);
 struct vm_area_struct *vm_area_dup(struct vm_area_struct *);
 void vm_area_free(struct vm_area_struct *);
 /* Use only if VMA has no other users */
-void __vm_area_free(struct vm_area_struct *vma);
+void vm_area_free_unreachable(struct vm_area_struct *vma);
=20
 #ifndef CONFIG_MMU
 extern struct rb_root nommu_region_tree;
@@ -706,8 +706,10 @@ static inline void vma_lock_init(struct vm_area_struct=
 *vma)
  * Try to read-lock a vma. The function is allowed to occasionally yield f=
alse
  * locked result to avoid performance overhead, in which case we fall back=
 to
  * using mmap_lock. The function should never yield false unlocked result.
+ * False locked result is possible if mm_lock_seq overflows or if vma gets
+ * reused and attached to a different mm before we lock it.
  */
-static inline bool vma_start_read(struct vm_area_struct *vma)
+static inline bool vma_start_read(struct mm_struct *mm, struct vm_area_str=
uct *vma)
 {
 	/*
 	 * Check before locking. A race might cause false locked result.
@@ -716,7 +718,7 @@ static inline bool vma_start_read(struct vm_area_struct=
 *vma)
 	 * we don't rely on for anything - the mm_lock_seq read against which we
 	 * need ordering is below.
 	 */
-	if (READ_ONCE(vma->vm_lock_seq) =3D=3D READ_ONCE(vma->vm_mm->mm_lock_seq.=
sequence))
+	if (READ_ONCE(vma->vm_lock_seq) =3D=3D READ_ONCE(mm->mm_lock_seq.sequence=
))
 		return false;
=20
 	if (unlikely(down_read_trylock(&vma->vm_lock.lock) =3D=3D 0))
@@ -733,7 +735,7 @@ static inline bool vma_start_read(struct vm_area_struct=
 *vma)
 	 * after it has been unlocked.
 	 * This pairs with RELEASE semantics in vma_end_write_all().
 	 */
-	if (unlikely(vma->vm_lock_seq =3D=3D raw_read_seqcount(&vma->vm_mm->mm_lo=
ck_seq))) {
+	if (unlikely(vma->vm_lock_seq =3D=3D raw_read_seqcount(&mm->mm_lock_seq))=
) {
 		up_read(&vma->vm_lock.lock);
 		return false;
 	}
@@ -822,7 +824,15 @@ static inline void vma_assert_locked(struct vm_area_st=
ruct *vma)
=20
 static inline void vma_mark_attached(struct vm_area_struct *vma)
 {
-	vma->detached =3D false;
+	/*
+	 * This pairs with smp_rmb() inside is_vma_detached().
+	 * vma is marked attached after all vma modifications are done and it
+	 * got added into the vma tree. All prior vma modifications should be
+	 * made visible before marking the vma attached.
+	 */
+	smp_wmb();
+	/* This pairs with READ_ONCE() in is_vma_detached(). */
+	WRITE_ONCE(vma->detached, false);
 }
=20
 static inline void vma_mark_detached(struct vm_area_struct *vma)
@@ -834,7 +844,18 @@ static inline void vma_mark_detached(struct vm_area_st=
ruct *vma)
=20
 static inline bool is_vma_detached(struct vm_area_struct *vma)
 {
-	return vma->detached;
+	bool detached;
+
+	/* This pairs with WRITE_ONCE() in vma_mark_attached(). */
+	detached =3D READ_ONCE(vma->detached);
+	/*
+	 * This pairs with smp_wmb() inside vma_mark_attached() to ensure
+	 * vma->detached is read before vma attributes read later inside
+	 * lock_vma_under_rcu().
+	 */
+	smp_rmb();
+
+	return detached;
 }
=20
 static inline void release_fault_lock(struct vm_fault *vmf)
@@ -859,7 +880,7 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_str=
uct *mm,
 #else /* CONFIG_PER_VMA_LOCK */
=20
 static inline void vma_lock_init(struct vm_area_struct *vma) {}
-static inline bool vma_start_read(struct vm_area_struct *vma)
+static inline bool vma_start_read(struct mm_struct *mm, struct vm_area_str=
uct *vma)
 		{ return false; }
 static inline void vma_end_read(struct vm_area_struct *vma) {}
 static inline void vma_start_write(struct vm_area_struct *vma) {}
@@ -893,6 +914,7 @@ static inline void assert_fault_locked(struct vm_fault =
*vmf)
=20
 extern const struct vm_operations_struct vma_dummy_vm_ops;
=20
+/* Use on VMAs not created using vm_area_alloc() */
 static inline void vma_init(struct vm_area_struct *vma, struct mm_struct *=
mm)
 {
 	memset(vma, 0, sizeof(*vma));
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index be3551654325..5d8779997266 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -543,6 +543,12 @@ static inline void *folio_get_private(struct folio *fo=
lio)
=20
 typedef unsigned long vm_flags_t;
=20
+/*
+ * freeptr_t represents a SLUB freelist pointer, which might be encoded
+ * and not dereferenceable if CONFIG_SLAB_FREELIST_HARDENED is enabled.
+ */
+typedef struct { unsigned long v; } freeptr_t;
+
 /*
  * A region containing a mapping of a non-memory backed file under NOMMU
  * conditions.  These are held in a global tree and are pinned by the VMAs=
 that
@@ -657,9 +663,7 @@ struct vm_area_struct {
 			unsigned long vm_start;
 			unsigned long vm_end;
 		};
-#ifdef CONFIG_PER_VMA_LOCK
-		struct rcu_head vm_rcu;	/* Used for deferred freeing. */
-#endif
+		freeptr_t vm_freeptr; /* Pointer used by SLAB_TYPESAFE_BY_RCU */
 	};
=20
 	/*
diff --git a/include/linux/slab.h b/include/linux/slab.h
index 10a971c2bde3..681b685b6c4e 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -234,12 +234,6 @@ enum _slab_flag_bits {
 #define SLAB_NO_OBJ_EXT		__SLAB_FLAG_UNUSED
 #endif
=20
-/*
- * freeptr_t represents a SLUB freelist pointer, which might be encoded
- * and not dereferenceable if CONFIG_SLAB_FREELIST_HARDENED is enabled.
- */
-typedef struct { unsigned long v; } freeptr_t;
-
 /*
  * ZERO_SIZE_PTR will be returned for zero sized kmalloc requests.
  *
diff --git a/kernel/fork.c b/kernel/fork.c
index 71990f46aa4e..e7e76a660e4c 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -436,6 +436,98 @@ static struct kmem_cache *vm_area_cachep;
 /* SLAB cache for mm_struct structures (tsk->mm) */
 static struct kmem_cache *mm_cachep;
=20
+static void vm_area_ctor(void *data)
+{
+	struct vm_area_struct *vma =3D (struct vm_area_struct *)data;
+
+#ifdef CONFIG_PER_VMA_LOCK
+	/* vma is not locked, can't use vma_mark_detached() */
+	vma->detached =3D true;
+#endif
+	INIT_LIST_HEAD(&vma->anon_vma_chain);
+	vma_lock_init(vma);
+}
+
+#ifdef CONFIG_PER_VMA_LOCK
+
+static void vma_clear(struct vm_area_struct *vma, struct mm_struct *mm)
+{
+	vma->vm_mm =3D mm;
+	vma->vm_ops =3D &vma_dummy_vm_ops;
+	vma->vm_start =3D 0;
+	vma->vm_end =3D 0;
+	vma->anon_vma =3D NULL;
+	vma->vm_pgoff =3D 0;
+	vma->vm_file =3D NULL;
+	vma->vm_private_data =3D NULL;
+	vm_flags_init(vma, 0);
+	memset(&vma->vm_page_prot, 0, sizeof(vma->vm_page_prot));
+	memset(&vma->shared, 0, sizeof(vma->shared));
+	memset(&vma->vm_userfaultfd_ctx, 0, sizeof(vma->vm_userfaultfd_ctx));
+	vma_numab_state_init(vma);
+#ifdef CONFIG_ANON_VMA_NAME
+	vma->anon_name =3D NULL;
+#endif
+#ifdef CONFIG_SWAP
+	memset(&vma->swap_readahead_info, 0, sizeof(vma->swap_readahead_info));
+#endif
+#ifndef CONFIG_MMU
+	vma->vm_region =3D NULL;
+#endif
+#ifdef CONFIG_NUMA
+	vma->vm_policy =3D NULL;
+#endif
+}
+
+static void vma_copy(const struct vm_area_struct *src, struct vm_area_stru=
ct *dest)
+{
+	dest->vm_mm =3D src->vm_mm;
+	dest->vm_ops =3D src->vm_ops;
+	dest->vm_start =3D src->vm_start;
+	dest->vm_end =3D src->vm_end;
+	dest->anon_vma =3D src->anon_vma;
+	dest->vm_pgoff =3D src->vm_pgoff;
+	dest->vm_file =3D src->vm_file;
+	dest->vm_private_data =3D src->vm_private_data;
+	vm_flags_init(dest, src->vm_flags);
+	memcpy(&dest->vm_page_prot, &src->vm_page_prot,
+	       sizeof(dest->vm_page_prot));
+	memcpy(&dest->shared, &src->shared, sizeof(dest->shared));
+	memcpy(&dest->vm_userfaultfd_ctx, &src->vm_userfaultfd_ctx,
+	       sizeof(dest->vm_userfaultfd_ctx));
+#ifdef CONFIG_ANON_VMA_NAME
+	dest->anon_name =3D src->anon_name;
+#endif
+#ifdef CONFIG_SWAP
+	memcpy(&dest->swap_readahead_info, &src->swap_readahead_info,
+	       sizeof(dest->swap_readahead_info));
+#endif
+#ifndef CONFIG_MMU
+	dest->vm_region =3D src->vm_region;
+#endif
+#ifdef CONFIG_NUMA
+	dest->vm_policy =3D src->vm_policy;
+#endif
+}
+
+#else /* CONFIG_PER_VMA_LOCK */
+
+static void vma_clear(struct vm_area_struct *vma, struct mm_struct *mm)
+{
+	vma_init(vma, mm);
+}
+
+static void vma_copy(const struct vm_area_struct *src, struct vm_area_stru=
ct *dest)
+{
+	/*
+	 * src->shared.rb may be modified concurrently, but the clone
+	 * will be reinitialized.
+	 */
+	data_race(memcpy(dest, src, sizeof(*dest)));
+}
+
+#endif /* CONFIG_PER_VMA_LOCK */
+
 struct vm_area_struct *vm_area_alloc(struct mm_struct *mm)
 {
 	struct vm_area_struct *vma;
@@ -444,7 +536,7 @@ struct vm_area_struct *vm_area_alloc(struct mm_struct *=
mm)
 	if (!vma)
 		return NULL;
=20
-	vma_init(vma, mm);
+	vma_clear(vma, mm);
=20
 	return vma;
 }
@@ -458,49 +550,46 @@ struct vm_area_struct *vm_area_dup(struct vm_area_str=
uct *orig)
=20
 	ASSERT_EXCLUSIVE_WRITER(orig->vm_flags);
 	ASSERT_EXCLUSIVE_WRITER(orig->vm_file);
-	/*
-	 * orig->shared.rb may be modified concurrently, but the clone
-	 * will be reinitialized.
-	 */
-	data_race(memcpy(new, orig, sizeof(*new)));
-	vma_lock_init(new);
-	INIT_LIST_HEAD(&new->anon_vma_chain);
-#ifdef CONFIG_PER_VMA_LOCK
-	/* vma is not locked, can't use vma_mark_detached() */
-	new->detached =3D true;
-#endif
+	vma_copy(orig, new);
 	vma_numab_state_init(new);
 	dup_anon_vma_name(orig, new);
=20
 	return new;
 }
=20
-void __vm_area_free(struct vm_area_struct *vma)
+static void __vm_area_free(struct vm_area_struct *vma, bool unreachable)
 {
+#ifdef CONFIG_PER_VMA_LOCK
+	/*
+	 * With SLAB_TYPESAFE_BY_RCU, vma can be reused and we need
+	 * vma->detached to be set before vma is returned into the cache.
+	 * This way reused object won't be used by readers until it's
+	 * initialized and reattached.
+	 * If vma is unreachable, there can be no other users and we
+	 * can set vma->detached directly with no risk of a race.
+	 * If vma is reachable, then it should have been already detached
+	 * under vma write-lock or it was never attached.
+	 */
+	if (unreachable)
+		vma->detached =3D true;
+	else
+		VM_BUG_ON_VMA(!is_vma_detached(vma), vma);
+	vma->vm_lock_seq =3D UINT_MAX;
+#endif
+	VM_BUG_ON_VMA(!list_empty(&vma->anon_vma_chain), vma);
 	vma_numab_state_free(vma);
 	free_anon_vma_name(vma);
 	kmem_cache_free(vm_area_cachep, vma);
 }
=20
-#ifdef CONFIG_PER_VMA_LOCK
-static void vm_area_free_rcu_cb(struct rcu_head *head)
+void vm_area_free(struct vm_area_struct *vma)
 {
-	struct vm_area_struct *vma =3D container_of(head, struct vm_area_struct,
-						  vm_rcu);
-
-	/* The vma should not be locked while being destroyed. */
-	VM_BUG_ON_VMA(rwsem_is_locked(&vma->vm_lock.lock), vma);
-	__vm_area_free(vma);
+	__vm_area_free(vma, false);
 }
-#endif
=20
-void vm_area_free(struct vm_area_struct *vma)
+void vm_area_free_unreachable(struct vm_area_struct *vma)
 {
-#ifdef CONFIG_PER_VMA_LOCK
-	call_rcu(&vma->vm_rcu, vm_area_free_rcu_cb);
-#else
-	__vm_area_free(vma);
-#endif
+	__vm_area_free(vma, true);
 }
=20
 static void account_kernel_stack(struct task_struct *tsk, int account)
@@ -3141,6 +3230,12 @@ void __init mm_cache_init(void)
=20
 void __init proc_caches_init(void)
 {
+	struct kmem_cache_args args =3D {
+		.use_freeptr_offset =3D true,
+		.freeptr_offset =3D offsetof(struct vm_area_struct, vm_freeptr),
+		.ctor =3D vm_area_ctor,
+	};
+
 	sighand_cachep =3D kmem_cache_create("sighand_cache",
 			sizeof(struct sighand_struct), 0,
 			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_TYPESAFE_BY_RCU|
@@ -3157,9 +3252,11 @@ void __init proc_caches_init(void)
 			sizeof(struct fs_struct), 0,
 			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_ACCOUNT,
 			NULL);
-	vm_area_cachep =3D KMEM_CACHE(vm_area_struct,
-			SLAB_HWCACHE_ALIGN|SLAB_NO_MERGE|SLAB_PANIC|
+	vm_area_cachep =3D kmem_cache_create("vm_area_struct",
+			sizeof(struct vm_area_struct), &args,
+			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_TYPESAFE_BY_RCU|
 			SLAB_ACCOUNT);
+
 	mmap_init();
 	nsproxy_cache_init();
 }
diff --git a/mm/memory.c b/mm/memory.c
index b252f19b28c9..6f4d4d423835 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -6368,10 +6368,16 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm=
_struct *mm,
 	if (!vma)
 		goto inval;
=20
-	if (!vma_start_read(vma))
+	if (!vma_start_read(mm, vma))
 		goto inval;
=20
-	/* Check if the VMA got isolated after we found it */
+	/*
+	 * Check if the VMA got isolated after we found it.
+	 * Note: vma we found could have been recycled and is being reattached.
+	 * It's possible to attach a vma while it is read-locked, however a
+	 * read-locked vma can't be detached (detaching requires write-locking).
+	 * Therefore if this check passes, we have an attached and stable vma.
+	 */
 	if (is_vma_detached(vma)) {
 		vma_end_read(vma);
 		count_vm_vma_lock_event(VMA_LOCK_MISS);
@@ -6385,8 +6391,9 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_s=
truct *mm,
 	 * fields are accessible for RCU readers.
 	 */
=20
-	/* Check since vm_start/vm_end might change before we lock the VMA */
-	if (unlikely(address < vma->vm_start || address >=3D vma->vm_end))
+	/* Check if the vma we locked is the right one. */
+	if (unlikely(vma->vm_mm !=3D mm ||
+		     address < vma->vm_start || address >=3D vma->vm_end))
 		goto inval_end_read;
=20
 	rcu_read_unlock();
diff --git a/mm/vma.c b/mm/vma.c
index cdc63728f47f..648784416833 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -414,7 +414,7 @@ void remove_vma(struct vm_area_struct *vma, bool unreac=
hable)
 		fput(vma->vm_file);
 	mpol_put(vma_policy(vma));
 	if (unreachable)
-		__vm_area_free(vma);
+		vm_area_free_unreachable(vma);
 	else
 		vm_area_free(vma);
 }
diff --git a/tools/testing/vma/vma_internal.h b/tools/testing/vma/vma_inter=
nal.h
index 0cdc5f8c3d60..3eeb1317cc69 100644
--- a/tools/testing/vma/vma_internal.h
+++ b/tools/testing/vma/vma_internal.h
@@ -685,14 +685,15 @@ static inline void mpol_put(struct mempolicy *)
 {
 }
=20
-static inline void __vm_area_free(struct vm_area_struct *vma)
+static inline void vm_area_free(struct vm_area_struct *vma)
 {
 	free(vma);
 }
=20
-static inline void vm_area_free(struct vm_area_struct *vma)
+static inline void vm_area_free_unreachable(struct vm_area_struct *vma)
 {
-	__vm_area_free(vma);
+	vma->detached =3D true;
+	vm_area_free(vma);
 }
=20
 static inline void lru_add_drain(void)
--=20
2.47.0.338.g60cca15819-goog
From nobody Sat Feb  7 23:50:16 2026
Date: Fri,  6 Dec 2024 14:52:02 -0800
In-Reply-To: <20241206225204.4008261-1-surenb@google.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id: <linux-kernel.vger.kernel.org>
List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org>
Mime-Version: 1.0
References: <20241206225204.4008261-1-surenb@google.com>
X-Mailer: git-send-email 2.47.0.338.g60cca15819-goog
Message-ID: <20241206225204.4008261-6-surenb@google.com>
Subject: [PATCH v5 5/6] mm/slab: allow freeptr_offset to be used with ctor
From: Suren Baghdasaryan <surenb@google.com>
To: akpm@linux-foundation.org
Cc: willy@infradead.org, liam.howlett@oracle.com, lorenzo.stoakes@oracle.com,
	mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mjguzik@gmail.com,
	oliver.sang@intel.com, mgorman@techsingularity.net, david@redhat.com,
	peterx@redhat.com, oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org,
	brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com, hughd@google.com,
	minchan@google.com, jannh@google.com, shakeel.butt@linux.dev,
	souravpanda@google.com, pasha.tatashin@soleen.com, corbet@lwn.net,
	linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	kernel-team@android.com, surenb@google.com
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

There is no real reason to prevent freeptr_offset usage when a slab
cache has a ctor. The only limitation is that any field unioned with
the free pointer and initialized by the ctor will be overwritten,
because the free pointer is set after the @ctor invocation. Document
this limitation and allow freeptr_offset to be used with a ctor.

Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
---
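Note, not part of the commit: a minimal userspace sketch of the
limitation described above. It only models the ordering (ctor first,
free pointer second) and is not the SLUB implementation. struct obj,
its fields, obj_ctor() and setup_free_object() are hypothetical names
made up for this illustration.

```c
#include <stdbool.h>

#define COOKIE_INIT 0xabcdUL

/* Hypothetical object: 'cookie' shares storage with the free pointer. */
struct obj {
	union {
		void *freeptr;		/* allocator link while object is free */
		unsigned long cookie;	/* field the ctor tries to initialize */
	};
	int refs;			/* not unioned with the free pointer */
};

/* Hypothetical constructor: initializes both fields. */
static void obj_ctor(struct obj *o)
{
	o->cookie = COOKIE_INIT;
	o->refs = 1;
}

/* Model of the setup order: run the ctor, then store the free pointer. */
static void setup_free_object(struct obj *o, void *next_free)
{
	obj_ctor(o);
	o->freeptr = next_free;		/* clobbers 'cookie' */
}

/* Returns true if the ctor's value of the unioned field was lost. */
bool cookie_was_overwritten(void)
{
	struct obj a, b;

	setup_free_object(&a, &b);
	return a.cookie != COOKIE_INIT;
}

/* Returns the value of the non-unioned field after setup. */
int refs_after_setup(void)
{
	struct obj a, b;

	setup_free_object(&a, &b);
	return a.refs;
}
```

In this model the unioned field loses its ctor value while the field
outside the union keeps it, so a ctor should only be relied on for
fields that do not overlap the free pointer.
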
 include/linux/slab.h | 5 +++--
 mm/slub.c            | 2 +-
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 681b685b6c4e..6bad744bef5e 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -305,8 +305,9 @@ struct kmem_cache_args {
 	 * Using %0 as a value for @freeptr_offset is valid. If @freeptr_offset
 	 * is specified, %use_freeptr_offset must be set %true.
 	 *
-	 * Note that @ctor currently isn't supported with custom free pointers
-	 * as a @ctor requires an external free pointer.
+	 * Note that fields unioned with the free pointer cannot be initialized
+	 * by @ctor: the free pointer is set after the @ctor invocation, so
+	 * those values will be overwritten.
 	 */
 	unsigned int freeptr_offset;
 	/**
diff --git a/mm/slub.c b/mm/slub.c
index 870a1d95521d..f62c829b7b6b 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -5462,7 +5462,7 @@ static int calculate_sizes(struct kmem_cache_args *ar=
gs, struct kmem_cache *s)
 	s->inuse =3D size;
=20
 	if (((flags & SLAB_TYPESAFE_BY_RCU) && !args->use_freeptr_offset) ||
-	    (flags & SLAB_POISON) || s->ctor ||
+	    (flags & SLAB_POISON) || (s->ctor && !args->use_freeptr_offset) ||
 	    ((flags & SLAB_RED_ZONE) &&
 	     (s->object_size < sizeof(void *) || slub_debug_orig_size(s)))) {
 		/*
--=20
2.47.0.338.g60cca15819-goog
From nobody Sat Feb  7 23:50:16 2026
Date: Fri,  6 Dec 2024 14:52:03 -0800
In-Reply-To: <20241206225204.4008261-1-surenb@google.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id: <linux-kernel.vger.kernel.org>
List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org>
Mime-Version: 1.0
References: <20241206225204.4008261-1-surenb@google.com>
X-Mailer: git-send-email 2.47.0.338.g60cca15819-goog
Message-ID: <20241206225204.4008261-7-surenb@google.com>
Subject: [PATCH v5 6/6] docs/mm: document latest changes to vm_lock
From: Suren Baghdasaryan <surenb@google.com>
To: akpm@linux-foundation.org
Cc: willy@infradead.org, liam.howlett@oracle.com, lorenzo.stoakes@oracle.com,
	mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mjguzik@gmail.com,
	oliver.sang@intel.com, mgorman@techsingularity.net, david@redhat.com,
	peterx@redhat.com, oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org,
	brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com, hughd@google.com,
	minchan@google.com, jannh@google.com, shakeel.butt@linux.dev,
	souravpanda@google.com, pasha.tatashin@soleen.com, corbet@lwn.net,
	linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	kernel-team@android.com, surenb@google.com
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Update the documentation to reflect that vm_lock is now integrated into
the vma. Document the newly introduced vma_start_read_locked{_nested}
functions.

Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
 Documentation/mm/process_addrs.rst | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/Documentation/mm/process_addrs.rst b/Documentation/mm/process_=
addrs.rst
index 81417fa2ed20..92cf497a9e3c 100644
--- a/Documentation/mm/process_addrs.rst
+++ b/Documentation/mm/process_addrs.rst
@@ -716,7 +716,11 @@ calls :c:func:`!rcu_read_lock` to ensure that the VMA =
is looked up in an RCU
 critical section, then attempts to VMA lock it via :c:func:`!vma_start_rea=
d`,
 before releasing the RCU lock via :c:func:`!rcu_read_unlock`.
=20
-VMA read locks hold the read lock on the :c:member:`!vma->vm_lock` semapho=
re for
+When the user already holds the mmap read lock, :c:func:`!vma_start_read_l=
ocked`
+and :c:func:`!vma_start_read_locked_nested` can be used. These functions a=
lways
+succeed in acquiring the VMA read lock.
+
+VMA read locks hold the read lock on the :c:member:`!vma.vm_lock` semaphor=
e for
 their duration and the caller of :c:func:`!lock_vma_under_rcu` must releas=
e it
 via :c:func:`!vma_end_read`.
=20
@@ -780,7 +784,7 @@ keep VMAs locked across entirely separate write operati=
ons. It also maintains
 correct lock ordering.
=20
 Each time a VMA read lock is acquired, we acquire a read lock on the
-:c:member:`!vma->vm_lock` read/write semaphore and hold it, while checking=
 that
+:c:member:`!vma.vm_lock` read/write semaphore and hold it, while checking =
that
 the sequence count of the VMA does not match that of the mm.
=20
 If it does, the read lock fails. If it does not, we hold the lock, excludi=
ng
@@ -790,7 +794,7 @@ Importantly, maple tree operations performed in :c:func=
:`!lock_vma_under_rcu`
 are also RCU safe, so the whole read lock operation is guaranteed to funct=
ion
 correctly.
=20
-On the write side, we acquire a write lock on the :c:member:`!vma->vm_lock`
+On the write side, we acquire a write lock on the :c:member:`!vma.vm_lock`
 read/write semaphore, before setting the VMA's sequence number under this =
lock,
 also simultaneously holding the mmap write lock.
=20
--=20
2.47.0.338.g60cca15819-goog