From nobody Mon Feb 9 13:01:56 2026
Date: Thu, 26 Dec 2024 09:06:53 -0800
From: Suren Baghdasaryan <surenb@google.com>
To: akpm@linux-foundation.org
Cc: peterz@infradead.org, willy@infradead.org, liam.howlett@oracle.com,
 lorenzo.stoakes@oracle.com, mhocko@suse.com, vbabka@suse.cz,
 hannes@cmpxchg.org, mjguzik@gmail.com, oliver.sang@intel.com,
 mgorman@techsingularity.net, david@redhat.com, peterx@redhat.com,
 oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org, brauner@kernel.org,
 dhowells@redhat.com, hdanton@sina.com, hughd@google.com,
 lokeshgidra@google.com, minchan@google.com, jannh@google.com,
 shakeel.butt@linux.dev, souravpanda@google.com, pasha.tatashin@soleen.com,
 klarasmodin@gmail.com, corbet@lwn.net, linux-doc@vger.kernel.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@android.com,
 surenb@google.com
Message-ID: <20241226170710.1159679-2-surenb@google.com>
In-Reply-To: <20241226170710.1159679-1-surenb@google.com>
References: <20241226170710.1159679-1-surenb@google.com>
Subject: [PATCH v7 01/17] mm: introduce vma_start_read_locked{_nested} helpers

Introduce helper functions which can be used to read-lock a VMA while
holding mmap_lock for read. Replace direct accesses to vma->vm_lock with
these new helpers.

Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Liam R. Howlett <liam.howlett@oracle.com>
---
 include/linux/mm.h | 24 ++++++++++++++++++++++++
 mm/userfaultfd.c   | 22 +++++-----------------
 2 files changed, 29 insertions(+), 17 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 406b981af881..a48e207d25f2 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -735,6 +735,30 @@ static inline bool vma_start_read(struct vm_area_struct *vma)
 	return true;
 }
 
+/*
+ * Use only while holding mmap read lock which guarantees that locking will not
+ * fail (nobody can concurrently write-lock the vma). vma_start_read() should
+ * not be used in such cases because it might fail due to mm_lock_seq overflow.
+ * This functionality is used to obtain vma read lock and drop the mmap read lock.
+ */
+static inline void vma_start_read_locked_nested(struct vm_area_struct *vma, int subclass)
+{
+	mmap_assert_locked(vma->vm_mm);
+	down_read_nested(&vma->vm_lock->lock, subclass);
+}
+
+/*
+ * Use only while holding mmap read lock which guarantees that locking will not
+ * fail (nobody can concurrently write-lock the vma). vma_start_read() should
+ * not be used in such cases because it might fail due to mm_lock_seq overflow.
+ * This functionality is used to obtain vma read lock and drop the mmap read lock.
+ */
+static inline void vma_start_read_locked(struct vm_area_struct *vma)
+{
+	mmap_assert_locked(vma->vm_mm);
+	down_read(&vma->vm_lock->lock);
+}
+
 static inline void vma_end_read(struct vm_area_struct *vma)
 {
 	rcu_read_lock(); /* keeps vma alive till the end of up_read */
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index af3dfc3633db..4527c385935b 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -84,16 +84,8 @@ static struct vm_area_struct *uffd_lock_vma(struct mm_struct *mm,
 
 	mmap_read_lock(mm);
 	vma = find_vma_and_prepare_anon(mm, address);
-	if (!IS_ERR(vma)) {
-		/*
-		 * We cannot use vma_start_read() as it may fail due to
-		 * false locked (see comment in vma_start_read()). We
-		 * can avoid that by directly locking vm_lock under
-		 * mmap_lock, which guarantees that nobody can lock the
-		 * vma for write (vma_start_write()) under us.
-		 */
-		down_read(&vma->vm_lock->lock);
-	}
+	if (!IS_ERR(vma))
+		vma_start_read_locked(vma);
 
 	mmap_read_unlock(mm);
 	return vma;
@@ -1491,14 +1483,10 @@ static int uffd_move_lock(struct mm_struct *mm,
 	mmap_read_lock(mm);
 	err = find_vmas_mm_locked(mm, dst_start, src_start, dst_vmap, src_vmap);
 	if (!err) {
-		/*
-		 * See comment in uffd_lock_vma() as to why not using
-		 * vma_start_read() here.
-		 */
-		down_read(&(*dst_vmap)->vm_lock->lock);
+		vma_start_read_locked(*dst_vmap);
 		if (*dst_vmap != *src_vmap)
-			down_read_nested(&(*src_vmap)->vm_lock->lock,
-					 SINGLE_DEPTH_NESTING);
+			vma_start_read_locked_nested(*src_vmap,
+						     SINGLE_DEPTH_NESTING);
 	}
 	mmap_read_unlock(mm);
 	return err;
-- 
2.47.1.613.gc27f4b7a9f-goog
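For context on how the new helpers are meant to be used: the pattern is
lock-the-VMA-under-mmap_lock, then drop mmap_lock and keep only the per-VMA
read lock. A minimal caller sketch follows, modeled on uffd_lock_vma() above;
lookup_vma_read_locked() itself is a hypothetical name, not part of the patch.

/*
 * Take the mmap read lock, read-lock the VMA (guaranteed to succeed
 * while mmap_lock is held), then drop the mmap lock and continue under
 * the per-VMA read lock alone.
 */
static struct vm_area_struct *lookup_vma_read_locked(struct mm_struct *mm,
						     unsigned long address)
{
	struct vm_area_struct *vma;

	mmap_read_lock(mm);
	vma = find_vma(mm, address);
	if (vma)
		vma_start_read_locked(vma);	/* cannot fail under mmap_lock */
	mmap_read_unlock(mm);

	return vma;	/* caller releases it with vma_end_read(vma) */
}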
From nobody Mon Feb 9 13:01:56 2026
Date: Thu, 26 Dec 2024 09:06:54 -0800
From: Suren Baghdasaryan <surenb@google.com>
To: akpm@linux-foundation.org
Message-ID: <20241226170710.1159679-3-surenb@google.com>
In-Reply-To: <20241226170710.1159679-1-surenb@google.com>
Subject: [PATCH v7 02/17] mm: move per-vma lock into vm_area_struct

Back when per-vma locks were introduced, vm_lock was moved out of
vm_area_struct in [1] because of the performance regression caused by
false cacheline sharing. Recent investigation [2] revealed that the
regression is limited to a rather old Broadwell microarchitecture and
even there it can be mitigated by disabling adjacent cacheline
prefetching, see [3].

Splitting a single logical structure into multiple ones leads to more
complicated management, extra pointer dereferences and overall less
maintainable code. When that split-away part is a lock, it complicates
things even further. With no performance benefits, there are no reasons
for this split.

Merging the vm_lock back into vm_area_struct also allows vm_area_struct
to use SLAB_TYPESAFE_BY_RCU later in this patchset.

Move vm_lock back into vm_area_struct, aligning it at the cacheline
boundary and changing the cache to be cacheline-aligned as well.
With a kernel compiled using defconfig, this causes VMA memory
consumption to grow from 160 (vm_area_struct) + 40 (vm_lock) bytes to
256 bytes:

    slabinfo before:
     <name>           ... <objsize> <objperslab> <pagesperslab> : ...
     vma_lock         ...     40  102    1 : ...
     vm_area_struct   ...    160   51    2 : ...

    slabinfo after moving vm_lock:
     <name>           ... <objsize> <objperslab> <pagesperslab> : ...
     vm_area_struct   ...    256   32    2 : ...

Aggregate VMA memory consumption per 1000 VMAs grows from 50 to 64 pages,
which is 5.5MB per 100000 VMAs. Note that the size of this structure is
dependent on the kernel configuration and typically the original size is
higher than 160 bytes. Therefore these calculations are close to the
worst case scenario. A more realistic vm_area_struct usage before this
change is:

     <name>           ... <objsize> <objperslab> <pagesperslab> : ...
     vma_lock         ...     40  102    1 : ...
     vm_area_struct   ...    176   46    2 : ...

Aggregate VMA memory consumption per 1000 VMAs grows from 54 to 64 pages,
which is 3.9MB per 100000 VMAs. This memory consumption growth can be
addressed later by optimizing the vm_lock.

[1] https://lore.kernel.org/all/20230227173632.3292573-34-surenb@google.com/
[2] https://lore.kernel.org/all/ZsQyI%2F087V34JoIt@xsang-OptiPlex-9020/
[3] https://lore.kernel.org/all/CAJuCfpEisU8Lfe96AYJDZ+OM4NoPmnw9bP53cT_kbfP_pR+-2g@mail.gmail.com/

Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Liam R. Howlett <liam.howlett@oracle.com>
---
 include/linux/mm.h               | 28 ++++++++++--------
 include/linux/mm_types.h         |  6 ++--
 kernel/fork.c                    | 49 ++++----------------------------
 tools/testing/vma/vma_internal.h | 33 +++++----------------
 4 files changed, 32 insertions(+), 84 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index a48e207d25f2..f3f92ba8f5fe 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -697,6 +697,12 @@ static inline void vma_numab_state_free(struct vm_area_struct *vma) {}
 #endif /* CONFIG_NUMA_BALANCING */
 
 #ifdef CONFIG_PER_VMA_LOCK
+static inline void vma_lock_init(struct vm_area_struct *vma)
+{
+	init_rwsem(&vma->vm_lock.lock);
+	vma->vm_lock_seq = UINT_MAX;
+}
+
 /*
  * Try to read-lock a vma. The function is allowed to occasionally yield false
  * locked result to avoid performance overhead, in which case we fall back to
@@ -714,7 +720,7 @@ static inline bool vma_start_read(struct vm_area_struct *vma)
 	if (READ_ONCE(vma->vm_lock_seq) == READ_ONCE(vma->vm_mm->mm_lock_seq.sequence))
 		return false;
 
-	if (unlikely(down_read_trylock(&vma->vm_lock->lock) == 0))
+	if (unlikely(down_read_trylock(&vma->vm_lock.lock) == 0))
 		return false;
 
 	/*
@@ -729,7 +735,7 @@ static inline bool vma_start_read(struct vm_area_struct *vma)
 	 * This pairs with RELEASE semantics in vma_end_write_all().
 	 */
 	if (unlikely(vma->vm_lock_seq == raw_read_seqcount(&vma->vm_mm->mm_lock_seq))) {
-		up_read(&vma->vm_lock->lock);
+		up_read(&vma->vm_lock.lock);
 		return false;
 	}
 	return true;
@@ -744,7 +750,7 @@ static inline bool vma_start_read(struct vm_area_struct *vma)
 static inline void vma_start_read_locked_nested(struct vm_area_struct *vma, int subclass)
 {
 	mmap_assert_locked(vma->vm_mm);
-	down_read_nested(&vma->vm_lock->lock, subclass);
+	down_read_nested(&vma->vm_lock.lock, subclass);
 }
 
 /*
@@ -756,13 +762,13 @@ static inline void vma_start_read_locked_nested(struct vm_area_struct *vma, int
 static inline void vma_start_read_locked(struct vm_area_struct *vma)
 {
 	mmap_assert_locked(vma->vm_mm);
-	down_read(&vma->vm_lock->lock);
+	down_read(&vma->vm_lock.lock);
 }
 
 static inline void vma_end_read(struct vm_area_struct *vma)
 {
 	rcu_read_lock(); /* keeps vma alive till the end of up_read */
-	up_read(&vma->vm_lock->lock);
+	up_read(&vma->vm_lock.lock);
 	rcu_read_unlock();
 }
 
@@ -791,7 +797,7 @@ static inline void vma_start_write(struct vm_area_struct *vma)
 	if (__is_vma_write_locked(vma, &mm_lock_seq))
 		return;
 
-	down_write(&vma->vm_lock->lock);
+	down_write(&vma->vm_lock.lock);
 	/*
 	 * We should use WRITE_ONCE() here because we can have concurrent reads
 	 * from the early lockless pessimistic check in vma_start_read().
@@ -799,7 +805,7 @@ static inline void vma_start_write(struct vm_area_struct *vma)
 	 * we should use WRITE_ONCE() for cleanliness and to keep KCSAN happy.
 	 */
 	WRITE_ONCE(vma->vm_lock_seq, mm_lock_seq);
-	up_write(&vma->vm_lock->lock);
+	up_write(&vma->vm_lock.lock);
 }
 
 static inline void vma_assert_write_locked(struct vm_area_struct *vma)
@@ -811,7 +817,7 @@ static inline void vma_assert_write_locked(struct vm_area_struct *vma)
 
 static inline void vma_assert_locked(struct vm_area_struct *vma)
 {
-	if (!rwsem_is_locked(&vma->vm_lock->lock))
+	if (!rwsem_is_locked(&vma->vm_lock.lock))
 		vma_assert_write_locked(vma);
 }
 
@@ -844,6 +850,7 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
 
 #else /* CONFIG_PER_VMA_LOCK */
 
+static inline void vma_lock_init(struct vm_area_struct *vma) {}
 static inline bool vma_start_read(struct vm_area_struct *vma)
 		{ return false; }
 static inline void vma_end_read(struct vm_area_struct *vma) {}
@@ -878,10 +885,6 @@ static inline void assert_fault_locked(struct vm_fault *vmf)
 
 extern const struct vm_operations_struct vma_dummy_vm_ops;
 
-/*
- * WARNING: vma_init does not initialize vma->vm_lock.
- * Use vm_area_alloc()/vm_area_free() if vma needs locking.
- */
 static inline void vma_init(struct vm_area_struct *vma, struct mm_struct *mm)
 {
 	memset(vma, 0, sizeof(*vma));
@@ -890,6 +893,7 @@ static inline void vma_init(struct vm_area_struct *vma, struct mm_struct *mm)
 	INIT_LIST_HEAD(&vma->anon_vma_chain);
 	vma_mark_detached(vma, false);
 	vma_numab_state_init(vma);
+	vma_lock_init(vma);
 }
 
 /* Use when VMA is not part of the VMA tree and needs no locking */
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 5f1b2dc788e2..6573d95f1d1e 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -730,8 +730,6 @@ struct vm_area_struct {
 	 * slowpath.
 	 */
 	unsigned int vm_lock_seq;
-	/* Unstable RCU readers are allowed to read this. */
-	struct vma_lock *vm_lock;
 #endif
 
 	/*
@@ -784,6 +782,10 @@ struct vm_area_struct {
 	struct vma_numab_state *numab_state;	/* NUMA Balancing state */
 #endif
 	struct vm_userfaultfd_ctx vm_userfaultfd_ctx;
+#ifdef CONFIG_PER_VMA_LOCK
+	/* Unstable RCU readers are allowed to read this. */
+	struct vma_lock vm_lock ____cacheline_aligned_in_smp;
+#endif
 } __randomize_layout;
 
 #ifdef CONFIG_NUMA
diff --git a/kernel/fork.c b/kernel/fork.c
index ded49f18cd95..40a8e615499f 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -436,35 +436,6 @@ static struct kmem_cache *vm_area_cachep;
 /* SLAB cache for mm_struct structures (tsk->mm) */
 static struct kmem_cache *mm_cachep;
 
-#ifdef CONFIG_PER_VMA_LOCK
-
-/* SLAB cache for vm_area_struct.lock */
-static struct kmem_cache *vma_lock_cachep;
-
-static bool vma_lock_alloc(struct vm_area_struct *vma)
-{
-	vma->vm_lock = kmem_cache_alloc(vma_lock_cachep, GFP_KERNEL);
-	if (!vma->vm_lock)
-		return false;
-
-	init_rwsem(&vma->vm_lock->lock);
-	vma->vm_lock_seq = UINT_MAX;
-
-	return true;
-}
-
-static inline void vma_lock_free(struct vm_area_struct *vma)
-{
-	kmem_cache_free(vma_lock_cachep, vma->vm_lock);
-}
-
-#else /* CONFIG_PER_VMA_LOCK */
-
-static inline bool vma_lock_alloc(struct vm_area_struct *vma) { return true; }
-static inline void vma_lock_free(struct vm_area_struct *vma) {}
-
-#endif /* CONFIG_PER_VMA_LOCK */
-
 struct vm_area_struct *vm_area_alloc(struct mm_struct *mm)
 {
 	struct vm_area_struct *vma;
@@ -474,10 +445,6 @@ struct vm_area_struct *vm_area_alloc(struct mm_struct *mm)
 		return NULL;
 
 	vma_init(vma, mm);
-	if (!vma_lock_alloc(vma)) {
-		kmem_cache_free(vm_area_cachep, vma);
-		return NULL;
-	}
 
 	return vma;
 }
@@ -496,10 +463,7 @@ struct vm_area_struct *vm_area_dup(struct vm_area_struct *orig)
 	 * will be reinitialized.
 	 */
 	data_race(memcpy(new, orig, sizeof(*new)));
-	if (!vma_lock_alloc(new)) {
-		kmem_cache_free(vm_area_cachep, new);
-		return NULL;
-	}
+	vma_lock_init(new);
 	INIT_LIST_HEAD(&new->anon_vma_chain);
 	vma_numab_state_init(new);
 	dup_anon_vma_name(orig, new);
@@ -511,7 +475,6 @@ void __vm_area_free(struct vm_area_struct *vma)
 {
 	vma_numab_state_free(vma);
 	free_anon_vma_name(vma);
-	vma_lock_free(vma);
 	kmem_cache_free(vm_area_cachep, vma);
 }
 
@@ -522,7 +485,7 @@ static void vm_area_free_rcu_cb(struct rcu_head *head)
 						  vm_rcu);
 
 	/* The vma should not be locked while being destroyed. */
-	VM_BUG_ON_VMA(rwsem_is_locked(&vma->vm_lock->lock), vma);
+	VM_BUG_ON_VMA(rwsem_is_locked(&vma->vm_lock.lock), vma);
 	__vm_area_free(vma);
 }
 #endif
@@ -3188,11 +3151,9 @@ void __init proc_caches_init(void)
 			sizeof(struct fs_struct), 0,
 			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_ACCOUNT,
 			NULL);
-
-	vm_area_cachep = KMEM_CACHE(vm_area_struct, SLAB_PANIC|SLAB_ACCOUNT);
-#ifdef CONFIG_PER_VMA_LOCK
-	vma_lock_cachep = KMEM_CACHE(vma_lock, SLAB_PANIC|SLAB_ACCOUNT);
-#endif
+	vm_area_cachep = KMEM_CACHE(vm_area_struct,
+			SLAB_HWCACHE_ALIGN|SLAB_NO_MERGE|SLAB_PANIC|
+			SLAB_ACCOUNT);
 	mmap_init();
 	nsproxy_cache_init();
 }
diff --git a/tools/testing/vma/vma_internal.h b/tools/testing/vma/vma_internal.h
index ae635eecbfa8..d19ce6fcab83 100644
--- a/tools/testing/vma/vma_internal.h
+++ b/tools/testing/vma/vma_internal.h
@@ -270,10 +270,10 @@ struct vm_area_struct {
 	/*
 	 * Can only be written (using WRITE_ONCE()) while holding both:
 	 *  - mmap_lock (in write mode)
-	 *  - vm_lock->lock (in write mode)
+	 *  - vm_lock.lock (in write mode)
 	 * Can be read reliably while holding one of:
 	 *  - mmap_lock (in read or write mode)
-	 *  - vm_lock->lock (in read or write mode)
+	 *  - vm_lock.lock (in read or write mode)
 	 * Can be read unreliably (using READ_ONCE()) for pessimistic bailout
 	 * while holding nothing (except RCU to keep the VMA struct allocated).
 	 *
@@ -282,7 +282,7 @@ struct vm_area_struct {
 	 * slowpath.
 	 */
 	unsigned int vm_lock_seq;
-	struct vma_lock *vm_lock;
+	struct vma_lock vm_lock;
 #endif
 
 	/*
@@ -459,17 +459,10 @@ static inline struct vm_area_struct *vma_next(struct vma_iterator *vmi)
 	return mas_find(&vmi->mas, ULONG_MAX);
 }
 
-static inline bool vma_lock_alloc(struct vm_area_struct *vma)
+static inline void vma_lock_init(struct vm_area_struct *vma)
 {
-	vma->vm_lock = calloc(1, sizeof(struct vma_lock));
-
-	if (!vma->vm_lock)
-		return false;
-
-	init_rwsem(&vma->vm_lock->lock);
+	init_rwsem(&vma->vm_lock.lock);
 	vma->vm_lock_seq = UINT_MAX;
-
-	return true;
 }
 
 static inline void vma_assert_write_locked(struct vm_area_struct *);
@@ -492,6 +485,7 @@ static inline void vma_init(struct vm_area_struct *vma, struct mm_struct *mm)
 	vma->vm_ops = &vma_dummy_vm_ops;
 	INIT_LIST_HEAD(&vma->anon_vma_chain);
 	vma_mark_detached(vma, false);
+	vma_lock_init(vma);
 }
 
 static inline struct vm_area_struct *vm_area_alloc(struct mm_struct *mm)
@@ -502,10 +496,6 @@ static inline struct vm_area_struct *vm_area_alloc(struct mm_struct *mm)
 		return NULL;
 
 	vma_init(vma, mm);
-	if (!vma_lock_alloc(vma)) {
-		free(vma);
-		return NULL;
-	}
 
 	return vma;
 }
@@ -518,10 +508,7 @@ static inline struct vm_area_struct *vm_area_dup(struct vm_area_struct *orig)
 		return NULL;
 
 	memcpy(new, orig, sizeof(*new));
-	if (!vma_lock_alloc(new)) {
-		free(new);
-		return NULL;
-	}
+	vma_lock_init(new);
 	INIT_LIST_HEAD(&new->anon_vma_chain);
 
 	return new;
@@ -691,14 +678,8 @@ static inline void mpol_put(struct mempolicy *)
 {
 }
 
-static inline void vma_lock_free(struct vm_area_struct *vma)
-{
-	free(vma->vm_lock);
-}
-
 static inline void __vm_area_free(struct vm_area_struct *vma)
 {
-	vma_lock_free(vma);
 	free(vma);
 }
 
-- 
2.47.1.613.gc27f4b7a9f-goog
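As a rough illustration of the size math in the commit message above
(160 + 40 bytes becoming 256 bytes), embedding a cacheline-aligned member
pads the struct so the lock starts on its own cacheline. The userspace
analogue below compiles standalone; the struct names, placeholder field,
and the 64-byte cacheline are assumptions of this sketch, not code from
the patch.

#include <stdio.h>
#include <stddef.h>

/* A 64-byte cacheline is assumed; the kernel uses ____cacheline_aligned_in_smp. */
struct vma_lock_like {
	long rwsem_placeholder;		/* stands in for struct rw_semaphore */
} __attribute__((aligned(64)));

struct vma_like {
	char other_fields[160];		/* the rest of vm_area_struct, per the commit message */
	struct vma_lock_like vm_lock;	/* starts on its own cacheline */
};

int main(void)
{
	/* 160 rounds up to 192 for alignment; the lock takes the next cacheline: 256 total */
	printf("vm_lock offset=%zu, struct size=%zu\n",
	       offsetof(struct vma_like, vm_lock), sizeof(struct vma_like));
	return 0;
}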
From nobody Mon Feb 9 13:01:56 2026
Date: Thu, 26 Dec 2024 09:06:55 -0800
From: Suren Baghdasaryan <surenb@google.com>
To: akpm@linux-foundation.org
Message-ID: <20241226170710.1159679-4-surenb@google.com>
In-Reply-To: <20241226170710.1159679-1-surenb@google.com>
Subject: [PATCH v7 03/17] mm: mark vma as detached until it's added into vma tree

The current implementation does not set the detached flag when a VMA is
first allocated. This does not represent the real state of the VMA,
which is detached until it is added into the mm's VMA tree. Fix this by
marking new VMAs as detached and resetting the detached flag only after
the VMA is added into a tree.

Introduce vma_mark_attached() to make the API more readable and to
simplify a possible future cleanup in which vma->vm_mm might be used to
indicate a detached vma, at which point vma_mark_attached() will need an
additional mm parameter.

Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Liam R. Howlett <liam.howlett@oracle.com>
---
 include/linux/mm.h               | 27 ++++++++++++++++++++-------
 kernel/fork.c                    |  4 ++++
 mm/memory.c                      |  2 +-
 mm/vma.c                         |  6 +++---
 mm/vma.h                         |  2 ++
 tools/testing/vma/vma_internal.h | 17 ++++++++++++-----
 6 files changed, 42 insertions(+), 16 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index f3f92ba8f5fe..081178b0eec4 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -821,12 +821,21 @@ static inline void vma_assert_locked(struct vm_area_struct *vma)
 		vma_assert_write_locked(vma);
 }
 
-static inline void vma_mark_detached(struct vm_area_struct *vma, bool detached)
+static inline void vma_mark_attached(struct vm_area_struct *vma)
+{
+	vma->detached = false;
+}
+
+static inline void vma_mark_detached(struct vm_area_struct *vma)
 {
 	/* When detaching vma should be write-locked */
-	if (detached)
-		vma_assert_write_locked(vma);
-	vma->detached = detached;
+	vma_assert_write_locked(vma);
+	vma->detached = true;
+}
+
+static inline bool is_vma_detached(struct vm_area_struct *vma)
+{
+	return vma->detached;
 }
 
 static inline void release_fault_lock(struct vm_fault *vmf)
@@ -857,8 +866,8 @@ static inline void vma_end_read(struct vm_area_struct *vma) {}
 static inline void vma_start_write(struct vm_area_struct *vma) {}
 static inline void vma_assert_write_locked(struct vm_area_struct *vma)
 		{ mmap_assert_write_locked(vma->vm_mm); }
-static inline void vma_mark_detached(struct vm_area_struct *vma,
-				     bool detached) {}
+static inline void vma_mark_attached(struct vm_area_struct *vma) {}
+static inline void vma_mark_detached(struct vm_area_struct *vma) {}
 
 static inline struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
 		unsigned long address)
@@ -891,7 +900,10 @@ static inline void vma_init(struct vm_area_struct *vma, struct mm_struct *mm)
 	vma->vm_mm = mm;
 	vma->vm_ops = &vma_dummy_vm_ops;
 	INIT_LIST_HEAD(&vma->anon_vma_chain);
-	vma_mark_detached(vma, false);
+#ifdef CONFIG_PER_VMA_LOCK
+	/* vma is not locked, can't use vma_mark_detached() */
+	vma->detached = true;
+#endif
 	vma_numab_state_init(vma);
 	vma_lock_init(vma);
 }
@@ -1086,6 +1098,7 @@ static inline int vma_iter_bulk_store(struct vma_iterator *vmi,
 	if (unlikely(mas_is_err(&vmi->mas)))
 		return -ENOMEM;
 
+	vma_mark_attached(vma);
 	return 0;
 }
 
diff --git a/kernel/fork.c b/kernel/fork.c
index 40a8e615499f..f2f9e7b427ad 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -465,6 +465,10 @@ struct vm_area_struct *vm_area_dup(struct vm_area_struct *orig)
 	data_race(memcpy(new, orig, sizeof(*new)));
 	vma_lock_init(new);
 	INIT_LIST_HEAD(&new->anon_vma_chain);
+#ifdef CONFIG_PER_VMA_LOCK
+	/* vma is not locked, can't use vma_mark_detached() */
+	new->detached = true;
+#endif
 	vma_numab_state_init(new);
 	dup_anon_vma_name(orig, new);
 
diff --git a/mm/memory.c b/mm/memory.c
index 2a20e3810534..d0dee2282325 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -6349,7 +6349,7 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
 		goto inval;
 
 	/* Check if the VMA got isolated after we found it */
-	if (vma->detached) {
+	if (is_vma_detached(vma)) {
 		vma_end_read(vma);
 		count_vm_vma_lock_event(VMA_LOCK_MISS);
 		/* The area was replaced with another one */
diff --git a/mm/vma.c b/mm/vma.c
index 0caaeea899a9..476146c25283 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -327,7 +327,7 @@ static void vma_complete(struct vma_prepare *vp, struct vma_iterator *vmi,
 
 	if (vp->remove) {
 again:
-		vma_mark_detached(vp->remove, true);
+		vma_mark_detached(vp->remove);
 		if (vp->file) {
 			uprobe_munmap(vp->remove, vp->remove->vm_start,
 				      vp->remove->vm_end);
@@ -1220,7 +1220,7 @@ static void reattach_vmas(struct ma_state *mas_detach)
 
 	mas_set(mas_detach, 0);
 	mas_for_each(mas_detach, vma, ULONG_MAX)
-		vma_mark_detached(vma, false);
+		vma_mark_attached(vma);
 
 	__mt_destroy(mas_detach->tree);
 }
@@ -1295,7 +1295,7 @@ static int vms_gather_munmap_vmas(struct vma_munmap_struct *vms,
 		if (error)
 			goto munmap_gather_failed;
 
-		vma_mark_detached(next, true);
+		vma_mark_detached(next);
 		nrpages = vma_pages(next);
 
 		vms->nr_pages += nrpages;
diff --git a/mm/vma.h b/mm/vma.h
index 61ed044b6145..24636a2b0acf 100644
--- a/mm/vma.h
+++ b/mm/vma.h
@@ -157,6 +157,7 @@ static inline int vma_iter_store_gfp(struct vma_iterator *vmi,
 	if (unlikely(mas_is_err(&vmi->mas)))
 		return -ENOMEM;
 
+	vma_mark_attached(vma);
 	return 0;
 }
 
@@ -389,6 +390,7 @@ static inline void vma_iter_store(struct vma_iterator *vmi,
 
 	__mas_set_range(&vmi->mas, vma->vm_start, vma->vm_end - 1);
 	mas_store_prealloc(&vmi->mas, vma);
+	vma_mark_attached(vma);
 }
 
 static inline unsigned long vma_iter_addr(struct vma_iterator *vmi)
diff --git a/tools/testing/vma/vma_internal.h b/tools/testing/vma/vma_internal.h
index d19ce6fcab83..2a624f9304da 100644
--- a/tools/testing/vma/vma_internal.h
+++ b/tools/testing/vma/vma_internal.h
@@ -465,13 +465,17 @@ static inline void vma_lock_init(struct vm_area_struct *vma)
 	vma->vm_lock_seq = UINT_MAX;
 }
 
+static inline void vma_mark_attached(struct vm_area_struct *vma)
+{
+	vma->detached = false;
+}
+
 static inline void vma_assert_write_locked(struct vm_area_struct *);
-static inline void vma_mark_detached(struct vm_area_struct *vma, bool detached)
+static inline void vma_mark_detached(struct vm_area_struct *vma)
 {
 	/* When detaching vma should be write-locked */
-	if (detached)
-		vma_assert_write_locked(vma);
-	vma->detached = detached;
+	vma_assert_write_locked(vma);
+	vma->detached = true;
 }
 
 extern const struct vm_operations_struct vma_dummy_vm_ops;
@@ -484,7 +488,8 @@ static inline void vma_init(struct vm_area_struct *vma, struct mm_struct *mm)
 	vma->vm_mm = mm;
 	vma->vm_ops = &vma_dummy_vm_ops;
 	INIT_LIST_HEAD(&vma->anon_vma_chain);
-	vma_mark_detached(vma, false);
+	/* vma is not locked, can't use vma_mark_detached() */
+	vma->detached = true;
 	vma_lock_init(vma);
 }
 
@@ -510,6 +515,8 @@ static inline struct vm_area_struct *vm_area_dup(struct vm_area_struct *orig)
 	memcpy(new, orig, sizeof(*new));
 	vma_lock_init(new);
 	INIT_LIST_HEAD(&new->anon_vma_chain);
+	/* vma is not locked, can't use vma_mark_detached() */
+	new->detached = true;
 
 	return new;
 }
-- 
2.47.1.613.gc27f4b7a9f-goog
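A sketch of the vma->detached lifecycle this patch establishes may help.
All names below are from the series except attach_new_vma(), which is a
hypothetical wrapper; vma_iter_store() is shown with its signature as of
this patch (patch 04 below adds a new_vma parameter).

/* Hypothetical wrapper showing the intended order of operations. */
static void attach_new_vma(struct vma_iterator *vmi, struct vm_area_struct *vma)
{
	/* vma->detached has been true since vm_area_alloc()/vma_init() */
	vma_start_write(vma);		/* write-lock before it becomes reachable */
	vma_iter_store(vmi, vma);	/* stores into the tree, calls vma_mark_attached() */
}

The reverse transition mirrors it: vma_mark_detached() asserts the write
lock, so a vma can only leave the tree while write-locked.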
From nobody Mon Feb 9 13:01:56 2026
Date: Thu, 26 Dec 2024 09:06:56 -0800
From: Suren Baghdasaryan <surenb@google.com>
To: akpm@linux-foundation.org
Message-ID: <20241226170710.1159679-5-surenb@google.com>
In-Reply-To: <20241226170710.1159679-1-surenb@google.com>
Subject: [PATCH v7 04/17] mm: modify vma_iter_store{_gfp} to indicate if it's storing a new vma

The vma_iter_store() functions can be used both when adding a new vma
and when updating an existing one. However, for existing ones we do not
need to mark them attached, as they are already marked that way. Add a
parameter to distinguish between the two uses and skip
vma_mark_attached() when it is not needed.

Signed-off-by: Suren Baghdasaryan <surenb@google.com>
---
 include/linux/mm.h | 12 ++++++++++++
 mm/nommu.c         |  4 ++--
 mm/vma.c           | 16 ++++++++--------
 mm/vma.h           | 13 +++++++++----
 4 files changed, 31 insertions(+), 14 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 081178b0eec4..c50edfedd99d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -821,6 +821,16 @@ static inline void vma_assert_locked(struct vm_area_struct *vma)
 		vma_assert_write_locked(vma);
 }
 
+static inline void vma_assert_attached(struct vm_area_struct *vma)
+{
+	VM_BUG_ON_VMA(vma->detached, vma);
+}
+
+static inline void vma_assert_detached(struct vm_area_struct *vma)
+{
+	VM_BUG_ON_VMA(!vma->detached, vma);
+}
+
 static inline void vma_mark_attached(struct vm_area_struct *vma)
 {
 	vma->detached = false;
@@ -866,6 +876,8 @@ static inline void vma_end_read(struct vm_area_struct *vma) {}
 static inline void vma_start_write(struct vm_area_struct *vma) {}
 static inline void vma_assert_write_locked(struct vm_area_struct *vma)
 		{ mmap_assert_write_locked(vma->vm_mm); }
+static inline void vma_assert_attached(struct vm_area_struct *vma) {}
+static inline void vma_assert_detached(struct vm_area_struct *vma) {}
 static inline void vma_mark_attached(struct vm_area_struct *vma) {}
 static inline void vma_mark_detached(struct vm_area_struct *vma) {}
 
diff --git a/mm/nommu.c b/mm/nommu.c
index 9cb6e99215e2..72c8c505836c 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -1191,7 +1191,7 @@ unsigned long do_mmap(struct file *file,
 	setup_vma_to_mm(vma, current->mm);
 	current->mm->map_count++;
 	/* add the VMA to the tree */
-	vma_iter_store(&vmi, vma);
+	vma_iter_store(&vmi, vma, true);
 
 	/* we flush the region from the icache only when the first executable
 	 * mapping of it is made  */
@@ -1356,7 +1356,7 @@ static int split_vma(struct vma_iterator *vmi, struct vm_area_struct *vma,
 
 	setup_vma_to_mm(vma, mm);
 	setup_vma_to_mm(new, mm);
-	vma_iter_store(vmi, new);
+	vma_iter_store(vmi, new, true);
 	mm->map_count++;
 	return 0;
 
diff --git a/mm/vma.c b/mm/vma.c
index 476146c25283..ce113dd8c471 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -306,7 +306,7 @@ static void vma_complete(struct vma_prepare *vp, struct vma_iterator *vmi,
 		 * us to insert it before dropping the locks
 		 * (it may either follow vma or precede it).
 		 */
-		vma_iter_store(vmi, vp->insert);
+		vma_iter_store(vmi, vp->insert, true);
 		mm->map_count++;
 	}
 
@@ -660,14 +660,14 @@ static int commit_merge(struct vma_merge_struct *vmg,
 	vma_set_range(vmg->vma, vmg->start, vmg->end, vmg->pgoff);
 
 	if (expanded)
-		vma_iter_store(vmg->vmi, vmg->vma);
+		vma_iter_store(vmg->vmi, vmg->vma, false);
 
 	if (adj_start) {
 		adjust->vm_start += adj_start;
 		adjust->vm_pgoff += PHYS_PFN(adj_start);
 		if (adj_start < 0) {
 			WARN_ON(expanded);
-			vma_iter_store(vmg->vmi, adjust);
+			vma_iter_store(vmg->vmi, adjust, false);
 		}
 	}
 
@@ -1689,7 +1689,7 @@ int vma_link(struct mm_struct *mm, struct vm_area_struct *vma)
 		return -ENOMEM;
 
 	vma_start_write(vma);
-	vma_iter_store(&vmi, vma);
+	vma_iter_store(&vmi, vma, true);
 	vma_link_file(vma);
 	mm->map_count++;
 	validate_mm(mm);
@@ -2368,7 +2368,7 @@ static int __mmap_new_vma(struct mmap_state *map, struct vm_area_struct **vmap)
 
 	/* Lock the VMA since it is modified after insertion into VMA tree */
 	vma_start_write(vma);
-	vma_iter_store(vmi, vma);
+	vma_iter_store(vmi, vma, true);
 	map->mm->map_count++;
 	vma_link_file(vma);
 
@@ -2542,7 +2542,7 @@ int do_brk_flags(struct vma_iterator *vmi, struct vm_area_struct *vma,
 	vm_flags_init(vma, flags);
 	vma->vm_page_prot = vm_get_page_prot(flags);
 	vma_start_write(vma);
-	if (vma_iter_store_gfp(vmi, vma, GFP_KERNEL))
+	if (vma_iter_store_gfp(vmi, vma, GFP_KERNEL, true))
 		goto mas_store_fail;
 
 	mm->map_count++;
@@ -2785,7 +2785,7 @@ int expand_upwards(struct vm_area_struct *vma, unsigned long address)
 				anon_vma_interval_tree_pre_update_vma(vma);
 				vma->vm_end = address;
 				/* Overwrite old entry in mtree. */
-				vma_iter_store(&vmi, vma);
+				vma_iter_store(&vmi, vma, false);
 				anon_vma_interval_tree_post_update_vma(vma);
 
 				perf_event_mmap(vma);
@@ -2865,7 +2865,7 @@ int expand_downwards(struct vm_area_struct *vma, unsigned long address)
 				vma->vm_start = address;
 				vma->vm_pgoff -= grow;
 				/* Overwrite old entry in mtree. */
-				vma_iter_store(&vmi, vma);
+				vma_iter_store(&vmi, vma, false);
 				anon_vma_interval_tree_post_update_vma(vma);
 
 				perf_event_mmap(vma);
diff --git a/mm/vma.h b/mm/vma.h
index 24636a2b0acf..18c9e49b1eae 100644
--- a/mm/vma.h
+++ b/mm/vma.h
@@ -145,7 +145,7 @@ __must_check int vma_shrink(struct vma_iterator *vmi,
 		unsigned long start, unsigned long end, pgoff_t pgoff);
 
 static inline int vma_iter_store_gfp(struct vma_iterator *vmi,
-			struct vm_area_struct *vma, gfp_t gfp)
+			struct vm_area_struct *vma, gfp_t gfp, bool new_vma)
 
 {
 	if (vmi->mas.status != ma_start &&
@@ -157,7 +157,10 @@ static inline int vma_iter_store_gfp(struct vma_iterator *vmi,
 	if (unlikely(mas_is_err(&vmi->mas)))
 		return -ENOMEM;
 
-	vma_mark_attached(vma);
+	if (new_vma)
+		vma_mark_attached(vma);
+	vma_assert_attached(vma);
+
 	return 0;
 }
 
@@ -366,7 +369,7 @@ static inline struct vm_area_struct *vma_iter_load(struct vma_iterator *vmi)
 
 /* Store a VMA with preallocated memory */
 static inline void vma_iter_store(struct vma_iterator *vmi,
-				  struct vm_area_struct *vma)
+				  struct vm_area_struct *vma, bool new_vma)
 {
 
 #if defined(CONFIG_DEBUG_VM_MAPLE_TREE)
@@ -390,7 +393,9 @@ static inline void vma_iter_store(struct vma_iterator *vmi,
 
 	__mas_set_range(&vmi->mas, vma->vm_start, vma->vm_end - 1);
 	mas_store_prealloc(&vmi->mas, vma);
-	vma_mark_attached(vma);
+	if (new_vma)
+		vma_mark_attached(vma);
+	vma_assert_attached(vma);
 }
 
 static inline unsigned long vma_iter_addr(struct vma_iterator *vmi)
-- 
2.47.1.613.gc27f4b7a9f-goog
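To make the new flag concrete, here is a hedged sketch of the two call
patterns the parameter distinguishes; example_store() and new_end are
illustrative names only, not code from the patch.

/* Hypothetical illustration of the two call patterns. */
static void example_store(struct vma_iterator *vmi, struct vm_area_struct *vma,
			  unsigned long new_end)
{
	/* Case 1: a brand-new vma must be marked attached on store. */
	vma_start_write(vma);
	vma_iter_store(vmi, vma, true);

	/* Case 2: updating the range of an already-attached vma. */
	vma_start_write(vma);
	vma->vm_end = new_end;
	vma_iter_store(vmi, vma, false);	/* skips vma_mark_attached() */
}

In both cases vma_assert_attached() fires afterwards, so a mismatched
flag is caught under CONFIG_DEBUG_VM rather than silently corrupting the
detached state.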
From nobody Mon Feb 9 13:01:56 2026
Date: Thu, 26 Dec 2024 09:06:57 -0800
From: Suren Baghdasaryan <surenb@google.com>
To: akpm@linux-foundation.org
Message-ID: <20241226170710.1159679-6-surenb@google.com>
In-Reply-To: <20241226170710.1159679-1-surenb@google.com>
Subject: [PATCH v7 05/17] mm: mark vmas detached upon exit

When exit_mmap() removes vmas belonging to an exiting task, it does not
mark them as detached since they can't be reached by other tasks and
they will be freed shortly. Once we introduce vma reuse, all vmas will
have to be in detached state before they are freed, to ensure that a vma
is in a consistent state when it is reused. Add the missing
vma_mark_detached() before freeing the vma.

Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
---
 mm/vma.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/vma.c b/mm/vma.c
index ce113dd8c471..4a3deb6f9662 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -413,9 +413,10 @@ void remove_vma(struct vm_area_struct *vma, bool unreachable)
 	if (vma->vm_file)
 		fput(vma->vm_file);
 	mpol_put(vma_policy(vma));
-	if (unreachable)
+	if (unreachable) {
+		vma_mark_detached(vma);
 		__vm_area_free(vma);
-	else
+	} else
 		vm_area_free(vma);
 }
 
-- 
2.47.1.613.gc27f4b7a9f-goog
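The invariant this patch completes can be stated with the assertion
helper added in patch 04; the checked wrapper below is an editorial
sketch, not code from the series.

/* Sketch: after this patch, every vma reaching the free path is detached. */
static void vm_area_free_checked(struct vm_area_struct *vma)
{
	vma_assert_detached(vma);	/* now holds on the exit_mmap() path too */
	__vm_area_free(vma);
}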
G4Xfpi348AVsih06DknNqpbppPykIM8cDunHDsXjLz4FS0fi9GdnXQAsgqv2VOkpI3C3 FceeNCnVR4ALt4T8yxjwas/dPswhdvUlDTJ3VwR9OY7tGLTdrrVACR19i7hPHEVupams MWOrFoZRhrj14evntGZkryG2tefoTqd5Tyu/VkSBc5z1yTFoViwyzoCoLjvU9KhP902v h2tw== X-Forwarded-Encrypted: i=1; AJvYcCWRAC6p3QzAKrdiI/wUavNr00CL6RQbJ8qXrt2BZ874PxpI9Vuqr4NSnBoGz5oexxTirYxzHyCN9pkHyfw=@vger.kernel.org X-Gm-Message-State: AOJu0YxH/Sm5yQaGsYdI11uAcEyX1rJxgZN6ZQMAJbn7ZOvG2WCY3rvH RqRh6mOx+Wv6oKwiguZengKUgzirAVKBGh8lMyKJ5qjL99nJDzYWSgChmnGAAINND8tStQl+9KJ RMw== X-Google-Smtp-Source: AGHT+IE/h3it9ewL3ohy9Py+t9rkxMgblJxIK9ZNryUZhJOqjESQSzvwpTa/5zazFY30vveAdTUiPRQAGfg= X-Received: from pgjs19.prod.google.com ([2002:a63:f053:0:b0:800:502a:791c]) (user=surenb job=prod-delivery.src-stubby-dispatcher) by 2002:a17:902:d589:b0:216:45b9:439b with SMTP id d9443c01a7336-219e6f28486mr336700685ad.50.1735232845994; Thu, 26 Dec 2024 09:07:25 -0800 (PST) Date: Thu, 26 Dec 2024 09:06:58 -0800 In-Reply-To: <20241226170710.1159679-1-surenb@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20241226170710.1159679-1-surenb@google.com> X-Mailer: git-send-email 2.47.1.613.gc27f4b7a9f-goog Message-ID: <20241226170710.1159679-7-surenb@google.com> Subject: [PATCH v7 06/17] mm/nommu: fix the last places where vma is not locked before being attached From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: peterz@infradead.org, willy@infradead.org, liam.howlett@oracle.com, lorenzo.stoakes@oracle.com, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mjguzik@gmail.com, oliver.sang@intel.com, mgorman@techsingularity.net, david@redhat.com, peterx@redhat.com, oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org, brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com, hughd@google.com, lokeshgidra@google.com, minchan@google.com, jannh@google.com, shakeel.butt@linux.dev, souravpanda@google.com, pasha.tatashin@soleen.com, klarasmodin@gmail.com, corbet@lwn.net, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@android.com, surenb@google.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" nommu configuration has two places where vma gets attached to the vma tree without write-locking it. Add the missing locks to ensure vma is always locked before it's attached. 
Signed-off-by: Suren Baghdasaryan --- mm/nommu.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/mm/nommu.c b/mm/nommu.c index 72c8c505836c..1754e84e5758 100644 --- a/mm/nommu.c +++ b/mm/nommu.c @@ -1189,6 +1189,7 @@ unsigned long do_mmap(struct file *file, goto error_just_free; =20 setup_vma_to_mm(vma, current->mm); + vma_start_write(vma); current->mm->map_count++; /* add the VMA to the tree */ vma_iter_store(&vmi, vma, true); @@ -1356,6 +1357,7 @@ static int split_vma(struct vma_iterator *vmi, struct= vm_area_struct *vma, =20 setup_vma_to_mm(vma, mm); setup_vma_to_mm(new, mm); + vma_start_write(new); vma_iter_store(vmi, new, true); mm->map_count++; return 0; --=20 2.47.1.613.gc27f4b7a9f-goog From nobody Mon Feb 9 13:01:56 2026 Received: from mail-pj1-f73.google.com (mail-pj1-f73.google.com [209.85.216.73]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A404D142E7C for ; Thu, 26 Dec 2024 17:07:28 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.73 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1735232850; cv=none; b=uazGSltzLODAYyk/JTXlDzLruInCxHOLiE4BsXUo3wTEb1flGQCefVC5qSV3KfcksVkhF3CXMrjjU2szR0bcVyv7mcOvcOsXltwkQ6QuGyPJ7vE5PRoa0pT0MiRwwfLwzf6+HRtvRgClqGtDDK2jThKgrSjNYO5H4gctKZrOan0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1735232850; c=relaxed/simple; bh=x8qaFP1AxURYpa5oiGFQli55lZvdLlf3Vb+7T69X9+M=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=COfCbDbq281EXfDIiHinTKX9Ueof5d6F52nGNJpTYp/93iz2TlyLkYL3ZyFuYNWfyC9rgYOs9umZ3xyVjwHq8U4Lw3841pnAkmtI3oQWlywK/sCgTJI6JOWyo0rFnHhFCtEvLRIf50ihRCyHVzbBeaHL+g+ZKQh4oZZib5XKMH0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=Ot1dOOFi; arc=none smtp.client-ip=209.85.216.73 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="Ot1dOOFi" Received: by mail-pj1-f73.google.com with SMTP id 98e67ed59e1d1-2ef6ef9ba3fso8267067a91.2 for ; Thu, 26 Dec 2024 09:07:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1735232848; x=1735837648; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=0ykFNL3Q77su3l1hmkCa41V3LHYM0KyunH48W8conCI=; b=Ot1dOOFi/jvZcCjPeJTj5e/4frA4ibU3acgE9zm++BszZBGb/ZjsocImg/CGUV5CO8 Nf0i6kW4RPmBB+TuXbiIq8Q9lyE2+v4asNGs0Uk5rDtO9bHM1YGu40KzXtj5Ht7ERDC+ B/L6tlRe3frjNZ9HyASSI0j9MSMsdu51HOuZCzaE/9VZl6J0JJQYjGOzEDdDliIGKe/+ 10b5x0BsD/Muc7zMOxNgBRoflAWzgxXH/9zq5DtqSjk1NCiAZ7DIsrJDwjA+qT3wbLfE c2yf/cX1DalQlAjJWOKo5actZhpOr1jvlkqNxDWDK8hXV6ed8Y/SA9EgVIR7E5Mk6/yA 0IBg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1735232848; x=1735837648; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=0ykFNL3Q77su3l1hmkCa41V3LHYM0KyunH48W8conCI=; 
b=fPGpe/8wOqyZFoqmyYsp4TEOsRIwns/+o0VTGF9jwo7oCCk5im3zTYB2jzc+h1h2Z9 y0iVnfEmqgCgTJqr9qrfTIFTx8ZC0FI+xroRmhdjn0ucHhb3+E6PQtDiOL/OSNdKik6Q ALhfMFEF2GLYxsFEHVRKS2tHqtalGSY7++Ax991rEowxJfwpNzf7AQV4WLxtoIfhmd9N 2BQG+cGMFPvo106Stt3rWZB4Fa4R4g43ntw8vnp1X5LdgPIM5DicXSwSH3hJRriN9HWS OQcAaueXx9xcBs/l7poZFEAagx8E3zQIXUhhLd5apFVpfGi/jVb4qN7u3As0rpyG+DfF QxRQ== X-Forwarded-Encrypted: i=1; AJvYcCXTb3DCdIi157ywOxzBMVdJp++6Es3iCtjvn7xA1HLmi9n7QoY5g8dnjYQ4UnFyQ9uq5lovO67nF3e0UV0=@vger.kernel.org X-Gm-Message-State: AOJu0YzuKwQD/Q4lsNMO2EwFYM6Slu/CKVsWGMnZWATkSK/WTUBmA54A LJ948HBdF3G4fJf6yikHRvyIgI+s9ddlZRWxsUVyINprA9NjU74NDb/UJvdJoCEw9AccefRspq6 MEQ== X-Google-Smtp-Source: AGHT+IHSckZ6MRPuI2AW+AMmVk8LfVa6FJMQqBAOVn0ohTdOeJ5YmYmLuchDuM88M65uvkUM82mTiq54fW8= X-Received: from pjvf8.prod.google.com ([2002:a17:90a:da88:b0:2ef:8eb8:e4eb]) (user=surenb job=prod-delivery.src-stubby-dispatcher) by 2002:a17:90b:2b8e:b0:2ee:49c4:4a7c with SMTP id 98e67ed59e1d1-2f452e39845mr35060096a91.18.1735232848030; Thu, 26 Dec 2024 09:07:28 -0800 (PST) Date: Thu, 26 Dec 2024 09:06:59 -0800 In-Reply-To: <20241226170710.1159679-1-surenb@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20241226170710.1159679-1-surenb@google.com> X-Mailer: git-send-email 2.47.1.613.gc27f4b7a9f-goog Message-ID: <20241226170710.1159679-8-surenb@google.com> Subject: [PATCH v7 07/17] types: move struct rcuwait into types.h From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: peterz@infradead.org, willy@infradead.org, liam.howlett@oracle.com, lorenzo.stoakes@oracle.com, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mjguzik@gmail.com, oliver.sang@intel.com, mgorman@techsingularity.net, david@redhat.com, peterx@redhat.com, oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org, brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com, hughd@google.com, lokeshgidra@google.com, minchan@google.com, jannh@google.com, shakeel.butt@linux.dev, souravpanda@google.com, pasha.tatashin@soleen.com, klarasmodin@gmail.com, corbet@lwn.net, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@android.com, surenb@google.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Move rcuwait struct definition into types.h so that rcuwait can be used without including rcuwait.h which includes other headers. Without this change mm_types.h can't use rcuwait due to a the following circular dependency: mm_types.h -> rcuwait.h -> signal.h -> mm_types.h Suggested-by: Matthew Wilcox Signed-off-by: Suren Baghdasaryan Acked-by: Davidlohr Bueso Acked-by: Liam R. Howlett --- include/linux/rcuwait.h | 13 +------------ include/linux/types.h | 12 ++++++++++++ 2 files changed, 13 insertions(+), 12 deletions(-) diff --git a/include/linux/rcuwait.h b/include/linux/rcuwait.h index 27343424225c..9ad134a04b41 100644 --- a/include/linux/rcuwait.h +++ b/include/linux/rcuwait.h @@ -4,18 +4,7 @@ =20 #include #include - -/* - * rcuwait provides a way of blocking and waking up a single - * task in an rcu-safe manner. - * - * The only time @task is non-nil is when a user is blocked (or - * checking if it needs to) on a condition, and reset as soon as we - * know that the condition has succeeded and are awoken. 
- */ -struct rcuwait { - struct task_struct __rcu *task; -}; +#include =20 #define __RCUWAIT_INITIALIZER(name) \ { .task =3D NULL, } diff --git a/include/linux/types.h b/include/linux/types.h index 2d7b9ae8714c..f1356a9a5730 100644 --- a/include/linux/types.h +++ b/include/linux/types.h @@ -248,5 +248,17 @@ typedef void (*swap_func_t)(void *a, void *b, int size= ); typedef int (*cmp_r_func_t)(const void *a, const void *b, const void *priv= ); typedef int (*cmp_func_t)(const void *a, const void *b); =20 +/* + * rcuwait provides a way of blocking and waking up a single + * task in an rcu-safe manner. + * + * The only time @task is non-nil is when a user is blocked (or + * checking if it needs to) on a condition, and reset as soon as we + * know that the condition has succeeded and are awoken. + */ +struct rcuwait { + struct task_struct __rcu *task; +}; + #endif /* __ASSEMBLY__ */ #endif /* _LINUX_TYPES_H */ --=20 2.47.1.613.gc27f4b7a9f-goog From nobody Mon Feb 9 13:01:56 2026 Received: from mail-pl1-f201.google.com (mail-pl1-f201.google.com [209.85.214.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7A30114373F for ; Thu, 26 Dec 2024 17:07:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1735232851; cv=none; b=X1RrnIFBHBQN/HAYNeZ+sqUcQ52zmgQdAiDWR6wywn53WHB9PstznTg1nFT6DTpBE/g7GBVL7jNKa8bA4pOw3zcMF0cvUopkYyVwN976mncVpwOhspKGPUwxNB1wziPQoDq+OSi5ZeQ2eT+ZS1yZxbZaydhxm0PCwDm8BdLLXo4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1735232851; c=relaxed/simple; bh=7e8BADgSS/eI3b4iWIZ0Vev97Rzzh61LrDXN/gVavY4=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=I4AdGWBerqlhZCe7bqK1fzofmphhVfyGla4Ajc4k8sFqvIuxUc2f7yY0/ezSOotsxwlZvZnjL2Q4tbhbxiWfxJPCxrYFuPmi0OS9ZdnJi3zNFph2pzHz9xgeRN3BPRZupvrgF8wvBN3lS+T0VU2WW4m0iNXWF0EtUX5gSgMuO2Y= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=QuaRa0qT; arc=none smtp.client-ip=209.85.214.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="QuaRa0qT" Received: by mail-pl1-f201.google.com with SMTP id d9443c01a7336-21631cbf87dso73147115ad.3 for ; Thu, 26 Dec 2024 09:07:30 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1735232850; x=1735837650; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=Jqq50JvZ531gII06X2js4uIjNkeT7syCyMdKPeFhteI=; b=QuaRa0qT4etm+sFBA6ezx4cwSUGZpTQusYuRn66OSuWRWWvQQsUdZQN8zy3ArQ3UHw fc3YYrHntsQY5tMdKT9CTqTeUetaLetgrInjQgUad1aCW/tQS3EJrow2rS6Ef0cdUsYX dx86Yo8S9yHmzRwWuvbK3uOVKgsqNbPNn+Msh61aLSeFXRWgY/tP8ZFR9AzAmm7Zze14 J9/Ue0FGpf88YbwhprpBNo5EN/VR7XsLdDq4yvekA4PeGvBemQoMrDIxORMWTfHJuxzH 6D9e40mzJ8tgDplZx0zjqy6utvgS1QdeBOst/XorTBsiJ8HRnIdfFy3ocJaNMlkLMARA vLNw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=1e100.net; s=20230601; t=1735232850; x=1735837650; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=Jqq50JvZ531gII06X2js4uIjNkeT7syCyMdKPeFhteI=; b=GK0JC7V7xx0hxct8D/MQQP0vimBhr+hTp+wtQy6iiuF1kKNF91WCg0hj38QZp2EL4+ u+Qbk+wox0BSvzn0lWm5G+FCzXUhWwfSsoGT4/0kdCq9CF4soOU6PmrjYLSAMsPrqAj2 zm2ysWz39s4oS5QRLeboyHXz/Sfi2hiysxVwDWnHx7h5C8GguOyzDOMoZICRvTKCcH+8 oMCU0DGxVgNK6FkSWdHS3Gj+p6W44V27dYUDaPV9PP8QH+nUg0TWXIzfpxMBbZt1nQfj yAnqqhWbEfbNcd+bahizvfbqFp0QdG22kRUhZFQWtAb5ni9NXBSEXCR57UjP0EEq7QUr HOtQ== X-Forwarded-Encrypted: i=1; AJvYcCUYVRfGu35OLrniZNX47WqyzO2EvykO+qG81iosWH4WlsMcmdDWQB419aGF/6QlWQDVYDJM5z8XwS+WIx0=@vger.kernel.org X-Gm-Message-State: AOJu0YwNHoDe8y90NiqwZwZYFYulT6XaQvxPv+v+Ahhl/jYv0Zq5vRqt EUVg3qymWCt6JZGL4rsXvlx7DJeb07MGXIvgySCnIljq8o/p6dW7ZBzfL4clBqsVQ2G/egtbWtm 6mQ== X-Google-Smtp-Source: AGHT+IFQ/9ir9mO2gzhzka3p88zGCJjxZamQMl1FNnXg5zmv9rf3CH9hBVzxn6/eqqjJbIU25WLMpAjn/FY= X-Received: from pfbbe9.prod.google.com ([2002:a05:6a00:1f09:b0:725:936f:c305]) (user=surenb job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a00:6f0b:b0:725:b4f7:378e with SMTP id d2e1a72fcca58-72abdbe0cb5mr31015210b3a.0.1735232849837; Thu, 26 Dec 2024 09:07:29 -0800 (PST) Date: Thu, 26 Dec 2024 09:07:00 -0800 In-Reply-To: <20241226170710.1159679-1-surenb@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20241226170710.1159679-1-surenb@google.com> X-Mailer: git-send-email 2.47.1.613.gc27f4b7a9f-goog Message-ID: <20241226170710.1159679-9-surenb@google.com> Subject: [PATCH v7 08/17] mm: allow vma_start_read_locked/vma_start_read_locked_nested to fail From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: peterz@infradead.org, willy@infradead.org, liam.howlett@oracle.com, lorenzo.stoakes@oracle.com, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mjguzik@gmail.com, oliver.sang@intel.com, mgorman@techsingularity.net, david@redhat.com, peterx@redhat.com, oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org, brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com, hughd@google.com, lokeshgidra@google.com, minchan@google.com, jannh@google.com, shakeel.butt@linux.dev, souravpanda@google.com, pasha.tatashin@soleen.com, klarasmodin@gmail.com, corbet@lwn.net, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@android.com, surenb@google.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" With upcoming replacement of vm_lock with vm_refcnt, we need to handle a possibility of vma_start_read_locked/vma_start_read_locked_nested failing due to refcount overflow. Prepare for such possibility by changing these APIs and adjusting their users. Signed-off-by: Suren Baghdasaryan Cc: Lokesh Gidra Acked-by: Vlastimil Babka --- include/linux/mm.h | 6 ++++-- mm/userfaultfd.c | 17 ++++++++++++----- 2 files changed, 16 insertions(+), 7 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index c50edfedd99d..ab27de9729d8 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -747,10 +747,11 @@ static inline bool vma_start_read(struct vm_area_stru= ct *vma) * not be used in such cases because it might fail due to mm_lock_seq over= flow. * This functionality is used to obtain vma read lock and drop the mmap re= ad lock. 
*/ -static inline void vma_start_read_locked_nested(struct vm_area_struct *vma= , int subclass) +static inline bool vma_start_read_locked_nested(struct vm_area_struct *vma= , int subclass) { mmap_assert_locked(vma->vm_mm); down_read_nested(&vma->vm_lock.lock, subclass); + return true; } =20 /* @@ -759,10 +760,11 @@ static inline void vma_start_read_locked_nested(struc= t vm_area_struct *vma, int * not be used in such cases because it might fail due to mm_lock_seq over= flow. * This functionality is used to obtain vma read lock and drop the mmap re= ad lock. */ -static inline void vma_start_read_locked(struct vm_area_struct *vma) +static inline bool vma_start_read_locked(struct vm_area_struct *vma) { mmap_assert_locked(vma->vm_mm); down_read(&vma->vm_lock.lock); + return true; } =20 static inline void vma_end_read(struct vm_area_struct *vma) diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index 4527c385935b..38207d8be205 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -85,7 +85,8 @@ static struct vm_area_struct *uffd_lock_vma(struct mm_str= uct *mm, mmap_read_lock(mm); vma =3D find_vma_and_prepare_anon(mm, address); if (!IS_ERR(vma)) - vma_start_read_locked(vma); + if (!vma_start_read_locked(vma)) + vma =3D ERR_PTR(-EAGAIN); =20 mmap_read_unlock(mm); return vma; @@ -1483,10 +1484,16 @@ static int uffd_move_lock(struct mm_struct *mm, mmap_read_lock(mm); err =3D find_vmas_mm_locked(mm, dst_start, src_start, dst_vmap, src_vmap); if (!err) { - vma_start_read_locked(*dst_vmap); - if (*dst_vmap !=3D *src_vmap) - vma_start_read_locked_nested(*src_vmap, - SINGLE_DEPTH_NESTING); + if (vma_start_read_locked(*dst_vmap)) { + if (*dst_vmap !=3D *src_vmap) { + if (!vma_start_read_locked_nested(*src_vmap, + SINGLE_DEPTH_NESTING)) { + vma_end_read(*dst_vmap); + err =3D -EAGAIN; + } + } + } else + err =3D -EAGAIN; } mmap_read_unlock(mm); return err; --=20 2.47.1.613.gc27f4b7a9f-goog From nobody Mon Feb 9 13:01:56 2026 Received: from mail-pl1-f202.google.com (mail-pl1-f202.google.com [209.85.214.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 759161D7E4A for ; Thu, 26 Dec 2024 17:07:32 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1735232853; cv=none; b=P0Qdqwb0G7CwJPVQVH32bTIAHvpQTdbxLXmX5XpxFl3XsaNUymnoqpS74Z29bBFWdyH6aOXT1gPvfdcBtrY5zZYNZiPhirQEQ8pRcoUd7xNceNHp7cYvpP1R2+CNkMr30Ow3F7nHTiFzu2kAKaXbBJYtNQN7ARxtwwA6t3l8EPk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1735232853; c=relaxed/simple; bh=eVQ3VVggh8kVNxQAQ6vf7aEf20tCEfNHXPz3nGbyC34=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=ne7eEs1t7E35/L9+jRl+OeQCZKLJxa0ssuiECffDQpTdQsnSSVjbAywmYtaM4Wi8w8fC3rMuNF2DcRJWHUenYpbqjcpJxa3VvEMUF9xNJxZMsbyr36BRycBdyvqkf001Zi/4LYlYmTllP29YQFrQ0kUlJotwi3ERmn0IwWN0m+I= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=isj/PPJZ; arc=none smtp.client-ip=209.85.214.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com Authentication-Results: 
smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="isj/PPJZ" Received: by mail-pl1-f202.google.com with SMTP id d9443c01a7336-2166d99341eso84110115ad.0 for ; Thu, 26 Dec 2024 09:07:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1735232852; x=1735837652; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=3UFX636acVjy8McvbWwNjOMeHy5jkJa1vw72jY/LAWg=; b=isj/PPJZ2b1nSCRYbvKE/9CiqUanOR8d4bs21E+B2n7F/aYEM31Q+xbtHaigPkaNps FRn9QUVz20Op5fcuLgKcXUgUY821+JwQxpJsTaB0RdeThUeZrDWGondHB5xyeOWGFxF2 d4LCSxEYAZkRiY23pm3l55Dqcx+YShycEZkjdv1NJSBL9knRWHrf9RI+pyqvkqU0yUW6 rKN5NwWhHQ63TXUae+zu7B5+MU5Nu10nt2DI6+YHwCOuA5CerE6Y8LLhCmeRn7rDYw4j ExApa8YxUpeYwx4gjkUP62+waHZEDx/dZK2VmUCT5/har3hM4dVRWLAwsZ/qXd27RFAy qNQQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1735232852; x=1735837652; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=3UFX636acVjy8McvbWwNjOMeHy5jkJa1vw72jY/LAWg=; b=ajHouzXrDnI7rDm1oMPzBzX8zrYibmM+PpaaqUndAE1UkndmTvLb8Q24VAb8xJUXcj /IBJSbXVbmuEDuAjRh2NM+QJeUUZq/za9SueJAMMHJLj1C+DjbDDCuQXFrctDFNbmDb+ GkxVXYrLrOBXDjHqVktMZDO9WBHxj+gTEerEuj2J92gMMRRL89Fjt+TB2uvNnGA66Gpt xXCu81Hk1XMTzdWDkeyopHNHFvpSWN6JGVHkgRgrA7I+p8Q0frOTgxSzFxWEx6mcF18P sw0cnADzX3P0XlbMpcFaxN451+iIpHxHSCIGPVHkJJyzJN2WYMHzlFVOMd53GaTBAlXK 44Og== X-Forwarded-Encrypted: i=1; AJvYcCUnqSwUvkWhaJoFY04Wx6mPYi1SAJxMPaBl1+8dBgNS5Co+/iG+cbYn1tttUiPG1PJ6bdU6eFB+Bw5Ua1c=@vger.kernel.org X-Gm-Message-State: AOJu0YxOfl66EbFbW0WQhY8gzDPRfuJpSh6Zr6NkO8UiJG8la7dkKvfq VX+2azn8VdXP4NxiALu99QWAD+KNK9/ZtAoHJvRocyIAIQlLDb2p39b538dHgim6BsvMSv35E1+ olA== X-Google-Smtp-Source: AGHT+IE+Il4aaMscBOZzUniftz3cly+CPDHxMUs/tHLaE2nxYIYJF6s9V97DoMW87H97O8rFF+TQEZN/e8w= X-Received: from plsp1.prod.google.com ([2002:a17:902:bd01:b0:212:5134:8485]) (user=surenb job=prod-delivery.src-stubby-dispatcher) by 2002:a17:903:2cc:b0:216:5867:976a with SMTP id d9443c01a7336-219e6f1176fmr305791325ad.45.1735232851873; Thu, 26 Dec 2024 09:07:31 -0800 (PST) Date: Thu, 26 Dec 2024 09:07:01 -0800 In-Reply-To: <20241226170710.1159679-1-surenb@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20241226170710.1159679-1-surenb@google.com> X-Mailer: git-send-email 2.47.1.613.gc27f4b7a9f-goog Message-ID: <20241226170710.1159679-10-surenb@google.com> Subject: [PATCH v7 09/17] mm: move mmap_init_lock() out of the header file From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: peterz@infradead.org, willy@infradead.org, liam.howlett@oracle.com, lorenzo.stoakes@oracle.com, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mjguzik@gmail.com, oliver.sang@intel.com, mgorman@techsingularity.net, david@redhat.com, peterx@redhat.com, oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org, brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com, hughd@google.com, lokeshgidra@google.com, minchan@google.com, jannh@google.com, shakeel.butt@linux.dev, souravpanda@google.com, pasha.tatashin@soleen.com, klarasmodin@gmail.com, corbet@lwn.net, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@android.com, surenb@google.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" mmap_init_lock() is used only 
from mm_init() in fork.c, therefore it does not have to reside in the header file. This move lets us avoid including additional headers in mmap_lock.h later, when mmap_init_lock() needs to initialize rcuwait object. Signed-off-by: Suren Baghdasaryan Reviewed-by: Vlastimil Babka --- include/linux/mmap_lock.h | 6 ------ kernel/fork.c | 6 ++++++ 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h index 45a21faa3ff6..4706c6769902 100644 --- a/include/linux/mmap_lock.h +++ b/include/linux/mmap_lock.h @@ -122,12 +122,6 @@ static inline bool mmap_lock_speculate_retry(struct mm= _struct *mm, unsigned int =20 #endif /* CONFIG_PER_VMA_LOCK */ =20 -static inline void mmap_init_lock(struct mm_struct *mm) -{ - init_rwsem(&mm->mmap_lock); - mm_lock_seqcount_init(mm); -} - static inline void mmap_write_lock(struct mm_struct *mm) { __mmap_lock_trace_start_locking(mm, true); diff --git a/kernel/fork.c b/kernel/fork.c index f2f9e7b427ad..d4c75428ccaf 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -1219,6 +1219,12 @@ static void mm_init_uprobes_state(struct mm_struct *= mm) #endif } =20 +static inline void mmap_init_lock(struct mm_struct *mm) +{ + init_rwsem(&mm->mmap_lock); + mm_lock_seqcount_init(mm); +} + static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct = *p, struct user_namespace *user_ns) { --=20 2.47.1.613.gc27f4b7a9f-goog From nobody Mon Feb 9 13:01:56 2026 Received: from mail-pl1-f201.google.com (mail-pl1-f201.google.com [209.85.214.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 77B7A1D89FA for ; Thu, 26 Dec 2024 17:07:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1735232855; cv=none; b=mBsNHhbjo1hJGwu3XXtqwrcLtGMwU6RnT6PQlQxQhY6R8ZAB66KL88foBNt6u+9gnDmaKUddO61xrucwsjM7k2kaGltsqOTZGkUtmwOh3fPaNp7BohbR/drx5PqZvK4A3wfw2PJfz/Hdw5rUwrxVtF1080ha0seh42N5G2yjFw8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1735232855; c=relaxed/simple; bh=y5H/oyb5yTq6yRneEBepHGREG8blFWXRMu5RqF/mU9c=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=DnZW5MBbzGDFSS97sY8qiwqM8DQGyhp91ag9nHE4Ut7wGj5jPcdhYzlzBRu++4EA20cU50QVeQiXmU250hpO/gXhPp7NrugYRJpZWciPCTvuTWFjjRqR9MMOvGNFtzu6FhP6u6PdTg1ZvXQ3Bx3Ow+AOt3kru7tbFutLGsykobI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=XVpx/Eud; arc=none smtp.client-ip=209.85.214.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="XVpx/Eud" Received: by mail-pl1-f201.google.com with SMTP id d9443c01a7336-218cf85639eso125213835ad.3 for ; Thu, 26 Dec 2024 09:07:34 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1735232854; x=1735837654; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to 
:date:from:to:cc:subject:date:message-id:reply-to; bh=itcT003jdnlfYy/z4AAxsZgzhf2fL1y4OTxlsN9l2Ik=; b=XVpx/EudRd2RdUgnbP2+o1KeYdk0hBPA7u6i0leHRz+RMK2vL4EyuhFYEX/fMEp8UD RKa6LVtvdivBnacgyZ3+CdcjfiJC/gMuDDjTr1JCQ1hy7dRwqmEMvRIAXmacvCLpy2T+ L2m1/0w5MupzW+8aSmJE8LMdbi6pcXwTnl8nUGbYfm90a17pcNxBILRKRR+/yvcT0C+t CIXwFQVuZwC5OOtWNE4Nyrqnvw6cFfEI0Aahznta2GB5vIuxKUUoqd08LRHJcf3UGDTf FUkUlVzYTJ0DRx/GnEXX3lZ79NRCfAugzBscAPb9EL8IASawu44C9P7V3TzqRo05PTpQ +ixA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1735232854; x=1735837654; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=itcT003jdnlfYy/z4AAxsZgzhf2fL1y4OTxlsN9l2Ik=; b=jwAewyChIWwfs8toE9VUB6bHgJi8THW9IPcum2GAi7M0Ct0igVKc3VguEXNPr61Cvu 11oBeF9VLkc0r9UaC+g0xHDem61Rs4urzTs8vUZaetak96thtF7yYmVV/lsC0ysMwCK2 ynSyks7i+jKKndNTNp+wp8N1UcAxSOWqQYUvjJRbcGzYpiigZKzaL5R993dvT8eKa7jl P7abdYSFjEX0zoen425pvJyXgzUMpicM29S8zOBS++tGAhiLjEAdkf7cYHmD1lpeGzP8 ttYlgN1uQRVZ80QslDOTWD57Pqc5zQJdSeRn5I4+MWAK6wVlt+luGzNhx0ZrP2dKj39y eYZQ== X-Forwarded-Encrypted: i=1; AJvYcCXBbTydBzrefcO47oA1AXadjfbGQlHliHybwoCHzIxXSFCgctwECK5yyHw9iJ4408OSsFtTHCvbmHtvcrs=@vger.kernel.org X-Gm-Message-State: AOJu0Yxp+7SIZ9yhGSfElY5uxDULpUyzrmS8LhTYjeFnYfllSVPDOM0F DtBDystB7ESQAR45ikfzQpnda/e/A84oG09ImU6cgG4xa5vzlB4StZmvHeH1pP+AeiO5ZGPBksZ buA== X-Google-Smtp-Source: AGHT+IFK/ZcPw9QQ8rKy9Iz7tN9PwEqV+omG9KoBZdHVze2si2OKYWeWR6yvQPLae2+NyznTr7bIa67cCm8= X-Received: from pfaq3.prod.google.com ([2002:a05:6a00:a883:b0:725:e76f:1445]) (user=surenb job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a20:4308:b0:1e0:c8d9:3382 with SMTP id adf61e73a8af0-1e5e0847084mr38655491637.45.1735232853953; Thu, 26 Dec 2024 09:07:33 -0800 (PST) Date: Thu, 26 Dec 2024 09:07:02 -0800 In-Reply-To: <20241226170710.1159679-1-surenb@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20241226170710.1159679-1-surenb@google.com> X-Mailer: git-send-email 2.47.1.613.gc27f4b7a9f-goog Message-ID: <20241226170710.1159679-11-surenb@google.com> Subject: [PATCH v7 10/17] mm: uninline the main body of vma_start_write() From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: peterz@infradead.org, willy@infradead.org, liam.howlett@oracle.com, lorenzo.stoakes@oracle.com, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mjguzik@gmail.com, oliver.sang@intel.com, mgorman@techsingularity.net, david@redhat.com, peterx@redhat.com, oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org, brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com, hughd@google.com, lokeshgidra@google.com, minchan@google.com, jannh@google.com, shakeel.butt@linux.dev, souravpanda@google.com, pasha.tatashin@soleen.com, klarasmodin@gmail.com, corbet@lwn.net, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@android.com, surenb@google.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" vma_start_write() is used in many places and will grow in size very soon. It is not used in performance critical paths and uninlining it should limit the future code size growth. No functional changes. 
Signed-off-by: Suren Baghdasaryan Reviewed-by: Vlastimil Babka --- include/linux/mm.h | 12 +++--------- mm/memory.c | 14 ++++++++++++++ 2 files changed, 17 insertions(+), 9 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index ab27de9729d8..ea4c4228b125 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -787,6 +787,8 @@ static bool __is_vma_write_locked(struct vm_area_struct= *vma, unsigned int *mm_l return (vma->vm_lock_seq =3D=3D *mm_lock_seq); } =20 +void __vma_start_write(struct vm_area_struct *vma, unsigned int mm_lock_se= q); + /* * Begin writing to a VMA. * Exclude concurrent readers under the per-VMA lock until the currently @@ -799,15 +801,7 @@ static inline void vma_start_write(struct vm_area_stru= ct *vma) if (__is_vma_write_locked(vma, &mm_lock_seq)) return; =20 - down_write(&vma->vm_lock.lock); - /* - * We should use WRITE_ONCE() here because we can have concurrent reads - * from the early lockless pessimistic check in vma_start_read(). - * We don't really care about the correctness of that early check, but - * we should use WRITE_ONCE() for cleanliness and to keep KCSAN happy. - */ - WRITE_ONCE(vma->vm_lock_seq, mm_lock_seq); - up_write(&vma->vm_lock.lock); + __vma_start_write(vma, mm_lock_seq); } =20 static inline void vma_assert_write_locked(struct vm_area_struct *vma) diff --git a/mm/memory.c b/mm/memory.c index d0dee2282325..236fdecd44d6 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -6328,6 +6328,20 @@ struct vm_area_struct *lock_mm_and_find_vma(struct m= m_struct *mm, #endif =20 #ifdef CONFIG_PER_VMA_LOCK +void __vma_start_write(struct vm_area_struct *vma, unsigned int mm_lock_se= q) +{ + down_write(&vma->vm_lock.lock); + /* + * We should use WRITE_ONCE() here because we can have concurrent reads + * from the early lockless pessimistic check in vma_start_read(). + * We don't really care about the correctness of that early check, but + * we should use WRITE_ONCE() for cleanliness and to keep KCSAN happy. + */ + WRITE_ONCE(vma->vm_lock_seq, mm_lock_seq); + up_write(&vma->vm_lock.lock); +} +EXPORT_SYMBOL_GPL(__vma_start_write); + /* * Lookup and lock a VMA under RCU protection. Returned VMA is guaranteed = to be * stable and not isolated. 
If the VMA is not found or is being modified t= he --=20 2.47.1.613.gc27f4b7a9f-goog From nobody Mon Feb 9 13:01:56 2026 Received: from mail-pj1-f74.google.com (mail-pj1-f74.google.com [209.85.216.74]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 849D11D90B3 for ; Thu, 26 Dec 2024 17:07:36 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.74 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1735232858; cv=none; b=fOPPS1OwRlE/tJPmjq6eKpkVFg/OBzwfP49Wn83Mg3iee71m3Bn5qETYj4L1s2ELrXTxBVNq6ndqYzesSB+Pj8cdRZNx+fSGSY31I4aCNXcHx3GaxdRU7v4cwneo/u2SJbPIiH3UP0QeW1dSR9YzBz09mR4a9//ztADTguona5o= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1735232858; c=relaxed/simple; bh=slEyDSyOmggj5mJJINjjr/YocKsmeusN6XkbngdxcOw=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=oxGqSeEzYmFTTqsBt4fY132FC+OvklBPpEJdIAxfdlgA4BwLUVMiQb/l+/VMA2VeRQjL8iNPViGeV0I3ogFwe/WlUiFnRjBIY7DdZ3Qe++kal809lLdbGEkEMxi1bwATDYeek/BcysDcSISyXUgyHVmWk6rj46i4NqphH0YZYUQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=i+1YKpTg; arc=none smtp.client-ip=209.85.216.74 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="i+1YKpTg" Received: by mail-pj1-f74.google.com with SMTP id 98e67ed59e1d1-2ef9204f898so8145849a91.2 for ; Thu, 26 Dec 2024 09:07:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1735232856; x=1735837656; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=PJyEpzu4DU8rJRmq3ow3hqlNy2LMWwwp3HWnjTLdNPo=; b=i+1YKpTg2Mx2oWk2NdqVxtYPcRiHcZ2KZX91e9faaDhq/o+H8+jgSG6lSgC8AtMeMk qEsVGnyhSEjtj6ufIaBwldJZ+PznCXN3IPpdU+EMQveD1k3bTslQxThhyeJqRJj/Y+eX xP3soaIvvIv0bOJ/e3p5DZV17OjfGd5WTdtW9iWXUtsLEl7oQ7O1hCnMj63NeuTDjywM qVEs/d5djwxnG4kBCCA3gfl4aX0GbBx8L+FScoyO75MYBrDawSJASZ9/A+KLyYUnBxYV JPaau9aEUKyk8UKA5IBTCF8oW9Pzzr56y4s5lbW3hq9F1umbngK0/dnE43Re37z5xx0Q Ixjw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1735232856; x=1735837656; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=PJyEpzu4DU8rJRmq3ow3hqlNy2LMWwwp3HWnjTLdNPo=; b=v2wJb4BoleGeDJau2a2WS4QVywZi4kKDPQUnkX7YPstcjjqMWnYj4rA7OppHlrCZKI G9tXEkWy0mwMBEwPwuwJb5iDol9XJ9KZvcaHglLMQrFlLEr0bNDiRyH/0u9ylUvHWO4Z S2uoKk7jOx2t8O6/nHqurHf4x78pYVwFU1uTDUEpxNIT/kAigBN2qiR7KO2fDg0GpALV gBSWBsmYp5kCvAu4gb1I3P8BgJAVm9VG1UsVWHipftCfHlY1f/V0ZceSWYBcVT0uuUgj nRLQa+TiI0iIBEuj9GYN/IwbsobhcpforgBbONPxX/SYuMEddfdKxiIRa08s7FqOfJx2 qAoQ== X-Forwarded-Encrypted: i=1; AJvYcCW81VOAYroVUB+z5ccBpeBKf0+zfWj03hzTuH6aExnWH2YaX3bo5K56vCJxzSAR4j/G3Z+D6FIiVBqIpHA=@vger.kernel.org X-Gm-Message-State: AOJu0Yx2Jtjjmh5nTCCzDzVaoQqOZNuprCoUpPTnRJAJnAUh1Nkibt1z 
zeRV2vxw41jXsEwUEvj7thb4EB1zQ9j+9PBN3+rqVhXXX7TdphsSBRdC6fd7AIwjutQpGdR5NDu +JQ== X-Google-Smtp-Source: AGHT+IEl4ZXzZa5pPyzTXDxPRSH8PuS/I6bpKT2fhAEsb0xYuFK9NMlUOzOSp0EdHvBpo/b42AdwcHR6bBY= X-Received: from pjtu11.prod.google.com ([2002:a17:90a:c88b:b0:2f2:e933:8ba6]) (user=surenb job=prod-delivery.src-stubby-dispatcher) by 2002:a17:90b:528a:b0:2ee:8619:210b with SMTP id 98e67ed59e1d1-2f452ec3589mr35645319a91.29.1735232856007; Thu, 26 Dec 2024 09:07:36 -0800 (PST) Date: Thu, 26 Dec 2024 09:07:03 -0800 In-Reply-To: <20241226170710.1159679-1-surenb@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20241226170710.1159679-1-surenb@google.com> X-Mailer: git-send-email 2.47.1.613.gc27f4b7a9f-goog Message-ID: <20241226170710.1159679-12-surenb@google.com> Subject: [PATCH v7 11/17] refcount: introduce __refcount_{add|inc}_not_zero_limited From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: peterz@infradead.org, willy@infradead.org, liam.howlett@oracle.com, lorenzo.stoakes@oracle.com, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mjguzik@gmail.com, oliver.sang@intel.com, mgorman@techsingularity.net, david@redhat.com, peterx@redhat.com, oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org, brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com, hughd@google.com, lokeshgidra@google.com, minchan@google.com, jannh@google.com, shakeel.butt@linux.dev, souravpanda@google.com, pasha.tatashin@soleen.com, klarasmodin@gmail.com, corbet@lwn.net, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@android.com, surenb@google.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Introduce functions to increase refcount but with a top limit above which they will fail to increase. Setting the limit to 0 indicates no limit. 
Signed-off-by: Suren Baghdasaryan --- include/linux/refcount.h | 20 +++++++++++++++++++- 1 file changed, 19 insertions(+), 1 deletion(-) diff --git a/include/linux/refcount.h b/include/linux/refcount.h index 35f039ecb272..e51a49179307 100644 --- a/include/linux/refcount.h +++ b/include/linux/refcount.h @@ -137,13 +137,19 @@ static inline unsigned int refcount_read(const refcou= nt_t *r) } =20 static inline __must_check __signed_wrap -bool __refcount_add_not_zero(int i, refcount_t *r, int *oldp) +bool __refcount_add_not_zero_limited(int i, refcount_t *r, int *oldp, + int limit) { int old =3D refcount_read(r); =20 do { if (!old) break; + if (limit && old + i > limit) { + if (oldp) + *oldp =3D old; + return false; + } } while (!atomic_try_cmpxchg_relaxed(&r->refs, &old, old + i)); =20 if (oldp) @@ -155,6 +161,12 @@ bool __refcount_add_not_zero(int i, refcount_t *r, int= *oldp) return old; } =20 +static inline __must_check __signed_wrap +bool __refcount_add_not_zero(int i, refcount_t *r, int *oldp) +{ + return __refcount_add_not_zero_limited(i, r, oldp, 0); +} + /** * refcount_add_not_zero - add a value to a refcount unless it is 0 * @i: the value to add to the refcount @@ -213,6 +225,12 @@ static inline void refcount_add(int i, refcount_t *r) __refcount_add(i, r, NULL); } =20 +static inline __must_check bool __refcount_inc_not_zero_limited(refcount_t= *r, + int *oldp, int limit) +{ + return __refcount_add_not_zero_limited(1, r, oldp, limit); +} + static inline __must_check bool __refcount_inc_not_zero(refcount_t *r, int= *oldp) { return __refcount_add_not_zero(1, r, oldp); --=20 2.47.1.613.gc27f4b7a9f-goog From nobody Mon Feb 9 13:01:56 2026 Received: from mail-pl1-f202.google.com (mail-pl1-f202.google.com [209.85.214.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CC0111D95A9 for ; Thu, 26 Dec 2024 17:07:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1735232861; cv=none; b=U4c4OC3H8RvTMg3VTKpE5oXsjpfNb75RI8T8HIRSeARxHey7ogt8SMdVhaG4+ePam4kNJUiGvi15rqzrZ1S10P6tJdLsYZyH7TJqSMImu0g+W6J0mOLtamdTJ5RRQUzAM79TwlFjLc+lg6ztF3scBmXr1qZY7/jJt2S+1g3u+0s= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1735232861; c=relaxed/simple; bh=5w+63U9MfQ/w0WDtsRcDNSm/jts5O33pMe9QLmR3m2Y=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=RaKbWuk5f03QSj4829Wh3sNY+rfx4Pl6IVkCY1uKAJnhyBXp4w9kOxrDOxTdzMKOG99RovQI/CYTeR0ZfrmMex4prw40YDc6QF4qH+5VfIvjc1dbN8yi5jFJ+9v8ceUKO9R7kWbPwDTWbaZNYTzFdGysBQCUwpKY6mRQB7U7hbo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=cpBdz4iR; arc=none smtp.client-ip=209.85.214.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--surenb.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="cpBdz4iR" Received: by mail-pl1-f202.google.com with SMTP id d9443c01a7336-2165433e229so81594185ad.1 for ; Thu, 26 Dec 2024 09:07:38 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=google.com; s=20230601; t=1735232858; x=1735837658; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=89s7s8pRCE/F5saH3vOD8dpykpc0sY2Jizt1/i2JaEM=; b=cpBdz4iRSYS5YXAiaxX62ElIw+rW467PhJGd8Qz3whfNoHE4cQq9ZOO07wkWPGxqno +pRq5cKhG8FDRYR8Zor/MhBxY2Ux0oEwgMxAVLkT0tANA6loh7Db0RISz4BP82q4EV8s uvlNJuUNrbFxRaHQdCucIEL7HJWx6UgnDxNMEq3rY8A15Ky8rpl85X+leBwZkYo+kvn9 hwYu6r3SP68wxv+YHPkJvVycnhi6lXLposn85SIryswxL22LRD/dFhLk/b4Yz/znwqMS DLOZZX4OFwA3eUUJUZEZ+bZ+Kj8rXiY6MnFCfVPpKk4C3TVIH3AGEaSxABngHhCgPkNM 9AZw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1735232858; x=1735837658; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=89s7s8pRCE/F5saH3vOD8dpykpc0sY2Jizt1/i2JaEM=; b=M+Vec7Ac+t9UaFY1/nvx7IOFaGJRtlMULqXXCogct2UgUx0wv/UyQdAqZEWpwJvwXH asOoz/cfyaEFaKS/xXbPsI4QhP3dGRW8jEHuQy7TSnvRz+h2lsKhZv0YXuBI6fyw30KJ +CQFSbxWNCIkW2FB1upKO5FiWahO+sW9d4WfqqhcoeKjwU0qjgoRg7ZJ9WS1cU0acKs/ nHuf+HA3vrRWfET59v+VK1VEVykQ6+ZebwAD1JqEqSfZgQK55B5LaR8TZu33ZHmLRqfS TPLntHjTvUbj+oJxe8/0vJQiaN7IFF91qRMZsPs7IuYacWFXFVep0PMrUTJQMuNbQjwP 3RBA== X-Forwarded-Encrypted: i=1; AJvYcCVO3pZ2yziFjOEDHIlNB5EVLlV+nX7jLDdOXNtFFJgJ71+w7DCNDjbnuRenwhPNBGfZzaYkPeBZdy/nEug=@vger.kernel.org X-Gm-Message-State: AOJu0YzZQd9JxtAHB4uu6ZE4gqVHk9b1kRMzBKBuT0KJ1dmkkRfrmMDv JDkGPcZGc3i1v0bzI7QbUXrYdDDm+57M/bvGDn3L6ItV8AIAZIQzPWl/ox+ip7/2+OP6z9b691s 7GA== X-Google-Smtp-Source: AGHT+IE7Mk8eNe/JaivV7kGqKUrjWDMa+rKOhzyI+abJe9i8/UdhXPRKG981mQLisrrX+gPM4LihSIFuh4U= X-Received: from pgbbi9.prod.google.com ([2002:a05:6a02:249:b0:7fd:5835:26d1]) (user=surenb job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a20:6a25:b0:1d9:6c9c:75ea with SMTP id adf61e73a8af0-1e5e0448930mr37740524637.5.1735232858088; Thu, 26 Dec 2024 09:07:38 -0800 (PST) Date: Thu, 26 Dec 2024 09:07:04 -0800 In-Reply-To: <20241226170710.1159679-1-surenb@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20241226170710.1159679-1-surenb@google.com> X-Mailer: git-send-email 2.47.1.613.gc27f4b7a9f-goog Message-ID: <20241226170710.1159679-13-surenb@google.com> Subject: [PATCH v7 12/17] mm: replace vm_lock and detached flag with a reference count From: Suren Baghdasaryan To: akpm@linux-foundation.org Cc: peterz@infradead.org, willy@infradead.org, liam.howlett@oracle.com, lorenzo.stoakes@oracle.com, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org, mjguzik@gmail.com, oliver.sang@intel.com, mgorman@techsingularity.net, david@redhat.com, peterx@redhat.com, oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org, brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com, hughd@google.com, lokeshgidra@google.com, minchan@google.com, jannh@google.com, shakeel.butt@linux.dev, souravpanda@google.com, pasha.tatashin@soleen.com, klarasmodin@gmail.com, corbet@lwn.net, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@android.com, surenb@google.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" rw_semaphore is a sizable structure of 40 bytes and consumes considerable space for each vm_area_struct. However vma_lock has two important specifics which can be used to replace rw_semaphore with a simpler structure: 1. Readers never wait. 
They try to take the vma_lock and fall back to mmap_lock if that fails. 2. Only one writer at a time will ever try to write-lock a vma_lock because writers first take mmap_lock in write mode. Because of these requirements, full rw_semaphore functionality is not needed and we can replace rw_semaphore and the vma->detached flag with a refcount (vm_refcnt). When vma is in detached state, vm_refcnt is 0 and only a call to vma_mark_attached() can take it out of this state. Note that unlike before, now we enforce both vma_mark_attached() and vma_mark_detached() to be done only after vma has been write-locked. vma_mark_attached() changes vm_refcnt to 1 to indicate that it has been attached to the vma tree. When a reader takes read lock, it increments vm_refcnt, unless the top usable bit of vm_refcnt (0x40000000) is set, indicating presence of a writer. When writer takes write lock, it both increments vm_refcnt and sets the top usable bit to indicate its presence. If there are readers, writer will wait using newly introduced mm->vma_writer_wait. Since all writers take mmap_lock in write mode first, there can be only one writer at a time. The last reader to release the lock will signal the writer to wake up. refcount might overflow if there are many competing readers, in which case read-locking will fail. Readers are expected to handle such failures. Suggested-by: Peter Zijlstra Suggested-by: Matthew Wilcox Signed-off-by: Suren Baghdasaryan --- include/linux/mm.h | 100 +++++++++++++++++++++---------- include/linux/mm_types.h | 22 ++++--- kernel/fork.c | 13 ++-- mm/init-mm.c | 1 + mm/memory.c | 68 +++++++++++++++++---- tools/testing/vma/linux/atomic.h | 5 ++ tools/testing/vma/vma_internal.h | 66 +++++++++++--------- 7 files changed, 185 insertions(+), 90 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index ea4c4228b125..99f4720d7e51 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -32,6 +32,7 @@ #include #include #include +#include =20 struct mempolicy; struct anon_vma; @@ -697,12 +698,34 @@ static inline void vma_numab_state_free(struct vm_are= a_struct *vma) {} #endif /* CONFIG_NUMA_BALANCING */ =20 #ifdef CONFIG_PER_VMA_LOCK -static inline void vma_lock_init(struct vm_area_struct *vma) +static inline void vma_lockdep_init(struct vm_area_struct *vma) { - init_rwsem(&vma->vm_lock.lock); +#ifdef CONFIG_DEBUG_LOCK_ALLOC + static struct lock_class_key lockdep_key; + + lockdep_init_map(&vma->vmlock_dep_map, "vm_lock", &lockdep_key, 0); +#endif +} + +static inline void vma_init_lock(struct vm_area_struct *vma, bool reset_re= fcnt) +{ + if (reset_refcnt) + refcount_set(&vma->vm_refcnt, 0); vma->vm_lock_seq =3D UINT_MAX; } =20 +static inline void vma_refcount_put(struct vm_area_struct *vma) +{ + int refcnt; + + if (!__refcount_dec_and_test(&vma->vm_refcnt, &refcnt)) { + rwsem_release(&vma->vmlock_dep_map, _RET_IP_); + + if (refcnt & VMA_LOCK_OFFSET) + rcuwait_wake_up(&vma->vm_mm->vma_writer_wait); + } +} + /* * Try to read-lock a vma. The function is allowed to occasionally yield f= alse * locked result to avoid performance overhead, in which case we fall back= to @@ -710,6 +733,8 @@ static inline void vma_lock_init(struct vm_area_struct = *vma) */ static inline bool vma_start_read(struct vm_area_struct *vma) { + int oldcnt; + /* * Check before locking. A race might cause false locked result. 
* We can use READ_ONCE() for the mm_lock_seq here, and don't need @@ -720,13 +745,20 @@ static inline bool vma_start_read(struct vm_area_stru= ct *vma) if (READ_ONCE(vma->vm_lock_seq) =3D=3D READ_ONCE(vma->vm_mm->mm_lock_seq.= sequence)) return false; =20 - if (unlikely(down_read_trylock(&vma->vm_lock.lock) =3D=3D 0)) + + rwsem_acquire_read(&vma->vmlock_dep_map, 0, 0, _RET_IP_); + /* Limit at VMA_REF_LIMIT to leave one count for a writer */ + if (unlikely(!__refcount_inc_not_zero_limited(&vma->vm_refcnt, &oldcnt, + VMA_REF_LIMIT))) { + rwsem_release(&vma->vmlock_dep_map, _RET_IP_); return false; + } + lock_acquired(&vma->vmlock_dep_map, _RET_IP_); =20 /* - * Overflow might produce false locked result. + * Overflow of vm_lock_seq/mm_lock_seq might produce false locked result. * False unlocked result is impossible because we modify and check - * vma->vm_lock_seq under vma->vm_lock protection and mm->mm_lock_seq + * vma->vm_lock_seq under vma->vm_refcnt protection and mm->mm_lock_seq * modification invalidates all existing locks. * * We must use ACQUIRE semantics for the mm_lock_seq so that if we are @@ -734,10 +766,12 @@ static inline bool vma_start_read(struct vm_area_stru= ct *vma) * after it has been unlocked. * This pairs with RELEASE semantics in vma_end_write_all(). */ - if (unlikely(vma->vm_lock_seq =3D=3D raw_read_seqcount(&vma->vm_mm->mm_lo= ck_seq))) { - up_read(&vma->vm_lock.lock); + if (unlikely(oldcnt & VMA_LOCK_OFFSET || + vma->vm_lock_seq =3D=3D raw_read_seqcount(&vma->vm_mm->mm_lock_seq)= )) { + vma_refcount_put(vma); return false; } + return true; } =20 @@ -749,8 +783,17 @@ static inline bool vma_start_read(struct vm_area_struc= t *vma) */ static inline bool vma_start_read_locked_nested(struct vm_area_struct *vma= , int subclass) { + int oldcnt; + mmap_assert_locked(vma->vm_mm); - down_read_nested(&vma->vm_lock.lock, subclass); + rwsem_acquire_read(&vma->vmlock_dep_map, subclass, 0, _RET_IP_); + /* Limit at VMA_REF_LIMIT to leave one count for a writer */ + if (unlikely(!__refcount_inc_not_zero_limited(&vma->vm_refcnt, &oldcnt, + VMA_REF_LIMIT))) { + rwsem_release(&vma->vmlock_dep_map, _RET_IP_); + return false; + } + lock_acquired(&vma->vmlock_dep_map, _RET_IP_); return true; } =20 @@ -762,15 +805,13 @@ static inline bool vma_start_read_locked_nested(struc= t vm_area_struct *vma, int */ static inline bool vma_start_read_locked(struct vm_area_struct *vma) { - mmap_assert_locked(vma->vm_mm); - down_read(&vma->vm_lock.lock); - return true; + return vma_start_read_locked_nested(vma, 0); } =20 static inline void vma_end_read(struct vm_area_struct *vma) { rcu_read_lock(); /* keeps vma alive till the end of up_read */ - up_read(&vma->vm_lock.lock); + vma_refcount_put(vma); rcu_read_unlock(); } =20 @@ -813,36 +854,33 @@ static inline void vma_assert_write_locked(struct vm_= area_struct *vma) =20 static inline void vma_assert_locked(struct vm_area_struct *vma) { - if (!rwsem_is_locked(&vma->vm_lock.lock)) + if (refcount_read(&vma->vm_refcnt) <=3D 1) vma_assert_write_locked(vma); } =20 +/* + * WARNING: to avoid racing with vma_mark_attached()/vma_mark_detached(), = these + * assertions should be made either under mmap_write_lock or when the obje= ct + * has been isolated under mmap_write_lock, ensuring no competing writers. 
+ */
 static inline void vma_assert_attached(struct vm_area_struct *vma)
 {
-	VM_BUG_ON_VMA(vma->detached, vma);
+	VM_BUG_ON_VMA(!refcount_read(&vma->vm_refcnt), vma);
 }
 
 static inline void vma_assert_detached(struct vm_area_struct *vma)
 {
-	VM_BUG_ON_VMA(!vma->detached, vma);
+	VM_BUG_ON_VMA(refcount_read(&vma->vm_refcnt), vma);
 }
 
 static inline void vma_mark_attached(struct vm_area_struct *vma)
 {
-	vma->detached = false;
-}
-
-static inline void vma_mark_detached(struct vm_area_struct *vma)
-{
-	/* When detaching vma should be write-locked */
 	vma_assert_write_locked(vma);
-	vma->detached = true;
+	vma_assert_detached(vma);
+	refcount_set(&vma->vm_refcnt, 1);
 }
 
-static inline bool is_vma_detached(struct vm_area_struct *vma)
-{
-	return vma->detached;
-}
+void vma_mark_detached(struct vm_area_struct *vma);
 
 static inline void release_fault_lock(struct vm_fault *vmf)
 {
@@ -865,7 +903,8 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
 
 #else /* CONFIG_PER_VMA_LOCK */
 
-static inline void vma_lock_init(struct vm_area_struct *vma) {}
+static inline void vma_lockdep_init(struct vm_area_struct *vma) {}
+static inline void vma_init_lock(struct vm_area_struct *vma, bool reset_refcnt) {}
 static inline bool vma_start_read(struct vm_area_struct *vma)
 		{ return false; }
 static inline void vma_end_read(struct vm_area_struct *vma) {}
@@ -908,12 +947,9 @@ static inline void vma_init(struct vm_area_struct *vma, struct mm_struct *mm)
 	vma->vm_mm = mm;
 	vma->vm_ops = &vma_dummy_vm_ops;
 	INIT_LIST_HEAD(&vma->anon_vma_chain);
-#ifdef CONFIG_PER_VMA_LOCK
-	/* vma is not locked, can't use vma_mark_detached() */
-	vma->detached = true;
-#endif
 	vma_numab_state_init(vma);
-	vma_lock_init(vma);
+	vma_lockdep_init(vma);
+	vma_init_lock(vma, false);
 }
 
 /* Use when VMA is not part of the VMA tree and needs no locking */
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 6573d95f1d1e..b5312421dec6 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -19,6 +19,7 @@
 #include
 #include
 #include
+#include
 
 #include
 
@@ -629,9 +630,8 @@ static inline struct anon_vma_name *anon_vma_name_alloc(const char *name)
 }
 #endif
 
-struct vma_lock {
-	struct rw_semaphore lock;
-};
+#define VMA_LOCK_OFFSET	0x40000000
+#define VMA_REF_LIMIT	(VMA_LOCK_OFFSET - 2)
 
 struct vma_numab_state {
 	/*
@@ -709,19 +709,13 @@ struct vm_area_struct {
 	};
 
 #ifdef CONFIG_PER_VMA_LOCK
-	/*
-	 * Flag to indicate areas detached from the mm->mm_mt tree.
-	 * Unstable RCU readers are allowed to read this.
-	 */
-	bool detached;
-
 	/*
 	 * Can only be written (using WRITE_ONCE()) while holding both:
 	 *  - mmap_lock (in write mode)
-	 *  - vm_lock->lock (in write mode)
+	 *  - vm_refcnt bit at VMA_LOCK_OFFSET is set
 	 * Can be read reliably while holding one of:
 	 *  - mmap_lock (in read or write mode)
-	 *  - vm_lock->lock (in read or write mode)
+	 *  - vm_refcnt bit at VMA_LOCK_OFFSET is set or vm_refcnt > 1
 	 * Can be read unreliably (using READ_ONCE()) for pessimistic bailout
 	 * while holding nothing (except RCU to keep the VMA struct allocated).
 	 *
@@ -784,7 +778,10 @@ struct vm_area_struct {
 	struct vm_userfaultfd_ctx vm_userfaultfd_ctx;
 #ifdef CONFIG_PER_VMA_LOCK
 	/* Unstable RCU readers are allowed to read this. */
-	struct vma_lock vm_lock ____cacheline_aligned_in_smp;
+	refcount_t vm_refcnt ____cacheline_aligned_in_smp;
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+	struct lockdep_map vmlock_dep_map;
+#endif
 #endif
 } __randomize_layout;
 
@@ -919,6 +916,7 @@ struct mm_struct {
 					  * by mmlist_lock
 					  */
 #ifdef CONFIG_PER_VMA_LOCK
+		struct rcuwait vma_writer_wait;
 		/*
 		 * This field has lock-like semantics, meaning it is sometimes
 		 * accessed with ACQUIRE/RELEASE semantics.
diff --git a/kernel/fork.c b/kernel/fork.c
index d4c75428ccaf..7a0800d48112 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -463,12 +463,8 @@ struct vm_area_struct *vm_area_dup(struct vm_area_struct *orig)
 	 * will be reinitialized.
 	 */
 	data_race(memcpy(new, orig, sizeof(*new)));
-	vma_lock_init(new);
+	vma_init_lock(new, true);
 	INIT_LIST_HEAD(&new->anon_vma_chain);
-#ifdef CONFIG_PER_VMA_LOCK
-	/* vma is not locked, can't use vma_mark_detached() */
-	new->detached = true;
-#endif
 	vma_numab_state_init(new);
 	dup_anon_vma_name(orig, new);
 
@@ -477,6 +473,8 @@ struct vm_area_struct *vm_area_dup(struct vm_area_struct *orig)
 
 void __vm_area_free(struct vm_area_struct *vma)
 {
+	/* The vma should be detached while being destroyed. */
+	vma_assert_detached(vma);
 	vma_numab_state_free(vma);
 	free_anon_vma_name(vma);
 	kmem_cache_free(vm_area_cachep, vma);
@@ -488,8 +486,6 @@ static void vm_area_free_rcu_cb(struct rcu_head *head)
 	struct vm_area_struct *vma = container_of(head, struct vm_area_struct,
 						  vm_rcu);
 
-	/* The vma should not be locked while being destroyed. */
-	VM_BUG_ON_VMA(rwsem_is_locked(&vma->vm_lock.lock), vma);
 	__vm_area_free(vma);
 }
 #endif
@@ -1223,6 +1219,9 @@ static inline void mmap_init_lock(struct mm_struct *mm)
 {
 	init_rwsem(&mm->mmap_lock);
 	mm_lock_seqcount_init(mm);
+#ifdef CONFIG_PER_VMA_LOCK
+	rcuwait_init(&mm->vma_writer_wait);
+#endif
 }
 
 static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
diff --git a/mm/init-mm.c b/mm/init-mm.c
index 6af3ad675930..4600e7605cab 100644
--- a/mm/init-mm.c
+++ b/mm/init-mm.c
@@ -40,6 +40,7 @@ struct mm_struct init_mm = {
 	.arg_lock	=  __SPIN_LOCK_UNLOCKED(init_mm.arg_lock),
 	.mmlist		= LIST_HEAD_INIT(init_mm.mmlist),
 #ifdef CONFIG_PER_VMA_LOCK
+	.vma_writer_wait = __RCUWAIT_INITIALIZER(init_mm.vma_writer_wait),
 	.mm_lock_seq	= SEQCNT_ZERO(init_mm.mm_lock_seq),
 #endif
 	.user_ns	= &init_user_ns,
diff --git a/mm/memory.c b/mm/memory.c
index 236fdecd44d6..2def47b5dff0 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -6328,9 +6328,39 @@ struct vm_area_struct *lock_mm_and_find_vma(struct mm_struct *mm,
 #endif
 
 #ifdef CONFIG_PER_VMA_LOCK
+static inline bool __vma_enter_locked(struct vm_area_struct *vma, unsigned int tgt_refcnt)
+{
+	/*
+	 * If vma is detached then only vma_mark_attached() can raise the
+	 * vm_refcnt. mmap_write_lock prevents racing with vma_mark_attached().
+	 */
+	if (!refcount_inc_not_zero(&vma->vm_refcnt))
+		return false;
+
+	rwsem_acquire(&vma->vmlock_dep_map, 0, 0, _RET_IP_);
+	/* vma is attached, set the writer present bit */
+	refcount_add(VMA_LOCK_OFFSET, &vma->vm_refcnt);
+	rcuwait_wait_event(&vma->vm_mm->vma_writer_wait,
+		   refcount_read(&vma->vm_refcnt) == tgt_refcnt,
+		   TASK_UNINTERRUPTIBLE);
+	lock_acquired(&vma->vmlock_dep_map, _RET_IP_);
+
+	return true;
+}
+
+static inline void __vma_exit_locked(struct vm_area_struct *vma, bool *detached)
+{
+	*detached = refcount_sub_and_test(VMA_LOCK_OFFSET + 1, &vma->vm_refcnt);
+	rwsem_release(&vma->vmlock_dep_map, _RET_IP_);
+}
+
 void __vma_start_write(struct vm_area_struct *vma, unsigned int mm_lock_seq)
 {
-	down_write(&vma->vm_lock.lock);
+	bool locked;
+
+	/* Wait until refcnt is (VMA_LOCK_OFFSET + 2) => attached with no readers */
+	locked = __vma_enter_locked(vma, VMA_LOCK_OFFSET + 2);
+
 	/*
 	 * We should use WRITE_ONCE() here because we can have concurrent reads
 	 * from the early lockless pessimistic check in vma_start_read().
@@ -6338,10 +6368,36 @@ void __vma_start_write(struct vm_area_struct *vma, unsigned int mm_lock_seq)
 	 * we should use WRITE_ONCE() for cleanliness and to keep KCSAN happy.
 	 */
 	WRITE_ONCE(vma->vm_lock_seq, mm_lock_seq);
-	up_write(&vma->vm_lock.lock);
+
+	if (locked) {
+		bool detached;
+
+		__vma_exit_locked(vma, &detached);
+		VM_BUG_ON_VMA(detached, vma); /* vma should remain attached */
+	}
 }
 EXPORT_SYMBOL_GPL(__vma_start_write);
 
+void vma_mark_detached(struct vm_area_struct *vma)
+{
+	vma_assert_write_locked(vma);
+	vma_assert_attached(vma);
+
+	/* We are the only writer, so no need to use vma_refcount_put(). */
+	if (unlikely(!refcount_dec_and_test(&vma->vm_refcnt))) {
+		/*
+		 * Wait until refcnt is (VMA_LOCK_OFFSET + 1) => detached with
+		 * no readers
+		 */
+		if (__vma_enter_locked(vma, VMA_LOCK_OFFSET + 1)) {
+			bool detached;
+
+			__vma_exit_locked(vma, &detached);
+			VM_BUG_ON_VMA(!detached, vma);
+		}
+	}
+}
+
 /*
  * Lookup and lock a VMA under RCU protection. Returned VMA is guaranteed to be
  * stable and not isolated. If the VMA is not found or is being modified the
@@ -6354,7 +6410,6 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
 	struct vm_area_struct *vma;
 
 	rcu_read_lock();
-retry:
 	vma = mas_walk(&mas);
 	if (!vma)
 		goto inval;
@@ -6362,13 +6417,6 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
 	if (!vma_start_read(vma))
 		goto inval;
 
-	/* Check if the VMA got isolated after we found it */
-	if (is_vma_detached(vma)) {
-		vma_end_read(vma);
-		count_vm_vma_lock_event(VMA_LOCK_MISS);
-		/* The area was replaced with another one */
-		goto retry;
-	}
 	/*
 	 * At this point, we have a stable reference to a VMA: The VMA is
 	 * locked and we know it hasn't already been isolated.
diff --git a/tools/testing/vma/linux/atomic.h b/tools/testing/vma/linux/atomic.h
index e01f66f98982..2e2021553196 100644
--- a/tools/testing/vma/linux/atomic.h
+++ b/tools/testing/vma/linux/atomic.h
@@ -9,4 +9,9 @@
 #define atomic_set(x, y)	do {} while (0)
 #define U8_MAX			UCHAR_MAX
 
+#ifndef atomic_cmpxchg_relaxed
+#define atomic_cmpxchg_relaxed		uatomic_cmpxchg
+#define atomic_cmpxchg_release		uatomic_cmpxchg
+#endif /* atomic_cmpxchg_relaxed */
+
 #endif	/* _LINUX_ATOMIC_H */
diff --git a/tools/testing/vma/vma_internal.h b/tools/testing/vma/vma_internal.h
index 2a624f9304da..1e8cd2f013fa 100644
--- a/tools/testing/vma/vma_internal.h
+++ b/tools/testing/vma/vma_internal.h
@@ -25,7 +25,7 @@
 #include
 #include
 #include
-#include
+#include
 
 extern unsigned long stack_guard_gap;
 #ifdef CONFIG_MMU
@@ -132,10 +132,6 @@ typedef __bitwise unsigned int vm_fault_t;
  */
 #define pr_warn_once pr_err
 
-typedef struct refcount_struct {
-	atomic_t refs;
-} refcount_t;
-
 struct kref {
 	refcount_t refcount;
 };
@@ -228,15 +224,12 @@ struct mm_struct {
 	unsigned long def_flags;
 };
 
-struct vma_lock {
-	struct rw_semaphore lock;
-};
-
-
 struct file {
 	struct address_space	*f_mapping;
 };
 
+#define VMA_LOCK_OFFSET	0x40000000
+
 struct vm_area_struct {
 	/* The first cache line has the info for VMA tree walking. */
 
@@ -264,16 +257,13 @@ struct vm_area_struct {
 	};
 
 #ifdef CONFIG_PER_VMA_LOCK
-	/* Flag to indicate areas detached from the mm->mm_mt tree */
-	bool detached;
-
 	/*
 	 * Can only be written (using WRITE_ONCE()) while holding both:
 	 *  - mmap_lock (in write mode)
-	 *  - vm_lock.lock (in write mode)
+	 *  - vm_refcnt bit at VMA_LOCK_OFFSET is set
 	 * Can be read reliably while holding one of:
 	 *  - mmap_lock (in read or write mode)
-	 *  - vm_lock.lock (in read or write mode)
+	 *  - vm_refcnt bit at VMA_LOCK_OFFSET is set or vm_refcnt > 1
 	 * Can be read unreliably (using READ_ONCE()) for pessimistic bailout
 	 * while holding nothing (except RCU to keep the VMA struct allocated).
 	 *
@@ -282,7 +272,6 @@ struct vm_area_struct {
 	 * slowpath.
 	 */
 	unsigned int vm_lock_seq;
-	struct vma_lock vm_lock;
 #endif
 
 	/*
@@ -335,6 +324,10 @@ struct vm_area_struct {
 	struct vma_numab_state *numab_state;	/* NUMA Balancing state */
 #endif
 	struct vm_userfaultfd_ctx vm_userfaultfd_ctx;
+#ifdef CONFIG_PER_VMA_LOCK
+	/* Unstable RCU readers are allowed to read this. */
+	refcount_t vm_refcnt;
+#endif
 } __randomize_layout;
 
 struct vm_fault {};
@@ -459,23 +452,41 @@ static inline struct vm_area_struct *vma_next(struct vma_iterator *vmi)
 	return mas_find(&vmi->mas, ULONG_MAX);
 }
 
-static inline void vma_lock_init(struct vm_area_struct *vma)
+/*
+ * WARNING: to avoid racing with vma_mark_attached()/vma_mark_detached(), these
+ * assertions should be made either under mmap_write_lock or when the object
+ * has been isolated under mmap_write_lock, ensuring no competing writers.
+ */
+static inline void vma_assert_attached(struct vm_area_struct *vma)
 {
-	init_rwsem(&vma->vm_lock.lock);
-	vma->vm_lock_seq = UINT_MAX;
+	VM_BUG_ON_VMA(!refcount_read(&vma->vm_refcnt), vma);
 }
 
-static inline void vma_mark_attached(struct vm_area_struct *vma)
+static inline void vma_assert_detached(struct vm_area_struct *vma)
 {
-	vma->detached = false;
+	VM_BUG_ON_VMA(refcount_read(&vma->vm_refcnt), vma);
 }
 
 static inline void vma_assert_write_locked(struct vm_area_struct *);
+static inline void vma_mark_attached(struct vm_area_struct *vma)
+{
+	vma_assert_write_locked(vma);
+	vma_assert_detached(vma);
+	refcount_set(&vma->vm_refcnt, 1);
+}
+
 static inline void vma_mark_detached(struct vm_area_struct *vma)
 {
-	/* When detaching vma should be write-locked */
 	vma_assert_write_locked(vma);
-	vma->detached = true;
+	vma_assert_attached(vma);
+
+	/* We are the only writer, so no need to use vma_refcount_put(). */
+	if (unlikely(!refcount_dec_and_test(&vma->vm_refcnt))) {
+		/*
+		 * Reader must have temporarily raised vm_refcnt but it will
+		 * drop it without using the vma since vma is write-locked.
+		 */
+	}
 }
 
 extern const struct vm_operations_struct vma_dummy_vm_ops;
@@ -488,9 +499,7 @@ static inline void vma_init(struct vm_area_struct *vma, struct mm_struct *mm)
 	vma->vm_mm = mm;
 	vma->vm_ops = &vma_dummy_vm_ops;
 	INIT_LIST_HEAD(&vma->anon_vma_chain);
-	/* vma is not locked, can't use vma_mark_detached() */
-	vma->detached = true;
-	vma_lock_init(vma);
+	vma->vm_lock_seq = UINT_MAX;
 }
 
 static inline struct vm_area_struct *vm_area_alloc(struct mm_struct *mm)
@@ -513,10 +522,9 @@ static inline struct vm_area_struct *vm_area_dup(struct vm_area_struct *orig)
 		return NULL;
 
 	memcpy(new, orig, sizeof(*new));
-	vma_lock_init(new);
+	refcount_set(&new->vm_refcnt, 0);
+	new->vm_lock_seq = UINT_MAX;
 	INIT_LIST_HEAD(&new->anon_vma_chain);
-	/* vma is not locked, can't use vma_mark_detached() */
-	new->detached = true;
 
 	return new;
 }
-- 
2.47.1.613.gc27f4b7a9f-goog
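The vm_refcnt encoding the patch above introduces can be summarised with a
small standalone model. This is an illustrative userspace sketch, not kernel
code: describe_vm_refcnt() is an invented helper, and the constants simply
mirror the definitions added to mm_types.h.

#include <stdio.h>

#define VMA_LOCK_OFFSET	0x40000000u
#define VMA_REF_LIMIT	(VMA_LOCK_OFFSET - 2)

/*
 * Decode a vm_refcnt value: 0 = detached, 1 = attached with no readers,
 * 1 + n = attached with n readers; a writer additionally sets the
 * VMA_LOCK_OFFSET bit (and holds one low reference of its own).
 */
static void describe_vm_refcnt(unsigned int cnt)
{
	if (cnt == 0)
		printf("%#010x: detached\n", cnt);
	else if (cnt & VMA_LOCK_OFFSET)
		printf("%#010x: writer present, %u low reference(s)\n",
		       cnt, cnt & ~VMA_LOCK_OFFSET);
	else
		printf("%#010x: attached, %u reader(s)\n", cnt, cnt - 1);
}

int main(void)
{
	describe_vm_refcnt(0);			/* freed or never attached */
	describe_vm_refcnt(1);			/* attached, idle */
	describe_vm_refcnt(3);			/* attached, two readers */
	/* __vma_start_write() waits for exactly this value: the attach
	 * reference plus the writer's own reference, no readers left */
	describe_vm_refcnt(VMA_LOCK_OFFSET + 2);
	return 0;
}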
From nobody Mon Feb 9 13:01:56 2026
Date: Thu, 26 Dec 2024 09:07:05 -0800
In-Reply-To: <20241226170710.1159679-1-surenb@google.com>
Message-ID: <20241226170710.1159679-14-surenb@google.com>
Subject: [PATCH v7 13/17] mm/debug: print vm_refcnt state when dumping the vma
From: Suren Baghdasaryan <surenb@google.com>
To: akpm@linux-foundation.org
Cc: peterz@infradead.org, willy@infradead.org, liam.howlett@oracle.com,
    lorenzo.stoakes@oracle.com, mhocko@suse.com, vbabka@suse.cz,
    hannes@cmpxchg.org, mjguzik@gmail.com, oliver.sang@intel.com,
    mgorman@techsingularity.net, david@redhat.com, peterx@redhat.com,
    oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org,
    brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com,
    hughd@google.com, lokeshgidra@google.com, minchan@google.com,
    jannh@google.com, shakeel.butt@linux.dev, souravpanda@google.com,
    pasha.tatashin@soleen.com, klarasmodin@gmail.com, corbet@lwn.net,
    linux-doc@vger.kernel.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, kernel-team@android.com,
    surenb@google.com

vm_refcnt encodes a number of useful states:
 - whether vma is attached or detached
 - the number of current vma readers
 - presence of a vma writer

Let's include it in the vma dump.

Signed-off-by: Suren Baghdasaryan <surenb@google.com>
---
 mm/debug.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/debug.c b/mm/debug.c
index 95b6ab809c0e..68b3ba3cf603 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -181,12 +181,12 @@ void dump_vma(const struct vm_area_struct *vma)
 	pr_emerg("vma %px start %px end %px mm %px\n"
 		"prot %lx anon_vma %px vm_ops %px\n"
 		"pgoff %lx file %px private_data %px\n"
-		"flags: %#lx(%pGv)\n",
+		"flags: %#lx(%pGv) refcnt %x\n",
 		vma, (void *)vma->vm_start, (void *)vma->vm_end, vma->vm_mm,
 		(unsigned long)pgprot_val(vma->vm_page_prot),
 		vma->anon_vma, vma->vm_ops, vma->vm_pgoff,
 		vma->vm_file, vma->vm_private_data,
-		vma->vm_flags, &vma->vm_flags);
+		vma->vm_flags, &vma->vm_flags, refcount_read(&vma->vm_refcnt));
 }
 EXPORT_SYMBOL(dump_vma);
 
-- 
2.47.1.613.gc27f4b7a9f-goog
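Since dump_vma() prints the counter with %x, the trailing field appears as
bare hex. A sketch of how such a field could be decoded offline; the sample
value and the literal dump line in the comment are made up for illustration:

#include <stdio.h>
#include <stdlib.h>

#define VMA_LOCK_OFFSET	0x40000000u

int main(void)
{
	/* Hypothetical tail of a dump_vma() line:
	 *   "flags: 0x100073(read|write|mayread|...) refcnt 40000002"
	 */
	const char *refcnt_field = "40000002";
	unsigned int cnt = (unsigned int)strtoul(refcnt_field, NULL, 16);

	if (cnt == 0)
		puts("vma was detached when dumped");
	else if (cnt & VMA_LOCK_OFFSET)
		printf("writer present, %u low reference(s)\n",
		       cnt & ~VMA_LOCK_OFFSET);
	else
		printf("attached, %u reader(s)\n", cnt - 1);
	return 0;
}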
From nobody Mon Feb 9 13:01:56 2026
Date: Thu, 26 Dec 2024 09:07:06 -0800
In-Reply-To: <20241226170710.1159679-1-surenb@google.com>
Message-ID: <20241226170710.1159679-15-surenb@google.com>
Subject: [PATCH v7 14/17] mm: remove extra vma_numab_state_init() call
From: Suren Baghdasaryan <surenb@google.com>
To: akpm@linux-foundation.org
Cc: peterz@infradead.org, willy@infradead.org, liam.howlett@oracle.com,
    lorenzo.stoakes@oracle.com, mhocko@suse.com, vbabka@suse.cz,
    hannes@cmpxchg.org, mjguzik@gmail.com, oliver.sang@intel.com,
    mgorman@techsingularity.net, david@redhat.com, peterx@redhat.com,
    oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org,
    brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com,
    hughd@google.com, lokeshgidra@google.com, minchan@google.com,
    jannh@google.com, shakeel.butt@linux.dev, souravpanda@google.com,
    pasha.tatashin@soleen.com, klarasmodin@gmail.com, corbet@lwn.net,
    linux-doc@vger.kernel.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, kernel-team@android.com,
    surenb@google.com

vma_init() already memsets the whole vm_area_struct to 0, so there is no
need for an additional vma_numab_state_init() call.
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
---
 include/linux/mm.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 99f4720d7e51..40bbe815df11 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -947,7 +947,6 @@ static inline void vma_init(struct vm_area_struct *vma, struct mm_struct *mm)
 	vma->vm_mm = mm;
 	vma->vm_ops = &vma_dummy_vm_ops;
 	INIT_LIST_HEAD(&vma->anon_vma_chain);
-	vma_numab_state_init(vma);
 	vma_lockdep_init(vma);
 	vma_init_lock(vma, false);
 }
-- 
2.47.1.613.gc27f4b7a9f-goog
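The redundancy is easy to see in miniature. A minimal sketch, assuming an
invented stand-in struct (vma_demo) rather than the real vm_area_struct:
once the whole object has been zeroed, a per-field zero initializer adds
nothing.

#include <assert.h>
#include <string.h>

struct vma_demo {
	unsigned long vm_start;
	void *numab_state;	/* stand-in for the NUMA balancing pointer */
};

int main(void)
{
	struct vma_demo vma;

	/* the init path starts by zeroing the whole object ... */
	memset(&vma, 0, sizeof(vma));
	/* ... so this explicit per-field init is a no-op */
	vma.numab_state = NULL;
	assert(vma.numab_state == NULL);
	return 0;
}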
From nobody Mon Feb 9 13:01:56 2026
Date: Thu, 26 Dec 2024 09:07:07 -0800
In-Reply-To: <20241226170710.1159679-1-surenb@google.com>
Message-ID: <20241226170710.1159679-16-surenb@google.com>
Subject: [PATCH v7 15/17] mm: prepare lock_vma_under_rcu() for vma reuse possibility
From: Suren Baghdasaryan <surenb@google.com>
To: akpm@linux-foundation.org
Cc: peterz@infradead.org, willy@infradead.org, liam.howlett@oracle.com,
    lorenzo.stoakes@oracle.com, mhocko@suse.com, vbabka@suse.cz,
    hannes@cmpxchg.org, mjguzik@gmail.com, oliver.sang@intel.com,
    mgorman@techsingularity.net, david@redhat.com, peterx@redhat.com,
    oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org,
    brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com,
    hughd@google.com, lokeshgidra@google.com, minchan@google.com,
    jannh@google.com, shakeel.butt@linux.dev, souravpanda@google.com,
    pasha.tatashin@soleen.com, klarasmodin@gmail.com, corbet@lwn.net,
    linux-doc@vger.kernel.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, kernel-team@android.com,
    surenb@google.com

Once we make vma cache SLAB_TYPESAFE_BY_RCU, it will be possible for a vma
to be reused and attached to another mm after lock_vma_under_rcu() locks
the vma. lock_vma_under_rcu() should ensure that vma_start_read() is using
the original mm and after locking the vma it should ensure that vma->vm_mm
has not changed from under us.

Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
---
 include/linux/mm.h | 10 ++++++----
 mm/memory.c        |  7 ++++---
 2 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 40bbe815df11..56a7d70ca5bd 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -730,8 +730,10 @@ static inline void vma_refcount_put(struct vm_area_struct *vma)
  * Try to read-lock a vma. The function is allowed to occasionally yield false
  * locked result to avoid performance overhead, in which case we fall back to
  * using mmap_lock. The function should never yield false unlocked result.
+ * False locked result is possible if mm_lock_seq overflows or if vma gets
+ * reused and attached to a different mm before we lock it.
  */
-static inline bool vma_start_read(struct vm_area_struct *vma)
+static inline bool vma_start_read(struct mm_struct *mm, struct vm_area_struct *vma)
 {
 	int oldcnt;
 
@@ -742,7 +744,7 @@ static inline bool vma_start_read(struct vm_area_struct *vma)
 	 * we don't rely on for anything - the mm_lock_seq read against which we
 	 * need ordering is below.
 	 */
-	if (READ_ONCE(vma->vm_lock_seq) == READ_ONCE(vma->vm_mm->mm_lock_seq.sequence))
+	if (READ_ONCE(vma->vm_lock_seq) == READ_ONCE(mm->mm_lock_seq.sequence))
 		return false;
 
 
@@ -767,7 +769,7 @@ static inline bool vma_start_read(struct vm_area_struct *vma)
 	 * This pairs with RELEASE semantics in vma_end_write_all().
 	 */
 	if (unlikely(oldcnt & VMA_LOCK_OFFSET ||
-		     vma->vm_lock_seq == raw_read_seqcount(&vma->vm_mm->mm_lock_seq))) {
+		     vma->vm_lock_seq == raw_read_seqcount(&mm->mm_lock_seq))) {
 		vma_refcount_put(vma);
 		return false;
 	}
@@ -905,7 +907,7 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
 
 static inline void vma_lockdep_init(struct vm_area_struct *vma) {}
 static inline void vma_init_lock(struct vm_area_struct *vma, bool reset_refcnt) {}
-static inline bool vma_start_read(struct vm_area_struct *vma)
+static inline bool vma_start_read(struct mm_struct *mm, struct vm_area_struct *vma)
 		{ return false; }
 static inline void vma_end_read(struct vm_area_struct *vma) {}
 static inline void vma_start_write(struct vm_area_struct *vma) {}
diff --git a/mm/memory.c b/mm/memory.c
index 2def47b5dff0..9cc93c2f79f3 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -6414,7 +6414,7 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
 	if (!vma)
 		goto inval;
 
-	if (!vma_start_read(vma))
+	if (!vma_start_read(mm, vma))
 		goto inval;
 
 	/*
@@ -6424,8 +6424,9 @@ struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
 	 * fields are accessible for RCU readers.
 	 */
 
-	/* Check since vm_start/vm_end might change before we lock the VMA */
-	if (unlikely(address < vma->vm_start || address >= vma->vm_end))
+	/* Check if the vma we locked is the right one. */
+	if (unlikely(vma->vm_mm != mm ||
+		     address < vma->vm_start || address >= vma->vm_end))
 		goto inval_end_read;
 
 	rcu_read_unlock();
-- 
2.47.1.613.gc27f4b7a9f-goog
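The validation order in the patch above matters: the mm is captured before
the lookup, passed into the locking step, and the vma's identity is
rechecked after the lock is taken. A userspace model of that sequence, with
invented stand-in types (mm_demo, vma_demo) and the refcount reduced to a
plain integer:

#include <stdbool.h>
#include <stdio.h>

struct mm_demo { int id; };

struct vma_demo {
	struct mm_demo *vm_mm;
	unsigned long vm_start, vm_end;
	unsigned int refcnt;		/* stand-in for vm_refcnt */
};

/* stand-in for vma_start_read(mm, vma): the mm is passed explicitly so
 * the locking step never depends on vma->vm_mm, which may change if the
 * object is reused by another mm */
static bool start_read_demo(struct mm_demo *mm, struct vma_demo *vma)
{
	(void)mm;
	vma->refcnt++;		/* pretend the refcount lock succeeded */
	return true;
}

static struct vma_demo *lock_vma_demo(struct mm_demo *mm,
				      struct vma_demo *vma, /* from lookup */
				      unsigned long addr)
{
	if (!start_read_demo(mm, vma))
		return NULL;
	/* After locking, the vma may have been freed and reused by another
	 * mm (SLAB_TYPESAFE_BY_RCU), so recheck identity and range. */
	if (vma->vm_mm != mm || addr < vma->vm_start || addr >= vma->vm_end) {
		vma->refcnt--;	/* vma_end_read() in the kernel */
		return NULL;
	}
	return vma;
}

int main(void)
{
	struct mm_demo mm1 = { 1 }, mm2 = { 2 };
	/* simulate a vma that was reused and now belongs to mm2 */
	struct vma_demo vma = { &mm2, 0x1000, 0x2000, 1 };

	printf("lookup from mm1: %s\n",
	       lock_vma_demo(&mm1, &vma, 0x1800) ? "locked" : "rejected");
	return 0;
}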
From nobody Mon Feb 9 13:01:56 2026
Date: Thu, 26 Dec 2024 09:07:08 -0800
In-Reply-To: <20241226170710.1159679-1-surenb@google.com>
Message-ID: <20241226170710.1159679-17-surenb@google.com>
Subject: [PATCH v7 16/17] mm: make vma cache SLAB_TYPESAFE_BY_RCU
From: Suren Baghdasaryan <surenb@google.com>
To: akpm@linux-foundation.org
Cc: peterz@infradead.org, willy@infradead.org, liam.howlett@oracle.com,
    lorenzo.stoakes@oracle.com, mhocko@suse.com, vbabka@suse.cz,
    hannes@cmpxchg.org, mjguzik@gmail.com, oliver.sang@intel.com,
    mgorman@techsingularity.net, david@redhat.com, peterx@redhat.com,
    oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org,
    brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com,
    hughd@google.com, lokeshgidra@google.com, minchan@google.com,
    jannh@google.com, shakeel.butt@linux.dev, souravpanda@google.com,
    pasha.tatashin@soleen.com, klarasmodin@gmail.com, corbet@lwn.net,
    linux-doc@vger.kernel.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, kernel-team@android.com,
    surenb@google.com

To enable SLAB_TYPESAFE_BY_RCU for the vma cache we need to ensure that
object reuse before the RCU grace period is over will be detected by
lock_vma_under_rcu(). The current checks are sufficient as long as the vma
is detached before it is freed. Implement this guarantee by asserting via
vma_assert_detached() that the vma is detached before it is freed, and make
vm_area_cachep SLAB_TYPESAFE_BY_RCU. This will facilitate vm_area_struct
reuse and will minimize the number of call_rcu() calls.

Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Liam R. Howlett <liam.howlett@oracle.com>
---
 include/linux/mm.h               |  2 --
 include/linux/mm_types.h         | 10 +++++++---
 include/linux/slab.h             |  6 ------
 kernel/fork.c                    | 31 +++++++++----------------------
 mm/mmap.c                        |  3 ++-
 mm/vma.c                         | 10 +++-------
 mm/vma.h                         |  2 +-
 tools/testing/vma/vma_internal.h |  7 +------
 8 files changed, 23 insertions(+), 48 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 56a7d70ca5bd..017d70e1d432 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -258,8 +258,6 @@ void setup_initial_init_mm(void *start_code, void *end_code,
 struct vm_area_struct *vm_area_alloc(struct mm_struct *);
 struct vm_area_struct *vm_area_dup(struct vm_area_struct *);
 void vm_area_free(struct vm_area_struct *);
-/* Use only if VMA has no other users */
-void __vm_area_free(struct vm_area_struct *vma);
 
 #ifndef CONFIG_MMU
 extern struct rb_root nommu_region_tree;
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index b5312421dec6..3ca4695f6d0f 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -574,6 +574,12 @@ static inline void *folio_get_private(struct folio *folio)
 
 typedef unsigned long vm_flags_t;
 
+/*
+ * freeptr_t represents a SLUB freelist pointer, which might be encoded
+ * and not dereferenceable if CONFIG_SLAB_FREELIST_HARDENED is enabled.
+ */
+typedef struct { unsigned long v; } freeptr_t;
+
 /*
  * A region containing a mapping of a non-memory backed file under NOMMU
  * conditions. These are held in a global tree and are pinned by the VMAs that
@@ -687,9 +693,7 @@ struct vm_area_struct {
 			unsigned long vm_start;
 			unsigned long vm_end;
 		};
-#ifdef CONFIG_PER_VMA_LOCK
-		struct rcu_head vm_rcu;	/* Used for deferred freeing. */
-#endif
+		freeptr_t vm_freeptr;	/* Pointer used by SLAB_TYPESAFE_BY_RCU */
 	};
 
diff --git a/include/linux/slab.h b/include/linux/slab.h
index 10a971c2bde3..681b685b6c4e 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -234,12 +234,6 @@ enum _slab_flag_bits {
 #define SLAB_NO_OBJ_EXT		__SLAB_FLAG_UNUSED
 #endif
 
-/*
- * freeptr_t represents a SLUB freelist pointer, which might be encoded
- * and not dereferenceable if CONFIG_SLAB_FREELIST_HARDENED is enabled.
- */
-typedef struct { unsigned long v; } freeptr_t;
-
 /*
  * ZERO_SIZE_PTR will be returned for zero sized kmalloc requests.
  *
diff --git a/kernel/fork.c b/kernel/fork.c
index 7a0800d48112..da3b1ebfd282 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -471,7 +471,7 @@ struct vm_area_struct *vm_area_dup(struct vm_area_struct *orig)
 	return new;
 }
 
-void __vm_area_free(struct vm_area_struct *vma)
+void vm_area_free(struct vm_area_struct *vma)
 {
 	/* The vma should be detached while being destroyed. */
 	vma_assert_detached(vma);
@@ -480,25 +480,6 @@ void __vm_area_free(struct vm_area_struct *vma)
 	vma_numab_state_free(vma);
 	free_anon_vma_name(vma);
 	kmem_cache_free(vm_area_cachep, vma);
 }
 
-#ifdef CONFIG_PER_VMA_LOCK
-static void vm_area_free_rcu_cb(struct rcu_head *head)
-{
-	struct vm_area_struct *vma = container_of(head, struct vm_area_struct,
-						  vm_rcu);
-
-	__vm_area_free(vma);
-}
-#endif
-
-void vm_area_free(struct vm_area_struct *vma)
-{
-#ifdef CONFIG_PER_VMA_LOCK
-	call_rcu(&vma->vm_rcu, vm_area_free_rcu_cb);
-#else
-	__vm_area_free(vma);
-#endif
-}
-
 static void account_kernel_stack(struct task_struct *tsk, int account)
 {
 	if (IS_ENABLED(CONFIG_VMAP_STACK)) {
@@ -3144,6 +3125,11 @@ void __init mm_cache_init(void)
 
 void __init proc_caches_init(void)
 {
+	struct kmem_cache_args args = {
+		.use_freeptr_offset = true,
+		.freeptr_offset = offsetof(struct vm_area_struct, vm_freeptr),
+	};
+
 	sighand_cachep = kmem_cache_create("sighand_cache",
 			sizeof(struct sighand_struct), 0,
 			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_TYPESAFE_BY_RCU|
@@ -3160,8 +3146,9 @@ void __init proc_caches_init(void)
 			sizeof(struct fs_struct), 0,
 			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_ACCOUNT,
 			NULL);
-	vm_area_cachep = KMEM_CACHE(vm_area_struct,
-			SLAB_HWCACHE_ALIGN|SLAB_NO_MERGE|SLAB_PANIC|
+	vm_area_cachep = kmem_cache_create("vm_area_struct",
+			sizeof(struct vm_area_struct), &args,
+			SLAB_HWCACHE_ALIGN|SLAB_PANIC|SLAB_TYPESAFE_BY_RCU|
 			SLAB_ACCOUNT);
 	mmap_init();
 	nsproxy_cache_init();
diff --git a/mm/mmap.c b/mm/mmap.c
index 3cc8de07411d..7fdc4207fe98 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1364,7 +1364,8 @@ void exit_mmap(struct mm_struct *mm)
 	do {
 		if (vma->vm_flags & VM_ACCOUNT)
 			nr_accounted += vma_pages(vma);
-		remove_vma(vma, /* unreachable = */ true);
+		vma_mark_detached(vma);
+		remove_vma(vma);
 		count++;
 		cond_resched();
 		vma = vma_next(&vmi);
diff --git a/mm/vma.c b/mm/vma.c
index 4a3deb6f9662..e37eb384d118 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -406,18 +406,14 @@ static bool can_vma_merge_right(struct vma_merge_struct *vmg,
 /*
  * Close a vm structure and free it.
  */
-void remove_vma(struct vm_area_struct *vma, bool unreachable)
+void remove_vma(struct vm_area_struct *vma)
 {
 	might_sleep();
 	vma_close(vma);
 	if (vma->vm_file)
 		fput(vma->vm_file);
 	mpol_put(vma_policy(vma));
-	if (unreachable) {
-		vma_mark_detached(vma);
-		__vm_area_free(vma);
-	} else
-		vm_area_free(vma);
+	vm_area_free(vma);
 }
 
 /*
@@ -1199,7 +1195,7 @@ static void vms_complete_munmap_vmas(struct vma_munmap_struct *vms,
 	/* Remove and clean up vmas */
 	mas_set(mas_detach, 0);
 	mas_for_each(mas_detach, vma, ULONG_MAX)
-		remove_vma(vma, /* unreachable = */ false);
+		remove_vma(vma);
 
 	vm_unacct_memory(vms->nr_accounted);
 	validate_mm(mm);
diff --git a/mm/vma.h b/mm/vma.h
index 18c9e49b1eae..d6803626151d 100644
--- a/mm/vma.h
+++ b/mm/vma.h
@@ -173,7 +173,7 @@ int do_vmi_munmap(struct vma_iterator *vmi, struct mm_struct *mm,
 		  unsigned long start, size_t len, struct list_head *uf,
 		  bool unlock);
 
-void remove_vma(struct vm_area_struct *vma, bool unreachable);
+void remove_vma(struct vm_area_struct *vma);
 
 void unmap_region(struct ma_state *mas, struct vm_area_struct *vma,
 		struct vm_area_struct *prev, struct vm_area_struct *next);
diff --git a/tools/testing/vma/vma_internal.h b/tools/testing/vma/vma_internal.h
index 1e8cd2f013fa..c7c580ec9a2d 100644
--- a/tools/testing/vma/vma_internal.h
+++ b/tools/testing/vma/vma_internal.h
@@ -693,14 +693,9 @@ static inline void mpol_put(struct mempolicy *)
 {
 }
 
-static inline void __vm_area_free(struct vm_area_struct *vma)
-{
-	free(vma);
-}
-
 static inline void vm_area_free(struct vm_area_struct *vma)
 {
-	__vm_area_free(vma);
+	free(vma);
 }
 
 static inline void lru_add_drain(void)
-- 
2.47.1.613.gc27f4b7a9f-goog
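One detail of the patch above worth spelling out is the union in
mm_types.h: while a vma sits on the SLUB freelist, the allocator stores its
freelist pointer in vm_freeptr, which overlays vm_start/vm_end - exactly
the fields lock_vma_under_rcu() revalidates after locking. A minimal sketch
with invented stand-in types (vma_demo, freeptr_demo), not the real kernel
layout:

#include <stdio.h>

typedef struct { unsigned long v; } freeptr_demo;

struct vma_demo {
	union {
		struct {
			unsigned long vm_start;
			unsigned long vm_end;
		};
		freeptr_demo vm_freeptr;  /* used only while object is free */
	};
};

int main(void)
{
	struct vma_demo vma = { .vm_start = 0x1000, .vm_end = 0x2000 };

	/* "free" the object: the allocator scribbles its freelist pointer
	 * over the range fields */
	vma.vm_freeptr.v = 0xdeadbeef;

	/* under SLAB_TYPESAFE_BY_RCU a racing reader still dereferences
	 * valid memory, but sees a garbage range - which the post-lock
	 * vm_mm/range checks reject */
	printf("start=%#lx end=%#lx\n", vma.vm_start, vma.vm_end);
	return 0;
}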
From nobody Mon Feb 9 13:01:56 2026
Date: Thu, 26 Dec 2024 09:07:09 -0800
In-Reply-To: <20241226170710.1159679-1-surenb@google.com>
Message-ID: <20241226170710.1159679-18-surenb@google.com>
Subject: [PATCH v7 17/17] docs/mm: document latest changes to vm_lock
From: Suren Baghdasaryan <surenb@google.com>
To: akpm@linux-foundation.org
Cc: peterz@infradead.org, willy@infradead.org, liam.howlett@oracle.com,
    lorenzo.stoakes@oracle.com, mhocko@suse.com, vbabka@suse.cz,
    hannes@cmpxchg.org, mjguzik@gmail.com, oliver.sang@intel.com,
    mgorman@techsingularity.net, david@redhat.com, peterx@redhat.com,
    oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org,
    brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com,
    hughd@google.com, lokeshgidra@google.com, minchan@google.com,
    jannh@google.com, shakeel.butt@linux.dev, souravpanda@google.com,
    pasha.tatashin@soleen.com, klarasmodin@gmail.com, corbet@lwn.net,
    linux-doc@vger.kernel.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, kernel-team@android.com,
    surenb@google.com

Change the documentation to reflect that vm_lock is integrated into vma
and replaced with vm_refcnt.
Document newly introduced vma_start_read_locked{_nested} functions.

Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Liam R. Howlett <liam.howlett@oracle.com>
---
 Documentation/mm/process_addrs.rst | 44 ++++++++++++++++++------------
 1 file changed, 26 insertions(+), 18 deletions(-)

diff --git a/Documentation/mm/process_addrs.rst b/Documentation/mm/process_addrs.rst
index 81417fa2ed20..f573de936b5d 100644
--- a/Documentation/mm/process_addrs.rst
+++ b/Documentation/mm/process_addrs.rst
@@ -716,9 +716,14 @@ calls :c:func:`!rcu_read_lock` to ensure that the VMA is looked up in an RCU
 critical section, then attempts to VMA lock it via :c:func:`!vma_start_read`,
 before releasing the RCU lock via :c:func:`!rcu_read_unlock`.
 
-VMA read locks hold the read lock on the :c:member:`!vma->vm_lock` semaphore for
-their duration and the caller of :c:func:`!lock_vma_under_rcu` must release it
-via :c:func:`!vma_end_read`.
+In cases when the user already holds mmap read lock, :c:func:`!vma_start_read_locked`
+and :c:func:`!vma_start_read_locked_nested` can be used. These functions do not
+fail due to lock contention but the caller should still check their return values
+in case they fail for other reasons.
+
+VMA read locks increment :c:member:`!vma.vm_refcnt` reference counter for their
+duration and the caller of :c:func:`!lock_vma_under_rcu` must drop it via
+:c:func:`!vma_end_read`.
 
 VMA **write** locks are acquired via :c:func:`!vma_start_write` in instances where a
 VMA is about to be modified, unlike :c:func:`!vma_start_read` the lock is always
@@ -726,9 +731,9 @@ acquired. An mmap write lock **must** be held for the duration of the VMA write
 lock, releasing or downgrading the mmap write lock also releases the VMA write
 lock so there is no :c:func:`!vma_end_write` function.
 
-Note that a semaphore write lock is not held across a VMA lock. Rather, a
-sequence number is used for serialisation, and the write semaphore is only
-acquired at the point of write lock to update this.
+Note that when write-locking a VMA lock, the :c:member:`!vma.vm_refcnt` is temporarily
+modified so that readers can detect the presence of a writer. The reference counter is
+restored once the vma sequence number used for serialisation is updated.
 
 This ensures the semantics we require - VMA write locks provide exclusive write
 access to the VMA.
@@ -738,7 +743,7 @@ Implementation details
 
 The VMA lock mechanism is designed to be a lightweight means of avoiding the use
 of the heavily contended mmap lock. It is implemented using a combination of a
-read/write semaphore and sequence numbers belonging to the containing
+reference counter and sequence numbers belonging to the containing
 :c:struct:`!struct mm_struct` and the VMA.
 
 Read locks are acquired via :c:func:`!vma_start_read`, which is an optimistic
@@ -779,28 +784,31 @@ release of any VMA locks on its release makes sense, as you would never want to
 keep VMAs locked across entirely separate write operations. It also maintains
 correct lock ordering.
 
-Each time a VMA read lock is acquired, we acquire a read lock on the
-:c:member:`!vma->vm_lock` read/write semaphore and hold it, while checking that
-the sequence count of the VMA does not match that of the mm.
+Each time a VMA read lock is acquired, we increment :c:member:`!vma.vm_refcnt`
+reference counter and check that the sequence count of the VMA does not match
+that of the mm.
 
-If it does, the read lock fails. If it does not, we hold the lock, excluding
-writers, but permitting other readers, who will also obtain this lock under RCU.
+If it does, the read lock fails and :c:member:`!vma.vm_refcnt` is dropped.
+If it does not, we keep the reference counter raised, excluding writers, but
+permitting other readers, who can also obtain this lock under RCU.
 
 Importantly, maple tree operations performed in :c:func:`!lock_vma_under_rcu`
 are also RCU safe, so the whole read lock operation is guaranteed to function
 correctly.
 
-On the write side, we acquire a write lock on the :c:member:`!vma->vm_lock`
-read/write semaphore, before setting the VMA's sequence number under this lock,
-also simultaneously holding the mmap write lock.
+On the write side, we set a bit in :c:member:`!vma.vm_refcnt` which can't be
+modified by readers and wait for all readers to drop their reference count.
+Once there are no readers, the VMA's sequence number is set to match that of
+the mm. During this entire operation the mmap write lock is held.
 
 This way, if any read locks are in effect, :c:func:`!vma_start_write` will sleep
 until these are finished and mutual exclusion is achieved.
 
-After setting the VMA's sequence number, the lock is released, avoiding
-complexity with a long-term held write lock.
+After setting the VMA's sequence number, the bit in :c:member:`!vma.vm_refcnt`
+indicating a writer is cleared. From this point on, the VMA's sequence number
+will indicate the VMA's write-locked state until the mmap write lock is dropped
+or downgraded.
 
-This clever combination of a read/write semaphore and sequence count allows for
+This clever combination of a reference counter and sequence count allows for
 fast RCU-based per-VMA lock acquisition (especially on page fault, though
 utilised elsewhere) with minimal complexity around lock ordering.
 
-- 
2.47.1.613.gc27f4b7a9f-goog
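The write-side protocol the documentation describes can be walked through
numerically. A single-threaded model - the kernel uses atomics and rcuwait
where this merely mutates a local variable - assuming the same
VMA_LOCK_OFFSET encoding as above:

#include <stdio.h>

#define VMA_LOCK_OFFSET	0x40000000u

int main(void)
{
	unsigned int refcnt = 3;	/* attached + two readers */
	unsigned int seq = 0, mm_seq = 7;

	refcnt += 1;			/* writer takes its own reference */
	refcnt += VMA_LOCK_OFFSET;	/* readers can now detect the writer */
	printf("waiting at %#x for readers to drain...\n", refcnt);
	refcnt -= 2;			/* both readers drop their references */

	if (refcnt == VMA_LOCK_OFFSET + 2) {
		seq = mm_seq;			/* vma is now write-locked */
		refcnt -= VMA_LOCK_OFFSET + 1;	/* clear bit, drop own ref */
	}
	printf("seq=%u refcnt=%#x (write-locked until mmap_lock is dropped)\n",
	       seq, refcnt);
	return 0;
}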