Date: Mon, 02 Dec 2024 11:14:10 -0000
From: "tip-bot2 for Suren Baghdasaryan"
To: linux-tip-commits@vger.kernel.org
Subject: [tip: perf/core] mm: convert mm_lock_seq to a proper seqcount
Cc: Peter Zijlstra , Suren Baghdasaryan , "Liam R. Howlett" , x86@kernel.org, linux-kernel@vger.kernel.org
Howlett" , x86@kernel.org, linux-kernel@vger.kernel.org In-Reply-To: <20241122174416.1367052-2-surenb@google.com> References: <20241122174416.1367052-2-surenb@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-ID: <173313805087.412.9677310798364386934.tip-bot2@tip-bot2> Robot-ID: Robot-Unsubscribe: Contact to get blacklisted from these emails Precedence: bulk Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable The following commit has been merged into the perf/core branch of tip: Commit-ID: eb449bd96954b1c1e491d19066cfd2a010f0aa47 Gitweb: https://git.kernel.org/tip/eb449bd96954b1c1e491d19066cfd2a01= 0f0aa47 Author: Suren Baghdasaryan AuthorDate: Fri, 22 Nov 2024 09:44:15 -08:00 Committer: Peter Zijlstra CommitterDate: Mon, 02 Dec 2024 12:01:38 +01:00 mm: convert mm_lock_seq to a proper seqcount Convert mm_lock_seq to be seqcount_t and change all mmap_write_lock variants to increment it, in-line with the usual seqcount usage pattern. This lets us check whether the mmap_lock is write-locked by checking mm_lock_seq.sequence counter (odd=3Dlocked, even=3Dunlocked). This will be used when implementing mmap_lock speculation functions. As a result vm_lock_seq is also change to be unsigned to match the type of mm_lock_seq.sequence. Suggested-by: Peter Zijlstra Signed-off-by: Suren Baghdasaryan Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Liam R. Howlett Link: https://lkml.kernel.org/r/20241122174416.1367052-2-surenb@google.com --- include/linux/mm.h | 12 +++---- include/linux/mm_types.h | 7 ++-- include/linux/mmap_lock.h | 55 ++++++++++++++++++++----------- kernel/fork.c | 5 +--- mm/init-mm.c | 2 +- tools/testing/vma/vma.c | 4 +- tools/testing/vma/vma_internal.h | 4 +- 7 files changed, 53 insertions(+), 36 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index c39c494..ca59d16 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -710,7 +710,7 @@ static inline bool vma_start_read(struct vm_area_struct= *vma) * we don't rely on for anything - the mm_lock_seq read against which we * need ordering is below. */ - if (READ_ONCE(vma->vm_lock_seq) =3D=3D READ_ONCE(vma->vm_mm->mm_lock_seq)) + if (READ_ONCE(vma->vm_lock_seq) =3D=3D READ_ONCE(vma->vm_mm->mm_lock_seq.= sequence)) return false; =20 if (unlikely(down_read_trylock(&vma->vm_lock->lock) =3D=3D 0)) @@ -727,7 +727,7 @@ static inline bool vma_start_read(struct vm_area_struct= *vma) * after it has been unlocked. * This pairs with RELEASE semantics in vma_end_write_all(). */ - if (unlikely(vma->vm_lock_seq =3D=3D smp_load_acquire(&vma->vm_mm->mm_loc= k_seq))) { + if (unlikely(vma->vm_lock_seq =3D=3D raw_read_seqcount(&vma->vm_mm->mm_lo= ck_seq))) { up_read(&vma->vm_lock->lock); return false; } @@ -742,7 +742,7 @@ static inline void vma_end_read(struct vm_area_struct *= vma) } =20 /* WARNING! Can only be used if mmap_lock is expected to be write-locked */ -static bool __is_vma_write_locked(struct vm_area_struct *vma, int *mm_lock= _seq) +static bool __is_vma_write_locked(struct vm_area_struct *vma, unsigned int= *mm_lock_seq) { mmap_assert_write_locked(vma->vm_mm); =20 @@ -750,7 +750,7 @@ static bool __is_vma_write_locked(struct vm_area_struct= *vma, int *mm_lock_seq) * current task is holding mmap_write_lock, both vma->vm_lock_seq and * mm->mm_lock_seq can't be concurrently modified. 
diff --git a/include/linux/mm.h b/include/linux/mm.h
index c39c494..ca59d16 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -710,7 +710,7 @@ static inline bool vma_start_read(struct vm_area_struct *vma)
 	 * we don't rely on for anything - the mm_lock_seq read against which we
 	 * need ordering is below.
 	 */
-	if (READ_ONCE(vma->vm_lock_seq) == READ_ONCE(vma->vm_mm->mm_lock_seq))
+	if (READ_ONCE(vma->vm_lock_seq) == READ_ONCE(vma->vm_mm->mm_lock_seq.sequence))
 		return false;
 
 	if (unlikely(down_read_trylock(&vma->vm_lock->lock) == 0))
@@ -727,7 +727,7 @@ static inline bool vma_start_read(struct vm_area_struct *vma)
 	 * after it has been unlocked.
 	 * This pairs with RELEASE semantics in vma_end_write_all().
 	 */
-	if (unlikely(vma->vm_lock_seq == smp_load_acquire(&vma->vm_mm->mm_lock_seq))) {
+	if (unlikely(vma->vm_lock_seq == raw_read_seqcount(&vma->vm_mm->mm_lock_seq))) {
 		up_read(&vma->vm_lock->lock);
 		return false;
 	}
@@ -742,7 +742,7 @@ static inline void vma_end_read(struct vm_area_struct *vma)
 }
 
 /* WARNING! Can only be used if mmap_lock is expected to be write-locked */
-static bool __is_vma_write_locked(struct vm_area_struct *vma, int *mm_lock_seq)
+static bool __is_vma_write_locked(struct vm_area_struct *vma, unsigned int *mm_lock_seq)
 {
 	mmap_assert_write_locked(vma->vm_mm);
 
@@ -750,7 +750,7 @@ static bool __is_vma_write_locked(struct vm_area_struct *vma, int *mm_lock_seq)
 	 * current task is holding mmap_write_lock, both vma->vm_lock_seq and
 	 * mm->mm_lock_seq can't be concurrently modified.
 	 */
-	*mm_lock_seq = vma->vm_mm->mm_lock_seq;
+	*mm_lock_seq = vma->vm_mm->mm_lock_seq.sequence;
 	return (vma->vm_lock_seq == *mm_lock_seq);
 }
 
@@ -761,7 +761,7 @@ static bool __is_vma_write_locked(struct vm_area_struct *vma, int *mm_lock_seq)
  */
 static inline void vma_start_write(struct vm_area_struct *vma)
 {
-	int mm_lock_seq;
+	unsigned int mm_lock_seq;
 
 	if (__is_vma_write_locked(vma, &mm_lock_seq))
 		return;
@@ -779,7 +779,7 @@ static inline void vma_start_write(struct vm_area_struct *vma)
 
 static inline void vma_assert_write_locked(struct vm_area_struct *vma)
 {
-	int mm_lock_seq;
+	unsigned int mm_lock_seq;
 
 	VM_BUG_ON_VMA(!__is_vma_write_locked(vma, &mm_lock_seq), vma);
 }
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 7361a8f..97e2f4f 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -697,7 +697,7 @@ struct vm_area_struct {
 	 * counter reuse can only lead to occasional unnecessary use of the
 	 * slowpath.
 	 */
-	int vm_lock_seq;
+	unsigned int vm_lock_seq;
 	/* Unstable RCU readers are allowed to read this. */
 	struct vma_lock *vm_lock;
 #endif
@@ -891,6 +891,9 @@ struct mm_struct {
 		 * Roughly speaking, incrementing the sequence number is
 		 * equivalent to releasing locks on VMAs; reading the sequence
 		 * number can be part of taking a read lock on a VMA.
+		 * Incremented every time mmap_lock is write-locked/unlocked.
+		 * Initialized to 0, therefore odd values indicate mmap_lock
+		 * is write-locked and even values that it's released.
 		 *
 		 * Can be modified under write mmap_lock using RELEASE
 		 * semantics.
@@ -899,7 +902,7 @@ struct mm_struct {
 		 * Can be read with ACQUIRE semantics if not holding write
 		 * mmap_lock.
 		 */
-		int mm_lock_seq;
+		seqcount_t mm_lock_seq;
 #endif
 
 
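
[Editorial note] The ACQUIRE/RELEASE pairing mentioned in the mm_struct comment
above is the standard seqcount discipline. The following is a rough userspace
sketch of that generic pattern, not the kernel implementation; the names and
the single-writer assumption (here, think "mmap_lock holder") are illustrative.
A writer makes the counter odd while it updates the protected data, and a
reader retries whenever it observed an odd value or the counter changed.

#include <stdatomic.h>

static atomic_uint seq;			/* even: idle, odd: writer active */
static atomic_int protected_data;

static void writer_update(int v)	/* single writer assumed */
{
	unsigned int s = atomic_load_explicit(&seq, memory_order_relaxed);

	atomic_store_explicit(&seq, s + 1, memory_order_relaxed);	/* odd: update in progress */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&protected_data, v, memory_order_relaxed);
	atomic_store_explicit(&seq, s + 2, memory_order_release);	/* even: update visible */
}

static int reader_read(void)
{
	unsigned int s1, s2;
	int v;

	do {
		s1 = atomic_load_explicit(&seq, memory_order_acquire);
		v = atomic_load_explicit(&protected_data, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		s2 = atomic_load_explicit(&seq, memory_order_relaxed);
	} while ((s1 & 1) || s1 != s2);	/* retry if a writer was or became active */

	return v;
}
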
diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
index de9dc20..9715326 100644
--- a/include/linux/mmap_lock.h
+++ b/include/linux/mmap_lock.h
@@ -71,39 +71,39 @@ static inline void mmap_assert_write_locked(const struct mm_struct *mm)
 }
 
 #ifdef CONFIG_PER_VMA_LOCK
-/*
- * Drop all currently-held per-VMA locks.
- * This is called from the mmap_lock implementation directly before releasing
- * a write-locked mmap_lock (or downgrading it to read-locked).
- * This should normally NOT be called manually from other places.
- * If you want to call this manually anyway, keep in mind that this will release
- * *all* VMA write locks, including ones from further up the stack.
- */
-static inline void vma_end_write_all(struct mm_struct *mm)
+static inline void mm_lock_seqcount_init(struct mm_struct *mm)
 {
-	mmap_assert_write_locked(mm);
-	/*
-	 * Nobody can concurrently modify mm->mm_lock_seq due to exclusive
-	 * mmap_lock being held.
-	 * We need RELEASE semantics here to ensure that preceding stores into
-	 * the VMA take effect before we unlock it with this store.
-	 * Pairs with ACQUIRE semantics in vma_start_read().
-	 */
-	smp_store_release(&mm->mm_lock_seq, mm->mm_lock_seq + 1);
+	seqcount_init(&mm->mm_lock_seq);
+}
+
+static inline void mm_lock_seqcount_begin(struct mm_struct *mm)
+{
+	do_raw_write_seqcount_begin(&mm->mm_lock_seq);
+}
+
+static inline void mm_lock_seqcount_end(struct mm_struct *mm)
+{
+	ASSERT_EXCLUSIVE_WRITER(mm->mm_lock_seq);
+	do_raw_write_seqcount_end(&mm->mm_lock_seq);
 }
+
 #else
-static inline void vma_end_write_all(struct mm_struct *mm) {}
+static inline void mm_lock_seqcount_init(struct mm_struct *mm) {}
+static inline void mm_lock_seqcount_begin(struct mm_struct *mm) {}
+static inline void mm_lock_seqcount_end(struct mm_struct *mm) {}
 #endif
 
 static inline void mmap_init_lock(struct mm_struct *mm)
 {
 	init_rwsem(&mm->mmap_lock);
+	mm_lock_seqcount_init(mm);
 }
 
 static inline void mmap_write_lock(struct mm_struct *mm)
 {
 	__mmap_lock_trace_start_locking(mm, true);
 	down_write(&mm->mmap_lock);
+	mm_lock_seqcount_begin(mm);
 	__mmap_lock_trace_acquire_returned(mm, true, true);
 }
 
@@ -111,6 +111,7 @@ static inline void mmap_write_lock_nested(struct mm_struct *mm, int subclass)
 {
 	__mmap_lock_trace_start_locking(mm, true);
 	down_write_nested(&mm->mmap_lock, subclass);
+	mm_lock_seqcount_begin(mm);
 	__mmap_lock_trace_acquire_returned(mm, true, true);
 }
 
@@ -120,10 +121,26 @@ static inline int mmap_write_lock_killable(struct mm_struct *mm)
 
 	__mmap_lock_trace_start_locking(mm, true);
 	ret = down_write_killable(&mm->mmap_lock);
+	if (!ret)
+		mm_lock_seqcount_begin(mm);
 	__mmap_lock_trace_acquire_returned(mm, true, ret == 0);
 	return ret;
 }
 
+/*
+ * Drop all currently-held per-VMA locks.
+ * This is called from the mmap_lock implementation directly before releasing
+ * a write-locked mmap_lock (or downgrading it to read-locked).
+ * This should normally NOT be called manually from other places.
+ * If you want to call this manually anyway, keep in mind that this will release
+ * *all* VMA write locks, including ones from further up the stack.
+ */
+static inline void vma_end_write_all(struct mm_struct *mm)
+{
+	mmap_assert_write_locked(mm);
+	mm_lock_seqcount_end(mm);
+}
+
 static inline void mmap_write_unlock(struct mm_struct *mm)
 {
 	__mmap_lock_trace_released(mm, true);
diff --git a/kernel/fork.c b/kernel/fork.c
index 1450b46..8dc670f 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -448,7 +448,7 @@ static bool vma_lock_alloc(struct vm_area_struct *vma)
 		return false;
 
 	init_rwsem(&vma->vm_lock->lock);
-	vma->vm_lock_seq = -1;
+	vma->vm_lock_seq = UINT_MAX;
 
 	return true;
 }
@@ -1267,9 +1267,6 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
 	seqcount_init(&mm->write_protect_seq);
 	mmap_init_lock(mm);
 	INIT_LIST_HEAD(&mm->mmlist);
-#ifdef CONFIG_PER_VMA_LOCK
-	mm->mm_lock_seq = 0;
-#endif
 	mm_pgtables_bytes_init(mm);
 	mm->map_count = 0;
 	mm->locked_vm = 0;
diff --git a/mm/init-mm.c b/mm/init-mm.c
index 24c8093..6af3ad6 100644
--- a/mm/init-mm.c
+++ b/mm/init-mm.c
@@ -40,7 +40,7 @@ struct mm_struct init_mm = {
 	.arg_lock	= __SPIN_LOCK_UNLOCKED(init_mm.arg_lock),
 	.mmlist		= LIST_HEAD_INIT(init_mm.mmlist),
 #ifdef CONFIG_PER_VMA_LOCK
-	.mm_lock_seq	= 0,
+	.mm_lock_seq	= SEQCNT_ZERO(init_mm.mm_lock_seq),
 #endif
 	.user_ns	= &init_user_ns,
 	.cpu_bitmap	= CPU_BITS_NONE,
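
[Editorial note] The write-lock paths above all follow the same shape: bump the
sequence right after taking the rwsem and again, with RELEASE semantics, just
before dropping it. The following userspace analog is illustrative only: a
pthread rwlock stands in for mmap_lock, a plain atomic counter for the
seqcount, and all names are made up.

#include <pthread.h>
#include <stdatomic.h>

struct mm_stub {
	pthread_rwlock_t mmap_lock;
	atomic_uint mm_lock_seq;	/* stand-in for the seqcount */
};

static void mmap_write_lock_stub(struct mm_stub *mm)
{
	pthread_rwlock_wrlock(&mm->mmap_lock);
	/* "begin": the counter becomes odd while the write lock is held */
	atomic_fetch_add_explicit(&mm->mm_lock_seq, 1, memory_order_release);
}

static void mmap_write_unlock_stub(struct mm_stub *mm)
{
	/*
	 * "end" (the vma_end_write_all() step): make the counter even with
	 * release semantics before dropping the lock, so a lockless reader
	 * that later observes the even value also sees the preceding updates.
	 */
	atomic_fetch_add_explicit(&mm->mm_lock_seq, 1, memory_order_release);
	pthread_rwlock_unlock(&mm->mmap_lock);
}
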
diff --git a/tools/testing/vma/vma.c b/tools/testing/vma/vma.c
index 8fab5e1..9bcf173 100644
--- a/tools/testing/vma/vma.c
+++ b/tools/testing/vma/vma.c
@@ -89,7 +89,7 @@ static struct vm_area_struct *alloc_and_link_vma(struct mm_struct *mm,
 	 * begun. Linking to the tree will have caused this to be incremented,
 	 * which means we will get a false positive otherwise.
 	 */
-	vma->vm_lock_seq = -1;
+	vma->vm_lock_seq = UINT_MAX;
 
 	return vma;
 }
@@ -214,7 +214,7 @@ static bool vma_write_started(struct vm_area_struct *vma)
 	int seq = vma->vm_lock_seq;
 
 	/* We reset after each check. */
-	vma->vm_lock_seq = -1;
+	vma->vm_lock_seq = UINT_MAX;
 
 	/* The vma_start_write() stub simply increments this value. */
 	return seq > -1;
diff --git a/tools/testing/vma/vma_internal.h b/tools/testing/vma/vma_internal.h
index e76ff57..1d9fc97 100644
--- a/tools/testing/vma/vma_internal.h
+++ b/tools/testing/vma/vma_internal.h
@@ -241,7 +241,7 @@ struct vm_area_struct {
 	 * counter reuse can only lead to occasional unnecessary use of the
 	 * slowpath.
 	 */
-	int vm_lock_seq;
+	unsigned int vm_lock_seq;
 	struct vma_lock *vm_lock;
 #endif
 
@@ -416,7 +416,7 @@ static inline bool vma_lock_alloc(struct vm_area_struct *vma)
 		return false;
 
 	init_rwsem(&vma->vm_lock->lock);
-	vma->vm_lock_seq = -1;
+	vma->vm_lock_seq = UINT_MAX;
 
 	return true;
 }
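
[Editorial note] In the tools/testing changes above, UINT_MAX acts as a
"never write-locked" sentinel now that vm_lock_seq is unsigned. A rough sketch
of how such stubs are meant to behave follows; the names are illustrative, not
the real test harness. In practice the sentinel does not match a freshly
recorded sequence, so "has a write started?" reduces to "did the field change
since the last reset?".

#include <limits.h>
#include <stdbool.h>

struct vma_stub { unsigned int vm_lock_seq; };

static unsigned int mm_seq_stub;	/* stand-in for mm_lock_seq.sequence */

static void vma_start_write_stub(struct vma_stub *vma)
{
	vma->vm_lock_seq = ++mm_seq_stub;	/* record the current sequence */
}

static bool vma_write_started_stub(struct vma_stub *vma)
{
	unsigned int seq = vma->vm_lock_seq;

	vma->vm_lock_seq = UINT_MAX;		/* reset after each check */
	return seq != UINT_MAX;
}
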