From nobody Tue Jun 16 19:33:57 2026 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.8]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C30AF31BCAE for ; Wed, 29 Apr 2026 18:19:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.8 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777486799; cv=none; b=GjHFj2giqLqkOTWB4kBHFblJh8iZJP4NlLd8PAiH4+G128hXh7SEiuLtiaKjA+Du2+2QYZZQKgS9UPqRJx7ICw30a7eFOhXC/twEvLvHkclK5cZ+lI5xm8sCXyMdlY6pNBv+MR0gosarNA/WgKx03VDT1aXLGXqRCgXczaxiYQo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777486799; c=relaxed/simple; bh=un+pifbBsTc6JbIPkyaX7lb0emhMOKFM+/ruKLV0yA0=; h=Subject:To:Cc:From:Date:References:In-Reply-To:Message-Id; b=Mjof+H336CpnzErTLg9m1dIE+4DYeNB3lQcgBOBLG6zZr9zDXzbNvUwhcxtytbvcD6Y2lCzRQBwOE9vTBK5zxpaxzPp4BwwjaHiP9qBHjpjHtw9aboGOFUq1kbYqP/NvBEj2J+jQAit+Vmx5zYSu4J0kUx0XUdOZQw35Kh0Ap78= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=pass smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=ldr6bfBV; arc=none smtp.client-ip=192.198.163.8 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="ldr6bfBV" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1777486797; x=1809022797; h=subject:to:cc:from:date:references:in-reply-to: message-id; bh=un+pifbBsTc6JbIPkyaX7lb0emhMOKFM+/ruKLV0yA0=; b=ldr6bfBVgtSjYYOy4nrgw2DNBT3jASHpDyGaDTpvDKdKIu3AHIyZ0J2L 
USSGpxxilQeKsiafrqs8WMjRZpS9S7GRRcdyLXuUEdZfSL7mNQT3J82YB eKhLnk4MlSx68XlrZIOfCpI263a3vqcPpGkRiLlRgkYZKIjucGdup/VFN SIIO1RxaqcO+aU7obRXDXi4BH/evdThBSnQnaeECRFLburTZaZ68+ue7I mcGzTv5hMEsjr0ANvC97T/kKAmA7NdY/YHzsmcnVELgbhAtF7+ToBveA+ zp0mdWU57JOqGkmmN+/oNLgcpjNoeEHpJ3VFZSN+oiZ3P56lZPintNOdK Q==; X-CSE-ConnectionGUID: OtvonxyKRlGKd78AbGgcyQ== X-CSE-MsgGUID: R9WAFqSjTrWFCUYaJ2Fqzw== X-IronPort-AV: E=McAfee;i="6800,10657,11771"; a="95990090" X-IronPort-AV: E=Sophos;i="6.23,206,1770624000"; d="scan'208";a="95990090" Received: from fmviesa005.fm.intel.com ([10.60.135.145]) by fmvoesa102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Apr 2026 11:19:56 -0700 X-CSE-ConnectionGUID: gunGUiJ1R4KizrtWzx51iw== X-CSE-MsgGUID: Q27R8zmkQUWzuTFiT12jKQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.23,206,1770624000"; d="scan'208";a="239336386" Received: from davehans-spike.ostc.intel.com (HELO localhost.localdomain) ([10.165.164.11]) by fmviesa005.fm.intel.com with ESMTP; 29 Apr 2026 11:19:56 -0700 Subject: [PATCH 1/6] mm: Make per-VMA locks available universally To: linux-kernel@vger.kernel.org Cc: Dave Hansen , Andrew Morton , "Liam R. Howlett" , linux-mm@kvack.org, Lorenzo Stoakes , Shakeel Butt , Suren Baghdasaryan , Vlastimil Babka From: Dave Hansen Date: Wed, 29 Apr 2026 11:19:55 -0700 References: <20260429181954.F50224AE@davehans-spike.ostc.intel.com> In-Reply-To: <20260429181954.F50224AE@davehans-spike.ostc.intel.com> Message-Id: <20260429181955.0C443845@davehans-spike.ostc.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: Dave Hansen The per-VMA locks have been around for several years. They've had some bugs worked out of them and have seen quite wide use. However, they are still only available when architectures explicitly enable them. 
Remove the conditional compilation around the per-VMA locks, making them available on all architectures and configs.

The approach up to now seemed to be to add ARCH_SUPPORTS_PER_VMA_LOCK when the architecture started using per-VMA locks in the fault handler. But, contrary to the naming, the Kconfig option does not really indicate whether the architecture supports per-VMA locks or not. It is more of a marker for whether the architecture is likely to benefit from per-VMA locks.

To me, the most important side-effect of universal availability is letting per-VMA locks be used in SMP=n configs. This lets us use per-VMA locking in all x86 code without fallbacks.

Overall, this just generally makes the kernel simpler. Just look at the diffstat. It also opens the door to users that want to use the per-VMA locks in common code. Doing *that* can bring additional simplifications.

The downside of this is adding some fields to vm_area_struct and mm_struct. I suspect there are some very simple ways to implement the per-VMA locks that don't require any additional fields, especially if such an approach was limited to SMP=n configs*. For now, do the simplest thing: use the same implementation everywhere.

* For example, since SMP=n configs don't care much about scalability or false sharing, there could be a single, global VMA seqcount that is bumped when any VMA is modified instead of having space in each VMA for a seqcount.

Signed-off-by: Dave Hansen
Cc: Suren Baghdasaryan
Cc: Andrew Morton
Cc: "Liam R.
Howlett" Cc: Lorenzo Stoakes Cc: Vlastimil Babka Cc: Shakeel Butt Cc: linux-mm@kvack.org Reviewed-by: Suren Baghdasaryan --- b/arch/arm/Kconfig | 1=20 b/arch/arm64/Kconfig | 1=20 b/arch/loongarch/Kconfig | 1=20 b/arch/powerpc/platforms/powernv/Kconfig | 1=20 b/arch/powerpc/platforms/pseries/Kconfig | 1=20 b/arch/riscv/Kconfig | 1=20 b/arch/s390/Kconfig | 1=20 b/arch/x86/Kconfig | 2 - b/fs/proc/internal.h | 2 - b/fs/proc/task_mmu.c | 51 --------------------------= ----- b/include/linux/mm.h | 12 ------- b/include/linux/mm_types.h | 7 ---- b/include/linux/mmap_lock.h | 48 --------------------------= --- b/kernel/fork.c | 2 - b/mm/Kconfig | 13 ------- b/mm/mmap_lock.c | 2 - 16 files changed, 1 insertion(+), 145 deletions(-) diff -puN arch/arm64/Kconfig~unconditional-vma-locks arch/arm64/Kconfig --- a/arch/arm64/Kconfig~unconditional-vma-locks 2026-04-29 11:18:47.795519= 653 -0700 +++ b/arch/arm64/Kconfig 2026-04-29 11:18:49.088569421 -0700 @@ -80,7 +80,6 @@ config ARM64 select ARCH_SUPPORTS_INT128 if CC_HAS_INT128 select ARCH_SUPPORTS_NUMA_BALANCING select ARCH_SUPPORTS_PAGE_TABLE_CHECK - select ARCH_SUPPORTS_PER_VMA_LOCK select ARCH_SUPPORTS_HUGE_PFNMAP if TRANSPARENT_HUGEPAGE select ARCH_SUPPORTS_RT select ARCH_SUPPORTS_SCHED_SMT diff -puN arch/arm/Kconfig~unconditional-vma-locks arch/arm/Kconfig --- a/arch/arm/Kconfig~unconditional-vma-locks 2026-04-29 11:18:47.91552427= 2 -0700 +++ b/arch/arm/Kconfig 2026-04-29 11:18:49.088569421 -0700 @@ -41,7 +41,6 @@ config ARM select ARCH_SUPPORTS_ATOMIC_RMW select ARCH_SUPPORTS_CFI select ARCH_SUPPORTS_HUGETLBFS if ARM_LPAE - select ARCH_SUPPORTS_PER_VMA_LOCK select ARCH_SUPPORTS_RT select ARCH_USE_BUILTIN_BSWAP select ARCH_USE_CMPXCHG_LOCKREF diff -puN arch/loongarch/Kconfig~unconditional-vma-locks arch/loongarch/Kco= nfig --- a/arch/loongarch/Kconfig~unconditional-vma-locks 2026-04-29 11:18:47.95= 6525850 -0700 +++ b/arch/loongarch/Kconfig 2026-04-29 11:18:49.088569421 -0700 @@ -68,7 +68,6 @@ config LOONGARCH select 
ARCH_SUPPORTS_LTO_CLANG_THIN select ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS select ARCH_SUPPORTS_NUMA_BALANCING if NUMA - select ARCH_SUPPORTS_PER_VMA_LOCK select ARCH_SUPPORTS_RT select ARCH_SUPPORTS_SCHED_SMT if SMP select ARCH_SUPPORTS_SCHED_MC if SMP diff -puN arch/powerpc/platforms/powernv/Kconfig~unconditional-vma-locks ar= ch/powerpc/platforms/powernv/Kconfig --- a/arch/powerpc/platforms/powernv/Kconfig~unconditional-vma-locks 2026-0= 4-29 11:18:47.969526350 -0700 +++ b/arch/powerpc/platforms/powernv/Kconfig 2026-04-29 11:18:49.089569460 = -0700 @@ -17,7 +17,6 @@ config PPC_POWERNV select PPC_DOORBELL select MMU_NOTIFIER select FORCE_SMP - select ARCH_SUPPORTS_PER_VMA_LOCK select PPC_RADIX_BROADCAST_TLBIE if PPC_RADIX_MMU default y =20 diff -puN arch/powerpc/platforms/pseries/Kconfig~unconditional-vma-locks ar= ch/powerpc/platforms/pseries/Kconfig --- a/arch/powerpc/platforms/pseries/Kconfig~unconditional-vma-locks 2026-0= 4-29 11:18:47.972526466 -0700 +++ b/arch/powerpc/platforms/pseries/Kconfig 2026-04-29 11:18:49.089569460 = -0700 @@ -23,7 +23,6 @@ config PPC_PSERIES select HOTPLUG_CPU select FORCE_SMP select SWIOTLB - select ARCH_SUPPORTS_PER_VMA_LOCK select PPC_RADIX_BROADCAST_TLBIE if PPC_RADIX_MMU default y =20 diff -puN arch/riscv/Kconfig~unconditional-vma-locks arch/riscv/Kconfig --- a/arch/riscv/Kconfig~unconditional-vma-locks 2026-04-29 11:18:48.060529= 854 -0700 +++ b/arch/riscv/Kconfig 2026-04-29 11:18:49.089569460 -0700 @@ -70,7 +70,6 @@ config RISCV select ARCH_SUPPORTS_LTO_CLANG_THIN select ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS if 64BIT && MMU select ARCH_SUPPORTS_PAGE_TABLE_CHECK if MMU - select ARCH_SUPPORTS_PER_VMA_LOCK if MMU select ARCH_SUPPORTS_RT select ARCH_SUPPORTS_SHADOW_CALL_STACK if HAVE_SHADOW_CALL_STACK select ARCH_SUPPORTS_SCHED_MC if SMP diff -puN arch/s390/Kconfig~unconditional-vma-locks arch/s390/Kconfig --- a/arch/s390/Kconfig~unconditional-vma-locks 2026-04-29 11:18:48.1255323= 57 -0700 +++ b/arch/s390/Kconfig 2026-04-29 
11:18:49.089569460 -0700 @@ -153,7 +153,6 @@ config S390 select ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS select ARCH_SUPPORTS_NUMA_BALANCING select ARCH_SUPPORTS_PAGE_TABLE_CHECK - select ARCH_SUPPORTS_PER_VMA_LOCK select ARCH_USE_BUILTIN_BSWAP select ARCH_USE_CMPXCHG_LOCKREF select ARCH_USE_SYM_ANNOTATIONS diff -puN arch/x86/Kconfig~unconditional-vma-locks arch/x86/Kconfig --- a/arch/x86/Kconfig~unconditional-vma-locks 2026-04-29 11:18:48.12853247= 2 -0700 +++ b/arch/x86/Kconfig 2026-04-29 11:18:49.090569499 -0700 @@ -27,7 +27,6 @@ config X86_64 select ARCH_HAS_GIGANTIC_PAGE select ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS select ARCH_SUPPORTS_INT128 if CC_HAS_INT128 - select ARCH_SUPPORTS_PER_VMA_LOCK select ARCH_SUPPORTS_HUGE_PFNMAP if TRANSPARENT_HUGEPAGE select HAVE_ARCH_SOFT_DIRTY select MODULES_USE_ELF_RELA @@ -1885,7 +1884,6 @@ config X86_USER_SHADOW_STACK bool "X86 userspace shadow stack" depends on AS_WRUSS depends on X86_64 - depends on PER_VMA_LOCK select ARCH_USES_HIGH_VMA_FLAGS select ARCH_HAS_USER_SHADOW_STACK select X86_CET diff -puN fs/proc/internal.h~unconditional-vma-locks fs/proc/internal.h --- a/fs/proc/internal.h~unconditional-vma-locks 2026-04-29 11:18:48.305539= 283 -0700 +++ b/fs/proc/internal.h 2026-04-29 11:18:49.090569499 -0700 @@ -382,10 +382,8 @@ struct mem_size_stats; =20 struct proc_maps_locking_ctx { struct mm_struct *mm; -#ifdef CONFIG_PER_VMA_LOCK bool mmap_locked; struct vm_area_struct *locked_vma; -#endif }; =20 struct proc_maps_private { diff -puN fs/proc/task_mmu.c~unconditional-vma-locks fs/proc/task_mmu.c --- a/fs/proc/task_mmu.c~unconditional-vma-locks 2026-04-29 11:18:48.346540= 861 -0700 +++ b/fs/proc/task_mmu.c 2026-04-29 11:18:49.090569499 -0700 @@ -130,8 +130,6 @@ static void release_task_mempolicy(struc } #endif =20 -#ifdef CONFIG_PER_VMA_LOCK - static void reset_lock_ctx(struct proc_maps_locking_ctx *lock_ctx) { lock_ctx->locked_vma =3D NULL; @@ -213,33 +211,6 @@ static inline bool fallback_to_mmap_lock return true; } =20 
-#else /* CONFIG_PER_VMA_LOCK */ - -static inline bool lock_vma_range(struct seq_file *m, - struct proc_maps_locking_ctx *lock_ctx) -{ - return mmap_read_lock_killable(lock_ctx->mm) =3D=3D 0; -} - -static inline void unlock_vma_range(struct proc_maps_locking_ctx *lock_ctx) -{ - mmap_read_unlock(lock_ctx->mm); -} - -static struct vm_area_struct *get_next_vma(struct proc_maps_private *priv, - loff_t last_pos) -{ - return vma_next(&priv->iter); -} - -static inline bool fallback_to_mmap_lock(struct proc_maps_private *priv, - loff_t pos) -{ - return false; -} - -#endif /* CONFIG_PER_VMA_LOCK */ - static struct vm_area_struct *proc_get_vma(struct seq_file *m, loff_t *ppo= s) { struct proc_maps_private *priv =3D m->private; @@ -527,8 +498,6 @@ static int pid_maps_open(struct inode *i PROCMAP_QUERY_VMA_FLAGS \ ) =20 -#ifdef CONFIG_PER_VMA_LOCK - static int query_vma_setup(struct proc_maps_locking_ctx *lock_ctx) { reset_lock_ctx(lock_ctx); @@ -581,26 +550,6 @@ static struct vm_area_struct *query_vma_ return vma; } =20 -#else /* CONFIG_PER_VMA_LOCK */ - -static int query_vma_setup(struct proc_maps_locking_ctx *lock_ctx) -{ - return mmap_read_lock_killable(lock_ctx->mm); -} - -static void query_vma_teardown(struct proc_maps_locking_ctx *lock_ctx) -{ - mmap_read_unlock(lock_ctx->mm); -} - -static struct vm_area_struct *query_vma_find_by_addr(struct proc_maps_lock= ing_ctx *lock_ctx, - unsigned long addr) -{ - return find_vma(lock_ctx->mm, addr); -} - -#endif /* CONFIG_PER_VMA_LOCK */ - static struct vm_area_struct *query_matching_vma(struct proc_maps_locking_= ctx *lock_ctx, unsigned long addr, u32 flags) { diff -puN include/linux/mmap_lock.h~unconditional-vma-locks include/linux/m= map_lock.h --- a/include/linux/mmap_lock.h~unconditional-vma-locks 2026-04-29 11:18:48= .700554487 -0700 +++ b/include/linux/mmap_lock.h 2026-04-29 11:18:49.091569537 -0700 @@ -76,8 +76,6 @@ static inline void mmap_assert_write_loc rwsem_assert_held_write(&mm->mmap_lock); } =20 -#ifdef 
CONFIG_PER_VMA_LOCK - #ifdef CONFIG_LOCKDEP #define __vma_lockdep_map(vma) (&vma->vmlock_dep_map) #else @@ -484,52 +482,6 @@ struct vm_area_struct *lock_next_vma(str struct vma_iterator *iter, unsigned long address); =20 -#else /* CONFIG_PER_VMA_LOCK */ - -static inline void mm_lock_seqcount_init(struct mm_struct *mm) {} -static inline void mm_lock_seqcount_begin(struct mm_struct *mm) {} -static inline void mm_lock_seqcount_end(struct mm_struct *mm) {} - -static inline bool mmap_lock_speculate_try_begin(struct mm_struct *mm, uns= igned int *seq) -{ - return false; -} - -static inline bool mmap_lock_speculate_retry(struct mm_struct *mm, unsigne= d int seq) -{ - return true; -} -static inline void vma_lock_init(struct vm_area_struct *vma, bool reset_re= fcnt) {} -static inline void vma_end_read(struct vm_area_struct *vma) {} -static inline void vma_start_write(struct vm_area_struct *vma) {} -static inline __must_check -int vma_start_write_killable(struct vm_area_struct *vma) { return 0; } -static inline void vma_assert_write_locked(struct vm_area_struct *vma) - { mmap_assert_write_locked(vma->vm_mm); } -static inline void vma_assert_attached(struct vm_area_struct *vma) {} -static inline void vma_assert_detached(struct vm_area_struct *vma) {} -static inline void vma_mark_attached(struct vm_area_struct *vma) {} -static inline void vma_mark_detached(struct vm_area_struct *vma) {} - -static inline struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *= mm, - unsigned long address) -{ - return NULL; -} - -static inline void vma_assert_locked(struct vm_area_struct *vma) -{ - mmap_assert_locked(vma->vm_mm); -} - -static inline void vma_assert_stabilised(struct vm_area_struct *vma) -{ - /* If no VMA locks, then either mmap lock suffices to stabilise. 
*/ - mmap_assert_locked(vma->vm_mm); -} - -#endif /* CONFIG_PER_VMA_LOCK */ - static inline void mmap_write_lock(struct mm_struct *mm) { __mmap_lock_trace_start_locking(mm, true); diff -puN include/linux/mm.h~unconditional-vma-locks include/linux/mm.h --- a/include/linux/mm.h~unconditional-vma-locks 2026-04-29 11:18:48.714555= 026 -0700 +++ b/include/linux/mm.h 2026-04-29 11:18:49.091569537 -0700 @@ -890,7 +890,6 @@ static inline void vma_numab_state_free( * These must be here rather than mmap_lock.h as dependent on vm_fault typ= e, * declared in this header. */ -#ifdef CONFIG_PER_VMA_LOCK static inline void release_fault_lock(struct vm_fault *vmf) { if (vmf->flags & FAULT_FLAG_VMA_LOCK) @@ -906,17 +905,6 @@ static inline void assert_fault_locked(c else mmap_assert_locked(vmf->vma->vm_mm); } -#else -static inline void release_fault_lock(struct vm_fault *vmf) -{ - mmap_read_unlock(vmf->vma->vm_mm); -} - -static inline void assert_fault_locked(const struct vm_fault *vmf) -{ - mmap_assert_locked(vmf->vma->vm_mm); -} -#endif /* CONFIG_PER_VMA_LOCK */ =20 static inline bool mm_flags_test(int flag, const struct mm_struct *mm) { diff -puN include/linux/mm_types.h~unconditional-vma-locks include/linux/mm= _types.h --- a/include/linux/mm_types.h~unconditional-vma-locks 2026-04-29 11:18:48.= 761556836 -0700 +++ b/include/linux/mm_types.h 2026-04-29 11:18:49.092569576 -0700 @@ -959,7 +959,6 @@ struct vm_area_struct { vma_flags_t flags; }; =20 -#ifdef CONFIG_PER_VMA_LOCK /* * Can only be written (using WRITE_ONCE()) while holding both: * - mmap_lock (in write mode) @@ -975,7 +974,7 @@ struct vm_area_struct { * slowpath. */ unsigned int vm_lock_seq; -#endif + /* * A file's MAP_PRIVATE vma can be in both i_mmap tree and anon_vma * list, after a COW of one of the file pages. 
A MAP_SHARED vma @@ -1007,7 +1006,6 @@ struct vm_area_struct { #ifdef CONFIG_NUMA_BALANCING struct vma_numab_state *numab_state; /* NUMA Balancing state */ #endif -#ifdef CONFIG_PER_VMA_LOCK /* * Used to keep track of firstly, whether the VMA is attached, secondly, * if attached, how many read locks are taken, and thirdly, if the @@ -1050,7 +1048,6 @@ struct vm_area_struct { #ifdef CONFIG_DEBUG_LOCK_ALLOC struct lockdep_map vmlock_dep_map; #endif -#endif /* * For areas with an address space and backing store, * linkage into the address_space->i_mmap interval tree. @@ -1249,7 +1246,6 @@ struct mm_struct { * init_mm.mmlist, and are protected * by mmlist_lock */ -#ifdef CONFIG_PER_VMA_LOCK struct rcuwait vma_writer_wait; /* * This field has lock-like semantics, meaning it is sometimes @@ -1269,7 +1265,6 @@ struct mm_struct { * mmap_lock. */ seqcount_t mm_lock_seq; -#endif #ifdef CONFIG_FUTEX_PRIVATE_HASH struct mutex futex_hash_lock; struct futex_private_hash __rcu *futex_phash; diff -puN kernel/fork.c~unconditional-vma-locks kernel/fork.c --- a/kernel/fork.c~unconditional-vma-locks 2026-04-29 11:18:48.774557336 -= 0700 +++ b/kernel/fork.c 2026-04-29 11:18:49.092569576 -0700 @@ -1067,9 +1067,7 @@ static void mmap_init_lock(struct mm_str { init_rwsem(&mm->mmap_lock); mm_lock_seqcount_init(mm); -#ifdef CONFIG_PER_VMA_LOCK rcuwait_init(&mm->vma_writer_wait); -#endif } =20 static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct = *p, diff -puN mm/Kconfig~unconditional-vma-locks mm/Kconfig --- a/mm/Kconfig~unconditional-vma-locks 2026-04-29 11:18:48.838559801 -0700 +++ b/mm/Kconfig 2026-04-29 11:18:49.093569614 -0700 @@ -1394,19 +1394,6 @@ config LRU_GEN_STATS config LRU_GEN_WALKS_MMU def_bool y depends on LRU_GEN && ARCH_HAS_HW_PTE_YOUNG -# } - -config ARCH_SUPPORTS_PER_VMA_LOCK - def_bool n - -config PER_VMA_LOCK - def_bool y - depends on ARCH_SUPPORTS_PER_VMA_LOCK && MMU && SMP - help - Allow per-vma locking during page fault handling. 
- - This feature allows locking each virtual memory area separately when - handling page faults instead of taking mmap_lock. =20 config LOCK_MM_AND_FIND_VMA bool diff -puN mm/mmap_lock.c~unconditional-vma-locks mm/mmap_lock.c --- a/mm/mmap_lock.c~unconditional-vma-locks 2026-04-29 11:18:49.084569267 = -0700 +++ b/mm/mmap_lock.c 2026-04-29 11:18:49.093569614 -0700 @@ -44,7 +44,6 @@ EXPORT_SYMBOL(__mmap_lock_do_trace_relea #endif /* CONFIG_TRACING */ =20 #ifdef CONFIG_MMU -#ifdef CONFIG_PER_VMA_LOCK =20 /* State shared across __vma_[start, end]_exclude_readers. */ struct vma_exclude_readers_state { @@ -431,7 +430,6 @@ fallback: =20 return vma; } -#endif /* CONFIG_PER_VMA_LOCK */ =20 #ifdef CONFIG_LOCK_MM_AND_FIND_VMA #include _ From nobody Tue Jun 16 19:33:57 2026 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.8]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AEE843F0AA2 for ; Wed, 29 Apr 2026 18:20:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.8 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777486802; cv=none; b=Pl8MDwqW0sxg9C6vOhYKbllnLWs7f2X9VLFOKEZ7azmchZGcMuUJ6WpzUVxkc5cmx3QcIiQ1klDmD/gh1dhqQWCkWdQCDD/l5PM54rXPoL5xBa8FckjTexSf84ms5kDBEb+6p6VBGT3HOg8Hn+eMzHBVkLyTpL11T168teMQydc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777486802; c=relaxed/simple; bh=MZR0sgC0OINAhIfiXWTKQMo/83zXaasXTmCU+0qPmhA=; h=Subject:To:Cc:From:Date:References:In-Reply-To:Message-Id; b=LJK4lr0qZsk2u7c32LE04zRHhNcXbI2xOGltqHrfsZv9sLlI5utAgAL+N/jKMEmI/vk51g1D6JhiCoqFNIgQQ0PiOOqAHzF3iBcHsPdtKhEh0JisnUc6DTqEysJuPdyiLdq0NEg2mZ/+yexEMaT5KEVRcKbHk9PGp7ENDlEJ6eM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=pass smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) 
header.d=intel.com header.i=@intel.com header.b=Rn80yLxV; arc=none smtp.client-ip=192.198.163.8 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="Rn80yLxV" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1777486801; x=1809022801; h=subject:to:cc:from:date:references:in-reply-to: message-id; bh=MZR0sgC0OINAhIfiXWTKQMo/83zXaasXTmCU+0qPmhA=; b=Rn80yLxVC1Mva54LtJHzyMLWOvIt2zmVN6PVwMxnCUbX+onHspl3qcVQ YWIj0TkqslTIeBZfA/4rer/FptZmvgYgDnikzMa/TASvAg0ZBDdlxR6Rm 6zUpcuPZ6jJcAoK5j4TnTMpSbESrrV95m2JDw2CxxnpaviIRtk0Cgpf0H wMnEOT8lwagXJuuACEvAal3JdGxvTLUO699s8QZpbTO7MB5f6wbdMngnc rPVorcfLOUMudcCINcZylAG3IKO26udoO07i4DIdTv1nw+5xgEE6GVVn0 xKgEgEPn06UuvvS8wZsfMnzXo7yF3Z+CBOUN/Og36l9or/Eogq6SU/lGe Q==; X-CSE-ConnectionGUID: o7p/hs2JTwuE79nlQjAEAw== X-CSE-MsgGUID: OVF1BLS3TzanGKsVMbvRQg== X-IronPort-AV: E=McAfee;i="6800,10657,11771"; a="95990098" X-IronPort-AV: E=Sophos;i="6.23,206,1770624000"; d="scan'208";a="95990098" Received: from fmviesa005.fm.intel.com ([10.60.135.145]) by fmvoesa102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Apr 2026 11:19:58 -0700 X-CSE-ConnectionGUID: 5RlAuTAYS/eiJAWm9uHIOA== X-CSE-MsgGUID: pKdk/cmhSf+g6DM4PSZ6bg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.23,206,1770624000"; d="scan'208";a="239336394" Received: from davehans-spike.ostc.intel.com (HELO localhost.localdomain) ([10.165.164.11]) by fmviesa005.fm.intel.com with ESMTP; 29 Apr 2026 11:19:58 -0700 Subject: [PATCH 2/6] binder: Make shrinker rely solely on per-VMA lock To: linux-kernel@vger.kernel.org Cc: Dave Hansen , Andrew Morton , "Liam R. 
Howlett", linux-mm@kvack.org, Lorenzo Stoakes, Shakeel Butt, Suren Baghdasaryan, Vlastimil Babka
From: Dave Hansen
Date: Wed, 29 Apr 2026 11:19:57 -0700
References: <20260429181954.F50224AE@davehans-spike.ostc.intel.com>
In-Reply-To: <20260429181954.F50224AE@davehans-spike.ostc.intel.com>
Message-Id: <20260429181957.7511C256@davehans-spike.ostc.intel.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Dave Hansen

tl;dr: lock_vma_under_rcu() is already a trylock. No need to do both it and mmap_read_trylock().

Long Version:

== Background ==

Historically, binder used an mmap_read_trylock() in its shrinker code. This ensures that reclaim is not blocked on an mmap_lock.

Commit 95bc2d4a9020 ("binder: use per-vma lock in page reclaiming") added support for the per-VMA lock, but left mmap_read_trylock() as a fallback. This was presumably because the per-VMA locking can fail for several reasons and most (all?) lock_vma_under_rcu() callers have a fallback to mmap_read_trylock().

== Problem ==

The fallback is not worth the complexity here. lock_vma_under_rcu() is essentially already a non-blocking trylock. The main reason it fails is also the reason mmap_read_trylock() fails: something is holding mmap_write_lock(). The only remedy for a collision with mmap_write_lock() is to wait, which this code cannot do.

So the "fallback" after lock_vma_under_rcu() failure is not really a fallback: it is really likely to just be retrying in vain. That retry in and of itself isn't horrible. But it adds complexity.

== Solution ==

Now that per-VMA locks are universally available, lock_vma_under_rcu() will not persistently fail. Rely on it alone and simplify the code.

Full disclosure: I originally tried to do this with lock_vma_under_rcu_wait(), but it did not fit well with the mmap_lock trylock semantics. Claude caught this in a review and suggested the approach in this patch. It seemed sane to me. So, Suggested-by: Claude, I guess.

Signed-off-by: Dave Hansen
Cc: Suren Baghdasaryan
Cc: Andrew Morton
Cc: "Liam R. Howlett"
Cc: Lorenzo Stoakes
Cc: Vlastimil Babka
Cc: Shakeel Butt
Cc: linux-mm@kvack.org
Acked-by: Lorenzo Stoakes with nit below addressed.
Reviewed-by: Suren Baghdasaryan
---

 b/drivers/android/binder_alloc.c |   22 +++++-----------------
 1 file changed, 5 insertions(+), 17 deletions(-)

diff -puN drivers/android/binder_alloc.c~binder-try-vma-lock drivers/android/binder_alloc.c
--- a/drivers/android/binder_alloc.c~binder-try-vma-lock	2026-04-29 11:18:50.066607065 -0700
+++ b/drivers/android/binder_alloc.c	2026-04-29 11:18:50.069607180 -0700
@@ -1142,7 +1142,6 @@ enum lru_status binder_alloc_free_page(s
 	struct vm_area_struct *vma;
 	struct page *page_to_free;
 	unsigned long page_addr;
-	int mm_locked = 0;
 	size_t index;
 
 	if (!mmget_not_zero(mm))
@@ -1151,15 +1150,10 @@ enum lru_status binder_alloc_free_page(s
 	index = mdata->page_index;
 	page_addr = alloc->vm_start + index * PAGE_SIZE;
 
-	/* attempt per-vma lock first */
+	/* attempt per-vma lock */
 	vma = lock_vma_under_rcu(mm, page_addr);
-	if (!vma) {
-		/* fall back to mmap_lock */
-		if (!mmap_read_trylock(mm))
-			goto err_mmap_read_lock_failed;
-		mm_locked = 1;
-		vma = vma_lookup(mm, page_addr);
-	}
+	if (!vma)
+		goto err_mmap_read_lock_failed;
 
 	if (!mutex_trylock(&alloc->mutex))
 		goto err_get_alloc_mutex_failed;
@@ -1191,10 +1185,7 @@ enum lru_status binder_alloc_free_page(s
 	}
 
 	mutex_unlock(&alloc->mutex);
-	if (mm_locked)
-		mmap_read_unlock(mm);
-	else
-		vma_end_read(vma);
+	vma_end_read(vma);
 	mmput_async(mm);
 	binder_free_page(page_to_free);
 
@@ -1203,10 +1194,7 @@ enum lru_status binder_alloc_free_page(s
 err_invalid_vma:
mutex_unlock(&alloc->mutex); err_get_alloc_mutex_failed: - if (mm_locked) - mmap_read_unlock(mm); - else - vma_end_read(vma); + vma_end_read(vma); err_mmap_read_lock_failed: mmput_async(mm); err_mmget: _ From nobody Tue Jun 16 19:33:57 2026 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.8]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CB211402BAC for ; Wed, 29 Apr 2026 18:20:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.8 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777486803; cv=none; b=XLoEmr41H5FucKop4ZTzjc75drV32B3H9cykqAOvsxkUaIpG5VIL3Z8kUJzF9HKRv27nxM3RXK5MyfYKqbgApJdQvICL3rXO9L3vPD7Qz97EJlDTvOpPOVl2MlbWuJQgnHNNKadQZmUquAHKJH1c1itRDoa89Dvv+VI+8EVihN8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777486803; c=relaxed/simple; bh=gAidwY/isSq9GwYadrcKVxfY5GsbxvEPZGfjDxsycxY=; h=Subject:To:Cc:From:Date:References:In-Reply-To:Message-Id; b=Kob1s8GBUmrltFxzKratQVKgDSaq5OCtU7GOkDRxrC7BLo+/r59HaD5PnjjUPPAU7a+M/LXiQ2aD9bCxaehlEjfvtxWSlgPL1qUFI0qDuCy6RpYk+ZjviHB9J0t+CA05v0YQ5bB1MJA6voUkKFI59P+cJpK8U0cPWkaqY99xH+I= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=pass smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=cpZxdqtB; arc=none smtp.client-ip=192.198.163.8 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="cpZxdqtB" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1777486802; 
x=1809022802; h=subject:to:cc:from:date:references:in-reply-to: message-id; bh=gAidwY/isSq9GwYadrcKVxfY5GsbxvEPZGfjDxsycxY=; b=cpZxdqtBweFDsxAH41ac5E1BwFK5GLgIWo22zElbkE1RMaIMh8uIJvKl zGixIdu2t3F12UaQKRv72F/V3nIlq5fbCeEongYnZi3te/vIrTFiiNQI7 BtxthAOH3Tr++cUm32zJKaWLWcalEtcVIEftm2Kb7YdLWcPE5djPRyOKt G/IzKSRdn7ogWjD2p3y75UFcdOI43iXSoApsbfUGdR1G7o3KfRRJB/Ckp myg4WO/8xkB++zVlJ4fTxLZj9cb2CSncS/d6Fo1c+2wKNDbGBfB+9+/WM 9z58S0phbBTjF21buTySeSghXL3BFtTYojiX2laj5SOqk4FeAPm9PxTow A==; X-CSE-ConnectionGUID: RcLOBUC0R7Ot4NicEvrgcA== X-CSE-MsgGUID: Ky2weyG/SjKwGUfeiUGRuQ== X-IronPort-AV: E=McAfee;i="6800,10657,11771"; a="95990109" X-IronPort-AV: E=Sophos;i="6.23,206,1770624000"; d="scan'208";a="95990109" Received: from fmviesa005.fm.intel.com ([10.60.135.145]) by fmvoesa102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Apr 2026 11:20:00 -0700 X-CSE-ConnectionGUID: WNjeZOkkRJuU6X+T+RGEcQ== X-CSE-MsgGUID: 4rIFHkXPQNaEscpxqJpPfA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.23,206,1770624000"; d="scan'208";a="239336402" Received: from davehans-spike.ostc.intel.com (HELO localhost.localdomain) ([10.165.164.11]) by fmviesa005.fm.intel.com with ESMTP; 29 Apr 2026 11:19:59 -0700 Subject: [PATCH 3/6] mm: Add RCU-based VMA lookup that waits for writers To: linux-kernel@vger.kernel.org Cc: Dave Hansen , Andrew Morton , "Liam R. 
Howlett", linux-mm@kvack.org, Lorenzo Stoakes, Shakeel Butt, Suren Baghdasaryan, Vlastimil Babka
From: Dave Hansen
Date: Wed, 29 Apr 2026 11:19:59 -0700
References: <20260429181954.F50224AE@davehans-spike.ostc.intel.com>
In-Reply-To: <20260429181954.F50224AE@davehans-spike.ostc.intel.com>
Message-Id: <20260429181959.BC9DABC5@davehans-spike.ostc.intel.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Dave Hansen

== Background ==

There are basically two parallel ways to look up a VMA: the traditional way, which is protected by mmap_lock, and the per-VMA lock way, which is based on RCU and refcounts.

== Problems ==

The mmap_lock one is more straightforward to use but it has a big disadvantage in that it cannot be mixed with page faults since those can take mmap_lock for read.

The RCU one can be mixed with faults, but it is not available in all configs, so all RCU users need to be able to fall back to the traditional way.

== Solution ==

Add a variant of the RCU-based lookup that waits for writers. This is basically the same as the existing RCU-based lookup, but when that lookup fails it also takes mmap_lock for read, which waits for writers to finish, and then retries the lookup before returning the VMA.

This has two big advantages:

 1. Callers do not need to have a fallback path for when they collide with writers.
 2. It can be used in contexts where page faults can happen because it can take the mmap_lock for read but never *holds* it.

== Discussion ==

I am not married to the naming here at all. Naming suggestions would be much appreciated.

This basically uses mmap_lock to wait for writers, nothing else. The VMA is obviously stable under mmap_read_lock() and the code _can_ likely take advantage of that and possibly even remove the goto.
For instance, it could (probably) bump the VMA refcount and exclude future writers. That would eliminate the goto.

But the approach as-is is probably the smallest line count and arguably the simplest approach. It is a good place to start a conversation if nothing else.

Signed-off-by: Dave Hansen
Cc: Suren Baghdasaryan
Cc: Andrew Morton
Cc: "Liam R. Howlett"
Cc: Lorenzo Stoakes
Cc: Vlastimil Babka
Cc: Shakeel Butt
Cc: linux-mm@kvack.org
---

 b/include/linux/mmap_lock.h |    2 ++
 b/mm/mmap_lock.c            |   43 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+)

diff -puN include/linux/mmap_lock.h~lock-vma-under-rcu-wait include/linux/mmap_lock.h
--- a/include/linux/mmap_lock.h~lock-vma-under-rcu-wait	2026-04-29 11:18:50.633628887 -0700
+++ b/include/linux/mmap_lock.h	2026-04-29 11:18:50.707631737 -0700
@@ -470,6 +470,8 @@ static inline void vma_mark_detached(str
 
 struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
 					  unsigned long address);
+struct vm_area_struct *lock_vma_under_rcu_wait(struct mm_struct *mm,
+					       unsigned long address);
 
 /*
  * Locks next vma pointed by the iterator. Confirms the locked vma has not
diff -puN mm/mmap_lock.c~lock-vma-under-rcu-wait mm/mmap_lock.c
--- a/mm/mmap_lock.c~lock-vma-under-rcu-wait	2026-04-29 11:18:50.704631622 -0700
+++ b/mm/mmap_lock.c	2026-04-29 11:18:50.707631737 -0700
@@ -340,6 +340,49 @@ inval:
 	return NULL;
 }
 
+/*
+ * Find the VMA covering 'address' and lock it for reading. Waits for writers to
+ * finish if the VMA is being modified. Returns NULL if there is no VMA covering
+ * 'address'.
+ *
+ * The fast path does not take mmap lock.
+ */
+struct vm_area_struct *lock_vma_under_rcu_wait(struct mm_struct *mm,
+					       unsigned long address)
+{
+	struct vm_area_struct *vma;
+
+retry:
+	vma = lock_vma_under_rcu(mm, address);
+	/* Fast path: return stable VMA covering 'address': */
+	if (vma)
+		return vma;
+
+	/*
+	 * Slow path: the VMA covering 'address' is being modified,
+	 * or there is no VMA covering 'address'. Rule out the
+	 * possibility that the VMA is being modified:
+	 */
+	mmap_read_lock(mm);
+	vma = vma_lookup(mm, address);
+	mmap_read_unlock(mm);
+
+	/* There was for sure no VMA covering 'address': */
+	if (!vma)
+		return NULL;
+
+	/*
+	 * VMA was likely being modified during RCU lookup. Try again.
+	 * mmap_read_lock() waited for the writer to complete and the
+	 * writer is now done.
+	 *
+	 * There is no guarantee that any single retry will succeed,
+	 * and it is possible but highly unlikely this will loop
+	 * forever.
+	 */
+	goto retry;
+}
+
 static struct vm_area_struct *lock_next_vma_under_mmap_lock(struct mm_struct *mm,
 							     struct vma_iterator *vmi,
 							     unsigned long from_addr)
_

From nobody Tue Jun 16 19:33:57 2026 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.8]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4539A41B346 for ; Wed, 29 Apr 2026 18:20:02 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.8 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777486803; cv=none; b=NwLiBtSeJcZ/HuNlT1SEj2vKS1b79qCRxaBYjdk/ArY1mww87uzWk6WWZvMDgyYEPChTPMx+0vaYZumcA4KsUOM9HpC7Tx6S+WjMAvwH1SewPQ0KTUw677y65+LF6u5S2rDTJuxlIcdw7kbS7mugCy4zMpYXSHplvdZbVZj0pQk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777486803; c=relaxed/simple; bh=+sJGsZ4/2gGZtmGKtRdP26CBc6vMMZiB8DZRyd6JjHk=; h=Subject:To:Cc:From:Date:References:In-Reply-To:Message-Id; b=jNABVH8nX6KwxY6oAxr7Tznxu/QFizRsshgkRN8U+7/7v7slLNpcac6IRH/wAc07VZ9os7/d9iKJ00ADNgTc5uAgVFWgRbc+EfQaGlNNBg1Yn+WMtWpUlJ/ivfIVxtiCxwOpg9EPj7ioYtXy6kVuGbz6Svhw/7UbqEyUD5NZH0E= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=pass smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key)
header.d=intel.com header.i=@intel.com header.b=MfCgKhyv; arc=none smtp.client-ip=192.198.163.8 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="MfCgKhyv" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1777486802; x=1809022802; h=subject:to:cc:from:date:references:in-reply-to: message-id; bh=+sJGsZ4/2gGZtmGKtRdP26CBc6vMMZiB8DZRyd6JjHk=; b=MfCgKhyvwe4nNSaeYKB/PpYVz6YvHnN8tsTkUJ9W8Ntjr4+PUREKko+Y yPAV4/DgLkKIwQy1ftM6I/PjpBCblSc+sLrDHxY78Rg+gv5NGJ9+GwfrB P0bkcuYYVWAKCQnYVwY1Vbn0/XNtQpXjtyqkBW3SupZEuXs+wh0weF1NQ maFjPrglWvmqilj7pEXwOol59k7UbPfBNJsFt9pbztkDCe7FyqNqo7XZ/ mx1O0/FMeWEi7hGP4mQlLzPq1ZorMkCC0uC/SqgZZzDzvLZcJ7pbZCxDu /YSHY7/B1P3ZDF98ZviVGLzyxQfsFGPW6h1Yz4jEAZ5TuheXCqyGFM1aM g==; X-CSE-ConnectionGUID: Ewx7aZOxSuG0yTf2bPvubQ== X-CSE-MsgGUID: slSN5/JaRA6Tm4swWB+kHg== X-IronPort-AV: E=McAfee;i="6800,10657,11771"; a="95990120" X-IronPort-AV: E=Sophos;i="6.23,206,1770624000"; d="scan'208";a="95990120" Received: from fmviesa005.fm.intel.com ([10.60.135.145]) by fmvoesa102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Apr 2026 11:20:01 -0700 X-CSE-ConnectionGUID: gncUGFtqT9GLDg0YFldOQA== X-CSE-MsgGUID: 9ZVGbQVnSfWkW5U9dFOILg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.23,206,1770624000"; d="scan'208";a="239336412" Received: from davehans-spike.ostc.intel.com (HELO localhost.localdomain) ([10.165.164.11]) by fmviesa005.fm.intel.com with ESMTP; 29 Apr 2026 11:20:01 -0700 Subject: [PATCH 4/6] binder: Remove mmap_lock fallback To: linux-kernel@vger.kernel.org Cc: Dave Hansen , Andrew Morton , "Liam R. 
Howlett" , linux-mm@kvack.org, Lorenzo Stoakes , Shakeel Butt , Suren Baghdasaryan , Vlastimil Babka From: Dave Hansen Date: Wed, 29 Apr 2026 11:20:00 -0700 References: <20260429181954.F50224AE@davehans-spike.ostc.intel.com> In-Reply-To: <20260429181954.F50224AE@davehans-spike.ostc.intel.com> Message-Id: <20260429182000.93887DFB@davehans-spike.ostc.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8"

From: Dave Hansen

Previously, the per-VMA locking could fail in the face of writers,
which necessitated a fallback to mmap_lock. The new
lock_vma_under_rcu_wait() waits for writers instead of failing.

Use the new helper. Wait for writers. Remove the fallback to mmap_lock.

Signed-off-by: Dave Hansen
Cc: Suren Baghdasaryan
Cc: Andrew Morton
Cc: "Liam R. Howlett"
Cc: Lorenzo Stoakes
Cc: Vlastimil Babka
Cc: Shakeel Butt
Cc: linux-mm@kvack.org
Acked-by: Lorenzo Stoakes
Reviewed-by: Suren Baghdasaryan
---

 b/drivers/android/binder_alloc.c |   17 +++++------------
 1 file changed, 5 insertions(+), 12 deletions(-)

diff -puN drivers/android/binder_alloc.c~binder-vma-waiter drivers/android/binder_alloc.c
--- a/drivers/android/binder_alloc.c~binder-vma-waiter	2026-04-29 11:18:51.307654829 -0700
+++ b/drivers/android/binder_alloc.c	2026-04-29 11:18:51.310654944 -0700
@@ -259,21 +259,14 @@ static int binder_page_insert(struct bin
 	struct vm_area_struct *vma;
 	int ret = -ESRCH;
 
-	/* attempt per-vma lock first */
-	vma = lock_vma_under_rcu(mm, addr);
-	if (vma) {
-		if (binder_alloc_is_mapped(alloc))
-			ret = vm_insert_page(vma, addr, page);
-		vma_end_read(vma);
+	vma = lock_vma_under_rcu_wait(mm, addr);
+	if (!vma)
 		return ret;
-	}
 
-	/* fall back to mmap_lock */
-	mmap_read_lock(mm);
-	vma = vma_lookup(mm, addr);
-	if (vma && binder_alloc_is_mapped(alloc))
+	if (binder_alloc_is_mapped(alloc))
 		ret = vm_insert_page(vma, addr, page);
-	mmap_read_unlock(mm);
+
+	vma_end_read(vma);
 
 	return ret;
 }
_

From nobody Tue Jun 16 19:33:57 2026 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.8]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0724032B99F for ; Wed, 29 Apr 2026 18:20:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.8 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777486808; cv=none; b=kbEefvx62FRsSPi+EnKc5Tje+EwBjYBf4DRufswWANEYULgc+bQUmX0/nH3ALDwQxDZzWRaxkGWnHITpX8BEZyomDiOOwGFQbl3UGhABII00O1EWpZeiSw36oQ8nr1wIQICLIlHmSyjgylB6JQzfGjifWeY+VX+QAAWn2mHQauk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777486808; c=relaxed/simple; bh=+GTbivOW+M0MfDpe1qWwwf3hHWabdB0SE6jBLZ6CPSM=; h=Subject:To:Cc:From:Date:References:In-Reply-To:Message-Id; b=Oy8ml/+KKbnREOzd0vZJtYvkikypt8RMpFwP8uh0DGhrCdHWbEccEeKbXHkvqEI4evzvo7oMv4b3RyBWIw34D4slmWNdaIHyf2fu9q7/BniBnNe0K31qf943EMH4c4m46SBeyDj2flyFTHnCe53CIZyhuG/aB3clK1StV0A+GZY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=pass smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=Dt4nP5Ke; arc=none smtp.client-ip=192.198.163.8 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="Dt4nP5Ke" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1777486807; x=1809022807; h=subject:to:cc:from:date:references:in-reply-to: message-id;
bh=+GTbivOW+M0MfDpe1qWwwf3hHWabdB0SE6jBLZ6CPSM=; b=Dt4nP5KeGJC5Gsy21TuPfUZiPOnYUKhS6lDmvTHaA7cg9Z8vjuycFLIS Zv7mmaaOtABdIWSpXslznTaZCDp/HJwO0bVEfuBIaww4ZMVJjaCjWqHgi 56lof4r5iU2pImJbnAugZtN8UFREPbqQ5G055AkkktCLSKguHyCqvl4/z sJ4fQx52gh0PxayTfCqwi+HrABa8A/tZaFeyFo9y0dukPfwMFzL38l0D3 Sn0UYdqvHacuiXeaHze3rFBebYrnBmpXcHybjMB+4q9XMwZajqGrhAupo 1WB19MdWoev8SVP6WNrOCkYkYYnF4DKcydR3fQ1ehQkzq/43uxRX5QSbO Q==; X-CSE-ConnectionGUID: LHR2j8emSNiPNNyp48OBpQ== X-CSE-MsgGUID: /3HY1D3ARSG5clluFuANbA== X-IronPort-AV: E=McAfee;i="6800,10657,11771"; a="95990134" X-IronPort-AV: E=Sophos;i="6.23,206,1770624000"; d="scan'208";a="95990134" Received: from fmviesa005.fm.intel.com ([10.60.135.145]) by fmvoesa102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Apr 2026 11:20:06 -0700 X-CSE-ConnectionGUID: /QEVPuNTTE6LY2Xy0HqMUw== X-CSE-MsgGUID: /5JWNQ8AQzSyc14sr4YyRg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.23,206,1770624000"; d="scan'208";a="239336441" Received: from davehans-spike.ostc.intel.com (HELO localhost.localdomain) ([10.165.164.11]) by fmviesa005.fm.intel.com with ESMTP; 29 Apr 2026 11:20:03 -0700 Subject: [PATCH 5/6] tcp: Remove mmap_lock fallback path To: linux-kernel@vger.kernel.org Cc: Dave Hansen , Andrew Morton , "Liam R. Howlett" , linux-mm@kvack.org, Lorenzo Stoakes , Shakeel Butt , Suren Baghdasaryan , Vlastimil Babka From: Dave Hansen Date: Wed, 29 Apr 2026 11:20:02 -0700 References: <20260429181954.F50224AE@davehans-spike.ostc.intel.com> In-Reply-To: <20260429181954.F50224AE@davehans-spike.ostc.intel.com> Message-Id: <20260429182002.BB61C7BC@davehans-spike.ostc.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" From: Dave Hansen Previously, the per-VMA locking could fail in the face of writers which necessitates a fallback to mmap_lock. 
The new lock_vma_under_rcu_wait() will wait for writers instead of
failing.

Use the new helper. Wait for writers. Remove the fallback to mmap_lock.

This really is a nice cleanup. It removes the need to pass the lock
state back and forth to find_tcp_vma().

Signed-off-by: Dave Hansen
Cc: Suren Baghdasaryan
Cc: Andrew Morton
Cc: "Liam R. Howlett"
Cc: Lorenzo Stoakes
Cc: Vlastimil Babka
Cc: Shakeel Butt
Cc: linux-mm@kvack.org
Acked-by: Lorenzo Stoakes
Reviewed-by: Suren Baghdasaryan
---

 b/net/ipv4/tcp.c |   31 +++++++++----------------------
 1 file changed, 9 insertions(+), 22 deletions(-)

diff -puN net/ipv4/tcp.c~ipv4-tcp-vma-waiter net/ipv4/tcp.c
--- a/net/ipv4/tcp.c~ipv4-tcp-vma-waiter	2026-04-29 11:18:51.870676498 -0700
+++ b/net/ipv4/tcp.c	2026-04-29 11:18:51.874676652 -0700
@@ -2171,27 +2171,18 @@ static void tcp_zc_finalize_rx_tstamp(st
 }
 
 static struct vm_area_struct *find_tcp_vma(struct mm_struct *mm,
-					   unsigned long address,
-					   bool *mmap_locked)
+					   unsigned long address)
 {
-	struct vm_area_struct *vma = lock_vma_under_rcu(mm, address);
+	struct vm_area_struct *vma = lock_vma_under_rcu_wait(mm, address);
 
-	if (vma) {
-		if (vma->vm_ops != &tcp_vm_ops) {
-			vma_end_read(vma);
-			return NULL;
-		}
-		*mmap_locked = false;
-		return vma;
-	}
+	if (!vma)
+		return NULL;
 
-	mmap_read_lock(mm);
-	vma = vma_lookup(mm, address);
-	if (!vma || vma->vm_ops != &tcp_vm_ops) {
-		mmap_read_unlock(mm);
+	if (vma->vm_ops != &tcp_vm_ops) {
+		vma_end_read(vma);
 		return NULL;
 	}
-	*mmap_locked = true;
+
 	return vma;
 }
 
@@ -2212,7 +2203,6 @@ static int tcp_zerocopy_receive(struct s
 	u32 seq = tp->copied_seq;
 	u32 total_bytes_to_map;
 	int inq = tcp_inq(sk);
-	bool mmap_locked;
 	int ret;
 
 	zc->copybuf_len = 0;
@@ -2237,7 +2227,7 @@ static int tcp_zerocopy_receive(struct s
 		return 0;
 	}
 
-	vma = find_tcp_vma(current->mm, address, &mmap_locked);
+	vma = find_tcp_vma(current->mm, address);
 	if (!vma)
 		return -EINVAL;
 
@@ -2319,10 +2309,7 @@ static int tcp_zerocopy_receive(struct s
 						   zc, total_bytes_to_map);
 	}
 out:
-	if (mmap_locked)
-		mmap_read_unlock(current->mm);
-	else
-		vma_end_read(vma);
+	vma_end_read(vma);
 	/* Try to copy straggler data. */
 	if (!ret)
 		copylen = tcp_zc_handle_leftover(zc, sk, skb, &seq, copybuf_len, tss);
_

From nobody Tue Jun 16 19:33:57 2026 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.8]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7D60A413222 for ; Wed, 29 Apr 2026 18:20:08 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.8 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777486809; cv=none; b=KLkpipDKBLEaW2fyVMDggImXBTY7GWN80mA2BTCWdW3k0Xbuos9hUnMC06VO/BSHiAkvA1WuH3CQdhLCgtgvQd6EwYcEyf4WFCqBVSv1qfzvox7d2pcphPZZHK1lw8mvP4Pg6zUzXC9omlOTkz7PjyNWABaqhX5aSZt4eQXJYyI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777486809; c=relaxed/simple; bh=7dFDPzrIAbzJLwmxK2im4erAqAsRrcXN0YYEP3dJALk=; h=Subject:To:Cc:From:Date:References:In-Reply-To:Message-Id; b=P0i40NAtJLbOa638XN51lCLIZqafohpIEyb9F75wUwSW/tH4xntzIqZn82UkEZtx1AYNPsuLELkH//5DdWB2GDHyYdTJfrs337s2cBctJP4NsKt40dag0z5CvD0Y3+22pAmpMgW6V9EIUUoh0OJy6m+OTsrEGlkgUlHFozqnO/Q= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=pass smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=YyNZs0HM; arc=none smtp.client-ip=192.198.163.8 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="YyNZs0HM" DKIM-Signature: v=1; a=rsa-sha256;
c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1777486808; x=1809022808; h=subject:to:cc:from:date:references:in-reply-to: message-id; bh=7dFDPzrIAbzJLwmxK2im4erAqAsRrcXN0YYEP3dJALk=; b=YyNZs0HM21EEYkYUPCRf8OjfRexZYzFg2ojbgR/dxQPRcbbANZ7zdMf1 yg4fUhK0W1Sm7rX8LW3YSIRi6DyvVyLZQ/IIO7xG02NGRHu7gI/pB0QSZ MuFDtfw4MPI9Peaz3+tGN4jxn+buO6Y8JhxBI8XisWGuilwB3QpAKmZxw ABAiwFfUFZW1Oat+99xWuEKC209BRXXRSG+bBeLKPzv9xqTke0reVj9T2 NgDNmroInbAnoFZrjRyMRT+tGkZODbcV2pjyXR/H/GcpKeiyupOHh/akV 1kULNJ1Fi87HKQwIj49kOiU2p7htjjuX0+gWIGhpucMQDDyD2mgT/EvDA g==; X-CSE-ConnectionGUID: RW7OSy2RT+aIp1aJjK6kWg== X-CSE-MsgGUID: 9/8lzUKCRCKmmrJsS7EHDw== X-IronPort-AV: E=McAfee;i="6800,10657,11771"; a="95990141" X-IronPort-AV: E=Sophos;i="6.23,206,1770624000"; d="scan'208";a="95990141" Received: from fmviesa005.fm.intel.com ([10.60.135.145]) by fmvoesa102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Apr 2026 11:20:06 -0700 X-CSE-ConnectionGUID: PD/Hu5SWQpmU0LRPzM05AQ== X-CSE-MsgGUID: xbkb5ieTSnKs2CexBURa/Q== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.23,206,1770624000"; d="scan'208";a="239336473" Received: from davehans-spike.ostc.intel.com (HELO localhost.localdomain) ([10.165.164.11]) by fmviesa005.fm.intel.com with ESMTP; 29 Apr 2026 11:20:06 -0700 Subject: [PATCH 6/6] x86/mm: Avoid mmap lock for shadow stack pop fast path To: linux-kernel@vger.kernel.org Cc: Dave Hansen , Andrew Morton , "Liam R. 
Howlett" , linux-mm@kvack.org, Lorenzo Stoakes , Shakeel Butt , Suren Baghdasaryan , Vlastimil Babka From: Dave Hansen Date: Wed, 29 Apr 2026 11:20:05 -0700 References: <20260429181954.F50224AE@davehans-spike.ostc.intel.com> In-Reply-To: <20260429181954.F50224AE@davehans-spike.ostc.intel.com> Message-Id: <20260429182005.00BF70D8@davehans-spike.ostc.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8"

From: Dave Hansen

The shadow stack code needs to look at the VMA from which it is reading
a userspace "token" to ensure that the memory is shadow stack memory.
If it did not do this, it might read the token from non-shadow-stack
memory, which could result in a control flow hijack.

But that lookup requires two things:
 * Looking at a VMA, which must be locked
 * Touching userspace

That's a bit of a pain because mmap_lock can not be held while touching
userspace. So the code has to drop the lock, touch userspace, then
re-acquire the lock and check if the VMA might have changed.

The current implementation does this with a combination of holding
mmap_lock and looping if the VMA might have changed. It works great.
But the lock_vma_under_rcu_wait() API is a little simpler and also does
not use mmap_lock in its fast path.

Switch to lock_vma_under_rcu_wait().

BTW, this does swap in a mmap_read_lock() for
mmap_read_lock_killable(). That obviously isn't ideal, but it's
trivially fixable with another variant of the helper. I'd appreciate
it if we could handwave that away for the moment. :)

Signed-off-by: Dave Hansen
Cc: Suren Baghdasaryan
Cc: Andrew Morton
Cc: "Liam R. Howlett"
Cc: Lorenzo Stoakes
Cc: Vlastimil Babka
Cc: Shakeel Butt
Cc: linux-mm@kvack.org
---

 b/arch/x86/kernel/shstk.c |   47 +++++++++++++++++------------------------------
 1 file changed, 17 insertions(+), 30 deletions(-)

diff -puN arch/x86/kernel/shstk.c~shstk-pop-rcu arch/x86/kernel/shstk.c
--- a/arch/x86/kernel/shstk.c~shstk-pop-rcu	2026-04-29 11:18:52.425697858 -0700
+++ b/arch/x86/kernel/shstk.c	2026-04-29 11:18:52.428697973 -0700
@@ -326,8 +326,9 @@ static int shstk_push_sigframe(unsigned
 
 static int shstk_pop_sigframe(unsigned long *ssp)
 {
+	struct vm_area_struct *vma;
 	unsigned long token_addr;
-	unsigned int seq;
+	int err;
 
 	/*
 	 * It is possible for the SSP to be off the end of a shadow stack by 4
@@ -338,35 +339,21 @@ static int shstk_pop_sigframe(unsigned l
 	if (!IS_ALIGNED(*ssp, 8))
 		return -EINVAL;
 
-	do {
-		struct vm_area_struct *vma;
-		bool valid_vma;
-		int err;
-
-		if (mmap_read_lock_killable(current->mm))
-			return -EINTR;
-
-		vma = find_vma(current->mm, *ssp);
-		valid_vma = vma && (vma->vm_flags & VM_SHADOW_STACK);
-
-		/*
-		 * VMAs can change between get_shstk_data() and find_vma().
-		 * Watch for changes and ensure that 'token_addr' comes from
-		 * 'vma' by recording a seqcount.
-		 *
-		 * Ignore the return value of mmap_lock_speculate_try_begin()
-		 * because the mmap lock excludes the possibility of writers.
-		 */
-		mmap_lock_speculate_try_begin(current->mm, &seq);
-		mmap_read_unlock(current->mm);
-
-		if (!valid_vma)
-			return -EINVAL;
-
-		err = get_shstk_data(&token_addr, (unsigned long __user *)*ssp);
-		if (err)
-			return err;
-	} while (mmap_lock_speculate_retry(current->mm, seq));
+	vma = lock_vma_under_rcu_wait(current->mm, *ssp);
+	if (!vma)
+		return -EINVAL;
+
+	if (!(vma->vm_flags & VM_SHADOW_STACK)) {
+		vma_end_read(vma);
+		return -EINVAL;
+	}
+
+	err = get_shstk_data(&token_addr, (unsigned long __user *)*ssp);
+
+	vma_end_read(vma);
+
+	if (err)
+		return err;
 
 	/* Restore SSP aligned? */
 	if (unlikely(!IS_ALIGNED(token_addr, 8)))
_