From: Suren Baghdasaryan
Date: Mon, 31 Jul 2023 10:12:27 -0700
Message-ID: <20230731171233.1098105-2-surenb@google.com>
In-Reply-To: <20230731171233.1098105-1-surenb@google.com>
Subject: [PATCH 1/6] mm: enable page walking API to lock vmas during the walk
To: akpm@linux-foundation.org
Cc: torvalds@linux-foundation.org, jannh@google.com, willy@infradead.org, liam.howlett@oracle.com, david@redhat.com, peterx@redhat.com, ldufour@linux.ibm.com, vbabka@suse.cz, michel@lespinasse.org, jglisse@google.com, mhocko@suse.com, hannes@cmpxchg.org, dave@stgolabs.net, hughd@google.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, stable@vger.kernel.org,
	Suren Baghdasaryan, Linus Torvalds

walk_page_range() and friends often operate under a write-locked mmap_lock.
With the introduction of vma locks, the vmas also have to be locked during
such walks to prevent concurrent page faults in these areas. Add an
additional parameter to the walk_page_range() family of functions to
indicate which walks should lock the vmas before operating on them.

Cc: stable@vger.kernel.org # 6.4.x
Suggested-by: Linus Torvalds
Suggested-by: Jann Horn
Signed-off-by: Suren Baghdasaryan
---
 arch/powerpc/mm/book3s64/subpage_prot.c |  2 +-
 arch/riscv/mm/pageattr.c                |  4 ++--
 arch/s390/mm/gmap.c                     | 10 +++++-----
 fs/proc/task_mmu.c                      | 10 +++++-----
 include/linux/pagewalk.h                |  6 +++---
 mm/damon/vaddr.c                        |  4 ++--
 mm/hmm.c                                |  2 +-
 mm/ksm.c                                | 16 ++++++++--------
 mm/madvise.c                            |  8 ++++----
 mm/memcontrol.c                         |  4 ++--
 mm/memory-failure.c                     |  2 +-
 mm/mempolicy.c                          | 12 ++++--------
 mm/migrate_device.c                     |  2 +-
 mm/mincore.c                            |  2 +-
 mm/mlock.c                              |  2 +-
 mm/mprotect.c                           |  2 +-
 mm/pagewalk.c                           | 13 ++++++++++---
 mm/vmscan.c                             |  3 ++-
 18 files changed, 54 insertions(+), 50 deletions(-)

diff --git a/arch/powerpc/mm/book3s64/subpage_prot.c b/arch/powerpc/mm/book3s64/subpage_prot.c
index 0dc85556dec5..177e5c646d9c 100644
--- a/arch/powerpc/mm/book3s64/subpage_prot.c
+++ b/arch/powerpc/mm/book3s64/subpage_prot.c
@@ -159,7 +159,7 @@ static void subpage_mark_vma_nohuge(struct mm_struct *mm, unsigned long addr,
 	 */
 	for_each_vma_range(vmi, vma, addr + len) {
 		vm_flags_set(vma, VM_NOHUGEPAGE);
-		walk_page_vma(vma, &subpage_walk_ops, NULL);
+		walk_page_vma(vma, &subpage_walk_ops, true, NULL);
 	}
 }
 #else
diff --git a/arch/riscv/mm/pageattr.c b/arch/riscv/mm/pageattr.c
index ea3d61de065b..95207994cbf0 100644
--- a/arch/riscv/mm/pageattr.c
+++ b/arch/riscv/mm/pageattr.c
@@ -167,7 +167,7 @@ int set_direct_map_invalid_noflush(struct page *page)
 	};
 
 	mmap_read_lock(&init_mm);
-	ret = walk_page_range(&init_mm, start, end, &pageattr_ops, &masks);
+	ret = walk_page_range(&init_mm, start, end, &pageattr_ops, false, &masks);
 	mmap_read_unlock(&init_mm);
 
 	return ret;
@@ -184,7 +184,7 @@ int set_direct_map_default_noflush(struct page *page)
 	};
 
 	mmap_read_lock(&init_mm);
-	ret = walk_page_range(&init_mm, start, end, &pageattr_ops, &masks);
+	ret = walk_page_range(&init_mm, start, end, &pageattr_ops, false, &masks);
 	mmap_read_unlock(&init_mm);
 
 	return ret;
diff --git a/arch/s390/mm/gmap.c b/arch/s390/mm/gmap.c
index 9c8af31be970..16a58c860c74 100644
--- a/arch/s390/mm/gmap.c
+++ b/arch/s390/mm/gmap.c
@@ -2523,7 +2523,7 @@ static inline void thp_split_mm(struct mm_struct *mm)
 
 	for_each_vma(vmi, vma) {
 		vm_flags_mod(vma, VM_NOHUGEPAGE, VM_HUGEPAGE);
-		walk_page_vma(vma, &thp_split_walk_ops, NULL);
+		walk_page_vma(vma, &thp_split_walk_ops, true, NULL);
 	}
 	mm->def_flags |= VM_NOHUGEPAGE;
 }
@@ -2584,7 +2584,7 @@ int s390_enable_sie(void)
 	mm->context.has_pgste = 1;
 	/* split thp mappings and disable thp for future mappings */
 	thp_split_mm(mm);
-	walk_page_range(mm, 0, TASK_SIZE, &zap_zero_walk_ops, NULL);
+	walk_page_range(mm, 0, TASK_SIZE, &zap_zero_walk_ops, true, NULL);
 	mmap_write_unlock(mm);
 	return 0;
 }
@@ -2672,7 +2672,7 @@ int s390_enable_skey(void)
 		mm->context.uses_skeys = 0;
 		goto out_up;
 	}
-	walk_page_range(mm, 0, TASK_SIZE, &enable_skey_walk_ops, NULL);
+	walk_page_range(mm, 0, TASK_SIZE, &enable_skey_walk_ops, true, NULL);
 
 out_up:
 	mmap_write_unlock(mm);
@@
-2697,7 +2697,7 @@ static const struct mm_walk_ops reset_cmma_walk_ops = =3D { void s390_reset_cmma(struct mm_struct *mm) { mmap_write_lock(mm); - walk_page_range(mm, 0, TASK_SIZE, &reset_cmma_walk_ops, NULL); + walk_page_range(mm, 0, TASK_SIZE, &reset_cmma_walk_ops, true, NULL); mmap_write_unlock(mm); } EXPORT_SYMBOL_GPL(s390_reset_cmma); @@ -2771,7 +2771,7 @@ int __s390_uv_destroy_range(struct mm_struct *mm, uns= igned long start, while (r > 0) { state.count =3D 0; mmap_read_lock(mm); - r =3D walk_page_range(mm, state.next, end, &gather_pages_ops, &state); + r =3D walk_page_range(mm, state.next, end, &gather_pages_ops, false, &st= ate); mmap_read_unlock(mm); cond_resched(); s390_uv_destroy_pfns(state.count, state.pfns); diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index 507cd4e59d07..f0d0f2959f91 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -804,9 +804,9 @@ static void smap_gather_stats(struct vm_area_struct *vm= a, =20 /* mmap_lock is held in m_start */ if (!start) - walk_page_vma(vma, ops, mss); + walk_page_vma(vma, ops, false, mss); else - walk_page_range(vma->vm_mm, start, vma->vm_end, ops, mss); + walk_page_range(vma->vm_mm, start, vma->vm_end, ops, false, mss); } =20 #define SEQ_PUT_DEC(str, val) \ @@ -1307,7 +1307,7 @@ static ssize_t clear_refs_write(struct file *file, co= nst char __user *buf, 0, mm, 0, -1UL); mmu_notifier_invalidate_range_start(&range); } - walk_page_range(mm, 0, -1, &clear_refs_walk_ops, &cp); + walk_page_range(mm, 0, -1, &clear_refs_walk_ops, true, &cp); if (type =3D=3D CLEAR_REFS_SOFT_DIRTY) { mmu_notifier_invalidate_range_end(&range); flush_tlb_mm(mm); @@ -1720,7 +1720,7 @@ static ssize_t pagemap_read(struct file *file, char _= _user *buf, ret =3D mmap_read_lock_killable(mm); if (ret) goto out_free; - ret =3D walk_page_range(mm, start_vaddr, end, &pagemap_ops, &pm); + ret =3D walk_page_range(mm, start_vaddr, end, &pagemap_ops, false, &pm); mmap_read_unlock(mm); start_vaddr =3D end; =20 @@ -1981,7 +1981,7 @@ static int show_numa_map(struct seq_file *m, void *v) seq_puts(m, " huge"); =20 /* mmap_lock is held by m_start */ - walk_page_vma(vma, &show_numa_ops, md); + walk_page_vma(vma, &show_numa_ops, false, md); =20 if (!md->pages) goto out; diff --git a/include/linux/pagewalk.h b/include/linux/pagewalk.h index 27a6df448ee5..69656ec44049 100644 --- a/include/linux/pagewalk.h +++ b/include/linux/pagewalk.h @@ -105,16 +105,16 @@ struct mm_walk { =20 int walk_page_range(struct mm_struct *mm, unsigned long start, unsigned long end, const struct mm_walk_ops *ops, - void *private); + bool lock_vma, void *private); int walk_page_range_novma(struct mm_struct *mm, unsigned long start, unsigned long end, const struct mm_walk_ops *ops, pgd_t *pgd, void *private); int walk_page_range_vma(struct vm_area_struct *vma, unsigned long start, unsigned long end, const struct mm_walk_ops *ops, - void *private); + bool lock_vma, void *private); int walk_page_vma(struct vm_area_struct *vma, const struct mm_walk_ops *op= s, - void *private); + bool lock_vma, void *private); int walk_page_mapping(struct address_space *mapping, pgoff_t first_index, pgoff_t nr, const struct mm_walk_ops *ops, void *private); diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c index 2fcc9731528a..54f50b1aefe4 100644 --- a/mm/damon/vaddr.c +++ b/mm/damon/vaddr.c @@ -391,7 +391,7 @@ static const struct mm_walk_ops damon_mkold_ops =3D { static void damon_va_mkold(struct mm_struct *mm, unsigned long addr) { mmap_read_lock(mm); - walk_page_range(mm, addr, addr + 1, &damon_mkold_ops, NULL); + 
walk_page_range(mm, addr, addr + 1, &damon_mkold_ops, false, NULL); mmap_read_unlock(mm); } =20 @@ -536,7 +536,7 @@ static bool damon_va_young(struct mm_struct *mm, unsign= ed long addr, }; =20 mmap_read_lock(mm); - walk_page_range(mm, addr, addr + 1, &damon_young_ops, &arg); + walk_page_range(mm, addr, addr + 1, &damon_young_ops, false, &arg); mmap_read_unlock(mm); return arg.young; } diff --git a/mm/hmm.c b/mm/hmm.c index 855e25e59d8f..f94f5e268e40 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -600,7 +600,7 @@ int hmm_range_fault(struct hmm_range *range) range->notifier_seq)) return -EBUSY; ret =3D walk_page_range(mm, hmm_vma_walk.last, range->end, - &hmm_walk_ops, &hmm_vma_walk); + &hmm_walk_ops, false, &hmm_vma_walk); /* * When -EBUSY is returned the loop restarts with * hmm_vma_walk.last set to an address that has not been stored diff --git a/mm/ksm.c b/mm/ksm.c index ba266359da55..494a1f3fcb97 100644 --- a/mm/ksm.c +++ b/mm/ksm.c @@ -470,7 +470,7 @@ static const struct mm_walk_ops break_ksm_ops =3D { * of the process that owns 'vma'. We also do not want to enforce * protection keys here anyway. */ -static int break_ksm(struct vm_area_struct *vma, unsigned long addr) +static int break_ksm(struct vm_area_struct *vma, unsigned long addr, bool = lock_vma) { vm_fault_t ret =3D 0; =20 @@ -479,7 +479,7 @@ static int break_ksm(struct vm_area_struct *vma, unsign= ed long addr) =20 cond_resched(); ksm_page =3D walk_page_range_vma(vma, addr, addr + 1, - &break_ksm_ops, NULL); + &break_ksm_ops, lock_vma, NULL); if (WARN_ON_ONCE(ksm_page < 0)) return ksm_page; if (!ksm_page) @@ -565,7 +565,7 @@ static void break_cow(struct ksm_rmap_item *rmap_item) mmap_read_lock(mm); vma =3D find_mergeable_vma(mm, addr); if (vma) - break_ksm(vma, addr); + break_ksm(vma, addr, false); mmap_read_unlock(mm); } =20 @@ -871,7 +871,7 @@ static void remove_trailing_rmap_items(struct ksm_rmap_= item **rmap_list) * in cmp_and_merge_page on one of the rmap_items we would be removing. 
*/ static int unmerge_ksm_pages(struct vm_area_struct *vma, - unsigned long start, unsigned long end) + unsigned long start, unsigned long end, bool lock_vma) { unsigned long addr; int err =3D 0; @@ -882,7 +882,7 @@ static int unmerge_ksm_pages(struct vm_area_struct *vma, if (signal_pending(current)) err =3D -ERESTARTSYS; else - err =3D break_ksm(vma, addr); + err =3D break_ksm(vma, addr, lock_vma); } return err; } @@ -1029,7 +1029,7 @@ static int unmerge_and_remove_all_rmap_items(void) if (!(vma->vm_flags & VM_MERGEABLE) || !vma->anon_vma) continue; err =3D unmerge_ksm_pages(vma, - vma->vm_start, vma->vm_end); + vma->vm_start, vma->vm_end, false); if (err) goto error; } @@ -2530,7 +2530,7 @@ static int __ksm_del_vma(struct vm_area_struct *vma) return 0; =20 if (vma->anon_vma) { - err =3D unmerge_ksm_pages(vma, vma->vm_start, vma->vm_end); + err =3D unmerge_ksm_pages(vma, vma->vm_start, vma->vm_end, true); if (err) return err; } @@ -2668,7 +2668,7 @@ int ksm_madvise(struct vm_area_struct *vma, unsigned = long start, return 0; /* just ignore the advice */ =20 if (vma->anon_vma) { - err =3D unmerge_ksm_pages(vma, start, end); + err =3D unmerge_ksm_pages(vma, start, end, true); if (err) return err; } diff --git a/mm/madvise.c b/mm/madvise.c index 886f06066622..0e484111a1d2 100644 --- a/mm/madvise.c +++ b/mm/madvise.c @@ -287,7 +287,7 @@ static long madvise_willneed(struct vm_area_struct *vma, *prev =3D vma; #ifdef CONFIG_SWAP if (!file) { - walk_page_range(vma->vm_mm, start, end, &swapin_walk_ops, vma); + walk_page_range(vma->vm_mm, start, end, &swapin_walk_ops, false, vma); lru_add_drain(); /* Push any new pages onto the LRU now */ return 0; } @@ -546,7 +546,7 @@ static void madvise_cold_page_range(struct mmu_gather *= tlb, }; =20 tlb_start_vma(tlb, vma); - walk_page_range(vma->vm_mm, addr, end, &cold_walk_ops, &walk_private); + walk_page_range(vma->vm_mm, addr, end, &cold_walk_ops, false, &walk_priva= te); tlb_end_vma(tlb, vma); } =20 @@ -584,7 +584,7 @@ static void madvise_pageout_page_range(struct mmu_gathe= r *tlb, }; =20 tlb_start_vma(tlb, vma); - walk_page_range(vma->vm_mm, addr, end, &cold_walk_ops, &walk_private); + walk_page_range(vma->vm_mm, addr, end, &cold_walk_ops, false, &walk_priva= te); tlb_end_vma(tlb, vma); } =20 @@ -786,7 +786,7 @@ static int madvise_free_single_vma(struct vm_area_struc= t *vma, mmu_notifier_invalidate_range_start(&range); tlb_start_vma(&tlb, vma); walk_page_range(vma->vm_mm, range.start, range.end, - &madvise_free_walk_ops, &tlb); + &madvise_free_walk_ops, false, &tlb); tlb_end_vma(&tlb, vma); mmu_notifier_invalidate_range_end(&range); tlb_finish_mmu(&tlb); diff --git a/mm/memcontrol.c b/mm/memcontrol.c index e8ca4bdcb03c..76aaadbd4bf9 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -6031,7 +6031,7 @@ static unsigned long mem_cgroup_count_precharge(struc= t mm_struct *mm) unsigned long precharge; =20 mmap_read_lock(mm); - walk_page_range(mm, 0, ULONG_MAX, &precharge_walk_ops, NULL); + walk_page_range(mm, 0, ULONG_MAX, &precharge_walk_ops, false, NULL); mmap_read_unlock(mm); =20 precharge =3D mc.precharge; @@ -6332,7 +6332,7 @@ static void mem_cgroup_move_charge(void) * When we have consumed all precharges and failed in doing * additional charge, the page walk just aborts. 
*/ - walk_page_range(mc.mm, 0, ULONG_MAX, &charge_walk_ops, NULL); + walk_page_range(mc.mm, 0, ULONG_MAX, &charge_walk_ops, false, NULL); mmap_read_unlock(mc.mm); atomic_dec(&mc.from->moving_account); } diff --git a/mm/memory-failure.c b/mm/memory-failure.c index ece5d481b5ff..763297df9240 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -860,7 +860,7 @@ static int kill_accessing_process(struct task_struct *p= , unsigned long pfn, =20 mmap_read_lock(p->mm); ret =3D walk_page_range(p->mm, 0, TASK_SIZE, &hwp_walk_ops, - (void *)&priv); + false, (void *)&priv); if (ret =3D=3D 1 && priv.tk.addr) kill_proc(&priv.tk, pfn, flags); else diff --git a/mm/mempolicy.c b/mm/mempolicy.c index c53f8beeb507..70ba53c70700 100644 --- a/mm/mempolicy.c +++ b/mm/mempolicy.c @@ -738,7 +738,7 @@ static const struct mm_walk_ops queue_pages_walk_ops = =3D { static int queue_pages_range(struct mm_struct *mm, unsigned long start, unsigned long= end, nodemask_t *nodes, unsigned long flags, - struct list_head *pagelist) + struct list_head *pagelist, bool lock_vma) { int err; struct queue_pages qp =3D { @@ -750,7 +750,7 @@ queue_pages_range(struct mm_struct *mm, unsigned long s= tart, unsigned long end, .first =3D NULL, }; =20 - err =3D walk_page_range(mm, start, end, &queue_pages_walk_ops, &qp); + err =3D walk_page_range(mm, start, end, &queue_pages_walk_ops, lock_vma, = &qp); =20 if (!qp.first) /* whole range in hole */ @@ -1078,7 +1078,7 @@ static int migrate_to_node(struct mm_struct *mm, int = source, int dest, vma =3D find_vma(mm, 0); VM_BUG_ON(!(flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL))); queue_pages_range(mm, vma->vm_start, mm->task_size, &nmask, - flags | MPOL_MF_DISCONTIG_OK, &pagelist); + flags | MPOL_MF_DISCONTIG_OK, &pagelist, false); =20 if (!list_empty(&pagelist)) { err =3D migrate_pages(&pagelist, alloc_migration_target, NULL, @@ -1321,12 +1321,8 @@ static long do_mbind(unsigned long start, unsigned l= ong len, * Lock the VMAs before scanning for pages to migrate, to ensure we don't * miss a concurrently inserted page. 
*/ - vma_iter_init(&vmi, mm, start); - for_each_vma_range(vmi, vma, end) - vma_start_write(vma); - ret =3D queue_pages_range(mm, start, end, nmask, - flags | MPOL_MF_INVERT, &pagelist); + flags | MPOL_MF_INVERT, &pagelist, true); =20 if (ret < 0) { err =3D ret; diff --git a/mm/migrate_device.c b/mm/migrate_device.c index 8365158460ed..1bc9937bf1fb 100644 --- a/mm/migrate_device.c +++ b/mm/migrate_device.c @@ -304,7 +304,7 @@ static void migrate_vma_collect(struct migrate_vma *mig= rate) mmu_notifier_invalidate_range_start(&range); =20 walk_page_range(migrate->vma->vm_mm, migrate->start, migrate->end, - &migrate_vma_walk_ops, migrate); + &migrate_vma_walk_ops, false, migrate); =20 mmu_notifier_invalidate_range_end(&range); migrate->end =3D migrate->start + (migrate->npages << PAGE_SHIFT); diff --git a/mm/mincore.c b/mm/mincore.c index b7f7a516b26c..a06288c6c126 100644 --- a/mm/mincore.c +++ b/mm/mincore.c @@ -198,7 +198,7 @@ static long do_mincore(unsigned long addr, unsigned lon= g pages, unsigned char *v memset(vec, 1, pages); return pages; } - err =3D walk_page_range(vma->vm_mm, addr, end, &mincore_walk_ops, vec); + err =3D walk_page_range(vma->vm_mm, addr, end, &mincore_walk_ops, false, = vec); if (err < 0) return err; return (end - addr) >> PAGE_SHIFT; diff --git a/mm/mlock.c b/mm/mlock.c index 0a0c996c5c21..3634de0b28e3 100644 --- a/mm/mlock.c +++ b/mm/mlock.c @@ -389,7 +389,7 @@ static void mlock_vma_pages_range(struct vm_area_struct= *vma, vm_flags_reset_once(vma, newflags); =20 lru_add_drain(); - walk_page_range(vma->vm_mm, start, end, &mlock_walk_ops, NULL); + walk_page_range(vma->vm_mm, start, end, &mlock_walk_ops, true, NULL); lru_add_drain(); =20 if (newflags & VM_IO) { diff --git a/mm/mprotect.c b/mm/mprotect.c index 6f658d483704..f781f709c39d 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -599,7 +599,7 @@ mprotect_fixup(struct vma_iterator *vmi, struct mmu_gat= her *tlb, pgprot_t new_pgprot =3D vm_get_page_prot(newflags); =20 error =3D walk_page_range(current->mm, start, end, - &prot_none_walk_ops, &new_pgprot); + &prot_none_walk_ops, true, &new_pgprot); if (error) return error; } diff --git a/mm/pagewalk.c b/mm/pagewalk.c index 2022333805d3..7503885fae75 100644 --- a/mm/pagewalk.c +++ b/mm/pagewalk.c @@ -406,6 +406,7 @@ static int __walk_page_range(unsigned long start, unsig= ned long end, * @start: start address of the virtual address range * @end: end address of the virtual address range * @ops: operation to call during the walk + * @lock_vma write-lock the vma before operating on it * @private: private data for callbacks' usage * * Recursively walk the page table tree of the process represented by @mm @@ -442,7 +443,7 @@ static int __walk_page_range(unsigned long start, unsig= ned long end, */ int walk_page_range(struct mm_struct *mm, unsigned long start, unsigned long end, const struct mm_walk_ops *ops, - void *private) + bool lock_vma, void *private) { int err =3D 0; unsigned long next; @@ -474,6 +475,8 @@ int walk_page_range(struct mm_struct *mm, unsigned long= start, if (ops->pte_hole) err =3D ops->pte_hole(start, next, -1, &walk); } else { /* inside vma */ + if (lock_vma) + vma_start_write(vma); walk.vma =3D vma; next =3D min(end, vma->vm_end); vma =3D find_vma(mm, vma->vm_end); @@ -535,7 +538,7 @@ int walk_page_range_novma(struct mm_struct *mm, unsigne= d long start, =20 int walk_page_range_vma(struct vm_area_struct *vma, unsigned long start, unsigned long end, const struct mm_walk_ops *ops, - void *private) + bool lock_vma, void *private) { struct mm_walk walk 
= {
 		.ops		= ops,
@@ -550,11 +553,13 @@ int walk_page_range_vma(struct vm_area_struct *vma, unsigned long start,
 		return -EINVAL;
 
 	mmap_assert_locked(walk.mm);
+	if (lock_vma)
+		vma_start_write(vma);
 	return __walk_page_range(start, end, &walk);
 }
 
 int walk_page_vma(struct vm_area_struct *vma, const struct mm_walk_ops *ops,
-		void *private)
+		bool lock_vma, void *private)
 {
 	struct mm_walk walk = {
 		.ops		= ops,
@@ -567,6 +572,8 @@ int walk_page_vma(struct vm_area_struct *vma, const struct mm_walk_ops *ops,
 		return -EINVAL;
 
 	mmap_assert_locked(walk.mm);
+	if (lock_vma)
+		vma_start_write(vma);
 	return __walk_page_range(vma->vm_start, vma->vm_end, &walk);
 }
 
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 1080209a568b..d85f86871fd9 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4306,7 +4306,8 @@ static void walk_mm(struct lruvec *lruvec, struct mm_struct *mm, struct lru_gen_
 
 	/* the caller might be holding the lock for write */
 	if (mmap_read_trylock(mm)) {
-		err = walk_page_range(mm, walk->next_addr, ULONG_MAX, &mm_walk_ops, walk);
+		err = walk_page_range(mm, walk->next_addr, ULONG_MAX,
+				      &mm_walk_ops, false, walk);
 
 		mmap_read_unlock(mm);
 	}
-- 
2.41.0.487.g6d72f3e995-goog
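
A usage sketch, not part of the patch: how a caller is expected to use the
reworked API, assuming this series is applied as posted. The walker below
(count_present_ops, count_present_pte and the count_present() helper) is
invented for illustration; only the extra bool lock_vma argument to
walk_page_range() comes from this patch.

#include <linux/mm.h>
#include <linux/pagewalk.h>

static int count_present_pte(pte_t *pte, unsigned long addr,
			     unsigned long next, struct mm_walk *walk)
{
	unsigned long *count = walk->private;

	if (pte_present(ptep_get(pte)))
		(*count)++;
	return 0;
}

static const struct mm_walk_ops count_present_ops = {
	.pte_entry	= count_present_pte,
};

static unsigned long count_present(struct mm_struct *mm,
				   unsigned long start, unsigned long end)
{
	unsigned long count = 0;

	/*
	 * A walk done under a write-locked mmap_lock passes lock_vma=true,
	 * so every vma visited is write-locked before the callbacks run and
	 * page faults handled under the per-VMA lock stay out of the range.
	 */
	mmap_write_lock(mm);
	walk_page_range(mm, start, end, &count_present_ops, true, &count);
	mmap_write_unlock(mm);

	/*
	 * A walk that only holds mmap_lock for read must pass
	 * lock_vma=false, since vmas cannot be write-locked in that case.
	 */
	return count;
}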

From: Suren Baghdasaryan
Date: Mon, 31 Jul 2023 10:12:28 -0700
Message-ID: <20230731171233.1098105-3-surenb@google.com>
In-Reply-To: <20230731171233.1098105-1-surenb@google.com>
Subject: [PATCH 2/6] mm: for !CONFIG_PER_VMA_LOCK equate write lock assertion for vma and mmap

When CONFIG_PER_VMA_LOCK=n, vma_assert_write_locked() should be
equivalent to mmap_assert_write_locked().

Suggested-by: Jann Horn
Signed-off-by: Suren Baghdasaryan
---
 include/linux/mm.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 406ab9ea818f..262b5f44101d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -750,7 +750,8 @@ static inline void vma_end_read(struct vm_area_struct *vma) {}
 static inline void vma_start_write(struct vm_area_struct *vma) {}
 static inline bool vma_try_start_write(struct vm_area_struct *vma)
 		{ return true; }
-static inline void vma_assert_write_locked(struct vm_area_struct *vma) {}
+static inline void vma_assert_write_locked(struct vm_area_struct *vma)
+		{ mmap_assert_write_locked(vma->vm_mm); }
 static inline void vma_mark_detached(struct vm_area_struct *vma,
 				     bool detached) {}
 
-- 
2.41.0.487.g6d72f3e995-goog
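
A minimal sketch of why the fallback matters, assuming this patch is
applied; set_vma_pgoff_locked() is an invented helper, only the assertion
itself comes from include/linux/mm.h.

#include <linux/mm.h>

/* Invented helper, used only to show the locking contract being asserted. */
static void set_vma_pgoff_locked(struct vm_area_struct *vma, pgoff_t pgoff)
{
	/*
	 * With CONFIG_PER_VMA_LOCK=y this checks the per-VMA write lock;
	 * with CONFIG_PER_VMA_LOCK=n it now degenerates to
	 * mmap_assert_write_locked(vma->vm_mm) instead of an empty stub,
	 * so completely unlocked callers are still caught.
	 */
	vma_assert_write_locked(vma);
	vma->vm_pgoff = pgoff;
}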

From: Suren Baghdasaryan
Date: Mon, 31 Jul 2023 10:12:29 -0700
Message-ID: <20230731171233.1098105-4-surenb@google.com>
In-Reply-To: <20230731171233.1098105-1-surenb@google.com>
Subject: [PATCH 3/6] mm: replace mmap with vma write lock assertions when operating on a vma

A vma write lock assertion always includes the mmap write lock assertion,
plus additional vma lock checks when per-VMA locks are enabled. Replace
the weaker mmap_assert_write_locked() assertions with the stronger
vma_assert_write_locked() ones when we are operating on a vma which is
expected to be locked.
Suggested-by: Jann Horn Signed-off-by: Suren Baghdasaryan --- mm/hugetlb.c | 2 +- mm/khugepaged.c | 5 +++-- mm/memory.c | 2 +- 3 files changed, 5 insertions(+), 4 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 64a3239b6407..1d871a1167d8 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -5028,7 +5028,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, st= ruct mm_struct *src, src_vma->vm_start, src_vma->vm_end); mmu_notifier_invalidate_range_start(&range); - mmap_assert_write_locked(src); + vma_assert_write_locked(src_vma); raw_write_seqcount_begin(&src->write_protect_seq); } else { /* diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 78c8d5d8b628..1e43a56fba31 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -1495,7 +1495,7 @@ static int set_huge_pmd(struct vm_area_struct *vma, u= nsigned long addr, }; =20 VM_BUG_ON(!PageTransHuge(hpage)); - mmap_assert_write_locked(vma->vm_mm); + vma_assert_write_locked(vma); =20 if (do_set_pmd(&vmf, hpage)) return SCAN_FAIL; @@ -1525,7 +1525,7 @@ static void collapse_and_free_pmd(struct mm_struct *m= m, struct vm_area_struct *v pmd_t pmd; struct mmu_notifier_range range; =20 - mmap_assert_write_locked(mm); + vma_assert_write_locked(vma); if (vma->vm_file) lockdep_assert_held_write(&vma->vm_file->f_mapping->i_mmap_rwsem); /* @@ -1570,6 +1570,7 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, uns= igned long addr, int count =3D 0, result =3D SCAN_FAIL; int i; =20 + /* Ensure vma can't change, it will be locked below after checks */ mmap_assert_write_locked(mm); =20 /* Fast check before locking page if already PMD-mapped */ diff --git a/mm/memory.c b/mm/memory.c index 603b2f419948..652d99b9858a 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1312,7 +1312,7 @@ copy_page_range(struct vm_area_struct *dst_vma, struc= t vm_area_struct *src_vma) * Use the raw variant of the seqcount_t write API to avoid * lockdep complaining about preemptibility. 
 	 */
-	mmap_assert_write_locked(src_mm);
+	vma_assert_write_locked(src_vma);
 	raw_write_seqcount_begin(&src_mm->write_protect_seq);
 	}
 
-- 
2.41.0.487.g6d72f3e995-goog
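
A sketch of the difference between the two assertions, following the
description in this patch; update_one_vma() is an invented function.

#include <linux/mm.h>

static void update_one_vma(struct vm_area_struct *vma)
{
	/*
	 * Weaker: only proves that somebody holds mmap_lock for write; it
	 * says nothing about this particular vma having been write-locked
	 * with vma_start_write().
	 */
	mmap_assert_write_locked(vma->vm_mm);

	/*
	 * Stronger: per the commit message, this always includes the mmap
	 * write lock assertion and, with CONFIG_PER_VMA_LOCK=y, also checks
	 * the per-VMA write lock, so page faults handled under the per-VMA
	 * lock cannot race with the modification below.
	 */
	vma_assert_write_locked(vma);

	/* ... modify *vma here ... */
}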

From: Suren Baghdasaryan
Date: Mon, 31 Jul 2023 10:12:30 -0700
Message-ID: <20230731171233.1098105-5-surenb@google.com>
In-Reply-To: <20230731171233.1098105-1-surenb@google.com>
Subject: [PATCH 4/6] mm: lock vma explicitly before doing vm_flags_reset and vm_flags_reset_once

Implicit vma locking inside vm_flags_reset() and vm_flags_reset_once() is
not obvious and makes it hard to understand where vma locking is happening.
Also, in some cases (like in dup_userfaultfd()) the vma should be locked
earlier than the vma_flags modification. To make locking more visible,
change these functions to assert that the vma write lock is taken and
explicitly lock the vma beforehand. Fix the userfaultfd functions which
should lock the vma earlier.

Suggested-by: Linus Torvalds
Signed-off-by: Suren Baghdasaryan
---
 arch/powerpc/kvm/book3s_hv_uvmem.c    |  1 +
 drivers/infiniband/hw/hfi1/file_ops.c |  1 +
 fs/userfaultfd.c                      |  6 ++++++
 include/linux/mm.h                    | 10 +++++++---
 mm/madvise.c                          |  5 ++---
 mm/mlock.c                            |  3 ++-
 mm/mprotect.c                         |  1 +
 7 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
index 709ebd578394..e2d6f9327f77 100644
--- a/arch/powerpc/kvm/book3s_hv_uvmem.c
+++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
@@ -410,6 +410,7 @@ static int kvmppc_memslot_page_merge(struct kvm *kvm,
 			ret = H_STATE;
 			break;
 		}
+		vma_start_write(vma);
 		/* Copy vm_flags to avoid partial modifications in ksm_madvise */
 		vm_flags = vma->vm_flags;
 		ret = ksm_madvise(vma, vma->vm_start, vma->vm_end,
diff --git a/drivers/infiniband/hw/hfi1/file_ops.c b/drivers/infiniband/hw/hfi1/file_ops.c
index a5ab22cedd41..5920bfc1e1c5 100644
--- a/drivers/infiniband/hw/hfi1/file_ops.c
+++ b/drivers/infiniband/hw/hfi1/file_ops.c
@@ -344,6 +344,7 @@ static int hfi1_file_mmap(struct file *fp, struct vm_area_struct *vma)
 		goto done;
 	}
 
+	vma_start_write(vma);
 	/*
 	 * vm_pgoff is used as a buffer selector cookie.  Always mmap from
 	 * the beginning.
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 7cecd49e078b..6cde95533dcd 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -667,6 +667,7 @@ static void userfaultfd_event_wait_completion(struct us= erfaultfd_ctx *ctx, mmap_write_lock(mm); for_each_vma(vmi, vma) { if (vma->vm_userfaultfd_ctx.ctx =3D=3D release_new_ctx) { + vma_start_write(vma); vma->vm_userfaultfd_ctx =3D NULL_VM_UFFD_CTX; userfaultfd_set_vm_flags(vma, vma->vm_flags & ~__VM_UFFD_FLAGS); @@ -702,6 +703,7 @@ int dup_userfaultfd(struct vm_area_struct *vma, struct = list_head *fcs) =20 octx =3D vma->vm_userfaultfd_ctx.ctx; if (!octx || !(octx->features & UFFD_FEATURE_EVENT_FORK)) { + vma_start_write(vma); vma->vm_userfaultfd_ctx =3D NULL_VM_UFFD_CTX; userfaultfd_set_vm_flags(vma, vma->vm_flags & ~__VM_UFFD_FLAGS); return 0; @@ -783,6 +785,7 @@ void mremap_userfaultfd_prep(struct vm_area_struct *vma, atomic_inc(&ctx->mmap_changing); } else { /* Drop uffd context if remap feature not enabled */ + vma_start_write(vma); vma->vm_userfaultfd_ctx =3D NULL_VM_UFFD_CTX; userfaultfd_set_vm_flags(vma, vma->vm_flags & ~__VM_UFFD_FLAGS); } @@ -940,6 +943,7 @@ static int userfaultfd_release(struct inode *inode, str= uct file *file) prev =3D vma; } =20 + vma_start_write(vma); userfaultfd_set_vm_flags(vma, new_flags); vma->vm_userfaultfd_ctx =3D NULL_VM_UFFD_CTX; } @@ -1502,6 +1506,7 @@ static int userfaultfd_register(struct userfaultfd_ct= x *ctx, * the next vma was merged into the current one and * the current one has not been updated yet. */ + vma_start_write(vma); userfaultfd_set_vm_flags(vma, new_flags); vma->vm_userfaultfd_ctx.ctx =3D ctx; =20 @@ -1685,6 +1690,7 @@ static int userfaultfd_unregister(struct userfaultfd_= ctx *ctx, * the next vma was merged into the current one and * the current one has not been updated yet. */ + vma_start_write(vma); userfaultfd_set_vm_flags(vma, new_flags); vma->vm_userfaultfd_ctx =3D NULL_VM_UFFD_CTX; =20 diff --git a/include/linux/mm.h b/include/linux/mm.h index 262b5f44101d..2c720c9bb1ae 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -780,18 +780,22 @@ static inline void vm_flags_init(struct vm_area_struc= t *vma, ACCESS_PRIVATE(vma, __vm_flags) =3D flags; } =20 -/* Use when VMA is part of the VMA tree and modifications need coordinatio= n */ +/* + * Use when VMA is part of the VMA tree and modifications need coordination + * Note: vm_flags_reset and vm_flags_reset_once do not lock the vma and + * it should be locked explicitly beforehand. + */ static inline void vm_flags_reset(struct vm_area_struct *vma, vm_flags_t flags) { - vma_start_write(vma); + vma_assert_write_locked(vma); vm_flags_init(vma, flags); } =20 static inline void vm_flags_reset_once(struct vm_area_struct *vma, vm_flags_t flags) { - vma_start_write(vma); + vma_assert_write_locked(vma); WRITE_ONCE(ACCESS_PRIVATE(vma, __vm_flags), flags); } =20 diff --git a/mm/madvise.c b/mm/madvise.c index 0e484111a1d2..54628f4ca217 100644 --- a/mm/madvise.c +++ b/mm/madvise.c @@ -173,9 +173,8 @@ static int madvise_update_vma(struct vm_area_struct *vm= a, } =20 success: - /* - * vm_flags is protected by the mmap_lock held in write mode. - */ + /* vm_flags is protected by the mmap_lock held in write mode. 
 */
+	vma_start_write(vma);
 	vm_flags_reset(vma, new_flags);
 	if (!vma->vm_file || vma_is_anon_shmem(vma)) {
 		error = replace_anon_vma_name(vma, anon_name);
diff --git a/mm/mlock.c b/mm/mlock.c
index 3634de0b28e3..f0f5125188ba 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -386,6 +386,7 @@ static void mlock_vma_pages_range(struct vm_area_struct *vma,
 	 */
 	if (newflags & VM_LOCKED)
 		newflags |= VM_IO;
+	vma_start_write(vma);
 	vm_flags_reset_once(vma, newflags);
 
 	lru_add_drain();
@@ -460,9 +461,9 @@ static int mlock_fixup(struct vma_iterator *vmi, struct vm_area_struct *vma,
 	 * It's okay if try_to_unmap_one unmaps a page just after we
 	 * set VM_LOCKED, populate_vma_page_range will bring it back.
 	 */
-
 	if ((newflags & VM_LOCKED) && (oldflags & VM_LOCKED)) {
 		/* No work to do, and mlocking twice would be wrong */
+		vma_start_write(vma);
 		vm_flags_reset(vma, newflags);
 	} else {
 		mlock_vma_pages_range(vma, start, end, newflags);
diff --git a/mm/mprotect.c b/mm/mprotect.c
index f781f709c39d..0eab019914db 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -656,6 +656,7 @@ mprotect_fixup(struct vma_iterator *vmi, struct mmu_gather *tlb,
 	 * vm_flags and vm_page_prot are protected by the mmap_lock
 	 * held in write mode.
 	 */
+	vma_start_write(vma);
 	vm_flags_reset(vma, newflags);
 	if (vma_wants_manual_pte_write_upgrade(vma))
 		mm_cp_flags |= MM_CP_TRY_CHANGE_WRITABLE;
-- 
2.41.0.487.g6d72f3e995-goog
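
A sketch of the new calling convention for vm_flags_reset() after this
patch; clear_vma_flag() is an invented caller, not part of the series.

#include <linux/mm.h>

/* Invented caller; mmap_lock must already be held for write. */
static void clear_vma_flag(struct vm_area_struct *vma, vm_flags_t flag)
{
	/* The vma lock is now taken explicitly and visibly by the caller ... */
	vma_start_write(vma);
	/* ... and vm_flags_reset() merely asserts it instead of locking. */
	vm_flags_reset(vma, vma->vm_flags & ~flag);
}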

From: Suren Baghdasaryan
Date: Mon, 31 Jul 2023 10:12:31 -0700
Message-ID: <20230731171233.1098105-6-surenb@google.com>
In-Reply-To: <20230731171233.1098105-1-surenb@google.com>
Subject: [PATCH 5/6] mm: always lock new vma before inserting into vma tree

While it is not strictly necessary to lock a newly created vma before
adding it into the vma tree (as long as no further changes are performed
to it), it seems like a good policy to lock it and prevent accidental
changes after it becomes visible to page faults. Lock the vma before
adding it into the vma tree.
Suggested-by: Jann Horn
Signed-off-by: Suren Baghdasaryan
---
 mm/mmap.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 3937479d0e07..850a39dee075 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -412,6 +412,8 @@ static int vma_link(struct mm_struct *mm, struct vm_area_struct *vma)
 	if (vma_iter_prealloc(&vmi))
 		return -ENOMEM;
 
+	vma_start_write(vma);
+
 	if (vma->vm_file) {
 		mapping = vma->vm_file->f_mapping;
 		i_mmap_lock_write(mapping);
@@ -477,7 +479,8 @@ static inline void vma_prepare(struct vma_prepare *vp)
 	vma_start_write(vp->vma);
 	if (vp->adj_next)
 		vma_start_write(vp->adj_next);
-	/* vp->insert is always a newly created VMA, no need for locking */
+	if (vp->insert)
+		vma_start_write(vp->insert);
 	if (vp->remove)
 		vma_start_write(vp->remove);
 	if (vp->remove2)
@@ -3098,6 +3101,7 @@ static int do_brk_flags(struct vma_iterator *vmi, struct vm_area_struct *vma,
 	vma->vm_pgoff = addr >> PAGE_SHIFT;
 	vm_flags_init(vma, flags);
 	vma->vm_page_prot = vm_get_page_prot(flags);
+	vma_start_write(vma);
 	if (vma_iter_store_gfp(vmi, vma, GFP_KERNEL))
 		goto mas_store_fail;
 
@@ -3345,7 +3349,6 @@ struct vm_area_struct *copy_vma(struct vm_area_struct **vmap,
 		get_file(new_vma->vm_file);
 		if (new_vma->vm_ops && new_vma->vm_ops->open)
 			new_vma->vm_ops->open(new_vma);
-		vma_start_write(new_vma);
 		if (vma_link(mm, new_vma))
 			goto out_vma_link;
 		*need_rmap_locks = false;
-- 
2.41.0.487.g6d72f3e995-goog
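
A sketch of the insertion ordering this patch enforces, modeled on the
do_brk_flags() hunk above. insert_new_vma() is an invented wrapper, and
vma_iter_store_gfp() is internal to mm/mmap.c, so this only illustrates
the ordering, not a usable out-of-tree API.

#include <linux/mm.h>

static int insert_new_vma(struct vma_iterator *vmi, struct vm_area_struct *vma)
{
	/*
	 * Write-lock the vma while it is still private to this thread, so
	 * that from the moment it becomes visible in the vma tree, page
	 * faults cannot handle it under the per-VMA lock until the current
	 * mmap_lock write section finishes.
	 */
	vma_start_write(vma);
	return vma_iter_store_gfp(vmi, vma, GFP_KERNEL);
}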

From: Suren Baghdasaryan
Date: Mon, 31 Jul 2023 10:12:32 -0700
Message-ID: <20230731171233.1098105-7-surenb@google.com>
In-Reply-To: <20230731171233.1098105-1-surenb@google.com>
Subject: [PATCH 6/6] mm: move vma locking out of vma_prepare

vma_prepare() is currently the central place where vmas are being locked
before vma_complete() applies changes to them. While this is convenient,
it also obscures vma locking and makes it hard to follow the locking rules.
Move vma locking out of vma_prepare() and take vma locks explicitly at the
locations where vmas are being modified.

Suggested-by: Linus Torvalds
Signed-off-by: Suren Baghdasaryan
Reviewed-by: Liam R. Howlett
---
 mm/mmap.c | 26 ++++++++++++++++----------
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 850a39dee075..e59d83cb1d7a 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -476,16 +476,6 @@ static inline void init_vma_prep(struct vma_prepare *vp,
  */
 static inline void vma_prepare(struct vma_prepare *vp)
 {
-	vma_start_write(vp->vma);
-	if (vp->adj_next)
-		vma_start_write(vp->adj_next);
-	if (vp->insert)
-		vma_start_write(vp->insert);
-	if (vp->remove)
-		vma_start_write(vp->remove);
-	if (vp->remove2)
-		vma_start_write(vp->remove2);
-
 	if (vp->file) {
 		uprobe_munmap(vp->vma, vp->vma->vm_start, vp->vma->vm_end);
 
@@ -650,6 +640,7 @@ int vma_expand(struct vma_iterator *vmi, struct vm_area_struct *vma,
 	bool remove_next = false;
 	struct vma_prepare vp;
 
+	vma_start_write(vma);
 	if (next && (vma != next) && (end == next->vm_end)) {
 		int ret;
 
@@ -657,6 +648,7 @@ int vma_expand(struct vma_iterator *vmi, struct vm_area_struct *vma,
 		ret = dup_anon_vma(vma, next);
 		if (ret)
 			return ret;
+		vma_start_write(next);
 	}
 
 	init_multi_vma_prep(&vp, vma, NULL, remove_next ?
next : NULL, NULL); @@ -708,6 +700,8 @@ int vma_shrink(struct vma_iterator *vmi, struct vm_area= _struct *vma, if (vma_iter_prealloc(vmi)) return -ENOMEM; =20 + vma_start_write(vma); + init_vma_prep(&vp, vma); vma_prepare(&vp); vma_adjust_trans_huge(vma, start, end, 0); @@ -946,10 +940,12 @@ struct vm_area_struct *vma_merge(struct vma_iterator = *vmi, struct mm_struct *mm, /* Can we merge both the predecessor and the successor? */ if (merge_prev && merge_next && is_mergeable_anon_vma(prev->anon_vma, next->anon_vma, NULL)) { + vma_start_write(next); remove =3D next; /* case 1 */ vma_end =3D next->vm_end; err =3D dup_anon_vma(prev, next); if (curr) { /* case 6 */ + vma_start_write(curr); remove =3D curr; remove2 =3D next; if (!next->anon_vma) @@ -958,6 +954,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *v= mi, struct mm_struct *mm, } else if (merge_prev) { /* case 2 */ if (curr) { err =3D dup_anon_vma(prev, curr); + vma_start_write(curr); if (end =3D=3D curr->vm_end) { /* case 7 */ remove =3D curr; } else { /* case 5 */ @@ -969,6 +966,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *v= mi, struct mm_struct *mm, res =3D next; if (prev && addr < prev->vm_end) { /* case 4 */ vma_end =3D addr; + vma_start_write(next); adjust =3D next; adj_start =3D -(prev->vm_end - addr); err =3D dup_anon_vma(next, prev); @@ -983,6 +981,7 @@ struct vm_area_struct *vma_merge(struct vma_iterator *v= mi, struct mm_struct *mm, vma_pgoff =3D next->vm_pgoff - pglen; if (curr) { /* case 8 */ vma_pgoff =3D curr->vm_pgoff; + vma_start_write(curr); remove =3D curr; err =3D dup_anon_vma(next, curr); } @@ -996,6 +995,8 @@ struct vm_area_struct *vma_merge(struct vma_iterator *v= mi, struct mm_struct *mm, if (vma_iter_prealloc(vmi)) return NULL; =20 + vma_start_write(vma); + init_multi_vma_prep(&vp, vma, adjust, remove, remove2); VM_WARN_ON(vp.anon_vma && adjust && adjust->anon_vma && vp.anon_vma !=3D adjust->anon_vma); @@ -2373,6 +2374,9 @@ int __split_vma(struct vma_iterator *vmi, struct vm_a= rea_struct *vma, if (new->vm_ops && new->vm_ops->open) new->vm_ops->open(new); =20 + vma_start_write(vma); + vma_start_write(new); + init_vma_prep(&vp, vma); vp.insert =3D new; vma_prepare(&vp); @@ -3078,6 +3082,8 @@ static int do_brk_flags(struct vma_iterator *vmi, str= uct vm_area_struct *vma, if (vma_iter_prealloc(vmi)) goto unacct_fail; =20 + vma_start_write(vma); + init_vma_prep(&vp, vma); vma_prepare(&vp); vma_adjust_trans_huge(vma, vma->vm_start, addr + len, 0); --=20 2.41.0.487.g6d72f3e995-goog
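
A sketch of the resulting pattern as it would sit inside mm/mmap.c, based
on the hunks above; modify_one_vma() is invented, while vma_start_write(),
vma_prepare() and vma_complete() are the file-local helpers it refers to.

static void modify_one_vma(struct vma_prepare *vp, struct vm_area_struct *vma)
{
	/*
	 * After this patch the caller write-locks exactly the vmas it is
	 * about to change, at the place where that decision is made ...
	 */
	vma_start_write(vma);

	/* ... and vma_prepare() no longer takes any vma locks implicitly. */
	vma_prepare(vp);

	/* ... apply the modifications, then vma_complete() publishes them. */
}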