Since the introduction in 9a10064f5625 ("mm: add a field to store names
for private anonymous memory") the code to set anon_name on a vma has
been using madvise_update_vma() to call replace_vma_anon_name(). Since
the former is called also by a number of other madvise behaviours that
do not set a new anon_name, they have been passing the existing
anon_name of the vma to make replace_vma_anon_name() a no-op.
This is rather wasteful as it needs anon_vma_name_eq() to determine the
no-op situations, and checks for when replace_vma_anon_name() is allowed
(the vma is anon/shmem) duplicate the checks already done earlier in
madvise_vma_behavior(). It has also lead to commit 942341dcc574 ("mm:
fix use-after-free when anon vma name is used after vma is freed")
adding anon_name refcount get/put operations exactly to the cases that
actually do not change anon_name - just so the replace_vma_anon_name()
can keep safely determining it has nothing to do.
The recent madvise cleanups made this suboptimal handling very obvious,
but happily also allow for an easy fix. madvise_update_vma() now has the
complete information whether it's been called to set a new anon_name, so
stop passing it the existing vma's name and doing the refcount get/put
in its only caller madvise_vma_behavior().
In madvise_update_vma() itself, limit calling of replace_anon_vma_name()
only to cases where we are setting a new name, otherwise we know it's a
no-op. We can rely solely on the __MADV_SET_ANON_VMA_NAME behaviour and
can remove the duplicate checks for vma being anon/shmem that were done
already in madvise_vma_behavior().
The remaining reason to obtain the vma's existing anon_name is to pass
it to vma_modify_flags_name() for the splitting and merging to work
properly. In case of merging, the vma might be freed along with the
anon_name, but madvise_update_vma() will not access it afterwards so the
UAF previously fixed by commit 942341dcc574 is not reintroduced.
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
mm/madvise.c | 37 +++++++++++++------------------------
1 file changed, 13 insertions(+), 24 deletions(-)
diff --git a/mm/madvise.c b/mm/madvise.c
index 4491bf080f55d6d1aeffb2ff0b8fdd28904af950..ae29395b4fc7f65a449c5772b1901a90f4195885 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -176,21 +176,25 @@ static int replace_anon_vma_name(struct vm_area_struct *vma,
}
#endif /* CONFIG_ANON_VMA_NAME */
/*
- * Update the vm_flags on region of a vma, splitting it or merging it as
- * necessary. Must be called with mmap_lock held for writing;
- * Caller should ensure anon_name stability by raising its refcount even when
- * anon_name belongs to a valid vma because this function might free that vma.
+ * Update the vm_flags and/or anon_name on region of a vma, splitting it or
+ * merging it as necessary. Must be called with mmap_lock held for writing.
*/
static int madvise_update_vma(vm_flags_t new_flags,
struct madvise_behavior *madv_behavior)
{
- int error;
struct vm_area_struct *vma = madv_behavior->vma;
struct madvise_behavior_range *range = &madv_behavior->range;
- struct anon_vma_name *anon_name = madv_behavior->anon_name;
+ bool set_new_anon_name = madv_behavior->behavior == __MADV_SET_ANON_VMA_NAME;
+ struct anon_vma_name *anon_name;
VMA_ITERATOR(vmi, madv_behavior->mm, range->start);
- if (new_flags == vma->vm_flags && anon_vma_name_eq(anon_vma_name(vma), anon_name))
+ if (set_new_anon_name)
+ anon_name = madv_behavior->anon_name;
+ else
+ anon_name = anon_vma_name(vma);
+
+ if (new_flags == vma->vm_flags && (!set_new_anon_name
+ || anon_vma_name_eq(anon_vma_name(vma), anon_name)))
return 0;
vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma,
@@ -203,11 +207,8 @@ static int madvise_update_vma(vm_flags_t new_flags,
/* vm_flags is protected by the mmap_lock held in write mode. */
vma_start_write(vma);
vm_flags_reset(vma, new_flags);
- if (!vma->vm_file || vma_is_anon_shmem(vma)) {
- error = replace_anon_vma_name(vma, anon_name);
- if (error)
- return error;
- }
+ if (set_new_anon_name)
+ return replace_anon_vma_name(vma, anon_name);
return 0;
}
@@ -1313,7 +1314,6 @@ static int madvise_vma_behavior(struct madvise_behavior *madv_behavior)
int behavior = madv_behavior->behavior;
struct vm_area_struct *vma = madv_behavior->vma;
vm_flags_t new_flags = vma->vm_flags;
- bool set_new_anon_name = behavior == __MADV_SET_ANON_VMA_NAME;
struct madvise_behavior_range *range = &madv_behavior->range;
int error;
@@ -1403,18 +1403,7 @@ static int madvise_vma_behavior(struct madvise_behavior *madv_behavior)
/* This is a write operation.*/
VM_WARN_ON_ONCE(madv_behavior->lock_mode != MADVISE_MMAP_WRITE_LOCK);
- /*
- * madvise_update_vma() might cause a VMA merge which could put an
- * anon_vma_name, so we must hold an additional reference on the
- * anon_vma_name so it doesn't disappear from under us.
- */
- if (!set_new_anon_name) {
- madv_behavior->anon_name = anon_vma_name(vma);
- anon_vma_name_get(madv_behavior->anon_name);
- }
error = madvise_update_vma(new_flags, madv_behavior);
- if (!set_new_anon_name)
- anon_vma_name_put(madv_behavior->anon_name);
out:
/*
* madvise() returns EAGAIN if kernel resources, such as
--
2.50.0
On Mon, Jun 23, 2025 at 04:59:50PM +0200, Vlastimil Babka wrote: > Since the introduction in 9a10064f5625 ("mm: add a field to store names > for private anonymous memory") the code to set anon_name on a vma has > been using madvise_update_vma() to call replace_vma_anon_name(). Since > the former is called also by a number of other madvise behaviours that > do not set a new anon_name, they have been passing the existing > anon_name of the vma to make replace_vma_anon_name() a no-op. > > This is rather wasteful as it needs anon_vma_name_eq() to determine the > no-op situations, and checks for when replace_vma_anon_name() is allowed > (the vma is anon/shmem) duplicate the checks already done earlier in > madvise_vma_behavior(). It has also lead to commit 942341dcc574 ("mm: > fix use-after-free when anon vma name is used after vma is freed") > adding anon_name refcount get/put operations exactly to the cases that > actually do not change anon_name - just so the replace_vma_anon_name() > can keep safely determining it has nothing to do. > > The recent madvise cleanups made this suboptimal handling very obvious, > but happily also allow for an easy fix. madvise_update_vma() now has the > complete information whether it's been called to set a new anon_name, so > stop passing it the existing vma's name and doing the refcount get/put > in its only caller madvise_vma_behavior(). > > In madvise_update_vma() itself, limit calling of replace_anon_vma_name() > only to cases where we are setting a new name, otherwise we know it's a > no-op. We can rely solely on the __MADV_SET_ANON_VMA_NAME behaviour and > can remove the duplicate checks for vma being anon/shmem that were done > already in madvise_vma_behavior(). > > The remaining reason to obtain the vma's existing anon_name is to pass > it to vma_modify_flags_name() for the splitting and merging to work > properly. In case of merging, the vma might be freed along with the > anon_name, but madvise_update_vma() will not access it afterwards so the > UAF previously fixed by commit 942341dcc574 is not reintroduced. > > Signed-off-by: Vlastimil Babka <vbabka@suse.cz> I very much love what you're doing here, but wonder if we could further simplify even, see below... > --- > mm/madvise.c | 37 +++++++++++++------------------------ > 1 file changed, 13 insertions(+), 24 deletions(-) > > diff --git a/mm/madvise.c b/mm/madvise.c > index 4491bf080f55d6d1aeffb2ff0b8fdd28904af950..ae29395b4fc7f65a449c5772b1901a90f4195885 100644 > --- a/mm/madvise.c > +++ b/mm/madvise.c > @@ -176,21 +176,25 @@ static int replace_anon_vma_name(struct vm_area_struct *vma, > } > #endif /* CONFIG_ANON_VMA_NAME */ > /* > - * Update the vm_flags on region of a vma, splitting it or merging it as > - * necessary. Must be called with mmap_lock held for writing; > - * Caller should ensure anon_name stability by raising its refcount even when > - * anon_name belongs to a valid vma because this function might free that vma. > + * Update the vm_flags and/or anon_name on region of a vma, splitting it or > + * merging it as necessary. Must be called with mmap_lock held for writing. > */ > static int madvise_update_vma(vm_flags_t new_flags, > struct madvise_behavior *madv_behavior) > { > - int error; > struct vm_area_struct *vma = madv_behavior->vma; > struct madvise_behavior_range *range = &madv_behavior->range; > - struct anon_vma_name *anon_name = madv_behavior->anon_name; > + bool set_new_anon_name = madv_behavior->behavior == __MADV_SET_ANON_VMA_NAME; > + struct anon_vma_name *anon_name; > VMA_ITERATOR(vmi, madv_behavior->mm, range->start); > > - if (new_flags == vma->vm_flags && anon_vma_name_eq(anon_vma_name(vma), anon_name)) > + if (set_new_anon_name) > + anon_name = madv_behavior->anon_name; > + else > + anon_name = anon_vma_name(vma); I think we can actually avoid this altogether... So we could separate this into two functions: tatic int madvise_update_vma_anon_name(struct madvise_behavior *madv_behavior) { struct vm_area_struct *vma = madv_behavior->vma; VMA_ITERATOR(vmi, madv_behavior->mm, range->start); struct madvise_behavior_range *range = &madv_behavior->range; struct anon_vma_name *anon_name = madv_behavior->anon_name; if (anon_vma_name_eq(anon_vma_name(vma), anon_name)) rturn 0; vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma, range->start, range->end, vma->vm_flags, anon_name); if (IS_ERR(vma)) return PTR_ERR(vma); madv_behavior->vma = vma; /* vm_flags is protected by the mmap_lock held in write mode. */ vma_start_write(vma); return replace_anon_vma_name(vma, anon_name); } /* * Update the vm_flags and/or anon_name on region of a vma, splitting it or * merging it as necessary. Must be called with mmap_lock held for writing. */ static int madvise_update_vma(vm_flags_t new_flags, struct madvise_behavior *madv_behavior) { struct vm_area_struct *vma = madv_behavior->vma; struct madvise_behavior_range *range = &madv_behavior->range; VMA_ITERATOR(vmi, madv_behavior->mm, range->start); if (new_flags == vma->vm_flags) return 0; vma = vma_modify_flags(&vmi, madv_behavior->prev, vma, range->start, range->end, new_flags); if (IS_ERR(vma)) return PTR_ERR(vma); madv_behavior->vma = vma; /* vm_flags is protected by the mmap_lock held in write mode. */ vma_start_write(vma); vm_flags_reset(vma, new_flags); return 0; } And avoid all the anon_vma_name stuff for the flags change altogether. This should reduce confusion at the cost of some small duplication. Note that we can then put the anon vma name version in an #ifdef CONFIG_ANON_VMA_NAME block also. > + > + if (new_flags == vma->vm_flags && (!set_new_anon_name > + || anon_vma_name_eq(anon_vma_name(vma), anon_name))) > return 0; > > vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma, So maybe just replace this with: if (set_new_anon_name > @@ -203,11 +207,8 @@ static int madvise_update_vma(vm_flags_t new_flags, > /* vm_flags is protected by the mmap_lock held in write mode. */ > vma_start_write(vma); > vm_flags_reset(vma, new_flags); > - if (!vma->vm_file || vma_is_anon_shmem(vma)) { > - error = replace_anon_vma_name(vma, anon_name); > - if (error) > - return error; > - } > + if (set_new_anon_name) > + return replace_anon_vma_name(vma, anon_name); > > return 0; > } > @@ -1313,7 +1314,6 @@ static int madvise_vma_behavior(struct madvise_behavior *madv_behavior) > int behavior = madv_behavior->behavior; > struct vm_area_struct *vma = madv_behavior->vma; > vm_flags_t new_flags = vma->vm_flags; > - bool set_new_anon_name = behavior == __MADV_SET_ANON_VMA_NAME; > struct madvise_behavior_range *range = &madv_behavior->range; > int error; > > @@ -1403,18 +1403,7 @@ static int madvise_vma_behavior(struct madvise_behavior *madv_behavior) > /* This is a write operation.*/ > VM_WARN_ON_ONCE(madv_behavior->lock_mode != MADVISE_MMAP_WRITE_LOCK); > > - /* > - * madvise_update_vma() might cause a VMA merge which could put an > - * anon_vma_name, so we must hold an additional reference on the > - * anon_vma_name so it doesn't disappear from under us. > - */ > - if (!set_new_anon_name) { > - madv_behavior->anon_name = anon_vma_name(vma); > - anon_vma_name_get(madv_behavior->anon_name); > - } > error = madvise_update_vma(new_flags, madv_behavior); > - if (!set_new_anon_name) > - anon_vma_name_put(madv_behavior->anon_name); Obviously I'll leave it to you as to how you'd update this to reflect that? :P > out: > /* > * madvise() returns EAGAIN if kernel resources, such as > > -- > 2.50.0 >
On 6/23/25 18:56, Lorenzo Stoakes wrote: > On Mon, Jun 23, 2025 at 04:59:50PM +0200, Vlastimil Babka wrote: > I think we can actually avoid this altogether... So we could separate this into two functions: > > tatic int madvise_update_vma_anon_name(struct madvise_behavior *madv_behavior) > { > struct vm_area_struct *vma = madv_behavior->vma; > VMA_ITERATOR(vmi, madv_behavior->mm, range->start); > struct madvise_behavior_range *range = &madv_behavior->range; > struct anon_vma_name *anon_name = madv_behavior->anon_name; > > if (anon_vma_name_eq(anon_vma_name(vma), anon_name)) > rturn 0; > > vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma, > range->start, range->end, vma->vm_flags, anon_name); > if (IS_ERR(vma)) > return PTR_ERR(vma); > > madv_behavior->vma = vma; > > /* vm_flags is protected by the mmap_lock held in write mode. */ > vma_start_write(vma); > return replace_anon_vma_name(vma, anon_name); > } > > /* > * Update the vm_flags and/or anon_name on region of a vma, splitting it or > * merging it as necessary. Must be called with mmap_lock held for writing. > */ > static int madvise_update_vma(vm_flags_t new_flags, > struct madvise_behavior *madv_behavior) > { > struct vm_area_struct *vma = madv_behavior->vma; > struct madvise_behavior_range *range = &madv_behavior->range; > VMA_ITERATOR(vmi, madv_behavior->mm, range->start); > > if (new_flags == vma->vm_flags) > return 0; > > vma = vma_modify_flags(&vmi, madv_behavior->prev, vma, > range->start, range->end, new_flags); Using vma_modify_flags() is a great suggestion to avoid passing the existing vma->anon_name explicitly, thanks! I believe I can do that without duplicating the whole madvise_update_vma() function and it doesn't look that bad so I'll try going that way in v2. This also addresses Suren's concerns as there will be no local variable pointing to the vma->anon_name that can become a UAF again by future changes so we shouldn't need the warning comments either. Thanks!
On Mon, Jun 23, 2025 at 8:00 AM Vlastimil Babka <vbabka@suse.cz> wrote: > > Since the introduction in 9a10064f5625 ("mm: add a field to store names > for private anonymous memory") the code to set anon_name on a vma has > been using madvise_update_vma() to call replace_vma_anon_name(). Since s/replace_vma_anon_name()/replace_anon_vma_name() > the former is called also by a number of other madvise behaviours that > do not set a new anon_name, they have been passing the existing > anon_name of the vma to make replace_vma_anon_name() a no-op. > > This is rather wasteful as it needs anon_vma_name_eq() to determine the > no-op situations, and checks for when replace_vma_anon_name() is allowed > (the vma is anon/shmem) duplicate the checks already done earlier in > madvise_vma_behavior(). It has also lead to commit 942341dcc574 ("mm: > fix use-after-free when anon vma name is used after vma is freed") > adding anon_name refcount get/put operations exactly to the cases that > actually do not change anon_name - just so the replace_vma_anon_name() > can keep safely determining it has nothing to do. > > The recent madvise cleanups made this suboptimal handling very obvious, > but happily also allow for an easy fix. madvise_update_vma() now has the > complete information whether it's been called to set a new anon_name, so > stop passing it the existing vma's name and doing the refcount get/put > in its only caller madvise_vma_behavior(). > > In madvise_update_vma() itself, limit calling of replace_anon_vma_name() > only to cases where we are setting a new name, otherwise we know it's a > no-op. We can rely solely on the __MADV_SET_ANON_VMA_NAME behaviour and > can remove the duplicate checks for vma being anon/shmem that were done > already in madvise_vma_behavior(). > > The remaining reason to obtain the vma's existing anon_name is to pass > it to vma_modify_flags_name() for the splitting and merging to work > properly. In case of merging, the vma might be freed along with the > anon_name, but madvise_update_vma() will not access it afterwards This is quite subtle. Can we add a comment in the code that anon_name might be freed as a result of vma merge after vma_modify_flags_name() gets called and anon_name should not be accessed afterwards? > so the > UAF previously fixed by commit 942341dcc574 is not reintroduced. > > Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Suren Baghdasaryan <surenb@google.com> > --- > mm/madvise.c | 37 +++++++++++++------------------------ > 1 file changed, 13 insertions(+), 24 deletions(-) > > diff --git a/mm/madvise.c b/mm/madvise.c > index 4491bf080f55d6d1aeffb2ff0b8fdd28904af950..ae29395b4fc7f65a449c5772b1901a90f4195885 100644 > --- a/mm/madvise.c > +++ b/mm/madvise.c > @@ -176,21 +176,25 @@ static int replace_anon_vma_name(struct vm_area_struct *vma, > } > #endif /* CONFIG_ANON_VMA_NAME */ > /* > - * Update the vm_flags on region of a vma, splitting it or merging it as > - * necessary. Must be called with mmap_lock held for writing; > - * Caller should ensure anon_name stability by raising its refcount even when > - * anon_name belongs to a valid vma because this function might free that vma. > + * Update the vm_flags and/or anon_name on region of a vma, splitting it or > + * merging it as necessary. Must be called with mmap_lock held for writing. > */ > static int madvise_update_vma(vm_flags_t new_flags, > struct madvise_behavior *madv_behavior) > { > - int error; > struct vm_area_struct *vma = madv_behavior->vma; > struct madvise_behavior_range *range = &madv_behavior->range; > - struct anon_vma_name *anon_name = madv_behavior->anon_name; > + bool set_new_anon_name = madv_behavior->behavior == __MADV_SET_ANON_VMA_NAME; > + struct anon_vma_name *anon_name; > VMA_ITERATOR(vmi, madv_behavior->mm, range->start); > > - if (new_flags == vma->vm_flags && anon_vma_name_eq(anon_vma_name(vma), anon_name)) > + if (set_new_anon_name) > + anon_name = madv_behavior->anon_name; > + else > + anon_name = anon_vma_name(vma); > + > + if (new_flags == vma->vm_flags && (!set_new_anon_name > + || anon_vma_name_eq(anon_vma_name(vma), anon_name))) > return 0; > > vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma, Maybe here we can add a comment, something like this: /* * vma->anon_name might be freed by vma_modify_flags_name() as a result of vma merge, * therefore accessing anon_name in the code below is unsafe if !set_new_anon_name. */ > @@ -203,11 +207,8 @@ static int madvise_update_vma(vm_flags_t new_flags, > /* vm_flags is protected by the mmap_lock held in write mode. */ > vma_start_write(vma); > vm_flags_reset(vma, new_flags); > - if (!vma->vm_file || vma_is_anon_shmem(vma)) { > - error = replace_anon_vma_name(vma, anon_name); > - if (error) > - return error; > - } > + if (set_new_anon_name) > + return replace_anon_vma_name(vma, anon_name); > > return 0; > } > @@ -1313,7 +1314,6 @@ static int madvise_vma_behavior(struct madvise_behavior *madv_behavior) > int behavior = madv_behavior->behavior; > struct vm_area_struct *vma = madv_behavior->vma; > vm_flags_t new_flags = vma->vm_flags; > - bool set_new_anon_name = behavior == __MADV_SET_ANON_VMA_NAME; > struct madvise_behavior_range *range = &madv_behavior->range; > int error; > > @@ -1403,18 +1403,7 @@ static int madvise_vma_behavior(struct madvise_behavior *madv_behavior) > /* This is a write operation.*/ > VM_WARN_ON_ONCE(madv_behavior->lock_mode != MADVISE_MMAP_WRITE_LOCK); > > - /* > - * madvise_update_vma() might cause a VMA merge which could put an > - * anon_vma_name, so we must hold an additional reference on the > - * anon_vma_name so it doesn't disappear from under us. > - */ > - if (!set_new_anon_name) { > - madv_behavior->anon_name = anon_vma_name(vma); > - anon_vma_name_get(madv_behavior->anon_name); > - } > error = madvise_update_vma(new_flags, madv_behavior); > - if (!set_new_anon_name) > - anon_vma_name_put(madv_behavior->anon_name); > out: > /* > * madvise() returns EAGAIN if kernel resources, such as > > -- > 2.50.0 >
* Suren Baghdasaryan <surenb@google.com> [250623 11:39]: > On Mon, Jun 23, 2025 at 8:00 AM Vlastimil Babka <vbabka@suse.cz> wrote: > > > > Since the introduction in 9a10064f5625 ("mm: add a field to store names > > for private anonymous memory") the code to set anon_name on a vma has > > been using madvise_update_vma() to call replace_vma_anon_name(). Since > > s/replace_vma_anon_name()/replace_anon_vma_name() > > > the former is called also by a number of other madvise behaviours that > > do not set a new anon_name, they have been passing the existing > > anon_name of the vma to make replace_vma_anon_name() a no-op. > > > > This is rather wasteful as it needs anon_vma_name_eq() to determine the > > no-op situations, and checks for when replace_vma_anon_name() is allowed > > (the vma is anon/shmem) duplicate the checks already done earlier in > > madvise_vma_behavior(). It has also lead to commit 942341dcc574 ("mm: > > fix use-after-free when anon vma name is used after vma is freed") > > adding anon_name refcount get/put operations exactly to the cases that > > actually do not change anon_name - just so the replace_vma_anon_name() > > can keep safely determining it has nothing to do. > > > > The recent madvise cleanups made this suboptimal handling very obvious, > > but happily also allow for an easy fix. madvise_update_vma() now has the > > complete information whether it's been called to set a new anon_name, so > > stop passing it the existing vma's name and doing the refcount get/put > > in its only caller madvise_vma_behavior(). > > > > In madvise_update_vma() itself, limit calling of replace_anon_vma_name() > > only to cases where we are setting a new name, otherwise we know it's a > > no-op. We can rely solely on the __MADV_SET_ANON_VMA_NAME behaviour and > > can remove the duplicate checks for vma being anon/shmem that were done > > already in madvise_vma_behavior(). > > > > The remaining reason to obtain the vma's existing anon_name is to pass > > it to vma_modify_flags_name() for the splitting and merging to work > > properly. In case of merging, the vma might be freed along with the > > anon_name, but madvise_update_vma() will not access it afterwards > > This is quite subtle. Can we add a comment in the code that anon_name > might be freed as a result of vma merge after vma_modify_flags_name() > gets called and anon_name should not be accessed afterwards? Surely that's not the common pattern since the anon vma name is ref counted? And it's probably the case for more than just the anon name? ... Thanks, Liam
On Mon, Jun 23, 2025 at 12:22:26PM -0400, Liam R. Howlett wrote: > * Suren Baghdasaryan <surenb@google.com> [250623 11:39]: > > On Mon, Jun 23, 2025 at 8:00 AM Vlastimil Babka <vbabka@suse.cz> wrote: > > > > > > Since the introduction in 9a10064f5625 ("mm: add a field to store names > > > for private anonymous memory") the code to set anon_name on a vma has > > > been using madvise_update_vma() to call replace_vma_anon_name(). Since > > > > s/replace_vma_anon_name()/replace_anon_vma_name() > > > > > the former is called also by a number of other madvise behaviours that > > > do not set a new anon_name, they have been passing the existing > > > anon_name of the vma to make replace_vma_anon_name() a no-op. > > > > > > This is rather wasteful as it needs anon_vma_name_eq() to determine the > > > no-op situations, and checks for when replace_vma_anon_name() is allowed > > > (the vma is anon/shmem) duplicate the checks already done earlier in > > > madvise_vma_behavior(). It has also lead to commit 942341dcc574 ("mm: > > > fix use-after-free when anon vma name is used after vma is freed") > > > adding anon_name refcount get/put operations exactly to the cases that > > > actually do not change anon_name - just so the replace_vma_anon_name() > > > can keep safely determining it has nothing to do. > > > > > > The recent madvise cleanups made this suboptimal handling very obvious, > > > but happily also allow for an easy fix. madvise_update_vma() now has the > > > complete information whether it's been called to set a new anon_name, so > > > stop passing it the existing vma's name and doing the refcount get/put > > > in its only caller madvise_vma_behavior(). > > > > > > In madvise_update_vma() itself, limit calling of replace_anon_vma_name() > > > only to cases where we are setting a new name, otherwise we know it's a > > > no-op. We can rely solely on the __MADV_SET_ANON_VMA_NAME behaviour and > > > can remove the duplicate checks for vma being anon/shmem that were done > > > already in madvise_vma_behavior(). > > > > > > The remaining reason to obtain the vma's existing anon_name is to pass > > > it to vma_modify_flags_name() for the splitting and merging to work > > > properly. In case of merging, the vma might be freed along with the > > > anon_name, but madvise_update_vma() will not access it afterwards > > > > This is quite subtle. Can we add a comment in the code that anon_name > > might be freed as a result of vma merge after vma_modify_flags_name() > > gets called and anon_name should not be accessed afterwards? > > Surely that's not the common pattern since the anon vma name is ref > counted? > > And it's probably the case for more than just the anon name? This is all quite tricky. I think the key thing is that madvise_set_anon_name() is invoked by prctl_set_vma() (yuck) which allocates a brand new anon_vma_name. When we merge, we don't actually set that anon_vma_name to anything, but we might put (and thereby maybe free) an anon_vma_name that is identical in terms of the string as part of the merge. But that'll be the vma->anon_vma_name that gets killed, not anon_vma_name in madvise_update_vma(), as that hasn't been set yet :) The problem was when doing so and referencing vma->anon_vma_name, and then afterwards invoking replace_anon_vma_name(). The VMA being changed might have got deleted as part of a merge, and so this could then be a dangling pointer. But the irony is, in that case, there's really no need to call replace_anon_vma_name() the one thing that actually uses anon_vma_name after the merge... because trivially anon_vma_name_eq() effectively comparing vma->anon_vma_name to itself will always be true and thus the operation will always be a no-op. Except it'll be a no-op referencing a dangling pointer... The previous cleanups made this whole thing clearer (see, this all sounds very claer right? :P) meaning the get/put are just totally unnecessary. TL;DR - vma->anon_vma_name is _not being used after the merge_ here in any case. > > ... > > Thanks, > Liam
© 2016 - 2025 Red Hat, Inc.