Since the introduction in 9a10064f5625 ("mm: add a field to store names
for private anonymous memory") the code to set anon_name on a vma has
been using madvise_update_vma() to call replace_anon_vma_name(). Since
the former is called also by a number of other madvise behaviours that
do not set a new anon_name, they have been passing the existing
anon_name of the vma to make replace_vma_anon_name() a no-op.
This is rather wasteful as it needs anon_vma_name_eq() to determine the
no-op situations, and checks for when replace_vma_anon_name() is allowed
(the vma is anon/shmem) duplicate the checks already done earlier in
madvise_vma_behavior(). It has also lead to commit 942341dcc574 ("mm:
fix use-after-free when anon vma name is used after vma is freed")
adding anon_name refcount get/put operations exactly to the cases that
actually do not change anon_name - just so the replace_vma_anon_name()
can keep safely determining it has nothing to do.
The recent madvise cleanups made this suboptimal handling very obvious,
but happily also allow for an easy fix. madvise_update_vma() now has the
complete information whether it's been called to set a new anon_name, so
stop passing it the existing vma's name and doing the refcount get/put
in its only caller madvise_vma_behavior().
In madvise_update_vma() itself, limit calling of replace_anon_vma_name()
only to cases where we are setting a new name, otherwise we know it's a
no-op. We can rely solely on the __MADV_SET_ANON_VMA_NAME behaviour and
can remove the duplicate checks for vma being anon/shmem that were done
already in madvise_vma_behavior().
Additionally, by using vma_modify_flags() when not modifying the
anon_name, avoid explicitly passing the existing vma's anon_name and
storing a pointer to it in struct madv_behavior or a local variable.
This prevents the danger of accessing a freed anon_name after vma
merging, previously fixed by commit 942341dcc574.
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
---
mm/madvise.c | 37 +++++++++++++------------------------
1 file changed, 13 insertions(+), 24 deletions(-)
diff --git a/mm/madvise.c b/mm/madvise.c
index 4491bf080f55d6d1aeffb2ff0b8fdd28904af950..fca0e9b3e844ad766e83ac04cc0d7f4099c74005 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -176,25 +176,29 @@ static int replace_anon_vma_name(struct vm_area_struct *vma,
}
#endif /* CONFIG_ANON_VMA_NAME */
/*
- * Update the vm_flags on region of a vma, splitting it or merging it as
- * necessary. Must be called with mmap_lock held for writing;
- * Caller should ensure anon_name stability by raising its refcount even when
- * anon_name belongs to a valid vma because this function might free that vma.
+ * Update the vm_flags and/or anon_name on region of a vma, splitting it or
+ * merging it as necessary. Must be called with mmap_lock held for writing.
*/
static int madvise_update_vma(vm_flags_t new_flags,
struct madvise_behavior *madv_behavior)
{
- int error;
struct vm_area_struct *vma = madv_behavior->vma;
struct madvise_behavior_range *range = &madv_behavior->range;
struct anon_vma_name *anon_name = madv_behavior->anon_name;
+ bool set_new_anon_name = madv_behavior->behavior == __MADV_SET_ANON_VMA_NAME;
VMA_ITERATOR(vmi, madv_behavior->mm, range->start);
- if (new_flags == vma->vm_flags && anon_vma_name_eq(anon_vma_name(vma), anon_name))
+ if (new_flags == vma->vm_flags && (!set_new_anon_name ||
+ anon_vma_name_eq(anon_vma_name(vma), anon_name)))
return 0;
- vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma,
+ if (set_new_anon_name)
+ vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma,
range->start, range->end, new_flags, anon_name);
+ else
+ vma = vma_modify_flags(&vmi, madv_behavior->prev, vma,
+ range->start, range->end, new_flags);
+
if (IS_ERR(vma))
return PTR_ERR(vma);
@@ -203,11 +207,8 @@ static int madvise_update_vma(vm_flags_t new_flags,
/* vm_flags is protected by the mmap_lock held in write mode. */
vma_start_write(vma);
vm_flags_reset(vma, new_flags);
- if (!vma->vm_file || vma_is_anon_shmem(vma)) {
- error = replace_anon_vma_name(vma, anon_name);
- if (error)
- return error;
- }
+ if (set_new_anon_name)
+ return replace_anon_vma_name(vma, anon_name);
return 0;
}
@@ -1313,7 +1314,6 @@ static int madvise_vma_behavior(struct madvise_behavior *madv_behavior)
int behavior = madv_behavior->behavior;
struct vm_area_struct *vma = madv_behavior->vma;
vm_flags_t new_flags = vma->vm_flags;
- bool set_new_anon_name = behavior == __MADV_SET_ANON_VMA_NAME;
struct madvise_behavior_range *range = &madv_behavior->range;
int error;
@@ -1403,18 +1403,7 @@ static int madvise_vma_behavior(struct madvise_behavior *madv_behavior)
/* This is a write operation.*/
VM_WARN_ON_ONCE(madv_behavior->lock_mode != MADVISE_MMAP_WRITE_LOCK);
- /*
- * madvise_update_vma() might cause a VMA merge which could put an
- * anon_vma_name, so we must hold an additional reference on the
- * anon_vma_name so it doesn't disappear from under us.
- */
- if (!set_new_anon_name) {
- madv_behavior->anon_name = anon_vma_name(vma);
- anon_vma_name_get(madv_behavior->anon_name);
- }
error = madvise_update_vma(new_flags, madv_behavior);
- if (!set_new_anon_name)
- anon_vma_name_put(madv_behavior->anon_name);
out:
/*
* madvise() returns EAGAIN if kernel resources, such as
--
2.50.0
On 6/24/25 15:03, Vlastimil Babka wrote: > Since the introduction in 9a10064f5625 ("mm: add a field to store names > for private anonymous memory") the code to set anon_name on a vma has > been using madvise_update_vma() to call replace_anon_vma_name(). Since > the former is called also by a number of other madvise behaviours that > do not set a new anon_name, they have been passing the existing > anon_name of the vma to make replace_vma_anon_name() a no-op. > > This is rather wasteful as it needs anon_vma_name_eq() to determine the > no-op situations, and checks for when replace_vma_anon_name() is allowed > (the vma is anon/shmem) duplicate the checks already done earlier in > madvise_vma_behavior(). It has also lead to commit 942341dcc574 ("mm: > fix use-after-free when anon vma name is used after vma is freed") > adding anon_name refcount get/put operations exactly to the cases that > actually do not change anon_name - just so the replace_vma_anon_name() > can keep safely determining it has nothing to do. > > The recent madvise cleanups made this suboptimal handling very obvious, > but happily also allow for an easy fix. madvise_update_vma() now has the > complete information whether it's been called to set a new anon_name, so > stop passing it the existing vma's name and doing the refcount get/put > in its only caller madvise_vma_behavior(). > > In madvise_update_vma() itself, limit calling of replace_anon_vma_name() > only to cases where we are setting a new name, otherwise we know it's a > no-op. We can rely solely on the __MADV_SET_ANON_VMA_NAME behaviour and > can remove the duplicate checks for vma being anon/shmem that were done > already in madvise_vma_behavior(). > > Additionally, by using vma_modify_flags() when not modifying the > anon_name, avoid explicitly passing the existing vma's anon_name and > storing a pointer to it in struct madv_behavior or a local variable. > This prevents the danger of accessing a freed anon_name after vma > merging, previously fixed by commit 942341dcc574. > > Signed-off-by: Vlastimil Babka <vbabka@suse.cz> With fixed up function names as pointed out by Suren: Since the introduction in 9a10064f5625 ("mm: add a field to store names for private anonymous memory") the code to set anon_name on a vma has been using madvise_update_vma() to call replace_anon_vma_name(). Since the former is called also by a number of other madvise behaviours that do not set a new anon_name, they have been passing the existing anon_name of the vma to make replace_anon_vma_name() a no-op. This is rather wasteful as it needs anon_vma_name_eq() to determine the no-op situations, and checks for when replace_anon_vma_name() is allowed (the vma is anon/shmem) duplicate the checks already done earlier in madvise_vma_behavior(). It has also lead to commit 942341dcc574 ("mm: fix use-after-free when anon vma name is used after vma is freed") adding anon_name refcount get/put operations exactly to the cases that actually do not change anon_name - just so the replace_anon_vma_name() can keep safely determining it has nothing to do. The recent madvise cleanups made this suboptimal handling very obvious, but happily also allow for an easy fix. madvise_update_vma() now has the complete information whether it's been called to set a new anon_name, so stop passing it the existing vma's name and doing the refcount get/put in its only caller madvise_vma_behavior(). In madvise_update_vma() itself, limit calling of replace_anon_vma_name() only to cases where we are setting a new name, otherwise we know it's a no-op. We can rely solely on the __MADV_SET_ANON_VMA_NAME behaviour and can remove the duplicate checks for vma being anon/shmem that were done already in madvise_vma_behavior(). Additionally, by using vma_modify_flags() when not modifying the anon_name, avoid explicitly passing the existing vma's anon_name and storing a pointer to it in struct madv_behavior or a local variable. This prevents the danger of accessing a freed anon_name after vma merging, previously fixed by commit 942341dcc574. Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
On Tue, Jun 24, 2025 at 03:03:45PM +0200, Vlastimil Babka wrote: > Since the introduction in 9a10064f5625 ("mm: add a field to store names > for private anonymous memory") the code to set anon_name on a vma has > been using madvise_update_vma() to call replace_anon_vma_name(). Since > the former is called also by a number of other madvise behaviours that > do not set a new anon_name, they have been passing the existing > anon_name of the vma to make replace_vma_anon_name() a no-op. > > This is rather wasteful as it needs anon_vma_name_eq() to determine the > no-op situations, and checks for when replace_vma_anon_name() is allowed > (the vma is anon/shmem) duplicate the checks already done earlier in > madvise_vma_behavior(). It has also lead to commit 942341dcc574 ("mm: > fix use-after-free when anon vma name is used after vma is freed") > adding anon_name refcount get/put operations exactly to the cases that > actually do not change anon_name - just so the replace_vma_anon_name() > can keep safely determining it has nothing to do. > > The recent madvise cleanups made this suboptimal handling very obvious, > but happily also allow for an easy fix. madvise_update_vma() now has the > complete information whether it's been called to set a new anon_name, so > stop passing it the existing vma's name and doing the refcount get/put > in its only caller madvise_vma_behavior(). > > In madvise_update_vma() itself, limit calling of replace_anon_vma_name() > only to cases where we are setting a new name, otherwise we know it's a > no-op. We can rely solely on the __MADV_SET_ANON_VMA_NAME behaviour and > can remove the duplicate checks for vma being anon/shmem that were done > already in madvise_vma_behavior(). > > Additionally, by using vma_modify_flags() when not modifying the > anon_name, avoid explicitly passing the existing vma's anon_name and > storing a pointer to it in struct madv_behavior or a local variable. > This prevents the danger of accessing a freed anon_name after vma > merging, previously fixed by commit 942341dcc574. > > Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Cheers, LGTM so: Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> I made sure to do some stress-ng --madvise testing with this series (+ my recent fix) applied :P all good. > --- > mm/madvise.c | 37 +++++++++++++------------------------ > 1 file changed, 13 insertions(+), 24 deletions(-) > > diff --git a/mm/madvise.c b/mm/madvise.c > index 4491bf080f55d6d1aeffb2ff0b8fdd28904af950..fca0e9b3e844ad766e83ac04cc0d7f4099c74005 100644 > --- a/mm/madvise.c > +++ b/mm/madvise.c > @@ -176,25 +176,29 @@ static int replace_anon_vma_name(struct vm_area_struct *vma, > } > #endif /* CONFIG_ANON_VMA_NAME */ > /* > - * Update the vm_flags on region of a vma, splitting it or merging it as > - * necessary. Must be called with mmap_lock held for writing; > - * Caller should ensure anon_name stability by raising its refcount even when > - * anon_name belongs to a valid vma because this function might free that vma. > + * Update the vm_flags and/or anon_name on region of a vma, splitting it or > + * merging it as necessary. Must be called with mmap_lock held for writing. > */ > static int madvise_update_vma(vm_flags_t new_flags, > struct madvise_behavior *madv_behavior) > { > - int error; > struct vm_area_struct *vma = madv_behavior->vma; > struct madvise_behavior_range *range = &madv_behavior->range; > struct anon_vma_name *anon_name = madv_behavior->anon_name; > + bool set_new_anon_name = madv_behavior->behavior == __MADV_SET_ANON_VMA_NAME; > VMA_ITERATOR(vmi, madv_behavior->mm, range->start); > > - if (new_flags == vma->vm_flags && anon_vma_name_eq(anon_vma_name(vma), anon_name)) > + if (new_flags == vma->vm_flags && (!set_new_anon_name || > + anon_vma_name_eq(anon_vma_name(vma), anon_name))) > return 0; > > - vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma, > + if (set_new_anon_name) > + vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma, > range->start, range->end, new_flags, anon_name); I will do a follow up (doesn't really belong here I'd say as involves change to modify code) that makes this vma_modify_name() or vma_modify_anon_name() as this is the only caller and we don't care about flags here :) > + else > + vma = vma_modify_flags(&vmi, madv_behavior->prev, vma, > + range->start, range->end, new_flags); > + > if (IS_ERR(vma)) > return PTR_ERR(vma); > > @@ -203,11 +207,8 @@ static int madvise_update_vma(vm_flags_t new_flags, > /* vm_flags is protected by the mmap_lock held in write mode. */ > vma_start_write(vma); > vm_flags_reset(vma, new_flags); > - if (!vma->vm_file || vma_is_anon_shmem(vma)) { > - error = replace_anon_vma_name(vma, anon_name); > - if (error) > - return error; > - } > + if (set_new_anon_name) > + return replace_anon_vma_name(vma, anon_name); > > return 0; > } > @@ -1313,7 +1314,6 @@ static int madvise_vma_behavior(struct madvise_behavior *madv_behavior) > int behavior = madv_behavior->behavior; > struct vm_area_struct *vma = madv_behavior->vma; > vm_flags_t new_flags = vma->vm_flags; > - bool set_new_anon_name = behavior == __MADV_SET_ANON_VMA_NAME; > struct madvise_behavior_range *range = &madv_behavior->range; > int error; > > @@ -1403,18 +1403,7 @@ static int madvise_vma_behavior(struct madvise_behavior *madv_behavior) > /* This is a write operation.*/ > VM_WARN_ON_ONCE(madv_behavior->lock_mode != MADVISE_MMAP_WRITE_LOCK); > > - /* > - * madvise_update_vma() might cause a VMA merge which could put an > - * anon_vma_name, so we must hold an additional reference on the > - * anon_vma_name so it doesn't disappear from under us. > - */ > - if (!set_new_anon_name) { > - madv_behavior->anon_name = anon_vma_name(vma); > - anon_vma_name_get(madv_behavior->anon_name); > - } > error = madvise_update_vma(new_flags, madv_behavior); > - if (!set_new_anon_name) > - anon_vma_name_put(madv_behavior->anon_name); > out: > /* > * madvise() returns EAGAIN if kernel resources, such as > > -- > 2.50.0 >
On 6/24/25 17:26, Lorenzo Stoakes wrote: > On Tue, Jun 24, 2025 at 03:03:45PM +0200, Vlastimil Babka wrote: >> Since the introduction in 9a10064f5625 ("mm: add a field to store names >> for private anonymous memory") the code to set anon_name on a vma has >> been using madvise_update_vma() to call replace_anon_vma_name(). Since >> the former is called also by a number of other madvise behaviours that >> do not set a new anon_name, they have been passing the existing >> anon_name of the vma to make replace_vma_anon_name() a no-op. >> >> This is rather wasteful as it needs anon_vma_name_eq() to determine the >> no-op situations, and checks for when replace_vma_anon_name() is allowed >> (the vma is anon/shmem) duplicate the checks already done earlier in >> madvise_vma_behavior(). It has also lead to commit 942341dcc574 ("mm: >> fix use-after-free when anon vma name is used after vma is freed") >> adding anon_name refcount get/put operations exactly to the cases that >> actually do not change anon_name - just so the replace_vma_anon_name() >> can keep safely determining it has nothing to do. >> >> The recent madvise cleanups made this suboptimal handling very obvious, >> but happily also allow for an easy fix. madvise_update_vma() now has the >> complete information whether it's been called to set a new anon_name, so >> stop passing it the existing vma's name and doing the refcount get/put >> in its only caller madvise_vma_behavior(). >> >> In madvise_update_vma() itself, limit calling of replace_anon_vma_name() >> only to cases where we are setting a new name, otherwise we know it's a >> no-op. We can rely solely on the __MADV_SET_ANON_VMA_NAME behaviour and >> can remove the duplicate checks for vma being anon/shmem that were done >> already in madvise_vma_behavior(). >> >> Additionally, by using vma_modify_flags() when not modifying the >> anon_name, avoid explicitly passing the existing vma's anon_name and >> storing a pointer to it in struct madv_behavior or a local variable. >> This prevents the danger of accessing a freed anon_name after vma >> merging, previously fixed by commit 942341dcc574. >> >> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> > > Cheers, LGTM so: > > Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> > > I made sure to do some stress-ng --madvise testing with this series (+ my recent > fix) applied :P all good. Thanks for that! >> --- >> mm/madvise.c | 37 +++++++++++++------------------------ >> 1 file changed, 13 insertions(+), 24 deletions(-) >> >> diff --git a/mm/madvise.c b/mm/madvise.c >> index 4491bf080f55d6d1aeffb2ff0b8fdd28904af950..fca0e9b3e844ad766e83ac04cc0d7f4099c74005 100644 >> --- a/mm/madvise.c >> +++ b/mm/madvise.c >> @@ -176,25 +176,29 @@ static int replace_anon_vma_name(struct vm_area_struct *vma, >> } >> #endif /* CONFIG_ANON_VMA_NAME */ >> /* >> - * Update the vm_flags on region of a vma, splitting it or merging it as >> - * necessary. Must be called with mmap_lock held for writing; >> - * Caller should ensure anon_name stability by raising its refcount even when >> - * anon_name belongs to a valid vma because this function might free that vma. >> + * Update the vm_flags and/or anon_name on region of a vma, splitting it or >> + * merging it as necessary. Must be called with mmap_lock held for writing. >> */ >> static int madvise_update_vma(vm_flags_t new_flags, >> struct madvise_behavior *madv_behavior) >> { >> - int error; >> struct vm_area_struct *vma = madv_behavior->vma; >> struct madvise_behavior_range *range = &madv_behavior->range; >> struct anon_vma_name *anon_name = madv_behavior->anon_name; >> + bool set_new_anon_name = madv_behavior->behavior == __MADV_SET_ANON_VMA_NAME; >> VMA_ITERATOR(vmi, madv_behavior->mm, range->start); >> >> - if (new_flags == vma->vm_flags && anon_vma_name_eq(anon_vma_name(vma), anon_name)) >> + if (new_flags == vma->vm_flags && (!set_new_anon_name || >> + anon_vma_name_eq(anon_vma_name(vma), anon_name))) >> return 0; >> >> - vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma, >> + if (set_new_anon_name) >> + vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma, >> range->start, range->end, new_flags, anon_name); > > I will do a follow up (doesn't really belong here I'd say as involves change to > modify code) that makes this vma_modify_name() or vma_modify_anon_name() as this > is the only caller and we don't care about flags here :) Oh right, good idea!
On 24.06.25 15:03, Vlastimil Babka wrote: > Since the introduction in 9a10064f5625 ("mm: add a field to store names > for private anonymous memory") the code to set anon_name on a vma has > been using madvise_update_vma() to call replace_anon_vma_name(). Since > the former is called also by a number of other madvise behaviours that > do not set a new anon_name, they have been passing the existing > anon_name of the vma to make replace_vma_anon_name() a no-op. > > This is rather wasteful as it needs anon_vma_name_eq() to determine the > no-op situations, and checks for when replace_vma_anon_name() is allowed > (the vma is anon/shmem) duplicate the checks already done earlier in > madvise_vma_behavior(). It has also lead to commit 942341dcc574 ("mm: > fix use-after-free when anon vma name is used after vma is freed") > adding anon_name refcount get/put operations exactly to the cases that > actually do not change anon_name - just so the replace_vma_anon_name() > can keep safely determining it has nothing to do. > > The recent madvise cleanups made this suboptimal handling very obvious, > but happily also allow for an easy fix. madvise_update_vma() now has the > complete information whether it's been called to set a new anon_name, so > stop passing it the existing vma's name and doing the refcount get/put > in its only caller madvise_vma_behavior(). > > In madvise_update_vma() itself, limit calling of replace_anon_vma_name() > only to cases where we are setting a new name, otherwise we know it's a > no-op. We can rely solely on the __MADV_SET_ANON_VMA_NAME behaviour and > can remove the duplicate checks for vma being anon/shmem that were done > already in madvise_vma_behavior(). > > Additionally, by using vma_modify_flags() when not modifying the > anon_name, avoid explicitly passing the existing vma's anon_name and > storing a pointer to it in struct madv_behavior or a local variable. > This prevents the danger of accessing a freed anon_name after vma > merging, previously fixed by commit 942341dcc574. > > Signed-off-by: Vlastimil Babka <vbabka@suse.cz> > --- > mm/madvise.c | 37 +++++++++++++------------------------ > 1 file changed, 13 insertions(+), 24 deletions(-) > > diff --git a/mm/madvise.c b/mm/madvise.c > index 4491bf080f55d6d1aeffb2ff0b8fdd28904af950..fca0e9b3e844ad766e83ac04cc0d7f4099c74005 100644 > --- a/mm/madvise.c > +++ b/mm/madvise.c > @@ -176,25 +176,29 @@ static int replace_anon_vma_name(struct vm_area_struct *vma, > } > #endif /* CONFIG_ANON_VMA_NAME */ > /* > - * Update the vm_flags on region of a vma, splitting it or merging it as > - * necessary. Must be called with mmap_lock held for writing; > - * Caller should ensure anon_name stability by raising its refcount even when > - * anon_name belongs to a valid vma because this function might free that vma. > + * Update the vm_flags and/or anon_name on region of a vma, splitting it or > + * merging it as necessary. Must be called with mmap_lock held for writing. > */ > static int madvise_update_vma(vm_flags_t new_flags, > struct madvise_behavior *madv_behavior) > { > - int error; > struct vm_area_struct *vma = madv_behavior->vma; > struct madvise_behavior_range *range = &madv_behavior->range; > struct anon_vma_name *anon_name = madv_behavior->anon_name; > + bool set_new_anon_name = madv_behavior->behavior == __MADV_SET_ANON_VMA_NAME; > VMA_ITERATOR(vmi, madv_behavior->mm, range->start); > > - if (new_flags == vma->vm_flags && anon_vma_name_eq(anon_vma_name(vma), anon_name)) > + if (new_flags == vma->vm_flags && (!set_new_anon_name || > + anon_vma_name_eq(anon_vma_name(vma), anon_name))) > return 0; > > - vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma, > + if (set_new_anon_name) > + vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma, > range->start, range->end, new_flags, anon_name); > + else > + vma = vma_modify_flags(&vmi, madv_behavior->prev, vma, > + range->start, range->end, new_flags); > + > if (IS_ERR(vma)) > return PTR_ERR(vma); > > @@ -203,11 +207,8 @@ static int madvise_update_vma(vm_flags_t new_flags, > /* vm_flags is protected by the mmap_lock held in write mode. */ > vma_start_write(vma); > vm_flags_reset(vma, new_flags); > - if (!vma->vm_file || vma_is_anon_shmem(vma)) { > - error = replace_anon_vma_name(vma, anon_name); > - if (error) > - return error; > - } > + if (set_new_anon_name) > + return replace_anon_vma_name(vma, anon_name); Took me a second to find where this is already checked (-> madvise_vma_behavior()). :) Acked-by: David Hildenbrand <david@redhat.com> -- Cheers, David / dhildenb
On 6/24/25 15:58, David Hildenbrand wrote: > On 24.06.25 15:03, Vlastimil Babka wrote: >> Since the introduction in 9a10064f5625 ("mm: add a field to store names >> for private anonymous memory") the code to set anon_name on a vma has >> been using madvise_update_vma() to call replace_anon_vma_name(). Since >> the former is called also by a number of other madvise behaviours that >> do not set a new anon_name, they have been passing the existing >> anon_name of the vma to make replace_vma_anon_name() a no-op. >> >> This is rather wasteful as it needs anon_vma_name_eq() to determine the >> no-op situations, and checks for when replace_vma_anon_name() is allowed >> (the vma is anon/shmem) duplicate the checks already done earlier in >> madvise_vma_behavior(). It has also lead to commit 942341dcc574 ("mm: ^ >> fix use-after-free when anon vma name is used after vma is freed") >> adding anon_name refcount get/put operations exactly to the cases that >> actually do not change anon_name - just so the replace_vma_anon_name() >> can keep safely determining it has nothing to do. >> >> The recent madvise cleanups made this suboptimal handling very obvious, >> but happily also allow for an easy fix. madvise_update_vma() now has the >> complete information whether it's been called to set a new anon_name, so >> stop passing it the existing vma's name and doing the refcount get/put >> in its only caller madvise_vma_behavior(). >> >> In madvise_update_vma() itself, limit calling of replace_anon_vma_name() >> only to cases where we are setting a new name, otherwise we know it's a >> no-op. We can rely solely on the __MADV_SET_ANON_VMA_NAME behaviour and >> can remove the duplicate checks for vma being anon/shmem that were done >> already in madvise_vma_behavior(). ^ >> >> Additionally, by using vma_modify_flags() when not modifying the >> anon_name, avoid explicitly passing the existing vma's anon_name and >> storing a pointer to it in struct madv_behavior or a local variable. >> This prevents the danger of accessing a freed anon_name after vma >> merging, previously fixed by commit 942341dcc574. >> >> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> <snip> > Took me a second to find where this is already checked (-> > madvise_vma_behavior()). And I thought I was repeating myself too much in the changelog :) > > :) But maybe you're joking on purpose, referring to that :) > Acked-by: David Hildenbrand <david@redhat.com> Thanks!
On Tue, Jun 24, 2025 at 6:58 AM David Hildenbrand <david@redhat.com> wrote: > > On 24.06.25 15:03, Vlastimil Babka wrote: > > Since the introduction in 9a10064f5625 ("mm: add a field to store names > > for private anonymous memory") the code to set anon_name on a vma has > > been using madvise_update_vma() to call replace_anon_vma_name(). Since > > the former is called also by a number of other madvise behaviours that > > do not set a new anon_name, they have been passing the existing > > anon_name of the vma to make replace_vma_anon_name() a no-op. s/replace_vma_anon_name/replace_anon_vma_name > > > > This is rather wasteful as it needs anon_vma_name_eq() to determine the > > no-op situations, and checks for when replace_vma_anon_name() is allowed s/replace_vma_anon_name/replace_anon_vma_name > > (the vma is anon/shmem) duplicate the checks already done earlier in > > madvise_vma_behavior(). It has also lead to commit 942341dcc574 ("mm: > > fix use-after-free when anon vma name is used after vma is freed") > > adding anon_name refcount get/put operations exactly to the cases that > > actually do not change anon_name - just so the replace_vma_anon_name() s/replace_vma_anon_name/replace_anon_vma_name > > can keep safely determining it has nothing to do. > > > > The recent madvise cleanups made this suboptimal handling very obvious, > > but happily also allow for an easy fix. madvise_update_vma() now has the > > complete information whether it's been called to set a new anon_name, so > > stop passing it the existing vma's name and doing the refcount get/put > > in its only caller madvise_vma_behavior(). > > > > In madvise_update_vma() itself, limit calling of replace_anon_vma_name() > > only to cases where we are setting a new name, otherwise we know it's a > > no-op. We can rely solely on the __MADV_SET_ANON_VMA_NAME behaviour and > > can remove the duplicate checks for vma being anon/shmem that were done > > already in madvise_vma_behavior(). > > > > Additionally, by using vma_modify_flags() when not modifying the > > anon_name, avoid explicitly passing the existing vma's anon_name and > > storing a pointer to it in struct madv_behavior or a local variable. > > This prevents the danger of accessing a freed anon_name after vma > > merging, previously fixed by commit 942341dcc574. > > > > Signed-off-by: Vlastimil Babka <vbabka@suse.cz> > > --- > > mm/madvise.c | 37 +++++++++++++------------------------ > > 1 file changed, 13 insertions(+), 24 deletions(-) > > > > diff --git a/mm/madvise.c b/mm/madvise.c > > index 4491bf080f55d6d1aeffb2ff0b8fdd28904af950..fca0e9b3e844ad766e83ac04cc0d7f4099c74005 100644 > > --- a/mm/madvise.c > > +++ b/mm/madvise.c > > @@ -176,25 +176,29 @@ static int replace_anon_vma_name(struct vm_area_struct *vma, > > } > > #endif /* CONFIG_ANON_VMA_NAME */ > > /* > > - * Update the vm_flags on region of a vma, splitting it or merging it as > > - * necessary. Must be called with mmap_lock held for writing; > > - * Caller should ensure anon_name stability by raising its refcount even when > > - * anon_name belongs to a valid vma because this function might free that vma. > > + * Update the vm_flags and/or anon_name on region of a vma, splitting it or > > + * merging it as necessary. Must be called with mmap_lock held for writing. > > */ > > static int madvise_update_vma(vm_flags_t new_flags, > > struct madvise_behavior *madv_behavior) > > { > > - int error; > > struct vm_area_struct *vma = madv_behavior->vma; > > struct madvise_behavior_range *range = &madv_behavior->range; > > struct anon_vma_name *anon_name = madv_behavior->anon_name; > > + bool set_new_anon_name = madv_behavior->behavior == __MADV_SET_ANON_VMA_NAME; > > VMA_ITERATOR(vmi, madv_behavior->mm, range->start); > > > > - if (new_flags == vma->vm_flags && anon_vma_name_eq(anon_vma_name(vma), anon_name)) > > + if (new_flags == vma->vm_flags && (!set_new_anon_name || > > + anon_vma_name_eq(anon_vma_name(vma), anon_name))) > > return 0; > > > > - vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma, > > + if (set_new_anon_name) > > + vma = vma_modify_flags_name(&vmi, madv_behavior->prev, vma, > > range->start, range->end, new_flags, anon_name); > > + else > > + vma = vma_modify_flags(&vmi, madv_behavior->prev, vma, > > + range->start, range->end, new_flags); > > + > > if (IS_ERR(vma)) > > return PTR_ERR(vma); > > > > @@ -203,11 +207,8 @@ static int madvise_update_vma(vm_flags_t new_flags, > > /* vm_flags is protected by the mmap_lock held in write mode. */ > > vma_start_write(vma); > > vm_flags_reset(vma, new_flags); > > - if (!vma->vm_file || vma_is_anon_shmem(vma)) { > > - error = replace_anon_vma_name(vma, anon_name); > > - if (error) > > - return error; > > - } > > + if (set_new_anon_name) > > + return replace_anon_vma_name(vma, anon_name); > > Took me a second to find where this is already checked (-> > madvise_vma_behavior()). > > :) > > Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> > > -- > Cheers, > > David / dhildenb >
On 6/24/25 16:28, Suren Baghdasaryan wrote: > On Tue, Jun 24, 2025 at 6:58 AM David Hildenbrand <david@redhat.com> wrote: >> >> On 24.06.25 15:03, Vlastimil Babka wrote: >> > Since the introduction in 9a10064f5625 ("mm: add a field to store names >> > for private anonymous memory") the code to set anon_name on a vma has >> > been using madvise_update_vma() to call replace_anon_vma_name(). Since >> > the former is called also by a number of other madvise behaviours that >> > do not set a new anon_name, they have been passing the existing >> > anon_name of the vma to make replace_vma_anon_name() a no-op. > > s/replace_vma_anon_name/replace_anon_vma_name Hm right, thanks. The problem is that it was indeed originally called replace_vma_anon_name() (later renamed) and I was copy/pasting while looking at commit 9a10064f5625... will send a fixed up changelog in a reply to patch 1/4. >> >> Acked-by: David Hildenbrand <david@redhat.com> > > Reviewed-by: Suren Baghdasaryan <surenb@google.com> Thanks! >> >> -- >> Cheers, >> >> David / dhildenb >>
© 2016 - 2025 Red Hat, Inc.