Commit 542eda1a8329 ("mm/rmap: improve anon_vma_clone(), unlink_anon_vmas()
comments, add asserts") altered the way errors are handled, but overlooked
one important aspect of cleanup.
When a VMA encounters an error state in anon_vma_clone() (that is, on
attempted allocation of anon_vma_chain objects), it cleans up partially
established state in cleanup_partial_anon_vmas(), before returning an
error.
However, this occurs prior to anon_vma->num_active_vmas being incremented,
and it also fails to clear the VMA's vma->anon_vma field, which remains in
place.
This is immediately an inconsistent state, because
anon_vma->num_active_vmas is supposed to track the number of VMAs whose
vma->anon_vma field references that anon_vma, and now that count is one
too low for each VMA for which this error state has occurred.
When VMAs are unlinked from this anon_vma, unlink_anon_vmas() will
eventually underflow anon_vma->num_active_vmas, which will trigger a
warning.
This will always eventually happen, as we unlink anon_vmas at process
teardown.
It could also cause maybe_reuse_anon_vma() to incorrectly permit the reuse
of an anon_vma which has active VMAs attached, which will lead to a
persistently invalid state.
The solution is to clear the VMA's anon_vma field when we clean up partial
state, as the fact we are doing so indicates clearly that the VMA is not
correctly integrated into the anon_vma tree and thus this field is invalid.
Reported-by: Sasha Levin <sashal@kernel.org>
Closes: https://lore.kernel.org/linux-mm/20260302151547.2389070-1-sashal@kernel.org/
Reported-by: Jiakai Xu <jiakaipeanut@gmail.com>
Closes: https://lore.kernel.org/linux-mm/CAFb8wJvRhatRD-9DVmr5v5pixTMPEr3UKjYBJjCd09OfH55CKg@mail.gmail.com/
Fixes: 542eda1a8329 ("mm/rmap: improve anon_vma_clone(), unlink_anon_vmas() comments, add asserts")
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
---
mm/rmap.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/mm/rmap.c b/mm/rmap.c
index 6398d7eef393..abe4712a220c 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -457,6 +457,13 @@ static void cleanup_partial_anon_vmas(struct vm_area_struct *vma)
 		list_del(&avc->same_vma);
 		anon_vma_chain_free(avc);
 	}
+
+	/*
+	 * The anon_vma assigned to this VMA is no longer valid, as we were not
+	 * able to correctly clone AVC state. Avoid inconsistent anon_vma tree
+	 * state by resetting.
+	 */
+	vma->anon_vma = NULL;
 }
 
 /**
--
2.53.0
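The accounting problem described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel code: the toy_* types and functions are hypothetical stand-ins, the counter is a signed long (so that the underflow shows up as -1 rather than wrapping), and the caller is assumed to have set dst->anon_vma before the clone attempt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; only the fields discussed here. */
struct toy_anon_vma {
	long num_active_vmas;	/* VMAs whose ->anon_vma points at this object */
};

struct toy_vma {
	struct toy_anon_vma *anon_vma;
};

/*
 * Model of a clone attempt. dst->anon_vma is assumed to be set already;
 * the count is only incremented once cloning has succeeded.
 */
static int toy_clone(struct toy_vma *dst, bool alloc_fails, bool with_fix)
{
	if (alloc_fails) {
		/* Error path: partial state is torn down before the increment. */
		if (with_fix)
			dst->anon_vma = NULL;	/* models the line the patch adds */
		return -1;
	}
	assert(dst->anon_vma != NULL);
	dst->anon_vma->num_active_vmas++;
	return 0;
}

/* Model of unlinking: a VMA with an anon_vma drops one active reference. */
static void toy_unlink(struct toy_vma *vma)
{
	if (!vma->anon_vma)
		return;
	vma->anon_vma->num_active_vmas--;
	vma->anon_vma = NULL;
}

/* Final count after one healthy VMA and one failed clone are both unlinked. */
static long toy_final_count(bool with_fix)
{
	struct toy_anon_vma av = { .num_active_vmas = 1 };	/* held by 'parent' */
	struct toy_vma parent = { .anon_vma = &av };
	struct toy_vma failed = { .anon_vma = &av };	/* set before cloning */

	toy_clone(&failed, true, with_fix);
	toy_unlink(&failed);
	toy_unlink(&parent);
	return av.num_active_vmas;
}
```

In this model, toy_final_count(false) ends below zero because the failed VMA still points at the anon_vma when it is unlinked, while toy_final_count(true) ends at zero because the stale pointer was cleared on the error path.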
On Wed, Mar 18, 2026 at 12:26:32PM +0000, Lorenzo Stoakes (Oracle) wrote:
> Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
> ---
Acked-by: Harry Yoo <harry.yoo@oracle.com>
--
Cheers,
Harry / Hyeonggon
> Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Tested-by: Jiakai Xu <jiakaiPeanut@gmail.com>
Thanks!
On 3/18/26 13:26, Lorenzo Stoakes (Oracle) wrote:
> Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Thanks!
On 3/18/26 13:26, Lorenzo Stoakes (Oracle) wrote:
> ---
> mm/rmap.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 6398d7eef393..abe4712a220c 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -457,6 +457,13 @@ static void cleanup_partial_anon_vmas(struct vm_area_struct *vma)
>  		list_del(&avc->same_vma);
>  		anon_vma_chain_free(avc);
>  	}
> +
> +	/*
> +	 * The anon_vma assigned to this VMA is no longer valid, as we were not
> +	 * able to correctly clone AVC state. Avoid inconsistent anon_vma tree
> +	 * state by resetting.
> +	 */
> +	vma->anon_vma = NULL;
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
LGTM. I was wondering whether anon_vma_clone() should take care of
setting up dst->anon_vma. It looks a bit odd in dup_anon_vma.
--
Cheers,
David
On Wed, Mar 18, 2026 at 01:52:44PM +0100, David Hildenbrand (Arm) wrote:
>
> Acked-by: David Hildenbrand (Arm) <david@kernel.org>
>
Thanks!
> LGTM. I was wondering whether anon_vma_clone() should take care of
> setting up dst->anon_vma. It looks a bit odd in dup_anon_vma.
Well in the fork case we don't want to set vma->anon_vma, as we use
vma->anon_vma to denote whether or not we reused another anon_vma. For
split + remap, we already duplicated it via vm_area_dup() ->
vm_area_init_from().
But... now we have enum vma_operation threaded through here, we can just do
that there like:
	/* dst is unfaulted, so inherit src's anon_vma. */
	if (operation == VMA_OP_MERGE_UNFAULTED)
		dst->anon_vma = src->anon_vma;
?
>
> --
> Cheers,
>
> David
Cheers,
Lorenzo
On 3/18/26 14:03, Lorenzo Stoakes (Oracle) wrote:
> On Wed, Mar 18, 2026 at 01:52:44PM +0100, David Hildenbrand (Arm) wrote:
>>
>> Acked-by: David Hildenbrand (Arm) <david@kernel.org>
>>
>
> Thanks!
>
>> LGTM. I was wondering whether anon_vma_clone() should take care of
>> setting up dst->anon_vma. It looks a bit odd in dup_anon_vma.
>
> Well in the fork case we don't want to set vma->anon_vma, as we use
> vma->anon_vma to denote whether or not we reused another anon_vma. For
> split + remap, we already duplicated it via vm_area_dup() ->
> vm_area_init_from().
>
> But... now we have enum vma_operation threaded through here, we can just do
> that there like:
>
> 	/* dst is unfaulted, so inherit src's anon_vma. */
> 	if (operation == VMA_OP_MERGE_UNFAULTED)
> 		dst->anon_vma = src->anon_vma;
>
Exactly what I had in mind.
--
Cheers,
David
On Wed, Mar 18, 2026 at 02:38:18PM +0100, David Hildenbrand (Arm) wrote:
> On 3/18/26 14:03, Lorenzo Stoakes (Oracle) wrote:
> > On Wed, Mar 18, 2026 at 01:52:44PM +0100, David Hildenbrand (Arm) wrote:
> >>
> >> Acked-by: David Hildenbrand (Arm) <david@kernel.org>
> >>
> >
> > Thanks!
> >
> >> LGTM. I was wondering whether anon_vma_clone() should take care of
> >> setting up dst->anon_vma. It looks a bit odd in dup_anon_vma.
> >
> > Well in the fork case we don't want to set vma->anon_vma, as we use
> > vma->anon_vma to denote whether or not we reused another anon_vma. For
> > split + remap, we already duplicated it via vm_area_dup() ->
> > vm_area_init_from().
> >
> > But... now we have enum vma_operation threaded through here, we can just do
> > that there like:
> >
> > 	/* dst is unfaulted, so inherit src's anon_vma. */
> > 	if (operation == VMA_OP_MERGE_UNFAULTED)
> > 		dst->anon_vma = src->anon_vma;
> >
>
> Exactly what I had in mind.
That's fair, will send a separate patch for that!
>
> --
> Cheers,
>
> David
Cheers,
Lorenzo
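For readers following the follow-up idea, here is a minimal standalone sketch of what "set dst->anon_vma inside the clone step, keyed on the operation" could look like. It is an illustration only: the sketch_* types, the function name and the SKETCH_OP_* enumerators are hypothetical, and only VMA_OP_MERGE_UNFAULTED and the three-line body are taken from the snippet in the thread. The real enum vma_operation and anon_vma_clone() may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; only the field discussed in the thread. */
struct sketch_anon_vma {
	int placeholder;
};

struct sketch_vma {
	struct sketch_anon_vma *anon_vma;
};

/*
 * Hypothetical operation enum. Only VMA_OP_MERGE_UNFAULTED is a name that
 * appears in the thread; the others are placeholders for the fork and
 * split/remap cases mentioned there.
 */
enum sketch_vma_operation {
	SKETCH_OP_FORK,
	SKETCH_OP_SPLIT_OR_REMAP,
	VMA_OP_MERGE_UNFAULTED,
};

/*
 * Fork leaves dst->anon_vma alone (it signals whether an anon_vma was
 * reused), split/remap already copied it when the VMA was duplicated, so
 * only the unfaulted-merge case assigns it here.
 */
static void sketch_clone_set_anon_vma(struct sketch_vma *dst,
				      const struct sketch_vma *src,
				      enum sketch_vma_operation operation)
{
	assert(dst != NULL && src != NULL);

	/* dst is unfaulted, so inherit src's anon_vma. */
	if (operation == VMA_OP_MERGE_UNFAULTED)
		dst->anon_vma = src->anon_vma;
}
```

The point of the design, as discussed above, is that the caller no longer has to assign the field before cloning, which keeps the assignment and its error handling in one place.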