[PATCH] mm/vmalloc: fix KMSAN uninit in decay_va_pool_node list handling

chenyichong posted 1 patch 2 months, 1 week ago
mm/vmalloc.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
[PATCH] mm/vmalloc: fix KMSAN uninit in decay_va_pool_node list handling
Posted by chenyichong 2 months, 1 week ago
Prevent decay_va_pool_node from overwriting concurrent repopulation of
vmap_node pool[i].head while purging. Read/reset pool[i].len under
pool_lock and splice leftover vmap_area nodes back into the pool
instead of replacing the list.

Reported-by: syzbot+37b7f6cd519f7fb8d32a@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=37b7f6cd519f7fb8d32a
Fixes: 7679ba6b36db ("mm: vmalloc: add a shrinker to drain vmap pools")
Signed-off-by: chenyichong <chenyichong@uniontech.com>
---
 mm/vmalloc.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index ecbac900c35f..72fb60553a71 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2233,10 +2233,9 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
 		/* Detach the pool, so no-one can access it. */
 		spin_lock(&vn->pool_lock);
 		list_replace_init(&vn->pool[i].head, &tmp_list);
-		spin_unlock(&vn->pool_lock);
-
 		pool_len = n_decay = vn->pool[i].len;
 		WRITE_ONCE(vn->pool[i].len, 0);
+		spin_unlock(&vn->pool_lock);
 
 		/* Decay a pool by ~25% out of left objects. */
 		if (!full_decay)
@@ -2259,8 +2258,14 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
 		 */
 		if (!list_empty(&tmp_list)) {
 			spin_lock(&vn->pool_lock);
-			list_replace_init(&tmp_list, &vn->pool[i].head);
-			WRITE_ONCE(vn->pool[i].len, pool_len);
+			/*
+			 * Merge leftover areas back into the pool rather than
+			 * replacing the whole list. A concurrent allocator can
+			 * repopulate vn->pool[i].head while we are decaying
+			 * tmp_list, and replacing would drop those nodes.
+			 */
+			list_splice_tail_init(&tmp_list, &vn->pool[i].head);
+			WRITE_ONCE(vn->pool[i].len, vn->pool[i].len + pool_len);
 			spin_unlock(&vn->pool_lock);
 		}
 	}
-- 
2.50.1
Re: [PATCH] mm/vmalloc: fix KMSAN uninit in decay_va_pool_node list handling
Posted by Uladzislau Rezki 2 months, 1 week ago
On Fri, Apr 03, 2026 at 03:52:03PM +0800, chenyichong wrote:
> Prevent decay_va_pool_node from overwriting concurrent repopulation of
> vmap_node pool[i].head while purging. Read/reset pool[i].len under
> pool_lock and splice leftover vmap_area nodes back into the pool
> instead of replacing the list.
> 
> Reported-by: syzbot+37b7f6cd519f7fb8d32a@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=37b7f6cd519f7fb8d32a
> Fixes: 7679ba6b36db ("mm: vmalloc: add a shrinker to drain vmap pools")
> Signed-off-by: chenyichong <chenyichong@uniontech.com>
> ---
>  mm/vmalloc.c | 13 +++++++++----
>  1 file changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index ecbac900c35f..72fb60553a71 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2233,10 +2233,9 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
>  		/* Detach the pool, so no-one can access it. */
>  		spin_lock(&vn->pool_lock);
>  		list_replace_init(&vn->pool[i].head, &tmp_list);
> -		spin_unlock(&vn->pool_lock);
> -
>  		pool_len = n_decay = vn->pool[i].len;
>  		WRITE_ONCE(vn->pool[i].len, 0);
> +		spin_unlock(&vn->pool_lock);
>  
>  		/* Decay a pool by ~25% out of left objects. */
>  		if (!full_decay)
> @@ -2259,8 +2258,14 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
>  		 */
>  		if (!list_empty(&tmp_list)) {
>  			spin_lock(&vn->pool_lock);
> -			list_replace_init(&tmp_list, &vn->pool[i].head);
> -			WRITE_ONCE(vn->pool[i].len, pool_len);
> +			/*
> +			 * Merge leftover areas back into the pool rather than
> +			 * replacing the whole list. A concurrent allocator can
> +			 * repopulate vn->pool[i].head while we are decaying
> +			 * tmp_list, and replacing would drop those nodes.
> +			 */
> +			list_splice_tail_init(&tmp_list, &vn->pool[i].head);
> +			WRITE_ONCE(vn->pool[i].len, vn->pool[i].len + pool_len);
>
"A concurrent allocator can repopulate..." - Where is it done? Probably
you meant something different.

--
Uladzislau Rezki
Re: [PATCH] mm/vmalloc: fix KMSAN uninit in decay_va_pool_node list handling
Posted by Uladzislau Rezki 2 months, 1 week ago
On Fri, Apr 03, 2026 at 12:55:31PM +0200, Uladzislau Rezki wrote:
> On Fri, Apr 03, 2026 at 03:52:03PM +0800, chenyichong wrote:
> > Prevent decay_va_pool_node from overwriting concurrent repopulation of
> > vmap_node pool[i].head while purging. Read/reset pool[i].len under
> > pool_lock and splice leftover vmap_area nodes back into the pool
> > instead of replacing the list.
> > 
> > Reported-by: syzbot+37b7f6cd519f7fb8d32a@syzkaller.appspotmail.com
> > Closes: https://syzkaller.appspot.com/bug?extid=37b7f6cd519f7fb8d32a
> > Fixes: 7679ba6b36db ("mm: vmalloc: add a shrinker to drain vmap pools")
> > Signed-off-by: chenyichong <chenyichong@uniontech.com>
> > ---
> >  mm/vmalloc.c | 13 +++++++++----
> >  1 file changed, 9 insertions(+), 4 deletions(-)
> > 
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index ecbac900c35f..72fb60553a71 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -2233,10 +2233,9 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
> >  		/* Detach the pool, so no-one can access it. */
> >  		spin_lock(&vn->pool_lock);
> >  		list_replace_init(&vn->pool[i].head, &tmp_list);
> > -		spin_unlock(&vn->pool_lock);
> > -
> >  		pool_len = n_decay = vn->pool[i].len;
> >  		WRITE_ONCE(vn->pool[i].len, 0);
> > +		spin_unlock(&vn->pool_lock);
> >  
> >  		/* Decay a pool by ~25% out of left objects. */
> >  		if (!full_decay)
> > @@ -2259,8 +2258,14 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
> >  		 */
> >  		if (!list_empty(&tmp_list)) {
> >  			spin_lock(&vn->pool_lock);
> > -			list_replace_init(&tmp_list, &vn->pool[i].head);
> > -			WRITE_ONCE(vn->pool[i].len, pool_len);
> > +			/*
> > +			 * Merge leftover areas back into the pool rather than
> > +			 * replacing the whole list. A concurrent allocator can
> > +			 * repopulate vn->pool[i].head while we are decaying
> > +			 * tmp_list, and replacing would drop those nodes.
> > +			 */
> > +			list_splice_tail_init(&tmp_list, &vn->pool[i].head);
> > +			WRITE_ONCE(vn->pool[i].len, vn->pool[i].len + pool_len);
> >
> "A concurrent allocator can repopulate..." - Where is it done? Probably
> you meant something different.
> 
Actually decay_va_pool_node() is not designed to be called concurrently.
See the comment:

<snip>
/*
 * Attach the pool back if it has been partly decayed.
 * Please note, it is supposed that nobody(other contexts)
 * can populate the pool therefore a simple list replace
 * operation takes place here.
 */
<snip>

but after adding the shrinker it may be called concurrently to
free some memory, if high memory pressure occurs. This is not good
because we can lose added VAs by the __purge_vmap_area_lazy()
helper. So it might leak memory.

That problem has been introduced by the  "mm: vmalloc: add a shrinker
to drain vmap pools" patch.

IMO, the best what we should do is to follow the design reflected
by the comment. It implies that shrinker has to decay holding the
vmap_purge_lock mutex.

--
Uladzislau Rezki