[PATCH v1 5/6] memcg-v1: no need for memcg locking for MGLRU

Posted by Shakeel Butt 1 month ago
While updating the generation of the folios, MGLRU requires that the
folio's memcg association remains stable. With the charge migration
deprecated, there is no need for MGLRU to acquire locks to keep the
folio and memcg association stable.

Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
---
 mm/vmscan.c | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 29c098790b01..fd7171658b63 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3662,10 +3662,6 @@ static void walk_mm(struct mm_struct *mm, struct lru_gen_mm_walk *walk)
 		if (walk->seq != max_seq)
 			break;
 
-		/* folio_update_gen() requires stable folio_memcg() */
-		if (!mem_cgroup_trylock_pages(memcg))
-			break;
-
 		/* the caller might be holding the lock for write */
 		if (mmap_read_trylock(mm)) {
 			err = walk_page_range(mm, walk->next_addr, ULONG_MAX, &mm_walk_ops, walk);
@@ -3673,8 +3669,6 @@ static void walk_mm(struct mm_struct *mm, struct lru_gen_mm_walk *walk)
 			mmap_read_unlock(mm);
 		}
 
-		mem_cgroup_unlock_pages();
-
 		if (walk->batched) {
 			spin_lock_irq(&lruvec->lru_lock);
 			reset_batch_size(walk);
@@ -4096,10 +4090,6 @@ bool lru_gen_look_around(struct page_vma_mapped_walk *pvmw)
 		}
 	}
 
-	/* folio_update_gen() requires stable folio_memcg() */
-	if (!mem_cgroup_trylock_pages(memcg))
-		return true;
-
 	arch_enter_lazy_mmu_mode();
 
 	pte -= (addr - start) / PAGE_SIZE;
@@ -4144,7 +4134,6 @@ bool lru_gen_look_around(struct page_vma_mapped_walk *pvmw)
 	}
 
 	arch_leave_lazy_mmu_mode();
-	mem_cgroup_unlock_pages();
 
 	/* feedback from rmap walkers to page table walkers */
 	if (mm_state && suitable_to_scan(i, young))
-- 
2.43.5
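
For readers following the thread, here is a minimal user-space sketch of the control flow this patch removes. It is not kernel code: `model_memcg`, `model_rcu_depth`, `model_trylock_pages()`, `model_unlock_pages()` and the two `walk_*` functions are hypothetical stand-ins. It assumes, per the discussion below, that the trylock helper entered an RCU read-side section and failed while a charge move was in progress.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a memcg; only models "a charge move is running". */
struct model_memcg {
	bool charge_move_in_progress;
};

/* Models RCU read-side nesting depth (assumption, not a kernel counter). */
static int model_rcu_depth;

/* Models mem_cgroup_trylock_pages(): succeeds only if no charge move runs. */
static bool model_trylock_pages(const struct model_memcg *memcg)
{
	model_rcu_depth++;
	if (!memcg->charge_move_in_progress)
		return true;
	model_rcu_depth--;
	return false;
}

/* Models mem_cgroup_unlock_pages(): must pair with a successful trylock. */
static void model_unlock_pages(void)
{
	assert(model_rcu_depth > 0);
	model_rcu_depth--;
}

/* Before the patch: the walk is skipped if the association cannot be pinned. */
static int walk_before_patch(const struct model_memcg *memcg, int nr_folios)
{
	int scanned;

	if (!model_trylock_pages(memcg))
		return 0;		/* bail out, nothing scanned */
	scanned = nr_folios;		/* folios visited under the lock */
	model_unlock_pages();
	return scanned;
}

/* After the patch: no charge migration exists, so there is nothing to pin. */
static int walk_after_patch(int nr_folios)
{
	return nr_folios;
}
```

In this model the post-patch walk never bails out and never touches the RCU depth, which is the simplification the commit message describes.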
Re: [PATCH v1 5/6] memcg-v1: no need for memcg locking for MGLRU
Posted by Shakeel Butt 1 month ago
On Thu, Oct 24, 2024 at 06:23:02PM GMT, Shakeel Butt wrote:
> While updating the generation of the folios, MGLRU requires that the
> folio's memcg association remains stable. With the charge migration
> deprecated, there is no need for MGLRU to acquire locks to keep the
> folio and memcg association stable.
> 
> Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>

Andrew, can you please apply the following fix to this patch after your
unused fixup?


index fd7171658b63..b8b0e8fa1332 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3353,7 +3353,7 @@ static struct folio *get_pfn_folio(unsigned long pfn, struct mem_cgroup *memcg,
        if (folio_nid(folio) != pgdat->node_id)
                return NULL;

-       if (folio_memcg_rcu(folio) != memcg)
+       if (folio_memcg(folio) != memcg)
                return NULL;

        /* file VMAs can contain anon pages from COW */
Re: [PATCH v1 5/6] memcg-v1: no need for memcg locking for MGLRU
Posted by Yu Zhao 1 month ago
On Sat, Oct 26, 2024 at 12:34 AM Shakeel Butt <shakeel.butt@linux.dev> wrote:
>
> On Thu, Oct 24, 2024 at 06:23:02PM GMT, Shakeel Butt wrote:
> > While updating the generation of the folios, MGLRU requires that the
> > folio's memcg association remains stable. With the charge migration
> > deprecated, there is no need for MGLRU to acquire locks to keep the
> > folio and memcg association stable.
> >
> > Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
>
> Andrew, can you please apply the following fix to this patch after your
> unused fixup?

Thanks!

> index fd7171658b63..b8b0e8fa1332 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -3353,7 +3353,7 @@ static struct folio *get_pfn_folio(unsigned long pfn, struct mem_cgroup *memcg,
>         if (folio_nid(folio) != pgdat->node_id)
>                 return NULL;
>
> -       if (folio_memcg_rcu(folio) != memcg)
> +       if (folio_memcg(folio) != memcg)
>                 return NULL;
>
>         /* file VMAs can contain anon pages from COW */
>
>
Re: [PATCH v1 5/6] memcg-v1: no need for memcg locking for MGLRU
Posted by Yu Zhao 3 weeks ago
On Sat, Oct 26, 2024 at 09:26:04AM -0600, Yu Zhao wrote:
> On Sat, Oct 26, 2024 at 12:34 AM Shakeel Butt <shakeel.butt@linux.dev> wrote:
> >
> > On Thu, Oct 24, 2024 at 06:23:02PM GMT, Shakeel Butt wrote:
> > > While updating the generation of the folios, MGLRU requires that the
> > > folio's memcg association remains stable. With the charge migration
> > > deprecated, there is no need for MGLRU to acquire locks to keep the
> > > folio and memcg association stable.
> > >
> > > Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
> >
> > Andrew, can you please apply the following fix to this patch after your
> > unused fixup?
> 
> Thanks!

syzbot caught the following:

  WARNING: CPU: 0 PID: 85 at mm/vmscan.c:3140 folio_update_gen+0x23d/0x250 mm/vmscan.c:3140
  ...

Andrew, can you please fix this in place? Thank you.

diff --git a/mm/vmscan.c b/mm/vmscan.c
index ddaaff67642e..9a610dbff384 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3138,7 +3138,6 @@ static int folio_update_gen(struct folio *folio, int gen)
 	unsigned long new_flags, old_flags = READ_ONCE(folio->flags);
 
 	VM_WARN_ON_ONCE(gen >= MAX_NR_GENS);
-	VM_WARN_ON_ONCE(!rcu_read_lock_held());
 
 	do {
 		/* lru_gen_del_folio() has isolated this page? */
Re: [PATCH v1 5/6] memcg-v1: no need for memcg locking for MGLRU
Posted by Andrew Morton 2 weeks, 6 days ago
On Mon, 4 Nov 2024 10:30:29 -0700 Yu Zhao <yuzhao@google.com> wrote:

> On Sat, Oct 26, 2024 at 09:26:04AM -0600, Yu Zhao wrote:
> > On Sat, Oct 26, 2024 at 12:34 AM Shakeel Butt <shakeel.butt@linux.dev> wrote:
> > >
> > > On Thu, Oct 24, 2024 at 06:23:02PM GMT, Shakeel Butt wrote:
> > > > While updating the generation of the folios, MGLRU requires that the
> > > > folio's memcg association remains stable. With the charge migration
> > > > deprecated, there is no need for MGLRU to acquire locks to keep the
> > > > folio and memcg association stable.
> > > >
> > > > Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
> > >
> > > Andrew, can you please apply the following fix to this patch after your
> > > unused fixup?
> > 
> > Thanks!
> 
> syzbot caught the following:
> 
>   WARNING: CPU: 0 PID: 85 at mm/vmscan.c:3140 folio_update_gen+0x23d/0x250 mm/vmscan.c:3140
>   ...
> 
> Andrew, can you please fix this in place?

OK, but...

> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -3138,7 +3138,6 @@ static int folio_update_gen(struct folio *folio, int gen)
>  	unsigned long new_flags, old_flags = READ_ONCE(folio->flags);
>  
>  	VM_WARN_ON_ONCE(gen >= MAX_NR_GENS);
> -	VM_WARN_ON_ONCE(!rcu_read_lock_held());
>  
>  	do {
>  		/* lru_gen_del_folio() has isolated this page? */

it would be good to know why this assertion is considered incorrect?
And a link to the syzbot report?

Thanks.
Re: [PATCH v1 5/6] memcg-v1: no need for memcg locking for MGLRU
Posted by Yu Zhao 2 weeks, 6 days ago
On Mon, Nov 4, 2024 at 2:38 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Mon, 4 Nov 2024 10:30:29 -0700 Yu Zhao <yuzhao@google.com> wrote:
>
> > On Sat, Oct 26, 2024 at 09:26:04AM -0600, Yu Zhao wrote:
> > > On Sat, Oct 26, 2024 at 12:34 AM Shakeel Butt <shakeel.butt@linux.dev> wrote:
> > > >
> > > > On Thu, Oct 24, 2024 at 06:23:02PM GMT, Shakeel Butt wrote:
> > > > > While updating the generation of the folios, MGLRU requires that the
> > > > > folio's memcg association remains stable. With the charge migration
> > > > > deprecated, there is no need for MGLRU to acquire locks to keep the
> > > > > folio and memcg association stable.
> > > > >
> > > > > Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
> > > >
> > > > Andrew, can you please apply the following fix to this patch after your
> > > > unused fixup?
> > >
> > > Thanks!
> >
> > syzbot caught the following:
> >
> >   WARNING: CPU: 0 PID: 85 at mm/vmscan.c:3140 folio_update_gen+0x23d/0x250 mm/vmscan.c:3140
> >   ...
> >
> > Andrew, can you please fix this in place?
>
> OK, but...
>
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -3138,7 +3138,6 @@ static int folio_update_gen(struct folio *folio, int gen)
> >       unsigned long new_flags, old_flags = READ_ONCE(folio->flags);
> >
> >       VM_WARN_ON_ONCE(gen >= MAX_NR_GENS);
> > -     VM_WARN_ON_ONCE(!rcu_read_lock_held());
> >
> >       do {
> >               /* lru_gen_del_folio() has isolated this page? */
>
> it would be good to know why this assertion is considered incorrect?

The warning was triggered by the patch in this thread. The assertion
used to check that a folio is protected from charge migration. Charge
migration is removed by this series, and as part of the effort, this
patch removes the RCU read lock that the assertion checked for.

> And a link to the syzbot report?

https://syzkaller.appspot.com/bug?extid=24f45b8beab9788e467e
Re: [PATCH v1 5/6] memcg-v1: no need for memcg locking for MGLRU
Posted by Yu Zhao 2 weeks, 6 days ago
On Mon, Nov 4, 2024 at 3:04 PM Yu Zhao <yuzhao@google.com> wrote:
>
> On Mon, Nov 4, 2024 at 2:38 PM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > On Mon, 4 Nov 2024 10:30:29 -0700 Yu Zhao <yuzhao@google.com> wrote:
> >
> > > On Sat, Oct 26, 2024 at 09:26:04AM -0600, Yu Zhao wrote:
> > > > On Sat, Oct 26, 2024 at 12:34 AM Shakeel Butt <shakeel.butt@linux.dev> wrote:
> > > > >
> > > > > On Thu, Oct 24, 2024 at 06:23:02PM GMT, Shakeel Butt wrote:
> > > > > > While updating the generation of the folios, MGLRU requires that the
> > > > > > folio's memcg association remains stable. With the charge migration
> > > > > > deprecated, there is no need for MGLRU to acquire locks to keep the
> > > > > > folio and memcg association stable.
> > > > > >
> > > > > > Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
> > > > >
> > > > > Andrew, can you please apply the following fix to this patch after your
> > > > > unused fixup?
> > > >
> > > > Thanks!
> > >
> > > syzbot caught the following:
> > >
> > >   WARNING: CPU: 0 PID: 85 at mm/vmscan.c:3140 folio_update_gen+0x23d/0x250 mm/vmscan.c:3140
> > >   ...
> > >
> > > Andrew, can you please fix this in place?
> >
> > OK, but...
> >
> > > --- a/mm/vmscan.c
> > > +++ b/mm/vmscan.c
> > > @@ -3138,7 +3138,6 @@ static int folio_update_gen(struct folio *folio, int gen)
> > >       unsigned long new_flags, old_flags = READ_ONCE(folio->flags);
> > >
> > >       VM_WARN_ON_ONCE(gen >= MAX_NR_GENS);
> > > -     VM_WARN_ON_ONCE(!rcu_read_lock_held());
> > >
> > >       do {
> > >               /* lru_gen_del_folio() has isolated this page? */
> >
> > it would be good to know why this assertion is considered incorrect?
>
> The assertion was caused by the patch in this thread. It used to
> assert that a folio must be protected from charge migration. Charge
> migration is removed by this series, and as part of the effort, this
> patch removes the RCU lock.
>
> > And a link to the syzbot report?
>
> https://syzkaller.appspot.com/bug?extid=24f45b8beab9788e467e

Or this link would work better:

https://lore.kernel.org/lkml/67294349.050a0220.701a.0010.GAE@google.com/
Re: [PATCH v1 5/6] memcg-v1: no need for memcg locking for MGLRU
Posted by Andrew Morton 2 weeks, 6 days ago
On Mon, 4 Nov 2024 15:08:09 -0700 Yu Zhao <yuzhao@google.com> wrote:

> > The assertion was caused by the patch in this thread. It used to
> > assert that a folio must be protected from charge migration. Charge
> > migration is removed by this series, and as part of the effort, this
> > patch removes the RCU lock.
> >
> > > And a link to the syzbot report?
> >
> > https://syzkaller.appspot.com/bug?extid=24f45b8beab9788e467e
> 
> Or this link would work better:
> 
> https://lore.kernel.org/lkml/67294349.050a0220.701a.0010.GAE@google.com/

Thanks, I pasted everyone's everything in there, so it will all be
accessible by the sufficiently patient.
Re: [PATCH v1 5/6] memcg-v1: no need for memcg locking for MGLRU
Posted by Shakeel Butt 2 weeks, 6 days ago
On Mon, Nov 04, 2024 at 01:38:34PM -0800, Andrew Morton wrote:
> On Mon, 4 Nov 2024 10:30:29 -0700 Yu Zhao <yuzhao@google.com> wrote:
> 
> > On Sat, Oct 26, 2024 at 09:26:04AM -0600, Yu Zhao wrote:
> > > On Sat, Oct 26, 2024 at 12:34 AM Shakeel Butt <shakeel.butt@linux.dev> wrote:
> > > >
> > > > On Thu, Oct 24, 2024 at 06:23:02PM GMT, Shakeel Butt wrote:
> > > > > While updating the generation of the folios, MGLRU requires that the
> > > > > folio's memcg association remains stable. With the charge migration
> > > > > deprecated, there is no need for MGLRU to acquire locks to keep the
> > > > > folio and memcg association stable.
> > > > >
> > > > > Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
> > > >
> > > > Andrew, can you please apply the following fix to this patch after your
> > > > unused fixup?
> > > 
> > > Thanks!
> > 
> > syzbot caught the following:
> > 
> >   WARNING: CPU: 0 PID: 85 at mm/vmscan.c:3140 folio_update_gen+0x23d/0x250 mm/vmscan.c:3140
> >   ...
> > 
> > Andrew, can you please fix this in place?
> 
> OK, but...
> 
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -3138,7 +3138,6 @@ static int folio_update_gen(struct folio *folio, int gen)
> >  	unsigned long new_flags, old_flags = READ_ONCE(folio->flags);
> >  
> >  	VM_WARN_ON_ONCE(gen >= MAX_NR_GENS);
> > -	VM_WARN_ON_ONCE(!rcu_read_lock_held());
> >  
> >  	do {
> >  		/* lru_gen_del_folio() has isolated this page? */
> 
> it would be good to know why this assertion is considered incorrect?
> And a link to the syzbot report?

So, this assertion is incorrect after this patch series, which has
removed the charge migration and has removed mem_cgroup_trylock_pages() /
mem_cgroup_unlock_pages() from the callers of this function
(folio_update_gen()).
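
The point above can be illustrated with a small user-space sketch. It is not kernel code: `model_rcu_depth`, `model_warnings`, `model_update_gen()` and the two caller functions are hypothetical names. It assumes only what the thread states, namely that the removed lock helpers held the RCU read lock around the call and that the assertion checked for it.

```c
#include <assert.h>
#include <stdbool.h>

static int model_rcu_depth;	/* models RCU read-side nesting */
static int model_warnings;	/* counts how often the assertion fired */

/* Models folio_update_gen(); keep_assertion selects the old or new body. */
static void model_update_gen(bool keep_assertion)
{
	/* Stands in for VM_WARN_ON_ONCE(!rcu_read_lock_held()). */
	if (keep_assertion && model_rcu_depth == 0)
		model_warnings++;
	/* The generation update itself is omitted from this model. */
}

/* Caller before the patch: the lock helpers bracket the update. */
static void caller_before_patch(void)
{
	model_rcu_depth++;		/* taken by the trylock helper */
	model_update_gen(true);
	assert(model_rcu_depth > 0);
	model_rcu_depth--;		/* dropped by the unlock helper */
}

/* Caller after the patch: no lock helpers around the update. */
static void caller_after_patch(bool keep_assertion)
{
	model_update_gen(keep_assertion);
}
```

Calling `caller_after_patch(true)` increments `model_warnings`, mirroring the syzbot warning, while `caller_after_patch(false)` does not, mirroring the in-place fix that drops the assertion.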
Re: [PATCH v1 5/6] memcg-v1: no need for memcg locking for MGLRU
Posted by Yu Zhao 1 month ago
On Thu, Oct 24, 2024 at 7:23 PM Shakeel Butt <shakeel.butt@linux.dev> wrote:
>
> While updating the generation of the folios, MGLRU requires that the
> folio's memcg association remains stable. With the charge migration
> deprecated, there is no need for MGLRU to acquire locks to keep the
> folio and memcg association stable.
>
> Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
> ---
>  mm/vmscan.c | 11 -----------
>  1 file changed, 11 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 29c098790b01..fd7171658b63 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -3662,10 +3662,6 @@ static void walk_mm(struct mm_struct *mm, struct lru_gen_mm_walk *walk)
>                 if (walk->seq != max_seq)
>                         break;

Please remove the lingering `struct mem_cgroup *memcg` as well as
folio_memcg_rcu(). Otherwise it causes both build and lockdep
warnings.

> -               /* folio_update_gen() requires stable folio_memcg() */
> -               if (!mem_cgroup_trylock_pages(memcg))
> -                       break;
> -
>                 /* the caller might be holding the lock for write */
>                 if (mmap_read_trylock(mm)) {
>                         err = walk_page_range(mm, walk->next_addr, ULONG_MAX, &mm_walk_ops, walk);
> @@ -3673,8 +3669,6 @@ static void walk_mm(struct mm_struct *mm, struct lru_gen_mm_walk *walk)
>                         mmap_read_unlock(mm);
>                 }
>
> -               mem_cgroup_unlock_pages();
> -
>                 if (walk->batched) {
>                         spin_lock_irq(&lruvec->lru_lock);
>                         reset_batch_size(walk);
> @@ -4096,10 +4090,6 @@ bool lru_gen_look_around(struct page_vma_mapped_walk *pvmw)
>                 }
>         }
>
> -       /* folio_update_gen() requires stable folio_memcg() */
> -       if (!mem_cgroup_trylock_pages(memcg))
> -               return true;
> -
>         arch_enter_lazy_mmu_mode();
>
>         pte -= (addr - start) / PAGE_SIZE;
> @@ -4144,7 +4134,6 @@ bool lru_gen_look_around(struct page_vma_mapped_walk *pvmw)
>         }
>
>         arch_leave_lazy_mmu_mode();
> -       mem_cgroup_unlock_pages();
>
>         /* feedback from rmap walkers to page table walkers */
>         if (mm_state && suitable_to_scan(i, young))
> --
> 2.43.5
>
>
Re: [PATCH v1 5/6] memcg-v1: no need for memcg locking for MGLRU
Posted by Shakeel Butt 1 month ago
On Fri, Oct 25, 2024 at 09:55:38PM GMT, Yu Zhao wrote:
> On Thu, Oct 24, 2024 at 7:23 PM Shakeel Butt <shakeel.butt@linux.dev> wrote:
> >
> > While updating the generation of the folios, MGLRU requires that the
> > folio's memcg association remains stable. With the charge migration
> > deprecated, there is no need for MGLRU to acquire locks to keep the
> > folio and memcg association stable.
> >
> > Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
> > ---
> >  mm/vmscan.c | 11 -----------
> >  1 file changed, 11 deletions(-)
> >
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 29c098790b01..fd7171658b63 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -3662,10 +3662,6 @@ static void walk_mm(struct mm_struct *mm, struct lru_gen_mm_walk *walk)
> >                 if (walk->seq != max_seq)
> >                         break;
> 
> Please remove the lingering `struct mem_cgroup *memcg` as well as
> folio_memcg_rcu(). Otherwise it causes both build and lockdep
> warnings.
> 

Thanks for catching this. Andrew has already fixed the unused-variable
warning; I will fix the folio_memcg_rcu() usage.
Re: [PATCH v1 5/6] memcg-v1: no need for memcg locking for MGLRU
Posted by Roman Gushchin 1 month ago
On Thu, Oct 24, 2024 at 06:23:02PM -0700, Shakeel Butt wrote:
> While updating the generation of the folios, MGLRU requires that the
> folio's memcg association remains stable. With the charge migration
> deprecated, there is no need for MGLRU to acquire locks to keep the
> folio and memcg association stable.
> 
> Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>

Reviewed-by: Roman Gushchin <roman.gushchin@linux.dev>