[PATCH v14 67/70] mm/vmscan: Use vma iterator instead of vm_next

Liam Howlett posted 70 patches 3 years, 7 months ago
[PATCH v14 67/70] mm/vmscan: Use vma iterator instead of vm_next
Posted by Liam Howlett 3 years, 7 months ago
Use the vma iterator in get_next_vma() instead of the linked list.

Suggested-by: Yu Zhao <yuzhao@google.com>
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
---
 mm/vmscan.c | 15 +++------------
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 340b8a624f57..bb3256d07a43 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3776,23 +3776,14 @@ static bool get_next_vma(unsigned long mask, unsigned long size, struct mm_walk
 {
 	unsigned long start = round_up(*vm_end, size);
 	unsigned long end = (start | ~mask) + 1;
+	VMA_ITERATOR(vmi, args->mm, start);
 
 	VM_WARN_ON_ONCE(mask & size);
 	VM_WARN_ON_ONCE((start & mask) != (*vm_start & mask));
 
-	while (args->vma) {
-		if (start >= args->vma->vm_end) {
-			args->vma = args->vma->vm_next;
+	for_each_vma_range(vmi, args->vma, end) {
+		if (should_skip_vma(args->vma->vm_start, args->vma->vm_end, args))
 			continue;
-		}
-
-		if (end && end <= args->vma->vm_start)
-			return false;
-
-		if (should_skip_vma(args->vma->vm_start, args->vma->vm_end, args)) {
-			args->vma = args->vma->vm_next;
-			continue;
-		}
 
 		*vm_start = max(start, args->vma->vm_start);
 		*vm_end = min(end - 1, args->vma->vm_end - 1) + 1;
-- 
2.35.1
Re: [PATCH v14 67/70] mm/vmscan: Use vma iterator instead of vm_next
Posted by Yu Zhao 3 years, 6 months ago
On Tue, Sep 06, 2022 at 07:49:05PM +0000, Liam Howlett wrote:
> Use the vma iterator in get_next_vma() instead of the linked list.
> 
> Suggested-by: Yu Zhao <yuzhao@google.com>

Apologies for the bad suggestion.

> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -3776,23 +3776,14 @@ static bool get_next_vma(unsigned long mask, unsigned long size, struct mm_walk
>  {
>  	unsigned long start = round_up(*vm_end, size);
>  	unsigned long end = (start | ~mask) + 1;
> +	VMA_ITERATOR(vmi, args->mm, start);
>  
>  	VM_WARN_ON_ONCE(mask & size);
>  	VM_WARN_ON_ONCE((start & mask) != (*vm_start & mask));
>  
> -	while (args->vma) {
> -		if (start >= args->vma->vm_end) {
> -			args->vma = args->vma->vm_next;
> +	for_each_vma_range(vmi, args->vma, end) {
> +		if (should_skip_vma(args->vma->vm_start, args->vma->vm_end, args))
>  			continue;
> -		}
> -
> -		if (end && end <= args->vma->vm_start)
> -			return false;

Here the original code leaves args->vma pointing to the first vma outside
the range [start, end). This allows the caller (page table walker) to
resume at that vma, if it chooses to.

With for_each_vma_range(), under the same condition, args->vma is set to
NULL, and the page table walker may terminate prematurely. Apparently I
overlooked this until I was told yesterday that MGLRU in mm-unstable is
slower than it is on 6.0-rc4.
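
To make the difference concrete, here is a minimal userspace sketch, not
kernel code. It assumes a toy address space of three sorted VMAs; struct
toy_vma, toy_vmas[], bounded_walk() and list_walk() are hypothetical names
invented for illustration. bounded_walk() models how a range-bounded
iterator hands back NULL once the next VMA starts at or above end, while
list_walk() models the old linked-list loop, which stops at the same place
but keeps the cursor.

```c
#include <stddef.h>

/* Toy stand-in for struct vm_area_struct: covers [vm_start, vm_end). */
struct toy_vma {
	unsigned long vm_start;
	unsigned long vm_end;
};

/* A sorted, non-overlapping toy address space (hypothetical values). */
static struct toy_vma toy_vmas[] = {
	{ 0x1000, 0x2000 },
	{ 0x5000, 0x6000 },
	{ 0x9000, 0xa000 },
};

#define NR_TOY_VMAS (sizeof(toy_vmas) / sizeof(toy_vmas[0]))

/*
 * Models a bounded walk: yields the first VMA that ends above start, but
 * only if it begins below end. Otherwise the cursor comes back NULL, even
 * though more VMAs exist above end.
 */
struct toy_vma *bounded_walk(unsigned long start, unsigned long end)
{
	size_t i;

	for (i = 0; i < NR_TOY_VMAS; i++) {
		if (toy_vmas[i].vm_end <= start)
			continue;
		if (toy_vmas[i].vm_start >= end)
			return NULL;
		return &toy_vmas[i];
	}

	return NULL;
}

/*
 * Models the old linked-list walk: returns the first VMA that ends above
 * start and reports through *in_range whether it begins below end. When
 * it does not, the cursor still points at that VMA.
 */
struct toy_vma *list_walk(unsigned long start, unsigned long end, int *in_range)
{
	size_t i;

	for (i = 0; i < NR_TOY_VMAS; i++) {
		if (toy_vmas[i].vm_end <= start)
			continue;
		*in_range = toy_vmas[i].vm_start < end;
		return &toy_vmas[i];
	}

	*in_range = 0;
	return NULL;
}
```

For the empty range [0x2000, 0x4000), bounded_walk() returns NULL, whereas
list_walk() returns the VMA at 0x5000 with *in_range cleared, so a caller
can tell "nothing in this range" apart from "nothing left at all".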

> -
> -		if (should_skip_vma(args->vma->vm_start, args->vma->vm_end, args)) {
> -			args->vma = args->vma->vm_next;
> -			continue;
> -		}
>  
>  		*vm_start = max(start, args->vma->vm_start);
>  		*vm_end = min(end - 1, args->vma->vm_end - 1) + 1;

The following should work properly. Please take a look. Thanks!

---
 mm/vmscan.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 11a86d47e85e..b22d3efe3031 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3776,23 +3776,17 @@ static bool get_next_vma(unsigned long mask, unsigned long size, struct mm_walk
 {
 	unsigned long start = round_up(*vm_end, size);
 	unsigned long end = (start | ~mask) + 1;
+	VMA_ITERATOR(vmi, args->mm, start);
 
 	VM_WARN_ON_ONCE(mask & size);
 	VM_WARN_ON_ONCE((start & mask) != (*vm_start & mask));
 
-	while (args->vma) {
-		if (start >= args->vma->vm_end) {
-			args->vma = args->vma->vm_next;
-			continue;
-		}
-
+	for_each_vma(vmi, args->vma) {
 		if (end && end <= args->vma->vm_start)
 			return false;
 
-		if (should_skip_vma(args->vma->vm_start, args->vma->vm_end, args)) {
-			args->vma = args->vma->vm_next;
+		if (should_skip_vma(args->vma->vm_start, args->vma->vm_end, args))
 			continue;
-		}
 
 		*vm_start = max(start, args->vma->vm_start);
 		*vm_end = min(end - 1, args->vma->vm_end - 1) + 1;
-- 
2.37.2.789.g6183377224-goog
Re: [PATCH v14 67/70] mm/vmscan: Use vma iterator instead of vm_next
Posted by Yu Zhao 3 years, 6 months ago
On Mon, Sep 12, 2022 at 12:55:08AM -0600, Yu Zhao wrote:
> On Tue, Sep 06, 2022 at 07:49:05PM +0000, Liam Howlett wrote:
> > Use the vma iterator in get_next_vma() instead of the linked list.
> > 
> > Suggested-by: Yu Zhao <yuzhao@google.com>
> 
> Apologies for the bad suggestion.
> 
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -3776,23 +3776,14 @@ static bool get_next_vma(unsigned long mask, unsigned long size, struct mm_walk
> >  {
> >  	unsigned long start = round_up(*vm_end, size);
> >  	unsigned long end = (start | ~mask) + 1;
> > +	VMA_ITERATOR(vmi, args->mm, start);
> >  
> >  	VM_WARN_ON_ONCE(mask & size);
> >  	VM_WARN_ON_ONCE((start & mask) != (*vm_start & mask));
> >  
> > -	while (args->vma) {
> > -		if (start >= args->vma->vm_end) {
> > -			args->vma = args->vma->vm_next;
> > +	for_each_vma_range(vmi, args->vma, end) {
> > +		if (should_skip_vma(args->vma->vm_start, args->vma->vm_end, args))
> >  			continue;
> > -		}
> > -
> > -		if (end && end <= args->vma->vm_start)
> > -			return false;
> 
> Here the original code leaves args->vma pointing to the first vma outside
> the range [start, end). This allows the caller (page table walker) to
> resume at that vma, if it chooses to.
  ^^^^^^ continue (without releasing mmap_lock)

> With for_each_vma_range(), under the same condition, args->vma is set to
> NULL, and the page table walker may terminate prematurely. Apparently I
> overlooked this until I was told yesterday that MGLRU in mm-unstable is
> slower than it is on 6.0-rc4.
> 
> > -
> > -		if (should_skip_vma(args->vma->vm_start, args->vma->vm_end, args)) {
> > -			args->vma = args->vma->vm_next;
> > -			continue;
> > -		}
> >  
> >  		*vm_start = max(start, args->vma->vm_start);
> >  		*vm_end = min(end - 1, args->vma->vm_end - 1) + 1;
> 
> The following should work properly. Please take a look. Thanks!
> 
> ---
>  mm/vmscan.c | 12 +++---------
>  1 file changed, 3 insertions(+), 9 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 11a86d47e85e..b22d3efe3031 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -3776,23 +3776,17 @@ static bool get_next_vma(unsigned long mask, unsigned long size, struct mm_walk
>  {
>  	unsigned long start = round_up(*vm_end, size);
>  	unsigned long end = (start | ~mask) + 1;
> +	VMA_ITERATOR(vmi, args->mm, start);
>  
>  	VM_WARN_ON_ONCE(mask & size);
>  	VM_WARN_ON_ONCE((start & mask) != (*vm_start & mask));
>  
> -	while (args->vma) {
> -		if (start >= args->vma->vm_end) {
> -			args->vma = args->vma->vm_next;
> -			continue;
> -		}
> -
> +	for_each_vma(vmi, args->vma) {
>  		if (end && end <= args->vma->vm_start)
>  			return false;
>  
> -		if (should_skip_vma(args->vma->vm_start, args->vma->vm_end, args)) {
> -			args->vma = args->vma->vm_next;
> +		if (should_skip_vma(args->vma->vm_start, args->vma->vm_end, args))
>  			continue;
> -		}
>  
>  		*vm_start = max(start, args->vma->vm_start);
>  		*vm_end = min(end - 1, args->vma->vm_end - 1) + 1;
> -- 
> 2.37.2.789.g6183377224-goog
Re: [PATCH v14 67/70] mm/vmscan: Use vma iterator instead of vm_next
Posted by Liam Howlett 3 years, 6 months ago
* Yu Zhao <yuzhao@google.com> [220912 02:55]:
> On Tue, Sep 06, 2022 at 07:49:05PM +0000, Liam Howlett wrote:
> > Use the vma iterator in get_next_vma() instead of the linked list.
> > 
> > Suggested-by: Yu Zhao <yuzhao@google.com>
> 
> Apologies for the bad suggestion.
> 
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -3776,23 +3776,14 @@ static bool get_next_vma(unsigned long mask, unsigned long size, struct mm_walk
> >  {
> >  	unsigned long start = round_up(*vm_end, size);
> >  	unsigned long end = (start | ~mask) + 1;
> > +	VMA_ITERATOR(vmi, args->mm, start);
> >  
> >  	VM_WARN_ON_ONCE(mask & size);
> >  	VM_WARN_ON_ONCE((start & mask) != (*vm_start & mask));
> >  
> > -	while (args->vma) {
> > -		if (start >= args->vma->vm_end) {
> > -			args->vma = args->vma->vm_next;
> > +	for_each_vma_range(vmi, args->vma, end) {
> > +		if (should_skip_vma(args->vma->vm_start, args->vma->vm_end, args))
> >  			continue;
> > -		}
> > -
> > -		if (end && end <= args->vma->vm_start)
> > -			return false;
> 
> Here the original code leaves args->vma pointing to the first vma outside
> the range [start, end). This allows the caller (page table walker) to
> resume at that vma, if it chooses to.
> 
> With for_each_vma_range(), under the same condition, args->vma is set to
> NULL, and the page table walker may terminate prematurely. Apparently I
> overlooked this until I was told yesterday that MGLRU in mm-unstable is
> slower than it is on 6.0-rc4.
> 
> > -
> > -		if (should_skip_vma(args->vma->vm_start, args->vma->vm_end, args)) {
> > -			args->vma = args->vma->vm_next;
> > -			continue;
> > -		}
> >  
> >  		*vm_start = max(start, args->vma->vm_start);
> >  		*vm_end = min(end - 1, args->vma->vm_end - 1) + 1;
> 
> The following should work properly. Please take a look. Thanks!
> 

Thanks Yu.  This looks good to me and the explanation makes sense.

> ---
>  mm/vmscan.c | 12 +++---------
>  1 file changed, 3 insertions(+), 9 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 11a86d47e85e..b22d3efe3031 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -3776,23 +3776,17 @@ static bool get_next_vma(unsigned long mask, unsigned long size, struct mm_walk
>  {
>  	unsigned long start = round_up(*vm_end, size);
>  	unsigned long end = (start | ~mask) + 1;
> +	VMA_ITERATOR(vmi, args->mm, start);
>  
>  	VM_WARN_ON_ONCE(mask & size);
>  	VM_WARN_ON_ONCE((start & mask) != (*vm_start & mask));
>  
> -	while (args->vma) {
> -		if (start >= args->vma->vm_end) {
> -			args->vma = args->vma->vm_next;
> -			continue;
> -		}
> -
> +	for_each_vma(vmi, args->vma) {
>  		if (end && end <= args->vma->vm_start)
>  			return false;
>  
> -		if (should_skip_vma(args->vma->vm_start, args->vma->vm_end, args)) {
> -			args->vma = args->vma->vm_next;
> +		if (should_skip_vma(args->vma->vm_start, args->vma->vm_end, args))
>  			continue;
> -		}
>  
>  		*vm_start = max(start, args->vma->vm_start);
>  		*vm_end = min(end - 1, args->vma->vm_end - 1) + 1;
> -- 
> 2.37.2.789.g6183377224-goog
Re: [PATCH v14 67/70] mm/vmscan: Use vma iterator instead of vm_next
Posted by Andrew Morton 3 years, 6 months ago
On Mon, 12 Sep 2022 00:55:08 -0600 Yu Zhao <yuzhao@google.com> wrote:

> 
> The following should work properly. Please take a look. Thanks!
> 
> ---
>  mm/vmscan.c | 12 +++---------
>  1 file changed, 3 insertions(+), 9 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 11a86d47e85e..b22d3efe3031 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -3776,23 +3776,17 @@ static bool get_next_vma(unsigned long mask, unsigned long size, struct mm_walk
>  {
>  	unsigned long start = round_up(*vm_end, size);
>  	unsigned long end = (start | ~mask) + 1;
> +	VMA_ITERATOR(vmi, args->mm, start);
>  
>  	VM_WARN_ON_ONCE(mask & size);
>  	VM_WARN_ON_ONCE((start & mask) != (*vm_start & mask));
>  
> -	while (args->vma) {
> -		if (start >= args->vma->vm_end) {
> -			args->vma = args->vma->vm_next;
> -			continue;
> -		}
> -
> +	for_each_vma(vmi, args->vma) {
>  		if (end && end <= args->vma->vm_start)
>  			return false;
>  
> -		if (should_skip_vma(args->vma->vm_start, args->vma->vm_end, args)) {
> -			args->vma = args->vma->vm_next;
> +		if (should_skip_vma(args->vma->vm_start, args->vma->vm_end, args))
>  			continue;
> -		}
>  
>  		*vm_start = max(start, args->vma->vm_start);
>  		*vm_end = min(end - 1, args->vma->vm_end - 1) + 1;

What does this apply to?  It's almost what is in mm-unstable/linux-next
at present?

static bool get_next_vma(unsigned long mask, unsigned long size, struct mm_walk *args,
			 unsigned long *vm_start, unsigned long *vm_end)
{
	unsigned long start = round_up(*vm_end, size);
	unsigned long end = (start | ~mask) + 1;
	VMA_ITERATOR(vmi, args->mm, start);

	VM_WARN_ON_ONCE(mask & size);
	VM_WARN_ON_ONCE((start & mask) != (*vm_start & mask));

	for_each_vma_range(vmi, args->vma, end) {
		if (should_skip_vma(args->vma->vm_start, args->vma->vm_end, args))
			continue;

		*vm_start = max(start, args->vma->vm_start);
		*vm_end = min(end - 1, args->vma->vm_end - 1) + 1;

		return true;
	}

	return false;
}

for_each_vma_range versus for_each_vma.
Re: [PATCH v14 67/70] mm/vmscan: Use vma iterator instead of vm_next
Posted by Yu Zhao 3 years, 6 months ago
On Mon, Sep 12, 2022 at 12:45:59PM -0700, Andrew Morton wrote:
> On Mon, 12 Sep 2022 00:55:08 -0600 Yu Zhao <yuzhao@google.com> wrote:
> 
> > 
> > The following should work properly. Please take a look. Thanks!
> > 
> > ---
> >  mm/vmscan.c | 12 +++---------
> >  1 file changed, 3 insertions(+), 9 deletions(-)
> > 
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 11a86d47e85e..b22d3efe3031 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -3776,23 +3776,17 @@ static bool get_next_vma(unsigned long mask, unsigned long size, struct mm_walk
> >  {
> >  	unsigned long start = round_up(*vm_end, size);
> >  	unsigned long end = (start | ~mask) + 1;
> > +	VMA_ITERATOR(vmi, args->mm, start);
> >  
> >  	VM_WARN_ON_ONCE(mask & size);
> >  	VM_WARN_ON_ONCE((start & mask) != (*vm_start & mask));
> >  
> > -	while (args->vma) {
> > -		if (start >= args->vma->vm_end) {
> > -			args->vma = args->vma->vm_next;
> > -			continue;
> > -		}
> > -
> > +	for_each_vma(vmi, args->vma) {
> >  		if (end && end <= args->vma->vm_start)
> >  			return false;
> >  
> > -		if (should_skip_vma(args->vma->vm_start, args->vma->vm_end, args)) {
> > -			args->vma = args->vma->vm_next;
> > +		if (should_skip_vma(args->vma->vm_start, args->vma->vm_end, args))
> >  			continue;
> > -		}
> >  
> >  		*vm_start = max(start, args->vma->vm_start);
> >  		*vm_end = min(end - 1, args->vma->vm_end - 1) + 1;
> 
> What does this apply to?

The above replaces the original patch in mm-unstable.

> It's almost what is in mm-unstable/linux-next
> at present?

Yes, almost.

> static bool get_next_vma(unsigned long mask, unsigned long size, struct mm_walk *args,
> 			 unsigned long *vm_start, unsigned long *vm_end)
> {
> 	unsigned long start = round_up(*vm_end, size);
> 	unsigned long end = (start | ~mask) + 1;
> 	VMA_ITERATOR(vmi, args->mm, start);
> 
> 	VM_WARN_ON_ONCE(mask & size);
> 	VM_WARN_ON_ONCE((start & mask) != (*vm_start & mask));
> 
> 	for_each_vma_range(vmi, args->vma, end) {
> 		if (should_skip_vma(args->vma->vm_start, args->vma->vm_end, args))
> 			continue;
> 
> 		*vm_start = max(start, args->vma->vm_start);
> 		*vm_end = min(end - 1, args->vma->vm_end - 1) + 1;
> 
> 		return true;
> 	}
> 
> 	return false;
> }
> 
> for_each_vma_range versus for_each_vma.

The diff between the original patch and this one, in case you prefer to
fix it atop rather than amend.

diff --git a/mm/vmscan.c b/mm/vmscan.c
index a7c5d15c1618..cadcc3290918 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3776,7 +3776,10 @@ static bool get_next_vma(unsigned long mask, unsigned long size, struct mm_walk
 	VM_WARN_ON_ONCE(mask & size);
 	VM_WARN_ON_ONCE((start & mask) != (*vm_start & mask));
 
-	for_each_vma_range(vmi, args->vma, end) {
+	for_each_vma(vmi, args->vma) {
+		if (end && end <= args->vma->vm_start)
+			return false;
+
 		if (should_skip_vma(args->vma->vm_start, args->vma->vm_end, args))
 			continue;
Re: [PATCH v14 67/70] mm/vmscan: Use vma iterator instead of vm_next
Posted by Andrew Morton 3 years, 6 months ago
On Mon, 12 Sep 2022 14:01:28 -0600 Yu Zhao <yuzhao@google.com> wrote:

> The diff between the original patch and this one, in case you prefer to
> fix it atop rather than amend.

Always...

> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index a7c5d15c1618..cadcc3290918 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -3776,7 +3776,10 @@ static bool get_next_vma(unsigned long mask, unsigned long size, struct mm_walk
>  	VM_WARN_ON_ONCE(mask & size);
>  	VM_WARN_ON_ONCE((start & mask) != (*vm_start & mask));
>  
> -	for_each_vma_range(vmi, args->vma, end) {
> +	for_each_vma(vmi, args->vma) {
> +		if (end && end <= args->vma->vm_start)
> +			return false;
> +
>  		if (should_skip_vma(args->vma->vm_start, args->vma->vm_end, args))
>  			continue;

Thanks.

I added your signoff so I don't get a nastygram from Stephen in the
morning.  Please send along a suitable brief changelog?
Re: [PATCH v14 67/70] mm/vmscan: Use vma iterator instead of vm_next
Posted by Yu Zhao 3 years, 6 months ago
On Mon, Sep 12, 2022 at 3:03 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Mon, 12 Sep 2022 14:01:28 -0600 Yu Zhao <yuzhao@google.com> wrote:
>
> > The diff between the original patch and this one, in case you prefer to
> > fix it atop rather than amend.
>
> Always...
>
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index a7c5d15c1618..cadcc3290918 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -3776,7 +3776,10 @@ static bool get_next_vma(unsigned long mask, unsigned long size, struct mm_walk
> >       VM_WARN_ON_ONCE(mask & size);
> >       VM_WARN_ON_ONCE((start & mask) != (*vm_start & mask));
> >
> > -     for_each_vma_range(vmi, args->vma, end) {
> > +     for_each_vma(vmi, args->vma) {
> > +             if (end && end <= args->vma->vm_start)
> > +                     return false;
> > +
> >               if (should_skip_vma(args->vma->vm_start, args->vma->vm_end, args))
> >                       continue;
>
> Thanks.
>
> I added your signoff so I don't get a nastygram from Stephen in the
> morning.  Please send along a suitable brief changelog?

mm/vmscan: use the proper VMA iterator

When get_next_vma() finishes iterating VMAs within a range [start,
end), it expects args->vma to point to the first VMA outside that range,
if such a VMA exists. This allows its callers to continue the
iteration with a new range above the previous one, if those callers
choose to.

for_each_vma_range() always sets args->vma to NULL after it's done.
This may mislead those callers into concluding that there are no more
VMAs, and in turn they terminate their iterations prematurely.

This fix replaces for_each_vma_range() with for_each_vma() and
explicitly checks whether the next VMA is still within range, and if
not, returns false to indicate the current range has ended. The
callers may continue with the next range if args->vma is not NULL.
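
The contract described above can be sketched in userspace as follows. This
is a toy model under stated assumptions, not the kernel implementation:
struct toy_vma, struct toy_walk, toy_vmas[], toy_get_next_vma() and
toy_walk_all() are hypothetical names, the should_skip_vma() filter is
omitted, and the caller loop is a simplification of what a page table
walker does.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct vm_area_struct: covers [vm_start, vm_end). */
struct toy_vma {
	unsigned long vm_start;
	unsigned long vm_end;
};

/* Toy stand-in for struct mm_walk: only the cursor is modelled. */
struct toy_walk {
	struct toy_vma *vma;
};

/* A sorted, non-overlapping toy address space (hypothetical values). */
static struct toy_vma toy_vmas[] = {
	{ 0x1000, 0x2000 },
	{ 0x5000, 0x6000 },
	{ 0x9000, 0xa000 },
};

#define NR_TOY_VMAS (sizeof(toy_vmas) / sizeof(toy_vmas[0]))

/* Unbounded walk with an explicit range check, as in the fix. */
bool toy_get_next_vma(struct toy_walk *args, unsigned long start,
		      unsigned long end, unsigned long *vm_start,
		      unsigned long *vm_end)
{
	size_t i;

	for (i = 0; i < NR_TOY_VMAS; i++) {
		args->vma = &toy_vmas[i];

		/* Models an iterator that was positioned at start. */
		if (args->vma->vm_end <= start)
			continue;

		/* Range exhausted: return with the cursor still set. */
		if (end && end <= args->vma->vm_start)
			return false;

		*vm_start = start > args->vma->vm_start ?
			    start : args->vma->vm_start;
		*vm_end = (end - 1 < args->vma->vm_end - 1 ?
			   end - 1 : args->vma->vm_end - 1) + 1;
		return true;
	}

	/* No VMAs left at all: only now does the cursor become NULL. */
	args->vma = NULL;
	return false;
}

/* Toy caller: walks consecutive ranges of the given size from address 0. */
size_t toy_walk_all(unsigned long size)
{
	struct toy_walk args = { NULL };
	unsigned long start = 0, vm_start, vm_end;
	size_t hits = 0;

	do {
		unsigned long end = start + size;
		unsigned long pos = start;

		while (toy_get_next_vma(&args, pos, end, &vm_start, &vm_end)) {
			hits++;
			pos = vm_end;
		}

		start = end;
	} while (args.vma);	/* a non-NULL cursor means more ranges to try */

	return hits;
}
```

With 0x4000-byte ranges the toy caller visits all three VMAs, because each
false return inside a range leaves args.vma on the next VMA. If the cursor
were NULL whenever a range ran out, the outer loop would stop after the
first range.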