When finding VA in vn->busy, if VA spans several zones and the passed
addr is not the same as va->va_start, we should scan the vn in reverse
odrdr because the starting address of VA must be smaller than the passed
addr if it really resides in the VA.
E.g on a system nr_vmap_nodes=100,
<----va---->
-|-----|-----|-----|-----|-----|-----|-----|-----|-----|-
... n-1 n n+1 n+2 ... 100 0 1
VA resides in node 'n' whereas it spans 'n', 'n+1' and 'n+2'. If passed
addr is within 'n+2', we should try nodes backwards on 'n+1' and 'n',
then succeed very soon.
Meanwhile we still need loop around because VA could spans node from 'n'
to node 100, node 0, node 1.
Anyway, changing to find in reverse order can improve efficiency on
many CPUs system.
Signed-off-by: Baoquan He <bhe@redhat.com>
---
mm/vmalloc.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index aca1905d3397..488d69b56765 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2436,7 +2436,7 @@ struct vmap_area *find_vmap_area(unsigned long addr)
if (va)
return va;
- } while ((i = (i + 1) % nr_vmap_nodes) != j);
+ } while ((i = (i + nr_vmap_nodes - 1) % nr_vmap_nodes) != j);
return NULL;
}
@@ -2462,7 +2462,7 @@ static struct vmap_area *find_unlink_vmap_area(unsigned long addr)
if (va)
return va;
- } while ((i = (i + 1) % nr_vmap_nodes) != j);
+ } while ((i = (i + nr_vmap_nodes - 1) % nr_vmap_nodes) != j);
return NULL;
}
--
2.41.0
On 4/15/2025 8:09 AM, Baoquan He wrote: > When finding VA in vn->busy, if VA spans several zones and the passed > addr is not the same as va->va_start, we should scan the vn in reverse > odrdr because the starting address of VA must be smaller than the passed > addr if it really resides in the VA. > > E.g on a system nr_vmap_nodes=100, > > <----va----> > -|-----|-----|-----|-----|-----|-----|-----|-----|-----|- > ... n-1 n n+1 n+2 ... 100 0 1 > > VA resides in node 'n' whereas it spans 'n', 'n+1' and 'n+2'. If passed > addr is within 'n+2', we should try nodes backwards on 'n+1' and 'n', > then succeed very soon. > > Meanwhile we still need loop around because VA could spans node from 'n' > to node 100, node 0, node 1. > > Anyway, changing to find in reverse order can improve efficiency on > many CPUs system. > > Signed-off-by: Baoquan He <bhe@redhat.com> > --- > mm/vmalloc.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > index aca1905d3397..488d69b56765 100644 > --- a/mm/vmalloc.c > +++ b/mm/vmalloc.c > @@ -2436,7 +2436,7 @@ struct vmap_area *find_vmap_area(unsigned long addr) > > if (va) > return va; > - } while ((i = (i + 1) % nr_vmap_nodes) != j); > + } while ((i = (i + nr_vmap_nodes - 1) % nr_vmap_nodes) != j); > > return NULL; > } > @@ -2462,7 +2462,7 @@ static struct vmap_area *find_unlink_vmap_area(unsigned long addr) > > if (va) > return va; > - } while ((i = (i + 1) % nr_vmap_nodes) != j); > + } while ((i = (i + nr_vmap_nodes - 1) % nr_vmap_nodes) != j); > > return NULL; > } Reviewed-by: Shivank Garg <shivankg@amd.com> Tested-by: Shivank Garg <shivankg@amd.com> Thanks, Shivank
On Tue, Apr 15, 2025 at 10:39:49AM +0800, Baoquan He wrote:
> When finding VA in vn->busy, if VA spans several zones and the passed
> addr is not the same as va->va_start, we should scan the vn in reverse
> odrdr because the starting address of VA must be smaller than the passed
> addr if it really resides in the VA.
>
> E.g on a system nr_vmap_nodes=100,
>
> <----va---->
> -|-----|-----|-----|-----|-----|-----|-----|-----|-----|-
> ... n-1 n n+1 n+2 ... 100 0 1
>
> VA resides in node 'n' whereas it spans 'n', 'n+1' and 'n+2'. If passed
> addr is within 'n+2', we should try nodes backwards on 'n+1' and 'n',
> then succeed very soon.
>
> Meanwhile we still need loop around because VA could spans node from 'n'
> to node 100, node 0, node 1.
>
> Anyway, changing to find in reverse order can improve efficiency on
> many CPUs system.
>
> Signed-off-by: Baoquan He <bhe@redhat.com>
> ---
> mm/vmalloc.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index aca1905d3397..488d69b56765 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2436,7 +2436,7 @@ struct vmap_area *find_vmap_area(unsigned long addr)
>
> if (va)
> return va;
> - } while ((i = (i + 1) % nr_vmap_nodes) != j);
> + } while ((i = (i + nr_vmap_nodes - 1) % nr_vmap_nodes) != j);
>
> return NULL;
> }
> @@ -2462,7 +2462,7 @@ static struct vmap_area *find_unlink_vmap_area(unsigned long addr)
>
> if (va)
> return va;
> - } while ((i = (i + 1) % nr_vmap_nodes) != j);
> + } while ((i = (i + nr_vmap_nodes - 1) % nr_vmap_nodes) != j);
>
> return NULL;
> }
> --
> 2.41.0
>
It depends. Consider a below situation:
addr
|
VA V
<------------>
<---|---|---|---|---|---|---|--->
0 1 2 3 0 1 2 3
basically it matters how big VA and how many nodes it spans. But i
agree that an assumption to reverse back is more convinced in most
cases.
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
--
Uladzislau Rezki
On 04/15/25 at 05:25pm, Uladzislau Rezki wrote: > On Tue, Apr 15, 2025 at 10:39:49AM +0800, Baoquan He wrote: > > When finding VA in vn->busy, if VA spans several zones and the passed > > addr is not the same as va->va_start, we should scan the vn in reverse > > odrdr because the starting address of VA must be smaller than the passed > > addr if it really resides in the VA. > > > > E.g on a system nr_vmap_nodes=100, > > > > <----va----> > > -|-----|-----|-----|-----|-----|-----|-----|-----|-----|- > > ... n-1 n n+1 n+2 ... 100 0 1 > > > > VA resides in node 'n' whereas it spans 'n', 'n+1' and 'n+2'. If passed > > addr is within 'n+2', we should try nodes backwards on 'n+1' and 'n', > > then succeed very soon. > > > > Meanwhile we still need loop around because VA could spans node from 'n' > > to node 100, node 0, node 1. > > > > Anyway, changing to find in reverse order can improve efficiency on > > many CPUs system. > > > > Signed-off-by: Baoquan He <bhe@redhat.com> > > --- > > mm/vmalloc.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > > index aca1905d3397..488d69b56765 100644 > > --- a/mm/vmalloc.c > > +++ b/mm/vmalloc.c > > @@ -2436,7 +2436,7 @@ struct vmap_area *find_vmap_area(unsigned long addr) > > > > if (va) > > return va; > > - } while ((i = (i + 1) % nr_vmap_nodes) != j); > > + } while ((i = (i + nr_vmap_nodes - 1) % nr_vmap_nodes) != j); > > > > return NULL; > > } > > @@ -2462,7 +2462,7 @@ static struct vmap_area *find_unlink_vmap_area(unsigned long addr) > > > > if (va) > > return va; > > - } while ((i = (i + 1) % nr_vmap_nodes) != j); > > + } while ((i = (i + nr_vmap_nodes - 1) % nr_vmap_nodes) != j); > > > > return NULL; > > } > > -- > > 2.41.0 > > > It depends. Consider a below situation: > > addr > | > VA V > <------------> > <---|---|---|---|---|---|---|---> > 0 1 2 3 0 1 2 3 > > basically it matters how big VA and how many nodes it spans. But i > agree that an assumption to reverse back is more convinced in most > cases. Agree, on small system with few CPUs and big VA case, the advantage is not apparent. > > Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Thanks.
© 2016 - 2025 Red Hat, Inc.