Introduce a basic skeleton that defines a cache entity for
a CPU. It consists of bookkeeping data that belongs to a CPU
zone. A CPU zone defines a virtual address range of a fixed
size.

This makes it possible to map any address to the particular
CPU zone it belongs to. Below is a high-level description of
how it is used in the alloc and free paths, followed by a
small sketch of the address-to-zone mapping:
alloc-path:
an in-front pre-fetch occurs if the cache of this CPU is
empty. The chunk is placed into the free-tree of this CPU.
Finally, this CPU cache is clipped in order to satisfy the
request. The fresh VA then resides in the busy-tree of the
specific CPU zone it is bound to.
free-path:
va->va_start is converted to a zone, so the lookup occurs in
the bookkeeping zone the VA belongs to. Once removed from the
busy-tree, the VA is moved to the lazy-tree. It is finally
freed after a TLB flush.
A "cvz_size", that reflects a zone size, is set to ULONG_MAX
by this patch. The reason is we would like to follow an old
behavior until all dependent changes are in place.
There is no a functional change as a result of this patch.
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
---
mm/vmalloc.c | 55 ++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 55 insertions(+)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 19dcffb0d563..f6da2590b4de 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -773,6 +773,61 @@ static struct rb_root free_vmap_area_root = RB_ROOT;
*/
static DEFINE_PER_CPU(struct vmap_area *, ne_fit_preload_node);
+/*
+ * Per-cpu vmap cache logic. It allows pre-fetching of a vmap
+ * chunk of a specified size. The bookkeeping data is stored
+ * and accessed per CPU.
+ */
+enum { FREE = 0, BUSY, LAZY, NFBL };
+
+#define fbl(z, i, m) z->fbl[i].m
+#define fbl_root(z, i) fbl(z, i, root)
+#define fbl_head(z, i) fbl(z, i, head)
+
+#define fbl_lock(z, i) spin_lock(&fbl(z, i, lock))
+#define fbl_unlock(z, i) spin_unlock(&fbl(z, i, lock))
+
+struct cpu_vmap_zone {
+ /*
+ * FREE, BUSY, LAZY bookkeeping data of this CPU zone.
+ */
+ struct {
+ struct rb_root root;
+ struct list_head head;
+ spinlock_t lock;
+ } fbl[NFBL];
+
+ /*
+ * A list of outstanding lazily-freed areas, ready to
+ * be drained, whose address space belongs to this CPU
+ * zone.
+ */
+ struct list_head purge_list;
+
+ /*
+ * Ensures that only one CPU at a time pre-fetches
+ * this CPU zone.
+ */
+ atomic_t fill_in_progress;
+};
+
+static DEFINE_PER_CPU(struct cpu_vmap_zone, cpu_vmap_zone);
+
+/* Disable per-cpu caching. */
+static __read_mostly unsigned long cvz_size = ULONG_MAX;
+
+static inline unsigned int
+addr_to_cpu(unsigned long addr)
+{
+ return (addr / cvz_size) % num_possible_cpus();
+}
+
+static inline struct cpu_vmap_zone *
+__maybe_unused addr_to_cvz(unsigned long addr)
+{
+ return &per_cpu(cpu_vmap_zone, addr_to_cpu(addr));
+}
+
static __always_inline unsigned long
va_size(struct vmap_area *va)
{
--
2.30.2
On Mon, May 22, 2023 at 01:08:44PM +0200, Uladzislau Rezki (Sony) wrote:
> +#define fbl(z, i, m) z->fbl[i].m
> +#define fbl_root(z, i) fbl(z, i, root)
> +#define fbl_head(z, i) fbl(z, i, head)
> +
> +#define fbl_lock(z, i) spin_lock(&fbl(z, i, lock))
> +#define fbl_unlock(z, i) spin_unlock(&fbl(z, i, lock))
Even if it is just temporary, I don't think adding these wrappers
make much sense.
> +struct cpu_vmap_zone {
> + /*
> + * FREE, BUSY, LAZY bookkeeping data of this CPU zone.
> + */
> + struct {
> + struct rb_root root;
> + struct list_head head;
> + spinlock_t lock;
> + } fbl[NFBL];
Maybe replace NFBL with something longer and more descriptive?
But also in general it feels like this should be folded into a patch
doing real work. As-is it doesn't look very useful.
On Mon, May 22, 2023 at 11:08:38PM -0700, Christoph Hellwig wrote:
> On Mon, May 22, 2023 at 01:08:44PM +0200, Uladzislau Rezki (Sony) wrote:
> > +#define fbl(z, i, m) z->fbl[i].m
> > +#define fbl_root(z, i) fbl(z, i, root)
> > +#define fbl_head(z, i) fbl(z, i, head)
> > +
> > +#define fbl_lock(z, i) spin_lock(&fbl(z, i, lock))
> > +#define fbl_unlock(z, i) spin_unlock(&fbl(z, i, lock))
>
> Even if it is just temporary, I don't think adding these wrappers
> make much sense.
>
If open-coded, it looks like:

  spin_lock(&z->fbl[BUSY].lock);

whereas with the wrapper it is:

  fbl_lock(z, BUSY);

The reason for adding such helpers is to make the names shorter.
> > +struct cpu_vmap_zone {
> > + /*
> > + * FREE, BUSY, LAZY bookkeeping data of this CPU zone.
> > + */
> > + struct {
> > + struct rb_root root;
> > + struct list_head head;
> > + spinlock_t lock;
> > + } fbl[NFBL];
>
> Maybe replace NFBL with something longer and more descriptive?
>
> But also in general it feels like this should be folded into a patch
> doing real work. As-is it doesn't look very useful.
>
I split it out to make it easier to review, but I can fold it in.
--
Uladzislau Rezki
On Tue, May 23, 2023 at 04:53:25PM +0200, Uladzislau Rezki wrote:
> > > +#define fbl_lock(z, i) spin_lock(&fbl(z, i, lock))
> > > +#define fbl_unlock(z, i) spin_unlock(&fbl(z, i, lock))
> >
> > Even if it is just temporary, I don't think adding these wrappers
> > make much sense.
> >
> If open-coded, it looks like:
>
> spin_lock(&z->fbl[BUSY].lock);

Give the fbl structure a name and you can have a local variable for it,
which will make all this a lot more readable. And then unless there is
a really good reason to iterate over this as an array just have three
of these structs embedded named free, busy and lazy.
On Tue, May 23, 2023 at 08:13:47AM -0700, Christoph Hellwig wrote:
> On Tue, May 23, 2023 at 04:53:25PM +0200, Uladzislau Rezki wrote:
> > > > +#define fbl_lock(z, i) spin_lock(&fbl(z, i, lock))
> > > > +#define fbl_unlock(z, i) spin_unlock(&fbl(z, i, lock))
> > >
> > > Even if it is just temporary, I don't think adding these wrappers
> > > make much sense.
> > >
> > If open-coded, it looks like:
> >
> > spin_lock(&z->fbl[BUSY].lock);
>
> Give the fbl structure a name and you can have a local variable for it,
> which will make all this a lot more readable. And then unless there is
> a really good reason to iterate over this as an array just have three
> of these structs embedded named free, busy and lazy.
>
OK. I can go that way.

--
Uladzislau Rezki
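For illustration, the restructuring Christoph suggests could look
roughly like the sketch below. The names (vmap_pool, vp,
busy_tree_insert_example) are hypothetical and not from any posted
follow-up patch; the point is that a named structure plus a local
variable removes the need for the fbl_lock()/fbl_unlock() wrappers:

#include <linux/rbtree.h>
#include <linux/list.h>
#include <linux/spinlock.h>
#include <linux/atomic.h>

/* A named structure instead of the anonymous fbl[] entry. */
struct vmap_pool {
	struct rb_root root;
	struct list_head head;
	spinlock_t lock;
};

struct cpu_vmap_zone {
	/* Three embedded members instead of an fbl[NFBL] array. */
	struct vmap_pool free;
	struct vmap_pool busy;
	struct vmap_pool lazy;

	struct list_head purge_list;
	atomic_t fill_in_progress;
};

/* A local variable then replaces the fbl_lock()/fbl_unlock() wrappers. */
static void busy_tree_insert_example(struct cpu_vmap_zone *z)
{
	struct vmap_pool *vp = &z->busy;

	spin_lock(&vp->lock);
	/* ... link a vmap_area into vp->root and vp->head ... */
	spin_unlock(&vp->lock);
}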