[PATCH v2] bpf: Allocate bpf_event_entry with node info

Posted by Namhyung Kim 1 year, 8 months ago

It was reported that accessing a perf_event map entry caused a high number
of LLC misses in get_map_perf_counter().  As reading a perf_event is
allowed for the local CPU only, I think we can use the target CPU of the
event as a hint for the allocation, like in perf_event_alloc(), so that
the event and the entry are at least on the same node.

Reported-by: Aleksei Shchekotikhin <alekseis@google.com>
Reported-by: Nilay Vaish <nilayvaish@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
v2) fix build errors

 kernel/bpf/arraymap.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index feabc0193852..067f7cf27042 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -1194,10 +1194,17 @@ static struct bpf_event_entry *bpf_event_entry_gen(struct file *perf_file,
 						   struct file *map_file)
 {
 	struct bpf_event_entry *ee;
+	struct perf_event *event = perf_file->private_data;
+	int node = -1;
 
-	ee = kzalloc(sizeof(*ee), GFP_KERNEL);
+#ifdef CONFIG_PERF_EVENTS
+	if (event->cpu >= 0)
+		node = cpu_to_node(event->cpu);
+#endif
+
+	ee = kzalloc_node(sizeof(*ee), GFP_KERNEL, node);
 	if (ee) {
-		ee->event = perf_file->private_data;
+		ee->event = event;
 		ee->perf_file = perf_file;
 		ee->map_file = map_file;
 	}
-- 
2.45.1.288.g0e0cd299f1-goog
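
For readers unfamiliar with the allocation hint, the node selection in the
patch can be sketched in user space.  This is only an illustration:
sketch_cpu_to_node(), sketch_entry_node() and the 8-CPU, 2-node topology are
made-up stand-ins, not kernel APIs.  In the kernel, cpu_to_node() reports the
real topology, and passing -1 (NUMA_NO_NODE) to kzalloc_node() expresses no
node preference, so events that are not bound to a CPU are allocated as before.

```c
#include <assert.h>

#define NUMA_NO_NODE (-1)	/* same value the kernel uses for "no preference" */

/* Hypothetical topology for illustration only: 8 CPUs, 4 CPUs per node. */
#define SKETCH_NR_CPUS		8
#define SKETCH_CPUS_PER_NODE	4

/* Stand-in for the kernel's cpu_to_node(); the real mapping is per-system. */
static int sketch_cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	return cpu / SKETCH_CPUS_PER_NODE;
}

/*
 * Mirrors the logic in the patch: an event bound to a CPU (cpu >= 0)
 * yields that CPU's node, anything else yields "no preference".
 */
static int sketch_entry_node(int event_cpu)
{
	if (event_cpu >= 0)
		return sketch_cpu_to_node(event_cpu);
	return NUMA_NO_NODE;
}
```

With this made-up topology, an event opened on CPU 5 would have its entry
placed on node 1, while a per-task event (cpu == -1) keeps the default
placement.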

Re: [PATCH v2] bpf: Allocate bpf_event_entry with node info

Posted by Jiri Olsa 1 year, 8 months ago

On Tue, May 28, 2024 at 11:53:11PM -0700, Namhyung Kim wrote:
> It was reported that accessing a perf_event map entry caused a high number
> of LLC misses in get_map_perf_counter().  As reading a perf_event is
> allowed for the local CPU only, I think we can use the target CPU of the
> event as a hint for the allocation, like in perf_event_alloc(), so that
> the event and the entry are at least on the same node.

looks good, is there any profile to prove the gain?

jirka

Re: [PATCH v2] bpf: Allocate bpf_event_entry with node info

Posted by Namhyung Kim 1 year, 8 months ago

Hi Jiri,

On Wed, May 29, 2024 at 1:31 AM Jiri Olsa <olsajiri@gmail.com> wrote:
>
> On Tue, May 28, 2024 at 11:53:11PM -0700, Namhyung Kim wrote:
> > It was reported that accessing a perf_event map entry caused a high number
> > of LLC misses in get_map_perf_counter().  As reading a perf_event is
> > allowed for the local CPU only, I think we can use the target CPU of the
> > event as a hint for the allocation, like in perf_event_alloc(), so that
> > the event and the entry are at least on the same node.
>
> looks good, is there any profile to prove the gain?

Not at this point.  I'm not sure whether it would help the LLC hit ratio,
but I think it should improve memory latency.

Thanks,
Namhyung

Re: [PATCH v2] bpf: Allocate bpf_event_entry with node info

Posted by Alexei Starovoitov 1 year, 8 months ago

On Wed, May 29, 2024 at 9:54 AM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Hi Jiri,
>
> On Wed, May 29, 2024 at 1:31 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> >
> > On Tue, May 28, 2024 at 11:53:11PM -0700, Namhyung Kim wrote:
> > > It was reported that accessing a perf_event map entry caused a high number
> > > of LLC misses in get_map_perf_counter().  As reading a perf_event is
> > > allowed for the local CPU only, I think we can use the target CPU of the
> > > event as a hint for the allocation, like in perf_event_alloc(), so that
> > > the event and the entry are at least on the same node.
> >
> > looks good, is there any profile to prove the gain?
>
> Not at this point.  I'm not sure whether it would help the LLC hit ratio,
> but I think it should improve memory latency.

I have the same concern as Jiri.
Without numbers this is just code churn.
Does this patch really make a difference?
Without numbers maintainers would have to believe the "just trust me" part.
So..
pw-bot: cr

Re: [PATCH v2] bpf: Allocate bpf_event_entry with node info

Posted by Namhyung Kim 1 year, 8 months ago

Hi Alexei,

On Wed, May 29, 2024 at 10:23 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, May 29, 2024 at 9:54 AM Namhyung Kim <namhyung@kernel.org> wrote:
> >
> > Hi Jiri,
> >
> > On Wed, May 29, 2024 at 1:31 AM Jiri Olsa <olsajiri@gmail.com> wrote:
> > >
> > > On Tue, May 28, 2024 at 11:53:11PM -0700, Namhyung Kim wrote:
> > > > It was reported that accessing a perf_event map entry caused a high number
> > > > of LLC misses in get_map_perf_counter().  As reading a perf_event is
> > > > allowed for the local CPU only, I think we can use the target CPU of the
> > > > event as a hint for the allocation, like in perf_event_alloc(), so that
> > > > the event and the entry are at least on the same node.
> > >
> > > looks good, is there any profile to prove the gain?
> >
> > Not at this point.  I'm not sure whether it would help the LLC hit ratio,
> > but I think it should improve memory latency.
>
> I have the same concern as Jiri.
> Without numbers this is just code churn.
> Does this patch really make a difference?
> Without numbers maintainers would have to believe the "just trust me" part.
> So..
> pw-bot: cr

Ok, then I'll come back with numbers later.

Thanks,
Namhyung