[PATCH 0/3] perf/core: Optimize LBR callstack handling

Namhyung Kim posted 3 patches 1 month, 2 weeks ago
[PATCH 0/3] perf/core: Optimize LBR callstack handling
Posted by Namhyung Kim 1 month, 2 weeks ago
Hello,

I found other problematic cases wrt LBR callstacks.  Basically, an O(N^2)
loop over all threads is too costly on large machines.  We can use
faster memory allocation and free methods to reduce the overhead.

Actually, this approach was suggested by AI (Gemini).

Thanks,
Namhyung


Namhyung Kim (3):
  perf/core: Pass GFP flags to attach_task_ctx_data()
  perf/core: Try to allocate task_ctx_data quickly
  perf/core: Simplify __detach_global_ctx_data()

 kernel/events/core.c | 38 +++++++++++++++++++-------------------
 1 file changed, 19 insertions(+), 19 deletions(-)

-- 
2.53.0.273.g2a3d683680-goog
Re: [PATCH 0/3] perf/core: Optimize LBR callstack handling
Posted by Peter Zijlstra 1 month ago
On Wed, Feb 11, 2026 at 02:32:18PM -0800, Namhyung Kim wrote:
> Namhyung Kim (3):
>   perf/core: Pass GFP flags to attach_task_ctx_data()
>   perf/core: Try to allocate task_ctx_data quickly
>   perf/core: Simplify __detach_global_ctx_data()

They seem to have crossed paths with kalloc_obj() stuff, but I stomped
on it and now they fit.

Patches seem fine, I'll throw them at the robots.

Thanks!
Re: [PATCH 0/3] perf/core: Optimize LBR callstack handling
Posted by Namhyung Kim 1 month ago
On Thu, Feb 26, 2026 at 01:07:12PM +0100, Peter Zijlstra wrote:
> On Wed, Feb 11, 2026 at 02:32:18PM -0800, Namhyung Kim wrote:
> > Namhyung Kim (3):
> >   perf/core: Pass GFP flags to attach_task_ctx_data()
> >   perf/core: Try to allocate task_ctx_data quickly
> >   perf/core: Simplify __detach_global_ctx_data()
> 
> They seem to have crossed paths with kalloc_obj() stuff, but I stomped
> on it and now they fit.
> 
> Patches seem fine, I'll throw them at the robots.

Thanks for doing that!

Namhyung