[PATCH v4 0/4] perf/x86/amd: add LBR capture support outside of hardware events

Andrii Nakryiko posted 4 patches 1 year, 10 months ago
There is a newer version of this series
[PATCH v4 0/4] perf/x86/amd: add LBR capture support outside of hardware events
Posted by Andrii Nakryiko 1 year, 10 months ago
Add AMD-specific implementation of perf_snapshot_branch_stack static call that
allows LBR capture from arbitrary points in the kernel. This is utilized by
BPF programs. See patch #3 for all the details.

Patches #1 and #2 are preparatory steps to ensure LBR freezing is completely
inlined and has no branches, to minimize LBR snapshot contamination.
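
To illustrate why branches taken before the freeze matter, here is a toy
user-space model. It is only a sketch and not the kernel code from this
series: LBR_DEPTH, struct toy_lbr and every toy_* name are hypothetical, and
the LBR is modeled as a plain ring buffer that stops recording once frozen.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LBR_DEPTH 16 /* hypothetical depth, chosen only for this toy model */

struct branch_rec {
	uint64_t from;
	uint64_t to;
};

struct toy_lbr {
	struct branch_rec ring[LBR_DEPTH];
	size_t head;   /* next slot to overwrite */
	size_t count;  /* number of valid records, capped at LBR_DEPTH */
	int frozen;    /* once set, new branches are no longer recorded */
};

/* Record one taken branch, overwriting the oldest record when full. */
static void toy_lbr_record(struct toy_lbr *lbr, uint64_t from, uint64_t to)
{
	assert(lbr != NULL);
	if (lbr->frozen)
		return;
	lbr->ring[lbr->head] = (struct branch_rec){ from, to };
	lbr->head = (lbr->head + 1) % LBR_DEPTH;
	if (lbr->count < LBR_DEPTH)
		lbr->count++;
}

/* Count records whose source address lies in [lo, hi). */
static size_t toy_lbr_count_from(const struct toy_lbr *lbr,
				 uint64_t lo, uint64_t hi)
{
	size_t n = 0;

	for (size_t i = 0; i < lbr->count; i++)
		if (lbr->ring[i].from >= lo && lbr->ring[i].from < hi)
			n++;
	return n;
}

/*
 * Simulate a workload that takes workload_branches branches, followed by a
 * snapshot path that takes overhead_branches branches before it freezes the
 * LBR. Returns how many workload records are still present afterwards.
 */
size_t toy_surviving_records(size_t workload_branches, size_t overhead_branches)
{
	struct toy_lbr lbr = { 0 };

	for (size_t i = 0; i < workload_branches; i++)
		toy_lbr_record(&lbr, 0x1000 + i, 0x2000 + i);
	for (size_t i = 0; i < overhead_branches; i++)
		toy_lbr_record(&lbr, 0x9000 + i, 0xa000 + i);

	lbr.frozen = 1;
	toy_lbr_record(&lbr, 0xdead, 0xbeef); /* ignored: already frozen */

	return toy_lbr_count_from(&lbr, 0x1000, 0x9000);
}
```

In this model, every branch taken between the point of interest and the
freeze evicts one useful record once the buffer is full, which is the
contamination that inlining and branch-free code are meant to avoid.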

Patch #4 removes an artificial restriction on perf events with LBR enabled.

Andrii Nakryiko (4):
  perf/x86/amd: ensure amd_pmu_core_disable_all() is always inlined
  perf/x86/amd: avoid taking branches before disabling LBR
  perf/x86/amd: support capturing LBR from software events
  perf/x86/amd: don't reject non-sampling events with configured LBR

 arch/x86/events/amd/core.c   | 37 +++++++++++++++++++++++++++++++++++-
 arch/x86/events/amd/lbr.c    | 11 +----------
 arch/x86/events/perf_event.h | 11 +++++++++++
 3 files changed, 48 insertions(+), 11 deletions(-)

-- 
2.43.0
Re: [PATCH v4 0/4] perf/x86/amd: add LBR capture support outside of hardware events
Posted by Ingo Molnar 1 year, 10 months ago
* Andrii Nakryiko <andrii@kernel.org> wrote:

> Add AMD-specific implementation of perf_snapshot_branch_stack static call that
> allows LBR capture from arbitrary points in the kernel. This is utilized by
> BPF programs. See patch #3 for all the details.
> 
> Patches #1 and #2 are preparatory steps to ensure LBR freezing is completely
> inlined and has no branches, to minimize LBR snapshot contamination.
> 
> Patch #4 removes an artificial restriction on perf events with LBR enabled.
> 
> Andrii Nakryiko (4):
>   perf/x86/amd: ensure amd_pmu_core_disable_all() is always inlined
>   perf/x86/amd: avoid taking branches before disabling LBR
>   perf/x86/amd: support capturing LBR from software events
>   perf/x86/amd: don't reject non-sampling events with configured LBR
> 
>  arch/x86/events/amd/core.c   | 37 +++++++++++++++++++++++++++++++++++-
>  arch/x86/events/amd/lbr.c    | 11 +----------
>  arch/x86/events/perf_event.h | 11 +++++++++++
>  3 files changed, 48 insertions(+), 11 deletions(-)

So there's a new conflict with patch #2, probably due to interaction 
with this recent fix that is now upstream:

   598c2fafc06f ("perf/x86/amd/lbr: Use freeze based on availability")

I don't think it should change the logic of the snapshot feature 
materially, X86_FEATURE_AMD_LBR_PMC_FREEZE should be orthogonal to it, 
as the LBR snapshot isn't taken from a PMI.

Thanks,

	Ingo
Re: [PATCH v4 0/4] perf/x86/amd: add LBR capture support outside of hardware events
Posted by Andrii Nakryiko 1 year, 10 months ago
On Mon, Apr 1, 2024 at 2:30 AM Ingo Molnar <mingo@kernel.org> wrote:
>
>
> * Andrii Nakryiko <andrii@kernel.org> wrote:
>
> > Add AMD-specific implementation of perf_snapshot_branch_stack static call that
> > allows LBR capture from arbitrary points in the kernel. This is utilized by
> > BPF programs. See patch #3 for all the details.
> >
> > Patches #1 and #2 are preparatory steps to ensure LBR freezing is completely
> > inlined and has no branches, to minimize LBR snapshot contamination.
> >
> > Patch #4 removes an artificial restriction on perf events with LBR enabled.
> >
> > Andrii Nakryiko (4):
> >   perf/x86/amd: ensure amd_pmu_core_disable_all() is always inlined
> >   perf/x86/amd: avoid taking branches before disabling LBR
> >   perf/x86/amd: support capturing LBR from software events
> >   perf/x86/amd: don't reject non-sampling events with configured LBR
> >
> >  arch/x86/events/amd/core.c   | 37 +++++++++++++++++++++++++++++++++++-
> >  arch/x86/events/amd/lbr.c    | 11 +----------
> >  arch/x86/events/perf_event.h | 11 +++++++++++
> >  3 files changed, 48 insertions(+), 11 deletions(-)
>
> So there's a new conflict with patch #2, probably due to interaction
> with this recent fix that is now upstream:
>
>    598c2fafc06f ("perf/x86/amd/lbr: Use freeze based on availability")
>
> I don't think it should change the logic of the snapshot feature
> materially, X86_FEATURE_AMD_LBR_PMC_FREEZE should be orthogonal to it,
> as the LBR snapshot isn't taken from a PMI.
>

Yep, seems like there was a parallel change to related code in
perf/urgent branch. And yes, you are right that it's orthogonal and
doesn't regress anything as far as branching and whatnot (just
retested everything on real hardware). So I've rebased my patches on
top of perf/urgent, will send v5 momentarily. Sorry for an extra round
on this.

> Thanks,
>
>         Ingo
Re: [PATCH v4 0/4] perf/x86/amd: add LBR capture support outside of hardware events
Posted by Ingo Molnar 1 year, 10 months ago
* Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:

> On Mon, Apr 1, 2024 at 2:30 AM Ingo Molnar <mingo@kernel.org> wrote:
> >
> >
> > * Andrii Nakryiko <andrii@kernel.org> wrote:
> >
> > > Add AMD-specific implementation of perf_snapshot_branch_stack static call that
> > > allows LBR capture from arbitrary points in the kernel. This is utilized by
> > > BPF programs. See patch #3 for all the details.
> > >
> > > Patches #1 and #2 are preparatory steps to ensure LBR freezing is completely
> > > inlined and has no branches, to minimize LBR snapshot contamination.
> > >
> > > Patch #4 removes an artificial restriction on perf events with LBR enabled.
> > >
> > > Andrii Nakryiko (4):
> > >   perf/x86/amd: ensure amd_pmu_core_disable_all() is always inlined
> > >   perf/x86/amd: avoid taking branches before disabling LBR
> > >   perf/x86/amd: support capturing LBR from software events
> > >   perf/x86/amd: don't reject non-sampling events with configured LBR
> > >
> > >  arch/x86/events/amd/core.c   | 37 +++++++++++++++++++++++++++++++++++-
> > >  arch/x86/events/amd/lbr.c    | 11 +----------
> > >  arch/x86/events/perf_event.h | 11 +++++++++++
> > >  3 files changed, 48 insertions(+), 11 deletions(-)
> >
> > So there's a new conflict with patch #2, probably due to interaction
> > with this recent fix that is now upstream:
> >
> >    598c2fafc06f ("perf/x86/amd/lbr: Use freeze based on availability")
> >
> > I don't think it should change the logic of the snapshot feature
> > materially, X86_FEATURE_AMD_LBR_PMC_FREEZE should be orthogonal to it,
> > as the LBR snapshot isn't taken from a PMI.
> >
> 
> Yep, seems like there was a parallel change to related code in 
> perf/urgent branch. And yes, you are right that it's orthogonal and 
> doesn't regress anything as far as branching and whatnot (just 
> retested everything on real hardware). So I've rebased my patches on 
> top of perf/urgent, will send v5 momentarily.

Thank you - it's now all in tip:perf/core and lined up for v6.10.

> Sorry for an extra round on this.

Not your doing really - just crossing patches.

Thanks,

	Ingo
Re: [PATCH v4 0/4] perf/x86/amd: add LBR capture support outside of hardware events
Posted by Andrii Nakryiko 1 year, 10 months ago
On Wed, Apr 3, 2024 at 1:06 AM Ingo Molnar <mingo@kernel.org> wrote:
>
>
> * Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> > On Mon, Apr 1, 2024 at 2:30 AM Ingo Molnar <mingo@kernel.org> wrote:
> > >
> > >
> > > * Andrii Nakryiko <andrii@kernel.org> wrote:
> > >
> > > > Add AMD-specific implementation of perf_snapshot_branch_stack static call that
> > > > allows LBR capture from arbitrary points in the kernel. This is utilized by
> > > > BPF programs. See patch #3 for all the details.
> > > >
> > > > Patches #1 and #2 are preparatory steps to ensure LBR freezing is completely
> > > > inlined and has no branches, to minimize LBR snapshot contamination.
> > > >
> > > > Patch #4 removes an artificial restriction on perf events with LBR enabled.
> > > >
> > > > Andrii Nakryiko (4):
> > > >   perf/x86/amd: ensure amd_pmu_core_disable_all() is always inlined
> > > >   perf/x86/amd: avoid taking branches before disabling LBR
> > > >   perf/x86/amd: support capturing LBR from software events
> > > >   perf/x86/amd: don't reject non-sampling events with configured LBR
> > > >
> > > >  arch/x86/events/amd/core.c   | 37 +++++++++++++++++++++++++++++++++++-
> > > >  arch/x86/events/amd/lbr.c    | 11 +----------
> > > >  arch/x86/events/perf_event.h | 11 +++++++++++
> > > >  3 files changed, 48 insertions(+), 11 deletions(-)
> > >
> > > So there's a new conflict with patch #2, probably due to interaction
> > > with this recent fix that is now upstream:
> > >
> > >    598c2fafc06f ("perf/x86/amd/lbr: Use freeze based on availability")
> > >
> > > I don't think it should change the logic of the snapshot feature
> > > materially, X86_FEATURE_AMD_LBR_PMC_FREEZE should be orthogonal to it,
> > > as the LBR snapshot isn't taken from a PMI.
> > >
> >
> > Yep, seems like there was a parallel change to related code in
> > perf/urgent branch. And yes, you are right that it's orthogonal and
> > doesn't regress anything as far as branching and whatnot (just
> > retested everything on real hardware). So I've rebased my patches on
> > top of perf/urgent, will send v5 momentarily.
>
> Thank you - it's now all in tip:perf/core and lined up for v6.10.

Great, thank you!

>
> > Sorry for an extra round on this.
>
> Not your doing really - just crossing patches.
>
> Thanks,
>
>         Ingo