[PATCH v1 0/4] perf arm-spe: Allow synthesizing of branch

Graham Woodward posted 4 patches 1 month ago
[PATCH v1 0/4] perf arm-spe: Allow synthesizing of branch
Posted by Graham Woodward 1 month ago
Currently --itrace=b only shows branch-misses, but this change
allows perf to synthesize branches as well.

The change also adds the ability to display the target address when
the addr field is specified and the instruction is a branch.

Graham Woodward (4):
  perf arm-spe: Set sample.addr to target address for instruction sample
  perf arm-spe: Use ARM_SPE_OP_BRANCH_ERET when synthesizing branches
  perf arm-spe: Correctly set sample flags
  perf arm-spe: Update --itrace help text

 tools/perf/Documentation/itrace.txt       |  2 +-
 tools/perf/Documentation/perf-arm-spe.txt |  2 +-
 tools/perf/builtin-script.c               |  1 +
 tools/perf/util/arm-spe.c                 | 31 ++++++++++++++++++-----
 tools/perf/util/auxtrace.h                |  3 +--
 tools/perf/util/event.h                   |  1 +
 6 files changed, 29 insertions(+), 11 deletions(-)

-- 
2.40.1
Re: [PATCH v1 0/4] perf arm-spe: Allow synthesizing of branch
Posted by Namhyung Kim 3 weeks, 4 days ago
On Fri, 25 Oct 2024 15:30:05 +0100, Graham Woodward wrote:

> Currently --itrace=b only shows branch-misses, but this change
> allows perf to synthesize branches as well.
> 
> The change also adds the ability to display the target address when
> the addr field is specified and the instruction is a branch.
> 
> Graham Woodward (4):
>   perf arm-spe: Set sample.addr to target address for instruction sample
>   perf arm-spe: Use ARM_SPE_OP_BRANCH_ERET when synthesizing branches
>   perf arm-spe: Correctly set sample flags
>   perf arm-spe: Update --itrace help text
> 
> [...]

Applied to perf-tools-next, thanks!

Best regards,
Namhyung
Re: [PATCH v1 0/4] perf arm-spe: Allow synthesizing of branch
Posted by Namhyung Kim 3 weeks, 6 days ago
Hello,

On Fri, Oct 25, 2024 at 03:30:05PM +0100, Graham Woodward wrote:
> Currently --itrace=b only shows branch-misses, but this change
> allows perf to synthesize branches as well.
> 
> The change also adds the ability to display the target address when
> the addr field is specified and the instruction is a branch.
> 
> Graham Woodward (4):
>   perf arm-spe: Set sample.addr to target address for instruction sample
>   perf arm-spe: Use ARM_SPE_OP_BRANCH_ERET when synthesizing branches
>   perf arm-spe: Correctly set sample flags
>   perf arm-spe: Update --itrace help text

It doesn't apply to perf-tools-next cleanly.  Can you please rebase?

Thanks,
Namhyung

> 
>  tools/perf/Documentation/itrace.txt       |  2 +-
>  tools/perf/Documentation/perf-arm-spe.txt |  2 +-
>  tools/perf/builtin-script.c               |  1 +
>  tools/perf/util/arm-spe.c                 | 31 ++++++++++++++++++-----
>  tools/perf/util/auxtrace.h                |  3 +--
>  tools/perf/util/event.h                   |  1 +
>  6 files changed, 29 insertions(+), 11 deletions(-)
> 
> -- 
> 2.40.1
>
Re: [PATCH v1 0/4] perf arm-spe: Allow synthesizing of branch
Posted by Leo Yan 3 weeks, 5 days ago
Hi Namhyung,

On Mon, Oct 28, 2024 at 09:40:21AM -0700, Namhyung Kim wrote:
> 
> Hello,
> 
> On Fri, Oct 25, 2024 at 03:30:05PM +0100, Graham Woodward wrote:
> > Currently --itrace=b only shows branch-misses, but this change
> > allows perf to synthesize branches as well.
> >
> > The change also adds the ability to display the target address when
> > the addr field is specified and the instruction is a branch.
> >
> > Graham Woodward (4):
> >   perf arm-spe: Set sample.addr to target address for instruction sample
> >   perf arm-spe: Use ARM_SPE_OP_BRANCH_ERET when synthesizing branches
> >   perf arm-spe: Correctly set sample flags
> >   perf arm-spe: Update --itrace help text
> 
> It doesn't apply to perf-tools-next cleanly.  Can you please rebase?

I confirmed this series can apply cleanly on the branch [1] with the
latest commit 150dab31d560 ("perf disasm: Fix not cleaning up
disasm_line in symbol__disassemble_raw()"):

  [1] https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git
      branch: perf-tools-next

If you mean the branch:

  [2] https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools.git
      branch: perf-tools

You can see it is missing some Arm SPE patches that have been picked up
in the repo [1].

Please kindly suggest what is the right thing to do.

Thanks,
Leo
Re: [PATCH v1 0/4] perf arm-spe: Allow synthesizing of branch
Posted by Namhyung Kim 3 weeks, 5 days ago
Hi Leo,

On Tue, Oct 29, 2024 at 05:03:46PM +0000, Leo Yan wrote:
> Hi Namhyung,
> 
> On Mon, Oct 28, 2024 at 09:40:21AM -0700, Namhyung Kim wrote:
> > 
> > Hello,
> > 
> > On Fri, Oct 25, 2024 at 03:30:05PM +0100, Graham Woodward wrote:
> > > Currently --itrace=b only shows branch-misses, but this change
> > > allows perf to synthesize branches as well.
> > >
> > > The change also adds the ability to display the target address when
> > > the addr field is specified and the instruction is a branch.
> > >
> > > Graham Woodward (4):
> > >   perf arm-spe: Set sample.addr to target address for instruction sample
> > >   perf arm-spe: Use ARM_SPE_OP_BRANCH_ERET when synthesizing branches
> > >   perf arm-spe: Correctly set sample flags
> > >   perf arm-spe: Update --itrace help text
> > 
> > It doesn't apply to perf-tools-next cleanly.  Can you please rebase?
> 
> I confirmed this series can apply cleanly on the branch [1] with the
> latest commit 150dab31d560 ("perf disasm: Fix not cleaning up
> disasm_line in symbol__disassemble_raw()"):
> 
>   [1] https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git
>       branch: perf-tools-next
> 
> If you mean the branch:
> 
>   [2] https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools.git
>       branch: perf-tools
> 
> You can see it is missing some Arm SPE patches that have been picked up
> in the repo [1].
> 
> Please kindly suggest what is the right thing to do.

Sorry, my bad.  It works ok.  I'll add it to tmp.perf-tools-next branch
and run some tests.

Thanks,
Namhyung
Re: [PATCH v1 0/4] perf arm-spe: Allow synthesizing of branch
Posted by Leo Yan 3 weeks, 5 days ago
Hi Namhyung,

On Tue, Oct 29, 2024 at 04:09:46PM -0700, Namhyung Kim wrote:

[...]

> > Please kindly suggest what is the right thing to do.
> 
> Sorry, my bad.  It works ok.  I'll add it to tmp.perf-tools-next branch
> and run some tests.

Thanks for the confirmation!

Leo
Re: [PATCH v1 0/4] perf arm-spe: Allow synthesizing of branch
Posted by James Clark 4 weeks ago

On 25/10/2024 3:30 pm, Graham Woodward wrote:
> Currently --itrace=b only shows branch-misses, but this change
> allows perf to synthesize branches as well.
> 
> The change also adds the ability to display the target address when
> the addr field is specified and the instruction is a branch.
> 
> Graham Woodward (4):
>    perf arm-spe: Set sample.addr to target address for instruction sample
>    perf arm-spe: Use ARM_SPE_OP_BRANCH_ERET when synthesizing branches
>    perf arm-spe: Correctly set sample flags
>    perf arm-spe: Update --itrace help text
> 
>   tools/perf/Documentation/itrace.txt       |  2 +-
>   tools/perf/Documentation/perf-arm-spe.txt |  2 +-
>   tools/perf/builtin-script.c               |  1 +
>   tools/perf/util/arm-spe.c                 | 31 ++++++++++++++++++-----
>   tools/perf/util/auxtrace.h                |  3 +--
>   tools/perf/util/event.h                   |  1 +
>   6 files changed, 29 insertions(+), 11 deletions(-)
> 

Don't forget to pick up the review tags from the previous versions. If
you use the b4 tool, it does this automatically:

Reviewed-by: James Clark <james.clark@linaro.org>
Re: [PATCH v1 0/4] perf arm-spe: Allow synthesizing of branch
Posted by Leo Yan 1 month ago
On Fri, Oct 25, 2024 at 03:30:05PM +0100, Graham Woodward wrote:
> 
> Currently --itrace=b only shows branch-misses, but this change
> allows perf to synthesize branches as well.
> 
> The change also adds the ability to display the target address when
> the addr field is specified and the instruction is a branch.

Tested for this series:

  # perf record -e arm_spe_0/branch_filter=1,load_filter=1/u \
      -- ./false_sharing.exe 1

  # perf script --itrace=i10ib  -F,+addr,+flags
    false_sharing.e  880532 [005] 1852579.389533:          1                                    branch:   jmp                       ffff91beb224     ffff91beb220 __GI___tunables_init+0x40 (/usr/lib/aarch64-linux-gnu/ld-2.31.so)
    false_sharing.e  880532 [005] 1852579.389538:          1                                    branch:   jmp                       ffff91bec318     ffff91bec314 _dl_next_ld_env_entry+0x24 (/usr/lib/aarch64-linux-gnu/ld-2.31.so)
    false_sharing.e  880532 [005] 1852579.389620:          1                                    branch:   jmp                       ffff91be0f14     ffff91be0f10 _dl_new_object+0x168 (/usr/lib/aarch64-linux-gnu/ld-2.31.so)
    false_sharing.e  880532 [005] 1852579.389802:          1                                    branch:   jmp                       ffff91be2cf0     ffff91be2cec _dl_map_object_deps+0x3f4 (/usr/lib/aarch64-linux-gnu/ld-2.31.so)
    false_sharing.e  880532 [005] 1852579.389802:         10                              instructions:   jmp                       ffff91be2cf0     ffff91be2cec _dl_map_object_deps+0x3f4 (/usr/lib/aarch64-linux-gnu/ld-2.31.so)
    false_sharing.e  880532 [005] 1852579.389824:          1                                    branch:   br miss                   ffff91bee4e4     ffff91bee4e0 strcmp+0xa0 (/usr/lib/aarch64-linux-gnu/ld-2.31.so)
    false_sharing.e  880532 [005] 1852579.389849:          1                                    branch:   jmp                       ffff91be1868     ffff91be1880 _dl_relocate_object+0x4a8 (/usr/lib/aarch64-linux-gnu/ld-2.31.so)
    false_sharing.e  880532 [005] 1852579.389858:          1                                    branch:   jmp                       ffff91be1868     ffff91be1880 _dl_relocate_object+0x4a8 (/usr/lib/aarch64-linux-gnu/ld-2.31.so)
    false_sharing.e  880532 [005] 1852579.389861:          1                                    branch:   jmp                       ffff91be1c20     ffff91be1bcc _dl_relocate_object+0x7f4 (/usr/lib/aarch64-linux-gnu/ld-2.31.so)
    false_sharing.e  880532 [005] 1852579.389875:         10                              instructions:                                        0     ffff91bdfe38 _dl_lookup_symbol_x+0x58 (/usr/lib/aarch64-linux-gnu/ld-2.31.so)
    false_sharing.e  880532 [005] 1852579.389876:          1                                    branch:   jmp                       ffff91bdf3a8     ffff91bdf434 do_lookup_x+0x114 (/usr/lib/aarch64-linux-gnu/ld-2.31.so)
    false_sharing.e  880532 [005] 1852579.389879:          1                                    branch:   jmp                       ffff91be18ec     ffff91be18e8 _dl_relocate_object+0x510 (/usr/lib/aarch64-linux-gnu/ld-2.31.so)
    false_sharing.e  880532 [005] 1852579.389886:          1                                    branch:   jmp                       ffff91bee440     ffff91bdf2dc check_match+0x154 (/usr/lib/aarch64-linux-gnu/ld-2.31.so)
    false_sharing.e  880532 [005] 1852579.389890:          1                                    branch:   jmp                       ffff91bdfed4     ffff91bdfed0 _dl_lookup_symbol_x+0xf0 (/usr/lib/aarch64-linux-gnu/ld-2.31.so)
    false_sharing.e  880532 [005] 1852579.389893:         10                              instructions:                                        0     ffff91be1974 _dl_relocate_object+0x59c (/usr/lib/aarch64-linux-gnu/ld-2.31.so)
    false_sharing.e  880532 [005] 1852579.389894:          1                                    branch:   jmp                       ffff91bdf3f4     ffff91bdf3f0 do_lookup_x+0xd0 (/usr/lib/aarch64-linux-gnu/ld-2.31.so)
    false_sharing.e  880532 [005] 1852579.389906:          1                                    branch:   jmp                       ffff91bdfea4     ffff91bdfe90 _dl_lookup_symbol_x+0xb0 (/usr/lib/aarch64-linux-gnu/ld-2.31.so)

  # perf test "Check Arm SPE"
  114: Check Arm SPE trace data recording and synthesized samples      : Ok
  115: Check Arm SPE doesn't hang when there are forks                 : Ok

Tested-by: Leo Yan <leo.yan@arm.com>
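
For anyone who wants to check the `perf script` output above
mechanically, the following is a minimal Python sketch that splits a
sample line into event, flags, addr and ip. The column layout is an
assumption inferred only from the lines quoted in this thread, not from
the perf documentation, and the names `LINE_RE`, `Sample`, `parse_line`
and `SAMPLE_LINES` are hypothetical. The three sample lines are copied
from the output above with runs of spaces condensed.

```python
import re
from collections import Counter
from dataclasses import dataclass
from typing import Optional

# Assumed layout (inferred from the output in this thread only):
#   comm pid [cpu] time: period event: [flags] addr ip sym+off (dso)
# The flags column may be empty or contain spaces (e.g. "br miss").
LINE_RE = re.compile(
    r"^\s*(?P<comm>\S+)\s+(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]\s+"
    r"(?P<time>\d+\.\d+):\s+(?P<period>\d+)\s+(?P<event>[\w-]+):\s+"
    r"(?P<flags>.*?)\s*\b(?P<addr>[0-9a-f]+)\s+(?P<ip>[0-9a-f]+)\s+"
    r"(?P<sym>\S+)\s+\((?P<dso>[^)]*)\)\s*$"
)


@dataclass
class Sample:
    event: str   # e.g. "branch" or "instructions"
    flags: str   # e.g. "jmp", "br miss", or "" when no flags are printed
    addr: int    # branch target for branches, 0 otherwise in the output above
    ip: int      # address of the sampled instruction
    sym: str     # symbol+offset as printed
    dso: str     # path of the mapped object


def parse_line(line: str) -> Optional[Sample]:
    """Return a Sample for one output line, or None if it does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    return Sample(
        event=m["event"],
        flags=m["flags"].strip(),
        addr=int(m["addr"], 16),
        ip=int(m["ip"], 16),
        sym=m["sym"],
        dso=m["dso"],
    )


# Three lines taken from the output above (whitespace condensed).
SAMPLE_LINES = [
    "false_sharing.e 880532 [005] 1852579.389533: 1 branch: jmp "
    "ffff91beb224 ffff91beb220 __GI___tunables_init+0x40 "
    "(/usr/lib/aarch64-linux-gnu/ld-2.31.so)",
    "false_sharing.e 880532 [005] 1852579.389824: 1 branch: br miss "
    "ffff91bee4e4 ffff91bee4e0 strcmp+0xa0 "
    "(/usr/lib/aarch64-linux-gnu/ld-2.31.so)",
    "false_sharing.e 880532 [005] 1852579.389875: 10 instructions: "
    "0 ffff91bdfe38 _dl_lookup_symbol_x+0x58 "
    "(/usr/lib/aarch64-linux-gnu/ld-2.31.so)",
]

if __name__ == "__main__":
    parsed = [parse_line(line) for line in SAMPLE_LINES]
    # Count samples per (event, flags) pair; "-" stands for "no flags".
    counts = Counter((s.event, s.flags or "-") for s in parsed if s)
    for (event, flags), n in sorted(counts.items()):
        print(f"{event:14s} {flags:8s} {n}")
```

Feeding the full `perf script` output through `parse_line` gives a quick
way to confirm that every branch sample carries a non-zero addr, while
instruction samples for non-branch instructions keep addr at 0, as seen
in the listing above.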