[PATCH v5 00/19] perf cs-etm: Queue context packets for frontend

James Clark posted 19 patches 1 day, 11 hours ago
Posted by James Clark 1 day, 11 hours ago
Fix thread tracking when decoding Coresight trace and add a new test for
it.

The new test is added as a Perf test workload instead of a custom binary
with its own build system, but this requires a new feature in Perf test
to pass in control pipes which can enable and disable events. This
scopes the recording to just the workload and helps to reduce the amount
of data recorded in tracing tests.
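
As a rough illustration of what scoping a recording with control pipes
looks like, the sketch below drives the text protocol of the existing
`perf record --control` option: write `enable` or `disable` to the control
descriptor, then wait for `ack` on the ack descriptor. It is an assumption
that the pipes passed in by the new Perf test feature are used the same
way; the helper names `send_ctl()` and `recording_scope()` are hypothetical
and are not part of this series.

```python
import contextlib
import os

ACK = b"ack\n"  # reply written to the ack descriptor after each command


def send_ctl(ctl_fd, ack_fd, command):
    """Send one control command and block until it is acknowledged.

    Hypothetical helper: ctl_fd and ack_fd are assumed to be pipe ends
    inherited from whatever process started the recording.
    """
    os.write(ctl_fd, command.encode("ascii") + b"\n")
    reply = b""
    while len(reply) < len(ACK):
        chunk = os.read(ack_fd, len(ACK) - len(reply))
        if not chunk:
            raise RuntimeError("ack pipe closed before acknowledgement")
        reply += chunk
    if reply != ACK:
        raise RuntimeError(f"unexpected reply: {reply!r}")


@contextlib.contextmanager
def recording_scope(ctl_fd, ack_fd):
    """Enable events on entry and disable them on exit, even on error."""
    send_ctl(ctl_fd, ack_fd, "enable")
    try:
        yield
    finally:
        send_ctl(ctl_fd, ack_fd, "disable")
```

A workload would wrap only its region of interest, for example
`with recording_scope(ctl_fd, ack_fd): run_workload()`, with the recorder
started with events disabled (for `perf record`, `--delay=-1`). Here
`run_workload()` stands in for whatever the test wants traced.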

With this new feature we can re-write all of the Coresight tests to make
use of it and remove the remaining binaries, which fixes the following
issues:

 * They didn't work in out-of-source builds
 * A lot of the tests unnecessarily required root and didn't skip
   without it
 * They were mainly qualitative tests which didn't look for specific
   behavior

Most importantly, the long build time and runtime have been reduced. On
a Radxa Orion O6, unroll_loop_thread.c took 37s to compile, which is
longer than the entire Perf build. Now the build time is negligible and
the before and after test runtimes for all the Coresight tests are:

          |   N1SDP   |   Orion O6
  -----------------------------------
  Before  |   4m  0s  |    14m 49s
  After   |      26s  |        56s
  -----------------------------------

Signed-off-by: James Clark <james.clark@linaro.org>
---
Changes in v5:
- Forgot to include this change:
  - Test for actual length of expected raw dump (Leo)
- Link to v4: https://lore.kernel.org/r/20260609-james-cs-context-tracking-fix-v4-0-44f9fb9e5c42@linaro.org

Changes in v4:
- Rename workload-ctl to record-ctl and improve docs (Leo)
- Use new packet argument everywhere in
  cs_etm__synth_instruction_sample() (Sashiko)
- Test for actual length of expected raw dump (Leo)
- Use -fno-inline instead of keyword (Leo)
- Don't test any brace or call lines in deterministic test
- Make sure context switch loop test does cleanup on failure (Sashiko)
- Remove undef int overflows in workloads (Sashiko)
- Link to v3: https://lore.kernel.org/r/20260603-james-cs-context-tracking-fix-v3-0-c392945d9ed5@linaro.org

Changes in v3:
- Minor sashiko comments
  - Close some more pipes
  - Fix warning messages
  - Error handling improvements
- Pass packet into cs_etm__synth_instruction_sample()
- Fixup stale comment (Leo)
- Link to v2: https://lore.kernel.org/r/20260602-james-cs-context-tracking-fix-v2-0-85b5ce6f55c6@linaro.org

Changes in v2:
- Add --workload-ctl option to Perf test
- Re-write all the Coresight tests and speed them up
- Pass packet to memory access function so frontend can use either the
  previous or current packet's EL
- Link to v1: https://lore.kernel.org/r/20260526-james-cs-context-tracking-fix-v1-0-ebd602e18287@linaro.org

---
James Clark (19):
      perf cs-etm: Queue context packets for frontend
      perf test: Add workload-ctl option
      perf test: Add a workload that forces context switches
      perf test cs-etm: Test process attribution
      perf test: Add deterministic workload
      perf test cs-etm: Replace unroll loop thread with deterministic decode test
      perf test cs-etm: Remove asm_pure_loop test
      perf test cs-etm: Replace memcpy test with raw dump stress test
      perf test: Add named_threads workload
      perf test cs-etm: Test decoding for concurrent threads test
      perf test cs-etm: Remove duplicate branch tests
      perf test cs-etm: Skip if not root
      perf test cs-etm: Reduce snapshot size
      perf test cs-etm: Speed up basic test
      perf test cs-etm: Remove unused Coresight workloads
      perf test cs-etm: Make disassembly test use kcore
      perf test cs-etm: Add all branch instructions to test
      perf test cs-etm: Speed up disassembly test
      perf test cs-etm: Move existing tests to coresight folder

 Documentation/trace/coresight/coresight-perf.rst   |  78 +------
 MAINTAINERS                                        |   2 -
 tools/perf/Documentation/perf-test.txt             |  24 ++-
 tools/perf/Makefile.perf                           |  14 +-
 tools/perf/scripts/python/arm-cs-trace-disasm.py   |  20 +-
 tools/perf/tests/builtin-test.c                    | 187 +++++++++++++++-
 tools/perf/tests/shell/coresight/Makefile          |  29 ---
 .../perf/tests/shell/coresight/Makefile.miniconfig |  14 --
 tools/perf/tests/shell/coresight/asm_pure_loop.sh  |  22 --
 .../tests/shell/coresight/asm_pure_loop/.gitignore |   1 -
 .../tests/shell/coresight/asm_pure_loop/Makefile   |  34 ---
 .../shell/coresight/asm_pure_loop/asm_pure_loop.S  |  30 ---
 .../tests/shell/coresight/concurrent_threads.sh    |  45 ++++
 .../tests/shell/coresight/context_switch_thread.sh |  69 ++++++
 tools/perf/tests/shell/coresight/deterministic.sh  |  72 +++++++
 .../tests/shell/coresight/memcpy_thread/.gitignore |   1 -
 .../tests/shell/coresight/memcpy_thread/Makefile   |  33 ---
 .../shell/coresight/memcpy_thread/memcpy_thread.c  |  80 -------
 .../tests/shell/coresight/memcpy_thread_16k_10.sh  |  22 --
 .../perf/tests/shell/coresight/raw_dump_stress.sh  |  65 ++++++
 .../shell/{ => coresight}/test_arm_coresight.sh    |  43 ++--
 .../{ => coresight}/test_arm_coresight_disasm.sh   |  23 +-
 .../tests/shell/coresight/thread_loop/.gitignore   |   1 -
 .../tests/shell/coresight/thread_loop/Makefile     |  33 ---
 .../shell/coresight/thread_loop/thread_loop.c      |  85 --------
 .../shell/coresight/thread_loop_check_tid_10.sh    |  23 --
 .../shell/coresight/thread_loop_check_tid_2.sh     |  23 --
 .../shell/coresight/unroll_loop_thread/.gitignore  |   1 -
 .../shell/coresight/unroll_loop_thread/Makefile    |  33 ---
 .../unroll_loop_thread/unroll_loop_thread.c        |  75 -------
 .../tests/shell/coresight/unroll_loop_thread_10.sh |  22 --
 tools/perf/tests/shell/lib/coresight.sh            | 134 ------------
 tools/perf/tests/tests.h                           |   3 +
 tools/perf/tests/workloads/Build                   |   4 +
 tools/perf/tests/workloads/context_switch_loop.c   | 110 ++++++++++
 tools/perf/tests/workloads/deterministic.c         |  39 ++++
 tools/perf/tests/workloads/named_threads.c         | 109 ++++++++++
 tools/perf/util/cs-etm-decoder/cs-etm-decoder.c    |  21 +-
 tools/perf/util/cs-etm.c                           | 236 ++++++++++++---------
 tools/perf/util/cs-etm.h                           |   8 +-
 40 files changed, 926 insertions(+), 942 deletions(-)
---
base-commit: 351a37f2fda4db668cff8ba12f2992d73dccdaea
change-id: 20260515-james-cs-context-tracking-fix-754998bae7ed

Best regards,
-- 
James Clark <james.clark@linaro.org>
Re: [PATCH v5 00/19] perf cs-etm: Queue context packets for frontend
Posted by Arnaldo Carvalho de Melo 5 hours ago
On Tue, Jun 09, 2026 at 03:40:05PM +0100, James Clark wrote:
> Fix thread tracking when decoding Coresight trace and add a new test for
> it.

The issues found by Sashiko seem mild and you can address them in
follow-up patches, I think.

So, for the benefit of having perf-tools-next available for linux-next
testing, and since the window is closing soon, I've merged this, ok?

- Arnaldo
 