[PATCH v9 00/16] drm/msm: Add PERFCNTR_CONFIG ioctl

Rob Clark posted 16 patches 2 days, 1 hour ago
Add a new PERFCNTR_CONFIG ioctl, serving two functions:

1. Global counter collection (restricted to perfmon_capable()) using the
   MSM_PERFCNTR_STREAM flag.  Global counter sampling is global, i.e.
   across all contexts.  Only a single global counter stream is allowed
   at a time.
2. Reserve counters for local counter collection.  Local counter
   collection is local to a cmdstream (GEM_SUBMIT), and as such is
   allowed in all processes without additional privileges.

The kernel enforces that counters assigned for global counter collection
do not conflict with counters reserved for local counter collection, and
vice versa.  Since local counter collection is scoped to a single
cmdstream, multiple UMD processes can overlap in their reserved
counters, but they cannot conflict with global counter usage.
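
To make the reservation rule above concrete, here is a minimal userspace
sketch of the bookkeeping it implies.  This is not the code from the
series: the struct, the function names, the group size and the -EBUSY
return value are all assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_COUNTERS 16u	/* hypothetical number of counters in a group */

/* Hypothetical per-group bookkeeping, not the driver's actual state */
struct counter_group {
	bool global_active;			/* a global stream exists */
	uint32_t global_mask;			/* counters it owns */
	unsigned int local_refs[NR_COUNTERS];	/* local reservations */
};

/* Bitmask of counters reserved by at least one local user */
uint32_t local_mask(const struct counter_group *g)
{
	uint32_t m = 0;

	for (unsigned int i = 0; i < NR_COUNTERS; i++)
		if (g->local_refs[i])
			m |= 1u << i;
	return m;
}

/* Local reservations may overlap each other, but not the global stream */
int reserve_local(struct counter_group *g, uint32_t mask)
{
	assert((mask >> NR_COUNTERS) == 0);
	if (g->global_active && (mask & g->global_mask))
		return -EBUSY;
	for (unsigned int i = 0; i < NR_COUNTERS; i++)
		if (mask & (1u << i))
			g->local_refs[i]++;
	return 0;
}

/* Only one global stream, and it may not take locally reserved counters */
int assign_global(struct counter_group *g, uint32_t mask)
{
	assert((mask >> NR_COUNTERS) == 0);
	if (g->global_active || (mask & local_mask(g)))
		return -EBUSY;
	g->global_active = true;
	g->global_mask = mask;
	return 0;
}

/* Dropping the global stream frees its counters for local use again */
void release_global(struct counter_group *g)
{
	g->global_active = false;
	g->global_mask = 0;
}
```

In this model two processes can both reserve counter 1 locally, while a
global request touching counter 1 is refused until the local users go
away (release of local reservations is omitted for brevity).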

In the case of local counter collection, the UMD is still responsible
for programming the corresponding SELect registers, and sampling the
counter values, from its cmdstream.  But by performing the reservation
step, the UMD protects itself from the kernel trying to use the same
SEL/counter regs for global counter collection.

For global counter collection, the kernel programs SEL regs, and sets up
a timer for counter sampling.  Userspace reads out the sampled values
from the returned perfcntr stream fd.  Releasing the global perfcntr
stream is simply a matter of close()ing the fd.
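
From the userspace side, consuming the stream could look roughly like
the sketch below.  It only uses plain POSIX read(); the layout of the
data (a flat array of 64-bit samples) and the helper name
drain_samples() are assumptions for illustration, the real record
format is whatever the UABI in this series defines.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read up to 'max' 64-bit samples from 'fd' into 'out'.
 * Returns the number of complete samples read (fewer than 'max' only
 * on end-of-file), or -1 on a read error.  The flat-u64 layout is an
 * assumption, not the documented stream format.
 */
ssize_t drain_samples(int fd, uint64_t *out, size_t max)
{
	size_t want = max * sizeof(*out);
	size_t got = 0;		/* bytes read so far */

	assert(out != NULL);

	while (got < want) {
		ssize_t n = read(fd, (char *)out + got, want - got);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return -1;
		}
		if (n == 0)
			break;			/* end of stream */
		got += (size_t)n;
	}

	return (ssize_t)(got / sizeof(*out));
}
```

A caller would obtain the fd from the ioctl, call drain_samples() as
often as it likes, and close() the fd when done, which per the text
above is all that is needed to release the global stream.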

The final two patches wire up the needed support for global counter
stream collection while IFPC is active, and drop the disabling of IFPC.

The mesa side of this is at:
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/41158

igt test at:
https://gitlab.freedesktop.org/robclark/igt-gpu-tools/-/commits/perfcntrs

wiki page about the design:
https://gitlab.freedesktop.org/drm/msm/-/wikis/adreno:-perfcounter-UABI

Changes in v9:
- Fix msm_perfcntr_init() error path [Claude]
- Fix off-by-one WARN in msm_perfcntr_group_idx [Claude]
- Fix error path leak of allocated_counters [Claude]
- Fix copy_from_user()/copy_to_user() stack corruption/leak [Claude]
- Fix fifo_size overflow [Claude]
- Use kzalloc_objs() where possible
- Disallow duplicate groups in PERFCNTR_CONFIG ioctl
- Add WARN_ON_ONCE() for pwrup_reglist overflow [Claude]
- Link to v8: https://lore.kernel.org/all/20260520162454.18391-1-robin.clark@oss.qualcomm.com/

Changes in v8:
- json fixes [Akhil]
- Use dma_wmb() [Akhil]
- Use kzalloc_obj() where possible
- Link to v7: https://lore.kernel.org/all/20260518190735.16236-1-robin.clark@oss.qualcomm.com

Changes in v7:
- Use smp_load_acquire() for fifo_count_to_end() [Akhil]
- Defer installing stream_fd until end [Akhil]
- Link to v6: https://lore.kernel.org/all/20260514134052.361771-1-robin.clark@oss.qualcomm.com/
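
For readers wondering why the acquire load in fifo_count_to_end()
matters, here is a userspace sketch of a single-producer,
single-consumer FIFO.  C11 atomics stand in for the kernel's
smp_load_acquire()/smp_store_release(); the struct layout, the size and
every function body here are assumptions, not the code from the series.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define FIFO_SIZE 8u	/* hypothetical; must be a power of two */

struct fifo {
	_Atomic uint32_t head;	/* free-running, written by producer */
	_Atomic uint32_t tail;	/* free-running, written by consumer */
	uint64_t buf[FIFO_SIZE];
};

/* Producer: store the sample first, then publish it with a release */
bool fifo_push(struct fifo *f, uint64_t v)
{
	uint32_t head = atomic_load_explicit(&f->head, memory_order_relaxed);
	uint32_t tail = atomic_load_explicit(&f->tail, memory_order_acquire);

	if (head - tail == FIFO_SIZE)
		return false;	/* full */
	f->buf[head & (FIFO_SIZE - 1)] = v;
	atomic_store_explicit(&f->head, head + 1, memory_order_release);
	return true;
}

/*
 * Consumer: number of samples readable without wrapping around the end
 * of buf.  The acquire load pairs with the release store in
 * fifo_push(), so any sample counted here is fully written.
 */
uint32_t fifo_count_to_end(struct fifo *f)
{
	uint32_t head = atomic_load_explicit(&f->head, memory_order_acquire);
	uint32_t tail = atomic_load_explicit(&f->tail, memory_order_relaxed);
	uint32_t count = head - tail;
	uint32_t to_end = FIFO_SIZE - (tail & (FIFO_SIZE - 1));

	assert(count <= FIFO_SIZE);
	return count < to_end ? count : to_end;
}

/* Consumer: hand 'n' slots back to the producer after copying them out */
void fifo_consume(struct fifo *f, uint32_t n)
{
	uint32_t tail = atomic_load_explicit(&f->tail, memory_order_relaxed);

	atomic_store_explicit(&f->tail, tail + n, memory_order_release);
}
```

With a plain load of head, the consumer could observe the new index
before the sample itself is visible and copy stale data out.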

Changes in v6:
- Reword comment [Anna]
- Link to v5: https://lore.kernel.org/all/20260511130017.96867-1-robin.clark@oss.qualcomm.com/

Changes in v5:
- Drop unnecessary runpm in ioctl path
- Link to v4: https://lore.kernel.org/all/20260506171127.133572-1-robin.clark@oss.qualcomm.com

Changes in v4:
- Fix null ptr deref on older gens without perfcntr support [Claude]
- Add upper limit to userspace controlled FIFO size [Claude]
- Fix nr_regs calculation [Claude]
- Link to v3: https://lore.kernel.org/all/20260504190751.61052-1-robin.clark@oss.qualcomm.com/

Changes in v3:
- Fix loop counter issue spotted by Claude review
- Add MSM_PERFCNTR_UPDATE flag to ask kernel to return the actual # of
  available counters in case of -E2BIG
- Proper barriers for modifying pwrup_reglist
- Link to v2: https://lore.kernel.org/all/20260424151140.104093-1-robin.clark@oss.qualcomm.com

Changes in v2:
- Rework makefile magic based on Dmitry's suggestion, and add a2xx/a5xx
  perfcntr tables (although only a6xx+ is supported at this point)
- Fix compile error for compilers that are picky about a struct that
  only contains a flex array
- Drop a6xx_idle() under gpu->lock in a6xx_perfcntr_configure(), replace
  with perfcntr_fence that sel_worker can check
- Add a7xx+ pwrup_reglist support for restoring SELect regs on exit from
  IFPC.  (a6xx doesn't support IFPC, and the pwrup_reglist works a bit
  differently)
- Stop disabling IFPC when global counter stream is active.
- Link to v1: https://lore.kernel.org/all/20260420222621.417276-1-robin.clark@oss.qualcomm.com/

Rob Clark (16):
  drm/msm: Remove obsolete perf infrastructure
  drm/msm: Allow CAP_PERFMON for setting SYSPROF
  drm/msm/adreno: Sync registers from mesa
  drm/msm/registers: Sync gen_header.py from mesa
  drm/msm/registers: Add perfcntr json
  drm/msm: Add a6xx+ perfcntr tables
  drm/msm: Add sysprof accessors
  drm/msm/a6xx: Add yield & flush helper
  drm/msm: Add per-context perfcntr state
  drm/msm: Add basic perfcntr infrastructure
  drm/msm/a6xx+: Add support to configure perfcntrs
  drm/msm/a8xx: Add perfcntr flush sequence
  drm/msm: Add PERFCNTR_CONFIG ioctl
  drm/msm/a6xx: Increase pwrup_reglist size
  drm/msm/a6xx: Append SEL regs to dyn pwrup reglist
  drm/msm/a6xx: Allow IFPC with perfcntr stream

 drivers/gpu/drm/msm/Makefile                  |   27 +-
 drivers/gpu/drm/msm/adreno/a2xx_gpu.c         |    7 -
 drivers/gpu/drm/msm/adreno/a3xx_gpu.c         |   16 -
 drivers/gpu/drm/msm/adreno/a4xx_gpu.c         |    3 -
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c         |   16 +-
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c         |   10 +-
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c         |  219 +-
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h         |   16 +-
 drivers/gpu/drm/msm/adreno/a6xx_preempt.c     |    2 +-
 drivers/gpu/drm/msm/adreno/a8xx_gpu.c         |   33 +-
 drivers/gpu/drm/msm/adreno/a8xx_preempt.c     |    2 +-
 drivers/gpu/drm/msm/adreno/adreno_device.c    |    8 +-
 drivers/gpu/drm/msm/adreno/adreno_gpu.c       |    7 +-
 drivers/gpu/drm/msm/msm_debugfs.c             |    6 -
 drivers/gpu/drm/msm/msm_drv.c                 |    2 +-
 drivers/gpu/drm/msm/msm_drv.h                 |   13 +-
 drivers/gpu/drm/msm/msm_gpu.c                 |  119 +-
 drivers/gpu/drm/msm/msm_gpu.h                 |  104 +-
 drivers/gpu/drm/msm/msm_perf.c                |  235 --
 drivers/gpu/drm/msm/msm_perfcntr.c            |  667 ++++++
 drivers/gpu/drm/msm/msm_perfcntr.h            |  155 ++
 drivers/gpu/drm/msm/msm_ringbuffer.h          |    2 +
 drivers/gpu/drm/msm/msm_submitqueue.c         |    3 +-
 .../msm/registers/adreno/a2xx_perfcntrs.json  |  109 +
 drivers/gpu/drm/msm/registers/adreno/a3xx.xml |    8 +-
 drivers/gpu/drm/msm/registers/adreno/a5xx.xml |  141 +-
 .../msm/registers/adreno/a5xx_perfcntrs.json  |  128 +
 drivers/gpu/drm/msm/registers/adreno/a6xx.xml | 1300 ++++++-----
 .../msm/registers/adreno/a6xx_descriptors.xml |   71 +-
 .../drm/msm/registers/adreno/a6xx_enums.xml   |    3 +
 .../msm/registers/adreno/a6xx_perfcntrs.json  |  112 +
 .../msm/registers/adreno/a7xx_perfcntrs.json  |  228 ++
 .../msm/registers/adreno/a8xx_descriptors.xml |   96 +-
 .../msm/registers/adreno/a8xx_perfcntrs.json  |  240 ++
 .../msm/registers/adreno/a8xx_perfcntrs.xml   | 1929 +++++++++++++++
 .../msm/registers/adreno/adreno_common.xml    |   42 +
 .../drm/msm/registers/adreno/adreno_pm4.xml   |   50 +-
 drivers/gpu/drm/msm/registers/gen_header.py   | 2079 +++++++++--------
 include/uapi/drm/msm_drm.h                    |   48 +
 39 files changed, 6044 insertions(+), 2212 deletions(-)
 delete mode 100644 drivers/gpu/drm/msm/msm_perf.c
 create mode 100644 drivers/gpu/drm/msm/msm_perfcntr.c
 create mode 100644 drivers/gpu/drm/msm/msm_perfcntr.h
 create mode 100644 drivers/gpu/drm/msm/registers/adreno/a2xx_perfcntrs.json
 create mode 100644 drivers/gpu/drm/msm/registers/adreno/a5xx_perfcntrs.json
 create mode 100644 drivers/gpu/drm/msm/registers/adreno/a6xx_perfcntrs.json
 create mode 100644 drivers/gpu/drm/msm/registers/adreno/a7xx_perfcntrs.json
 create mode 100644 drivers/gpu/drm/msm/registers/adreno/a8xx_perfcntrs.json
 create mode 100644 drivers/gpu/drm/msm/registers/adreno/a8xx_perfcntrs.xml

-- 
2.54.0