[PATCH v13 0/6] Support sparse mappings in Panthor

Adrián Larumbe posted 6 patches 1 day, 23 hours ago
drivers/gpu/drm/panthor/panthor_device.h |   3 +
drivers/gpu/drm/panthor/panthor_drv.c    |  12 +-
drivers/gpu/drm/panthor/panthor_gem.c    |  18 ++
drivers/gpu/drm/panthor/panthor_gem.h    |   2 +
drivers/gpu/drm/panthor/panthor_mmu.c    | 237 ++++++++++++++++++-----
include/uapi/drm/panthor_drm.h           |  26 ++-
6 files changed, 249 insertions(+), 49 deletions(-)
This patch series implements sparse mappings in Panthor. Because the HW MMU has no
support for sparse page table entries, they are implemented with a dummy object, onto
which sparse mappings requested through VM_BIND are mapped cyclically.

To that end, a new VM_BIND flag was added in the driver's uAPI.
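
As an illustration of what "mapped cyclically" could look like, here is a minimal
userspace sketch, not the driver code. It assumes a 2MiB dummy object (the size
mentioned in the v8 changelog), 4KiB-aligned ranges, and a BO offset equal to the
VA's offset inside a 2MiB page (as described in the v11 changelog). The names
struct sparse_chunk, dummy_bo_offset() and sparse_split() are hypothetical and
exist only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the dummy object is 2MiB, as per the v8 changelog entry. */
#define DUMMY_BO_SIZE (2ULL * 1024 * 1024)

/* Hypothetical descriptor for one contiguous piece of a sparse range. */
struct sparse_chunk {
	uint64_t va;        /* GPU virtual address where the chunk starts */
	uint64_t bo_offset; /* offset inside the dummy object backing it */
	uint64_t len;       /* chunk length in bytes */
};

/* Offset in the dummy object: the VA's offset inside a 2MiB page. */
static uint64_t dummy_bo_offset(uint64_t va)
{
	return va & (DUMMY_BO_SIZE - 1);
}

/*
 * Split [va, va + size) into chunks that never cross a 2MiB boundary, so
 * each chunk maps onto a contiguous slice of the dummy object and the
 * object is reused from offset 0 at every boundary. Returns the number of
 * chunks written to out (at most max).
 */
static unsigned int sparse_split(uint64_t va, uint64_t size,
				 struct sparse_chunk *out, unsigned int max)
{
	unsigned int n = 0;

	/* This sketch only handles 4KiB-aligned ranges. */
	assert((va & 0xfff) == 0 && (size & 0xfff) == 0);

	while (size && n < max) {
		uint64_t off = dummy_bo_offset(va);
		uint64_t len = DUMMY_BO_SIZE - off;

		if (len > size)
			len = size;

		out[n].va = va;
		out[n].bo_offset = off;
		out[n].len = len;

		va += len;
		size -= len;
		n++;
	}

	return n;
}
```

With these assumptions, a range starting at VA 0x1ff000 with size 0x402000 splits
into four chunks: one 4KiB chunk at dummy offset 0x1ff000, two full 2MiB chunks at
offset 0, and a trailing 4KiB chunk at offset 0.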

The end goal of this patch series is to improve support for Vulkan sparse
resources. At the moment, on Mali hardware, a Vulkan sparse map is implemented
by mapping the specified region to a "dummy bo" so that accesses to it do not
fault. A newly created sparse resource starts off unmapped, and therefore also
has to be mapped to the "dummy bo". This "dummy bo" is small (one page in size)
compared to the sizes of the VA ranges that we might want to map to it, so a
large number of vm_bind ops can be necessary. For example, if the user were to
create a 100e6-byte sparse resident resource, we'd have to issue
ceil(100e6/0x1000)=24415 VM_BIND map operations.

The new VM_BIND sparse mapping feature addresses this particular inefficiency by
letting us implement a single Vulkan sparse map operation and sparse resident
resource initialization with just one map operation.
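
The operation counts above can be checked with a short sketch. This is plain
userspace arithmetic, not driver or Mesa code; map_ops_per_page() and
map_ops_sparse() are hypothetical helper names, and the 0x1000 page size is the
one used in the example above.

```c
#include <stdint.h>

/* Size of the one-page "dummy bo" from the example above. */
#define DUMMY_PAGE_SIZE 0x1000ULL

/* Map operations needed when every page of the range needs its own op. */
static uint64_t map_ops_per_page(uint64_t size)
{
	/* Integer ceil(size / DUMMY_PAGE_SIZE). */
	return (size + DUMMY_PAGE_SIZE - 1) / DUMMY_PAGE_SIZE;
}

/* Map operations needed with a single sparse map op per range. */
static uint64_t map_ops_sparse(uint64_t size)
{
	return size ? 1 : 0;
}
```

For a 100e6-byte resource, map_ops_per_page() gives 24415, while
map_ops_sparse() gives 1.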

Discussion of the previous revision of this series:
https://lore.kernel.org/dri-devel/20260521014359.2011484-1-adrian.larumbe@collabora.com

Neither smatch nor sparse reported any issues.

Changes in v13:
 - Rolled back one of the fixes suggested by Sashiko, as it doesn't really apply (no need to trigger
 a rescheduling inside sparse mapping code since it's always preemptable).
 - Fixed UAF in allocation error path for a new VM when freeing dummy object.

Changes in v12:
 - Dealt with sparse VMA case when restoring a VMA that had been evicted by the shrinker.
 - Fixed issues uncovered by Sashiko at https://sashiko.dev/#/patchset/20260507214939.2852489-1-adrian.larumbe%40collabora.com

Changes in v11:
 - Fixed UAF bug when creating dummy object after vm pool.
 - Assigned a BO offset to sparse VAs that is the same as the address inside a 2MiB page.
 - Removed R-b tag as it no longer applies.
 - Some minor nits.

Changes in v10:
 - Fixed uAPI enum ordering issue
 - Reworked sparse mapping by hardcoding size of dummy object.
 - Added missing cleanup in case dummy object fails to allocate
 - Other minor fixes.

Changes in v9:
 - Addressed some nits.
 - Rearranged argument checks for vm_bind to profit from compiler optimisations.
 - Added some further comments.

Changes in v8:
 - Allocate a single 2MiB BO as a dummy buffer for sparse mappings. Let its pages
 be retrieved just like for any other BO during a map operation.
 - Removed locking around allocation of the dummy BO by doing it right at the
  time of a VMA pool creation.
 - Some minor style fixes.
 - Refactor low level page mapping code in sm_remap and sm_map.
 - Made NO_EXEC a mandatory flag for sparse mappings.
 - Actually bumped the driver's minor revision number.

Changes in v7:
 - Switched back to Panthor BO-backed dummy object instead of raw pages so as to profit from
 the existing shrinker reclaim paths.
 - Created Dummy BO's per file context to avoid information leaking between them.
 - Reorganised some of the low-level page mapping code.
 - Added commits deleting spurious white space and unused op context field.

Changes in v6:
 - Moved all the GPUVM core code into the driver backend.
 - Discarded commits that touch on the gpuvm core too.
 - Redesigned the uAPI so that no repeat range or user BO is supplied for sparse mappings.
 - Replaced user-supplied BO with a kernel-allocated array of raw pages.

Changes in v5:
 - Minor fixes to drm_gpuvm.c.
 - Add panthor MMU page sizes device queryable param.
 - Add helper to make sure unmaps of repeated regions are correct.
 - Some fixes to Panthor's repeat mappings implementation.
 - Lump arguments to panthor_vm_prepare_map_op_ctx into a single struct.

Changes in v4:
 - Fixed the warnings reported by the kernel test robot.
  https://lore.kernel.org/oe-kbuild-all/202507041635.WyDu3TQ1-lkp@intel.com/
 - Fixed the warnings reported by the CI.
  https://patchwork.freedesktop.org/series/151264/

No changes in v3.

Changes in v2:
 - Make panthor use this stuff.
 - Make it possible to express a repeated mapping of any suitably sized
  and aligned range of a BO, rather than strictly the page-sized
  prefix, generalizing the API. Rename DRM_GPUVA_SINGLE_PAGE to
  DRM_GPUVA_REPEAT.
 - Clean up parts of drm/gpuvm affected by these changes.

Adrián Larumbe (6):
  drm/panthor: Expose GPU page sizes to UM
  drm/panthor: Pass vm_bind_op to vm_prepare_map_op_ctx
  drm/panthor: Delete spurious whitespace from uAPI header
  drm/panthor: Remove unused operation context field
  drm/panthor: Support sparse mappings
  drm/panthor: Bump the driver version to 1.9

 drivers/gpu/drm/panthor/panthor_device.h |   3 +
 drivers/gpu/drm/panthor/panthor_drv.c    |  12 +-
 drivers/gpu/drm/panthor/panthor_gem.c    |  18 ++
 drivers/gpu/drm/panthor/panthor_gem.h    |   2 +
 drivers/gpu/drm/panthor/panthor_mmu.c    | 237 ++++++++++++++++++-----
 include/uapi/drm/panthor_drm.h           |  26 ++-
 6 files changed, 249 insertions(+), 49 deletions(-)


base-commit: c1079aebb4de218caa86c44f9a53700d1a582683
--
2.53.0