drivers/gpu/drm/drm_gem.c | 117 +++++++++++ drivers/gpu/drm/drm_gem_shmem_helper.c | 264 +++++++++++++++++++++++- drivers/gpu/drm/panfrost/panfrost_gem.c | 14 +- drivers/gpu/drm/panfrost/panfrost_gem.h | 2 +- drivers/gpu/drm/panfrost/panfrost_mmu.c | 86 ++------ drivers/gpu/drm/panthor/panthor_gem.c | 2 +- include/drm/drm_gem.h | 6 + include/drm/drm_gem_shmem_helper.h | 29 ++- include/linux/scatterlist.h | 17 ++ lib/scatterlist.c | 175 +++++++++++----- 10 files changed, 579 insertions(+), 133 deletions(-)
This patch series is a proposal for implementing sparse page allocations for shmem objects. It was initially motivated by a kind of BO managed by the Panfrost driver, the tiler heap, which grows on demand every time the GPU faults on a virtual address within its drm_mm-managed ranged. Because keeping a struct page pointer array that can describe the entire virtual range is wasteful when only a few backing pages have been allocated, at Collabora we thought a sparse allocation approach with xarrays was a more efficient choice. Since sparse and 'dense' DRM shmem objects must be managed slightly differently, the API is expanded to allow client drivers to create sparse objects and also to expand their page backing range, but everything else should remain as transparent as possible and be handled from within the DRM shmem system itself. Discussion of previus revision can be found here: https://lore.kernel.org/dri-devel/20250218232552.3450939-1-adrian.larumbe@collabora.com/ Changelog: v2: - Removed patch with helper for non-blocking shmem page allocations. - Moved page_array definitions away from scatterlist interface to hide them from consumers. - Refactored sg_alloc_append_table_from_pages() so that it now calls sg_alloc_append_table_from_page_array() to avoid code duplication. - Undid extension of __drm_gem_shmem_create() argument list so that a sparse shmem object is now fully defined in a parent function. - Moved check for absence of backing pages when putting an object into drm_gem_shmem_put_pages() - Added explanatory comments above DRM WARN'ings across yet unimplemented shmem code paths, like kernel vmap's and UM mappings of sparse objects - Created drm_gem helper for doing the actual sparse allocation, to keep the interface aligned with the existing one with regular shmem objects. - Split the body of drm_gem_shmem_get_sparse_pages_locked() into two separate functions, one which performs the actual page allocation, and another one that retrieves an sgtable. - Expanded the argument list of drm_gem_shmem_get_sparse_pages() and its children functions so that it takes an gfp mask, in the even that we would want to do non-blocking allocations, for instance like when we wish to avoid races with the shrinker memory reclaim path. - Created shmem helper that returns whether an shmem object has any backing pages. TODO: The following items need to be worked on, and will be the subject of a v3 of this RFC: - Handle the special case when some of the pages in a sparse allocation range are already present, rather than bailing out immediately. - Redefining panfrost_gem_object::sgts into an xarray or perhaps a sg_append_table to avoid memory waste in allocating more sgtable pointers than we could need. - Deciding on the rules for sparse shmem object's kmaps and UM maps. Adrián Larumbe (6): lib/scatterlist.c: Support constructing sgt from page xarray drm/shmem: Introduce the notion of sparse objects drm/shmem: Implement sparse allocation of pages for shmem objects drm/panfrost: Use shmem sparse allocation for heap BOs drm/shmem: Add a helper to check object's page backing status drm/panfrost/panthor: Take sparse objects into account for fdinfo drivers/gpu/drm/drm_gem.c | 117 +++++++++++ drivers/gpu/drm/drm_gem_shmem_helper.c | 264 +++++++++++++++++++++++- drivers/gpu/drm/panfrost/panfrost_gem.c | 14 +- drivers/gpu/drm/panfrost/panfrost_gem.h | 2 +- drivers/gpu/drm/panfrost/panfrost_mmu.c | 86 ++------ drivers/gpu/drm/panthor/panthor_gem.c | 2 +- include/drm/drm_gem.h | 6 + include/drm/drm_gem_shmem_helper.h | 29 ++- include/linux/scatterlist.h | 17 ++ lib/scatterlist.c | 175 +++++++++++----- 10 files changed, 579 insertions(+), 133 deletions(-) base-commit: 2f9d51740cc30e0d2c8a23a55b1e20cf2513c250 -- 2.48.1
Hi Am 26.03.25 um 03:14 schrieb Adrián Larumbe: > This patch series is a proposal for implementing sparse page allocations > for shmem objects. It was initially motivated by a kind of BO managed by > the Panfrost driver, the tiler heap, which grows on demand every time the > GPU faults on a virtual address within its drm_mm-managed ranged. I've looked at panfrost_gem.h and found that the driver's gem structure has grown quite a bit. It seems to have outgrown gem-shmem already. I think you should consider pulling a copy of gem-shmem into the driver and building a dedicated memory manager on top. Best regards Thomas > > Because keeping a struct page pointer array that can describe the entire > virtual range is wasteful when only a few backing pages have been > allocated, at Collabora we thought a sparse allocation approach with > xarrays was a more efficient choice. > > Since sparse and 'dense' DRM shmem objects must be managed slightly > differently, the API is expanded to allow client drivers to create sparse > objects and also to expand their page backing range, but everything else > should remain as transparent as possible and be handled from within the DRM > shmem system itself. > > Discussion of previus revision can be found here: > https://lore.kernel.org/dri-devel/20250218232552.3450939-1-adrian.larumbe@collabora.com/ > > Changelog: > v2: > - Removed patch with helper for non-blocking shmem page allocations. > - Moved page_array definitions away from scatterlist interface to hide > them from consumers. > - Refactored sg_alloc_append_table_from_pages() so that it now calls > sg_alloc_append_table_from_page_array() to avoid code duplication. > - Undid extension of __drm_gem_shmem_create() argument list so that a sparse > shmem object is now fully defined in a parent function. > - Moved check for absence of backing pages when putting an object into > drm_gem_shmem_put_pages() > - Added explanatory comments above DRM WARN'ings across yet unimplemented > shmem code paths, like kernel vmap's and UM mappings of sparse objects > - Created drm_gem helper for doing the actual sparse allocation, to keep > the interface aligned with the existing one with regular shmem objects. > - Split the body of drm_gem_shmem_get_sparse_pages_locked() into two separate > functions, one which performs the actual page allocation, and another > one that retrieves an sgtable. > - Expanded the argument list of drm_gem_shmem_get_sparse_pages() and its > children functions so that it takes an gfp mask, in the even that we would > want to do non-blocking allocations, for instance like when we wish to > avoid races with the shrinker memory reclaim path. > - Created shmem helper that returns whether an shmem object has any backing pages. > > TODO: > The following items need to be worked on, and will be the subject of a v3 of this RFC: > > - Handle the special case when some of the pages in a sparse allocation range are > already present, rather than bailing out immediately. > - Redefining panfrost_gem_object::sgts into an xarray or perhaps a sg_append_table > to avoid memory waste in allocating more sgtable pointers than we could need. > - Deciding on the rules for sparse shmem object's kmaps and UM maps. > > Adrián Larumbe (6): > lib/scatterlist.c: Support constructing sgt from page xarray > drm/shmem: Introduce the notion of sparse objects > drm/shmem: Implement sparse allocation of pages for shmem objects > drm/panfrost: Use shmem sparse allocation for heap BOs > drm/shmem: Add a helper to check object's page backing status > drm/panfrost/panthor: Take sparse objects into account for fdinfo > > drivers/gpu/drm/drm_gem.c | 117 +++++++++++ > drivers/gpu/drm/drm_gem_shmem_helper.c | 264 +++++++++++++++++++++++- > drivers/gpu/drm/panfrost/panfrost_gem.c | 14 +- > drivers/gpu/drm/panfrost/panfrost_gem.h | 2 +- > drivers/gpu/drm/panfrost/panfrost_mmu.c | 86 ++------ > drivers/gpu/drm/panthor/panthor_gem.c | 2 +- > include/drm/drm_gem.h | 6 + > include/drm/drm_gem_shmem_helper.h | 29 ++- > include/linux/scatterlist.h | 17 ++ > lib/scatterlist.c | 175 +++++++++++----- > 10 files changed, 579 insertions(+), 133 deletions(-) > > > base-commit: 2f9d51740cc30e0d2c8a23a55b1e20cf2513c250 > -- > 2.48.1 -- -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Frankenstrasse 146, 90461 Nuernberg, Germany GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman HRB 36809 (AG Nuernberg)
Hi Thomas, On Mon, 31 Mar 2025 09:13:59 +0200 Thomas Zimmermann <tzimmermann@suse.de> wrote: > Hi > > Am 26.03.25 um 03:14 schrieb Adrián Larumbe: > > This patch series is a proposal for implementing sparse page allocations > > for shmem objects. It was initially motivated by a kind of BO managed by > > the Panfrost driver, the tiler heap, which grows on demand every time the > > GPU faults on a virtual address within its drm_mm-managed ranged. > > I've looked at panfrost_gem.h and found that the driver's gem structure > has grown quite a bit. It seems to have outgrown gem-shmem already. I > think you should consider pulling a copy of gem-shmem into the driver > and building a dedicated memory manager on top. Actually, it's not just something we need for panfrost. I plan to use the same non-blocking allocation mechanism for panthor's heap chunks/buffers, and lima could use it for its heap buffers too. The non-blocking page allocation is also something i915 has been open-coding here [1], and I believe that some of this logic could (and should IMHO) live in common code rather than each driver coming with its own solution, thus increasing the risk of bugs/inconsistencies. That's even more important if we provide a common gem-shmem shrinker like Dmitry's has been proposing. Best Regards, Boris [1]https://elixir.bootlin.com/linux/v6.13.7/source/drivers/gpu/drm/i915/gem/i915_gem_shmem.c#L89
Hi Am 31.03.25 um 10:31 schrieb Boris Brezillon: > Hi Thomas, > > On Mon, 31 Mar 2025 09:13:59 +0200 > Thomas Zimmermann <tzimmermann@suse.de> wrote: > >> Hi >> >> Am 26.03.25 um 03:14 schrieb Adrián Larumbe: >>> This patch series is a proposal for implementing sparse page allocations >>> for shmem objects. It was initially motivated by a kind of BO managed by >>> the Panfrost driver, the tiler heap, which grows on demand every time the >>> GPU faults on a virtual address within its drm_mm-managed ranged. >> I've looked at panfrost_gem.h and found that the driver's gem structure >> has grown quite a bit. It seems to have outgrown gem-shmem already. I >> think you should consider pulling a copy of gem-shmem into the driver >> and building a dedicated memory manager on top. > Actually, it's not just something we need for panfrost. I plan to use > the same non-blocking allocation mechanism for panthor's heap > chunks/buffers, and lima could use it for its heap buffers too. The > non-blocking page allocation is also something i915 has been > open-coding here [1], and I believe that some of this logic could > (and should IMHO) live in common code rather than each driver coming > with its own solution, thus increasing the risk of bugs/inconsistencies. > That's even more important if we provide a common gem-shmem shrinker > like Dmitry's has been proposing. Thanks for the note. Best regards Thomas > > Best Regards, > > Boris > > [1]https://elixir.bootlin.com/linux/v6.13.7/source/drivers/gpu/drm/i915/gem/i915_gem_shmem.c#L89 -- -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Frankenstrasse 146, 90461 Nuernberg, Germany GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman HRB 36809 (AG Nuernberg)
© 2016 - 2025 Red Hat, Inc.