When a "max" limit lower than the current usage was written, the
existing code silently failed. This series improves on that by
attempting to synchronously reclaim device memory to push the usage
under the new max limit, and by returning -EBUSY if the usage still
exceeds the limit afterwards.
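From user space, the visible change is that a write to the limit file can now fail with EBUSY instead of appearing to succeed. The following is a minimal sketch of a writer that reports that error, not code from this series. The helper name dmem_write_max() is hypothetical, and the "<region> <bytes>" line format and any region name or cgroup path passed in (for example a dmem.max file under /sys/fs/cgroup) are assumptions to be checked against the cgroup-v2 documentation.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: write "<region> <bytes>\n" to a dmem.max-style
 * file. Returns 0 on success or a negative errno; with this series
 * applied, -EBUSY would mean usage could not be pushed under the limit.
 */
static int dmem_write_max(const char *path, const char *region,
			  unsigned long long bytes)
{
	char buf[256];
	ssize_t written;
	int len, fd, ret = 0;

	assert(path && region);

	len = snprintf(buf, sizeof(buf), "%s %llu\n", region, bytes);
	if (len < 0 || (size_t)len >= sizeof(buf))
		return -EINVAL;

	/* The limit file must already exist; it is never created here. */
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	written = write(fd, buf, (size_t)len);
	if (written < 0)
		ret = -errno;		/* e.g. -EBUSY from the controller */
	else if (written != len)
		ret = -EIO;		/* short write, treated as failure */

	close(fd);
	return ret;
}
```

A caller that gets -EBUSY back can either retry with a higher limit or free device memory in the affected cgroup first.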
Patch 1 fixes a pre-existing amdgpu_vram_mgr_init() error path bug.
Patch 2 introduces struct dmem_cgroup_init for extensible region
registration.
Patch 3 implements and documents a reclaim callback interface
for the dmem controller.
Patch 4 implements a TTM reclaim callback.
Patches 5-6 hook up the reclaim callback to the dmem cgroup-aware
drivers xe and amdgpu.
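The idea behind patches 2 and 3 can be illustrated with a small stand-alone model: registration takes an init struct rather than positional arguments, so an optional ops table with a reclaim hook can be added without touching callers that do not use it. Every name below (sketch_region, sketch_region_init, sketch_region_ops and their fields) is hypothetical and chosen for illustration; the actual layout of struct dmem_cgroup_init and the callback signature are defined by the patches, not by this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_region;

/* Hypothetical ops table; only a reclaim hook is modelled. */
struct sketch_region_ops {
	/* Try to free @target bytes: 0 on progress, negative errno otherwise. */
	int (*reclaim)(struct sketch_region *region, uint64_t target);
};

/*
 * Hypothetical init struct. Fields left out of a designated initializer
 * are zero, so optional members can be appended later without changing
 * existing callers.
 */
struct sketch_region_init {
	uint64_t size;				/* required */
	const char *name;			/* required */
	const struct sketch_region_ops *ops;	/* optional */
};

struct sketch_region {
	uint64_t size;
	const char *name;
	const struct sketch_region_ops *ops;
};

static int sketch_register_region(struct sketch_region *region,
				  const struct sketch_region_init *init)
{
	assert(region && init);

	if (!init->name || !init->size)
		return -EINVAL;

	region->size = init->size;
	region->name = init->name;
	region->ops = init->ops;
	return 0;
}

/* A region without ops, or without a reclaim hook, cannot be reclaimed. */
static int sketch_region_can_reclaim(const struct sketch_region *region)
{
	return region->ops && region->ops->reclaim;
}
```

A driver that has no reclaim support would fill in only .size and .name, while a reclaim-capable driver would additionally set .ops.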
v2:
- Remove the error propagation that was in a previous series (Maarten)
- A number of updates in patch 1. See its commit message for
details (Maarten)
v3:
- Add patch 1 fixing a pre-existing amdgpu_vram_mgr_init() error path
bug where drmm_cgroup_register_region() was called before
INIT_LIST_HEAD() and gpu_buddy_init(), causing a kernel panic on
failure. (Sashiko-bot)
- Use an rwsem to protect reclaim callback registration and region
unregister against concurrent reclaim invocations. (Sashiko-bot)
- Fix ttm_resource_manager_set_dmem_region() storing an error pointer
in man->cg unconditionally. (Sashiko-bot)
- Fix kernel-doc function name format for ttm_bo_evict_cgroup() and
ttm_resource_manager_set_dmem_region().
v4:
- Rebased on drm-tip; dropped the XE_PL_STOLEN guard in the xe patch
as stolen memory uses a separate TTM manager.
v5:
- Add patch 2 introducing struct dmem_cgroup_init to make the
dmem_cgroup_register_region() API extensible without adding positional
arguments in the future.
- Use nonblock=true in reset_all_resource_limits() to avoid sleeping
inside rcu_read_lock() in dmemcs_offline(). (Sashiko-bot)
- Compare usage against the truncated limit stored in cnt.max, not the
original u64. (Sashiko-bot)
- Use DMEM_MAX_RECLAIM_RETRIES (16) retry budget instead of 5, matching
the memcg controller; only -ENOSPC (no progress) counts against the
budget, other errors abort immediately.
- Handle NULL region in ttm_resource_manager_set_dmem_region() to clear
the reclaim callback, preventing use-after-free when the manager is
torn down while the dmem region outlives it. (Sashiko-bot)
- Return 0 on any eviction progress; reserve -ENOSPC for zero progress.
- Clear the reclaim callback in xe and amdgpu fini paths to prevent
use-after-free after driver unbind with open DRM file descriptors.
(Sashiko-bot)
- Register xe fini devres action before drmm_cgroup_register_region()
so LIFO teardown runs unregister first, draining callbacks before the
manager is destroyed. (Sashiko-bot)
- Switch amdgpu to explicit dmem_cgroup_unregister_region() at the top
of amdgpu_vram_mgr_fini() before any manager teardown, since amdgpu's
fini is called explicitly during driver unbind before drmm cleanup.
(Sashiko-bot)
- Wrap the xe reclaim callback with drm_dev_enter()/drm_dev_exit() to
prevent TTM reclaim from running after driver unbind.
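The xe ordering argument in the v5 notes relies on managed cleanup actions running in reverse order of registration. The following is a simplified user-space model of that property, using a single LIFO action list; all names (sketch_dev, sketch_add_action() and so on) are hypothetical, and the real devres and drmm machinery is considerably more involved.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_ACTIONS 8

typedef void (*sketch_action_fn)(void *data);

struct sketch_action {
	sketch_action_fn fn;
	void *data;
};

/* Hypothetical device with a stack of cleanup actions. */
struct sketch_dev {
	struct sketch_action actions[SKETCH_MAX_ACTIONS];
	size_t nr;
};

/* Records the order in which cleanup actions ran. */
struct sketch_log {
	const char *events[SKETCH_MAX_ACTIONS];
	size_t nr;
};

static int sketch_add_action(struct sketch_dev *dev, sketch_action_fn fn,
			     void *data)
{
	assert(dev && fn);

	if (dev->nr == SKETCH_MAX_ACTIONS)
		return -1;

	dev->actions[dev->nr].fn = fn;
	dev->actions[dev->nr].data = data;
	dev->nr++;
	return 0;
}

/* Run cleanup actions last-in, first-out. */
static void sketch_release_all(struct sketch_dev *dev)
{
	while (dev->nr) {
		struct sketch_action *a = &dev->actions[--dev->nr];

		a->fn(a->data);
	}
}

static void sketch_mgr_fini(void *data)
{
	struct sketch_log *log = data;

	log->events[log->nr++] = "mgr_fini";
}

static void sketch_region_unregister(void *data)
{
	struct sketch_log *log = data;

	log->events[log->nr++] = "region_unregister";
}

/* Register the manager fini first so that it runs last on teardown. */
static void sketch_setup(struct sketch_dev *dev, struct sketch_log *log)
{
	sketch_add_action(dev, sketch_mgr_fini, log);
	sketch_add_action(dev, sketch_region_unregister, log);
}
```

In this model the region is unregistered, and its callbacks drained, before the manager fini runs, which is the order the v5 change aims for.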
v6:
- Move the ops check inside down_read() in set_resource_max(), guarded
by region->unregistered, to close a UAF race against
dmem_cgroup_unregister_region(). (Sashiko-bot)
- Fix dmem_cgroup_ops->reclaim docstring: -ENOSPC is retried up to
DMEM_MAX_RECLAIM_RETRIES times, not an immediate stop. (Sashiko-bot)
- Fix mgr->cg_region never being assigned in amdgpu_vram_mgr_init(),
causing dmem_cgroup_unregister_region() in fini to silently no-op.
(Sashiko-bot)
- Reorder amdgpu_vram_mgr_fini() to call set_used(false) and
evict_all() before dmem_cgroup_unregister_region(), so
ttm_resource_free() can uncharge via man->cg during eviction; clear
man->cg after unregister. (Sashiko-bot)
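The retry semantics described in the v5 and v6 notes can be sketched as a small user-space model: only -ENOSPC (no progress) counts against a budget of 16 attempts, any other error aborts immediately, and the write fails with -EBUSY when usage cannot be pushed under the new limit. All names here are hypothetical. Leaving the old limit in place on failure, propagating other errors unchanged, and not refilling the budget after progress are assumptions of this sketch that the cover letter does not confirm.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the budget of 16 named in the changelog. */
#define SKETCH_MAX_RECLAIM_RETRIES 16

/* Hypothetical callback: 0 on any progress, -ENOSPC on none. */
typedef int (*sketch_reclaim_fn)(uint64_t target, void *priv);

struct sketch_pool {
	uint64_t usage;
	uint64_t max;
};

static int sketch_set_max(struct sketch_pool *pool, uint64_t new_max,
			  sketch_reclaim_fn reclaim, void *priv)
{
	int retries = SKETCH_MAX_RECLAIM_RETRIES;

	assert(pool);

	while (pool->usage > new_max) {
		int ret;

		if (!reclaim)
			return -EBUSY;	/* nothing can lower the usage */

		ret = reclaim(pool->usage - new_max, priv);
		if (ret == -ENOSPC) {
			/* No progress: the only case that costs budget. */
			if (--retries == 0)
				return -EBUSY;
			continue;
		}
		if (ret)
			return ret;	/* other errors abort immediately */
	}

	/* Assumption: the limit is only updated once usage fits. */
	pool->max = new_max;
	return 0;
}

/* Example driver state: frees at most @chunk bytes per call. */
struct sketch_dev_state {
	struct sketch_pool *pool;
	uint64_t chunk;
	unsigned int calls;
};

static int sketch_reclaim_chunk(uint64_t target, void *priv)
{
	struct sketch_dev_state *st = priv;
	uint64_t freed = st->chunk < target ? st->chunk : target;

	st->calls++;
	if (!freed)
		return -ENOSPC;	/* zero progress */

	st->pool->usage -= freed;
	return 0;		/* any progress counts as success */
}
```

With a chunk size of zero the callback never makes progress, so the model gives up after SKETCH_MAX_RECLAIM_RETRIES calls and reports -EBUSY.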
User-space tests are at
https://patchwork.freedesktop.org/series/163935/
Test-with: 20260428065411.4222-1-thomas.hellstrom@linux.intel.com
Thomas Hellström (6):
drm/amdgpu: Fix init ordering in amdgpu_vram_mgr_init()
cgroup/dmem: Introduce struct dmem_cgroup_init for region initialization
cgroup/dmem: Add reclaim callback for lowering max below current usage
drm/ttm: Hook up a cgroup-aware reclaim callback for the dmem
controller
drm/xe: Wire up dmem cgroup reclaim for VRAM manager
drm/amdgpu: Wire up dmem cgroup reclaim for VRAM manager
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 30 ++++++-
drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h | 2 +
drivers/gpu/drm/drm_drv.c | 8 +-
drivers/gpu/drm/ttm/ttm_bo.c | 95 +++++++++++++++++++-
drivers/gpu/drm/ttm/ttm_bo_util.c | 3 +-
drivers/gpu/drm/ttm/ttm_resource.c | 50 +++++++++++
drivers/gpu/drm/xe/xe_ttm_vram_mgr.c | 53 +++++++++--
include/drm/drm_drv.h | 4 +-
include/drm/ttm/ttm_bo.h | 10 +++
include/drm/ttm/ttm_resource.h | 7 ++
include/linux/cgroup_dmem.h | 38 +++++++-
kernel/cgroup/dmem.c | 129 ++++++++++++++++++++++++---
13 files changed, 396 insertions(+), 35 deletions(-)
--
2.54.0