From nobody Thu Apr 2 17:04:56 2026
From: Thomas Hellström
To: intel-xe@lists.freedesktop.org
Cc: Thomas Hellström, Natalie Vock, Johannes Weiner, Tejun Heo,
 Michal Koutný, cgroups@vger.kernel.org, Huang Rui, Matthew Brost,
 Matthew Auld, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
 Simona Vetter, David Airlie, Christian König, Alex Deucher, Rodrigo Vivi,
 dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org,
 linux-kernel@vger.kernel.org
Subject: [PATCH 1/5] cgroup/dmem: Return error when setting max below current usage
Date: Fri, 27 Mar 2026 09:15:56 +0100
Message-ID: <20260327081600.4885-2-thomas.hellstrom@linux.intel.com>
In-Reply-To: <20260327081600.4885-1-thomas.hellstrom@linux.intel.com>
References: <20260327081600.4885-1-thomas.hellstrom@linux.intel.com>

Return -EBUSY to userspace when writing dmem.max below the cgroup's
current device memory usage, rather than silently leaving the limit
unchanged.

Assisted-by: GitHub Copilot:claude-sonnet-4.6
Signed-off-by: Thomas Hellström
---
 kernel/cgroup/dmem.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/kernel/cgroup/dmem.c b/kernel/cgroup/dmem.c
index 9d95824dc6fa..3e6d4c0b26a1 100644
--- a/kernel/cgroup/dmem.c
+++ b/kernel/cgroup/dmem.c
@@ -144,22 +144,24 @@ static void free_cg_pool(struct dmem_cgroup_pool_state *pool)
 	dmemcg_pool_put(pool);
 }
 
-static void
+static int
 set_resource_min(struct dmem_cgroup_pool_state *pool, u64 val)
 {
 	page_counter_set_min(&pool->cnt, val);
+	return 0;
 }
 
-static void
+static int
 set_resource_low(struct dmem_cgroup_pool_state *pool, u64 val)
 {
 	page_counter_set_low(&pool->cnt, val);
+	return 0;
 }
 
-static void
+static int
 set_resource_max(struct dmem_cgroup_pool_state *pool, u64 val)
 {
-	page_counter_set_max(&pool->cnt, val);
+	return page_counter_set_max(&pool->cnt, val);
 }
 
 static u64 get_resource_low(struct dmem_cgroup_pool_state *pool)
@@ -726,7 +728,7 @@ static int dmemcg_parse_limit(char *options, struct dmem_cgroup_region *region,
 
 static ssize_t dmemcg_limit_write(struct kernfs_open_file *of,
 				 char *buf, size_t nbytes, loff_t off,
-				 void (*apply)(struct dmem_cgroup_pool_state *, u64))
+				 int (*apply)(struct dmem_cgroup_pool_state *, u64))
 {
 	struct dmemcg_state *dmemcs = css_to_dmemcs(of_css(of));
 	int err = 0;
@@ -773,7 +775,7 @@ static ssize_t dmemcg_limit_write(struct kernfs_open_file *of,
 		}
 
 		/* And commit */
-		apply(pool, new_limit);
+		err = apply(pool, new_limit);
 		dmemcg_pool_put(pool);
 
 out_put:
-- 
2.53.0

From nobody Thu Apr 2 17:04:56 2026
From: Thomas Hellström
Subject: [PATCH 2/5] cgroup/dmem: Add reclaim callback for lowering max below current usage
Date: Fri, 27 Mar 2026 09:15:57 +0100
Message-ID: <20260327081600.4885-3-thomas.hellstrom@linux.intel.com>
In-Reply-To: <20260327081600.4885-1-thomas.hellstrom@linux.intel.com>
References: <20260327081600.4885-1-thomas.hellstrom@linux.intel.com>

Add an optional reclaim callback to struct dmem_cgroup_region. When
dmem.max is set below current usage, invoke the callback to evict memory
and retry setting the limit rather than failing immediately. Signal
interruptions propagate back to the write() caller.

RFC:
Due to us updating the max limit _after_ the usage has been
sufficiently lowered, this should be prone to failures if there are
aggressive allocators running in parallel to the reclaim.
So can we somehow enforce the new limit while the eviction is
happening?

Assisted-by: GitHub Copilot:claude-sonnet-4.6
Signed-off-by: Thomas Hellström
---
 include/linux/cgroup_dmem.h | 11 +++++
 kernel/cgroup/dmem.c        | 94 +++++++++++++++++++++++++++++++++----
 2 files changed, 96 insertions(+), 9 deletions(-)

diff --git a/include/linux/cgroup_dmem.h b/include/linux/cgroup_dmem.h
index dd4869f1d736..61520a431740 100644
--- a/include/linux/cgroup_dmem.h
+++ b/include/linux/cgroup_dmem.h
@@ -26,6 +26,10 @@ bool dmem_cgroup_state_evict_valuable(struct dmem_cgroup_pool_state *limit_pool,
 				      bool ignore_low, bool *ret_hit_low);
 
 void dmem_cgroup_pool_state_put(struct dmem_cgroup_pool_state *pool);
+void dmem_cgroup_region_set_reclaim(struct dmem_cgroup_region *region,
+				    int (*reclaim)(struct dmem_cgroup_pool_state *pool,
+						   u64 target_bytes, void *priv),
+				    void *priv);
 #else
 static inline __printf(2,3) struct dmem_cgroup_region *
 dmem_cgroup_register_region(u64 size, const char *name_fmt, ...)
@@ -62,5 +66,12 @@ bool dmem_cgroup_state_evict_valuable(struct dmem_cgroup_pool_state *limit_pool,
 static inline void dmem_cgroup_pool_state_put(struct dmem_cgroup_pool_state *pool)
 { }
 
+static inline void
+dmem_cgroup_region_set_reclaim(struct dmem_cgroup_region *region,
+			       int (*reclaim)(struct dmem_cgroup_pool_state *pool,
+					      u64 target_bytes, void *priv),
+			       void *priv)
+{ }
+
 #endif
 #endif /* _CGROUP_DMEM_H */
diff --git a/kernel/cgroup/dmem.c b/kernel/cgroup/dmem.c
index 3e6d4c0b26a1..f993fb058b74 100644
--- a/kernel/cgroup/dmem.c
+++ b/kernel/cgroup/dmem.c
@@ -51,6 +51,18 @@ struct dmem_cgroup_region {
 	 * No new pools should be added to the region afterwards.
 	 */
 	bool unregistered;
+
+	/**
+	 * @reclaim: Optional callback invoked when dmem.max is set below the
+	 * current usage of a pool. The driver should attempt to free at least
+	 * @target_bytes from @pool. May be called multiple times if usage
+	 * remains above the limit after returning.
+	 */
+	int (*reclaim)(struct dmem_cgroup_pool_state *pool, u64 target_bytes,
+		       void *priv);
+
+	/** @reclaim_priv: Private data passed to @reclaim. */
+	void *reclaim_priv;
 };
 
 struct dmemcg_state {
@@ -145,23 +157,59 @@ static void free_cg_pool(struct dmem_cgroup_pool_state *pool)
 }
 
 static int
-set_resource_min(struct dmem_cgroup_pool_state *pool, u64 val)
+set_resource_min(struct dmem_cgroup_pool_state *pool, u64 val,
+		 struct dmem_cgroup_region *region)
 {
 	page_counter_set_min(&pool->cnt, val);
 	return 0;
 }
 
 static int
-set_resource_low(struct dmem_cgroup_pool_state *pool, u64 val)
+set_resource_low(struct dmem_cgroup_pool_state *pool, u64 val,
+		 struct dmem_cgroup_region *region)
 {
 	page_counter_set_low(&pool->cnt, val);
 	return 0;
 }
 
 static int
-set_resource_max(struct dmem_cgroup_pool_state *pool, u64 val)
+set_resource_max(struct dmem_cgroup_pool_state *pool, u64 val,
+		 struct dmem_cgroup_region *region)
 {
-	return page_counter_set_max(&pool->cnt, val);
+	int err = page_counter_set_max(&pool->cnt, val);
+
+	if (err != -EBUSY || !region || !region->reclaim)
+		return err;
+
+	/*
+	 * The new max is below current usage. Ask the driver to evict memory
+	 * and retry, up to a bounded number of times. Signal interruptions are
+	 * propagated back to the write() caller; other reclaim failures leave
+	 * -EBUSY as the result.
+	 */
+	for (int retries = 5; retries > 0; retries--) {
+		u64 usage = page_counter_read(&pool->cnt);
+		u64 target = usage > val ? usage - val : 0;
+		int reclaim_err;
+
+		if (!target) {
+			err = page_counter_set_max(&pool->cnt, val);
+			break;
+		}
+
+		reclaim_err = region->reclaim(pool, target, region->reclaim_priv);
+		if (reclaim_err) {
+			if (reclaim_err == -EINTR || reclaim_err == -ERESTARTSYS)
+				err = reclaim_err;
+			break;
+		}
+
+		err = page_counter_set_max(&pool->cnt, val);
+		if (err != -EBUSY)
+			break;
+	}
+
+	return err;
 }
 
 static u64 get_resource_low(struct dmem_cgroup_pool_state *pool)
@@ -186,9 +234,9 @@ static u64 get_resource_current(struct dmem_cgroup_pool_state *pool)
 
 static void reset_all_resource_limits(struct dmem_cgroup_pool_state *rpool)
 {
-	set_resource_min(rpool, 0);
-	set_resource_low(rpool, 0);
-	set_resource_max(rpool, PAGE_COUNTER_MAX);
+	set_resource_min(rpool, 0, NULL);
+	set_resource_low(rpool, 0, NULL);
+	set_resource_max(rpool, PAGE_COUNTER_MAX, NULL);
 }
 
 static void dmemcs_offline(struct cgroup_subsys_state *css)
@@ -570,6 +618,32 @@ void dmem_cgroup_pool_state_put(struct dmem_cgroup_pool_state *pool)
 }
 EXPORT_SYMBOL_GPL(dmem_cgroup_pool_state_put);
 
+/**
+ * dmem_cgroup_region_set_reclaim - Register a reclaim callback on a region.
+ * @region: The region to register the callback for.
+ * @reclaim: Callback to invoke when dmem.max is set below current usage.
+ *           Called with the pool that needs reclaiming and the number of
+ *           bytes to free. Returns 0 on progress, negative on failure.
+ * @priv: Opaque pointer passed back to @reclaim.
+ *
+ * When dmem.max is lowered below the current usage of a cgroup pool, the
+ * dmem controller will call @reclaim with a target number of bytes to free.
+ * After @reclaim returns the controller retries setting the limit; if usage
+ * is still too high it calls @reclaim again, up to a bounded retry count.
+ */
+void dmem_cgroup_region_set_reclaim(struct dmem_cgroup_region *region,
+				    int (*reclaim)(struct dmem_cgroup_pool_state *pool,
+						   u64 target_bytes, void *priv),
+				    void *priv)
+{
+	if (!region)
+		return;
+
+	region->reclaim = reclaim;
+	region->reclaim_priv = priv;
+}
+EXPORT_SYMBOL_GPL(dmem_cgroup_region_set_reclaim);
+
 static struct dmem_cgroup_pool_state *
 get_cg_pool_unlocked(struct dmemcg_state *cg, struct dmem_cgroup_region *region)
 {
@@ -728,7 +802,8 @@ static int dmemcg_parse_limit(char *options, struct dmem_cgroup_region *region,
 
 static ssize_t dmemcg_limit_write(struct kernfs_open_file *of,
 				 char *buf, size_t nbytes, loff_t off,
-				 int (*apply)(struct dmem_cgroup_pool_state *, u64))
+				 int (*apply)(struct dmem_cgroup_pool_state *, u64,
+					      struct dmem_cgroup_region *))
 {
 	struct dmemcg_state *dmemcs = css_to_dmemcs(of_css(of));
 	int err = 0;
@@ -775,7 +850,8 @@ static ssize_t dmemcg_limit_write(struct kernfs_open_file *of,
 		}
 
 		/* And commit */
-		err = apply(pool, new_limit);
+		err = apply(pool, new_limit, region);
+
 		dmemcg_pool_put(pool);
 
 out_put:
-- 
2.53.0

From nobody Thu Apr 2 17:04:56 2026
From: Thomas Hellström
Subject: [PATCH 3/5] drm/ttm: Hook up a cgroup-aware reclaim callback for the dmem controller
Date: Fri, 27 Mar 2026 09:15:58 +0100
Message-ID: <20260327081600.4885-4-thomas.hellstrom@linux.intel.com>
In-Reply-To: <20260327081600.4885-1-thomas.hellstrom@linux.intel.com>
References: <20260327081600.4885-1-thomas.hellstrom@linux.intel.com>

Add ttm_bo_evict_cgroup() to evict buffer objects charged to a specific
dmem cgroup pool from a resource manager's LRU until a byte target is met.

Add ttm_resource_manager_set_dmem_region() to register the TTM eviction
path as the reclaim callback for a dmem cgroup region. The eviction
context is interruptible; signals abort the operation and propagate back
through the write() syscall.

Introduce a new mode for the bo LRU walker so that sleeping locks can be
taken. This can be used when the caller doesn't hold any previous
dma_resv locks, and where it intends to hold at most one lock at a time.
Like the rest of the TTM eviction this should sooner than later be
converted to full WW transactions.

Assisted-by: GitHub Copilot:claude-sonnet-4.6
Signed-off-by: Thomas Hellström
---
 drivers/gpu/drm/ttm/ttm_bo.c       | 95 +++++++++++++++++++++++++++++-
 drivers/gpu/drm/ttm/ttm_bo_util.c  |  3 +-
 drivers/gpu/drm/ttm/ttm_resource.c | 36 +++++++++++
 include/drm/ttm/ttm_bo.h           | 10 ++++
 include/drm/ttm/ttm_resource.h     |  4 ++
 5 files changed, 144 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index d85f0a37ac35..1745557c184c 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -515,12 +515,20 @@ static s64 ttm_bo_evict_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *
 {
 	struct ttm_bo_evict_walk *evict_walk =
 		container_of(walk, typeof(*evict_walk), walk);
+	/* Capture size before eviction in case res is cleared. */
+	s64 bo_size = bo->base.size;
 	s64 lret;
 
 	if (!dmem_cgroup_state_evict_valuable(evict_walk->limit_pool, bo->resource->css,
 					      evict_walk->try_low, &evict_walk->hit_low))
 		return 0;
 
+	/*
+	 * evict_walk->place is NULL in cgroup drain mode. Drivers'
+	 * eviction_valuable() callbacks must handle a NULL place, treating it
+	 * as "any placement": the TTM base implementation already does so via
+	 * ttm_resource_intersects().
+	 */
 	if (bo->pin_count || !bo->bdev->funcs->eviction_valuable(bo, evict_walk->place))
 		return 0;
 
@@ -536,11 +544,15 @@ static s64 ttm_bo_evict_cb(struct ttm_lru_walk *walk, struct ttm_buffer_object *
 		goto out;
 
 	evict_walk->evicted++;
-	if (evict_walk->res)
+	if (evict_walk->res) {
 		lret = ttm_resource_alloc(evict_walk->evictor, evict_walk->place,
 					  evict_walk->res, NULL);
-	if (lret == 0)
-		return 1;
+		if (lret == 0)
+			return 1;
+	} else {
+		/* Cgroup drain: return bytes freed for byte-denominated progress. */
+		return bo_size;
+	}
 out:
 	/* Errors that should terminate the walk. */
 	if (lret == -ENOSPC)
@@ -614,6 +626,83 @@ static int ttm_bo_evict_alloc(struct ttm_device *bdev,
 	return 0;
 }
 
+/**
+ * ttm_bo_evict_cgroup - Evict buffer objects charged to a specific cgroup.
+ * @bdev: The TTM device.
+ * @man: The resource manager whose LRU to walk.
+ * @limit_pool: The cgroup pool state whose members should be evicted.
+ * @target_bytes: Number of bytes to free.
+ * @ctx: The TTM operation context.
+ *
+ * Walk the LRU of @man and evict buffer objects that are charged to the
+ * cgroup identified by @limit_pool, until at least @target_bytes have been
+ * freed. Mirrors the two-pass (trylock -> sleeping-lock, low-watermark)
+ * strategy used by ttm_bo_evict_alloc().
+ *
+ * Return: >= @target_bytes on full success, 0..target_bytes-1 if partial,
+ *         negative error code on fatal error.
+ */
+s64 ttm_bo_evict_cgroup(struct ttm_device *bdev,
+			struct ttm_resource_manager *man,
+			struct dmem_cgroup_pool_state *limit_pool,
+			s64 target_bytes,
+			struct ttm_operation_ctx *ctx)
+{
+	struct ttm_bo_evict_walk evict_walk = {
+		.walk = {
+			.ops = &ttm_evict_walk_ops,
+			.arg = { .ctx = ctx },
+		},
+		.limit_pool = limit_pool,
+		/* place, evictor, res left NULL: selects cgroup drain mode */
+	};
+	s64 lret, pass;
+
+	evict_walk.walk.arg.trylock_only = true;
+	lret = ttm_lru_walk_for_evict(&evict_walk.walk, bdev, man, target_bytes);
+	if (lret < 0 || lret >= target_bytes)
+		return lret;
+
+	/* Second pass: also evict BOs at the low watermark. */
+	if (evict_walk.hit_low) {
+		evict_walk.try_low = true;
+		pass = ttm_lru_walk_for_evict(&evict_walk.walk, bdev, man,
+					      target_bytes - lret);
+		if (pass < 0)
+			return pass;
+		lret += pass;
+		if (lret >= target_bytes)
+			return lret;
+	}
+
+	/* Full sleeping-lock pass for remaining target. */
+	evict_walk.try_low = evict_walk.hit_low = false;
+	evict_walk.walk.arg.trylock_only = false;
+
+retry:
+	evict_walk.walk.arg.sleeping_lock = true;
+	do {
+		evict_walk.evicted = 0;
+		pass = ttm_lru_walk_for_evict(&evict_walk.walk, bdev, man,
+					      target_bytes - lret);
+		if (pass < 0) {
+			lret = pass;
+			goto out;
+		}
+		lret += pass;
+	} while (lret < target_bytes && evict_walk.evicted);
+
+	/* One more attempt if we hit the low limit during sleeping-lock pass. */
+	if (lret < target_bytes && evict_walk.hit_low && !evict_walk.try_low) {
+		evict_walk.try_low = true;
+		goto retry;
+	}
+
+out:
+	return lret;
+}
+EXPORT_SYMBOL(ttm_bo_evict_cgroup);
+
 /**
  * ttm_bo_pin - Pin the buffer object.
  * @bo: The buffer object to pin
diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
index f83b7d5ec6c6..81c6a674c462 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_util.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
@@ -999,7 +999,8 @@ __ttm_bo_lru_cursor_next(struct ttm_bo_lru_cursor *curs)
 		bo = res->bo;
 		if (ttm_lru_walk_trylock(curs, bo))
 			bo_locked = true;
-		else if (!arg->ticket || arg->ctx->no_wait_gpu || arg->trylock_only)
+		else if ((!arg->ticket && !arg->sleeping_lock) || arg->ctx->no_wait_gpu ||
+			 arg->trylock_only)
 			continue;
 
 		if (!ttm_bo_get_unless_zero(bo)) {
diff --git a/drivers/gpu/drm/ttm/ttm_resource.c b/drivers/gpu/drm/ttm/ttm_resource.c
index 9f36631d48b6..936552f426a7 100644
--- a/drivers/gpu/drm/ttm/ttm_resource.c
+++ b/drivers/gpu/drm/ttm/ttm_resource.c
@@ -937,3 +937,39 @@ void ttm_resource_manager_create_debugfs(struct ttm_resource_manager *man,
 #endif
 }
 EXPORT_SYMBOL(ttm_resource_manager_create_debugfs);
+
+static int ttm_resource_manager_dmem_reclaim(struct dmem_cgroup_pool_state *pool,
+					     u64 target_bytes, void *priv)
+{
+	struct ttm_resource_manager *man = priv;
+	struct ttm_operation_ctx ctx = { .interruptible = true };
+	s64 freed;
+
+	freed = ttm_bo_evict_cgroup(man->bdev, man, pool, target_bytes, &ctx);
+	if (freed < 0)
+		return freed;
+
+	return freed >= (s64)target_bytes ? 0 : -ENOSPC;
+}
+
+/**
+ * ttm_resource_manager_set_dmem_region - Associate a dmem cgroup region with a
+ *                                        resource manager and register a reclaim
+ *                                        callback.
+ * @man: The resource manager.
+ * @region: The dmem cgroup region to associate, may be NULL or IS_ERR().
+ *
+ * Sets @man->cg and registers ttm_resource_manager_dmem_reclaim() so that
+ * writing to dmem.max below current usage triggers TTM eviction rather than
+ * returning -EBUSY to userspace.
+ */
+void ttm_resource_manager_set_dmem_region(struct ttm_resource_manager *man,
+					  struct dmem_cgroup_region *region)
+{
+	man->cg = region;
+	if (!IS_ERR_OR_NULL(region))
+		dmem_cgroup_region_set_reclaim(region,
+					       ttm_resource_manager_dmem_reclaim,
+					       man);
+}
+EXPORT_SYMBOL(ttm_resource_manager_set_dmem_region);
diff --git a/include/drm/ttm/ttm_bo.h b/include/drm/ttm/ttm_bo.h
index 8310bc3d55f9..32791c4db2a9 100644
--- a/include/drm/ttm/ttm_bo.h
+++ b/include/drm/ttm/ttm_bo.h
@@ -226,6 +226,11 @@ struct ttm_lru_walk_arg {
 	struct ww_acquire_ctx *ticket;
 	/** @trylock_only: Only use trylock for locking. */
 	bool trylock_only;
+	/**
+	 * @sleeping_lock: Use sleeping locks even with %NULL @ticket.
+	 * @trylock_only has precedence over this field.
+ */ + bool sleeping_lock; }; =20 /** @@ -431,6 +436,11 @@ void ttm_bo_unpin(struct ttm_buffer_object *bo); int ttm_bo_evict_first(struct ttm_device *bdev, struct ttm_resource_manager *man, struct ttm_operation_ctx *ctx); +s64 ttm_bo_evict_cgroup(struct ttm_device *bdev, + struct ttm_resource_manager *man, + struct dmem_cgroup_pool_state *limit_pool, + s64 target_bytes, + struct ttm_operation_ctx *ctx); int ttm_bo_access(struct ttm_buffer_object *bo, unsigned long offset, void *buf, int len, int write); vm_fault_t ttm_bo_vm_reserve(struct ttm_buffer_object *bo, diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h index 33e80f30b8b8..c187e6c8b871 100644 --- a/include/drm/ttm/ttm_resource.h +++ b/include/drm/ttm/ttm_resource.h @@ -39,6 +39,7 @@ =20 struct dentry; struct dmem_cgroup_device; +struct dmem_cgroup_region; struct drm_printer; struct ttm_device; struct ttm_resource_manager; @@ -475,6 +476,9 @@ void ttm_resource_manager_init(struct ttm_resource_mana= ger *man, struct ttm_device *bdev, uint64_t size); =20 +void ttm_resource_manager_set_dmem_region(struct ttm_resource_manager *man, + struct dmem_cgroup_region *region); + int ttm_resource_manager_evict_all(struct ttm_device *bdev, struct ttm_resource_manager *man); =20 --=20 2.53.0 From nobody Thu Apr 2 17:04:56 2026 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.15]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 869D73D47D7; Fri, 27 Mar 2026 08:16:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.15 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1774599418; cv=none; b=pvSTK6Pr57qtALb3XU2jDFfsz6MVmk4qKl/cqSM0YWkawiP6QEO52BM3Ub2nMhQnJBLAhVKFWR6X8dpLHAM1yc1aScWaqF4O0QwatSp8B8CpSPsVEhJTiRo4EQybok9WX/dm4ibQi28rxct38K03GqKMLKZSeIr6q6vyzKPUenw= ARC-Message-Signature: i=1; a=rsa-sha256; 
d=subspace.kernel.org; s=arc-20240116; t=1774599418; c=relaxed/simple; bh=VCXy79fbBAt5QnmrzmMMuvOq3FMqVb404fQrx6b1TDI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=jr5SLU1nYJG0w0lCJ3dZdndslks/XWcM2sCsxPpYAF6YEdQOmHBxEvjEPj1gqYUAFa1wwOmrRlUz8FHYqRE2BUcg/O0+0pjyXr6RDWFiduDQmUkCldCtuFY4tLjST3BpUTGdDfgtgRtbAXEk41Qh1a2ueGmp58nZUZpZISHhDr4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=pass smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=QDTMBDSo; arc=none smtp.client-ip=192.198.163.15 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="QDTMBDSo" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1774599415; x=1806135415; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=VCXy79fbBAt5QnmrzmMMuvOq3FMqVb404fQrx6b1TDI=; b=QDTMBDSo+YVdfWrmBy6I2uEOHZG4R7CD/mjMERNQn+xzvIEHMrhzDW08 Zw4md3MFqOtZG6XkKZjpvSqi3lgU7TyH0Zjr9vsFpiX1wK+EhlbzD3q2P CSFZRNmBWCYsj4LdKDUHQF2IDNrx+hYyCo4MTjfo/HEd+xhV/39BeNrGo nJ/2L3KptwZTBTn0NbA12pVdSXZFzXSLWeofIOgP8g+wXtr6c8hDCmLR5 6OopA5f7MsJAOQ/kiX/fxcsMPvHJv1Zon6hsKDPcYo8mvXZAE3fAlQTO7 Vt9cf7qrRLBakHkla0iRWc06dLsJmTEDwLypbfxNRp5YWYeGMh7TYpZmr A==; X-CSE-ConnectionGUID: KE4WAniPQiigw785XtatlA== X-CSE-MsgGUID: fKY6JmBTRXW7BHO9R8COMA== X-IronPort-AV: E=McAfee;i="6800,10657,11741"; a="75784087" X-IronPort-AV: E=Sophos;i="6.23,143,1770624000"; d="scan'208";a="75784087" Received: from orviesa002.jf.intel.com ([10.64.159.142]) by fmvoesa109.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Mar 2026 01:16:55 -0700 
From: Thomas Hellström
To: intel-xe@lists.freedesktop.org
Cc: Thomas Hellström, Natalie Vock, Johannes Weiner, Tejun Heo,
	Michal Koutný, cgroups@vger.kernel.org, Huang Rui, Matthew Brost,
	Matthew Auld, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Simona Vetter, David Airlie, Christian König, Alex Deucher,
	Rodrigo Vivi, dri-devel@lists.freedesktop.org,
	amd-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org
Subject: [PATCH 4/5] drm/xe: Wire up dmem cgroup reclaim for VRAM manager
Date: Fri, 27 Mar 2026 09:15:59 +0100
Message-ID: <20260327081600.4885-5-thomas.hellstrom@linux.intel.com>
In-Reply-To: <20260327081600.4885-1-thomas.hellstrom@linux.intel.com>
References: <20260327081600.4885-1-thomas.hellstrom@linux.intel.com>

Register the VRAM manager with the dmem cgroup reclaim infrastructure
so that lowering dmem.max below current VRAM usage triggers TTM
eviction rather than failing with -EBUSY.

Assisted-by: GitHub Copilot:claude-sonnet-4.6
Signed-off-by: Thomas Hellström
---
 drivers/gpu/drm/xe/xe_ttm_vram_mgr.c | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
index 5fd0d5506a7e..1bdcb3fee901 100644
--- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
+++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c
@@ -303,13 +303,6 @@ int __xe_ttm_vram_mgr_init(struct xe_device *xe, struct xe_ttm_vram_mgr *mgr,
 	struct ttm_resource_manager *man = &mgr->manager;
 	int err;
 
-	if (mem_type != XE_PL_STOLEN) {
-		const char *name = mem_type == XE_PL_VRAM0 ? "vram0" : "vram1";
-		man->cg = drmm_cgroup_register_region(&xe->drm, name, size);
-		if (IS_ERR(man->cg))
-			return PTR_ERR(man->cg);
-	}
-
 	man->func = &xe_ttm_vram_mgr_func;
 	mgr->mem_type = mem_type;
 	mutex_init(&mgr->lock);
@@ -318,6 +311,18 @@ int __xe_ttm_vram_mgr_init(struct xe_device *xe, struct xe_ttm_vram_mgr *mgr,
 	mgr->visible_avail = io_size;
 
 	ttm_resource_manager_init(man, &xe->ttm, size);
+
+	if (mem_type != XE_PL_STOLEN) {
+		const char *name = mem_type == XE_PL_VRAM0 ? "vram0" : "vram1";
+		struct dmem_cgroup_region *cg =
+			drmm_cgroup_register_region(&xe->drm, name, size);
+
+		if (IS_ERR(cg))
+			return PTR_ERR(cg);
+
+		ttm_resource_manager_set_dmem_region(man, cg);
+	}
+
 	err = gpu_buddy_init(&mgr->mm, man->size, default_page_size);
 	if (err)
 		return err;
-- 
2.53.0

From nobody Thu Apr 2 17:04:56 2026
From: Thomas Hellström
To: intel-xe@lists.freedesktop.org
Cc: Thomas Hellström, Natalie Vock, Johannes Weiner, Tejun Heo,
	Michal Koutný, cgroups@vger.kernel.org, Huang Rui, Matthew Brost,
	Matthew Auld, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	Simona Vetter, David Airlie, Christian König, Alex Deucher,
	Rodrigo Vivi, dri-devel@lists.freedesktop.org,
	amd-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org
Subject: [PATCH 5/5] drm/amdgpu: Wire up dmem cgroup reclaim for VRAM manager
Date: Fri, 27 Mar 2026 09:16:00 +0100
Message-ID: <20260327081600.4885-6-thomas.hellstrom@linux.intel.com>
In-Reply-To: <20260327081600.4885-1-thomas.hellstrom@linux.intel.com>
References: <20260327081600.4885-1-thomas.hellstrom@linux.intel.com>

Register the VRAM manager with the dmem cgroup reclaim infrastructure
so that lowering dmem.max below current VRAM usage triggers TTM
eviction rather than failing with -EBUSY.

Guard place->flags in amdgpu_ttm_bo_eviction_valuable() against NULL,
as the TTM reclaim path passes a NULL place in cgroup drain mode.

Assisted-by: GitHub Copilot:claude-sonnet-4.6
Signed-off-by: Thomas Hellström
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c      |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 10 +++++++---
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index b4ab309bf08a..cd83f30a30f7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1491,7 +1491,7 @@ static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
 	dma_resv_for_each_fence(&resv_cursor, bo->base.resv,
 				DMA_RESV_USAGE_BOOKKEEP, f) {
 		if (amdkfd_fence_check_mm(f, current->mm) &&
-		    !(place->flags & TTM_PL_FLAG_CONTIGUOUS))
+		    !(place && (place->flags & TTM_PL_FLAG_CONTIGUOUS)))
 			return false;
 	}
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index 2a241a5b12c4..5987b1b9ec09 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -916,14 +916,18 @@ int amdgpu_vram_mgr_init(struct amdgpu_device *adev)
 {
 	struct amdgpu_vram_mgr *mgr = &adev->mman.vram_mgr;
 	struct ttm_resource_manager *man = &mgr->manager;
+	struct dmem_cgroup_region *cg;
 	int err;
 
-	man->cg = drmm_cgroup_register_region(adev_to_drm(adev), "vram", adev->gmc.real_vram_size);
-	if (IS_ERR(man->cg))
-		return PTR_ERR(man->cg);
 	ttm_resource_manager_init(man, &adev->mman.bdev,
 				  adev->gmc.real_vram_size);
 
+	cg = drmm_cgroup_register_region(adev_to_drm(adev), "vram",
+					 adev->gmc.real_vram_size);
+	if (IS_ERR(cg))
+		return PTR_ERR(cg);
+	ttm_resource_manager_set_dmem_region(man, cg);
+
 	mutex_init(&mgr->lock);
 	INIT_LIST_HEAD(&mgr->reservations_pending);
 	INIT_LIST_HEAD(&mgr->reserved_pages);
-- 
2.53.0
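
For readers who want to see the effect of the NULL guard in
amdgpu_ttm_bo_eviction_valuable() in isolation, here is a minimal
user-space sketch in C. It models only the boolean condition from the
amdgpu_ttm.c hunk. The names sketch_place, SKETCH_PL_FLAG_CONTIGUOUS,
sketch_keep_bo() and sketch_self_check(), as well as the flag value,
are hypothetical stand-ins and not the TTM definitions; fence_is_ours
stands in for the result of amdkfd_fence_check_mm().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for TTM_PL_FLAG_CONTIGUOUS; the value is arbitrary. */
#define SKETCH_PL_FLAG_CONTIGUOUS (1u << 0)

/* Hypothetical stand-in for struct ttm_place; only the flags field is modeled. */
struct sketch_place {
	unsigned int flags;
};

/*
 * Models the patched condition. Returns true when the buffer should be
 * kept, i.e. when the real function would return false ("not valuable
 * to evict"). The "place &&" test short-circuits, so place->flags is
 * never read when place is NULL.
 */
static bool sketch_keep_bo(bool fence_is_ours, const struct sketch_place *place)
{
	return fence_is_ours &&
	       !(place && (place->flags & SKETCH_PL_FLAG_CONTIGUOUS));
}

/* Usage example covering the cases that matter for the guard. */
static void sketch_self_check(void)
{
	struct sketch_place contig = { .flags = SKETCH_PL_FLAG_CONTIGUOUS };
	struct sketch_place plain = { .flags = 0 };

	assert(sketch_keep_bo(true, NULL));     /* NULL place: no dereference */
	assert(sketch_keep_bo(true, &plain));   /* same result as NULL place */
	assert(!sketch_keep_bo(true, &contig)); /* contiguous request: evictable */
	assert(!sketch_keep_bo(false, NULL));   /* foreign fence: check does not apply */
}
```

With the guard, a NULL place behaves like a placement without the
contiguous flag: a buffer whose fence belongs to the current process is
still reported as not worth evicting, and nothing is dereferenced.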