From: Qiliang Yuan
Date: Wed, 20 May 2026 14:07:20 +0800
Subject: [PATCH] cgroup/dmem: implement dmem.high soft limit and throttling
Message-Id: <20260520-feature-dmem-high-v1-1-97ca0cb7f95a@gmail.com>
To: Maarten Lankhorst, Maxime Ripard, Natalie Vock, Tejun Heo,
 Johannes Weiner, Michal Koutný
Cc: cgroups@vger.kernel.org, dri-devel@lists.freedesktop.org,
 linux-kernel@vger.kernel.org, Qiliang Yuan
X-Mailing-List: linux-kernel@vger.kernel.org
X-Change-ID: 20260519-feature-dmem-high-16997148dc38
X-Mailer: b4 0.14.3

Introduce the "high" soft limit for the dmem cgroup v2 controller. When a
cgroup's device memory usage exceeds its high limit, tasks in that cgroup
are throttled: they are put to sleep before returning to user space, rather
than having the allocation fail outright as it does at the "max" limit.

Key changes:
- Add high-limit configuration to dmem_cgroup_pool_state.
- Add an over-high check to the dmem_cgroup_try_charge() path that sets
  TIF_NOTIFY_RESUME via set_notify_resume().
- Call the dmem throttling handler from resume_user_mode_work().
- Implement the handler, which performs a 100ms killable sleep
  (schedule_timeout_killable(HZ / 10)) for each pool whose usage exceeds
  its high limit.

This mechanism provides smoother over-subscription support for device
memory resources.

Signed-off-by: Qiliang Yuan
---
This patch introduces the "high" soft limit and the associated task
throttling mechanism to the dmem cgroup v2 controller.

Device memory (VRAM) accounting currently supports only a hard upper limit
(max), so allocations fail immediately once that limit is reached. This can
be disruptive for GPU-bound AI workloads.

With a soft limit, a cgroup may exceed its quota temporarily while
backpressure is applied by throttling its tasks before they return to user
space.

The mechanism is inspired by the memory cgroup's high limit:
- When usage > high, the task is marked with TIF_NOTIFY_RESUME.
- Upon returning to user space, the task sleeps for 100ms.
- This provides a smoother over-subscription model for GPU resources.

Qiliang Yuan (1):
      cgroup/dmem: implement dmem.high soft limit and throttling

---
To: Maarten Lankhorst
To: Maxime Ripard
To: Natalie Vock
To: Tejun Heo
To: Johannes Weiner
To: Michal Koutný
Cc: cgroups@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-kernel@vger.kernel.org

---
 include/linux/cgroup_dmem.h      | 10 +++++++
 include/linux/resume_user_mode.h |  2 ++
 kernel/cgroup/dmem.c             | 60 +++++++++++++++++++++++++++++++++++++++-
 3 files changed, 71 insertions(+), 1 deletion(-)

diff --git a/include/linux/cgroup_dmem.h b/include/linux/cgroup_dmem.h
index dd4869f1d736e..d58972de7c910 100644
--- a/include/linux/cgroup_dmem.h
+++ b/include/linux/cgroup_dmem.h
@@ -21,6 +21,13 @@ int dmem_cgroup_try_charge(struct dmem_cgroup_region *region, u64 size,
 			   struct dmem_cgroup_pool_state **ret_pool,
 			   struct dmem_cgroup_pool_state **ret_limit_pool);
 void dmem_cgroup_uncharge(struct dmem_cgroup_pool_state *pool, u64 size);
+void __dmem_cgroup_handle_over_high(void);
+
+static inline void dmem_cgroup_handle_over_high(void)
+{
+	__dmem_cgroup_handle_over_high();
+}
+
 bool dmem_cgroup_state_evict_valuable(struct dmem_cgroup_pool_state *limit_pool,
 				      struct dmem_cgroup_pool_state *test_pool,
 				      bool ignore_low, bool *ret_hit_low);
@@ -51,6 +58,9 @@ static inline int dmem_cgroup_try_charge(struct dmem_cgroup_region *region, u64
 static inline void dmem_cgroup_uncharge(struct dmem_cgroup_pool_state *pool, u64 size)
 { }
 
+static inline void dmem_cgroup_handle_over_high(void)
+{ }
+
 static inline
 bool dmem_cgroup_state_evict_valuable(struct dmem_cgroup_pool_state *limit_pool,
 				      struct dmem_cgroup_pool_state *test_pool,
diff --git a/include/linux/resume_user_mode.h b/include/linux/resume_user_mode.h
index bf92227c78d0d..afcab20998c41 100644
--- a/include/linux/resume_user_mode.h
+++ b/include/linux/resume_user_mode.h
@@ -8,6 +8,7 @@
 #include
 #include
 #include
+#include
 
 /**
  * set_notify_resume - cause resume_user_mode_work() to be called
@@ -58,6 +59,7 @@ static inline void resume_user_mode_work(struct pt_regs *regs)
 
 	mem_cgroup_handle_over_high(GFP_KERNEL);
 	blkcg_maybe_throttle_current();
+	dmem_cgroup_handle_over_high();
 
 	rseq_handle_slowpath(regs);
 }
diff --git a/kernel/cgroup/dmem.c b/kernel/cgroup/dmem.c
index 4753a67d0f0f2..f77c692b887b1 100644
--- a/kernel/cgroup/dmem.c
+++ b/kernel/cgroup/dmem.c
@@ -15,6 +15,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -156,6 +157,12 @@ set_resource_low(struct dmem_cgroup_pool_state *pool, u64 val)
 	page_counter_set_low(&pool->cnt, val);
 }
 
+static void
+set_resource_high(struct dmem_cgroup_pool_state *pool, u64 val)
+{
+	page_counter_set_high(&pool->cnt, val);
+}
+
 static void
 set_resource_max(struct dmem_cgroup_pool_state *pool, u64 val)
 {
@@ -167,6 +174,11 @@ static u64 get_resource_low(struct dmem_cgroup_pool_state *pool)
 	return pool ? READ_ONCE(pool->cnt.low) : 0;
 }
 
+static u64 get_resource_high(struct dmem_cgroup_pool_state *pool)
+{
+	return pool ? READ_ONCE(pool->cnt.high) : 0;
+}
+
 static u64 get_resource_min(struct dmem_cgroup_pool_state *pool)
 {
 	return pool ? READ_ONCE(pool->cnt.min) : 0;
@@ -186,6 +198,7 @@ static void reset_all_resource_limits(struct dmem_cgroup_pool_state *rpool)
 {
 	set_resource_min(rpool, 0);
 	set_resource_low(rpool, 0);
+	set_resource_high(rpool, PAGE_COUNTER_MAX);
 	set_resource_max(rpool, PAGE_COUNTER_MAX);
 }
 
@@ -685,6 +698,9 @@ int dmem_cgroup_try_charge(struct dmem_cgroup_region *region, u64 size,
 		goto err;
 	}
 
+	if (page_counter_read(&pool->cnt) > READ_ONCE(pool->cnt.high))
+		set_notify_resume(current);
+
 	/* On success, reference from get_current_dmemcs is transferred to *ret_pool */
 	*ret_pool = pool;
 	return 0;
@@ -835,13 +851,24 @@ static ssize_t dmem_cgroup_region_low_write(struct kernfs_open_file *of,
 	return dmemcg_limit_write(of, buf, nbytes, off, set_resource_low);
 }
 
+static int dmem_cgroup_region_high_show(struct seq_file *sf, void *v)
+{
+	return dmemcg_limit_show(sf, v, get_resource_high);
+}
+
+static ssize_t dmem_cgroup_region_high_write(struct kernfs_open_file *of,
+					     char *buf, size_t nbytes, loff_t off)
+{
+	return dmemcg_limit_write(of, buf, nbytes, off, set_resource_high);
+}
+
 static int dmem_cgroup_region_max_show(struct seq_file *sf, void *v)
 {
 	return dmemcg_limit_show(sf, v, get_resource_max);
 }
 
 static ssize_t dmem_cgroup_region_max_write(struct kernfs_open_file *of,
-					  char *buf, size_t nbytes, loff_t off)
+					    char *buf, size_t nbytes, loff_t off)
 {
 	return dmemcg_limit_write(of, buf, nbytes, off, set_resource_max);
 }
@@ -868,6 +895,12 @@ static struct cftype files[] = {
 		.seq_show = dmem_cgroup_region_low_show,
 		.flags = CFTYPE_NOT_ON_ROOT,
 	},
+	{
+		.name = "high",
+		.write = dmem_cgroup_region_high_write,
+		.seq_show = dmem_cgroup_region_high_show,
+		.flags = CFTYPE_NOT_ON_ROOT,
+	},
 	{
 		.name = "max",
 		.write = dmem_cgroup_region_max_write,
@@ -877,6 +910,31 @@ static struct cftype files[] = {
 	{ } /* Zero entry terminates. */
 };
 
+void __dmem_cgroup_handle_over_high(void)
+{
+	struct dmemcg_state *dmemcs;
+	struct dmem_cgroup_pool_state *pool;
+
+	dmemcs = css_to_dmemcs(task_get_css(current, dmem_cgrp_id));
+	if (!dmemcs)
+		return;
+
+	rcu_read_lock();
+	list_for_each_entry_rcu(pool, &dmemcs->pools, css_node) {
+		unsigned long usage, high;
+
+		usage = page_counter_read(&pool->cnt);
+		high = READ_ONCE(pool->cnt.high);
+
+		if (usage > high)
+			schedule_timeout_killable(HZ / 10);
+	}
+	rcu_read_unlock();
+
+	css_put(&dmemcs->css);
+}
+EXPORT_SYMBOL_GPL(__dmem_cgroup_handle_over_high);
+
 struct cgroup_subsys dmem_cgrp_subsys = {
 	.css_alloc = dmemcs_alloc,
 	.css_free = dmemcs_free,

---
base-commit: ab5fce87a778cb780a05984a2ca448f2b41aafbf
change-id: 20260519-feature-dmem-high-16997148dc38

Best regards,
-- 
Qiliang Yuan
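
For readers who want to experiment with the semantics outside the kernel,
the following is a minimal user-space sketch of the charge/throttle flow
described above. It is not the kernel code. struct model_counter,
struct model_task, model_try_charge(), model_uncharge() and
model_resume_user_mode() are hypothetical names invented for illustration,
nanosleep() stands in for schedule_timeout_killable(), and gating the
handler on a per-task flag is a simplifying assumption of this model.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for one pool's counter (usage plus two limits). */
struct model_counter {
	uint64_t usage;
	uint64_t high;	/* soft limit: exceeding it only throttles */
	uint64_t max;	/* hard limit: exceeding it fails the charge */
};

/* Hypothetical stand-in for the per-task TIF_NOTIFY_RESUME state. */
struct model_task {
	bool notify_resume;
	unsigned int sleeps;	/* number of throttling sleeps taken */
};

/*
 * Charge @size to @cnt. Returns 0 on success and -1 if the charge would
 * exceed the hard limit. A successful charge that leaves usage above the
 * soft limit marks the task for throttling on its way back to user space.
 */
static int model_try_charge(struct model_counter *cnt,
			    struct model_task *task, uint64_t size)
{
	if (cnt->usage > cnt->max || size > cnt->max - cnt->usage)
		return -1;

	cnt->usage += size;
	if (cnt->usage > cnt->high)
		task->notify_resume = true;

	return 0;
}

/* Return @size to @cnt, clamping at zero. */
static void model_uncharge(struct model_counter *cnt, uint64_t size)
{
	cnt->usage = size > cnt->usage ? 0 : cnt->usage - size;
}

/*
 * Model of the return-to-user-space hook: if the task was marked, clear
 * the mark and sleep @sleep_ms once for every pool still above its soft
 * limit (the patch uses 100ms per pool).
 */
static void model_resume_user_mode(struct model_task *task,
				   const struct model_counter *pools,
				   size_t npools, long sleep_ms)
{
	struct timespec ts = {
		.tv_sec = sleep_ms / 1000,
		.tv_nsec = (sleep_ms % 1000) * 1000000L,
	};
	size_t i;

	if (!task->notify_resume)
		return;
	task->notify_resume = false;

	for (i = 0; i < npools; i++) {
		if (pools[i].usage > pools[i].high) {
			nanosleep(&ts, NULL);
			task->sleeps++;
		}
	}
}
```

A caller would charge through model_try_charge() and then invoke
model_resume_user_mode() at the point that models the return to user space.
With high = 100 and max = 200, a charge that brings usage to 120 succeeds
but costs one sleep, while a further charge of 100 is rejected.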