From: Vincent Guittot <vincent.guittot@linaro.org>
To: mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
	dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com,
	mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com,
	linux-kernel@vger.kernel.org, parth@linux.ibm.com
Cc: qais.yousef@arm.com, chris.hyser@oracle.com, patrick.bellasi@matbug.net,
	David.Laight@aculab.com, pjt@google.com, pavel@ucw.cz, tj@kernel.org,
	qperret@google.com, tim.c.chen@linux.intel.com, joshdon@google.com,
	timj@gnu.org, kprateek.nayak@amd.com, yu.c.chen@intel.com,
	youssefesmat@chromium.org, joel@joelfernandes.org,
	Vincent Guittot <vincent.guittot@linaro.org>
Subject: [PATCH v7 6/9] sched/fair: Add sched group latency support
Date: Fri, 28 Oct 2022 11:34:00 +0200
Message-Id: <20221028093403.6673-7-vincent.guittot@linaro.org>
In-Reply-To: <20221028093403.6673-1-vincent.guittot@linaro.org>
References: <20221028093403.6673-1-vincent.guittot@linaro.org>

A task can set its latency priority with sched_setattr(), which is then
used to set the latency offset of its sched_entity, but sched group
entities still have the default latency offset value.

Add a latency.nice field in the cpu cgroup controller to set the latency
priority of the group, similarly to sched_setattr(). The latency priority
is then used to set the offset of the sched_entities of the group.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
---
 Documentation/admin-guide/cgroup-v2.rst |  8 ++++
 kernel/sched/core.c                     | 52 +++++++++++++++++++++++++
 kernel/sched/fair.c                     | 33 ++++++++++++++++
 kernel/sched/sched.h                    |  4 ++
 4 files changed, 97 insertions(+)
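
A quick way to exercise both new handlers is to write the file and read
it back; a minimal userspace sketch in C (the v2 mount point
/sys/fs/cgroup and the group name "grp" are illustrative assumptions,
not part of this patch):

    #include <stdio.h>

    int main(void)
    {
            /* Assumed path: cgroup v2 mounted at /sys/fs/cgroup with an
             * existing non-root cgroup named "grp". */
            const char *path = "/sys/fs/cgroup/grp/cpu.latency.nice";
            FILE *f = fopen(path, "w");
            char buf[16];

            if (!f) {
                    perror(path);
                    return 1;
            }
            /* Negative values make the group more latency sensitive. */
            fprintf(f, "%d\n", -10);
            fclose(f);

            /* Read back the value the kernel settled on. */
            f = fopen(path, "r");
            if (f && fgets(buf, sizeof(buf), f))
                    printf("cpu.latency.nice = %s", buf);
            if (f)
                    fclose(f);
            return 0;
    }

The read side maps the stored offset back to the closest nice value in
the weight table, so a valid written value is expected to read back
unchanged.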

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index be4a77baf784..d8ae7e411f9c 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1095,6 +1095,14 @@ All time durations are in microseconds.
 	values similar to the sched_setattr(2). This maximum utilization
 	value is used to clamp the task specific maximum utilization clamp.
 
+  cpu.latency.nice
+	A read-write single value file which exists on non-root
+	cgroups.  The default is "0".
+
+	The nice value is in the range [-20, 19].
+
+	This interface file allows reading and setting latency using the
+	same values used by sched_setattr(2).
 
 
 Memory
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index caf54e54a74f..3f42b1f61a7e 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -10890,6 +10890,47 @@ static int cpu_idle_write_s64(struct cgroup_subsys_state *css,
 {
 	return sched_group_set_idle(css_tg(css), idle);
 }
+
+static s64 cpu_latency_nice_read_s64(struct cgroup_subsys_state *css,
+				     struct cftype *cft)
+{
+	int prio, delta, last_delta = INT_MAX;
+	s64 weight;
+
+	weight = css_tg(css)->latency_offset * NICE_LATENCY_WEIGHT_MAX;
+	weight = div_s64(weight, get_sched_latency(false));
+
+	/* Find the closest nice value to the current weight */
+	for (prio = 0; prio < ARRAY_SIZE(sched_latency_to_weight); prio++) {
+		delta = abs(sched_latency_to_weight[prio] - weight);
+		if (delta >= last_delta)
+			break;
+		last_delta = delta;
+	}
+
+	return LATENCY_TO_NICE(prio-1);
+}
+
+static int cpu_latency_nice_write_s64(struct cgroup_subsys_state *css,
+				      struct cftype *cft, s64 nice)
+{
+	s64 latency_offset;
+	long weight;
+	int idx;
+
+	if (nice < MIN_LATENCY_NICE || nice > MAX_LATENCY_NICE)
+		return -ERANGE;
+
+	idx = NICE_TO_LATENCY(nice);
+	idx = array_index_nospec(idx, LATENCY_NICE_WIDTH);
+	weight = sched_latency_to_weight[idx];
+
+	latency_offset = weight * get_sched_latency(false);
+	latency_offset = div_s64(latency_offset, NICE_LATENCY_WEIGHT_MAX);
+
+	return sched_group_set_latency(css_tg(css), latency_offset);
+}
+
 #endif
 
@@ -10904,6 +10945,11 @@ static struct cftype cpu_legacy_files[] = {
 		.read_s64 = cpu_idle_read_s64,
 		.write_s64 = cpu_idle_write_s64,
 	},
+	{
+		.name = "latency.nice",
+		.read_s64 = cpu_latency_nice_read_s64,
+		.write_s64 = cpu_latency_nice_write_s64,
+	},
 #endif
 #ifdef CONFIG_CFS_BANDWIDTH
 	{
@@ -11121,6 +11167,12 @@ static struct cftype cpu_files[] = {
 		.read_s64 = cpu_idle_read_s64,
 		.write_s64 = cpu_idle_write_s64,
 	},
+	{
+		.name = "latency.nice",
+		.flags = CFTYPE_NOT_ON_ROOT,
+		.read_s64 = cpu_latency_nice_read_s64,
+		.write_s64 = cpu_latency_nice_write_s64,
+	},
 #endif
 #ifdef CONFIG_CFS_BANDWIDTH
 	{
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 4299d5108dc7..9583936ce30c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -11764,6 +11764,7 @@ int alloc_fair_sched_group(struct task_group *tg, struct task_group *parent)
 		goto err;
 
 	tg->shares = NICE_0_LOAD;
+	tg->latency_offset = 0;
 
 	init_cfs_bandwidth(tg_cfs_bandwidth(tg));
 
@@ -11862,6 +11863,9 @@ void init_tg_cfs_entry(struct task_group *tg, struct cfs_rq *cfs_rq,
 	}
 
 	se->my_q = cfs_rq;
+
+	se->latency_offset = tg->latency_offset;
+
 	/* guarantee group entities always have weight */
 	update_load_set(&se->load, NICE_0_LOAD);
 	se->parent = parent;
@@ -11992,6 +11996,35 @@ int sched_group_set_idle(struct task_group *tg, long idle)
 	return 0;
 }
 
+int sched_group_set_latency(struct task_group *tg, s64 latency)
+{
+	int i;
+
+	if (tg == &root_task_group)
+		return -EINVAL;
+
+	if (abs(latency) > sysctl_sched_latency)
+		return -EINVAL;
+
+	mutex_lock(&shares_mutex);
+
+	if (tg->latency_offset == latency) {
+		mutex_unlock(&shares_mutex);
+		return 0;
+	}
+
+	tg->latency_offset = latency;
+
+	for_each_possible_cpu(i) {
+		struct sched_entity *se = tg->se[i];
+
+		WRITE_ONCE(se->latency_offset, latency);
+	}
+
+	mutex_unlock(&shares_mutex);
+	return 0;
+}
+
 #else /* CONFIG_FAIR_GROUP_SCHED */
 
 void free_fair_sched_group(struct task_group *tg) { }
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 99f10b4dc230..95d4be4f3af6 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -407,6 +407,8 @@ struct task_group {
 
 	/* A positive value indicates that this is a SCHED_IDLE group. */
 	int idle;
+	/* latency constraint of the group. */
+	int latency_offset;
 
 #ifdef CONFIG_SMP
 	/*
@@ -517,6 +519,8 @@ extern int sched_group_set_shares(struct task_group *tg, unsigned long shares);
 
 extern int sched_group_set_idle(struct task_group *tg, long idle);
 
+extern int sched_group_set_latency(struct task_group *tg, s64 latency);
+
 #ifdef CONFIG_SMP
 extern void set_task_rq_fair(struct sched_entity *se, struct cfs_rq *prev,
 			     struct cfs_rq *next);
-- 
2.17.1
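
The conversion in cpu_latency_nice_write_s64() is a single fixed-point
scaling: offset = weight * sched_latency / NICE_LATENCY_WEIGHT_MAX. A
standalone sketch of that arithmetic with illustrative constants (the
1024 weight scale and 24 ms scheduling latency below are assumptions,
not values taken from this patch; the kernel uses
NICE_LATENCY_WEIGHT_MAX and get_sched_latency()):

    #include <stdint.h>
    #include <stdio.h>

    /* Illustrative stand-ins for NICE_LATENCY_WEIGHT_MAX and
     * get_sched_latency(); the kernel's real values may differ. */
    #define WEIGHT_MAX       1024
    #define SCHED_LATENCY_NS (24LL * 1000 * 1000)

    /* Same shape as the write path: scale a table weight into a
     * latency offset, so WEIGHT_MAX maps to the full scheduling
     * latency and a weight of 0 maps to no offset. */
    static int64_t weight_to_latency_offset(long weight)
    {
            return weight * SCHED_LATENCY_NS / WEIGHT_MAX;
    }

    int main(void)
    {
            printf("weight 1024 -> %lld ns\n",
                   (long long)weight_to_latency_offset(1024));
            printf("weight  512 -> %lld ns\n",
                   (long long)weight_to_latency_offset(512));
            return 0;
    }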