From nobody Wed Dec 17 14:22:05 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E9BBFEE49A5 for ; Sun, 20 Aug 2023 15:26:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231473AbjHTP0N (ORCPT ); Sun, 20 Aug 2023 11:26:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46906 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231445AbjHTP0B (ORCPT ); Sun, 20 Aug 2023 11:26:01 -0400 Received: from mail-wm1-x32a.google.com (mail-wm1-x32a.google.com [IPv6:2a00:1450:4864:20::32a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B8BA2468D for ; Sun, 20 Aug 2023 08:23:09 -0700 (PDT) Received: by mail-wm1-x32a.google.com with SMTP id 5b1f17b1804b1-3fe2ba3e260so25031025e9.2 for ; Sun, 20 Aug 2023 08:23:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=layalina-io.20221208.gappssmtp.com; s=20221208; t=1692544988; x=1693149788; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=5UPdbmyLLc1ipw9o/7F1VeIHdpT9gV7/ciY4+2ZsAW4=; b=hKCFQLMTQfJ8hQf4OWiVjz3/Hob4S8QXHGiw8/XkhceP3YSkh5ksHmpWkd4d5tIto1 UbwYz64hRV0P2OH/JZxKvJqECb/rZRS6rYWScs4tGyWpjF8bC7DKcSKN28yX4mBGmFSq MAhp8l+moKgHuKPNWShKAxkzuQfkY3zz2NgRdEBrA2qxIN7JUfet959J5L6q92SVsT1N GUlw53xfkT6/HymUKwlTDwQ7q6Mu9LsqA0N1lKFzzzv4aADrING0MdShBKF5rsHfL+w2 MmZFhgf57VNzYqBj1GID9jgdOneAbysoc1gd1lO96zRNscFcQJcdnvuTkcpm2tbwiX+4 W1bA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1692544988; x=1693149788; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=5UPdbmyLLc1ipw9o/7F1VeIHdpT9gV7/ciY4+2ZsAW4=; b=Nn0tbb1HzxtoN1j65cEp5mfUz6aFmTAdNYJ6h5ibcWVTjOwdA70iAJR77Rppa8WKsZ rhoIr5vFa632Nzo7v3EKsY6V9wwNoNsF+5F+ECsR+ntwdCfSiW3MAySMc21UnbtJy3uo AucEhj+ha5OTtirNgR7mxFUsTGMMfSB/RY18bjFNlAN5dl3Hie7tPP8cK3BDxs62wJpP uuF+/ytAfKNIYFEaAvxF/MFoGhkgvsUqK5oqs+y63jZE6L+QGWbVuozd1X8EgUxl+ddy 15Jf/4ZbtPGx68E70nvsKHJdENeUGjjum8UIGAwytWAYlOw2X6+dbqRebFPqKGC6+k1h H4fA== X-Gm-Message-State: AOJu0Yyjf7qP75Wu0VaAWzzb6qlzyYWGcV5c+cVjmFicx4KY4ns5xLhC VWNvyYnaas+zhG5uv/vweaU9hw== X-Google-Smtp-Source: AGHT+IEd0xo+85GVEA/yqLAIzchZpG27H5AfdEElsaenjqaR8jM67rn/TBVQ0cS6SWtxf2Hz0GKwQQ== X-Received: by 2002:adf:e74e:0:b0:313:e8b6:1699 with SMTP id c14-20020adfe74e000000b00313e8b61699mr2801988wrn.55.1692544988346; Sun, 20 Aug 2023 08:23:08 -0700 (PDT) Received: from airbuntu.. (host109-151-228-137.range109-151.btcentralplus.com. [109.151.228.137]) by smtp.gmail.com with ESMTPSA id c3-20020adfe703000000b0031773a8e5c4sm9527466wrm.37.2023.08.20.08.23.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 20 Aug 2023 08:23:08 -0700 (PDT) From: Qais Yousef To: stable@vger.kernel.org Cc: Juri Lelli , Waiman Long , Tejun Heo , Dietmar Eggemann , Peter Zijlstra , Vincent Guittot , Ingo Molnar , Hao Luo , John Stultz , cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Qais Yousef Subject: [PATCH 5/6] sched/deadline: Create DL BW alloc, free & check overflow interface Date: Sun, 20 Aug 2023 16:22:57 +0100 Message-Id: <20230820152258.518128-6-qyousef@layalina.io> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230820152258.518128-1-qyousef@layalina.io> References: <20230820152258.518128-1-qyousef@layalina.io> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Dietmar Eggemann commit 85989106feb734437e2d598b639991b9185a43a6 upstream. While moving a set of tasks between exclusive cpusets, cpuset_can_attach() -> task_can_attach() calls dl_cpu_busy(..., p) for DL BW overflow checking and per-task DL BW allocation on the destination root_domain for the DL tasks in this set. This approach has the issue of not freeing already allocated DL BW in the following error cases: (1) The set of tasks includes multiple DL tasks and DL BW overflow checking fails for one of the subsequent DL tasks. (2) Another controller next to the cpuset controller which is attached to the same cgroup fails in its can_attach(). To address this problem rework dl_cpu_busy(): (1) Split it into dl_bw_check_overflow() & dl_bw_alloc() and add a dedicated dl_bw_free(). (2) dl_bw_alloc() & dl_bw_free() take a `u64 dl_bw` parameter instead of a `struct task_struct *p` used in dl_cpu_busy(). This allows to allocate DL BW for a set of tasks too rather than only for a single task. Signed-off-by: Dietmar Eggemann Signed-off-by: Juri Lelli Signed-off-by: Tejun Heo (cherry picked from commit 85989106feb734437e2d598b639991b9185a43a6) Signed-off-by: Qais Yousef (Google) --- include/linux/sched.h | 2 ++ kernel/sched/core.c | 4 ++-- kernel/sched/deadline.c | 53 +++++++++++++++++++++++++++++++---------- kernel/sched/sched.h | 2 +- 4 files changed, 45 insertions(+), 16 deletions(-) diff --git a/include/linux/sched.h b/include/linux/sched.h index 7c17742d359c..91fe1d6418e7 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1798,6 +1798,8 @@ current_restore_flags(unsigned long orig_flags, unsig= ned long flags) =20 extern int cpuset_cpumask_can_shrink(const struct cpumask *cur, const stru= ct cpumask *trial); extern int task_can_attach(struct task_struct *p, const struct cpumask *cs= _effective_cpus); +extern int dl_bw_alloc(int cpu, u64 dl_bw); +extern void dl_bw_free(int cpu, u64 dl_bw); #ifdef CONFIG_SMP extern void do_set_cpus_allowed(struct task_struct *p, const struct cpumas= k *new_mask); extern int set_cpus_allowed_ptr(struct task_struct *p, const struct cpumas= k *new_mask); diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 8a8c9220f4d8..da2a2906a826 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -8814,7 +8814,7 @@ int task_can_attach(struct task_struct *p, =20 if (unlikely(cpu >=3D nr_cpu_ids)) return -EINVAL; - ret =3D dl_cpu_busy(cpu, p); + ret =3D dl_bw_alloc(cpu, p->dl.dl_bw); } =20 out: @@ -9099,7 +9099,7 @@ static void cpuset_cpu_active(void) static int cpuset_cpu_inactive(unsigned int cpu) { if (!cpuhp_tasks_frozen) { - int ret =3D dl_cpu_busy(cpu, NULL); + int ret =3D dl_bw_check_overflow(cpu); =20 if (ret) return ret; diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index fced55d6e8da..de45e4d2c61f 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -2898,26 +2898,38 @@ int dl_cpuset_cpumask_can_shrink(const struct cpuma= sk *cur, return ret; } =20 -int dl_cpu_busy(int cpu, struct task_struct *p) +enum dl_bw_request { + dl_bw_req_check_overflow =3D 0, + dl_bw_req_alloc, + dl_bw_req_free +}; + +static int dl_bw_manage(enum dl_bw_request req, int cpu, u64 dl_bw) { - unsigned long flags, cap; + unsigned long flags; struct dl_bw *dl_b; - bool overflow; + bool overflow =3D 0; =20 rcu_read_lock_sched(); dl_b =3D dl_bw_of(cpu); raw_spin_lock_irqsave(&dl_b->lock, flags); - cap =3D dl_bw_capacity(cpu); - overflow =3D __dl_overflow(dl_b, cap, 0, p ? p->dl.dl_bw : 0); =20 - if (!overflow && p) { - /* - * We reserve space for this task in the destination - * root_domain, as we can't fail after this point. - * We will free resources in the source root_domain - * later on (see set_cpus_allowed_dl()). - */ - __dl_add(dl_b, p->dl.dl_bw, dl_bw_cpus(cpu)); + if (req =3D=3D dl_bw_req_free) { + __dl_sub(dl_b, dl_bw, dl_bw_cpus(cpu)); + } else { + unsigned long cap =3D dl_bw_capacity(cpu); + + overflow =3D __dl_overflow(dl_b, cap, 0, dl_bw); + + if (req =3D=3D dl_bw_req_alloc && !overflow) { + /* + * We reserve space in the destination + * root_domain, as we can't fail after this point. + * We will free resources in the source root_domain + * later on (see set_cpus_allowed_dl()). + */ + __dl_add(dl_b, dl_bw, dl_bw_cpus(cpu)); + } } =20 raw_spin_unlock_irqrestore(&dl_b->lock, flags); @@ -2925,6 +2937,21 @@ int dl_cpu_busy(int cpu, struct task_struct *p) =20 return overflow ? -EBUSY : 0; } + +int dl_bw_check_overflow(int cpu) +{ + return dl_bw_manage(dl_bw_req_check_overflow, cpu, 0); +} + +int dl_bw_alloc(int cpu, u64 dl_bw) +{ + return dl_bw_manage(dl_bw_req_alloc, cpu, dl_bw); +} + +void dl_bw_free(int cpu, u64 dl_bw) +{ + dl_bw_manage(dl_bw_req_free, cpu, dl_bw); +} #endif =20 #ifdef CONFIG_SCHED_DEBUG diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 6312f1904825..5061093d9baa 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -349,7 +349,7 @@ extern void __getparam_dl(struct task_struct *p, struct= sched_attr *attr); extern bool __checkparam_dl(const struct sched_attr *attr); extern bool dl_param_changed(struct task_struct *p, const struct sched_att= r *attr); extern int dl_cpuset_cpumask_can_shrink(const struct cpumask *cur, const = struct cpumask *trial); -extern int dl_cpu_busy(int cpu, struct task_struct *p); +extern int dl_bw_check_overflow(int cpu); =20 #ifdef CONFIG_CGROUP_SCHED =20 --=20 2.34.1