From: Qais Yousef
To: stable@vger.kernel.org
Cc: Juri Lelli, Waiman Long, Tejun Heo, Dietmar Eggemann, Peter Zijlstra, Vincent Guittot, Ingo Molnar, Hao Luo, John Stultz, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Qais Yousef
Subject: [PATCH 3/6] sched/cpuset: Keep track of SCHED_DEADLINE task in cpusets
Date: Sun, 20 Aug 2023 16:24:14 +0100
Message-Id: <20230820152417.518806-4-qyousef@layalina.io>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230820152417.518806-1-qyousef@layalina.io>
References: <20230820152417.518806-1-qyousef@layalina.io>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Juri Lelli

commit 6c24849f5515e4966d94fa5279bdff4acf2e9489 upstream.
Qais reported that iterating over all tasks when rebuilding root domains
for finding out which ones are DEADLINE and need their bandwidth
correctly restored on such root domains can be a costly operation (10+
ms delays on suspend-resume).

To fix the problem keep track of the number of DEADLINE tasks belonging
to each cpuset and then use this information (followup patch) to only
perform the above iteration if DEADLINE tasks are actually present in
the cpuset for which a corresponding root domain is being rebuilt.

Reported-by: Qais Yousef (Google)
Link: https://lore.kernel.org/lkml/20230206221428.2125324-1-qyousef@layalina.io/
Signed-off-by: Juri Lelli
Reviewed-by: Waiman Long
Signed-off-by: Tejun Heo
(cherry picked from commit 6c24849f5515e4966d94fa5279bdff4acf2e9489)
Signed-off-by: Qais Yousef (Google)
---
 include/linux/cpuset.h  |  4 ++++
 kernel/cgroup/cgroup.c  |  4 ++++
 kernel/cgroup/cpuset.c  | 25 +++++++++++++++++++++++++
 kernel/sched/deadline.c | 14 ++++++++++++++
 4 files changed, 47 insertions(+)

diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h
index 355f796c5f07..0348dba5680e 100644
--- a/include/linux/cpuset.h
+++ b/include/linux/cpuset.h
@@ -71,6 +71,8 @@ extern void cpuset_init_smp(void);
 extern void cpuset_force_rebuild(void);
 extern void cpuset_update_active_cpus(void);
 extern void cpuset_wait_for_hotplug(void);
+extern void inc_dl_tasks_cs(struct task_struct *task);
+extern void dec_dl_tasks_cs(struct task_struct *task);
 extern void cpuset_lock(void);
 extern void cpuset_unlock(void);
 extern void cpuset_cpus_allowed(struct task_struct *p, struct cpumask *mask);
@@ -196,6 +198,8 @@ static inline void cpuset_update_active_cpus(void)
 
 static inline void cpuset_wait_for_hotplug(void) { }
 
+static inline void inc_dl_tasks_cs(struct task_struct *task) { }
+static inline void dec_dl_tasks_cs(struct task_struct *task) { }
 static inline void cpuset_lock(void) { }
 static inline void cpuset_unlock(void) { }
 
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 73f11e4db3a4..97ecca43386d 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -57,6 +57,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -6681,6 +6682,9 @@ void cgroup_exit(struct task_struct *tsk)
 	list_add_tail(&tsk->cg_list, &cset->dying_tasks);
 	cset->nr_tasks--;
 
+	if (dl_task(tsk))
+		dec_dl_tasks_cs(tsk);
+
 	WARN_ON_ONCE(cgroup_task_frozen(tsk));
 	if (unlikely(!(tsk->flags & PF_KTHREAD) &&
 		     test_bit(CGRP_FREEZE, &task_dfl_cgroup(tsk)->flags)))
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index a8a060fea074..5f57829bb986 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -193,6 +193,12 @@ struct cpuset {
 	int use_parent_ecpus;
 	int child_ecpus_count;
 
+	/*
+	 * number of SCHED_DEADLINE tasks attached to this cpuset, so that we
+	 * know when to rebuild associated root domain bandwidth information.
+	 */
+	int nr_deadline_tasks;
+
 	/* Invalid partition error code, not lock protected */
 	enum prs_errcode prs_err;
 
@@ -245,6 +251,20 @@ static inline struct cpuset *parent_cs(struct cpuset *cs)
 	return css_cs(cs->css.parent);
 }
 
+void inc_dl_tasks_cs(struct task_struct *p)
+{
+	struct cpuset *cs = task_cs(p);
+
+	cs->nr_deadline_tasks++;
+}
+
+void dec_dl_tasks_cs(struct task_struct *p)
+{
+	struct cpuset *cs = task_cs(p);
+
+	cs->nr_deadline_tasks--;
+}
+
 /* bits in struct cpuset flags field */
 typedef enum {
 	CS_ONLINE,
@@ -2494,6 +2514,11 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
 		ret = security_task_setscheduler(task);
 		if (ret)
 			goto out_unlock;
+
+		if (dl_task(task)) {
+			cs->nr_deadline_tasks++;
+			cpuset_attach_old_cs->nr_deadline_tasks--;
+		}
 	}
 
 	/*
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index f7d381b6c313..98154a93e05d 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -16,6 +16,8 @@
  * Fabio Checconi
  */
 
+#include <linux/cpuset.h>
+
 /*
  * Default limits for DL period; on the top end we guard against small util
  * tasks still getting ridiculously long effective runtimes, on the bottom end we
@@ -2597,6 +2599,12 @@ static void switched_from_dl(struct rq *rq, struct task_struct *p)
 	if (task_on_rq_queued(p) && p->dl.dl_runtime)
 		task_non_contending(p);
 
+	/*
+	 * In case a task is setscheduled out from SCHED_DEADLINE we need to
+	 * keep track of that on its cpuset (for correct bandwidth tracking).
+	 */
+	dec_dl_tasks_cs(p);
+
 	if (!task_on_rq_queued(p)) {
 		/*
 		 * Inactive timer is armed. However, p is leaving DEADLINE and
@@ -2637,6 +2645,12 @@ static void switched_to_dl(struct rq *rq, struct task_struct *p)
 	if (hrtimer_try_to_cancel(&p->dl.inactive_timer) == 1)
 		put_task_struct(p);
 
+	/*
+	 * In case a task is setscheduled to SCHED_DEADLINE we need to keep
+	 * track of that on its cpuset (for correct bandwidth tracking).
+	 */
+	inc_dl_tasks_cs(p);
+
 	/* If p is not queued we will update its parameters at next wakeup. */
 	if (!task_on_rq_queued(p)) {
 		add_rq_bw(&p->dl, &rq->dl);
-- 
2.34.1