From nobody Thu Dec 18 20:15:43 2025
From: Tobias Huschle
To: linux-kernel@vger.kernel.org
Cc: mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
    vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
    rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
    vschneid@redhat.com, sshegde@linux.ibm.com
Subject: [RFC PATCH v3 1/4] sched/fair: introduce new scheduler group type group_parked
Date: Mon, 12 May 2025 13:53:22 +0200
Message-Id: <20250512115325.30022-2-huschle@linux.ibm.com>
In-Reply-To: <20250512115325.30022-1-huschle@linux.ibm.com>
References: <20250512115325.30022-1-huschle@linux.ibm.com>
A parked CPU is flagged as currently unsuitable for processing workload,
but it might become usable again at any time, depending on the need for
additional compute power and/or the available capacity of the underlying
hardware.

A scheduler group is considered parked if there are tasks queued on
parked CPUs and there are no idle CPUs, i.e. all non-parked CPUs are
busy or the group contains only parked CPUs. A scheduler group with
parked tasks is not considered parked if it has idle CPUs which can pick
up the parked tasks. A parked scheduler group is considered busier than
another parked group if it runs more tasks on parked CPUs.

A parked CPU must keep its scheduler tick (or have it re-enabled if
necessary) to ensure that a parked CPU which runs only a single task
that never yields its runtime voluntarily is still evacuated; otherwise
that CPU would go into NO_HZ and never run the load balancer again.

The state of the underlying hardware is architecture dependent, so the
check whether a CPU is parked is architecture specific. For
architectures not relying on this feature, the check is effectively a
NOP.

This is more efficient and less disruptive than CPU hotplug in
environments where such changes can be necessary on a frequent basis.
Signed-off-by: Tobias Huschle
---
 include/linux/sched/topology.h | 19 +++++++++
 kernel/sched/core.c            | 13 +++++-
 kernel/sched/fair.c            | 77 +++++++++++++++++++++++++++++-----
 kernel/sched/syscalls.c        |  3 ++
 4 files changed, 101 insertions(+), 11 deletions(-)

diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
index 7b4301b7235f..6baf51d45e85 100644
--- a/include/linux/sched/topology.h
+++ b/include/linux/sched/topology.h
@@ -251,6 +251,25 @@ unsigned long arch_scale_cpu_capacity(int cpu)
 }
 #endif
 
+#ifndef arch_cpu_parked
+/**
+ * arch_cpu_parked - Check if a given CPU is currently parked.
+ *
+ * A parked CPU cannot run any kind of workload since the underlying
+ * physical CPU should not be used at the moment.
+ *
+ * @cpu: the CPU in question.
+ *
+ * By default, assume the CPU is not parked.
+ *
+ * Return: Parked state of the CPU.
+ */
+static __always_inline bool arch_cpu_parked(int cpu)
+{
+	return false;
+}
+#endif
+
 #ifndef arch_scale_hw_pressure
 static __always_inline unsigned long arch_scale_hw_pressure(int cpu)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index c81cf642dba0..90efc322a81e 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1358,6 +1358,9 @@ bool sched_can_stop_tick(struct rq *rq)
 	if (rq->cfs.h_nr_queued > 1)
 		return false;
 
+	if (rq->cfs.h_nr_queued > 0 && arch_cpu_parked(cpu_of(rq)))
+		return false;
+
 	/*
 	 * If there is one task and it has CFS runtime bandwidth constraints
 	 * and it's on the cpu now we don't want to stop the tick.
@@ -2449,7 +2452,7 @@ static inline bool is_cpu_allowed(struct task_struct *p, int cpu)
 
 	/* Non kernel threads are not allowed during either online or offline. */
 	if (!(p->flags & PF_KTHREAD))
-		return cpu_active(cpu);
+		return !arch_cpu_parked(cpu) && cpu_active(cpu);
 
 	/* KTHREAD_IS_PER_CPU is always allowed. */
 	if (kthread_is_per_cpu(p))
@@ -2459,6 +2462,10 @@ static inline bool is_cpu_allowed(struct task_struct *p, int cpu)
 	if (cpu_dying(cpu))
 		return false;
 
+	/* CPU should be avoided at the moment */
+	if (arch_cpu_parked(cpu))
+		return false;
+
 	/* But are allowed during online. */
 	return cpu_online(cpu);
 }
@@ -3929,6 +3936,10 @@ static inline bool ttwu_queue_cond(struct task_struct *p, int cpu)
 	if (!scx_allow_ttwu_queue(p))
 		return false;
 
+	/* The task should not be queued onto a parked CPU. */
+	if (arch_cpu_parked(cpu))
+		return false;
+
 	/*
 	 * Do not complicate things with the async wake_list while the CPU is
 	 * in hotplug state.
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0fb9bf995a47..ee8ccee69774 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6884,6 +6884,8 @@ static int sched_idle_rq(struct rq *rq)
 #ifdef CONFIG_SMP
 static int sched_idle_cpu(int cpu)
 {
+	if (arch_cpu_parked(cpu))
+		return 0;
 	return sched_idle_rq(cpu_rq(cpu));
 }
 #endif
@@ -7414,6 +7416,9 @@ static int wake_affine(struct sched_domain *sd, struct task_struct *p,
 {
 	int target = nr_cpumask_bits;
 
+	if (arch_cpu_parked(target))
+		return prev_cpu;
+
 	if (sched_feat(WA_IDLE))
 		target = wake_affine_idle(this_cpu, prev_cpu, sync);
 
@@ -9204,7 +9209,12 @@ enum group_type {
 	 * The CPU is overloaded and can't provide expected CPU cycles to all
 	 * tasks.
 	 */
-	group_overloaded
+	group_overloaded,
+	/*
+	 * The CPU should be avoided as it can't provide expected CPU cycles
+	 * even for small amounts of workload.
+	 */
+	group_parked
 };
 
 enum migration_type {
@@ -9923,6 +9933,7 @@ struct sg_lb_stats {
 	unsigned long group_runnable; /* Total runnable time over the CPUs of the group */
 	unsigned int sum_nr_running; /* Nr of all tasks running in the group */
 	unsigned int sum_h_nr_running; /* Nr of CFS tasks running in the group */
+	unsigned int sum_nr_parked;
 	unsigned int idle_cpus; /* Nr of idle CPUs in the group */
 	unsigned int group_weight;
 	enum group_type group_type;
@@ -10180,6 +10191,9 @@
 group_type group_classify(unsigned int imbalance_pct,
			  struct sched_group *group,
			  struct sg_lb_stats *sgs)
 {
+	if (sgs->sum_nr_parked && !sgs->idle_cpus)
+		return group_parked;
+
 	if (group_is_overloaded(imbalance_pct, sgs))
 		return group_overloaded;
 
@@ -10375,6 +10389,8 @@ static inline void update_sg_lb_stats(struct lb_env *env,
 	if (cpu_overutilized(i))
 		*sg_overutilized = 1;
 
+	sgs->sum_nr_parked += arch_cpu_parked(i) * rq->cfs.h_nr_queued;
+
 	/*
 	 * No need to call idle_cpu() if nr_running is not 0
 	 */
@@ -10480,6 +10496,8 @@ static bool update_sd_pick_busiest(struct lb_env *env,
 	 */
 
 	switch (sgs->group_type) {
+	case group_parked:
+		return sgs->sum_nr_parked > busiest->sum_nr_parked;
 	case group_overloaded:
 		/* Select the overloaded group with highest avg_load. */
 		return sgs->avg_load > busiest->avg_load;
@@ -10643,6 +10661,9 @@ static int idle_cpu_without(int cpu, struct task_struct *p)
 {
 	struct rq *rq = cpu_rq(cpu);
 
+	if (arch_cpu_parked(cpu))
+		return 0;
+
 	if (rq->curr != rq->idle && rq->curr != p)
 		return 0;
 
@@ -10691,6 +10712,8 @@ static inline void update_sg_wakeup_stats(struct sched_domain *sd,
 	nr_running = rq->nr_running - local;
 	sgs->sum_nr_running += nr_running;
 
+	sgs->sum_nr_parked += arch_cpu_parked(i) * rq->cfs.h_nr_queued;
+
 	/*
 	 * No need to call idle_cpu_without() if nr_running is not 0
 	 */
@@ -10738,6 +10761,8 @@ static bool update_pick_idlest(struct sched_group *idlest,
 	 */
 
 	switch (sgs->group_type) {
+	case group_parked:
+		return false;
 	case group_overloaded:
 	case group_fully_busy:
 		/* Select the group with lowest avg_load. */
@@ -10788,7 +10813,7 @@ sched_balance_find_dst_group(struct sched_domain *sd, struct task_struct *p, int
 	unsigned long imbalance;
 	struct sg_lb_stats idlest_sgs = {
		.avg_load = UINT_MAX,
-		.group_type = group_overloaded,
+		.group_type = group_parked,
 	};
 
 	do {
@@ -10846,6 +10871,8 @@ sched_balance_find_dst_group(struct sched_domain *sd, struct task_struct *p, int
 		return idlest;
 
 	switch (local_sgs.group_type) {
+	case group_parked:
+		return idlest;
 	case group_overloaded:
 	case group_fully_busy:
 
@@ -11097,6 +11124,12 @@ static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *s
 	local = &sds->local_stat;
 	busiest = &sds->busiest_stat;
 
+	if (busiest->group_type == group_parked) {
+		env->migration_type = migrate_task;
+		env->imbalance = busiest->sum_nr_parked;
+		return;
+	}
+
 	if (busiest->group_type == group_misfit_task) {
 		if (env->sd->flags & SD_ASYM_CPUCAPACITY) {
 			/* Set imbalance to allow misfit tasks to be balanced. */
@@ -11265,13 +11298,14 @@ static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *s
 /*
  * Decision matrix according to the local and busiest group type:
  *
- * busiest \ local has_spare fully_busy misfit asym imbalanced overloaded
- * has_spare        nr_idle   balanced  N/A    N/A  balanced   balanced
- * fully_busy       nr_idle   nr_idle   N/A    N/A  balanced   balanced
- * misfit_task      force     N/A       N/A    N/A  N/A        N/A
- * asym_packing     force     force     N/A    N/A  force      force
- * imbalanced       force     force     N/A    N/A  force      force
- * overloaded       force     force     N/A    N/A  force      avg_load
+ * busiest \ local has_spare fully_busy misfit asym imbalanced overloaded parked
+ * has_spare        nr_idle   balanced  N/A    N/A  balanced   balanced   balanced
+ * fully_busy       nr_idle   nr_idle   N/A    N/A  balanced   balanced   balanced
+ * misfit_task      force     N/A       N/A    N/A  N/A        N/A        N/A
+ * asym_packing     force     force     N/A    N/A  force      force      balanced
+ * imbalanced       force     force     N/A    N/A  force      force      balanced
+ * overloaded       force     force     N/A    N/A  force      avg_load   balanced
+ * parked           force     force     N/A    N/A  force      force      balanced
 *
 * N/A : Not Applicable because already filtered while updating
 *       statistics.
@@ -11310,6 +11344,13 @@ static struct sched_group *sched_balance_find_src_group(struct lb_env *env)
 		goto out_balanced;
 
 	busiest = &sds.busiest_stat;
+	local = &sds.local_stat;
+
+	if (local->group_type == group_parked)
+		goto out_balanced;
+
+	if (busiest->group_type == group_parked)
+		goto force_balance;
 
 	/* Misfit tasks should be dealt with regardless of the avg load */
 	if (busiest->group_type == group_misfit_task)
		goto force_balance;
@@ -11331,7 +11372,6 @@ static struct sched_group *sched_balance_find_src_group(struct lb_env *env)
 	if (busiest->group_type == group_imbalanced)
 		goto force_balance;
 
-	local = &sds.local_stat;
 	/*
 	 * If the local group is busier than the selected busiest group
 	 * don't try and pull any tasks.
@@ -11444,6 +11484,9 @@ static struct rq *sched_balance_find_src_rq(struct lb_env *env,
 	enum fbq_type rt;
 
 	rq = cpu_rq(i);
+	if (arch_cpu_parked(i) && rq->cfs.h_nr_queued)
+		return rq;
+
 	rt = fbq_classify_rq(rq);
 
 	/*
@@ -11614,6 +11657,9 @@ static int need_active_balance(struct lb_env *env)
 {
 	struct sched_domain *sd = env->sd;
 
+	if (arch_cpu_parked(env->src_cpu) && cpu_rq(env->src_cpu)->cfs.h_nr_queued)
+		return 1;
+
 	if (asym_active_balance(env))
 		return 1;
 
@@ -11647,6 +11693,14 @@ static int should_we_balance(struct lb_env *env)
 	struct sched_group *sg = env->sd->groups;
 	int cpu, idle_smt = -1;
 
+	if (arch_cpu_parked(env->dst_cpu))
+		return 0;
+
+	for_each_cpu(cpu, sched_domain_span(env->sd)) {
+		if (arch_cpu_parked(cpu) && cpu_rq(cpu)->cfs.h_nr_queued)
+			return 1;
+	}
+
 	/*
 	 * Ensure the balancing environment is consistent; can happen
 	 * when the softirq triggers 'during' hotplug.
@@ -12788,6 +12842,9 @@ static int sched_balance_newidle(struct rq *this_rq, struct rq_flags *rf)
 
 	update_misfit_status(NULL, this_rq);
 
+	if (arch_cpu_parked(this_cpu))
+		return 0;
+
 	/*
 	 * There is a task waiting to run. No need to search for one.
 	 * Return 0; the task will be enqueued when switching to idle.
diff --git a/kernel/sched/syscalls.c b/kernel/sched/syscalls.c
index c326de1344fb..4e559d8775da 100644
--- a/kernel/sched/syscalls.c
+++ b/kernel/sched/syscalls.c
@@ -214,6 +214,9 @@ int idle_cpu(int cpu)
 		return 0;
 #endif
 
+	if (arch_cpu_parked(cpu))
+		return 0;
+
 	return 1;
 }
 
-- 
2.34.1

From nobody Thu Dec 18 20:15:43 2025
From: Tobias Huschle
To: linux-kernel@vger.kernel.org
Cc: mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
    vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
    rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de,
    vschneid@redhat.com, sshegde@linux.ibm.com
Subject: [RFC PATCH v3 2/4] sched/rt: add support for parked CPUs
Date: Mon, 12 May 2025 13:53:23 +0200
Message-Id: <20250512115325.30022-3-huschle@linux.ibm.com>
In-Reply-To: <20250512115325.30022-1-huschle@linux.ibm.com>
References: <20250512115325.30022-1-huschle@linux.ibm.com>

Realtime tasks must also react to the parked state of CPUs: they are
treated as if parked CPUs have no free capacity to run them.

A dynamic change in the parked state of a CPU is handled correctly as
long as realtime tasks do not consume 100% CPU time without any
interruption. A realtime task that runs uninterrupted never enters the
load balancing code and therefore remains on its CPU, even if that CPU
becomes classified as parked. Any utilization below 100% causes the
task to be migrated off a CPU which has just been classified as parked.
Signed-off-by: Tobias Huschle --- kernel/sched/rt.c | 25 ++++++++++++++++++++----- 1 file changed, 20 insertions(+), 5 deletions(-) diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c index fa03ec3ed56a..595d760304fb 100644 --- a/kernel/sched/rt.c +++ b/kernel/sched/rt.c @@ -460,6 +460,9 @@ static inline bool rt_task_fits_capacity(struct task_st= ruct *p, int cpu) unsigned int max_cap; unsigned int cpu_cap; =20 + if (arch_cpu_parked(cpu)) + return false; + /* Only heterogeneous systems can benefit from this check */ if (!sched_asym_cpucap_active()) return true; @@ -474,6 +477,9 @@ static inline bool rt_task_fits_capacity(struct task_st= ruct *p, int cpu) #else static inline bool rt_task_fits_capacity(struct task_struct *p, int cpu) { + if (arch_cpu_parked(cpu)) + return false; + return true; } #endif @@ -1799,6 +1805,8 @@ static int find_lowest_rq(struct task_struct *task) int this_cpu =3D smp_processor_id(); int cpu =3D task_cpu(task); int ret; + int parked_cpu =3D 0; + int tmp_cpu; =20 /* Make sure the mask is initialized first */ if (unlikely(!lowest_mask)) @@ -1807,11 +1815,18 @@ static int find_lowest_rq(struct task_struct *task) if (task->nr_cpus_allowed =3D=3D 1) return -1; /* No other targets possible */ =20 + for_each_cpu(tmp_cpu, cpu_online_mask) { + if (arch_cpu_parked(tmp_cpu)) { + parked_cpu =3D tmp_cpu; + break; + } + } + /* * If we're on asym system ensure we consider the different capacities * of the CPUs when searching for the lowest_mask. */ - if (sched_asym_cpucap_active()) { + if (sched_asym_cpucap_active() || parked_cpu > -1) { =20 ret =3D cpupri_find_fitness(&task_rq(task)->rd->cpupri, task, lowest_mask, @@ -1833,14 +1848,14 @@ static int find_lowest_rq(struct task_struct *task) * We prioritize the last CPU that the task executed on since * it is most likely cache-hot in that location. 
*/ - if (cpumask_test_cpu(cpu, lowest_mask)) + if (cpumask_test_cpu(cpu, lowest_mask) && !arch_cpu_parked(cpu)) return cpu; =20 /* * Otherwise, we consult the sched_domains span maps to figure * out which CPU is logically closest to our hot cache data. */ - if (!cpumask_test_cpu(this_cpu, lowest_mask)) + if (!cpumask_test_cpu(this_cpu, lowest_mask) || arch_cpu_parked(this_cpu)) this_cpu =3D -1; /* Skip this_cpu opt if not among lowest */ =20 rcu_read_lock(); @@ -1860,7 +1875,7 @@ static int find_lowest_rq(struct task_struct *task) =20 best_cpu =3D cpumask_any_and_distribute(lowest_mask, sched_domain_span(sd)); - if (best_cpu < nr_cpu_ids) { + if (best_cpu < nr_cpu_ids && !arch_cpu_parked(best_cpu)) { rcu_read_unlock(); return best_cpu; } @@ -1877,7 +1892,7 @@ static int find_lowest_rq(struct task_struct *task) return this_cpu; =20 cpu =3D cpumask_any_distribute(lowest_mask); - if (cpu < nr_cpu_ids) + if (cpu < nr_cpu_ids && !arch_cpu_parked(cpu)) return cpu; =20 return -1; --=20 2.34.1 From nobody Thu Dec 18 20:15:43 2025 Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8AD4328D837 for ; Mon, 12 May 2025 11:53:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=148.163.158.5 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747050827; cv=none; b=OrMzYGcnBALI1+x6RqQi5YJ8Ij2SdwdOG/+HQ6Zi04iAHHpXKdXGmR9KpdVcGSytBoz6WWje57kzP/ksABZxcp7LDBSH1dHa/vLdRzAjomUh1754MuvieEnmhAepPMQXY/vK1iZIvF9bCD1x2HLJMvs76OkOHVSmw3h5T/eQuuU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747050827; c=relaxed/simple; bh=aHPFkCPoF4cazcjdMGjTAn1dG4a/FOsHoyVYi0bm0/c=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; 
b=f1MmtpsueMBs3UzD2f9BGvFgTmjFZFY5Z3aE9FAYnQTuruPIQkTIgJ+jS9bGflRTzrEDUW7deEsK9sPcZWp17kjLXkpdizacDkfzAMeQc8yGM/9+AHXtR3oYQOH3x8c+1bfdizIpqt2VRf44dxN6L/PMiN6s/GCWXlGENMRNeC8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.ibm.com; spf=pass smtp.mailfrom=linux.ibm.com; dkim=pass (2048-bit key) header.d=ibm.com header.i=@ibm.com header.b=SjnUQbdq; arc=none smtp.client-ip=148.163.158.5 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.ibm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.ibm.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ibm.com header.i=@ibm.com header.b="SjnUQbdq" Received: from pps.filterd (m0356516.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 54C9EsKq015539; Mon, 12 May 2025 11:53:34 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=cc :content-transfer-encoding:date:from:in-reply-to:message-id :mime-version:references:subject:to; s=pp1; bh=hZiTzSPBwoQzUv24S q9NqZX1ZXgabYm/8LRxaC6Wzbg=; b=SjnUQbdqDq+bot8/nIV9JtEUQWfP9DNhM qjd5bVwu2/4ffZvJtdRMAMmQOj45r0s1FVEnTrQMZqCG0WnO47YypiCIaddSb7ag KM4DIkDQ7pxURS3HCqfZKfyRDtAGFiLzyE1OhPCi65IhHLd5tdRYwjFDSS/XYekd pvkx5o1Swz08OKGrSPRhfMYusHMGPqCWdrBgfTRMUMBMzy2OSzPyF5xk2kihp2DX qrAUlzFoFrbJ0OSMbPL80pZKDZ/A9ob95bb+QBJLV1YrUduR0BSFEvLwxZEF8ll7 lrUE/y922uqeU/Tpiz3lK7dzyDomn1bweoxNsH2vdIRwzGWed0WVQ== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 46ke6j0mg6-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 12 May 2025 11:53:34 +0000 (GMT) Received: from m0356516.ppops.net (m0356516.ppops.net [127.0.0.1]) by pps.reinject (8.18.0.8/8.18.0.8) with ESMTP id 54CBrXUB027970; Mon, 12 May 2025 11:53:33 GMT Received: from ppma13.dal12v.mail.ibm.com (dd.9e.1632.ip4.static.sl-reverse.com 
[50.22.158.221]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 46ke6j0mg4-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 12 May 2025 11:53:33 +0000 (GMT) Received: from pps.filterd (ppma13.dal12v.mail.ibm.com [127.0.0.1]) by ppma13.dal12v.mail.ibm.com (8.18.1.2/8.18.1.2) with ESMTP id 54CADFXV003824; Mon, 12 May 2025 11:53:33 GMT Received: from smtprelay07.fra02v.mail.ibm.com ([9.218.2.229]) by ppma13.dal12v.mail.ibm.com (PPS) with ESMTPS id 46jkbkdj5c-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 12 May 2025 11:53:32 +0000 Received: from smtpav03.fra02v.mail.ibm.com (smtpav03.fra02v.mail.ibm.com [10.20.54.102]) by smtprelay07.fra02v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 54CBrTT448693726 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 12 May 2025 11:53:29 GMT Received: from smtpav03.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 56E78200EA; Mon, 12 May 2025 11:53:29 +0000 (GMT) Received: from smtpav03.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 930BA200E8; Mon, 12 May 2025 11:53:28 +0000 (GMT) Received: from IBM-PW0CRK36.ibm.com (unknown [9.111.90.223]) by smtpav03.fra02v.mail.ibm.com (Postfix) with ESMTP; Mon, 12 May 2025 11:53:28 +0000 (GMT) From: Tobias Huschle To: linux-kernel@vger.kernel.org Cc: mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, vschneid@redhat.com, sshegde@linux.ibm.com Subject: [RFC PATCH v3 3/4] sched/fair: adapt scheduler group weight and capacity for parked CPUs Date: Mon, 12 May 2025 13:53:24 +0200 Message-Id: <20250512115325.30022-4-huschle@linux.ibm.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250512115325.30022-1-huschle@linux.ibm.com> References: <20250512115325.30022-1-huschle@linux.ibm.com> Precedence: bulk X-Mailing-List: 
Parked CPUs should not be considered available for computation. They should therefore not contribute to the overall weight of scheduler groups: a large group of parked CPUs should not attempt to process any tasks, so a small group of non-parked CPUs should be considered to carry a larger weight. The same consideration holds for the CPU capacities of such groups.
A group of parked CPUs should not be considered to have any capacity.

Signed-off-by: Tobias Huschle
---
 kernel/sched/fair.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index ee8ccee69774..d3161e928746 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9934,6 +9934,8 @@ struct sg_lb_stats {
 	unsigned int sum_nr_running;	/* Nr of all tasks running in the group */
 	unsigned int sum_h_nr_running;	/* Nr of CFS tasks running in the group */
 	unsigned int sum_nr_parked;
+	unsigned int parked_cpus;
+	unsigned int parked_capacity;
 	unsigned int idle_cpus;		/* Nr of idle CPUs in the group */
 	unsigned int group_weight;
 	enum group_type group_type;
@@ -10390,6 +10392,8 @@ static inline void update_sg_lb_stats(struct lb_env *env,
 			*sg_overutilized = 1;
 
 		sgs->sum_nr_parked += arch_cpu_parked(i) * rq->cfs.h_nr_queued;
+		sgs->parked_capacity += arch_cpu_parked(i) * capacity_of(i);
+		sgs->parked_cpus += arch_cpu_parked(i);
 
 		/*
 		 * No need to call idle_cpu() if nr_running is not 0
@@ -10427,9 +10431,11 @@ static inline void update_sg_lb_stats(struct lb_env *env,
 		}
 	}
 
-	sgs->group_capacity = group->sgc->capacity;
+	sgs->group_capacity = group->sgc->capacity - sgs->parked_capacity;
+	if (!sgs->group_capacity)
+		sgs->group_capacity = 1;
 
-	sgs->group_weight = group->group_weight;
+	sgs->group_weight = group->group_weight - sgs->parked_cpus;
 
 	/* Check if dst CPU is idle and preferred to this group */
 	if (!local_group && env->idle && sgs->sum_h_nr_running &&
@@ -10713,6 +10719,8 @@ static inline void update_sg_wakeup_stats(struct sched_domain *sd,
 		sgs->sum_nr_running += nr_running;
 
 		sgs->sum_nr_parked += arch_cpu_parked(i) * rq->cfs.h_nr_queued;
+		sgs->parked_capacity += arch_cpu_parked(i) * capacity_of(i);
+		sgs->parked_cpus += arch_cpu_parked(i);
 
 		/*
 		 * No need to call idle_cpu_without() if nr_running is not 0
@@ -10728,9 +10736,11 @@ static inline void update_sg_wakeup_stats(struct sched_domain *sd,
 	}
 
-	sgs->group_capacity = group->sgc->capacity;
+	sgs->group_capacity = group->sgc->capacity - sgs->parked_capacity;
+	if (!sgs->group_capacity)
+		sgs->group_capacity = 1;
 
-	sgs->group_weight = group->group_weight;
+	sgs->group_weight = group->group_weight - sgs->parked_cpus;
 
 	sgs->group_type = group_classify(sd->imbalance_pct, group, sgs);
 
-- 
2.34.1

From nobody Thu Dec 18 20:15:43 2025
From: Tobias Huschle
To: linux-kernel@vger.kernel.org
Cc: mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, vschneid@redhat.com, sshegde@linux.ibm.com
Subject: [RFC PATCH v3 4/4] s390/topology: Add initial implementation for selection of parked CPUs
Date: Mon, 12 May 2025 13:53:25 +0200
Message-Id: <20250512115325.30022-5-huschle@linux.ibm.com>
In-Reply-To: <20250512115325.30022-1-huschle@linux.ibm.com>
References: <20250512115325.30022-1-huschle@linux.ibm.com>

As a first step, vertical low CPUs are parked unconditionally. This will later be refined by making the parked state dependent on the overall utilization of the underlying hypervisor.

Vertical lows are always bound to the highest CPU IDs, which implies that the three types of vertically polarized CPUs are always clustered by ID.
This has the following implications:
- There might be scheduler domains consisting of only vertical highs
- There might be scheduler domains consisting of only vertical lows

Signed-off-by: Tobias Huschle
---
 arch/s390/include/asm/smp.h | 2 ++
 arch/s390/kernel/smp.c      | 5 +++++
 2 files changed, 7 insertions(+)

diff --git a/arch/s390/include/asm/smp.h b/arch/s390/include/asm/smp.h
index 03f4d01664f8..93754c354803 100644
--- a/arch/s390/include/asm/smp.h
+++ b/arch/s390/include/asm/smp.h
@@ -31,6 +31,7 @@ static __always_inline unsigned int raw_smp_processor_id(void)
 }
 
 #define arch_scale_cpu_capacity smp_cpu_get_capacity
+#define arch_cpu_parked smp_cpu_parked
 
 extern struct mutex smp_cpu_state_mutex;
 extern unsigned int smp_cpu_mt_shift;
@@ -56,6 +57,7 @@ extern int smp_cpu_get_polarization(int cpu);
 extern void smp_cpu_set_capacity(int cpu, unsigned long val);
 extern void smp_set_core_capacity(int cpu, unsigned long val);
 extern unsigned long smp_cpu_get_capacity(int cpu);
+extern bool smp_cpu_parked(int cpu);
 extern int smp_cpu_get_cpu_address(int cpu);
 extern void smp_fill_possible_mask(void);
 extern void smp_detect_cpus(void);
diff --git a/arch/s390/kernel/smp.c b/arch/s390/kernel/smp.c
index 63f41dfaba85..6f6b2e90366d 100644
--- a/arch/s390/kernel/smp.c
+++ b/arch/s390/kernel/smp.c
@@ -680,6 +680,11 @@ void smp_set_core_capacity(int cpu, unsigned long val)
 		smp_cpu_set_capacity(i, val);
 }
 
+bool smp_cpu_parked(int cpu)
+{
+	return smp_cpu_get_polarization(cpu) == POLARIZATION_VL;
+}
+
 int smp_cpu_get_cpu_address(int cpu)
 {
 	return per_cpu(pcpu_devices, cpu).address;
-- 
2.34.1