From: "tip-bot2 for Shrikanth Hegde"
Sender: tip-bot2@linutronix.de
Reply-to: linux-kernel@vger.kernel.org
To: linux-tip-commits@vger.kernel.org
Subject: [tip: sched/core] sched/fair: Add EAS checks before updating root_domain::overutilized
Cc: Shrikanth Hegde, Ingo Molnar, Qais Yousef, Srikar Dronamraju, Vincent Guittot, x86@kernel.org, linux-kernel@vger.kernel.org
Date: Tue, 26 Mar 2024 08:06:07 -0000
In-Reply-To: <20240307085725.444486-2-sshegde@linux.ibm.com>
References: <20240307085725.444486-2-sshegde@linux.ibm.com>
Message-ID: <171144036756.10875.522861210199899067.tip-bot2@tip-bot2>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     be3a51e68f2f1b17250ce40d8872c7645b7a2991
Gitweb:        https://git.kernel.org/tip/be3a51e68f2f1b17250ce40d8872c7645b7a2991
Author:        Shrikanth Hegde
AuthorDate:    Thu, 07 Mar 2024 14:27:23 +05:30
Committer:     Ingo Molnar
CommitterDate: Tue, 26 Mar 2024 08:58:59 +01:00

sched/fair: Add EAS checks before updating root_domain::overutilized

root_domain::overutilized is only used by EAS (energy aware scheduler) to
decide whether to do load balance or not. It is not used if EAS is not
possible.

Currently enqueue_task_fair and task_tick_fair access, and sometimes
update, this field. In update_sd_lb_stats it is updated often. This causes
cache contention due to true sharing and burns a lot of cycles.
::overload and ::overutilized are part of the same cacheline. Updating
::overutilized often invalidates the cacheline. That causes access to
::overload to slow down due to false sharing. Hence add an EAS check before
accessing/updating this field. The EAS check is optimized at compile time
or it is a static branch. Hence it shouldn't cost much.

With the patch, both enqueue_task_fair and newidle_balance don't show up
as hot routines in perf profile.

6.8-rc4:
7.18%  swapper  [kernel.vmlinux]  [k] enqueue_task_fair
6.78%  s        [kernel.vmlinux]  [k] newidle_balance
+patch:
0.14%  swapper  [kernel.vmlinux]  [k] enqueue_task_fair
0.00%  swapper  [kernel.vmlinux]  [k] newidle_balance

While at it: trace_sched_overutilized_tp expects its second argument to
be bool. So do an int to bool conversion for that.

Fixes: 2802bf3cd936 ("sched/fair: Add over-utilization/tipping point indicator")
Signed-off-by: Shrikanth Hegde
Signed-off-by: Ingo Molnar
Reviewed-by: Qais Yousef
Reviewed-by: Srikar Dronamraju
Reviewed-by: Vincent Guittot
Link: https://lore.kernel.org/r/20240307085725.444486-2-sshegde@linux.ibm.com
---
 kernel/sched/fair.c | 53 ++++++++++++++++++++++++++++----------------
 1 file changed, 34 insertions(+), 19 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index dbf4f1c..1afa4f8 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6673,22 +6673,42 @@ static inline void hrtick_update(struct rq *rq)
 #ifdef CONFIG_SMP
 static inline bool cpu_overutilized(int cpu)
 {
-	unsigned long rq_util_min = uclamp_rq_get(cpu_rq(cpu), UCLAMP_MIN);
-	unsigned long rq_util_max = uclamp_rq_get(cpu_rq(cpu), UCLAMP_MAX);
+	unsigned long rq_util_min, rq_util_max;
+
+	if (!sched_energy_enabled())
+		return false;
+
+	rq_util_min = uclamp_rq_get(cpu_rq(cpu), UCLAMP_MIN);
+	rq_util_max = uclamp_rq_get(cpu_rq(cpu), UCLAMP_MAX);
 
 	/* Return true only if the utilization doesn't fit CPU's capacity */
 	return !util_fits_cpu(cpu_util_cfs(cpu), rq_util_min, rq_util_max, cpu);
 }
 
-static inline void update_overutilized_status(struct rq *rq)
+static inline void set_rd_overutilized_status(struct root_domain *rd,
+					      unsigned int status)
 {
-	if (!READ_ONCE(rq->rd->overutilized) && cpu_overutilized(rq->cpu)) {
-		WRITE_ONCE(rq->rd->overutilized, SG_OVERUTILIZED);
-		trace_sched_overutilized_tp(rq->rd, SG_OVERUTILIZED);
-	}
+	if (!sched_energy_enabled())
+		return;
+
+	WRITE_ONCE(rd->overutilized, status);
+	trace_sched_overutilized_tp(rd, !!status);
+}
+
+static inline void check_update_overutilized_status(struct rq *rq)
+{
+	/*
+	 * overutilized field is used for load balancing decisions only
+	 * if energy aware scheduler is being used
+	 */
+	if (!sched_energy_enabled())
+		return;
+
+	if (!READ_ONCE(rq->rd->overutilized) && cpu_overutilized(rq->cpu))
+		set_rd_overutilized_status(rq->rd, SG_OVERUTILIZED);
 }
 #else
-static inline void update_overutilized_status(struct rq *rq) { }
+static inline void check_update_overutilized_status(struct rq *rq) { }
 #endif
 
 /* Runqueue only has SCHED_IDLE tasks enqueued */
@@ -6789,7 +6809,7 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
 	 * and the following generally works well enough in practice.
 	 */
 	if (!task_new)
-		update_overutilized_status(rq);
+		check_update_overutilized_status(rq);
 
 enqueue_throttle:
 	assert_list_leaf_cfs_rq(rq);
@@ -10630,19 +10650,14 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd
 		env->fbq_type = fbq_classify_group(&sds->busiest_stat);
 
 	if (!env->sd->parent) {
-		struct root_domain *rd = env->dst_rq->rd;
-
 		/* update overload indicator if we are at root domain */
-		WRITE_ONCE(rd->overload, sg_status & SG_OVERLOAD);
+		WRITE_ONCE(env->dst_rq->rd->overload, sg_status & SG_OVERLOAD);
 
 		/* Update over-utilization (tipping point, U >= 0) indicator */
-		WRITE_ONCE(rd->overutilized, sg_status & SG_OVERUTILIZED);
-		trace_sched_overutilized_tp(rd, sg_status & SG_OVERUTILIZED);
+		set_rd_overutilized_status(env->dst_rq->rd,
+					   sg_status & SG_OVERUTILIZED);
 	} else if (sg_status & SG_OVERUTILIZED) {
-		struct root_domain *rd = env->dst_rq->rd;
-
-		WRITE_ONCE(rd->overutilized, SG_OVERUTILIZED);
-		trace_sched_overutilized_tp(rd, SG_OVERUTILIZED);
+		set_rd_overutilized_status(env->dst_rq->rd, SG_OVERUTILIZED);
 	}
 
 	update_idle_cpu_scan(env, sum_util);
@@ -12667,7 +12682,7 @@ static void task_tick_fair(struct rq *rq, struct task_struct *curr, int queued)
 		task_tick_numa(rq, curr);
 
 	update_misfit_status(curr, rq);
-	update_overutilized_status(task_rq(curr));
+	check_update_overutilized_status(task_rq(curr));
 
 	task_tick_core(rq, curr);
 }
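
Illustrative note (not part of the commit): the guard pattern described in
the changelog can be sketched in user space as below. This is only a rough
model under stated assumptions. Every demo_* name is hypothetical, the flag
value is picked for illustration, and a plain global bool stands in for
sched_energy_enabled(), which in the kernel is a compile-time constant or a
static branch. The sketch shows two things: with the feature off, the shared
field is never stored to, so its cacheline is not dirtied; and the traced
value is normalized to bool with !!.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag value; an assumption, not taken from kernel headers. */
#define DEMO_SG_OVERUTILIZED 0x2

/* Hypothetical stand-in for struct root_domain. */
struct demo_root_domain {
	int overload;
	int overutilized;
	unsigned long overutilized_writes;	/* counts stores to ->overutilized */
};

/* Hypothetical stand-in for sched_energy_enabled(): a plain global here. */
bool demo_energy_enabled;

/* Last value handed to the stand-in "tracepoint" below. */
bool demo_last_traced;

/* Stand-in for trace_sched_overutilized_tp(): records its bool argument. */
void demo_trace_overutilized(struct demo_root_domain *rd, bool overutilized)
{
	(void)rd;
	demo_last_traced = overutilized;
}

/*
 * Mirrors the shape of set_rd_overutilized_status() in the patch:
 * return before touching the shared field when the feature is off.
 */
void demo_set_rd_overutilized_status(struct demo_root_domain *rd,
				     unsigned int status)
{
	assert(rd != NULL);

	if (!demo_energy_enabled)
		return;		/* no store, so the cacheline stays clean */

	rd->overutilized = (int)status;
	rd->overutilized_writes++;
	demo_trace_overutilized(rd, !!status);	/* int to bool conversion */
}
```

Calling demo_set_rd_overutilized_status() with demo_energy_enabled set to
false leaves overutilized_writes at 0 however often it is called. Once the
flag is true, each call performs exactly one store and reports a bool to the
trace stand-in.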