From nobody Tue Feb 10 02:58:59 2026 Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 95CF51C5D44 for ; Wed, 7 Jan 2026 06:52:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=148.163.156.1 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1767768742; cv=none; b=IuUr1Il8qdrTyJ5VlhuZ7KnL+keLJ1bjwI1cvMTkxBlnUQywZCHvEQKRcfaAzze+8cwsifL64rGyO6CLWkQtEuxUwKL4e4oEJ79vGIWW1HmPooSFVJ0rjJ2Cn9sEgGUlxtX/hppkfiu5T+BFUX06j/ZbGmXzsv5zXWSMp2Fd0Fc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1767768742; c=relaxed/simple; bh=+dwnAhz20tvtws6VvZ4x071mmwbZM4PcK4846f5pptg=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=LkTShuDGNxBTpXdt9fBiwUeTilh+e2+MoAxjyooCbQj6mpxek0MFo95UnFkjtgbxM+GPOcYqEID/PGCL3b9l2xYgZXYgfzoDwayM4mH9hKE5uiywrDYVUboQ3CPZaeIDbG7oTofBSx7I5bXc9uv6FoBKWxpfghiJxsciv2YEvJ4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.ibm.com; spf=pass smtp.mailfrom=linux.ibm.com; dkim=pass (2048-bit key) header.d=ibm.com header.i=@ibm.com header.b=VgcZH2rn; arc=none smtp.client-ip=148.163.156.1 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.ibm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.ibm.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ibm.com header.i=@ibm.com header.b="VgcZH2rn" Received: from pps.filterd (m0356517.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 606IJdm6008176; Wed, 7 Jan 2026 06:51:53 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=cc :content-transfer-encoding:date:from:in-reply-to:message-id 
:mime-version:references:subject:to; s=pp1; bh=zs20oWepA0YSSIqFU xVO+9FieISgIhMvpPwXV44qxGs=; b=VgcZH2rnHs08lLRmn5e2v7UHE+a9L8AjY X13T4WX853s9lFtbc+ue3tDERmUWFt9keqrU+S8Fl4+WhZ57aTIFJ9x47o3w1pIj i8bDyGMeM1coTRHqc/0sJICg0qeo6lzRFyuQs7X5F3KsUOjaYRqtZwT/cQlqebQg kbmJZ0DjifNFA4eLpkJaYss7QrHfGNVqOXC+wTZTAStWwjzVbqJfypxukE/QPEZD RV63LUPxhZPZqSmU4j+MVSev6LDhHgmgUsa/1XtoUVBfvwihfbn5qSeVKzh32Dso tO0qayAimIQinUdZ5QMSu6k2+CikPgA9PKa9DRhgFspyJb2Dwoerg== Received: from ppma13.dal12v.mail.ibm.com (dd.9e.1632.ip4.static.sl-reverse.com [50.22.158.221]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 4betu67d9p-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 07 Jan 2026 06:51:53 +0000 (GMT) Received: from pps.filterd (ppma13.dal12v.mail.ibm.com [127.0.0.1]) by ppma13.dal12v.mail.ibm.com (8.18.1.2/8.18.1.2) with ESMTP id 6074kKxW012536; Wed, 7 Jan 2026 06:51:52 GMT Received: from smtprelay06.fra02v.mail.ibm.com ([9.218.2.230]) by ppma13.dal12v.mail.ibm.com (PPS) with ESMTPS id 4bffnjfdxy-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 07 Jan 2026 06:51:52 +0000 Received: from smtpav06.fra02v.mail.ibm.com (smtpav06.fra02v.mail.ibm.com [10.20.54.105]) by smtprelay06.fra02v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 6076poVu29295058 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 7 Jan 2026 06:51:50 GMT Received: from smtpav06.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 6EED420049; Wed, 7 Jan 2026 06:51:50 +0000 (GMT) Received: from smtpav06.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 75E7720040; Wed, 7 Jan 2026 06:51:47 +0000 (GMT) Received: from li-7bb28a4c-2dab-11b2-a85c-887b5c60d769.ibm.com.com (unknown [9.124.216.12]) by smtpav06.fra02v.mail.ibm.com (Postfix) with ESMTP; Wed, 7 Jan 2026 06:51:47 +0000 (GMT) From: Shrikanth Hegde To: mingo@kernel.org, peterz@infradead.org, vincent.guittot@linaro.org, 
linux-kernel@vger.kernel.org Cc: sshegde@linux.ibm.com, kprateek.nayak@amd.com, juri.lelli@redhat.com, vschneid@redhat.com, tglx@linutronix.de, dietmar.eggemann@arm.com, anna-maria@linutronix.de, frederic@kernel.org, wangyang.guo@intel.com Subject: [PATCH v3 1/3] sched/fair: Move checking for nohz cpus after time check Date: Wed, 7 Jan 2026 12:21:23 +0530 Message-ID: <20260107065125.669668-2-sshegde@linux.ibm.com> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20260107065125.669668-1-sshegde@linux.ibm.com> References: <20260107065125.669668-1-sshegde@linux.ibm.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-TM-AS-GCONF: 00 X-Authority-Analysis: v=2.4 cv=QbNrf8bv c=1 sm=1 tr=0 ts=695e0289 cx=c_pps a=AfN7/Ok6k8XGzOShvHwTGQ==:117 a=AfN7/Ok6k8XGzOShvHwTGQ==:17 a=vUbySO9Y5rIA:10 a=VkNPw1HP01LnGYTKEx00:22 a=VnNF1IyMAAAA:8 a=IMJbMpzTyqkz7RZfM9YA:9 X-Proofpoint-ORIG-GUID: vDpYxxm989Hn4RD0oSzy0otnsyOTYOH6 X-Proofpoint-Spam-Details-Enc: AW1haW4tMjYwMTA3MDA0OSBTYWx0ZWRfXyFbpBeRHFmkF 0F9+WywmlcZBNmhraeOEeCO8/WlZwxLV9jN8n1irzh3WWqkLmn00NnBkP4V9rIU0BccU8wIGfLH 2bx1CJ/dgXZj1URZ26JkJ3g3konqid9+8H4jnQZKTQAPLFvpMUU/lpyRk67yyMsuGzaJckPqBSj mzt5PRumVujNIEmJV0YsRAtCTk+sdrD3r8Ccd/r/qheT/hNGdmwHEj9/pwkknRK+sTgSVN1n/wM PvwMi7i2lDEKS1H7/gcq/9EBC54sDXa7HbeG3lukq0eqCTl/6avw0bneyzsabwdmEvLhepSgBZZ TzYdJDuIjgoaELo61A4SxavXM1d9O1wFr2/vHiYfeF4PLIXyISlhy3gKQGV16au+t5gk6y/+7rQ JFkCU8oMJXg7TqORRZs6OALnBoymNAeYsgNIb/tEjj0CzxIERpMxOAJRslAV1jjjj5vOe2y/35Q 8PcGGW+831ImSjRhxtw== X-Proofpoint-GUID: vDpYxxm989Hn4RD0oSzy0otnsyOTYOH6 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1121,Hydra:6.1.9,FMLib:17.12.100.49 definitions=2026-01-06_03,2026-01-06_01,2025-10-01_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 clxscore=1015 bulkscore=0 suspectscore=0 priorityscore=1501 adultscore=0 lowpriorityscore=0 
impostorscore=0 phishscore=0 malwarescore=0 classifier=typeunknown authscore=0 authtc= authcc= route=outbound adjust=0 reason=mlx scancount=1 engine=8.19.0-2512120000 definitions=main-2601070049
Content-Type: text/plain; charset="utf-8"

The NOHZ idle load balancer is kicked only after the nohz.next_balance
time check passes. Move the atomic read of nohz.nr_cpus after that time
check, so that it is accessed only when needed.

When there are no idle CPUs (system 100% busy), the flags may still get
set to NOHZ_STATS_KICK | NOHZ_NEXT_KICK, but find_new_ilb() will fail and
there will be no NOHZ idle balance. The current behaviour is retained.

Note: This patch doesn't solve any cacheline overheads. There is no
improvement in performance apart from saving a few cycles of atomic_read().

Signed-off-by: Shrikanth Hegde
---
 kernel/sched/fair.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 9743fc0b225c..17e4e8ac5fca 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -12451,20 +12451,24 @@ static void nohz_balancer_kick(struct rq *rq)
 	 */
 	nohz_balance_exit_idle(rq);
 
-	/*
-	 * None are in tickless mode and hence no need for NOHZ idle load
-	 * balancing:
-	 */
-	if (likely(!atomic_read(&nohz.nr_cpus)))
-		return;
-
 	if (READ_ONCE(nohz.has_blocked_load) &&
 	    time_after(now, READ_ONCE(nohz.next_blocked)))
 		flags = NOHZ_STATS_KICK;
 
+	/*
+	 * If none are in tickless mode, though flags may be set,
+	 * idle load balancing is not done as find_new_ilb() fails
+	 */
 	if (time_before(now, nohz.next_balance))
 		goto out;
 
+	/*
+	 * None are in tickless mode and hence no need for NOHZ idle load
+	 * balancing:
+	 */
+	if (likely(!atomic_read(&nohz.nr_cpus)))
+		return;
+
 	if (rq->nr_running >= 2) {
 		flags = NOHZ_STATS_KICK | NOHZ_BALANCE_KICK;
 		goto out;
-- 
2.47.3

From nobody Tue Feb 10 02:58:59 2026
Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate
requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3E575261B71 for ; Wed, 7 Jan 2026 06:52:22 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=148.163.156.1 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1767768743; cv=none; b=CpCjhdTbsoflPWZTXRHl+7x4w4kHBM2UZncRRK+Lb3HDCdOklNsPk1hxAZL0aB7PI8t0rVohX1eflklfKu3mm29U53z+n1UD4D7feAlMSCzCjGEy0wU76/ZThPlkvPF/u5RVCjz6/9p+nsALkuPEMov6Mr8/ycwJRagdtafnfhU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1767768743; c=relaxed/simple; bh=yyAu0IIHRj8ZmKkOXaGRTz0szwt/hXz4EB3oFUB9+Ww=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=vCw/fJYDq77/wuNvUQfm2KEn/A0j31Y0EbdVt/Kq7niB46E08SDzy6jeG68U0Zw93/pUv1CcttHG2k5ExMtR8r3BRTTvuoM/IS3Vd8l4n/f/y43+6/LvHpmkFpXE2NMPz0d4qHuva6HsshKOh0uLEeCoY0Igk0yVDvRnuE1m63A= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.ibm.com; spf=pass smtp.mailfrom=linux.ibm.com; dkim=pass (2048-bit key) header.d=ibm.com header.i=@ibm.com header.b=Q/3pqoeQ; arc=none smtp.client-ip=148.163.156.1 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.ibm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.ibm.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ibm.com header.i=@ibm.com header.b="Q/3pqoeQ" Received: from pps.filterd (m0353729.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 606I3YWm023076; Wed, 7 Jan 2026 06:51:57 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=cc :content-transfer-encoding:date:from:in-reply-to:message-id :mime-version:references:subject:to; s=pp1; bh=7JWQqLagw4TOow8fv QnPZbScw9ndn/y6jbcVGcvIZgw=; b=Q/3pqoeQovYmIBG8wbfs2xdIBeejmmEA8 odxty5AHKsSxhVYohq718xXa/JcMUgR8MbswaT6FysjhnTt5P1pdksiLRrOxEY1K 
eBEDYcYscpvYpWn81P8LyTjBr+m21Dt0/vp/Y2yFi92lzDyXlK2esQPFbrCR6jKh 5wEbA/s7DC/lpC/maLCZMDoTKpsyjPIUPsJ79xAigRoTh1yAmnwDrF2Ab2vPU6l2 SvD8oEE/dXEECTga8qVFlxEWZzocYgFavauebzWgXY3xg/fE51e/wi0duyXboQgk a3A57DZQOqzUM5Y3Xvi/aJabEjWS41p1RefPNtuaDA4jpmO4fN7OQ== Received: from ppma22.wdc07v.mail.ibm.com (5c.69.3da9.ip4.static.sl-reverse.com [169.61.105.92]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 4betsq7eex-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 07 Jan 2026 06:51:56 +0000 (GMT) Received: from pps.filterd (ppma22.wdc07v.mail.ibm.com [127.0.0.1]) by ppma22.wdc07v.mail.ibm.com (8.18.1.2/8.18.1.2) with ESMTP id 6072JrTH023511; Wed, 7 Jan 2026 06:51:55 GMT Received: from smtprelay07.fra02v.mail.ibm.com ([9.218.2.229]) by ppma22.wdc07v.mail.ibm.com (PPS) with ESMTPS id 4bg3rmc8ue-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 07 Jan 2026 06:51:55 +0000 Received: from smtpav06.fra02v.mail.ibm.com (smtpav06.fra02v.mail.ibm.com [10.20.54.105]) by smtprelay07.fra02v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 6076pr2938666564 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 7 Jan 2026 06:51:53 GMT Received: from smtpav06.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 99E2420049; Wed, 7 Jan 2026 06:51:53 +0000 (GMT) Received: from smtpav06.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id C2FCA20040; Wed, 7 Jan 2026 06:51:50 +0000 (GMT) Received: from li-7bb28a4c-2dab-11b2-a85c-887b5c60d769.ibm.com.com (unknown [9.124.216.12]) by smtpav06.fra02v.mail.ibm.com (Postfix) with ESMTP; Wed, 7 Jan 2026 06:51:50 +0000 (GMT) From: Shrikanth Hegde To: mingo@kernel.org, peterz@infradead.org, vincent.guittot@linaro.org, linux-kernel@vger.kernel.org Cc: sshegde@linux.ibm.com, kprateek.nayak@amd.com, juri.lelli@redhat.com, vschneid@redhat.com, tglx@linutronix.de, dietmar.eggemann@arm.com, anna-maria@linutronix.de, 
frederic@kernel.org, wangyang.guo@intel.com Subject: [PATCH v3 2/3] sched/fair: Change likelyhood of nohz.nr_cpus Date: Wed, 7 Jan 2026 12:21:24 +0530 Message-ID: <20260107065125.669668-3-sshegde@linux.ibm.com> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20260107065125.669668-1-sshegde@linux.ibm.com> References: <20260107065125.669668-1-sshegde@linux.ibm.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-TM-AS-GCONF: 00 X-Proofpoint-GUID: KSGaHDUSz4eZC_HscUWYaY3jQRnBC41P X-Authority-Analysis: v=2.4 cv=Jvf8bc4C c=1 sm=1 tr=0 ts=695e028c cx=c_pps a=5BHTudwdYE3Te8bg5FgnPg==:117 a=5BHTudwdYE3Te8bg5FgnPg==:17 a=vUbySO9Y5rIA:10 a=VkNPw1HP01LnGYTKEx00:22 a=VnNF1IyMAAAA:8 a=gfQPMJu-hSpRr0ZdjbEA:9 X-Proofpoint-ORIG-GUID: KSGaHDUSz4eZC_HscUWYaY3jQRnBC41P X-Proofpoint-Spam-Details-Enc: AW1haW4tMjYwMTA3MDA0OSBTYWx0ZWRfXxv/sYFhUxvpS rUZZPSNkzOH2H0f9gWvhBAUvMipbdQYaKOzTs7Ajb42JHIYKPg2mWswQQYKfegiGPr5dgUqXzc4 Np4w36tWoogIe+9CPIr31AAW3YRoS3z2EMpzMitgRr3wpi20bLi0OuCGQIOHtrf5Vqwql2oxHLR Orn4h7tnc70yNNKDJQKMYg9scVXPNkBM9/T5OiZp7gpi0NVOc7VSJdLh224DwXsOH8Gw9T/lpND 2GLxWC0quAIf6R9UeIjGPdoTOX7tIhOLVdTwiLXIDBByM3D8U5MTLEf721Ge3RvF0K1dtcrxHYJ jP1qB6gD0RoImobPOt8RBrzc6UnWtMY88pHiQ24JZN62gyQUEm1d9VRJbQiAfoidQkcIpPzqMW6 sQ/FCKYHt1yTQv1j3XjJA/44k4jXf5c2n7nwnDKlNgfEtV7n1ix0f7upKf2AEEhC3LgnrEsVR5v CXSjxLSVwVZpDuJNo/A== X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1121,Hydra:6.1.9,FMLib:17.12.100.49 definitions=2026-01-06_03,2026-01-06_01,2025-10-01_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 clxscore=1015 suspectscore=0 impostorscore=0 lowpriorityscore=0 priorityscore=1501 phishscore=0 adultscore=0 spamscore=0 bulkscore=0 malwarescore=0 classifier=typeunknown authscore=0 authtc= authcc= route=outbound adjust=0 reason=mlx scancount=1 engine=8.19.0-2512120000 definitions=main-2601070049 Content-Type: 
text/plain; charset="utf-8"

These days most systems have multiple cores, so the likelihood that at
least one CPU is in nohz (tickless idle) state is high. Change likely()
to unlikely() on the nohz.nr_cpus == 0 check so that the branch hint
matches the common case.

Signed-off-by: Shrikanth Hegde
---
 kernel/sched/fair.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 17e4e8ac5fca..c03f963f6216 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -12464,9 +12464,9 @@ static void nohz_balancer_kick(struct rq *rq)
 
 	/*
 	 * None are in tickless mode and hence no need for NOHZ idle load
-	 * balancing:
+	 * balancing
 	 */
-	if (likely(!atomic_read(&nohz.nr_cpus)))
+	if (unlikely(!atomic_read(&nohz.nr_cpus)))
 		return;
 
 	if (rq->nr_running >= 2) {
-- 
2.47.3

From nobody Tue Feb 10 02:58:59 2026
Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C90B9277C9A for ; Wed, 7 Jan 2026 06:52:26 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=148.163.158.5
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1767768748; cv=none; b=eIuD57LGc+caL7AX1pLEinQLtl5XeUVknsdek0BKyi4DpAEQwm8JxvjkjlEWElFSvdy8YBW4dxicxbGSZ1so0SGfG2oTZBlfeNwC6bdC0slrRHa9ePumJi/j1Xy5nZFQvXjPacluCpGTlMaQG4qDbuY1YAY/sK8+z1yrBTQYgKY=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1767768748; c=relaxed/simple; bh=7l9LKlTd+ElP5z96ofhKUeL+fsTdw/eOuqT4b1Idy/U=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=isCEoaNz9PRaWMQx9LqUpuu13sNtLC3g7FxHp3+Xte9zbIyKJR4A0floiTUphYHtc8IKWVz/Aa8GyXRxPCkvZKYrrFOK6d5u7Fz/qES0pp4Nv+9VxK1ajMgxZFgXxIEte1wWZktI+mBmK2zALYui/A3/4oQZCfZ5TDtTBDftR10=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.ibm.com; spf=pass
smtp.mailfrom=linux.ibm.com; dkim=pass (2048-bit key) header.d=ibm.com header.i=@ibm.com header.b=nGoSXKNx; arc=none smtp.client-ip=148.163.158.5 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.ibm.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.ibm.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ibm.com header.i=@ibm.com header.b="nGoSXKNx" Received: from pps.filterd (m0360072.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 606I2OfG028326; Wed, 7 Jan 2026 06:51:59 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=cc :content-transfer-encoding:date:from:in-reply-to:message-id :mime-version:references:subject:to; s=pp1; bh=kuRDcZEgnr/RXRbGv c8QNDIAGToh5qEWeFt1+ZKLeeo=; b=nGoSXKNxDTyevTPn4a8Y/KLv5pKWjDw8E iXeLOyr6+82Z/looS0t1T/X5WyTxPOYRiC1ynT3bRS8UyL1vSflYUg+4JAu/+YHc LfD3Dt0i3JQEcQManbhlxCtERZRyTpFEk7/TrzzLR+4sSDsgLSOuXM7YFgm/JcPO Epq9r+vgGqdlaooW79qctjRYky5xudXSd+D1Nq7Rct4R9W/DeHG7vSATeRbEnk/L OYJAxu+4jzeJ1SIenoHZ1FpnpmFxuYUYuIFrMIE5bPnZE1aXEy0ZNIzWasZFzcpW 1jpMnOLmQJH2PlLybVQqFjXFHCg+xY/L+0JD7AdTVi63vOLB8/kPw== Received: from ppma11.dal12v.mail.ibm.com (db.9e.1632.ip4.static.sl-reverse.com [50.22.158.219]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 4betrtpe7x-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 07 Jan 2026 06:51:59 +0000 (GMT) Received: from pps.filterd (ppma11.dal12v.mail.ibm.com [127.0.0.1]) by ppma11.dal12v.mail.ibm.com (8.18.1.2/8.18.1.2) with ESMTP id 60752Lwt019177; Wed, 7 Jan 2026 06:51:58 GMT Received: from smtprelay05.fra02v.mail.ibm.com ([9.218.2.225]) by ppma11.dal12v.mail.ibm.com (PPS) with ESMTPS id 4bfg517b24-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 07 Jan 2026 06:51:58 +0000 Received: from smtpav06.fra02v.mail.ibm.com (smtpav06.fra02v.mail.ibm.com [10.20.54.105]) by 
smtprelay05.fra02v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 6076puQD39190816 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 7 Jan 2026 06:51:56 GMT Received: from smtpav06.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 5063720049; Wed, 7 Jan 2026 06:51:56 +0000 (GMT) Received: from smtpav06.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id ED7EC20040; Wed, 7 Jan 2026 06:51:53 +0000 (GMT) Received: from li-7bb28a4c-2dab-11b2-a85c-887b5c60d769.ibm.com.com (unknown [9.124.216.12]) by smtpav06.fra02v.mail.ibm.com (Postfix) with ESMTP; Wed, 7 Jan 2026 06:51:53 +0000 (GMT) From: Shrikanth Hegde To: mingo@kernel.org, peterz@infradead.org, vincent.guittot@linaro.org, linux-kernel@vger.kernel.org Cc: sshegde@linux.ibm.com, kprateek.nayak@amd.com, juri.lelli@redhat.com, vschneid@redhat.com, tglx@linutronix.de, dietmar.eggemann@arm.com, anna-maria@linutronix.de, frederic@kernel.org, wangyang.guo@intel.com Subject: [PATCH v3 3/3] sched/fair: Remove nohz.nr_cpus and use weight of cpumask instead Date: Wed, 7 Jan 2026 12:21:25 +0530 Message-ID: <20260107065125.669668-4-sshegde@linux.ibm.com> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20260107065125.669668-1-sshegde@linux.ibm.com> References: <20260107065125.669668-1-sshegde@linux.ibm.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-TM-AS-GCONF: 00 X-Authority-Analysis: v=2.4 cv=aaJsXBot c=1 sm=1 tr=0 ts=695e028f cx=c_pps a=aDMHemPKRhS1OARIsFnwRA==:117 a=aDMHemPKRhS1OARIsFnwRA==:17 a=vUbySO9Y5rIA:10 a=VkNPw1HP01LnGYTKEx00:22 a=VnNF1IyMAAAA:8 a=T4Os_ZD8T57gc3T8qpkA:9 X-Proofpoint-GUID: saDOios9NNVCBmKB5MlgBhYd4SJBHABj X-Proofpoint-ORIG-GUID: saDOios9NNVCBmKB5MlgBhYd4SJBHABj X-Proofpoint-Spam-Details-Enc: AW1haW4tMjYwMTA3MDA0OSBTYWx0ZWRfX0U5ENYpmIPj5 
k6HMhPq/zmg0r7XLosxRf6HJ0uHyICdZawUS+G82F33p4bV6OPfk53usbr4F7mQHXGN/y9Lffyl b6y9gBNd7AdM2rbUDnLg6nzRLHldaDTl1v13XdJWnH5OzGgkD+Zi8/kZFns7ELxct5Zw2Fu+lti cgMAcg8Jzt+apddu21NVOHAjg3dWlnqnaYzAbkNvxaR1Cl/yzsNrYqSRdbbMD2PKQjIJwQy40h2 k/wOHsysm8lVvhW1XOGjVSbGj9X2fIkH+0OmCZhaTCQKeVBSKTNcenqzCzeHWTPk7o0cC1Ls6Bj TU7s7uETlVTdfu6O4udgt51gdYUhDYNkRG8e7JwPoXryKyencHbU1BteiwgybyuLGotre+LAbrX kEqIcZ8N3SlfpIbI8HVkx882SxLiPlGmajD+I3+hb2CSgEZbNOCCGTkGn478u8lFWhZIlZPQnP8 V2VJuYlgX64/Ttj1yew==
X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1121,Hydra:6.1.9,FMLib:17.12.100.49 definitions=2026-01-06_03,2026-01-06_01,2025-10-01_01
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 suspectscore=0 adultscore=0 lowpriorityscore=0 bulkscore=0 malwarescore=0 priorityscore=1501 clxscore=1015 phishscore=0 spamscore=0 impostorscore=0 classifier=typeunknown authscore=0 authtc= authcc= route=outbound adjust=0 reason=mlx scancount=1 engine=8.19.0-2512120000 definitions=main-2601070049
Content-Type: text/plain; charset="utf-8"

nohz.nr_cpus was observed to be a contended cacheline when running
enterprise workloads on large systems.

The fundamental scalability challenge with nohz.idle_cpus_mask and
nohz.nr_cpus is the following:

 (1) nohz_balancer_kick() observes (reads) nohz.nr_cpus (or
     nohz.idle_cpus_mask) and nohz.has_blocked_load to see whether
     there's any nohz balancing work to do, in every scheduler tick.

 (2) nohz_balance_enter_idle() and nohz_balance_exit_idle() (through
     nohz_balancer_kick() via sched_tick()) modify (write) nohz.nr_cpus
     (and/or nohz.idle_cpus_mask) and nohz.has_blocked_load.

The characteristic frequencies are the following:

 (1) nohz_balancer_kick() happens at scheduler (busy) tick frequency
     on every CPU that has not gone idle. This is a relatively constant
     frequency in the ~1 kHz range or lower.

 (2) happens at idle enter/exit frequency on every CPU that goes to idle.
     This is workload dependent, but can easily be hundreds of kHz for
     IO-bound loads and high CPU counts, i.e. it can be orders of
     magnitude higher than (1), in which case a cache miss at every
     invocation of (1) is almost inevitable. Idle exit also triggers (1)
     on the CPU which is coming out of idle.

There are two types of costs from these functions:

 (A) scheduler tick cost via (1): this happens on busy CPUs too, and is
     thus a primary scalability cost. But the rate here is constant and
     typically much lower than (B), hence the absolute benefit to
     workload scalability will be lower as well.

 (B) idle cost via (2): going-to-idle and coming-from-idle costs are
     secondary concerns, because they impact power efficiency more than
     they impact scalability. But in terms of absolute cost this scales
     up with the number of CPUs as well, and at a much faster rate, and
     thus may also approach and negatively impact system limits like
     memory bus/fabric bandwidth.

The fundamental scalability challenge described above remains true for
nohz.idle_cpus_mask even after this patch.

But nr_cpus can be derived from the mask itself. Its only user does not
need an exact value: the check can race, and at worst an additional load
balance may be attempted. So, derive the value from idle_cpus_mask.

This helps to save some bus bandwidth on that nohz cacheline (approx. 50%),
which in turn helps to improve enterprise workload throughput.
This theory holds true for CPUMASK_OFFSTACK=y and mostly true for
CPUMASK_OFFSTACK=n (the last few bits, depending on NR_CPUS, could be in
the same cacheline as nr_cpus).

On a system with 480 CPUs, running hackbench with 40 processes and
10000 loops (average of 3 runs):

baseline:
     0.81%  hackbench  [k] nohz_balance_exit_idle
     0.21%  hackbench  [k] nohz_balancer_kick
     0.09%  swapper    [k] nohz_run_idle_balance

With patch:
     0.35%  hackbench  [k] nohz_balance_exit_idle
     0.09%  hackbench  [k] nohz_balancer_kick
     0.07%  swapper    [k] nohz_run_idle_balance

[Ingo Molnar: scalability analysis changelog]
Signed-off-by: Shrikanth Hegde
---
 kernel/sched/fair.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c03f963f6216..3408a5beb95b 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7144,7 +7144,6 @@ static DEFINE_PER_CPU(cpumask_var_t, should_we_balance_tmpmask);
 
 static struct {
 	cpumask_var_t idle_cpus_mask;
-	atomic_t nr_cpus;
 	int has_blocked_load;		/* Idle CPUS has blocked load */
 	int needs_update;		/* Newly idle CPUs need their next_balance collated */
 	unsigned long next_balance;	/* in jiffy units */
@@ -12466,7 +12465,7 @@ static void nohz_balancer_kick(struct rq *rq)
 	 * None are in tickless mode and hence no need for NOHZ idle load
 	 * balancing
 	 */
-	if (unlikely(!atomic_read(&nohz.nr_cpus)))
+	if (unlikely(cpumask_empty(nohz.idle_cpus_mask)))
 		return;
 
 	if (rq->nr_running >= 2) {
@@ -12579,7 +12578,6 @@ void nohz_balance_exit_idle(struct rq *rq)
 
 	rq->nohz_tick_stopped = 0;
 	cpumask_clear_cpu(rq->cpu, nohz.idle_cpus_mask);
-	atomic_dec(&nohz.nr_cpus);
 
 	set_cpu_sd_state_busy(rq->cpu);
 }
@@ -12637,7 +12635,6 @@ void nohz_balance_enter_idle(int cpu)
 	rq->nohz_tick_stopped = 1;
 
 	cpumask_set_cpu(cpu, nohz.idle_cpus_mask);
-	atomic_inc(&nohz.nr_cpus);
 
 	/*
 	 * Ensures that if nohz_idle_balance() fails to observe our
-- 
2.47.3
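
Illustration only, not part of the series: a minimal user-space sketch of
the idea behind patch 3/3, assuming C11 atomics. The names idle_mask,
SKETCH_NR_CPUS, sketch_enter_idle(), sketch_exit_idle() and
sketch_mask_empty() are hypothetical stand-ins for nohz.idle_cpus_mask,
nohz_balance_enter_idle(), nohz_balance_exit_idle() and cpumask_empty();
the kernel's cpumask helpers are implemented differently. The sketch shows
that the "is any CPU idle?" answer can be read from the mask alone, so idle
entry and exit each perform one atomic update instead of two.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sizes for the sketch; unrelated to the kernel's NR_CPUS. */
#define SKETCH_NR_CPUS   128
#define SKETCH_WORD_BITS (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_WORDS \
	((SKETCH_NR_CPUS + SKETCH_WORD_BITS - 1) / SKETCH_WORD_BITS)

/* Stand-in for nohz.idle_cpus_mask. There is no separate nr_cpus counter. */
static _Atomic unsigned long idle_mask[SKETCH_WORDS];

/* Idle entry: set this CPU's bit. This is the only shared write. */
static void sketch_enter_idle(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	atomic_fetch_or_explicit(&idle_mask[cpu / SKETCH_WORD_BITS],
				 1UL << (cpu % SKETCH_WORD_BITS),
				 memory_order_relaxed);
}

/* Idle exit: clear this CPU's bit. Again a single shared write. */
static void sketch_exit_idle(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	atomic_fetch_and_explicit(&idle_mask[cpu / SKETCH_WORD_BITS],
				  ~(1UL << (cpu % SKETCH_WORD_BITS)),
				  memory_order_relaxed);
}

/*
 * Tick-side check: derive "no CPU is idle" from the mask itself.
 * The words are read one at a time, so the result can be stale with
 * respect to concurrent enter/exit calls.
 */
static bool sketch_mask_empty(void)
{
	for (size_t i = 0; i < SKETCH_WORDS; i++) {
		if (atomic_load_explicit(&idle_mask[i], memory_order_relaxed))
			return false;
	}
	return true;
}
```

A tick-side caller would use it as "if (sketch_mask_empty()) return;". The
read is deliberately not synchronized with concurrent updates; as the
changelog above notes, a stale answer costs at worst one additional load
balance attempt.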