From nobody Sun Feb 8 11:25:56 2026
Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 75A89E573
	for ; Mon, 12 Jan 2026 05:05:34 +0000 (UTC)
From: Shrikanth Hegde
To: mingo@kernel.org, peterz@infradead.org, vincent.guittot@linaro.org,
	linux-kernel@vger.kernel.org
Cc: sshegde@linux.ibm.com, kprateek.nayak@amd.com, juri.lelli@redhat.com,
	vschneid@redhat.com, tglx@kernel.org, dietmar.eggemann@arm.com,
	anna-maria@linutronix.de, frederic@kernel.org, wangyang.guo@intel.com
Subject: [PATCH v4 1/3] sched/fair: Move checking for nohz cpus after time check
Date: Mon, 12 Jan 2026 10:34:40 +0530
Message-ID: <20260112050442.138446-2-sshegde@linux.ibm.com>
X-Mailer: git-send-email 2.51.0
In-Reply-To: <20260112050442.138446-1-sshegde@linux.ibm.com>
References: <20260112050442.138446-1-sshegde@linux.ibm.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

The NOHZ idle load balancer is kicked only after the time check against
nohz.next_balance passes. Move the atomic read of nohz.nr_cpus after
that time check so that it is accessed only when needed.

When there are no idle CPUs (100% busy), even if the flag gets set to
NOHZ_STATS_KICK | NOHZ_NEXT_KICK, find_new_ilb will fail and there will
be no NOHZ idle balance. The current behaviour is retained.

Note: This patch doesn't solve any cacheline overheads. There is no
performance improvement apart from saving a few cycles of atomic_read.

Signed-off-by: Shrikanth Hegde
---
 kernel/sched/fair.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 9743fc0b225c..17e4e8ac5fca 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -12451,20 +12451,24 @@ static void nohz_balancer_kick(struct rq *rq)
 	 */
 	nohz_balance_exit_idle(rq);
 
-	/*
-	 * None are in tickless mode and hence no need for NOHZ idle load
-	 * balancing:
-	 */
-	if (likely(!atomic_read(&nohz.nr_cpus)))
-		return;
-
 	if (READ_ONCE(nohz.has_blocked_load) &&
 	    time_after(now, READ_ONCE(nohz.next_blocked)))
 		flags = NOHZ_STATS_KICK;
 
+	/*
+	 * If none are in tickless mode, though flag maybe set,
+	 * idle load balancing is not done as find_new_ilb fails
+	 */
 	if (time_before(now, nohz.next_balance))
 		goto out;
 
+	/*
+	 * None are in tickless mode and hence no need for NOHZ idle load
+	 * balancing:
+	 */
+	if (likely(!atomic_read(&nohz.nr_cpus)))
+		return;
+
 	if (rq->nr_running >= 2) {
 		flags = NOHZ_STATS_KICK | NOHZ_BALANCE_KICK;
 		goto out;
-- 
2.47.3

From nobody Sun Feb 8 11:25:56 2026
Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested) by
	smtp.subspace.kernel.org (Postfix) with ESMTPS id A61AB30C61B
	for ; Mon, 12 Jan 2026 05:05:37 +0000 (UTC)
From: Shrikanth Hegde
To: mingo@kernel.org, peterz@infradead.org, vincent.guittot@linaro.org,
	linux-kernel@vger.kernel.org
Cc: sshegde@linux.ibm.com, kprateek.nayak@amd.com, juri.lelli@redhat.com,
	vschneid@redhat.com, tglx@kernel.org, dietmar.eggemann@arm.com,
	anna-maria@linutronix.de, frederic@kernel.org, wangyang.guo@intel.com
Subject: [PATCH v4 2/3] sched/fair: Change likelyhood of nohz.nr_cpus
Date: Mon, 12 Jan 2026 10:34:41 +0530
Message-ID: <20260112050442.138446-3-sshegde@linux.ibm.com>
X-Mailer: git-send-email 2.51.0
In-Reply-To: <20260112050442.138446-1-sshegde@linux.ibm.com>
References: <20260112050442.138446-1-sshegde@linux.ibm.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

These days most systems have multiple cores, so it is likely that at
least one CPU is in nohz (idle) state. Give an accurate branch hint
for this check.

Signed-off-by: Shrikanth Hegde
---
 kernel/sched/fair.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 17e4e8ac5fca..c03f963f6216 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -12464,9 +12464,9 @@ static void nohz_balancer_kick(struct rq *rq)
 
 	/*
 	 * None are in tickless mode and hence no need for NOHZ idle load
-	 * balancing:
+	 * balancing
 	 */
-	if (likely(!atomic_read(&nohz.nr_cpus)))
+	if (unlikely(!atomic_read(&nohz.nr_cpus)))
 		return;
 
 	if (rq->nr_running >= 2) {
-- 
2.47.3

From nobody Sun Feb 8 11:25:56 2026
Received: from mx0b-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 671B930F53C
	for ; Mon, 12 Jan 2026 05:05:39 +0000 (UTC)
From: Shrikanth Hegde
To: mingo@kernel.org, peterz@infradead.org, vincent.guittot@linaro.org,
	linux-kernel@vger.kernel.org
Cc: sshegde@linux.ibm.com, kprateek.nayak@amd.com, juri.lelli@redhat.com,
	vschneid@redhat.com, tglx@kernel.org, dietmar.eggemann@arm.com,
	anna-maria@linutronix.de, frederic@kernel.org, wangyang.guo@intel.com
Subject: [PATCH v4 3/3] sched/fair: Remove nohz.nr_cpus and use weight of cpumask instead
Date: Mon, 12 Jan 2026 10:34:42 +0530
Message-ID: <20260112050442.138446-4-sshegde@linux.ibm.com>
X-Mailer: git-send-email 2.51.0
In-Reply-To: <20260112050442.138446-1-sshegde@linux.ibm.com>
References: <20260112050442.138446-1-sshegde@linux.ibm.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

nohz.nr_cpus was observed to be a contended cacheline when running
enterprise workloads on large systems.

The fundamental scalability challenge with nohz.idle_cpus_mask and
nohz.nr_cpus is the following:

 (1) nohz_balancer_kick() observes (reads) nohz.nr_cpus
     (or nohz.idle_cpus_mask) and nohz.has_blocked_load to see whether
     there's any nohz balancing work to do, in every scheduler tick.

 (2) nohz_balance_enter_idle() and nohz_balance_exit_idle()
     (through nohz_balancer_kick() via sched_tick()) modify (write)
     nohz.nr_cpus (and/or nohz.idle_cpus_mask) and
     nohz.has_blocked_load.

The characteristic frequencies are the following:

 (1) nohz_balancer_kick() happens at scheduler (busy) tick frequency
     on every CPU that has not gone idle. This is a relatively constant
     frequency in the ~1 kHz range or lower.

 (2) happens at idle enter/exit frequency on every CPU that goes to
     idle. This is workload dependent, but can easily be hundreds of
     kHz for IO-bound loads and high CPU counts, i.e. it can be orders
     of magnitude higher than (1), in which case a cache miss at every
     invocation of (1) is almost inevitable. Idle exit will also
     trigger (1) on the CPU which is coming out of idle.

There are two types of costs from these functions:

 (A) scheduler tick cost via (1): this happens on busy CPUs too, and is
     thus a primary scalability cost. But the rate here is constant and
     typically much lower than (B), hence the absolute benefit to
     workload scalability will be lower as well.

 (B) idle cost via (2): going-to-idle and coming-from-idle costs are
     secondary concerns, because they impact power efficiency more than
     they impact scalability. But in terms of absolute cost this scales
     up with nr_cpus as well, and at a much faster rate, and thus may
     also approach and negatively impact system limits like memory
     bus/fabric bandwidth.

Note that nohz.idle_cpus_mask and nohz.nr_cpus may appear to reside in
the same cacheline; however, under CONFIG_CPUMASK_OFFSTACK=y the backing
storage for nohz.idle_cpus_mask will be elsewhere. With
CPUMASK_OFFSTACK=n, nohz.idle_cpus_mask and the rest of the nohz fields
are in different cachelines under typical NR_CPUS=512/2048. This implies
two separate cachelines being dirtied upon idle entry/exit.

nohz.nr_cpus can be derived from the mask itself, and its only use is a
zero check that does not need an exact value. Dropping it means one
fewer cacheline being dirtied in the idle entry/exit path, which helps
to save some bus bandwidth for those nohz functions (approx. 50%). This
in turn helps to improve enterprise workload throughput.

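The idea of deriving "is any CPU idle?" from the mask instead of keeping
a separate counter can be illustrated outside the kernel. What follows
is a minimal userspace sketch under stated assumptions, not the kernel
implementation: every sketch_* name and SKETCH_NR_CPUS is hypothetical,
and C11 atomics stand in for the kernel's cpumask_set_cpu(),
cpumask_clear_cpu() and cpumask_empty() on nohz.idle_cpus_mask.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sizes chosen for the sketch only. */
#define SKETCH_NR_CPUS		512
#define SKETCH_BITS_PER_WORD	(8 * sizeof(unsigned long))
#define SKETCH_NR_WORDS \
	((SKETCH_NR_CPUS + SKETCH_BITS_PER_WORD - 1) / SKETCH_BITS_PER_WORD)

/* Hypothetical stand-in for nohz.idle_cpus_mask: one bit per CPU. */
struct sketch_idle_mask {
	_Atomic unsigned long words[SKETCH_NR_WORDS];
};

/* Idle entry: a single atomic update of the mask, no counter to bump. */
static void sketch_enter_idle(struct sketch_idle_mask *m, unsigned int cpu)
{
	assert(cpu < SKETCH_NR_CPUS);
	atomic_fetch_or(&m->words[cpu / SKETCH_BITS_PER_WORD],
			1UL << (cpu % SKETCH_BITS_PER_WORD));
}

/* Idle exit: clear the bit; again only the mask is written. */
static void sketch_exit_idle(struct sketch_idle_mask *m, unsigned int cpu)
{
	assert(cpu < SKETCH_NR_CPUS);
	atomic_fetch_and(&m->words[cpu / SKETCH_BITS_PER_WORD],
			 ~(1UL << (cpu % SKETCH_BITS_PER_WORD)));
}

/*
 * Tick path: read-only scan of the mask. The result may be stale by the
 * time it is used, which is acceptable when it only gates a heuristic.
 */
static bool sketch_mask_empty(struct sketch_idle_mask *m)
{
	for (size_t i = 0; i < SKETCH_NR_WORDS; i++) {
		if (atomic_load_explicit(&m->words[i], memory_order_relaxed))
			return false;
	}
	return true;
}
```

The trade-off in this sketch is that writers touch one shared location
instead of two, while the reader pays for a scan over a few words rather
than a single load.
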
On a system with 480 CPUs, running "hackbench 40 process 10000 loops"
(avg of 3 runs):

baseline:
     0.81%  hackbench  [k] nohz_balance_exit_idle
     0.21%  hackbench  [k] nohz_balancer_kick
     0.09%  swapper    [k] nohz_run_idle_balance

With patch:
     0.35%  hackbench  [k] nohz_balance_exit_idle
     0.09%  hackbench  [k] nohz_balancer_kick
     0.07%  swapper    [k] nohz_run_idle_balance

[Ingo Molnar: scalability analysis changelog]
Signed-off-by: Shrikanth Hegde
Reviewed-by: Valentin Schneider
---
 kernel/sched/fair.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index c03f963f6216..3408a5beb95b 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7144,7 +7144,6 @@ static DEFINE_PER_CPU(cpumask_var_t, should_we_balance_tmpmask);
 
 static struct {
 	cpumask_var_t idle_cpus_mask;
-	atomic_t nr_cpus;
 	int has_blocked_load;		/* Idle CPUS has blocked load */
 	int needs_update;		/* Newly idle CPUs need their next_balance collated */
 	unsigned long next_balance;     /* in jiffy units */
@@ -12466,7 +12465,7 @@ static void nohz_balancer_kick(struct rq *rq)
 	 * None are in tickless mode and hence no need for NOHZ idle load
 	 * balancing
 	 */
-	if (unlikely(!atomic_read(&nohz.nr_cpus)))
+	if (unlikely(cpumask_empty(nohz.idle_cpus_mask)))
 		return;
 
 	if (rq->nr_running >= 2) {
@@ -12579,7 +12578,6 @@ void nohz_balance_exit_idle(struct rq *rq)
 
 	rq->nohz_tick_stopped = 0;
 	cpumask_clear_cpu(rq->cpu, nohz.idle_cpus_mask);
-	atomic_dec(&nohz.nr_cpus);
 
 	set_cpu_sd_state_busy(rq->cpu);
 }
@@ -12637,7 +12635,6 @@ void nohz_balance_enter_idle(int cpu)
 	rq->nohz_tick_stopped = 1;
 
 	cpumask_set_cpu(cpu, nohz.idle_cpus_mask);
-	atomic_inc(&nohz.nr_cpus);
 
 	/*
 	 * Ensures that if nohz_idle_balance() fails to observe our
-- 
2.47.3
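
For readers following the series as a whole, the check ordering that
nohz_balancer_kick() ends up with can be sketched in userspace as
follows. This is an illustration under assumptions, not kernel code: the
sketch_* names, the single-word idle_mask and the mask_reads counter are
hypothetical, the blocked-load stats kick is left out, and
__builtin_expect() is a GCC/Clang extension standing in for unlikely().

```c
#include <stdbool.h>

/* Compiler branch hint in the style of the kernel's unlikely(). */
#define sketch_unlikely(x)	__builtin_expect(!!(x), 0)

/* Wrap-safe "a is before b" for tick counters, in the style of time_before(). */
static bool sketch_time_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

struct sketch_nohz {
	unsigned long next_balance;	/* earliest tick at which to balance */
	unsigned long idle_mask;	/* hypothetical: one bit per idle CPU */
	unsigned int mask_reads;	/* instrumentation for this sketch only */
};

/* Counts how often the shared idle state is actually inspected. */
static bool sketch_mask_empty(struct sketch_nohz *n)
{
	n->mask_reads++;
	return n->idle_mask == 0;
}

/* Returns true when an idle-balance kick would be considered further. */
static bool sketch_should_consider_kick(struct sketch_nohz *n,
					unsigned long now)
{
	/* Time check first: ticks that arrive before next_balance stop here. */
	if (sketch_time_before(now, n->next_balance))
		return false;

	/* Only now inspect the idle state; an empty mask is the rare case. */
	if (sketch_unlikely(sketch_mask_empty(n)))
		return false;

	return true;
}
```

With next_balance set to 100, a call at tick 50 returns early without
reading the idle state at all, while a call at tick 100 reads it once.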