From: Wangyang Guo
To: K Prateek Nayak, Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider
Cc: linux-kernel@vger.kernel.org, Wangyang Guo, Shrikanth Hegde, Benjamin Lei, Tim Chen, Tianyou Li
Subject: [PATCH v3] sched/clock: Avoid false sharing for sched_clock_irqtime
Date: Fri, 16 Jan 2026 10:39:45 +0800
Message-ID: <20260116023945.1849329-1-wangyang.guo@intel.com>

The read-mostly sched_clock_irqtime may share a cacheline with the
frequently updated nohz struct. Make it a static_key to avoid the
false sharing.

Details:
We observed a ~3% cycle hotspot in irqtime_account_irq when running
SPECjbb2015 on a 2-socket system.
Most of the cycles are spent reading sched_clock_irqtime, which is a
read-mostly variable.

perf c2c (cacheline view) shows it has false sharing with the nohz struct:

     Num  RmtHitm  LclHitm  Offset  records  Symbol
   6.25%    0.00%    0.00%     0x0        4  [k] _nohz_idle_balance.isra.0
  18.75%  100.00%    0.00%     0x8       14  [k] nohz_balance_exit_idle
   6.25%    0.00%    0.00%     0x8        8  [k] nohz_balance_enter_idle
   6.25%    0.00%    0.00%     0xc        8  [k] sched_balance_newidle
   6.25%    0.00%    0.00%    0x10       31  [k] nohz_balancer_kick
   6.25%    0.00%    0.00%    0x20       16  [k] sched_balance_newidle
  37.50%    0.00%    0.00%    0x38       50  [k] irqtime_account_irq
   6.25%    0.00%    0.00%    0x38       47  [k] account_process_tick
   6.25%    0.00%    0.00%    0x38       12  [k] account_idle_ticks

Offsets:
* 0x0 -- nohz.idle_cpu_mask (r)
* 0x8 -- nohz.nr_cpus (w)
* 0x38 -- sched_clock_irqtime (r), not in nohz, but shares the cacheline

The layout in /proc/kallsyms confirms this:

ffffffff88600d40 b nohz
ffffffff88600d68 B arch_needs_tick_broadcast
ffffffff88600d6c b __key.264
ffffffff88600d6c b __key.265
ffffffff88600d70 b dl_generation
ffffffff88600d78 b sched_clock_irqtime

With the patch applied, the irqtime_account_irq hotspot disappears.

Reported-by: Benjamin Lei
Reviewed-by: Tianyou Li
Reviewed-by: Tim Chen
Suggested-by: Peter Zijlstra
Suggested-by: Shrikanth Hegde
---
V2 -> V3:
  - Use static_key instead of a __read_mostly var.

V1 -> V2:
  - Use __read_mostly instead of __cacheline_aligned to avoid wasting
    space.

History:
  v2: https://lore.kernel.org/all/20260113074807.3404180-1-wangyang.guo@intel.com/
  v1: https://lore.kernel.org/all/20260113022958.3379650-1-wangyang.guo@intel.com/
  prev discussions: https://lore.kernel.org/all/20251211055612.4071266-1-wangyang.guo@intel.com/T/#u

Suggested-by: Peter Zijlstra
Suggested-by: Shrikanth Hegde
Reported-by: Benjamin Lei
Reviewed-by: Tim Chen
Reviewed-by: Tianyou Li
Signed-off-by: Wangyang Guo
---
 kernel/sched/cputime.c | 20 ++++++++++++++++----
 kernel/sched/sched.h   |  4 ++--
 2 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 7097de2c8cda..f37a27ed51d7 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -12,6 +12,8 @@
 
 #ifdef CONFIG_IRQ_TIME_ACCOUNTING
 
+DEFINE_STATIC_KEY_FALSE(sched_clock_irqtime);
+
 /*
  * There are no locks covering percpu hardirq/softirq time.
  * They are only modified in vtime_account, on corresponding CPU
@@ -25,16 +27,26 @@
  */
 DEFINE_PER_CPU(struct irqtime, cpu_irqtime);
 
-int sched_clock_irqtime;
-
 void enable_sched_clock_irqtime(void)
 {
-	sched_clock_irqtime = 1;
+	static_branch_enable(&sched_clock_irqtime);
 }
 
+static void __disable_sched_clock_irqtime(struct work_struct *work)
+{
+	static_branch_disable(&sched_clock_irqtime);
+}
+
+static DECLARE_WORK(sched_clock_irqtime_work, __disable_sched_clock_irqtime);
+
 void disable_sched_clock_irqtime(void)
 {
-	sched_clock_irqtime = 0;
+	/* disable_sched_clock_irqtime can be called in atomic
+	 * context with mark_tsc_unstable(), use wq to avoid
+	 * "sleeping in atomic context" warning.
+	 */
+	if (irqtime_enabled())
+		schedule_work(&sched_clock_irqtime_work);
 }
 
 static void irqtime_account_delta(struct irqtime *irqtime, u64 delta,
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index adfb6e3409d7..ec963314287a 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -3172,11 +3172,11 @@ struct irqtime {
 };
 
 DECLARE_PER_CPU(struct irqtime, cpu_irqtime);
-extern int sched_clock_irqtime;
+DECLARE_STATIC_KEY_FALSE(sched_clock_irqtime);
 
 static inline int irqtime_enabled(void)
 {
-	return sched_clock_irqtime;
+	return static_branch_likely(&sched_clock_irqtime);
 }
 
 /*
-- 
2.47.3