From: Wangyang Guo
To: K Prateek Nayak, Ingo Molnar, Peter Zijlstra, Juri Lelli,
    Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
    Mel Gorman, Valentin Schneider
Cc: linux-kernel@vger.kernel.org, Wangyang Guo, Shrikanth Hegde,
    Benjamin Lei, Tim Chen, Tianyou Li
Subject: [PATCH v6] sched/clock: Avoid false sharing for sched_clock_irqtime
Date: Tue, 27 Jan 2026 12:41:59 +0800
Message-ID: <20260127044159.2254247-1-wangyang.guo@intel.com>

Read-mostly sched_clock_irqtime may share a cacheline with the
frequently updated nohz struct. Make it a static_key to avoid the
false sharing.

The only user of disable_sched_clock_irqtime() is
tsc_.*mark_unstable(), which may be invoked in atomic context and
would therefore require a workqueue to disable the static_key.
But both of them call clear_sched_clock_stable() just before calling
disable_sched_clock_irqtime(), so we can reuse "sched_clock_work" to
also disable sched_clock_irqtime.

One additional case needs handling: if the TSC is marked unstable
before the late_initcall() phase, sched_clock_work will not be invoked
and sched_clock_irqtime will stay enabled although the clock is
unstable:

  tsc_init()
    enable_sched_clock_irqtime() # irqtime accounting is enabled here
    ...
    if (unsynchronized_tsc()) # true
      mark_tsc_unstable()
        clear_sched_clock_stable()
          __sched_clock_stable_early = 0;
          ...
          if (static_key_count(&sched_clock_running.key) == 2)
            # Only happens at sched_clock_init_late()
            __clear_sched_clock_stable(); # Never executed
  ...

  # late_initcall() phase
  sched_clock_init_late()
    if (__sched_clock_stable_early) # Already false
      __set_sched_clock_stable(); # sched_clock is never marked stable
    # TSC unstable, but sched_clock_work won't run to disable irqtime

So we need to call disable_sched_clock_irqtime() in
sched_clock_init_late() if the clock is unstable.

Suggested-by: K Prateek Nayak
Suggested-by: Peter Zijlstra
Suggested-by: Shrikanth Hegde
Reported-by: Benjamin Lei
Reviewed-by: Tim Chen
Reviewed-by: Tianyou Li
Signed-off-by: Wangyang Guo
Reviewed-by: K Prateek Nayak
Tested-by: K Prateek Nayak
---
v6 -> v5:
- Only disable_sched_clock_irqtime() if irqtime_enabled() in
  sched_clock_init_late() to avoid unnecessary overhead.

V5 -> v4:
- Changelog update to reflect static_key changes

V4 -> V3:
- Avoid creating a new workqueue to disable static_key
- Specify kernel version for c2c result in changelog

V2 -> V3:
- Use static_key instead of a __read_mostly var.

V1 -> V2:
- Use __read_mostly instead of __cacheline_aligned to avoid wasting
  space.

History:
  v5: https://lore.kernel.org/all/20260127031602.1907377-1-wangyang.guo@intel.com/
  v4: https://lore.kernel.org/all/20260126021401.1490163-1-wangyang.guo@intel.com/
  v3: https://lore.kernel.org/all/20260116023945.1849329-1-wangyang.guo@intel.com/
  v2: https://lore.kernel.org/all/20260113074807.3404180-1-wangyang.guo@intel.com/
  v1: https://lore.kernel.org/all/20260113022958.3379650-1-wangyang.guo@intel.com/
  prev discussions: https://lore.kernel.org/all/20251211055612.4071266-1-wangyang.guo@intel.com/T/#u
---
 arch/x86/kernel/tsc.c  | 2 --
 kernel/sched/clock.c   | 3 +++
 kernel/sched/cputime.c | 8 ++++----
 kernel/sched/sched.h   | 4 ++--
 4 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 87e749106dda..9a62e18d1bff 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1142,7 +1142,6 @@ static void tsc_cs_mark_unstable(struct clocksource *cs)
 	tsc_unstable = 1;
 	if (using_native_sched_clock())
 		clear_sched_clock_stable();
-	disable_sched_clock_irqtime();
 	pr_info("Marking TSC unstable due to clocksource watchdog\n");
 }
 
@@ -1212,7 +1211,6 @@ void mark_tsc_unstable(char *reason)
 	tsc_unstable = 1;
 	if (using_native_sched_clock())
 		clear_sched_clock_stable();
-	disable_sched_clock_irqtime();
 	pr_info("Marking TSC unstable due to %s\n", reason);
 
 	clocksource_mark_unstable(&clocksource_tsc_early);
diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c
index f5e6dd6a6b3a..bfd3c7418405 100644
--- a/kernel/sched/clock.c
+++ b/kernel/sched/clock.c
@@ -173,6 +173,7 @@ notrace static void __sched_clock_work(struct work_struct *work)
 			scd->tick_gtod, __gtod_offset,
 			scd->tick_raw,  __sched_clock_offset);
 
+	disable_sched_clock_irqtime();
 	static_branch_disable(&__sched_clock_stable);
 }
 
@@ -238,6 +239,8 @@ static int __init sched_clock_init_late(void)
 
 	if (__sched_clock_stable_early)
 		__set_sched_clock_stable();
+	else if (irqtime_enabled())
+		disable_sched_clock_irqtime(); /* disable if clock unstable. */
 
 	return 0;
 }
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 7097de2c8cda..959a86206c64 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -12,6 +12,8 @@
 
 #ifdef CONFIG_IRQ_TIME_ACCOUNTING
 
+DEFINE_STATIC_KEY_FALSE(sched_clock_irqtime);
+
 /*
  * There are no locks covering percpu hardirq/softirq time.
  * They are only modified in vtime_account, on corresponding CPU
@@ -25,16 +27,14 @@
  */
 DEFINE_PER_CPU(struct irqtime, cpu_irqtime);
 
-int sched_clock_irqtime;
-
 void enable_sched_clock_irqtime(void)
 {
-	sched_clock_irqtime = 1;
+	static_branch_enable(&sched_clock_irqtime);
 }
 
 void disable_sched_clock_irqtime(void)
 {
-	sched_clock_irqtime = 0;
+	static_branch_disable(&sched_clock_irqtime);
 }
 
 static void irqtime_account_delta(struct irqtime *irqtime, u64 delta,
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index adfb6e3409d7..ec963314287a 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -3172,11 +3172,11 @@ struct irqtime {
 };
 
 DECLARE_PER_CPU(struct irqtime, cpu_irqtime);
-extern int sched_clock_irqtime;
+DECLARE_STATIC_KEY_FALSE(sched_clock_irqtime);
 
 static inline int irqtime_enabled(void)
 {
-	return sched_clock_irqtime;
+	return static_branch_likely(&sched_clock_irqtime);
 }
 
 /*
-- 
2.47.3