From nobody Sun Feb 8 22:17:47 2026
From: "Paul E. McKenney"
To: tglx@linutronix.de
Cc: linux-kernel@vger.kernel.org, john.stultz@linaro.org, sboyd@kernel.org,
 corbet@lwn.net, Mark.Rutland@arm.com, maz@kernel.org, kernel-team@meta.com,
 neeraju@codeaurora.org, ak@linux.intel.com, feng.tang@intel.com,
 zhengjun.xing@intel.com, daniel.lezcano@linaro.org,
 "Paul E. McKenney", Chris Bainbridge
Subject: [PATCH clocksource 1/2] clocksource: Handle negative skews in "skew is too large" messages
Date: Mon, 17 Jul 2023 11:28:13 -0700
Message-Id: <20230717182814.1099419-1-paulmck@kernel.org>
X-Mailer: git-send-email 2.40.1
X-Mailing-List: linux-kernel@vger.kernel.org

The nanosecond-to-millisecond skew computation uses unsigned arithmetic,
which produces user-unfriendly large positive numbers for negative skews.
Therefore, use signed arithmetic for this computation in order to preserve
the negativity.

Reported-by: Chris Bainbridge
Reported-by: Feng Tang
Fixes: dd029269947a ("clocksource: Improve "skew is too large" messages")
Reviewed-by: Feng Tang
Tested-by: Chris Bainbridge
Signed-off-by: Paul E. McKenney
---
 kernel/time/clocksource.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 88cbc1181b23..c108ed8a9804 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -473,8 +473,8 @@ static void clocksource_watchdog(struct timer_list *unused)
 		/* Check the deviation from the watchdog clocksource. */
 		md = cs->uncertainty_margin + watchdog->uncertainty_margin;
 		if (abs(cs_nsec - wd_nsec) > md) {
-			u64 cs_wd_msec;
-			u64 wd_msec;
+			s64 cs_wd_msec;
+			s64 wd_msec;
 			u32 wd_rem;
 
 			pr_warn("timekeeping watchdog on CPU%d: Marking clocksource '%s' as unstable because the skew is too large:\n",
@@ -483,8 +483,8 @@ static void clocksource_watchdog(struct timer_list *unused)
 				watchdog->name, wd_nsec, wdnow, wdlast, watchdog->mask);
 			pr_warn(" '%s' cs_nsec: %lld cs_now: %llx cs_last: %llx mask: %llx\n",
 				cs->name, cs_nsec, csnow, cslast, cs->mask);
-			cs_wd_msec = div_u64_rem(cs_nsec - wd_nsec, 1000U * 1000U, &wd_rem);
-			wd_msec = div_u64_rem(wd_nsec, 1000U * 1000U, &wd_rem);
+			cs_wd_msec = div_s64_rem(cs_nsec - wd_nsec, 1000 * 1000, &wd_rem);
+			wd_msec = div_s64_rem(wd_nsec, 1000 * 1000, &wd_rem);
 			pr_warn(" Clocksource '%s' skewed %lld ns (%lld ms) over watchdog '%s' interval of %lld ns (%lld ms)\n",
 				cs->name, cs_nsec - wd_nsec, cs_wd_msec, watchdog->name, wd_nsec, wd_msec);
 			if (curr_clocksource == cs)
-- 
2.40.1

From nobody Sun Feb 8 22:17:47 2026
From: "Paul E. McKenney"
To: tglx@linutronix.de
Cc: linux-kernel@vger.kernel.org, john.stultz@linaro.org, sboyd@kernel.org,
 corbet@lwn.net, Mark.Rutland@arm.com, maz@kernel.org, kernel-team@meta.com,
 neeraju@codeaurora.org, ak@linux.intel.com, feng.tang@intel.com,
 zhengjun.xing@intel.com, daniel.lezcano@linaro.org, Yu Liao,
 "Paul E . McKenney"
Subject: [PATCH clocksource 2/2] x86/tsc: Extend watchdog check exemption to 4-Sockets platform
Date: Mon, 17 Jul 2023 11:28:14 -0700
Message-Id: <20230717182814.1099419-2-paulmck@kernel.org>
X-Mailer: git-send-email 2.40.1
X-Mailing-List: linux-kernel@vger.kernel.org

From: Feng Tang

There were reports again that the tsc clocksource on 4 sockets x86
servers was wrongly judged as 'unstable' by 'jiffies' and other
watchdogs, and disabled [1][2].

Commit b50db7095fe0 ("x86/tsc: Disable clocksource watchdog for TSC
on qualified platorms") was introduced to deal with these false
alarms of tsc unstable issues, covering qualified platforms with 2
sockets or fewer.

And from the history of chasing TSC issues, Thomas and Peter only saw
real TSC synchronization issues on 8-socket machines. So extend the
exemption to 4 sockets to fix the issue.

Rui also proposed another way, disabling 'jiffies' as a clocksource
watchdog [3], which can also solve the problem in [1] in an
architecture-independent way, but can't cure the problem in [2],
whose watchdog is HPET or PMTIMER, while 'jiffies' is mostly used as
the watchdog in the boot phase.

'nr_online_nodes' is known to be inaccurate for cases like platforms
with cpu-less memory nodes, sub-NUMA cluster enabled, fakenuma, the
kernel cmdline parameter 'maxcpus=', etc. The harmful case is the
'maxcpus' one, which could underestimate the package number and so
disable the watchdog, but the bright side is that it is mostly for
debug usage. All these will be addressed in other patches, as
discussed in thread [4].

[1]. https://lore.kernel.org/all/9d3bf570-3108-0336-9c52-9bee15767d29@huawei.com/
[2]. https://lore.kernel.org/lkml/06df410c-2177-4671-832f-339cff05b1d9@paulmck-laptop/
[3]. https://lore.kernel.org/all/bd5b97f89ab2887543fc262348d1c7cafcaae536.camel@intel.com/
[4]. https://lore.kernel.org/all/20221021062131.1826810-1-feng.tang@intel.com/

Reported-by: Yu Liao
Reported-by: Paul E. McKenney
Signed-off-by: Feng Tang
Signed-off-by: Paul E. McKenney
---
 arch/x86/kernel/tsc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 3425c6a943e4..15f97c0abc9d 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1258,7 +1258,7 @@ static void __init check_system_tsc_reliable(void)
 	if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
 	    boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
 	    boot_cpu_has(X86_FEATURE_TSC_ADJUST) &&
-	    nr_online_nodes <= 2)
+	    nr_online_nodes <= 4)
 		tsc_disable_clocksource_watchdog();
 }
 
-- 
2.40.1