[PATCH] clocksource: Use same threshold for read-back delay and time skew

Jiri Wiesner posted 1 patch 3 weeks, 1 day ago
kernel/time/clocksource.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
[PATCH] clocksource: Use same threshold for read-back delay and time skew
Posted by Jiri Wiesner 3 weeks, 1 day ago
The idea behind using the uncertainty margins of clocksources to set the
threshold for checking read-back delay is a sound one. For 3 clocksource
readouts, 3 uncertainty margins are added up to calcute a read-back delay
threshold. The read-back delay threshold:
2 * watchdog->uncertainty_margin + cs->uncertainty_margin
is always larger than the threshold for evaluating time skew:
watchdog->uncertainty_margin + cs->uncertainty_margin

When an SMI or NMI is received between the first watchdog read and the
clocksource read and servicing this interrupt takes more time than the
threshold for evaluating time skew but also less time than read-back delay
threshold, the readouts will pass the read-back delay threshold but the
clocksource watchdog will mark the clocksource unstable on account of time
skew. This decision is incorrect. Instead, another iteration of the loop
in cs_watchdog_read() should execute the readouts again (should there be
any iterations left).

Correct the situation by decreasing the read-back delay threshold to a
value equal to the time skew threshold. This patch resolves an issue
observed on a production machine:
clocksource: timekeeping watchdog on CPU556: Marking clocksource 'tsc' as unstable because the skew is too large:
clocksource: 'hpet' wd_nsec: 479980680 wd_now: 1d86c0e1 wd_last: 1ccfa7c4 mask: ffffffff
clocksource: 'tsc' cs_nsec: 480489826 cs_now: 9d18bdd2c71b5 cs_last: 9d18ba107dfe9 mask: ffffffffffffffff
clocksource: Clocksource 'tsc' skewed 509146 ns (0 ms) over watchdog 'hpet' interval of 479980680 ns (479 ms)
clocksource: 'tsc' is current clocksource.
tsc: Marking TSC unstable due to clocksource watchdog

Fixes: 4ac1dd3245b9 ("clocksource: Set cs_watchdog_read() checks based on .uncertainty_margin")
Signed-off-by: Jiri Wiesner <jwiesner@suse.de>
---
 kernel/time/clocksource.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index a1890a073196..31add3e40c12 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -252,7 +252,6 @@ enum wd_read_status {
 
 static enum wd_read_status cs_watchdog_read(struct clocksource *cs, u64 *csnow, u64 *wdnow)
 {
-	int64_t md = 2 * watchdog->uncertainty_margin;
 	unsigned int nretries, max_retries;
 	int64_t wd_delay, wd_seq_delay;
 	u64 wd_end, wd_end2;
@@ -267,7 +266,8 @@ static enum wd_read_status cs_watchdog_read(struct clocksource *cs, u64 *csnow,
 		local_irq_enable();
 
 		wd_delay = cycles_to_nsec_safe(watchdog, *wdnow, wd_end);
-		if (wd_delay <= md + cs->uncertainty_margin) {
+		if (wd_delay <= watchdog->uncertainty_margin
+				+ cs->uncertainty_margin) {
 			if (nretries > 1 && nretries >= max_retries) {
 				pr_warn("timekeeping watchdog on CPU%d: %s retried %d times before success\n",
 					smp_processor_id(), watchdog->name, nretries);
@@ -280,12 +280,12 @@ static enum wd_read_status cs_watchdog_read(struct clocksource *cs, u64 *csnow,
 		 * there is too much external interferences that cause
 		 * significant delay in reading both clocksource and watchdog.
 		 *
-		 * If consecutive WD read-back delay > md, report
-		 * system busy, reinit the watchdog and skip the current
-		 * watchdog test.
+		 * If consecutive WD read-back delay is greater than
+		 * 2 * watchdog->uncertainty_margin, report system busy,
+		 * reinit the watchdog and skip the current watchdog test.
 		 */
 		wd_seq_delay = cycles_to_nsec_safe(watchdog, wd_end, wd_end2);
-		if (wd_seq_delay > md)
+		if (wd_seq_delay > 2 * watchdog->uncertainty_margin)
 			goto skip_test;
 	}
 
-- 
2.51.0


-- 
Jiri Wiesner
SUSE Labs
Re: [PATCH] clocksource: Use same threshold for read-back delay and time skew
Posted by Waiman Long 3 weeks, 1 day ago
On 1/16/26 1:08 PM, Jiri Wiesner wrote:
> The idea behind using the uncertainty margins of clocksources to set the
> threshold for checking read-back delay is a sound one. For 3 clocksource
> readouts, 3 uncertainty margins are added up to calcute a read-back delay
> threshold. The read-back delay threshold:
> 2 * watchdog->uncertainty_margin + cs->uncertainty_margin
> is always larger than the threshold for evaluating time skew:
> watchdog->uncertainty_margin + cs->uncertainty_margin
>
> When an SMI or NMI is received between the first watchdog read and the
> clocksource read and servicing this interrupt takes more time than the
> threshold for evaluating time skew but also less time than read-back delay
> threshold, the readouts will pass the read-back delay threshold but the
> clocksource watchdog will mark the clocksource unstable on account of time
> skew. This decision is incorrect. Instead, another iteration of the loop
> in cs_watchdog_read() should execute the readouts again (should there be
> any iterations left).
>
> Correct the situation by decreasing the read-back delay threshold to a
> value equal to the time skew threshold. This patch resolves an issue
> observed on a production machine:
> clocksource: timekeeping watchdog on CPU556: Marking clocksource 'tsc' as unstable because the skew is too large:
> clocksource: 'hpet' wd_nsec: 479980680 wd_now: 1d86c0e1 wd_last: 1ccfa7c4 mask: ffffffff
> clocksource: 'tsc' cs_nsec: 480489826 cs_now: 9d18bdd2c71b5 cs_last: 9d18ba107dfe9 mask: ffffffffffffffff
> clocksource: Clocksource 'tsc' skewed 509146 ns (0 ms) over watchdog 'hpet' interval of 479980680 ns (479 ms)
> clocksource: 'tsc' is current clocksource.
> tsc: Marking TSC unstable due to clocksource watchdog
>
> Fixes: 4ac1dd3245b9 ("clocksource: Set cs_watchdog_read() checks based on .uncertainty_margin")
> Signed-off-by: Jiri Wiesner <jwiesner@suse.de>
> ---
>   kernel/time/clocksource.c | 12 ++++++------
>   1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
> index a1890a073196..31add3e40c12 100644
> --- a/kernel/time/clocksource.c
> +++ b/kernel/time/clocksource.c
> @@ -252,7 +252,6 @@ enum wd_read_status {
>   
>   static enum wd_read_status cs_watchdog_read(struct clocksource *cs, u64 *csnow, u64 *wdnow)
>   {
> -	int64_t md = 2 * watchdog->uncertainty_margin;
>   	unsigned int nretries, max_retries;
>   	int64_t wd_delay, wd_seq_delay;
>   	u64 wd_end, wd_end2;
> @@ -267,7 +266,8 @@ static enum wd_read_status cs_watchdog_read(struct clocksource *cs, u64 *csnow,
>   		local_irq_enable();
>   
>   		wd_delay = cycles_to_nsec_safe(watchdog, *wdnow, wd_end);
> -		if (wd_delay <= md + cs->uncertainty_margin) {
> +		if (wd_delay <= watchdog->uncertainty_margin
> +				+ cs->uncertainty_margin) {
>   			if (nretries > 1 && nretries >= max_retries) {
>   				pr_warn("timekeeping watchdog on CPU%d: %s retried %d times before success\n",
>   					smp_processor_id(), watchdog->name, nretries);
> @@ -280,12 +280,12 @@ static enum wd_read_status cs_watchdog_read(struct clocksource *cs, u64 *csnow,
>   		 * there is too much external interferences that cause
>   		 * significant delay in reading both clocksource and watchdog.
>   		 *
> -		 * If consecutive WD read-back delay > md, report
> -		 * system busy, reinit the watchdog and skip the current
> -		 * watchdog test.
> +		 * If consecutive WD read-back delay is greater than
> +		 * 2 * watchdog->uncertainty_margin, report system busy,
> +		 * reinit the watchdog and skip the current watchdog test.
>   		 */
>   		wd_seq_delay = cycles_to_nsec_safe(watchdog, wd_end, wd_end2);
> -		if (wd_seq_delay > md)
> +		if (wd_seq_delay > 2 * watchdog->uncertainty_margin)
>   			goto skip_test;
>   	}
>   

I think wd_seq_delay should compare to just one 
watchdog->uncertainty_margin. See my comment on a related patch sent out 
by Thomas.

https://lore.kernel.org/lkml/8178bbf3-4a9d-4add-93bb-2ebd4dc03e9f@redhat.com/

Cheers,
Longman
Re: [PATCH] clocksource: Use same threshold for read-back delay and time skew
Posted by Thomas Gleixner 2 weeks, 6 days ago
On Fri, Jan 16 2026 at 15:05, Waiman Long wrote:
> On 1/16/26 1:08 PM, Jiri Wiesner wrote:
> I think wd_seq_delay should compare to just one 
> watchdog->uncertainty_margin. See my comment on a related patch sent out 
> by Thomas.
>
> https://lore.kernel.org/lkml/8178bbf3-4a9d-4add-93bb-2ebd4dc03e9f@redhat.com/

Duh. I completely forgot about that one....
Re: [PATCH] clocksource: Use same threshold for read-back delay and time skew
Posted by Jiri Wiesner 2 weeks, 2 days ago
On Sun, Jan 18, 2026 at 10:34:35AM +0100, Thomas Gleixner wrote:
> On Fri, Jan 16 2026 at 15:05, Waiman Long wrote:
> > On 1/16/26 1:08 PM, Jiri Wiesner wrote:
> > I think wd_seq_delay should compare to just one 
> > watchdog->uncertainty_margin. See my comment on a related patch sent out 
> > by Thomas.

I agree.

> > https://lore.kernel.org/lkml/8178bbf3-4a9d-4add-93bb-2ebd4dc03e9f@redhat.com/

Thanks for the link, Waiman.

> Duh. I completely forgot about that one....

I see the patch got merged. Thank you for taking action, Thomas.
J.