[PATCH v2] ntp: Make sure RTC is synchronized for any time jump

Benjamin ROBIN posted 1 patch 4 weeks ago
Posted by Benjamin ROBIN 4 weeks ago
Follow-up of commit 35b603f8a78b ("ntp: Make sure RTC is synchronized
when time goes backwards").

sync_hw_clock() is normally called every 11 minutes when time is
synchronized. The issue is that this periodic timer uses the REALTIME
clock, so when time moves backwards, the timer expires late.

If the timer expires late, which can be days later, the RTC is not
updated in the meantime, which is an issue if the device is abruptly
powered off during this period. When the device is powered on again, it
comes up with the date from before the time jump.

This follow-up handles all kernel APIs (syscalls) that can trigger a time
jump. Cancel periodic timer on any time jump, if and only if STA_UNSYNC
flag was previously set (net_clear() was called).

The timer will be relaunched at the end of ntp_notify_cmos_timer() if
NTP is synced again somehow: This is possible since stopping the timer is
done outside of a locked section. Otherwise the timer will be relaunched
later when NTP is synced. This way, when the time is synchronized again,
the RTC is updated after less than 2 seconds.

Signed-off-by: Benjamin ROBIN <dev@benjarobin.fr>
---
 kernel/time/ntp.c          | 10 +++++-----
 kernel/time/ntp_internal.h |  4 ++--
 kernel/time/timekeeping.c  | 11 ++++++++---
 3 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/kernel/time/ntp.c b/kernel/time/ntp.c
index b550ebe0f03b..d91074633c83 100644
--- a/kernel/time/ntp.c
+++ b/kernel/time/ntp.c
@@ -657,14 +657,14 @@ static void sync_hw_clock(struct work_struct *work)
 	sched_sync_hw_clock(offset_nsec, res != 0);
 }
 
-void ntp_notify_cmos_timer(bool offset_set)
+void ntp_notify_cmos_timer(bool ntp_was_cleared)
 {
 	/*
-	 * If the time jumped (using ADJ_SETOFFSET) cancels sync timer,
-	 * which may have been running if the time was synchronized
-	 * prior to the ADJ_SETOFFSET call.
+	 * If time jumped (clock set), and if ntp_clear() was called,
+	 * cancels sync timer, which may have been running if time was
+	 * previously synchronized.
 	 */
-	if (offset_set)
+	if (ntp_was_cleared)
 		hrtimer_cancel(&sync_hrtimer);
 
 	/*
diff --git a/kernel/time/ntp_internal.h b/kernel/time/ntp_internal.h
index 5a633dce9057..0615ed904119 100644
--- a/kernel/time/ntp_internal.h
+++ b/kernel/time/ntp_internal.h
@@ -14,9 +14,9 @@ extern int __do_adjtimex(struct __kernel_timex *txc,
 extern void __hardpps(const struct timespec64 *phase_ts, const struct timespec64 *raw_ts);
 
 #if defined(CONFIG_GENERIC_CMOS_UPDATE) || defined(CONFIG_RTC_SYSTOHC)
-extern void ntp_notify_cmos_timer(bool offset_set);
+extern void ntp_notify_cmos_timer(bool ntp_was_cleared);
 #else
-static inline void ntp_notify_cmos_timer(bool offset_set) { }
+static inline void ntp_notify_cmos_timer(bool ntp_was_cleared) { }
 #endif
 
 #endif /* _LINUX_NTP_INTERNAL_H */
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 17cae886ca82..e44e500b694c 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1536,6 +1536,7 @@ int do_settimeofday64(const struct timespec64 *ts)
 
 	/* Signal hrtimers about time change */
 	clock_was_set(CLOCK_SET_WALL);
+	ntp_notify_cmos_timer(true);
 
 	audit_tk_injoffset(ts_delta);
 	add_device_randomness(ts, sizeof(*ts));
@@ -1575,6 +1576,7 @@ static int timekeeping_inject_offset(const struct timespec64 *ts)
 
 	/* Signal hrtimers about time change */
 	clock_was_set(CLOCK_SET_WALL);
+	ntp_notify_cmos_timer(true);
 	return 0;
 }
 
@@ -1936,6 +1938,7 @@ void timekeeping_inject_sleeptime64(const struct timespec64 *delta)
 
 	/* Signal hrtimers about time change */
 	clock_was_set(CLOCK_SET_WALL | CLOCK_SET_BOOT);
+	ntp_notify_cmos_timer(true);
 }
 #endif
 
@@ -2658,7 +2661,6 @@ EXPORT_SYMBOL_GPL(random_get_entropy_fallback);
 int do_adjtimex(struct __kernel_timex *txc)
 {
 	struct audit_ntp_data ad;
-	bool offset_set = false;
 	bool clock_set = false;
 	struct timespec64 ts;
 	int ret;
@@ -2680,7 +2682,6 @@ int do_adjtimex(struct __kernel_timex *txc)
 		if (ret)
 			return ret;
 
-		offset_set = delta.tv_sec != 0;
 		audit_tk_injoffset(delta);
 	}
 
@@ -2714,7 +2715,11 @@ int do_adjtimex(struct __kernel_timex *txc)
 	if (clock_set)
 		clock_was_set(CLOCK_SET_WALL);
 
-	ntp_notify_cmos_timer(offset_set);
+	/* Time jump (ADJ_SETOFFSET) is handled by timekeeping_inject_offset(),
+	 * which calls ntp_notify_cmos_timer() to cancel NTP sync hrtimer.
+	 * For the rest of do_adjtimex(), NTP sync flag is not cleared, so no
+	 * need to cancel NTP sync hrtimer here. */
+	ntp_notify_cmos_timer(false);
 
 	return ret;
 }
-- 
2.47.0
Re: [PATCH v2] ntp: Make sure RTC is synchronized for any time jump
Posted by Thomas Gleixner 3 weeks, 5 days ago
On Sun, Oct 27 2024 at 18:43, Benjamin ROBIN wrote:
> Follow-up of commit 35b603f8a78b ("ntp: Make sure RTC is synchronized
> when time goes backwards").
>
> sync_hw_clock() is normally called every 11 minutes when time is
> synchronized. The issue is that this periodic timer uses the REALTIME
> clock, so when time moves backwards, the timer expires late.
>
> If the timer expires late, which can be days later, the RTC is not
> updated in the meantime, which is an issue if the device is abruptly
> powered off during this period. When the device is powered on again, it
> comes up with the date from before the time jump.
>
> This follow-up handles all kernel APIs (syscalls) that can trigger a time
> jump. Cancel periodic timer on any time jump, if and only if STA_UNSYNC
> flag was previously set (net_clear() was called).

This does not parse. "Previously set" means it was set before the
operation. What you want to say here is:

  Cancel the RTC synchronization timer if the operation set the
  STA_UNSYNC flag.

net_clear()? I assume you mean ntp_clear(). But that's not the only way:

  do_adjtimex() can modify STA_UNSYNC via process_adj_status()

Also ADJ_TAI modifies CLOCK_REALTIME, which is why clock_was_set() is
invoked. That can make CLOCK_REALTIME go backwards.
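
To make the failure mode concrete, here is a small userspace model (not
kernel code) of an absolute wall-clock expiry and how long it still has
to wait after the wall clock is stepped. The function names are
hypothetical and all times are plain seconds; only the 11 minute period
is taken from the changelog above.

```c
#include <stdint.h>

/* Hypothetical model, not kernel code: all times are in seconds. */
#define SYNC_PERIOD_SEC (11 * 60)

/* Absolute wall-clock expiry of a timer armed at wall time 'wall_now'. */
static int64_t arm_sync_timer(int64_t wall_now)
{
	return wall_now + SYNC_PERIOD_SEC;
}

/* Seconds left until 'expiry' once the wall clock reads 'wall_now'. */
static int64_t time_until_expiry(int64_t expiry, int64_t wall_now)
{
	return expiry > wall_now ? expiry - wall_now : 0;
}

/*
 * Arm the timer at 'wall_now', then step the wall clock by 'jump'
 * seconds (negative means backwards) and report the remaining wait.
 */
static int64_t wait_after_jump(int64_t wall_now, int64_t jump)
{
	int64_t expiry = arm_sync_timer(wall_now);

	return time_until_expiry(expiry, wall_now + jump);
}
```

In this model a backward step of one day turns the 660 second wait into
660 + 86400 seconds, while a forward step just makes the timer fire
right away. That is why only the backward case leaves the RTC stale.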

>  	clock_was_set(CLOCK_SET_WALL);
> +	ntp_notify_cmos_timer(true);

>  	clock_was_set(CLOCK_SET_WALL);
> +	ntp_notify_cmos_timer(true);

>  	clock_was_set(CLOCK_SET_WALL | CLOCK_SET_BOOT);
> +	ntp_notify_cmos_timer(true);

Can we please have a helper function which wraps all of this?

static void timekeeping_clock_was_set(unsigned int bases)
{
        clock_was_set(bases);
        if (bases & CLOCK_SET_WALL)
                ntp_notify_cmos_timer(true);
}

?
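
The behaviour of such a wrapper can be tried out in userspace with
mocks. In the sketch below the two callees are stubs that only count
calls, and the CLOCK_SET_* values are stand-in bits chosen for the
example (the kernel defines its own masks). It illustrates the
suggestion and is not the final kernel code.

```c
#include <stdbool.h>

/* Stand-in bit values for this sketch; the kernel defines its own masks. */
#define CLOCK_SET_WALL 0x1u
#define CLOCK_SET_BOOT 0x2u

/* Counters standing in for the real side effects. */
static unsigned int hrtimers_notified;
static unsigned int sync_timer_cancels;

/* Mock: the real function tells hrtimers that clock bases changed. */
static void clock_was_set(unsigned int bases)
{
	(void)bases;
	hrtimers_notified++;
}

/* Mock: the real function cancels the RTC sync timer when asked to. */
static void ntp_notify_cmos_timer(bool cancel)
{
	if (cancel)
		sync_timer_cancels++;
}

/* Suggested wrapper: notify hrtimers, cancel RTC sync on wall changes. */
static void timekeeping_clock_was_set(unsigned int bases)
{
	clock_was_set(bases);
	if (bases & CLOCK_SET_WALL)
		ntp_notify_cmos_timer(true);
}
```

With these mocks a CLOCK_SET_BOOT only call notifies hrtimers and leaves
the sync timer alone, while any call that includes CLOCK_SET_WALL also
cancels it.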

> @@ -2714,7 +2715,11 @@ int do_adjtimex(struct __kernel_timex *txc)
>  	if (clock_set)
>  		clock_was_set(CLOCK_SET_WALL);
>  
> -	ntp_notify_cmos_timer(offset_set);
> +	/* Time jump (ADJ_SETOFFSET) is handled by timekeeping_inject_offset(),
> +	 * which calls ntp_notify_cmos_timer() to cancel NTP sync hrtimer.
> +	 * For the rest of do_adjtimex(), NTP sync flag is not cleared, so no
> +	 * need to cancel NTP sync hrtimer here. */

/*
 * Aside from the horrible comment formatting this is wrong as I pointed
 * out above.
 */

The problem here is that ADJ_SETOFFSET is handled separately. All of
this really wants to be in one tk_core.lock held section.

Just split out the inner workings of timekeeping_inject_offset() into a
helper and invoke it under the lock from both places which call it.

Same for timekeeping_advance().

Make sure to move all the audit and randomness muck outside of the
locked region.

That covers ADJ_SETOFFSET and ADJ_TAI, but still fails to take the
modifications of STA_UNSYNC into account. That's trivial to solve,
though, because you can let do_adjtimex() indicate that change to the
caller.

Then you end up with:

     if (clock_set)
     	timekeeping_clock_was_set(CLOCK_SET_WALL);
     else
     	ntp_notify_cmos_timer(sta_unsync_changed);

or something like that. The latter makes sure that the timer is canceled
when STA_UNSYNC changed. It does not matter whether it was set or
cleared. You always want to cancel.
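
One possible way to express the "changed in either direction" test and
the resulting dispatch is sketched below as plain userspace C. The enum
and both function names are hypothetical and exist only for this
example. STA_UNSYNC uses the value from the UAPI header
<linux/timex.h>.

```c
#include <stdbool.h>

/* STA_UNSYNC as defined in the UAPI header <linux/timex.h>. */
#define STA_UNSYNC 0x0040

/* Hypothetical labels for the three outcomes of the dispatch. */
enum notify_action {
	NOTIFY_CLOCK_WAS_SET,	/* timekeeping_clock_was_set(CLOCK_SET_WALL) */
	NOTIFY_CANCEL_TIMER,	/* ntp_notify_cmos_timer(true) */
	NOTIFY_KEEP_TIMER,	/* ntp_notify_cmos_timer(false) */
};

/* True if STA_UNSYNC differs between the old and the new status word. */
static bool sta_unsync_changed(int old_status, int new_status)
{
	return ((old_status ^ new_status) & STA_UNSYNC) != 0;
}

/* Mirrors the if/else above: a clock set wins, else look at STA_UNSYNC. */
static enum notify_action pick_action(bool clock_set, int old_status,
				      int new_status)
{
	if (clock_set)
		return NOTIFY_CLOCK_WAS_SET;

	return sta_unsync_changed(old_status, new_status) ?
		NOTIFY_CANCEL_TIMER : NOTIFY_KEEP_TIMER;
}
```

The XOR makes the check symmetric, so setting and clearing STA_UNSYNC
both end in a cancel, while changes to other status bits leave the timer
untouched.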

That obviously needs to be split into several patches, but you get the
idea.

Thanks,

        tglx