[PATCH] hw/rtc: fix crash caused by lost_clock >= 0 assertion

Yaowei Bai posted 1 patch 1 year, 5 months ago
git fetch https://github.com/patchew-project/qemu tags/patchew/1670228615-2684-1-git-send-email-baiyw2@chinatelecom.cn
Maintainers: "Michael S. Tsirkin" <mst@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>
hw/rtc/mc146818rtc.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
[PATCH] hw/rtc: fix crash caused by lost_clock >= 0 assertion
Posted by Yaowei Bai 1 year, 5 months ago
In our production environment a guest crashed with this log:

    qemu-kvm: /home/abuild/rpmbuild/BUILD/qemu-5.0.0/hw/rtc/mc146818rtc.c:201: periodic_timer_update: Assertion `lost_clock >= 0' failed.
    2022-09-26 10:00:28.747+0000: shutting down, reason=crashed

This happened after the host synced time with an NTP server whose time
we had adjusted backward because it had mistakenly been running ahead
of real time. Other people have hit this problem too:

    https://bugzilla.redhat.com/show_bug.cgi?id=2054781

After the host time was adjusted backward, the guest reconfigured the
period, which made cur_clock smaller than last_periodic_clock in
periodic_timer_update(). However, the code assumes that cur_clock is
never smaller than last_periodic_clock, which does not hold in the
situation above. Make the assumption explicit with an if statement, so
that lost_clock is only computed when cur_clock is bigger than
last_periodic_clock. With this patch, the situation no longer trips the
assertion and just resets next_periodic_time.

Signed-off-by: Yaowei Bai <baiyw2@chinatelecom.cn>
---
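Note for reviewers, not part of the commit: the intended behaviour of
the guarded computation can be sketched in isolation as below. This is
only an illustration under assumptions. The helper name
lost_clock_since() is hypothetical and does not exist in
mc146818rtc.c, and the sketch assumes lost_clock starts at 0 before
the guarded assignment, which is what the patched code relies on.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper for illustration only; not part of QEMU.
 * Returns the number of RTC clock ticks elapsed since the last
 * periodic tick, or 0 if the clock appears to have moved backward.
 */
static int64_t lost_clock_since(int64_t cur_clock,
                                int64_t last_periodic_clock)
{
    /* Assumption: 0 is the starting value, as in the patched function. */
    int64_t lost_clock = 0;

    /* Only subtract when the result cannot be negative. */
    if (cur_clock > last_periodic_clock) {
        lost_clock = cur_clock - last_periodic_clock;
    }

    /* Holds on both paths, so a backward time step no longer aborts. */
    assert(lost_clock >= 0);
    return lost_clock;
}
```

With a backward step (cur_clock below last_periodic_clock) the helper
returns 0 instead of tripping the assertion. Whether treating that case
as "no lost ticks" is the right policy is a separate question.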
 hw/rtc/mc146818rtc.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/hw/rtc/mc146818rtc.c b/hw/rtc/mc146818rtc.c
index 1ebb412..a397949 100644
--- a/hw/rtc/mc146818rtc.c
+++ b/hw/rtc/mc146818rtc.c
@@ -199,7 +199,9 @@ periodic_timer_update(RTCState *s, int64_t current_time, uint32_t old_period, bo
         next_periodic_clock = muldiv64(s->next_periodic_time,
                                 RTC_CLOCK_RATE, NANOSECONDS_PER_SECOND);
         last_periodic_clock = next_periodic_clock - old_period;
-        lost_clock = cur_clock - last_periodic_clock;
+        if (cur_clock > last_periodic_clock) {
+            lost_clock = cur_clock - last_periodic_clock;
+        }
         assert(lost_clock >= 0);
     }
 
-- 
2.7.4
Re: [PATCH] hw/rtc: fix crash caused by lost_clock >= 0 assertion
Posted by Michael S. Tsirkin 1 year ago
On Mon, Dec 05, 2022 at 04:23:35PM +0800, Yaowei Bai wrote:
> In our production environment a guest crashed with this log:
> 
>     qemu-kvm: /home/abuild/rpmbuild/BUILD/qemu-5.0.0/hw/rtc/mc146818rtc.c:201: periodic_timer_update: Assertion `lost_clock >= 0' failed.
>     2022-09-26 10:00:28.747+0000: shutting down, reason=crashed
> 
> This happened after the host synced time with an NTP server whose time
> we had adjusted backward because it had mistakenly been running ahead
> of real time. Other people have hit this problem too:
> 
>     https://bugzilla.redhat.com/show_bug.cgi?id=2054781
> 
> After the host time was adjusted backward, the guest reconfigured the
> period, which made cur_clock smaller than last_periodic_clock in
> periodic_timer_update(). However, the code assumes that cur_clock is
> never smaller than last_periodic_clock, which does not hold in the
> situation above. Make the assumption explicit with an if statement, so
> that lost_clock is only computed when cur_clock is bigger than
> last_periodic_clock. With this patch, the situation no longer trips the
> assertion and just resets next_periodic_time.
> 
> Signed-off-by: Yaowei Bai <baiyw2@chinatelecom.cn>


Hmm not sure this is a good fix.  Paolo what's your take?

> ---
>  hw/rtc/mc146818rtc.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/rtc/mc146818rtc.c b/hw/rtc/mc146818rtc.c
> index 1ebb412..a397949 100644
> --- a/hw/rtc/mc146818rtc.c
> +++ b/hw/rtc/mc146818rtc.c
> @@ -199,7 +199,9 @@ periodic_timer_update(RTCState *s, int64_t current_time, uint32_t old_period, bo
>          next_periodic_clock = muldiv64(s->next_periodic_time,
>                                  RTC_CLOCK_RATE, NANOSECONDS_PER_SECOND);
>          last_periodic_clock = next_periodic_clock - old_period;
> -        lost_clock = cur_clock - last_periodic_clock;
> +        if (cur_clock > last_periodic_clock) {
> +            lost_clock = cur_clock - last_periodic_clock;
> +        }
>          assert(lost_clock >= 0);
>      }
>  
> -- 
> 2.7.4