If the period is too big, the 'delta * period' product result
might overflow, resulting in a negative number, then the
next_event ends before the last_event. This is buggy, as there
is no forward progress. Assert this can not happen.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
---
hw/core/ptimer.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c
index d58e2dfdb0..88085d4c81 100644
--- a/hw/core/ptimer.c
+++ b/hw/core/ptimer.c
@@ -125,6 +125,9 @@ static void ptimer_reload(ptimer_state *s, int delta_adjust)
s->last_event = s->next_event;
s->next_event = s->last_event + delta * period;
+ /* Verify forward progress */
+ g_assert(s->next_event > s->last_event);
+
if (period_frac) {
s->next_event += ((int64_t)period_frac * delta) >> 32;
}
--
2.20.1
On Sat, 21 Sep 2019 at 11:17, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
>
> If the period is too big, the 'delta * period' product result
> might overflow, resulting in a negative number, then the
> next_event ends before the last_event. This is buggy, as there
> is no forward progress. Assert this can not happen.
>
> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> ---
> hw/core/ptimer.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c
> index d58e2dfdb0..88085d4c81 100644
> --- a/hw/core/ptimer.c
> +++ b/hw/core/ptimer.c
> @@ -125,6 +125,9 @@ static void ptimer_reload(ptimer_state *s, int delta_adjust)
>
> s->last_event = s->next_event;
> s->next_event = s->last_event + delta * period;
> + /* Verify forward progress */
> + g_assert(s->next_event > s->last_event);
> +
> if (period_frac) {
> s->next_event += ((int64_t)period_frac * delta) >> 32;
> }
> --
Can this only happen if a QEMU timer model using the ptimer
code has a bug, or is it guest-triggerable for some of our
timer models?
thanks
-- PMM
On 9/23/19 4:40 PM, Peter Maydell wrote:
> On Sat, 21 Sep 2019 at 11:17, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
>>
>> If the period is too big, the 'delta * period' product result
>> might overflow, resulting in a negative number, then the
>> next_event ends before the last_event. This is buggy, as there
>> is no forward progress. Assert this can not happen.
>>
>> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
>> ---
>> hw/core/ptimer.c | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/hw/core/ptimer.c b/hw/core/ptimer.c
>> index d58e2dfdb0..88085d4c81 100644
>> --- a/hw/core/ptimer.c
>> +++ b/hw/core/ptimer.c
>> @@ -125,6 +125,9 @@ static void ptimer_reload(ptimer_state *s, int delta_adjust)
>>
>> s->last_event = s->next_event;
>> s->next_event = s->last_event + delta * period;
>> + /* Verify forward progress */
>> + g_assert(s->next_event > s->last_event);
>> +
>> if (period_frac) {
>> s->next_event += ((int64_t)period_frac * delta) >> 32;
>> }
>> --
>
> Can this only happen if a QEMU timer model using the ptimer
> code has a bug, or is it guest-triggerable for some of our
> timer models?
I hit this running a raspi4 guest, I had incorrectly initialized a clock
using the core cpu frequency, while I had to use the APB one (in my
case, core_cpu_freq / 2). The guest use a high value to configure a slow
timer, which in my buggy case made QEMU hang in hard way to debug.
So yes, it seems guest-triggerable if the implementation is broken.
Using assert() is OK for broken implementation, right?
Or should we audit all ptimer calls?
Regards,
Phil.
On Mon, 23 Sep 2019 at 15:54, Philippe Mathieu-Daudé <philmd@redhat.com> wrote: > > On 9/23/19 4:40 PM, Peter Maydell wrote: > > On Sat, 21 Sep 2019 at 11:17, Philippe Mathieu-Daudé <philmd@redhat.com> wrote: > >> > >> If the period is too big, the 'delta * period' product result > >> might overflow, resulting in a negative number, then the > >> next_event ends before the last_event. This is buggy, as there > >> is no forward progress. Assert this can not happen. > > Can this only happen if a QEMU timer model using the ptimer > > code has a bug, or is it guest-triggerable for some of our > > timer models? > > I hit this running a raspi4 guest, I had incorrectly initialized a clock > using the core cpu frequency, while I had to use the APB one (in my > case, core_cpu_freq / 2). The guest use a high value to configure a slow > timer, which in my buggy case made QEMU hang in hard way to debug. > > So yes, it seems guest-triggerable if the implementation is broken. > Using assert() is OK for broken implementation, right? Yeah, if this can only happen if QEMU code is broken then an assert is OK. I was just trying to find out what the cause was, since "this is buggy" isn't specific about where the bug is. > Or should we audit all ptimer calls? I don't think we specifically need an audit. We could perhaps expand the comment by the assert to specifically say that if the calculation of the next event overflowed then this indicates a bug in the QEMU device model using the ptimer API, so if somebody else runs into the assert they have a hint about what to look at. (An overflowed next_event indicates a time incredibly far in the future, given that it's a nanosecond time in an int64_t.) The other approach I thought of would be to make the ptimer code handle this sort of after-the-end-of-QEMU-universe time by saturating next_event to INT64_MAX rather than letting it overflow and wrap. Unfortunately while this would be fine for the 'timer event' part of the code, it would break ptimer_get_count() which calculates the current counter value by looking at the difference between the current time and the time of the next event (fixable but only with a bunch of messing about to treat a next_event of INT64_MAX as equivalent to the counter being disabled and tracking the counter value in s->delta). So an assert is the best thing I think. thanks -- PMM
On 9/23/19 5:08 PM, Peter Maydell wrote: > On Mon, 23 Sep 2019 at 15:54, Philippe Mathieu-Daudé <philmd@redhat.com> wrote: >> >> On 9/23/19 4:40 PM, Peter Maydell wrote: >>> On Sat, 21 Sep 2019 at 11:17, Philippe Mathieu-Daudé <philmd@redhat.com> wrote: >>>> >>>> If the period is too big, the 'delta * period' product result >>>> might overflow, resulting in a negative number, then the >>>> next_event ends before the last_event. This is buggy, as there >>>> is no forward progress. Assert this can not happen. > >>> Can this only happen if a QEMU timer model using the ptimer >>> code has a bug, or is it guest-triggerable for some of our >>> timer models? >> >> I hit this running a raspi4 guest, I had incorrectly initialized a clock >> using the core cpu frequency, while I had to use the APB one (in my >> case, core_cpu_freq / 2). The guest use a high value to configure a slow >> timer, which in my buggy case made QEMU hang in hard way to debug. >> >> So yes, it seems guest-triggerable if the implementation is broken. >> Using assert() is OK for broken implementation, right? > > Yeah, if this can only happen if QEMU code is broken then > an assert is OK. I was just trying to find out what the > cause was, since "this is buggy" isn't specific about where > the bug is. > >> Or should we audit all ptimer calls? > > I don't think we specifically need an audit. We could perhaps > expand the comment by the assert to specifically say that if > the calculation of the next event overflowed then this indicates > a bug in the QEMU device model using the ptimer API, so if > somebody else runs into the assert they have a hint about > what to look at. (An overflowed next_event indicates a time > incredibly far in the future, given that it's a nanosecond > time in an int64_t.) OK I'll improve the comment. > The other approach I thought of would be to make the ptimer > code handle this sort of after-the-end-of-QEMU-universe time > by saturating next_event to INT64_MAX rather than letting it > overflow and wrap. Unfortunately while this would be fine for > the 'timer event' part of the code, it would break > ptimer_get_count() which calculates the current counter > value by looking at the difference between the current > time and the time of the next event (fixable but only with > a bunch of messing about to treat a next_event of INT64_MAX > as equivalent to the counter being disabled and tracking > the counter value in s->delta). So an assert is the > best thing I think. Yes :/ Thanks! Phil.
© 2016 - 2026 Red Hat, Inc.