[PATCH] xen/arm: avoid vtimer flip-flop transition in context switch

Wei Chen posted 1 patch 1 year, 10 months ago
Test gitlab-ci failed
Patches applied successfully (tree, apply log)
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/20220615013909.283887-1-wei.chen@arm.com
xen/arch/arm/vtimer.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
[PATCH] xen/arm: avoid vtimer flip-flop transition in context switch
Posted by Wei Chen 1 year, 10 months ago
virt_vtimer_save is calculating the new time for the vtimer and
has a potential risk of timer flip in:
"v->arch.virt_timer.cval + v->domain->arch.virt_timer_base.offset
- boot_count".
In this formula, "cval + offset" could make uint64_t overflow.
Generally speaking, this is difficult to trigger. But unfortunately
the problem was encountered with a platform where the timer started
with a very huge initial value, like 0xF333899122223333. On this
platform cval + offset is overflowing after running for a while.

So in this patch, we adjust the formula to use "offset - boot_count"
first, and then use the result to plus cval. This will avoid the
uint64_t overflow.

Signed-off-by: Wei Chen <wei.chen@arm.com>
---
 xen/arch/arm/vtimer.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c
index 5bb5970f58..86e63303c8 100644
--- a/xen/arch/arm/vtimer.c
+++ b/xen/arch/arm/vtimer.c
@@ -144,8 +144,9 @@ void virt_timer_save(struct vcpu *v)
     if ( (v->arch.virt_timer.ctl & CNTx_CTL_ENABLE) &&
          !(v->arch.virt_timer.ctl & CNTx_CTL_MASK))
     {
-        set_timer(&v->arch.virt_timer.timer, ticks_to_ns(v->arch.virt_timer.cval +
-                  v->domain->arch.virt_timer_base.offset - boot_count));
+        set_timer(&v->arch.virt_timer.timer,
+                  ticks_to_ns(v->domain->arch.virt_timer_base.offset -
+                              boot_count + v->arch.virt_timer.cval));
     }
 }
 
-- 
2.25.1
Re: [PATCH] xen/arm: avoid vtimer flip-flop transition in context switch
Posted by Julien Grall 1 year, 10 months ago
Hi Wei,

Title: I don't quite understand what you mean by "flip-flop transition".

On 15/06/2022 02:39, Wei Chen wrote:
> virt_vtimer_save is calculating the new time for the vtimer and
> has a potential risk of timer flip in:
> "v->arch.virt_timer.cval + v->domain->arch.virt_timer_base.offset
> - boot_count".
> In this formula, "cval + offset" could make uint64_t overflow.
> Generally speaking, this is difficult to trigger. 
> But unfortunately
> the problem was encountered with a platform where the timer started
> with a very huge initial value, like 0xF333899122223333. On this
> platform cval + offset is overflowing after running for a while.

I am not sure how this is a problem. Yes, uint64_t will overflow with 
"cval + offset", but AFAIK the overall result will still be correct and 
not undefined.

> 
> So in this patch, we adjust the formula to use "offset - boot_count"
> first, and then use the result to plus cval. This will avoid the
> uint64_t overflow.

Technically, the overflow is still present because the (offset - 
boot_count) is a non-zero value *and* cval is a 64-bit value.

So I think the equation below should be reworked to...

> 
> Signed-off-by: Wei Chen <wei.chen@arm.com>
> ---
>   xen/arch/arm/vtimer.c | 5 +++--
>   1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c
> index 5bb5970f58..86e63303c8 100644
> --- a/xen/arch/arm/vtimer.c
> +++ b/xen/arch/arm/vtimer.c
> @@ -144,8 +144,9 @@ void virt_timer_save(struct vcpu *v)
>       if ( (v->arch.virt_timer.ctl & CNTx_CTL_ENABLE) &&
>            !(v->arch.virt_timer.ctl & CNTx_CTL_MASK))
>       {
> -        set_timer(&v->arch.virt_timer.timer, ticks_to_ns(v->arch.virt_timer.cval +
> -                  v->domain->arch.virt_timer_base.offset - boot_count));
> +        set_timer(&v->arch.virt_timer.timer,
> +                  ticks_to_ns(v->domain->arch.virt_timer_base.offset -
> +                              boot_count + v->arch.virt_timer.cval));

... something like:

ticks_to_ns(offset - boot_count) + ticks_to_ns(cval);

The first part of the equation should always be the same. So it could be 
stored in struct domain.

Cheers,

-- 
Julien Grall
RE: [PATCH] xen/arm: avoid vtimer flip-flop transition in context switch
Posted by Wei Chen 1 year, 10 months ago
Hi Julien,

> -----Original Message-----
> From: Julien Grall <julien@xen.org>
> Sent: 2022年6月15日 17:47
> To: Wei Chen <Wei.Chen@arm.com>; xen-devel@lists.xenproject.org
> Cc: nd <nd@arm.com>; Stefano Stabellini <sstabellini@kernel.org>; Bertrand
> Marquis <Bertrand.Marquis@arm.com>; Volodymyr Babchuk
> <Volodymyr_Babchuk@epam.com>
> Subject: Re: [PATCH] xen/arm: avoid vtimer flip-flop transition in context
> switch
> 
> Hi Wei,
> 
> Title: I don't quite understand what you mean by "flip-flop transition".
> 

Sorry for the no accurate words. I mean the time reaches to the MAX uint64_t
and continue from 0. Maybe an "overflow" be better for this description. 

> On 15/06/2022 02:39, Wei Chen wrote:
> > virt_vtimer_save is calculating the new time for the vtimer and
> > has a potential risk of timer flip in:
> > "v->arch.virt_timer.cval + v->domain->arch.virt_timer_base.offset
> > - boot_count".
> > In this formula, "cval + offset" could make uint64_t overflow.
> > Generally speaking, this is difficult to trigger.
> > But unfortunately
> > the problem was encountered with a platform where the timer started
> > with a very huge initial value, like 0xF333899122223333. On this
> > platform cval + offset is overflowing after running for a while.
> 
> I am not sure how this is a problem. Yes, uint64_t will overflow with
> "cval + offset", but AFAIK the overall result will still be correct and
> not undefined.
> 

Yes, just as you said, even overflow, but the result is still correct.
I had just noticed the overflow, but thought in a wrong way.

We have encountered this overflow in one RTOS guest:
PCNT:      0xf33ad45367e4a4ff
EL2_OFF:   0xf333333359e29a7f
BOOT_TICK: 0xf333333359e00000
VCNT:      0x0007a1200e020a80
If there is no timer in list, RTOS will calculate a huge number for
"infinite wait", for example:
VCAL:    0x0ff7a1200e020a80
EL2_OFF + VCAL - BOOT_TICK = 0x032ad45367e4a4ff - 0xf333333359e00000 = 0xFF7A1200E04A4FF
EL2_OFF - BOOT_TICK + VCAL = 0x29a7f + 0x0ff7a1200e020a80 = 0xFF7A1200E04A4FF


> >
> > So in this patch, we adjust the formula to use "offset - boot_count"
> > first, and then use the result to plus cval. This will avoid the
> > uint64_t overflow.
> 
> Technically, the overflow is still present because the (offset -
> boot_count) is a non-zero value *and* cval is a 64-bit value.
> 

Yes, GuestOS can issue any valid 64-bit value for their usage.

> So I think the equation below should be reworked to...
> 
> >
> > Signed-off-by: Wei Chen <wei.chen@arm.com>
> > ---
> >   xen/arch/arm/vtimer.c | 5 +++--
> >   1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c
> > index 5bb5970f58..86e63303c8 100644
> > --- a/xen/arch/arm/vtimer.c
> > +++ b/xen/arch/arm/vtimer.c
> > @@ -144,8 +144,9 @@ void virt_timer_save(struct vcpu *v)
> >       if ( (v->arch.virt_timer.ctl & CNTx_CTL_ENABLE) &&
> >            !(v->arch.virt_timer.ctl & CNTx_CTL_MASK))
> >       {
> > -        set_timer(&v->arch.virt_timer.timer, ticks_to_ns(v-
> >arch.virt_timer.cval +
> > -                  v->domain->arch.virt_timer_base.offset - boot_count));
> > +        set_timer(&v->arch.virt_timer.timer,
> > +                  ticks_to_ns(v->domain->arch.virt_timer_base.offset -
> > +                              boot_count + v->arch.virt_timer.cval));
> 
> ... something like:
> 
> ticks_to_ns(offset - boot_count) + ticks_to_ns(cval);
> 
> The first part of the equation should always be the same. So it could be
> stored in struct domain.
> 

If you think there is still some values to continue this patch, I will
address this comment : )


Thanks,
Wei Chen

> Cheers,
> 
> --
> Julien Grall
Re: [PATCH] xen/arm: avoid vtimer flip-flop transition in context switch
Posted by Julien Grall 1 year, 10 months ago

On 15/06/2022 11:36, Wei Chen wrote:
> Hi Julien,

Hi Wei,

>> -----Original Message-----
>> From: Julien Grall <julien@xen.org>
>> Sent: 2022年6月15日 17:47
>> To: Wei Chen <Wei.Chen@arm.com>; xen-devel@lists.xenproject.org
>> Cc: nd <nd@arm.com>; Stefano Stabellini <sstabellini@kernel.org>; Bertrand
>> Marquis <Bertrand.Marquis@arm.com>; Volodymyr Babchuk
>> <Volodymyr_Babchuk@epam.com>
>> Subject: Re: [PATCH] xen/arm: avoid vtimer flip-flop transition in context
>> switch
>>>
>>> So in this patch, we adjust the formula to use "offset - boot_count"
>>> first, and then use the result to plus cval. This will avoid the
>>> uint64_t overflow.
>>
>> Technically, the overflow is still present because the (offset -
>> boot_count) is a non-zero value *and* cval is a 64-bit value.
>>
> 
> Yes, GuestOS can issue any valid 64-bit value for their usage.
> 
>> So I think the equation below should be reworked to...
>>
>>>
>>> Signed-off-by: Wei Chen <wei.chen@arm.com>
>>> ---
>>>    xen/arch/arm/vtimer.c | 5 +++--
>>>    1 file changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c
>>> index 5bb5970f58..86e63303c8 100644
>>> --- a/xen/arch/arm/vtimer.c
>>> +++ b/xen/arch/arm/vtimer.c
>>> @@ -144,8 +144,9 @@ void virt_timer_save(struct vcpu *v)
>>>        if ( (v->arch.virt_timer.ctl & CNTx_CTL_ENABLE) &&
>>>             !(v->arch.virt_timer.ctl & CNTx_CTL_MASK))
>>>        {
>>> -        set_timer(&v->arch.virt_timer.timer, ticks_to_ns(v-
>>> arch.virt_timer.cval +
>>> -                  v->domain->arch.virt_timer_base.offset - boot_count));
>>> +        set_timer(&v->arch.virt_timer.timer,
>>> +                  ticks_to_ns(v->domain->arch.virt_timer_base.offset -
>>> +                              boot_count + v->arch.virt_timer.cval));
>>
>> ... something like:
>>
>> ticks_to_ns(offset - boot_count) + ticks_to_ns(cval);
>>
>> The first part of the equation should always be the same. So it could be
>> stored in struct domain.
>>
> 
> If you think there is still some values to continue this patch, I will
> address this comment : )

I think there are. This can be easily triggered by a vCPU setting a 
large cval.

Cheers,

-- 
Julien Grall

RE: [PATCH] xen/arm: avoid vtimer flip-flop transition in context switch
Posted by Wei Chen 1 year, 10 months ago
Hi Julien,

> -----Original Message-----
> From: Julien Grall <julien@xen.org>
> Sent: 2022年6月17日 16:32
> To: Wei Chen <Wei.Chen@arm.com>; xen-devel@lists.xenproject.org
> Cc: nd <nd@arm.com>; Stefano Stabellini <sstabellini@kernel.org>; Bertrand
> Marquis <Bertrand.Marquis@arm.com>; Volodymyr Babchuk
> <Volodymyr_Babchuk@epam.com>
> Subject: Re: [PATCH] xen/arm: avoid vtimer flip-flop transition in context
> switch
> 
> 
> 
> On 15/06/2022 11:36, Wei Chen wrote:
> > Hi Julien,
> 
> Hi Wei,
> 
> >> -----Original Message-----
> >> From: Julien Grall <julien@xen.org>
> >> Sent: 2022年6月15日 17:47
> >> To: Wei Chen <Wei.Chen@arm.com>; xen-devel@lists.xenproject.org
> >> Cc: nd <nd@arm.com>; Stefano Stabellini <sstabellini@kernel.org>;
> Bertrand
> >> Marquis <Bertrand.Marquis@arm.com>; Volodymyr Babchuk
> >> <Volodymyr_Babchuk@epam.com>
> >> Subject: Re: [PATCH] xen/arm: avoid vtimer flip-flop transition in
> context
> >> switch
> >>>
> >>> So in this patch, we adjust the formula to use "offset - boot_count"
> >>> first, and then use the result to plus cval. This will avoid the
> >>> uint64_t overflow.
> >>
> >> Technically, the overflow is still present because the (offset -
> >> boot_count) is a non-zero value *and* cval is a 64-bit value.
> >>
> >
> > Yes, GuestOS can issue any valid 64-bit value for their usage.
> >
> >> So I think the equation below should be reworked to...
> >>
> >>>
> >>> Signed-off-by: Wei Chen <wei.chen@arm.com>
> >>> ---
> >>>    xen/arch/arm/vtimer.c | 5 +++--
> >>>    1 file changed, 3 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c
> >>> index 5bb5970f58..86e63303c8 100644
> >>> --- a/xen/arch/arm/vtimer.c
> >>> +++ b/xen/arch/arm/vtimer.c
> >>> @@ -144,8 +144,9 @@ void virt_timer_save(struct vcpu *v)
> >>>        if ( (v->arch.virt_timer.ctl & CNTx_CTL_ENABLE) &&
> >>>             !(v->arch.virt_timer.ctl & CNTx_CTL_MASK))
> >>>        {
> >>> -        set_timer(&v->arch.virt_timer.timer, ticks_to_ns(v-
> >>> arch.virt_timer.cval +
> >>> -                  v->domain->arch.virt_timer_base.offset -
> boot_count));
> >>> +        set_timer(&v->arch.virt_timer.timer,
> >>> +                  ticks_to_ns(v->domain->arch.virt_timer_base.offset
> -
> >>> +                              boot_count + v->arch.virt_timer.cval));
> >>
> >> ... something like:
> >>
> >> ticks_to_ns(offset - boot_count) + ticks_to_ns(cval);
> >>
> >> The first part of the equation should always be the same. So it could
> be
> >> stored in struct domain.
> >>
> >
> > If you think there is still some values to continue this patch, I will
> > address this comment : )
> 
> I think there are. This can be easily triggered by a vCPU setting a
> large cval.
> 

Ok, thanks! We will address it and refine the subject and descriptions.

> Cheers,
> 
> --
> Julien Grall