[XEN PATCH v3] xen/arinc653: fix delay in the start of major frame

Anderson Choi posted 1 patch 3 months, 2 weeks ago
Patches applied successfully (tree, apply log)
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/cb18dbefaf574eb4d3647097d06debcab378533a.1752815008.git.anderson.choi@boeing.com
There is a newer version of this series
xen/common/sched/arinc653.c | 39 ++++++++++++++++++++-----------------
1 file changed, 21 insertions(+), 18 deletions(-)
[XEN PATCH v3] xen/arinc653: fix delay in the start of major frame
Posted by Anderson Choi 3 months, 2 weeks ago
ARINC653 specificaion requires partition scheduling to be deterministic
and periodic over time.

However, the use of current timestamp (now) as the baseline to calculate
next_major_frame and next_switch_time introduces a delay in the start of
major frame at every period, which breaks determinism and periodicity in
partition scheduling.

For example, we observe 3.5 msec of accumulated delay at the 21st major
frame with the following configuration.

Target : qemuarm64
xen version : 4.19 (43aeacff86, x86/IRQ: constrain creator-domain-ID assertion)
dom1 : 10 msec runtime
dom2 : 10 msec runtime

$ a653_sched -p Pool-arinc dom1:10 dom2:10

0.014553536 ---x d?v? runstate_change d1v0 runnable->running //1st major
frame
0.034629712 ---x d?v? runstate_change d1v0 runnable->running //2nd major
frame
<snip>
0.397747008 |||x d?v? runstate_change d1v0 runnable->running //20th
major frame
0.418066096 -||x d?v? runstate_change d1v0 runnable->running //21st
major frame

This is due to an inherent delta between the time value the scheduler timer
is programmed to be fired with and the time value the schedule function
is executed.

Another observation that breaks the deterministic behavior of partition
scheduling is a delayed execution of schedule(); It was called 14 msec
later than programmed.

1.530603952 ---x d?v? runstate_change d1v0 runnable->running
1.564956784 --|x d?v? runstate_change d1v0 runnable->running

Enforce the periodic behavior of partition scheduling by using the value
next_major_frame as the base to calculate the start of major frame and
the next domain switch time.

Per discussion with Nathan Studer, there are odd cases the first minor
frame can be also missed. In that sceanario, iterate through the schedule after resyncing
the expected next major frame.

Per suggestion from Stewart Hildebrand, the while loop to calculate the
start of next major frame can be eliminated by using a modulo operation.

Fixes: 22787f2e107c ("ARINC 653 scheduler")

Suggested-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
Suggested-by: Nathan Studer <nathan.studer@dornerworks.com>
Signed-off-by: Anderson Choi <anderson.choi@boeing.com>

---
Changes in v3:
- Replace the while loop with the modulo operation to calculate the
  start of next major frame.
- Initialize major_frame and runtime of zeroth schedule entry to
  DEFAULT_TIMESLICE not to use "if" branch in the calculation of next
major frame.

Changes in v2:
- Changed the logic to resync major frame and to find correct
  minor frame after a miss suggested by Nathan
---
---
 xen/common/sched/arinc653.c | 39 ++++++++++++++++++++-----------------
 1 file changed, 21 insertions(+), 18 deletions(-)

diff --git a/xen/common/sched/arinc653.c b/xen/common/sched/arinc653.c
index 930361fa5c..b7f3f41137 100644
--- a/xen/common/sched/arinc653.c
+++ b/xen/common/sched/arinc653.c
@@ -361,6 +361,8 @@ a653sched_init(struct scheduler *ops)
     ops->sched_data = prv;
 
     prv->next_major_frame = 0;
+    prv->major_frame = DEFAULT_TIMESLICE;
+    prv->schedule[0].runtime = DEFAULT_TIMESLICE;
     spin_lock_init(&prv->lock);
     INIT_LIST_HEAD(&prv->unit_list);
 
@@ -526,29 +528,30 @@ a653sched_do_schedule(
 
     spin_lock_irqsave(&sched_priv->lock, flags);
 
-    if ( sched_priv->num_schedule_entries < 1 )
-        sched_priv->next_major_frame = now + DEFAULT_TIMESLICE;
-    else if ( now >= sched_priv->next_major_frame )
+    /* Switch to next major frame directly eliminating the use of loop */
+    if ( now >= sched_priv->next_major_frame )
     {
-        /* time to enter a new major frame
-         * the first time this function is called, this will be true */
-        /* start with the first domain in the schedule */
+        s_time_t major_frame = sched_priv->major_frame;
+        s_time_t remainder = (now - sched_priv->next_major_frame) % major_frame;
+
+        /* major_frame and schedule[0].runtime contain DEFAULT_TIMESLICE
+         * if num_schedule_entries is zero.
+         */
         sched_priv->sched_index = 0;
-        sched_priv->next_major_frame = now + sched_priv->major_frame;
-        sched_priv->next_switch_time = now + sched_priv->schedule[0].runtime;
+        sched_priv->next_major_frame = now - remainder + major_frame;
+        sched_priv->next_switch_time = now - remainder +
+            sched_priv->schedule[0].runtime;
     }
-    else
+
+    /* Switch minor frame or find correct minor frame after a miss */
+    while ( (now >= sched_priv->next_switch_time) &&
+        (sched_priv->sched_index < sched_priv->num_schedule_entries) )
     {
-        while ( (now >= sched_priv->next_switch_time) &&
-                (sched_priv->sched_index < sched_priv->num_schedule_entries) )
-        {
-            /* time to switch to the next domain in this major frame */
-            sched_priv->sched_index++;
-            sched_priv->next_switch_time +=
-                sched_priv->schedule[sched_priv->sched_index].runtime;
-        }
+        sched_priv->sched_index++;
+        sched_priv->next_switch_time +=
+            sched_priv->schedule[sched_priv->sched_index].runtime;
     }
-
+    
     /*
      * If we exhausted the domains in the schedule and still have time left
      * in the major frame then switch next at the next major frame.
-- 
2.43.0
Re: [XEN PATCH v3] xen/arinc653: fix delay in the start of major frame
Posted by Stewart Hildebrand 3 months, 1 week ago
Hi,

It largely looks OK to me, just a few small comments below.

On 7/18/25 05:16, Anderson Choi wrote:
> ARINC653 specificaion requires partition scheduling to be deterministic

Typo: s/specificaion/specification/

> and periodic over time.
> 
> However, the use of current timestamp (now) as the baseline to calculate
> next_major_frame and next_switch_time introduces a delay in the start of
> major frame at every period, which breaks determinism and periodicity in
> partition scheduling.
> 
> For example, we observe 3.5 msec of accumulated delay at the 21st major
> frame with the following configuration.
> 
> Target : qemuarm64
> xen version : 4.19 (43aeacff86, x86/IRQ: constrain creator-domain-ID assertion)
> dom1 : 10 msec runtime
> dom2 : 10 msec runtime
> 
> $ a653_sched -p Pool-arinc dom1:10 dom2:10
> 
> 0.014553536 ---x d?v? runstate_change d1v0 runnable->running //1st major
> frame
> 0.034629712 ---x d?v? runstate_change d1v0 runnable->running //2nd major
> frame
> <snip>
> 0.397747008 |||x d?v? runstate_change d1v0 runnable->running //20th
> major frame
> 0.418066096 -||x d?v? runstate_change d1v0 runnable->running //21st
> major frame
> 
> This is due to an inherent delta between the time value the scheduler timer
> is programmed to be fired with and the time value the schedule function
> is executed.
> 
> Another observation that breaks the deterministic behavior of partition
> scheduling is a delayed execution of schedule(); It was called 14 msec
> later than programmed.
> 
> 1.530603952 ---x d?v? runstate_change d1v0 runnable->running
> 1.564956784 --|x d?v? runstate_change d1v0 runnable->running
> 
> Enforce the periodic behavior of partition scheduling by using the value
> next_major_frame as the base to calculate the start of major frame and
> the next domain switch time.
> 
> Per discussion with Nathan Studer, there are odd cases the first minor
> frame can be also missed. In that sceanario, iterate through the schedule after resyncing

Typo: s/sceanario/scenario/

The line is also too long.

> the expected next major frame.
> 
> Per suggestion from Stewart Hildebrand, the while loop to calculate the
> start of next major frame can be eliminated by using a modulo operation.

The while loop was only in earlier revisions of the patch, not in the
upstream codebase, so it doesn't make sense to mention it in the commit
message.

> 
> Fixes: 22787f2e107c ("ARINC 653 scheduler")
> 

Please remove the extraneous newline between the Fixes: tag and
remaining tags

> Suggested-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
> Suggested-by: Nathan Studer <nathan.studer@dornerworks.com>
> Signed-off-by: Anderson Choi <anderson.choi@boeing.com>
> 
> ---
> Changes in v3:
> - Replace the while loop with the modulo operation to calculate the
>   start of next major frame.
> - Initialize major_frame and runtime of zeroth schedule entry to
>   DEFAULT_TIMESLICE not to use "if" branch in the calculation of next
> major frame.
> 
> Changes in v2:
> - Changed the logic to resync major frame and to find correct
>   minor frame after a miss suggested by Nathan
> ---
> ---
>  xen/common/sched/arinc653.c | 39 ++++++++++++++++++++-----------------
>  1 file changed, 21 insertions(+), 18 deletions(-)
> 
> diff --git a/xen/common/sched/arinc653.c b/xen/common/sched/arinc653.c
> index 930361fa5c..b7f3f41137 100644
> --- a/xen/common/sched/arinc653.c
> +++ b/xen/common/sched/arinc653.c
> @@ -361,6 +361,8 @@ a653sched_init(struct scheduler *ops)
>      ops->sched_data = prv;
>  
>      prv->next_major_frame = 0;
> +    prv->major_frame = DEFAULT_TIMESLICE;
> +    prv->schedule[0].runtime = DEFAULT_TIMESLICE;
>      spin_lock_init(&prv->lock);
>      INIT_LIST_HEAD(&prv->unit_list);
>  
> @@ -526,29 +528,30 @@ a653sched_do_schedule(
>  
>      spin_lock_irqsave(&sched_priv->lock, flags);
>  

Since the num_schedule_entries < 1 case is no longer handled below, we
depend on major_frame > 0. Please add
ASSERT(sched_priv->major_frame > 0); here.

> -    if ( sched_priv->num_schedule_entries < 1 )
> -        sched_priv->next_major_frame = now + DEFAULT_TIMESLICE;
> -    else if ( now >= sched_priv->next_major_frame )
> +    /* Switch to next major frame directly eliminating the use of loop */

Similarly to the commit message, there was no loop in the code before,
it doesn't make sense to mention it in the in-code comment.

> +    if ( now >= sched_priv->next_major_frame )
>      {
> -        /* time to enter a new major frame
> -         * the first time this function is called, this will be true */
> -        /* start with the first domain in the schedule */
> +        s_time_t major_frame = sched_priv->major_frame;
> +        s_time_t remainder = (now - sched_priv->next_major_frame) % major_frame;
> +
> +        /* major_frame and schedule[0].runtime contain DEFAULT_TIMESLICE
> +         * if num_schedule_entries is zero.
> +         */

The start of the multi-line comment should be on its own line. I know
the comment format was a preexisting issue, but since you are changing
the lines anyway, please adhere to CODING_STYLE on the changed lines.

>          sched_priv->sched_index = 0;
> -        sched_priv->next_major_frame = now + sched_priv->major_frame;
> -        sched_priv->next_switch_time = now + sched_priv->schedule[0].runtime;
> +        sched_priv->next_major_frame = now - remainder + major_frame;
> +        sched_priv->next_switch_time = now - remainder +
> +            sched_priv->schedule[0].runtime;
>      }
> -    else
> +
> +    /* Switch minor frame or find correct minor frame after a miss */
> +    while ( (now >= sched_priv->next_switch_time) &&
> +        (sched_priv->sched_index < sched_priv->num_schedule_entries) )
>      {
> -        while ( (now >= sched_priv->next_switch_time) &&
> -                (sched_priv->sched_index < sched_priv->num_schedule_entries) )
> -        {
> -            /* time to switch to the next domain in this major frame */
> -            sched_priv->sched_index++;
> -            sched_priv->next_switch_time +=
> -                sched_priv->schedule[sched_priv->sched_index].runtime;
> -        }
> +        sched_priv->sched_index++;
> +        sched_priv->next_switch_time +=
> +            sched_priv->schedule[sched_priv->sched_index].runtime;
>      }
> -
> +    

Please remove the extraneous white space

>      /*
>       * If we exhausted the domains in the schedule and still have time left
>       * in the major frame then switch next at the next major frame.
RE: [EXTERNAL] Re: [XEN PATCH v3] xen/arinc653: fix delay in the start of major frame
Posted by Choi, Anderson 3 months, 1 week ago
Stewart,

> EXT email: be mindful of links/attachments.
> 
> Hi,
> 
> It largely looks OK to me, just a few small comments below.
> 
> On 7/18/25 05:16, Anderson Choi wrote:
>> ARINC653 specificaion requires partition scheduling to be
>> deterministic
> 
> Typo: s/specificaion/specification/
>
Addressed the typo.

ARINC653 specification requires partition scheduling to be deterministic

>> Per discussion with Nathan Studer, there are odd cases the first minor
>> frame can be also missed. In that sceanario, iterate through the
>> schedule after resyncing
> 
> Typo: s/sceanario/scenario/
> 
> The line is also too long.
> 
Addressed the typo and the lengthy sentence.

Per discussion with Nathan Studer, there are odd cases the first minor
frame can be also missed. In that scenario, iterate through the schedule
after resyncing the expected next major frame.

>> Per suggestion from Stewart Hildebrand, the while loop to calculate
>> the start of next major frame can be eliminated by using a modulo
> operation.
> 
> The while loop was only in earlier revisions of the patch, not in the upstream
> codebase, so it doesn't make sense to mention it in the commit message.
>
Removed the reference to the while loop.

Per suggestion from Stewart Hildebrand, use a modulo operation to
calculate the start of next major frame.

>> 
>> Fixes: 22787f2e107c ("ARINC 653 scheduler")
>> 
> 
> Please remove the extraneous newline between the Fixes: tag and
> remaining tags
> 
Removed the newline.

Fixes: 22787f2e107c ("ARINC 653 scheduler")
Suggested-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
Suggested-by: Nathan Studer <nathan.studer@dornerworks.com>
Signed-off-by: Anderson Choi <anderson.choi@boeing.com>

>> @@ -526,29 +528,30 @@ a653sched_do_schedule(
>> 
>>      spin_lock_irqsave(&sched_priv->lock, flags);
> 
> Since the num_schedule_entries < 1 case is no longer handled below, we
> depend on major_frame > 0. Please add ASSERT(sched_priv->major_frame >
> 0); here.
>
>> -    if ( sched_priv->num_schedule_entries < 1 )
>> -        sched_priv->next_major_frame = now + DEFAULT_TIMESLICE;
>> -    else if ( now >= sched_priv->next_major_frame )
>> +    /* Switch to next major frame directly eliminating the use of
>> + loop */
> 
> Similarly to the commit message, there was no loop in the code before, it
> doesn't make sense to mention it in the in-code comment.
>
Added ASSERT() and removed the mention to the loop.

     spin_lock_irqsave(&sched_priv->lock, flags);

-    /* Switch to next major frame directly eliminating the use of loop */
+    ASSERT(sched_priv->major_frame > 0);
+
+    /* Switch to next major frame using a modulo operation */

>> +    if ( now >= sched_priv->next_major_frame )
>>      {
>> -        /* time to enter a new major frame
>> -         * the first time this function is called, this will be true */
>> -        /* start with the first domain in the schedule */
>> +        s_time_t major_frame = sched_priv->major_frame;
>> +        s_time_t remainder = (now - sched_priv->next_major_frame) %
>> + major_frame;
>> +
>> +        /* major_frame and schedule[0].runtime contain DEFAULT_TIMESLICE
>> +         * if num_schedule_entries is zero.
>> +         */
> 
> The start of the multi-line comment should be on its own line. I know the
> comment format was a preexisting issue, but since you are changing the lines
> anyway, please adhere to CODING_STYLE on the changed lines.
> 
Addressed as below.

-        /* major_frame and schedule[0].runtime contain DEFAULT_TIMESLICE
+        /*
+         * major_frame and schedule[0].runtime contain DEFAULT_TIMESLICE
          * if num_schedule_entries is zero.
          */
         sched_priv->sched_index = 0;

>> +        sched_priv->sched_index++;
>> +        sched_priv->next_switch_time +=
>> +            sched_priv->schedule[sched_priv->sched_index].runtime;
>>      }
>> -
>> +
> 
> Please remove the extraneous white space
> 
Removed the white space.

I do appreciate your valuable review.
Would it be okay to submit v4 with all the changes applied?
Please let me know if there's anything that doesn't reflect your comments correctly.

Thanks,
Anderson


Re: [EXTERNAL] Re: [XEN PATCH v3] xen/arinc653: fix delay in the start of major frame
Posted by Stewart Hildebrand 3 months, 1 week ago
On 7/22/25 20:16, Choi, Anderson wrote:
> Stewart,
> 
>> EXT email: be mindful of links/attachments.
>>
>> Hi,
>>
>> It largely looks OK to me, just a few small comments below.
>>
>> On 7/18/25 05:16, Anderson Choi wrote:
>>> ARINC653 specificaion requires partition scheduling to be
>>> deterministic
>>
>> Typo: s/specificaion/specification/
>>
> Addressed the typo.
> 
> ARINC653 specification requires partition scheduling to be deterministic
> 
>>> Per discussion with Nathan Studer, there are odd cases the first minor
>>> frame can be also missed. In that sceanario, iterate through the
>>> schedule after resyncing
>>
>> Typo: s/sceanario/scenario/
>>
>> The line is also too long.
>>
> Addressed the typo and the lengthy sentence.
> 
> Per discussion with Nathan Studer, there are odd cases the first minor
> frame can be also missed. In that scenario, iterate through the schedule
> after resyncing the expected next major frame.
> 
>>> Per suggestion from Stewart Hildebrand, the while loop to calculate
>>> the start of next major frame can be eliminated by using a modulo
>> operation.
>>
>> The while loop was only in earlier revisions of the patch, not in the upstream
>> codebase, so it doesn't make sense to mention it in the commit message.
>>
> Removed the reference to the while loop.
> 
> Per suggestion from Stewart Hildebrand, use a modulo operation to
> calculate the start of next major frame.
> 
>>>
>>> Fixes: 22787f2e107c ("ARINC 653 scheduler")
>>>
>>
>> Please remove the extraneous newline between the Fixes: tag and
>> remaining tags
>>
> Removed the newline.
> 
> Fixes: 22787f2e107c ("ARINC 653 scheduler")
> Suggested-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
> Suggested-by: Nathan Studer <nathan.studer@dornerworks.com>
> Signed-off-by: Anderson Choi <anderson.choi@boeing.com>
> 
>>> @@ -526,29 +528,30 @@ a653sched_do_schedule(
>>>
>>>      spin_lock_irqsave(&sched_priv->lock, flags);
>>
>> Since the num_schedule_entries < 1 case is no longer handled below, we
>> depend on major_frame > 0. Please add ASSERT(sched_priv->major_frame >
>> 0); here.
>>
>>> -    if ( sched_priv->num_schedule_entries < 1 )
>>> -        sched_priv->next_major_frame = now + DEFAULT_TIMESLICE;
>>> -    else if ( now >= sched_priv->next_major_frame )
>>> +    /* Switch to next major frame directly eliminating the use of
>>> + loop */
>>
>> Similarly to the commit message, there was no loop in the code before, it
>> doesn't make sense to mention it in the in-code comment.
>>
> Added ASSERT() and removed the mention to the loop.
> 
>      spin_lock_irqsave(&sched_priv->lock, flags);
> 
> -    /* Switch to next major frame directly eliminating the use of loop */
> +    ASSERT(sched_priv->major_frame > 0);
> +
> +    /* Switch to next major frame using a modulo operation */
> 
>>> +    if ( now >= sched_priv->next_major_frame )
>>>      {
>>> -        /* time to enter a new major frame
>>> -         * the first time this function is called, this will be true */
>>> -        /* start with the first domain in the schedule */
>>> +        s_time_t major_frame = sched_priv->major_frame;
>>> +        s_time_t remainder = (now - sched_priv->next_major_frame) %
>>> + major_frame;
>>> +
>>> +        /* major_frame and schedule[0].runtime contain DEFAULT_TIMESLICE
>>> +         * if num_schedule_entries is zero.
>>> +         */
>>
>> The start of the multi-line comment should be on its own line. I know the
>> comment format was a preexisting issue, but since you are changing the lines
>> anyway, please adhere to CODING_STYLE on the changed lines.
>>
> Addressed as below.
> 
> -        /* major_frame and schedule[0].runtime contain DEFAULT_TIMESLICE
> +        /*
> +         * major_frame and schedule[0].runtime contain DEFAULT_TIMESLICE
>           * if num_schedule_entries is zero.
>           */
>          sched_priv->sched_index = 0;
> 
>>> +        sched_priv->sched_index++;
>>> +        sched_priv->next_switch_time +=
>>> +            sched_priv->schedule[sched_priv->sched_index].runtime;
>>>      }
>>> -
>>> +
>>
>> Please remove the extraneous white space
>>
> Removed the white space.
> 
> I do appreciate your valuable review.
> Would it be okay to submit v4 with all the changes applied?

Yes, please do

> Please let me know if there's anything that doesn't reflect your comments correctly.

Your inline replies look good, thanks

> 
> Thanks,
> Anderson
> 
>