xen/common/sched/arinc653.c | 38 ++++++++++++++++++++----------------- 1 file changed, 21 insertions(+), 17 deletions(-)
ARINC653 specificaion requires partition scheduling to be deterministic
and periodic over time.
However, the use of current timestamp (now) as the baseline to calculate
next_major_frame and next_switch_time introduces a delay in the start of
major frame at every period, which breaks determinism and periodicity in
partition scheduling.
For example, we observe 3.5 msec of accumulated delay at the 21st major
frame with the following configuration.
Target : qemuarm64
xen version : 4.19 (43aeacff86, x86/IRQ: constrain creator-domain-ID assertion)
dom1 : 10 msec runtime
dom2 : 10 msec runtime
$ a653_sched -p Pool-arinc dom1:10 dom2:10
0.014553536 ---x d?v? runstate_change d1v0 runnable->running //1st major
frame
0.034629712 ---x d?v? runstate_change d1v0 runnable->running //2nd major
frame
<snip>
0.397747008 |||x d?v? runstate_change d1v0 runnable->running //20th
major frame
0.418066096 -||x d?v? runstate_change d1v0 runnable->running //21st
major frame
This is due to an inherent delta between the time value the scheduler timer
is programmed to be fired with and the time value the schedule function
is executed.
Another observation that breaks the deterministic behavior of partition
scheduling is a delayed execution of schedule(); It was called 14 msec
later than programmed.
1.530603952 ---x d?v? runstate_change d1v0 runnable->running
1.564956784 --|x d?v? runstate_change d1v0 runnable->running
Enforce the periodic behavior of partition scheduling by using the value
next_major_frame as the base to calculate the start of major frame and
the next domain switch time.
Per discussion with Nathan Studer, there are odd cases the first minor
frame can be also missed. In that sceanario, iterate through the schedule after resyncing
the expected next major frame.
Signed-off-by: Anderson Choi <anderson.choi@boeing.com>
Suggested-by: Nathan Studer <nathan.studer@dornerworks.com>
---
Changes in v2:
- Changed the logic to resync major frame and to find correct
minor frame after a miss suggested by Nathan
---
xen/common/sched/arinc653.c | 38 ++++++++++++++++++++-----------------
1 file changed, 21 insertions(+), 17 deletions(-)
diff --git a/xen/common/sched/arinc653.c b/xen/common/sched/arinc653.c
index 930361fa5c..a7937ed2fd 100644
--- a/xen/common/sched/arinc653.c
+++ b/xen/common/sched/arinc653.c
@@ -526,27 +526,31 @@ a653sched_do_schedule(
spin_lock_irqsave(&sched_priv->lock, flags);
- if ( sched_priv->num_schedule_entries < 1 )
- sched_priv->next_major_frame = now + DEFAULT_TIMESLICE;
- else if ( now >= sched_priv->next_major_frame )
+ /* Switch to next major frame while handling potentially missed frames */
+ while ( now >= sched_priv->next_major_frame )
{
- /* time to enter a new major frame
- * the first time this function is called, this will be true */
- /* start with the first domain in the schedule */
sched_priv->sched_index = 0;
- sched_priv->next_major_frame = now + sched_priv->major_frame;
- sched_priv->next_switch_time = now + sched_priv->schedule[0].runtime;
- }
- else
- {
- while ( (now >= sched_priv->next_switch_time) &&
- (sched_priv->sched_index < sched_priv->num_schedule_entries) )
+
+ if ( sched_priv->num_schedule_entries < 1 )
{
- /* time to switch to the next domain in this major frame */
- sched_priv->sched_index++;
- sched_priv->next_switch_time +=
- sched_priv->schedule[sched_priv->sched_index].runtime;
+ sched_priv->next_major_frame += DEFAULT_TIMESLICE;
+ sched_priv->next_switch_time = sched_priv->next_major_frame;
}
+ else
+ {
+ sched_priv->next_switch_time = sched_priv->next_major_frame +
+ sched_priv->schedule[0].runtime;
+ sched_priv->next_major_frame += sched_priv->major_frame;
+ }
+ }
+
+ /* Switch minor frame or find correct minor frame after a miss */
+ while ( (now >= sched_priv->next_switch_time) &&
+ (sched_priv->sched_index < sched_priv->num_schedule_entries) )
+ {
+ sched_priv->sched_index++;
+ sched_priv->next_switch_time +=
+ sched_priv->schedule[sched_priv->sched_index].runtime;
}
/*
--
2.43.0
On 7/14/25 23:16, Anderson Choi wrote:
> ARINC653 specificaion requires partition scheduling to be deterministic
s/specificaion/specification/
> and periodic over time.
>
> However, the use of current timestamp (now) as the baseline to calculate
> next_major_frame and next_switch_time introduces a delay in the start of
> major frame at every period, which breaks determinism and periodicity in
> partition scheduling.
>
> For example, we observe 3.5 msec of accumulated delay at the 21st major
> frame with the following configuration.
>
> Target : qemuarm64
> xen version : 4.19 (43aeacff86, x86/IRQ: constrain creator-domain-ID assertion)
> dom1 : 10 msec runtime
> dom2 : 10 msec runtime
>
> $ a653_sched -p Pool-arinc dom1:10 dom2:10
>
> 0.014553536 ---x d?v? runstate_change d1v0 runnable->running //1st major
> frame
> 0.034629712 ---x d?v? runstate_change d1v0 runnable->running //2nd major
> frame
> <snip>
> 0.397747008 |||x d?v? runstate_change d1v0 runnable->running //20th
> major frame
> 0.418066096 -||x d?v? runstate_change d1v0 runnable->running //21st
> major frame
>
> This is due to an inherent delta between the time value the scheduler timer
> is programmed to be fired with and the time value the schedule function
> is executed.
>
> Another observation that breaks the deterministic behavior of partition
> scheduling is a delayed execution of schedule(); It was called 14 msec
> later than programmed.
>
> 1.530603952 ---x d?v? runstate_change d1v0 runnable->running
> 1.564956784 --|x d?v? runstate_change d1v0 runnable->running
>
> Enforce the periodic behavior of partition scheduling by using the value
> next_major_frame as the base to calculate the start of major frame and
> the next domain switch time.
>
> Per discussion with Nathan Studer, there are odd cases the first minor
> frame can be also missed. In that sceanario, iterate through the schedule after resyncing
s/sceanario/scenario/
> the expected next major frame.
>
> Signed-off-by: Anderson Choi <anderson.choi@boeing.com>
> Suggested-by: Nathan Studer <nathan.studer@dornerworks.com>
I think this wants a Fixes: tag:
Fixes: 22787f2e107c ("ARINC 653 scheduler")
>
> ---
> Changes in v2:
> - Changed the logic to resync major frame and to find correct
> minor frame after a miss suggested by Nathan
> ---
> xen/common/sched/arinc653.c | 38 ++++++++++++++++++++-----------------
> 1 file changed, 21 insertions(+), 17 deletions(-)
>
> diff --git a/xen/common/sched/arinc653.c b/xen/common/sched/arinc653.c
> index 930361fa5c..a7937ed2fd 100644
> --- a/xen/common/sched/arinc653.c
> +++ b/xen/common/sched/arinc653.c
> @@ -526,27 +526,31 @@ a653sched_do_schedule(
>
> spin_lock_irqsave(&sched_priv->lock, flags);
>
> - if ( sched_priv->num_schedule_entries < 1 )
> - sched_priv->next_major_frame = now + DEFAULT_TIMESLICE;
> - else if ( now >= sched_priv->next_major_frame )
> + /* Switch to next major frame while handling potentially missed frames */
> + while ( now >= sched_priv->next_major_frame )
> {
> - /* time to enter a new major frame
> - * the first time this function is called, this will be true */
> - /* start with the first domain in the schedule */
> sched_priv->sched_index = 0;
> - sched_priv->next_major_frame = now + sched_priv->major_frame;
> - sched_priv->next_switch_time = now + sched_priv->schedule[0].runtime;
> - }
> - else
> - {
> - while ( (now >= sched_priv->next_switch_time) &&
> - (sched_priv->sched_index < sched_priv->num_schedule_entries) )
> +
> + if ( sched_priv->num_schedule_entries < 1 )
> {
> - /* time to switch to the next domain in this major frame */
> - sched_priv->sched_index++;
> - sched_priv->next_switch_time +=
> - sched_priv->schedule[sched_priv->sched_index].runtime;
> + sched_priv->next_major_frame += DEFAULT_TIMESLICE;
> + sched_priv->next_switch_time = sched_priv->next_major_frame;
> }
> + else
> + {
> + sched_priv->next_switch_time = sched_priv->next_major_frame +
> + sched_priv->schedule[0].runtime;
> + sched_priv->next_major_frame += sched_priv->major_frame;
> + }
> + }
There's no need for the above loop, this can be fixed by subtracting the
remainder (modulus major_frame). E.g.:
if ( now >= sched_priv->next_major_frame )
{
s_time_t major_frame = sched_priv->num_schedule_entries < 1
? DEFAULT_TIMESLICE
: sched_priv->major_frame;
s_time_t remainder = (now - sched_priv->next_major_frame) % major_frame;
sched_priv->sched_index = 0;
sched_priv->next_major_frame = now - remainder + major_frame;
sched_priv->next_switch_time = now - remainder +
(sched_priv->num_schedule_entries < 1
? DEFAULT_TIMESLICE
: sched_priv->schedule[0].runtime);
}
The commit description may want some minor updating to reflect this.
> +
> + /* Switch minor frame or find correct minor frame after a miss */
> + while ( (now >= sched_priv->next_switch_time) &&
> + (sched_priv->sched_index < sched_priv->num_schedule_entries) )
> + {
> + sched_priv->sched_index++;
> + sched_priv->next_switch_time +=
> + sched_priv->schedule[sched_priv->sched_index].runtime;
> }
>
> /*
On 7/17/25 9:21, Hildebrand, Stewart wrote:
> > - if ( sched_priv->num_schedule_entries < 1 )
> > - sched_priv->next_major_frame = now + DEFAULT_TIMESLICE;
> > - else if ( now >= sched_priv->next_major_frame )
> > + /* Switch to next major frame while handling potentially missed frames */
> > + while ( now >= sched_priv->next_major_frame )
> > {
> > - /* time to enter a new major frame
> > - * the first time this function is called, this will be true */
> > - /* start with the first domain in the schedule */
> > sched_priv->sched_index = 0;
> > - sched_priv->next_major_frame = now + sched_priv->major_frame;
> > - sched_priv->next_switch_time = now + sched_priv->schedule[0].runtime;
> > - }
> > - else
> > - {
> > - while ( (now >= sched_priv->next_switch_time) &&
> > - (sched_priv->sched_index < sched_priv->num_schedule_entries) )
> > +
> > + if ( sched_priv->num_schedule_entries < 1 )
> > {
> > - /* time to switch to the next domain in this major frame */
> > - sched_priv->sched_index++;
> > - sched_priv->next_switch_time +=
> > - sched_priv->schedule[sched_priv->sched_index].runtime;
> > + sched_priv->next_major_frame += DEFAULT_TIMESLICE;
> > + sched_priv->next_switch_time =
> > + sched_priv->next_major_frame;
> > }
> > + else
> > + {
> > + sched_priv->next_switch_time = sched_priv->next_major_frame +
> > + sched_priv->schedule[0].runtime;
> > + sched_priv->next_major_frame += sched_priv->major_frame;
> > + }
> > + }
>
> There's no need for the above loop, this can be fixed by subtracting the remainder
> (modulus major_frame). E.g.:
>
> if ( now >= sched_priv->next_major_frame )
> {
> s_time_t major_frame = sched_priv->num_schedule_entries < 1
> ? DEFAULT_TIMESLICE
> : sched_priv->major_frame;
> s_time_t remainder = (now - sched_priv->next_major_frame) %
> major_frame;
>
> sched_priv->sched_index = 0;
> sched_priv->next_major_frame = now - remainder + major_frame;
> sched_priv->next_switch_time = now - remainder +
> (sched_priv->num_schedule_entries < 1
> ? DEFAULT_TIMESLICE
> : sched_priv->schedule[0].runtime);
> }
>
The direct method suggested by Stew is preferable in the unusual case where many major frames are missed. (We have only seen that happen when using a debugger.)
To help uncover any issues like the one this patch addresses in the future we may also want to follow up this commit with a change to make scheduler misses more obvious. Something like the following:
commit e95cbc9078127c412bd1605d93cb97837751b5b4 (HEAD -> master)
Author: Nathan Studer <nathan.studer@dornerworks.com>
Date: Thu Jul 17 12:43:39 2025 -0400
Do not silently skip frame overruns
diff --git a/xen/common/sched/arinc653.c b/xen/common/sched/arinc653.c
index 2d0d3bcbb3..a2c1c66c27 100644
--- a/xen/common/sched/arinc653.c
+++ b/xen/common/sched/arinc653.c
@@ -523,9 +523,17 @@ a653sched_do_schedule(
a653sched_priv_t *sched_priv = SCHED_PRIV(ops);
const unsigned int cpu = sched_get_resource_cpu(smp_processor_id());
unsigned long flags;
+ unsigned int oindex;
+ unsigned int missed;
spin_lock_irqsave(&sched_priv->lock, flags);
+ if ( now > (sched_priv->next_major_frame + sched_priv->major_frame))
+ {
+ missed = (now - sched_priv->next_major_frame) / sched_priv->major_frame;
+ printk(XENLOG_ERR, "Missed %d major frame(s)!\n", missed);
+ }
+
/* Switch to next major frame while handling potentially missed frames */
@@ -544,6 +552,7 @@ a653sched_do_schedule(
}
}
+ oindex = sched_priv->sched_index;
/* Switch minor frame or find correct minor frame after a miss */
while ( (now >= sched_priv->next_switch_time) &&
(sched_priv->sched_index < sched_priv->num_schedule_entries) )
@@ -553,6 +562,12 @@ a653sched_do_schedule(
sched_priv->schedule[sched_priv->sched_index].runtime;
}
+ if ( (oindex - sched_priv->sched_index) > 1)
+ {
+ missed = (oindex - sched_priv->sched_index - 1);
+ printk(XENLOG_WARNING, "Missed %d minor frame(s)!\n", missed);
+ }
+
/*
> The commit description may want some minor updating to reflect this.
>
> > +
> > + /* Switch minor frame or find correct minor frame after a miss */
> > + while ( (now >= sched_priv->next_switch_time) &&
> > + (sched_priv->sched_index < sched_priv->num_schedule_entries) )
> > + {
> > + sched_priv->sched_index++;
> > + sched_priv->next_switch_time +=
> > + sched_priv->schedule[sched_priv->sched_index].runtime;
> > }
> >
> > /*
> EXT email: be mindful of links/attachments.
>
> On 7/17/25 9:21, Hildebrand, Stewart wrote:
else + { + sched_priv->next_switch_time =
>>> sched_priv->next_major_frame + +
>>> sched_priv->schedule[0].runtime; +
>>> sched_priv->next_major_frame += sched_priv->major_frame; + } +
>>> }
>>
>> There's no need for the above loop, this can be fixed by subtracting
>> the remainder (modulus major_frame). E.g.:
>>
>> if ( now >= sched_priv->next_major_frame )
>> {
>> s_time_t major_frame = sched_priv->num_schedule_entries < 1
>> ? DEFAULT_TIMESLICE
>> : sched_priv->major_frame;
>> s_time_t remainder = (now - sched_priv->next_major_frame) %
>> major_frame;
>>
>> sched_priv->sched_index = 0;
>> sched_priv->next_major_frame = now - remainder + major_frame;
>> sched_priv->next_switch_time = now - remainder +
>> (sched_priv->num_schedule_entries < 1
>> ? DEFAULT_TIMESLICE
>> : sched_priv->schedule[0].runtime);
>> }
>
Stewart,
I appreciate your suggestion to eliminate the while loop.
What about initializing major_frame and schedule[0].runtime to DEFAULT_TIMESLICE at a653sched_init() and use them until the real parameters are set as below to eliminate the if branch?
diff --git a/xen/common/sched/arinc653.c b/xen/common/sched/arinc653.c
index 930361fa5c..73ce5cdfaf 100644
--- a/xen/common/sched/arinc653.c
+++ b/xen/common/sched/arinc653.c
@@ -361,6 +361,8 @@ a653sched_init(struct scheduler *ops)
ops->sched_data = prv;
prv->next_major_frame = 0;
+ prv->major_frame = DEFAULT_TIMESLICE;
+ prv->schedule[0].runtime = DEFAULT_TIMESLICE;
spin_lock_init(&prv->lock);
INIT_LIST_HEAD(&prv->unit_list);
static void cf_check
a653sched_do_schedule(
<snip>
/* Switch to next major frame directly eliminating the use of loop */
if ( now >= sched_priv->next_major_frame )
{
s_time_t major_frame = sched_priv->major_frame;
s_time_t remainder = (now - sched_priv->next_major_frame) % major_frame;
sched_priv->sched_index = 0;
sched_priv->next_major_frame = now - remainder + major_frame;
sched_priv->next_switch_time = now - remainder + sched_priv->schedule[0].runtime;
}
> The direct method suggested by Stew is preferable in the unusual case where
> many major frames are missed. (We have only seen that happen when using
> a debugger.)
>
> To help uncover any issues like the one this patch addresses in the future we
> may also want to follow up this commit with a change to make scheduler
> misses more obvious. Something like the following:
>
> commit e95cbc9078127c412bd1605d93cb97837751b5b4 (HEAD -> master)
> Author: Nathan Studer <nathan.studer@dornerworks.com>
> Date: Thu Jul 17 12:43:39 2025 -0400
>
> Do not silently skip frame overruns
> diff --git a/xen/common/sched/arinc653.c b/xen/common/sched/arinc653.c
> index 2d0d3bcbb3..a2c1c66c27 100644
> --- a/xen/common/sched/arinc653.c
> +++ b/xen/common/sched/arinc653.c
> @@ -523,9 +523,17 @@ a653sched_do_schedule(
> a653sched_priv_t *sched_priv = SCHED_PRIV(ops);
> const unsigned int cpu = sched_get_resource_cpu(smp_processor_id());
> unsigned long flags;
> + unsigned int oindex;
> + unsigned int missed;
>
> spin_lock_irqsave(&sched_priv->lock, flags);
> + if ( now > (sched_priv->next_major_frame + sched_priv->major_frame))
> + {
> + missed = (now - sched_priv->next_major_frame) / sched_priv-
>> major_frame;
> + printk(XENLOG_ERR, "Missed %d major frame(s)!\n", missed);
> + }
> +
> /* Switch to next major frame while handling potentially missed frames */
> @@ -544,6 +552,7 @@ a653sched_do_schedule(
> }
> }
> + oindex = sched_priv->sched_index;
> /* Switch minor frame or find correct minor frame after a miss */
> while ( (now >= sched_priv->next_switch_time) &&
> (sched_priv->sched_index <
> sched_priv->num_schedule_entries) ) @@ -553,6 +562,12 @@
> a653sched_do_schedule(
> sched_priv->schedule[sched_priv->sched_index].runtime;
> }
> + if ( (oindex - sched_priv->sched_index) > 1)
> + {
> + missed = (oindex - sched_priv->sched_index - 1);
> + printk(XENLOG_WARNING, "Missed %d minor frame(s)!\n", missed);
> + }
> +
> /*
Nathan,
Do we need a comma (,) between XENLOG_WARNING and the quoted string when calling printk()?
And wouldn't the log be printed every time a new minor frame starts, for example oindex = 0 and sched_priv->sched_index = 1?
I think you meant to use the condition "if (sched_priv->sched_index - oindex) > 1" to check minor frame misses?
Thanks,
Anderson
On 7/17/25 21:39, Choi, Anderson wrote:
> > On 7/17/25 9:21, Hildebrand, Stewart wrote:
> else + { + sched_priv->next_switch_time =
> >>> sched_priv->next_major_frame + +
> >>> sched_priv->schedule[0].runtime; +
> >>> sched_priv->next_major_frame += sched_priv->major_frame; + } +
> >>> }
> >>
> >> There's no need for the above loop, this can be fixed by subtracting
> >> the remainder (modulus major_frame). E.g.:
> >>
> >> if ( now >= sched_priv->next_major_frame )
> >> {
> >> s_time_t major_frame = sched_priv->num_schedule_entries < 1
> >> ? DEFAULT_TIMESLICE
> >> : sched_priv->major_frame;
> >> s_time_t remainder = (now - sched_priv->next_major_frame) %
> >> major_frame;
> >>
> >> sched_priv->sched_index = 0;
> >> sched_priv->next_major_frame = now - remainder + major_frame;
> >> sched_priv->next_switch_time = now - remainder +
> >> (sched_priv->num_schedule_entries < 1
> >> ? DEFAULT_TIMESLICE
> >> : sched_priv->schedule[0].runtime);
> >> }
> >
>
> Stewart,
>
> I appreciate your suggestion to eliminate the while loop.
> What about initializing major_frame and schedule[0].runtime to
> DEFAULT_TIMESLICE at a653sched_init() and use them until the real parameters
> are set as below to eliminate the if branch?
It would make the scheduling code cleaner, but since you asked Stew I'll defer to him.
>
> diff --git a/xen/common/sched/arinc653.c b/xen/common/sched/arinc653.c
> index 930361fa5c..73ce5cdfaf 100644
> --- a/xen/common/sched/arinc653.c
> +++ b/xen/common/sched/arinc653.c
> @@ -361,6 +361,8 @@ a653sched_init(struct scheduler *ops)
> ops->sched_data = prv;
>
> prv->next_major_frame = 0;
> + prv->major_frame = DEFAULT_TIMESLICE;
> + prv->schedule[0].runtime = DEFAULT_TIMESLICE;
> spin_lock_init(&prv->lock);
> INIT_LIST_HEAD(&prv->unit_list);
>
> static void cf_check
> a653sched_do_schedule(
> <snip>
> /* Switch to next major frame directly eliminating the use of loop */
> if ( now >= sched_priv->next_major_frame )
> {
> s_time_t major_frame = sched_priv->major_frame;
> s_time_t remainder = (now - sched_priv->next_major_frame) %
> major_frame;
>
> sched_priv->sched_index = 0;
> sched_priv->next_major_frame = now - remainder + major_frame;
> sched_priv->next_switch_time = now - remainder + sched_priv-
> >schedule[0].runtime;
> }
>
> > The direct method suggested by Stew is preferable in the unusual case
> > where many major frames are missed. (We have only seen that happen
> > when using a debugger.)
> >
> > To help uncover any issues like the one this patch addresses in the
> > future we may also want to follow up this commit with a change to make
> > scheduler misses more obvious. Something like the following:
> >
> > commit e95cbc9078127c412bd1605d93cb97837751b5b4 (HEAD -> master)
> > Author: Nathan Studer <nathan.studer@dornerworks.com>
> > Date: Thu Jul 17 12:43:39 2025 -0400
> >
> > Do not silently skip frame overruns diff --git
> > a/xen/common/sched/arinc653.c b/xen/common/sched/arinc653.c index
> > 2d0d3bcbb3..a2c1c66c27 100644
> > --- a/xen/common/sched/arinc653.c
> > +++ b/xen/common/sched/arinc653.c
> > @@ -523,9 +523,17 @@ a653sched_do_schedule(
> > a653sched_priv_t *sched_priv = SCHED_PRIV(ops);
> > const unsigned int cpu = sched_get_resource_cpu(smp_processor_id());
> > unsigned long flags;
> > + unsigned int oindex;
> > + unsigned int missed;
> >
> > spin_lock_irqsave(&sched_priv->lock, flags);
> > + if ( now > (sched_priv->next_major_frame + sched_priv->major_frame))
> > + {
> > + missed = (now - sched_priv->next_major_frame) / sched_priv-
> >> major_frame;
> > + printk(XENLOG_ERR, "Missed %d major frame(s)!\n", missed);
> > + }
> > +
> > /* Switch to next major frame while handling potentially missed
> > frames */ @@ -544,6 +552,7 @@ a653sched_do_schedule(
> > }
> > }
> > + oindex = sched_priv->sched_index;
> > /* Switch minor frame or find correct minor frame after a miss */
> > while ( (now >= sched_priv->next_switch_time) &&
> > (sched_priv->sched_index <
> > sched_priv->num_schedule_entries) ) @@ -553,6 +562,12 @@
> > a653sched_do_schedule(
> > sched_priv->schedule[sched_priv->sched_index].runtime;
> > }
> > + if ( (oindex - sched_priv->sched_index) > 1)
> > + {
> > + missed = (oindex - sched_priv->sched_index - 1);
> > + printk(XENLOG_WARNING, "Missed %d minor frame(s)!\n", missed);
> > + }
> > +
> > /*
>
> Nathan,
>
> Do we need a comma (,) between XENLOG_WARNING and the quoted string
> when calling printk()?
No.
> And wouldn't the log be printed every time a new minor frame starts, for example
> oindex = 0 and sched_priv->sched_index = 1?
> I think you meant to use the condition "if (sched_priv->sched_index - oindex) > 1"
> to check minor frame misses?
Right, the order should be the other way around in both the condition and the print.
Regardless, we can just worry about your patch for now and improve the miss detection reporting later.
Nate
>
> Thanks,
> Anderson
>
On 7/17/25 22:16, Nathan Studer wrote:
> On 7/17/25 21:39, Choi, Anderson wrote:
>>> On 7/17/25 9:21, Hildebrand, Stewart wrote:
>> else + { + sched_priv->next_switch_time =
>>>>> sched_priv->next_major_frame + +
>>>>> sched_priv->schedule[0].runtime; +
>>>>> sched_priv->next_major_frame += sched_priv->major_frame; + } +
>>>>> }
>>>>
>>>> There's no need for the above loop, this can be fixed by subtracting
>>>> the remainder (modulus major_frame). E.g.:
>>>>
>>>> if ( now >= sched_priv->next_major_frame )
>>>> {
>>>> s_time_t major_frame = sched_priv->num_schedule_entries < 1
>>>> ? DEFAULT_TIMESLICE
>>>> : sched_priv->major_frame;
>>>> s_time_t remainder = (now - sched_priv->next_major_frame) %
>>>> major_frame;
>>>>
>>>> sched_priv->sched_index = 0;
>>>> sched_priv->next_major_frame = now - remainder + major_frame;
>>>> sched_priv->next_switch_time = now - remainder +
>>>> (sched_priv->num_schedule_entries < 1
>>>> ? DEFAULT_TIMESLICE
>>>> : sched_priv->schedule[0].runtime);
>>>> }
>>>
>>
>> Stewart,
>>
>> I appreciate your suggestion to eliminate the while loop.
>> What about initializing major_frame and schedule[0].runtime to
>> DEFAULT_TIMESLICE at a653sched_init() and use them until the real parameters
>> are set as below to eliminate the if branch?
>
> It would make the scheduling code cleaner, but since you asked Stew I'll defer to him.
It feels odd to index schedule[0] when num_schedule_entries may be
zero... But since it's a fixed size array and element 0 is guaranteed to
exist I don't have a strong preference.
>
>>
>> diff --git a/xen/common/sched/arinc653.c b/xen/common/sched/arinc653.c
>> index 930361fa5c..73ce5cdfaf 100644
>> --- a/xen/common/sched/arinc653.c
>> +++ b/xen/common/sched/arinc653.c
>> @@ -361,6 +361,8 @@ a653sched_init(struct scheduler *ops)
>> ops->sched_data = prv;
>>
>> prv->next_major_frame = 0;
>> + prv->major_frame = DEFAULT_TIMESLICE;
>> + prv->schedule[0].runtime = DEFAULT_TIMESLICE;
>> spin_lock_init(&prv->lock);
>> INIT_LIST_HEAD(&prv->unit_list);
>>
>> static void cf_check
>> a653sched_do_schedule(
>> <snip>
>> /* Switch to next major frame directly eliminating the use of loop */
>> if ( now >= sched_priv->next_major_frame )
>> {
>> s_time_t major_frame = sched_priv->major_frame;
>> s_time_t remainder = (now - sched_priv->next_major_frame) %
>> major_frame;
>>
>> sched_priv->sched_index = 0;
>> sched_priv->next_major_frame = now - remainder + major_frame;
>> sched_priv->next_switch_time = now - remainder + sched_priv-
>>> schedule[0].runtime;
>> }
>>
>>> The direct method suggested by Stew is preferable in the unusual case
>>> where many major frames are missed. (We have only seen that happen
>>> when using a debugger.)
>>>
>>> To help uncover any issues like the one this patch addresses in the
>>> future we may also want to follow up this commit with a change to make
>>> scheduler misses more obvious. Something like the following:
>>>
>>> commit e95cbc9078127c412bd1605d93cb97837751b5b4 (HEAD -> master)
>>> Author: Nathan Studer <nathan.studer@dornerworks.com>
>>> Date: Thu Jul 17 12:43:39 2025 -0400
>>>
>>> Do not silently skip frame overruns diff --git
>>> a/xen/common/sched/arinc653.c b/xen/common/sched/arinc653.c index
>>> 2d0d3bcbb3..a2c1c66c27 100644
>>> --- a/xen/common/sched/arinc653.c
>>> +++ b/xen/common/sched/arinc653.c
>>> @@ -523,9 +523,17 @@ a653sched_do_schedule(
>>> a653sched_priv_t *sched_priv = SCHED_PRIV(ops);
>>> const unsigned int cpu = sched_get_resource_cpu(smp_processor_id());
>>> unsigned long flags;
>>> + unsigned int oindex;
>>> + unsigned int missed;
>>>
>>> spin_lock_irqsave(&sched_priv->lock, flags);
>>> + if ( now > (sched_priv->next_major_frame + sched_priv->major_frame))
>>> + {
>>> + missed = (now - sched_priv->next_major_frame) / sched_priv-
>>>> major_frame;
>>> + printk(XENLOG_ERR, "Missed %d major frame(s)!\n", missed);
>>> + }
>>> +
>>> /* Switch to next major frame while handling potentially missed
>>> frames */ @@ -544,6 +552,7 @@ a653sched_do_schedule(
>>> }
>>> }
>>> + oindex = sched_priv->sched_index;
>>> /* Switch minor frame or find correct minor frame after a miss */
>>> while ( (now >= sched_priv->next_switch_time) &&
>>> (sched_priv->sched_index <
>>> sched_priv->num_schedule_entries) ) @@ -553,6 +562,12 @@
>>> a653sched_do_schedule(
>>> sched_priv->schedule[sched_priv->sched_index].runtime;
>>> }
>>> + if ( (oindex - sched_priv->sched_index) > 1)
>>> + {
>>> + missed = (oindex - sched_priv->sched_index - 1);
>>> + printk(XENLOG_WARNING, "Missed %d minor frame(s)!\n", missed);
>>> + }
>>> +
>>> /*
>>
>> Nathan,
>>
>> Do we need a comma (,) between XENLOG_WARNING and the quoted string
>> when calling printk()?
>
> No.
>
>> And wouldn't the log be printed every time a new minor frame starts, for example
>> oindex = 0 and sched_priv->sched_index = 1?
>> I think you meant to use the condition "if (sched_priv->sched_index - oindex) > 1"
>> to check minor frame misses?
>
> Right, the order should be the other way around in both the condition and the print.
>
> Regardless, we can just worry about your patch for now and improve the miss detection reporting later.
+1
>
> Nate
>
>>
>> Thanks,
>> Anderson
>>
>
Stewart,
>>> Stewart,
>>>
>>> I appreciate your suggestion to eliminate the while loop.
>>> What about initializing major_frame and schedule[0].runtime to
>>> DEFAULT_TIMESLICE at a653sched_init() and use them until the real
>>> parameters are set as below to eliminate the if branch?
>>
>> It would make the scheduling code cleaner, but since you asked Stew I'll
> defer to him.
>
> It feels odd to index schedule[0] when num_schedule_entries may be zero...
> But since it's a fixed size array and element 0 is guaranteed to exist I don't
> have a strong preference.
>
Thanks for your feedback.
If you don't have a strong preference, let me write a v3 patch with the following change.
/* Switch to next major frame directly eliminating the use of loop */
if ( now >= sched_priv->next_major_frame )
{
s_time_t major_frame = sched_priv->major_frame;
s_time_t remainder = (now - sched_priv->next_major_frame) % major_frame;
/* major_frame and schedule[0].runtime contain DEFAULT_TIMESLICE
* if num_schedule_entries is zero.
*/
sched_priv->sched_index = 0;
sched_priv->next_major_frame = now - remainder + major_frame;
sched_priv->next_switch_time = now - remainder +
sched_priv->schedule[0].runtime;
}
/* Switch minor frame */
while ( (now >= sched_priv->next_switch_time) &&
I'll also include the "Fixes" tag you mentioned in the previous comment.
Fixes: 22787f2e107c ("ARINC 653 scheduler")
Thanks,
Anderson
On 15.07.2025 05:16, Anderson Choi wrote: > ARINC653 specificaion requires partition scheduling to be deterministic > and periodic over time. > > However, the use of current timestamp (now) as the baseline to calculate > next_major_frame and next_switch_time introduces a delay in the start of > major frame at every period, which breaks determinism and periodicity in > partition scheduling. > > For example, we observe 3.5 msec of accumulated delay at the 21st major > frame with the following configuration. > > Target : qemuarm64 > xen version : 4.19 (43aeacff86, x86/IRQ: constrain creator-domain-ID assertion) > dom1 : 10 msec runtime > dom2 : 10 msec runtime > > $ a653_sched -p Pool-arinc dom1:10 dom2:10 > > 0.014553536 ---x d?v? runstate_change d1v0 runnable->running //1st major > frame > 0.034629712 ---x d?v? runstate_change d1v0 runnable->running //2nd major > frame > <snip> > 0.397747008 |||x d?v? runstate_change d1v0 runnable->running //20th > major frame > 0.418066096 -||x d?v? runstate_change d1v0 runnable->running //21st > major frame > > This is due to an inherent delta between the time value the scheduler timer > is programmed to be fired with and the time value the schedule function > is executed. > > Another observation that breaks the deterministic behavior of partition > scheduling is a delayed execution of schedule(); It was called 14 msec > later than programmed. > > 1.530603952 ---x d?v? runstate_change d1v0 runnable->running > 1.564956784 --|x d?v? runstate_change d1v0 runnable->running > > Enforce the periodic behavior of partition scheduling by using the value > next_major_frame as the base to calculate the start of major frame and > the next domain switch time. > > Per discussion with Nathan Studer, there are odd cases the first minor > frame can be also missed. In that sceanario, iterate through the schedule after resyncing > the expected next major frame. > > Signed-off-by: Anderson Choi <anderson.choi@boeing.com> > Suggested-by: Nathan Studer <nathan.studer@dornerworks.com> Nit: (Most) tags in chronological order, please. Jan
Jan, >> Signed-off-by: Anderson Choi <anderson.choi@boeing.com> >> Suggested-by: Nathan Studer <nathan.studer@dornerworks.com> > > Nit: (Most) tags in chronological order, please. > Sorry, I'm not fully familiar with the revisioning / upstreaming process yet. In this case, what would be the most appropriate action? 1. Create a v3 patch? 2. Send the v2 patch with the tags reordered? 3. Exercise your suggestion from the next turn-in. Thanks, Anderson
On 15.07.2025 10:57, Choi, Anderson wrote: >>> Signed-off-by: Anderson Choi <anderson.choi@boeing.com> >>> Suggested-by: Nathan Studer <nathan.studer@dornerworks.com> >> >> Nit: (Most) tags in chronological order, please. >> > > Sorry, I'm not fully familiar with the revisioning / upstreaming process yet. > > In this case, what would be the most appropriate action? > 1. Create a v3 patch? > 2. Send the v2 patch with the tags reordered? > 3. Exercise your suggestion from the next turn-in. No need to re-submit just for this. If a v3 became necessary for other reasons, then the adjustment would want making there. Jan
© 2016 - 2025 Red Hat, Inc.