include/linux/cpuset.h | 12 ++- include/linux/sched.h | 4 +- kernel/cgroup/cgroup.c | 4 + kernel/cgroup/cpuset.c | 232 ++++++++++++++++++++++++++-------------- kernel/sched/core.c | 41 ++++--- kernel/sched/deadline.c | 67 +++++++++--- kernel/sched/sched.h | 2 +- 7 files changed, 240 insertions(+), 122 deletions(-)
Qais reported [1] that iterating over all tasks when rebuilding root
domains for finding out which ones are DEADLINE and need their bandwidth
correctly restored on such root domains can be a costly operation (10+
ms delays on suspend-resume). He proposed we skip rebuilding root
domains for certain operations, but that approach seemed arch specific
and possibly prone to errors, as paths that ultimately trigger a rebuild
might be quite convoluted (thanks Qais for spending time on this!).
To fix the problem
01/06 - Rename functions deadline with DEADLINE accounting (cleanup
suggested by Qais) - no functional change
02/06 - Bring back cpuset_mutex (so that we have write access to cpusets
from scheduler operations - and we also fix some problems
associated to percpu_cpuset_rwsem)
03/06 - Keep track of the number of DEADLINE tasks belonging to each cpuset
04/06 - Create DL BW alloc, free & check overflow interface for bulk
bandwidth allocation/removal - no functional change
05/06 - Fix bandwidth allocation handling for cgroup operation
involving multiple tasks
06/06 - Use this information to only perform the costly iteration if
DEADLINE tasks are actually present in the cpuset for which a
corresponding root domain is being rebuilt
With respect to the RFC posting [2]
1 - rename DEADLINE bandwidth accounting functions - Qais
2 - call inc/dec_dl_tasks_cs from switched_{to,from}_dl - Qais
3 - fix DEADLINE bandwidth allocation with multiple tasks - Waiman,
contributed by Dietmar
This set is also available from
https://github.com/jlelli/linux.git deadline/rework-cpusets
Best,
Juri
1 - https://lore.kernel.org/lkml/20230206221428.2125324-1-qyousef@layalina.io/
2 - https://lore.kernel.org/lkml/20230315121812.206079-1-juri.lelli@redhat.com/
Dietmar Eggemann (2):
sched/deadline: Create DL BW alloc, free & check overflow interface
cgroup/cpuset: Free DL BW in case can_attach() fails
Juri Lelli (4):
cgroup/cpuset: Rename functions dealing with DEADLINE accounting
sched/cpuset: Bring back cpuset_mutex
sched/cpuset: Keep track of SCHED_DEADLINE task in cpusets
cgroup/cpuset: Iterate only if DEADLINE tasks are present
include/linux/cpuset.h | 12 ++-
include/linux/sched.h | 4 +-
kernel/cgroup/cgroup.c | 4 +
kernel/cgroup/cpuset.c | 232 ++++++++++++++++++++++++++--------------
kernel/sched/core.c | 41 ++++---
kernel/sched/deadline.c | 67 +++++++++---
kernel/sched/sched.h | 2 +-
7 files changed, 240 insertions(+), 122 deletions(-)
--
2.39.2
On 3/29/23 08:55, Juri Lelli wrote:
> Qais reported [1] that iterating over all tasks when rebuilding root
> domains for finding out which ones are DEADLINE and need their bandwidth
> correctly restored on such root domains can be a costly operation (10+
> ms delays on suspend-resume). He proposed we skip rebuilding root
> domains for certain operations, but that approach seemed arch specific
> and possibly prone to errors, as paths that ultimately trigger a rebuild
> might be quite convoluted (thanks Qais for spending time on this!).
>
> To fix the problem
>
> 01/06 - Rename functions deadline with DEADLINE accounting (cleanup
> suggested by Qais) - no functional change
> 02/06 - Bring back cpuset_mutex (so that we have write access to cpusets
> from scheduler operations - and we also fix some problems
> associated to percpu_cpuset_rwsem)
> 03/06 - Keep track of the number of DEADLINE tasks belonging to each cpuset
> 04/06 - Create DL BW alloc, free & check overflow interface for bulk
> bandwidth allocation/removal - no functional change
> 05/06 - Fix bandwidth allocation handling for cgroup operation
> involving multiple tasks
> 06/06 - Use this information to only perform the costly iteration if
> DEADLINE tasks are actually present in the cpuset for which a
> corresponding root domain is being rebuilt
>
> With respect to the RFC posting [2]
>
> 1 - rename DEADLINE bandwidth accounting functions - Qais
> 2 - call inc/dec_dl_tasks_cs from switched_{to,from}_dl - Qais
> 3 - fix DEADLINE bandwidth allocation with multiple tasks - Waiman,
> contributed by Dietmar
>
> This set is also available from
>
> https://github.com/jlelli/linux.git deadline/rework-cpusets
>
> Best,
> Juri
>
> 1 - https://lore.kernel.org/lkml/20230206221428.2125324-1-qyousef@layalina.io/
> 2 - https://lore.kernel.org/lkml/20230315121812.206079-1-juri.lelli@redhat.com/
>
> Dietmar Eggemann (2):
> sched/deadline: Create DL BW alloc, free & check overflow interface
> cgroup/cpuset: Free DL BW in case can_attach() fails
>
> Juri Lelli (4):
> cgroup/cpuset: Rename functions dealing with DEADLINE accounting
> sched/cpuset: Bring back cpuset_mutex
> sched/cpuset: Keep track of SCHED_DEADLINE task in cpusets
> cgroup/cpuset: Iterate only if DEADLINE tasks are present
>
> include/linux/cpuset.h | 12 ++-
> include/linux/sched.h | 4 +-
> kernel/cgroup/cgroup.c | 4 +
> kernel/cgroup/cpuset.c | 232 ++++++++++++++++++++++++++--------------
> kernel/sched/core.c | 41 ++++---
> kernel/sched/deadline.c | 67 +++++++++---
> kernel/sched/sched.h | 2 +-
> 7 files changed, 240 insertions(+), 122 deletions(-)
Other than some minor issues that I have talked in earlier emails, the
patch series looks good to me.
You can add my ack once the issues are fixed.
Acked-by: Waiman Long <longman@redhat.com>
On 03/29/23 14:55, Juri Lelli wrote:
> Qais reported [1] that iterating over all tasks when rebuilding root
> domains for finding out which ones are DEADLINE and need their bandwidth
> correctly restored on such root domains can be a costly operation (10+
> ms delays on suspend-resume). He proposed we skip rebuilding root
> domains for certain operations, but that approach seemed arch specific
> and possibly prone to errors, as paths that ultimately trigger a rebuild
> might be quite convoluted (thanks Qais for spending time on this!).
>
> To fix the problem
>
> 01/06 - Rename functions deadline with DEADLINE accounting (cleanup
> suggested by Qais) - no functional change
> 02/06 - Bring back cpuset_mutex (so that we have write access to cpusets
> from scheduler operations - and we also fix some problems
> associated to percpu_cpuset_rwsem)
> 03/06 - Keep track of the number of DEADLINE tasks belonging to each cpuset
> 04/06 - Create DL BW alloc, free & check overflow interface for bulk
> bandwidth allocation/removal - no functional change
> 05/06 - Fix bandwidth allocation handling for cgroup operation
> involving multiple tasks
> 06/06 - Use this information to only perform the costly iteration if
> DEADLINE tasks are actually present in the cpuset for which a
> corresponding root domain is being rebuilt
>
> With respect to the RFC posting [2]
>
> 1 - rename DEADLINE bandwidth accounting functions - Qais
> 2 - call inc/dec_dl_tasks_cs from switched_{to,from}_dl - Qais
> 3 - fix DEADLINE bandwidth allocation with multiple tasks - Waiman,
> contributed by Dietmar
>
> This set is also available from
>
> https://github.com/jlelli/linux.git deadline/rework-cpusets
Thanks a lot Juri!
I picked up the updated series and applied them to a 5.10 kernel and tested the
issue is fixed. Replied with my reviewed-and-tested-bys to some of the patches
already.
I haven't looked much at Dietmar's patches and while they were part of the
test, but there are no dl tasks on the system so I felt hesitant to say
I tested that part.
Cheers
--
Qais Yousef
On 03/29/23 14:55, Juri Lelli wrote:
> Qais reported [1] that iterating over all tasks when rebuilding root
> domains for finding out which ones are DEADLINE and need their bandwidth
> correctly restored on such root domains can be a costly operation (10+
> ms delays on suspend-resume). He proposed we skip rebuilding root
> domains for certain operations, but that approach seemed arch specific
> and possibly prone to errors, as paths that ultimately trigger a rebuild
> might be quite convoluted (thanks Qais for spending time on this!).
>
> To fix the problem
>
> 01/06 - Rename functions deadline with DEADLINE accounting (cleanup
> suggested by Qais) - no functional change
> 02/06 - Bring back cpuset_mutex (so that we have write access to cpusets
> from scheduler operations - and we also fix some problems
> associated to percpu_cpuset_rwsem)
> 03/06 - Keep track of the number of DEADLINE tasks belonging to each cpuset
> 04/06 - Create DL BW alloc, free & check overflow interface for bulk
> bandwidth allocation/removal - no functional change
> 05/06 - Fix bandwidth allocation handling for cgroup operation
> involving multiple tasks
> 06/06 - Use this information to only perform the costly iteration if
> DEADLINE tasks are actually present in the cpuset for which a
> corresponding root domain is being rebuilt
>
> With respect to the RFC posting [2]
>
> 1 - rename DEADLINE bandwidth accounting functions - Qais
> 2 - call inc/dec_dl_tasks_cs from switched_{to,from}_dl - Qais
> 3 - fix DEADLINE bandwidth allocation with multiple tasks - Waiman,
> contributed by Dietmar
>
> This set is also available from
>
> https://github.com/jlelli/linux.git deadline/rework-cpusets
Is this just waiting to be picked up or still there's something to be addressed
still?
Thanks!
--
Qais Yousef
On 4/18/23 10:11, Qais Yousef wrote:
> On 03/29/23 14:55, Juri Lelli wrote:
>> Qais reported [1] that iterating over all tasks when rebuilding root
>> domains for finding out which ones are DEADLINE and need their bandwidth
>> correctly restored on such root domains can be a costly operation (10+
>> ms delays on suspend-resume). He proposed we skip rebuilding root
>> domains for certain operations, but that approach seemed arch specific
>> and possibly prone to errors, as paths that ultimately trigger a rebuild
>> might be quite convoluted (thanks Qais for spending time on this!).
>>
>> To fix the problem
>>
>> 01/06 - Rename functions deadline with DEADLINE accounting (cleanup
>> suggested by Qais) - no functional change
>> 02/06 - Bring back cpuset_mutex (so that we have write access to cpusets
>> from scheduler operations - and we also fix some problems
>> associated to percpu_cpuset_rwsem)
>> 03/06 - Keep track of the number of DEADLINE tasks belonging to each cpuset
>> 04/06 - Create DL BW alloc, free & check overflow interface for bulk
>> bandwidth allocation/removal - no functional change
>> 05/06 - Fix bandwidth allocation handling for cgroup operation
>> involving multiple tasks
>> 06/06 - Use this information to only perform the costly iteration if
>> DEADLINE tasks are actually present in the cpuset for which a
>> corresponding root domain is being rebuilt
>>
>> With respect to the RFC posting [2]
>>
>> 1 - rename DEADLINE bandwidth accounting functions - Qais
>> 2 - call inc/dec_dl_tasks_cs from switched_{to,from}_dl - Qais
>> 3 - fix DEADLINE bandwidth allocation with multiple tasks - Waiman,
>> contributed by Dietmar
>>
>> This set is also available from
>>
>> https://github.com/jlelli/linux.git deadline/rework-cpusets
> Is this just waiting to be picked up or still there's something to be addressed
> still?
There are some changes to cpuset code recently and so I believe that
this patch series may need to be refreshed to reconcile the changes.
Thanks,
Longman
On 18/04/23 10:31, Waiman Long wrote:
> On 4/18/23 10:11, Qais Yousef wrote:
> > On 03/29/23 14:55, Juri Lelli wrote:
> > > Qais reported [1] that iterating over all tasks when rebuilding root
> > > domains for finding out which ones are DEADLINE and need their bandwidth
> > > correctly restored on such root domains can be a costly operation (10+
> > > ms delays on suspend-resume). He proposed we skip rebuilding root
> > > domains for certain operations, but that approach seemed arch specific
> > > and possibly prone to errors, as paths that ultimately trigger a rebuild
> > > might be quite convoluted (thanks Qais for spending time on this!).
> > >
> > > To fix the problem
> > >
> > > 01/06 - Rename functions deadline with DEADLINE accounting (cleanup
> > > suggested by Qais) - no functional change
> > > 02/06 - Bring back cpuset_mutex (so that we have write access to cpusets
> > > from scheduler operations - and we also fix some problems
> > > associated to percpu_cpuset_rwsem)
> > > 03/06 - Keep track of the number of DEADLINE tasks belonging to each cpuset
> > > 04/06 - Create DL BW alloc, free & check overflow interface for bulk
> > > bandwidth allocation/removal - no functional change
> > > 05/06 - Fix bandwidth allocation handling for cgroup operation
> > > involving multiple tasks
> > > 06/06 - Use this information to only perform the costly iteration if
> > > DEADLINE tasks are actually present in the cpuset for which a
> > > corresponding root domain is being rebuilt
> > >
> > > With respect to the RFC posting [2]
> > >
> > > 1 - rename DEADLINE bandwidth accounting functions - Qais
> > > 2 - call inc/dec_dl_tasks_cs from switched_{to,from}_dl - Qais
> > > 3 - fix DEADLINE bandwidth allocation with multiple tasks - Waiman,
> > > contributed by Dietmar
> > >
> > > This set is also available from
> > >
> > > https://github.com/jlelli/linux.git deadline/rework-cpusets
> > Is this just waiting to be picked up or still there's something to be addressed
> > still?
>
> There are some changes to cpuset code recently and so I believe that this
> patch series may need to be refreshed to reconcile the changes.
Yeah, will soon take a look.
Thanks!
Juri
© 2016 - 2026 Red Hat, Inc.