[PATCH v6 0/2] Fix nohz_full vs cfs bandwidth

Posted by Phil Auld 2 years, 7 months ago
This is v6 of patch 2/2 which is adding code to prevent
the tick from being stopped when the single running task
has bandwidth limits. Discussions had led to the idea of
adding a bit to task_struct to help make this decision.

There was some complexity with doing it in the task which
is avoided by using something in the cfs_rq. Looking
into that led me to the hierarchical_quota field in the
cfs_bandwidth struct. We spend a good deal of effort
updating (or trying to, see patch 1/2) that value for
the whole task_group tree when a quota is set/changed.
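
To illustrate the idea of keeping one value per group that reflects the
limits of the whole hierarchy, here is a minimal userspace sketch. It is
not the code from patch 1/2: struct group, QUOTA_UNLIMITED and
update_hierarchical_quota() are hypothetical names, and the model compares
raw quotas and ignores periods. It assumes cgroup v2 semantics, where a
child can never get more than its parent allows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for "no bandwidth limit" (not the kernel constant). */
#define QUOTA_UNLIMITED UINT64_MAX

/* Hypothetical, simplified group; not the kernel's struct task_group. */
struct group {
	struct group *parent;        /* NULL for the root group */
	uint64_t quota;              /* this group's own quota, or QUOTA_UNLIMITED */
	uint64_t hierarchical_quota; /* tightest quota on the path to the root */
};

/*
 * Recompute one group's effective quota from its parent's value.
 * Parents must be updated before their children (top-down walk).
 */
void update_hierarchical_quota(struct group *g)
{
	uint64_t inherited;

	assert(g != NULL);
	inherited = g->parent ? g->parent->hierarchical_quota : QUOTA_UNLIMITED;

	/* QUOTA_UNLIMITED is the largest value, so a plain minimum is enough. */
	g->hierarchical_quota = g->quota < inherited ? g->quota : inherited;
}
```

In this model a group with no quota of its own still ends up with a
finite hierarchical_quota if any ancestor is limited, which is the
property a later check can rely on.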

This new version first fixes that value to be meaningful
for cgroupv2 and then leverages it to make the decisions
about blocking the tick_stop. 
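
As a rough illustration of the decision described above, the sketch below
models it in userspace. It is an assumption-laden model, not the code from
patch 2/2: struct cpu_state, QUOTA_UNLIMITED and tick_can_stop() are
hypothetical names, and the real check has more conditions than are shown
here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for "no bandwidth limit" (not the kernel constant). */
#define QUOTA_UNLIMITED UINT64_MAX

/* Hypothetical per-CPU snapshot; not a kernel structure. */
struct cpu_state {
	unsigned int nr_running;          /* runnable tasks on this CPU */
	uint64_t curr_hierarchical_quota; /* effective quota of the running task's group */
};

/* Return true if this model would allow the tick to be stopped. */
bool tick_can_stop(const struct cpu_state *cpu)
{
	assert(cpu != NULL);

	/* More than one runnable task: the tick is needed for preemption. */
	if (cpu->nr_running > 1)
		return false;

	/*
	 * A single running task whose hierarchy has a bandwidth limit:
	 * keep the tick so its runtime is still accounted against the quota.
	 */
	if (cpu->nr_running == 1 &&
	    cpu->curr_hierarchical_quota != QUOTA_UNLIMITED)
		return false;

	return true;
}
```

The point of reading a single precomputed value is that the check stays
cheap: no walk up the group tree is needed at the time the tick decision
is made.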

Phil Auld (2):
  sched, cgroup: Restore meaning to hierarchical_quota
  Sched/fair: Block nohz tick_stop when cfs bandwidth in use

 kernel/sched/core.c     | 23 ++++++++++++++---
 kernel/sched/fair.c     | 56 ++++++++++++++++++++++++++++++++++++++---
 kernel/sched/features.h |  2 ++
 kernel/sched/sched.h    |  3 ++-
 4 files changed, 76 insertions(+), 8 deletions(-)

-- 
2.31.1
Re: [PATCH v6 0/2] Fix nohz_full vs cfs bandwidth
Posted by Phil Auld 2 years, 6 months ago
Hi Peter,

On Wed, Jul 12, 2023 at 09:33:55AM -0400 Phil Auld wrote:
> This is v6 of patch 2/2 which is adding code to prevent
> the tick from being stopped when the single running task
> has bandwidth limits. Discussions had led to the idea of
> adding a bit to task_struct to help make this decision.
> 
> There was some complexity with doing it in the task which
> is avoided by using something in the cfs_rq. Looking
> into that led me to the hierarchical_quota field in the
> cfs_bandwidth struct. We spend a good deal of effort
> updating (or trying to, see patch 1/2) that value for
> the whole task_group tree when a quota is set/changed.
> 
> This new version first fixes that value to be meaningful
> for cgroupv2 and then leverages it to make the decisions
> about blocking the tick_stop. 
> 
> Phil Auld (2):
>   sched, cgroup: Restore meaning to hierarchical_quota
>   Sched/fair: Block nohz tick_stop when cfs bandwidth in use
> 
>  kernel/sched/core.c     | 23 ++++++++++++++---
>  kernel/sched/fair.c     | 56 ++++++++++++++++++++++++++++++++++++++---
>  kernel/sched/features.h |  2 ++
>  kernel/sched/sched.h    |  3 ++-
>  4 files changed, 76 insertions(+), 8 deletions(-)
> 
> -- 
> 2.31.1
> 

Ping :)

Any thoughts on these now?


Cheers,
Phil


--
Re: [PATCH v6 0/2] Fix nohz_full vs cfs bandwidth
Posted by Phil Auld 2 years, 6 months ago
Sorry, ignore duplicate, please.  Apparently I forgot email while
on PTO last week :)

On Mon, Jul 31, 2023 at 12:38:37PM -0400 Phil Auld wrote:
> Hi Peter,
> 
> On Wed, Jul 12, 2023 at 09:33:55AM -0400 Phil Auld wrote:
> > This is v6 of patch 2/2 which is adding code to prevent
> > the tick from being stopped when the single running task
> > has bandwidth limits. Discussions had led to the idea of
> > adding a bit to task_struct to help make this decision.
> > 
> > There was some complexity with doing it in the task which
> > is avoided by using something in the cfs_rq. Looking
> > into that led me to the hierarchical_quota field in the
> > cfs_bandwidth struct. We spend a good deal of effort
> > updating (or trying to, see patch 1/2) that value for
> > the whole task_group tree when a quota is set/changed.
> > 
> > This new version first fixes that value to be meaningful
> > for cgroupv2 and then leverages it to make the decisions
> > about blocking the tick_stop. 
> > 
> > Phil Auld (2):
> >   sched, cgroup: Restore meaning to hierarchical_quota
> >   Sched/fair: Block nohz tick_stop when cfs bandwidth in use
> > 
> >  kernel/sched/core.c     | 23 ++++++++++++++---
> >  kernel/sched/fair.c     | 56 ++++++++++++++++++++++++++++++++++++++---
> >  kernel/sched/features.h |  2 ++
> >  kernel/sched/sched.h    |  3 ++-
> >  4 files changed, 76 insertions(+), 8 deletions(-)
> > 
> > -- 
> > 2.31.1
> > 
> 
> Ping :)
> 
> Any thoughts on these now?
> 
> 
> Cheers,
> Phil
> 
> 
> -- 
> 

--