[PATCH 0/6] Add latency_nice priority

Vincent Guittot posted 6 patches 4 years, 3 months ago
There is a newer version of this series
include/linux/sched.h            |  3 +
include/uapi/linux/sched.h       |  4 +-
include/uapi/linux/sched/types.h | 19 +++++++
init/init_task.c                 |  1 +
kernel/sched/core.c              | 97 ++++++++++++++++++++++++++++++++
kernel/sched/debug.c             |  1 +
kernel/sched/fair.c              | 92 +++++++++++++++++++++++++++++-
kernel/sched/sched.h             | 34 +++++++++++
tools/include/uapi/linux/sched.h |  4 +-
9 files changed, 251 insertions(+), 4 deletions(-)
[PATCH 0/6] Add latency_nice priority
Posted by Vincent Guittot 4 years, 3 months ago
This patchset restarts the work about adding a latency nice priority to
describe the latency tolerance of cfs tasks.

The patches [1-4] have been done by Parth:
https://lore.kernel.org/lkml/20200228090755.22829-1-parth@linux.ibm.com/

I have just rebased and moved the set of latency priority outside the
priority update. I have removed the reviewed tag because the patches
are 2 years old.

The patches [5-6] use latency nice priority to decide if a cfs task can
preempt the current running task. Patch 5 gives some tests results with
cyclictests and hackbench to highlight the benefit of latency nice
priority for short interactive task or long intensive tasks.

Parth Shah (4):
  sched: Introduce latency-nice as a per-task attribute
  sched/core: Propagate parent task's latency requirements to the child
    task
  sched: Allow sched_{get,set}attr to change latency_nice of the task
  sched/core: Add permission checks for setting the latency_nice value

Vincent Guittot (2):
  sched/fair: Take into account latency nice at wakeup
  sched/fair: Add sched group latency support

 include/linux/sched.h            |  3 +
 include/uapi/linux/sched.h       |  4 +-
 include/uapi/linux/sched/types.h | 19 +++++++
 init/init_task.c                 |  1 +
 kernel/sched/core.c              | 97 ++++++++++++++++++++++++++++++++
 kernel/sched/debug.c             |  1 +
 kernel/sched/fair.c              | 92 +++++++++++++++++++++++++++++-
 kernel/sched/sched.h             | 34 +++++++++++
 tools/include/uapi/linux/sched.h |  4 +-
 9 files changed, 251 insertions(+), 4 deletions(-)

-- 
2.17.1
Re: [PATCH 0/6] Add latency_nice priority
Posted by Qais Yousef 4 years, 3 months ago
Hi Vincent

Thanks for reviving this patchset!

On 03/11/22 17:14, Vincent Guittot wrote:
> This patchset restarts the work about adding a latency nice priority to
> describe the latency tolerance of cfs tasks.
> 
> The patches [1-4] have been done by Parth:
> https://lore.kernel.org/lkml/20200228090755.22829-1-parth@linux.ibm.com/
> 
> I have just rebased and moved the set of latency priority outside the
> priority update. I have removed the reviewed tag because the patches
> are 2 years old.

AFAIR the blocking issue we had then is on agreement on the interface. Has this
been resolved now? I didn't see any further discussion since then.

> 
> The patches [5-6] use latency nice priority to decide if a cfs task can
> preempt the current running task. Patch 5 gives some tests results with
> cyclictests and hackbench to highlight the benefit of latency nice
> priority for short interactive task or long intensive tasks.

This is a new use case AFAICT. For Android, we want to do something in EAS path
to skip feec() and revert to select_idle_capacity() (prefer_idle). I think
Oracle's use case was control the search depth in the LB.

I am not keen on this new use case. It looks awefully similar to how nice
works. And if I tweak nice values I can certainly achieve similar effects
without this new addition:


	--::((TESTING NICE 0))::--

	  hackbench -l $((2560 / $group)) -g $group

	       count     mean       std    min  ...     90%      95%      99%    max
	group                                   ...                                 
	1       20.0  0.69315  0.119378  0.545  ...  0.8309  0.84725  0.97265  1.004
	4       20.0  0.54650  0.063123  0.434  ...  0.6363  0.64840  0.65448  0.656
	8       20.0  0.51025  0.042268  0.441  ...  0.5725  0.57830  0.59806  0.603
	16      20.0  0.54545  0.031041  0.483  ...  0.5824  0.58655  0.59491  0.597

	  hackbench -p -l $((2560 / $group)) -g $group

	       count     mean       std    min  ...     90%     95%      99%    max
	group                                   ...                                
	1       20.0  0.48135  0.036292  0.430  ...  0.5300  0.5481  0.54962  0.550
	4       20.0  0.42925  0.050890  0.339  ...  0.4838  0.5094  0.51548  0.517
	8       20.0  0.33655  0.049839  0.269  ...  0.4002  0.4295  0.43710  0.439
	16      20.0  0.31775  0.031001  0.280  ...  0.3530  0.3639  0.39278  0.400

	  hackbench -l 10000 -g 16 &
	  cyclictest --policy other -D 5 -q -H 20000 --histfile data.txt

	# Min Latencies: 00005
	# Avg Latencies: 00342
	# Max Latencies: 23562


	--::((TESTING NICE -20))::--

	  hackbench -l $((2560 / $group)) -g $group

	       count     mean       std    min  ...     90%     95%      99%    max
	group                                   ...                                
	1       20.0  0.76990  0.126582  0.585  ...  0.9169  0.9316  1.03192  1.057
	4       20.0  0.67715  0.105558  0.505  ...  0.8202  0.8581  0.85962  0.860
	8       20.0  0.75715  0.061286  0.631  ...  0.8276  0.8425  0.85010  0.852
	16      20.0  0.72085  0.089566  0.578  ...  0.8493  0.8818  0.92436  0.935

	  hackbench -p -l $((2560 / $group)) -g $group

	       count     mean       std    min  ...     90%      95%      99%    max
	group                                   ...                                 
	1       20.0  0.50245  0.055636  0.388  ...  0.5839  0.60185  0.61477  0.618
	4       20.0  0.56280  0.139277  0.354  ...  0.7280  0.75075  0.82295  0.841
	8       20.0  0.58005  0.091819  0.412  ...  0.6969  0.71400  0.71400  0.714
	16      20.0  0.52110  0.081465  0.323  ...  0.6169  0.63685  0.68017  0.691

	  hackbench -l 10000 -g 16 &
	  cyclictest --policy other -D 5 -q -H 20000 --histfile data.txt

	# Min Latencies: 00007
	# Avg Latencies: 00081
	# Max Latencies: 20560


	--::((TESTING NICE 19))::--

	  hackbench -l $((2560 / $group)) -g $group

	       count     mean       std    min  ...     90%      95%      99%    max
	group                                   ...                                 
	1       20.0  0.46560  0.013694  0.448  ...  0.4782  0.49805  0.49881  0.499
	4       20.0  0.43705  0.014979  0.414  ...  0.4550  0.45540  0.46148  0.463
	8       20.0  0.45800  0.013471  0.436  ...  0.4754  0.47925  0.48305  0.484
	16      20.0  0.53025  0.007239  0.522  ...  0.5391  0.54040  0.54648  0.548

	  hackbench -p -l $((2560 / $group)) -g $group

	       count     mean       std    min  ...     90%      95%      99%    max
	group                                   ...                                 
	1       20.0  0.27480  0.013725  0.247  ...  0.2892  0.29125  0.29505  0.296
	4       20.0  0.25095  0.011637  0.234  ...  0.2700  0.27010  0.27162  0.272
	8       20.0  0.25250  0.010097  0.240  ...  0.2632  0.27415  0.27643  0.277
	16      20.0  0.26700  0.007595  0.257  ...  0.2751  0.27645  0.28329  0.285

	  hackbench -l 10000 -g 16 &
	  cyclictest --policy other -D 5 -q -H 20000 --histfile data.txt

	# Min Latencies: 00058
	# Avg Latencies: 77232
	# Max Latencies: 696375

For hackbench, the relationship seems to be inversed. Better nice value
produces worse result. But for the cycletest, the avg goes down considerably
similar to your results.

Aren't we just manipulating the same thing with your new proposal or did
I miss something? Can we impact preemption in isolation without having any
impact on bandwidth?

I am worried about how userspace can reason about the expected outcome when
nice and latency_nice are combined together.


Thanks

--
Qais Yousef
Re: [PATCH 0/6] Add latency_nice priority
Posted by Vincent Guittot 4 years, 3 months ago
On Tue, 22 Mar 2022 at 17:39, Qais Yousef <qais.yousef@arm.com> wrote:
>
> Hi Vincent
>
> Thanks for reviving this patchset!
>
> On 03/11/22 17:14, Vincent Guittot wrote:
> > This patchset restarts the work about adding a latency nice priority to
> > describe the latency tolerance of cfs tasks.
> >
> > The patches [1-4] have been done by Parth:
> > https://lore.kernel.org/lkml/20200228090755.22829-1-parth@linux.ibm.com/
> >
> > I have just rebased and moved the set of latency priority outside the
> > priority update. I have removed the reviewed tag because the patches
> > are 2 years old.
>
> AFAIR the blocking issue we had then is on agreement on the interface. Has this
> been resolved now? I didn't see any further discussion since then.

I think that there was an agreement about using a latency nice
priority in the range [-20:19] with -20 meaning sensitive to latency
whereas 19 means that task doesn't care about scheduling latency.  The
open point was about how to use this input in the scheduler with some
behavior being opposed.

>
> >
> > The patches [5-6] use latency nice priority to decide if a cfs task can
> > preempt the current running task. Patch 5 gives some tests results with
> > cyclictests and hackbench to highlight the benefit of latency nice
> > priority for short interactive task or long intensive tasks.
>
> This is a new use case AFAICT. For Android, we want to do something in EAS path

I don't think it's new, it's about being able to run some tasks in
priority as long as they haven't use all their cpu bandwidth. ANd this
has the main advantage of not being impact by finding an idle cpu. If
the task is latency sensitive, it will run 1st on the cpu being idle
or not.

> to skip feec() and revert to select_idle_capacity() (prefer_idle). I think
> Oracle's use case was control the search depth in the LB.
>
> I am not keen on this new use case. It looks awefully similar to how nice
> works. And if I tweak nice values I can certainly achieve similar effects
> without this new addition:

nice is about the cpu bandwidth allocated to a task and not about
scheduling task priori to another one. If you provide all cpu
bandwidth to a task, it will most probably have low latency because
others will always have consumed all their time slice.

>
>
>         --::((TESTING NICE 0))::--
>
>           hackbench -l $((2560 / $group)) -g $group
>
>                count     mean       std    min  ...     90%      95%      99%    max
>         group                                   ...
>         1       20.0  0.69315  0.119378  0.545  ...  0.8309  0.84725  0.97265  1.004
>         4       20.0  0.54650  0.063123  0.434  ...  0.6363  0.64840  0.65448  0.656
>         8       20.0  0.51025  0.042268  0.441  ...  0.5725  0.57830  0.59806  0.603
>         16      20.0  0.54545  0.031041  0.483  ...  0.5824  0.58655  0.59491  0.597
>
>           hackbench -p -l $((2560 / $group)) -g $group
>
>                count     mean       std    min  ...     90%     95%      99%    max
>         group                                   ...
>         1       20.0  0.48135  0.036292  0.430  ...  0.5300  0.5481  0.54962  0.550
>         4       20.0  0.42925  0.050890  0.339  ...  0.4838  0.5094  0.51548  0.517
>         8       20.0  0.33655  0.049839  0.269  ...  0.4002  0.4295  0.43710  0.439
>         16      20.0  0.31775  0.031001  0.280  ...  0.3530  0.3639  0.39278  0.400
>
>           hackbench -l 10000 -g 16 &
>           cyclictest --policy other -D 5 -q -H 20000 --histfile data.txt
>
>         # Min Latencies: 00005
>         # Avg Latencies: 00342
>         # Max Latencies: 23562
>
>
>         --::((TESTING NICE -20))::--
>
>           hackbench -l $((2560 / $group)) -g $group
>
>                count     mean       std    min  ...     90%     95%      99%    max
>         group                                   ...
>         1       20.0  0.76990  0.126582  0.585  ...  0.9169  0.9316  1.03192  1.057
>         4       20.0  0.67715  0.105558  0.505  ...  0.8202  0.8581  0.85962  0.860
>         8       20.0  0.75715  0.061286  0.631  ...  0.8276  0.8425  0.85010  0.852
>         16      20.0  0.72085  0.089566  0.578  ...  0.8493  0.8818  0.92436  0.935
>
>           hackbench -p -l $((2560 / $group)) -g $group
>
>                count     mean       std    min  ...     90%      95%      99%    max
>         group                                   ...
>         1       20.0  0.50245  0.055636  0.388  ...  0.5839  0.60185  0.61477  0.618
>         4       20.0  0.56280  0.139277  0.354  ...  0.7280  0.75075  0.82295  0.841
>         8       20.0  0.58005  0.091819  0.412  ...  0.6969  0.71400  0.71400  0.714
>         16      20.0  0.52110  0.081465  0.323  ...  0.6169  0.63685  0.68017  0.691
>
>           hackbench -l 10000 -g 16 &
>           cyclictest --policy other -D 5 -q -H 20000 --histfile data.txt
>
>         # Min Latencies: 00007
>         # Avg Latencies: 00081
>         # Max Latencies: 20560
>
>
>         --::((TESTING NICE 19))::--
>
>           hackbench -l $((2560 / $group)) -g $group
>
>                count     mean       std    min  ...     90%      95%      99%    max
>         group                                   ...
>         1       20.0  0.46560  0.013694  0.448  ...  0.4782  0.49805  0.49881  0.499
>         4       20.0  0.43705  0.014979  0.414  ...  0.4550  0.45540  0.46148  0.463
>         8       20.0  0.45800  0.013471  0.436  ...  0.4754  0.47925  0.48305  0.484
>         16      20.0  0.53025  0.007239  0.522  ...  0.5391  0.54040  0.54648  0.548
>
>           hackbench -p -l $((2560 / $group)) -g $group
>
>                count     mean       std    min  ...     90%      95%      99%    max
>         group                                   ...
>         1       20.0  0.27480  0.013725  0.247  ...  0.2892  0.29125  0.29505  0.296
>         4       20.0  0.25095  0.011637  0.234  ...  0.2700  0.27010  0.27162  0.272
>         8       20.0  0.25250  0.010097  0.240  ...  0.2632  0.27415  0.27643  0.277
>         16      20.0  0.26700  0.007595  0.257  ...  0.2751  0.27645  0.28329  0.285
>
>           hackbench -l 10000 -g 16 &
>           cyclictest --policy other -D 5 -q -H 20000 --histfile data.txt
>
>         # Min Latencies: 00058
>         # Avg Latencies: 77232
>         # Max Latencies: 696375
>
> For hackbench, the relationship seems to be inversed. Better nice value
> produces worse result. But for the cycletest, the avg goes down considerably
> similar to your results.
>
> Aren't we just manipulating the same thing with your new proposal or did
> I miss something? Can we impact preemption in isolation without having any
> impact on bandwidth?
>
> I am worried about how userspace can reason about the expected outcome when
> nice and latency_nice are combined together.
>
>
> Thanks
>
> --
> Qais Yousef
Re: [PATCH 0/6] Add latency_nice priority
Posted by Qais Yousef 4 years, 3 months ago
On 03/23/22 16:32, Vincent Guittot wrote:
> On Tue, 22 Mar 2022 at 17:39, Qais Yousef <qais.yousef@arm.com> wrote:
> >
> > Hi Vincent
> >
> > Thanks for reviving this patchset!
> >
> > On 03/11/22 17:14, Vincent Guittot wrote:
> > > This patchset restarts the work about adding a latency nice priority to
> > > describe the latency tolerance of cfs tasks.
> > >
> > > The patches [1-4] have been done by Parth:
> > > https://lore.kernel.org/lkml/20200228090755.22829-1-parth@linux.ibm.com/
> > >
> > > I have just rebased and moved the set of latency priority outside the
> > > priority update. I have removed the reviewed tag because the patches
> > > are 2 years old.
> >
> > AFAIR the blocking issue we had then is on agreement on the interface. Has this
> > been resolved now? I didn't see any further discussion since then.
> 
> I think that there was an agreement about using a latency nice
> priority in the range [-20:19] with -20 meaning sensitive to latency
> whereas 19 means that task doesn't care about scheduling latency.  The
> open point was about how to use this input in the scheduler with some
> behavior being opposed.

What I remember is that the problem was to consolidate on use cases then
discuss interfaces.

See https://lwn.net/Articles/820659/

	"  Youssef said that the interface to all of this is the sticking
	point.  Thomas Gleixner agreed, saying that the -20..19 range "requires
	a crystal ball" to use properly. Zijlstra repeated his call to
	enumerate the use cases before getting into the interface details.
	Giani repeated that the interface does not look correct now, and agreed
	that a more comprehensive look at the use cases was needed. Things were
	being done backwards currently, he said.  "

> 
> >
> > >
> > > The patches [5-6] use latency nice priority to decide if a cfs task can
> > > preempt the current running task. Patch 5 gives some tests results with
> > > cyclictests and hackbench to highlight the benefit of latency nice
> > > priority for short interactive task or long intensive tasks.
> >
> > This is a new use case AFAICT. For Android, we want to do something in EAS path
> 
> I don't think it's new, it's about being able to run some tasks in

I meant new use case to latency-nice interface. I don't think we had this in
any of our discussions before? I don't mind it, but it'd be good to clarify if
it has any relation about the other use cases and what should happen to the
other use cases.


Thanks

--
Qais Yousef
Re: [PATCH 0/6] Add latency_nice priority
Posted by Vincent Guittot 4 years, 2 months ago
removed Dhaval's email which returns error

On Thu, 24 Mar 2022 at 18:25, Qais Yousef <qais.yousef@arm.com> wrote:
>
> On 03/23/22 16:32, Vincent Guittot wrote:
> > On Tue, 22 Mar 2022 at 17:39, Qais Yousef <qais.yousef@arm.com> wrote:
> > >
> > > Hi Vincent
> > >
> > > Thanks for reviving this patchset!
> > >
> > > On 03/11/22 17:14, Vincent Guittot wrote:
> > > > This patchset restarts the work about adding a latency nice priority to
> > > > describe the latency tolerance of cfs tasks.
> > > >
> > > > The patches [1-4] have been done by Parth:
> > > > https://lore.kernel.org/lkml/20200228090755.22829-1-parth@linux.ibm.com/
> > > >
> > > > I have just rebased and moved the set of latency priority outside the
> > > > priority update. I have removed the reviewed tag because the patches
> > > > are 2 years old.
> > >
> > > AFAIR the blocking issue we had then is on agreement on the interface. Has this
> > > been resolved now? I didn't see any further discussion since then.
> >
> > I think that there was an agreement about using a latency nice
> > priority in the range [-20:19] with -20 meaning sensitive to latency
> > whereas 19 means that task doesn't care about scheduling latency.  The
> > open point was about how to use this input in the scheduler with some
> > behavior being opposed.
>
> What I remember is that the problem was to consolidate on use cases then
> discuss interfaces.
>
> See https://lwn.net/Articles/820659/
>
>         "  Youssef said that the interface to all of this is the sticking
>         point.  Thomas Gleixner agreed, saying that the -20..19 range "requires
>         a crystal ball" to use properly. Zijlstra repeated his call to
>         enumerate the use cases before getting into the interface details.
>         Giani repeated that the interface does not look correct now, and agreed
>         that a more comprehensive look at the use cases was needed. Things were
>         being done backwards currently, he said.  "
>

At LPC, everybody seemed aligned with latency_nice so I assumed that
there was an agreement on this interface.
Latency_nice fits well with my proposal because it's all about
relative comparison between the running task to the others. The
current nice priority is used to set how much cpu bandwidth a task
will have compared to others and the latency_nice is used in a similar
way to know which one should run compared to the others.

> >
> > >
> > > >
> > > > The patches [5-6] use latency nice priority to decide if a cfs task can
> > > > preempt the current running task. Patch 5 gives some tests results with
> > > > cyclictests and hackbench to highlight the benefit of latency nice
> > > > priority for short interactive task or long intensive tasks.
> > >
> > > This is a new use case AFAICT. For Android, we want to do something in EAS path
> >
> > I don't think it's new, it's about being able to run some tasks in
>
> I meant new use case to latency-nice interface. I don't think we had this in
> any of our discussions before? I don't mind it, but it'd be good to clarify if
> it has any relation about the other use cases and what should happen to the
> other use cases.

Several discussions happened about changing the preemption policy of
CFS. I have Mel's example in mind with hackbench where we want to
reduce the preemption capabilities for the threads and on the other
side the multimedia tasks which complain about having to wait before
being scheduled. All this is about preempting or not the others. And
all this has been kept outside topology consideration but only for the
local run queue

Regards,
Vincent

>
>
> Thanks
>
> --
> Qais Yousef
Re: [PATCH 0/6] Add latency_nice priority
Posted by Qais Yousef 4 years, 2 months ago
On 03/25/22 14:27, Vincent Guittot wrote:
> removed Dhaval's email which returns error
> 
> On Thu, 24 Mar 2022 at 18:25, Qais Yousef <qais.yousef@arm.com> wrote:
> >
> > On 03/23/22 16:32, Vincent Guittot wrote:
> > > On Tue, 22 Mar 2022 at 17:39, Qais Yousef <qais.yousef@arm.com> wrote:
> > > >
> > > > Hi Vincent
> > > >
> > > > Thanks for reviving this patchset!
> > > >
> > > > On 03/11/22 17:14, Vincent Guittot wrote:
> > > > > This patchset restarts the work about adding a latency nice priority to
> > > > > describe the latency tolerance of cfs tasks.
> > > > >
> > > > > The patches [1-4] have been done by Parth:
> > > > > https://lore.kernel.org/lkml/20200228090755.22829-1-parth@linux.ibm.com/
> > > > >
> > > > > I have just rebased and moved the set of latency priority outside the
> > > > > priority update. I have removed the reviewed tag because the patches
> > > > > are 2 years old.
> > > >
> > > > AFAIR the blocking issue we had then is on agreement on the interface. Has this
> > > > been resolved now? I didn't see any further discussion since then.
> > >
> > > I think that there was an agreement about using a latency nice
> > > priority in the range [-20:19] with -20 meaning sensitive to latency
> > > whereas 19 means that task doesn't care about scheduling latency.  The
> > > open point was about how to use this input in the scheduler with some
> > > behavior being opposed.
> >
> > What I remember is that the problem was to consolidate on use cases then
> > discuss interfaces.
> >
> > See https://lwn.net/Articles/820659/
> >
> >         "  Youssef said that the interface to all of this is the sticking
> >         point.  Thomas Gleixner agreed, saying that the -20..19 range "requires
> >         a crystal ball" to use properly. Zijlstra repeated his call to
> >         enumerate the use cases before getting into the interface details.
> >         Giani repeated that the interface does not look correct now, and agreed
> >         that a more comprehensive look at the use cases was needed. Things were
> >         being done backwards currently, he said.  "
> >
> 
> At LPC, everybody seemed aligned with latency_nice so I assumed that
> there was an agreement on this interface.
> Latency_nice fits well with my proposal because it's all about
> relative comparison between the running task to the others. The
> current nice priority is used to set how much cpu bandwidth a task
> will have compared to others and the latency_nice is used in a similar
> way to know which one should run compared to the others.

I think the users were happy, but not the maintainers :-)

I am still happy with it, but I just want to make sure that our use case is
something we still care about having in upstream and we'd still like to use
this interface to achieve that. I don't want it to be blocked based on
interface not suitable. So this should be taken into consideration that this is
not a replacement to at least our previous use case.

The concept of latency_nice conversion to weight is something new and I don't
think any of the other users requires it. So we need to keep the user visible
interface detached from weight which is internal implementation detail for your
use case.

> 
> > >
> > > >
> > > > >
> > > > > The patches [5-6] use latency nice priority to decide if a cfs task can
> > > > > preempt the current running task. Patch 5 gives some tests results with
> > > > > cyclictests and hackbench to highlight the benefit of latency nice
> > > > > priority for short interactive task or long intensive tasks.
> > > >
> > > > This is a new use case AFAICT. For Android, we want to do something in EAS path
> > >
> > > I don't think it's new, it's about being able to run some tasks in
> >
> > I meant new use case to latency-nice interface. I don't think we had this in
> > any of our discussions before? I don't mind it, but it'd be good to clarify if
> > it has any relation about the other use cases and what should happen to the
> > other use cases.
> 
> Several discussions happened about changing the preemption policy of
> CFS. I have Mel's example in mind with hackbench where we want to
> reduce the preemption capabilities for the threads and on the other
> side the multimedia tasks which complain about having to wait before
> being scheduled. All this is about preempting or not the others. And
> all this has been kept outside topology consideration but only for the
> local run queue

Cool. I can see its usefulness. Though I still have to convince myself that you
can affect preemption without impacting bandwidth and is not a subtler way to
modify nice.


Thanks

--
Qais Yousef
Re: [PATCH 0/6] Add latency_nice priority
Posted by Vincent Guittot 4 years, 2 months ago
On Mon, 28 Mar 2022 at 18:27, Qais Yousef <qais.yousef@arm.com> wrote:
>
> On 03/25/22 14:27, Vincent Guittot wrote:
> > removed Dhaval's email which returns error
> >
> > On Thu, 24 Mar 2022 at 18:25, Qais Yousef <qais.yousef@arm.com> wrote:
> > >
> > > On 03/23/22 16:32, Vincent Guittot wrote:
> > > > On Tue, 22 Mar 2022 at 17:39, Qais Yousef <qais.yousef@arm.com> wrote:
> > > > >
> > > > > Hi Vincent
> > > > >
> > > > > Thanks for reviving this patchset!
> > > > >
> > > > > On 03/11/22 17:14, Vincent Guittot wrote:
> > > > > > This patchset restarts the work about adding a latency nice priority to
> > > > > > describe the latency tolerance of cfs tasks.
> > > > > >
> > > > > > The patches [1-4] have been done by Parth:
> > > > > > https://lore.kernel.org/lkml/20200228090755.22829-1-parth@linux.ibm.com/
> > > > > >
> > > > > > I have just rebased and moved the set of latency priority outside the
> > > > > > priority update. I have removed the reviewed tag because the patches
> > > > > > are 2 years old.
> > > > >
> > > > > AFAIR the blocking issue we had then is on agreement on the interface. Has this
> > > > > been resolved now? I didn't see any further discussion since then.
> > > >
> > > > I think that there was an agreement about using a latency nice
> > > > priority in the range [-20:19] with -20 meaning sensitive to latency
> > > > whereas 19 means that task doesn't care about scheduling latency.  The
> > > > open point was about how to use this input in the scheduler with some
> > > > behavior being opposed.
> > >
> > > What I remember is that the problem was to consolidate on use cases then
> > > discuss interfaces.
> > >
> > > See https://lwn.net/Articles/820659/
> > >
> > >         "  Youssef said that the interface to all of this is the sticking
> > >         point.  Thomas Gleixner agreed, saying that the -20..19 range "requires
> > >         a crystal ball" to use properly. Zijlstra repeated his call to
> > >         enumerate the use cases before getting into the interface details.
> > >         Giani repeated that the interface does not look correct now, and agreed
> > >         that a more comprehensive look at the use cases was needed. Things were
> > >         being done backwards currently, he said.  "
> > >
> >
> > At LPC, everybody seemed aligned with latency_nice so I assumed that
> > there was an agreement on this interface.
> > Latency_nice fits well with my proposal because it's all about
> > relative comparison between the running task to the others. The
> > current nice priority is used to set how much cpu bandwidth a task
> > will have compared to others and the latency_nice is used in a similar
> > way to know which one should run compared to the others.
>
> I think the users were happy, but not the maintainers :-)
>
> I am still happy with it, but I just want to make sure that our use case is
> something we still care about having in upstream and we'd still like to use
> this interface to achieve that. I don't want it to be blocked based on
> interface not suitable. So this should be taken into consideration that this is
> not a replacement to at least our previous use case.
>
> The concept of latency_nice conversion to weight is something new and I don't
> think any of the other users requires it. So we need to keep the user visible
> interface detached from weight which is internal implementation detail for your
> use case.

note that the weight is only another way to describe relative priority
but I will keep that in mind for the next version

>
> >
> > > >
> > > > >
> > > > > >
> > > > > > The patches [5-6] use latency nice priority to decide if a cfs task can
> > > > > > preempt the current running task. Patch 5 gives some tests results with
> > > > > > cyclictests and hackbench to highlight the benefit of latency nice
> > > > > > priority for short interactive task or long intensive tasks.
> > > > >
> > > > > This is a new use case AFAICT. For Android, we want to do something in EAS path
> > > >
> > > > I don't think it's new, it's about being able to run some tasks in
> > >
> > > I meant new use case to latency-nice interface. I don't think we had this in
> > > any of our discussions before? I don't mind it, but it'd be good to clarify if
> > > it has any relation about the other use cases and what should happen to the
> > > other use cases.
> >
> > Several discussions happened about changing the preemption policy of
> > CFS. I have Mel's example in mind with hackbench where we want to
> > reduce the preemption capabilities for the threads and on the other
> > side the multimedia tasks which complain about having to wait before
> > being scheduled. All this is about preempting or not the others. And
> > all this has been kept outside topology consideration but only for the
> > local run queue
>
> Cool. I can see its usefulness. Though I still have to convince myself that you
> can affect preemption without impacting bandwidth and is not a subtler way to
> modify nice.

This has been one of my main goal too: to not modify cpu bandwidth

Thanks
Vincent
>
>
> Thanks
>
> --
> Qais Yousef
Re: [PATCH 0/6] Add latency_nice priority
Posted by Dietmar Eggemann 4 years, 2 months ago
On 11/03/2022 17:14, Vincent Guittot wrote:
> This patchset restarts the work about adding a latency nice priority to
> describe the latency tolerance of cfs tasks.
> 
> The patches [1-4] have been done by Parth:
> https://lore.kernel.org/lkml/20200228090755.22829-1-parth@linux.ibm.com/
> 
> I have just rebased and moved the set of latency priority outside the
> priority update. I have removed the reviewed tag because the patches
> are 2 years old.
> 
> The patches [5-6] use latency nice priority to decide if a cfs task can
> preempt the current running task. Patch 5 gives some tests results with
> cyclictests and hackbench to highlight the benefit of latency nice
> priority for short interactive task or long intensive tasks.

The Android specific `latency_nice` (in Android `latency_sensitive`
[latency_nice < 0]) use case `Skip energy aware task placement` favors
an idle CPU over the EAS search path for a `latency_sensitive` task.

https://lkml.kernel.org/r/2aa4b838-c298-ec7d-08f3-caa50cc87dc2@arm.com

This is Android proprietary code similar to what we have in
find_idlest_group_cpu() in mainline.
We talked to the Android folks last week and IMHO they are not convinced
that they can switch this to the proposed `latency_nice->tweak
preemption` use case.
Re: [PATCH 0/6] Add latency_nice priority
Posted by Vincent Guittot 4 years, 2 months ago
Hi Dietmar,


On Mon, 28 Mar 2022 at 11:24, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
>
> On 11/03/2022 17:14, Vincent Guittot wrote:
> > This patchset restarts the work about adding a latency nice priority to
> > describe the latency tolerance of cfs tasks.
> >
> > The patches [1-4] have been done by Parth:
> > https://lore.kernel.org/lkml/20200228090755.22829-1-parth@linux.ibm.com/
> >
> > I have just rebased and moved the set of latency priority outside the
> > priority update. I have removed the reviewed tag because the patches
> > are 2 years old.
> >
> > The patches [5-6] use latency nice priority to decide if a cfs task can
> > preempt the current running task. Patch 5 gives some tests results with
> > cyclictests and hackbench to highlight the benefit of latency nice
> > priority for short interactive task or long intensive tasks.
>
> The Android specific `latency_nice` (in Android `latency_sensitive`
> [latency_nice < 0]) use case `Skip energy aware task placement` favors
> an idle CPU over the EAS search path for a `latency_sensitive` task.
>
> https://lkml.kernel.org/r/2aa4b838-c298-ec7d-08f3-caa50cc87dc2@arm.com
>
> This is Android proprietary code similar to what we have in
> find_idlest_group_cpu() in mainline.
> We talked to the Android folks last week and IMHO they are not convinced
> that they can switch this to the proposed `latency_nice->tweak
> preemption` use case.

Thanks for discussing this with Android folks. It's not always easy to
change the behavior of a product and I would be interested to discuss
this with them. Sometimes you need a PoC to get convinced
Re: [PATCH 0/6] Add latency_nice priority
Posted by Qais Yousef 4 years, 2 months ago
+CC Wei

On 03/28/22 14:56, Vincent Guittot wrote:
> Hi Dietmar,
> 
> 
> On Mon, 28 Mar 2022 at 11:24, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
> >
> > On 11/03/2022 17:14, Vincent Guittot wrote:
> > > This patchset restarts the work about adding a latency nice priority to
> > > describe the latency tolerance of cfs tasks.
> > >
> > > The patches [1-4] have been done by Parth:
> > > https://lore.kernel.org/lkml/20200228090755.22829-1-parth@linux.ibm.com/
> > >
> > > I have just rebased and moved the set of latency priority outside the
> > > priority update. I have removed the reviewed tag because the patches
> > > are 2 years old.
> > >
> > > The patches [5-6] use latency nice priority to decide if a cfs task can
> > > preempt the current running task. Patch 5 gives some tests results with
> > > cyclictests and hackbench to highlight the benefit of latency nice
> > > priority for short interactive task or long intensive tasks.
> >
> > The Android specific `latency_nice` (in Android `latency_sensitive`
> > [latency_nice < 0]) use case `Skip energy aware task placement` favors
> > an idle CPU over the EAS search path for a `latency_sensitive` task.
> >
> > https://lkml.kernel.org/r/2aa4b838-c298-ec7d-08f3-caa50cc87dc2@arm.com
> >
> > This is Android proprietary code similar to what we have in
> > find_idlest_group_cpu() in mainline.
> > We talked to the Android folks last week and IMHO they are not convinced
> > that they can switch this to the proposed `latency_nice->tweak
> > preemption` use case.
> 
> Thanks for discussing this with Android folks. It's not always easy to
> change the behavior of a product and I would be interested to discuss
> this with them. Sometimes you need a PoC to get convinced

I think it's good to clarify for me at least here whether you intend this as
a replacement for disable EAS and revert to CAS or you see this as an
additional thing? As I understood from the discussion we had on the cover
letter, this is an additional improvement and not intended to replace any of
the previous use cases we brought up before.

Wei/Quentin, any thoughts on this?

Thanks

--
Qais Yousef
Re: [PATCH 0/6] Add latency_nice priority
Posted by Vincent Guittot 4 years, 2 months ago
On Fri, 1 Apr 2022 at 14:15, Qais Yousef <qais.yousef@arm.com> wrote:
>
> +CC Wei
>
> On 03/28/22 14:56, Vincent Guittot wrote:
> > Hi Dietmar,
> >
> >
> > On Mon, 28 Mar 2022 at 11:24, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
> > >
> > > On 11/03/2022 17:14, Vincent Guittot wrote:
> > > > This patchset restarts the work about adding a latency nice priority to
> > > > describe the latency tolerance of cfs tasks.
> > > >
> > > > The patches [1-4] have been done by Parth:
> > > > https://lore.kernel.org/lkml/20200228090755.22829-1-parth@linux.ibm.com/
> > > >
> > > > I have just rebased and moved the set of latency priority outside the
> > > > priority update. I have removed the reviewed tag because the patches
> > > > are 2 years old.
> > > >
> > > > The patches [5-6] use latency nice priority to decide if a cfs task can
> > > > preempt the current running task. Patch 5 gives some tests results with
> > > > cyclictests and hackbench to highlight the benefit of latency nice
> > > > priority for short interactive task or long intensive tasks.
> > >
> > > The Android specific `latency_nice` (in Android `latency_sensitive`
> > > [latency_nice < 0]) use case `Skip energy aware task placement` favors
> > > an idle CPU over the EAS search path for a `latency_sensitive` task.
> > >
> > > https://lkml.kernel.org/r/2aa4b838-c298-ec7d-08f3-caa50cc87dc2@arm.com
> > >
> > > This is Android proprietary code similar to what we have in
> > > find_idlest_group_cpu() in mainline.
> > > We talked to the Android folks last week and IMHO they are not convinced
> > > that they can switch this to the proposed `latency_nice->tweak
> > > preemption` use case.
> >
> > Thanks for discussing this with Android folks. It's not always easy to
> > change the behavior of a product and I would be interested to discuss
> > this with them. Sometimes you need a PoC to get convinced
>
> I think it's good to clarify for me at least here whether you intend this as
> a replacement for disable EAS and revert to CAS or you see this as an
> additional thing? As I understood from the discussion we had on the cover
> letter, this is an additional improvement and not intended to replace any of
> the previous use cases we brought up before.

This is not a replacement but an additional way to improve latency.
The only thing that could happen is that this feature provides
enhancement that makes the policy of disabling EAS and revert to CAS
becoming less or no more interesting.


>
> Wei/Quentin, any thoughts on this?
>
> Thanks
>
> --
> Qais Yousef
Re: [PATCH 0/6] Add latency_nice priority
Posted by Qais Yousef 4 years, 2 months ago
On 04/02/22 10:46, Vincent Guittot wrote:
> On Fri, 1 Apr 2022 at 14:15, Qais Yousef <qais.yousef@arm.com> wrote:
> >
> > +CC Wei
> >
> > On 03/28/22 14:56, Vincent Guittot wrote:
> > > Hi Dietmar,
> > >
> > >
> > > On Mon, 28 Mar 2022 at 11:24, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
> > > >
> > > > On 11/03/2022 17:14, Vincent Guittot wrote:
> > > > > This patchset restarts the work about adding a latency nice priority to
> > > > > describe the latency tolerance of cfs tasks.
> > > > >
> > > > > The patches [1-4] have been done by Parth:
> > > > > https://lore.kernel.org/lkml/20200228090755.22829-1-parth@linux.ibm.com/
> > > > >
> > > > > I have just rebased and moved the set of latency priority outside the
> > > > > priority update. I have removed the reviewed tag because the patches
> > > > > are 2 years old.
> > > > >
> > > > > The patches [5-6] use latency nice priority to decide if a cfs task can
> > > > > preempt the current running task. Patch 5 gives some tests results with
> > > > > cyclictests and hackbench to highlight the benefit of latency nice
> > > > > priority for short interactive task or long intensive tasks.
> > > >
> > > > The Android specific `latency_nice` (in Android `latency_sensitive`
> > > > [latency_nice < 0]) use case `Skip energy aware task placement` favors
> > > > an idle CPU over the EAS search path for a `latency_sensitive` task.
> > > >
> > > > https://lkml.kernel.org/r/2aa4b838-c298-ec7d-08f3-caa50cc87dc2@arm.com
> > > >
> > > > This is Android proprietary code similar to what we have in
> > > > find_idlest_group_cpu() in mainline.
> > > > We talked to the Android folks last week and IMHO they are not convinced
> > > > that they can switch this to the proposed `latency_nice->tweak
> > > > preemption` use case.
> > >
> > > Thanks for discussing this with Android folks. It's not always easy to
> > > change the behavior of a product and I would be interested to discuss
> > > this with them. Sometimes you need a PoC to get convinced
> >
> > I think it's good to clarify for me at least here whether you intend this as
> > a replacement for disable EAS and revert to CAS or you see this as an
> > additional thing? As I understood from the discussion we had on the cover
> > letter, this is an additional improvement and not intended to replace any of
> > the previous use cases we brought up before.
> 
> This is not a replacement but an additional way to improve latency.
> The only thing that could happen is that this feature provides
> enhancement that makes the policy of disabling EAS and revert to CAS
> becoming less or no more interesting.

(Sorry for delayed response, in holiday)

Okay thanks for clarifying.

One other corner case to consider if you're working on next version is what
should happen when there are multiple tasks of the same priority on the rq. RT
scheduler will push/pull tasks to ensure the task will get to run ASAP if
there's another cpu at lower priority is available. Seems a lot of complexity
to add to CFS, but at the same time if 2 important tasks require low latency
are on the same rq, one of them will suffer without introducing the ability to
migrate one of them where it can get to run sooner.

--
Qais Yousef
Re: [PATCH 0/6] Add latency_nice priority
Posted by Steven Rostedt 4 years, 2 months ago
On Sat, 9 Apr 2022 18:08:41 +0100
Qais Yousef <qais.yousef@arm.com> wrote:

> One other corner case to consider if you're working on next version is what
> should happen when there are multiple tasks of the same priority on the rq. RT
> scheduler will push/pull tasks to ensure the task will get to run ASAP if
> there's another cpu at lower priority is available. Seems a lot of complexity
> to add to CFS, but at the same time if 2 important tasks require low latency
> are on the same rq, one of them will suffer without introducing the ability to
> migrate one of them where it can get to run sooner.

Instead of having the greedy algorithm of the RT push/pull logic, how
hard would it be to have the load balancer know of these tasks, and try
to keep them on different CPUs? When two are queued on the same CPU,
could it be possible to just trigger load balancing and let it do the
work?

-- Steve
Re: [PATCH 0/6] Add latency_nice priority
Posted by Qais Yousef 4 years, 2 months ago
On 04/09/22 13:28, Steven Rostedt wrote:
> On Sat, 9 Apr 2022 18:08:41 +0100
> Qais Yousef <qais.yousef@arm.com> wrote:
> 
> > One other corner case to consider if you're working on next version is what
> > should happen when there are multiple tasks of the same priority on the rq. RT
> > scheduler will push/pull tasks to ensure the task will get to run ASAP if
> > there's another cpu at lower priority is available. Seems a lot of complexity
> > to add to CFS, but at the same time if 2 important tasks require low latency
> > are on the same rq, one of them will suffer without introducing the ability to
> > migrate one of them where it can get to run sooner.
> 
> Instead of having the greedy algorithm of the RT push/pull logic, how
> hard would it be to have the load balancer know of these tasks, and try
> to keep them on different CPUs? When two are queued on the same CPU,

Oh yeah I didn't think we need to replicate push/pull. Load balancer will need
to know about it when it moves task so that it avoids placing two of these asks
on the same cpu.

> could it be possible to just trigger load balancing and let it do the
> work?

I think the other part will need to be at wake up when we decide the CPU.

If we trigger the load balancing instead then it'd behave like a push/pull?

All these paths are already complex though. So we need to carefully analyze the
trade-offs. Maybe we don't need to deliver such level of service after all. It
needs more thinking and experimenting.

Thanks

--
Qais Yousef
Re: [PATCH 0/6] Add latency_nice priority
Posted by Vincent Guittot 4 years, 2 months ago
On Sat, 9 Apr 2022 at 20:10, Qais Yousef <qais.yousef@arm.com> wrote:
>
> On 04/09/22 13:28, Steven Rostedt wrote:
> > On Sat, 9 Apr 2022 18:08:41 +0100
> > Qais Yousef <qais.yousef@arm.com> wrote:
> >
> > > One other corner case to consider if you're working on next version is what
> > > should happen when there are multiple tasks of the same priority on the rq. RT
> > > scheduler will push/pull tasks to ensure the task will get to run ASAP if
> > > there's another cpu at lower priority is available. Seems a lot of complexity
> > > to add to CFS, but at the same time if 2 important tasks require low latency
> > > are on the same rq, one of them will suffer without introducing the ability to
> > > migrate one of them where it can get to run sooner.
> >
> > Instead of having the greedy algorithm of the RT push/pull logic, how
> > hard would it be to have the load balancer know of these tasks, and try
> > to keep them on different CPUs? When two are queued on the same CPU,
>
> Oh yeah I didn't think we need to replicate push/pull. Load balancer will need
> to know about it when it moves task so that it avoids placing two of these asks
> on the same cpu.
>
> > could it be possible to just trigger load balancing and let it do the
> > work?
>
> I think the other part will need to be at wake up when we decide the CPU.
>
> If we trigger the load balancing instead then it'd behave like a push/pull?
>
> All these paths are already complex though. So we need to carefully analyze the
> trade-offs. Maybe we don't need to deliver such level of service after all. It
> needs more thinking and experimenting.

I will consider this for v2 but we have to take care of not adding
more latency by trying to find a better CPU. As you said this
additional behavior will need more thinking and experimenting

Vincent

>
> Thanks
>
> --
> Qais Yousef