include/linux/sched.h | 3 + include/uapi/linux/sched.h | 4 +- include/uapi/linux/sched/types.h | 19 +++++++ init/init_task.c | 1 + kernel/sched/core.c | 97 ++++++++++++++++++++++++++++++++ kernel/sched/debug.c | 1 + kernel/sched/fair.c | 92 +++++++++++++++++++++++++++++- kernel/sched/sched.h | 34 +++++++++++ tools/include/uapi/linux/sched.h | 4 +- 9 files changed, 251 insertions(+), 4 deletions(-)
This patchset restarts the work about adding a latency nice priority to
describe the latency tolerance of cfs tasks.
The patches [1-4] have been done by Parth:
https://lore.kernel.org/lkml/20200228090755.22829-1-parth@linux.ibm.com/
I have just rebased and moved the set of latency priority outside the
priority update. I have removed the reviewed tag because the patches
are 2 years old.
The patches [5-6] use latency nice priority to decide if a cfs task can
preempt the current running task. Patch 5 gives some tests results with
cyclictests and hackbench to highlight the benefit of latency nice
priority for short interactive task or long intensive tasks.
Parth Shah (4):
sched: Introduce latency-nice as a per-task attribute
sched/core: Propagate parent task's latency requirements to the child
task
sched: Allow sched_{get,set}attr to change latency_nice of the task
sched/core: Add permission checks for setting the latency_nice value
Vincent Guittot (2):
sched/fair: Take into account latency nice at wakeup
sched/fair: Add sched group latency support
include/linux/sched.h | 3 +
include/uapi/linux/sched.h | 4 +-
include/uapi/linux/sched/types.h | 19 +++++++
init/init_task.c | 1 +
kernel/sched/core.c | 97 ++++++++++++++++++++++++++++++++
kernel/sched/debug.c | 1 +
kernel/sched/fair.c | 92 +++++++++++++++++++++++++++++-
kernel/sched/sched.h | 34 +++++++++++
tools/include/uapi/linux/sched.h | 4 +-
9 files changed, 251 insertions(+), 4 deletions(-)
--
2.17.1
Hi Vincent Thanks for reviving this patchset! On 03/11/22 17:14, Vincent Guittot wrote: > This patchset restarts the work about adding a latency nice priority to > describe the latency tolerance of cfs tasks. > > The patches [1-4] have been done by Parth: > https://lore.kernel.org/lkml/20200228090755.22829-1-parth@linux.ibm.com/ > > I have just rebased and moved the set of latency priority outside the > priority update. I have removed the reviewed tag because the patches > are 2 years old. AFAIR the blocking issue we had then is on agreement on the interface. Has this been resolved now? I didn't see any further discussion since then. > > The patches [5-6] use latency nice priority to decide if a cfs task can > preempt the current running task. Patch 5 gives some tests results with > cyclictests and hackbench to highlight the benefit of latency nice > priority for short interactive task or long intensive tasks. This is a new use case AFAICT. For Android, we want to do something in EAS path to skip feec() and revert to select_idle_capacity() (prefer_idle). I think Oracle's use case was control the search depth in the LB. I am not keen on this new use case. It looks awefully similar to how nice works. And if I tweak nice values I can certainly achieve similar effects without this new addition: --::((TESTING NICE 0))::-- hackbench -l $((2560 / $group)) -g $group count mean std min ... 90% 95% 99% max group ... 1 20.0 0.69315 0.119378 0.545 ... 0.8309 0.84725 0.97265 1.004 4 20.0 0.54650 0.063123 0.434 ... 0.6363 0.64840 0.65448 0.656 8 20.0 0.51025 0.042268 0.441 ... 0.5725 0.57830 0.59806 0.603 16 20.0 0.54545 0.031041 0.483 ... 0.5824 0.58655 0.59491 0.597 hackbench -p -l $((2560 / $group)) -g $group count mean std min ... 90% 95% 99% max group ... 1 20.0 0.48135 0.036292 0.430 ... 0.5300 0.5481 0.54962 0.550 4 20.0 0.42925 0.050890 0.339 ... 0.4838 0.5094 0.51548 0.517 8 20.0 0.33655 0.049839 0.269 ... 0.4002 0.4295 0.43710 0.439 16 20.0 0.31775 0.031001 0.280 ... 0.3530 0.3639 0.39278 0.400 hackbench -l 10000 -g 16 & cyclictest --policy other -D 5 -q -H 20000 --histfile data.txt # Min Latencies: 00005 # Avg Latencies: 00342 # Max Latencies: 23562 --::((TESTING NICE -20))::-- hackbench -l $((2560 / $group)) -g $group count mean std min ... 90% 95% 99% max group ... 1 20.0 0.76990 0.126582 0.585 ... 0.9169 0.9316 1.03192 1.057 4 20.0 0.67715 0.105558 0.505 ... 0.8202 0.8581 0.85962 0.860 8 20.0 0.75715 0.061286 0.631 ... 0.8276 0.8425 0.85010 0.852 16 20.0 0.72085 0.089566 0.578 ... 0.8493 0.8818 0.92436 0.935 hackbench -p -l $((2560 / $group)) -g $group count mean std min ... 90% 95% 99% max group ... 1 20.0 0.50245 0.055636 0.388 ... 0.5839 0.60185 0.61477 0.618 4 20.0 0.56280 0.139277 0.354 ... 0.7280 0.75075 0.82295 0.841 8 20.0 0.58005 0.091819 0.412 ... 0.6969 0.71400 0.71400 0.714 16 20.0 0.52110 0.081465 0.323 ... 0.6169 0.63685 0.68017 0.691 hackbench -l 10000 -g 16 & cyclictest --policy other -D 5 -q -H 20000 --histfile data.txt # Min Latencies: 00007 # Avg Latencies: 00081 # Max Latencies: 20560 --::((TESTING NICE 19))::-- hackbench -l $((2560 / $group)) -g $group count mean std min ... 90% 95% 99% max group ... 1 20.0 0.46560 0.013694 0.448 ... 0.4782 0.49805 0.49881 0.499 4 20.0 0.43705 0.014979 0.414 ... 0.4550 0.45540 0.46148 0.463 8 20.0 0.45800 0.013471 0.436 ... 0.4754 0.47925 0.48305 0.484 16 20.0 0.53025 0.007239 0.522 ... 0.5391 0.54040 0.54648 0.548 hackbench -p -l $((2560 / $group)) -g $group count mean std min ... 90% 95% 99% max group ... 1 20.0 0.27480 0.013725 0.247 ... 0.2892 0.29125 0.29505 0.296 4 20.0 0.25095 0.011637 0.234 ... 0.2700 0.27010 0.27162 0.272 8 20.0 0.25250 0.010097 0.240 ... 0.2632 0.27415 0.27643 0.277 16 20.0 0.26700 0.007595 0.257 ... 0.2751 0.27645 0.28329 0.285 hackbench -l 10000 -g 16 & cyclictest --policy other -D 5 -q -H 20000 --histfile data.txt # Min Latencies: 00058 # Avg Latencies: 77232 # Max Latencies: 696375 For hackbench, the relationship seems to be inversed. Better nice value produces worse result. But for the cycletest, the avg goes down considerably similar to your results. Aren't we just manipulating the same thing with your new proposal or did I miss something? Can we impact preemption in isolation without having any impact on bandwidth? I am worried about how userspace can reason about the expected outcome when nice and latency_nice are combined together. Thanks -- Qais Yousef
On Tue, 22 Mar 2022 at 17:39, Qais Yousef <qais.yousef@arm.com> wrote: > > Hi Vincent > > Thanks for reviving this patchset! > > On 03/11/22 17:14, Vincent Guittot wrote: > > This patchset restarts the work about adding a latency nice priority to > > describe the latency tolerance of cfs tasks. > > > > The patches [1-4] have been done by Parth: > > https://lore.kernel.org/lkml/20200228090755.22829-1-parth@linux.ibm.com/ > > > > I have just rebased and moved the set of latency priority outside the > > priority update. I have removed the reviewed tag because the patches > > are 2 years old. > > AFAIR the blocking issue we had then is on agreement on the interface. Has this > been resolved now? I didn't see any further discussion since then. I think that there was an agreement about using a latency nice priority in the range [-20:19] with -20 meaning sensitive to latency whereas 19 means that task doesn't care about scheduling latency. The open point was about how to use this input in the scheduler with some behavior being opposed. > > > > > The patches [5-6] use latency nice priority to decide if a cfs task can > > preempt the current running task. Patch 5 gives some tests results with > > cyclictests and hackbench to highlight the benefit of latency nice > > priority for short interactive task or long intensive tasks. > > This is a new use case AFAICT. For Android, we want to do something in EAS path I don't think it's new, it's about being able to run some tasks in priority as long as they haven't use all their cpu bandwidth. ANd this has the main advantage of not being impact by finding an idle cpu. If the task is latency sensitive, it will run 1st on the cpu being idle or not. > to skip feec() and revert to select_idle_capacity() (prefer_idle). I think > Oracle's use case was control the search depth in the LB. > > I am not keen on this new use case. It looks awefully similar to how nice > works. And if I tweak nice values I can certainly achieve similar effects > without this new addition: nice is about the cpu bandwidth allocated to a task and not about scheduling task priori to another one. If you provide all cpu bandwidth to a task, it will most probably have low latency because others will always have consumed all their time slice. > > > --::((TESTING NICE 0))::-- > > hackbench -l $((2560 / $group)) -g $group > > count mean std min ... 90% 95% 99% max > group ... > 1 20.0 0.69315 0.119378 0.545 ... 0.8309 0.84725 0.97265 1.004 > 4 20.0 0.54650 0.063123 0.434 ... 0.6363 0.64840 0.65448 0.656 > 8 20.0 0.51025 0.042268 0.441 ... 0.5725 0.57830 0.59806 0.603 > 16 20.0 0.54545 0.031041 0.483 ... 0.5824 0.58655 0.59491 0.597 > > hackbench -p -l $((2560 / $group)) -g $group > > count mean std min ... 90% 95% 99% max > group ... > 1 20.0 0.48135 0.036292 0.430 ... 0.5300 0.5481 0.54962 0.550 > 4 20.0 0.42925 0.050890 0.339 ... 0.4838 0.5094 0.51548 0.517 > 8 20.0 0.33655 0.049839 0.269 ... 0.4002 0.4295 0.43710 0.439 > 16 20.0 0.31775 0.031001 0.280 ... 0.3530 0.3639 0.39278 0.400 > > hackbench -l 10000 -g 16 & > cyclictest --policy other -D 5 -q -H 20000 --histfile data.txt > > # Min Latencies: 00005 > # Avg Latencies: 00342 > # Max Latencies: 23562 > > > --::((TESTING NICE -20))::-- > > hackbench -l $((2560 / $group)) -g $group > > count mean std min ... 90% 95% 99% max > group ... > 1 20.0 0.76990 0.126582 0.585 ... 0.9169 0.9316 1.03192 1.057 > 4 20.0 0.67715 0.105558 0.505 ... 0.8202 0.8581 0.85962 0.860 > 8 20.0 0.75715 0.061286 0.631 ... 0.8276 0.8425 0.85010 0.852 > 16 20.0 0.72085 0.089566 0.578 ... 0.8493 0.8818 0.92436 0.935 > > hackbench -p -l $((2560 / $group)) -g $group > > count mean std min ... 90% 95% 99% max > group ... > 1 20.0 0.50245 0.055636 0.388 ... 0.5839 0.60185 0.61477 0.618 > 4 20.0 0.56280 0.139277 0.354 ... 0.7280 0.75075 0.82295 0.841 > 8 20.0 0.58005 0.091819 0.412 ... 0.6969 0.71400 0.71400 0.714 > 16 20.0 0.52110 0.081465 0.323 ... 0.6169 0.63685 0.68017 0.691 > > hackbench -l 10000 -g 16 & > cyclictest --policy other -D 5 -q -H 20000 --histfile data.txt > > # Min Latencies: 00007 > # Avg Latencies: 00081 > # Max Latencies: 20560 > > > --::((TESTING NICE 19))::-- > > hackbench -l $((2560 / $group)) -g $group > > count mean std min ... 90% 95% 99% max > group ... > 1 20.0 0.46560 0.013694 0.448 ... 0.4782 0.49805 0.49881 0.499 > 4 20.0 0.43705 0.014979 0.414 ... 0.4550 0.45540 0.46148 0.463 > 8 20.0 0.45800 0.013471 0.436 ... 0.4754 0.47925 0.48305 0.484 > 16 20.0 0.53025 0.007239 0.522 ... 0.5391 0.54040 0.54648 0.548 > > hackbench -p -l $((2560 / $group)) -g $group > > count mean std min ... 90% 95% 99% max > group ... > 1 20.0 0.27480 0.013725 0.247 ... 0.2892 0.29125 0.29505 0.296 > 4 20.0 0.25095 0.011637 0.234 ... 0.2700 0.27010 0.27162 0.272 > 8 20.0 0.25250 0.010097 0.240 ... 0.2632 0.27415 0.27643 0.277 > 16 20.0 0.26700 0.007595 0.257 ... 0.2751 0.27645 0.28329 0.285 > > hackbench -l 10000 -g 16 & > cyclictest --policy other -D 5 -q -H 20000 --histfile data.txt > > # Min Latencies: 00058 > # Avg Latencies: 77232 > # Max Latencies: 696375 > > For hackbench, the relationship seems to be inversed. Better nice value > produces worse result. But for the cycletest, the avg goes down considerably > similar to your results. > > Aren't we just manipulating the same thing with your new proposal or did > I miss something? Can we impact preemption in isolation without having any > impact on bandwidth? > > I am worried about how userspace can reason about the expected outcome when > nice and latency_nice are combined together. > > > Thanks > > -- > Qais Yousef
On 03/23/22 16:32, Vincent Guittot wrote: > On Tue, 22 Mar 2022 at 17:39, Qais Yousef <qais.yousef@arm.com> wrote: > > > > Hi Vincent > > > > Thanks for reviving this patchset! > > > > On 03/11/22 17:14, Vincent Guittot wrote: > > > This patchset restarts the work about adding a latency nice priority to > > > describe the latency tolerance of cfs tasks. > > > > > > The patches [1-4] have been done by Parth: > > > https://lore.kernel.org/lkml/20200228090755.22829-1-parth@linux.ibm.com/ > > > > > > I have just rebased and moved the set of latency priority outside the > > > priority update. I have removed the reviewed tag because the patches > > > are 2 years old. > > > > AFAIR the blocking issue we had then is on agreement on the interface. Has this > > been resolved now? I didn't see any further discussion since then. > > I think that there was an agreement about using a latency nice > priority in the range [-20:19] with -20 meaning sensitive to latency > whereas 19 means that task doesn't care about scheduling latency. The > open point was about how to use this input in the scheduler with some > behavior being opposed. What I remember is that the problem was to consolidate on use cases then discuss interfaces. See https://lwn.net/Articles/820659/ " Youssef said that the interface to all of this is the sticking point. Thomas Gleixner agreed, saying that the -20..19 range "requires a crystal ball" to use properly. Zijlstra repeated his call to enumerate the use cases before getting into the interface details. Giani repeated that the interface does not look correct now, and agreed that a more comprehensive look at the use cases was needed. Things were being done backwards currently, he said. " > > > > > > > > > The patches [5-6] use latency nice priority to decide if a cfs task can > > > preempt the current running task. Patch 5 gives some tests results with > > > cyclictests and hackbench to highlight the benefit of latency nice > > > priority for short interactive task or long intensive tasks. > > > > This is a new use case AFAICT. For Android, we want to do something in EAS path > > I don't think it's new, it's about being able to run some tasks in I meant new use case to latency-nice interface. I don't think we had this in any of our discussions before? I don't mind it, but it'd be good to clarify if it has any relation about the other use cases and what should happen to the other use cases. Thanks -- Qais Yousef
removed Dhaval's email which returns error On Thu, 24 Mar 2022 at 18:25, Qais Yousef <qais.yousef@arm.com> wrote: > > On 03/23/22 16:32, Vincent Guittot wrote: > > On Tue, 22 Mar 2022 at 17:39, Qais Yousef <qais.yousef@arm.com> wrote: > > > > > > Hi Vincent > > > > > > Thanks for reviving this patchset! > > > > > > On 03/11/22 17:14, Vincent Guittot wrote: > > > > This patchset restarts the work about adding a latency nice priority to > > > > describe the latency tolerance of cfs tasks. > > > > > > > > The patches [1-4] have been done by Parth: > > > > https://lore.kernel.org/lkml/20200228090755.22829-1-parth@linux.ibm.com/ > > > > > > > > I have just rebased and moved the set of latency priority outside the > > > > priority update. I have removed the reviewed tag because the patches > > > > are 2 years old. > > > > > > AFAIR the blocking issue we had then is on agreement on the interface. Has this > > > been resolved now? I didn't see any further discussion since then. > > > > I think that there was an agreement about using a latency nice > > priority in the range [-20:19] with -20 meaning sensitive to latency > > whereas 19 means that task doesn't care about scheduling latency. The > > open point was about how to use this input in the scheduler with some > > behavior being opposed. > > What I remember is that the problem was to consolidate on use cases then > discuss interfaces. > > See https://lwn.net/Articles/820659/ > > " Youssef said that the interface to all of this is the sticking > point. Thomas Gleixner agreed, saying that the -20..19 range "requires > a crystal ball" to use properly. Zijlstra repeated his call to > enumerate the use cases before getting into the interface details. > Giani repeated that the interface does not look correct now, and agreed > that a more comprehensive look at the use cases was needed. Things were > being done backwards currently, he said. " > At LPC, everybody seemed aligned with latency_nice so I assumed that there was an agreement on this interface. Latency_nice fits well with my proposal because it's all about relative comparison between the running task to the others. The current nice priority is used to set how much cpu bandwidth a task will have compared to others and the latency_nice is used in a similar way to know which one should run compared to the others. > > > > > > > > > > > > > The patches [5-6] use latency nice priority to decide if a cfs task can > > > > preempt the current running task. Patch 5 gives some tests results with > > > > cyclictests and hackbench to highlight the benefit of latency nice > > > > priority for short interactive task or long intensive tasks. > > > > > > This is a new use case AFAICT. For Android, we want to do something in EAS path > > > > I don't think it's new, it's about being able to run some tasks in > > I meant new use case to latency-nice interface. I don't think we had this in > any of our discussions before? I don't mind it, but it'd be good to clarify if > it has any relation about the other use cases and what should happen to the > other use cases. Several discussions happened about changing the preemption policy of CFS. I have Mel's example in mind with hackbench where we want to reduce the preemption capabilities for the threads and on the other side the multimedia tasks which complain about having to wait before being scheduled. All this is about preempting or not the others. And all this has been kept outside topology consideration but only for the local run queue Regards, Vincent > > > Thanks > > -- > Qais Yousef
On 03/25/22 14:27, Vincent Guittot wrote: > removed Dhaval's email which returns error > > On Thu, 24 Mar 2022 at 18:25, Qais Yousef <qais.yousef@arm.com> wrote: > > > > On 03/23/22 16:32, Vincent Guittot wrote: > > > On Tue, 22 Mar 2022 at 17:39, Qais Yousef <qais.yousef@arm.com> wrote: > > > > > > > > Hi Vincent > > > > > > > > Thanks for reviving this patchset! > > > > > > > > On 03/11/22 17:14, Vincent Guittot wrote: > > > > > This patchset restarts the work about adding a latency nice priority to > > > > > describe the latency tolerance of cfs tasks. > > > > > > > > > > The patches [1-4] have been done by Parth: > > > > > https://lore.kernel.org/lkml/20200228090755.22829-1-parth@linux.ibm.com/ > > > > > > > > > > I have just rebased and moved the set of latency priority outside the > > > > > priority update. I have removed the reviewed tag because the patches > > > > > are 2 years old. > > > > > > > > AFAIR the blocking issue we had then is on agreement on the interface. Has this > > > > been resolved now? I didn't see any further discussion since then. > > > > > > I think that there was an agreement about using a latency nice > > > priority in the range [-20:19] with -20 meaning sensitive to latency > > > whereas 19 means that task doesn't care about scheduling latency. The > > > open point was about how to use this input in the scheduler with some > > > behavior being opposed. > > > > What I remember is that the problem was to consolidate on use cases then > > discuss interfaces. > > > > See https://lwn.net/Articles/820659/ > > > > " Youssef said that the interface to all of this is the sticking > > point. Thomas Gleixner agreed, saying that the -20..19 range "requires > > a crystal ball" to use properly. Zijlstra repeated his call to > > enumerate the use cases before getting into the interface details. > > Giani repeated that the interface does not look correct now, and agreed > > that a more comprehensive look at the use cases was needed. Things were > > being done backwards currently, he said. " > > > > At LPC, everybody seemed aligned with latency_nice so I assumed that > there was an agreement on this interface. > Latency_nice fits well with my proposal because it's all about > relative comparison between the running task to the others. The > current nice priority is used to set how much cpu bandwidth a task > will have compared to others and the latency_nice is used in a similar > way to know which one should run compared to the others. I think the users were happy, but not the maintainers :-) I am still happy with it, but I just want to make sure that our use case is something we still care about having in upstream and we'd still like to use this interface to achieve that. I don't want it to be blocked based on interface not suitable. So this should be taken into consideration that this is not a replacement to at least our previous use case. The concept of latency_nice conversion to weight is something new and I don't think any of the other users requires it. So we need to keep the user visible interface detached from weight which is internal implementation detail for your use case. > > > > > > > > > > > > > > > > > > The patches [5-6] use latency nice priority to decide if a cfs task can > > > > > preempt the current running task. Patch 5 gives some tests results with > > > > > cyclictests and hackbench to highlight the benefit of latency nice > > > > > priority for short interactive task or long intensive tasks. > > > > > > > > This is a new use case AFAICT. For Android, we want to do something in EAS path > > > > > > I don't think it's new, it's about being able to run some tasks in > > > > I meant new use case to latency-nice interface. I don't think we had this in > > any of our discussions before? I don't mind it, but it'd be good to clarify if > > it has any relation about the other use cases and what should happen to the > > other use cases. > > Several discussions happened about changing the preemption policy of > CFS. I have Mel's example in mind with hackbench where we want to > reduce the preemption capabilities for the threads and on the other > side the multimedia tasks which complain about having to wait before > being scheduled. All this is about preempting or not the others. And > all this has been kept outside topology consideration but only for the > local run queue Cool. I can see its usefulness. Though I still have to convince myself that you can affect preemption without impacting bandwidth and is not a subtler way to modify nice. Thanks -- Qais Yousef
On Mon, 28 Mar 2022 at 18:27, Qais Yousef <qais.yousef@arm.com> wrote: > > On 03/25/22 14:27, Vincent Guittot wrote: > > removed Dhaval's email which returns error > > > > On Thu, 24 Mar 2022 at 18:25, Qais Yousef <qais.yousef@arm.com> wrote: > > > > > > On 03/23/22 16:32, Vincent Guittot wrote: > > > > On Tue, 22 Mar 2022 at 17:39, Qais Yousef <qais.yousef@arm.com> wrote: > > > > > > > > > > Hi Vincent > > > > > > > > > > Thanks for reviving this patchset! > > > > > > > > > > On 03/11/22 17:14, Vincent Guittot wrote: > > > > > > This patchset restarts the work about adding a latency nice priority to > > > > > > describe the latency tolerance of cfs tasks. > > > > > > > > > > > > The patches [1-4] have been done by Parth: > > > > > > https://lore.kernel.org/lkml/20200228090755.22829-1-parth@linux.ibm.com/ > > > > > > > > > > > > I have just rebased and moved the set of latency priority outside the > > > > > > priority update. I have removed the reviewed tag because the patches > > > > > > are 2 years old. > > > > > > > > > > AFAIR the blocking issue we had then is on agreement on the interface. Has this > > > > > been resolved now? I didn't see any further discussion since then. > > > > > > > > I think that there was an agreement about using a latency nice > > > > priority in the range [-20:19] with -20 meaning sensitive to latency > > > > whereas 19 means that task doesn't care about scheduling latency. The > > > > open point was about how to use this input in the scheduler with some > > > > behavior being opposed. > > > > > > What I remember is that the problem was to consolidate on use cases then > > > discuss interfaces. > > > > > > See https://lwn.net/Articles/820659/ > > > > > > " Youssef said that the interface to all of this is the sticking > > > point. Thomas Gleixner agreed, saying that the -20..19 range "requires > > > a crystal ball" to use properly. Zijlstra repeated his call to > > > enumerate the use cases before getting into the interface details. > > > Giani repeated that the interface does not look correct now, and agreed > > > that a more comprehensive look at the use cases was needed. Things were > > > being done backwards currently, he said. " > > > > > > > At LPC, everybody seemed aligned with latency_nice so I assumed that > > there was an agreement on this interface. > > Latency_nice fits well with my proposal because it's all about > > relative comparison between the running task to the others. The > > current nice priority is used to set how much cpu bandwidth a task > > will have compared to others and the latency_nice is used in a similar > > way to know which one should run compared to the others. > > I think the users were happy, but not the maintainers :-) > > I am still happy with it, but I just want to make sure that our use case is > something we still care about having in upstream and we'd still like to use > this interface to achieve that. I don't want it to be blocked based on > interface not suitable. So this should be taken into consideration that this is > not a replacement to at least our previous use case. > > The concept of latency_nice conversion to weight is something new and I don't > think any of the other users requires it. So we need to keep the user visible > interface detached from weight which is internal implementation detail for your > use case. note that the weight is only another way to describe relative priority but I will keep that in mind for the next version > > > > > > > > > > > > > > > > > > > > > > > > The patches [5-6] use latency nice priority to decide if a cfs task can > > > > > > preempt the current running task. Patch 5 gives some tests results with > > > > > > cyclictests and hackbench to highlight the benefit of latency nice > > > > > > priority for short interactive task or long intensive tasks. > > > > > > > > > > This is a new use case AFAICT. For Android, we want to do something in EAS path > > > > > > > > I don't think it's new, it's about being able to run some tasks in > > > > > > I meant new use case to latency-nice interface. I don't think we had this in > > > any of our discussions before? I don't mind it, but it'd be good to clarify if > > > it has any relation about the other use cases and what should happen to the > > > other use cases. > > > > Several discussions happened about changing the preemption policy of > > CFS. I have Mel's example in mind with hackbench where we want to > > reduce the preemption capabilities for the threads and on the other > > side the multimedia tasks which complain about having to wait before > > being scheduled. All this is about preempting or not the others. And > > all this has been kept outside topology consideration but only for the > > local run queue > > Cool. I can see its usefulness. Though I still have to convince myself that you > can affect preemption without impacting bandwidth and is not a subtler way to > modify nice. This has been one of my main goal too: to not modify cpu bandwidth Thanks Vincent > > > Thanks > > -- > Qais Yousef
On 11/03/2022 17:14, Vincent Guittot wrote: > This patchset restarts the work about adding a latency nice priority to > describe the latency tolerance of cfs tasks. > > The patches [1-4] have been done by Parth: > https://lore.kernel.org/lkml/20200228090755.22829-1-parth@linux.ibm.com/ > > I have just rebased and moved the set of latency priority outside the > priority update. I have removed the reviewed tag because the patches > are 2 years old. > > The patches [5-6] use latency nice priority to decide if a cfs task can > preempt the current running task. Patch 5 gives some tests results with > cyclictests and hackbench to highlight the benefit of latency nice > priority for short interactive task or long intensive tasks. The Android specific `latency_nice` (in Android `latency_sensitive` [latency_nice < 0]) use case `Skip energy aware task placement` favors an idle CPU over the EAS search path for a `latency_sensitive` task. https://lkml.kernel.org/r/2aa4b838-c298-ec7d-08f3-caa50cc87dc2@arm.com This is Android proprietary code similar to what we have in find_idlest_group_cpu() in mainline. We talked to the Android folks last week and IMHO they are not convinced that they can switch this to the proposed `latency_nice->tweak preemption` use case.
Hi Dietmar, On Mon, 28 Mar 2022 at 11:24, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote: > > On 11/03/2022 17:14, Vincent Guittot wrote: > > This patchset restarts the work about adding a latency nice priority to > > describe the latency tolerance of cfs tasks. > > > > The patches [1-4] have been done by Parth: > > https://lore.kernel.org/lkml/20200228090755.22829-1-parth@linux.ibm.com/ > > > > I have just rebased and moved the set of latency priority outside the > > priority update. I have removed the reviewed tag because the patches > > are 2 years old. > > > > The patches [5-6] use latency nice priority to decide if a cfs task can > > preempt the current running task. Patch 5 gives some tests results with > > cyclictests and hackbench to highlight the benefit of latency nice > > priority for short interactive task or long intensive tasks. > > The Android specific `latency_nice` (in Android `latency_sensitive` > [latency_nice < 0]) use case `Skip energy aware task placement` favors > an idle CPU over the EAS search path for a `latency_sensitive` task. > > https://lkml.kernel.org/r/2aa4b838-c298-ec7d-08f3-caa50cc87dc2@arm.com > > This is Android proprietary code similar to what we have in > find_idlest_group_cpu() in mainline. > We talked to the Android folks last week and IMHO they are not convinced > that they can switch this to the proposed `latency_nice->tweak > preemption` use case. Thanks for discussing this with Android folks. It's not always easy to change the behavior of a product and I would be interested to discuss this with them. Sometimes you need a PoC to get convinced
+CC Wei On 03/28/22 14:56, Vincent Guittot wrote: > Hi Dietmar, > > > On Mon, 28 Mar 2022 at 11:24, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote: > > > > On 11/03/2022 17:14, Vincent Guittot wrote: > > > This patchset restarts the work about adding a latency nice priority to > > > describe the latency tolerance of cfs tasks. > > > > > > The patches [1-4] have been done by Parth: > > > https://lore.kernel.org/lkml/20200228090755.22829-1-parth@linux.ibm.com/ > > > > > > I have just rebased and moved the set of latency priority outside the > > > priority update. I have removed the reviewed tag because the patches > > > are 2 years old. > > > > > > The patches [5-6] use latency nice priority to decide if a cfs task can > > > preempt the current running task. Patch 5 gives some tests results with > > > cyclictests and hackbench to highlight the benefit of latency nice > > > priority for short interactive task or long intensive tasks. > > > > The Android specific `latency_nice` (in Android `latency_sensitive` > > [latency_nice < 0]) use case `Skip energy aware task placement` favors > > an idle CPU over the EAS search path for a `latency_sensitive` task. > > > > https://lkml.kernel.org/r/2aa4b838-c298-ec7d-08f3-caa50cc87dc2@arm.com > > > > This is Android proprietary code similar to what we have in > > find_idlest_group_cpu() in mainline. > > We talked to the Android folks last week and IMHO they are not convinced > > that they can switch this to the proposed `latency_nice->tweak > > preemption` use case. > > Thanks for discussing this with Android folks. It's not always easy to > change the behavior of a product and I would be interested to discuss > this with them. Sometimes you need a PoC to get convinced I think it's good to clarify for me at least here whether you intend this as a replacement for disable EAS and revert to CAS or you see this as an additional thing? As I understood from the discussion we had on the cover letter, this is an additional improvement and not intended to replace any of the previous use cases we brought up before. Wei/Quentin, any thoughts on this? Thanks -- Qais Yousef
On Fri, 1 Apr 2022 at 14:15, Qais Yousef <qais.yousef@arm.com> wrote: > > +CC Wei > > On 03/28/22 14:56, Vincent Guittot wrote: > > Hi Dietmar, > > > > > > On Mon, 28 Mar 2022 at 11:24, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote: > > > > > > On 11/03/2022 17:14, Vincent Guittot wrote: > > > > This patchset restarts the work about adding a latency nice priority to > > > > describe the latency tolerance of cfs tasks. > > > > > > > > The patches [1-4] have been done by Parth: > > > > https://lore.kernel.org/lkml/20200228090755.22829-1-parth@linux.ibm.com/ > > > > > > > > I have just rebased and moved the set of latency priority outside the > > > > priority update. I have removed the reviewed tag because the patches > > > > are 2 years old. > > > > > > > > The patches [5-6] use latency nice priority to decide if a cfs task can > > > > preempt the current running task. Patch 5 gives some tests results with > > > > cyclictests and hackbench to highlight the benefit of latency nice > > > > priority for short interactive task or long intensive tasks. > > > > > > The Android specific `latency_nice` (in Android `latency_sensitive` > > > [latency_nice < 0]) use case `Skip energy aware task placement` favors > > > an idle CPU over the EAS search path for a `latency_sensitive` task. > > > > > > https://lkml.kernel.org/r/2aa4b838-c298-ec7d-08f3-caa50cc87dc2@arm.com > > > > > > This is Android proprietary code similar to what we have in > > > find_idlest_group_cpu() in mainline. > > > We talked to the Android folks last week and IMHO they are not convinced > > > that they can switch this to the proposed `latency_nice->tweak > > > preemption` use case. > > > > Thanks for discussing this with Android folks. It's not always easy to > > change the behavior of a product and I would be interested to discuss > > this with them. Sometimes you need a PoC to get convinced > > I think it's good to clarify for me at least here whether you intend this as > a replacement for disable EAS and revert to CAS or you see this as an > additional thing? As I understood from the discussion we had on the cover > letter, this is an additional improvement and not intended to replace any of > the previous use cases we brought up before. This is not a replacement but an additional way to improve latency. The only thing that could happen is that this feature provides enhancement that makes the policy of disabling EAS and revert to CAS becoming less or no more interesting. > > Wei/Quentin, any thoughts on this? > > Thanks > > -- > Qais Yousef
On 04/02/22 10:46, Vincent Guittot wrote: > On Fri, 1 Apr 2022 at 14:15, Qais Yousef <qais.yousef@arm.com> wrote: > > > > +CC Wei > > > > On 03/28/22 14:56, Vincent Guittot wrote: > > > Hi Dietmar, > > > > > > > > > On Mon, 28 Mar 2022 at 11:24, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote: > > > > > > > > On 11/03/2022 17:14, Vincent Guittot wrote: > > > > > This patchset restarts the work about adding a latency nice priority to > > > > > describe the latency tolerance of cfs tasks. > > > > > > > > > > The patches [1-4] have been done by Parth: > > > > > https://lore.kernel.org/lkml/20200228090755.22829-1-parth@linux.ibm.com/ > > > > > > > > > > I have just rebased and moved the set of latency priority outside the > > > > > priority update. I have removed the reviewed tag because the patches > > > > > are 2 years old. > > > > > > > > > > The patches [5-6] use latency nice priority to decide if a cfs task can > > > > > preempt the current running task. Patch 5 gives some tests results with > > > > > cyclictests and hackbench to highlight the benefit of latency nice > > > > > priority for short interactive task or long intensive tasks. > > > > > > > > The Android specific `latency_nice` (in Android `latency_sensitive` > > > > [latency_nice < 0]) use case `Skip energy aware task placement` favors > > > > an idle CPU over the EAS search path for a `latency_sensitive` task. > > > > > > > > https://lkml.kernel.org/r/2aa4b838-c298-ec7d-08f3-caa50cc87dc2@arm.com > > > > > > > > This is Android proprietary code similar to what we have in > > > > find_idlest_group_cpu() in mainline. > > > > We talked to the Android folks last week and IMHO they are not convinced > > > > that they can switch this to the proposed `latency_nice->tweak > > > > preemption` use case. > > > > > > Thanks for discussing this with Android folks. It's not always easy to > > > change the behavior of a product and I would be interested to discuss > > > this with them. Sometimes you need a PoC to get convinced > > > > I think it's good to clarify for me at least here whether you intend this as > > a replacement for disable EAS and revert to CAS or you see this as an > > additional thing? As I understood from the discussion we had on the cover > > letter, this is an additional improvement and not intended to replace any of > > the previous use cases we brought up before. > > This is not a replacement but an additional way to improve latency. > The only thing that could happen is that this feature provides > enhancement that makes the policy of disabling EAS and revert to CAS > becoming less or no more interesting. (Sorry for delayed response, in holiday) Okay thanks for clarifying. One other corner case to consider if you're working on next version is what should happen when there are multiple tasks of the same priority on the rq. RT scheduler will push/pull tasks to ensure the task will get to run ASAP if there's another cpu at lower priority is available. Seems a lot of complexity to add to CFS, but at the same time if 2 important tasks require low latency are on the same rq, one of them will suffer without introducing the ability to migrate one of them where it can get to run sooner. -- Qais Yousef
On Sat, 9 Apr 2022 18:08:41 +0100 Qais Yousef <qais.yousef@arm.com> wrote: > One other corner case to consider if you're working on next version is what > should happen when there are multiple tasks of the same priority on the rq. RT > scheduler will push/pull tasks to ensure the task will get to run ASAP if > there's another cpu at lower priority is available. Seems a lot of complexity > to add to CFS, but at the same time if 2 important tasks require low latency > are on the same rq, one of them will suffer without introducing the ability to > migrate one of them where it can get to run sooner. Instead of having the greedy algorithm of the RT push/pull logic, how hard would it be to have the load balancer know of these tasks, and try to keep them on different CPUs? When two are queued on the same CPU, could it be possible to just trigger load balancing and let it do the work? -- Steve
On 04/09/22 13:28, Steven Rostedt wrote: > On Sat, 9 Apr 2022 18:08:41 +0100 > Qais Yousef <qais.yousef@arm.com> wrote: > > > One other corner case to consider if you're working on next version is what > > should happen when there are multiple tasks of the same priority on the rq. RT > > scheduler will push/pull tasks to ensure the task will get to run ASAP if > > there's another cpu at lower priority is available. Seems a lot of complexity > > to add to CFS, but at the same time if 2 important tasks require low latency > > are on the same rq, one of them will suffer without introducing the ability to > > migrate one of them where it can get to run sooner. > > Instead of having the greedy algorithm of the RT push/pull logic, how > hard would it be to have the load balancer know of these tasks, and try > to keep them on different CPUs? When two are queued on the same CPU, Oh yeah I didn't think we need to replicate push/pull. Load balancer will need to know about it when it moves task so that it avoids placing two of these asks on the same cpu. > could it be possible to just trigger load balancing and let it do the > work? I think the other part will need to be at wake up when we decide the CPU. If we trigger the load balancing instead then it'd behave like a push/pull? All these paths are already complex though. So we need to carefully analyze the trade-offs. Maybe we don't need to deliver such level of service after all. It needs more thinking and experimenting. Thanks -- Qais Yousef
On Sat, 9 Apr 2022 at 20:10, Qais Yousef <qais.yousef@arm.com> wrote: > > On 04/09/22 13:28, Steven Rostedt wrote: > > On Sat, 9 Apr 2022 18:08:41 +0100 > > Qais Yousef <qais.yousef@arm.com> wrote: > > > > > One other corner case to consider if you're working on next version is what > > > should happen when there are multiple tasks of the same priority on the rq. RT > > > scheduler will push/pull tasks to ensure the task will get to run ASAP if > > > there's another cpu at lower priority is available. Seems a lot of complexity > > > to add to CFS, but at the same time if 2 important tasks require low latency > > > are on the same rq, one of them will suffer without introducing the ability to > > > migrate one of them where it can get to run sooner. > > > > Instead of having the greedy algorithm of the RT push/pull logic, how > > hard would it be to have the load balancer know of these tasks, and try > > to keep them on different CPUs? When two are queued on the same CPU, > > Oh yeah I didn't think we need to replicate push/pull. Load balancer will need > to know about it when it moves task so that it avoids placing two of these asks > on the same cpu. > > > could it be possible to just trigger load balancing and let it do the > > work? > > I think the other part will need to be at wake up when we decide the CPU. > > If we trigger the load balancing instead then it'd behave like a push/pull? > > All these paths are already complex though. So we need to carefully analyze the > trade-offs. Maybe we don't need to deliver such level of service after all. It > needs more thinking and experimenting. I will consider this for v2 but we have to take care of not adding more latency by trying to find a better CPU. As you said this additional behavior will need more thinking and experimenting Vincent > > Thanks > > -- > Qais Yousef
© 2016 - 2026 Red Hat, Inc.