On 2022.11.26 08:26 Rui wrote:
> On Wed, 2022-11-23 at 20:08 -0800, Doug Smythies wrote:
>> On 2022.11.21 04:23 Kajetan Puchalski wrote:
>>> On Wed, Nov 02, 2022 at 03:28:06PM +0000, Kajetan Puchalski wrote:
>>>
>>> [...]
>>>
>>>> v3 -> v4:
>>>> - remove the chunk of code skipping metrics updates when the CPU
>>>>   was utilized
>>>> - include new test results and more benchmarks in the cover letter
>>>
>>> [...]
>>>
>>> It's been some time so I just wanted to bump this, what do you think
>>> about this v4? Doug has already tested it, results for his machine
>>> are attached to the v3 thread.
>>
>> Hi All,
>>
>> I continued to test this and included the proposed ladder idle
>> governor in my continued testing.
>> (Which is why I added Rui as an addressee)
>
> Hi, Doug,

Hi Rui,

> Really appreciated your testing data on this.
> I have some dumb questions and I need your help so that I can better
> understand some of the graphs. :)
>
>> However, I ran out of time. Here is what I have:
>>
>> Kernel: 6.1-rc3 and with patch sets
>> Processor: Intel(R) Core(TM) i5-10600K CPU @ 4.10GHz
>> CPU scaling driver: intel_cpufreq
>> HWP disabled.
>> Unless otherwise stated, performance CPU scaling governor.
>>
>> Legend:
>> teo: the current teo idle governor
>> util-v4: the RFC utilization teo patch set version 4.
>> menu: the menu idle governor
>> ladder-old: the current ladder idle governor
>> ladder: the RFC ladder patch set.
>>
>> Workflow: shell-intensive serialized workloads.
>> Variable: PIDs per second.
>> Note: Single threaded.
>> Master reference: forced CPU affinity to 1 CPU.

This is the 1cpu on the graph.

>> Performance Results:
>> http://smythies.com/~doug/linux/idle/teo-util/graphs/pids-perf.png
>> Schedutil Results:
>> http://smythies.com/~doug/linux/idle/teo-util/graphs/pids-su.png
>
> what does 1cpu mean?

For a shell-intensive serialized workflow, i.e.:

Dountil the list of tasks is finished:
   Start the next task in the list of stuff to do (with a new PID).
   Wait for it to finish.
Enduntil

We know it represents a challenge for CPU frequency scaling drivers,
schedulers, and therefore idle drivers.

We also know that the best performance is achieved by overriding
the scheduler and forcing CPU affinity. I use this "best" case as the
master reference, using the label 1cpu on the graph.

>> Workflow: sleeping ebizzy 128 threads.
>> Variable: interval (uSecs).
>> Performance Results:
>> http://smythies.com/~doug/linux/idle/teo-util/graphs/ebizzy-128-perf.png
>> Performance power and idle data:
>> http://smythies.com/~doug/linux/idle/teo-util/ebizzy/perf/
>
> for the "Idle state 0/1/2/3 was too deep" graphs, may I know how you
> assert that an idle state is too deep/shallow?

I get those stats directly from the kernel driver statistics. For example:

$ grep . /sys/devices/system/cpu/cpu4/cpuidle/state*/above
/sys/devices/system/cpu/cpu4/cpuidle/state0/above:0
/sys/devices/system/cpu/cpu4/cpuidle/state1/above:38085
/sys/devices/system/cpu/cpu4/cpuidle/state2/above:7668
/sys/devices/system/cpu/cpu4/cpuidle/state3/above:6823

$ grep . /sys/devices/system/cpu/cpu4/cpuidle/state*/below
/sys/devices/system/cpu/cpu4/cpuidle/state0/below:72059
/sys/devices/system/cpu/cpu4/cpuidle/state1/below:246573
/sys/devices/system/cpu/cpu4/cpuidle/state2/below:7817
/sys/devices/system/cpu/cpu4/cpuidle/state3/below:0

I keep track of the changes per sample interval and graph
the sum for all CPUs as a percentage of the usage of
that idle state.

Because I can never remember what "above" and "below"
actually mean, I use the terms "was too shallow"
and "was too deep".

... Doug
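For reference, a minimal sketch of the kind of serialized, shell-intensive workload Doug describes above, together with the forced-affinity "1cpu" reference case. The iteration count, the use of /bin/true as the per-task stand-in, and the taskset invocation are illustrative assumptions, not the actual test scripts:

#!/bin/bash
# Serialized workload: each "task" is a new process (new PID), and the
# next task only starts after the previous one has finished.
run_tasks() {
    for ((i = 0; i < 10000; i++)); do
        /bin/true    # stand-in for one short task from the list
    done
}

# Normal case: let the scheduler decide where each short-lived task runs.
time run_tasks

# "1cpu" reference case: override the scheduler and force the whole
# serialized run onto one CPU (CPU 3 is an arbitrary choice); this is
# the forced-affinity master reference labelled 1cpu on the graphs.
time taskset -c 3 bash -c '
    for ((i = 0; i < 10000; i++)); do
        /bin/true
    done
'

Dividing the iteration count by the elapsed time gives the PIDs-per-second figure used as the variable on the graphs.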
On Sat, 2022-11-26 at 13:56 -0800, Doug Smythies wrote:
> On 2022.11.26 08:26 Rui wrote:
> > On Wed, 2022-11-23 at 20:08 -0800, Doug Smythies wrote:
> > > On 2022.11.21 04:23 Kajetan Puchalski wrote:
> > > > On Wed, Nov 02, 2022 at 03:28:06PM +0000, Kajetan Puchalski wrote:
> > > >
> > > > [...]
> > > >
> > > > > v3 -> v4:
> > > > > - remove the chunk of code skipping metrics updates when the CPU
> > > > >   was utilized
> > > > > - include new test results and more benchmarks in the cover letter
> > > >
> > > > [...]
> > > >
> > > > It's been some time so I just wanted to bump this, what do you think
> > > > about this v4? Doug has already tested it, results for his machine
> > > > are attached to the v3 thread.
> > >
> > > Hi All,
> > >
> > > I continued to test this and included the proposed ladder idle
> > > governor in my continued testing.
> > > (Which is why I added Rui as an addressee)
> >
> > Hi, Doug,
>
> Hi Rui,
>
> > Really appreciated your testing data on this.
> > I have some dumb questions and I need your help so that I can better
> > understand some of the graphs. :)
> >
> > > However, I ran out of time. Here is what I have:
> > >
> > > Kernel: 6.1-rc3 and with patch sets
> > > Processor: Intel(R) Core(TM) i5-10600K CPU @ 4.10GHz
> > > CPU scaling driver: intel_cpufreq
> > > HWP disabled.
> > > Unless otherwise stated, performance CPU scaling governor.
> > >
> > > Legend:
> > > teo: the current teo idle governor
> > > util-v4: the RFC utilization teo patch set version 4.
> > > menu: the menu idle governor
> > > ladder-old: the current ladder idle governor
> > > ladder: the RFC ladder patch set.
> > >
> > > Workflow: shell-intensive serialized workloads.
> > > Variable: PIDs per second.
> > > Note: Single threaded.
> > > Master reference: forced CPU affinity to 1 CPU.
>
> This is the 1cpu on the graph.
>
> > > Performance Results:
> > > http://smythies.com/~doug/linux/idle/teo-util/graphs/pids-perf.png
> > > Schedutil Results:
> > > http://smythies.com/~doug/linux/idle/teo-util/graphs/pids-su.png
> >
> > what does 1cpu mean?
>
> For a shell-intensive serialized workflow, i.e.:
>
> Dountil the list of tasks is finished:
>    Start the next task in the list of stuff to do (with a new PID).
>    Wait for it to finish.
> Enduntil
>
> We know it represents a challenge for CPU frequency scaling drivers,
> schedulers, and therefore idle drivers.
>
> We also know that the best performance is achieved by overriding
> the scheduler and forcing CPU affinity. I use this "best" case as the
> master reference, using the label 1cpu on the graph.

Got it.

> > > Workflow: sleeping ebizzy 128 threads.
> > > Variable: interval (uSecs).
> > > Performance Results:
> > > http://smythies.com/~doug/linux/idle/teo-util/graphs/ebizzy-128-perf.png
> > > Performance power and idle data:
> > > http://smythies.com/~doug/linux/idle/teo-util/ebizzy/perf/
> >
> > for the "Idle state 0/1/2/3 was too deep" graphs, may I know how you
> > assert that an idle state is too deep/shallow?
>
> I get those stats directly from the kernel driver statistics. For example:
>
> $ grep . /sys/devices/system/cpu/cpu4/cpuidle/state*/above
> /sys/devices/system/cpu/cpu4/cpuidle/state0/above:0
> /sys/devices/system/cpu/cpu4/cpuidle/state1/above:38085
> /sys/devices/system/cpu/cpu4/cpuidle/state2/above:7668
> /sys/devices/system/cpu/cpu4/cpuidle/state3/above:6823
>
> $ grep . /sys/devices/system/cpu/cpu4/cpuidle/state*/below
> /sys/devices/system/cpu/cpu4/cpuidle/state0/below:72059
> /sys/devices/system/cpu/cpu4/cpuidle/state1/below:246573
> /sys/devices/system/cpu/cpu4/cpuidle/state2/below:7817
> /sys/devices/system/cpu/cpu4/cpuidle/state3/below:0
>
> I keep track of the changes per sample interval and graph
> the sum for all CPUs as a percentage of the usage of
> that idle state.
>
> Because I can never remember what "above" and "below"
> actually mean, I use the terms "was too shallow"
> and "was too deep".

I just checked the code. My understanding is that:

"above" means the previous idle state residency was too short, and a
shallower state would have been a better match.

"below" means the previous idle state residency was too long, and a
deeper state would have been a better match.

So probably "above" means "should be shallower" or "was too deep", and
"below" means "should be deeper" or "was too shallow"?

thanks,
rui
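For illustration, a rough sketch of the bookkeeping Doug describes: snapshot the per-CPU above/below/usage counters, wait one sample interval, snapshot again, and report the summed deltas for each idle state as a percentage of that state's usage over the interval. The sysfs files are the ones shown in the thread; the interval length and output format are assumptions:

#!/bin/sh
# Per Rui's reading of the counters:
#   above delta -> the chosen state was too deep for the actual sleep
#   below delta -> the chosen state was too shallow for the actual sleep
INTERVAL=60    # seconds per sample interval (arbitrary choice)

snapshot() {
    # One line per cpuN/stateM: "stateM above below usage"
    for d in /sys/devices/system/cpu/cpu[0-9]*/cpuidle/state[0-9]*; do
        echo "$(basename "$d") $(cat "$d/above") $(cat "$d/below") $(cat "$d/usage")"
    done
}

snapshot > /tmp/idle.before
sleep "$INTERVAL"
snapshot > /tmp/idle.after

# Sum the per-interval deltas over all CPUs and express them as a
# percentage of each idle state's usage during the interval.
paste /tmp/idle.before /tmp/idle.after | awk '{
    above[$1] += $6 - $2; below[$1] += $7 - $3; usage[$1] += $8 - $4
} END {
    for (s in usage) if (usage[s] > 0)
        printf "%s: was too deep %.2f%%, was too shallow %.2f%%\n", s, 100 * above[s] / usage[s], 100 * below[s] / usage[s]
}'

This is only a sketch of the idea; the graphs in the thread are produced from the same counters, sampled repeatedly over the length of each test run.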