[PATCH v3] sched/fair: Forfeit vruntime on yield

Posted by Fernand Sieber 1 week, 6 days ago
If a task yields, the scheduler may decide to pick it again. The task in
turn may decide to yield immediately or shortly after, leading to a tight
loop of yields.

If there's another runnable task at that point, the deadline will be
increased by the slice at each loop iteration. This can cause the deadline
to run away pretty quickly, and cause elevated run delays later on as the
task doesn't get picked again. The reason the scheduler can pick the same
task again and again despite its deadline increasing is that it may be the
only eligible task at that point.
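
For illustration only (an editorial sketch, not part of the original
submission), the pattern above can be reproduced from userspace by pinning
two tasks to one CPU and having one of them call sched_yield() in a tight
loop; the CPU number and loop structure below are arbitrary:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical reproducer sketch: with the previous behavior, every
 * yield pushed the yielder's deadline one slice further out even
 * though it kept being picked (as the only eligible task), so its
 * deadline could run away.
 */
static void pin_to_cpu(int cpu)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	if (sched_setaffinity(0, sizeof(set), &set))
		perror("sched_setaffinity");
}

int main(void)
{
	pin_to_cpu(0);			/* both tasks inherit this affinity */

	if (fork() == 0)
		for (;;)		/* competing runnable task */
			;

	for (;;)			/* yielding task */
		sched_yield();
}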

Fix this by making the task forfeit its remaining vruntime and pushing the
deadline one slice ahead. This implements yield behavior more faithfully.

We limit the forfeiting to eligible tasks because core scheduling prefers
running an ineligible task over force idling. Without this condition, we can
end up in a yield loop which makes the vruntime increase rapidly, leading to
anomalous run delays later down the line.
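
As an editorial aside (not part of the patch): "eligible" here is the EEVDF
notion of non-negative lag, i.e. an entity is eligible when its vruntime has
not run ahead of the queue's weighted average vruntime. The real check is
entity_eligible()/vruntime_eligible() in kernel/sched/fair.c, which works on
offsets from cfs_rq->min_vruntime and load-weights the average; a simplified
standalone sketch of the idea:

#include <stdbool.h>
#include <stdint.h>

/*
 * Simplified sketch only: an entity whose vruntime is at or behind the
 * queue's weighted average (non-negative lag) is eligible; one that has
 * already run ahead of the average is not.
 */
bool sketch_eligible(int64_t avg_vruntime, int64_t se_vruntime)
{
	return se_vruntime <= avg_vruntime;
}

Under core scheduling an ineligible task may still be picked rather than
force idling a sibling, which is why the forfeit is restricted to the
eligible case.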

Fixes: 147f3efaa24182 ("sched/fair: Implement an EEVDF-like scheduling policy")
Link: https://lore.kernel.org/r/20250401123622.584018-1-sieberf@amazon.com
Link: https://lore.kernel.org/r/20250911095113.203439-1-sieberf@amazon.com
Signed-off-by: Fernand Sieber <sieberf@amazon.com>

Changes in v2:
- Implement vruntime forfeiting approach suggested by Peter Zijlstra
- Updated commit name
- Previous Reviewed-by tags removed due to algorithm change

Changes in v3:
- Only increase vruntime for eligible tasks to avoid runaway vruntime with
  core scheduling

Link: https://lore.kernel.org/r/20250916140228.452231-1-sieberf@amazon.com
---
 kernel/sched/fair.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index b173a059315c..46e5a976f402 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8921,7 +8921,19 @@ static void yield_task_fair(struct rq *rq)
 	 */
 	rq_clock_skip_update(rq);
 
-	se->deadline += calc_delta_fair(se->slice, se);
+	/*
+	 * Forfeit the remaining vruntime, only if the entity is eligible. This
+	 * condition is necessary because in core scheduling we prefer to run
+	 * ineligible tasks rather than force idling. If this happens we may
+	 * end up in a loop where the core scheduler picks the yielding task,
+	 * which yields immediately again; without the condition the vruntime
+	 * ends up quickly running away.
+	 */
+	if (entity_eligible(cfs_rq, se)) {
+		se->vruntime = se->deadline;
+		se->deadline += calc_delta_fair(se->slice, se);
+		update_min_vruntime(cfs_rq);
+	}
 }
 
 static bool yield_to_task_fair(struct rq *rq, struct task_struct *p)
-- 
2.34.1




Re: [PATCH v3] sched/fair: Forfeit vruntime on yield
Posted by kernel test robot 6 days, 2 hours ago
Hello,


We reported "a 55.9% improvement of stress-ng.wait.ops_per_sec"
in https://lore.kernel.org/all/202509241501.f14b210a-lkp@intel.com/

Now we noticed there is also a regression in our tests, so we are reporting
it again FYI.

One thing we want to mention is that "stress-ng.sockpair.MB_written_per_sec"
is in the "miscellaneous metrics" of this stress-ng test. For the major
metric, "stress-ng.sockpair.ops_per_sec", there is only a small difference.

0d4eaf8caf8cd633 15bf8c7b35e31295b26241425c0 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    551.38           -90.5%      52.18        stress-ng.sockpair.MB_written_per_sec
    781743            -2.3%     764106        stress-ng.sockpair.ops_per_sec
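
For reference, a rough sketch (an editorial illustration, not the stress-ng
source) of the kind of socketpair writer/reader workload behind the
MB-written figure above; the buffer size and loop count are arbitrary:

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/wait.h>

int main(void)
{
	int sv[2];
	char buf[4096];

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv)) {
		perror("socketpair");
		return 1;
	}
	memset(buf, 0xa5, sizeof(buf));

	if (fork() == 0) {
		/* Reader: drain one end until the writer closes it. */
		close(sv[0]);
		while (read(sv[1], buf, sizeof(buf)) > 0)
			;
		_exit(0);
	}

	/*
	 * Writer: push data as fast as possible; bytes written per
	 * second is the kind of figure the regressed metric reports.
	 */
	close(sv[1]);
	for (int i = 0; i < 100000; i++)
		if (write(sv[0], buf, sizeof(buf)) < 0)
			break;
	close(sv[0]);
	wait(NULL);
	return 0;
}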


Below is a test example for 15bf8c7b35:

2025-09-25 15:48:21 stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --oom-avoid --sockpair 192
stress-ng: info:  [8371] setting to a 1 min run per stressor
stress-ng: info:  [8371] dispatching hogs: 192 sockpair
stress-ng: info:  [8371] note: /proc/sys/kernel/sched_autogroup_enabled is 1 and this can impact scheduling throughput for processes not attached to a tty. Setting this to 0 may improve performance metrics
stress-ng: metrc: [8371] stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s CPU used per       RSS Max
stress-ng: metrc: [8371]                           (secs)    (secs)    (secs)   (real time) (usr+sys time) instance (%)          (KB)
stress-ng: metrc: [8371] sockpair       49874197     65.44     72.08  12219.54    762108.28        4057.58        97.82          3132
stress-ng: metrc: [8371] miscellaneous metrics:
stress-ng: metrc: [8371] sockpair           27717.04 socketpair calls sec (harmonic mean of 192 instances)
stress-ng: metrc: [8371] sockpair              53.01 MB written per sec (harmonic mean of 192 instances)
stress-ng: info:  [8371] for a 66.13s run time:
stress-ng: info:  [8371]   12696.46s available CPU time
stress-ng: info:  [8371]      72.07s user time   (  0.57%)
stress-ng: info:  [8371]   12219.63s system time ( 96.24%)
stress-ng: info:  [8371]   12291.70s total time  ( 96.81%)
stress-ng: info:  [8371] load average: 190.99 57.46 19.94
stress-ng: info:  [8371] skipped: 0
stress-ng: info:  [8371] passed: 192: sockpair (192)
stress-ng: info:  [8371] failed: 0
stress-ng: info:  [8371] metrics untrustworthy: 0
stress-ng: info:  [8371] successful run completed in 1 min, 6.13 secs


Below is an example from 0d4eaf8caf:

2025-09-25 18:04:37 stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --oom-avoid --sockpair 192
stress-ng: info:  [8360] setting to a 1 min run per stressor
stress-ng: info:  [8360] dispatching hogs: 192 sockpair
stress-ng: info:  [8360] note: /proc/sys/kernel/sched_autogroup_enabled is 1 and this can impact scheduling throughput for processes not attached to a tty. Setting this to 0 may improve performance metrics
stress-ng: metrc: [8360] stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s CPU used per       RSS Max
stress-ng: metrc: [8360]                           (secs)    (secs)    (secs)   (real time) (usr+sys time) instance (%)          (KB)
stress-ng: metrc: [8360] sockpair       51705787     65.08     56.75  12254.39    794448.25        4199.92        98.52          5160
stress-ng: metrc: [8360] miscellaneous metrics:
stress-ng: metrc: [8360] sockpair           28156.62 socketpair calls sec (harmonic mean of 192 instances)
stress-ng: metrc: [8360] sockpair             562.18 MB written per sec (harmonic mean of 192 instances)
stress-ng: info:  [8360] for a 65.40s run time:
stress-ng: info:  [8360]   12556.08s available CPU time
stress-ng: info:  [8360]      56.75s user time   (  0.45%)
stress-ng: info:  [8360]   12254.48s system time ( 97.60%)
stress-ng: info:  [8360]   12311.23s total time  ( 98.05%)
stress-ng: info:  [8360] load average: 239.81 72.31 25.10
stress-ng: info:  [8360] skipped: 0
stress-ng: info:  [8360] passed: 192: sockpair (192)
stress-ng: info:  [8360] failed: 0
stress-ng: info:  [8360] metrics untrustworthy: 0
stress-ng: info:  [8360] successful run completed in 1 min, 5.40 secs


Below is the full report.


kernel test robot noticed a 90.5% regression of stress-ng.sockpair.MB_written_per_sec on:


commit: 15bf8c7b35e31295b26241425c0a61102e92109f ("[PATCH v3] sched/fair: Forfeit vruntime on yield")
url: https://github.com/intel-lab-lkp/linux/commits/Fernand-Sieber/sched-fair-Forfeit-vruntime-on-yield/20250918-231320
base: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git 0d4eaf8caf8cd633b23e949e2996b420052c2d45
patch link: https://lore.kernel.org/all/20250918150528.292620-1-sieberf@amazon.com/
patch subject: [PATCH v3] sched/fair: Forfeit vruntime on yield

testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 192 threads 2 sockets Intel(R) Xeon(R) 6740E  CPU @ 2.4GHz (Sierra Forest) with 256G memory
parameters:

	nr_threads: 100%
	testtime: 60s
	test: sockpair
	cpufreq_governor: performance



If you fix the issue in a separate patch/commit (i.e. not just a new version
of the same patch/commit), kindly add the following tags:
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202509261113.a87577ce-lkp@intel.com


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250926/202509261113.a87577ce-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-14/performance/x86_64-rhel-9.4/100%/debian-13-x86_64-20250902.cgz/lkp-srf-2sp3/sockpair/stress-ng/60s

commit: 
  0d4eaf8caf ("sched/fair: Do not balance task to a throttled cfs_rq")
  15bf8c7b35 ("sched/fair: Forfeit vruntime on yield")

0d4eaf8caf8cd633 15bf8c7b35e31295b26241425c0 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      0.78 ±  2%      +0.2        1.02        mpstat.cpu.all.usr%
     19.57           -36.8%      12.36 ± 70%  turbostat.RAMWatt
 4.073e+08 ±  6%     +23.1%  5.013e+08 ±  5%  cpuidle..time
    266261 ±  9%     +46.4%     389733 ±  9%  cpuidle..usage
    451887 ± 77%    +160.9%    1178929 ± 33%  numa-vmstat.node0.nr_file_pages
    192819 ± 30%    +101.3%     388191 ± 43%  numa-vmstat.node1.nr_shmem
   1807416 ± 77%    +161.0%    4716665 ± 33%  numa-meminfo.node0.FilePages
   8980121            -9.0%    8174177        numa-meminfo.node0.SUnreclaim
  25356157 ±  8%     -22.0%   19772595 ±  9%  numa-meminfo.node1.MemUsed
    771480 ± 30%    +101.4%    1553932 ± 43%  numa-meminfo.node1.Shmem
    551.38           -90.5%      52.18        stress-ng.sockpair.MB_written_per_sec
  51092272            -2.2%   49968621        stress-ng.sockpair.ops
    781743            -2.3%     764106        stress-ng.sockpair.ops_per_sec
  21418332 ±  4%     +69.2%   36232510        stress-ng.time.involuntary_context_switches
     56.36           +27.4%      71.81        stress-ng.time.user_time
    150809 ± 21%  +17217.1%   26115838 ±  3%  stress-ng.time.voluntary_context_switches
   2165914 ±  7%     +92.3%    4165197 ±  4%  meminfo.Active
   2165898 ±  7%     +92.3%    4165181 ±  4%  meminfo.Active(anon)
   4926568           +39.6%    6875228        meminfo.Cached
   6826363           +28.1%    8744371        meminfo.Committed_AS
    513281 ±  8%     +98.7%    1019681 ±  6%  meminfo.Mapped
  48472806 ±  2%     -14.8%   41314088        meminfo.Memused
   1276164          +152.7%    3224818 ±  3%  meminfo.Shmem
  53022761 ±  2%     -15.7%   44672632        meminfo.max_used_kB
      0.53           -81.0%       0.10 ±  4%  perf-sched.sch_delay.avg.ms.[unknown].[unknown].[unknown].[unknown].[unknown]
      0.53           -81.0%       0.10 ±  4%  perf-sched.total_sch_delay.average.ms
      2.03           -68.4%       0.64 ±  4%  perf-sched.total_wait_and_delay.average.ms
   1811449          +200.9%    5449776 ±  4%  perf-sched.total_wait_and_delay.count.ms
      1.50           -64.0%       0.54 ±  4%  perf-sched.total_wait_time.average.ms
      2.03           -68.4%       0.64 ±  4%  perf-sched.wait_and_delay.avg.ms.[unknown].[unknown].[unknown].[unknown].[unknown]
   1811449          +200.9%    5449776 ±  4%  perf-sched.wait_and_delay.count.[unknown].[unknown].[unknown].[unknown].[unknown]
      1.50           -64.0%       0.54 ±  4%  perf-sched.wait_time.avg.ms.[unknown].[unknown].[unknown].[unknown].[unknown]
    541937 ±  7%     +92.5%    1043389 ±  4%  proc-vmstat.nr_active_anon
   5242293            +3.5%    5423918        proc-vmstat.nr_dirty_background_threshold
  10497404            +3.5%   10861099        proc-vmstat.nr_dirty_threshold
   1232280           +39.7%    1721251        proc-vmstat.nr_file_pages
  52782357            +3.4%   54601330        proc-vmstat.nr_free_pages
  52117733            +3.8%   54073313        proc-vmstat.nr_free_pages_blocks
    128259 ±  8%    +100.8%     257594 ±  6%  proc-vmstat.nr_mapped
    319681          +153.0%     808650 ±  3%  proc-vmstat.nr_shmem
   4489133            -8.9%    4089704        proc-vmstat.nr_slab_unreclaimable
    541937 ±  7%     +92.5%    1043389 ±  4%  proc-vmstat.nr_zone_active_anon
  77303955            +2.5%   79201972        proc-vmstat.pgalloc_normal
    519724            +5.2%     546556        proc-vmstat.pgfault
  76456707            +1.7%   77739095        proc-vmstat.pgfree
  12794131 ±  6%     -27.4%    9288185        sched_debug.cfs_rq:/.avg_vruntime.max
   4610143 ±  8%     -14.9%    3923890 ±  5%  sched_debug.cfs_rq:/.avg_vruntime.min
      1.03           -20.1%       0.83 ±  2%  sched_debug.cfs_rq:/.h_nr_queued.avg
      1.03           -20.8%       0.82 ±  2%  sched_debug.cfs_rq:/.h_nr_runnable.avg
    895.00 ± 70%     +89.0%       1691 ±  2%  sched_debug.cfs_rq:/.load.min
      0.67 ± 55%    +125.0%       1.50        sched_debug.cfs_rq:/.load_avg.min
  12794131 ±  6%     -27.4%    9288185        sched_debug.cfs_rq:/.min_vruntime.max
   4610143 ±  8%     -14.9%    3923896 ±  5%  sched_debug.cfs_rq:/.min_vruntime.min
      1103           -20.2%     880.86        sched_debug.cfs_rq:/.runnable_avg.avg
    428.26 ±  6%     -63.4%     156.94 ± 22%  sched_debug.cfs_rq:/.util_est.avg
      1775 ±  6%     -39.3%       1077 ± 15%  sched_debug.cfs_rq:/.util_est.max
    396.33 ±  6%     -50.0%     198.03 ± 17%  sched_debug.cfs_rq:/.util_est.stddev
     50422 ±  6%     -34.7%      32915 ± 18%  sched_debug.cpu.avg_idle.min
    456725 ± 10%     +39.4%     636811 ±  4%  sched_debug.cpu.avg_idle.stddev
    611566 ±  5%     +25.0%     764424 ±  2%  sched_debug.cpu.max_idle_balance_cost.avg
    190657 ± 12%     +36.1%     259410 ±  5%  sched_debug.cpu.max_idle_balance_cost.stddev
      1.04           -20.4%       0.82 ±  2%  sched_debug.cpu.nr_running.avg
     57214 ±  4%    +183.5%     162228 ±  2%  sched_debug.cpu.nr_switches.avg
    253314 ±  4%     +39.3%     352777 ±  4%  sched_debug.cpu.nr_switches.max
     59410 ±  6%     +31.6%      78186 ± 10%  sched_debug.cpu.nr_switches.stddev
      3.33           -27.9%       2.40        perf-stat.i.MPKI
 1.207e+10           +11.3%  1.344e+10        perf-stat.i.branch-instructions
      0.21 ±  7%      +0.0        0.24 ±  5%  perf-stat.i.branch-miss-rate%
  23462655 ±  6%     +27.4%   29896517 ±  3%  perf-stat.i.branch-misses
     75.74            -4.4       71.33        perf-stat.i.cache-miss-rate%
 1.861e+08           -21.5%  1.462e+08        perf-stat.i.cache-misses
 2.435e+08           -17.1%  2.017e+08        perf-stat.i.cache-references
    323065 ±  5%    +191.4%     941425 ±  2%  perf-stat.i.context-switches
     10.73            -9.7%       9.69        perf-stat.i.cpi
    353.45           +39.0%     491.13 ±  4%  perf-stat.i.cpu-migrations
      3589           +30.5%       4685        perf-stat.i.cycles-between-cache-misses
 5.645e+10           +12.0%  6.323e+10        perf-stat.i.instructions
      0.09           +12.1%       0.11        perf-stat.i.ipc
      1.66 ±  5%    +193.9%       4.89 ±  2%  perf-stat.i.metric.K/sec
      6247            +5.7%       6603 ±  2%  perf-stat.i.minor-faults
      6248            +5.7%       6604 ±  2%  perf-stat.i.page-faults
      3.33           -29.7%       2.34        perf-stat.overall.MPKI
      0.20 ±  7%      +0.0        0.23 ±  4%  perf-stat.overall.branch-miss-rate%
     76.67            -3.9       72.79        perf-stat.overall.cache-miss-rate%
     10.54           -11.1%       9.37        perf-stat.overall.cpi
      3168           +26.5%       4007        perf-stat.overall.cycles-between-cache-misses
      0.09           +12.5%       0.11        perf-stat.overall.ipc
 1.204e+10           +11.1%  1.337e+10        perf-stat.ps.branch-instructions
  23586580 ±  7%     +29.7%   30600100 ±  4%  perf-stat.ps.branch-misses
 1.873e+08           -21.4%  1.471e+08        perf-stat.ps.cache-misses
 2.443e+08           -17.3%  2.021e+08        perf-stat.ps.cache-references
    324828 ±  5%    +187.0%     932274 ±  2%  perf-stat.ps.context-switches
    335.13 ±  2%     +41.7%     474.95 ±  5%  perf-stat.ps.cpu-migrations
 5.632e+10           +11.7%  6.293e+10        perf-stat.ps.instructions
      6282            +6.5%       6690 ±  2%  perf-stat.ps.minor-faults
      6284            +6.5%       6692 ±  2%  perf-stat.ps.page-faults
 3.764e+12           +12.2%  4.224e+12        perf-stat.total.instructions



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
Re: [PATCH v3] sched/fair: Forfeit vruntime on yield
Posted by kernel test robot 1 week ago

Hello,

kernel test robot noticed a 55.9% improvement of stress-ng.wait.ops_per_sec on:


commit: 15bf8c7b35e31295b26241425c0a61102e92109f ("[PATCH v3] sched/fair: Forfeit vruntime on yield")
url: https://github.com/intel-lab-lkp/linux/commits/Fernand-Sieber/sched-fair-Forfeit-vruntime-on-yield/20250918-231320
base: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git 0d4eaf8caf8cd633b23e949e2996b420052c2d45
patch link: https://lore.kernel.org/all/20250918150528.292620-1-sieberf@amazon.com/
patch subject: [PATCH v3] sched/fair: Forfeit vruntime on yield

testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 192 threads 2 sockets Intel(R) Xeon(R) 6740E  CPU @ 2.4GHz (Sierra Forest) with 256G memory
parameters:

	nr_threads: 100%
	testtime: 60s
	test: wait
	cpufreq_governor: performance


In addition to that, the commit also has a significant impact on the following tests:

+------------------+---------------------------------------------------------+
| testcase: change | stress-ng: stress-ng.alarm.ops_per_sec 1.3% improvement |
| test machine     | 104 threads 2 sockets (Skylake) with 192G memory        |
| test parameters  | cpufreq_governor=performance                            |
|                  | nr_threads=100%                                         |
|                  | test=alarm                                              |
|                  | testtime=60s                                            |
+------------------+---------------------------------------------------------+




Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250924/202509241501.f14b210a-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-14/performance/x86_64-rhel-9.4/100%/debian-13-x86_64-20250902.cgz/lkp-srf-2sp3/wait/stress-ng/60s

commit: 
  0d4eaf8caf ("sched/fair: Do not balance task to a throttled cfs_rq")
  15bf8c7b35 ("sched/fair: Forfeit vruntime on yield")

0d4eaf8caf8cd633 15bf8c7b35e31295b26241425c0 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
  20935372 ± 13%     -74.1%    5416590 ± 38%  cpuidle..usage
      0.22 ±  6%      -0.1        0.15 ±  6%  mpstat.cpu.all.irq%
      1.56 ±  3%      +0.6        2.16 ±  4%  mpstat.cpu.all.usr%
   2928651 ± 48%     +63.3%    4781087 ±  7%  numa-numastat.node1.local_node
   2986407 ± 47%     +63.0%    4867647 ±  8%  numa-numastat.node1.numa_hit
  65592344 ± 22%    +408.5%  3.335e+08 ±  6%  stress-ng.time.involuntary_context_switches
     64507 ±  3%     -10.6%      57643 ±  5%  stress-ng.time.minor_page_faults
    268.43           +58.0%     424.24        stress-ng.time.user_time
  94660203 ±  3%     +32.0%   1.25e+08        stress-ng.time.voluntary_context_switches
   8733656 ±  3%     +55.9%   13619248        stress-ng.wait.ops
    145711 ±  3%     +55.9%     227211        stress-ng.wait.ops_per_sec
   9901871 ± 23%     +33.6%   13230903 ±  9%  meminfo.Active
   9901855 ± 23%     +33.6%   13230887 ±  9%  meminfo.Active(anon)
  12749041 ± 18%     +26.5%   16122685 ±  7%  meminfo.Cached
  14843475 ± 15%     +22.4%   18175107 ±  5%  meminfo.Committed_AS
  16718698 ± 13%     +19.8%   20027386 ±  5%  meminfo.Memused
   9098551 ± 25%     +37.1%   12472304 ±  9%  meminfo.Shmem
  16772967 ± 13%     +19.8%   20096231 ±  6%  meminfo.max_used_kB
   7828333 ± 51%     +66.6%   13041791 ±  9%  numa-meminfo.node1.Active
   7828325 ± 51%     +66.6%   13041784 ±  9%  numa-meminfo.node1.Active(anon)
   7314210 ± 52%     +85.0%   13533714 ± 10%  numa-meminfo.node1.FilePages
     61743 ± 26%     +43.3%      88498 ± 20%  numa-meminfo.node1.KReclaimable
   9385294 ± 42%     +66.0%   15578695 ±  9%  numa-meminfo.node1.MemUsed
     61743 ± 26%     +43.3%      88498 ± 20%  numa-meminfo.node1.SReclaimable
   7219596 ± 53%     +72.1%   12426234 ±  9%  numa-meminfo.node1.Shmem
   1958162 ± 51%     +66.6%    3262251 ±  9%  numa-vmstat.node1.nr_active_anon
   1829587 ± 52%     +85.0%    3385199 ± 10%  numa-vmstat.node1.nr_file_pages
   1805933 ± 53%     +72.1%    3108329 ±  9%  numa-vmstat.node1.nr_shmem
     15439 ± 26%     +43.4%      22139 ± 20%  numa-vmstat.node1.nr_slab_reclaimable
   1958158 ± 51%     +66.6%    3262247 ±  9%  numa-vmstat.node1.nr_zone_active_anon
   2985336 ± 47%     +63.0%    4867285 ±  8%  numa-vmstat.node1.numa_hit
   2927581 ± 48%     +63.3%    4780725 ±  7%  numa-vmstat.node1.numa_local
   2475878 ± 23%     +33.7%    3310125 ±  9%  proc-vmstat.nr_active_anon
    201955 ±  2%      -5.5%     190887 ±  3%  proc-vmstat.nr_anon_pages
   3187672 ± 18%     +26.5%    4033035 ±  7%  proc-vmstat.nr_file_pages
   2275048 ± 25%     +37.2%    3120439 ±  9%  proc-vmstat.nr_shmem
     43269 ±  3%      +4.5%      45201        proc-vmstat.nr_slab_reclaimable
   2475878 ± 23%     +33.7%    3310125 ±  9%  proc-vmstat.nr_zone_active_anon
   4045331 ± 20%     +29.0%    5218368 ±  7%  proc-vmstat.numa_hit
   3847426 ± 21%     +30.5%    5020327 ±  7%  proc-vmstat.numa_local
   4094249 ± 19%     +28.8%    5274030 ±  7%  proc-vmstat.pgalloc_normal
   9011996 ±  5%     +23.4%   11121508 ±  5%  sched_debug.cfs_rq:/.avg_vruntime.max
   3236082 ±  2%     +19.6%    3869616        sched_debug.cfs_rq:/.avg_vruntime.min
   1260971 ±  4%     +25.1%    1577635 ±  9%  sched_debug.cfs_rq:/.avg_vruntime.stddev
      0.53 ±  5%      -8.9%       0.49 ±  3%  sched_debug.cfs_rq:/.h_nr_queued.stddev
      0.54 ±  4%      -8.7%       0.49 ±  3%  sched_debug.cfs_rq:/.h_nr_runnable.stddev
   9011996 ±  5%     +23.4%   11121508 ±  5%  sched_debug.cfs_rq:/.min_vruntime.max
   3236082 ±  2%     +19.6%    3869616        sched_debug.cfs_rq:/.min_vruntime.min
   1260972 ±  4%     +25.1%    1577635 ±  9%  sched_debug.cfs_rq:/.min_vruntime.stddev
      1261 ±  4%     -16.4%       1054 ±  6%  sched_debug.cfs_rq:/.util_avg.max
    170.04 ±  4%     -30.0%     119.10 ±  6%  sched_debug.cfs_rq:/.util_avg.stddev
    390.34 ±  2%     +34.0%     523.00 ±  2%  sched_debug.cfs_rq:/.util_est.avg
    219.06 ±  5%     +22.5%     268.29 ±  4%  sched_debug.cfs_rq:/.util_est.stddev
    765966 ±  3%     -13.1%     665650 ±  3%  sched_debug.cpu.max_idle_balance_cost.avg
    296999 ±  5%     -22.6%     229736 ±  5%  sched_debug.cpu.max_idle_balance_cost.stddev
      0.53 ±  6%     -10.2%       0.48 ±  3%  sched_debug.cpu.nr_running.stddev
    467856 ±  5%    +154.2%    1189068 ±  4%  sched_debug.cpu.nr_switches.avg
   1091334 ± 35%    +458.8%    6098488 ± 11%  sched_debug.cpu.nr_switches.max
    156457 ± 39%    +579.7%    1063429 ± 12%  sched_debug.cpu.nr_switches.stddev
 1.522e+10 ±  2%     +33.0%  2.025e+10 ±  4%  perf-stat.i.branch-instructions
  26461017 ±  8%     +25.3%   33152871 ±  4%  perf-stat.i.branch-misses
  80419215 ±  6%     +22.5%   98514949        perf-stat.i.cache-references
   2950621 ±  6%    +154.2%    7499768 ±  4%  perf-stat.i.context-switches
      8.86           -23.8%       6.75        perf-stat.i.cpi
      4890 ± 16%     -56.2%       2140 ± 15%  perf-stat.i.cpu-migrations
     44725 ±  7%     -16.0%      37555 ±  3%  perf-stat.i.cycles-between-cache-misses
 7.212e+10 ±  2%     +31.4%   9.48e+10 ±  4%  perf-stat.i.instructions
      0.12 ±  3%     +32.7%       0.17 ±  7%  perf-stat.i.ipc
     15.37 ±  6%    +154.2%      39.06 ±  4%  perf-stat.i.metric.K/sec
      8.17           -23.4%       6.26        perf-stat.overall.cpi
      0.12           +30.5%       0.16        perf-stat.overall.ipc
 1.498e+10 ±  2%     +33.0%  1.993e+10 ±  4%  perf-stat.ps.branch-instructions
  26034509 ±  8%     +25.3%   32622824 ±  4%  perf-stat.ps.branch-misses
  79145687 ±  6%     +22.5%   96950950        perf-stat.ps.cache-references
   2903516 ±  6%    +154.2%    7379460 ±  4%  perf-stat.ps.context-switches
      4802 ± 16%     -56.3%       2099 ± 15%  perf-stat.ps.cpu-migrations
 7.098e+10 ±  2%     +31.4%   9.33e+10 ±  4%  perf-stat.ps.instructions
  4.42e+12           +30.9%  5.787e+12        perf-stat.total.instructions


***************************************************************************************************
lkp-skl-fpga01: 104 threads 2 sockets (Skylake) with 192G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-14/performance/x86_64-rhel-9.4/100%/debian-13-x86_64-20250902.cgz/lkp-skl-fpga01/alarm/stress-ng/60s

commit: 
  0d4eaf8caf ("sched/fair: Do not balance task to a throttled cfs_rq")
  15bf8c7b35 ("sched/fair: Forfeit vruntime on yield")

0d4eaf8caf8cd633 15bf8c7b35e31295b26241425c0 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     13051 ± 26%     +40.8%      18378 ±  6%  numa-meminfo.node1.PageTables
    230411 ± 15%     -24.0%     175131 ± 19%  numa-numastat.node0.local_node
    122.83 ± 10%     +24.6%     153.00 ±  9%  sched_debug.cfs_rq:/.runnable_avg.min
    229700 ± 15%     -24.0%     174608 ± 19%  numa-vmstat.node0.numa_local
      3264 ± 26%     +40.4%       4584 ±  6%  numa-vmstat.node1.nr_page_table_pages
     34.64            -0.5       34.15        turbostat.C1%
      1.25 ±  2%      -0.3        0.92 ±  6%  turbostat.C1E%
 1.227e+08            +1.3%  1.243e+08        stress-ng.alarm.ops
   2044889            +1.3%    2071190        stress-ng.alarm.ops_per_sec
  17839864           +33.4%   23790385        stress-ng.time.involuntary_context_switches
      5045            +1.6%       5127        stress-ng.time.percent_of_cpu_this_job_got
      1938            +1.8%       1972        stress-ng.time.system_time
      1094            +1.4%       1109        stress-ng.time.user_time
 1.402e+10            +1.2%  1.419e+10        perf-stat.i.branch-instructions
 9.466e+08            +2.1%  9.661e+08        perf-stat.i.cache-references
   6720093            +2.3%    6874753        perf-stat.i.context-switches
  2.01e+11            +1.4%  2.038e+11        perf-stat.i.cpu-cycles
   2173629            +3.4%    2247122        perf-stat.i.cpu-migrations
 6.961e+10            +1.2%  7.047e+10        perf-stat.i.instructions
     85.51            +2.6%      87.75        perf-stat.i.metric.K/sec
 1.373e+10            +1.2%   1.39e+10        perf-stat.ps.branch-instructions
 9.333e+08            +2.1%   9.53e+08        perf-stat.ps.cache-references
   6626920            +2.3%    6780505        perf-stat.ps.context-switches
 1.979e+11            +1.4%  2.007e+11        perf-stat.ps.cpu-cycles
   2146232            +3.4%    2219100        perf-stat.ps.cpu-migrations
  6.82e+10            +1.2%  6.905e+10        perf-stat.ps.instructions
     16.99            -0.7       16.30        perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.intel_idle_irq.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
      0.63            -0.4        0.25 ±100%  perf-profile.calltrace.cycles-pp.pick_next_task_fair.__pick_next_task.__schedule.schedule.do_nanosleep
      0.76 ± 15%      -0.3        0.43 ± 73%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
     33.81            -0.3       33.51        perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.common_startup_64
     32.55            -0.3       32.25        perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
     32.48            -0.3       32.19        perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
      1.06            -0.1        0.93        perf-profile.calltrace.cycles-pp.__pick_next_task.__schedule.schedule.do_nanosleep.hrtimer_nanosleep
      5.84            -0.1        5.74        perf-profile.calltrace.cycles-pp.schedule.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
      5.66            -0.1        5.56        perf-profile.calltrace.cycles-pp.__schedule.schedule.do_nanosleep.hrtimer_nanosleep.common_nsleep
      8.87            -0.1        8.79        perf-profile.calltrace.cycles-pp.__x64_sys_clock_nanosleep.do_syscall_64.entry_SYSCALL_64_after_hwframe
      8.02            -0.1        7.94        perf-profile.calltrace.cycles-pp.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep.do_syscall_64
      8.38            -0.1        8.31        perf-profile.calltrace.cycles-pp.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep.do_syscall_64.entry_SYSCALL_64_after_hwframe
      8.42            -0.1        8.35        perf-profile.calltrace.cycles-pp.common_nsleep.__x64_sys_clock_nanosleep.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.92            +0.0        1.95        perf-profile.calltrace.cycles-pp.dequeue_entity.dequeue_entities.dequeue_task_fair.try_to_block_task.__schedule
      1.40            +0.0        1.44        perf-profile.calltrace.cycles-pp.enqueue_entity.enqueue_task_fair.enqueue_task.ttwu_do_activate.sched_ttwu_pending
      1.18            +0.0        1.22        perf-profile.calltrace.cycles-pp.select_idle_cpu.select_idle_sibling.select_task_rq_fair.select_task_rq.try_to_wake_up
      0.68            +0.0        0.72        perf-profile.calltrace.cycles-pp.select_idle_sibling.select_task_rq_fair.select_task_rq.try_to_wake_up.complete_signal
      2.48            +0.0        2.52        perf-profile.calltrace.cycles-pp.try_to_block_task.__schedule.schedule.do_nanosleep.hrtimer_nanosleep
      2.10            +0.0        2.14        perf-profile.calltrace.cycles-pp.try_to_wake_up.complete_signal.__send_signal_locked.do_send_sig_info.kill_pid_info_type
      2.38            +0.0        2.42        perf-profile.calltrace.cycles-pp.dequeue_task_fair.try_to_block_task.__schedule.schedule.do_nanosleep
      0.99            +0.0        1.03        perf-profile.calltrace.cycles-pp.select_task_rq_fair.select_task_rq.try_to_wake_up.complete_signal.__send_signal_locked
      2.32            +0.0        2.36        perf-profile.calltrace.cycles-pp.complete_signal.__send_signal_locked.do_send_sig_info.kill_pid_info_type.kill_something_info
      2.24            +0.0        2.28        perf-profile.calltrace.cycles-pp.dequeue_entities.dequeue_task_fair.try_to_block_task.__schedule.schedule
      3.46            +0.0        3.50        perf-profile.calltrace.cycles-pp.__send_signal_locked.do_send_sig_info.kill_pid_info_type.kill_something_info.__x64_sys_kill
      1.79            +0.0        1.84        perf-profile.calltrace.cycles-pp.enqueue_task.ttwu_do_activate.sched_ttwu_pending.__flush_smp_call_function_queue.flush_smp_call_function_queue
      1.73            +0.1        1.78        perf-profile.calltrace.cycles-pp.enqueue_task_fair.enqueue_task.ttwu_do_activate.sched_ttwu_pending.__flush_smp_call_function_queue
      1.06            +0.1        1.11        perf-profile.calltrace.cycles-pp.select_task_rq.try_to_wake_up.complete_signal.__send_signal_locked.do_send_sig_info
      2.36            +0.1        2.41        perf-profile.calltrace.cycles-pp.ttwu_do_activate.sched_ttwu_pending.__flush_smp_call_function_queue.flush_smp_call_function_queue.do_idle
      4.26            +0.1        4.32        perf-profile.calltrace.cycles-pp.kill_pid_info_type.kill_something_info.__x64_sys_kill.do_syscall_64.entry_SYSCALL_64_after_hwframe
      6.72            +0.1        6.78        perf-profile.calltrace.cycles-pp.alarm
      0.73            +0.1        0.80        perf-profile.calltrace.cycles-pp.pick_task_fair.pick_next_task_fair.__pick_next_task.__schedule.schedule
      2.86            +0.1        2.92        perf-profile.calltrace.cycles-pp.sched_ttwu_pending.__flush_smp_call_function_queue.flush_smp_call_function_queue.do_idle.cpu_startup_entry
      3.26            +0.1        3.33        perf-profile.calltrace.cycles-pp.__flush_smp_call_function_queue.flush_smp_call_function_queue.do_idle.cpu_startup_entry.start_secondary
      3.72            +0.1        3.80        perf-profile.calltrace.cycles-pp.flush_smp_call_function_queue.do_idle.cpu_startup_entry.start_secondary.common_startup_64
      0.85            +0.1        0.94        perf-profile.calltrace.cycles-pp.pick_next_task_fair.__pick_next_task.__schedule.schedule.__x64_sys_sched_yield
      0.88            +0.1        0.97        perf-profile.calltrace.cycles-pp.__pick_next_task.__schedule.schedule.__x64_sys_sched_yield.do_syscall_64
      2.02            +0.1        2.15        perf-profile.calltrace.cycles-pp.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
      1.54            +0.1        1.67        perf-profile.calltrace.cycles-pp.__schedule.schedule.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.57            +0.1        1.71        perf-profile.calltrace.cycles-pp.schedule.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
      2.88            +0.2        3.04        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__sched_yield
      2.34            +0.2        2.51        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
      5.50            +0.2        5.68        perf-profile.calltrace.cycles-pp.__sched_yield
      0.52            +0.5        1.04        perf-profile.calltrace.cycles-pp.select_idle_core.select_idle_cpu.select_idle_sibling.select_task_rq_fair.select_task_rq
     34.13            -0.3       33.82        perf-profile.children.cycles-pp.cpuidle_idle_call
     32.84            -0.3       32.54        perf-profile.children.cycles-pp.cpuidle_enter
     32.79            -0.3       32.50        perf-profile.children.cycles-pp.cpuidle_enter_state
     13.10            -0.3       12.81        perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
      0.78 ± 13%      -0.2        0.58 ± 20%  perf-profile.children.cycles-pp.intel_idle
      8.88            -0.1        8.80        perf-profile.children.cycles-pp.__x64_sys_clock_nanosleep
      8.05            -0.1        7.97        perf-profile.children.cycles-pp.do_nanosleep
      8.39            -0.1        8.31        perf-profile.children.cycles-pp.hrtimer_nanosleep
      8.46            -0.1        8.39        perf-profile.children.cycles-pp.common_nsleep
      1.22            -0.1        1.17        perf-profile.children.cycles-pp.pick_task_fair
      3.10            -0.0        3.06        perf-profile.children.cycles-pp.__pick_next_task
      2.60            -0.0        2.56        perf-profile.children.cycles-pp.pick_next_task_fair
      0.10 ±  3%      -0.0        0.08 ±  5%  perf-profile.children.cycles-pp.tick_nohz_get_sleep_length
      0.09 ±  5%      +0.0        0.10 ±  4%  perf-profile.children.cycles-pp.sigprocmask
      0.91            +0.0        0.94        perf-profile.children.cycles-pp.switch_mm_irqs_off
      1.85            +0.0        1.89        perf-profile.children.cycles-pp.enqueue_entity
      2.41            +0.0        2.45        perf-profile.children.cycles-pp.enqueue_task
      2.39            +0.0        2.43        perf-profile.children.cycles-pp.dequeue_task_fair
      2.48            +0.0        2.52        perf-profile.children.cycles-pp.try_to_block_task
      1.42            +0.0        1.46        perf-profile.children.cycles-pp.available_idle_cpu
      2.32            +0.0        2.37        perf-profile.children.cycles-pp.complete_signal
      2.32            +0.0        2.36        perf-profile.children.cycles-pp.enqueue_task_fair
      3.46            +0.0        3.51        perf-profile.children.cycles-pp.__send_signal_locked
      4.27            +0.1        4.32        perf-profile.children.cycles-pp.kill_pid_info_type
      4.03            +0.1        4.08        perf-profile.children.cycles-pp.do_send_sig_info
      6.84            +0.1        6.90        perf-profile.children.cycles-pp.alarm
      3.09            +0.1        3.15        perf-profile.children.cycles-pp.ttwu_do_activate
      1.95            +0.1        2.02        perf-profile.children.cycles-pp.select_idle_core
      2.23            +0.1        2.30        perf-profile.children.cycles-pp.select_idle_cpu
      3.12            +0.1        3.19        perf-profile.children.cycles-pp.sched_ttwu_pending
      3.58            +0.1        3.65        perf-profile.children.cycles-pp.__flush_smp_call_function_queue
      2.62            +0.1        2.70        perf-profile.children.cycles-pp.select_idle_sibling
      6.14            +0.1        6.22        perf-profile.children.cycles-pp.try_to_wake_up
      3.78            +0.1        3.86        perf-profile.children.cycles-pp.flush_smp_call_function_queue
      3.05            +0.1        3.14        perf-profile.children.cycles-pp.select_task_rq_fair
      3.17            +0.1        3.26        perf-profile.children.cycles-pp.select_task_rq
      2.03            +0.1        2.17        perf-profile.children.cycles-pp.__x64_sys_sched_yield
      5.56            +0.2        5.75        perf-profile.children.cycles-pp.__sched_yield
      0.78 ± 13%      -0.2        0.58 ± 20%  perf-profile.self.cycles-pp.intel_idle
      0.22 ±  2%      +0.0        0.23        perf-profile.self.cycles-pp.exit_to_user_mode_loop
      0.80            +0.0        0.83        perf-profile.self.cycles-pp.switch_mm_irqs_off
      1.40            +0.0        1.45        perf-profile.self.cycles-pp.available_idle_cpu





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki