[PATCH v3] sched/fair: sanitize vruntime of entity being placed

Posted by Roman Kagan 2 years, 7 months ago
From: Zhang Qiao <zhangqiao22@huawei.com>

When a scheduling entity is placed onto a cfs_rq, its vruntime is
pulled to the base level (around cfs_rq->min_vruntime), so that the
entity doesn't gain an extra boost from being placed backwards.

However, if the entity being placed hasn't run for a long time, its
vruntime may fall too far behind (e.g. while the cfs_rq was running a
low-weight hog), which can invert the vruntime comparison due to s64
overflow.  The entity is then placed with its original vruntime, which
now compares as being far in the future, so it effectively never gets
the CPU.

To prevent that, ignore the entity's original vruntime if it hasn't run
for longer than the time it takes for such an overflow to occur.
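
For illustration only (not part of the patch): a minimal userspace
sketch, mirroring the signed-delta comparison used by the kernel's
max_vruntime() helper, of how a vruntime lagging the base by more than
2^63 ns "wins" the comparison and leaves the entity parked far in the
future.  The numbers are made up.

#include <stdio.h>
#include <stdint.h>

/* Same signed-delta trick as the kernel's max_vruntime() helper. */
static uint64_t max_vruntime(uint64_t max_vruntime, uint64_t vruntime)
{
	int64_t delta = (int64_t)(vruntime - max_vruntime);

	if (delta > 0)
		max_vruntime = vruntime;
	return max_vruntime;
}

int main(void)
{
	uint64_t se_vruntime = 100;			/* entity slept for ages */
	uint64_t base = se_vruntime + (1ULL << 63) + 1;	/* ~cfs_rq->min_vruntime */

	/*
	 * delta wraps negative, so the stale vruntime "wins" and the
	 * entity is queued as if it were far in the future:
	 */
	printf("placed at %llu instead of %llu\n",
	       (unsigned long long)max_vruntime(se_vruntime, base),
	       (unsigned long long)base);
	return 0;
}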

Signed-off-by: Zhang Qiao <zhangqiao22@huawei.com>
[rkagan: formatted, adjusted commit log, comments, cutoff value]
Co-developed-by: Roman Kagan <rkagan@amazon.de>
Signed-off-by: Roman Kagan <rkagan@amazon.de>
---
v2 -> v3:
- make cutoff less arbitrary and update comments [Vincent]

v1 -> v2:
- add Zhang Qiao's s-o-b
- fix constant promotion on 32bit

 kernel/sched/fair.c | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0f8736991427..3baa6b7ea860 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4656,6 +4656,7 @@ static void
 place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
 {
 	u64 vruntime = cfs_rq->min_vruntime;
+	u64 sleep_time;
 
 	/*
 	 * The 'current' period is already promised to the current tasks,
@@ -4685,8 +4686,24 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
 		vruntime -= thresh;
 	}
 
-	/* ensure we never gain time by being placed backwards. */
-	se->vruntime = max_vruntime(se->vruntime, vruntime);
+	/*
+	 * Pull vruntime of the entity being placed to the base level of
+	 * cfs_rq, to prevent boosting it if placed backwards.
+	 * However, min_vruntime can advance much faster than real time, with
+	 * the extreme being when an entity with the minimal weight always runs
+	 * on the cfs_rq.  If the new entity slept for a long time, its vruntime
+	 * difference from min_vruntime may overflow s64 and their comparison
+	 * may get inverted, so ignore the entity's original vruntime in that
+	 * case.
+	 * The maximal vruntime speedup is given by the ratio of normal to
+	 * minimal weight: NICE_0_LOAD / MIN_SHARES, so cutting off at a
+	 * sleep time of 2^63 / NICE_0_LOAD should be safe.
+	 */
+	sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start;
+	if ((s64)sleep_time > (1ULL << 63) / NICE_0_LOAD)
+		se->vruntime = vruntime;
+	else
+		se->vruntime = max_vruntime(se->vruntime, vruntime);
 }
 
 static void check_enqueue_throttle(struct cfs_rq *cfs_rq);
-- 
2.34.1
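
A quick back-of-the-envelope check of the cutoff in the new comment
above (an editorial aside, not part of the patch): assuming NICE_0_LOAD
is 1 << 20 on 64-bit kernels and 1 << 10 on 32-bit ones, the threshold
of 2^63 / NICE_0_LOAD works out to roughly 2.4 hours and 104 days of
sleep respectively.

#include <stdio.h>

int main(void)
{
	unsigned long long bound = 1ULL << 63;

	/* 64-bit kernels: NICE_0_LOAD == 1 << 20 -> cutoff of ~2.4 hours */
	printf("64-bit cutoff: %llu ns (~%llu s)\n",
	       bound / (1ULL << 20),
	       bound / (1ULL << 20) / 1000000000ULL);

	/* 32-bit kernels: NICE_0_LOAD == 1 << 10 -> cutoff of ~104 days */
	printf("32-bit cutoff: %llu ns (~%llu days)\n",
	       bound / (1ULL << 10),
	       bound / (1ULL << 10) / 1000000000ULL / 86400ULL);
	return 0;
}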




Re: [PATCH v3] sched/fair: sanitize vruntime of entity being placed
Posted by Vincent Guittot 2 years, 6 months ago
On Thu, 9 Feb 2023 at 20:31, Roman Kagan <rkagan@amazon.de> wrote:
>
> From: Zhang Qiao <zhangqiao22@huawei.com>
>
> When a scheduling entity is placed onto a cfs_rq, its vruntime is
> pulled to the base level (around cfs_rq->min_vruntime), so that the
> entity doesn't gain an extra boost from being placed backwards.
>
> However, if the entity being placed hasn't run for a long time, its
> vruntime may fall too far behind (e.g. while the cfs_rq was running a
> low-weight hog), which can invert the vruntime comparison due to s64
> overflow.  The entity is then placed with its original vruntime, which
> now compares as being far in the future, so it effectively never gets
> the CPU.
>
> To prevent that, ignore the entity's original vruntime if it hasn't run
> for longer than the time it takes for such an overflow to occur.
>
> Signed-off-by: Zhang Qiao <zhangqiao22@huawei.com>
> [rkagan: formatted, adjusted commit log, comments, cutoff value]
> Co-developed-by: Roman Kagan <rkagan@amazon.de>
> Signed-off-by: Roman Kagan <rkagan@amazon.de>

Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>

> ---
> v2 -> v3:
> - make cutoff less arbitrary and update comments [Vincent]
>
> v1 -> v2:
> - add Zhang Qiao's s-o-b
> - fix constant promotion on 32bit
>
>  kernel/sched/fair.c | 21 +++++++++++++++++++--
>  1 file changed, 19 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 0f8736991427..3baa6b7ea860 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4656,6 +4656,7 @@ static void
>  place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
>  {
>         u64 vruntime = cfs_rq->min_vruntime;
> +       u64 sleep_time;
>
>         /*
>          * The 'current' period is already promised to the current tasks,
> @@ -4685,8 +4686,24 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
>                 vruntime -= thresh;
>         }
>
> -       /* ensure we never gain time by being placed backwards. */
> -       se->vruntime = max_vruntime(se->vruntime, vruntime);
> +       /*
> +        * Pull vruntime of the entity being placed to the base level of
> +        * cfs_rq, to prevent boosting it if placed backwards.
> +        * However, min_vruntime can advance much faster than real time, with
> +        * the extreme being when an entity with the minimal weight always runs
> +        * on the cfs_rq.  If the new entity slept for a long time, its vruntime
> +        * difference from min_vruntime may overflow s64 and their comparison
> +        * may get inverted, so ignore the entity's original vruntime in that
> +        * case.
> +        * The maximal vruntime speedup is given by the ratio of normal to
> +        * minimal weight: NICE_0_LOAD / MIN_SHARES, so cutting off at a
> +        * sleep time of 2^63 / NICE_0_LOAD should be safe.
> +        */
> +       sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start;
> +       if ((s64)sleep_time > (1ULL << 63) / NICE_0_LOAD)
> +               se->vruntime = vruntime;
> +       else
> +               se->vruntime = max_vruntime(se->vruntime, vruntime);
>  }
>
>  static void check_enqueue_throttle(struct cfs_rq *cfs_rq);
> --
> 2.34.1
>
>
>
>
Re: [PATCH v3] sched/fair: sanitize vruntime of entity being placed
Posted by Roman Kagan 2 years, 6 months ago
On Tue, Feb 21, 2023 at 10:38:44AM +0100, Vincent Guittot wrote:
> On Thu, 9 Feb 2023 at 20:31, Roman Kagan <rkagan@amazon.de> wrote:
> >
> > From: Zhang Qiao <zhangqiao22@huawei.com>
> >
> > When a scheduling entity is placed onto a cfs_rq, its vruntime is
> > pulled to the base level (around cfs_rq->min_vruntime), so that the
> > entity doesn't gain an extra boost from being placed backwards.
> >
> > However, if the entity being placed hasn't run for a long time, its
> > vruntime may fall too far behind (e.g. while the cfs_rq was running a
> > low-weight hog), which can invert the vruntime comparison due to s64
> > overflow.  The entity is then placed with its original vruntime, which
> > now compares as being far in the future, so it effectively never gets
> > the CPU.
> >
> > To prevent that, ignore the entity's original vruntime if it hasn't run
> > for longer than the time it takes for such an overflow to occur.
> >
> > Signed-off-by: Zhang Qiao <zhangqiao22@huawei.com>
> > [rkagan: formatted, adjusted commit log, comments, cutoff value]
> > Co-developed-by: Roman Kagan <rkagan@amazon.de>
> > Signed-off-by: Roman Kagan <rkagan@amazon.de>
> 
> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
> 
> > ---
> > v2 -> v3:
> > - make cutoff less arbitrary and update comments [Vincent]
> >
> > v1 -> v2:
> > - add Zhang Qiao's s-o-b
> > - fix constant promotion on 32bit
> >
> >  kernel/sched/fair.c | 21 +++++++++++++++++++--
> >  1 file changed, 19 insertions(+), 2 deletions(-)

Turns out Peter took v2 through his tree, and it has already landed in
Linus' master.

What scares me, though, is that I've got a message from the test robot
that this commit dramatically affected hackbench results; see the quote
below.  I expected the commit not to affect any benchmarks.

Any idea what could have caused this change?

Thanks,
Roman.


On Tue, Feb 21, 2023 at 03:34:16PM +0800, kernel test robot wrote:
> FYI, we noticed a 125.5% improvement of hackbench.throughput due to commit:
> 
> commit: 829c1651e9c4a6f78398d3e67651cef9bb6b42cc ("sched/fair: sanitize vruntime of entity being placed")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> 
> in testcase: hackbench
> on test machine: 88 threads 2 sockets Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz (Cascade Lake) with 128G memory
> with following parameters:
> 
>         nr_threads: 50%
>         iterations: 8
>         mode: process
>         ipc: pipe
>         cpufreq_governor: performance
> 
> test-description: Hackbench is both a benchmark and a stress test for the Linux kernel scheduler.
> test-url: https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/sched/cfs-scheduler/hackbench.c
> 
> In addition to that, the commit also has significant impact on the following tests:
> 
> +------------------+--------------------------------------------------+
> | testcase: change | hackbench: hackbench.throughput -8.1% regression |
> | test machine     | 104 threads 2 sockets (Skylake) with 192G memory |
> | test parameters  | cpufreq_governor=performance                     |
> |                  | ipc=socket                                       |
> |                  | iterations=4                                     |
> |                  | mode=process                                     |
> |                  | nr_threads=100%                                  |
> +------------------+--------------------------------------------------+
> 
> Details are as below:
> 
> =========================================================================================
> compiler/cpufreq_governor/ipc/iterations/kconfig/mode/nr_threads/rootfs/tbox_group/testcase:
>   gcc-11/performance/pipe/8/x86_64-rhel-8.3/process/50%/debian-11.1-x86_64-20220510.cgz/lkp-csl-2sp9/hackbench
> 
> commit:
>   a2e90611b9 ("sched/fair: Remove capacity inversion detection")
>   829c1651e9 ("sched/fair: sanitize vruntime of entity being placed")
> 
> a2e90611b9f425ad 829c1651e9c4a6f78398d3e6765
> ---------------- ---------------------------
>          %stddev     %change         %stddev
>              \          |                \
>     308887 ±  5%    +125.5%     696539        hackbench.throughput
>     259291 ±  2%    +127.3%     589293        hackbench.throughput_avg
>     308887 ±  5%    +125.5%     696539        hackbench.throughput_best
>     198770 ±  2%    +105.5%     408552 ±  4%  hackbench.throughput_worst
>     319.60 ±  2%     -55.8%     141.24        hackbench.time.elapsed_time
>     319.60 ±  2%     -55.8%     141.24        hackbench.time.elapsed_time.max
>  1.298e+09 ±  8%     -87.6%  1.613e+08 ±  7%  hackbench.time.involuntary_context_switches
>     477107           -12.5%     417660        hackbench.time.minor_page_faults
>      24683 ±  2%     -57.2%      10562        hackbench.time.system_time
>       2136 ±  3%     -45.0%       1174        hackbench.time.user_time
>   3.21e+09 ±  4%     -83.0%  5.442e+08 ±  3%  hackbench.time.voluntary_context_switches
>   5.28e+08 ±  4%      +8.4%  5.723e+08 ±  3%  cpuidle..time
>     365.97 ±  2%     -48.9%     187.12        uptime.boot
>    3322559 ±  3%     +34.3%    4463206 ± 15%  vmstat.memory.cache
>   14194257 ±  2%     -62.8%    5279904 ±  3%  vmstat.system.cs
>    2120781 ±  3%     -72.8%     576421 ±  4%  vmstat.system.in
>       1.84 ± 12%      +2.6        4.48 ±  5%  mpstat.cpu.all.idle%
>       2.49 ±  3%      -1.1        1.39 ±  4%  mpstat.cpu.all.irq%
>       0.04 ± 12%      +0.0        0.05        mpstat.cpu.all.soft%
>       7.36            +2.2        9.56        mpstat.cpu.all.usr%
>      61555 ±  6%     -72.8%      16751 ± 16%  numa-meminfo.node1.Active
>      61515 ±  6%     -72.8%      16717 ± 16%  numa-meminfo.node1.Active(anon)
>     960182 ±102%    +225.6%    3125990 ± 42%  numa-meminfo.node1.FilePages
>    1754002 ± 53%    +137.9%    4173379 ± 34%  numa-meminfo.node1.MemUsed
>   35296824 ±  6%    +157.8%   91005048        numa-numastat.node0.local_node
>   35310119 ±  6%    +157.9%   91058472        numa-numastat.node0.numa_hit
>   35512423 ±  5%    +159.7%   92232951        numa-numastat.node1.local_node
>   35577275 ±  4%    +159.4%   92273266        numa-numastat.node1.numa_hit
>   35310253 ±  6%    +157.9%   91058211        numa-vmstat.node0.numa_hit
>   35296958 ±  6%    +157.8%   91004787        numa-vmstat.node0.numa_local
>      15337 ±  6%     -72.5%       4216 ± 17%  numa-vmstat.node1.nr_active_anon
>     239988 ±102%    +225.7%     781607 ± 42%  numa-vmstat.node1.nr_file_pages
>      15337 ±  6%     -72.5%       4216 ± 17%  numa-vmstat.node1.nr_zone_active_anon
>   35577325 ±  4%    +159.4%   92273215        numa-vmstat.node1.numa_hit
>   35512473 ±  5%    +159.7%   92232900        numa-vmstat.node1.numa_local
>      64500 ±  8%     -61.8%      24643 ± 32%  meminfo.Active
>      64422 ±  8%     -61.9%      24568 ± 32%  meminfo.Active(anon)
>     140271 ± 14%     -38.0%      86979 ± 24%  meminfo.AnonHugePages
>     372672 ±  2%     +13.3%     422069        meminfo.AnonPages
>    3205235 ±  3%     +35.1%    4329061 ± 15%  meminfo.Cached
>    1548601 ±  7%     +77.4%    2747319 ± 24%  meminfo.Committed_AS
>     783193 ± 14%    +154.9%    1996137 ± 33%  meminfo.Inactive
>     783010 ± 14%    +154.9%    1995951 ± 33%  meminfo.Inactive(anon)
>    4986534 ±  2%     +28.2%    6394741 ± 10%  meminfo.Memused
>     475092 ± 22%    +236.5%    1598918 ± 41%  meminfo.Shmem
>       2777            -2.1%       2719        turbostat.Bzy_MHz
>   11143123 ±  6%     +72.0%   19162667        turbostat.C1
>       0.24 ±  7%      +0.7        0.94 ±  3%  turbostat.C1%
>     100440 ± 18%    +203.8%     305136 ± 15%  turbostat.C1E
>       0.06 ±  9%      +0.1        0.18 ± 11%  turbostat.C1E%
>       1.24 ±  3%      +1.6        2.81 ±  4%  turbostat.C6%
>       1.38 ±  3%    +156.1%       3.55 ±  3%  turbostat.CPU%c1
>       0.33 ±  5%     +76.5%       0.58 ±  7%  turbostat.CPU%c6
>       0.16           +31.2%       0.21        turbostat.IPC
>  6.866e+08 ±  5%     -87.8%   83575393 ±  5%  turbostat.IRQ
>       0.33 ± 27%      +0.2        0.57        turbostat.POLL%
>       0.12 ± 10%    +176.4%       0.33 ± 12%  turbostat.Pkg%pc2
>       0.09 ±  7%    -100.0%       0.00        turbostat.Pkg%pc6
>      61.33            +5.2%      64.50 ±  2%  turbostat.PkgTmp
>      14.81            +2.0%      15.11        turbostat.RAMWatt
>      16242 ±  8%     -62.0%       6179 ± 32%  proc-vmstat.nr_active_anon
>      93150 ±  2%     +13.2%     105429        proc-vmstat.nr_anon_pages
>     801219 ±  3%     +35.1%    1082320 ± 15%  proc-vmstat.nr_file_pages
>     195506 ± 14%    +155.2%     498919 ± 33%  proc-vmstat.nr_inactive_anon
>     118682 ± 22%    +236.9%     399783 ± 41%  proc-vmstat.nr_shmem
>      16242 ±  8%     -62.0%       6179 ± 32%  proc-vmstat.nr_zone_active_anon
>     195506 ± 14%    +155.2%     498919 ± 33%  proc-vmstat.nr_zone_inactive_anon
>   70889233 ±  5%    +158.6%  1.833e+08        proc-vmstat.numa_hit
>   70811086 ±  5%    +158.8%  1.832e+08        proc-vmstat.numa_local
>      55885 ± 22%     -67.2%      18327 ± 38%  proc-vmstat.numa_pages_migrated
>     422312 ± 10%     -95.4%      19371 ±  7%  proc-vmstat.pgactivate
>   71068460 ±  5%    +158.1%  1.834e+08        proc-vmstat.pgalloc_normal
>    1554994           -19.6%    1250346 ±  4%  proc-vmstat.pgfault
>   71011267 ±  5%    +155.9%  1.817e+08        proc-vmstat.pgfree
>      55885 ± 22%     -67.2%      18327 ± 38%  proc-vmstat.pgmigrate_success
>     111247 ±  2%     -35.0%      72355 ±  2%  proc-vmstat.pgreuse
>    2506368 ±  2%     -53.1%    1176320        proc-vmstat.unevictable_pgs_scanned
>      20.06 ± 10%     -22.4%      15.56 ±  8%  sched_debug.cfs_rq:/.h_nr_running.max
>       0.81 ± 32%     -93.1%       0.06 ±223%  sched_debug.cfs_rq:/.h_nr_running.min
>       1917 ± 34%    -100.0%       0.00        sched_debug.cfs_rq:/.load.min
>      24.18 ± 10%     +39.0%      33.62 ± 11%  sched_debug.cfs_rq:/.load_avg.avg
>     245.61 ± 25%     +66.3%     408.33 ± 22%  sched_debug.cfs_rq:/.load_avg.max
>      47.52 ± 13%     +72.6%      82.03 ±  8%  sched_debug.cfs_rq:/.load_avg.stddev
>   13431147           -64.9%    4717147        sched_debug.cfs_rq:/.min_vruntime.avg
>   18161799 ±  7%     -67.4%    5925316 ±  6%  sched_debug.cfs_rq:/.min_vruntime.max
>   12413026           -65.0%    4340952        sched_debug.cfs_rq:/.min_vruntime.min
>     739748 ± 16%     -66.6%     247410 ± 17%  sched_debug.cfs_rq:/.min_vruntime.stddev
>       0.85           -16.4%       0.71        sched_debug.cfs_rq:/.nr_running.avg
>       0.61 ± 25%     -90.9%       0.06 ±223%  sched_debug.cfs_rq:/.nr_running.min
>       0.10 ± 25%    +109.3%       0.22 ±  7%  sched_debug.cfs_rq:/.nr_running.stddev
>     169.22          +101.7%     341.33        sched_debug.cfs_rq:/.removed.load_avg.max
>      32.41 ± 24%    +100.2%      64.90 ± 16%  sched_debug.cfs_rq:/.removed.load_avg.stddev
>      82.92 ± 10%    +108.1%     172.56        sched_debug.cfs_rq:/.removed.runnable_avg.max
>      13.60 ± 28%    +114.0%      29.10 ± 20%  sched_debug.cfs_rq:/.removed.runnable_avg.stddev
>      82.92 ± 10%    +108.1%     172.56        sched_debug.cfs_rq:/.removed.util_avg.max
>      13.60 ± 28%    +114.0%      29.10 ± 20%  sched_debug.cfs_rq:/.removed.util_avg.stddev
>       2156 ± 12%     -36.6%       1368 ± 27%  sched_debug.cfs_rq:/.runnable_avg.min
>       2285 ±  7%     -19.8%       1833 ±  6%  sched_debug.cfs_rq:/.runnable_avg.stddev
>   -2389921           -64.8%    -840940        sched_debug.cfs_rq:/.spread0.min
>     739781 ± 16%     -66.5%     247837 ± 17%  sched_debug.cfs_rq:/.spread0.stddev
>     843.88 ±  2%     -20.5%     670.53        sched_debug.cfs_rq:/.util_avg.avg
>     433.64 ±  7%     -43.5%     244.83 ± 17%  sched_debug.cfs_rq:/.util_avg.min
>     187.00 ±  6%     +40.6%     263.02 ±  4%  sched_debug.cfs_rq:/.util_avg.stddev
>     394.15 ± 14%     -29.5%     278.06 ±  3%  sched_debug.cfs_rq:/.util_est_enqueued.avg
>       1128 ± 12%     -17.6%     930.39 ±  5%  sched_debug.cfs_rq:/.util_est_enqueued.max
>      38.36 ± 29%    -100.0%       0.00        sched_debug.cfs_rq:/.util_est_enqueued.min
>       3596 ± 15%     -39.5%       2175 ±  7%  sched_debug.cpu.avg_idle.min
>     160647 ±  9%     -25.9%     118978 ±  9%  sched_debug.cpu.avg_idle.stddev
>     197365           -46.2%     106170        sched_debug.cpu.clock.avg
>     197450           -46.2%     106208        sched_debug.cpu.clock.max
>     197281           -46.2%     106128        sched_debug.cpu.clock.min
>      49.96 ± 22%     -53.1%      23.44 ± 19%  sched_debug.cpu.clock.stddev
>     193146           -45.7%     104898        sched_debug.cpu.clock_task.avg
>     194592           -45.8%     105455        sched_debug.cpu.clock_task.max
>     177878           -49.3%      90211        sched_debug.cpu.clock_task.min
>       1794 ±  5%     -10.7%       1602 ±  2%  sched_debug.cpu.clock_task.stddev
>      13154 ±  2%     -20.3%      10479        sched_debug.cpu.curr->pid.avg
>      15059           -17.2%      12468        sched_debug.cpu.curr->pid.max
>       7263 ± 33%    -100.0%       0.00        sched_debug.cpu.curr->pid.min
>       9321 ± 36%     +98.2%      18478 ± 44%  sched_debug.cpu.max_idle_balance_cost.stddev
>       0.00 ± 17%     -41.6%       0.00 ± 13%  sched_debug.cpu.next_balance.stddev
>      20.00 ± 11%     -21.4%      15.72 ±  7%  sched_debug.cpu.nr_running.max
>       0.86 ± 17%     -87.1%       0.11 ±141%  sched_debug.cpu.nr_running.min
>   25069883           -83.7%    4084117 ±  4%  sched_debug.cpu.nr_switches.avg
>   26486718           -82.8%    4544009 ±  4%  sched_debug.cpu.nr_switches.max
>   23680077           -84.5%    3663816 ±  4%  sched_debug.cpu.nr_switches.min
>     589836 ±  3%     -68.7%     184621 ± 16%  sched_debug.cpu.nr_switches.stddev
>     197278           -46.2%     106128        sched_debug.cpu_clk
>     194327           -46.9%     103176        sched_debug.ktime
>     197967           -46.0%     106821        sched_debug.sched_clk
>      14.91           -37.6%       9.31        perf-stat.i.MPKI
>  2.657e+10           +25.0%   3.32e+10        perf-stat.i.branch-instructions
>       1.17            -0.4        0.78        perf-stat.i.branch-miss-rate%
>  3.069e+08           -20.1%  2.454e+08        perf-stat.i.branch-misses
>       6.43 ±  8%      +2.2        8.59 ±  4%  perf-stat.i.cache-miss-rate%
>  1.952e+09           -24.3%  1.478e+09        perf-stat.i.cache-references
>   14344055 ±  2%     -58.6%    5932018 ±  3%  perf-stat.i.context-switches
>       1.83           -21.8%       1.43        perf-stat.i.cpi
>  2.403e+11            -3.4%  2.322e+11        perf-stat.i.cpu-cycles
>    1420139 ±  2%     -38.8%     869692 ±  5%  perf-stat.i.cpu-migrations
>       2619 ±  7%     -15.5%       2212 ±  8%  perf-stat.i.cycles-between-cache-misses
>       0.24 ± 19%      -0.1        0.10 ± 17%  perf-stat.i.dTLB-load-miss-rate%
>   90403286 ± 19%     -55.8%   39926283 ± 16%  perf-stat.i.dTLB-load-misses
>  3.823e+10           +28.6%  4.918e+10        perf-stat.i.dTLB-loads
>       0.01 ± 34%      -0.0        0.01 ± 33%  perf-stat.i.dTLB-store-miss-rate%
>    2779663 ± 34%     -52.7%    1315899 ± 31%  perf-stat.i.dTLB-store-misses
>   2.19e+10           +24.2%   2.72e+10        perf-stat.i.dTLB-stores
>      47.99 ±  2%     +28.0       75.94        perf-stat.i.iTLB-load-miss-rate%
>   89417955 ±  2%     +38.7%   1.24e+08 ±  4%  perf-stat.i.iTLB-load-misses
>   97721514 ±  2%     -58.2%   40865783 ±  3%  perf-stat.i.iTLB-loads
>  1.329e+11           +26.3%  1.678e+11        perf-stat.i.instructions
>       1503            -7.7%       1388 ±  3%  perf-stat.i.instructions-per-iTLB-miss
>       0.55           +30.2%       0.72        perf-stat.i.ipc
>       1.64 ± 18%    +217.4%       5.20 ± 11%  perf-stat.i.major-faults
>       2.73            -3.7%       2.63        perf-stat.i.metric.GHz
>       1098 ±  2%      -7.1%       1020 ±  3%  perf-stat.i.metric.K/sec
>       1008           +24.4%       1254        perf-stat.i.metric.M/sec
>       4334 ±  2%     +90.5%       8257 ±  7%  perf-stat.i.minor-faults
>      90.94           -14.9       75.99        perf-stat.i.node-load-miss-rate%
>   41932510 ±  8%     -43.0%   23899176 ± 10%  perf-stat.i.node-load-misses
>    3366677 ±  5%     +86.2%    6267816        perf-stat.i.node-loads
>      81.77 ±  3%     -36.3       45.52 ±  3%  perf-stat.i.node-store-miss-rate%
>   18498318 ±  7%     -31.8%   12613933 ±  7%  perf-stat.i.node-store-misses
>    3023556 ± 10%    +508.7%   18405880 ±  2%  perf-stat.i.node-stores
>       4336 ±  2%     +90.5%       8262 ±  7%  perf-stat.i.page-faults
>      14.70           -41.2%       8.65        perf-stat.overall.MPKI
>       1.16            -0.4        0.72        perf-stat.overall.branch-miss-rate%
>       6.22 ±  7%      +2.4        8.59 ±  4%  perf-stat.overall.cache-miss-rate%
>       1.81           -24.3%       1.37        perf-stat.overall.cpi
>       0.24 ± 19%      -0.2        0.07 ± 15%  perf-stat.overall.dTLB-load-miss-rate%
>       0.01 ± 34%      -0.0        0.00 ± 29%  perf-stat.overall.dTLB-store-miss-rate%
>      47.78 ±  2%     +29.3       77.12        perf-stat.overall.iTLB-load-miss-rate%
>       1486            -9.1%       1351 ±  4%  perf-stat.overall.instructions-per-iTLB-miss
>       0.55           +32.0%       0.73        perf-stat.overall.ipc
>      92.54           -15.4       77.16 ±  2%  perf-stat.overall.node-load-miss-rate%
>      85.82 ±  2%     -48.1       37.76 ±  5%  perf-stat.overall.node-store-miss-rate%
>  2.648e+10           +25.2%  3.314e+10        perf-stat.ps.branch-instructions
>   3.06e+08           -22.1%  2.383e+08        perf-stat.ps.branch-misses
>  1.947e+09           -25.5%  1.451e+09        perf-stat.ps.cache-references
>   14298713 ±  2%     -62.5%    5359285 ±  3%  perf-stat.ps.context-switches
>  2.396e+11            -4.0%  2.299e+11        perf-stat.ps.cpu-cycles
>    1415512 ±  2%     -42.2%     817981 ±  4%  perf-stat.ps.cpu-migrations
>   90073948 ± 19%     -60.4%   35711862 ± 15%  perf-stat.ps.dTLB-load-misses
>  3.811e+10           +29.7%  4.944e+10        perf-stat.ps.dTLB-loads
>    2767291 ± 34%     -56.3%    1210210 ± 29%  perf-stat.ps.dTLB-store-misses
>  2.183e+10           +25.0%  2.729e+10        perf-stat.ps.dTLB-stores
>   89118809 ±  2%     +39.6%  1.244e+08 ±  4%  perf-stat.ps.iTLB-load-misses
>   97404381 ±  2%     -62.2%   36860047 ±  3%  perf-stat.ps.iTLB-loads
>  1.324e+11           +26.7%  1.678e+11        perf-stat.ps.instructions
>       1.62 ± 18%    +164.7%       4.29 ±  8%  perf-stat.ps.major-faults
>       4310 ±  2%     +75.1%       7549 ±  5%  perf-stat.ps.minor-faults
>   41743097 ±  8%     -47.3%   21984450 ±  9%  perf-stat.ps.node-load-misses
>    3356259 ±  5%     +92.6%    6462631        perf-stat.ps.node-loads
>   18414647 ±  7%     -35.7%   11833799 ±  6%  perf-stat.ps.node-store-misses
>    3019790 ± 10%    +545.0%   19478071        perf-stat.ps.node-stores
>       4312 ±  2%     +75.2%       7553 ±  5%  perf-stat.ps.page-faults
>  4.252e+13           -43.7%  2.395e+13        perf-stat.total.instructions
>      29.92 ±  4%     -22.8        7.09 ± 29%  perf-profile.calltrace.cycles-pp.__wake_up_common_lock.pipe_write.vfs_write.ksys_write.do_syscall_64
>      28.53 ±  5%     -21.6        6.92 ± 29%  perf-profile.calltrace.cycles-pp.__wake_up_common.__wake_up_common_lock.pipe_write.vfs_write.ksys_write
>      27.86 ±  5%     -21.1        6.77 ± 29%  perf-profile.calltrace.cycles-pp.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.pipe_write.vfs_write
>      27.55 ±  5%     -20.9        6.68 ± 29%  perf-profile.calltrace.cycles-pp.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.pipe_write
>      22.28 ±  4%     -17.0        5.31 ± 30%  perf-profile.calltrace.cycles-pp.schedule.pipe_read.vfs_read.ksys_read.do_syscall_64
>      21.98 ±  4%     -16.7        5.24 ± 30%  perf-profile.calltrace.cycles-pp.__schedule.schedule.pipe_read.vfs_read.ksys_read
>      12.62 ±  4%      -9.6        3.00 ± 33%  perf-profile.calltrace.cycles-pp.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock
>      34.09            -9.2       24.92 ±  3%  perf-profile.calltrace.cycles-pp.pipe_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
>      11.48 ±  5%      -8.8        2.69 ± 38%  perf-profile.calltrace.cycles-pp.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common
>       9.60 ±  7%      -7.2        2.40 ± 35%  perf-profile.calltrace.cycles-pp.dequeue_task_fair.__schedule.schedule.pipe_read.vfs_read
>      36.39            -6.2       30.20        perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
>      40.40            -6.1       34.28        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
>      40.95            -5.7       35.26        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_read
>      37.43            -5.4       32.07        perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
>       6.30 ± 11%      -5.2        1.09 ± 36%  perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
>       5.66 ± 12%      -5.1        0.58 ± 75%  perf-profile.calltrace.cycles-pp.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       6.46 ± 10%      -5.1        1.40 ± 28%  perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
>       5.53 ± 13%      -5.0        0.56 ± 75%  perf-profile.calltrace.cycles-pp.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
>       5.42 ± 13%      -4.9        0.56 ± 75%  perf-profile.calltrace.cycles-pp.__schedule.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode
>       5.82 ±  9%      -4.7        1.10 ± 37%  perf-profile.calltrace.cycles-pp._raw_spin_lock.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock
>       5.86 ± 16%      -4.6        1.31 ± 37%  perf-profile.calltrace.cycles-pp.enqueue_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function
>       5.26 ±  9%      -4.4        0.89 ± 57%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.try_to_wake_up.autoremove_wake_function.__wake_up_common
>      45.18            -3.5       41.68        perf-profile.calltrace.cycles-pp.__libc_read
>      50.31            -3.2       47.12        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
>       4.00 ± 27%      -2.9        1.09 ± 40%  perf-profile.calltrace.cycles-pp.dequeue_entity.dequeue_task_fair.__schedule.schedule.pipe_read
>      50.75            -2.7       48.06        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_write
>      40.80            -2.6       38.20        perf-profile.calltrace.cycles-pp.pipe_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       3.10 ± 15%      -2.5        0.62 ±103%  perf-profile.calltrace.cycles-pp.update_cfs_group.dequeue_task_fair.__schedule.schedule.pipe_read
>       2.94 ± 12%      -2.3        0.62 ±102%  perf-profile.calltrace.cycles-pp.update_cfs_group.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function
>       2.38 ±  9%      -2.0        0.38 ±102%  perf-profile.calltrace.cycles-pp._raw_spin_lock.__schedule.schedule.pipe_read.vfs_read
>       2.24 ±  7%      -1.8        0.40 ± 71%  perf-profile.calltrace.cycles-pp.prepare_to_wait_event.pipe_read.vfs_read.ksys_read.do_syscall_64
>       2.08 ±  6%      -1.8        0.29 ±100%  perf-profile.calltrace.cycles-pp.pick_next_task_fair.__schedule.schedule.pipe_read.vfs_read
>       2.10 ± 10%      -1.8        0.32 ±104%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.__schedule.schedule.pipe_read
>       2.76 ±  7%      -1.5        1.24 ± 17%  perf-profile.calltrace.cycles-pp.select_task_rq.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock
>       2.27 ±  5%      -1.4        0.88 ± 11%  perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
>       2.43 ±  7%      -1.3        1.16 ± 17%  perf-profile.calltrace.cycles-pp.select_task_rq_fair.select_task_rq.try_to_wake_up.autoremove_wake_function.__wake_up_common
>       2.46 ±  5%      -1.3        1.20 ±  7%  perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
>       1.54 ±  5%      -1.2        0.32 ±101%  perf-profile.calltrace.cycles-pp.update_load_avg.enqueue_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up
>       0.97 ±  9%      -0.3        0.66 ± 19%  perf-profile.calltrace.cycles-pp.select_idle_sibling.select_task_rq_fair.select_task_rq.try_to_wake_up.autoremove_wake_function
>       0.86 ±  6%      +0.2        1.02        perf-profile.calltrace.cycles-pp.__might_fault._copy_from_iter.copy_page_from_iter.pipe_write.vfs_write
>       0.64 ±  9%      +0.5        1.16 ±  5%  perf-profile.calltrace.cycles-pp.mutex_unlock.pipe_read.vfs_read.ksys_read.do_syscall_64
>       0.47 ± 45%      +0.5        0.99 ±  5%  perf-profile.calltrace.cycles-pp.__fget_light.__fdget_pos.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       0.60 ±  8%      +0.5        1.13 ±  5%  perf-profile.calltrace.cycles-pp.__fdget_pos.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
>       0.00            +0.5        0.54 ±  5%  perf-profile.calltrace.cycles-pp.current_time.file_update_time.pipe_write.vfs_write.ksys_write
>       0.00            +0.6        0.56 ±  4%  perf-profile.calltrace.cycles-pp.__might_resched.__might_fault._copy_from_iter.copy_page_from_iter.pipe_write
>       0.00            +0.6        0.56 ±  7%  perf-profile.calltrace.cycles-pp.__might_resched.__might_fault._copy_to_iter.copy_page_to_iter.pipe_read
>       0.00            +0.6        0.58 ±  5%  perf-profile.calltrace.cycles-pp.__might_resched.mutex_lock.pipe_write.vfs_write.ksys_write
>       0.00            +0.6        0.62 ±  3%  perf-profile.calltrace.cycles-pp.__might_resched.mutex_lock.pipe_read.vfs_read.ksys_read
>       0.00            +0.7        0.65 ±  6%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.prepare_to_wait_event.pipe_write.vfs_write
>       0.00            +0.7        0.65 ±  7%  perf-profile.calltrace.cycles-pp.poll_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
>       0.57 ±  5%      +0.7        1.24 ±  6%  perf-profile.calltrace.cycles-pp.__fget_light.__fdget_pos.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       0.00            +0.7        0.72 ±  6%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.prepare_to_wait_event.pipe_write.vfs_write.ksys_write
>       0.00            +0.8        0.75 ±  6%  perf-profile.calltrace.cycles-pp.mutex_spin_on_owner.__mutex_lock.pipe_write.vfs_write.ksys_write
>       0.74 ±  9%      +0.8        1.48 ±  5%  perf-profile.calltrace.cycles-pp.file_update_time.pipe_write.vfs_write.ksys_write.do_syscall_64
>       0.63 ±  5%      +0.8        1.40 ±  5%  perf-profile.calltrace.cycles-pp.__fdget_pos.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
>       0.00            +0.8        0.78 ± 19%  perf-profile.calltrace.cycles-pp.reader__read_event.perf_session__process_events.record__finish_output.__cmd_record
>       0.00            +0.8        0.78 ± 19%  perf-profile.calltrace.cycles-pp.record__finish_output.__cmd_record
>       0.00            +0.8        0.78 ± 19%  perf-profile.calltrace.cycles-pp.perf_session__process_events.record__finish_output.__cmd_record
>       0.00            +0.8        0.80 ± 15%  perf-profile.calltrace.cycles-pp.__cmd_record
>       0.00            +0.8        0.82 ± 11%  perf-profile.calltrace.cycles-pp.intel_idle_irq.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
>       0.00            +0.9        0.85 ±  6%  perf-profile.calltrace.cycles-pp.prepare_to_wait_event.pipe_write.vfs_write.ksys_write.do_syscall_64
>       0.00            +0.9        0.86 ±  4%  perf-profile.calltrace.cycles-pp.current_time.atime_needs_update.touch_atime.pipe_read.vfs_read
>       0.00            +0.9        0.87 ±  5%  perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__libc_write
>       0.00            +0.9        0.88 ±  5%  perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__libc_read
>       0.26 ±100%      +1.0        1.22 ± 10%  perf-profile.calltrace.cycles-pp.__schedule.schedule.pipe_write.vfs_write.ksys_write
>       0.00            +1.0        0.96 ±  6%  perf-profile.calltrace.cycles-pp.__might_fault._copy_to_iter.copy_page_to_iter.pipe_read.vfs_read
>       0.27 ±100%      +1.0        1.23 ± 10%  perf-profile.calltrace.cycles-pp.schedule.pipe_write.vfs_write.ksys_write.do_syscall_64
>       0.00            +1.0        0.97 ±  7%  perf-profile.calltrace.cycles-pp.page_counter_uncharge.uncharge_batch.__mem_cgroup_uncharge.__folio_put.pipe_read
>       0.87 ±  8%      +1.1        1.98 ±  5%  perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.vfs_read.ksys_read.do_syscall_64
>       0.73 ±  6%      +1.1        1.85 ±  5%  perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.vfs_write.ksys_write.do_syscall_64
>       0.00            +1.2        1.15 ±  7%  perf-profile.calltrace.cycles-pp.uncharge_batch.__mem_cgroup_uncharge.__folio_put.pipe_read.vfs_read
>       0.00            +1.2        1.23 ±  6%  perf-profile.calltrace.cycles-pp.__mem_cgroup_uncharge.__folio_put.pipe_read.vfs_read.ksys_read
>       0.00            +1.2        1.24 ±  7%  perf-profile.calltrace.cycles-pp.__folio_put.pipe_read.vfs_read.ksys_read.do_syscall_64
>       0.48 ± 45%      +1.3        1.74 ±  6%  perf-profile.calltrace.cycles-pp.atime_needs_update.touch_atime.pipe_read.vfs_read.ksys_read
>       0.60 ±  7%      +1.3        1.87 ±  8%  perf-profile.calltrace.cycles-pp.__wake_up_common_lock.pipe_read.vfs_read.ksys_read.do_syscall_64
>       1.23 ±  7%      +1.3        2.51 ±  4%  perf-profile.calltrace.cycles-pp.mutex_lock.pipe_read.vfs_read.ksys_read.do_syscall_64
>      43.42            +1.3       44.75        perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
>       0.83 ±  7%      +1.3        2.17 ±  5%  perf-profile.calltrace.cycles-pp.security_file_permission.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       0.98 ±  7%      +1.4        2.36 ±  6%  perf-profile.calltrace.cycles-pp.security_file_permission.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       0.27 ±100%      +1.4        1.70 ±  9%  perf-profile.calltrace.cycles-pp.__wake_up_common.__wake_up_common_lock.pipe_read.vfs_read.ksys_read
>       0.79 ±  8%      +1.4        2.23 ±  6%  perf-profile.calltrace.cycles-pp.touch_atime.pipe_read.vfs_read.ksys_read.do_syscall_64
>       0.18 ±141%      +1.5        1.63 ±  9%  perf-profile.calltrace.cycles-pp.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.pipe_read
>       0.18 ±141%      +1.5        1.67 ±  9%  perf-profile.calltrace.cycles-pp.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.pipe_read.vfs_read
>       0.00            +1.6        1.57 ± 10%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
>       0.00            +1.6        1.57 ± 10%  perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
>       1.05 ±  8%      +1.7        2.73 ±  6%  perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyin._copy_from_iter.copy_page_from_iter.pipe_write
>       1.84 ±  9%      +1.7        3.56 ±  5%  perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyout._copy_to_iter.copy_page_to_iter.pipe_read
>       1.41 ±  9%      +1.8        3.17 ±  6%  perf-profile.calltrace.cycles-pp.copyin._copy_from_iter.copy_page_from_iter.pipe_write.vfs_write
>       0.00            +1.8        1.79 ±  9%  perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
>       1.99 ±  9%      +2.0        3.95 ±  5%  perf-profile.calltrace.cycles-pp.copyout._copy_to_iter.copy_page_to_iter.pipe_read.vfs_read
>       2.40 ±  7%      +2.4        4.82 ±  5%  perf-profile.calltrace.cycles-pp._copy_from_iter.copy_page_from_iter.pipe_write.vfs_write.ksys_write
>       0.00            +2.5        2.50 ±  7%  perf-profile.calltrace.cycles-pp.__mutex_lock.pipe_write.vfs_write.ksys_write.do_syscall_64
>       2.89 ±  8%      +2.6        5.47 ±  5%  perf-profile.calltrace.cycles-pp.copy_page_from_iter.pipe_write.vfs_write.ksys_write.do_syscall_64
>       1.04 ± 30%      +2.8        3.86 ±  5%  perf-profile.calltrace.cycles-pp.__entry_text_start.__libc_write
>       0.00            +2.9        2.90 ± 11%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
>       0.00            +2.9        2.91 ± 11%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
>       0.00            +2.9        2.91 ± 11%  perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64_no_verify
>       0.85 ± 27%      +2.9        3.80 ±  5%  perf-profile.calltrace.cycles-pp.__entry_text_start.__libc_read
>       0.00            +3.0        2.96 ± 11%  perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
>       2.60 ±  9%      +3.1        5.74 ±  6%  perf-profile.calltrace.cycles-pp._copy_to_iter.copy_page_to_iter.pipe_read.vfs_read.ksys_read
>       2.93 ±  9%      +3.7        6.66 ±  5%  perf-profile.calltrace.cycles-pp.copy_page_to_iter.pipe_read.vfs_read.ksys_read.do_syscall_64
>       1.60 ± 12%      +4.6        6.18 ±  7%  perf-profile.calltrace.cycles-pp.mutex_unlock.pipe_write.vfs_write.ksys_write.do_syscall_64
>       2.60 ± 10%      +4.6        7.24 ±  5%  perf-profile.calltrace.cycles-pp.mutex_lock.pipe_write.vfs_write.ksys_write.do_syscall_64
>      28.75 ±  5%     -21.6        7.19 ± 28%  perf-profile.children.cycles-pp.schedule
>      30.52 ±  4%     -21.6        8.97 ± 22%  perf-profile.children.cycles-pp.__wake_up_common_lock
>      28.53 ±  6%     -21.0        7.56 ± 26%  perf-profile.children.cycles-pp.__schedule
>      29.04 ±  5%     -20.4        8.63 ± 23%  perf-profile.children.cycles-pp.__wake_up_common
>      28.37 ±  5%     -19.9        8.44 ± 23%  perf-profile.children.cycles-pp.autoremove_wake_function
>      28.08 ±  5%     -19.7        8.33 ± 23%  perf-profile.children.cycles-pp.try_to_wake_up
>      13.90 ±  2%     -10.2        3.75 ± 28%  perf-profile.children.cycles-pp.ttwu_do_activate
>      12.66 ±  3%      -9.2        3.47 ± 29%  perf-profile.children.cycles-pp.enqueue_task_fair
>      34.20            -9.2       25.05 ±  3%  perf-profile.children.cycles-pp.pipe_read
>      90.86            -9.1       81.73        perf-profile.children.cycles-pp.do_syscall_64
>      91.80            -8.3       83.49        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
>      10.28 ±  7%      -7.8        2.53 ± 27%  perf-profile.children.cycles-pp._raw_spin_lock
>       9.85 ±  7%      -6.9        2.92 ± 29%  perf-profile.children.cycles-pp.dequeue_task_fair
>       8.69 ±  7%      -6.6        2.05 ± 24%  perf-profile.children.cycles-pp.exit_to_user_mode_prepare
>       8.99 ±  6%      -6.2        2.81 ± 16%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode
>      36.46            -6.1       30.34        perf-profile.children.cycles-pp.vfs_read
>       8.38 ±  8%      -5.8        2.60 ± 23%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
>       6.10 ± 11%      -5.4        0.66 ± 61%  perf-profile.children.cycles-pp.exit_to_user_mode_loop
>      37.45            -5.3       32.13        perf-profile.children.cycles-pp.ksys_read
>       6.50 ± 35%      -4.9        1.62 ± 61%  perf-profile.children.cycles-pp.update_curr
>       6.56 ± 15%      -4.6        1.95 ± 57%  perf-profile.children.cycles-pp.update_cfs_group
>       6.38 ± 14%      -4.5        1.91 ± 28%  perf-profile.children.cycles-pp.enqueue_entity
>       5.74 ±  5%      -3.8        1.92 ± 25%  perf-profile.children.cycles-pp.update_load_avg
>      45.56            -3.8       41.75        perf-profile.children.cycles-pp.__libc_read
>       3.99 ±  4%      -3.1        0.92 ± 24%  perf-profile.children.cycles-pp.pick_next_task_fair
>       4.12 ± 27%      -2.7        1.39 ± 34%  perf-profile.children.cycles-pp.dequeue_entity
>      40.88            -2.5       38.37        perf-profile.children.cycles-pp.pipe_write
>       3.11 ±  4%      -2.4        0.75 ± 22%  perf-profile.children.cycles-pp.switch_mm_irqs_off
>       2.06 ± 33%      -1.8        0.27 ± 27%  perf-profile.children.cycles-pp.asm_sysvec_call_function_single
>       2.38 ± 41%      -1.8        0.60 ± 72%  perf-profile.children.cycles-pp.perf_trace_sched_wakeup_template
>       2.29 ±  5%      -1.7        0.60 ± 25%  perf-profile.children.cycles-pp.switch_fpu_return
>       2.30 ±  6%      -1.6        0.68 ± 18%  perf-profile.children.cycles-pp.prepare_task_switch
>       1.82 ± 33%      -1.6        0.22 ± 31%  perf-profile.children.cycles-pp.sysvec_call_function_single
>       1.77 ± 33%      -1.6        0.20 ± 32%  perf-profile.children.cycles-pp.__sysvec_call_function_single
>       1.96 ±  5%      -1.5        0.50 ± 20%  perf-profile.children.cycles-pp.reweight_entity
>       2.80 ±  7%      -1.2        1.60 ± 12%  perf-profile.children.cycles-pp.select_task_rq
>       1.61 ±  6%      -1.2        0.42 ± 25%  perf-profile.children.cycles-pp.restore_fpregs_from_fpstate
>       1.34 ±  9%      -1.2        0.16 ± 28%  perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
>       1.62 ±  4%      -1.2        0.45 ± 22%  perf-profile.children.cycles-pp.set_next_entity
>       1.55 ±  8%      -1.1        0.43 ± 12%  perf-profile.children.cycles-pp.update_rq_clock
>       1.49 ±  8%      -1.1        0.41 ± 14%  perf-profile.children.cycles-pp.__update_load_avg_cfs_rq
>       1.30 ± 20%      -1.0        0.26 ± 18%  perf-profile.children.cycles-pp.finish_task_switch
>       1.44 ±  5%      -1.0        0.42 ± 19%  perf-profile.children.cycles-pp.__switch_to_asm
>       2.47 ±  7%      -1.0        1.50 ± 12%  perf-profile.children.cycles-pp.select_task_rq_fair
>       2.33 ±  7%      -0.9        1.40 ±  3%  perf-profile.children.cycles-pp.prepare_to_wait_event
>       1.24 ±  7%      -0.9        0.35 ± 14%  perf-profile.children.cycles-pp.__update_load_avg_se
>       1.41 ± 32%      -0.9        0.56 ± 24%  perf-profile.children.cycles-pp.sched_ttwu_pending
>       2.29 ±  8%      -0.8        1.45 ±  3%  perf-profile.children.cycles-pp._raw_spin_lock_irqsave
>       1.04 ±  7%      -0.8        0.24 ± 22%  perf-profile.children.cycles-pp.check_preempt_curr
>       1.01 ±  3%      -0.7        0.30 ± 20%  perf-profile.children.cycles-pp.__switch_to
>       0.92 ±  7%      -0.7        0.26 ± 12%  perf-profile.children.cycles-pp.update_min_vruntime
>       0.71 ±  2%      -0.6        0.08 ± 75%  perf-profile.children.cycles-pp.put_prev_entity
>       0.76 ±  6%      -0.6        0.14 ± 32%  perf-profile.children.cycles-pp.check_preempt_wakeup
>       0.81 ± 66%      -0.6        0.22 ± 34%  perf-profile.children.cycles-pp.set_task_cpu
>       0.82 ± 17%      -0.6        0.23 ± 10%  perf-profile.children.cycles-pp.cpuacct_charge
>       1.08 ± 15%      -0.6        0.51 ± 10%  perf-profile.children.cycles-pp.wake_affine
>       0.56 ± 15%      -0.5        0.03 ±100%  perf-profile.children.cycles-pp.asm_sysvec_reschedule_ipi
>       0.66 ±  3%      -0.5        0.15 ± 28%  perf-profile.children.cycles-pp.os_xsave
>       0.52 ± 44%      -0.5        0.06 ±151%  perf-profile.children.cycles-pp.native_irq_return_iret
>       0.55 ±  5%      -0.4        0.15 ± 21%  perf-profile.children.cycles-pp.__calc_delta
>       0.56 ± 10%      -0.4        0.17 ± 26%  perf-profile.children.cycles-pp.___perf_sw_event
>       0.70 ± 15%      -0.4        0.32 ± 11%  perf-profile.children.cycles-pp.task_h_load
>       0.40 ±  4%      -0.3        0.06 ± 49%  perf-profile.children.cycles-pp.pick_next_entity
>       0.57 ±  6%      -0.3        0.26 ±  7%  perf-profile.children.cycles-pp.__list_del_entry_valid
>       0.39 ±  8%      -0.3        0.08 ± 24%  perf-profile.children.cycles-pp.set_next_buddy
>       0.64 ±  6%      -0.3        0.36 ±  6%  perf-profile.children.cycles-pp._raw_spin_lock_irq
>       0.53 ± 20%      -0.3        0.25 ±  8%  perf-profile.children.cycles-pp.ttwu_queue_wakelist
>       0.36 ±  8%      -0.3        0.08 ± 11%  perf-profile.children.cycles-pp.rb_insert_color
>       0.41 ±  6%      -0.3        0.14 ± 17%  perf-profile.children.cycles-pp.sched_clock_cpu
>       0.36 ± 33%      -0.3        0.10 ± 17%  perf-profile.children.cycles-pp.__flush_smp_call_function_queue
>       0.37 ±  4%      -0.2        0.13 ± 16%  perf-profile.children.cycles-pp.native_sched_clock
>       0.28 ±  5%      -0.2        0.07 ± 18%  perf-profile.children.cycles-pp.rb_erase
>       0.32 ±  7%      -0.2        0.12 ± 10%  perf-profile.children.cycles-pp.__list_add_valid
>       0.23 ±  6%      -0.2        0.03 ±103%  perf-profile.children.cycles-pp.resched_curr
>       0.27 ±  5%      -0.2        0.08 ± 20%  perf-profile.children.cycles-pp.__wrgsbase_inactive
>       0.26 ±  6%      -0.2        0.08 ± 17%  perf-profile.children.cycles-pp.finish_wait
>       0.26 ±  4%      -0.2        0.08 ± 11%  perf-profile.children.cycles-pp.rcu_note_context_switch
>       0.33 ± 21%      -0.2        0.15 ± 32%  perf-profile.children.cycles-pp.migrate_task_rq_fair
>       0.22 ±  9%      -0.2        0.07 ± 22%  perf-profile.children.cycles-pp.perf_trace_buf_update
>       0.17 ±  8%      -0.1        0.03 ±100%  perf-profile.children.cycles-pp.rb_next
>       0.15 ± 32%      -0.1        0.03 ±100%  perf-profile.children.cycles-pp.llist_reverse_order
>       0.34 ±  7%      -0.1        0.26 ±  3%  perf-profile.children.cycles-pp.anon_pipe_buf_release
>       0.14 ±  6%      -0.1        0.07 ± 17%  perf-profile.children.cycles-pp.read@plt
>       0.10 ± 17%      -0.1        0.04 ± 75%  perf-profile.children.cycles-pp.remove_entity_load_avg
>       0.07 ± 10%      -0.0        0.02 ± 99%  perf-profile.children.cycles-pp.generic_update_time
>       0.11 ±  6%      -0.0        0.07 ±  8%  perf-profile.children.cycles-pp.__mark_inode_dirty
>       0.00            +0.1        0.06 ±  9%  perf-profile.children.cycles-pp.load_balance
>       0.00            +0.1        0.06 ± 11%  perf-profile.children.cycles-pp._raw_spin_trylock
>       0.00            +0.1        0.06 ±  7%  perf-profile.children.cycles-pp.uncharge_folio
>       0.00            +0.1        0.06 ±  7%  perf-profile.children.cycles-pp.__do_softirq
>       0.00            +0.1        0.07 ± 10%  perf-profile.children.cycles-pp.tick_nohz_get_sleep_length
>       0.00            +0.1        0.08 ± 14%  perf-profile.children.cycles-pp.__get_obj_cgroup_from_memcg
>       0.15 ± 23%      +0.1        0.23 ±  7%  perf-profile.children.cycles-pp.task_tick_fair
>       0.19 ± 17%      +0.1        0.28 ±  7%  perf-profile.children.cycles-pp.scheduler_tick
>       0.00            +0.1        0.10 ± 21%  perf-profile.children.cycles-pp.select_idle_core
>       0.00            +0.1        0.10 ±  9%  perf-profile.children.cycles-pp.osq_unlock
>       0.23 ± 12%      +0.1        0.34 ±  6%  perf-profile.children.cycles-pp.update_process_times
>       0.37 ± 13%      +0.1        0.48 ±  5%  perf-profile.children.cycles-pp.hrtimer_interrupt
>       0.24 ± 12%      +0.1        0.35 ±  6%  perf-profile.children.cycles-pp.tick_sched_handle
>       0.31 ± 14%      +0.1        0.43 ±  4%  perf-profile.children.cycles-pp.__hrtimer_run_queues
>       0.37 ± 12%      +0.1        0.49 ±  5%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
>       0.00            +0.1        0.12 ± 10%  perf-profile.children.cycles-pp.__mod_memcg_state
>       0.26 ± 10%      +0.1        0.38 ±  6%  perf-profile.children.cycles-pp.tick_sched_timer
>       0.00            +0.1        0.13 ±  7%  perf-profile.children.cycles-pp.free_unref_page
>       0.00            +0.1        0.14 ±  8%  perf-profile.children.cycles-pp.rmqueue
>       0.15 ±  8%      +0.2        0.30 ±  5%  perf-profile.children.cycles-pp.rcu_all_qs
>       0.16 ±  6%      +0.2        0.31 ±  5%  perf-profile.children.cycles-pp.__x64_sys_write
>       0.00            +0.2        0.16 ± 10%  perf-profile.children.cycles-pp.propagate_protected_usage
>       0.00            +0.2        0.16 ± 10%  perf-profile.children.cycles-pp.menu_select
>       0.00            +0.2        0.16 ±  9%  perf-profile.children.cycles-pp.memcg_account_kmem
>       0.42 ± 12%      +0.2        0.57 ±  4%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
>       0.15 ± 11%      +0.2        0.31 ±  8%  perf-profile.children.cycles-pp.__x64_sys_read
>       0.00            +0.2        0.17 ±  8%  perf-profile.children.cycles-pp.get_page_from_freelist
>       0.44 ± 11%      +0.2        0.62 ±  4%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
>       0.10 ± 31%      +0.2        0.28 ± 24%  perf-profile.children.cycles-pp.mnt_user_ns
>       0.16 ±  4%      +0.2        0.35 ±  5%  perf-profile.children.cycles-pp.kill_fasync
>       0.20 ± 10%      +0.2        0.40 ±  3%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
>       0.09 ±  7%      +0.2        0.29 ±  4%  perf-profile.children.cycles-pp.page_copy_sane
>       0.08 ±  8%      +0.2        0.31 ±  6%  perf-profile.children.cycles-pp.rw_verify_area
>       0.12 ± 11%      +0.2        0.36 ±  8%  perf-profile.children.cycles-pp.ktime_get_coarse_real_ts64
>       0.28 ± 12%      +0.2        0.52 ±  5%  perf-profile.children.cycles-pp.inode_needs_update_time
>       0.00            +0.3        0.27 ±  7%  perf-profile.children.cycles-pp.__memcg_kmem_charge_page
>       0.43 ±  6%      +0.3        0.73 ±  5%  perf-profile.children.cycles-pp.__cond_resched
>       0.21 ± 29%      +0.3        0.54 ± 15%  perf-profile.children.cycles-pp.select_idle_cpu
>       0.10 ± 10%      +0.3        0.43 ± 17%  perf-profile.children.cycles-pp.fsnotify_perm
>       0.23 ± 11%      +0.3        0.56 ±  6%  perf-profile.children.cycles-pp.syscall_enter_from_user_mode
>       0.06 ± 75%      +0.4        0.47 ± 27%  perf-profile.children.cycles-pp.queue_event
>       0.21 ±  9%      +0.4        0.62 ±  5%  perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
>       0.06 ± 75%      +0.4        0.48 ± 26%  perf-profile.children.cycles-pp.ordered_events__queue
>       0.06 ± 73%      +0.4        0.50 ± 24%  perf-profile.children.cycles-pp.process_simple
>       0.01 ±223%      +0.4        0.44 ±  9%  perf-profile.children.cycles-pp.schedule_idle
>       0.05 ±  8%      +0.5        0.52 ±  7%  perf-profile.children.cycles-pp.__alloc_pages
>       0.45 ±  7%      +0.5        0.94 ±  5%  perf-profile.children.cycles-pp.__get_task_ioprio
>       0.89 ±  8%      +0.5        1.41 ±  4%  perf-profile.children.cycles-pp.__might_sleep
>       0.01 ±223%      +0.5        0.54 ± 21%  perf-profile.children.cycles-pp.flush_smp_call_function_queue
>       0.05 ± 46%      +0.5        0.60 ±  7%  perf-profile.children.cycles-pp.osq_lock
>       0.34 ±  8%      +0.6        0.90 ±  5%  perf-profile.children.cycles-pp.aa_file_perm
>       0.01 ±223%      +0.7        0.67 ±  7%  perf-profile.children.cycles-pp.poll_idle
>       0.14 ± 17%      +0.7        0.82 ±  6%  perf-profile.children.cycles-pp.mutex_spin_on_owner
>       0.12 ± 12%      +0.7        0.82 ± 15%  perf-profile.children.cycles-pp.__cmd_record
>       0.07 ± 72%      +0.7        0.78 ± 19%  perf-profile.children.cycles-pp.reader__read_event
>       0.07 ± 72%      +0.7        0.78 ± 19%  perf-profile.children.cycles-pp.record__finish_output
>       0.07 ± 72%      +0.7        0.78 ± 19%  perf-profile.children.cycles-pp.perf_session__process_events
>       0.76 ±  8%      +0.8        1.52 ±  5%  perf-profile.children.cycles-pp.file_update_time
>       0.08 ± 61%      +0.8        0.85 ± 11%  perf-profile.children.cycles-pp.intel_idle_irq
>       1.23 ±  8%      +0.9        2.11 ±  4%  perf-profile.children.cycles-pp.__might_fault
>       0.02 ±141%      +1.0        0.97 ±  7%  perf-profile.children.cycles-pp.page_counter_uncharge
>       0.51 ±  9%      +1.0        1.48 ±  4%  perf-profile.children.cycles-pp.current_time
>       0.05 ± 46%      +1.1        1.15 ±  7%  perf-profile.children.cycles-pp.uncharge_batch
>       1.12 ±  6%      +1.1        2.23 ±  5%  perf-profile.children.cycles-pp.__fget_light
>       0.06 ± 14%      +1.2        1.23 ±  6%  perf-profile.children.cycles-pp.__mem_cgroup_uncharge
>       0.06 ± 14%      +1.2        1.24 ±  7%  perf-profile.children.cycles-pp.__folio_put
>       0.64 ±  7%      +1.2        1.83 ±  5%  perf-profile.children.cycles-pp.syscall_return_via_sysret
>       1.19 ±  8%      +1.2        2.42 ±  4%  perf-profile.children.cycles-pp.__might_resched
>       0.59 ±  9%      +1.3        1.84 ±  6%  perf-profile.children.cycles-pp.atime_needs_update
>      43.47            +1.4       44.83        perf-profile.children.cycles-pp.ksys_write
>       1.28 ±  6%      +1.4        2.68 ±  5%  perf-profile.children.cycles-pp.__fdget_pos
>       0.80 ±  8%      +1.5        2.28 ±  6%  perf-profile.children.cycles-pp.touch_atime
>       0.11 ± 49%      +1.5        1.59 ±  9%  perf-profile.children.cycles-pp.cpuidle_enter_state
>       0.11 ± 49%      +1.5        1.60 ±  9%  perf-profile.children.cycles-pp.cpuidle_enter
>       0.12 ± 51%      +1.7        1.81 ±  9%  perf-profile.children.cycles-pp.cpuidle_idle_call
>       1.44 ±  8%      +1.8        3.22 ±  6%  perf-profile.children.cycles-pp.copyin
>       2.00 ±  9%      +2.0        4.03 ±  5%  perf-profile.children.cycles-pp.copyout
>       1.02 ±  8%      +2.0        3.07 ±  5%  perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
>       1.63 ±  7%      +2.3        3.90 ±  5%  perf-profile.children.cycles-pp.apparmor_file_permission
>       2.64 ±  8%      +2.3        4.98 ±  5%  perf-profile.children.cycles-pp._copy_from_iter
>       0.40 ± 14%      +2.5        2.92 ±  7%  perf-profile.children.cycles-pp.__mutex_lock
>       2.91 ±  8%      +2.6        5.54 ±  5%  perf-profile.children.cycles-pp.copy_page_from_iter
>       0.17 ± 62%      +2.7        2.91 ± 11%  perf-profile.children.cycles-pp.start_secondary
>       1.83 ±  7%      +2.8        4.59 ±  5%  perf-profile.children.cycles-pp.security_file_permission
>       0.17 ± 60%      +2.8        2.94 ± 11%  perf-profile.children.cycles-pp.do_idle
>       0.17 ± 60%      +2.8        2.96 ± 11%  perf-profile.children.cycles-pp.secondary_startup_64_no_verify
>       0.17 ± 60%      +2.8        2.96 ± 11%  perf-profile.children.cycles-pp.cpu_startup_entry
>       2.62 ±  9%      +3.2        5.84 ±  6%  perf-profile.children.cycles-pp._copy_to_iter
>       1.55 ±  8%      +3.2        4.79 ±  5%  perf-profile.children.cycles-pp.__entry_text_start
>       3.09 ±  8%      +3.7        6.77 ±  5%  perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
>       2.95 ±  9%      +3.8        6.73 ±  5%  perf-profile.children.cycles-pp.copy_page_to_iter
>       2.28 ± 11%      +5.1        7.40 ±  6%  perf-profile.children.cycles-pp.mutex_unlock
>       3.92 ±  9%      +6.0        9.94 ±  5%  perf-profile.children.cycles-pp.mutex_lock
>       8.37 ±  9%      -5.8        2.60 ± 23%  perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
>       6.54 ± 15%      -4.6        1.95 ± 57%  perf-profile.self.cycles-pp.update_cfs_group
>       3.08 ±  4%      -2.3        0.74 ± 22%  perf-profile.self.cycles-pp.switch_mm_irqs_off
>       2.96 ±  4%      -1.8        1.13 ± 33%  perf-profile.self.cycles-pp.update_load_avg
>       2.22 ±  8%      -1.5        0.74 ± 12%  perf-profile.self.cycles-pp._raw_spin_lock_irqsave
>       1.96 ±  9%      -1.5        0.48 ± 15%  perf-profile.self.cycles-pp.update_curr
>       1.94 ±  5%      -1.3        0.64 ± 16%  perf-profile.self.cycles-pp._raw_spin_lock
>       1.78 ±  5%      -1.3        0.50 ± 18%  perf-profile.self.cycles-pp.__schedule
>       1.59 ±  7%      -1.2        0.40 ± 12%  perf-profile.self.cycles-pp.enqueue_entity
>       1.61 ±  6%      -1.2        0.42 ± 25%  perf-profile.self.cycles-pp.restore_fpregs_from_fpstate
>       1.44 ±  8%      -1.0        0.39 ± 14%  perf-profile.self.cycles-pp.__update_load_avg_cfs_rq
>       1.42 ±  5%      -1.0        0.41 ± 19%  perf-profile.self.cycles-pp.__switch_to_asm
>       1.18 ±  7%      -0.9        0.33 ± 14%  perf-profile.self.cycles-pp.__update_load_avg_se
>       1.14 ± 10%      -0.8        0.31 ±  9%  perf-profile.self.cycles-pp.update_rq_clock
>       0.90 ±  7%      -0.7        0.19 ± 21%  perf-profile.self.cycles-pp.pick_next_task_fair
>       1.04 ±  7%      -0.7        0.33 ± 13%  perf-profile.self.cycles-pp.prepare_task_switch
>       0.98 ±  4%      -0.7        0.29 ± 20%  perf-profile.self.cycles-pp.__switch_to
>       0.88 ±  6%      -0.7        0.20 ± 17%  perf-profile.self.cycles-pp.enqueue_task_fair
>       1.01 ±  6%      -0.7        0.35 ± 10%  perf-profile.self.cycles-pp.prepare_to_wait_event
>       0.90 ±  8%      -0.6        0.25 ± 12%  perf-profile.self.cycles-pp.update_min_vruntime
>       0.79 ± 17%      -0.6        0.22 ±  9%  perf-profile.self.cycles-pp.cpuacct_charge
>       1.10 ±  5%      -0.6        0.54 ±  9%  perf-profile.self.cycles-pp.try_to_wake_up
>       0.66 ±  3%      -0.5        0.15 ± 27%  perf-profile.self.cycles-pp.os_xsave
>       0.71 ±  6%      -0.5        0.22 ± 18%  perf-profile.self.cycles-pp.reweight_entity
>       0.68 ±  9%      -0.5        0.19 ± 10%  perf-profile.self.cycles-pp.perf_trace_sched_wakeup_template
>       0.67 ±  9%      -0.5        0.18 ± 11%  perf-profile.self.cycles-pp.__wake_up_common
>       0.65 ±  6%      -0.5        0.17 ± 23%  perf-profile.self.cycles-pp.switch_fpu_return
>       0.60 ± 11%      -0.5        0.14 ± 28%  perf-profile.self.cycles-pp.perf_tp_event
>       0.52 ± 44%      -0.5        0.06 ±151%  perf-profile.self.cycles-pp.native_irq_return_iret
>       0.52 ±  7%      -0.4        0.08 ± 25%  perf-profile.self.cycles-pp._raw_spin_unlock_irqrestore
>       0.55 ±  4%      -0.4        0.15 ± 22%  perf-profile.self.cycles-pp.__calc_delta
>       0.61 ±  5%      -0.4        0.21 ± 12%  perf-profile.self.cycles-pp.dequeue_task_fair
>       0.69 ± 14%      -0.4        0.32 ± 11%  perf-profile.self.cycles-pp.task_h_load
>       0.49 ± 11%      -0.3        0.15 ± 29%  perf-profile.self.cycles-pp.___perf_sw_event
>       0.37 ±  4%      -0.3        0.05 ± 73%  perf-profile.self.cycles-pp.pick_next_entity
>       0.50 ±  3%      -0.3        0.19 ± 15%  perf-profile.self.cycles-pp.select_idle_sibling
>       0.38 ±  9%      -0.3        0.08 ± 24%  perf-profile.self.cycles-pp.set_next_buddy
>       0.32 ±  4%      -0.3        0.03 ±100%  perf-profile.self.cycles-pp.put_prev_entity
>       0.64 ±  6%      -0.3        0.35 ±  7%  perf-profile.self.cycles-pp._raw_spin_lock_irq
>       0.52 ±  5%      -0.3        0.25 ±  6%  perf-profile.self.cycles-pp.__list_del_entry_valid
>       0.34 ±  5%      -0.3        0.07 ± 29%  perf-profile.self.cycles-pp.schedule
>       0.35 ±  9%      -0.3        0.08 ± 10%  perf-profile.self.cycles-pp.rb_insert_color
>       0.40 ±  5%      -0.3        0.14 ± 16%  perf-profile.self.cycles-pp.select_task_rq_fair
>       0.33 ±  6%      -0.3        0.08 ± 16%  perf-profile.self.cycles-pp.check_preempt_wakeup
>       0.33 ±  8%      -0.2        0.10 ± 16%  perf-profile.self.cycles-pp.select_task_rq
>       0.36 ±  3%      -0.2        0.13 ± 16%  perf-profile.self.cycles-pp.native_sched_clock
>       0.32 ±  7%      -0.2        0.10 ± 14%  perf-profile.self.cycles-pp.finish_task_switch
>       0.32 ±  4%      -0.2        0.11 ± 13%  perf-profile.self.cycles-pp.dequeue_entity
>       0.32 ±  8%      -0.2        0.12 ± 10%  perf-profile.self.cycles-pp.__list_add_valid
>       0.23 ±  5%      -0.2        0.03 ±103%  perf-profile.self.cycles-pp.resched_curr
>       0.27 ±  6%      -0.2        0.07 ± 21%  perf-profile.self.cycles-pp.rb_erase
>       0.27 ±  5%      -0.2        0.08 ± 20%  perf-profile.self.cycles-pp.__wrgsbase_inactive
>       0.28 ± 13%      -0.2        0.09 ± 12%  perf-profile.self.cycles-pp.check_preempt_curr
>       0.30 ± 13%      -0.2        0.12 ±  7%  perf-profile.self.cycles-pp.ttwu_queue_wakelist
>       0.24 ±  5%      -0.2        0.06 ± 19%  perf-profile.self.cycles-pp.set_next_entity
>       0.21 ± 34%      -0.2        0.04 ± 71%  perf-profile.self.cycles-pp.__flush_smp_call_function_queue
>       0.25 ±  5%      -0.2        0.08 ± 16%  perf-profile.self.cycles-pp.rcu_note_context_switch
>       0.19 ± 26%      -0.1        0.04 ± 73%  perf-profile.self.cycles-pp.perf_trace_sched_stat_runtime
>       0.20 ±  8%      -0.1        0.06 ± 13%  perf-profile.self.cycles-pp.ttwu_do_activate
>       0.17 ±  8%      -0.1        0.03 ±100%  perf-profile.self.cycles-pp.rb_next
>       0.22 ± 23%      -0.1        0.09 ± 31%  perf-profile.self.cycles-pp.migrate_task_rq_fair
>       0.15 ± 32%      -0.1        0.03 ±100%  perf-profile.self.cycles-pp.llist_reverse_order
>       0.16 ±  8%      -0.1        0.06 ± 14%  perf-profile.self.cycles-pp.wake_affine
>       0.10 ± 31%      -0.1        0.03 ±100%  perf-profile.self.cycles-pp.sched_ttwu_pending
>       0.14 ±  5%      -0.1        0.07 ± 20%  perf-profile.self.cycles-pp.read@plt
>       0.32 ±  8%      -0.1        0.26 ±  3%  perf-profile.self.cycles-pp.anon_pipe_buf_release
>       0.10 ±  6%      -0.1        0.04 ± 45%  perf-profile.self.cycles-pp.__wake_up_common_lock
>       0.10 ±  9%      -0.0        0.07 ±  8%  perf-profile.self.cycles-pp.__mark_inode_dirty
>       0.00            +0.1        0.06 ± 11%  perf-profile.self.cycles-pp.free_unref_page
>       0.00            +0.1        0.06 ±  6%  perf-profile.self.cycles-pp.__alloc_pages
>       0.00            +0.1        0.06 ± 11%  perf-profile.self.cycles-pp._raw_spin_trylock
>       0.00            +0.1        0.06 ±  7%  perf-profile.self.cycles-pp.uncharge_folio
>       0.00            +0.1        0.06 ± 11%  perf-profile.self.cycles-pp.uncharge_batch
>       0.00            +0.1        0.07 ± 10%  perf-profile.self.cycles-pp.menu_select
>       0.00            +0.1        0.08 ± 14%  perf-profile.self.cycles-pp.__get_obj_cgroup_from_memcg
>       0.00            +0.1        0.08 ±  7%  perf-profile.self.cycles-pp.__memcg_kmem_charge_page
>       0.00            +0.1        0.10 ± 10%  perf-profile.self.cycles-pp.osq_unlock
>       0.07 ±  5%      +0.1        0.17 ±  8%  perf-profile.self.cycles-pp.copyin
>       0.00            +0.1        0.11 ± 11%  perf-profile.self.cycles-pp.__mod_memcg_state
>       0.13 ±  8%      +0.1        0.24 ±  6%  perf-profile.self.cycles-pp.rcu_all_qs
>       0.14 ±  5%      +0.1        0.28 ±  5%  perf-profile.self.cycles-pp.__x64_sys_write
>       0.07 ± 10%      +0.1        0.21 ±  5%  perf-profile.self.cycles-pp.page_copy_sane
>       0.13 ± 12%      +0.1        0.28 ±  9%  perf-profile.self.cycles-pp.__x64_sys_read
>       0.00            +0.2        0.15 ± 10%  perf-profile.self.cycles-pp.propagate_protected_usage
>       0.18 ±  9%      +0.2        0.33 ±  4%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
>       0.07 ±  8%      +0.2        0.23 ±  5%  perf-profile.self.cycles-pp.rw_verify_area
>       0.08 ± 34%      +0.2        0.24 ± 27%  perf-profile.self.cycles-pp.mnt_user_ns
>       0.13 ±  5%      +0.2        0.31 ±  7%  perf-profile.self.cycles-pp.kill_fasync
>       0.21 ±  8%      +0.2        0.39 ±  5%  perf-profile.self.cycles-pp.__might_fault
>       0.06 ± 13%      +0.2        0.26 ±  9%  perf-profile.self.cycles-pp.copyout
>       0.10 ± 11%      +0.2        0.31 ±  8%  perf-profile.self.cycles-pp.ktime_get_coarse_real_ts64
>       0.26 ± 13%      +0.2        0.49 ±  6%  perf-profile.self.cycles-pp.inode_needs_update_time
>       0.23 ±  8%      +0.2        0.47 ±  5%  perf-profile.self.cycles-pp.copy_page_from_iter
>       0.14 ±  7%      +0.2        0.38 ±  6%  perf-profile.self.cycles-pp.file_update_time
>       0.36 ±  7%      +0.3        0.62 ±  4%  perf-profile.self.cycles-pp.ksys_read
>       0.54 ± 13%      +0.3        0.80 ±  4%  perf-profile.self.cycles-pp._copy_from_iter
>       0.15 ±  5%      +0.3        0.41 ±  8%  perf-profile.self.cycles-pp.touch_atime
>       0.14 ±  5%      +0.3        0.40 ±  6%  perf-profile.self.cycles-pp.__cond_resched
>       0.18 ±  5%      +0.3        0.47 ±  4%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode
>       0.16 ±  8%      +0.3        0.46 ±  6%  perf-profile.self.cycles-pp.syscall_enter_from_user_mode
>       0.16 ±  9%      +0.3        0.47 ±  6%  perf-profile.self.cycles-pp.__fdget_pos
>       1.79 ±  8%      +0.3        2.12 ±  3%  perf-profile.self.cycles-pp.pipe_read
>       0.10 ±  8%      +0.3        0.43 ± 17%  perf-profile.self.cycles-pp.fsnotify_perm
>       0.20 ±  4%      +0.4        0.55 ±  5%  perf-profile.self.cycles-pp.ksys_write
>       0.05 ± 76%      +0.4        0.46 ± 27%  perf-profile.self.cycles-pp.queue_event
>       0.32 ±  6%      +0.4        0.73 ±  6%  perf-profile.self.cycles-pp.exit_to_user_mode_prepare
>       0.21 ±  9%      +0.4        0.62 ±  6%  perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
>       0.79 ±  8%      +0.4        1.22 ±  4%  perf-profile.self.cycles-pp.__might_sleep
>       0.44 ±  5%      +0.4        0.88 ±  7%  perf-profile.self.cycles-pp.do_syscall_64
>       0.26 ±  8%      +0.4        0.70 ±  4%  perf-profile.self.cycles-pp.atime_needs_update
>       0.42 ±  7%      +0.5        0.88 ±  5%  perf-profile.self.cycles-pp.__get_task_ioprio
>       0.28 ± 12%      +0.5        0.75 ±  5%  perf-profile.self.cycles-pp.copy_page_to_iter
>       0.19 ±  6%      +0.5        0.68 ± 10%  perf-profile.self.cycles-pp.security_file_permission
>       0.31 ±  8%      +0.5        0.83 ±  5%  perf-profile.self.cycles-pp.aa_file_perm
>       0.05 ± 46%      +0.5        0.59 ±  8%  perf-profile.self.cycles-pp.osq_lock
>       0.30 ±  7%      +0.5        0.85 ±  6%  perf-profile.self.cycles-pp._copy_to_iter
>       0.00            +0.6        0.59 ±  6%  perf-profile.self.cycles-pp.poll_idle
>       0.13 ± 20%      +0.7        0.81 ±  6%  perf-profile.self.cycles-pp.mutex_spin_on_owner
>       0.38 ±  9%      +0.7        1.12 ±  5%  perf-profile.self.cycles-pp.current_time
>       0.08 ± 59%      +0.8        0.82 ± 11%  perf-profile.self.cycles-pp.intel_idle_irq
>       0.92 ±  6%      +0.8        1.72 ±  4%  perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
>       0.01 ±223%      +0.8        0.82 ±  6%  perf-profile.self.cycles-pp.page_counter_uncharge
>       0.86 ±  7%      +1.1        1.91 ±  4%  perf-profile.self.cycles-pp.vfs_read
>       1.07 ±  6%      +1.1        2.14 ±  5%  perf-profile.self.cycles-pp.__fget_light
>       0.67 ±  7%      +1.1        1.74 ±  6%  perf-profile.self.cycles-pp.vfs_write
>       0.15 ± 12%      +1.1        1.28 ±  7%  perf-profile.self.cycles-pp.__mutex_lock
>       1.09 ±  6%      +1.1        2.22 ±  5%  perf-profile.self.cycles-pp.__libc_read
>       0.62 ±  6%      +1.2        1.79 ±  5%  perf-profile.self.cycles-pp.syscall_return_via_sysret
>       1.16 ±  8%      +1.2        2.38 ±  4%  perf-profile.self.cycles-pp.__might_resched
>       0.91 ±  7%      +1.3        2.20 ±  5%  perf-profile.self.cycles-pp.__libc_write
>       0.59 ±  8%      +1.3        1.93 ±  6%  perf-profile.self.cycles-pp.__entry_text_start
>       1.27 ±  7%      +1.7        3.00 ±  6%  perf-profile.self.cycles-pp.apparmor_file_permission
>       0.99 ±  8%      +2.0        2.98 ±  5%  perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
>       1.74 ±  8%      +3.4        5.15 ±  6%  perf-profile.self.cycles-pp.pipe_write
>       2.98 ±  8%      +3.7        6.64 ±  5%  perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
>       2.62 ± 10%      +4.8        7.38 ±  5%  perf-profile.self.cycles-pp.mutex_lock
>       2.20 ± 10%      +5.1        7.30 ±  6%  perf-profile.self.cycles-pp.mutex_unlock
> 
> 
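[ Aside, not part of the lkp report: for readers skimming the numbers
  above, the vruntime ordering this thread keeps referring to is done
  on u64 values through an s64-wrapped subtraction, so it only stays
  meaningful while the two values are within 2^63 of each other.
  Below is a minimal userspace sketch of how that ordering inverts
  once the gap grows past 2^63; it is a standalone illustration only,
  not kernel code, and the vdiff() helper name is made up for the
  demo. ]

#include <stdio.h>
#include <stdint.h>

/* mirrors the "(s64)(a - b)" idiom used for vruntime ordering */
static int64_t vdiff(uint64_t a, uint64_t b)
{
	return (int64_t)(a - b);
}

int main(void)
{
	uint64_t stale = 100;                 /* entity that slept for ages */
	uint64_t near  = 1000000;             /* base less than 2^63 ahead  */
	uint64_t far   = (1ULL << 63) + 1000; /* base more than 2^63 ahead  */

	/* small gap: the stale value correctly compares as behind */
	printf("near gap: stale %s the base\n",
	       vdiff(stale, near) < 0 ? "is behind" : "looks ahead of");

	/* huge gap: the s64 cast wraps and the ordering inverts */
	printf("huge gap: stale %s the base\n",
	       vdiff(stale, far) < 0 ? "is behind" : "looks ahead of");

	return 0;
}

[ Compiled and run, the first line reports the stale value as behind
  and the second reports it as ahead, which is the inversion under
  discussion in this thread. ]
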
> ***************************************************************************************************
> lkp-skl-fpga01: 104 threads 2 sockets (Skylake) with 192G memory
> =========================================================================================
> compiler/cpufreq_governor/ipc/iterations/kconfig/mode/nr_threads/rootfs/tbox_group/testcase:
>   gcc-11/performance/socket/4/x86_64-rhel-8.3/process/100%/debian-11.1-x86_64-20220510.cgz/lkp-skl-fpga01/hackbench
> 
> commit:
>   a2e90611b9 ("sched/fair: Remove capacity inversion detection")
>   829c1651e9 ("sched/fair: sanitize vruntime of entity being placed")
> 
> a2e90611b9f425ad 829c1651e9c4a6f78398d3e6765
> ---------------- ---------------------------
>          %stddev     %change         %stddev
>              \          |                \
>     177139            -8.1%     162815        hackbench.throughput
>     174484           -18.8%     141618 ±  2%  hackbench.throughput_avg
>     177139            -8.1%     162815        hackbench.throughput_best
>     168530           -37.3%     105615 ±  3%  hackbench.throughput_worst
>     281.38           +23.1%     346.39 ±  2%  hackbench.time.elapsed_time
>     281.38           +23.1%     346.39 ±  2%  hackbench.time.elapsed_time.max
>  1.053e+08 ±  2%    +688.4%  8.302e+08 ±  9%  hackbench.time.involuntary_context_switches
>      21992           +27.8%      28116 ±  2%  hackbench.time.system_time
>       6652            +8.2%       7196        hackbench.time.user_time
>  3.482e+08          +289.2%  1.355e+09 ±  9%  hackbench.time.voluntary_context_switches
>    2110813 ±  5%     +21.6%    2565791 ±  3%  cpuidle..usage
>     333.95           +19.5%     399.05        uptime.boot
>       0.03            -0.0        0.03        mpstat.cpu.all.soft%
>      22.68            -2.9       19.77        mpstat.cpu.all.usr%
>     561083 ± 10%     +45.5%     816171 ± 12%  numa-numastat.node0.local_node
>     614314 ±  9%     +36.9%     841173 ± 12%  numa-numastat.node0.numa_hit
>    1393279 ±  7%     -16.8%    1158997 ±  2%  numa-numastat.node1.local_node
>    1443679 ±  5%     -14.9%    1229074 ±  3%  numa-numastat.node1.numa_hit
>    4129900 ±  8%     -23.0%    3181115        vmstat.memory.cache
>       1731           +30.8%       2265        vmstat.procs.r
>    1598044          +290.3%    6237840 ±  7%  vmstat.system.cs
>     320762           +60.5%     514672 ±  8%  vmstat.system.in
>     962111 ±  6%     +46.0%    1404646 ±  7%  turbostat.C1
>     233987 ±  5%     +51.2%     353892        turbostat.C1E
>   91515563           +97.3%  1.806e+08 ± 10%  turbostat.IRQ
>     448466 ± 14%     -34.2%     294934 ±  5%  turbostat.POLL
>      34.60            -7.3%      32.07        turbostat.RAMWatt
>     514028 ±  2%     -14.0%     442125 ±  2%  meminfo.AnonPages
>    4006312 ±  8%     -23.9%    3047078        meminfo.Cached
>    3321064 ± 10%     -32.7%    2236362 ±  2%  meminfo.Committed_AS
>    1714752 ± 21%     -60.3%     680479 ±  8%  meminfo.Inactive
>    1714585 ± 21%     -60.3%     680305 ±  8%  meminfo.Inactive(anon)
>     757124 ± 18%     -67.2%     248485 ± 27%  meminfo.Mapped
>    6476123 ±  6%     -19.4%    5220738        meminfo.Memused
>    1275724 ± 26%     -75.2%     316896 ± 15%  meminfo.Shmem
>    6806047 ±  3%     -13.3%    5901974        meminfo.max_used_kB
>     161311 ± 23%     +31.7%     212494 ±  5%  numa-meminfo.node0.AnonPages
>     165693 ± 22%     +30.5%     216264 ±  5%  numa-meminfo.node0.Inactive
>     165563 ± 22%     +30.6%     216232 ±  5%  numa-meminfo.node0.Inactive(anon)
>     140638 ± 19%     -36.7%      89034 ± 11%  numa-meminfo.node0.Mapped
>     352173 ± 14%     -35.3%     227805 ±  8%  numa-meminfo.node1.AnonPages
>     501396 ± 11%     -22.6%     388042 ±  5%  numa-meminfo.node1.AnonPages.max
>    1702242 ± 43%     -77.8%     378325 ± 22%  numa-meminfo.node1.FilePages
>    1540803 ± 25%     -70.4%     455592 ± 13%  numa-meminfo.node1.Inactive
>    1540767 ± 25%     -70.4%     455451 ± 13%  numa-meminfo.node1.Inactive(anon)
>     612123 ± 18%     -74.9%     153752 ± 37%  numa-meminfo.node1.Mapped
>    3085231 ± 24%     -53.9%    1420940 ± 14%  numa-meminfo.node1.MemUsed
>     254052 ±  4%     -19.1%     205632 ± 21%  numa-meminfo.node1.SUnreclaim
>    1259640 ± 27%     -75.9%     303123 ± 15%  numa-meminfo.node1.Shmem
>     304597 ±  7%     -20.2%     242920 ± 17%  numa-meminfo.node1.Slab
>      40345 ± 23%     +31.5%      53054 ±  5%  numa-vmstat.node0.nr_anon_pages
>      41412 ± 22%     +30.4%      53988 ±  5%  numa-vmstat.node0.nr_inactive_anon
>      35261 ± 19%     -36.9%      22256 ± 12%  numa-vmstat.node0.nr_mapped
>      41412 ± 22%     +30.4%      53988 ±  5%  numa-vmstat.node0.nr_zone_inactive_anon
>     614185 ±  9%     +36.9%     841065 ± 12%  numa-vmstat.node0.numa_hit
>     560955 ± 11%     +45.5%     816063 ± 12%  numa-vmstat.node0.numa_local
>      88129 ± 14%     -35.2%      57097 ±  8%  numa-vmstat.node1.nr_anon_pages
>     426425 ± 43%     -77.9%      94199 ± 22%  numa-vmstat.node1.nr_file_pages
>     386166 ± 25%     -70.5%     113880 ± 13%  numa-vmstat.node1.nr_inactive_anon
>     153658 ± 18%     -75.3%      38021 ± 37%  numa-vmstat.node1.nr_mapped
>     315775 ± 27%     -76.1%      75399 ± 16%  numa-vmstat.node1.nr_shmem
>      63411 ±  4%     -18.6%      51593 ± 21%  numa-vmstat.node1.nr_slab_unreclaimable
>     386166 ± 25%     -70.5%     113880 ± 13%  numa-vmstat.node1.nr_zone_inactive_anon
>    1443470 ±  5%     -14.9%    1228740 ±  3%  numa-vmstat.node1.numa_hit
>    1393069 ±  7%     -16.8%    1158664 ±  2%  numa-vmstat.node1.numa_local
>     128457 ±  2%     -14.0%     110530 ±  3%  proc-vmstat.nr_anon_pages
>     999461 ±  8%     -23.8%     761774        proc-vmstat.nr_file_pages
>     426485 ± 21%     -60.1%     170237 ±  9%  proc-vmstat.nr_inactive_anon
>      82464            -2.6%      80281        proc-vmstat.nr_kernel_stack
>     187777 ± 18%     -66.9%      62076 ± 28%  proc-vmstat.nr_mapped
>     316813 ± 27%     -75.0%      79228 ± 16%  proc-vmstat.nr_shmem
>      31469            -2.0%      30840        proc-vmstat.nr_slab_reclaimable
>     117889            -8.4%     108036        proc-vmstat.nr_slab_unreclaimable
>     426485 ± 21%     -60.1%     170237 ±  9%  proc-vmstat.nr_zone_inactive_anon
>     187187 ± 12%     -43.5%     105680 ±  9%  proc-vmstat.numa_hint_faults
>     128363 ± 15%     -61.5%      49371 ± 19%  proc-vmstat.numa_hint_faults_local
>      47314 ± 22%     +39.2%      65863 ± 13%  proc-vmstat.numa_pages_migrated
>     457026 ±  9%     -18.1%     374188 ± 13%  proc-vmstat.numa_pte_updates
>    2586600 ±  3%     +27.7%    3302787 ±  8%  proc-vmstat.pgalloc_normal
>    1589970            -6.2%    1491838        proc-vmstat.pgfault
>    2347186 ± 10%     +37.7%    3232369 ±  8%  proc-vmstat.pgfree
>      47314 ± 22%     +39.2%      65863 ± 13%  proc-vmstat.pgmigrate_success
>     112713            +7.0%     120630 ±  3%  proc-vmstat.pgreuse
>    2189056           +22.2%    2674944 ±  2%  proc-vmstat.unevictable_pgs_scanned
>      14.08 ±  2%     +29.3%      18.20 ±  5%  sched_debug.cfs_rq:/.h_nr_running.avg
>       0.80 ± 14%    +179.2%       2.23 ± 24%  sched_debug.cfs_rq:/.h_nr_running.min
>     245.23 ± 12%     -19.7%     196.97 ±  6%  sched_debug.cfs_rq:/.load_avg.max
>       2.27 ± 16%     +75.0%       3.97 ±  4%  sched_debug.cfs_rq:/.load_avg.min
>      45.77 ± 16%     -17.8%      37.60 ±  6%  sched_debug.cfs_rq:/.load_avg.stddev
>   11842707           +39.9%   16567992        sched_debug.cfs_rq:/.min_vruntime.avg
>   13773080 ±  3%    +113.9%   29460281 ±  7%  sched_debug.cfs_rq:/.min_vruntime.max
>   11423218           +30.3%   14885830        sched_debug.cfs_rq:/.min_vruntime.min
>     301190 ± 12%    +439.9%    1626088 ± 10%  sched_debug.cfs_rq:/.min_vruntime.stddev
>     203.83           -16.3%     170.67        sched_debug.cfs_rq:/.removed.load_avg.max
>      14330 ±  3%     +30.9%      18756 ±  5%  sched_debug.cfs_rq:/.runnable_avg.avg
>      25115 ±  4%     +15.5%      28999 ±  6%  sched_debug.cfs_rq:/.runnable_avg.max
>       3811 ± 11%     +68.0%       6404 ± 21%  sched_debug.cfs_rq:/.runnable_avg.min
>       3818 ±  6%     +15.3%       4404 ±  7%  sched_debug.cfs_rq:/.runnable_avg.stddev
>    -849635          +410.6%   -4338612        sched_debug.cfs_rq:/.spread0.avg
>    1092373 ± 54%    +691.1%    8641673 ± 21%  sched_debug.cfs_rq:/.spread0.max
>   -1263082          +378.1%   -6038905        sched_debug.cfs_rq:/.spread0.min
>     300764 ± 12%    +441.8%    1629507 ±  9%  sched_debug.cfs_rq:/.spread0.stddev
>       1591 ±  4%     -11.1%       1413 ±  3%  sched_debug.cfs_rq:/.util_avg.max
>     288.90 ± 11%     +64.5%     475.23 ± 13%  sched_debug.cfs_rq:/.util_avg.min
>     240.33 ±  2%     -32.1%     163.09 ±  3%  sched_debug.cfs_rq:/.util_avg.stddev
>     494.27 ±  3%     +41.6%     699.85 ±  3%  sched_debug.cfs_rq:/.util_est_enqueued.avg
>      11.23 ± 54%    +634.1%      82.47 ± 22%  sched_debug.cfs_rq:/.util_est_enqueued.min
>     174576           +20.7%     210681        sched_debug.cpu.clock.avg
>     174926           +21.2%     211944        sched_debug.cpu.clock.max
>     174164           +20.3%     209436        sched_debug.cpu.clock.min
>     230.84 ± 33%    +226.1%     752.67 ± 20%  sched_debug.cpu.clock.stddev
>     172836           +20.6%     208504        sched_debug.cpu.clock_task.avg
>     173552           +21.0%     210079        sched_debug.cpu.clock_task.max
>     156807           +22.3%     191789        sched_debug.cpu.clock_task.min
>       1634           +17.1%       1914 ±  5%  sched_debug.cpu.clock_task.stddev
>       0.00 ± 32%    +220.1%       0.00 ± 20%  sched_debug.cpu.next_balance.stddev
>      14.12 ±  2%     +28.7%      18.18 ±  5%  sched_debug.cpu.nr_running.avg
>       0.73 ± 25%    +213.6%       2.30 ± 24%  sched_debug.cpu.nr_running.min
>    1810086          +461.3%   10159215 ± 10%  sched_debug.cpu.nr_switches.avg
>    2315994 ±  3%    +515.6%   14258195 ±  9%  sched_debug.cpu.nr_switches.max
>    1529863          +380.3%    7348324 ±  9%  sched_debug.cpu.nr_switches.min
>     167487 ± 18%    +770.8%    1458519 ± 21%  sched_debug.cpu.nr_switches.stddev
>     174149           +20.2%     209410        sched_debug.cpu_clk
>     170980           +20.6%     206240        sched_debug.ktime
>     174896           +20.2%     210153        sched_debug.sched_clk
>       7.35           +24.9%       9.18 ±  4%  perf-stat.i.MPKI
>  1.918e+10           +14.4%  2.194e+10        perf-stat.i.branch-instructions
>       2.16            -0.1        2.09        perf-stat.i.branch-miss-rate%
>  4.133e+08            +6.6%  4.405e+08        perf-stat.i.branch-misses
>      23.08            -9.2       13.86 ±  7%  perf-stat.i.cache-miss-rate%
>  1.714e+08           -37.2%  1.076e+08 ±  3%  perf-stat.i.cache-misses
>  7.497e+08           +33.7%  1.002e+09 ±  5%  perf-stat.i.cache-references
>    1636365          +382.4%    7893858 ±  5%  perf-stat.i.context-switches
>       2.74            -6.8%       2.56        perf-stat.i.cpi
>     131725          +288.0%     511159 ± 10%  perf-stat.i.cpu-migrations
>       1672          +160.8%       4361 ±  4%  perf-stat.i.cycles-between-cache-misses
>       0.49            +0.6        1.11 ±  5%  perf-stat.i.dTLB-load-miss-rate%
>  1.417e+08          +158.7%  3.665e+08 ±  5%  perf-stat.i.dTLB-load-misses
>  2.908e+10            +9.1%  3.172e+10        perf-stat.i.dTLB-loads
>       0.12 ±  4%      +0.1        0.20 ±  4%  perf-stat.i.dTLB-store-miss-rate%
>   20805655 ±  4%     +90.9%   39716345 ±  4%  perf-stat.i.dTLB-store-misses
>  1.755e+10            +8.6%  1.907e+10        perf-stat.i.dTLB-stores
>      29.04            +3.6       32.62 ±  2%  perf-stat.i.iTLB-load-miss-rate%
>   56676082           +60.4%   90917582 ±  3%  perf-stat.i.iTLB-load-misses
>  1.381e+08           +30.6%  1.804e+08        perf-stat.i.iTLB-loads
>   1.03e+11           +10.5%  1.139e+11        perf-stat.i.instructions
>       1840           -21.1%       1451 ±  4%  perf-stat.i.instructions-per-iTLB-miss
>       0.37           +10.9%       0.41        perf-stat.i.ipc
>       1084            -4.5%       1035 ±  2%  perf-stat.i.metric.K/sec
>     640.69           +10.3%     706.44        perf-stat.i.metric.M/sec
>       5249            -9.3%       4762 ±  3%  perf-stat.i.minor-faults
>      23.57           +18.7       42.30 ±  8%  perf-stat.i.node-load-miss-rate%
>   40174555           -45.0%   22109431 ± 10%  perf-stat.i.node-loads
>       8.84 ±  2%     +24.5       33.30 ± 10%  perf-stat.i.node-store-miss-rate%
>    2912322           +60.3%    4667137 ± 16%  perf-stat.i.node-store-misses
>   34046752           -50.6%   16826621 ±  9%  perf-stat.i.node-stores
>       5278            -9.2%       4791 ±  3%  perf-stat.i.page-faults
>       7.24           +12.1%       8.12 ±  4%  perf-stat.overall.MPKI
>       2.15            -0.1        2.05        perf-stat.overall.branch-miss-rate%
>      22.92            -9.5       13.41 ±  7%  perf-stat.overall.cache-miss-rate%
>       2.73            -6.3%       2.56        perf-stat.overall.cpi
>       1644           +43.4%       2358 ±  3%  perf-stat.overall.cycles-between-cache-misses
>       0.48            +0.5        0.99 ±  4%  perf-stat.overall.dTLB-load-miss-rate%
>       0.12 ±  4%      +0.1        0.19 ±  4%  perf-stat.overall.dTLB-store-miss-rate%
>      29.06            +2.9       32.01 ±  2%  perf-stat.overall.iTLB-load-miss-rate%
>       1826           -26.6%       1340 ±  4%  perf-stat.overall.instructions-per-iTLB-miss
>       0.37            +6.8%       0.39        perf-stat.overall.ipc
>      22.74            +6.8       29.53 ± 13%  perf-stat.overall.node-load-miss-rate%
>       7.63            +8.4       16.02 ± 20%  perf-stat.overall.node-store-miss-rate%
>  1.915e+10            +9.0%  2.088e+10        perf-stat.ps.branch-instructions
>  4.119e+08            +3.9%  4.282e+08        perf-stat.ps.branch-misses
>  1.707e+08           -30.5%  1.186e+08 ±  3%  perf-stat.ps.cache-misses
>  7.446e+08           +19.2%  8.874e+08 ±  4%  perf-stat.ps.cache-references
>    1611874          +289.1%    6271376 ±  7%  perf-stat.ps.context-switches
>     127362          +189.0%     368041 ± 11%  perf-stat.ps.cpu-migrations
>  1.407e+08          +116.2%  3.042e+08 ±  5%  perf-stat.ps.dTLB-load-misses
>  2.901e+10            +5.4%  3.057e+10        perf-stat.ps.dTLB-loads
>   20667480 ±  4%     +66.8%   34473793 ±  4%  perf-stat.ps.dTLB-store-misses
>  1.751e+10            +5.1%   1.84e+10        perf-stat.ps.dTLB-stores
>   56310692           +45.0%   81644183 ±  4%  perf-stat.ps.iTLB-load-misses
>  1.375e+08           +26.1%  1.733e+08        perf-stat.ps.iTLB-loads
>  1.028e+11            +6.3%  1.093e+11        perf-stat.ps.instructions
>       4929           -24.5%       3723 ±  2%  perf-stat.ps.minor-faults
>   40134633           -32.9%   26946247 ±  9%  perf-stat.ps.node-loads
>    2805073           +39.5%    3914304 ± 16%  perf-stat.ps.node-store-misses
>   33938259           -38.9%   20726382 ±  8%  perf-stat.ps.node-stores
>       4952           -24.5%       3741 ±  2%  perf-stat.ps.page-faults
>  2.911e+13           +30.9%  3.809e+13 ±  2%  perf-stat.total.instructions
>      15.30 ±  4%      -8.6        6.66 ±  5%  perf-profile.calltrace.cycles-pp.sock_alloc_send_pskb.unix_stream_sendmsg.sock_sendmsg.sock_write_iter.vfs_write
>      13.84 ±  6%      -7.9        5.98 ±  6%  perf-profile.calltrace.cycles-pp.alloc_skb_with_frags.sock_alloc_send_pskb.unix_stream_sendmsg.sock_sendmsg.sock_write_iter
>      13.61 ±  6%      -7.8        5.84 ±  6%  perf-profile.calltrace.cycles-pp.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb.unix_stream_sendmsg.sock_sendmsg
>       9.00 ±  2%      -5.5        3.48 ±  4%  perf-profile.calltrace.cycles-pp.consume_skb.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter.vfs_read
>       6.44 ±  4%      -4.3        2.14 ±  6%  perf-profile.calltrace.cycles-pp.skb_release_data.consume_skb.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter
>       5.83 ±  8%      -3.4        2.44 ±  5%  perf-profile.calltrace.cycles-pp.kmem_cache_alloc_node.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb.unix_stream_sendmsg
>       5.81 ±  6%      -3.3        2.48 ±  6%  perf-profile.calltrace.cycles-pp.kmalloc_reserve.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb.unix_stream_sendmsg
>       5.50 ±  7%      -3.2        2.32 ±  6%  perf-profile.calltrace.cycles-pp.__kmalloc_node_track_caller.kmalloc_reserve.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb
>       5.07 ±  8%      -3.0        2.04 ±  6%  perf-profile.calltrace.cycles-pp.__kmem_cache_alloc_node.__kmalloc_node_track_caller.kmalloc_reserve.__alloc_skb.alloc_skb_with_frags
>       6.22 ±  2%      -2.9        3.33 ±  3%  perf-profile.calltrace.cycles-pp.unix_stream_read_actor.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter.vfs_read
>       6.17 ±  2%      -2.9        3.30 ±  3%  perf-profile.calltrace.cycles-pp.skb_copy_datagram_iter.unix_stream_read_actor.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter
>       6.11 ±  2%      -2.9        3.24 ±  3%  perf-profile.calltrace.cycles-pp.__skb_datagram_iter.skb_copy_datagram_iter.unix_stream_read_actor.unix_stream_read_generic.unix_stream_recvmsg
>      50.99            -2.6       48.39        perf-profile.calltrace.cycles-pp.__libc_read
>       5.66 ±  3%      -2.3        3.35 ±  3%  perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__libc_read
>       5.52 ±  3%      -2.3        3.27 ±  3%  perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__libc_write
>       3.14 ±  2%      -1.7        1.42 ±  4%  perf-profile.calltrace.cycles-pp._copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.unix_stream_read_actor.unix_stream_read_generic
>       2.73 ±  2%      -1.6        1.15 ±  4%  perf-profile.calltrace.cycles-pp.copyout._copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.unix_stream_read_actor
>       2.59 ±  2%      -1.5        1.07 ±  4%  perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyout._copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter
>       2.72 ±  3%      -1.4        1.34 ±  6%  perf-profile.calltrace.cycles-pp.kmem_cache_free.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter.vfs_read
>      41.50            -1.2       40.27        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_read
>       2.26 ±  4%      -1.1        1.12        perf-profile.calltrace.cycles-pp.skb_release_head_state.consume_skb.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter
>       2.76 ±  3%      -1.1        1.63 ±  3%  perf-profile.calltrace.cycles-pp.__check_object_size.simple_copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.unix_stream_read_actor
>       2.84 ±  3%      -1.1        1.71 ±  2%  perf-profile.calltrace.cycles-pp.simple_copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.unix_stream_read_actor.unix_stream_read_generic
>       2.20 ±  4%      -1.1        1.08        perf-profile.calltrace.cycles-pp.unix_destruct_scm.skb_release_head_state.consume_skb.unix_stream_read_generic.unix_stream_recvmsg
>       2.98 ±  2%      -1.1        1.90 ±  6%  perf-profile.calltrace.cycles-pp.skb_copy_datagram_from_iter.unix_stream_sendmsg.sock_sendmsg.sock_write_iter.vfs_write
>       1.99 ±  4%      -1.1        0.92 ±  2%  perf-profile.calltrace.cycles-pp.sock_wfree.unix_destruct_scm.skb_release_head_state.consume_skb.unix_stream_read_generic
>       2.10 ±  3%      -1.0        1.08 ±  4%  perf-profile.calltrace.cycles-pp.check_heap_object.__check_object_size.simple_copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter
>       2.08 ±  4%      -0.8        1.24 ±  3%  perf-profile.calltrace.cycles-pp.__entry_text_start.__libc_write
>       2.16 ±  3%      -0.7        1.47        perf-profile.calltrace.cycles-pp.__entry_text_start.__libc_read
>       2.20 ±  2%      -0.7        1.52 ±  3%  perf-profile.calltrace.cycles-pp.__kmem_cache_free.skb_release_data.consume_skb.unix_stream_read_generic.unix_stream_recvmsg
>       1.46 ±  3%      -0.6        0.87 ±  8%  perf-profile.calltrace.cycles-pp._copy_from_iter.skb_copy_datagram_from_iter.unix_stream_sendmsg.sock_sendmsg.sock_write_iter
>       4.82 ±  2%      -0.6        4.24        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
>       1.31 ±  2%      -0.4        0.90 ±  4%  perf-profile.calltrace.cycles-pp.__check_object_size.skb_copy_datagram_from_iter.unix_stream_sendmsg.sock_sendmsg.sock_write_iter
>       0.96 ±  3%      -0.4        0.57 ± 10%  perf-profile.calltrace.cycles-pp.copyin._copy_from_iter.skb_copy_datagram_from_iter.unix_stream_sendmsg.sock_sendmsg
>       1.14 ±  3%      -0.4        0.76 ±  5%  perf-profile.calltrace.cycles-pp.memcg_slab_post_alloc_hook.kmem_cache_alloc_node.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb
>       0.99 ±  3%      -0.3        0.65 ±  8%  perf-profile.calltrace.cycles-pp.memcg_slab_post_alloc_hook.__kmem_cache_alloc_node.__kmalloc_node_track_caller.kmalloc_reserve.__alloc_skb
>       1.30 ±  4%      -0.3        0.99 ±  3%  perf-profile.calltrace.cycles-pp.sock_recvmsg.sock_read_iter.vfs_read.ksys_read.do_syscall_64
>       0.98 ±  2%      -0.3        0.69 ±  3%  perf-profile.calltrace.cycles-pp.security_file_permission.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       0.67            -0.2        0.42 ± 50%  perf-profile.calltrace.cycles-pp.check_heap_object.__check_object_size.skb_copy_datagram_from_iter.unix_stream_sendmsg.sock_sendmsg
>       0.56 ±  4%      -0.2        0.32 ± 81%  perf-profile.calltrace.cycles-pp.__fdget_pos.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
>       0.86 ±  2%      -0.2        0.63 ±  3%  perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.vfs_write.ksys_write.do_syscall_64
>       1.15 ±  4%      -0.2        0.93 ±  4%  perf-profile.calltrace.cycles-pp.security_socket_recvmsg.sock_recvmsg.sock_read_iter.vfs_read.ksys_read
>       0.90            -0.2        0.69 ±  3%  perf-profile.calltrace.cycles-pp.get_obj_cgroup_from_current.kmem_cache_alloc_node.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb
>       1.23 ±  3%      -0.2        1.07 ±  3%  perf-profile.calltrace.cycles-pp.security_socket_sendmsg.sock_sendmsg.sock_write_iter.vfs_write.ksys_write
>       1.05 ±  2%      -0.2        0.88 ±  2%  perf-profile.calltrace.cycles-pp.security_file_permission.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       0.84 ±  4%      -0.2        0.68 ±  4%  perf-profile.calltrace.cycles-pp.aa_sk_perm.security_socket_recvmsg.sock_recvmsg.sock_read_iter.vfs_read
>       0.88            -0.1        0.78 ±  5%  perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.vfs_read.ksys_read.do_syscall_64
>       0.94 ±  3%      -0.1        0.88 ±  4%  perf-profile.calltrace.cycles-pp.aa_sk_perm.security_socket_sendmsg.sock_sendmsg.sock_write_iter.vfs_write
>       0.62 ±  2%      +0.3        0.90 ±  2%  perf-profile.calltrace.cycles-pp.mutex_lock.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter.vfs_read
>       0.00            +0.6        0.58 ±  3%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.prepare_to_wait.unix_stream_data_wait.unix_stream_read_generic.unix_stream_recvmsg
>       0.00            +0.6        0.61 ±  6%  perf-profile.calltrace.cycles-pp.update_load_avg.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function
>       0.00            +0.6        0.62 ±  4%  perf-profile.calltrace.cycles-pp.set_next_entity.pick_next_task_fair.__schedule.schedule.exit_to_user_mode_loop
>       0.00            +0.7        0.67 ± 11%  perf-profile.calltrace.cycles-pp.update_load_avg.dequeue_entity.dequeue_task_fair.__schedule.schedule
>       0.00            +0.7        0.67 ±  7%  perf-profile.calltrace.cycles-pp.__switch_to_asm.__libc_write
>       0.00            +0.8        0.76 ±  4%  perf-profile.calltrace.cycles-pp.reweight_entity.dequeue_task_fair.__schedule.schedule.schedule_timeout
>       0.00            +0.8        0.77 ±  4%  perf-profile.calltrace.cycles-pp.___perf_sw_event.prepare_task_switch.__schedule.schedule.schedule_timeout
>       0.00            +0.8        0.77 ±  8%  perf-profile.calltrace.cycles-pp.put_prev_entity.pick_next_task_fair.__schedule.schedule.exit_to_user_mode_loop
>       0.00            +0.8        0.81 ±  5%  perf-profile.calltrace.cycles-pp.prepare_task_switch.__schedule.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare
>       0.00            +0.8        0.81 ±  5%  perf-profile.calltrace.cycles-pp.check_preempt_wakeup.check_preempt_curr.ttwu_do_activate.try_to_wake_up.autoremove_wake_function
>       0.00            +0.8        0.82 ±  2%  perf-profile.calltrace.cycles-pp.__switch_to_asm.__libc_read
>       0.00            +0.8        0.82 ±  3%  perf-profile.calltrace.cycles-pp.reweight_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function
>       0.00            +0.9        0.86 ±  5%  perf-profile.calltrace.cycles-pp.perf_trace_sched_wakeup_template.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock
>       0.00            +0.9        0.87 ±  8%  perf-profile.calltrace.cycles-pp.update_load_avg.enqueue_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up
>      29.66            +0.9       30.58        perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
>       0.00            +1.0        0.95 ±  3%  perf-profile.calltrace.cycles-pp.set_next_entity.pick_next_task_fair.__schedule.schedule.schedule_timeout
>       0.00            +1.0        0.98 ±  4%  perf-profile.calltrace.cycles-pp.check_preempt_curr.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common
>       0.00            +1.0        0.99 ±  3%  perf-profile.calltrace.cycles-pp.update_curr.enqueue_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up
>       0.00            +1.0        1.05 ±  4%  perf-profile.calltrace.cycles-pp.prepare_to_wait.unix_stream_data_wait.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter
>       0.00            +1.1        1.07 ± 12%  perf-profile.calltrace.cycles-pp.select_idle_sibling.select_task_rq_fair.select_task_rq.try_to_wake_up.autoremove_wake_function
>      27.81 ±  2%      +1.2       28.98        perf-profile.calltrace.cycles-pp.unix_stream_recvmsg.sock_read_iter.vfs_read.ksys_read.do_syscall_64
>      27.36 ±  2%      +1.2       28.59        perf-profile.calltrace.cycles-pp.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter.vfs_read.ksys_read
>       0.00            +1.5        1.46 ±  2%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.try_to_wake_up.autoremove_wake_function.__wake_up_common
>       0.00            +1.6        1.55 ±  4%  perf-profile.calltrace.cycles-pp.prepare_task_switch.__schedule.schedule.schedule_timeout.unix_stream_data_wait
>       0.00            +1.6        1.60 ±  4%  perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
>      27.58            +1.6       29.19        perf-profile.calltrace.cycles-pp.sock_write_iter.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       0.00            +1.6        1.63 ±  5%  perf-profile.calltrace.cycles-pp.update_curr.dequeue_entity.dequeue_task_fair.__schedule.schedule
>       0.00            +1.6        1.65 ±  5%  perf-profile.calltrace.cycles-pp.restore_fpregs_from_fpstate.switch_fpu_return.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
>       0.00            +1.7        1.66 ±  6%  perf-profile.calltrace.cycles-pp.switch_mm_irqs_off.__schedule.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare
>       0.00            +1.8        1.80        perf-profile.calltrace.cycles-pp._raw_spin_lock.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock
>       0.00            +1.8        1.84 ±  2%  perf-profile.calltrace.cycles-pp.pick_next_task_fair.__schedule.schedule.schedule_timeout.unix_stream_data_wait
>       0.00            +2.0        1.97 ±  2%  perf-profile.calltrace.cycles-pp.switch_mm_irqs_off.__schedule.schedule.schedule_timeout.unix_stream_data_wait
>      26.63 ±  2%      +2.0       28.61        perf-profile.calltrace.cycles-pp.sock_sendmsg.sock_write_iter.vfs_write.ksys_write.do_syscall_64
>       0.00            +2.0        2.01 ±  6%  perf-profile.calltrace.cycles-pp.pick_next_task_fair.__schedule.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare
>       0.00            +2.1        2.09 ±  6%  perf-profile.calltrace.cycles-pp.select_task_rq_fair.select_task_rq.try_to_wake_up.autoremove_wake_function.__wake_up_common
>       0.00            +2.1        2.11 ±  5%  perf-profile.calltrace.cycles-pp.switch_fpu_return.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
>      25.21 ±  2%      +2.2       27.43        perf-profile.calltrace.cycles-pp.unix_stream_sendmsg.sock_sendmsg.sock_write_iter.vfs_write.ksys_write
>       0.00            +2.4        2.43 ±  5%  perf-profile.calltrace.cycles-pp.select_task_rq.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock
>      48.00            +2.7       50.69        perf-profile.calltrace.cycles-pp.__libc_write
>       0.00            +2.9        2.87 ±  5%  perf-profile.calltrace.cycles-pp.dequeue_entity.dequeue_task_fair.__schedule.schedule.schedule_timeout
>       0.09 ±223%      +3.4        3.47 ±  3%  perf-profile.calltrace.cycles-pp.enqueue_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function
>      39.07            +4.8       43.84        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_write
>       0.66 ± 18%      +5.0        5.62 ±  4%  perf-profile.calltrace.cycles-pp.dequeue_task_fair.__schedule.schedule.schedule_timeout.unix_stream_data_wait
>       4.73            +5.1        9.88 ±  3%  perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
>       0.66 ± 20%      +5.3        5.98 ±  3%  perf-profile.calltrace.cycles-pp.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common
>      35.96            +5.7       41.68        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
>       0.00            +6.0        6.02 ±  6%  perf-profile.calltrace.cycles-pp.__schedule.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode
>       0.00            +6.2        6.18 ±  6%  perf-profile.calltrace.cycles-pp.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
>       0.00            +6.4        6.36 ±  6%  perf-profile.calltrace.cycles-pp.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
>       0.78 ± 19%      +6.4        7.15 ±  3%  perf-profile.calltrace.cycles-pp.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock
>       0.18 ±141%      +7.0        7.18 ±  6%  perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
>       1.89 ± 15%     +12.1       13.96 ±  3%  perf-profile.calltrace.cycles-pp.__schedule.schedule.schedule_timeout.unix_stream_data_wait.unix_stream_read_generic
>       1.92 ± 15%     +12.3       14.23 ±  3%  perf-profile.calltrace.cycles-pp.schedule.schedule_timeout.unix_stream_data_wait.unix_stream_read_generic.unix_stream_recvmsg
>       1.66 ± 19%     +12.4       14.06 ±  2%  perf-profile.calltrace.cycles-pp.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.sock_def_readable
>       1.96 ± 15%     +12.5       14.48 ±  3%  perf-profile.calltrace.cycles-pp.schedule_timeout.unix_stream_data_wait.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter
>       1.69 ± 19%     +12.7       14.38 ±  2%  perf-profile.calltrace.cycles-pp.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.sock_def_readable.unix_stream_sendmsg
>       1.75 ± 19%     +13.0       14.75 ±  2%  perf-profile.calltrace.cycles-pp.__wake_up_common.__wake_up_common_lock.sock_def_readable.unix_stream_sendmsg.sock_sendmsg
>       2.53 ± 10%     +13.4       15.90 ±  2%  perf-profile.calltrace.cycles-pp.sock_def_readable.unix_stream_sendmsg.sock_sendmsg.sock_write_iter.vfs_write
>       1.96 ± 16%     +13.5       15.42 ±  2%  perf-profile.calltrace.cycles-pp.__wake_up_common_lock.sock_def_readable.unix_stream_sendmsg.sock_sendmsg.sock_write_iter
>       2.28 ± 15%     +14.6       16.86 ±  3%  perf-profile.calltrace.cycles-pp.unix_stream_data_wait.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter.vfs_read
>      15.31 ±  4%      -8.6        6.67 ±  5%  perf-profile.children.cycles-pp.sock_alloc_send_pskb
>      13.85 ±  6%      -7.9        5.98 ±  5%  perf-profile.children.cycles-pp.alloc_skb_with_frags
>      13.70 ±  6%      -7.8        5.89 ±  6%  perf-profile.children.cycles-pp.__alloc_skb
>       9.01 ±  2%      -5.5        3.48 ±  4%  perf-profile.children.cycles-pp.consume_skb
>       6.86 ± 26%      -4.7        2.15 ±  2%  perf-profile.children.cycles-pp._raw_spin_lock_irqsave
>      11.27 ±  3%      -4.6        6.67 ±  3%  perf-profile.children.cycles-pp.syscall_return_via_sysret
>       6.46 ±  4%      -4.3        2.15 ±  6%  perf-profile.children.cycles-pp.skb_release_data
>       4.18 ± 25%      -4.0        0.15 ± 69%  perf-profile.children.cycles-pp.___slab_alloc
>       5.76 ± 32%      -3.9        1.91 ±  3%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
>       5.98 ±  8%      -3.5        2.52 ±  5%  perf-profile.children.cycles-pp.kmem_cache_alloc_node
>       5.84 ±  6%      -3.3        2.50 ±  6%  perf-profile.children.cycles-pp.kmalloc_reserve
>       3.33 ± 30%      -3.3        0.05 ± 88%  perf-profile.children.cycles-pp.get_partial_node
>       5.63 ±  7%      -3.3        2.37 ±  6%  perf-profile.children.cycles-pp.__kmalloc_node_track_caller
>       5.20 ±  7%      -3.1        2.12 ±  6%  perf-profile.children.cycles-pp.__kmem_cache_alloc_node
>       6.23 ±  2%      -2.9        3.33 ±  3%  perf-profile.children.cycles-pp.unix_stream_read_actor
>       6.18 ±  2%      -2.9        3.31 ±  3%  perf-profile.children.cycles-pp.skb_copy_datagram_iter
>       6.11 ±  2%      -2.9        3.25 ±  3%  perf-profile.children.cycles-pp.__skb_datagram_iter
>      51.39            -2.5       48.85        perf-profile.children.cycles-pp.__libc_read
>       3.14 ±  3%      -2.5        0.61 ± 13%  perf-profile.children.cycles-pp.__slab_free
>       5.34 ±  3%      -2.1        3.23 ±  3%  perf-profile.children.cycles-pp.__entry_text_start
>       3.57 ±  2%      -1.9        1.66 ±  6%  perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
>       3.16 ±  2%      -1.7        1.43 ±  4%  perf-profile.children.cycles-pp._copy_to_iter
>       2.74 ±  2%      -1.6        1.16 ±  4%  perf-profile.children.cycles-pp.copyout
>       4.16 ±  2%      -1.5        2.62 ±  3%  perf-profile.children.cycles-pp.__check_object_size
>       2.73 ±  3%      -1.4        1.35 ±  6%  perf-profile.children.cycles-pp.kmem_cache_free
>       2.82 ±  2%      -1.2        1.63 ±  3%  perf-profile.children.cycles-pp.check_heap_object
>       2.27 ±  4%      -1.1        1.13 ±  2%  perf-profile.children.cycles-pp.skb_release_head_state
>       2.85 ±  3%      -1.1        1.72 ±  2%  perf-profile.children.cycles-pp.simple_copy_to_iter
>       2.22 ±  4%      -1.1        1.10        perf-profile.children.cycles-pp.unix_destruct_scm
>       3.00 ±  2%      -1.1        1.91 ±  5%  perf-profile.children.cycles-pp.skb_copy_datagram_from_iter
>       2.00 ±  4%      -1.1        0.92 ±  2%  perf-profile.children.cycles-pp.sock_wfree
>       2.16 ±  3%      -0.7        1.43 ±  7%  perf-profile.children.cycles-pp.memcg_slab_post_alloc_hook
>       1.45 ±  3%      -0.7        0.73 ±  7%  perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
>       2.21 ±  2%      -0.7        1.52 ±  3%  perf-profile.children.cycles-pp.__kmem_cache_free
>       1.49 ±  3%      -0.6        0.89 ±  8%  perf-profile.children.cycles-pp._copy_from_iter
>       1.40 ±  3%      -0.6        0.85 ± 13%  perf-profile.children.cycles-pp.mod_objcg_state
>       0.74            -0.5        0.24 ± 16%  perf-profile.children.cycles-pp.__build_skb_around
>       1.48            -0.5        1.01 ±  2%  perf-profile.children.cycles-pp.get_obj_cgroup_from_current
>       2.05 ±  2%      -0.5        1.59 ±  2%  perf-profile.children.cycles-pp.security_file_permission
>       0.98 ±  2%      -0.4        0.59 ± 10%  perf-profile.children.cycles-pp.copyin
>       1.08 ±  3%      -0.4        0.72 ±  3%  perf-profile.children.cycles-pp.__might_resched
>       1.75            -0.3        1.42 ±  4%  perf-profile.children.cycles-pp.apparmor_file_permission
>       1.32 ±  4%      -0.3        1.00 ±  3%  perf-profile.children.cycles-pp.sock_recvmsg
>       0.54 ±  4%      -0.3        0.25 ±  6%  perf-profile.children.cycles-pp.skb_unlink
>       0.54 ±  6%      -0.3        0.26 ±  3%  perf-profile.children.cycles-pp.unix_write_space
>       0.66 ±  3%      -0.3        0.39 ±  4%  perf-profile.children.cycles-pp.obj_cgroup_charge
>       0.68 ±  2%      -0.3        0.41 ±  4%  perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
>       0.86 ±  4%      -0.3        0.59 ±  3%  perf-profile.children.cycles-pp.__check_heap_object
>       0.75 ±  9%      -0.3        0.48 ±  2%  perf-profile.children.cycles-pp.skb_set_owner_w
>       1.84 ±  3%      -0.3        1.58 ±  4%  perf-profile.children.cycles-pp.aa_sk_perm
>       0.68 ± 11%      -0.2        0.44 ±  3%  perf-profile.children.cycles-pp.skb_queue_tail
>       1.22 ±  4%      -0.2        0.99 ±  5%  perf-profile.children.cycles-pp.__fdget_pos
>       0.70 ±  2%      -0.2        0.48 ±  5%  perf-profile.children.cycles-pp.__get_obj_cgroup_from_memcg
>       1.16 ±  4%      -0.2        0.93 ±  3%  perf-profile.children.cycles-pp.security_socket_recvmsg
>       0.48 ±  3%      -0.2        0.29 ±  4%  perf-profile.children.cycles-pp.__might_fault
>       0.24 ±  7%      -0.2        0.05 ± 56%  perf-profile.children.cycles-pp.fsnotify_perm
>       1.12 ±  4%      -0.2        0.93 ±  6%  perf-profile.children.cycles-pp.__fget_light
>       1.24 ±  3%      -0.2        1.07 ±  3%  perf-profile.children.cycles-pp.security_socket_sendmsg
>       0.61 ±  3%      -0.2        0.45 ±  2%  perf-profile.children.cycles-pp.__might_sleep
>       0.33 ±  5%      -0.2        0.17 ±  6%  perf-profile.children.cycles-pp.refill_obj_stock
>       0.40 ±  2%      -0.1        0.25 ±  4%  perf-profile.children.cycles-pp.kmalloc_slab
>       0.57 ±  2%      -0.1        0.45        perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
>       0.54 ±  3%      -0.1        0.42 ±  2%  perf-profile.children.cycles-pp.wait_for_unix_gc
>       0.42 ±  2%      -0.1        0.30 ±  3%  perf-profile.children.cycles-pp.is_vmalloc_addr
>       1.00 ±  2%      -0.1        0.87 ±  5%  perf-profile.children.cycles-pp.__virt_addr_valid
>       0.52 ±  2%      -0.1        0.41        perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
>       0.33 ±  3%      -0.1        0.21 ±  3%  perf-profile.children.cycles-pp.tick_sched_handle
>       0.36 ±  2%      -0.1        0.25 ±  4%  perf-profile.children.cycles-pp.tick_sched_timer
>       0.47 ±  2%      -0.1        0.36 ±  2%  perf-profile.children.cycles-pp.hrtimer_interrupt
>       0.48 ±  2%      -0.1        0.36 ±  2%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
>       0.32 ±  3%      -0.1        0.21 ±  5%  perf-profile.children.cycles-pp.update_process_times
>       0.42 ±  3%      -0.1        0.31 ±  2%  perf-profile.children.cycles-pp.__hrtimer_run_queues
>       0.26 ±  6%      -0.1        0.16 ±  4%  perf-profile.children.cycles-pp.kmalloc_size_roundup
>       0.20 ±  4%      -0.1        0.10 ±  9%  perf-profile.children.cycles-pp.task_tick_fair
>       0.24 ±  3%      -0.1        0.15 ±  4%  perf-profile.children.cycles-pp.scheduler_tick
>       0.30 ±  5%      -0.1        0.21 ±  8%  perf-profile.children.cycles-pp.obj_cgroup_uncharge_pages
>       0.20 ±  2%      -0.1        0.11 ±  6%  perf-profile.children.cycles-pp.should_failslab
>       0.51 ±  2%      -0.1        0.43 ±  6%  perf-profile.children.cycles-pp.syscall_enter_from_user_mode
>       0.15 ±  8%      -0.1        0.07 ± 13%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
>       0.19 ±  4%      -0.1        0.12 ±  5%  perf-profile.children.cycles-pp.apparmor_socket_sendmsg
>       0.20 ±  4%      -0.1        0.13 ±  5%  perf-profile.children.cycles-pp.aa_file_perm
>       0.18 ±  5%      -0.1        0.12 ±  5%  perf-profile.children.cycles-pp.apparmor_socket_recvmsg
>       0.14 ± 13%      -0.1        0.08 ± 55%  perf-profile.children.cycles-pp.__mod_memcg_lruvec_state
>       0.24 ±  4%      -0.1        0.18 ±  2%  perf-profile.children.cycles-pp.rcu_all_qs
>       0.18 ± 10%      -0.1        0.12 ± 11%  perf-profile.children.cycles-pp.memcg_account_kmem
>       0.37 ±  3%      -0.1        0.31 ±  3%  perf-profile.children.cycles-pp.security_socket_getpeersec_dgram
>       0.08            -0.0        0.06 ±  8%  perf-profile.children.cycles-pp.put_pid
>       0.18 ±  3%      -0.0        0.16 ±  4%  perf-profile.children.cycles-pp.apparmor_socket_getpeersec_dgram
>       0.21 ±  3%      +0.0        0.23 ±  2%  perf-profile.children.cycles-pp.__get_task_ioprio
>       0.00            +0.1        0.05        perf-profile.children.cycles-pp.perf_exclude_event
>       0.00            +0.1        0.06 ±  7%  perf-profile.children.cycles-pp.invalidate_user_asid
>       0.00            +0.1        0.07 ±  6%  perf-profile.children.cycles-pp.__bitmap_and
>       0.05            +0.1        0.13 ±  8%  perf-profile.children.cycles-pp.irqentry_exit_to_user_mode
>       0.00            +0.1        0.08 ±  7%  perf-profile.children.cycles-pp.schedule_debug
>       0.00            +0.1        0.08 ± 13%  perf-profile.children.cycles-pp.read@plt
>       0.00            +0.1        0.08 ±  5%  perf-profile.children.cycles-pp.sysvec_reschedule_ipi
>       0.00            +0.1        0.10 ±  4%  perf-profile.children.cycles-pp.tracing_gen_ctx_irq_test
>       0.00            +0.1        0.10 ±  4%  perf-profile.children.cycles-pp.place_entity
>       0.00            +0.1        0.12 ± 10%  perf-profile.children.cycles-pp.native_irq_return_iret
>       0.07 ± 14%      +0.1        0.19 ±  3%  perf-profile.children.cycles-pp.__list_add_valid
>       0.00            +0.1        0.13 ±  6%  perf-profile.children.cycles-pp.perf_trace_buf_alloc
>       0.00            +0.1        0.13 ± 34%  perf-profile.children.cycles-pp._find_next_and_bit
>       0.00            +0.1        0.14 ±  5%  perf-profile.children.cycles-pp.switch_ldt
>       0.00            +0.1        0.15 ±  5%  perf-profile.children.cycles-pp.check_cfs_rq_runtime
>       0.00            +0.1        0.15 ± 30%  perf-profile.children.cycles-pp.migrate_task_rq_fair
>       0.00            +0.2        0.15 ±  5%  perf-profile.children.cycles-pp.__rdgsbase_inactive
>       0.00            +0.2        0.16 ±  3%  perf-profile.children.cycles-pp.save_fpregs_to_fpstate
>       0.00            +0.2        0.16 ±  6%  perf-profile.children.cycles-pp.ttwu_queue_wakelist
>       0.00            +0.2        0.17        perf-profile.children.cycles-pp.perf_trace_buf_update
>       0.00            +0.2        0.18 ±  2%  perf-profile.children.cycles-pp.rb_insert_color
>       0.00            +0.2        0.18 ±  4%  perf-profile.children.cycles-pp.rb_next
>       0.00            +0.2        0.18 ± 21%  perf-profile.children.cycles-pp.__cgroup_account_cputime
>       0.01 ±223%      +0.2        0.21 ± 28%  perf-profile.children.cycles-pp.perf_trace_sched_switch
>       0.00            +0.2        0.20 ±  3%  perf-profile.children.cycles-pp.select_idle_cpu
>       0.00            +0.2        0.20 ±  3%  perf-profile.children.cycles-pp.rcu_note_context_switch
>       0.00            +0.2        0.21 ± 26%  perf-profile.children.cycles-pp.set_task_cpu
>       0.00            +0.2        0.22 ±  8%  perf-profile.children.cycles-pp.resched_curr
>       0.08 ±  5%      +0.2        0.31 ± 11%  perf-profile.children.cycles-pp.task_h_load
>       0.00            +0.2        0.24 ±  3%  perf-profile.children.cycles-pp.finish_wait
>       0.04 ± 44%      +0.3        0.29 ±  5%  perf-profile.children.cycles-pp.rb_erase
>       0.19 ±  6%      +0.3        0.46        perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
>       0.20 ±  6%      +0.3        0.47 ±  3%  perf-profile.children.cycles-pp.__list_del_entry_valid
>       0.00            +0.3        0.28 ±  3%  perf-profile.children.cycles-pp.__wrgsbase_inactive
>       0.02 ±141%      +0.3        0.30 ±  2%  perf-profile.children.cycles-pp.native_sched_clock
>       0.06 ± 13%      +0.3        0.34 ±  2%  perf-profile.children.cycles-pp.sched_clock_cpu
>       0.64 ±  2%      +0.3        0.93        perf-profile.children.cycles-pp.mutex_lock
>       0.00            +0.3        0.30 ±  5%  perf-profile.children.cycles-pp.cr4_update_irqsoff
>       0.00            +0.3        0.30 ±  4%  perf-profile.children.cycles-pp.clear_buddies
>       0.07 ± 55%      +0.3        0.37 ±  5%  perf-profile.children.cycles-pp.perf_trace_sched_stat_runtime
>       0.10 ± 66%      +0.3        0.42 ±  5%  perf-profile.children.cycles-pp.perf_tp_event
>       0.02 ±142%      +0.3        0.36 ±  6%  perf-profile.children.cycles-pp.cpuacct_charge
>       0.12 ±  9%      +0.4        0.47 ± 11%  perf-profile.children.cycles-pp.wake_affine
>       0.00            +0.4        0.36 ± 13%  perf-profile.children.cycles-pp.available_idle_cpu
>       0.05 ± 48%      +0.4        0.42 ±  6%  perf-profile.children.cycles-pp.finish_task_switch
>       0.12 ±  4%      +0.4        0.49 ±  4%  perf-profile.children.cycles-pp.asm_sysvec_reschedule_ipi
>       0.07 ± 17%      +0.4        0.48        perf-profile.children.cycles-pp.__calc_delta
>       0.03 ±100%      +0.5        0.49 ±  4%  perf-profile.children.cycles-pp.pick_next_entity
>       0.00            +0.5        0.48 ±  8%  perf-profile.children.cycles-pp.set_next_buddy
>       0.08 ± 14%      +0.6        0.66 ±  4%  perf-profile.children.cycles-pp.update_min_vruntime
>       0.07 ± 17%      +0.6        0.68 ±  2%  perf-profile.children.cycles-pp.os_xsave
>       0.29 ±  7%      +0.7        0.99 ±  3%  perf-profile.children.cycles-pp.update_cfs_group
>       0.17 ± 17%      +0.7        0.87 ±  4%  perf-profile.children.cycles-pp.perf_trace_sched_wakeup_template
>       0.14 ±  7%      +0.7        0.87 ±  3%  perf-profile.children.cycles-pp.__update_load_avg_se
>       0.14 ± 16%      +0.8        0.90 ±  2%  perf-profile.children.cycles-pp.update_rq_clock
>       0.08 ± 17%      +0.8        0.84 ±  5%  perf-profile.children.cycles-pp.check_preempt_wakeup
>       0.12 ± 14%      +0.8        0.95 ±  3%  perf-profile.children.cycles-pp.__update_load_avg_cfs_rq
>       0.22 ±  5%      +0.8        1.07 ±  3%  perf-profile.children.cycles-pp.prepare_to_wait
>       0.10 ± 18%      +0.9        0.98 ±  3%  perf-profile.children.cycles-pp.check_preempt_curr
>      29.72            +0.9       30.61        perf-profile.children.cycles-pp.vfs_write
>       0.14 ± 11%      +0.9        1.03 ±  4%  perf-profile.children.cycles-pp.__switch_to
>       0.07 ± 20%      +0.9        0.99 ±  6%  perf-profile.children.cycles-pp.put_prev_entity
>       0.12 ± 16%      +1.0        1.13 ±  5%  perf-profile.children.cycles-pp.___perf_sw_event
>       0.07 ± 17%      +1.0        1.10 ± 13%  perf-profile.children.cycles-pp.select_idle_sibling
>      27.82 ±  2%      +1.2       28.99        perf-profile.children.cycles-pp.unix_stream_recvmsg
>      27.41 ±  2%      +1.2       28.63        perf-profile.children.cycles-pp.unix_stream_read_generic
>       0.20 ± 15%      +1.4        1.59 ±  3%  perf-profile.children.cycles-pp.reweight_entity
>       0.21 ± 13%      +1.4        1.60 ±  4%  perf-profile.children.cycles-pp.__switch_to_asm
>       0.23 ± 10%      +1.4        1.65 ±  5%  perf-profile.children.cycles-pp.restore_fpregs_from_fpstate
>       0.20 ± 13%      +1.5        1.69 ±  3%  perf-profile.children.cycles-pp.set_next_entity
>      27.59            +1.6       29.19        perf-profile.children.cycles-pp.sock_write_iter
>       0.28 ± 10%      +1.8        2.12 ±  5%  perf-profile.children.cycles-pp.switch_fpu_return
>       0.26 ± 11%      +1.8        2.10 ±  6%  perf-profile.children.cycles-pp.select_task_rq_fair
>      26.66 ±  2%      +2.0       28.63        perf-profile.children.cycles-pp.sock_sendmsg
>       0.31 ± 12%      +2.1        2.44 ±  5%  perf-profile.children.cycles-pp.select_task_rq
>       0.30 ± 14%      +2.2        2.46 ±  4%  perf-profile.children.cycles-pp.prepare_task_switch
>      25.27 ±  2%      +2.2       27.47        perf-profile.children.cycles-pp.unix_stream_sendmsg
>       2.10            +2.3        4.38 ±  2%  perf-profile.children.cycles-pp._raw_spin_lock
>       0.40 ± 14%      +2.5        2.92 ±  5%  perf-profile.children.cycles-pp.dequeue_entity
>      48.40            +2.6       51.02        perf-profile.children.cycles-pp.__libc_write
>       0.46 ± 15%      +3.1        3.51 ±  3%  perf-profile.children.cycles-pp.enqueue_entity
>       0.49 ± 10%      +3.2        3.64 ±  7%  perf-profile.children.cycles-pp.update_load_avg
>       0.53 ± 20%      +3.4        3.91 ±  3%  perf-profile.children.cycles-pp.update_curr
>      80.81            +3.4       84.24        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
>       0.50 ± 12%      +3.5        4.00 ±  4%  perf-profile.children.cycles-pp.switch_mm_irqs_off
>       0.55 ±  9%      +3.8        4.38 ±  4%  perf-profile.children.cycles-pp.pick_next_task_fair
>       9.60            +4.6       14.15 ±  2%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode
>       0.78 ± 13%      +4.9        5.65 ±  4%  perf-profile.children.cycles-pp.dequeue_task_fair
>       0.78 ± 15%      +5.2        5.99 ±  3%  perf-profile.children.cycles-pp.enqueue_task_fair
>      74.30            +5.6       79.86        perf-profile.children.cycles-pp.do_syscall_64
>       0.90 ± 15%      +6.3        7.16 ±  3%  perf-profile.children.cycles-pp.ttwu_do_activate
>       0.33 ± 31%      +6.3        6.61 ±  6%  perf-profile.children.cycles-pp.exit_to_user_mode_loop
>       0.82 ± 15%      +8.1        8.92 ±  5%  perf-profile.children.cycles-pp.exit_to_user_mode_prepare
>       1.90 ± 16%     +12.2       14.10 ±  2%  perf-profile.children.cycles-pp.try_to_wake_up
>       2.36 ± 11%     +12.2       14.60 ±  3%  perf-profile.children.cycles-pp.schedule_timeout
>       1.95 ± 15%     +12.5       14.41 ±  2%  perf-profile.children.cycles-pp.autoremove_wake_function
>       2.01 ± 15%     +12.8       14.76 ±  2%  perf-profile.children.cycles-pp.__wake_up_common
>       2.23 ± 13%     +13.2       15.45 ±  2%  perf-profile.children.cycles-pp.__wake_up_common_lock
>       2.53 ± 10%     +13.4       15.90 ±  2%  perf-profile.children.cycles-pp.sock_def_readable
>       2.29 ± 15%     +14.6       16.93 ±  3%  perf-profile.children.cycles-pp.unix_stream_data_wait
>       2.61 ± 13%     +18.0       20.65 ±  4%  perf-profile.children.cycles-pp.schedule
>       2.66 ± 13%     +18.1       20.77 ±  4%  perf-profile.children.cycles-pp.__schedule
>      11.25 ±  3%      -4.6        6.67 ±  3%  perf-profile.self.cycles-pp.syscall_return_via_sysret
>       5.76 ± 32%      -3.9        1.90 ±  3%  perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
>       8.69 ±  3%      -3.4        5.27 ±  3%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode
>       3.11 ±  3%      -2.5        0.60 ± 13%  perf-profile.self.cycles-pp.__slab_free
>       6.65 ±  2%      -2.2        4.47 ±  2%  perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
>       4.78 ±  3%      -1.9        2.88 ±  3%  perf-profile.self.cycles-pp.__entry_text_start
>       3.52 ±  2%      -1.9        1.64 ±  6%  perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
>       2.06 ±  3%      -1.1        0.96 ±  5%  perf-profile.self.cycles-pp.kmem_cache_free
>       1.42 ±  3%      -1.0        0.46 ± 10%  perf-profile.self.cycles-pp.check_heap_object
>       1.43 ±  4%      -0.8        0.64        perf-profile.self.cycles-pp.sock_wfree
>       0.99 ±  3%      -0.8        0.21 ± 12%  perf-profile.self.cycles-pp.skb_release_data
>       0.84 ±  8%      -0.7        0.10 ± 64%  perf-profile.self.cycles-pp.___slab_alloc
>       1.97 ±  2%      -0.6        1.32        perf-profile.self.cycles-pp.unix_stream_read_generic
>       1.60 ±  3%      -0.5        1.11 ±  4%  perf-profile.self.cycles-pp.memcg_slab_post_alloc_hook
>       1.24 ±  2%      -0.5        0.75 ± 11%  perf-profile.self.cycles-pp.mod_objcg_state
>       0.71            -0.5        0.23 ± 15%  perf-profile.self.cycles-pp.__build_skb_around
>       0.95 ±  3%      -0.5        0.50 ±  6%  perf-profile.self.cycles-pp.__alloc_skb
>       0.97 ±  4%      -0.4        0.55 ±  5%  perf-profile.self.cycles-pp.kmem_cache_alloc_node
>       0.99 ±  3%      -0.4        0.59 ±  4%  perf-profile.self.cycles-pp.vfs_write
>       1.38 ±  2%      -0.4        0.99        perf-profile.self.cycles-pp.__kmem_cache_free
>       0.86 ±  2%      -0.4        0.50 ±  3%  perf-profile.self.cycles-pp.__kmem_cache_alloc_node
>       0.92 ±  4%      -0.4        0.56 ±  4%  perf-profile.self.cycles-pp.sock_write_iter
>       1.06 ±  3%      -0.4        0.70 ±  3%  perf-profile.self.cycles-pp.__might_resched
>       0.73 ±  4%      -0.3        0.44 ±  4%  perf-profile.self.cycles-pp.__cond_resched
>       0.85 ±  3%      -0.3        0.59 ±  4%  perf-profile.self.cycles-pp.__check_heap_object
>       1.46 ±  7%      -0.3        1.20 ±  2%  perf-profile.self.cycles-pp.unix_stream_sendmsg
>       0.73 ±  9%      -0.3        0.47 ±  2%  perf-profile.self.cycles-pp.skb_set_owner_w
>       1.54            -0.3        1.28 ±  4%  perf-profile.self.cycles-pp.apparmor_file_permission
>       0.74 ±  3%      -0.2        0.50 ±  2%  perf-profile.self.cycles-pp.get_obj_cgroup_from_current
>       1.15 ±  3%      -0.2        0.91 ±  8%  perf-profile.self.cycles-pp.aa_sk_perm
>       0.60            -0.2        0.36 ±  4%  perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
>       0.65 ±  4%      -0.2        0.45 ±  6%  perf-profile.self.cycles-pp.__get_obj_cgroup_from_memcg
>       0.24 ±  6%      -0.2        0.05 ± 56%  perf-profile.self.cycles-pp.fsnotify_perm
>       0.76 ±  3%      -0.2        0.58 ±  2%  perf-profile.self.cycles-pp.sock_read_iter
>       1.10 ±  4%      -0.2        0.92 ±  6%  perf-profile.self.cycles-pp.__fget_light
>       0.42 ±  3%      -0.2        0.25 ±  4%  perf-profile.self.cycles-pp.obj_cgroup_charge
>       0.32 ±  4%      -0.2        0.17 ±  6%  perf-profile.self.cycles-pp.refill_obj_stock
>       0.29            -0.2        0.14 ±  8%  perf-profile.self.cycles-pp.__kmalloc_node_track_caller
>       0.54 ±  3%      -0.1        0.40 ±  2%  perf-profile.self.cycles-pp.__might_sleep
>       0.30 ±  7%      -0.1        0.16 ± 22%  perf-profile.self.cycles-pp.security_file_permission
>       0.34 ±  3%      -0.1        0.21 ±  6%  perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
>       0.41 ±  3%      -0.1        0.29 ±  3%  perf-profile.self.cycles-pp.is_vmalloc_addr
>       0.27 ±  3%      -0.1        0.16 ±  6%  perf-profile.self.cycles-pp._copy_from_iter
>       0.24 ±  3%      -0.1        0.12 ±  3%  perf-profile.self.cycles-pp.ksys_write
>       0.95 ±  2%      -0.1        0.84 ±  5%  perf-profile.self.cycles-pp.__virt_addr_valid
>       0.56 ± 11%      -0.1        0.46 ±  4%  perf-profile.self.cycles-pp.sock_def_readable
>       0.16 ±  7%      -0.1        0.06 ± 18%  perf-profile.self.cycles-pp.sock_recvmsg
>       0.22 ±  5%      -0.1        0.14 ±  2%  perf-profile.self.cycles-pp.ksys_read
>       0.27 ±  4%      -0.1        0.19 ±  5%  perf-profile.self.cycles-pp.kmalloc_slab
>       0.28 ±  2%      -0.1        0.20 ±  2%  perf-profile.self.cycles-pp.consume_skb
>       0.35 ±  2%      -0.1        0.28 ±  3%  perf-profile.self.cycles-pp.__check_object_size
>       0.13 ±  8%      -0.1        0.06 ± 18%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
>       0.20 ±  5%      -0.1        0.12 ±  6%  perf-profile.self.cycles-pp.kmalloc_reserve
>       0.26 ±  5%      -0.1        0.19 ±  4%  perf-profile.self.cycles-pp.sock_alloc_send_pskb
>       0.42 ±  2%      -0.1        0.35 ±  7%  perf-profile.self.cycles-pp.syscall_enter_from_user_mode
>       0.19 ±  5%      -0.1        0.12 ±  6%  perf-profile.self.cycles-pp.aa_file_perm
>       0.16 ±  4%      -0.1        0.10 ±  4%  perf-profile.self.cycles-pp.skb_copy_datagram_from_iter
>       0.18 ±  4%      -0.1        0.12 ±  6%  perf-profile.self.cycles-pp.apparmor_socket_sendmsg
>       0.18 ±  5%      -0.1        0.12 ±  4%  perf-profile.self.cycles-pp.apparmor_socket_recvmsg
>       0.15 ±  5%      -0.1        0.10 ±  5%  perf-profile.self.cycles-pp.alloc_skb_with_frags
>       0.64 ±  3%      -0.1        0.59        perf-profile.self.cycles-pp.__libc_write
>       0.20 ±  4%      -0.1        0.15 ±  3%  perf-profile.self.cycles-pp._copy_to_iter
>       0.15 ±  5%      -0.1        0.10 ± 11%  perf-profile.self.cycles-pp.sock_sendmsg
>       0.08 ±  4%      -0.1        0.03 ± 81%  perf-profile.self.cycles-pp.copyout
>       0.11 ±  6%      -0.0        0.06 ±  7%  perf-profile.self.cycles-pp.__fdget_pos
>       0.12 ±  5%      -0.0        0.07 ± 10%  perf-profile.self.cycles-pp.kmalloc_size_roundup
>       0.34 ±  3%      -0.0        0.29        perf-profile.self.cycles-pp.do_syscall_64
>       0.20 ±  4%      -0.0        0.15 ±  4%  perf-profile.self.cycles-pp.rcu_all_qs
>       0.41 ±  3%      -0.0        0.37 ±  8%  perf-profile.self.cycles-pp.unix_stream_recvmsg
>       0.22 ±  2%      -0.0        0.17 ±  4%  perf-profile.self.cycles-pp.unix_destruct_scm
>       0.09 ±  4%      -0.0        0.05        perf-profile.self.cycles-pp.should_failslab
>       0.10 ± 15%      -0.0        0.06 ± 50%  perf-profile.self.cycles-pp.__mod_memcg_lruvec_state
>       0.11 ±  4%      -0.0        0.07        perf-profile.self.cycles-pp.__might_fault
>       0.16 ±  2%      -0.0        0.13 ±  6%  perf-profile.self.cycles-pp.obj_cgroup_uncharge_pages
>       0.18 ±  4%      -0.0        0.16 ±  3%  perf-profile.self.cycles-pp.security_socket_getpeersec_dgram
>       0.28 ±  2%      -0.0        0.25 ±  2%  perf-profile.self.cycles-pp.unix_write_space
>       0.17 ±  2%      -0.0        0.15 ±  5%  perf-profile.self.cycles-pp.apparmor_socket_getpeersec_dgram
>       0.08 ±  6%      -0.0        0.05 ±  7%  perf-profile.self.cycles-pp.security_socket_sendmsg
>       0.12 ±  4%      -0.0        0.10 ±  3%  perf-profile.self.cycles-pp.__skb_datagram_iter
>       0.24 ±  2%      -0.0        0.22        perf-profile.self.cycles-pp.mutex_unlock
>       0.08 ±  5%      +0.0        0.10 ±  6%  perf-profile.self.cycles-pp.scm_recv
>       0.17 ±  2%      +0.0        0.19 ±  3%  perf-profile.self.cycles-pp.__x64_sys_read
>       0.19 ±  3%      +0.0        0.22 ±  2%  perf-profile.self.cycles-pp.__get_task_ioprio
>       0.00            +0.1        0.06        perf-profile.self.cycles-pp.finish_wait
>       0.00            +0.1        0.06 ±  7%  perf-profile.self.cycles-pp.cr4_update_irqsoff
>       0.00            +0.1        0.06 ±  7%  perf-profile.self.cycles-pp.invalidate_user_asid
>       0.00            +0.1        0.07 ± 12%  perf-profile.self.cycles-pp.wake_affine
>       0.00            +0.1        0.07 ±  7%  perf-profile.self.cycles-pp.check_cfs_rq_runtime
>       0.00            +0.1        0.07 ±  5%  perf-profile.self.cycles-pp.perf_trace_buf_update
>       0.00            +0.1        0.07 ±  9%  perf-profile.self.cycles-pp.asm_sysvec_reschedule_ipi
>       0.00            +0.1        0.07 ± 10%  perf-profile.self.cycles-pp.__bitmap_and
>       0.00            +0.1        0.08 ± 10%  perf-profile.self.cycles-pp.schedule_debug
>       0.00            +0.1        0.08 ± 13%  perf-profile.self.cycles-pp.read@plt
>       0.00            +0.1        0.08 ± 12%  perf-profile.self.cycles-pp.perf_trace_buf_alloc
>       0.00            +0.1        0.09 ± 35%  perf-profile.self.cycles-pp.migrate_task_rq_fair
>       0.00            +0.1        0.09 ±  5%  perf-profile.self.cycles-pp.place_entity
>       0.00            +0.1        0.10 ±  4%  perf-profile.self.cycles-pp.tracing_gen_ctx_irq_test
>       0.00            +0.1        0.10        perf-profile.self.cycles-pp.__wake_up_common_lock
>       0.07 ± 17%      +0.1        0.18 ±  3%  perf-profile.self.cycles-pp.__list_add_valid
>       0.00            +0.1        0.11 ±  8%  perf-profile.self.cycles-pp.native_irq_return_iret
>       0.00            +0.1        0.12 ±  6%  perf-profile.self.cycles-pp.select_idle_cpu
>       0.00            +0.1        0.12 ± 34%  perf-profile.self.cycles-pp._find_next_and_bit
>       0.00            +0.1        0.13 ± 25%  perf-profile.self.cycles-pp.__cgroup_account_cputime
>       0.00            +0.1        0.13 ±  7%  perf-profile.self.cycles-pp.switch_ldt
>       0.00            +0.1        0.14 ±  5%  perf-profile.self.cycles-pp.check_preempt_curr
>       0.00            +0.1        0.15 ±  2%  perf-profile.self.cycles-pp.save_fpregs_to_fpstate
>       0.00            +0.1        0.15 ±  5%  perf-profile.self.cycles-pp.__rdgsbase_inactive
>       0.14 ±  3%      +0.2        0.29        perf-profile.self.cycles-pp._raw_spin_unlock_irqrestore
>       0.00            +0.2        0.15 ±  7%  perf-profile.self.cycles-pp.ttwu_queue_wakelist
>       0.00            +0.2        0.17 ±  4%  perf-profile.self.cycles-pp.rb_insert_color
>       0.00            +0.2        0.17 ±  5%  perf-profile.self.cycles-pp.rb_next
>       0.00            +0.2        0.18 ±  2%  perf-profile.self.cycles-pp.autoremove_wake_function
>       0.01 ±223%      +0.2        0.19 ±  6%  perf-profile.self.cycles-pp.ttwu_do_activate
>       0.00            +0.2        0.20 ±  2%  perf-profile.self.cycles-pp.rcu_note_context_switch
>       0.00            +0.2        0.20 ±  7%  perf-profile.self.cycles-pp.exit_to_user_mode_loop
>       0.27            +0.2        0.47 ±  3%  perf-profile.self.cycles-pp.mutex_lock
>       0.00            +0.2        0.20 ± 28%  perf-profile.self.cycles-pp.perf_trace_sched_switch
>       0.00            +0.2        0.21 ±  9%  perf-profile.self.cycles-pp.resched_curr
>       0.04 ± 45%      +0.2        0.26 ±  7%  perf-profile.self.cycles-pp.perf_tp_event
>       0.06 ±  7%      +0.2        0.28 ±  8%  perf-profile.self.cycles-pp.perf_trace_sched_wakeup_template
>       0.19 ±  7%      +0.2        0.41 ±  5%  perf-profile.self.cycles-pp.__list_del_entry_valid
>       0.08 ±  5%      +0.2        0.31 ± 11%  perf-profile.self.cycles-pp.task_h_load
>       0.00            +0.2        0.23 ±  5%  perf-profile.self.cycles-pp.finish_task_switch
>       0.03 ± 70%      +0.2        0.27 ±  5%  perf-profile.self.cycles-pp.rb_erase
>       0.02 ±142%      +0.3        0.29 ±  2%  perf-profile.self.cycles-pp.native_sched_clock
>       0.00            +0.3        0.28 ±  3%  perf-profile.self.cycles-pp.__wrgsbase_inactive
>       0.00            +0.3        0.28 ±  6%  perf-profile.self.cycles-pp.clear_buddies
>       0.07 ± 10%      +0.3        0.35 ±  3%  perf-profile.self.cycles-pp.schedule_timeout
>       0.03 ± 70%      +0.3        0.33 ±  3%  perf-profile.self.cycles-pp.select_task_rq
>       0.06 ± 13%      +0.3        0.36 ±  4%  perf-profile.self.cycles-pp.__wake_up_common
>       0.06 ± 13%      +0.3        0.36 ±  3%  perf-profile.self.cycles-pp.dequeue_entity
>       0.06 ± 18%      +0.3        0.37 ±  7%  perf-profile.self.cycles-pp.perf_trace_sched_stat_runtime
>       0.01 ±223%      +0.3        0.33 ±  4%  perf-profile.self.cycles-pp.schedule
>       0.02 ±142%      +0.3        0.35 ±  7%  perf-profile.self.cycles-pp.cpuacct_charge
>       0.01 ±223%      +0.3        0.35        perf-profile.self.cycles-pp.set_next_entity
>       0.00            +0.4        0.35 ± 13%  perf-profile.self.cycles-pp.available_idle_cpu
>       0.08 ± 10%      +0.4        0.44 ±  5%  perf-profile.self.cycles-pp.prepare_to_wait
>       0.63 ±  3%      +0.4        1.00 ±  4%  perf-profile.self.cycles-pp.vfs_read
>       0.02 ±142%      +0.4        0.40 ±  4%  perf-profile.self.cycles-pp.check_preempt_wakeup
>       0.02 ±141%      +0.4        0.42 ±  4%  perf-profile.self.cycles-pp.pick_next_entity
>       0.07 ± 17%      +0.4        0.48        perf-profile.self.cycles-pp.__calc_delta
>       0.06 ± 14%      +0.4        0.47 ±  3%  perf-profile.self.cycles-pp.unix_stream_data_wait
>       0.04 ± 45%      +0.4        0.45 ±  4%  perf-profile.self.cycles-pp.switch_fpu_return
>       0.00            +0.5        0.46 ±  7%  perf-profile.self.cycles-pp.set_next_buddy
>       0.07 ± 17%      +0.5        0.53 ±  3%  perf-profile.self.cycles-pp.select_task_rq_fair
>       0.08 ± 16%      +0.5        0.55 ±  4%  perf-profile.self.cycles-pp.try_to_wake_up
>       0.08 ± 19%      +0.5        0.56 ±  3%  perf-profile.self.cycles-pp.update_rq_clock
>       0.02 ±141%      +0.5        0.50 ± 10%  perf-profile.self.cycles-pp.select_idle_sibling
>       0.77 ±  2%      +0.5        1.25 ±  2%  perf-profile.self.cycles-pp.__libc_read
>       0.09 ± 19%      +0.5        0.59 ±  3%  perf-profile.self.cycles-pp.reweight_entity
>       0.08 ± 14%      +0.5        0.59 ±  2%  perf-profile.self.cycles-pp.dequeue_task_fair
>       0.08 ± 13%      +0.6        0.64 ±  5%  perf-profile.self.cycles-pp.update_min_vruntime
>       0.02 ±141%      +0.6        0.58 ±  7%  perf-profile.self.cycles-pp.put_prev_entity
>       0.06 ± 11%      +0.6        0.64 ±  4%  perf-profile.self.cycles-pp.enqueue_task_fair
>       0.07 ± 18%      +0.6        0.68 ±  3%  perf-profile.self.cycles-pp.os_xsave
>       1.39 ±  2%      +0.7        2.06 ±  3%  perf-profile.self.cycles-pp._raw_spin_lock_irqsave
>       0.28 ±  8%      +0.7        0.97 ±  4%  perf-profile.self.cycles-pp.update_cfs_group
>       0.14 ±  8%      +0.7        0.83 ±  3%  perf-profile.self.cycles-pp.__update_load_avg_se
>       1.76 ±  3%      +0.7        2.47 ±  3%  perf-profile.self.cycles-pp._raw_spin_lock
>       0.12 ± 12%      +0.7        0.85 ±  5%  perf-profile.self.cycles-pp.prepare_task_switch
>       0.12 ± 12%      +0.8        0.91 ±  3%  perf-profile.self.cycles-pp.__update_load_avg_cfs_rq
>       0.13 ± 12%      +0.8        0.93 ±  5%  perf-profile.self.cycles-pp.pick_next_task_fair
>       0.13 ± 12%      +0.9        0.98 ±  4%  perf-profile.self.cycles-pp.__switch_to
>       0.11 ± 18%      +0.9        1.06 ±  5%  perf-profile.self.cycles-pp.___perf_sw_event
>       0.16 ± 11%      +1.2        1.34 ±  4%  perf-profile.self.cycles-pp.enqueue_entity
>       0.20 ± 12%      +1.4        1.58 ±  4%  perf-profile.self.cycles-pp.__switch_to_asm
>       0.23 ± 10%      +1.4        1.65 ±  5%  perf-profile.self.cycles-pp.restore_fpregs_from_fpstate
>       0.25 ± 12%      +1.5        1.77 ±  4%  perf-profile.self.cycles-pp.__schedule
>       0.22 ± 10%      +1.6        1.78 ± 10%  perf-profile.self.cycles-pp.update_load_avg
>       0.23 ± 16%      +1.7        1.91 ±  7%  perf-profile.self.cycles-pp.update_curr
>       0.48 ± 11%      +3.4        3.86 ±  4%  perf-profile.self.cycles-pp.switch_mm_irqs_off
> 
> 
> To reproduce:
> 
>         git clone https://github.com/intel/lkp-tests.git
>         cd lkp-tests
>         sudo bin/lkp install job.yaml           # job file is attached in this email
>         bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
>         sudo bin/lkp run generated-yaml-file
> 
>         # if you come across any failure that blocks the test,
>         # please remove ~/.lkp and /lkp dir to run from a clean state.



Re: [PATCH v3] sched/fair: sanitize vruntime of entity being placed
Posted by Vincent Guittot 2 years, 6 months ago
On Tue, 21 Feb 2023 at 17:57, Roman Kagan <rkagan@amazon.de> wrote:
>
> On Tue, Feb 21, 2023 at 10:38:44AM +0100, Vincent Guittot wrote:
> > On Thu, 9 Feb 2023 at 20:31, Roman Kagan <rkagan@amazon.de> wrote:
> > >
> > > From: Zhang Qiao <zhangqiao22@huawei.com>
> > >
> > > When a scheduling entity is placed onto cfs_rq, its vruntime is pulled
> > > to the base level (around cfs_rq->min_vruntime), so that the entity
> > > doesn't gain extra boost when placed backwards.
> > >
> > > However, if the entity being placed wasn't executed for a long time, its
> > > vruntime may get too far behind (e.g. while cfs_rq was executing a
> > > low-weight hog), which can inverse the vruntime comparison due to s64
> > > overflow.  This results in the entity being placed with its original
> > > vruntime way forwards, so that it will effectively never get to the cpu.
> > >
> > > To prevent that, ignore the vruntime of the entity being placed if it
> > > didn't execute for longer than the time that can lead to an overflow.
> > >
> > > Signed-off-by: Zhang Qiao <zhangqiao22@huawei.com>
> > > [rkagan: formatted, adjusted commit log, comments, cutoff value]
> > > Co-developed-by: Roman Kagan <rkagan@amazon.de>
> > > Signed-off-by: Roman Kagan <rkagan@amazon.de>
> >
> > Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
> >
> > > ---
> > > v2 -> v3:
> > > - make cutoff less arbitrary and update comments [Vincent]
> > >
> > > v1 -> v2:
> > > - add Zhang Qiao's s-o-b
> > > - fix constant promotion on 32bit
> > >
> > >  kernel/sched/fair.c | 21 +++++++++++++++++++--
> > >  1 file changed, 19 insertions(+), 2 deletions(-)
>
> Turns out Peter took v2 through his tree, and it has already landed in
> Linus' master.
>
> What scares me, though, is that I've got a message from the test robot
> that this commit dramatically affected hackbench results, see the quote
> below.  I expected the commit not to affect any benchmarks.
>
> Any idea what could have caused this change?

Hmm, it's most probably because se->exec_start is reset after a
migration, so the condition becomes true for a newly migrated task
even though its vruntime should be after min_vruntime.

We missed this case.
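
In other words (a minimal userspace sketch, not the kernel code; the
NICE_0_LOAD value and the clock figures below are assumptions chosen only
for illustration): once exec_start has been zeroed on migration, the "sleep
time" computed at placement is roughly the whole rq clock, which clears any
reasonable cutoff, so the entity's carried vruntime gets discarded even
though the task never actually slept:

/*
 * Illustrative userspace sketch, assuming the 64-bit scaled nice-0
 * weight (1 << 20).  It only demonstrates the arithmetic: a zeroed
 * exec_start makes the computed sleep time exceed the overflow cutoff.
 */
#include <stdio.h>
#include <stdint.h>

#define NICE_0_LOAD	(1ULL << 20)	/* assumed 64-bit nice-0 weight */

static const char *placement(uint64_t rq_clock, uint64_t exec_start)
{
	uint64_t sleep_time = rq_clock - exec_start;
	uint64_t cutoff = (1ULL << 63) / NICE_0_LOAD;	/* ~2.4 hours in ns */

	return sleep_time > cutoff ? "vruntime discarded" : "vruntime kept";
}

int main(void)
{
	/* assume ~36 hours of rq task clock, in nanoseconds */
	uint64_t rq_clock = 36ULL * 3600 * 1000000000ULL;

	/* migrated task: exec_start was reset to 0, looks like a huge sleep */
	printf("migrated task:     %s\n", placement(rq_clock, 0));

	/* task that last ran 1ms ago: well below the cutoff */
	printf("recently-run task: %s\n", placement(rq_clock, rq_clock - 1000000));

	return 0;
}

If that reading is right, every wakeup that migrates a task takes the reset
branch, which would fit the pipe-mode hackbench speedup and the socket-mode
regression in the robot's report.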

>
> Thanks,
> Roman.
>
>
> On Tue, Feb 21, 2023 at 03:34:16PM +0800, kernel test robot wrote:
> > FYI, we noticed a 125.5% improvement of hackbench.throughput due to commit:
> >
> > commit: 829c1651e9c4a6f78398d3e67651cef9bb6b42cc ("sched/fair: sanitize vruntime of entity being placed")
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> >
> > in testcase: hackbench
> > on test machine: 88 threads 2 sockets Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz (Cascade Lake) with 128G memory
> > with following parameters:
> >
> >         nr_threads: 50%
> >         iterations: 8
> >         mode: process
> >         ipc: pipe
> >         cpufreq_governor: performance
> >
> > test-description: Hackbench is both a benchmark and a stress test for the Linux kernel scheduler.
> > test-url: https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/sched/cfs-scheduler/hackbench.c
> >
> > In addition to that, the commit also has significant impact on the following tests:
> >
> > +------------------+--------------------------------------------------+
> > | testcase: change | hackbench: hackbench.throughput -8.1% regression |
> > | test machine     | 104 threads 2 sockets (Skylake) with 192G memory |
> > | test parameters  | cpufreq_governor=performance                     |
> > |                  | ipc=socket                                       |
> > |                  | iterations=4                                     |
> > |                  | mode=process                                     |
> > |                  | nr_threads=100%                                  |
> > +------------------+--------------------------------------------------+
> >
> > Details are as below:
> >
> > =========================================================================================
> > compiler/cpufreq_governor/ipc/iterations/kconfig/mode/nr_threads/rootfs/tbox_group/testcase:
> >   gcc-11/performance/pipe/8/x86_64-rhel-8.3/process/50%/debian-11.1-x86_64-20220510.cgz/lkp-csl-2sp9/hackbench
> >
> > commit:
> >   a2e90611b9 ("sched/fair: Remove capacity inversion detection")
> >   829c1651e9 ("sched/fair: sanitize vruntime of entity being placed")
> >
> > a2e90611b9f425ad 829c1651e9c4a6f78398d3e6765
> > ---------------- ---------------------------
> >          %stddev     %change         %stddev
> >              \          |                \
> >     308887 ±  5%    +125.5%     696539        hackbench.throughput
> >     259291 ±  2%    +127.3%     589293        hackbench.throughput_avg
> >     308887 ±  5%    +125.5%     696539        hackbench.throughput_best
> >     198770 ±  2%    +105.5%     408552 ±  4%  hackbench.throughput_worst
> >     319.60 ±  2%     -55.8%     141.24        hackbench.time.elapsed_time
> >     319.60 ±  2%     -55.8%     141.24        hackbench.time.elapsed_time.max
> >  1.298e+09 ±  8%     -87.6%  1.613e+08 ±  7%  hackbench.time.involuntary_context_switches
> >     477107           -12.5%     417660        hackbench.time.minor_page_faults
> >      24683 ±  2%     -57.2%      10562        hackbench.time.system_time
> >       2136 ±  3%     -45.0%       1174        hackbench.time.user_time
> >   3.21e+09 ±  4%     -83.0%  5.442e+08 ±  3%  hackbench.time.voluntary_context_switches
> >   5.28e+08 ±  4%      +8.4%  5.723e+08 ±  3%  cpuidle..time
> >     365.97 ±  2%     -48.9%     187.12        uptime.boot
> >    3322559 ±  3%     +34.3%    4463206 ± 15%  vmstat.memory.cache
> >   14194257 ±  2%     -62.8%    5279904 ±  3%  vmstat.system.cs
> >    2120781 ±  3%     -72.8%     576421 ±  4%  vmstat.system.in
> >       1.84 ± 12%      +2.6        4.48 ±  5%  mpstat.cpu.all.idle%
> >       2.49 ±  3%      -1.1        1.39 ±  4%  mpstat.cpu.all.irq%
> >       0.04 ± 12%      +0.0        0.05        mpstat.cpu.all.soft%
> >       7.36            +2.2        9.56        mpstat.cpu.all.usr%
> >      61555 ±  6%     -72.8%      16751 ± 16%  numa-meminfo.node1.Active
> >      61515 ±  6%     -72.8%      16717 ± 16%  numa-meminfo.node1.Active(anon)
> >     960182 ±102%    +225.6%    3125990 ± 42%  numa-meminfo.node1.FilePages
> >    1754002 ± 53%    +137.9%    4173379 ± 34%  numa-meminfo.node1.MemUsed
> >   35296824 ±  6%    +157.8%   91005048        numa-numastat.node0.local_node
> >   35310119 ±  6%    +157.9%   91058472        numa-numastat.node0.numa_hit
> >   35512423 ±  5%    +159.7%   92232951        numa-numastat.node1.local_node
> >   35577275 ±  4%    +159.4%   92273266        numa-numastat.node1.numa_hit
> >   35310253 ±  6%    +157.9%   91058211        numa-vmstat.node0.numa_hit
> >   35296958 ±  6%    +157.8%   91004787        numa-vmstat.node0.numa_local
> >      15337 ±  6%     -72.5%       4216 ± 17%  numa-vmstat.node1.nr_active_anon
> >     239988 ±102%    +225.7%     781607 ± 42%  numa-vmstat.node1.nr_file_pages
> >      15337 ±  6%     -72.5%       4216 ± 17%  numa-vmstat.node1.nr_zone_active_anon
> >   35577325 ±  4%    +159.4%   92273215        numa-vmstat.node1.numa_hit
> >   35512473 ±  5%    +159.7%   92232900        numa-vmstat.node1.numa_local
> >      64500 ±  8%     -61.8%      24643 ± 32%  meminfo.Active
> >      64422 ±  8%     -61.9%      24568 ± 32%  meminfo.Active(anon)
> >     140271 ± 14%     -38.0%      86979 ± 24%  meminfo.AnonHugePages
> >     372672 ±  2%     +13.3%     422069        meminfo.AnonPages
> >    3205235 ±  3%     +35.1%    4329061 ± 15%  meminfo.Cached
> >    1548601 ±  7%     +77.4%    2747319 ± 24%  meminfo.Committed_AS
> >     783193 ± 14%    +154.9%    1996137 ± 33%  meminfo.Inactive
> >     783010 ± 14%    +154.9%    1995951 ± 33%  meminfo.Inactive(anon)
> >    4986534 ±  2%     +28.2%    6394741 ± 10%  meminfo.Memused
> >     475092 ± 22%    +236.5%    1598918 ± 41%  meminfo.Shmem
> >       2777            -2.1%       2719        turbostat.Bzy_MHz
> >   11143123 ±  6%     +72.0%   19162667        turbostat.C1
> >       0.24 ±  7%      +0.7        0.94 ±  3%  turbostat.C1%
> >     100440 ± 18%    +203.8%     305136 ± 15%  turbostat.C1E
> >       0.06 ±  9%      +0.1        0.18 ± 11%  turbostat.C1E%
> >       1.24 ±  3%      +1.6        2.81 ±  4%  turbostat.C6%
> >       1.38 ±  3%    +156.1%       3.55 ±  3%  turbostat.CPU%c1
> >       0.33 ±  5%     +76.5%       0.58 ±  7%  turbostat.CPU%c6
> >       0.16           +31.2%       0.21        turbostat.IPC
> >  6.866e+08 ±  5%     -87.8%   83575393 ±  5%  turbostat.IRQ
> >       0.33 ± 27%      +0.2        0.57        turbostat.POLL%
> >       0.12 ± 10%    +176.4%       0.33 ± 12%  turbostat.Pkg%pc2
> >       0.09 ±  7%    -100.0%       0.00        turbostat.Pkg%pc6
> >      61.33            +5.2%      64.50 ±  2%  turbostat.PkgTmp
> >      14.81            +2.0%      15.11        turbostat.RAMWatt
> >      16242 ±  8%     -62.0%       6179 ± 32%  proc-vmstat.nr_active_anon
> >      93150 ±  2%     +13.2%     105429        proc-vmstat.nr_anon_pages
> >     801219 ±  3%     +35.1%    1082320 ± 15%  proc-vmstat.nr_file_pages
> >     195506 ± 14%    +155.2%     498919 ± 33%  proc-vmstat.nr_inactive_anon
> >     118682 ± 22%    +236.9%     399783 ± 41%  proc-vmstat.nr_shmem
> >      16242 ±  8%     -62.0%       6179 ± 32%  proc-vmstat.nr_zone_active_anon
> >     195506 ± 14%    +155.2%     498919 ± 33%  proc-vmstat.nr_zone_inactive_anon
> >   70889233 ±  5%    +158.6%  1.833e+08        proc-vmstat.numa_hit
> >   70811086 ±  5%    +158.8%  1.832e+08        proc-vmstat.numa_local
> >      55885 ± 22%     -67.2%      18327 ± 38%  proc-vmstat.numa_pages_migrated
> >     422312 ± 10%     -95.4%      19371 ±  7%  proc-vmstat.pgactivate
> >   71068460 ±  5%    +158.1%  1.834e+08        proc-vmstat.pgalloc_normal
> >    1554994           -19.6%    1250346 ±  4%  proc-vmstat.pgfault
> >   71011267 ±  5%    +155.9%  1.817e+08        proc-vmstat.pgfree
> >      55885 ± 22%     -67.2%      18327 ± 38%  proc-vmstat.pgmigrate_success
> >     111247 ±  2%     -35.0%      72355 ±  2%  proc-vmstat.pgreuse
> >    2506368 ±  2%     -53.1%    1176320        proc-vmstat.unevictable_pgs_scanned
> >      20.06 ± 10%     -22.4%      15.56 ±  8%  sched_debug.cfs_rq:/.h_nr_running.max
> >       0.81 ± 32%     -93.1%       0.06 ±223%  sched_debug.cfs_rq:/.h_nr_running.min
> >       1917 ± 34%    -100.0%       0.00        sched_debug.cfs_rq:/.load.min
> >      24.18 ± 10%     +39.0%      33.62 ± 11%  sched_debug.cfs_rq:/.load_avg.avg
> >     245.61 ± 25%     +66.3%     408.33 ± 22%  sched_debug.cfs_rq:/.load_avg.max
> >      47.52 ± 13%     +72.6%      82.03 ±  8%  sched_debug.cfs_rq:/.load_avg.stddev
> >   13431147           -64.9%    4717147        sched_debug.cfs_rq:/.min_vruntime.avg
> >   18161799 ±  7%     -67.4%    5925316 ±  6%  sched_debug.cfs_rq:/.min_vruntime.max
> >   12413026           -65.0%    4340952        sched_debug.cfs_rq:/.min_vruntime.min
> >     739748 ± 16%     -66.6%     247410 ± 17%  sched_debug.cfs_rq:/.min_vruntime.stddev
> >       0.85           -16.4%       0.71        sched_debug.cfs_rq:/.nr_running.avg
> >       0.61 ± 25%     -90.9%       0.06 ±223%  sched_debug.cfs_rq:/.nr_running.min
> >       0.10 ± 25%    +109.3%       0.22 ±  7%  sched_debug.cfs_rq:/.nr_running.stddev
> >     169.22          +101.7%     341.33        sched_debug.cfs_rq:/.removed.load_avg.max
> >      32.41 ± 24%    +100.2%      64.90 ± 16%  sched_debug.cfs_rq:/.removed.load_avg.stddev
> >      82.92 ± 10%    +108.1%     172.56        sched_debug.cfs_rq:/.removed.runnable_avg.max
> >      13.60 ± 28%    +114.0%      29.10 ± 20%  sched_debug.cfs_rq:/.removed.runnable_avg.stddev
> >      82.92 ± 10%    +108.1%     172.56        sched_debug.cfs_rq:/.removed.util_avg.max
> >      13.60 ± 28%    +114.0%      29.10 ± 20%  sched_debug.cfs_rq:/.removed.util_avg.stddev
> >       2156 ± 12%     -36.6%       1368 ± 27%  sched_debug.cfs_rq:/.runnable_avg.min
> >       2285 ±  7%     -19.8%       1833 ±  6%  sched_debug.cfs_rq:/.runnable_avg.stddev
> >   -2389921           -64.8%    -840940        sched_debug.cfs_rq:/.spread0.min
> >     739781 ± 16%     -66.5%     247837 ± 17%  sched_debug.cfs_rq:/.spread0.stddev
> >     843.88 ±  2%     -20.5%     670.53        sched_debug.cfs_rq:/.util_avg.avg
> >     433.64 ±  7%     -43.5%     244.83 ± 17%  sched_debug.cfs_rq:/.util_avg.min
> >     187.00 ±  6%     +40.6%     263.02 ±  4%  sched_debug.cfs_rq:/.util_avg.stddev
> >     394.15 ± 14%     -29.5%     278.06 ±  3%  sched_debug.cfs_rq:/.util_est_enqueued.avg
> >       1128 ± 12%     -17.6%     930.39 ±  5%  sched_debug.cfs_rq:/.util_est_enqueued.max
> >      38.36 ± 29%    -100.0%       0.00        sched_debug.cfs_rq:/.util_est_enqueued.min
> >       3596 ± 15%     -39.5%       2175 ±  7%  sched_debug.cpu.avg_idle.min
> >     160647 ±  9%     -25.9%     118978 ±  9%  sched_debug.cpu.avg_idle.stddev
> >     197365           -46.2%     106170        sched_debug.cpu.clock.avg
> >     197450           -46.2%     106208        sched_debug.cpu.clock.max
> >     197281           -46.2%     106128        sched_debug.cpu.clock.min
> >      49.96 ± 22%     -53.1%      23.44 ± 19%  sched_debug.cpu.clock.stddev
> >     193146           -45.7%     104898        sched_debug.cpu.clock_task.avg
> >     194592           -45.8%     105455        sched_debug.cpu.clock_task.max
> >     177878           -49.3%      90211        sched_debug.cpu.clock_task.min
> >       1794 ±  5%     -10.7%       1602 ±  2%  sched_debug.cpu.clock_task.stddev
> >      13154 ±  2%     -20.3%      10479        sched_debug.cpu.curr->pid.avg
> >      15059           -17.2%      12468        sched_debug.cpu.curr->pid.max
> >       7263 ± 33%    -100.0%       0.00        sched_debug.cpu.curr->pid.min
> >       9321 ± 36%     +98.2%      18478 ± 44%  sched_debug.cpu.max_idle_balance_cost.stddev
> >       0.00 ± 17%     -41.6%       0.00 ± 13%  sched_debug.cpu.next_balance.stddev
> >      20.00 ± 11%     -21.4%      15.72 ±  7%  sched_debug.cpu.nr_running.max
> >       0.86 ± 17%     -87.1%       0.11 ±141%  sched_debug.cpu.nr_running.min
> >   25069883           -83.7%    4084117 ±  4%  sched_debug.cpu.nr_switches.avg
> >   26486718           -82.8%    4544009 ±  4%  sched_debug.cpu.nr_switches.max
> >   23680077           -84.5%    3663816 ±  4%  sched_debug.cpu.nr_switches.min
> >     589836 ±  3%     -68.7%     184621 ± 16%  sched_debug.cpu.nr_switches.stddev
> >     197278           -46.2%     106128        sched_debug.cpu_clk
> >     194327           -46.9%     103176        sched_debug.ktime
> >     197967           -46.0%     106821        sched_debug.sched_clk
> >      14.91           -37.6%       9.31        perf-stat.i.MPKI
> >  2.657e+10           +25.0%   3.32e+10        perf-stat.i.branch-instructions
> >       1.17            -0.4        0.78        perf-stat.i.branch-miss-rate%
> >  3.069e+08           -20.1%  2.454e+08        perf-stat.i.branch-misses
> >       6.43 ±  8%      +2.2        8.59 ±  4%  perf-stat.i.cache-miss-rate%
> >  1.952e+09           -24.3%  1.478e+09        perf-stat.i.cache-references
> >   14344055 ±  2%     -58.6%    5932018 ±  3%  perf-stat.i.context-switches
> >       1.83           -21.8%       1.43        perf-stat.i.cpi
> >  2.403e+11            -3.4%  2.322e+11        perf-stat.i.cpu-cycles
> >    1420139 ±  2%     -38.8%     869692 ±  5%  perf-stat.i.cpu-migrations
> >       2619 ±  7%     -15.5%       2212 ±  8%  perf-stat.i.cycles-between-cache-misses
> >       0.24 ± 19%      -0.1        0.10 ± 17%  perf-stat.i.dTLB-load-miss-rate%
> >   90403286 ± 19%     -55.8%   39926283 ± 16%  perf-stat.i.dTLB-load-misses
> >  3.823e+10           +28.6%  4.918e+10        perf-stat.i.dTLB-loads
> >       0.01 ± 34%      -0.0        0.01 ± 33%  perf-stat.i.dTLB-store-miss-rate%
> >    2779663 ± 34%     -52.7%    1315899 ± 31%  perf-stat.i.dTLB-store-misses
> >   2.19e+10           +24.2%   2.72e+10        perf-stat.i.dTLB-stores
> >      47.99 ±  2%     +28.0       75.94        perf-stat.i.iTLB-load-miss-rate%
> >   89417955 ±  2%     +38.7%   1.24e+08 ±  4%  perf-stat.i.iTLB-load-misses
> >   97721514 ±  2%     -58.2%   40865783 ±  3%  perf-stat.i.iTLB-loads
> >  1.329e+11           +26.3%  1.678e+11        perf-stat.i.instructions
> >       1503            -7.7%       1388 ±  3%  perf-stat.i.instructions-per-iTLB-miss
> >       0.55           +30.2%       0.72        perf-stat.i.ipc
> >       1.64 ± 18%    +217.4%       5.20 ± 11%  perf-stat.i.major-faults
> >       2.73            -3.7%       2.63        perf-stat.i.metric.GHz
> >       1098 ±  2%      -7.1%       1020 ±  3%  perf-stat.i.metric.K/sec
> >       1008           +24.4%       1254        perf-stat.i.metric.M/sec
> >       4334 ±  2%     +90.5%       8257 ±  7%  perf-stat.i.minor-faults
> >      90.94           -14.9       75.99        perf-stat.i.node-load-miss-rate%
> >   41932510 ±  8%     -43.0%   23899176 ± 10%  perf-stat.i.node-load-misses
> >    3366677 ±  5%     +86.2%    6267816        perf-stat.i.node-loads
> >      81.77 ±  3%     -36.3       45.52 ±  3%  perf-stat.i.node-store-miss-rate%
> >   18498318 ±  7%     -31.8%   12613933 ±  7%  perf-stat.i.node-store-misses
> >    3023556 ± 10%    +508.7%   18405880 ±  2%  perf-stat.i.node-stores
> >       4336 ±  2%     +90.5%       8262 ±  7%  perf-stat.i.page-faults
> >      14.70           -41.2%       8.65        perf-stat.overall.MPKI
> >       1.16            -0.4        0.72        perf-stat.overall.branch-miss-rate%
> >       6.22 ±  7%      +2.4        8.59 ±  4%  perf-stat.overall.cache-miss-rate%
> >       1.81           -24.3%       1.37        perf-stat.overall.cpi
> >       0.24 ± 19%      -0.2        0.07 ± 15%  perf-stat.overall.dTLB-load-miss-rate%
> >       0.01 ± 34%      -0.0        0.00 ± 29%  perf-stat.overall.dTLB-store-miss-rate%
> >      47.78 ±  2%     +29.3       77.12        perf-stat.overall.iTLB-load-miss-rate%
> >       1486            -9.1%       1351 ±  4%  perf-stat.overall.instructions-per-iTLB-miss
> >       0.55           +32.0%       0.73        perf-stat.overall.ipc
> >      92.54           -15.4       77.16 ±  2%  perf-stat.overall.node-load-miss-rate%
> >      85.82 ±  2%     -48.1       37.76 ±  5%  perf-stat.overall.node-store-miss-rate%
> >  2.648e+10           +25.2%  3.314e+10        perf-stat.ps.branch-instructions
> >   3.06e+08           -22.1%  2.383e+08        perf-stat.ps.branch-misses
> >  1.947e+09           -25.5%  1.451e+09        perf-stat.ps.cache-references
> >   14298713 ±  2%     -62.5%    5359285 ±  3%  perf-stat.ps.context-switches
> >  2.396e+11            -4.0%  2.299e+11        perf-stat.ps.cpu-cycles
> >    1415512 ±  2%     -42.2%     817981 ±  4%  perf-stat.ps.cpu-migrations
> >   90073948 ± 19%     -60.4%   35711862 ± 15%  perf-stat.ps.dTLB-load-misses
> >  3.811e+10           +29.7%  4.944e+10        perf-stat.ps.dTLB-loads
> >    2767291 ± 34%     -56.3%    1210210 ± 29%  perf-stat.ps.dTLB-store-misses
> >  2.183e+10           +25.0%  2.729e+10        perf-stat.ps.dTLB-stores
> >   89118809 ±  2%     +39.6%  1.244e+08 ±  4%  perf-stat.ps.iTLB-load-misses
> >   97404381 ±  2%     -62.2%   36860047 ±  3%  perf-stat.ps.iTLB-loads
> >  1.324e+11           +26.7%  1.678e+11        perf-stat.ps.instructions
> >       1.62 ± 18%    +164.7%       4.29 ±  8%  perf-stat.ps.major-faults
> >       4310 ±  2%     +75.1%       7549 ±  5%  perf-stat.ps.minor-faults
> >   41743097 ±  8%     -47.3%   21984450 ±  9%  perf-stat.ps.node-load-misses
> >    3356259 ±  5%     +92.6%    6462631        perf-stat.ps.node-loads
> >   18414647 ±  7%     -35.7%   11833799 ±  6%  perf-stat.ps.node-store-misses
> >    3019790 ± 10%    +545.0%   19478071        perf-stat.ps.node-stores
> >       4312 ±  2%     +75.2%       7553 ±  5%  perf-stat.ps.page-faults
> >  4.252e+13           -43.7%  2.395e+13        perf-stat.total.instructions
> >      29.92 ±  4%     -22.8        7.09 ± 29%  perf-profile.calltrace.cycles-pp.__wake_up_common_lock.pipe_write.vfs_write.ksys_write.do_syscall_64
> >      28.53 ±  5%     -21.6        6.92 ± 29%  perf-profile.calltrace.cycles-pp.__wake_up_common.__wake_up_common_lock.pipe_write.vfs_write.ksys_write
> >      27.86 ±  5%     -21.1        6.77 ± 29%  perf-profile.calltrace.cycles-pp.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.pipe_write.vfs_write
> >      27.55 ±  5%     -20.9        6.68 ± 29%  perf-profile.calltrace.cycles-pp.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.pipe_write
> >      22.28 ±  4%     -17.0        5.31 ± 30%  perf-profile.calltrace.cycles-pp.schedule.pipe_read.vfs_read.ksys_read.do_syscall_64
> >      21.98 ±  4%     -16.7        5.24 ± 30%  perf-profile.calltrace.cycles-pp.__schedule.schedule.pipe_read.vfs_read.ksys_read
> >      12.62 ±  4%      -9.6        3.00 ± 33%  perf-profile.calltrace.cycles-pp.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock
> >      34.09            -9.2       24.92 ±  3%  perf-profile.calltrace.cycles-pp.pipe_read.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
> >      11.48 ±  5%      -8.8        2.69 ± 38%  perf-profile.calltrace.cycles-pp.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common
> >       9.60 ±  7%      -7.2        2.40 ± 35%  perf-profile.calltrace.cycles-pp.dequeue_task_fair.__schedule.schedule.pipe_read.vfs_read
> >      36.39            -6.2       30.20        perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
> >      40.40            -6.1       34.28        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
> >      40.95            -5.7       35.26        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_read
> >      37.43            -5.4       32.07        perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
> >       6.30 ± 11%      -5.2        1.09 ± 36%  perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
> >       5.66 ± 12%      -5.1        0.58 ± 75%  perf-profile.calltrace.cycles-pp.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
> >       6.46 ± 10%      -5.1        1.40 ± 28%  perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
> >       5.53 ± 13%      -5.0        0.56 ± 75%  perf-profile.calltrace.cycles-pp.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
> >       5.42 ± 13%      -4.9        0.56 ± 75%  perf-profile.calltrace.cycles-pp.__schedule.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode
> >       5.82 ±  9%      -4.7        1.10 ± 37%  perf-profile.calltrace.cycles-pp._raw_spin_lock.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock
> >       5.86 ± 16%      -4.6        1.31 ± 37%  perf-profile.calltrace.cycles-pp.enqueue_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function
> >       5.26 ±  9%      -4.4        0.89 ± 57%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.try_to_wake_up.autoremove_wake_function.__wake_up_common
> >      45.18            -3.5       41.68        perf-profile.calltrace.cycles-pp.__libc_read
> >      50.31            -3.2       47.12        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
> >       4.00 ± 27%      -2.9        1.09 ± 40%  perf-profile.calltrace.cycles-pp.dequeue_entity.dequeue_task_fair.__schedule.schedule.pipe_read
> >      50.75            -2.7       48.06        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_write
> >      40.80            -2.6       38.20        perf-profile.calltrace.cycles-pp.pipe_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
> >       3.10 ± 15%      -2.5        0.62 ±103%  perf-profile.calltrace.cycles-pp.update_cfs_group.dequeue_task_fair.__schedule.schedule.pipe_read
> >       2.94 ± 12%      -2.3        0.62 ±102%  perf-profile.calltrace.cycles-pp.update_cfs_group.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function
> >       2.38 ±  9%      -2.0        0.38 ±102%  perf-profile.calltrace.cycles-pp._raw_spin_lock.__schedule.schedule.pipe_read.vfs_read
> >       2.24 ±  7%      -1.8        0.40 ± 71%  perf-profile.calltrace.cycles-pp.prepare_to_wait_event.pipe_read.vfs_read.ksys_read.do_syscall_64
> >       2.08 ±  6%      -1.8        0.29 ±100%  perf-profile.calltrace.cycles-pp.pick_next_task_fair.__schedule.schedule.pipe_read.vfs_read
> >       2.10 ± 10%      -1.8        0.32 ±104%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.__schedule.schedule.pipe_read
> >       2.76 ±  7%      -1.5        1.24 ± 17%  perf-profile.calltrace.cycles-pp.select_task_rq.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock
> >       2.27 ±  5%      -1.4        0.88 ± 11%  perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
> >       2.43 ±  7%      -1.3        1.16 ± 17%  perf-profile.calltrace.cycles-pp.select_task_rq_fair.select_task_rq.try_to_wake_up.autoremove_wake_function.__wake_up_common
> >       2.46 ±  5%      -1.3        1.20 ±  7%  perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
> >       1.54 ±  5%      -1.2        0.32 ±101%  perf-profile.calltrace.cycles-pp.update_load_avg.enqueue_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up
> >       0.97 ±  9%      -0.3        0.66 ± 19%  perf-profile.calltrace.cycles-pp.select_idle_sibling.select_task_rq_fair.select_task_rq.try_to_wake_up.autoremove_wake_function
> >       0.86 ±  6%      +0.2        1.02        perf-profile.calltrace.cycles-pp.__might_fault._copy_from_iter.copy_page_from_iter.pipe_write.vfs_write
> >       0.64 ±  9%      +0.5        1.16 ±  5%  perf-profile.calltrace.cycles-pp.mutex_unlock.pipe_read.vfs_read.ksys_read.do_syscall_64
> >       0.47 ± 45%      +0.5        0.99 ±  5%  perf-profile.calltrace.cycles-pp.__fget_light.__fdget_pos.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
> >       0.60 ±  8%      +0.5        1.13 ±  5%  perf-profile.calltrace.cycles-pp.__fdget_pos.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
> >       0.00            +0.5        0.54 ±  5%  perf-profile.calltrace.cycles-pp.current_time.file_update_time.pipe_write.vfs_write.ksys_write
> >       0.00            +0.6        0.56 ±  4%  perf-profile.calltrace.cycles-pp.__might_resched.__might_fault._copy_from_iter.copy_page_from_iter.pipe_write
> >       0.00            +0.6        0.56 ±  7%  perf-profile.calltrace.cycles-pp.__might_resched.__might_fault._copy_to_iter.copy_page_to_iter.pipe_read
> >       0.00            +0.6        0.58 ±  5%  perf-profile.calltrace.cycles-pp.__might_resched.mutex_lock.pipe_write.vfs_write.ksys_write
> >       0.00            +0.6        0.62 ±  3%  perf-profile.calltrace.cycles-pp.__might_resched.mutex_lock.pipe_read.vfs_read.ksys_read
> >       0.00            +0.7        0.65 ±  6%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.prepare_to_wait_event.pipe_write.vfs_write
> >       0.00            +0.7        0.65 ±  7%  perf-profile.calltrace.cycles-pp.poll_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
> >       0.57 ±  5%      +0.7        1.24 ±  6%  perf-profile.calltrace.cycles-pp.__fget_light.__fdget_pos.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
> >       0.00            +0.7        0.72 ±  6%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.prepare_to_wait_event.pipe_write.vfs_write.ksys_write
> >       0.00            +0.8        0.75 ±  6%  perf-profile.calltrace.cycles-pp.mutex_spin_on_owner.__mutex_lock.pipe_write.vfs_write.ksys_write
> >       0.74 ±  9%      +0.8        1.48 ±  5%  perf-profile.calltrace.cycles-pp.file_update_time.pipe_write.vfs_write.ksys_write.do_syscall_64
> >       0.63 ±  5%      +0.8        1.40 ±  5%  perf-profile.calltrace.cycles-pp.__fdget_pos.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
> >       0.00            +0.8        0.78 ± 19%  perf-profile.calltrace.cycles-pp.reader__read_event.perf_session__process_events.record__finish_output.__cmd_record
> >       0.00            +0.8        0.78 ± 19%  perf-profile.calltrace.cycles-pp.record__finish_output.__cmd_record
> >       0.00            +0.8        0.78 ± 19%  perf-profile.calltrace.cycles-pp.perf_session__process_events.record__finish_output.__cmd_record
> >       0.00            +0.8        0.80 ± 15%  perf-profile.calltrace.cycles-pp.__cmd_record
> >       0.00            +0.8        0.82 ± 11%  perf-profile.calltrace.cycles-pp.intel_idle_irq.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
> >       0.00            +0.9        0.85 ±  6%  perf-profile.calltrace.cycles-pp.prepare_to_wait_event.pipe_write.vfs_write.ksys_write.do_syscall_64
> >       0.00            +0.9        0.86 ±  4%  perf-profile.calltrace.cycles-pp.current_time.atime_needs_update.touch_atime.pipe_read.vfs_read
> >       0.00            +0.9        0.87 ±  5%  perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__libc_write
> >       0.00            +0.9        0.88 ±  5%  perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__libc_read
> >       0.26 ±100%      +1.0        1.22 ± 10%  perf-profile.calltrace.cycles-pp.__schedule.schedule.pipe_write.vfs_write.ksys_write
> >       0.00            +1.0        0.96 ±  6%  perf-profile.calltrace.cycles-pp.__might_fault._copy_to_iter.copy_page_to_iter.pipe_read.vfs_read
> >       0.27 ±100%      +1.0        1.23 ± 10%  perf-profile.calltrace.cycles-pp.schedule.pipe_write.vfs_write.ksys_write.do_syscall_64
> >       0.00            +1.0        0.97 ±  7%  perf-profile.calltrace.cycles-pp.page_counter_uncharge.uncharge_batch.__mem_cgroup_uncharge.__folio_put.pipe_read
> >       0.87 ±  8%      +1.1        1.98 ±  5%  perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.vfs_read.ksys_read.do_syscall_64
> >       0.73 ±  6%      +1.1        1.85 ±  5%  perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.vfs_write.ksys_write.do_syscall_64
> >       0.00            +1.2        1.15 ±  7%  perf-profile.calltrace.cycles-pp.uncharge_batch.__mem_cgroup_uncharge.__folio_put.pipe_read.vfs_read
> >       0.00            +1.2        1.23 ±  6%  perf-profile.calltrace.cycles-pp.__mem_cgroup_uncharge.__folio_put.pipe_read.vfs_read.ksys_read
> >       0.00            +1.2        1.24 ±  7%  perf-profile.calltrace.cycles-pp.__folio_put.pipe_read.vfs_read.ksys_read.do_syscall_64
> >       0.48 ± 45%      +1.3        1.74 ±  6%  perf-profile.calltrace.cycles-pp.atime_needs_update.touch_atime.pipe_read.vfs_read.ksys_read
> >       0.60 ±  7%      +1.3        1.87 ±  8%  perf-profile.calltrace.cycles-pp.__wake_up_common_lock.pipe_read.vfs_read.ksys_read.do_syscall_64
> >       1.23 ±  7%      +1.3        2.51 ±  4%  perf-profile.calltrace.cycles-pp.mutex_lock.pipe_read.vfs_read.ksys_read.do_syscall_64
> >      43.42            +1.3       44.75        perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
> >       0.83 ±  7%      +1.3        2.17 ±  5%  perf-profile.calltrace.cycles-pp.security_file_permission.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
> >       0.98 ±  7%      +1.4        2.36 ±  6%  perf-profile.calltrace.cycles-pp.security_file_permission.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
> >       0.27 ±100%      +1.4        1.70 ±  9%  perf-profile.calltrace.cycles-pp.__wake_up_common.__wake_up_common_lock.pipe_read.vfs_read.ksys_read
> >       0.79 ±  8%      +1.4        2.23 ±  6%  perf-profile.calltrace.cycles-pp.touch_atime.pipe_read.vfs_read.ksys_read.do_syscall_64
> >       0.18 ±141%      +1.5        1.63 ±  9%  perf-profile.calltrace.cycles-pp.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.pipe_read
> >       0.18 ±141%      +1.5        1.67 ±  9%  perf-profile.calltrace.cycles-pp.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.pipe_read.vfs_read
> >       0.00            +1.6        1.57 ± 10%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
> >       0.00            +1.6        1.57 ± 10%  perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
> >       1.05 ±  8%      +1.7        2.73 ±  6%  perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyin._copy_from_iter.copy_page_from_iter.pipe_write
> >       1.84 ±  9%      +1.7        3.56 ±  5%  perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyout._copy_to_iter.copy_page_to_iter.pipe_read
> >       1.41 ±  9%      +1.8        3.17 ±  6%  perf-profile.calltrace.cycles-pp.copyin._copy_from_iter.copy_page_from_iter.pipe_write.vfs_write
> >       0.00            +1.8        1.79 ±  9%  perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
> >       1.99 ±  9%      +2.0        3.95 ±  5%  perf-profile.calltrace.cycles-pp.copyout._copy_to_iter.copy_page_to_iter.pipe_read.vfs_read
> >       2.40 ±  7%      +2.4        4.82 ±  5%  perf-profile.calltrace.cycles-pp._copy_from_iter.copy_page_from_iter.pipe_write.vfs_write.ksys_write
> >       0.00            +2.5        2.50 ±  7%  perf-profile.calltrace.cycles-pp.__mutex_lock.pipe_write.vfs_write.ksys_write.do_syscall_64
> >       2.89 ±  8%      +2.6        5.47 ±  5%  perf-profile.calltrace.cycles-pp.copy_page_from_iter.pipe_write.vfs_write.ksys_write.do_syscall_64
> >       1.04 ± 30%      +2.8        3.86 ±  5%  perf-profile.calltrace.cycles-pp.__entry_text_start.__libc_write
> >       0.00            +2.9        2.90 ± 11%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
> >       0.00            +2.9        2.91 ± 11%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
> >       0.00            +2.9        2.91 ± 11%  perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64_no_verify
> >       0.85 ± 27%      +2.9        3.80 ±  5%  perf-profile.calltrace.cycles-pp.__entry_text_start.__libc_read
> >       0.00            +3.0        2.96 ± 11%  perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
> >       2.60 ±  9%      +3.1        5.74 ±  6%  perf-profile.calltrace.cycles-pp._copy_to_iter.copy_page_to_iter.pipe_read.vfs_read.ksys_read
> >       2.93 ±  9%      +3.7        6.66 ±  5%  perf-profile.calltrace.cycles-pp.copy_page_to_iter.pipe_read.vfs_read.ksys_read.do_syscall_64
> >       1.60 ± 12%      +4.6        6.18 ±  7%  perf-profile.calltrace.cycles-pp.mutex_unlock.pipe_write.vfs_write.ksys_write.do_syscall_64
> >       2.60 ± 10%      +4.6        7.24 ±  5%  perf-profile.calltrace.cycles-pp.mutex_lock.pipe_write.vfs_write.ksys_write.do_syscall_64
> >      28.75 ±  5%     -21.6        7.19 ± 28%  perf-profile.children.cycles-pp.schedule
> >      30.52 ±  4%     -21.6        8.97 ± 22%  perf-profile.children.cycles-pp.__wake_up_common_lock
> >      28.53 ±  6%     -21.0        7.56 ± 26%  perf-profile.children.cycles-pp.__schedule
> >      29.04 ±  5%     -20.4        8.63 ± 23%  perf-profile.children.cycles-pp.__wake_up_common
> >      28.37 ±  5%     -19.9        8.44 ± 23%  perf-profile.children.cycles-pp.autoremove_wake_function
> >      28.08 ±  5%     -19.7        8.33 ± 23%  perf-profile.children.cycles-pp.try_to_wake_up
> >      13.90 ±  2%     -10.2        3.75 ± 28%  perf-profile.children.cycles-pp.ttwu_do_activate
> >      12.66 ±  3%      -9.2        3.47 ± 29%  perf-profile.children.cycles-pp.enqueue_task_fair
> >      34.20            -9.2       25.05 ±  3%  perf-profile.children.cycles-pp.pipe_read
> >      90.86            -9.1       81.73        perf-profile.children.cycles-pp.do_syscall_64
> >      91.80            -8.3       83.49        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
> >      10.28 ±  7%      -7.8        2.53 ± 27%  perf-profile.children.cycles-pp._raw_spin_lock
> >       9.85 ±  7%      -6.9        2.92 ± 29%  perf-profile.children.cycles-pp.dequeue_task_fair
> >       8.69 ±  7%      -6.6        2.05 ± 24%  perf-profile.children.cycles-pp.exit_to_user_mode_prepare
> >       8.99 ±  6%      -6.2        2.81 ± 16%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode
> >      36.46            -6.1       30.34        perf-profile.children.cycles-pp.vfs_read
> >       8.38 ±  8%      -5.8        2.60 ± 23%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
> >       6.10 ± 11%      -5.4        0.66 ± 61%  perf-profile.children.cycles-pp.exit_to_user_mode_loop
> >      37.45            -5.3       32.13        perf-profile.children.cycles-pp.ksys_read
> >       6.50 ± 35%      -4.9        1.62 ± 61%  perf-profile.children.cycles-pp.update_curr
> >       6.56 ± 15%      -4.6        1.95 ± 57%  perf-profile.children.cycles-pp.update_cfs_group
> >       6.38 ± 14%      -4.5        1.91 ± 28%  perf-profile.children.cycles-pp.enqueue_entity
> >       5.74 ±  5%      -3.8        1.92 ± 25%  perf-profile.children.cycles-pp.update_load_avg
> >      45.56            -3.8       41.75        perf-profile.children.cycles-pp.__libc_read
> >       3.99 ±  4%      -3.1        0.92 ± 24%  perf-profile.children.cycles-pp.pick_next_task_fair
> >       4.12 ± 27%      -2.7        1.39 ± 34%  perf-profile.children.cycles-pp.dequeue_entity
> >      40.88            -2.5       38.37        perf-profile.children.cycles-pp.pipe_write
> >       3.11 ±  4%      -2.4        0.75 ± 22%  perf-profile.children.cycles-pp.switch_mm_irqs_off
> >       2.06 ± 33%      -1.8        0.27 ± 27%  perf-profile.children.cycles-pp.asm_sysvec_call_function_single
> >       2.38 ± 41%      -1.8        0.60 ± 72%  perf-profile.children.cycles-pp.perf_trace_sched_wakeup_template
> >       2.29 ±  5%      -1.7        0.60 ± 25%  perf-profile.children.cycles-pp.switch_fpu_return
> >       2.30 ±  6%      -1.6        0.68 ± 18%  perf-profile.children.cycles-pp.prepare_task_switch
> >       1.82 ± 33%      -1.6        0.22 ± 31%  perf-profile.children.cycles-pp.sysvec_call_function_single
> >       1.77 ± 33%      -1.6        0.20 ± 32%  perf-profile.children.cycles-pp.__sysvec_call_function_single
> >       1.96 ±  5%      -1.5        0.50 ± 20%  perf-profile.children.cycles-pp.reweight_entity
> >       2.80 ±  7%      -1.2        1.60 ± 12%  perf-profile.children.cycles-pp.select_task_rq
> >       1.61 ±  6%      -1.2        0.42 ± 25%  perf-profile.children.cycles-pp.restore_fpregs_from_fpstate
> >       1.34 ±  9%      -1.2        0.16 ± 28%  perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
> >       1.62 ±  4%      -1.2        0.45 ± 22%  perf-profile.children.cycles-pp.set_next_entity
> >       1.55 ±  8%      -1.1        0.43 ± 12%  perf-profile.children.cycles-pp.update_rq_clock
> >       1.49 ±  8%      -1.1        0.41 ± 14%  perf-profile.children.cycles-pp.__update_load_avg_cfs_rq
> >       1.30 ± 20%      -1.0        0.26 ± 18%  perf-profile.children.cycles-pp.finish_task_switch
> >       1.44 ±  5%      -1.0        0.42 ± 19%  perf-profile.children.cycles-pp.__switch_to_asm
> >       2.47 ±  7%      -1.0        1.50 ± 12%  perf-profile.children.cycles-pp.select_task_rq_fair
> >       2.33 ±  7%      -0.9        1.40 ±  3%  perf-profile.children.cycles-pp.prepare_to_wait_event
> >       1.24 ±  7%      -0.9        0.35 ± 14%  perf-profile.children.cycles-pp.__update_load_avg_se
> >       1.41 ± 32%      -0.9        0.56 ± 24%  perf-profile.children.cycles-pp.sched_ttwu_pending
> >       2.29 ±  8%      -0.8        1.45 ±  3%  perf-profile.children.cycles-pp._raw_spin_lock_irqsave
> >       1.04 ±  7%      -0.8        0.24 ± 22%  perf-profile.children.cycles-pp.check_preempt_curr
> >       1.01 ±  3%      -0.7        0.30 ± 20%  perf-profile.children.cycles-pp.__switch_to
> >       0.92 ±  7%      -0.7        0.26 ± 12%  perf-profile.children.cycles-pp.update_min_vruntime
> >       0.71 ±  2%      -0.6        0.08 ± 75%  perf-profile.children.cycles-pp.put_prev_entity
> >       0.76 ±  6%      -0.6        0.14 ± 32%  perf-profile.children.cycles-pp.check_preempt_wakeup
> >       0.81 ± 66%      -0.6        0.22 ± 34%  perf-profile.children.cycles-pp.set_task_cpu
> >       0.82 ± 17%      -0.6        0.23 ± 10%  perf-profile.children.cycles-pp.cpuacct_charge
> >       1.08 ± 15%      -0.6        0.51 ± 10%  perf-profile.children.cycles-pp.wake_affine
> >       0.56 ± 15%      -0.5        0.03 ±100%  perf-profile.children.cycles-pp.asm_sysvec_reschedule_ipi
> >       0.66 ±  3%      -0.5        0.15 ± 28%  perf-profile.children.cycles-pp.os_xsave
> >       0.52 ± 44%      -0.5        0.06 ±151%  perf-profile.children.cycles-pp.native_irq_return_iret
> >       0.55 ±  5%      -0.4        0.15 ± 21%  perf-profile.children.cycles-pp.__calc_delta
> >       0.56 ± 10%      -0.4        0.17 ± 26%  perf-profile.children.cycles-pp.___perf_sw_event
> >       0.70 ± 15%      -0.4        0.32 ± 11%  perf-profile.children.cycles-pp.task_h_load
> >       0.40 ±  4%      -0.3        0.06 ± 49%  perf-profile.children.cycles-pp.pick_next_entity
> >       0.57 ±  6%      -0.3        0.26 ±  7%  perf-profile.children.cycles-pp.__list_del_entry_valid
> >       0.39 ±  8%      -0.3        0.08 ± 24%  perf-profile.children.cycles-pp.set_next_buddy
> >       0.64 ±  6%      -0.3        0.36 ±  6%  perf-profile.children.cycles-pp._raw_spin_lock_irq
> >       0.53 ± 20%      -0.3        0.25 ±  8%  perf-profile.children.cycles-pp.ttwu_queue_wakelist
> >       0.36 ±  8%      -0.3        0.08 ± 11%  perf-profile.children.cycles-pp.rb_insert_color
> >       0.41 ±  6%      -0.3        0.14 ± 17%  perf-profile.children.cycles-pp.sched_clock_cpu
> >       0.36 ± 33%      -0.3        0.10 ± 17%  perf-profile.children.cycles-pp.__flush_smp_call_function_queue
> >       0.37 ±  4%      -0.2        0.13 ± 16%  perf-profile.children.cycles-pp.native_sched_clock
> >       0.28 ±  5%      -0.2        0.07 ± 18%  perf-profile.children.cycles-pp.rb_erase
> >       0.32 ±  7%      -0.2        0.12 ± 10%  perf-profile.children.cycles-pp.__list_add_valid
> >       0.23 ±  6%      -0.2        0.03 ±103%  perf-profile.children.cycles-pp.resched_curr
> >       0.27 ±  5%      -0.2        0.08 ± 20%  perf-profile.children.cycles-pp.__wrgsbase_inactive
> >       0.26 ±  6%      -0.2        0.08 ± 17%  perf-profile.children.cycles-pp.finish_wait
> >       0.26 ±  4%      -0.2        0.08 ± 11%  perf-profile.children.cycles-pp.rcu_note_context_switch
> >       0.33 ± 21%      -0.2        0.15 ± 32%  perf-profile.children.cycles-pp.migrate_task_rq_fair
> >       0.22 ±  9%      -0.2        0.07 ± 22%  perf-profile.children.cycles-pp.perf_trace_buf_update
> >       0.17 ±  8%      -0.1        0.03 ±100%  perf-profile.children.cycles-pp.rb_next
> >       0.15 ± 32%      -0.1        0.03 ±100%  perf-profile.children.cycles-pp.llist_reverse_order
> >       0.34 ±  7%      -0.1        0.26 ±  3%  perf-profile.children.cycles-pp.anon_pipe_buf_release
> >       0.14 ±  6%      -0.1        0.07 ± 17%  perf-profile.children.cycles-pp.read@plt
> >       0.10 ± 17%      -0.1        0.04 ± 75%  perf-profile.children.cycles-pp.remove_entity_load_avg
> >       0.07 ± 10%      -0.0        0.02 ± 99%  perf-profile.children.cycles-pp.generic_update_time
> >       0.11 ±  6%      -0.0        0.07 ±  8%  perf-profile.children.cycles-pp.__mark_inode_dirty
> >       0.00            +0.1        0.06 ±  9%  perf-profile.children.cycles-pp.load_balance
> >       0.00            +0.1        0.06 ± 11%  perf-profile.children.cycles-pp._raw_spin_trylock
> >       0.00            +0.1        0.06 ±  7%  perf-profile.children.cycles-pp.uncharge_folio
> >       0.00            +0.1        0.06 ±  7%  perf-profile.children.cycles-pp.__do_softirq
> >       0.00            +0.1        0.07 ± 10%  perf-profile.children.cycles-pp.tick_nohz_get_sleep_length
> >       0.00            +0.1        0.08 ± 14%  perf-profile.children.cycles-pp.__get_obj_cgroup_from_memcg
> >       0.15 ± 23%      +0.1        0.23 ±  7%  perf-profile.children.cycles-pp.task_tick_fair
> >       0.19 ± 17%      +0.1        0.28 ±  7%  perf-profile.children.cycles-pp.scheduler_tick
> >       0.00            +0.1        0.10 ± 21%  perf-profile.children.cycles-pp.select_idle_core
> >       0.00            +0.1        0.10 ±  9%  perf-profile.children.cycles-pp.osq_unlock
> >       0.23 ± 12%      +0.1        0.34 ±  6%  perf-profile.children.cycles-pp.update_process_times
> >       0.37 ± 13%      +0.1        0.48 ±  5%  perf-profile.children.cycles-pp.hrtimer_interrupt
> >       0.24 ± 12%      +0.1        0.35 ±  6%  perf-profile.children.cycles-pp.tick_sched_handle
> >       0.31 ± 14%      +0.1        0.43 ±  4%  perf-profile.children.cycles-pp.__hrtimer_run_queues
> >       0.37 ± 12%      +0.1        0.49 ±  5%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
> >       0.00            +0.1        0.12 ± 10%  perf-profile.children.cycles-pp.__mod_memcg_state
> >       0.26 ± 10%      +0.1        0.38 ±  6%  perf-profile.children.cycles-pp.tick_sched_timer
> >       0.00            +0.1        0.13 ±  7%  perf-profile.children.cycles-pp.free_unref_page
> >       0.00            +0.1        0.14 ±  8%  perf-profile.children.cycles-pp.rmqueue
> >       0.15 ±  8%      +0.2        0.30 ±  5%  perf-profile.children.cycles-pp.rcu_all_qs
> >       0.16 ±  6%      +0.2        0.31 ±  5%  perf-profile.children.cycles-pp.__x64_sys_write
> >       0.00            +0.2        0.16 ± 10%  perf-profile.children.cycles-pp.propagate_protected_usage
> >       0.00            +0.2        0.16 ± 10%  perf-profile.children.cycles-pp.menu_select
> >       0.00            +0.2        0.16 ±  9%  perf-profile.children.cycles-pp.memcg_account_kmem
> >       0.42 ± 12%      +0.2        0.57 ±  4%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
> >       0.15 ± 11%      +0.2        0.31 ±  8%  perf-profile.children.cycles-pp.__x64_sys_read
> >       0.00            +0.2        0.17 ±  8%  perf-profile.children.cycles-pp.get_page_from_freelist
> >       0.44 ± 11%      +0.2        0.62 ±  4%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
> >       0.10 ± 31%      +0.2        0.28 ± 24%  perf-profile.children.cycles-pp.mnt_user_ns
> >       0.16 ±  4%      +0.2        0.35 ±  5%  perf-profile.children.cycles-pp.kill_fasync
> >       0.20 ± 10%      +0.2        0.40 ±  3%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
> >       0.09 ±  7%      +0.2        0.29 ±  4%  perf-profile.children.cycles-pp.page_copy_sane
> >       0.08 ±  8%      +0.2        0.31 ±  6%  perf-profile.children.cycles-pp.rw_verify_area
> >       0.12 ± 11%      +0.2        0.36 ±  8%  perf-profile.children.cycles-pp.ktime_get_coarse_real_ts64
> >       0.28 ± 12%      +0.2        0.52 ±  5%  perf-profile.children.cycles-pp.inode_needs_update_time
> >       0.00            +0.3        0.27 ±  7%  perf-profile.children.cycles-pp.__memcg_kmem_charge_page
> >       0.43 ±  6%      +0.3        0.73 ±  5%  perf-profile.children.cycles-pp.__cond_resched
> >       0.21 ± 29%      +0.3        0.54 ± 15%  perf-profile.children.cycles-pp.select_idle_cpu
> >       0.10 ± 10%      +0.3        0.43 ± 17%  perf-profile.children.cycles-pp.fsnotify_perm
> >       0.23 ± 11%      +0.3        0.56 ±  6%  perf-profile.children.cycles-pp.syscall_enter_from_user_mode
> >       0.06 ± 75%      +0.4        0.47 ± 27%  perf-profile.children.cycles-pp.queue_event
> >       0.21 ±  9%      +0.4        0.62 ±  5%  perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
> >       0.06 ± 75%      +0.4        0.48 ± 26%  perf-profile.children.cycles-pp.ordered_events__queue
> >       0.06 ± 73%      +0.4        0.50 ± 24%  perf-profile.children.cycles-pp.process_simple
> >       0.01 ±223%      +0.4        0.44 ±  9%  perf-profile.children.cycles-pp.schedule_idle
> >       0.05 ±  8%      +0.5        0.52 ±  7%  perf-profile.children.cycles-pp.__alloc_pages
> >       0.45 ±  7%      +0.5        0.94 ±  5%  perf-profile.children.cycles-pp.__get_task_ioprio
> >       0.89 ±  8%      +0.5        1.41 ±  4%  perf-profile.children.cycles-pp.__might_sleep
> >       0.01 ±223%      +0.5        0.54 ± 21%  perf-profile.children.cycles-pp.flush_smp_call_function_queue
> >       0.05 ± 46%      +0.5        0.60 ±  7%  perf-profile.children.cycles-pp.osq_lock
> >       0.34 ±  8%      +0.6        0.90 ±  5%  perf-profile.children.cycles-pp.aa_file_perm
> >       0.01 ±223%      +0.7        0.67 ±  7%  perf-profile.children.cycles-pp.poll_idle
> >       0.14 ± 17%      +0.7        0.82 ±  6%  perf-profile.children.cycles-pp.mutex_spin_on_owner
> >       0.12 ± 12%      +0.7        0.82 ± 15%  perf-profile.children.cycles-pp.__cmd_record
> >       0.07 ± 72%      +0.7        0.78 ± 19%  perf-profile.children.cycles-pp.reader__read_event
> >       0.07 ± 72%      +0.7        0.78 ± 19%  perf-profile.children.cycles-pp.record__finish_output
> >       0.07 ± 72%      +0.7        0.78 ± 19%  perf-profile.children.cycles-pp.perf_session__process_events
> >       0.76 ±  8%      +0.8        1.52 ±  5%  perf-profile.children.cycles-pp.file_update_time
> >       0.08 ± 61%      +0.8        0.85 ± 11%  perf-profile.children.cycles-pp.intel_idle_irq
> >       1.23 ±  8%      +0.9        2.11 ±  4%  perf-profile.children.cycles-pp.__might_fault
> >       0.02 ±141%      +1.0        0.97 ±  7%  perf-profile.children.cycles-pp.page_counter_uncharge
> >       0.51 ±  9%      +1.0        1.48 ±  4%  perf-profile.children.cycles-pp.current_time
> >       0.05 ± 46%      +1.1        1.15 ±  7%  perf-profile.children.cycles-pp.uncharge_batch
> >       1.12 ±  6%      +1.1        2.23 ±  5%  perf-profile.children.cycles-pp.__fget_light
> >       0.06 ± 14%      +1.2        1.23 ±  6%  perf-profile.children.cycles-pp.__mem_cgroup_uncharge
> >       0.06 ± 14%      +1.2        1.24 ±  7%  perf-profile.children.cycles-pp.__folio_put
> >       0.64 ±  7%      +1.2        1.83 ±  5%  perf-profile.children.cycles-pp.syscall_return_via_sysret
> >       1.19 ±  8%      +1.2        2.42 ±  4%  perf-profile.children.cycles-pp.__might_resched
> >       0.59 ±  9%      +1.3        1.84 ±  6%  perf-profile.children.cycles-pp.atime_needs_update
> >      43.47            +1.4       44.83        perf-profile.children.cycles-pp.ksys_write
> >       1.28 ±  6%      +1.4        2.68 ±  5%  perf-profile.children.cycles-pp.__fdget_pos
> >       0.80 ±  8%      +1.5        2.28 ±  6%  perf-profile.children.cycles-pp.touch_atime
> >       0.11 ± 49%      +1.5        1.59 ±  9%  perf-profile.children.cycles-pp.cpuidle_enter_state
> >       0.11 ± 49%      +1.5        1.60 ±  9%  perf-profile.children.cycles-pp.cpuidle_enter
> >       0.12 ± 51%      +1.7        1.81 ±  9%  perf-profile.children.cycles-pp.cpuidle_idle_call
> >       1.44 ±  8%      +1.8        3.22 ±  6%  perf-profile.children.cycles-pp.copyin
> >       2.00 ±  9%      +2.0        4.03 ±  5%  perf-profile.children.cycles-pp.copyout
> >       1.02 ±  8%      +2.0        3.07 ±  5%  perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
> >       1.63 ±  7%      +2.3        3.90 ±  5%  perf-profile.children.cycles-pp.apparmor_file_permission
> >       2.64 ±  8%      +2.3        4.98 ±  5%  perf-profile.children.cycles-pp._copy_from_iter
> >       0.40 ± 14%      +2.5        2.92 ±  7%  perf-profile.children.cycles-pp.__mutex_lock
> >       2.91 ±  8%      +2.6        5.54 ±  5%  perf-profile.children.cycles-pp.copy_page_from_iter
> >       0.17 ± 62%      +2.7        2.91 ± 11%  perf-profile.children.cycles-pp.start_secondary
> >       1.83 ±  7%      +2.8        4.59 ±  5%  perf-profile.children.cycles-pp.security_file_permission
> >       0.17 ± 60%      +2.8        2.94 ± 11%  perf-profile.children.cycles-pp.do_idle
> >       0.17 ± 60%      +2.8        2.96 ± 11%  perf-profile.children.cycles-pp.secondary_startup_64_no_verify
> >       0.17 ± 60%      +2.8        2.96 ± 11%  perf-profile.children.cycles-pp.cpu_startup_entry
> >       2.62 ±  9%      +3.2        5.84 ±  6%  perf-profile.children.cycles-pp._copy_to_iter
> >       1.55 ±  8%      +3.2        4.79 ±  5%  perf-profile.children.cycles-pp.__entry_text_start
> >       3.09 ±  8%      +3.7        6.77 ±  5%  perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
> >       2.95 ±  9%      +3.8        6.73 ±  5%  perf-profile.children.cycles-pp.copy_page_to_iter
> >       2.28 ± 11%      +5.1        7.40 ±  6%  perf-profile.children.cycles-pp.mutex_unlock
> >       3.92 ±  9%      +6.0        9.94 ±  5%  perf-profile.children.cycles-pp.mutex_lock
> >       8.37 ±  9%      -5.8        2.60 ± 23%  perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
> >       6.54 ± 15%      -4.6        1.95 ± 57%  perf-profile.self.cycles-pp.update_cfs_group
> >       3.08 ±  4%      -2.3        0.74 ± 22%  perf-profile.self.cycles-pp.switch_mm_irqs_off
> >       2.96 ±  4%      -1.8        1.13 ± 33%  perf-profile.self.cycles-pp.update_load_avg
> >       2.22 ±  8%      -1.5        0.74 ± 12%  perf-profile.self.cycles-pp._raw_spin_lock_irqsave
> >       1.96 ±  9%      -1.5        0.48 ± 15%  perf-profile.self.cycles-pp.update_curr
> >       1.94 ±  5%      -1.3        0.64 ± 16%  perf-profile.self.cycles-pp._raw_spin_lock
> >       1.78 ±  5%      -1.3        0.50 ± 18%  perf-profile.self.cycles-pp.__schedule
> >       1.59 ±  7%      -1.2        0.40 ± 12%  perf-profile.self.cycles-pp.enqueue_entity
> >       1.61 ±  6%      -1.2        0.42 ± 25%  perf-profile.self.cycles-pp.restore_fpregs_from_fpstate
> >       1.44 ±  8%      -1.0        0.39 ± 14%  perf-profile.self.cycles-pp.__update_load_avg_cfs_rq
> >       1.42 ±  5%      -1.0        0.41 ± 19%  perf-profile.self.cycles-pp.__switch_to_asm
> >       1.18 ±  7%      -0.9        0.33 ± 14%  perf-profile.self.cycles-pp.__update_load_avg_se
> >       1.14 ± 10%      -0.8        0.31 ±  9%  perf-profile.self.cycles-pp.update_rq_clock
> >       0.90 ±  7%      -0.7        0.19 ± 21%  perf-profile.self.cycles-pp.pick_next_task_fair
> >       1.04 ±  7%      -0.7        0.33 ± 13%  perf-profile.self.cycles-pp.prepare_task_switch
> >       0.98 ±  4%      -0.7        0.29 ± 20%  perf-profile.self.cycles-pp.__switch_to
> >       0.88 ±  6%      -0.7        0.20 ± 17%  perf-profile.self.cycles-pp.enqueue_task_fair
> >       1.01 ±  6%      -0.7        0.35 ± 10%  perf-profile.self.cycles-pp.prepare_to_wait_event
> >       0.90 ±  8%      -0.6        0.25 ± 12%  perf-profile.self.cycles-pp.update_min_vruntime
> >       0.79 ± 17%      -0.6        0.22 ±  9%  perf-profile.self.cycles-pp.cpuacct_charge
> >       1.10 ±  5%      -0.6        0.54 ±  9%  perf-profile.self.cycles-pp.try_to_wake_up
> >       0.66 ±  3%      -0.5        0.15 ± 27%  perf-profile.self.cycles-pp.os_xsave
> >       0.71 ±  6%      -0.5        0.22 ± 18%  perf-profile.self.cycles-pp.reweight_entity
> >       0.68 ±  9%      -0.5        0.19 ± 10%  perf-profile.self.cycles-pp.perf_trace_sched_wakeup_template
> >       0.67 ±  9%      -0.5        0.18 ± 11%  perf-profile.self.cycles-pp.__wake_up_common
> >       0.65 ±  6%      -0.5        0.17 ± 23%  perf-profile.self.cycles-pp.switch_fpu_return
> >       0.60 ± 11%      -0.5        0.14 ± 28%  perf-profile.self.cycles-pp.perf_tp_event
> >       0.52 ± 44%      -0.5        0.06 ±151%  perf-profile.self.cycles-pp.native_irq_return_iret
> >       0.52 ±  7%      -0.4        0.08 ± 25%  perf-profile.self.cycles-pp._raw_spin_unlock_irqrestore
> >       0.55 ±  4%      -0.4        0.15 ± 22%  perf-profile.self.cycles-pp.__calc_delta
> >       0.61 ±  5%      -0.4        0.21 ± 12%  perf-profile.self.cycles-pp.dequeue_task_fair
> >       0.69 ± 14%      -0.4        0.32 ± 11%  perf-profile.self.cycles-pp.task_h_load
> >       0.49 ± 11%      -0.3        0.15 ± 29%  perf-profile.self.cycles-pp.___perf_sw_event
> >       0.37 ±  4%      -0.3        0.05 ± 73%  perf-profile.self.cycles-pp.pick_next_entity
> >       0.50 ±  3%      -0.3        0.19 ± 15%  perf-profile.self.cycles-pp.select_idle_sibling
> >       0.38 ±  9%      -0.3        0.08 ± 24%  perf-profile.self.cycles-pp.set_next_buddy
> >       0.32 ±  4%      -0.3        0.03 ±100%  perf-profile.self.cycles-pp.put_prev_entity
> >       0.64 ±  6%      -0.3        0.35 ±  7%  perf-profile.self.cycles-pp._raw_spin_lock_irq
> >       0.52 ±  5%      -0.3        0.25 ±  6%  perf-profile.self.cycles-pp.__list_del_entry_valid
> >       0.34 ±  5%      -0.3        0.07 ± 29%  perf-profile.self.cycles-pp.schedule
> >       0.35 ±  9%      -0.3        0.08 ± 10%  perf-profile.self.cycles-pp.rb_insert_color
> >       0.40 ±  5%      -0.3        0.14 ± 16%  perf-profile.self.cycles-pp.select_task_rq_fair
> >       0.33 ±  6%      -0.3        0.08 ± 16%  perf-profile.self.cycles-pp.check_preempt_wakeup
> >       0.33 ±  8%      -0.2        0.10 ± 16%  perf-profile.self.cycles-pp.select_task_rq
> >       0.36 ±  3%      -0.2        0.13 ± 16%  perf-profile.self.cycles-pp.native_sched_clock
> >       0.32 ±  7%      -0.2        0.10 ± 14%  perf-profile.self.cycles-pp.finish_task_switch
> >       0.32 ±  4%      -0.2        0.11 ± 13%  perf-profile.self.cycles-pp.dequeue_entity
> >       0.32 ±  8%      -0.2        0.12 ± 10%  perf-profile.self.cycles-pp.__list_add_valid
> >       0.23 ±  5%      -0.2        0.03 ±103%  perf-profile.self.cycles-pp.resched_curr
> >       0.27 ±  6%      -0.2        0.07 ± 21%  perf-profile.self.cycles-pp.rb_erase
> >       0.27 ±  5%      -0.2        0.08 ± 20%  perf-profile.self.cycles-pp.__wrgsbase_inactive
> >       0.28 ± 13%      -0.2        0.09 ± 12%  perf-profile.self.cycles-pp.check_preempt_curr
> >       0.30 ± 13%      -0.2        0.12 ±  7%  perf-profile.self.cycles-pp.ttwu_queue_wakelist
> >       0.24 ±  5%      -0.2        0.06 ± 19%  perf-profile.self.cycles-pp.set_next_entity
> >       0.21 ± 34%      -0.2        0.04 ± 71%  perf-profile.self.cycles-pp.__flush_smp_call_function_queue
> >       0.25 ±  5%      -0.2        0.08 ± 16%  perf-profile.self.cycles-pp.rcu_note_context_switch
> >       0.19 ± 26%      -0.1        0.04 ± 73%  perf-profile.self.cycles-pp.perf_trace_sched_stat_runtime
> >       0.20 ±  8%      -0.1        0.06 ± 13%  perf-profile.self.cycles-pp.ttwu_do_activate
> >       0.17 ±  8%      -0.1        0.03 ±100%  perf-profile.self.cycles-pp.rb_next
> >       0.22 ± 23%      -0.1        0.09 ± 31%  perf-profile.self.cycles-pp.migrate_task_rq_fair
> >       0.15 ± 32%      -0.1        0.03 ±100%  perf-profile.self.cycles-pp.llist_reverse_order
> >       0.16 ±  8%      -0.1        0.06 ± 14%  perf-profile.self.cycles-pp.wake_affine
> >       0.10 ± 31%      -0.1        0.03 ±100%  perf-profile.self.cycles-pp.sched_ttwu_pending
> >       0.14 ±  5%      -0.1        0.07 ± 20%  perf-profile.self.cycles-pp.read@plt
> >       0.32 ±  8%      -0.1        0.26 ±  3%  perf-profile.self.cycles-pp.anon_pipe_buf_release
> >       0.10 ±  6%      -0.1        0.04 ± 45%  perf-profile.self.cycles-pp.__wake_up_common_lock
> >       0.10 ±  9%      -0.0        0.07 ±  8%  perf-profile.self.cycles-pp.__mark_inode_dirty
> >       0.00            +0.1        0.06 ± 11%  perf-profile.self.cycles-pp.free_unref_page
> >       0.00            +0.1        0.06 ±  6%  perf-profile.self.cycles-pp.__alloc_pages
> >       0.00            +0.1        0.06 ± 11%  perf-profile.self.cycles-pp._raw_spin_trylock
> >       0.00            +0.1        0.06 ±  7%  perf-profile.self.cycles-pp.uncharge_folio
> >       0.00            +0.1        0.06 ± 11%  perf-profile.self.cycles-pp.uncharge_batch
> >       0.00            +0.1        0.07 ± 10%  perf-profile.self.cycles-pp.menu_select
> >       0.00            +0.1        0.08 ± 14%  perf-profile.self.cycles-pp.__get_obj_cgroup_from_memcg
> >       0.00            +0.1        0.08 ±  7%  perf-profile.self.cycles-pp.__memcg_kmem_charge_page
> >       0.00            +0.1        0.10 ± 10%  perf-profile.self.cycles-pp.osq_unlock
> >       0.07 ±  5%      +0.1        0.17 ±  8%  perf-profile.self.cycles-pp.copyin
> >       0.00            +0.1        0.11 ± 11%  perf-profile.self.cycles-pp.__mod_memcg_state
> >       0.13 ±  8%      +0.1        0.24 ±  6%  perf-profile.self.cycles-pp.rcu_all_qs
> >       0.14 ±  5%      +0.1        0.28 ±  5%  perf-profile.self.cycles-pp.__x64_sys_write
> >       0.07 ± 10%      +0.1        0.21 ±  5%  perf-profile.self.cycles-pp.page_copy_sane
> >       0.13 ± 12%      +0.1        0.28 ±  9%  perf-profile.self.cycles-pp.__x64_sys_read
> >       0.00            +0.2        0.15 ± 10%  perf-profile.self.cycles-pp.propagate_protected_usage
> >       0.18 ±  9%      +0.2        0.33 ±  4%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
> >       0.07 ±  8%      +0.2        0.23 ±  5%  perf-profile.self.cycles-pp.rw_verify_area
> >       0.08 ± 34%      +0.2        0.24 ± 27%  perf-profile.self.cycles-pp.mnt_user_ns
> >       0.13 ±  5%      +0.2        0.31 ±  7%  perf-profile.self.cycles-pp.kill_fasync
> >       0.21 ±  8%      +0.2        0.39 ±  5%  perf-profile.self.cycles-pp.__might_fault
> >       0.06 ± 13%      +0.2        0.26 ±  9%  perf-profile.self.cycles-pp.copyout
> >       0.10 ± 11%      +0.2        0.31 ±  8%  perf-profile.self.cycles-pp.ktime_get_coarse_real_ts64
> >       0.26 ± 13%      +0.2        0.49 ±  6%  perf-profile.self.cycles-pp.inode_needs_update_time
> >       0.23 ±  8%      +0.2        0.47 ±  5%  perf-profile.self.cycles-pp.copy_page_from_iter
> >       0.14 ±  7%      +0.2        0.38 ±  6%  perf-profile.self.cycles-pp.file_update_time
> >       0.36 ±  7%      +0.3        0.62 ±  4%  perf-profile.self.cycles-pp.ksys_read
> >       0.54 ± 13%      +0.3        0.80 ±  4%  perf-profile.self.cycles-pp._copy_from_iter
> >       0.15 ±  5%      +0.3        0.41 ±  8%  perf-profile.self.cycles-pp.touch_atime
> >       0.14 ±  5%      +0.3        0.40 ±  6%  perf-profile.self.cycles-pp.__cond_resched
> >       0.18 ±  5%      +0.3        0.47 ±  4%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode
> >       0.16 ±  8%      +0.3        0.46 ±  6%  perf-profile.self.cycles-pp.syscall_enter_from_user_mode
> >       0.16 ±  9%      +0.3        0.47 ±  6%  perf-profile.self.cycles-pp.__fdget_pos
> >       1.79 ±  8%      +0.3        2.12 ±  3%  perf-profile.self.cycles-pp.pipe_read
> >       0.10 ±  8%      +0.3        0.43 ± 17%  perf-profile.self.cycles-pp.fsnotify_perm
> >       0.20 ±  4%      +0.4        0.55 ±  5%  perf-profile.self.cycles-pp.ksys_write
> >       0.05 ± 76%      +0.4        0.46 ± 27%  perf-profile.self.cycles-pp.queue_event
> >       0.32 ±  6%      +0.4        0.73 ±  6%  perf-profile.self.cycles-pp.exit_to_user_mode_prepare
> >       0.21 ±  9%      +0.4        0.62 ±  6%  perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
> >       0.79 ±  8%      +0.4        1.22 ±  4%  perf-profile.self.cycles-pp.__might_sleep
> >       0.44 ±  5%      +0.4        0.88 ±  7%  perf-profile.self.cycles-pp.do_syscall_64
> >       0.26 ±  8%      +0.4        0.70 ±  4%  perf-profile.self.cycles-pp.atime_needs_update
> >       0.42 ±  7%      +0.5        0.88 ±  5%  perf-profile.self.cycles-pp.__get_task_ioprio
> >       0.28 ± 12%      +0.5        0.75 ±  5%  perf-profile.self.cycles-pp.copy_page_to_iter
> >       0.19 ±  6%      +0.5        0.68 ± 10%  perf-profile.self.cycles-pp.security_file_permission
> >       0.31 ±  8%      +0.5        0.83 ±  5%  perf-profile.self.cycles-pp.aa_file_perm
> >       0.05 ± 46%      +0.5        0.59 ±  8%  perf-profile.self.cycles-pp.osq_lock
> >       0.30 ±  7%      +0.5        0.85 ±  6%  perf-profile.self.cycles-pp._copy_to_iter
> >       0.00            +0.6        0.59 ±  6%  perf-profile.self.cycles-pp.poll_idle
> >       0.13 ± 20%      +0.7        0.81 ±  6%  perf-profile.self.cycles-pp.mutex_spin_on_owner
> >       0.38 ±  9%      +0.7        1.12 ±  5%  perf-profile.self.cycles-pp.current_time
> >       0.08 ± 59%      +0.8        0.82 ± 11%  perf-profile.self.cycles-pp.intel_idle_irq
> >       0.92 ±  6%      +0.8        1.72 ±  4%  perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
> >       0.01 ±223%      +0.8        0.82 ±  6%  perf-profile.self.cycles-pp.page_counter_uncharge
> >       0.86 ±  7%      +1.1        1.91 ±  4%  perf-profile.self.cycles-pp.vfs_read
> >       1.07 ±  6%      +1.1        2.14 ±  5%  perf-profile.self.cycles-pp.__fget_light
> >       0.67 ±  7%      +1.1        1.74 ±  6%  perf-profile.self.cycles-pp.vfs_write
> >       0.15 ± 12%      +1.1        1.28 ±  7%  perf-profile.self.cycles-pp.__mutex_lock
> >       1.09 ±  6%      +1.1        2.22 ±  5%  perf-profile.self.cycles-pp.__libc_read
> >       0.62 ±  6%      +1.2        1.79 ±  5%  perf-profile.self.cycles-pp.syscall_return_via_sysret
> >       1.16 ±  8%      +1.2        2.38 ±  4%  perf-profile.self.cycles-pp.__might_resched
> >       0.91 ±  7%      +1.3        2.20 ±  5%  perf-profile.self.cycles-pp.__libc_write
> >       0.59 ±  8%      +1.3        1.93 ±  6%  perf-profile.self.cycles-pp.__entry_text_start
> >       1.27 ±  7%      +1.7        3.00 ±  6%  perf-profile.self.cycles-pp.apparmor_file_permission
> >       0.99 ±  8%      +2.0        2.98 ±  5%  perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
> >       1.74 ±  8%      +3.4        5.15 ±  6%  perf-profile.self.cycles-pp.pipe_write
> >       2.98 ±  8%      +3.7        6.64 ±  5%  perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
> >       2.62 ± 10%      +4.8        7.38 ±  5%  perf-profile.self.cycles-pp.mutex_lock
> >       2.20 ± 10%      +5.1        7.30 ±  6%  perf-profile.self.cycles-pp.mutex_unlock
> >
> >
> > ***************************************************************************************************
> > lkp-skl-fpga01: 104 threads 2 sockets (Skylake) with 192G memory
> > =========================================================================================
> > compiler/cpufreq_governor/ipc/iterations/kconfig/mode/nr_threads/rootfs/tbox_group/testcase:
> >   gcc-11/performance/socket/4/x86_64-rhel-8.3/process/100%/debian-11.1-x86_64-20220510.cgz/lkp-skl-fpga01/hackbench
> >
> > commit:
> >   a2e90611b9 ("sched/fair: Remove capacity inversion detection")
> >   829c1651e9 ("sched/fair: sanitize vruntime of entity being placed")
> >
> > a2e90611b9f425ad 829c1651e9c4a6f78398d3e6765
> > ---------------- ---------------------------
> >          %stddev     %change         %stddev
> >              \          |                \
> >     177139            -8.1%     162815        hackbench.throughput
> >     174484           -18.8%     141618 ±  2%  hackbench.throughput_avg
> >     177139            -8.1%     162815        hackbench.throughput_best
> >     168530           -37.3%     105615 ±  3%  hackbench.throughput_worst
> >     281.38           +23.1%     346.39 ±  2%  hackbench.time.elapsed_time
> >     281.38           +23.1%     346.39 ±  2%  hackbench.time.elapsed_time.max
> >  1.053e+08 ±  2%    +688.4%  8.302e+08 ±  9%  hackbench.time.involuntary_context_switches
> >      21992           +27.8%      28116 ±  2%  hackbench.time.system_time
> >       6652            +8.2%       7196        hackbench.time.user_time
> >  3.482e+08          +289.2%  1.355e+09 ±  9%  hackbench.time.voluntary_context_switches
> >    2110813 ±  5%     +21.6%    2565791 ±  3%  cpuidle..usage
> >     333.95           +19.5%     399.05        uptime.boot
> >       0.03            -0.0        0.03        mpstat.cpu.all.soft%
> >      22.68            -2.9       19.77        mpstat.cpu.all.usr%
> >     561083 ± 10%     +45.5%     816171 ± 12%  numa-numastat.node0.local_node
> >     614314 ±  9%     +36.9%     841173 ± 12%  numa-numastat.node0.numa_hit
> >    1393279 ±  7%     -16.8%    1158997 ±  2%  numa-numastat.node1.local_node
> >    1443679 ±  5%     -14.9%    1229074 ±  3%  numa-numastat.node1.numa_hit
> >    4129900 ±  8%     -23.0%    3181115        vmstat.memory.cache
> >       1731           +30.8%       2265        vmstat.procs.r
> >    1598044          +290.3%    6237840 ±  7%  vmstat.system.cs
> >     320762           +60.5%     514672 ±  8%  vmstat.system.in
> >     962111 ±  6%     +46.0%    1404646 ±  7%  turbostat.C1
> >     233987 ±  5%     +51.2%     353892        turbostat.C1E
> >   91515563           +97.3%  1.806e+08 ± 10%  turbostat.IRQ
> >     448466 ± 14%     -34.2%     294934 ±  5%  turbostat.POLL
> >      34.60            -7.3%      32.07        turbostat.RAMWatt
> >     514028 ±  2%     -14.0%     442125 ±  2%  meminfo.AnonPages
> >    4006312 ±  8%     -23.9%    3047078        meminfo.Cached
> >    3321064 ± 10%     -32.7%    2236362 ±  2%  meminfo.Committed_AS
> >    1714752 ± 21%     -60.3%     680479 ±  8%  meminfo.Inactive
> >    1714585 ± 21%     -60.3%     680305 ±  8%  meminfo.Inactive(anon)
> >     757124 ± 18%     -67.2%     248485 ± 27%  meminfo.Mapped
> >    6476123 ±  6%     -19.4%    5220738        meminfo.Memused
> >    1275724 ± 26%     -75.2%     316896 ± 15%  meminfo.Shmem
> >    6806047 ±  3%     -13.3%    5901974        meminfo.max_used_kB
> >     161311 ± 23%     +31.7%     212494 ±  5%  numa-meminfo.node0.AnonPages
> >     165693 ± 22%     +30.5%     216264 ±  5%  numa-meminfo.node0.Inactive
> >     165563 ± 22%     +30.6%     216232 ±  5%  numa-meminfo.node0.Inactive(anon)
> >     140638 ± 19%     -36.7%      89034 ± 11%  numa-meminfo.node0.Mapped
> >     352173 ± 14%     -35.3%     227805 ±  8%  numa-meminfo.node1.AnonPages
> >     501396 ± 11%     -22.6%     388042 ±  5%  numa-meminfo.node1.AnonPages.max
> >    1702242 ± 43%     -77.8%     378325 ± 22%  numa-meminfo.node1.FilePages
> >    1540803 ± 25%     -70.4%     455592 ± 13%  numa-meminfo.node1.Inactive
> >    1540767 ± 25%     -70.4%     455451 ± 13%  numa-meminfo.node1.Inactive(anon)
> >     612123 ± 18%     -74.9%     153752 ± 37%  numa-meminfo.node1.Mapped
> >    3085231 ± 24%     -53.9%    1420940 ± 14%  numa-meminfo.node1.MemUsed
> >     254052 ±  4%     -19.1%     205632 ± 21%  numa-meminfo.node1.SUnreclaim
> >    1259640 ± 27%     -75.9%     303123 ± 15%  numa-meminfo.node1.Shmem
> >     304597 ±  7%     -20.2%     242920 ± 17%  numa-meminfo.node1.Slab
> >      40345 ± 23%     +31.5%      53054 ±  5%  numa-vmstat.node0.nr_anon_pages
> >      41412 ± 22%     +30.4%      53988 ±  5%  numa-vmstat.node0.nr_inactive_anon
> >      35261 ± 19%     -36.9%      22256 ± 12%  numa-vmstat.node0.nr_mapped
> >      41412 ± 22%     +30.4%      53988 ±  5%  numa-vmstat.node0.nr_zone_inactive_anon
> >     614185 ±  9%     +36.9%     841065 ± 12%  numa-vmstat.node0.numa_hit
> >     560955 ± 11%     +45.5%     816063 ± 12%  numa-vmstat.node0.numa_local
> >      88129 ± 14%     -35.2%      57097 ±  8%  numa-vmstat.node1.nr_anon_pages
> >     426425 ± 43%     -77.9%      94199 ± 22%  numa-vmstat.node1.nr_file_pages
> >     386166 ± 25%     -70.5%     113880 ± 13%  numa-vmstat.node1.nr_inactive_anon
> >     153658 ± 18%     -75.3%      38021 ± 37%  numa-vmstat.node1.nr_mapped
> >     315775 ± 27%     -76.1%      75399 ± 16%  numa-vmstat.node1.nr_shmem
> >      63411 ±  4%     -18.6%      51593 ± 21%  numa-vmstat.node1.nr_slab_unreclaimable
> >     386166 ± 25%     -70.5%     113880 ± 13%  numa-vmstat.node1.nr_zone_inactive_anon
> >    1443470 ±  5%     -14.9%    1228740 ±  3%  numa-vmstat.node1.numa_hit
> >    1393069 ±  7%     -16.8%    1158664 ±  2%  numa-vmstat.node1.numa_local
> >     128457 ±  2%     -14.0%     110530 ±  3%  proc-vmstat.nr_anon_pages
> >     999461 ±  8%     -23.8%     761774        proc-vmstat.nr_file_pages
> >     426485 ± 21%     -60.1%     170237 ±  9%  proc-vmstat.nr_inactive_anon
> >      82464            -2.6%      80281        proc-vmstat.nr_kernel_stack
> >     187777 ± 18%     -66.9%      62076 ± 28%  proc-vmstat.nr_mapped
> >     316813 ± 27%     -75.0%      79228 ± 16%  proc-vmstat.nr_shmem
> >      31469            -2.0%      30840        proc-vmstat.nr_slab_reclaimable
> >     117889            -8.4%     108036        proc-vmstat.nr_slab_unreclaimable
> >     426485 ± 21%     -60.1%     170237 ±  9%  proc-vmstat.nr_zone_inactive_anon
> >     187187 ± 12%     -43.5%     105680 ±  9%  proc-vmstat.numa_hint_faults
> >     128363 ± 15%     -61.5%      49371 ± 19%  proc-vmstat.numa_hint_faults_local
> >      47314 ± 22%     +39.2%      65863 ± 13%  proc-vmstat.numa_pages_migrated
> >     457026 ±  9%     -18.1%     374188 ± 13%  proc-vmstat.numa_pte_updates
> >    2586600 ±  3%     +27.7%    3302787 ±  8%  proc-vmstat.pgalloc_normal
> >    1589970            -6.2%    1491838        proc-vmstat.pgfault
> >    2347186 ± 10%     +37.7%    3232369 ±  8%  proc-vmstat.pgfree
> >      47314 ± 22%     +39.2%      65863 ± 13%  proc-vmstat.pgmigrate_success
> >     112713            +7.0%     120630 ±  3%  proc-vmstat.pgreuse
> >    2189056           +22.2%    2674944 ±  2%  proc-vmstat.unevictable_pgs_scanned
> >      14.08 ±  2%     +29.3%      18.20 ±  5%  sched_debug.cfs_rq:/.h_nr_running.avg
> >       0.80 ± 14%    +179.2%       2.23 ± 24%  sched_debug.cfs_rq:/.h_nr_running.min
> >     245.23 ± 12%     -19.7%     196.97 ±  6%  sched_debug.cfs_rq:/.load_avg.max
> >       2.27 ± 16%     +75.0%       3.97 ±  4%  sched_debug.cfs_rq:/.load_avg.min
> >      45.77 ± 16%     -17.8%      37.60 ±  6%  sched_debug.cfs_rq:/.load_avg.stddev
> >   11842707           +39.9%   16567992        sched_debug.cfs_rq:/.min_vruntime.avg
> >   13773080 ±  3%    +113.9%   29460281 ±  7%  sched_debug.cfs_rq:/.min_vruntime.max
> >   11423218           +30.3%   14885830        sched_debug.cfs_rq:/.min_vruntime.min
> >     301190 ± 12%    +439.9%    1626088 ± 10%  sched_debug.cfs_rq:/.min_vruntime.stddev
> >     203.83           -16.3%     170.67        sched_debug.cfs_rq:/.removed.load_avg.max
> >      14330 ±  3%     +30.9%      18756 ±  5%  sched_debug.cfs_rq:/.runnable_avg.avg
> >      25115 ±  4%     +15.5%      28999 ±  6%  sched_debug.cfs_rq:/.runnable_avg.max
> >       3811 ± 11%     +68.0%       6404 ± 21%  sched_debug.cfs_rq:/.runnable_avg.min
> >       3818 ±  6%     +15.3%       4404 ±  7%  sched_debug.cfs_rq:/.runnable_avg.stddev
> >    -849635          +410.6%   -4338612        sched_debug.cfs_rq:/.spread0.avg
> >    1092373 ± 54%    +691.1%    8641673 ± 21%  sched_debug.cfs_rq:/.spread0.max
> >   -1263082          +378.1%   -6038905        sched_debug.cfs_rq:/.spread0.min
> >     300764 ± 12%    +441.8%    1629507 ±  9%  sched_debug.cfs_rq:/.spread0.stddev
> >       1591 ±  4%     -11.1%       1413 ±  3%  sched_debug.cfs_rq:/.util_avg.max
> >     288.90 ± 11%     +64.5%     475.23 ± 13%  sched_debug.cfs_rq:/.util_avg.min
> >     240.33 ±  2%     -32.1%     163.09 ±  3%  sched_debug.cfs_rq:/.util_avg.stddev
> >     494.27 ±  3%     +41.6%     699.85 ±  3%  sched_debug.cfs_rq:/.util_est_enqueued.avg
> >      11.23 ± 54%    +634.1%      82.47 ± 22%  sched_debug.cfs_rq:/.util_est_enqueued.min
> >     174576           +20.7%     210681        sched_debug.cpu.clock.avg
> >     174926           +21.2%     211944        sched_debug.cpu.clock.max
> >     174164           +20.3%     209436        sched_debug.cpu.clock.min
> >     230.84 ± 33%    +226.1%     752.67 ± 20%  sched_debug.cpu.clock.stddev
> >     172836           +20.6%     208504        sched_debug.cpu.clock_task.avg
> >     173552           +21.0%     210079        sched_debug.cpu.clock_task.max
> >     156807           +22.3%     191789        sched_debug.cpu.clock_task.min
> >       1634           +17.1%       1914 ±  5%  sched_debug.cpu.clock_task.stddev
> >       0.00 ± 32%    +220.1%       0.00 ± 20%  sched_debug.cpu.next_balance.stddev
> >      14.12 ±  2%     +28.7%      18.18 ±  5%  sched_debug.cpu.nr_running.avg
> >       0.73 ± 25%    +213.6%       2.30 ± 24%  sched_debug.cpu.nr_running.min
> >    1810086          +461.3%   10159215 ± 10%  sched_debug.cpu.nr_switches.avg
> >    2315994 ±  3%    +515.6%   14258195 ±  9%  sched_debug.cpu.nr_switches.max
> >    1529863          +380.3%    7348324 ±  9%  sched_debug.cpu.nr_switches.min
> >     167487 ± 18%    +770.8%    1458519 ± 21%  sched_debug.cpu.nr_switches.stddev
> >     174149           +20.2%     209410        sched_debug.cpu_clk
> >     170980           +20.6%     206240        sched_debug.ktime
> >     174896           +20.2%     210153        sched_debug.sched_clk
> >       7.35           +24.9%       9.18 ±  4%  perf-stat.i.MPKI
> >  1.918e+10           +14.4%  2.194e+10        perf-stat.i.branch-instructions
> >       2.16            -0.1        2.09        perf-stat.i.branch-miss-rate%
> >  4.133e+08            +6.6%  4.405e+08        perf-stat.i.branch-misses
> >      23.08            -9.2       13.86 ±  7%  perf-stat.i.cache-miss-rate%
> >  1.714e+08           -37.2%  1.076e+08 ±  3%  perf-stat.i.cache-misses
> >  7.497e+08           +33.7%  1.002e+09 ±  5%  perf-stat.i.cache-references
> >    1636365          +382.4%    7893858 ±  5%  perf-stat.i.context-switches
> >       2.74            -6.8%       2.56        perf-stat.i.cpi
> >     131725          +288.0%     511159 ± 10%  perf-stat.i.cpu-migrations
> >       1672          +160.8%       4361 ±  4%  perf-stat.i.cycles-between-cache-misses
> >       0.49            +0.6        1.11 ±  5%  perf-stat.i.dTLB-load-miss-rate%
> >  1.417e+08          +158.7%  3.665e+08 ±  5%  perf-stat.i.dTLB-load-misses
> >  2.908e+10            +9.1%  3.172e+10        perf-stat.i.dTLB-loads
> >       0.12 ±  4%      +0.1        0.20 ±  4%  perf-stat.i.dTLB-store-miss-rate%
> >   20805655 ±  4%     +90.9%   39716345 ±  4%  perf-stat.i.dTLB-store-misses
> >  1.755e+10            +8.6%  1.907e+10        perf-stat.i.dTLB-stores
> >      29.04            +3.6       32.62 ±  2%  perf-stat.i.iTLB-load-miss-rate%
> >   56676082           +60.4%   90917582 ±  3%  perf-stat.i.iTLB-load-misses
> >  1.381e+08           +30.6%  1.804e+08        perf-stat.i.iTLB-loads
> >   1.03e+11           +10.5%  1.139e+11        perf-stat.i.instructions
> >       1840           -21.1%       1451 ±  4%  perf-stat.i.instructions-per-iTLB-miss
> >       0.37           +10.9%       0.41        perf-stat.i.ipc
> >       1084            -4.5%       1035 ±  2%  perf-stat.i.metric.K/sec
> >     640.69           +10.3%     706.44        perf-stat.i.metric.M/sec
> >       5249            -9.3%       4762 ±  3%  perf-stat.i.minor-faults
> >      23.57           +18.7       42.30 ±  8%  perf-stat.i.node-load-miss-rate%
> >   40174555           -45.0%   22109431 ± 10%  perf-stat.i.node-loads
> >       8.84 ±  2%     +24.5       33.30 ± 10%  perf-stat.i.node-store-miss-rate%
> >    2912322           +60.3%    4667137 ± 16%  perf-stat.i.node-store-misses
> >   34046752           -50.6%   16826621 ±  9%  perf-stat.i.node-stores
> >       5278            -9.2%       4791 ±  3%  perf-stat.i.page-faults
> >       7.24           +12.1%       8.12 ±  4%  perf-stat.overall.MPKI
> >       2.15            -0.1        2.05        perf-stat.overall.branch-miss-rate%
> >      22.92            -9.5       13.41 ±  7%  perf-stat.overall.cache-miss-rate%
> >       2.73            -6.3%       2.56        perf-stat.overall.cpi
> >       1644           +43.4%       2358 ±  3%  perf-stat.overall.cycles-between-cache-misses
> >       0.48            +0.5        0.99 ±  4%  perf-stat.overall.dTLB-load-miss-rate%
> >       0.12 ±  4%      +0.1        0.19 ±  4%  perf-stat.overall.dTLB-store-miss-rate%
> >      29.06            +2.9       32.01 ±  2%  perf-stat.overall.iTLB-load-miss-rate%
> >       1826           -26.6%       1340 ±  4%  perf-stat.overall.instructions-per-iTLB-miss
> >       0.37            +6.8%       0.39        perf-stat.overall.ipc
> >      22.74            +6.8       29.53 ± 13%  perf-stat.overall.node-load-miss-rate%
> >       7.63            +8.4       16.02 ± 20%  perf-stat.overall.node-store-miss-rate%
> >  1.915e+10            +9.0%  2.088e+10        perf-stat.ps.branch-instructions
> >  4.119e+08            +3.9%  4.282e+08        perf-stat.ps.branch-misses
> >  1.707e+08           -30.5%  1.186e+08 ±  3%  perf-stat.ps.cache-misses
> >  7.446e+08           +19.2%  8.874e+08 ±  4%  perf-stat.ps.cache-references
> >    1611874          +289.1%    6271376 ±  7%  perf-stat.ps.context-switches
> >     127362          +189.0%     368041 ± 11%  perf-stat.ps.cpu-migrations
> >  1.407e+08          +116.2%  3.042e+08 ±  5%  perf-stat.ps.dTLB-load-misses
> >  2.901e+10            +5.4%  3.057e+10        perf-stat.ps.dTLB-loads
> >   20667480 ±  4%     +66.8%   34473793 ±  4%  perf-stat.ps.dTLB-store-misses
> >  1.751e+10            +5.1%   1.84e+10        perf-stat.ps.dTLB-stores
> >   56310692           +45.0%   81644183 ±  4%  perf-stat.ps.iTLB-load-misses
> >  1.375e+08           +26.1%  1.733e+08        perf-stat.ps.iTLB-loads
> >  1.028e+11            +6.3%  1.093e+11        perf-stat.ps.instructions
> >       4929           -24.5%       3723 ±  2%  perf-stat.ps.minor-faults
> >   40134633           -32.9%   26946247 ±  9%  perf-stat.ps.node-loads
> >    2805073           +39.5%    3914304 ± 16%  perf-stat.ps.node-store-misses
> >   33938259           -38.9%   20726382 ±  8%  perf-stat.ps.node-stores
> >       4952           -24.5%       3741 ±  2%  perf-stat.ps.page-faults
> >  2.911e+13           +30.9%  3.809e+13 ±  2%  perf-stat.total.instructions
> >      15.30 ±  4%      -8.6        6.66 ±  5%  perf-profile.calltrace.cycles-pp.sock_alloc_send_pskb.unix_stream_sendmsg.sock_sendmsg.sock_write_iter.vfs_write
> >      13.84 ±  6%      -7.9        5.98 ±  6%  perf-profile.calltrace.cycles-pp.alloc_skb_with_frags.sock_alloc_send_pskb.unix_stream_sendmsg.sock_sendmsg.sock_write_iter
> >      13.61 ±  6%      -7.8        5.84 ±  6%  perf-profile.calltrace.cycles-pp.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb.unix_stream_sendmsg.sock_sendmsg
> >       9.00 ±  2%      -5.5        3.48 ±  4%  perf-profile.calltrace.cycles-pp.consume_skb.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter.vfs_read
> >       6.44 ±  4%      -4.3        2.14 ±  6%  perf-profile.calltrace.cycles-pp.skb_release_data.consume_skb.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter
> >       5.83 ±  8%      -3.4        2.44 ±  5%  perf-profile.calltrace.cycles-pp.kmem_cache_alloc_node.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb.unix_stream_sendmsg
> >       5.81 ±  6%      -3.3        2.48 ±  6%  perf-profile.calltrace.cycles-pp.kmalloc_reserve.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb.unix_stream_sendmsg
> >       5.50 ±  7%      -3.2        2.32 ±  6%  perf-profile.calltrace.cycles-pp.__kmalloc_node_track_caller.kmalloc_reserve.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb
> >       5.07 ±  8%      -3.0        2.04 ±  6%  perf-profile.calltrace.cycles-pp.__kmem_cache_alloc_node.__kmalloc_node_track_caller.kmalloc_reserve.__alloc_skb.alloc_skb_with_frags
> >       6.22 ±  2%      -2.9        3.33 ±  3%  perf-profile.calltrace.cycles-pp.unix_stream_read_actor.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter.vfs_read
> >       6.17 ±  2%      -2.9        3.30 ±  3%  perf-profile.calltrace.cycles-pp.skb_copy_datagram_iter.unix_stream_read_actor.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter
> >       6.11 ±  2%      -2.9        3.24 ±  3%  perf-profile.calltrace.cycles-pp.__skb_datagram_iter.skb_copy_datagram_iter.unix_stream_read_actor.unix_stream_read_generic.unix_stream_recvmsg
> >      50.99            -2.6       48.39        perf-profile.calltrace.cycles-pp.__libc_read
> >       5.66 ±  3%      -2.3        3.35 ±  3%  perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__libc_read
> >       5.52 ±  3%      -2.3        3.27 ±  3%  perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__libc_write
> >       3.14 ±  2%      -1.7        1.42 ±  4%  perf-profile.calltrace.cycles-pp._copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.unix_stream_read_actor.unix_stream_read_generic
> >       2.73 ±  2%      -1.6        1.15 ±  4%  perf-profile.calltrace.cycles-pp.copyout._copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.unix_stream_read_actor
> >       2.59 ±  2%      -1.5        1.07 ±  4%  perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyout._copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter
> >       2.72 ±  3%      -1.4        1.34 ±  6%  perf-profile.calltrace.cycles-pp.kmem_cache_free.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter.vfs_read
> >      41.50            -1.2       40.27        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_read
> >       2.26 ±  4%      -1.1        1.12        perf-profile.calltrace.cycles-pp.skb_release_head_state.consume_skb.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter
> >       2.76 ±  3%      -1.1        1.63 ±  3%  perf-profile.calltrace.cycles-pp.__check_object_size.simple_copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.unix_stream_read_actor
> >       2.84 ±  3%      -1.1        1.71 ±  2%  perf-profile.calltrace.cycles-pp.simple_copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.unix_stream_read_actor.unix_stream_read_generic
> >       2.20 ±  4%      -1.1        1.08        perf-profile.calltrace.cycles-pp.unix_destruct_scm.skb_release_head_state.consume_skb.unix_stream_read_generic.unix_stream_recvmsg
> >       2.98 ±  2%      -1.1        1.90 ±  6%  perf-profile.calltrace.cycles-pp.skb_copy_datagram_from_iter.unix_stream_sendmsg.sock_sendmsg.sock_write_iter.vfs_write
> >       1.99 ±  4%      -1.1        0.92 ±  2%  perf-profile.calltrace.cycles-pp.sock_wfree.unix_destruct_scm.skb_release_head_state.consume_skb.unix_stream_read_generic
> >       2.10 ±  3%      -1.0        1.08 ±  4%  perf-profile.calltrace.cycles-pp.check_heap_object.__check_object_size.simple_copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter
> >       2.08 ±  4%      -0.8        1.24 ±  3%  perf-profile.calltrace.cycles-pp.__entry_text_start.__libc_write
> >       2.16 ±  3%      -0.7        1.47        perf-profile.calltrace.cycles-pp.__entry_text_start.__libc_read
> >       2.20 ±  2%      -0.7        1.52 ±  3%  perf-profile.calltrace.cycles-pp.__kmem_cache_free.skb_release_data.consume_skb.unix_stream_read_generic.unix_stream_recvmsg
> >       1.46 ±  3%      -0.6        0.87 ±  8%  perf-profile.calltrace.cycles-pp._copy_from_iter.skb_copy_datagram_from_iter.unix_stream_sendmsg.sock_sendmsg.sock_write_iter
> >       4.82 ±  2%      -0.6        4.24        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
> >       1.31 ±  2%      -0.4        0.90 ±  4%  perf-profile.calltrace.cycles-pp.__check_object_size.skb_copy_datagram_from_iter.unix_stream_sendmsg.sock_sendmsg.sock_write_iter
> >       0.96 ±  3%      -0.4        0.57 ± 10%  perf-profile.calltrace.cycles-pp.copyin._copy_from_iter.skb_copy_datagram_from_iter.unix_stream_sendmsg.sock_sendmsg
> >       1.14 ±  3%      -0.4        0.76 ±  5%  perf-profile.calltrace.cycles-pp.memcg_slab_post_alloc_hook.kmem_cache_alloc_node.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb
> >       0.99 ±  3%      -0.3        0.65 ±  8%  perf-profile.calltrace.cycles-pp.memcg_slab_post_alloc_hook.__kmem_cache_alloc_node.__kmalloc_node_track_caller.kmalloc_reserve.__alloc_skb
> >       1.30 ±  4%      -0.3        0.99 ±  3%  perf-profile.calltrace.cycles-pp.sock_recvmsg.sock_read_iter.vfs_read.ksys_read.do_syscall_64
> >       0.98 ±  2%      -0.3        0.69 ±  3%  perf-profile.calltrace.cycles-pp.security_file_permission.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
> >       0.67            -0.2        0.42 ± 50%  perf-profile.calltrace.cycles-pp.check_heap_object.__check_object_size.skb_copy_datagram_from_iter.unix_stream_sendmsg.sock_sendmsg
> >       0.56 ±  4%      -0.2        0.32 ± 81%  perf-profile.calltrace.cycles-pp.__fdget_pos.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
> >       0.86 ±  2%      -0.2        0.63 ±  3%  perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.vfs_write.ksys_write.do_syscall_64
> >       1.15 ±  4%      -0.2        0.93 ±  4%  perf-profile.calltrace.cycles-pp.security_socket_recvmsg.sock_recvmsg.sock_read_iter.vfs_read.ksys_read
> >       0.90            -0.2        0.69 ±  3%  perf-profile.calltrace.cycles-pp.get_obj_cgroup_from_current.kmem_cache_alloc_node.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb
> >       1.23 ±  3%      -0.2        1.07 ±  3%  perf-profile.calltrace.cycles-pp.security_socket_sendmsg.sock_sendmsg.sock_write_iter.vfs_write.ksys_write
> >       1.05 ±  2%      -0.2        0.88 ±  2%  perf-profile.calltrace.cycles-pp.security_file_permission.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
> >       0.84 ±  4%      -0.2        0.68 ±  4%  perf-profile.calltrace.cycles-pp.aa_sk_perm.security_socket_recvmsg.sock_recvmsg.sock_read_iter.vfs_read
> >       0.88            -0.1        0.78 ±  5%  perf-profile.calltrace.cycles-pp.apparmor_file_permission.security_file_permission.vfs_read.ksys_read.do_syscall_64
> >       0.94 ±  3%      -0.1        0.88 ±  4%  perf-profile.calltrace.cycles-pp.aa_sk_perm.security_socket_sendmsg.sock_sendmsg.sock_write_iter.vfs_write
> >       0.62 ±  2%      +0.3        0.90 ±  2%  perf-profile.calltrace.cycles-pp.mutex_lock.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter.vfs_read
> >       0.00            +0.6        0.58 ±  3%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.prepare_to_wait.unix_stream_data_wait.unix_stream_read_generic.unix_stream_recvmsg
> >       0.00            +0.6        0.61 ±  6%  perf-profile.calltrace.cycles-pp.update_load_avg.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function
> >       0.00            +0.6        0.62 ±  4%  perf-profile.calltrace.cycles-pp.set_next_entity.pick_next_task_fair.__schedule.schedule.exit_to_user_mode_loop
> >       0.00            +0.7        0.67 ± 11%  perf-profile.calltrace.cycles-pp.update_load_avg.dequeue_entity.dequeue_task_fair.__schedule.schedule
> >       0.00            +0.7        0.67 ±  7%  perf-profile.calltrace.cycles-pp.__switch_to_asm.__libc_write
> >       0.00            +0.8        0.76 ±  4%  perf-profile.calltrace.cycles-pp.reweight_entity.dequeue_task_fair.__schedule.schedule.schedule_timeout
> >       0.00            +0.8        0.77 ±  4%  perf-profile.calltrace.cycles-pp.___perf_sw_event.prepare_task_switch.__schedule.schedule.schedule_timeout
> >       0.00            +0.8        0.77 ±  8%  perf-profile.calltrace.cycles-pp.put_prev_entity.pick_next_task_fair.__schedule.schedule.exit_to_user_mode_loop
> >       0.00            +0.8        0.81 ±  5%  perf-profile.calltrace.cycles-pp.prepare_task_switch.__schedule.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare
> >       0.00            +0.8        0.81 ±  5%  perf-profile.calltrace.cycles-pp.check_preempt_wakeup.check_preempt_curr.ttwu_do_activate.try_to_wake_up.autoremove_wake_function
> >       0.00            +0.8        0.82 ±  2%  perf-profile.calltrace.cycles-pp.__switch_to_asm.__libc_read
> >       0.00            +0.8        0.82 ±  3%  perf-profile.calltrace.cycles-pp.reweight_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function
> >       0.00            +0.9        0.86 ±  5%  perf-profile.calltrace.cycles-pp.perf_trace_sched_wakeup_template.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock
> >       0.00            +0.9        0.87 ±  8%  perf-profile.calltrace.cycles-pp.update_load_avg.enqueue_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up
> >      29.66            +0.9       30.58        perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
> >       0.00            +1.0        0.95 ±  3%  perf-profile.calltrace.cycles-pp.set_next_entity.pick_next_task_fair.__schedule.schedule.schedule_timeout
> >       0.00            +1.0        0.98 ±  4%  perf-profile.calltrace.cycles-pp.check_preempt_curr.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common
> >       0.00            +1.0        0.99 ±  3%  perf-profile.calltrace.cycles-pp.update_curr.enqueue_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up
> >       0.00            +1.0        1.05 ±  4%  perf-profile.calltrace.cycles-pp.prepare_to_wait.unix_stream_data_wait.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter
> >       0.00            +1.1        1.07 ± 12%  perf-profile.calltrace.cycles-pp.select_idle_sibling.select_task_rq_fair.select_task_rq.try_to_wake_up.autoremove_wake_function
> >      27.81 ±  2%      +1.2       28.98        perf-profile.calltrace.cycles-pp.unix_stream_recvmsg.sock_read_iter.vfs_read.ksys_read.do_syscall_64
> >      27.36 ±  2%      +1.2       28.59        perf-profile.calltrace.cycles-pp.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter.vfs_read.ksys_read
> >       0.00            +1.5        1.46 ±  2%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.try_to_wake_up.autoremove_wake_function.__wake_up_common
> >       0.00            +1.6        1.55 ±  4%  perf-profile.calltrace.cycles-pp.prepare_task_switch.__schedule.schedule.schedule_timeout.unix_stream_data_wait
> >       0.00            +1.6        1.60 ±  4%  perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_read
> >      27.58            +1.6       29.19        perf-profile.calltrace.cycles-pp.sock_write_iter.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
> >       0.00            +1.6        1.63 ±  5%  perf-profile.calltrace.cycles-pp.update_curr.dequeue_entity.dequeue_task_fair.__schedule.schedule
> >       0.00            +1.6        1.65 ±  5%  perf-profile.calltrace.cycles-pp.restore_fpregs_from_fpstate.switch_fpu_return.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
> >       0.00            +1.7        1.66 ±  6%  perf-profile.calltrace.cycles-pp.switch_mm_irqs_off.__schedule.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare
> >       0.00            +1.8        1.80        perf-profile.calltrace.cycles-pp._raw_spin_lock.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock
> >       0.00            +1.8        1.84 ±  2%  perf-profile.calltrace.cycles-pp.pick_next_task_fair.__schedule.schedule.schedule_timeout.unix_stream_data_wait
> >       0.00            +2.0        1.97 ±  2%  perf-profile.calltrace.cycles-pp.switch_mm_irqs_off.__schedule.schedule.schedule_timeout.unix_stream_data_wait
> >      26.63 ±  2%      +2.0       28.61        perf-profile.calltrace.cycles-pp.sock_sendmsg.sock_write_iter.vfs_write.ksys_write.do_syscall_64
> >       0.00            +2.0        2.01 ±  6%  perf-profile.calltrace.cycles-pp.pick_next_task_fair.__schedule.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare
> >       0.00            +2.1        2.09 ±  6%  perf-profile.calltrace.cycles-pp.select_task_rq_fair.select_task_rq.try_to_wake_up.autoremove_wake_function.__wake_up_common
> >       0.00            +2.1        2.11 ±  5%  perf-profile.calltrace.cycles-pp.switch_fpu_return.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
> >      25.21 ±  2%      +2.2       27.43        perf-profile.calltrace.cycles-pp.unix_stream_sendmsg.sock_sendmsg.sock_write_iter.vfs_write.ksys_write
> >       0.00            +2.4        2.43 ±  5%  perf-profile.calltrace.cycles-pp.select_task_rq.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock
> >      48.00            +2.7       50.69        perf-profile.calltrace.cycles-pp.__libc_write
> >       0.00            +2.9        2.87 ±  5%  perf-profile.calltrace.cycles-pp.dequeue_entity.dequeue_task_fair.__schedule.schedule.schedule_timeout
> >       0.09 ±223%      +3.4        3.47 ±  3%  perf-profile.calltrace.cycles-pp.enqueue_entity.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function
> >      39.07            +4.8       43.84        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__libc_write
> >       0.66 ± 18%      +5.0        5.62 ±  4%  perf-profile.calltrace.cycles-pp.dequeue_task_fair.__schedule.schedule.schedule_timeout.unix_stream_data_wait
> >       4.73            +5.1        9.88 ±  3%  perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
> >       0.66 ± 20%      +5.3        5.98 ±  3%  perf-profile.calltrace.cycles-pp.enqueue_task_fair.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common
> >      35.96            +5.7       41.68        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
> >       0.00            +6.0        6.02 ±  6%  perf-profile.calltrace.cycles-pp.__schedule.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode
> >       0.00            +6.2        6.18 ±  6%  perf-profile.calltrace.cycles-pp.schedule.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
> >       0.00            +6.4        6.36 ±  6%  perf-profile.calltrace.cycles-pp.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
> >       0.78 ± 19%      +6.4        7.15 ±  3%  perf-profile.calltrace.cycles-pp.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock
> >       0.18 ±141%      +7.0        7.18 ±  6%  perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_write
> >       1.89 ± 15%     +12.1       13.96 ±  3%  perf-profile.calltrace.cycles-pp.__schedule.schedule.schedule_timeout.unix_stream_data_wait.unix_stream_read_generic
> >       1.92 ± 15%     +12.3       14.23 ±  3%  perf-profile.calltrace.cycles-pp.schedule.schedule_timeout.unix_stream_data_wait.unix_stream_read_generic.unix_stream_recvmsg
> >       1.66 ± 19%     +12.4       14.06 ±  2%  perf-profile.calltrace.cycles-pp.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.sock_def_readable
> >       1.96 ± 15%     +12.5       14.48 ±  3%  perf-profile.calltrace.cycles-pp.schedule_timeout.unix_stream_data_wait.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter
> >       1.69 ± 19%     +12.7       14.38 ±  2%  perf-profile.calltrace.cycles-pp.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.sock_def_readable.unix_stream_sendmsg
> >       1.75 ± 19%     +13.0       14.75 ±  2%  perf-profile.calltrace.cycles-pp.__wake_up_common.__wake_up_common_lock.sock_def_readable.unix_stream_sendmsg.sock_sendmsg
> >       2.53 ± 10%     +13.4       15.90 ±  2%  perf-profile.calltrace.cycles-pp.sock_def_readable.unix_stream_sendmsg.sock_sendmsg.sock_write_iter.vfs_write
> >       1.96 ± 16%     +13.5       15.42 ±  2%  perf-profile.calltrace.cycles-pp.__wake_up_common_lock.sock_def_readable.unix_stream_sendmsg.sock_sendmsg.sock_write_iter
> >       2.28 ± 15%     +14.6       16.86 ±  3%  perf-profile.calltrace.cycles-pp.unix_stream_data_wait.unix_stream_read_generic.unix_stream_recvmsg.sock_read_iter.vfs_read
> >      15.31 ±  4%      -8.6        6.67 ±  5%  perf-profile.children.cycles-pp.sock_alloc_send_pskb
> >      13.85 ±  6%      -7.9        5.98 ±  5%  perf-profile.children.cycles-pp.alloc_skb_with_frags
> >      13.70 ±  6%      -7.8        5.89 ±  6%  perf-profile.children.cycles-pp.__alloc_skb
> >       9.01 ±  2%      -5.5        3.48 ±  4%  perf-profile.children.cycles-pp.consume_skb
> >       6.86 ± 26%      -4.7        2.15 ±  2%  perf-profile.children.cycles-pp._raw_spin_lock_irqsave
> >      11.27 ±  3%      -4.6        6.67 ±  3%  perf-profile.children.cycles-pp.syscall_return_via_sysret
> >       6.46 ±  4%      -4.3        2.15 ±  6%  perf-profile.children.cycles-pp.skb_release_data
> >       4.18 ± 25%      -4.0        0.15 ± 69%  perf-profile.children.cycles-pp.___slab_alloc
> >       5.76 ± 32%      -3.9        1.91 ±  3%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
> >       5.98 ±  8%      -3.5        2.52 ±  5%  perf-profile.children.cycles-pp.kmem_cache_alloc_node
> >       5.84 ±  6%      -3.3        2.50 ±  6%  perf-profile.children.cycles-pp.kmalloc_reserve
> >       3.33 ± 30%      -3.3        0.05 ± 88%  perf-profile.children.cycles-pp.get_partial_node
> >       5.63 ±  7%      -3.3        2.37 ±  6%  perf-profile.children.cycles-pp.__kmalloc_node_track_caller
> >       5.20 ±  7%      -3.1        2.12 ±  6%  perf-profile.children.cycles-pp.__kmem_cache_alloc_node
> >       6.23 ±  2%      -2.9        3.33 ±  3%  perf-profile.children.cycles-pp.unix_stream_read_actor
> >       6.18 ±  2%      -2.9        3.31 ±  3%  perf-profile.children.cycles-pp.skb_copy_datagram_iter
> >       6.11 ±  2%      -2.9        3.25 ±  3%  perf-profile.children.cycles-pp.__skb_datagram_iter
> >      51.39            -2.5       48.85        perf-profile.children.cycles-pp.__libc_read
> >       3.14 ±  3%      -2.5        0.61 ± 13%  perf-profile.children.cycles-pp.__slab_free
> >       5.34 ±  3%      -2.1        3.23 ±  3%  perf-profile.children.cycles-pp.__entry_text_start
> >       3.57 ±  2%      -1.9        1.66 ±  6%  perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
> >       3.16 ±  2%      -1.7        1.43 ±  4%  perf-profile.children.cycles-pp._copy_to_iter
> >       2.74 ±  2%      -1.6        1.16 ±  4%  perf-profile.children.cycles-pp.copyout
> >       4.16 ±  2%      -1.5        2.62 ±  3%  perf-profile.children.cycles-pp.__check_object_size
> >       2.73 ±  3%      -1.4        1.35 ±  6%  perf-profile.children.cycles-pp.kmem_cache_free
> >       2.82 ±  2%      -1.2        1.63 ±  3%  perf-profile.children.cycles-pp.check_heap_object
> >       2.27 ±  4%      -1.1        1.13 ±  2%  perf-profile.children.cycles-pp.skb_release_head_state
> >       2.85 ±  3%      -1.1        1.72 ±  2%  perf-profile.children.cycles-pp.simple_copy_to_iter
> >       2.22 ±  4%      -1.1        1.10        perf-profile.children.cycles-pp.unix_destruct_scm
> >       3.00 ±  2%      -1.1        1.91 ±  5%  perf-profile.children.cycles-pp.skb_copy_datagram_from_iter
> >       2.00 ±  4%      -1.1        0.92 ±  2%  perf-profile.children.cycles-pp.sock_wfree
> >       2.16 ±  3%      -0.7        1.43 ±  7%  perf-profile.children.cycles-pp.memcg_slab_post_alloc_hook
> >       1.45 ±  3%      -0.7        0.73 ±  7%  perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
> >       2.21 ±  2%      -0.7        1.52 ±  3%  perf-profile.children.cycles-pp.__kmem_cache_free
> >       1.49 ±  3%      -0.6        0.89 ±  8%  perf-profile.children.cycles-pp._copy_from_iter
> >       1.40 ±  3%      -0.6        0.85 ± 13%  perf-profile.children.cycles-pp.mod_objcg_state
> >       0.74            -0.5        0.24 ± 16%  perf-profile.children.cycles-pp.__build_skb_around
> >       1.48            -0.5        1.01 ±  2%  perf-profile.children.cycles-pp.get_obj_cgroup_from_current
> >       2.05 ±  2%      -0.5        1.59 ±  2%  perf-profile.children.cycles-pp.security_file_permission
> >       0.98 ±  2%      -0.4        0.59 ± 10%  perf-profile.children.cycles-pp.copyin
> >       1.08 ±  3%      -0.4        0.72 ±  3%  perf-profile.children.cycles-pp.__might_resched
> >       1.75            -0.3        1.42 ±  4%  perf-profile.children.cycles-pp.apparmor_file_permission
> >       1.32 ±  4%      -0.3        1.00 ±  3%  perf-profile.children.cycles-pp.sock_recvmsg
> >       0.54 ±  4%      -0.3        0.25 ±  6%  perf-profile.children.cycles-pp.skb_unlink
> >       0.54 ±  6%      -0.3        0.26 ±  3%  perf-profile.children.cycles-pp.unix_write_space
> >       0.66 ±  3%      -0.3        0.39 ±  4%  perf-profile.children.cycles-pp.obj_cgroup_charge
> >       0.68 ±  2%      -0.3        0.41 ±  4%  perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
> >       0.86 ±  4%      -0.3        0.59 ±  3%  perf-profile.children.cycles-pp.__check_heap_object
> >       0.75 ±  9%      -0.3        0.48 ±  2%  perf-profile.children.cycles-pp.skb_set_owner_w
> >       1.84 ±  3%      -0.3        1.58 ±  4%  perf-profile.children.cycles-pp.aa_sk_perm
> >       0.68 ± 11%      -0.2        0.44 ±  3%  perf-profile.children.cycles-pp.skb_queue_tail
> >       1.22 ±  4%      -0.2        0.99 ±  5%  perf-profile.children.cycles-pp.__fdget_pos
> >       0.70 ±  2%      -0.2        0.48 ±  5%  perf-profile.children.cycles-pp.__get_obj_cgroup_from_memcg
> >       1.16 ±  4%      -0.2        0.93 ±  3%  perf-profile.children.cycles-pp.security_socket_recvmsg
> >       0.48 ±  3%      -0.2        0.29 ±  4%  perf-profile.children.cycles-pp.__might_fault
> >       0.24 ±  7%      -0.2        0.05 ± 56%  perf-profile.children.cycles-pp.fsnotify_perm
> >       1.12 ±  4%      -0.2        0.93 ±  6%  perf-profile.children.cycles-pp.__fget_light
> >       1.24 ±  3%      -0.2        1.07 ±  3%  perf-profile.children.cycles-pp.security_socket_sendmsg
> >       0.61 ±  3%      -0.2        0.45 ±  2%  perf-profile.children.cycles-pp.__might_sleep
> >       0.33 ±  5%      -0.2        0.17 ±  6%  perf-profile.children.cycles-pp.refill_obj_stock
> >       0.40 ±  2%      -0.1        0.25 ±  4%  perf-profile.children.cycles-pp.kmalloc_slab
> >       0.57 ±  2%      -0.1        0.45        perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
> >       0.54 ±  3%      -0.1        0.42 ±  2%  perf-profile.children.cycles-pp.wait_for_unix_gc
> >       0.42 ±  2%      -0.1        0.30 ±  3%  perf-profile.children.cycles-pp.is_vmalloc_addr
> >       1.00 ±  2%      -0.1        0.87 ±  5%  perf-profile.children.cycles-pp.__virt_addr_valid
> >       0.52 ±  2%      -0.1        0.41        perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
> >       0.33 ±  3%      -0.1        0.21 ±  3%  perf-profile.children.cycles-pp.tick_sched_handle
> >       0.36 ±  2%      -0.1        0.25 ±  4%  perf-profile.children.cycles-pp.tick_sched_timer
> >       0.47 ±  2%      -0.1        0.36 ±  2%  perf-profile.children.cycles-pp.hrtimer_interrupt
> >       0.48 ±  2%      -0.1        0.36 ±  2%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
> >       0.32 ±  3%      -0.1        0.21 ±  5%  perf-profile.children.cycles-pp.update_process_times
> >       0.42 ±  3%      -0.1        0.31 ±  2%  perf-profile.children.cycles-pp.__hrtimer_run_queues
> >       0.26 ±  6%      -0.1        0.16 ±  4%  perf-profile.children.cycles-pp.kmalloc_size_roundup
> >       0.20 ±  4%      -0.1        0.10 ±  9%  perf-profile.children.cycles-pp.task_tick_fair
> >       0.24 ±  3%      -0.1        0.15 ±  4%  perf-profile.children.cycles-pp.scheduler_tick
> >       0.30 ±  5%      -0.1        0.21 ±  8%  perf-profile.children.cycles-pp.obj_cgroup_uncharge_pages
> >       0.20 ±  2%      -0.1        0.11 ±  6%  perf-profile.children.cycles-pp.should_failslab
> >       0.51 ±  2%      -0.1        0.43 ±  6%  perf-profile.children.cycles-pp.syscall_enter_from_user_mode
> >       0.15 ±  8%      -0.1        0.07 ± 13%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
> >       0.19 ±  4%      -0.1        0.12 ±  5%  perf-profile.children.cycles-pp.apparmor_socket_sendmsg
> >       0.20 ±  4%      -0.1        0.13 ±  5%  perf-profile.children.cycles-pp.aa_file_perm
> >       0.18 ±  5%      -0.1        0.12 ±  5%  perf-profile.children.cycles-pp.apparmor_socket_recvmsg
> >       0.14 ± 13%      -0.1        0.08 ± 55%  perf-profile.children.cycles-pp.__mod_memcg_lruvec_state
> >       0.24 ±  4%      -0.1        0.18 ±  2%  perf-profile.children.cycles-pp.rcu_all_qs
> >       0.18 ± 10%      -0.1        0.12 ± 11%  perf-profile.children.cycles-pp.memcg_account_kmem
> >       0.37 ±  3%      -0.1        0.31 ±  3%  perf-profile.children.cycles-pp.security_socket_getpeersec_dgram
> >       0.08            -0.0        0.06 ±  8%  perf-profile.children.cycles-pp.put_pid
> >       0.18 ±  3%      -0.0        0.16 ±  4%  perf-profile.children.cycles-pp.apparmor_socket_getpeersec_dgram
> >       0.21 ±  3%      +0.0        0.23 ±  2%  perf-profile.children.cycles-pp.__get_task_ioprio
> >       0.00            +0.1        0.05        perf-profile.children.cycles-pp.perf_exclude_event
> >       0.00            +0.1        0.06 ±  7%  perf-profile.children.cycles-pp.invalidate_user_asid
> >       0.00            +0.1        0.07 ±  6%  perf-profile.children.cycles-pp.__bitmap_and
> >       0.05            +0.1        0.13 ±  8%  perf-profile.children.cycles-pp.irqentry_exit_to_user_mode
> >       0.00            +0.1        0.08 ±  7%  perf-profile.children.cycles-pp.schedule_debug
> >       0.00            +0.1        0.08 ± 13%  perf-profile.children.cycles-pp.read@plt
> >       0.00            +0.1        0.08 ±  5%  perf-profile.children.cycles-pp.sysvec_reschedule_ipi
> >       0.00            +0.1        0.10 ±  4%  perf-profile.children.cycles-pp.tracing_gen_ctx_irq_test
> >       0.00            +0.1        0.10 ±  4%  perf-profile.children.cycles-pp.place_entity
> >       0.00            +0.1        0.12 ± 10%  perf-profile.children.cycles-pp.native_irq_return_iret
> >       0.07 ± 14%      +0.1        0.19 ±  3%  perf-profile.children.cycles-pp.__list_add_valid
> >       0.00            +0.1        0.13 ±  6%  perf-profile.children.cycles-pp.perf_trace_buf_alloc
> >       0.00            +0.1        0.13 ± 34%  perf-profile.children.cycles-pp._find_next_and_bit
> >       0.00            +0.1        0.14 ±  5%  perf-profile.children.cycles-pp.switch_ldt
> >       0.00            +0.1        0.15 ±  5%  perf-profile.children.cycles-pp.check_cfs_rq_runtime
> >       0.00            +0.1        0.15 ± 30%  perf-profile.children.cycles-pp.migrate_task_rq_fair
> >       0.00            +0.2        0.15 ±  5%  perf-profile.children.cycles-pp.__rdgsbase_inactive
> >       0.00            +0.2        0.16 ±  3%  perf-profile.children.cycles-pp.save_fpregs_to_fpstate
> >       0.00            +0.2        0.16 ±  6%  perf-profile.children.cycles-pp.ttwu_queue_wakelist
> >       0.00            +0.2        0.17        perf-profile.children.cycles-pp.perf_trace_buf_update
> >       0.00            +0.2        0.18 ±  2%  perf-profile.children.cycles-pp.rb_insert_color
> >       0.00            +0.2        0.18 ±  4%  perf-profile.children.cycles-pp.rb_next
> >       0.00            +0.2        0.18 ± 21%  perf-profile.children.cycles-pp.__cgroup_account_cputime
> >       0.01 ±223%      +0.2        0.21 ± 28%  perf-profile.children.cycles-pp.perf_trace_sched_switch
> >       0.00            +0.2        0.20 ±  3%  perf-profile.children.cycles-pp.select_idle_cpu
> >       0.00            +0.2        0.20 ±  3%  perf-profile.children.cycles-pp.rcu_note_context_switch
> >       0.00            +0.2        0.21 ± 26%  perf-profile.children.cycles-pp.set_task_cpu
> >       0.00            +0.2        0.22 ±  8%  perf-profile.children.cycles-pp.resched_curr
> >       0.08 ±  5%      +0.2        0.31 ± 11%  perf-profile.children.cycles-pp.task_h_load
> >       0.00            +0.2        0.24 ±  3%  perf-profile.children.cycles-pp.finish_wait
> >       0.04 ± 44%      +0.3        0.29 ±  5%  perf-profile.children.cycles-pp.rb_erase
> >       0.19 ±  6%      +0.3        0.46        perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
> >       0.20 ±  6%      +0.3        0.47 ±  3%  perf-profile.children.cycles-pp.__list_del_entry_valid
> >       0.00            +0.3        0.28 ±  3%  perf-profile.children.cycles-pp.__wrgsbase_inactive
> >       0.02 ±141%      +0.3        0.30 ±  2%  perf-profile.children.cycles-pp.native_sched_clock
> >       0.06 ± 13%      +0.3        0.34 ±  2%  perf-profile.children.cycles-pp.sched_clock_cpu
> >       0.64 ±  2%      +0.3        0.93        perf-profile.children.cycles-pp.mutex_lock
> >       0.00            +0.3        0.30 ±  5%  perf-profile.children.cycles-pp.cr4_update_irqsoff
> >       0.00            +0.3        0.30 ±  4%  perf-profile.children.cycles-pp.clear_buddies
> >       0.07 ± 55%      +0.3        0.37 ±  5%  perf-profile.children.cycles-pp.perf_trace_sched_stat_runtime
> >       0.10 ± 66%      +0.3        0.42 ±  5%  perf-profile.children.cycles-pp.perf_tp_event
> >       0.02 ±142%      +0.3        0.36 ±  6%  perf-profile.children.cycles-pp.cpuacct_charge
> >       0.12 ±  9%      +0.4        0.47 ± 11%  perf-profile.children.cycles-pp.wake_affine
> >       0.00            +0.4        0.36 ± 13%  perf-profile.children.cycles-pp.available_idle_cpu
> >       0.05 ± 48%      +0.4        0.42 ±  6%  perf-profile.children.cycles-pp.finish_task_switch
> >       0.12 ±  4%      +0.4        0.49 ±  4%  perf-profile.children.cycles-pp.asm_sysvec_reschedule_ipi
> >       0.07 ± 17%      +0.4        0.48        perf-profile.children.cycles-pp.__calc_delta
> >       0.03 ±100%      +0.5        0.49 ±  4%  perf-profile.children.cycles-pp.pick_next_entity
> >       0.00            +0.5        0.48 ±  8%  perf-profile.children.cycles-pp.set_next_buddy
> >       0.08 ± 14%      +0.6        0.66 ±  4%  perf-profile.children.cycles-pp.update_min_vruntime
> >       0.07 ± 17%      +0.6        0.68 ±  2%  perf-profile.children.cycles-pp.os_xsave
> >       0.29 ±  7%      +0.7        0.99 ±  3%  perf-profile.children.cycles-pp.update_cfs_group
> >       0.17 ± 17%      +0.7        0.87 ±  4%  perf-profile.children.cycles-pp.perf_trace_sched_wakeup_template
> >       0.14 ±  7%      +0.7        0.87 ±  3%  perf-profile.children.cycles-pp.__update_load_avg_se
> >       0.14 ± 16%      +0.8        0.90 ±  2%  perf-profile.children.cycles-pp.update_rq_clock
> >       0.08 ± 17%      +0.8        0.84 ±  5%  perf-profile.children.cycles-pp.check_preempt_wakeup
> >       0.12 ± 14%      +0.8        0.95 ±  3%  perf-profile.children.cycles-pp.__update_load_avg_cfs_rq
> >       0.22 ±  5%      +0.8        1.07 ±  3%  perf-profile.children.cycles-pp.prepare_to_wait
> >       0.10 ± 18%      +0.9        0.98 ±  3%  perf-profile.children.cycles-pp.check_preempt_curr
> >      29.72            +0.9       30.61        perf-profile.children.cycles-pp.vfs_write
> >       0.14 ± 11%      +0.9        1.03 ±  4%  perf-profile.children.cycles-pp.__switch_to
> >       0.07 ± 20%      +0.9        0.99 ±  6%  perf-profile.children.cycles-pp.put_prev_entity
> >       0.12 ± 16%      +1.0        1.13 ±  5%  perf-profile.children.cycles-pp.___perf_sw_event
> >       0.07 ± 17%      +1.0        1.10 ± 13%  perf-profile.children.cycles-pp.select_idle_sibling
> >      27.82 ±  2%      +1.2       28.99        perf-profile.children.cycles-pp.unix_stream_recvmsg
> >      27.41 ±  2%      +1.2       28.63        perf-profile.children.cycles-pp.unix_stream_read_generic
> >       0.20 ± 15%      +1.4        1.59 ±  3%  perf-profile.children.cycles-pp.reweight_entity
> >       0.21 ± 13%      +1.4        1.60 ±  4%  perf-profile.children.cycles-pp.__switch_to_asm
> >       0.23 ± 10%      +1.4        1.65 ±  5%  perf-profile.children.cycles-pp.restore_fpregs_from_fpstate
> >       0.20 ± 13%      +1.5        1.69 ±  3%  perf-profile.children.cycles-pp.set_next_entity
> >      27.59            +1.6       29.19        perf-profile.children.cycles-pp.sock_write_iter
> >       0.28 ± 10%      +1.8        2.12 ±  5%  perf-profile.children.cycles-pp.switch_fpu_return
> >       0.26 ± 11%      +1.8        2.10 ±  6%  perf-profile.children.cycles-pp.select_task_rq_fair
> >      26.66 ±  2%      +2.0       28.63        perf-profile.children.cycles-pp.sock_sendmsg
> >       0.31 ± 12%      +2.1        2.44 ±  5%  perf-profile.children.cycles-pp.select_task_rq
> >       0.30 ± 14%      +2.2        2.46 ±  4%  perf-profile.children.cycles-pp.prepare_task_switch
> >      25.27 ±  2%      +2.2       27.47        perf-profile.children.cycles-pp.unix_stream_sendmsg
> >       2.10            +2.3        4.38 ±  2%  perf-profile.children.cycles-pp._raw_spin_lock
> >       0.40 ± 14%      +2.5        2.92 ±  5%  perf-profile.children.cycles-pp.dequeue_entity
> >      48.40            +2.6       51.02        perf-profile.children.cycles-pp.__libc_write
> >       0.46 ± 15%      +3.1        3.51 ±  3%  perf-profile.children.cycles-pp.enqueue_entity
> >       0.49 ± 10%      +3.2        3.64 ±  7%  perf-profile.children.cycles-pp.update_load_avg
> >       0.53 ± 20%      +3.4        3.91 ±  3%  perf-profile.children.cycles-pp.update_curr
> >      80.81            +3.4       84.24        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
> >       0.50 ± 12%      +3.5        4.00 ±  4%  perf-profile.children.cycles-pp.switch_mm_irqs_off
> >       0.55 ±  9%      +3.8        4.38 ±  4%  perf-profile.children.cycles-pp.pick_next_task_fair
> >       9.60            +4.6       14.15 ±  2%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode
> >       0.78 ± 13%      +4.9        5.65 ±  4%  perf-profile.children.cycles-pp.dequeue_task_fair
> >       0.78 ± 15%      +5.2        5.99 ±  3%  perf-profile.children.cycles-pp.enqueue_task_fair
> >      74.30            +5.6       79.86        perf-profile.children.cycles-pp.do_syscall_64
> >       0.90 ± 15%      +6.3        7.16 ±  3%  perf-profile.children.cycles-pp.ttwu_do_activate
> >       0.33 ± 31%      +6.3        6.61 ±  6%  perf-profile.children.cycles-pp.exit_to_user_mode_loop
> >       0.82 ± 15%      +8.1        8.92 ±  5%  perf-profile.children.cycles-pp.exit_to_user_mode_prepare
> >       1.90 ± 16%     +12.2       14.10 ±  2%  perf-profile.children.cycles-pp.try_to_wake_up
> >       2.36 ± 11%     +12.2       14.60 ±  3%  perf-profile.children.cycles-pp.schedule_timeout
> >       1.95 ± 15%     +12.5       14.41 ±  2%  perf-profile.children.cycles-pp.autoremove_wake_function
> >       2.01 ± 15%     +12.8       14.76 ±  2%  perf-profile.children.cycles-pp.__wake_up_common
> >       2.23 ± 13%     +13.2       15.45 ±  2%  perf-profile.children.cycles-pp.__wake_up_common_lock
> >       2.53 ± 10%     +13.4       15.90 ±  2%  perf-profile.children.cycles-pp.sock_def_readable
> >       2.29 ± 15%     +14.6       16.93 ±  3%  perf-profile.children.cycles-pp.unix_stream_data_wait
> >       2.61 ± 13%     +18.0       20.65 ±  4%  perf-profile.children.cycles-pp.schedule
> >       2.66 ± 13%     +18.1       20.77 ±  4%  perf-profile.children.cycles-pp.__schedule
> >      11.25 ±  3%      -4.6        6.67 ±  3%  perf-profile.self.cycles-pp.syscall_return_via_sysret
> >       5.76 ± 32%      -3.9        1.90 ±  3%  perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
> >       8.69 ±  3%      -3.4        5.27 ±  3%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode
> >       3.11 ±  3%      -2.5        0.60 ± 13%  perf-profile.self.cycles-pp.__slab_free
> >       6.65 ±  2%      -2.2        4.47 ±  2%  perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
> >       4.78 ±  3%      -1.9        2.88 ±  3%  perf-profile.self.cycles-pp.__entry_text_start
> >       3.52 ±  2%      -1.9        1.64 ±  6%  perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
> >       2.06 ±  3%      -1.1        0.96 ±  5%  perf-profile.self.cycles-pp.kmem_cache_free
> >       1.42 ±  3%      -1.0        0.46 ± 10%  perf-profile.self.cycles-pp.check_heap_object
> >       1.43 ±  4%      -0.8        0.64        perf-profile.self.cycles-pp.sock_wfree
> >       0.99 ±  3%      -0.8        0.21 ± 12%  perf-profile.self.cycles-pp.skb_release_data
> >       0.84 ±  8%      -0.7        0.10 ± 64%  perf-profile.self.cycles-pp.___slab_alloc
> >       1.97 ±  2%      -0.6        1.32        perf-profile.self.cycles-pp.unix_stream_read_generic
> >       1.60 ±  3%      -0.5        1.11 ±  4%  perf-profile.self.cycles-pp.memcg_slab_post_alloc_hook
> >       1.24 ±  2%      -0.5        0.75 ± 11%  perf-profile.self.cycles-pp.mod_objcg_state
> >       0.71            -0.5        0.23 ± 15%  perf-profile.self.cycles-pp.__build_skb_around
> >       0.95 ±  3%      -0.5        0.50 ±  6%  perf-profile.self.cycles-pp.__alloc_skb
> >       0.97 ±  4%      -0.4        0.55 ±  5%  perf-profile.self.cycles-pp.kmem_cache_alloc_node
> >       0.99 ±  3%      -0.4        0.59 ±  4%  perf-profile.self.cycles-pp.vfs_write
> >       1.38 ±  2%      -0.4        0.99        perf-profile.self.cycles-pp.__kmem_cache_free
> >       0.86 ±  2%      -0.4        0.50 ±  3%  perf-profile.self.cycles-pp.__kmem_cache_alloc_node
> >       0.92 ±  4%      -0.4        0.56 ±  4%  perf-profile.self.cycles-pp.sock_write_iter
> >       1.06 ±  3%      -0.4        0.70 ±  3%  perf-profile.self.cycles-pp.__might_resched
> >       0.73 ±  4%      -0.3        0.44 ±  4%  perf-profile.self.cycles-pp.__cond_resched
> >       0.85 ±  3%      -0.3        0.59 ±  4%  perf-profile.self.cycles-pp.__check_heap_object
> >       1.46 ±  7%      -0.3        1.20 ±  2%  perf-profile.self.cycles-pp.unix_stream_sendmsg
> >       0.73 ±  9%      -0.3        0.47 ±  2%  perf-profile.self.cycles-pp.skb_set_owner_w
> >       1.54            -0.3        1.28 ±  4%  perf-profile.self.cycles-pp.apparmor_file_permission
> >       0.74 ±  3%      -0.2        0.50 ±  2%  perf-profile.self.cycles-pp.get_obj_cgroup_from_current
> >       1.15 ±  3%      -0.2        0.91 ±  8%  perf-profile.self.cycles-pp.aa_sk_perm
> >       0.60            -0.2        0.36 ±  4%  perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
> >       0.65 ±  4%      -0.2        0.45 ±  6%  perf-profile.self.cycles-pp.__get_obj_cgroup_from_memcg
> >       0.24 ±  6%      -0.2        0.05 ± 56%  perf-profile.self.cycles-pp.fsnotify_perm
> >       0.76 ±  3%      -0.2        0.58 ±  2%  perf-profile.self.cycles-pp.sock_read_iter
> >       1.10 ±  4%      -0.2        0.92 ±  6%  perf-profile.self.cycles-pp.__fget_light
> >       0.42 ±  3%      -0.2        0.25 ±  4%  perf-profile.self.cycles-pp.obj_cgroup_charge
> >       0.32 ±  4%      -0.2        0.17 ±  6%  perf-profile.self.cycles-pp.refill_obj_stock
> >       0.29            -0.2        0.14 ±  8%  perf-profile.self.cycles-pp.__kmalloc_node_track_caller
> >       0.54 ±  3%      -0.1        0.40 ±  2%  perf-profile.self.cycles-pp.__might_sleep
> >       0.30 ±  7%      -0.1        0.16 ± 22%  perf-profile.self.cycles-pp.security_file_permission
> >       0.34 ±  3%      -0.1        0.21 ±  6%  perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
> >       0.41 ±  3%      -0.1        0.29 ±  3%  perf-profile.self.cycles-pp.is_vmalloc_addr
> >       0.27 ±  3%      -0.1        0.16 ±  6%  perf-profile.self.cycles-pp._copy_from_iter
> >       0.24 ±  3%      -0.1        0.12 ±  3%  perf-profile.self.cycles-pp.ksys_write
> >       0.95 ±  2%      -0.1        0.84 ±  5%  perf-profile.self.cycles-pp.__virt_addr_valid
> >       0.56 ± 11%      -0.1        0.46 ±  4%  perf-profile.self.cycles-pp.sock_def_readable
> >       0.16 ±  7%      -0.1        0.06 ± 18%  perf-profile.self.cycles-pp.sock_recvmsg
> >       0.22 ±  5%      -0.1        0.14 ±  2%  perf-profile.self.cycles-pp.ksys_read
> >       0.27 ±  4%      -0.1        0.19 ±  5%  perf-profile.self.cycles-pp.kmalloc_slab
> >       0.28 ±  2%      -0.1        0.20 ±  2%  perf-profile.self.cycles-pp.consume_skb
> >       0.35 ±  2%      -0.1        0.28 ±  3%  perf-profile.self.cycles-pp.__check_object_size
> >       0.13 ±  8%      -0.1        0.06 ± 18%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
> >       0.20 ±  5%      -0.1        0.12 ±  6%  perf-profile.self.cycles-pp.kmalloc_reserve
> >       0.26 ±  5%      -0.1        0.19 ±  4%  perf-profile.self.cycles-pp.sock_alloc_send_pskb
> >       0.42 ±  2%      -0.1        0.35 ±  7%  perf-profile.self.cycles-pp.syscall_enter_from_user_mode
> >       0.19 ±  5%      -0.1        0.12 ±  6%  perf-profile.self.cycles-pp.aa_file_perm
> >       0.16 ±  4%      -0.1        0.10 ±  4%  perf-profile.self.cycles-pp.skb_copy_datagram_from_iter
> >       0.18 ±  4%      -0.1        0.12 ±  6%  perf-profile.self.cycles-pp.apparmor_socket_sendmsg
> >       0.18 ±  5%      -0.1        0.12 ±  4%  perf-profile.self.cycles-pp.apparmor_socket_recvmsg
> >       0.15 ±  5%      -0.1        0.10 ±  5%  perf-profile.self.cycles-pp.alloc_skb_with_frags
> >       0.64 ±  3%      -0.1        0.59        perf-profile.self.cycles-pp.__libc_write
> >       0.20 ±  4%      -0.1        0.15 ±  3%  perf-profile.self.cycles-pp._copy_to_iter
> >       0.15 ±  5%      -0.1        0.10 ± 11%  perf-profile.self.cycles-pp.sock_sendmsg
> >       0.08 ±  4%      -0.1        0.03 ± 81%  perf-profile.self.cycles-pp.copyout
> >       0.11 ±  6%      -0.0        0.06 ±  7%  perf-profile.self.cycles-pp.__fdget_pos
> >       0.12 ±  5%      -0.0        0.07 ± 10%  perf-profile.self.cycles-pp.kmalloc_size_roundup
> >       0.34 ±  3%      -0.0        0.29        perf-profile.self.cycles-pp.do_syscall_64
> >       0.20 ±  4%      -0.0        0.15 ±  4%  perf-profile.self.cycles-pp.rcu_all_qs
> >       0.41 ±  3%      -0.0        0.37 ±  8%  perf-profile.self.cycles-pp.unix_stream_recvmsg
> >       0.22 ±  2%      -0.0        0.17 ±  4%  perf-profile.self.cycles-pp.unix_destruct_scm
> >       0.09 ±  4%      -0.0        0.05        perf-profile.self.cycles-pp.should_failslab
> >       0.10 ± 15%      -0.0        0.06 ± 50%  perf-profile.self.cycles-pp.__mod_memcg_lruvec_state
> >       0.11 ±  4%      -0.0        0.07        perf-profile.self.cycles-pp.__might_fault
> >       0.16 ±  2%      -0.0        0.13 ±  6%  perf-profile.self.cycles-pp.obj_cgroup_uncharge_pages
> >       0.18 ±  4%      -0.0        0.16 ±  3%  perf-profile.self.cycles-pp.security_socket_getpeersec_dgram
> >       0.28 ±  2%      -0.0        0.25 ±  2%  perf-profile.self.cycles-pp.unix_write_space
> >       0.17 ±  2%      -0.0        0.15 ±  5%  perf-profile.self.cycles-pp.apparmor_socket_getpeersec_dgram
> >       0.08 ±  6%      -0.0        0.05 ±  7%  perf-profile.self.cycles-pp.security_socket_sendmsg
> >       0.12 ±  4%      -0.0        0.10 ±  3%  perf-profile.self.cycles-pp.__skb_datagram_iter
> >       0.24 ±  2%      -0.0        0.22        perf-profile.self.cycles-pp.mutex_unlock
> >       0.08 ±  5%      +0.0        0.10 ±  6%  perf-profile.self.cycles-pp.scm_recv
> >       0.17 ±  2%      +0.0        0.19 ±  3%  perf-profile.self.cycles-pp.__x64_sys_read
> >       0.19 ±  3%      +0.0        0.22 ±  2%  perf-profile.self.cycles-pp.__get_task_ioprio
> >       0.00            +0.1        0.06        perf-profile.self.cycles-pp.finish_wait
> >       0.00            +0.1        0.06 ±  7%  perf-profile.self.cycles-pp.cr4_update_irqsoff
> >       0.00            +0.1        0.06 ±  7%  perf-profile.self.cycles-pp.invalidate_user_asid
> >       0.00            +0.1        0.07 ± 12%  perf-profile.self.cycles-pp.wake_affine
> >       0.00            +0.1        0.07 ±  7%  perf-profile.self.cycles-pp.check_cfs_rq_runtime
> >       0.00            +0.1        0.07 ±  5%  perf-profile.self.cycles-pp.perf_trace_buf_update
> >       0.00            +0.1        0.07 ±  9%  perf-profile.self.cycles-pp.asm_sysvec_reschedule_ipi
> >       0.00            +0.1        0.07 ± 10%  perf-profile.self.cycles-pp.__bitmap_and
> >       0.00            +0.1        0.08 ± 10%  perf-profile.self.cycles-pp.schedule_debug
> >       0.00            +0.1        0.08 ± 13%  perf-profile.self.cycles-pp.read@plt
> >       0.00            +0.1        0.08 ± 12%  perf-profile.self.cycles-pp.perf_trace_buf_alloc
> >       0.00            +0.1        0.09 ± 35%  perf-profile.self.cycles-pp.migrate_task_rq_fair
> >       0.00            +0.1        0.09 ±  5%  perf-profile.self.cycles-pp.place_entity
> >       0.00            +0.1        0.10 ±  4%  perf-profile.self.cycles-pp.tracing_gen_ctx_irq_test
> >       0.00            +0.1        0.10        perf-profile.self.cycles-pp.__wake_up_common_lock
> >       0.07 ± 17%      +0.1        0.18 ±  3%  perf-profile.self.cycles-pp.__list_add_valid
> >       0.00            +0.1        0.11 ±  8%  perf-profile.self.cycles-pp.native_irq_return_iret
> >       0.00            +0.1        0.12 ±  6%  perf-profile.self.cycles-pp.select_idle_cpu
> >       0.00            +0.1        0.12 ± 34%  perf-profile.self.cycles-pp._find_next_and_bit
> >       0.00            +0.1        0.13 ± 25%  perf-profile.self.cycles-pp.__cgroup_account_cputime
> >       0.00            +0.1        0.13 ±  7%  perf-profile.self.cycles-pp.switch_ldt
> >       0.00            +0.1        0.14 ±  5%  perf-profile.self.cycles-pp.check_preempt_curr
> >       0.00            +0.1        0.15 ±  2%  perf-profile.self.cycles-pp.save_fpregs_to_fpstate
> >       0.00            +0.1        0.15 ±  5%  perf-profile.self.cycles-pp.__rdgsbase_inactive
> >       0.14 ±  3%      +0.2        0.29        perf-profile.self.cycles-pp._raw_spin_unlock_irqrestore
> >       0.00            +0.2        0.15 ±  7%  perf-profile.self.cycles-pp.ttwu_queue_wakelist
> >       0.00            +0.2        0.17 ±  4%  perf-profile.self.cycles-pp.rb_insert_color
> >       0.00            +0.2        0.17 ±  5%  perf-profile.self.cycles-pp.rb_next
> >       0.00            +0.2        0.18 ±  2%  perf-profile.self.cycles-pp.autoremove_wake_function
> >       0.01 ±223%      +0.2        0.19 ±  6%  perf-profile.self.cycles-pp.ttwu_do_activate
> >       0.00            +0.2        0.20 ±  2%  perf-profile.self.cycles-pp.rcu_note_context_switch
> >       0.00            +0.2        0.20 ±  7%  perf-profile.self.cycles-pp.exit_to_user_mode_loop
> >       0.27            +0.2        0.47 ±  3%  perf-profile.self.cycles-pp.mutex_lock
> >       0.00            +0.2        0.20 ± 28%  perf-profile.self.cycles-pp.perf_trace_sched_switch
> >       0.00            +0.2        0.21 ±  9%  perf-profile.self.cycles-pp.resched_curr
> >       0.04 ± 45%      +0.2        0.26 ±  7%  perf-profile.self.cycles-pp.perf_tp_event
> >       0.06 ±  7%      +0.2        0.28 ±  8%  perf-profile.self.cycles-pp.perf_trace_sched_wakeup_template
> >       0.19 ±  7%      +0.2        0.41 ±  5%  perf-profile.self.cycles-pp.__list_del_entry_valid
> >       0.08 ±  5%      +0.2        0.31 ± 11%  perf-profile.self.cycles-pp.task_h_load
> >       0.00            +0.2        0.23 ±  5%  perf-profile.self.cycles-pp.finish_task_switch
> >       0.03 ± 70%      +0.2        0.27 ±  5%  perf-profile.self.cycles-pp.rb_erase
> >       0.02 ±142%      +0.3        0.29 ±  2%  perf-profile.self.cycles-pp.native_sched_clock
> >       0.00            +0.3        0.28 ±  3%  perf-profile.self.cycles-pp.__wrgsbase_inactive
> >       0.00            +0.3        0.28 ±  6%  perf-profile.self.cycles-pp.clear_buddies
> >       0.07 ± 10%      +0.3        0.35 ±  3%  perf-profile.self.cycles-pp.schedule_timeout
> >       0.03 ± 70%      +0.3        0.33 ±  3%  perf-profile.self.cycles-pp.select_task_rq
> >       0.06 ± 13%      +0.3        0.36 ±  4%  perf-profile.self.cycles-pp.__wake_up_common
> >       0.06 ± 13%      +0.3        0.36 ±  3%  perf-profile.self.cycles-pp.dequeue_entity
> >       0.06 ± 18%      +0.3        0.37 ±  7%  perf-profile.self.cycles-pp.perf_trace_sched_stat_runtime
> >       0.01 ±223%      +0.3        0.33 ±  4%  perf-profile.self.cycles-pp.schedule
> >       0.02 ±142%      +0.3        0.35 ±  7%  perf-profile.self.cycles-pp.cpuacct_charge
> >       0.01 ±223%      +0.3        0.35        perf-profile.self.cycles-pp.set_next_entity
> >       0.00            +0.4        0.35 ± 13%  perf-profile.self.cycles-pp.available_idle_cpu
> >       0.08 ± 10%      +0.4        0.44 ±  5%  perf-profile.self.cycles-pp.prepare_to_wait
> >       0.63 ±  3%      +0.4        1.00 ±  4%  perf-profile.self.cycles-pp.vfs_read
> >       0.02 ±142%      +0.4        0.40 ±  4%  perf-profile.self.cycles-pp.check_preempt_wakeup
> >       0.02 ±141%      +0.4        0.42 ±  4%  perf-profile.self.cycles-pp.pick_next_entity
> >       0.07 ± 17%      +0.4        0.48        perf-profile.self.cycles-pp.__calc_delta
> >       0.06 ± 14%      +0.4        0.47 ±  3%  perf-profile.self.cycles-pp.unix_stream_data_wait
> >       0.04 ± 45%      +0.4        0.45 ±  4%  perf-profile.self.cycles-pp.switch_fpu_return
> >       0.00            +0.5        0.46 ±  7%  perf-profile.self.cycles-pp.set_next_buddy
> >       0.07 ± 17%      +0.5        0.53 ±  3%  perf-profile.self.cycles-pp.select_task_rq_fair
> >       0.08 ± 16%      +0.5        0.55 ±  4%  perf-profile.self.cycles-pp.try_to_wake_up
> >       0.08 ± 19%      +0.5        0.56 ±  3%  perf-profile.self.cycles-pp.update_rq_clock
> >       0.02 ±141%      +0.5        0.50 ± 10%  perf-profile.self.cycles-pp.select_idle_sibling
> >       0.77 ±  2%      +0.5        1.25 ±  2%  perf-profile.self.cycles-pp.__libc_read
> >       0.09 ± 19%      +0.5        0.59 ±  3%  perf-profile.self.cycles-pp.reweight_entity
> >       0.08 ± 14%      +0.5        0.59 ±  2%  perf-profile.self.cycles-pp.dequeue_task_fair
> >       0.08 ± 13%      +0.6        0.64 ±  5%  perf-profile.self.cycles-pp.update_min_vruntime
> >       0.02 ±141%      +0.6        0.58 ±  7%  perf-profile.self.cycles-pp.put_prev_entity
> >       0.06 ± 11%      +0.6        0.64 ±  4%  perf-profile.self.cycles-pp.enqueue_task_fair
> >       0.07 ± 18%      +0.6        0.68 ±  3%  perf-profile.self.cycles-pp.os_xsave
> >       1.39 ±  2%      +0.7        2.06 ±  3%  perf-profile.self.cycles-pp._raw_spin_lock_irqsave
> >       0.28 ±  8%      +0.7        0.97 ±  4%  perf-profile.self.cycles-pp.update_cfs_group
> >       0.14 ±  8%      +0.7        0.83 ±  3%  perf-profile.self.cycles-pp.__update_load_avg_se
> >       1.76 ±  3%      +0.7        2.47 ±  3%  perf-profile.self.cycles-pp._raw_spin_lock
> >       0.12 ± 12%      +0.7        0.85 ±  5%  perf-profile.self.cycles-pp.prepare_task_switch
> >       0.12 ± 12%      +0.8        0.91 ±  3%  perf-profile.self.cycles-pp.__update_load_avg_cfs_rq
> >       0.13 ± 12%      +0.8        0.93 ±  5%  perf-profile.self.cycles-pp.pick_next_task_fair
> >       0.13 ± 12%      +0.9        0.98 ±  4%  perf-profile.self.cycles-pp.__switch_to
> >       0.11 ± 18%      +0.9        1.06 ±  5%  perf-profile.self.cycles-pp.___perf_sw_event
> >       0.16 ± 11%      +1.2        1.34 ±  4%  perf-profile.self.cycles-pp.enqueue_entity
> >       0.20 ± 12%      +1.4        1.58 ±  4%  perf-profile.self.cycles-pp.__switch_to_asm
> >       0.23 ± 10%      +1.4        1.65 ±  5%  perf-profile.self.cycles-pp.restore_fpregs_from_fpstate
> >       0.25 ± 12%      +1.5        1.77 ±  4%  perf-profile.self.cycles-pp.__schedule
> >       0.22 ± 10%      +1.6        1.78 ± 10%  perf-profile.self.cycles-pp.update_load_avg
> >       0.23 ± 16%      +1.7        1.91 ±  7%  perf-profile.self.cycles-pp.update_curr
> >       0.48 ± 11%      +3.4        3.86 ±  4%  perf-profile.self.cycles-pp.switch_mm_irqs_off
> >
> >
> > To reproduce:
> >
> >         git clone https://github.com/intel/lkp-tests.git
> >         cd lkp-tests
> >         sudo bin/lkp install job.yaml           # job file is attached in this email
> >         bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
> >         sudo bin/lkp run generated-yaml-file
> >
> >         # if come across any failure that blocks the test,
> >         # please remove ~/.lkp and /lkp dir to run from a clean state.
>
>
>
Re: [PATCH v3] sched/fair: sanitize vruntime of entity being placed
Posted by Roman Kagan 2 years, 6 months ago
On Tue, Feb 21, 2023 at 06:26:11PM +0100, Vincent Guittot wrote:
> On Tue, 21 Feb 2023 at 17:57, Roman Kagan <rkagan@amazon.de> wrote:
> > What scares me, though, is that I've got a message from the test robot
> > that this commit dramatically affected hackbench results; see the quote
> > below.  I expected the commit not to affect any benchmarks.
> >
> > Any idea what could have caused this change?
> 
> Hmm, it's most probably because se->exec_start is reset after a
> migration and the condition becomes true for a newly migrated task,
> whereas its vruntime should be after min_vruntime.
> 
> We have missed this condition

Makes sense to me.

But what would then be a reliable way to detect a sched_entity which
has slept for long enough to risk an s64 overflow in the .vruntime comparison?
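
For reference, the inversion I'm worried about is easy to reproduce outside
the kernel.  A minimal user-space sketch that only mirrors the signed-delta
comparison scheme of max_vruntime()/entity_before() (the numbers are invented
for illustration; this is not the actual kernel code):

#include <stdio.h>
#include <stdint.h>

/* same comparison scheme as the kernel's max_vruntime(): u64 values, signed delta */
static uint64_t max_vruntime(uint64_t max_vr, uint64_t vr)
{
	int64_t delta = (int64_t)(vr - max_vr);

	if (delta > 0)
		max_vr = vr;
	return max_vr;
}

int main(void)
{
	uint64_t stale     = 1000;			/* entity that slept for ages */
	uint64_t small_gap = stale + (1ULL << 40);	/* base ran ahead by < 2^63 */
	uint64_t huge_gap  = stale + (1ULL << 63) + 1;	/* base ran ahead by > 2^63 */

	/* gap below 2^63: the entity is pulled up to the base, as intended */
	printf("%llu\n", (unsigned long long)max_vruntime(stale, small_gap));

	/*
	 * gap above 2^63: the signed delta wraps negative, the stale value
	 * "wins", and the entity keeps a vruntime far behind everyone else,
	 * which the same signed-delta trick in entity_before() then
	 * misreads as being far ahead.
	 */
	printf("%llu\n", (unsigned long long)max_vruntime(stale, huge_gap));

	return 0;
}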

Thanks,
Roman.



Re: [PATCH v3] sched/fair: sanitize vruntime of entity being placed
Posted by Vincent Guittot 2 years, 6 months ago
On Mon, 27 Feb 2023 at 09:43, Roman Kagan <rkagan@amazon.de> wrote:
>
> On Tue, Feb 21, 2023 at 06:26:11PM +0100, Vincent Guittot wrote:
> > On Tue, 21 Feb 2023 at 17:57, Roman Kagan <rkagan@amazon.de> wrote:
> > > What scares me, though, is that I've got a message from the test robot
> > > that this commit dramatically affected hackbench results; see the quote
> > > below.  I expected the commit not to affect any benchmarks.
> > >
> > > Any idea what could have caused this change?
> >
> > Hmm, it's most probably because se->exec_start is reset after a
> > migration and the condition becomes true for a newly migrated task,
> > whereas its vruntime should be after min_vruntime.
> >
> > We have missed this condition
>
> Makes sense to me.
>
> But what would then be a reliable way to detect a sched_entity which
> has slept for long enough to risk an s64 overflow in the .vruntime comparison?

For now I don't have a better idea than adding the same check in
migrate_task_rq_fair()
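
To make the false positive concrete: after a migration se->exec_start is 0,
so sleep_time ends up being the whole rq clock.  A rough user-space sketch
(invented clock value; the 60-second cutoff is the one the check currently
uses; this is an illustration, not the kernel code):

#include <stdio.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

int main(void)
{
	/* a task that ran a moment ago, but whose exec_start was just
	 * reset by the migration path */
	uint64_t rq_clock_task = 3600ULL * NSEC_PER_SEC;	/* say, 1h of rq clock */
	uint64_t exec_start    = 0;				/* reset after migration */
	int64_t  sleep_time    = (int64_t)(rq_clock_task - exec_start);

	int fires_unguarded = sleep_time > (int64_t)(60 * NSEC_PER_SEC);
	int fires_guarded   = exec_start != 0 &&
			      sleep_time > (int64_t)(60 * NSEC_PER_SEC);

	/* the unguarded check treats the freshly migrated task as a long
	 * sleeper (prints 1); guarding on exec_start != 0 does not (prints 0) */
	printf("%d %d\n", fires_unguarded, fires_guarded);

	return 0;
}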

>
> Thanks,
> Roman.
>
>
>
Re: [PATCH v3] sched/fair: sanitize vruntime of entity being placed
Posted by Zhang Qiao 2 years, 6 months ago

On 2023/2/27 22:37, Vincent Guittot wrote:
> On Mon, 27 Feb 2023 at 09:43, Roman Kagan <rkagan@amazon.de> wrote:
>>
>> On Tue, Feb 21, 2023 at 06:26:11PM +0100, Vincent Guittot wrote:
>>> On Tue, 21 Feb 2023 at 17:57, Roman Kagan <rkagan@amazon.de> wrote:
>>>> What scares me, though, is that I've got a message from the test robot
>>>> that this commit dramatically affected hackbench results; see the quote
>>>> below.  I expected the commit not to affect any benchmarks.
>>>>
>>>> Any idea what could have caused this change?
>>>
>>> Hmm, it's most probably because se->exec_start is reset after a
>>> migration and the condition becomes true for a newly migrated task,
>>> whereas its vruntime should be after min_vruntime.
>>>
>>> We have missed this condition
>>
>> Makes sense to me.
>>
>> But what would then be a reliable way to detect a sched_entity which
>> has slept for long enough to risk an s64 overflow in the .vruntime comparison?
> 
> For now I don't have a better idea than adding the same check in
> migrate_task_rq_fair()

Hi, Vincent,
I fixed this condition as you said, and the test results are as follows.

testcase: hackbench -g 44 -f 20 --process --pipe -l 60000 -s 100
version1: v6.2
version2: v6.2 + commit 829c1651e9c4
version3: v6.2 + commit 829c1651e9c4 + this patch

-------------------------------------------------
	version1	version2	version3
test1	81.0 		118.1 		82.1
test2	82.1 		116.9 		80.3
test3	83.2 		103.9 		83.3
avg(s)	82.1 		113.0 		81.9

-------------------------------------------------
After dealing with the task migration case, the hackbench results are restored.

The patch is as follows; how does this look?

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index ff4dbbae3b10..3a88d20fd29e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4648,6 +4648,26 @@ static void check_spread(struct cfs_rq *cfs_rq, struct sched_entity *se)
 #endif
 }

+static inline u64 sched_sleeper_credit(struct sched_entity *se)
+{
+
+       unsigned long thresh;
+
+       if (se_is_idle(se))
+               thresh = sysctl_sched_min_granularity;
+       else
+               thresh = sysctl_sched_latency;
+
+       /*
+        * Halve their sleep time's effect, to allow
+        * for a gentler effect of sleepers:
+        */
+       if (sched_feat(GENTLE_FAIR_SLEEPERS))
+               thresh >>= 1;
+
+       return thresh;
+}
+
 static void
 place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
 {
@@ -4664,23 +4684,8 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
                vruntime += sched_vslice(cfs_rq, se);

        /* sleeps up to a single latency don't count. */
-       if (!initial) {
-               unsigned long thresh;
-
-               if (se_is_idle(se))
-                       thresh = sysctl_sched_min_granularity;
-               else
-                       thresh = sysctl_sched_latency;
-
-               /*
-                * Halve their sleep time's effect, to allow
-                * for a gentler effect of sleepers:
-                */
-               if (sched_feat(GENTLE_FAIR_SLEEPERS))
-                       thresh >>= 1;
-
-               vruntime -= thresh;
-       }
+       if (!initial)
+               vruntime -= sched_sleeper_credit(se);

        /*
         * Pull vruntime of the entity being placed to the base level of
@@ -4690,7 +4695,7 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
         * inversed due to s64 overflow.
         */
        sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start;
-       if ((s64)sleep_time > 60LL * NSEC_PER_SEC)
+       if (se->exec_start != 0 && (s64)sleep_time > 60LL * NSEC_PER_SEC)
                se->vruntime = vruntime;
        else
                se->vruntime = max_vruntime(se->vruntime, vruntime);
@@ -7634,8 +7639,12 @@ static void migrate_task_rq_fair(struct task_struct *p, int new_cpu)
         */
        if (READ_ONCE(p->__state) == TASK_WAKING) {
                struct cfs_rq *cfs_rq = cfs_rq_of(se);
+               u64 sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start;

-               se->vruntime -= u64_u32_load(cfs_rq->min_vruntime);
+               if ((s64)sleep_time > 60LL * NSEC_PER_SEC)
+                       se->vruntime = -sched_sleeper_credit(se);
+               else
+                       se->vruntime -= u64_u32_load(cfs_rq->min_vruntime);
        }

        if (!task_on_rq_migrating(p)) {
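
As a back-of-the-envelope check (standalone userspace C, illustration only;
the two NICE_0_LOAD values below are assumptions covering the usual 32-bit
and 64-bit configurations), the 60 s cutoff above sits far below the s64
overflow bound that motivated the original patch:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	/* assumed values of NICE_0_LOAD: 1024 (32-bit) and 1 << 20 (64-bit) */
	const uint64_t loads[] = { 1024ULL, 1ULL << 20 };
	const uint64_t cutoff_ns = 60ULL * 1000000000ULL;	/* 60 s */

	for (int i = 0; i < 2; i++) {
		/* sleep length after which the vruntime comparison may overflow */
		uint64_t bound_ns = (1ULL << 63) / loads[i];

		printf("NICE_0_LOAD=%llu: overflow bound ~%.1f hours, cutoff %llu s\n",
		       (unsigned long long)loads[i],
		       bound_ns / 1e9 / 3600.0,
		       (unsigned long long)(cutoff_ns / 1000000000ULL));
	}
	return 0;
}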



Thanks.
Zhang Qiao.

> 
>>
>> Thanks,
>> Roman.
>>
>>
>>
>>
>>
>>
> .
> 
Re: [PATCH v3] sched/fair: sanitize vruntime of entity being placed
Posted by Vincent Guittot 2 years, 6 months ago
On Thu, 2 Mar 2023 at 10:36, Zhang Qiao <zhangqiao22@huawei.com> wrote:
>
>
>
> 在 2023/2/27 22:37, Vincent Guittot 写道:
> > On Mon, 27 Feb 2023 at 09:43, Roman Kagan <rkagan@amazon.de> wrote:
> >>
> >> On Tue, Feb 21, 2023 at 06:26:11PM +0100, Vincent Guittot wrote:
> >>> On Tue, 21 Feb 2023 at 17:57, Roman Kagan <rkagan@amazon.de> wrote:
> >>>> What scares me, though, is that I've got a message from the test robot
> >>>> that this commit drammatically affected hackbench results, see the quote
> >>>> below.  I expected the commit not to affect any benchmarks.
> >>>>
> >>>> Any idea what could have caused this change?
> >>>
> >>> Hmm, It's most probably because se->exec_start is reset after a
> >>> migration and the condition becomes true for newly migrated task
> >>> whereas its vruntime should be after min_vruntime.
> >>>
> >>> We have missed this condition
> >>
> >> Makes sense to me.
> >>
> >> But what would then be the reliable way to detect a sched_entity which
> >> has slept for long and risks overflowing in .vruntime comparison?
> >
> > For now I don't have a better idea than adding the same check in
> > migrate_task_rq_fair()
>
> Hi, Vincent,
> I fixed this condition as you said, and the test results are as follows.
>
> testcase: hackbench -g 44 -f 20 --process --pipe -l 60000 -s 100
> version1: v6.2
> version2: v6.2 + commit 829c1651e9c4
> version3: v6.2 + commit 829c1651e9c4 + this patch
>
> -------------------------------------------------
>         version1        version2        version3
> test1   81.0            118.1           82.1
> test2   82.1            116.9           80.3
> test3   83.2            103.9           83.3
> avg(s)  82.1            113.0           81.9
>
> -------------------------------------------------
> After deal with the task migration case, the hackbench result has restored.
>
> The patch as follow, how does this look?
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index ff4dbbae3b10..3a88d20fd29e 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4648,6 +4648,26 @@ static void check_spread(struct cfs_rq *cfs_rq, struct sched_entity *se)
>  #endif
>  }
>
> +static inline u64 sched_sleeper_credit(struct sched_entity *se)
> +{
> +
> +       unsigned long thresh;
> +
> +       if (se_is_idle(se))
> +               thresh = sysctl_sched_min_granularity;
> +       else
> +               thresh = sysctl_sched_latency;
> +
> +       /*
> +        * Halve their sleep time's effect, to allow
> +        * for a gentler effect of sleepers:
> +        */
> +       if (sched_feat(GENTLE_FAIR_SLEEPERS))
> +               thresh >>= 1;
> +
> +       return thresh;
> +}
> +
>  static void
>  place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
>  {
> @@ -4664,23 +4684,8 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
>                 vruntime += sched_vslice(cfs_rq, se);
>
>         /* sleeps up to a single latency don't count. */
> -       if (!initial) {
> -               unsigned long thresh;
> -
> -               if (se_is_idle(se))
> -                       thresh = sysctl_sched_min_granularity;
> -               else
> -                       thresh = sysctl_sched_latency;
> -
> -               /*
> -                * Halve their sleep time's effect, to allow
> -                * for a gentler effect of sleepers:
> -                */
> -               if (sched_feat(GENTLE_FAIR_SLEEPERS))
> -                       thresh >>= 1;
> -
> -               vruntime -= thresh;
> -       }
> +       if (!initial)
> +               vruntime -= sched_sleeper_credit(se);
>
>         /*
>          * Pull vruntime of the entity being placed to the base level of
> @@ -4690,7 +4695,7 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
>          * inversed due to s64 overflow.
>          */
>         sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start;
> -       if ((s64)sleep_time > 60LL * NSEC_PER_SEC)
> +       if (se->exec_start != 0 && (s64)sleep_time > 60LL * NSEC_PER_SEC)
>                 se->vruntime = vruntime;
>         else
>                 se->vruntime = max_vruntime(se->vruntime, vruntime);
> @@ -7634,8 +7639,12 @@ static void migrate_task_rq_fair(struct task_struct *p, int new_cpu)
>          */
>         if (READ_ONCE(p->__state) == TASK_WAKING) {
>                 struct cfs_rq *cfs_rq = cfs_rq_of(se);
> +               u64 sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start;
>
> -               se->vruntime -= u64_u32_load(cfs_rq->min_vruntime);
> +               if ((s64)sleep_time > 60LL * NSEC_PER_SEC)

You also need to test (se->exec_start != 0) here because the task might
migrate one more time before being scheduled. You should create a
helper function like the one below and use it in both places:

static inline bool entity_long_sleep(struct sched_entity *se)
{
        struct cfs_rq *cfs_rq;
        u64 sleep_time;

        if (se->exec_start == 0)
                return false;

        cfs_rq = cfs_rq_of(se);
        sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start;
        if ((s64)sleep_time > 60LL * NSEC_PER_SEC)
                return true;

        return false;
}
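
For illustration, the helper would then be used along these lines at the two
call sites (a sketch of the intent only, not the final patch; it assumes the
sched_sleeper_credit() helper introduced in the patch above):

	/* in place_entity(): ignore a stale vruntime only for real long sleepers */
	if (entity_long_sleep(se))
		se->vruntime = vruntime;
	else
		se->vruntime = max_vruntime(se->vruntime, vruntime);

	/* in migrate_task_rq_fair(), under the TASK_WAKING check */
	if (entity_long_sleep(se))
		se->vruntime = -sched_sleeper_credit(se);
	else
		se->vruntime -= u64_u32_load(cfs_rq->min_vruntime);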


> +                       se->vruntime = -sched_sleeper_credit(se);
> +               else
> +                       se->vruntime -= u64_u32_load(cfs_rq->min_vruntime);
>         }
>
>         if (!task_on_rq_migrating(p)) {
>
>
>
> Thanks.
> Zhang Qiao.
>
> >
> >>
> >> Thanks,
> >> Roman.
> >>
> >>
> >>
> >>
> >>
> >>
> > .
> >
Re: [PATCH v3] sched/fair: sanitize vruntime of entity being placed
Posted by Zhang Qiao 2 years, 6 months ago

在 2023/3/2 21:34, Vincent Guittot 写道:
> On Thu, 2 Mar 2023 at 10:36, Zhang Qiao <zhangqiao22@huawei.com> wrote:
>>
>>
>>
>> 在 2023/2/27 22:37, Vincent Guittot 写道:
>>> On Mon, 27 Feb 2023 at 09:43, Roman Kagan <rkagan@amazon.de> wrote:
>>>>
>>>> On Tue, Feb 21, 2023 at 06:26:11PM +0100, Vincent Guittot wrote:
>>>>> On Tue, 21 Feb 2023 at 17:57, Roman Kagan <rkagan@amazon.de> wrote:
>>>>>> What scares me, though, is that I've got a message from the test robot
>>>>>> that this commit drammatically affected hackbench results, see the quote
>>>>>> below.  I expected the commit not to affect any benchmarks.
>>>>>>
>>>>>> Any idea what could have caused this change?
>>>>>
>>>>> Hmm, It's most probably because se->exec_start is reset after a
>>>>> migration and the condition becomes true for newly migrated task
>>>>> whereas its vruntime should be after min_vruntime.
>>>>>
>>>>> We have missed this condition
>>>>
>>>> Makes sense to me.
>>>>
>>>> But what would then be the reliable way to detect a sched_entity which
>>>> has slept for long and risks overflowing in .vruntime comparison?
>>>
>>> For now I don't have a better idea than adding the same check in
>>> migrate_task_rq_fair()
>>
>> Hi, Vincent,
>> I fixed this condition as you said, and the test results are as follows.
>>
>> testcase: hackbench -g 44 -f 20 --process --pipe -l 60000 -s 100
>> version1: v6.2
>> version2: v6.2 + commit 829c1651e9c4
>> version3: v6.2 + commit 829c1651e9c4 + this patch
>>
>> -------------------------------------------------
>>         version1        version2        version3
>> test1   81.0            118.1           82.1
>> test2   82.1            116.9           80.3
>> test3   83.2            103.9           83.3
>> avg(s)  82.1            113.0           81.9
>>
>> -------------------------------------------------
>> After deal with the task migration case, the hackbench result has restored.
>>
>> The patch as follow, how does this look?
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index ff4dbbae3b10..3a88d20fd29e 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -4648,6 +4648,26 @@ static void check_spread(struct cfs_rq *cfs_rq, struct sched_entity *se)
>>  #endif
>>  }
>>
>> +static inline u64 sched_sleeper_credit(struct sched_entity *se)
>> +{
>> +
>> +       unsigned long thresh;
>> +
>> +       if (se_is_idle(se))
>> +               thresh = sysctl_sched_min_granularity;
>> +       else
>> +               thresh = sysctl_sched_latency;
>> +
>> +       /*
>> +        * Halve their sleep time's effect, to allow
>> +        * for a gentler effect of sleepers:
>> +        */
>> +       if (sched_feat(GENTLE_FAIR_SLEEPERS))
>> +               thresh >>= 1;
>> +
>> +       return thresh;
>> +}
>> +
>>  static void
>>  place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
>>  {
>> @@ -4664,23 +4684,8 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
>>                 vruntime += sched_vslice(cfs_rq, se);
>>
>>         /* sleeps up to a single latency don't count. */
>> -       if (!initial) {
>> -               unsigned long thresh;
>> -
>> -               if (se_is_idle(se))
>> -                       thresh = sysctl_sched_min_granularity;
>> -               else
>> -                       thresh = sysctl_sched_latency;
>> -
>> -               /*
>> -                * Halve their sleep time's effect, to allow
>> -                * for a gentler effect of sleepers:
>> -                */
>> -               if (sched_feat(GENTLE_FAIR_SLEEPERS))
>> -                       thresh >>= 1;
>> -
>> -               vruntime -= thresh;
>> -       }
>> +       if (!initial)
>> +               vruntime -= sched_sleeper_credit(se);
>>
>>         /*
>>          * Pull vruntime of the entity being placed to the base level of
>> @@ -4690,7 +4695,7 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
>>          * inversed due to s64 overflow.
>>          */
>>         sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start;
>> -       if ((s64)sleep_time > 60LL * NSEC_PER_SEC)
>> +       if (se->exec_start != 0 && (s64)sleep_time > 60LL * NSEC_PER_SEC)
>>                 se->vruntime = vruntime;
>>         else
>>                 se->vruntime = max_vruntime(se->vruntime, vruntime);
>> @@ -7634,8 +7639,12 @@ static void migrate_task_rq_fair(struct task_struct *p, int new_cpu)
>>          */
>>         if (READ_ONCE(p->__state) == TASK_WAKING) {
>>                 struct cfs_rq *cfs_rq = cfs_rq_of(se);
>> +               u64 sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start;
>>
>> -               se->vruntime -= u64_u32_load(cfs_rq->min_vruntime);
>> +               if ((s64)sleep_time > 60LL * NSEC_PER_SEC)
> 
> You also need to test (se->exec_start !=0) here because the task might

Hi,

I don't understand when the other migration would happen. Could you explain it in more detail?

I think the next migration will happen after the wakee task has been enqueued, but at that point
p->__state isn't TASK_WAKING any more; it has already been changed to TASK_RUNNING in ttwu_do_wakeup().

If such a migration exists, the previous code "se->vruntime -= u64_u32_load(cfs_rq->min_vruntime);" might be
performed multiple times; wouldn't that go wrong?

> migrate another time before being scheduled. You should create a
> helper function like below and use it in both place

Ok, I will update at next version.


Thanks,
ZhangQiao.

>
> static inline bool entity_long_sleep(se)
> {
>         struct cfs_rq *cfs_rq;
>         u64 sleep_time;
> 
>         if (se->exec_start == 0)
>                 return false;
> 
>         cfs_rq = cfs_rq_of(se);
>         sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start;
>         if ((s64)sleep_time > 60LL * NSEC_PER_SEC)
>                 return true;
> 
>         return false;
> }
> 
> 
>> +                       se->vruntime = -sched_sleeper_credit(se);
>> +               else
>> +                       se->vruntime -= u64_u32_load(cfs_rq->min_vruntime);
>>         }
>>
>>         if (!task_on_rq_migrating(p)) {
>>
>>
>>
>> Thanks.
>> Zhang Qiao.
>>
>>>
>>>>
>>>> Thanks,
>>>> Roman.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>> .
>>>
> .
> 
Re: [PATCH v3] sched/fair: sanitize vruntime of entity being placed
Posted by Vincent Guittot 2 years, 6 months ago
On Thu, 2 Mar 2023 at 15:29, Zhang Qiao <zhangqiao22@huawei.com> wrote:
>
>
>
> 在 2023/3/2 21:34, Vincent Guittot 写道:
> > On Thu, 2 Mar 2023 at 10:36, Zhang Qiao <zhangqiao22@huawei.com> wrote:
> >>
> >>
> >>
> >> 在 2023/2/27 22:37, Vincent Guittot 写道:
> >>> On Mon, 27 Feb 2023 at 09:43, Roman Kagan <rkagan@amazon.de> wrote:
> >>>>
> >>>> On Tue, Feb 21, 2023 at 06:26:11PM +0100, Vincent Guittot wrote:
> >>>>> On Tue, 21 Feb 2023 at 17:57, Roman Kagan <rkagan@amazon.de> wrote:
> >>>>>> What scares me, though, is that I've got a message from the test robot
> >>>>>> that this commit drammatically affected hackbench results, see the quote
> >>>>>> below.  I expected the commit not to affect any benchmarks.
> >>>>>>
> >>>>>> Any idea what could have caused this change?
> >>>>>
> >>>>> Hmm, It's most probably because se->exec_start is reset after a
> >>>>> migration and the condition becomes true for newly migrated task
> >>>>> whereas its vruntime should be after min_vruntime.
> >>>>>
> >>>>> We have missed this condition
> >>>>
> >>>> Makes sense to me.
> >>>>
> >>>> But what would then be the reliable way to detect a sched_entity which
> >>>> has slept for long and risks overflowing in .vruntime comparison?
> >>>
> >>> For now I don't have a better idea than adding the same check in
> >>> migrate_task_rq_fair()
> >>
> >> Hi, Vincent,
> >> I fixed this condition as you said, and the test results are as follows.
> >>
> >> testcase: hackbench -g 44 -f 20 --process --pipe -l 60000 -s 100
> >> version1: v6.2
> >> version2: v6.2 + commit 829c1651e9c4
> >> version3: v6.2 + commit 829c1651e9c4 + this patch
> >>
> >> -------------------------------------------------
> >>         version1        version2        version3
> >> test1   81.0            118.1           82.1
> >> test2   82.1            116.9           80.3
> >> test3   83.2            103.9           83.3
> >> avg(s)  82.1            113.0           81.9
> >>
> >> -------------------------------------------------
> >> After deal with the task migration case, the hackbench result has restored.
> >>
> >> The patch as follow, how does this look?
> >>
> >> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> >> index ff4dbbae3b10..3a88d20fd29e 100644
> >> --- a/kernel/sched/fair.c
> >> +++ b/kernel/sched/fair.c
> >> @@ -4648,6 +4648,26 @@ static void check_spread(struct cfs_rq *cfs_rq, struct sched_entity *se)
> >>  #endif
> >>  }
> >>
> >> +static inline u64 sched_sleeper_credit(struct sched_entity *se)
> >> +{
> >> +
> >> +       unsigned long thresh;
> >> +
> >> +       if (se_is_idle(se))
> >> +               thresh = sysctl_sched_min_granularity;
> >> +       else
> >> +               thresh = sysctl_sched_latency;
> >> +
> >> +       /*
> >> +        * Halve their sleep time's effect, to allow
> >> +        * for a gentler effect of sleepers:
> >> +        */
> >> +       if (sched_feat(GENTLE_FAIR_SLEEPERS))
> >> +               thresh >>= 1;
> >> +
> >> +       return thresh;
> >> +}
> >> +
> >>  static void
> >>  place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
> >>  {
> >> @@ -4664,23 +4684,8 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
> >>                 vruntime += sched_vslice(cfs_rq, se);
> >>
> >>         /* sleeps up to a single latency don't count. */
> >> -       if (!initial) {
> >> -               unsigned long thresh;
> >> -
> >> -               if (se_is_idle(se))
> >> -                       thresh = sysctl_sched_min_granularity;
> >> -               else
> >> -                       thresh = sysctl_sched_latency;
> >> -
> >> -               /*
> >> -                * Halve their sleep time's effect, to allow
> >> -                * for a gentler effect of sleepers:
> >> -                */
> >> -               if (sched_feat(GENTLE_FAIR_SLEEPERS))
> >> -                       thresh >>= 1;
> >> -
> >> -               vruntime -= thresh;
> >> -       }
> >> +       if (!initial)
> >> +               vruntime -= sched_sleeper_credit(se);
> >>
> >>         /*
> >>          * Pull vruntime of the entity being placed to the base level of
> >> @@ -4690,7 +4695,7 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
> >>          * inversed due to s64 overflow.
> >>          */
> >>         sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start;
> >> -       if ((s64)sleep_time > 60LL * NSEC_PER_SEC)
> >> +       if (se->exec_start != 0 && (s64)sleep_time > 60LL * NSEC_PER_SEC)
> >>                 se->vruntime = vruntime;
> >>         else
> >>                 se->vruntime = max_vruntime(se->vruntime, vruntime);
> >> @@ -7634,8 +7639,12 @@ static void migrate_task_rq_fair(struct task_struct *p, int new_cpu)
> >>          */
> >>         if (READ_ONCE(p->__state) == TASK_WAKING) {
> >>                 struct cfs_rq *cfs_rq = cfs_rq_of(se);
> >> +               u64 sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start;
> >>
> >> -               se->vruntime -= u64_u32_load(cfs_rq->min_vruntime);
> >> +               if ((s64)sleep_time > 60LL * NSEC_PER_SEC)
> >
> > You also need to test (se->exec_start !=0) here because the task might
>
> Hi,
>
> I don't understand when the another migration happend. Could you tell me in more detail?

se->exec_start is updated when the task becomes current.

You can have the sequence:

task TA runs on CPU0
    TA's se->exec_start = xxxx
TA is put back into the rb tree waiting for next slice while another
task is running
CPU1 pulls TA which migrates on CPU1
    migrate_task_rq_fair() w/ TA's se->exec_start == xxxx
        TA's se->exec_start = 0
TA is put into the rb tree of CPU1 waiting to run on CPU1
CPU2 pulls TA which migrates on CPU2
    migrate_task_rq_fair() w/ TA's se->exec_start == 0
        TA's se->exec_start = 0
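
For context, the tail of migrate_task_rq_fair() looks roughly like this
(simplified sketch, vruntime and PELT handling elided); the exec_start reset
is what makes a freshly migrated task look like a long sleeper to an
unguarded check:

static void migrate_task_rq_fair(struct task_struct *p, int new_cpu)
{
	struct sched_entity *se = &p->se;

	/* ... vruntime and PELT handling elided ... */

	/* Tell the new CPU we are migrated */
	se->avg.last_update_time = 0;

	/* We have migrated, no longer consider this task hot */
	se->exec_start = 0;

	update_scan_period(p, new_cpu);
}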

>
> I think the next migration will happend after the wakee task enqueued, but at this time
> the p->__state isn't TASK_WAKING, p->__state already be changed to TASK_RUNNING at ttwu_do_wakeup().
>
> If such a migration exists, Previous code "se->vruntime -= u64_u32_load(cfs_rq->min_vruntime);" maybe
> perform multiple times,wouldn't it go wrong in this way?

the vruntime has been updated at enqueue, but exec_start has not
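
Roughly, the renormalisation on enqueue works like this (condensed from
enqueue_entity(); treat it as a sketch), which is why the min_vruntime
subtracted at migration is added back while exec_start stays at 0 until the
task actually runs:

	bool renorm = !(flags & ENQUEUE_WAKEUP) || (flags & ENQUEUE_MIGRATED);
	bool curr = cfs_rq->curr == se;

	/* the current entity must be renormalised before update_curr() */
	if (renorm && curr)
		se->vruntime += cfs_rq->min_vruntime;

	update_curr(cfs_rq);

	if (renorm && !curr)
		se->vruntime += cfs_rq->min_vruntime;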

>
> > migrate another time before being scheduled. You should create a
> > helper function like below and use it in both place
>
> Ok, I will update at next version.
>
>
> Thanks,
> ZhangQiao.
>
> >
> > static inline bool entity_long_sleep(se)
> > {
> >         struct cfs_rq *cfs_rq;
> >         u64 sleep_time;
> >
> >         if (se->exec_start == 0)
> >                 return false;
> >
> >         cfs_rq = cfs_rq_of(se);
> >         sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start;
> >         if ((s64)sleep_time > 60LL * NSEC_PER_SEC)
> >                 return true;
> >
> >         return false;
> > }
> >
> >
> >> +                       se->vruntime = -sched_sleeper_credit(se);
> >> +               else
> >> +                       se->vruntime -= u64_u32_load(cfs_rq->min_vruntime);
> >>         }
> >>
> >>         if (!task_on_rq_migrating(p)) {
> >>
> >>
> >>
> >> Thanks.
> >> Zhang Qiao.
> >>
> >>>
> >>>>
> >>>> Thanks,
> >>>> Roman.
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>> .
> >>>
> > .
> >
Re: [PATCH v3] sched/fair: sanitize vruntime of entity being placed
Posted by Zhang Qiao 2 years, 6 months ago

在 2023/3/2 22:55, Vincent Guittot 写道:
> On Thu, 2 Mar 2023 at 15:29, Zhang Qiao <zhangqiao22@huawei.com> wrote:
>>
>>
>>
>> 在 2023/3/2 21:34, Vincent Guittot 写道:
>>> On Thu, 2 Mar 2023 at 10:36, Zhang Qiao <zhangqiao22@huawei.com> wrote:
>>>>
>>>>
>>>>
>>>> 在 2023/2/27 22:37, Vincent Guittot 写道:
>>>>> On Mon, 27 Feb 2023 at 09:43, Roman Kagan <rkagan@amazon.de> wrote:
>>>>>>
>>>>>> On Tue, Feb 21, 2023 at 06:26:11PM +0100, Vincent Guittot wrote:
>>>>>>> On Tue, 21 Feb 2023 at 17:57, Roman Kagan <rkagan@amazon.de> wrote:
>>>>>>>> What scares me, though, is that I've got a message from the test robot
>>>>>>>> that this commit drammatically affected hackbench results, see the quote
>>>>>>>> below.  I expected the commit not to affect any benchmarks.
>>>>>>>>
>>>>>>>> Any idea what could have caused this change?
>>>>>>>
>>>>>>> Hmm, It's most probably because se->exec_start is reset after a
>>>>>>> migration and the condition becomes true for newly migrated task
>>>>>>> whereas its vruntime should be after min_vruntime.
>>>>>>>
>>>>>>> We have missed this condition
>>>>>>
>>>>>> Makes sense to me.
>>>>>>
>>>>>> But what would then be the reliable way to detect a sched_entity which
>>>>>> has slept for long and risks overflowing in .vruntime comparison?
>>>>>
>>>>> For now I don't have a better idea than adding the same check in
>>>>> migrate_task_rq_fair()
>>>>
>>>> Hi, Vincent,
>>>> I fixed this condition as you said, and the test results are as follows.
>>>>
>>>> testcase: hackbench -g 44 -f 20 --process --pipe -l 60000 -s 100
>>>> version1: v6.2
>>>> version2: v6.2 + commit 829c1651e9c4
>>>> version3: v6.2 + commit 829c1651e9c4 + this patch
>>>>
>>>> -------------------------------------------------
>>>>         version1        version2        version3
>>>> test1   81.0            118.1           82.1
>>>> test2   82.1            116.9           80.3
>>>> test3   83.2            103.9           83.3
>>>> avg(s)  82.1            113.0           81.9
>>>>
>>>> -------------------------------------------------
>>>> After deal with the task migration case, the hackbench result has restored.
>>>>
>>>> The patch as follow, how does this look?
>>>>
>>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>>> index ff4dbbae3b10..3a88d20fd29e 100644
>>>> --- a/kernel/sched/fair.c
>>>> +++ b/kernel/sched/fair.c
>>>> @@ -4648,6 +4648,26 @@ static void check_spread(struct cfs_rq *cfs_rq, struct sched_entity *se)
>>>>  #endif
>>>>  }
>>>>
>>>> +static inline u64 sched_sleeper_credit(struct sched_entity *se)
>>>> +{
>>>> +
>>>> +       unsigned long thresh;
>>>> +
>>>> +       if (se_is_idle(se))
>>>> +               thresh = sysctl_sched_min_granularity;
>>>> +       else
>>>> +               thresh = sysctl_sched_latency;
>>>> +
>>>> +       /*
>>>> +        * Halve their sleep time's effect, to allow
>>>> +        * for a gentler effect of sleepers:
>>>> +        */
>>>> +       if (sched_feat(GENTLE_FAIR_SLEEPERS))
>>>> +               thresh >>= 1;
>>>> +
>>>> +       return thresh;
>>>> +}
>>>> +
>>>>  static void
>>>>  place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
>>>>  {
>>>> @@ -4664,23 +4684,8 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
>>>>                 vruntime += sched_vslice(cfs_rq, se);
>>>>
>>>>         /* sleeps up to a single latency don't count. */
>>>> -       if (!initial) {
>>>> -               unsigned long thresh;
>>>> -
>>>> -               if (se_is_idle(se))
>>>> -                       thresh = sysctl_sched_min_granularity;
>>>> -               else
>>>> -                       thresh = sysctl_sched_latency;
>>>> -
>>>> -               /*
>>>> -                * Halve their sleep time's effect, to allow
>>>> -                * for a gentler effect of sleepers:
>>>> -                */
>>>> -               if (sched_feat(GENTLE_FAIR_SLEEPERS))
>>>> -                       thresh >>= 1;
>>>> -
>>>> -               vruntime -= thresh;
>>>> -       }
>>>> +       if (!initial)
>>>> +               vruntime -= sched_sleeper_credit(se);
>>>>
>>>>         /*
>>>>          * Pull vruntime of the entity being placed to the base level of
>>>> @@ -4690,7 +4695,7 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
>>>>          * inversed due to s64 overflow.
>>>>          */
>>>>         sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start;
>>>> -       if ((s64)sleep_time > 60LL * NSEC_PER_SEC)
>>>> +       if (se->exec_start != 0 && (s64)sleep_time > 60LL * NSEC_PER_SEC)
>>>>                 se->vruntime = vruntime;
>>>>         else
>>>>                 se->vruntime = max_vruntime(se->vruntime, vruntime);
>>>> @@ -7634,8 +7639,12 @@ static void migrate_task_rq_fair(struct task_struct *p, int new_cpu)
>>>>          */
>>>>         if (READ_ONCE(p->__state) == TASK_WAKING) {
>>>>                 struct cfs_rq *cfs_rq = cfs_rq_of(se);
>>>> +               u64 sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start;
>>>>
>>>> -               se->vruntime -= u64_u32_load(cfs_rq->min_vruntime);
>>>> +               if ((s64)sleep_time > 60LL * NSEC_PER_SEC)
>>>
>>> You also need to test (se->exec_start !=0) here because the task might
>>
>> Hi,
>>
>> I don't understand when the another migration happend. Could you tell me in more detail?
> 
> se->exec_start is update when the task becomes current.
> 
> You can have the sequence:
> 
> task TA runs on CPU0
>     TA's se->exec_start = xxxx
> TA is put back into the rb tree waiting for next slice while another
> task is running
> CPU1 pulls TA which migrates on CPU1
>     migrate_task_rq_fair() w/ TA's se->exec_start == xxxx
>         TA's se->exec_start = 0
> TA is put into the rb tree of CPU1 waiting to run on CPU1
> CPU2 pulls TA which migrates on CPU2
>     migrate_task_rq_fair() w/ TA's se->exec_start == 0
>         TA's se->exec_start = 0

Hi, Vincent,

Yes, you're right, such a sequence does exist. But at that point, p->__state != TASK_WAKING.

I have a question: is there any case in which "p->se.exec_start == 0 && p->__state == TASK_WAKING" holds?
I analyzed the code and concluded that this case doesn't exist; is that right?

Thanks.
ZhangQiao.

> 
>>
>> I think the next migration will happend after the wakee task enqueued, but at this time
>> the p->__state isn't TASK_WAKING, p->__state already be changed to TASK_RUNNING at ttwu_do_wakeup().
>>
>> If such a migration exists, Previous code "se->vruntime -= u64_u32_load(cfs_rq->min_vruntime);" maybe
>> perform multiple times,wouldn't it go wrong in this way?
> 
> the vruntime have been updated when enqueued but not exec_start
> 
>>
>>> migrate another time before being scheduled. You should create a
>>> helper function like below and use it in both place
>>
>> Ok, I will update at next version.
>>
>>
>> Thanks,
>> ZhangQiao.
>>
>>>
>>> static inline bool entity_long_sleep(se)
>>> {
>>>         struct cfs_rq *cfs_rq;
>>>         u64 sleep_time;
>>>
>>>         if (se->exec_start == 0)
>>>                 return false;
>>>
>>>         cfs_rq = cfs_rq_of(se);
>>>         sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start;
>>>         if ((s64)sleep_time > 60LL * NSEC_PER_SEC)
>>>                 return true;
>>>
>>>         return false;
>>> }
>>>
>>>
>>>> +                       se->vruntime = -sched_sleeper_credit(se);
>>>> +               else
>>>> +                       se->vruntime -= u64_u32_load(cfs_rq->min_vruntime);
>>>>         }
>>>>
>>>>         if (!task_on_rq_migrating(p)) {
>>>>
>>>>
>>>>
>>>> Thanks.
>>>> Zhang Qiao.
>>>>
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Roman.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>> .
>>>>>
>>> .
>>>
> .
> 
Re: [PATCH v3] sched/fair: sanitize vruntime of entity being placed
Posted by Vincent Guittot 2 years, 6 months ago
On Fri, 3 Mar 2023 at 07:51, Zhang Qiao <zhangqiao22@huawei.com> wrote:
>
>
>
> 在 2023/3/2 22:55, Vincent Guittot 写道:
> > On Thu, 2 Mar 2023 at 15:29, Zhang Qiao <zhangqiao22@huawei.com> wrote:
> >>
> >>
> >>
> >> 在 2023/3/2 21:34, Vincent Guittot 写道:
> >>> On Thu, 2 Mar 2023 at 10:36, Zhang Qiao <zhangqiao22@huawei.com> wrote:
> >>>>
> >>>>
> >>>>
> >>>> 在 2023/2/27 22:37, Vincent Guittot 写道:
> >>>>> On Mon, 27 Feb 2023 at 09:43, Roman Kagan <rkagan@amazon.de> wrote:
> >>>>>>
> >>>>>> On Tue, Feb 21, 2023 at 06:26:11PM +0100, Vincent Guittot wrote:
> >>>>>>> On Tue, 21 Feb 2023 at 17:57, Roman Kagan <rkagan@amazon.de> wrote:
> >>>>>>>> What scares me, though, is that I've got a message from the test robot
> >>>>>>>> that this commit drammatically affected hackbench results, see the quote
> >>>>>>>> below.  I expected the commit not to affect any benchmarks.
> >>>>>>>>
> >>>>>>>> Any idea what could have caused this change?
> >>>>>>>
> >>>>>>> Hmm, It's most probably because se->exec_start is reset after a
> >>>>>>> migration and the condition becomes true for newly migrated task
> >>>>>>> whereas its vruntime should be after min_vruntime.
> >>>>>>>
> >>>>>>> We have missed this condition
> >>>>>>
> >>>>>> Makes sense to me.
> >>>>>>
> >>>>>> But what would then be the reliable way to detect a sched_entity which
> >>>>>> has slept for long and risks overflowing in .vruntime comparison?
> >>>>>
> >>>>> For now I don't have a better idea than adding the same check in
> >>>>> migrate_task_rq_fair()
> >>>>
> >>>> Hi, Vincent,
> >>>> I fixed this condition as you said, and the test results are as follows.
> >>>>
> >>>> testcase: hackbench -g 44 -f 20 --process --pipe -l 60000 -s 100
> >>>> version1: v6.2
> >>>> version2: v6.2 + commit 829c1651e9c4
> >>>> version3: v6.2 + commit 829c1651e9c4 + this patch
> >>>>
> >>>> -------------------------------------------------
> >>>>         version1        version2        version3
> >>>> test1   81.0            118.1           82.1
> >>>> test2   82.1            116.9           80.3
> >>>> test3   83.2            103.9           83.3
> >>>> avg(s)  82.1            113.0           81.9
> >>>>
> >>>> -------------------------------------------------
> >>>> After deal with the task migration case, the hackbench result has restored.
> >>>>
> >>>> The patch as follow, how does this look?
> >>>>
> >>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> >>>> index ff4dbbae3b10..3a88d20fd29e 100644
> >>>> --- a/kernel/sched/fair.c
> >>>> +++ b/kernel/sched/fair.c
> >>>> @@ -4648,6 +4648,26 @@ static void check_spread(struct cfs_rq *cfs_rq, struct sched_entity *se)
> >>>>  #endif
> >>>>  }
> >>>>
> >>>> +static inline u64 sched_sleeper_credit(struct sched_entity *se)
> >>>> +{
> >>>> +
> >>>> +       unsigned long thresh;
> >>>> +
> >>>> +       if (se_is_idle(se))
> >>>> +               thresh = sysctl_sched_min_granularity;
> >>>> +       else
> >>>> +               thresh = sysctl_sched_latency;
> >>>> +
> >>>> +       /*
> >>>> +        * Halve their sleep time's effect, to allow
> >>>> +        * for a gentler effect of sleepers:
> >>>> +        */
> >>>> +       if (sched_feat(GENTLE_FAIR_SLEEPERS))
> >>>> +               thresh >>= 1;
> >>>> +
> >>>> +       return thresh;
> >>>> +}
> >>>> +
> >>>>  static void
> >>>>  place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
> >>>>  {
> >>>> @@ -4664,23 +4684,8 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
> >>>>                 vruntime += sched_vslice(cfs_rq, se);
> >>>>
> >>>>         /* sleeps up to a single latency don't count. */
> >>>> -       if (!initial) {
> >>>> -               unsigned long thresh;
> >>>> -
> >>>> -               if (se_is_idle(se))
> >>>> -                       thresh = sysctl_sched_min_granularity;
> >>>> -               else
> >>>> -                       thresh = sysctl_sched_latency;
> >>>> -
> >>>> -               /*
> >>>> -                * Halve their sleep time's effect, to allow
> >>>> -                * for a gentler effect of sleepers:
> >>>> -                */
> >>>> -               if (sched_feat(GENTLE_FAIR_SLEEPERS))
> >>>> -                       thresh >>= 1;
> >>>> -
> >>>> -               vruntime -= thresh;
> >>>> -       }
> >>>> +       if (!initial)
> >>>> +               vruntime -= sched_sleeper_credit(se);
> >>>>
> >>>>         /*
> >>>>          * Pull vruntime of the entity being placed to the base level of
> >>>> @@ -4690,7 +4695,7 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
> >>>>          * inversed due to s64 overflow.
> >>>>          */
> >>>>         sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start;
> >>>> -       if ((s64)sleep_time > 60LL * NSEC_PER_SEC)
> >>>> +       if (se->exec_start != 0 && (s64)sleep_time > 60LL * NSEC_PER_SEC)
> >>>>                 se->vruntime = vruntime;
> >>>>         else
> >>>>                 se->vruntime = max_vruntime(se->vruntime, vruntime);
> >>>> @@ -7634,8 +7639,12 @@ static void migrate_task_rq_fair(struct task_struct *p, int new_cpu)
> >>>>          */
> >>>>         if (READ_ONCE(p->__state) == TASK_WAKING) {
> >>>>                 struct cfs_rq *cfs_rq = cfs_rq_of(se);
> >>>> +               u64 sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start;
> >>>>
> >>>> -               se->vruntime -= u64_u32_load(cfs_rq->min_vruntime);
> >>>> +               if ((s64)sleep_time > 60LL * NSEC_PER_SEC)
> >>>
> >>> You also need to test (se->exec_start !=0) here because the task might
> >>
> >> Hi,
> >>
> >> I don't understand when the another migration happend. Could you tell me in more detail?
> >
> > se->exec_start is update when the task becomes current.
> >
> > You can have the sequence:
> >
> > task TA runs on CPU0
> >     TA's se->exec_start = xxxx
> > TA is put back into the rb tree waiting for next slice while another
> > task is running
> > CPU1 pulls TA which migrates on CPU1
> >     migrate_task_rq_fair() w/ TA's se->exec_start == xxxx
> >         TA's se->exec_start = 0
> > TA is put into the rb tree of CPU1 waiting to run on CPU1
> > CPU2 pulls TA which migrates on CPU2
> >     migrate_task_rq_fair() w/ TA's se->exec_start == 0
> >         TA's se->exec_start = 0
> Hi, Vincent,
>
> yes, you're right, such sequence does exist. But at this point, p->__state != TASK_WAKING.
>
> I have a question, Whether there is case that is "p->se.exec_start == 0 && p->__state == TASK_WAKING" ?
> I analyzed the code and concluded that this case isn't existed, is it right?

Yes, you're right. Your proposal is enough.

Thanks

>
> Thanks.
> ZhangQiao.
>
> >
> >>
> >> I think the next migration will happend after the wakee task enqueued, but at this time
> >> the p->__state isn't TASK_WAKING, p->__state already be changed to TASK_RUNNING at ttwu_do_wakeup().
> >>
> >> If such a migration exists, Previous code "se->vruntime -= u64_u32_load(cfs_rq->min_vruntime);" maybe
> >> perform multiple times,wouldn't it go wrong in this way?
> >
> > the vruntime have been updated when enqueued but not exec_start
> >
> >>
> >>> migrate another time before being scheduled. You should create a
> >>> helper function like below and use it in both place
> >>
> >> Ok, I will update at next version.
> >>
> >>
> >> Thanks,
> >> ZhangQiao.
> >>
> >>>
> >>> static inline bool entity_long_sleep(se)
> >>> {
> >>>         struct cfs_rq *cfs_rq;
> >>>         u64 sleep_time;
> >>>
> >>>         if (se->exec_start == 0)
> >>>                 return false;
> >>>
> >>>         cfs_rq = cfs_rq_of(se);
> >>>         sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start;
> >>>         if ((s64)sleep_time > 60LL * NSEC_PER_SEC)
> >>>                 return true;
> >>>
> >>>         return false;
> >>> }
> >>>
> >>>
> >>>> +                       se->vruntime = -sched_sleeper_credit(se);
> >>>> +               else
> >>>> +                       se->vruntime -= u64_u32_load(cfs_rq->min_vruntime);
> >>>>         }
> >>>>
> >>>>         if (!task_on_rq_migrating(p)) {
> >>>>
> >>>>
> >>>>
> >>>> Thanks.
> >>>> Zhang Qiao.
> >>>>
> >>>>>
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Roman.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>> .
> >>>>>
> >>> .
> >>>
> > .
> >
Re: [PATCH v3] sched/fair: sanitize vruntime of entity being placed
Posted by Dietmar Eggemann 2 years, 6 months ago
On 27/02/2023 15:37, Vincent Guittot wrote:
> On Mon, 27 Feb 2023 at 09:43, Roman Kagan <rkagan@amazon.de> wrote:
>>
>> On Tue, Feb 21, 2023 at 06:26:11PM +0100, Vincent Guittot wrote:
>>> On Tue, 21 Feb 2023 at 17:57, Roman Kagan <rkagan@amazon.de> wrote:
>>>> What scares me, though, is that I've got a message from the test robot
>>>> that this commit drammatically affected hackbench results, see the quote
>>>> below.  I expected the commit not to affect any benchmarks.
>>>>
>>>> Any idea what could have caused this change?
>>>
>>> Hmm, It's most probably because se->exec_start is reset after a
>>> migration and the condition becomes true for newly migrated task
>>> whereas its vruntime should be after min_vruntime.
>>>
>>> We have missed this condition
>>
>> Makes sense to me.
>>
>> But what would then be the reliable way to detect a sched_entity which
>> has slept for long and risks overflowing in .vruntime comparison?
> 
> For now I don't have a better idea than adding the same check in
> migrate_task_rq_fair()

Don't we have the issue that the rq clock might not be up to date in
migrate? No rq lock is held in the `!task_on_rq_migrating(p)` case.

Also, deferring `se->exec_start = 0` from `migrate` into `enqueue ->
place_entity` doesn't seem to work, since the rq clocks of different CPUs
are not in sync.
Re: [PATCH v3] sched/fair: sanitize vruntime of entity being placed
Posted by Vincent Guittot 2 years, 6 months ago
On Mon, 27 Feb 2023 at 18:00, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
>
> On 27/02/2023 15:37, Vincent Guittot wrote:
> > On Mon, 27 Feb 2023 at 09:43, Roman Kagan <rkagan@amazon.de> wrote:
> >>
> >> On Tue, Feb 21, 2023 at 06:26:11PM +0100, Vincent Guittot wrote:
> >>> On Tue, 21 Feb 2023 at 17:57, Roman Kagan <rkagan@amazon.de> wrote:
> >>>> What scares me, though, is that I've got a message from the test robot
> >>>> that this commit drammatically affected hackbench results, see the quote
> >>>> below.  I expected the commit not to affect any benchmarks.
> >>>>
> >>>> Any idea what could have caused this change?
> >>>
> >>> Hmm, It's most probably because se->exec_start is reset after a
> >>> migration and the condition becomes true for newly migrated task
> >>> whereas its vruntime should be after min_vruntime.
> >>>
> >>> We have missed this condition
> >>
> >> Makes sense to me.
> >>
> >> But what would then be the reliable way to detect a sched_entity which
> >> has slept for long and risks overflowing in .vruntime comparison?
> >
> > For now I don't have a better idea than adding the same check in
> > migrate_task_rq_fair()
>
> Don't we have the issue that we could have a non-up-to-date rq clock in
> migrate? No rq lock held in `!task_on_rq_migrating(p)`.

Yes, the rq clock may not be up to date, but that would also mean that
the cfs_rq was idle and, as a result, its min_vruntime has not moved
forward, so there is no risk of overflow.

>
> Also deferring `se->exec_start = 0` from `migrate` into `enqueue ->
> place entity` doesn't seem to work since the rq clocks of different CPUs
> are not in sync.

yes

>