[PATCH v2 00/10] sched/psi: some optimization and extension

Chengming Zhou posted 10 patches 3 years, 8 months ago
There is a newer version of this series
[PATCH v2 00/10] sched/psi: some optimization and extension
Posted by Chengming Zhou 3 years, 8 months ago
Hi all,

This patch series contains some optimizations and extensions for PSI,
based on the tip/sched/core branch.

Patch 1/10 fixes the periodic aggregation shut-off problem introduced by the
earlier commit 4117cebf1a9f ("psi: Optimize task switch inside shared cgroups").

Patch 2/10 optimizes task switches inside shared cgroups when the in_memstall
status of the prev task and the next task differ.

Patches 3-4 optimize and simplify PSI status tracking by not changing a task's
psi_flags when it migrates between CPUs or cgroups.

Patch 7/10 removes NR_ONCPU task accounting to save 4 bytes in the first
cacheline. The following patch 8/10 uses that space to introduce a new
PSI resource, PSI_IRQ, which tracks IRQ/SOFTIRQ pressure stall information.

Patch 9/10 introduces a per-cgroup interface "cgroup.psi" to disable
or re-enable PSI stats accounting at the cgroup level.
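
As an illustration of how userspace might drive such a control file, here is a
minimal C sketch. It assumes the file accepts "0" to disable and "1" to
re-enable accounting, which the cover letter does not spell out; the helper
name psi_set_enabled() is made up for this example, and the control file path
is passed in by the caller rather than assumed.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: write "1" or "0" to a PSI control file such as
 * <cgroup dir>/cgroup.psi. The 0/1 values are an assumption, not taken
 * from the cover letter.
 * Returns 0 on success, -1 if the file cannot be opened or written.
 */
static int psi_set_enabled(const char *control_file, int enable)
{
	FILE *f;
	int ok;

	assert(control_file != NULL);

	f = fopen(control_file, "w");
	if (!f)
		return -1;

	ok = fputs(enable ? "1\n" : "0\n", f) >= 0;
	/* fclose() flushes the buffer, so a failure here is a write failure. */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

A caller would pass the full path of the control file inside the target cgroup
directory and check the return value, since the write fails if the file is
absent or not writable.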

Patch 10/10 caches the parent psi_group in struct psi_group to speed up
the hot iteration path.
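
The idea behind caching the parent can be sketched in plain C as follows. This
is a simplified userspace model, not the kernel code: struct psi_group_model,
its fields, and both functions are hypothetical names chosen for the example.
The point is only that the parent is resolved once at setup time, so the hot
path walks to the root by following a single pointer per level.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct psi_group. */
struct psi_group_model {
	struct psi_group_model *parent;	/* cached at init; NULL for the root */
	unsigned int updates;		/* how often this group was accounted */
};

/* Resolve and store the parent once, when the group is set up. */
static void psi_group_model_init(struct psi_group_model *group,
				 struct psi_group_model *parent)
{
	assert(group != NULL);
	group->parent = parent;
	group->updates = 0;
}

/*
 * Hot path: account one event in the group and in every ancestor by
 * following the cached parent pointers. Returns the number of groups
 * visited.
 */
static unsigned int account_up_hierarchy(struct psi_group_model *group)
{
	unsigned int visited = 0;

	for (; group; group = group->parent) {
		group->updates++;
		visited++;
	}
	return visited;
}
```

With a root, a middle group, and a leaf chained together, calling
account_up_hierarchy() on the leaf touches all three groups without any
per-step lookup of the parent.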

Thanks!

Changes in v2:
 - Add Acked-by tags from Johannes Weiner. Thanks for review!
 - Fix periodic aggregation wakeup for common ancestors in
   psi_task_switch().
 - Add patch 7/10 from Johannes Weiner, which removes NR_ONCPU
   task accounting to save 4 bytes in the first cacheline.
 - Remove the "psi_irq=" kernel cmdline parameter that was in the
   last version.
 - Add a per-cgroup interface "cgroup.psi" to disable/re-enable
   PSI stats accounting at the cgroup level.

Chengming Zhou (9):
  sched/psi: fix periodic aggregation shut off
  sched/psi: optimize task switch inside shared cgroups again
  sched/psi: move private helpers to sched/stats.h
  sched/psi: don't change task psi_flags when migrate CPU/group
  sched/psi: don't create cgroup PSI files when psi_disabled
  sched/psi: save percpu memory when !psi_cgroups_enabled
  sched/psi: add PSI_IRQ to track IRQ/SOFTIRQ pressure
  sched/psi: per-cgroup PSI stats disable/re-enable interface
  sched/psi: cache parent psi_group to speed up groups iterate

Johannes Weiner (1):
  sched/psi: remove NR_ONCPU task accounting

 Documentation/admin-guide/cgroup-v2.rst |  13 ++
 include/linux/psi.h                     |   6 +-
 include/linux/psi_types.h               |  25 +--
 include/linux/sched.h                   |   3 -
 kernel/cgroup/cgroup.c                  |  73 +++++++
 kernel/sched/core.c                     |   2 +
 kernel/sched/psi.c                      | 247 +++++++++++++++++-------
 kernel/sched/stats.h                    |  60 +++---
 8 files changed, 313 insertions(+), 116 deletions(-)

-- 
2.36.1
Re: [PATCH v2 00/10] sched/psi: some optimization and extension
Posted by Michal Koutný 3 years, 7 months ago
On Mon, Aug 08, 2022 at 07:03:31PM +0800, Chengming Zhou <zhouchengming@bytedance.com> wrote:
> This patch series are some optimization and extension for PSI,

BTW do you have some numbers/example how much these modifications save
when aggregated together?

Thanks,
Michal
Re: [PATCH v2 00/10] sched/psi: some optimization and extension
Posted by Chengming Zhou 3 years, 7 months ago
On 2022/8/15 21:25, Michal Koutný wrote:
> On Mon, Aug 08, 2022 at 07:03:31PM +0800, Chengming Zhou <zhouchengming@bytedance.com> wrote:
>> This patch series are some optimization and extension for PSI,
> 
> BTW do you have some numbers/example how much these modifications save
> when aggregated together?
> 

I lost my test data from last time, so I will use
mmtests/config-scheduler-perfpipe to get some performance numbers now.

Thanks.
Re: [PATCH v2 00/10] sched/psi: some optimization and extension
Posted by Chengming Zhou 3 years, 7 months ago
On 2022/8/16 22:01, Chengming Zhou wrote:
> On 2022/8/15 21:25, Michal Koutný wrote:
>> On Mon, Aug 08, 2022 at 07:03:31PM +0800, Chengming Zhou <zhouchengming@bytedance.com> wrote:
>>> This patch series are some optimization and extension for PSI,
>>
>> BTW do you have some numbers/example how much these modifications save
>> when aggregated together?
>>

Sorry about the delay...

Performance test using mmtests/config-scheduler-perfpipe, run in
/user.slice/user-0.slice/session-4.scope:

                                 next                patched       patched/only-leaf
Min       Time        8.82 (   0.00%)        8.49 (   3.74%)        8.00 (   9.32%)
1st-qrtle Time        8.90 (   0.00%)        8.58 (   3.63%)        8.05 (   9.58%)
2nd-qrtle Time        8.94 (   0.00%)        8.61 (   3.65%)        8.09 (   9.50%)
3rd-qrtle Time        8.99 (   0.00%)        8.65 (   3.75%)        8.15 (   9.35%)
Max-1     Time        8.82 (   0.00%)        8.49 (   3.74%)        8.00 (   9.32%)
Max-5     Time        8.82 (   0.00%)        8.49 (   3.74%)        8.00 (   9.32%)
Max-10    Time        8.84 (   0.00%)        8.55 (   3.20%)        8.04 (   9.05%)
Max-90    Time        9.04 (   0.00%)        8.67 (   4.10%)        8.18 (   9.51%)
Max-95    Time        9.04 (   0.00%)        8.68 (   4.03%)        8.20 (   9.26%)
Max-99    Time        9.07 (   0.00%)        8.73 (   3.82%)        8.25 (   9.11%)
Max       Time        9.12 (   0.00%)        8.89 (   2.54%)        8.27 (   9.29%)
Amean     Time        8.95 (   0.00%)        8.62 *   3.67%*        8.11 *   9.43%*
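
For reading the table: the percentages appear consistent with relative
improvement over the "next" baseline, that is (baseline - patched) / baseline,
where a lower time is better. A minimal C sketch of that arithmetic follows.
The function name is made up for illustration, and since the table prints
rounded times, recomputing from the printed values can differ from the printed
percentages by a few hundredths.

```c
#include <assert.h>

/*
 * Relative improvement of `patched` over `baseline`, in percent.
 * Times are "lower is better", so a positive result means the patched
 * kernel was faster.
 */
static double improvement_pct(double baseline, double patched)
{
	assert(baseline > 0.0);
	return (baseline - patched) / baseline * 100.0;
}
```

For the Min row, improvement_pct(8.82, 8.49) gives about 3.74, matching the
"patched" column.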


Thanks!