[PATCH 0/4] sched/psi: Allow unprivileged PSI polling

Domenico Cerasuolo posted 4 patches 3 years, 1 month ago
There is a newer version of this series
[PATCH 0/4] sched/psi: Allow unprivileged PSI polling
Posted by Domenico Cerasuolo 3 years, 1 month ago
PSI offers two mechanisms to get information about pressure on a
specific resource. One is reading from /proc/pressure/<resource>, which
gives average pressures aggregated every 2s. The other is creating a
pollable fd for a specific resource and cgroup.
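
For illustration, the first mechanism can be sketched in userspace as
follows. This is a minimal sketch, not part of the series: the helper
names parse_psi_line and read_pressure are hypothetical, and the line
layout assumed is the usual "some avg10=... avg60=... avg300=...
total=..." format of the pressure files.

```python
from __future__ import annotations

from pathlib import Path


def parse_psi_line(line: str) -> tuple[str, dict[str, float]]:
    """Parse one pressure line, e.g.
    'some avg10=0.00 avg60=0.00 avg300=0.00 total=0'.

    Returns the line kind ('some' or 'full') and its key/value fields.
    """
    kind, *fields = line.split()
    values: dict[str, float] = {}
    for field in fields:
        key, _, raw = field.partition("=")
        values[key] = float(raw)
    return kind, values


def read_pressure(resource: str = "memory") -> dict[str, dict[str, float]]:
    """Read /proc/pressure/<resource> (hypothetical helper).

    Needs a kernel with PSI enabled; raises OSError otherwise.
    """
    text = Path(f"/proc/pressure/{resource}").read_text()
    return dict(parse_psi_line(line) for line in text.splitlines() if line.strip())
```

Calling read_pressure("memory")["some"]["avg10"] would then give the
10-second average of the "some" line, on a system where the file exists.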

The trigger creation requires CAP_SYS_RESOURCE, and gives the
possibility to pick a specific time window and threshold, spawning an RT
thread to aggregate the data.
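
As a rough userspace sketch of the second mechanism, assuming the
trigger string format "<some|full> <stall us> <window us>" used by the
existing PSI trigger interface: the helper names format_trigger and
wait_for_pressure are hypothetical, and the trailing NUL mirrors the C
example in the kernel documentation, which writes the string including
its terminator.

```python
import os
import select


def format_trigger(kind: str, stall_us: int, window_us: int) -> bytes:
    """Build the trigger string written to a pressure file."""
    if kind not in ("some", "full"):
        raise ValueError(f"unknown trigger kind: {kind!r}")
    # Trailing NUL included, as in the kernel documentation's example.
    return f"{kind} {stall_us} {window_us}".encode() + b"\0"


def wait_for_pressure(path: str, trigger: bytes, timeout_ms: int) -> bool:
    """Register a trigger on `path` and wait for one event.

    Returns True if the trigger fired, False on timeout. Closing the fd
    on exit also removes the trigger.
    """
    fd = os.open(path, os.O_RDWR | os.O_NONBLOCK)
    try:
        os.write(fd, trigger)
        poller = select.poll()
        poller.register(fd, select.POLLPRI)
        for _fd, events in poller.poll(timeout_ms):
            if events & select.POLLERR:
                raise OSError("PSI event source is gone")
            if events & select.POLLPRI:
                return True
        return False
    finally:
        os.close(fd)
```

For example, wait_for_pressure("/proc/pressure/memory",
format_trigger("some", 150000, 1000000), 10000) would wait up to 10s for
150ms of partial memory stall within any 1s window. The same call on a
cgroup's memory.pressure file is the per-cgroup case described above.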

Systemd would like to provide containers the option to monitor pressure
on their own cgroup and sub-cgroups. For example, if systemd launches a
container that itself then launches services, the container should have
the ability to poll() for pressure in individual services. But neither
the container nor the services are privileged.

The series is implemented in 4 steps in order to reduce the noise of
the change.

Domenico Cerasuolo (4):
  sched/psi: rearrange polling code in preparation
  sched/psi: rename existing poll members in preparation
  sched/psi: extract update_triggers side effect
  sched/psi: allow unprivileged polling of N*2s period

 Documentation/accounting/psi.rst |   4 +
 include/linux/psi.h              |   2 +-
 include/linux/psi_types.h        |  43 ++--
 kernel/cgroup/cgroup.c           |   2 +-
 kernel/sched/psi.c               | 412 ++++++++++++++++---------------
 5 files changed, 250 insertions(+), 213 deletions(-)
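
To make the "N*2s period" in the last patch title concrete, here is a
minimal sketch of the kind of check it implies for unprivileged
triggers. It is inferred from the title alone: the constant and function
names are hypothetical, and the actual kernel check may differ.

```python
# 2s averaging period mentioned in the cover letter, in microseconds.
AVERAGING_PERIOD_US = 2_000_000


def window_allowed_unprivileged(window_us: int) -> bool:
    """Assumed rule: an unprivileged window must be a positive multiple of 2s."""
    return window_us > 0 and window_us % AVERAGING_PERIOD_US == 0
```

Under this assumption a 4s window would be accepted without
CAP_SYS_RESOURCE, while a 1s window would still need the capability.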

-- 
2.34.1
Re: [PATCH 0/4] sched/psi: Allow unprivileged PSI polling
Posted by Suren Baghdasaryan 3 years ago
On Thu, Mar 9, 2023 at 9:08 AM Domenico Cerasuolo
<cerasuolodomenico@gmail.com> wrote:
>
> PSI offers two mechanisms to get information about pressure on a
> specific resource. One is reading from /proc/pressure/<resource>, which
> gives average pressures aggregated every 2s. The other is creating a
> pollable fd for a specific resource and cgroup.
>
> The trigger creation requires CAP_SYS_RESOURCE, and gives the
> possibility to pick a specific time window and threshold, spawning an RT
> thread to aggregate the data.
>
> Systemd would like to provide containers the option to monitor pressure
> on their own cgroup and sub-cgroups. For example, if systemd launches a
> container that itself then launches services, the container should have
> the ability to poll() for pressure in individual services. But neither
> the container nor the services are privileged.

This sounds like an interesting use case. I'll need to take a closer
look once I'm back from vacation later this week.
Thanks!

>
> The series is implemented in 4 steps in order to reduce the noise of
> the change.
>
> Domenico Cerasuolo (4):
>   sched/psi: rearrange polling code in preparation
>   sched/psi: rename existing poll members in preparation
>   sched/psi: extract update_triggers side effect
>   sched/psi: allow unprivileged polling of N*2s period
>
>  Documentation/accounting/psi.rst |   4 +
>  include/linux/psi.h              |   2 +-
>  include/linux/psi_types.h        |  43 ++--
>  kernel/cgroup/cgroup.c           |   2 +-
>  kernel/sched/psi.c               | 412 ++++++++++++++++---------------
>  5 files changed, 250 insertions(+), 213 deletions(-)
>
> --
> 2.34.1
>
Re: [PATCH 0/4] sched/psi: Allow unprivileged PSI polling
Posted by Johannes Weiner 3 years ago
On Mon, Mar 13, 2023 at 08:29:37AM -0700, Suren Baghdasaryan wrote:
> On Thu, Mar 9, 2023 at 9:08 AM Domenico Cerasuolo
> <cerasuolodomenico@gmail.com> wrote:
> >
> > PSI offers two mechanisms to get information about pressure on a
> > specific resource. One is reading from /proc/pressure/<resource>, which
> > gives average pressures aggregated every 2s. The other is creating a
> > pollable fd for a specific resource and cgroup.
> >
> > The trigger creation requires CAP_SYS_RESOURCE, and gives the
> > possibility to pick a specific time window and threshold, spawning an RT
> > thread to aggregate the data.
> >
> > Systemd would like to provide containers the option to monitor pressure
> > on their own cgroup and sub-cgroups. For example, if systemd launches a
> > container that itself then launches services, the container should have
> > the ability to poll() for pressure in individual services. But neither
> > the container nor the services are privileged.
> 
> This sounds like an interesting use case. I'll need to take a closer
> look once I'm back from vacation later this week.
> Thanks!

Thanks, Suren!

There is also the desktop monitoring use case that Chris Down
inquired about a while back:

https://lore.kernel.org/all/CAJuCfpGnJBEvQTUeJ_U6+rHmPcMjw_pPL+QFj7Sec5fHZPH67w@mail.gmail.com/T/

The patches should help with that as well.