[PATCH 00/27] cpuset/isolation: Honour kthreads preferred affinity

Frederic Weisbecker posted 27 patches 3 months, 2 weeks ago
There is a newer version of this series
[PATCH 00/27] cpuset/isolation: Honour kthreads preferred affinity
Posted by Frederic Weisbecker 3 months, 2 weeks ago
The kthread code was enhanced lately to provide an infrastructure which
manages the preferred affinity of unbound kthreads (node or custom
cpumask) against housekeeping constraints and CPU hotplug events.

One crucial missing piece is cpuset: when an isolated partition is
created, deleted, or its CPUs updated, all the unbound kthreads in the
top cpuset are re-affined to _all_ the non-isolated CPUs, possibly
breaking their preferred affinity along the way.

Solve this by moving the kthreads affinity update from cpuset to the
consolidated kthread affinity code, so that preferred affinities are
honoured.
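
To illustrate what "honouring the preferred affinity" means here, below
is a minimal userspace sketch, not the kernel implementation. It assumes
a toy 64-bit mask with one bit per CPU in place of struct cpumask, and
the helper name kthread_effective_affinity() is hypothetical. The idea
it models: keep the preferred CPUs that are still housekeeping, and only
fall back to the whole housekeeping set when none of them remain.

```c
#include <assert.h>
#include <stdint.h>

/* One bit per CPU; a toy stand-in for the kernel's struct cpumask. */
typedef uint64_t cpumask_model_t;

/*
 * Hypothetical helper: compute the mask an unbound kthread ends up on.
 * Keep the preferred CPUs that are still housekeeping; if none are
 * left, fall back to the whole housekeeping set.
 */
static cpumask_model_t
kthread_effective_affinity(cpumask_model_t preferred,
			   cpumask_model_t housekeeping)
{
	cpumask_model_t effective = preferred & housekeeping;

	/* Assumption of this model: some housekeeping CPU always exists. */
	assert(housekeeping != 0);

	return effective ? effective : housekeeping;
}
```

For example, on an 8-CPU machine, a kthread preferring CPUs 4-7 (0xF0)
stays on CPUs 4-7 after CPUs 0-1 become isolated (housekeeping 0xFC),
instead of being widened to CPUs 2-7.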

The dispatch of the new cpumasks to workqueues and kthreads is performed
by housekeeping, as per Tejun's nice suggestion.

As a welcome side effect, HK_TYPE_DOMAIN now integrates both the set
from isolcpus= and cpuset isolated partitions. Housekeeping cpumasks are
now modifiable at runtime, with dedicated synchronization (a per-cpu
rwsem and RCU-protected cpumask pointers). This is a big step toward
making nohz_full= also mutable through cpuset in the future.
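
As a rough illustration of how the two isolation sources combine, here
is a small userspace model. It is only a sketch under assumptions: the
masks are plain 64-bit integers rather than struct cpumask, and
hk_domain_mask() is a hypothetical name, not a function from this
series. The domain housekeeping set is modelled as every possible CPU
that is isolated neither at boot (isolcpus=) nor by a cpuset isolated
partition.

```c
#include <assert.h>
#include <stdint.h>

/* One bit per CPU; a toy stand-in for the kernel's struct cpumask. */
typedef uint64_t cpumask_model_t;

/*
 * Hypothetical helper: derive the domain housekeeping mask from the
 * boot-time isolated set and the cpuset isolated partitions.
 */
static cpumask_model_t hk_domain_mask(cpumask_model_t possible,
				      cpumask_model_t boot_isolated,
				      cpumask_model_t cpuset_isolated)
{
	cpumask_model_t hk = possible & ~(boot_isolated | cpuset_isolated);

	/* Assumption of this model: isolating every CPU is not allowed. */
	assert(hk != 0);

	return hk;
}
```

With 8 CPUs, isolcpus=0-1 and an isolated partition holding CPUs 2-3,
the model yields CPUs 4-7 (0xF0). Destroying the partition gives back
CPUs 2-7 (0xFC), while the boot-time set stays isolated.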

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
	kthread/core

HEAD: f43c8b542df665940c2f581d771d92ff50606a6e

Thanks,
	Frederic
---

Frederic Weisbecker (27):
      sched/isolation: Remove housekeeping static key
      sched/isolation: Introduce housekeeping per-cpu rwsem
      PCI: Protect against concurrent change of housekeeping cpumask
      cpu: Protect against concurrent isolated cpuset change
      memcg: Prepare to protect against concurrent isolated cpuset change
      mm: vmstat: Prepare to protect against concurrent isolated cpuset change
      sched/isolation: Save boot defined domain flags
      cpuset: Convert boot_hk_cpus to use HK_TYPE_DOMAIN_BOOT
      driver core: cpu: Convert /sys/devices/system/cpu/isolated to use HK_TYPE_DOMAIN_BOOT
      net: Keep ignoring isolated cpuset change
      block: Protect against concurrent isolated cpuset change
      cpu: Provide lockdep check for CPU hotplug lock write-held
      cpuset: Provide lockdep check for cpuset lock held
      sched/isolation: Convert housekeeping cpumasks to rcu pointers
      cpuset: Update HK_TYPE_DOMAIN cpumask from cpuset
      sched/isolation: Flush memcg workqueues on cpuset isolated partition change
      sched/isolation: Flush vmstat workqueues on cpuset isolated partition change
      cpuset: Propagate cpuset isolation update to workqueue through housekeeping
      cpuset: Remove cpuset_cpu_is_isolated()
      sched/isolation: Remove HK_TYPE_TICK test from cpu_is_isolated()
      kthread: Refine naming of affinity related fields
      kthread: Include unbound kthreads in the managed affinity list
      kthread: Include kthreadd to the managed affinity list
      kthread: Rely on HK_TYPE_DOMAIN for preferred affinity management
      sched: Switch the fallback task allowed cpumask to HK_TYPE_DOMAIN
      kthread: Honour kthreads preferred affinity after cpuset changes
      kthread: Comment on the purpose and placement of kthread_affine_node() call


 block/blk-mq.c                  |   6 +-
 drivers/base/cpu.c              |   2 +-
 drivers/pci/pci-driver.c        |   3 +-
 include/linux/cpuhplock.h       |   1 +
 include/linux/cpuset.h          |   8 +-
 include/linux/kthread.h         |   1 +
 include/linux/memcontrol.h      |   4 +
 include/linux/mmu_context.h     |   2 +-
 include/linux/percpu-rwsem.h    |   1 +
 include/linux/sched/isolation.h |  38 +++++---
 include/linux/vmstat.h          |   2 +
 include/linux/workqueue.h       |   2 +-
 init/Kconfig                    |   1 +
 kernel/cgroup/cpuset.c          |  60 +++++-------
 kernel/cpu.c                    |  49 +++++++---
 kernel/kthread.c                | 136 ++++++++++++++++-----------
 kernel/sched/isolation.c        | 201 ++++++++++++++++++++++++++++------------
 kernel/sched/sched.h            |   5 +
 kernel/workqueue.c              |   2 +-
 mm/memcontrol.c                 |  26 +++++-
 mm/vmstat.c                     |  15 ++-
 net/core/net-sysfs.c            |   2 +-
 22 files changed, 371 insertions(+), 196 deletions(-)
Re: [PATCH 00/27] cpuset/isolation: Honour kthreads preferred affinity
Posted by Bjorn Helgaas 3 months, 2 weeks ago
On Fri, Jun 20, 2025 at 05:22:41PM +0200, Frederic Weisbecker wrote:
> The kthread code was enhanced lately to provide an infrastructure which
> manages the preferred affinity of unbound kthreads (node or custom
> cpumask) against housekeeping constraints and CPU hotplug events.
> 
> One crucial missing piece is cpuset: when an isolated partition is
> created, deleted, or its CPUs updated, all the unbound kthreads in the
> top cpuset are re-affined to _all_ the non-isolated CPUs, possibly
> breaking their preferred affinity along the way.
> 
> Solve this by moving the kthreads affinity update from cpuset to the
> consolidated kthread affinity code, so that preferred affinities are
> honoured.
> 
> The dispatch of the new cpumasks to workqueues and kthreads is performed
> by housekeeping, as per Tejun's nice suggestion.
> 
> As a welcome side effect, HK_TYPE_DOMAIN now integrates both the set
> from isolcpus= and cpuset isolated partitions. Housekeeping cpumasks are
> now modifiable at runtime, with dedicated synchronization (a per-cpu
> rwsem and RCU-protected cpumask pointers). This is a big step toward
> making nohz_full= also mutable through cpuset in the future.

Is there anything in Documentation/ that covers the "housekeeping"
feature (and isolation in general) and how to use it?  I see a few
mentions in kernel-parameters.txt and kernel-per-CPU-kthreads.rst, but
they are only incidental.

Bjorn
Re: [PATCH 00/27] cpuset/isolation: Honour kthreads preferred affinity
Posted by Frederic Weisbecker 3 months, 2 weeks ago
On Fri, Jun 20, 2025 at 11:08:47AM -0500, Bjorn Helgaas wrote:
> On Fri, Jun 20, 2025 at 05:22:41PM +0200, Frederic Weisbecker wrote:
> > The kthread code was enhanced lately to provide an infrastructure which
> > manages the preferred affinity of unbound kthreads (node or custom
> > cpumask) against housekeeping constraints and CPU hotplug events.
> > 
> > One crucial missing piece is cpuset: when an isolated partition is
> > created, deleted, or its CPUs updated, all the unbound kthreads in the
> > top cpuset are re-affined to _all_ the non-isolated CPUs, possibly
> > breaking their preferred affinity along the way.
> > 
> > Solve this by moving the kthreads affinity update from cpuset to the
> > consolidated kthread affinity code, so that preferred affinities are
> > honoured.
> > 
> > The dispatch of the new cpumasks to workqueues and kthreads is performed
> > by housekeeping, as per Tejun's nice suggestion.
> > 
> > As a welcome side effect, HK_TYPE_DOMAIN now integrates both the set
> > from isolcpus= and cpuset isolated partitions. Housekeeping cpumasks are
> > now modifiable at runtime, with dedicated synchronization (a per-cpu
> > rwsem and RCU-protected cpumask pointers). This is a big step toward
> > making nohz_full= also mutable through cpuset in the future.
> 
> Is there anything in Documentation/ that covers the "housekeeping"
> feature (and isolation in general) and how to use it?  I see a few
> mentions in kernel-parameters.txt and kernel-per-CPU-kthreads.rst, but
> they are only incidental.

Not yet, I'll try that for the next take.

Thanks.

-- 
Frederic Weisbecker
SUSE Labs