From nobody Thu Oct 9 03:14:49 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0586D2D5422 for ; Fri, 20 Jun 2025 15:23:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750432997; cv=none; b=TJcmBrqCe7YLuwmgIFeo5pVNnZvkQa4Uy6tK7lQFpxZKrFtMw8Y8jkIBX7aXxZE8VJHlFmbmp4y/Ckb7xlaWhb9XJ9WPPcq0gU2RnVOIZpV2Mws1a7YtYCzRkNNO3aW1qEbBc6SYspQZKdExY4yFD6n/URZFcplfOJMUW4oyzbo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750432997; c=relaxed/simple; bh=TBtcOpkcjcgReECVhMix4kGzTvhmxxJWRFsSJZ944+M=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Gyx7yGqB4MfmB5CEcwTleRI4gOggL3yKpY5qUqgLf4fzI+SYcO/DDfxZcWBSGzLdPNclN4QMJqRkgwS9P8l7eBGeW+dfm6f70tsZgZ+TtOJHYRD7bt+9Aa5kDYH+LqzQ5BjvlIhNQeuR0a0VwNMnxFU2cqsn49HdxZJxjmtPNcU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=UdbEulxw; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="UdbEulxw" Received: by smtp.kernel.org (Postfix) with ESMTPSA id BFE7BC4CEF0; Fri, 20 Jun 2025 15:23:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1750432996; bh=TBtcOpkcjcgReECVhMix4kGzTvhmxxJWRFsSJZ944+M=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=UdbEulxwOe4Prc2pFVwWjt23Mv4Y/vBiRWMW8hpM9ieWR5FKP0rS1QuQLn8uhHbfJ QBPi/ZVmsBGXoWIfcLVUXbryy+9hemov/Z7DQNjxykbtD87TCT+XGk5MXH+ZFEr3OU CA2TeLHzzaFJGz842TbIo2jcQFb4M2dTc4e3K+u/gnx5se5SBT3PqpA8i+bMpQHhk4 
 TqtB3lj23wg76WNSpEMJ5WYs1kILPlCfYpDgiwHiEk4WnKgoa4B5hSrJoCWiR+XX+E
 ztFZagfIrHZFaPxUMpDVSq0zem+8P4N/y2cIqKgFqOIvh3dwh0ioduYk9k5uP7ycAx
 fDumS2gJUgp5Q==
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Ingo Molnar, Marco Crivellari, Michal Hocko,
 Peter Zijlstra, Tejun Heo, Thomas Gleixner, Vlastimil Babka,
 Waiman Long
Subject: [PATCH 01/27] sched/isolation: Remove housekeeping static key
Date: Fri, 20 Jun 2025 17:22:42 +0200
Message-ID: <20250620152308.27492-2-frederic@kernel.org>
X-Mailer: git-send-email 2.48.1
In-Reply-To: <20250620152308.27492-1-frederic@kernel.org>
References: <20250620152308.27492-1-frederic@kernel.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

The housekeeping static key in its current use is mostly irrelevant.
Most of the time, a housekeeping function call has already been issued
before the static key gets a chance to be evaluated, which defeats the
purpose of the optimization.

housekeeping_cpu() is the only user that correctly tests the static key
before making the slow-path function call, but it is seldom used in
fast paths.

Finally, the static key prevents correct synchronization against
dynamic updates of the housekeeping cpumasks through cpusets.

Use a simple flag test instead.

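As an illustration of the flag test, here is a minimal userspace sketch,
not the kernel code itself. The BIT() macro, the plain unsigned long
masks and the sketch_set_housekeeping() helper are simplifying
assumptions made for this example; only the shape of housekeeping_cpu()
follows what this patch introduces: a plain bit test on
housekeeping_flags guarding the out-of-line housekeeping_test_cpu() call.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins for kernel definitions (assumptions of this sketch). */
#define BIT(nr) (1UL << (nr))

enum hk_type { HK_TYPE_DOMAIN, HK_TYPE_KERNEL_NOISE, HK_TYPE_MAX };

/* One bit per hk_type whose cpumask has been overridden. */
static unsigned long housekeeping_flags;
/* One bitmap per type: bit N set means CPU N does housekeeping work. */
static unsigned long housekeeping_masks[HK_TYPE_MAX];

/* Hypothetical helper: mimics what boot-time setup would record. */
static void sketch_set_housekeeping(enum hk_type type, unsigned long mask)
{
	assert(type < HK_TYPE_MAX);
	housekeeping_masks[type] = mask;
	housekeeping_flags |= BIT(type);
}

/* Slow path: out of line in the kernel, consults the cpumask. */
static bool housekeeping_test_cpu(int cpu, enum hk_type type)
{
	if (housekeeping_flags & BIT(type))
		return (housekeeping_masks[type] & BIT(cpu)) != 0;
	return true;
}

/* Fast path: a plain flag test replaces the static key. */
static inline bool housekeeping_cpu(int cpu, enum hk_type type)
{
	if (housekeeping_flags & BIT(type))
		return housekeeping_test_cpu(cpu, type);
	return true;
}
```

With no flag set, every CPU is reported as a housekeeping CPU. Once a
type is overridden, only the CPUs in its mask are.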
Signed-off-by: Frederic Weisbecker --- include/linux/sched/isolation.h | 25 +++++---- kernel/sched/isolation.c | 90 ++++++++++++++------------------- 2 files changed, 55 insertions(+), 60 deletions(-) diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolatio= n.h index d8501f4709b5..f98ba0d71c52 100644 --- a/include/linux/sched/isolation.h +++ b/include/linux/sched/isolation.h @@ -25,12 +25,22 @@ enum hk_type { }; =20 #ifdef CONFIG_CPU_ISOLATION -DECLARE_STATIC_KEY_FALSE(housekeeping_overridden); +extern unsigned long housekeeping_flags; + extern int housekeeping_any_cpu(enum hk_type type); extern const struct cpumask *housekeeping_cpumask(enum hk_type type); extern bool housekeeping_enabled(enum hk_type type); extern void housekeeping_affine(struct task_struct *t, enum hk_type type); extern bool housekeeping_test_cpu(int cpu, enum hk_type type); + +static inline bool housekeeping_cpu(int cpu, enum hk_type type) +{ + if (housekeeping_flags & BIT(type)) + return housekeeping_test_cpu(cpu, type); + else + return true; +} + extern void __init housekeeping_init(void); =20 #else @@ -58,17 +68,14 @@ static inline bool housekeeping_test_cpu(int cpu, enum = hk_type type) return true; } =20 +static inline bool housekeeping_cpu(int cpu, enum hk_type type) +{ + return true; +} + static inline void housekeeping_init(void) { } #endif /* CONFIG_CPU_ISOLATION */ =20 -static inline bool housekeeping_cpu(int cpu, enum hk_type type) -{ -#ifdef CONFIG_CPU_ISOLATION - if (static_branch_unlikely(&housekeeping_overridden)) - return housekeeping_test_cpu(cpu, type); -#endif - return true; -} =20 static inline bool cpu_is_isolated(int cpu) { diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c index 93b038d48900..83cec3853864 100644 --- a/kernel/sched/isolation.c +++ b/kernel/sched/isolation.c @@ -14,19 +14,13 @@ enum hk_flags { HK_FLAG_KERNEL_NOISE =3D BIT(HK_TYPE_KERNEL_NOISE), }; =20 -DEFINE_STATIC_KEY_FALSE(housekeeping_overridden); 
-EXPORT_SYMBOL_GPL(housekeeping_overridden); - -struct housekeeping { - cpumask_var_t cpumasks[HK_TYPE_MAX]; - unsigned long flags; -}; - -static struct housekeeping housekeeping; +static cpumask_var_t housekeeping_cpumasks[HK_TYPE_MAX]; +unsigned long housekeeping_flags; +EXPORT_SYMBOL_GPL(housekeeping_flags); =20 bool housekeeping_enabled(enum hk_type type) { - return !!(housekeeping.flags & BIT(type)); + return !!(housekeeping_flags & BIT(type)); } EXPORT_SYMBOL_GPL(housekeeping_enabled); =20 @@ -34,50 +28,46 @@ int housekeeping_any_cpu(enum hk_type type) { int cpu; =20 - if (static_branch_unlikely(&housekeeping_overridden)) { - if (housekeeping.flags & BIT(type)) { - cpu =3D sched_numa_find_closest(housekeeping.cpumasks[type], smp_proces= sor_id()); - if (cpu < nr_cpu_ids) - return cpu; + if (housekeeping_flags & BIT(type)) { + cpu =3D sched_numa_find_closest(housekeeping_cpumasks[type], smp_process= or_id()); + if (cpu < nr_cpu_ids) + return cpu; =20 - cpu =3D cpumask_any_and_distribute(housekeeping.cpumasks[type], cpu_onl= ine_mask); - if (likely(cpu < nr_cpu_ids)) - return cpu; - /* - * Unless we have another problem this can only happen - * at boot time before start_secondary() brings the 1st - * housekeeping CPU up. - */ - WARN_ON_ONCE(system_state =3D=3D SYSTEM_RUNNING || - type !=3D HK_TYPE_TIMER); - } + cpu =3D cpumask_any_and_distribute(housekeeping_cpumasks[type], cpu_onli= ne_mask); + if (likely(cpu < nr_cpu_ids)) + return cpu; + /* + * Unless we have another problem this can only happen + * at boot time before start_secondary() brings the 1st + * housekeeping CPU up. 
+ */ + WARN_ON_ONCE(system_state =3D=3D SYSTEM_RUNNING || + type !=3D HK_TYPE_TIMER); } + return smp_processor_id(); } EXPORT_SYMBOL_GPL(housekeeping_any_cpu); =20 const struct cpumask *housekeeping_cpumask(enum hk_type type) { - if (static_branch_unlikely(&housekeeping_overridden)) - if (housekeeping.flags & BIT(type)) - return housekeeping.cpumasks[type]; + if (housekeeping_flags & BIT(type)) + return housekeeping_cpumasks[type]; return cpu_possible_mask; } EXPORT_SYMBOL_GPL(housekeeping_cpumask); =20 void housekeeping_affine(struct task_struct *t, enum hk_type type) { - if (static_branch_unlikely(&housekeeping_overridden)) - if (housekeeping.flags & BIT(type)) - set_cpus_allowed_ptr(t, housekeeping.cpumasks[type]); + if (housekeeping_flags & BIT(type)) + set_cpus_allowed_ptr(t, housekeeping_cpumasks[type]); } EXPORT_SYMBOL_GPL(housekeeping_affine); =20 bool housekeeping_test_cpu(int cpu, enum hk_type type) { - if (static_branch_unlikely(&housekeeping_overridden)) - if (housekeeping.flags & BIT(type)) - return cpumask_test_cpu(cpu, housekeeping.cpumasks[type]); + if (housekeeping_flags & BIT(type)) + return cpumask_test_cpu(cpu, housekeeping_cpumasks[type]); return true; } EXPORT_SYMBOL_GPL(housekeeping_test_cpu); @@ -86,17 +76,15 @@ void __init housekeeping_init(void) { enum hk_type type; =20 - if (!housekeeping.flags) + if (!housekeeping_flags) return; =20 - static_branch_enable(&housekeeping_overridden); - - if (housekeeping.flags & HK_FLAG_KERNEL_NOISE) + if (housekeeping_flags & HK_FLAG_KERNEL_NOISE) sched_tick_offload_init(); =20 - for_each_set_bit(type, &housekeeping.flags, HK_TYPE_MAX) { + for_each_set_bit(type, &housekeeping_flags, HK_TYPE_MAX) { /* We need at least one CPU to handle housekeeping work */ - WARN_ON_ONCE(cpumask_empty(housekeeping.cpumasks[type])); + WARN_ON_ONCE(cpumask_empty(housekeeping_cpumasks[type])); } } =20 @@ -104,8 +92,8 @@ static void __init housekeeping_setup_type(enum hk_type = type, cpumask_var_t housekeeping_staging) { =20 - 
alloc_bootmem_cpumask_var(&housekeeping.cpumasks[type]); - cpumask_copy(housekeeping.cpumasks[type], + alloc_bootmem_cpumask_var(&housekeeping_cpumasks[type]); + cpumask_copy(housekeeping_cpumasks[type], housekeeping_staging); } =20 @@ -115,7 +103,7 @@ static int __init housekeeping_setup(char *str, unsigne= d long flags) unsigned int first_cpu; int err =3D 0; =20 - if ((flags & HK_FLAG_KERNEL_NOISE) && !(housekeeping.flags & HK_FLAG_KERN= EL_NOISE)) { + if ((flags & HK_FLAG_KERNEL_NOISE) && !(housekeeping_flags & HK_FLAG_KERN= EL_NOISE)) { if (!IS_ENABLED(CONFIG_NO_HZ_FULL)) { pr_warn("Housekeeping: nohz unsupported." " Build with CONFIG_NO_HZ_FULL\n"); @@ -137,7 +125,7 @@ static int __init housekeeping_setup(char *str, unsigne= d long flags) if (first_cpu >=3D nr_cpu_ids || first_cpu >=3D setup_max_cpus) { __cpumask_set_cpu(smp_processor_id(), housekeeping_staging); __cpumask_clear_cpu(smp_processor_id(), non_housekeeping_mask); - if (!housekeeping.flags) { + if (!housekeeping_flags) { pr_warn("Housekeeping: must include one present CPU, " "using boot CPU:%d\n", smp_processor_id()); } @@ -146,7 +134,7 @@ static int __init housekeeping_setup(char *str, unsigne= d long flags) if (cpumask_empty(non_housekeeping_mask)) goto free_housekeeping_staging; =20 - if (!housekeeping.flags) { + if (!housekeeping_flags) { /* First setup call ("nohz_full=3D" or "isolcpus=3D") */ enum hk_type type; =20 @@ -155,26 +143,26 @@ static int __init housekeeping_setup(char *str, unsig= ned long flags) } else { /* Second setup call ("nohz_full=3D" after "isolcpus=3D" or the reverse)= */ enum hk_type type; - unsigned long iter_flags =3D flags & housekeeping.flags; + unsigned long iter_flags =3D flags & housekeeping_flags; =20 for_each_set_bit(type, &iter_flags, HK_TYPE_MAX) { if (!cpumask_equal(housekeeping_staging, - housekeeping.cpumasks[type])) { + housekeeping_cpumasks[type])) { pr_warn("Housekeeping: nohz_full=3D must match isolcpus=3D\n"); goto free_housekeeping_staging; } } =20 - 
iter_flags =3D flags & ~housekeeping.flags; + iter_flags =3D flags & ~housekeeping_flags; =20 for_each_set_bit(type, &iter_flags, HK_TYPE_MAX) housekeeping_setup_type(type, housekeeping_staging); } =20 - if ((flags & HK_FLAG_KERNEL_NOISE) && !(housekeeping.flags & HK_FLAG_KERN= EL_NOISE)) + if ((flags & HK_FLAG_KERNEL_NOISE) && !(housekeeping_flags & HK_FLAG_KERN= EL_NOISE)) tick_nohz_full_setup(non_housekeeping_mask); =20 - housekeeping.flags |=3D flags; + housekeeping_flags |=3D flags; err =3D 1; =20 free_housekeeping_staging: --=20 2.48.1 From nobody Thu Oct 9 03:14:49 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B4B622D8DB6 for ; Fri, 20 Jun 2025 15:23:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750432999; cv=none; b=sAcJ0MGCK66GIMOgjmxUwjgQ1KjA8MME9F/Rc2EDqTKI+Q2uYCPjFlrBMtVzDTmUlnWvtu4sFZ4qhsJiqcJTm0Yxja0ka+12SLgrbzCgF7IedPKUyK+4Cnr35bynvEyz7Jub12E+mtOC6QJxXpKRzeBy69FzU/zWqe+MtoiJBQc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750432999; c=relaxed/simple; bh=7JBz2aGfpObfMrYEMzZRqb7NgZBg573bIt1x2XUn3Gg=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=EvzKEx+jpxsdA8iYkVc5ebE+r/+vE3p8ZdPYLhhV9Tm8qDZai0vQfr80Cq6wgBNImSXbXdNyphyVeWcEzmVEvLEQYIBn2ai0ufPEQ58QFfbWXru02xZXiXigJ0DH3C9y5PUQ1zMo+jtvMDw8iSRupjtCtG7ufeTPJO3ffCLAt5s= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=M2BAbYZ+; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="M2BAbYZ+" Received: by 
 smtp.kernel.org (Postfix) with ESMTPSA id 488FEC4CEEF;
 Fri, 20 Jun 2025 15:23:17 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
 s=k20201202; t=1750432999;
 bh=7JBz2aGfpObfMrYEMzZRqb7NgZBg573bIt1x2XUn3Gg=;
 h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
 b=M2BAbYZ+woLX7eXdidaqzDYNeHK2WbEUcsFC/bx8fkT3GOEQv2trb/iiQLtvO/CfO
 kXhZTJ/kv0r+/QN6Ju0HO2PHjVkVwO8iH+dqcDHUHUoSVyU18Aa3mkMKsmfdiBN5W5
 Oyi7fhxvqwFDmY/8zQN/aGIsqLMCOTGrdHAL7fiiVIz1FVdqVDJgFAybnN2Xhxcwcm
 kxXM3D9IGHJgNMbx+r7o/2rHJILF5ObsQiPY8U70LY+xh+pQIj6RrC3GRxCyt1aCXB
 rNMAO7/22bYqfcP1TngPM3jLtb82T5POkro6RKPq4A4+zpPVejgL6gnaqXAhVyV8Yz
 8bkK1yxpMdrLQ==
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Ingo Molnar, Marco Crivellari, Michal Hocko,
 Peter Zijlstra, Tejun Heo, Thomas Gleixner, Vlastimil Babka,
 Waiman Long
Subject: [PATCH 02/27] sched/isolation: Introduce housekeeping per-cpu rwsem
Date: Fri, 20 Jun 2025 17:22:43 +0200
Message-ID: <20250620152308.27492-3-frederic@kernel.org>
X-Mailer: git-send-email 2.48.1
In-Reply-To: <20250620152308.27492-1-frederic@kernel.org>
References: <20250620152308.27492-1-frederic@kernel.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

The HK_TYPE_DOMAIN isolation cpumask, and later the
HK_TYPE_KERNEL_NOISE cpumask, will be made modifiable at runtime in the
future.

The affected subsystems will need to synchronize against those cpumask
changes so that:

* Readers get a coherent snapshot.

* The housekeeping subsystem can safely propagate a cpumask update to
  the subsystems after it has been published.

Protect read sides that can sleep with a per-cpu rwsem. Updates are
expected to be very rare, given that CPU isolation is a niche use case
and the related cpuset setup happens only during preparation work.
Read sides, on the other hand, can occur in more frequent paths.

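The reader/updater contract described above can be sketched with a
single-threaded userspace model. This is only an illustration under
loud assumptions: a reader count and a writer flag stand in for the
per-cpu rwsem, nothing here blocks or is thread safe, and
hk_model_update() and hk_model_pick_cpu() are hypothetical names that
do not exist in the kernel. Where the real lock would sleep, the model
asserts or returns false.

```c
#include <assert.h>
#include <stdbool.h>

/* Model of the rwsem state: NOT a real lock, single-threaded only. */
struct hk_lock_model {
	int readers;	/* read sides currently inside their section */
	bool writer;	/* an update is being published */
};

static struct hk_lock_model hk_lock;
/* Bit N set means CPU N is a housekeeping CPU (8 CPUs in this model). */
static unsigned long hk_domain_mask = 0xffUL;

static void housekeeping_lock(void)
{
	assert(!hk_lock.writer);	/* the real read lock would wait here */
	hk_lock.readers++;
}

static void housekeeping_unlock(void)
{
	assert(hk_lock.readers > 0);
	hk_lock.readers--;
}

/* Hypothetical update side: may only publish once readers have drained. */
static bool hk_model_update(unsigned long new_mask)
{
	if (hk_lock.readers > 0)
		return false;	/* the real write lock would wait instead */
	hk_lock.writer = true;
	hk_domain_mask = new_mask;
	hk_lock.writer = false;
	return true;
}

/* Hypothetical read side: pick a target CPU from a coherent snapshot. */
static int hk_model_pick_cpu(void)
{
	int cpu = -1;

	housekeeping_lock();
	for (int i = 0; i < 8; i++) {
		if (hk_domain_mask & (1UL << i)) {
			cpu = i;
			break;
		}
	}
	housekeeping_unlock();
	return cpu;
}
```

The point of the model is the ordering guarantee: while a read side is
inside its section the mask cannot change under it, and once an update
has been published every later read side sees the new mask.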
Signed-off-by: Frederic Weisbecker --- include/linux/sched/isolation.h | 7 +++++++ kernel/sched/isolation.c | 12 ++++++++++++ kernel/sched/sched.h | 1 + 3 files changed, 20 insertions(+) diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolatio= n.h index f98ba0d71c52..8de4f625a5c1 100644 --- a/include/linux/sched/isolation.h +++ b/include/linux/sched/isolation.h @@ -41,6 +41,9 @@ static inline bool housekeeping_cpu(int cpu, enum hk_type= type) return true; } =20 +extern void housekeeping_lock(void); +extern void housekeeping_unlock(void); + extern void __init housekeeping_init(void); =20 #else @@ -73,6 +76,8 @@ static inline bool housekeeping_cpu(int cpu, enum hk_type= type) return true; } =20 +static inline void housekeeping_lock(void) { } +static inline void housekeeping_unlock(void) { } static inline void housekeeping_init(void) { } #endif /* CONFIG_CPU_ISOLATION */ =20 @@ -84,4 +89,6 @@ static inline bool cpu_is_isolated(int cpu) cpuset_cpu_is_isolated(cpu); } =20 +DEFINE_LOCK_GUARD_0(housekeeping, housekeeping_lock(), housekeeping_unlock= ()) + #endif /* _LINUX_SCHED_ISOLATION_H */ diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c index 83cec3853864..8c02eeccea3b 100644 --- a/kernel/sched/isolation.c +++ b/kernel/sched/isolation.c @@ -18,12 +18,24 @@ static cpumask_var_t housekeeping_cpumasks[HK_TYPE_MAX]; unsigned long housekeeping_flags; EXPORT_SYMBOL_GPL(housekeeping_flags); =20 +DEFINE_STATIC_PERCPU_RWSEM(housekeeping_pcpu_lock); + bool housekeeping_enabled(enum hk_type type) { return !!(housekeeping_flags & BIT(type)); } EXPORT_SYMBOL_GPL(housekeeping_enabled); =20 +void housekeeping_lock(void) +{ + percpu_down_read(&housekeeping_pcpu_lock); +} + +void housekeeping_unlock(void) +{ + percpu_up_read(&housekeeping_pcpu_lock); +} + int housekeeping_any_cpu(enum hk_type type) { int cpu; diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 475bb5998295..0cdb560ef2f3 100644 --- a/kernel/sched/sched.h +++ 
b/kernel/sched/sched.h @@ -46,6 +46,7 @@ #include #include #include +#include #include #include #include --=20 2.48.1 From nobody Thu Oct 9 03:14:49 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4213A2DFA57 for ; Fri, 20 Jun 2025 15:23:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433002; cv=none; b=M9RLBgLT4hfDwLPss6anSCGvNPvO9nfZdzabz8XF3M7iYAgIMZpxYlwsxwyOPEfpsOZ+1jYdEC875kbbCN5utpQciZOZE7+XZV/QCR2Fs+keUyTIph2A/XRAf64yOX77C7yKywGrvFRn7Vai6CI+ka4Wx68sv6hnOAKGmGtZ1No= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433002; c=relaxed/simple; bh=KdLS87L4/EPxci93NBE5ls3FMF8T1pxkiTjYbgnzRJg=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=NbkT6DJGWmmVOH/wiPMpbVljtf6fIldmW/EvVh7wYlLmB35szVrRVCLktLUeEtVNNizCYa+uRaISxmo3+BlVsgfsxAIh0w+AgPBz0ceMXo5nrKSrM25K+YdsKjnA2Qouqxnx1Tn11xCgrtMapIZLT20wc7TS/p5lrV0vmZI1fmQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=WJGOFqK/; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="WJGOFqK/" Received: by smtp.kernel.org (Postfix) with ESMTPSA id ABA58C4CEF0; Fri, 20 Jun 2025 15:23:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1750433001; bh=KdLS87L4/EPxci93NBE5ls3FMF8T1pxkiTjYbgnzRJg=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=WJGOFqK/Va0Yez8G3wtW/iRHHxRIHI4Ri2Egy889qsdXGNqSSVYKJyLWC6vFzgKfd XkPiuhqB4Cx6dN+g2rNXOwlEMycdZjxBXnTUr7YEkW4aopebuEcHYLNKhlIBvTCmGb 
 SoKvRtrPgIuRkWBEdPWYkp7hgwa5OKgdNQyhP4HkcLHTYxmFAHjroJaJk3u61LRAcT
 902kQ5NKvokygVj6wJiloNfSml97fVXrsCLJAV2zkgpEsw7sOB5g8ojtBxgstpgMVB
 QXBZ3+k9NFOEC6eR9TvGFtOHakwFFefbyPqTmFGFnHwlA1ulXKMjH3gKInOrUPFuEE
 B+yOgE0u6ZwOA==
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Bjorn Helgaas, Marco Crivellari, Michal Hocko,
 Peter Zijlstra, Tejun Heo, Thomas Gleixner, Vlastimil Babka,
 Waiman Long
Subject: [PATCH 03/27] PCI: Protect against concurrent change of housekeeping cpumask
Date: Fri, 20 Jun 2025 17:22:44 +0200
Message-ID: <20250620152308.27492-4-frederic@kernel.org>
X-Mailer: git-send-email 2.48.1
In-Reply-To: <20250620152308.27492-1-frederic@kernel.org>
References: <20250620152308.27492-1-frederic@kernel.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

HK_TYPE_DOMAIN will soon integrate cpuset isolated partitions and
therefore be made modifiable at runtime. Synchronize against the
cpumask update using appropriate locking.

Queue and wait for the PCI call to complete while holding the
housekeeping rwsem. This way the housekeeping update side doesn't need
to propagate its changes to PCI.

Signed-off-by: Frederic Weisbecker
---
 drivers/pci/pci-driver.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 67db34fd10ee..459d211a408b 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -362,7 +362,7 @@ static int pci_call_probe(struct pci_driver *drv, struc=
t pci_dev *dev,
 	dev->is_probed =3D 1;
=20
 	cpu_hotplug_disable();
-
+	housekeeping_lock();
 	/*
 	 * Prevent nesting work_on_cpu() for the case where a Virtual Function
 	 * device is probed from work_on_cpu() of the Physical device.
@@ -392,6 +392,7 @@ static int pci_call_probe(struct pci_driver *drv, struc= t pci_dev *dev, error =3D local_pci_probe(&ddi); out: dev->is_probed =3D 0; + housekeeping_unlock(); cpu_hotplug_enable(); return error; } --=20 2.48.1 From nobody Thu Oct 9 03:14:49 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6026C2DFF25 for ; Fri, 20 Jun 2025 15:23:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433004; cv=none; b=td61xP0fqzh70hWJFJbrCTxOvr7Q6dVEczmr2yvK3RylxthmQRVH1d5mEBBcCjb7HFg/6FI2im8EN/eSEXwyVlBuhbFEPznkj+LmEQtHhri/fmyPwUMz3FlNMAUv6T5R2u8CS8WTiaclJxx964KYxB5JRoDc97e1xiJDXns5gWk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433004; c=relaxed/simple; bh=LRokWWX9WY4bH5RJ09iKb9EhBlunAS0FLllkbGyEOWc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=lDPnjfir3QaxspmfvWr2JrRpMnN6ha+Q889zWioXtQh4XrzNLOyYMzNdLcUomACHn/MtOTb10S+wAYKgcNh0dmyddNFieUyOrOuVVI/EYwD8phBQ7DF6QkWZ5FENu2UDwfRXE+Zz4aDq76JsACsRRskGuslagmh7bCtg2/ny1wI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=huPgtdS4; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="huPgtdS4" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3DEB8C4AF09; Fri, 20 Jun 2025 15:23:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1750433004; bh=LRokWWX9WY4bH5RJ09iKb9EhBlunAS0FLllkbGyEOWc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; 
 b=huPgtdS4svT8PsLCLa8AjLik66tSOROHDlJ4/5K9PLwUu9fjklyiW3RoTZ1pIKRTv
 xj8DuiHn1hKxqsiJmA1/fygYu1CRIUxFgU9+aZlyK/0XMDrrGcnW2RtatqUaA1+afa
 448Ir1Y33Ysc0ux+6fopZFHQwA1OcwgKUbQZ9rSM5FK2iMKMgjYoNrY754mMDGsZS7
 npUsxOD7plNK/34pG3ezpJELTBv8iVfm0U5sVX9dGuRNseN4vnQiTtFWJv5GdGqbna
 0Pig3gxi94mgs2tBJAgL1rY0ZykWDSiRFenCUOqXy6FEWb+55Y5yV72Jn4ApfYsB8r
 WLR4EBhB7FyGg==
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Ingo Molnar, Marco Crivellari, Michal Hocko,
 Peter Zijlstra, Tejun Heo, Thomas Gleixner, Vlastimil Babka,
 Waiman Long
Subject: [PATCH 04/27] cpu: Protect against concurrent isolated cpuset change
Date: Fri, 20 Jun 2025 17:22:45 +0200
Message-ID: <20250620152308.27492-5-frederic@kernel.org>
X-Mailer: git-send-email 2.48.1
In-Reply-To: <20250620152308.27492-1-frederic@kernel.org>
References: <20250620152308.27492-1-frederic@kernel.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

_cpu_down() is called through work_on_cpu() on a target contained
within the HK_TYPE_DOMAIN cpumask. But that cpumask will soon also
integrate the cpuset isolated partitions, and some synchronization is
needed to make sure that the work_on_cpu() doesn't execute or keep
running on an isolated CPU.

Unfortunately housekeeping_lock() can't be held before the call to
work_on_cpu() because _cpu_down() afterwards holds cpu_hotplug_lock.
This would be a lock inversion:

cpu_down()                                   cpuset
---------                                    ------
percpu_down_read(&housekeeping_pcpu_lock);   percpu_down_read(&cpu_hotplug_lock);
percpu_down_write(&cpu_hotplug_lock);        percpu_down_write(&housekeeping_pcpu_lock);

To solve this, write-lock cpu_hotplug_lock around the call to
work_on_cpu(). This prevents cpuset from modifying the housekeeping
cpumask and therefore synchronizes against HK_TYPE_DOMAIN cpumask
changes.

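The inversion above can be checked mechanically with a toy lock-order
graph, in the spirit of what lockdep does but far simpler. This is a
hedged sketch: the enum names, record_path() and has_inversion() are
invented for the example, and it ignores the read/write distinction,
treating any two paths that take the same pair of locks in opposite
order as an inversion.

```c
#include <assert.h>
#include <stdbool.h>

/* The two locks involved; identifiers are local to this sketch. */
enum lock_id { HOUSEKEEPING_LOCK, CPU_HOTPLUG_LOCK, NR_LOCKS };

/* before[a][b] is true if some path takes lock a and then lock b. */
struct order_graph {
	bool before[NR_LOCKS][NR_LOCKS];
};

/* Record the acquisition order of one code path. */
static void record_path(struct order_graph *g, const enum lock_id *locks, int n)
{
	for (int i = 0; i < n; i++) {
		assert(locks[i] < NR_LOCKS);
		for (int j = i + 1; j < n; j++)
			g->before[locks[i]][locks[j]] = true;
	}
}

/* An inversion exists if two locks are taken in both orders. */
static bool has_inversion(const struct order_graph *g)
{
	for (int a = 0; a < NR_LOCKS; a++)
		for (int b = a + 1; b < NR_LOCKS; b++)
			if (g->before[a][b] && g->before[b][a])
				return true;
	return false;
}

/* Rejected approach: housekeeping_lock() held around work_on_cpu(). */
static bool naive_ordering_inverts(void)
{
	struct order_graph g = { 0 };
	const enum lock_id cpu_down_path[] = { HOUSEKEEPING_LOCK, CPU_HOTPLUG_LOCK };
	const enum lock_id cpuset_path[] = { CPU_HOTPLUG_LOCK, HOUSEKEEPING_LOCK };

	record_path(&g, cpu_down_path, 2);
	record_path(&g, cpuset_path, 2);
	return has_inversion(&g);
}

/* This patch: cpu_down() only takes cpu_hotplug_lock. */
static bool patched_ordering_inverts(void)
{
	struct order_graph g = { 0 };
	const enum lock_id cpu_down_path[] = { CPU_HOTPLUG_LOCK };
	const enum lock_id cpuset_path[] = { CPU_HOTPLUG_LOCK, HOUSEKEEPING_LOCK };

	record_path(&g, cpu_down_path, 1);
	record_path(&g, cpuset_path, 2);
	return has_inversion(&g);
}
```

In this model the rejected ordering reports an inversion, while the
ordering used by this patch does not, because cpu_down() never takes
the housekeeping lock at all.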
Signed-off-by: Frederic Weisbecker --- kernel/cpu.c | 44 ++++++++++++++++++++++++++++++-------------- 1 file changed, 30 insertions(+), 14 deletions(-) diff --git a/kernel/cpu.c b/kernel/cpu.c index a59e009e0be4..069fce6c7eae 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -1398,8 +1398,8 @@ static int cpuhp_down_callbacks(unsigned int cpu, str= uct cpuhp_cpu_state *st, } =20 /* Requires cpu_add_remove_lock to be held */ -static int __ref _cpu_down(unsigned int cpu, int tasks_frozen, - enum cpuhp_state target) +static int __ref _cpu_down_locked(unsigned int cpu, int tasks_frozen, + enum cpuhp_state target) { struct cpuhp_cpu_state *st =3D per_cpu_ptr(&cpuhp_state, cpu); int prev_state, ret =3D 0; @@ -1410,8 +1410,6 @@ static int __ref _cpu_down(unsigned int cpu, int task= s_frozen, if (!cpu_present(cpu)) return -EINVAL; =20 - cpus_write_lock(); - cpuhp_tasks_frozen =3D tasks_frozen; =20 prev_state =3D cpuhp_set_state(cpu, st, target); @@ -1427,14 +1425,14 @@ static int __ref _cpu_down(unsigned int cpu, int ta= sks_frozen, * return the error code.. */ if (ret) - goto out; + return ret; =20 /* * We might have stopped still in the range of the AP hotplug * thread. Nothing to do anymore. 
*/ if (st->state > CPUHP_TEARDOWN_CPU) - goto out; + return ret; =20 st->target =3D target; } @@ -1452,9 +1450,6 @@ static int __ref _cpu_down(unsigned int cpu, int task= s_frozen, } } =20 -out: - cpus_write_unlock(); - arch_smt_update(); return ret; } =20 @@ -1463,16 +1458,17 @@ struct cpu_down_work { enum cpuhp_state target; }; =20 -static long __cpu_down_maps_locked(void *arg) +static long __cpu_down_locked_work(void *arg) { struct cpu_down_work *work =3D arg; =20 - return _cpu_down(work->cpu, 0, work->target); + return _cpu_down_locked(work->cpu, 0, work->target); } =20 static int cpu_down_maps_locked(unsigned int cpu, enum cpuhp_state target) { struct cpu_down_work work =3D { .cpu =3D cpu, .target =3D target, }; + int err; =20 /* * If the platform does not support hotplug, report it explicitly to @@ -1483,17 +1479,24 @@ static int cpu_down_maps_locked(unsigned int cpu, e= num cpuhp_state target) if (cpu_hotplug_disabled) return -EBUSY; =20 + err =3D -EBUSY; + /* * Ensure that the control task does not run on the to be offlined * CPU to prevent a deadlock against cfs_b->period_timer. * Also keep at least one housekeeping cpu onlined to avoid generating - * an empty sched_domain span. + * an empty sched_domain span. Hotplug must be locked already to prevent + * cpusets from concurrently changing the housekeeping mask. 
*/ + cpus_write_lock(); for_each_cpu_and(cpu, cpu_online_mask, housekeeping_cpumask(HK_TYPE_DOMAI= N)) { if (cpu !=3D work.cpu) - return work_on_cpu(cpu, __cpu_down_maps_locked, &work); + err =3D work_on_cpu(cpu, __cpu_down_locked_work, &work); } - return -EBUSY; + cpus_write_unlock(); + arch_smt_update(); + + return err; } =20 static int cpu_down(unsigned int cpu, enum cpuhp_state target) @@ -1896,6 +1899,19 @@ void __init bringup_nonboot_cpus(unsigned int max_cp= us) #ifdef CONFIG_PM_SLEEP_SMP static cpumask_var_t frozen_cpus; =20 +static int __ref _cpu_down(unsigned int cpu, int tasks_frozen, + enum cpuhp_state target) +{ + int err; + + cpus_write_lock(); + err =3D _cpu_down_locked(cpu, tasks_frozen, target); + cpus_write_unlock(); + arch_smt_update(); + + return err; +} + int freeze_secondary_cpus(int primary) { int cpu, error =3D 0; --=20 2.48.1 From nobody Thu Oct 9 03:14:49 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CEB522BFC85 for ; Fri, 20 Jun 2025 15:23:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433007; cv=none; b=Rnk0izO4BbtGS2o08cRhO4Dikm8zemFQM/aWUFmDH00VePTg3FzeRjVaR9eFALueheCOFmdZVC0HG8V/h1AZ+4pG81eVIPBdi6jp3ycGmsMvdFP5/qfAHF5AYLqHjqODMMRzoEdsiZGsk02cFXAUTL2bOcMqSVwEHsMVdDkfbUM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433007; c=relaxed/simple; bh=JN4dJiSi2xi+VDUMdEVl1DDIsCOCidPcCcz4BH7+O6s=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Ozl/9OEotKU4k9xjCbDiig6EbkMX2iseUuQMne3Vlj9ugd6IgRGzMJZrH+AD0FCe2vcsTGna2CZS3a7IcaiDCAbNNUkJYOVuOsr/MY1AE0Z2IoIcFlvN7naFBYNWaPDiJEk7mDSg0p+bHeL9tjlaKZOBCZR/YgfiqyXw8G+1iKo= 
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
 dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org
 header.b=Z6o4EKMF; arc=none smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org;
 dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org
 header.b="Z6o4EKMF"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id A0D39C4CEE3;
 Fri, 20 Jun 2025 15:23:24 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
 s=k20201202; t=1750433007;
 bh=JN4dJiSi2xi+VDUMdEVl1DDIsCOCidPcCcz4BH7+O6s=;
 h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
 b=Z6o4EKMFzmpVk9JrAFX395+lwSLETLDqKbF4Og9vl6696muk+th+eA7LJ8nyAZw9H
 VyPJdy4vzbW6mQGJB3cZppL7VMPoo4m2NsYz5Y8yKt51IP4SQKQnPtFZAauDrwUqZ9
 y56ui5ihbt8CD2lTpf401Iy37KIcv/u/ZHuWV9pBBxt/+C+n8L4Naa67UqQE0TRxUs
 c5VKHxbJyTiWFFDliZgetGDDLMZE8q/f4Bgd7ywbX2Qlq7s4it0LSWpWVkIor2h6TU
 u2T0aHAXvfJIx7WI1alDWAbJu6bGAJl7+ngFgzYu+mqeZM8NKiPf0IuGdQsVnL7VeK
 F+K8RS32IQLVw==
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Andrew Morton, Johannes Weiner,
 Marco Crivellari, Michal Hocko, Michal Hocko, Muchun Song,
 Peter Zijlstra, Roman Gushchin, Shakeel Butt, Tejun Heo,
 Thomas Gleixner, Vlastimil Babka, Waiman Long
Subject: [PATCH 05/27] memcg: Prepare to protect against concurrent isolated cpuset change
Date: Fri, 20 Jun 2025 17:22:46 +0200
Message-ID: <20250620152308.27492-6-frederic@kernel.org>
X-Mailer: git-send-email 2.48.1
In-Reply-To: <20250620152308.27492-1-frederic@kernel.org>
References: <20250620152308.27492-1-frederic@kernel.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

The HK_TYPE_DOMAIN housekeeping cpumask will soon be made modifiable at
runtime. To synchronize against the memcg workqueue and make sure that
no asynchronous draining is pending or executing on a newly isolated
CPU, read-lock the housekeeping rwsem while targeting and queueing a
drain work.

Whenever housekeeping updates the HK_TYPE_DOMAIN cpumask, a memcg
workqueue flush will also be issued, in a further change, to make sure
that no work remains pending after a CPU has been made isolated.

Signed-off-by: Frederic Weisbecker
Acked-by: Shakeel Butt
---
 mm/memcontrol.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 902da8a9c643..29d44af6c426 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1975,6 +1975,14 @@ static bool is_memcg_drain_needed(struct memcg_stock=
_pcp *stock,
 	return flush;
 }
=20
+static void schedule_drain_work(int cpu, struct work_struct *work)
+{
+	housekeeping_lock();
+	if (!cpu_is_isolated(cpu))
+		schedule_work_on(cpu, work);
+	housekeeping_unlock();
+}
+
 /*
  * Drains all per-CPU charge caches for given root_memcg resp. subtree
  * of the hierarchy under it.
@@ -2004,8 +2012,8 @@ void drain_all_stock(struct mem_cgroup *root_memcg) &memcg_st->flags)) { if (cpu =3D=3D curcpu) drain_local_memcg_stock(&memcg_st->work); - else if (!cpu_is_isolated(cpu)) - schedule_work_on(cpu, &memcg_st->work); + else + schedule_drain_work(cpu, &memcg_st->work); } =20 if (!test_bit(FLUSHING_CACHED_CHARGE, &obj_st->flags) && @@ -2014,8 +2022,8 @@ void drain_all_stock(struct mem_cgroup *root_memcg) &obj_st->flags)) { if (cpu =3D=3D curcpu) drain_local_obj_stock(&obj_st->work); - else if (!cpu_is_isolated(cpu)) - schedule_work_on(cpu, &obj_st->work); + else + schedule_drain_work(cpu, &obj_st->work); } } migrate_enable(); --=20 2.48.1 From nobody Thu Oct 9 03:14:49 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2CA512C0333 for ; Fri, 20 Jun 2025 15:23:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433010; cv=none; b=ZnmWpwIodDHCyzqZxZNSy1jOqlhVh43laYTCYIOlqG+jwgwDw+stksL/M3a1Qwv2SBoI4HgncHoLvaNhLJwL9T37vfQCvIZDoWibxYWnsrJ9sZxt2SDW17xfgSZq52b6Bw3Co73t1NoepKKwEpnug1IfPhQoWKwHnrACcQoFIr8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433010; c=relaxed/simple; bh=pRfQHigt/JUjt7Ua7aNjiQ8K2ZKMtGoq+XwcBsIjDcA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=m84qQnYPrwayToNoQizBoZ2xl29yoYiZdvhrO/MsrV0cmClaUUd2Iuas0IEM7ovnpJMJY1vorhICG57tcMjh+i1xvgL0VSoI5VNIuNSkwrwDGVDqCHPjVfP3cTaczBnchzFyGAbfvD1hF8Kpq3E8J3p9ChtnjaE8Nimc142XmO0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=LMOxLQew; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: 
smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="LMOxLQew" Received: by smtp.kernel.org (Postfix) with ESMTPSA id DA057C4CEEF; Fri, 20 Jun 2025 15:23:27 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1750433010; bh=pRfQHigt/JUjt7Ua7aNjiQ8K2ZKMtGoq+XwcBsIjDcA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=LMOxLQew+8SD0QmKJR5IFcYSv9Ibc0MeI6Xv2VKk19HsGigM1Qm9amtVtp5hj2fQ0 pEslfbicQSZ8ekF5i02k2hUdg7d/IxxuoWmadNan3ALw9niHU6ev9IS9Jh+X6DX9I7 c9cZBpnrwRyJNmmX2hjp0ka9mnnRCWkK2HnlJp4QNZzl3ut8tlFbRu5Zd+UsKaoc7+ 0jazqNeXyXTyz9G3bIEjDfJ0BolBqWY4EhiqHoMS9Ix/cI8P2zxdt2rdvgywUPAZId kkX08YtsIcBB4EnRPi2j1vEDOpAm1DJV2lK7q2DCJUq9clpBDPKsRT0Kh4wAOx2Cb+ qkiMFu2Px64ZQ== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , Andrew Morton , Marco Crivellari , Michal Hocko , Peter Zijlstra , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , linux-mm@kvack.org Subject: [PATCH 06/27] mm: vmstat: Prepare to protect against concurrent isolated cpuset change Date: Fri, 20 Jun 2025 17:22:47 +0200 Message-ID: <20250620152308.27492-7-frederic@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250620152308.27492-1-frederic@kernel.org> References: <20250620152308.27492-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The HK_TYPE_DOMAIN housekeeping cpumask will soon be made modifiable at runtime. In order to synchronize against the vmstat workqueue and make sure that no asynchronous vmstat work is pending or executing on a newly isolated CPU, read-lock the housekeeping rwsem while targeting and queueing a vmstat work.
Whenever housekeeping updates the HK_TYPE_DOMAIN cpumask, a vmstat workqueue flush will also be issued (in a further change) to make sure that no work remains pending after a CPU has been isolated. Signed-off-by: Frederic Weisbecker --- mm/vmstat.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/mm/vmstat.c b/mm/vmstat.c index 429ae5339bfe..53123675fe31 100644 --- a/mm/vmstat.c +++ b/mm/vmstat.c @@ -2115,11 +2115,13 @@ static void vmstat_shepherd(struct work_struct *w) * infrastructure ever noticing. Skip regular flushing from vmstat_sheph= erd * for all isolated CPUs to avoid interference with the isolated workloa= d. */ - if (cpu_is_isolated(cpu)) - continue; + scoped_guard(housekeeping) { + if (cpu_is_isolated(cpu)) + continue; =20 - if (!delayed_work_pending(dw) && need_update(cpu)) - queue_delayed_work_on(cpu, mm_percpu_wq, dw, 0); + if (!delayed_work_pending(dw) && need_update(cpu)) + queue_delayed_work_on(cpu, mm_percpu_wq, dw, 0); + } =20 cond_resched(); } --=20 2.48.1 From nobody Thu Oct 9 03:14:49 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C23C12E54CF for ; Fri, 20 Jun 2025 15:23:32 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433012; cv=none; b=gikun+fVh2oiFMo2RGRLSiDosVaCGvNVeJifswmVcdNF9nF+TJeTnZDfEOTwNXMfPgaGQ70Um0Y11q3O6UNZKjcuWKSOxznHlRSGVjfZ6+A9eEV7F+flvh+kH0f5xqrC8ErTh3M4AnEyuR5vBrofQTsEyXauc1HEiJTZDWYlVbo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433012; c=relaxed/simple; bh=5ZDs0SoJ8yatsvNiDpfTaIQZ7Et15dC+wW8JUAJyaC4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version;
b=oIU+o20tlQMWcUfduFDbbKNvHPqMv7fsYPGepT5gpAEadIee805vofv7/W3fzlrgtDIizL4V/1voCboVvakOJTOl/qgSGbv5/X+lxcTzhInfXUtcAviX0p+DGXX4sgLp1QySVlColEBsoHYnHwaUsBBiQRWeint8ZUz9rp0Jq9g= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=N0GoWGit; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="N0GoWGit" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 6D8A3C4CEF1; Fri, 20 Jun 2025 15:23:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1750433012; bh=5ZDs0SoJ8yatsvNiDpfTaIQZ7Et15dC+wW8JUAJyaC4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=N0GoWGitkAAPEFD4VB5WV0lsjJeMMqPW/RjtPbDmKWgyegn813ztRSTvejygMwPe+ KngmwpDKwqSpbfjIrvCAc45lgKRgkrdqvaoBqgbO0GT6zY5oS9p/VZmpZe2CPVSHl5 N7iv8Ke1n7/qP4Vg56o0C/5E6EVGmPIQwZJKE0ssVJyBeEHXvjd82QZUy8Jkm1zfid +pnMSNJ/3noGsHybSfuXSHN3WW3tIIBolDNQ9f5CY5PBKtLZbuYC0HpIi2hVHU81Z9 wX0FI4cfDMgGL8Z1/8I0KBgvuL7n5VjBePTEpi4jRBhdiB8LnB53fCYteEM5ei8bH+ 5H28P9i3kAEJA== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , Ingo Molnar , Marco Crivellari , Michal Hocko , Peter Zijlstra , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long Subject: [PATCH 07/27] sched/isolation: Save boot defined domain flags Date: Fri, 20 Jun 2025 17:22:48 +0200 Message-ID: <20250620152308.27492-8-frederic@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250620152308.27492-1-frederic@kernel.org> References: <20250620152308.27492-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" HK_TYPE_DOMAIN will soon integrate not only boot defined isolcpus=3D CPUs but also cpuset isolated partitions. 
Housekeeping still needs a way to record what was initially passed to isolcpus=3D in order to keep these CPUs isolated after a cpuset isolated partition is modified or destroyed while containing some of them. Create a new HK_TYPE_DOMAIN_BOOT to keep track of those. Signed-off-by: Frederic Weisbecker --- include/linux/sched/isolation.h | 1 + kernel/sched/isolation.c | 5 +++-- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolatio= n.h index 8de4f625a5c1..731506d312d2 100644 --- a/include/linux/sched/isolation.h +++ b/include/linux/sched/isolation.h @@ -7,6 +7,7 @@ #include =20 enum hk_type { + HK_TYPE_DOMAIN_BOOT, HK_TYPE_DOMAIN, HK_TYPE_MANAGED_IRQ, HK_TYPE_KERNEL_NOISE, diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c index 8c02eeccea3b..9ecf53c5328b 100644 --- a/kernel/sched/isolation.c +++ b/kernel/sched/isolation.c @@ -9,6 +9,7 @@ */ =20 enum hk_flags { + HK_FLAG_DOMAIN_BOOT =3D BIT(HK_TYPE_DOMAIN_BOOT), HK_FLAG_DOMAIN =3D BIT(HK_TYPE_DOMAIN), HK_FLAG_MANAGED_IRQ =3D BIT(HK_TYPE_MANAGED_IRQ), HK_FLAG_KERNEL_NOISE =3D BIT(HK_TYPE_KERNEL_NOISE), @@ -214,7 +215,7 @@ static int __init housekeeping_isolcpus_setup(char *str) =20 if (!strncmp(str, "domain,", 7)) { str +=3D 7; - flags |=3D HK_FLAG_DOMAIN; + flags |=3D HK_FLAG_DOMAIN | HK_FLAG_DOMAIN_BOOT; continue; } =20 @@ -244,7 +245,7 @@ static int __init housekeeping_isolcpus_setup(char *str) =20 /* Default behaviour for isolcpus without flags */ if (!flags) - flags |=3D HK_FLAG_DOMAIN; + flags |=3D HK_FLAG_DOMAIN | HK_FLAG_DOMAIN_BOOT; =20 return housekeeping_setup(str, flags); } --=20 2.48.1 From nobody Thu Oct 9 03:14:49 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C60962BF3F3; Fri, 20 Jun 2025 15:23:35 +0000 (UTC) 
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433018; cv=none; b=ON00xh/1qT1FnlKAwaxd3NpWQhZoDMnc0O/cTqUw2qNGgit64rQ8H6F5oUa3M+HyfFXmsEOOAxnbk+k9NMj6U6cwTrGW9BHYoexJ9uV5+suS+tu/ESGvKQyHkxlR4c9nhLWgRWvcbp8uaZDNvKAg+3i7M2t0v7ZPPCp/RmI75LA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433018; c=relaxed/simple; bh=NgFkxLsB6pI1LPP33IKIqTl6H68LrR1CV3Ob+pEb3ew=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Ilf3djERkwlPWI7ZtRwK1yFlGDHqHImahUnrkIVPCaioOOS5Wq6OBrFL5DUdM6z9qDU/bhKOJXPg072U1/RqR6lRJHqMHYzsgRSmHRG1kmtEotW0ni0lJOCZ9M81jMd3AxcOHk96nZQD2B1u7LSaowZNjwevnxha62RYgc1cipY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=SHWVq1SJ; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="SHWVq1SJ" Received: by smtp.kernel.org (Postfix) with ESMTPSA id D1DACC4CEEF; Fri, 20 Jun 2025 15:23:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1750433015; bh=NgFkxLsB6pI1LPP33IKIqTl6H68LrR1CV3Ob+pEb3ew=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=SHWVq1SJJifIxnLZZcRlQAtcXftd0wXnBdJiGBYnOCYXLHK9VFRD4/KRSXH1SxKqY HlTFixp5qIj9M+ywxx5kV8TxwZMcjBVeOidzjOwAJzqiQZlN2e67Qi4C9+U0zZdTrY sCcPy7iccNIH40lgQ/Qc6m3csyCMzOcyCGRS17i/0HeGy1AFsl7Y2JlRq5zj+o2Xxy Ds4F1NoC19qtfvEf0/zat5SJ4nIcyqKTanqsDaC8camPLkvDq/jpfLPCIBuxRARtAY dqIb/uWBfL/A/SBFExyIengU3xEyNURXCwKcHYCNyMoDJy7UDSUj/Mk8NIcKIZCcqf E+2FWRBekiybQ== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , Johannes Weiner , Marco Crivellari , Michal Hocko , Michal Koutny , Peter Zijlstra , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , cgroups@vger.kernel.org Subject: [PATCH 08/27] 
cpuset: Convert boot_hk_cpus to use HK_TYPE_DOMAIN_BOOT Date: Fri, 20 Jun 2025 17:22:49 +0200 Message-ID: <20250620152308.27492-9-frederic@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250620152308.27492-1-frederic@kernel.org> References: <20250620152308.27492-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" boot_hk_cpus is an ad-hoc copy of HK_TYPE_DOMAIN_BOOT. Remove it and use the official version. Signed-off-by: Frederic Weisbecker --- kernel/cgroup/cpuset.c | 22 +++++++--------------- 1 file changed, 7 insertions(+), 15 deletions(-) diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c index 3bc4301466f3..aae8a739d48d 100644 --- a/kernel/cgroup/cpuset.c +++ b/kernel/cgroup/cpuset.c @@ -80,12 +80,6 @@ static cpumask_var_t subpartitions_cpus; */ static cpumask_var_t isolated_cpus; =20 -/* - * Housekeeping (HK_TYPE_DOMAIN) CPUs at boot - */ -static cpumask_var_t boot_hk_cpus; -static bool have_boot_isolcpus; - /* List of remote partition root children */ static struct list_head remote_children; =20 @@ -1601,15 +1595,16 @@ static void remote_cpus_update(struct cpuset *cs, s= truct cpumask *xcpus, * @new_cpus: cpu mask * Return: true if there is conflict, false otherwise * - * CPUs outside of boot_hk_cpus, if defined, can only be used in an + * CPUs outside of HK_TYPE_DOMAIN_BOOT, if defined, can only be used in an * isolated partition. 
*/ static bool prstate_housekeeping_conflict(int prstate, struct cpumask *new= _cpus) { - if (!have_boot_isolcpus) + if (!housekeeping_enabled(HK_TYPE_DOMAIN_BOOT)) return false; =20 - if ((prstate !=3D PRS_ISOLATED) && !cpumask_subset(new_cpus, boot_hk_cpus= )) + if ((prstate !=3D PRS_ISOLATED) && + !cpumask_subset(new_cpus, housekeeping_cpumask(HK_TYPE_DOMAIN_BOOT))) return true; =20 return false; @@ -3766,12 +3761,9 @@ int __init cpuset_init(void) =20 BUG_ON(!alloc_cpumask_var(&cpus_attach, GFP_KERNEL)); =20 - have_boot_isolcpus =3D housekeeping_enabled(HK_TYPE_DOMAIN); - if (have_boot_isolcpus) { - BUG_ON(!alloc_cpumask_var(&boot_hk_cpus, GFP_KERNEL)); - cpumask_copy(boot_hk_cpus, housekeeping_cpumask(HK_TYPE_DOMAIN)); - cpumask_andnot(isolated_cpus, cpu_possible_mask, boot_hk_cpus); - } + if (housekeeping_enabled(HK_TYPE_DOMAIN_BOOT)) + cpumask_andnot(isolated_cpus, cpu_possible_mask, + housekeeping_cpumask(HK_TYPE_DOMAIN_BOOT)); =20 return 0; } --=20 2.48.1 From nobody Thu Oct 9 03:14:49 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DC1062E62A9 for ; Fri, 20 Jun 2025 15:23:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433019; cv=none; b=hyeQEH2V7UiunwddqZYUMoDtRyoF4Vt2CbxxHFOXEi0T9ASiN92v9Kf3aY/jFh7kI/YFS4L9/E7qQf0iAr+4XCcU/DJ2H9JNfiWh3R9p7rmRBHM9Yc2AupQSzfQfUvM0EM+ruQ5A7n1ABC2KVMPm+Z4zvg9mhJYtwpqY5clTShA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433019; c=relaxed/simple; bh=HF1OH2zDz94/TKNfz4B/0f8tym9eHNw4m6GNJbNVVxI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=KM5L66Pes2KwDzfF4CWjZA4DsZew0gho0Roe+HqGaI7eSqt+MimWSa7pjEXtkNNWY00b9FWmjefgc3K+NBJNzuYuaWXcHVZVLm/X+aicxE+sb9MPDd8+3iPVycSe7bjiALPK9s9+hbz6b5q3/d+4xW8fl5e2Ht72998aR85se4A= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=IGvgDhGS; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="IGvgDhGS" Received: by smtp.kernel.org (Postfix) with ESMTPSA id A4BA8C4CEE3; Fri, 20 Jun 2025 15:23:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1750433018; bh=HF1OH2zDz94/TKNfz4B/0f8tym9eHNw4m6GNJbNVVxI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=IGvgDhGSh0TvCXsOeO1GpLMdPeOFczr4GMtI5p3sbHeuLi6+EaqxODmmCm0As+62b kZP4WXcPXCLAchjL+5hOgHU+/DoUh0oJ3pICd+QsJ6cPlcFl0usAN3F8AqN5PksGwB GUJLfTOBWRoUVcU5NBdLK5yr0yxB7T9DCm990Bn4qJgFwhP7Z2gBQlnHgrczt1uXS7 EHWhCbFzynCsgYB5mhZ0VcAncH09lxE+kGd2NPX/NMJ1a3o+FUN3ZEN0HS5GpaYcr8 dBtR+Vs+48otzhHnBmqO7Q+08i/QY47BkoF5ZDbRXr0XJgB/+ZpZbyEUGu5Ar+rZsO +pC6remEEaXpA== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , Danilo Krummrich , Greg Kroah-Hartman , Marco Crivellari , Michal Hocko , Peter Zijlstra , "Rafael J . 
Wysocki" , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long Subject: [PATCH 09/27] driver core: cpu: Convert /sys/devices/system/cpu/isolated to use HK_TYPE_DOMAIN_BOOT Date: Fri, 20 Jun 2025 17:22:50 +0200 Message-ID: <20250620152308.27492-10-frederic@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250620152308.27492-1-frederic@kernel.org> References: <20250620152308.27492-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Make sure /sys/devices/system/cpu/isolated only prints what was passed through the isolcpus=3D parameter before HK_TYPE_DOMAIN will also integrate cpuset isolated partitions. Signed-off-by: Frederic Weisbecker --- drivers/base/cpu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c index 7779ab0ca7ce..e1663021fe24 100644 --- a/drivers/base/cpu.c +++ b/drivers/base/cpu.c @@ -291,7 +291,7 @@ static ssize_t print_cpus_isolated(struct device *dev, return -ENOMEM; =20 cpumask_andnot(isolated, cpu_possible_mask, - housekeeping_cpumask(HK_TYPE_DOMAIN)); + housekeeping_cpumask(HK_TYPE_DOMAIN_BOOT)); len =3D sysfs_emit(buf, "%*pbl\n", cpumask_pr_args(isolated)); =20 free_cpumask_var(isolated); --=20 2.48.1 From nobody Thu Oct 9 03:14:49 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E21EE2E9757; Fri, 20 Jun 2025 15:23:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433022; cv=none; 
b=u4y9T1GC+nojC5XPMMcNQevljtwVGiN71Glycb6uPt/ooB7Mk9dPHC/Yn4CXYJFLD+LozzvV1fnvAVcoeYT61Kss9i9fYlAeIvEdp94X+B847y6ezyGLOFp1fRRqH+eKCwfT5LkQJ0sFVhXfwWXV1ZFqwMPM1QxLZNlMXmogEog= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433022; c=relaxed/simple; bh=f3r5MbCT/30gnOvcMmKLC7WpxctD7ruo9tQpgEPXK2A=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=u0AlQ/LLZx305cTwhQd7OprWS0Op2fa2MLSjTvMy9MD74Df7IbddBTzfsho1yRR/vpEthJxqATiXfFjaq6rr+34X2xPQ5+/0coDFMwvyS3gVVAY88JO/YNoj7mdhzn3OznguK384chB5uwvGHZh7cp+rYD10yIYVmzy6pL88xXY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=feLXvdLh; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="feLXvdLh" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9AB88C4CEF0; Fri, 20 Jun 2025 15:23:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1750433021; bh=f3r5MbCT/30gnOvcMmKLC7WpxctD7ruo9tQpgEPXK2A=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=feLXvdLhX9uApMu/guZIrApeAeKX7aSmn1jUjZ/tc9/dHFdP/ahFnNp2bOAiomoHj 6Xko+cB3X8DkuD5SXft7sd1Dg9TtHtPDQE2wxOXQD0AbQgtCW+2PuS7teT8/27Jw3l sf+kPPhAveRm+00kwQy62VuZDqF13sx6sMR6HLQQjY6mlGxxbV0F4Pdt1N/VvEEBtf QP71HpN4uhHMhJD5AI2IwCVXwgWabOKPVMeP0//gXbYWsR2o2g56jSDufJKL0bgSQI LCDBzypCuwc+nisePUJMF1uvukXSSGDLsaXTvgoweQD26oHO1djgOj2v2fXLZfMlvK RKCM8T4RKbAJg== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , "David S . 
Miller" , Eric Dumazet , Jakub Kicinski , Marco Crivellari , Michal Hocko , Paolo Abeni , Peter Zijlstra , Simon Horman , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , netdev@vger.kernel.org Subject: [PATCH 10/27] net: Keep ignoring isolated cpuset change Date: Fri, 20 Jun 2025 17:22:51 +0200 Message-ID: <20250620152308.27492-11-frederic@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250620152308.27492-1-frederic@kernel.org> References: <20250620152308.27492-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The RPS cpumask can be overridden through sysfs/sysctl. The boot-defined isolated CPUs are then excluded from that cpumask. However, HK_TYPE_DOMAIN will soon integrate cpuset isolated CPU updates, and the RPS infrastructure needs more thought to be able to propagate such changes and synchronize against them. Keep handling only what was passed through "isolcpus=3D" for now.
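For illustration only (not part of the patch): the filtering that rps_cpumask_housekeeping() applies can be modelled in userspace with plain 64-bit masks. This is a minimal sketch, assuming one bit per CPU. rps_mask_filter(), hk_domain_boot and hk_wq are hypothetical stand-ins for the kernel function and for housekeeping_cpumask(HK_TYPE_DOMAIN_BOOT) and housekeeping_cpumask(HK_TYPE_WQ).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Userspace model only: bit N set in a mask means CPU N is included.
 * hk_domain_boot and hk_wq are hypothetical stand-ins for the kernel's
 * housekeeping cpumasks; they are not kernel definitions.
 */
int rps_mask_filter(uint64_t *mask, uint64_t hk_domain_boot, uint64_t hk_wq)
{
	if (*mask != 0) {
		/* Keep only CPUs that are housekeeping for both types. */
		*mask &= hk_domain_boot;
		*mask &= hk_wq;
		/* Nothing left to steer packets to: reject the request. */
		if (*mask == 0)
			return -EINVAL;
	}
	return 0;
}

/* Usage example: CPUs 2-3 are boot-isolated, CPUs 0-1 are housekeeping. */
void rps_mask_filter_selfcheck(void)
{
	uint64_t requested = 0xF;	/* user asks for CPUs 0-3 */
	uint64_t only_isolated = 0xC;	/* user asks for CPUs 2-3 only */
	uint64_t empty = 0;		/* empty mask is passed through */

	assert(rps_mask_filter(&requested, 0x3, 0xFF) == 0);
	assert(requested == 0x3);
	assert(rps_mask_filter(&only_isolated, 0x3, 0xFF) == -EINVAL);
	assert(rps_mask_filter(&empty, 0x3, 0xFF) == 0);
	assert(empty == 0);
}
```

Because the model only consults the boot-defined mask, a later cpuset partition change would not alter the result, which mirrors the behaviour this patch intends to keep.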
Signed-off-by: Frederic Weisbecker --- net/core/net-sysfs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c index 1ace0cd01adc..abff68ac34ec 100644 --- a/net/core/net-sysfs.c +++ b/net/core/net-sysfs.c @@ -1012,7 +1012,7 @@ static int netdev_rx_queue_set_rps_mask(struct netdev= _rx_queue *queue, int rps_cpumask_housekeeping(struct cpumask *mask) { if (!cpumask_empty(mask)) { - cpumask_and(mask, mask, housekeeping_cpumask(HK_TYPE_DOMAIN)); + cpumask_and(mask, mask, housekeeping_cpumask(HK_TYPE_DOMAIN_BOOT)); cpumask_and(mask, mask, housekeeping_cpumask(HK_TYPE_WQ)); if (cpumask_empty(mask)) return -EINVAL; --=20 2.48.1 From nobody Thu Oct 9 03:14:49 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 87CCD2E9ED8; Fri, 20 Jun 2025 15:23:44 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433024; cv=none; b=hPqeF9oisZIRGx1lUlwtp8vnTtJS60eR7op7HGE408dRfQNTZBcf7tlg4BvdiAiSTGTBnhOyfN90K6CmxWrB4oZBwEbn1G1A2u64O4JvnhJee/ICi5+fSaPPT3h7Rlb9T4e1vKqMlMa0O5lnj0V+ztELETrgETqKodbl1GSQGMs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433024; c=relaxed/simple; bh=ClR6PLlwiWDX0h6VoGwzQNxAgSSkzMHBGG+huRh08BY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=FIljYmmZY3EXv+ImtXd37A6bEQ08QXUoY1tr12VBKSbzIHTRuEtv6hfkNrv1vnsYBXmaeCCVZd1nSxbFNSGNwEyNqn4lPouPd71rYww2avrhW65KTMooUGwI3PVfKwJAvzO9eYnbCqq0AaBFPWZAKm+UOEDxLNrFiNOIf30K7I8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=XsBsaCQz; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: 
smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="XsBsaCQz" Received: by smtp.kernel.org (Postfix) with ESMTPSA id DD56FC4CEE3; Fri, 20 Jun 2025 15:23:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1750433024; bh=ClR6PLlwiWDX0h6VoGwzQNxAgSSkzMHBGG+huRh08BY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=XsBsaCQzk80iWl6jugTSRK+uSqRdAnNwPrIjft1wUCtNvTrQdY3qNOjBhdhBnOQzC hMZkqCMBXWeg6WxW/p8udJcUASRauoecVnTF50Q65c33m96ltLJU55C9JWQ0j3BHkJ c/FFKaWl066S5dezCUtXI5AhquhiNcH/serTRyR5x5JQQnNLMOxxwYfxtzRovSvECO FWdjeMVhM1cy6N8VOAdDAAoqzTqc1CB9sez7lpggMOXcDvHifk56P5O0VIcXmDckXd SfSLaflg9hWETejvLKHXZsH7PCvFzD6wQ1Z2nOwDJ6m2/JoVB9c1c9iKH6r04LOZqb pecCtS/rdcTlQ== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , Jens Axboe , Marco Crivellari , Michal Hocko , Peter Zijlstra , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , linux-block@vger.kernel.org Subject: [PATCH 11/27] block: Protect against concurrent isolated cpuset change Date: Fri, 20 Jun 2025 17:22:52 +0200 Message-ID: <20250620152308.27492-12-frederic@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250620152308.27492-1-frederic@kernel.org> References: <20250620152308.27492-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The block subsystem prevents running the workqueue to isolated CPUs, including those defined by cpuset isolated partitions. Since HK_TYPE_DOMAIN will soon contain both and be subject to runtime modifications, synchronize against housekeeping using the relevant lock. For full support of cpuset changes, the block subsystem may need to propagate changes to isolated cpumask through the workqueue in the future. 
Signed-off-by: Frederic Weisbecker --- block/blk-mq.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 4806b867e37d..ece3369825fe 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -4237,12 +4237,16 @@ static void blk_mq_map_swqueue(struct request_queue= *q) =20 /* * Rule out isolated CPUs from hctx->cpumask to avoid - * running block kworker on isolated CPUs + * running block kworker on isolated CPUs. + * FIXME: cpuset should propagate further changes to isolated CPUs + * here. */ + housekeeping_lock(); for_each_cpu(cpu, hctx->cpumask) { if (cpu_is_isolated(cpu)) cpumask_clear_cpu(cpu, hctx->cpumask); } + housekeeping_unlock(); =20 /* * Initialize batch roundrobin counts --=20 2.48.1 From nobody Thu Oct 9 03:14:49 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0646A2BFC79 for ; Fri, 20 Jun 2025 15:23:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433027; cv=none; b=AEPaHXPT3FYLXj+0zbsONWk7WkEU6lHNq3x/RMM462XSxvZvZNDlpBANLRCSV0oEfstvbfKwOX6HQOG/K6QX6kTGxEKg+qXad2k81zkA0PCDxuQ4dOq3q6kx4VEjpcMQjT10MNoBjniIEMHLRjtG6tk7Wteq9QRtPQN3pugjZKA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433027; c=relaxed/simple; bh=iUfaeCF69Nkp/7VspLJNX3subJ+K5BfO1gas842MLb8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=noUZWxMpFWt8U1YHz9GSCnqgWVDG43spgjAAgnHYpoq1XUsBtZrEKP+Syia+lzrnpy2+7qvQ32bbvvHKpK/CjdIN9LmgL1svHCSFHs03aOD8kfrvy3pQMQRDoPrWFaH0y3XwKnuXp+XcAmjMsEr9lXMeVgn8n1Jh/x/RZi4odpE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org 
header.b=H8FRPy2q; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="H8FRPy2q" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 83D3EC4CEF3; Fri, 20 Jun 2025 15:23:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1750433026; bh=iUfaeCF69Nkp/7VspLJNX3subJ+K5BfO1gas842MLb8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=H8FRPy2qNoTKzxtpc7RlnD1OVR7cuEo2NLTc+m75ifE/yaBYm6jotUNMrwbH/dwlH JFH/newSJ7oLWf4qc9dggeU2UUzVtSLZZv4gPJAdQaUGuOfPNx93PMhRah+v63UUC2 AU+Mx0kM2ZoxZUoHHI0oogq+AGiyBcIXJWR/GhrNz99jNRmx+eFwqpigU6tUYu9f0w FU7GXZeMIT860jHe3Hy7kZK2wqV+N9e1ZJ2QsAok/YuRxMTb6q1qKd+9j6PSULPSfC OBL5We/ItwuaNruJJeRZMZyXCDoDyXIUN36/cmzMtY0rzDNTz/4qf4xinqWntMRXxx rx8zsM4E6JEfw== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , Christoph Lameter , Dennis Zhou , Marco Crivellari , Michal Hocko , Peter Zijlstra , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , linux-mm@kvack.org Subject: [PATCH 12/27] cpu: Provide lockdep check for CPU hotplug lock write-held Date: Fri, 20 Jun 2025 17:22:53 +0200 Message-ID: <20250620152308.27492-13-frederic@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250620152308.27492-1-frederic@kernel.org> References: <20250620152308.27492-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" cpuset modifies partitions, including isolated, while holding the cpu hotplug lock read-held. This means that write-holding the CPU hotplug lock is safe to synchronize against housekeeping cpumask changes. Provide a lockdep check to validate that. 
Signed-off-by: Frederic Weisbecker --- include/linux/cpuhplock.h | 1 + include/linux/percpu-rwsem.h | 1 + kernel/cpu.c | 5 +++++ 3 files changed, 7 insertions(+) diff --git a/include/linux/cpuhplock.h b/include/linux/cpuhplock.h index f7aa20f62b87..286b3ab92e15 100644 --- a/include/linux/cpuhplock.h +++ b/include/linux/cpuhplock.h @@ -13,6 +13,7 @@ struct device; =20 extern int lockdep_is_cpus_held(void); +extern int lockdep_is_cpus_write_held(void); =20 #ifdef CONFIG_HOTPLUG_CPU void cpus_write_lock(void); diff --git a/include/linux/percpu-rwsem.h b/include/linux/percpu-rwsem.h index 288f5235649a..c8cb010d655e 100644 --- a/include/linux/percpu-rwsem.h +++ b/include/linux/percpu-rwsem.h @@ -161,6 +161,7 @@ extern void percpu_free_rwsem(struct percpu_rw_semaphor= e *); __percpu_init_rwsem(sem, #sem, &rwsem_key); \ }) =20 +#define percpu_rwsem_is_write_held(sem) lockdep_is_held_type(sem, 0) #define percpu_rwsem_is_held(sem) lockdep_is_held(sem) #define percpu_rwsem_assert_held(sem) lockdep_assert_held(sem) =20 diff --git a/kernel/cpu.c b/kernel/cpu.c index 069fce6c7eae..ccf11a17c7fd 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -533,6 +533,11 @@ int lockdep_is_cpus_held(void) { return percpu_rwsem_is_held(&cpu_hotplug_lock); } + +int lockdep_is_cpus_write_held(void) +{ + return percpu_rwsem_is_write_held(&cpu_hotplug_lock); +} #endif =20 static void lockdep_acquire_cpus_lock(void) --=20 2.48.1 From nobody Thu Oct 9 03:14:49 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 330E42BFC79; Fri, 20 Jun 2025 15:23:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433030; cv=none; 
b=UgQJYP24ix0Q+0YRStKnk7PhcxP6nCozUsDItwy+zeo2+VBf4yx9InmgNNFZ/JTcGKa1YT52vAv9lUA91De+4e0CjnDCzw2wKBg5IhRqo/KaV5Jkx0aw9cPHSraQEAGbDIBwU3T48tGMzBfoSQU8g5eRYY+8JtDp/9+PV7YJ5zM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433030; c=relaxed/simple; bh=6vWppYdTk5e9qrlSezsH+M7RzeU8JEFva4V6UhNLxfY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=p32OhRSPSikHwTJhUPri2ZyIAd1W3GukEia5GCW9g53rfcxflyRuK2ttwntPK5bpMOLZCRPBzsTQ/cvQ0fqe96KJT9xNHG0Z1F47wQgsvLGdV8DIrhHkO2Lix/lYkQPGHRUC3AdYraCnGKTlkhYc3rkgTSD4xLLdHYqHSAYzbV4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=apKUxTLn; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="apKUxTLn" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 4B58BC4CEE3; Fri, 20 Jun 2025 15:23:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1750433029; bh=6vWppYdTk5e9qrlSezsH+M7RzeU8JEFva4V6UhNLxfY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=apKUxTLn3MyBYPHBJGTXh1Ss8p95bBRxsLRZ807pnflMsPa/4oRyNH8Qnh1S35p+F RPnRAFMswu0L+hX0sccHS+OHyFN0pLZUI/Su+KX7jSXO277Yy5d0kiq8YEa3oLhtpu v4VrnaKe0mxUgLYvYv9nMvM4Yop0b4zCIGrJg27grAxrG7C5IC1SL+mSlaQtTGMyAG h0edjuOtpmjIdp07v5w5+iNAJqfZO5Wk8XmgGr4nTH56z9viUjQ2OIRzJw706Zyprw yP3Ou1awhk/GgBcDGkgU/CSoyb4s9mw6f/mnIbfcQo9Xk8jjKiKJWjoB2fFfh7p9VT RkJ6cHOueAdrg== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Johannes Weiner , Marco Crivellari , Michal Hocko , Peter Zijlstra , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , cgroups@vger.kernel.org Subject: [PATCH 13/27] cpuset: Provide lockdep check for cpuset lock held Date: Fri, 20 Jun 2025 17:22:54 +0200 Message-ID: <20250620152308.27492-14-frederic@kernel.org> 
X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250620152308.27492-1-frederic@kernel.org> References: <20250620152308.27492-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" cpuset modifies partitions, including isolated ones, while holding the cpuset mutex. This means that holding the cpuset mutex is sufficient to synchronize against housekeeping cpumask changes. Provide a lockdep check to validate that. Signed-off-by: Frederic Weisbecker --- include/linux/cpuset.h | 2 ++ kernel/cgroup/cpuset.c | 7 +++++++ 2 files changed, 9 insertions(+) diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h index 2ddb256187b5..051d36fec578 100644 --- a/include/linux/cpuset.h +++ b/include/linux/cpuset.h @@ -18,6 +18,8 @@ #include #include =20 +extern bool lockdep_is_cpuset_held(void); + #ifdef CONFIG_CPUSETS =20 /* diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c index aae8a739d48d..8221b6a7da46 100644 --- a/kernel/cgroup/cpuset.c +++ b/kernel/cgroup/cpuset.c @@ -254,6 +254,13 @@ void cpuset_unlock(void) mutex_unlock(&cpuset_mutex); } =20 +#ifdef CONFIG_LOCKDEP +bool lockdep_is_cpuset_held(void) +{ + return lockdep_is_held(&cpuset_mutex); +} +#endif + static DEFINE_SPINLOCK(callback_lock); =20 void cpuset_callback_lock_irq(void) --=20 2.48.1 From nobody Thu Oct 9 03:14:49 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 601AE2EB5CD for ; Fri, 20 Jun 2025 15:23:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433032; cv=none; 
b=aOVLnqHsrhRPGYxIcxM9jy/upgwEqUMKsAV8FLE8rukoWPR4cMU8z1KSQrI5KW7tk+mmo/4D43SCGNzULed8WYynX9Yx5yfl3cjtjUiTV0IxoaYLVpYww8sO4azn2Uu+3PFQv8h+A19n3wPSTcOl1gM0DjDgNrevpMvMbabXuy8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433032; c=relaxed/simple; bh=PxV8uVE16ABO59cZv2sbLkYT2OSz4gIZFG6bkET4xxs=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=onxl1IbP+y0qG5cY7kxaLU5A6Y5zJu8Tj/E1J+kezZSkYb51T1E/c8EciAZ/qE/fcMtmWwJmbLsQFcbblB9Hh0Zf9pjTYaduX/hk6HCnP0rnrlQpCUOl9zDTNE0Ygg/vcW6o/4plA5reJXe8EL6oTtxuwxtzUA2hXLet0Tvbjjs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=c+UPX7Y8; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="c+UPX7Y8" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 34559C4CEEF; Fri, 20 Jun 2025 15:23:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1750433032; bh=PxV8uVE16ABO59cZv2sbLkYT2OSz4gIZFG6bkET4xxs=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=c+UPX7Y8y7agrFwSLTB/Z5E5riFd8+b39XMcNueDLWxdFy5mXGXsgJMQfA2lN+fxR DrPputdojw5lMReE+9WSHGMSixf6OWWCdq5HNuIBTfVYVAfvAACZx3liCPVqoanGh5 PXSOu9yl7+GNBhznIa57FiDsUTOlwVeSfbZrcbUWEwY1hKS1wwyXTX4hGoioTKdgsY ZFU6f46jprATJFLLHFaFpDtHYvpTPwvyDrpLW5wB/OPMMLyP1M8/k/DGkxIz4svTjJ JaJyL+ZAdqs0F3tCPqifaHLtiloiiuUio7EAzPA4NXVdALwGUPvfzngxSbkmDNeacU owM0DzpIwVQqw== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , Ingo Molnar , Marco Crivellari , Michal Hocko , Peter Zijlstra , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long Subject: [PATCH 14/27] sched/isolation: Convert housekeeping cpumasks to rcu pointers Date: Fri, 20 Jun 2025 17:22:55 +0200 Message-ID: <20250620152308.27492-15-frederic@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: 
<20250620152308.27492-1-frederic@kernel.org> References: <20250620152308.27492-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" HK_TYPE_DOMAIN's cpumask will soon be made modifyable by cpuset. Sleepable users of housekeeping can synchronize against cpumask modifications using the housekeeping rwsem. Other callsites need an alternative. Turn the housekeeping cpumasks into RCU pointers. Once a housekeeping cpumask will be modified, the update side will wait for an RCU grace period and propagate the change to interested subsystem when deemed necessary. Signed-off-by: Frederic Weisbecker --- kernel/sched/isolation.c | 52 ++++++++++++++++++++++++++-------------- kernel/sched/sched.h | 1 + 2 files changed, 35 insertions(+), 18 deletions(-) diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c index 9ecf53c5328b..75505668dcb9 100644 --- a/kernel/sched/isolation.c +++ b/kernel/sched/isolation.c @@ -15,7 +15,7 @@ enum hk_flags { HK_FLAG_KERNEL_NOISE =3D BIT(HK_TYPE_KERNEL_NOISE), }; =20 -static cpumask_var_t housekeeping_cpumasks[HK_TYPE_MAX]; +static struct cpumask __rcu *housekeeping_cpumasks[HK_TYPE_MAX]; unsigned long housekeeping_flags; EXPORT_SYMBOL_GPL(housekeeping_flags); =20 @@ -37,16 +37,25 @@ void housekeeping_unlock(void) percpu_up_read(&housekeeping_pcpu_lock); } =20 +const struct cpumask *housekeeping_cpumask(enum hk_type type) +{ + if (housekeeping_flags & BIT(type)) { + return rcu_dereference_check(housekeeping_cpumasks[type], 1); + } + return cpu_possible_mask; +} +EXPORT_SYMBOL_GPL(housekeeping_cpumask); + int housekeeping_any_cpu(enum hk_type type) { int cpu; =20 if (housekeeping_flags & BIT(type)) { - cpu =3D sched_numa_find_closest(housekeeping_cpumasks[type], smp_process= or_id()); + cpu =3D sched_numa_find_closest(housekeeping_cpumask(type), smp_processo= 
r_id()); if (cpu < nr_cpu_ids) return cpu; =20 - cpu =3D cpumask_any_and_distribute(housekeeping_cpumasks[type], cpu_onli= ne_mask); + cpu =3D cpumask_any_and_distribute(housekeeping_cpumask(type), cpu_onlin= e_mask); if (likely(cpu < nr_cpu_ids)) return cpu; /* @@ -62,25 +71,17 @@ int housekeeping_any_cpu(enum hk_type type) } EXPORT_SYMBOL_GPL(housekeeping_any_cpu); =20 -const struct cpumask *housekeeping_cpumask(enum hk_type type) -{ - if (housekeeping_flags & BIT(type)) - return housekeeping_cpumasks[type]; - return cpu_possible_mask; -} -EXPORT_SYMBOL_GPL(housekeeping_cpumask); - void housekeeping_affine(struct task_struct *t, enum hk_type type) { if (housekeeping_flags & BIT(type)) - set_cpus_allowed_ptr(t, housekeeping_cpumasks[type]); + set_cpus_allowed_ptr(t, housekeeping_cpumask(type)); } EXPORT_SYMBOL_GPL(housekeeping_affine); =20 bool housekeeping_test_cpu(int cpu, enum hk_type type) { if (housekeeping_flags & BIT(type)) - return cpumask_test_cpu(cpu, housekeeping_cpumasks[type]); + return cpumask_test_cpu(cpu, housekeeping_cpumask(type)); return true; } EXPORT_SYMBOL_GPL(housekeeping_test_cpu); @@ -95,9 +96,23 @@ void __init housekeeping_init(void) if (housekeeping_flags & HK_FLAG_KERNEL_NOISE) sched_tick_offload_init(); =20 + /* + * Realloc with a proper allocator so that any cpumask update + * can indifferently free the old version with kfree(). 
+ */ for_each_set_bit(type, &housekeeping_flags, HK_TYPE_MAX) { + struct cpumask *omask, *nmask =3D kmalloc(cpumask_size(), GFP_KERNEL); + + if (WARN_ON_ONCE(!nmask)) + return; + + omask =3D rcu_dereference(housekeeping_cpumasks[type]); + /* We need at least one CPU to handle housekeeping work */ - WARN_ON_ONCE(cpumask_empty(housekeeping_cpumasks[type])); + WARN_ON_ONCE(cpumask_empty(omask)); + cpumask_copy(nmask, omask); + RCU_INIT_POINTER(housekeeping_cpumasks[type], nmask); + memblock_free(omask, cpumask_size()); } } =20 @@ -105,9 +120,10 @@ static void __init housekeeping_setup_type(enum hk_typ= e type, cpumask_var_t housekeeping_staging) { =20 - alloc_bootmem_cpumask_var(&housekeeping_cpumasks[type]); - cpumask_copy(housekeeping_cpumasks[type], - housekeeping_staging); + struct cpumask *mask =3D memblock_alloc_or_panic(cpumask_size(), SMP_CACH= E_BYTES); + + cpumask_copy(mask, housekeeping_staging); + RCU_INIT_POINTER(housekeeping_cpumasks[type], mask); } =20 static int __init housekeeping_setup(char *str, unsigned long flags) @@ -160,7 +176,7 @@ static int __init housekeeping_setup(char *str, unsigne= d long flags) =20 for_each_set_bit(type, &iter_flags, HK_TYPE_MAX) { if (!cpumask_equal(housekeeping_staging, - housekeeping_cpumasks[type])) { + housekeeping_cpumask(type))) { pr_warn("Housekeeping: nohz_full=3D must match isolcpus=3D\n"); goto free_housekeeping_staging; } diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 0cdb560ef2f3..407e7f5ad929 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -42,6 +42,7 @@ #include #include #include +#include #include #include #include --=20 2.48.1 From nobody Thu Oct 9 03:14:49 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4AEEA2D29C9; Fri, 20 Jun 2025 15:23:55 +0000 (UTC) 
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433035; cv=none; b=uh8TFqr+xU2oAiKlIW+/jqRRgedH+tp4rVdcRlRbxfnaWCcXKaifdvAJumMmWRgrwMnh7vKbj4i27dZJw4d7tXPwflZF1RsOiHQmGuESZYqASpKOWX4EIjgby3M2iBRY0A3BboH7o7XsUGUmsSsmbDhKzsQGK4YmMxYuXSmPuaQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433035; c=relaxed/simple; bh=hhGPWkGQrxw1S/pBsDoaibkt3XaUAIfOsG7Kc7rzLcQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=OboRD8UEf0RyFytCQw0mi2+NiEQ1UMVwb0CdtYycROWjnaUrLWIU0H4DxGAyVTkJKamBrskh3nTVB3L7rezNHhMb61agWlRyen7a9+tuJPaGMCgQ1ZpyXjKQ2+OZlAfo1NPtLFS/14q7vyveXWA9byisvSrz07GpvUQJbCe+CnY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Yy0N0utz; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Yy0N0utz" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9743AC4CEE3; Fri, 20 Jun 2025 15:23:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1750433035; bh=hhGPWkGQrxw1S/pBsDoaibkt3XaUAIfOsG7Kc7rzLcQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Yy0N0utzuGBhQFbw9ReKQeM1tox8v+dreF3+kezh3/WytRnYXvjUQf33trg23knpb 0JPj6403jaHtjZC/E1ILEw4vQTNFpbY9CeW9K4P+EJhsAWKfKH+zoedjVVs6aYF6FU M+5JKByw/yoOLiBlPT3BDooZAasroww/8N8JdNmdo/fHmW5jAUCANU1jwujYVeKIb/ 5qGTuLABW2OFJ6OthAvIaFR/paPH3eIRUCim0/gGsZaAUoOvnSylaG8JSpIdX1O/s3 hXT9H4mqXSLQHBNMhgKz+IE9FHEk2RhJc1IyBSvDiRjNeXWsvrSPyw0BP6hthwwC8+ HZFXVhnz/vYSg== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Ingo Molnar , Johannes Weiner , Marco Crivellari , Michal Hocko , Peter Zijlstra , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , 
cgroups@vger.kernel.org Subject: [PATCH 15/27] cpuset: Update HK_TYPE_DOMAIN cpumask from cpuset Date: Fri, 20 Jun 2025 17:22:56 +0200 Message-ID: <20250620152308.27492-16-frederic@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250620152308.27492-1-frederic@kernel.org> References: <20250620152308.27492-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Until now, HK_TYPE_DOMAIN has only included the boot-defined isolated CPUs passed through the isolcpus=3D boot option. Users who also need to know about the runtime-defined isolated CPUs set through cpuset must use different APIs: cpuset_cpu_is_isolated(), cpu_is_isolated(), etc. This approach has several drawbacks: 1) Most interested subsystems want to know about all isolated CPUs, not just those defined at boot time. 2) cpuset_cpu_is_isolated() / cpu_is_isolated() are not synchronized with concurrent cpuset changes. 3) Further cpuset modifications are not propagated to subsystems. Solve 1) and 2) by centralizing all isolated CPUs within the HK_TYPE_DOMAIN housekeeping cpumask under the housekeeping lock. Subsystems can rely on the housekeeping lock or RCU to synchronize against concurrent changes. The propagation mentioned in 3) will be handled in further patches.
Signed-off-by: Frederic Weisbecker --- include/linux/sched/isolation.h | 5 ++- kernel/cgroup/cpuset.c | 2 + kernel/sched/isolation.c | 71 ++++++++++++++++++++++++++++++--- kernel/sched/sched.h | 1 + 4 files changed, 72 insertions(+), 7 deletions(-) diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolatio= n.h index 731506d312d2..f1b309f18511 100644 --- a/include/linux/sched/isolation.h +++ b/include/linux/sched/isolation.h @@ -36,7 +36,7 @@ extern bool housekeeping_test_cpu(int cpu, enum hk_type t= ype); =20 static inline bool housekeeping_cpu(int cpu, enum hk_type type) { - if (housekeeping_flags & BIT(type)) + if (READ_ONCE(housekeeping_flags) & BIT(type)) return housekeeping_test_cpu(cpu, type); else return true; @@ -45,6 +45,8 @@ static inline bool housekeeping_cpu(int cpu, enum hk_type= type) extern void housekeeping_lock(void); extern void housekeeping_unlock(void); =20 +extern int housekeeping_update(struct cpumask *mask, enum hk_type type); + extern void __init housekeeping_init(void); =20 #else @@ -79,6 +81,7 @@ static inline bool housekeeping_cpu(int cpu, enum hk_type= type) =20 static inline void housekeeping_lock(void) { } static inline void housekeeping_unlock(void) { } +static inline int housekeeping_update(struct cpumask *mask, enum hk_type t= ype) { return 0; } static inline void housekeeping_init(void) { } #endif /* CONFIG_CPU_ISOLATION */ =20 diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c index 8221b6a7da46..5f169a56f06c 100644 --- a/kernel/cgroup/cpuset.c +++ b/kernel/cgroup/cpuset.c @@ -1351,6 +1351,8 @@ static void update_unbound_workqueue_cpumask(bool iso= lcpus_updated) =20 ret =3D workqueue_unbound_exclude_cpumask(isolated_cpus); WARN_ON_ONCE(ret < 0); + ret =3D housekeeping_update(isolated_cpus, HK_TYPE_DOMAIN); + WARN_ON_ONCE(ret < 0); } =20 /** diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c index 75505668dcb9..7814d60be87e 100644 --- a/kernel/sched/isolation.c +++ 
b/kernel/sched/isolation.c @@ -23,7 +23,7 @@ DEFINE_STATIC_PERCPU_RWSEM(housekeeping_pcpu_lock); =20 bool housekeeping_enabled(enum hk_type type) { - return !!(housekeeping_flags & BIT(type)); + return !!(READ_ONCE(housekeeping_flags) & BIT(type)); } EXPORT_SYMBOL_GPL(housekeeping_enabled); =20 @@ -37,12 +37,39 @@ void housekeeping_unlock(void) percpu_up_read(&housekeeping_pcpu_lock); } =20 +static bool housekeeping_dereference_check(enum hk_type type) +{ + if (type =3D=3D HK_TYPE_DOMAIN) { + if (system_state =3D=3D SYSTEM_BOOTING) + return true; + if (IS_ENABLED(CONFIG_HOTPLUG_CPU) && lockdep_is_cpus_write_held()) + return true; + if (percpu_rwsem_is_held(&housekeeping_pcpu_lock)) + return true; + if (IS_ENABLED(CONFIG_CPUSETS) && lockdep_is_cpuset_held()) + return true; + + return false; + } + + return true; +} + +static inline struct cpumask *__housekeeping_cpumask(enum hk_type type) +{ + return rcu_dereference_check(housekeeping_cpumasks[type], + housekeeping_dereference_check(type)); +} + const struct cpumask *housekeeping_cpumask(enum hk_type type) { - if (housekeeping_flags & BIT(type)) { - return rcu_dereference_check(housekeeping_cpumasks[type], 1); - } - return cpu_possible_mask; + const struct cpumask *mask =3D NULL; + + if (READ_ONCE(housekeeping_flags) & BIT(type)) + mask =3D __housekeeping_cpumask(type); + if (!mask) + mask =3D cpu_possible_mask; + return mask; } EXPORT_SYMBOL_GPL(housekeeping_cpumask); =20 @@ -80,12 +107,44 @@ EXPORT_SYMBOL_GPL(housekeeping_affine); =20 bool housekeeping_test_cpu(int cpu, enum hk_type type) { - if (housekeeping_flags & BIT(type)) + if (READ_ONCE(housekeeping_flags) & BIT(type)) return cpumask_test_cpu(cpu, housekeeping_cpumask(type)); return true; } EXPORT_SYMBOL_GPL(housekeeping_test_cpu); =20 +int housekeeping_update(struct cpumask *mask, enum hk_type type) +{ + struct cpumask *trial, *old =3D NULL; + + if (type !=3D HK_TYPE_DOMAIN) + return -ENOTSUPP; + + trial =3D kmalloc(sizeof(*trial), GFP_KERNEL); + if 
(!trial) + return -ENOMEM; + + cpumask_andnot(trial, housekeeping_cpumask(HK_TYPE_DOMAIN_BOOT), mask); + if (!cpumask_intersects(trial, cpu_online_mask)) { + kfree(trial); + return -EINVAL; + } + + percpu_down_write(&housekeeping_pcpu_lock); + if (housekeeping_flags & BIT(type)) + old =3D __housekeeping_cpumask(type); + else + WRITE_ONCE(housekeeping_flags, housekeeping_flags | BIT(type)); + rcu_assign_pointer(housekeeping_cpumasks[type], trial); + percpu_up_write(&housekeeping_pcpu_lock); + + synchronize_rcu(); + + kfree(old); + + return 0; +} + void __init housekeeping_init(void) { enum hk_type type; diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 407e7f5ad929..04094567cad4 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -30,6 +30,7 @@ #include #include #include +#include #include #include #include --=20 2.48.1 From nobody Thu Oct 9 03:14:49 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 932582ECD13; Fri, 20 Jun 2025 15:23:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433039; cv=none; b=pdlmNGQfL7Atn0XrzS1RVPMfNBPpY3zgVcCTP+Ag1gC+OiGcoPoYWyXYmZxyq983XWpOO69ES0+JFQT9WheKolOUP4GSmHo7SuncJTuhXhQ8KthNut5y3cPNBXl6NlrD88dtjVtN6K/V/3D+1ubkMhXkQaUUTLScleU/Q0JE3lA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433039; c=relaxed/simple; bh=puBcn1RLjuMlGziYQtO8T9w0Hol7ZIDbbn6QvjDV/zQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=duevx1ixkbQICAsEDxoRStdBy+2wHyaHjeWo7qm3t/KXWTVX6rP9SNL815DTyDozFwNg1HRYbSgrWVLHYY8kcAMIcqnyghy0cpdiVuIBetaAyEPVzl8vf4h5YET4YXNTPU6fsGjvC/fmlS5+30aQtjzHZ44DOo5jjxmF8v1LJqE= ARC-Authentication-Results: i=1; 
smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=ZuDPDV0N; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="ZuDPDV0N" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8EBFDC4CEE3; Fri, 20 Jun 2025 15:23:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1750433039; bh=puBcn1RLjuMlGziYQtO8T9w0Hol7ZIDbbn6QvjDV/zQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ZuDPDV0NqIta2HyKRSmvefpltNr5OQmlAiOy9JNjBp+KZ7zcKekZkV7tLDeT3G9FC B/xYcs9tQuK3L5hIeu3TK7veaQeloAMkKamwtfMv9j3wyItFJ4Jeyumk+lanosdVam h6v8DnSe9cqMa2epNROFZmS3FTYQlORrrO5dSX6Oe6zjXsdM5PHYt0SJEddqQSwRhM r8oWPFNNGOLwoy1BcwIn1UxB2q+NCo+QdxAzjbG01MEn7Q0JYWLL6RMd8XkyVh3qW2 MUtEaCFkMso3v8JTsxk5WyijDJMv3M/XMPmcVJo1eyD/+QqlZ8A2cFCJu/fzktD0hH lFIS1EL3Rqvvw== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , Andrew Morton , Ingo Molnar , Johannes Weiner , Marco Crivellari , Michal Hocko , Michal Hocko , Muchun Song , Peter Zijlstra , Roman Gushchin , Shakeel Butt , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , cgroups@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH 16/27] sched/isolation: Flush memcg workqueues on cpuset isolated partition change Date: Fri, 20 Jun 2025 17:22:57 +0200 Message-ID: <20250620152308.27492-17-frederic@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250620152308.27492-1-frederic@kernel.org> References: <20250620152308.27492-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The HK_TYPE_DOMAIN housekeeping cpumask is now modifiable at runtime. 
To make sure that no asynchronous memcg draining is still pending or executing on a newly isolated CPU, the housekeeping subsystem must flush the memcg workqueues. However, the memcg work items can't be flushed easily since they are queued to the main per-CPU workqueue pool. Solve this by creating a memcg-specific workqueue, and provide and use the appropriate flushing API. Signed-off-by: Frederic Weisbecker Acked-by: Shakeel Butt --- include/linux/memcontrol.h | 4 ++++ kernel/sched/isolation.c | 2 ++ kernel/sched/sched.h | 1 + mm/memcontrol.c | 12 +++++++++++- 4 files changed, 18 insertions(+), 1 deletion(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index 87b6688f124a..ef5036c6bf04 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -1046,6 +1046,8 @@ static inline u64 cgroup_id_from_mm(struct mm_struct = *mm) return id; } =20 +void mem_cgroup_flush_workqueue(void); + extern int mem_cgroup_init(void); #else /* CONFIG_MEMCG */ =20 @@ -1451,6 +1453,8 @@ static inline u64 cgroup_id_from_mm(struct mm_struct = *mm) return 0; } =20 +static inline void mem_cgroup_flush_workqueue(void) { } + static inline int mem_cgroup_init(void) { return 0; } #endif /* CONFIG_MEMCG */ =20 diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c index 7814d60be87e..6fb0c7956516 100644 --- a/kernel/sched/isolation.c +++ b/kernel/sched/isolation.c @@ -140,6 +140,8 @@ int housekeeping_update(struct cpumask *mask, enum hk_t= ype type) =20 synchronize_rcu(); =20 + mem_cgroup_flush_workqueue(); + kfree(old); =20 return 0; diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 04094567cad4..53107c021fe9 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -44,6 +44,7 @@ #include #include #include +#include #include #include #include diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 29d44af6c426..928b90cdb5ba 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -96,6 +96,8 
@@ static bool cgroup_memory_nokmem __ro_after_init; /* BPF memory accounting disabled? */ static bool cgroup_memory_nobpf __ro_after_init; =20 +static struct workqueue_struct *memcg_wq __ro_after_init; + static struct kmem_cache *memcg_cachep; static struct kmem_cache *memcg_pn_cachep; =20 @@ -1979,7 +1981,7 @@ static void schedule_drain_work(int cpu, struct work_= struct *work) { housekeeping_lock(); if (!cpu_is_isolated(cpu)) - schedule_work_on(cpu, work); + queue_work_on(cpu, memcg_wq, work); housekeeping_unlock(); } =20 @@ -5140,6 +5142,11 @@ void mem_cgroup_uncharge_skmem(struct mem_cgroup *me= mcg, unsigned int nr_pages) refill_stock(memcg, nr_pages); } =20 +void mem_cgroup_flush_workqueue(void) +{ + flush_workqueue(memcg_wq); +} + static int __init cgroup_memory(char *s) { char *token; @@ -5182,6 +5189,9 @@ int __init mem_cgroup_init(void) cpuhp_setup_state_nocalls(CPUHP_MM_MEMCQ_DEAD, "mm/memctrl:dead", NULL, memcg_hotplug_cpu_dead); =20 + memcg_wq =3D alloc_workqueue("memcg", 0, 0); + WARN_ON(!memcg_wq); + for_each_possible_cpu(cpu) { INIT_WORK(&per_cpu_ptr(&memcg_stock, cpu)->work, drain_local_memcg_stock); --=20 2.48.1 From nobody Thu Oct 9 03:14:49 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EE7EC2ECEA4 for ; Fri, 20 Jun 2025 15:24:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433042; cv=none; b=NgT2H5KaE2FOZ+viLJLOo1hrLkQdv9RRDIsZ1tfdV4gtRvK25UJnfLUGuFStyCp9tISWnCJDW7dKdY5tTLwqTRIZZXYfBbkluCMuI0wpr2AnmS082/TV3jn0DbIPOFqJKT8SwrqX35tA565BdKDPSN969F9b+sank/84XbRnF/8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433042; c=relaxed/simple; 
bh=7S3CpVL6musulW7mm8cZzqVZcBLY7SkCm9k5f0nN0Cc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=NIdKoy/hoVsnHO8DiTmC4GP373nzHdSkz4o/+6LiFuB7dvgfh/y+YhPvGFii0E3++TvOwLVj7JfWegqTmHnQtL1g6zdu1hHeOTiULrXawC1ZJiBEnLqiYGrhmOTeQkWYem52CrXA/3l4PJ5YvUOEiTniNnOwMbrzYoPivekJKVU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=utNbcnMD; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="utNbcnMD" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 74CEEC4CEF0; Fri, 20 Jun 2025 15:23:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1750433041; bh=7S3CpVL6musulW7mm8cZzqVZcBLY7SkCm9k5f0nN0Cc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=utNbcnMDJu/YLI/m4XSsSSTIdVdwIeqVEAjH7AgIx3D6VKXFn4skDnOTrHOoCMbQk 01x9V0JVqAdLoqkVCsc+E5oVb5/Q0yXNy0+GcoGkuT95M/n56WSfDzV+ppSIHh+8eH WMedA5yPeC7mYpiNxNcSLKMUn+/6EcyqUUkKUotROdtp+84g+mFY/wegouzQCLqfHF k5A/nQjSvEyGtlKogMlvtUEMc5G9CamOIRrNGhPjAw7kyRZXkDqDZ8Ym9t/tAm1RH6 IND+cMVAvxDqjBuULmW5YFVCGzO7SQWECP2ple3oRL/wIEhXipu9Pqd4bZEdzPhDKd ZDi8rwJfD4wDg== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , Andrew Morton , Ingo Molnar , Marco Crivellari , Michal Hocko , Peter Zijlstra , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , linux-mm@kvack.org Subject: [PATCH 17/27] sched/isolation: Flush vmstat workqueues on cpuset isolated partition change Date: Fri, 20 Jun 2025 17:22:58 +0200 Message-ID: <20250620152308.27492-18-frederic@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250620152308.27492-1-frederic@kernel.org> References: <20250620152308.27492-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 
quoted-printable Content-Type: text/plain; charset="utf-8" The HK_TYPE_DOMAIN housekeeping cpumask is now modifiable at runtime. To make sure that no asynchronous vmstat work is still pending or executing on a newly isolated CPU, the housekeeping subsystem must flush the vmstat workqueue. This involves flushing the whole mm_percpu_wq workqueue, which is shared with LRU drain, so the LRU drain work is flushed as well, a welcome side effect. Signed-off-by: Frederic Weisbecker --- include/linux/vmstat.h | 2 ++ kernel/sched/isolation.c | 1 + kernel/sched/sched.h | 1 + mm/vmstat.c | 5 +++++ 4 files changed, 9 insertions(+) diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h index b2ccb6845595..ba7caacdf356 100644 --- a/include/linux/vmstat.h +++ b/include/linux/vmstat.h @@ -303,6 +303,7 @@ int calculate_pressure_threshold(struct zone *zone); int calculate_normal_threshold(struct zone *zone); void set_pgdat_percpu_threshold(pg_data_t *pgdat, int (*calculate_pressure)(struct zone *)); +void vmstat_flush_workqueue(void); #else /* CONFIG_SMP */ =20 /* @@ -403,6 +404,7 @@ static inline void __dec_node_page_state(struct page *p= age, static inline void refresh_zone_stat_thresholds(void) { } static inline void cpu_vm_stats_fold(int cpu) { } static inline void quiet_vmstat(void) { } +static inline void vmstat_flush_workqueue(void) { } =20 static inline void drain_zonestat(struct zone *zone, struct per_cpu_zonestat *pzstats) { } diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c index 6fb0c7956516..0119685796be 100644 --- a/kernel/sched/isolation.c +++ b/kernel/sched/isolation.c @@ -141,6 +141,7 @@ int housekeeping_update(struct cpumask *mask, enum hk_t= ype type) synchronize_rcu(); =20 mem_cgroup_flush_workqueue(); + vmstat_flush_workqueue(); =20 kfree(old); =20 diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h index 53107c021fe9..e2c4258cb818 100644 --- a/kernel/sched/sched.h +++ b/kernel/sched/sched.h @@ -69,6 +69,7 @@ 
#include #include #include +#include #include #include #include diff --git a/mm/vmstat.c b/mm/vmstat.c index 53123675fe31..5d462fe12548 100644 --- a/mm/vmstat.c +++ b/mm/vmstat.c @@ -2095,6 +2095,11 @@ static void vmstat_shepherd(struct work_struct *w); =20 static DECLARE_DEFERRABLE_WORK(shepherd, vmstat_shepherd); =20 +void vmstat_flush_workqueue(void) +{ + flush_workqueue(mm_percpu_wq); +} + static void vmstat_shepherd(struct work_struct *w) { int cpu; --=20 2.48.1 From nobody Thu Oct 9 03:14:49 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 010F02ED151; Fri, 20 Jun 2025 15:24:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433045; cv=none; b=mm+R3zu8gonFJMHIvsqV/sXs5CWwO6WMrxw0nAUI7HtIBwTAIRNdTsXbX0RjVauFHS7CE/9ZHTxK3+ZWH5cauNkxoXQ3BmshBUbdFRV5rOZbHjTsDsaV2PBQ7DQLzP5eQx40Ipwmj3TO74a/EVYmvgvVMD2sMGgMjr1Lg63F/Go= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433045; c=relaxed/simple; bh=9V0n3RkIDJ8tgZuANPwTkG5LhMQIqvQ3Cah86/5LM3w=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=UI7gGzEPqetwJaE/fsenpxa6/N/N5TaWqFqivz+eSDnFll+TZWh5/StmgWPexIcyJwRIQznxN7+uFOYUjIk0zRIBHQnGdzh72t6IFUYWo82Y187eP1fxiMeAkZe2a7EijI33GfhjHQ1fqD/A+DbX8znc7uSFrHJDHbkVAT6KZlo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Hy9+tazX; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Hy9+tazX" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 46AD0C4CEE3; Fri, 20 Jun 2025 
15:24:02 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1750433044; bh=9V0n3RkIDJ8tgZuANPwTkG5LhMQIqvQ3Cah86/5LM3w=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Hy9+tazXLVopXk6HxaS8WpZzonRumoRBM7gZXiF9a6M93nejHjUhbe9VIbFBqge0T jbCdJLjnKueb0sAmzqGq8JCBq6yPC5MzrxmpdYx7LjSZMmLOdpH/ECIaPtQMmmlVb5 JUrkONdygArMKD/JDMaRx77MDimWrumB2A3MsPpEVzN5w4KYyAioabVYXSHdoMUU0H JRp9rUx2vaOxtaskQSShCw6H1p11GCfx3xwMaorTXQbGF+mtr98TBpuRUkw02MDlQ7 1xXYL0cDe3yeZWnZxtxu3iyvU7x1QVBP7b9UU4/7o1rxTnG/Jeow9ygu+jtedtbG/2 wgTpjuxmFnE9g== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Ingo Molnar , Johannes Weiner , Lai Jiangshan , Marco Crivellari , Michal Hocko , Peter Zijlstra , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , cgroups@vger.kernel.org Subject: [PATCH 18/27] cpuset: Propagate cpuset isolation update to workqueue through housekeeping Date: Fri, 20 Jun 2025 17:22:59 +0200 Message-ID: <20250620152308.27492-19-frederic@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250620152308.27492-1-frederic@kernel.org> References: <20250620152308.27492-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Until now, cpuset would propagate isolated partition changes to workqueues so that unbound workers get properly reaffined. Since housekeeping now centralizes, synchronizes and propagates isolation cpumask changes, perform the work from that subsystem for consolidation and consistency purposes. 
Suggested-by: Tejun Heo
Signed-off-by: Frederic Weisbecker
---
 include/linux/workqueue.h |  2 +-
 init/Kconfig              |  1 +
 kernel/cgroup/cpuset.c    | 14 ++++++--------
 kernel/sched/isolation.c  |  4 +++-
 kernel/workqueue.c        |  2 +-
 5 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 6e30f275da77..8a32c594bba1 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -581,7 +581,7 @@ struct workqueue_attrs *alloc_workqueue_attrs(void);
 void free_workqueue_attrs(struct workqueue_attrs *attrs);
 int apply_workqueue_attrs(struct workqueue_struct *wq,
 			  const struct workqueue_attrs *attrs);
-extern int workqueue_unbound_exclude_cpumask(cpumask_var_t cpumask);
+extern int workqueue_unbound_exclude_cpumask(const struct cpumask *cpumask);
 
 extern bool queue_work_on(int cpu, struct workqueue_struct *wq,
 			struct work_struct *work);
diff --git a/init/Kconfig b/init/Kconfig
index af4c2f085455..b7cbb6e01e8d 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1205,6 +1205,7 @@ config CPUSETS
 	bool "Cpuset controller"
 	depends on SMP
 	select UNION_FIND
+	select CPU_ISOLATION
 	help
 	  This option will let you create and manage CPUSETs which
 	  allow dynamically partitioning a system into sets of CPUs and
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 5f169a56f06c..98b1ea0ad336 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -1340,7 +1340,7 @@ static bool partition_xcpus_del(int old_prs, struct cpuset *parent,
 	return isolcpus_updated;
 }
 
-static void update_unbound_workqueue_cpumask(bool isolcpus_updated)
+static void update_housekeeping_cpumask(bool isolcpus_updated)
 {
 	int ret;
 
@@ -1349,8 +1349,6 @@ static void update_unbound_workqueue_cpumask(bool isolcpus_updated)
 	if (!isolcpus_updated)
 		return;
 
-	ret = workqueue_unbound_exclude_cpumask(isolated_cpus);
-	WARN_ON_ONCE(ret < 0);
 	ret = housekeeping_update(isolated_cpus, HK_TYPE_DOMAIN);
 	WARN_ON_ONCE(ret < 0);
 }
@@ -1473,7 +1471,7 @@ static int remote_partition_enable(struct cpuset *cs, int new_prs,
 	list_add(&cs->remote_sibling, &remote_children);
 	cpumask_copy(cs->effective_xcpus, tmp->new_cpus);
 	spin_unlock_irq(&callback_lock);
-	update_unbound_workqueue_cpumask(isolcpus_updated);
+	update_housekeeping_cpumask(isolcpus_updated);
 	cpuset_force_rebuild();
 	cs->prs_err = 0;
 
@@ -1514,7 +1512,7 @@ static void remote_partition_disable(struct cpuset *cs, struct tmpmasks *tmp)
 	compute_effective_exclusive_cpumask(cs, NULL, NULL);
 	reset_partition_data(cs);
 	spin_unlock_irq(&callback_lock);
-	update_unbound_workqueue_cpumask(isolcpus_updated);
+	update_housekeeping_cpumask(isolcpus_updated);
 	cpuset_force_rebuild();
 
 	/*
@@ -1583,7 +1581,7 @@ static void remote_cpus_update(struct cpuset *cs, struct cpumask *xcpus,
 	if (xcpus)
 		cpumask_copy(cs->exclusive_cpus, xcpus);
 	spin_unlock_irq(&callback_lock);
-	update_unbound_workqueue_cpumask(isolcpus_updated);
+	update_housekeeping_cpumask(isolcpus_updated);
 	if (adding || deleting)
 		cpuset_force_rebuild();
 
@@ -1947,7 +1945,7 @@ static int update_parent_effective_cpumask(struct cpuset *cs, int cmd,
 		WARN_ON_ONCE(parent->nr_subparts < 0);
 	}
 	spin_unlock_irq(&callback_lock);
-	update_unbound_workqueue_cpumask(isolcpus_updated);
+	update_housekeeping_cpumask(isolcpus_updated);
 
 	if ((old_prs != new_prs) && (cmd == partcmd_update))
 		update_partition_exclusive_flag(cs, new_prs);
@@ -2972,7 +2970,7 @@ static int update_prstate(struct cpuset *cs, int new_prs)
 	else if (isolcpus_updated)
 		isolated_cpus_update(old_prs, new_prs, cs->effective_xcpus);
 	spin_unlock_irq(&callback_lock);
-	update_unbound_workqueue_cpumask(isolcpus_updated);
+	update_housekeeping_cpumask(isolcpus_updated);
 
 	/* Force update if switching back to member & update effective_xcpus */
 	update_cpumasks_hier(cs, &tmpmask, !new_prs);
diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
index 0119685796be..e4e4fcd4cb2c 100644
--- a/kernel/sched/isolation.c
+++ b/kernel/sched/isolation.c @@ -116,6 +116,7 @@ EXPORT_SYMBOL_GPL(housekeeping_test_cpu); int housekeeping_update(struct cpumask *mask, enum hk_type type) { struct cpumask *trial, *old =3D NULL; + int err; =20 if (type !=3D HK_TYPE_DOMAIN) return -ENOTSUPP; @@ -142,10 +143,11 @@ int housekeeping_update(struct cpumask *mask, enum hk= _type type) =20 mem_cgroup_flush_workqueue(); vmstat_flush_workqueue(); + err =3D workqueue_unbound_exclude_cpumask(housekeeping_cpumask(type)); =20 kfree(old); =20 - return 0; + return err; } =20 void __init housekeeping_init(void) diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 97f37b5bae66..e55fcf980c5d 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -6948,7 +6948,7 @@ static int workqueue_apply_unbound_cpumask(const cpum= ask_var_t unbound_cpumask) * This function can be called from cpuset code to provide a set of isolat= ed * CPUs that should be excluded from wq_unbound_cpumask. */ -int workqueue_unbound_exclude_cpumask(cpumask_var_t exclude_cpumask) +int workqueue_unbound_exclude_cpumask(const struct cpumask *exclude_cpumas= k) { cpumask_var_t cpumask; int ret =3D 0; --=20 2.48.1 From nobody Thu Oct 9 03:14:49 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A1DD02ED841; Fri, 20 Jun 2025 15:24:07 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433047; cv=none; b=ZMcyOgGA//8mRZ4bJ8FNRmihvHOMqJpnLhMaiU7KTNrgswbyprUi+tvEkdYmiBD+2pnSGSeEsO3ah+wZsi4k8/CHtqkSxFQQ4jMwdzqk1+4xSBVJHTWpW1EtxYm/k2KzwHDEk03SBg/mFcjIVZX5ha+fnZ+rwoyku08lqGoDs+4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433047; c=relaxed/simple; 
bh=7DejguHPFkVxCftEAVKS/blsYBEuLHXPel4oIb+v3f8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=ihIA8qfRb6b4y+xMfCRKk/UXTHOR+ID2DjXDlICxXBvwer/GFKvUFn5fJ4TeZXrzEzE1F5SGh07LDzttId7iZ07q4b8pH8p6zFmXCrkv/dMTZm5XnlRk12aoVLM2eHFWqf6MNVpfbyCbfnypwQcRaX0cJyYAyuekCcxAyw47h/8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=kpWgXKar; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="kpWgXKar" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 430BBC4CEF2; Fri, 20 Jun 2025 15:24:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1750433047; bh=7DejguHPFkVxCftEAVKS/blsYBEuLHXPel4oIb+v3f8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=kpWgXKarrRUtUnKUt5+ZVIZfbqScBpWkbAn+u9HqYb9Wr//Ar4DiV3rkVQ3OcHZBo no7GUOdf7pDHa3xI9qj5A7+lmJ45X2oHgU0VuXmj4UUjdJgfBWlBboj+4PUWNqCXFs mvrjp6H4OuxIJiMLABQCG9H1ZtVd78d0Y+wgZssVZ3iAczsS8DhfDeGAGPmI8Jkb2G XsNMubQU+Ku7m3VdqZiTZwbMq0nHvdoBvCevozdf2o7Br58pxHMMaembQMIx/KwIsW RonzrriigUG2cGDU6WrBRFfBr5q7z/ccCy4/0zmEkjMAmQtAvREDDrjAw8/HoW5qwl qAIyEPSolef4Q== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Johannes Weiner , Marco Crivellari , Michal Hocko , Peter Zijlstra , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , cgroups@vger.kernel.org Subject: [PATCH 19/27] cpuset: Remove cpuset_cpu_is_isolated() Date: Fri, 20 Jun 2025 17:23:00 +0200 Message-ID: <20250620152308.27492-20-frederic@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250620152308.27492-1-frederic@kernel.org> References: <20250620152308.27492-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 
quoted-printable
Content-Type: text/plain; charset="utf-8"

The set of cpuset isolated CPUs is now reflected in the HK_TYPE_DOMAIN
housekeeping cpumask. There is no use case left that needs to check only
what is isolated by cpuset and not by the isolcpus= kernel boot
parameter.

Signed-off-by: Frederic Weisbecker
---
 include/linux/cpuset.h          |  6 ------
 include/linux/sched/isolation.h |  3 +--
 kernel/cgroup/cpuset.c          | 12 ------------
 3 files changed, 1 insertion(+), 20 deletions(-)

diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h
index 051d36fec578..a10775a4f702 100644
--- a/include/linux/cpuset.h
+++ b/include/linux/cpuset.h
@@ -78,7 +78,6 @@ extern void cpuset_lock(void);
 extern void cpuset_unlock(void);
 extern void cpuset_cpus_allowed(struct task_struct *p, struct cpumask *mask);
 extern bool cpuset_cpus_allowed_fallback(struct task_struct *p);
-extern bool cpuset_cpu_is_isolated(int cpu);
 extern nodemask_t cpuset_mems_allowed(struct task_struct *p);
 #define cpuset_current_mems_allowed (current->mems_allowed)
 void cpuset_init_current_mems_allowed(void);
@@ -208,11 +207,6 @@ static inline bool cpuset_cpus_allowed_fallback(struct task_struct *p)
 	return false;
 }
 
-static inline bool cpuset_cpu_is_isolated(int cpu)
-{
-	return false;
-}
-
 static inline nodemask_t cpuset_mems_allowed(struct task_struct *p)
 {
 	return node_possible_map;
diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h
index f1b309f18511..9f039dfb5739 100644
--- a/include/linux/sched/isolation.h
+++ b/include/linux/sched/isolation.h
@@ -89,8 +89,7 @@ static inline void housekeeping_init(void) { }
 static inline bool cpu_is_isolated(int cpu)
 {
 	return !housekeeping_test_cpu(cpu, HK_TYPE_DOMAIN) ||
-	       !housekeeping_test_cpu(cpu, HK_TYPE_TICK) ||
-	       cpuset_cpu_is_isolated(cpu);
+	       !housekeeping_test_cpu(cpu, HK_TYPE_TICK);
 }
 
 DEFINE_LOCK_GUARD_0(housekeeping, housekeeping_lock(), housekeeping_unlock())
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 98b1ea0ad336..db80e72681ed 100644 --- a/kernel/cgroup/cpuset.c +++ b/kernel/cgroup/cpuset.c @@ -29,7 +29,6 @@ #include #include #include -#include #include #include #include @@ -1353,17 +1352,6 @@ static void update_housekeeping_cpumask(bool isolcpu= s_updated) WARN_ON_ONCE(ret < 0); } =20 -/** - * cpuset_cpu_is_isolated - Check if the given CPU is isolated - * @cpu: the CPU number to be checked - * Return: true if CPU is used in an isolated partition, false otherwise - */ -bool cpuset_cpu_is_isolated(int cpu) -{ - return cpumask_test_cpu(cpu, isolated_cpus); -} -EXPORT_SYMBOL_GPL(cpuset_cpu_is_isolated); - /* * compute_effective_exclusive_cpumask - compute effective exclusive CPUs * @cs: cpuset --=20 2.48.1 From nobody Thu Oct 9 03:14:49 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 67B802ED858 for ; Fri, 20 Jun 2025 15:24:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433050; cv=none; b=cQYI5DqnCX0jMPW1Ots1xp3AQQXx3Y6TvJBie+QTGVFnduYKHh6k2ZoXBz1b3Zd9i2+5GUI48iyoFSm+e3cjeaa9LlAIRw6WjERA+kz3cIg3P5w6luHYw3B4wSSy6K/u2zM2eEu5CrotHYjzSJhQb4ZCBKCr1cy2InG57mn8sGc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433050; c=relaxed/simple; bh=Jw9wHfu0A4TsY4rYKLYQ+jzBCiVyP6zlxbiDxEx+J6c=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=KJ717cLC9WlNCH0eoLoIbuax3hDQjThmTV/++iY/T9k1KyBUfsvuoSegbfpbVgFQJ++Po8NcepvSx+RLDHPTnRVvXCIhYcDld1AyiaHJLXlTRhbnsEDAlzrywoIsb+hegZjdy9hJyS+zGjV2LWt3D+2uepEY8tMQbUASdspJQ7k= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=T1iTSF8w; arc=none 
smtp.client-ip=10.30.226.201
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="T1iTSF8w"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id E57CAC4CEFB;
	Fri, 20 Jun 2025 15:24:07 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1750433049;
	bh=Jw9wHfu0A4TsY4rYKLYQ+jzBCiVyP6zlxbiDxEx+J6c=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=T1iTSF8wzOBLkmU2HCPT4Xt4zSx5uKxpbG1+QhuvvucvndUmg8qJ6hQiyljFBoWHi
	 uW0xKVBdArhMA93UMBBp/iCFkcItGI2mfpeMBEN0Rpp6VU/PwHVWs516VLp7iC8wmJ
	 MUU91zxBjdHPZuqw1QyV16aDoT8te5fup5TQIk4x/qAcvMBL89JwiwtzRiimyrWHic
	 JoeNoa4s93KHLTfpVIK3J7DR6c49sSSDkvqtMk9LylxivYBJ40GdtWsAtPP6tMRXqo
	 bfB1eq6Zbnfg51QGvXOVqRJNL2J59Nl9DtQgHW9jnZCMH+gQ+rMArfI7H9NeVtTuJy
	 xb+ocq0ImKgsw==
From: Frederic Weisbecker
To: LKML
Cc: Frederic Weisbecker, Ingo Molnar, Marco Crivellari, Michal Hocko,
	Peter Zijlstra, Tejun Heo, Thomas Gleixner, Vlastimil Babka,
	Waiman Long
Subject: [PATCH 20/27] sched/isolation: Remove HK_TYPE_TICK test from cpu_is_isolated()
Date: Fri, 20 Jun 2025 17:23:01 +0200
Message-ID: <20250620152308.27492-21-frederic@kernel.org>
X-Mailer: git-send-email 2.48.1
In-Reply-To: <20250620152308.27492-1-frederic@kernel.org>
References: <20250620152308.27492-1-frederic@kernel.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

It doesn't make sense to use nohz_full without also isolating the
related CPUs from the domain topology, either through the use of
isolcpus= or cpuset isolated partitions.

And HK_TYPE_DOMAIN now accounts for all kinds of domain isolated CPUs.

This means that HK_TYPE_KERNEL_NOISE (of which HK_TYPE_TICK is only an
alias) implies HK_TYPE_DOMAIN and therefore checking the latter is
enough to deduce the former.

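For illustration only (not part of the patch): a small user-space sketch of the implication argued above. If every CPU isolated from kernel noise is also domain isolated, the two-mask test and the single-mask test give the same answer for every CPU. The types and function names (cpu_bits, hk_model, model_*) are hypothetical; each bit marks a housekeeping CPU for that type.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one bit per CPU, bit set = CPU does housekeeping. */
typedef uint64_t cpu_bits;

struct hk_model {
	cpu_bits domain;       /* stands in for the HK_TYPE_DOMAIN mask */
	cpu_bits kernel_noise; /* stands in for HK_TYPE_KERNEL_NOISE / HK_TYPE_TICK */
};

static bool model_hk_test(cpu_bits mask, int cpu)
{
	return (mask >> cpu) & 1;
}

/* The check before this patch: either kind of isolation counts. */
static bool model_isolated_old(const struct hk_model *hk, int cpu)
{
	return !model_hk_test(hk->domain, cpu) ||
	       !model_hk_test(hk->kernel_noise, cpu);
}

/* The check after this patch: domain isolation alone. */
static bool model_isolated_new(const struct hk_model *hk, int cpu)
{
	return !model_hk_test(hk->domain, cpu);
}

/* True if both checks agree for CPUs 0..nr_cpus-1 (nr_cpus <= 64). */
static bool model_checks_agree(const struct hk_model *hk, int nr_cpus)
{
	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		if (model_isolated_old(hk, cpu) != model_isolated_new(hk, cpu))
			return false;
	}
	return true;
}
```

With domain = 0x0f and kernel_noise = 0x3f (CPUs 4-7 domain isolated, CPUs 6-7 also nohz_full) the two checks agree on all eight CPUs. With domain = 0xff and kernel_noise = 0x3f they differ on CPUs 6 and 7, which is the configuration the commit message rules out.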
Signed-off-by: Frederic Weisbecker --- include/linux/sched/isolation.h | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolatio= n.h index 9f039dfb5739..46677e8edf76 100644 --- a/include/linux/sched/isolation.h +++ b/include/linux/sched/isolation.h @@ -88,8 +88,7 @@ static inline void housekeeping_init(void) { } =20 static inline bool cpu_is_isolated(int cpu) { - return !housekeeping_test_cpu(cpu, HK_TYPE_DOMAIN) || - !housekeeping_test_cpu(cpu, HK_TYPE_TICK); + return !housekeeping_test_cpu(cpu, HK_TYPE_DOMAIN); } =20 DEFINE_LOCK_GUARD_0(housekeeping, housekeeping_lock(), housekeeping_unlock= ()) --=20 2.48.1 From nobody Thu Oct 9 03:14:49 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 91AB72EE27A for ; Fri, 20 Jun 2025 15:24:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433052; cv=none; b=LRDm86QBSz4hn8LkjIYr4FWAafYuFXuuNVO88VoWhuaXbUQhz3hgNZH2xwafOzJB58WQL0K9Cx/ytBddY3XbA5htEYJoijK3VXfB2O+UUJveWW0GS0epEPIFsgtGrfCQLZ2JaUGjtM7aT6zpHVjUfzv0gMnWeIK6hZ/Hlo764Sg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433052; c=relaxed/simple; bh=wTItNQ/7MRbjxIguDhes2nWpTdXMtZcyR20Xbv98DcA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=rcM3//vMX2pk9zjB52uSfBQarP9bkleV07NCaXwnJ7+T+JXoxF2R9LiJftcz8R140AgzW5OE/wvXv+WwGjKCVGVOxTockM9ZqzkAY0Ok7IQtz+4W6aS1Q9J2vG7ZlgHn1PoOiPGcNi1rzDloERRmwQZf1/ta2QRgMqEluAcdMg4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=SruWTEvA; arc=none smtp.client-ip=10.30.226.201 
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="SruWTEvA" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 52EDDC4CEF0; Fri, 20 Jun 2025 15:24:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1750433052; bh=wTItNQ/7MRbjxIguDhes2nWpTdXMtZcyR20Xbv98DcA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=SruWTEvA/h7BkosV1Q/tXdcnpLcdJ84qtJC45Y4LuBfhkfO2EMjRpYfHEXkrxaJ15 6w75dvjF3UxuqOYb8qyyboBcrZXV6YRONX2cB0GAxoyMbKGb21kbvHEfAN1J8URG76 DaLiSeS0RqEHlSRO9wM2Jjljz76glA5gwaVoz/xVaoyEmdZ8vCucDVk8GVEGkTEpS2 EJT+snzoufMS1lo2cE0L502Su8IfqSVhUuBsFGK5S/sJQ9SN+2BPfAFzHqefgsJ9Ut Uv61pbkzoVzz6lpETvxi+8kAwjp8r0SxiLeLS5xfKICZmTdrKY7bMSlVlI5jG9FUHY 0dgrzCN+IpnWw== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , Marco Crivellari , Michal Hocko , Peter Zijlstra , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long Subject: [PATCH 21/27] kthread: Refine naming of affinity related fields Date: Fri, 20 Jun 2025 17:23:02 +0200 Message-ID: <20250620152308.27492-22-frederic@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250620152308.27492-1-frederic@kernel.org> References: <20250620152308.27492-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The kthreads preferred affinity related fields use "hotplug" as the base of their naming because the affinity management was initially deemed to deal with CPU hotplug. The scope of this role is going to broaden now and also deal with cpuset isolated partition updates. Switch the naming accordingly. 
Signed-off-by: Frederic Weisbecker
---
 kernel/kthread.c | 38 +++++++++++++++++++-------------------
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index 85fc068f0083..24008dd9f3dc 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -35,8 +35,8 @@ static DEFINE_SPINLOCK(kthread_create_lock);
 static LIST_HEAD(kthread_create_list);
 struct task_struct *kthreadd_task;
 
-static LIST_HEAD(kthreads_hotplug);
-static DEFINE_MUTEX(kthreads_hotplug_lock);
+static LIST_HEAD(kthread_affinity_list);
+static DEFINE_MUTEX(kthread_affinity_lock);
 
 struct kthread_create_info
 {
@@ -69,7 +69,7 @@ struct kthread {
 	/* To store the full name if task comm is truncated. */
 	char *full_name;
 	struct task_struct *task;
-	struct list_head hotplug_node;
+	struct list_head affinity_node;
 	struct cpumask *preferred_affinity;
 };
 
@@ -129,7 +129,7 @@ bool set_kthread_struct(struct task_struct *p)
 
 	init_completion(&kthread->exited);
 	init_completion(&kthread->parked);
-	INIT_LIST_HEAD(&kthread->hotplug_node);
+	INIT_LIST_HEAD(&kthread->affinity_node);
 	p->vfork_done = &kthread->exited;
 
 	kthread->task = p;
@@ -324,10 +324,10 @@ void __noreturn kthread_exit(long result)
 {
 	struct kthread *kthread = to_kthread(current);
 	kthread->result = result;
-	if (!list_empty(&kthread->hotplug_node)) {
-		mutex_lock(&kthreads_hotplug_lock);
-		list_del(&kthread->hotplug_node);
-		mutex_unlock(&kthreads_hotplug_lock);
+	if (!list_empty(&kthread->affinity_node)) {
+		mutex_lock(&kthread_affinity_lock);
+		list_del(&kthread->affinity_node);
+		mutex_unlock(&kthread_affinity_lock);
 
 		if (kthread->preferred_affinity) {
 			kfree(kthread->preferred_affinity);
@@ -391,9 +391,9 @@ static void kthread_affine_node(void)
 			return;
 		}
 
-		mutex_lock(&kthreads_hotplug_lock);
-		WARN_ON_ONCE(!list_empty(&kthread->hotplug_node));
-		list_add_tail(&kthread->hotplug_node, &kthreads_hotplug);
+		mutex_lock(&kthread_affinity_lock);
+		WARN_ON_ONCE(!list_empty(&kthread->affinity_node));
+		list_add_tail(&kthread->affinity_node, &kthread_affinity_list);
 		/*
 		 * The node cpumask is racy when read from kthread() but:
 		 * - a racing CPU going down will either fail on the subsequent
@@ -403,7 +403,7 @@ static void kthread_affine_node(void)
 		 */
 		kthread_fetch_affinity(kthread, affinity);
 		set_cpus_allowed_ptr(current, affinity);
-		mutex_unlock(&kthreads_hotplug_lock);
+		mutex_unlock(&kthread_affinity_lock);
 
 		free_cpumask_var(affinity);
 	}
@@ -877,10 +877,10 @@ int kthread_affine_preferred(struct task_struct *p, const struct cpumask *mask)
 		goto out;
 	}
 
-	mutex_lock(&kthreads_hotplug_lock);
+	mutex_lock(&kthread_affinity_lock);
 	cpumask_copy(kthread->preferred_affinity, mask);
-	WARN_ON_ONCE(!list_empty(&kthread->hotplug_node));
-	list_add_tail(&kthread->hotplug_node, &kthreads_hotplug);
+	WARN_ON_ONCE(!list_empty(&kthread->affinity_node));
+	list_add_tail(&kthread->affinity_node, &kthread_affinity_list);
 	kthread_fetch_affinity(kthread, affinity);
 
 	/* It's safe because the task is inactive.
*/ @@ -888,7 +888,7 @@ int kthread_affine_preferred(struct task_struct *p, con= st struct cpumask *mask) do_set_cpus_allowed(p, affinity); raw_spin_unlock_irqrestore(&p->pi_lock, flags); =20 - mutex_unlock(&kthreads_hotplug_lock); + mutex_unlock(&kthread_affinity_lock); out: free_cpumask_var(affinity); =20 @@ -908,9 +908,9 @@ static int kthreads_online_cpu(unsigned int cpu) struct kthread *k; int ret; =20 - guard(mutex)(&kthreads_hotplug_lock); + guard(mutex)(&kthread_affinity_lock); =20 - if (list_empty(&kthreads_hotplug)) + if (list_empty(&kthread_affinity_list)) return 0; =20 if (!zalloc_cpumask_var(&affinity, GFP_KERNEL)) @@ -918,7 +918,7 @@ static int kthreads_online_cpu(unsigned int cpu) =20 ret =3D 0; =20 - list_for_each_entry(k, &kthreads_hotplug, hotplug_node) { + list_for_each_entry(k, &kthread_affinity_list, affinity_node) { if (WARN_ON_ONCE((k->task->flags & PF_NO_SETAFFINITY) || kthread_is_per_cpu(k->task))) { ret =3D -EINVAL; --=20 2.48.1 From nobody Thu Oct 9 03:14:49 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9F7322EE292 for ; Fri, 20 Jun 2025 15:24:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433054; cv=none; b=F81WeS4ImOQ4YRq3Ky2zV1G1P/UTmQXu7tV1OWe8L0Z4AgAf8RHgZarwQapIOvCGeMuHTxV6pEM4YwYJBGlIaUlWQqdllUVkyPGTWvYwdP+rlJjnqKiNEi/xDUoNONrWBsKD+VZDNj1pO8onPddbnRO6hGKSTGDSbOTNlpYiAbs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433054; c=relaxed/simple; bh=vnW8kJaV0O0HSpg7bi8GGu5oDp8CCjUQLMI+a+c5Vcw=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=GkZjYujV1dOlDSsBCVXZoPb2EtcaRs6+lvLk9H+HxuCbH2PnGM22GrVGp3H/6R2L6x4T//rO4YDWQSMsas++jFGA+y4L13Ko00/yHSvhcg1r6AfE2XSbOh5K8Wfjt1vT7vf3T4bP1H5YFEmf1PIQHL+9VR76su+uyNnL9Fs0DEg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=PNKp1ZbC; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="PNKp1ZbC" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9261DC4AF0B; Fri, 20 Jun 2025 15:24:12 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1750433054; bh=vnW8kJaV0O0HSpg7bi8GGu5oDp8CCjUQLMI+a+c5Vcw=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=PNKp1ZbCYdnBF13DAcFDhh/M24iQe9UvTOiMbkv5Np46wgVnHH4/w7hasjZsfke1G ID8YPOrLGM0ffC+gyo6ifmHIkHJPBkF5tJ2dvHUwQyfjqimxU40wjbDQTr2b99oBpi FYrWlC+u9CmY87WFOl7lHu9ttmGqoHuqrjff4064WlhbNrcItvNO85j7B/VzcWcrcv tDvpdz4tyGnKb1u0PHLljrCvIs8d6AvB+G8olw1oUAeqeElaSvMk9cgnlYEqOs8Xwr 1NH0Ph93k4SWZRxJwrsIBCCMd7IsPVEWeV7NMcr15a5Lzh5urT5NtrWep+LMXe7h+q C0e1tfv9VCXuA== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , Marco Crivellari , Michal Hocko , Peter Zijlstra , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long Subject: [PATCH 22/27] kthread: Include unbound kthreads in the managed affinity list Date: Fri, 20 Jun 2025 17:23:03 +0200 Message-ID: <20250620152308.27492-23-frederic@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250620152308.27492-1-frederic@kernel.org> References: <20250620152308.27492-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The managed affinity list currently contains only unbound kthreads that have affinity preferences. 
Unbound kthreads that are globally affine by default are outside of the
list because their affinity is automatically managed by the scheduler
(through the fallback housekeeping mask) and by cpuset.

However, in order to preserve the preferred affinity of kthreads, cpuset
will delegate the isolated partition update propagation to the
housekeeping and kthread code.

Prepare for that by including all unbound kthreads in the managed
affinity list.

Signed-off-by: Frederic Weisbecker
---
 kernel/kthread.c | 59 ++++++++++++++++++++++++------------------------
 1 file changed, 30 insertions(+), 29 deletions(-)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index 24008dd9f3dc..138bb41ca916 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -366,9 +366,10 @@ static void kthread_fetch_affinity(struct kthread *kthread, struct cpumask *cpum
 	if (kthread->preferred_affinity) {
 		pref = kthread->preferred_affinity;
 	} else {
-		if (WARN_ON_ONCE(kthread->node == NUMA_NO_NODE))
-			return;
-		pref = cpumask_of_node(kthread->node);
+		if (kthread->node == NUMA_NO_NODE)
+			pref = housekeeping_cpumask(HK_TYPE_KTHREAD);
+		else
+			pref = cpumask_of_node(kthread->node);
 	}
 
 	cpumask_and(cpumask, pref, housekeeping_cpumask(HK_TYPE_KTHREAD));
@@ -381,32 +382,29 @@ static void kthread_affine_node(void)
 	struct kthread *kthread = to_kthread(current);
 	cpumask_var_t affinity;
 
-	WARN_ON_ONCE(kthread_is_per_cpu(current));
+	if (WARN_ON_ONCE(kthread_is_per_cpu(current)))
+		return;
 
-	if (kthread->node == NUMA_NO_NODE) {
-		housekeeping_affine(current, HK_TYPE_KTHREAD);
-	} else {
-		if (!zalloc_cpumask_var(&affinity, GFP_KERNEL)) {
-			WARN_ON_ONCE(1);
-			return;
-		}
-
-		mutex_lock(&kthread_affinity_lock);
-		WARN_ON_ONCE(!list_empty(&kthread->affinity_node));
-		list_add_tail(&kthread->affinity_node, &kthread_affinity_list);
-		/*
-		 * The node cpumask is racy when read from kthread() but:
-		 * - a racing CPU going down will either fail on the subsequent
-		 *   call to set_cpus_allowed_ptr()
or be migrated to housekeepers - * afterwards by the scheduler. - * - a racing CPU going up will be handled by kthreads_online_cpu() - */ - kthread_fetch_affinity(kthread, affinity); - set_cpus_allowed_ptr(current, affinity); - mutex_unlock(&kthread_affinity_lock); - - free_cpumask_var(affinity); + if (!zalloc_cpumask_var(&affinity, GFP_KERNEL)) { + WARN_ON_ONCE(1); + return; } + + mutex_lock(&kthread_affinity_lock); + WARN_ON_ONCE(!list_empty(&kthread->affinity_node)); + list_add_tail(&kthread->affinity_node, &kthread_affinity_list); + /* + * The node cpumask is racy when read from kthread() but: + * - a racing CPU going down will either fail on the subsequent + * call to set_cpus_allowed_ptr() or be migrated to housekeepers + * afterwards by the scheduler. + * - a racing CPU going up will be handled by kthreads_online_cpu() + */ + kthread_fetch_affinity(kthread, affinity); + set_cpus_allowed_ptr(current, affinity); + mutex_unlock(&kthread_affinity_lock); + + free_cpumask_var(affinity); } =20 static int kthread(void *_create) @@ -924,8 +922,11 @@ static int kthreads_online_cpu(unsigned int cpu) ret =3D -EINVAL; continue; } - kthread_fetch_affinity(k, affinity); - set_cpus_allowed_ptr(k->task, affinity); + + if (k->preferred_affinity || k->node !=3D NUMA_NO_NODE) { + kthread_fetch_affinity(k, affinity); + set_cpus_allowed_ptr(k->task, affinity); + } } =20 free_cpumask_var(affinity); --=20 2.48.1 From nobody Thu Oct 9 03:14:49 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 209F52EF2A3 for ; Fri, 20 Jun 2025 15:24:16 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433059; cv=none; 
b=oDdUPgXBPs7R3zYE6GxBfEIHd1uEglCLSsHLJKuQEGbxhihLftVneX28RiuQ6UlC/z0dU9LugVy1bzx7eomYxZQSLl+2mWUblUIHO1RCGU1+Ej0vgeFQIhMd+XsYBe0vzQaHmEzI06nVenKJApyBYY3yN6GVDL+ZghVOJq65EUc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433059; c=relaxed/simple; bh=twoxbeTMGkniqCFrzdy3nm9knKsAh3KTQan5m/kwiBs=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=OUEGG1QXhYjgRJzPQ2NuA7fJsbEO0t1vby4g7T0jQoPHoaWpAfvdr+j1ZvZbxH/Guqqh11h8AK48nDseR35236jxUa9nezN/JcqYGAhg9Zz9fkxgi7uJKSnZ0h953nwRDqP7OEUv0l2ZrNxhfm3zwx2eg0cdAxqOC3hCCk62o/k= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Aa6wAJOJ; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Aa6wAJOJ" Received: by smtp.kernel.org (Postfix) with ESMTPSA id E3F45C4CEF0; Fri, 20 Jun 2025 15:24:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1750433056; bh=twoxbeTMGkniqCFrzdy3nm9knKsAh3KTQan5m/kwiBs=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Aa6wAJOJNBmKK3dCwCp26/7VmVS4d/gR+fX44/oXGFHNJCkLGttEiwF8vEjdOeNky Ljz4OGtc69qWvEnlEYeglGCwpu7j5LtbfNBn1ggyzxg38nArwSqXQ5uZSUct7WnG9J F4Xaffwsi31iyTQR7nGD+lfzo5NJRgh3I62oacDppI4/5tf+fp6ZOYoRwUWeLpmw71 X/xdGfbn8Eh8f3EPY2V2PuxtcAe0l2sCb4ZtAHvJkop2i7Mg8r+kn4GijzCAEK/kKn iLZ9lhJIb6vvcxmfHH5CuQo8p+lVymCClF8brY3SAPkeODzCvO7OMa5uPnrnq6jYOn fWYip5d4oyKhg== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , Marco Crivellari , Michal Hocko , Peter Zijlstra , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long Subject: [PATCH 23/27] kthread: Include kthreadd to the managed affinity list Date: Fri, 20 Jun 2025 17:23:04 +0200 Message-ID: <20250620152308.27492-24-frederic@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250620152308.27492-1-frederic@kernel.org> 
References: <20250620152308.27492-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The unbound kthreads affinity management performed by cpuset is going to be imported to the kthread core code for consolidation purposes. Treat kthreadd just like any other kthread. Signed-off-by: Frederic Weisbecker --- kernel/kthread.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/kernel/kthread.c b/kernel/kthread.c index 138bb41ca916..4aeb09be29f0 100644 --- a/kernel/kthread.c +++ b/kernel/kthread.c @@ -821,12 +821,13 @@ int kthreadd(void *unused) /* Setup a clean context for our children to inherit. */ set_task_comm(tsk, comm); ignore_signals(tsk); - set_cpus_allowed_ptr(tsk, housekeeping_cpumask(HK_TYPE_KTHREAD)); set_mems_allowed(node_states[N_MEMORY]); =20 current->flags |=3D PF_NOFREEZE; cgroup_init_kthreadd(); =20 + kthread_affine_node(); + for (;;) { set_current_state(TASK_INTERRUPTIBLE); if (list_empty(&kthread_create_list)) --=20 2.48.1 From nobody Thu Oct 9 03:14:49 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 299AF2EF290 for ; Fri, 20 Jun 2025 15:24:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433059; cv=none; b=LM16Tvq01cvm4IIq9uM5nE+A7KUeaEZlwjbjrQRTemFyFE0jv5m+7R6yhsui3AI3dLrWv1JvoMUIWcLN97N0RAaqx0M0GpEqrnSh7MBZC5LWBjwmzTtfFZRe+mDXt1TyRZRFt1k5hBtGn5WHw0EBj+0DnQuxXgfKdDC9r6wp+O4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433059; c=relaxed/simple; 
bh=UV3VyURymvSUapUQQPhpkbrDPxSIsZH5yBdm53mj+3g=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=fWD759cLcfi6Zm6ra2WW4paroIhEOv8RlWLcIEua2gEKaHfeTFD3B4KZTcu/4Yzp7H9WyKUhViRDL+Jtr85obcPBHiDlMbN6ZccWETkZ5cOw4ShxLdLmrfPX7m9Pva2xjsW3MUk3i8vQuCxEQWGRrVTJ/CVXbveplbMgwGWe7iY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=l9AS5h+3; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="l9AS5h+3" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2FD72C4CEE3; Fri, 20 Jun 2025 15:24:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1750433059; bh=UV3VyURymvSUapUQQPhpkbrDPxSIsZH5yBdm53mj+3g=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=l9AS5h+3hJJV9B0ge45p7UVR+KMUPjC6LcP836Bc5B8IwxD235Pz1hB0H36fqcBLg ZzE27hZNJaQGuHHFiTzqpvGS+DZTQtpsc7+UBqgE9Zd+Mr00wbhPHnnGmFIaICjXe8 Y4SDQaBwUaHyM8q/ue5b4Lch/Ygpy9dmWO+BnyuDRLynSeoAy9EyhYb6NujPGquXAE zNqgu4fJ5u0vzDdetDHY8WUFGa/tf2XJ+0qgZDeAOiuZ5sUtcifEp7uvIgUkxXQ0za FPnYeutDcjSc9YA69dbHiI6JXdZn55pUNUqIASnemI1pVWPd6AozI/frjxTsFI237T 3N4QU1gqWluow== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , Marco Crivellari , Michal Hocko , Peter Zijlstra , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long Subject: [PATCH 24/27] kthread: Rely on HK_TYPE_DOMAIN for preferred affinity management Date: Fri, 20 Jun 2025 17:23:05 +0200 Message-ID: <20250620152308.27492-25-frederic@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250620152308.27492-1-frederic@kernel.org> References: <20250620152308.27492-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Unbound 
kthreads want to run neither on nohz_full CPUs nor on domain isolated CPUs. And since nohz_full implies domain isolation, checking the latter is enough to verify both. Therefore exclude kthreads from domain isolation. Signed-off-by: Frederic Weisbecker --- kernel/kthread.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/kernel/kthread.c b/kernel/kthread.c index 4aeb09be29f0..42cd6e119335 100644 --- a/kernel/kthread.c +++ b/kernel/kthread.c @@ -363,18 +363,20 @@ static void kthread_fetch_affinity(struct kthread *kt= hread, struct cpumask *cpum { const struct cpumask *pref; =20 + guard(rcu)(); + if (kthread->preferred_affinity) { pref =3D kthread->preferred_affinity; } else { if (kthread->node =3D=3D NUMA_NO_NODE) - pref =3D housekeeping_cpumask(HK_TYPE_KTHREAD); + pref =3D housekeeping_cpumask(HK_TYPE_DOMAIN); else pref =3D cpumask_of_node(kthread->node); } =20 - cpumask_and(cpumask, pref, housekeeping_cpumask(HK_TYPE_KTHREAD)); + cpumask_and(cpumask, pref, housekeeping_cpumask(HK_TYPE_DOMAIN)); if (cpumask_empty(cpumask)) - cpumask_copy(cpumask, housekeeping_cpumask(HK_TYPE_KTHREAD)); + cpumask_copy(cpumask, housekeeping_cpumask(HK_TYPE_DOMAIN)); } =20 static void kthread_affine_node(void) --=20 2.48.1 From nobody Thu Oct 9 03:14:49 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 63FB52EF2B0 for ; Fri, 20 Jun 2025 15:24:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433061; cv=none; b=uytM7mKZGbTQV1/bWcWSqNsAYNBjpBdNnPO6a7sfO/P1611sv5afa384HwWb1mo0U5oPRjyN7phv/IfRqv9WL17JTtFw2GBcVj7ha4tTD0kdr7Pbdzvw+U98PQuaEba7tHPk4DJ26xfcD/nR1/z7syUNljJwrgRng3ILmraqTIg= ARC-Message-Signature: i=1; a=rsa-sha256; 
d=subspace.kernel.org; s=arc-20240116; t=1750433061; c=relaxed/simple; bh=K5NmiuLK/ABSDoPkESRn/wBApzEKeov/t7f19NS3wSg=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=GCQEsgmewlWYU/XdiM5bRQ3VVn4QGwJT9/FkQQ9w1F4c2F34gN8XiF/+DUsHlRSE3Odq320f8bUneVfVk4wgWP7jygNWCI2qYUhWA++3jfkdM94HrmwcdtgU+BoTeNdalidU/AIJ+vbWcZS+rJpevX8gK+tXMahvAipD32kyero= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=e/U0cgJR; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="e/U0cgJR" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 6C958C4CEF1; Fri, 20 Jun 2025 15:24:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1750433061; bh=K5NmiuLK/ABSDoPkESRn/wBApzEKeov/t7f19NS3wSg=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=e/U0cgJRyMi91O3AcZBEVd/rp/Ohq8cZAp9X5yy+7dXaAMnfIvymRPoloYFKxMTQ5 TM+hCwDh6kJkYOHAMx17+5tT7LLWf9JZL+cydKzVBGdtzGO7nKK5EbLiIQEFz4AAZC bnnYzQ5xP52JMY8kUqdp7itx8Y34gYl6EI+YteJRTWPU59+u1MJdwxmNdoPNERi1V8 MkRI7nBuwaOeWgsVIooUBnR88KVVHGvMlU2sMoKX68vSdvIU9a29VgwmFoC8Ja4arN aYcRQp8u1G6LPnmki6lkfQHMW7dYlImE4OoM88nSxi21mrMORtN0a81q72FzpFq2ee EATtrtmiBfjHA== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , Marco Crivellari , Michal Hocko , Peter Zijlstra , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long Subject: [PATCH 25/27] sched: Switch the fallback task allowed cpumask to HK_TYPE_DOMAIN Date: Fri, 20 Jun 2025 17:23:06 +0200 Message-ID: <20250620152308.27492-26-frederic@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250620152308.27492-1-frederic@kernel.org> References: <20250620152308.27492-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 
quoted-printable Content-Type: text/plain; charset="utf-8" Tasks that have all their allowed CPUs offline don't want their affinity to fall back on either nohz_full CPUs or on domain isolated CPUs. And since nohz_full implies domain isolation, checking the latter is enough to verify both. Therefore exclude domain isolated CPUs from the fallback task affinity. Signed-off-by: Frederic Weisbecker --- include/linux/mmu_context.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/mmu_context.h b/include/linux/mmu_context.h index ac01dc4eb2ce..ed3dd0f3fe19 100644 --- a/include/linux/mmu_context.h +++ b/include/linux/mmu_context.h @@ -24,7 +24,7 @@ static inline void leave_mm(void) { } #ifndef task_cpu_possible_mask # define task_cpu_possible_mask(p) cpu_possible_mask # define task_cpu_possible(cpu, p) true -# define task_cpu_fallback_mask(p) housekeeping_cpumask(HK_TYPE_TICK) +# define task_cpu_fallback_mask(p) housekeeping_cpumask(HK_TYPE_DOMAIN) #else # define task_cpu_possible(cpu, p) cpumask_test_cpu((cpu), task_cpu_possib= le_mask(p)) #endif --=20 2.48.1 From nobody Thu Oct 9 03:14:49 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B08872DCC05; Fri, 20 Jun 2025 15:24:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433064; cv=none; b=GDhjxaKEkaK4Pvf8e710YZ4563k4orXERzK+p/frX+qIj1yYkZ4GQb55gCREy48haXj7FkjIPrbGsllt7jXrj9KZ0F400d/BHaeGctkvkTDygIXqAlXuujvd4tutwrAPXLnRMKvWMLzBEvq7GMyoBFrgo4Fi46mF3Lui51Udacc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433064; c=relaxed/simple; bh=7c2L9Rkr1Q3nbhXIN3mYilgx0lbW4iBrIbPuK7IzLQU=; 
h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=RNGXvzBf3JZ5/CoCzhE3pQINYmhWszPQX9DL37TlKImS6l5hv8YooB1hejs6x6wfc7AMrSRWWC9pjB41FXvwUoPdH0g7uTrRppDdJoA5djlAIbaIfFcj6znPx5oc4suWr57vovGoisWS5JRqIUy1OTnGBaXlFCExEa90gug+AMA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=EAjCYF5W; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="EAjCYF5W" Received: by smtp.kernel.org (Postfix) with ESMTPSA id A9CBCC4CEF1; Fri, 20 Jun 2025 15:24:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1750433064; bh=7c2L9Rkr1Q3nbhXIN3mYilgx0lbW4iBrIbPuK7IzLQU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=EAjCYF5WIBh0wREJw2JnCFFkjoim4hOhMVvtZxkqoTv+Ps9zV8by/OD9UQrsm12aE +4xutNHniS1BMpSZUpFV67R9hJtLpOCbGoGw57X+hhqKKx7wiR2SjLZvuBtpAoPGqN boINr+Yvf8YFLTceOTik+6EUnuOvo36c8hiRSQGB+sUPMZf5hb3OYP6A2E4Bg0Hkif AaBHkB1vcT03yFmzHg5hAXAvLgT53+HOc3Lr40yK1q5QY1gza/ZeD6t26woyiNaa+s KcOBM0OgYeN4IBYTUdbOS/1WANJQb0QgpyvRWUCRe1lDOnmiDFQUkyftbf3LftJUc1 k7SkbLWM/z0rQ== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Ingo Molnar , Johannes Weiner , Marco Crivellari , Michal Hocko , Peter Zijlstra , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long , cgroups@vger.kernel.org Subject: [PATCH 26/27] kthread: Honour kthreads preferred affinity after cpuset changes Date: Fri, 20 Jun 2025 17:23:07 +0200 Message-ID: <20250620152308.27492-27-frederic@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250620152308.27492-1-frederic@kernel.org> References: <20250620152308.27492-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable 
Content-Type: text/plain; charset="utf-8" When cpuset isolated partitions get updated, unbound kthreads are indiscriminately made affine to all non-isolated CPUs, regardless of their individual affinity preferences. For example kswapd is a per-node kthread that prefers to be affine to the node it refers to. Whenever an isolated partition is created, updated or deleted, kswapd's node affinity is going to be broken if any CPU in the related node is not isolated because kswapd will be affine globally. Fix this by letting the consolidated kthread managed affinity code do the affinity update on behalf of cpuset. Signed-off-by: Frederic Weisbecker --- include/linux/kthread.h | 1 + kernel/cgroup/cpuset.c | 5 ++--- kernel/kthread.c | 38 +++++++++++++++++++++++++++++--------- kernel/sched/isolation.c | 2 ++ 4 files changed, 34 insertions(+), 12 deletions(-) diff --git a/include/linux/kthread.h b/include/linux/kthread.h index 8d27403888ce..c92c1149ee6e 100644 --- a/include/linux/kthread.h +++ b/include/linux/kthread.h @@ -100,6 +100,7 @@ void kthread_unpark(struct task_struct *k); void kthread_parkme(void); void kthread_exit(long result) __noreturn; void kthread_complete_and_exit(struct completion *, long) __noreturn; +int kthreads_update_housekeeping(void); =20 int kthreadd(void *unused); extern struct task_struct *kthreadd_task; diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c index db80e72681ed..99ee187d941b 100644 --- a/kernel/cgroup/cpuset.c +++ b/kernel/cgroup/cpuset.c @@ -1130,11 +1130,10 @@ void cpuset_update_tasks_cpumask(struct cpuset *cs,= struct cpumask *new_cpus) =20 if (top_cs) { /* + * PF_KTHREAD tasks are handled by housekeeping. * PF_NO_SETAFFINITY tasks are ignored. - * All per cpu kthreads should have PF_NO_SETAFFINITY - * flag set, see kthread_set_per_cpu(). 
*/ - if (task->flags & PF_NO_SETAFFINITY) + if (task->flags & (PF_KTHREAD | PF_NO_SETAFFINITY)) continue; cpumask_andnot(new_cpus, possible_mask, subpartitions_cpus); } else { diff --git a/kernel/kthread.c b/kernel/kthread.c index 42cd6e119335..8c1268c2cee9 100644 --- a/kernel/kthread.c +++ b/kernel/kthread.c @@ -896,14 +896,7 @@ int kthread_affine_preferred(struct task_struct *p, co= nst struct cpumask *mask) return ret; } =20 -/* - * Re-affine kthreads according to their preferences - * and the newly online CPU. The CPU down part is handled - * by select_fallback_rq() which default re-affines to - * housekeepers from other nodes in case the preferred - * affinity doesn't apply anymore. - */ -static int kthreads_online_cpu(unsigned int cpu) +static int kthreads_update_affinity(bool force) { cpumask_var_t affinity; struct kthread *k; @@ -926,7 +919,7 @@ static int kthreads_online_cpu(unsigned int cpu) continue; } =20 - if (k->preferred_affinity || k->node !=3D NUMA_NO_NODE) { + if (force || k->preferred_affinity || k->node !=3D NUMA_NO_NODE) { kthread_fetch_affinity(k, affinity); set_cpus_allowed_ptr(k->task, affinity); } @@ -937,6 +930,33 @@ static int kthreads_online_cpu(unsigned int cpu) return ret; } =20 +/** + * kthreads_update_housekeeping - Update kthreads affinity on cpuset change + * + * When cpuset changes a partition type to/from "isolated" or updates rela= ted + * cpumasks, propagate the housekeeping cpumask change to preferred kthrea= ds + * affinity. + * + * Returns 0 if successful, -ENOMEM if temporary mask couldn't + * be allocated or -EINVAL in case of internal error. + */ +int kthreads_update_housekeeping(void) +{ + return kthreads_update_affinity(true); +} + +/* + * Re-affine kthreads according to their preferences + * and the newly online CPU. The CPU down part is handled + * by select_fallback_rq() which default re-affines to + * housekeepers from other nodes in case the preferred + * affinity doesn't apply anymore. 
+ */ +static int kthreads_online_cpu(unsigned int cpu) +{ + return kthreads_update_affinity(false); +} + static int kthreads_init(void) { return cpuhp_setup_state(CPUHP_AP_KTHREADS_ONLINE, "kthreads:online", diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c index e4e4fcd4cb2c..2750b80a5511 100644 --- a/kernel/sched/isolation.c +++ b/kernel/sched/isolation.c @@ -144,6 +144,8 @@ int housekeeping_update(struct cpumask *mask, enum hk_t= ype type) mem_cgroup_flush_workqueue(); vmstat_flush_workqueue(); err =3D workqueue_unbound_exclude_cpumask(housekeeping_cpumask(type)); + WARN_ON_ONCE(err < 0); + err =3D kthreads_update_housekeeping(); =20 kfree(old); =20 --=20 2.48.1 From nobody Thu Oct 9 03:14:49 2025 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C27A92DE1F6 for ; Fri, 20 Jun 2025 15:24:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433066; cv=none; b=S4Itv9YZINJ1htyA1n4gRc7KCFCjerEE9uq9Aw348mHCYR1A9NppFCBGw75URkQhWUWdDLBXReDbuZUtPIhxr3ku4PhY7g83Yz76z3NSEmt/hmr4ejxRksyYiRFTHReTkcFFROmrkL4skC8OMZi5Aw2d1ZYaCsFjlG6X7ZBU99w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750433066; c=relaxed/simple; bh=FkUNWsMflMl5tLznIKDkBisS5Q1eWoqBcZohsw2Uy7U=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=RGn9WfSrRDamjNlF1J4J9U/M7C4Id/P31vc6l7O8y3u47+MLFpBZSaq1UzDn/nFY2GdjkU6lHap5HkdBZqSaDlN3XeJeQP82WLi+kMV8yF9qMG4q6lNN64QUxWELM0IYaUgpkwqAZRjIu+bavY+ziIU5thITl4S2y+HPe1nXkmU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=jil0UiD/; arc=none smtp.client-ip=10.30.226.201 
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="jil0UiD/" Received: by smtp.kernel.org (Postfix) with ESMTPSA id A7CE0C4CEF2; Fri, 20 Jun 2025 15:24:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1750433066; bh=FkUNWsMflMl5tLznIKDkBisS5Q1eWoqBcZohsw2Uy7U=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=jil0UiD/f/N7lgmuVUH+zt4aQwr40Ej1Ju1mPR87ZzCW7ykTl+MsnjZwrK/YD94KU OAimXSulkjaAStmnOcwrrfnMwUnKU2FqHQv1inCGJnFrRizSeamphgpOwNMtqbLgIr d3l6NL5s/fwMn5Ouc2CwUkLPVtU6evu8HIUcdjzB6rK27XIw1tN/FS5cxiX/36Ed5O pCLQnOVHP6Gi6/ldFj59qBGkfYqVZXALXJMCdnJt9hRir39wGsZr1I58fi2FxWgZYI nSmcyR662++GghJbrTsCd7aghBEITgx8cggvMOVPVOcsPtXGmkp9wW/6BDnw8Zww1H WEBW1zPriEYgw== From: Frederic Weisbecker To: LKML Cc: Frederic Weisbecker , Marco Crivellari , Michal Hocko , Peter Zijlstra , Tejun Heo , Thomas Gleixner , Vlastimil Babka , Waiman Long Subject: [PATCH 27/27] kthread: Comment on the purpose and placement of kthread_affine_node() call Date: Fri, 20 Jun 2025 17:23:08 +0200 Message-ID: <20250620152308.27492-28-frederic@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: <20250620152308.27492-1-frederic@kernel.org> References: <20250620152308.27492-1-frederic@kernel.org> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" It may not appear obvious why kthread_affine_node() is not called before the kthread creation completion instead of after the first wake-up. The reason is that kthread_affine_node() applies a default affinity behaviour that only takes place if no affinity preference has already been passed by the kthread creation call site. Add a comment to clarify that. 
Reported-by: Peter Zijlstra Signed-off-by: Frederic Weisbecker --- kernel/kthread.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/kernel/kthread.c b/kernel/kthread.c index 8c1268c2cee9..85e29b250107 100644 --- a/kernel/kthread.c +++ b/kernel/kthread.c @@ -454,6 +454,10 @@ static int kthread(void *_create) =20 self->started =3D 1; =20 + /* + * Apply default node affinity if no call to kthread_bind[_mask]() nor + * kthread_affine_preferred() was issued before the first wake-up. + */ if (!(current->flags & PF_NO_SETAFFINITY) && !self->preferred_affinity) kthread_affine_node(); =20 --=20 2.48.1
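For readers following the last patches of the series, the affinity selection in kthread_fetch_affinity() (patch 24) and the default-affinity gate commented in patch 27 can be modelled in userspace roughly as below. This is only a sketch under stated assumptions, not kernel code: cpu_bits, struct model_kthread, MODEL_NO_NODE, model_fetch_affinity() and model_wants_default_affinity() are hypothetical names invented for illustration, and a plain 64-bit word stands in for struct cpumask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: one bit per CPU. */
typedef uint64_t cpu_bits;

#define MODEL_NO_NODE (-1)	/* stands in for NUMA_NO_NODE */

/* Hypothetical, reduced view of the fields the patches look at. */
struct model_kthread {
	bool has_preferred;	/* models kthread->preferred_affinity != NULL */
	cpu_bits preferred;	/* the preferred mask, if any */
	int node;		/* NUMA node, or MODEL_NO_NODE */
	bool no_setaffinity;	/* models PF_NO_SETAFFINITY */
};

/*
 * Models the selection done by kthread_fetch_affinity() after patch 24:
 * take the explicit preference, else the node's CPUs, else the
 * HK_TYPE_DOMAIN housekeeping mask; intersect with housekeeping and
 * fall back to housekeeping alone if the intersection is empty.
 */
static cpu_bits model_fetch_affinity(const struct model_kthread *k,
				     cpu_bits hk_domain,
				     const cpu_bits *node_cpus)
{
	cpu_bits pref;

	if (k->has_preferred)
		pref = k->preferred;
	else if (k->node == MODEL_NO_NODE)
		pref = hk_domain;
	else
		pref = node_cpus[k->node];

	pref &= hk_domain;
	return pref ? pref : hk_domain;
}

/*
 * Models the condition commented in patch 27: default node affinity is
 * applied only when nothing bound the kthread or gave it a preference
 * before its first wake-up.
 */
static bool model_wants_default_affinity(const struct model_kthread *k)
{
	return !k->no_setaffinity && !k->has_preferred;
}
```

With node 1 covering CPUs 4-7 and housekeeping covering CPUs 0-5, a kthread on node 1 without a preference ends up on CPUs 4-5 in this model; if housekeeping shrinks to CPUs 0-3, the intersection is empty and the model falls back to the whole housekeeping mask.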