From nobody Sun Feb 8 20:59:04 2026
From: Chen Ridong <chenridong@huaweicloud.com>
To: longman@redhat.com, tj@kernel.org, hannes@cmpxchg.org, mkoutny@suse.com
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
	lujialin4@huawei.com, chenridong@huaweicloud.com
Subject: [PATCH -next v2 1/6] cpuset: add lockdep_assert_cpuset_lock_held helper
Date: Thu, 18 Dec 2025 09:31:36 +0000
Message-Id: <20251218093141.2687291-2-chenridong@huaweicloud.com>
In-Reply-To: <20251218093141.2687291-1-chenridong@huaweicloud.com>
References: <20251218093141.2687291-1-chenridong@huaweicloud.com>

From: Chen Ridong <chenridong@huaweicloud.com>

Add lockdep_assert_cpuset_lock_held() to allow other subsystems to
verify that cpuset_mutex is held.

Suggested-by: Waiman Long <longman@redhat.com>
Signed-off-by: Chen Ridong <chenridong@huaweicloud.com>
Reviewed-by: Waiman Long <longman@redhat.com>
---
 include/linux/cpuset.h | 2 ++
 kernel/cgroup/cpuset.c | 5 +++++
 2 files changed, 7 insertions(+)

diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h
index a98d3330385c..a634489c9f12 100644
--- a/include/linux/cpuset.h
+++ b/include/linux/cpuset.h
@@ -74,6 +74,7 @@ extern void inc_dl_tasks_cs(struct task_struct *task);
 extern void dec_dl_tasks_cs(struct task_struct *task);
 extern void cpuset_lock(void);
 extern void cpuset_unlock(void);
+extern void lockdep_assert_cpuset_lock_held(void);
 extern void cpuset_cpus_allowed_locked(struct task_struct *p, struct cpumask *mask);
 extern void cpuset_cpus_allowed(struct task_struct *p, struct cpumask *mask);
 extern bool cpuset_cpus_allowed_fallback(struct task_struct *p);
@@ -195,6 +196,7 @@ static inline void inc_dl_tasks_cs(struct task_struct *task) { }
 static inline void dec_dl_tasks_cs(struct task_struct *task) { }
 static inline void cpuset_lock(void) { }
 static inline void cpuset_unlock(void) { }
+static inline void lockdep_assert_cpuset_lock_held(void) { }
 
 static inline void cpuset_cpus_allowed_locked(struct task_struct *p,
 					      struct cpumask *mask)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index fea577b4016a..4fa3080da3be 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -271,6 +271,11 @@ void cpuset_unlock(void)
 	mutex_unlock(&cpuset_mutex);
 }
 
+void lockdep_assert_cpuset_lock_held(void)
+{
+	lockdep_assert_held(&cpuset_mutex);
+}
+
 /**
  * cpuset_full_lock - Acquire full protection for cpuset modification
  *
-- 
2.34.1
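An illustrative aside, not part of the posted patches: a minimal sketch of how a
caller outside kernel/cgroup/cpuset.c might use the new assertion.
cpuset_adjust_foo() and its caller are made-up names; only cpuset_lock(),
cpuset_unlock() and lockdep_assert_cpuset_lock_held() are real cpuset API here.

#include <linux/cpuset.h>

/* Hypothetical helper that documents and checks its locking contract. */
static void cpuset_adjust_foo(void)
{
	/* Complains, when lockdep is enabled, if cpuset_mutex is not held. */
	lockdep_assert_cpuset_lock_held();

	/* ... touch state that is serialized by cpuset_mutex ... */
}

static void cpuset_adjust_foo_caller(void)
{
	cpuset_lock();
	cpuset_adjust_foo();
	cpuset_unlock();
}

When CONFIG_CPUSETS is disabled, the static inline stub makes the assertion
compile away, so callers need no #ifdef.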
From nobody Sun Feb 8 20:59:04 2026
From: Chen Ridong <chenridong@huaweicloud.com>
To: longman@redhat.com, tj@kernel.org, hannes@cmpxchg.org, mkoutny@suse.com
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
	lujialin4@huawei.com, chenridong@huaweicloud.com
Subject: [PATCH -next v2 2/6] cpuset: add cpuset1_online_css helper for v1-specific operations
Date: Thu, 18 Dec 2025 09:31:37 +0000
Message-Id: <20251218093141.2687291-3-chenridong@huaweicloud.com>
In-Reply-To: <20251218093141.2687291-1-chenridong@huaweicloud.com>
References: <20251218093141.2687291-1-chenridong@huaweicloud.com>

From: Chen Ridong <chenridong@huaweicloud.com>

Introduce the cpuset1_online_css() helper to centralize v1-specific
handling during cpuset online: it sets the CS_SPREAD_PAGE and
CS_SPREAD_SLAB flags from the parent and clones the parent's
configuration when CGRP_CPUSET_CLONE_CHILDREN is set, all of which is
unique to the cpuset v1 control group interface. The helper is placed
in cpuset-v1.c to maintain a clear separation between v1 and v2 logic.

Signed-off-by: Chen Ridong <chenridong@huaweicloud.com>
Reviewed-by: Waiman Long <longman@redhat.com>
---
 kernel/cgroup/cpuset-internal.h |  2 ++
 kernel/cgroup/cpuset-v1.c       | 48 +++++++++++++++++++++++++++++++++
 kernel/cgroup/cpuset.c          | 39 +--------------------------
 3 files changed, 51 insertions(+), 38 deletions(-)

diff --git a/kernel/cgroup/cpuset-internal.h b/kernel/cgroup/cpuset-internal.h
index 01976c8e7d49..6c03cad02302 100644
--- a/kernel/cgroup/cpuset-internal.h
+++ b/kernel/cgroup/cpuset-internal.h
@@ -293,6 +293,7 @@ void cpuset1_hotplug_update_tasks(struct cpuset *cs,
 			struct cpumask *new_cpus, nodemask_t *new_mems,
 			bool cpus_updated, bool mems_updated);
 int cpuset1_validate_change(struct cpuset *cur, struct cpuset *trial);
+void cpuset1_online_css(struct cgroup_subsys_state *css);
 #else
 static inline void fmeter_init(struct fmeter *fmp) {}
 static inline void cpuset1_update_task_spread_flags(struct cpuset *cs,
@@ -303,6 +304,7 @@ static inline void cpuset1_hotplug_update_tasks(struct cpuset *cs,
 			bool cpus_updated, bool mems_updated) {}
 static inline int cpuset1_validate_change(struct cpuset *cur,
 			struct cpuset *trial) { return 0; }
+static inline void cpuset1_online_css(struct cgroup_subsys_state *css) {}
 #endif /* CONFIG_CPUSETS_V1 */
 
 #endif /* __CPUSET_INTERNAL_H */
diff --git a/kernel/cgroup/cpuset-v1.c b/kernel/cgroup/cpuset-v1.c
index 12e76774c75b..c296ce47616a 100644
--- a/kernel/cgroup/cpuset-v1.c
+++ b/kernel/cgroup/cpuset-v1.c
@@ -499,6 +499,54 @@ static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
 	return retval;
 }
 
+void cpuset1_online_css(struct cgroup_subsys_state *css)
+{
+	struct cpuset *tmp_cs;
+	struct cgroup_subsys_state *pos_css;
+	struct cpuset *cs = css_cs(css);
+	struct cpuset *parent = parent_cs(cs);
+
+	lockdep_assert_cpus_held();
+	lockdep_assert_cpuset_lock_held();
+
+	if (is_spread_page(parent))
+		set_bit(CS_SPREAD_PAGE, &cs->flags);
+	if (is_spread_slab(parent))
+		set_bit(CS_SPREAD_SLAB, &cs->flags);
+
+	if (!test_bit(CGRP_CPUSET_CLONE_CHILDREN, &css->cgroup->flags))
+		return;
+
+	/*
+	 * Clone @parent's configuration if CGRP_CPUSET_CLONE_CHILDREN is
+	 * set.  This flag handling is implemented in cgroup core for
+	 * historical reasons - the flag may be specified during mount.
+	 *
+	 * Currently, if any sibling cpusets have exclusive cpus or mem, we
+	 * refuse to clone the configuration - thereby refusing the task to
+	 * be entered, and as a result refusing the sys_unshare() or
+	 * clone() which initiated it.  If this becomes a problem for some
+	 * users who wish to allow that scenario, then this could be
+	 * changed to grant parent->cpus_allowed-sibling_cpus_exclusive
+	 * (and likewise for mems) to the new cgroup.
+	 */
+	rcu_read_lock();
+	cpuset_for_each_child(tmp_cs, pos_css, parent) {
+		if (is_mem_exclusive(tmp_cs) || is_cpu_exclusive(tmp_cs)) {
+			rcu_read_unlock();
+			return;
+		}
+	}
+	rcu_read_unlock();
+
+	cpuset_callback_lock_irq();
+	cs->mems_allowed = parent->mems_allowed;
+	cs->effective_mems = parent->mems_allowed;
+	cpumask_copy(cs->cpus_allowed, parent->cpus_allowed);
+	cpumask_copy(cs->effective_cpus, parent->cpus_allowed);
+	cpuset_callback_unlock_irq();
+}
+
 /*
  * for the common functions, 'private' gives the type of file
  */
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 4fa3080da3be..5d9dbd1aeed3 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -3616,17 +3616,11 @@ static int cpuset_css_online(struct cgroup_subsys_state *css)
 {
 	struct cpuset *cs = css_cs(css);
 	struct cpuset *parent = parent_cs(cs);
-	struct cpuset *tmp_cs;
-	struct cgroup_subsys_state *pos_css;
 
 	if (!parent)
 		return 0;
 
 	cpuset_full_lock();
-	if (is_spread_page(parent))
-		set_bit(CS_SPREAD_PAGE, &cs->flags);
-	if (is_spread_slab(parent))
-		set_bit(CS_SPREAD_SLAB, &cs->flags);
 	/*
 	 * For v2, clear CS_SCHED_LOAD_BALANCE if parent is isolated
	 */
@@ -3641,39 +3635,8 @@ static int cpuset_css_online(struct cgroup_subsys_state *css)
 		cs->effective_mems = parent->effective_mems;
 	}
 	spin_unlock_irq(&callback_lock);
+	cpuset1_online_css(css);
 
-	if (!test_bit(CGRP_CPUSET_CLONE_CHILDREN, &css->cgroup->flags))
-		goto out_unlock;
-
-	/*
-	 * Clone @parent's configuration if CGRP_CPUSET_CLONE_CHILDREN is
-	 * set.  This flag handling is implemented in cgroup core for
-	 * historical reasons - the flag may be specified during mount.
-	 *
-	 * Currently, if any sibling cpusets have exclusive cpus or mem, we
-	 * refuse to clone the configuration - thereby refusing the task to
-	 * be entered, and as a result refusing the sys_unshare() or
-	 * clone() which initiated it.  If this becomes a problem for some
-	 * users who wish to allow that scenario, then this could be
-	 * changed to grant parent->cpus_allowed-sibling_cpus_exclusive
-	 * (and likewise for mems) to the new cgroup.
-	 */
-	rcu_read_lock();
-	cpuset_for_each_child(tmp_cs, pos_css, parent) {
-		if (is_mem_exclusive(tmp_cs) || is_cpu_exclusive(tmp_cs)) {
-			rcu_read_unlock();
-			goto out_unlock;
-		}
-	}
-	rcu_read_unlock();
-
-	spin_lock_irq(&callback_lock);
-	cs->mems_allowed = parent->mems_allowed;
-	cs->effective_mems = parent->mems_allowed;
-	cpumask_copy(cs->cpus_allowed, parent->cpus_allowed);
-	cpumask_copy(cs->effective_cpus, parent->cpus_allowed);
-	spin_unlock_irq(&callback_lock);
-out_unlock:
 	cpuset_full_unlock();
 	return 0;
 }
-- 
2.34.1
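An illustrative aside, not part of the posted patches: the split above relies on
the usual "real function when the option is on, empty static inline when it is
off" pattern, so cpuset_css_online() can call the v1 hook unconditionally. A
standalone, self-contained sketch of that pattern; CONFIG_FEATURE_V1 and
feature_v1_online() are made-up names, not kernel symbols.

#include <stdio.h>

/* #define CONFIG_FEATURE_V1 1 */

#ifdef CONFIG_FEATURE_V1
static void feature_v1_online(int id)
{
	printf("v1-only online handling for object %d\n", id);
}
#else
static inline void feature_v1_online(int id)
{
	(void)id;	/* compiled out: no v1 support configured */
}
#endif

int main(void)
{
	/* The caller stays free of #ifdefs, like cpuset_css_online(). */
	feature_v1_online(42);
	return 0;
}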
From nobody Sun Feb 8 20:59:04 2026
From: Chen Ridong <chenridong@huaweicloud.com>
To: longman@redhat.com, tj@kernel.org, hannes@cmpxchg.org, mkoutny@suse.com
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
	lujialin4@huawei.com, chenridong@huaweicloud.com
Subject: [PATCH -next v2 3/6] cpuset: add cpuset1_init helper for v1 initialization
Date: Thu, 18 Dec 2025 09:31:38 +0000
Message-Id: <20251218093141.2687291-4-chenridong@huaweicloud.com>
In-Reply-To: <20251218093141.2687291-1-chenridong@huaweicloud.com>
References: <20251218093141.2687291-1-chenridong@huaweicloud.com>

From: Chen Ridong <chenridong@huaweicloud.com>

Introduce the cpuset1_init() helper in cpuset-v1.c to initialize
v1-specific fields, namely the fmeter and relax_domain_level members.
The relax_domain_level related code will be moved to cpuset-v1.c in a
subsequent patch. After this move, v1-specific members will only be
visible when CONFIG_CPUSETS_V1=y.

Signed-off-by: Chen Ridong <chenridong@huaweicloud.com>
Reviewed-by: Waiman Long <longman@redhat.com>
---
 kernel/cgroup/cpuset-internal.h | 10 ++++++----
 kernel/cgroup/cpuset-v1.c       |  7 ++++++-
 kernel/cgroup/cpuset.c          |  4 ++--
 3 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/kernel/cgroup/cpuset-internal.h b/kernel/cgroup/cpuset-internal.h
index 6c03cad02302..a32517da8231 100644
--- a/kernel/cgroup/cpuset-internal.h
+++ b/kernel/cgroup/cpuset-internal.h
@@ -144,8 +144,6 @@ struct cpuset {
 	 */
 	nodemask_t old_mems_allowed;
 
-	struct fmeter fmeter;		/* memory_pressure filter */
-
 	/*
 	 * Tasks are being attached to this cpuset.  Used to prevent
 	 * zeroing cpus/mems_allowed between ->can_attach() and ->attach().
@@ -181,6 +179,10 @@ struct cpuset {
 
 	/* Used to merge intersecting subsets for generate_sched_domains */
 	struct uf_node node;
+
+#ifdef CONFIG_CPUSETS_V1
+	struct fmeter fmeter;		/* memory_pressure filter */
+#endif
 };
 
 static inline struct cpuset *css_cs(struct cgroup_subsys_state *css)
@@ -285,7 +287,6 @@ void cpuset_full_unlock(void);
  */
 #ifdef CONFIG_CPUSETS_V1
 extern struct cftype cpuset1_files[];
-void fmeter_init(struct fmeter *fmp);
 void cpuset1_update_task_spread_flags(struct cpuset *cs,
 					struct task_struct *tsk);
 void cpuset1_update_tasks_flags(struct cpuset *cs);
@@ -293,9 +294,9 @@ void cpuset1_hotplug_update_tasks(struct cpuset *cs,
 			struct cpumask *new_cpus, nodemask_t *new_mems,
 			bool cpus_updated, bool mems_updated);
 int cpuset1_validate_change(struct cpuset *cur, struct cpuset *trial);
+void cpuset1_init(struct cpuset *cs);
 void cpuset1_online_css(struct cgroup_subsys_state *css);
 #else
-static inline void fmeter_init(struct fmeter *fmp) {}
 static inline void cpuset1_update_task_spread_flags(struct cpuset *cs,
 				struct task_struct *tsk) {}
 static inline void cpuset1_update_tasks_flags(struct cpuset *cs) {}
@@ -304,6 +305,7 @@ static inline void cpuset1_hotplug_update_tasks(struct cpuset *cs,
 			bool cpus_updated, bool mems_updated) {}
 static inline int cpuset1_validate_change(struct cpuset *cur,
 			struct cpuset *trial) { return 0; }
+static inline void cpuset1_init(struct cpuset *cs) {}
 static inline void cpuset1_online_css(struct cgroup_subsys_state *css) {}
 #endif /* CONFIG_CPUSETS_V1 */
 
diff --git a/kernel/cgroup/cpuset-v1.c b/kernel/cgroup/cpuset-v1.c
index c296ce47616a..84f00ab9c81f 100644
--- a/kernel/cgroup/cpuset-v1.c
+++ b/kernel/cgroup/cpuset-v1.c
@@ -62,7 +62,7 @@ struct cpuset_remove_tasks_struct {
 #define FM_SCALE 1000		/* faux fixed point scale */
 
 /* Initialize a frequency meter */
-void fmeter_init(struct fmeter *fmp)
+static void fmeter_init(struct fmeter *fmp)
 {
 	fmp->cnt = 0;
 	fmp->val = 0;
@@ -499,6 +499,11 @@ static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
 	return retval;
 }
 
+void cpuset1_init(struct cpuset *cs)
+{
+	fmeter_init(&cs->fmeter);
+}
+
 void cpuset1_online_css(struct cgroup_subsys_state *css)
 {
 	struct cpuset *tmp_cs;
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 5d9dbd1aeed3..8ef8b7511659 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -3602,7 +3602,7 @@ cpuset_css_alloc(struct cgroup_subsys_state *parent_css)
 		return ERR_PTR(-ENOMEM);
 
 	__set_bit(CS_SCHED_LOAD_BALANCE, &cs->flags);
-	fmeter_init(&cs->fmeter);
+	cpuset1_init(cs);
 	cs->relax_domain_level = -1;
 
 	/* Set CS_MEMORY_MIGRATE for default hierarchy */
@@ -3836,7 +3836,7 @@ int __init cpuset_init(void)
 	cpumask_setall(top_cpuset.exclusive_cpus);
 	nodes_setall(top_cpuset.effective_mems);
 
-	fmeter_init(&top_cpuset.fmeter);
+	cpuset1_init(&top_cpuset);
 
 	BUG_ON(!alloc_cpumask_var(&cpus_attach, GFP_KERNEL));
 
-- 
2.34.1
From nobody Sun Feb 8 20:59:04 2026
From: Chen Ridong <chenridong@huaweicloud.com>
To: longman@redhat.com, tj@kernel.org, hannes@cmpxchg.org, mkoutny@suse.com
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
	lujialin4@huawei.com, chenridong@huaweicloud.com
Subject: [PATCH -next v2 4/6] cpuset: move update_domain_attr_tree to cpuset-v1.c
Date: Thu, 18 Dec 2025 09:31:39 +0000
Message-Id: <20251218093141.2687291-5-chenridong@huaweicloud.com>
In-Reply-To: <20251218093141.2687291-1-chenridong@huaweicloud.com>
References: <20251218093141.2687291-1-chenridong@huaweicloud.com>

From: Chen Ridong <chenridong@huaweicloud.com>

Since relax_domain_level is only applicable to v1, move
update_domain_attr_tree(), which solely updates relax_domain_level, to
cpuset-v1.c. Additionally, relax_domain_level is now initialized in
cpuset1_init(), so its initialization in top_cpuset is removed. The
unnecessary remote_partition initialization in top_cpuset is also
cleaned up. As a result, relax_domain_level can be defined in struct
cpuset only when CONFIG_CPUSETS_V1=y.

Signed-off-by: Chen Ridong <chenridong@huaweicloud.com>
Reviewed-by: Waiman Long <longman@redhat.com>
---
 kernel/cgroup/cpuset-internal.h | 11 ++++++++---
 kernel/cgroup/cpuset-v1.c       | 28 ++++++++++++++++++++++++++++
 kernel/cgroup/cpuset.c          | 31 -------------------------------
 3 files changed, 36 insertions(+), 34 deletions(-)

diff --git a/kernel/cgroup/cpuset-internal.h b/kernel/cgroup/cpuset-internal.h
index a32517da8231..677053ffb913 100644
--- a/kernel/cgroup/cpuset-internal.h
+++ b/kernel/cgroup/cpuset-internal.h
@@ -150,9 +150,6 @@ struct cpuset {
 	 */
 	int attach_in_progress;
 
-	/* for custom sched domain */
-	int relax_domain_level;
-
 	/* partition root state */
 	int partition_root_state;
 
@@ -182,6 +179,9 @@ struct cpuset {
 
 #ifdef CONFIG_CPUSETS_V1
 	struct fmeter fmeter;		/* memory_pressure filter */
+
+	/* for custom sched domain */
+	int relax_domain_level;
 #endif
 };
 
@@ -296,6 +296,8 @@ void cpuset1_hotplug_update_tasks(struct cpuset *cs,
 int cpuset1_validate_change(struct cpuset *cur, struct cpuset *trial);
 void cpuset1_init(struct cpuset *cs);
 void cpuset1_online_css(struct cgroup_subsys_state *css);
+void update_domain_attr_tree(struct sched_domain_attr *dattr,
+		struct cpuset *root_cs);
 #else
 static inline void cpuset1_update_task_spread_flags(struct cpuset *cs,
 				struct task_struct *tsk) {}
@@ -307,6 +309,9 @@ static inline int cpuset1_validate_change(struct cpuset *cur,
 				struct cpuset *trial) { return 0; }
 static inline void cpuset1_init(struct cpuset *cs) {}
 static inline void cpuset1_online_css(struct cgroup_subsys_state *css) {}
+static inline void update_domain_attr_tree(struct sched_domain_attr *dattr,
+		struct cpuset *root_cs) {}
+
 #endif /* CONFIG_CPUSETS_V1 */
 
 #endif /* __CPUSET_INTERNAL_H */
diff --git a/kernel/cgroup/cpuset-v1.c b/kernel/cgroup/cpuset-v1.c
index 84f00ab9c81f..a4f8f1c3cfaa 100644
--- a/kernel/cgroup/cpuset-v1.c
+++ b/kernel/cgroup/cpuset-v1.c
@@ -502,6 +502,7 @@ static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
 void cpuset1_init(struct cpuset *cs)
 {
 	fmeter_init(&cs->fmeter);
+	cs->relax_domain_level = -1;
 }
 
 void cpuset1_online_css(struct cgroup_subsys_state *css)
@@ -552,6 +553,33 @@ void cpuset1_online_css(struct cgroup_subsys_state *css)
 	cpuset_callback_unlock_irq();
 }
 
+static void
+update_domain_attr(struct sched_domain_attr *dattr, struct cpuset *c)
+{
+	if (dattr->relax_domain_level < c->relax_domain_level)
+		dattr->relax_domain_level = c->relax_domain_level;
+}
+
+void update_domain_attr_tree(struct sched_domain_attr *dattr,
+				struct cpuset *root_cs)
+{
+	struct cpuset *cp;
+	struct cgroup_subsys_state *pos_css;
+
+	rcu_read_lock();
+	cpuset_for_each_descendant_pre(cp, pos_css, root_cs) {
+		/* skip the whole subtree if @cp doesn't have any CPU */
+		if (cpumask_empty(cp->cpus_allowed)) {
+			pos_css = css_rightmost_descendant(pos_css);
+			continue;
+		}
+
+		if (is_sched_load_balance(cp))
+			update_domain_attr(dattr, cp);
+	}
+	rcu_read_unlock();
+}
+
 /*
  * for the common functions, 'private' gives the type of file
  */
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 8ef8b7511659..cf2363a9c552 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -215,8 +215,6 @@ static struct cpuset top_cpuset = {
 	.flags = BIT(CS_CPU_EXCLUSIVE) | BIT(CS_MEM_EXCLUSIVE) |
 		 BIT(CS_SCHED_LOAD_BALANCE),
 	.partition_root_state = PRS_ROOT,
-	.relax_domain_level = -1,
-	.remote_partition = false,
 };
 
 /*
@@ -755,34 +753,6 @@ static int cpusets_overlap(struct cpuset *a, struct cpuset *b)
 	return cpumask_intersects(a->effective_cpus, b->effective_cpus);
 }
 
-static void
-update_domain_attr(struct sched_domain_attr *dattr, struct cpuset *c)
-{
-	if (dattr->relax_domain_level < c->relax_domain_level)
-		dattr->relax_domain_level = c->relax_domain_level;
-	return;
-}
-
-static void update_domain_attr_tree(struct sched_domain_attr *dattr,
-				    struct cpuset *root_cs)
-{
-	struct cpuset *cp;
-	struct cgroup_subsys_state *pos_css;
-
-	rcu_read_lock();
-	cpuset_for_each_descendant_pre(cp, pos_css, root_cs) {
-		/* skip the whole subtree if @cp doesn't have any CPU */
-		if (cpumask_empty(cp->cpus_allowed)) {
-			pos_css = css_rightmost_descendant(pos_css);
-			continue;
-		}
-
-		if (is_sched_load_balance(cp))
-			update_domain_attr(dattr, cp);
-	}
-	rcu_read_unlock();
-}
-
 /* Must be called with cpuset_mutex held. */
 static inline int nr_cpusets(void)
 {
@@ -3603,7 +3573,6 @@ cpuset_css_alloc(struct cgroup_subsys_state *parent_css)
 
 	__set_bit(CS_SCHED_LOAD_BALANCE, &cs->flags);
 	cpuset1_init(cs);
-	cs->relax_domain_level = -1;
 
 	/* Set CS_MEMORY_MIGRATE for default hierarchy */
 	if (cpuset_v2())
-- 
2.34.1
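An illustrative aside, not part of the posted patches: update_domain_attr_tree()
simply keeps the largest relax_domain_level found among load-balanced cpusets in
the subtree (a value of -1 means "use the default"). A toy userspace model of
that reduction; struct node and walk() are made-up, and the real code also
skips subtrees without CPUs.

#include <stdio.h>

struct node {
	int relax_domain_level;		/* -1 means "use the default" */
	int load_balanced;
	int nr_children;
	struct node *children[4];
};

/* Keep the largest requested level seen in the subtree. */
static void walk(const struct node *n, int *level)
{
	int i;

	if (n->load_balanced && n->relax_domain_level > *level)
		*level = n->relax_domain_level;
	for (i = 0; i < n->nr_children; i++)
		walk(n->children[i], level);
}

int main(void)
{
	struct node child = { .relax_domain_level = 3, .load_balanced = 1 };
	struct node root = { .relax_domain_level = -1, .load_balanced = 1,
			     .nr_children = 1, .children = { &child } };
	int level = -1;

	walk(&root, &level);
	printf("effective relax_domain_level: %d\n", level);	/* prints 3 */
	return 0;
}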
From nobody Sun Feb 8 20:59:04 2026
From: Chen Ridong <chenridong@huaweicloud.com>
To: longman@redhat.com, tj@kernel.org, hannes@cmpxchg.org, mkoutny@suse.com
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
	lujialin4@huawei.com, chenridong@huaweicloud.com
Subject: [PATCH -next v2 5/6] cpuset: separate generate_sched_domains for v1 and v2
Date: Thu, 18 Dec 2025 09:31:40 +0000
Message-Id: <20251218093141.2687291-6-chenridong@huaweicloud.com>
In-Reply-To: <20251218093141.2687291-1-chenridong@huaweicloud.com>
References: <20251218093141.2687291-1-chenridong@huaweicloud.com>

From: Chen Ridong <chenridong@huaweicloud.com>

The generate_sched_domains() function currently handles both v1 and v2
logic. However, the underlying mechanisms for building scheduler
domains differ significantly between the two versions. For cpuset v2,
scheduler domains are straightforwardly derived from valid partitions,
whereas cpuset v1 employs a more complex union-find algorithm to merge
overlapping cpusets. Co-locating these implementations complicates
maintenance.

This patch, along with subsequent ones, aims to separate the v1 and v2
logic. For ease of review, this patch first copies the
generate_sched_domains() function into cpuset-v1.c as
cpuset1_generate_sched_domains() and removes v2-specific code. Common
helpers and top_cpuset are declared in cpuset-internal.h. When
operating in v1 mode, the code now calls
cpuset1_generate_sched_domains().

Currently there is some code duplication, which will be largely
eliminated once v1-specific code is removed from v2 in the following
patch.

Signed-off-by: Chen Ridong <chenridong@huaweicloud.com>
Reviewed-by: Waiman Long <longman@redhat.com>
---
 kernel/cgroup/cpuset-internal.h |  23 +++++
 kernel/cgroup/cpuset-v1.c       | 158 ++++++++++++++++++++++++++++++++
 kernel/cgroup/cpuset.c          |  31 +------
 3 files changed, 185 insertions(+), 27 deletions(-)

diff --git a/kernel/cgroup/cpuset-internal.h b/kernel/cgroup/cpuset-internal.h
index 677053ffb913..8622a4666170 100644
--- a/kernel/cgroup/cpuset-internal.h
+++ b/kernel/cgroup/cpuset-internal.h
@@ -9,6 +9,7 @@
 #include
 #include
 #include
+#include
 
 /* See "Frequency meter" comments, below. */
 
@@ -185,6 +186,8 @@ struct cpuset {
 #endif
 };
 
+extern struct cpuset top_cpuset;
+
 static inline struct cpuset *css_cs(struct cgroup_subsys_state *css)
 {
 	return css ? container_of(css, struct cpuset, css) : NULL;
@@ -242,6 +245,21 @@ static inline int is_spread_slab(const struct cpuset *cs)
 	return test_bit(CS_SPREAD_SLAB, &cs->flags);
 }
 
+/*
+ * Helper routine for generate_sched_domains().
+ * Do cpusets a, b have overlapping effective cpus_allowed masks?
+ */
+static inline int cpusets_overlap(struct cpuset *a, struct cpuset *b)
+{
+	return cpumask_intersects(a->effective_cpus, b->effective_cpus);
+}
+
+static inline int nr_cpusets(void)
+{
+	/* jump label reference count + the top-level cpuset */
+	return static_key_count(&cpusets_enabled_key.key) + 1;
+}
+
 /**
  * cpuset_for_each_child - traverse online children of a cpuset
  * @child_cs: loop cursor pointing to the current child
@@ -298,6 +316,9 @@ void cpuset1_init(struct cpuset *cs);
 void cpuset1_online_css(struct cgroup_subsys_state *css);
 void update_domain_attr_tree(struct sched_domain_attr *dattr,
 		struct cpuset *root_cs);
+int cpuset1_generate_sched_domains(cpumask_var_t **domains,
+			struct sched_domain_attr **attributes);
+
 #else
 static inline void cpuset1_update_task_spread_flags(struct cpuset *cs,
 				struct task_struct *tsk) {}
@@ -311,6 +332,8 @@ static inline void cpuset1_init(struct cpuset *cs) {}
 static inline void cpuset1_online_css(struct cgroup_subsys_state *css) {}
 static inline void update_domain_attr_tree(struct sched_domain_attr *dattr,
 				struct cpuset *root_cs) {}
+static inline int cpuset1_generate_sched_domains(cpumask_var_t **domains,
+			struct sched_domain_attr **attributes) { return 0; };
 
 #endif /* CONFIG_CPUSETS_V1 */
 
diff --git a/kernel/cgroup/cpuset-v1.c b/kernel/cgroup/cpuset-v1.c
index a4f8f1c3cfaa..ffa7a8dc6c3a 100644
--- a/kernel/cgroup/cpuset-v1.c
+++ b/kernel/cgroup/cpuset-v1.c
@@ -580,6 +580,164 @@ void update_domain_attr_tree(struct sched_domain_attr *dattr,
 	rcu_read_unlock();
 }
 
+/*
+ * cpuset1_generate_sched_domains()
+ *
+ * Finding the best partition (set of domains):
+ *	The double nested loops below over i, j scan over the load
+ *	balanced cpusets (using the array of cpuset pointers in csa[])
+ *	looking for pairs of cpusets that have overlapping cpus_allowed
+ *	and merging them using a union-find algorithm.
+ *
+ *	The union of the cpus_allowed masks from the set of all cpusets
+ *	having the same root then form the one element of the partition
+ *	(one sched domain) to be passed to partition_sched_domains().
+ */
+int cpuset1_generate_sched_domains(cpumask_var_t **domains,
+			struct sched_domain_attr **attributes)
+{
+	struct cpuset *cp;	/* top-down scan of cpusets */
+	struct cpuset **csa;	/* array of all cpuset ptrs */
+	int csn;		/* how many cpuset ptrs in csa so far */
+	int i, j;		/* indices for partition finding loops */
+	cpumask_var_t *doms;	/* resulting partition; i.e. sched domains */
+	struct sched_domain_attr *dattr;  /* attributes for custom domains */
+	int ndoms = 0;		/* number of sched domains in result */
+	int nslot;		/* next empty doms[] struct cpumask slot */
+	struct cgroup_subsys_state *pos_css;
+	bool root_load_balance = is_sched_load_balance(&top_cpuset);
+	int nslot_update;
+
+	lockdep_assert_cpuset_lock_held();
+
+	doms = NULL;
+	dattr = NULL;
+	csa = NULL;
+
+	/* Special case for the 99% of systems with one, full, sched domain */
+	if (root_load_balance) {
+		ndoms = 1;
+		doms = alloc_sched_domains(ndoms);
+		if (!doms)
+			goto done;
+
+		dattr = kmalloc(sizeof(struct sched_domain_attr), GFP_KERNEL);
+		if (dattr) {
+			*dattr = SD_ATTR_INIT;
+			update_domain_attr_tree(dattr, &top_cpuset);
+		}
+		cpumask_and(doms[0], top_cpuset.effective_cpus,
+			    housekeeping_cpumask(HK_TYPE_DOMAIN));
+
+		goto done;
+	}
+
+	csa = kmalloc_array(nr_cpusets(), sizeof(cp), GFP_KERNEL);
+	if (!csa)
+		goto done;
+	csn = 0;
+
+	rcu_read_lock();
+	if (root_load_balance)
+		csa[csn++] = &top_cpuset;
+	cpuset_for_each_descendant_pre(cp, pos_css, &top_cpuset) {
+		if (cp == &top_cpuset)
+			continue;
+
+		/*
+		 * Continue traversing beyond @cp iff @cp has some CPUs and
+		 * isn't load balancing.  The former is obvious.  The
+		 * latter: All child cpusets contain a subset of the
+		 * parent's cpus, so just skip them, and then we call
+		 * update_domain_attr_tree() to calc relax_domain_level of
+		 * the corresponding sched domain.
+		 */
+		if (!cpumask_empty(cp->cpus_allowed) &&
+		    !(is_sched_load_balance(cp) &&
+		      cpumask_intersects(cp->cpus_allowed,
+					 housekeeping_cpumask(HK_TYPE_DOMAIN))))
+			continue;
+
+		if (is_sched_load_balance(cp) &&
+		    !cpumask_empty(cp->effective_cpus))
+			csa[csn++] = cp;
+
+		/* skip @cp's subtree */
+		pos_css = css_rightmost_descendant(pos_css);
+		continue;
+	}
+	rcu_read_unlock();
+
+	for (i = 0; i < csn; i++)
+		uf_node_init(&csa[i]->node);
+
+	/* Merge overlapping cpusets */
+	for (i = 0; i < csn; i++) {
+		for (j = i + 1; j < csn; j++) {
+			if (cpusets_overlap(csa[i], csa[j]))
+				uf_union(&csa[i]->node, &csa[j]->node);
+		}
+	}
+
+	/* Count the total number of domains */
+	for (i = 0; i < csn; i++) {
+		if (uf_find(&csa[i]->node) == &csa[i]->node)
+			ndoms++;
+	}
+
+	/*
+	 * Now we know how many domains to create.
+	 * Convert <csn, csa> to <ndoms, doms> and populate cpu masks.
+	 */
+	doms = alloc_sched_domains(ndoms);
+	if (!doms)
+		goto done;
+
+	/*
+	 * The rest of the code, including the scheduler, can deal with
+	 * dattr==NULL case. No need to abort if alloc fails.
+	 */
+	dattr = kmalloc_array(ndoms, sizeof(struct sched_domain_attr),
+			      GFP_KERNEL);
+
+	for (nslot = 0, i = 0; i < csn; i++) {
+		nslot_update = 0;
+		for (j = i; j < csn; j++) {
+			if (uf_find(&csa[j]->node) == &csa[i]->node) {
+				struct cpumask *dp = doms[nslot];
+
+				if (i == j) {
+					nslot_update = 1;
+					cpumask_clear(dp);
+					if (dattr)
+						*(dattr + nslot) = SD_ATTR_INIT;
+				}
+				cpumask_or(dp, dp, csa[j]->effective_cpus);
+				cpumask_and(dp, dp, housekeeping_cpumask(HK_TYPE_DOMAIN));
+				if (dattr)
+					update_domain_attr_tree(dattr + nslot, csa[j]);
+			}
+		}
+		if (nslot_update)
+			nslot++;
+	}
+	BUG_ON(nslot != ndoms);
+
+done:
+	kfree(csa);
+
+	/*
+	 * Fallback to the default domain if kmalloc() failed.
+	 * See comments in partition_sched_domains().
+	 */
+	if (doms == NULL)
+		ndoms = 1;
+
+	*domains    = doms;
+	*attributes = dattr;
+	return ndoms;
+}
+
 /*
  * for the common functions, 'private' gives the type of file
  */
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index cf2363a9c552..33c929b191e8 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -211,7 +211,7 @@ static inline void notify_partition_change(struct cpuset *cs, int old_prs)
 * If cpu_online_mask is used while a hotunplug operation is happening in
 * parallel, we may leave an offline CPU in cpu_allowed or some other masks.
 */
-static struct cpuset top_cpuset = {
+struct cpuset top_cpuset = {
 	.flags = BIT(CS_CPU_EXCLUSIVE) | BIT(CS_MEM_EXCLUSIVE) |
 		 BIT(CS_SCHED_LOAD_BALANCE),
 	.partition_root_state = PRS_ROOT,
@@ -744,21 +744,6 @@ static int validate_change(struct cpuset *cur, struct cpuset *trial)
 }
 
 #ifdef CONFIG_SMP
-/*
- * Helper routine for generate_sched_domains().
- * Do cpusets a, b have overlapping effective cpus_allowed masks?
- */
-static int cpusets_overlap(struct cpuset *a, struct cpuset *b)
-{
-	return cpumask_intersects(a->effective_cpus, b->effective_cpus);
-}
-
-/* Must be called with cpuset_mutex held. */
-static inline int nr_cpusets(void)
-{
-	/* jump label reference count + the top-level cpuset */
-	return static_key_count(&cpusets_enabled_key.key) + 1;
-}
 
 /*
  * generate_sched_domains()
@@ -798,17 +783,6 @@ static inline int nr_cpusets(void)
  * convenient format, that can be easily compared to the prior
  * value to determine what partition elements (sched domains)
  * were changed (added or removed.)
- *
- * Finding the best partition (set of domains):
- *	The double nested loops below over i, j scan over the load
- *	balanced cpusets (using the array of cpuset pointers in csa[])
- *	looking for pairs of cpusets that have overlapping cpus_allowed
- *	and merging them using a union-find algorithm.
- *
- *	The union of the cpus_allowed masks from the set of all cpusets
- *	having the same root then form the one element of the partition
- *	(one sched domain) to be passed to partition_sched_domains().
- *
 */
 static int generate_sched_domains(cpumask_var_t **domains,
 			struct sched_domain_attr **attributes)
@@ -826,6 +800,9 @@ static int generate_sched_domains(cpumask_var_t **domains,
 	bool cgrpv2 = cpuset_v2();
 	int nslot_update;
 
+	if (!cgrpv2)
+		return cpuset1_generate_sched_domains(domains, attributes);
+
 	doms = NULL;
 	dattr = NULL;
 	csa = NULL;
-- 
2.34.1
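An illustrative aside, not part of the posted patches: the v1 path copied above
keeps the union-find merge the comment describes, where cpusets whose CPU masks
overlap end up in one set and each resulting set becomes a sched domain. A
standalone sketch of that merge step over plain bitmasks; parent[], find() and
unite() are illustrative and not the kernel's uf_node/uf_union/uf_find API.

#include <stdio.h>

#define N 4

static int parent[N];

static int find(int x)
{
	while (parent[x] != x)
		x = parent[x] = parent[parent[x]];	/* path halving */
	return x;
}

static void unite(int a, int b)
{
	parent[find(a)] = find(b);
}

int main(void)
{
	/* Toy "cpus_allowed" masks: sets 0/1 overlap, sets 2/3 overlap. */
	unsigned int cpus[N] = { 0x3, 0x6, 0x18, 0x30 };
	int i, j, ndoms = 0;

	for (i = 0; i < N; i++)
		parent[i] = i;

	/* Merge overlapping sets, as the nested i/j loops above do. */
	for (i = 0; i < N; i++)
		for (j = i + 1; j < N; j++)
			if (cpus[i] & cpus[j])
				unite(i, j);

	for (i = 0; i < N; i++)
		if (find(i) == i)
			ndoms++;	/* one sched domain per root */

	printf("merged into %d domain(s)\n", ndoms);	/* prints 2 */
	return 0;
}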
From nobody Sun Feb 8 20:59:04 2026
From: Chen Ridong <chenridong@huaweicloud.com>
To: longman@redhat.com, tj@kernel.org, hannes@cmpxchg.org, mkoutny@suse.com
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
	lujialin4@huawei.com, chenridong@huaweicloud.com
Subject: [PATCH -next v2 6/6] cpuset: remove v1-specific code from generate_sched_domains
Date: Thu, 18 Dec 2025 09:31:41 +0000
Message-Id: <20251218093141.2687291-7-chenridong@huaweicloud.com>
In-Reply-To: <20251218093141.2687291-1-chenridong@huaweicloud.com>
References: <20251218093141.2687291-1-chenridong@huaweicloud.com>

From: Chen Ridong <chenridong@huaweicloud.com>

Following the introduction of cpuset1_generate_sched_domains() for v1
in the previous patch, v1-specific logic can now be removed from the
generic generate_sched_domains(). This patch cleans up the v1-only
code and ensures uf_node is only visible when CONFIG_CPUSETS_V1=y.

Signed-off-by: Chen Ridong <chenridong@huaweicloud.com>
Reviewed-by: Waiman Long <longman@redhat.com>
---
 kernel/cgroup/cpuset-internal.h |  10 +--
 kernel/cgroup/cpuset-v1.c       |   2 +-
 kernel/cgroup/cpuset.c          | 145 ++++++--------------------------
 3 files changed, 28 insertions(+), 129 deletions(-)

diff --git a/kernel/cgroup/cpuset-internal.h b/kernel/cgroup/cpuset-internal.h
index 8622a4666170..e718a4f54360 100644
--- a/kernel/cgroup/cpuset-internal.h
+++ b/kernel/cgroup/cpuset-internal.h
@@ -175,14 +175,14 @@ struct cpuset {
 	/* Handle for cpuset.cpus.partition */
 	struct cgroup_file partition_file;
 
-	/* Used to merge intersecting subsets for generate_sched_domains */
-	struct uf_node node;
-
 #ifdef CONFIG_CPUSETS_V1
 	struct fmeter fmeter;		/* memory_pressure filter */
 
 	/* for custom sched domain */
 	int relax_domain_level;
+
+	/* Used to merge intersecting subsets for generate_sched_domains */
+	struct uf_node node;
 #endif
 };
 
@@ -314,8 +314,6 @@ void cpuset1_hotplug_update_tasks(struct cpuset *cs,
 int cpuset1_validate_change(struct cpuset *cur, struct cpuset *trial);
 void cpuset1_init(struct cpuset *cs);
 void cpuset1_online_css(struct cgroup_subsys_state *css);
-void update_domain_attr_tree(struct sched_domain_attr *dattr,
-		struct cpuset *root_cs);
 int cpuset1_generate_sched_domains(cpumask_var_t **domains,
 			struct sched_domain_attr **attributes);
 
@@ -330,8 +328,6 @@ static inline int cpuset1_validate_change(struct cpuset *cur,
 				struct cpuset *trial) { return 0; }
 static inline void cpuset1_init(struct cpuset *cs) {}
 static inline void cpuset1_online_css(struct cgroup_subsys_state *css) {}
-static inline void update_domain_attr_tree(struct sched_domain_attr *dattr,
-		struct cpuset *root_cs) {}
 static inline int cpuset1_generate_sched_domains(cpumask_var_t **domains,
 			struct sched_domain_attr **attributes) { return 0; };
 
diff --git a/kernel/cgroup/cpuset-v1.c b/kernel/cgroup/cpuset-v1.c
index ffa7a8dc6c3a..7303315fdba7 100644
--- a/kernel/cgroup/cpuset-v1.c
+++ b/kernel/cgroup/cpuset-v1.c
@@ -560,7 +560,7 @@ update_domain_attr(struct sched_domain_attr *dattr, struct cpuset *c)
 		dattr->relax_domain_level = c->relax_domain_level;
 }
 
-void update_domain_attr_tree(struct sched_domain_attr *dattr,
+static void update_domain_attr_tree(struct sched_domain_attr *dattr,
 				struct cpuset *root_cs)
 {
 	struct cpuset *cp;
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 33c929b191e8..221da921b4f9 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -789,18 +789,13 @@ static int generate_sched_domains(cpumask_var_t **domains,
 {
 	struct cpuset *cp;	/* top-down scan of cpusets */
 	struct cpuset **csa;	/* array of all cpuset ptrs */
-	int csn;		/* how many cpuset ptrs in csa so far */
 	int i, j;		/* indices for partition finding loops */
 	cpumask_var_t *doms;	/* resulting partition; i.e. sched domains */
 	struct sched_domain_attr *dattr;  /* attributes for custom domains */
 	int ndoms = 0;		/* number of sched domains in result */
-	int nslot;		/* next empty doms[] struct cpumask slot */
 	struct cgroup_subsys_state *pos_css;
-	bool root_load_balance = is_sched_load_balance(&top_cpuset);
-	bool cgrpv2 = cpuset_v2();
-	int nslot_update;
 
-	if (!cgrpv2)
+	if (!cpuset_v2())
 		return cpuset1_generate_sched_domains(domains, attributes);
 
 	doms = NULL;
@@ -808,70 +803,26 @@ static int generate_sched_domains(cpumask_var_t **domains,
 	csa = NULL;
 
 	/* Special case for the 99% of systems with one, full, sched domain */
-	if (root_load_balance && cpumask_empty(subpartitions_cpus)) {
-single_root_domain:
+	if (cpumask_empty(subpartitions_cpus)) {
 		ndoms = 1;
-		doms = alloc_sched_domains(ndoms);
-		if (!doms)
-			goto done;
-
-		dattr = kmalloc(sizeof(struct sched_domain_attr), GFP_KERNEL);
-		if (dattr) {
-			*dattr = SD_ATTR_INIT;
-			update_domain_attr_tree(dattr, &top_cpuset);
-		}
-		cpumask_and(doms[0], top_cpuset.effective_cpus,
-			    housekeeping_cpumask(HK_TYPE_DOMAIN));
-
-		goto done;
+		/* !csa will be checked and can be correctly handled */
+		goto generate_doms;
 	}
 
 	csa = kmalloc_array(nr_cpusets(), sizeof(cp), GFP_KERNEL);
 	if (!csa)
 		goto done;
-	csn = 0;
 
+	/* Find how many partitions and cache them to csa[] */
 	rcu_read_lock();
-	if (root_load_balance)
-		csa[csn++] = &top_cpuset;
 	cpuset_for_each_descendant_pre(cp, pos_css, &top_cpuset) {
-		if (cp == &top_cpuset)
-			continue;
-
-		if (cgrpv2)
-			goto v2;
-
-		/*
-		 * v1:
-		 * Continue traversing beyond @cp iff @cp has some CPUs and
-		 * isn't load balancing.  The former is obvious.  The
-		 * latter: All child cpusets contain a subset of the
-		 * parent's cpus, so just skip them, and then we call
-		 * update_domain_attr_tree() to calc relax_domain_level of
-		 * the corresponding sched domain.
-		 */
-		if (!cpumask_empty(cp->cpus_allowed) &&
-		    !(is_sched_load_balance(cp) &&
-		      cpumask_intersects(cp->cpus_allowed,
-					 housekeeping_cpumask(HK_TYPE_DOMAIN))))
-			continue;
-
-		if (is_sched_load_balance(cp) &&
-		    !cpumask_empty(cp->effective_cpus))
-			csa[csn++] = cp;
-
-		/* skip @cp's subtree */
-		pos_css = css_rightmost_descendant(pos_css);
-		continue;
-
-v2:
 		/*
 		 * Only valid partition roots that are not isolated and with
-		 * non-empty effective_cpus will be saved into csn[].
+		 * non-empty effective_cpus will be saved into csa[].
 		 */
 		if ((cp->partition_root_state == PRS_ROOT) &&
 		    !cpumask_empty(cp->effective_cpus))
-			csa[csn++] = cp;
+			csa[ndoms++] = cp;
 
 		/*
 		 * Skip @cp's subtree if not a partition root and has no
@@ -882,40 +833,18 @@ static int generate_sched_domains(cpumask_var_t **domains,
 	}
 	rcu_read_unlock();
 
-	/*
-	 * If there are only isolated partitions underneath the cgroup root,
-	 * we can optimize out unneeded sched domains scanning.
-	 */
-	if (root_load_balance && (csn == 1))
-		goto single_root_domain;
-
-	for (i = 0; i < csn; i++)
-		uf_node_init(&csa[i]->node);
-
-	/* Merge overlapping cpusets */
-	for (i = 0; i < csn; i++) {
-		for (j = i + 1; j < csn; j++) {
-			if (cpusets_overlap(csa[i], csa[j])) {
+	for (i = 0; i < ndoms; i++) {
+		for (j = i + 1; j < ndoms; j++) {
+			if (cpusets_overlap(csa[i], csa[j]))
 				/*
 				 * Cgroup v2 shouldn't pass down overlapping
 				 * partition root cpusets.
 				 */
-				WARN_ON_ONCE(cgrpv2);
-				uf_union(&csa[i]->node, &csa[j]->node);
-			}
+				WARN_ON_ONCE(1);
 		}
 	}
 
-	/* Count the total number of domains */
-	for (i = 0; i < csn; i++) {
-		if (uf_find(&csa[i]->node) == &csa[i]->node)
-			ndoms++;
-	}
-
-	/*
-	 * Now we know how many domains to create.
-	 * Convert <csn, csa> to <ndoms, doms> and populate cpu masks.
-	 */
+generate_doms:
 	doms = alloc_sched_domains(ndoms);
 	if (!doms)
 		goto done;
@@ -932,45 +861,19 @@ static int generate_sched_domains(cpumask_var_t **domains,
 	 * to SD_ATTR_INIT. Also non-isolating partition root CPUs are a
 	 * subset of HK_TYPE_DOMAIN housekeeping CPUs.
 	 */
-	if (cgrpv2) {
-		for (i = 0; i < ndoms; i++) {
-			/*
-			 * The top cpuset may contain some boot time isolated
-			 * CPUs that need to be excluded from the sched domain.
-			 */
-			if (csa[i] == &top_cpuset)
-				cpumask_and(doms[i], csa[i]->effective_cpus,
-					    housekeeping_cpumask(HK_TYPE_DOMAIN));
-			else
-				cpumask_copy(doms[i], csa[i]->effective_cpus);
-			if (dattr)
-				dattr[i] = SD_ATTR_INIT;
-		}
-		goto done;
-	}
-
-	for (nslot = 0, i = 0; i < csn; i++) {
-		nslot_update = 0;
-		for (j = i; j < csn; j++) {
-			if (uf_find(&csa[j]->node) == &csa[i]->node) {
-				struct cpumask *dp = doms[nslot];
-
-				if (i == j) {
-					nslot_update = 1;
-					cpumask_clear(dp);
-					if (dattr)
-						*(dattr + nslot) = SD_ATTR_INIT;
-				}
-				cpumask_or(dp, dp, csa[j]->effective_cpus);
-				cpumask_and(dp, dp, housekeeping_cpumask(HK_TYPE_DOMAIN));
-				if (dattr)
-					update_domain_attr_tree(dattr + nslot, csa[j]);
-			}
-		}
-		if (nslot_update)
-			nslot++;
+	for (i = 0; i < ndoms; i++) {
+		/*
+		 * The top cpuset may contain some boot time isolated
+		 * CPUs that need to be excluded from the sched domain.
+		 */
+		if (!csa || csa[i] == &top_cpuset)
+			cpumask_and(doms[i], top_cpuset.effective_cpus,
+				    housekeeping_cpumask(HK_TYPE_DOMAIN));
+		else
+			cpumask_copy(doms[i], csa[i]->effective_cpus);
+		if (dattr)
+			dattr[i] = SD_ATTR_INIT;
 	}
-	BUG_ON(nslot != ndoms);
 
 done:
 	kfree(csa);
-- 
2.34.1