From nobody Mon Feb 9 01:34:29 2026 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4441F33B6F9; Wed, 17 Dec 2025 09:04:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1765962293; cv=none; b=a6wqu9B6KlGz+D6kVpFgvlgeyVkd2TiNsFX6YOA5XRi1nnO86d4zkTptCy95VzBgP5/mXPBIQpN+32SgOIBEiue0NGWF6vivFVxjvh0HlgWuAjwI2AqMzNRtV2XucVbHyBTlOE0mG+mik419tNVNlREBg9gg6wDlrvUG0uYXg9c= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1765962293; c=relaxed/simple; bh=D5U2QiVikoYwaXTRQ0/n7A0CEj3uycwFr4Li2hQW4fs=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=oKyI6JBqv5+E1Yej/VyqRhfCnGqv2PoO0l7+j4sNwLVHQDRFPAL5fJlbP0jmnSXzgYGTY6kTKceLJedOfboIr6SD+P8/zs5tW3HyVokpKb1Lhz8IdnPPsrf1BU6QRiVNFQuZHWYrJOvZv6VLW39k7ehjxGog4/Au7SXSTa2YaaQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.177]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTPS id 4dWSXd2lXlzYQvFq; Wed, 17 Dec 2025 17:04:17 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 07B0F40592; Wed, 17 Dec 2025 17:04:44 +0800 (CST) Received: from hulk-vt.huawei.com (unknown [10.67.174.121]) by APP4 (Coremail) with SMTP id gCh0CgCHNvcackJp5AL6AQ--.18103S3; Wed, 17 Dec 2025 17:04:43 +0800 (CST) From: Chen Ridong To: 
longman@redhat.com, tj@kernel.org, hannes@cmpxchg.org, mkoutny@suse.com Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, lujialin4@huawei.com, chenridong@huaweicloud.com Subject: [PATCH -next 1/6] cpuset: add assert_cpuset_lock_held helper Date: Wed, 17 Dec 2025 08:49:37 +0000 Message-Id: <20251217084942.2666405-2-chenridong@huaweicloud.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20251217084942.2666405-1-chenridong@huaweicloud.com> References: <20251217084942.2666405-1-chenridong@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Chen Ridong Add 
assert_cpuset_lock_held() to allow other subsystems to verify that cpuset_mutex is held.

Suggested-by: Waiman Long
Signed-off-by: Chen Ridong
---
 include/linux/cpuset.h | 2 ++
 kernel/cgroup/cpuset.c | 5 +++++
 2 files changed, 7 insertions(+)

diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h
index a98d3330385c..af0e76d10476 100644
--- a/include/linux/cpuset.h
+++ b/include/linux/cpuset.h
@@ -74,6 +74,7 @@ extern void inc_dl_tasks_cs(struct task_struct *task);
 extern void dec_dl_tasks_cs(struct task_struct *task);
 extern void cpuset_lock(void);
 extern void cpuset_unlock(void);
+extern void assert_cpuset_lock_held(void);
 extern void cpuset_cpus_allowed_locked(struct task_struct *p, struct cpumask *mask);
 extern void cpuset_cpus_allowed(struct task_struct *p, struct cpumask *mask);
 extern bool cpuset_cpus_allowed_fallback(struct task_struct *p);
@@ -195,6 +196,7 @@ static inline void inc_dl_tasks_cs(struct task_struct *task) { }
 static inline void dec_dl_tasks_cs(struct task_struct *task) { }
 static inline void cpuset_lock(void) { }
 static inline void cpuset_unlock(void) { }
+static inline void assert_cpuset_lock_held(void) { }
 
 static inline void cpuset_cpus_allowed_locked(struct task_struct *p,
 					struct cpumask *mask)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index fea577b4016a..a5ad124ea1cf 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -271,6 +271,11 @@ void cpuset_unlock(void)
 	mutex_unlock(&cpuset_mutex);
 }
 
+void assert_cpuset_lock_held(void)
+{
+	lockdep_assert_held(&cpuset_mutex);
+}
+
 /**
  * cpuset_full_lock - Acquire full protection for cpuset modification
  *
-- 
2.34.1

From nobody Mon Feb 9 01:34:29 2026
Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4458333B946; Wed, 17 Dec 2025 09:04:51 +0000 (UTC) 
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1765962295; cv=none; b=bxyEfwMvRBdbvJG0XlHqHcytO7XdweCx5VSleuUebX9XM5X+oEtpAwZo6nXYQtVCMV1uB3heQS4BQR+bU8U280GKmUZnPkXC4QGXc/L2SJM8YDAwcBXZ05bo6CDus7NchEDmUmxvNddjqIFKhzKMutwBuXBCPJW9+FqsOMR/ZK8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1765962295; c=relaxed/simple; bh=VSjEwdY4vzNo1Y266cmO1BQbbnv5YMwl6/zbRswQrBU=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=Nzzbl9dSmJu6J31km7qgKNNZS8hDhgjPD0YrDISfRaknC6uapSdQDiPhe5f/t5HdbQgLy3qol0I+StVExngSgOyena9tG4HdeYgIKU1uECtTKBoUJxINBZTG9q4m/Xoo0pcyPDcQanbyVKuVJegeUwHYeSDcGFnbJRjqvQqoN8o= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=none smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.170]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTPS id 4dWSXd387BzYQvFr; Wed, 17 Dec 2025 17:04:17 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 15B9E4056F; Wed, 17 Dec 2025 17:04:44 +0800 (CST) Received: from hulk-vt.huawei.com (unknown [10.67.174.121]) by APP4 (Coremail) with SMTP id gCh0CgCHNvcackJp5AL6AQ--.18103S4; Wed, 17 Dec 2025 17:04:43 +0800 (CST) From: Chen Ridong To: longman@redhat.com, tj@kernel.org, hannes@cmpxchg.org, mkoutny@suse.com Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, lujialin4@huawei.com, chenridong@huaweicloud.com Subject: [PATCH -next 2/6] cpuset: add cpuset1_online_css helper for v1-specific operations Date: Wed, 17 Dec 2025 08:49:38 +0000 Message-Id: 
<20251217084942.2666405-3-chenridong@huaweicloud.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20251217084942.2666405-1-chenridong@huaweicloud.com> References: <20251217084942.2666405-1-chenridong@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Chen Ridong This commit introduces the cpuset1_online_css helper to centralize v1-specific handling during cpuset online. It performs operations such as updating the CS_SPREAD_PAGE, CS_SPREAD_SLAB, and CGRP_CPUSET_CLONE_CHILDREN flags, which are unique to the cpuset v1 control group interface. 
The helper is now placed in cpuset-v1.c to maintain clear separation between v1 and v2 logic.

Signed-off-by: Chen Ridong
---
 kernel/cgroup/cpuset-internal.h | 2 ++
 kernel/cgroup/cpuset-v1.c | 48 +++++++++++++++++++++++++++++++++
 kernel/cgroup/cpuset.c | 39 +--------------------------
 3 files changed, 51 insertions(+), 38 deletions(-)

diff --git a/kernel/cgroup/cpuset-internal.h b/kernel/cgroup/cpuset-internal.h
index 01976c8e7d49..6c03cad02302 100644
--- a/kernel/cgroup/cpuset-internal.h
+++ b/kernel/cgroup/cpuset-internal.h
@@ -293,6 +293,7 @@ void cpuset1_hotplug_update_tasks(struct cpuset *cs,
 			    struct cpumask *new_cpus, nodemask_t *new_mems,
 			    bool cpus_updated, bool mems_updated);
 int cpuset1_validate_change(struct cpuset *cur, struct cpuset *trial);
+void cpuset1_online_css(struct cgroup_subsys_state *css);
 #else
 static inline void fmeter_init(struct fmeter *fmp) {}
 static inline void cpuset1_update_task_spread_flags(struct cpuset *cs,
@@ -303,6 +304,7 @@ static inline void cpuset1_hotplug_update_tasks(struct cpuset *cs,
 			    bool cpus_updated, bool mems_updated) {}
 static inline int cpuset1_validate_change(struct cpuset *cur,
 				struct cpuset *trial) { return 0; }
+static inline void cpuset1_online_css(struct cgroup_subsys_state *css) {}
 #endif /* CONFIG_CPUSETS_V1 */
 
 #endif /* __CPUSET_INTERNAL_H */
diff --git a/kernel/cgroup/cpuset-v1.c b/kernel/cgroup/cpuset-v1.c
index 12e76774c75b..650028ee250b 100644
--- a/kernel/cgroup/cpuset-v1.c
+++ b/kernel/cgroup/cpuset-v1.c
@@ -499,6 +499,54 @@ static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
 	return retval;
 }
 
+void cpuset1_online_css(struct cgroup_subsys_state *css)
+{
+	struct cpuset *tmp_cs;
+	struct cgroup_subsys_state *pos_css;
+	struct cpuset *cs = css_cs(css);
+	struct cpuset *parent = parent_cs(cs);
+
+	lockdep_assert_cpus_held();
+	assert_cpuset_lock_held();
+
+	if (is_spread_page(parent))
+		set_bit(CS_SPREAD_PAGE, &cs->flags);
+	if (is_spread_slab(parent))
+		set_bit(CS_SPREAD_SLAB, &cs->flags);
+
+	if (!test_bit(CGRP_CPUSET_CLONE_CHILDREN, &css->cgroup->flags))
+		return;
+
+	/*
+	 * Clone @parent's configuration if CGRP_CPUSET_CLONE_CHILDREN is
+	 * set. This flag handling is implemented in cgroup core for
+	 * historical reasons - the flag may be specified during mount.
+	 *
+	 * Currently, if any sibling cpusets have exclusive cpus or mem, we
+	 * refuse to clone the configuration - thereby refusing the task to
+	 * be entered, and as a result refusing the sys_unshare() or
+	 * clone() which initiated it. If this becomes a problem for some
+	 * users who wish to allow that scenario, then this could be
+	 * changed to grant parent->cpus_allowed-sibling_cpus_exclusive
+	 * (and likewise for mems) to the new cgroup.
+	 */
+	rcu_read_lock();
+	cpuset_for_each_child(tmp_cs, pos_css, parent) {
+		if (is_mem_exclusive(tmp_cs) || is_cpu_exclusive(tmp_cs)) {
+			rcu_read_unlock();
+			return;
+		}
+	}
+	rcu_read_unlock();
+
+	cpuset_callback_lock_irq();
+	cs->mems_allowed = parent->mems_allowed;
+	cs->effective_mems = parent->mems_allowed;
+	cpumask_copy(cs->cpus_allowed, parent->cpus_allowed);
+	cpumask_copy(cs->effective_cpus, parent->cpus_allowed);
+	cpuset_callback_unlock_irq();
+}
+
 /*
  * for the common functions, 'private' gives the type of file
  */
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index a5ad124ea1cf..f74da3086120 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -3616,17 +3616,11 @@ static int cpuset_css_online(struct cgroup_subsys_state *css)
 {
 	struct cpuset *cs = css_cs(css);
 	struct cpuset *parent = parent_cs(cs);
-	struct cpuset *tmp_cs;
-	struct cgroup_subsys_state *pos_css;
 
 	if (!parent)
 		return 0;
 
 	cpuset_full_lock();
-	if (is_spread_page(parent))
-		set_bit(CS_SPREAD_PAGE, &cs->flags);
-	if (is_spread_slab(parent))
-		set_bit(CS_SPREAD_SLAB, &cs->flags);
 	/*
 	 * For v2, clear CS_SCHED_LOAD_BALANCE if parent is isolated
 	 */
@@ -3641,39 +3635,8 @@ static int cpuset_css_online(struct cgroup_subsys_state *css)
 		cs->effective_mems = parent->effective_mems;
 	}
 	spin_unlock_irq(&callback_lock);
+	cpuset1_online_css(css);
 
-	if (!test_bit(CGRP_CPUSET_CLONE_CHILDREN, &css->cgroup->flags))
-		goto out_unlock;
-
-	/*
-	 * Clone @parent's configuration if CGRP_CPUSET_CLONE_CHILDREN is
-	 * set. This flag handling is implemented in cgroup core for
-	 * historical reasons - the flag may be specified during mount.
-	 *
-	 * Currently, if any sibling cpusets have exclusive cpus or mem, we
-	 * refuse to clone the configuration - thereby refusing the task to
-	 * be entered, and as a result refusing the sys_unshare() or
-	 * clone() which initiated it. If this becomes a problem for some
-	 * users who wish to allow that scenario, then this could be
-	 * changed to grant parent->cpus_allowed-sibling_cpus_exclusive
-	 * (and likewise for mems) to the new cgroup.
-	 */
-	rcu_read_lock();
-	cpuset_for_each_child(tmp_cs, pos_css, parent) {
-		if (is_mem_exclusive(tmp_cs) || is_cpu_exclusive(tmp_cs)) {
-			rcu_read_unlock();
-			goto out_unlock;
-		}
-	}
-	rcu_read_unlock();
-
-	spin_lock_irq(&callback_lock);
-	cs->mems_allowed = parent->mems_allowed;
-	cs->effective_mems = parent->mems_allowed;
-	cpumask_copy(cs->cpus_allowed, parent->cpus_allowed);
-	cpumask_copy(cs->effective_cpus, parent->cpus_allowed);
-	spin_unlock_irq(&callback_lock);
-out_unlock:
 	cpuset_full_unlock();
 	return 0;
 }
-- 
2.34.1

From nobody Mon Feb 9 01:34:29 2026
Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6F06A33B94B; Wed, 17 Dec 2025 09:04:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1765962293; cv=none; 
b=f9s7aM/sRF6YyURpHFvAd4nTVY68SjKHFF4bIScZvrv91GLY2JkKN3Dog+ixGbVE6Uq2fumF+3PHUNPhpaWSY1UmEJYs/s+Pxhi7LZi3WkGDPpxUhahXClwDmMwCW5pv4HxZNXydDum+cSAhdx34qYzN7CYNukylzzd2L7nNFvs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1765962293; c=relaxed/simple; bh=4nWJI8tRYu5Iat/NIOHpqLp1IJCwosnDDk2mrwaZr6Q=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=CIsWzWusYI00Rn/LtllUyjMZsu2hscIkQC/D4N5anD0DaTs4DM0WMc+YvPvHiB59QTiWfQC6SCAG3RElR9uKLLft0I9gxWJljVrlSR7NhShAj2PdP1NtEjDyrCnpcEYmNiN3hWWJlQv4QwozxBpMUohWx2bKJ5i5bV4H3ANeAOo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.170]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTPS id 4dWSXd3c88zYQvFy; Wed, 17 Dec 2025 17:04:17 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 24EF240570; Wed, 17 Dec 2025 17:04:44 +0800 (CST) Received: from hulk-vt.huawei.com (unknown [10.67.174.121]) by APP4 (Coremail) with SMTP id gCh0CgCHNvcackJp5AL6AQ--.18103S5; Wed, 17 Dec 2025 17:04:44 +0800 (CST) From: Chen Ridong To: longman@redhat.com, tj@kernel.org, hannes@cmpxchg.org, mkoutny@suse.com Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, lujialin4@huawei.com, chenridong@huaweicloud.com Subject: [PATCH -next 3/6] cpuset: add cpuset1_init helper for v1 initialization Date: Wed, 17 Dec 2025 08:49:39 +0000 Message-Id: <20251217084942.2666405-4-chenridong@huaweicloud.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20251217084942.2666405-1-chenridong@huaweicloud.com> References: 
<20251217084942.2666405-1-chenridong@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Chen Ridong This patch introduces the cpuset1_init() helper in cpuset-v1.c to initialize v1-specific fields, including the fmeter and relax_domain_level members. The relax_domain_level-related code will be moved to cpuset-v1.c in a subsequent patch. After this move, v1-specific members will only be visible when CONFIG_CPUSETS_V1=y. 
Signed-off-by: Chen Ridong
---
 kernel/cgroup/cpuset-internal.h | 10 ++++++----
 kernel/cgroup/cpuset-v1.c | 7 ++++++-
 kernel/cgroup/cpuset.c | 4 ++--
 3 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/kernel/cgroup/cpuset-internal.h b/kernel/cgroup/cpuset-internal.h
index 6c03cad02302..a32517da8231 100644
--- a/kernel/cgroup/cpuset-internal.h
+++ b/kernel/cgroup/cpuset-internal.h
@@ -144,8 +144,6 @@ struct cpuset {
 	 */
 	nodemask_t old_mems_allowed;
 
-	struct fmeter fmeter;		/* memory_pressure filter */
-
 	/*
 	 * Tasks are being attached to this cpuset. Used to prevent
 	 * zeroing cpus/mems_allowed between ->can_attach() and ->attach().
@@ -181,6 +179,10 @@ struct cpuset {
 
 	/* Used to merge intersecting subsets for generate_sched_domains */
 	struct uf_node node;
+
+#ifdef CONFIG_CPUSETS_V1
+	struct fmeter fmeter;		/* memory_pressure filter */
+#endif
 };
 
 static inline struct cpuset *css_cs(struct cgroup_subsys_state *css)
@@ -285,7 +287,6 @@ void cpuset_full_unlock(void);
  */
 #ifdef CONFIG_CPUSETS_V1
 extern struct cftype cpuset1_files[];
-void fmeter_init(struct fmeter *fmp);
 void cpuset1_update_task_spread_flags(struct cpuset *cs,
 					struct task_struct *tsk);
 void cpuset1_update_tasks_flags(struct cpuset *cs);
@@ -293,9 +294,9 @@ void cpuset1_hotplug_update_tasks(struct cpuset *cs,
 			    struct cpumask *new_cpus, nodemask_t *new_mems,
 			    bool cpus_updated, bool mems_updated);
 int cpuset1_validate_change(struct cpuset *cur, struct cpuset *trial);
+void cpuset1_init(struct cpuset *cs);
 void cpuset1_online_css(struct cgroup_subsys_state *css);
 #else
-static inline void fmeter_init(struct fmeter *fmp) {}
 static inline void cpuset1_update_task_spread_flags(struct cpuset *cs,
 					struct task_struct *tsk) {}
 static inline void cpuset1_update_tasks_flags(struct cpuset *cs) {}
@@ -304,6 +305,7 @@ static inline void cpuset1_hotplug_update_tasks(struct cpuset *cs,
 			    bool cpus_updated, bool mems_updated) {}
 static inline int cpuset1_validate_change(struct cpuset *cur,
 				struct cpuset *trial) { return 0; }
+static inline void cpuset1_init(struct cpuset *cs) {}
 static inline void cpuset1_online_css(struct cgroup_subsys_state *css) {}
 #endif /* CONFIG_CPUSETS_V1 */
 
diff --git a/kernel/cgroup/cpuset-v1.c b/kernel/cgroup/cpuset-v1.c
index 650028ee250b..574df740f21a 100644
--- a/kernel/cgroup/cpuset-v1.c
+++ b/kernel/cgroup/cpuset-v1.c
@@ -62,7 +62,7 @@ struct cpuset_remove_tasks_struct {
 #define FM_SCALE 1000		/* faux fixed point scale */
 
 /* Initialize a frequency meter */
-void fmeter_init(struct fmeter *fmp)
+static void fmeter_init(struct fmeter *fmp)
 {
 	fmp->cnt = 0;
 	fmp->val = 0;
@@ -499,6 +499,11 @@ static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
 	return retval;
 }
 
+void cpuset1_init(struct cpuset *cs)
+{
+	fmeter_init(&cs->fmeter);
+}
+
 void cpuset1_online_css(struct cgroup_subsys_state *css)
 {
 	struct cpuset *tmp_cs;
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index f74da3086120..e836a1f2b951 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -3602,7 +3602,7 @@ cpuset_css_alloc(struct cgroup_subsys_state *parent_css)
 		return ERR_PTR(-ENOMEM);
 
 	__set_bit(CS_SCHED_LOAD_BALANCE, &cs->flags);
-	fmeter_init(&cs->fmeter);
+	cpuset1_init(cs);
 	cs->relax_domain_level = -1;
 
 	/* Set CS_MEMORY_MIGRATE for default hierarchy */
@@ -3836,7 +3836,7 @@ int __init cpuset_init(void)
 	cpumask_setall(top_cpuset.exclusive_cpus);
 	nodes_setall(top_cpuset.effective_mems);
 
-	fmeter_init(&top_cpuset.fmeter);
+	cpuset1_init(&top_cpuset);
 
 	BUG_ON(!alloc_cpumask_var(&cpus_attach, GFP_KERNEL));
 
-- 
2.34.1

From nobody Mon Feb 9 01:34:29 2026
Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 444A233B6FF; Wed, 17 Dec 2025 09:04:51 +0000 (UTC) Authentication-Results: 
smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1765962294; cv=none; b=H5bYbQddEeSzB+bi+FijgPyOBe9pT7UwSFagvb2BKqM32BazhqzwnPr4gsMFsc/7naWXXKt9pUfsh6CJl9Vw0f3s0zYNtdeK1ZkR+UAOa1Up2/18p4t7WdaoKTg07TMJoizNa5Cu6a8ScO8VmIUE+pXLl6tciCgK7IILNrp16lE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1765962294; c=relaxed/simple; bh=SxPe91eYHqVgGHDHcHMkJAiv90gJTjLwsD3VTLlQP9I=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=aLFGsaBhH5Nw+2sG7nEjhBOhJ77Wam8FeZfoxs1I6zzxDkT56syo0TX/iAW8/kr/dnHs/yXZP1NFiplv5FpFRM3CzDAR+ZT8gVxP2PJdTZFRZqdT/pZTD9SUhzMmLCwX/qoQAS65m7xQccBzFkfGF04y/CiRKmF3KpDXmb0f4Ks= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.198]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTPS id 4dWSXd40yTzYQvG4; Wed, 17 Dec 2025 17:04:17 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 321FC40578; Wed, 17 Dec 2025 17:04:44 +0800 (CST) Received: from hulk-vt.huawei.com (unknown [10.67.174.121]) by APP4 (Coremail) with SMTP id gCh0CgCHNvcackJp5AL6AQ--.18103S6; Wed, 17 Dec 2025 17:04:44 +0800 (CST) From: Chen Ridong To: longman@redhat.com, tj@kernel.org, hannes@cmpxchg.org, mkoutny@suse.com Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, lujialin4@huawei.com, chenridong@huaweicloud.com Subject: [PATCH -next 4/6] cpuset: move update_domain_attr_tree to cpuset_v1.c Date: Wed, 17 Dec 2025 08:49:40 +0000 Message-Id: 
<20251217084942.2666405-5-chenridong@huaweicloud.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20251217084942.2666405-1-chenridong@huaweicloud.com> References: <20251217084942.2666405-1-chenridong@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Chen Ridong Since relax_domain_level is only applicable to v1, move update_domain_attr_tree(), which solely updates relax_domain_level, to cpuset-v1.c. Additionally, relax_domain_level is now initialized in cpuset1_init(). Accordingly, the initialization of relax_domain_level in top_cpuset is removed. 
The unnecessary remote_partition initialization in top_cpuset is also cleaned up. As a result, relax_domain_level can be defined in cpuset only when CONFIG_CPUSETS_V1=y.

Signed-off-by: Chen Ridong
---
 kernel/cgroup/cpuset-internal.h | 11 ++++++++---
 kernel/cgroup/cpuset-v1.c | 28 ++++++++++++++++++++++++++++
 kernel/cgroup/cpuset.c | 31 -------------------------------
 3 files changed, 36 insertions(+), 34 deletions(-)

diff --git a/kernel/cgroup/cpuset-internal.h b/kernel/cgroup/cpuset-internal.h
index a32517da8231..677053ffb913 100644
--- a/kernel/cgroup/cpuset-internal.h
+++ b/kernel/cgroup/cpuset-internal.h
@@ -150,9 +150,6 @@ struct cpuset {
 	 */
 	int attach_in_progress;
 
-	/* for custom sched domain */
-	int relax_domain_level;
-
 	/* partition root state */
 	int partition_root_state;
 
@@ -182,6 +179,9 @@ struct cpuset {
 
 #ifdef CONFIG_CPUSETS_V1
 	struct fmeter fmeter;		/* memory_pressure filter */
+
+	/* for custom sched domain */
+	int relax_domain_level;
 #endif
 };
 
@@ -296,6 +296,8 @@ void cpuset1_hotplug_update_tasks(struct cpuset *cs,
 int cpuset1_validate_change(struct cpuset *cur, struct cpuset *trial);
 void cpuset1_init(struct cpuset *cs);
 void cpuset1_online_css(struct cgroup_subsys_state *css);
+void update_domain_attr_tree(struct sched_domain_attr *dattr,
+				    struct cpuset *root_cs);
 #else
 static inline void cpuset1_update_task_spread_flags(struct cpuset *cs,
 					struct task_struct *tsk) {}
@@ -307,6 +309,9 @@ static inline int cpuset1_validate_change(struct cpuset *cur,
 				struct cpuset *trial) { return 0; }
 static inline void cpuset1_init(struct cpuset *cs) {}
 static inline void cpuset1_online_css(struct cgroup_subsys_state *css) {}
+static inline void update_domain_attr_tree(struct sched_domain_attr *dattr,
+				    struct cpuset *root_cs) {}
+
 #endif /* CONFIG_CPUSETS_V1 */
 
 #endif /* __CPUSET_INTERNAL_H */
diff --git a/kernel/cgroup/cpuset-v1.c b/kernel/cgroup/cpuset-v1.c
index 574df740f21a..95de6f2a4cc5 100644
--- a/kernel/cgroup/cpuset-v1.c
+++ b/kernel/cgroup/cpuset-v1.c
@@ -502,6 +502,7 @@ static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
 void cpuset1_init(struct cpuset *cs)
 {
 	fmeter_init(&cs->fmeter);
+	cs->relax_domain_level = -1;
 }
 
 void cpuset1_online_css(struct cgroup_subsys_state *css)
@@ -552,6 +553,33 @@ void cpuset1_online_css(struct cgroup_subsys_state *css)
 	cpuset_callback_unlock_irq();
 }
 
+static void
+update_domain_attr(struct sched_domain_attr *dattr, struct cpuset *c)
+{
+	if (dattr->relax_domain_level < c->relax_domain_level)
+		dattr->relax_domain_level = c->relax_domain_level;
+}
+
+void update_domain_attr_tree(struct sched_domain_attr *dattr,
+				    struct cpuset *root_cs)
+{
+	struct cpuset *cp;
+	struct cgroup_subsys_state *pos_css;
+
+	rcu_read_lock();
+	cpuset_for_each_descendant_pre(cp, pos_css, root_cs) {
+		/* skip the whole subtree if @cp doesn't have any CPU */
+		if (cpumask_empty(cp->cpus_allowed)) {
+			pos_css = css_rightmost_descendant(pos_css);
+			continue;
+		}
+
+		if (is_sched_load_balance(cp))
+			update_domain_attr(dattr, cp);
+	}
+	rcu_read_unlock();
+}
+
 /*
  * for the common functions, 'private' gives the type of file
  */
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index e836a1f2b951..88ca8b40e01a 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -215,8 +215,6 @@ static struct cpuset top_cpuset = {
 	.flags = BIT(CS_CPU_EXCLUSIVE) |
 		 BIT(CS_MEM_EXCLUSIVE) | BIT(CS_SCHED_LOAD_BALANCE),
 	.partition_root_state = PRS_ROOT,
-	.relax_domain_level = -1,
-	.remote_partition = false,
 };
 
 /*
@@ -755,34 +753,6 @@ static int cpusets_overlap(struct cpuset *a, struct cpuset *b)
 	return cpumask_intersects(a->effective_cpus, b->effective_cpus);
 }
 
-static void
-update_domain_attr(struct sched_domain_attr *dattr, struct cpuset *c)
-{
-	if (dattr->relax_domain_level < c->relax_domain_level)
-		dattr->relax_domain_level = c->relax_domain_level;
-	return;
-}
-
-static void update_domain_attr_tree(struct sched_domain_attr *dattr,
-				    struct cpuset *root_cs)
-{
-	struct cpuset *cp;
-	struct cgroup_subsys_state *pos_css;
-
-	rcu_read_lock();
-	cpuset_for_each_descendant_pre(cp, pos_css, root_cs) {
-		/* skip the whole subtree if @cp doesn't have any CPU */
-		if (cpumask_empty(cp->cpus_allowed)) {
-			pos_css = css_rightmost_descendant(pos_css);
-			continue;
-		}
-
-		if (is_sched_load_balance(cp))
-			update_domain_attr(dattr, cp);
-	}
-	rcu_read_unlock();
-}
-
 /* Must be called with cpuset_mutex held. */
 static inline int nr_cpusets(void)
 {
@@ -3603,7 +3573,6 @@ cpuset_css_alloc(struct cgroup_subsys_state *parent_css)
 
 	__set_bit(CS_SCHED_LOAD_BALANCE, &cs->flags);
 	cpuset1_init(cs);
-	cs->relax_domain_level = -1;
 
 	/* Set CS_MEMORY_MIGRATE for default hierarchy */
 	if (cpuset_v2())
-- 
2.34.1

From nobody Mon Feb 9 01:34:29 2026
Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4466D33B949; Wed, 17 Dec 2025 09:04:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1765962294; cv=none; b=hPla37vQBLJp2onJnYFwLyHFT6UP9SGo2PIKHSC46aAt0xQSHOOcGzej7SKI7QSNnqh9dQE5DGr0dEHllPVj7FZu1avkU8VN4A9+BLFktP96J61O8z2ck2YuVkiZoW5ujjcmBuIKXqfn5d7MLUxKJ7f0WCLPW+IjGqyIIlDXM84= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1765962294; c=relaxed/simple; bh=bSy05OdX2XA09uzGJdrJmbqhrvjjfLdrgOnMh8BUDsY=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=JBteGlR1cgMelmF6kuHDpUUSRjNvMccNQ93EzLwAOEojMxcGtEhR+8Rh8HY5XQxs/oP/UxbxZU+VuZxcnXrjYy1ssgZC1aVANAxof5SEdgykz0zVMiD33wukK4A3clzTPkObMMAx/L01aUN6H5N6mVcpjKoIO/3oDdDDs5GS3ao= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) 
header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.198]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTPS id 4dWSXd4Y1HzYQvG4; Wed, 17 Dec 2025 17:04:17 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 422D140579; Wed, 17 Dec 2025 17:04:44 +0800 (CST) Received: from hulk-vt.huawei.com (unknown [10.67.174.121]) by APP4 (Coremail) with SMTP id gCh0CgCHNvcackJp5AL6AQ--.18103S7; Wed, 17 Dec 2025 17:04:44 +0800 (CST) From: Chen Ridong To: longman@redhat.com, tj@kernel.org, hannes@cmpxchg.org, mkoutny@suse.com Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, lujialin4@huawei.com, chenridong@huaweicloud.com Subject: [PATCH -next 5/6] cpuset: separate generate_sched_domains for v1 and v2 Date: Wed, 17 Dec 2025 08:49:41 +0000 Message-Id: <20251217084942.2666405-6-chenridong@huaweicloud.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20251217084942.2666405-1-chenridong@huaweicloud.com> References: <20251217084942.2666405-1-chenridong@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgCHNvcackJp5AL6AQ--.18103S7 X-Coremail-Antispam: 1UD129KBjvJXoWfJr4rGw15Kr1UCry7XF1kZrb_yoWDKw1fpF W8u3y2yrWUtr1xC3yrCa18Z34S9wn7JayUt3W5G3s5AF17tF1kuFy0vF9Ikry5urWDCrWU ZFsIq3y3u3WqyrJanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUBE14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 
z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwCY1x0262kKe7AKxVWUAV WUtwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E14v2 6r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_Jw0_GFylIxkGc2 Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI42IY6xIIjxv20xvEc7CjxVAFwI0_ Gr0_Cr1lIxAIcVCF04k26cxKx2IYs7xG6r1j6r1xMIIF0xvEx4A2jsIE14v26r1j6r4UMI IF0xvEx4A2jsIEc7CjxVAFwI0_Gr0_Gr1UYxBIdaVFxhVjvjDU0xZFpf9x0JUvYLPUUUUU = X-CM-SenderInfo: hfkh02xlgr0w46kxt4xhlfz01xgou0bp/ Content-Type: text/plain; charset="utf-8" From: Chen Ridong The generate_sched_domains() function currently handles both v1 and v2 logic. However, the underlying mechanisms for building scheduler domains differ significantly between the two versions. For cpuset v2, scheduler domains are straightforwardly derived from valid partitions, whereas cpuset v1 employs a more complex union-find algorithm to merge overlapping cpusets. Co-locating these implementations complicates maintenance. This patch, along with subsequent ones, aims to separate the v1 and v2 logic. For ease of review, this patch first copies the generate_sched_domains() function into cpuset-v1.c as cpuset1_generate_sched_domains() and removes v2-specific code. Common helpers and top_cpuset are declared in cpuset-internal.h. When operating in v1 mode, the code now calls cpuset1_generate_sched_domains(). Currently there is some code duplication, which will be largely eliminated once v1-specific code is removed from v2 in the following patch. 
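For reviewers less familiar with the v1 path, the merge step described above can be sketched in plain userspace C. This is only an illustration under stated assumptions: CPU masks are modelled as 64-bit words, and struct toy_cpuset, toy_find(), toy_union() and toy_generate_domains() are hypothetical names, not the kernel's uf_node helpers or cpumask API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SETS 64

/* Hypothetical stand-in for a cpuset: only its effective CPU mask. */
struct toy_cpuset {
	uint64_t effective_cpus;	/* bit n set => CPU n is usable */
};

/* Find the representative of x, halving the path as we go. */
static size_t toy_find(size_t *parent, size_t x)
{
	while (parent[x] != x) {
		parent[x] = parent[parent[x]];
		x = parent[x];
	}
	return x;
}

/* Merge the classes containing a and b. */
static void toy_union(size_t *parent, size_t a, size_t b)
{
	size_t ra = toy_find(parent, a);
	size_t rb = toy_find(parent, b);

	if (ra != rb)
		parent[rb] = ra;
}

/*
 * Merge overlapping sets and write one CPU mask per resulting domain
 * into doms[] (which must have room for csn entries).
 * Returns the number of domains.
 */
static size_t toy_generate_domains(const struct toy_cpuset *csa, size_t csn,
				   uint64_t *doms)
{
	size_t parent[MAX_SETS];
	size_t slot[MAX_SETS];
	size_t i, j, ndoms = 0;

	assert(csn <= MAX_SETS);

	for (i = 0; i < csn; i++)
		parent[i] = i;

	/* Pairwise scan: any two sets sharing a CPU end up in one class. */
	for (i = 0; i < csn; i++)
		for (j = i + 1; j < csn; j++)
			if (csa[i].effective_cpus & csa[j].effective_cpus)
				toy_union(parent, i, j);

	/* Each class representative gets one domain slot. */
	for (i = 0; i < csn; i++)
		if (toy_find(parent, i) == i) {
			slot[i] = ndoms;
			doms[ndoms++] = 0;
		}

	/* OR every member's mask into the slot of its representative. */
	for (i = 0; i < csn; i++)
		doms[slot[toy_find(parent, i)]] |= csa[i].effective_cpus;

	return ndoms;
}
```

Overlap is transitive through the union step: sets A and C that share no CPU still land in one domain if both overlap B. The v2 path needs none of this, since valid partition roots are expected to be disjoint already.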
Signed-off-by: Chen Ridong --- kernel/cgroup/cpuset-internal.h | 24 +++++ kernel/cgroup/cpuset-v1.c | 167 ++++++++++++++++++++++++++++++++ kernel/cgroup/cpuset.c | 31 +----- 3 files changed, 195 insertions(+), 27 deletions(-) diff --git a/kernel/cgroup/cpuset-internal.h b/kernel/cgroup/cpuset-interna= l.h index 677053ffb913..bd767f8cb0ed 100644 --- a/kernel/cgroup/cpuset-internal.h +++ b/kernel/cgroup/cpuset-internal.h @@ -9,6 +9,7 @@ #include #include #include +#include =20 /* See "Frequency meter" comments, below. */ =20 @@ -185,6 +186,8 @@ struct cpuset { #endif }; =20 +extern struct cpuset top_cpuset; + static inline struct cpuset *css_cs(struct cgroup_subsys_state *css) { return css ? container_of(css, struct cpuset, css) : NULL; @@ -242,6 +245,22 @@ static inline int is_spread_slab(const struct cpuset *= cs) return test_bit(CS_SPREAD_SLAB, &cs->flags); } =20 +/* + * Helper routine for generate_sched_domains(). + * Do cpusets a, b have overlapping effective cpus_allowed masks? + */ +static inline int cpusets_overlap(struct cpuset *a, struct cpuset *b) +{ + return cpumask_intersects(a->effective_cpus, b->effective_cpus); +} + +static inline int nr_cpusets(void) +{ + assert_cpuset_lock_held(); + /* jump label reference count + the top-level cpuset */ + return static_key_count(&cpusets_enabled_key.key) + 1; +} + /** * cpuset_for_each_child - traverse online children of a cpuset * @child_cs: loop cursor pointing to the current child @@ -298,6 +317,9 @@ void cpuset1_init(struct cpuset *cs); void cpuset1_online_css(struct cgroup_subsys_state *css); void update_domain_attr_tree(struct sched_domain_attr *dattr, struct cpuset *root_cs); +int cpuset1_generate_sched_domains(cpumask_var_t **domains, + struct sched_domain_attr **attributes); + #else static inline void cpuset1_update_task_spread_flags(struct cpuset *cs, struct task_struct *tsk) {} @@ -311,6 +333,8 @@ static inline void cpuset1_init(struct cpuset *cs) {} static inline void cpuset1_online_css(struct 
cgroup_subsys_state *css) {} static inline void update_domain_attr_tree(struct sched_domain_attr *dattr, struct cpuset *root_cs) {} +static inline int cpuset1_generate_sched_domains(cpumask_var_t **domains, + struct sched_domain_attr **attributes) { return 0; }; =20 #endif /* CONFIG_CPUSETS_V1 */ =20 diff --git a/kernel/cgroup/cpuset-v1.c b/kernel/cgroup/cpuset-v1.c index 95de6f2a4cc5..5c0bded46a7c 100644 --- a/kernel/cgroup/cpuset-v1.c +++ b/kernel/cgroup/cpuset-v1.c @@ -580,6 +580,173 @@ void update_domain_attr_tree(struct sched_domain_attr= *dattr, rcu_read_unlock(); } =20 +/* + * cpuset1_generate_sched_domains() + * + * Finding the best partition (set of domains): + * The double nested loops below over i, j scan over the load + * balanced cpusets (using the array of cpuset pointers in csa[]) + * looking for pairs of cpusets that have overlapping cpus_allowed + * and merging them using a union-find algorithm. + * + * The union of the cpus_allowed masks from the set of all cpusets + * having the same root then form the one element of the partition + * (one sched domain) to be passed to partition_sched_domains(). + */ +int cpuset1_generate_sched_domains(cpumask_var_t **domains, + struct sched_domain_attr **attributes) +{ + struct cpuset *cp; /* top-down scan of cpusets */ + struct cpuset **csa; /* array of all cpuset ptrs */ + int csn; /* how many cpuset ptrs in csa so far */ + int i, j; /* indices for partition finding loops */ + cpumask_var_t *doms; /* resulting partition; i.e. 
sched domains */ + struct sched_domain_attr *dattr; /* attributes for custom domains */ + int ndoms =3D 0; /* number of sched domains in result */ + int nslot; /* next empty doms[] struct cpumask slot */ + struct cgroup_subsys_state *pos_css; + bool root_load_balance =3D is_sched_load_balance(&top_cpuset); + int nslot_update; + + assert_cpuset_lock_held(); + + doms =3D NULL; + dattr =3D NULL; + csa =3D NULL; + + /* Special case for the 99% of systems with one, full, sched domain */ + if (root_load_balance) { +single_root_domain: + ndoms =3D 1; + doms =3D alloc_sched_domains(ndoms); + if (!doms) + goto done; + + dattr =3D kmalloc(sizeof(struct sched_domain_attr), GFP_KERNEL); + if (dattr) { + *dattr =3D SD_ATTR_INIT; + update_domain_attr_tree(dattr, &top_cpuset); + } + cpumask_and(doms[0], top_cpuset.effective_cpus, + housekeeping_cpumask(HK_TYPE_DOMAIN)); + + goto done; + } + + csa =3D kmalloc_array(nr_cpusets(), sizeof(cp), GFP_KERNEL); + if (!csa) + goto done; + csn =3D 0; + + rcu_read_lock(); + if (root_load_balance) + csa[csn++] =3D &top_cpuset; + cpuset_for_each_descendant_pre(cp, pos_css, &top_cpuset) { + if (cp =3D=3D &top_cpuset) + continue; + + /* + * v1: + * Continue traversing beyond @cp iff @cp has some CPUs and + * isn't load balancing. The former is obvious. The + * latter: All child cpusets contain a subset of the + * parent's cpus, so just skip them, and then we call + * update_domain_attr_tree() to calc relax_domain_level of + * the corresponding sched domain. 
+ */ + if (!cpumask_empty(cp->cpus_allowed) && + !(is_sched_load_balance(cp) && + cpumask_intersects(cp->cpus_allowed, + housekeeping_cpumask(HK_TYPE_DOMAIN)))) + continue; + + if (is_sched_load_balance(cp) && + !cpumask_empty(cp->effective_cpus)) + csa[csn++] =3D cp; + + /* skip @cp's subtree */ + pos_css =3D css_rightmost_descendant(pos_css); + continue; + } + rcu_read_unlock(); + + /* + * If there are only isolated partitions underneath the cgroup root, + * we can optimize out unneeded sched domains scanning. + */ + if (root_load_balance && (csn =3D=3D 1)) + goto single_root_domain; + + for (i =3D 0; i < csn; i++) + uf_node_init(&csa[i]->node); + + /* Merge overlapping cpusets */ + for (i =3D 0; i < csn; i++) { + for (j =3D i + 1; j < csn; j++) { + if (cpusets_overlap(csa[i], csa[j])) + uf_union(&csa[i]->node, &csa[j]->node); + } + } + + /* Count the total number of domains */ + for (i =3D 0; i < csn; i++) { + if (uf_find(&csa[i]->node) =3D=3D &csa[i]->node) + ndoms++; + } + + /* + * Now we know how many domains to create. + * Convert to and populate cpu masks. + */ + doms =3D alloc_sched_domains(ndoms); + if (!doms) + goto done; + + /* + * The rest of the code, including the scheduler, can deal with + * dattr=3D=3DNULL case. No need to abort if alloc fails. 
+ */ + dattr =3D kmalloc_array(ndoms, sizeof(struct sched_domain_attr), + GFP_KERNEL); + + for (nslot =3D 0, i =3D 0; i < csn; i++) { + nslot_update =3D 0; + for (j =3D i; j < csn; j++) { + if (uf_find(&csa[j]->node) =3D=3D &csa[i]->node) { + struct cpumask *dp =3D doms[nslot]; + + if (i =3D=3D j) { + nslot_update =3D 1; + cpumask_clear(dp); + if (dattr) + *(dattr + nslot) =3D SD_ATTR_INIT; + } + cpumask_or(dp, dp, csa[j]->effective_cpus); + cpumask_and(dp, dp, housekeeping_cpumask(HK_TYPE_DOMAIN)); + if (dattr) + update_domain_attr_tree(dattr + nslot, csa[j]); + } + } + if (nslot_update) + nslot++; + } + BUG_ON(nslot !=3D ndoms); + +done: + kfree(csa); + + /* + * Fallback to the default domain if kmalloc() failed. + * See comments in partition_sched_domains(). + */ + if (doms =3D=3D NULL) + ndoms =3D 1; + + *domains =3D doms; + *attributes =3D dattr; + return ndoms; +} + /* * for the common functions, 'private' gives the type of file */ diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c index 88ca8b40e01a..6bb0b201c34b 100644 --- a/kernel/cgroup/cpuset.c +++ b/kernel/cgroup/cpuset.c @@ -211,7 +211,7 @@ static inline void notify_partition_change(struct cpuse= t *cs, int old_prs) * If cpu_online_mask is used while a hotunplug operation is happening in * parallel, we may leave an offline CPU in cpu_allowed or some other mask= s. */ -static struct cpuset top_cpuset =3D { +struct cpuset top_cpuset =3D { .flags =3D BIT(CS_CPU_EXCLUSIVE) | BIT(CS_MEM_EXCLUSIVE) | BIT(CS_SCHED_LOAD_BALANCE), .partition_root_state =3D PRS_ROOT, @@ -744,21 +744,6 @@ static int validate_change(struct cpuset *cur, struct = cpuset *trial) } =20 #ifdef CONFIG_SMP -/* - * Helper routine for generate_sched_domains(). - * Do cpusets a, b have overlapping effective cpus_allowed masks? - */ -static int cpusets_overlap(struct cpuset *a, struct cpuset *b) -{ - return cpumask_intersects(a->effective_cpus, b->effective_cpus); -} - -/* Must be called with cpuset_mutex held. 
*/ -static inline int nr_cpusets(void) -{ - /* jump label reference count + the top-level cpuset */ - return static_key_count(&cpusets_enabled_key.key) + 1; -} =20 /* * generate_sched_domains() @@ -798,17 +783,6 @@ static inline int nr_cpusets(void) * convenient format, that can be easily compared to the prior * value to determine what partition elements (sched domains) * were changed (added or removed.) - * - * Finding the best partition (set of domains): - * The double nested loops below over i, j scan over the load - * balanced cpusets (using the array of cpuset pointers in csa[]) - * looking for pairs of cpusets that have overlapping cpus_allowed - * and merging them using a union-find algorithm. - * - * The union of the cpus_allowed masks from the set of all cpusets - * having the same root then form the one element of the partition - * (one sched domain) to be passed to partition_sched_domains(). - * */ static int generate_sched_domains(cpumask_var_t **domains, struct sched_domain_attr **attributes) @@ -826,6 +800,9 @@ static int generate_sched_domains(cpumask_var_t **domai= ns, bool cgrpv2 =3D cpuset_v2(); int nslot_update; =20 + if (!cgrpv2) + return cpuset1_generate_sched_domains(domains, attributes); + doms =3D NULL; dattr =3D NULL; csa =3D NULL; --=20 2.34.1 From nobody Mon Feb 9 01:34:29 2026 Received: from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com [45.249.212.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 611BE33A6EF; Wed, 17 Dec 2025 09:04:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1765962289; cv=none; b=I/42Q2kn5aFKE1ilzojUi10hK/MeKWGD34DLM32EE0n4UMEmLTSINYLqQUgbiKjXTxzoayz1BteWUC+/C5bD8pdCcZbC9gvDggnZOn6eKo/iO56bVY4J61sg2h06pehFJb8gJLobIr3mY4NL8k6L6RI9fLT9qLL3ILNRW4Ta01k= ARC-Message-Signature: 
i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1765962289; c=relaxed/simple; bh=+w/xJHXWB9DLZi4vc8Pzg5TWD1celiYHG4/GhAWj+zA=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=I/j6VwX9zK54BQwq9yFUCOdVO9rgTe7/odJZsSWP3Ptw6H8ZQO7D9xyTgwNjvVgpzrr5RdrCvtMymf1DHPeC0xDFzzXVRQf3Nea0ROA65EMwh7gBQoYPzgjQWoIlmqKp/zo+ctJRclfCJ0I2IwuW8DOp775dlGiI8ah2nHpUnQw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=none smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.198]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTPS id 4dWSXz3KH2zKHMmG; Wed, 17 Dec 2025 17:04:35 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 5502A40574; Wed, 17 Dec 2025 17:04:44 +0800 (CST) Received: from hulk-vt.huawei.com (unknown [10.67.174.121]) by APP4 (Coremail) with SMTP id gCh0CgCHNvcackJp5AL6AQ--.18103S8; Wed, 17 Dec 2025 17:04:44 +0800 (CST) From: Chen Ridong To: longman@redhat.com, tj@kernel.org, hannes@cmpxchg.org, mkoutny@suse.com Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, lujialin4@huawei.com, chenridong@huaweicloud.com Subject: [PATCH -next 6/6] cpuset: remove v1-specific code from generate_sched_domains Date: Wed, 17 Dec 2025 08:49:42 +0000 Message-Id: <20251217084942.2666405-7-chenridong@huaweicloud.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20251217084942.2666405-1-chenridong@huaweicloud.com> References: <20251217084942.2666405-1-chenridong@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable 
X-CM-TRANSID: gCh0CgCHNvcackJp5AL6AQ--.18103S8 X-Coremail-Antispam: 1UD129KBjvJXoW3GrWrWw18WrWfCryfAw1UWrg_yoWfJw18pF W8Cay2qrW5tw1UG39YkwsrZ34S9wsrGayUK3W5Wwn5ZF17J3Wv9Fy0v3ZxCFWY9FyDCr13 ZFZIgr47W3WqkFJanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUPj14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwCY1x0262kKe7AKxVWUAV WUtwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E14v2 6r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_Jw0_GFylIxkGc2 Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI42IY6xIIjxv20xvEc7CjxVAFwI0_ Cr0_Gr1UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCwCI42IY6I8E87Iv67AKxVW8JVWxJw CI42IY6I8E87Iv6xkF7I0E14v26r4j6r4UJbIYCTnIWIevJa73UjIFyTuYvjfUYcTQUUUU U X-CM-SenderInfo: hfkh02xlgr0w46kxt4xhlfz01xgou0bp/ Content-Type: text/plain; charset="utf-8" From: Chen Ridong Following the introduction of cpuset1_generate_sched_domains() for v1 in the previous patch, v1-specific logic can now be removed from the generic generate_sched_domains(). This patch cleans up the v1-only code and ensures uf_node is only visible when CONFIG_CPUSETS_V1=3Dy. 
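As a rough picture of what remains in generate_sched_domains() after this cleanup, the v2-only flow can be sketched in userspace C. This is a simplified model, not the kernel code: masks are 64-bit words, struct toy_root and toy_v2_domains() are hypothetical names, an empty root list stands in for the "no subpartitions" shortcut, and an overlap counter stands in for the WARN_ON_ONCE().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a valid, non-isolated partition root. */
struct toy_root {
	uint64_t effective_cpus;	/* bit n set => CPU n belongs to it */
	int is_top;			/* nonzero for the top-level cpuset */
};

/*
 * Build one domain per partition root. With no roots collected (the
 * "no subpartitions" shortcut), fall back to a single domain made of
 * the top cpuset's CPUs restricted to housekeeping CPUs.
 * Returns the number of domains written to doms[].
 */
static size_t toy_v2_domains(const struct toy_root *roots, size_t nroots,
			     uint64_t top_cpus, uint64_t housekeeping,
			     uint64_t *doms, unsigned int *overlaps)
{
	size_t i, j;

	assert(doms != NULL && overlaps != NULL);

	*overlaps = 0;
	if (nroots == 0) {
		doms[0] = top_cpus & housekeeping;
		return 1;
	}

	/* Partition roots should be disjoint; only count violations. */
	for (i = 0; i < nroots; i++)
		for (j = i + 1; j < nroots; j++)
			if (roots[i].effective_cpus & roots[j].effective_cpus)
				(*overlaps)++;

	for (i = 0; i < nroots; i++) {
		/* Only the top cpuset may still hold boot-time isolated CPUs. */
		if (roots[i].is_top)
			doms[i] = roots[i].effective_cpus & housekeeping;
		else
			doms[i] = roots[i].effective_cpus;
	}
	return nroots;
}
```

The point of the sketch is that each partition root maps directly to one domain, so no merge step and no uf_node are needed on this path.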
Signed-off-by: Chen Ridong --- kernel/cgroup/cpuset-internal.h | 10 +-- kernel/cgroup/cpuset-v1.c | 2 +- kernel/cgroup/cpuset.c | 144 +++++--------------------------- 3 files changed, 27 insertions(+), 129 deletions(-) diff --git a/kernel/cgroup/cpuset-internal.h b/kernel/cgroup/cpuset-interna= l.h index bd767f8cb0ed..ef7b7c5afd4c 100644 --- a/kernel/cgroup/cpuset-internal.h +++ b/kernel/cgroup/cpuset-internal.h @@ -175,14 +175,14 @@ struct cpuset { /* Handle for cpuset.cpus.partition */ struct cgroup_file partition_file; =20 - /* Used to merge intersecting subsets for generate_sched_domains */ - struct uf_node node; - #ifdef CONFIG_CPUSETS_V1 struct fmeter fmeter; /* memory_pressure filter */ =20 /* for custom sched domain */ int relax_domain_level; + + /* Used to merge intersecting subsets for generate_sched_domains */ + struct uf_node node; #endif }; =20 @@ -315,8 +315,6 @@ void cpuset1_hotplug_update_tasks(struct cpuset *cs, int cpuset1_validate_change(struct cpuset *cur, struct cpuset *trial); void cpuset1_init(struct cpuset *cs); void cpuset1_online_css(struct cgroup_subsys_state *css); -void update_domain_attr_tree(struct sched_domain_attr *dattr, - struct cpuset *root_cs); int cpuset1_generate_sched_domains(cpumask_var_t **domains, struct sched_domain_attr **attributes); =20 @@ -331,8 +329,6 @@ static inline int cpuset1_validate_change(struct cpuset= *cur, struct cpuset *trial) { return 0; } static inline void cpuset1_init(struct cpuset *cs) {} static inline void cpuset1_online_css(struct cgroup_subsys_state *css) {} -static inline void update_domain_attr_tree(struct sched_domain_attr *dattr, - struct cpuset *root_cs) {} static inline int cpuset1_generate_sched_domains(cpumask_var_t **domains, struct sched_domain_attr **attributes) { return 0; }; =20 diff --git a/kernel/cgroup/cpuset-v1.c b/kernel/cgroup/cpuset-v1.c index 5c0bded46a7c..0226350e704f 100644 --- a/kernel/cgroup/cpuset-v1.c +++ b/kernel/cgroup/cpuset-v1.c @@ -560,7 +560,7 @@ 
update_domain_attr(struct sched_domain_attr *dattr, str= uct cpuset *c) dattr->relax_domain_level =3D c->relax_domain_level; } =20 -void update_domain_attr_tree(struct sched_domain_attr *dattr, +static void update_domain_attr_tree(struct sched_domain_attr *dattr, struct cpuset *root_cs) { struct cpuset *cp; diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c index 6bb0b201c34b..3e3468d928f3 100644 --- a/kernel/cgroup/cpuset.c +++ b/kernel/cgroup/cpuset.c @@ -789,18 +789,13 @@ static int generate_sched_domains(cpumask_var_t **dom= ains, { struct cpuset *cp; /* top-down scan of cpusets */ struct cpuset **csa; /* array of all cpuset ptrs */ - int csn; /* how many cpuset ptrs in csa so far */ int i, j; /* indices for partition finding loops */ cpumask_var_t *doms; /* resulting partition; i.e. sched domains */ struct sched_domain_attr *dattr; /* attributes for custom domains */ int ndoms =3D 0; /* number of sched domains in result */ - int nslot; /* next empty doms[] struct cpumask slot */ struct cgroup_subsys_state *pos_css; - bool root_load_balance =3D is_sched_load_balance(&top_cpuset); - bool cgrpv2 =3D cpuset_v2(); - int nslot_update; =20 - if (!cgrpv2) + if (!cpuset_v2()) return cpuset1_generate_sched_domains(domains, attributes); =20 doms =3D NULL; @@ -808,70 +803,25 @@ static int generate_sched_domains(cpumask_var_t **dom= ains, csa =3D NULL; =20 /* Special case for the 99% of systems with one, full, sched domain */ - if (root_load_balance && cpumask_empty(subpartitions_cpus)) { -single_root_domain: + if (cpumask_empty(subpartitions_cpus)) { ndoms =3D 1; - doms =3D alloc_sched_domains(ndoms); - if (!doms) - goto done; - - dattr =3D kmalloc(sizeof(struct sched_domain_attr), GFP_KERNEL); - if (dattr) { - *dattr =3D SD_ATTR_INIT; - update_domain_attr_tree(dattr, &top_cpuset); - } - cpumask_and(doms[0], top_cpuset.effective_cpus, - housekeeping_cpumask(HK_TYPE_DOMAIN)); - - goto done; + goto generate_doms; } =20 csa =3D kmalloc_array(nr_cpusets(), 
sizeof(cp), GFP_KERNEL); if (!csa) goto done; - csn =3D 0; =20 + /* Find how many partitions and cache them to csa[] */ rcu_read_lock(); - if (root_load_balance) - csa[csn++] =3D &top_cpuset; cpuset_for_each_descendant_pre(cp, pos_css, &top_cpuset) { - if (cp =3D=3D &top_cpuset) - continue; - - if (cgrpv2) - goto v2; - - /* - * v1: - * Continue traversing beyond @cp iff @cp has some CPUs and - * isn't load balancing. The former is obvious. The - * latter: All child cpusets contain a subset of the - * parent's cpus, so just skip them, and then we call - * update_domain_attr_tree() to calc relax_domain_level of - * the corresponding sched domain. - */ - if (!cpumask_empty(cp->cpus_allowed) && - !(is_sched_load_balance(cp) && - cpumask_intersects(cp->cpus_allowed, - housekeeping_cpumask(HK_TYPE_DOMAIN)))) - continue; - - if (is_sched_load_balance(cp) && - !cpumask_empty(cp->effective_cpus)) - csa[csn++] =3D cp; - - /* skip @cp's subtree */ - pos_css =3D css_rightmost_descendant(pos_css); - continue; - -v2: /* * Only valid partition roots that are not isolated and with - * non-empty effective_cpus will be saved into csn[]. + * non-empty effective_cpus will be saved into csa[]. */ if ((cp->partition_root_state =3D=3D PRS_ROOT) && !cpumask_empty(cp->effective_cpus)) - csa[csn++] =3D cp; + csa[ndoms++] =3D cp; =20 /* * Skip @cp's subtree if not a partition root and has no @@ -882,40 +832,18 @@ static int generate_sched_domains(cpumask_var_t **dom= ains, } rcu_read_unlock(); =20 - /* - * If there are only isolated partitions underneath the cgroup root, - * we can optimize out unneeded sched domains scanning. 
- */ - if (root_load_balance && (csn =3D=3D 1)) - goto single_root_domain; - - for (i =3D 0; i < csn; i++) - uf_node_init(&csa[i]->node); - - /* Merge overlapping cpusets */ - for (i =3D 0; i < csn; i++) { - for (j =3D i + 1; j < csn; j++) { - if (cpusets_overlap(csa[i], csa[j])) { + for (i =3D 0; i < ndoms; i++) { + for (j =3D i + 1; j < ndoms; j++) { + if (cpusets_overlap(csa[i], csa[j])) /* * Cgroup v2 shouldn't pass down overlapping * partition root cpusets. */ - WARN_ON_ONCE(cgrpv2); - uf_union(&csa[i]->node, &csa[j]->node); - } + WARN_ON_ONCE(1); } } =20 - /* Count the total number of domains */ - for (i =3D 0; i < csn; i++) { - if (uf_find(&csa[i]->node) =3D=3D &csa[i]->node) - ndoms++; - } - - /* - * Now we know how many domains to create. - * Convert to and populate cpu masks. - */ +generate_doms: doms =3D alloc_sched_domains(ndoms); if (!doms) goto done; @@ -932,45 +860,19 @@ static int generate_sched_domains(cpumask_var_t **dom= ains, * to SD_ATTR_INIT. Also non-isolating partition root CPUs are a * subset of HK_TYPE_DOMAIN housekeeping CPUs. */ - if (cgrpv2) { - for (i =3D 0; i < ndoms; i++) { - /* - * The top cpuset may contain some boot time isolated - * CPUs that need to be excluded from the sched domain. 
- */ - if (csa[i] =3D=3D &top_cpuset) - cpumask_and(doms[i], csa[i]->effective_cpus, - housekeeping_cpumask(HK_TYPE_DOMAIN)); - else - cpumask_copy(doms[i], csa[i]->effective_cpus); - if (dattr) - dattr[i] =3D SD_ATTR_INIT; - } - goto done; - } - - for (nslot =3D 0, i =3D 0; i < csn; i++) { - nslot_update =3D 0; - for (j =3D i; j < csn; j++) { - if (uf_find(&csa[j]->node) =3D=3D &csa[i]->node) { - struct cpumask *dp =3D doms[nslot]; - - if (i =3D=3D j) { - nslot_update =3D 1; - cpumask_clear(dp); - if (dattr) - *(dattr + nslot) =3D SD_ATTR_INIT; - } - cpumask_or(dp, dp, csa[j]->effective_cpus); - cpumask_and(dp, dp, housekeeping_cpumask(HK_TYPE_DOMAIN)); - if (dattr) - update_domain_attr_tree(dattr + nslot, csa[j]); - } - } - if (nslot_update) - nslot++; + for (i =3D 0; i < ndoms; i++) { + /* + * The top cpuset may contain some boot time isolated + * CPUs that need to be excluded from the sched domain. + */ + if (!csa || csa[i] =3D=3D &top_cpuset) + cpumask_and(doms[i], top_cpuset.effective_cpus, + housekeeping_cpumask(HK_TYPE_DOMAIN)); + else + cpumask_copy(doms[i], csa[i]->effective_cpus); + if (dattr) + dattr[i] =3D SD_ATTR_INIT; } - BUG_ON(nslot !=3D ndoms); =20 done: kfree(csa); --=20 2.34.1