From nobody Thu Apr 2 06:10:00 2026
From: Waiman Long
To: Chen Ridong, Tejun Heo, Johannes Weiner, Michal Koutný, Ingo Molnar,
 Peter Zijlstra, Juri Lelli, Vincent Guittot, Steven Rostedt, Ben Segall,
 Mel Gorman, Valentin Schneider, Frederic Weisbecker, Thomas Gleixner,
 Shuah Khan
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-kselftest@vger.kernel.org, Waiman Long
Subject: [PATCH v6 1/8] cgroup/cpuset: Fix incorrect change to effective_xcpus in partition_xcpus_del()
Date: Sat, 21 Feb 2026 13:54:11 -0500
Message-ID: <20260221185418.29319-2-longman@redhat.com>
In-Reply-To: <20260221185418.29319-1-longman@redhat.com>
References: <20260221185418.29319-1-longman@redhat.com>

The effective_xcpus of a cpuset can contain offline CPUs. In
partition_xcpus_del(), the xcpus parameter is incorrectly used as
a temporary cpumask to mask out offline CPUs. As xcpus can be the
effective_xcpus of a cpuset, this can result in unexpected changes
in that cpumask. Fix this problem by not making any changes to the
xcpus parameter.

Fixes: 11e5f407b64a ("cgroup/cpuset: Keep track of CPUs in isolated partitions")
Reviewed-by: Chen Ridong
Signed-off-by: Waiman Long
---
 kernel/cgroup/cpuset.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 7607dfe516e6..4d10e320b144 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -1220,8 +1220,8 @@ static void partition_xcpus_del(int old_prs, struct cpuset *parent,
 		isolated_cpus_update(old_prs, parent->partition_root_state,
 				     xcpus);
 
-	cpumask_and(xcpus, xcpus, cpu_active_mask);
 	cpumask_or(parent->effective_cpus, parent->effective_cpus, xcpus);
+	cpumask_and(parent->effective_cpus, parent->effective_cpus, cpu_active_mask);
 }
 
 /*
-- 
2.53.0

From nobody Thu Apr 2 06:10:00 2026
From: Waiman Long
To: Chen Ridong, Tejun Heo, Johannes Weiner, Michal Koutný, Ingo Molnar,
 Peter Zijlstra, Juri Lelli, Vincent Guittot, Steven Rostedt, Ben Segall,
 Mel Gorman, Valentin Schneider, Frederic Weisbecker, Thomas Gleixner,
 Shuah Khan
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-kselftest@vger.kernel.org, Waiman Long
Subject: [PATCH v6 2/8] cgroup/cpuset: Fix incorrect use of cpuset_update_tasks_cpumask() in update_cpumasks_hier()
Date: Sat, 21 Feb 2026 13:54:12 -0500
Message-ID: <20260221185418.29319-3-longman@redhat.com>
In-Reply-To: <20260221185418.29319-1-longman@redhat.com>
References: <20260221185418.29319-1-longman@redhat.com>

Commit e2ffe502ba45 ("cgroup/cpuset: Add cpuset.cpus.exclusive for
v2") incorrectly changed the 2nd parameter of
cpuset_update_tasks_cpumask() from tmp->new_cpus to cp->effective_cpus.
This second parameter is just a temporary cpumask for internal use.
The cpuset_update_tasks_cpumask() function was originally called
update_tasks_cpumask() before commit 381b53c3b549 ("cgroup/cpuset:
rename functions shared between v1 and v2").

This mistake can incorrectly change the effective_cpus of the
cpuset when it is the top_cpuset or in arm64 architecture where
task_cpu_possible_mask() may differ from cpu_possible_mask. So far
top_cpuset hasn't been passed to update_cpumasks_hier() yet, but arm64
arch can still be impacted. Fix it by reverting the incorrect change.

Fixes: e2ffe502ba45 ("cgroup/cpuset: Add cpuset.cpus.exclusive for v2")
Signed-off-by: Waiman Long
---
 kernel/cgroup/cpuset.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 4d10e320b144..58660e06d322 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -2156,7 +2156,7 @@ static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp,
 		WARN_ON(!is_in_v2_mode() &&
 			!cpumask_equal(cp->cpus_allowed, cp->effective_cpus));
 
-		cpuset_update_tasks_cpumask(cp, cp->effective_cpus);
+		cpuset_update_tasks_cpumask(cp, tmp->new_cpus);
 
 		/*
 		 * On default hierarchy, inherit the CS_SCHED_LOAD_BALANCE
-- 
2.53.0

From nobody Thu Apr 2 06:10:00 2026
From: Waiman Long
To: Chen Ridong, Tejun Heo, Johannes Weiner, Michal Koutný, Ingo Molnar,
 Peter Zijlstra, Juri Lelli, Vincent Guittot, Steven Rostedt, Ben Segall,
 Mel Gorman, Valentin Schneider, Frederic Weisbecker, Thomas Gleixner,
 Shuah Khan
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-kselftest@vger.kernel.org, Waiman Long
Subject: [PATCH v6 3/8] cgroup/cpuset: Clarify exclusion rules for cpuset internal variables
Date: Sat, 21 Feb 2026 13:54:13 -0500
Message-ID: <20260221185418.29319-4-longman@redhat.com>
In-Reply-To: <20260221185418.29319-1-longman@redhat.com>
References: <20260221185418.29319-1-longman@redhat.com>

Clarify the locking rules associated with file level internal variables
inside the cpuset code. There is no functional change.
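
As an illustration only, the acquisition order documented by the comment
this patch adds (cpu_hotplug_lock, then cpuset_mutex, then callback_lock)
can be modelled in plain userspace C. The sketch below is not kernel code:
enum lock_rank, order_acquire(), order_release(), can_write_external() and
can_read_external() are hypothetical names, and the "locks" are only flags
that record what is held so that an out-of-order acquisition can be
detected.

```c
#include <assert.h>
#include <stdbool.h>

/* Ranks mirror the documented acquisition order; all names are illustrative. */
enum lock_rank {
	LOCK_CPU_HOTPLUG = 0,	/* cpus_read_lock()/cpus_write_lock() */
	LOCK_CPUSET_MUTEX = 1,	/* cpuset_mutex */
	LOCK_CALLBACK = 2,	/* callback_lock (raw spinlock) */
	LOCK_COUNT
};

/* Which modelled locks the (single) simulated task currently holds. */
static bool held[LOCK_COUNT];

/*
 * Record that a lock is taken. Returns false, and takes nothing, if the
 * same lock or a later-ranked lock is already held, i.e. if taking it
 * now would violate the documented order.
 */
static bool order_acquire(enum lock_rank r)
{
	for (int i = r; i < LOCK_COUNT; i++)
		if (held[i])
			return false;
	held[r] = true;
	return true;
}

/* Record that a held lock is dropped. */
static void order_release(enum lock_rank r)
{
	assert(held[r]);
	held[r] = false;
}

/* Modifying externally visible cpuset fields needs all three locks. */
static bool can_write_external(void)
{
	return held[LOCK_CPU_HOTPLUG] && held[LOCK_CPUSET_MUTEX] &&
	       held[LOCK_CALLBACK];
}

/* Reliable reads need either cpuset_mutex or callback_lock. */
static bool can_read_external(void)
{
	return held[LOCK_CPUSET_MUTEX] || held[LOCK_CALLBACK];
}
```

In this model, calling order_acquire(LOCK_CALLBACK) and then
order_acquire(LOCK_CPUSET_MUTEX) makes the second call return false,
because it would invert the documented order.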

Reviewed-by: Chen Ridong
Signed-off-by: Waiman Long
Acked-by: Frederic Weisbecker
---
 kernel/cgroup/cpuset.c | 105 ++++++++++++++++++++++++-----------------
 1 file changed, 61 insertions(+), 44 deletions(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 58660e06d322..e8c0b3cfd1f9 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -61,6 +61,58 @@ static const char * const perr_strings[] = {
 	[PERR_REMOTE] = "Have remote partition underneath",
 };
 
+/*
+ * CPUSET Locking Convention
+ * -------------------------
+ *
+ * Below are the three global locks guarding cpuset structures in lock
+ * acquisition order:
+ *  - cpu_hotplug_lock (cpus_read_lock/cpus_write_lock)
+ *  - cpuset_mutex
+ *  - callback_lock (raw spinlock)
+ *
+ * A task must hold all the three locks to modify externally visible or
+ * used fields of cpusets, though some of the internally used cpuset fields
+ * and internal variables can be modified without holding callback_lock. If only
+ * reliable read access of the externally used fields are needed, a task can
+ * hold either cpuset_mutex or callback_lock which are exposed to other
+ * external subsystems.
+ *
+ * If a task holds cpu_hotplug_lock and cpuset_mutex, it blocks others,
+ * ensuring that it is the only task able to also acquire callback_lock and
+ * be able to modify cpusets. It can perform various checks on the cpuset
+ * structure first, knowing nothing will change. It can also allocate memory
+ * without holding callback_lock. While it is performing these checks, various
+ * callback routines can briefly acquire callback_lock to query cpusets.  Once
+ * it is ready to make the changes, it takes callback_lock, blocking everyone
+ * else.
+ *
+ * Calls to the kernel memory allocator cannot be made while holding
+ * callback_lock which is a spinlock, as the memory allocator may sleep or
+ * call back into cpuset code and acquire callback_lock.
+ *
+ * Now, the task_struct fields mems_allowed and mempolicy may be changed
+ * by other task, we use alloc_lock in the task_struct fields to protect
+ * them.
+ *
+ * The cpuset_common_seq_show() handlers only hold callback_lock across
+ * small pieces of code, such as when reading out possibly multi-word
+ * cpumasks and nodemasks.
+ */
+
+static DEFINE_MUTEX(cpuset_mutex);
+
+/*
+ * File level internal variables below follow one of the following exclusion
+ * rules.
+ *
+ * RWCS: Read/write-able by holding either cpus_write_lock (and optionally
+ *	 cpuset_mutex) or both cpus_read_lock and cpuset_mutex.
+ *
+ * CSCB: Readable by holding either cpuset_mutex or callback_lock. Writable
+ *	 by holding both cpuset_mutex and callback_lock.
+ */
+
 /*
  * For local partitions, update to subpartitions_cpus & isolated_cpus is done
  * in update_parent_effective_cpumask(). For remote partitions, it is done in
@@ -70,19 +122,18 @@ static const char * const perr_strings[] = {
  * Exclusive CPUs distributed out to local or remote sub-partitions of
  * top_cpuset
  */
-static cpumask_var_t	subpartitions_cpus;
+static cpumask_var_t	subpartitions_cpus;	/* RWCS */
 
 /*
- * Exclusive CPUs in isolated partitions
+ * Exclusive CPUs in isolated partitions (shown in cpuset.cpus.isolated)
  */
-static cpumask_var_t	isolated_cpus;
+static cpumask_var_t	isolated_cpus;		/* CSCB */
 
 /*
- * isolated_cpus updating flag (protected by cpuset_mutex)
- * Set if isolated_cpus is going to be updated in the current
- * cpuset_mutex crtical section.
+ * Set if isolated_cpus is being updated in the current cpuset_mutex
+ * critical section.
  */
-static bool isolated_cpus_updating;
+static bool		isolated_cpus_updating;	/* RWCS */
 
 /*
  * A flag to force sched domain rebuild at the end of an operation.
@@ -98,7 +149,7 @@ static bool isolated_cpus_updating;
  * Note that update_relax_domain_level() in cpuset-v1.c can still call
  * rebuild_sched_domains_locked() directly without using this flag.
  */
-static bool force_sd_rebuild;
+static bool force_sd_rebuild;			/* RWCS */
 
 /*
  * Partition root states:
@@ -218,42 +269,6 @@ struct cpuset top_cpuset = {
 	.partition_root_state = PRS_ROOT,
 };
 
-/*
- * There are two global locks guarding cpuset structures - cpuset_mutex and
- * callback_lock. The cpuset code uses only cpuset_mutex. Other kernel
- * subsystems can use cpuset_lock()/cpuset_unlock() to prevent change to cpuset
- * structures. Note that cpuset_mutex needs to be a mutex as it is used in
- * paths that rely on priority inheritance (e.g. scheduler - on RT) for
- * correctness.
- *
- * A task must hold both locks to modify cpusets. If a task holds
- * cpuset_mutex, it blocks others, ensuring that it is the only task able to
- * also acquire callback_lock and be able to modify cpusets.  It can perform
- * various checks on the cpuset structure first, knowing nothing will change.
- * It can also allocate memory while just holding cpuset_mutex.  While it is
- * performing these checks, various callback routines can briefly acquire
- * callback_lock to query cpusets.  Once it is ready to make the changes, it
- * takes callback_lock, blocking everyone else.
- *
- * Calls to the kernel memory allocator can not be made while holding
- * callback_lock, as that would risk double tripping on callback_lock
- * from one of the callbacks into the cpuset code from within
- * __alloc_pages().
- *
- * If a task is only holding callback_lock, then it has read-only
- * access to cpusets.
- *
- * Now, the task_struct fields mems_allowed and mempolicy may be changed
- * by other task, we use alloc_lock in the task_struct fields to protect
- * them.
- *
- * The cpuset_common_seq_show() handlers only hold callback_lock across
- * small pieces of code, such as when reading out possibly multi-word
- * cpumasks and nodemasks.
- */
-
-static DEFINE_MUTEX(cpuset_mutex);
-
 /**
  * cpuset_lock - Acquire the global cpuset mutex
  *
@@ -1162,6 +1177,8 @@ static void reset_partition_data(struct cpuset *cs)
 static void isolated_cpus_update(int old_prs, int new_prs, struct cpumask *xcpus)
 {
 	WARN_ON_ONCE(old_prs == new_prs);
+	lockdep_assert_held(&callback_lock);
+	lockdep_assert_held(&cpuset_mutex);
 	if (new_prs == PRS_ISOLATED)
 		cpumask_or(isolated_cpus, isolated_cpus, xcpus);
 	else
-- 
2.53.0

From nobody Thu Apr 2 06:10:00 2026
From: Waiman Long
To: Chen Ridong, Tejun Heo, Johannes Weiner, Michal Koutný, Ingo Molnar,
 Peter Zijlstra, Juri Lelli, Vincent Guittot, Steven Rostedt, Ben Segall,
 Mel Gorman, Valentin Schneider, Frederic Weisbecker, Thomas Gleixner,
 Shuah Khan
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-kselftest@vger.kernel.org, Waiman Long
Subject: [PATCH v6 4/8] cgroup/cpuset: Set isolated_cpus_updating only if isolated_cpus is changed
Date: Sat, 21 Feb 2026 13:54:14 -0500
Message-ID: <20260221185418.29319-5-longman@redhat.com>
In-Reply-To: <20260221185418.29319-1-longman@redhat.com>
References: <20260221185418.29319-1-longman@redhat.com>

As cpuset is updating HK_TYPE_DOMAIN housekeeping mask when there is
a change in the set of isolated CPUs, making this change is now more
costly than before. Right now, the isolated_cpus_updating flag can
be set even if there is no real change in isolated_cpus. Put in
additional checks to make sure that isolated_cpus_updating is set only
if there is a real change in isolated_cpus.
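
The effect of the additional checks can be sketched in userspace C, under
the assumption that a single 64-bit word stands in for a cpumask. This is
a simplified model and not the kernel implementation:
isolated_cpus_update_sketch() and its bool return value are hypothetical,
and the bit operations stand in for cpumask_subset(), cpumask_intersects(),
cpumask_or() and cpumask_andnot().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Partition root states, using the values from cpus.partition (0/1/2). */
enum prs { PRS_MEMBER = 0, PRS_ROOT = 1, PRS_ISOLATED = 2 };

/* Hypothetical stand-ins for the file-level kernel variables. */
static uint64_t isolated_cpus;		/* one bit per CPU */
static bool isolated_cpus_updating;

/*
 * Model of the patched logic: the updating flag is set only when the
 * isolated mask really changes. Returns true if a change was made.
 */
static bool isolated_cpus_update_sketch(int old_prs, int new_prs, uint64_t xcpus)
{
	assert(old_prs != new_prs);
	if (new_prs == PRS_ISOLATED) {
		/* Adding: no change if xcpus is already a subset. */
		if ((xcpus & ~isolated_cpus) == 0)
			return false;
		isolated_cpus |= xcpus;
	} else {
		/* Removing: no change if xcpus does not intersect. */
		if ((xcpus & isolated_cpus) == 0)
			return false;
		isolated_cpus &= ~xcpus;
	}
	isolated_cpus_updating = true;
	return true;
}
```

With isolated_cpus initially empty, isolating CPUs 2-3 (mask 0xC) sets the
flag, while a later request to isolate CPU 2 alone (mask 0x4) returns
early and leaves the flag untouched.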

Reviewed-by: Chen Ridong
Signed-off-by: Waiman Long
Reviewed-by: Frederic Weisbecker
---
 kernel/cgroup/cpuset.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index e8c0b3cfd1f9..05adf6697030 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -1179,11 +1179,15 @@ static void isolated_cpus_update(int old_prs, int new_prs, struct cpumask *xcpus
 	WARN_ON_ONCE(old_prs == new_prs);
 	lockdep_assert_held(&callback_lock);
 	lockdep_assert_held(&cpuset_mutex);
-	if (new_prs == PRS_ISOLATED)
+	if (new_prs == PRS_ISOLATED) {
+		if (cpumask_subset(xcpus, isolated_cpus))
+			return;
 		cpumask_or(isolated_cpus, isolated_cpus, xcpus);
-	else
+	} else {
+		if (!cpumask_intersects(xcpus, isolated_cpus))
+			return;
 		cpumask_andnot(isolated_cpus, isolated_cpus, xcpus);
-
+	}
 	isolated_cpus_updating = true;
 }
 
-- 
2.53.0

From nobody Thu Apr 2 06:10:00 2026
From: Waiman Long
To: Chen Ridong, Tejun Heo, Johannes Weiner, Michal Koutný, Ingo Molnar,
 Peter Zijlstra, Juri Lelli, Vincent Guittot, Steven Rostedt, Ben Segall,
 Mel Gorman, Valentin Schneider, Frederic Weisbecker, Thomas Gleixner,
 Shuah Khan
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-kselftest@vger.kernel.org, Waiman Long
Subject: [PATCH v6 5/8] kselftest/cgroup: Simplify test_cpuset_prs.sh by removing "S+" command
Date: Sat, 21 Feb 2026 13:54:15 -0500
Message-ID: <20260221185418.29319-6-longman@redhat.com>
In-Reply-To: <20260221185418.29319-1-longman@redhat.com>
References: <20260221185418.29319-1-longman@redhat.com>

The "S+" command is used in the test matrix to enable the cpuset
controller. However this can be done automatically and we never use the
"S-" command to disable cpuset controller.

Simplify the test matrix and reduce clutter by removing the command
and doing that automatically. There is no functional change to the
test cases.

Signed-off-by: Waiman Long
---
 .../selftests/cgroup/test_cpuset_prs.sh | 214 +++++++++---------
 1 file changed, 105 insertions(+), 109 deletions(-)

diff --git a/tools/testing/selftests/cgroup/test_cpuset_prs.sh b/tools/testing/selftests/cgroup/test_cpuset_prs.sh
index 5dff3ad53867..0c5db118f2d1 100755
--- a/tools/testing/selftests/cgroup/test_cpuset_prs.sh
+++ b/tools/testing/selftests/cgroup/test_cpuset_prs.sh
@@ -196,7 +196,6 @@ test_add_proc()
 # P = set cpus.partition (0:member, 1:root, 2:isolated)
 # C = add cpu-list to cpuset.cpus
 # X = add cpu-list to cpuset.cpus.exclusive
-# S = use prefix in subtree_control
 # T = put a task into cgroup
 # CX = add cpu-list to both cpuset.cpus and cpuset.cpus.exclusive
 # O= = Write to CPU online file of
@@ -209,44 +208,44 @@ test_add_proc()
 # sched-debug matching which includes offline CPUs and single-CPU partitions
 # while the second one is for matching cpuset.cpus.isolated.
 #
-SETUP_A123_PARTITIONS="C1-3:P1:S+ C2-3:P1:S+ C3:P1"
+SETUP_A123_PARTITIONS="C1-3:P1 C2-3:P1 C3:P1"
 TEST_MATRIX=(
 	# old-A1 old-A2 old-A3 old-B1 new-A1 new-A2 new-A3 new-B1 fail ECPUs Pstate ISOLCPUS
 	# ------ ------ ------ ------ ------ ------ ------ ------ ---- ----- ------ --------
-	" C0-1 . . C2-3 S+ C4-5 . . 0 A2:0-1"
+	" C0-1 . . C2-3 . C4-5 . . 0 A2:0-1"
 	" C0-1 . . C2-3 P1 . . . 0 "
-	" C0-1 . . C2-3 P1:S+ C0-1:P1 . . 0 "
-	" C0-1 . . C2-3 P1:S+ C1:P1 . . 0 "
-	" C0-1:S+ . . C2-3 . . . P1 0 "
-	" C0-1:P1 . . C2-3 S+ C1 . . 0 "
-	" C0-1:P1 . . C2-3 S+ C1:P1 . . 0 "
-	" C0-1:P1 . . C2-3 S+ C1:P1 . P1 0 "
+	" C0-1 . . C2-3 P1 C0-1:P1 . . 0 "
+	" C0-1 . . C2-3 P1 C1:P1 . . 0 "
+	" C0-1 . . C2-3 . . . P1 0 "
+	" C0-1:P1 . . C2-3 . C1 . . 0 "
+	" C0-1:P1 . . C2-3 . C1:P1 . . 0 "
+	" C0-1:P1 . . C2-3 . C1:P1 . P1 0 "
+	" C0-1:P1 . . C2-3 C4-5 . . . 0 A1:4-5"
 	" C0-1:P1 . . C2-3 C4-5 . . . 0 A1:4-5"
-	" C0-1:P1 . . C2-3 S+:C4-5 . . . 0 A1:4-5"
 	" C0-1 . . C2-3:P1 . . . C2 0 "
 	" C0-1 . . C2-3:P1 . . . C4-5 0 B1:4-5"
-	"C0-3:P1:S+ C2-3:P1 . . . . . . 0 A1:0-1|A2:2-3|XA2:2-3"
-	"C0-3:P1:S+ C2-3:P1 . . C1-3 . . . 0 A1:1|A2:2-3|XA2:2-3"
-	"C2-3:P1:S+ C3:P1 . . C3 . . . 0 A1:|A2:3|XA2:3 A1:P1|A2:P1"
-	"C2-3:P1:S+ C3:P1 . . C3 P0 . . 0 A1:3|A2:3 A1:P1|A2:P0"
-	"C2-3:P1:S+ C2:P1 . . C2-4 . . . 0 A1:3-4|A2:2"
-	"C2-3:P1:S+ C3:P1 . . C3 . . C0-2 0 A1:|B1:0-2 A1:P1|A2:P1"
+	" C0-3:P1 C2-3:P1 . . . . . . 0 A1:0-1|A2:2-3|XA2:2-3"
+	" C0-3:P1 C2-3:P1 . . C1-3 . . . 0 A1:1|A2:2-3|XA2:2-3"
+	" C2-3:P1 C3:P1 . . C3 . . . 0 A1:|A2:3|XA2:3 A1:P1|A2:P1"
+	" C2-3:P1 C3:P1 . . C3 P0 . . 0 A1:3|A2:3 A1:P1|A2:P0"
+	" C2-3:P1 C2:P1 . . C2-4 . . . 0 A1:3-4|A2:2"
+	" C2-3:P1 C3:P1 . . C3 . . C0-2 0 A1:|B1:0-2 A1:P1|A2:P1"
 	"$SETUP_A123_PARTITIONS . C2-3 . . . 0 A1:|A2:2|A3:3 A1:P1|A2:P1|A3:P1"
 
 	# CPU offlining cases:
-	" C0-1 . . C2-3 S+ C4-5 . O2=0 0 A1:0-1|B1:3"
-	"C0-3:P1:S+ C2-3:P1 . . O2=0 . . . 0 A1:0-1|A2:3"
-	"C0-3:P1:S+ C2-3:P1 . . O2=0 O2=1 . . 0 A1:0-1|A2:2-3"
-	"C0-3:P1:S+ C2-3:P1 . . O1=0 . . . 0 A1:0|A2:2-3"
-	"C0-3:P1:S+ C2-3:P1 . . O1=0 O1=1 . . 0 A1:0-1|A2:2-3"
-	"C2-3:P1:S+ C3:P1 . . O3=0 O3=1 . . 0 A1:2|A2:3 A1:P1|A2:P1"
-	"C2-3:P1:S+ C3:P2 . . O3=0 O3=1 . . 0 A1:2|A2:3 A1:P1|A2:P2"
-	"C2-3:P1:S+ C3:P1 . . O2=0 O2=1 . . 0 A1:2|A2:3 A1:P1|A2:P1"
-	"C2-3:P1:S+ C3:P2 . . O2=0 O2=1 . . 0 A1:2|A2:3 A1:P1|A2:P2"
-	"C2-3:P1:S+ C3:P1 . . O2=0 . . . 0 A1:|A2:3 A1:P1|A2:P1"
-	"C2-3:P1:S+ C3:P1 . . O3=0 . . . 0 A1:2|A2: A1:P1|A2:P1"
-	"C2-3:P1:S+ C3:P1 . . T:O2=0 . . . 0 A1:3|A2:3 A1:P1|A2:P-1"
-	"C2-3:P1:S+ C3:P1 . . . T:O3=0 . . 0 A1:2|A2:2 A1:P1|A2:P-1"
+	" C0-1 . . C2-3 . C4-5 . O2=0 0 A1:0-1|B1:3"
+	" C0-3:P1 C2-3:P1 . . O2=0 . . . 0 A1:0-1|A2:3"
+	" C0-3:P1 C2-3:P1 . . O2=0 O2=1 . . 0 A1:0-1|A2:2-3"
+	" C0-3:P1 C2-3:P1 . . O1=0 . . . 0 A1:0|A2:2-3"
+	" C0-3:P1 C2-3:P1 . . O1=0 O1=1 . . 0 A1:0-1|A2:2-3"
+	" C2-3:P1 C3:P1 . . O3=0 O3=1 . . 0 A1:2|A2:3 A1:P1|A2:P1"
+	" C2-3:P1 C3:P2 . . O3=0 O3=1 . . 0 A1:2|A2:3 A1:P1|A2:P2"
+	" C2-3:P1 C3:P1 . . O2=0 O2=1 . . 0 A1:2|A2:3 A1:P1|A2:P1"
+	" C2-3:P1 C3:P2 . . O2=0 O2=1 . . 0 A1:2|A2:3 A1:P1|A2:P2"
+	" C2-3:P1 C3:P1 . . O2=0 . . . 0 A1:|A2:3 A1:P1|A2:P1"
+	" C2-3:P1 C3:P1 . . O3=0 . . . 0 A1:2|A2: A1:P1|A2:P1"
+	" C2-3:P1 C3:P1 . . T:O2=0 . . . 0 A1:3|A2:3 A1:P1|A2:P-1"
+	" C2-3:P1 C3:P1 . . . T:O3=0 . . 0 A1:2|A2:2 A1:P1|A2:P-1"
 	"$SETUP_A123_PARTITIONS . O1=0 . . . 0 A1:|A2:2|A3:3 A1:P1|A2:P1|A3:P1"
 	"$SETUP_A123_PARTITIONS . O2=0 . . .
0 A1:1|A2:= |A3:3 A1:P1|A2:P1|A3:P1" "$SETUP_A123_PARTITIONS . O3=3D0 . . . 0 A1:1|A2:= 2|A3: A1:P1|A2:P1|A3:P1" @@ -264,88 +263,87 @@ TEST_MATRIX=3D( # # Remote partition and cpuset.cpus.exclusive tests # - " C0-3:S+ C1-3:S+ C2-3 . X2-3 . . . 0 A1:0-3|A2:= 1-3|A3:2-3|XA1:2-3" - " C0-3:S+ C1-3:S+ C2-3 . X2-3 X2-3:P2 . . 0 A1:0-1|A2:= 2-3|A3:2-3 A1:P0|A2:P2 2-3" - " C0-3:S+ C1-3:S+ C2-3 . X2-3 X3:P2 . . 0 A1:0-2|A2:= 3|A3:3 A1:P0|A2:P2 3" - " C0-3:S+ C1-3:S+ C2-3 . X2-3 X2-3 X2-3:P2 . 0 A1:0-1|A2:= 1|A3:2-3 A1:P0|A3:P2 2-3" - " C0-3:S+ C1-3:S+ C2-3 . X2-3 X2-3 X2-3:P2:C3 . 0 A1:0-1|A2:= 1|A3:2-3 A1:P0|A3:P2 2-3" - " C0-3:S+ C1-3:S+ C2-3 C2-3 . . . P2 0 A1:0-1|A2:= 1|A3:1|B1:2-3 A1:P0|A3:P0|B1:P2" - " C0-3:S+ C1-3:S+ C2-3 C4-5 . . . P2 0 B1:4-5 B1:= P2 4-5" - " C0-3:S+ C1-3:S+ C2-3 C4 X2-3 X2-3 X2-3:P2 P2 0 A3:2-3|B1:= 4 A3:P2|B1:P2 2-4" - " C0-3:S+ C1-3:S+ C2-3 C4 X2-3 X2-3 X2-3:P2:C1-3 P2 0 A3:2-3|B1:= 4 A3:P2|B1:P2 2-4" - " C0-3:S+ C1-3:S+ C2-3 C4 X1-3 X1-3:P2 P2 . 0 A2:1|A3:2-= 3 A2:P2|A3:P2 1-3" - " C0-3:S+ C1-3:S+ C2-3 C4 X2-3 X2-3 X2-3:P2 P2:C4-5 0 A3:2-3|B1:= 4-5 A3:P2|B1:P2 2-5" - " C4:X0-3:S+ X1-3:S+ X2-3 . . P2 . . 0 A1:4|A2:1-= 3|A3:1-3 A2:P2 1-3" - " C4:X0-3:S+ X1-3:S+ X2-3 . . . P2 . 0 A1:4|A2:4|= A3:2-3 A3:P2 2-3" + " C0-3 C1-3 C2-3 . X2-3 . . . 0 A1:0-3|A2:= 1-3|A3:2-3|XA1:2-3" + " C0-3 C1-3 C2-3 . X2-3 X2-3:P2 . . 0 A1:0-1|A2:= 2-3|A3:2-3 A1:P0|A2:P2 2-3" + " C0-3 C1-3 C2-3 . X2-3 X3:P2 . . 0 A1:0-2|A2:= 3|A3:3 A1:P0|A2:P2 3" + " C0-3 C1-3 C2-3 . X2-3 X2-3 X2-3:P2 . 0 A1:0-1|A2:= 1|A3:2-3 A1:P0|A3:P2 2-3" + " C0-3 C1-3 C2-3 . X2-3 X2-3 X2-3:P2:C3 . 0 A1:0-1|A2:= 1|A3:2-3 A1:P0|A3:P2 2-3" + " C0-3 C1-3 C2-3 C2-3 . . . P2 0 A1:0-1|A2:= 1|A3:1|B1:2-3 A1:P0|A3:P0|B1:P2" + " C0-3 C1-3 C2-3 C4-5 . . . P2 0 B1:4-5 B1:= P2 4-5" + " C0-3 C1-3 C2-3 C4 X2-3 X2-3 X2-3:P2 P2 0 A3:2-3|B1:= 4 A3:P2|B1:P2 2-4" + " C0-3 C1-3 C2-3 C4 X2-3 X2-3 X2-3:P2:C1-3 P2 0 A3:2-3|B1:= 4 A3:P2|B1:P2 2-4" + " C0-3 C1-3 C2-3 C4 X1-3 X1-3:P2 P2 . 
0 A2:1|A3:2-= 3 A2:P2|A3:P2 1-3" + " C0-3 C1-3 C2-3 C4 X2-3 X2-3 X2-3:P2 P2:C4-5 0 A3:2-3|B1:= 4-5 A3:P2|B1:P2 2-5" + " C4:X0-3 X1-3 X2-3 . . P2 . . 0 A1:4|A2:1-= 3|A3:1-3 A2:P2 1-3" + " C4:X0-3 X1-3 X2-3 . . . P2 . 0 A1:4|A2:4|= A3:2-3 A3:P2 2-3" =20 # Nested remote/local partition tests - " C0-3:S+ C1-3:S+ C2-3 C4-5 X2-3 X2-3:P1 P2 P1 0 A1:0-1|A2:= |A3:2-3|B1:4-5 \ + " C0-3 C1-3 C2-3 C4-5 X2-3 X2-3:P1 P2 P1 0 A1:0-1|A2:= |A3:2-3|B1:4-5 \ A1:P0|A2:P1|A3:P2|B1:P1 2-3" - " C0-3:S+ C1-3:S+ C2-3 C4 X2-3 X2-3:P1 P2 P1 0 A1:0-1|A2:= |A3:2-3|B1:4 \ + " C0-3 C1-3 C2-3 C4 X2-3 X2-3:P1 P2 P1 0 A1:0-1|A2:= |A3:2-3|B1:4 \ A1:P0|A2:P1|A3:P2|B1:P1 2-4|2-3" - " C0-3:S+ C1-3:S+ C2-3 C4 X2-3 X2-3:P1 . P1 0 A1:0-1|A2:= 2-3|A3:2-3|B1:4 \ + " C0-3 C1-3 C2-3 C4 X2-3 X2-3:P1 . P1 0 A1:0-1|A2:= 2-3|A3:2-3|B1:4 \ A1:P0|A2:P1|A3:P0|B1:P1" - " C0-3:S+ C1-3:S+ C3 C4 X2-3 X2-3:P1 P2 P1 0 A1:0-1|A2:= 2|A3:3|B1:4 \ + " C0-3 C1-3 C3 C4 X2-3 X2-3:P1 P2 P1 0 A1:0-1|A2:= 2|A3:3|B1:4 \ A1:P0|A2:P1|A3:P2|B1:P1 2-4|3" - " C0-4:S+ C1-4:S+ C2-4 . X2-4 X2-4:P2 X4:P1 . 0 A1:0-1|A2:= 2-3|A3:4 \ + " C0-4 C1-4 C2-4 . X2-4 X2-4:P2 X4:P1 . 0 A1:0-1|A2:= 2-3|A3:4 \ A1:P0|A2:P2|A3:P1 2-4|2-3" - " C0-4:S+ C1-4:S+ C2-4 . X2-4 X2-4:P2 X3-4:P1 . 0 A1:0-1|A2:= 2|A3:3-4 \ + " C0-4 C1-4 C2-4 . X2-4 X2-4:P2 X3-4:P1 . 0 A1:0-1|A2:= 2|A3:3-4 \ A1:P0|A2:P2|A3:P1 2" - " C0-4:X2-4:S+ C1-4:X2-4:S+:P2 C2-4:X4:P1 \ + " C0-4:X2-4 C1-4:X2-4:P2 C2-4:X4:P1 \ . . X5 . . 0 A1:0-4|A2:1-4|A3:2-4 \ A1:P0|A2:P-2|A3:P-1 ." - " C0-4:X2-4:S+ C1-4:X2-4:S+:P2 C2-4:X4:P1 \ + " C0-4:X2-4 C1-4:X2-4:P2 C2-4:X4:P1 \ . . . X1 . 0 A1:0-1|A2:2-4|A3:2-4 \ A1:P0|A2:P2|A3:P-1 2-4" =20 # Remote partition offline tests - " C0-3:S+ C1-3:S+ C2-3 . X2-3 X2-3 X2-3:P2:O2=3D0 . 0 A1:0-1|A= 2:1|A3:3 A1:P0|A3:P2 2-3" - " C0-3:S+ C1-3:S+ C2-3 . X2-3 X2-3 X2-3:P2:O2=3D0 O2=3D1 0 A1:0-= 1|A2:1|A3:2-3 A1:P0|A3:P2 2-3" - " C0-3:S+ C1-3:S+ C3 . X2-3 X2-3 P2:O3=3D0 . 0 A1:0-2|A= 2:1-2|A3: A1:P0|A3:P2 3" - " C0-3:S+ C1-3:S+ C3 . X2-3 X2-3 T:P2:O3=3D0 . 
0 A1:0-2|A= 2:1-2|A3:1-2 A1:P0|A3:P-2 3|" + " C0-3 C1-3 C2-3 . X2-3 X2-3 X2-3:P2:O2=3D0 . 0 A1:0-1|A= 2:1|A3:3 A1:P0|A3:P2 2-3" + " C0-3 C1-3 C2-3 . X2-3 X2-3 X2-3:P2:O2=3D0 O2=3D1 0 A1:0-= 1|A2:1|A3:2-3 A1:P0|A3:P2 2-3" + " C0-3 C1-3 C3 . X2-3 X2-3 P2:O3=3D0 . 0 A1:0-2|A= 2:1-2|A3: A1:P0|A3:P2 3" + " C0-3 C1-3 C3 . X2-3 X2-3 T:P2:O3=3D0 . 0 A1:0-2|A= 2:1-2|A3:1-2 A1:P0|A3:P-2 3|" =20 # An invalidated remote partition cannot self-recover from hotplug - " C0-3:S+ C1-3:S+ C2 . X2-3 X2-3 T:P2:O2=3D0 O2=3D1 0 A1:0-3= |A2:1-3|A3:2 A1:P0|A3:P-2 ." + " C0-3 C1-3 C2 . X2-3 X2-3 T:P2:O2=3D0 O2=3D1 0 A1:0-3= |A2:1-3|A3:2 A1:P0|A3:P-2 ." =20 # cpus.exclusive.effective clearing test - " C0-3:S+ C1-3:S+ C2 . X2-3:X . . . 0 A1:0-3|A2:= 1-3|A3:2|XA1:" + " C0-3 C1-3 C2 . X2-3:X . . . 0 A1:0-3|A2:= 1-3|A3:2|XA1:" =20 # Invalid to valid remote partition transition test - " C0-3:S+ C1-3 . . . X3:P2 . . 0 A1:0-3|A2:= 1-3|XA2: A2:P-2 ." - " C0-3:S+ C1-3:X3:P2 - . . X2-3 P2 . . 0 A1:0-2|A2:3|XA2:3 A2:P2 = 3" + " C0-3 C1-3 . . . X3:P2 . . 0 A1:0-3|A2:= 1-3|XA2: A2:P-2 ." + " C0-3 C1-3:X3:P2 . . X2-3 P2 . . 0 A1:0-2|A2:= 3|XA2:3 A2:P2 3" =20 # Invalid to valid local partition direct transition tests - " C1-3:S+:P2 X4:P2 . . . . . . 0 A1:1-3|XA1= :1-3|A2:1-3:XA2: A1:P2|A2:P-2 1-3" - " C1-3:S+:P2 X4:P2 . . . X3:P2 . . 0 A1:1-2|XA1= :1-3|A2:3:XA2:3 A1:P2|A2:P2 1-3" - " C0-3:P2 . . C4-6 C0-4 . . . 0 A1:0-4|B1:= 5-6 A1:P2|B1:P0" - " C0-3:P2 . . C4-6 C0-4:C0-3 . . . 0 A1:0-3|B1:= 4-6 A1:P2|B1:P0 0-3" + " C1-3:P2 X4:P2 . . . . . . 0 A1:1-3|XA1= :1-3|A2:1-3:XA2: A1:P2|A2:P-2 1-3" + " C1-3:P2 X4:P2 . . . X3:P2 . . 0 A1:1-2|XA1= :1-3|A2:3:XA2:3 A1:P2|A2:P2 1-3" + " C0-3:P2 . . C4-6 C0-4 . . . 0 A1:0-4|B1:= 5-6 A1:P2|B1:P0" + " C0-3:P2 . . C4-6 C0-4:C0-3 . . . 0 A1:0-3|B1:= 4-6 A1:P2|B1:P0 0-3" =20 # Local partition invalidation tests - " C0-3:X1-3:S+:P2 C1-3:X2-3:S+:P2 C2-3:X3:P2 \ + " C0-3:X1-3:P2 C1-3:X2-3:P2 C2-3:X3:P2 \ . . . . . 
0 A1:1|A2:2|A3:3 A1:P2|A2:P2|A3:P= 2 1-3" - " C0-3:X1-3:S+:P2 C1-3:X2-3:S+:P2 C2-3:X3:P2 \ + " C0-3:X1-3:P2 C1-3:X2-3:P2 C2-3:X3:P2 \ . . X4 . . 0 A1:1-3|A2:1-3|A3:2-3|XA2:|XA3: = A1:P2|A2:P-2|A3:P-2 1-3" - " C0-3:X1-3:S+:P2 C1-3:X2-3:S+:P2 C2-3:X3:P2 \ + " C0-3:X1-3:P2 C1-3:X2-3:P2 C2-3:X3:P2 \ . . C4:X . . 0 A1:1-3|A2:1-3|A3:2-3|XA2:|XA3: = A1:P2|A2:P-2|A3:P-2 1-3" # Local partition CPU change tests - " C0-5:S+:P2 C4-5:S+:P1 . . . C3-5 . . 0 A1:0-2|A2:= 3-5 A1:P2|A2:P1 0-2" - " C0-5:S+:P2 C4-5:S+:P1 . . C1-5 . . . 0 A1:1-3|A2:= 4-5 A1:P2|A2:P1 1-3" + " C0-5:P2 C4-5:P1 . . . C3-5 . . 0 A1:0-2|A2:= 3-5 A1:P2|A2:P1 0-2" + " C0-5:P2 C4-5:P1 . . C1-5 . . . 0 A1:1-3|A2:= 4-5 A1:P2|A2:P1 1-3" =20 # cpus_allowed/exclusive_cpus update tests - " C0-3:X2-3:S+ C1-3:X2-3:S+ C2-3:X2-3 \ + " C0-3:X2-3 C1-3:X2-3 C2-3:X2-3 \ . X:C4 . P2 . 0 A1:4|A2:4|XA2:|XA3:|A3:4 \ A1:P0|A3:P-2 ." - " C0-3:X2-3:S+ C1-3:X2-3:S+ C2-3:X2-3 \ + " C0-3:X2-3 C1-3:X2-3 C2-3:X2-3 \ . X1 . P2 . 0 A1:0-3|A2:1-3|XA1:1|XA2:|XA3:|A= 3:2-3 \ A1:P0|A3:P-2 ." - " C0-3:X2-3:S+ C1-3:X2-3:S+ C2-3:X2-3 \ + " C0-3:X2-3 C1-3:X2-3 C2-3:X2-3 \ . . X3 P2 . 0 A1:0-2|A2:1-2|XA2:3|XA3:3|A3:3 \ A1:P0|A3:P2 3" - " C0-3:X2-3:S+ C1-3:X2-3:S+ C2-3:X2-3:P2 \ + " C0-3:X2-3 C1-3:X2-3 C2-3:X2-3:P2 \ . . X3 . . 0 A1:0-2|A2:1-2|XA2:3|XA3:3|A3:3|= XA3:3 \ A1:P0|A3:P2 3" - " C0-3:X2-3:S+ C1-3:X2-3:S+ C2-3:X2-3:P2 \ + " C0-3:X2-3 C1-3:X2-3 C2-3:X2-3:P2 \ . X4 . . . 0 A1:0-3|A2:1-3|A3:2-3|XA1:4|XA2:= |XA3 \ A1:P0|A3:P-2" =20 @@ -356,37 +354,37 @@ TEST_MATRIX=3D( # # Adding CPUs to partition root that are not in parent's # cpuset.cpus is allowed, but those extra CPUs are ignored. - "C2-3:P1:S+ C3:P1 . . . C2-4 . . 0 A1:|A2:2-3= A1:P1|A2:P1" + " C2-3:P1 C3:P1 . . . C2-4 . . 0 A1:|A2:2-3= A1:P1|A2:P1" =20 # Taking away all CPUs from parent or itself if there are tasks # will make the partition invalid. - "C2-3:P1:S+ C3:P1 . . T C2-3 . . 0 A1:2-3|A2:= 2-3 A1:P1|A2:P-1" - " C3:P1:S+ C3 . . T P1 . . 
0 A1:3|A2:3 = A1:P1|A2:P-1" + " C2-3:P1 C3:P1 . . T C2-3 . . 0 A1:2-3|A2:= 2-3 A1:P1|A2:P-1" + " C3:P1 C3 . . T P1 . . 0 A1:3|A2:3 = A1:P1|A2:P-1" "$SETUP_A123_PARTITIONS . T:C2-3 . . . 0 A1:2-3|A2:= 2-3|A3:3 A1:P1|A2:P-1|A3:P-1" "$SETUP_A123_PARTITIONS . T:C2-3:C1-3 . . . 0 A1:1|A2:2|= A3:3 A1:P1|A2:P1|A3:P1" =20 # Changing a partition root to member makes child partitions invalid - "C2-3:P1:S+ C3:P1 . . P0 . . . 0 A1:2-3|A2:= 3 A1:P0|A2:P-1" + " C2-3:P1 C3:P1 . . P0 . . . 0 A1:2-3|A2:= 3 A1:P0|A2:P-1" "$SETUP_A123_PARTITIONS . C2-3 P0 . . 0 A1:2-3|A2:= 2-3|A3:3 A1:P1|A2:P0|A3:P-1" =20 # cpuset.cpus can contains cpus not in parent's cpuset.cpus as long # as they overlap. - "C2-3:P1:S+ . . . . C3-4:P1 . . 0 A1:2|A2:3 = A1:P1|A2:P1" + " C2-3:P1 . . . . C3-4:P1 . . 0 A1:2|A2:3 = A1:P1|A2:P1" =20 # Deletion of CPUs distributed to child cgroup is allowed. - "C0-1:P1:S+ C1 . C2-3 C4-5 . . . 0 A1:4-5|A2:= 4-5" + " C0-1:P1 C1 . C2-3 C4-5 . . . 0 A1:4-5|A2:= 4-5" =20 # To become a valid partition root, cpuset.cpus must overlap parent's # cpuset.cpus. - " C0-1:P1 . . C2-3 S+ C4-5:P1 . . 0 A1:0-1|A2:= 0-1 A1:P1|A2:P-1" + " C0-1:P1 . . C2-3 . C4-5:P1 . . 0 A1:0-1|A2:= 0-1 A1:P1|A2:P-1" =20 # Enabling partition with child cpusets is allowed - " C0-1:S+ C1 . C2-3 P1 . . . 0 A1:0-1|A2:= 1 A1:P1" + " C0-1 C1 . C2-3 P1 . . . 0 A1:0-1|A2:= 1 A1:P1" =20 # A partition root with non-partition root parent is invalid| but it # can be made valid if its parent becomes a partition root too. - " C0-1:S+ C1 . C2-3 . P2 . . 0 A1:0-1|A2:= 1 A1:P0|A2:P-2" - " C0-1:S+ C1:P2 . C2-3 P1 . . . 0 A1:0|A2:1 = A1:P1|A2:P2 0-1|1" + " C0-1 C1 . C2-3 . P2 . . 0 A1:0-1|A2:= 1 A1:P0|A2:P-2" + " C0-1 C1:P2 . C2-3 P1 . . . 0 A1:0|A2:1 = A1:P1|A2:P2 0-1|1" =20 # A non-exclusive cpuset.cpus change will not invalidate its siblings par= tition. " C0-1:P1 . . C2-3 C0-2 . . . 
0 A1:0-2|B1:= 3 A1:P1|B1:P0" @@ -398,23 +396,23 @@ TEST_MATRIX=3D( =20 # Child partition root that try to take all CPUs from parent partition # with tasks will remain invalid. - " C1-4:P1:S+ P1 . . . . . . 0 A1:1-4|A2:= 1-4 A1:P1|A2:P-1" - " C1-4:P1:S+ P1 . . . C1-4 . . 0 A1|A2:1-4 = A1:P1|A2:P1" - " C1-4:P1:S+ P1 . . T C1-4 . . 0 A1:1-4|A2:= 1-4 A1:P1|A2:P-1" + " C1-4:P1 P1 . . . . . . 0 A1:1-4|A2:= 1-4 A1:P1|A2:P-1" + " C1-4:P1 P1 . . . C1-4 . . 0 A1|A2:1-4 = A1:P1|A2:P1" + " C1-4:P1 P1 . . T C1-4 . . 0 A1:1-4|A2:= 1-4 A1:P1|A2:P-1" =20 # Clearing of cpuset.cpus with a preset cpuset.cpus.exclusive shouldn't # affect cpuset.cpus.exclusive.effective. - " C1-4:X3:S+ C1:X3 . . . C . . 0 A2:1-4|XA2= :3" + " C1-4:X3 C1:X3 . . . C . . 0 A2:1-4|XA2= :3" =20 # cpuset.cpus can contain CPUs that overlap a sibling cpuset with cpus.ex= clusive # but creating a local partition out of it is not allowed. Similarly and = change # in cpuset.cpus of a local partition that overlaps sibling exclusive CPU= s will # invalidate it. - " CX1-4:S+ CX2-4:P2 . C5-6 . . . P1 0 A1:1|A2:2-= 4|B1:5-6|XB1:5-6 \ + " CX1-4 CX2-4:P2 . C5-6 . . . P1 0 A1:1|A2:2-= 4|B1:5-6|XB1:5-6 \ A1:P0|A2:P2:B1:P1 2-4" - " CX1-4:S+ CX2-4:P2 . C3-6 . . . P1 0 A1:1|A2:2-= 4|B1:5-6 \ + " CX1-4 CX2-4:P2 . C3-6 . . . P1 0 A1:1|A2:2-= 4|B1:5-6 \ A1:P0|A2:P2:B1:P-1 2-4" - " CX1-4:S+ CX2-4:P2 . C5-6 . . . P1:C3-6 0 A1:1|A2:2-= 4|B1:5-6 \ + " CX1-4 CX2-4:P2 . C5-6 . . . P1:C3-6 0 A1:1|A2:2-= 4|B1:5-6 \ A1:P0|A2:P2:B1:P-1 2-4" =20 # When multiple partitions with conflicting cpuset.cpus are created, the @@ -426,14 +424,14 @@ TEST_MATRIX=3D( " C1-3:X1-3 . . C4-5 . . . C1-2 0 A1:1-3|B1:= 1-2" =20 # cpuset.cpus can become empty with task in it as it inherits parent's ef= fective CPUs - " C1-3:S+ C2 . . . T:C . . 0 A1:1-3|A2:= 1-3" + " C1-3 C2 . . . T:C . . 
0 A1:1-3|A2:= 1-3" =20 # old-A1 old-A2 old-A3 old-B1 new-A1 new-A2 new-A3 new-B1 fail ECPUs Pst= ate ISOLCPUS # ------ ------ ------ ------ ------ ------ ------ ------ ---- ----- ---= --- -------- # Failure cases: =20 # A task cannot be added to a partition with no cpu - "C2-3:P1:S+ C3:P1 . . O2=3D0:T . . . 1 A1:|A2:3= A1:P1|A2:P1" + " C2-3:P1 C3:P1 . . O2=3D0:T . . . 1 A1:|A2:3= A1:P1|A2:P1" =20 # Changes to cpuset.cpus.exclusive that violate exclusivity rule is rejec= ted " C0-3 . . C4-5 X0-3 . . X3-5 1 A1:0-3|B1:= 4-5" @@ -465,31 +463,31 @@ REMOTE_TEST_MATRIX=3D( # old-p1 old-p2 old-c11 old-c12 old-c21 old-c22 # new-p1 new-p2 new-c11 new-c12 new-c21 new-c22 ECPUs Pstate ISOLCPUS # ------ ------ ------- ------- ------- ------- ----- ------ -------- - " X1-3:S+ X4-6:S+ X1-2 X3 X4-5 X6 \ + " X1-3 X4-6 X1-2 X3 X4-5 X6 \ . . P2 P2 P2 P2 c11:1-2|c12:3|c21:4-5|c2= 2:6 \ c11:P2|c12:P2|c21:P2|c22:P2 1-6" - " CX1-4:S+ . X1-2:P2 C3 . . \ + " CX1-4 . X1-2:P2 C3 . . \ . . . C3-4 . . p1:3-4|c11:1-2|c12:3-4 \ p1:P0|c11:P2|c12:P0 1-2" - " CX1-4:S+ . X1-2:P2 . . . \ + " CX1-4 . X1-2:P2 . . . \ X2-4 . . . . . p1:1,3-4|c11:2 \ p1:P0|c11:P2 2" - " CX1-5:S+ . X1-2:P2 X3-5:P1 . . \ + " CX1-5 . X1-2:P2 X3-5:P1 . . \ X2-4 . . . . . p1:1,5|c11:2|c12:3-4 \ p1:P0|c11:P2|c12:P1 2" - " CX1-4:S+ . X1-2:P2 X3-4:P1 . . \ + " CX1-4 . X1-2:P2 X3-4:P1 . . \ . . X2 . . . p1:1|c11:2|c12:3-4 \ p1:P0|c11:P2|c12:P1 2" # p1 as member, will get its effective CPUs from its parent rtest - " CX1-4:S+ . X1-2:P2 X3-4:P1 . . \ + " CX1-4 . X1-2:P2 X3-4:P1 . . \ . . X1 CX2-4 . . p1:5-7|c11:1|c12:2-4 \ p1:P0|c11:P2|c12:P1 1" - " CX1-4:S+ X5-6:P1:S+ . . . . \ - . . X1-2:P2 X4-5:P1 . X1-7:P2 p1:3|c11:1-2|c12:4:c22:5= -6 \ + " CX1-4 X5-6:P1 . . . . \ + . . X1-2:P2 X4-5:P1 . X1-7:P2 p1:3|c11:1-2|c12:4:c22:5= -6 \ p1:P0|p2:P1|c11:P2|c12:P1|c22:P2 \ 1-2,4-6|1-2,5-6" # c12 whose cpuset.cpus CPUs are all granted to c11 will become invalid p= artition - " C1-5:P1:S+ . C1-4:P1 C2-3 . . \ + " C1-5:P1 . C1-4:P1 C2-3 . . \ . 
. . P1 . . p1:5|c11:1-4|c12:5 \ p1:P1|c11:P1|c12:P-1" ) @@ -530,7 +528,6 @@ set_ctrl_state() CGRP=3D$1 STATE=3D$2 SHOWERR=3D${3} - CTRL=3D${CTRL:=3D$CONTROLLER} HASERR=3D0 REDIRECT=3D"2> $TMPMSG" [[ -z "$STATE" || "$STATE" =3D '.' ]] && return 0 @@ -540,15 +537,16 @@ set_ctrl_state() for CMD in $(echo $STATE | sed -e "s/:/ /g") do TFILE=3D$CGRP/cgroup.procs - SFILE=3D$CGRP/cgroup.subtree_control PFILE=3D$CGRP/cpuset.cpus.partition CFILE=3D$CGRP/cpuset.cpus XFILE=3D$CGRP/cpuset.cpus.exclusive - case $CMD in - S*) PREFIX=3D${CMD#?} - COMM=3D"echo ${PREFIX}${CTRL} > $SFILE" + + # Enable cpuset controller if not enabled yet + [[ -f $CFILE ]] || { + COMM=3D"echo +cpuset > $CGRP/../cgroup.subtree_control" eval $COMM $REDIRECT - ;; + } + case $CMD in X*) CPUS=3D${CMD#?} COMM=3D"echo $CPUS > $XFILE" @@ -947,7 +945,6 @@ check_test_results() run_state_test() { TEST=3D$1 - CONTROLLER=3Dcpuset CGROUP_LIST=3D". A1 A1/A2 A1/A2/A3 B1" RESET_LIST=3D"A1/A2/A3 A1/A2 A1 B1" I=3D0 @@ -1003,7 +1000,6 @@ run_state_test() run_remote_state_test() { TEST=3D$1 - CONTROLLER=3Dcpuset [[ -d rtest ]] || mkdir rtest cd rtest echo +cpuset > cgroup.subtree_control --=20 2.53.0 From nobody Thu Apr 2 06:10:00 2026 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8E48530F808 for ; Sat, 21 Feb 2026 18:55:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1771700121; cv=none; b=epW+ZP8uuL46XfbfP4BmrAnULkbjgSlDuVpZEkalFcQ1aR5Bqf+4q7LJ0l5GECxu1mGWzo2UgS9fwKTPftXXjBc13MeCXrSOOoTXH2/7iXxquxbOJH//aeWClC7FlkezSZ5rVtrcvPQGARMP8XaTGShZrjWpXD8bRpE1FRg+sMw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1771700121; c=relaxed/simple; 
bh=Dm8ouI4wcPuZG9a1WrcqgvvWhTFB3pxL4rOFUvi12V8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=QN2s2HeYKazaIrnGrzvcrSe/9/ldHH1SyZgu4fcbDjd2/TjJrbXvtA+ZGaJwbmlr57dsK/0aS5zlZhzrkPl0KsWSgWt6mXXED2gMYZSSLUh84OU9qs9m3lICjmEmV8e4RhtM50g1Vi7PtvO6dVHXkQVCBRTjFYwvI8bOGeTZ/mQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=APHzY4o5; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="APHzY4o5" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1771700118; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=52VEDWZxrhdgx5iTzfOlzrgMprmkKIqMic5dXsBw/Oo=; b=APHzY4o50XT+mS7fmUxKmplT08HzHNTchbXjj2nTL9DZGy+XVtDt3eBF+WLTCYW9JZDe7L 0x0LJccW9rZGstpvdDrvm3Rs5JavPz7e+qggwlA/K27jJ7lbpXfLS4zFOCIolI29yDFTXq ht7xz9JFBbONuRfYM268SmwxU5l+SEE= Received: from mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-138-hDjBKlOUPs6aCv5cB5W2Cw-1; Sat, 21 Feb 2026 13:55:13 -0500 X-MC-Unique: hDjBKlOUPs6aCv5cB5W2Cw-1 X-Mimecast-MFC-AGG-ID: hDjBKlOUPs6aCv5cB5W2Cw_1771700111 Received: from mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.17]) (using TLSv1.3 with 
cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 260EA19560A3; Sat, 21 Feb 2026 18:55:11 +0000 (UTC) Received: from llong-thinkpadp16vgen1.westford.csb (unknown [10.2.16.15]) by mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id B9F891955D85; Sat, 21 Feb 2026 18:55:06 +0000 (UTC) From: Waiman Long To: Chen Ridong , Tejun Heo , Johannes Weiner , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Steven Rostedt , Ben Segall , Mel Gorman , Valentin Schneider , Frederic Weisbecker , Thomas Gleixner , Shuah Khan Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, Waiman Long Subject: [PATCH v6 6/8] cgroup/cpuset: Move housekeeping_update()/rebuild_sched_domains() together Date: Sat, 21 Feb 2026 13:54:16 -0500 Message-ID: <20260221185418.29319-7-longman@redhat.com> In-Reply-To: <20260221185418.29319-1-longman@redhat.com> References: <20260221185418.29319-1-longman@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.0 on 10.30.177.17 Content-Type: text/plain; charset="utf-8" With the latest changes in sched/isolation.c, rebuild_sched_domains*() requires the HK_TYPE_DOMAIN housekeeping cpumask to be properly updated first, if needed, before the sched domains can be rebuilt. So the two naturally fit together. Do that by creating a new update_hk_sched_domains() helper to house both actions. The name of the isolated_cpus_updating flag to control the call to housekeeping_update() is now outdated. So change it to update_housekeeping to better reflect its purpose. 
Also move the call to update_hk_sched_domains() to the end of cpuset and hotplug operations before releasing the cpuset_mutex. Signed-off-by: Waiman Long Acked-by: Frederic Weisbecker --- kernel/cgroup/cpuset.c | 51 ++++++++++++++++++++---------------------- 1 file changed, 24 insertions(+), 27 deletions(-) diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c index 05adf6697030..3d0d18bf182f 100644 --- a/kernel/cgroup/cpuset.c +++ b/kernel/cgroup/cpuset.c @@ -130,10 +130,9 @@ static cpumask_var_t subpartitions_cpus; /* RWCS */ static cpumask_var_t isolated_cpus; /* CSCB */ =20 /* - * Set if isolated_cpus is being updated in the current cpuset_mutex - * critical section. + * Set if housekeeping cpumasks are to be updated. */ -static bool isolated_cpus_updating; /* RWCS */ +static bool update_housekeeping; /* RWCS */ =20 /* * A flag to force sched domain rebuild at the end of an operation. @@ -1188,7 +1187,7 @@ static void isolated_cpus_update(int old_prs, int new= _prs, struct cpumask *xcpus return; cpumask_andnot(isolated_cpus, isolated_cpus, xcpus); } - isolated_cpus_updating =3D true; + update_housekeeping =3D true; } =20 /* @@ -1306,22 +1305,22 @@ static bool prstate_housekeeping_conflict(int prsta= te, struct cpumask *new_cpus) } =20 /* - * update_isolation_cpumasks - Update external isolation related CPU masks + * update_hk_sched_domains - Update HK cpumasks & rebuild sched domains * - * The following external CPU masks will be updated if necessary: - * - workqueue unbound cpumask + * Update housekeeping cpumasks and rebuild sched domains if necessary. + * This should be called at the end of cpuset or hotplug actions. 
*/ -static void update_isolation_cpumasks(void) +static void update_hk_sched_domains(void) { - int ret; - - if (!isolated_cpus_updating) - return; - - ret =3D housekeeping_update(isolated_cpus); - WARN_ON_ONCE(ret < 0); - - isolated_cpus_updating =3D false; + if (update_housekeeping) { + /* Updating HK cpumasks implies rebuild sched domains */ + WARN_ON_ONCE(housekeeping_update(isolated_cpus)); + update_housekeeping =3D false; + force_sd_rebuild =3D true; + } + /* force_sd_rebuild will be cleared in rebuild_sched_domains_locked() */ + if (force_sd_rebuild) + rebuild_sched_domains_locked(); } =20 /** @@ -1472,7 +1471,6 @@ static int remote_partition_enable(struct cpuset *cs,= int new_prs, cs->remote_partition =3D true; cpumask_copy(cs->effective_xcpus, tmp->new_cpus); spin_unlock_irq(&callback_lock); - update_isolation_cpumasks(); cpuset_force_rebuild(); cs->prs_err =3D 0; =20 @@ -1517,7 +1515,6 @@ static void remote_partition_disable(struct cpuset *c= s, struct tmpmasks *tmp) compute_excpus(cs, cs->effective_xcpus); reset_partition_data(cs); spin_unlock_irq(&callback_lock); - update_isolation_cpumasks(); cpuset_force_rebuild(); =20 /* @@ -1588,7 +1585,6 @@ static void remote_cpus_update(struct cpuset *cs, str= uct cpumask *xcpus, if (xcpus) cpumask_copy(cs->exclusive_cpus, xcpus); spin_unlock_irq(&callback_lock); - update_isolation_cpumasks(); if (adding || deleting) cpuset_force_rebuild(); =20 @@ -1932,7 +1928,6 @@ static int update_parent_effective_cpumask(struct cpu= set *cs, int cmd, partition_xcpus_add(new_prs, parent, tmp->delmask); =20 spin_unlock_irq(&callback_lock); - update_isolation_cpumasks(); =20 if ((old_prs !=3D new_prs) && (cmd =3D=3D partcmd_update)) update_partition_exclusive_flag(cs, new_prs); @@ -2900,7 +2895,6 @@ static int update_prstate(struct cpuset *cs, int new_= prs) else if (isolcpus_updated) isolated_cpus_update(old_prs, new_prs, cs->effective_xcpus); spin_unlock_irq(&callback_lock); - update_isolation_cpumasks(); =20 /* Force update if 
switching back to member & update effective_xcpus */ update_cpumasks_hier(cs, &tmpmask, !new_prs); @@ -3190,9 +3184,8 @@ ssize_t cpuset_write_resmask(struct kernfs_open_file = *of, } =20 free_cpuset(trialcs); - if (force_sd_rebuild) - rebuild_sched_domains_locked(); out_unlock: + update_hk_sched_domains(); cpuset_full_unlock(); if (of_cft(of)->private =3D=3D FILE_MEMLIST) schedule_flush_migrate_mm(); @@ -3300,6 +3293,7 @@ static ssize_t cpuset_partition_write(struct kernfs_o= pen_file *of, char *buf, cpuset_full_lock(); if (is_cpuset_online(cs)) retval =3D update_prstate(cs, val); + update_hk_sched_domains(); cpuset_full_unlock(); return retval ?: nbytes; } @@ -3474,6 +3468,7 @@ static void cpuset_css_killed(struct cgroup_subsys_st= ate *css) /* Reset valid partition back to member */ if (is_partition_valid(cs)) update_prstate(cs, PRS_MEMBER); + update_hk_sched_domains(); cpuset_full_unlock(); } =20 @@ -3881,10 +3876,12 @@ static void cpuset_handle_hotplug(void) rcu_read_unlock(); } =20 - /* rebuild sched domains if necessary */ - if (force_sd_rebuild) - rebuild_sched_domains_cpuslocked(); =20 + if (update_housekeeping || force_sd_rebuild) { + mutex_lock(&cpuset_mutex); + update_hk_sched_domains(); + mutex_unlock(&cpuset_mutex); + } free_tmpmasks(ptmp); } =20 --=20 2.53.0 From nobody Thu Apr 2 06:10:00 2026 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3091130E837 for ; Sat, 21 Feb 2026 18:55:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1771700126; cv=none; b=KeGtR94Ghw7hAlzFjj0xvYAdg0PE/HB6s8bJeOmmFHwVyMqxmo+FFG8ybeS/+ob0xcdpM3SE5Jh3TyRybr4Bvo+iC/AnW0Hxz3HCAUnob9DtrevsX69u/waXZ1VTz9XuzKl/apfS7uCM5IuWAnVYA0KTKdPvqlKBEVxxcsdnxoo= 
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1771700126; c=relaxed/simple; bh=zj1nzFsQKy7G9JLp3bdrBW08Nt2BFrGjPQK/n7+/h8E=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=XluY2n4IZFUX5cUd6tOykh8Mw1KffhK6vHPgSjQ9ojcULJnwRA8NCaT2UDwKqPPDCmF7JxBI/8hhUSKWkWr5ua7xFORKJhELEMO+yXD/0p86eK53lBvbfEsUVm3QdySYBo3m8hRw3s2yb+VZQXOW1rmoc3ZRU5w17zwcBct+jYk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=h4gvZ/WB; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="h4gvZ/WB" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1771700124; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=xKg+N2bAOrd2PMcoX6bAcqgeV9uYlmhLKffPR6U/E6o=; b=h4gvZ/WBMnsFlJUUlW94dAwnu5z9lNLCJ5qpqHTx5SFRCJSZFE91Ye7FBmu3wtTILYBeVH nOzoShBBbJE7jX+4zIevkjSP4lsioelUdz1IqlUby1T/boQwTcThy0G9GpCXoxqihfVs0G E8ew6NtgwJ+nCGUl3FIKxA2RAfbqZ0E= Received: from mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-316-oGt6ycoXM6Gzk6jQt95DkA-1; Sat, 21 Feb 2026 13:55:18 -0500 X-MC-Unique: oGt6ycoXM6Gzk6jQt95DkA-1 X-Mimecast-MFC-AGG-ID: oGt6ycoXM6Gzk6jQt95DkA_1771700117 Received: from 
mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.17]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 42E7A1956052; Sat, 21 Feb 2026 18:55:16 +0000 (UTC) Received: from llong-thinkpadp16vgen1.westford.csb (unknown [10.2.16.15]) by mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id ADF4A1955D85; Sat, 21 Feb 2026 18:55:11 +0000 (UTC) From: Waiman Long To: Chen Ridong , Tejun Heo , Johannes Weiner , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Steven Rostedt , Ben Segall , Mel Gorman , Valentin Schneider , Frederic Weisbecker , Thomas Gleixner , Shuah Khan Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, Waiman Long Subject: [PATCH v6 7/8] cgroup/cpuset: Defer housekeeping_update() calls from CPU hotplug to workqueue Date: Sat, 21 Feb 2026 13:54:17 -0500 Message-ID: <20260221185418.29319-8-longman@redhat.com> In-Reply-To: <20260221185418.29319-1-longman@redhat.com> References: <20260221185418.29319-1-longman@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.0 on 10.30.177.17 Content-Type: text/plain; charset="utf-8" The cpuset_handle_hotplug() may need to invoke housekeeping_update(), for instance, when an isolated partition is invalidated because its last active CPU has been put offline. 
As we are going to enable dynamic update to the nohz_full housekeeping cpumask (HK_TYPE_KERNEL_NOISE) soon with the help of CPU hotplug, allowing the CPU hotplug path to call into housekeeping_update() directly from update_isolation_cpumasks() will likely cause a deadlock. So we have to defer any call to housekeeping_update() until after the CPU hotplug operation has finished. This is now done via the workqueue where the update_hk_sched_domains() function will be invoked via the hk_sd_workfn(). A concurrent cpuset control file write may have executed the required update_hk_sched_domains() function before the work function is called. So the work function call may become a no-op when it is invoked. Signed-off-by: Waiman Long Tested-by: Jon Hunter --- kernel/cgroup/cpuset.c | 31 ++++++++++++++++--- .../selftests/cgroup/test_cpuset_prs.sh | 11 ++++++- 2 files changed, 36 insertions(+), 6 deletions(-) diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c index 3d0d18bf182f..2c80bfc30bbc 100644 --- a/kernel/cgroup/cpuset.c +++ b/kernel/cgroup/cpuset.c @@ -1323,6 +1323,16 @@ static void update_hk_sched_domains(void) rebuild_sched_domains_locked(); } =20 +/* + * Work function to invoke update_hk_sched_domains() + */ +static void hk_sd_workfn(struct work_struct *work) +{ + cpuset_full_lock(); + update_hk_sched_domains(); + cpuset_full_unlock(); +} + /** * rm_siblings_excl_cpus - Remove exclusive CPUs that are used by sibling = cpusets * @parent: Parent cpuset containing all siblings @@ -3795,6 +3805,7 @@ static void cpuset_hotplug_update_tasks(struct cpuset= *cs, struct tmpmasks *tmp) */ static void cpuset_handle_hotplug(void) { + static DECLARE_WORK(hk_sd_work, hk_sd_workfn); static cpumask_t new_cpus; static nodemask_t new_mems; bool cpus_updated, mems_updated; @@ -3877,11 +3888,21 @@ static void cpuset_handle_hotplug(void) } =20 =20 - if (update_housekeeping || force_sd_rebuild) { - mutex_lock(&cpuset_mutex); - update_hk_sched_domains(); - mutex_unlock(&cpuset_mutex); -
} + /* + * Queue a work to call housekeeping_update() & rebuild_sched_domains() + * There will be a slight delay before the HK_TYPE_DOMAIN housekeeping + * cpumask can correctly reflect what is in isolated_cpus. + * + * We rely on WORK_STRUCT_PENDING_BIT to not requeue a work item that + * is still pending. Before the pending bit is cleared, the work data + * is copied out and work item dequeued. So it is possible to queue + * the work again before the hk_sd_workfn() is invoked to process the + * previously queued work. Since hk_sd_workfn() doesn't use the work + * item at all, this is not a problem. + */ + if (update_housekeeping || force_sd_rebuild) + queue_work(system_unbound_wq, &hk_sd_work); + free_tmpmasks(ptmp); } =20 diff --git a/tools/testing/selftests/cgroup/test_cpuset_prs.sh b/tools/test= ing/selftests/cgroup/test_cpuset_prs.sh index 0c5db118f2d1..dc2dff361ec6 100755 --- a/tools/testing/selftests/cgroup/test_cpuset_prs.sh +++ b/tools/testing/selftests/cgroup/test_cpuset_prs.sh @@ -246,6 +246,9 @@ TEST_MATRIX=3D( " C2-3:P1 C3:P1 . . O3=3D0 . . . 0 A1:2|A2:= A1:P1|A2:P1" " C2-3:P1 C3:P1 . . T:O2=3D0 . . . 0 A1:3|A2:= 3 A1:P1|A2:P-1" " C2-3:P1 C3:P1 . . . T:O3=3D0 . . 0 A1:2|A2:= 2 A1:P1|A2:P-1" + " C2-3:P1 C3:P2 . . T:O2=3D0 . . . 0 A1:3|A2:= 3 A1:P1|A2:P-2" + " C1-3:P1 C3:P2 . . . T:O3=3D0 . . 0 A1:1-2|A= 2:1-2 A1:P1|A2:P-2 3|" + " C1-3:P1 C3:P2 . . . T:O3=3D0 O3=3D1 . 0 A1:1-2= |A2:3 A1:P1|A2:P2 3" "$SETUP_A123_PARTITIONS . O1=3D0 . . . 0 A1:|A2:2= |A3:3 A1:P1|A2:P1|A3:P1" "$SETUP_A123_PARTITIONS . O2=3D0 . . . 0 A1:1|A2:= |A3:3 A1:P1|A2:P1|A3:P1" "$SETUP_A123_PARTITIONS . O3=3D0 . . . 0 A1:1|A2:= 2|A3: A1:P1|A2:P1|A3:P1" @@ -762,7 +765,7 @@ check_cgroup_states() # only CPUs in isolated partitions as well as those that are isolated at # boot time. 
# -# $1 - expected isolated cpu list(s) {,} +# $1 - expected isolated cpu list(s) {|} # - expected sched/domains value # - cpuset.cpus.isolated value =3D if not defined # @@ -771,6 +774,7 @@ check_isolcpus() EXPECTED_ISOLCPUS=3D$1 ISCPUS=3D${CGROUP2}/cpuset.cpus.isolated ISOLCPUS=3D$(cat $ISCPUS) + HKICPUS=3D$(cat /sys/devices/system/cpu/isolated) LASTISOLCPU=3D SCHED_DOMAINS=3D/sys/kernel/debug/sched/domains if [[ $EXPECTED_ISOLCPUS =3D . ]] @@ -808,6 +812,11 @@ check_isolcpus() ISOLCPUS=3D EXPECTED_ISOLCPUS=3D$EXPECTED_SDOMAIN =20 + # + # The inverse of HK_TYPE_DOMAIN cpumask in $HKICPUS should match $ISOLCPUS + # + [[ "$ISOLCPUS" !=3D "$HKICPUS" ]] && return 1 + # # Use the sched domain in debugfs to check isolated CPUs, if available # --=20 2.53.0 From nobody Thu Apr 2 06:10:00 2026 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 87CA1313534 for ; Sat, 21 Feb 2026 18:55:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1771700129; cv=none; b=N9XUyvqMneUdw2Aexvb4uaNN9jDzohsnVrPJysqsaI9o//yYeqpBWQ4yMa5y4Ng7KDHq9MuR0aXROlCkPXwuVEKW7SR215E1pJC3e78obDzlTmajeJ4et6UPQplMOOYCITLz1PeGb090v6xgVMSl+R1w5lOl+hsCnw06ABU5YIE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1771700129; c=relaxed/simple; bh=sCsHRcyKcERjUNKeUpJE7elYWPVKzxonGlxyHXoVLTA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=CVBcS1hvHsWV/wS/ChHcNrSOwy3Dsk4Y0cIjLMiiXRbkOhrcRXYSq8/SMkBRr+0BkIZcqIfiClxSq9RtS2g73BbbzlawdQj0gZ2vcHe+PYbp1vPGKHe7hIsSdp0uTeKNBDBiSBMtpHQwXS8wR+T+WO1qKSNGYT0dO69aOtaVfp4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com; 
spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=grj9Qp0R; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="grj9Qp0R" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1771700126; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=k4vR+Shf01bTPXjClL10ebMyrEpjUBxnF56CNK7mE1M=; b=grj9Qp0R/3yQdsydDM5IVetZZRp4HuX5uamvYNh9OTomwh7MWusOmacTFeeCDwpZHUCDuO aK8HPsL334u9PLUf3CazDPvITUFHjZnshOokEAOVBD/WjIcYjezTwVHNAaqyyxu7Fqhqsn uxjCSk1Tuy5guSqfC3WGFQlvCkIUaZ0= Received: from mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-460-9Of_0gChN2a-XvmWHzG-ig-1; Sat, 21 Feb 2026 13:55:23 -0500 X-MC-Unique: 9Of_0gChN2a-XvmWHzG-ig-1 X-Mimecast-MFC-AGG-ID: 9Of_0gChN2a-XvmWHzG-ig_1771700121 Received: from mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.17]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id EFCB31956088; Sat, 21 Feb 2026 18:55:20 +0000 (UTC) Received: from llong-thinkpadp16vgen1.westford.csb (unknown [10.2.16.15]) by 
mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 9FE18195419E; Sat, 21 Feb 2026 18:55:16 +0000 (UTC) From: Waiman Long To: Chen Ridong , Tejun Heo , Johannes Weiner , =?UTF-8?q?Michal=20Koutn=C3=BD?= , Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Steven Rostedt , Ben Segall , Mel Gorman , Valentin Schneider , Frederic Weisbecker , Thomas Gleixner , Shuah Khan Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, Waiman Long Subject: [PATCH v6 8/8] cgroup/cpuset: Call housekeeping_update() without holding cpus_read_lock Date: Sat, 21 Feb 2026 13:54:18 -0500 Message-ID: <20260221185418.29319-9-longman@redhat.com> In-Reply-To: <20260221185418.29319-1-longman@redhat.com> References: <20260221185418.29319-1-longman@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.0 on 10.30.177.17 Content-Type: text/plain; charset="utf-8" The current cpuset partition code is able to dynamically update the sched domains of a running system and the corresponding HK_TYPE_DOMAIN housekeeping cpumask to perform what is essentially the "isolcpus=3Ddomain,..." boot command line feature at run time. The housekeeping cpumask update requires flushing a number of different workqueues which may not be safe with cpus_read_lock() held as the workqueue flushing code may acquire cpus_read_lock() or acquire locks which have a locking dependency on cpus_read_lock() down the chain. Below is an example of such a circular locking problem.
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D WARNING: possible circular locking dependency detected 6.18.0-test+ #2 Tainted: G S ------------------------------------------------------ test_cpuset_prs/10971 is trying to acquire lock: ffff888112ba4958 ((wq_completion)sync_wq){+.+.}-{0:0}, at: touch_wq_lockd= ep_map+0x7a/0x180 but task is already holding lock: ffffffffae47f450 (cpuset_mutex){+.+.}-{4:4}, at: cpuset_partition_write+0= x85/0x130 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #4 (cpuset_mutex){+.+.}-{4:4}: -> #3 (cpu_hotplug_lock){++++}-{0:0}: -> #2 (rtnl_mutex){+.+.}-{4:4}: -> #1 ((work_completion)(&arg.work)){+.+.}-{0:0}: -> #0 ((wq_completion)sync_wq){+.+.}-{0:0}: Chain exists of: (wq_completion)sync_wq --> cpu_hotplug_lock --> cpuset_mutex 5 locks held by test_cpuset_prs/10971: #0: ffff88816810e440 (sb_writers#7){.+.+}-{0:0}, at: ksys_write+0xf9/0x1= d0 #1: ffff8891ab620890 (&of->mutex#2){+.+.}-{4:4}, at: kernfs_fop_write_it= er+0x260/0x5f0 #2: ffff8890a78b83e8 (kn->active#187){.+.+}-{0:0}, at: kernfs_fop_write_= iter+0x2b6/0x5f0 #3: ffffffffadf32900 (cpu_hotplug_lock){++++}-{0:0}, at: cpuset_partitio= n_write+0x77/0x130 #4: ffffffffae47f450 (cpuset_mutex){+.+.}-{4:4}, at: cpuset_partition_wr= ite+0x85/0x130 Call Trace: : touch_wq_lockdep_map+0x93/0x180 __flush_workqueue+0x111/0x10b0 housekeeping_update+0x12d/0x2d0 update_parent_effective_cpumask+0x595/0x2440 update_prstate+0x89d/0xce0 cpuset_partition_write+0xc5/0x130 cgroup_file_write+0x1a5/0x680 kernfs_fop_write_iter+0x3df/0x5f0 vfs_write+0x525/0xfd0 ksys_write+0xf9/0x1d0 do_syscall_64+0x95/0x520 entry_SYSCALL_64_after_hwframe+0x76/0x7e To avoid such a circular locking dependency problem, we have to call housekeeping_update() without holding the cpus_read_lock() and cpuset_mutex. 
The current set of wq's flushed by housekeeping_update() may not have work functions that call cpus_read_lock() directly, but we are likely to extend the list of wq's that are flushed in the future. Moreover, the current set of work functions may hold locks that may have cpu_hotplug_lock down the dependency chain.

So housekeeping_update() is now called after releasing cpus_read_lock and cpuset_mutex at the end of a cpuset operation. These two locks are then re-acquired later before calling rebuild_sched_domains_locked().

To enable mutual exclusion between the housekeeping_update() call and other cpuset control file write actions, a new top level cpuset_top_mutex is introduced. This new mutex will be acquired first to allow sharing variables used by both code paths. However, cpuset update from CPU hotplug can still happen in parallel with the housekeeping_update() call, though that should be rare in a production environment.

As cpus_read_lock() is now no longer held when tmigr_isolated_exclude_cpumask() is called, it needs to acquire it directly.

The lockdep_is_cpuset_held() is also updated to return true if either cpuset_top_mutex or cpuset_mutex is held.
Fixes: 03ff73510169 ("cpuset: Update HK_TYPE_DOMAIN cpumask from cpuset") Signed-off-by: Waiman Long --- kernel/cgroup/cpuset.c | 47 +++++++++++++++++++++++++++++++---- kernel/sched/isolation.c | 4 +-- kernel/time/timer_migration.c | 4 +-- 3 files changed, 44 insertions(+), 11 deletions(-) diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c index 2c80bfc30bbc..dbda09391b19 100644 --- a/kernel/cgroup/cpuset.c +++ b/kernel/cgroup/cpuset.c @@ -65,14 +65,28 @@ static const char * const perr_strings[] =3D { * CPUSET Locking Convention * ------------------------- * - * Below are the three global locks guarding cpuset structures in lock + * Below are the four global/local locks guarding cpuset structures in lock * acquisition order: + * - cpuset_top_mutex * - cpu_hotplug_lock (cpus_read_lock/cpus_write_lock) * - cpuset_mutex * - callback_lock (raw spinlock) * - * A task must hold all the three locks to modify externally visible or - * used fields of cpusets, though some of the internally used cpuset fields + * As cpuset will now indirectly flush a number of different workqueues in + * housekeeping_update() to update housekeeping cpumasks when the set of + * isolated CPUs is going to be changed, it may be vulnerable to deadlock + * if we hold cpus_read_lock while calling into housekeeping_update(). + * + * The first cpuset_top_mutex will be held except when calling into + * cpuset_handle_hotplug() from the CPU hotplug code where cpus_write_lock + * and cpuset_mutex will be held instead. The main purpose of this mutex + * is to prevent regular cpuset control file write actions from interfering + * with the call to housekeeping_update(), though CPU hotplug operation can + * still happen in parallel. This mutex also provides protection for some + * internal variables. 
+ * + * A task must hold all the remaining three locks to modify externally vis= ible + * or used fields of cpusets, though some of the internally used cpuset fi= elds * and internal variables can be modified without holding callback_lock. I= f only * reliable read access of the externally used fields are needed, a task c= an * hold either cpuset_mutex or callback_lock which are exposed to other @@ -100,6 +114,7 @@ static const char * const perr_strings[] =3D { * cpumasks and nodemasks. */ =20 +static DEFINE_MUTEX(cpuset_top_mutex); static DEFINE_MUTEX(cpuset_mutex); =20 /* @@ -111,6 +126,8 @@ static DEFINE_MUTEX(cpuset_mutex); * * CSCB: Readable by holding either cpuset_mutex or callback_lock. Writable * by holding both cpuset_mutex and callback_lock. + * + * T: Read/write-able by holding the cpuset_top_mutex. */ =20 /* @@ -134,6 +151,11 @@ static cpumask_var_t isolated_cpus; /* CSCB */ */ static bool update_housekeeping; /* RWCS */ =20 +/* + * Copy of isolated_cpus to be passed to housekeeping_update() + */ +static cpumask_var_t isolated_hk_cpus; /* T */ + /* * A flag to force sched domain rebuild at the end of an operation. 
* It can be set in @@ -297,6 +319,7 @@ void lockdep_assert_cpuset_lock_held(void) */ void cpuset_full_lock(void) { + mutex_lock(&cpuset_top_mutex); cpus_read_lock(); mutex_lock(&cpuset_mutex); } @@ -305,12 +328,14 @@ void cpuset_full_unlock(void) { mutex_unlock(&cpuset_mutex); cpus_read_unlock(); + mutex_unlock(&cpuset_top_mutex); } =20 #ifdef CONFIG_LOCKDEP bool lockdep_is_cpuset_held(void) { - return lockdep_is_held(&cpuset_mutex); + return lockdep_is_held(&cpuset_mutex) || + lockdep_is_held(&cpuset_top_mutex); } #endif =20 @@ -1314,9 +1339,20 @@ static void update_hk_sched_domains(void) { if (update_housekeeping) { /* Updating HK cpumasks implies rebuild sched domains */ - WARN_ON_ONCE(housekeeping_update(isolated_cpus)); update_housekeeping =3D false; force_sd_rebuild =3D true; + cpumask_copy(isolated_hk_cpus, isolated_cpus); + + /* + * housekeeping_update() is now called without holding + * cpus_read_lock and cpuset_mutex. Only cpuset_top_mutex + * is still being held for mutual exclusion.
+ */ + mutex_unlock(&cpuset_mutex); + cpus_read_unlock(); + WARN_ON_ONCE(housekeeping_update(isolated_hk_cpus)); + cpus_read_lock(); + mutex_lock(&cpuset_mutex); } /* force_sd_rebuild will be cleared in rebuild_sched_domains_locked() */ if (force_sd_rebuild) @@ -3634,6 +3670,7 @@ int __init cpuset_init(void) BUG_ON(!alloc_cpumask_var(&top_cpuset.exclusive_cpus, GFP_KERNEL)); BUG_ON(!zalloc_cpumask_var(&subpartitions_cpus, GFP_KERNEL)); BUG_ON(!zalloc_cpumask_var(&isolated_cpus, GFP_KERNEL)); + BUG_ON(!zalloc_cpumask_var(&isolated_hk_cpus, GFP_KERNEL)); =20 cpumask_setall(top_cpuset.cpus_allowed); nodes_setall(top_cpuset.mems_allowed); diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c index 3b725d39c06e..ef152d401fe2 100644 --- a/kernel/sched/isolation.c +++ b/kernel/sched/isolation.c @@ -123,8 +123,6 @@ int housekeeping_update(struct cpumask *isol_mask) struct cpumask *trial, *old =3D NULL; int err; =20 - lockdep_assert_cpus_held(); - trial =3D kmalloc(cpumask_size(), GFP_KERNEL); if (!trial) return -ENOMEM; @@ -136,7 +134,7 @@ int housekeeping_update(struct cpumask *isol_mask) } =20 if (!housekeeping.flags) - static_branch_enable_cpuslocked(&housekeeping_overridden); + static_branch_enable(&housekeeping_overridden); =20 if (housekeeping.flags & HK_FLAG_DOMAIN) old =3D housekeeping_cpumask_dereference(HK_TYPE_DOMAIN); diff --git a/kernel/time/timer_migration.c b/kernel/time/timer_migration.c index 6da9cd562b20..83428aa03aef 100644 --- a/kernel/time/timer_migration.c +++ b/kernel/time/timer_migration.c @@ -1559,8 +1559,6 @@ int tmigr_isolated_exclude_cpumask(struct cpumask *ex= clude_cpumask) cpumask_var_t cpumask __free(free_cpumask_var) =3D CPUMASK_VAR_NULL; int cpu; =20 - lockdep_assert_cpus_held(); - if (!works) return -ENOMEM; if (!alloc_cpumask_var(&cpumask, GFP_KERNEL)) @@ -1570,6 +1568,7 @@ int tmigr_isolated_exclude_cpumask(struct cpumask *ex= clude_cpumask) * First set previously isolated CPUs as available (unisolate). 
* This cpumask contains only CPUs that switched to available now. */ + guard(cpus_read_lock)(); cpumask_andnot(cpumask, cpu_online_mask, exclude_cpumask); cpumask_andnot(cpumask, cpumask, tmigr_available_cpumask); =20 @@ -1626,7 +1625,6 @@ static int __init tmigr_init_isolation(void) cpumask_andnot(cpumask, cpu_possible_mask, housekeeping_cpumask(HK_TYPE_D= OMAIN)); =20 /* Protect against RCU torture hotplug testing */ - guard(cpus_read_lock)(); return tmigr_isolated_exclude_cpumask(cpumask); } late_initcall(tmigr_init_isolation); --=20 2.53.0