From nobody Fri Dec 19 20:33:49 2025
From: Waiman Long
To: Tejun Heo, Zefan Li, Johannes Weiner, Michal Koutný, Shuah Khan
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-kselftest@vger.kernel.org, Chen Ridong, Waiman Long
Subject: [PATCH-cgroup 1/5] cgroup/cpuset: fix panic caused by partcmd_update
Date: Sun, 4 Aug 2024 21:30:15 -0400
Message-ID: <20240805013019.724300-2-longman@redhat.com>
In-Reply-To: <20240805013019.724300-1-longman@redhat.com>
References: <20240805013019.724300-1-longman@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain;
 charset="utf-8"

From: Chen Ridong

We find a bug as below:
BUG: unable to handle page fault for address: 00000003
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 358 Comm: bash Tainted: G W I 6.6.0-10893-g60d6
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/4
RIP: 0010:partition_sched_domains_locked+0x483/0x600
Code: 01 48 85 d2 74 0d 48 83 05 29 3f f8 03 01 f3 48 0f bc c2 89 c0 48 9
RSP: 0018:ffffc90000fdbc58 EFLAGS: 00000202
RAX: 0000000100000003 RBX: ffff888100b3dfa0 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000002fe80
RBP: ffff888100b3dfb0 R08: 0000000000000001 R09: 0000000000000000
R10: ffffc90000fdbcb0 R11: 0000000000000004 R12: 0000000000000002
R13: ffff888100a92b48 R14: 0000000000000000 R15: 0000000000000000
FS: 00007f44a5425740(0000) GS:ffff888237d80000(0000) knlGS:0000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000100030973 CR3: 000000010722c000 CR4: 00000000000006e0
Call Trace:
 ? show_regs+0x8c/0xa0
 ? __die_body+0x23/0xa0
 ? __die+0x3a/0x50
 ? page_fault_oops+0x1d2/0x5c0
 ? partition_sched_domains_locked+0x483/0x600
 ? search_module_extables+0x2a/0xb0
 ? search_exception_tables+0x67/0x90
 ? kernelmode_fixup_or_oops+0x144/0x1b0
 ? __bad_area_nosemaphore+0x211/0x360
 ? up_read+0x3b/0x50
 ? bad_area_nosemaphore+0x1a/0x30
 ? exc_page_fault+0x890/0xd90
 ? __lock_acquire.constprop.0+0x24f/0x8d0
 ? __lock_acquire.constprop.0+0x24f/0x8d0
 ? asm_exc_page_fault+0x26/0x30
 ? partition_sched_domains_locked+0x483/0x600
 ? partition_sched_domains_locked+0xf0/0x600
 rebuild_sched_domains_locked+0x806/0xdc0
 update_partition_sd_lb+0x118/0x130
 cpuset_write_resmask+0xffc/0x1420
 cgroup_file_write+0xb2/0x290
 kernfs_fop_write_iter+0x194/0x290
 new_sync_write+0xeb/0x160
 vfs_write+0x16f/0x1d0
 ksys_write+0x81/0x180
 __x64_sys_write+0x21/0x30
 x64_sys_call+0x2f25/0x4630
 do_syscall_64+0x44/0xb0
 entry_SYSCALL_64_after_hwframe+0x78/0xe2
RIP: 0033:0x7f44a553c887

It can be reproduced with the following commands:
cd /sys/fs/cgroup/
mkdir test
cd test/
echo +cpuset > ../cgroup.subtree_control
echo root > cpuset.cpus.partition
cat /sys/fs/cgroup/cpuset.cpus.effective
0-3
echo 0-3 > cpuset.cpus // taking away all cpus from root

This issue is caused by the incorrect rebuilding of scheduling domains.
In this scenario, test/cpuset.cpus.partition should be an invalid root
and should not trigger the rebuilding of scheduling domains. When
update_parent_effective_cpumask() is called with partcmd_update and
newmask is not NULL, it should recheck newmask to see whether any CPUs
remain available for a parent/cs that has tasks.

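The condition being rechecked can be illustrated with a minimal userspace
sketch. This is not the kernel code: it assumes CPU masks fit in one 64-bit
word, it reduces "has tasks" to a single flag, and the names toy_cpuset and
toy_tasks_nocpu_error are hypothetical. The real tasks_nocpu_error() works
on struct cpumask and may differ in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model: one bit per CPU, not the kernel's struct cpuset. */
struct toy_cpuset {
	uint64_t effective_cpus;	/* CPUs currently usable by this cpuset */
	bool has_tasks;			/* true if tasks are attached */
};

/*
 * Return true if handing xcpus to the child partition would leave a
 * populated cpuset without any CPU (assumed, simplified rule):
 *  - the parent has tasks and all of its effective CPUs are in xcpus, or
 *  - the child has tasks and none of xcpus is active.
 */
static bool toy_tasks_nocpu_error(const struct toy_cpuset *parent,
				  const struct toy_cpuset *cs,
				  uint64_t xcpus, uint64_t active_cpus)
{
	bool parent_starved, child_starved;

	assert(parent && cs);

	parent_starved = parent->has_tasks &&
			 (parent->effective_cpus & ~xcpus) == 0;
	child_starved = cs->has_tasks && (xcpus & active_cpus) == 0;

	return parent_starved || child_starved;
}
```

In the reproducer, the root cpuset has tasks and effective CPUs 0-3, so a
request for 0-3 by the child makes the sketch report an error, while a
request for 0-1 does not.
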
Fixes: 0c7f293efc87 ("cgroup/cpuset: Add cpuset.cpus.exclusive.effective for v2")
Signed-off-by: Chen Ridong
Signed-off-by: Waiman Long
---
 kernel/cgroup/cpuset.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 9066f9b4af24..f1846a08e245 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -1978,6 +1978,8 @@ static int update_parent_effective_cpumask(struct cpuset *cs, int cmd,
 			part_error = PERR_CPUSEMPTY;
 			goto write_error;
 		}
+		/* Check newmask again, whether cpus are available for parent/cs */
+		nocpu |= tasks_nocpu_error(parent, cs, newmask);
 
 		/*
 		 * partcmd_update with newmask:
-- 
2.43.5

From nobody Fri Dec 19 20:33:49 2025
From: Waiman Long
To: Tejun Heo, Zefan Li, Johannes Weiner, Michal Koutný, Shuah Khan
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-kselftest@vger.kernel.org, Chen Ridong, Waiman Long
Subject: [PATCH-cgroup 2/5] cgroup/cpuset: Clear effective_xcpus on cpus_allowed clearing only if cpus.exclusive not set
Date: Sun, 4 Aug 2024 21:30:16 -0400
Message-ID: <20240805013019.724300-3-longman@redhat.com>
In-Reply-To: <20240805013019.724300-1-longman@redhat.com>
References: <20240805013019.724300-1-longman@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Commit e2ffe502ba45 ("cgroup/cpuset: Add cpuset.cpus.exclusive for
v2") adds a user writable cpuset.cpus.exclusive file for setting
exclusive CPUs to be used for the creation of partitions. Since then
effective_xcpus depends on both the cpuset.cpus and cpuset.cpus.exclusive
setting. If cpuset.cpus.exclusive is set, effective_xcpus will depend
only on cpuset.cpus.exclusive. When it is not set, effective_xcpus
will be set according to the cpuset.cpus value when the cpuset becomes
a valid partition root.

When cpuset.cpus is being cleared by the user, effective_xcpus should
only be cleared when cpuset.cpus.exclusive is not set. However, that
is not currently the case.

  # cd /sys/fs/cgroup/
  # mkdir test
  # echo +cpuset > cgroup.subtree_control
  # cd test
  # echo 3 > cpuset.cpus.exclusive
  # cat cpuset.cpus.exclusive.effective
  3
  # echo > cpuset.cpus
  # cat cpuset.cpus.exclusive.effective // was cleared

Fix it by clearing effective_xcpus only if cpuset.cpus.exclusive is
not set.

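The intended rule can be sketched in a few lines of userspace C. This is
only a model under stated assumptions: masks are 64-bit words rather than
struct cpumask, and toy_xcpuset and toy_clear_cpus are hypothetical names
that do not exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy model: one bit per CPU, not the kernel's struct cpuset. */
struct toy_xcpuset {
	uint64_t cpus_allowed;		/* models cpuset.cpus */
	uint64_t exclusive_cpus;	/* models cpuset.cpus.exclusive */
	uint64_t effective_xcpus;	/* models cpuset.cpus.exclusive.effective */
};

/*
 * Model of writing an empty string to cpuset.cpus after the fix:
 * effective_xcpus is cleared only when cpuset.cpus.exclusive is unset.
 */
static void toy_clear_cpus(struct toy_xcpuset *cs)
{
	assert(cs);

	cs->cpus_allowed = 0;
	if (cs->exclusive_cpus == 0)
		cs->effective_xcpus = 0;
}
```

With cpuset.cpus.exclusive set to CPU 3, as in the commands above, the
modelled effective_xcpus keeps bit 3 after cpuset.cpus is cleared.
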
Fixes: e2ffe502ba45 ("cgroup/cpuset: Add cpuset.cpus.exclusive for v2")
Reported-by: Chen Ridong
Signed-off-by: Waiman Long
---
 kernel/cgroup/cpuset.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index f1846a08e245..7287cecb27d1 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -2508,7 +2508,8 @@ static int update_cpumask(struct cpuset *cs, struct cpuset *trialcs,
 	 */
 	if (!*buf) {
 		cpumask_clear(trialcs->cpus_allowed);
-		cpumask_clear(trialcs->effective_xcpus);
+		if (cpumask_empty(trialcs->exclusive_cpus))
+			cpumask_clear(trialcs->effective_xcpus);
 	} else {
 		retval = cpulist_parse(buf, trialcs->cpus_allowed);
 		if (retval < 0)
-- 
2.43.5

From nobody Fri Dec 19 20:33:49 2025
From: Waiman Long
To: Tejun Heo, Zefan Li, Johannes Weiner, Michal Koutný, Shuah Khan
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-kselftest@vger.kernel.org, Chen Ridong, Waiman Long
Subject: [PATCH-cgroup 3/5] cgroup/cpuset: Eliminate unncessary sched domains rebuilds in hotplug
Date: Sun, 4 Aug 2024 21:30:17 -0400
Message-ID: <20240805013019.724300-4-longman@redhat.com>
In-Reply-To: <20240805013019.724300-1-longman@redhat.com>
References: <20240805013019.724300-1-longman@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

It was found that some hotplug operations may cause multiple
rebuild_sched_domains_locked() calls. Some of those intermediate calls
may use cpuset states not in the final correct form, leading to incorrect
sched domain settings.

Fix this problem by using the existing force_rebuild flag to inhibit
immediate rebuild_sched_domains_locked() calls if set and only doing
one final call at the end. Also rename the force_rebuild flag to
force_sd_rebuild to make its meaning more clear.

Signed-off-by: Waiman Long
---
 kernel/cgroup/cpuset.c | 33 +++++++++++++++++----------------
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 7287cecb27d1..e070e391d7a8 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -231,6 +231,13 @@ static cpumask_var_t isolated_cpus;
 /* List of remote partition root children */
 static struct list_head remote_children;
 
+/*
+ * A flag to force sched domain rebuild at the end of an operation while
+ * inhibiting it in the intermediate stages when set. Currently it is only
+ * set in hotplug code.
+ */
+static bool force_sd_rebuild;
+
 /*
  * Partition root states:
  *
@@ -1467,7 +1474,7 @@ static void update_partition_sd_lb(struct cpuset *cs, int old_prs)
 			clear_bit(CS_SCHED_LOAD_BALANCE, &cs->flags);
 	}
 
-	if (rebuild_domains)
+	if (rebuild_domains && !force_sd_rebuild)
 		rebuild_sched_domains_locked();
 }
 
@@ -1820,7 +1827,7 @@ static void remote_partition_check(struct cpuset *cs, struct cpumask *newmask,
 		remote_partition_disable(child, tmp);
 		disable_cnt++;
 	}
-	if (disable_cnt)
+	if (disable_cnt && !force_sd_rebuild)
 		rebuild_sched_domains_locked();
 }
 
@@ -2425,7 +2432,8 @@ static void update_cpumasks_hier(struct cpuset *cs, struct tmpmasks *tmp,
 	}
 	rcu_read_unlock();
 
-	if (need_rebuild_sched_domains && !(flags & HIER_NO_SD_REBUILD))
+	if (need_rebuild_sched_domains && !(flags & HIER_NO_SD_REBUILD) &&
+	    !force_sd_rebuild)
 		rebuild_sched_domains_locked();
 }
 
@@ -3087,7 +3095,8 @@ static int update_flag(cpuset_flagbits_t bit, struct cpuset *cs,
 	cs->flags = trialcs->flags;
 	spin_unlock_irq(&callback_lock);
 
-	if (!cpumask_empty(trialcs->cpus_allowed) && balance_flag_changed)
+	if (!cpumask_empty(trialcs->cpus_allowed) && balance_flag_changed &&
+	    !force_sd_rebuild)
 		rebuild_sched_domains_locked();
 
 	if (spread_flag_changed)
@@ -4468,11 +4477,9 @@ hotplug_update_tasks(struct cpuset *cs,
 		update_tasks_nodemask(cs);
 }
 
-static bool force_rebuild;
-
 void cpuset_force_rebuild(void)
 {
-	force_rebuild = true;
+	force_sd_rebuild = true;
 }
 
 /**
@@ -4620,15 +4627,9 @@ static void cpuset_handle_hotplug(void)
 		       !cpumask_empty(subpartitions_cpus);
 	mems_updated = !nodes_equal(top_cpuset.effective_mems, new_mems);
 
-	/*
-	 * In the rare case that hotplug removes all the cpus in
-	 * subpartitions_cpus, we assumed that cpus are updated.
-	 */
-	if (!cpus_updated && !cpumask_empty(subpartitions_cpus))
-		cpus_updated = true;
-
 	/* For v1, synchronize cpus_allowed to cpu_active_mask */
 	if (cpus_updated) {
+		cpuset_force_rebuild();
 		spin_lock_irq(&callback_lock);
 		if (!on_dfl)
 			cpumask_copy(top_cpuset.cpus_allowed, &new_cpus);
@@ -4684,8 +4685,8 @@ static void cpuset_handle_hotplug(void)
 	}
 
 	/* rebuild sched domains if cpus_allowed has changed */
-	if (cpus_updated || force_rebuild) {
-		force_rebuild = false;
+	if (force_sd_rebuild) {
+		force_sd_rebuild = false;
 		rebuild_sched_domains_cpuslocked();
 	}
 
-- 
2.43.5

From nobody Fri Dec 19 20:33:49 2025
From: Waiman Long
To: Tejun Heo, Zefan Li, Johannes Weiner, Michal Koutný, Shuah Khan
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-kselftest@vger.kernel.org, Chen Ridong, Waiman Long
Subject: [PATCH-cgroup 4/5] cgroup/cpuset: Check for partition roots with overlapping CPUs
Date: Sun, 4 Aug 2024 21:30:18 -0400
Message-ID: <20240805013019.724300-5-longman@redhat.com>
In-Reply-To: <20240805013019.724300-1-longman@redhat.com>
References: <20240805013019.724300-1-longman@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

With the previous commit that eliminates the overlapping partition
root corner cases in the hotplug code, the partition roots passed down
to generate_sched_domains() should not have overlapping CPUs. Enable
overlapping cpuset check for v2 and warn if that happens.

This patch also has the benefit of increasing test coverage of the new
Union-Find cpuset merging code to cgroup v2.

Signed-off-by: Waiman Long
---
 kernel/cgroup/cpuset.c | 34 ++++++++++++++++++----------------
 1 file changed, 18 insertions(+), 16 deletions(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index e070e391d7a8..e34fd6108b06 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -1127,25 +1127,27 @@ static int generate_sched_domains(cpumask_var_t **domains,
 	if (root_load_balance && (csn == 1))
 		goto single_root_domain;
 
-	if (!cgrpv2) {
-		for (i = 0; i < csn; i++)
-			uf_node_init(&csa[i]->node);
-
-		/* Merge overlapping cpusets */
-		for (i = 0; i < csn; i++) {
-			for (j = i + 1; j < csn; j++) {
-				if (cpusets_overlap(csa[i], csa[j]))
-					uf_union(&csa[i]->node, &csa[j]->node);
+	for (i = 0; i < csn; i++)
+		uf_node_init(&csa[i]->node);
+
+	/* Merge overlapping cpusets */
+	for (i = 0; i < csn; i++) {
+		for (j = i + 1; j < csn; j++) {
+			if (cpusets_overlap(csa[i], csa[j])) {
+				/*
+				 * Cgroup v2 shouldn't pass down overlapping
+				 * partition root cpusets.
+				 */
+				WARN_ON_ONCE(cgrpv2);
+				uf_union(&csa[i]->node, &csa[j]->node);
 			}
 		}
+	}
 
-		/* Count the total number of domains */
-		for (i = 0; i < csn; i++) {
-			if (uf_find(&csa[i]->node) == &csa[i]->node)
-				ndoms++;
-		}
-	} else {
-		ndoms = csn;
+	/* Count the total number of domains */
+	for (i = 0; i < csn; i++) {
+		if (uf_find(&csa[i]->node) == &csa[i]->node)
+			ndoms++;
 	}
 
 	/*
-- 
2.43.5

From nobody Fri Dec 19 20:33:49 2025
From: Waiman Long
To: Tejun Heo, Zefan Li, Johannes Weiner, Michal Koutný, Shuah Khan
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-kselftest@vger.kernel.org, Chen Ridong, Waiman Long
Subject: [PATCH-cgroup 5/5] selftest/cgroup: Add new test cases to test_cpuset_prs.sh
Date: Sun, 4 Aug 2024 21:30:19 -0400
Message-ID: <20240805013019.724300-6-longman@redhat.com>
In-Reply-To: <20240805013019.724300-1-longman@redhat.com>
References: <20240805013019.724300-1-longman@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Add new test cases to test_cpuset_prs.sh to cover corner cases reported
in previous fix commits.

Signed-off-by: Waiman Long
---
 tools/testing/selftests/cgroup/test_cpuset_prs.sh | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/cgroup/test_cpuset_prs.sh b/tools/testing/selftests/cgroup/test_cpuset_prs.sh
index 7c08cc153367..7295424502b9 100755
--- a/tools/testing/selftests/cgroup/test_cpuset_prs.sh
+++ b/tools/testing/selftests/cgroup/test_cpuset_prs.sh
@@ -321,7 +321,7 @@ TEST_MATRIX=(
 	# old-A1 old-A2 old-A3 old-B1 new-A1 new-A2 new-A3 new-B1 fail ECPUs Pstate ISOLCPUS
 	# ------ ------ ------ ------ ------ ------ ------ ------ ---- ----- ------ --------
 	#
-	# Incorrect change to cpuset.cpus invalidates partition root
+	# Incorrect change to cpuset.cpus[.exclusive] invalidates partition root
 	#
 	# Adding CPUs to partition root that are not in parent's
 	# cpuset.cpus is allowed, but those extra CPUs are ignored.
@@ -365,6 +365,16 @@ TEST_MATRIX=(
 	# cpuset.cpus can overlap with sibling cpuset.cpus.exclusive but not subsumed by it
 	" C0-3 . . C4-5 X5 . . . 0 A1:0-3,B1:4-5"
 
+	# Child partition root that try to take all CPUs from parent partition
+	# with tasks will remain invalid.
+	" C1-4:P1:S+ P1 . . . . . . 0 A1:1-4,A2:1-4 A1:P1,A2:P-1"
+	" C1-4:P1:S+ P1 . . . C1-4 . . 0 A1,A2:1-4 A1:P1,A2:P1"
+	" C1-4:P1:S+ P1 . . T C1-4 . . 0 A1:1-4,A2:1-4 A1:P1,A2:P-1"
+
+	# Clearing of cpuset.cpus with a preset cpuset.cpus.exclusive shouldn't
+	# affect cpuset.cpus.exclusive.effective.
+	" C1-4:X3:S+ C1:X3 . . . C . . 0 A2:1-4,XA2:3"
+
 	# old-A1 old-A2 old-A3 old-B1 new-A1 new-A2 new-A3 new-B1 fail ECPUs Pstate ISOLCPUS
 	# ------ ------ ------ ------ ------ ------ ------ ------ ---- ----- ------ --------
 	# Failure cases:
-- 
2.43.5
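
The merge-and-count step that patch 4/5 enables for cgroup v2 can be
illustrated with a small userspace sketch. It is not the kernel's
implementation: the toy_uf_* helpers, the array of parent indices, and the
64-bit CPU masks are all assumptions made for illustration, whereas the
kernel code calls uf_node_init(), uf_union() and uf_find() on the node
embedded in each cpuset.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_SETS 64

/* Hypothetical array-based union-find; toy_parent[i] == i marks a root. */
static int toy_parent[TOY_MAX_SETS];

static int toy_uf_find(int x)
{
	while (toy_parent[x] != x) {
		/* Path halving: point x at its grandparent, then step up. */
		toy_parent[x] = toy_parent[toy_parent[x]];
		x = toy_parent[x];
	}
	return x;
}

static void toy_uf_union(int a, int b)
{
	int ra = toy_uf_find(a);
	int rb = toy_uf_find(b);

	if (ra != rb)
		toy_parent[rb] = ra;
}

/*
 * Count sched domains for n cpusets, each described by a CPU bitmask:
 * overlapping cpusets are merged, then every remaining root is one domain.
 */
static int toy_count_domains(const uint64_t *cpus, int n)
{
	int i, j, ndoms = 0;

	assert(n >= 0 && n <= TOY_MAX_SETS);

	for (i = 0; i < n; i++)
		toy_parent[i] = i;

	/* Merge overlapping cpusets */
	for (i = 0; i < n; i++)
		for (j = i + 1; j < n; j++)
			if (cpus[i] & cpus[j])
				toy_uf_union(i, j);

	/* Count the total number of domains */
	for (i = 0; i < n; i++)
		if (toy_uf_find(i) == i)
			ndoms++;

	return ndoms;
}
```

When no two masks overlap, which is the expected case for v2 partition
roots, the count equals the number of cpusets, so the result matches the
removed "ndoms = csn" shortcut.
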