[PATCH v2] cgroup/cpuset: remove kernfs active break

Chen Ridong posted 1 patch 1 year, 1 month ago
kernel/cgroup/cpuset.c | 25 -------------------------
1 file changed, 25 deletions(-)
[PATCH v2] cgroup/cpuset: remove kernfs active break
Posted by Chen Ridong 1 year, 1 month ago
From: Chen Ridong <chenridong@huawei.com>

A warning was found:

WARNING: CPU: 10 PID: 3486953 at fs/kernfs/file.c:828
CPU: 10 PID: 3486953 Comm: rmdir Kdump: loaded Tainted: G
RIP: 0010:kernfs_should_drain_open_files+0x1a1/0x1b0
RSP: 0018:ffff8881107ef9e0 EFLAGS: 00010202
RAX: 0000000080000002 RBX: ffff888154738c00 RCX: dffffc0000000000
RDX: 0000000000000007 RSI: 0000000000000004 RDI: ffff888154738c04
RBP: ffff888154738c04 R08: ffffffffaf27fa15 R09: ffffed102a8e7180
R10: ffff888154738c07 R11: 0000000000000000 R12: ffff888154738c08
R13: ffff888750f8c000 R14: ffff888750f8c0e8 R15: ffff888154738ca0
FS:  00007f84cd0be740(0000) GS:ffff8887ddc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000555f9fbe00c8 CR3: 0000000153eec001 CR4: 0000000000370ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 kernfs_drain+0x15e/0x2f0
 __kernfs_remove+0x165/0x300
 kernfs_remove_by_name_ns+0x7b/0xc0
 cgroup_rm_file+0x154/0x1c0
 cgroup_addrm_files+0x1c2/0x1f0
 css_clear_dir+0x77/0x110
 kill_css+0x4c/0x1b0
 cgroup_destroy_locked+0x194/0x380
 cgroup_rmdir+0x2a/0x140

It can be explained by:
rmdir 				echo 1 > cpuset.cpus
				kernfs_fop_write_iter // active=0
cgroup_rm_file
kernfs_remove_by_name_ns	kernfs_get_active // active=1
__kernfs_remove					  // active=0x80000002
kernfs_drain			cpuset_write_resmask
wait_event
//waiting (active == 0x80000001)
				kernfs_break_active_protection
				// active = 0x80000001
// continue
				kernfs_unbreak_active_protection
				// active = 0x80000002
...
kernfs_should_drain_open_files
// warning occurs
				kernfs_put_active

This warning is caused by 'kernfs_break_active_protection' when it is
writing to cpuset.cpus, and the cgroup is removed concurrently.

The commit 3a5a6d0c2b03 ("cpuset: don't nest cgroup_mutex inside
get_online_cpus()") made cpuset_hotplug_workfn asynchronous, This change
involves calling flush_work(), which can create a multiple processes
circular locking dependency that involve cgroup_mutex, potentially leading
to a deadlock. To avoid deadlock. the commit 76bb5ab8f6e3 ("cpuset: break
kernfs active protection in cpuset_write_resmask()") added
'kernfs_break_active_protection' in the cpuset_write_resmask. This could
lead to this warning.

After the commit 2125c0034c5d ("cgroup/cpuset: Make cpuset hotplug
processing synchronous"), the cpuset_write_resmask no longer needs to
wait the hotplug to finish, which means that concurrent hotplug and cpuset
operations are no longer possible. Therefore, the deadlock doesn't exist
anymore and it does not have to 'break active protection' now. To fix this
warning, just remove kernfs_break_active_protection operation in the
'cpuset_write_resmask'.

Fixes: 76bb5ab8f6e3 ("cpuset: break kernfs active protection in cpuset_write_resmask()")
Reported-by: Ji Fa <jifa@huawei.com>
Signed-off-by: Chen Ridong <chenridong@huawei.com>
---
 kernel/cgroup/cpuset.c | 25 -------------------------
 1 file changed, 25 deletions(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 7ea559fb0cbf..0f910c828973 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -3124,29 +3124,6 @@ ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
 	int retval = -ENODEV;
 
 	buf = strstrip(buf);
-
-	/*
-	 * CPU or memory hotunplug may leave @cs w/o any execution
-	 * resources, in which case the hotplug code asynchronously updates
-	 * configuration and transfers all tasks to the nearest ancestor
-	 * which can execute.
-	 *
-	 * As writes to "cpus" or "mems" may restore @cs's execution
-	 * resources, wait for the previously scheduled operations before
-	 * proceeding, so that we don't end up keep removing tasks added
-	 * after execution capability is restored.
-	 *
-	 * cpuset_handle_hotplug may call back into cgroup core asynchronously
-	 * via cgroup_transfer_tasks() and waiting for it from a cgroupfs
-	 * operation like this one can lead to a deadlock through kernfs
-	 * active_ref protection.  Let's break the protection.  Losing the
-	 * protection is okay as we check whether @cs is online after
-	 * grabbing cpuset_mutex anyway.  This only happens on the legacy
-	 * hierarchies.
-	 */
-	css_get(&cs->css);
-	kernfs_break_active_protection(of->kn);
-
 	cpus_read_lock();
 	mutex_lock(&cpuset_mutex);
 	if (!is_cpuset_online(cs))
@@ -3179,8 +3156,6 @@ ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
 out_unlock:
 	mutex_unlock(&cpuset_mutex);
 	cpus_read_unlock();
-	kernfs_unbreak_active_protection(of->kn);
-	css_put(&cs->css);
 	flush_workqueue(cpuset_migrate_mm_wq);
 	return retval ?: nbytes;
 }
-- 
2.34.1
Re: [PATCH v2] cgroup/cpuset: remove kernfs active break
Posted by Tejun Heo 1 year, 1 month ago
On Mon, Jan 06, 2025 at 08:19:04AM +0000, Chen Ridong wrote:
> From: Chen Ridong <chenridong@huawei.com>
> 
> A warning was found:
> 
> WARNING: CPU: 10 PID: 3486953 at fs/kernfs/file.c:828
> CPU: 10 PID: 3486953 Comm: rmdir Kdump: loaded Tainted: G
> RIP: 0010:kernfs_should_drain_open_files+0x1a1/0x1b0
> RSP: 0018:ffff8881107ef9e0 EFLAGS: 00010202
> RAX: 0000000080000002 RBX: ffff888154738c00 RCX: dffffc0000000000
> RDX: 0000000000000007 RSI: 0000000000000004 RDI: ffff888154738c04
> RBP: ffff888154738c04 R08: ffffffffaf27fa15 R09: ffffed102a8e7180
> R10: ffff888154738c07 R11: 0000000000000000 R12: ffff888154738c08
> R13: ffff888750f8c000 R14: ffff888750f8c0e8 R15: ffff888154738ca0
> FS:  00007f84cd0be740(0000) GS:ffff8887ddc00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000555f9fbe00c8 CR3: 0000000153eec001 CR4: 0000000000370ee0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  kernfs_drain+0x15e/0x2f0
>  __kernfs_remove+0x165/0x300
>  kernfs_remove_by_name_ns+0x7b/0xc0
>  cgroup_rm_file+0x154/0x1c0
>  cgroup_addrm_files+0x1c2/0x1f0
>  css_clear_dir+0x77/0x110
>  kill_css+0x4c/0x1b0
>  cgroup_destroy_locked+0x194/0x380
>  cgroup_rmdir+0x2a/0x140
> 
> It can be explained by:
> rmdir 				echo 1 > cpuset.cpus
> 				kernfs_fop_write_iter // active=0
> cgroup_rm_file
> kernfs_remove_by_name_ns	kernfs_get_active // active=1
> __kernfs_remove					  // active=0x80000002
> kernfs_drain			cpuset_write_resmask
> wait_event
> //waiting (active == 0x80000001)
> 				kernfs_break_active_protection
> 				// active = 0x80000001
> // continue
> 				kernfs_unbreak_active_protection
> 				// active = 0x80000002
> ...
> kernfs_should_drain_open_files
> // warning occurs
> 				kernfs_put_active
> 
> This warning is caused by 'kernfs_break_active_protection' when it is
> writing to cpuset.cpus, and the cgroup is removed concurrently.
> 
> The commit 3a5a6d0c2b03 ("cpuset: don't nest cgroup_mutex inside
> get_online_cpus()") made cpuset_hotplug_workfn asynchronous, This change
> involves calling flush_work(), which can create a multiple processes
> circular locking dependency that involve cgroup_mutex, potentially leading
> to a deadlock. To avoid deadlock. the commit 76bb5ab8f6e3 ("cpuset: break
> kernfs active protection in cpuset_write_resmask()") added
> 'kernfs_break_active_protection' in the cpuset_write_resmask. This could
> lead to this warning.
> 
> After the commit 2125c0034c5d ("cgroup/cpuset: Make cpuset hotplug
> processing synchronous"), the cpuset_write_resmask no longer needs to
> wait the hotplug to finish, which means that concurrent hotplug and cpuset
> operations are no longer possible. Therefore, the deadlock doesn't exist
> anymore and it does not have to 'break active protection' now. To fix this
> warning, just remove kernfs_break_active_protection operation in the
> 'cpuset_write_resmask'.
> 
> Fixes: 76bb5ab8f6e3 ("cpuset: break kernfs active protection in cpuset_write_resmask()")
> Reported-by: Ji Fa <jifa@huawei.com>
> Signed-off-by: Chen Ridong <chenridong@huawei.com>

Applied to cgroup/for-6.13-fixes.

Thanks.

-- 
tejun
Re: [PATCH v2] cgroup/cpuset: remove kernfs active break
Posted by Chen Ridong 1 year, 1 month ago

On 2025/1/9 3:23, Tejun Heo wrote:
> On Mon, Jan 06, 2025 at 08:19:04AM +0000, Chen Ridong wrote:
>> From: Chen Ridong <chenridong@huawei.com>
>>
>> A warning was found:
>>
>> WARNING: CPU: 10 PID: 3486953 at fs/kernfs/file.c:828
>> CPU: 10 PID: 3486953 Comm: rmdir Kdump: loaded Tainted: G
>> RIP: 0010:kernfs_should_drain_open_files+0x1a1/0x1b0
>> RSP: 0018:ffff8881107ef9e0 EFLAGS: 00010202
>> RAX: 0000000080000002 RBX: ffff888154738c00 RCX: dffffc0000000000
>> RDX: 0000000000000007 RSI: 0000000000000004 RDI: ffff888154738c04
>> RBP: ffff888154738c04 R08: ffffffffaf27fa15 R09: ffffed102a8e7180
>> R10: ffff888154738c07 R11: 0000000000000000 R12: ffff888154738c08
>> R13: ffff888750f8c000 R14: ffff888750f8c0e8 R15: ffff888154738ca0
>> FS:  00007f84cd0be740(0000) GS:ffff8887ddc00000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000555f9fbe00c8 CR3: 0000000153eec001 CR4: 0000000000370ee0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> Call Trace:
>>  kernfs_drain+0x15e/0x2f0
>>  __kernfs_remove+0x165/0x300
>>  kernfs_remove_by_name_ns+0x7b/0xc0
>>  cgroup_rm_file+0x154/0x1c0
>>  cgroup_addrm_files+0x1c2/0x1f0
>>  css_clear_dir+0x77/0x110
>>  kill_css+0x4c/0x1b0
>>  cgroup_destroy_locked+0x194/0x380
>>  cgroup_rmdir+0x2a/0x140
>>
>> It can be explained by:
>> rmdir 				echo 1 > cpuset.cpus
>> 				kernfs_fop_write_iter // active=0
>> cgroup_rm_file
>> kernfs_remove_by_name_ns	kernfs_get_active // active=1
>> __kernfs_remove					  // active=0x80000002
>> kernfs_drain			cpuset_write_resmask
>> wait_event
>> //waiting (active == 0x80000001)
>> 				kernfs_break_active_protection
>> 				// active = 0x80000001
>> // continue
>> 				kernfs_unbreak_active_protection
>> 				// active = 0x80000002
>> ...
>> kernfs_should_drain_open_files
>> // warning occurs
>> 				kernfs_put_active
>>
>> This warning is caused by 'kernfs_break_active_protection' when it is
>> writing to cpuset.cpus, and the cgroup is removed concurrently.
>>
>> The commit 3a5a6d0c2b03 ("cpuset: don't nest cgroup_mutex inside
>> get_online_cpus()") made cpuset_hotplug_workfn asynchronous, This change
>> involves calling flush_work(), which can create a multiple processes
>> circular locking dependency that involve cgroup_mutex, potentially leading
>> to a deadlock. To avoid deadlock. the commit 76bb5ab8f6e3 ("cpuset: break
>> kernfs active protection in cpuset_write_resmask()") added
>> 'kernfs_break_active_protection' in the cpuset_write_resmask. This could
>> lead to this warning.
>>
>> After the commit 2125c0034c5d ("cgroup/cpuset: Make cpuset hotplug
>> processing synchronous"), the cpuset_write_resmask no longer needs to
>> wait the hotplug to finish, which means that concurrent hotplug and cpuset
>> operations are no longer possible. Therefore, the deadlock doesn't exist
>> anymore and it does not have to 'break active protection' now. To fix this
>> warning, just remove kernfs_break_active_protection operation in the
>> 'cpuset_write_resmask'.
>>
>> Fixes: 76bb5ab8f6e3 ("cpuset: break kernfs active protection in cpuset_write_resmask()")
>> Reported-by: Ji Fa <jifa@huawei.com>
>> Signed-off-by: Chen Ridong <chenridong@huawei.com>
> 
> Applied to cgroup/for-6.13-fixes.
> 
> Thanks.
> 

Hi, Tj and Longman, I am sorry, the fix tag is not exactly right. I just
failed to reproduce this issue at the version 5.10, and I found this
warning was added with the commit bdb2fd7fc56e ("kernfs: Skip
kernfs_drain_open_files() more aggressively"), which is at version 6.1.
I believe it should both fix  bdb2fd7fc56e ("kernfs: Skip
kernfs_drain_open_files() more aggressively") and 76bb5ab8f6e3 ("cpuset:
break kernfs active protection in cpuset_write_resmask()"). Should I
resend a new patch?

Best regards,
Ridong
Re: [PATCH v2] cgroup/cpuset: remove kernfs active break
Posted by Tejun Heo 1 year, 1 month ago
On Thu, Jan 09, 2025 at 09:29:59AM +0800, Chen Ridong wrote:
> Hi, Tj and Longman, I am sorry, the fix tag is not exactly right. I just
> failed to reproduce this issue at the version 5.10, and I found this
> warning was added with the commit bdb2fd7fc56e ("kernfs: Skip
> kernfs_drain_open_files() more aggressively"), which is at version 6.1.
> I believe it should both fix  bdb2fd7fc56e ("kernfs: Skip
> kernfs_drain_open_files() more aggressively") and 76bb5ab8f6e3 ("cpuset:
> break kernfs active protection in cpuset_write_resmask()"). Should I
> resend a new patch?

No worries. I updated the commit in place.

Thanks.

-- 
tejun
Re: [PATCH v2] cgroup/cpuset: remove kernfs active break
Posted by Chen Ridong 1 year, 1 month ago

On 2025/1/9 9:55, Tejun Heo wrote:
> On Thu, Jan 09, 2025 at 09:29:59AM +0800, Chen Ridong wrote:
>> Hi, Tj and Longman, I am sorry, the fix tag is not exactly right. I just
>> failed to reproduce this issue at the version 5.10, and I found this
>> warning was added with the commit bdb2fd7fc56e ("kernfs: Skip
>> kernfs_drain_open_files() more aggressively"), which is at version 6.1.
>> I believe it should both fix  bdb2fd7fc56e ("kernfs: Skip
>> kernfs_drain_open_files() more aggressively") and 76bb5ab8f6e3 ("cpuset:
>> break kernfs active protection in cpuset_write_resmask()"). Should I
>> resend a new patch?
> 
> No worries. I updated the commit in place.
> 
> Thanks.
> 

Thank you very much.

Best regards,
Ridong
Re: [PATCH v2] cgroup/cpuset: remove kernfs active break
Posted by Waiman Long 1 year, 1 month ago
On 1/6/25 3:19 AM, Chen Ridong wrote:
> From: Chen Ridong <chenridong@huawei.com>
>
> A warning was found:
>
> WARNING: CPU: 10 PID: 3486953 at fs/kernfs/file.c:828
> CPU: 10 PID: 3486953 Comm: rmdir Kdump: loaded Tainted: G
> RIP: 0010:kernfs_should_drain_open_files+0x1a1/0x1b0
> RSP: 0018:ffff8881107ef9e0 EFLAGS: 00010202
> RAX: 0000000080000002 RBX: ffff888154738c00 RCX: dffffc0000000000
> RDX: 0000000000000007 RSI: 0000000000000004 RDI: ffff888154738c04
> RBP: ffff888154738c04 R08: ffffffffaf27fa15 R09: ffffed102a8e7180
> R10: ffff888154738c07 R11: 0000000000000000 R12: ffff888154738c08
> R13: ffff888750f8c000 R14: ffff888750f8c0e8 R15: ffff888154738ca0
> FS:  00007f84cd0be740(0000) GS:ffff8887ddc00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000555f9fbe00c8 CR3: 0000000153eec001 CR4: 0000000000370ee0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>   kernfs_drain+0x15e/0x2f0
>   __kernfs_remove+0x165/0x300
>   kernfs_remove_by_name_ns+0x7b/0xc0
>   cgroup_rm_file+0x154/0x1c0
>   cgroup_addrm_files+0x1c2/0x1f0
>   css_clear_dir+0x77/0x110
>   kill_css+0x4c/0x1b0
>   cgroup_destroy_locked+0x194/0x380
>   cgroup_rmdir+0x2a/0x140
>
> It can be explained by:
> rmdir 				echo 1 > cpuset.cpus
> 				kernfs_fop_write_iter // active=0
> cgroup_rm_file
> kernfs_remove_by_name_ns	kernfs_get_active // active=1
> __kernfs_remove					  // active=0x80000002
> kernfs_drain			cpuset_write_resmask
> wait_event
> //waiting (active == 0x80000001)
> 				kernfs_break_active_protection
> 				// active = 0x80000001
> // continue
> 				kernfs_unbreak_active_protection
> 				// active = 0x80000002
> ...
> kernfs_should_drain_open_files
> // warning occurs
> 				kernfs_put_active
>
> This warning is caused by 'kernfs_break_active_protection' when it is
> writing to cpuset.cpus, and the cgroup is removed concurrently.
>
> The commit 3a5a6d0c2b03 ("cpuset: don't nest cgroup_mutex inside
> get_online_cpus()") made cpuset_hotplug_workfn asynchronous, This change
> involves calling flush_work(), which can create a multiple processes
> circular locking dependency that involve cgroup_mutex, potentially leading
> to a deadlock. To avoid deadlock. the commit 76bb5ab8f6e3 ("cpuset: break
> kernfs active protection in cpuset_write_resmask()") added
> 'kernfs_break_active_protection' in the cpuset_write_resmask. This could
> lead to this warning.
>
> After the commit 2125c0034c5d ("cgroup/cpuset: Make cpuset hotplug
> processing synchronous"), the cpuset_write_resmask no longer needs to
> wait the hotplug to finish, which means that concurrent hotplug and cpuset
> operations are no longer possible. Therefore, the deadlock doesn't exist
> anymore and it does not have to 'break active protection' now. To fix this
> warning, just remove kernfs_break_active_protection operation in the
> 'cpuset_write_resmask'.
>
> Fixes: 76bb5ab8f6e3 ("cpuset: break kernfs active protection in cpuset_write_resmask()")
> Reported-by: Ji Fa <jifa@huawei.com>
> Signed-off-by: Chen Ridong <chenridong@huawei.com>
> ---
>   kernel/cgroup/cpuset.c | 25 -------------------------
>   1 file changed, 25 deletions(-)
>
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index 7ea559fb0cbf..0f910c828973 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -3124,29 +3124,6 @@ ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
>   	int retval = -ENODEV;
>   
>   	buf = strstrip(buf);
> -
> -	/*
> -	 * CPU or memory hotunplug may leave @cs w/o any execution
> -	 * resources, in which case the hotplug code asynchronously updates
> -	 * configuration and transfers all tasks to the nearest ancestor
> -	 * which can execute.
> -	 *
> -	 * As writes to "cpus" or "mems" may restore @cs's execution
> -	 * resources, wait for the previously scheduled operations before
> -	 * proceeding, so that we don't end up keep removing tasks added
> -	 * after execution capability is restored.
> -	 *
> -	 * cpuset_handle_hotplug may call back into cgroup core asynchronously
> -	 * via cgroup_transfer_tasks() and waiting for it from a cgroupfs
> -	 * operation like this one can lead to a deadlock through kernfs
> -	 * active_ref protection.  Let's break the protection.  Losing the
> -	 * protection is okay as we check whether @cs is online after
> -	 * grabbing cpuset_mutex anyway.  This only happens on the legacy
> -	 * hierarchies.
> -	 */
> -	css_get(&cs->css);
> -	kernfs_break_active_protection(of->kn);
> -
>   	cpus_read_lock();
>   	mutex_lock(&cpuset_mutex);
>   	if (!is_cpuset_online(cs))
> @@ -3179,8 +3156,6 @@ ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
>   out_unlock:
>   	mutex_unlock(&cpuset_mutex);
>   	cpus_read_unlock();
> -	kernfs_unbreak_active_protection(of->kn);
> -	css_put(&cs->css);
>   	flush_workqueue(cpuset_migrate_mm_wq);
>   	return retval ?: nbytes;
>   }

That looks good to me. Breaking active lock protection is no longer needed.

Acked-by: Waiman Long <longman@redhat.com>
Re: [PATCH v2] cgroup/cpuset: remove kernfs active break
Posted by Tejun Heo 1 year, 1 month ago
Waiman, what do you think?

-- 
tejun
Re: [PATCH v2] cgroup/cpuset: remove kernfs active break
Posted by Waiman Long 1 year, 1 month ago
On 1/8/25 11:53 AM, Tejun Heo wrote:
> Waiman, what do you think?
>
This patch looks good me. However, this does raise a question that I 
overlook when I made hotplug operation synchronous while task transfer, 
if needed, remained asynchronous. There is a very slight chance where we 
keep removing tasks added after execution capability is restored. As 
cgroup v1 is in the process of being deprecated, do you think we still 
need to do something to address this issue?

Cheers,
Longman
Re: [PATCH v2] cgroup/cpuset: remove kernfs active break
Posted by Tejun Heo 1 year, 1 month ago
Hello,

On Wed, Jan 08, 2025 at 02:27:07PM -0500, Waiman Long wrote:
> On 1/8/25 11:53 AM, Tejun Heo wrote:
> This patch looks good me. However, this does raise a question that I
> overlook when I made hotplug operation synchronous while task transfer, if
> needed, remained asynchronous. There is a very slight chance where we keep
> removing tasks added after execution capability is restored. As cgroup v1 is
> in the process of being deprecated, do you think we still need to do
> something to address this issue?

I *think* that should be fine. In cgroup1, the kernel is making irreversible
system config changes when a cgroup loses all its CPUs. I have a hard time
imagining use cases that would depend on the the exact ordering of
operations at that point. The auto transfer-out was always the last ditch
measure to not leave the system in a broken state after all. If someone's
depending on the transfer out being strictly ordered w.r.t. the cgroup
losing all CPUs and then gaining some, let's hear why the hell that ordering
matters first.

Thanks.

-- 
tejun
Re: [PATCH v2] cgroup/cpuset: remove kernfs active break
Posted by Waiman Long 1 year, 1 month ago
On 1/8/25 2:35 PM, Tejun Heo wrote:
> Hello,
>
> On Wed, Jan 08, 2025 at 02:27:07PM -0500, Waiman Long wrote:
>> On 1/8/25 11:53 AM, Tejun Heo wrote:
>> This patch looks good me. However, this does raise a question that I
>> overlook when I made hotplug operation synchronous while task transfer, if
>> needed, remained asynchronous. There is a very slight chance where we keep
>> removing tasks added after execution capability is restored. As cgroup v1 is
>> in the process of being deprecated, do you think we still need to do
>> something to address this issue?
> I *think* that should be fine. In cgroup1, the kernel is making irreversible
> system config changes when a cgroup loses all its CPUs. I have a hard time
> imagining use cases that would depend on the the exact ordering of
> operations at that point. The auto transfer-out was always the last ditch
> measure to not leave the system in a broken state after all. If someone's
> depending on the transfer out being strictly ordered w.r.t. the cgroup
> losing all CPUs and then gaining some, let's hear why the hell that ordering
> matters first.

Thanks for the explanation.

It is not the strict ordering that I am worrying about. It is all about 
the possibility of hitting some race conditions.

I am thinking of a scenario where a cpuset loses all its CPUs in 
hotunplug and then restored by adding other CPUs. There is chance that 
the css will be operated on concurrently by the auto-transfer task and 
another task moving new task to the css. I am not sure if that will be a 
problem or not. Anyway, it is very rare that we will be in such a situation.

Thanks,
Longman
Re: [PATCH v2] cgroup/cpuset: remove kernfs active break
Posted by Tejun Heo 1 year, 1 month ago
Hello,

On Wed, Jan 08, 2025 at 02:50:19PM -0500, Waiman Long wrote:
...
> It is not the strict ordering that I am worrying about. It is all about the
> possibility of hitting some race conditions.
> 
> I am thinking of a scenario where a cpuset loses all its CPUs in hotunplug
> and then restored by adding other CPUs. There is chance that the css will be
> operated on concurrently by the auto-transfer task and another task moving
> new task to the css. I am not sure if that will be a problem or not. Anyway,
> it is very rare that we will be in such a situation.

Hmm... I might be missing something but cgroup_transfer_tasks() is fully
synchronized against migrations. I don't see anything dangerous there.

Thanks.

-- 
tejun
Re: [PATCH v2] cgroup/cpuset: remove kernfs active break
Posted by Waiman Long 1 year, 1 month ago
On 1/8/25 3:00 PM, Tejun Heo wrote:
> Hello,
>
> On Wed, Jan 08, 2025 at 02:50:19PM -0500, Waiman Long wrote:
> ...
>> It is not the strict ordering that I am worrying about. It is all about the
>> possibility of hitting some race conditions.
>>
>> I am thinking of a scenario where a cpuset loses all its CPUs in hotunplug
>> and then restored by adding other CPUs. There is chance that the css will be
>> operated on concurrently by the auto-transfer task and another task moving
>> new task to the css. I am not sure if that will be a problem or not. Anyway,
>> it is very rare that we will be in such a situation.
> Hmm... I might be missing something but cgroup_transfer_tasks() is fully
> synchronized against migrations. I don't see anything dangerous there.
>
> Thanks.

I probably misread some of the code. Then it is not an issue anymore.

Thanks,
Longman