[PATCH] cgroup: Keep favordynmods enabled once per-threadgroup rwsem is active

Guopeng Zhang posted 1 patch 1 month ago
kernel/cgroup/cgroup.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)
[PATCH] cgroup: Keep favordynmods enabled once per-threadgroup rwsem is active
Posted by Guopeng Zhang 1 month ago
cgroup_enable_per_threadgroup_rwsem is a one-way switch. Once it is
enabled, cgroup.procs writes use the per-threadgroup rwsem and
cgroup_threadgroup_change_begin()/end() use the same global state to
decide whether to take and release the per-threadgroup rwsem.

The disable path warned that the per-threadgroup rwsem mechanism could not
be disabled but still called rcu_sync_exit() and cleared
CGRP_ROOT_FAVOR_DYNMODS. That partially disabled favordynmods while the
global per-threadgroup rwsem mode remained enabled: cgroup.procs writes
would continue to use the per-threadgroup rwsem, while
cgroup_threadgroup_change_begin()/end() could observe the exited rcu_sync
state. The root would also no longer report favordynmods.

Make the transition match the documented one-way semantics. Call
rcu_sync_enter() only for the first favordynmods enable, and make later
disable attempts a no-op after warning once the per-threadgroup rwsem mode
has been enabled.

Fixes: 0568f89d4fb8 ("cgroup: replace global percpu_rwsem with per threadgroup resem when writing to cgroup.procs")
Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn>
---
Manual AB test:

Before this patch:
  enable favordynmods:
    cgroup2 opts: rw,relatime,favordynmods
  disable attempt:
    cgroup2 opts: rw,relatime
  dmesg:
    cgroup: cgroup favordynmods: per threadgroup rwsem mechanism can't be disabled

After this patch:
  enable favordynmods:
    cgroup2 opts: rw,relatime,favordynmods
  disable attempt:
    cgroup2 opts: rw,relatime,favordynmods
  dmesg:
    cgroup: cgroup favordynmods: per threadgroup rwsem mechanism can't be disabled

 kernel/cgroup/cgroup.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 6152add0c5eb..fd10fb5b3598 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -1297,14 +1297,13 @@ void cgroup_favor_dynmods(struct cgroup_root *root, bool favor)
 	 */
 	percpu_down_write(&cgroup_threadgroup_rwsem);
 	if (favor && !favoring) {
-		cgroup_enable_per_threadgroup_rwsem = true;
-		rcu_sync_enter(&cgroup_threadgroup_rwsem.rss);
+		if (!cgroup_enable_per_threadgroup_rwsem) {
+			cgroup_enable_per_threadgroup_rwsem = true;
+			rcu_sync_enter(&cgroup_threadgroup_rwsem.rss);
+		}
 		root->flags |= CGRP_ROOT_FAVOR_DYNMODS;
 	} else if (!favor && favoring) {
-		if (cgroup_enable_per_threadgroup_rwsem)
-			pr_warn_once("cgroup favordynmods: per threadgroup rwsem mechanism can't be disabled\n");
-		rcu_sync_exit(&cgroup_threadgroup_rwsem.rss);
-		root->flags &= ~CGRP_ROOT_FAVOR_DYNMODS;
+		pr_warn_once("cgroup favordynmods: per threadgroup rwsem mechanism can't be disabled\n");
 	}
 	percpu_up_write(&cgroup_threadgroup_rwsem);
 }
-- 
2.43.0
Re: [PATCH] cgroup: Keep favordynmods enabled once per-threadgroup rwsem is active
Posted by Chen Ridong 1 month ago

On 2026/5/11 16:16, Guopeng Zhang wrote:
> cgroup_enable_per_threadgroup_rwsem is a one-way switch. Once it is
> enabled, cgroup.procs writes use the per-threadgroup rwsem and
> cgroup_threadgroup_change_begin()/end() use the same global state to
> decide whether to take and release the per-threadgroup rwsem.
> 
> The disable path warned that the per-threadgroup rwsem mechanism could not
> be disabled but still called rcu_sync_exit() and cleared
> CGRP_ROOT_FAVOR_DYNMODS. That partially disabled favordynmods while the
> global per-threadgroup rwsem mode remained enabled: cgroup.procs writes
> would continue to use the per-threadgroup rwsem, while
> cgroup_threadgroup_change_begin()/end() could observe the exited rcu_sync
> state. The root would also no longer report favordynmods.
> 
> Make the transition match the documented one-way semantics. Call
> rcu_sync_enter() only for the first favordynmods enable, and make later
> disable attempts a no-op after warning once the per-threadgroup rwsem mode
> has been enabled.
> 
> Fixes: 0568f89d4fb8 ("cgroup: replace global percpu_rwsem with per threadgroup resem when writing to cgroup.procs")
> Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn>
> ---
> Manual AB test:
> 
> Before this patch:
>   enable favordynmods:
>     cgroup2 opts: rw,relatime,favordynmods
>   disable attempt:
>     cgroup2 opts: rw,relatime
>   dmesg:
>     cgroup: cgroup favordynmods: per threadgroup rwsem mechanism can't be disabled
> 
> After this patch:
>   enable favordynmods:
>     cgroup2 opts: rw,relatime,favordynmods
>   disable attempt:
>     cgroup2 opts: rw,relatime,favordynmods
>   dmesg:
>     cgroup: cgroup favordynmods: per threadgroup rwsem mechanism can't be disabled
> 
>  kernel/cgroup/cgroup.c | 11 +++++------
>  1 file changed, 5 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
> index 6152add0c5eb..fd10fb5b3598 100644
> --- a/kernel/cgroup/cgroup.c
> +++ b/kernel/cgroup/cgroup.c
> @@ -1297,14 +1297,13 @@ void cgroup_favor_dynmods(struct cgroup_root *root, bool favor)
>  	 */
>  	percpu_down_write(&cgroup_threadgroup_rwsem);
>  	if (favor && !favoring) {
> -		cgroup_enable_per_threadgroup_rwsem = true;
> -		rcu_sync_enter(&cgroup_threadgroup_rwsem.rss);
> +		if (!cgroup_enable_per_threadgroup_rwsem) {

Is this branch redundant? I think if (favor && !favoring) alone should suffice —
or can the outer condition be true twice (i.e., can this block be entered
multiple times)?


> +			cgroup_enable_per_threadgroup_rwsem = true;
> +			rcu_sync_enter(&cgroup_threadgroup_rwsem.rss);
> +		}
>  		root->flags |= CGRP_ROOT_FAVOR_DYNMODS;
>  	} else if (!favor && favoring) {
> -		if (cgroup_enable_per_threadgroup_rwsem)
> -			pr_warn_once("cgroup favordynmods: per threadgroup rwsem mechanism can't be disabled\n");
> -		rcu_sync_exit(&cgroup_threadgroup_rwsem.rss);
> -		root->flags &= ~CGRP_ROOT_FAVOR_DYNMODS;
> +		pr_warn_once("cgroup favordynmods: per threadgroup rwsem mechanism can't be disabled\n");
>  	}
>  	percpu_up_write(&cgroup_threadgroup_rwsem);
>  }

-- 
Best regards,
Ridong

Re: [PATCH] cgroup: Keep favordynmods enabled once per-threadgroup rwsem is active
Posted by Guopeng Zhang 1 month ago

在 2026/5/11 17:05, Chen Ridong 写道:
> 
> 
> On 2026/5/11 16:16, Guopeng Zhang wrote:
>> cgroup_enable_per_threadgroup_rwsem is a one-way switch. Once it is
>> enabled, cgroup.procs writes use the per-threadgroup rwsem and
>> cgroup_threadgroup_change_begin()/end() use the same global state to
>> decide whether to take and release the per-threadgroup rwsem.
>>
>> The disable path warned that the per-threadgroup rwsem mechanism could not
>> be disabled but still called rcu_sync_exit() and cleared
>> CGRP_ROOT_FAVOR_DYNMODS. That partially disabled favordynmods while the
>> global per-threadgroup rwsem mode remained enabled: cgroup.procs writes
>> would continue to use the per-threadgroup rwsem, while
>> cgroup_threadgroup_change_begin()/end() could observe the exited rcu_sync
>> state. The root would also no longer report favordynmods.
>>
>> Make the transition match the documented one-way semantics. Call
>> rcu_sync_enter() only for the first favordynmods enable, and make later
>> disable attempts a no-op after warning once the per-threadgroup rwsem mode
>> has been enabled.
>>
>> Fixes: 0568f89d4fb8 ("cgroup: replace global percpu_rwsem with per threadgroup resem when writing to cgroup.procs")
>> Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn>
>> ---
>> Manual AB test:
>>
>> Before this patch:
>>   enable favordynmods:
>>     cgroup2 opts: rw,relatime,favordynmods
>>   disable attempt:
>>     cgroup2 opts: rw,relatime
>>   dmesg:
>>     cgroup: cgroup favordynmods: per threadgroup rwsem mechanism can't be disabled
>>
>> After this patch:
>>   enable favordynmods:
>>     cgroup2 opts: rw,relatime,favordynmods
>>   disable attempt:
>>     cgroup2 opts: rw,relatime,favordynmods
>>   dmesg:
>>     cgroup: cgroup favordynmods: per threadgroup rwsem mechanism can't be disabled
>>
>>  kernel/cgroup/cgroup.c | 11 +++++------
>>  1 file changed, 5 insertions(+), 6 deletions(-)
>>
>> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
>> index 6152add0c5eb..fd10fb5b3598 100644
>> --- a/kernel/cgroup/cgroup.c
>> +++ b/kernel/cgroup/cgroup.c
>> @@ -1297,14 +1297,13 @@ void cgroup_favor_dynmods(struct cgroup_root *root, bool favor)
>>  	 */
>>  	percpu_down_write(&cgroup_threadgroup_rwsem);
>>  	if (favor && !favoring) {
>> -		cgroup_enable_per_threadgroup_rwsem = true;
>> -		rcu_sync_enter(&cgroup_threadgroup_rwsem.rss);
>> +		if (!cgroup_enable_per_threadgroup_rwsem) {
> 
> Is this branch redundant? I think if (favor && !favoring) alone should suffice —
> or can the outer condition be true twice (i.e., can this block be entered
> multiple times)?
> Hi Ridong,

Thanks for taking a look.

I don't think the inner check is redundant. `favoring` is per-root, while
`cgroup_enable_per_threadgroup_rwsem` is global.

For example, root A may have already enabled favordynmods:

cgroup_enable_per_threadgroup_rwsem = true
rcu_sync_enter(&cgroup_threadgroup_rwsem.rss) has already been called

Then root B may enable favordynmods for the first time:

favoring is false for root B
favor && !favoring is true
but cgroup_enable_per_threadgroup_rwsem is already true

In that case, we only need to set `CGRP_ROOT_FAVOR_DYNMODS` for root B and
should not call `rcu_sync_enter()` again.

That said, Tejun pointed out that the visible flag state is ambiguous either
way after a disable attempt, so please consider this patch withdrawn.
-- 
Best regards,
Guopeng

Re: [PATCH] cgroup: Keep favordynmods enabled once per-threadgroup rwsem is active
Posted by Chen Ridong 1 month ago

On 2026/5/11 17:53, Guopeng Zhang wrote:
> 
> 
> 在 2026/5/11 17:05, Chen Ridong 写道:
>>
>>
>> On 2026/5/11 16:16, Guopeng Zhang wrote:
>>> cgroup_enable_per_threadgroup_rwsem is a one-way switch. Once it is
>>> enabled, cgroup.procs writes use the per-threadgroup rwsem and
>>> cgroup_threadgroup_change_begin()/end() use the same global state to
>>> decide whether to take and release the per-threadgroup rwsem.
>>>
>>> The disable path warned that the per-threadgroup rwsem mechanism could not
>>> be disabled but still called rcu_sync_exit() and cleared
>>> CGRP_ROOT_FAVOR_DYNMODS. That partially disabled favordynmods while the
>>> global per-threadgroup rwsem mode remained enabled: cgroup.procs writes
>>> would continue to use the per-threadgroup rwsem, while
>>> cgroup_threadgroup_change_begin()/end() could observe the exited rcu_sync
>>> state. The root would also no longer report favordynmods.
>>>
>>> Make the transition match the documented one-way semantics. Call
>>> rcu_sync_enter() only for the first favordynmods enable, and make later
>>> disable attempts a no-op after warning once the per-threadgroup rwsem mode
>>> has been enabled.
>>>
>>> Fixes: 0568f89d4fb8 ("cgroup: replace global percpu_rwsem with per threadgroup resem when writing to cgroup.procs")
>>> Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn>
>>> ---
>>> Manual AB test:
>>>
>>> Before this patch:
>>>   enable favordynmods:
>>>     cgroup2 opts: rw,relatime,favordynmods
>>>   disable attempt:
>>>     cgroup2 opts: rw,relatime
>>>   dmesg:
>>>     cgroup: cgroup favordynmods: per threadgroup rwsem mechanism can't be disabled
>>>
>>> After this patch:
>>>   enable favordynmods:
>>>     cgroup2 opts: rw,relatime,favordynmods
>>>   disable attempt:
>>>     cgroup2 opts: rw,relatime,favordynmods
>>>   dmesg:
>>>     cgroup: cgroup favordynmods: per threadgroup rwsem mechanism can't be disabled
>>>
>>>  kernel/cgroup/cgroup.c | 11 +++++------
>>>  1 file changed, 5 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
>>> index 6152add0c5eb..fd10fb5b3598 100644
>>> --- a/kernel/cgroup/cgroup.c
>>> +++ b/kernel/cgroup/cgroup.c
>>> @@ -1297,14 +1297,13 @@ void cgroup_favor_dynmods(struct cgroup_root *root, bool favor)
>>>  	 */
>>>  	percpu_down_write(&cgroup_threadgroup_rwsem);
>>>  	if (favor && !favoring) {
>>> -		cgroup_enable_per_threadgroup_rwsem = true;
>>> -		rcu_sync_enter(&cgroup_threadgroup_rwsem.rss);
>>> +		if (!cgroup_enable_per_threadgroup_rwsem) {
>>
>> Is this branch redundant? I think if (favor && !favoring) alone should suffice —
>> or can the outer condition be true twice (i.e., can this block be entered
>> multiple times)?
>> Hi Ridong,
> 
> Thanks for taking a look.
> 
> I don't think the inner check is redundant. `favoring` is per-root, while
> `cgroup_enable_per_threadgroup_rwsem` is global.
> 
> For example, root A may have already enabled favordynmods:
> 

This functionality is only available for cgroup v2, right?

Never mind — I see Tj has already replied.

-- 
Best regards,
Ridong

Re: [PATCH] cgroup: Keep favordynmods enabled once per-threadgroup rwsem is active
Posted by Chen Ridong 1 month ago

On 2026/5/11 20:08, Chen Ridong wrote:
> 
> 
> On 2026/5/11 17:53, Guopeng Zhang wrote:
>>
>>
>> 在 2026/5/11 17:05, Chen Ridong 写道:
>>>
>>>
>>> On 2026/5/11 16:16, Guopeng Zhang wrote:
>>>> cgroup_enable_per_threadgroup_rwsem is a one-way switch. Once it is
>>>> enabled, cgroup.procs writes use the per-threadgroup rwsem and
>>>> cgroup_threadgroup_change_begin()/end() use the same global state to
>>>> decide whether to take and release the per-threadgroup rwsem.
>>>>
>>>> The disable path warned that the per-threadgroup rwsem mechanism could not
>>>> be disabled but still called rcu_sync_exit() and cleared
>>>> CGRP_ROOT_FAVOR_DYNMODS. That partially disabled favordynmods while the
>>>> global per-threadgroup rwsem mode remained enabled: cgroup.procs writes
>>>> would continue to use the per-threadgroup rwsem, while
>>>> cgroup_threadgroup_change_begin()/end() could observe the exited rcu_sync
>>>> state. The root would also no longer report favordynmods.
>>>>
>>>> Make the transition match the documented one-way semantics. Call
>>>> rcu_sync_enter() only for the first favordynmods enable, and make later
>>>> disable attempts a no-op after warning once the per-threadgroup rwsem mode
>>>> has been enabled.
>>>>
>>>> Fixes: 0568f89d4fb8 ("cgroup: replace global percpu_rwsem with per threadgroup resem when writing to cgroup.procs")
>>>> Signed-off-by: Guopeng Zhang <zhangguopeng@kylinos.cn>
>>>> ---
>>>> Manual AB test:
>>>>
>>>> Before this patch:
>>>>   enable favordynmods:
>>>>     cgroup2 opts: rw,relatime,favordynmods
>>>>   disable attempt:
>>>>     cgroup2 opts: rw,relatime
>>>>   dmesg:
>>>>     cgroup: cgroup favordynmods: per threadgroup rwsem mechanism can't be disabled
>>>>
>>>> After this patch:
>>>>   enable favordynmods:
>>>>     cgroup2 opts: rw,relatime,favordynmods
>>>>   disable attempt:
>>>>     cgroup2 opts: rw,relatime,favordynmods
>>>>   dmesg:
>>>>     cgroup: cgroup favordynmods: per threadgroup rwsem mechanism can't be disabled
>>>>
>>>>  kernel/cgroup/cgroup.c | 11 +++++------
>>>>  1 file changed, 5 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
>>>> index 6152add0c5eb..fd10fb5b3598 100644
>>>> --- a/kernel/cgroup/cgroup.c
>>>> +++ b/kernel/cgroup/cgroup.c
>>>> @@ -1297,14 +1297,13 @@ void cgroup_favor_dynmods(struct cgroup_root *root, bool favor)
>>>>  	 */
>>>>  	percpu_down_write(&cgroup_threadgroup_rwsem);
>>>>  	if (favor && !favoring) {
>>>> -		cgroup_enable_per_threadgroup_rwsem = true;
>>>> -		rcu_sync_enter(&cgroup_threadgroup_rwsem.rss);
>>>> +		if (!cgroup_enable_per_threadgroup_rwsem) {
>>>
>>> Is this branch redundant? I think if (favor && !favoring) alone should suffice —
>>> or can the outer condition be true twice (i.e., can this block be entered
>>> multiple times)?
>>> Hi Ridong,
>>
>> Thanks for taking a look.
>>
>> I don't think the inner check is redundant. `favoring` is per-root, while
>> `cgroup_enable_per_threadgroup_rwsem` is global.
>>
>> For example, root A may have already enabled favordynmods:
>>
> 
> This functionality is only available for cgroup v2, right?
> 

Sorry for the mistake. I incorrectly recalled that it was only available for v2.
Please ignore that.

-- 
Best regards,
Ridong