[PATCH 2/3] cgroup/cpuset.c: Fix a partition error with CPU hotplug

Waiman Long posted 3 patches 1 month, 4 weeks ago
[PATCH 2/3] cgroup/cpuset.c: Fix a partition error with CPU hotplug
Posted by Waiman Long 1 month, 4 weeks ago
It was found during testing that an invalid leaf partition with an
empty effective exclusive CPU list can become a valid empty partition
with no CPU afer an offline/online operation of an unrelated CPU. An
empty partition root is allowed in the special case that it has no
task in its cgroup and has distributed out all its CPUs to its child
partitions. That is certainly not the case here.

The problem is in the cpumask_subsets() test in the hotplug case
(update with no new mask) of update_parent_effective_cpumask() as it
also returns true if the effective exclusive CPU list is empty. Fix that
by addding the cpumask_empty() test to root out this exception case.
Also add the cpumask_empty() test in cpuset_hotplug_update_tasks()
to avoid calling update_parent_effective_cpumask() for this special case.

Fixes: 0c7f293efc87 ("cgroup/cpuset: Add cpuset.cpus.exclusive.effective for v2")
Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/cgroup/cpuset.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index bf149246e001..d993e058a663 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -1843,7 +1843,7 @@ static int update_parent_effective_cpumask(struct cpuset *cs, int cmd,
 			if (is_partition_valid(cs))
 				adding = cpumask_and(tmp->addmask,
 						xcpus, parent->effective_xcpus);
-		} else if (is_partition_invalid(cs) &&
+		} else if (is_partition_invalid(cs) && !cpumask_empty(xcpus) &&
 			   cpumask_subset(xcpus, parent->effective_xcpus)) {
 			struct cgroup_subsys_state *css;
 			struct cpuset *child;
@@ -3870,9 +3870,10 @@ static void cpuset_hotplug_update_tasks(struct cpuset *cs, struct tmpmasks *tmp)
 		partcmd = partcmd_invalidate;
 	/*
 	 * On the other hand, an invalid partition root may be transitioned
-	 * back to a regular one.
+	 * back to a regular one with a non-empty effective xcpus.
 	 */
-	else if (is_partition_valid(parent) && is_partition_invalid(cs))
+	else if (is_partition_valid(parent) && is_partition_invalid(cs) &&
+		 !cpumask_empty(cs->effective_xcpus))
 		partcmd = partcmd_update;
 
 	if (partcmd >= 0) {
-- 
2.50.0
Re: [PATCH 2/3] cgroup/cpuset.c: Fix a partition error with CPU hotplug
Posted by Chen Ridong 1 month, 4 weeks ago

On 2025/8/7 1:24, Waiman Long wrote:
> It was found during testing that an invalid leaf partition with an
> empty effective exclusive CPU list can become a valid empty partition
> with no CPU afer an offline/online operation of an unrelated CPU. An
> empty partition root is allowed in the special case that it has no
> task in its cgroup and has distributed out all its CPUs to its child
> partitions. That is certainly not the case here.
> 
> The problem is in the cpumask_subsets() test in the hotplug case
> (update with no new mask) of update_parent_effective_cpumask() as it
> also returns true if the effective exclusive CPU list is empty. Fix that
> by addding the cpumask_empty() test to root out this exception case.
> Also add the cpumask_empty() test in cpuset_hotplug_update_tasks()
> to avoid calling update_parent_effective_cpumask() for this special case.
> 
> Fixes: 0c7f293efc87 ("cgroup/cpuset: Add cpuset.cpus.exclusive.effective for v2")
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
>  kernel/cgroup/cpuset.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index bf149246e001..d993e058a663 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -1843,7 +1843,7 @@ static int update_parent_effective_cpumask(struct cpuset *cs, int cmd,
>  			if (is_partition_valid(cs))
>  				adding = cpumask_and(tmp->addmask,
>  						xcpus, parent->effective_xcpus);
> -		} else if (is_partition_invalid(cs) &&
> +		} else if (is_partition_invalid(cs) && !cpumask_empty(xcpus) &&
>  			   cpumask_subset(xcpus, parent->effective_xcpus)) {
>  			struct cgroup_subsys_state *css;
>  			struct cpuset *child;

This path looks good to me.

However, I found the update_parent_effective_cpumask function a bit difficult to follow due to its
complexity.

To improve readability, could we refactor the partcmd_enable, partcmd_disable, partcmd_update and
partcmd_invalidate logic into separate, well-defined function blocks?  I'd be happy to take
ownership of this refactoring work if you agree with the approach.

Best regards,
Ridong

> @@ -3870,9 +3870,10 @@ static void cpuset_hotplug_update_tasks(struct cpuset *cs, struct tmpmasks *tmp)
>  		partcmd = partcmd_invalidate;
>  	/*
>  	 * On the other hand, an invalid partition root may be transitioned
> -	 * back to a regular one.
> +	 * back to a regular one with a non-empty effective xcpus.
>  	 */
> -	else if (is_partition_valid(parent) && is_partition_invalid(cs))
> +	else if (is_partition_valid(parent) && is_partition_invalid(cs) &&
> +		 !cpumask_empty(cs->effective_xcpus))
>  		partcmd = partcmd_update;
>  
>  	if (partcmd >= 0) {
Re: [PATCH 2/3] cgroup/cpuset.c: Fix a partition error with CPU hotplug
Posted by Waiman Long 1 month, 4 weeks ago
On 8/6/25 10:44 PM, Chen Ridong wrote:
>
> On 2025/8/7 1:24, Waiman Long wrote:
>> It was found during testing that an invalid leaf partition with an
>> empty effective exclusive CPU list can become a valid empty partition
>> with no CPU afer an offline/online operation of an unrelated CPU. An
>> empty partition root is allowed in the special case that it has no
>> task in its cgroup and has distributed out all its CPUs to its child
>> partitions. That is certainly not the case here.
>>
>> The problem is in the cpumask_subsets() test in the hotplug case
>> (update with no new mask) of update_parent_effective_cpumask() as it
>> also returns true if the effective exclusive CPU list is empty. Fix that
>> by addding the cpumask_empty() test to root out this exception case.
>> Also add the cpumask_empty() test in cpuset_hotplug_update_tasks()
>> to avoid calling update_parent_effective_cpumask() for this special case.
>>
>> Fixes: 0c7f293efc87 ("cgroup/cpuset: Add cpuset.cpus.exclusive.effective for v2")
>> Signed-off-by: Waiman Long <longman@redhat.com>
>> ---
>>   kernel/cgroup/cpuset.c | 7 ++++---
>>   1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
>> index bf149246e001..d993e058a663 100644
>> --- a/kernel/cgroup/cpuset.c
>> +++ b/kernel/cgroup/cpuset.c
>> @@ -1843,7 +1843,7 @@ static int update_parent_effective_cpumask(struct cpuset *cs, int cmd,
>>   			if (is_partition_valid(cs))
>>   				adding = cpumask_and(tmp->addmask,
>>   						xcpus, parent->effective_xcpus);
>> -		} else if (is_partition_invalid(cs) &&
>> +		} else if (is_partition_invalid(cs) && !cpumask_empty(xcpus) &&
>>   			   cpumask_subset(xcpus, parent->effective_xcpus)) {
>>   			struct cgroup_subsys_state *css;
>>   			struct cpuset *child;
> This path looks good to me.
>
> However, I found the update_parent_effective_cpumask function a bit difficult to follow due to its
> complexity.
>
> To improve readability, could we refactor the partcmd_enable, partcmd_disable, partcmd_update and
> partcmd_invalidate logic into separate, well-defined function blocks?  I'd be happy to take
> ownership of this refactoring work if you agree with the approach.

I agree that the code can be a bit hard to read. You are more than 
welcome to improve the readability of the code if you have time.

Cheers,
Longman
Re: [PATCH 2/3] cgroup/cpuset.c: Fix a partition error with CPU hotplug
Posted by Waiman Long 1 month, 4 weeks ago
On 8/6/25 1:24 PM, Waiman Long wrote:
> It was found during testing that an invalid leaf partition with an
> empty effective exclusive CPU list can become a valid empty partition
> with no CPU afer an offline/online operation of an unrelated CPU. An
> empty partition root is allowed in the special case that it has no
> task in its cgroup and has distributed out all its CPUs to its child
> partitions. That is certainly not the case here.
>
> The problem is in the cpumask_subsets() test in the hotplug case
> (update with no new mask) of update_parent_effective_cpumask() as it
> also returns true if the effective exclusive CPU list is empty. Fix that
> by addding the cpumask_empty() test to root out this exception case.
> Also add the cpumask_empty() test in cpuset_hotplug_update_tasks()
> to avoid calling update_parent_effective_cpumask() for this special case.
>
> Fixes: 0c7f293efc87 ("cgroup/cpuset: Add cpuset.cpus.exclusive.effective for v2")
> Signed-off-by: Waiman Long <longman@redhat.com>

Oh, forget to remove the unneeded ".c" from the patch title.

-Longman

> ---
>   kernel/cgroup/cpuset.c | 7 ++++---
>   1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index bf149246e001..d993e058a663 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -1843,7 +1843,7 @@ static int update_parent_effective_cpumask(struct cpuset *cs, int cmd,
>   			if (is_partition_valid(cs))
>   				adding = cpumask_and(tmp->addmask,
>   						xcpus, parent->effective_xcpus);
> -		} else if (is_partition_invalid(cs) &&
> +		} else if (is_partition_invalid(cs) && !cpumask_empty(xcpus) &&
>   			   cpumask_subset(xcpus, parent->effective_xcpus)) {
>   			struct cgroup_subsys_state *css;
>   			struct cpuset *child;
> @@ -3870,9 +3870,10 @@ static void cpuset_hotplug_update_tasks(struct cpuset *cs, struct tmpmasks *tmp)
>   		partcmd = partcmd_invalidate;
>   	/*
>   	 * On the other hand, an invalid partition root may be transitioned
> -	 * back to a regular one.
> +	 * back to a regular one with a non-empty effective xcpus.
>   	 */
> -	else if (is_partition_valid(parent) && is_partition_invalid(cs))
> +	else if (is_partition_valid(parent) && is_partition_invalid(cs) &&
> +		 !cpumask_empty(cs->effective_xcpus))
>   		partcmd = partcmd_update;
>   
>   	if (partcmd >= 0) {