[PATCH -next RFC 05/16] cpuset: factor out partition_update() function

Chen Ridong posted 16 patches 4 months, 2 weeks ago
There is a newer version of this series
[PATCH -next RFC 05/16] cpuset: factor out partition_update() function
Posted by Chen Ridong 4 months, 2 weeks ago
From: Chen Ridong <chenridong@huawei.com>

Extract the core partition update logic into a dedicated partition_update()
function. This refactoring centralizes updates to key cpuset data
structures including remote_sibling, effective_xcpus, partition_root_state,
and prs_err.

The function handles the complete partition update workflow:
- Adding and removing exclusive CPUs via partition_xcpus_add()/del()
- Managing remote sibling relationships
- Synchronizing effective exclusive CPUs mask
- Updating partition state and error status
- Triggering required system updates and workqueue synchronization

This creates a coherent interface for partition operations and establishes
a foundation for enhanced partition management while maintaining existing
remote partition behavior.

Signed-off-by: Chen Ridong <chenridong@huawei.com>
---
 kernel/cgroup/cpuset.c | 71 ++++++++++++++++++++++++++++--------------
 1 file changed, 47 insertions(+), 24 deletions(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 1944410ae872..0e2f95daf459 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -1587,6 +1587,49 @@ static void partition_disable(struct cpuset *cs, struct cpuset *parent,
 	cpuset_force_rebuild();
 }
 
+/**
+ * partition_update - Update an existing partition configuration
+ * @cs: The cpuset to update
+ * @prs: Partition root state (must be positive)
+ * @xcpus: New exclusive CPUs mask for the partition (NULL to keep current)
+ * @excpus: New effective exclusive CPUs mask
+ * @tmp: Temporary masks
+ *
+ * Updates partition-related fields. The tmp->addmask is the CPU mask that
+ * will be added to the subpartitions_cpus and removed from parent's
+ * effective_cpus, and the tmp->delmask vice versa.
+ */
+static void partition_update(struct cpuset *cs, int prs, struct cpumask *xcpus,
+				  struct cpumask *excpus, struct tmpmasks *tmp)
+{
+	bool isolcpus_updated;
+	bool excl_updated;
+	struct cpuset *parent;
+
+	lockdep_assert_held(&cpuset_mutex);
+	WARN_ON_ONCE(!cpuset_v2());
+	WARN_ON_ONCE(prs <= 0);
+
+	parent = is_remote_partition(cs) ? NULL : parent_cs(cs);
+	excl_updated = !cpumask_empty(tmp->addmask) ||
+			!cpumask_empty(tmp->delmask);
+
+	spin_lock_irq(&callback_lock);
+	isolcpus_updated = partition_xcpus_add(prs, parent, tmp->addmask);
+	isolcpus_updated |= partition_xcpus_del(prs, parent, tmp->delmask);
+	/*
+	 * Need to update effective_xcpus and exclusive_cpus now as
+	 * update_sibling_cpumasks() below may iterate back to the same cs.
+	 */
+	cpumask_copy(cs->effective_xcpus, excpus);
+	if (xcpus)
+		cpumask_copy(cs->exclusive_cpus, xcpus);
+	spin_unlock_irq(&callback_lock);
+	update_unbound_workqueue_cpumask(isolcpus_updated);
+	if (excl_updated)
+		cpuset_force_rebuild();
+}
+
 /*
  * remote_partition_enable - Enable current cpuset as a remote partition root
  * @cs: the cpuset to update
@@ -1669,10 +1712,6 @@ static void remote_partition_disable(struct cpuset *cs, struct tmpmasks *tmp)
 static void remote_cpus_update(struct cpuset *cs, struct cpumask *xcpus,
 			       struct cpumask *excpus, struct tmpmasks *tmp)
 {
-	bool adding, deleting;
-	int prs = cs->partition_root_state;
-	int isolcpus_updated = 0;
-
 	if (WARN_ON_ONCE(!is_remote_partition(cs)))
 		return;
 
@@ -1683,15 +1722,15 @@ static void remote_cpus_update(struct cpuset *cs, struct cpumask *xcpus,
 		goto invalidate;
 	}
 
-	adding   = cpumask_andnot(tmp->addmask, excpus, cs->effective_xcpus);
-	deleting = cpumask_andnot(tmp->delmask, cs->effective_xcpus, excpus);
+	cpumask_andnot(tmp->addmask, excpus, cs->effective_xcpus);
+	cpumask_andnot(tmp->delmask, cs->effective_xcpus, excpus);
 
 	/*
 	 * Additions of remote CPUs is only allowed if those CPUs are
 	 * not allocated to other partitions and there are effective_cpus
 	 * left in the top cpuset.
 	 */
-	if (adding) {
+	if (!cpumask_empty(tmp->addmask)) {
 		WARN_ON_ONCE(cpumask_intersects(tmp->addmask, subpartitions_cpus));
 		if (!capable(CAP_SYS_ADMIN))
 			cs->prs_err = PERR_ACCESS;
@@ -1702,23 +1741,7 @@ static void remote_cpus_update(struct cpuset *cs, struct cpumask *xcpus,
 			goto invalidate;
 	}
 
-	spin_lock_irq(&callback_lock);
-	if (adding)
-		isolcpus_updated += partition_xcpus_add(prs, NULL, tmp->addmask);
-	if (deleting)
-		isolcpus_updated += partition_xcpus_del(prs, NULL, tmp->delmask);
-	/*
-	 * Need to update effective_xcpus and exclusive_cpus now as
-	 * update_sibling_cpumasks() below may iterate back to the same cs.
-	 */
-	cpumask_copy(cs->effective_xcpus, excpus);
-	if (xcpus)
-		cpumask_copy(cs->exclusive_cpus, xcpus);
-	spin_unlock_irq(&callback_lock);
-	update_unbound_workqueue_cpumask(isolcpus_updated);
-	if (adding || deleting)
-		cpuset_force_rebuild();
-
+	partition_update(cs, cs->partition_root_state, xcpus, excpus, tmp);
 	/*
 	 * Propagate changes in top_cpuset's effective_cpus down the hierarchy.
 	 */
-- 
2.34.1
Re: [PATCH -next RFC 05/16] cpuset: factor out partition_update() function
Posted by Waiman Long 3 months, 3 weeks ago
On 9/28/25 3:12 AM, Chen Ridong wrote:
> From: Chen Ridong <chenridong@huawei.com>
>
> Extract the core partition update logic into a dedicated partition_update()
> function. This refactoring centralizes updates to key cpuset data
> structures including remote_sibling, effective_xcpus, partition_root_state,
> and prs_err.
>
> The function handles the complete partition update workflow:
> - Adding and removing exclusive CPUs via partition_xcpus_add()/del()
> - Managing remote sibling relationships
> - Synchronizing effective exclusive CPUs mask
> - Updating partition state and error status
> - Triggering required system updates and workqueue synchronization
>
> This creates a coherent interface for partition operations and establishes
> a foundation for enhanced partition management while maintaining existing
> remote partition behavior.
>
> Signed-off-by: Chen Ridong <chenridong@huawei.com>
> ---
>   kernel/cgroup/cpuset.c | 71 ++++++++++++++++++++++++++++--------------
>   1 file changed, 47 insertions(+), 24 deletions(-)
>
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index 1944410ae872..0e2f95daf459 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -1587,6 +1587,49 @@ static void partition_disable(struct cpuset *cs, struct cpuset *parent,
>   	cpuset_force_rebuild();
>   }
>   
> +/**
> + * partition_update - Update an existing partition configuration
> + * @cs: The cpuset to update
> + * @prs: Partition root state (must be positive)
> + * @xcpus: New exclusive CPUs mask for the partition (NULL to keep current)
> + * @excpus: New effective exclusive CPUs mask
> + * @tmp: Temporary masks
> + *
> + * Updates partition-related fields. The tmp->addmask is the CPU mask that
> + * will be added to the subpartitions_cpus and removed from parent's
> + * effective_cpus, and the tmp->delmask vice versa.
> + */
> +static void partition_update(struct cpuset *cs, int prs, struct cpumask *xcpus,
> +				  struct cpumask *excpus, struct tmpmasks *tmp)
> +{
> +	bool isolcpus_updated;
> +	bool excl_updated;
> +	struct cpuset *parent;
> +
> +	lockdep_assert_held(&cpuset_mutex);
> +	WARN_ON_ONCE(!cpuset_v2());
> +	WARN_ON_ONCE(prs <= 0);
> +
> +	parent = is_remote_partition(cs) ? NULL : parent_cs(cs);
> +	excl_updated = !cpumask_empty(tmp->addmask) ||
> +			!cpumask_empty(tmp->delmask);
> +
> +	spin_lock_irq(&callback_lock);
> +	isolcpus_updated = partition_xcpus_add(prs, parent, tmp->addmask);
> +	isolcpus_updated |= partition_xcpus_del(prs, parent, tmp->delmask);

The current partition_xcpus_add/del() functions assume the given cpumas 
is non-empty. In the new partition_update() helper, you can pass an 
empty cpumask to them. This will cause useless work to be done. Also 
isolcpus_update may not be correct because of that causing unneeded work 
to be done in the workqueue code.

-Longman
Re: [PATCH -next RFC 05/16] cpuset: factor out partition_update() function
Posted by Chen Ridong 3 months, 3 weeks ago

On 2025/10/20 10:43, Waiman Long wrote:
> On 9/28/25 3:12 AM, Chen Ridong wrote:
>> From: Chen Ridong <chenridong@huawei.com>
>>
>> Extract the core partition update logic into a dedicated partition_update()
>> function. This refactoring centralizes updates to key cpuset data
>> structures including remote_sibling, effective_xcpus, partition_root_state,
>> and prs_err.
>>
>> The function handles the complete partition update workflow:
>> - Adding and removing exclusive CPUs via partition_xcpus_add()/del()
>> - Managing remote sibling relationships
>> - Synchronizing effective exclusive CPUs mask
>> - Updating partition state and error status
>> - Triggering required system updates and workqueue synchronization
>>
>> This creates a coherent interface for partition operations and establishes
>> a foundation for enhanced partition management while maintaining existing
>> remote partition behavior.
>>
>> Signed-off-by: Chen Ridong <chenridong@huawei.com>
>> ---
>>   kernel/cgroup/cpuset.c | 71 ++++++++++++++++++++++++++++--------------
>>   1 file changed, 47 insertions(+), 24 deletions(-)
>>
>> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
>> index 1944410ae872..0e2f95daf459 100644
>> --- a/kernel/cgroup/cpuset.c
>> +++ b/kernel/cgroup/cpuset.c
>> @@ -1587,6 +1587,49 @@ static void partition_disable(struct cpuset *cs, struct cpuset *parent,
>>       cpuset_force_rebuild();
>>   }
>>   +/**
>> + * partition_update - Update an existing partition configuration
>> + * @cs: The cpuset to update
>> + * @prs: Partition root state (must be positive)
>> + * @xcpus: New exclusive CPUs mask for the partition (NULL to keep current)
>> + * @excpus: New effective exclusive CPUs mask
>> + * @tmp: Temporary masks
>> + *
>> + * Updates partition-related fields. The tmp->addmask is the CPU mask that
>> + * will be added to the subpartitions_cpus and removed from parent's
>> + * effective_cpus, and the tmp->delmask vice versa.
>> + */
>> +static void partition_update(struct cpuset *cs, int prs, struct cpumask *xcpus,
>> +                  struct cpumask *excpus, struct tmpmasks *tmp)
>> +{
>> +    bool isolcpus_updated;
>> +    bool excl_updated;
>> +    struct cpuset *parent;
>> +
>> +    lockdep_assert_held(&cpuset_mutex);
>> +    WARN_ON_ONCE(!cpuset_v2());
>> +    WARN_ON_ONCE(prs <= 0);
>> +
>> +    parent = is_remote_partition(cs) ? NULL : parent_cs(cs);
>> +    excl_updated = !cpumask_empty(tmp->addmask) ||
>> +            !cpumask_empty(tmp->delmask);
>> +
>> +    spin_lock_irq(&callback_lock);
>> +    isolcpus_updated = partition_xcpus_add(prs, parent, tmp->addmask);
>> +    isolcpus_updated |= partition_xcpus_del(prs, parent, tmp->delmask);
> 
> The current partition_xcpus_add/del() functions assume the given cpumas is non-empty. In the new
> partition_update() helper, you can pass an empty cpumask to them. This will cause useless work to be
> done. Also isolcpus_update may not be correct because of that causing unneeded work to be done in
> the workqueue code.
> 
> -Longman

Thank you, Longman.

I think we can add a check for empty cpumask inputs in partition_xcpus_add() and
partition_xcpus_del() to avoid unnecessary operations.

To clarify, do you mean that passing an empty cpumask to these functions might lead to incorrect
isolcpus_updated value? or are you referring to other potential logic issues?

-- 
Best regards,
Ridong

Re: [PATCH -next RFC 05/16] cpuset: factor out partition_update() function
Posted by Waiman Long 3 months, 3 weeks ago
On 10/20/25 4:05 AM, Chen Ridong wrote:
>
> On 2025/10/20 10:43, Waiman Long wrote:
>> On 9/28/25 3:12 AM, Chen Ridong wrote:
>>> From: Chen Ridong <chenridong@huawei.com>
>>>
>>> Extract the core partition update logic into a dedicated partition_update()
>>> function. This refactoring centralizes updates to key cpuset data
>>> structures including remote_sibling, effective_xcpus, partition_root_state,
>>> and prs_err.
>>>
>>> The function handles the complete partition update workflow:
>>> - Adding and removing exclusive CPUs via partition_xcpus_add()/del()
>>> - Managing remote sibling relationships
>>> - Synchronizing effective exclusive CPUs mask
>>> - Updating partition state and error status
>>> - Triggering required system updates and workqueue synchronization
>>>
>>> This creates a coherent interface for partition operations and establishes
>>> a foundation for enhanced partition management while maintaining existing
>>> remote partition behavior.
>>>
>>> Signed-off-by: Chen Ridong <chenridong@huawei.com>
>>> ---
>>>    kernel/cgroup/cpuset.c | 71 ++++++++++++++++++++++++++++--------------
>>>    1 file changed, 47 insertions(+), 24 deletions(-)
>>>
>>> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
>>> index 1944410ae872..0e2f95daf459 100644
>>> --- a/kernel/cgroup/cpuset.c
>>> +++ b/kernel/cgroup/cpuset.c
>>> @@ -1587,6 +1587,49 @@ static void partition_disable(struct cpuset *cs, struct cpuset *parent,
>>>        cpuset_force_rebuild();
>>>    }
>>>    +/**
>>> + * partition_update - Update an existing partition configuration
>>> + * @cs: The cpuset to update
>>> + * @prs: Partition root state (must be positive)
>>> + * @xcpus: New exclusive CPUs mask for the partition (NULL to keep current)
>>> + * @excpus: New effective exclusive CPUs mask
>>> + * @tmp: Temporary masks
>>> + *
>>> + * Updates partition-related fields. The tmp->addmask is the CPU mask that
>>> + * will be added to the subpartitions_cpus and removed from parent's
>>> + * effective_cpus, and the tmp->delmask vice versa.
>>> + */
>>> +static void partition_update(struct cpuset *cs, int prs, struct cpumask *xcpus,
>>> +                  struct cpumask *excpus, struct tmpmasks *tmp)
>>> +{
>>> +    bool isolcpus_updated;
>>> +    bool excl_updated;
>>> +    struct cpuset *parent;
>>> +
>>> +    lockdep_assert_held(&cpuset_mutex);
>>> +    WARN_ON_ONCE(!cpuset_v2());
>>> +    WARN_ON_ONCE(prs <= 0);
>>> +
>>> +    parent = is_remote_partition(cs) ? NULL : parent_cs(cs);
>>> +    excl_updated = !cpumask_empty(tmp->addmask) ||
>>> +            !cpumask_empty(tmp->delmask);
>>> +
>>> +    spin_lock_irq(&callback_lock);
>>> +    isolcpus_updated = partition_xcpus_add(prs, parent, tmp->addmask);
>>> +    isolcpus_updated |= partition_xcpus_del(prs, parent, tmp->delmask);
>> The current partition_xcpus_add/del() functions assume the given cpumas is non-empty. In the new
>> partition_update() helper, you can pass an empty cpumask to them. This will cause useless work to be
>> done. Also isolcpus_update may not be correct because of that causing unneeded work to be done in
>> the workqueue code.
>>
>> -Longman
> Thank you, Longman.
>
> I think we can add a check for empty cpumask inputs in partition_xcpus_add() and
> partition_xcpus_del() to avoid unnecessary operations.
Yes, you should do an empty cpumask check if empty cpumask can be passed 
to the helper.
>
> To clarify, do you mean that passing an empty cpumask to these functions might lead to incorrect
> isolcpus_updated value? or are you referring to other potential logic issues?

My main concern is doing non-useful work. However, I am sure if there 
will be an undesirable side effects as well.

Cheers,
Longman