[PATCH v2 0/3] Take the scheduling domain into account in numa balancin

Chuyi Zhou posted 3 patches 1 year, 1 month ago
There is a newer version of this series
kernel/sched/fair.c | 26 +++++++++++++++++---------
1 file changed, 17 insertions(+), 9 deletions(-)
[PATCH v2 0/3] Take the scheduling domain into account in numa balancin
Posted by Chuyi Zhou 1 year, 1 month ago
This patchset tries to adjust the logic of handling isolate cpus in numa balancing.

patch#1: Clean up for task_numa_migrate().

patch#2: Skips the isolate cpus when gathering numa status and finding
idle cpus in update_numa_stats().

patch#3: Ensure that we do not select an isolated CPU in
task_numa_find_cpu(), even if it is present in the task's CPU mask.

v1 -> v2:
 - collect Prateek's reviewed-by and add the history of
   task_numa_migrate() to commit log.

Chuyi Zhou (3):
  sched/fair: Remove unused task_numa_migrate return value
  sched/fair: Ignore isolated cpus in update_numa_stat
  sched/fair: Ensure select housekeeping cpus in task_numa_find_cpu

 kernel/sched/fair.c | 26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)

-- 
2.20.1
Re: [PATCH v2 0/3] Take the scheduling domain into account in numa balancin
Posted by Peter Zijlstra 1 year, 1 month ago
On Fri, Jan 03, 2025 at 02:59:27PM +0800, Chuyi Zhou wrote:
> This patchset tries to adjust the logic of handling isolate cpus in numa balancing.
> 
> patch#1: Clean up for task_numa_migrate().
> 
> patch#2: Skips the isolate cpus when gathering numa status and finding
> idle cpus in update_numa_stats().
> 
> patch#3: Ensure that we do not select an isolated CPU in
> task_numa_find_cpu(), even if it is present in the task's CPU mask.

Your $subject and actual patches do not patch.

Your subject suggests you're taking the scheduling domains into account
for numa balancing, your actual patches are bunch of special case hacks
that totally ignore the actual sched domains :-(
Re: [PATCH v2 0/3] Take the scheduling domain into account in numa balancin
Posted by Chuyi Zhou 1 year, 1 month ago
Hello Peter,

在 2025/1/3 19:36, Peter Zijlstra 写道:
> Your $subject and actual patches do not patch.
> 
> Your subject suggests you're taking the scheduling domains into account
> for numa balancing, your actual patches are bunch of special case hacks
> that totally ignore the actual sched domains

The subject is indeed inappropriate, but the issues mentioned in this 
patchset still exist, right?

- We should not consider isolated CPUs in numa_stats.
- We should not select isolated CPUs as candidate CPUs.

The current patch handle the above cases with hacks because I thought 
this kind of change is minimal to fix this issue. Perhaps you have a 
better solution for this issue? If so, please let me know, and I am more 
than willing to continue addressing this problem.

Thanks.
Re: [PATCH v2 0/3] Take the scheduling domain into account in numa balancin
Posted by Peter Zijlstra 1 year, 1 month ago
On Fri, Jan 03, 2025 at 08:40:18PM +0800, Chuyi Zhou wrote:
> Hello Peter,
> 
> 在 2025/1/3 19:36, Peter Zijlstra 写道:
> > Your $subject and actual patches do not patch.
> > 
> > Your subject suggests you're taking the scheduling domains into account
> > for numa balancing, your actual patches are bunch of special case hacks
> > that totally ignore the actual sched domains
> 
> The subject is indeed inappropriate, but the issues mentioned in this
> patchset still exist, right?
> 
> - We should not consider isolated CPUs in numa_stats.
> - We should not select isolated CPUs as candidate CPUs.
> 
> The current patch handle the above cases with hacks because I thought this
> kind of change is minimal to fix this issue. Perhaps you have a better
> solution for this issue? If so, please let me know, and I am more than
> willing to continue addressing this problem.

I would much rather see you do what the subject claims :-)

At the very least you should mask the whole thing against rq->rd->span
if there is a rq->rd at all ofcourse. This is not just the iteration in
update_numa_stats() but also the for_each_node_state() iteration.

I'm not sure there's anything saner than:

	cpumask_intersects(rq->rd->span, cpumask_of_node(nid))

Re: [PATCH v2 0/3] Take the scheduling domain into account in numa balancin
Posted by Chuyi Zhou 1 year, 1 month ago

在 2025/1/7 18:48, Peter Zijlstra 写道:
> On Fri, Jan 03, 2025 at 08:40:18PM +0800, Chuyi Zhou wrote:
>> Hello Peter,
>>
>> 在 2025/1/3 19:36, Peter Zijlstra 写道:
>>> Your $subject and actual patches do not patch.
>>>
>>> Your subject suggests you're taking the scheduling domains into account
>>> for numa balancing, your actual patches are bunch of special case hacks
>>> that totally ignore the actual sched domains
>>
>> The subject is indeed inappropriate, but the issues mentioned in this
>> patchset still exist, right?
>>
>> - We should not consider isolated CPUs in numa_stats.
>> - We should not select isolated CPUs as candidate CPUs.
>>
>> The current patch handle the above cases with hacks because I thought this
>> kind of change is minimal to fix this issue. Perhaps you have a better
>> solution for this issue? If so, please let me know, and I am more than
>> willing to continue addressing this problem.
> 
> I would much rather see you do what the subject claims :-)
> 
> At the very least you should mask the whole thing against rq->rd->span
> if there is a rq->rd at all ofcourse. This is not just the iteration in
> update_numa_stats() but also the for_each_node_state() iteration.
> 
> I'm not sure there's anything saner than:
> 
> 	cpumask_intersects(rq->rd->span, cpumask_of_node(nid))
> 

Thank you for your suggestion. It will be implemented in the next version.

Thansk.