[PATCH] sched/fair: Fix ksmd and kthreadd running on isolated CPU0 on arm64 systems

There is a newer version of this series
Posted by wujing 1 year, 1 month ago
This bug can be reproduced on Kunpeng arm64 and Phytium arm physical machines,
as well as in virtual machine environments, based on the linux-4.19.y stable
branch:
1. Check the number of CPUs on the system:
   nproc --all
   96

2. Add the parameter isolcpus=0-85 to the kernel command line in the grub
   configuration, update grub, and reboot.

3. Check the ksmd process:

   ps aux | grep -i ksmd
   root      502  0.0  0.0      0     0 ?        S    10:00   0:00 [ksmd]

   ps -o pid,psr,comm -p 502
   PID PSR COMMAND
   502   0 ksmd

4. Check the kthreadd process:

   ps aux | grep -i kthreadd
   root        2  0.0  0.0      0     0 ?        S    10:00   0:00 [kthreadd]

   ps -o pid,psr,comm -p 2
   PID PSR COMMAND
     2   0 kthreadd

The output above shows that both ksmd and kthreadd are still running on
CPU0, which is unexpected because CPU0 is within the isolated range 0-85.
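
The manual check in steps 3 and 4 could be automated along the following
lines. This is only a user-space sketch and not part of the patch: the names
`cpu_in_list` and `last_cpu_from_stat` are made up for illustration.
`cpu_in_list` tests whether a CPU number falls inside a kernel-style CPU list
such as `0-85`, and `last_cpu_from_stat` extracts field 39 (`processor`) from
a `/proc/<pid>/stat` line, which should correspond to the PSR column printed
by `ps`. Option flags in front of the list (for example a `domain,` prefix)
are assumed to be absent and are not handled.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return true if `cpu` appears in a CPU list such as
 * "0-85" or "0-3,8,10-11". Malformed input makes the function return false.
 */
bool cpu_in_list(const char *list, int cpu)
{
	const char *p = list;

	while (*p) {
		char *end;
		long lo, hi;

		lo = strtol(p, &end, 10);
		if (end == p)
			return false;	/* no number here: stop parsing */
		hi = lo;
		if (*end == '-') {	/* range "lo-hi" */
			p = end + 1;
			hi = strtol(p, &end, 10);
			if (end == p)
				return false;
		}
		if (cpu >= lo && cpu <= hi)
			return true;
		p = end;
		while (*p == ',' || *p == ' ' || *p == '\n')
			p++;		/* skip separators */
	}
	return false;
}

/*
 * Hypothetical helper: return field 39 (processor, the CPU the task last
 * ran on) from one /proc/<pid>/stat line, or -1 if the line is too short.
 * The comm field may contain spaces and parentheses, so counting starts
 * after the last ')'.
 */
int last_cpu_from_stat(const char *stat_line)
{
	const char *p = strrchr(stat_line, ')');
	int field = 2;			/* comm, ending at ')', is field 2 */

	if (!p)
		return -1;
	p++;
	while (*p) {
		while (*p == ' ')
			p++;
		if (!*p)
			break;
		if (++field == 39)
			return atoi(p);
		while (*p && *p != ' ')
			p++;
	}
	return -1;
}
```

With the values from this report, `cpu_in_list("0-85", 0)` is true, so a task
whose stat line yields CPU 0 last ran on a CPU from the isolated set.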

Signed-off-by: wujing <realwujing@qq.com>
Signed-off-by: QiLiang Yuan <yuanql9@chinatelecom.cn>
---
 kernel/sched/fair.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 0950cabfc1d0..454021ff70a1 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6211,7 +6211,7 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
 
 	this_sd = rcu_dereference(*this_cpu_ptr(&sd_llc));
 	if (!this_sd)
-		return -1;
+		return housekeeping_any_cpu(HK_FLAG_DOMAIN);
 
 	/*
 	 * Due to large variance we need a large fuzz factor; hackbench in
-- 
2.39.5
Re: [PATCH] sched/fair: Fix ksmd and kthreadd running on isolated CPU0 on arm64 systems
Posted by Peter Zijlstra 1 year, 1 month ago
Your subject explicitly mentions arm64, however there is absolutely
nothing arm64 specific to this patch, as such the subject is just plain
wrong.

On Mon, Jan 06, 2025 at 05:04:03PM +0800, wujing wrote:
> This bug can be reproduced on Kunpeng arm64 and Phytium arm physical machines,
> as well as in virtual machine environments, based on the linux-4.19.y stable

Development does not happen on stable branches.
Re: [PATCH] sched/fair: Fix ksmd and kthreadd running on isolated CPU0 on arm64 systems
Posted by wujing 1 year, 1 month ago
This issue does not occur on x86 because x86 does not use the
`select_idle_cpu` function for core selection; it uses the `select_idle_smt`
function instead. However, x86 still has a problem where isolated cores are
not properly excluded. My other commit,
`sched/fair: Correct CPU selection from isolated domain`, addresses the
isolated-core issue on x86.

> Your subject explicitly mentions arm64, however there is absolutely
> nothing arm64 specific to this patch, as such the subject is just plain
> wrong.
Re: [PATCH] sched/fair: Fix ksmd and kthreadd running on isolated CPU0 on arm64 systems
Posted by wujing 1 year, 1 month ago
We have checked many x86 machines in the cluster, and this issue does not
occur on any of them.

> Your subject explicitly mentions arm64, however there is absolutely
> nothing arm64 specific to this patch, as such the subject is just plain
> wrong.