The current implementation of available_idle_cpu() doesn't test
whether a possible cpu is offline. On s390 this dereferences a
NULL pointer in arch_vcpu_is_preempted() because lowcore is not
allocated for offline cpus. On x86, tracing also shows calls to
available_idle_cpu() after a cpu is disabled, but it looks like
this isn't causing any (obvious) issue. Nevertheless, add a check
and return early if the cpu isn't online.
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
---
kernel/sched/core.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 7019a40457a6..1d9b80411594 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7434,6 +7434,9 @@ int idle_cpu(int cpu)
  */
 int available_idle_cpu(int cpu)
 {
+	if (!cpu_online(cpu))
+		return 0;
+
 	if (!idle_cpu(cpu))
 		return 0;
--
2.40.1
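
For readers following along outside a kernel tree, the effect of the check can be sketched as a small user-space model. This is only an illustration, not kernel code: every model_* name, the MODEL_NR_CPUS constant, and the idea of representing an offline cpu as a NULL per-cpu pointer are assumptions chosen to mimic the s390 situation described above, where per-cpu data is not allocated for offline cpus.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4 /* hypothetical number of possible cpus */

/* Hypothetical per-cpu state that only exists while the cpu is online. */
struct model_percpu {
	bool idle;
	bool vcpu_preempted;
};

/* A NULL slot stands for an offline cpu (assumption of this model). */
static struct model_percpu *model_percpu_data[MODEL_NR_CPUS];

static bool model_cpu_online(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	return model_percpu_data[cpu] != NULL;
}

static void model_bring_online(int cpu, struct model_percpu *data)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	model_percpu_data[cpu] = data;
}

static void model_take_offline(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	model_percpu_data[cpu] = NULL;
}

/*
 * Same shape as the patched function: the online test comes first, so
 * the two checks below never dereference the NULL slot of an offline cpu.
 */
static int model_available_idle_cpu(int cpu)
{
	if (!model_cpu_online(cpu))
		return 0;

	if (!model_percpu_data[cpu]->idle)
		return 0;

	if (model_percpu_data[cpu]->vcpu_preempted)
		return 0;

	return 1;
}
```

Without the first test in model_available_idle_cpu(), a call for an offline cpu would dereference a NULL pointer in this model, which is the analogue of the crash the patch avoids.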

On 29/04/24 07:54, Sven Schnelle wrote:
> The current implementation of available_idle_cpu() doesn't test
> whether a possible cpu is offline. On s390 this dereferences a
> NULL pointer in arch_vcpu_is_preempted() because lowcore is not
> allocated for offline cpus. On x86, tracing also shows calls to
> available_idle_cpu() after a cpu is disabled, but it looks like
> this isn't causing any (obvious) issue. Nevertheless, add a check
> and return early if the cpu isn't online.
>
> Signed-off-by: Sven Schnelle <svens@linux.ibm.com>

So most of the uses of that function is in wakeup task placement.
o find_idlest_cpu() works on the sched_domain spans, so shouldn't deal with
  offline CPUs.
o select_idle_sibling() may issue an available_idle_cpu(prev) with an
  offline previous, which would trigger your issue.

Currently, even if select_idle_sibling() picks an offline CPU, this will
get corrected by select_fallback_rq() at the end of
select_task_rq(). However, it would make sense to realize @prev isn't a
suitable pick before making it to the fallback machinery, in which case
your patch makes sense beyond just fixing s390.

Reviewed-by: Valentin Schneider <vschneid@redhat.com>
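
The two-stage placement described in this review (an early pick that may consider @prev, followed by a fallback that corrects offline picks) can be sketched in user space as below. This is a deliberately simplified model and not the scheduler's actual algorithm: all sketch_* names, the linear scan, and the flag arrays are hypothetical, and the real select_idle_sibling() and select_fallback_rq() do considerably more.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS 4 /* hypothetical number of possible cpus */

static bool sketch_online[SKETCH_NR_CPUS];
static bool sketch_idle[SKETCH_NR_CPUS]; /* may be stale for offline cpus */

/* Models the patched behaviour: an offline cpu is never "available". */
static int sketch_available_idle_cpu(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	return sketch_online[cpu] && sketch_idle[cpu];
}

/*
 * Early stage: keep prev only if it is usable, otherwise scan for any
 * other available idle cpu, otherwise stay with target.
 */
static int sketch_select_idle(int prev, int target)
{
	if (sketch_available_idle_cpu(prev))
		return prev;

	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		if (sketch_available_idle_cpu(cpu))
			return cpu;
	}

	return target;
}

/* Late stage: safety net that replaces an offline pick; -1 if none online. */
static int sketch_select_fallback(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	if (sketch_online[cpu])
		return cpu;

	for (int i = 0; i < SKETCH_NR_CPUS; i++) {
		if (sketch_online[i])
			return i;
	}

	return -1;
}

static int sketch_select_task_cpu(int prev, int target)
{
	return sketch_select_fallback(sketch_select_idle(prev, target));
}
```

In this model an offline @prev is rejected in the early stage, so the scan can still find a genuinely idle cpu. If the early stage had returned the offline cpu, the fallback would only replace it with some online cpu, not necessarily an idle one.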

Valentin Schneider <vschneid@redhat.com> writes:

> On 29/04/24 07:54, Sven Schnelle wrote:
>> The current implementation of available_idle_cpu() doesn't test
>> whether a possible cpu is offline. On s390 this dereferences a
>> NULL pointer in arch_vcpu_is_preempted() because lowcore is not
>> allocated for offline cpus. On x86, tracing also shows calls to
>> available_idle_cpu() after a cpu is disabled, but it looks like
>> this isn't causing any (obvious) issue. Nevertheless, add a check
>> and return early if the cpu isn't online.
>>
>> Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
>
> So most of the uses of that function is in wakeup task placement.
> o find_idlest_cpu() works on the sched_domain spans, so shouldn't deal with
>   offline CPUs.
> o select_idle_sibling() may issue an available_idle_cpu(prev) with an
>   offline previous, which would trigger your issue.
>
> Currently, even if select_idle_sibling() picks an offline CPU, this will
> get corrected by select_fallback_rq() at the end of
> select_task_rq(). However, it would make sense to realize @prev isn't a
> suitable pick before making it to the fallback machinery, in which case
> your patch makes sense beyond just fixing s390.
>
> Reviewed-by: Valentin Schneider <vschneid@redhat.com>

Thanks for the review! Ingo/Peter, gentle ping, are you planning to take
this patch?

Sven Schnelle <svens@linux.ibm.com> writes:

> Valentin Schneider <vschneid@redhat.com> writes:
>
>> On 29/04/24 07:54, Sven Schnelle wrote:
>>> The current implementation of available_idle_cpu() doesn't test
>>> whether a possible cpu is offline. On s390 this dereferences a
>>> NULL pointer in arch_vcpu_is_preempted() because lowcore is not
>>> allocated for offline cpus. On x86, tracing also shows calls to
>>> available_idle_cpu() after a cpu is disabled, but it looks like
>>> this isn't causing any (obvious) issue. Nevertheless, add a check
>>> and return early if the cpu isn't online.
>>>
>>> Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
>>
>> So most of the uses of that function is in wakeup task placement.
>> o find_idlest_cpu() works on the sched_domain spans, so shouldn't deal with
>>   offline CPUs.
>> o select_idle_sibling() may issue an available_idle_cpu(prev) with an
>>   offline previous, which would trigger your issue.
>>
>> Currently, even if select_idle_sibling() picks an offline CPU, this will
>> get corrected by select_fallback_rq() at the end of
>> select_task_rq(). However, it would make sense to realize @prev isn't a
>> suitable pick before making it to the fallback machinery, in which case
>> your patch makes sense beyond just fixing s390.
>>
>> Reviewed-by: Valentin Schneider <vschneid@redhat.com>
>
> Thanks for the review! Ingo/Peter, gentle ping, are you planning to take
> this patch?

Ping?

Thanks,
Sven