I got the following warning while running a stress test:
jump label: negative count!
WARNING: CPU: 3 PID: 38 at kernel/jump_label.c:263 static_key_slow_try_dec+0x9d/0xb0
Call Trace:
<TASK>
__static_key_slow_dec_cpuslocked+0x16/0x70
sched_cpu_deactivate+0x26e/0x2a0
cpuhp_invoke_callback+0x3ad/0x10d0
cpuhp_thread_fun+0x3f5/0x680
smpboot_thread_fn+0x56d/0x8d0
kthread+0x309/0x400
ret_from_fork+0x41/0x70
ret_from_fork_asm+0x1b/0x30
</TASK>
Becaus when cpuset_cpu_inactive() fails in sched_cpu_deactivate(),
the cpu offline failed, but sched_smt_present is decreased before
calling sched_cpu_deactivate, it leads unbalance dec/inc, so fix
it by increasing sched_smt_present in the error path.
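To illustrate, here is a simplified sketch of the ordering in
sched_cpu_deactivate() (paraphrased for illustration, not the exact
upstream code):

int sched_cpu_deactivate(unsigned int cpu)
{
        int ret;
        ...
#ifdef CONFIG_SCHED_SMT
        /*
         * The key is decremented here, before cpuset_cpu_inactive()
         * below has had a chance to fail.
         */
        if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
                static_branch_dec_cpuslocked(&sched_smt_present);
#endif
        ...
        ret = cpuset_cpu_inactive(cpu);
        if (ret) {
                /*
                 * Offline is aborted here: without the fix, the
                 * decrement above is never paired with an increment.
                 */
                balance_push_set(cpu, false);
                set_cpu_active(cpu, true);
                return ret;
        }
        ...
}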
Fixes: c5511d03ec09 ("sched/smt: Make sched_smt_present track topology")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
---
kernel/sched/core.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 2299a5cfbfb9..b7ef2df36b75 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -9745,6 +9745,10 @@ int sched_cpu_deactivate(unsigned int cpu)
sched_update_numa(cpu, false);
ret = cpuset_cpu_inactive(cpu);
if (ret) {
+#ifdef CONFIG_SCHED_SMT
+ if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
+ static_branch_inc_cpuslocked(&sched_smt_present);
+#endif
balance_push_set(cpu, false);
set_cpu_active(cpu, true);
sched_update_numa(cpu, true);
--
2.25.1
Hi Yingliang,
On 2023-09-02 at 15:46:09 +0800, Yang Yingliang wrote:
> I got the following warning while running a stress test:
>
May I know if the test runs many deadline tasks while offlining CPUs,
so as to trigger the failing case where removing one CPU gets us below
the total allocated bandwidth?
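For context, the failure I have in mind comes from the deadline
bandwidth check inside cpuset_cpu_inactive(). A simplified paraphrase
(based on kernels of this era; not guaranteed to match the exact
upstream code) looks roughly like:

static int cpuset_cpu_inactive(unsigned int cpu)
{
        if (!cpuhp_tasks_frozen) {
                /*
                 * Refuse the offline if the remaining CPUs cannot
                 * hold the bandwidth already reserved by deadline
                 * tasks; this is what makes sched_cpu_deactivate()
                 * bail out on its error path.
                 */
                int ret = dl_bw_check_overflow(cpu);

                if (ret)
                        return ret;
                cpuset_update_active_cpus();
        } else {
                num_cpus_frozen++;
                partition_sched_domains(1, NULL, NULL);
        }
        return 0;
}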
> jump label: negative count!
> WARNING: CPU: 3 PID: 38 at kernel/jump_label.c:263 static_key_slow_try_dec+0x9d/0xb0
> Call Trace:
> <TASK>
> __static_key_slow_dec_cpuslocked+0x16/0x70
> sched_cpu_deactivate+0x26e/0x2a0
> cpuhp_invoke_callback+0x3ad/0x10d0
> cpuhp_thread_fun+0x3f5/0x680
> smpboot_thread_fn+0x56d/0x8d0
> kthread+0x309/0x400
> ret_from_fork+0x41/0x70
> ret_from_fork_asm+0x1b/0x30
> </TASK>
>
> Becaus when cpuset_cpu_inactive() fails in sched_cpu_deactivate(),
s/Becaus/Because/
> the cpu offline failed, but sched_smt_present is decreased before
> calling sched_cpu_deactivate, it leads unbalance dec/inc, so fix
s/calling sched_cpu_deactivate/calling cpuset_cpu_inactive() ?
> it by increasing sched_smt_present in the error path.
>
> Fixes: c5511d03ec09 ("sched/smt: Make sched_smt_present track topology")
> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
> ---
> kernel/sched/core.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 2299a5cfbfb9..b7ef2df36b75 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -9745,6 +9745,10 @@ int sched_cpu_deactivate(unsigned int cpu)
> sched_update_numa(cpu, false);
> ret = cpuset_cpu_inactive(cpu);
> if (ret) {
> +#ifdef CONFIG_SCHED_SMT
> + if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
> + static_branch_inc_cpuslocked(&sched_smt_present);
> +#endif
While checking the code, it seems that the core scheduling also missed
the error path; maybe we should also invoke sched_core_cpu_starting() to
restore the context. I'll take a look.
Other than the above typo, it looks good to me,
Reviewed-by: Chen Yu <yu.c.chen@intel.com>
thanks,
Chenyu
On Mon, 2023-09-04 at 10:26 +0800, Chen Yu wrote:
> Hi Yingliang,
>
> On 2023-09-02 at 15:46:09 +0800, Yang Yingliang wrote:
> > I got the following warning while running a stress test:
> >
>
> May I know if the test runs many deadline tasks while offlining CPUs,
> so as to trigger the failing case where removing one CPU gets us below
> the total allocated bandwidth?
>
> > jump label: negative count!
> > WARNING: CPU: 3 PID: 38 at kernel/jump_label.c:263 static_key_slow_try_dec+0x9d/0xb0
> > Call Trace:
> > <TASK>
> > __static_key_slow_dec_cpuslocked+0x16/0x70
> > sched_cpu_deactivate+0x26e/0x2a0
> > cpuhp_invoke_callback+0x3ad/0x10d0
> > cpuhp_thread_fun+0x3f5/0x680
> > smpboot_thread_fn+0x56d/0x8d0
> > kthread+0x309/0x400
> > ret_from_fork+0x41/0x70
> > ret_from_fork_asm+0x1b/0x30
> > </TASK>
> >
> > Becaus when cpuset_cpu_inactive() fails in sched_cpu_deactivate(),
>
> s/Becaus/Because/
>
> > the cpu offline failed, but sched_smt_present is decreased before
s/decreased/decremented
to be precise
> > calling sched_cpu_deactivate, it leads unbalance dec/inc, so fix
s|leads unbalance dec/inc|leads to unbalanced dec/inc|
>
> s/calling sched_cpu_deactivate/calling cpuset_cpu_inactive() ?
> > it by increasing sched_smt_present in the error path.
s/increasing/incrementing
> >
> > Fixes: c5511d03ec09 ("sched/smt: Make sched_smt_present track topology")
> > Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
> > ---
> > kernel/sched/core.c | 4 ++++
> > 1 file changed, 4 insertions(+)
> >
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 2299a5cfbfb9..b7ef2df36b75 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -9745,6 +9745,10 @@ int sched_cpu_deactivate(unsigned int cpu)
> > sched_update_numa(cpu, false);
> > ret = cpuset_cpu_inactive(cpu);
> > if (ret) {
> > +#ifdef CONFIG_SCHED_SMT
> > + if (cpumask_weight(cpu_smt_mask(cpu)) == 2)
> > + static_branch_inc_cpuslocked(&sched_smt_present);
> > +#endif
>
> While checking the code, it seems that the core scheduling also missed
> the error path; maybe we should also invoke sched_core_cpu_starting() to
> restore the context. I'll take a look.
>
> Other than the above typo, it looks good to me,
>
> Reviewed-by: Chen Yu <yu.c.chen@intel.com>
>
Other than some minor nits on the commit log wording,
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
On 2023-09-04 at 10:26:20 +0800, Chen Yu wrote:
> Hi Yingliang,
>
> On 2023-09-02 at 15:46:09 +0800, Yang Yingliang wrote:
> >
> While checking the code, it seems that the core scheduling also missed
> the error path; maybe we should also invoke sched_core_cpu_starting() to
> restore the context. I'll take a look.

After checking the core scheduling code, there is no need to restore the
context 'broken' by sched_core_cpu_deactivate(). It just tries to find a
new leader if the offline CPU is the old leader, and switching to a new
leader should not have any impact even if the CPU hotplug offline fails.

thanks,
Chenyu
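For reference, a heavily simplified sketch of that leader handoff
(paraphrased; the real sched_core_cpu_deactivate() copies more state
than shown here):

static void sched_core_cpu_deactivate(unsigned int cpu)
{
        const struct cpumask *smt_mask = cpu_smt_mask(cpu);
        struct rq *rq = cpu_rq(cpu), *core_rq = NULL;
        int t;

        /* The outgoing CPU was not the leader: nothing to hand over. */
        if (rq->core != rq)
                return;

        /* Pick any other SMT sibling of this core as the new leader. */
        for_each_cpu(t, smt_mask) {
                if (t != cpu) {
                        core_rq = cpu_rq(t);
                        break;
                }
        }
        if (!core_rq)
                return;

        /* Move the shared core-scheduling state over to the new leader. */
        core_rq->core_task_seq = rq->core_task_seq;
        core_rq->core_pick_seq = rq->core_pick_seq;
        core_rq->core_cookie   = rq->core_cookie;

        for_each_cpu(t, smt_mask)
                cpu_rq(t)->core = core_rq;
}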