When CPU 1 is in the nohz_full state, a kworker on CPU 0 runs
sched_tick_remote() and takes the lock on CPU 1's rq. If
WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3) triggers while that lock
is held, the warning is printed with the console_sem semaphore held.
Meanwhile, a task on CPU 1's rq that is also printing cannot acquire
console_sem, so it joins the wait queue and sleeps in the
UNINTERRUPTIBLE state until console_sem is released. When the task on
CPU 0 releases console_sem, it wakes up that waiter. try_to_wake_up()
then tries to acquire the lock on CPU 1's rq, which CPU 0 already
holds, resulting in a deadlock.
The triggering scenario is as follows:

	CPU 0						CPU 1
	sched_tick_remote
	  WARN_ON_ONCE(delta > (u64)NSEC_PER_SEC * 3)

	  report_bug					con_write
	  printk

	  console_unlock
							do_con_write
							console_lock
							down(&console_sem)
							list_add_tail(&waiter.list, &sem->wait_list);
	  up(&console_sem)
	  wake_up_q(&wake_q)
	  try_to_wake_up
	  __task_rq_lock
	  _raw_spin_lock

Fix this by deferring all printk console output while the rq lock is
held.
Fixes: d84b31313ef8 ("sched/isolation: Offload residual 1Hz scheduler tick")
Signed-off-by: Wang Tao <wangtao554@huawei.com>
---
kernel/sched/core.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 40f40f359c5d..fd2c83058ec2 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4091,6 +4091,7 @@ static void sched_tick_remote(struct work_struct *work)
 		goto out_requeue;
 
 	rq_lock_irq(rq, &rf);
+	printk_deferred_enter();
 	curr = rq->curr;
 	if (cpu_is_offline(cpu))
 		goto out_unlock;
@@ -4109,6 +4110,7 @@ static void sched_tick_remote(struct work_struct *work)
 
 	calc_load_nohz_remote(rq);
 out_unlock:
+	printk_deferred_exit();
 	rq_unlock_irq(rq, &rf);
 out_requeue:
 
--
2.34.1
On Mon, Sep 08, 2025 at 08:42:30AM +0000, Wang Tao wrote:
> [...]
>
> Fixes: d84b31313ef8 ("sched/isolation: Offload residual 1Hz scheduler tick")
> Signed-off-by: Wang Tao <wangtao554@huawei.com>
> ---
>  kernel/sched/core.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> [...]

What is the git commit id of this in Linus's tree?

thanks,

greg k-h
On 2025/9/11 20:20, Greg KH wrote:
> On Mon, Sep 08, 2025 at 08:42:30AM +0000, Wang Tao wrote:
>> [...]
>
> What is the git commit id of this in Linus's tree?
>
> thanks,
>
> greg k-h

Sorry, we initially discovered the issue while testing the stable
branch, and it seems that the mainline has the same problem, but I
haven't submitted a patch yet.

Thanks
Tao
On Thu, Sep 11, 2025 at 08:30:44PM +0800, wangtao (EQ) wrote:
> On 2025/9/11 20:20, Greg KH wrote:
> > On Mon, Sep 08, 2025 at 08:42:30AM +0000, Wang Tao wrote:
> > > [...]
>
> Sorry, we initially discovered the issue while testing the stable branch,
> and it seems that the mainline has the same problem, but I haven't submitted
> a patch yet.

That is required for us to be able to take a patch into the stable
trees, thanks.

greg k-h
On 2025/9/11 20:53, Greg KH wrote:
> On Thu, Sep 11, 2025 at 08:30:44PM +0800, wangtao (EQ) wrote:
>> Sorry, we initially discovered the issue while testing the stable branch,
>> and it seems that the mainline has the same problem, but I haven't submitted
>> a patch yet.
> That is required for us to be able to take a patch into the stable
> trees, thanks.
>
> greg k-h

I have resent this patch for Linus's tree. Please review it on the
following thread:

Link: https://lore.kernel.org/all/20250911124249.1154043-1-wangtao554@huawei.com/

Thanks.