[PATCH] sched/debug: Add cond_resched() to sched_debug_show()

Posted by Li Chen 3 months, 1 week ago
From: Li Chen <chenl311@chinatelecom.cn>

Running stress-ng on machines with many CPUs (e.g., ≥256 cores) can
spawn a very large number of processes/threads (over 700,000 in one
vmcore) and trigger the softlockup watchdog when
/sys/kernel/debug/sched/debug is read:
https://github.com/ColinIanKing/stress-ng/blob/V0.18.10/stress-cpu-sched.c#L860

To improve responsiveness during extensive debug dumps,
insert cond_resched() into sched_debug_show(). This allows the
kernel to periodically yield and remain responsive, similar to how
cond_resched() is used in other iteration-heavy code paths.
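
As a rough illustration of the pattern described above, here is a
userspace analogy, not kernel code. It assumes the dump is produced one
record per CPU, as the call trace suggests (seq_read_iter() calling
sched_debug_show(), which calls print_cpu()). The names dump_cpu() and
dump_all_cpus() are hypothetical, and sched_yield() merely stands in for
cond_resched().

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-in for print_cpu(): walks every task on one CPU. */
static unsigned long dump_cpu(int cpu, unsigned long tasks_per_cpu)
{
	unsigned long visited = 0;

	(void)cpu;
	for (unsigned long i = 0; i < tasks_per_cpu; i++)
		visited++;	/* the real code would format one line per task */
	return visited;
}

/*
 * Hypothetical stand-in for the loop that emits one record per CPU.
 * Stores the number of tasks visited in *total and returns the number
 * of yield points that were reached.
 */
static int dump_all_cpus(int nr_cpus, unsigned long tasks_per_cpu,
			 unsigned long *total)
{
	int yields = 0;

	assert(nr_cpus >= 0 && total != NULL);
	*total = 0;
	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		*total += dump_cpu(cpu, tasks_per_cpu);
		sched_yield();	/* userspace analogue of cond_resched() */
		yields++;
	}
	return yields;
}
```

In this sketch the yield point sits between per-CPU records, so the
walk over the tasks of a single CPU still runs to completion before
anything else gets a chance to run.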

Below is the soft lockup call trace:

[ 1996.543070] RIP: 0010:print_cpu+0x2a4/0x770
[ 1996.543084] Code: f6 ff ff 49 81 ff 58 fc c0 b6 74 69 49 8b 8f 58 03 00 00 48 8b 41 10 48 8d 51 10 48 8d 98 20 f5 ff ff 48 39 c2 74 37 8b 43 14 <39> c5 75 19 49 8b b5 10 0a 00 00 48 89 da 4c 89 e7 e8 d6 f1 ff ff
[ 1996.543087] RSP: 0018:ffffc900704a7d40 EFLAGS: 00000202
[ 1996.543090] RAX: 0000000000000038 RBX: ffff88b1b9073900 RCX: ffff88b326b86880
[ 1996.543093] RDX: ffff88b326b86890 RSI: ffffffffb6527fde RDI: ffff88d579bd7256
[ 1996.543096] RBP: 0000000000000000 R08: 0000000000000028 R09: ffff88d679bd722d
[ 1996.543098] R10: ffffffffffffffff R11: 0000000000000000 R12: ffff88d4662cf880
[ 1996.543099] R13: ffff889045e34d40 R14: ffff88b1b9073900 R15: ffff88b1b9074258
[ 1996.543101] FS:  00007f0d2a254000(0000) GS:ffff88e04f080000(0000) knlGS:0000000000000000
[ 1996.543104] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1996.543106] CR2: 0000000000b6d7c0 CR3: 00000033f894a000 CR4: 0000000000350ee0
[ 1996.543108] Call Trace:
[ 1996.543115]  <TASK>
[ 1996.543122]  sched_debug_show+0x13/0x30
[ 1996.543127]  seq_read_iter+0x122/0x470
[ 1996.543133]  ? restore_fpregs_from_user+0xa9/0x150
[ 1996.543139]  seq_read+0xaa/0xe0
[ 1996.543148]  full_proxy_read+0x59/0x80
[ 1996.543155]  vfs_read+0xa1/0x1c0
[ 1996.543164]  ksys_read+0x63/0xe0
[ 1996.543168]  do_syscall_64+0x55/0x100
[ 1996.543175]  entry_SYSCALL_64_after_hwframe+0x78/0xe2

The full soft lockup message is here:
https://gist.github.com/FirstLoveLife/73f2185bed83a5faf7f94af8032a527b

Signed-off-by: Li Chen <chenl311@chinatelecom.cn>
---
 kernel/sched/debug.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index 9d71baf080751..9dd444c604a8b 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -1065,6 +1065,7 @@ static int sched_debug_show(struct seq_file *m, void *v)
 	else
 		sched_debug_header(m);
 
+	cond_resched();
 	return 0;
 }
 
-- 
2.49.0

Re: [PATCH] sched/debug: Add cond_resched() to sched_debug_show()
Posted by Li Chen 2 months, 3 weeks ago
Gentle ping.

 ---- On Mon, 30 Jun 2025 14:08:26 +0800  Li Chen <me@linux.beauty> wrote --- 
Regards,

Li