[PATCH] kthread: Unpark only parked kthreads (was Re: [syzbot] [wireguard?] WARNING in kthread_unpark (2))

Frederic Weisbecker posted 1 patch 2 months, 2 weeks ago
kernel/kthread.c | 2 ++
1 file changed, 2 insertions(+)
[PATCH] kthread: Unpark only parked kthreads (was Re: [syzbot] [wireguard?] WARNING in kthread_unpark (2))
Posted by Frederic Weisbecker 2 months, 2 weeks ago
On Wed, Jul 31, 2024 at 04:29:02AM -0700, syzbot wrote:
> Hello,
> 
> syzbot has tested the proposed patch and the reproducer did not trigger any issue:
> 
> Reported-by: syzbot+943d34fa3cf2191e3068@syzkaller.appspotmail.com
> Tested-by: syzbot+943d34fa3cf2191e3068@syzkaller.appspotmail.com
> 
> Tested on:
> 
> commit:         dc1c8034 minmax: simplify min()/max()/clamp() implemen..
> git tree:       git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> console output: https://syzkaller.appspot.com/x/log.txt?x=1264b511980000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=2258b49cd9b339fa
> dashboard link: https://syzkaller.appspot.com/bug?extid=943d34fa3cf2191e3068
> compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> patch:          https://syzkaller.appspot.com/x/patch.diff?x=10fe9911980000
> 
> Note: testing is done by a robot and is best-effort only.
> 

The problem is in the kthread code. kthread_stop() seems to assume that
the target is parked, and since kthread_stop() is seldom called on per-cpu
kthreads (smpboot_unregister_percpu_thread() doesn't have any user yet), this
went unnoticed until workqueue happened to do it.

Can you test the following?
---
From: Frederic Weisbecker <frederic@kernel.org>
Date: Tue, 10 Sep 2024 22:10:19 +0200
Subject: [PATCH] kthread: Unpark only parked kthreads

Calling into kthread unparking unconditionally is mostly harmless when
the kthread is already unparked. The wake up is then simply ignored
because the target is not in TASK_PARKED state.

However if the kthread is per CPU, the wake up is preceded by a call
to kthread_bind() which expects the task to be inactive and in
TASK_PARKED state, which obviously isn't the case if it is unparked.

As a result, calling kthread_stop() on an unparked per-cpu kthread
triggers such a warning:

	WARNING: CPU: 0 PID: 11 at kernel/kthread.c:525 __kthread_bind_mask kernel/kthread.c:525
	 <TASK>
	 kthread_stop+0x17a/0x630 kernel/kthread.c:707
	 destroy_workqueue+0x136/0xc40 kernel/workqueue.c:5810
	 wg_destruct+0x1e2/0x2e0 drivers/net/wireguard/device.c:257
	 netdev_run_todo+0xe1a/0x1000 net/core/dev.c:10693
	 default_device_exit_batch+0xa14/0xa90 net/core/dev.c:11769
	 ops_exit_list net/core/net_namespace.c:178 [inline]
	 cleanup_net+0x89d/0xcc0 net/core/net_namespace.c:640
	 process_one_work kernel/workqueue.c:3231 [inline]
	 process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3312
	 worker_thread+0x86d/0xd70 kernel/workqueue.c:3393
	 kthread+0x2f0/0x390 kernel/kthread.c:389
	 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
	 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
	 </TASK>

Fix this by skipping unnecessary unparking while stopping a kthread.

Reported-by: syzbot+943d34fa3cf2191e3068@syzkaller.appspotmail.com
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
 kernel/kthread.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index f7be976ff88a..5e2ba556aba8 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -623,6 +623,8 @@ void kthread_unpark(struct task_struct *k)
 {
 	struct kthread *kthread = to_kthread(k);
 
+	if (!test_bit(KTHREAD_SHOULD_PARK, &kthread->flags))
+		return;
 	/*
 	 * Newly created kthread was parked when the CPU was offline.
 	 * The binding was lost and we need to set it again.
-- 
2.46.0
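
For readers following the thread, the effect of the two added lines can be
sketched with a small user-space model. This is only an illustration under
stated assumptions: struct model_kthread, model_bind() and model_unpark() are
hypothetical names, the with_guard parameter exists only to compare behaviour
before and after the patch, and the bind_warnings counter stands in for the
WARNING raised from __kthread_bind_mask(). It is not kernel code.

```c
#include <stdbool.h>

/* Hypothetical user-space model; none of this is real kernel code. */
enum model_state { MODEL_RUNNING, MODEL_PARKED };

struct model_kthread {
	bool should_park;       /* stands in for the KTHREAD_SHOULD_PARK bit */
	bool is_per_cpu;        /* stands in for the KTHREAD_IS_PER_CPU bit */
	enum model_state state; /* MODEL_PARKED stands in for TASK_PARKED */
	int bind_warnings;      /* counts would-be WARNINGs from binding */
};

/* Binding expects a parked, inactive task; otherwise record a warning. */
static void model_bind(struct model_kthread *kt)
{
	if (kt->state != MODEL_PARKED)
		kt->bind_warnings++;
}

/*
 * Model of the unpark path. with_guard selects whether the early return
 * proposed in the patch is applied.
 */
static void model_unpark(struct model_kthread *kt, bool with_guard)
{
	/* The patch: nothing to do if the thread was never asked to park. */
	if (with_guard && !kt->should_park)
		return;

	/* Per-CPU threads are rebound before being woken up. */
	if (kt->is_per_cpu)
		model_bind(kt);

	kt->should_park = false;
	if (kt->state == MODEL_PARKED)
		kt->state = MODEL_RUNNING;
}
```

In this model, unparking a running per-CPU thread without the guard records one
warning, while the guarded version returns early and records none. A thread
that really is parked is still rebound and woken in both cases.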
Re: [PATCH] kthread: Unpark only parked kthreads (was Re: [syzbot] [wireguard?] WARNING in kthread_unpark (2))
Posted by Hillf Danton 2 months, 2 weeks ago
Test Frederic's idea.

#syz test upstream master

--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -623,6 +623,8 @@ void kthread_unpark(struct task_struct *
 {
 	struct kthread *kthread = to_kthread(k);
 
+	if (!test_bit(KTHREAD_SHOULD_PARK, &kthread->flags))
+		return;
 	/*
 	 * Newly created kthread was parked when the CPU was offline.
 	 * The binding was lost and we need to set it again.
--- l/drivers/input/misc/yealink.c
+++ y/drivers/input/misc/yealink.c
@@ -438,7 +438,7 @@ static void urb_irq_callback(struct urb
 
 	yealink_do_idle_tasks(yld);
 
-	if (!yld->shutdown) {
+	if (!yld->shutdown && status != -EPROTO) {
 		ret = usb_submit_urb(yld->urb_ctl, GFP_ATOMIC);
 		if (ret && ret != -EPERM)
 			dev_err(&yld->intf->dev,
@@ -460,13 +460,13 @@ static void urb_ctl_callback(struct urb
 	case CMD_KEYPRESS:
 	case CMD_SCANCODE:
 		/* ask for a response */
-		if (!yld->shutdown)
+		if (!yld->shutdown && status != -EPROTO)
 			ret = usb_submit_urb(yld->urb_irq, GFP_ATOMIC);
 		break;
 	default:
 		/* send new command */
 		yealink_do_idle_tasks(yld);
-		if (!yld->shutdown)
+		if (!yld->shutdown && status != -EPROTO)
 			ret = usb_submit_urb(yld->urb_ctl, GFP_ATOMIC);
 		break;
 	}
--
Re: [syzbot] [wireguard?] WARNING in kthread_unpark (2)
Posted by syzbot 2 months, 2 weeks ago
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-by: syzbot+943d34fa3cf2191e3068@syzkaller.appspotmail.com
Tested-by: syzbot+943d34fa3cf2191e3068@syzkaller.appspotmail.com

Tested on:

commit:         196145c6 Merge tag 'clk-fixes-for-linus' of git://git...
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=11ce10a9980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=9e236c2f9e028b26
dashboard link: https://syzkaller.appspot.com/bug?extid=943d34fa3cf2191e3068
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=126610a9980000

Note: testing is done by a robot and is best-effort only.
Re: [syzbot] [wireguard?] WARNING in kthread_unpark (2)
Posted by Frederic Weisbecker 2 months, 2 weeks ago
On Fri, Sep 13, 2024 at 05:38:02AM -0700, syzbot wrote:
> Hello,
> 
> syzbot has tested the proposed patch and the reproducer did not trigger any issue:
> 
> Reported-by: syzbot+943d34fa3cf2191e3068@syzkaller.appspotmail.com
> Tested-by: syzbot+943d34fa3cf2191e3068@syzkaller.appspotmail.com

Thanks! I'm cooking an updated patch with those tags.

> 
> Tested on:
> 
> commit:         196145c6 Merge tag 'clk-fixes-for-linus' of git://git...
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=11ce10a9980000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=9e236c2f9e028b26
> dashboard link: https://syzkaller.appspot.com/bug?extid=943d34fa3cf2191e3068
> compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> patch:          https://syzkaller.appspot.com/x/patch.diff?x=126610a9980000
> 
> Note: testing is done by a robot and is best-effort only.
Re: [PATCH] kthread: Unpark only parked kthreads (was Re: [syzbot] [wireguard?] WARNING in kthread_unpark (2))
Posted by Frederic Weisbecker 2 months, 2 weeks ago
On Fri, Sep 13, 2024 at 08:11:09PM +0800, Hillf Danton wrote:
> Test Frederic's idea.
> 
> #syz test upstream master

Thanks!

> 
> --- a/kernel/kthread.c
> +++ b/kernel/kthread.c
> @@ -623,6 +623,8 @@ void kthread_unpark(struct task_struct *
>  {
>  	struct kthread *kthread = to_kthread(k);
>  
> +	if (!test_bit(KTHREAD_SHOULD_PARK, &kthread->flags))
> +		return;
>  	/*
>  	 * Newly created kthread was parked when the CPU was offline.
>  	 * The binding was lost and we need to set it again.

But are the following bits deliberate?

> --- l/drivers/input/misc/yealink.c
> +++ y/drivers/input/misc/yealink.c
> @@ -438,7 +438,7 @@ static void urb_irq_callback(struct urb
>  
>  	yealink_do_idle_tasks(yld);
>  
> -	if (!yld->shutdown) {
> +	if (!yld->shutdown && status != -EPROTO) {
>  		ret = usb_submit_urb(yld->urb_ctl, GFP_ATOMIC);
>  		if (ret && ret != -EPERM)
>  			dev_err(&yld->intf->dev,
> @@ -460,13 +460,13 @@ static void urb_ctl_callback(struct urb
>  	case CMD_KEYPRESS:
>  	case CMD_SCANCODE:
>  		/* ask for a response */
> -		if (!yld->shutdown)
> +		if (!yld->shutdown && status != -EPROTO)
>  			ret = usb_submit_urb(yld->urb_irq, GFP_ATOMIC);
>  		break;
>  	default:
>  		/* send new command */
>  		yealink_do_idle_tasks(yld);
> -		if (!yld->shutdown)
> +		if (!yld->shutdown && status != -EPROTO)
>  			ret = usb_submit_urb(yld->urb_ctl, GFP_ATOMIC);
>  		break;
>  	}
> --
> 
Re: [PATCH] kthread: Unpark only parked kthreads (was Re: [syzbot] [wireguard?] WARNING in kthread_unpark (2))
Posted by Hillf Danton 2 months, 2 weeks ago
On Fri, 13 Sep 2024 14:31:52 +0200 Frederic Weisbecker <frederic@kernel.org>
> On Fri, Sep 13, 2024 at 08:11:09PM +0800, Hillf Danton wrote:
> 
> But are the following bits deliberate?
> 
They were added to avoid the RCU stalls [1,2], so that the test can pass and
earn a Tested-by.

[1] https://lore.kernel.org/lkml/000000000000e6ca5d0621ece2dc@google.com/
[2] https://lore.kernel.org/lkml/0000000000008de5720617f64aae@google.com/

> > --- l/drivers/input/misc/yealink.c
> > +++ y/drivers/input/misc/yealink.c
> > @@ -438,7 +438,7 @@ static void urb_irq_callback(struct urb
> >  
> >  	yealink_do_idle_tasks(yld);
> >  
> > -	if (!yld->shutdown) {
> > +	if (!yld->shutdown && status != -EPROTO) {
> >  		ret = usb_submit_urb(yld->urb_ctl, GFP_ATOMIC);
> >  		if (ret && ret != -EPERM)
> >  			dev_err(&yld->intf->dev,
> > @@ -460,13 +460,13 @@ static void urb_ctl_callback(struct urb
> >  	case CMD_KEYPRESS:
> >  	case CMD_SCANCODE:
> >  		/* ask for a response */
> > -		if (!yld->shutdown)
> > +		if (!yld->shutdown && status != -EPROTO)
> >  			ret = usb_submit_urb(yld->urb_irq, GFP_ATOMIC);
> >  		break;
> >  	default:
> >  		/* send new command */
> >  		yealink_do_idle_tasks(yld);
> > -		if (!yld->shutdown)
> > +		if (!yld->shutdown && status != -EPROTO)
> >  			ret = usb_submit_urb(yld->urb_ctl, GFP_ATOMIC);
> >  		break;
> >  	}
> > --
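
The condition added at each of the three call sites in the yealink hunks above
can be sketched as a single predicate. This is a hedged illustration only:
model_should_resubmit() is a hypothetical helper that does not exist in the
driver, and the reading that skipping the resubmission after -EPROTO is what
avoids the stall is an assumption drawn from the explanation in this message.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper mirroring the patched condition
 * "!yld->shutdown && status != -EPROTO".
 * Returns true when the URB completion handler would resubmit.
 */
static bool model_should_resubmit(bool shutdown, int status)
{
	/* Never resubmit once the device is shutting down. */
	if (shutdown)
		return false;

	/* Assumption: a protocol error is not retried, which breaks the loop. */
	return status != -EPROTO;
}
```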
Re: [PATCH] kthread: Unpark only parked kthreads (was Re: [syzbot] [wireguard?] WARNING in kthread_unpark (2))
Posted by Hillf Danton 2 months, 2 weeks ago
On Wed, 11 Sep 2024 14:04:28 +0200 Frederic Weisbecker <frederic@kernel.org>
> 
> Can you test the following?
> ---

syzbot needs a one-line command made of four parts; see the example below:
	two keywords 	tree  		tag or commit
	syz test	upstream	master

> From: Frederic Weisbecker <frederic@kernel.org>
> Date: Tue, 10 Sep 2024 22:10:19 +0200
> Subject: [PATCH] kthread: Unpark only parked kthreads
> 
> Calling into kthread unparking unconditionally is mostly harmless when
> the kthread is already unparked. The wake up is then simply ignored
> because the target is not in TASK_PARKED state.
> 
> However if the kthread is per CPU, the wake up is preceded by a call
> to kthread_bind() which expects the task to be inactive and in
> TASK_PARKED state, which obviously isn't the case if it is unparked.
> 
> As a result, calling kthread_stop() on an unparked per-cpu kthread
> triggers such a warning:
> 
> 	WARNING: CPU: 0 PID: 11 at kernel/kthread.c:525 __kthread_bind_mask kernel/kthread.c:525
> 	 <TASK>
> 	 kthread_stop+0x17a/0x630 kernel/kthread.c:707
> 	 destroy_workqueue+0x136/0xc40 kernel/workqueue.c:5810
> 	 wg_destruct+0x1e2/0x2e0 drivers/net/wireguard/device.c:257
> 	 netdev_run_todo+0xe1a/0x1000 net/core/dev.c:10693
> 	 default_device_exit_batch+0xa14/0xa90 net/core/dev.c:11769
> 	 ops_exit_list net/core/net_namespace.c:178 [inline]
> 	 cleanup_net+0x89d/0xcc0 net/core/net_namespace.c:640
> 	 process_one_work kernel/workqueue.c:3231 [inline]
> 	 process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3312
> 	 worker_thread+0x86d/0xd70 kernel/workqueue.c:3393
> 	 kthread+0x2f0/0x390 kernel/kthread.c:389
> 	 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
> 	 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
> 	 </TASK>
> 
> Fix this by skipping unnecessary unparking while stopping a kthread.
> 
> Reported-by: syzbot+943d34fa3cf2191e3068@syzkaller.appspotmail.com
> Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
> ---
>  kernel/kthread.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/kernel/kthread.c b/kernel/kthread.c
> index f7be976ff88a..5e2ba556aba8 100644
> --- a/kernel/kthread.c
> +++ b/kernel/kthread.c
> @@ -623,6 +623,8 @@ void kthread_unpark(struct task_struct *k)
>  {
>  	struct kthread *kthread = to_kthread(k);
>  
> +	if (!test_bit(KTHREAD_SHOULD_PARK, &kthread->flags))
> +		return;
>  	/*
>  	 * Newly created kthread was parked when the CPU was offline.
>  	 * The binding was lost and we need to set it again.
> -- 
> 2.46.0


#syz test upstream master

--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -623,6 +623,8 @@ void kthread_unpark(struct task_struct *
 {
 	struct kthread *kthread = to_kthread(k);
 
+	if (!test_bit(KTHREAD_SHOULD_PARK, &kthread->flags))
+		return;
 	/*
 	 * Newly created kthread was parked when the CPU was offline.
 	 * The binding was lost and we need to set it again.
--
Re: [syzbot] [wireguard?] WARNING in kthread_unpark (2)
Posted by syzbot 2 months, 2 weeks ago
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
INFO: rcu detected stall in corrupted

rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: {
 P5320
 } 3518 jiffies s: 7245 root: 0x0/T


Tested on:

commit:         77f58789 Merge tag 'arm-fixes-6.11-3' of git://git.ker..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1533b100580000
kernel config:  https://syzkaller.appspot.com/x/.config?x=9e236c2f9e028b26
dashboard link: https://syzkaller.appspot.com/bug?extid=943d34fa3cf2191e3068
compiler:       Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch:          https://syzkaller.appspot.com/x/patch.diff?x=1663e49f980000