[PATCH] net: usb: r8152: fix resume reset deadlock

Sergey Senozhatsky posted 1 patch 1 week, 2 days ago
There is a newer version of this series
[PATCH] net: usb: r8152: fix resume reset deadlock
Posted by Sergey Senozhatsky 1 week, 2 days ago
rtl8152 can trigger a device reset during resume, which
can result in a deadlock:

 **** DPM device timeout after 10 seconds; 15 seconds until panic ****
 Call Trace:
 <TASK>
 schedule+0x483/0x1370
 schedule_preempt_disabled+0x15/0x30
 __mutex_lock_common+0x1fd/0x470
 __rtl8152_set_mac_address+0x80/0x1f0
 dev_set_mac_address+0x7f/0x150
 rtl8152_post_reset+0x72/0x150
 usb_reset_device+0x1d0/0x220
 rtl8152_resume+0x99/0xc0
 usb_resume_interface+0x3e/0xc0
 usb_resume_both+0x104/0x150
 usb_resume+0x22/0x110

The problem is that rtl8152_resume() calls usb_reset_device()
while holding the tp->control mutex. The reset re-enters the
driver via rtl8152_post_reset(), and __rtl8152_set_mac_address()
then tries to acquire the same tp->control lock again.

Reset INACCESSIBLE device outside of tp->control mutex
scope to avoid recursive mutex_lock() deadlock.

Fixes: 4933b066fefb ("r8152: If inaccessible at resume time, issue a reset")
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
---
 drivers/net/usb/r8152.c | 27 ++++++++++++++-------------
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index 30f937527cd2..c4f4e6a35ff4 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -8538,19 +8538,6 @@ static int rtl8152_system_resume(struct r8152 *tp)
 		usb_submit_urb(tp->intr_urb, GFP_NOIO);
 	}
 
-	/* If the device is RTL8152_INACCESSIBLE here then we should do a
-	 * reset. This is important because the usb_lock_device_for_reset()
-	 * that happens as a result of usb_queue_reset_device() will silently
-	 * fail if the device was suspended or if too much time passed.
-	 *
-	 * NOTE: The device is locked here so we can directly do the reset.
-	 * We don't need usb_lock_device_for_reset() because that's just a
-	 * wrapper over device_lock() and device_resume() (which calls us)
-	 * does that for us.
-	 */
-	if (test_bit(RTL8152_INACCESSIBLE, &tp->flags))
-		usb_reset_device(tp->udev);
-
 	return 0;
 }
 
@@ -8661,6 +8648,7 @@ static int rtl8152_suspend(struct usb_interface *intf, pm_message_t message)
 static int rtl8152_resume(struct usb_interface *intf)
 {
 	struct r8152 *tp = usb_get_intfdata(intf);
+	bool system_resume = !test_bit(SELECTIVE_SUSPEND, &tp->flags);
 	int ret;
 
 	mutex_lock(&tp->control);
@@ -8674,6 +8662,19 @@ static int rtl8152_resume(struct usb_interface *intf)
 
 	mutex_unlock(&tp->control);
 
+	/* If the device is RTL8152_INACCESSIBLE here then we should do a
+	 * reset. This is important because the usb_lock_device_for_reset()
+	 * that happens as a result of usb_queue_reset_device() will silently
+	 * fail if the device was suspended or if too much time passed.
+	 *
+	 * NOTE: The device is locked here so we can directly do the reset.
+	 * We don't need usb_lock_device_for_reset() because that's just a
+	 * wrapper over device_lock() and device_resume() (which calls us)
+	 * does that for us.
+	 */
+	if (system_resume && test_bit(RTL8152_INACCESSIBLE, &tp->flags))
+		ret = usb_reset_device(tp->udev);
+
 	return ret;
 }
 
-- 
2.53.0.rc1.225.gd81095ad13-goog
Re: [PATCH] net: usb: r8152: fix resume reset deadlock
Posted by Doug Anderson 1 week, 2 days ago
Hi,

On Tue, Jan 27, 2026 at 11:02 PM Sergey Senozhatsky
<senozhatsky@chromium.org> wrote:
>
> rtl8152 can trigger a device reset during resume, which
> can result in a deadlock:
>
>  **** DPM device timeout after 10 seconds; 15 seconds until panic ****
>  Call Trace:
>  <TASK>
>  schedule+0x483/0x1370
>  schedule_preempt_disabled+0x15/0x30
>  __mutex_lock_common+0x1fd/0x470
>  __rtl8152_set_mac_address+0x80/0x1f0
>  dev_set_mac_address+0x7f/0x150
>  rtl8152_post_reset+0x72/0x150
>  usb_reset_device+0x1d0/0x220
>  rtl8152_resume+0x99/0xc0
>  usb_resume_interface+0x3e/0xc0
>  usb_resume_both+0x104/0x150
>  usb_resume+0x22/0x110
>
> The problem is that rtl8152_resume() calls usb_reset_device()
> while holding the tp->control mutex. The reset re-enters the
> driver via rtl8152_post_reset(), and __rtl8152_set_mac_address()
> then tries to acquire the same tp->control lock again.
>
> Reset INACCESSIBLE device outside of tp->control mutex
> scope to avoid recursive mutex_lock() deadlock.
>
> Fixes: 4933b066fefb ("r8152: If inaccessible at resume time, issue a reset")
> Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
> ---
>  drivers/net/usb/r8152.c | 27 ++++++++++++++-------------
>  1 file changed, 14 insertions(+), 13 deletions(-)

This is effectively v2 of:

https://lore.kernel.org/r/20241018141337.316807-1-danielgeorgem@chromium.org/

...and you've incorporated my feedback there. Thanks! :-)


> @@ -8674,6 +8662,19 @@ static int rtl8152_resume(struct usb_interface *intf)
>
>         mutex_unlock(&tp->control);
>
> +       /* If the device is RTL8152_INACCESSIBLE here then we should do a
> +        * reset. This is important because the usb_lock_device_for_reset()
> +        * that happens as a result of usb_queue_reset_device() will silently
> +        * fail if the device was suspended or if too much time passed.
> +        *
> +        * NOTE: The device is locked here so we can directly do the reset.
> +        * We don't need usb_lock_device_for_reset() because that's just a
> +        * wrapper over device_lock() and device_resume() (which calls us)
> +        * does that for us.
> +        */
> +       if (system_resume && test_bit(RTL8152_INACCESSIBLE, &tp->flags))
> +               ret = usb_reset_device(tp->udev);
> +
>         return ret;

Question when looking at the above again: have you thought about the
consequences of clobbering `ret` above? I guess it's fine since
rtl8152_system_resume() always returns 0, but it looks a little
awkward. It's been long enough since I thought through all this code
that I'm not 100% sure what it _should_ do if rtl8152_system_resume()
was ever changed to return an error. Shouldn't it honor the existing
error instead of trying to reset the device and clearing the error?

Also: I guess you've added the `system_resume` variable here, which is
different than the earlier patch. It seems fine to me, though maybe
you want to consistently use the `system_resume` variable earlier in
the function too?

In any case, both of the above are pretty nitty, so I'm OK with:

Reviewed-by: Douglas Anderson <dianders@chromium.org>
Re: [PATCH] net: usb: r8152: fix resume reset deadlock
Posted by Sergey Senozhatsky 1 week, 2 days ago
Hi Doug,

On (26/01/28 10:05), Doug Anderson wrote:
> > rtl8152 can trigger a device reset during resume, which
> > can result in a deadlock:
> >
> >  **** DPM device timeout after 10 seconds; 15 seconds until panic ****
> >  Call Trace:
> >  <TASK>
> >  schedule+0x483/0x1370
> >  schedule_preempt_disabled+0x15/0x30
> >  __mutex_lock_common+0x1fd/0x470
> >  __rtl8152_set_mac_address+0x80/0x1f0
> >  dev_set_mac_address+0x7f/0x150
> >  rtl8152_post_reset+0x72/0x150
> >  usb_reset_device+0x1d0/0x220
> >  rtl8152_resume+0x99/0xc0
> >  usb_resume_interface+0x3e/0xc0
> >  usb_resume_both+0x104/0x150
> >  usb_resume+0x22/0x110
> >
> > The problem is that rtl8152_resume() calls usb_reset_device()
> > while holding the tp->control mutex. The reset re-enters the
> > driver via rtl8152_post_reset(), and __rtl8152_set_mac_address()
> > then tries to acquire the same tp->control lock again.
> >
> > Reset INACCESSIBLE device outside of tp->control mutex
> > scope to avoid recursive mutex_lock() deadlock.
> >
> > Fixes: 4933b066fefb ("r8152: If inaccessible at resume time, issue a reset")
> > Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
> > ---
> >  drivers/net/usb/r8152.c | 27 ++++++++++++++-------------
> >  1 file changed, 14 insertions(+), 13 deletions(-)
> 
> This is effectively v2 of:
> 
> https://lore.kernel.org/r/20241018141337.316807-1-danielgeorgem@chromium.org/
> 
> ...and you've incorporated my feedback there. Thanks! :-)

Oh, nice :)

> > @@ -8674,6 +8662,19 @@ static int rtl8152_resume(struct usb_interface *intf)
> >
> >         mutex_unlock(&tp->control);
> >
> > +       /* If the device is RTL8152_INACCESSIBLE here then we should do a
> > +        * reset. This is important because the usb_lock_device_for_reset()
> > +        * that happens as a result of usb_queue_reset_device() will silently
> > +        * fail if the device was suspended or if too much time passed.
> > +        *
> > +        * NOTE: The device is locked here so we can directly do the reset.
> > +        * We don't need usb_lock_device_for_reset() because that's just a
> > +        * wrapper over device_lock() and device_resume() (which calls us)
> > +        * does that for us.
> > +        */
> > +       if (system_resume && test_bit(RTL8152_INACCESSIBLE, &tp->flags))
> > +               ret = usb_reset_device(tp->udev);
> > +
> >         return ret;
> 
> Question when looking at the above again: have you thought about the
> consequences of clobbering `ret` above? I guess it's fine since
> rtl8152_system_resume() always returns 0, but it looks a little
> awkward. It's been long enough since I thought through all this code
> that I'm not 100% sure what it _should_ do if rtl8152_system_resume()
> was ever changed to return an error. Shouldn't it honor the existing
> error instead of trying to reset the device and clearing the error?

Right... so that "ret" thing, I thought about it and in the end I
just decided that returning an actual device reset error from resume
is still better than "return 0 but device is inaccessible" ("mission
failed successfully" kind of thing).  I'm not entirely sure what
would be the best way to handle this.  Like you said, for the time
being, rtl8152_system_resume() always returns 0.  Do we expect this
to change in the future?  Probably not.  On the other hand, if the
RTL8152_INACCESSIBLE bit is not cleared then user-space will
figure it out eventually (ioctl calls will fail, etc).  So maybe I
can just keep the existing code and ignore the usb_reset_device()
return value.

> Also: I guess you've added the `system_resume` variable here, which is
> different than the earlier patch. It seems fine to me, though maybe
> you want to consistently use the `system_resume` variable earlier in
> the function too?

Sounds good!

> In any case, both of the above are pretty nitty, so I'm OK with:
> 
> Reviewed-by: Douglas Anderson <dianders@chromium.org>

Thanks!
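
The three ways of handling `ret` discussed in this thread can be
compared side by side in a small userspace model. This is only a
sketch: struct resume_model and the ret_* helpers are hypothetical
names, not driver code, and the system_resume condition from the patch
is left out for brevity.

```c
/* Userspace model of the return-value options; NOT kernel code. */
#include <errno.h>
#include <stdbool.h>

/* Hypothetical inputs standing in for driver state. */
struct resume_model {
	int resume_ret;		/* result of the resume step (always 0 today) */
	bool inaccessible;	/* models the RTL8152_INACCESSIBLE flag */
	int reset_ret;		/* what usb_reset_device() would return */
};

/* As posted: the reset result replaces whatever resume returned. */
static int ret_clobber(const struct resume_model *m)
{
	int ret = m->resume_ret;

	if (m->inaccessible)
		ret = m->reset_ret;
	return ret;
}

/* Review question: an earlier error wins and the reset is skipped. */
static int ret_honor_error(const struct resume_model *m)
{
	int ret = m->resume_ret;

	if (!ret && m->inaccessible)
		ret = m->reset_ret;
	return ret;
}

/* Alternative from the reply: reset, but ignore its result. */
static int ret_ignore_reset(const struct resume_model *m)
{
	return m->resume_ret;
}
```

With resume_ret = -EIO and a successful reset, only ret_clobber() hides
the earlier error. With resume_ret = 0 and a failed reset, only
ret_ignore_reset() reports success. As long as the resume step always
returns 0, ret_clobber() and ret_honor_error() behave the same.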