[PATCH] media: uvcvideo: Fix deadlock if uvc_status_stop is called from async_ctrl.work

Sean Anderson posted 1 patch 4 weeks ago
There is a newer version of this series
drivers/media/usb/uvc/uvc_status.c | 25 ++++++++++++++++---------
1 file changed, 16 insertions(+), 9 deletions(-)
[PATCH] media: uvcvideo: Fix deadlock if uvc_status_stop is called from async_ctrl.work
Posted by Sean Anderson 4 weeks ago
If a UVC camera has an asynchronous control, uvc_status_stop may be
called from async_ctrl.work:

uvc_ctrl_status_event_work()
    uvc_ctrl_status_event()
        uvc_ctrl_clear_handle()
	    uvc_pm_put()
	        uvc_status_put()
		    uvc_status_stop()
		        cancel_work_sync()

This will cause a deadlock, since cancel_work_sync will wait for
uvc_ctrl_status_event_work to complete before returning.

Fix this by returning early from uvc_status_stop if we are currently in
the work function. flush_status now remains false until uvc_status_start
is called again, ensuring that uvc_ctrl_status_event_work won't resubmit
the URB.

Fixes: a32d9c41bdb8 ("media: uvcvideo: Make power management granular")
Closes: https://lore.kernel.org/all/6733bdfb-3e88-479f-8956-ab09c04c433e@linux.dev/
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
---

 drivers/media/usb/uvc/uvc_status.c | 25 ++++++++++++++++---------
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/drivers/media/usb/uvc/uvc_status.c b/drivers/media/usb/uvc/uvc_status.c
index 231cfee8e7c2c..2a23606c7f4c6 100644
--- a/drivers/media/usb/uvc/uvc_status.c
+++ b/drivers/media/usb/uvc/uvc_status.c
@@ -316,6 +316,14 @@ static int uvc_status_start(struct uvc_device *dev, gfp_t flags)
 	if (!dev->int_urb)
 		return 0;
 
+	/*
+	 * If the work called uvc_status_stop it may still be running. Wait for
+	 * it to finish before we submit the urb.
+	 */
+	cancel_work_sync(&dev->async_ctrl.work);
+
+	/* Clear the flush status if we were previously stopped */
+	smp_store_release(&dev->flush_status, false);
 	return usb_submit_urb(dev->int_urb, flags);
 }
 
@@ -336,6 +344,14 @@ static void uvc_status_stop(struct uvc_device *dev)
 	 */
 	smp_store_release(&dev->flush_status, true);
 
+	/*
+	 * We will deadlock if we are currently in the work function.
+	 * Fortunately, we know that the URB is already dead and that no
+	 * further work can be queued, so there's nothing left for us to do.
+	 */
+	if (current_work() == &w->work)
+		return;
+
 	/*
 	 * Cancel any pending asynchronous work. If any status event was queued,
 	 * process it synchronously.
@@ -354,15 +370,6 @@ static void uvc_status_stop(struct uvc_device *dev)
 	 */
 	if (cancel_work_sync(&w->work))
 		uvc_ctrl_status_event(w->chain, w->ctrl, w->data);
-
-	/*
-	 * From this point, there are no events on the queue and the status URB
-	 * is dead. No events will be queued until uvc_status_start() is called.
-	 * The barrier is needed to make sure that flush_status is visible to
-	 * uvc_ctrl_status_event_work() when uvc_status_start() will be called
-	 * again.
-	 */
-	smp_store_release(&dev->flush_status, false);
 }
 
 int uvc_status_resume(struct uvc_device *dev)
-- 
2.35.1.1320.gc452695387.dirty
Re: [PATCH] media: uvcvideo: Fix deadlock if uvc_status_stop is called from async_ctrl.work
Posted by Laurent Pinchart 3 weeks, 4 days ago
On Tue, Mar 10, 2026 at 06:22:59PM -0400, Sean Anderson wrote:
> If a UVC camera has an asynchronous control, uvc_status_stop may be
> called from async_ctrl.work:
> 
> uvc_ctrl_status_event_work()
>     uvc_ctrl_status_event()
>         uvc_ctrl_clear_handle()
> 	    uvc_pm_put()
> 	        uvc_status_put()
> 		    uvc_status_stop()
> 		        cancel_work_sync()
> 
> This will cause a deadlock, since cancel_work_sync will wait for
> uvc_ctrl_status_event_work to complete before returning.
> 
> Fix this by returning early from uvc_status_stop if we are currently in
> the work function. flush_status now remains false until uvc_status_start
> is called again, ensuring that uvc_ctrl_status_event_work won't resubmit
> the URB.
> 
> Fixes: a32d9c41bdb8 ("media: uvcvideo: Make power management granular")
> Closes: https://lore.kernel.org/all/6733bdfb-3e88-479f-8956-ab09c04c433e@linux.dev/
> Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
> ---
> 
>  drivers/media/usb/uvc/uvc_status.c | 25 ++++++++++++++++---------
>  1 file changed, 16 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/media/usb/uvc/uvc_status.c b/drivers/media/usb/uvc/uvc_status.c
> index 231cfee8e7c2c..2a23606c7f4c6 100644
> --- a/drivers/media/usb/uvc/uvc_status.c
> +++ b/drivers/media/usb/uvc/uvc_status.c
> @@ -316,6 +316,14 @@ static int uvc_status_start(struct uvc_device *dev, gfp_t flags)
>  	if (!dev->int_urb)
>  		return 0;
>  
> +	/*
> +	 * If the work called uvc_status_stop it may still be running. Wait for
> +	 * it to finish before we submit the urb.

I don't like this much. The code is becoming really convoluted and hard
to follow. Would there be a way to solve the issue with a broader
refactoring that would simplify the implementation ?

> +	 */
> +	cancel_work_sync(&dev->async_ctrl.work);
> +
> +	/* Clear the flush status if we were previously stopped */

s/stopped/stopped./

> +	smp_store_release(&dev->flush_status, false);

Add a blank line here.

>  	return usb_submit_urb(dev->int_urb, flags);
>  }
>  
> @@ -336,6 +344,14 @@ static void uvc_status_stop(struct uvc_device *dev)
>  	 */
>  	smp_store_release(&dev->flush_status, true);
>  
> +	/*
> +	 * We will deadlock if we are currently in the work function.
> +	 * Fortunately, we know that the URB is already dead and that no

The URB hasn't been killed, at has just completed and not been
resubmitted. I'd write

	/*
	 * If we are called from the event work function, the URB is guaranteed
	 * to not be in flight as it has completed and has not been resubmitted.
	 * There's no need to cancel the work (which would deadlock), or to kill
	 * the URB.
	 */

> +	 * further work can be queued, so there's nothing left for us to do.
> +	 */
> +	if (current_work() == &w->work)
> +		return;
> +
>  	/*
>  	 * Cancel any pending asynchronous work. If any status event was queued,
>  	 * process it synchronously.
> @@ -354,15 +370,6 @@ static void uvc_status_stop(struct uvc_device *dev)
>  	 */
>  	if (cancel_work_sync(&w->work))
>  		uvc_ctrl_status_event(w->chain, w->ctrl, w->data);
> -
> -	/*
> -	 * From this point, there are no events on the queue and the status URB
> -	 * is dead. No events will be queued until uvc_status_start() is called.
> -	 * The barrier is needed to make sure that flush_status is visible to
> -	 * uvc_ctrl_status_event_work() when uvc_status_start() will be called
> -	 * again.
> -	 */
> -	smp_store_release(&dev->flush_status, false);
>  }
>  
>  int uvc_status_resume(struct uvc_device *dev)

-- 
Regards,

Laurent Pinchart
Re: [PATCH] media: uvcvideo: Fix deadlock if uvc_status_stop is called from async_ctrl.work
Posted by Sean Anderson 3 weeks, 4 days ago
On 3/13/26 13:45, Laurent Pinchart wrote:
> On Tue, Mar 10, 2026 at 06:22:59PM -0400, Sean Anderson wrote:
>> If a UVC camera has an asynchronous control, uvc_status_stop may be
>> called from async_ctrl.work:
>> 
>> uvc_ctrl_status_event_work()
>>     uvc_ctrl_status_event()
>>         uvc_ctrl_clear_handle()
>> 	    uvc_pm_put()
>> 	        uvc_status_put()
>> 		    uvc_status_stop()
>> 		        cancel_work_sync()
>> 
>> This will cause a deadlock, since cancel_work_sync will wait for
>> uvc_ctrl_status_event_work to complete before returning.
>> 
>> Fix this by returning early from uvc_status_stop if we are currently in
>> the work function. flush_status now remains false until uvc_status_start
>> is called again, ensuring that uvc_ctrl_status_event_work won't resubmit
>> the URB.
>> 
>> Fixes: a32d9c41bdb8 ("media: uvcvideo: Make power management granular")
>> Closes: https://lore.kernel.org/all/6733bdfb-3e88-479f-8956-ab09c04c433e@linux.dev/
>> Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
>> ---
>> 
>>  drivers/media/usb/uvc/uvc_status.c | 25 ++++++++++++++++---------
>>  1 file changed, 16 insertions(+), 9 deletions(-)
>> 
>> diff --git a/drivers/media/usb/uvc/uvc_status.c b/drivers/media/usb/uvc/uvc_status.c
>> index 231cfee8e7c2c..2a23606c7f4c6 100644
>> --- a/drivers/media/usb/uvc/uvc_status.c
>> +++ b/drivers/media/usb/uvc/uvc_status.c
>> @@ -316,6 +316,14 @@ static int uvc_status_start(struct uvc_device *dev, gfp_t flags)
>>  	if (!dev->int_urb)
>>  		return 0;
>>  
>> +	/*
>> +	 * If the work called uvc_status_stop it may still be running. Wait for
>> +	 * it to finish before we submit the urb.
> 
> I don't like this much. The code is becoming really convoluted and hard
> to follow. Would there be a way to solve the issue with a broader
> refactoring that would simplify the implementation ?

I don't think so. The complexity is fundamental to how the URB and work
schedule each other and that there are multiple opportunities for one to
"resurrect" the other.

As noted on the other reply, I think this call can be loosened to a
flush_work.

>> +	 */
>> +	cancel_work_sync(&dev->async_ctrl.work);
>> +
>> +	/* Clear the flush status if we were previously stopped */
> 
> s/stopped/stopped./
> 
>> +	smp_store_release(&dev->flush_status, false);
> 
> Add a blank line here.
> 
>>  	return usb_submit_urb(dev->int_urb, flags);
>>  }
>>  
>> @@ -336,6 +344,14 @@ static void uvc_status_stop(struct uvc_device *dev)
>>  	 */
>>  	smp_store_release(&dev->flush_status, true);
>>  
>> +	/*
>> +	 * We will deadlock if we are currently in the work function.
>> +	 * Fortunately, we know that the URB is already dead and that no
> 
> The URB hasn't been killed, at has just completed and not been
> resubmitted.

That's why I said "dead" and not "killed" :)

> I'd write
> 
> 	/*
> 	 * If we are called from the event work function, the URB is guaranteed
> 	 * to not be in flight as it has completed and has not been resubmitted.
> 	 * There's no need to cancel the work (which would deadlock), or to kill
> 	 * the URB.
> 	 */

Fine by me

>> +	 * further work can be queued, so there's nothing left for us to do.
>> +	 */
>> +	if (current_work() == &w->work)
>> +		return;
>> +
>>  	/*
>>  	 * Cancel any pending asynchronous work. If any status event was queued,
>>  	 * process it synchronously.
>> @@ -354,15 +370,6 @@ static void uvc_status_stop(struct uvc_device *dev)
>>  	 */
>>  	if (cancel_work_sync(&w->work))
>>  		uvc_ctrl_status_event(w->chain, w->ctrl, w->data);
>> -
>> -	/*
>> -	 * From this point, there are no events on the queue and the status URB
>> -	 * is dead. No events will be queued until uvc_status_start() is called.
>> -	 * The barrier is needed to make sure that flush_status is visible to
>> -	 * uvc_ctrl_status_event_work() when uvc_status_start() will be called
>> -	 * again.
>> -	 */
>> -	smp_store_release(&dev->flush_status, false);
>>  }
>>  
>>  int uvc_status_resume(struct uvc_device *dev)
>
Re: [PATCH] media: uvcvideo: Fix deadlock if uvc_status_stop is called from async_ctrl.work
Posted by Ricardo Ribalda 3 weeks, 6 days ago
Hi Sean

Thanks for the patch. In your original report you mentioned that you
could repro with qv4l2 and changing a control.
May I assume that it was while the camera was not streaming and the
control was a "slow" controp (zoom, focus)... Can you give more some
more details?

I have tested your change with 3 threads running:

1 # while true; do yavta --capture=3 /dev/video0; sleep 1;done
2 #  while true; do yavta -w "0x00980900 64" /dev/video0; yavta -w
"0x00980900 0" /dev/video0; done
3 /sys/bus/usb/devices/3-6 # while true; do echo 1 > authorized; sleep
3; echo 0 > authorized; sleep 3 ; done

And I have not seen any freeze. So that is good :), But I also could
not repro without your patch :P.

Anyway I agree with the lockdep report that we introduced a bug when
uvc_status_stop can be called from the async work, So we must fix it.


On Tue, 10 Mar 2026 at 23:23, Sean Anderson <sean.anderson@linux.dev> wrote:
>
> If a UVC camera has an asynchronous control, uvc_status_stop may be
> called from async_ctrl.work:
>
> uvc_ctrl_status_event_work()
>     uvc_ctrl_status_event()
>         uvc_ctrl_clear_handle()
>             uvc_pm_put()
>                 uvc_status_put()
>                     uvc_status_stop()
>                         cancel_work_sync()
>
> This will cause a deadlock, since cancel_work_sync will wait for
> uvc_ctrl_status_event_work to complete before returning.
>
> Fix this by returning early from uvc_status_stop if we are currently in
> the work function. flush_status now remains false until uvc_status_start
> is called again, ensuring that uvc_ctrl_status_event_work won't resubmit
> the URB.
>
Tested-by: Ricardo Ribalda <ribalda@chromium.org>
Acked-by: Ricardo Ribalda <ribalda@chromium.org>

Your patch is very similar to what I sent some time ago (I did not
have the cancel_work_sync() in uvc_status_start())
You can see the old discussion:
https://lore.kernel.org/all/Y6sAO7URJpSIulye@pendragon.ideasonboard.com/

For now, I am only ack the patch because I want to ensure the locking
is working as expected and need to re-read the old threads.
It would be great if Hans or Laurent also take a look at this.

Thanks again


> Fixes: a32d9c41bdb8 ("media: uvcvideo: Make power management granular")
> Closes: https://lore.kernel.org/all/6733bdfb-3e88-479f-8956-ab09c04c433e@linux.dev/
> Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
> ---
>
>  drivers/media/usb/uvc/uvc_status.c | 25 ++++++++++++++++---------
>  1 file changed, 16 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/media/usb/uvc/uvc_status.c b/drivers/media/usb/uvc/uvc_status.c
> index 231cfee8e7c2c..2a23606c7f4c6 100644
> --- a/drivers/media/usb/uvc/uvc_status.c
> +++ b/drivers/media/usb/uvc/uvc_status.c
> @@ -316,6 +316,14 @@ static int uvc_status_start(struct uvc_device *dev, gfp_t flags)
>         if (!dev->int_urb)
>                 return 0;
>
> +       /*
> +        * If the work called uvc_status_stop it may still be running. Wait for
> +        * it to finish before we submit the urb.
> +        */
> +       cancel_work_sync(&dev->async_ctrl.work);
> +
> +       /* Clear the flush status if we were previously stopped */
> +       smp_store_release(&dev->flush_status, false);
>         return usb_submit_urb(dev->int_urb, flags);
>  }
>
> @@ -336,6 +344,14 @@ static void uvc_status_stop(struct uvc_device *dev)
>          */
>         smp_store_release(&dev->flush_status, true);
>
> +       /*
> +        * We will deadlock if we are currently in the work function.
> +        * Fortunately, we know that the URB is already dead and that no
> +        * further work can be queued, so there's nothing left for us to do.
> +        */
> +       if (current_work() == &w->work)
> +               return;
> +
>         /*
>          * Cancel any pending asynchronous work. If any status event was queued,
>          * process it synchronously.
> @@ -354,15 +370,6 @@ static void uvc_status_stop(struct uvc_device *dev)
>          */
>         if (cancel_work_sync(&w->work))
>                 uvc_ctrl_status_event(w->chain, w->ctrl, w->data);
> -
> -       /*
> -        * From this point, there are no events on the queue and the status URB
> -        * is dead. No events will be queued until uvc_status_start() is called.
> -        * The barrier is needed to make sure that flush_status is visible to
> -        * uvc_ctrl_status_event_work() when uvc_status_start() will be called
> -        * again.
> -        */
> -       smp_store_release(&dev->flush_status, false);
>  }
>
>  int uvc_status_resume(struct uvc_device *dev)
> --
> 2.35.1.1320.gc452695387.dirty
>


-- 
Ricardo Ribalda
Re: [PATCH] media: uvcvideo: Fix deadlock if uvc_status_stop is called from async_ctrl.work
Posted by Sean Anderson 3 weeks, 5 days ago
Hi Ricardo,

On 3/11/26 12:06, Ricardo Ribalda wrote:
> Hi Sean
> 
> Thanks for the patch. In your original report you mentioned that you
> could repro with qv4l2 and changing a control.
> May I assume that it was while the camera was not streaming and the
> control was a "slow" controp (zoom, focus)... Can you give more some
> more details?

Yes. A minimal reproducer is

$ v4l2-ctl -c focus_absolute=500
$ v4l2-ctl -c focus_absolute=500
(hangs)

I believe the reason guvcview does not have this issue is because it
continuously displays camera output, keeping the refcount above one,
whereas qv4l2ctrl 

> I have tested your change with 3 threads running:
> 
> 1 # while true; do yavta --capture=3 /dev/video0; sleep 1;done
> 2 #  while true; do yavta -w "0x00980900 64" /dev/video0; yavta -w
> "0x00980900 0" /dev/video0; done
> 3 /sys/bus/usb/devices/3-6 # while true; do echo 1 > authorized; sleep
> 3; echo 0 > authorized; sleep 3 ; done
> 
> And I have not seen any freeze. So that is good :), But I also could
> not repro without your patch :P.
> 
> Anyway I agree with the lockdep report that we introduced a bug when
> uvc_status_stop can be called from the async work, So we must fix it.
> 
> 
> On Tue, 10 Mar 2026 at 23:23, Sean Anderson <sean.anderson@linux.dev> wrote:
>>
>> If a UVC camera has an asynchronous control, uvc_status_stop may be
>> called from async_ctrl.work:
>>
>> uvc_ctrl_status_event_work()
>>     uvc_ctrl_status_event()
>>         uvc_ctrl_clear_handle()
>>             uvc_pm_put()
>>                 uvc_status_put()
>>                     uvc_status_stop()
>>                         cancel_work_sync()
>>
>> This will cause a deadlock, since cancel_work_sync will wait for
>> uvc_ctrl_status_event_work to complete before returning.
>>
>> Fix this by returning early from uvc_status_stop if we are currently in
>> the work function. flush_status now remains false until uvc_status_start
>> is called again, ensuring that uvc_ctrl_status_event_work won't resubmit
>> the URB.
>>
> Tested-by: Ricardo Ribalda <ribalda@chromium.org>
> Acked-by: Ricardo Ribalda <ribalda@chromium.org>
> 
> Your patch is very similar to what I sent some time ago (I did not
> have the cancel_work_sync() in uvc_status_start())

This could probably be downgraded to flush_work() (along with the first
cancel_work_sync in uvc_status_stop).

> You can see the old discussion:
> https://lore.kernel.org/all/Y6sAO7URJpSIulye@pendragon.ideasonboard.com/
> 
> For now, I am only ack the patch because I want to ensure the locking
> is working as expected and need to re-read the old threads.
> It would be great if Hans or Laurent also take a look at this.
> 
> Thanks again
> 
> 
>> Fixes: a32d9c41bdb8 ("media: uvcvideo: Make power management granular")
>> Closes: https://lore.kernel.org/all/6733bdfb-3e88-479f-8956-ab09c04c433e@linux.dev/
>> Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
>> ---
>>
>>  drivers/media/usb/uvc/uvc_status.c | 25 ++++++++++++++++---------
>>  1 file changed, 16 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/media/usb/uvc/uvc_status.c b/drivers/media/usb/uvc/uvc_status.c
>> index 231cfee8e7c2c..2a23606c7f4c6 100644
>> --- a/drivers/media/usb/uvc/uvc_status.c
>> +++ b/drivers/media/usb/uvc/uvc_status.c
>> @@ -316,6 +316,14 @@ static int uvc_status_start(struct uvc_device *dev, gfp_t flags)
>>         if (!dev->int_urb)
>>                 return 0;
>>
>> +       /*
>> +        * If the work called uvc_status_stop it may still be running. Wait for
>> +        * it to finish before we submit the urb.
>> +        */
>> +       cancel_work_sync(&dev->async_ctrl.work);
>> +
>> +       /* Clear the flush status if we were previously stopped */
>> +       smp_store_release(&dev->flush_status, false);
>>         return usb_submit_urb(dev->int_urb, flags);
>>  }
>>
>> @@ -336,6 +344,14 @@ static void uvc_status_stop(struct uvc_device *dev)
>>          */
>>         smp_store_release(&dev->flush_status, true);
>>
>> +       /*
>> +        * We will deadlock if we are currently in the work function.
>> +        * Fortunately, we know that the URB is already dead and that no
>> +        * further work can be queued, so there's nothing left for us to do.
>> +        */
>> +       if (current_work() == &w->work)
>> +               return;
>> +
>>         /*
>>          * Cancel any pending asynchronous work. If any status event was queued,
>>          * process it synchronously.
>> @@ -354,15 +370,6 @@ static void uvc_status_stop(struct uvc_device *dev)
>>          */
>>         if (cancel_work_sync(&w->work))
>>                 uvc_ctrl_status_event(w->chain, w->ctrl, w->data);
>> -
>> -       /*
>> -        * From this point, there are no events on the queue and the status URB
>> -        * is dead. No events will be queued until uvc_status_start() is called.
>> -        * The barrier is needed to make sure that flush_status is visible to
>> -        * uvc_ctrl_status_event_work() when uvc_status_start() will be called
>> -        * again.
>> -        */
>> -       smp_store_release(&dev->flush_status, false);
>>  }
>>
>>  int uvc_status_resume(struct uvc_device *dev)
>> --
>> 2.35.1.1320.gc452695387.dirty
>>
> 
>