drivers/usb/dwc3/gadget.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
The below “No resource for ep” warning appears when a StartTransfer
command is issued for bulk or interrupt endpoints in
`dwc3_gadget_ep_enable` while a previous StartTransfer on the same
endpoint is still in progress. The gadget functions drivers can invoke
`usb_ep_enable` (which triggers a new StartTransfer command) before the
earlier transfer has completed. Because the previous StartTransfer is
still active, `dwc3_gadget_ep_disable` can skip the required
`EndTransfer` due to `DWC3_EP_DELAY_STOP`, leading to the endpoint
resources are busy for previous StartTransfer and warning ("No resource
for ep") from dwc3 driver.
To resolve this, a check is added to `dwc3_gadget_ep_enable` that
checks the `DWC3_EP_TRANSFER_STARTED` flag before issuing a new
StartTransfer. By preventing a second StartTransfer on an already busy
endpoint, the resource conflict is eliminated, the warning disappears,
and potential kernel panics caused by `panic_on_warn` are avoided.
------------[ cut here ]------------
dwc3 13200000.dwc3: No resource for ep1out
WARNING: CPU: 0 PID: 700 at drivers/usb/dwc3/gadget.c:398 dwc3_send_gadget_ep_cmd+0x2f8/0x76c
Call trace:
dwc3_send_gadget_ep_cmd+0x2f8/0x76c
__dwc3_gadget_ep_enable+0x490/0x7c0
dwc3_gadget_ep_enable+0x6c/0xe4
usb_ep_enable+0x5c/0x15c
mp_eth_stop+0xd4/0x11c
__dev_close_many+0x160/0x1c8
__dev_change_flags+0xfc/0x220
dev_change_flags+0x24/0x70
devinet_ioctl+0x434/0x524
inet_ioctl+0xa8/0x224
sock_do_ioctl+0x74/0x128
sock_ioctl+0x3bc/0x468
__arm64_sys_ioctl+0xa8/0xe4
invoke_syscall+0x58/0x10c
el0_svc_common+0xa8/0xdc
do_el0_svc+0x1c/0x28
el0_svc+0x38/0x88
el0t_64_sync_handler+0x70/0xbc
el0t_64_sync+0x1a8/0x1ac
Fixes: a97ea994605e ("usb: dwc3: gadget: offset Start Transfer latency for bulk EPs")
Cc: stable@vger.kernel.org
Signed-off-by: Selvarasu Ganesan <selvarasu.g@samsung.com>
---
Changes in v2:
- Removed change-id.
- Updated commit message.
Link to v1: https://lore.kernel.org/linux-usb/20251117152812.622-1-selvarasu.g@samsung.com/
---
drivers/usb/dwc3/gadget.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 1f67fb6aead5..8d3caa71ea12 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -963,8 +963,9 @@ static int __dwc3_gadget_ep_enable(struct dwc3_ep *dep, unsigned int action)
* Issue StartTransfer here with no-op TRB so we can always rely on No
* Response Update Transfer command.
*/
- if (usb_endpoint_xfer_bulk(desc) ||
- usb_endpoint_xfer_int(desc)) {
+ if ((usb_endpoint_xfer_bulk(desc) ||
+ usb_endpoint_xfer_int(desc)) &&
+ !(dep->flags & DWC3_EP_TRANSFER_STARTED)) {
struct dwc3_gadget_ep_cmd_params params;
struct dwc3_trb *trb;
dma_addr_t trb_dma;
--
2.34.1
On Mon, Nov 17, 2025, Selvarasu Ganesan wrote:
> The below “No resource for ep” warning appears when a StartTransfer
> command is issued for bulk or interrupt endpoints in
> `dwc3_gadget_ep_enable` while a previous StartTransfer on the same
> endpoint is still in progress. The gadget functions drivers can invoke
> `usb_ep_enable` (which triggers a new StartTransfer command) before the
> earlier transfer has completed. Because the previous StartTransfer is
> still active, `dwc3_gadget_ep_disable` can skip the required
> `EndTransfer` due to `DWC3_EP_DELAY_STOP`, leading to the endpoint
> resources are busy for previous StartTransfer and warning ("No resource
> for ep") from dwc3 driver.
>
> To resolve this, a check is added to `dwc3_gadget_ep_enable` that
> checks the `DWC3_EP_TRANSFER_STARTED` flag before issuing a new
> StartTransfer. By preventing a second StartTransfer on an already busy
> endpoint, the resource conflict is eliminated, the warning disappears,
> and potential kernel panics caused by `panic_on_warn` are avoided.
>
> ------------[ cut here ]------------
> dwc3 13200000.dwc3: No resource for ep1out
> WARNING: CPU: 0 PID: 700 at drivers/usb/dwc3/gadget.c:398 dwc3_send_gadget_ep_cmd+0x2f8/0x76c
> Call trace:
> dwc3_send_gadget_ep_cmd+0x2f8/0x76c
> __dwc3_gadget_ep_enable+0x490/0x7c0
> dwc3_gadget_ep_enable+0x6c/0xe4
> usb_ep_enable+0x5c/0x15c
> mp_eth_stop+0xd4/0x11c
> __dev_close_many+0x160/0x1c8
> __dev_change_flags+0xfc/0x220
> dev_change_flags+0x24/0x70
> devinet_ioctl+0x434/0x524
> inet_ioctl+0xa8/0x224
> sock_do_ioctl+0x74/0x128
> sock_ioctl+0x3bc/0x468
> __arm64_sys_ioctl+0xa8/0xe4
> invoke_syscall+0x58/0x10c
> el0_svc_common+0xa8/0xdc
> do_el0_svc+0x1c/0x28
> el0_svc+0x38/0x88
> el0t_64_sync_handler+0x70/0xbc
> el0t_64_sync+0x1a8/0x1ac
>
> Fixes: a97ea994605e ("usb: dwc3: gadget: offset Start Transfer latency for bulk EPs")
> Cc: stable@vger.kernel.org
> Signed-off-by: Selvarasu Ganesan <selvarasu.g@samsung.com>
> ---
>
> Changes in v2:
> - Removed change-id.
> - Updated commit message.
> Link to v1: https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/20251117152812.622-1-selvarasu.g@samsung.com/__;!!A4F2R9G_pg!dQNEaMvjzrOiP0c9U6lyj31ysXGkjBAs8c_KjgDCp6ONSZcTrF15DXaJeFTK02v0RLS3w0NQ2K5_pvXak_7c8tIxKL8$
> ---
> drivers/usb/dwc3/gadget.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index 1f67fb6aead5..8d3caa71ea12 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -963,8 +963,9 @@ static int __dwc3_gadget_ep_enable(struct dwc3_ep *dep, unsigned int action)
> * Issue StartTransfer here with no-op TRB so we can always rely on No
> * Response Update Transfer command.
> */
> - if (usb_endpoint_xfer_bulk(desc) ||
> - usb_endpoint_xfer_int(desc)) {
> + if ((usb_endpoint_xfer_bulk(desc) ||
> + usb_endpoint_xfer_int(desc)) &&
> + !(dep->flags & DWC3_EP_TRANSFER_STARTED)) {
> struct dwc3_gadget_ep_cmd_params params;
> struct dwc3_trb *trb;
> dma_addr_t trb_dma;
> --
> 2.34.1
>
Thanks for the catch. The problem is that the "ep_disable" process
should be completed after usb_ep_disable is completed. But currently it
may be an async call.
This brings up some conflicting wording of the gadget API regarding
usb_ep_disable. Here's the doc regarding usb_ep_disable:
/**
* usb_ep_disable - endpoint is no longer usable
* @ep:the endpoint being unconfigured. may not be the endpoint named "ep0".
*
* no other task may be using this endpoint when this is called.
* any pending and uncompleted requests will complete with status
* indicating disconnect (-ESHUTDOWN) before this call returns.
* gadget drivers must call usb_ep_enable() again before queueing
* requests to the endpoint.
*
* This routine may be called in an atomic (interrupt) context.
*
* returns zero, or a negative error code.
*/
It expects all requests to be completed and given back on completion. It
also notes that this can also be called in interrupt context. Currently,
there's a scenario where dwc3 may not want to give back the requests
right away (ie. DWC3_EP_DELAY_STOP). To fix that in dwc3, it would need
to "wait" for the right condition. But waiting does not make sense in
interrupt context. (We could busy-poll to satisfy the interrupt context,
but that doesn't sound right either)
This was updated from process context only to may be called in interrupt
context:
b0d5d2a71641 ("usb: gadget: udc: core: Revise comments for USB ep enable/disable")
Hi Alan,
Can you help give your opinion on this?
Thanks,
Thinh
On Tue, Nov 18, 2025 at 02:21:17AM +0000, Thinh Nguyen wrote:
> Thanks for the catch. The problem is that the "ep_disable" process
> should be completed after usb_ep_disable is completed. But currently it
> may be an async call.
>
> This brings up some conflicting wording of the gadget API regarding
> usb_ep_disable. Here's the doc regarding usb_ep_disable:
>
> /**
> * usb_ep_disable - endpoint is no longer usable
> * @ep:the endpoint being unconfigured. may not be the endpoint named "ep0".
> *
> * no other task may be using this endpoint when this is called.
> * any pending and uncompleted requests will complete with status
> * indicating disconnect (-ESHUTDOWN) before this call returns.
> * gadget drivers must call usb_ep_enable() again before queueing
> * requests to the endpoint.
> *
> * This routine may be called in an atomic (interrupt) context.
> *
> * returns zero, or a negative error code.
> */
>
> It expects all requests to be completed and given back on completion. It
> also notes that this can also be called in interrupt context. Currently,
> there's a scenario where dwc3 may not want to give back the requests
> right away (ie. DWC3_EP_DELAY_STOP). To fix that in dwc3, it would need
> to "wait" for the right condition. But waiting does not make sense in
> interrupt context. (We could busy-poll to satisfy the interrupt context,
> but that doesn't sound right either)
>
> This was updated from process context only to may be called in interrupt
> context:
>
> b0d5d2a71641 ("usb: gadget: udc: core: Revise comments for USB ep enable/disable")
>
>
> Hi Alan,
>
> Can you help give your opinion on this?
Well, I think the change to the API was made because drivers _were_
calling these routines in interrupt context. That's what the commit's
description says, anyway.
One way out of the problem would be to change the kerneldoc for
usb_ep_disable(). Instead of saying that pending requests will complete
before the all returns, say that the the requests will be marked for
cancellation (with -ESHUTDOWN) before the call returns, but the actual
completions might happen asynchronously later on.
The difficulty comes when a gadget driver has to handle a Set-Interface
request, or Set-Config for the same configuration. The endpoints for
the old altsetting/config have to be disabled and then the endpoints for
the new altsetting/config have to be enabled, all while managing any
pending requests. I don't know how various function drivers handle
this, just that f_mass_storage is very careful about taking care of
everything in a separate kernel thread that explicitly dequeues the
pending requests and flushes the endpoints. In fact, this scenario was
the whole reason for inventing the DELAYED_STATUS mechanism, because it
was impossible to do all the necessary work within the callback routine
for a control-request interrupt handler.
Alan Stern
On Mon, Nov 17, 2025, Alan Stern wrote:
> On Tue, Nov 18, 2025 at 02:21:17AM +0000, Thinh Nguyen wrote:
> > Thanks for the catch. The problem is that the "ep_disable" process
> > should be completed after usb_ep_disable is completed. But currently it
> > may be an async call.
> >
> > This brings up some conflicting wording of the gadget API regarding
> > usb_ep_disable. Here's the doc regarding usb_ep_disable:
> >
> > /**
> > * usb_ep_disable - endpoint is no longer usable
> > * @ep:the endpoint being unconfigured. may not be the endpoint named "ep0".
> > *
> > * no other task may be using this endpoint when this is called.
> > * any pending and uncompleted requests will complete with status
> > * indicating disconnect (-ESHUTDOWN) before this call returns.
> > * gadget drivers must call usb_ep_enable() again before queueing
> > * requests to the endpoint.
> > *
> > * This routine may be called in an atomic (interrupt) context.
> > *
> > * returns zero, or a negative error code.
> > */
> >
> > It expects all requests to be completed and given back on completion. It
> > also notes that this can also be called in interrupt context. Currently,
> > there's a scenario where dwc3 may not want to give back the requests
> > right away (ie. DWC3_EP_DELAY_STOP). To fix that in dwc3, it would need
> > to "wait" for the right condition. But waiting does not make sense in
> > interrupt context. (We could busy-poll to satisfy the interrupt context,
> > but that doesn't sound right either)
> >
> > This was updated from process context only to may be called in interrupt
> > context:
> >
> > b0d5d2a71641 ("usb: gadget: udc: core: Revise comments for USB ep enable/disable")
> >
> >
> > Hi Alan,
> >
> > Can you help give your opinion on this?
>
> Well, I think the change to the API was made because drivers _were_
> calling these routines in interrupt context. That's what the commit's
> description says, anyway.
>
> One way out of the problem would be to change the kerneldoc for
> usb_ep_disable(). Instead of saying that pending requests will complete
> before the all returns, say that the the requests will be marked for
> cancellation (with -ESHUTDOWN) before the call returns, but the actual
> completions might happen asynchronously later on.
The burden of synchronization would be shifted to the gadget drivers.
The problem with this is that gadget drivers may modify the requests
after usb_ep_disable() when it should not (e.g. the controller may still
be processing the buffer). Also, gadget drivers shouldn't call
usb_ep_enabled() until the requests are returned.
>
> The difficulty comes when a gadget driver has to handle a Set-Interface
> request, or Set-Config for the same configuration. The endpoints for
> the old altsetting/config have to be disabled and then the endpoints for
> the new altsetting/config have to be enabled, all while managing any
Right.
> pending requests. I don't know how various function drivers handle
> this, just that f_mass_storage is very careful about taking care of
> everything in a separate kernel thread that explicitly dequeues the
> pending requests and flushes the endpoints. In fact, this scenario was
> the whole reason for inventing the DELAYED_STATUS mechanism, because it
> was impossible to do all the necessary work within the callback routine
> for a control-request interrupt handler.
>
If we want to keep usb_ep_disable in interrupt context, should we revise
the wording such that gadget drivers must ensure pending requests are
dequeued before calling usb_ep_disable()? That requests are expected to
be given back before usb_ep_disable().
Or perhaps revert the commit above (commit b0d5d2a71641), fix the dwc3
driver for that, and gadget drivers need to follow the original
requirement.
BR,
Thinh
On Wed, Nov 19, 2025 at 01:49:12AM +0000, Thinh Nguyen wrote: > On Mon, Nov 17, 2025, Alan Stern wrote: > > > Hi Alan, > > > > > > Can you help give your opinion on this? > > > > Well, I think the change to the API was made because drivers _were_ > > calling these routines in interrupt context. That's what the commit's > > description says, anyway. > > > > One way out of the problem would be to change the kerneldoc for > > usb_ep_disable(). Instead of saying that pending requests will complete > > before the all returns, say that the the requests will be marked for > > cancellation (with -ESHUTDOWN) before the call returns, but the actual > > completions might happen asynchronously later on. > > The burden of synchronization would be shifted to the gadget drivers. > The problem with this is that gadget drivers may modify the requests > after usb_ep_disable() when it should not (e.g. the controller may still > be processing the buffer). Also, gadget drivers shouldn't call > usb_ep_enabled() until the requests are returned. No, they probably shouldn't, although I don't know if that would actually cause any trouble. It's not a good idea, in any case -- particularly if the drivers want to re-use the same requests as before. The problem is that function drivers' ->set_alt() callbacks are expected to do two things: disable all the endpoints from the old altsetting and enable all the endpoints in the new altsetting. There's no way for any part of the gadget core to intervene between those things (for instance, to wait for requests to complete). > > The difficulty comes when a gadget driver has to handle a Set-Interface > > request, or Set-Config for the same configuration. The endpoints for > > the old altsetting/config have to be disabled and then the endpoints for > > the new altsetting/config have to be enabled, all while managing any > > Right. > > > pending requests. I don't know how various function drivers handle > > this, just that f_mass_storage is very careful about taking care of > > everything in a separate kernel thread that explicitly dequeues the > > pending requests and flushes the endpoints. In fact, this scenario was > > the whole reason for inventing the DELAYED_STATUS mechanism, because it > > was impossible to do all the necessary work within the callback routine > > for a control-request interrupt handler. > > > > If we want to keep usb_ep_disable in interrupt context, should we revise > the wording such that gadget drivers must ensure pending requests are > dequeued before calling usb_ep_disable()? That requests are expected to > be given back before usb_ep_disable(). > > Or perhaps revert the commit above (commit b0d5d2a71641), fix the dwc3 > driver for that, and gadget drivers need to follow the original > requirement. Function drivers would have to go to great lengths to guarantee that requests had completed before the endpoint is re-enabled. Right now their ->set_alt() callback routines are designed to run in interrupt context; they can't afford to wait for requests to complete. The easiest way out is for usb_ep_disable() to do what the kerneldoc says: ensure that pending requests do complete before it returns. Can dwc3 do this? (And what if at some time in the future we want to start using an asynchronous bottom half for request completions, like usbcore does for URBs?) Let's face it; the situation is a mess. Alan Stern
On Tue, Nov 18, 2025, Alan Stern wrote: > On Wed, Nov 19, 2025 at 01:49:12AM +0000, Thinh Nguyen wrote: > > On Mon, Nov 17, 2025, Alan Stern wrote: > > > > Hi Alan, > > > > > > > > Can you help give your opinion on this? > > > > > > Well, I think the change to the API was made because drivers _were_ > > > calling these routines in interrupt context. That's what the commit's > > > description says, anyway. > > > > > > One way out of the problem would be to change the kerneldoc for > > > usb_ep_disable(). Instead of saying that pending requests will complete > > > before the all returns, say that the the requests will be marked for > > > cancellation (with -ESHUTDOWN) before the call returns, but the actual > > > completions might happen asynchronously later on. > > > > The burden of synchronization would be shifted to the gadget drivers. > > The problem with this is that gadget drivers may modify the requests > > after usb_ep_disable() when it should not (e.g. the controller may still > > be processing the buffer). Also, gadget drivers shouldn't call > > usb_ep_enabled() until the requests are returned. > > No, they probably shouldn't, although I don't know if that would > actually cause any trouble. It's not a good idea, in any case -- > particularly if the drivers want to re-use the same requests as before. Right. > > The problem is that function drivers' ->set_alt() callbacks are expected > to do two things: disable all the endpoints from the old altsetting and > enable all the endpoints in the new altsetting. There's no way for any > part of the gadget core to intervene between those things (for instance, > to wait for requests to complete). > > > > The difficulty comes when a gadget driver has to handle a Set-Interface > > > request, or Set-Config for the same configuration. The endpoints for > > > the old altsetting/config have to be disabled and then the endpoints for > > > the new altsetting/config have to be enabled, all while managing any > > > > Right. > > > > > pending requests. I don't know how various function drivers handle > > > this, just that f_mass_storage is very careful about taking care of > > > everything in a separate kernel thread that explicitly dequeues the > > > pending requests and flushes the endpoints. In fact, this scenario was > > > the whole reason for inventing the DELAYED_STATUS mechanism, because it > > > was impossible to do all the necessary work within the callback routine > > > for a control-request interrupt handler. > > > > > > > If we want to keep usb_ep_disable in interrupt context, should we revise > > the wording such that gadget drivers must ensure pending requests are > > dequeued before calling usb_ep_disable()? That requests are expected to > > be given back before usb_ep_disable(). > > > > Or perhaps revert the commit above (commit b0d5d2a71641), fix the dwc3 > > driver for that, and gadget drivers need to follow the original > > requirement. > > Function drivers would have to go to great lengths to guarantee that > requests had completed before the endpoint is re-enabled. Right now > their ->set_alt() callback routines are designed to run in interrupt > context; they can't afford to wait for requests to complete. Why is ->set_alt() designed for interrupt context? We can't expect requests to be completed before usb_ep_disable() completes _and_ also expect usb_ep_disable() be able to be called in interrupt context. > > The easiest way out is for usb_ep_disable() to do what the kerneldoc > says: ensure that pending requests do complete before it returns. Can > dwc3 do this? (And what if at some time in the future we want to start The dwc3 can do that, but we need to note that usb_ep_disable() must be executed in process context and might sleep. I suspect we may run into some issues from some function drivers that expected usb_ep_disable() to be executable in interrupt context. > using an asynchronous bottom half for request completions, like usbcore > does for URBs?) Which one are you referring to? From what I see, even the host side expected ->endpoint_disable to be executed in process context. Perhaps we can introduce endpoint_flush() on gadget side for synchronization if we want to keep usb_ep_disable() to be asynchronous? > > Let's face it; the situation is a mess. > Glad you're here to help with the mess :) Thanks, Thinh
On Thu, Nov 20, 2025 at 02:07:33AM +0000, Thinh Nguyen wrote: > > Function drivers would have to go to great lengths to guarantee that > > requests had completed before the endpoint is re-enabled. Right now > > their ->set_alt() callback routines are designed to run in interrupt > > context; they can't afford to wait for requests to complete. > > Why is ->set_alt() designed for interrupt context? We can't expect > requests to be completed before usb_ep_disable() completes _and_ also > expect usb_ep_disable() be able to be called in interrupt context. ->set_alt() is called by the composite core when a Set-Interface or Set-Config control request arrives from the host. It happens within the composite_setup() handler, which is called by the UDC driver when a control request arrives, which means it happens in the context of the UDC driver's interrupt handler. Therefore ->set_alt() callbacks must not sleep. > > The easiest way out is for usb_ep_disable() to do what the kerneldoc > > says: ensure that pending requests do complete before it returns. Can > > dwc3 do this? (And what if at some time in the future we want to start > > The dwc3 can do that, but we need to note that usb_ep_disable() must be > executed in process context and might sleep. I suspect we may run into > some issues from some function drivers that expected usb_ep_disable() to > be executable in interrupt context. Well, that's part of what I meant to ask. Is it possible to wait for all pending requests to be given back while in interrupt context? > > using an asynchronous bottom half for request completions, like usbcore > > does for URBs?) > > Which one are you referring to? From what I see, even the host side > expected ->endpoint_disable to be executed in process context. I was referring to the way usb_hcd_giveback_urb() uses system_bh_wq or system_bh_highpri_wq to do its work. This makes it impossible for an interrupt handler to wait for a giveback to complete. If the gadget core also switches over to using a work queue for request completions, it will then likewise become impossible for an interrupt handler to wait for a request to complete. > Perhaps we can introduce endpoint_flush() on gadget side for > synchronization if we want to keep usb_ep_disable() to be asynchronous? > > > > > Let's face it; the situation is a mess. > > > > Glad you're here to help with the mess :) To do this right, I can't think of any approach other than to make the composite core use a work queue or other kernel thread for handling Set-Interface and Set-Config calls. It would be nice if there was a way to invoke the ->set_alt() callback that would just disable the interface's endpoints without re-enabling anything. Then the composite core could disable the existing altsetting, flush the old endpoints, and call ->set_alt() a second time to install the new altsetting and enable the new endpoints. But implementing this would require us to update every function driver's ->set_alt() callback routine. Without that ability, we will have to audit every function driver to make sure the ->set_alt() callbacks do ensure that endpoints are flushed before they are re-enabled. There does not seem to be any way to fix the problem just by changing the gadget core. Alan Stern
On Wed, Nov 19, 2025, Alan Stern wrote: > On Thu, Nov 20, 2025 at 02:07:33AM +0000, Thinh Nguyen wrote: > > > Function drivers would have to go to great lengths to guarantee that > > > requests had completed before the endpoint is re-enabled. Right now > > > their ->set_alt() callback routines are designed to run in interrupt > > > context; they can't afford to wait for requests to complete. > > > > Why is ->set_alt() designed for interrupt context? We can't expect > > requests to be completed before usb_ep_disable() completes _and_ also > > expect usb_ep_disable() be able to be called in interrupt context. > > ->set_alt() is called by the composite core when a Set-Interface or > Set-Config control request arrives from the host. It happens within the > composite_setup() handler, which is called by the UDC driver when a > control request arrives, which means it happens in the context of the > UDC driver's interrupt handler. Therefore ->set_alt() callbacks must > not sleep. This should be changed. I don't think we can expect set_alt() to be in interrupt context only. > > > > The easiest way out is for usb_ep_disable() to do what the kerneldoc > > > says: ensure that pending requests do complete before it returns. Can > > > dwc3 do this? (And what if at some time in the future we want to start > > > > The dwc3 can do that, but we need to note that usb_ep_disable() must be > > executed in process context and might sleep. I suspect we may run into > > some issues from some function drivers that expected usb_ep_disable() to > > be executable in interrupt context. > > Well, that's part of what I meant to ask. Is it possible to wait for > all pending requests to be given back while in interrupt context? The dwc3 controller will need some time (usually less than 2ms) to flush the endpoints and give back requests. It's probably too long to have busy poll in interrupt context. > > > > using an asynchronous bottom half for request completions, like usbcore > > > does for URBs?) > > > > Which one are you referring to? From what I see, even the host side > > expected ->endpoint_disable to be executed in process context. > > I was referring to the way usb_hcd_giveback_urb() uses system_bh_wq or > system_bh_highpri_wq to do its work. This makes it impossible for an > interrupt handler to wait for a giveback to complete. > > If the gadget core also switches over to using a work queue for request > completions, it will then likewise become impossible for an interrupt > handler to wait for a request to complete. > > > Perhaps we can introduce endpoint_flush() on gadget side for > > synchronization if we want to keep usb_ep_disable() to be asynchronous? > > > > > > > > Let's face it; the situation is a mess. > > > > > > > Glad you're here to help with the mess :) > > To do this right, I can't think of any approach other than to make the > composite core use a work queue or other kernel thread for handling > Set-Interface and Set-Config calls. Sounds like it should've been like this initially. > > It would be nice if there was a way to invoke the ->set_alt() callback > that would just disable the interface's endpoints without re-enabling > anything. Then the composite core could disable the existing > altsetting, flush the old endpoints, and call ->set_alt() a second time > to install the new altsetting and enable the new endpoints. But > implementing this would require us to update every function driver's > ->set_alt() callback routine. Yeah.. > > Without that ability, we will have to audit every function driver to > make sure the ->set_alt() callbacks do ensure that endpoints are flushed > before they are re-enabled. > > There does not seem to be any way to fix the problem just by changing > the gadget core. > We can have a workaround in dwc3 that can temporarily "work" with what we have. However, eventually, we will need to properly rework this and audit the gadget drivers. Thanks, Thinh
On Fri, Nov 21, 2025 at 02:22:02AM +0000, Thinh Nguyen wrote: > On Wed, Nov 19, 2025, Alan Stern wrote: > > ->set_alt() is called by the composite core when a Set-Interface or > > Set-Config control request arrives from the host. It happens within the > > composite_setup() handler, which is called by the UDC driver when a > > control request arrives, which means it happens in the context of the > > UDC driver's interrupt handler. Therefore ->set_alt() callbacks must > > not sleep. > > This should be changed. I don't think we can expect set_alt() to > be in interrupt context only. Agreed. > > To do this right, I can't think of any approach other than to make the > > composite core use a work queue or other kernel thread for handling > > Set-Interface and Set-Config calls. > > Sounds like it should've been like this initially. I guess the nobody thought through the issues very carefully at the time the composite framework was designed. Maybe the UDCs that existed back did not require a lot of time to flush endpoints; I can't remember. > > Without that ability, we will have to audit every function driver to > > make sure the ->set_alt() callbacks do ensure that endpoints are flushed > > before they are re-enabled. > > > > There does not seem to be any way to fix the problem just by changing > > the gadget core. > > > > We can have a workaround in dwc3 that can temporarily "work" with what > we have. However, eventually, we will need to properly rework this and > audit the gadget drivers. Clearly, the first step is to change the composite core. That can be done without messing up anything else. But yes, eventually the gadget drivers will have to be audited. Alan Stern
On 11/18/2025 7:51 AM, Thinh Nguyen wrote:
> On Mon, Nov 17, 2025, Selvarasu Ganesan wrote:
>> The below “No resource for ep” warning appears when a StartTransfer
>> command is issued for bulk or interrupt endpoints in
>> `dwc3_gadget_ep_enable` while a previous StartTransfer on the same
>> endpoint is still in progress. The gadget functions drivers can invoke
>> `usb_ep_enable` (which triggers a new StartTransfer command) before the
>> earlier transfer has completed. Because the previous StartTransfer is
>> still active, `dwc3_gadget_ep_disable` can skip the required
>> `EndTransfer` due to `DWC3_EP_DELAY_STOP`, leading to the endpoint
>> resources are busy for previous StartTransfer and warning ("No resource
>> for ep") from dwc3 driver.
>>
>> To resolve this, a check is added to `dwc3_gadget_ep_enable` that
>> checks the `DWC3_EP_TRANSFER_STARTED` flag before issuing a new
>> StartTransfer. By preventing a second StartTransfer on an already busy
>> endpoint, the resource conflict is eliminated, the warning disappears,
>> and potential kernel panics caused by `panic_on_warn` are avoided.
>>
>> ------------[ cut here ]------------
>> dwc3 13200000.dwc3: No resource for ep1out
>> WARNING: CPU: 0 PID: 700 at drivers/usb/dwc3/gadget.c:398 dwc3_send_gadget_ep_cmd+0x2f8/0x76c
>> Call trace:
>> dwc3_send_gadget_ep_cmd+0x2f8/0x76c
>> __dwc3_gadget_ep_enable+0x490/0x7c0
>> dwc3_gadget_ep_enable+0x6c/0xe4
>> usb_ep_enable+0x5c/0x15c
>> mp_eth_stop+0xd4/0x11c
>> __dev_close_many+0x160/0x1c8
>> __dev_change_flags+0xfc/0x220
>> dev_change_flags+0x24/0x70
>> devinet_ioctl+0x434/0x524
>> inet_ioctl+0xa8/0x224
>> sock_do_ioctl+0x74/0x128
>> sock_ioctl+0x3bc/0x468
>> __arm64_sys_ioctl+0xa8/0xe4
>> invoke_syscall+0x58/0x10c
>> el0_svc_common+0xa8/0xdc
>> do_el0_svc+0x1c/0x28
>> el0_svc+0x38/0x88
>> el0t_64_sync_handler+0x70/0xbc
>> el0t_64_sync+0x1a8/0x1ac
>>
>> Fixes: a97ea994605e ("usb: dwc3: gadget: offset Start Transfer latency for bulk EPs")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Selvarasu Ganesan <selvarasu.g@samsung.com>
>> ---
>>
>> Changes in v2:
>> - Removed change-id.
>> - Updated commit message.
>> Link to v1: https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/20251117152812.622-1-selvarasu.g@samsung.com/__;!!A4F2R9G_pg!dQNEaMvjzrOiP0c9U6lyj31ysXGkjBAs8c_KjgDCp6ONSZcTrF15DXaJeFTK02v0RLS3w0NQ2K5_pvXak_7c8tIxKL8$
>> ---
>> drivers/usb/dwc3/gadget.c | 5 +++--
>> 1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>> index 1f67fb6aead5..8d3caa71ea12 100644
>> --- a/drivers/usb/dwc3/gadget.c
>> +++ b/drivers/usb/dwc3/gadget.c
>> @@ -963,8 +963,9 @@ static int __dwc3_gadget_ep_enable(struct dwc3_ep *dep, unsigned int action)
>> * Issue StartTransfer here with no-op TRB so we can always rely on No
>> * Response Update Transfer command.
>> */
>> - if (usb_endpoint_xfer_bulk(desc) ||
>> - usb_endpoint_xfer_int(desc)) {
>> + if ((usb_endpoint_xfer_bulk(desc) ||
>> + usb_endpoint_xfer_int(desc)) &&
>> + !(dep->flags & DWC3_EP_TRANSFER_STARTED)) {
>> struct dwc3_gadget_ep_cmd_params params;
>> struct dwc3_trb *trb;
>> dma_addr_t trb_dma;
>> --
>> 2.34.1
>>
> Thanks for the catch. The problem is that the "ep_disable" process
> should be completed after usb_ep_disable is completed. But currently it
> may be an async call.
>
> This brings up some conflicting wording of the gadget API regarding
> usb_ep_disable. Here's the doc regarding usb_ep_disable:
>
> /**
> * usb_ep_disable - endpoint is no longer usable
> * @ep:the endpoint being unconfigured. may not be the endpoint named "ep0".
> *
> * no other task may be using this endpoint when this is called.
> * any pending and uncompleted requests will complete with status
> * indicating disconnect (-ESHUTDOWN) before this call returns.
> * gadget drivers must call usb_ep_enable() again before queueing
> * requests to the endpoint.
> *
> * This routine may be called in an atomic (interrupt) context.
> *
> * returns zero, or a negative error code.
> */
>
> It expects all requests to be completed and given back on completion. It
> also notes that this can also be called in interrupt context. Currently,
> there's a scenario where dwc3 may not want to give back the requests
> right away (ie. DWC3_EP_DELAY_STOP). To fix that in dwc3, it would need
Hi Thinh,
Thanks for your comments.
I agree with you on dwc3 may not want to give back the requests right
away (ie. DWC3_EP_DELAY_STOP) and might also ignore the End Transfer
command.
Consequently, there’s no point in scheduling a new Start Transfer before
the previous one has completed. The traces below illustrate this: for
EP1OUT the End Transfer never occurs, so a new Start Transfer is issued,
which eventually ends with a “No Resource” error.
1. EP disable for ep1in: (working EP)
----------------------------------
dwc3_gadget_ep_disable: ep1in: mps 512/1024 streams 16 burst 0 ring 8/0
flags E:swBp:< dwc3_gadget_ep_cmd: ep1in: cmd 'End Transfer' [30c08]
params 00000000 00000000 00000000 --> status: Successful
dwc3_gadget_giveback: ep1in: req 00000000cfc03ba0 length 0/194 Zsi ==>
-108 dwc3_gadget_giveback: ep1in: req 0000000075196149 length 0/194 Zsi
==> -108 dwc3_gadget_giveback: ep1in: req 000000002b2ffaa2 length 0/86
Zsi ==> -108
2. EP enable for ep1out:(Not working EP) -->(No End Transfer)
----------------------------------------------------------
dwc3_gadget_ep_disable: ep1out: mps 512/682 streams 16 burst 0 ring
114/74 flags E:swBp:>
3. EP disable for ep1in:
-----------------------
dwc3_gadget_ep_cmd: ep1in: cmd 'Set Endpoint Configuration' [401] params
00021004 06000200 00000000 --> status: Successful dwc3_gadget_ep_cmd:
ep1in: cmd 'Start Transfer' [406] params 00000008 ad6b9000 00000000 -->
status: Successful dwc3_gadget_ep_enable: ep1in: mps 512/1024 streams 16
burst 0 ring 0/0 flags E:swBp:<
4. EP enable for ep1out:(Triggered start Transfer without End Transfer)
----------------------------------------------------------------------
dwc3_gadget_ep_cmd: ep1out: cmd 'Set Endpoint Configuration' [401]
params 00001004 04000200 00000000 --> status: Successful
dwc3_gadget_ep_cmd: ep1out: cmd 'Start Transfer' [406] params 00000008
ad6b7000 00000000 --> status: No Resource dwc3_gadget_ep_enable: ep1out:
mps 512/682 streams 16 burst 0 ring 114/74 flags E:swBp:>
The given fix will prevent in this failure case ,
dwc3_gadget_ep_enable:
---------------------
+ if ((usb_endpoint_xfer_bulk(desc) ||
+ usb_endpoint_xfer_int(desc)) &&
+ !(dep->flags & DWC3_EP_TRANSFER_STARTED)) {
Thanks,
Selva
> to "wait" for the right condition. But waiting does not make sense in
> interrupt context. (We could busy-poll to satisfy the interrupt context,
> but that doesn't sound right either)
>
> This was updated from process context only to may be called in interrupt
> context:
>
> b0d5d2a71641 ("usb: gadget: udc: core: Revise comments for USB ep enable/disable")
>
>
> Hi Alan,
>
> Can you help give your opinion on this?
>
> Thanks,
> Thinh
On Tue, Nov 18, 2025, Selvarasu Ganesan wrote:
>
> On 11/18/2025 7:51 AM, Thinh Nguyen wrote:
> > On Mon, Nov 17, 2025, Selvarasu Ganesan wrote:
> >> The below “No resource for ep” warning appears when a StartTransfer
> >> command is issued for bulk or interrupt endpoints in
> >> `dwc3_gadget_ep_enable` while a previous StartTransfer on the same
> >> endpoint is still in progress. The gadget functions drivers can invoke
> >> `usb_ep_enable` (which triggers a new StartTransfer command) before the
> >> earlier transfer has completed. Because the previous StartTransfer is
> >> still active, `dwc3_gadget_ep_disable` can skip the required
> >> `EndTransfer` due to `DWC3_EP_DELAY_STOP`, leading to the endpoint
> >> resources are busy for previous StartTransfer and warning ("No resource
> >> for ep") from dwc3 driver.
> >>
> >> To resolve this, a check is added to `dwc3_gadget_ep_enable` that
> >> checks the `DWC3_EP_TRANSFER_STARTED` flag before issuing a new
> >> StartTransfer. By preventing a second StartTransfer on an already busy
> >> endpoint, the resource conflict is eliminated, the warning disappears,
> >> and potential kernel panics caused by `panic_on_warn` are avoided.
> >>
> >> ------------[ cut here ]------------
> >> dwc3 13200000.dwc3: No resource for ep1out
> >> WARNING: CPU: 0 PID: 700 at drivers/usb/dwc3/gadget.c:398 dwc3_send_gadget_ep_cmd+0x2f8/0x76c
> >> Call trace:
> >> dwc3_send_gadget_ep_cmd+0x2f8/0x76c
> >> __dwc3_gadget_ep_enable+0x490/0x7c0
> >> dwc3_gadget_ep_enable+0x6c/0xe4
> >> usb_ep_enable+0x5c/0x15c
> >> mp_eth_stop+0xd4/0x11c
> >> __dev_close_many+0x160/0x1c8
> >> __dev_change_flags+0xfc/0x220
> >> dev_change_flags+0x24/0x70
> >> devinet_ioctl+0x434/0x524
> >> inet_ioctl+0xa8/0x224
> >> sock_do_ioctl+0x74/0x128
> >> sock_ioctl+0x3bc/0x468
> >> __arm64_sys_ioctl+0xa8/0xe4
> >> invoke_syscall+0x58/0x10c
> >> el0_svc_common+0xa8/0xdc
> >> do_el0_svc+0x1c/0x28
> >> el0_svc+0x38/0x88
> >> el0t_64_sync_handler+0x70/0xbc
> >> el0t_64_sync+0x1a8/0x1ac
> >>
> >> Fixes: a97ea994605e ("usb: dwc3: gadget: offset Start Transfer latency for bulk EPs")
> >> Cc: stable@vger.kernel.org
> >> Signed-off-by: Selvarasu Ganesan <selvarasu.g@samsung.com>
> >> ---
> >>
> >> Changes in v2:
> >> - Removed change-id.
> >> - Updated commit message.
> >> Link to v1: https://urldefense.com/v3/__https://lore.kernel.org/linux-usb/20251117152812.622-1-selvarasu.g@samsung.com/__;!!A4F2R9G_pg!dQNEaMvjzrOiP0c9U6lyj31ysXGkjBAs8c_KjgDCp6ONSZcTrF15DXaJeFTK02v0RLS3w0NQ2K5_pvXak_7c8tIxKL8$
> >> ---
> >> drivers/usb/dwc3/gadget.c | 5 +++--
> >> 1 file changed, 3 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> >> index 1f67fb6aead5..8d3caa71ea12 100644
> >> --- a/drivers/usb/dwc3/gadget.c
> >> +++ b/drivers/usb/dwc3/gadget.c
> >> @@ -963,8 +963,9 @@ static int __dwc3_gadget_ep_enable(struct dwc3_ep *dep, unsigned int action)
> >> * Issue StartTransfer here with no-op TRB so we can always rely on No
> >> * Response Update Transfer command.
> >> */
> >> - if (usb_endpoint_xfer_bulk(desc) ||
> >> - usb_endpoint_xfer_int(desc)) {
> >> + if ((usb_endpoint_xfer_bulk(desc) ||
> >> + usb_endpoint_xfer_int(desc)) &&
> >> + !(dep->flags & DWC3_EP_TRANSFER_STARTED)) {
> >> struct dwc3_gadget_ep_cmd_params params;
> >> struct dwc3_trb *trb;
> >> dma_addr_t trb_dma;
> >> --
> >> 2.34.1
> >>
> > Thanks for the catch. The problem is that the "ep_disable" process
> > should be completed after usb_ep_disable is completed. But currently it
> > may be an async call.
> >
> > This brings up some conflicting wording of the gadget API regarding
> > usb_ep_disable. Here's the doc regarding usb_ep_disable:
> >
> > /**
> > * usb_ep_disable - endpoint is no longer usable
> > * @ep:the endpoint being unconfigured. may not be the endpoint named "ep0".
> > *
> > * no other task may be using this endpoint when this is called.
> > * any pending and uncompleted requests will complete with status
> > * indicating disconnect (-ESHUTDOWN) before this call returns.
> > * gadget drivers must call usb_ep_enable() again before queueing
> > * requests to the endpoint.
> > *
> > * This routine may be called in an atomic (interrupt) context.
> > *
> > * returns zero, or a negative error code.
> > */
> >
> > It expects all requests to be completed and given back on completion. It
> > also notes that this can also be called in interrupt context. Currently,
> > there's a scenario where dwc3 may not want to give back the requests
> > right away (ie. DWC3_EP_DELAY_STOP). To fix that in dwc3, it would need
>
>
> Hi Thinh,
> Thanks for your comments.
> I agree with you on dwc3 may not want to give back the requests right
> away (ie. DWC3_EP_DELAY_STOP) and might also ignore the End Transfer
> command.
> Consequently, there’s no point in scheduling a new Start Transfer before
> the previous one has completed. The traces below illustrate this: for
> EP1OUT the End Transfer never occurs, so a new Start Transfer is issued,
> which eventually ends with a “No Resource” error.
>
Hi,
In your point, we may as well remove the Start Transfer for all cases.
It actually doesn't impact performance, or required, or needed for No
Response Update Transfer command flow.
The reason we still have that is actually for bulk streams. Your change
actually doesn't fix for the case of bulk streams. (Needed to start and
stop the endpoint to bring it to a certain state as documented in the
same function).
The problem I'm bringing up is related to the dwc3 not really following
the usb_ep_disable() doc in 1 scenario. Depending on how we want to fix
that, then we may not need to change here.
BR,
Thinh
© 2016 - 2025 Red Hat, Inc.