From: Akash M <akash.m5@samsung.com>
This commit addresses a rarely observed endpoint command timeout
which causes kernel panic due to warn when 'panic_on_warn' is enabled
and unnecessary call trace prints when 'panic_on_warn' is disabled.
It is seen during fast software-controlled connect/disconnect testcases.
The following is one such endpoint command timeout that we observed:
1. Connect
=======
->dwc3_thread_interrupt
->dwc3_ep0_interrupt
->configfs_composite_setup
->composite_setup
->usb_ep_queue
->dwc3_gadget_ep0_queue
->__dwc3_gadget_ep0_queue
->__dwc3_ep0_do_control_data
->dwc3_send_gadget_ep_cmd
2. Disconnect
==========
->dwc3_thread_interrupt
->dwc3_gadget_disconnect_interrupt
->dwc3_ep0_reset_state
->dwc3_ep0_end_control_data
->dwc3_send_gadget_ep_cmd
In the issue scenario, in Exynos platforms, we observed that control
transfers for the previous connect have not yet been completed and end
transfer command sent as a part of the disconnect sequence and
processing of USB_ENDPOINT_HALT feature request from the host timeout.
This maybe an expected scenario since the controller is processing EP
commands sent as a part of the previous connect. It maybe better to
remove WARN_ON in all places where device endpoint commands are sent to
avoid unnecessary kernel panic due to warn.
Fixes: e192cc7b5239 ("usb: dwc3: gadget: move cmd_endtransfer to extra function")
Fixes: 72246da40f37 ("usb: Introduce DesignWare USB3 DRD Driver")
Fixes: c7fcdeb2627c ("usb: dwc3: ep0: simplify EP0 state machine")
Fixes: f0f2b2a2db85 ("usb: dwc3: ep0: push ep0state into xfernotready processing")
Fixes: 2e3db064855a ("usb: dwc3: ep0: drop XferNotReady(DATA) support")
Cc: stable@vger.kernel.org
Signed-off-by: Akash M <akash.m5@samsung.com>
Signed-off-by: Selvarasu Ganesan <selvarasu.g@samsung.com>
diff --git a/drivers/usb/dwc3/ep0.c b/drivers/usb/dwc3/ep0.c
index 666ac432f52d..7b313836f62b 100644
--- a/drivers/usb/dwc3/ep0.c
+++ b/drivers/usb/dwc3/ep0.c
@@ -288,7 +288,9 @@ void dwc3_ep0_out_start(struct dwc3 *dwc)
dwc3_ep0_prepare_one_trb(dep, dwc->ep0_trb_addr, 8,
DWC3_TRBCTL_CONTROL_SETUP, false);
ret = dwc3_ep0_start_trans(dep);
- WARN_ON(ret < 0);
+ if (ret < 0)
+ dev_warn(dwc->dev, "ep0 out start transfer failed: %d\n", ret);
+
for (i = 2; i < DWC3_ENDPOINTS_NUM; i++) {
struct dwc3_ep *dwc3_ep;
@@ -1061,7 +1063,9 @@ static void __dwc3_ep0_do_control_data(struct dwc3 *dwc,
ret = dwc3_ep0_start_trans(dep);
}
- WARN_ON(ret < 0);
+ if (ret < 0)
+ dev_warn(dwc->dev, "ep0 data phase start transfer failed: %d\n",
+ ret);
}
static int dwc3_ep0_start_control_status(struct dwc3_ep *dep)
@@ -1078,7 +1082,12 @@ static int dwc3_ep0_start_control_status(struct dwc3_ep *dep)
static void __dwc3_ep0_do_control_status(struct dwc3 *dwc, struct dwc3_ep *dep)
{
- WARN_ON(dwc3_ep0_start_control_status(dep));
+ int ret;
+
+ ret = dwc3_ep0_start_control_status(dep);
+ if (ret)
+ dev_warn(dwc->dev,
+ "ep0 status phase start transfer failed: %d\n", ret);
}
static void dwc3_ep0_do_control_status(struct dwc3 *dwc,
@@ -1121,7 +1130,10 @@ void dwc3_ep0_end_control_data(struct dwc3 *dwc, struct dwc3_ep *dep)
cmd |= DWC3_DEPCMD_PARAM(dep->resource_index);
memset(¶ms, 0, sizeof(params));
ret = dwc3_send_gadget_ep_cmd(dep, cmd, ¶ms);
- WARN_ON_ONCE(ret);
+ if (ret)
+ dev_warn_ratelimited(dwc->dev,
+ "ep0 data phase end transfer failed: %d\n", ret);
+
dep->resource_index = 0;
}
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 321361288935..50e4f667b2f2 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -1774,7 +1774,11 @@ static int __dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force, bool int
dep->flags |= DWC3_EP_DELAY_STOP;
return 0;
}
- WARN_ON_ONCE(ret);
+
+ if (ret)
+ dev_warn_ratelimited(dep->dwc->dev,
+ "end transfer failed: ret = %d\n", ret);
+
dep->resource_index = 0;
if (!interrupt)
@@ -4041,7 +4045,9 @@ static void dwc3_clear_stall_all_ep(struct dwc3 *dwc)
dep->flags &= ~DWC3_EP_STALL;
ret = dwc3_send_clear_stall_ep_cmd(dep);
- WARN_ON_ONCE(ret);
+ if (ret)
+ dev_warn_ratelimited(dwc->dev,
+ "failed to clear STALL on %s\n", dep->name);
}
}
--
2.17.1
On Mon, Aug 04, 2025, Selvarasu Ganesan wrote: > From: Akash M <akash.m5@samsung.com> > > This commit addresses a rarely observed endpoint command timeout > which causes kernel panic due to warn when 'panic_on_warn' is enabled > and unnecessary call trace prints when 'panic_on_warn' is disabled. > It is seen during fast software-controlled connect/disconnect testcases. > The following is one such endpoint command timeout that we observed: > > 1. Connect > ======= > ->dwc3_thread_interrupt > ->dwc3_ep0_interrupt > ->configfs_composite_setup > ->composite_setup > ->usb_ep_queue > ->dwc3_gadget_ep0_queue > ->__dwc3_gadget_ep0_queue > ->__dwc3_ep0_do_control_data > ->dwc3_send_gadget_ep_cmd > > 2. Disconnect > ========== > ->dwc3_thread_interrupt > ->dwc3_gadget_disconnect_interrupt > ->dwc3_ep0_reset_state > ->dwc3_ep0_end_control_data > ->dwc3_send_gadget_ep_cmd > > In the issue scenario, in Exynos platforms, we observed that control > transfers for the previous connect have not yet been completed and end > transfer command sent as a part of the disconnect sequence and > processing of USB_ENDPOINT_HALT feature request from the host timeout. > This maybe an expected scenario since the controller is processing EP > commands sent as a part of the previous connect. It maybe better to > remove WARN_ON in all places where device endpoint commands are sent to > avoid unnecessary kernel panic due to warn. > > Fixes: e192cc7b5239 ("usb: dwc3: gadget: move cmd_endtransfer to extra function") > Fixes: 72246da40f37 ("usb: Introduce DesignWare USB3 DRD Driver") > Fixes: c7fcdeb2627c ("usb: dwc3: ep0: simplify EP0 state machine") > Fixes: f0f2b2a2db85 ("usb: dwc3: ep0: push ep0state into xfernotready processing") > Fixes: 2e3db064855a ("usb: dwc3: ep0: drop XferNotReady(DATA) support") > Cc: stable@vger.kernel.org I don't think this is a fix patch. You're just replacing WARN* with dev_warn* without doing any recovery. Let's remove the Fixes and table tag. Also, can we replace dev_warn* with dev_err* because these are critical errors that may put the controller in a bad state. Thanks, Thinh > Signed-off-by: Akash M <akash.m5@samsung.com> > Signed-off-by: Selvarasu Ganesan <selvarasu.g@samsung.com> > > diff --git a/drivers/usb/dwc3/ep0.c b/drivers/usb/dwc3/ep0.c > index 666ac432f52d..7b313836f62b 100644 > --- a/drivers/usb/dwc3/ep0.c > +++ b/drivers/usb/dwc3/ep0.c > @@ -288,7 +288,9 @@ void dwc3_ep0_out_start(struct dwc3 *dwc) > dwc3_ep0_prepare_one_trb(dep, dwc->ep0_trb_addr, 8, > DWC3_TRBCTL_CONTROL_SETUP, false); > ret = dwc3_ep0_start_trans(dep); > - WARN_ON(ret < 0); > + if (ret < 0) > + dev_warn(dwc->dev, "ep0 out start transfer failed: %d\n", ret); > + > for (i = 2; i < DWC3_ENDPOINTS_NUM; i++) { > struct dwc3_ep *dwc3_ep; > > @@ -1061,7 +1063,9 @@ static void __dwc3_ep0_do_control_data(struct dwc3 *dwc, > ret = dwc3_ep0_start_trans(dep); > } > > - WARN_ON(ret < 0); > + if (ret < 0) > + dev_warn(dwc->dev, "ep0 data phase start transfer failed: %d\n", > + ret); > } > > static int dwc3_ep0_start_control_status(struct dwc3_ep *dep) > @@ -1078,7 +1082,12 @@ static int dwc3_ep0_start_control_status(struct dwc3_ep *dep) > > static void __dwc3_ep0_do_control_status(struct dwc3 *dwc, struct dwc3_ep *dep) > { > - WARN_ON(dwc3_ep0_start_control_status(dep)); > + int ret; > + > + ret = dwc3_ep0_start_control_status(dep); > + if (ret) > + dev_warn(dwc->dev, > + "ep0 status phase start transfer failed: %d\n", ret); > } > > static void dwc3_ep0_do_control_status(struct dwc3 *dwc, > @@ -1121,7 +1130,10 @@ void dwc3_ep0_end_control_data(struct dwc3 *dwc, struct dwc3_ep *dep) > cmd |= DWC3_DEPCMD_PARAM(dep->resource_index); > memset(¶ms, 0, sizeof(params)); > ret = dwc3_send_gadget_ep_cmd(dep, cmd, ¶ms); > - WARN_ON_ONCE(ret); > + if (ret) > + dev_warn_ratelimited(dwc->dev, > + "ep0 data phase end transfer failed: %d\n", ret); > + > dep->resource_index = 0; > } > > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c > index 321361288935..50e4f667b2f2 100644 > --- a/drivers/usb/dwc3/gadget.c > +++ b/drivers/usb/dwc3/gadget.c > @@ -1774,7 +1774,11 @@ static int __dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force, bool int > dep->flags |= DWC3_EP_DELAY_STOP; > return 0; > } > - WARN_ON_ONCE(ret); > + > + if (ret) > + dev_warn_ratelimited(dep->dwc->dev, > + "end transfer failed: ret = %d\n", ret); > + > dep->resource_index = 0; > > if (!interrupt) > @@ -4041,7 +4045,9 @@ static void dwc3_clear_stall_all_ep(struct dwc3 *dwc) > dep->flags &= ~DWC3_EP_STALL; > > ret = dwc3_send_clear_stall_ep_cmd(dep); > - WARN_ON_ONCE(ret); > + if (ret) > + dev_warn_ratelimited(dwc->dev, > + "failed to clear STALL on %s\n", dep->name); > } > } > > -- > 2.17.1 >
On 8/6/2025 5:08 AM, Thinh Nguyen wrote: > On Mon, Aug 04, 2025, Selvarasu Ganesan wrote: >> From: Akash M <akash.m5@samsung.com> >> >> This commit addresses a rarely observed endpoint command timeout >> which causes kernel panic due to warn when 'panic_on_warn' is enabled >> and unnecessary call trace prints when 'panic_on_warn' is disabled. >> It is seen during fast software-controlled connect/disconnect testcases. >> The following is one such endpoint command timeout that we observed: >> >> 1. Connect >> ======= >> ->dwc3_thread_interrupt >> ->dwc3_ep0_interrupt >> ->configfs_composite_setup >> ->composite_setup >> ->usb_ep_queue >> ->dwc3_gadget_ep0_queue >> ->__dwc3_gadget_ep0_queue >> ->__dwc3_ep0_do_control_data >> ->dwc3_send_gadget_ep_cmd >> >> 2. Disconnect >> ========== >> ->dwc3_thread_interrupt >> ->dwc3_gadget_disconnect_interrupt >> ->dwc3_ep0_reset_state >> ->dwc3_ep0_end_control_data >> ->dwc3_send_gadget_ep_cmd >> >> In the issue scenario, in Exynos platforms, we observed that control >> transfers for the previous connect have not yet been completed and end >> transfer command sent as a part of the disconnect sequence and >> processing of USB_ENDPOINT_HALT feature request from the host timeout. >> This maybe an expected scenario since the controller is processing EP >> commands sent as a part of the previous connect. It maybe better to >> remove WARN_ON in all places where device endpoint commands are sent to >> avoid unnecessary kernel panic due to warn. >> >> Fixes: e192cc7b5239 ("usb: dwc3: gadget: move cmd_endtransfer to extra function") >> Fixes: 72246da40f37 ("usb: Introduce DesignWare USB3 DRD Driver") >> Fixes: c7fcdeb2627c ("usb: dwc3: ep0: simplify EP0 state machine") >> Fixes: f0f2b2a2db85 ("usb: dwc3: ep0: push ep0state into xfernotready processing") >> Fixes: 2e3db064855a ("usb: dwc3: ep0: drop XferNotReady(DATA) support") >> Cc: stable@vger.kernel.org > I don't think this is a fix patch. You're just replacing WARN* with > dev_warn* without doing any recovery. Let's remove the Fixes and table > tag. Also, can we replace dev_warn* with dev_err* because these are > critical errors that may put the controller in a bad state. > > Thanks, > Thinh Hi Thinh, Thanks for your review comments. Yeah we agree. This is not a fix patch. Sure we will update new patchset with replace dev_warn* with dev_err*. As for dropping the stable tag, It would be better these changes to be applied across all stable kernels, so shall we keep stable tag in place? Thanks, Selva > >> Signed-off-by: Akash M <akash.m5@samsung.com> >> Signed-off-by: Selvarasu Ganesan <selvarasu.g@samsung.com> >> diff --git a/drivers/usb/dwc3/ep0.c b/drivers/usb/dwc3/ep0.c >> index 666ac432f52d..7b313836f62b 100644 >> --- a/drivers/usb/dwc3/ep0.c >> +++ b/drivers/usb/dwc3/ep0.c >> @@ -288,7 +288,9 @@ void dwc3_ep0_out_start(struct dwc3 *dwc) >> dwc3_ep0_prepare_one_trb(dep, dwc->ep0_trb_addr, 8, >> DWC3_TRBCTL_CONTROL_SETUP, false); >> ret = dwc3_ep0_start_trans(dep); >> - WARN_ON(ret < 0); >> + if (ret < 0) >> + dev_warn(dwc->dev, "ep0 out start transfer failed: %d\n", ret); >> + >> for (i = 2; i < DWC3_ENDPOINTS_NUM; i++) { >> struct dwc3_ep *dwc3_ep; >> >> @@ -1061,7 +1063,9 @@ static void __dwc3_ep0_do_control_data(struct dwc3 *dwc, >> ret = dwc3_ep0_start_trans(dep); >> } >> >> - WARN_ON(ret < 0); >> + if (ret < 0) >> + dev_warn(dwc->dev, "ep0 data phase start transfer failed: %d\n", >> + ret); >> } >> >> static int dwc3_ep0_start_control_status(struct dwc3_ep *dep) >> @@ -1078,7 +1082,12 @@ static int dwc3_ep0_start_control_status(struct dwc3_ep *dep) >> >> static void __dwc3_ep0_do_control_status(struct dwc3 *dwc, struct dwc3_ep *dep) >> { >> - WARN_ON(dwc3_ep0_start_control_status(dep)); >> + int ret; >> + >> + ret = dwc3_ep0_start_control_status(dep); >> + if (ret) >> + dev_warn(dwc->dev, >> + "ep0 status phase start transfer failed: %d\n", ret); >> } >> >> static void dwc3_ep0_do_control_status(struct dwc3 *dwc, >> @@ -1121,7 +1130,10 @@ void dwc3_ep0_end_control_data(struct dwc3 *dwc, struct dwc3_ep *dep) >> cmd |= DWC3_DEPCMD_PARAM(dep->resource_index); >> memset(¶ms, 0, sizeof(params)); >> ret = dwc3_send_gadget_ep_cmd(dep, cmd, ¶ms); >> - WARN_ON_ONCE(ret); >> + if (ret) >> + dev_warn_ratelimited(dwc->dev, >> + "ep0 data phase end transfer failed: %d\n", ret); >> + >> dep->resource_index = 0; >> } >> >> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c >> index 321361288935..50e4f667b2f2 100644 >> --- a/drivers/usb/dwc3/gadget.c >> +++ b/drivers/usb/dwc3/gadget.c >> @@ -1774,7 +1774,11 @@ static int __dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force, bool int >> dep->flags |= DWC3_EP_DELAY_STOP; >> return 0; >> } >> - WARN_ON_ONCE(ret); >> + >> + if (ret) >> + dev_warn_ratelimited(dep->dwc->dev, >> + "end transfer failed: ret = %d\n", ret); >> + >> dep->resource_index = 0; >> >> if (!interrupt) >> @@ -4041,7 +4045,9 @@ static void dwc3_clear_stall_all_ep(struct dwc3 *dwc) >> dep->flags &= ~DWC3_EP_STALL; >> >> ret = dwc3_send_clear_stall_ep_cmd(dep); >> - WARN_ON_ONCE(ret); >> + if (ret) >> + dev_warn_ratelimited(dwc->dev, >> + "failed to clear STALL on %s\n", dep->name); >> } >> } >> >> -- >> 2.17.1
On Wed, Aug 06, 2025, Selvarasu Ganesan wrote: > > On 8/6/2025 5:08 AM, Thinh Nguyen wrote: > > On Mon, Aug 04, 2025, Selvarasu Ganesan wrote: > >> From: Akash M <akash.m5@samsung.com> > >> > >> This commit addresses a rarely observed endpoint command timeout > >> which causes kernel panic due to warn when 'panic_on_warn' is enabled > >> and unnecessary call trace prints when 'panic_on_warn' is disabled. > >> It is seen during fast software-controlled connect/disconnect testcases. > >> The following is one such endpoint command timeout that we observed: > >> > >> 1. Connect > >> ======= > >> ->dwc3_thread_interrupt > >> ->dwc3_ep0_interrupt > >> ->configfs_composite_setup > >> ->composite_setup > >> ->usb_ep_queue > >> ->dwc3_gadget_ep0_queue > >> ->__dwc3_gadget_ep0_queue > >> ->__dwc3_ep0_do_control_data > >> ->dwc3_send_gadget_ep_cmd > >> > >> 2. Disconnect > >> ========== > >> ->dwc3_thread_interrupt > >> ->dwc3_gadget_disconnect_interrupt > >> ->dwc3_ep0_reset_state > >> ->dwc3_ep0_end_control_data > >> ->dwc3_send_gadget_ep_cmd > >> > >> In the issue scenario, in Exynos platforms, we observed that control > >> transfers for the previous connect have not yet been completed and end > >> transfer command sent as a part of the disconnect sequence and > >> processing of USB_ENDPOINT_HALT feature request from the host timeout. > >> This maybe an expected scenario since the controller is processing EP > >> commands sent as a part of the previous connect. It maybe better to > >> remove WARN_ON in all places where device endpoint commands are sent to > >> avoid unnecessary kernel panic due to warn. > >> > >> Fixes: e192cc7b5239 ("usb: dwc3: gadget: move cmd_endtransfer to extra function") > >> Fixes: 72246da40f37 ("usb: Introduce DesignWare USB3 DRD Driver") > >> Fixes: c7fcdeb2627c ("usb: dwc3: ep0: simplify EP0 state machine") > >> Fixes: f0f2b2a2db85 ("usb: dwc3: ep0: push ep0state into xfernotready processing") > >> Fixes: 2e3db064855a ("usb: dwc3: ep0: drop XferNotReady(DATA) support") > >> Cc: stable@vger.kernel.org > > I don't think this is a fix patch. You're just replacing WARN* with > > dev_warn* without doing any recovery. Let's remove the Fixes and table > > tag. Also, can we replace dev_warn* with dev_err* because these are > > critical errors that may put the controller in a bad state. > > > > Thanks, > > Thinh > > > Hi Thinh, > > Thanks for your review comments. > Yeah we agree. This is not a fix patch. Sure we will update new patchset > with replace dev_warn* with dev_err*. > > As for dropping the stable tag, It would be better these changes to be > applied across all stable kernels, so shall we keep stable tag in place? > That's fine with me. BR, Thinh
On 8/7/2025 6:26 AM, Thinh Nguyen wrote: > On Wed, Aug 06, 2025, Selvarasu Ganesan wrote: >> On 8/6/2025 5:08 AM, Thinh Nguyen wrote: >>> On Mon, Aug 04, 2025, Selvarasu Ganesan wrote: >>>> From: Akash M <akash.m5@samsung.com> >>>> >>>> This commit addresses a rarely observed endpoint command timeout >>>> which causes kernel panic due to warn when 'panic_on_warn' is enabled >>>> and unnecessary call trace prints when 'panic_on_warn' is disabled. >>>> It is seen during fast software-controlled connect/disconnect testcases. >>>> The following is one such endpoint command timeout that we observed: >>>> >>>> 1. Connect >>>> ======= >>>> ->dwc3_thread_interrupt >>>> ->dwc3_ep0_interrupt >>>> ->configfs_composite_setup >>>> ->composite_setup >>>> ->usb_ep_queue >>>> ->dwc3_gadget_ep0_queue >>>> ->__dwc3_gadget_ep0_queue >>>> ->__dwc3_ep0_do_control_data >>>> ->dwc3_send_gadget_ep_cmd >>>> >>>> 2. Disconnect >>>> ========== >>>> ->dwc3_thread_interrupt >>>> ->dwc3_gadget_disconnect_interrupt >>>> ->dwc3_ep0_reset_state >>>> ->dwc3_ep0_end_control_data >>>> ->dwc3_send_gadget_ep_cmd >>>> >>>> In the issue scenario, in Exynos platforms, we observed that control >>>> transfers for the previous connect have not yet been completed and end >>>> transfer command sent as a part of the disconnect sequence and >>>> processing of USB_ENDPOINT_HALT feature request from the host timeout. >>>> This maybe an expected scenario since the controller is processing EP >>>> commands sent as a part of the previous connect. It maybe better to >>>> remove WARN_ON in all places where device endpoint commands are sent to >>>> avoid unnecessary kernel panic due to warn. >>>> >>>> Fixes: e192cc7b5239 ("usb: dwc3: gadget: move cmd_endtransfer to extra function") >>>> Fixes: 72246da40f37 ("usb: Introduce DesignWare USB3 DRD Driver") >>>> Fixes: c7fcdeb2627c ("usb: dwc3: ep0: simplify EP0 state machine") >>>> Fixes: f0f2b2a2db85 ("usb: dwc3: ep0: push ep0state into xfernotready processing") >>>> Fixes: 2e3db064855a ("usb: dwc3: ep0: drop XferNotReady(DATA) support") >>>> Cc: stable@vger.kernel.org >>> I don't think this is a fix patch. You're just replacing WARN* with >>> dev_warn* without doing any recovery. Let's remove the Fixes and table >>> tag. Also, can we replace dev_warn* with dev_err* because these are >>> critical errors that may put the controller in a bad state. >>> >>> Thanks, >>> Thinh >> >> Hi Thinh, >> >> Thanks for your review comments. >> Yeah we agree. This is not a fix patch. Sure we will update new patchset >> with replace dev_warn* with dev_err*. >> >> As for dropping the stable tag, It would be better these changes to be >> applied across all stable kernels, so shall we keep stable tag in place? >> > That's fine with me. Hi Thinh, Thanks for the confirmation. Please review the updated patchset v2: https://lore.kernel.org/all/20250807014639.1596-1-selvarasu.g@samsung.com/ Thanks, Selva > > BR, > Thinh
© 2016 - 2025 Red Hat, Inc.