A deadlock is observed in the qcom_geni_serial driver during runtime
resume. This occurs when the pinctrl subsystem reconfigures device pins
via msm_pinmux_set_mux() while the serial device's interrupt is an
active wakeup source. msm_pinmux_set_mux() calls disable_irq(), which
waits in __synchronize_irq(). This conflicts with the active wakeup
state and causes the IRQ thread to enter an uninterruptible (D-state)
sleep, leading to system instability.
The critical call trace leading to the deadlock is:
Call trace:
__switch_to+0xe0/0x120
__schedule+0x39c/0x978
schedule+0x5c/0xf8
__synchronize_irq+0x88/0xb4
disable_irq+0x3c/0x4c
msm_pinmux_set_mux+0x508/0x644
pinmux_enable_setting+0x190/0x2dc
pinctrl_commit_state+0x13c/0x208
pinctrl_pm_select_default_state+0x4c/0xa4
geni_se_resources_on+0xe8/0x154
qcom_geni_serial_runtime_resume+0x4c/0x88
pm_generic_runtime_resume+0x2c/0x44
__genpd_runtime_resume+0x30/0x80
genpd_runtime_resume+0x114/0x29c
__rpm_callback+0x48/0x1d8
rpm_callback+0x6c/0x78
rpm_resume+0x530/0x750
__pm_runtime_resume+0x50/0x94
handle_threaded_wake_irq+0x30/0x94
irq_thread_fn+0x2c/0xa8
irq_thread+0x160/0x248
kthread+0x110/0x114
ret_from_fork+0x10/0x20
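One way to read the trace above: the wait in __synchronize_irq() is entered from inside the wake IRQ's own handler thread (handle_threaded_wake_irq() sits near the bottom of the stack). The sketch below is a userspace toy model of that situation, not kernel code. It assumes that synchronize_irq() returns only once no threaded handler for the interrupt is still running, and that msm_pinmux_set_mux() is disabling the same interrupt whose handler thread is calling it. Every model_* name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an IRQ descriptor; only the field relevant here. */
struct model_irq_desc {
	int threads_active;	/* threaded handlers currently running */
};

/* Models the IRQ core marking a threaded handler as running. */
static void model_thread_enter(struct model_irq_desc *desc)
{
	desc->threads_active++;
}

/* Models the threaded handler returning to the IRQ core. */
static void model_thread_exit(struct model_irq_desc *desc)
{
	desc->threads_active--;
}

/*
 * Models the wait condition assumed for synchronize_irq(): the caller
 * may proceed only once no threaded handler is running.
 *
 * Returns true if the wait can eventually complete, false if it would
 * block forever. When the caller is itself one of the counted handlers,
 * the count cannot reach zero while the caller is waiting.
 */
static bool model_synchronize_would_return(const struct model_irq_desc *desc,
					    bool called_from_handler)
{
	if (desc->threads_active == 0)
		return true;

	/* Other handlers finish on their own; the caller's never does. */
	return !called_from_handler;
}
```

In this model, a caller outside the handler thread can always make progress, while a caller inside the handler thread cannot. That matches the D-state sleep in the trace, under the assumptions stated above.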
To resolve this, explicitly manage the wakeup IRQ state within the
runtime suspend/resume callbacks. In the runtime resume callback, call
disable_irq_wake() before enabling resources. This preemptively
removes the "wakeup" capability from the IRQ, allowing subsequent
interrupt management calls to proceed without conflict. An error path
re-enables the wakeup IRQ if resource enablement fails.
Conversely, in runtime suspend, call enable_irq_wake() after resources
are disabled. This ensures the interrupt is configured as a wakeup
source only once the device has fully entered its low-power state. An
error path handles disabling the wakeup IRQ if the suspend operation
fails.
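The pairing described above matters because the IRQ core, as far as I understand it, keeps a per-interrupt wake depth: only the first enable_irq_wake() arms the wakeup, only the matching last disable_irq_wake() disarms it, and a disable with no earlier enable is reported as unbalanced. The following is a toy userspace model of that counting, not the kernel implementation. The struct and the model_* functions are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the per-IRQ wake bookkeeping (hypothetical names). */
struct model_wake_state {
	unsigned int wake_depth;	/* outstanding enable calls */
	bool hw_wake_armed;		/* wakeup currently armed */
	unsigned int unbalanced_calls;	/* disables with no matching enable */
};

/* Models enable_irq_wake(): only the first enable arms the wakeup. */
static void model_enable_irq_wake(struct model_wake_state *s)
{
	if (s->wake_depth++ == 0)
		s->hw_wake_armed = true;
}

/* Models disable_irq_wake(): only the last disable disarms the wakeup. */
static void model_disable_irq_wake(struct model_wake_state *s)
{
	if (s->wake_depth == 0) {
		/* The kernel would warn here; the model just counts it. */
		s->unbalanced_calls++;
		return;
	}
	if (--s->wake_depth == 0)
		s->hw_wake_armed = false;
}
```

With this model, an enable followed by a disable leaves the depth at zero, while a second disable only bumps unbalanced_calls. It is a quick way to check that every error path in the suspend and resume callbacks undoes exactly the calls made before it.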
Fixes: 1afa70632c39 ("serial: qcom-geni: Enable PM runtime for serial driver")
Signed-off-by: Praveen Talari <praveen.talari@oss.qualcomm.com>
---
drivers/tty/serial/qcom_geni_serial.c | 22 ++++++++++++++++++++--
1 file changed, 20 insertions(+), 2 deletions(-)
diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
index 0fdda3a1e70b..4f5ea28dfe8f 100644
--- a/drivers/tty/serial/qcom_geni_serial.c
+++ b/drivers/tty/serial/qcom_geni_serial.c
@@ -1926,8 +1926,17 @@ static int __maybe_unused qcom_geni_serial_runtime_suspend(struct device *dev)
 	struct uart_port *uport = &port->uport;
 	int ret = 0;
 
-	if (port->dev_data->power_state)
+	if (port->dev_data->power_state) {
 		ret = port->dev_data->power_state(uport, false);
+		if (ret) {
+			if (device_can_wakeup(dev))
+				disable_irq_wake(port->wakeup_irq);
+			return ret;
+		}
+	}
+
+	if (device_can_wakeup(dev))
+		enable_irq_wake(port->wakeup_irq);
 
 	return ret;
 }
@@ -1938,8 +1947,17 @@ static int __maybe_unused qcom_geni_serial_runtime_resume(struct device *dev)
 	struct uart_port *uport = &port->uport;
 	int ret = 0;
 
-	if (port->dev_data->power_state)
+	if (device_can_wakeup(dev))
+		disable_irq_wake(port->wakeup_irq);
+
+	if (port->dev_data->power_state) {
 		ret = port->dev_data->power_state(uport, true);
+		if (ret) {
+			if (device_can_wakeup(dev))
+				enable_irq_wake(port->wakeup_irq);
+			return ret;
+		}
+	}
 
 	return ret;
 }
base-commit: 3e8e5822146bc396d2a7e5fbb7be13271665522a
--
2.34.1
On Mon Sep 8, 2025 at 5:45 PM BST, Praveen Talari wrote:
> A deadlock is observed in the qcom_geni_serial driver during runtime
> resume.

[..]

> Fixes: 1afa70632c39 ("serial: qcom-geni: Enable PM runtime for serial driver")
> Signed-off-by: Praveen Talari <praveen.talari@oss.qualcomm.com>

You forgot:

Reported-by: Alexey Klimov <alexey.klimov@linaro.org>

Also, I am not sure where this change will go, via Greg or Jiri, but ideally
it should be picked up for the current -rc cycle, since the regression was
introduced during the latest merge window.

I would also like to test it on the qrb2210 RB1, where this regression is
reproducible.

Thanks,
Alexey

[..]
(adding Krzysztof to c/c)

On Mon Sep 8, 2025 at 6:43 PM BST, Alexey Klimov wrote:
> On Mon Sep 8, 2025 at 5:45 PM BST, Praveen Talari wrote:

[..]

>> Fixes: 1afa70632c39 ("serial: qcom-geni: Enable PM runtime for serial driver")
>> Signed-off-by: Praveen Talari <praveen.talari@oss.qualcomm.com>
>
> You forgot:
>
> Reported-by: Alexey Klimov <alexey.klimov@linaro.org>
>
> Also, I am not sure where this change will go, via Greg or Jiri, but ideally
> it should be picked up for the current -rc cycle, since the regression was
> introduced during the latest merge window.
>
> I would also like to test it on the qrb2210 RB1, where this regression is
> reproducible.

It doesn't seem to fix the regression on the RB1 board:

INFO: task kworker/u16:3:50 blocked for more than 120 seconds.
Not tainted 6.17.0-rc5-00018-g9dd1835ecda5-dirty #13
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u16:3 state:D stack:0 pid:50 tgid:50 ppid:2 task_flags:0x4208060 flags:0x00000010
Workqueue: async async_run_entry_fn
Call trace:
 __switch_to+0xf0/0x1c0 (T)
 __schedule+0x358/0x99c
 schedule+0x34/0x11c
 rpm_resume+0x17c/0x6a0
 rpm_resume+0x2c4/0x6a0
 rpm_resume+0x2c4/0x6a0
 rpm_resume+0x2c4/0x6a0
 __pm_runtime_resume+0x50/0x9c
 __driver_probe_device+0x58/0x120
 driver_probe_device+0x3c/0x154
 __driver_attach_async_helper+0x4c/0xc0
 async_run_entry_fn+0x34/0xe0
 process_one_work+0x148/0x284
 worker_thread+0x2c4/0x3e0
 kthread+0x12c/0x210
 ret_from_fork+0x10/0x20

INFO: task irq/92-4a8c000.:79 blocked for more than 120 seconds.
Not tainted 6.17.0-rc5-00018-g9dd1835ecda5-dirty #13
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:irq/92-4a8c000. state:D stack:0 pid:79 tgid:79 ppid:2 task_flags:0x208040 flags:0x00000010
Call trace:
 __switch_to+0xf0/0x1c0 (T)
 __schedule+0x358/0x99c
 schedule+0x34/0x11c
 __synchronize_irq+0x90/0xcc
 disable_irq+0x3c/0x4c
 msm_pinmux_set_mux+0x3b4/0x45c
 pinmux_enable_setting+0x1fc/0x2d8
 pinctrl_commit_state+0xa0/0x260
 pinctrl_pm_select_default_state+0x4c/0xa0
 geni_se_resources_on+0xe8/0x154
 geni_serial_resource_state+0x8c/0xbc
 qcom_geni_serial_runtime_resume+0x3c/0x88
 pm_generic_runtime_resume+0x2c/0x44
 __rpm_callback+0x48/0x1e0
 rpm_callback+0x74/0x80
 rpm_resume+0x3bc/0x6a0
 __pm_runtime_resume+0x50/0x9c
 handle_threaded_wake_irq+0x30/0x80
 irq_thread_fn+0x2c/0xb0
 irq_thread+0x170/0x334
 kthread+0x12c/0x210
 ret_from_fork+0x10/0x20

I see exactly the same behaviour with these changes applied.

root@rb1:~# uname -a
Linux rb1 6.17.0-rc5-00018-g9dd1835ecda5-dirty #13 SMP PREEMPT Tue Sep 9 20:14:22 BST 2025 aarch64 GNU/Linux

I see the same behaviour with linux-next, but my local tree is a bit old;
maybe there are some dependencies.

Best regards,
Alexey