When a client (process A) calls pdr_add_lookup() to add a lookup for a
service, it schedules the locator work. Later, process B receives a
new-server packet indicating that the locator is up and calls
pdr_locator_new_server(), which sets pdr->locator_init_complete to true.
Process A sees this, takes the list lock and queries the domain list.
The query times out: its response is queued on the same qmi->wq, which
is an ordered workqueue, and process B cannot complete the new-server
work on that workqueue because it is blocked on the list lock held by
process A.
Process A                             Process B

                                      process_scheduled_works()
pdr_add_lookup()                       qmi_data_ready_work()
 process_scheduled_works()              pdr_locator_new_server()
                                         pdr->locator_init_complete=true;
  pdr_locator_work()
   mutex_lock(&pdr->list_lock);

    pdr_locate_service()                 mutex_lock(&pdr->list_lock);

     pdr_get_domain_list()
      pr_err("PDR: %s get domain list
              txn wait failed: %d\n",
              req->service_name,
              ret);
Fix it by removing the list iteration from pdr_locator_new_server().
The locator work already iterates the list itself, so the iteration
here is unnecessary; just call schedule_work() unconditionally.
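The scenario above can be reproduced in a small, single-threaded
userspace model. This is only a sketch under stated assumptions, not
kernel code: the ordered workqueue is modelled as a FIFO that runs one
item at a time, the locator work is assumed to already hold list_lock
while it waits for its response, and every name below (struct model,
wq_push(), wq_run(), locator_gets_response()) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define WQ_DEPTH 4

/* Hypothetical work item kinds queued on the modelled qmi->wq. */
enum work_kind { WORK_NEW_SERVER, WORK_RESPONSE };

struct model {
	enum work_kind items[WQ_DEPTH];	/* ordered workqueue: strict FIFO */
	size_t head, tail;
	bool list_lock_held;		/* locator work holds list_lock while waiting */
	bool new_server_takes_list_lock;	/* true before the patch, false after */
	bool response_delivered;
};

static void wq_push(struct model *m, enum work_kind kind)
{
	assert(m->tail < WQ_DEPTH);
	m->items[m->tail++] = kind;
}

/*
 * Run the ordered queue one item at a time. If the item at the head
 * cannot make progress, nothing queued behind it runs either.
 */
static void wq_run(struct model *m)
{
	while (m->head < m->tail) {
		enum work_kind kind = m->items[m->head];

		if (kind == WORK_NEW_SERVER &&
		    m->new_server_takes_list_lock && m->list_lock_held)
			return;	/* blocked on list_lock, queue is stalled */

		if (kind == WORK_RESPONSE)
			m->response_delivered = true;

		m->head++;
	}
}

/* Does the locator work ever see the response it is waiting for? */
static bool locator_gets_response(bool new_server_takes_list_lock)
{
	struct model m = {
		.list_lock_held = true,
		.new_server_takes_list_lock = new_server_takes_list_lock,
	};

	wq_push(&m, WORK_NEW_SERVER);	/* new-server work is ahead in the queue */
	wq_push(&m, WORK_RESPONSE);	/* response is queued behind it */
	wq_run(&m);

	return m.response_delivered;
}
```

In this model locator_gets_response(true) stays false, which corresponds
to the transaction wait timing out, while locator_gets_response(false)
becomes true because the new-server work no longer waits for list_lock.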
Signed-off-by: Saranya R <quic_sarar@quicinc.com>
Signed-off-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com>
---
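Why the unconditional schedule_work() should be harmless can be
sketched in userspace as well. This is an illustration, not the driver
code: it assumes the locator work skips entries whose
need_locator_lookup flag is clear, uses a plain flag as a
single-threaded stand-in for list_lock, and calls the work function
directly where the driver would call schedule_work(). The names
struct lookup, struct handle, locator_work() and new_server() are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_LOOKUPS 3

struct lookup {
	bool need_locator_lookup;
	bool located;
};

struct handle {
	bool list_locked;	/* single-threaded stand-in for list_lock */
	struct lookup lookups[MAX_LOOKUPS];
	size_t nr_lookups;
	unsigned int work_runs;
};

/* Stand-in for the locator work: it walks the lookup list itself. */
static void locator_work(struct handle *h)
{
	assert(!h->list_locked);
	h->list_locked = true;

	for (size_t i = 0; i < h->nr_lookups; i++) {
		struct lookup *l = &h->lookups[i];

		if (!l->need_locator_lookup)
			continue;	/* nothing pending for this entry */

		l->located = true;
		l->need_locator_lookup = false;
	}

	h->list_locked = false;
	h->work_runs++;
}

/*
 * Stand-in for the patched new-server callback: it never touches the
 * list or its lock and only kicks the work.
 */
static void new_server(struct handle *h)
{
	locator_work(h);
}
```

With entries 0 and 2 flagged, a single new_server() call locates
exactly those two and leaves entry 1 alone. With no entry flagged the
work still runs once but changes nothing, which is the only extra cost
of dropping the check in this model.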
drivers/soc/qcom/pdr_interface.c | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)
diff --git a/drivers/soc/qcom/pdr_interface.c b/drivers/soc/qcom/pdr_interface.c
index 328b6153b2be..71be378d2e43 100644
--- a/drivers/soc/qcom/pdr_interface.c
+++ b/drivers/soc/qcom/pdr_interface.c
@@ -75,7 +75,6 @@ static int pdr_locator_new_server(struct qmi_handle *qmi,
 {
 	struct pdr_handle *pdr = container_of(qmi, struct pdr_handle,
 					      locator_hdl);
-	struct pdr_service *pds;
 
 	mutex_lock(&pdr->lock);
 	/* Create a local client port for QMI communication */
@@ -87,12 +86,7 @@ static int pdr_locator_new_server(struct qmi_handle *qmi,
 	mutex_unlock(&pdr->lock);
 
 	/* Service pending lookup requests */
-	mutex_lock(&pdr->list_lock);
-	list_for_each_entry(pds, &pdr->lookups, node) {
-		if (pds->need_locator_lookup)
-			schedule_work(&pdr->locator_work);
-	}
-	mutex_unlock(&pdr->list_lock);
+	schedule_work(&pdr->locator_work);
 
 	return 0;
 }
--
2.34.1
On Tue, Jan 28, 2025 at 01:37:51PM +0530, Mukesh Ojha wrote:
> Signed-off-by: Saranya R <quic_sarar@quicinc.com>
> Signed-off-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com>
Missing Fixes tag.
> ---
> drivers/soc/qcom/pdr_interface.c | 8 +-------
> 1 file changed, 1 insertion(+), 7 deletions(-)
>
--
With best wishes
Dmitry
On Tue, Jan 28, 2025 at 06:10:24PM +0200, Dmitry Baryshkov wrote:
> On Tue, Jan 28, 2025 at 01:37:51PM +0530, Mukesh Ojha wrote:
> > Signed-off-by: Saranya R <quic_sarar@quicinc.com>
> > Signed-off-by: Mukesh Ojha <mukesh.ojha@oss.qualcomm.com>
>
> Missing Fixes tag.
Sure, will add.
-Mukesh
>
> > ---
> > drivers/soc/qcom/pdr_interface.c | 8 +-------
> > 1 file changed, 1 insertion(+), 7 deletions(-)
> >
>
> --
> With best wishes
> Dmitry