[PATCH v3] scsi: ufs: core: Fix data race in CPU latency PM QoS request handling

Zhongqiu Han posted 1 patch 2 weeks, 1 day ago
drivers/ufs/core/ufs-sysfs.c | 2 ++
drivers/ufs/core/ufshcd.c    | 9 +++++++++
include/ufs/ufshcd.h         | 3 +++
3 files changed, 14 insertions(+)
[PATCH v3] scsi: ufs: core: Fix data race in CPU latency PM QoS request handling
Posted by Zhongqiu Han 2 weeks, 1 day ago
The cpu_latency_qos_add/remove/update_request interfaces lack internal
synchronization by design, requiring the caller to ensure thread safety.
The current implementation relies on the `pm_qos_enabled` flag, which is
insufficient to prevent concurrent access and cannot serve as a proper
synchronization mechanism. This has led to data races and list corruption
issues.

A typical race condition call trace is:

[Thread A]
ufshcd_pm_qos_exit()
  --> cpu_latency_qos_remove_request()
    --> cpu_latency_qos_apply();
      --> pm_qos_update_target()
        --> plist_del              <--(1) delete plist node
    --> memset(req, 0, sizeof(*req));
  --> hba->pm_qos_enabled = false;

[Thread B]
ufshcd_devfreq_target
  --> ufshcd_devfreq_scale
    --> ufshcd_scale_clks
      --> ufshcd_pm_qos_update     <--(2) pm_qos_enabled is true
        --> cpu_latency_qos_update_request
          --> pm_qos_update_target
            --> plist_del          <--(3) plist node use-after-free

This patch introduces a dedicated mutex to serialize PM QoS operations,
preventing data races and ensuring safe access to PM QoS resources,
including sysfs interface reads.

Fixes: 2777e73fc154 ("scsi: ufs: core: Add CPU latency QoS support for UFS driver")
Signed-off-by: Zhongqiu Han <zhongqiu.han@oss.qualcomm.com>
---
v2 -> v3:
- Per Bart's comments, replaced READ_ONCE with a mutex for sysfs access to ensure
  thread safety, and updated the commit message accordingly.
- Also per Bart's suggestion, use guard(mutex)(&hba->pm_qos_mutex) instead of
  explicit mutex_lock/mutex_unlock to improve readability and ease code review.
- Link to v2: https://lore.kernel.org/all/20250902074829.657343-1-zhongqiu.han@oss.qualcomm.com/

v1 -> v2:
- Fix misleading indentation by adding braces to if statements in pm_qos logic.
- Resolve checkpatch strict mode warning by adding an inline comment for pm_qos_mutex.

 drivers/ufs/core/ufs-sysfs.c | 2 ++
 drivers/ufs/core/ufshcd.c    | 9 +++++++++
 include/ufs/ufshcd.h         | 3 +++
 3 files changed, 14 insertions(+)

diff --git a/drivers/ufs/core/ufs-sysfs.c b/drivers/ufs/core/ufs-sysfs.c
index 4bd7d491e3c5..0086816b27cd 100644
--- a/drivers/ufs/core/ufs-sysfs.c
+++ b/drivers/ufs/core/ufs-sysfs.c
@@ -512,6 +512,8 @@ static ssize_t pm_qos_enable_show(struct device *dev,
 {
 	struct ufs_hba *hba = dev_get_drvdata(dev);
 
+	guard(mutex)(&hba->pm_qos_mutex);
+
 	return sysfs_emit(buf, "%d\n", hba->pm_qos_enabled);
 }
 
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index 88a0e9289ca6..2ea7cf86cea1 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -1047,6 +1047,7 @@ EXPORT_SYMBOL_GPL(ufshcd_is_hba_active);
  */
 void ufshcd_pm_qos_init(struct ufs_hba *hba)
 {
+	guard(mutex)(&hba->pm_qos_mutex);
 
 	if (hba->pm_qos_enabled)
 		return;
@@ -1063,6 +1064,8 @@ void ufshcd_pm_qos_init(struct ufs_hba *hba)
  */
 void ufshcd_pm_qos_exit(struct ufs_hba *hba)
 {
+	guard(mutex)(&hba->pm_qos_mutex);
+
 	if (!hba->pm_qos_enabled)
 		return;
 
@@ -1077,6 +1080,8 @@ void ufshcd_pm_qos_exit(struct ufs_hba *hba)
  */
 static void ufshcd_pm_qos_update(struct ufs_hba *hba, bool on)
 {
+	guard(mutex)(&hba->pm_qos_mutex);
+
 	if (!hba->pm_qos_enabled)
 		return;
 
@@ -10765,6 +10770,10 @@ int ufshcd_init(struct ufs_hba *hba, void __iomem *mmio_base, unsigned int irq)
 	mutex_init(&hba->ee_ctrl_mutex);
 
 	mutex_init(&hba->wb_mutex);
+
+	/* Initialize mutex for PM QoS request synchronization */
+	mutex_init(&hba->pm_qos_mutex);
+
 	init_rwsem(&hba->clk_scaling_lock);
 
 	ufshcd_init_clk_gating(hba);
diff --git a/include/ufs/ufshcd.h b/include/ufs/ufshcd.h
index ea0021f067c9..277dda606f4d 100644
--- a/include/ufs/ufshcd.h
+++ b/include/ufs/ufshcd.h
@@ -938,6 +938,7 @@ enum ufshcd_mcq_opr {
  * @ufs_rtc_update_work: A work for UFS RTC periodic update
  * @pm_qos_req: PM QoS request handle
  * @pm_qos_enabled: flag to check if pm qos is enabled
+ * @pm_qos_mutex: synchronizes PM QoS request and status updates
  * @critical_health_count: count of critical health exceptions
  * @dev_lvl_exception_count: count of device level exceptions since last reset
  * @dev_lvl_exception_id: vendor specific information about the
@@ -1110,6 +1111,8 @@ struct ufs_hba {
 	struct delayed_work ufs_rtc_update_work;
 	struct pm_qos_request pm_qos_req;
 	bool pm_qos_enabled;
+	/* synchronizes PM QoS request and status updates */
+	struct mutex pm_qos_mutex;
 
 	int critical_health_count;
 	atomic_t dev_lvl_exception_count;
-- 
2.43.0
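
The locking pattern in the patch can be illustrated outside the kernel. The following is a minimal userspace sketch using POSIX threads, not the driver code: `struct fake_hba`, the `fake_pm_qos_*` functions, `QOS_TORN_DOWN`, and `run_race_demo()` are hypothetical names invented for this illustration, and an `int` stands in for `struct pm_qos_request`. It assumes that checking the flag and touching the request inside one critical section is the property the patch relies on.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define QOS_TORN_DOWN (-1)	/* hypothetical sentinel: request was removed */

/* Hypothetical userspace stand-in for the PM QoS fields of struct ufs_hba. */
struct fake_hba {
	pthread_mutex_t pm_qos_mutex;
	bool pm_qos_enabled;
	int qos_value;		/* stands in for struct pm_qos_request */
};

static void fake_pm_qos_init(struct fake_hba *hba)
{
	pthread_mutex_lock(&hba->pm_qos_mutex);
	if (!hba->pm_qos_enabled) {
		hba->qos_value = 0;
		hba->pm_qos_enabled = true;
	}
	pthread_mutex_unlock(&hba->pm_qos_mutex);
}

static void fake_pm_qos_exit(struct fake_hba *hba)
{
	pthread_mutex_lock(&hba->pm_qos_mutex);
	if (hba->pm_qos_enabled) {
		/* Teardown and flag update form a single critical section. */
		hba->qos_value = QOS_TORN_DOWN;
		hba->pm_qos_enabled = false;
	}
	pthread_mutex_unlock(&hba->pm_qos_mutex);
}

/* Returns true if the update was applied, false if QoS was already off. */
static bool fake_pm_qos_update(struct fake_hba *hba, int value)
{
	bool applied = false;

	pthread_mutex_lock(&hba->pm_qos_mutex);
	if (hba->pm_qos_enabled) {
		/* The flag cannot change between the check and this write. */
		hba->qos_value = value;
		applied = true;
	}
	pthread_mutex_unlock(&hba->pm_qos_mutex);
	return applied;
}

static void *updater_thread(void *arg)
{
	struct fake_hba *hba = arg;

	for (int i = 1; i <= 100000; i++)
		fake_pm_qos_update(hba, i);
	return NULL;
}

static void *exit_thread(void *arg)
{
	fake_pm_qos_exit(arg);
	return NULL;
}

/* Runs an updater and a teardown concurrently; returns the final qos_value,
 * or -2 if a thread could not be created. */
int run_race_demo(void)
{
	struct fake_hba hba = { .pm_qos_enabled = false };
	pthread_t updater, exiter;
	int result;

	pthread_mutex_init(&hba.pm_qos_mutex, NULL);
	fake_pm_qos_init(&hba);

	if (pthread_create(&updater, NULL, updater_thread, &hba) != 0)
		return -2;
	if (pthread_create(&exiter, NULL, exit_thread, &hba) != 0) {
		pthread_join(updater, NULL);
		return -2;
	}
	pthread_join(updater, NULL);
	pthread_join(exiter, NULL);

	result = hba.qos_value;
	pthread_mutex_destroy(&hba.pm_qos_mutex);
	return result;
}
```

In this sketch, an update either completes before the teardown (and is then overwritten by the sentinel) or observes the cleared flag and is skipped, so `run_race_demo()` should return `QOS_TORN_DOWN` regardless of scheduling. Without the mutex, an update could pass the flag check and then write after the teardown, which is the userspace analogue of steps (1) to (3) in the call trace.
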
Re: [PATCH v3] scsi: ufs: core: Fix data race in CPU latency PM QoS request handling
Posted by Martin K. Petersen 2 days, 8 hours ago
On Wed, 17 Sep 2025 17:41:43 +0800, Zhongqiu Han wrote:

> The cpu_latency_qos_add/remove/update_request interfaces lack internal
> synchronization by design, requiring the caller to ensure thread safety.
> The current implementation relies on the `pm_qos_enabled` flag, which is
> insufficient to prevent concurrent access and cannot serve as a proper
> synchronization mechanism. This has led to data races and list corruption
> issues.
> 
> [...]

Applied to 6.18/scsi-queue, thanks!

[1/1] scsi: ufs: core: Fix data race in CPU latency PM QoS request handling
      https://git.kernel.org/mkp/scsi/c/79dde5f7dc7c

-- 
Martin K. Petersen
[PATCH v3] scsi: ufs: core: Fix data race in CPU latency PM QoS request handling
Posted by Huan Tang 2 weeks ago
> This patch introduces a dedicated mutex to serialize PM QoS operations,
> preventing data races and ensuring safe access to PM QoS resources,
> including sysfs interface reads.

Reboot stress test on vivo phone:
without the patch, fail rate: 11/2000
with the patch, fail rate: 0/3200

Tested-by: Huan Tang <tanghuan@vivo.com>


Re: [PATCH v3] scsi: ufs: core: Fix data race in CPU latency PM QoS request handling
Posted by Peter Wang (王信友) 2 weeks ago
On Wed, 2025-09-17 at 17:41 +0800, Zhongqiu Han wrote:
> 
>         ufshcd_init_clk_gating(hba);
> diff --git a/include/ufs/ufshcd.h b/include/ufs/ufshcd.h
> index ea0021f067c9..277dda606f4d 100644
> --- a/include/ufs/ufshcd.h
> +++ b/include/ufs/ufshcd.h
> @@ -938,6 +938,7 @@ enum ufshcd_mcq_opr {
>   * @ufs_rtc_update_work: A work for UFS RTC periodic update
>   * @pm_qos_req: PM QoS request handle
>   * @pm_qos_enabled: flag to check if pm qos is enabled
> + * @pm_qos_mutex: synchronizes PM QoS request and status updates
>   * @critical_health_count: count of critical health exceptions
>   * @dev_lvl_exception_count: count of device level exceptions since
> last reset
>   * @dev_lvl_exception_id: vendor specific information about the
> @@ -1110,6 +1111,8 @@ struct ufs_hba {
>         struct delayed_work ufs_rtc_update_work;
>         struct pm_qos_request pm_qos_req;
>         bool pm_qos_enabled;
> +       /* synchronizes PM QoS request and status updates */
> 

Hi Zhongqiu,

I think this line can be removed to make the code cleaner, 
since you’ve already explained the purpose of each parameter above.

Thanks.
Peter



> +       struct mutex pm_qos_mutex;
> 
>         int critical_health_count;
>         atomic_t dev_lvl_exception_count;
> --
> 2.43.0
> 



Re: [PATCH v3] scsi: ufs: core: Fix data race in CPU latency PM QoS request handling
Posted by Zhongqiu Han 2 weeks ago
On 9/18/2025 11:16 AM, Peter Wang (王信友) wrote:
> On Wed, 2025-09-17 at 17:41 +0800, Zhongqiu Han wrote:
>> 
>>         ufshcd_init_clk_gating(hba);
>> diff --git a/include/ufs/ufshcd.h b/include/ufs/ufshcd.h
>> index ea0021f067c9..277dda606f4d 100644
>> --- a/include/ufs/ufshcd.h
>> +++ b/include/ufs/ufshcd.h
>> @@ -938,6 +938,7 @@ enum ufshcd_mcq_opr {
>>   * @ufs_rtc_update_work: A work for UFS RTC periodic update
>>   * @pm_qos_req: PM QoS request handle
>>   * @pm_qos_enabled: flag to check if pm qos is enabled
>> + * @pm_qos_mutex: synchronizes PM QoS request and status updates
>>   * @critical_health_count: count of critical health exceptions
>>   * @dev_lvl_exception_count: count of device level exceptions since
>> last reset
>>   * @dev_lvl_exception_id: vendor specific information about the
>> @@ -1110,6 +1111,8 @@ struct ufs_hba {
>>         struct delayed_work ufs_rtc_update_work;
>>         struct pm_qos_request pm_qos_req;
>>         bool pm_qos_enabled;
>> +       /* synchronizes PM QoS request and status updates */
>> 
> 
> Hi Zhongqiu,
> 
> I think this line can be removed to make the code cleaner,
> since you’ve already explained the purpose of each parameter above.
> 
> Thanks.
> Peter
> 
> 
> 
>> +       struct mutex pm_qos_mutex;
>> 
>>         int critical_health_count;
>>         atomic_t dev_lvl_exception_count;
>> --
>> 2.43.0
>> 
> 


Hi Peter,
Thanks a lot for your review~

---
For patch v1, checkpatch.pl in strict mode reported the following:

include/ufs/ufshcd.h:1140: CHECK:UNCOMMENTED_DEFINITION: struct mutex 
definition without comment
+	struct mutex pm_qos_mutex;


So the comment has been present since v2, in order to strictly follow the
community coding style guidelines. Thank you~




-- 
Thx and BRs,
Zhongqiu Han
Re: [PATCH v3] scsi: ufs: core: Fix data race in CPU latency PM QoS request handling
Posted by Bart Van Assche 2 weeks ago
On 9/17/25 2:41 AM, Zhongqiu Han wrote:
> This patch introduces a dedicated mutex to serialize PM QoS operations,
> preventing data races and ensuring safe access to PM QoS resources,
> including sysfs interface reads.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>
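
The v3 changelog mentions switching to `guard(mutex)(&hba->pm_qos_mutex)`, which releases the lock automatically on every return path. The kernel's guard helpers are built on the compiler's `cleanup` variable attribute; the following is a minimal userspace sketch of that idea for GCC or Clang with POSIX threads. `SCOPED_LOCK`, `scoped_unlock`, `lock_and_return`, and the `demo_*` names are hypothetical and exist only for this illustration; this is not the kernel's implementation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical cleanup callback: runs when the guard variable leaves scope. */
static void scoped_unlock(pthread_mutex_t **m)
{
	pthread_mutex_unlock(*m);
}

/* Hypothetical helper: takes the lock and hands the pointer to the guard. */
static pthread_mutex_t *lock_and_return(pthread_mutex_t *m)
{
	pthread_mutex_lock(m);
	return m;
}

/* Declares a guard variable whose destructor unlocks the mutex. */
#define SCOPED_LOCK(m)							\
	pthread_mutex_t *demo_guard_					\
		__attribute__((cleanup(scoped_unlock), unused)) =	\
		lock_and_return(m)

static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;
static bool demo_enabled;

void demo_set_enabled(bool on)
{
	SCOPED_LOCK(&demo_mutex);
	demo_enabled = on;
}

/* Both return paths drop the lock without an explicit unlock call. */
int demo_read_enabled(void)
{
	SCOPED_LOCK(&demo_mutex);

	if (!demo_enabled)
		return 0;
	return 1;
}

/* Returns true if nobody currently holds demo_mutex. */
bool demo_mutex_is_free(void)
{
	if (pthread_mutex_trylock(&demo_mutex) != 0)
		return false;
	pthread_mutex_unlock(&demo_mutex);
	return true;
}
```

The benefit mirrored here is the one named in the changelog: functions with early returns, such as the `if (!hba->pm_qos_enabled) return;` checks in the patch, need no matching unlock on each path.
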