[PATCH v2] drm/xe/hwmon: Return early on power limit read failure

Posted by zhaoguohan@kylinos.cn 1 month, 2 weeks ago
From: GuoHan Zhao <zhaoguohan@kylinos.cn>

In xe_hwmon_pcode_rmw_power_limit(), when xe_pcode_read() fails,
the function logs the error but still goes on to modify val0 or val1.
Those values may contain invalid data after a failed read, so the
rest of the read-modify-write sequence operates on garbage.

Fix this by adding an early return after logging the read failure,
ensuring that we don't proceed with potentially corrupted data.

Fixes: 8aa7306631f0 ("drm/xe/hwmon: Fix xe_hwmon_power_max_write")

V2:
- Change 'drm_dbg' to 'drm_err'
- Add the Fixes tag to the commit message

Signed-off-by: GuoHan Zhao <zhaoguohan@kylinos.cn>
---
 drivers/gpu/drm/xe/xe_hwmon.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_hwmon.c b/drivers/gpu/drm/xe/xe_hwmon.c
index f08fc4377d25..8e29fa155d7e 100644
--- a/drivers/gpu/drm/xe/xe_hwmon.c
+++ b/drivers/gpu/drm/xe/xe_hwmon.c
@@ -190,9 +190,11 @@ static int xe_hwmon_pcode_rmw_power_limit(const struct xe_hwmon *hwmon, u32 attr
 						  READ_PL_FROM_PCODE : READ_PL_FROM_FW),
 						  &val0, &val1);
 
-	if (ret)
-		drm_dbg(&hwmon->xe->drm, "read failed ch %d val0 0x%08x, val1 0x%08x, ret %d\n",
+	if (ret) {
+		drm_err(&hwmon->xe->drm, "read failed ch %d val0 0x%08x, val1 0x%08x, ret %d\n",
 			channel, val0, val1, ret);
+		return ret;
+	}
 
 	if (attr == PL1_HWMON_ATTR)
 		val0 = (val0 & ~clr) | set;
-- 
2.43.0
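
For illustration, the early-return pattern described in the commit message
can be sketched in standalone C. This is a minimal sketch, not the driver
code: fake_read(), fake_write(), rmw_limit() and the global counters are
hypothetical stand-ins for the pcode mailbox helpers, and the write-back
step is assumed from the "rmw" in the function name.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical bookkeeping so the effect of the write can be observed. */
static uint32_t last_written0, last_written1;
static int write_count;

/* Hypothetical stand-in for a mailbox read; not the xe driver API. */
static int fake_read(int fail, uint32_t *val0, uint32_t *val1)
{
	if (fail)
		return -EIO;	/* outputs are left untouched on failure */

	*val0 = 0x000000ffu;
	*val1 = 0x0000ff00u;
	return 0;
}

/* Hypothetical stand-in for the write-back step. */
static int fake_write(uint32_t val0, uint32_t val1)
{
	last_written0 = val0;
	last_written1 = val1;
	write_count++;
	return 0;
}

/* Read-modify-write that bails out as soon as the read fails. */
static int rmw_limit(int fail, int which, uint32_t clr, uint32_t set)
{
	uint32_t val0 = 0, val1 = 0;
	int ret;

	ret = fake_read(fail, &val0, &val1);
	if (ret) {
		fprintf(stderr, "read failed, ret %d\n", ret);
		return ret;	/* never modify or write back unread values */
	}

	if (which == 0)
		val0 = (val0 & ~clr) | set;
	else if (which == 1)
		val1 = (val1 & ~clr) | set;
	else
		return -EINVAL;

	return fake_write(val0, val1);
}
```

With the early return in place, a failed read propagates its error code to
the caller and the write-back step is never reached.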
Re: [PATCH v2] drm/xe/hwmon: Return early on power limit read failure
Posted by Rodrigo Vivi 1 month, 2 weeks ago
On Fri, Aug 15, 2025 at 02:36:23PM +0800, zhaoguohan@kylinos.cn wrote:
> From: GuoHan Zhao <zhaoguohan@kylinos.cn>
> 
> In xe_hwmon_pcode_rmw_power_limit(), when xe_pcode_read() fails,
> the function logs the error but continues to execute the subsequent
> logic. This can result in undefined behavior as the values val0 and
> val1 may contain invalid data.
> 
> Fix this by adding an early return after logging the read failure,
> ensuring that we don't proceed with potentially corrupted data.
> 
> Fixes: 8aa7306631f0 ("drm/xe/hwmon: Fix xe_hwmon_power_max_write")
> 
> V2:
> - Change 'drm_dbg' to 'drm_err'
> - Added the Fixes tag in commit message

Some questions and concerns from the original review are still unanswered:

https://lore.kernel.org/intel-xe/aJtG0xmBBgwnTANg@intel.com

Please address all of them before sending another revision of the patch.

Thanks,
Rodrigo.

> 
> Signed-off-by: GuoHan Zhao <zhaoguohan@kylinos.cn>
> ---
>  drivers/gpu/drm/xe/xe_hwmon.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xe/xe_hwmon.c b/drivers/gpu/drm/xe/xe_hwmon.c
> index f08fc4377d25..8e29fa155d7e 100644
> --- a/drivers/gpu/drm/xe/xe_hwmon.c
> +++ b/drivers/gpu/drm/xe/xe_hwmon.c
> @@ -190,9 +190,11 @@ static int xe_hwmon_pcode_rmw_power_limit(const struct xe_hwmon *hwmon, u32 attr
>  						  READ_PL_FROM_PCODE : READ_PL_FROM_FW),
>  						  &val0, &val1);
>  
> -	if (ret)
> -		drm_dbg(&hwmon->xe->drm, "read failed ch %d val0 0x%08x, val1 0x%08x, ret %d\n",
> +	if (ret) {
> +		drm_err(&hwmon->xe->drm, "read failed ch %d val0 0x%08x, val1 0x%08x, ret %d\n",
>  			channel, val0, val1, ret);
> +		return ret;
> +	}
>  
>  	if (attr == PL1_HWMON_ATTR)
>  		val0 = (val0 & ~clr) | set;
> -- 
> 2.43.0
>