[PATCH] firmware: stratix10-rsu: Fix NULL deref on rsu_send_msg() timeout in probe

Dinh Nguyen posted 1 patch 3 days, 17 hours ago
drivers/firmware/stratix10-rsu.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
[PATCH] firmware: stratix10-rsu: Fix NULL deref on rsu_send_msg() timeout in probe
Posted by Dinh Nguyen 3 days, 17 hours ago
rsu_send_msg() can return -ETIMEDOUT when
wait_for_completion_interruptible_timeout() fires while the SMC call is still
pending. In stratix10_rsu_probe(), the error paths for COMMAND_RSU_DCMF_VERSION,
COMMAND_RSU_DCMF_STATUS, COMMAND_RSU_MAX_RETRY and COMMAND_RSU_GET_SPT_TABLE
call stratix10_svc_free_channel() - which sets chan->scl to NULL - but then
fall through and queue the next request on the same channel. The next svc
kthread that runs will dereference pdata->chan->scl in its receive callback
path, triggering a NULL pointer dereference identical to the one fixed by
commit c45f7263100c ("firmware: stratix10-rsu: Fix NULL pointer dereference
when RSU is disabled") for the COMMAND_RSU_STATUS path.

Apply the same cleanup pattern to the remaining failure paths: remove the
async client, free the channel, and return early so no further messages are
queued on a channel whose scl has been cleared.

This bug was spotted when Sashiko reviewed another patch:
https://sashiko.dev/#/patchset/cover.1779248894.git.tze.yee.ng%40altera.com

Fixes: 15847537b623 ("firmware: stratix10-rsu: Migrate RSU driver to use stratix10 asynchronous framework.")
Cc: stable@kernel.org
Assisted-by: Claude:claude-4.7-opus-high Cursor
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
---
 drivers/firmware/stratix10-rsu.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/firmware/stratix10-rsu.c b/drivers/firmware/stratix10-rsu.c
index e1912108a0fee..840f12377f2a5 100644
--- a/drivers/firmware/stratix10-rsu.c
+++ b/drivers/firmware/stratix10-rsu.c
@@ -778,32 +778,39 @@ static int stratix10_rsu_probe(struct platform_device *pdev)
 			   0, rsu_dcmf_version_callback);
 	if (ret) {
 		dev_err(dev, "Error, getting DCMF version %i\n", ret);
+		stratix10_svc_remove_async_client(priv->chan);
 		stratix10_svc_free_channel(priv->chan);
+		return ret;
 	}
 
 	ret = rsu_send_msg(priv, COMMAND_RSU_DCMF_STATUS,
 			   0, rsu_dcmf_status_callback);
 	if (ret) {
 		dev_err(dev, "Error, getting DCMF status %i\n", ret);
+		stratix10_svc_remove_async_client(priv->chan);
 		stratix10_svc_free_channel(priv->chan);
+		return ret;
 	}
 
 	ret = rsu_send_msg(priv, COMMAND_RSU_MAX_RETRY, 0,
 			   rsu_max_retry_callback);
 	if (ret) {
 		dev_err(dev, "Error, getting RSU max retry %i\n", ret);
+		stratix10_svc_remove_async_client(priv->chan);
 		stratix10_svc_free_channel(priv->chan);
+		return ret;
 	}
 
-
 	ret = rsu_send_async_msg(dev, priv, COMMAND_RSU_GET_SPT_TABLE, 0,
 				 rsu_async_get_spt_table_callback);
 	if (ret) {
 		dev_err(dev, "Error, getting SPT table %i\n", ret);
+		stratix10_svc_remove_async_client(priv->chan);
 		stratix10_svc_free_channel(priv->chan);
+		return ret;
 	}
 
-	return ret;
+	return 0;
 }
 
 static void stratix10_rsu_remove(struct platform_device *pdev)
-- 
2.42.0.411.g813d9a9188
Re: [PATCH] firmware: stratix10-rsu: Fix NULL deref on rsu_send_msg() timeout in probe
Posted by Markus Elfring 3 days, 4 hours ago
…
> Apply the same cleanup pattern to the remaining failure paths: remove the
> async client, free the channel, and return early so no further messages are
> queued on a channel whose scl has been cleared.
…

How do you think about to avoid a bit of duplicate source code
in the implementation of the function “stratix10_rsu_probe”?
https://elixir.bootlin.com/linux/v7.1-rc4/source/drivers/firmware/stratix10-rsu.c#L715-L807

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/coding-style.rst?h=v7.1-rc4#n526

Regards,
Markus