[PATCH] EDAC/versalnet: Fix resource leaks and NULL derefs in init_versalnet()

Eric-Terminal posted 1 patch 1 month, 3 weeks ago
There is a newer version of this series
drivers/edac/versalnet_edac.c | 29 ++++++++++++++++++++++++-----
1 file changed, 24 insertions(+), 5 deletions(-)
[PATCH] EDAC/versalnet: Fix resource leaks and NULL derefs in init_versalnet()
Posted by Eric-Terminal 1 month, 3 weeks ago
From: Yufan Chen <ericterminal@gmail.com>

init_versalnet() has several bugs in its error handling:

 - kzalloc() and kmalloc() return values are used without NULL checks,
   causing a NULL pointer dereference when allocation fails.

 - The cleanup loop uses while (i--) which skips the current failing
   index, leaking the resources already allocated for that slot.

 - edac_mc_del_mc() is called unconditionally during unwind, even for
   controllers that were never registered with edac_mc_add_mc().

 - sprintf() is used instead of snprintf() on a fixed-size buffer.

Fix by adding NULL checks for dev and name allocations, replacing
while (i--) with for (j = i; j >= 0; j--) to include the failing
index, tracking successful edac_mc_add_mc() calls with a bool array,
and switching to snprintf().

Signed-off-by: Yufan Chen <ericterminal@gmail.com>
---
 drivers/edac/versalnet_edac.c | 29 ++++++++++++++++++++++++-----
 1 file changed, 24 insertions(+), 5 deletions(-)

diff --git a/drivers/edac/versalnet_edac.c b/drivers/edac/versalnet_edac.c
index 1a1092793..128e9cd5f 100644
--- a/drivers/edac/versalnet_edac.c
+++ b/drivers/edac/versalnet_edac.c
@@ -764,11 +764,12 @@ static int init_versalnet(struct mc_priv *priv, struct platform_device *pdev)
 {
 	u32 num_chans, rank, dwidth, config;
 	struct edac_mc_layer layers[2];
+	bool mc_added[NUM_CONTROLLERS] = { };
 	struct mem_ctl_info *mci;
 	struct device *dev;
 	enum dev_type dt;
 	char *name;
-	int rc, i;
+	int rc, i, j;
 
 	for (i = 0; i < NUM_CONTROLLERS; i++) {
 		config = priv->adec[CONF + i * ADEC_NUM];
@@ -813,11 +814,25 @@ static int init_versalnet(struct mc_priv *priv, struct platform_device *pdev)
 		priv->dwidth = dt;
 
 		dev = kzalloc(sizeof(*dev), GFP_KERNEL);
+		if (!dev) {
+			rc = -ENOMEM;
+			goto err_alloc;
+		}
+
 		dev->release = versal_edac_release;
 		name = kmalloc(32, GFP_KERNEL);
-		sprintf(name, "versal-net-ddrmc5-edac-%d", i);
+		if (!name) {
+			kfree(dev);
+			rc = -ENOMEM;
+			goto err_alloc;
+		}
+
+		snprintf(name, 32, "versal-net-ddrmc5-edac-%d", i);
 		dev->init_name = name;
 		rc = device_register(dev);
+		kfree(name);
+		if (rc)
+			put_device(dev);
 		if (rc)
 			goto err_alloc;
 
@@ -831,21 +846,25 @@ static int init_versalnet(struct mc_priv *priv, struct platform_device *pdev)
 			edac_printk(KERN_ERR, EDAC_MC, "Failed to register MC%d with EDAC core\n", i);
 			goto err_alloc;
 		}
+
+		mc_added[i] = true;
 	}
 	return 0;
 
 err_alloc:
-	while (i--) {
-		mci = priv->mci[i];
+	for (j = i; j >= 0; j--) {
+		mci = priv->mci[j];
 		if (!mci)
 			continue;
 
 		if (mci->pdev) {
 			device_unregister(mci->pdev);
-			edac_mc_del_mc(mci->pdev);
+			if (mc_added[j])
+				edac_mc_del_mc(mci->pdev);
 		}
 
 		edac_mc_free(mci);
+		priv->mci[j] = NULL;
 	}
 
 	return rc;
-- 
2.47.3
Re: [PATCH] EDAC/versalnet: Fix resource leaks and NULL derefs in init_versalnet()
Posted by Borislav Petkov 1 month, 3 weeks ago
On February 22, 2026 10:39:18 AM UTC, Eric-Terminal <ericterminal@gmail.com> wrote:
>From: Yufan Chen <ericterminal@gmail.com>
>
>init_versalnet() has several bugs in its error handling:
>
> - kzalloc() and kmalloc() return values are used without NULL checks,
>   causing a NULL pointer dereference when allocation fails.
>
> - The cleanup loop uses while (i--) which skips the current failing
>   index, leaking the resources already allocated for that slot.
>
> - edac_mc_del_mc() is called unconditionally during unwind, even for
>   controllers that were never registered with edac_mc_add_mc().
>
> - sprintf() is used instead of snprintf() on a fixed-size buffer.
>
>Fix by adding NULL checks for dev and name allocations, replacing
>while (i--) with for (j = i; j >= 0; j--) to include the failing
>index, tracking successful edac_mc_add_mc() calls with a bool array,
>and switching to snprintf().
>
>Signed-off-by: Yufan Chen <ericterminal@gmail.com>
>---
> drivers/edac/versalnet_edac.c | 29 ++++++++++++++++++++++++-----

So that particular area of stinking goo is being dealt with here:

<https://lore.kernel.org/all/20251104093932.3838876-1-shubhrajyoti.datta@amd.com/T/#u>

And I'd usually say you can take the current diffs, productize them and send them after testing. 

However, testing is the problem here - I highly doubt you have access to the hardware and Shubhrajyoti is probably one of small number of people who can test it.

Except that he's not really moving here - this particular issue has been outstanding for at least three months.

I'll ping him when I get back from vacation to see whether there's still interest in this driver or I can remove it for lack of support.

Oh well.

-- 
Small device. Typos and formatting crap
Re: [PATCH] EDAC/versalnet: Fix resource leaks and NULL derefs in init_versalnet()
Posted by Eric_Terminal 1 month, 3 weeks ago
On Mon, Feb 23, 2026 at 9:47 PM Borislav Petkov <bp@alien8.de> wrote:
> And I'd usually say you can take the current diffs, productize them and send them after testing.

Uh, yeah, that’s true — after all, most of us regular developers don’t
exactly have an FPGA sitting around at home.

> However, testing is the problem here - I highly doubt you have access to the hardware and Shubhrajyoti is probably one of small number of people who can test it.

This is purely a logic bug in the code. What I did was just fix an
error-handling path; it doesn’t change the normal execution flow at
all. The issue itself can be derived through static analysis of the
existing code, so it’s not something that depends on specific hardware
behavior.

> Except that he's not really moving here - this particular issue has been outstanding for at least three months.

If the driver isn’t actively maintained… well… I can understand that
too. Still, this patch is very low risk, and merging it shouldn’t
really add to your workload. Maybe straightforward logic fixes like
this could be considered based on static review alone? At the very
least, it would help improve the baseline code quality. :)

Anyway, enjoy your vacation.