drivers/edac/versalnet_edac.c | 29 ++++++++++++++++++++++++----- 1 file changed, 24 insertions(+), 5 deletions(-)
From: Yufan Chen <ericterminal@gmail.com>
init_versalnet() has several bugs in its error handling:
- kzalloc() and kmalloc() return values are used without NULL checks,
causing a NULL pointer dereference when allocation fails.
- The cleanup loop uses while (i--) which skips the current failing
index, leaking the resources already allocated for that slot.
- edac_mc_del_mc() is called unconditionally during unwind, even for
controllers that were never registered with edac_mc_add_mc().
- sprintf() is used instead of snprintf() on a fixed-size buffer.
Fix by adding NULL checks for dev and name allocations, replacing
while (i--) with for (j = i; j >= 0; j--) to include the failing
index, tracking successful edac_mc_add_mc() calls with a bool array,
and switching to snprintf().
Signed-off-by: Yufan Chen <ericterminal@gmail.com>
---
drivers/edac/versalnet_edac.c | 29 ++++++++++++++++++++++++-----
1 file changed, 24 insertions(+), 5 deletions(-)
diff --git a/drivers/edac/versalnet_edac.c b/drivers/edac/versalnet_edac.c
index 1a1092793..128e9cd5f 100644
--- a/drivers/edac/versalnet_edac.c
+++ b/drivers/edac/versalnet_edac.c
@@ -764,11 +764,12 @@ static int init_versalnet(struct mc_priv *priv, struct platform_device *pdev)
{
u32 num_chans, rank, dwidth, config;
struct edac_mc_layer layers[2];
+ bool mc_added[NUM_CONTROLLERS] = { };
struct mem_ctl_info *mci;
struct device *dev;
enum dev_type dt;
char *name;
- int rc, i;
+ int rc, i, j;
for (i = 0; i < NUM_CONTROLLERS; i++) {
config = priv->adec[CONF + i * ADEC_NUM];
@@ -813,11 +814,25 @@ static int init_versalnet(struct mc_priv *priv, struct platform_device *pdev)
priv->dwidth = dt;
dev = kzalloc(sizeof(*dev), GFP_KERNEL);
+ if (!dev) {
+ rc = -ENOMEM;
+ goto err_alloc;
+ }
+
dev->release = versal_edac_release;
name = kmalloc(32, GFP_KERNEL);
- sprintf(name, "versal-net-ddrmc5-edac-%d", i);
+ if (!name) {
+ kfree(dev);
+ rc = -ENOMEM;
+ goto err_alloc;
+ }
+
+ snprintf(name, 32, "versal-net-ddrmc5-edac-%d", i);
dev->init_name = name;
rc = device_register(dev);
+ kfree(name);
+ if (rc)
+ put_device(dev);
if (rc)
goto err_alloc;
@@ -831,21 +846,25 @@ static int init_versalnet(struct mc_priv *priv, struct platform_device *pdev)
edac_printk(KERN_ERR, EDAC_MC, "Failed to register MC%d with EDAC core\n", i);
goto err_alloc;
}
+
+ mc_added[i] = true;
}
return 0;
err_alloc:
- while (i--) {
- mci = priv->mci[i];
+ for (j = i; j >= 0; j--) {
+ mci = priv->mci[j];
if (!mci)
continue;
if (mci->pdev) {
device_unregister(mci->pdev);
- edac_mc_del_mc(mci->pdev);
+ if (mc_added[j])
+ edac_mc_del_mc(mci->pdev);
}
edac_mc_free(mci);
+ priv->mci[j] = NULL;
}
return rc;
--
2.47.3
On February 22, 2026 10:39:18 AM UTC, Eric-Terminal <ericterminal@gmail.com> wrote: >From: Yufan Chen <ericterminal@gmail.com> > >init_versalnet() has several bugs in its error handling: > > - kzalloc() and kmalloc() return values are used without NULL checks, > causing a NULL pointer dereference when allocation fails. > > - The cleanup loop uses while (i--) which skips the current failing > index, leaking the resources already allocated for that slot. > > - edac_mc_del_mc() is called unconditionally during unwind, even for > controllers that were never registered with edac_mc_add_mc(). > > - sprintf() is used instead of snprintf() on a fixed-size buffer. > >Fix by adding NULL checks for dev and name allocations, replacing >while (i--) with for (j = i; j >= 0; j--) to include the failing >index, tracking successful edac_mc_add_mc() calls with a bool array, >and switching to snprintf(). > >Signed-off-by: Yufan Chen <ericterminal@gmail.com> >--- > drivers/edac/versalnet_edac.c | 29 ++++++++++++++++++++++++----- So that particular area of stinking goo is being dealt with here: <https://lore.kernel.org/all/20251104093932.3838876-1-shubhrajyoti.datta@amd.com/T/#u> And I'd usually say you can take the current diffs, productize them and send them after testing. However, testing is the problem here - I highly doubt you have access to the hardware and Shubhrajyoti is probably one of small number of people who can test it. Except that he's not really moving here - this particular issue has been outstanding for at least three months. I'll ping him when I get back from vacation to see whether there's still interest in this driver or I can remove it for lack of support. Oh well. -- Small device. Typos and formatting crap
On Mon, Feb 23, 2026 at 9:47 PM Borislav Petkov <bp@alien8.de> wrote: > And I'd usually say you can take the current diffs, productize them and send them after testing. Uh, yeah, that’s true — after all, most of us regular developers don’t exactly have an FPGA sitting around at home. > However, testing is the problem here - I highly doubt you have access to the hardware and Shubhrajyoti is probably one of small number of people who can test it. This is purely a logic bug in the code. What I did was just fix an error-handling path; it doesn’t change the normal execution flow at all. The issue itself can be derived through static analysis of the existing code, so it’s not something that depends on specific hardware behavior. > Except that he's not really moving here - this particular issue has been outstanding for at least three months. If the driver isn’t actively maintained… well… I can understand that too. Still, this patch is very low risk, and merging it shouldn’t really add to your workload. Maybe straightforward logic fixes like this could be considered based on static review alone? At the very least, it would help improve the baseline code quality. :) Anyway, enjoy your vacation.
© 2016 - 2026 Red Hat, Inc.