Nvmem is one of the subsystems vulnerable to object life-time issues.
The memory nvmem core dereferences is owned by nvmem providers which can
be unbound at any time and even though nvmem devices themselves are
reference-counted, there's no synchronization with the provider modules.

This typically is not a problem because thanks to fw_devlink, consumers
get synchronously unbound before providers but it's enough to pass
fw_devlink=off over the command line, unbind the nvmem controller with
consumers still holding references to it and try to read/write in order
to see fireworks in the kernel log.

User-space can trigger it too if a device (for instance: i2c eeprom on a
cp2112 USB expander) is unplugged halfway through a long read.

Thankfully the design of nvmem is rather sane so it just needs a bit of
encouragement and synchronization.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
---
Bartosz Golaszewski (7):
      nvmem: remove unused field from struct nvmem_device
      nvmem: return -EOPNOTSUPP to in-kernel users on missing callbacks
      nvmem: check the return value of gpiod_set_value_cansleep()
      nvmem: simplify locking with guard()
      nvmem: remove unneeded __nvmem_device_put()
      nvmem: split struct nvmem_device into refcounted and provider-owned data
      nvmem: synchronize nvmem device unregistering with SRCU

 drivers/nvmem/core.c      | 270 +++++++++++++++++++++++++---------------------
 drivers/nvmem/internals.h |  18 +++-
 2 files changed, 163 insertions(+), 125 deletions(-)
---
base-commit: 79455889ce4668b4239b3ffe97a86b24e99f2d7e
change-id: 20260114-nvmem-unbind-673b52fc84a0
Best regards,
--
Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
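
For readers unfamiliar with the guard() conversion named in the patch
list above: in the kernel, guard(mutex)(&lock) from <linux/cleanup.h>
takes a lock and drops it automatically when the enclosing scope ends.
The following is only a userspace sketch of that idea, built on the
GCC/Clang cleanup attribute and a pthread mutex. MUTEX_GUARD, cell_lock
and add_cell are hypothetical names; none of this is code from the
series.

```c
#include <assert.h>
#include <pthread.h>

/* Cleanup handler: runs when the guard variable goes out of scope. */
static inline void mutex_guard_release(pthread_mutex_t **lockp)
{
	pthread_mutex_unlock(*lockp);
}

/*
 * Hypothetical userspace stand-in for the kernel's guard(mutex)(&lock).
 * Locks immediately and unlocks on every exit path of the scope.
 * Relies on the GCC/Clang cleanup attribute; one guard per scope.
 */
#define MUTEX_GUARD(lock) \
	pthread_mutex_t *mutex_guard_ \
		__attribute__((cleanup(mutex_guard_release), unused)) = \
		(pthread_mutex_lock(lock), (lock))

static pthread_mutex_t cell_lock = PTHREAD_MUTEX_INITIALIZER;
static int cell_count;

/* Returns the new count, or -1 once the limit has been reached. */
int add_cell(int limit)
{
	MUTEX_GUARD(&cell_lock);

	if (cell_count >= limit)
		return -1;		/* lock dropped automatically */

	return ++cell_count;		/* lock dropped automatically */
}
```

The early return needs no explicit unlock or goto label, which is the
kind of simplification such a conversion is after.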
On Fri, Jan 16, 2026 at 12:01:07PM +0100, Bartosz Golaszewski wrote:
> Nvmem is one of the subsystems vulnerable to object life-time issues.
> The memory nvmem core dereferences is owned by nvmem providers which can
> be unbound at any time and even though nvmem devices themselves are
> reference-counted, there's no synchronization with the provider modules.
>
> This typically is not a problem because thanks to fw_devlink, consumers
> get synchronously unbound before providers but it's enough to pass
> fw_devlink=off over the command line, unbind the nvmem controller with
> consumers still holding references to it and try to read/write in order
> to see fireworks in the kernel log.

Well, don't do that then. Only root can unbind drivers, and we have
things like module references to prevent drivers from being unloaded
while in use.

> User-space can trigger it too if a device (for instance: i2c eeprom on a
> cp2112 USB expander) is unplugged halfway through a long read.

Hotplugging may be a real issue, though. But this can be solved at the user
interface level. Did you explore that?

For reference, this is related to the i2c discussion here:

https://lore.kernel.org/lkml/aW4OWnyYp6Vas53L@hovoldconsulting.com/

Johan

On Mon, Jan 19, 2026 at 12:22 PM Johan Hovold <johan@kernel.org> wrote:
>
> On Fri, Jan 16, 2026 at 12:01:07PM +0100, Bartosz Golaszewski wrote:
> > Nvmem is one of the subsystems vulnerable to object life-time issues.
> > The memory nvmem core dereferences is owned by nvmem providers which can
> > be unbound at any time and even though nvmem devices themselves are
> > reference-counted, there's no synchronization with the provider modules.
> >
> > This typically is not a problem because thanks to fw_devlink, consumers
> > get synchronously unbound before providers but it's enough to pass
> > fw_devlink=off over the command line, unbind the nvmem controller with
> > consumers still holding references to it and try to read/write in order
> > to see fireworks in the kernel log.
>
> Well, don't do that then. Only root can unbind drivers, and we have
> things like module references to prevent drivers from being unloaded
> while in use.
>

That's a very weird argument. It roughly translates to: "There's an
issue that can crash the system but only root can trigger it so let's
never fix it". If that's how we work then why don't we allow root to
kill init with a signal?

Is this a corner-case? Sure. Should we not address corner cases? Then
why do we fix error paths we never hit or that result in an unusable
system anyway?

Also: sysfs attributes don't take a module reference. Unloading an
nvmem module during a long read from user-space is equivalent to a USB
unplug.

> > User-space can trigger it too if a device (for instance: i2c eeprom on a
> > cp2112 USB expander) is unplugged halfway through a long read.
>
> Hotplugging may be a real issue, though. But this can be solved at the user
> interface level. Did you explore that?
>

Did I explore what exactly? If you're saying this *can* be solved,
then please also say how. Nvmem has no idea what abstraction layer it
sits above. Nvmem core doesn't even know what abstraction layer the
nvmem providers use.

> For reference, this is related to the i2c discussion here:
>
> https://lore.kernel.org/lkml/aW4OWnyYp6Vas53L@hovoldconsulting.com/
>

Your argument against the i2c changes is that they affect lots of
drivers and decrease readability. I can see it as a point worth
considering. In the end it's up to Wolfram to decide and I think he
was pretty clear he wants it to proceed.

This series addresses the unbinding issue entirely within nvmem core,
no driver changes are required. It uses SRCU which is very fast in
read sections. What exactly is your argument against it? What is the
reason for your comment? Unless you deal with nvmem internals, you
never have to concern yourself with it.

Bartosz

On Mon, Jan 19, 2026 at 01:43:01PM +0100, Bartosz Golaszewski wrote:
> On Mon, Jan 19, 2026 at 12:22 PM Johan Hovold <johan@kernel.org> wrote:
> >
> > On Fri, Jan 16, 2026 at 12:01:07PM +0100, Bartosz Golaszewski wrote:
> > > Nvmem is one of the subsystems vulnerable to object life-time issues.
> > > The memory nvmem core dereferences is owned by nvmem providers which can
> > > be unbound at any time and even though nvmem devices themselves are
> > > reference-counted, there's no synchronization with the provider modules.
> > >
> > > This typically is not a problem because thanks to fw_devlink, consumers
> > > get synchronously unbound before providers but it's enough to pass
> > > fw_devlink=off over the command line, unbind the nvmem controller with
> > > consumers still holding references to it and try to read/write in order
> > > to see fireworks in the kernel log.
> >
> > Well, don't do that then. Only root can unbind drivers, and we have
> > things like module references to prevent drivers from being unloaded
> > while in use.
>
> That's a very weird argument. It roughly translates to: "There's an
> issue that can crash the system but only root can trigger it so let's
> never fix it". If that's how we work then why don't we allow root to
> kill init with a signal?

It's not weird at all. The cost associated with preventing *root* from
using a foot gun may simply be too high for it to be worth it. For a
regular user, the equation is different.

> Is this a corner-case? Sure. Should we not address corner cases? Then
> why do we fix error paths we never hit or that result in an unusable
> system anyway?

That's just a false equivalence.

> Also: sysfs attributes don't take a module reference. Unloading an
> nvmem module during a long read from user-space is equivalent to a USB
> unplug.

Sysfs will drain any ongoing operations before deregistering, so not
sure what you're referring to here.

> > > User-space can trigger it too if a device (for instance: i2c eeprom on a
> > > cp2112 USB expander) is unplugged halfway through a long read.
> >
> > Hotplugging may be a real issue, though. But this can be solved at the user
> > interface level. Did you explore that?
> >
>
> Did I explore what exactly? If you're saying this *can* be solved,
> then please also say how. Nvmem has no idea what abstraction layer it
> sits above. Nvmem core doesn't even know what abstraction layer the
> nvmem providers use.

The userspace interface for nvmem access. If the only interface is the
sysfs one then there should be nothing to worry about there.

> > For reference, this is related to the i2c discussion here:
> >
> > https://lore.kernel.org/lkml/aW4OWnyYp6Vas53L@hovoldconsulting.com/
> >
>
> Your argument against the i2c changes is that they affect lots of
> drivers and decrease readability. I can see it as a point worth
> considering. In the end it's up to Wolfram to decide and I think he
> was pretty clear he wants it to proceed.
>
> This series addresses the unbinding issue entirely within nvmem core,
> no driver changes are required. It uses SRCU which is very fast in
> read sections. What exactly is your argument against it? What is the
> reason for your comment? Unless you deal with nvmem internals, you
> never have to concern yourself with it.

My concern is that you're stuck on the idea of wrapping every access in
the kernel with unnecessary constructs because you seem to think driver
unbinding is something we need to worry about. It's not. At least
not at any cost (readability, maintainability, cognitive load, churn,
performance, ...).

Johan

On Tue, Jan 20, 2026 at 11:18 AM Johan Hovold <johan@kernel.org> wrote:
>

[snip to avoid more bickering]

> > Also: sysfs attributes don't take a module reference. Unloading an
> > nvmem module during a long read from user-space is equivalent to a USB
> > unplug.
>
> Sysfs will drain any ongoing operations before deregistering, so not
> sure what you're referring to here.
>

True, sysfs will wait for the current operation to complete when
removing attributes but currently you can still very easily trigger a
use-after-free splat from user-space with nvmem the way I described due
to ordering of the teardown. But fair enough - this is an nvmem problem,
not sysfs.

> > > For reference, this is related to the i2c discussion here:
> > >
> > > https://lore.kernel.org/lkml/aW4OWnyYp6Vas53L@hovoldconsulting.com/
> > >
> >
> > Your argument against the i2c changes is that they affect lots of
> > drivers and decrease readability. I can see it as a point worth
> > considering. In the end it's up to Wolfram to decide and I think he
> > was pretty clear he wants it to proceed.
> >
> > This series addresses the unbinding issue entirely within nvmem core,
> > no driver changes are required. It uses SRCU which is very fast in
> > read sections. What exactly is your argument against it? What is the
> > reason for your comment? Unless you deal with nvmem internals, you
> > never have to concern yourself with it.
>
> My concern is that you're stuck on the idea of wrapping every access in

"Every access" is an exaggeration.

> the kernel with unnecessary constructs because you seem to think driver
> unbinding is something we need to worry about. It's not. At least
> not at any cost (readability, maintainability, cognitive load, churn,
> performance, ...).
>

That is simply incorrect. I didn't just suddenly make this up. I
encountered this issue - and subsequently started working on it - in
my work with a client's device based on CP2112 I2C/GPIO USB expander
where unplugging it would either crash the system in GPIO unbind path
or freeze the kernel thread forever waiting for a completion in I2C
unbind path. That's why - contrary to what you're claiming here - we
(the kernel community) *do* care about driver unbinding. I truly can't
comprehend why the kernel just easily crashing on what is pretty normal
operation does not seem wrong to you.

Ever since Laurent's first talk about object life-time issues (was
that 2021?), I've discussed this with many people - including key
kernel maintainers like Greg KH. People disagree on ways of fixing
this family of problems (it's not a single bug) but you are quite
literally the only person who says that we should do nothing.

Meanwhile, unbinding may not be a problem until it is. I've had Brian
Masney from Red Hat approach me during LPC in Tokyo because he needs
to be able to unbind a clock driver at runtime. At another conference,
Bootlin had a whole BoF/miniconf about dynamically instantiated and
removed platform devices (this was for cape-like extensions but
unbinding was one of the issues). If you're claiming it's unimportant,
then you're simply wrong.

Feel free to review the patches or point out anything incorrect in the
description of the problem, but please stop commenting under every
related series I post (seriously, did you filter lore by my name to
find this one?), saying "we don't want to do this". The consensus is
that this can and should be fixed. You yourself posted a link to
Wolfram's email saying that much about I2C.

I'll wait for Srini to decide whether he wants this or not.

Thanks,
Bartosz

On Tue, Jan 20, 2026 at 08:57:41PM +0100, Bartosz Golaszewski wrote:
> On Tue, Jan 20, 2026 at 11:18 AM Johan Hovold <johan@kernel.org> wrote:

> > > > For reference, this is related to the i2c discussion here:
> > > >
> > > > https://lore.kernel.org/lkml/aW4OWnyYp6Vas53L@hovoldconsulting.com/
> > > >
> > >
> > > Your argument against the i2c changes is that they affect lots of
> > > drivers and decrease readability. I can see it as a point worth
> > > considering. In the end it's up to Wolfram to decide and I think he
> > > was pretty clear he wants it to proceed.
> > >
> > > This series addresses the unbinding issue entirely within nvmem core,
> > > no driver changes are required. It uses SRCU which is very fast in
> > > read sections. What exactly is your argument against it? What is the
> > > reason for your comment? Unless you deal with nvmem internals, you
> > > never have to concern yourself with it.
> >
> > My concern is that you're stuck on the idea of wrapping every access in
>
> "Every access" is an exaggeration.
>
> > the kernel with unnecessary constructs because you seem to think driver
> > unbinding is something we need to worry about. It's not. At least
> > not at any cost (readability, maintainability, cognitive load, churn,
> > performance, ...).
> >
>
> That is simply incorrect. I didn't just suddenly make this up. I
> encountered this issue - and subsequently started working on it - in
> my work with a client's device based on CP2112 I2C/GPIO USB expander
> where unplugging it would either crash the system in GPIO unbind path
> or freeze the kernel thread forever waiting for a completion in I2C
> unbind path. That's why - contrary to what you're claiming here - we
> (the kernel community) *do* care about driver unbinding. I truly can't
> comprehend why the kernel just easily crashing on what is pretty normal
> operation does not seem wrong to you.

Again, driver unbinding (module unloading) is a development tool; if
making it 100% fool proof would require making drivers unmaintainable
while adding enough runtime overhead it *may* simply not be worth it.

And there are other ways to address this without any such costs since
in 99.9999% of cases unbinding makes no sense at all. If you don't
trust yourself not to run around unbinding random drivers, you can even
disable unloading completely. And fw_devlink unbinds consumers before
providers. Etc, etc.

Hot-plug is a different issue, and not something new that you've just
discovered. USB serial deals with this since forever and during Project
Ara we quickly learned that most subsystems do not support hot plugging
(for obvious reasons).

But the most relevant issue here is the user-space interface since a
user can keep a file descriptor open indefinitely. And that can
potentially be solved in a generic fashion without messing up every
driver for something that is not a problem in practice.

> Ever since Laurent's first talk about object life-time issues (was
> that 2021?), I've discussed this with many people - including key
> kernel maintainers like Greg KH. People disagree on ways of fixing
> this family of problems (it's not a single bug) but you are quite
> literally the only person who says that we should do nothing.

Re-read what I just wrote above.

> Meanwhile, unbinding may not be a problem until it is. I've had Brian
> Masney from Red Hat approach me during LPC in Tokyo because he needs
> to be able to unbind a clock driver at runtime. At another conference,
> Bootlin had a whole BoF/miniconf about dynamically instantiated and
> removed platform devices (this was for cape-like extensions but
> unbinding was one of the issues). If you're claiming it's unimportant,
> then you're simply wrong.

That's about tearing down a subtree, where children would be removed
before parents. If we have random links to hotpluggable devices then
that would be an issue, but so far that looks like yet another
contrived example.

> Feel free to review the patches or point out anything incorrect in the
> description of the problem, but please stop commenting under every
> related series I post (seriously, did you filter lore by my name to
> find this one?), saying "we don't want to do this". The consensus is
> that this can and should be fixed. You yourself posted a link to
> Wolfram's email saying that much about I2C.

Don't flatter yourself.

Johan