driver_set_override() modifies and frees dev->driver_override while
holding device_lock(dev). However, driver_match_device() reads
dev->driver_override when calling bus match functions.

Currently, driver_match_device() is called from three sites. One site
(__device_attach_driver) holds device_lock(dev), but the other two
(bind_store and __driver_attach) do not. This allows a concurrent
driver_set_override() to free the string while driver_match_device() is
using it, leading to a use-after-free (UAF).

This issue affects at least 11 bus types (including PCI, AMBA, Platform)
that rely on driver_override for matching.

Fix this by holding device_lock(dev) around the driver_match_device() calls
in bind_store() and __driver_attach(). This ensures all access to
dev->driver_override via driver_match_device() is protected by the device
lock.

Tested with the PoCs from Bugzilla that trigger this UAF. Stress testing
the two newly locked paths for 24 hours with CONFIG_PROVE_LOCKING and
CONFIG_LOCKDEP enabled showed no UAF recurrence and no lockdep
warnings.

Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220789
Suggested-by: Qiu-ji Chen <chenqiuji666@gmail.com>
Signed-off-by: Gui-Dong Han <hanguidong02@gmail.com>
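
The locking rule described in the commit message can be illustrated with a
small userspace analogy. This is only a sketch using pthreads, not kernel
code: struct fake_device, fake_set_override() and fake_match() are
hypothetical names, the mutex stands in for device_lock(dev), and treating
"no override set" as a match is a simplification of what real bus match
functions do.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct device; not a kernel type. */
struct fake_device {
	pthread_mutex_t lock;	/* plays the role of device_lock(dev) */
	char *driver_override;	/* replaced and freed by the writer */
};

/* Writer: swap in a new override string and free the old one under the lock. */
static int fake_set_override(struct fake_device *dev, const char *name)
{
	size_t len = strlen(name) + 1;
	char *copy = malloc(len);

	if (!copy)
		return -1;
	memcpy(copy, name, len);

	pthread_mutex_lock(&dev->lock);
	free(dev->driver_override);
	dev->driver_override = copy;
	pthread_mutex_unlock(&dev->lock);
	return 0;
}

/*
 * Reader: hold the same lock across the comparison, so the writer cannot
 * free the string while strcmp() is still reading it.
 */
static int fake_match(struct fake_device *dev, const char *drv_name)
{
	int ret = 1;	/* simplification: no override set counts as a match */

	pthread_mutex_lock(&dev->lock);
	if (dev->driver_override)
		ret = strcmp(dev->driver_override, drv_name) == 0;
	pthread_mutex_unlock(&dev->lock);
	return ret;
}
```

If fake_match() skipped the lock, a concurrent fake_set_override() could free
the string in the middle of strcmp(), which is the same shape as the UAF
reported here.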
---
The Bugzilla entry contains full KASAN reports and two PoCs that reliably
reproduce the UAF on both unlocked paths using a standard QEMU setup
(default e1000 device at 0000:00:03.0).

I chose to fix this in the driver core for the following reasons:

1. Both racing functions are part of the driver core.
2. Fixing this per-driver/per-bus is tedious and would require careful
   ad-hoc locking that does not align with the existing device_lock(dev).
3. We cannot simply add device_lock(dev) inside bus match functions because
   one call path (__device_attach_driver) already holds this lock. Adding the
   lock inside the match callback would cause a deadlock on that path.
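
The self-deadlock in point 3 can be demonstrated in userspace. The sketch
below is an analogy, not kernel code: relock_result() is a hypothetical
helper, and it uses a POSIX error-checking mutex so that the second lock
attempt reports EDEADLK instead of hanging the way a non-recursive kernel
mutex would.

```c
#define _XOPEN_SOURCE 700	/* for PTHREAD_MUTEX_ERRORCHECK */
#include <errno.h>
#include <pthread.h>

/*
 * Returns the result of a second lock attempt made by the thread that
 * already holds the mutex.
 */
static int relock_result(void)
{
	pthread_mutexattr_t attr;
	pthread_mutex_t lock;
	int ret;

	pthread_mutexattr_init(&attr);
	pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	pthread_mutex_init(&lock, &attr);
	pthread_mutexattr_destroy(&attr);

	pthread_mutex_lock(&lock);	 /* caller, like __device_attach_driver() */
	ret = pthread_mutex_lock(&lock); /* callee tries to take the lock again */

	pthread_mutex_unlock(&lock);
	pthread_mutex_destroy(&lock);
	return ret;
}
```

With an error-checking mutex the nested attempt fails with EDEADLK; with a
plain mutex the thread would simply block on itself.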
---
drivers/base/bus.c | 17 ++++++++++++-----
drivers/base/dd.c | 3 +++
2 files changed, 15 insertions(+), 5 deletions(-)
diff --git a/drivers/base/bus.c b/drivers/base/bus.c
index 5e75e1bce551..9e62d6009058 100644
--- a/drivers/base/bus.c
+++ b/drivers/base/bus.c
@@ -261,13 +261,20 @@ static ssize_t bind_store(struct device_driver *drv, const char *buf,
 	const struct bus_type *bus = bus_get(drv->bus);
 	struct device *dev;
 	int err = -ENODEV;
+	int ret;

 	dev = bus_find_device_by_name(bus, NULL, buf);
-	if (dev && driver_match_device(drv, dev)) {
-		err = device_driver_attach(drv, dev);
-		if (!err) {
-			/* success */
-			err = count;
+	if (dev) {
+		/* Protects against driver_set_override() races */
+		device_lock(dev);
+		ret = driver_match_device(drv, dev);
+		device_unlock(dev);
+		if (ret) {
+			err = device_driver_attach(drv, dev);
+			if (!err) {
+				/* success */
+				err = count;
+			}
 		}
 	}
 	put_device(dev);
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 13ab98e033ea..db60b4500136 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -1170,7 +1170,10 @@ static int __driver_attach(struct device *dev, void *data)
 	 * is an error.
 	 */

+	/* Protects against driver_set_override() races */
+	device_lock(dev);
 	ret = driver_match_device(drv, dev);
+	device_unlock(dev);
 	if (ret == 0) {
 		/* no match */
 		return 0;
--
2.43.0
On Wed, Nov 26, 2025 at 07:30:06PM +0800, Gui-Dong Han wrote:
> driver_set_override() modifies and frees dev->driver_override while
> holding device_lock(dev). However, driver_match_device() reads
> dev->driver_override when calling bus match functions.
>
> Currently, driver_match_device() is called from three sites. One site
> (__device_attach_driver) holds device_lock(dev), but the other two
> (bind_store and __driver_attach) do not. This allows a concurrent
> driver_set_override() to free the string while driver_match_device() is
> using it, leading to a use-after-free (UAF).
>
> This issue affects at least 11 bus types (including PCI, AMBA, Platform)
> that rely on driver_override for matching.
>
> Fix this by holding device_lock(dev) around the driver_match_device() calls
> in bind_store() and __driver_attach(). This ensures all access to
> dev->driver_override via driver_match_device() is protected by the device
> lock.
>
> Tested with the PoCs from Bugzilla that trigger this UAF. Stress testing
> the two newly locked paths for 24 hours with CONFIG_PROVE_LOCKING and
> CONFIG_LOCKDEP enabled showed no UAF recurrence and no lockdep
> warnings.
>
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220789
> Suggested-by: Qiu-ji Chen <chenqiuji666@gmail.com>
> Signed-off-by: Gui-Dong Han <hanguidong02@gmail.com>
> ---
> The Bugzilla entry contains full KASAN reports and two PoCs that reliably
> reproduce the UAF on both unlocked paths using a standard QEMU setup
> (default e1000 device at 0000:00:03.0).
> I chose to fix this in the driver core for the following reasons:
> 1. Both racing functions are part of the driver core.
> 2. Fixing this per-driver/per-bus is tedious and would require careful
> ad-hoc locking that does not align with the existing device_lock(dev).
> 3. We cannot simply add device_lock(dev) inside bus match functions because
> one call path (__device_attach_driver) already holds this lock. Adding the
> lock inside the match callback would cause a deadlock on that path.
> ---
> drivers/base/bus.c | 17 ++++++++++++-----
> drivers/base/dd.c | 3 +++
> 2 files changed, 15 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/base/bus.c b/drivers/base/bus.c
> index 5e75e1bce551..9e62d6009058 100644
> --- a/drivers/base/bus.c
> +++ b/drivers/base/bus.c
> @@ -261,13 +261,20 @@ static ssize_t bind_store(struct device_driver *drv, const char *buf,
> const struct bus_type *bus = bus_get(drv->bus);
> struct device *dev;
> int err = -ENODEV;
> + int ret;
>
> dev = bus_find_device_by_name(bus, NULL, buf);
> - if (dev && driver_match_device(drv, dev)) {
> - err = device_driver_attach(drv, dev);
> - if (!err) {
> - /* success */
> - err = count;
> + if (dev) {
> + /* Protects against driver_set_override() races */
> + device_lock(dev);
> + ret = driver_match_device(drv, dev);
> + device_unlock(dev);

Why not have driver_match_device() take the lock instead? This way
looks like an "anti-pattern" that we will get wrong over time.

thanks,

greg k-h
On Wed, Nov 26, 2025 at 7:39 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Wed, Nov 26, 2025 at 07:30:06PM +0800, Gui-Dong Han wrote:
> > driver_set_override() modifies and frees dev->driver_override while
> > holding device_lock(dev). However, driver_match_device() reads
> > dev->driver_override when calling bus match functions.
> >
> > Currently, driver_match_device() is called from three sites. One site
> > (__device_attach_driver) holds device_lock(dev), but the other two
> > (bind_store and __driver_attach) do not. This allows a concurrent
> > driver_set_override() to free the string while driver_match_device() is
> > using it, leading to a use-after-free (UAF).
> >
> > This issue affects at least 11 bus types (including PCI, AMBA, Platform)
> > that rely on driver_override for matching.
> >
> > Fix this by holding device_lock(dev) around the driver_match_device() calls
> > in bind_store() and __driver_attach(). This ensures all access to
> > dev->driver_override via driver_match_device() is protected by the device
> > lock.
> >
> > Tested with the PoCs from Bugzilla that trigger this UAF. Stress testing
> > the two newly locked paths for 24 hours with CONFIG_PROVE_LOCKING and
> > CONFIG_LOCKDEP enabled showed no UAF recurrence and no lockdep
> > warnings.
> >
> > Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220789
> > Suggested-by: Qiu-ji Chen <chenqiuji666@gmail.com>
> > Signed-off-by: Gui-Dong Han <hanguidong02@gmail.com>
> > ---
> > The Bugzilla entry contains full KASAN reports and two PoCs that reliably
> > reproduce the UAF on both unlocked paths using a standard QEMU setup
> > (default e1000 device at 0000:00:03.0).
> > I chose to fix this in the driver core for the following reasons:
> > 1. Both racing functions are part of the driver core.
> > 2. Fixing this per-driver/per-bus is tedious and would require careful
> > ad-hoc locking that does not align with the existing device_lock(dev).
> > 3. We cannot simply add device_lock(dev) inside bus match functions because
> > one call path (__device_attach_driver) already holds this lock. Adding the
> > lock inside the match callback would cause a deadlock on that path.
> > ---
> > drivers/base/bus.c | 17 ++++++++++++-----
> > drivers/base/dd.c | 3 +++
> > 2 files changed, 15 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/base/bus.c b/drivers/base/bus.c
> > index 5e75e1bce551..9e62d6009058 100644
> > --- a/drivers/base/bus.c
> > +++ b/drivers/base/bus.c
> > @@ -261,13 +261,20 @@ static ssize_t bind_store(struct device_driver *drv, const char *buf,
> > const struct bus_type *bus = bus_get(drv->bus);
> > struct device *dev;
> > int err = -ENODEV;
> > + int ret;
> >
> > dev = bus_find_device_by_name(bus, NULL, buf);
> > - if (dev && driver_match_device(drv, dev)) {
> > - err = device_driver_attach(drv, dev);
> > - if (!err) {
> > - /* success */
> > - err = count;
> > + if (dev) {
> > + /* Protects against driver_set_override() races */
> > + device_lock(dev);
> > + ret = driver_match_device(drv, dev);
> > + device_unlock(dev);
>
> Why not have driver_match_device() take the lock instead? This way
> looks like an "anti-pattern" that we will get wrong over time.

The reason I did not put the lock inside driver_match_device() is that
one of its existing callers, __device_attach_driver(), already holds
device_lock(dev). Unconditionally adding the lock inside
driver_match_device() would cause a deadlock on that path.

To address this and move the locking inside as you suggested, I would
need to modify the signature of driver_match_device() to accept a
flag, for example:

    int driver_match_device(struct device_driver *drv, struct device *dev,
                            bool locked)

This would allow the function to conditionally acquire the lock based
on the caller's context.

Is this design acceptable? If so, I can prepare a v2 with this approach.

Thanks,
Gui-Dong Han
On Wed, Nov 26, 2025 at 07:55:06PM +0800, Gui-Dong Han wrote:
> On Wed, Nov 26, 2025 at 7:39 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > On Wed, Nov 26, 2025 at 07:30:06PM +0800, Gui-Dong Han wrote:
> > > driver_set_override() modifies and frees dev->driver_override while
> > > holding device_lock(dev). However, driver_match_device() reads
> > > dev->driver_override when calling bus match functions.
> > >
> > > Currently, driver_match_device() is called from three sites. One site
> > > (__device_attach_driver) holds device_lock(dev), but the other two
> > > (bind_store and __driver_attach) do not. This allows a concurrent
> > > driver_set_override() to free the string while driver_match_device() is
> > > using it, leading to a use-after-free (UAF).
> > >
> > > This issue affects at least 11 bus types (including PCI, AMBA, Platform)
> > > that rely on driver_override for matching.
> > >
> > > Fix this by holding device_lock(dev) around the driver_match_device() calls
> > > in bind_store() and __driver_attach(). This ensures all access to
> > > dev->driver_override via driver_match_device() is protected by the device
> > > lock.
> > >
> > > Tested with the PoCs from Bugzilla that trigger this UAF. Stress testing
> > > the two newly locked paths for 24 hours with CONFIG_PROVE_LOCKING and
> > > CONFIG_LOCKDEP enabled showed no UAF recurrence and no lockdep
> > > warnings.
> > >
> > > Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220789
> > > Suggested-by: Qiu-ji Chen <chenqiuji666@gmail.com>
> > > Signed-off-by: Gui-Dong Han <hanguidong02@gmail.com>
> > > ---
> > > The Bugzilla entry contains full KASAN reports and two PoCs that reliably
> > > reproduce the UAF on both unlocked paths using a standard QEMU setup
> > > (default e1000 device at 0000:00:03.0).
> > > I chose to fix this in the driver core for the following reasons:
> > > 1. Both racing functions are part of the driver core.
> > > 2. Fixing this per-driver/per-bus is tedious and would require careful
> > > ad-hoc locking that does not align with the existing device_lock(dev).
> > > 3. We cannot simply add device_lock(dev) inside bus match functions because
> > > one call path (__device_attach_driver) already holds this lock. Adding the
> > > lock inside the match callback would cause a deadlock on that path.
> > > ---
> > > drivers/base/bus.c | 17 ++++++++++++-----
> > > drivers/base/dd.c | 3 +++
> > > 2 files changed, 15 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/drivers/base/bus.c b/drivers/base/bus.c
> > > index 5e75e1bce551..9e62d6009058 100644
> > > --- a/drivers/base/bus.c
> > > +++ b/drivers/base/bus.c
> > > @@ -261,13 +261,20 @@ static ssize_t bind_store(struct device_driver *drv, const char *buf,
> > > const struct bus_type *bus = bus_get(drv->bus);
> > > struct device *dev;
> > > int err = -ENODEV;
> > > + int ret;
> > >
> > > dev = bus_find_device_by_name(bus, NULL, buf);
> > > - if (dev && driver_match_device(drv, dev)) {
> > > - err = device_driver_attach(drv, dev);
> > > - if (!err) {
> > > - /* success */
> > > - err = count;
> > > + if (dev) {
> > > + /* Protects against driver_set_override() races */
> > > + device_lock(dev);
> > > + ret = driver_match_device(drv, dev);
> > > + device_unlock(dev);
> >
> > Why not have driver_match_device() take the lock instead? This way
> > looks like an "anti-pattern" that we will get wrong over time.
>
> The reason I did not put the lock inside driver_match_device() is that
> one of its existing callers, __device_attach_driver(), already holds
> device_lock(dev). Unconditionally adding the lock inside
> driver_match_device() would cause a deadlock on that path.

Ok, then we should add a lockdep check that this specific lock is
grabbed in that function so that we don't forget that this is a
requirement now.

> To address this and move the locking inside as you suggested, I would
> need to modify the signature of driver_match_device() to accept a
> flag, for example:
> int driver_match_device(struct device_driver *drv, struct device *dev,
> bool locked)

No, that way lies madness, never do that :)

thanks,

greg k-h
On Wed, Nov 26, 2025 at 8:28 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Wed, Nov 26, 2025 at 07:55:06PM +0800, Gui-Dong Han wrote:
> > On Wed, Nov 26, 2025 at 7:39 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > >
> > > On Wed, Nov 26, 2025 at 07:30:06PM +0800, Gui-Dong Han wrote:
> > > > driver_set_override() modifies and frees dev->driver_override while
> > > > holding device_lock(dev). However, driver_match_device() reads
> > > > dev->driver_override when calling bus match functions.
> > > >
> > > > Currently, driver_match_device() is called from three sites. One site
> > > > (__device_attach_driver) holds device_lock(dev), but the other two
> > > > (bind_store and __driver_attach) do not. This allows a concurrent
> > > > driver_set_override() to free the string while driver_match_device() is
> > > > using it, leading to a use-after-free (UAF).
> > > >
> > > > This issue affects at least 11 bus types (including PCI, AMBA, Platform)
> > > > that rely on driver_override for matching.
> > > >
> > > > Fix this by holding device_lock(dev) around the driver_match_device() calls
> > > > in bind_store() and __driver_attach(). This ensures all access to
> > > > dev->driver_override via driver_match_device() is protected by the device
> > > > lock.
> > > >
> > > > Tested with the PoCs from Bugzilla that trigger this UAF. Stress testing
> > > > the two newly locked paths for 24 hours with CONFIG_PROVE_LOCKING and
> > > > CONFIG_LOCKDEP enabled showed no UAF recurrence and no lockdep
> > > > warnings.
> > > >
> > > > Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220789
> > > > Suggested-by: Qiu-ji Chen <chenqiuji666@gmail.com>
> > > > Signed-off-by: Gui-Dong Han <hanguidong02@gmail.com>
> > > > ---
> > > > The Bugzilla entry contains full KASAN reports and two PoCs that reliably
> > > > reproduce the UAF on both unlocked paths using a standard QEMU setup
> > > > (default e1000 device at 0000:00:03.0).
> > > > I chose to fix this in the driver core for the following reasons:
> > > > 1. Both racing functions are part of the driver core.
> > > > 2. Fixing this per-driver/per-bus is tedious and would require careful
> > > > ad-hoc locking that does not align with the existing device_lock(dev).
> > > > 3. We cannot simply add device_lock(dev) inside bus match functions because
> > > > one call path (__device_attach_driver) already holds this lock. Adding the
> > > > lock inside the match callback would cause a deadlock on that path.
> > > > ---
> > > > drivers/base/bus.c | 17 ++++++++++++-----
> > > > drivers/base/dd.c | 3 +++
> > > > 2 files changed, 15 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/drivers/base/bus.c b/drivers/base/bus.c
> > > > index 5e75e1bce551..9e62d6009058 100644
> > > > --- a/drivers/base/bus.c
> > > > +++ b/drivers/base/bus.c
> > > > @@ -261,13 +261,20 @@ static ssize_t bind_store(struct device_driver *drv, const char *buf,
> > > > const struct bus_type *bus = bus_get(drv->bus);
> > > > struct device *dev;
> > > > int err = -ENODEV;
> > > > + int ret;
> > > >
> > > > dev = bus_find_device_by_name(bus, NULL, buf);
> > > > - if (dev && driver_match_device(drv, dev)) {
> > > > - err = device_driver_attach(drv, dev);
> > > > - if (!err) {
> > > > - /* success */
> > > > - err = count;
> > > > + if (dev) {
> > > > + /* Protects against driver_set_override() races */
> > > > + device_lock(dev);
> > > > + ret = driver_match_device(drv, dev);
> > > > + device_unlock(dev);
> > >
> > > Why not have driver_match_device() take the lock instead? This way
> > > looks like an "anti-pattern" that we will get wrong over time.
> >
> > The reason I did not put the lock inside driver_match_device() is that
> > one of its existing callers, __device_attach_driver(), already holds
> > device_lock(dev). Unconditionally adding the lock inside
> > driver_match_device() would cause a deadlock on that path.
>
> Ok, then we should add a lockdep check that this specific lock is
> grabbed in that function so that we don't forget that this is a
> requirement now.

Agreed. I have just sent a v2 patch which adds device_lock_assert(dev)
to driver_match_device() to enforce this requirement.

Thanks,
Gui-Dong Han
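
For readers following the thread, the "caller must hold the lock" contract
can be sketched in userspace as below. This is an illustration only, not the
v2 patch: struct fake_device, fake_lock_is_held() and fake_match_device() are
hypothetical names. The check here only tests whether anyone holds the mutex,
which is coarser than lockdep (lockdep verifies that the current task holds
it), and the sketch returns -1 on a violation where the kernel assertion
would emit a warning and continue.

```c
#include <pthread.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for struct device; not a kernel type. */
struct fake_device {
	pthread_mutex_t lock;		/* plays the role of device_lock(dev) */
	const char *driver_override;
};

/*
 * True if some thread currently holds dev->lock. If trylock succeeds the
 * lock was free, so release it again and report "not held".
 */
static bool fake_lock_is_held(struct fake_device *dev)
{
	if (pthread_mutex_trylock(&dev->lock) == 0) {
		pthread_mutex_unlock(&dev->lock);
		return false;
	}
	return true;
}

/*
 * The match helper does not take the lock itself; it only checks that the
 * caller already did. Returns -1 if the precondition is violated.
 */
static int fake_match_device(struct fake_device *dev, const char *drv_name)
{
	if (!fake_lock_is_held(dev))
		return -1;
	if (dev->driver_override)
		return strcmp(dev->driver_override, drv_name) == 0;
	return 1;	/* simplification: no override counts as a match */
}
```

Keeping the lock in the callers and only asserting it in the helper avoids
both the nested-lock deadlock and a "locked" flag in the function signature.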