[PATCH v3] driver core: fix use-after-free of driver_override via driver_match_device()

Gui-Dong Han posted 1 patch 2 months, 1 week ago
drivers/base/base.h |  3 +++
drivers/base/bus.c  | 16 +++++++++++-----
drivers/base/dd.c   |  2 ++
3 files changed, 16 insertions(+), 5 deletions(-)
[PATCH v3] driver core: fix use-after-free of driver_override via driver_match_device()
Posted by Gui-Dong Han 2 months, 1 week ago
driver_set_override() modifies and frees dev->driver_override while
holding device_lock(dev). However, driver_match_device() reads
dev->driver_override when calling bus match functions.

Currently, driver_match_device() is called from three sites. One site
(__device_attach_driver) holds device_lock(dev), but the other two
(bind_store and __driver_attach) do not. This allows a concurrent
driver_set_override() to free the string while driver_match_device() is
using it, leading to a use-after-free (UAF).

This issue affects at least 11 bus types (including PCI, AMBA, Platform)
that rely on driver_override for matching.

Fix this by holding device_lock(dev) around the driver_match_device() calls
in bind_store() and __driver_attach(). This ensures all access to
dev->driver_override via driver_match_device() is protected by the device
lock. Also add a lock assertion to driver_match_device() to prevent future
locking regressions.

Tested with the PoCs from Bugzilla that trigger this UAF. Stress testing
the two newly locked paths for 24 hours with CONFIG_PROVE_LOCKING and
CONFIG_LOCKDEP enabled showed no UAF recurrence and no lockdep
warnings.

Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220789
Suggested-by: Qiu-ji Chen <chenqiuji666@gmail.com>
Signed-off-by: Gui-Dong Han <hanguidong02@gmail.com>
---
v3:
* Remove redundant locking comments at call sites and add a blank line
after the lock assertion in driver_match_device(), as suggested by Greg KH.
v2:
* Add device_lock_assert() in driver_match_device() to enforce locking
requirement, as suggested by Greg KH.
v1:
* The Bugzilla entry contains full KASAN reports and two PoCs that reliably
reproduce the UAF on both unlocked paths using a standard QEMU setup
(default e1000 device at 0000:00:03.0).
I chose to fix this in the driver core for the following reasons:
1. Both racing functions are part of the driver core.
2. Fixing this per-driver/per-bus is tedious and would require careful
ad-hoc locking that does not align with the existing device_lock(dev).
3. We cannot simply add device_lock(dev) inside bus match functions because
one call path (__device_attach_driver) already holds this lock. Adding the
lock inside the match callback would cause a deadlock on that path.
---
 drivers/base/base.h |  3 +++
 drivers/base/bus.c  | 16 +++++++++++-----
 drivers/base/dd.c   |  2 ++
 3 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/drivers/base/base.h b/drivers/base/base.h
index 86fa7fbb3548..72791125de91 100644
--- a/drivers/base/base.h
+++ b/drivers/base/base.h
@@ -166,6 +166,9 @@ void device_set_deferred_probe_reason(const struct device *dev, struct va_format
 static inline int driver_match_device(const struct device_driver *drv,
 				      struct device *dev)
 {
+	/* Protects against driver_set_override() races */
+	device_lock_assert(dev);
+
 	return drv->bus->match ? drv->bus->match(dev, drv) : 1;
 }
 
diff --git a/drivers/base/bus.c b/drivers/base/bus.c
index 5e75e1bce551..cec5f3164a0a 100644
--- a/drivers/base/bus.c
+++ b/drivers/base/bus.c
@@ -261,13 +261,19 @@ static ssize_t bind_store(struct device_driver *drv, const char *buf,
 	const struct bus_type *bus = bus_get(drv->bus);
 	struct device *dev;
 	int err = -ENODEV;
+	int ret;
 
 	dev = bus_find_device_by_name(bus, NULL, buf);
-	if (dev && driver_match_device(drv, dev)) {
-		err = device_driver_attach(drv, dev);
-		if (!err) {
-			/* success */
-			err = count;
+	if (dev) {
+		device_lock(dev);
+		ret = driver_match_device(drv, dev);
+		device_unlock(dev);
+		if (ret) {
+			err = device_driver_attach(drv, dev);
+			if (!err) {
+				/* success */
+				err = count;
+			}
 		}
 	}
 	put_device(dev);
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 13ab98e033ea..2ad2e8ca9052 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -1170,7 +1170,9 @@ static int __driver_attach(struct device *dev, void *data)
 	 * is an error.
 	 */
 
+	device_lock(dev);
 	ret = driver_match_device(drv, dev);
+	device_unlock(dev);
 	if (ret == 0) {
 		/* no match */
 		return 0;
-- 
2.43.0
Re: [PATCH v3] driver core: fix use-after-free of driver_override via driver_match_device()
Posted by Danilo Krummrich 3 weeks, 4 days ago
On Thu Nov 27, 2025 at 3:57 PM CET, Gui-Dong Han wrote:
> diff --git a/drivers/base/base.h b/drivers/base/base.h
> index 86fa7fbb3548..72791125de91 100644
> --- a/drivers/base/base.h
> +++ b/drivers/base/base.h
> @@ -166,6 +166,9 @@ void device_set_deferred_probe_reason(const struct device *dev, struct va_format
>  static inline int driver_match_device(const struct device_driver *drv,
>  				      struct device *dev)
>  {
> +	/* Protects against driver_set_override() races */
> +	device_lock_assert(dev);
> +
>  	return drv->bus->match ? drv->bus->match(dev, drv) : 1;
>  }

I am not convinced that this is the correct fix, since

  1. Not all match() callbacks access the driver_override field,

  2. driver_override is accessed in other places as well,

  3. driver_override is a bus device specific field (with a common
     helper admittedly).

I think it would be better to make driver_override a field in the base
struct device. This way we can not only provide driver_set_override(), but also
driver_get_override(), which should contain the device_lock_assert() instead.

While not all devices require the driver_override field, an additional pointer
in struct device does not hurt and it clarifies ownership and hence locking.

- Danilo
Re: [PATCH v3] driver core: fix use-after-free of driver_override via driver_match_device()
Posted by Gui-Dong Han 3 weeks, 4 days ago
On Tue, Jan 13, 2026 at 5:55 PM Danilo Krummrich <dakr@kernel.org> wrote:
>
> On Thu Nov 27, 2025 at 3:57 PM CET, Gui-Dong Han wrote:
> > diff --git a/drivers/base/base.h b/drivers/base/base.h
> > index 86fa7fbb3548..72791125de91 100644
> > --- a/drivers/base/base.h
> > +++ b/drivers/base/base.h
> > @@ -166,6 +166,9 @@ void device_set_deferred_probe_reason(const struct device *dev, struct va_format
> >  static inline int driver_match_device(const struct device_driver *drv,
> >                                     struct device *dev)
> >  {
> > +     /* Protects against driver_set_override() races */
> > +     device_lock_assert(dev);
> > +
> >       return drv->bus->match ? drv->bus->match(dev, drv) : 1;
> >  }
>
> I am not convinced that this is the correct fix, since
>
>   1. Not all match() callbacks access the driver_override field,
>
>   2. driver_override is accessed in other places as well,
>
>   3. driver_override is a bus device specific field (with a common
>      helper admittedly).
>
> I think it would be better to make driver_override a field in the base
> struct device. This way we can not only provide driver_set_override(), but also
> driver_get_override(), which should contain the device_lock_assert() instead.

Hi Danilo,

Thanks for the feedback. While I agree that moving driver_override to
struct device is a good idea for future refactoring, I believe this
patch is necessary as the immediate fix due to existing locking
constraints.

The main constraint is the device_lock. Currently,
driver_match_device() is called from three paths. One of them
(__device_attach_driver) already holds device_lock to protect the
whole binding process. We cannot add locking inside the bus match()
callbacks (or after refactoring, acquiring the lock to call a
hypothetical getter inside match) because it would deadlock this
existing path. Since we cannot remove the lock from
__device_attach_driver, the only safe solution is to ensure the lock
is held at the other two call sites. Moving the field to struct device
would not change this locking asymmetry. The device_lock is a coarse
lock intended for correctness here, and contention is low.

I understand you might be concerned about performance. I tested this
using ftrace. driver_match_device() is a cold path (called ~600 times
during boot in my setup). I observed no measurable boot time
regression. After boot, it is rarely called (e.g., module loading). So
I think the impact is negligible.

Regarding other access sites: driver_override is specifically designed
for matching. Apart from sysfs (which I am fixing separately [1]), I
have not found other places that require locking. Other accesses occur
mostly during init/cleanup (e.g., pci_device_probe/pci_release_dev)
where concurrency is not an issue. If there are other specific race
conditions, we can fix them individually as they are not universal. In
contrast, the match() path is the critical concurrent path where the
feature is actually implemented, affecting over 10 buses.

Even if we refactor (e.g., adding dev_access_driver_override), we
would still need to hold the lock at these call sites for the reasons
above. This patch is the minimal fix to close the UAF. Refactoring
involves the core, documentation, and multiple bus drivers, which will
take time to discuss and design. I suggest merging this fix first, and
we can explore the refactoring as a follow-up.

[1] https://lore.kernel.org/all/20251202174948.12693-1-hanguidong02@gmail.com/

Thanks
Re: [PATCH v3] driver core: fix use-after-free of driver_override via driver_match_device()
Posted by Danilo Krummrich 3 weeks, 4 days ago
On Tue Jan 13, 2026 at 1:42 PM CET, Gui-Dong Han wrote:
> On Tue, Jan 13, 2026 at 5:55 PM Danilo Krummrich <dakr@kernel.org> wrote:
>>
>> On Thu Nov 27, 2025 at 3:57 PM CET, Gui-Dong Han wrote:
>> > diff --git a/drivers/base/base.h b/drivers/base/base.h
>> > index 86fa7fbb3548..72791125de91 100644
>> > --- a/drivers/base/base.h
>> > +++ b/drivers/base/base.h
>> > @@ -166,6 +166,9 @@ void device_set_deferred_probe_reason(const struct device *dev, struct va_format
>> >  static inline int driver_match_device(const struct device_driver *drv,
>> >                                     struct device *dev)
>> >  {
>> > +     /* Protects against driver_set_override() races */
>> > +     device_lock_assert(dev);
>> > +
>> >       return drv->bus->match ? drv->bus->match(dev, drv) : 1;
>> >  }
>>
>> I am not convinced that this is the correct fix, since
>>
>>   1. Not all match() callbacks access the driver_override field,
>>
>>   2. driver_override is accessed in other places as well,
>>
>>   3. driver_override is a bus device specific field (with a common
>>      helper admittedly).
>>
>> I think it would be better to make driver_override a field in the base
>> struct device. This way we can not only provide driver_set_override(), but also
>> driver_get_override(), which should contain the device_lock_assert() instead.
>
> Hi Danilo,
>
> Thanks for the feedback. While I agree that moving driver_override to
> struct device is a good idea for future refactoring, I believe this
> patch is necessary as the immediate fix due to existing locking
> constraints.
>
> The main constraint is the device_lock. Currently,
> driver_match_device() is called from three paths. One of them
> (__device_attach_driver) already holds device_lock to protect the
> whole binding process. We cannot add locking inside the bus match()
> callbacks (or after refactoring, acquiring the lock to call a
> hypothetical getter inside match) because it would deadlock this
> existing path. Since we cannot remove the lock from
> __device_attach_driver, the only safe solution is to ensure the lock
> is held at the other two call sites. Moving the field to struct device
> would not change this locking asymmetry. The device_lock is a coarse
> lock intended for correctness here, and contention is low.

Don't get me wrong, I'm not against providing the guarantee that the match()
callbacks are always called with the device lock held, I think we should do
that.

But I do not agree that this is *properly* fixing locking around the
driver_override field by itself.

> I understand you might be concerned about performance. I tested this
> using ftrace. driver_match_device() is a cold path (called ~600 times
> during boot in my setup). I observed no measurable boot time
> regression. After boot, it is rarely called (e.g., module loading). So
> I think the impact is negligible.

I'm not concerned about performance at all. I'm concerned about locking code
paths instead of locking data.

> Regarding other access sites: driver_override is specifically designed
> for matching. Apart from sysfs (which I am fixing separately [1]), I
> have not found other places that require locking. Other accesses occur
> mostly during init/cleanup (e.g., pci_device_probe/pci_release_dev)
> where concurrency is not an issue. If there are other specific race
> conditions, we can fix them individually as they are not universal. In
> contrast, the match() path is the critical concurrent path where the
> feature is actually implemented, affecting over 10 buses.

Again, the problem is not that match() must be called with the device lock held
because of driver_set_override().

The problem is that currently match() is only *sometimes* called with the device
lock held, i.e. it would also work if match() is never called with the device
lock held.

So, there are two separate issues: fix the inconsistent locking guarantee of
match() callbacks and locking design of driver_override.

Regarding the latter, it is odd that the driver_override field belongs to the
bus device structure, which ultimately makes the bus responsible to ensure
mutual exclusion, while the driver-core helper driver_set_override() assumes
that the field is protected with the device lock.

So, either we leave it to the bus entirely, then the driver_set_override()
helper should take a lock pointer provided by the bus or we define that
driver_override is protected with the device lock, but then we should move it to
struct device and provide an accessor with proper documentation of the locking
rules.

Having that said, I think this patch should focus on fixing the inconsistent
locking locking guarantee of the match() callback. But it should not justify
that driver_match_device() is called with the device lock held with "Protects
against driver_set_override() races".

> Even if we refactor (e.g., adding dev_access_driver_override), we
> would still need to hold the lock at these call sites for the reasons
> above.

Sure, but then we have a lockdep assert and the rules are properly documented.
Re: [PATCH v3] driver core: fix use-after-free of driver_override via driver_match_device()
Posted by Gui-Dong Han 3 weeks, 3 days ago
On Tue, Jan 13, 2026 at 9:34 PM Danilo Krummrich <dakr@kernel.org> wrote:
>
> On Tue Jan 13, 2026 at 1:42 PM CET, Gui-Dong Han wrote:
> > On Tue, Jan 13, 2026 at 5:55 PM Danilo Krummrich <dakr@kernel.org> wrote:
> >>
> >> On Thu Nov 27, 2025 at 3:57 PM CET, Gui-Dong Han wrote:
> >> > diff --git a/drivers/base/base.h b/drivers/base/base.h
> >> > index 86fa7fbb3548..72791125de91 100644
> >> > --- a/drivers/base/base.h
> >> > +++ b/drivers/base/base.h
> >> > @@ -166,6 +166,9 @@ void device_set_deferred_probe_reason(const struct device *dev, struct va_format
> >> >  static inline int driver_match_device(const struct device_driver *drv,
> >> >                                     struct device *dev)
> >> >  {
> >> > +     /* Protects against driver_set_override() races */
> >> > +     device_lock_assert(dev);
> >> > +
> >> >       return drv->bus->match ? drv->bus->match(dev, drv) : 1;
> >> >  }
> >>
> >> I am not convinced that this is the correct fix, since
> >>
> >>   1. Not all match() callbacks access the driver_override field,
> >>
> >>   2. driver_override is accessed in other places as well,
> >>
> >>   3. driver_override is a bus device specific field (with a common
> >>      helper admittedly).
> >>
> >> I think it would be better to make driver_override a field in the base
> >> struct device. This way we can not only provide driver_set_override(), but also
> >> driver_get_override(), which should contain the device_lock_assert() instead.
> >
> > Hi Danilo,
> >
> > Thanks for the feedback. While I agree that moving driver_override to
> > struct device is a good idea for future refactoring, I believe this
> > patch is necessary as the immediate fix due to existing locking
> > constraints.
> >
> > The main constraint is the device_lock. Currently,
> > driver_match_device() is called from three paths. One of them
> > (__device_attach_driver) already holds device_lock to protect the
> > whole binding process. We cannot add locking inside the bus match()
> > callbacks (or after refactoring, acquiring the lock to call a
> > hypothetical getter inside match) because it would deadlock this
> > existing path. Since we cannot remove the lock from
> > __device_attach_driver, the only safe solution is to ensure the lock
> > is held at the other two call sites. Moving the field to struct device
> > would not change this locking asymmetry. The device_lock is a coarse
> > lock intended for correctness here, and contention is low.
>
> Don't get me wrong, I'm not against providing the guarantee that the match()
> callbacks are always called with the device lock held, I think we should do
> that.
>
> But I do not agree that this is *properly* fixing locking around the
> driver_override field by itself.
>
> > I understand you might be concerned about performance. I tested this
> > using ftrace. driver_match_device() is a cold path (called ~600 times
> > during boot in my setup). I observed no measurable boot time
> > regression. After boot, it is rarely called (e.g., module loading). So
> > I think the impact is negligible.
>
> I'm not concerned about performance at all. I'm concerned about locking code
> paths instead of locking data.
>
> > Regarding other access sites: driver_override is specifically designed
> > for matching. Apart from sysfs (which I am fixing separately [1]), I
> > have not found other places that require locking. Other accesses occur
> > mostly during init/cleanup (e.g., pci_device_probe/pci_release_dev)
> > where concurrency is not an issue. If there are other specific race
> > conditions, we can fix them individually as they are not universal. In
> > contrast, the match() path is the critical concurrent path where the
> > feature is actually implemented, affecting over 10 buses.
>
> Again, the problem is not that match() must be called with the device lock held
> because of driver_set_override().
>
> The problem is that currently match() is only *sometimes* called with the device
> lock held, i.e. it would also work if match() is never called with the device
> lock held.
>
> So, there are two separate issues: fix the inconsistent locking guarantee of
> match() callbacks and locking design of driver_override.

Fully agree.

> Regarding the latter, it is odd that the driver_override field belongs to the
> bus device structure, which ultimately makes the bus responsible to ensure
> mutual exclusion, while the driver-core helper driver_set_override() assumes
> that the field is protected with the device lock.
>
> So, either we leave it to the bus entirely, then the driver_set_override()
> helper should take a lock pointer provided by the bus or we define that
> driver_override is protected with the device lock, but then we should move it to
> struct device and provide an accessor with proper documentation of the locking
> rules.
>
> Having that said, I think this patch should focus on fixing the inconsistent
> locking locking guarantee of the match() callback. But it should not justify
> that driver_match_device() is called with the device lock held with "Protects
> against driver_set_override() races".

I see your point now. I agree that we should frame this as fixing the
inconsistent locking guarantee of match() callbacks. This is exactly
what the code changes in my patch implement. The fix for
driver_override UAF is a natural result of this consistency.

I plan to send a v4 to update the subject and commit log to focus on
enforcing consistent locking.

For the code comment, I will remove the mention of
driver_set_override. Do you prefer /* Ensure consistent locking for
match() callbacks */ or simply no comment?

Regarding the larger refactoring of driver_override, I am willing to
look into it as a follow-up task.

Sorry for the confusion earlier. Please let me know if this plan works for you.

Thanks
Re: [PATCH v3] driver core: fix use-after-free of driver_override via driver_match_device()
Posted by Danilo Krummrich 3 weeks, 3 days ago
On Tue Jan 13, 2026 at 3:05 PM CET, Gui-Dong Han wrote:
> I see your point now. I agree that we should frame this as fixing the
> inconsistent locking guarantee of match() callbacks. This is exactly
> what the code changes in my patch implement. The fix for
> driver_override UAF is a natural result of this consistency.
>
> I plan to send a v4 to update the subject and commit log to focus on
> enforcing consistent locking.

Great, thanks!

> For the code comment, I will remove the mention of
> driver_set_override. Do you prefer /* Ensure consistent locking for
> match() callbacks */ or simply no comment?

I think both is fine, the comment probably doesn't hurt.

> Regarding the larger refactoring of driver_override, I am willing to
> look into it as a follow-up task.

Thanks!
Re: [PATCH v3] driver core: fix use-after-free of driver_override via driver_match_device()
Posted by Rafael J. Wysocki 3 weeks, 4 days ago
On Tue, Jan 13, 2026 at 2:34 PM Danilo Krummrich <dakr@kernel.org> wrote:
>
> On Tue Jan 13, 2026 at 1:42 PM CET, Gui-Dong Han wrote:
> > On Tue, Jan 13, 2026 at 5:55 PM Danilo Krummrich <dakr@kernel.org> wrote:
> >>
> >> On Thu Nov 27, 2025 at 3:57 PM CET, Gui-Dong Han wrote:
> >> > diff --git a/drivers/base/base.h b/drivers/base/base.h
> >> > index 86fa7fbb3548..72791125de91 100644
> >> > --- a/drivers/base/base.h
> >> > +++ b/drivers/base/base.h
> >> > @@ -166,6 +166,9 @@ void device_set_deferred_probe_reason(const struct device *dev, struct va_format
> >> >  static inline int driver_match_device(const struct device_driver *drv,
> >> >                                     struct device *dev)
> >> >  {
> >> > +     /* Protects against driver_set_override() races */
> >> > +     device_lock_assert(dev);
> >> > +
> >> >       return drv->bus->match ? drv->bus->match(dev, drv) : 1;
> >> >  }
> >>
> >> I am not convinced that this is the correct fix, since
> >>
> >>   1. Not all match() callbacks access the driver_override field,
> >>
> >>   2. driver_override is accessed in other places as well,
> >>
> >>   3. driver_override is a bus device specific field (with a common
> >>      helper admittedly).
> >>
> >> I think it would be better to make driver_override a field in the base
> >> struct device. This way we can not only provide driver_set_override(), but also
> >> driver_get_override(), which should contain the device_lock_assert() instead.
> >
> > Hi Danilo,
> >
> > Thanks for the feedback. While I agree that moving driver_override to
> > struct device is a good idea for future refactoring, I believe this
> > patch is necessary as the immediate fix due to existing locking
> > constraints.
> >
> > The main constraint is the device_lock. Currently,
> > driver_match_device() is called from three paths. One of them
> > (__device_attach_driver) already holds device_lock to protect the
> > whole binding process. We cannot add locking inside the bus match()
> > callbacks (or after refactoring, acquiring the lock to call a
> > hypothetical getter inside match) because it would deadlock this
> > existing path. Since we cannot remove the lock from
> > __device_attach_driver, the only safe solution is to ensure the lock
> > is held at the other two call sites. Moving the field to struct device
> > would not change this locking asymmetry. The device_lock is a coarse
> > lock intended for correctness here, and contention is low.
>
> Don't get me wrong, I'm not against providing the guarantee that the match()
> callbacks are always called with the device lock held, I think we should do
> that.
>
> But I do not agree that this is *properly* fixing locking around the
> driver_override field by itself.
>
> > I understand you might be concerned about performance. I tested this
> > using ftrace. driver_match_device() is a cold path (called ~600 times
> > during boot in my setup). I observed no measurable boot time
> > regression. After boot, it is rarely called (e.g., module loading). So
> > I think the impact is negligible.
>
> I'm not concerned about performance at all. I'm concerned about locking code
> paths instead of locking data.
>
> > Regarding other access sites: driver_override is specifically designed
> > for matching. Apart from sysfs (which I am fixing separately [1]), I
> > have not found other places that require locking. Other accesses occur
> > mostly during init/cleanup (e.g., pci_device_probe/pci_release_dev)
> > where concurrency is not an issue. If there are other specific race
> > conditions, we can fix them individually as they are not universal. In
> > contrast, the match() path is the critical concurrent path where the
> > feature is actually implemented, affecting over 10 buses.
>
> Again, the problem is not that match() must be called with the device lock held
> because of driver_set_override().
>
> The problem is that currently match() is only *sometimes* called with the device
> lock held, i.e. it would also work if match() is never called with the device
> lock held.
>
> So, there are two separate issues: fix the inconsistent locking guarantee of
> match() callbacks and locking design of driver_override.
>
> Regarding the latter, it is odd that the driver_override field belongs to the
> bus device structure, which ultimately makes the bus responsible to ensure
> mutual exclusion, while the driver-core helper driver_set_override() assumes
> that the field is protected with the device lock.
>
> So, either we leave it to the bus entirely, then the driver_set_override()
> helper should take a lock pointer provided by the bus or we define that
> driver_override is protected with the device lock, but then we should move it to
> struct device and provide an accessor with proper documentation of the locking
> rules.
>
> Having that said, I think this patch should focus on fixing the inconsistent
> locking locking guarantee of the match() callback. But it should not justify
> that driver_match_device() is called with the device lock held with "Protects
> against driver_set_override() races".

I agree here.

The inconsistent locking is a problem and the resultant race
conditions are a symptom of it.

> > Even if we refactor (e.g., adding dev_access_driver_override), we
> > would still need to hold the lock at these call sites for the reasons
> > above.
>
> Sure, but then we have a lockdep assert and the rules are properly documented.

Right.
Re: [PATCH v3] driver core: fix use-after-free of driver_override via driver_match_device()
Posted by Danilo Krummrich 3 weeks, 4 days ago
On Tue Jan 13, 2026 at 10:55 AM CET, Danilo Krummrich wrote:
> On Thu Nov 27, 2025 at 3:57 PM CET, Gui-Dong Han wrote:
>> diff --git a/drivers/base/base.h b/drivers/base/base.h
>> index 86fa7fbb3548..72791125de91 100644
>> --- a/drivers/base/base.h
>> +++ b/drivers/base/base.h
>> @@ -166,6 +166,9 @@ void device_set_deferred_probe_reason(const struct device *dev, struct va_format
>>  static inline int driver_match_device(const struct device_driver *drv,
>>  				      struct device *dev)
>>  {
>> +	/* Protects against driver_set_override() races */
>> +	device_lock_assert(dev);
>> +
>>  	return drv->bus->match ? drv->bus->match(dev, drv) : 1;
>>  }
>
> I am not convinced that this is the correct fix, since
>
>   1. Not all match() callbacks access the driver_override field,
>
>   2. driver_override is accessed in other places as well,
>
>   3. driver_override is a bus device specific field (with a common
>      helper admittedly).
>
> I think it would be better to make driver_override a field in the base
> struct device. This way we can not only provide driver_set_override(), but also
> driver_get_override(), which should contain the device_lock_assert() instead.

We should probably avoid the term 'get' in the accessor, since it could be
misleading.

More generally the naming is a bit odd: driver_set_override() reads as if we'd
set an override field of some driver structure, whereas we actually set the
driver_override field of a device structure.

Hence, I'd suggest to eventually go with dev_set_driver_override() and
dev_access_driver_override() (or just dev_driver_override()).

Also, the documentation of the accessor should document that the returned
pointer is only valid as long as the lock is held.

(In Rust it would be possible to make the compiler ensure that the returned
pointer can't be accessed anymore after the lock is dropped.)

> While not all devices require the driver_override field, an additional pointer
> in struct device does not hurt and it clarifies ownership and hence locking.
>
> - Danilo