[PATCH v3 0/5] platform/chrome: Fix a possible UAF via revocable

Tzung-Bi Shih posted 5 patches 2 weeks, 6 days ago
There is a newer version of this series
.../driver-api/driver-model/index.rst         |   1 +
.../driver-api/driver-model/revocable.rst     | 151 ++++++++++++
MAINTAINERS                                   |   9 +
drivers/base/Kconfig                          |   8 +
drivers/base/Makefile                         |   5 +-
drivers/base/revocable.c                      | 229 ++++++++++++++++++
drivers/base/revocable_test.c                 | 110 +++++++++
drivers/platform/chrome/cros_ec.c             |   5 +
drivers/platform/chrome/cros_ec_chardev.c     | 124 +++++++---
include/linux/platform_data/cros_ec_proto.h   |   4 +
include/linux/revocable.h                     |  37 +++
tools/testing/selftests/Makefile              |   1 +
.../selftests/drivers/base/revocable/Makefile |   7 +
.../drivers/base/revocable/revocable_test.c   | 116 +++++++++
.../drivers/base/revocable/test-revocable.sh  |  39 +++
.../base/revocable/test_modules/Makefile      |  10 +
.../revocable/test_modules/revocable_test.c   | 188 ++++++++++++++
17 files changed, 1003 insertions(+), 41 deletions(-)
create mode 100644 Documentation/driver-api/driver-model/revocable.rst
create mode 100644 drivers/base/revocable.c
create mode 100644 drivers/base/revocable_test.c
create mode 100644 include/linux/revocable.h
create mode 100644 tools/testing/selftests/drivers/base/revocable/Makefile
create mode 100644 tools/testing/selftests/drivers/base/revocable/revocable_test.c
create mode 100755 tools/testing/selftests/drivers/base/revocable/test-revocable.sh
create mode 100644 tools/testing/selftests/drivers/base/revocable/test_modules/Makefile
create mode 100644 tools/testing/selftests/drivers/base/revocable/test_modules/revocable_test.c
[PATCH v3 0/5] platform/chrome: Fix a possible UAF via revocable
Posted by Tzung-Bi Shih 2 weeks, 6 days ago
This is a follow-up series of [1].  It tries to fix a possible UAF in the
fops of cros_ec_chardev after the underlying protocol device has gone by
using revocable.

The 1st patch introduces the revocable which is an implementation of ideas
from the talk [2].

The 2nd and 3rd patches add test cases for revocable in Kunit and selftest.

The 4th patch converts existing protocol devices to resource providers
of cros_ec_device.

The 5th patch converts cros_ec_chardev to a resource consumer of
cros_ec_device to fix the UAF.

[1] https://lore.kernel.org/chrome-platform/20250721044456.2736300-6-tzungbi@kernel.org/
[2] https://lpc.events/event/17/contributions/1627/

Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Cc: Wolfram Sang <wsa+renesas@sang-engineering.com>

v3:
- Rebase onto https://lore.kernel.org/chrome-platform/20250828083601.856083-1-tzungbi@kernel.org
  and next-20250912.
- Change the 4th patch accordingly.

v2: https://lore.kernel.org/chrome-platform/20250820081645.847919-1-tzungbi@kernel.org
- Rename "ref_proxy" -> "revocable".
- Add test cases in Kunit and selftest.

v1: https://lore.kernel.org/chrome-platform/20250814091020.1302888-1-tzungbi@kernel.org/

Tzung-Bi Shih (5):
  revocable: Revocable resource management
  revocable: Add Kunit test cases
  selftests: revocable: Add kselftest cases
  platform/chrome: Protect cros_ec_device lifecycle with revocable
  platform/chrome: cros_ec_chardev: Consume cros_ec_device via revocable

 .../driver-api/driver-model/index.rst         |   1 +
 .../driver-api/driver-model/revocable.rst     | 151 ++++++++++++
 MAINTAINERS                                   |   9 +
 drivers/base/Kconfig                          |   8 +
 drivers/base/Makefile                         |   5 +-
 drivers/base/revocable.c                      | 229 ++++++++++++++++++
 drivers/base/revocable_test.c                 | 110 +++++++++
 drivers/platform/chrome/cros_ec.c             |   5 +
 drivers/platform/chrome/cros_ec_chardev.c     | 124 +++++++---
 include/linux/platform_data/cros_ec_proto.h   |   4 +
 include/linux/revocable.h                     |  37 +++
 tools/testing/selftests/Makefile              |   1 +
 .../selftests/drivers/base/revocable/Makefile |   7 +
 .../drivers/base/revocable/revocable_test.c   | 116 +++++++++
 .../drivers/base/revocable/test-revocable.sh  |  39 +++
 .../base/revocable/test_modules/Makefile      |  10 +
 .../revocable/test_modules/revocable_test.c   | 188 ++++++++++++++
 17 files changed, 1003 insertions(+), 41 deletions(-)
 create mode 100644 Documentation/driver-api/driver-model/revocable.rst
 create mode 100644 drivers/base/revocable.c
 create mode 100644 drivers/base/revocable_test.c
 create mode 100644 include/linux/revocable.h
 create mode 100644 tools/testing/selftests/drivers/base/revocable/Makefile
 create mode 100644 tools/testing/selftests/drivers/base/revocable/revocable_test.c
 create mode 100755 tools/testing/selftests/drivers/base/revocable/test-revocable.sh
 create mode 100644 tools/testing/selftests/drivers/base/revocable/test_modules/Makefile
 create mode 100644 tools/testing/selftests/drivers/base/revocable/test_modules/revocable_test.c

-- 
2.51.0.384.g4c02a37b29-goog
Re: [PATCH v3 0/5] platform/chrome: Fix a possible UAF via revocable
Posted by Krzysztof Kozlowski 2 weeks, 6 days ago
On 12/09/2025 10:17, Tzung-Bi Shih wrote:
> This is a follow-up series of [1].  It tries to fix a possible UAF in the
> fops of cros_ec_chardev after the underlying protocol device has gone by
> using revocable.
> 
> The 1st patch introduces the revocable which is an implementation of ideas
> from the talk [2].
> 
> The 2nd and 3rd patches add test cases for revocable in Kunit and selftest.
> 
> The 4th patch converts existing protocol devices to resource providers
> of cros_ec_device.
> 
> The 5th patch converts cros_ec_chardev to a resource consumer of
> cros_ec_device to fix the UAF.
> 
> [1] https://lore.kernel.org/chrome-platform/20250721044456.2736300-6-tzungbi@kernel.org/
> [2] https://lpc.events/event/17/contributions/1627/
> 
> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> Cc: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> Cc: Wolfram Sang <wsa+renesas@sang-engineering.com>

Thanks for the work. Just a note, please start using b4, so above Cc
will be propagated to all patches. Folks above received only the cover
letter...

Best regards,
Krzysztof
Re: [PATCH v3 0/5] platform/chrome: Fix a possible UAF via revocable
Posted by Bartosz Golaszewski 2 weeks, 6 days ago
On Fri, 12 Sept 2025 at 11:09, Krzysztof Kozlowski <krzk@kernel.org> wrote:
>
> On 12/09/2025 10:17, Tzung-Bi Shih wrote:
> > This is a follow-up series of [1].  It tries to fix a possible UAF in the
> > fops of cros_ec_chardev after the underlying protocol device has gone by
> > using revocable.
> >
> > The 1st patch introduces the revocable which is an implementation of ideas
> > from the talk [2].
> >
> > The 2nd and 3rd patches add test cases for revocable in Kunit and selftest.
> >
> > The 4th patch converts existing protocol devices to resource providers
> > of cros_ec_device.
> >
> > The 5th patch converts cros_ec_chardev to a resource consumer of
> > cros_ec_device to fix the UAF.
> >
> > [1] https://lore.kernel.org/chrome-platform/20250721044456.2736300-6-tzungbi@kernel.org/
> > [2] https://lpc.events/event/17/contributions/1627/
> >
> > Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > Cc: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> > Cc: Wolfram Sang <wsa+renesas@sang-engineering.com>
>
> Thanks for the work. Just a note, please start using b4, so above Cc
> will be propagated to all patches. Folks above received only the cover
> letter...
>

Thanks to Krzysztof for making me aware of this. Could you please Cc
my brgl@bgdev.pl address on the next iteration.

I haven't looked into the details yet but the small size of the first
patch strikes me as odd. The similar changes I did for GPIO were quite
big and they were designed just for a single sub-system.

During the talk you reference, after I suggested a library like this,
Greg KH can be heard saying: do this for two big subsystems so that
you're sure it's a generic solution. Here you're only using it in a
single driver which makes me wonder if we can actually use it to
improve bigger offenders, like for example I2C, or even replace the
custom, SRCU-based solution in GPIO we have now. Have you considered
at least doing a PoC in a wider kernel framework?

Just my two cents.

Bartosz
Re: [PATCH v3 0/5] platform/chrome: Fix a possible UAF via revocable
Posted by Tzung-Bi Shih 2 weeks, 6 days ago
On Fri, Sep 12, 2025 at 11:24:10AM +0200, Bartosz Golaszewski wrote:
> On Fri, 12 Sept 2025 at 11:09, Krzysztof Kozlowski <krzk@kernel.org> wrote:
> > On 12/09/2025 10:17, Tzung-Bi Shih wrote:
> > > Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > > Cc: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> > > Cc: Wolfram Sang <wsa+renesas@sang-engineering.com>
> >
> > Thanks for the work. Just a note, please start using b4, so above Cc
> > will be propagated to all patches. Folks above received only the cover
> > letter...

Thank you for bringing this to my attention.  I wasn't aware of that and
will ensure this is handled correctly in the future.

> Thanks to Krzysztof for making me aware of this. Could you please Cc
> my brgl@bgdev.pl address on the next iteration.

Sure, will do.

> I haven't looked into the details yet but the small size of the first
> patch strikes me as odd. The similar changes I did for GPIO were quite
> big and they were designed just for a single sub-system.
> 
> During the talk you reference, after I suggested a library like this,
> Greg KH can be heard saying: do this for two big subsystems so that
> you're sure it's a generic solution. Here you're only using it in a
> single driver which makes me wonder if we can actually use it to
> improve bigger offenders, like for example I2C, or even replace the
> custom, SRCU-based solution in GPIO we have now. Have you considered
> at least doing a PoC in a wider kernel framework?

Yes, I'm happy to take this on.

To help me get started, could you please point me to some relevant code
locations?  Also, could you let me know if any specific physical devices
will be needed for testing?
Re: [PATCH v3 0/5] platform/chrome: Fix a possible UAF via revocable
Posted by Laurent Pinchart 2 weeks, 6 days ago
On Fri, Sep 12, 2025 at 08:49:30PM +0800, Tzung-Bi Shih wrote:
> On Fri, Sep 12, 2025 at 11:24:10AM +0200, Bartosz Golaszewski wrote:
> > On Fri, 12 Sept 2025 at 11:09, Krzysztof Kozlowski wrote:
> > > On 12/09/2025 10:17, Tzung-Bi Shih wrote:
> > > > Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > > > Cc: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> > > > Cc: Wolfram Sang <wsa+renesas@sang-engineering.com>
> > >
> > > Thanks for the work. Just a note, please start using b4, so above Cc
> > > will be propagated to all patches. Folks above received only the cover
> > > letter...
> 
> Thank you for bringing this to my attention.  I wasn't aware of that and
> will ensure this is handled correctly in the future.
> 
> > Thanks to Krzysztof for making me aware of this. Could you please Cc
> > my brgl@bgdev.pl address on the next iteration.
> 
> Sure, will do.
> 
> > I haven't looked into the details yet but the small size of the first
> > patch strikes me as odd. The similar changes I did for GPIO were quite
> > big and they were designed just for a single sub-system.
> > 
> > During the talk you reference, after I suggested a library like this,
> > Greg KH can be heard saying: do this for two big subsystems so that
> > you're sure it's a generic solution. Here you're only using it in a
> > single driver which makes me wonder if we can actually use it to
> > improve bigger offenders, like for example I2C, or even replace the
> > custom, SRCU-based solution in GPIO we have now. Have you considered
> > at least doing a PoC in a wider kernel framework?
> 
> Yes, I'm happy to take this on.
> 
> To help me get started, could you please point me to some relevant code
> locations?  Also, could you let me know if any specific physical devices
> will be needed for testing?

One interesting test would be to move the logic to the cdev layer. The
use-after-free problem isn't specific to one type of character device,
and so shouldn't require a fix in every driver instantiating a cdev
directly (or indirectly). See [1] for a previous attempt to handle this
at the V4L2 level and [2] for an attempt to handle it at the cdev level.

In [1], two new functions named video_device_enter() and
video_device_exit() flag the beginning and end of protected code
sections. The equivalent in [2] is the manual get/put of cdev->qactive,
and if I understand things correctly, your series creates a REVOCABLE()
macro to do the same. I'm sure we'll bikesheed about names at some
point, but for the time being, what I'd like to see if this being done
in fs/char_dev.c to cover all entry points from userspace at the cdev
level.

We then have video_device_unplug() in [1], which I think is more or less
the equivalent of revocable_provider_free(). I don't think we'll be able
to hide this completely from drivers, at least not in all cases. We
should however design the API to make it easy for drivers, likely with
subsystem-specific wrappers.

What I have in mind is roughly the following:

1. Protect all access to the cdev from userspace with enter/exit calls
   that flag if a call is in progress. This can be done with explicit
   function calls, or with a scope guard as in your series.

2. At .remove() time, start by flagging that the device is being
   removed. That has to be an explicit call from drivers I believe,
   likely using subsystem-specific wrappers to simplify things.

3. Once the device is marked as being removed, all enter() calls should
   fail at the cdev level.

4. In .remove(), proceed to perform driver-specific operations that will
   stop the device and wake up any userspace task blocked on a syscall
   protected by enter()/remove(). This isn't needed for
   drivers/subsystems that don't provide any blocking API, but is
   required otherwise.

5. Unregister, still in .remove(), the cdev (likely through
   subsystem-specific APIs in most cases). This should block until all
   protected sections have exited.

6. The cdev is now unregistered, can't be opened anymore, and any
   new syscall on any opened file handle will return an error. The
   driver's .remove() function can proceed to free data, there won't be
   any UAF caused by userspace.

[1] implemented this fairly naively with flags and spinlocks. An
RCU-based implementation is probably more efficient, even if I don't
know how performance-sensitive all this is.

Does this align with your design, and do you think you could give a try
at pushing revocable resource handling to the cdev level ?

On a separate note, I'm not sure "revocable" is the right name here. I
believe a revocable resource API is needed, and well-named, for
in-kernel consumers (e.g. drivers consuming a GPIO or clock). For the
userspace syscalls racing with .remove(), I don't think we're dealing
with "revocable resources". Now, if a "revocable resources" API were to
support the in-kernel users, and be usable as a building block to fix
the cdev issue, I would have nothing against it, but the "revocable"
name should be internal in that case, used in the cdev layer only, and
not exposed to drivers (or even subsystem helpers that should wrap cdev
functions instead).

[1] https://lore.kernel.org/r/20171116003349.19235-1-laurent.pinchart+renesas@ideasonboard.com
[2] https://lore.kernel.org/r/161117153248.2853729.2452425259045172318.stgit@dwillia2-desk3.amr.corp.intel.com

-- 
Regards,

Laurent Pinchart
Re: [PATCH v3 0/5] platform/chrome: Fix a possible UAF via revocable
Posted by Greg Kroah-Hartman 2 weeks, 6 days ago
On Fri, Sep 12, 2025 at 04:26:56PM +0300, Laurent Pinchart wrote:
> On Fri, Sep 12, 2025 at 08:49:30PM +0800, Tzung-Bi Shih wrote:
> > On Fri, Sep 12, 2025 at 11:24:10AM +0200, Bartosz Golaszewski wrote:
> > > On Fri, 12 Sept 2025 at 11:09, Krzysztof Kozlowski wrote:
> > > > On 12/09/2025 10:17, Tzung-Bi Shih wrote:
> > > > > Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > > > > Cc: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> > > > > Cc: Wolfram Sang <wsa+renesas@sang-engineering.com>
> > > >
> > > > Thanks for the work. Just a note, please start using b4, so above Cc
> > > > will be propagated to all patches. Folks above received only the cover
> > > > letter...
> > 
> > Thank you for bringing this to my attention.  I wasn't aware of that and
> > will ensure this is handled correctly in the future.
> > 
> > > Thanks to Krzysztof for making me aware of this. Could you please Cc
> > > my brgl@bgdev.pl address on the next iteration.
> > 
> > Sure, will do.
> > 
> > > I haven't looked into the details yet but the small size of the first
> > > patch strikes me as odd. The similar changes I did for GPIO were quite
> > > big and they were designed just for a single sub-system.
> > > 
> > > During the talk you reference, after I suggested a library like this,
> > > Greg KH can be heard saying: do this for two big subsystems so that
> > > you're sure it's a generic solution. Here you're only using it in a
> > > single driver which makes me wonder if we can actually use it to
> > > improve bigger offenders, like for example I2C, or even replace the
> > > custom, SRCU-based solution in GPIO we have now. Have you considered
> > > at least doing a PoC in a wider kernel framework?
> > 
> > Yes, I'm happy to take this on.
> > 
> > To help me get started, could you please point me to some relevant code
> > locations?  Also, could you let me know if any specific physical devices
> > will be needed for testing?
> 
> One interesting test would be to move the logic to the cdev layer. The
> use-after-free problem isn't specific to one type of character device,
> and so shouldn't require a fix in every driver instantiating a cdev
> directly (or indirectly). See [1] for a previous attempt to handle this
> at the V4L2 level and [2] for an attempt to handle it at the cdev level.
> 
> In [1], two new functions named video_device_enter() and
> video_device_exit() flag the beginning and end of protected code
> sections. The equivalent in [2] is the manual get/put of cdev->qactive,
> and if I understand things correctly, your series creates a REVOCABLE()
> macro to do the same. I'm sure we'll bikesheed about names at some
> point, but for the time being, what I'd like to see if this being done
> in fs/char_dev.c to cover all entry points from userspace at the cdev
> level.
> 
> We then have video_device_unplug() in [1], which I think is more or less
> the equivalent of revocable_provider_free(). I don't think we'll be able
> to hide this completely from drivers, at least not in all cases. We
> should however design the API to make it easy for drivers, likely with
> subsystem-specific wrappers.
> 
> What I have in mind is roughly the following:
> 
> 1. Protect all access to the cdev from userspace with enter/exit calls
>    that flag if a call is in progress. This can be done with explicit
>    function calls, or with a scope guard as in your series.
> 
> 2. At .remove() time, start by flagging that the device is being
>    removed. That has to be an explicit call from drivers I believe,
>    likely using subsystem-specific wrappers to simplify things.
> 
> 3. Once the device is marked as being removed, all enter() calls should
>    fail at the cdev level.
> 
> 4. In .remove(), proceed to perform driver-specific operations that will
>    stop the device and wake up any userspace task blocked on a syscall
>    protected by enter()/remove(). This isn't needed for
>    drivers/subsystems that don't provide any blocking API, but is
>    required otherwise.
> 
> 5. Unregister, still in .remove(), the cdev (likely through
>    subsystem-specific APIs in most cases). This should block until all
>    protected sections have exited.
> 
> 6. The cdev is now unregistered, can't be opened anymore, and any
>    new syscall on any opened file handle will return an error. The
>    driver's .remove() function can proceed to free data, there won't be
>    any UAF caused by userspace.
> 
> [1] implemented this fairly naively with flags and spinlocks. An
> RCU-based implementation is probably more efficient, even if I don't
> know how performance-sensitive all this is.
> 
> Does this align with your design, and do you think you could give a try
> at pushing revocable resource handling to the cdev level ?
> 
> On a separate note, I'm not sure "revocable" is the right name here. I
> believe a revocable resource API is needed, and well-named, for
> in-kernel consumers (e.g. drivers consuming a GPIO or clock). For the
> userspace syscalls racing with .remove(), I don't think we're dealing
> with "revocable resources". Now, if a "revocable resources" API were to
> support the in-kernel users, and be usable as a building block to fix
> the cdev issue, I would have nothing against it, but the "revocable"
> name should be internal in that case, used in the cdev layer only, and
> not exposed to drivers (or even subsystem helpers that should wrap cdev
> functions instead).

I think the name makes sense as it matches up with how things are
working (the backend structure is "revoked"), but naming is tough :)

I have no objection moving this to the cdev api, BUT given that 'struct
cdev' is embedded everywhere, I don't think it's going to be a simple
task, but rather have to be done one-driver-at-a-time like the patch in
this series does it.

And that's fine, we have "interns" that we can set loose on this type of
code conversions, I think we just need to wrap the access to the cdev
with this api, which will take a bit of rewriting in many drivers.

Anyway, just my thought, if someone else can see how this could drop
into the core cdev code without any changes needed, that would be great,
but I don't see it at the moment.  cdev is just too "raw" for that.

thanks,

greg k-h
Re: [PATCH v3 0/5] platform/chrome: Fix a possible UAF via revocable
Posted by Bartosz Golaszewski 2 weeks, 6 days ago
On Fri, Sep 12, 2025 at 3:39 PM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> I have no objection moving this to the cdev api, BUT given that 'struct
> cdev' is embedded everywhere, I don't think it's going to be a simple
> task, but rather have to be done one-driver-at-a-time like the patch in
> this series does it.
>

I don't think cdev is the right place for this as user-space keeping a
reference to a file-descriptor whose "backend" disappeared is not the
only possible problem. We can easily create a use-case of a USB I2C
expander being used by some in-kernel consumer and then unplugged.
This has nothing to do with the character device. I believe the
sub-system level is the right place for this and every driver
subsystem would have to integrate it separately, taking its various
quirks into account.

Bartosz
Re: [PATCH v3 0/5] platform/chrome: Fix a possible UAF via revocable
Posted by Laurent Pinchart 2 weeks, 6 days ago
On Fri, Sep 12, 2025 at 03:46:27PM +0200, Bartosz Golaszewski wrote:
> On Fri, Sep 12, 2025 at 3:39 PM Greg Kroah-Hartman wrote:
> >
> > I have no objection moving this to the cdev api, BUT given that 'struct
> > cdev' is embedded everywhere, I don't think it's going to be a simple
> > task, but rather have to be done one-driver-at-a-time like the patch in
> > this series does it.
> 
> I don't think cdev is the right place for this as user-space keeping a
> reference to a file-descriptor whose "backend" disappeared is not the
> only possible problem. We can easily create a use-case of a USB I2C
> expander being used by some in-kernel consumer and then unplugged.
> This has nothing to do with the character device. I believe the
> sub-system level is the right place for this and every driver
> subsystem would have to integrate it separately, taking its various
> quirks into account.

That's why I mentioned in-kernel users previously. Drivers routinely
acquire resources provided by other drivers, and having a way to revoke
those is needed.

It is a different but related problem compared to userspace racing with
.remove(). Could we solve both using the same backend concepts ?
Perhaps, time will tell, and if that works nicely, great. But we still
have lots of drivers exposing character devices to userspace (usually
through a subsystem-specific API, drivers that create a cdev manually
are the minority). That problem is in my opinion more urgent than
handling the removal of in-kernel resources, because it's more common,
and is easily triggerable by userspace. The good news is that it should
also be simpler to solve, we should be able to address the enter/exit
part entirely in cdev, and limit the changes to drivers in .remove() to
the strict minimum.

What I'd like to see is if the proposed implementation of revocable
resources can be used as a building block to fix the cdev issue. If it
ca, great, let's solve it then. If it can't, that's still fine, it will
still be useful for in-kernel resources, even if we need a different
implementation for cdev.

-- 
Regards,

Laurent Pinchart
Re: [PATCH v3 0/5] platform/chrome: Fix a possible UAF via revocable
Posted by Greg Kroah-Hartman 2 weeks, 6 days ago
On Fri, Sep 12, 2025 at 04:59:16PM +0300, Laurent Pinchart wrote:
> On Fri, Sep 12, 2025 at 03:46:27PM +0200, Bartosz Golaszewski wrote:
> > On Fri, Sep 12, 2025 at 3:39 PM Greg Kroah-Hartman wrote:
> > >
> > > I have no objection moving this to the cdev api, BUT given that 'struct
> > > cdev' is embedded everywhere, I don't think it's going to be a simple
> > > task, but rather have to be done one-driver-at-a-time like the patch in
> > > this series does it.
> > 
> > I don't think cdev is the right place for this as user-space keeping a
> > reference to a file-descriptor whose "backend" disappeared is not the
> > only possible problem. We can easily create a use-case of a USB I2C
> > expander being used by some in-kernel consumer and then unplugged.
> > This has nothing to do with the character device. I believe the
> > sub-system level is the right place for this and every driver
> > subsystem would have to integrate it separately, taking its various
> > quirks into account.
> 
> That's why I mentioned in-kernel users previously. Drivers routinely
> acquire resources provided by other drivers, and having a way to revoke
> those is needed.
> 
> It is a different but related problem compared to userspace racing with
> .remove(). Could we solve both using the same backend concepts ?
> Perhaps, time will tell, and if that works nicely, great. But we still
> have lots of drivers exposing character devices to userspace (usually
> through a subsystem-specific API, drivers that create a cdev manually
> are the minority). That problem is in my opinion more urgent than
> handling the removal of in-kernel resources, because it's more common,
> and is easily triggerable by userspace. The good news is that it should
> also be simpler to solve, we should be able to address the enter/exit
> part entirely in cdev, and limit the changes to drivers in .remove() to
> the strict minimum.
> 
> What I'd like to see is if the proposed implementation of revocable
> resources can be used as a building block to fix the cdev issue. If it
> ca, great, let's solve it then. If it can't, that's still fine, it will
> still be useful for in-kernel resources, even if we need a different
> implementation for cdev.

Patch 5/5 in this series does just this for a specific use of a cdev in
the driver.  Is that what you are looking for?

thanks,

greg k-h
Re: [PATCH v3 0/5] platform/chrome: Fix a possible UAF via revocable
Posted by Laurent Pinchart 2 weeks, 6 days ago
(CC'ing Dan Williams)

On Fri, Sep 12, 2025 at 04:19:53PM +0200, Greg KH wrote:
> On Fri, Sep 12, 2025 at 04:59:16PM +0300, Laurent Pinchart wrote:
> > On Fri, Sep 12, 2025 at 03:46:27PM +0200, Bartosz Golaszewski wrote:
> > > On Fri, Sep 12, 2025 at 3:39 PM Greg Kroah-Hartman wrote:
> > > >
> > > > I have no objection moving this to the cdev api, BUT given that 'struct
> > > > cdev' is embedded everywhere, I don't think it's going to be a simple
> > > > task, but rather have to be done one-driver-at-a-time like the patch in
> > > > this series does it.
> > > 
> > > I don't think cdev is the right place for this as user-space keeping a
> > > reference to a file-descriptor whose "backend" disappeared is not the
> > > only possible problem. We can easily create a use-case of a USB I2C
> > > expander being used by some in-kernel consumer and then unplugged.
> > > This has nothing to do with the character device. I believe the
> > > sub-system level is the right place for this and every driver
> > > subsystem would have to integrate it separately, taking its various
> > > quirks into account.
> > 
> > That's why I mentioned in-kernel users previously. Drivers routinely
> > acquire resources provided by other drivers, and having a way to revoke
> > those is needed.
> > 
> > It is a different but related problem compared to userspace racing with
> > .remove(). Could we solve both using the same backend concepts ?
> > Perhaps, time will tell, and if that works nicely, great. But we still
> > have lots of drivers exposing character devices to userspace (usually
> > through a subsystem-specific API, drivers that create a cdev manually
> > are the minority). That problem is in my opinion more urgent than
> > handling the removal of in-kernel resources, because it's more common,
> > and is easily triggerable by userspace. The good news is that it should
> > also be simpler to solve, we should be able to address the enter/exit
> > part entirely in cdev, and limit the changes to drivers in .remove() to
> > the strict minimum.
> > 
> > What I'd like to see is if the proposed implementation of revocable
> > resources can be used as a building block to fix the cdev issue. If it
> > ca, great, let's solve it then. If it can't, that's still fine, it will
> > still be useful for in-kernel resources, even if we need a different
> > implementation for cdev.
> 
> Patch 5/5 in this series does just this for a specific use of a cdev in
> the driver.  Is that what you are looking for?

Not quite, I would like to see the enter/exit (aka revocable scope if my
understanding is correct) being pushed to char_dev.c, as Dan did in [1].
I'm fine having to add an extra function call in the .remove() path of
drivers, but I'm not fine with having to mark revocable sections
manually in drivers. That part belongs to cdev.

[1] https://lore.kernel.org/r/161117153248.2853729.2452425259045172318.stgit@dwillia2-desk3.amr.corp.intel.com

-- 
Regards,

Laurent Pinchart
Re: [PATCH v3 0/5] platform/chrome: Fix a possible UAF via revocable
Posted by Jason Gunthorpe 1 week, 3 days ago
On Fri, Sep 12, 2025 at 05:26:46PM +0300, Laurent Pinchart wrote:
> Not quite, I would like to see the enter/exit (aka revocable scope if my
> understanding is correct) being pushed to char_dev.c, as Dan did in [1].
> I'm fine having to add an extra function call in the .remove() path of
> drivers, but I'm not fine with having to mark revocable sections
> manually in drivers. That part belongs to cdev.

+1 I've open coded something here many times.

The implementation challenge is performance..

The big ask would be to make file_operations replaceable without
races. I can't see a way to do that at the struct file level without
slowing down everything.

Dan's idea to provide a wrapper file_operations that then effectively
calls another set of file_operations under a synchornization is more
reasonable, but the performance cost is now a percpu ref and another
indirect function call for every file operation. It also relies on the
inode which some cdev users end up replacing.

Open coding the lock in the subsystems avoids the indirect function
call, flexible inode, and subsystems can choose their locking
preference (rwsem, srcu, percpu).

As was said later in this thread, it would be a real shame to see
people implement revocable in drivers instead of rely on subsystems to
have sane unregistration semantics where the subsystem guarentees that
no driver callbacks are running after unregister. You never need
driver revocable in a world like that.

Jason
Re: [PATCH v3 0/5] platform/chrome: Fix a possible UAF via revocable
Posted by Danilo Krummrich 1 week, 3 days ago
On Mon Sep 22, 2025 at 5:10 PM CEST, Jason Gunthorpe wrote:
> As was said later in this thread, it would be a real shame to see
> people implement revocable in drivers instead of rely on subsystems to
> have sane unregistration semantics where the subsystem guarentees that
> no driver callbacks are running after unregister. You never need
> driver revocable in a world like that.

I fully agree with that, in C there is indeed no value of a revocable type when
subsystems can guarantee "sane unregistration semantics".

I say "in C" because in C there is no way to get a proof by the compiler that
we're in a scope (e.g. through the subsystem guarentee) where the device is
guaranteed to be bound (which we can in Rust).

So, effectively, we're not getting any value out of the revocable in C in such a
case: In the best case, we're just bypassing the revocable by accessing the
pointer unchecked (regardless whether that's valid or not); in the worst case
we're introducing a useless SRCU read side critical section.

(In Rust the compiler will stop us from accessing the pointer unchecked if we're
not in a scope where unchecked access is valid.)

So, I think in C revocable should be restricted to use-cases where scopes are
unbound by design. DRM device callbacks are an example for that and it's the
reason why things like drm_dev_{enter,exit}() and drm_dev_unplug() exist. In the
end, those are exactly the same as revocable implemented in a slightly different
way.

- Danilo
Re: [PATCH v3 0/5] platform/chrome: Fix a possible UAF via revocable
Posted by Jason Gunthorpe 1 week, 3 days ago
On Mon, Sep 22, 2025 at 05:55:43PM +0200, Danilo Krummrich wrote:
> I fully agree with that, in C there is indeed no value of a revocable type when
> subsystems can guarantee "sane unregistration semantics".

Indeed, I agree with your message. If I look at the ec_cros presented
here, there is no reason for any revocable. In fact it already seems
like an abuse of the idea.

The cros_ec.c spawns a MFD with a platform_device of "cros-ec-chardev"
that is alreay properly lifetime controlled - the platform is already
removed during the remove of the cros_ec.c.

So, there is no need for a revocable that spans two drivers like this!

The bug is that cros_ec_chardev.c doesn't implement itself correctly
and doesn't have a well designed remove() for something that creates a
char dev. This issue should be entirely handled within
cros_ec_chardev.c and not leak out to other files.

1) Using a miscdevice means loosing out on any refcount managed
memory. When using a file you need some per-device memory that lives
for as long as all files are open. So don't use miscdev, the better
pattern is:

struct chardev_data {
	struct device dev;
	struct cdev cdev;

Now you get to have a struct device linked refcount and a free
function. The file can savely hold onto a chardev_data for its
lifetime.

2) Add locking so the file can exist after the driver is removed:

struct chardev_data {
[..]
	struct rw_semaphore sem;
	struct cros_ec_dev *ec_dev;
};

Refcount the chardev_data::dev in the file operations open/release,
refer to it via the private_data.

3) At the start of every fop take the sem and check the ec_dev:

ACQUIRE(rwsem_read, ecdev_sem)(&data->sem);
if (ACQUIRE_ERR(ecdev_sem) || !data->ec_dev)
   return -ENODEV;

4) After unregistering the cdev, but before destroying the device take
the write side of the rwsem and NULL ec_dev.

5) Purge all the devm usage from cros_ec_chardev, the only allocation
is refcounted instead.

Simple. No need for any synchronize_srcu() for such a simple
non-performance oriented driver.

Yes, this can be made better, there is a bit too much boilerplate, but
revocable is not the way for cros_ec.

Jason
Re: [PATCH v3 0/5] platform/chrome: Fix a possible UAF via revocable
Posted by Greg Kroah-Hartman 1 week, 2 days ago
On Mon, Sep 22, 2025 at 02:40:10PM -0300, Jason Gunthorpe wrote:
> On Mon, Sep 22, 2025 at 05:55:43PM +0200, Danilo Krummrich wrote:
> > I fully agree with that, in C there is indeed no value of a revocable type when
> > subsystems can guarantee "sane unregistration semantics".
> 
> Indeed, I agree with your message. If I look at the ec_cros presented
> here, there is no reason for any revocable. In fact it already seems
> like an abuse of the idea.
> 
> The cros_ec.c spawns a MFD with a platform_device of "cros-ec-chardev"
> that is alreay properly lifetime controlled - the platform is already
> removed during the remove of the cros_ec.c.
> 
> So, there is no need for a revocable that spans two drivers like this!

It's a very common pattern, you have a struct device, and a userspace
access, and need to "split" them apart, as you say below.  This logic
here handles that pattern.

See the many talks about this in the past at Plumbers and other
conferences, this isn't a new thing, and it isn't quite as simple as I
think you are making it out to be to solve.

> The bug is that cros_ec_chardev.c doesn't implement itself correctly
> and doesn't have a well designed remove() for something that creates a
> char dev. This issue should be entirely handled within
> cros_ec_chardev.c and not leak out to other files.
> 
> 1) Using a miscdevice means loosing out on any refcount managed
> memory. When using a file you need some per-device memory that lives
> for as long as all files are open. So don't use miscdev, the better
> pattern is:
> 
> struct chardev_data {
> 	struct device dev;
> 	struct cdev cdev;

Woah, no, that is totally wrong.

> Now you get to have a struct device linked refcount and a free
> function. The file can savely hold onto a chardev_data for its
> lifetime.

You have 2 different reference counts for the same structure.  A bug
that should never be done (yes, it's done a lot in the kernel, it's
wrong.)

And really, let's not abuse cdev anymore than it currently is please.
There's a reason that miscdevice returns just a pointer.  Right now cdev
can be used in 3-4 different ways, let's not come up with another way to
abuse that api :)

> 2) Add locking so the file can exist after the driver is removed:
> 
> struct chardev_data {
> [..]
> 	struct rw_semaphore sem;
> 	struct cros_ec_dev *ec_dev;
> };
> 
> Refcount the chardev_data::dev in the file operations open/release,
> refer to it via the private_data.
> 
> 3) At the start of every fop take the sem and check the ec_dev:
> 
> ACQUIRE(rwsem_read, ecdev_sem)(&data->sem);
> if (ACQUIRE_ERR(ecdev_sem) || !data->ec_dev)
>    return -ENODEV;
> 
> 4) After unregistering the cdev, but before destroying the device take
> the write side of the rwsem and NULL ec_dev.
> 
> 5) Purge all the devm usage from cros_ec_chardev, the only allocation
> is refcounted instead.
> 
> Simple. No need for any synchronize_srcu() for such a simple
> non-performance oriented driver.

There is no performance issue here, but there is lifetime rule logic
here that I really really do not want to have to "open code" for every
user.  revokable gives this to us in a simple way that we can "know" is
being used correctly.

And again, you can't have a single structure with two reference counts,
that's not allowed :)

> Yes, this can be made better, there is a bit too much boilerplate, but
> revocable is not the way for cros_ec.

I strongly disagree, sorry.

greg k-h
Re: [PATCH v3 0/5] platform/chrome: Fix a possible UAF via revocable
Posted by Jason Gunthorpe 1 week, 2 days ago
On Mon, Sep 22, 2025 at 08:42:23PM +0200, Greg Kroah-Hartman wrote:
> On Mon, Sep 22, 2025 at 02:40:10PM -0300, Jason Gunthorpe wrote:
> > On Mon, Sep 22, 2025 at 05:55:43PM +0200, Danilo Krummrich wrote:
> > > I fully agree with that, in C there is indeed no value of a revocable type when
> > > subsystems can guarantee "sane unregistration semantics".
> > 
> > Indeed, I agree with your message. If I look at the ec_cros presented
> > here, there is no reason for any revocable. In fact it already seems
> > like an abuse of the idea.
> > 
> > The cros_ec.c spawns a MFD with a platform_device of "cros-ec-chardev"
> > that is alreay properly lifetime controlled - the platform is already
> > removed during the remove of the cros_ec.c.
> > 
> > So, there is no need for a revocable that spans two drivers like this!
> 
> It's a very common pattern, you have a struct device, and a userspace
> access, and need to "split" them apart, as you say below.  This logic
> here handles that pattern.

This is two struct devices, two device drivers and a fops.  There is
no reason to trigger revoke from the parent driver to a child driver,
that's just mis-layering and obfuscation.

It already subtly relies on the parent not triggering revoke until the
child is removed because it added this to the cros-ec-chardev driver:

@@ -166,7 +183,12 @@ static int cros_ec_chardev_open(struct inode *inode, struct file *filp)
	struct cros_ec_dev *ec_dev = dev_get_drvdata(mdev->parent);

 	if (!priv)
 		return -ENOMEM;
 
-	priv->ec_dev = ec_dev;
+	priv->ec_dev_rev = revocable_alloc(ec_dev->revocable_provider);

It would UAF if the revoke is triggered by the parent driver at the
wrong time.

So why is the parent involved at all? Why does it have to violate
modularity?

> See the many talks about this in the past at Plumbers and other
> conferences, this isn't a new thing, and it isn't quite as simple as I
> think you are making it out to be to solve.

There are more complex situations, but this isn't one of them. All
this needs is to fence the file operations of a single cdev. I've
solved it enough times now to know exactly how simple this really is..

 > struct chardev_data {
> > 	struct device dev;
> > 	struct cdev cdev;
> 
> Woah, no, that is totally wrong.

Sigh. I'm sure we've had this exchange before. Please
re-read the documentation for cdev_device_add():

 * This function should be used whenever the struct cdev and the
 * struct device are members of the same structure whose lifetime is
 * managed by the struct device.    ^^^^^^^^^^^^

We created this helper specifically to clean up the refcount bugs
around cdev usage. It supports the above pattern which is the natural
and easy way to use cdev.

> And really, let's not abuse cdev anymore than it currently is please.
> There's a reason that miscdevice returns just a pointer.  Right now cdev
> can be used in 3-4 different ways, let's not come up with another way to
> abuse that api :)

It is true there are many abuses, but converting cdev users to use
cdev_device_add() is the right and best option IMHO.

miscdev is not a good option in cases like this because you don't get
a nice natural kref around the memory, among other limitations.

> There is no performance issue here, but there is lifetime rule logic
> here that I really really do not want to have to "open code" for every
> user.  revokable gives this to us in a simple way that we can "know" is
> being used correctly.

I strongly prefer Laurent's direction to add a helper file_operations
that automatically wraps all ops in a lock.

I think all this series does is create a weirdly named lock that
drivers still have to open code.

The fact the very first proposed user is already abusing it to
needlessly violate driver modularity is really depressing. Let's not
create another devm :\

Jason
Re: [PATCH v3 0/5] platform/chrome: Fix a possible UAF via revocable
Posted by Greg Kroah-Hartman 2 weeks, 6 days ago
On Fri, Sep 12, 2025 at 05:26:46PM +0300, Laurent Pinchart wrote:
> (CC'ing Dan Williams)
> 
> On Fri, Sep 12, 2025 at 04:19:53PM +0200, Greg KH wrote:
> > On Fri, Sep 12, 2025 at 04:59:16PM +0300, Laurent Pinchart wrote:
> > > On Fri, Sep 12, 2025 at 03:46:27PM +0200, Bartosz Golaszewski wrote:
> > > > On Fri, Sep 12, 2025 at 3:39 PM Greg Kroah-Hartman wrote:
> > > > >
> > > > > I have no objection moving this to the cdev api, BUT given that 'struct
> > > > > cdev' is embedded everywhere, I don't think it's going to be a simple
> > > > > task, but rather have to be done one-driver-at-a-time like the patch in
> > > > > this series does it.
> > > > 
> > > > I don't think cdev is the right place for this as user-space keeping a
> > > > reference to a file-descriptor whose "backend" disappeared is not the
> > > > only possible problem. We can easily create a use-case of a USB I2C
> > > > expander being used by some in-kernel consumer and then unplugged.
> > > > This has nothing to do with the character device. I believe the
> > > > sub-system level is the right place for this and every driver
> > > > subsystem would have to integrate it separately, taking its various
> > > > quirks into account.
> > > 
> > > That's why I mentioned in-kernel users previously. Drivers routinely
> > > acquire resources provided by other drivers, and having a way to revoke
> > > those is needed.
> > > 
> > > It is a different but related problem compared to userspace racing with
> > > .remove(). Could we solve both using the same backend concepts ?
> > > Perhaps, time will tell, and if that works nicely, great. But we still
> > > have lots of drivers exposing character devices to userspace (usually
> > > through a subsystem-specific API, drivers that create a cdev manually
> > > are the minority). That problem is in my opinion more urgent than
> > > handling the removal of in-kernel resources, because it's more common,
> > > and is easily triggerable by userspace. The good news is that it should
> > > also be simpler to solve, we should be able to address the enter/exit
> > > part entirely in cdev, and limit the changes to drivers in .remove() to
> > > the strict minimum.
> > > 
> > > What I'd like to see is if the proposed implementation of revocable
> > > resources can be used as a building block to fix the cdev issue. If it
> > > ca, great, let's solve it then. If it can't, that's still fine, it will
> > > still be useful for in-kernel resources, even if we need a different
> > > implementation for cdev.
> > 
> > Patch 5/5 in this series does just this for a specific use of a cdev in
> > the driver.  Is that what you are looking for?
> 
> Not quite, I would like to see the enter/exit (aka revocable scope if my
> understanding is correct) being pushed to char_dev.c, as Dan did in [1].
> I'm fine having to add an extra function call in the .remove() path of
> drivers, but I'm not fine with having to mark revocable sections
> manually in drivers. That part belongs to cdev.
> 
> [1] https://lore.kernel.org/r/161117153248.2853729.2452425259045172318.stgit@dwillia2-desk3.amr.corp.intel.com

Dan's proposal here is a good start, but the "sleep in cdev_del() until
the device drains all existing opens" is going to not really work well
for what we want.

So sure, make a new cdev api to use this, that's fine, then we will have
what, 5 different ways to use a cdev?  :)

Seriously, that would be good, then we can work to convert things over,
but I think overall it will look much the same as what patch 5/5 does
here.  But details matter, I don't really known for sure...

Either way, I think this patch series stands on its own, it doesn't
require cdev to implement it, drivers can use it to wrap a cdev if they
want to.  We have other structures that want to do this type of thing
today as is proof with the rust implementation for the devm api.

thanks,

greg k-h
Re: [PATCH v3 0/5] platform/chrome: Fix a possible UAF via revocable
Posted by Laurent Pinchart 2 weeks, 6 days ago
On Fri, Sep 12, 2025 at 04:40:27PM +0200, Greg KH wrote:
> On Fri, Sep 12, 2025 at 05:26:46PM +0300, Laurent Pinchart wrote:
> > (CC'ing Dan Williams)
> > 
> > On Fri, Sep 12, 2025 at 04:19:53PM +0200, Greg KH wrote:
> > > On Fri, Sep 12, 2025 at 04:59:16PM +0300, Laurent Pinchart wrote:
> > > > On Fri, Sep 12, 2025 at 03:46:27PM +0200, Bartosz Golaszewski wrote:
> > > > > On Fri, Sep 12, 2025 at 3:39 PM Greg Kroah-Hartman wrote:
> > > > > >
> > > > > > I have no objection moving this to the cdev api, BUT given that 'struct
> > > > > > cdev' is embedded everywhere, I don't think it's going to be a simple
> > > > > > task, but rather have to be done one-driver-at-a-time like the patch in
> > > > > > this series does it.
> > > > > 
> > > > > I don't think cdev is the right place for this as user-space keeping a
> > > > > reference to a file-descriptor whose "backend" disappeared is not the
> > > > > only possible problem. We can easily create a use-case of a USB I2C
> > > > > expander being used by some in-kernel consumer and then unplugged.
> > > > > This has nothing to do with the character device. I believe the
> > > > > sub-system level is the right place for this and every driver
> > > > > subsystem would have to integrate it separately, taking its various
> > > > > quirks into account.
> > > > 
> > > > That's why I mentioned in-kernel users previously. Drivers routinely
> > > > acquire resources provided by other drivers, and having a way to revoke
> > > > those is needed.
> > > > 
> > > > It is a different but related problem compared to userspace racing with
> > > > .remove(). Could we solve both using the same backend concepts ?
> > > > Perhaps, time will tell, and if that works nicely, great. But we still
> > > > have lots of drivers exposing character devices to userspace (usually
> > > > through a subsystem-specific API, drivers that create a cdev manually
> > > > are the minority). That problem is in my opinion more urgent than
> > > > handling the removal of in-kernel resources, because it's more common,
> > > > and is easily triggerable by userspace. The good news is that it should
> > > > also be simpler to solve, we should be able to address the enter/exit
> > > > part entirely in cdev, and limit the changes to drivers in .remove() to
> > > > the strict minimum.
> > > > 
> > > > What I'd like to see is if the proposed implementation of revocable
> > > > resources can be used as a building block to fix the cdev issue. If it
> > > > ca, great, let's solve it then. If it can't, that's still fine, it will
> > > > still be useful for in-kernel resources, even if we need a different
> > > > implementation for cdev.
> > > 
> > > Patch 5/5 in this series does just this for a specific use of a cdev in
> > > the driver.  Is that what you are looking for?
> > 
> > Not quite, I would like to see the enter/exit (aka revocable scope if my
> > understanding is correct) being pushed to char_dev.c, as Dan did in [1].
> > I'm fine having to add an extra function call in the .remove() path of
> > drivers, but I'm not fine with having to mark revocable sections
> > manually in drivers. That part belongs to cdev.
> > 
> > [1] https://lore.kernel.org/r/161117153248.2853729.2452425259045172318.stgit@dwillia2-desk3.amr.corp.intel.com
> 
> Dan's proposal here is a good start, but the "sleep in cdev_del() until
> the device drains all existing opens" is going to not really work well
> for what we want.

I think you missed the fact that the code doesn't wait for all open file
handles to be closed. It waits for all in-progress syscalls to return
from the driver. That's a big difference.

> So sure, make a new cdev api to use this, that's fine, then we will have
> what, 5 different ways to use a cdev?  :)
> 
> Seriously, that would be good, then we can work to convert things over,
> but I think overall it will look much the same as what patch 5/5 does
> here.  But details matter, I don't really known for sure...

What I don't want to see is drivers using this new API directly to
protect the cdev race. That would be a big step in the wrong direction.
Patch 5/5 needs to move driver code to the cdev level. That shouldn't be
difficult, so I really not see why it can't be done in v4 to see how it
will look like.

> Either way, I think this patch series stands on its own, it doesn't
> require cdev to implement it, drivers can use it to wrap a cdev if they
> want to.

No, drivers should *not* do that manually. That's a recipe for disaster
that we would regret later.

> We have other structures that want to do this type of thing
> today as is proof with the rust implementation for the devm api.

-- 
Regards,

Laurent Pinchart
Re: [PATCH v3 0/5] platform/chrome: Fix a possible UAF via revocable
Posted by Bartosz Golaszewski 2 weeks, 6 days ago
On Fri, Sep 12, 2025 at 4:40 PM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> Dan's proposal here is a good start, but the "sleep in cdev_del() until
> the device drains all existing opens" is going to not really work well
> for what we want.
>
> So sure, make a new cdev api to use this, that's fine, then we will have
> what, 5 different ways to use a cdev?  :)
>
> Seriously, that would be good, then we can work to convert things over,
> but I think overall it will look much the same as what patch 5/5 does
> here.  But details matter, I don't really known for sure...
>
> Either way, I think this patch series stands on its own, it doesn't
> require cdev to implement it, drivers can use it to wrap a cdev if they
> want to.  We have other structures that want to do this type of thing
> today as is proof with the rust implementation for the devm api.
>

Yeah, I'm not against this going upstream. If more development is
needed for this to be usable in other parts of the kernel, that can be
done gradually. Literally no subsystem ever was perfect on day 1.

Tzung-Bi: I'm not sure if you did submit anything but I'd love to see
this discussed during Linux Plumbers in Tokyo, it's the perfect fit
for the kernel summit.

Bartosz
Re: [PATCH v3 0/5] platform/chrome: Fix a possible UAF via revocable
Posted by Laurent Pinchart 2 weeks, 6 days ago
On Fri, Sep 12, 2025 at 04:44:56PM +0200, Bartosz Golaszewski wrote:
> On Fri, Sep 12, 2025 at 4:40 PM Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> >
> > Dan's proposal here is a good start, but the "sleep in cdev_del() until
> > the device drains all existing opens" is going to not really work well
> > for what we want.
> >
> > So sure, make a new cdev api to use this, that's fine, then we will have
> > what, 5 different ways to use a cdev?  :)
> >
> > Seriously, that would be good, then we can work to convert things over,
> > but I think overall it will look much the same as what patch 5/5 does
> > here.  But details matter, I don't really known for sure...
> >
> > Either way, I think this patch series stands on its own, it doesn't
> > require cdev to implement it, drivers can use it to wrap a cdev if they
> > want to.  We have other structures that want to do this type of thing
> > today as is proof with the rust implementation for the devm api.
> 
> Yeah, I'm not against this going upstream. If more development is
> needed for this to be usable in other parts of the kernel, that can be
> done gradually. Literally no subsystem ever was perfect on day 1.

To be clear, I'm not against the API being merged for the use cases that
would benefit from it, but I don't want to see drivers using it to
protect from the cdev/unregistration race.

> Tzung-Bi: I'm not sure if you did submit anything but I'd love to see
> this discussed during Linux Plumbers in Tokyo, it's the perfect fit
> for the kernel summit.

-- 
Regards,

Laurent Pinchart
Re: [PATCH v3 0/5] platform/chrome: Fix a possible UAF via revocable
Posted by Tzung-Bi Shih 2 weeks, 5 days ago
On Fri, Sep 12, 2025 at 05:54:16PM +0300, Laurent Pinchart wrote:
> On Fri, Sep 12, 2025 at 04:44:56PM +0200, Bartosz Golaszewski wrote:
> > On Fri, Sep 12, 2025 at 4:40 PM Greg Kroah-Hartman
> > <gregkh@linuxfoundation.org> wrote:
> > >
> > > Dan's proposal here is a good start, but the "sleep in cdev_del() until
> > > the device drains all existing opens" is going to not really work well
> > > for what we want.
> > >
> > > So sure, make a new cdev api to use this, that's fine, then we will have
> > > what, 5 different ways to use a cdev?  :)
> > >
> > > Seriously, that would be good, then we can work to convert things over,
> > > but I think overall it will look much the same as what patch 5/5 does
> > > here.  But details matter, I don't really known for sure...
> > >
> > > Either way, I think this patch series stands on its own, it doesn't
> > > require cdev to implement it, drivers can use it to wrap a cdev if they
> > > want to.  We have other structures that want to do this type of thing
> > > today as is proof with the rust implementation for the devm api.
> > 
> > Yeah, I'm not against this going upstream. If more development is
> > needed for this to be usable in other parts of the kernel, that can be
> > done gradually. Literally no subsystem ever was perfect on day 1.
> 
> To be clear, I'm not against the API being merged for the use cases that
> would benefit from it, but I don't want to see drivers using it to
> protect from the cdev/unregistration race.

Based on the discussion thread, my main takeaways are:

- Current `revocable` is considered a low level API.  We shouldn't (and
  likely can't) stop drivers, like the one in patch 5/5 in the series,
  from using it directly to fix UAFs.

- Subsystems (like cdev) should build on this API to provide an easier
  interface for their drivers to manage revocable resources.

I'll create a PoC based on this.

> > Tzung-Bi: I'm not sure if you did submit anything but I'd love to see
> > this discussed during Linux Plumbers in Tokyo, it's the perfect fit
> > for the kernel summit.

Yes, and I just realized that in addition to the website submission, a
separate email is also required (or at least encouraged).  I've just sent
that email and am hoping it's not too late.
Re: [PATCH v3 0/5] platform/chrome: Fix a possible UAF via revocable
Posted by Laurent Pinchart 2 weeks, 5 days ago
On Sat, Sep 13, 2025 at 11:55:45PM +0800, Tzung-Bi Shih wrote:
> On Fri, Sep 12, 2025 at 05:54:16PM +0300, Laurent Pinchart wrote:
> > On Fri, Sep 12, 2025 at 04:44:56PM +0200, Bartosz Golaszewski wrote:
> > > On Fri, Sep 12, 2025 at 4:40 PM Greg Kroah-Hartman wrote:
> > > >
> > > > Dan's proposal here is a good start, but the "sleep in cdev_del() until
> > > > the device drains all existing opens" is going to not really work well
> > > > for what we want.
> > > >
> > > > So sure, make a new cdev api to use this, that's fine, then we will have
> > > > what, 5 different ways to use a cdev?  :)
> > > >
> > > > Seriously, that would be good, then we can work to convert things over,
> > > > but I think overall it will look much the same as what patch 5/5 does
> > > > here.  But details matter, I don't really known for sure...
> > > >
> > > > Either way, I think this patch series stands on its own, it doesn't
> > > > require cdev to implement it, drivers can use it to wrap a cdev if they
> > > > want to.  We have other structures that want to do this type of thing
> > > > today as is proof with the rust implementation for the devm api.
> > > 
> > > Yeah, I'm not against this going upstream. If more development is
> > > needed for this to be usable in other parts of the kernel, that can be
> > > done gradually. Literally no subsystem ever was perfect on day 1.
> > 
> > To be clear, I'm not against the API being merged for the use cases that
> > would benefit from it, but I don't want to see drivers using it to
> > protect from the cdev/unregistration race.
> 
> Based on the discussion thread, my main takeaways are:
> 
> - Current `revocable` is considered a low level API.  We shouldn't (and
>   likely can't) stop drivers, like the one in patch 5/5 in the series,
>   from using it directly to fix UAFs.

Why shouldn't we ? We have enough precedents where driver authors rushed
to adopt brand new APIs without understand the implications.
devm_kzalloc() is a prime example of a small new API that very quickly
got misused everywhere. If we had taken the time to clearly explain when
it should be used and when it should *not* be used, we wouldn't be
plagued by as many device removal race conditions today. Let's not
repeat the same mistake, I'd like this new API to make things better,
not worse.

> - Subsystems (like cdev) should build on this API to provide an easier
>   interface for their drivers to manage revocable resources.
> 
> I'll create a PoC based on this.

I'm looking forward to that. Please let me know if there's anything you
would like to discuss. I didn't dive deep in technical details in this
thread, and I don't expect anyone to guess what I have in mind if I
failed to express it :-) I'm very confident the cdev race condition can
be fixed in a neat way, so let's do that.

> > > Tzung-Bi: I'm not sure if you did submit anything but I'd love to see
> > > this discussed during Linux Plumbers in Tokyo, it's the perfect fit
> > > for the kernel summit.
> 
> Yes, and I just realized that in addition to the website submission, a
> separate email is also required (or at least encouraged).  I've just sent
> that email and am hoping it's not too late.

-- 
Regards,

Laurent Pinchart
Re: [PATCH v3 0/5] platform/chrome: Fix a possible UAF via revocable
Posted by Tzung-Bi Shih 1 week, 2 days ago
On Sat, Sep 13, 2025 at 07:14:13PM +0300, Laurent Pinchart wrote:
> On Sat, Sep 13, 2025 at 11:55:45PM +0800, Tzung-Bi Shih wrote:
> > - Subsystems (like cdev) should build on this API to provide an easier
> >   interface for their drivers to manage revocable resources.
> > 
> > I'll create a PoC based on this.
> 
> I'm looking forward to that. Please let me know if there's anything you
> would like to discuss. I didn't dive deep in technical details in this
> thread, and I don't expect anyone to guess what I have in mind if I
> failed to express it :-) I'm very confident the cdev race condition can
> be fixed in a neat way, so let's do that.

Even though I think this isn't what you are looking for originally, please
take a look on the PoC attempt (5th - 7th patches) in [1].  Unlike [2], the
PoC allows fops to be exuected and defers to the driver to decide what to do
if the resource is unavailable.

[1] https://lore.kernel.org/chrome-platform/20250923075302.591026-1-tzungbi@kernel.org
[2] https://lore.kernel.org/all/161117153776.2853729.6944617921517514510.stgit@dwillia2-desk3.amr.corp.intel.com/
Re: [PATCH v3 0/5] platform/chrome: Fix a possible UAF via revocable
Posted by Danilo Krummrich 2 weeks, 6 days ago
On 9/12/25 4:54 PM, Laurent Pinchart wrote:
> On Fri, Sep 12, 2025 at 04:44:56PM +0200, Bartosz Golaszewski wrote:
>> On Fri, Sep 12, 2025 at 4:40 PM Greg Kroah-Hartman
>> <gregkh@linuxfoundation.org> wrote:
>>> Either way, I think this patch series stands on its own, it doesn't
>>> require cdev to implement it, drivers can use it to wrap a cdev if they
>>> want to.  We have other structures that want to do this type of thing
>>> today as is proof with the rust implementation for the devm api.
>>
>> Yeah, I'm not against this going upstream. If more development is
>> needed for this to be usable in other parts of the kernel, that can be
>> done gradually. Literally no subsystem ever was perfect on day 1.
> 
> To be clear, I'm not against the API being merged for the use cases that
> would benefit from it, but I don't want to see drivers using it to
> protect from the cdev/unregistration race.

I mean, revocable is really a synchronization primitive in the end that
"revokes" access to some resource in a race free way.

So, technically, it probably belongs into lib/.

I think the reason it ended up in drivers/base/ is that one common use case is
to revoke a device resource from a driver when the device is unbound from this
driver; or in other words devres is an obvious user.

So, I think that any other API (cdev, devres, etc.) should  be built on top of it.

This is also what we do in Rust, Revocable is just a common synchronization
primitive and the (only) user it has is currently Devres building on top of it.
Re: [PATCH v3 0/5] platform/chrome: Fix a possible UAF via revocable
Posted by Laurent Pinchart 2 weeks, 5 days ago
On Fri, Sep 12, 2025 at 06:22:48PM +0200, Danilo Krummrich wrote:
> On 9/12/25 4:54 PM, Laurent Pinchart wrote:
> > On Fri, Sep 12, 2025 at 04:44:56PM +0200, Bartosz Golaszewski wrote:
> >> On Fri, Sep 12, 2025 at 4:40 PM Greg Kroah-Hartman wrote:
> >>> Either way, I think this patch series stands on its own, it doesn't
> >>> require cdev to implement it, drivers can use it to wrap a cdev if they
> >>> want to.  We have other structures that want to do this type of thing
> >>> today as is proof with the rust implementation for the devm api.
> >>
> >> Yeah, I'm not against this going upstream. If more development is
> >> needed for this to be usable in other parts of the kernel, that can be
> >> done gradually. Literally no subsystem ever was perfect on day 1.
> > 
> > To be clear, I'm not against the API being merged for the use cases that
> > would benefit from it, but I don't want to see drivers using it to
> > protect from the cdev/unregistration race.
> 
> I mean, revocable is really a synchronization primitive in the end that
> "revokes" access to some resource in a race free way.
> 
> So, technically, it probably belongs into lib/.
> 
> I think the reason it ended up in drivers/base/ is that one common use case is
> to revoke a device resource from a driver when the device is unbound from this
> driver; or in other words devres is an obvious user.
> 
> So, I think that any other API (cdev, devres, etc.) should  be built on top of it.

No issue with that. I'm sure there are people who have better knowledge
than me when it comes to implementing the low-level primitive in the
most efficient way. What I have lots of experience with is the impact of
API design on drivers, and the API misuse (including through cargo-cult
programming) this can generate. Let's design the API towards drivers
correctly.

> This is also what we do in Rust, Revocable is just a common synchronization
> primitive and the (only) user it has is currently Devres building on top of it.

-- 
Regards,

Laurent Pinchart
Re: [PATCH v3 0/5] platform/chrome: Fix a possible UAF via revocable
Posted by Laurent Pinchart 2 weeks, 6 days ago
On Fri, Sep 12, 2025 at 03:39:45PM +0200, Greg KH wrote:
> On Fri, Sep 12, 2025 at 04:26:56PM +0300, Laurent Pinchart wrote:
> > On Fri, Sep 12, 2025 at 08:49:30PM +0800, Tzung-Bi Shih wrote:
> > > On Fri, Sep 12, 2025 at 11:24:10AM +0200, Bartosz Golaszewski wrote:
> > > > On Fri, 12 Sept 2025 at 11:09, Krzysztof Kozlowski wrote:
> > > > > On 12/09/2025 10:17, Tzung-Bi Shih wrote:
> > > > > > Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > > > > > Cc: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> > > > > > Cc: Wolfram Sang <wsa+renesas@sang-engineering.com>
> > > > >
> > > > > Thanks for the work. Just a note, please start using b4, so above Cc
> > > > > will be propagated to all patches. Folks above received only the cover
> > > > > letter...
> > > 
> > > Thank you for bringing this to my attention.  I wasn't aware of that and
> > > will ensure this is handled correctly in the future.
> > > 
> > > > Thanks to Krzysztof for making me aware of this. Could you please Cc
> > > > my brgl@bgdev.pl address on the next iteration.
> > > 
> > > Sure, will do.
> > > 
> > > > I haven't looked into the details yet but the small size of the first
> > > > patch strikes me as odd. The similar changes I did for GPIO were quite
> > > > big and they were designed just for a single sub-system.
> > > > 
> > > > During the talk you reference, after I suggested a library like this,
> > > > Greg KH can be heard saying: do this for two big subsystems so that
> > > > you're sure it's a generic solution. Here you're only using it in a
> > > > single driver which makes me wonder if we can actually use it to
> > > > improve bigger offenders, like for example I2C, or even replace the
> > > > custom, SRCU-based solution in GPIO we have now. Have you considered
> > > > at least doing a PoC in a wider kernel framework?
> > > 
> > > Yes, I'm happy to take this on.
> > > 
> > > To help me get started, could you please point me to some relevant code
> > > locations?  Also, could you let me know if any specific physical devices
> > > will be needed for testing?
> > 
> > One interesting test would be to move the logic to the cdev layer. The
> > use-after-free problem isn't specific to one type of character device,
> > and so shouldn't require a fix in every driver instantiating a cdev
> > directly (or indirectly). See [1] for a previous attempt to handle this
> > at the V4L2 level and [2] for an attempt to handle it at the cdev level.
> > 
> > In [1], two new functions named video_device_enter() and
> > video_device_exit() flag the beginning and end of protected code
> > sections. The equivalent in [2] is the manual get/put of cdev->qactive,
> > and if I understand things correctly, your series creates a REVOCABLE()
> > macro to do the same. I'm sure we'll bikesheed about names at some
> > point, but for the time being, what I'd like to see if this being done
> > in fs/char_dev.c to cover all entry points from userspace at the cdev
> > level.
> > 
> > We then have video_device_unplug() in [1], which I think is more or less
> > the equivalent of revocable_provider_free(). I don't think we'll be able
> > to hide this completely from drivers, at least not in all cases. We
> > should however design the API to make it easy for drivers, likely with
> > subsystem-specific wrappers.
> > 
> > What I have in mind is roughly the following:
> > 
> > 1. Protect all access to the cdev from userspace with enter/exit calls
> >    that flag if a call is in progress. This can be done with explicit
> >    function calls, or with a scope guard as in your series.
> > 
> > 2. At .remove() time, start by flagging that the device is being
> >    removed. That has to be an explicit call from drivers I believe,
> >    likely using subsystem-specific wrappers to simplify things.
> > 
> > 3. Once the device is marked as being removed, all enter() calls should
> >    fail at the cdev level.
> > 
> > 4. In .remove(), proceed to perform driver-specific operations that will
> >    stop the device and wake up any userspace task blocked on a syscall
> >    protected by enter()/remove(). This isn't needed for
> >    drivers/subsystems that don't provide any blocking API, but is
> >    required otherwise.
> > 
> > 5. Unregister, still in .remove(), the cdev (likely through
> >    subsystem-specific APIs in most cases). This should block until all
> >    protected sections have exited.
> > 
> > 6. The cdev is now unregistered, can't be opened anymore, and any
> >    new syscall on any opened file handle will return an error. The
> >    driver's .remove() function can proceed to free data, there won't be
> >    any UAF caused by userspace.
> > 
> > [1] implemented this fairly naively with flags and spinlocks. An
> > RCU-based implementation is probably more efficient, even if I don't
> > know how performance-sensitive all this is.
> > 
> > Does this align with your design, and do you think you could give a try
> > at pushing revocable resource handling to the cdev level ?
> > 
> > On a separate note, I'm not sure "revocable" is the right name here. I
> > believe a revocable resource API is needed, and well-named, for
> > in-kernel consumers (e.g. drivers consuming a GPIO or clock). For the
> > userspace syscalls racing with .remove(), I don't think we're dealing
> > with "revocable resources". Now, if a "revocable resources" API were to
> > support the in-kernel users, and be usable as a building block to fix
> > the cdev issue, I would have nothing against it, but the "revocable"
> > name should be internal in that case, used in the cdev layer only, and
> > not exposed to drivers (or even subsystem helpers that should wrap cdev
> > functions instead).
> 
> I think the name makes sense as it matches up with how things are
> working (the backend structure is "revoked"), but naming is tough :)
> 
> I have no objection moving this to the cdev api, BUT given that 'struct
> cdev' is embedded everywhere, I don't think it's going to be a simple
> task, but rather have to be done one-driver-at-a-time like the patch in
> this series does it.

For the .remove() code paths, yes, I expect driver changes. We need
subsystem-level helpers that will make those easy and hide the
complexity. For the code paths from userspace into the drivers through
cdev file operations, there should be no driver change required.

> And that's fine, we have "interns" that we can set loose on this type of
> code conversions, I think we just need to wrap the access to the cdev
> with this api, which will take a bit of rewriting in many drivers.
> 
> Anyway, just my thought, if someone else can see how this could drop
> into the core cdev code without any changes needed, that would be great,
> but I don't see it at the moment.  cdev is just too "raw" for that.

-- 
Regards,

Laurent Pinchart
Re: [PATCH v3 0/5] platform/chrome: Fix a possible UAF via revocable
Posted by Greg Kroah-Hartman 2 weeks, 6 days ago
On Fri, Sep 12, 2025 at 08:17:12AM +0000, Tzung-Bi Shih wrote:
> This is a follow-up series of [1].  It tries to fix a possible UAF in the
> fops of cros_ec_chardev after the underlying protocol device has gone by
> using revocable.
> 
> The 1st patch introduces the revocable which is an implementation of ideas
> from the talk [2].
> 
> The 2nd and 3rd patches add test cases for revocable in Kunit and selftest.
> 
> The 4th patch converts existing protocol devices to resource providers
> of cros_ec_device.
> 
> The 5th patch converts cros_ec_chardev to a resource consumer of
> cros_ec_device to fix the UAF.
> 
> [1] https://lore.kernel.org/chrome-platform/20250721044456.2736300-6-tzungbi@kernel.org/
> [2] https://lpc.events/event/17/contributions/1627/
> 
> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> Cc: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> Cc: Wolfram Sang <wsa+renesas@sang-engineering.com>

This is, frankly, wonderful work.  Thanks so much for doing this, it's
what many of us have been wanting to see for a very long time but none
of us got around to actually doing it.

And it has tests!  And documentation!  Couldn't ask for more.

We can bikeshed about the REVOCABLE() macro name, but frankly, you wrote
it, you get to pick it :)

Laurent, Bartosz, Wolfram, any objection to this series?  I think this
addresses the issues that all of you have been raising for years with
our access of pointers that have different lifecycles from other
structures (i.e. struct cdev from struct device).

Also, Danilo, if you get the chance, can you give this a review as well?
At first glance it looks good to me, but as you wrote the Rust
implementation of this feature, a second pair of eyes would be great to
have if you have the time.

thanks,

greg k-h
Re: [PATCH v3 0/5] platform/chrome: Fix a possible UAF via revocable
Posted by Laurent Pinchart 2 weeks, 6 days ago
On Fri, Sep 12, 2025 at 10:30:45AM +0200, Greg KH wrote:
> On Fri, Sep 12, 2025 at 08:17:12AM +0000, Tzung-Bi Shih wrote:
> > This is a follow-up series of [1].  It tries to fix a possible UAF in the
> > fops of cros_ec_chardev after the underlying protocol device has gone by
> > using revocable.
> > 
> > The 1st patch introduces the revocable which is an implementation of ideas
> > from the talk [2].
> > 
> > The 2nd and 3rd patches add test cases for revocable in Kunit and selftest.
> > 
> > The 4th patch converts existing protocol devices to resource providers
> > of cros_ec_device.
> > 
> > The 5th patch converts cros_ec_chardev to a resource consumer of
> > cros_ec_device to fix the UAF.
> > 
> > [1] https://lore.kernel.org/chrome-platform/20250721044456.2736300-6-tzungbi@kernel.org/
> > [2] https://lpc.events/event/17/contributions/1627/
> > 
> > Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > Cc: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> > Cc: Wolfram Sang <wsa+renesas@sang-engineering.com>
> 
> This is, frankly, wonderful work.  Thanks so much for doing this, it's
> what many of us have been wanting to see for a very long time but none
> of us got around to actually doing it.
> 
> And it has tests!  And documentation!  Couldn't ask for more.
> 
> We can bikeshed about the REVOCABLE() macro name, but frankly, you wrote
> it, you get to pick it :)
> 
> Laurent, Bartosz, Wolfram, any objection to this series?  I think this
> addresses the issues that all of you have been raising for years with
> our access of pointers that have different lifecycles from other
> structures (i.e. struct cdev from struct device).

I'll check this either later today or over the weekend.

> Also, Danilo, if you get the chance, can you give this a review as well?
> At first glance it looks good to me, but as you wrote the Rust
> implementation of this feature, a second pair of eyes would be great to
> have if you have the time.

-- 
Regards,

Laurent Pinchart
Re: [PATCH v3 0/5] platform/chrome: Fix a possible UAF via revocable
Posted by Danilo Krummrich 2 weeks, 6 days ago
On 9/12/25 10:30 AM, Greg Kroah-Hartman wrote:
> Also, Danilo, if you get the chance, can you give this a review as well?
> At first glance it looks good to me, but as you wrote the Rust
> implementation of this feature, a second pair of eyes would be great to
> have if you have the time.

Sure, I will follow up on this later today.