[PATCH v2 0/4] drm/tyr: implement GPU reset API

Onur Özkan posted 4 patches 1 month, 3 weeks ago
Only 3 patches received!
[PATCH v2 0/4] drm/tyr: implement GPU reset API
Posted by Onur Özkan 1 month, 3 weeks ago
This series adds GPU reset handling support for Tyr in a new module,
drivers/gpu/drm/tyr/reset.rs, which encapsulates the low-level reset
controller internals and exposes a ResetHandle API to the driver.

This series is based on Alice's "Creation of workqueues in Rust" [1]
series.

Changes since v1:
  - Removed OrderedQueue; the series now uses Alice's workqueue implementation [1] instead.
  - Added a Resettable trait with pre_reset and post_reset hooks, to be implemented by
    reset-managed hardware.
  - Added an SRCU abstraction and used it to synchronize the reset work with hardware access.

Three important points:
  - No hardware uses this API yet.
  - On post_reset() failure, we don't do anything for now. We should unplug the GPU (that's
    what Panthor does), but we don't have the infrastructure for that yet (see [2]).
  - schedule() should have a PM check, as panthor_device_schedule_reset() does, but as with
    the point above, we don't have the infrastructure for that yet.

Link: https://lore.kernel.org/all/20260312-create-workqueue-v4-0-ea39c351c38f@google.com/ [1]
Link: https://gitlab.freedesktop.org/panfrost/linux/-/work_items/29#note_3391826 [2]
Link: https://gitlab.freedesktop.org/panfrost/linux/-/issues/28
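
For readers who want a concrete picture of the shape described above, here is a
minimal userspace sketch. It is not the code from this series. The trait and
hook names follow the cover letter, but the signatures, ResetController,
MockGpu and run_demo are assumptions made for illustration. A std RwLock is
swapped in for SRCU: hardware users take the read side and the reset path takes
the write side. That models only the "reset waits for in-flight users" part and
is not how SRCU itself behaves.

```rust
use std::sync::{Mutex, RwLock};

/// Hypothetical stand-in for the series' `Resettable` trait: hooks that a
/// reset-managed piece of hardware implements. Signatures are assumed.
trait Resettable {
    /// Quiesce the hardware before the reset is issued.
    fn pre_reset(&self) -> Result<(), String>;
    /// Bring the hardware back up after the reset.
    fn post_reset(&self) -> Result<(), String>;
}

/// Hypothetical reset controller. ASSUMPTION: the `RwLock` replaces SRCU.
struct ResetController<T: Resettable> {
    hw: T,
    gate: RwLock<()>,
}

impl<T: Resettable> ResetController<T> {
    fn new(hw: T) -> Self {
        Self { hw, gate: RwLock::new(()) }
    }

    /// Read side: run `f` against the hardware while no reset is in flight.
    fn with_hw<R>(&self, f: impl FnOnce(&T) -> R) -> R {
        let _reader = self.gate.read().unwrap();
        f(&self.hw)
    }

    /// Write side: wait for in-flight users to finish, then run the hooks.
    fn reset(&self) -> Result<(), String> {
        let _writer = self.gate.write().unwrap();
        self.hw.pre_reset()?;
        // A real driver would issue the hardware reset here.
        self.hw.post_reset()
    }
}

/// Mock hardware that records the order of the calls it sees.
struct MockGpu {
    log: Mutex<Vec<&'static str>>,
}

impl MockGpu {
    fn submit(&self) {
        self.log.lock().unwrap().push("submit");
    }
}

impl Resettable for MockGpu {
    fn pre_reset(&self) -> Result<(), String> {
        self.log.lock().unwrap().push("pre_reset");
        Ok(())
    }

    fn post_reset(&self) -> Result<(), String> {
        self.log.lock().unwrap().push("post_reset");
        Ok(())
    }
}

/// Drive one submit / reset / submit cycle and return the recorded order.
fn run_demo() -> Vec<&'static str> {
    let ctrl = ResetController::new(MockGpu { log: Mutex::new(Vec::new()) });
    ctrl.with_hw(|gpu| gpu.submit());
    ctrl.reset().expect("mock hooks never fail");
    ctrl.with_hw(|gpu| gpu.submit());
    ctrl.with_hw(|gpu| gpu.log.lock().unwrap().clone())
}

fn main() {
    println!("{:?}", run_demo());
}
```

In this sketch a post_reset() error is simply returned to the caller, which
mirrors the "we don't do anything for now" point above.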

Onur Özkan (4):
  rust: add SRCU abstraction
  MAINTAINERS: add Rust SRCU files to SRCU entry
  rust: add Work::disable_sync
  drm/tyr: add reset management API

 MAINTAINERS                          |   3 +
 drivers/gpu/drm/tyr/driver.rs        |  40 +---
 drivers/gpu/drm/tyr/reset.rs         | 293 +++++++++++++++++++++++++++
 drivers/gpu/drm/tyr/reset/hw_gate.rs | 155 ++++++++++++++
 drivers/gpu/drm/tyr/tyr.rs           |   1 +
 rust/helpers/helpers.c               |   1 +
 rust/helpers/srcu.c                  |  18 ++
 rust/kernel/sync.rs                  |   2 +
 rust/kernel/sync/srcu.rs             | 109 ++++++++++
 rust/kernel/workqueue/mod.rs         |  15 ++
 10 files changed, 607 insertions(+), 30 deletions(-)
 create mode 100644 drivers/gpu/drm/tyr/reset.rs
 create mode 100644 drivers/gpu/drm/tyr/reset/hw_gate.rs
 create mode 100644 rust/helpers/srcu.c
 create mode 100644 rust/kernel/sync/srcu.rs

-- 
2.51.2

Re: [PATCH v2 0/4] drm/tyr: implement GPU reset API
Posted by Onur Özkan 1 month, 3 weeks ago
On Thu, 16 Apr 2026 20:17:26 +0300
Onur Özkan <work@onurozkan.dev> wrote:

> [...]

I messed up when sending the series (part of it was sent as a separate series
[1]). I will resend this properly; sorry for the noise.

[1]: https://lore.kernel.org/all/20260416171728.205141-1-work@onurozkan.dev/

-Onur
Re: [PATCH v2 0/4] drm/tyr: implement GPU reset API
Posted by Boqun Feng 1 month, 3 weeks ago
On Thu, Apr 16, 2026 at 08:23:45PM +0300, Onur Özkan wrote:
> On Thu, 16 Apr 2026 20:17:26 +0300
> Onur Özkan <work@onurozkan.dev> wrote:
> 
> > [...]
> 
> I messed up when sending the series (part of it was sent as a separate series
> [1]). I will resend this properly; sorry for the noise.
> 

FWIW, I didn't receive your patch #3 (even via my subscription to the
rust-for-linux list).

Could you add a doc test for disable_sync()? I'm curious about it
because you may disable a work item that has not been executed yet, and
wouldn't that leak memory (IIUC, we rely on Arc::drop() in
WorkItemPointer::run() to decrease the refcounts)? But maybe I'm missing
something subtle.
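
The concern can be illustrated with a small userspace sketch. It does not use
the kernel workqueue API: Job, Slot and the cancel_* helpers are hypothetical
names, and the only assumption carried over is that the queue owns one Arc
reference which is normally dropped when the work runs. Cancelling a pending
item without reclaiming that reference leaves the strong count elevated, so the
allocation is never freed.

```rust
use std::sync::Arc;

/// Hypothetical work item; not a kernel type.
struct Job {
    id: u32,
}

/// Hypothetical queue slot that owns one reference as a raw pointer, the way
/// a C-side queue would hold a pointer into a refcounted Rust allocation.
struct Slot {
    ptr: Option<*const Job>,
}

impl Slot {
    /// Enqueue: the queue takes ownership of one reference.
    fn enqueue(&mut self, job: Arc<Job>) {
        self.ptr = Some(Arc::into_raw(job));
    }

    /// Normal path: the run callback rebuilds the `Arc` and drops it.
    fn run(&mut self) -> Option<u32> {
        let ptr = self.ptr.take()?;
        // SAFETY: `ptr` came from `Arc::into_raw` in `enqueue` and is consumed once.
        let job = unsafe { Arc::from_raw(ptr) };
        Some(job.id)
    }

    /// Cancel that only forgets the pointer: the queue's reference leaks.
    fn cancel_leaky(&mut self) -> bool {
        self.ptr.take().is_some()
    }

    /// Cancel that reclaims the queue's reference when the item was pending.
    fn cancel_reclaim(&mut self) -> bool {
        match self.ptr.take() {
            Some(ptr) => {
                // SAFETY: same provenance as in `run`; the pointer is consumed once.
                drop(unsafe { Arc::from_raw(ptr) });
                true
            }
            None => false,
        }
    }
}

/// Enqueue a job, cancel it before it runs, and report the strong count
/// still visible to the caller (1 means the queue's reference was released).
fn count_after_cancel(cancel: fn(&mut Slot) -> bool) -> usize {
    let job = Arc::new(Job { id: 7 });
    let mut slot = Slot { ptr: None };
    slot.enqueue(Arc::clone(&job));
    assert!(cancel(&mut slot));
    Arc::strong_count(&job)
}

/// Enqueue a job, let it run, and report the result and the strong count.
fn count_after_run() -> (Option<u32>, usize) {
    let job = Arc::new(Job { id: 7 });
    let mut slot = Slot { ptr: None };
    slot.enqueue(Arc::clone(&job));
    let ran = slot.run();
    (ran, Arc::strong_count(&job))
}

fn main() {
    println!("ran:     {:?}", count_after_run());
    println!("leaky:   {}", count_after_cancel(Slot::cancel_leaky));
    println!("reclaim: {}", count_after_cancel(Slot::cancel_reclaim));
}
```

Whether the real disable_sync() has to reclaim the reference in this way depends
on the Rust workqueue abstraction's internals, which this sketch does not model.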

Regards,
Boqun

> [1]: https://lore.kernel.org/all/20260416171728.205141-1-work@onurozkan.dev/
> 
> -Onur
> 
Re: [PATCH v2 0/4] drm/tyr: implement GPU reset API
Posted by Onur Özkan 1 month, 1 week ago
On Thu, 16 Apr 2026 11:45:56 -0700
Boqun Feng <boqun@kernel.org> wrote:

> On Thu, Apr 16, 2026 at 08:23:45PM +0300, Onur Özkan wrote:
> > On Thu, 16 Apr 2026 20:17:26 +0300
> > Onur Özkan <work@onurozkan.dev> wrote:
> > 
> > > [...]
> > 
> > I messed up when sending the series (part of it was sent as a separate series
> > [1]). I will resend this properly; sorry for the noise.
> > 
> 
> FWIW, I didn't receive your patch #3 (even via my subscription to the
> rust-for-linux list).
> 
> Could you add a doc test for disable_sync()? I'm curious about it
> because you may disable a work item that has not been executed yet, and
> wouldn't that leak memory (IIUC, we rely on Arc::drop() in
> WorkItemPointer::run() to decrease the refcounts)? But maybe I'm missing
> something subtle.
> 
> Regards,
> Boqun
> 

Hi Boqun,

I fixed the leak issue, and this change now has its own series at [1]. I couldn't
figure out an easy way to write the doc test, though; it started to add too much
complexity and I didn't think it was worth it.

- Onur

[1]: https://lore.kernel.org/all/20260428104459.174602-1-work@onurozkan.dev

> > [1]: https://lore.kernel.org/all/20260416171728.205141-1-work@onurozkan.dev/
> > 
> > -Onur
> > 
Re: [PATCH v2 0/4] drm/tyr: implement GPU reset API
Posted by Onur Özkan 1 month, 3 weeks ago
On Thu, 16 Apr 2026 11:45:56 -0700
Boqun Feng <boqun@kernel.org> wrote:

> On Thu, Apr 16, 2026 at 08:23:45PM +0300, Onur Özkan wrote:
> > On Thu, 16 Apr 2026 20:17:26 +0300
> > Onur Özkan <work@onurozkan.dev> wrote:
> > 
> > > [...]
> > 
> > I messed up when sending the series (part of it was sent as a separate series
> > [1]). I will resend this properly; sorry for the noise.
> > 
> 
> FWIW, I didn't receive your patch #3 (even via my subscription to the
> rust-for-linux list).
> 

Interesting, it was actually sent to the rust-for-linux list [1]. But yeah, I totally
messed up sending this series...

[1]: https://lore.kernel.org/all/20260416171728.205141-2-work@onurozkan.dev/

> Could you add a doc test for disable_sync()? I'm curious about it
> because you may disable a work item that has not been executed yet, and
> wouldn't that leak memory (IIUC, we rely on Arc::drop() in
> WorkItemPointer::run() to decrease the refcounts)? But maybe I'm missing
> something subtle.

I was expecting the C call to handle the teardown properly through the pointer, but
I wasn't aware of the Rust-side internals of the workqueue abstraction. I
will look into that in more detail next week and will definitely add the
test in v3.

Thanks,
Onur

> 
> Regards,
> Boqun
> 
> > [1]: https://lore.kernel.org/all/20260416171728.205141-1-work@onurozkan.dev/
> > 
> > -Onur
> >