This series adds GPU reset handling support for Tyr in a new module
drivers/gpu/drm/tyr/driver.rs which encapsulates the low-level reset
controller internals and exposes a ResetHandle API to the driver.

The reset module owns reset state, queueing and execution ordering
through OrderedQueue and handles duplicate/concurrent reset requests
with a pending flag.

Apart from the reset module, the first 3 patches:

- Fixes a potential reset-complete stale state bug by clearing completed
  state before doing soft reset.
- Adds Work::disable_sync() (wrapper of bindings::disable_work_sync).
- Adds OrderedQueue support.

Runtime tested on hardware by Deborah Brouwer (see [1]) and myself.

[1]: https://gitlab.freedesktop.org/panfrost/linux/-/merge_requests/63#note_3364131

Link: https://gitlab.freedesktop.org/panfrost/linux/-/issues/28
---

Onur Özkan (4):
  drm/tyr: clear reset IRQ before soft reset
  rust: add Work::disable_sync
  rust: add ordered workqueue wrapper
  drm/tyr: add GPU reset handling

 drivers/gpu/drm/tyr/driver.rs |  38 +++----
 drivers/gpu/drm/tyr/reset.rs  | 180 ++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/tyr/tyr.rs    |   1 +
 rust/helpers/workqueue.c      |   6 ++
 rust/kernel/workqueue.rs      |  62 ++++++++++++
 5 files changed, 260 insertions(+), 27 deletions(-)
 create mode 100644 drivers/gpu/drm/tyr/reset.rs

base-commit: 0ccc0dac94bf2f5c6eb3e9e7f1014cd9dddf009f
--
2.51.2
> This series adds GPU reset handling support for Tyr in a new module
> drivers/gpu/drm/tyr/driver.rs which encapsulates the low-level reset
> controller internals and exposes a ResetHandle API to the driver.
>
> The reset module owns reset state, queueing and execution ordering
> through OrderedQueue and handles duplicate/concurrent reset requests
> with a pending flag.
>
> Apart from the reset module, the first 3 patches:
>
> - Fixes a potential reset-complete stale state bug by clearing completed
> state before doing soft reset.
> - Adds Work::disable_sync() (wrapper of bindings::disable_work_sync).
> - Adds OrderedQueue support.
>
> Runtime tested on hardware by Deborah Brouwer (see [1]) and myself.
>
> [1]: https://gitlab.freedesktop.org/panfrost/linux/-/merge_requests/63#note_3364131
>
> Link: https://gitlab.freedesktop.org/panfrost/linux/-/issues/28
> ---
>
> Onur Özkan (4):
> drm/tyr: clear reset IRQ before soft reset
> rust: add Work::disable_sync
> rust: add ordered workqueue wrapper
> drm/tyr: add GPU reset handling
>
> drivers/gpu/drm/tyr/driver.rs | 38 +++----
> drivers/gpu/drm/tyr/reset.rs | 180 ++++++++++++++++++++++++++++++++++
> drivers/gpu/drm/tyr/tyr.rs | 1 +
> rust/helpers/workqueue.c | 6 ++
> rust/kernel/workqueue.rs | 62 ++++++++++++
> 5 files changed, 260 insertions(+), 27 deletions(-)
> create mode 100644 drivers/gpu/drm/tyr/reset.rs
>
>
> base-commit: 0ccc0dac94bf2f5c6eb3e9e7f1014cd9dddf009f
> --
> 2.51.2
>
Hi all,

Here is the current status of this work. I have two blockers to moving forward.

1- GPU unplug API

On the existing C side, reset failure handling eventually needs to unplug the
device, and that path is part of the broader reset flow in:

- srctree/drivers/gpu/drm/panthor/panthor_device.c

The unplug API is part of [1] and, as far as I understand, is still a work in
progress. For Tyr, I currently keep this as a placeholder
(todo!("unplug the GPU")) in the reset path, because I do not want to introduce
temporary or partial unplug handling in this series before the unplug design is
settled.

[1]: https://gitlab.freedesktop.org/panfrost/linux/-/work_items/29

2- Design decisions for reset handling

The second blocker is the design of how a Resettable implementer (Resettable
being a generic trait with pre_reset/post_reset hooks) should stop admitting
new work, drain in-flight operations and recover after reset.

My current understanding is that the cleanest approach is to keep reset.rs
responsible only for reset orchestration:

- schedule reset work
- call pre_reset() hooks
- perform the hardware reset
- call post_reset() hooks
- propagate failure.

Then, each Resettable implementer should own its local recovery logic.

This is also how the existing C implementation is structured. The reset worker
is centralized, but recovery is implemented by the participating subsystems:

- srctree/drivers/gpu/drm/panthor/panthor_sched.c
- srctree/drivers/gpu/drm/panthor/panthor_fw.c
- srctree/drivers/gpu/drm/panthor/panthor_mmu.c

More specifically, the existing C side has hooks such as:

- panthor_sched_pre_reset() / panthor_sched_post_reset()
- panthor_fw_pre_reset() / panthor_fw_post_reset()
- panthor_mmu_pre_reset() / panthor_mmu_post_reset()
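
The split between central orchestration and per-subsystem recovery described
above can be sketched roughly as follows. This is a minimal userspace sketch
using only the Rust standard library, not the kernel crate and not the code
from this series. Only the names Resettable, pre_reset and post_reset come from
the discussion; ResetError, run_reset, Logged and the choice to call the
post_reset hooks in reverse order are assumptions made for illustration.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum ResetError {
    /// A pre_reset/post_reset hook failed (carries the implementer name).
    Hook(&'static str),
    /// The hardware reset step itself failed.
    Hardware,
}

/// Hook trait in the spirit of Resettable: every implementer owns its
/// local "stop", "drain" and "recover" logic behind these two calls.
pub trait Resettable {
    fn pre_reset(&mut self) -> Result<(), ResetError>;
    fn post_reset(&mut self) -> Result<(), ResetError>;
}

/// Orchestration only: pre hooks, hardware reset, post hooks, and
/// propagation of the first failure. No subsystem knowledge lives here.
pub fn run_reset(
    parts: &mut [&mut dyn Resettable],
    hw_reset: impl FnOnce() -> Result<(), ResetError>,
) -> Result<(), ResetError> {
    for part in parts.iter_mut() {
        part.pre_reset()?;
    }
    hw_reset()?;
    // Assumption of this sketch: bring implementers back in reverse order.
    for part in parts.iter_mut().rev() {
        part.post_reset()?;
    }
    Ok(())
}

/// Demo implementer that only records the order in which hooks run.
pub struct Logged {
    pub name: &'static str,
    pub log: Rc<RefCell<Vec<String>>>,
}

impl Resettable for Logged {
    fn pre_reset(&mut self) -> Result<(), ResetError> {
        self.log.borrow_mut().push(format!("{}:pre", self.name));
        Ok(())
    }

    fn post_reset(&mut self) -> Result<(), ResetError> {
        self.log.borrow_mut().push(format!("{}:post", self.name));
        Ok(())
    }
}
```

With two Logged implementers named "sched" and "fw", a successful run records
sched:pre, fw:pre, fw:post, sched:post. If the hardware reset closure returns
an error, no post_reset hook runs and the error is returned to the caller,
which is where the unplug handling from the first blocker would eventually
attach.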

The reason I am leaning in the same direction for Tyr is that "stop new work",
"drain" and "resume" are not generic operations. They depend on the implementer.

Because of that, I think reset.rs should not have a global guard/checking API
for all of this.
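
To illustrate what "implementer-local" could mean, here is a hedged userspace
sketch of one possible gate that a single subsystem might keep for itself. It
uses only the Rust standard library. Gate, Ticket and every method name are
hypothetical and are not part of this series; another implementer may need
something quite different, which is the point of not putting this in reset.rs.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

/// Hypothetical admission gate owned by one subsystem.
pub struct Gate {
    accepting: AtomicBool,
    in_flight: AtomicUsize,
}

/// Proof that one operation was admitted; dropping it marks the
/// operation as finished.
pub struct Ticket<'a> {
    gate: &'a Gate,
}

impl Drop for Ticket<'_> {
    fn drop(&mut self) {
        self.gate.in_flight.fetch_sub(1, Ordering::SeqCst);
    }
}

impl Gate {
    pub fn new() -> Self {
        Gate {
            accepting: AtomicBool::new(true),
            in_flight: AtomicUsize::new(0),
        }
    }

    /// "Stop new work": returns None once close() has been called.
    pub fn try_enter(&self) -> Option<Ticket<'_>> {
        // Count first, then check the flag. With SeqCst on both sides,
        // either this call observes the closed gate or a later
        // is_drained() observes the count, so no operation slips through.
        self.in_flight.fetch_add(1, Ordering::SeqCst);
        if self.accepting.load(Ordering::SeqCst) {
            Some(Ticket { gate: self })
        } else {
            self.in_flight.fetch_sub(1, Ordering::SeqCst);
            None
        }
    }

    /// Would be called from this implementer's pre_reset().
    pub fn close(&self) {
        self.accepting.store(false, Ordering::SeqCst);
    }

    /// "Drain": true once every admitted operation has finished.
    pub fn is_drained(&self) -> bool {
        self.in_flight.load(Ordering::SeqCst) == 0
    }

    /// "Resume": would be called from this implementer's post_reset().
    pub fn open(&self) {
        self.accepting.store(true, Ordering::SeqCst);
    }
}
```

In this sketch the subsystem would call close() and wait for is_drained() in
its own pre_reset(), then call open() in post_reset(). How the waiting is done
is deliberately left out, since that also depends on the implementer.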

Comments and suggestions are very welcome.

Regards,
Onur

On Fri, Mar 13, 2026 at 12:16:40PM +0300, Onur Özkan wrote:
> This series adds GPU reset handling support for Tyr in a new module
> drivers/gpu/drm/tyr/driver.rs which encapsulates the low-level reset
> controller internals and exposes a ResetHandle API to the driver.
>
> The reset module owns reset state, queueing and execution ordering
> through OrderedQueue and handles duplicate/concurrent reset requests
> with a pending flag.
>
> Apart from the reset module, the first 3 patches:
>
> - Fixes a potential reset-complete stale state bug by clearing completed
> state before doing soft reset.
> - Adds Work::disable_sync() (wrapper of bindings::disable_work_sync).
> - Adds OrderedQueue support.
>
> Runtime tested on hardware by Deborah Brouwer (see [1]) and myself.
>
> [1]: https://gitlab.freedesktop.org/panfrost/linux/-/merge_requests/63#note_3364131
>
> Link: https://gitlab.freedesktop.org/panfrost/linux/-/issues/28
> ---
>
> Onur Özkan (4):
> drm/tyr: clear reset IRQ before soft reset
> rust: add Work::disable_sync
> rust: add ordered workqueue wrapper

I actually added ordered workqueue support here:
https://lore.kernel.org/all/20260312-create-workqueue-v4-0-ea39c351c38f@google.com/

Alice

On Fri, 13 Mar 2026 09:52:16 +0000
Alice Ryhl <aliceryhl@google.com> wrote:

> On Fri, Mar 13, 2026 at 12:16:40PM +0300, Onur Özkan wrote:
> > This series adds GPU reset handling support for Tyr in a new module
> > drivers/gpu/drm/tyr/driver.rs which encapsulates the low-level reset
> > controller internals and exposes a ResetHandle API to the driver.
> >
> > The reset module owns reset state, queueing and execution ordering
> > through OrderedQueue and handles duplicate/concurrent reset requests
> > with a pending flag.
> >
> > Apart from the reset module, the first 3 patches:
> >
> > - Fixes a potential reset-complete stale state bug by clearing
> > completed state before doing soft reset.
> > - Adds Work::disable_sync() (wrapper of
> > bindings::disable_work_sync).
> > - Adds OrderedQueue support.
> >
> > Runtime tested on hardware by Deborah Brouwer (see [1]) and myself.
> >
> > [1]:
> > https://gitlab.freedesktop.org/panfrost/linux/-/merge_requests/63#note_3364131
> >
> > Link: https://gitlab.freedesktop.org/panfrost/linux/-/issues/28
> > ---
> >
> > Onur Özkan (4):
> > drm/tyr: clear reset IRQ before soft reset
> > rust: add Work::disable_sync
> > rust: add ordered workqueue wrapper
>
> I actually added ordered workqueue support here:
> https://lore.kernel.org/all/20260312-create-workqueue-v4-0-ea39c351c38f@google.com/
>
> Alice

That's cool. I guess this will wait until your patch lands unless we
want to combine them into a single series.

- Onur

On Fri, Mar 13, 2026 at 12:12 PM Onur Özkan <work@onurozkan.dev> wrote:
>
> On Fri, 13 Mar 2026 09:52:16 +0000
> Alice Ryhl <aliceryhl@google.com> wrote:
>
> > On Fri, Mar 13, 2026 at 12:16:40PM +0300, Onur Özkan wrote:
> > > This series adds GPU reset handling support for Tyr in a new module
> > > drivers/gpu/drm/tyr/driver.rs which encapsulates the low-level reset
> > > controller internals and exposes a ResetHandle API to the driver.
> > >
> > > The reset module owns reset state, queueing and execution ordering
> > > through OrderedQueue and handles duplicate/concurrent reset requests
> > > with a pending flag.
> > >
> > > Apart from the reset module, the first 3 patches:
> > >
> > > - Fixes a potential reset-complete stale state bug by clearing
> > > completed state before doing soft reset.
> > > - Adds Work::disable_sync() (wrapper of
> > > bindings::disable_work_sync).
> > > - Adds OrderedQueue support.
> > >
> > > Runtime tested on hardware by Deborah Brouwer (see [1]) and myself.
> > >
> > > [1]:
> > > https://gitlab.freedesktop.org/panfrost/linux/-/merge_requests/63#note_3364131
> > >
> > > Link: https://gitlab.freedesktop.org/panfrost/linux/-/issues/28
> > > ---
> > >
> > > Onur Özkan (4):
> > > drm/tyr: clear reset IRQ before soft reset
> > > rust: add Work::disable_sync
> > > rust: add ordered workqueue wrapper
> >
> > I actually added ordered workqueue support here:
> > https://lore.kernel.org/all/20260312-create-workqueue-v4-0-ea39c351c38f@google.com/
> >
> > Alice
>
> That's cool. I guess this will wait until your patch lands unless we
> want to combine them into a single series.

You can just say in your cover letter that your series depends on mine.

Alice