rust: dma: add CoherentArray for compile-time sized allocations

[PATCH 0/9] rust: dma: add CoherentArray for compile-time sized allocations

Posted by Eliot Courtney 1 week, 1 day ago

This series extends the DMA coherent allocation API to support compile-time
known sizes. This lets bounds checking be moved from runtime to build
time, which is useful to avoid runtime panics from index typos. It also
removes the need for a Result return type in some places.

The compile-time size is specified via a marker type: StaticSize<N>.
Statically sized allocations can decay to runtime sized ones via deref
coercion for code that doesn't need to know the size at compile time, or to
avoid having to carry around extra type parameters. The implementation
follows a similar pattern to Device/DeviceContext.
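
As a rough illustration of the marker-type pattern described above, here is
a userspace model, not the code from this series. The `Allocation` type, the
`Vec` backing store, and `element_count` are assumptions made for the
sketch; only the `StaticSize<N>`/`RuntimeSize` names come from the text.

```rust
use std::marker::PhantomData;
use std::ops::Deref;

/// Marker: the length is only known at run time.
pub struct RuntimeSize;
/// Marker: the length `N` is part of the type.
pub struct StaticSize<const N: usize>;

/// Hypothetical stand-in for a coherent allocation; a `Vec` replaces DMA
/// memory. `repr(C)` pins the field layout so the cast in `deref` is sound.
#[repr(C)]
pub struct Allocation<T, S> {
    data: Vec<T>,
    _size: PhantomData<S>,
}

impl<T> Allocation<T, RuntimeSize> {
    /// Number of elements, only available as a run-time value here.
    pub fn len(&self) -> usize {
        self.data.len()
    }
}

impl<T: Default + Clone, const N: usize> Allocation<T, StaticSize<N>> {
    /// Allocates exactly `N` default-initialized elements.
    pub fn new() -> Self {
        Self { data: vec![T::default(); N], _size: PhantomData }
    }
}

impl<T, const N: usize> Deref for Allocation<T, StaticSize<N>> {
    type Target = Allocation<T, RuntimeSize>;

    fn deref(&self) -> &Self::Target {
        // SAFETY: both types are `repr(C)` with the same `Vec<T>` field
        // followed by a zero-sized `PhantomData`, so the layouts match.
        unsafe { &*(self as *const Self).cast::<Self::Target>() }
    }
}

/// Code that does not need the static size takes the runtime-sized type.
pub fn element_count(alloc: &Allocation<u32, RuntimeSize>) -> usize {
    alloc.len()
}
```

Passing `&Allocation<u32, StaticSize<4>>` to `element_count` compiles because
deref coercion drops the size parameter at the call site.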

The series defines three type aliases: CoherentSlice<T> (for runtime size),
CoherentArray<T, N> (for compile-time size N), and CoherentObject<T> (for
single object allocations). It also adds infallible dma_read!/dma_write!
macros and methods to CoherentArray, while prefixing the existing fallible
methods and macros with `try_`.
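
The split between infallible and `try_` accessors might look roughly like
the following userspace model. It is a sketch under assumptions: the `Vec`
backing store, the `OutOfBounds` error, the `InBounds` helper, and the
const-generic index are all hypothetical, and modelling `CoherentObject<T>`
as a length-1 array is a guess rather than something the text states.

```rust
use std::marker::PhantomData;

pub struct RuntimeSize;
pub struct StaticSize<const N: usize>;

/// Hypothetical stand-in for the allocation type; a `Vec` replaces DMA memory.
pub struct Allocation<T, S> {
    data: Vec<T>,
    _size: PhantomData<S>,
}

// The three alias names from the cover letter, applied to the model type.
pub type CoherentSlice<T> = Allocation<T, RuntimeSize>;
pub type CoherentArray<T, const N: usize> = Allocation<T, StaticSize<N>>;
// Assumption: a single object is modelled as an array of length 1.
pub type CoherentObject<T> = CoherentArray<T, 1>;

/// Hypothetical error type for the fallible path.
#[derive(Debug, PartialEq)]
pub struct OutOfBounds;

/// Evaluating `OK` fails the build when `I >= N`.
struct InBounds<const I: usize, const N: usize>;
impl<const I: usize, const N: usize> InBounds<I, N> {
    const OK: () = assert!(I < N, "index out of bounds");
}

impl<T: Copy + Default> CoherentSlice<T> {
    pub fn new(len: usize) -> Self {
        Self { data: vec![T::default(); len], _size: PhantomData }
    }

    /// Fallible: the length is only known at run time.
    pub fn try_read(&self, index: usize) -> Result<T, OutOfBounds> {
        self.data.get(index).copied().ok_or(OutOfBounds)
    }
}

impl<T: Copy + Default, const N: usize> CoherentArray<T, N> {
    pub fn new() -> Self {
        Self { data: vec![T::default(); N], _size: PhantomData }
    }

    /// Infallible: `I < N` is checked when this call is compiled.
    pub fn read<const I: usize>(&self) -> T {
        let () = InBounds::<I, N>::OK;
        self.data[I]
    }
}
```

In this model `arr.read::<4>()` on a `CoherentArray<u32, 4>` is rejected at
build time, while `slice.try_read(4)` compiles and returns an error.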

The macros keep the same syntax (i.e.
coherent_allocation[index].optional_fields = expression) even for
CoherentObject, because the [] syntax is needed to know where to split the
actual CoherentAllocation object from the fields. This means that
CoherentObject is indexed with [0] in dma_write!/dma_read! macros. The
alternative is defining a separate macro for single object access, but it
still would need a way to delineate between the allocation and the fields,
perhaps by using commas (dma_read_obj!(object, fields),
dma_write_obj!(object, fields, value)). This would be inconsistent with the
array/slice syntax.
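
To show why the `[]` acts as the delimiter, here is a simplified macro
model. It is not the kernel's dma_read!/dma_write! implementation: the
`model_read!`/`model_write!` names, the `Allocation`, `Message` and `Header`
types, and the plain `Vec` access are assumptions made for the sketch.

```rust
#[derive(Clone, Copy, Default)]
pub struct Header {
    pub seq: u32,
    pub len: u32,
}

#[derive(Clone, Copy, Default)]
pub struct Message {
    pub hdr: Header,
    pub flags: u8,
}

/// Hypothetical stand-in for an allocation; a `Vec` replaces DMA memory.
pub struct Allocation<T> {
    pub data: Vec<T>,
}

/// Accepts `alloc[index].optional.fields = value`. Everything before `[` is
/// the allocation, everything after `]` is the field path.
macro_rules! model_write {
    ($alloc:ident [ $idx:expr ] $(. $field:ident)* = $val:expr) => {
        $alloc.data[$idx] $(. $field)* = $val
    };
}

/// Accepts `alloc[index].optional.fields` and yields the value read.
macro_rules! model_read {
    ($alloc:ident [ $idx:expr ] $(. $field:ident)*) => {
        $alloc.data[$idx] $(. $field)*
    };
}
```

A single-object allocation is then accessed as
`model_write!(obj[0].hdr.seq = 7)`, which mirrors the [0] indexing described
above.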

The last patch in the series may be useful as an example of what this
looks like to use. Also, there is probably a better name than
CoherentSlice. I found that specifying a default of RuntimeSize on
CoherentAllocation stopped the compiler from being able to resolve
which alloc_attrs to call in usages such as
CoherentAllocation<u8>::alloc_attrs. Also, we probably want to encourage
people to use the statically sized one if possible, so it may be nice to
avoid defaulting CoherentAllocation to RuntimeSize.

Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
---
Eliot Courtney (9):
      rust: dma: rename CoherentAllocation fallible methods
      rust: dma: parameterize CoherentAllocation with AllocationSize
      rust: dma: add CoherentArray for compile-time sized allocations
      rust: dma: simplify try_dma_read! and try_dma_write!
      rust: dma: rename try_item_from_index to try_ptr_at
      rust: dma: add dma_read! and dma_write! macros
      rust: dma: implement decay from CoherentArray to CoherentSlice
      rust: dma: add CoherentObject for single element allocations
      gpu: nova-core: migrate to CoherentArray and CoherentObject

 drivers/gpu/nova-core/dma.rs            |  10 +-
 drivers/gpu/nova-core/falcon.rs         |   2 +-
 drivers/gpu/nova-core/firmware/fwsec.rs |   4 +-
 drivers/gpu/nova-core/gsp.rs            |  44 +--
 drivers/gpu/nova-core/gsp/boot.rs       |   6 +-
 drivers/gpu/nova-core/gsp/cmdq.rs       |  20 +-
 drivers/gpu/nova-core/gsp/fw.rs         |  12 +-
 rust/kernel/dma.rs                      | 555 +++++++++++++++++++++++++-------
 samples/rust/rust_dma.rs                |  14 +-
 9 files changed, 489 insertions(+), 178 deletions(-)
---
base-commit: c71257394bc9c59ea727803f6e55e83fe63db74e
change-id: 20260128-coherent-array-0321eb723d4c

Best regards,
-- 
Eliot Courtney <ecourtney@nvidia.com>

Re: [PATCH 0/9] rust: dma: add CoherentArray for compile-time sized allocations

Posted by Gary Guo 4 days, 21 hours ago

On Fri Jan 30, 2026 at 8:34 AM GMT, Eliot Courtney wrote:
> This series extends the DMA coherent allocation API to support compile-time
> known sizes. This lets bounds checking be moved from runtime to build
> time, which is useful to avoid runtime panics from index typos. It also
> removes the need for a Result return type in some places.
>
> The compile-time size is specified via a marker type: StaticSize<N>.
> Statically sized allocations can decay to runtime sized ones via deref
> coercion for code that doesn't need to know the size at compile time, or to
> avoid having to carry around extra type parameters. The implementation
> follows a similar pattern to Device/DeviceContext.
>
> The series defines three type aliases: CoherentSlice<T> (for runtime size),
> CoherentArray<T, N> (for compile-time size N), and CoherentObject<T> (for
> single object allocations). It also adds infallible dma_read!/dma_write!
> macros and methods to CoherentArray, while prefixing the existing fallible
> methods and macros with `try_`.
>
> The macros keep the same syntax (i.e.
> coherent_allocation[index].optional_fields = expression) even for
> CoherentObject, because the [] syntax is needed to know where to split the
> actual CoherentAllocation object from the fields. This means that
> CoherentObject is indexed with [0] in dma_write!/dma_read! macros. The
> alternative is defining a separate macro for single object access, but it
> still would need a way to delineate between the allocation and the fields,
> perhaps by using commas (dma_read_obj!(object, fields),
> dma_write_obj!(object, fields, value)). This would be inconsistent with the
> array/slice syntax.
>
> The last patch in the series may be useful as an example of what this
> looks like to use. Also, there is probably a better name than
> CoherentSlice. I found that specifying a default of RuntimeSize on
> CoherentAllocation stopped the compiler from being able to resolve
> which alloc_attrs to call in usages such as
> CoherentAllocation<u8>::alloc_attrs. Also, we probably want to encourage
> people to use the statically sized one if possible, so it may be nice to
> avoid defaulting CoherentAllocation to RuntimeSize.

I've already posted an example on Zulip but for visibility I'll post it here
too:

I think the design should be `CoherentObject<T: ?Sized>` so that if you need a
`CoherentArray`, it's `CoherentObject<[T; N]>`, and `CoherentSlice<T>` is
`CoherentObject<[T]>`. The existing `Io` that has a fixed minimum size but
variable actual size can be abstracted as a new type.
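
A minimal userspace sketch of that shape, assuming a `Box` in place of
DMA-coherent memory. The `new`, `get` and `into_slice` methods are
hypothetical names chosen for illustration and are not taken from the
Zulip example.

```rust
/// Hypothetical stand-in: a `Box` replaces the DMA-coherent memory.
pub struct CoherentObject<T: ?Sized> {
    mem: Box<T>,
}

impl<T> CoherentObject<T> {
    /// Works for a single object and for a fixed-size array `[U; N]`.
    pub fn new(value: T) -> Self {
        Self { mem: Box::new(value) }
    }
}

impl<T: ?Sized> CoherentObject<T> {
    /// Shared access for sized and unsized contents alike.
    pub fn get(&self) -> &T {
        &self.mem
    }
}

impl<T, const N: usize> CoherentObject<[T; N]> {
    /// Forgets the compile-time length: `[T; N]` becomes `[T]`.
    pub fn into_slice(self) -> CoherentObject<[T]> {
        let mem: Box<[T]> = self.mem;
        CoherentObject { mem }
    }
}
```

With this layout a single object is `CoherentObject<T>`, a fixed-size array
is `CoherentObject<[T; N]>`, and the runtime-sized case is
`CoherentObject<[T]>`, so no separate size marker is needed.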

Link: https://rust-for-linux.zulipchat.com/#narrow/channel/288089-General/topic/Generic.20I.2FO.20backends/near/571228593

Best,
Gary

>
> Signed-off-by: Eliot Courtney <ecourtney@nvidia.com>
> ---
> Eliot Courtney (9):
>       rust: dma: rename CoherentAllocation fallible methods
>       rust: dma: parameterize CoherentAllocation with AllocationSize
>       rust: dma: add CoherentArray for compile-time sized allocations
>       rust: dma: simplify try_dma_read! and try_dma_write!
>       rust: dma: rename try_item_from_index to try_ptr_at
>       rust: dma: add dma_read! and dma_write! macros
>       rust: dma: implement decay from CoherentArray to CoherentSlice
>       rust: dma: add CoherentObject for single element allocations
>       gpu: nova-core: migrate to CoherentArray and CoherentObject
>
>  drivers/gpu/nova-core/dma.rs            |  10 +-
>  drivers/gpu/nova-core/falcon.rs         |   2 +-
>  drivers/gpu/nova-core/firmware/fwsec.rs |   4 +-
>  drivers/gpu/nova-core/gsp.rs            |  44 +--
>  drivers/gpu/nova-core/gsp/boot.rs       |   6 +-
>  drivers/gpu/nova-core/gsp/cmdq.rs       |  20 +-
>  drivers/gpu/nova-core/gsp/fw.rs         |  12 +-
>  rust/kernel/dma.rs                      | 555 +++++++++++++++++++++++++-------
>  samples/rust/rust_dma.rs                |  14 +-
>  9 files changed, 489 insertions(+), 178 deletions(-)
> ---
> base-commit: c71257394bc9c59ea727803f6e55e83fe63db74e
> change-id: 20260128-coherent-array-0321eb723d4c
>
> Best regards,

Re: [PATCH 0/9] rust: dma: add CoherentArray for compile-time sized allocations

Posted by Danilo Krummrich 6 days, 23 hours ago

(Cc: Lyude)

On Fri Jan 30, 2026 at 9:34 AM CET, Eliot Courtney wrote:
> This series extends the DMA coherent allocation API to support compile-time
> known sizes. This lets bounds checking be moved from runtime to build
> time, which is useful to avoid runtime panics from index typos. It also
> removes the need for a Result return type in some places.
>
> The compile-time size is specified via a marker type: StaticSize<N>.
> Statically sized allocations can decay to runtime sized ones via deref
> coercion for code that doesn't need to know the size at compile time, or to
> avoid having to carry around extra type parameters. The implementation
> follows a similar pattern to Device/DeviceContext.
>
> The series defines three type aliases: CoherentSlice<T> (for runtime size),
> CoherentArray<T, N> (for compile-time size N), and CoherentObject<T> (for
> single object allocations). It also adds infallible dma_read!/dma_write!
> macros and methods to CoherentArray, while prefixing the existing fallible
> methods and macros with `try_`.
>
> The macros keep the same syntax (i.e.
> coherent_allocation[index].optional_fields = expression) even for
> CoherentObject, because the [] syntax is needed to know where to split the
> actual CoherentAllocation object from the fields. This means that
> CoherentObject is indexed with [0] in dma_write!/dma_read! macros. The
> alternative is defining a separate macro for single object access, but it
> still would need a way to delineate between the allocation and the fields,
> perhaps by using commas (dma_read_obj!(object, fields),
> dma_write_obj!(object, fields, value)). This would be inconsistent with the
> array/slice syntax.

We've just generalized I/O to support arbitrary I/O backends (busses, backing
storage, etc.).

With this we can wire up the I/O traits to DMA and generalize the
dma_read!() and dma_write!() macros accordingly. I.e. we can extend the I/O
traits with field_write() and field_read().

(Lyude is going to work on this as a more integrated alternative to iosys_map.
It would be good to align with her regarding this work.)

This has the advantage that we don't have to duplicate all this infrastructure
for I/O memory, DMA, etc.
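
One way to picture field_read()/field_write() living on a shared trait is
the userspace sketch below. Everything in it is an assumption: the
`IoBackend` trait, its method signatures, the `u32`-only accessors, and the
`DmaModel`/`MmioModel` types are hypothetical and do not reflect the
kernel's actual I/O traits.

```rust
/// Hypothetical backend trait: only raw byte access differs per backend.
pub trait IoBackend {
    fn size(&self) -> usize;
    fn read_bytes(&self, offset: usize, out: &mut [u8]);
    fn write_bytes(&mut self, offset: usize, src: &[u8]);

    /// Shared, bounds-checked read of a little-endian `u32` field.
    fn field_read(&self, offset: usize) -> Option<u32> {
        let end = offset.checked_add(4)?;
        if end > self.size() {
            return None;
        }
        let mut raw = [0u8; 4];
        self.read_bytes(offset, &mut raw);
        Some(u32::from_le_bytes(raw))
    }

    /// Shared, bounds-checked write of a little-endian `u32` field.
    fn field_write(&mut self, offset: usize, value: u32) -> Option<()> {
        let end = offset.checked_add(4)?;
        if end > self.size() {
            return None;
        }
        self.write_bytes(offset, &value.to_le_bytes());
        Some(())
    }
}

/// Model of a DMA buffer whose size is chosen at run time.
pub struct DmaModel {
    pub mem: Vec<u8>,
}

/// Model of a fixed-size register window.
pub struct MmioModel {
    pub regs: [u8; 16],
}

impl IoBackend for DmaModel {
    fn size(&self) -> usize {
        self.mem.len()
    }
    fn read_bytes(&self, offset: usize, out: &mut [u8]) {
        out.copy_from_slice(&self.mem[offset..offset + out.len()]);
    }
    fn write_bytes(&mut self, offset: usize, src: &[u8]) {
        self.mem[offset..offset + src.len()].copy_from_slice(src);
    }
}

impl IoBackend for MmioModel {
    fn size(&self) -> usize {
        self.regs.len()
    }
    fn read_bytes(&self, offset: usize, out: &mut [u8]) {
        out.copy_from_slice(&self.regs[offset..offset + out.len()]);
    }
    fn write_bytes(&mut self, offset: usize, src: &[u8]) {
        self.regs[offset..offset + src.len()].copy_from_slice(src);
    }
}
```

Both models get the field accessors from the trait's provided methods, so
the bounds check and byte conversion are written once.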

I also think that CoherentSlice is too specific of a type. I'd rather have a
generic type, maybe UnsafeSlice or IoSlice, that just uses the I/O backend for
accesses.

Re: [PATCH 0/9] rust: dma: add CoherentArray for compile-time sized allocations

Posted by Alexandre Courbot 6 days, 22 hours ago

On Sat Jan 31, 2026 at 9:27 PM JST, Danilo Krummrich wrote:
> (Cc: Lyude)
>
> On Fri Jan 30, 2026 at 9:34 AM CET, Eliot Courtney wrote:
>> This series extends the DMA coherent allocation API to support compile-time
>> known sizes. This lets bounds checking be moved from runtime to build
>> time, which is useful to avoid runtime panics from index typos. It also
>> removes the need for a Result return type in some places.
>>
>> The compile-time size is specified via a marker type: StaticSize<N>.
>> Statically sized allocations can decay to runtime sized ones via deref
>> coercion for code that doesn't need to know the size at compile time, or to
>> avoid having to carry around extra type parameters. The implementation
>> follows a similar pattern to Device/DeviceContext.
>>
>> The series defines three type aliases: CoherentSlice<T> (for runtime size),
>> CoherentArray<T, N> (for compile-time size N), and CoherentObject<T> (for
>> single object allocations). It also adds infallible dma_read!/dma_write!
>> macros and methods to CoherentArray, while prefixing the existing fallible
>> methods and macros with `try_`.
>>
>> The macros keep the same syntax (i.e.
>> coherent_allocation[index].optional_fields = expression) even for
>> CoherentObject, because the [] syntax is needed to know where to split the
>> actual CoherentAllocation object from the fields. This means that
>> CoherentObject is indexed with [0] in dma_write!/dma_read! macros. The
>> alternative is defining a separate macro for single object access, but it
>> still would need a way to delineate between the allocation and the fields,
>> perhaps by using commas (dma_read_obj!(object, fields),
>> dma_write_obj!(object, fields, value)). This would be inconsistent with the
>> array/slice syntax.
>
> We've just generalized I/O to support arbitrary I/O backends (busses, backing
> storage, etc.).
>
> With this we can wire up the I/O traits to DMA and generalize the
> dma_read!() and dma_write!() macros accordingly. I.e. we can extend the I/O
> traits with field_write() and field_read().

With the caveat that the I/O traits for now only support accessing
primitive types; is the plan to add a function to read any type
implementing `FromBytes`?

>
> (Lyude is going to work on this as a more integrated alternative to iosys_map.
> It would be good to align with her regarding this work.)

Heads up, I am also doing some plumbing in `io.rs` related to the
register macro. Maybe we should have a thread on Zulip to discuss what
everyone is working on.

>
> This has the advantage that we don't have to duplicate all this infrastructure
> for I/O memory, DMA, etc.
>
> I also think that CoherentSlice is too specific of a type. I'd rather have a
> generic type, maybe UnsafeSlice or IoSlice, that just uses the I/O backend for
> accesses.

For me the main appeal of this patchset is that it provides a way to
work infallibly with a single object or a fixed-size array. I hope
that's something we can preserve.

Re: [PATCH 0/9] rust: dma: add CoherentArray for compile-time sized allocations

Posted by Danilo Krummrich 6 days, 21 hours ago

(Cc: Lyude)

I assume something odd is going on with your mail client for people that have
been added to a thread later on?

On Sat Jan 31, 2026 at 2:16 PM CET, Alexandre Courbot wrote:
> On Sat Jan 31, 2026 at 9:27 PM JST, Danilo Krummrich wrote:
>> We've just generalized I/O to support arbitrary I/O backends (busses, backing
>> storage, etc.).
>>
>> With this we can wire up the I/O traits to DMA and generalize the
>> dma_read!() and dma_write!() macros accordingly. I.e. we can extend the I/O
>> traits with field_write() and field_read().
>
> With the caveat that the I/O traits for now only support accessing
> primitive types; is the plan to add a function to read any type
> implementing `FromBytes`?

That's exactly what I say above: generalize the dma_read!() and dma_write!()
macros by adding field_write() and field_read() to the I/O traits. :)

For reference, this is where I brought this up originally [1].

[1] https://lore.kernel.org/all/DFOP5BY09539.AFY5L5FV1HNV@kernel.org/

>> (Lyude is going to work on this as a more integrated alternative to iosys_map.
>> It would be good to align with her regarding this work.)
>
> Heads up, I am also doing some plumbing in `io.rs` related to the
> register macro. Maybe we should have a thread on Zulip to discuss what
> everyone is working on.

Done!

Link: https://rust-for-linux.zulipchat.com/#narrow/channel/288089-General/topic/Generic.20I.2FO.20backends/with/571198078

>> This has the advantage that we don't have to duplicate all this infrastructure
>> for I/O memory, DMA, etc.
>>
>> I also think that CoherentSlice is too specific of a type. I'd rather have a
>> generic type, maybe UnsafeSlice or IoSlice, that just uses the I/O backend for
>> accesses.
>
> For me the main appeal of this patchset is that it provides a way to
> work infallibly with a single object or a fixed-size array. I hope
> that's something we can preserve.

Of course, the generic I/O backend infrastructure is based on the distinction
between compile-time and run-time.