[v7] Add dma coherent allocator abstraction

[PATCH v7 2/2] rust: add dma coherent allocator abstraction.

Posted by Abdiel Janulgue 1 year, 2 months ago

Add a simple dma coherent allocator rust abstraction. Based on
Andreas Hindborg's dma abstractions from the rnvme driver, which
was also based on earlier work by Wedson Almeida Filho.

Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
Tested-by: Daniel Almeida <daniel.almeida@collabora.com>
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@gmail.com>
---
 rust/bindings/bindings_helper.h |   1 +
 rust/kernel/dma.rs              | 223 ++++++++++++++++++++++++++++++++
 rust/kernel/lib.rs              |   1 +
 3 files changed, 225 insertions(+)
 create mode 100644 rust/kernel/dma.rs

diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
index 5c4dfe22f41a..49bf713b9bb6 100644
--- a/rust/bindings/bindings_helper.h
+++ b/rust/bindings/bindings_helper.h
@@ -11,6 +11,7 @@
 #include <linux/blk_types.h>
 #include <linux/blkdev.h>
 #include <linux/cred.h>
+#include <linux/dma-mapping.h>
 #include <linux/errname.h>
 #include <linux/ethtool.h>
 #include <linux/file.h>
diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs
new file mode 100644
index 000000000000..29ae744d6f2b
--- /dev/null
+++ b/rust/kernel/dma.rs
@@ -0,0 +1,223 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Direct memory access (DMA).
+//!
+//! C header: [`include/linux/dma-mapping.h`](srctree/include/linux/dma-mapping.h)
+
+use crate::{
+    bindings,
+    build_assert,
+    device::Device,
+    error::code::*,
+    error::Result,
+    types::ARef,
+    transmute::{AsBytes, FromBytes},
+};
+
+/// Possible attributes associated with a DMA mapping.
+///
+/// They can be combined with the operators `|`, `&`, and `!`.
+///
+/// Values can be used from the [`attrs`] module.
+#[derive(Clone, Copy, PartialEq)]
+pub struct Attribs(u32);
+
+impl Attribs {
+    /// Get the raw representation of this attribute.
+    pub(crate) fn as_raw(self) -> u64 {
+        self.0.into()
+    }
+
+    /// Check whether `flags` is contained in `self`.
+    pub fn contains(self, flags: Attribs) -> bool {
+        (self & flags) == flags
+    }
+}
+
+impl core::ops::BitOr for Attribs {
+    type Output = Self;
+    fn bitor(self, rhs: Self) -> Self::Output {
+        Self(self.0 | rhs.0)
+    }
+}
+
+impl core::ops::BitAnd for Attribs {
+    type Output = Self;
+    fn bitand(self, rhs: Self) -> Self::Output {
+        Self(self.0 & rhs.0)
+    }
+}
+
+impl core::ops::Not for Attribs {
+    type Output = Self;
+    fn not(self) -> Self::Output {
+        Self(!self.0)
+    }
+}
+
+/// DMA mapping attrributes.
+pub mod attrs {
+    use super::Attribs;
+
+    /// Specifies that reads and writes to the mapping may be weakly ordered, that is that reads
+    /// and writes may pass each other.
+    pub const DMA_ATTR_WEAK_ORDERING: Attribs = Attribs(bindings::DMA_ATTR_WEAK_ORDERING);
+
+    /// Specifies that writes to the mapping may be buffered to improve performance.
+    pub const DMA_ATTR_WRITE_COMBINE: Attribs = Attribs(bindings::DMA_ATTR_WRITE_COMBINE);
+
+    /// Lets the platform to avoid creating a kernel virtual mapping for the allocated buffer.
+    pub const DMA_ATTR_NO_KERNEL_MAPPING: Attribs = Attribs(bindings::DMA_ATTR_NO_KERNEL_MAPPING);
+
+    /// Allows platform code to skip synchronization of the CPU cache for the given buffer assuming
+    /// that it has been already transferred to 'device' domain.
+    pub const DMA_ATTR_SKIP_CPU_SYNC: Attribs = Attribs(bindings::DMA_ATTR_SKIP_CPU_SYNC);
+
+    /// Forces contiguous allocation of the buffer in physical memory.
+    pub const DMA_ATTR_FORCE_CONTIGUOUS: Attribs = Attribs(bindings::DMA_ATTR_FORCE_CONTIGUOUS);
+
+    /// This is a hint to the DMA-mapping subsystem that it's probably not worth the time to try
+    /// to allocate memory to in a way that gives better TLB efficiency.
+    pub const DMA_ATTR_ALLOC_SINGLE_PAGES: Attribs = Attribs(bindings::DMA_ATTR_ALLOC_SINGLE_PAGES);
+
+    /// This tells the DMA-mapping subsystem to suppress allocation failure reports (similarly to
+    /// __GFP_NOWARN).
+    pub const DMA_ATTR_NO_WARN: Attribs = Attribs(bindings::DMA_ATTR_NO_WARN);
+
+    /// Used to indicate that the buffer is fully accessible at an elevated privilege level (and
+    /// ideally inaccessible or at least read-only at lesser-privileged levels).
+    pub const DMA_ATTR_PRIVILEGED: Attribs = Attribs(bindings::DMA_ATTR_PRIVILEGED);
+}
+
+/// An abstraction of the `dma_alloc_coherent` API.
+///
+/// This is an abstraction around the `dma_alloc_coherent` API which is used to allocate and map
+/// large consistent DMA regions.
+///
+/// A [`CoherentAllocation`] instance contains a pointer to the allocated region (in the
+/// processor's virtual address space) and the device address which can be given to the device
+/// as the DMA address base of the region. The region is released once [`CoherentAllocation`]
+/// is dropped.
+///
+/// # Invariants
+///
+/// For the lifetime of an instance of [`CoherentAllocation`], the cpu address is a valid pointer
+/// to an allocated region of consistent memory and we hold a reference to the device.
+pub struct CoherentAllocation<T: AsBytes + FromBytes> {
+    dev: ARef<Device>,
+    dma_handle: bindings::dma_addr_t,
+    count: usize,
+    cpu_addr: *mut T,
+    dma_attrs: Attribs,
+}
+
+impl<T: AsBytes + FromBytes> CoherentAllocation<T> {
+    /// Allocates a region of `size_of::<T> * count` of consistent memory.
+    ///
+    /// # Examples
+    ///
+    /// ```
+    /// use kernel::device::Device;
+    /// use kernel::dma::{attrs::*, CoherentAllocation};
+    ///
+    /// # fn test(dev: &Device) -> Result {
+    /// let c: CoherentAllocation<u64> = CoherentAllocation::alloc_attrs(dev, 4, GFP_KERNEL,
+    ///                                                                  DMA_ATTR_NO_WARN)?;
+    /// # Ok::<(), Error>(()) }
+    /// ```
+    pub fn alloc_attrs(
+        dev: &Device,
+        count: usize,
+        gfp_flags: kernel::alloc::Flags,
+        dma_attrs: Attribs,
+    ) -> Result<CoherentAllocation<T>> {
+        build_assert!(core::mem::size_of::<T>() > 0,
+                      "It doesn't make sense for the allocated type to be a ZST");
+
+        let size = count.checked_mul(core::mem::size_of::<T>()).ok_or(EOVERFLOW)?;
+        let mut dma_handle = 0;
+        // SAFETY: device pointer is guaranteed as valid by invariant on `Device`.
+        // We ensure that we catch the failure on this function and throw an ENOMEM
+        let ret = unsafe {
+            bindings::dma_alloc_attrs(
+                dev.as_raw(),
+                size,
+                &mut dma_handle, gfp_flags.as_raw(),
+                dma_attrs.as_raw(),
+            )
+        };
+        if ret.is_null() {
+            return Err(ENOMEM)
+        }
+        // INVARIANT: We just successfully allocated a coherent region which is accessible for
+        // `count` elements, hence the cpu address is valid. We also hold a refcounted reference
+        // to the device.
+        Ok(Self {
+            dev: dev.into(),
+            dma_handle,
+            count,
+            cpu_addr: ret as *mut T,
+            dma_attrs,
+        })
+    }
+
+    /// Performs the same functionality as `alloc_attrs`, except the `dma_attrs` is 0 by default.
+    pub fn alloc_coherent(dev: &Device,
+                          count: usize,
+                          gfp_flags: kernel::alloc::Flags) -> Result<CoherentAllocation<T>> {
+        CoherentAllocation::alloc_attrs(dev, count, gfp_flags, Attribs(0))
+    }
+
+    /// Returns the base address to the allocated region and the dma handle. The caller takes
+    /// ownership of the returned resources.
+    pub fn into_parts(self) -> (usize, bindings::dma_addr_t) {
+        let ret = (self.cpu_addr as _, self.dma_handle);
+        core::mem::forget(self);
+        ret
+    }
+
+    /// Returns the base address to the allocated region in the CPU's virtual address space.
+    pub fn start_ptr(&self) -> *const T {
+        self.cpu_addr as _
+    }
+
+    /// Returns the base address to the allocated region in the CPU's virtual address space as
+    /// a mutable pointer.
+    pub fn start_ptr_mut(&mut self) -> *mut T {
+        self.cpu_addr
+    }
+
+    /// Returns a DMA handle which may given to the device as the DMA address base of
+    /// the region.
+    pub fn dma_handle(&self) -> bindings::dma_addr_t {
+        self.dma_handle
+    }
+
+    /// Returns the CPU-addressable region as a slice.
+    pub fn cpu_buf(&self) -> &[T]
+    {
+        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
+        // is valid for reads for `self.count * size_of::<T>` bytes.
+        unsafe { core::slice::from_raw_parts(self.cpu_addr, self.count) }
+    }
+
+    /// Performs the same functionality as `cpu_buf`, except that a mutable slice is returned.
+    pub fn cpu_buf_mut(&mut self) -> &mut [T]
+    {
+        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
+        // is valid for reads for `self.count * size_of::<T>` bytes.
+        unsafe { core::slice::from_raw_parts_mut(self.cpu_addr, self.count) }
+    }
+}
+
+impl<T: AsBytes + FromBytes> Drop for CoherentAllocation<T> {
+    fn drop(&mut self) {
+        let size = self.count * core::mem::size_of::<T>();
+        // SAFETY: the device, cpu address, and the dma handle is valid due to the
+        // type invariants on `CoherentAllocation`.
+        unsafe { bindings::dma_free_attrs(self.dev.as_raw(), size,
+                                          self.cpu_addr as _,
+                                          self.dma_handle,
+                                          self.dma_attrs.as_raw(),) }
+    }
+}
diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index e1065a7551a3..6e90ebf5a130 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -35,6 +35,7 @@
 mod build_assert;
 pub mod cred;
 pub mod device;
+pub mod dma;
 pub mod error;
 #[cfg(CONFIG_RUST_FW_LOADER_ABSTRACTIONS)]
 pub mod firmware;
-- 
2.43.0

Re: [PATCH v7 2/2] rust: add dma coherent allocator abstraction.

Posted by Alice Ryhl 1 year, 1 month ago

On Tue, Dec 10, 2024 at 11:16 PM Abdiel Janulgue
<abdiel.janulgue@gmail.com> wrote:
>
> Add a simple dma coherent allocator rust abstraction. Based on
> Andreas Hindborg's dma abstractions from the rnvme driver, which
> was also based on earlier work by Wedson Almeida Filho.
>
> Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
> Tested-by: Daniel Almeida <daniel.almeida@collabora.com>
> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@gmail.com>
> ---
>  rust/bindings/bindings_helper.h |   1 +
>  rust/kernel/dma.rs              | 223 ++++++++++++++++++++++++++++++++
>  rust/kernel/lib.rs              |   1 +
>  3 files changed, 225 insertions(+)
>  create mode 100644 rust/kernel/dma.rs
>
> diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
> index 5c4dfe22f41a..49bf713b9bb6 100644
> --- a/rust/bindings/bindings_helper.h
> +++ b/rust/bindings/bindings_helper.h
> @@ -11,6 +11,7 @@
>  #include <linux/blk_types.h>
>  #include <linux/blkdev.h>
>  #include <linux/cred.h>
> +#include <linux/dma-mapping.h>
>  #include <linux/errname.h>
>  #include <linux/ethtool.h>
>  #include <linux/file.h>
> diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs
> new file mode 100644
> index 000000000000..29ae744d6f2b
> --- /dev/null
> +++ b/rust/kernel/dma.rs
> @@ -0,0 +1,223 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! Direct memory access (DMA).
> +//!
> +//! C header: [`include/linux/dma-mapping.h`](srctree/include/linux/dma-mapping.h)
> +
> +use crate::{
> +    bindings,
> +    build_assert,
> +    device::Device,
> +    error::code::*,
> +    error::Result,
> +    types::ARef,
> +    transmute::{AsBytes, FromBytes},
> +};
> +
> +/// Possible attributes associated with a DMA mapping.
> +///
> +/// They can be combined with the operators `|`, `&`, and `!`.
> +///
> +/// Values can be used from the [`attrs`] module.
> +#[derive(Clone, Copy, PartialEq)]
> +pub struct Attribs(u32);
> +
> +impl Attribs {
> +    /// Get the raw representation of this attribute.
> +    pub(crate) fn as_raw(self) -> u64 {
> +        self.0.into()
> +    }
> +
> +    /// Check whether `flags` is contained in `self`.
> +    pub fn contains(self, flags: Attribs) -> bool {
> +        (self & flags) == flags
> +    }
> +}
> +
> +impl core::ops::BitOr for Attribs {
> +    type Output = Self;
> +    fn bitor(self, rhs: Self) -> Self::Output {
> +        Self(self.0 | rhs.0)
> +    }
> +}
> +
> +impl core::ops::BitAnd for Attribs {
> +    type Output = Self;
> +    fn bitand(self, rhs: Self) -> Self::Output {
> +        Self(self.0 & rhs.0)
> +    }
> +}
> +
> +impl core::ops::Not for Attribs {
> +    type Output = Self;
> +    fn not(self) -> Self::Output {
> +        Self(!self.0)
> +    }
> +}
> +
> +/// DMA mapping attrributes.
> +pub mod attrs {
> +    use super::Attribs;
> +
> +    /// Specifies that reads and writes to the mapping may be weakly ordered, that is that reads
> +    /// and writes may pass each other.
> +    pub const DMA_ATTR_WEAK_ORDERING: Attribs = Attribs(bindings::DMA_ATTR_WEAK_ORDERING);
> +
> +    /// Specifies that writes to the mapping may be buffered to improve performance.
> +    pub const DMA_ATTR_WRITE_COMBINE: Attribs = Attribs(bindings::DMA_ATTR_WRITE_COMBINE);
> +
> +    /// Lets the platform to avoid creating a kernel virtual mapping for the allocated buffer.
> +    pub const DMA_ATTR_NO_KERNEL_MAPPING: Attribs = Attribs(bindings::DMA_ATTR_NO_KERNEL_MAPPING);
> +
> +    /// Allows platform code to skip synchronization of the CPU cache for the given buffer assuming
> +    /// that it has been already transferred to 'device' domain.
> +    pub const DMA_ATTR_SKIP_CPU_SYNC: Attribs = Attribs(bindings::DMA_ATTR_SKIP_CPU_SYNC);
> +
> +    /// Forces contiguous allocation of the buffer in physical memory.
> +    pub const DMA_ATTR_FORCE_CONTIGUOUS: Attribs = Attribs(bindings::DMA_ATTR_FORCE_CONTIGUOUS);
> +
> +    /// This is a hint to the DMA-mapping subsystem that it's probably not worth the time to try
> +    /// to allocate memory to in a way that gives better TLB efficiency.
> +    pub const DMA_ATTR_ALLOC_SINGLE_PAGES: Attribs = Attribs(bindings::DMA_ATTR_ALLOC_SINGLE_PAGES);
> +
> +    /// This tells the DMA-mapping subsystem to suppress allocation failure reports (similarly to
> +    /// __GFP_NOWARN).
> +    pub const DMA_ATTR_NO_WARN: Attribs = Attribs(bindings::DMA_ATTR_NO_WARN);
> +
> +    /// Used to indicate that the buffer is fully accessible at an elevated privilege level (and
> +    /// ideally inaccessible or at least read-only at lesser-privileged levels).
> +    pub const DMA_ATTR_PRIVILEGED: Attribs = Attribs(bindings::DMA_ATTR_PRIVILEGED);
> +}
> +
> +/// An abstraction of the `dma_alloc_coherent` API.
> +///
> +/// This is an abstraction around the `dma_alloc_coherent` API which is used to allocate and map
> +/// large consistent DMA regions.
> +///
> +/// A [`CoherentAllocation`] instance contains a pointer to the allocated region (in the
> +/// processor's virtual address space) and the device address which can be given to the device
> +/// as the DMA address base of the region. The region is released once [`CoherentAllocation`]
> +/// is dropped.
> +///
> +/// # Invariants
> +///
> +/// For the lifetime of an instance of [`CoherentAllocation`], the cpu address is a valid pointer
> +/// to an allocated region of consistent memory and we hold a reference to the device.
> +pub struct CoherentAllocation<T: AsBytes + FromBytes> {
> +    dev: ARef<Device>,
> +    dma_handle: bindings::dma_addr_t,
> +    count: usize,
> +    cpu_addr: *mut T,
> +    dma_attrs: Attribs,
> +}
> +
> +impl<T: AsBytes + FromBytes> CoherentAllocation<T> {
> +    /// Allocates a region of `size_of::<T> * count` of consistent memory.
> +    ///
> +    /// # Examples
> +    ///
> +    /// ```
> +    /// use kernel::device::Device;
> +    /// use kernel::dma::{attrs::*, CoherentAllocation};
> +    ///
> +    /// # fn test(dev: &Device) -> Result {
> +    /// let c: CoherentAllocation<u64> = CoherentAllocation::alloc_attrs(dev, 4, GFP_KERNEL,
> +    ///                                                                  DMA_ATTR_NO_WARN)?;
> +    /// # Ok::<(), Error>(()) }
> +    /// ```
> +    pub fn alloc_attrs(
> +        dev: &Device,
> +        count: usize,
> +        gfp_flags: kernel::alloc::Flags,
> +        dma_attrs: Attribs,
> +    ) -> Result<CoherentAllocation<T>> {
> +        build_assert!(core::mem::size_of::<T>() > 0,
> +                      "It doesn't make sense for the allocated type to be a ZST");
> +
> +        let size = count.checked_mul(core::mem::size_of::<T>()).ok_or(EOVERFLOW)?;
> +        let mut dma_handle = 0;
> +        // SAFETY: device pointer is guaranteed as valid by invariant on `Device`.
> +        // We ensure that we catch the failure on this function and throw an ENOMEM
> +        let ret = unsafe {
> +            bindings::dma_alloc_attrs(
> +                dev.as_raw(),
> +                size,
> +                &mut dma_handle, gfp_flags.as_raw(),
> +                dma_attrs.as_raw(),
> +            )
> +        };
> +        if ret.is_null() {
> +            return Err(ENOMEM)
> +        }
> +        // INVARIANT: We just successfully allocated a coherent region which is accessible for
> +        // `count` elements, hence the cpu address is valid. We also hold a refcounted reference
> +        // to the device.
> +        Ok(Self {
> +            dev: dev.into(),
> +            dma_handle,
> +            count,
> +            cpu_addr: ret as *mut T,
> +            dma_attrs,
> +        })
> +    }
> +
> +    /// Performs the same functionality as `alloc_attrs`, except the `dma_attrs` is 0 by default.
> +    pub fn alloc_coherent(dev: &Device,
> +                          count: usize,
> +                          gfp_flags: kernel::alloc::Flags) -> Result<CoherentAllocation<T>> {
> +        CoherentAllocation::alloc_attrs(dev, count, gfp_flags, Attribs(0))
> +    }

Please run rustfmt.

> +    /// Returns the base address to the allocated region and the dma handle. The caller takes
> +    /// ownership of the returned resources.
> +    pub fn into_parts(self) -> (usize, bindings::dma_addr_t) {
> +        let ret = (self.cpu_addr as _, self.dma_handle);
> +        core::mem::forget(self);
> +        ret
> +    }

Not only does this skip the destructor of `dma_free_attrs`. It also
skips the destructor of fields including the `dev` field. Did you
intend to leave the refcount on `struct device` without decrementing
it?

> +    /// Returns the base address to the allocated region in the CPU's virtual address space.
> +    pub fn start_ptr(&self) -> *const T {
> +        self.cpu_addr as _

This conversion happens without needing a cast.

> +    }
> +
> +    /// Returns the base address to the allocated region in the CPU's virtual address space as
> +    /// a mutable pointer.
> +    pub fn start_ptr_mut(&mut self) -> *mut T {
> +        self.cpu_addr
> +    }
> +
> +    /// Returns a DMA handle which may given to the device as the DMA address base of
> +    /// the region.
> +    pub fn dma_handle(&self) -> bindings::dma_addr_t {
> +        self.dma_handle
> +    }
> +
> +    /// Returns the CPU-addressable region as a slice.
> +    pub fn cpu_buf(&self) -> &[T]
> +    {
> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
> +        // is valid for reads for `self.count * size_of::<T>` bytes.
> +        unsafe { core::slice::from_raw_parts(self.cpu_addr, self.count) }

Immutable slices require that the data does not change while the
reference is live. Is that the case? If so, your safety comment should
explain that.

> +    }
> +
> +    /// Performs the same functionality as `cpu_buf`, except that a mutable slice is returned.
> +    pub fn cpu_buf_mut(&mut self) -> &mut [T]
> +    {
> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
> +        // is valid for reads for `self.count * size_of::<T>` bytes.
> +        unsafe { core::slice::from_raw_parts_mut(self.cpu_addr, self.count) }

Mutable slices require that the data is not written to *or read* by
anybody else while the reference is live. Is that the case? If so,
your safety comment should explain that.

> +    }
> +}
> +
> +impl<T: AsBytes + FromBytes> Drop for CoherentAllocation<T> {
> +    fn drop(&mut self) {
> +        let size = self.count * core::mem::size_of::<T>();
> +        // SAFETY: the device, cpu address, and the dma handle is valid due to the
> +        // type invariants on `CoherentAllocation`.
> +        unsafe { bindings::dma_free_attrs(self.dev.as_raw(), size,
> +                                          self.cpu_addr as _,
> +                                          self.dma_handle,
> +                                          self.dma_attrs.as_raw(),) }
> +    }
> +}
> diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
> index e1065a7551a3..6e90ebf5a130 100644
> --- a/rust/kernel/lib.rs
> +++ b/rust/kernel/lib.rs
> @@ -35,6 +35,7 @@
>  mod build_assert;
>  pub mod cred;
>  pub mod device;
> +pub mod dma;
>  pub mod error;
>  #[cfg(CONFIG_RUST_FW_LOADER_ABSTRACTIONS)]
>  pub mod firmware;
> --
> 2.43.0
>

Re: [PATCH v7 2/2] rust: add dma coherent allocator abstraction.

Posted by Abdiel Janulgue 1 year, 1 month ago

On 13/12/2024 16:27, Alice Ryhl wrote:
> 
>> +    /// Returns the base address to the allocated region and the dma handle. The caller takes
>> +    /// ownership of the returned resources.
>> +    pub fn into_parts(self) -> (usize, bindings::dma_addr_t) {
>> +        let ret = (self.cpu_addr as _, self.dma_handle);
>> +        core::mem::forget(self);
>> +        ret
>> +    }
> 
> Not only does this skip the destructor of `dma_free_attrs`. It also
> skips the destructor of fields including the `dev` field. Did you
> intend to leave the refcount on `struct device` without decrementing
> it?
> 

Good catch. Yeah dev should be decremented here as well.
Will incorporate fix into next revision.

Regards,
Abdiel

Re: [PATCH v7 2/2] rust: add dma coherent allocator abstraction.

Posted by Daniel Almeida 1 year, 1 month ago


> On 13 Dec 2024, at 11:27, Alice Ryhl <aliceryhl@google.com> wrote:
> 
> On Tue, Dec 10, 2024 at 11:16 PM Abdiel Janulgue
> <abdiel.janulgue@gmail.com> wrote:
>> 
>> Add a simple dma coherent allocator rust abstraction. Based on
>> Andreas Hindborg's dma abstractions from the rnvme driver, which
>> was also based on earlier work by Wedson Almeida Filho.
>> 
>> Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
>> Tested-by: Daniel Almeida <daniel.almeida@collabora.com>
>> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@gmail.com>
>> ---
>> rust/bindings/bindings_helper.h |   1 +
>> rust/kernel/dma.rs              | 223 ++++++++++++++++++++++++++++++++
>> rust/kernel/lib.rs              |   1 +
>> 3 files changed, 225 insertions(+)
>> create mode 100644 rust/kernel/dma.rs
>> 
>> diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
>> index 5c4dfe22f41a..49bf713b9bb6 100644
>> --- a/rust/bindings/bindings_helper.h
>> +++ b/rust/bindings/bindings_helper.h
>> @@ -11,6 +11,7 @@
>> #include <linux/blk_types.h>
>> #include <linux/blkdev.h>
>> #include <linux/cred.h>
>> +#include <linux/dma-mapping.h>
>> #include <linux/errname.h>
>> #include <linux/ethtool.h>
>> #include <linux/file.h>
>> diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs
>> new file mode 100644
>> index 000000000000..29ae744d6f2b
>> --- /dev/null
>> +++ b/rust/kernel/dma.rs
>> @@ -0,0 +1,223 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +
>> +//! Direct memory access (DMA).
>> +//!
>> +//! C header: [`include/linux/dma-mapping.h`](srctree/include/linux/dma-mapping.h)
>> +
>> +use crate::{
>> +    bindings,
>> +    build_assert,
>> +    device::Device,
>> +    error::code::*,
>> +    error::Result,
>> +    types::ARef,
>> +    transmute::{AsBytes, FromBytes},
>> +};
>> +
>> +/// Possible attributes associated with a DMA mapping.
>> +///
>> +/// They can be combined with the operators `|`, `&`, and `!`.
>> +///
>> +/// Values can be used from the [`attrs`] module.
>> +#[derive(Clone, Copy, PartialEq)]
>> +pub struct Attribs(u32);
>> +
>> +impl Attribs {
>> +    /// Get the raw representation of this attribute.
>> +    pub(crate) fn as_raw(self) -> u64 {
>> +        self.0.into()
>> +    }
>> +
>> +    /// Check whether `flags` is contained in `self`.
>> +    pub fn contains(self, flags: Attribs) -> bool {
>> +        (self & flags) == flags
>> +    }
>> +}
>> +
>> +impl core::ops::BitOr for Attribs {
>> +    type Output = Self;
>> +    fn bitor(self, rhs: Self) -> Self::Output {
>> +        Self(self.0 | rhs.0)
>> +    }
>> +}
>> +
>> +impl core::ops::BitAnd for Attribs {
>> +    type Output = Self;
>> +    fn bitand(self, rhs: Self) -> Self::Output {
>> +        Self(self.0 & rhs.0)
>> +    }
>> +}
>> +
>> +impl core::ops::Not for Attribs {
>> +    type Output = Self;
>> +    fn not(self) -> Self::Output {
>> +        Self(!self.0)
>> +    }
>> +}
>> +
>> +/// DMA mapping attrributes.
>> +pub mod attrs {
>> +    use super::Attribs;
>> +
>> +    /// Specifies that reads and writes to the mapping may be weakly ordered, that is that reads
>> +    /// and writes may pass each other.
>> +    pub const DMA_ATTR_WEAK_ORDERING: Attribs = Attribs(bindings::DMA_ATTR_WEAK_ORDERING);
>> +
>> +    /// Specifies that writes to the mapping may be buffered to improve performance.
>> +    pub const DMA_ATTR_WRITE_COMBINE: Attribs = Attribs(bindings::DMA_ATTR_WRITE_COMBINE);
>> +
>> +    /// Lets the platform to avoid creating a kernel virtual mapping for the allocated buffer.
>> +    pub const DMA_ATTR_NO_KERNEL_MAPPING: Attribs = Attribs(bindings::DMA_ATTR_NO_KERNEL_MAPPING);
>> +
>> +    /// Allows platform code to skip synchronization of the CPU cache for the given buffer assuming
>> +    /// that it has been already transferred to 'device' domain.
>> +    pub const DMA_ATTR_SKIP_CPU_SYNC: Attribs = Attribs(bindings::DMA_ATTR_SKIP_CPU_SYNC);
>> +
>> +    /// Forces contiguous allocation of the buffer in physical memory.
>> +    pub const DMA_ATTR_FORCE_CONTIGUOUS: Attribs = Attribs(bindings::DMA_ATTR_FORCE_CONTIGUOUS);
>> +
>> +    /// This is a hint to the DMA-mapping subsystem that it's probably not worth the time to try
>> +    /// to allocate memory to in a way that gives better TLB efficiency.
>> +    pub const DMA_ATTR_ALLOC_SINGLE_PAGES: Attribs = Attribs(bindings::DMA_ATTR_ALLOC_SINGLE_PAGES);
>> +
>> +    /// This tells the DMA-mapping subsystem to suppress allocation failure reports (similarly to
>> +    /// __GFP_NOWARN).
>> +    pub const DMA_ATTR_NO_WARN: Attribs = Attribs(bindings::DMA_ATTR_NO_WARN);
>> +
>> +    /// Used to indicate that the buffer is fully accessible at an elevated privilege level (and
>> +    /// ideally inaccessible or at least read-only at lesser-privileged levels).
>> +    pub const DMA_ATTR_PRIVILEGED: Attribs = Attribs(bindings::DMA_ATTR_PRIVILEGED);
>> +}
>> +
>> +/// An abstraction of the `dma_alloc_coherent` API.
>> +///
>> +/// This is an abstraction around the `dma_alloc_coherent` API which is used to allocate and map
>> +/// large consistent DMA regions.
>> +///
>> +/// A [`CoherentAllocation`] instance contains a pointer to the allocated region (in the
>> +/// processor's virtual address space) and the device address which can be given to the device
>> +/// as the DMA address base of the region. The region is released once [`CoherentAllocation`]
>> +/// is dropped.
>> +///
>> +/// # Invariants
>> +///
>> +/// For the lifetime of an instance of [`CoherentAllocation`], the cpu address is a valid pointer
>> +/// to an allocated region of consistent memory and we hold a reference to the device.
>> +pub struct CoherentAllocation<T: AsBytes + FromBytes> {
>> +    dev: ARef<Device>,
>> +    dma_handle: bindings::dma_addr_t,
>> +    count: usize,
>> +    cpu_addr: *mut T,
>> +    dma_attrs: Attribs,
>> +}
>> +
>> +impl<T: AsBytes + FromBytes> CoherentAllocation<T> {
>> +    /// Allocates a region of `size_of::<T> * count` of consistent memory.
>> +    ///
>> +    /// # Examples
>> +    ///
>> +    /// ```
>> +    /// use kernel::device::Device;
>> +    /// use kernel::dma::{attrs::*, CoherentAllocation};
>> +    ///
>> +    /// # fn test(dev: &Device) -> Result {
>> +    /// let c: CoherentAllocation<u64> = CoherentAllocation::alloc_attrs(dev, 4, GFP_KERNEL,
>> +    ///                                                                  DMA_ATTR_NO_WARN)?;
>> +    /// # Ok::<(), Error>(()) }
>> +    /// ```
>> +    pub fn alloc_attrs(
>> +        dev: &Device,
>> +        count: usize,
>> +        gfp_flags: kernel::alloc::Flags,
>> +        dma_attrs: Attribs,
>> +    ) -> Result<CoherentAllocation<T>> {
>> +        build_assert!(core::mem::size_of::<T>() > 0,
>> +                      "It doesn't make sense for the allocated type to be a ZST");
>> +
>> +        let size = count.checked_mul(core::mem::size_of::<T>()).ok_or(EOVERFLOW)?;
>> +        let mut dma_handle = 0;
>> +        // SAFETY: device pointer is guaranteed as valid by invariant on `Device`.
>> +        // We ensure that we catch the failure on this function and throw an ENOMEM
>> +        let ret = unsafe {
>> +            bindings::dma_alloc_attrs(
>> +                dev.as_raw(),
>> +                size,
>> +                &mut dma_handle, gfp_flags.as_raw(),
>> +                dma_attrs.as_raw(),
>> +            )
>> +        };
>> +        if ret.is_null() {
>> +            return Err(ENOMEM)
>> +        }
>> +        // INVARIANT: We just successfully allocated a coherent region which is accessible for
>> +        // `count` elements, hence the cpu address is valid. We also hold a refcounted reference
>> +        // to the device.
>> +        Ok(Self {
>> +            dev: dev.into(),
>> +            dma_handle,
>> +            count,
>> +            cpu_addr: ret as *mut T,
>> +            dma_attrs,
>> +        })
>> +    }
>> +
>> +    /// Performs the same functionality as `alloc_attrs`, except the `dma_attrs` is 0 by default.
>> +    pub fn alloc_coherent(dev: &Device,
>> +                          count: usize,
>> +                          gfp_flags: kernel::alloc::Flags) -> Result<CoherentAllocation<T>> {
>> +        CoherentAllocation::alloc_attrs(dev, count, gfp_flags, Attribs(0))
>> +    }
> 
> Please run rustfmt.
> 
>> +    /// Returns the base address to the allocated region and the dma handle. The caller takes
>> +    /// ownership of the returned resources.
>> +    pub fn into_parts(self) -> (usize, bindings::dma_addr_t) {
>> +        let ret = (self.cpu_addr as _, self.dma_handle);
>> +        core::mem::forget(self);
>> +        ret
>> +    }
> 
> Not only does this skip the destructor of `dma_free_attrs`. It also
> skips the destructor of fields including the `dev` field. Did you
> intend to leave the refcount on `struct device` without decrementing
> it?
> 
>> +    /// Returns the base address to the allocated region in the CPU's virtual address space.
>> +    pub fn start_ptr(&self) -> *const T {
>> +        self.cpu_addr as _
> 
> This conversion happens without needing a cast.
> 
>> +    }
>> +
>> +    /// Returns the base address to the allocated region in the CPU's virtual address space as
>> +    /// a mutable pointer.
>> +    pub fn start_ptr_mut(&mut self) -> *mut T {
>> +        self.cpu_addr
>> +    }
>> +
>> +    /// Returns a DMA handle which may given to the device as the DMA address base of
>> +    /// the region.
>> +    pub fn dma_handle(&self) -> bindings::dma_addr_t {
>> +        self.dma_handle
>> +    }
>> +
>> +    /// Returns the CPU-addressable region as a slice.
>> +    pub fn cpu_buf(&self) -> &[T]
>> +    {
>> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
>> +        // is valid for reads for `self.count * size_of::<T>` bytes.
>> +        unsafe { core::slice::from_raw_parts(self.cpu_addr, self.count) }
> 
> Immutable slices require that the data does not change while the
> reference is live. Is that the case? If so, your safety comment should
> explain that.
> 
>> +    }
>> +
>> +    /// Performs the same functionality as `cpu_buf`, except that a mutable slice is returned.
>> +    pub fn cpu_buf_mut(&mut self) -> &mut [T]
>> +    {
>> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
>> +        // is valid for reads for `self.count * size_of::<T>` bytes.
>> +        unsafe { core::slice::from_raw_parts_mut(self.cpu_addr, self.count) }
> 
> Mutable slices require that the data is not written to *or read* by
> anybody else while the reference is live. Is that the case? If so,
> your safety comment should explain that.
> 

The buffer will probably be shared between the CPU and some hardware device, since this is the
point of the dma mapping API.

It’s up to the caller to ensure that no hardware operations that involve the buffer are currently taking
place while the slices above are alive.

I think that adding that as a safety requirement to `cpu_buf` and `cpu_buf_mut` will be sufficient.

>> +    }
>> +}
>> +
>> +impl<T: AsBytes + FromBytes> Drop for CoherentAllocation<T> {
>> +    fn drop(&mut self) {
>> +        let size = self.count * core::mem::size_of::<T>();
>> +        // SAFETY: the device, cpu address, and the dma handle is valid due to the
>> +        // type invariants on `CoherentAllocation`.
>> +        unsafe { bindings::dma_free_attrs(self.dev.as_raw(), size,
>> +                                          self.cpu_addr as _,
>> +                                          self.dma_handle,
>> +                                          self.dma_attrs.as_raw(),) }
>> +    }
>> +}
>> diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
>> index e1065a7551a3..6e90ebf5a130 100644
>> --- a/rust/kernel/lib.rs
>> +++ b/rust/kernel/lib.rs
>> @@ -35,6 +35,7 @@
>> mod build_assert;
>> pub mod cred;
>> pub mod device;
>> +pub mod dma;
>> pub mod error;
>> #[cfg(CONFIG_RUST_FW_LOADER_ABSTRACTIONS)]
>> pub mod firmware;
>> —
>> 2.43.0

— Daniel

Re: [PATCH v7 2/2] rust: add dma coherent allocator abstraction.

Posted by Robin Murphy 1 year, 1 month ago

On 13/12/2024 2:47 pm, Daniel Almeida wrote:
[...]
>>> +    /// Returns the CPU-addressable region as a slice.
>>> +    pub fn cpu_buf(&self) -> &[T]
>>> +    {
>>> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
>>> +        // is valid for reads for `self.count * size_of::<T>` bytes.
>>> +        unsafe { core::slice::from_raw_parts(self.cpu_addr, self.count) }
>>
>> Immutable slices require that the data does not change while the
>> reference is live. Is that the case? If so, your safety comment should
>> explain that.
>>
>>> +    }
>>> +
>>> +    /// Performs the same functionality as `cpu_buf`, except that a mutable slice is returned.
>>> +    pub fn cpu_buf_mut(&mut self) -> &mut [T]
>>> +    {
>>> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
>>> +        // is valid for reads for `self.count * size_of::<T>` bytes.
>>> +        unsafe { core::slice::from_raw_parts_mut(self.cpu_addr, self.count) }
>>
>> Mutable slices require that the data is not written to *or read* by
>> anybody else while the reference is live. Is that the case? If so,
>> your safety comment should explain that.
>>
> 
> The buffer will probably be shared between the CPU and some hardware device, since this is the
> point of the dma mapping API.
> 
> It’s up to the caller to ensure that no hardware operations that involve the buffer are currently taking
> place while the slices above are alive.

Hmm, that sounds troublesome... the nature of coherent allocations is 
that both CPU and device may access them at any time, and you can 
definitely expect ringbuffer-style usage models where a CPU is writing 
to part of the buffer while the device is reading/writing another part, 
but also cases where a CPU needs to poll for a device write to a 
particular location.

Thanks,
Robin.

> I think that adding that as a safety requirement to `cpu_buf` and `cpu_buf_mut` will be sufficient.
> 
>>> +    }
>>> +}
>>> +
>>> +impl<T: AsBytes + FromBytes> Drop for CoherentAllocation<T> {
>>> +    fn drop(&mut self) {
>>> +        let size = self.count * core::mem::size_of::<T>();
>>> +        // SAFETY: the device, cpu address, and the dma handle is valid due to the
>>> +        // type invariants on `CoherentAllocation`.
>>> +        unsafe { bindings::dma_free_attrs(self.dev.as_raw(), size,
>>> +                                          self.cpu_addr as _,
>>> +                                          self.dma_handle,
>>> +                                          self.dma_attrs.as_raw(),) }
>>> +    }
>>> +}
>>> diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
>>> index e1065a7551a3..6e90ebf5a130 100644
>>> --- a/rust/kernel/lib.rs
>>> +++ b/rust/kernel/lib.rs
>>> @@ -35,6 +35,7 @@
>>> mod build_assert;
>>> pub mod cred;
>>> pub mod device;
>>> +pub mod dma;
>>> pub mod error;
>>> #[cfg(CONFIG_RUST_FW_LOADER_ABSTRACTIONS)]
>>> pub mod firmware;
>>> —
>>> 2.43.0
> 
> — Daniel

Re: [PATCH v7 2/2] rust: add dma coherent allocator abstraction.

Posted by Daniel Almeida 1 year, 1 month ago

Hi Robin,

> On 13 Dec 2024, at 12:28, Robin Murphy <robin.murphy@arm.com> wrote:
> 
> On 13/12/2024 2:47 pm, Daniel Almeida wrote:
> [...]
>>>> +    /// Returns the CPU-addressable region as a slice.
>>>> +    pub fn cpu_buf(&self) -> &[T]
>>>> +    {
>>>> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
>>>> +        // is valid for reads for `self.count * size_of::<T>` bytes.
>>>> +        unsafe { core::slice::from_raw_parts(self.cpu_addr, self.count) }
>>> 
>>> Immutable slices require that the data does not change while the
>>> reference is live. Is that the case? If so, your safety comment should
>>> explain that.
>>> 
>>>> +    }
>>>> +
>>>> +    /// Performs the same functionality as `cpu_buf`, except that a mutable slice is returned.
>>>> +    pub fn cpu_buf_mut(&mut self) -> &mut [T]
>>>> +    {
>>>> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
>>>> +        // is valid for reads for `self.count * size_of::<T>` bytes.
>>>> +        unsafe { core::slice::from_raw_parts_mut(self.cpu_addr, self.count) }
>>> 
>>> Mutable slices require that the data is not written to *or read* by
>>> anybody else while the reference is live. Is that the case? If so,
>>> your safety comment should explain that.
>>> 
>> The buffer will probably be shared between the CPU and some hardware device, since this is the
>> point of the dma mapping API.
>> It’s up to the caller to ensure that no hardware operations that involve the buffer are currently taking
>> place while the slices above are alive.
> 
> Hmm, that sounds troublesome... the nature of coherent allocations is that both CPU and device may access them at any time, and you can definitely expect ringbuffer-style usage models where a CPU is writing to part of the buffer while the device is reading/writing another part, but also cases where a CPU needs to poll for a device write to a particular location.
> 

Ok, I had based my answer on some other drivers I’ve worked on in the past where the approach I cited would work.

I can see it not working for what you described, though.

This is a bit unfortunate, because it means we are back to square one, i.e.: back to read() and write() functions and
to the bound on `Copy`. That’s because, no matter how you try to dress this, there is no way to give safe and direct access
to the underlying memory if you can’t avoid situations where both the CPU and the device will be accessing the memory
at the same time.

I guess the only improvement that could be made over the approach used for v2 is to at least use copy_nonoverlapping
instead, because I assume the performance of things like:

+ /// Reads a value on a location specified by index.
+ pub fn read(&self, index: usize) -> Result<T>
+ where
+ T: Copy
+ {
+ if let Some(val) = self.cpu_buf().get(index) {
+ Ok(*val)
+ } else {
+ Err(EINVAL)
+ }
+ }

will be atrocious.


> Thanks,
> Robin.
> 
>> I think that adding that as a safety requirement to `cpu_buf` and `cpu_buf_mut` will be sufficient.
>>>> +    }
>>>> +}
>>>> +
>>>> +impl<T: AsBytes + FromBytes> Drop for CoherentAllocation<T> {
>>>> +    fn drop(&mut self) {
>>>> +        let size = self.count * core::mem::size_of::<T>();
>>>> +        // SAFETY: the device, cpu address, and the dma handle is valid due to the
>>>> +        // type invariants on `CoherentAllocation`.
>>>> +        unsafe { bindings::dma_free_attrs(self.dev.as_raw(), size,
>>>> +                                          self.cpu_addr as _,
>>>> +                                          self.dma_handle,
>>>> +                                          self.dma_attrs.as_raw(),) }
>>>> +    }
>>>> +}
>>>> diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
>>>> index e1065a7551a3..6e90ebf5a130 100644
>>>> --- a/rust/kernel/lib.rs
>>>> +++ b/rust/kernel/lib.rs
>>>> @@ -35,6 +35,7 @@
>>>> mod build_assert;
>>>> pub mod cred;
>>>> pub mod device;
>>>> +pub mod dma;
>>>> pub mod error;
>>>> #[cfg(CONFIG_RUST_FW_LOADER_ABSTRACTIONS)]
>>>> pub mod firmware;
>>>> —
>>>> 2.43.0
>> — Daniel

Re: [PATCH v7 2/2] rust: add dma coherent allocator abstraction.

Posted by Abdiel Janulgue 1 year, 1 month ago


On 13/12/2024 21:08, Daniel Almeida wrote:
> Hi Robin,
> 
>> On 13 Dec 2024, at 12:28, Robin Murphy <robin.murphy@arm.com> wrote:
>>
>> On 13/12/2024 2:47 pm, Daniel Almeida wrote:
>> [...]
>>>>> +    /// Returns the CPU-addressable region as a slice.
>>>>> +    pub fn cpu_buf(&self) -> &[T]
>>>>> +    {
>>>>> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
>>>>> +        // is valid for reads for `self.count * size_of::<T>` bytes.
>>>>> +        unsafe { core::slice::from_raw_parts(self.cpu_addr, self.count) }
>>>>
>>>> Immutable slices require that the data does not change while the
>>>> reference is live. Is that the case? If so, your safety comment should
>>>> explain that.
>>>>
>>>>> +    }
>>>>> +
>>>>> +    /// Performs the same functionality as `cpu_buf`, except that a mutable slice is returned.
>>>>> +    pub fn cpu_buf_mut(&mut self) -> &mut [T]
>>>>> +    {
>>>>> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
>>>>> +        // is valid for reads for `self.count * size_of::<T>` bytes.
>>>>> +        unsafe { core::slice::from_raw_parts_mut(self.cpu_addr, self.count) }
>>>>
>>>> Mutable slices require that the data is not written to *or read* by
>>>> anybody else while the reference is live. Is that the case? If so,
>>>> your safety comment should explain that.
>>>>
>>> The buffer will probably be shared between the CPU and some hardware device, since this is the
>>> point of the dma mapping API.
>>> It’s up to the caller to ensure that no hardware operations that involve the buffer are currently taking
>>> place while the slices above are alive.
>>
>> Hmm, that sounds troublesome... the nature of coherent allocations is that both CPU and device may access them at any time, and you can definitely expect ringbuffer-style usage models where a CPU is writing to part of the buffer while the device is reading/writing another part, but also cases where a CPU needs to poll for a device write to a particular location.
>>
> 
> Ok, I had based my answer on some other drivers I’ve worked on in the past where the approach I cited would work.
> 
> I can see it not working for what you described, though.
> 
> This is a bit unfortunate, because it means we are back to square one, i.e.: back to read() and write() functions and
> to the bound on `Copy`. That’s because, no matter how you try to dress this, there is no way to give safe and direct access
> to the underlying memory if you can’t avoid situations where both the CPU and the device will be accessing the memory
> at the same time.
> 

This is unfortunate indeed. Thanks Alice for pointing out the 
limitations of slice.

Btw, do we have any other concerns in going back to plain old raw 
pointers instead? i.e.,

     pub fn read(&self, index: usize) -> Result<T> {
         if index >= self.count {
             return Err(EINVAL);
         }

         let ptr = self.cpu_addr.wrapping_add(index);
         // SAFETY: We just checked that the index is within bounds.
         Ok(unsafe { ptr.read() })
     }

     pub fn write(&self, index: usize, value: &T) -> Result
     where
         T: Copy,
     {
         if index >= self.count {
             return Err(EINVAL);
         }

         let ptr = self.cpu_addr.wrapping_add(index);
         // SAFETY: We just checked that the index is within bounds.
         unsafe { ptr.write(*value) };
         Ok(())
     }

> I guess the only improvement that could be made over the approach used for v2 is to at least use copy_nonoverlapping
> instead, 

You mean introduce something like read_raw(dst: *mut u8,...) and 
write_raw(&self, src: *const u8,...)?

Regards,
Abdiel

Re: [PATCH v7 2/2] rust: add dma coherent allocator abstraction.

Posted by Daniel Almeida 1 year, 1 month ago


> On 16 Dec 2024, at 07:23, Abdiel Janulgue <abdiel.janulgue@gmail.com> wrote:
> 
> 
> 
> On 13/12/2024 21:08, Daniel Almeida wrote:
>> Hi Robin,
>>> On 13 Dec 2024, at 12:28, Robin Murphy <robin.murphy@arm.com> wrote:
>>> 
>>> On 13/12/2024 2:47 pm, Daniel Almeida wrote:
>>> [...]
>>>>>> +    /// Returns the CPU-addressable region as a slice.
>>>>>> +    pub fn cpu_buf(&self) -> &[T]
>>>>>> +    {
>>>>>> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
>>>>>> +        // is valid for reads for `self.count * size_of::<T>` bytes.
>>>>>> +        unsafe { core::slice::from_raw_parts(self.cpu_addr, self.count) }
>>>>> 
>>>>> Immutable slices require that the data does not change while the
>>>>> reference is live. Is that the case? If so, your safety comment should
>>>>> explain that.
>>>>> 
>>>>>> +    }
>>>>>> +
>>>>>> +    /// Performs the same functionality as `cpu_buf`, except that a mutable slice is returned.
>>>>>> +    pub fn cpu_buf_mut(&mut self) -> &mut [T]
>>>>>> +    {
>>>>>> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
>>>>>> +        // is valid for reads for `self.count * size_of::<T>` bytes.
>>>>>> +        unsafe { core::slice::from_raw_parts_mut(self.cpu_addr, self.count) }
>>>>> 
>>>>> Mutable slices require that the data is not written to *or read* by
>>>>> anybody else while the reference is live. Is that the case? If so,
>>>>> your safety comment should explain that.
>>>>> 
>>>> The buffer will probably be shared between the CPU and some hardware device, since this is the
>>>> point of the dma mapping API.
>>>> It’s up to the caller to ensure that no hardware operations that involve the buffer are currently taking
>>>> place while the slices above are alive.
>>> 
>>> Hmm, that sounds troublesome... the nature of coherent allocations is that both CPU and device may access them at any time, and you can definitely expect ringbuffer-style usage models where a CPU is writing to part of the buffer while the device is reading/writing another part, but also cases where a CPU needs to poll for a device write to a particular location.
>>> 
>> Ok, I had based my answer on some other drivers I’ve worked on in the past where the approach I cited would work.
>> I can see it not working for what you described, though.
>> This is a bit unfortunate, because it means we are back to square one, i.e.: back to read() and write() functions and
>> to the bound on `Copy`. That’s because, no matter how you try to dress this, there is no way to give safe and direct access
>> to the underlying memory if you can’t avoid situations where both the CPU and the device will be accessing the memory
>> at the same time.
> 
> This is unfortunate indeed. Thanks Alice for pointing out the limitations of slice.
> 
> Btw, do we have any other concerns in going back to plain old raw pointers instead? i.e.,
> 
>    pub fn read(&self, index: usize) -> Result<T> {
>        if index >= self.count {
>            return Err(EINVAL);
>        }
> 
>        let ptr = self.cpu_addr.wrapping_add(index);
>        // SAFETY: We just checked that the index is within bounds.
>        Ok(unsafe { ptr.read() })
>    }
> 
>    pub fn write(&self, index: usize, value: &T) -> Result
>    where
>        T: Copy,
>    {
>        if index >= self.count {
>            return Err(EINVAL);
>        }
> 
>        let ptr = self.cpu_addr.wrapping_add(index);
>        // SAFETY: We just checked that the index is within bounds.
>        unsafe { ptr.write(*value) };
>        Ok(())
>    }
> 
>> I guess the only improvement that could be made over the approach used for v2 is to at least use copy_nonoverlapping
>> instead, 
> 
> You mean introduce something like read_raw(dst: *mut u8,...) and write_raw(&self, src: *const u8,...)?


What I mean is that you don’t have to read (and bounds-check) a single index per function call.

Using copy_nonoverlapping lets you read/write multiple values with only a single function call and a
single check. It’s basically equivalent to a memcpy.


> 
> Regards,
> Abdiel

Re: [PATCH v7 2/2] rust: add dma coherent allocator abstraction.

Posted by Daniel Almeida 1 year, 1 month ago

Hi Abdiel,

> On 10 Dec 2024, at 19:14, Abdiel Janulgue <abdiel.janulgue@gmail.com> wrote:
> 
> Add a simple dma coherent allocator rust abstraction. Based on
> Andreas Hindborg's dma abstractions from the rnvme driver, which
> was also based on earlier work by Wedson Almeida Filho.
> 
> Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
> Tested-by: Daniel Almeida <daniel.almeida@collabora.com>
> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@gmail.com>
> ---
> rust/bindings/bindings_helper.h |   1 +
> rust/kernel/dma.rs              | 223 ++++++++++++++++++++++++++++++++
> rust/kernel/lib.rs              |   1 +
> 3 files changed, 225 insertions(+)
> create mode 100644 rust/kernel/dma.rs
> 
> diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
> index 5c4dfe22f41a..49bf713b9bb6 100644
> --- a/rust/bindings/bindings_helper.h
> +++ b/rust/bindings/bindings_helper.h
> @@ -11,6 +11,7 @@
> #include <linux/blk_types.h>
> #include <linux/blkdev.h>
> #include <linux/cred.h>
> +#include <linux/dma-mapping.h>
> #include <linux/errname.h>
> #include <linux/ethtool.h>
> #include <linux/file.h>
> diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs
> new file mode 100644
> index 000000000000..29ae744d6f2b
> --- /dev/null
> +++ b/rust/kernel/dma.rs
> @@ -0,0 +1,223 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! Direct memory access (DMA).
> +//!
> +//! C header: [`include/linux/dma-mapping.h`](srctree/include/linux/dma-mapping.h)
> +
> +use crate::{
> +    bindings,
> +    build_assert,
> +    device::Device,
> +    error::code::*,
> +    error::Result,
> +    types::ARef,
> +    transmute::{AsBytes, FromBytes},
> +};
> +
> +/// Possible attributes associated with a DMA mapping.
> +///
> +/// They can be combined with the operators `|`, `&`, and `!`.
> +///
> +/// Values can be used from the [`attrs`] module.
> +#[derive(Clone, Copy, PartialEq)]
> +pub struct Attribs(u32);
> +
> +impl Attribs {
> +    /// Get the raw representation of this attribute.
> +    pub(crate) fn as_raw(self) -> u64 {
> +        self.0.into()
> +    }
> +
> +    /// Check whether `flags` is contained in `self`.
> +    pub fn contains(self, flags: Attribs) -> bool {
> +        (self & flags) == flags
> +    }
> +}
> +
> +impl core::ops::BitOr for Attribs {
> +    type Output = Self;
> +    fn bitor(self, rhs: Self) -> Self::Output {
> +        Self(self.0 | rhs.0)
> +    }
> +}
> +
> +impl core::ops::BitAnd for Attribs {
> +    type Output = Self;
> +    fn bitand(self, rhs: Self) -> Self::Output {
> +        Self(self.0 & rhs.0)
> +    }
> +}
> +
> +impl core::ops::Not for Attribs {
> +    type Output = Self;
> +    fn not(self) -> Self::Output {
> +        Self(!self.0)
> +    }
> +}
> +
> +/// DMA mapping attrributes.
> +pub mod attrs {
> +    use super::Attribs;
> +
> +    /// Specifies that reads and writes to the mapping may be weakly ordered, that is that reads
> +    /// and writes may pass each other.
> +    pub const DMA_ATTR_WEAK_ORDERING: Attribs = Attribs(bindings::DMA_ATTR_WEAK_ORDERING);
> +
> +    /// Specifies that writes to the mapping may be buffered to improve performance.
> +    pub const DMA_ATTR_WRITE_COMBINE: Attribs = Attribs(bindings::DMA_ATTR_WRITE_COMBINE);
> +
> +    /// Lets the platform to avoid creating a kernel virtual mapping for the allocated buffer.
> +    pub const DMA_ATTR_NO_KERNEL_MAPPING: Attribs = Attribs(bindings::DMA_ATTR_NO_KERNEL_MAPPING);
> +
> +    /// Allows platform code to skip synchronization of the CPU cache for the given buffer assuming
> +    /// that it has been already transferred to 'device' domain.
> +    pub const DMA_ATTR_SKIP_CPU_SYNC: Attribs = Attribs(bindings::DMA_ATTR_SKIP_CPU_SYNC);
> +
> +    /// Forces contiguous allocation of the buffer in physical memory.
> +    pub const DMA_ATTR_FORCE_CONTIGUOUS: Attribs = Attribs(bindings::DMA_ATTR_FORCE_CONTIGUOUS);
> +
> +    /// This is a hint to the DMA-mapping subsystem that it's probably not worth the time to try
> +    /// to allocate memory to in a way that gives better TLB efficiency.
> +    pub const DMA_ATTR_ALLOC_SINGLE_PAGES: Attribs = Attribs(bindings::DMA_ATTR_ALLOC_SINGLE_PAGES);
> +
> +    /// This tells the DMA-mapping subsystem to suppress allocation failure reports (similarly to
> +    /// __GFP_NOWARN).
> +    pub const DMA_ATTR_NO_WARN: Attribs = Attribs(bindings::DMA_ATTR_NO_WARN);
> +
> +    /// Used to indicate that the buffer is fully accessible at an elevated privilege level (and
> +    /// ideally inaccessible or at least read-only at lesser-privileged levels).
> +    pub const DMA_ATTR_PRIVILEGED: Attribs = Attribs(bindings::DMA_ATTR_PRIVILEGED);
> +}
> +
> +/// An abstraction of the `dma_alloc_coherent` API.
> +///
> +/// This is an abstraction around the `dma_alloc_coherent` API which is used to allocate and map
> +/// large consistent DMA regions.
> +///
> +/// A [`CoherentAllocation`] instance contains a pointer to the allocated region (in the
> +/// processor's virtual address space) and the device address which can be given to the device
> +/// as the DMA address base of the region. The region is released once [`CoherentAllocation`]
> +/// is dropped.
> +///
> +/// # Invariants
> +///
> +/// For the lifetime of an instance of [`CoherentAllocation`], the cpu address is a valid pointer
> +/// to an allocated region of consistent memory and we hold a reference to the device.
> +pub struct CoherentAllocation<T: AsBytes + FromBytes> {
> +    dev: ARef<Device>,
> +    dma_handle: bindings::dma_addr_t,
> +    count: usize,
> +    cpu_addr: *mut T,
> +    dma_attrs: Attribs,
> +}
> +
> +impl<T: AsBytes + FromBytes> CoherentAllocation<T> {
> +    /// Allocates a region of `size_of::<T> * count` of consistent memory.
> +    ///
> +    /// # Examples
> +    ///
> +    /// ```
> +    /// use kernel::device::Device;
> +    /// use kernel::dma::{attrs::*, CoherentAllocation};
> +    ///
> +    /// # fn test(dev: &Device) -> Result {
> +    /// let c: CoherentAllocation<u64> = CoherentAllocation::alloc_attrs(dev, 4, GFP_KERNEL,
> +    ///                                                                  DMA_ATTR_NO_WARN)?;
> +    /// # Ok::<(), Error>(()) }
> +    /// ```
> +    pub fn alloc_attrs(
> +        dev: &Device,
> +        count: usize,
> +        gfp_flags: kernel::alloc::Flags,
> +        dma_attrs: Attribs,
> +    ) -> Result<CoherentAllocation<T>> {
> +        build_assert!(core::mem::size_of::<T>() > 0,
> +                      "It doesn't make sense for the allocated type to be a ZST");
> +
> +        let size = count.checked_mul(core::mem::size_of::<T>()).ok_or(EOVERFLOW)?;
> +        let mut dma_handle = 0;
> +        // SAFETY: device pointer is guaranteed as valid by invariant on `Device`.
> +        // We ensure that we catch the failure on this function and throw an ENOMEM
> +        let ret = unsafe {
> +            bindings::dma_alloc_attrs(
> +                dev.as_raw(),
> +                size,
> +                &mut dma_handle, gfp_flags.as_raw(),
> +                dma_attrs.as_raw(),
> +            )
> +        };
> +        if ret.is_null() {
> +            return Err(ENOMEM)
> +        }
> +        // INVARIANT: We just successfully allocated a coherent region which is accessible for
> +        // `count` elements, hence the cpu address is valid. We also hold a refcounted reference
> +        // to the device.
> +        Ok(Self {
> +            dev: dev.into(),
> +            dma_handle,
> +            count,
> +            cpu_addr: ret as *mut T,
> +            dma_attrs,
> +        })
> +    }
> +
> +    /// Performs the same functionality as `alloc_attrs`, except the `dma_attrs` is 0 by default.
> +    pub fn alloc_coherent(dev: &Device,
> +                          count: usize,
> +                          gfp_flags: kernel::alloc::Flags) -> Result<CoherentAllocation<T>> {
> +        CoherentAllocation::alloc_attrs(dev, count, gfp_flags, Attribs(0))
> +    }
> +
> +    /// Returns the base address to the allocated region and the dma handle. The caller takes
> +    /// ownership of the returned resources.
> +    pub fn into_parts(self) -> (usize, bindings::dma_addr_t) {
> +        let ret = (self.cpu_addr as _, self.dma_handle);
> +        core::mem::forget(self);
> +        ret
> +    }
> +
> +    /// Returns the base address to the allocated region in the CPU's virtual address space.
> +    pub fn start_ptr(&self) -> *const T {
> +        self.cpu_addr as _
> +    }
> +
> +    /// Returns the base address to the allocated region in the CPU's virtual address space as
> +    /// a mutable pointer.
> +    pub fn start_ptr_mut(&mut self) -> *mut T {
> +        self.cpu_addr
> +    }
> +
> +    /// Returns a DMA handle which may given to the device as the DMA address base of
> +    /// the region.
> +    pub fn dma_handle(&self) -> bindings::dma_addr_t {
> +        self.dma_handle
> +    }
> +
> +    /// Returns the CPU-addressable region as a slice.
> +    pub fn cpu_buf(&self) -> &[T]
> +    {
> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
> +        // is valid for reads for `self.count * size_of::<T>` bytes.
> +        unsafe { core::slice::from_raw_parts(self.cpu_addr, self.count) }
> +    }
> +
> +    /// Performs the same functionality as `cpu_buf`, except that a mutable slice is returned.
> +    pub fn cpu_buf_mut(&mut self) -> &mut [T]
> +    {
> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
> +        // is valid for reads for `self.count * size_of::<T>` bytes.
> +        unsafe { core::slice::from_raw_parts_mut(self.cpu_addr, self.count) }
> +    }
> +}
> +
> +impl<T: AsBytes + FromBytes> Drop for CoherentAllocation<T> {
> +    fn drop(&mut self) {
> +        let size = self.count * core::mem::size_of::<T>();
> +        // SAFETY: the device, cpu address, and the dma handle is valid due to the
> +        // type invariants on `CoherentAllocation`.
> +        unsafe { bindings::dma_free_attrs(self.dev.as_raw(), size,
> +                                          self.cpu_addr as _,
> +                                          self.dma_handle,
> +                                          self.dma_attrs.as_raw(),) }
> +    }
> +}
> diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
> index e1065a7551a3..6e90ebf5a130 100644
> --- a/rust/kernel/lib.rs
> +++ b/rust/kernel/lib.rs
> @@ -35,6 +35,7 @@
> mod build_assert;
> pub mod cred;
> pub mod device;
> +pub mod dma;
> pub mod error;
> #[cfg(CONFIG_RUST_FW_LOADER_ABSTRACTIONS)]
> pub mod firmware;
> -- 
> 2.43.0
> 


This still looks good to me.

— Daniel

Re: [PATCH v7 2/2] rust: add dma coherent allocator abstraction.

Posted by Abdiel Janulgue 1 year, 1 month ago

Gentle ping. Does this approach make sense? :)

On 11/12/2024 00:14, Abdiel Janulgue wrote:
> Add a simple dma coherent allocator rust abstraction. Based on
> Andreas Hindborg's dma abstractions from the rnvme driver, which
> was also based on earlier work by Wedson Almeida Filho.
> 
> Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
> Tested-by: Daniel Almeida <daniel.almeida@collabora.com>
> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@gmail.com>
> ---
>   rust/bindings/bindings_helper.h |   1 +
>   rust/kernel/dma.rs              | 223 ++++++++++++++++++++++++++++++++
>   rust/kernel/lib.rs              |   1 +
>   3 files changed, 225 insertions(+)
>   create mode 100644 rust/kernel/dma.rs
> 
> diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
> index 5c4dfe22f41a..49bf713b9bb6 100644
> --- a/rust/bindings/bindings_helper.h
> +++ b/rust/bindings/bindings_helper.h
> @@ -11,6 +11,7 @@
>   #include <linux/blk_types.h>
>   #include <linux/blkdev.h>
>   #include <linux/cred.h>
> +#include <linux/dma-mapping.h>
>   #include <linux/errname.h>
>   #include <linux/ethtool.h>
>   #include <linux/file.h>
> diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs
> new file mode 100644
> index 000000000000..29ae744d6f2b
> --- /dev/null
> +++ b/rust/kernel/dma.rs
> @@ -0,0 +1,223 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! Direct memory access (DMA).
> +//!
> +//! C header: [`include/linux/dma-mapping.h`](srctree/include/linux/dma-mapping.h)
> +
> +use crate::{
> +    bindings,
> +    build_assert,
> +    device::Device,
> +    error::code::*,
> +    error::Result,
> +    types::ARef,
> +    transmute::{AsBytes, FromBytes},
> +};
> +
> +/// Possible attributes associated with a DMA mapping.
> +///
> +/// They can be combined with the operators `|`, `&`, and `!`.
> +///
> +/// Values can be used from the [`attrs`] module.
> +#[derive(Clone, Copy, PartialEq)]
> +pub struct Attribs(u32);
> +
> +impl Attribs {
> +    /// Get the raw representation of this attribute.
> +    pub(crate) fn as_raw(self) -> u64 {
> +        self.0.into()
> +    }
> +
> +    /// Check whether `flags` is contained in `self`.
> +    pub fn contains(self, flags: Attribs) -> bool {
> +        (self & flags) == flags
> +    }
> +}
> +
> +impl core::ops::BitOr for Attribs {
> +    type Output = Self;
> +    fn bitor(self, rhs: Self) -> Self::Output {
> +        Self(self.0 | rhs.0)
> +    }
> +}
> +
> +impl core::ops::BitAnd for Attribs {
> +    type Output = Self;
> +    fn bitand(self, rhs: Self) -> Self::Output {
> +        Self(self.0 & rhs.0)
> +    }
> +}
> +
> +impl core::ops::Not for Attribs {
> +    type Output = Self;
> +    fn not(self) -> Self::Output {
> +        Self(!self.0)
> +    }
> +}
> +
> +/// DMA mapping attrributes.
> +pub mod attrs {
> +    use super::Attribs;
> +
> +    /// Specifies that reads and writes to the mapping may be weakly ordered, that is that reads
> +    /// and writes may pass each other.
> +    pub const DMA_ATTR_WEAK_ORDERING: Attribs = Attribs(bindings::DMA_ATTR_WEAK_ORDERING);
> +
> +    /// Specifies that writes to the mapping may be buffered to improve performance.
> +    pub const DMA_ATTR_WRITE_COMBINE: Attribs = Attribs(bindings::DMA_ATTR_WRITE_COMBINE);
> +
> +    /// Lets the platform to avoid creating a kernel virtual mapping for the allocated buffer.
> +    pub const DMA_ATTR_NO_KERNEL_MAPPING: Attribs = Attribs(bindings::DMA_ATTR_NO_KERNEL_MAPPING);
> +
> +    /// Allows platform code to skip synchronization of the CPU cache for the given buffer assuming
> +    /// that it has been already transferred to 'device' domain.
> +    pub const DMA_ATTR_SKIP_CPU_SYNC: Attribs = Attribs(bindings::DMA_ATTR_SKIP_CPU_SYNC);
> +
> +    /// Forces contiguous allocation of the buffer in physical memory.
> +    pub const DMA_ATTR_FORCE_CONTIGUOUS: Attribs = Attribs(bindings::DMA_ATTR_FORCE_CONTIGUOUS);
> +
> +    /// This is a hint to the DMA-mapping subsystem that it's probably not worth the time to try
> +    /// to allocate memory to in a way that gives better TLB efficiency.
> +    pub const DMA_ATTR_ALLOC_SINGLE_PAGES: Attribs = Attribs(bindings::DMA_ATTR_ALLOC_SINGLE_PAGES);
> +
> +    /// This tells the DMA-mapping subsystem to suppress allocation failure reports (similarly to
> +    /// __GFP_NOWARN).
> +    pub const DMA_ATTR_NO_WARN: Attribs = Attribs(bindings::DMA_ATTR_NO_WARN);
> +
> +    /// Used to indicate that the buffer is fully accessible at an elevated privilege level (and
> +    /// ideally inaccessible or at least read-only at lesser-privileged levels).
> +    pub const DMA_ATTR_PRIVILEGED: Attribs = Attribs(bindings::DMA_ATTR_PRIVILEGED);
> +}
> +
> +/// An abstraction of the `dma_alloc_coherent` API.
> +///
> +/// This is an abstraction around the `dma_alloc_coherent` API which is used to allocate and map
> +/// large consistent DMA regions.
> +///
> +/// A [`CoherentAllocation`] instance contains a pointer to the allocated region (in the
> +/// processor's virtual address space) and the device address which can be given to the device
> +/// as the DMA address base of the region. The region is released once [`CoherentAllocation`]
> +/// is dropped.
> +///
> +/// # Invariants
> +///
> +/// For the lifetime of an instance of [`CoherentAllocation`], the cpu address is a valid pointer
> +/// to an allocated region of consistent memory and we hold a reference to the device.
> +pub struct CoherentAllocation<T: AsBytes + FromBytes> {
> +    dev: ARef<Device>,
> +    dma_handle: bindings::dma_addr_t,
> +    count: usize,
> +    cpu_addr: *mut T,
> +    dma_attrs: Attribs,
> +}
> +
> +impl<T: AsBytes + FromBytes> CoherentAllocation<T> {
> +    /// Allocates a region of `size_of::<T> * count` of consistent memory.
> +    ///
> +    /// # Examples
> +    ///
> +    /// ```
> +    /// use kernel::device::Device;
> +    /// use kernel::dma::{attrs::*, CoherentAllocation};
> +    ///
> +    /// # fn test(dev: &Device) -> Result {
> +    /// let c: CoherentAllocation<u64> = CoherentAllocation::alloc_attrs(dev, 4, GFP_KERNEL,
> +    ///                                                                  DMA_ATTR_NO_WARN)?;
> +    /// # Ok::<(), Error>(()) }
> +    /// ```
> +    pub fn alloc_attrs(
> +        dev: &Device,
> +        count: usize,
> +        gfp_flags: kernel::alloc::Flags,
> +        dma_attrs: Attribs,
> +    ) -> Result<CoherentAllocation<T>> {
> +        build_assert!(core::mem::size_of::<T>() > 0,
> +                      "It doesn't make sense for the allocated type to be a ZST");
> +
> +        let size = count.checked_mul(core::mem::size_of::<T>()).ok_or(EOVERFLOW)?;
> +        let mut dma_handle = 0;
> +        // SAFETY: device pointer is guaranteed as valid by invariant on `Device`.
> +        // We ensure that we catch the failure on this function and throw an ENOMEM
> +        let ret = unsafe {
> +            bindings::dma_alloc_attrs(
> +                dev.as_raw(),
> +                size,
> +                &mut dma_handle, gfp_flags.as_raw(),
> +                dma_attrs.as_raw(),
> +            )
> +        };
> +        if ret.is_null() {
> +            return Err(ENOMEM)
> +        }
> +        // INVARIANT: We just successfully allocated a coherent region which is accessible for
> +        // `count` elements, hence the cpu address is valid. We also hold a refcounted reference
> +        // to the device.
> +        Ok(Self {
> +            dev: dev.into(),
> +            dma_handle,
> +            count,
> +            cpu_addr: ret as *mut T,
> +            dma_attrs,
> +        })
> +    }
> +
> +    /// Performs the same functionality as `alloc_attrs`, except the `dma_attrs` is 0 by default.
> +    pub fn alloc_coherent(dev: &Device,
> +                          count: usize,
> +                          gfp_flags: kernel::alloc::Flags) -> Result<CoherentAllocation<T>> {
> +        CoherentAllocation::alloc_attrs(dev, count, gfp_flags, Attribs(0))
> +    }
> +
> +    /// Returns the base address to the allocated region and the dma handle. The caller takes
> +    /// ownership of the returned resources.
> +    pub fn into_parts(self) -> (usize, bindings::dma_addr_t) {
> +        let ret = (self.cpu_addr as _, self.dma_handle);
> +        core::mem::forget(self);
> +        ret
> +    }
> +
> +    /// Returns the base address to the allocated region in the CPU's virtual address space.
> +    pub fn start_ptr(&self) -> *const T {
> +        self.cpu_addr as _
> +    }
> +
> +    /// Returns the base address to the allocated region in the CPU's virtual address space as
> +    /// a mutable pointer.
> +    pub fn start_ptr_mut(&mut self) -> *mut T {
> +        self.cpu_addr
> +    }
> +
> +    /// Returns a DMA handle which may given to the device as the DMA address base of
> +    /// the region.
> +    pub fn dma_handle(&self) -> bindings::dma_addr_t {
> +        self.dma_handle
> +    }
> +
> +    /// Returns the CPU-addressable region as a slice.
> +    pub fn cpu_buf(&self) -> &[T]
> +    {
> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
> +        // is valid for reads for `self.count * size_of::<T>` bytes.
> +        unsafe { core::slice::from_raw_parts(self.cpu_addr, self.count) }
> +    }
> +
> +    /// Performs the same functionality as `cpu_buf`, except that a mutable slice is returned.
> +    pub fn cpu_buf_mut(&mut self) -> &mut [T]
> +    {
> +        // SAFETY: The pointer is valid due to type invariant on `CoherentAllocation` and
> +        // is valid for reads for `self.count * size_of::<T>` bytes.
> +        unsafe { core::slice::from_raw_parts_mut(self.cpu_addr, self.count) }
> +    }
> +}
> +
> +impl<T: AsBytes + FromBytes> Drop for CoherentAllocation<T> {
> +    fn drop(&mut self) {
> +        let size = self.count * core::mem::size_of::<T>();
> +        // SAFETY: the device, cpu address, and the dma handle is valid due to the
> +        // type invariants on `CoherentAllocation`.
> +        unsafe { bindings::dma_free_attrs(self.dev.as_raw(), size,
> +                                          self.cpu_addr as _,
> +                                          self.dma_handle,
> +                                          self.dma_attrs.as_raw(),) }
> +    }
> +}
> diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
> index e1065a7551a3..6e90ebf5a130 100644
> --- a/rust/kernel/lib.rs
> +++ b/rust/kernel/lib.rs
> @@ -35,6 +35,7 @@
>   mod build_assert;
>   pub mod cred;
>   pub mod device;
> +pub mod dma;
>   pub mod error;
>   #[cfg(CONFIG_RUST_FW_LOADER_ABSTRACTIONS)]
>   pub mod firmware;

Re: [PATCH v7 2/2] rust: add dma coherent allocator abstraction.

Posted by Andreas Hindborg 1 year, 1 month ago

Abdiel Janulgue <abdiel.janulgue@gmail.com> writes:

> Gentle ping. Does this approach make sense? :)

If you include the `Allocator` trait and the `Pool` it would make my
life a lot easier when rebasing nvme. For now I'm just going to use out
of tree patches that include the dma pool.

If you do not have a user for dma pool, maybe just include the
`Allocator` trait and indicate that it is there to support other dma
allocators in the future?


Best regards,
Andreas Hindborg

Re: [PATCH v7 2/2] rust: add dma coherent allocator abstraction.

Posted by Abdiel Janulgue 1 year, 1 month ago

Hi Andreas,

On 19/12/2024 13:22, Andreas Hindborg wrote:
> If you include the `Allocator` trait and the `Pool` it would make my
> life a lot easier when rebasing nvme. For now I'm just going to use out
> of tree patches that include the dma pool.
> 
> If you do not have a user for dma pool, maybe just include the
> `Allocator` trait and indicate that it is there to support other dma
> allocators in the future?
> 
> 

Would it be okay with you to have the basic functionality merged first? 
Once this is done, I will send up a follow-up patch that would then 
include the allocator trait and the pool.

Thanks!

Cheers,
Abdiel

Re: [PATCH v7 2/2] rust: add dma coherent allocator abstraction.

Posted by Andreas Hindborg 1 year, 1 month ago

"Abdiel Janulgue" <abdiel.janulgue@gmail.com> writes:

> Hi Andreas,
>
> On 19/12/2024 13:22, Andreas Hindborg wrote:
>> If you include the `Allocator` trait and the `Pool` it would make my
>> life a lot easier when rebasing nvme. For now I'm just going to use out
>> of tree patches that include the dma pool.
>>
>> If you do not have a user for dma pool, maybe just include the
>> `Allocator` trait and indicate that it is there to support other dma
>> allocators in the future?
>>
>>
>
> Would it be okay with you to have the basic functionality merged first?
> Once this is done, I will send up a follow-up patch that would then
> include the allocator trait and the pool.

Later is better than never :)


Best regards,
Andreas Hindborg

[PATCH v7 1/2] rust: error: Add EOVERFLOW
[PATCH v7 2/2] rust: add dma coherent allocator abstraction.