[v1] Generic `Allocator` support for Rust

[PATCH 01/20] rust: alloc: add `Allocator` trait

Posted by Danilo Krummrich 1 year, 7 months ago

Add a kernel specific `Allocator` trait, that in contrast to the one in
Rust's core library doesn't require unstable features and supports GFP
flags.

Subsequent patches add the following trait implementors: `Kmalloc`,
`Vmalloc` and `KVmalloc`.

Signed-off-by: Danilo Krummrich <dakr@redhat.com>
---
 rust/kernel/alloc.rs | 73 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 73 insertions(+)

diff --git a/rust/kernel/alloc.rs b/rust/kernel/alloc.rs
index 531b5e471cb1..462e00982510 100644
--- a/rust/kernel/alloc.rs
+++ b/rust/kernel/alloc.rs
@@ -11,6 +11,7 @@
 /// Indicates an allocation error.
 #[derive(Copy, Clone, PartialEq, Eq, Debug)]
 pub struct AllocError;
+use core::{alloc::Layout, ptr, ptr::NonNull};
 
 /// Flags to be used when allocating memory.
 ///
@@ -71,3 +72,75 @@ pub mod flags {
     /// small allocations.
     pub const GFP_NOWAIT: Flags = Flags(bindings::GFP_NOWAIT);
 }
+
+/// The kernel's [`Allocator`] trait.
+///
+/// An implementation of [`Allocator`] can allocate, re-allocate and free memory buffer described
+/// via [`Layout`].
+///
+/// [`Allocator`] is designed to be implemented on ZSTs; its safety requirements to not allow for
+/// keeping a state throughout an instance.
+///
+/// # Safety
+///
+/// Memory returned from an allocator must point to a valid memory buffer and remain valid until
+/// its explicitly freed.
+///
+/// Copying, cloning, or moving the allocator must not invalidate memory blocks returned from this
+/// allocator. A copied, cloned or even new allocator of the same type must behave like the same
+/// allocator, and any pointer to a memory buffer which is currently allocated may be passed to any
+/// other method of the allocator.
+pub unsafe trait Allocator {
+    /// Allocate memory based on `layout` and `flags`.
+    ///
+    /// On success, returns a buffer represented as `NonNull<[u8]>` that satisfies the size an
+    /// alignment requirements of layout, but may exceed the requested size.
+    ///
+    /// This function is equivalent to `realloc` when called with a NULL pointer and an `old_size`
+    /// of `0`.
+    fn alloc(&self, layout: Layout, flags: Flags) -> Result<NonNull<[u8]>, AllocError> {
+        // SAFETY: Passing a NULL pointer to `realloc` is valid by it's safety requirements and asks
+        // for a new memory allocation.
+        unsafe { self.realloc(ptr::null_mut(), 0, layout, flags) }
+    }
+
+    /// Re-allocate an existing memory allocation to satisfy the requested `layout`. If the
+    /// requested size is zero, `realloc` behaves equivalent to `free`.
+    ///
+    /// If the requested size is larger than `old_size`, a successful call to `realloc` guarantees
+    /// that the new or grown buffer has at least `Layout::size` bytes, but may also be larger.
+    ///
+    /// If the requested size is smaller than `old_size`, `realloc` may or may not shrink the
+    /// buffer; this is implementation specific to the allocator.
+    ///
+    /// On allocation failure, the existing buffer, if any, remains valid.
+    ///
+    /// The buffer is represented as `NonNull<[u8]>`.
+    ///
+    /// # Safety
+    ///
+    /// `ptr` must point to an existing and valid memory allocation created by this allocator
+    /// instance of a size of at least `old_size`.
+    ///
+    /// Additionally, `ptr` is allowed to be a NULL pointer; in this case a new memory allocation is
+    /// created.
+    unsafe fn realloc(
+        &self,
+        ptr: *mut u8,
+        old_size: usize,
+        layout: Layout,
+        flags: Flags,
+    ) -> Result<NonNull<[u8]>, AllocError>;
+
+    /// Free an existing memory allocation.
+    ///
+    /// # Safety
+    ///
+    /// `ptr` must point to an existing and valid memory allocation created by this `Allocator`
+    /// instance.
+    unsafe fn free(&self, ptr: *mut u8) {
+        // SAFETY: `ptr` is guaranteed to be previously allocated with this `Allocator` or NULL.
+        // Calling `realloc` with a buffer size of zero, frees the buffer `ptr` points to.
+        let _ = unsafe { self.realloc(ptr, 0, Layout::new::<()>(), Flags(0)) };
+    }
+}
-- 
2.45.2

Re: [PATCH 01/20] rust: alloc: add `Allocator` trait

Posted by Benno Lossin 1 year, 7 months ago

On 04.07.24 19:06, Danilo Krummrich wrote:
> Add a kernel specific `Allocator` trait, that in contrast to the one in
> Rust's core library doesn't require unstable features and supports GFP
> flags.
> 
> Subsequent patches add the following trait implementors: `Kmalloc`,
> `Vmalloc` and `KVmalloc`.
> 
> Signed-off-by: Danilo Krummrich <dakr@redhat.com>
> ---
>  rust/kernel/alloc.rs | 73 ++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 73 insertions(+)
> 
> diff --git a/rust/kernel/alloc.rs b/rust/kernel/alloc.rs
> index 531b5e471cb1..462e00982510 100644
> --- a/rust/kernel/alloc.rs
> +++ b/rust/kernel/alloc.rs
> @@ -11,6 +11,7 @@
>  /// Indicates an allocation error.
>  #[derive(Copy, Clone, PartialEq, Eq, Debug)]
>  pub struct AllocError;
> +use core::{alloc::Layout, ptr, ptr::NonNull};
> 
>  /// Flags to be used when allocating memory.
>  ///
> @@ -71,3 +72,75 @@ pub mod flags {
>      /// small allocations.
>      pub const GFP_NOWAIT: Flags = Flags(bindings::GFP_NOWAIT);
>  }
> +
> +/// The kernel's [`Allocator`] trait.
> +///
> +/// An implementation of [`Allocator`] can allocate, re-allocate and free memory buffer described
> +/// via [`Layout`].
> +///
> +/// [`Allocator`] is designed to be implemented on ZSTs; its safety requirements to not allow for
> +/// keeping a state throughout an instance.

Why do the functions take `&self` if it is forbidden to have state? I
would remove the receiver in that case.

> +///
> +/// # Safety
> +///
> +/// Memory returned from an allocator must point to a valid memory buffer and remain valid until
> +/// its explicitly freed.
> +///
> +/// Copying, cloning, or moving the allocator must not invalidate memory blocks returned from this
> +/// allocator. A copied, cloned or even new allocator of the same type must behave like the same
> +/// allocator, and any pointer to a memory buffer which is currently allocated may be passed to any
> +/// other method of the allocator.

If you provide no receiver methods, then I think we can remove this
requirement.

> +pub unsafe trait Allocator {
> +    /// Allocate memory based on `layout` and `flags`.
> +    ///
> +    /// On success, returns a buffer represented as `NonNull<[u8]>` that satisfies the size an

typo "an" -> "and"

> +    /// alignment requirements of layout, but may exceed the requested size.

Also if it may exceed the size, then I wouldn't call that "satisfies the
size [...] requirements".

> +    ///
> +    /// This function is equivalent to `realloc` when called with a NULL pointer and an `old_size`
> +    /// of `0`.

This is only true for the default implementation and could be
overridden, since it is not a requirement of implementing this trait to
keep it this way. I would remove this sentence.

> +    fn alloc(&self, layout: Layout, flags: Flags) -> Result<NonNull<[u8]>, AllocError> {

Instead of using the `Flags` type from the alloc module, we should have
an associated `Flags` type in this trait.

Similarly, it might also be a good idea to let the implementer specify a
custom error type.

> +        // SAFETY: Passing a NULL pointer to `realloc` is valid by it's safety requirements and asks
> +        // for a new memory allocation.
> +        unsafe { self.realloc(ptr::null_mut(), 0, layout, flags) }
> +    }
> +
> +    /// Re-allocate an existing memory allocation to satisfy the requested `layout`. If the
> +    /// requested size is zero, `realloc` behaves equivalent to `free`.

This is not guaranteed by the implementation.

> +    ///
> +    /// If the requested size is larger than `old_size`, a successful call to `realloc` guarantees
> +    /// that the new or grown buffer has at least `Layout::size` bytes, but may also be larger.
> +    ///
> +    /// If the requested size is smaller than `old_size`, `realloc` may or may not shrink the
> +    /// buffer; this is implementation specific to the allocator.
> +    ///
> +    /// On allocation failure, the existing buffer, if any, remains valid.
> +    ///
> +    /// The buffer is represented as `NonNull<[u8]>`.
> +    ///
> +    /// # Safety
> +    ///
> +    /// `ptr` must point to an existing and valid memory allocation created by this allocator
> +    /// instance of a size of at least `old_size`.
> +    ///
> +    /// Additionally, `ptr` is allowed to be a NULL pointer; in this case a new memory allocation is
> +    /// created.
> +    unsafe fn realloc(
> +        &self,
> +        ptr: *mut u8,
> +        old_size: usize,

Why not request the old layout like the std Allocator's grow/shrink
functions do?

> +        layout: Layout,
> +        flags: Flags,
> +    ) -> Result<NonNull<[u8]>, AllocError>;
> +
> +    /// Free an existing memory allocation.
> +    ///
> +    /// # Safety
> +    ///
> +    /// `ptr` must point to an existing and valid memory allocation created by this `Allocator`
> +    /// instance.
> +    unsafe fn free(&self, ptr: *mut u8) {

`ptr` should be `NonNull<u8>`.

> +        // SAFETY: `ptr` is guaranteed to be previously allocated with this `Allocator` or NULL.
> +        // Calling `realloc` with a buffer size of zero, frees the buffer `ptr` points to.
> +        let _ = unsafe { self.realloc(ptr, 0, Layout::new::<()>(), Flags(0)) };

Why does the implementer have to guarantee this?

> +    }
> +}
> --
> 2.45.2
> 

More general questions:
- are there functions in the kernel to efficiently allocate zeroed
  memory? In that case, the Allocator trait should also have methods
  that do that (with a iterating default impl).
- I am not sure putting everything into the single realloc function is a
  good idea, I like the grow/shrink methods of the std allocator. Is
  there a reason aside from concentrating the impl to go for only a
  single realloc function?

---
Cheers,
Benno

Re: [PATCH 01/20] rust: alloc: add `Allocator` trait

Posted by Danilo Krummrich 1 year, 7 months ago

On Sat, Jul 06, 2024 at 10:33:49AM +0000, Benno Lossin wrote:
> On 04.07.24 19:06, Danilo Krummrich wrote:
> > Add a kernel specific `Allocator` trait, that in contrast to the one in
> > Rust's core library doesn't require unstable features and supports GFP
> > flags.
> > 
> > Subsequent patches add the following trait implementors: `Kmalloc`,
> > `Vmalloc` and `KVmalloc`.
> > 
> > Signed-off-by: Danilo Krummrich <dakr@redhat.com>
> > ---
> >  rust/kernel/alloc.rs | 73 ++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 73 insertions(+)
> > 
> > diff --git a/rust/kernel/alloc.rs b/rust/kernel/alloc.rs
> > index 531b5e471cb1..462e00982510 100644
> > --- a/rust/kernel/alloc.rs
> > +++ b/rust/kernel/alloc.rs
> > @@ -11,6 +11,7 @@
> >  /// Indicates an allocation error.
> >  #[derive(Copy, Clone, PartialEq, Eq, Debug)]
> >  pub struct AllocError;
> > +use core::{alloc::Layout, ptr, ptr::NonNull};
> > 
> >  /// Flags to be used when allocating memory.
> >  ///
> > @@ -71,3 +72,75 @@ pub mod flags {
> >      /// small allocations.
> >      pub const GFP_NOWAIT: Flags = Flags(bindings::GFP_NOWAIT);
> >  }
> > +
> > +/// The kernel's [`Allocator`] trait.
> > +///
> > +/// An implementation of [`Allocator`] can allocate, re-allocate and free memory buffer described
> > +/// via [`Layout`].
> > +///
> > +/// [`Allocator`] is designed to be implemented on ZSTs; its safety requirements to not allow for
> > +/// keeping a state throughout an instance.
> 
> Why do the functions take `&self` if it is forbidden to have state? I
> would remove the receiver in that case.

Yes, that that makes sense.

> 
> > +///
> > +/// # Safety
> > +///
> > +/// Memory returned from an allocator must point to a valid memory buffer and remain valid until
> > +/// its explicitly freed.
> > +///
> > +/// Copying, cloning, or moving the allocator must not invalidate memory blocks returned from this
> > +/// allocator. A copied, cloned or even new allocator of the same type must behave like the same
> > +/// allocator, and any pointer to a memory buffer which is currently allocated may be passed to any
> > +/// other method of the allocator.
> 
> If you provide no receiver methods, then I think we can remove this
> requirement.

Indeed.

> 
> > +pub unsafe trait Allocator {
> > +    /// Allocate memory based on `layout` and `flags`.
> > +    ///
> > +    /// On success, returns a buffer represented as `NonNull<[u8]>` that satisfies the size an
> 
> typo "an" -> "and"
> 
> > +    /// alignment requirements of layout, but may exceed the requested size.
> 
> Also if it may exceed the size, then I wouldn't call that "satisfies the
> size [...] requirements".

Do you have a better proposal? To me "satisfies or exceeds" sounds reasonable.

> 
> > +    ///
> > +    /// This function is equivalent to `realloc` when called with a NULL pointer and an `old_size`
> > +    /// of `0`.
> 
> This is only true for the default implementation and could be
> overridden, since it is not a requirement of implementing this trait to
> keep it this way. I would remove this sentence.

I could add a bit more generic description and say that for the default impl
"This function is eq..."?

> 
> > +    fn alloc(&self, layout: Layout, flags: Flags) -> Result<NonNull<[u8]>, AllocError> {
> 
> Instead of using the `Flags` type from the alloc module, we should have
> an associated `Flags` type in this trait.

What does this give us?

> 
> Similarly, it might also be a good idea to let the implementer specify a
> custom error type.

Same here, why?

> 
> > +        // SAFETY: Passing a NULL pointer to `realloc` is valid by it's safety requirements and asks
> > +        // for a new memory allocation.
> > +        unsafe { self.realloc(ptr::null_mut(), 0, layout, flags) }
> > +    }
> > +
> > +    /// Re-allocate an existing memory allocation to satisfy the requested `layout`. If the
> > +    /// requested size is zero, `realloc` behaves equivalent to `free`.
> 
> This is not guaranteed by the implementation.

Not sure what exactly you mean? Is it about "satisfy" again?

> 
> > +    ///
> > +    /// If the requested size is larger than `old_size`, a successful call to `realloc` guarantees
> > +    /// that the new or grown buffer has at least `Layout::size` bytes, but may also be larger.
> > +    ///
> > +    /// If the requested size is smaller than `old_size`, `realloc` may or may not shrink the
> > +    /// buffer; this is implementation specific to the allocator.
> > +    ///
> > +    /// On allocation failure, the existing buffer, if any, remains valid.
> > +    ///
> > +    /// The buffer is represented as `NonNull<[u8]>`.
> > +    ///
> > +    /// # Safety
> > +    ///
> > +    /// `ptr` must point to an existing and valid memory allocation created by this allocator
> > +    /// instance of a size of at least `old_size`.
> > +    ///
> > +    /// Additionally, `ptr` is allowed to be a NULL pointer; in this case a new memory allocation is
> > +    /// created.
> > +    unsafe fn realloc(
> > +        &self,
> > +        ptr: *mut u8,
> > +        old_size: usize,
> 
> Why not request the old layout like the std Allocator's grow/shrink
> functions do?

Because we only care about the size that needs to be preserved when growing the
buffer. The `alignment` field of `Layout` would be wasted.

> 
> > +        layout: Layout,
> > +        flags: Flags,
> > +    ) -> Result<NonNull<[u8]>, AllocError>;
> > +
> > +    /// Free an existing memory allocation.
> > +    ///
> > +    /// # Safety
> > +    ///
> > +    /// `ptr` must point to an existing and valid memory allocation created by this `Allocator`
> > +    /// instance.
> > +    unsafe fn free(&self, ptr: *mut u8) {
> 
> `ptr` should be `NonNull<u8>`.

Creating a `NonNull` from a raw pointer is an extra operation for any user of
`free` and given that all `free` functions in the kernel accept a NULL pointer,
I think there is not much value in making this `NonNull`.

> 
> > +        // SAFETY: `ptr` is guaranteed to be previously allocated with this `Allocator` or NULL.
> > +        // Calling `realloc` with a buffer size of zero, frees the buffer `ptr` points to.
> > +        let _ = unsafe { self.realloc(ptr, 0, Layout::new::<()>(), Flags(0)) };
> 
> Why does the implementer have to guarantee this?

Who else can guarantee this?

> 
> > +    }
> > +}
> > --
> > 2.45.2
> > 
> 
> More general questions:
> - are there functions in the kernel to efficiently allocate zeroed
>   memory? In that case, the Allocator trait should also have methods
>   that do that (with a iterating default impl).

We do this with GFP flags. In particular, you can pass GFP_ZERO to `alloc` and
`realloc` to get zeroed memory. Hence, I think having dedicated functions that
just do "flags | GFP_ZERO" would not add much value.

> - I am not sure putting everything into the single realloc function is a
>   good idea, I like the grow/shrink methods of the std allocator. Is
>   there a reason aside from concentrating the impl to go for only a
>   single realloc function?

Yes, `krealloc()` already provides exactly the described behaviour. See the
implementation of `Kmalloc`.

> 
> ---
> Cheers,
> Benno
>

Re: [PATCH 01/20] rust: alloc: add `Allocator` trait

Posted by Benno Lossin 1 year, 7 months ago

On 06.07.24 13:05, Danilo Krummrich wrote:
> On Sat, Jul 06, 2024 at 10:33:49AM +0000, Benno Lossin wrote:
>> On 04.07.24 19:06, Danilo Krummrich wrote:
>>> +pub unsafe trait Allocator {
>>> +    /// Allocate memory based on `layout` and `flags`.
>>> +    ///
>>> +    /// On success, returns a buffer represented as `NonNull<[u8]>` that satisfies the size an
>>
>> typo "an" -> "and"
>>
>>> +    /// alignment requirements of layout, but may exceed the requested size.
>>
>> Also if it may exceed the size, then I wouldn't call that "satisfies the
>> size [...] requirements".
> 
> Do you have a better proposal? To me "satisfies or exceeds" sounds reasonable.

I think "satisfies the layout constraints (i.e. minimum size and
alignment as specified by `layout`)" would be better.

>>> +    ///
>>> +    /// This function is equivalent to `realloc` when called with a NULL pointer and an `old_size`
>>> +    /// of `0`.
>>
>> This is only true for the default implementation and could be
>> overridden, since it is not a requirement of implementing this trait to
>> keep it this way. I would remove this sentence.
> 
> I could add a bit more generic description and say that for the default impl
> "This function is eq..."?
> 
>>
>>> +    fn alloc(&self, layout: Layout, flags: Flags) -> Result<NonNull<[u8]>, AllocError> {
>>
>> Instead of using the `Flags` type from the alloc module, we should have
>> an associated `Flags` type in this trait.
> 
> What does this give us?

1. IIRC not all flags can be used with every allocator (or do not have
   an effect) and it would be best if only the working ones are allowed.
2. that way the design is more flexible and could be upstreamed more
   easily.

>> Similarly, it might also be a good idea to let the implementer specify a
>> custom error type.
> 
> Same here, why?

In this case the argument is weaker, but it could allow us to implement
an allocator with `Error = Infallible`, to statically guarantee
allocation (e.g. when using GFP_ATOMIC). But at the moment there is no
user.

>>> +        // SAFETY: Passing a NULL pointer to `realloc` is valid by it's safety requirements and asks
>>> +        // for a new memory allocation.
>>> +        unsafe { self.realloc(ptr::null_mut(), 0, layout, flags) }
>>> +    }
>>> +
>>> +    /// Re-allocate an existing memory allocation to satisfy the requested `layout`. If the
>>> +    /// requested size is zero, `realloc` behaves equivalent to `free`.
>>
>> This is not guaranteed by the implementation.
> 
> Not sure what exactly you mean? Is it about "satisfy" again?

If the requested size is zero, the implementation could also leak the
memory, nothing prevents me from implementing such an Allocator.

>>> +    ///
>>> +    /// If the requested size is larger than `old_size`, a successful call to `realloc` guarantees
>>> +    /// that the new or grown buffer has at least `Layout::size` bytes, but may also be larger.
>>> +    ///
>>> +    /// If the requested size is smaller than `old_size`, `realloc` may or may not shrink the
>>> +    /// buffer; this is implementation specific to the allocator.
>>> +    ///
>>> +    /// On allocation failure, the existing buffer, if any, remains valid.
>>> +    ///
>>> +    /// The buffer is represented as `NonNull<[u8]>`.
>>> +    ///
>>> +    /// # Safety
>>> +    ///
>>> +    /// `ptr` must point to an existing and valid memory allocation created by this allocator
>>> +    /// instance of a size of at least `old_size`.
>>> +    ///
>>> +    /// Additionally, `ptr` is allowed to be a NULL pointer; in this case a new memory allocation is
>>> +    /// created.
>>> +    unsafe fn realloc(
>>> +        &self,
>>> +        ptr: *mut u8,
>>> +        old_size: usize,
>>
>> Why not request the old layout like the std Allocator's grow/shrink
>> functions do?
> 
> Because we only care about the size that needs to be preserved when growing the
> buffer. The `alignment` field of `Layout` would be wasted.

In the std Allocator they specified an old layout. This is probably
because of the following: if `Layout` is ever extended to hold another
property that would need to be updated, the signatures are already
correct.
In our case we could change it tree-wide, so I guess we could fix that
issue when it comes up.

>>> +        layout: Layout,
>>> +        flags: Flags,
>>> +    ) -> Result<NonNull<[u8]>, AllocError>;
>>> +
>>> +    /// Free an existing memory allocation.
>>> +    ///
>>> +    /// # Safety
>>> +    ///
>>> +    /// `ptr` must point to an existing and valid memory allocation created by this `Allocator`
>>> +    /// instance.
>>> +    unsafe fn free(&self, ptr: *mut u8) {
>>
>> `ptr` should be `NonNull<u8>`.
> 
> Creating a `NonNull` from a raw pointer is an extra operation for any user of
> `free` and given that all `free` functions in the kernel accept a NULL pointer,
> I think there is not much value in making this `NonNull`.

I don't think that this argument holds for Rust though. For example,
`KBox` contains a `Unique` that contains a `NonNull`, so freeing could
just be done with `free(self.0.0)`.

>>> +        // SAFETY: `ptr` is guaranteed to be previously allocated with this `Allocator` or NULL.
>>> +        // Calling `realloc` with a buffer size of zero, frees the buffer `ptr` points to.
>>> +        let _ = unsafe { self.realloc(ptr, 0, Layout::new::<()>(), Flags(0)) };
>>
>> Why does the implementer have to guarantee this?
> 
> Who else can guarantee this?

Only the implementer yes. But they are not forced to do this i.e.
nothing in the safety requirements of `Allocator` prevents me from doing
a no-op on reallocating to a zero size.

>>> +    }
>>> +}
>>> --
>>> 2.45.2
>>>
>>
>> More general questions:
>> - are there functions in the kernel to efficiently allocate zeroed
>>   memory? In that case, the Allocator trait should also have methods
>>   that do that (with a iterating default impl).
> 
> We do this with GFP flags. In particular, you can pass GFP_ZERO to `alloc` and
> `realloc` to get zeroed memory. Hence, I think having dedicated functions that
> just do "flags | GFP_ZERO" would not add much value.

Ah right, no in that case, we don't need it.

>> - I am not sure putting everything into the single realloc function is a
>>   good idea, I like the grow/shrink methods of the std allocator. Is
>>   there a reason aside from concentrating the impl to go for only a
>>   single realloc function?
> 
> Yes, `krealloc()` already provides exactly the described behaviour. See the
> implementation of `Kmalloc`.

But `kvmalloc` does not and neither does `vmalloc`. I would prefer
multiple smaller functions over one big one in this case.

---
Cheers,
Benno

Re: [PATCH 01/20] rust: alloc: add `Allocator` trait

Posted by Danilo Krummrich 1 year, 7 months ago

On Sat, Jul 06, 2024 at 01:17:19PM +0000, Benno Lossin wrote:
> On 06.07.24 13:05, Danilo Krummrich wrote:
> > On Sat, Jul 06, 2024 at 10:33:49AM +0000, Benno Lossin wrote:
> >> On 04.07.24 19:06, Danilo Krummrich wrote:
> >>> +pub unsafe trait Allocator {
> >>> +    /// Allocate memory based on `layout` and `flags`.
> >>> +    ///
> >>> +    /// On success, returns a buffer represented as `NonNull<[u8]>` that satisfies the size an
> >>
> >> typo "an" -> "and"
> >>
> >>> +    /// alignment requirements of layout, but may exceed the requested size.
> >>
> >> Also if it may exceed the size, then I wouldn't call that "satisfies the
> >> size [...] requirements".
> > 
> > Do you have a better proposal? To me "satisfies or exceeds" sounds reasonable.
> 
> I think "satisfies the layout constraints (i.e. minimum size and
> alignment as specified by `layout`)" would be better.
> 
> >>> +    ///
> >>> +    /// This function is equivalent to `realloc` when called with a NULL pointer and an `old_size`
> >>> +    /// of `0`.
> >>
> >> This is only true for the default implementation and could be
> >> overridden, since it is not a requirement of implementing this trait to
> >> keep it this way. I would remove this sentence.
> > 
> > I could add a bit more generic description and say that for the default impl
> > "This function is eq..."?
> > 
> >>
> >>> +    fn alloc(&self, layout: Layout, flags: Flags) -> Result<NonNull<[u8]>, AllocError> {
> >>
> >> Instead of using the `Flags` type from the alloc module, we should have
> >> an associated `Flags` type in this trait.
> > 
> > What does this give us?
> 
> 1. IIRC not all flags can be used with every allocator (or do not have
>    an effect) and it would be best if only the working ones are allowed.

Agreed, but I'm not sure if it's worth the effort having different `Flags`
types for that only.

But I guess this and the below argument justify using an associated type. I will
queue this change up.

> 2. that way the design is more flexible and could be upstreamed more
>    easily.
> 
> >> Similarly, it might also be a good idea to let the implementer specify a
> >> custom error type.
> > 
> > Same here, why?
> 
> In this case the argument is weaker, but it could allow us to implement
> an allocator with `Error = Infallible`, to statically guarantee
> allocation (e.g. when using GFP_ATOMIC). But at the moment there is no
> user.

GFP_ATOMIC can fail, I guess you mean __GFP_NOFAIL.

Not really sure how this would work other than with separate `alloc_nofail` and
`realloc_nofail` functions?

> 
> >>> +        // SAFETY: Passing a NULL pointer to `realloc` is valid by it's safety requirements and asks
> >>> +        // for a new memory allocation.
> >>> +        unsafe { self.realloc(ptr::null_mut(), 0, layout, flags) }
> >>> +    }
> >>> +
> >>> +    /// Re-allocate an existing memory allocation to satisfy the requested `layout`. If the
> >>> +    /// requested size is zero, `realloc` behaves equivalent to `free`.
> >>
> >> This is not guaranteed by the implementation.
> > 
> > Not sure what exactly you mean? Is it about "satisfy" again?
> 
> If the requested size is zero, the implementation could also leak the
> memory, nothing prevents me from implementing such an Allocator.

Well, hopefully the documentation stating that `realloc` must be implemented
this exact way prevents you from doing otherwise. :-)

Please let me know if I need to document this in a different way if it's not
sufficient as it is.

> 
> >>> +    ///
> >>> +    /// If the requested size is larger than `old_size`, a successful call to `realloc` guarantees
> >>> +    /// that the new or grown buffer has at least `Layout::size` bytes, but may also be larger.
> >>> +    ///
> >>> +    /// If the requested size is smaller than `old_size`, `realloc` may or may not shrink the
> >>> +    /// buffer; this is implementation specific to the allocator.
> >>> +    ///
> >>> +    /// On allocation failure, the existing buffer, if any, remains valid.
> >>> +    ///
> >>> +    /// The buffer is represented as `NonNull<[u8]>`.
> >>> +    ///
> >>> +    /// # Safety
> >>> +    ///
> >>> +    /// `ptr` must point to an existing and valid memory allocation created by this allocator
> >>> +    /// instance of a size of at least `old_size`.
> >>> +    ///
> >>> +    /// Additionally, `ptr` is allowed to be a NULL pointer; in this case a new memory allocation is
> >>> +    /// created.
> >>> +    unsafe fn realloc(
> >>> +        &self,
> >>> +        ptr: *mut u8,
> >>> +        old_size: usize,
> >>
> >> Why not request the old layout like the std Allocator's grow/shrink
> >> functions do?
> > 
> > Because we only care about the size that needs to be preserved when growing the
> > buffer. The `alignment` field of `Layout` would be wasted.
> 
> In the std Allocator they specified an old layout. This is probably
> because of the following: if `Layout` is ever extended to hold another
> property that would need to be updated, the signatures are already
> correct.
> In our case we could change it tree-wide, so I guess we could fix that
> issue when it comes up.

Yes, I think so too.

> 
> >>> +        layout: Layout,
> >>> +        flags: Flags,
> >>> +    ) -> Result<NonNull<[u8]>, AllocError>;
> >>> +
> >>> +    /// Free an existing memory allocation.
> >>> +    ///
> >>> +    /// # Safety
> >>> +    ///
> >>> +    /// `ptr` must point to an existing and valid memory allocation created by this `Allocator`
> >>> +    /// instance.
> >>> +    unsafe fn free(&self, ptr: *mut u8) {
> >>
> >> `ptr` should be `NonNull<u8>`.
> > 
> > Creating a `NonNull` from a raw pointer is an extra operation for any user of
> > `free` and given that all `free` functions in the kernel accept a NULL pointer,
> > I think there is not much value in making this `NonNull`.
> 
> I don't think that this argument holds for Rust though. For example,
> `KBox` contains a `Unique` that contains a `NonNull`, so freeing could
> just be done with `free(self.0.0)`.

Agreed, we can indeed make it a `&NonNull<u8>`. However, I find this a bit
inconsistent with the signature of `realloc`.

Should we go with separate `shrink` / `grow`, `free` could be implemented as
shrinking to zero and allowing a NULL pointer makes not much sense.

But as mentioned, I'm not yet seeing the benefit of having `realloc` split into
`grow` and `shrink`.

> 
> >>> +        // SAFETY: `ptr` is guaranteed to be previously allocated with this `Allocator` or NULL.
> >>> +        // Calling `realloc` with a buffer size of zero, frees the buffer `ptr` points to.
> >>> +        let _ = unsafe { self.realloc(ptr, 0, Layout::new::<()>(), Flags(0)) };
> >>
> >> Why does the implementer have to guarantee this?
> > 
> > Who else can guarantee this?
> 
> Only the implementer yes. But they are not forced to do this i.e.
> nothing in the safety requirements of `Allocator` prevents me from doing
> a no-op on reallocating to a zero size.

Ah, I see now, this is the same as your comment on the documentation of
`realloc`. So, this indeed just about missing a safety comment.

> 
> >>> +    }
> >>> +}
> >>> --
> >>> 2.45.2
> >>>
> >>
> >> More general questions:
> >> - are there functions in the kernel to efficiently allocate zeroed
> >>   memory? In that case, the Allocator trait should also have methods
> >>   that do that (with a iterating default impl).
> > 
> > We do this with GFP flags. In particular, you can pass GFP_ZERO to `alloc` and
> > `realloc` to get zeroed memory. Hence, I think having dedicated functions that
> > just do "flags | GFP_ZERO" would not add much value.
> 
> Ah right, no in that case, we don't need it.
> 
> >> - I am not sure putting everything into the single realloc function is a
> >>   good idea, I like the grow/shrink methods of the std allocator. Is
> >>   there a reason aside from concentrating the impl to go for only a
> >>   single realloc function?
> > 
> > Yes, `krealloc()` already provides exactly the described behaviour. See the
> > implementation of `Kmalloc`.
> 
> But `kvmalloc` does not and neither does `vmalloc`. I would prefer
> multiple smaller functions over one big one in this case.

What I forsee is that:

  - `alloc` becomes a `grow` from zero.
  - `free` becomes a `shrink` to zero.
  - `grow` and `shrink` become a `realloc` alias,
     because they're almost the same

Wouldn't this just put us were we already are, effectively?

> 
> ---
> Cheers,
> Benno
>

Re: [PATCH 01/20] rust: alloc: add `Allocator` trait

Posted by Benno Lossin 1 year, 7 months ago

On 06.07.24 17:11, Danilo Krummrich wrote:
> On Sat, Jul 06, 2024 at 01:17:19PM +0000, Benno Lossin wrote:
>> On 06.07.24 13:05, Danilo Krummrich wrote:
>>> On Sat, Jul 06, 2024 at 10:33:49AM +0000, Benno Lossin wrote:
>>>> Similarly, it might also be a good idea to let the implementer specify a
>>>> custom error type.
>>>
>>> Same here, why?
>>
>> In this case the argument is weaker, but it could allow us to implement
>> an allocator with `Error = Infallible`, to statically guarantee
>> allocation (e.g. when using GFP_ATOMIC). But at the moment there is no
>> user.
> 
> GFP_ATOMIC can fail, I guess you mean __GFP_NOFAIL.
> 
> Not really sure how this would work other than with separate `alloc_nofail` and
> `realloc_nofail` functions?

You could have an Allocator that always enables __GFP_NOFAIL, so the
error type could be Infallible. But this doesn't seem that useful at the
moment, so keep the AllocError as is.

>>>>> +        // SAFETY: Passing a NULL pointer to `realloc` is valid by it's safety requirements and asks
>>>>> +        // for a new memory allocation.
>>>>> +        unsafe { self.realloc(ptr::null_mut(), 0, layout, flags) }
>>>>> +    }
>>>>> +
>>>>> +    /// Re-allocate an existing memory allocation to satisfy the requested `layout`. If the
>>>>> +    /// requested size is zero, `realloc` behaves equivalent to `free`.
>>>>
>>>> This is not guaranteed by the implementation.
>>>
>>> Not sure what exactly you mean? Is it about "satisfy" again?
>>
>> If the requested size is zero, the implementation could also leak the
>> memory, nothing prevents me from implementing such an Allocator.
> 
> Well, hopefully the documentation stating that `realloc` must be implemented
> this exact way prevents you from doing otherwise. :-)
> 
> Please let me know if I need to document this in a different way if it's not
> sufficient as it is.

It should be part of the safety requirements of the Allocator trait.

>>>>> +    ///
>>>>> +    /// If the requested size is larger than `old_size`, a successful call to `realloc` guarantees
>>>>> +    /// that the new or grown buffer has at least `Layout::size` bytes, but may also be larger.
>>>>> +    ///
>>>>> +    /// If the requested size is smaller than `old_size`, `realloc` may or may not shrink the
>>>>> +    /// buffer; this is implementation specific to the allocator.
>>>>> +    ///
>>>>> +    /// On allocation failure, the existing buffer, if any, remains valid.
>>>>> +    ///
>>>>> +    /// The buffer is represented as `NonNull<[u8]>`.
>>>>> +    ///
>>>>> +    /// # Safety
>>>>> +    ///
>>>>> +    /// `ptr` must point to an existing and valid memory allocation created by this allocator
>>>>> +    /// instance of a size of at least `old_size`.
>>>>> +    ///
>>>>> +    /// Additionally, `ptr` is allowed to be a NULL pointer; in this case a new memory allocation is
>>>>> +    /// created.
>>>>> +    unsafe fn realloc(
>>>>> +        &self,
>>>>> +        ptr: *mut u8,
>>>>> +        old_size: usize,
>>>>
>>>> Why not request the old layout like the std Allocator's grow/shrink
>>>> functions do?
>>>
>>> Because we only care about the size that needs to be preserved when growing the
>>> buffer. The `alignment` field of `Layout` would be wasted.
>>
>> In the std Allocator they specified an old layout. This is probably
>> because of the following: if `Layout` is ever extended to hold another
>> property that would need to be updated, the signatures are already
>> correct.
>> In our case we could change it tree-wide, so I guess we could fix that
>> issue when it comes up.
> 
> Yes, I think so too.
> 
>>
>>>>> +        layout: Layout,
>>>>> +        flags: Flags,
>>>>> +    ) -> Result<NonNull<[u8]>, AllocError>;
>>>>> +
>>>>> +    /// Free an existing memory allocation.
>>>>> +    ///
>>>>> +    /// # Safety
>>>>> +    ///
>>>>> +    /// `ptr` must point to an existing and valid memory allocation created by this `Allocator`
>>>>> +    /// instance.
>>>>> +    unsafe fn free(&self, ptr: *mut u8) {
>>>>
>>>> `ptr` should be `NonNull<u8>`.
>>>
>>> Creating a `NonNull` from a raw pointer is an extra operation for any user of
>>> `free` and given that all `free` functions in the kernel accept a NULL pointer,
>>> I think there is not much value in making this `NonNull`.
>>
>> I don't think that this argument holds for Rust though. For example,
>> `KBox` contains a `Unique` that contains a `NonNull`, so freeing could
>> just be done with `free(self.0.0)`.
> 
> Agreed, we can indeed make it a `&NonNull<u8>`. However, I find this a bit

I think you mean `NonNull<u8>`, right?

> inconsistent with the signature of `realloc`.
> 
> Should we go with separate `shrink` / `grow`, `free` could be implemented as
> shrinking to zero and allowing a NULL pointer makes not much sense.
> 
> But as mentioned, I'm not yet seeing the benefit of having `realloc` split into
> `grow` and `shrink`.

I would not split it into grow/shrink. I am not sure what exactly would
be best here, but here is what I am trying to achieve:
- people should strongly prefer alloc/free over realloc,
- calling realloc with zero size should not signify freeing the memory,
  but rather resizing the allocation to 0. E.g. because a buffer now
  decides to hold zero elements (in this case, the size should be a
  variable that just happens to be zero).
- calling realloc with a null pointer should not be necessary, since
  `alloc` exists.

This is to improve readability of code, or do you find

    realloc(ptr, 0, Layout::new::<()>(), Flags(0))

more readable than
    
    free(ptr)

>>>>> +        // SAFETY: `ptr` is guaranteed to be previously allocated with this `Allocator` or NULL.
>>>>> +        // Calling `realloc` with a buffer size of zero, frees the buffer `ptr` points to.
>>>>> +        let _ = unsafe { self.realloc(ptr, 0, Layout::new::<()>(), Flags(0)) };
>>>>
>>>> Why does the implementer have to guarantee this?
>>>
>>> Who else can guarantee this?
>>
>> Only the implementer yes. But they are not forced to do this i.e.
>> nothing in the safety requirements of `Allocator` prevents me from doing
>> a no-op on reallocating to a zero size.
> 
> Ah, I see now, this is the same as your comment on the documentation of
> `realloc`. So, this indeed just about missing a safety comment.
> 
>>
>>>>> +    }
>>>>> +}
>>>>> --
>>>>> 2.45.2
>>>>>
>>>>
>>>> More general questions:
>>>> - are there functions in the kernel to efficiently allocate zeroed
>>>>   memory? In that case, the Allocator trait should also have methods
>>>>   that do that (with a iterating default impl).
>>>
>>> We do this with GFP flags. In particular, you can pass GFP_ZERO to `alloc` and
>>> `realloc` to get zeroed memory. Hence, I think having dedicated functions that
>>> just do "flags | GFP_ZERO" would not add much value.
>>
>> Ah right, no in that case, we don't need it.
>>
>>>> - I am not sure putting everything into the single realloc function is a
>>>>   good idea, I like the grow/shrink methods of the std allocator. Is
>>>>   there a reason aside from concentrating the impl to go for only a
>>>>   single realloc function?
>>>
>>> Yes, `krealloc()` already provides exactly the described behaviour. See the
>>> implementation of `Kmalloc`.
>>
>> But `kvmalloc` does not and neither does `vmalloc`. I would prefer
>> multiple smaller functions over one big one in this case.
> 
> What I forsee is that:
> 
>   - `alloc` becomes a `grow` from zero.
>   - `free` becomes a `shrink` to zero.
>   - `grow` and `shrink` become a `realloc` alias,
>      because they're almost the same
> 
> Wouldn't this just put us were we already are, effectively?

We could have a NonNull parameter for realloc and discourage calling
realloc for freeing.

---
Cheers,
Benno

Re: [PATCH 01/20] rust: alloc: add `Allocator` trait

Posted by Danilo Krummrich 1 year, 7 months ago

On Sat, Jul 06, 2024 at 05:08:26PM +0000, Benno Lossin wrote:
> On 06.07.24 17:11, Danilo Krummrich wrote:
> > On Sat, Jul 06, 2024 at 01:17:19PM +0000, Benno Lossin wrote:
> >> On 06.07.24 13:05, Danilo Krummrich wrote:
> >>> On Sat, Jul 06, 2024 at 10:33:49AM +0000, Benno Lossin wrote:
> >>>> Similarly, it might also be a good idea to let the implementer specify a
> >>>> custom error type.
> >>>
> >>> Same here, why?
> >>
> >> In this case the argument is weaker, but it could allow us to implement
> >> an allocator with `Error = Infallible`, to statically guarantee
> >> allocation (e.g. when using GFP_ATOMIC). But at the moment there is no
> >> user.
> > 
> > GFP_ATOMIC can fail, I guess you mean __GFP_NOFAIL.
> > 
> > Not really sure how this would work other than with separate `alloc_nofail` and
> > `realloc_nofail` functions?
> 
> You could have an Allocator that always enables __GFP_NOFAIL, so the
> error type could be Infallible. But this doesn't seem that useful at the
> moment, so keep the AllocError as is.
> 
> >>>>> +        // SAFETY: Passing a NULL pointer to `realloc` is valid by it's safety requirements and asks
> >>>>> +        // for a new memory allocation.
> >>>>> +        unsafe { self.realloc(ptr::null_mut(), 0, layout, flags) }
> >>>>> +    }
> >>>>> +
> >>>>> +    /// Re-allocate an existing memory allocation to satisfy the requested `layout`. If the
> >>>>> +    /// requested size is zero, `realloc` behaves equivalent to `free`.
> >>>>
> >>>> This is not guaranteed by the implementation.
> >>>
> >>> Not sure what exactly you mean? Is it about "satisfy" again?
> >>
> >> If the requested size is zero, the implementation could also leak the
> >> memory, nothing prevents me from implementing such an Allocator.
> > 
> > Well, hopefully the documentation stating that `realloc` must be implemented
> > this exact way prevents you from doing otherwise. :-)
> > 
> > Please let me know if I need to document this in a different way if it's not
> > sufficient as it is.
> 
> It should be part of the safety requirements of the Allocator trait.

Makes sense, gonna add it.

> 
> >>>>> +    ///
> >>>>> +    /// If the requested size is larger than `old_size`, a successful call to `realloc` guarantees
> >>>>> +    /// that the new or grown buffer has at least `Layout::size` bytes, but may also be larger.
> >>>>> +    ///
> >>>>> +    /// If the requested size is smaller than `old_size`, `realloc` may or may not shrink the
> >>>>> +    /// buffer; this is implementation specific to the allocator.
> >>>>> +    ///
> >>>>> +    /// On allocation failure, the existing buffer, if any, remains valid.
> >>>>> +    ///
> >>>>> +    /// The buffer is represented as `NonNull<[u8]>`.
> >>>>> +    ///
> >>>>> +    /// # Safety
> >>>>> +    ///
> >>>>> +    /// `ptr` must point to an existing and valid memory allocation created by this allocator
> >>>>> +    /// instance of a size of at least `old_size`.
> >>>>> +    ///
> >>>>> +    /// Additionally, `ptr` is allowed to be a NULL pointer; in this case a new memory allocation is
> >>>>> +    /// created.
> >>>>> +    unsafe fn realloc(
> >>>>> +        &self,
> >>>>> +        ptr: *mut u8,
> >>>>> +        old_size: usize,
> >>>>
> >>>> Why not request the old layout like the std Allocator's grow/shrink
> >>>> functions do?
> >>>
> >>> Because we only care about the size that needs to be preserved when growing the
> >>> buffer. The `alignment` field of `Layout` would be wasted.
> >>
> >> In the std Allocator they specified an old layout. This is probably
> >> because of the following: if `Layout` is ever extended to hold another
> >> property that would need to be updated, the signatures are already
> >> correct.
> >> In our case we could change it tree-wide, so I guess we could fix that
> >> issue when it comes up.
> > 
> > Yes, I think so too.
> > 
> >>
> >>>>> +        layout: Layout,
> >>>>> +        flags: Flags,
> >>>>> +    ) -> Result<NonNull<[u8]>, AllocError>;
> >>>>> +
> >>>>> +    /// Free an existing memory allocation.
> >>>>> +    ///
> >>>>> +    /// # Safety
> >>>>> +    ///
> >>>>> +    /// `ptr` must point to an existing and valid memory allocation created by this `Allocator`
> >>>>> +    /// instance.
> >>>>> +    unsafe fn free(&self, ptr: *mut u8) {
> >>>>
> >>>> `ptr` should be `NonNull<u8>`.
> >>>
> >>> Creating a `NonNull` from a raw pointer is an extra operation for any user of
> >>> `free` and given that all `free` functions in the kernel accept a NULL pointer,
> >>> I think there is not much value in making this `NonNull`.
> >>
> >> I don't think that this argument holds for Rust though. For example,
> >> `KBox` contains a `Unique` that contains a `NonNull`, so freeing could
> >> just be done with `free(self.0.0)`.
> > 
> > Agreed, we can indeed make it a `&NonNull<u8>`. However, I find this a bit
> 
> I think you mean `NonNull<u8>`, right?

Yes, but I still don't see how that improves things, e.g. in `Drop` of
`KVec`:

`A::free(self.ptr.to_non_null().cast())`

vs.

`A::free(self.as_mut_ptr().cast())`

I'm not against this change, but I don't see how this makes things better.

> 
> > inconsistent with the signature of `realloc`.
> > 
> > Should we go with separate `shrink` / `grow`, `free` could be implemented as
> > shrinking to zero and allowing a NULL pointer makes not much sense.
> > 
> > But as mentioned, I'm not yet seeing the benefit of having `realloc` split into
> > `grow` and `shrink`.
> 
> I would not split it into grow/shrink. I am not sure what exactly would
> be best here, but here is what I am trying to achieve:
> - people should strongly prefer alloc/free over realloc,

I agree; the functions for that are there: `Allocator::alloc` and
`Allocator::free`.

`KBox` uses both of them, `KVec` instead, for obvious reasons, uses
`Allocator::realloc` directly to grow from zero and `Allocator::free`.

> - calling realloc with zero size should not signify freeing the memory,
>   but rather resizing the allocation to 0. E.g. because a buffer now
>   decides to hold zero elements (in this case, the size should be a
>   variable that just happens to be zero).

If a buffer is forced to a new size of zero, isn't that effectively a free?

At least that's exactly what the kernel does, if we ask krealloc() to resize to
zero it will free the memory and return ZERO_SIZE_PTR.

So, what exactly would you want `realloc` to do when a size of zero is passed
in?

> - calling realloc with a null pointer should not be necessary, since
>   `alloc` exists.

But `alloc` calls `realloc` with a NULL pointer to allocate new memory.

Let's take `Kmalloc` as example, surely I could implement `alloc` by calling
into kmalloc() instead. But then we'd have to implement `alloc` for all
allocators, instead of having a generic `alloc`.

And I wonder what's the point given that `realloc` with a NULL pointer already
does this naturally? Besides that, it comes in handy when we want to allocate
memory for data structures that grow from zero, such as `KVec`.

> 
> This is to improve readability of code, or do you find
> 
>     realloc(ptr, 0, Layout::new::<()>(), Flags(0))
> 
> more readable than
>     
>     free(ptr)

No, but that's not what users of allocators would do. They'd just call `free`,
as I do in `KBox` and `KVec`.

> 
> >>>>> +        // SAFETY: `ptr` is guaranteed to be previously allocated with this `Allocator` or NULL.
> >>>>> +        // Calling `realloc` with a buffer size of zero, frees the buffer `ptr` points to.
> >>>>> +        let _ = unsafe { self.realloc(ptr, 0, Layout::new::<()>(), Flags(0)) };
> >>>>
> >>>> Why does the implementer have to guarantee this?
> >>>
> >>> Who else can guarantee this?
> >>
> >> Only the implementer yes. But they are not forced to do this i.e.
> >> nothing in the safety requirements of `Allocator` prevents me from doing
> >> a no-op on reallocating to a zero size.
> > 
> > Ah, I see now, this is the same as your comment on the documentation of
> > `realloc`. So, this indeed just about missing a safety comment.
> > 
> >>
> >>>>> +    }
> >>>>> +}
> >>>>> --
> >>>>> 2.45.2
> >>>>>
> >>>>
> >>>> More general questions:
> >>>> - are there functions in the kernel to efficiently allocate zeroed
> >>>>   memory? In that case, the Allocator trait should also have methods
> >>>>   that do that (with a iterating default impl).
> >>>
> >>> We do this with GFP flags. In particular, you can pass GFP_ZERO to `alloc` and
> >>> `realloc` to get zeroed memory. Hence, I think having dedicated functions that
> >>> just do "flags | GFP_ZERO" would not add much value.
> >>
> >> Ah right, no in that case, we don't need it.
> >>
> >>>> - I am not sure putting everything into the single realloc function is a
> >>>>   good idea, I like the grow/shrink methods of the std allocator. Is
> >>>>   there a reason aside from concentrating the impl to go for only a
> >>>>   single realloc function?
> >>>
> >>> Yes, `krealloc()` already provides exactly the described behaviour. See the
> >>> implementation of `Kmalloc`.
> >>
> >> But `kvmalloc` does not and neither does `vmalloc`. I would prefer
> >> multiple smaller functions over one big one in this case.
> > 
> > What I forsee is that:
> > 
> >   - `alloc` becomes a `grow` from zero.
> >   - `free` becomes a `shrink` to zero.
> >   - `grow` and `shrink` become a `realloc` alias,
> >      because they're almost the same
> > 
> > Wouldn't this just put us were we already are, effectively?
> 
> We could have a NonNull parameter for realloc and discourage calling
> realloc for freeing.

But what does this get us, other than that we have to implement `alloc` and
`free` explicitly for every allocator?

Also, as mentioned above, `realloc` taking a NULL pointer for a new allocation
is useful for growing structures that start from zero, such as `KVec`.

> 
> ---
> Cheers,
> Benno
>

Re: [PATCH 01/20] rust: alloc: add `Allocator` trait

Posted by Benno Lossin 1 year, 7 months ago

On 06.07.24 20:47, Danilo Krummrich wrote:
> On Sat, Jul 06, 2024 at 05:08:26PM +0000, Benno Lossin wrote:
>> On 06.07.24 17:11, Danilo Krummrich wrote:
>>> On Sat, Jul 06, 2024 at 01:17:19PM +0000, Benno Lossin wrote:
>>>> On 06.07.24 13:05, Danilo Krummrich wrote:
>>>>>>> +        layout: Layout,
>>>>>>> +        flags: Flags,
>>>>>>> +    ) -> Result<NonNull<[u8]>, AllocError>;
>>>>>>> +
>>>>>>> +    /// Free an existing memory allocation.
>>>>>>> +    ///
>>>>>>> +    /// # Safety
>>>>>>> +    ///
>>>>>>> +    /// `ptr` must point to an existing and valid memory allocation created by this `Allocator`
>>>>>>> +    /// instance.
>>>>>>> +    unsafe fn free(&self, ptr: *mut u8) {
>>>>>>
>>>>>> `ptr` should be `NonNull<u8>`.
>>>>>
>>>>> Creating a `NonNull` from a raw pointer is an extra operation for any user of
>>>>> `free` and given that all `free` functions in the kernel accept a NULL pointer,
>>>>> I think there is not much value in making this `NonNull`.
>>>>
>>>> I don't think that this argument holds for Rust though. For example,
>>>> `KBox` contains a `Unique` that contains a `NonNull`, so freeing could
>>>> just be done with `free(self.0.0)`.
>>>
>>> Agreed, we can indeed make it a `&NonNull<u8>`. However, I find this a bit
>>
>> I think you mean `NonNull<u8>`, right?
> 
> Yes, but I still don't see how that improves things, e.g. in `Drop` of
> `KVec`:
> 
> `A::free(self.ptr.to_non_null().cast())`
> 
> vs.
> 
> `A::free(self.as_mut_ptr().cast())`
> 
> I'm not against this change, but I don't see how this makes things better.

Ah you still need to convert the `Unique<T>` to a pointer...
But we could have a trait that allows that conversion. Additionally, we
could get rid of the cast if we made the function generic.

>>> inconsistent with the signature of `realloc`.
>>>
>>> Should we go with separate `shrink` / `grow`, `free` could be implemented as
>>> shrinking to zero and allowing a NULL pointer makes not much sense.
>>>
>>> But as mentioned, I'm not yet seeing the benefit of having `realloc` split into
>>> `grow` and `shrink`.
>>
>> I would not split it into grow/shrink. I am not sure what exactly would
>> be best here, but here is what I am trying to achieve:
>> - people should strongly prefer alloc/free over realloc,
> 
> I agree; the functions for that are there: `Allocator::alloc` and
> `Allocator::free`.
> 
> `KBox` uses both of them, `KVec` instead, for obvious reasons, uses
> `Allocator::realloc` directly to grow from zero and `Allocator::free`.
> 
>> - calling realloc with zero size should not signify freeing the memory,
>>   but rather resizing the allocation to 0. E.g. because a buffer now
>>   decides to hold zero elements (in this case, the size should be a
>>   variable that just happens to be zero).
> 
> If a buffer is forced to a new size of zero, isn't that effectively a free?

I would argue that they are different, since you get a pointer back that
points to an allocation of zero size. A vector of size zero still keeps
around a (dangling) pointer.
You also can free a zero sized allocation (it's a no-op), but you must
not free an allocation twice.

> At least that's exactly what the kernel does, if we ask krealloc() to resize to
> zero it will free the memory and return ZERO_SIZE_PTR.

Not every allocator behaves like krealloc, in your patch, both vmalloc
and kvmalloc are implemented with `if`s to check for the various special
cases.

> So, what exactly would you want `realloc` to do when a size of zero is passed
> in?

I don't want to change the behavior, I want to prevent people from using
it unnecessarily.

>> - calling realloc with a null pointer should not be necessary, since
>>   `alloc` exists.
> 
> But `alloc` calls `realloc` with a NULL pointer to allocate new memory.
> 
> Let's take `Kmalloc` as example, surely I could implement `alloc` by calling
> into kmalloc() instead. But then we'd have to implement `alloc` for all
> allocators, instead of having a generic `alloc`.

My intuition is telling me that I don't like that you can pass null to
realloc. I can't put my finger on exactly why that is, maybe because
there isn't actually any argument here or maybe there is. I'd like to
hear other people's opinion.

> And I wonder what's the point given that `realloc` with a NULL pointer already
> does this naturally? Besides that, it comes in handy when we want to allocate
> memory for data structures that grow from zero, such as `KVec`.

You can just `alloc` with size zero and then call `realloc` with the
pointer that you got. I don't see how this would be a problem.

>> This is to improve readability of code, or do you find
>>
>>     realloc(ptr, 0, Layout::new::<()>(), Flags(0))
>>
>> more readable than
>>
>>     free(ptr)
> 
> No, but that's not what users of allocators would do. They'd just call `free`,
> as I do in `KBox` and `KVec`.

I agree that we have to free the memory when supplying a zero size, but
I don't like adding additional features to `realloc`.

Conceptually, I see an allocator like this:

    pub trait Allocator {
        type Flags;
        type Allocation;
        type Error;
    
        fn alloc(layout: Layout, flags: Self::Flags) -> Result<Self::Allocation, Self::Error>;
    
        fn realloc(
            alloc: Self::Allocation,
            old: Layout,
            new: Layout,
            flags: Self::Flags,
        ) -> Result<Self::Allocation, (Self::Allocation, Self::Error)>;
    
        fn free(alloc: Self::Allocation);
    }

I.e. to reallocate something, you first have to have something
allocated.

For some reason if we use `Option<NonNull<u8>>` instead of `*mut u8`, I
have a better feeling, but that might be worse for ergonomics...

---
Cheers,
Benno

Re: [PATCH 01/20] rust: alloc: add `Allocator` trait

Posted by Danilo Krummrich 1 year, 7 months ago

On Mon, Jul 08, 2024 at 08:12:34AM +0000, Benno Lossin wrote:
> On 06.07.24 20:47, Danilo Krummrich wrote:
> > On Sat, Jul 06, 2024 at 05:08:26PM +0000, Benno Lossin wrote:
> >> On 06.07.24 17:11, Danilo Krummrich wrote:
> >>> On Sat, Jul 06, 2024 at 01:17:19PM +0000, Benno Lossin wrote:
> >>>> On 06.07.24 13:05, Danilo Krummrich wrote:
> >>>>>>> +        layout: Layout,
> >>>>>>> +        flags: Flags,
> >>>>>>> +    ) -> Result<NonNull<[u8]>, AllocError>;
> >>>>>>> +
> >>>>>>> +    /// Free an existing memory allocation.
> >>>>>>> +    ///
> >>>>>>> +    /// # Safety
> >>>>>>> +    ///
> >>>>>>> +    /// `ptr` must point to an existing and valid memory allocation created by this `Allocator`
> >>>>>>> +    /// instance.
> >>>>>>> +    unsafe fn free(&self, ptr: *mut u8) {
> >>>>>>
> >>>>>> `ptr` should be `NonNull<u8>`.
> >>>>>
> >>>>> Creating a `NonNull` from a raw pointer is an extra operation for any user of
> >>>>> `free` and given that all `free` functions in the kernel accept a NULL pointer,
> >>>>> I think there is not much value in making this `NonNull`.
> >>>>
> >>>> I don't think that this argument holds for Rust though. For example,
> >>>> `KBox` contains a `Unique` that contains a `NonNull`, so freeing could
> >>>> just be done with `free(self.0.0)`.
> >>>
> >>> Agreed, we can indeed make it a `&NonNull<u8>`. However, I find this a bit
> >>
> >> I think you mean `NonNull<u8>`, right?
> > 
> > Yes, but I still don't see how that improves things, e.g. in `Drop` of
> > `KVec`:
> > 
> > `A::free(self.ptr.to_non_null().cast())`
> > 
> > vs.
> > 
> > `A::free(self.as_mut_ptr().cast())`
> > 
> > I'm not against this change, but I don't see how this makes things better.
> 
> Ah you still need to convert the `Unique<T>` to a pointer...
> But we could have a trait that allows that conversion. Additionally, we
> could get rid of the cast if we made the function generic.

I'm still not convinced of the `NonNull`, since technically it's not required,
but using a generic could indeed get us rid of all the `cast` calls - same for
`realloc` I guess.

> 
> >>> inconsistent with the signature of `realloc`.
> >>>
> >>> Should we go with separate `shrink` / `grow`, `free` could be implemented as
> >>> shrinking to zero and allowing a NULL pointer makes not much sense.
> >>>
> >>> But as mentioned, I'm not yet seeing the benefit of having `realloc` split into
> >>> `grow` and `shrink`.
> >>
> >> I would not split it into grow/shrink. I am not sure what exactly would
> >> be best here, but here is what I am trying to achieve:
> >> - people should strongly prefer alloc/free over realloc,
> > 
> > I agree; the functions for that are there: `Allocator::alloc` and
> > `Allocator::free`.
> > 
> > `KBox` uses both of them, `KVec` instead, for obvious reasons, uses
> > `Allocator::realloc` directly to grow from zero and `Allocator::free`.
> > 
> >> - calling realloc with zero size should not signify freeing the memory,
> >>   but rather resizing the allocation to 0. E.g. because a buffer now
> >>   decides to hold zero elements (in this case, the size should be a
> >>   variable that just happens to be zero).
> > 
> > If a buffer is forced to a new size of zero, isn't that effectively a free?
> 
> I would argue that they are different, since you get a pointer back that
> points to an allocation of zero size. A vector of size zero still keeps
> around a (dangling) pointer.
> You also can free a zero sized allocation (it's a no-op), but you must
> not free an allocation twice.

I don't think this is contradictive. For instance, if you call krealloc() with
a size of zero, it will free the existing allocation and return ZERO_SIZE_PTR,
which, effectively, can be seen as zero sized allocation. You can also pass
ZERO_SIZE_PTR to free(), but, of course, it doens't do anything, hence, as you
said, it's a no-op.

> 
> > At least that's exactly what the kernel does, if we ask krealloc() to resize to
> > zero it will free the memory and return ZERO_SIZE_PTR.
> 
> Not every allocator behaves like krealloc, in your patch, both vmalloc
> and kvmalloc are implemented with `if`s to check for the various special
> cases.

Well, for `Vmalloc` there simply is no vrealloc() on the C side...

For `KVmalloc` there is indeed a kvrealloc(), but it behaves different than
krealloc(), which seems inconsistant. Also note that kvrealloc() has a hand full
of users only.

My plan was to stick with the logic that krealloc() uses (because I think that's
what it should be) and, once we land this series, align kvrealloc() and get a
vrealloc() on the C side in place. We could do this in the scope of this series
as well, but I think this series is huge enough and separating this out makes
things much easier.

So, my goal would be that `Vmalloc` and `KVmalloc` look exactly like `Kmalloc`
looks right now in the end.

> 
> > So, what exactly would you want `realloc` to do when a size of zero is passed
> > in?
> 
> I don't want to change the behavior, I want to prevent people from using
> it unnecessarily.

I'm not overly worried about this. In fact, the C side proves that krealloc()
isn't really prone to be used where kmalloc() or free() can be used instead. And
sometimes it makes sense, `KVec` is a good example for that.

Besides that, who do we expect to use this API? In Rust most use cases should be
covered by `KBox` and `KVec` anyways. I don't expect much direct users of this
API.

> 
> >> - calling realloc with a null pointer should not be necessary, since
> >>   `alloc` exists.
> > 
> > But `alloc` calls `realloc` with a NULL pointer to allocate new memory.
> > 
> > Let's take `Kmalloc` as example, surely I could implement `alloc` by calling
> > into kmalloc() instead. But then we'd have to implement `alloc` for all
> > allocators, instead of having a generic `alloc`.
> 
> My intuition is telling me that I don't like that you can pass null to
> realloc. I can't put my finger on exactly why that is, maybe because
> there isn't actually any argument here or maybe there is. I'd like to
> hear other people's opinion.
> 
> > And I wonder what's the point given that `realloc` with a NULL pointer already
> > does this naturally? Besides that, it comes in handy when we want to allocate
> > memory for data structures that grow from zero, such as `KVec`.
> 
> You can just `alloc` with size zero and then call `realloc` with the
> pointer that you got. I don't see how this would be a problem.

In contrast to that, I don't see why calling two functions where just one would
be sufficient makes anything better.

> 
> >> This is to improve readability of code, or do you find
> >>
> >>     realloc(ptr, 0, Layout::new::<()>(), Flags(0))
> >>
> >> more readable than
> >>
> >>     free(ptr)
> > 
> > No, but that's not what users of allocators would do. They'd just call `free`,
> > as I do in `KBox` and `KVec`.
> 
> I agree that we have to free the memory when supplying a zero size, but
> I don't like adding additional features to `realloc`.

So, if you agree that we have to free the memory when supplying a zero size, why
would it be an additional feature then?

Besides that, as mentioned above, krealloc() already does that and kvrealloc()
should be fixed to be consistent with that.

> 
> Conceptually, I see an allocator like this:
> 
>     pub trait Allocator {
>         type Flags;

Agreed.

>         type Allocation;

What do we need this for?

I already thought about having some kind of `Allocation` type, also with a
drop() trait freeing the memory etc. But, I came to the conclusion that this is
likely to be overkill. We have `KBox` and `KVec` to take care of managing their
memory.

>         type Error;

I'm not worried about adding this, but maybe it makes sense to wait until we
actually implement somthing that needs this.

>     
>         fn alloc(layout: Layout, flags: Self::Flags) -> Result<Self::Allocation, Self::Error>;
>     
>         fn realloc(
>             alloc: Self::Allocation,
>             old: Layout,

We only really care about the size here. `Alignment` would be wasted. Are we
keen taking this downside for a bit more consistency?

>             new: Layout,
>             flags: Self::Flags,
>         ) -> Result<Self::Allocation, (Self::Allocation, Self::Error)>;
>     
>         fn free(alloc: Self::Allocation);
>     }
> 
> I.e. to reallocate something, you first have to have something
> allocated.

See the discussion above for this point.

> 
> For some reason if we use `Option<NonNull<u8>>` instead of `*mut u8`, I
> have a better feeling, but that might be worse for ergonomics...

As argument for `realloc`?

I get your point here. And for any other API I'd probably agree instantly.

Generally, `Allocator` is a bit special to me. It's not supposed to be used by a
lot of users, other than types like `KBox` or `KVec`, that are supposed to
manage the memory and the interface is pretty unsafe by definition. To me
keeping a rather "raw" access to krealloc() and friends seems to be desirable
here.

> 
> ---
> Cheers,
> Benno
>