[PATCH 09/20] rust: types: implement `Unique<T>`

Danilo Krummrich posted 20 patches 1 year, 7 months ago
There is a newer version of this series
[PATCH 09/20] rust: types: implement `Unique<T>`
Posted by Danilo Krummrich 1 year, 7 months ago
Implement the `Unique` type as a prerequisite for `KBox` and `Kvec`
introduced in subsequent patches.

`Unique` serves as wrapper around a `NonNull`, but indicates that the
possessor of this wrapper owns the referent.

This type already exists in Rust's core library, but, unfortunately, is
exposed as unstable API and hence shouldn't be used in the kernel.

This implementation of `Unique` is almost identical, but mostly stripped
down to the functionality we need for `KBox` and `KVec`. Additionally,
all unstable features are removed and / or replaced by stable ones.

Signed-off-by: Danilo Krummrich <dakr@redhat.com>
---
 rust/kernel/types.rs | 176 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 176 insertions(+)

diff --git a/rust/kernel/types.rs b/rust/kernel/types.rs
index 2e7c9008621f..281327ea2932 100644
--- a/rust/kernel/types.rs
+++ b/rust/kernel/types.rs
@@ -409,3 +409,179 @@ pub enum Either<L, R> {
     /// Constructs an instance of [`Either`] containing a value of type `R`.
     Right(R),
 }
+
+/// A wrapper around a raw non-null `*mut T` that indicates that the possessor
+/// of this wrapper owns the referent. Useful for building abstractions like
+/// `Box<T>`, `Vec<T>`, `String`, and `HashMap<K, V>`.
+///
+/// Unlike `*mut T`, `Unique<T>` behaves "as if" it were an instance of `T`.
+/// It implements `Send`/`Sync` if `T` is `Send`/`Sync`. It also implies
+/// the kind of strong aliasing guarantees an instance of `T` can expect:
+/// the referent of the pointer should not be modified without a unique path to
+/// its owning Unique.
+///
+/// If you're uncertain of whether it's correct to use `Unique` for your purposes,
+/// consider using `NonNull`, which has weaker semantics.
+///
+/// Unlike `*mut T`, the pointer must always be non-null, even if the pointer
+/// is never dereferenced. This is so that enums may use this forbidden value
+/// as a discriminant -- `Option<Unique<T>>` has the same size as `Unique<T>`.
+/// However the pointer may still dangle if it isn't dereferenced.
+///
+/// Unlike `*mut T`, `Unique<T>` is covariant over `T`. This should always be correct
+/// for any type which upholds Unique's aliasing requirements.
+#[repr(transparent)]
+pub struct Unique<T: ?Sized> {
+    pointer: NonNull<T>,
+    // NOTE: this marker has no consequences for variance, but is necessary
+    // for dropck to understand that we logically own a `T`.
+    //
+    // For details, see:
+    // https://github.com/rust-lang/rfcs/blob/master/text/0769-sound-generic-drop.md#phantom-data
+    _marker: PhantomData<T>,
+}
+
+/// `Unique` pointers are `Send` if `T` is `Send` because the data they
+/// reference is unaliased. Note that this aliasing invariant is
+/// unenforced by the type system; the abstraction using the
+/// `Unique` must enforce it.
+unsafe impl<T: Send + ?Sized> Send for Unique<T> {}
+
+/// `Unique` pointers are `Sync` if `T` is `Sync` because the data they
+/// reference is unaliased. Note that this aliasing invariant is
+/// unenforced by the type system; the abstraction using the
+/// `Unique` must enforce it.
+unsafe impl<T: Sync + ?Sized> Sync for Unique<T> {}
+
+impl<T: Sized> Unique<T> {
+    /// Creates a new `Unique` that is dangling, but well-aligned.
+    ///
+    /// This is useful for initializing types which lazily allocate, like
+    /// `Vec::new` does.
+    ///
+    /// Note that the pointer value may potentially represent a valid pointer to
+    /// a `T`, which means this must not be used as a "not yet initialized"
+    /// sentinel value. Types that lazily allocate must track initialization by
+    /// some other means.
+    #[must_use]
+    #[inline]
+    pub const fn dangling() -> Self {
+        Unique {
+            pointer: NonNull::dangling(),
+            _marker: PhantomData,
+        }
+    }
+}
+
+impl<T: ?Sized> Unique<T> {
+    /// Creates a new `Unique`.
+    ///
+    /// # Safety
+    ///
+    /// `ptr` must be non-null.
+    #[inline]
+    pub const unsafe fn new_unchecked(ptr: *mut T) -> Self {
+        // SAFETY: the caller must guarantee that `ptr` is non-null.
+        unsafe {
+            Unique {
+                pointer: NonNull::new_unchecked(ptr),
+                _marker: PhantomData,
+            }
+        }
+    }
+
+    /// Creates a new `Unique` if `ptr` is non-null.
+    #[allow(clippy::manual_map)]
+    #[inline]
+    pub fn new(ptr: *mut T) -> Option<Self> {
+        if let Some(pointer) = NonNull::new(ptr) {
+            Some(Unique {
+                pointer,
+                _marker: PhantomData,
+            })
+        } else {
+            None
+        }
+    }
+
+    /// Acquires the underlying `*mut` pointer.
+    #[must_use = "`self` will be dropped if the result is not used"]
+    #[inline]
+    pub const fn as_ptr(self) -> *mut T {
+        self.pointer.as_ptr()
+    }
+
+    /// Dereferences the content.
+    ///
+    /// The resulting lifetime is bound to self so this behaves "as if"
+    /// it were actually an instance of T that is getting borrowed. If a longer
+    /// (unbound) lifetime is needed, use `&*my_ptr.as_ptr()`.
+    ///
+    /// # Safety
+    ///
+    /// Safety requirements for this function are inherited from [NonNull::as_ref].
+    ///
+    #[must_use]
+    #[inline]
+    pub const unsafe fn as_ref(&self) -> &T {
+        // SAFETY: the caller must guarantee that `self` meets all the
+        // requirements for a reference.
+        unsafe { self.pointer.as_ref() }
+    }
+
+    /// Mutably dereferences the content.
+    ///
+    /// The resulting lifetime is bound to self so this behaves "as if"
+    /// it were actually an instance of T that is getting borrowed. If a longer
+    /// (unbound) lifetime is needed, use `&mut *my_ptr.as_ptr()`.
+    ///
+    /// # Safety
+    ///
+    /// Safety requirements for this function are inherited from [NonNull::as_mut].
+    #[must_use]
+    #[inline]
+    pub unsafe fn as_mut(&mut self) -> &mut T {
+        // SAFETY: the caller must guarantee that `self` meets all the
+        // requirements for a mutable reference.
+        unsafe { self.pointer.as_mut() }
+    }
+
+    /// Casts to a pointer of another type.
+    #[must_use = "`self` will be dropped if the result is not used"]
+    #[inline]
+    pub fn cast<U>(self) -> Unique<U> {
+        Unique::from(self.pointer.cast())
+    }
+}
+
+impl<T: ?Sized> Clone for Unique<T> {
+    #[inline]
+    fn clone(&self) -> Self {
+        *self
+    }
+}
+
+impl<T: ?Sized> Copy for Unique<T> {}
+
+impl<T: ?Sized> From<&mut T> for Unique<T> {
+    /// Converts a `&mut T` to a `Unique<T>`.
+    ///
+    /// This conversion is infallible since references cannot be null.
+    #[inline]
+    fn from(reference: &mut T) -> Self {
+        Self::from(NonNull::from(reference))
+    }
+}
+
+impl<T: ?Sized> From<NonNull<T>> for Unique<T> {
+    /// Converts a `NonNull<T>` to a `Unique<T>`.
+    ///
+    /// This conversion is infallible since `NonNull` cannot be null.
+    #[inline]
+    fn from(pointer: NonNull<T>) -> Self {
+        Unique {
+            pointer,
+            _marker: PhantomData,
+        }
+    }
+}
-- 
2.45.2
Re: [PATCH 09/20] rust: types: implement `Unique<T>`
Posted by Benno Lossin 1 year, 7 months ago
On 04.07.24 19:06, Danilo Krummrich wrote:
> Implement the `Unique` type as a prerequisite for `KBox` and `Kvec`
> introduced in subsequent patches.
> 
> `Unique` serves as wrapper around a `NonNull`, but indicates that the
> possessor of this wrapper owns the referent.
> 
> This type already exists in Rust's core library, but, unfortunately, is
> exposed as unstable API and hence shouldn't be used in the kernel.

It's not really exposed, as the feature (ptr_internals) is an internal
unstable feature and thus probably perma-unstable.

(but your argument still holds, this just means that we *really* should
not be using it)

> This implementation of `Unique` is almost identical, but mostly stripped
> down to the functionality we need for `KBox` and `KVec`. Additionally,
> all unstable features are removed and / or replaced by stable ones.
> 
> Signed-off-by: Danilo Krummrich <dakr@redhat.com>
> ---
>  rust/kernel/types.rs | 176 +++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 176 insertions(+)
> 
> diff --git a/rust/kernel/types.rs b/rust/kernel/types.rs
> index 2e7c9008621f..281327ea2932 100644
> --- a/rust/kernel/types.rs
> +++ b/rust/kernel/types.rs
> @@ -409,3 +409,179 @@ pub enum Either<L, R> {
>      /// Constructs an instance of [`Either`] containing a value of type `R`.
>      Right(R),
>  }
> +
> +/// A wrapper around a raw non-null `*mut T` that indicates that the possessor
> +/// of this wrapper owns the referent. Useful for building abstractions like
> +/// `Box<T>`, `Vec<T>`, `String`, and `HashMap<K, V>`.
> +///
> +/// Unlike `*mut T`, `Unique<T>` behaves "as if" it were an instance of `T`.
> +/// It implements `Send`/`Sync` if `T` is `Send`/`Sync`. It also implies
> +/// the kind of strong aliasing guarantees an instance of `T` can expect:
> +/// the referent of the pointer should not be modified without a unique path to
> +/// its owning Unique.
> +///
> +/// If you're uncertain of whether it's correct to use `Unique` for your purposes,
> +/// consider using `NonNull`, which has weaker semantics.
> +///
> +/// Unlike `*mut T`, the pointer must always be non-null, even if the pointer
> +/// is never dereferenced. This is so that enums may use this forbidden value
> +/// as a discriminant -- `Option<Unique<T>>` has the same size as `Unique<T>`.
> +/// However the pointer may still dangle if it isn't dereferenced.
> +///
> +/// Unlike `*mut T`, `Unique<T>` is covariant over `T`. This should always be correct
> +/// for any type which upholds Unique's aliasing requirements.

Since you copied this directly from the stdlib, should this be citing
it?

> +#[repr(transparent)]
> +pub struct Unique<T: ?Sized> {
> +    pointer: NonNull<T>,

Gary and I already mentioned this in the meeting, but I will put it here
for the record:

The `Unique` from std is special, in the sense that the Rust compiler
will emit the `noalias` LLVM attribute.

This gives std's `Box` type a possible performance advantage (IIRC Gary
had a compiler explorer example that showed different instruction
count).

I thought that we could mimic this behavior if we would change the type
of `pointer` to `ManuallyDrop<Box<T>>`. But that only works if the
pointer is always valid, since otherwise we can't call `Box::from_raw`.

So currently I don't see a workaround that would give us the noalias
attribute back.

It would be a good idea to get some benchmarks on how this affects
performance, Andreas or Alice do you think you can give a bit of compute
time, once this patchset is more mature?

> +    // NOTE: this marker has no consequences for variance, but is necessary
> +    // for dropck to understand that we logically own a `T`.
> +    //
> +    // For details, see:
> +    // https://github.com/rust-lang/rfcs/blob/master/text/0769-sound-generic-drop.md#phantom-data
> +    _marker: PhantomData<T>,
> +}
> +
> +/// `Unique` pointers are `Send` if `T` is `Send` because the data they
> +/// reference is unaliased. Note that this aliasing invariant is
> +/// unenforced by the type system; the abstraction using the
> +/// `Unique` must enforce it.
> +unsafe impl<T: Send + ?Sized> Send for Unique<T> {}
> +
> +/// `Unique` pointers are `Sync` if `T` is `Sync` because the data they
> +/// reference is unaliased. Note that this aliasing invariant is
> +/// unenforced by the type system; the abstraction using the
> +/// `Unique` must enforce it.
> +unsafe impl<T: Sync + ?Sized> Sync for Unique<T> {}
> +
> +impl<T: Sized> Unique<T> {
> +    /// Creates a new `Unique` that is dangling, but well-aligned.
> +    ///
> +    /// This is useful for initializing types which lazily allocate, like
> +    /// `Vec::new` does.
> +    ///
> +    /// Note that the pointer value may potentially represent a valid pointer to
> +    /// a `T`, which means this must not be used as a "not yet initialized"
> +    /// sentinel value. Types that lazily allocate must track initialization by
> +    /// some other means.
> +    #[must_use]
> +    #[inline]
> +    pub const fn dangling() -> Self {
> +        Unique {
> +            pointer: NonNull::dangling(),
> +            _marker: PhantomData,
> +        }
> +    }
> +}
> +
> +impl<T: ?Sized> Unique<T> {
> +    /// Creates a new `Unique`.
> +    ///
> +    /// # Safety
> +    ///
> +    /// `ptr` must be non-null.
> +    #[inline]
> +    pub const unsafe fn new_unchecked(ptr: *mut T) -> Self {
> +        // SAFETY: the caller must guarantee that `ptr` is non-null.
> +        unsafe {
> +            Unique {
> +                pointer: NonNull::new_unchecked(ptr),
> +                _marker: PhantomData,
> +            }
> +        }
> +    }
> +
> +    /// Creates a new `Unique` if `ptr` is non-null.
> +    #[allow(clippy::manual_map)]
> +    #[inline]
> +    pub fn new(ptr: *mut T) -> Option<Self> {

I think we should give `Unique` the invariant that it is unique, making
this function `unsafe`, since the caller must ensure that only this
unique has access to the pointee.

---
Cheers,
Benno

> +        if let Some(pointer) = NonNull::new(ptr) {
> +            Some(Unique {
> +                pointer,
> +                _marker: PhantomData,
> +            })
> +        } else {
> +            None
> +        }
> +    }
Re: [PATCH 09/20] rust: types: implement `Unique<T>`
Posted by Miguel Ojeda 1 year, 7 months ago
On Sat, Jul 6, 2024 at 12:59 PM Benno Lossin <benno.lossin@proton.me> wrote:
>
> The `Unique` from std is special, in the sense that the Rust compiler
> will emit the `noalias` LLVM attribute.
>
> This gives std's `Box` type a possible performance advantage (IIRC Gary
> had a compiler explorer example that showed different instruction
> count).

The example in question: https://godbolt.org/z/n93vePqMj -- there is
one less memory load.

One can also easily craft examples where the compiler e.g. removes an
entire loop: https://godbolt.org/z/c8ncbdKMe

But, of course, it depends on whether we will actually encounter these
situations in real code, as you say. It could also be that today we
don't find any relevant benefit, but there may exist situations later
(perhaps because we have more code, or perhaps because codegen
backends change).

From a quick look, there are still quite a few open issues about the
exact properties of `Box` and `Unique`, including whether `Box` has a
derefencability requirement
(https://github.com/rust-lang/unsafe-code-guidelines/issues/145).

What properties would we want, today, from our `Box` types, if we
could pick any? Should we have several kinds of `Box`es if there is no
unique answer? Is it worth diverging from the standard one(s) in
either direction (i.e. more invariants or less)?

Cheers,
Miguel
Re: [PATCH 09/20] rust: types: implement `Unique<T>`
Posted by Benno Lossin 1 year, 7 months ago
On 06.07.24 14:40, Miguel Ojeda wrote:
> On Sat, Jul 6, 2024 at 12:59 PM Benno Lossin <benno.lossin@proton.me> wrote:
>>
>> The `Unique` from std is special, in the sense that the Rust compiler
>> will emit the `noalias` LLVM attribute.
>>
>> This gives std's `Box` type a possible performance advantage (IIRC Gary
>> had a compiler explorer example that showed different instruction
>> count).
> 
> The example in question: https://godbolt.org/z/n93vePqMj -- there is
> one less memory load.
> 
> One can also easily craft examples where the compiler e.g. removes an
> entire loop: https://godbolt.org/z/c8ncbdKMe
> 
> But, of course, it depends on whether we will actually encounter these
> situations in real code, as you say. It could also be that today we
> don't find any relevant benefit, but there may exist situations later
> (perhaps because we have more code, or perhaps because codegen
> backends change).

It would be a good idea to ask the Rust folks if they could give us a
unique type that does what we need.

> From a quick look, there are still quite a few open issues about the
> exact properties of `Box` and `Unique`, including whether `Box` has a
> derefencability requirement
> (https://github.com/rust-lang/unsafe-code-guidelines/issues/145).

Yes, this is why I wasn't sure if we could do the workaround I
mentioned. My guess is that it doesn't work or that at least it isn't
supported (i.e. could change at any point).

> What properties would we want, today, from our `Box` types, if we
> could pick any? Should we have several kinds of `Box`es if there is no
> unique answer? Is it worth diverging from the standard one(s) in
> either direction (i.e. more invariants or less)?

I would expect that `Box<T>` behaves like `&mut T` except that it owns
the memory.

For starters we probably want this "normal" Box. If we get a unique
pointer type, then we can introduce more, but I don't think we need that
at the moment.

---
Cheers,
Benno