rust: types: Add explanation for ARef pattern

[RFC PATCH] rust: types: Add explanation for ARef pattern

Posted by Boqun Feng 1 year, 7 months ago

As the usage of `ARef` and `AlwaysRefCounted` is growing, it makes sense
to add explanation of the "ARef pattern" to cover the most "DO" and "DO
NOT" cases when wrapping a self-refcounted C type.

Hence an "ARef pattern" section is added in the documentation of `ARef`.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
This is motivated by:

	https://lore.kernel.org/rust-for-linux/20240705110228.qqhhynbwwuwpcdeo@vireshk-i7/

 rust/kernel/types.rs | 156 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 156 insertions(+)

diff --git a/rust/kernel/types.rs b/rust/kernel/types.rs
index bd189d646adb..70fdc780882e 100644
--- a/rust/kernel/types.rs
+++ b/rust/kernel/types.rs
@@ -329,6 +329,162 @@ pub unsafe trait AlwaysRefCounted {
 ///
 /// The pointer stored in `ptr` is non-null and valid for the lifetime of the [`ARef`] instance. In
 /// particular, the [`ARef`] instance owns an increment on the underlying object's reference count.
+///
+/// # [`ARef`] pattern
+///
+/// "[`ARef`] pattern" is preferred when wrapping a C struct which has its own refcounting
+/// mechanism, because it decouples the operations on the object itself (usually via a `&Foo`) vs the
+/// operations on a pointer to the object (usually via an `ARef<Foo>`). For example, given a `struct
+/// foo` defined in C, which has its own refcounting operations `get_foo()` and `put_foo()`. Without
+/// "[`ARef`] pattern", i.e. **bad case**:
+///
+/// ```ignore
+/// pub struct Foo(NonNull<foo>);
+///
+/// impl Foo {
+///     // An operation on the pointer.
+///     pub unsafe fn from_ptr(ptr: *mut foo) -> Self {
+///         // Note that whether `get_foo()` is needed here depends on the exact semantics of
+///         // `from_ptr()`: is it creating a new reference, or it continues using the caller's
+///         // reference?
+///         unsafe { get_foo(ptr); }
+///
+///         unsafe { Foo(NonNull::new_unchecked(foo)) }
+///     }
+///
+///     // An operation on the object.
+///     pub fn get_bar(&self) -> Bar {
+///         unsafe { (*foo.0.as_ptr()).bar }
+///     }
+/// }
+///
+/// // Plus `impl Clone` and `impl Drop` are also needed to implement manually.
+/// impl Clone for Foo {
+///     fn clone(&self) -> Self {
+///         unsafe { get_foo(self.0.as_ptr()); }
+///
+///         Foo(self.0)
+///     }
+/// }
+///
+/// impl Drop for Foo {
+///     fn drop(&mut self) {
+///         unsafe { put_foo(self.0.as_ptr()); }
+///     }
+/// }
+/// ```
+///
+/// In this case, it's hard to tell whether `Foo` represent an object of `foo` or a pointer to
+/// `foo`.
+///
+/// However, if using [`ARef`] pattern, `foo` can be wrapped as follow:
+///
+/// ```ignore
+/// /// Note: `Opaque` is needed in most cases since there usually exist C operations on
+/// /// `struct foo *`, and `#[repr(transparent)]` is needed for the safety of converting a `*mut
+/// /// foo` to a `*mut Foo`
+/// #[repr(transparent)]
+/// pub struct Foo(Opaque<foo>);
+///
+/// impl Foo {
+///     pub fn get_bar(&self) -> Bar {
+///         // SAFETY: `self.0.get()` is a valid pointer.
+///         //
+///         // Note: Usually extra safety comments are needed here to explain why accessing `.bar`
+///         // doesn't race with C side. Most cases are either calling a C function, which has its
+///         // own concurrent access protection, or holding a lock.
+///         unsafe { (*self.0.get()).bar }
+///     }
+/// }
+/// ```
+///
+/// ## Avoid `impl AlwaysRefCounted` if unnecesarry
+///
+/// If Rust code doesn't touch the part where the object lifetimes of `foo` are maintained, `impl
+/// AlwaysRefCounted` can be temporarily avoided: it can always be added later as an extension of
+/// the functionality of the Rust code. This is usually the case for callbacks where the object
+/// lifetimes are already maintained by a framework. In such a case, an `unsafe` `fn(*mut foo) ->
+/// &Foo` function usually suffices:
+///
+/// ```ignore
+/// impl Foo {
+///     /// # Safety
+///     ///
+///     /// `ptr` has to be a valid pointer to `foo` for the entire lifetime `'a'.
+///     pub unsafe fn as_ref<'a>(ptr: *mut foo) -> &'a Self {
+///         // SAFETY: Per function safety requirement, reborrow is valid.
+///         unsafe { &*ptr.cast() }
+///     }
+/// }
+/// ```
+///
+/// ## Type invariants of `impl AlwaysRefCounted`
+///
+/// Types that `impl AlwaysRefCounted` usually needs an invariant to describe why the type can meet
+/// the safety requirement of `AlwaysRefCounted`, e.g.
+///
+/// ```ignore
+/// /// # Invariants:
+/// ///
+/// /// Instances of this type are always refcounted, that is, a call to `get_foo` ensures that the
+/// /// allocation remains valid at least until the matching call to `put_foo`.
+/// #[repr(transparent)]
+/// pub struct Foo(Opaque<foo>);
+///
+/// // SAFETY: `Foo` is always ref-counted per type invariants.
+/// unsafe impl AlwaysRefCounted for Foo {
+///     fn inc_ref(&self) {
+///         // SAFETY: `self.0.get()` is a valid pointer and per type invariants, the existence of
+///         // `&self` means it has a non-zero reference count.
+///         unsafe { get_foo(self.0.get()); }
+///     }
+///
+///     unsafe dec_ref(obj: NonNull<Self>) {
+///         // SAFETY: The refcount of `obj` is non-zero per function safety requirement, and the
+///         // cast is OK since `foo` is transparent to `Foo`.
+///         unsafe { put_foo(obj.cast()); }
+///     }
+/// }
+/// ```
+///
+/// After `impl AlwaysRefCounted for foo`, `clone()` (`get_foo()`) and `drop()` (`put_foo()`)  are
+/// available to `ARef<Foo>` thanks to the generic implementation.
+///
+/// ## `ARef<Self>` vs `&Self`
+///
+/// For an `impl AlwaysRefCounted` type, `ARef<Self>` represents an owner of one reference count,
+/// e.g.
+///
+/// ```ignore
+/// impl Foo {
+///     /// Gets a ref-counted reference of [`Self`].
+///     ///
+///     /// # Safety
+///     ///
+///     /// - `ptr` must be a valid pointer to `foo` with at least one reference count.
+///     pub unsafe fn from_ptr(ptr: *mut foo) -> ARef<Self> {
+///         // SAFETY: `ptr` is a valid pointer per function safety requirement. The cast is OK
+///         // since `foo` is transparent to `Foo`.
+///         //
+///         // Note: `.into()` here increases the reference count, so the returned value has its own
+///         // reference count.
+///         unsafe { &*(ptr.cast::<Foo>()) }.into()
+///     }
+/// }
+/// ```
+///
+/// Another function that returns an `ARef<Self>` but with a different semantics is
+/// [`ARef::from_raw`]: it takes away the refcount of the input pointer, i.e. no refcount
+/// incrementation inside the function.
+///
+/// However `&Self` represents a reference to the object, and the lifetime of the **reference** is
+/// known at compile-time. E.g. the `Foo::as_ref()` above.
+///
+/// ## `impl Drop` of an `impl AlwaysRefCounted` should not touch the refcount
+///
+/// [`ARef`] descreases the refcount automatically (in [`ARef::drop`]) when it goes out of the
+/// scope, therefore there's no need to `impl Drop` for the type of objects (e.g. `Foo`) to decrease
+/// the refcount.
 pub struct ARef<T: AlwaysRefCounted> {
     ptr: NonNull<T>,
     _p: PhantomData<T>,
-- 
2.45.2

Re: [RFC PATCH] rust: types: Add explanation for ARef pattern

Posted by Benno Lossin 1 year, 6 months ago

On 10.07.24 05:24, Boqun Feng wrote:
> As the usage of `ARef` and `AlwaysRefCounted` is growing, it makes sense
> to add explanation of the "ARef pattern" to cover the most "DO" and "DO
> NOT" cases when wrapping a self-refcounted C type.
> 
> Hence an "ARef pattern" section is added in the documentation of `ARef`.
> 
> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> ---
> This is motivated by:
> 
> 	https://lore.kernel.org/rust-for-linux/20240705110228.qqhhynbwwuwpcdeo@vireshk-i7/
> 
>  rust/kernel/types.rs | 156 +++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 156 insertions(+)
> 
> diff --git a/rust/kernel/types.rs b/rust/kernel/types.rs
> index bd189d646adb..70fdc780882e 100644
> --- a/rust/kernel/types.rs
> +++ b/rust/kernel/types.rs
> @@ -329,6 +329,162 @@ pub unsafe trait AlwaysRefCounted {
>  ///
>  /// The pointer stored in `ptr` is non-null and valid for the lifetime of the [`ARef`] instance. In
>  /// particular, the [`ARef`] instance owns an increment on the underlying object's reference count.
> +///
> +/// # [`ARef`] pattern
> +///
> +/// "[`ARef`] pattern" is preferred when wrapping a C struct which has its own refcounting

I would have written "[...] struct which is reference-counted, because
[...]", is there a specific reason you wrote "its own"?

> +/// mechanism, because it decouples the operations on the object itself (usually via a `&Foo`) vs the
> +/// operations on a pointer to the object (usually via an `ARef<Foo>`). For example, given a `struct

Not exactly sure I understand your point here, what exactly is the
advantage of decoupling the operations?
In my mind the following points are the advantages of using `ARef`:
(1) prevents having to implement multiple abstractions for a single C
    object: say there is a `struct foo` that is both used via reference
    counting and by-value on the stack. Without `ARef`, we would have to
    write two abstractions, one for each use-case. With `ARef`, we can
    have one `Foo` that can be wrapped with `ARef` to represent a
    reference-counted object.
(2) `ARef<T>` always represents a reference counted object, so it helps
    with understanding the code. If you read `Foo`, you cannot be sure
    if it is heap or stack allocated.
(3) generalizes common code of reference-counted objects (ie avoiding
    code duplication) and concentration of `unsafe` code.

In my opinion (1) is the most important, then (2). And (3) is a nice
bonus. If you agree with the list above (maybe you also have additional
advantages of `ARef`?) then it would be great if you could also add them
somewhere here.

> +/// foo` defined in C, which has its own refcounting operations `get_foo()` and `put_foo()`. Without
> +/// "[`ARef`] pattern", i.e. **bad case**:

Instead of "bad case" I would have written "i.e. you want to avoid this:".

> +///
> +/// ```ignore
> +/// pub struct Foo(NonNull<foo>);
> +///
> +/// impl Foo {
> +///     // An operation on the pointer.
> +///     pub unsafe fn from_ptr(ptr: *mut foo) -> Self {
> +///         // Note that whether `get_foo()` is needed here depends on the exact semantics of
> +///         // `from_ptr()`: is it creating a new reference, or it continues using the caller's
> +///         // reference?
> +///         unsafe { get_foo(ptr); }
> +///
> +///         unsafe { Foo(NonNull::new_unchecked(foo)) }
> +///     }
> +///
> +///     // An operation on the object.
> +///     pub fn get_bar(&self) -> Bar {
> +///         unsafe { (*foo.0.as_ptr()).bar }
> +///     }
> +/// }
> +///
> +/// // Plus `impl Clone` and `impl Drop` are also needed to implement manually.
> +/// impl Clone for Foo {
> +///     fn clone(&self) -> Self {
> +///         unsafe { get_foo(self.0.as_ptr()); }
> +///
> +///         Foo(self.0)
> +///     }
> +/// }
> +///
> +/// impl Drop for Foo {
> +///     fn drop(&mut self) {
> +///         unsafe { put_foo(self.0.as_ptr()); }
> +///     }
> +/// }
> +/// ```
> +///
> +/// In this case, it's hard to tell whether `Foo` represent an object of `foo` or a pointer to
> +/// `foo`.
> +///
> +/// However, if using [`ARef`] pattern, `foo` can be wrapped as follow:
> +///
> +/// ```ignore
> +/// /// Note: `Opaque` is needed in most cases since there usually exist C operations on

I would disagree for the reason that `Opaque` is needed. You need it if
the `foo` eg contains a bool, since C might just write a nonsense
integer which would then result in immediate UB in Rust.
Other reasons might be that certain bytes of `foo` are written to by
other threads, even though on the Rust side we have `&mut Foo` (eg a
`mutex`).

> +/// /// `struct foo *`, and `#[repr(transparent)]` is needed for the safety of converting a `*mut
> +/// /// foo` to a `*mut Foo`
> +/// #[repr(transparent)]
> +/// pub struct Foo(Opaque<foo>);
> +///
> +/// impl Foo {
> +///     pub fn get_bar(&self) -> Bar {
> +///         // SAFETY: `self.0.get()` is a valid pointer.
> +///         //
> +///         // Note: Usually extra safety comments are needed here to explain why accessing `.bar`
> +///         // doesn't race with C side. Most cases are either calling a C function, which has its
> +///         // own concurrent access protection, or holding a lock.
> +///         unsafe { (*self.0.get()).bar }
> +///     }
> +/// }
> +/// ```
> +///
> +/// ## Avoid `impl AlwaysRefCounted` if unnecesarry

I would move this section below the next one.

> +///
> +/// If Rust code doesn't touch the part where the object lifetimes of `foo` are maintained, `impl
> +/// AlwaysRefCounted` can be temporarily avoided: it can always be added later as an extension of
> +/// the functionality of the Rust code. This is usually the case for callbacks where the object
> +/// lifetimes are already maintained by a framework. In such a case, an `unsafe` `fn(*mut foo) ->
> +/// &Foo` function usually suffices:
> +///
> +/// ```ignore
> +/// impl Foo {
> +///     /// # Safety
> +///     ///
> +///     /// `ptr` has to be a valid pointer to `foo` for the entire lifetime `'a'.
> +///     pub unsafe fn as_ref<'a>(ptr: *mut foo) -> &'a Self {
> +///         // SAFETY: Per function safety requirement, reborrow is valid.
> +///         unsafe { &*ptr.cast() }
> +///     }
> +/// }
> +/// ```
> +///
> +/// ## Type invariants of `impl AlwaysRefCounted`

I think you should first show how the example looks like with `ARef` and
then talk about type invariants.

> +///
> +/// Types that `impl AlwaysRefCounted` usually needs an invariant to describe why the type can meet
> +/// the safety requirement of `AlwaysRefCounted`, e.g.
> +///
> +/// ```ignore
> +/// /// # Invariants:
> +/// ///
> +/// /// Instances of this type are always refcounted, that is, a call to `get_foo` ensures that the
> +/// /// allocation remains valid at least until the matching call to `put_foo`.
> +/// #[repr(transparent)]
> +/// pub struct Foo(Opaque<foo>);
> +///
> +/// // SAFETY: `Foo` is always ref-counted per type invariants.
> +/// unsafe impl AlwaysRefCounted for Foo {
> +///     fn inc_ref(&self) {
> +///         // SAFETY: `self.0.get()` is a valid pointer and per type invariants, the existence of
> +///         // `&self` means it has a non-zero reference count.
> +///         unsafe { get_foo(self.0.get()); }
> +///     }
> +///
> +///     unsafe dec_ref(obj: NonNull<Self>) {
> +///         // SAFETY: The refcount of `obj` is non-zero per function safety requirement, and the
> +///         // cast is OK since `foo` is transparent to `Foo`.
> +///         unsafe { put_foo(obj.cast()); }
> +///     }
> +/// }
> +/// ```
> +///
> +/// After `impl AlwaysRefCounted for foo`, `clone()` (`get_foo()`) and `drop()` (`put_foo()`)  are

Typo: it should be `impl AlwaysRefCounted for Foo`.

---
Cheers,
Benno

> +/// available to `ARef<Foo>` thanks to the generic implementation.
> +///
> +/// ## `ARef<Self>` vs `&Self`
> +///
> +/// For an `impl AlwaysRefCounted` type, `ARef<Self>` represents an owner of one reference count,
> +/// e.g.
> +///
> +/// ```ignore
> +/// impl Foo {
> +///     /// Gets a ref-counted reference of [`Self`].
> +///     ///
> +///     /// # Safety
> +///     ///
> +///     /// - `ptr` must be a valid pointer to `foo` with at least one reference count.
> +///     pub unsafe fn from_ptr(ptr: *mut foo) -> ARef<Self> {
> +///         // SAFETY: `ptr` is a valid pointer per function safety requirement. The cast is OK
> +///         // since `foo` is transparent to `Foo`.
> +///         //
> +///         // Note: `.into()` here increases the reference count, so the returned value has its own
> +///         // reference count.
> +///         unsafe { &*(ptr.cast::<Foo>()) }.into()
> +///     }
> +/// }
> +/// ```
> +///
> +/// Another function that returns an `ARef<Self>` but with a different semantics is
> +/// [`ARef::from_raw`]: it takes away the refcount of the input pointer, i.e. no refcount
> +/// incrementation inside the function.
> +///
> +/// However `&Self` represents a reference to the object, and the lifetime of the **reference** is
> +/// known at compile-time. E.g. the `Foo::as_ref()` above.
> +///
> +/// ## `impl Drop` of an `impl AlwaysRefCounted` should not touch the refcount
> +///
> +/// [`ARef`] descreases the refcount automatically (in [`ARef::drop`]) when it goes out of the
> +/// scope, therefore there's no need to `impl Drop` for the type of objects (e.g. `Foo`) to decrease
> +/// the refcount.
>  pub struct ARef<T: AlwaysRefCounted> {
>      ptr: NonNull<T>,
>      _p: PhantomData<T>,
> --
> 2.45.2
>

Re: [RFC PATCH] rust: types: Add explanation for ARef pattern

Posted by Boqun Feng 1 year, 6 months ago

Hi Benno,

Thanks for taking a look.

On Thu, Jul 25, 2024 at 06:51:56PM +0000, Benno Lossin wrote:
> On 10.07.24 05:24, Boqun Feng wrote:
> > As the usage of `ARef` and `AlwaysRefCounted` is growing, it makes sense
> > to add explanation of the "ARef pattern" to cover the most "DO" and "DO
> > NOT" cases when wrapping a self-refcounted C type.
> > 
> > Hence an "ARef pattern" section is added in the documentation of `ARef`.
> > 
> > Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> > ---
> > This is motivated by:
> > 
> > 	https://lore.kernel.org/rust-for-linux/20240705110228.qqhhynbwwuwpcdeo@vireshk-i7/
> > 
> >  rust/kernel/types.rs | 156 +++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 156 insertions(+)
> > 
> > diff --git a/rust/kernel/types.rs b/rust/kernel/types.rs
> > index bd189d646adb..70fdc780882e 100644
> > --- a/rust/kernel/types.rs
> > +++ b/rust/kernel/types.rs
> > @@ -329,6 +329,162 @@ pub unsafe trait AlwaysRefCounted {
> >  ///
> >  /// The pointer stored in `ptr` is non-null and valid for the lifetime of the [`ARef`] instance. In
> >  /// particular, the [`ARef`] instance owns an increment on the underlying object's reference count.
> > +///
> > +/// # [`ARef`] pattern
> > +///
> > +/// "[`ARef`] pattern" is preferred when wrapping a C struct which has its own refcounting
> 
> I would have written "[...] struct which is reference-counted, because
> [...]", is there a specific reason you wrote "its own"?
> 

"its own" indicates the reference counters are inside the object (i.e.
self refcounted), it's different than `Arc<T>` where the reference
counters are "attached" to `T`. Your version looks good to me as well.

> > +/// mechanism, because it decouples the operations on the object itself (usually via a `&Foo`) vs the
> > +/// operations on a pointer to the object (usually via an `ARef<Foo>`). For example, given a `struct
> 
> Not exactly sure I understand your point here, what exactly is the
> advantage of decoupling the operations?
> In my mind the following points are the advantages of using `ARef`:
> (1) prevents having to implement multiple abstractions for a single C
>     object: say there is a `struct foo` that is both used via reference
>     counting and by-value on the stack. Without `ARef`, we would have to
>     write two abstractions, one for each use-case. With `ARef`, we can
>     have one `Foo` that can be wrapped with `ARef` to represent a
>     reference-counted object.
> (2) `ARef<T>` always represents a reference counted object, so it helps
>     with understanding the code. If you read `Foo`, you cannot be sure
>     if it is heap or stack allocated.
> (3) generalizes common code of reference-counted objects (ie avoiding
>     code duplication) and concentration of `unsafe` code.
> 
> In my opinion (1) is the most important, then (2). And (3) is a nice
> bonus. If you agree with the list above (maybe you also have additional
> advantages of `ARef`?) then it would be great if you could also add them
> somewhere here.
> 

Basically to me, the advantages are mostly (1) and (2) in your list,
thank you for the list. And I did try to use an example (below) to
explain these, because I felt an example of the bad cases is
straightforward.

I will add your list here, because although an example may be
straightforward of reading, a list of advantages are better for
references. Again, thanks a lot!

> > +/// foo` defined in C, which has its own refcounting operations `get_foo()` and `put_foo()`. Without
> > +/// "[`ARef`] pattern", i.e. **bad case**:
> 
> Instead of "bad case" I would have written "i.e. you want to avoid this:".
> 

I'm OK with your version, but for my personal interest, why? ;-)

> > +///
> > +/// ```ignore
> > +/// pub struct Foo(NonNull<foo>);
> > +///
> > +/// impl Foo {
> > +///     // An operation on the pointer.
> > +///     pub unsafe fn from_ptr(ptr: *mut foo) -> Self {
> > +///         // Note that whether `get_foo()` is needed here depends on the exact semantics of
> > +///         // `from_ptr()`: is it creating a new reference, or it continues using the caller's
> > +///         // reference?
> > +///         unsafe { get_foo(ptr); }
> > +///
> > +///         unsafe { Foo(NonNull::new_unchecked(foo)) }
> > +///     }
> > +///
> > +///     // An operation on the object.
> > +///     pub fn get_bar(&self) -> Bar {
> > +///         unsafe { (*foo.0.as_ptr()).bar }
> > +///     }
> > +/// }
> > +///
> > +/// // Plus `impl Clone` and `impl Drop` are also needed to implement manually.
> > +/// impl Clone for Foo {
> > +///     fn clone(&self) -> Self {
> > +///         unsafe { get_foo(self.0.as_ptr()); }
> > +///
> > +///         Foo(self.0)
> > +///     }
> > +/// }
> > +///
> > +/// impl Drop for Foo {
> > +///     fn drop(&mut self) {
> > +///         unsafe { put_foo(self.0.as_ptr()); }
> > +///     }
> > +/// }
> > +/// ```
> > +///
> > +/// In this case, it's hard to tell whether `Foo` represent an object of `foo` or a pointer to
> > +/// `foo`.
> > +///
> > +/// However, if using [`ARef`] pattern, `foo` can be wrapped as follow:
> > +///
> > +/// ```ignore
> > +/// /// Note: `Opaque` is needed in most cases since there usually exist C operations on
> 
> I would disagree for the reason that `Opaque` is needed. You need it if
> the `foo` eg contains a bool, since C might just write a nonsense
> integer which would then result in immediate UB in Rust.
> Other reasons might be that certain bytes of `foo` are written to by
> other threads, even though on the Rust side we have `&mut Foo` (eg a
> `mutex`).
> 

hmm.. "since there usually exist C operations on ..." include these two
cases you mentioned, no? Plus, the reference counters themselves are not
marked as atomic at the moment, so without `Opaque`, we also have UB
because of the reference counters. I was trying to summarize all these
as "C operations on ...", maybe I should say "concurrent C operations on
..."? I am trying to be concise here since it's a comment inside a
comment ;-)

> > +/// /// `struct foo *`, and `#[repr(transparent)]` is needed for the safety of converting a `*mut
> > +/// /// foo` to a `*mut Foo`
> > +/// #[repr(transparent)]
> > +/// pub struct Foo(Opaque<foo>);
> > +///
> > +/// impl Foo {
> > +///     pub fn get_bar(&self) -> Bar {
> > +///         // SAFETY: `self.0.get()` is a valid pointer.
> > +///         //
> > +///         // Note: Usually extra safety comments are needed here to explain why accessing `.bar`
> > +///         // doesn't race with C side. Most cases are either calling a C function, which has its
> > +///         // own concurrent access protection, or holding a lock.
> > +///         unsafe { (*self.0.get()).bar }
> > +///     }
> > +/// }
> > +/// ```
> > +///
> > +/// ## Avoid `impl AlwaysRefCounted` if unnecesarry
> 
> I would move this section below the next one.
> 
> > +///
> > +/// If Rust code doesn't touch the part where the object lifetimes of `foo` are maintained, `impl
> > +/// AlwaysRefCounted` can be temporarily avoided: it can always be added later as an extension of
> > +/// the functionality of the Rust code. This is usually the case for callbacks where the object
> > +/// lifetimes are already maintained by a framework. In such a case, an `unsafe` `fn(*mut foo) ->
> > +/// &Foo` function usually suffices:
> > +///
> > +/// ```ignore
> > +/// impl Foo {
> > +///     /// # Safety
> > +///     ///
> > +///     /// `ptr` has to be a valid pointer to `foo` for the entire lifetime `'a'.
> > +///     pub unsafe fn as_ref<'a>(ptr: *mut foo) -> &'a Self {
> > +///         // SAFETY: Per function safety requirement, reborrow is valid.
> > +///         unsafe { &*ptr.cast() }
> > +///     }
> > +/// }
> > +/// ```
> > +///
> > +/// ## Type invariants of `impl AlwaysRefCounted`
> 
> I think you should first show how the example looks like with `ARef` and
> then talk about type invariants.
> 

These two sound good to me.

Regards,
Boqun

> > +///
> > +/// Types that `impl AlwaysRefCounted` usually needs an invariant to describe why the type can meet
> > +/// the safety requirement of `AlwaysRefCounted`, e.g.
> > +///
> > +/// ```ignore
> > +/// /// # Invariants:
> > +/// ///
> > +/// /// Instances of this type are always refcounted, that is, a call to `get_foo` ensures that the
> > +/// /// allocation remains valid at least until the matching call to `put_foo`.
> > +/// #[repr(transparent)]
> > +/// pub struct Foo(Opaque<foo>);
> > +///
> > +/// // SAFETY: `Foo` is always ref-counted per type invariants.
> > +/// unsafe impl AlwaysRefCounted for Foo {
> > +///     fn inc_ref(&self) {
> > +///         // SAFETY: `self.0.get()` is a valid pointer and per type invariants, the existence of
> > +///         // `&self` means it has a non-zero reference count.
> > +///         unsafe { get_foo(self.0.get()); }
> > +///     }
> > +///
> > +///     unsafe dec_ref(obj: NonNull<Self>) {
> > +///         // SAFETY: The refcount of `obj` is non-zero per function safety requirement, and the
> > +///         // cast is OK since `foo` is transparent to `Foo`.
> > +///         unsafe { put_foo(obj.cast()); }
> > +///     }
> > +/// }
> > +/// ```
> > +///
> > +/// After `impl AlwaysRefCounted for foo`, `clone()` (`get_foo()`) and `drop()` (`put_foo()`)  are
> 
> Typo: it should be `impl AlwaysRefCounted for Foo`.
> 
> ---
> Cheers,
> Benno
> 
> > +/// available to `ARef<Foo>` thanks to the generic implementation.
> > +///
> > +/// ## `ARef<Self>` vs `&Self`
> > +///
> > +/// For an `impl AlwaysRefCounted` type, `ARef<Self>` represents an owner of one reference count,
> > +/// e.g.
> > +///
> > +/// ```ignore
> > +/// impl Foo {
> > +///     /// Gets a ref-counted reference of [`Self`].
> > +///     ///
> > +///     /// # Safety
> > +///     ///
> > +///     /// - `ptr` must be a valid pointer to `foo` with at least one reference count.
> > +///     pub unsafe fn from_ptr(ptr: *mut foo) -> ARef<Self> {
> > +///         // SAFETY: `ptr` is a valid pointer per function safety requirement. The cast is OK
> > +///         // since `foo` is transparent to `Foo`.
> > +///         //
> > +///         // Note: `.into()` here increases the reference count, so the returned value has its own
> > +///         // reference count.
> > +///         unsafe { &*(ptr.cast::<Foo>()) }.into()
> > +///     }
> > +/// }
> > +/// ```
> > +///
> > +/// Another function that returns an `ARef<Self>` but with a different semantics is
> > +/// [`ARef::from_raw`]: it takes away the refcount of the input pointer, i.e. no refcount
> > +/// incrementation inside the function.
> > +///
> > +/// However `&Self` represents a reference to the object, and the lifetime of the **reference** is
> > +/// known at compile-time. E.g. the `Foo::as_ref()` above.
> > +///
> > +/// ## `impl Drop` of an `impl AlwaysRefCounted` should not touch the refcount
> > +///
> > +/// [`ARef`] descreases the refcount automatically (in [`ARef::drop`]) when it goes out of the
> > +/// scope, therefore there's no need to `impl Drop` for the type of objects (e.g. `Foo`) to decrease
> > +/// the refcount.
> >  pub struct ARef<T: AlwaysRefCounted> {
> >      ptr: NonNull<T>,
> >      _p: PhantomData<T>,
> > --
> > 2.45.2
> > 
>

Re: [RFC PATCH] rust: types: Add explanation for ARef pattern

Posted by Benno Lossin 1 year, 6 months ago

On 25.07.24 22:06, Boqun Feng wrote:
> Hi Benno,
> 
> Thanks for taking a look.
> 
> On Thu, Jul 25, 2024 at 06:51:56PM +0000, Benno Lossin wrote:
>> On 10.07.24 05:24, Boqun Feng wrote:
>>> As the usage of `ARef` and `AlwaysRefCounted` is growing, it makes sense
>>> to add explanation of the "ARef pattern" to cover the most "DO" and "DO
>>> NOT" cases when wrapping a self-refcounted C type.
>>>
>>> Hence an "ARef pattern" section is added in the documentation of `ARef`.
>>>
>>> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
>>> ---
>>> This is motivated by:
>>>
>>> 	https://lore.kernel.org/rust-for-linux/20240705110228.qqhhynbwwuwpcdeo@vireshk-i7/
>>>
>>>  rust/kernel/types.rs | 156 +++++++++++++++++++++++++++++++++++++++++++
>>>  1 file changed, 156 insertions(+)
>>>
>>> diff --git a/rust/kernel/types.rs b/rust/kernel/types.rs
>>> index bd189d646adb..70fdc780882e 100644
>>> --- a/rust/kernel/types.rs
>>> +++ b/rust/kernel/types.rs
>>> @@ -329,6 +329,162 @@ pub unsafe trait AlwaysRefCounted {
>>>  ///
>>>  /// The pointer stored in `ptr` is non-null and valid for the lifetime of the [`ARef`] instance. In
>>>  /// particular, the [`ARef`] instance owns an increment on the underlying object's reference count.
>>> +///
>>> +/// # [`ARef`] pattern
>>> +///
>>> +/// "[`ARef`] pattern" is preferred when wrapping a C struct which has its own refcounting
>>
>> I would have written "[...] struct which is reference-counted, because
>> [...]", is there a specific reason you wrote "its own"?
>>
> 
> "its own" indicates the reference counters are inside the object (i.e.
> self refcounted), it's different than `Arc<T>` where the reference
> counters are "attached" to `T`. Your version looks good to me as well.

I thought about that as well, but the paragraph above talks about a C
struct, so what is meant with "its own" there?

>>> +/// mechanism, because it decouples the operations on the object itself (usually via a `&Foo`) vs the
>>> +/// operations on a pointer to the object (usually via an `ARef<Foo>`). For example, given a `struct
>>
>> Not exactly sure I understand your point here, what exactly is the
>> advantage of decoupling the operations?
>> In my mind the following points are the advantages of using `ARef`:
>> (1) prevents having to implement multiple abstractions for a single C
>>     object: say there is a `struct foo` that is both used via reference
>>     counting and by-value on the stack. Without `ARef`, we would have to
>>     write two abstractions, one for each use-case. With `ARef`, we can
>>     have one `Foo` that can be wrapped with `ARef` to represent a
>>     reference-counted object.
>> (2) `ARef<T>` always represents a reference counted object, so it helps
>>     with understanding the code. If you read `Foo`, you cannot be sure
>>     if it is heap or stack allocated.
>> (3) generalizes common code of reference-counted objects (ie avoiding
>>     code duplication) and concentration of `unsafe` code.
>>
>> In my opinion (1) is the most important, then (2). And (3) is a nice
>> bonus. If you agree with the list above (maybe you also have additional
>> advantages of `ARef`?) then it would be great if you could also add them
>> somewhere here.
>>
> 
> Basically to me, the advantages are mostly (1) and (2) in your list,
> thank you for the list. And I did try to use an example (below) to
> explain these, because I felt an example of the bad cases is
> straightforward.
> 
> I will add your list here, because although an example may be
> straightforward of reading, a list of advantages are better for
> references. Again, thanks a lot!
> 
>>> +/// foo` defined in C, which has its own refcounting operations `get_foo()` and `put_foo()`. Without
>>> +/// "[`ARef`] pattern", i.e. **bad case**:
>>
>> Instead of "bad case" I would have written "i.e. you want to avoid this:".
>>
> 
> I'm OK with your version, but for my personal interest, why? ;-)

I felt like "bad case" did not "flow" right when reading and I also
think that "you want to avoid this" sounds more polite :)

>>> +///
>>> +/// ```ignore
>>> +/// pub struct Foo(NonNull<foo>);
>>> +///
>>> +/// impl Foo {
>>> +///     // An operation on the pointer.
>>> +///     pub unsafe fn from_ptr(ptr: *mut foo) -> Self {
>>> +///         // Note that whether `get_foo()` is needed here depends on the exact semantics of
>>> +///         // `from_ptr()`: is it creating a new reference, or it continues using the caller's
>>> +///         // reference?
>>> +///         unsafe { get_foo(ptr); }
>>> +///
>>> +///         unsafe { Foo(NonNull::new_unchecked(foo)) }
>>> +///     }
>>> +///
>>> +///     // An operation on the object.
>>> +///     pub fn get_bar(&self) -> Bar {
>>> +///         unsafe { (*foo.0.as_ptr()).bar }
>>> +///     }
>>> +/// }
>>> +///
>>> +/// // Plus `impl Clone` and `impl Drop` are also needed to implement manually.
>>> +/// impl Clone for Foo {
>>> +///     fn clone(&self) -> Self {
>>> +///         unsafe { get_foo(self.0.as_ptr()); }
>>> +///
>>> +///         Foo(self.0)
>>> +///     }
>>> +/// }
>>> +///
>>> +/// impl Drop for Foo {
>>> +///     fn drop(&mut self) {
>>> +///         unsafe { put_foo(self.0.as_ptr()); }
>>> +///     }
>>> +/// }
>>> +/// ```
>>> +///
>>> +/// In this case, it's hard to tell whether `Foo` represent an object of `foo` or a pointer to
>>> +/// `foo`.
>>> +///
>>> +/// However, if using [`ARef`] pattern, `foo` can be wrapped as follow:
>>> +///
>>> +/// ```ignore
>>> +/// /// Note: `Opaque` is needed in most cases since there usually exist C operations on
>>
>> I would disagree for the reason that `Opaque` is needed. You need it if
>> the `foo` eg contains a bool, since C might just write a nonsense
>> integer which would then result in immediate UB in Rust.
>> Other reasons might be that certain bytes of `foo` are written to by
>> other threads, even though on the Rust side we have `&mut Foo` (eg a
>> `mutex`).
>>
> 
> hmm.. "since there usually exist C operations on ..." include these two
> cases you mentioned, no? Plus, the reference counters themselves are not
> marked as atomic at the moment, so without `Opaque`, we also have UB
> because of the reference counters. I was trying to summarize all these
> as "C operations on ...", maybe I should say "concurrent C operations on
> ..."? I am trying to be concise here since it's a comment inside a
> comment ;-)

Ah that is your definition of "C operations", I interpreted it as "there
are functions that take `struct foo *`". So maybe it would be good to
spell out exactly why `Opaque` might be needed.
I think its fine to be verbose here.

---
Cheers,
Benno

>>> +/// /// `struct foo *`, and `#[repr(transparent)]` is needed for the safety of converting a `*mut
>>> +/// /// foo` to a `*mut Foo`
>>> +/// #[repr(transparent)]
>>> +/// pub struct Foo(Opaque<foo>);
>>> +///
>>> +/// impl Foo {
>>> +///     pub fn get_bar(&self) -> Bar {
>>> +///         // SAFETY: `self.0.get()` is a valid pointer.
>>> +///         //
>>> +///         // Note: Usually extra safety comments are needed here to explain why accessing `.bar`
>>> +///         // doesn't race with C side. Most cases are either calling a C function, which has its
>>> +///         // own concurrent access protection, or holding a lock.
>>> +///         unsafe { (*self.0.get()).bar }
>>> +///     }
>>> +/// }
>>> +/// ```
>>> +///
>>> +/// ## Avoid `impl AlwaysRefCounted` if unnecesarry

Re: [RFC PATCH] rust: types: Add explanation for ARef pattern

Posted by Boqun Feng 1 year, 6 months ago

On Thu, Jul 25, 2024 at 08:32:10PM +0000, Benno Lossin wrote:
> On 25.07.24 22:06, Boqun Feng wrote:
> > Hi Benno,
> > 
> > Thanks for taking a look.
> > 
> > On Thu, Jul 25, 2024 at 06:51:56PM +0000, Benno Lossin wrote:
> >> On 10.07.24 05:24, Boqun Feng wrote:
> >>> As the usage of `ARef` and `AlwaysRefCounted` is growing, it makes sense
> >>> to add explanation of the "ARef pattern" to cover the most "DO" and "DO
> >>> NOT" cases when wrapping a self-refcounted C type.
> >>>
> >>> Hence an "ARef pattern" section is added in the documentation of `ARef`.
> >>>
> >>> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> >>> ---
> >>> This is motivated by:
> >>>
> >>> 	https://lore.kernel.org/rust-for-linux/20240705110228.qqhhynbwwuwpcdeo@vireshk-i7/
> >>>
> >>>  rust/kernel/types.rs | 156 +++++++++++++++++++++++++++++++++++++++++++
> >>>  1 file changed, 156 insertions(+)
> >>>
> >>> diff --git a/rust/kernel/types.rs b/rust/kernel/types.rs
> >>> index bd189d646adb..70fdc780882e 100644
> >>> --- a/rust/kernel/types.rs
> >>> +++ b/rust/kernel/types.rs
> >>> @@ -329,6 +329,162 @@ pub unsafe trait AlwaysRefCounted {
> >>>  ///
> >>>  /// The pointer stored in `ptr` is non-null and valid for the lifetime of the [`ARef`] instance. In
> >>>  /// particular, the [`ARef`] instance owns an increment on the underlying object's reference count.
> >>> +///
> >>> +/// # [`ARef`] pattern
> >>> +///
> >>> +/// "[`ARef`] pattern" is preferred when wrapping a C struct which has its own refcounting
> >>
> >> I would have written "[...] struct which is reference-counted, because
> >> [...]", is there a specific reason you wrote "its own"?
> >>
> > 
> > "its own" indicates the reference counters are inside the object (i.e.
> > self refcounted), it's different than `Arc<T>` where the reference
> > counters are "attached" to `T`. Your version looks good to me as well.
> 
> I thought about that as well, but the paragraph above talks about a C
> struct, so what is meant with "its own" there?
> 

Still the same thing, the `refcount_t` (or other reference counter
types) is inside the structure other than outside, one example of an
outside reference-counting is:

	struct foo_ref {
		struct foo *foo;
		refcount_t ref;
	}

	struct foo {
		struct foo_ref *ref;
		...
	}

TBH, I'm not sure whether that case really exist and we care or we want
to use `ARef<Foo>` for that case. So I just put "its own" to avoid that
for now and for the documentation purpose.

> >>> +/// mechanism, because it decouples the operations on the object itself (usually via a `&Foo`) vs the
> >>> +/// operations on a pointer to the object (usually via an `ARef<Foo>`). For example, given a `struct
> >>
> >> Not exactly sure I understand your point here, what exactly is the
> >> advantage of decoupling the operations?
> >> In my mind the following points are the advantages of using `ARef`:
> >> (1) prevents having to implement multiple abstractions for a single C
> >>     object: say there is a `struct foo` that is both used via reference
> >>     counting and by-value on the stack. Without `ARef`, we would have to
> >>     write two abstractions, one for each use-case. With `ARef`, we can
> >>     have one `Foo` that can be wrapped with `ARef` to represent a
> >>     reference-counted object.
> >> (2) `ARef<T>` always represents a reference counted object, so it helps
> >>     with understanding the code. If you read `Foo`, you cannot be sure
> >>     if it is heap or stack allocated.
> >> (3) generalizes common code of reference-counted objects (ie avoiding
> >>     code duplication) and concentration of `unsafe` code.
> >>
> >> In my opinion (1) is the most important, then (2). And (3) is a nice
> >> bonus. If you agree with the list above (maybe you also have additional
> >> advantages of `ARef`?) then it would be great if you could also add them
> >> somewhere here.
> >>
> > 
> > Basically to me, the advantages are mostly (1) and (2) in your list,
> > thank you for the list. And I did try to use an example (below) to
> > explain these, because I felt an example of the bad cases is
> > straightforward.
> > 
> > I will add your list here, because although an example may be
> > straightforward of reading, a list of advantages are better for
> > references. Again, thanks a lot!
> > 
> >>> +/// foo` defined in C, which has its own refcounting operations `get_foo()` and `put_foo()`. Without
> >>> +/// "[`ARef`] pattern", i.e. **bad case**:
> >>
> >> Instead of "bad case" I would have written "i.e. you want to avoid this:".
> >>
> > 
> > I'm OK with your version, but for my personal interest, why? ;-)
> 
> I felt like "bad case" did not "flow" right when reading and I also
> think that "you want to avoid this" sounds more polite :)
> 

Got it, will use "you want to avoid this" then.

> >>> +///
> >>> +/// ```ignore
> >>> +/// pub struct Foo(NonNull<foo>);
> >>> +///
> >>> +/// impl Foo {
> >>> +///     // An operation on the pointer.
> >>> +///     pub unsafe fn from_ptr(ptr: *mut foo) -> Self {
> >>> +///         // Note that whether `get_foo()` is needed here depends on the exact semantics of
> >>> +///         // `from_ptr()`: is it creating a new reference, or it continues using the caller's
> >>> +///         // reference?
> >>> +///         unsafe { get_foo(ptr); }
> >>> +///
> >>> +///         unsafe { Foo(NonNull::new_unchecked(foo)) }
> >>> +///     }
> >>> +///
> >>> +///     // An operation on the object.
> >>> +///     pub fn get_bar(&self) -> Bar {
> >>> +///         unsafe { (*foo.0.as_ptr()).bar }
> >>> +///     }
> >>> +/// }
> >>> +///
> >>> +/// // Plus `impl Clone` and `impl Drop` are also needed to implement manually.
> >>> +/// impl Clone for Foo {
> >>> +///     fn clone(&self) -> Self {
> >>> +///         unsafe { get_foo(self.0.as_ptr()); }
> >>> +///
> >>> +///         Foo(self.0)
> >>> +///     }
> >>> +/// }
> >>> +///
> >>> +/// impl Drop for Foo {
> >>> +///     fn drop(&mut self) {
> >>> +///         unsafe { put_foo(self.0.as_ptr()); }
> >>> +///     }
> >>> +/// }
> >>> +/// ```
> >>> +///
> >>> +/// In this case, it's hard to tell whether `Foo` represent an object of `foo` or a pointer to
> >>> +/// `foo`.
> >>> +///
> >>> +/// However, if using [`ARef`] pattern, `foo` can be wrapped as follow:
> >>> +///
> >>> +/// ```ignore
> >>> +/// /// Note: `Opaque` is needed in most cases since there usually exist C operations on
> >>
> >> I would disagree for the reason that `Opaque` is needed. You need it if
> >> the `foo` eg contains a bool, since C might just write a nonsense
> >> integer which would then result in immediate UB in Rust.
> >> Other reasons might be that certain bytes of `foo` are written to by
> >> other threads, even though on the Rust side we have `&mut Foo` (eg a
> >> `mutex`).
> >>
> > 
> > hmm.. "since there usually exist C operations on ..." include these two
> > cases you mentioned, no? Plus, the reference counters themselves are not
> > marked as atomic at the moment, so without `Opaque`, we also have UB
> > because of the reference counters. I was trying to summarize all these
> > as "C operations on ...", maybe I should say "concurrent C operations on
> > ..."? I am trying to be concise here since it's a comment inside a
> > comment ;-)
> 
> Ah that is your definition of "C operations", I interpreted it as "there
> are functions that take `struct foo *`". So maybe it would be good to
> spell out exactly why `Opaque` might be needed.
> I think its fine to be verbose here.
> 

Will do.

Regards,
Boqun

> ---
> Cheers,
> Benno
> 
> >>> +/// /// `struct foo *`, and `#[repr(transparent)]` is needed for the safety of converting a `*mut
> >>> +/// /// foo` to a `*mut Foo`
> >>> +/// #[repr(transparent)]
> >>> +/// pub struct Foo(Opaque<foo>);
> >>> +///
> >>> +/// impl Foo {
> >>> +///     pub fn get_bar(&self) -> Bar {
> >>> +///         // SAFETY: `self.0.get()` is a valid pointer.
> >>> +///         //
> >>> +///         // Note: Usually extra safety comments are needed here to explain why accessing `.bar`
> >>> +///         // doesn't race with C side. Most cases are either calling a C function, which has its
> >>> +///         // own concurrent access protection, or holding a lock.
> >>> +///         unsafe { (*self.0.get()).bar }
> >>> +///     }
> >>> +/// }
> >>> +/// ```
> >>> +///
> >>> +/// ## Avoid `impl AlwaysRefCounted` if unnecesarry
>

Re: [RFC PATCH] rust: types: Add explanation for ARef pattern

Posted by Alice Ryhl 1 year, 6 months ago

On Wed, Jul 10, 2024 at 5:26 AM Boqun Feng <boqun.feng@gmail.com> wrote:
>
> As the usage of `ARef` and `AlwaysRefCounted` is growing, it makes sense
> to add explanation of the "ARef pattern" to cover the most "DO" and "DO
> NOT" cases when wrapping a self-refcounted C type.
>
> Hence an "ARef pattern" section is added in the documentation of `ARef`.
>
> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> ---
> This is motivated by:
>
>         https://lore.kernel.org/rust-for-linux/20240705110228.qqhhynbwwuwpcdeo@vireshk-i7/
>
>  rust/kernel/types.rs | 156 +++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 156 insertions(+)
>
> diff --git a/rust/kernel/types.rs b/rust/kernel/types.rs
> index bd189d646adb..70fdc780882e 100644
> --- a/rust/kernel/types.rs
> +++ b/rust/kernel/types.rs
> @@ -329,6 +329,162 @@ pub unsafe trait AlwaysRefCounted {
>  ///
>  /// The pointer stored in `ptr` is non-null and valid for the lifetime of the [`ARef`] instance. In
>  /// particular, the [`ARef`] instance owns an increment on the underlying object's reference count.
> +///
> +/// # [`ARef`] pattern
> +///
> +/// "[`ARef`] pattern" is preferred when wrapping a C struct which has its own refcounting
> +/// mechanism, because it decouples the operations on the object itself (usually via a `&Foo`) vs the
> +/// operations on a pointer to the object (usually via an `ARef<Foo>`). For example, given a `struct
> +/// foo` defined in C, which has its own refcounting operations `get_foo()` and `put_foo()`. Without
> +/// "[`ARef`] pattern", i.e. **bad case**:
> +///
> +/// ```ignore
> +/// pub struct Foo(NonNull<foo>);
> +///
> +/// impl Foo {
> +///     // An operation on the pointer.
> +///     pub unsafe fn from_ptr(ptr: *mut foo) -> Self {
> +///         // Note that whether `get_foo()` is needed here depends on the exact semantics of
> +///         // `from_ptr()`: is it creating a new reference, or it continues using the caller's
> +///         // reference?
> +///         unsafe { get_foo(ptr); }
> +///
> +///         unsafe { Foo(NonNull::new_unchecked(foo)) }
> +///     }
> +///
> +///     // An operation on the object.
> +///     pub fn get_bar(&self) -> Bar {
> +///         unsafe { (*foo.0.as_ptr()).bar }
> +///     }
> +/// }
> +///
> +/// // Plus `impl Clone` and `impl Drop` are also needed to implement manually.
> +/// impl Clone for Foo {
> +///     fn clone(&self) -> Self {
> +///         unsafe { get_foo(self.0.as_ptr()); }
> +///
> +///         Foo(self.0)
> +///     }
> +/// }
> +///
> +/// impl Drop for Foo {
> +///     fn drop(&mut self) {
> +///         unsafe { put_foo(self.0.as_ptr()); }
> +///     }
> +/// }
> +/// ```
> +///
> +/// In this case, it's hard to tell whether `Foo` represent an object of `foo` or a pointer to
> +/// `foo`.
> +///
> +/// However, if using [`ARef`] pattern, `foo` can be wrapped as follow:
> +///
> +/// ```ignore
> +/// /// Note: `Opaque` is needed in most cases since there usually exist C operations on
> +/// /// `struct foo *`, and `#[repr(transparent)]` is needed for the safety of converting a `*mut
> +/// /// foo` to a `*mut Foo`
> +/// #[repr(transparent)]
> +/// pub struct Foo(Opaque<foo>);
> +///
> +/// impl Foo {
> +///     pub fn get_bar(&self) -> Bar {
> +///         // SAFETY: `self.0.get()` is a valid pointer.
> +///         //
> +///         // Note: Usually extra safety comments are needed here to explain why accessing `.bar`
> +///         // doesn't race with C side. Most cases are either calling a C function, which has its
> +///         // own concurrent access protection, or holding a lock.
> +///         unsafe { (*self.0.get()).bar }
> +///     }
> +/// }
> +/// ```
> +///
> +/// ## Avoid `impl AlwaysRefCounted` if unnecesarry
> +///
> +/// If Rust code doesn't touch the part where the object lifetimes of `foo` are maintained, `impl
> +/// AlwaysRefCounted` can be temporarily avoided: it can always be added later as an extension of
> +/// the functionality of the Rust code. This is usually the case for callbacks where the object
> +/// lifetimes are already maintained by a framework. In such a case, an `unsafe` `fn(*mut foo) ->
> +/// &Foo` function usually suffices:
> +///
> +/// ```ignore
> +/// impl Foo {
> +///     /// # Safety
> +///     ///
> +///     /// `ptr` has to be a valid pointer to `foo` for the entire lifetime `'a'.
> +///     pub unsafe fn as_ref<'a>(ptr: *mut foo) -> &'a Self {
> +///         // SAFETY: Per function safety requirement, reborrow is valid.
> +///         unsafe { &*ptr.cast() }
> +///     }
> +/// }
> +/// ```
> +///
> +/// ## Type invariants of `impl AlwaysRefCounted`
> +///
> +/// Types that `impl AlwaysRefCounted` usually needs an invariant to describe why the type can meet
> +/// the safety requirement of `AlwaysRefCounted`, e.g.
> +///
> +/// ```ignore
> +/// /// # Invariants:
> +/// ///
> +/// /// Instances of this type are always refcounted, that is, a call to `get_foo` ensures that the
> +/// /// allocation remains valid at least until the matching call to `put_foo`.
> +/// #[repr(transparent)]
> +/// pub struct Foo(Opaque<foo>);
> +///
> +/// // SAFETY: `Foo` is always ref-counted per type invariants.
> +/// unsafe impl AlwaysRefCounted for Foo {
> +///     fn inc_ref(&self) {
> +///         // SAFETY: `self.0.get()` is a valid pointer and per type invariants, the existence of
> +///         // `&self` means it has a non-zero reference count.
> +///         unsafe { get_foo(self.0.get()); }
> +///     }
> +///
> +///     unsafe dec_ref(obj: NonNull<Self>) {
> +///         // SAFETY: The refcount of `obj` is non-zero per function safety requirement, and the
> +///         // cast is OK since `foo` is transparent to `Foo`.
> +///         unsafe { put_foo(obj.cast()); }
> +///     }
> +/// }
> +/// ```
> +///
> +/// After `impl AlwaysRefCounted for foo`, `clone()` (`get_foo()`) and `drop()` (`put_foo()`)  are
> +/// available to `ARef<Foo>` thanks to the generic implementation.
> +///
> +/// ## `ARef<Self>` vs `&Self`
> +///
> +/// For an `impl AlwaysRefCounted` type, `ARef<Self>` represents an owner of one reference count,
> +/// e.g.
> +///
> +/// ```ignore
> +/// impl Foo {
> +///     /// Gets a ref-counted reference of [`Self`].
> +///     ///
> +///     /// # Safety
> +///     ///
> +///     /// - `ptr` must be a valid pointer to `foo` with at least one reference count.
> +///     pub unsafe fn from_ptr(ptr: *mut foo) -> ARef<Self> {
> +///         // SAFETY: `ptr` is a valid pointer per function safety requirement. The cast is OK
> +///         // since `foo` is transparent to `Foo`.
> +///         //
> +///         // Note: `.into()` here increases the reference count, so the returned value has its own
> +///         // reference count.
> +///         unsafe { &*(ptr.cast::<Foo>()) }.into()
> +///     }
> +/// }
> +/// ```
> +///
> +/// Another function that returns an `ARef<Self>` but with a different semantics is
> +/// [`ARef::from_raw`]: it takes away the refcount of the input pointer, i.e. no refcount
> +/// incrementation inside the function.
> +///
> +/// However `&Self` represents a reference to the object, and the lifetime of the **reference** is
> +/// known at compile-time. E.g. the `Foo::as_ref()` above.
> +///
> +/// ## `impl Drop` of an `impl AlwaysRefCounted` should not touch the refcount
> +///
> +/// [`ARef`] descreases the refcount automatically (in [`ARef::drop`]) when it goes out of the
> +/// scope, therefore there's no need to `impl Drop` for the type of objects (e.g. `Foo`) to decrease
> +/// the refcount.
>  pub struct ARef<T: AlwaysRefCounted> {
>      ptr: NonNull<T>,
>      _p: PhantomData<T>,
> --
> 2.45.2
>

I think this is missing some basic information related to `&Self` ->
`ARef<Self>` conversions. We should explain that these conversions are
possible, and that you usually don't want `raw_ptr` -> `ARef<Self>` to
increment the refcount - instead provide a `raw_ptr` -> `&Self` and
convert the `&Self` to `ARef<Self>`.

Alice

Re: [RFC PATCH] rust: types: Add explanation for ARef pattern

Posted by Boqun Feng 1 year, 6 months ago

On Tue, Jul 23, 2024 at 11:14:29AM +0200, Alice Ryhl wrote:
[...]
> > +/// ## `ARef<Self>` vs `&Self`
> > +///
> > +/// For an `impl AlwaysRefCounted` type, `ARef<Self>` represents an owner of one reference count,
> > +/// e.g.
> > +///
> > +/// ```ignore
> > +/// impl Foo {
> > +///     /// Gets a ref-counted reference of [`Self`].
> > +///     ///
> > +///     /// # Safety
> > +///     ///
> > +///     /// - `ptr` must be a valid pointer to `foo` with at least one reference count.
> > +///     pub unsafe fn from_ptr(ptr: *mut foo) -> ARef<Self> {
> > +///         // SAFETY: `ptr` is a valid pointer per function safety requirement. The cast is OK
> > +///         // since `foo` is transparent to `Foo`.
> > +///         //
> > +///         // Note: `.into()` here increases the reference count, so the returned value has its own
> > +///         // reference count.
> > +///         unsafe { &*(ptr.cast::<Foo>()) }.into()

So I did use the `&Self` -> `ARef<Self>` conversion here,

> > +///     }
> > +/// }
> > +/// ```
> > +///
> > +/// Another function that returns an `ARef<Self>` but with a different semantics is
> > +/// [`ARef::from_raw`]: it takes away the refcount of the input pointer, i.e. no refcount
> > +/// incrementation inside the function.
> > +///

and mentioned the difference between .into() and `ARef::from_raw()`.

> > +/// However `&Self` represents a reference to the object, and the lifetime of the **reference** is
> > +/// known at compile-time. E.g. the `Foo::as_ref()` above.
> > +///
> > +/// ## `impl Drop` of an `impl AlwaysRefCounted` should not touch the refcount
> > +///
> > +/// [`ARef`] descreases the refcount automatically (in [`ARef::drop`]) when it goes out of the
> > +/// scope, therefore there's no need to `impl Drop` for the type of objects (e.g. `Foo`) to decrease
> > +/// the refcount.
> >  pub struct ARef<T: AlwaysRefCounted> {
> >      ptr: NonNull<T>,
> >      _p: PhantomData<T>,
> > --
> > 2.45.2
> >
> 
> I think this is missing some basic information related to `&Self` ->
> `ARef<Self>` conversions. We should explain that these conversions are
> possible, and that you usually don't want `raw_ptr` -> `ARef<Self>` to
> increment the refcount - instead provide a `raw_ptr` -> `&Self` and
> convert the `&Self` to `ARef<Self>`.
> 

I could be more explicit on this, but could there be a case where a `T`
only wants to return `ARef<T>` as a public API? In other words, the
author of `T` doesn't want to expose an `-> &T` function, therefore a
`-> ARef<T>` function makes more sense? If all the users of `T` want to
operate on an `ARef<T>` other than `&T`, I think it makes sense, right?

Overall, I feel like we don't necessarily make a preference between
`->&Self` and `->ARef<Self>` functions here, since it's up to the users'
design?

Regards,
Boqun

> Alice

Re: [RFC PATCH] rust: types: Add explanation for ARef pattern

Posted by Benno Lossin 1 year, 6 months ago

On 24.07.24 19:44, Boqun Feng wrote:
> On Tue, Jul 23, 2024 at 11:14:29AM +0200, Alice Ryhl wrote:
>>> +/// However `&Self` represents a reference to the object, and the lifetime of the **reference** is
>>> +/// known at compile-time. E.g. the `Foo::as_ref()` above.
>>> +///
>>> +/// ## `impl Drop` of an `impl AlwaysRefCounted` should not touch the refcount
>>> +///
>>> +/// [`ARef`] descreases the refcount automatically (in [`ARef::drop`]) when it goes out of the
>>> +/// scope, therefore there's no need to `impl Drop` for the type of objects (e.g. `Foo`) to decrease
>>> +/// the refcount.
>>>  pub struct ARef<T: AlwaysRefCounted> {
>>>      ptr: NonNull<T>,
>>>      _p: PhantomData<T>,
>>> --
>>> 2.45.2
>>>
>>
>> I think this is missing some basic information related to `&Self` ->
>> `ARef<Self>` conversions. We should explain that these conversions are
>> possible, and that you usually don't want `raw_ptr` -> `ARef<Self>` to
>> increment the refcount - instead provide a `raw_ptr` -> `&Self` and
>> convert the `&Self` to `ARef<Self>`.
>>
> 
> I could be more explicit on this, but could there be a case where a `T`
> only wants to return `ARef<T>` as a public API? In other words, the
> author of `T` doesn't want to expose an `-> &T` function, therefore a
> `-> ARef<T>` function makes more sense? If all the users of `T` want to
> operate on an `ARef<T>` other than `&T`, I think it makes sense, right?

You can always get a `&T` from `ARef<T>`, since it implements `Deref`.

> Overall, I feel like we don't necessarily make a preference between
> `->&Self` and `->ARef<Self>` functions here, since it's up to the users'
> design?

I would argue that there should be a clear preference for functions
returning `&Self` when possible (ie there is a parameter that the
lifetime can bind to). This is because then you get the two versions of
the function (non-incrementing and incrementing) for the price of one
function.

---
Cheers,
Benno

Re: [RFC PATCH] rust: types: Add explanation for ARef pattern

Posted by Boqun Feng 1 year, 6 months ago

On Thu, Jul 25, 2024 at 06:12:56PM +0000, Benno Lossin wrote:
> On 24.07.24 19:44, Boqun Feng wrote:
> > On Tue, Jul 23, 2024 at 11:14:29AM +0200, Alice Ryhl wrote:
> >>> +/// However `&Self` represents a reference to the object, and the lifetime of the **reference** is
> >>> +/// known at compile-time. E.g. the `Foo::as_ref()` above.
> >>> +///
> >>> +/// ## `impl Drop` of an `impl AlwaysRefCounted` should not touch the refcount
> >>> +///
> >>> +/// [`ARef`] descreases the refcount automatically (in [`ARef::drop`]) when it goes out of the
> >>> +/// scope, therefore there's no need to `impl Drop` for the type of objects (e.g. `Foo`) to decrease
> >>> +/// the refcount.
> >>>  pub struct ARef<T: AlwaysRefCounted> {
> >>>      ptr: NonNull<T>,
> >>>      _p: PhantomData<T>,
> >>> --
> >>> 2.45.2
> >>>
> >>
> >> I think this is missing some basic information related to `&Self` ->
> >> `ARef<Self>` conversions. We should explain that these conversions are
> >> possible, and that you usually don't want `raw_ptr` -> `ARef<Self>` to
> >> increment the refcount - instead provide a `raw_ptr` -> `&Self` and
> >> convert the `&Self` to `ARef<Self>`.
> >>
> > 
> > I could be more explicit on this, but could there be a case where a `T`
> > only wants to return `ARef<T>` as a public API? In other words, the
> > author of `T` doesn't want to expose an `-> &T` function, therefore a
> > `-> ARef<T>` function makes more sense? If all the users of `T` want to
> > operate on an `ARef<T>` other than `&T`, I think it makes sense, right?
> 
> You can always get a `&T` from `ARef<T>`, since it implements `Deref`.
> 

Yeah, but this is unrelated. I was talking about that API providers can
decide whether they want to only provide a `raw_ptr` -> `ARef<Self>` if
they don't need to provide a `raw_ptr` -> `&Self`.

> > Overall, I feel like we don't necessarily make a preference between
> > `->&Self` and `->ARef<Self>` functions here, since it's up to the users'
> > design?
> 
> I would argue that there should be a clear preference for functions
> returning `&Self` when possible (ie there is a parameter that the

If "possible" also means there's going to be `raw_ptr` -> `&Self`
function (as the same publicity level) anyway, then agreed. In other
words, if the users only need the `raw_ptr` -> `ARef<Self>`
functionality, we don't want to force people to provide a `raw_ptr` ->
`&Self` just because, right?

Regards,
Boqun

> lifetime can bind to). This is because then you get the two versions of
> the function (non-incrementing and incrementing) for the price of one
> function.
> 
> ---
> Cheers,
> Benno
>

Re: [RFC PATCH] rust: types: Add explanation for ARef pattern

Posted by Benno Lossin 1 year, 6 months ago

On 25.07.24 22:29, Boqun Feng wrote:
> On Thu, Jul 25, 2024 at 06:12:56PM +0000, Benno Lossin wrote:
>> On 24.07.24 19:44, Boqun Feng wrote:
>>> On Tue, Jul 23, 2024 at 11:14:29AM +0200, Alice Ryhl wrote:
>>>>> +/// However `&Self` represents a reference to the object, and the lifetime of the **reference** is
>>>>> +/// known at compile-time. E.g. the `Foo::as_ref()` above.
>>>>> +///
>>>>> +/// ## `impl Drop` of an `impl AlwaysRefCounted` should not touch the refcount
>>>>> +///
>>>>> +/// [`ARef`] descreases the refcount automatically (in [`ARef::drop`]) when it goes out of the
>>>>> +/// scope, therefore there's no need to `impl Drop` for the type of objects (e.g. `Foo`) to decrease
>>>>> +/// the refcount.
>>>>>  pub struct ARef<T: AlwaysRefCounted> {
>>>>>      ptr: NonNull<T>,
>>>>>      _p: PhantomData<T>,
>>>>> --
>>>>> 2.45.2
>>>>>
>>>>
>>>> I think this is missing some basic information related to `&Self` ->
>>>> `ARef<Self>` conversions. We should explain that these conversions are
>>>> possible, and that you usually don't want `raw_ptr` -> `ARef<Self>` to
>>>> increment the refcount - instead provide a `raw_ptr` -> `&Self` and
>>>> convert the `&Self` to `ARef<Self>`.
>>>>
>>>
>>> I could be more explicit on this, but could there be a case where a `T`
>>> only wants to return `ARef<T>` as a public API? In other words, the
>>> author of `T` doesn't want to expose an `-> &T` function, therefore a
>>> `-> ARef<T>` function makes more sense? If all the users of `T` want to
>>> operate on an `ARef<T>` other than `&T`, I think it makes sense, right?
>>
>> You can always get a `&T` from `ARef<T>`, since it implements `Deref`.
>>
> 
> Yeah, but this is unrelated. I was talking about that API providers can
> decide whether they want to only provide a `raw_ptr` -> `ARef<Self>` if
> they don't need to provide a `raw_ptr` -> `&Self`.
> 
>>> Overall, I feel like we don't necessarily make a preference between
>>> `->&Self` and `->ARef<Self>` functions here, since it's up to the users'
>>> design?
>>
>> I would argue that there should be a clear preference for functions
>> returning `&Self` when possible (ie there is a parameter that the
> 
> If "possible" also means there's going to be `raw_ptr` -> `&Self`
> function (as the same publicity level) anyway, then agreed. In other
> words, if the users only need the `raw_ptr` -> `ARef<Self>`
> functionality, we don't want to force people to provide a `raw_ptr` ->
> `&Self` just because, right?

I see... I am having a hard time coming up with an example where users
would exclusively want `ARef<Self>` though... What do you have in mind?
Normally types wrapped by `ARef` have `&self` methods.

Cheers,
Benno

Re: [RFC PATCH] rust: types: Add explanation for ARef pattern

Posted by Boqun Feng 1 year, 6 months ago

On Fri, Jul 26, 2024 at 01:43:38PM +0000, Benno Lossin wrote:
[...]
> >>
> >> You can always get a `&T` from `ARef<T>`, since it implements `Deref`.
> >>
> > 
> > Yeah, but this is unrelated. I was talking about that API providers can
> > decide whether they want to only provide a `raw_ptr` -> `ARef<Self>` if
> > they don't need to provide a `raw_ptr` -> `&Self`.
> > 
> >>> Overall, I feel like we don't necessarily make a preference between
> >>> `->&Self` and `->ARef<Self>` functions here, since it's up to the users'
> >>> design?
> >>
> >> I would argue that there should be a clear preference for functions
> >> returning `&Self` when possible (ie there is a parameter that the
> > 
> > If "possible" also means there's going to be `raw_ptr` -> `&Self`
> > function (as the same publicity level) anyway, then agreed. In other
> > words, if the users only need the `raw_ptr` -> `ARef<Self>`
> > functionality, we don't want to force people to provide a `raw_ptr` ->
> > `&Self` just because, right?
> 
> I see... I am having a hard time coming up with an example where users
> would exclusively want `ARef<Self>` though... What do you have in mind?
> Normally types wrapped by `ARef` have `&self` methods.
> 

Having `&self` methods doesn't mean the necessarity of a `raw_ptr` ->
`&Self` function, for example, a `Foo` is wrapped as follow:

	struct Foo(Opaque<foo>);
	impl Foo {
	    pub fn bar(&self) -> Bar { ... }
	    pub unsafe fn get_foo(ptr: *mut foo) -> ARef<Foo> { ... }
	}

in this case, the abstration provider may not want user to get a
`raw_ptr` -> `&Self` function, so no need to have it.

Regards,
Boqun

> Cheers,
> Benno
>

Re: [RFC PATCH] rust: types: Add explanation for ARef pattern

Posted by Benno Lossin 1 year, 6 months ago

On 26.07.24 16:26, Boqun Feng wrote:
> On Fri, Jul 26, 2024 at 01:43:38PM +0000, Benno Lossin wrote:
> [...]
>>>>
>>>> You can always get a `&T` from `ARef<T>`, since it implements `Deref`.
>>>>
>>>
>>> Yeah, but this is unrelated. I was talking about that API providers can
>>> decide whether they want to only provide a `raw_ptr` -> `ARef<Self>` if
>>> they don't need to provide a `raw_ptr` -> `&Self`.
>>>
>>>>> Overall, I feel like we don't necessarily make a preference between
>>>>> `->&Self` and `->ARef<Self>` functions here, since it's up to the users'
>>>>> design?
>>>>
>>>> I would argue that there should be a clear preference for functions
>>>> returning `&Self` when possible (ie there is a parameter that the
>>>
>>> If "possible" also means there's going to be `raw_ptr` -> `&Self`
>>> function (as the same publicity level) anyway, then agreed. In other
>>> words, if the users only need the `raw_ptr` -> `ARef<Self>`
>>> functionality, we don't want to force people to provide a `raw_ptr` ->
>>> `&Self` just because, right?
>>
>> I see... I am having a hard time coming up with an example where users
>> would exclusively want `ARef<Self>` though... What do you have in mind?
>> Normally types wrapped by `ARef` have `&self` methods.
>>
> 
> Having `&self` methods doesn't mean the necessarity of a `raw_ptr` ->
> `&Self` function, for example, a `Foo` is wrapped as follow:
> 
> 	struct Foo(Opaque<foo>);
> 	impl Foo {
> 	    pub fn bar(&self) -> Bar { ... }
> 	    pub unsafe fn get_foo(ptr: *mut foo) -> ARef<Foo> { ... }
> 	}
> 
> in this case, the abstration provider may not want user to get a
> `raw_ptr` -> `&Self` function, so no need to have it.

I don't understand this, why would the abstraction provider do that? The
user can already get a `&Foo` reference, so what's the harm having a
function supplying that directly?
I get the argument that you need to always convert to `ARef` if users
only use that, but when `Foo` provides `&self` methods, you're not
required to have an `ARef`.

---
Cheers,
Benno

Re: [RFC PATCH] rust: types: Add explanation for ARef pattern

Posted by Boqun Feng 1 year, 6 months ago

On Fri, Jul 26, 2024 at 02:42:36PM +0000, Benno Lossin wrote:
> On 26.07.24 16:26, Boqun Feng wrote:
> > On Fri, Jul 26, 2024 at 01:43:38PM +0000, Benno Lossin wrote:
> > [...]
> >>>>
> >>>> You can always get a `&T` from `ARef<T>`, since it implements `Deref`.
> >>>>
> >>>
> >>> Yeah, but this is unrelated. I was talking about that API providers can
> >>> decide whether they want to only provide a `raw_ptr` -> `ARef<Self>` if
> >>> they don't need to provide a `raw_ptr` -> `&Self`.
> >>>
> >>>>> Overall, I feel like we don't necessarily make a preference between
> >>>>> `->&Self` and `->ARef<Self>` functions here, since it's up to the users'
> >>>>> design?
> >>>>
> >>>> I would argue that there should be a clear preference for functions
> >>>> returning `&Self` when possible (ie there is a parameter that the
> >>>
> >>> If "possible" also means there's going to be `raw_ptr` -> `&Self`
> >>> function (as the same publicity level) anyway, then agreed. In other
> >>> words, if the users only need the `raw_ptr` -> `ARef<Self>`
> >>> functionality, we don't want to force people to provide a `raw_ptr` ->
> >>> `&Self` just because, right?
> >>
> >> I see... I am having a hard time coming up with an example where users
> >> would exclusively want `ARef<Self>` though... What do you have in mind?
> >> Normally types wrapped by `ARef` have `&self` methods.
> >>
> > 
> > Having `&self` methods doesn't mean the necessarity of a `raw_ptr` ->
> > `&Self` function, for example, a `Foo` is wrapped as follow:
> > 
> > 	struct Foo(Opaque<foo>);
> > 	impl Foo {
> > 	    pub fn bar(&self) -> Bar { ... }
> > 	    pub unsafe fn get_foo(ptr: *mut foo) -> ARef<Foo> { ... }
> > 	}
> > 
> > in this case, the abstration provider may not want user to get a
> > `raw_ptr` -> `&Self` function, so no need to have it.
> 
> I don't understand this, why would the abstraction provider do that? The

Because no user really needs to convert a `raw_ptr` to a `&Self` whose
lifetime is limited to a scope?

Why do we provide a function if no one needs and the solely purpose is
to just avoid providing another function?

> user can already get a `&Foo` reference, so what's the harm having a
> function supplying that directly?

Getting a `&Foo` from a `ARef<Foo>` is totally different than getting a
`&Foo` from a pointer, right? And it's OK for an abstraction provider to
want to avoid that.

Another example that you may not want to provide a `-> &Self` function
is:
 	struct Foo(Opaque<foo>);
 	impl Foo {
 	    pub fn bar(&self) -> Bar { ... }
 	    pub fn find_foo(idx: u32) -> ARef<Foo> { ... }
 	}

in other words, you have a query function (idx -> *mut foo), and I think
in this case, you would avoid `find_foo(idx: u32) -> &Foo`, right?

Honestly, this discussion has been going to a rabit hole. I will mention
and already mentioned the conversion `&Self` -> `ARef<Self>`. Leaving
the preference part blank is fine to me, since if it's a good practice,
then everybody will follow, otherwise, we are missing something here.
Just trying to not make a descision for the users...

Regards,
Boqun

> I get the argument that you need to always convert to `ARef` if users
> only use that, but when `Foo` provides `&self` methods, you're not
> required to have an `ARef`.
> 
> ---
> Cheers,
> Benno
>

Re: [RFC PATCH] rust: types: Add explanation for ARef pattern

Posted by Benno Lossin 1 year, 6 months ago

On 26.07.24 17:15, Boqun Feng wrote:
> On Fri, Jul 26, 2024 at 02:42:36PM +0000, Benno Lossin wrote:
>> On 26.07.24 16:26, Boqun Feng wrote:
>>> On Fri, Jul 26, 2024 at 01:43:38PM +0000, Benno Lossin wrote:
>>> [...]
>>>>>>
>>>>>> You can always get a `&T` from `ARef<T>`, since it implements `Deref`.
>>>>>>
>>>>>
>>>>> Yeah, but this is unrelated. I was talking about that API providers can
>>>>> decide whether they want to only provide a `raw_ptr` -> `ARef<Self>` if
>>>>> they don't need to provide a `raw_ptr` -> `&Self`.
>>>>>
>>>>>>> Overall, I feel like we don't necessarily make a preference between
>>>>>>> `->&Self` and `->ARef<Self>` functions here, since it's up to the users'
>>>>>>> design?
>>>>>>
>>>>>> I would argue that there should be a clear preference for functions
>>>>>> returning `&Self` when possible (ie there is a parameter that the
>>>>>
>>>>> If "possible" also means there's going to be `raw_ptr` -> `&Self`
>>>>> function (as the same publicity level) anyway, then agreed. In other
>>>>> words, if the users only need the `raw_ptr` -> `ARef<Self>`
>>>>> functionality, we don't want to force people to provide a `raw_ptr` ->
>>>>> `&Self` just because, right?
>>>>
>>>> I see... I am having a hard time coming up with an example where users
>>>> would exclusively want `ARef<Self>` though... What do you have in mind?
>>>> Normally types wrapped by `ARef` have `&self` methods.
>>>>
>>>
>>> Having `&self` methods doesn't mean the necessarity of a `raw_ptr` ->
>>> `&Self` function, for example, a `Foo` is wrapped as follow:
>>>
>>> 	struct Foo(Opaque<foo>);
>>> 	impl Foo {
>>> 	    pub fn bar(&self) -> Bar { ... }
>>> 	    pub unsafe fn get_foo(ptr: *mut foo) -> ARef<Foo> { ... }
>>> 	}
>>>
>>> in this case, the abstration provider may not want user to get a
>>> `raw_ptr` -> `&Self` function, so no need to have it.
>>
>> I don't understand this, why would the abstraction provider do that? The
> 
> Because no user really needs to convert a `raw_ptr` to a `&Self` whose
> lifetime is limited to a scope?

What if you have this:

    unsafe extern "C" fn called_from_c_via_vtable(foo: *mut bindings::foo) {
        // SAFETY: ...
        let foo = unsafe { Foo::from_raw(foo) };
        foo.bar();
    }

In this case, there is no need to take a refcount on `foo`.

> Why do we provide a function if no one needs and the solely purpose is
> to just avoid providing another function?

I don't think that there should be a lot of calls to that function
anyways and thus I don't think there is value in providing two functions
for almost the same behavior. Since one can be derived by the other, I
would go for only implementing the first one.

>> user can already get a `&Foo` reference, so what's the harm having a
>> function supplying that directly?
> 
> Getting a `&Foo` from a `ARef<Foo>` is totally different than getting a
> `&Foo` from a pointer, right? And it's OK for an abstraction provider to
> want to avoid that.
> 
> Another example that you may not want to provide a `-> &Self` function
> is:
>  	struct Foo(Opaque<foo>);
>  	impl Foo {
>  	    pub fn bar(&self) -> Bar { ... }
>  	    pub fn find_foo(idx: u32) -> ARef<Foo> { ... }
>  	}
> 
> in other words, you have a query function (idx -> *mut foo), and I think
> in this case, you would avoid `find_foo(idx: u32) -> &Foo`, right?

Yes, this is the exception I had in mind with "if possible (ie there is
a parameter that the lifetime can bind to)" (in this case there wouldn't
be such a parameter).

> Honestly, this discussion has been going to a rabit hole. I will mention
> and already mentioned the conversion `&Self` -> `ARef<Self>`. Leaving
> the preference part blank is fine to me, since if it's a good practice,
> then everybody will follow, otherwise, we are missing something here.
> Just trying to not make a descision for the users...

Sure.

---
Cheers,
Benno

Re: [RFC PATCH] rust: types: Add explanation for ARef pattern

Posted by Danilo Krummrich 1 year, 6 months ago

On Fri, Jul 26, 2024 at 03:54:37PM +0000, Benno Lossin wrote:
> On 26.07.24 17:15, Boqun Feng wrote:
> > On Fri, Jul 26, 2024 at 02:42:36PM +0000, Benno Lossin wrote:
> >> On 26.07.24 16:26, Boqun Feng wrote:
> >>> On Fri, Jul 26, 2024 at 01:43:38PM +0000, Benno Lossin wrote:
> >>> [...]
> >>>>>>
> >>>>>> You can always get a `&T` from `ARef<T>`, since it implements `Deref`.
> >>>>>>
> >>>>>
> >>>>> Yeah, but this is unrelated. I was talking about that API providers can
> >>>>> decide whether they want to only provide a `raw_ptr` -> `ARef<Self>` if
> >>>>> they don't need to provide a `raw_ptr` -> `&Self`.
> >>>>>
> >>>>>>> Overall, I feel like we don't necessarily make a preference between
> >>>>>>> `->&Self` and `->ARef<Self>` functions here, since it's up to the users'
> >>>>>>> design?
> >>>>>>
> >>>>>> I would argue that there should be a clear preference for functions
> >>>>>> returning `&Self` when possible (ie there is a parameter that the
> >>>>>
> >>>>> If "possible" also means there's going to be `raw_ptr` -> `&Self`
> >>>>> function (as the same publicity level) anyway, then agreed. In other
> >>>>> words, if the users only need the `raw_ptr` -> `ARef<Self>`
> >>>>> functionality, we don't want to force people to provide a `raw_ptr` ->
> >>>>> `&Self` just because, right?
> >>>>
> >>>> I see... I am having a hard time coming up with an example where users
> >>>> would exclusively want `ARef<Self>` though... What do you have in mind?
> >>>> Normally types wrapped by `ARef` have `&self` methods.
> >>>>
> >>>
> >>> Having `&self` methods doesn't mean the necessarity of a `raw_ptr` ->
> >>> `&Self` function, for example, a `Foo` is wrapped as follow:
> >>>
> >>> 	struct Foo(Opaque<foo>);
> >>> 	impl Foo {
> >>> 	    pub fn bar(&self) -> Bar { ... }
> >>> 	    pub unsafe fn get_foo(ptr: *mut foo) -> ARef<Foo> { ... }
> >>> 	}
> >>>
> >>> in this case, the abstration provider may not want user to get a
> >>> `raw_ptr` -> `&Self` function, so no need to have it.
> >>
> >> I don't understand this, why would the abstraction provider do that? The
> > 
> > Because no user really needs to convert a `raw_ptr` to a `&Self` whose
> > lifetime is limited to a scope?
> 
> What if you have this:
> 
>     unsafe extern "C" fn called_from_c_via_vtable(foo: *mut bindings::foo) {
>         // SAFETY: ...
>         let foo = unsafe { Foo::from_raw(foo) };
>         foo.bar();
>     }
> 
> In this case, there is no need to take a refcount on `foo`.
> 
> > Why do we provide a function if no one needs and the solely purpose is
> > to just avoid providing another function?
> 
> I don't think that there should be a lot of calls to that function
> anyways and thus I don't think there is value in providing two functions
> for almost the same behavior. Since one can be derived by the other, I
> would go for only implementing the first one.

I don't think there should be a rule saying that we can't provide a wrapper
function for deriving an `ARef<T>`. `Device` is a good example:

`let dev: ARef<Device> = unsafe { Device::from_raw(raw_dev) }.into();`

vs.

`let dev = unsafe { Device::get(raw_dev) };`

To me personally, the latter looks quite a bit cleaner.

Besides that, I think every kernel engineer (even without Rust background) will
be able to decode the meaning of this call. And if we get the chance to make
things obvious to everyone *without* the need to make a compromise, we should
clearly take it.

> 
> >> user can already get a `&Foo` reference, so what's the harm having a
> >> function supplying that directly?
> > 
> > Getting a `&Foo` from a `ARef<Foo>` is totally different than getting a
> > `&Foo` from a pointer, right? And it's OK for an abstraction provider to
> > want to avoid that.
> > 
> > Another example that you may not want to provide a `-> &Self` function
> > is:
> >  	struct Foo(Opaque<foo>);
> >  	impl Foo {
> >  	    pub fn bar(&self) -> Bar { ... }
> >  	    pub fn find_foo(idx: u32) -> ARef<Foo> { ... }
> >  	}
> > 
> > in other words, you have a query function (idx -> *mut foo), and I think
> > in this case, you would avoid `find_foo(idx: u32) -> &Foo`, right?
> 
> Yes, this is the exception I had in mind with "if possible (ie there is
> a parameter that the lifetime can bind to)" (in this case there wouldn't
> be such a parameter).
> 
> > Honestly, this discussion has been going to a rabit hole. I will mention
> > and already mentioned the conversion `&Self` -> `ARef<Self>`. Leaving
> > the preference part blank is fine to me, since if it's a good practice,
> > then everybody will follow, otherwise, we are missing something here.
> > Just trying to not make a descision for the users...
> 
> Sure.
> 
> ---
> Cheers,
> Benno
>

Re: [RFC PATCH] rust: types: Add explanation for ARef pattern

Posted by Alice Ryhl 1 year, 6 months ago

On Fri, Jul 26, 2024 at 6:20 PM Danilo Krummrich <dakr@kernel.org> wrote:
>
> On Fri, Jul 26, 2024 at 03:54:37PM +0000, Benno Lossin wrote:
> > On 26.07.24 17:15, Boqun Feng wrote:
> > > On Fri, Jul 26, 2024 at 02:42:36PM +0000, Benno Lossin wrote:
> > >> On 26.07.24 16:26, Boqun Feng wrote:
> > >>> On Fri, Jul 26, 2024 at 01:43:38PM +0000, Benno Lossin wrote:
> > >>> [...]
> > >>>>>>
> > >>>>>> You can always get a `&T` from `ARef<T>`, since it implements `Deref`.
> > >>>>>>
> > >>>>>
> > >>>>> Yeah, but this is unrelated. I was talking about that API providers can
> > >>>>> decide whether they want to only provide a `raw_ptr` -> `ARef<Self>` if
> > >>>>> they don't need to provide a `raw_ptr` -> `&Self`.
> > >>>>>
> > >>>>>>> Overall, I feel like we don't necessarily make a preference between
> > >>>>>>> `->&Self` and `->ARef<Self>` functions here, since it's up to the users'
> > >>>>>>> design?
> > >>>>>>
> > >>>>>> I would argue that there should be a clear preference for functions
> > >>>>>> returning `&Self` when possible (ie there is a parameter that the
> > >>>>>
> > >>>>> If "possible" also means there's going to be `raw_ptr` -> `&Self`
> > >>>>> function (as the same publicity level) anyway, then agreed. In other
> > >>>>> words, if the users only need the `raw_ptr` -> `ARef<Self>`
> > >>>>> functionality, we don't want to force people to provide a `raw_ptr` ->
> > >>>>> `&Self` just because, right?
> > >>>>
> > >>>> I see... I am having a hard time coming up with an example where users
> > >>>> would exclusively want `ARef<Self>` though... What do you have in mind?
> > >>>> Normally types wrapped by `ARef` have `&self` methods.
> > >>>>
> > >>>
> > >>> Having `&self` methods doesn't mean the necessarity of a `raw_ptr` ->
> > >>> `&Self` function, for example, a `Foo` is wrapped as follow:
> > >>>
> > >>>   struct Foo(Opaque<foo>);
> > >>>   impl Foo {
> > >>>       pub fn bar(&self) -> Bar { ... }
> > >>>       pub unsafe fn get_foo(ptr: *mut foo) -> ARef<Foo> { ... }
> > >>>   }
> > >>>
> > >>> in this case, the abstration provider may not want user to get a
> > >>> `raw_ptr` -> `&Self` function, so no need to have it.
> > >>
> > >> I don't understand this, why would the abstraction provider do that? The
> > >
> > > Because no user really needs to convert a `raw_ptr` to a `&Self` whose
> > > lifetime is limited to a scope?
> >
> > What if you have this:
> >
> >     unsafe extern "C" fn called_from_c_via_vtable(foo: *mut bindings::foo) {
> >         // SAFETY: ...
> >         let foo = unsafe { Foo::from_raw(foo) };
> >         foo.bar();
> >     }
> >
> > In this case, there is no need to take a refcount on `foo`.
> >
> > > Why do we provide a function if no one needs and the solely purpose is
> > > to just avoid providing another function?
> >
> > I don't think that there should be a lot of calls to that function
> > anyways and thus I don't think there is value in providing two functions
> > for almost the same behavior. Since one can be derived by the other, I
> > would go for only implementing the first one.
>
> I don't think there should be a rule saying that we can't provide a wrapper
> function for deriving an `ARef<T>`. `Device` is a good example:
>
> `let dev: ARef<Device> = unsafe { Device::from_raw(raw_dev) }.into();`
>
> vs.
>
> `let dev = unsafe { Device::get(raw_dev) };`
>
> To me personally, the latter looks quite a bit cleaner.
>
> Besides that, I think every kernel engineer (even without Rust background) will
> be able to decode the meaning of this call. And if we get the chance to make
> things obvious to everyone *without* the need to make a compromise, we should
> clearly take it.

I think I've come around on this question. I think it's fine to have
raw_ptr->ARef methods that increment the refcount, but we should make
a naming convention clear. I propose:

* Functions named things like from_raw_file or from_raw_mm do not
increment the refcount.
* Functions named things like get_file or or mmget do increment the
refcount, just like the C function of the same name.

Alice

Re: [RFC PATCH] rust: types: Add explanation for ARef pattern

Posted by Benno Lossin 1 year, 6 months ago

On 29.07.24 13:31, Alice Ryhl wrote:
> On Fri, Jul 26, 2024 at 6:20 PM Danilo Krummrich <dakr@kernel.org> wrote:
>>
>> On Fri, Jul 26, 2024 at 03:54:37PM +0000, Benno Lossin wrote:
>>> On 26.07.24 17:15, Boqun Feng wrote:
>>>> On Fri, Jul 26, 2024 at 02:42:36PM +0000, Benno Lossin wrote:
>>>>> On 26.07.24 16:26, Boqun Feng wrote:
>>>>>> On Fri, Jul 26, 2024 at 01:43:38PM +0000, Benno Lossin wrote:
>>>>>> [...]
>>>>>>>>>
>>>>>>>>> You can always get a `&T` from `ARef<T>`, since it implements `Deref`.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Yeah, but this is unrelated. I was talking about that API providers can
>>>>>>>> decide whether they want to only provide a `raw_ptr` -> `ARef<Self>` if
>>>>>>>> they don't need to provide a `raw_ptr` -> `&Self`.
>>>>>>>>
>>>>>>>>>> Overall, I feel like we don't necessarily make a preference between
>>>>>>>>>> `->&Self` and `->ARef<Self>` functions here, since it's up to the users'
>>>>>>>>>> design?
>>>>>>>>>
>>>>>>>>> I would argue that there should be a clear preference for functions
>>>>>>>>> returning `&Self` when possible (ie there is a parameter that the
>>>>>>>>
>>>>>>>> If "possible" also means there's going to be `raw_ptr` -> `&Self`
>>>>>>>> function (as the same publicity level) anyway, then agreed. In other
>>>>>>>> words, if the users only need the `raw_ptr` -> `ARef<Self>`
>>>>>>>> functionality, we don't want to force people to provide a `raw_ptr` ->
>>>>>>>> `&Self` just because, right?
>>>>>>>
>>>>>>> I see... I am having a hard time coming up with an example where users
>>>>>>> would exclusively want `ARef<Self>` though... What do you have in mind?
>>>>>>> Normally types wrapped by `ARef` have `&self` methods.
>>>>>>>
>>>>>>
>>>>>> Having `&self` methods doesn't mean the necessarity of a `raw_ptr` ->
>>>>>> `&Self` function, for example, a `Foo` is wrapped as follow:
>>>>>>
>>>>>>   struct Foo(Opaque<foo>);
>>>>>>   impl Foo {
>>>>>>       pub fn bar(&self) -> Bar { ... }
>>>>>>       pub unsafe fn get_foo(ptr: *mut foo) -> ARef<Foo> { ... }
>>>>>>   }
>>>>>>
>>>>>> in this case, the abstration provider may not want user to get a
>>>>>> `raw_ptr` -> `&Self` function, so no need to have it.
>>>>>
>>>>> I don't understand this, why would the abstraction provider do that? The
>>>>
>>>> Because no user really needs to convert a `raw_ptr` to a `&Self` whose
>>>> lifetime is limited to a scope?
>>>
>>> What if you have this:
>>>
>>>     unsafe extern "C" fn called_from_c_via_vtable(foo: *mut bindings::foo) {
>>>         // SAFETY: ...
>>>         let foo = unsafe { Foo::from_raw(foo) };
>>>         foo.bar();
>>>     }
>>>
>>> In this case, there is no need to take a refcount on `foo`.
>>>
>>>> Why do we provide a function if no one needs and the solely purpose is
>>>> to just avoid providing another function?
>>>
>>> I don't think that there should be a lot of calls to that function
>>> anyways and thus I don't think there is value in providing two functions
>>> for almost the same behavior. Since one can be derived by the other, I
>>> would go for only implementing the first one.
>>
>> I don't think there should be a rule saying that we can't provide a wrapper
>> function for deriving an `ARef<T>`. `Device` is a good example:
>>
>> `let dev: ARef<Device> = unsafe { Device::from_raw(raw_dev) }.into();`
>>
>> vs.
>>
>> `let dev = unsafe { Device::get(raw_dev) };`
>>
>> To me personally, the latter looks quite a bit cleaner.
>>
>> Besides that, I think every kernel engineer (even without Rust background) will
>> be able to decode the meaning of this call. And if we get the chance to make
>> things obvious to everyone *without* the need to make a compromise, we should
>> clearly take it.
> 
> I think I've come around on this question. I think it's fine to have
> raw_ptr->ARef methods that increment the refcount, but we should make
> a naming convention clear. I propose:
> 
> * Functions named things like from_raw_file or from_raw_mm do not
> increment the refcount.
> * Functions named things like get_file or or mmget do increment the
> refcount, just like the C function of the same name.

I have thought about this a bit and I think that we can try to do it. I
like the name `Device::get` and `Device::from_raw`. I would not
duplicate the name ie `Device::get_device` (nor would I do that with
`from_raw`).

One of my bigger problems was the naming, so it's good to see this.

---
Cheers,
Benno