These abstractions allow you to reference a `struct mm_struct` using
both mmgrab and mmget refcounts. This is done using two Rust types:
* Mm - represents an mm_struct where you don't know anything about the
value of mm_users.
* MmWithUser - represents an mm_struct where you know at compile time
that mm_users is non-zero.
This allows us to encode in the type system whether a method requires
that mm_users is non-zero or not. For instance, you can always call
`mmget_not_zero` but you can only call `mmap_read_lock` when mm_users is
non-zero.
It's possible to access current->mm without a refcount increment, but
that is added in a later patch of this series.
Acked-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> (for mm bits)
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
rust/helpers/helpers.c | 1 +
rust/helpers/mm.c | 39 +++++++++
rust/kernel/lib.rs | 1 +
rust/kernel/mm.rs | 219 +++++++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 260 insertions(+)
diff --git a/rust/helpers/helpers.c b/rust/helpers/helpers.c
index dcf827a61b52..9d748ec845b3 100644
--- a/rust/helpers/helpers.c
+++ b/rust/helpers/helpers.c
@@ -16,6 +16,7 @@
#include "fs.c"
#include "jump_label.c"
#include "kunit.c"
+#include "mm.c"
#include "mutex.c"
#include "page.c"
#include "pid_namespace.c"
diff --git a/rust/helpers/mm.c b/rust/helpers/mm.c
new file mode 100644
index 000000000000..7201747a5d31
--- /dev/null
+++ b/rust/helpers/mm.c
@@ -0,0 +1,39 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/mm.h>
+#include <linux/sched/mm.h>
+
+void rust_helper_mmgrab(struct mm_struct *mm)
+{
+ mmgrab(mm);
+}
+
+void rust_helper_mmdrop(struct mm_struct *mm)
+{
+ mmdrop(mm);
+}
+
+void rust_helper_mmget(struct mm_struct *mm)
+{
+ mmget(mm);
+}
+
+bool rust_helper_mmget_not_zero(struct mm_struct *mm)
+{
+ return mmget_not_zero(mm);
+}
+
+void rust_helper_mmap_read_lock(struct mm_struct *mm)
+{
+ mmap_read_lock(mm);
+}
+
+bool rust_helper_mmap_read_trylock(struct mm_struct *mm)
+{
+ return mmap_read_trylock(mm);
+}
+
+void rust_helper_mmap_read_unlock(struct mm_struct *mm)
+{
+ mmap_read_unlock(mm);
+}
diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index e1065a7551a3..6555e0847192 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -46,6 +46,7 @@
pub mod kunit;
pub mod list;
pub mod miscdevice;
+pub mod mm;
#[cfg(CONFIG_NET)]
pub mod net;
pub mod page;
diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs
new file mode 100644
index 000000000000..84cba581edaa
--- /dev/null
+++ b/rust/kernel/mm.rs
@@ -0,0 +1,219 @@
+// SPDX-License-Identifier: GPL-2.0
+
+// Copyright (C) 2024 Google LLC.
+
+//! Memory management.
+//!
+//! C header: [`include/linux/mm.h`](srctree/include/linux/mm.h)
+
+use crate::{
+ bindings,
+ types::{ARef, AlwaysRefCounted, NotThreadSafe, Opaque},
+};
+use core::{ops::Deref, ptr::NonNull};
+
+/// A wrapper for the kernel's `struct mm_struct`.
+///
+/// Since `mm_users` may be zero, the associated address space may not exist anymore. You can use
+/// [`mmget_not_zero`] to be able to access the address space.
+///
+/// The `ARef<Mm>` smart pointer holds an `mmgrab` refcount. Its destructor may sleep.
+///
+/// # Invariants
+///
+/// Values of this type are always refcounted using `mmgrab`.
+///
+/// [`mmget_not_zero`]: Mm::mmget_not_zero
+#[repr(transparent)]
+pub struct Mm {
+ mm: Opaque<bindings::mm_struct>,
+}
+
+// SAFETY: It is safe to call `mmdrop` on another thread than where `mmgrab` was called.
+unsafe impl Send for Mm {}
+// SAFETY: All methods on `Mm` can be called in parallel from several threads.
+unsafe impl Sync for Mm {}
+
+// SAFETY: By the type invariants, this type is always refcounted.
+unsafe impl AlwaysRefCounted for Mm {
+ #[inline]
+ fn inc_ref(&self) {
+ // SAFETY: The pointer is valid since self is a reference.
+ unsafe { bindings::mmgrab(self.as_raw()) };
+ }
+
+ #[inline]
+ unsafe fn dec_ref(obj: NonNull<Self>) {
+ // SAFETY: The caller is giving up their refcount.
+ unsafe { bindings::mmdrop(obj.cast().as_ptr()) };
+ }
+}
+
+/// A wrapper for the kernel's `struct mm_struct`.
+///
+/// This type is like [`Mm`], but with non-zero `mm_users`. It can only be used when `mm_users` can
+/// be proven to be non-zero at compile-time, usually because the relevant code holds an `mmget`
+/// refcount. It can be used to access the associated address space.
+///
+/// The `ARef<MmWithUser>` smart pointer holds an `mmget` refcount. Its destructor may sleep.
+///
+/// # Invariants
+///
+/// Values of this type are always refcounted using `mmget`. The value of `mm_users` is non-zero.
+#[repr(transparent)]
+pub struct MmWithUser {
+ mm: Mm,
+}
+
+// SAFETY: It is safe to call `mmput` on another thread than where `mmget` was called.
+unsafe impl Send for MmWithUser {}
+// SAFETY: All methods on `MmWithUser` can be called in parallel from several threads.
+unsafe impl Sync for MmWithUser {}
+
+// SAFETY: By the type invariants, this type is always refcounted.
+unsafe impl AlwaysRefCounted for MmWithUser {
+ #[inline]
+ fn inc_ref(&self) {
+ // SAFETY: The pointer is valid since self is a reference.
+ unsafe { bindings::mmget(self.as_raw()) };
+ }
+
+ #[inline]
+ unsafe fn dec_ref(obj: NonNull<Self>) {
+ // SAFETY: The caller is giving up their refcount.
+ unsafe { bindings::mmput(obj.cast().as_ptr()) };
+ }
+}
+
+// Make all `Mm` methods available on `MmWithUser`.
+impl Deref for MmWithUser {
+ type Target = Mm;
+
+ #[inline]
+ fn deref(&self) -> &Mm {
+ &self.mm
+ }
+}
+
+// These methods are safe to call even if `mm_users` is zero.
+impl Mm {
+ /// Call `mmgrab` on `current.mm`.
+ #[inline]
+ pub fn mmgrab_current() -> Option<ARef<Mm>> {
+ // SAFETY: It's safe to get the `mm` field from current.
+ let mm = unsafe {
+ let current = bindings::get_current();
+ (*current).mm
+ };
+
+ if mm.is_null() {
+ return None;
+ }
+
+ // SAFETY: The value of `current->mm` is guaranteed to be null or a valid `mm_struct`. We
+ // just checked that it's not null. Furthermore, the returned `&Mm` is valid only for the
+ // duration of this function, and `current->mm` will stay valid for that long.
+ let mm = unsafe { Mm::from_raw(mm) };
+
+ // This increments the refcount using `mmgrab`.
+ Some(ARef::from(mm))
+ }
+
+ /// Returns a raw pointer to the inner `mm_struct`.
+ #[inline]
+ pub fn as_raw(&self) -> *mut bindings::mm_struct {
+ self.mm.get()
+ }
+
+ /// Obtain a reference from a raw pointer.
+ ///
+ /// # Safety
+ ///
+ /// The caller must ensure that `ptr` points at an `mm_struct`, and that it is not deallocated
+ /// during the lifetime 'a.
+ #[inline]
+ pub unsafe fn from_raw<'a>(ptr: *const bindings::mm_struct) -> &'a Mm {
+ // SAFETY: Caller promises that the pointer is valid for 'a. Layouts are compatible due to
+ // repr(transparent).
+ unsafe { &*ptr.cast() }
+ }
+
+ /// Calls `mmget_not_zero` and returns a handle if it succeeds.
+ #[inline]
+ pub fn mmget_not_zero(&self) -> Option<ARef<MmWithUser>> {
+ // SAFETY: The pointer is valid since self is a reference.
+ let success = unsafe { bindings::mmget_not_zero(self.as_raw()) };
+
+ if success {
+ // SAFETY: We just created an `mmget` refcount.
+ Some(unsafe { ARef::from_raw(NonNull::new_unchecked(self.as_raw().cast())) })
+ } else {
+ None
+ }
+ }
+}
+
+// These methods require `mm_users` to be non-zero.
+impl MmWithUser {
+ /// Obtain a reference from a raw pointer.
+ ///
+ /// # Safety
+ ///
+ /// The caller must ensure that `ptr` points at an `mm_struct`, and that `mm_users` remains
+ /// non-zero for the duration of the lifetime 'a.
+ #[inline]
+ pub unsafe fn from_raw<'a>(ptr: *const bindings::mm_struct) -> &'a MmWithUser {
+ // SAFETY: Caller promises that the pointer is valid for 'a. The layout is compatible due
+ // to repr(transparent).
+ unsafe { &*ptr.cast() }
+ }
+
+ /// Lock the mmap read lock.
+ #[inline]
+ pub fn mmap_read_lock(&self) -> MmapReadGuard<'_> {
+ // SAFETY: The pointer is valid since self is a reference.
+ unsafe { bindings::mmap_read_lock(self.as_raw()) };
+
+ // INVARIANT: We just acquired the read lock.
+ MmapReadGuard {
+ mm: self,
+ _nts: NotThreadSafe,
+ }
+ }
+
+ /// Try to lock the mmap read lock.
+ #[inline]
+ pub fn mmap_read_trylock(&self) -> Option<MmapReadGuard<'_>> {
+ // SAFETY: The pointer is valid since self is a reference.
+ let success = unsafe { bindings::mmap_read_trylock(self.as_raw()) };
+
+ if success {
+ // INVARIANT: We just acquired the read lock.
+ Some(MmapReadGuard {
+ mm: self,
+ _nts: NotThreadSafe,
+ })
+ } else {
+ None
+ }
+ }
+}
+
+/// A guard for the mmap read lock.
+///
+/// # Invariants
+///
+/// This `MmapReadGuard` guard owns the mmap read lock.
+pub struct MmapReadGuard<'a> {
+ mm: &'a MmWithUser,
+ // `mmap_read_lock` and `mmap_read_unlock` must be called on the same thread
+ _nts: NotThreadSafe,
+}
+
+impl Drop for MmapReadGuard<'_> {
+ #[inline]
+ fn drop(&mut self) {
+ // SAFETY: We hold the read lock by the type invariants.
+ unsafe { bindings::mmap_read_unlock(self.mm.as_raw()) };
+ }
+}
--
2.47.1.613.gc27f4b7a9f-goog
On 12/11/24 21:37, Alice Ryhl wrote: > These abstractions allow you to reference a `struct mm_struct` using > both mmgrab and mmget refcounts. This is done using two Rust types: > > * Mm - represents an mm_struct where you don't know anything about the > value of mm_users. > * MmWithUser - represents an mm_struct where you know at compile time > that mm_users is non-zero. > > This allows us to encode in the type system whether a method requires > that mm_users is non-zero or not. For instance, you can always call > `mmget_not_zero` but you can only call `mmap_read_lock` when mm_users is > non-zero. > > It's possible to access current->mm without a refcount increment, but > that is added in a later patch of this series. > > Acked-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> (for mm bits) > Signed-off-by: Alice Ryhl <aliceryhl@google.com> > --- It might be good to add some #Examples similar to kernel/task.rs Acked-by: Balbir Singh <balbirs@nvidia.com>
On Fri, Jan 17, 2025 at 1:45 AM Balbir Singh <balbirs@nvidia.com> wrote: > > On 12/11/24 21:37, Alice Ryhl wrote: > > These abstractions allow you to reference a `struct mm_struct` using > > both mmgrab and mmget refcounts. This is done using two Rust types: > > > > * Mm - represents an mm_struct where you don't know anything about the > > value of mm_users. > > * MmWithUser - represents an mm_struct where you know at compile time > > that mm_users is non-zero. > > > > This allows us to encode in the type system whether a method requires > > that mm_users is non-zero or not. For instance, you can always call > > `mmget_not_zero` but you can only call `mmap_read_lock` when mm_users is > > non-zero. > > > > It's possible to access current->mm without a refcount increment, but > > that is added in a later patch of this series. > > > > Acked-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> (for mm bits) > > Signed-off-by: Alice Ryhl <aliceryhl@google.com> > > --- > > It might be good to add some #Examples similar to kernel/task.rs You're probably right. > Acked-by: Balbir Singh <balbirs@nvidia.com> I'll pick this up for v13 since I already sent v12. Thanks! Alice
"Alice Ryhl" <aliceryhl@google.com> writes:
> These abstractions allow you to reference a `struct mm_struct` using
> both mmgrab and mmget refcounts. This is done using two Rust types:
>
> * Mm - represents an mm_struct where you don't know anything about the
> value of mm_users.
> * MmWithUser - represents an mm_struct where you know at compile time
> that mm_users is non-zero.
>
> This allows us to encode in the type system whether a method requires
> that mm_users is non-zero or not. For instance, you can always call
> `mmget_not_zero` but you can only call `mmap_read_lock` when mm_users is
> non-zero.
>
> It's possible to access current->mm without a refcount increment, but
> that is added in a later patch of this series.
>
> Acked-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> (for mm bits)
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> ---
> rust/helpers/helpers.c | 1 +
> rust/helpers/mm.c | 39 +++++++++
> rust/kernel/lib.rs | 1 +
> rust/kernel/mm.rs | 219 +++++++++++++++++++++++++++++++++++++++++++++++++
> 4 files changed, 260 insertions(+)
>
> diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs
> new file mode 100644
> index 000000000000..84cba581edaa
> --- /dev/null
> +++ b/rust/kernel/mm.rs
> @@ -0,0 +1,219 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +// Copyright (C) 2024 Google LLC.
> +
> +//! Memory management.
Could you add a little more context here?
> +//!
> +//! C header: [`include/linux/mm.h`](srctree/include/linux/mm.h)
> +
> +use crate::{
> + bindings,
> + types::{ARef, AlwaysRefCounted, NotThreadSafe, Opaque},
> +};
> +use core::{ops::Deref, ptr::NonNull};
> +
> +/// A wrapper for the kernel's `struct mm_struct`.
Could you elaborate the data structure use case? When do I need it, what
does it do?
> +///
> +/// Since `mm_users` may be zero, the associated address space may not exist anymore. You can use
> +/// [`mmget_not_zero`] to be able to access the address space.
> +///
> +/// The `ARef<Mm>` smart pointer holds an `mmgrab` refcount. Its destructor may sleep.
> +///
> +/// # Invariants
> +///
> +/// Values of this type are always refcounted using `mmgrab`.
> +///
> +/// [`mmget_not_zero`]: Mm::mmget_not_zero
> +#[repr(transparent)]
> +pub struct Mm {
Could we come up with a better name? `MemoryMap` or `MemoryMapping`?. You
use `MMapReadGuard` later.
> + mm: Opaque<bindings::mm_struct>,
> +}
> +
> +// SAFETY: It is safe to call `mmdrop` on another thread than where `mmgrab` was called.
> +unsafe impl Send for Mm {}
> +// SAFETY: All methods on `Mm` can be called in parallel from several threads.
> +unsafe impl Sync for Mm {}
> +
> +// SAFETY: By the type invariants, this type is always refcounted.
> +unsafe impl AlwaysRefCounted for Mm {
> + #[inline]
> + fn inc_ref(&self) {
> + // SAFETY: The pointer is valid since self is a reference.
> + unsafe { bindings::mmgrab(self.as_raw()) };
> + }
> +
> + #[inline]
> + unsafe fn dec_ref(obj: NonNull<Self>) {
> + // SAFETY: The caller is giving up their refcount.
> + unsafe { bindings::mmdrop(obj.cast().as_ptr()) };
> + }
> +}
> +
> +/// A wrapper for the kernel's `struct mm_struct`.
> +///
> +/// This type is like [`Mm`], but with non-zero `mm_users`. It can only be used when `mm_users` can
> +/// be proven to be non-zero at compile-time, usually because the relevant code holds an `mmget`
> +/// refcount. It can be used to access the associated address space.
> +///
> +/// The `ARef<MmWithUser>` smart pointer holds an `mmget` refcount. Its destructor may sleep.
> +///
> +/// # Invariants
> +///
> +/// Values of this type are always refcounted using `mmget`. The value of `mm_users` is non-zero.
> +#[repr(transparent)]
> +pub struct MmWithUser {
> + mm: Mm,
> +}
> +
> +// SAFETY: It is safe to call `mmput` on another thread than where `mmget` was called.
> +unsafe impl Send for MmWithUser {}
> +// SAFETY: All methods on `MmWithUser` can be called in parallel from several threads.
> +unsafe impl Sync for MmWithUser {}
> +
> +// SAFETY: By the type invariants, this type is always refcounted.
> +unsafe impl AlwaysRefCounted for MmWithUser {
> + #[inline]
> + fn inc_ref(&self) {
> + // SAFETY: The pointer is valid since self is a reference.
> + unsafe { bindings::mmget(self.as_raw()) };
> + }
> +
> + #[inline]
> + unsafe fn dec_ref(obj: NonNull<Self>) {
> + // SAFETY: The caller is giving up their refcount.
> + unsafe { bindings::mmput(obj.cast().as_ptr()) };
> + }
> +}
> +
> +// Make all `Mm` methods available on `MmWithUser`.
> +impl Deref for MmWithUser {
> + type Target = Mm;
> +
> + #[inline]
> + fn deref(&self) -> &Mm {
> + &self.mm
> + }
> +}
> +
> +// These methods are safe to call even if `mm_users` is zero.
> +impl Mm {
> + /// Call `mmgrab` on `current.mm`.
> + #[inline]
> + pub fn mmgrab_current() -> Option<ARef<Mm>> {
> + // SAFETY: It's safe to get the `mm` field from current.
> + let mm = unsafe {
> + let current = bindings::get_current();
> + (*current).mm
> + };
> +
> + if mm.is_null() {
> + return None;
> + }
> +
> + // SAFETY: The value of `current->mm` is guaranteed to be null or a valid `mm_struct`. We
> + // just checked that it's not null. Furthermore, the returned `&Mm` is valid only for the
> + // duration of this function, and `current->mm` will stay valid for that long.
> + let mm = unsafe { Mm::from_raw(mm) };
> +
> + // This increments the refcount using `mmgrab`.
> + Some(ARef::from(mm))
> + }
> +
> + /// Returns a raw pointer to the inner `mm_struct`.
> + #[inline]
> + pub fn as_raw(&self) -> *mut bindings::mm_struct {
> + self.mm.get()
> + }
> +
> + /// Obtain a reference from a raw pointer.
> + ///
> + /// # Safety
> + ///
> + /// The caller must ensure that `ptr` points at an `mm_struct`, and that it is not deallocated
> + /// during the lifetime 'a.
> + #[inline]
> + pub unsafe fn from_raw<'a>(ptr: *const bindings::mm_struct) -> &'a Mm {
> + // SAFETY: Caller promises that the pointer is valid for 'a. Layouts are compatible due to
> + // repr(transparent).
> + unsafe { &*ptr.cast() }
> + }
> +
> + /// Calls `mmget_not_zero` and returns a handle if it succeeds.
> + #[inline]
> + pub fn mmget_not_zero(&self) -> Option<ARef<MmWithUser>> {
> + // SAFETY: The pointer is valid since self is a reference.
> + let success = unsafe { bindings::mmget_not_zero(self.as_raw()) };
> +
> + if success {
> + // SAFETY: We just created an `mmget` refcount.
> + Some(unsafe { ARef::from_raw(NonNull::new_unchecked(self.as_raw().cast())) })
> + } else {
> + None
> + }
> + }
> +}
Nit: could we put the impl next to the struct definition?
Best regards,
Andreas Hindborg
On Mon, Dec 16, 2024 at 3:50 PM Andreas Hindborg <a.hindborg@kernel.org> wrote:
>
> "Alice Ryhl" <aliceryhl@google.com> writes:
>
> > These abstractions allow you to reference a `struct mm_struct` using
> > both mmgrab and mmget refcounts. This is done using two Rust types:
> >
> > * Mm - represents an mm_struct where you don't know anything about the
> > value of mm_users.
> > * MmWithUser - represents an mm_struct where you know at compile time
> > that mm_users is non-zero.
> >
> > This allows us to encode in the type system whether a method requires
> > that mm_users is non-zero or not. For instance, you can always call
> > `mmget_not_zero` but you can only call `mmap_read_lock` when mm_users is
> > non-zero.
> >
> > It's possible to access current->mm without a refcount increment, but
> > that is added in a later patch of this series.
> >
> > Acked-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> (for mm bits)
> > Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> > ---
> > rust/helpers/helpers.c | 1 +
> > rust/helpers/mm.c | 39 +++++++++
> > rust/kernel/lib.rs | 1 +
> > rust/kernel/mm.rs | 219 +++++++++++++++++++++++++++++++++++++++++++++++++
> > 4 files changed, 260 insertions(+)
> >
> > diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs
> > new file mode 100644
> > index 000000000000..84cba581edaa
> > --- /dev/null
> > +++ b/rust/kernel/mm.rs
> > @@ -0,0 +1,219 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +// Copyright (C) 2024 Google LLC.
> > +
> > +//! Memory management.
>
> Could you add a little more context here?
How about this?
//! Memory management.
//!
//! This module deals with managing the address space of userspace
processes. Each process has an
//! instance of [`Mm`], which keeps track of multiple VMAs (virtual
memory areas). Each VMA
//! corresponds to a region of memory that the userspace process can
access, and the VMA lets you
//! control what happens when userspace reads or writes to that region
of memory.
//!
//! C header: [`include/linux/mm.h`](srctree/include/linux/mm.h)
> > +//!
> > +//! C header: [`include/linux/mm.h`](srctree/include/linux/mm.h)
> > +
> > +use crate::{
> > + bindings,
> > + types::{ARef, AlwaysRefCounted, NotThreadSafe, Opaque},
> > +};
> > +use core::{ops::Deref, ptr::NonNull};
> > +
> > +/// A wrapper for the kernel's `struct mm_struct`.
>
> Could you elaborate the data structure use case? When do I need it, what
> does it do?
How about this?
/// A wrapper for the kernel's `struct mm_struct`.
///
/// This represents the address space of a userspace process, so each
process has one `Mm`
/// instance. It may hold many VMAs internally.
///
/// There is a counter called `mm_users` that counts the users of the
address space; this includes
/// the userspace process itself, but can also include kernel threads
accessing the address space.
/// Once `mm_users` reaches zero, this indicates that the address
space can be destroyed. To access
/// the address space, you must prevent `mm_users` from reaching zero
while you are accessing it.
/// The [`MmWithUser`] type represents an address space where this is
guaranteed, and you can
/// create one using [`mmget_not_zero`].
///
/// The `ARef<Mm>` smart pointer holds an `mmgrab` refcount. Its
destructor may sleep.
> > +///
> > +/// Since `mm_users` may be zero, the associated address space may not exist anymore. You can use
> > +/// [`mmget_not_zero`] to be able to access the address space.
> > +///
> > +/// The `ARef<Mm>` smart pointer holds an `mmgrab` refcount. Its destructor may sleep.
> > +///
> > +/// # Invariants
> > +///
> > +/// Values of this type are always refcounted using `mmgrab`.
> > +///
> > +/// [`mmget_not_zero`]: Mm::mmget_not_zero
> > +#[repr(transparent)]
> > +pub struct Mm {
>
> Could we come up with a better name? `MemoryMap` or `MemoryMapping`?. You
> use `MMapReadGuard` later.
Those names seem really confusing to me. The mmap syscall creates a
new VMA, but MemoryMap sounds like it's the thing that mmap creates.
Lorenzo, what do you think? I'm inclined to just call it Mm since
that's what C calls it.
Alice
"Alice Ryhl" <aliceryhl@google.com> writes:
> On Mon, Dec 16, 2024 at 3:50 PM Andreas Hindborg <a.hindborg@kernel.org> wrote:
>>
>> "Alice Ryhl" <aliceryhl@google.com> writes:
>>
>> > These abstractions allow you to reference a `struct mm_struct` using
>> > both mmgrab and mmget refcounts. This is done using two Rust types:
>> >
>> > * Mm - represents an mm_struct where you don't know anything about the
>> > value of mm_users.
>> > * MmWithUser - represents an mm_struct where you know at compile time
>> > that mm_users is non-zero.
>> >
>> > This allows us to encode in the type system whether a method requires
>> > that mm_users is non-zero or not. For instance, you can always call
>> > `mmget_not_zero` but you can only call `mmap_read_lock` when mm_users is
>> > non-zero.
>> >
>> > It's possible to access current->mm without a refcount increment, but
>> > that is added in a later patch of this series.
>> >
>> > Acked-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> (for mm bits)
>> > Signed-off-by: Alice Ryhl <aliceryhl@google.com>
>> > ---
>> > rust/helpers/helpers.c | 1 +
>> > rust/helpers/mm.c | 39 +++++++++
>> > rust/kernel/lib.rs | 1 +
>> > rust/kernel/mm.rs | 219 +++++++++++++++++++++++++++++++++++++++++++++++++
>> > 4 files changed, 260 insertions(+)
>> >
>> > diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs
>> > new file mode 100644
>> > index 000000000000..84cba581edaa
>> > --- /dev/null
>> > +++ b/rust/kernel/mm.rs
>> > @@ -0,0 +1,219 @@
>> > +// SPDX-License-Identifier: GPL-2.0
>> > +
>> > +// Copyright (C) 2024 Google LLC.
>> > +
>> > +//! Memory management.
>>
>> Could you add a little more context here?
>
> How about this?
>
> //! Memory management.
> //!
> //! This module deals with managing the address space of userspace
> processes. Each process has an
> //! instance of [`Mm`], which keeps track of multiple VMAs (virtual
> memory areas). Each VMA
> //! corresponds to a region of memory that the userspace process can
> access, and the VMA lets you
> //! control what happens when userspace reads or writes to that region
> of memory.
> //!
> //! C header: [`include/linux/mm.h`](srctree/include/linux/mm.h)
Nice 👍
>
>> > +//!
>> > +//! C header: [`include/linux/mm.h`](srctree/include/linux/mm.h)
>> > +
>> > +use crate::{
>> > + bindings,
>> > + types::{ARef, AlwaysRefCounted, NotThreadSafe, Opaque},
>> > +};
>> > +use core::{ops::Deref, ptr::NonNull};
>> > +
>> > +/// A wrapper for the kernel's `struct mm_struct`.
>>
>> Could you elaborate the data structure use case? When do I need it, what
>> does it do?
>
> How about this?
>
> /// A wrapper for the kernel's `struct mm_struct`.
> ///
> /// This represents the address space of a userspace process, so each
> process has one `Mm`
> /// instance. It may hold many VMAs internally.
> ///
> /// There is a counter called `mm_users` that counts the users of the
> address space; this includes
> /// the userspace process itself, but can also include kernel threads
> accessing the address space.
> /// Once `mm_users` reaches zero, this indicates that the address
> space can be destroyed. To access
> /// the address space, you must prevent `mm_users` from reaching zero
> while you are accessing it.
> /// The [`MmWithUser`] type represents an address space where this is
> guaranteed, and you can
> /// create one using [`mmget_not_zero`].
> ///
> /// The `ARef<Mm>` smart pointer holds an `mmgrab` refcount. Its
> destructor may sleep.
Cool 👍
>
>> > +///
>> > +/// Since `mm_users` may be zero, the associated address space may not exist anymore. You can use
>> > +/// [`mmget_not_zero`] to be able to access the address space.
>> > +///
>> > +/// The `ARef<Mm>` smart pointer holds an `mmgrab` refcount. Its destructor may sleep.
>> > +///
>> > +/// # Invariants
>> > +///
>> > +/// Values of this type are always refcounted using `mmgrab`.
>> > +///
>> > +/// [`mmget_not_zero`]: Mm::mmget_not_zero
>> > +#[repr(transparent)]
>> > +pub struct Mm {
>>
>> Could we come up with a better name? `MemoryMap` or `MemoryMapping`?. You
>> use `MMapReadGuard` later.
>
> Those names seem really confusing to me. The mmap syscall creates a
> new VMA, but MemoryMap sounds like it's the thing that mmap creates.
>
> Lorenzo, what do you think? I'm inclined to just call it Mm since
> that's what C calls it.
Well I guess there is value in using same names as C. The additional
docs you sent help a lot so I guess it is fine.
If we were writing from scratch I would have held hard on `AddressSpace`
or `MemoryMap` over `Mm`. `Mm` has got to be one of the least
descriptive names we can come up with.
Best regards,
Andreas Hindborg
On 1/15/25 2:36 AM, Andreas Hindborg wrote:
> "Alice Ryhl" <aliceryhl@google.com> writes:
>> On Mon, Dec 16, 2024 at 3:50 PM Andreas Hindborg <a.hindborg@kernel.org> wrote:
>>> "Alice Ryhl" <aliceryhl@google.com> writes:
...
>>>> +/// [`mmget_not_zero`]: Mm::mmget_not_zero
>>>> +#[repr(transparent)]
>>>> +pub struct Mm {
>>>
>>> Could we come up with a better name? `MemoryMap` or `MemoryMapping`?. You
>>> use `MMapReadGuard` later.
>>
>> Those names seem really confusing to me. The mmap syscall creates a
>> new VMA, but MemoryMap sounds like it's the thing that mmap creates.
>>
>> Lorenzo, what do you think? I'm inclined to just call it Mm since
>> that's what C calls it.
>
> Well I guess there is value in using same names as C. The additional
> docs you sent help a lot so I guess it is fine.
Hi Andreas!
>
> If we were writing from scratch I would have held hard on `AddressSpace`
> or `MemoryMap` over `Mm`. `Mm` has got to be one of the least
> descriptive names we can come up with.
>
...but, see the other thread: "Mm" is actually very effective in the context
of kernel development. And we are doing a perfect mix of kernel and Rust
development here. So it's not from scratch at all.
Kernel engineers will immediately know what "Mm" means! Really.
thanks,
--
John Hubbard
On Mon, Jan 13, 2025 at 10:53:33AM +0100, Alice Ryhl wrote:
> On Mon, Dec 16, 2024 at 3:50 PM Andreas Hindborg <a.hindborg@kernel.org> wrote:
> >
> > "Alice Ryhl" <aliceryhl@google.com> writes:
> >
> > > These abstractions allow you to reference a `struct mm_struct` using
> > > both mmgrab and mmget refcounts. This is done using two Rust types:
> > >
> > > * Mm - represents an mm_struct where you don't know anything about the
> > > value of mm_users.
> > > * MmWithUser - represents an mm_struct where you know at compile time
> > > that mm_users is non-zero.
> > >
> > > This allows us to encode in the type system whether a method requires
> > > that mm_users is non-zero or not. For instance, you can always call
> > > `mmget_not_zero` but you can only call `mmap_read_lock` when mm_users is
> > > non-zero.
> > >
> > > It's possible to access current->mm without a refcount increment, but
> > > that is added in a later patch of this series.
> > >
> > > Acked-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> (for mm bits)
> > > Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> > > ---
> > > rust/helpers/helpers.c | 1 +
> > > rust/helpers/mm.c | 39 +++++++++
> > > rust/kernel/lib.rs | 1 +
> > > rust/kernel/mm.rs | 219 +++++++++++++++++++++++++++++++++++++++++++++++++
> > > 4 files changed, 260 insertions(+)
> > >
> > > diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs
> > > new file mode 100644
> > > index 000000000000..84cba581edaa
> > > --- /dev/null
> > > +++ b/rust/kernel/mm.rs
> > > @@ -0,0 +1,219 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +
> > > +// Copyright (C) 2024 Google LLC.
> > > +
> > > +//! Memory management.
> >
> > Could you add a little more context here?
>
> How about this?
>
> //! Memory management.
> //!
> //! This module deals with managing the address space of userspace
> processes. Each process has an
> //! instance of [`Mm`], which keeps track of multiple VMAs (virtual
> memory areas). Each VMA
> //! corresponds to a region of memory that the userspace process can
> access, and the VMA lets you
> //! control what happens when userspace reads or writes to that region
> of memory.
> //!
> //! C header: [`include/linux/mm.h`](srctree/include/linux/mm.h)
>
> > > +//!
> > > +//! C header: [`include/linux/mm.h`](srctree/include/linux/mm.h)
> > > +
> > > +use crate::{
> > > + bindings,
> > > + types::{ARef, AlwaysRefCounted, NotThreadSafe, Opaque},
> > > +};
> > > +use core::{ops::Deref, ptr::NonNull};
> > > +
> > > +/// A wrapper for the kernel's `struct mm_struct`.
> >
> > Could you elaborate the data structure use case? When do I need it, what
> > does it do?
>
> How about this?
>
> /// A wrapper for the kernel's `struct mm_struct`.
> ///
> /// This represents the address space of a userspace process, so each
> process has one `Mm`
> /// instance. It may hold many VMAs internally.
> ///
> /// There is a counter called `mm_users` that counts the users of the
> address space; this includes
> /// the userspace process itself, but can also include kernel threads
> accessing the address space.
> /// Once `mm_users` reaches zero, this indicates that the address
> space can be destroyed. To access
> /// the address space, you must prevent `mm_users` from reaching zero
> while you are accessing it.
> /// The [`MmWithUser`] type represents an address space where this is
> guaranteed, and you can
> /// create one using [`mmget_not_zero`].
> ///
> /// The `ARef<Mm>` smart pointer holds an `mmgrab` refcount. Its
> destructor may sleep.
>
> > > +///
> > > +/// Since `mm_users` may be zero, the associated address space may not exist anymore. You can use
> > > +/// [`mmget_not_zero`] to be able to access the address space.
> > > +///
> > > +/// The `ARef<Mm>` smart pointer holds an `mmgrab` refcount. Its destructor may sleep.
> > > +///
> > > +/// # Invariants
> > > +///
> > > +/// Values of this type are always refcounted using `mmgrab`.
> > > +///
> > > +/// [`mmget_not_zero`]: Mm::mmget_not_zero
> > > +#[repr(transparent)]
> > > +pub struct Mm {
> >
> > Could we come up with a better name? `MemoryMap` or `MemoryMapping`?. You
> > use `MMapReadGuard` later.
>
> Those names seem really confusing to me. The mmap syscall creates a
> new VMA, but MemoryMap sounds like it's the thing that mmap creates.
>
> Lorenzo, what do you think? I'm inclined to just call it Mm since
> that's what C calls it.
I think Mm is better just for aligment with the C stuff, I mean the alternative
is MmStruct or something and... yuck.
And like, here I am TOTALLY onboard with Andreas here, because this naming
SUCKS. But it sucks on the C side too (we're experts at bad naming :). So for
consistency, let's suck everywhere...
Feel free to put a comment about this being a bad name if you like
though... (not obligatory :)
>
> Alice
On 1/14/25 7:48 AM, Lorenzo Stoakes wrote:
> On Mon, Jan 13, 2025 at 10:53:33AM +0100, Alice Ryhl wrote:
>> On Mon, Dec 16, 2024 at 3:50 PM Andreas Hindborg <a.hindborg@kernel.org> wrote:
>>> "Alice Ryhl" <aliceryhl@google.com> writes:
...
>>>> +/// [`mmget_not_zero`]: Mm::mmget_not_zero
>>>> +#[repr(transparent)]
>>>> +pub struct Mm {
>>>
>>> Could we come up with a better name? `MemoryMap` or `MemoryMapping`?. You
>>> use `MMapReadGuard` later.
>>
>> Those names seem really confusing to me. The mmap syscall creates a
>> new VMA, but MemoryMap sounds like it's the thing that mmap creates.
>>
>> Lorenzo, what do you think? I'm inclined to just call it Mm since
>> that's what C calls it.
>
> I think Mm is better just for aligment with the C stuff, I mean the alternative
> is MmStruct or something and... yuck.
For what it's worth, I think using the C naming here is a very good approach.
Because if you come up with a "good" name that is different than what C has
been calling it for 30+ years, then we have to be very thorough in associating
that new name with the C name. And it's hard.
And "mm struct" goes waaay back. Just use that name and everyone will know
what it means.
For less well-established areas, with fewer callers, there is much more
freedom to come up with new, better names.
>
> And like, here I am TOTALLY onboard with Andreas here, because this naming
> SUCKS. But it sucks on the C side too (we're experts at bad naming :). So for
> consistency, let's suck everywhere...
>
> Feel free to put a comment about this being a bad name if you like
> though... (not obligatory :)
For mm struct? Maybe let's not! Explanation without the criticism seems
more appropriate imho. :)
btw, I'm very excited to see all of this Rust for Linux progress, it is
wonderful! Thank you for this!
thanks,
--
John Hubbard
On Tue, Jan 14, 2025 at 05:54:15PM -0800, John Hubbard wrote:
> On 1/14/25 7:48 AM, Lorenzo Stoakes wrote:
> > On Mon, Jan 13, 2025 at 10:53:33AM +0100, Alice Ryhl wrote:
> > > On Mon, Dec 16, 2024 at 3:50 PM Andreas Hindborg <a.hindborg@kernel.org> wrote:
> > > > "Alice Ryhl" <aliceryhl@google.com> writes:
> ...
> > > > > +/// [`mmget_not_zero`]: Mm::mmget_not_zero
> > > > > +#[repr(transparent)]
> > > > > +pub struct Mm {
> > > >
> > > > Could we come up with a better name? `MemoryMap` or `MemoryMapping`?. You
> > > > use `MMapReadGuard` later.
> > >
> > > Those names seem really confusing to me. The mmap syscall creates a
> > > new VMA, but MemoryMap sounds like it's the thing that mmap creates.
> > >
> > > Lorenzo, what do you think? I'm inclined to just call it Mm since
> > > that's what C calls it.
> >
> > I think Mm is better just for aligment with the C stuff, I mean the alternative
> > is MmStruct or something and... yuck.
>
> For what it's worth, I think using the C naming here is a very good approach.
> Because if you come up with a "good" name that is different than what C has
> been calling it for 30+ years, then we have to be very thorough in associating
> that new name with the C name. And it's hard.
100% agree!
>
> And "mm struct" goes waaay back. Just use that name and everyone will know
> what it means.
>
> For less well-established areas, with fewer callers, there is much more
> freedom to come up with new, better names.
>
> >
> > And like, here I am TOTALLY onboard with Andreas here, because this naming
> > SUCKS. But it sucks on the C side too (we're experts at bad naming :). So for
> > consistency, let's suck everywhere...
> >
> > Feel free to put a comment about this being a bad name if you like
> > though... (not obligatory :)
>
> For mm struct? Maybe let's not! Explanation without the criticism seems
> more appropriate imho. :)
;) Well one could phrase this in a relatifvely benign way for instance 'while
this name may seem a little unclear, historically it has been used as a
shorthand within the kernel for time immemorial' or such.
>
> btw, I'm very excited to see all of this Rust for Linux progress, it is
> wonderful! Thank you for this!
+1 to this sentiment, am very happy to try to do my best to add value to get
this series in as - from my perspective - I want the compiler to tell me when I
make mistakes nice and early :))
Thanks Alice, Andreas and all involved!
>
>
> thanks,
> --
> John Hubbard
>
© 2016 - 2026 Red Hat, Inc.