[PATCH v10 8/8] rust: file: add abstraction for `poll_table`

Alice Ryhl posted 8 patches 2 months, 2 weeks ago
Posted by Alice Ryhl 2 months, 2 weeks ago
The existing `CondVar` abstraction is a wrapper around
`wait_queue_head`, but it does not support all use-cases of the C
`wait_queue_head` type. To be specific, a `CondVar` cannot be registered
with a `struct poll_table`. This limitation has the advantage that you
do not need to call `synchronize_rcu` when destroying a `CondVar`.

However, we need the ability to register a `poll_table` with a
`wait_queue_head` in Rust Binder. To enable this, introduce a type
called `PollCondVar`, which is like `CondVar` except that you can
register a `poll_table`. We also introduce `PollTable`, which is a safe
wrapper around `poll_table` that is intended to be used with
`PollCondVar`.

The destructor of `PollCondVar` unconditionally calls `synchronize_rcu`
to ensure that the removal of epoll waiters has fully completed before
the `wait_queue_head` is destroyed.

That said, `synchronize_rcu` is rather expensive and is not needed in
all cases: If we have never registered a `poll_table` with the
`wait_queue_head`, then we don't need to call `synchronize_rcu`. (And
this is a common case in Binder - not all processes use Binder with
epoll.) The current implementation does not account for this, but if we
find that it is necessary to improve this, a future patch could store a
boolean next to the `wait_queue_head` to keep track of whether a
`poll_table` has ever been registered.

Reviewed-by: Benno Lossin <benno.lossin@proton.me>
Reviewed-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com>
Reviewed-by: Trevor Gross <tmgross@umich.edu>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 rust/bindings/bindings_helper.h |   1 +
 rust/kernel/sync.rs             |   1 +
 rust/kernel/sync/poll.rs        | 121 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 123 insertions(+)

diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
index e854ccddecee..ca13659ded4c 100644
--- a/rust/bindings/bindings_helper.h
+++ b/rust/bindings/bindings_helper.h
@@ -20,6 +20,7 @@
 #include <linux/mdio.h>
 #include <linux/phy.h>
 #include <linux/pid_namespace.h>
+#include <linux/poll.h>
 #include <linux/refcount.h>
 #include <linux/sched.h>
 #include <linux/security.h>
diff --git a/rust/kernel/sync.rs b/rust/kernel/sync.rs
index 0ab20975a3b5..bae4a5179c72 100644
--- a/rust/kernel/sync.rs
+++ b/rust/kernel/sync.rs
@@ -11,6 +11,7 @@
 mod condvar;
 pub mod lock;
 mod locked_by;
+pub mod poll;
 
 pub use arc::{Arc, ArcBorrow, UniqueArc};
 pub use condvar::{new_condvar, CondVar, CondVarTimeoutResult};
diff --git a/rust/kernel/sync/poll.rs b/rust/kernel/sync/poll.rs
new file mode 100644
index 000000000000..d5f17153b424
--- /dev/null
+++ b/rust/kernel/sync/poll.rs
@@ -0,0 +1,121 @@
+// SPDX-License-Identifier: GPL-2.0
+
+// Copyright (C) 2024 Google LLC.
+
+//! Utilities for working with `struct poll_table`.
+
+use crate::{
+    bindings,
+    fs::File,
+    prelude::*,
+    sync::{CondVar, LockClassKey},
+    types::Opaque,
+};
+use core::ops::Deref;
+
+/// Creates a [`PollCondVar`] initialiser with the given name and a newly-created lock class.
+#[macro_export]
+macro_rules! new_poll_condvar {
+    ($($name:literal)?) => {
+        $crate::sync::poll::PollCondVar::new(
+            $crate::optional_name!($($name)?), $crate::static_lock_class!()
+        )
+    };
+}
+
+/// Wraps the kernel's `struct poll_table`.
+///
+/// # Invariants
+///
+/// This struct contains a valid `struct poll_table`.
+///
+/// For a `struct poll_table` to be valid, its `_qproc` function must follow the safety
+/// requirements of `_qproc` functions:
+///
+/// * The `_qproc` function is given permission to enqueue a waiter to the provided `poll_table`
+///   during the call. Once the waiter is removed and an rcu grace period has passed, it must no
+///   longer access the `wait_queue_head`.
+#[repr(transparent)]
+pub struct PollTable(Opaque<bindings::poll_table>);
+
+impl PollTable {
+    /// Creates a reference to a [`PollTable`] from a valid pointer.
+    ///
+    /// # Safety
+    ///
+    /// The caller must ensure that for the duration of 'a, the pointer will point at a valid poll
+    /// table (as defined in the type invariants).
+    ///
+    /// The caller must also ensure that the `poll_table` is only accessed via the returned
+    /// reference for the duration of 'a.
+    pub unsafe fn from_ptr<'a>(ptr: *mut bindings::poll_table) -> &'a mut PollTable {
+        // SAFETY: The safety requirements guarantee the validity of the dereference, while the
+        // `PollTable` type being transparent makes the cast ok.
+        unsafe { &mut *ptr.cast() }
+    }
+
+    fn get_qproc(&self) -> bindings::poll_queue_proc {
+        let ptr = self.0.get();
+        // SAFETY: The `ptr` is valid because it originates from a reference, and the `_qproc`
+        // field is not modified concurrently with this call since we have an immutable reference.
+        unsafe { (*ptr)._qproc }
+    }
+
+    /// Register this [`PollTable`] with the provided [`PollCondVar`], so that it can be notified
+    /// using the condition variable.
+    pub fn register_wait(&mut self, file: &File, cv: &PollCondVar) {
+        if let Some(qproc) = self.get_qproc() {
+            // SAFETY: The pointers to `file` and `self` need to be valid for the duration of this
+            // call to `qproc`, which they are because they are references.
+            //
+            // The `cv.wait_queue_head` pointer must be valid until an rcu grace period after the
+            // waiter is removed. The `PollCondVar` is pinned, so before `cv.wait_queue_head` can
+            // be destroyed, the destructor must run. That destructor first removes all waiters,
+            // and then waits for an rcu grace period. Therefore, `cv.wait_queue_head` is valid for
+            // long enough.
+            unsafe { qproc(file.as_ptr() as _, cv.wait_queue_head.get(), self.0.get()) };
+        }
+    }
+}
+
+/// A wrapper around [`CondVar`] that makes it usable with [`PollTable`].
+///
+/// [`CondVar`]: crate::sync::CondVar
+#[pin_data(PinnedDrop)]
+pub struct PollCondVar {
+    #[pin]
+    inner: CondVar,
+}
+
+impl PollCondVar {
+    /// Constructs a new condvar initialiser.
+    pub fn new(name: &'static CStr, key: &'static LockClassKey) -> impl PinInit<Self> {
+        pin_init!(Self {
+            inner <- CondVar::new(name, key),
+        })
+    }
+}
+
+// Make the `CondVar` methods callable on `PollCondVar`.
+impl Deref for PollCondVar {
+    type Target = CondVar;
+
+    fn deref(&self) -> &CondVar {
+        &self.inner
+    }
+}
+
+#[pinned_drop]
+impl PinnedDrop for PollCondVar {
+    fn drop(self: Pin<&mut Self>) {
+        // Clear anything registered using `register_wait`.
+        //
+        // SAFETY: The pointer points at a valid `wait_queue_head`.
+        unsafe { bindings::__wake_up_pollfree(self.inner.wait_queue_head.get()) };
+
+        // Wait for epoll items to be properly removed.
+        //
+        // SAFETY: Just an FFI call.
+        unsafe { bindings::synchronize_rcu() };
+    }
+}

-- 
2.46.0.662.g92d0881bb0-goog
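
The commit message mentions a possible follow-up: store a boolean next to the `wait_queue_head` and skip `synchronize_rcu` when no `poll_table` was ever registered. The following is a minimal userspace sketch of that idea, not part of the patch. `TrackedPollCondVar`, `mock_synchronize_rcu` and the counter are hypothetical names invented for illustration; the real change would live in `PollCondVar` and call the kernel bindings.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

// Counts calls to the stand-in grace-period wait so the behaviour is observable.
static GRACE_PERIOD_WAITS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for `bindings::synchronize_rcu()`.
fn mock_synchronize_rcu() {
    GRACE_PERIOD_WAITS.fetch_add(1, Ordering::SeqCst);
}

fn grace_period_waits() -> usize {
    GRACE_PERIOD_WAITS.load(Ordering::SeqCst)
}

// Userspace model of a `PollCondVar` that remembers whether it was ever registered.
struct TrackedPollCondVar {
    ever_registered: AtomicBool,
}

impl TrackedPollCondVar {
    fn new() -> Self {
        Self {
            ever_registered: AtomicBool::new(false),
        }
    }

    // Models `PollTable::register_wait`: record the registration first.
    fn register_wait(&self) {
        self.ever_registered.store(true, Ordering::Release);
        // A real implementation would invoke the `_qproc` callback here.
    }
}

impl Drop for TrackedPollCondVar {
    fn drop(&mut self) {
        // Only pay for the grace period if a waiter could ever have been enqueued.
        if self.ever_registered.load(Ordering::Acquire) {
            // A real implementation would call `__wake_up_pollfree` before this.
            mock_synchronize_rcu();
        }
    }
}

fn main() {
    drop(TrackedPollCondVar::new());
    println!("waits after unregistered drop: {}", grace_period_waits());

    let cv = TrackedPollCondVar::new();
    cv.register_wait();
    drop(cv);
    println!("waits after registered drop: {}", grace_period_waits());
}
```

The flag is set before the callback would run, so a destructor that observes `false` knows no waiter was ever enqueued. Whether this is worth the extra field depends on how often condvars are destroyed without ever being polled, which the commit message suggests is common in Binder.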
Re: [PATCH v10 8/8] rust: file: add abstraction for `poll_table`
Posted by Gary Guo 2 months, 2 weeks ago
On Sun, 15 Sep 2024 14:31:34 +0000
Alice Ryhl <aliceryhl@google.com> wrote:

> The existing `CondVar` abstraction is a wrapper around
> `wait_queue_head`, but it does not support all use-cases of the C
> `wait_queue_head` type. To be specific, a `CondVar` cannot be registered
> with a `struct poll_table`. This limitation has the advantage that you
> do not need to call `synchronize_rcu` when destroying a `CondVar`.
> 
> However, we need the ability to register a `poll_table` with a
> `wait_queue_head` in Rust Binder. To enable this, introduce a type
> called `PollCondVar`, which is like `CondVar` except that you can
> register a `poll_table`. We also introduce `PollTable`, which is a safe
> wrapper around `poll_table` that is intended to be used with
> `PollCondVar`.
> 
> The destructor of `PollCondVar` unconditionally calls `synchronize_rcu`
> to ensure that the removal of epoll waiters has fully completed before
> the `wait_queue_head` is destroyed.
> 
> That said, `synchronize_rcu` is rather expensive and is not needed in
> all cases: If we have never registered a `poll_table` with the
> `wait_queue_head`, then we don't need to call `synchronize_rcu`. (And
> this is a common case in Binder - not all processes use Binder with
> epoll.) The current implementation does not account for this, but if we
> find that it is necessary to improve this, a future patch could store a
> boolean next to the `wait_queue_head` to keep track of whether a
> `poll_table` has ever been registered.
> 
> Reviewed-by: Benno Lossin <benno.lossin@proton.me>
> Reviewed-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com>
> Reviewed-by: Trevor Gross <tmgross@umich.edu>
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> ---
>  rust/bindings/bindings_helper.h |   1 +
>  rust/kernel/sync.rs             |   1 +
>  rust/kernel/sync/poll.rs        | 121 ++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 123 insertions(+)
> 
> diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
> index e854ccddecee..ca13659ded4c 100644
> --- a/rust/bindings/bindings_helper.h
> +++ b/rust/bindings/bindings_helper.h
> @@ -20,6 +20,7 @@
>  #include <linux/mdio.h>
>  #include <linux/phy.h>
>  #include <linux/pid_namespace.h>
> +#include <linux/poll.h>
>  #include <linux/refcount.h>
>  #include <linux/sched.h>
>  #include <linux/security.h>
> diff --git a/rust/kernel/sync.rs b/rust/kernel/sync.rs
> index 0ab20975a3b5..bae4a5179c72 100644
> --- a/rust/kernel/sync.rs
> +++ b/rust/kernel/sync.rs
> @@ -11,6 +11,7 @@
>  mod condvar;
>  pub mod lock;
>  mod locked_by;
> +pub mod poll;
>  
>  pub use arc::{Arc, ArcBorrow, UniqueArc};
>  pub use condvar::{new_condvar, CondVar, CondVarTimeoutResult};
> diff --git a/rust/kernel/sync/poll.rs b/rust/kernel/sync/poll.rs
> new file mode 100644
> index 000000000000..d5f17153b424
> --- /dev/null
> +++ b/rust/kernel/sync/poll.rs
> @@ -0,0 +1,121 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +// Copyright (C) 2024 Google LLC.
> +
> +//! Utilities for working with `struct poll_table`.
> +
> +use crate::{
> +    bindings,
> +    fs::File,
> +    prelude::*,
> +    sync::{CondVar, LockClassKey},
> +    types::Opaque,
> +};
> +use core::ops::Deref;
> +
> +/// Creates a [`PollCondVar`] initialiser with the given name and a newly-created lock class.
> +#[macro_export]
> +macro_rules! new_poll_condvar {
> +    ($($name:literal)?) => {
> +        $crate::sync::poll::PollCondVar::new(
> +            $crate::optional_name!($($name)?), $crate::static_lock_class!()
> +        )
> +    };
> +}
> +
> +/// Wraps the kernel's `struct poll_table`.
> +///
> +/// # Invariants
> +///
> +/// This struct contains a valid `struct poll_table`.
> +///
> +/// For a `struct poll_table` to be valid, its `_qproc` function must follow the safety
> +/// requirements of `_qproc` functions:
> +///
> +/// * The `_qproc` function is given permission to enqueue a waiter to the provided `poll_table`
> +///   during the call. Once the waiter is removed and an rcu grace period has passed, it must no
> +///   longer access the `wait_queue_head`.
> +#[repr(transparent)]
> +pub struct PollTable(Opaque<bindings::poll_table>);
> +
> +impl PollTable {
> +    /// Creates a reference to a [`PollTable`] from a valid pointer.
> +    ///
> +    /// # Safety
> +    ///
> +    /// The caller must ensure that for the duration of 'a, the pointer will point at a valid poll
> +    /// table (as defined in the type invariants).
> +    ///
> +    /// The caller must also ensure that the `poll_table` is only accessed via the returned
> +    /// reference for the duration of 'a.
> +    pub unsafe fn from_ptr<'a>(ptr: *mut bindings::poll_table) -> &'a mut PollTable {
> +        // SAFETY: The safety requirements guarantee the validity of the dereference, while the
> +        // `PollTable` type being transparent makes the cast ok.
> +        unsafe { &mut *ptr.cast() }
> +    }
> +
> +    fn get_qproc(&self) -> bindings::poll_queue_proc {
> +        let ptr = self.0.get();
> +        // SAFETY: The `ptr` is valid because it originates from a reference, and the `_qproc`
> +        // field is not modified concurrently with this call since we have an immutable reference.
> +        unsafe { (*ptr)._qproc }
> +    }
> +
> +    /// Register this [`PollTable`] with the provided [`PollCondVar`], so that it can be notified
> +    /// using the condition variable.
> +    pub fn register_wait(&mut self, file: &File, cv: &PollCondVar) {
> +        if let Some(qproc) = self.get_qproc() {
> +            // SAFETY: The pointers to `file` and `self` need to be valid for the duration of this
> +            // call to `qproc`, which they are because they are references.
> +            //
> +            // The `cv.wait_queue_head` pointer must be valid until an rcu grace period after the
> +            // waiter is removed. The `PollCondVar` is pinned, so before `cv.wait_queue_head` can
> +            // be destroyed, the destructor must run. That destructor first removes all waiters,
> +            // and then waits for an rcu grace period. Therefore, `cv.wait_queue_head` is valid for
> +            // long enough.
> +            unsafe { qproc(file.as_ptr() as _, cv.wait_queue_head.get(), self.0.get()) };
> +        }

Should this be calling `poll_wait` instead?

> +    }
> +}
> +
> +/// A wrapper around [`CondVar`] that makes it usable with [`PollTable`].
> +///
> +/// [`CondVar`]: crate::sync::CondVar
> +#[pin_data(PinnedDrop)]
> +pub struct PollCondVar {
> +    #[pin]
> +    inner: CondVar,
> +}
> +
> +impl PollCondVar {
> +    /// Constructs a new condvar initialiser.
> +    pub fn new(name: &'static CStr, key: &'static LockClassKey) -> impl PinInit<Self> {
> +        pin_init!(Self {
> +            inner <- CondVar::new(name, key),
> +        })
> +    }
> +}
> +
> +// Make the `CondVar` methods callable on `PollCondVar`.
> +impl Deref for PollCondVar {
> +    type Target = CondVar;
> +
> +    fn deref(&self) -> &CondVar {
> +        &self.inner
> +    }
> +}
> +
> +#[pinned_drop]
> +impl PinnedDrop for PollCondVar {
> +    fn drop(self: Pin<&mut Self>) {
> +        // Clear anything registered using `register_wait`.
> +        //
> +        // SAFETY: The pointer points at a valid `wait_queue_head`.
> +        unsafe { bindings::__wake_up_pollfree(self.inner.wait_queue_head.get()) };

Should this use `wake_up_pollfree` (without the leading __)?

> +
> +        // Wait for epoll items to be properly removed.
> +        //
> +        // SAFETY: Just an FFI call.
> +        unsafe { bindings::synchronize_rcu() };
> +    }
> +}
>
Re: [PATCH v10 8/8] rust: file: add abstraction for `poll_table`
Posted by Alice Ryhl 2 months, 1 week ago
On Mon, Sep 16, 2024 at 12:24 AM Gary Guo <gary@garyguo.net> wrote:
>
> On Sun, 15 Sep 2024 14:31:34 +0000
> Alice Ryhl <aliceryhl@google.com> wrote:
> > +    /// Register this [`PollTable`] with the provided [`PollCondVar`], so that it can be notified
> > +    /// using the condition variable.
> > +    pub fn register_wait(&mut self, file: &File, cv: &PollCondVar) {
> > +        if let Some(qproc) = self.get_qproc() {
> > +            // SAFETY: The pointers to `file` and `self` need to be valid for the duration of this
> > +            // call to `qproc`, which they are because they are references.
> > +            //
> > +            // The `cv.wait_queue_head` pointer must be valid until an rcu grace period after the
> > +            // waiter is removed. The `PollCondVar` is pinned, so before `cv.wait_queue_head` can
> > +            // be destroyed, the destructor must run. That destructor first removes all waiters,
> > +            // and then waits for an rcu grace period. Therefore, `cv.wait_queue_head` is valid for
> > +            // long enough.
> > +            unsafe { qproc(file.as_ptr() as _, cv.wait_queue_head.get(), self.0.get()) };
> > +        }
>
> Should this be calling `poll_wait` instead?
>
> > +#[pinned_drop]
> > +impl PinnedDrop for PollCondVar {
> > +    fn drop(self: Pin<&mut Self>) {
> > +        // Clear anything registered using `register_wait`.
> > +        //
> > +        // SAFETY: The pointer points at a valid `wait_queue_head`.
> > +        unsafe { bindings::__wake_up_pollfree(self.inner.wait_queue_head.get()) };
>
> Should this use `wake_up_pollfree` (without the leading __)?

For both cases, that would require a Rust helper. But I suppose we could do it.

Alice
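
For context on the helper remark: `poll_wait` and `wake_up_pollfree` are `static inline` functions in the C headers, so bindgen emits no binding for them and a small C helper would be needed to call them from Rust. What the inline wrappers add over the calls in the patch is a guard check before the underlying call. The sketch below is a userspace model of those guards only. All types and function names in it are hypothetical stand-ins, and the exact conditions are an assumption based on the headers around the time of this thread; they may differ in other kernel versions.

```rust
// Hypothetical userspace stand-ins; none of these are kernel types.
struct WaitQueueHead {
    waiters: Vec<u32>,
}

struct PollTable {
    // Models the nullable `_qproc` function pointer.
    qproc: Option<fn(&mut WaitQueueHead)>,
}

// Models `poll_wait`: call `_qproc` only when the table and the callback exist.
// (The C function also checks the queue pointer; a reference is never null.)
// Returns whether the callback ran.
fn poll_wait_model(wq: &mut WaitQueueHead, table: Option<&PollTable>) -> bool {
    match table.and_then(|t| t.qproc) {
        Some(qproc) => {
            qproc(wq);
            true
        }
        None => false,
    }
}

// Models `wake_up_pollfree`: skip the slow path when the queue has no waiters.
// Returns whether the slow path ran.
fn wake_up_pollfree_model(wq: &mut WaitQueueHead) -> bool {
    if wq.waiters.is_empty() {
        return false;
    }
    // Stands in for `__wake_up_pollfree`, which removes the poll waiters.
    wq.waiters.clear();
    true
}

// Example callback that enqueues a single waiter.
fn enqueue_waiter(wq: &mut WaitQueueHead) {
    wq.waiters.push(1);
}

fn main() {
    let mut wq = WaitQueueHead { waiters: Vec::new() };
    let table = PollTable {
        qproc: Some(enqueue_waiter),
    };

    println!("empty queue woken: {}", wake_up_pollfree_model(&mut wq));
    println!("callback ran: {}", poll_wait_model(&mut wq, Some(&table)));
    println!("populated queue woken: {}", wake_up_pollfree_model(&mut wq));
}
```

In the patch as posted, `register_wait` already performs the `_qproc` null check by hand through `if let Some(qproc)`, while the destructor calls `__wake_up_pollfree` without first checking whether the queue is active.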