The existing `CondVar` abstraction is a wrapper around
`wait_queue_head`, but it does not support all use-cases of the C
`wait_queue_head` type. To be specific, a `CondVar` cannot be registered
with a `struct poll_table`. This limitation has the advantage that you
do not need to call `synchronize_rcu` when destroying a `CondVar`.

However, we need the ability to register a `poll_table` with a
`wait_queue_head` in Rust Binder. To enable this, introduce a type
called `PollCondVar`, which is like `CondVar` except that you can
register a `poll_table`. We also introduce `PollTable`, which is a safe
wrapper around `poll_table` that is intended to be used with
`PollCondVar`.

The destructor of `PollCondVar` unconditionally calls `synchronize_rcu`
to ensure that the removal of epoll waiters has fully completed before
the `wait_queue_head` is destroyed.

That said, `synchronize_rcu` is rather expensive and is not needed in
all cases: If we have never registered a `poll_table` with the
`wait_queue_head`, then we don't need to call `synchronize_rcu`. (And
this is a common case in Binder - not all processes use Binder with
epoll.) The current implementation does not account for this, but if we
find that it is necessary to improve this, a future patch could store a
boolean next to the `wait_queue_head` to keep track of whether a
`poll_table` has ever been registered.

Reviewed-by: Benno Lossin <benno.lossin@proton.me>
Reviewed-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com>
Reviewed-by: Trevor Gross <tmgross@umich.edu>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
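The optimization mentioned in the last paragraph of the commit message (a boolean next to the `wait_queue_head`) could look roughly like the following userspace model. This is only a sketch and not part of the patch: `TrackedPollCondVar`, `ever_registered`, `demo_counts` and the counter that stands in for `synchronize_rcu` are hypothetical names, and the memory orderings are a conservative assumption rather than something the patch specifies.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

/// Userspace model of a `PollCondVar` that remembers whether a poll table
/// was ever registered with it. All names here are hypothetical.
struct TrackedPollCondVar<'a> {
    /// Set on the first registration and never cleared.
    ever_registered: AtomicBool,
    /// Stand-in for `synchronize_rcu`: counts how often we would have waited.
    grace_period_waits: &'a AtomicUsize,
}

impl<'a> TrackedPollCondVar<'a> {
    fn new(grace_period_waits: &'a AtomicUsize) -> Self {
        Self {
            ever_registered: AtomicBool::new(false),
            grace_period_waits,
        }
    }

    /// Models `PollTable::register_wait`: record the registration before a
    /// waiter would be enqueued.
    fn register_wait(&self) {
        self.ever_registered.store(true, Ordering::Release);
        // The kernel code would call the table's `_qproc` here.
    }
}

impl Drop for TrackedPollCondVar<'_> {
    fn drop(&mut self) {
        // Only pay for a grace period if a waiter could ever have been enqueued.
        if self.ever_registered.load(Ordering::Acquire) {
            // The kernel code would call `__wake_up_pollfree` first, then
            // `synchronize_rcu`; here we only count the wait.
            self.grace_period_waits.fetch_add(1, Ordering::SeqCst);
        }
    }
}

/// Returns the number of modelled grace-period waits after dropping an
/// unregistered condvar, and then after also dropping a registered one.
fn demo_counts() -> (usize, usize) {
    let waits = AtomicUsize::new(0);

    drop(TrackedPollCondVar::new(&waits));
    let after_unregistered = waits.load(Ordering::SeqCst);

    let cv = TrackedPollCondVar::new(&waits);
    cv.register_wait();
    drop(cv);
    let after_registered = waits.load(Ordering::SeqCst);

    (after_unregistered, after_registered)
}
```

In this model only the condvar that was registered pays for the grace-period wait on drop. The cost is one extra flag per `PollCondVar` and one extra store per registration.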
 rust/bindings/bindings_helper.h |   1 +
 rust/kernel/sync.rs             |   1 +
 rust/kernel/sync/poll.rs        | 121 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 123 insertions(+)

diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
index e854ccddecee..ca13659ded4c 100644
--- a/rust/bindings/bindings_helper.h
+++ b/rust/bindings/bindings_helper.h
@@ -20,6 +20,7 @@
 #include <linux/mdio.h>
 #include <linux/phy.h>
 #include <linux/pid_namespace.h>
+#include <linux/poll.h>
 #include <linux/refcount.h>
 #include <linux/sched.h>
 #include <linux/security.h>
diff --git a/rust/kernel/sync.rs b/rust/kernel/sync.rs
index 0ab20975a3b5..bae4a5179c72 100644
--- a/rust/kernel/sync.rs
+++ b/rust/kernel/sync.rs
@@ -11,6 +11,7 @@
 mod condvar;
 pub mod lock;
 mod locked_by;
+pub mod poll;
 
 pub use arc::{Arc, ArcBorrow, UniqueArc};
 pub use condvar::{new_condvar, CondVar, CondVarTimeoutResult};
diff --git a/rust/kernel/sync/poll.rs b/rust/kernel/sync/poll.rs
new file mode 100644
index 000000000000..d5f17153b424
--- /dev/null
+++ b/rust/kernel/sync/poll.rs
@@ -0,0 +1,121 @@
+// SPDX-License-Identifier: GPL-2.0
+
+// Copyright (C) 2024 Google LLC.
+
+//! Utilities for working with `struct poll_table`.
+
+use crate::{
+    bindings,
+    fs::File,
+    prelude::*,
+    sync::{CondVar, LockClassKey},
+    types::Opaque,
+};
+use core::ops::Deref;
+
+/// Creates a [`PollCondVar`] initialiser with the given name and a newly-created lock class.
+#[macro_export]
+macro_rules! new_poll_condvar {
+    ($($name:literal)?) => {
+        $crate::sync::poll::PollCondVar::new(
+            $crate::optional_name!($($name)?), $crate::static_lock_class!()
+        )
+    };
+}
+
+/// Wraps the kernel's `struct poll_table`.
+///
+/// # Invariants
+///
+/// This struct contains a valid `struct poll_table`.
+///
+/// For a `struct poll_table` to be valid, its `_qproc` function must follow the safety
+/// requirements of `_qproc` functions:
+///
+/// * The `_qproc` function is given permission to enqueue a waiter to the provided `poll_table`
+///   during the call. Once the waiter is removed and an rcu grace period has passed, it must no
+///   longer access the `wait_queue_head`.
+#[repr(transparent)]
+pub struct PollTable(Opaque<bindings::poll_table>);
+
+impl PollTable {
+    /// Creates a reference to a [`PollTable`] from a valid pointer.
+    ///
+    /// # Safety
+    ///
+    /// The caller must ensure that for the duration of 'a, the pointer will point at a valid poll
+    /// table (as defined in the type invariants).
+    ///
+    /// The caller must also ensure that the `poll_table` is only accessed via the returned
+    /// reference for the duration of 'a.
+    pub unsafe fn from_ptr<'a>(ptr: *mut bindings::poll_table) -> &'a mut PollTable {
+        // SAFETY: The safety requirements guarantee the validity of the dereference, while the
+        // `PollTable` type being transparent makes the cast ok.
+        unsafe { &mut *ptr.cast() }
+    }
+
+    fn get_qproc(&self) -> bindings::poll_queue_proc {
+        let ptr = self.0.get();
+        // SAFETY: The `ptr` is valid because it originates from a reference, and the `_qproc`
+        // field is not modified concurrently with this call since we have an immutable reference.
+        unsafe { (*ptr)._qproc }
+    }
+
+    /// Register this [`PollTable`] with the provided [`PollCondVar`], so that it can be notified
+    /// using the condition variable.
+    pub fn register_wait(&mut self, file: &File, cv: &PollCondVar) {
+        if let Some(qproc) = self.get_qproc() {
+            // SAFETY: The pointers to `file` and `self` need to be valid for the duration of this
+            // call to `qproc`, which they are because they are references.
+            //
+            // The `cv.wait_queue_head` pointer must be valid until an rcu grace period after the
+            // waiter is removed. The `PollCondVar` is pinned, so before `cv.wait_queue_head` can
+            // be destroyed, the destructor must run. That destructor first removes all waiters,
+            // and then waits for an rcu grace period. Therefore, `cv.wait_queue_head` is valid for
+            // long enough.
+            unsafe { qproc(file.as_ptr() as _, cv.wait_queue_head.get(), self.0.get()) };
+        }
+    }
+}
+
+/// A wrapper around [`CondVar`] that makes it usable with [`PollTable`].
+///
+/// [`CondVar`]: crate::sync::CondVar
+#[pin_data(PinnedDrop)]
+pub struct PollCondVar {
+    #[pin]
+    inner: CondVar,
+}
+
+impl PollCondVar {
+    /// Constructs a new condvar initialiser.
+    pub fn new(name: &'static CStr, key: &'static LockClassKey) -> impl PinInit<Self> {
+        pin_init!(Self {
+            inner <- CondVar::new(name, key),
+        })
+    }
+}
+
+// Make the `CondVar` methods callable on `PollCondVar`.
+impl Deref for PollCondVar {
+    type Target = CondVar;
+
+    fn deref(&self) -> &CondVar {
+        &self.inner
+    }
+}
+
+#[pinned_drop]
+impl PinnedDrop for PollCondVar {
+    fn drop(self: Pin<&mut Self>) {
+        // Clear anything registered using `register_wait`.
+        //
+        // SAFETY: The pointer points at a valid `wait_queue_head`.
+        unsafe { bindings::__wake_up_pollfree(self.inner.wait_queue_head.get()) };
+
+        // Wait for epoll items to be properly removed.
+        //
+        // SAFETY: Just an FFI call.
+        unsafe { bindings::synchronize_rcu() };
+    }
+}
--
2.46.0.662.g92d0881bb0-goog

On Sun, 15 Sep 2024 14:31:34 +0000
Alice Ryhl <aliceryhl@google.com> wrote:

> +    /// Register this [`PollTable`] with the provided [`PollCondVar`], so that it can be notified
> +    /// using the condition variable.
> +    pub fn register_wait(&mut self, file: &File, cv: &PollCondVar) {
> +        if let Some(qproc) = self.get_qproc() {
> +            // SAFETY: The pointers to `file` and `self` need to be valid for the duration of this
> +            // call to `qproc`, which they are because they are references.
> +            //
> +            // The `cv.wait_queue_head` pointer must be valid until an rcu grace period after the
> +            // waiter is removed. The `PollCondVar` is pinned, so before `cv.wait_queue_head` can
> +            // be destroyed, the destructor must run. That destructor first removes all waiters,
> +            // and then waits for an rcu grace period. Therefore, `cv.wait_queue_head` is valid for
> +            // long enough.
> +            unsafe { qproc(file.as_ptr() as _, cv.wait_queue_head.get(), self.0.get()) };
> +        }

Should this be calling `poll_wait` instead?

> +#[pinned_drop]
> +impl PinnedDrop for PollCondVar {
> +    fn drop(self: Pin<&mut Self>) {
> +        // Clear anything registered using `register_wait`.
> +        //
> +        // SAFETY: The pointer points at a valid `wait_queue_head`.
> +        unsafe { bindings::__wake_up_pollfree(self.inner.wait_queue_head.get()) };

Should this use `wake_up_pollfree` (without the leading __)?

On Mon, Sep 16, 2024 at 12:24 AM Gary Guo <gary@garyguo.net> wrote:
>
> On Sun, 15 Sep 2024 14:31:34 +0000
> Alice Ryhl <aliceryhl@google.com> wrote:
> > +    /// Register this [`PollTable`] with the provided [`PollCondVar`], so that it can be notified
> > +    /// using the condition variable.
> > +    pub fn register_wait(&mut self, file: &File, cv: &PollCondVar) {
> > +        if let Some(qproc) = self.get_qproc() {
> > +            // SAFETY: The pointers to `file` and `self` need to be valid for the duration of this
> > +            // call to `qproc`, which they are because they are references.
> > +            //
> > +            // The `cv.wait_queue_head` pointer must be valid until an rcu grace period after the
> > +            // waiter is removed. The `PollCondVar` is pinned, so before `cv.wait_queue_head` can
> > +            // be destroyed, the destructor must run. That destructor first removes all waiters,
> > +            // and then waits for an rcu grace period. Therefore, `cv.wait_queue_head` is valid for
> > +            // long enough.
> > +            unsafe { qproc(file.as_ptr() as _, cv.wait_queue_head.get(), self.0.get()) };
> > +        }
>
> Should this be calling `poll_wait` instead?
>
> > +#[pinned_drop]
> > +impl PinnedDrop for PollCondVar {
> > +    fn drop(self: Pin<&mut Self>) {
> > +        // Clear anything registered using `register_wait`.
> > +        //
> > +        // SAFETY: The pointer points at a valid `wait_queue_head`.
> > +        unsafe { bindings::__wake_up_pollfree(self.inner.wait_queue_head.get()) };
>
> Should this use `wake_up_pollfree` (without the leading __)?

For both cases, that would require a Rust helper. But I suppose we could
do it.

Alice
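
For context on the two questions above: as far as I understand the C headers at the time of this thread, `poll_wait` and `wake_up_pollfree` are `static inline` wrappers that add a guard before calling `_qproc` and `__wake_up_pollfree` respectively, which is why they would need a helper to be callable from Rust. The userspace model below only sketches those guards. `WaitQueueModel`, `PollTableModel`, `enqueue` and the `*_model` functions are hypothetical names, not kernel APIs, and the waiter count stands in for `waitqueue_active`.

```rust
/// Userspace stand-in for `wait_queue_head`; it only counts queued waiters.
#[derive(Default)]
struct WaitQueueModel {
    waiters: usize,
}

/// Stand-in for a `_qproc` callback: enqueue one waiter on the queue.
type QProc = fn(&mut WaitQueueModel);

/// Userspace stand-in for `poll_table`; `qproc` may be absent, as in C.
struct PollTableModel {
    qproc: Option<QProc>,
}

/// Example callback that enqueues a single waiter.
fn enqueue(wq: &mut WaitQueueModel) {
    wq.waiters += 1;
}

/// Models the guard in C `poll_wait`: the callback runs only if the table,
/// its `_qproc` and the wait queue are all present. Returns whether it ran.
fn poll_wait_model(
    wait_address: Option<&mut WaitQueueModel>,
    table: Option<&PollTableModel>,
) -> bool {
    match (table.and_then(|t| t.qproc), wait_address) {
        (Some(qproc), Some(wq)) => {
            qproc(wq);
            true
        }
        _ => false,
    }
}

/// Models the guard in C `wake_up_pollfree`: the wake-up is skipped when the
/// queue has no waiters. Returns whether the wake-up ran.
fn wake_up_pollfree_model(wq: &mut WaitQueueModel) -> bool {
    if wq.waiters == 0 {
        return false;
    }
    // Stand-in for `__wake_up_pollfree`: every waiter is removed.
    wq.waiters = 0;
    true
}
```

In this model, the patch as posted corresponds to performing only the `_qproc` check itself (the `if let Some(qproc)` in `register_wait`) and to always running the wake-up in the destructor.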