[v2] Rust support for `struct iov_iter`

[PATCH v2 1/4] rust: iov: add iov_iter abstractions for ITER_SOURCE

Posted by Alice Ryhl 3 months ago

This adds abstractions for the iov_iter type in the case where
data_source is ITER_SOURCE. This will make Rust implementations of
fops->write_iter possible.

This series only has support for using existing IO vectors created by C
code. Additional abstractions will be needed to support the creation of
IO vectors in Rust code.

These abstractions make the assumption that `struct iov_iter` does not
have internal self-references, which implies that it is valid to move it
between different local variables.

Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 rust/kernel/iov.rs | 152 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 rust/kernel/lib.rs |   1 +
 2 files changed, 153 insertions(+)

diff --git a/rust/kernel/iov.rs b/rust/kernel/iov.rs
new file mode 100644
index 0000000000000000000000000000000000000000..b4d7ec14c57a561a01cd65b6bdf0f94b1b373b84
--- /dev/null
+++ b/rust/kernel/iov.rs
@@ -0,0 +1,152 @@
+// SPDX-License-Identifier: GPL-2.0
+
+// Copyright (C) 2025 Google LLC.
+
+//! IO vectors.
+//!
+//! C headers: [`include/linux/iov_iter.h`](srctree/include/linux/iov_iter.h),
+//! [`include/linux/uio.h`](srctree/include/linux/uio.h)
+
+use crate::{
+    alloc::{Allocator, Flags},
+    bindings,
+    prelude::*,
+    types::Opaque,
+};
+use core::{marker::PhantomData, mem::MaybeUninit, slice};
+
+const ITER_SOURCE: bool = bindings::ITER_SOURCE != 0;
+
+/// An IO vector that acts as a source of data.
+///
+/// The data may come from many different sources. This includes both things in kernel-space and
+/// reading from userspace. It's not necessarily the case that the data source is immutable, so
+/// rewinding the IO vector to read the same data twice is not guaranteed to result in the same
+/// bytes. It's also possible that the data source is mapped in a thread-local manner using e.g.
+/// `kmap_local_page()`, so this type is not `Send` to ensure that the mapping is read from the
+/// right context in that scenario.
+///
+/// # Invariants
+///
+/// Must hold a valid `struct iov_iter` with `data_source` set to `ITER_SOURCE`. For the duration
+/// of `'data`, it must be safe to read the data in this IO vector.
+#[repr(transparent)]
+pub struct IovIterSource<'data> {
+    iov: Opaque<bindings::iov_iter>,
+    /// Represent to the type system that this value contains a pointer to readable data it does
+    /// not own.
+    _source: PhantomData<&'data [u8]>,
+}
+
+impl<'data> IovIterSource<'data> {
+    /// Obtain an `IovIterSource` from a raw pointer.
+    ///
+    /// # Safety
+    ///
+    /// * For the duration of `'iov`, the `struct iov_iter` must remain valid and must not be
+    ///   accessed except through the returned reference.
+    /// * For the duration of `'data`, the buffers backing this IO vector must be valid for
+    ///   reading.
+    #[track_caller]
+    #[inline]
+    pub unsafe fn from_raw<'iov>(ptr: *mut bindings::iov_iter) -> &'iov mut IovIterSource<'data> {
+        // SAFETY: The caller ensures that `ptr` is valid.
+        let data_source = unsafe { (*ptr).data_source };
+        assert_eq!(data_source, ITER_SOURCE);
+
+        // SAFETY: The caller ensures the struct invariants for the right durations.
+        unsafe { &mut *ptr.cast::<IovIterSource<'data>>() }
+    }
+
+    /// Access this as a raw `struct iov_iter`.
+    #[inline]
+    pub fn as_raw(&mut self) -> *mut bindings::iov_iter {
+        self.iov.get()
+    }
+
+    /// Returns the number of bytes available in this IO vector.
+    ///
+    /// Note that this may overestimate the number of bytes. For example, reading from userspace
+    /// memory could fail with `EFAULT`, which will be treated as the end of the IO vector.
+    #[inline]
+    pub fn len(&self) -> usize {
+        // SAFETY: It is safe to access the `count` field.
+        unsafe {
+            (*self.iov.get())
+                .__bindgen_anon_1
+                .__bindgen_anon_1
+                .as_ref()
+                .count
+        }
+    }
+
+    /// Returns whether there are any bytes left in this IO vector.
+    ///
+    /// This may return `true` even if there are no more bytes available. For example, reading from
+    /// userspace memory could fail with `EFAULT`, which will be treated as the end of the IO vector.
+    #[inline]
+    pub fn is_empty(&self) -> bool {
+        self.len() == 0
+    }
+
+    /// Advance this IO vector by `bytes` bytes.
+    ///
+    /// If `bytes` is larger than the size of this IO vector, it is advanced to the end.
+    #[inline]
+    pub fn advance(&mut self, bytes: usize) {
+        // SAFETY: `self.iov` is a valid IO vector.
+        unsafe { bindings::iov_iter_advance(self.as_raw(), bytes) };
+    }
+
+    /// Advance this IO vector backwards by `bytes` bytes.
+    ///
+    /// # Safety
+    ///
+    /// The IO vector must not be reverted to before its beginning.
+    #[inline]
+    pub unsafe fn revert(&mut self, bytes: usize) {
+        // SAFETY: `self.iov` is a valid IO vector, and `bytes` is in bounds.
+        unsafe { bindings::iov_iter_revert(self.as_raw(), bytes) };
+    }
+
+    /// Read data from this IO vector.
+    ///
+    /// Returns the number of bytes that have been copied.
+    #[inline]
+    pub fn copy_from_iter(&mut self, out: &mut [u8]) -> usize {
+        // SAFETY: We will not write uninitialized bytes to `out`.
+        let out = unsafe { &mut *(out as *mut [u8] as *mut [MaybeUninit<u8>]) };
+
+        self.copy_from_iter_raw(out).len()
+    }
+
+    /// Read data from this IO vector and append it to a vector.
+    ///
+    /// Returns the number of bytes that have been copied.
+    #[inline]
+    pub fn copy_from_iter_vec<A: Allocator>(
+        &mut self,
+        out: &mut Vec<u8, A>,
+        flags: Flags,
+    ) -> Result<usize> {
+        out.reserve(self.len(), flags)?;
+        let len = self.copy_from_iter_raw(out.spare_capacity_mut()).len();
+        // SAFETY: The next `len` bytes of the vector have been initialized.
+        unsafe { out.inc_len(len) };
+        Ok(len)
+    }
+
+    /// Read data from this IO vector into potentially uninitialized memory.
+    ///
+    /// Returns the sub-slice of the output that has been initialized. If the returned slice is
+    /// shorter than the input buffer, then the entire IO vector has been read.
+    #[inline]
+    pub fn copy_from_iter_raw(&mut self, out: &mut [MaybeUninit<u8>]) -> &mut [u8] {
+        // SAFETY: `out` is valid for `out.len()` bytes.
+        let len =
+            unsafe { bindings::_copy_from_iter(out.as_mut_ptr().cast(), out.len(), self.as_raw()) };
+
+        // SAFETY: We just initialized the first `len` bytes of `out`.
+        unsafe { slice::from_raw_parts_mut(out.as_mut_ptr().cast(), len) }
+    }
+}
diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index 6b4774b2b1c37f4da1866e993be6230bc6715841..278b6fdee62156f4ed997c13fa10bd2fb0fa3ad6 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -81,6 +81,7 @@
 pub mod init;
 pub mod io;
 pub mod ioctl;
+pub mod iov;
 pub mod jump_label;
 #[cfg(CONFIG_KUNIT)]
 pub mod kunit;

-- 
2.50.0.727.gbf7dc18ff4-goog

Re: [PATCH v2 1/4] rust: iov: add iov_iter abstractions for ITER_SOURCE

Posted by Andreas Hindborg 3 months ago

"Alice Ryhl" <aliceryhl@google.com> writes:

> This adds abstractions for the iov_iter type in the case where
> data_source is ITER_SOURCE. This will make Rust implementations of
> fops->write_iter possible.
>
> This series only has support for using existing IO vectors created by C
> code. Additional abstractions will be needed to support the creation of
> IO vectors in Rust code.
>
> These abstractions make the assumption that `struct iov_iter` does not
> have internal self-references, which implies that it is valid to move it
> between different local variables.
>
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> ---
>  rust/kernel/iov.rs | 152 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>  rust/kernel/lib.rs |   1 +
>  2 files changed, 153 insertions(+)
>
> diff --git a/rust/kernel/iov.rs b/rust/kernel/iov.rs
> new file mode 100644
> index 0000000000000000000000000000000000000000..b4d7ec14c57a561a01cd65b6bdf0f94b1b373b84
> --- /dev/null
> +++ b/rust/kernel/iov.rs
> @@ -0,0 +1,152 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +// Copyright (C) 2025 Google LLC.
> +
> +//! IO vectors.
> +//!
> +//! C headers: [`include/linux/iov_iter.h`](srctree/include/linux/iov_iter.h),
> +//! [`include/linux/uio.h`](srctree/include/linux/uio.h)
> +
> +use crate::{
> +    alloc::{Allocator, Flags},
> +    bindings,
> +    prelude::*,
> +    types::Opaque,
> +};
> +use core::{marker::PhantomData, mem::MaybeUninit, slice};
> +
> +const ITER_SOURCE: bool = bindings::ITER_SOURCE != 0;
> +
> +/// An IO vector that acts as a source of data.
> +///
> +/// The data may come from many different sources. This includes both things in kernel-space and
> +/// reading from userspace. It's not necessarily the case that the data source is immutable, so
> +/// rewinding the IO vector to read the same data twice is not guaranteed to result in the same
> +/// bytes. It's also possible that the data source is mapped in a thread-local manner using e.g.
> +/// `kmap_local_page()`, so this type is not `Send` to ensure that the mapping is read from the
> +/// right context in that scenario.
> +///
> +/// # Invariants
> +///
> +/// Must hold a valid `struct iov_iter` with `data_source` set to `ITER_SOURCE`. For the duration
> +/// of `'data`, it must be safe to read the data in this IO vector.

In my opinion, the phrasing you had in v1 was better:

  The buffers referenced by the IO vector must be valid for reading for
  the duration of `'data`.

That is, I would prefer "must be valid for reading" over "it must be
safe to read ...".

> +#[repr(transparent)]
> +pub struct IovIterSource<'data> {
> +    iov: Opaque<bindings::iov_iter>,
> +    /// Represent to the type system that this value contains a pointer to readable data it does
> +    /// not own.
> +    _source: PhantomData<&'data [u8]>,
> +}
> +
> +impl<'data> IovIterSource<'data> {
> +    /// Obtain an `IovIterSource` from a raw pointer.
> +    ///
> +    /// # Safety
> +    ///
> +    /// * For the duration of `'iov`, the `struct iov_iter` must remain valid and must not be
> +    ///   accessed except through the returned reference.
> +    /// * For the duration of `'data`, the buffers backing this IO vector must be valid for
> +    ///   reading.
> +    #[track_caller]
> +    #[inline]
> +    pub unsafe fn from_raw<'iov>(ptr: *mut bindings::iov_iter) -> &'iov mut IovIterSource<'data> {
> +        // SAFETY: The caller ensures that `ptr` is valid.
> +        let data_source = unsafe { (*ptr).data_source };
> +        assert_eq!(data_source, ITER_SOURCE);
> +
> +        // SAFETY: The caller ensures the struct invariants for the right durations.
> +        unsafe { &mut *ptr.cast::<IovIterSource<'data>>() }
> +    }
> +
> +    /// Access this as a raw `struct iov_iter`.
> +    #[inline]
> +    pub fn as_raw(&mut self) -> *mut bindings::iov_iter {
> +        self.iov.get()
> +    }
> +
> +    /// Returns the number of bytes available in this IO vector.
> +    ///
> +    /// Note that this may overestimate the number of bytes. For example, reading from userspace
> +    /// memory could fail with `EFAULT`, which will be treated as the end of the IO vector.
> +    #[inline]
> +    pub fn len(&self) -> usize {
> +        // SAFETY: It is safe to access the `count` field.

Reiterating my comment from v1: Why?

> +        unsafe {
> +            (*self.iov.get())
> +                .__bindgen_anon_1
> +                .__bindgen_anon_1
> +                .as_ref()
> +                .count
> +        }
> +    }
> +
> +    /// Returns whether there are any bytes left in this IO vector.
> +    ///
> +    /// This may return `true` even if there are no more bytes available. For example, reading from
> +    /// userspace memory could fail with `EFAULT`, which will be treated as the end of the IO vector.
> +    #[inline]
> +    pub fn is_empty(&self) -> bool {
> +        self.len() == 0
> +    }
> +
> +    /// Advance this IO vector by `bytes` bytes.
> +    ///
> +    /// If `bytes` is larger than the size of this IO vector, it is advanced to the end.
> +    #[inline]
> +    pub fn advance(&mut self, bytes: usize) {
> +        // SAFETY: `self.iov` is a valid IO vector.
> +        unsafe { bindings::iov_iter_advance(self.as_raw(), bytes) };
> +    }
> +
> +    /// Advance this IO vector backwards by `bytes` bytes.
> +    ///
> +    /// # Safety
> +    ///
> +    /// The IO vector must not be reverted to before its beginning.
> +    #[inline]
> +    pub unsafe fn revert(&mut self, bytes: usize) {
> +        // SAFETY: `self.iov` is a valid IO vector, and `bytes` is in bounds.
> +        unsafe { bindings::iov_iter_revert(self.as_raw(), bytes) };
> +    }
> +
> +    /// Read data from this IO vector.
> +    ///
> +    /// Returns the number of bytes that have been copied.
> +    #[inline]
> +    pub fn copy_from_iter(&mut self, out: &mut [u8]) -> usize {
> +        // SAFETY: We will not write uninitialized bytes to `out`.

Can you provide something to back this claim?


Best regards,
Andreas Hindborg

Re: [PATCH v2 1/4] rust: iov: add iov_iter abstractions for ITER_SOURCE

Posted by Alice Ryhl 3 months ago

On Tue, Jul 08, 2025 at 04:45:14PM +0200, Andreas Hindborg wrote:
> "Alice Ryhl" <aliceryhl@google.com> writes:
> > +/// # Invariants
> > +///
> > +/// Must hold a valid `struct iov_iter` with `data_source` set to `ITER_SOURCE`. For the duration
> > +/// of `'data`, it must be safe to read the data in this IO vector.
> 
> In my opinion, the phrasing you had in v1 was better:
> 
>   The buffers referenced by the IO vector must be valid for reading for
>   the duration of `'data`.
> 
> That is, I would prefer "must be valid for reading" over "it must be
> safe to read ...".

If it's backed by userspace data, then technically there aren't any
buffers that are valid for reading in the usual sense. We need to call
into special assembly to read it, and a normal pointer dereference would
be illegal.

> > +    /// Returns the number of bytes available in this IO vector.
> > +    ///
> > +    /// Note that this may overestimate the number of bytes. For example, reading from userspace
> > +    /// memory could fail with `EFAULT`, which will be treated as the end of the IO vector.
> > +    #[inline]
> > +    pub fn len(&self) -> usize {
> > +        // SAFETY: It is safe to access the `count` field.
> 
> Reiterating my comment from v1: Why?

It's the same reason as why this is safe:

struct HasLength {
    length: usize,
}
impl HasLength {
    fn len(&self) -> usize {
        // why is this safe?
        self.length
    }
}

I'm not sure how to say it concisely. I guess it's because all access to
the iov_iter goes through the &IovIterSource.
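
A minimal userspace sketch of that argument, assuming a simplified model: `RawIter` and `Wrapper` are hypothetical stand-ins for `bindings::iov_iter` and `IovIterSource`, and `UnsafeCell` plays the role that `Opaque` plays in the patch. It only shows why a read through `&self` can be justified when every mutation requires `&mut self`.

```rust
use std::cell::UnsafeCell;

/// Hypothetical stand-in for `bindings::iov_iter`; only `count` is modeled.
#[repr(C)]
struct RawIter {
    count: usize,
}

/// Hypothetical stand-in for the wrapper type. The raw struct sits behind an
/// `UnsafeCell`, so reading a field means dereferencing a raw pointer.
#[repr(transparent)]
struct Wrapper {
    raw: UnsafeCell<RawIter>,
}

impl Wrapper {
    fn new(count: usize) -> Self {
        Wrapper {
            raw: UnsafeCell::new(RawIter { count }),
        }
    }

    fn len(&self) -> usize {
        // SAFETY: `raw` is only mutated by methods taking `&mut self`, and
        // `Wrapper` is not `Sync`, so nothing can write to `count` while this
        // shared reference exists.
        unsafe { (*self.raw.get()).count }
    }

    fn advance(&mut self, bytes: usize) {
        // `get_mut` needs no `unsafe`: `&mut self` proves exclusive access.
        let raw = self.raw.get_mut();
        let step = bytes.min(raw.count);
        raw.count -= step;
    }
}
```

In this model, `Wrapper::new(10)` reports a length of 10, then 6 after `advance(4)`, and advancing past the end clamps the length to 0.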

> > +        unsafe {
> > +            (*self.iov.get())
> > +                .__bindgen_anon_1
> > +                .__bindgen_anon_1
> > +                .as_ref()
> > +                .count
> > +        }
> > +    }
> > +
> > +    /// Returns whether there are any bytes left in this IO vector.
> > +    ///
> > +    /// This may return `true` even if there are no more bytes available. For example, reading from
> > +    /// userspace memory could fail with `EFAULT`, which will be treated as the end of the IO vector.
> > +    #[inline]
> > +    pub fn is_empty(&self) -> bool {
> > +        self.len() == 0
> > +    }
> > +
> > +    /// Advance this IO vector by `bytes` bytes.
> > +    ///
> > +    /// If `bytes` is larger than the size of this IO vector, it is advanced to the end.
> > +    #[inline]
> > +    pub fn advance(&mut self, bytes: usize) {
> > +        // SAFETY: `self.iov` is a valid IO vector.
> > +        unsafe { bindings::iov_iter_advance(self.as_raw(), bytes) };
> > +    }
> > +
> > +    /// Advance this IO vector backwards by `bytes` bytes.
> > +    ///
> > +    /// # Safety
> > +    ///
> > +    /// The IO vector must not be reverted to before its beginning.
> > +    #[inline]
> > +    pub unsafe fn revert(&mut self, bytes: usize) {
> > +        // SAFETY: `self.iov` is a valid IO vector, and `bytes` is in bounds.
> > +        unsafe { bindings::iov_iter_revert(self.as_raw(), bytes) };
> > +    }
> > +
> > +    /// Read data from this IO vector.
> > +    ///
> > +    /// Returns the number of bytes that have been copied.
> > +    #[inline]
> > +    pub fn copy_from_iter(&mut self, out: &mut [u8]) -> usize {
> > +        // SAFETY: We will not write uninitialized bytes to `out`.
> 
> Can you provide something to back this claim?

I guess the logic could go along these lines:

* If the iov_iter reads from userspace, then it's because we always
  consider such reads to produce initialized data.
* If the iov_iter reads from a kernel buffer, then the creator of the
  iov_iter must provide an initialized buffer.

Ultimately, if we don't know that the bytes are initialized, then it's
impossible to use the API correctly because you can never inspect the
bytes in any way. I.e., any implementation of copy_from_iter that
produces uninit data is necessarily buggy.
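
As a rough userspace illustration of the contract being relied on, assuming the source bytes are themselves initialized: `copy_into_uninit` below is a hypothetical stand-in for `_copy_from_iter`, not the kernel function. It writes into possibly uninitialized memory and hands back only the prefix it actually initialized.

```rust
use std::mem::MaybeUninit;

/// Hypothetical stand-in for `_copy_from_iter`: copies as many bytes as fit
/// from `src` into `out` and returns the initialized prefix of `out`.
fn copy_into_uninit<'a>(src: &[u8], out: &'a mut [MaybeUninit<u8>]) -> &'a mut [u8] {
    let len = src.len().min(out.len());
    for (dst, byte) in out[..len].iter_mut().zip(src) {
        // Every byte written comes from `src`, which is initialized.
        dst.write(*byte);
    }
    // SAFETY: The loop above initialized the first `len` bytes of `out`, and
    // `MaybeUninit<u8>` has the same layout as `u8`.
    unsafe { std::slice::from_raw_parts_mut(out.as_mut_ptr().cast::<u8>(), len) }
}
```

Under that assumption the caller never observes uninitialized bytes: with an 8-byte buffer and the 5-byte source `b"hello"`, the returned slice has length 5, and a slice shorter than the buffer signals that the source was exhausted.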

Alice

Re: [PATCH v2 1/4] rust: iov: add iov_iter abstractions for ITER_SOURCE

Posted by Andreas Hindborg 3 months ago

"Alice Ryhl" <aliceryhl@google.com> writes:

> On Tue, Jul 08, 2025 at 04:45:14PM +0200, Andreas Hindborg wrote:
>> "Alice Ryhl" <aliceryhl@google.com> writes:
>> > +/// # Invariants
>> > +///
>> > +/// Must hold a valid `struct iov_iter` with `data_source` set to `ITER_SOURCE`. For the duration
>> > +/// of `'data`, it must be safe to read the data in this IO vector.
>>
>> In my opinion, the phrasing you had in v1 was better:
>>
>>   The buffers referenced by the IO vector must be valid for reading for
>>   the duration of `'data`.
>>
>> That is, I would prefer "must be valid for reading" over "it must be
>> safe to read ...".
>
> If it's backed by userspace data, then technically there aren't any
> buffers that are valid for reading in the usual sense. We need to call
> into special assembly to read it, and a normal pointer dereference would
> be illegal.

If you go with "safe to read" for this reason, I think you should expand
the statement along the lines you used here.

What is the special assembly that is used to read this data? From a
quick scan it looks like if `CONFIG_UACCESS_MEMCPY` is enabled, a
regular `memcpy` call is used.

>
>> > +    /// Returns the number of bytes available in this IO vector.
>> > +    ///
>> > +    /// Note that this may overestimate the number of bytes. For example, reading from userspace
>> > +    /// memory could fail with `EFAULT`, which will be treated as the end of the IO vector.
>> > +    #[inline]
>> > +    pub fn len(&self) -> usize {
>> > +        // SAFETY: It is safe to access the `count` field.
>>
>> Reiterating my comment from v1: Why?
>
> It's the same reason as why this is safe:
>
> struct HasLength {
>     length: usize,
> }
> impl HasLength {
>     fn len(&self) -> usize {
>         // why is this safe?
>         self.length
>     }
> }
>
> I'm not sure how to say it concisely. I guess it's because all access to
> the iov_iter goes through the &IovIterSource.

So "By existence of a shared reference to `self`, `count` is valid for read."?

>
>> > +        unsafe {
>> > +            (*self.iov.get())
>> > +                .__bindgen_anon_1
>> > +                .__bindgen_anon_1
>> > +                .as_ref()
>> > +                .count
>> > +        }
>> > +    }
>> > +
>> > +    /// Returns whether there are any bytes left in this IO vector.
>> > +    ///
>> > +    /// This may return `true` even if there are no more bytes available. For example, reading from
>> > +    /// userspace memory could fail with `EFAULT`, which will be treated as the end of the IO vector.
>> > +    #[inline]
>> > +    pub fn is_empty(&self) -> bool {
>> > +        self.len() == 0
>> > +    }
>> > +
>> > +    /// Advance this IO vector by `bytes` bytes.
>> > +    ///
>> > +    /// If `bytes` is larger than the size of this IO vector, it is advanced to the end.
>> > +    #[inline]
>> > +    pub fn advance(&mut self, bytes: usize) {
>> > +        // SAFETY: `self.iov` is a valid IO vector.
>> > +        unsafe { bindings::iov_iter_advance(self.as_raw(), bytes) };
>> > +    }
>> > +
>> > +    /// Advance this IO vector backwards by `bytes` bytes.
>> > +    ///
>> > +    /// # Safety
>> > +    ///
>> > +    /// The IO vector must not be reverted to before its beginning.
>> > +    #[inline]
>> > +    pub unsafe fn revert(&mut self, bytes: usize) {
>> > +        // SAFETY: `self.iov` is a valid IO vector, and `bytes` is in bounds.
>> > +        unsafe { bindings::iov_iter_revert(self.as_raw(), bytes) };
>> > +    }
>> > +
>> > +    /// Read data from this IO vector.
>> > +    ///
>> > +    /// Returns the number of bytes that have been copied.
>> > +    #[inline]
>> > +    pub fn copy_from_iter(&mut self, out: &mut [u8]) -> usize {
>> > +        // SAFETY: We will not write uninitialized bytes to `out`.
>>
>> Can you provide something to back this claim?
>
> I guess the logic could go along these lines:
>
> * If the iov_iter reads from userspace, then it's because we always
>   consider such reads to produce initialized data.

I don't think it is enough to just state that we consider the reads to
produce initialized data.

> * If the iov_iter reads from a kernel buffer, then the creator of the
>   iov_iter must provide an initialized buffer.
>
> Ultimately, if we don't know that the bytes are initialized, then it's
> impossible to use the API correctly because you can never inspect the
> bytes in any way. I.e., any implementation of copy_from_iter that
> produces uninit data is necessarily buggy.

I would agree. How do we fix that? You are more knowledgeable than me in
this field, so you probably have a better shot than me at finding a
solution.

As far as I can tell, we need to read from a place unknown to the Rust
abstract machine, and we need to be able to have the abstract machine
consider the data initialized after the read.

Is this volatile memcpy [1], or would that only solve the data race
problem, not the uninitialized data problem?


Best regards,
Andreas Hindborg

[1] https://lore.kernel.org/all/25e7e425-ae72-4370-ae95-958882a07df9@ralfj.de

Re: [PATCH v2 1/4] rust: iov: add iov_iter abstractions for ITER_SOURCE

Posted by Alice Ryhl 3 months ago

On Wed, Jul 09, 2025 at 01:56:37PM +0200, Andreas Hindborg wrote:
> "Alice Ryhl" <aliceryhl@google.com> writes:
> 
> > On Tue, Jul 08, 2025 at 04:45:14PM +0200, Andreas Hindborg wrote:
> >> "Alice Ryhl" <aliceryhl@google.com> writes:
> >> > +/// # Invariants
> >> > +///
> >> > +/// Must hold a valid `struct iov_iter` with `data_source` set to `ITER_SOURCE`. For the duration
> >> > +/// of `'data`, it must be safe to read the data in this IO vector.
> >>
> >> In my opinion, the phrasing you had in v1 was better:
> >>
> >>   The buffers referenced by the IO vector must be valid for reading for
> >>   the duration of `'data`.
> >>
> >> That is, I would prefer "must be valid for reading" over "it must be
> >> safe to read ...".
> >
> > If it's backed by userspace data, then technically there aren't any
> > buffers that are valid for reading in the usual sense. We need to call
> > into special assembly to read it, and a normal pointer dereference would
> > be illegal.
> 
> If you go with "safe to read" for this reason, I think you should expand
> the statement along the lines you used here.
> 
> What is the special assembly that is used to read this data? From a
> quick scan it looks like if `CONFIG_UACCESS_MEMCPY` is enabled, a
> regular `memcpy` call is used.

When reading from userspace, you're given an arbitrary untrusted address
that could point anywhere. The memory could be swapped out and need to
be loaded back from disk. The memory could correspond to an mmap region
for a file on a NFS mount and reading it could involve a network call.
The address could be dangling, which must be properly handled and turned
into an EFAULT error instead of UB. Every architecture has its own asm
for handling all of this safely so that behavior is safe no matter what
pointer we are given from userspace.

As for CONFIG_UACCESS_MEMCPY, I don't think it is used on any real
system today. It would require you to be on a NOMMU system where the
userspace and the kernel are in the same address space.
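
To make the consequence for `len()` concrete, here is a small userspace model. `Segment` and `ModelIter` are hypothetical names, and the real fault handling lives in per-architecture code rather than in anything resembling this. The sketch only shows how a fault becomes a short copy, which is why the advertised length can overestimate what is readable.

```rust
/// Hypothetical model of one segment of an IO vector: either readable bytes
/// or a range that faults when read (standing in for a dangling user address).
enum Segment<'a> {
    Readable(&'a [u8]),
    Faulting(usize),
}

/// Hypothetical model of an IO vector as a list of segments.
struct ModelIter<'a> {
    segments: Vec<Segment<'a>>,
}

impl<'a> ModelIter<'a> {
    /// Sum of the declared segment lengths. Like `count`, this cannot know in
    /// advance that a later read will fault.
    fn len(&self) -> usize {
        self.segments
            .iter()
            .map(|seg| match seg {
                Segment::Readable(bytes) => bytes.len(),
                Segment::Faulting(n) => *n,
            })
            .sum()
    }

    /// Copies bytes until the segments run out or one of them faults.
    fn read_all(&self) -> Vec<u8> {
        let mut out = Vec::new();
        for seg in &self.segments {
            match seg {
                Segment::Readable(bytes) => out.extend_from_slice(bytes),
                // A fault ends the copy early instead of causing UB.
                Segment::Faulting(_) => break,
            }
        }
        out
    }
}
```

With segments `Readable(b"abcd")`, `Faulting(4)`, `Readable(b"zz")`, the model reports a length of 10 but `read_all` yields only the first 4 bytes.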

> >> > +    /// Returns the number of bytes available in this IO vector.
> >> > +    ///
> >> > +    /// Note that this may overestimate the number of bytes. For example, reading from userspace
> >> > +    /// memory could fail with `EFAULT`, which will be treated as the end of the IO vector.
> >> > +    #[inline]
> >> > +    pub fn len(&self) -> usize {
> >> > +        // SAFETY: It is safe to access the `count` field.
> >>
> >> Reiterating my comment from v1: Why?
> >
> > It's the same reason as why this is safe:
> >
> > struct HasLength {
> >     length: usize,
> > }
> > impl HasLength {
> >     fn len(&self) -> usize {
> >         // why is this safe?
> >         self.length
> >     }
> > }
> >
> > I'm not sure how to say it concisely. I guess it's because all access to
> > the iov_iter goes through the &IovIterSource.
> 
> So "By existence of a shared reference to `self`, `count` is valid for read."?
> 
> >
> >> > +        unsafe {
> >> > +            (*self.iov.get())
> >> > +                .__bindgen_anon_1
> >> > +                .__bindgen_anon_1
> >> > +                .as_ref()
> >> > +                .count
> >> > +        }
> >> > +    }
> >> > +
> >> > +    /// Returns whether there are any bytes left in this IO vector.
> >> > +    ///
> >> > +    /// This may return `true` even if there are no more bytes available. For example, reading from
> >> > +    /// userspace memory could fail with `EFAULT`, which will be treated as the end of the IO vector.
> >> > +    #[inline]
> >> > +    pub fn is_empty(&self) -> bool {
> >> > +        self.len() == 0
> >> > +    }
> >> > +
> >> > +    /// Advance this IO vector by `bytes` bytes.
> >> > +    ///
> >> > +    /// If `bytes` is larger than the size of this IO vector, it is advanced to the end.
> >> > +    #[inline]
> >> > +    pub fn advance(&mut self, bytes: usize) {
> >> > +        // SAFETY: `self.iov` is a valid IO vector.
> >> > +        unsafe { bindings::iov_iter_advance(self.as_raw(), bytes) };
> >> > +    }
> >> > +
> >> > +    /// Advance this IO vector backwards by `bytes` bytes.
> >> > +    ///
> >> > +    /// # Safety
> >> > +    ///
> >> > +    /// The IO vector must not be reverted to before its beginning.
> >> > +    #[inline]
> >> > +    pub unsafe fn revert(&mut self, bytes: usize) {
> >> > +        // SAFETY: `self.iov` is a valid IO vector, and `bytes` is in bounds.
> >> > +        unsafe { bindings::iov_iter_revert(self.as_raw(), bytes) };
> >> > +    }
> >> > +
> >> > +    /// Read data from this IO vector.
> >> > +    ///
> >> > +    /// Returns the number of bytes that have been copied.
> >> > +    #[inline]
> >> > +    pub fn copy_from_iter(&mut self, out: &mut [u8]) -> usize {
> >> > +        // SAFETY: We will not write uninitialized bytes to `out`.
> >>
> >> Can you provide something to back this claim?
> >
> > I guess the logic could go along these lines:
> >
> > * If the iov_iter reads from userspace, then it's because we always
> >   consider such reads to produce initialized data.
> 
> I don't think it is enough to just state that we consider the reads to
> produce initialized data.

See above re userspace.

> > * If the iov_iter reads from a kernel buffer, then the creator of the
> >   iov_iter must provide an initialized buffer.
> >
> > Ultimately, if we don't know that the bytes are initialized, then it's
> > impossible to use the API correctly because you can never inspect the
> > bytes in any way. I.e., any implementation of copy_from_iter that
> > produces uninit data is necessarily buggy.
> 
> I would agree. How do we fix that? You are more knowledgeable than me in
> this field, so you probably have a better shot than me at finding a
> solution.

I think there is nothing to fix. If there exists a callsite on the C
side that creates an iov_iter that reads from an uninitialized kernel
buffer, then we can fix that specific call-site. I don't think anything
else needs to be done.

> As far as I can tell, we need to read from a place unknown to the Rust
> abstract machine, and we need to be able to have the abstract machine
> consider the data initialized after the read.
> 
> Is this volatile memcpy [1], or would that only solve the data race
> problem, not the uninitialized data problem?
> 
> [1] https://lore.kernel.org/all/25e7e425-ae72-4370-ae95-958882a07df9@ralfj.de

Volatile memcpy deals with data races.

In general, we can argue all we want about wording of these safety
comments, but calling copy_from_iter is the right way to read from an
iov_iter. If there is a problem, the problem is specific call-sites that
construct an iov_iter with an uninit buffer. I don't know whether such
call-sites exist.

Alice

Re: [PATCH v2 1/4] rust: iov: add iov_iter abstractions for ITER_SOURCE

Posted by Andreas Hindborg 3 months ago

"Alice Ryhl" <aliceryhl@google.com> writes:

> On Wed, Jul 09, 2025 at 01:56:37PM +0200, Andreas Hindborg wrote:
>> "Alice Ryhl" <aliceryhl@google.com> writes:
>>
>> > On Tue, Jul 08, 2025 at 04:45:14PM +0200, Andreas Hindborg wrote:
>> >> "Alice Ryhl" <aliceryhl@google.com> writes:
>> >> > +/// # Invariants
>> >> > +///
>> >> > +/// Must hold a valid `struct iov_iter` with `data_source` set to `ITER_SOURCE`. For the duration
>> >> > +/// of `'data`, it must be safe to read the data in this IO vector.
>> >>
>> >> In my opinion, the phrasing you had in v1 was better:
>> >>
>> >>   The buffers referenced by the IO vector must be valid for reading for
>> >>   the duration of `'data`.
>> >>
>> >> That is, I would prefer "must be valid for reading" over "it must be
>> >> safe to read ...".
>> >
>> > If it's backed by userspace data, then technically there aren't any
>> > buffers that are valid for reading in the usual sense. We need to call
>> > into special assembly to read it, and a normal pointer dereference would
>> > be illegal.
>>
>> If you go with "safe to read" for this reason, I think you should expand
>> the statement along the lines you used here.
>>
>> What is the special assembly that is used to read this data? From a
>> quick scan it looks like if `CONFIG_UACCESS_MEMCPY` is enabled, a
>> regular `memcpy` call is used.
>
> When reading from userspace, you're given an arbitrary untrusted address
> that could point anywhere. The memory could be swapped out and need to
> be loaded back from disk. The memory could correspond to an mmap region
> for a file on a NFS mount and reading it could involve a network call.
> The address could be dangling, which must be properly handled and turned
> into an EFAULT error instead of UB. Every architecture has its own asm
> for handling all of this safely so that behavior is safe no matter what
> pointer we are given from userspace.

I don't think that is relevant. My point is, you can't reference
"special assembly" without detailing what that means.

You have a safety requirement in `from_raw`:

    /// * For the duration of `'data`, the buffers backing this IO vector must be valid for
    ///   reading.

This should probably be promoted to invariant for the type, since
`from_raw` is the only way to construct the type?

But are you saying that the referenced buffers need not be mapped and
readable while this type exists? The mapping happens as part of
`bindings::_copy_from_iter`?

> As for CONFIG_UACCESS_MEMCPY, I don't think it is used on any real
> system today. It would require you to be on a NOMMU system where the
> userspace and the kernel are in the same address space.

Ah. I was just browsing for the "special assembly", and that was all I
could find.

>
>> >> > +    /// Returns the number of bytes available in this IO vector.
>> >> > +    ///
>> >> > +    /// Note that this may overestimate the number of bytes. For example, reading from userspace
>> >> > +    /// memory could fail with `EFAULT`, which will be treated as the end of the IO vector.
>> >> > +    #[inline]
>> >> > +    pub fn len(&self) -> usize {
>> >> > +        // SAFETY: It is safe to access the `count` field.
>> >>
>> >> Reiterating my comment from v1: Why?
>> >
>> > It's the same reason as why this is safe:
>> >
>> > struct HasLength {
>> >     length: usize,
>> > }
>> > impl HasLength {
>> >     fn len(&self) -> usize {
>> >         // why is this safe?
>> >         self.length
>> >     }
>> > }
>> >
>> > I'm not sure how to say it concisely. I guess it's because all access to
>> > the iov_iter goes through the &IovIterSource.
>>
>> So "By existence of a shared reference to `self`, `count` is valid for read."?
>>
>> >
>> >> > +        unsafe {
>> >> > +            (*self.iov.get())
>> >> > +                .__bindgen_anon_1
>> >> > +                .__bindgen_anon_1
>> >> > +                .as_ref()
>> >> > +                .count
>> >> > +        }
>> >> > +    }
>> >> > +
>> >> > +    /// Returns whether there are any bytes left in this IO vector.
>> >> > +    ///
>> >> > +    /// This may return `true` even if there are no more bytes available. For example, reading from
>> >> > +    /// userspace memory could fail with `EFAULT`, which will be treated as the end of the IO vector.
>> >> > +    #[inline]
>> >> > +    pub fn is_empty(&self) -> bool {
>> >> > +        self.len() == 0
>> >> > +    }
>> >> > +
>> >> > +    /// Advance this IO vector by `bytes` bytes.
>> >> > +    ///
>> >> > +    /// If `bytes` is larger than the size of this IO vector, it is advanced to the end.
>> >> > +    #[inline]
>> >> > +    pub fn advance(&mut self, bytes: usize) {
>> >> > +        // SAFETY: `self.iov` is a valid IO vector.
>> >> > +        unsafe { bindings::iov_iter_advance(self.as_raw(), bytes) };
>> >> > +    }
>> >> > +
>> >> > +    /// Advance this IO vector backwards by `bytes` bytes.
>> >> > +    ///
>> >> > +    /// # Safety
>> >> > +    ///
>> >> > +    /// The IO vector must not be reverted to before its beginning.
>> >> > +    #[inline]
>> >> > +    pub unsafe fn revert(&mut self, bytes: usize) {
>> >> > +        // SAFETY: `self.iov` is a valid IO vector, and `bytes` is in bounds.
>> >> > +        unsafe { bindings::iov_iter_revert(self.as_raw(), bytes) };
>> >> > +    }
>> >> > +
>> >> > +    /// Read data from this IO vector.
>> >> > +    ///
>> >> > +    /// Returns the number of bytes that have been copied.
>> >> > +    #[inline]
>> >> > +    pub fn copy_from_iter(&mut self, out: &mut [u8]) -> usize {
>> >> > +        // SAFETY: We will not write uninitialized bytes to `out`.
>> >>
>> >> Can you provide something to back this claim?
>> >
>> > I guess the logic could go along these lines:
>> >
>> > * If the iov_iter reads from userspace, then it's because we always
>> >   consider such reads to produce initialized data.
>>
>> I don't think it is enough to just state that we consider the reads to
>> produce initialized data.
>
> See above re userspace.

You actually have the safety requirement I was looking for in
`from_raw`:


    /// * For the duration of `'data`, the buffers backing this IO vector must be valid for
    ///   reading.

But I am wondering whether this needs to align with the invariant, and
not the other way around?

>
>> > * If the iov_iter reads from a kernel buffer, then the creator of the
>> >   iov_iter must provide an initialized buffer.
>> >
>> > Ultimately, if we don't know that the bytes are initialized, then it's
>> > impossible to use the API correctly because you can never inspect the
>> > bytes in any way. I.e., any implementation of copy_from_iter that
>> > produces uninit data is necessarily buggy.
>>
>> I would agree. How do we fix that? You are more knowledgeable than me in
>> this field, so you probably have a better shot than I do at finding a
>> solution.
>
> I think there is nothing to fix. If there exists a callsite on the C
> side that creates an iov_iter that reads from an uninitialized kernel
> buffer, then we can fix that specific call-site. I don't think anything
> else needs to be done.

If soundness of this code hinges on specific call site behavior, this
should be a safety requirement.

>
>> As far as I can tell, we need to read from a place unknown to the Rust
>> abstract machine, and we need to be able to have the abstract machine
>> consider the data initialized after the read.
>>
>> Is this volatile memcpy [1], or would that only solve the data race
>> problem, not uninitialized data problem?
>>
>> [1] https://lore.kernel.org/all/25e7e425-ae72-4370-ae95-958882a07df9@ralfj.de
>
> Volatile memcpy deals with data races.
>
> In general, we can argue all we want about wording of these safety
> comments, but calling copy_from_iter is the right way to read from an
> iov_iter. If there is a problem, the problem is specific call-sites that
> construct an iov_iter with an uninit buffer. I don't know whether such
> call-sites exist.

I am not saying it is the wrong way. I am asking that we detail in the
safety requirements _why_ it is the right way.

You have a type invariant

  For the duration of `'data`, it must be safe to read the data in this IO vector.

that says "safe to read" instead of "valid for read" because "special
assembly" is used to read the data, and that somehow makes it OK. We
should be more specific.

How about making the invariant:

  For the duration of `'data`, it must be safe to read the data in this
  IO vector with the C API `_copy_from_iter`.

And then your safety comment regarding uninit bytes can be:

  We write `out` with `copy_from_iter_raw`, which transitively writes
  `out` using `_copy_from_iter`. By C API contract, `_copy_from_iter`
  does not write uninitialized bytes to `out`.

In this way we can defer to the implementation of `_copy_from_user`,
which is what I think you want?
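
To make the proposed wording concrete, here is a minimal userspace
sketch of how the safety comments could chain together. Everything in
it is an assumption for illustration: `ToyIter` is not the type from
the patch, and `toy_copy_from_iter` is a plain-Rust stand-in for the C
function `_copy_from_iter`, whose contract is only modelled here.

```rust
use std::mem::MaybeUninit;

/// Hypothetical stand-in for the C function `_copy_from_iter`.
///
/// Contract: writes only initialized bytes to `dst` and returns how many
/// bytes were written (at most `len`).
///
/// # Safety
///
/// `dst` must be valid for writes of `len` bytes.
unsafe fn toy_copy_from_iter(dst: *mut u8, len: usize, src: &mut &[u8]) -> usize {
    let available: &[u8] = *src;
    let n = len.min(available.len());
    // SAFETY: `dst` is valid for `n <= len` bytes by the caller's contract,
    // and `available` is a live slice of at least `n` initialized bytes.
    unsafe { std::ptr::copy_nonoverlapping(available.as_ptr(), dst, n) };
    *src = &available[n..];
    n
}

/// Toy source iterator.
///
/// # Invariants
///
/// For the duration of `'data`, it is safe to read the data in this
/// iterator with `toy_copy_from_iter`.
pub struct ToyIter<'data> {
    remaining: &'data [u8],
}

impl<'data> ToyIter<'data> {
    pub fn new(data: &'data [u8]) -> Self {
        Self { remaining: data }
    }

    /// Copies into a possibly uninitialized buffer and returns the
    /// initialized prefix.
    pub fn copy_from_iter_raw<'out>(
        &mut self,
        out: &'out mut [MaybeUninit<u8>],
    ) -> &'out mut [u8] {
        let len = out.len();
        let dst = out.as_mut_ptr().cast::<u8>();
        // SAFETY: `dst` points at `len` writable bytes, and by the type
        // invariant it is safe to read from this iterator.
        let n = unsafe { toy_copy_from_iter(dst, len, &mut self.remaining) };
        // SAFETY: By its contract, `toy_copy_from_iter` initialized the
        // first `n <= len` bytes of `out`.
        unsafe { std::slice::from_raw_parts_mut(dst, n) }
    }

    /// Copies into an initialized buffer and returns the byte count.
    pub fn copy_from_iter(&mut self, out: &mut [u8]) -> usize {
        // SAFETY: We write `out` with `copy_from_iter_raw`, which
        // transitively writes `out` using `toy_copy_from_iter`. By that
        // function's contract, it does not write uninitialized bytes.
        let out = unsafe { &mut *(out as *mut [u8] as *mut [MaybeUninit<u8>]) };
        self.copy_from_iter_raw(out).len()
    }
}
```

The point of the sketch is only that each `SAFETY` comment cites either
the type invariant or the contract of the function it calls, so no
comment has to explain how the bytes are actually fetched.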


Best regards,
Andreas Hindborg

Re: [PATCH v2 1/4] rust: iov: add iov_iter abstractions for ITER_SOURCE

Posted by Alice Ryhl 2 months, 3 weeks ago

On Wed, Jul 09, 2025 at 07:05:01PM +0200, Andreas Hindborg wrote:
> "Alice Ryhl" <aliceryhl@google.com> writes:
> 
> > On Wed, Jul 09, 2025 at 01:56:37PM +0200, Andreas Hindborg wrote:
> >> "Alice Ryhl" <aliceryhl@google.com> writes:
> >>
> >> > On Tue, Jul 08, 2025 at 04:45:14PM +0200, Andreas Hindborg wrote:
> >> >> "Alice Ryhl" <aliceryhl@google.com> writes:
> >> >> > +/// # Invariants
> >> >> > +///
> >> >> > +/// Must hold a valid `struct iov_iter` with `data_source` set to `ITER_SOURCE`. For the duration
> >> >> > +/// of `'data`, it must be safe to read the data in this IO vector.
> >> >>
> >> >> In my opinion, the phrasing you had in v1 was better:
> >> >>
> >> >>   The buffers referenced by the IO vector must be valid for reading for
> >> >>   the duration of `'data`.
> >> >>
> >> >> That is, I would prefer "must be valid for reading" over "it must be
> >> >> safe to read ...".
> >> >
> >> > If it's backed by userspace data, then technically there aren't any
> >> > buffers that are valid for reading in the usual sense. We need to call
> >> > into special assembly to read it, and a normal pointer dereference would
> >> > be illegal.
> >>
> >> If you go with "safe to read" for this reason, I think you should expand
> >> the statement along the lines you used here.
> >>
> >> What is the special assembly that is used to read this data? From a
> >> quick scan, it looks like a regular `memcpy` call is used if
> >> `CONFIG_UACCESS_MEMCPY` is enabled.
> >
> > When reading from userspace, you're given an arbitrary untrusted address
> > that could point anywhere. The memory could be swapped out and need to
> > be loaded back from disk. The memory could correspond to an mmap region
> > for a file on a NFS mount and reading it could involve a network call.
> > The address could be dangling, which must be properly handled and turned
> > into an EFAULT error instead of UB. Every architecture has its own asm
> > for handling all of this safely so that behavior is safe no matter what
> > pointer we are given from userspace.
> 
> I don't think that is relevant. My point is, you can't reference
> "special assembly" without detailing what that means.
> 
> You have a safety requirement in `from_raw`:
> 
>     /// * For the duration of `'data`, the buffers backing this IO vector must be valid for
>     ///   reading.
> 
> This should probably be promoted to invariant for the type, since
> `from_raw` is the only way to construct the type?

Sure, let's get the wording consistent, but that was the purpose of this
line in the invariants:

For the duration of `'data`, it must be safe to read the data in this IO vector.

> But are you saying that the referenced buffers need not be mapped and
> readable while this type exists? The mapping happens as part of
> `bindings::_copy_from_iter`?

Ultimately, it's an implementation detail.

In our "# Invariants" section, we tend to "expand" the underlying C
types and describe exactly what it means for that C type to be valid,
even if those details are implementation details that nobody outside
that C file should think about. Usually that's fine, but in this case,
I don't think it is feasible.

The iov_iter type is like a giant enum with a bunch of different
implementations. Some implementations just read from a simple kernel
buffer that must, of course, be mapped. Some implementations traverse
complex data structures and stitch the data together from multiple
buffers. Other implementations map the data into memory on-demand inside
the copy_from_iter call, without requiring it to be mapped at other
times. And finally, some implementations perform IO by reading from
userspace, in which case it's valid for the userspace pointer to be
*literally any 64-bit integer*. If the address is dangling, that's
caught inside the call to copy_from_iter and is not a safety issue.

I just want the type invariant to say that reading from it is valid, as
long as the read happens before a certain lifetime expires, without
elaborating on precisely what that means.
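
As a rough illustration of the "giant enum" picture, a toy userspace
model might look like the sketch below. None of these names exist in
the kernel: `ToySource`, its variants, and `ToyUserMemory` are all
hypothetical, and a "fault" is modelled as a lookup that returns `None`
rather than real uaccess handling.

```rust
/// Toy model of a userspace address space with one mapped region.
pub struct ToyUserMemory {
    pub base: u64,
    pub bytes: Vec<u8>,
}

impl ToyUserMemory {
    /// Returns the byte at `addr`, or `None` to model a fault (EFAULT).
    fn read(&self, addr: u64) -> Option<u8> {
        let offset = addr.checked_sub(self.base)?;
        if offset < self.bytes.len() as u64 {
            Some(self.bytes[offset as usize])
        } else {
            None
        }
    }
}

/// Hypothetical stand-in for the different backends of `struct iov_iter`.
pub enum ToySource<'data> {
    /// One kernel buffer that is always mapped.
    Kernel(&'data [u8]),
    /// Data stitched together from several buffers.
    Segments(Vec<&'data [u8]>),
    /// A userspace range; `addr` may be literally any integer.
    User { addr: u64, len: usize, memory: &'data ToyUserMemory },
}

impl<'data> ToySource<'data> {
    /// Upper bound on the remaining bytes; may overestimate for `User`.
    pub fn len(&self) -> usize {
        match self {
            ToySource::Kernel(buf) => buf.len(),
            ToySource::Segments(segs) => segs.iter().map(|s| s.len()).sum(),
            ToySource::User { len, .. } => *len,
        }
    }

    /// Copies up to `out.len()` bytes and returns the number copied.
    pub fn copy_from_iter(&mut self, out: &mut [u8]) -> usize {
        match self {
            ToySource::Kernel(buf) => {
                let src: &'data [u8] = *buf;
                let n = out.len().min(src.len());
                out[..n].copy_from_slice(&src[..n]);
                *buf = &src[n..];
                n
            }
            ToySource::Segments(segs) => {
                let mut copied = 0;
                while copied < out.len() && !segs.is_empty() {
                    let seg = segs[0];
                    let n = (out.len() - copied).min(seg.len());
                    out[copied..copied + n].copy_from_slice(&seg[..n]);
                    copied += n;
                    if n == seg.len() {
                        segs.remove(0);
                    } else {
                        segs[0] = &seg[n..];
                    }
                }
                copied
            }
            ToySource::User { addr, len, memory } => {
                let mut copied = 0;
                while copied < out.len() && *len > 0 {
                    match memory.read(*addr) {
                        Some(byte) => out[copied] = byte,
                        // A fault is treated as the end of the data.
                        None => {
                            *len = 0;
                            break;
                        }
                    }
                    copied += 1;
                    *addr = addr.wrapping_add(1);
                    *len -= 1;
                }
                copied
            }
        }
    }
}
```

In this model a dangling `addr` is never dereferenced directly; it only
shortens the count returned by `copy_from_iter`, which is the property
the invariant is meant to capture without naming each backend.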

> >> > * If the iov_iter reads from a kernel buffer, then the creator of the
> >> >   iov_iter must provide an initialized buffer.
> >> >
> >> > Ultimately, if we don't know that the bytes are initialized, then it's
> >> > impossible to use the API correctly because you can never inspect the
> >> > bytes in any way. I.e., any implementation of copy_from_iter that
> >> > produces uninit data is necessarily buggy.
> >>
> >> I would agree. How do we fix that? You are more knowledgeable than me in
> >> this field, so you probably have a better shot than I do at finding a
> >> solution.
> >
> > I think there is nothing to fix. If there exists a callsite on the C
> > side that creates an iov_iter that reads from an uninitialized kernel
> > buffer, then we can fix that specific call-site. I don't think anything
> > else needs to be done.
> 
> If soundness of this code hinges on specific call site behavior, this
> should be a safety requirement.

Yes, when we add Rust constructors for this type, they will need
appropriate soundness checks or safety requirements to verify that the
provided buffer is valid for the chosen iter_type.

For now, it is constructed in C and we usually don't have safety
comments in C code.
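
For illustration only, the difference between a constructor with a
safety requirement and one with a soundness check could be sketched as
below. This is a userspace toy, not a proposal for the real API:
`ToyKvecSource`, `from_raw_parts` and `from_slice` are hypothetical
names, and only a single kernel buffer is modelled.

```rust
use std::marker::PhantomData;

/// Toy stand-in for a source IO vector over one kernel buffer.
///
/// # Invariants
///
/// For the duration of `'data`, `ptr` points at `len` initialized bytes
/// that are valid for reading.
pub struct ToyKvecSource<'data> {
    ptr: *const u8,
    len: usize,
    _data: PhantomData<&'data [u8]>,
}

impl<'data> ToyKvecSource<'data> {
    /// Unsafe constructor: the caller carries the proof obligation.
    ///
    /// # Safety
    ///
    /// For the duration of `'data`, `ptr` must point at `len` initialized
    /// bytes that are valid for reading.
    pub unsafe fn from_raw_parts(ptr: *const u8, len: usize) -> Self {
        // INVARIANT: The safety requirements of this function are exactly
        // the type invariant.
        Self { ptr, len, _data: PhantomData }
    }

    /// Safe constructor: the borrow checker carries the proof obligation.
    pub fn from_slice(buf: &'data [u8]) -> Self {
        // SAFETY: A `&'data [u8]` refers to `buf.len()` initialized bytes
        // that stay valid for reading for all of `'data`.
        unsafe { Self::from_raw_parts(buf.as_ptr(), buf.len()) }
    }

    /// Number of bytes that have not been read yet.
    pub fn len(&self) -> usize {
        self.len
    }

    /// Copies up to `out.len()` bytes and returns the number copied.
    pub fn copy_from_iter(&mut self, out: &mut [u8]) -> usize {
        let n = out.len().min(self.len);
        // SAFETY: By the type invariant, `ptr` is valid for reading
        // `len >= n` initialized bytes. `out` is an exclusive buffer of at
        // least `n` bytes, so the two ranges cannot overlap.
        unsafe { std::ptr::copy_nonoverlapping(self.ptr, out.as_mut_ptr(), n) };
        // SAFETY: `n <= len`, so the new pointer is at most one past the
        // end of the buffer.
        self.ptr = unsafe { self.ptr.add(n) };
        self.len -= n;
        n
    }
}
```

Here `copy_from_iter` relies only on the type invariant, so the
question of whether the bytes are initialized is settled once, at the
constructor, rather than at every read.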

> >> As far as I can tell, we need to read from a place unknown to the Rust
> >> abstract machine, and we need to be able to have the abstract machine
> >> consider the data initialized after the read.
> >>
> >> Is this volatile memcpy [1], or would that only solve the data race
> >> problem, not uninitialized data problem?
> >>
> >> [1] https://lore.kernel.org/all/25e7e425-ae72-4370-ae95-958882a07df9@ralfj.de
> >
> > Volatile memcpy deals with data races.
> >
> > In general, we can argue all we want about wording of these safety
> > comments, but calling copy_from_iter is the right way to read from an
> > iov_iter. If there is a problem, the problem is specific call-sites that
> > construct an iov_iter with an uninit buffer. I don't know whether such
> > call-sites exist.
> 
> I am not saying it is the wrong way. I am asking that we detail in the
> safety requirements _why_ it is the right way.
> 
> You have a type invariant
> 
>   For the duration of `'data`, it must be safe to read the data in this IO vector.
> 
> that says "safe to read" instead of "valid for read" because "special
> assembly" is used to read the data, and that somehow makes it OK. We
> should be more specific.
> 
> How about making the invariant:
> 
>   For the duration of `'data`, it must be safe to read the data in this
>   IO vector with the C API `_copy_from_iter`.
> 
> And then your safety comment regarding uninit bytes can be:
> 
>   We write `out` with `copy_from_iter_raw`, which transitively writes
>   `out` using `_copy_from_iter`. By C API contract, `_copy_from_iter`
>   does not write uninitialized bytes to `out`.
> 
> In this way we can defer to the implementation of `_copy_from_user`,
> which is what I think you want?

Yes, this is pretty much what I want except that _copy_from_user isn't
the only C function you could call to read from an iov_iter.

Alice

Re: [PATCH v2 1/4] rust: iov: add iov_iter abstractions for ITER_SOURCE

Posted by Andreas Hindborg 2 months ago

"Alice Ryhl" <aliceryhl@google.com> writes:

> On Wed, Jul 09, 2025 at 07:05:01PM +0200, Andreas Hindborg wrote:
>> "Alice Ryhl" <aliceryhl@google.com> writes:
>>
>> > On Wed, Jul 09, 2025 at 01:56:37PM +0200, Andreas Hindborg wrote:
>> >> "Alice Ryhl" <aliceryhl@google.com> writes:
>> >>
>> >> > On Tue, Jul 08, 2025 at 04:45:14PM +0200, Andreas Hindborg wrote:
>> >> >> "Alice Ryhl" <aliceryhl@google.com> writes:
> The iov_iter type is like a giant enum with a bunch of different
> implementations. Some implementations just read from a simple kernel
> buffer that must, of course, be mapped. Some implementations traverse
> complex data structures and stitch the data together from multiple
> buffers. Other implementations map the data into memory on-demand inside
> the copy_from_iter call, without requiring it to be mapped at other
> times. And finally, some implementations perform IO by reading from
> userspace, in which case it's valid for the userspace pointer to be
> *literally any 64-bit integer*. If the address is dangling, that's
> caught inside the call to copy_from_iter and is not a safety issue.

At any rate, this is a very informative paragraph. It would be great if
you could have this information in the documentation for this type.


Best regards,
Andreas Hindborg

[PATCH v2 1/4] rust: iov: add iov_iter abstractions for ITER_SOURCE
[PATCH v2 2/4] rust: iov: add iov_iter abstractions for ITER_DEST
[PATCH v2 3/4] rust: miscdevice: Provide additional abstractions for iov_iter and kiocb structures
[PATCH v2 4/4] samples: rust_misc_device: Expand the sample to support read()ing from userspace