[PATCH v3] rust: page: add byte-wise atomic memory copy methods

Andreas Hindborg posted 1 patch 1 month, 2 weeks ago
[PATCH v3] rust: page: add byte-wise atomic memory copy methods
Posted by Andreas Hindborg 1 month, 2 weeks ago
When copying data from buffers that are mapped to user space, it is
impossible to guarantee absence of concurrent memory operations on those
buffers. Copying data to/from `Page` from/to these buffers would be
undefined behavior if no special considerations are made.

Add methods on `Page` to read and write the contents using byte-wise atomic
operations.

Also improve clarity by specifying additional requirements on
`read_raw`/`write_raw` methods regarding concurrent operations on involved
buffers.

Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
Changes in v3:
- Update documentation and safety requirements for `Page::{read,write}_bytewise_atomic`.
- Update safety comments in `Page::{read,write}_bytewise_atomic`.
- Call the correct copy function in `Page::{read,write}_bytewise_atomic`.
- Link to v2: https://msgid.link/20260212-page-volatile-io-v2-1-a36cb97d15c2@kernel.org

Changes in v2:
- Rewrite patch with byte-wise atomic operations as foundation of operation.
- Update subject and commit message.
- Link to v1: https://lore.kernel.org/r/20260130-page-volatile-io-v1-1-19f3d3e8f265@kernel.org
---
 rust/kernel/page.rs        | 76 ++++++++++++++++++++++++++++++++++++++++++++++
 rust/kernel/sync/atomic.rs | 32 +++++++++++++++++++
 2 files changed, 108 insertions(+)

diff --git a/rust/kernel/page.rs b/rust/kernel/page.rs
index 432fc0297d4a8..d4494a7c98401 100644
--- a/rust/kernel/page.rs
+++ b/rust/kernel/page.rs
@@ -260,6 +260,8 @@ fn with_pointer_into_page<T>(
     /// # Safety
     ///
     /// * Callers must ensure that `dst` is valid for writing `len` bytes.
+    /// * Callers must ensure that there are no other concurrent reads or writes to/from the
+    ///   destination memory region.
     /// * Callers must ensure that this call does not race with a write to the same page that
     ///   overlaps with this read.
     pub unsafe fn read_raw(&self, dst: *mut u8, offset: usize, len: usize) -> Result {
@@ -274,6 +276,40 @@ pub unsafe fn read_raw(&self, dst: *mut u8, offset: usize, len: usize) -> Result
         })
     }
 
+    /// Maps the page and reads from it into the given memory region using byte-wise atomic memory
+    /// operations.
+    ///
+    /// This method will perform bounds checks on the page offset. If `offset .. offset+len` goes
+    /// outside of the page, then this call returns [`EINVAL`].
+    ///
+    /// # Safety
+    ///
+    /// Callers must ensure that:
+    ///
+    /// - `dst` is valid for writes for `len` bytes for the duration of the call.
+    /// - For the duration of the call, other accesses to the area described by `dst` and `len`,
+    ///   must not cause data races (defined by [`LKMM`]) against atomic operations executed by this
+    ///   function. Note that if all other accesses are atomic, then this safety requirement is
+    ///   trivially fulfilled.
+    /// - Callers must ensure that this call does not race with a write to the source page that
+    ///   overlaps with this read.
+    ///
+    /// [`LKMM`]: srctree/tools/memory-model
+    pub unsafe fn read_bytewise_atomic(&self, dst: *mut u8, offset: usize, len: usize) -> Result {
+        self.with_pointer_into_page(offset, len, move |src| {
+            // SAFETY:
+            // - If `with_pointer_into_page` calls into this closure, then it has performed a
+            //   bounds check and guarantees that `src` is valid for `len` bytes.
+            // - By function safety requirements `dst` is valid for writes for `len` bytes.
+            // - By function safety requirements there are no other writes to `src` during this
+            //   call.
+            // - By function safety requirements all other access to `dst` during this call are
+            //   atomic.
+            unsafe { kernel::sync::atomic::atomic_per_byte_memcpy(src, dst, len) };
+            Ok(())
+        })
+    }
+
     /// Maps the page and writes into it from the given buffer.
     ///
     /// This method will perform bounds checks on the page offset. If `offset .. offset+len` goes
@@ -282,6 +318,7 @@ pub unsafe fn read_raw(&self, dst: *mut u8, offset: usize, len: usize) -> Result
     /// # Safety
     ///
     /// * Callers must ensure that `src` is valid for reading `len` bytes.
+    /// * Callers must ensure that there are no concurrent writes to the source memory region.
     /// * Callers must ensure that this call does not race with a read or write to the same page
     ///   that overlaps with this write.
     pub unsafe fn write_raw(&self, src: *const u8, offset: usize, len: usize) -> Result {
@@ -295,6 +332,45 @@ pub unsafe fn write_raw(&self, src: *const u8, offset: usize, len: usize) -> Res
         })
     }
 
+    /// Maps the page and writes into it from the given memory region using byte-wise atomic memory
+    /// operations.
+    ///
+    /// This method will perform bounds checks on the page offset. If `offset .. offset+len` goes
+    /// outside of the page, then this call returns [`EINVAL`].
+    ///
+    /// # Safety
+    ///
+    /// Callers must ensure that:
+    ///
+    /// - `src` is valid for reads for `len` bytes for the duration of the call.
+    /// - For the duration of the call, other accesses to the areas described by `src` and `len`,
+    ///   must not cause data races (defined by [`LKMM`]) against atomic operations executed by this
+    ///   function. Note that if all other accesses are atomic, then this safety requirement is
+    ///   trivially fulfilled.
+    /// - Callers must ensure that this call does not race with a read or write to the destination
+    ///   page that overlaps with this write.
+    ///
+    /// [`LKMM`]: srctree/tools/memory-model
+    pub unsafe fn write_bytewise_atomic(
+        &self,
+        src: *const u8,
+        offset: usize,
+        len: usize,
+    ) -> Result {
+        self.with_pointer_into_page(offset, len, move |dst| {
+            // SAFETY:
+            // - By function safety requirements `src` is valid for reads for `len` bytes.
+            // - If `with_pointer_into_page` calls into this closure, then it has performed a
+            //   bounds check and guarantees that `dst` is valid for `len` bytes.
+            // - By function safety requirements there are no other writes to `dst` during this
+            //   call.
+            // - By function safety requirements all other access to `src` during this call are
+            //   atomic.
+            unsafe { kernel::sync::atomic::atomic_per_byte_memcpy(src, dst, len) };
+            Ok(())
+        })
+    }
+
     /// Maps the page and zeroes the given slice.
     ///
     /// This method will perform bounds checks on the page offset. If `offset .. offset+len` goes
diff --git a/rust/kernel/sync/atomic.rs b/rust/kernel/sync/atomic.rs
index 4aebeacb961a2..8ab20126a88cf 100644
--- a/rust/kernel/sync/atomic.rs
+++ b/rust/kernel/sync/atomic.rs
@@ -560,3 +560,35 @@ pub fn fetch_add<Rhs, Ordering: ordering::Ordering>(&self, v: Rhs, _: Ordering)
         unsafe { from_repr(ret) }
     }
 }
+
+/// Copy `len` bytes from `src` to `dst` using byte-wise atomic operations.
+///
+/// This copy operation is volatile.
+///
+/// # Safety
+///
+/// Callers must ensure that:
+///
+/// - `src` is valid for reads for `len` bytes for the duration of the call.
+/// - `dst` is valid for writes for `len` bytes for the duration of the call.
+/// - For the duration of the call, other accesses to the areas described by `src`, `dst` and `len`,
+///   must not cause data races (defined by [`LKMM`]) against atomic operations executed by this
+///   function. Note that if all other accesses are atomic, then this safety requirement is
+///   trivially fulfilled.
+///
+/// [`LKMM`]: srctree/tools/memory-model
+pub unsafe fn atomic_per_byte_memcpy(src: *const u8, dst: *mut u8, len: usize) {
+    // SAFETY: By the safety requirements of this function, the following operation will not:
+    //  - Trap.
+    //  - Invalidate any reference invariants.
+    //  - Race with any operation by the Rust AM, as `bindings::memcpy` is a byte-wise atomic
+    //    operation and all operations by the Rust AM to the involved memory areas use byte-wise
+    //    atomic semantics.
+    unsafe {
+        bindings::memcpy(
+            dst.cast::<kernel::ffi::c_void>(),
+            src.cast::<kernel::ffi::c_void>(),
+            len,
+        )
+    };
+}

---
base-commit: 63804fed149a6750ffd28610c5c1c98cce6bd377
change-id: 20260130-page-volatile-io-05ff595507d3

Best regards,
-- 
Andreas Hindborg <a.hindborg@kernel.org>
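
To make the term "byte-wise atomic" concrete, here is a minimal userspace
sketch. It is not the kernel implementation: the patch calls
`bindings::memcpy`, whereas this sketch assumes each byte is copied with a
relaxed atomic load and store through `core::sync::atomic::AtomicU8`. The
function name `bytewise_atomic_copy` is hypothetical.

```rust
use core::sync::atomic::{AtomicU8, Ordering};

/// Hypothetical userspace model of a byte-wise atomic copy (assumption: one
/// relaxed atomic load and one relaxed atomic store per byte).
///
/// # Safety
///
/// - `src` must be valid for atomic reads for `len` bytes.
/// - `dst` must be valid for atomic writes for `len` bytes.
pub unsafe fn bytewise_atomic_copy(src: *const u8, dst: *mut u8, len: usize) {
    for i in 0..len {
        // SAFETY: The caller guarantees that both ranges are valid for `len`
        // bytes, and `AtomicU8` has the same size and alignment as `u8`.
        let (s, d) = unsafe {
            (
                &*src.add(i).cast::<AtomicU8>(),
                &*dst.add(i).cast::<AtomicU8>(),
            )
        };
        // Each byte is transferred atomically; the copy as a whole is not.
        d.store(s.load(Ordering::Relaxed), Ordering::Relaxed);
    }
}
```

A concurrent atomic writer can still make the destination end up with a mix
of old and new bytes. The sketch only ensures that each individual byte
access is not a data race.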
Re: [PATCH v3] rust: page: add byte-wise atomic memory copy methods
Posted by Alice Ryhl 1 month, 2 weeks ago
On Fri, Feb 13, 2026 at 07:42:53AM +0100, Andreas Hindborg wrote:
> When copying data from buffers that are mapped to user space, it is
> impossible to guarantee absence of concurrent memory operations on those
> buffers. Copying data to/from `Page` from/to these buffers would be
> undefined behavior if no special considerations are made.
> 
> Add methods on `Page` to read and write the contents using byte-wise atomic
> operations.
> 
> Also improve clarity by specifying additional requirements on
> `read_raw`/`write_raw` methods regarding concurrent operations on involved
> buffers.
> 
> Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>

> +/// Copy `len` bytes from `src` to `dst` using byte-wise atomic operations.
> +///
> +/// This copy operation is volatile.
> +///
> +/// # Safety
> +///
> +/// Callers must ensure that:
> +///
> +/// - `src` is valid for reads for `len` bytes for the duration of the call.
> +/// - `dst` is valid for writes for `len` bytes for the duration of the call.
> +/// - For the duration of the call, other accesses to the areas described by `src`, `dst` and `len`,
> +///   must not cause data races (defined by [`LKMM`]) against atomic operations executed by this
> +///   function. Note that if all other accesses are atomic, then this safety requirement is
> +///   trivially fulfilled.
> +///
> +/// [`LKMM`]: srctree/tools/memory-model
> +pub unsafe fn atomic_per_byte_memcpy(src: *const u8, dst: *mut u8, len: usize) {
> +    // SAFETY: By the safety requirements of this function, the following operation will not:
> +    //  - Trap.
> +    //  - Invalidate any reference invariants.
> +    //  - Race with any operation by the Rust AM, as `bindings::memcpy` is a byte-wise atomic
> +    //    operation and all operations by the Rust AM to the involved memory areas use byte-wise
> +    //    atomic semantics.
> +    unsafe {
> +        bindings::memcpy(
> +            dst.cast::<kernel::ffi::c_void>(),
> +            src.cast::<kernel::ffi::c_void>(),
> +            len,

Are we sure that LLVM will not say "memcpy is a special function name, I
know what it means" and optimize this like a non-atomic memcpy?

I think we should consider using the

	std::intrinsics::volatile_copy_nonoverlapping_memory

intrinsic until Rust stabilizes a built-in atomic per-byte memcpy. Yes I
know the intrinsic is unstable, but we should at least ask the Rust
folks about it. They are plausibly ok with this particular usage.

Alice
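
For comparison, a volatile per-byte copy can be sketched on stable Rust with
`core::ptr::read_volatile` and `core::ptr::write_volatile`. This is only an
illustration of the idea, not the unstable intrinsic named above, and the
helper name `volatile_bytewise_copy` is hypothetical. Under the current Rust
memory model volatile accesses are not atomic, so this does not by itself
make racing accesses defined.

```rust
use core::ptr;

/// Hypothetical helper: copies `len` bytes one at a time using volatile
/// loads and stores, so the compiler may not elide or merge the accesses.
///
/// # Safety
///
/// - `src` must be valid for reads for `len` bytes.
/// - `dst` must be valid for writes for `len` bytes.
/// - The two ranges must not overlap.
pub unsafe fn volatile_bytewise_copy(src: *const u8, dst: *mut u8, len: usize) {
    for i in 0..len {
        // SAFETY: The caller guarantees that both pointers are valid for
        // `len` bytes, so offsets `0..len` stay in bounds.
        unsafe {
            let byte = ptr::read_volatile(src.add(i));
            ptr::write_volatile(dst.add(i), byte);
        }
    }
}
```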
Re: [PATCH v3] rust: page: add byte-wise atomic memory copy methods
Posted by Boqun Feng 1 month, 1 week ago
On Tue, Feb 17, 2026 at 12:03:00PM +0000, Alice Ryhl wrote:
> On Fri, Feb 13, 2026 at 07:42:53AM +0100, Andreas Hindborg wrote:
> > When copying data from buffers that are mapped to user space, it is
> > impossible to guarantee absence of concurrent memory operations on those
> > buffers. Copying data to/from `Page` from/to these buffers would be
> > undefined behavior if no special considerations are made.
> > 
> > Add methods on `Page` to read and write the contents using byte-wise atomic
> > operations.
> > 
> > Also improve clarity by specifying additional requirements on
> > `read_raw`/`write_raw` methods regarding concurrent operations on involved
> > buffers.
> > 
> > Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
> 
> > +/// Copy `len` bytes from `src` to `dst` using byte-wise atomic operations.
> > +///
> > +/// This copy operation is volatile.
> > +///
> > +/// # Safety
> > +///
> > +/// Callers must ensure that:
> > +///
> > +/// - `src` is valid for reads for `len` bytes for the duration of the call.
> > +/// - `dst` is valid for writes for `len` bytes for the duration of the call.
> > +/// - For the duration of the call, other accesses to the areas described by `src`, `dst` and `len`,
> > +///   must not cause data races (defined by [`LKMM`]) against atomic operations executed by this
> > +///   function. Note that if all other accesses are atomic, then this safety requirement is
> > +///   trivially fulfilled.
> > +///
> > +/// [`LKMM`]: srctree/tools/memory-model
> > +pub unsafe fn atomic_per_byte_memcpy(src: *const u8, dst: *mut u8, len: usize) {
> > +    // SAFETY: By the safety requirements of this function, the following operation will not:
> > +    //  - Trap.
> > +    //  - Invalidate any reference invariants.
> > +    //  - Race with any operation by the Rust AM, as `bindings::memcpy` is a byte-wise atomic
> > +    //    operation and all operations by the Rust AM to the involved memory areas use byte-wise
> > +    //    atomic semantics.
> > +    unsafe {
> > +        bindings::memcpy(
> > +            dst.cast::<kernel::ffi::c_void>(),
> > +            src.cast::<kernel::ffi::c_void>(),
> > +            len,
> 
> Are we sure that LLVM will not say "memcpy is a special function name, I
> know what it means" and optimize this like a non-atomic memcpy?
> 
> I think we should consider using the
> 
> 	std::intrinsics::volatile_copy_nonoverlapping_memory
> 

But two racing volatile_copy_nonoverlapping_memory()s are still a data
race, hence UB, no?

Regards,
Boqun

> intrinsic until Rust stabilizes a built-in atomic per-byte memcpy. Yes I
> know the intrinsic is unstable, but we should at least ask the Rust
> folks about it. They are plausibly ok with this particular usage.
> 
> Alice
Re: [PATCH v3] rust: page: add byte-wise atomic memory copy methods
Posted by Gary Guo 1 month, 1 week ago
On 2026-02-17 12:03, Alice Ryhl wrote:
> On Fri, Feb 13, 2026 at 07:42:53AM +0100, Andreas Hindborg wrote:
>> When copying data from buffers that are mapped to user space, it is
>> impossible to guarantee absence of concurrent memory operations on those
>> buffers. Copying data to/from `Page` from/to these buffers would be
>> undefined behavior if no special considerations are made.
>> 
>> Add methods on `Page` to read and write the contents using byte-wise atomic
>> operations.
>> 
>> Also improve clarity by specifying additional requirements on
>> `read_raw`/`write_raw` methods regarding concurrent operations on involved
>> buffers.
>> 
>> Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
> 
>> +/// Copy `len` bytes from `src` to `dst` using byte-wise atomic operations.
>> +///
>> +/// This copy operation is volatile.
>> +///
>> +/// # Safety
>> +///
>> +/// Callers must ensure that:
>> +///
>> +/// - `src` is valid for reads for `len` bytes for the duration of the call.
>> +/// - `dst` is valid for writes for `len` bytes for the duration of the call.
>> +/// - For the duration of the call, other accesses to the areas described by `src`, `dst` and `len`,
>> +///   must not cause data races (defined by [`LKMM`]) against atomic operations executed by this
>> +///   function. Note that if all other accesses are atomic, then this safety requirement is
>> +///   trivially fulfilled.
>> +///
>> +/// [`LKMM`]: srctree/tools/memory-model
>> +pub unsafe fn atomic_per_byte_memcpy(src: *const u8, dst: *mut u8, len: usize) {
>> +    // SAFETY: By the safety requirements of this function, the following operation will not:
>> +    //  - Trap.
>> +    //  - Invalidate any reference invariants.
>> +    //  - Race with any operation by the Rust AM, as `bindings::memcpy` is a byte-wise atomic
>> +    //    operation and all operations by the Rust AM to the involved memory areas use byte-wise
>> +    //    atomic semantics.
>> +    unsafe {
>> +        bindings::memcpy(
>> +            dst.cast::<kernel::ffi::c_void>(),
>> +            src.cast::<kernel::ffi::c_void>(),
>> +            len,
> 
> Are we sure that LLVM will not say "memcpy is a special function name, I
> know what it means" and optimize this like a non-atomic memcpy?

This "treating special symbol names as intrinsics" logic is done in Clang
and won't be performed once the code is lowered to LLVM IR, so Rust is
immune to that (even when LTO'ed together with Clang-generated IR). So
calling into bindings is fine.

> 
> I think we should consider using the
> 
> 	std::intrinsics::volatile_copy_nonoverlapping_memory
> 
> intrinsic until Rust stabilizes a built-in atomic per-byte memcpy. Yes I
> know the intrinsic is unstable, but we should at least ask the Rust
> folks about it. They are plausibly ok with this particular usage.

If we have this in stable, I think it's sufficient for LKMM. However, the
Rust/C11 MM says that volatile ops are not atomic and using them for
concurrency is UB.

I recall that at the last Rust all hands the vibe of the discussion was
that it's desirable to define volatile as being byte-wise atomic, so if
that actually happens, this would indeed be what we want (but I think
semantics w.r.t. mixed-size atomics need to be figured out first).

Best,
Gary

> 
> Alice
Re: [PATCH v3] rust: page: add byte-wise atomic memory copy methods
Posted by Alice Ryhl 1 month, 1 week ago
On Tue, Feb 17, 2026 at 11:10:15PM +0000, Gary Guo wrote:
> On 2026-02-17 12:03, Alice Ryhl wrote:
> > On Fri, Feb 13, 2026 at 07:42:53AM +0100, Andreas Hindborg wrote:
> > > When copying data from buffers that are mapped to user space, it is
> > > impossible to guarantee absence of concurrent memory operations on those
> > > buffers. Copying data to/from `Page` from/to these buffers would be
> > > undefined behavior if no special considerations are made.
> > > 
> > > Add methods on `Page` to read and write the contents using byte-wise atomic
> > > operations.
> > > 
> > > Also improve clarity by specifying additional requirements on
> > > `read_raw`/`write_raw` methods regarding concurrent operations on involved
> > > buffers.
> > > 
> > > Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
> > 
> > > +/// Copy `len` bytes from `src` to `dst` using byte-wise atomic operations.
> > > +///
> > > +/// This copy operation is volatile.
> > > +///
> > > +/// # Safety
> > > +///
> > > +/// Callers must ensure that:
> > > +///
> > > +/// - `src` is valid for reads for `len` bytes for the duration of the call.
> > > +/// - `dst` is valid for writes for `len` bytes for the duration of the call.
> > > +/// - For the duration of the call, other accesses to the areas described by `src`, `dst` and `len`,
> > > +///   must not cause data races (defined by [`LKMM`]) against atomic operations executed by this
> > > +///   function. Note that if all other accesses are atomic, then this safety requirement is
> > > +///   trivially fulfilled.
> > > +///
> > > +/// [`LKMM`]: srctree/tools/memory-model
> > > +pub unsafe fn atomic_per_byte_memcpy(src: *const u8, dst: *mut u8, len: usize) {
> > > +    // SAFETY: By the safety requirements of this function, the following operation will not:
> > > +    //  - Trap.
> > > +    //  - Invalidate any reference invariants.
> > > +    //  - Race with any operation by the Rust AM, as `bindings::memcpy` is a byte-wise atomic
> > > +    //    operation and all operations by the Rust AM to the involved memory areas use byte-wise
> > > +    //    atomic semantics.
> > > +    unsafe {
> > > +        bindings::memcpy(
> > > +            dst.cast::<kernel::ffi::c_void>(),
> > > +            src.cast::<kernel::ffi::c_void>(),
> > > +            len,
> > 
> > Are we sure that LLVM will not say "memcpy is a special function name, I
> > know what it means" and optimize this like a non-atomic memcpy?
> 
> This "treating special symbol names as intrinsics" logic is done in Clang
> and won't be performed once the code is lowered to LLVM IR, so Rust is
> immune to that (even when LTO'ed together with Clang-generated IR). So
> calling into bindings is fine.

Ok, that's good! Then I'm less concerned.

Though I guess it means that even if it's known to be e.g. an 8-byte
aligned memcpy of length 8, then it still can't optimize it to e.g. a
movq instruction.

> > I think we should consider using the
> > 
> > 	std::intrinsics::volatile_copy_nonoverlapping_memory
> > 
> > intrinsic until Rust stabilizes a built-in atomic per-byte memcpy. Yes I
> > know the intrinsic is unstable, but we should at least ask the Rust
> > folks about it. They are plausibly ok with this particular usage.
> 
> If we have this in stable, I think it's sufficient for LKMM. However, the
> Rust/C11 MM says that volatile ops are not atomic and using them for
> concurrency is UB.

I'm well aware of that! Yet, Rust currently provides no alternative
whatsoever, even on nightly, and has already told us in other situations
they're ok with Linux using volatile for this purpose in limited
situations. That is why I suggest doing this temporarily, and after
asking the rustc compiler folks about it.

> I recall that at the last Rust all hands the vibe of the discussion was
> that it's desirable to define volatile as being byte-wise atomic, so if
> that actually happens, this would indeed be what we want (but I think
> semantics w.r.t. mixed-size atomics need to be figured out first).

Yes, that's right.

Alice
Re: [PATCH v3] rust: page: add byte-wise atomic memory copy methods
Posted by Peter Zijlstra 1 month, 1 week ago
On Tue, Feb 17, 2026 at 11:10:15PM +0000, Gary Guo wrote:

> If we have this in stable, I think it's sufficient for LKMM. However,
> the Rust/C11 MM says that volatile ops are not atomic and using them for
> concurrency is UB.
> 
> I recall that at the last Rust all hands the vibe of the discussion was
> that it's desirable to define volatile as being byte-wise atomic, so if
> that actually happens, this would indeed be what we want (but I think
> semantics w.r.t. mixed-size atomics need to be figured out first).

I would strongly suggest for volatile to be single-copy 'atomic' for any
naturally aligned word sized access. This is what we have with
GCC/Clang.

If you pick anything else, you're explicitly creating interoperability
issues.
Re: [PATCH v3] rust: page: add byte-wise atomic memory copy methods
Posted by Gary Guo 1 month, 1 week ago
On 2026-02-18 10:20, Peter Zijlstra wrote:
> On Tue, Feb 17, 2026 at 11:10:15PM +0000, Gary Guo wrote:
> 
>> If we have this in stable, I think it's sufficient for LKMM. However,
>> the Rust/C11 MM says that volatile ops are not atomic and using them for
>> concurrency is UB.
>> 
>> I recall that at the last Rust all hands the vibe of the discussion was
>> that it's desirable to define volatile as being byte-wise atomic, so if
>> that actually happens, this would indeed be what we want (but I think
>> semantics w.r.t. mixed-size atomics need to be figured out first).
> 
> I would strongly suggest for volatile to be single-copy 'atomic' for any
> naturally aligned word sized access. This is what we have with
> GCC/Clang.
> 
> If you pick anything else, you're explicitly creating interoperability
> issues.

AFAIK LLVM IR only "guarantees" this for primitives, so if you have a struct
that happens to be word-aligned and word-sized, it can still tear, which is
why the "byte-wise atomicity" semantics is what's being proposed.

I recall it was being discussed that, for the MMIO use case, it is desirable
to have this defined in such a way that one single instruction is generated
for an aligned access of a small-enough integer primitive.

This is exactly the same situation in C too. If you have a volatile struct load
then Clang actually generates a volatile memcpy for you, and it can tear.

Best,
Gary
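
To illustrate the distinction between a primitive and a word-sized struct,
here is a small userspace sketch. It assumes a target with 64-bit atomics
and routes a two-field struct through an `AtomicU64`, so that every load and
store is a single access to an integer primitive rather than a struct copy
that may tear. The names `Pair` and `AtomicPair` are hypothetical and not
part of the patch.

```rust
use core::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical word-sized struct that must never be observed half-updated.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Pair {
    pub lo: u32,
    pub hi: u32,
}

impl Pair {
    /// Packs both fields into one 64-bit integer (`hi` in the upper half).
    fn pack(self) -> u64 {
        (u64::from(self.hi) << 32) | u64::from(self.lo)
    }

    /// Inverse of `pack`.
    fn unpack(raw: u64) -> Self {
        Pair {
            lo: raw as u32,
            hi: (raw >> 32) as u32,
        }
    }
}

/// Stores a `Pair` as a single integer primitive so that loads and stores
/// are single-copy atomic.
pub struct AtomicPair(AtomicU64);

impl AtomicPair {
    pub fn new(value: Pair) -> Self {
        AtomicPair(AtomicU64::new(value.pack()))
    }

    pub fn load(&self) -> Pair {
        Pair::unpack(self.0.load(Ordering::Relaxed))
    }

    pub fn store(&self, value: Pair) {
        self.0.store(value.pack(), Ordering::Relaxed);
    }
}
```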
Re: [PATCH v3] rust: page: add byte-wise atomic memory copy methods
Posted by Peter Zijlstra 1 month, 1 week ago
On Wed, Feb 18, 2026 at 11:36:20AM +0000, Gary Guo wrote:
> On 2026-02-18 10:20, Peter Zijlstra wrote:
> > On Tue, Feb 17, 2026 at 11:10:15PM +0000, Gary Guo wrote:
> > 
> >> If we have this in stable, I think it's sufficient for LKMM. However,
> >> the Rust/C11 MM says that volatile ops are not atomic and using them for
> >> concurrency is UB.
> >> 
> >> I recall that at the last Rust all hands the vibe of the discussion was
> >> that it's desirable to define volatile as being byte-wise atomic, so if
> >> that actually happens, this would indeed be what we want (but I think
> >> semantics w.r.t. mixed-size atomics need to be figured out first).
> > 
> > I would strongly suggest for volatile to be single-copy 'atomic' for any
> > naturally aligned word sized access. This is what we have with
> > GCC/Clang.
> > 
> > If you pick anything else, you're explicitly creating interoperability
> > issues.
> 
> AFAIK LLVM IR only "guarantees" this for primitives, so if you have a struct
> that happens to be word-aligned and word-sized, it can still tear, which is
> why the "byte-wise atomicity" semantics is what's being proposed.

Urgh, what does GCC do? And are we sure this doesn't actually break
anything? I'm fairly sure we rely on at least 'small' struct volatile
reads (eg struct fd) to 'work'.

> I recall it was being discussed that, for the MMIO use case, it is desirable
> to have this defined in such a way that one single instruction is generated
> for an aligned access of a small-enough integer primitive.
> 
> This is exactly the same situation in C too. If you have a volatile struct load
> then Clang actually generates a volatile memcpy for you, and it can tear.

It could just be LLVM is broken and needs fixing in this case.
Re: [PATCH v3] rust: page: add byte-wise atomic memory copy methods
Posted by Miguel Ojeda 1 month, 1 week ago
On Fri, Feb 13, 2026 at 7:43 AM Andreas Hindborg <a.hindborg@kernel.org> wrote:
>
> @@ -282,6 +318,7 @@ pub unsafe fn read_raw(&self, dst: *mut u8, offset: usize, len: usize) -> Result
>      /// # Safety
>      ///
>      /// * Callers must ensure that `src` is valid for reading `len` bytes.
> +    /// * Callers must ensure that there are no concurrent writes to the source memory region.
>      /// * Callers must ensure that this call does not race with a read or write to the same page
>      ///   that overlaps with this write.
>      pub unsafe fn write_raw(&self, src: *const u8, offset: usize, len: usize) -> Result {

Coming from:

  https://lore.kernel.org/rust-for-linux/20260215-page-additions-v1-0-4827790a9bc4@kernel.org/T/#md120cdea73132fc698bf69bf3d69287c2cb28449

Leaving a comment here to avoid forgetting: I think the new bullet
point here and elsewhere is not needed, i.e. the first one uses "valid
for reads" which I think is meant to already exclude data races.

> +    /// - For the duration of the call, other accesses to the areas described by `src` and `len`,
> +    ///   must not cause data races (defined by [`LKMM`]) against atomic operations executed by this
> +    ///   function. Note that if all other accesses are atomic, then this safety requirement is
> +    ///   trivially fulfilled.

And, for this one, Benno said perhaps we should introduce a shorthand.

Cheers,
Miguel
Re: [PATCH v3] rust: page: add byte-wise atomic memory copy methods
Posted by Alice Ryhl 1 month, 1 week ago
On Wed, Feb 18, 2026 at 12:57 PM Miguel Ojeda
<miguel.ojeda.sandonis@gmail.com> wrote:
>
> On Fri, Feb 13, 2026 at 7:43 AM Andreas Hindborg <a.hindborg@kernel.org> wrote:
> >
> > @@ -282,6 +318,7 @@ pub unsafe fn read_raw(&self, dst: *mut u8, offset: usize, len: usize) -> Result
> >      /// # Safety
> >      ///
> >      /// * Callers must ensure that `src` is valid for reading `len` bytes.
> > +    /// * Callers must ensure that there are no concurrent writes to the source memory region.
> >      /// * Callers must ensure that this call does not race with a read or write to the same page
> >      ///   that overlaps with this write.
> >      pub unsafe fn write_raw(&self, src: *const u8, offset: usize, len: usize) -> Result {
>
> Coming from:
>
>   https://lore.kernel.org/rust-for-linux/20260215-page-additions-v1-0-4827790a9bc4@kernel.org/T/#md120cdea73132fc698bf69bf3d69287c2cb28449
>
> Leaving a comment here to avoid forgetting: I think the new bullet
> point here and elsewhere is not needed, i.e. the first one uses "valid
> for reads" which I think is meant to already exclude data races.

That's right, valid for reads implies that it's okay to read it.

> > +    /// - For the duration of the call, other accesses to the areas described by `src` and `len`,
> > +    ///   must not cause data races (defined by [`LKMM`]) against atomic operations executed by this
> > +    ///   function. Note that if all other accesses are atomic, then this safety requirement is
> > +    ///   trivially fulfilled.
>
> And, for this one, Benno said perhaps we should introduce a shorthand.

valid for atomic reads?

Alice
Re: [PATCH v3] rust: page: add byte-wise atomic memory copy methods
Posted by Miguel Ojeda 1 month, 1 week ago
On Wed, Feb 18, 2026 at 1:00 PM Alice Ryhl <aliceryhl@google.com> wrote:
>
> That's right, valid for reads implies that it's okay to read it.

Great, thanks for confirming.

> valid for atomic reads?

That is the one he suggested, so it is likely to be a good one if two
people thought about it :)

Cheers,
Miguel
Re: [PATCH v3] rust: page: add byte-wise atomic memory copy methods
Posted by Andreas Hindborg 1 month, 1 week ago
"Miguel Ojeda" <miguel.ojeda.sandonis@gmail.com> writes:

> On Wed, Feb 18, 2026 at 1:00 PM Alice Ryhl <aliceryhl@google.com> wrote:
>>
>> That's right, valid for reads implies that it's okay to read it.
>
> Great, thanks for confirming.

Cool, will adjust in next version.

>
>> valid for atomic reads?
>
> That is the one he suggested, so it is likely to be a good one if two
> people thought about it :)

Great, I like it.

This document should be in tree. It's not, is it?


Best regards,
Andreas Hindborg
Re: [PATCH v3] rust: page: add byte-wise atomic memory copy methods
Posted by Benno Lossin 1 month, 1 week ago
On Wed Feb 18, 2026 at 1:00 PM CET, Alice Ryhl wrote:
> On Wed, Feb 18, 2026 at 12:57 PM Miguel Ojeda
> <miguel.ojeda.sandonis@gmail.com> wrote:
>> > +    /// - For the duration of the call, other accesses to the areas described by `src` and `len`,
>> > +    ///   must not cause data races (defined by [`LKMM`]) against atomic operations executed by this
>> > +    ///   function. Note that if all other accesses are atomic, then this safety requirement is
>> > +    ///   trivially fulfilled.
>>
>> And, for this one, Benno said perhaps we should introduce a shorthand.
>
> valid for atomic reads?

Yes, in particular it should be folded in with the "valid for reads"
above, since that would otherwise be conflicting.

Cheers,
Benno
Re: [PATCH v3] rust: page: add byte-wise atomic memory copy methods
Posted by Boqun Feng 1 month, 2 weeks ago
On Fri, Feb 13, 2026 at 07:42:53AM +0100, Andreas Hindborg wrote:
[...]
> diff --git a/rust/kernel/sync/atomic.rs b/rust/kernel/sync/atomic.rs
> index 4aebeacb961a2..8ab20126a88cf 100644
> --- a/rust/kernel/sync/atomic.rs
> +++ b/rust/kernel/sync/atomic.rs
> @@ -560,3 +560,35 @@ pub fn fetch_add<Rhs, Ordering: ordering::Ordering>(&self, v: Rhs, _: Ordering)
>          unsafe { from_repr(ret) }
>      }
>  }
> +
> +/// Copy `len` bytes from `src` to `dst` using byte-wise atomic operations.
> +///

Given Greg and Peter's feedback, I think it's better to call out why we
need `atomic_per_byte_memcpy()` and why we use bindings::memcpy() to
implement it. How about a paragraph as follows:

/// This is the concurrent-safe version of `core::ptr::copy()` (the
/// counterpart of standard C's `memcpy()`). Because of the atomicity at
/// byte level, when racing with another concurrent atomic access (or
/// a normal read races with an atomic read) or an external access (from
/// DMA or userspace), the behavior of this function is defined:
/// copying memory at the (at least) byte granularity.
///
/// Implementation note: it's currently implemented by kernel's
/// `memcpy()`, because kernel's `memcpy()` is implemented in a way that
/// byte-wise atomic memory load/store instructions are used.

And we should probably make this atomic_per_byte_memcpy() a separate
patch.

Thoughts?

Regards,
Boqun

> +/// This copy operation is volatile.
> +///
> +/// # Safety
> +///
> +/// Callers must ensure that:
> +///
> +/// - `src` is valid for reads for `len` bytes for the duration of the call.
> +/// - `dst` is valid for writes for `len` bytes for the duration of the call.
> +/// - For the duration of the call, other accesses to the areas described by `src`, `dst` and `len`,
> +///   must not cause data races (defined by [`LKMM`]) against atomic operations executed by this
> +///   function. Note that if all other accesses are atomic, then this safety requirement is
> +///   trivially fulfilled.
> +///
> +/// [`LKMM`]: srctree/tools/memory-model
> +pub unsafe fn atomic_per_byte_memcpy(src: *const u8, dst: *mut u8, len: usize) {
> +    // SAFETY: By the safety requirements of this function, the following operation will not:
> +    //  - Trap.
> +    //  - Invalidate any reference invariants.
> +    //  - Race with any operation by the Rust AM, as `bindings::memcpy` is a byte-wise atomic
> +    //    operation and all operations by the Rust AM to the involved memory areas use byte-wise
> +    //    atomic semantics.
> +    unsafe {
> +        bindings::memcpy(
> +            dst.cast::<kernel::ffi::c_void>(),
> +            src.cast::<kernel::ffi::c_void>(),
> +            len,
> +        )
> +    };
> +}
> 
> ---
> base-commit: 63804fed149a6750ffd28610c5c1c98cce6bd377
> change-id: 20260130-page-volatile-io-05ff595507d3
> 
> Best regards,
> -- 
> Andreas Hindborg <a.hindborg@kernel.org>
> 
>
Re: [PATCH v3] rust: page: add byte-wise atomic memory copy methods
Posted by Peter Zijlstra 1 month, 2 weeks ago
On Fri, Feb 13, 2026 at 09:44:18AM -0800, Boqun Feng wrote:
> On Fri, Feb 13, 2026 at 07:42:53AM +0100, Andreas Hindborg wrote:
> [...]
> > diff --git a/rust/kernel/sync/atomic.rs b/rust/kernel/sync/atomic.rs
> > index 4aebeacb961a2..8ab20126a88cf 100644
> > --- a/rust/kernel/sync/atomic.rs
> > +++ b/rust/kernel/sync/atomic.rs
> > @@ -560,3 +560,35 @@ pub fn fetch_add<Rhs, Ordering: ordering::Ordering>(&self, v: Rhs, _: Ordering)
> >          unsafe { from_repr(ret) }
> >      }
> >  }
> > +
> > +/// Copy `len` bytes from `src` to `dst` using byte-wise atomic operations.
> > +///
> 
> Given Greg and Peter's feedback, I think it's better to call out why we
> need `atomic_per_byte_memcpy()` and why we use bindings::memcpy() to
> implement it. How about a paragraph as follow:
> 
> /// This is the concurrent-safe version of `core::ptr::copy()` (the
> /// counterpart of standard C's `memcpy()`). Because of the atomicity at
> /// byte level, when racing with another concurrent atomic access (or
> /// a normal read races with an atomic read) or an external access (from
> /// DMA or userspace), the behavior of this function is defined:
> /// copying memory at the (at least) byte granularity.
> ///
> /// Implementation note: it's currently implemented by kernel's
> /// `memcpy()`, because kernel's `memcpy()` is implemented in a way that
> /// byte-wise atomic memory load/store instructions are used.
> 
> And probably we make it a separate patch for this
> atomic_per_byte_memcpy().
> 
> Thoughts?

It's still not making sense; and no, kernel memcpy() does not necessarily
use byte wise copy. And please stop talking about 'atomic' here. There
are no atomic ops used (and atomic ops will fundamentally not help).

Seriously, none of this makes *ANY* sense.

Yes we have racing copies. And yes that is 'tricky'. But there is no
magic fix. Nor does it matter.

You copy 'n' bytes (in any way you like, preferably the fastest, that's
all that really matters), and then you get to go validate that the
content makes sense, like always when you get something from userspace.
Must not trust userspace.

So even if there was no concurrency, and your copy is 'perfect' you
*STILL* must not trust it. So the presence of concurrency matters not.
It is just another way userspace can serve you bad values, nothing more,
nothing less.
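
A minimal sketch of the copy-then-validate pattern described above, in
userspace Rust. The wire format (one length byte followed by the payload)
and the names `parse_snapshot` and `ParseError` are assumptions made for
illustration; nothing here comes from the patch. It only tries to show that
validation runs on a private snapshot, so later changes to the shared
buffer cannot invalidate a check that already passed.

```rust
/// Errors for the hypothetical length-prefixed format used in this sketch.
#[derive(Debug, PartialEq)]
enum ParseError {
    /// The snapshot does not even contain the length byte.
    TooShort,
    /// The length byte claims more payload than the snapshot holds.
    BadLength,
}

/// Validates a private snapshot that was already copied out of the shared
/// buffer and returns the payload it describes.
///
/// The caller is assumed to have copied the bytes exactly once; this
/// function never looks at the shared buffer again.
fn parse_snapshot(snapshot: &[u8]) -> Result<&[u8], ParseError> {
    // First byte is the claimed payload length; treat it as untrusted.
    let (&len, rest) = snapshot.split_first().ok_or(ParseError::TooShort)?;
    // Bounds-check the claimed length against what was actually copied.
    rest.get(..len as usize).ok_or(ParseError::BadLength)
}
```

Whether the snapshot was taken during a race or not, the same checks apply:
a bad length is rejected in both cases.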
Re: [PATCH v3] rust: page: add byte-wise atomic memory copy methods
Posted by Gary Guo 1 month, 2 weeks ago
On 2026-02-17 08:55, Peter Zijlstra wrote:
> On Fri, Feb 13, 2026 at 09:44:18AM -0800, Boqun Feng wrote:
>> On Fri, Feb 13, 2026 at 07:42:53AM +0100, Andreas Hindborg wrote:
>> [...]
>> > diff --git a/rust/kernel/sync/atomic.rs b/rust/kernel/sync/atomic.rs
>> > index 4aebeacb961a2..8ab20126a88cf 100644
>> > --- a/rust/kernel/sync/atomic.rs
>> > +++ b/rust/kernel/sync/atomic.rs
>> > @@ -560,3 +560,35 @@ pub fn fetch_add<Rhs, Ordering: ordering::Ordering>(&self, v: Rhs, _: Ordering)
>> >          unsafe { from_repr(ret) }
>> >      }
>> >  }
>> > +
>> > +/// Copy `len` bytes from `src` to `dst` using byte-wise atomic operations.
>> > +///
>> 
>> Given Greg and Peter's feedback, I think it's better to call out why we
>> need `atomic_per_byte_memcpy()` and why we use bindings::memcpy() to
>> implement it. How about a paragraph as follow:
>> 
>> /// This is the concurrent-safe version of `core::ptr::copy()` (the
>> /// counterpart of standard C's `memcpy()`). Because of the atomicity at
>> /// byte level, when racing with another concurrent atomic access (or
>> /// a normal read races with an atomic read) or an external access (from
>> /// DMA or userspace), the behavior of this function is defined:
>> /// copying memory at the (at least) byte granularity.
>> ///
>> /// Implementation note: it's currently implemented by kernel's
>> /// `memcpy()`, because kernel's `memcpy()` is implemented in a way that
>> /// byte-wise atomic memory load/store instructions are used.
>> 
>> And probably we make it a separate patch for this
>> atomic_per_byte_memcpy().
>> 
>> Thoughts?
> 
> Its still not making sense; an no kernel memcpy() does not necessarily
> use byte wise copy. And please stop talking about 'atomic' here. There
> are no atomic ops used (and atomic ops will fundamentally not help).

Byte-wise atomicity means that the guaranteed atomicity is per-byte, not that
the copying is per byte. The copying size and order can be arbitrary.

The "atomicity" is needed here so that concurrent access is defined and does
not race. "Per-byte" means that tearing is allowed to be observed.
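
As a rough illustration of that definition, here is a userspace sketch that
models each byte as an `AtomicU8` and copies with relaxed loads and stores.
The helpers `atomic_buf`, `snapshot` and `copy_bytewise_atomic` are
hypothetical names, and this is not how the kernel's `memcpy()` is
implemented; it only shows a copy whose atomicity guarantee is per byte.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

/// Builds a buffer of atomic bytes from plain bytes (illustrative helper).
fn atomic_buf(bytes: &[u8]) -> Vec<AtomicU8> {
    bytes.iter().map(|&b| AtomicU8::new(b)).collect()
}

/// Reads every byte of `buf` with a relaxed atomic load.
fn snapshot(buf: &[AtomicU8]) -> Vec<u8> {
    buf.iter().map(|b| b.load(Ordering::Relaxed)).collect()
}

/// Copies `src` into `dst` using one relaxed atomic load and one relaxed
/// atomic store per byte. No ordering between bytes is promised, and the
/// copy as a whole is not atomic.
fn copy_bytewise_atomic(src: &[AtomicU8], dst: &[AtomicU8]) {
    assert_eq!(src.len(), dst.len());
    for (s, d) in src.iter().zip(dst.iter()) {
        d.store(s.load(Ordering::Relaxed), Ordering::Relaxed);
    }
}
```

If another thread stored to `src` during the copy, `dst` could end up with a
mix of old and new bytes, but in this model that is a defined outcome
rather than a data race.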

Best,
Gary

> 
> Seriously, none of this makes *ANY* sense.
> 
> Yes we have racing copies. And yes that is 'tricky'. But there is no
> magic fix. Nor does it matter.
> 
> You copy 'n' bytes (in any way you like, preferably the fastest, that's
> all that really matters), and then you get to go validate that the
> content makes sense, like always when you get something from userspace.
> Must not trust userspace.
> 
> So even if there was no concurrency, and your copy is 'perfect' you
> *STILL* must not trust it. So the presence of concurrency matters not.
> It is just another way userspace can serve you bad values, nothing more,
> nothing less.
Re: [PATCH v3] rust: page: add byte-wise atomic memory copy methods
Posted by Will Deacon 1 month, 2 weeks ago
On Tue, Feb 17, 2026 at 09:42:37AM +0000, Gary Guo wrote:
> On 2026-02-17 08:55, Peter Zijlstra wrote:
> > On Fri, Feb 13, 2026 at 09:44:18AM -0800, Boqun Feng wrote:
> >> On Fri, Feb 13, 2026 at 07:42:53AM +0100, Andreas Hindborg wrote:
> >> [...]
> >> > diff --git a/rust/kernel/sync/atomic.rs b/rust/kernel/sync/atomic.rs
> >> > index 4aebeacb961a2..8ab20126a88cf 100644
> >> > --- a/rust/kernel/sync/atomic.rs
> >> > +++ b/rust/kernel/sync/atomic.rs
> >> > @@ -560,3 +560,35 @@ pub fn fetch_add<Rhs, Ordering: ordering::Ordering>(&self, v: Rhs, _: Ordering)
> >> >          unsafe { from_repr(ret) }
> >> >      }
> >> >  }
> >> > +
> >> > +/// Copy `len` bytes from `src` to `dst` using byte-wise atomic operations.
> >> > +///
> >> 
> >> Given Greg and Peter's feedback, I think it's better to call out why we
> >> need `atomic_per_byte_memcpy()` and why we use bindings::memcpy() to
> >> implement it. How about a paragraph as follow:
> >> 
> >> /// This is the concurrent-safe version of `core::ptr::copy()` (the
> >> /// counterpart of standard C's `memcpy()`). Because of the atomicity at
> >> /// byte level, when racing with another concurrent atomic access (or
> >> /// a normal read races with an atomic read) or an external access (from
> >> /// DMA or userspace), the behavior of this function is defined:
> >> /// copying memory at the (at least) byte granularity.
> >> ///
> >> /// Implementation note: it's currently implemented by kernel's
> >> /// `memcpy()`, because kernel's `memcpy()` is implemented in a way that
> >> /// byte-wise atomic memory load/store instructions are used.
> >> 
> >> And probably we make it a separate patch for this
> >> atomic_per_byte_memcpy().
> >> 
> >> Thoughts?
> > 
> > Its still not making sense; an no kernel memcpy() does not necessarily
> > use byte wise copy. And please stop talking about 'atomic' here. There
> > are no atomic ops used (and atomic ops will fundamentally not help).
> 
> Byte-wise atomicity means that the guaranteed atomicity is per-byte, not that
> the copying is per byte. The copying size and order can be arbitrary.

Curious, but how would you implement a memcpy that _isn't_ "atomic" by
that definition? Are you worried about accessing bytes multiple times,
or losing dependency ordering, or something else?

This all feels like playing tricks to placate the type system for
something that isn't actually a problem in practice. But I think I'm
probably at least as confused as Peter :)

Will
Re: [PATCH v3] rust: page: add byte-wise atomic memory copy methods
Posted by Boqun Feng 1 month, 2 weeks ago
On Tue, Feb 17, 2026 at 10:47:03AM +0000, Will Deacon wrote:
> On Tue, Feb 17, 2026 at 09:42:37AM +0000, Gary Guo wrote:
> > On 2026-02-17 08:55, Peter Zijlstra wrote:
> > > On Fri, Feb 13, 2026 at 09:44:18AM -0800, Boqun Feng wrote:
> > >> On Fri, Feb 13, 2026 at 07:42:53AM +0100, Andreas Hindborg wrote:
> > >> [...]
> > >> > diff --git a/rust/kernel/sync/atomic.rs b/rust/kernel/sync/atomic.rs
> > >> > index 4aebeacb961a2..8ab20126a88cf 100644
> > >> > --- a/rust/kernel/sync/atomic.rs
> > >> > +++ b/rust/kernel/sync/atomic.rs
> > >> > @@ -560,3 +560,35 @@ pub fn fetch_add<Rhs, Ordering: ordering::Ordering>(&self, v: Rhs, _: Ordering)
> > >> >          unsafe { from_repr(ret) }
> > >> >      }
> > >> >  }
> > >> > +
> > >> > +/// Copy `len` bytes from `src` to `dst` using byte-wise atomic operations.
> > >> > +///
> > >> 
> > >> Given Greg and Peter's feedback, I think it's better to call out why we
> > >> need `atomic_per_byte_memcpy()` and why we use bindings::memcpy() to
> > >> implement it. How about a paragraph as follow:
> > >> 
> > >> /// This is the concurrent-safe version of `core::ptr::copy()` (the
> > >> /// counterpart of standard C's `memcpy()`). Because of the atomicity at
> > >> /// byte level, when racing with another concurrent atomic access (or
> > >> /// a normal read races with an atomic read) or an external access (from
> > >> /// DMA or userspace), the behavior of this function is defined:
> > >> /// copying memory at the (at least) byte granularity.
> > >> ///
> > >> /// Implementation note: it's currently implemented by kernel's
> > >> /// `memcpy()`, because kernel's `memcpy()` is implemented in a way that
> > >> /// byte-wise atomic memory load/store instructions are used.
> > >> 
> > >> And probably we make it a separate patch for this
> > >> atomic_per_byte_memcpy().
> > >> 
> > >> Thoughts?
> > > 
> > > Its still not making sense; an no kernel memcpy() does not necessarily
> > > use byte wise copy. And please stop talking about 'atomic' here. There
> > > are no atomic ops used (and atomic ops will fundamentally not help).
> > 
> > Byte-wise atomicity means that the guaranteed atomicity is per-byte, not that
> > the copying is per byte. The copying size and order can be arbitrary.
> 
> Curious, but how would you implement a memcpy that _isn't_ "atomic" by
> that definition? Are you worried about accessing bytes multiple times,
> or losing dependency ordering, or something else?
> 

We are worried about two racing memcpy()s ending up as a data race, which
is undefined behavior. And "atomic" is the keyword in C (and Rust) that
"lifts" normal accesses out of being a data race, for example:

	thread 1		thread 2
	--------		--------
    	*a = 1;			r1 = *a;

is a data race, and

	thread 1		thread 2
	--------		--------
    	atomic_store(a,1);	r1 = atomic_load(a);

is not.

In the memcpy() case, we don't need the whole copy to be a single
atomic operation, so as long as atomicity is guaranteed at the byte
level (larger is fine because a 2-byte atomic is still byte atomic), it
should be sufficient as a concurrent-safe memcpy().
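
To make the difference concrete, here is a small userspace sketch in Rust
(not kernel code). The function name `racing_snapshot` and the 0x00/0xFF
byte pattern are assumptions chosen for illustration. A writer and a reader
touch the same bytes concurrently; because every access is a relaxed
`AtomicU8` operation, the program has no data race, although the reader may
well observe a torn buffer.

```rust
use std::sync::atomic::{AtomicU8, Ordering};
use std::thread;

/// Starts a writer that flips every byte from 0x00 to 0xFF and a reader
/// that copies the buffer at the same time, then returns the reader's copy.
///
/// The copy may contain any mix of 0x00 and 0xFF bytes, depending on how the
/// two threads interleave.
fn racing_snapshot(len: usize) -> Vec<u8> {
    let shared: Vec<AtomicU8> = (0..len).map(|_| AtomicU8::new(0x00)).collect();

    thread::scope(|s| {
        // Writer: one relaxed atomic store per byte.
        let _writer = s.spawn(|| {
            for b in &shared {
                b.store(0xFF, Ordering::Relaxed);
            }
        });

        // Reader: one relaxed atomic load per byte.
        let reader = s.spawn(|| {
            shared
                .iter()
                .map(|b| b.load(Ordering::Relaxed))
                .collect::<Vec<u8>>()
        });

        reader.join().unwrap()
    })
}
```

The only property a caller can rely on is per byte: each element of the
result is either 0x00 or 0xFF. Writing the same loops with plain `u8`
accesses through raw pointers would be a data race under the Rust and C
language models, even if the generated machine code were identical.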

So either we want to live in a world where

"concurrent normal accesses with at least one being a write are data races
and therefore UB; use the corresponding atomic API in this case and handle
the data carefully with concurrent accesses in mind",

or we want to live in a world where

"concurrent normal accesses with at least one being a write are data races
and therefore UB, but there are 17 or more APIs which are technically UB
yet are not considered UB in the kernel; use them".

To me, having an atomic_bytewise_memcpy() at least makes clear what is
actually needed (at the very minimum) to have a concurrent-safe
memcpy(). Moving forward, since the concept has already been
proposed to C/C++, it's likely to be standardized (we can push it from
the kernel end as well) so we don't need to implement a concurrent-safe
memcpy() for all architectures on our own.

Hope this makes some sense ;-)

Regards,
Boqun

> This all feels like playing tricks to placate the type system for
> something that isn't actually a problem in practice. But I think I'm
> probably at least as confused as Peter :)
> 

> Will
Re: [PATCH v3] rust: page: add byte-wise atomic memory copy methods
Posted by Peter Zijlstra 1 month, 1 week ago
On Tue, Feb 17, 2026 at 09:10:42AM -0800, Boqun Feng wrote:

> We are worried about two racing memcpy()s end up being data race and
> that's undefined behavior. And "atomic" is the key word in C (and Rust)
> to "lift" normal accesses to non-data-race, for example:

I hate people for calling that atomic. It has nothing to do with
atomics.

> 
> 	thread 1		thread 2
> 	--------		--------
>     	*a = 1;			r1 = *a;
> 
> is data race, and 
> 
> 	thread 1		thread 2
> 	--------		--------
>     	atomic_store(a,1);	r1 = atomic_load(a);
> 
> is not.

At the end of the day, they're both the bloody same thing, no matter
what you call them :-( All this UB nonsense is just compiler people
being silly.

> In memcpy() case, since we don't need the whole copy to be a single
> atomic operation, so as long as the atomicity is guaranteed at byte
> level (larger is fine because 2byte atomic is still byte atomic), it
> should be sufficient as a concurrent-safe memcpy().

But this is every memcpy(), ever :/

> So either we want to live in a world where
> 
> "concurrent normal accesses with at least one being write are data race
> therefore UBs, use the corresponding atomic API in this case and handle
> the data carefully with concurrent accesses in mind".
> 
> or we want to live in a world where
> 
> "concurrent normal accesses with at least one being write are data race
> therefore UBs, but there are 17 and more API which are technically UBs,
> but they are not considered as UBs in kernel, use them"
> 
> To me, having a atomic_bytewise_memcpy() at least clear things out about
> what is actually needed (at the very minimal) to have a concurrent-safe
> memcpy().

I'm still not seeing what it does over any other memcpy(), except you
created one more API, so now we have 18 :-(

> Moving forward, since the concept has been already somehow
> proposed to C/C++, it's likely to be standardized (we can push it from
> the kernel end as well) so we don't need to implement a concurrent-safe
> memcpy() for all architectures on our own.
> 
> Hope this makes some sense ;-)

I'm still not seeing it. All memcpy() implementations are already
meeting the criteria you want. There is nothing to implement. And I
really don't see the point in creating: magical_memcpy() that is
*identical* to every other memcpy() we already have.

AFAICT the only problem here is that from:

  https://lkml.kernel.org/r/20260218083754.GB2995752@noisy.programming.kicks-ass.net
Re: [PATCH v3] rust: page: add byte-wise atomic memory copy methods
Posted by Peter Zijlstra 1 month, 1 week ago
On Tue, Feb 17, 2026 at 09:10:42AM -0800, Boqun Feng wrote:

> To me, having a atomic_bytewise_memcpy() at least clear things out about
> what is actually needed (at the very minimal) to have a concurrent-safe
> memcpy(). Moving forward, since the concept has been already somehow
> proposed to C/C++, it's likely to be standardized (we can push it from
> the kernel end as well) so we don't need to implement a concurrent-safe
> memcpy() for all architectures on our own.
> 
> Hope this makes some sense ;-)

So all of this is about compilers being silly, not about there being a
problem with memcpy().

memcpy() as implemented by all architectures is *FINE*.

The proposal you refer to is:

  https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p1478r7.html

(as shared by Will on IRC) and right from the start it goes sideways:

  "But fences only order atomic accesses,"

This is of course complete and utter bollocks. 

No actual hardware works that way, so for the C virtual machine to be
specified that way is complete insanity. This insanity then leads to all
sorts of problems, and then these imaginary problems need solutions and
we're up a creek that smells real bad.

Can we please just 'fix' things by stating that *all* loads and stores
are properly affected by the relevant barriers. Then things like:

  https://lkml.kernel.org/r/cbbea9ecb994df975109d97a7756d73e@garyguo.net

also instantly resolve themselves, because the compiler just isn't
allowed to be *that* stupid^Wclever.
Re: [PATCH v3] rust: page: add byte-wise atomic memory copy methods
Posted by Andreas Hindborg 1 month, 2 weeks ago
Boqun Feng <boqun@kernel.org> writes:

> On Fri, Feb 13, 2026 at 07:42:53AM +0100, Andreas Hindborg wrote:
> [...]
>> diff --git a/rust/kernel/sync/atomic.rs b/rust/kernel/sync/atomic.rs
>> index 4aebeacb961a2..8ab20126a88cf 100644
>> --- a/rust/kernel/sync/atomic.rs
>> +++ b/rust/kernel/sync/atomic.rs
>> @@ -560,3 +560,35 @@ pub fn fetch_add<Rhs, Ordering: ordering::Ordering>(&self, v: Rhs, _: Ordering)
>>          unsafe { from_repr(ret) }
>>      }
>>  }
>> +
>> +/// Copy `len` bytes from `src` to `dst` using byte-wise atomic operations.
>> +///
>
> Given Greg and Peter's feedback, I think it's better to call out why we
> need `atomic_per_byte_memcpy()` and why we use bindings::memcpy() to
> implement it. How about a paragraph as follow:
>
> /// This is the concurrent-safe version of `core::ptr::copy()` (the
> /// counterpart of standard C's `memcpy()`). Because of the atomicity at
> /// byte level, when racing with another concurrent atomic access (or
> /// a normal read races with an atomic read) or an external access (from
> /// DMA or userspace), the behavior of this function is defined:
> /// copying memory at the (at least) byte granularity.
> ///
> /// Implementation note: it's currently implemented by kernel's
> /// `memcpy()`, because kernel's `memcpy()` is implemented in a way that
> /// byte-wise atomic memory load/store instructions are used.
>
> And probably we make it a separate patch for this
> atomic_per_byte_memcpy().

Sure, I'll queue that.


Best regards,
Andreas Hindborg
Re: [PATCH v3] rust: page: add byte-wise atomic memory copy methods
Posted by Peter Zijlstra 1 month, 2 weeks ago
On Fri, Feb 13, 2026 at 07:42:53AM +0100, Andreas Hindborg wrote:
> When copying data from buffers that are mapped to user space, it is
> impossible to guarantee absence of concurrent memory operations on those
> buffers. Copying data to/from `Page` from/to these buffers would be
> undefined behavior if no special considerations are made.
> 
> Add methods on `Page` to read and write the contents using byte-wise atomic
> operations.
> 
> Also improve clarity by specifying additional requirements on
> `read_raw`/`write_raw` methods regarding concurrent operations on involved
> buffers.


> +    /// - Callers must ensure that this call does not race with a write to the source page that
> +    ///   overlaps with this read.

Yeah, but per the bit above, it's user mapped, you *CANNOT* ensure this.

And same comment as for v2, none of this makes sense. Byte loads are not
magically atomic. And they don't actually fix anything.

NAK
Re: [PATCH v3] rust: page: add byte-wise atomic memory copy methods
Posted by Boqun Feng 1 month, 2 weeks ago
On Fri, Feb 13, 2026 at 12:28:37PM +0100, Peter Zijlstra wrote:
> On Fri, Feb 13, 2026 at 07:42:53AM +0100, Andreas Hindborg wrote:
> > When copying data from buffers that are mapped to user space, it is
> > impossible to guarantee absence of concurrent memory operations on those
> > buffers. Copying data to/from `Page` from/to these buffers would be
> > undefined behavior if no special considerations are made.
> > 
> > Add methods on `Page` to read and write the contents using byte-wise atomic
> > operations.
> > 
> > Also improve clarity by specifying additional requirements on
> > `read_raw`/`write_raw` methods regarding concurrent operations on involved
> > buffers.
> 
> 
> > +    /// - Callers must ensure that this call does not race with a write to the source page that
> > +    ///   overlaps with this read.
> 
> Yeah, but per the bit above, its user mapped, you *CANNOT* ensure this.
> 

First, this safety requirement is actually incorrect, because of the
user mapped case you mentioned. I believe Andreas put it to prevent
others from racing with memcpy(), e.g.

	(I'm flipping the read/write here between normal access and
	memcpy, but it's the same)

	CPU 0				CPU 1
	=====				=====
	let ptr: *mut i32 = ..; // a pointer to a 32 bit integer
					let x = 42;

					memcpy(ptr, &raw x);
	let v = *ptr; // <- the result of this is UB because of data
		      // race.

It's a data race in C as well:

	CPU 0				CPU 1
	=====				=====
	int *ptr = ..;			int x = 42;
				
	int r0 = *ptr;			memcpy(ptr, &x);

But this is already covered by the previous safety requirement on "no
data races". Hence this safety requirement is redundant and incorrect to
me as well.

> And same comment as for v2, none of this makes sense. Byte loads are not
> magically atomic. And they don't actually fix anything.
> 

The problem that byte-wise atomic memcpy "solves" is that the normal C
standard memcpy() and Rust's `core::ptr::copy()` must not race with other
memory accesses. We don't have this problem with our C memcpy(), because
it's implemented in a way that accesses are atomic at the byte level, i.e.
you cannot observe a torn byte.

(Note, the atomic part is indeed not necessary if you only need to
memcpy() a user mapped memory, but in Andreas' use case, I believe the
same code is shared between the "copying from user mapped memory" scenario
and the "copying from in-kernel memory" scenario; for the latter we need
both sides to be byte-wise atomic to avoid data races)

Of course, it's not magical on its own:

* when racing with in-kernel accesses, the other accesses need to be
  atomic, or be reads of a location that the byte-wise atomic copy is
  only reading from.

* when racing with external accesses (for example, userspace), the 
  kernel code needs to deal with whatever the userspace can do.

Regards,
Boqun

> NAK
Re: [PATCH v3] rust: page: add byte-wise atomic memory copy methods
Posted by Andreas Hindborg 1 month, 2 weeks ago
Boqun Feng <boqun@kernel.org> writes:

> On Fri, Feb 13, 2026 at 12:28:37PM +0100, Peter Zijlstra wrote:
>> On Fri, Feb 13, 2026 at 07:42:53AM +0100, Andreas Hindborg wrote:
>> > When copying data from buffers that are mapped to user space, it is
>> > impossible to guarantee absence of concurrent memory operations on those
>> > buffers. Copying data to/from `Page` from/to these buffers would be
>> > undefined behavior if no special considerations are made.
>> > 
>> > Add methods on `Page` to read and write the contents using byte-wise atomic
>> > operations.
>> > 
>> > Also improve clarity by specifying additional requirements on
>> > `read_raw`/`write_raw` methods regarding concurrent operations on involved
>> > buffers.
>> 
>> 
>> > +    /// - Callers must ensure that this call does not race with a write to the source page that
>> > +    ///   overlaps with this read.
>> 
>> Yeah, but per the bit above, its user mapped, you *CANNOT* ensure this.
>> 
>
> First, this safety requirement is actually incorrect, because of the
> user mapped case you mentioned. I believe Andreas put it to prevent
> others from racing with memcpy(), e.g.

Since context is a bit washed out here, let's make sure we are talking
about `Page::read_bytewise_atomic`.

There are two buffers in play. `src`, which is provided by the `self:
&Page` and `dst: *mut u8`, which is passed as a function parameter.

The requirement for `src` is:

    Callers must ensure that this call does not race with a write to the **source page** that
    overlaps with this read.

This requirement is different than the requirement on `dst`. I do not
want to enforce that all memory operations on `src` be atomic, simply
that they are synchronized. This is a weaker requirement than the
requirement on `dst`. As we hold a shared reference to `self` and there
is no internal synchronization, I think this is the correct requirement.

For `dst` we have:

    For the duration of the call, other accesses to the area described by `dst` and `len`,
    must not cause data races (defined by [`LKMM`]) against atomic operations executed by this
    function. Note that if all other accesses are atomic, then this safety requirement is
    trivially fulfilled.

Which is also requiring no races, but is specifically mentioning atomic
operations, which I did not want on `src`.

With this in mind, do you still think they are redundant?


Best regards,
Andreas Hindborg
Re: [PATCH v3] rust: page: add byte-wise atomic memory copy methods
Posted by Boqun Feng 1 month, 1 week ago
On Sat, Feb 14, 2026 at 09:18:16AM +0100, Andreas Hindborg wrote:
> Boqun Feng <boqun@kernel.org> writes:
> 
> > On Fri, Feb 13, 2026 at 12:28:37PM +0100, Peter Zijlstra wrote:
> >> On Fri, Feb 13, 2026 at 07:42:53AM +0100, Andreas Hindborg wrote:
> >> > When copying data from buffers that are mapped to user space, it is
> >> > impossible to guarantee absence of concurrent memory operations on those
> >> > buffers. Copying data to/from `Page` from/to these buffers would be
> >> > undefined behavior if no special considerations are made.
> >> > 
> >> > Add methods on `Page` to read and write the contents using byte-wise atomic
> >> > operations.
> >> > 
> >> > Also improve clarity by specifying additional requirements on
> >> > `read_raw`/`write_raw` methods regarding concurrent operations on involved
> >> > buffers.
> >> 
> >> 
> >> > +    /// - Callers must ensure that this call does not race with a write to the source page that
> >> > +    ///   overlaps with this read.
> >> 
> >> Yeah, but per the bit above, its user mapped, you *CANNOT* ensure this.
> >> 
> >
> > First, this safety requirement is actually incorrect, because of the
> > user mapped case you mentioned. I believe Andreas put it to prevent
> > others from racing with memcpy(), e.g.
> 
> Since context is a bit washed out here, let's make sure we are talking
> about `Page::read_bytewise_atomic``.
> 
> There are two buffers in play. `src`, which is provided by the `self:
> &Page` and `dst: *mut u8`, which is passed as a function parameter.
> 
> The requirement for `src` is:
> 
>     Callers must ensure that this call does not race with a write to the **source page** that
>     overlaps with this read.
> 
> This requirement is different than the requirement on `dst`. I do not
> want to enforce that all memory operations on `src` be atomic, simply
> that they are synchronized. This is a weaker requirement than the
> requirement on `dst`. As we hold a shared reference to `self` and there
> is no internal synchronization, I think this is the correct requirement.
> 
> For `dst` we have:
> 
>     For the duration of the call, other accesses to the area described by `dst` and `len`,
>     must not cause data races (defined by [`LKMM`]) against atomic operations executed by this
>     function. Note that if all other accesses are atomic, then this safety requirement is
>     trivially fulfilled.
> 
> Which is also requiring no races, but is specifically mentioning atomic
> operations, which I did not want on `src`.
> 
> With this in mind, do you still think they are redundant?
> 

I see, I overlooked that and was confused, and I guess Peter was too. The
`dst` could be a user mapped page, but the `src` cannot be. I.e. it's a
function that copies from a private `src` to a public `dst`, so of course
you can request no concurrent access to the private `src`. So his objection
is invalid.

It's not redundant, but probably we can make it more clear. Maybe we
start by saying:

/// Notice this function is guaranteed to perform an atomic byte-wise
/// memory write to the `dst` side but it may only perform a normal
/// memory read from the [`Page`] `self`.

Then

/// # Safety
///
/// Callers must ensure that:
/// 
/// - `dst` ..
/// - For ..
/// - No concurrent write to the source page `self` that overlaps with
///   the read.

or anything else that could carve out the difference between `src` and
`dst`.
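
A userspace sketch of that asymmetry, using safe Rust types in place of the
safety comments. The name `read_into_shared` is hypothetical and this is
not the proposed `Page` method: the `&[u8]` source stands in for a private
page that nobody writes concurrently, and the `&[AtomicU8]` destination
stands in for memory that other parties may access during the copy.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

/// Copies from a private source into a possibly shared destination.
///
/// - `src` is a plain shared slice, so the borrow checker already rules out
///   concurrent writes to it; normal reads are enough.
/// - `dst` may be accessed by other threads during the call, so every write
///   to it is a relaxed atomic store of a single byte.
fn read_into_shared(src: &[u8], dst: &[AtomicU8]) {
    assert_eq!(src.len(), dst.len());
    for (s, d) in src.iter().zip(dst) {
        d.store(*s, Ordering::Relaxed);
    }
}
```

In the kernel the two sides are raw pointers, so the equivalent guarantees
have to be stated as safety requirements instead of being encoded in types.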

Regards,
Boqun

> 
> Best regards,
> Andreas Hindborg
> 
>
Re: [PATCH v3] rust: page: add byte-wise atomic memory copy methods
Posted by Andreas Hindborg 1 month, 2 weeks ago
"Peter Zijlstra" <peterz@infradead.org> writes:

> On Fri, Feb 13, 2026 at 07:42:53AM +0100, Andreas Hindborg wrote:
>> When copying data from buffers that are mapped to user space, it is
>> impossible to guarantee absence of concurrent memory operations on those
>> buffers. Copying data to/from `Page` from/to these buffers would be
>> undefined behavior if no special considerations are made.
>>
>> Add methods on `Page` to read and write the contents using byte-wise atomic
>> operations.
>>
>> Also improve clarity by specifying additional requirements on
>> `read_raw`/`write_raw` methods regarding concurrent operations on involved
>> buffers.
>
>
>> +    /// - Callers must ensure that this call does not race with a write to the source page that
>> +    ///   overlaps with this read.
>
> Yeah, but per the bit above, its user mapped, you *CANNOT* ensure this.

Not all pages are user mapped. If `self` is user mapped, you cannot use
this function. As you say, it would not be possible to satisfy the
safety precondition.

If `self` is only mapped in the kernel and if you can guarantee that
there are no other concurrent writes to it, you can use this function.

>
> And same comment as for v2, none of this makes sense. Byte loads are not
> magically atomic. And they don't actually fix anything.

I am curious: on what architectures can byte loads tear?


Best regards,
Andreas Hindborg
Re: [PATCH v3] rust: page: add byte-wise atomic memory copy methods
Posted by Peter Zijlstra 1 month, 2 weeks ago
On Fri, Feb 13, 2026 at 01:45:28PM +0100, Andreas Hindborg wrote:

> > And same comment as for v2, none of this makes sense. Byte loads are not
> > magically atomic. And they don't actually fix anything.
> 
> I am curious about on what architectures byte loads can tear?

That's not the point. It's just a byte load, not an atomic byte load.

Unless you're thinking of using load-exclusive or something along those
lines; but that would be completely insane, nor actually solve the
problem.