[v6] LKMM generic atomics in Rust

[PATCH v6 8/9] rust: sync: Add memory barriers

Posted by Boqun Feng 7 months ago

Memory barriers are building blocks for concurrent code, hence provide
a minimal set of them.

The compiler barrier, barrier(), is implemented in inline asm instead of
using core::sync::atomic::compiler_fence() because memory models are
different: kernel's atomics are implemented in inline asm therefore the
compiler barrier should be implemented in inline asm as well. Also it's
currently only public to the kernel crate until there's a reasonable
driver usage.

Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
---
 rust/helpers/barrier.c      | 18 ++++++++++
 rust/helpers/helpers.c      |  1 +
 rust/kernel/sync.rs         |  1 +
 rust/kernel/sync/barrier.rs | 65 +++++++++++++++++++++++++++++++++++++
 4 files changed, 85 insertions(+)
 create mode 100644 rust/helpers/barrier.c
 create mode 100644 rust/kernel/sync/barrier.rs

diff --git a/rust/helpers/barrier.c b/rust/helpers/barrier.c
new file mode 100644
index 000000000000..cdf28ce8e511
--- /dev/null
+++ b/rust/helpers/barrier.c
@@ -0,0 +1,18 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <asm/barrier.h>
+
+void rust_helper_smp_mb(void)
+{
+	smp_mb();
+}
+
+void rust_helper_smp_wmb(void)
+{
+	smp_wmb();
+}
+
+void rust_helper_smp_rmb(void)
+{
+	smp_rmb();
+}
diff --git a/rust/helpers/helpers.c b/rust/helpers/helpers.c
index 83e89f6a68fb..8ddfc8f84e87 100644
--- a/rust/helpers/helpers.c
+++ b/rust/helpers/helpers.c
@@ -9,6 +9,7 @@
 
 #include "atomic.c"
 #include "auxiliary.c"
+#include "barrier.c"
 #include "blk.c"
 #include "bug.c"
 #include "build_assert.c"
diff --git a/rust/kernel/sync.rs b/rust/kernel/sync.rs
index b620027e0641..c7c0e552bafe 100644
--- a/rust/kernel/sync.rs
+++ b/rust/kernel/sync.rs
@@ -11,6 +11,7 @@
 
 mod arc;
 pub mod atomic;
+pub mod barrier;
 mod condvar;
 pub mod lock;
 mod locked_by;
diff --git a/rust/kernel/sync/barrier.rs b/rust/kernel/sync/barrier.rs
new file mode 100644
index 000000000000..df4015221503
--- /dev/null
+++ b/rust/kernel/sync/barrier.rs
@@ -0,0 +1,65 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Memory barriers.
+//!
+//! These primitives have the same semantics as their C counterparts: and the precise definitions
+//! of semantics can be found at [`LKMM`].
+//!
+//! [`LKMM`]: srctree/tools/memory-model/
+
+/// A compiler barrier.
+///
+/// A barrier that prevents compiler from reordering memory accesses across the barrier.
+pub(crate) fn barrier() {
+    // By default, Rust inline asms are treated as being able to access any memory or flags, hence
+    // it suffices as a compiler barrier.
+    //
+    // SAFETY: An empty asm block should be safe.
+    unsafe {
+        core::arch::asm!("");
+    }
+}
+
+/// A full memory barrier.
+///
+/// A barrier that prevents compiler and CPU from reordering memory accesses across the barrier.
+pub fn smp_mb() {
+    if cfg!(CONFIG_SMP) {
+        // SAFETY: `smp_mb()` is safe to call.
+        unsafe {
+            bindings::smp_mb();
+        }
+    } else {
+        barrier();
+    }
+}
+
+/// A write-write memory barrier.
+///
+/// A barrier that prevents compiler and CPU from reordering memory write accesses across the
+/// barrier.
+pub fn smp_wmb() {
+    if cfg!(CONFIG_SMP) {
+        // SAFETY: `smp_wmb()` is safe to call.
+        unsafe {
+            bindings::smp_wmb();
+        }
+    } else {
+        barrier();
+    }
+}
+
+/// A read-read memory barrier.
+///
+/// A barrier that prevents compiler and CPU from reordering memory read accesses across the
+/// barrier.
+pub fn smp_rmb() {
+    if cfg!(CONFIG_SMP) {
+        // SAFETY: `smp_rmb()` is safe to call.
+        unsafe {
+            bindings::smp_rmb();
+        }
+    } else {
+        barrier();
+    }
+}
-- 
2.39.5 (Apple Git-154)

Re: [PATCH v6 8/9] rust: sync: Add memory barriers

Posted by Benno Lossin 7 months ago

On Thu Jul 10, 2025 at 8:00 AM CEST, Boqun Feng wrote:
> diff --git a/rust/kernel/sync/barrier.rs b/rust/kernel/sync/barrier.rs
> new file mode 100644
> index 000000000000..df4015221503
> --- /dev/null
> +++ b/rust/kernel/sync/barrier.rs
> @@ -0,0 +1,65 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! Memory barriers.
> +//!
> +//! These primitives have the same semantics as their C counterparts: and the precise definitions
> +//! of semantics can be found at [`LKMM`].
> +//!
> +//! [`LKMM`]: srctree/tools/memory-model/
> +
> +/// A compiler barrier.
> +///
> +/// A barrier that prevents compiler from reordering memory accesses across the barrier.
> +pub(crate) fn barrier() {
> +    // By default, Rust inline asms are treated as being able to access any memory or flags, hence
> +    // it suffices as a compiler barrier.

I don't know about this, but it also isn't my area of expertise... I
think I heard Ralf talk about this at Rust Week, but I don't remember...

> +    //
> +    // SAFETY: An empty asm block should be safe.

    // SAFETY: An empty asm block.

> +    unsafe {
> +        core::arch::asm!("");
> +    }

    unsafe { core::arch::asm!("") };

> +}
> +
> +/// A full memory barrier.
> +///
> +/// A barrier that prevents compiler and CPU from reordering memory accesses across the barrier.
> +pub fn smp_mb() {
> +    if cfg!(CONFIG_SMP) {
> +        // SAFETY: `smp_mb()` is safe to call.
> +        unsafe {
> +            bindings::smp_mb();

Does this really work? How does the Rust compiler know this is a memory
barrier?

---
Cheers,
Benno

> +        }
> +    } else {
> +        barrier();
> +    }
> +}

Re: [PATCH v6 8/9] rust: sync: Add memory barriers

Posted by Boqun Feng 7 months ago

On Fri, Jul 11, 2025 at 10:57:48AM +0200, Benno Lossin wrote:
> On Thu Jul 10, 2025 at 8:00 AM CEST, Boqun Feng wrote:
> > diff --git a/rust/kernel/sync/barrier.rs b/rust/kernel/sync/barrier.rs
> > new file mode 100644
> > index 000000000000..df4015221503
> > --- /dev/null
> > +++ b/rust/kernel/sync/barrier.rs
> > @@ -0,0 +1,65 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +//! Memory barriers.
> > +//!
> > +//! These primitives have the same semantics as their C counterparts: and the precise definitions
> > +//! of semantics can be found at [`LKMM`].
> > +//!
> > +//! [`LKMM`]: srctree/tools/memory-model/
> > +
> > +/// A compiler barrier.
> > +///
> > +/// A barrier that prevents compiler from reordering memory accesses across the barrier.
> > +pub(crate) fn barrier() {
> > +    // By default, Rust inline asms are treated as being able to access any memory or flags, hence
> > +    // it suffices as a compiler barrier.
> 
> I don't know about this, but it also isn't my area of expertise... I
> think I heard Ralf talk about this at Rust Week, but I don't remember...
> 

Easy, let's Cc Ralf ;-)

Ralf, I believe the question here is:

In kernel C, we define a compiler barrier (barrier()), which is
implemented as:

# define barrier() __asm__ __volatile__("": : :"memory")

Now we want to have a Rust version, and I think an empty `asm!()` should
be enough as an equivalent as a barrier() in C, because an empty
`asm!()` in Rust implies "memory" as the clobber:

	https://godbolt.org/z/3z3fnWYjs

?

I know you have some opinions on C++ compiler_fence() [1]. But in LKMM,
barrier() and other barriers work for all memory accesses not just
atomics, so the problem "So, if your program contains no atomic
accesses, but some atomic fences, those fences do nothing." doesn't
exist for us. And our barrier() is strictly weaker than other barriers.

And based on my understanding of the consensus on Rust vs LKMM, "do
whatever kernel C does and rely on whatever kernel C relies" is the
general suggestion, so I think an empty `asm!()` works here. Of course
if in practice, we find an issue, I'm happy to look for solutions ;-)

Thoughts?

[1]: https://github.com/rust-lang/unsafe-code-guidelines/issues/347

Regards,
Boqun

> > +    //
> > +    // SAFETY: An empty asm block should be safe.
> 
>     // SAFETY: An empty asm block.
> 
> > +    unsafe {
> > +        core::arch::asm!("");
> > +    }
> 
>     unsafe { core::arch::asm!("") };
> 
> > +}
> > +
> > +/// A full memory barrier.
> > +///
> > +/// A barrier that prevents compiler and CPU from reordering memory accesses across the barrier.
> > +pub fn smp_mb() {
> > +    if cfg!(CONFIG_SMP) {
> > +        // SAFETY: `smp_mb()` is safe to call.
> > +        unsafe {
> > +            bindings::smp_mb();
> 
> Does this really work? How does the Rust compiler know this is a memory
> barrier?
> 
> ---
> Cheers,
> Benno
> 
> > +        }
> > +    } else {
> > +        barrier();
> > +    }
> > +}

Re: [PATCH v6 8/9] rust: sync: Add memory barriers

Posted by Ralf Jung 6 months, 4 weeks ago

Hi all,

On 11.07.25 20:20, Boqun Feng wrote:
> On Fri, Jul 11, 2025 at 10:57:48AM +0200, Benno Lossin wrote:
>> On Thu Jul 10, 2025 at 8:00 AM CEST, Boqun Feng wrote:
>>> diff --git a/rust/kernel/sync/barrier.rs b/rust/kernel/sync/barrier.rs
>>> new file mode 100644
>>> index 000000000000..df4015221503
>>> --- /dev/null
>>> +++ b/rust/kernel/sync/barrier.rs
>>> @@ -0,0 +1,65 @@
>>> +// SPDX-License-Identifier: GPL-2.0
>>> +
>>> +//! Memory barriers.
>>> +//!
>>> +//! These primitives have the same semantics as their C counterparts: and the precise definitions
>>> +//! of semantics can be found at [`LKMM`].
>>> +//!
>>> +//! [`LKMM`]: srctree/tools/memory-model/
>>> +
>>> +/// A compiler barrier.
>>> +///
>>> +/// A barrier that prevents compiler from reordering memory accesses across the barrier.
>>> +pub(crate) fn barrier() {
>>> +    // By default, Rust inline asms are treated as being able to access any memory or flags, hence
>>> +    // it suffices as a compiler barrier.
>>
>> I don't know about this, but it also isn't my area of expertise... I
>> think I heard Ralf talk about this at Rust Week, but I don't remember...
>>
> 
> Easy, let's Cc Ralf ;-)
> 
> Ralf, I believe the question here is:
> 
> In kernel C, we define a compiler barrier (barrier()), which is
> implemented as:
> 
> # define barrier() __asm__ __volatile__("": : :"memory")
> 
> Now we want to have a Rust version, and I think an empty `asm!()` should
> be enough as an equivalent as a barrier() in C, because an empty
> `asm!()` in Rust implies "memory" as the clobber:
> 
> 	https://godbolt.org/z/3z3fnWYjs
> 
> ?
> 
> I know you have some opinions on C++ compiler_fence() [1]. But in LKMM,
> barrier() and other barriers work for all memory accesses not just
> atomics, so the problem "So, if your program contains no atomic
> accesses, but some atomic fences, those fences do nothing." doesn't
> exist for us. And our barrier() is strictly weaker than other barriers.
> 
> And based on my understanding of the consensus on Rust vs LKMM, "do
> whatever kernel C does and rely on whatever kernel C relies" is the
> general suggestion, so I think an empty `asm!()` works here. Of course
> if in practice, we find an issue, I'm happy to look for solutions ;-)
> 
> Thoughts?
> 
> [1]: https://github.com/rust-lang/unsafe-code-guidelines/issues/347

If I understood correctly, this is about using "compiler barriers" to order 
volatile accesses that the LKMM uses in lieu of atomic accesses?
I can't give a principled answer here, unfortunately -- as you know, the mapping 
of LKMM through the compiler isn't really in a state where we can make 
principled formal statements. And making principled formal statements is my main 
expertise so I am a bit out of my depth here. ;)

So I agree with your 2nd paragraph: I would say just like the fact that you are 
using volatile accesses in the first place, this falls under "do whatever the C 
code does, it shouldn't be any more broken in Rust than it is in C".

However, saying that it in general "prevents reordering all memory accesses" is 
unlikely to be fully correct -- if the compiler can prove that the inline asm 
block could not possibly have access to a local variable (e.g. because it never 
had its address taken), its accesses can still be reordered. This applies both 
to C compilers and Rust compilers. Extra annotations such as `noalias` (or 
`restrict` in C) can also give rise to reorderings around arbitrary code, 
including such barriers. This is not a problem for concurrent code since it 
would anyway be wrong to claim that some pointer doesn't have aliases when it is 
accessed by multiple threads, but it shows that the framing of barriers in terms 
of preventing reordering of accesses is too imprecise. That's why the C++ memory 
model uses a very different framing, and that's why I can't give a definite 
answer here. :)

Kind regards,
Ralf

Re: [PATCH v6 8/9] rust: sync: Add memory barriers

Posted by Boqun Feng 6 months, 4 weeks ago

On Mon, Jul 14, 2025 at 05:42:39PM +0200, Ralf Jung wrote:
> Hi all,
> 
> On 11.07.25 20:20, Boqun Feng wrote:
> > On Fri, Jul 11, 2025 at 10:57:48AM +0200, Benno Lossin wrote:
> > > On Thu Jul 10, 2025 at 8:00 AM CEST, Boqun Feng wrote:
> > > > diff --git a/rust/kernel/sync/barrier.rs b/rust/kernel/sync/barrier.rs
> > > > new file mode 100644
> > > > index 000000000000..df4015221503
> > > > --- /dev/null
> > > > +++ b/rust/kernel/sync/barrier.rs
> > > > @@ -0,0 +1,65 @@
> > > > +// SPDX-License-Identifier: GPL-2.0
> > > > +
> > > > +//! Memory barriers.
> > > > +//!
> > > > +//! These primitives have the same semantics as their C counterparts: and the precise definitions
> > > > +//! of semantics can be found at [`LKMM`].
> > > > +//!
> > > > +//! [`LKMM`]: srctree/tools/memory-model/
> > > > +
> > > > +/// A compiler barrier.
> > > > +///
> > > > +/// A barrier that prevents compiler from reordering memory accesses across the barrier.
> > > > +pub(crate) fn barrier() {
> > > > +    // By default, Rust inline asms are treated as being able to access any memory or flags, hence
> > > > +    // it suffices as a compiler barrier.
> > > 
> > > I don't know about this, but it also isn't my area of expertise... I
> > > think I heard Ralf talk about this at Rust Week, but I don't remember...
> > > 
> > 
> > Easy, let's Cc Ralf ;-)
> > 
> > Ralf, I believe the question here is:
> > 
> > In kernel C, we define a compiler barrier (barrier()), which is
> > implemented as:
> > 
> > # define barrier() __asm__ __volatile__("": : :"memory")
> > 
> > Now we want to have a Rust version, and I think an empty `asm!()` should
> > be enough as an equivalent as a barrier() in C, because an empty
> > `asm!()` in Rust implies "memory" as the clobber:
> > 
> > 	https://godbolt.org/z/3z3fnWYjs
> > 
> > ?
> > 
> > I know you have some opinions on C++ compiler_fence() [1]. But in LKMM,
> > barrier() and other barriers work for all memory accesses not just
> > atomics, so the problem "So, if your program contains no atomic
> > accesses, but some atomic fences, those fences do nothing." doesn't
> > exist for us. And our barrier() is strictly weaker than other barriers.
> > 
> > And based on my understanding of the consensus on Rust vs LKMM, "do
> > whatever kernel C does and rely on whatever kernel C relies" is the
> > general suggestion, so I think an empty `asm!()` works here. Of course
> > if in practice, we find an issue, I'm happy to look for solutions ;-)
> > 
> > Thoughts?
> > 
> > [1]: https://github.com/rust-lang/unsafe-code-guidelines/issues/347
> 
> If I understood correctly, this is about using "compiler barriers" to order
> volatile accesses that the LKMM uses in lieu of atomic accesses?
> I can't give a principled answer here, unfortunately -- as you know, the
> mapping of LKMM through the compiler isn't really in a state where we can
> make principled formal statements. And making principled formal statements
> is my main expertise so I am a bit out of my depth here. ;)
> 

Understood ;-)

> So I agree with your 2nd paragraph: I would say just like the fact that you
> are using volatile accesses in the first place, this falls under "do
> whatever the C code does, it shouldn't be any more broken in Rust than it is
> in C".
> 
> However, saying that it in general "prevents reordering all memory accesses"
> is unlikely to be fully correct -- if the compiler can prove that the inline
> asm block could not possibly have access to a local variable (e.g. because
> it never had its address taken), its accesses can still be reordered. This
> applies both to C compilers and Rust compilers. Extra annotations such as
> `noalias` (or `restrict` in C) can also give rise to reorderings around
> arbitrary code, including such barriers. This is not a problem for
> concurrent code since it would anyway be wrong to claim that some pointer
> doesn't have aliases when it is accessed by multiple threads, but it shows

Right, it shouldn't be a problem for most of the concurrent code, and
thank you for bringing this up. I believe we can rely on the barrier
behavior if the memory accesses on both sides are done via aliased
references/pointers, which should be the same as C code relies on.

One thing though is we don't use much of `restrict` in kernel C, so I
wonder the compiler's behavior in the following code:

    let mut x = KBox::new_uninit(GFP_KERNEL)?;
    // ^ KBox is our own Box implementation based on kmalloc(), and it
    // accepts a flag in new*() functions for different allocation
    // behavior (can sleep or not, etc), of course we want it to behave
    // like an std Box in term of aliasing.

    let x = KBox::write(x, foo); // A

    smp_mb():
      // using Rust asm!() for explanation, it's really implemented in
      // C.
      asm!("mfence");

    let a: &Atomic<*mut Foo> = ...; // `a` was null initially.

    a.store(KBox::into_raw(x), Relaxed); // B

Now we obviously want A and B to be ordered, because smp_mb() is
supposed to be stronger than Release ordering. So if another thread does
an Acquire read or uses address dependency:

    let a: &Atomic<*mut Foo> = ...;
    let foo_ptr = a.load(Acquire); // or load(Relaxed);

    if !foo_ptr.is_null() {
        let y: KBox<Foo> = unsafe { KBox::from_raw(foo_ptr) };
	// ^ this should be safe.
    }

Is it something Rust AM could guarantee? I think it makes no difference
than 1) allocating some normal memory for DMA; 2) writing to the normal
memory; 3) issuing some io barrier instructions to make sure the device
will see the writes in step 2; 4) doing some MMIO to notify the
device for a DMA read. Therefore reordering of A and B by compiler will
be problematic.

Regards,
Boqun

> that the framing of barriers in terms of preventing reordering of accesses
> is too imprecise. That's why the C++ memory model uses a very different
> framing, and that's why I can't give a definite answer here. :)
> 
> Kind regards,
> Ralf
>

Re: [PATCH v6 8/9] rust: sync: Add memory barriers

Posted by Ralf Jung 6 months, 4 weeks ago

Hi all,

On 15.07.25 17:21, Boqun Feng wrote:
> On Mon, Jul 14, 2025 at 05:42:39PM +0200, Ralf Jung wrote:
>> Hi all,
>>
>> On 11.07.25 20:20, Boqun Feng wrote:
>>> On Fri, Jul 11, 2025 at 10:57:48AM +0200, Benno Lossin wrote:
>>>> On Thu Jul 10, 2025 at 8:00 AM CEST, Boqun Feng wrote:
>>>>> diff --git a/rust/kernel/sync/barrier.rs b/rust/kernel/sync/barrier.rs
>>>>> new file mode 100644
>>>>> index 000000000000..df4015221503
>>>>> --- /dev/null
>>>>> +++ b/rust/kernel/sync/barrier.rs
>>>>> @@ -0,0 +1,65 @@
>>>>> +// SPDX-License-Identifier: GPL-2.0
>>>>> +
>>>>> +//! Memory barriers.
>>>>> +//!
>>>>> +//! These primitives have the same semantics as their C counterparts: and the precise definitions
>>>>> +//! of semantics can be found at [`LKMM`].
>>>>> +//!
>>>>> +//! [`LKMM`]: srctree/tools/memory-model/
>>>>> +
>>>>> +/// A compiler barrier.
>>>>> +///
>>>>> +/// A barrier that prevents compiler from reordering memory accesses across the barrier.
>>>>> +pub(crate) fn barrier() {
>>>>> +    // By default, Rust inline asms are treated as being able to access any memory or flags, hence
>>>>> +    // it suffices as a compiler barrier.
>>>>
>>>> I don't know about this, but it also isn't my area of expertise... I
>>>> think I heard Ralf talk about this at Rust Week, but I don't remember...
>>>>
>>>
>>> Easy, let's Cc Ralf ;-)
>>>
>>> Ralf, I believe the question here is:
>>>
>>> In kernel C, we define a compiler barrier (barrier()), which is
>>> implemented as:
>>>
>>> # define barrier() __asm__ __volatile__("": : :"memory")
>>>
>>> Now we want to have a Rust version, and I think an empty `asm!()` should
>>> be enough as an equivalent as a barrier() in C, because an empty
>>> `asm!()` in Rust implies "memory" as the clobber:
>>>
>>> 	https://godbolt.org/z/3z3fnWYjs
>>>
>>> ?
>>>
>>> I know you have some opinions on C++ compiler_fence() [1]. But in LKMM,
>>> barrier() and other barriers work for all memory accesses not just
>>> atomics, so the problem "So, if your program contains no atomic
>>> accesses, but some atomic fences, those fences do nothing." doesn't
>>> exist for us. And our barrier() is strictly weaker than other barriers.
>>>
>>> And based on my understanding of the consensus on Rust vs LKMM, "do
>>> whatever kernel C does and rely on whatever kernel C relies" is the
>>> general suggestion, so I think an empty `asm!()` works here. Of course
>>> if in practice, we find an issue, I'm happy to look for solutions ;-)
>>>
>>> Thoughts?
>>>
>>> [1]: https://github.com/rust-lang/unsafe-code-guidelines/issues/347
>>
>> If I understood correctly, this is about using "compiler barriers" to order
>> volatile accesses that the LKMM uses in lieu of atomic accesses?
>> I can't give a principled answer here, unfortunately -- as you know, the
>> mapping of LKMM through the compiler isn't really in a state where we can
>> make principled formal statements. And making principled formal statements
>> is my main expertise so I am a bit out of my depth here. ;)
>>
> 
> Understood ;-)
> 
>> So I agree with your 2nd paragraph: I would say just like the fact that you
>> are using volatile accesses in the first place, this falls under "do
>> whatever the C code does, it shouldn't be any more broken in Rust than it is
>> in C".
>>
>> However, saying that it in general "prevents reordering all memory accesses"
>> is unlikely to be fully correct -- if the compiler can prove that the inline
>> asm block could not possibly have access to a local variable (e.g. because
>> it never had its address taken), its accesses can still be reordered. This
>> applies both to C compilers and Rust compilers. Extra annotations such as
>> `noalias` (or `restrict` in C) can also give rise to reorderings around
>> arbitrary code, including such barriers. This is not a problem for
>> concurrent code since it would anyway be wrong to claim that some pointer
>> doesn't have aliases when it is accessed by multiple threads, but it shows
> 
> Right, it shouldn't be a problem for most of the concurrent code, and
> thank you for bringing this up. I believe we can rely on the barrier
> behavior if the memory accesses on both sides are done via aliased
> references/pointers, which should be the same as C code relies on.
> 
> One thing though is we don't use much of `restrict` in kernel C, so I
> wonder the compiler's behavior in the following code:
> 
>      let mut x = KBox::new_uninit(GFP_KERNEL)?;
>      // ^ KBox is our own Box implementation based on kmalloc(), and it
>      // accepts a flag in new*() functions for different allocation
>      // behavior (can sleep or not, etc), of course we want it to behave
>      // like an std Box in term of aliasing.
> 
>      let x = KBox::write(x, foo); // A
> 
>      smp_mb():
>        // using Rust asm!() for explanation, it's really implemented in
>        // C.
>        asm!("mfence");
> 
>      let a: &Atomic<*mut Foo> = ...; // `a` was null initially.
> 
>      a.store(KBox::into_raw(x), Relaxed); // B
> 
> Now we obviously want A and B to be ordered, because smp_mb() is
> supposed to be stronger than Release ordering. So if another thread does
> an Acquire read or uses address dependency:
> 
>      let a: &Atomic<*mut Foo> = ...;
>      let foo_ptr = a.load(Acquire); // or load(Relaxed);
> 
>      if !foo_ptr.is_null() {
>          let y: KBox<Foo> = unsafe { KBox::from_raw(foo_ptr) };
> 	// ^ this should be safe.
>      }
> 
> Is it something Rust AM could guarantee?

If we pretend these are normal Rust atomics, and we look at the acquire read, 
then yeah that should work -- the asm block can act like a release fence. With 
the LKMM, it's not a "guarantee" in the same sense any more since it lacks the 
formal foundations, but "it shouldn't be worse than in C".

The Rust/C/C++ memory models do not allow that last example with a relaxed load 
and an address dependency. In C/C++ this requires "consume", which Rust doesn't 
have (and which clang treats as "acquire" -- and GCC does the same, IIRC), and 
which nobody figured out how to properly integrate into any of these languages. 
I will refrain from making any definite statements for the LKMM here. ;)

Kind regards,
Ralf

> I think it makes no difference
> than 1) allocating some normal memory for DMA; 2) writing to the normal
> memory; 3) issuing some io barrier instructions to make sure the device
> will see the writes in step 2; 4) doing some MMIO to notify the
> device for a DMA read. Therefore reordering of A and B by compiler will
> be problematic.
> 
> Regards,
> Boqun
> 
>> that the framing of barriers in terms of preventing reordering of accesses
>> is too imprecise. That's why the C++ memory model uses a very different
>> framing, and that's why I can't give a definite answer here. :)
>>
>> Kind regards,
>> Ralf
>>

Re: [PATCH v6 8/9] rust: sync: Add memory barriers

Posted by Boqun Feng 6 months, 4 weeks ago

On Tue, Jul 15, 2025 at 05:35:47PM +0200, Ralf Jung wrote:
> Hi all,
> 
> On 15.07.25 17:21, Boqun Feng wrote:
> > On Mon, Jul 14, 2025 at 05:42:39PM +0200, Ralf Jung wrote:
> > > Hi all,
> > > 
> > > On 11.07.25 20:20, Boqun Feng wrote:
> > > > On Fri, Jul 11, 2025 at 10:57:48AM +0200, Benno Lossin wrote:
> > > > > On Thu Jul 10, 2025 at 8:00 AM CEST, Boqun Feng wrote:
> > > > > > diff --git a/rust/kernel/sync/barrier.rs b/rust/kernel/sync/barrier.rs
> > > > > > new file mode 100644
> > > > > > index 000000000000..df4015221503
> > > > > > --- /dev/null
> > > > > > +++ b/rust/kernel/sync/barrier.rs
> > > > > > @@ -0,0 +1,65 @@
> > > > > > +// SPDX-License-Identifier: GPL-2.0
> > > > > > +
> > > > > > +//! Memory barriers.
> > > > > > +//!
> > > > > > +//! These primitives have the same semantics as their C counterparts: and the precise definitions
> > > > > > +//! of semantics can be found at [`LKMM`].
> > > > > > +//!
> > > > > > +//! [`LKMM`]: srctree/tools/memory-model/
> > > > > > +
> > > > > > +/// A compiler barrier.
> > > > > > +///
> > > > > > +/// A barrier that prevents compiler from reordering memory accesses across the barrier.
> > > > > > +pub(crate) fn barrier() {
> > > > > > +    // By default, Rust inline asms are treated as being able to access any memory or flags, hence
> > > > > > +    // it suffices as a compiler barrier.
> > > > > 
> > > > > I don't know about this, but it also isn't my area of expertise... I
> > > > > think I heard Ralf talk about this at Rust Week, but I don't remember...
> > > > > 
> > > > 
> > > > Easy, let's Cc Ralf ;-)
> > > > 
> > > > Ralf, I believe the question here is:
> > > > 
> > > > In kernel C, we define a compiler barrier (barrier()), which is
> > > > implemented as:
> > > > 
> > > > # define barrier() __asm__ __volatile__("": : :"memory")
> > > > 
> > > > Now we want to have a Rust version, and I think an empty `asm!()` should
> > > > be enough as an equivalent as a barrier() in C, because an empty
> > > > `asm!()` in Rust implies "memory" as the clobber:
> > > > 
> > > > 	https://godbolt.org/z/3z3fnWYjs
> > > > 
> > > > ?
> > > > 
> > > > I know you have some opinions on C++ compiler_fence() [1]. But in LKMM,
> > > > barrier() and other barriers work for all memory accesses not just
> > > > atomics, so the problem "So, if your program contains no atomic
> > > > accesses, but some atomic fences, those fences do nothing." doesn't
> > > > exist for us. And our barrier() is strictly weaker than other barriers.
> > > > 
> > > > And based on my understanding of the consensus on Rust vs LKMM, "do
> > > > whatever kernel C does and rely on whatever kernel C relies" is the
> > > > general suggestion, so I think an empty `asm!()` works here. Of course
> > > > if in practice, we find an issue, I'm happy to look for solutions ;-)
> > > > 
> > > > Thoughts?
> > > > 
> > > > [1]: https://github.com/rust-lang/unsafe-code-guidelines/issues/347
> > > 
> > > If I understood correctly, this is about using "compiler barriers" to order
> > > volatile accesses that the LKMM uses in lieu of atomic accesses?
> > > I can't give a principled answer here, unfortunately -- as you know, the
> > > mapping of LKMM through the compiler isn't really in a state where we can
> > > make principled formal statements. And making principled formal statements
> > > is my main expertise so I am a bit out of my depth here. ;)
> > > 
> > 
> > Understood ;-)
> > 
> > > So I agree with your 2nd paragraph: I would say just like the fact that you
> > > are using volatile accesses in the first place, this falls under "do
> > > whatever the C code does, it shouldn't be any more broken in Rust than it is
> > > in C".
> > > 
> > > However, saying that it in general "prevents reordering all memory accesses"
> > > is unlikely to be fully correct -- if the compiler can prove that the inline
> > > asm block could not possibly have access to a local variable (e.g. because
> > > it never had its address taken), its accesses can still be reordered. This
> > > applies both to C compilers and Rust compilers. Extra annotations such as
> > > `noalias` (or `restrict` in C) can also give rise to reorderings around
> > > arbitrary code, including such barriers. This is not a problem for
> > > concurrent code since it would anyway be wrong to claim that some pointer
> > > doesn't have aliases when it is accessed by multiple threads, but it shows
> > 
> > Right, it shouldn't be a problem for most of the concurrent code, and
> > thank you for bringing this up. I believe we can rely on the barrier
> > behavior if the memory accesses on both sides are done via aliased
> > references/pointers, which should be the same as C code relies on.
> > 
> > One thing though is we don't use much of `restrict` in kernel C, so I
> > wonder the compiler's behavior in the following code:
> > 
> >      let mut x = KBox::new_uninit(GFP_KERNEL)?;
> >      // ^ KBox is our own Box implementation based on kmalloc(), and it
> >      // accepts a flag in new*() functions for different allocation
> >      // behavior (can sleep or not, etc), of course we want it to behave
> >      // like an std Box in term of aliasing.
> > 
> >      let x = KBox::write(x, foo); // A
> > 
> >      smp_mb():
> >        // using Rust asm!() for explanation, it's really implemented in
> >        // C.
> >        asm!("mfence");
> > 
> >      let a: &Atomic<*mut Foo> = ...; // `a` was null initially.
> > 
> >      a.store(KBox::into_raw(x), Relaxed); // B
> > 
> > Now we obviously want A and B to be ordered, because smp_mb() is
> > supposed to be stronger than Release ordering. So if another thread does
> > an Acquire read or uses address dependency:
> > 
> >      let a: &Atomic<*mut Foo> = ...;
> >      let foo_ptr = a.load(Acquire); // or load(Relaxed);
> > 
> >      if !foo_ptr.is_null() {
> >          let y: KBox<Foo> = unsafe { KBox::from_raw(foo_ptr) };
> > 	// ^ this should be safe.
> >      }
> > 
> > Is it something Rust AM could guarantee?
> 
> If we pretend these are normal Rust atomics, and we look at the acquire
> read, then yeah that should work -- the asm block can act like a release
> fence. With the LKMM, it's not a "guarantee" in the same sense any more
> since it lacks the formal foundations, but "it shouldn't be worse than in
> C".

> 
> The Rust/C/C++ memory models do not allow that last example with a relaxed
> load and an address dependency. In C/C++ this requires "consume", which Rust

Sorry I wasn't clear, of course I wasn't going to start a discussion
about address dependency and formal guarantee about it ;-)

What I meant was the "prevent reordering A and B because of the asm!()"
at the release side, because normally we won't use a restrict pointer to
a kmalloc() result, so I'm curious whether Box make the behavior
different:

    let mut b = Box::new_uninit(...);
    let b = Box::write(b, ...); // <- this is a write done via noalias
    asm!(...);
    a.store(Box::from_raw(b), Relaxed);

But looks like we can just model the asm() as a Rust release fence, so
it should work. Thanks!

Regards,
Boqun


> doesn't have (and which clang treats as "acquire" -- and GCC does the same,
> IIRC), and which nobody figured out how to properly integrate into any of
> these languages. I will refrain from making any definite statements for the
> LKMM here. ;)
> 
[...]

Re: [PATCH v6 8/9] rust: sync: Add memory barriers

Posted by Ralf Jung 6 months, 3 weeks ago

Hi Boqun,

On 15.07.25 17:56, Boqun Feng wrote:
> On Tue, Jul 15, 2025 at 05:35:47PM +0200, Ralf Jung wrote:
>> Hi all,
>>
>> On 15.07.25 17:21, Boqun Feng wrote:
>>> On Mon, Jul 14, 2025 at 05:42:39PM +0200, Ralf Jung wrote:
>>>> Hi all,
>>>>
>>>> On 11.07.25 20:20, Boqun Feng wrote:
>>>>> On Fri, Jul 11, 2025 at 10:57:48AM +0200, Benno Lossin wrote:
>>>>>> On Thu Jul 10, 2025 at 8:00 AM CEST, Boqun Feng wrote:
>>>>>>> diff --git a/rust/kernel/sync/barrier.rs b/rust/kernel/sync/barrier.rs
>>>>>>> new file mode 100644
>>>>>>> index 000000000000..df4015221503
>>>>>>> --- /dev/null
>>>>>>> +++ b/rust/kernel/sync/barrier.rs
>>>>>>> @@ -0,0 +1,65 @@
>>>>>>> +// SPDX-License-Identifier: GPL-2.0
>>>>>>> +
>>>>>>> +//! Memory barriers.
>>>>>>> +//!
>>>>>>> +//! These primitives have the same semantics as their C counterparts: and the precise definitions
>>>>>>> +//! of semantics can be found at [`LKMM`].
>>>>>>> +//!
>>>>>>> +//! [`LKMM`]: srctree/tools/memory-model/
>>>>>>> +
>>>>>>> +/// A compiler barrier.
>>>>>>> +///
>>>>>>> +/// A barrier that prevents compiler from reordering memory accesses across the barrier.
>>>>>>> +pub(crate) fn barrier() {
>>>>>>> +    // By default, Rust inline asms are treated as being able to access any memory or flags, hence
>>>>>>> +    // it suffices as a compiler barrier.
>>>>>>
>>>>>> I don't know about this, but it also isn't my area of expertise... I
>>>>>> think I heard Ralf talk about this at Rust Week, but I don't remember...
>>>>>>
>>>>>
>>>>> Easy, let's Cc Ralf ;-)
>>>>>
>>>>> Ralf, I believe the question here is:
>>>>>
>>>>> In kernel C, we define a compiler barrier (barrier()), which is
>>>>> implemented as:
>>>>>
>>>>> # define barrier() __asm__ __volatile__("": : :"memory")
>>>>>
>>>>> Now we want to have a Rust version, and I think an empty `asm!()` should
>>>>> be enough as an equivalent as a barrier() in C, because an empty
>>>>> `asm!()` in Rust implies "memory" as the clobber:
>>>>>
>>>>> 	https://godbolt.org/z/3z3fnWYjs
>>>>>
>>>>> ?
>>>>>
>>>>> I know you have some opinions on C++ compiler_fence() [1]. But in LKMM,
>>>>> barrier() and other barriers work for all memory accesses not just
>>>>> atomics, so the problem "So, if your program contains no atomic
>>>>> accesses, but some atomic fences, those fences do nothing." doesn't
>>>>> exist for us. And our barrier() is strictly weaker than other barriers.
>>>>>
>>>>> And based on my understanding of the consensus on Rust vs LKMM, "do
>>>>> whatever kernel C does and rely on whatever kernel C relies" is the
>>>>> general suggestion, so I think an empty `asm!()` works here. Of course
>>>>> if in practice, we find an issue, I'm happy to look for solutions ;-)
>>>>>
>>>>> Thoughts?
>>>>>
>>>>> [1]: https://github.com/rust-lang/unsafe-code-guidelines/issues/347
>>>>
>>>> If I understood correctly, this is about using "compiler barriers" to order
>>>> volatile accesses that the LKMM uses in lieu of atomic accesses?
>>>> I can't give a principled answer here, unfortunately -- as you know, the
>>>> mapping of LKMM through the compiler isn't really in a state where we can
>>>> make principled formal statements. And making principled formal statements
>>>> is my main expertise so I am a bit out of my depth here. ;)
>>>>
>>>
>>> Understood ;-)
>>>
>>>> So I agree with your 2nd paragraph: I would say just like the fact that you
>>>> are using volatile accesses in the first place, this falls under "do
>>>> whatever the C code does, it shouldn't be any more broken in Rust than it is
>>>> in C".
>>>>
>>>> However, saying that it in general "prevents reordering all memory accesses"
>>>> is unlikely to be fully correct -- if the compiler can prove that the inline
>>>> asm block could not possibly have access to a local variable (e.g. because
>>>> it never had its address taken), its accesses can still be reordered. This
>>>> applies both to C compilers and Rust compilers. Extra annotations such as
>>>> `noalias` (or `restrict` in C) can also give rise to reorderings around
>>>> arbitrary code, including such barriers. This is not a problem for
>>>> concurrent code since it would anyway be wrong to claim that some pointer
>>>> doesn't have aliases when it is accessed by multiple threads, but it shows
>>>
>>> Right, it shouldn't be a problem for most of the concurrent code, and
>>> thank you for bringing this up. I believe we can rely on the barrier
>>> behavior if the memory accesses on both sides are done via aliased
>>> references/pointers, which should be the same as C code relies on.
>>>
>>> One thing though is we don't use much of `restrict` in kernel C, so I
>>> wonder the compiler's behavior in the following code:
>>>
>>>       let mut x = KBox::new_uninit(GFP_KERNEL)?;
>>>       // ^ KBox is our own Box implementation based on kmalloc(), and it
>>>       // accepts a flag in new*() functions for different allocation
>>>       // behavior (can sleep or not, etc), of course we want it to behave
>>>       // like an std Box in term of aliasing.
>>>
>>>       let x = KBox::write(x, foo); // A
>>>
>>>       smp_mb():
>>>         // using Rust asm!() for explanation, it's really implemented in
>>>         // C.
>>>         asm!("mfence");
>>>
>>>       let a: &Atomic<*mut Foo> = ...; // `a` was null initially.
>>>
>>>       a.store(KBox::into_raw(x), Relaxed); // B
>>>
>>> Now we obviously want A and B to be ordered, because smp_mb() is
>>> supposed to be stronger than Release ordering. So if another thread does
>>> an Acquire read or uses address dependency:
>>>
>>>       let a: &Atomic<*mut Foo> = ...;
>>>       let foo_ptr = a.load(Acquire); // or load(Relaxed);
>>>
>>>       if !foo_ptr.is_null() {
>>>           let y: KBox<Foo> = unsafe { KBox::from_raw(foo_ptr) };
>>> 	// ^ this should be safe.
>>>       }
>>>
>>> Is it something Rust AM could guarantee?
>>
>> If we pretend these are normal Rust atomics, and we look at the acquire
>> read, then yeah that should work -- the asm block can act like a release
>> fence. With the LKMM, it's not a "guarantee" in the same sense any more
>> since it lacks the formal foundations, but "it shouldn't be worse than in
>> C".
> 
>>
>> The Rust/C/C++ memory models do not allow that last example with a relaxed
>> load and an address dependency. In C/C++ this requires "consume", which Rust
> 
> Sorry I wasn't clear, of course I wasn't going to start a discussion
> about address dependency and formal guarantee about it ;-)
> 
> What I meant was the "prevent reordering A and B because of the asm!()"
> at the release side, because normally we won't use a restrict pointer to
> a kmalloc() result, so I'm curious whether Box make the behavior
> different:
> 
>      let mut b = Box::new_uninit(...);
>      let b = Box::write(b, ...); // <- this is a write done via noalias
>      asm!(...);
>      a.store(Box::from_raw(b), Relaxed);
> 
> But looks like we can just model the asm() as a Rust release fence, so
> it should work. Thanks!

Yeah... this is actually a subtle case and there are some adjacent compiler bugs 
(when doing the same with local variables, not Box), but those are bugs (and 
they affect both C and Rust).

Kind regards,
Ralf

Re: [PATCH v6 8/9] rust: sync: Add memory barriers

Posted by Boqun Feng 7 months ago

On Fri, Jul 11, 2025 at 10:57:48AM +0200, Benno Lossin wrote:
[...]
> > +}
> > +
> > +/// A full memory barrier.
> > +///
> > +/// A barrier that prevents compiler and CPU from reordering memory accesses across the barrier.
> > +pub fn smp_mb() {
> > +    if cfg!(CONFIG_SMP) {
> > +        // SAFETY: `smp_mb()` is safe to call.
> > +        unsafe {
> > +            bindings::smp_mb();
> 
> Does this really work? How does the Rust compiler know this is a memory
> barrier?
> 

- Without INLINE_HELPER, this is an FFI call, it's safe to assume that
  Rust compiler would treat it as a compiler barrier and in smp_mb() a
  real memory barrier instruction will be executed. 

- With INLINE_HELPER, this will be inlined as an asm block with "memory"
  as clobber, and LLVM will know it's a compiler memory barrier, and the
  real memory barrier instruction guarantees it's a memory barrier at
  CPU reordering level as well.

Think about this, SpinLock and Mutex need memory barriers for critical
section, if this doesn't work, then SpinLock and Mutex don't work
either, then we have a bigger problem ;-)

Regards,
Boqun

> ---
> Cheers,
> Benno
> 
> > +        }
> > +    } else {
> > +        barrier();
> > +    }
> > +}

Re: [PATCH v6 8/9] rust: sync: Add memory barriers

Posted by Benno Lossin 7 months ago

On Fri Jul 11, 2025 at 3:32 PM CEST, Boqun Feng wrote:
> On Fri, Jul 11, 2025 at 10:57:48AM +0200, Benno Lossin wrote:
> [...]
>> > +}
>> > +
>> > +/// A full memory barrier.
>> > +///
>> > +/// A barrier that prevents compiler and CPU from reordering memory accesses across the barrier.
>> > +pub fn smp_mb() {
>> > +    if cfg!(CONFIG_SMP) {
>> > +        // SAFETY: `smp_mb()` is safe to call.
>> > +        unsafe {
>> > +            bindings::smp_mb();
>> 
>> Does this really work? How does the Rust compiler know this is a memory
>> barrier?
>> 
>
> - Without INLINE_HELPER, this is an FFI call, it's safe to assume that
>   Rust compiler would treat it as a compiler barrier and in smp_mb() a
>   real memory barrier instruction will be executed. 
>
> - With INLINE_HELPER, this will be inlined as an asm block with "memory"
>   as clobber, and LLVM will know it's a compiler memory barrier, and the
>   real memory barrier instruction guarantees it's a memory barrier at
>   CPU reordering level as well.
>
> Think about this, SpinLock and Mutex need memory barriers for critical
> section, if this doesn't work, then SpinLock and Mutex don't work
> either, then we have a bigger problem ;-)

By "this not working" I meant that he barrier would be too strong :)

So essentially without INLINE_HELPER, all barriers in this file are the
same, but with it, we get less strict ones?

---
Cheers,
Benno

Re: [PATCH v6 8/9] rust: sync: Add memory barriers

Posted by Boqun Feng 7 months ago

On Fri, Jul 11, 2025 at 08:57:27PM +0200, Benno Lossin wrote:
> On Fri Jul 11, 2025 at 3:32 PM CEST, Boqun Feng wrote:
> > On Fri, Jul 11, 2025 at 10:57:48AM +0200, Benno Lossin wrote:
> > [...]
> >> > +}
> >> > +
> >> > +/// A full memory barrier.
> >> > +///
> >> > +/// A barrier that prevents compiler and CPU from reordering memory accesses across the barrier.
> >> > +pub fn smp_mb() {
> >> > +    if cfg!(CONFIG_SMP) {
> >> > +        // SAFETY: `smp_mb()` is safe to call.
> >> > +        unsafe {
> >> > +            bindings::smp_mb();
> >> 
> >> Does this really work? How does the Rust compiler know this is a memory
> >> barrier?
> >> 
> >
> > - Without INLINE_HELPER, this is an FFI call, it's safe to assume that
> >   Rust compiler would treat it as a compiler barrier and in smp_mb() a
> >   real memory barrier instruction will be executed. 
> >
> > - With INLINE_HELPER, this will be inlined as an asm block with "memory"
> >   as clobber, and LLVM will know it's a compiler memory barrier, and the
> >   real memory barrier instruction guarantees it's a memory barrier at
> >   CPU reordering level as well.
> >
> > Think about this, SpinLock and Mutex need memory barriers for critical
> > section, if this doesn't work, then SpinLock and Mutex don't work
> > either, then we have a bigger problem ;-)
> 
> By "this not working" I meant that he barrier would be too strong :)
> 
> So essentially without INLINE_HELPER, all barriers in this file are the
> same, but with it, we get less strict ones?

Not the same, each barrier function may generate a different hardware
instruction ;-)

I would say for a Rust function (e.g. smp_mb()), the difference between
with and without INLINE_HELPER is:

- with INLINE_HELPER enabled, they behave exactly like a C function
  calling a C smp_mb().

- without INLINE_HELPER enabled, they behave like a C function calling 
  a function that never inlined:

  void outofline_smp_mb(void)
  {
    smp_mb();
  }

  It might be stronger than the "with INLINE_HELPER" case but both are
  correct regarding memory ordering.

Regards,
Boqun

> 
> ---
> Cheers,
> Benno

Re: [PATCH v6 8/9] rust: sync: Add memory barriers

Posted by Benno Lossin 7 months ago

On Fri Jul 11, 2025 at 9:26 PM CEST, Boqun Feng wrote:
> On Fri, Jul 11, 2025 at 08:57:27PM +0200, Benno Lossin wrote:
>> On Fri Jul 11, 2025 at 3:32 PM CEST, Boqun Feng wrote:
>> > On Fri, Jul 11, 2025 at 10:57:48AM +0200, Benno Lossin wrote:
>> > [...]
>> >> > +}
>> >> > +
>> >> > +/// A full memory barrier.
>> >> > +///
>> >> > +/// A barrier that prevents compiler and CPU from reordering memory accesses across the barrier.
>> >> > +pub fn smp_mb() {
>> >> > +    if cfg!(CONFIG_SMP) {
>> >> > +        // SAFETY: `smp_mb()` is safe to call.
>> >> > +        unsafe {
>> >> > +            bindings::smp_mb();
>> >> 
>> >> Does this really work? How does the Rust compiler know this is a memory
>> >> barrier?
>> >> 
>> >
>> > - Without INLINE_HELPER, this is an FFI call, it's safe to assume that
>> >   Rust compiler would treat it as a compiler barrier and in smp_mb() a
>> >   real memory barrier instruction will be executed. 
>> >
>> > - With INLINE_HELPER, this will be inlined as an asm block with "memory"
>> >   as clobber, and LLVM will know it's a compiler memory barrier, and the
>> >   real memory barrier instruction guarantees it's a memory barrier at
>> >   CPU reordering level as well.
>> >
>> > Think about this, SpinLock and Mutex need memory barriers for critical
>> > section, if this doesn't work, then SpinLock and Mutex don't work
>> > either, then we have a bigger problem ;-)
>> 
>> By "this not working" I meant that he barrier would be too strong :)
>> 
>> So essentially without INLINE_HELPER, all barriers in this file are the
>> same, but with it, we get less strict ones?
>
> Not the same, each barrier function may generate a different hardware
> instruction ;-)
>
> I would say for a Rust function (e.g. smp_mb()), the difference between
> with and without INLINE_HELPER is:
>
> - with INLINE_HELPER enabled, they behave exactly like a C function
>   calling a C smp_mb().
>
> - without INLINE_HELPER enabled, they behave like a C function calling 
>   a function that never inlined:
>
>   void outofline_smp_mb(void)
>   {
>     smp_mb();
>   }
>
>   It might be stronger than the "with INLINE_HELPER" case but both are
>   correct regarding memory ordering.

Yes, this is exactly what I meant with "not working" :)

---
Cheers,
Benno

Re: [PATCH v6 8/9] rust: sync: Add memory barriers

Posted by Boqun Feng 7 months ago

On Fri, Jul 11, 2025 at 11:04:09PM +0200, Benno Lossin wrote:
> On Fri Jul 11, 2025 at 9:26 PM CEST, Boqun Feng wrote:
> > On Fri, Jul 11, 2025 at 08:57:27PM +0200, Benno Lossin wrote:
> >> On Fri Jul 11, 2025 at 3:32 PM CEST, Boqun Feng wrote:
> >> > On Fri, Jul 11, 2025 at 10:57:48AM +0200, Benno Lossin wrote:
> >> > [...]
> >> >> > +}
> >> >> > +
> >> >> > +/// A full memory barrier.
> >> >> > +///
> >> >> > +/// A barrier that prevents compiler and CPU from reordering memory accesses across the barrier.
> >> >> > +pub fn smp_mb() {
> >> >> > +    if cfg!(CONFIG_SMP) {
> >> >> > +        // SAFETY: `smp_mb()` is safe to call.
> >> >> > +        unsafe {
> >> >> > +            bindings::smp_mb();
> >> >> 
> >> >> Does this really work? How does the Rust compiler know this is a memory
> >> >> barrier?
> >> >> 
> >> >
> >> > - Without INLINE_HELPER, this is an FFI call, it's safe to assume that
> >> >   Rust compiler would treat it as a compiler barrier and in smp_mb() a
> >> >   real memory barrier instruction will be executed. 
> >> >
> >> > - With INLINE_HELPER, this will be inlined as an asm block with "memory"
> >> >   as clobber, and LLVM will know it's a compiler memory barrier, and the
> >> >   real memory barrier instruction guarantees it's a memory barrier at
> >> >   CPU reordering level as well.
> >> >
> >> > Think about this, SpinLock and Mutex need memory barriers for critical
> >> > section, if this doesn't work, then SpinLock and Mutex don't work
> >> > either, then we have a bigger problem ;-)
> >> 
> >> By "this not working" I meant that he barrier would be too strong :)
> >> 
> >> So essentially without INLINE_HELPER, all barriers in this file are the
> >> same, but with it, we get less strict ones?
> >
> > Not the same, each barrier function may generate a different hardware
> > instruction ;-)
> >
> > I would say for a Rust function (e.g. smp_mb()), the difference between
> > with and without INLINE_HELPER is:
> >
> > - with INLINE_HELPER enabled, they behave exactly like a C function
> >   calling a C smp_mb().
> >
> > - without INLINE_HELPER enabled, they behave like a C function calling 
> >   a function that never inlined:
> >
> >   void outofline_smp_mb(void)
> >   {
> >     smp_mb();
> >   }
> >
> >   It might be stronger than the "with INLINE_HELPER" case but both are
> >   correct regarding memory ordering.
> 
> Yes, this is exactly what I meant with "not working" :)
> 

But be stronger is still "working" ;-)

BTW, replying you made me realize I should make these function
#[inline(always)] so thank you ;-)

Regards,
Boqun

> ---
> Cheers,
> Benno
>