[PATCH v14 00/16] Refcounted interrupts, SpinLockIrq for rust
Posted by Lyude Paul 1 week, 4 days ago
This is the latest patch series for adding rust bindings for controlling
local processor interrupts, adding support for spinlocks in rust that
are acquired with local processor interrupts disabled, and implementing
local interrupt controls through refcounting in the kernel.

The previous version of this patch series can be found here:

https://lkml.org/lkml/2025/10/13/1155

This patch series applies on top of the rust-next branch.

There are a few big changes from the last time, mainly that we've
addressed all(?) of the open questions on this patch series:

* Thanks to Joel Fernandes, we now have a separate per-CPU counter for
  tracking NMI nesting, which ensures that we don't have to sacrifice
  NMI nest level bits in order to store a counter for refcounted IRQs.
  These patches have been included at the start of the series.
* We've been able to prove that being able to convert the kernel over to
  this new interface is indeed possible, more on this below.
* Also thanks to Joel, we now have actual benchmarks for how this
  affects performance:
  https://lore.kernel.org/rust-for-linux/20250619175335.2905836-1-joelagnelf@nvidia.com/
* There are also some small changes to the KUnit test I added, mainly
  making sure I don't forget to include a MODULE_DESCRIPTION or
  MODULE_LICENSE.

Regarding the conversion plan: we've had some success at getting kernels
to boot after attempting to convert the entire kernel from the
non-refcounted API to the new refcounted API. It will definitely take
quite a lot of work to get this right though, at least on the kernel
core side of things. To give readers an idea of what I mean, here are a
few of the issues that we ended up running into:

On my end, I tried running a number of coccinelle conversions for this.
At first I actually tried simply rewiring
local_irq_disable()/local_irq_enable() to
local_interrupt_disable()/local_interrupt_enable(). This wasn't really
workable though, as it caused the kernel to crash very early on in a
number of ways that I haven't fully untangled. Doing this with
coccinelle, on the other hand, allowed me to convert individual files at
a time, along with specific usage patterns of the old API, and as a
result this gave me a pretty good idea of where our issues are coming
from. The following coccinelle script, while still leaving most of the
kernel unconverted, could at least be run on almost all of kernel/ while
still allowing us to boot on x86_64:

@depends on patch && !report@
@@
- local_irq_disable();
+ local_interrupt_disable();
...
- local_irq_enable();
+ local_interrupt_enable();

There were two files in kernel/ that were exceptions to this:

* kernel/softirq.c
* kernel/main.c (I figured out at least one fix to an issue here)

The reason this worked is that the vast majority of the issues we're
seeing appear to come from "unbalanced"/"misordered" usages of the old
irq API. There seem to be a few reasons for this:

* The first, simple reason: occasionally the disable/enable pair was
  split across functions, which this script didn't handle (see the
  sketch after this list).
* The second, more complicated reason: some portions of the kernel core
  end up executing processor instructions that modify the processor's
  local interrupt flags independently of the kernel. In x86_64's case, I
  believe we came to the conclusion that the iret instruction (interrupt
  return) was modifying the interrupt flag state. There are possibly a
  few more instances like this elsewhere.
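
As a rough illustration of the first case (hypothetical code, not taken
from the kernel), a disable/enable pair split across two functions is
invisible to a rule that only matches both calls within one function:

	static void enter_critical(void)
	{
		local_irq_disable();	/* interrupts turned off here... */
	}

	static void leave_critical(void)
	{
		local_irq_enable();	/* ...and turned back on somewhere else */
	}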

Boqun also took a stab at this on aarch64, and ended up with similar
findings. In their case, they discovered that one of the culprits was
raw_spin_rq_unlock_irq(). The reason here is that on aarch64
preempt_count is per-thread and not just per-CPU, and when context
switching you generally disable interrupts in one task and restore them
in the other task. So in order to fix it, we'll need to make some
modifications to the aarch64 context-switching code.
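
Conceptually (a simplified pseudo-trace, not the actual scheduler code),
the problem with a per-thread disable count looks like this:

	/* running as task A, inside the scheduler: */
	raw_spin_rq_lock_irq(rq);	/* A's disable count: 0 -> 1, IRQs off */
	context_switch(...);		/* now running as task B               */
	/* still in the scheduler, but as task B: */
	raw_spin_rq_unlock_irq(rq);	/* B's disable count: 0 -> -1, IRQs on */

Task A is left with an elevated count and task B underflows, even though
the disable/enable pair is balanced from the CPU's point of view.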

So, with this being said, we decided that the best way forward is likely
to just leave us with 3 APIs for the time being, and have new drivers
and code use the new API while we go through and convert the rest of the
kernel.
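
For readers who haven't seen the earlier revisions, a minimal sketch of
how the two styles compare (the critical-section contents here are just
placeholders):

	/* Existing flags-based API: the caller threads the flags through. */
	unsigned long flags;

	local_irq_save(flags);
	/* ... critical section ... */
	local_irq_restore(flags);

	/* Refcounted API added by this series: nesting is tracked by the
	 * kernel, so there are no flags to carry around. */
	local_interrupt_disable();
	/* ... critical section ... */
	local_interrupt_enable();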

Full changelog below

Boqun Feng (6):
  preempt: Introduce HARDIRQ_DISABLE_BITS
  preempt: Introduce __preempt_count_{sub, add}_return()
  irq & spin_lock: Add counted interrupt disabling/enabling
  rust: helper: Add spin_{un,}lock_irq_{enable,disable}() helpers
  rust: sync: lock: Add `Backend::BackendInContext`
  locking: Switch to _irq_{disable,enable}() variants in cleanup guards

Joel Fernandes (1):
  preempt: Track NMI nesting to separate per-CPU counter

Lyude Paul (9):
  irq: Add KUnit test for refcounted interrupt enable/disable
  rust: Introduce interrupt module
  rust: sync: Add SpinLockIrq
  rust: sync: Introduce lock::Backend::Context
  rust: sync: lock/global: Rename B to G in trait bounds
  rust: sync: Add a lifetime parameter to lock::global::GlobalGuard
  rust: sync: Expose lock::Backend
  rust: sync: lock/global: Add Backend parameter to GlobalGuard
  rust: sync: lock/global: Add BackendInContext support to GlobalLock

                               --- Changelog ---

* preempt: Introduce HARDIRQ_DISABLE_BITS
  V14:
    * Fix HARDIRQ_DISABLE_MASK definition

* preempt: Track NMI nesting to separate per-CPU counter
* preempt: Introduce __preempt_count_{sub, add}_return()
  V10:
    * Add commit message I forgot
    * Rebase against latest pcpu_hot changes
  V11:
    * Remove CONFIG_PROFILE_ALL_BRANCHES workaround from
      __preempt_count_add_return()

* irq & spin_lock: Add counted interrupt disabling/enabling
  V10:
    * Add missing __raw_spin_lock_irq_disable() definition in spinlock.c
  V11:
    * Move definition of spin_trylock_irq_disable() into this commit
    * Get rid of leftover space
    * Remove unneeded preempt_disable()/preempt_enable()
  V12:
    * Move local_interrupt_enable()/local_interrupt_disable() out of
      include/linux/spinlock.h, into include/linux/irqflags.h
  V14:
    * Move local_interrupt_enable()/disable() again, this time into its own
      header, interrupt_rc.h, in order to fix a hexagon-specific build issue
      caught by the CKI bot.
      The reason this is needed is because on most architectures, irqflags.h
      ends up including <arch/smp.h>. This provides a definition for the
      raw_smp_processor_id() function which we depend on like so:

                                 <linux/percpu-defs.h>    <arch/smp.h>
        local_interrupt_disable() → raw_cpu_write() → raw_smp_processor_id()

      Unfortunately, hexagon appears to be one architecture which does
      not pull in <arch/smp.h> by default here, causing kernel builds to
      fail and claim that raw_smp_processor_id() is undefined:

         In file included from kernel/sched/rq-offsets.c:5:
         In file included from kernel/sched/sched.h:8:
         In file included from include/linux/sched/affinity.h:1:
         In file included from include/linux/sched.h:37:
         In file included from include/linux/spinlock.h:59:
      >> include/linux/irqflags.h:277:3: error: call to undeclared function
         'raw_smp_processor_id'; ISO C99 and later do not support implicit
         function declarations [-Wimplicit-function-declaration]
           277 |   raw_cpu_write(local_interrupt_disable_state.flags, flags);
               |   ^
         include/linux/percpu-defs.h:413:34: note: expanded from macro
         'raw_cpu_write'

      While including <arch/smp.h> in <linux/irqflags.h> does fix the build
      on hexagon, it ends up breaking the build on x86_64:

        In file included from kernel/sched/rq-offsets.c:5:
        In file included from kernel/sched/sched.h:8:
        In file included from ./include/linux/sched/affinity.h:1:
        In file included from ./include/linux/sched.h:13:
        In file included from ./arch/x86/include/asm/processor.h:25:
        In file included from ./arch/x86/include/asm/special_insns.h:10:
        In file included from ./include/linux/irqflags.h:22:
        In file included from ./arch/x86/include/asm/smp.h:6:
        In file included from ./include/linux/thread_info.h:60:
        In file included from ./arch/x86/include/asm/thread_info.h:59:
        ./arch/x86/include/asm/cpufeature.h:110:40: error: use of undeclared
        identifier 'boot_cpu_data'

          [cap_byte] "i" (&((const char *)boot_cpu_data.x86_capability)[bit >> 3])
                                          ^
      While boot_cpu_data is defined in <asm/processor.h>, it's not possible
      for us to include that header in irqflags.h because we're already
      inside of <asm/processor.h>.

      As a result, I just concluded there's no reasonable way of having these
      functions in <linux/irqflags.h> because of how many low level ASM
      headers depend on it. So, we go with the solution of simply giving
      ourselves our own header file.

* rust: Introduce interrupt module
  V10:
    * Fix documentation typos
  V11:
    * Get rid of unneeded `use bindings;`
    * Move ASSUME_DISABLED into assume_disabled()
    * Confirm using lockdep_assert_irqs_disabled() that local interrupts are in
      fact disabled when LocalInterruptDisabled::assume_disabled() is called.

* rust: helper: Add spin_{un,}lock_irq_{enable,disable}() helpers
* rust: sync: Add SpinLockIrq
  V10:
    * Also add support to GlobalLock
    * Documentation fixes from Dirk
  V11:
    * Add unit test requested by Daniel Almeida
  V14:
    * Improve rustdoc for SpinLockIrqBackend

* rust: sync: Introduce lock::Backend::Context
  V10:
    * Fix typos - Dirk

* rust: sync: lock: Add `Backend::BackendInContext`
  V10:
    * Fix typos - Dirk/Lyude
    * Since we're adding support for context locks to GlobalLock as well, let's
      also make sure to cover try_lock while we're at it and add try_lock_with
    * Add a private function as_lock_in_context() for handling casting from a
      Lock<T, B> to Lock<T, B::BackendInContext> so we don't have to duplicate
      safety comments
  V11:
    * Fix clippy::ref_as_ptr error in Lock::as_lock_in_context()
  V14:
    * Add benchmark results, rewrite commit message

* rust: sync: lock/global: Rename B to G in trait bounds
* rust: sync: Add a lifetime parameter to lock::global::GlobalGuard
* rust: sync: Expose lock::Backend
* rust: sync: lock/global: Add Backend parameter to GlobalGuard
* rust: sync: lock/global: Add BackendInContext support to GlobalLock
* locking: Switch to _irq_{disable,enable}() variants in cleanup guards
  V10:
    * Add PREEMPT_RT build fix from Guangbo Cui

 arch/arm64/include/asm/preempt.h     |  18 ++
 arch/s390/include/asm/preempt.h      |  10 +
 arch/x86/include/asm/preempt.h       |  10 +
 include/asm-generic/preempt.h        |  14 ++
 include/linux/hardirq.h              |  16 +-
 include/linux/interrupt_rc.h         |  61 ++++++
 include/linux/preempt.h              |  23 ++-
 include/linux/spinlock.h             |  51 +++--
 include/linux/spinlock_api_smp.h     |  27 +++
 include/linux/spinlock_api_up.h      |   8 +
 include/linux/spinlock_rt.h          |  15 ++
 kernel/irq/Makefile                  |   1 +
 kernel/irq/refcount_interrupt_test.c | 109 +++++++++++
 kernel/locking/spinlock.c            |  29 +++
 kernel/softirq.c                     |   5 +
 rust/helpers/helpers.c               |   1 +
 rust/helpers/interrupt.c             |  18 ++
 rust/helpers/spinlock.c              |  15 ++
 rust/helpers/sync.c                  |   5 +
 rust/kernel/interrupt.rs             |  86 +++++++++
 rust/kernel/lib.rs                   |   1 +
 rust/kernel/sync.rs                  |   5 +-
 rust/kernel/sync/lock.rs             |  69 ++++++-
 rust/kernel/sync/lock/global.rs      |  91 ++++++---
 rust/kernel/sync/lock/mutex.rs       |   2 +
 rust/kernel/sync/lock/spinlock.rs    | 272 +++++++++++++++++++++++++++
 26 files changed, 908 insertions(+), 54 deletions(-)
 create mode 100644 include/linux/interrupt_rc.h
 create mode 100644 kernel/irq/refcount_interrupt_test.c
 create mode 100644 rust/helpers/interrupt.c
 create mode 100644 rust/kernel/interrupt.rs


base-commit: e5d330e13f67d574f683c052c9a342814fd8fa39
-- 
2.51.1

Re: [PATCH v14 00/16] Refcounted interrupts, SpinLockIrq for rust
Posted by John Hubbard 1 week, 4 days ago
On 11/20/25 1:45 PM, Lyude Paul wrote:
...
>   this new interface is indeed possible, more on this below.
> * Also thank to Joel, we also now have actual benchmarks for how this
>   affects performance:
>   https://lore.kernel.org/rust-for-linux/20250619175335.2905836-1-joelagnelf@nvidia.com/
> * Also some small changes to the kunit test I added, mainly just making
>   sure I don't forget to include a MODULE_DESCRIPTION or MODULE_LICENSE.

Hi,

The above link says that local_interrupt takes 3.6x as long as local_irq.

This is alarming, but is it the final word? In other words, is the Rust
side of this doomed to slower performance forever, or is there some
hope of reaching performance parity with the C part of the kernel?

Do we have to start telling the Rust for Linux story this way: "our
new Rust-based drivers are slower, but memory-safer"?

I'm not able to deduce the answer from a quick scan through the patches
and the cover letter.

thanks,
-- 
John Hubbard
Re: [PATCH v14 00/16] Refcounted interrupts, SpinLockIrq for rust
Posted by Boqun Feng 1 week, 3 days ago
On Thu, Nov 20, 2025 at 03:16:04PM -0800, John Hubbard wrote:
> On 11/20/25 1:45 PM, Lyude Paul wrote:
> ...
> >   this new interface is indeed possible, more on this below.
> > * Also thank to Joel, we also now have actual benchmarks for how this
> >   affects performance:
> >   https://lore.kernel.org/rust-for-linux/20250619175335.2905836-1-joelagnelf@nvidia.com/
> > * Also some small changes to the kunit test I added, mainly just making
> >   sure I don't forget to include a MODULE_DESCRIPTION or MODULE_LICENSE.
> 
> Hi,
> 
> The above link says that local_interrupt takes 3.6x as long as local_irq.
> 
> This is alarming, but is it the final word? In other words, is the Rust
> side of this doomed to slower performance forever, or is there some
> hope of reaching performance parity with the C part of the kernel?
> 

Note that the local_interrupt API is for safe Rust code; you can always
use unsafe local_irq if interrupt disabling is the performance
bottleneck for you. So language-wise there is no difference between Rust
and C.

> Do we have to start telling the Rust for Linux story this way: "our
> new Rust-based drivers are slower, but memory-safer"?
> 

I would not jump to that conclusion at the moment, because 1) as I
mentioned, you can always go into unsafe if something becomes the
bottleneck, and 2) there is always a gap between micro benchmark results
and whole-system performance; being slow on one operation doesn't
mean the whole system will perform observably worse.

Think about a similar thing in C: we recommend that people use existing
locks instead of custom synchronization via atomics in most cases, and
technically, locks can be slower than special-purpose synchronization
based on atomics, but they are more difficult to mess up.

Regards,
Boqun

> I'm not able to deduce the answer from a quick scan through the patches
> and the cover letter.
> 
> thanks,
> -- 
> John Hubbard
>
Re: [PATCH v14 00/16] Refcounted interrupts, SpinLockIrq for rust
Posted by John Hubbard 1 week, 3 days ago
On 11/21/25 9:47 AM, Boqun Feng wrote:
> On Thu, Nov 20, 2025 at 03:16:04PM -0800, John Hubbard wrote:
>> On 11/20/25 1:45 PM, Lyude Paul wrote:
>> ...
>> This is alarming, but is it the final word? In other words, is the Rust
>> side of this doomed to slower performance forever, or is there some
>> hope of reaching performance parity with the C part of the kernel?
>>
> 
> Note that local_interrupt API is for safe Rust code, you can always
> use unsafe local_irq if the interrupt disabling is the performance
> bottleneck for you. So language-wise there is no difference between Rust
> and C.
>

OK, but there *is* a performance difference between Safe Rust (which is
the whole point of this project, after all) and C.

Is 3.6x longer really something we are stuck with? Or is there some other
way forward that could potentially provide higher performance, for Safe
Rust?

 
>> Do we have to start telling the Rust for Linux story this way: "our
>> new Rust-based drivers are slower, but memory-safer"?
>>
> 
> I would not jump into that conclusion at the moment, because 1) as I
> mentioned you can always go into unsafe if something begins the
> bottleneck, and 2) there is always a gap between micro benchmark results
> and the whole system performance, being slow on one operation doesn't
> means the whole system will perform observably worse.
> 
> Think about a similar thing in C, we recommend people to use existing
> locks instead of customized synchronization vi atomics in most cases,
> and technically, locks can be slower compared to a special
> synchronization based on atomics, but it's more difficult to mess up.
> 

Yes yes, I fully understand that micro benchmarks don't always translate
to real-world observable effects. But interrupt operations... those can
be on a hot path. So it's prudent to worry about these.


thanks,
-- 
John Hubbard
Re: [PATCH v14 00/16] Refcounted interrupts, SpinLockIrq for rust
Posted by Boqun Feng 1 week, 2 days ago
On Fri, Nov 21, 2025 at 05:09:24PM -0800, John Hubbard wrote:
> On 11/21/25 9:47 AM, Boqun Feng wrote:
> > On Thu, Nov 20, 2025 at 03:16:04PM -0800, John Hubbard wrote:
> >> On 11/20/25 1:45 PM, Lyude Paul wrote:
> >> ...
> >> This is alarming, but is it the final word? In other words, is the Rust
> >> side of this doomed to slower performance forever, or is there some
> >> hope of reaching performance parity with the C part of the kernel?
> >>
> > 
> > Note that local_interrupt API is for safe Rust code, you can always
> > use unsafe local_irq if the interrupt disabling is the performance
> > bottleneck for you. So language-wise there is no difference between Rust
> > and C.
> >
> 
> OK, but there *is* a performance difference between Safe Rust (which is
> the whole point of this project, after all) and C.
> 

Again, this is a premature statement.

First of all, the safe SpinLockIrq API is made to work with other APIs
like CondVar; there are certain design requirements that force it to be
implemented in a certain way. In other words, the cost is justified.

Second, one safe API being slower than unsafe code or C doesn't mean Safe
Rust is slower than C in all cases.

Last but not least, safe Rust is preferred, but that doesn't mean unsafe
code should be avoided completely. If we establish some data that shows
some unsafe code provides better performance, and we have clear
guidelines for the particular scenarios, then it's definitely OK. Hence I
don't fully agree with your saying "Safe Rust is the whole point of this
project"; to me, understanding how we can utilize the type system and
other tools is more of a realistic goal.

> Is 3.6x longer really something we are stuck with? Or is there some other
> way forward that could potentially provide higher performance, for Safe
> Rust?
> 

Well, by 3.6x longer, you mean ~1.3ns vs ~4.5ns, right? And in real world
code, the code in the interrupt-disabling critical section would take
more than a couple of nanoseconds, hence the delta will probably be lost
in the noise. But again, yes, if 3ns turns out to be a bottleneck in a
driver, we are happy to look into it, but you need to show the data.

>  
> >> Do we have to start telling the Rust for Linux story this way: "our
> >> new Rust-based drivers are slower, but memory-safer"?
> >>
> > 
> > I would not jump into that conclusion at the moment, because 1) as I
> > mentioned you can always go into unsafe if something begins the
> > bottleneck, and 2) there is always a gap between micro benchmark results
> > and the whole system performance, being slow on one operation doesn't
> > means the whole system will perform observably worse.
> > 
> > Think about a similar thing in C, we recommend people to use existing
> > locks instead of customized synchronization vi atomics in most cases,
> > and technically, locks can be slower compared to a special
> > synchronization based on atomics, but it's more difficult to mess up.
> > 
> 
> Yes yes, I fully understand that micro benchmarks don't always translate
> to a real-world observable effects. But interrupt operations...those can
> be on a hot path. So it's prudent to worry about these.
> 

Note that these are the interrupt *disabling* operations, which means the
code could otherwise be interrupted outside the critical section. So yes,
it could still be a hot path, but there are more things than 3ns
affecting it here.

Also one thing to notice is that

	local_interrupt_disable();
	<some other function>
	local_interrupt_disable();

should be cheaper than

	local_irq_save();
	<some other function>
	local_irq_save();

because the latter will access the interrupt disabling register twice.
So it's really hard to say whether the new API is strictly worse than
the existing ones.
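
To sketch the idea (hypothetical helper and variable names, ignoring the
preemption and ordering details the real patches have to handle; this is
not the actual patch code):

	void local_interrupt_disable(void)
	{
		/* Only the outermost call touches the interrupt flag register. */
		if (this_cpu_inc_return(irq_disable_count) == 1) {
			unsigned long flags;

			local_irq_save(flags);
			this_cpu_write(irq_disable_saved_flags, flags);
		}
	}

	void local_interrupt_enable(void)
	{
		/* Only the outermost enable restores the saved flags. */
		if (this_cpu_dec_return(irq_disable_count) == 0)
			local_irq_restore(this_cpu_read(irq_disable_saved_flags));
	}

Nested calls only touch the counter, while every local_irq_save() (and
local_irq_restore()) accesses the interrupt state itself.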

Regards,
Boqun

> 
> thanks,
> -- 
> John Hubbard
>
Re: [PATCH v14 00/16] Refcounted interrupts, SpinLockIrq for rust
Posted by John Hubbard 1 week, 2 days ago
On 11/21/25 6:38 PM, Boqun Feng wrote:
> On Fri, Nov 21, 2025 at 05:09:24PM -0800, John Hubbard wrote:
>> On 11/21/25 9:47 AM, Boqun Feng wrote:
>>> On Thu, Nov 20, 2025 at 03:16:04PM -0800, John Hubbard wrote:
>>>> On 11/20/25 1:45 PM, Lyude Paul wrote:
>>>> ...
>> OK, but there *is* a performance difference between Safe Rust (which is
>> the whole point of this project, after all) and C.
> 
> Again, this is a premature statement.

Yes, it is. :)

> 
> First of all, the safe SpinLockIrq API is made to work with other API
> like CondVar, there are certain design requirements making it being
> implemented in a certain way. In other words, the cost is justified.

This argument doesn't apply directly, because the response to "Safe Rust
is slower in this case" is "the slowdown is justified".

Perhaps it is, but that doesn't refute the claim. It's still slower.

Maybe that's not a problem at all, but it's not yet clear.

> 
> Second, one safe API being slow than unsafe code or C doesn't mean Safe
> Rust is slow than C in all the cases.

Yes, we have already mentioned several times that micro-benchmarks are
not the final word. However, they can be an early indication of a problem.

> 
> Last but not least, safe Rust is preferred, but it doesn't mean unsafe
> code should be avoided completely, if we establish some data that shows

Perhaps we need to be slightly more precise. I'm not sure if you are
referring to the usual practice of creating an unsafe block, wrapped
within a safe Rust function, or something else?

> some unsafe code provides better performance and we have clear guideline
> for the particular scenarios, then it's definitely OK. Hence I don't
> fully agree your saying "Safe Rust is the whole point of this project",
> to me understanding how we can utilize the type system and other tools
> is more of a realistic goal.
> 
>> Is 3.6x longer really something we are stuck with? Or is there some other
>> way forward that could potentially provide higher performance, for Safe
>> Rust?
>>
> 
> Well by 3.6x longer, you mean ~1.3ns vs ~4.5ns, right? And in real world
> code, the code in the interrupt disabling critical section would be more
> than couples of nano seconds, hence the delta will probably be
> noise-out. But again, yes if 3ns turns out to be a bottleneck in the
> driver, we are happy to look into, but you need to show the data.
> 

So this is what I'm asking about: given that we *already know* that we 
have a performance drop in the micro-benchmark, is there any reasonable
approach that avoids this? Or has a less noticeable impact?

I'm asking early (see above: I agree that this is "premature"), because
we have early data.

It would be nice to explore now, rather than later, after someone shows
up with detailed perf data about their use case.


thanks,
-- 
John Hubbard
Re: [PATCH v14 00/16] Refcounted interrupts, SpinLockIrq for rust
Posted by Boqun Feng 1 week, 2 days ago
On Fri, Nov 21, 2025 at 06:56:28PM -0800, John Hubbard wrote:
> On 11/21/25 6:38 PM, Boqun Feng wrote:
[...]
> > 
> > Last but not least, safe Rust is preferred, but it doesn't mean unsafe
> > code should be avoided completely, if we establish some data that shows
> 
> Perhaps we need to be slightly more precise. I'm not sure if you are
> referring to the usual practice of creating an unsafe block, wrapped
> within a safe Rust function, or something else?
> 

I was referring to providing an unsafe API for core kernel
functionality, for example local_irq_disable(), and then teaching how to
use it correctly.

> > some unsafe code provides better performance and we have clear guideline
> > for the particular scenarios, then it's definitely OK. Hence I don't
> > fully agree your saying "Safe Rust is the whole point of this project",
> > to me understanding how we can utilize the type system and other tools
> > is more of a realistic goal.
> > 
> >> Is 3.6x longer really something we are stuck with? Or is there some other
> >> way forward that could potentially provide higher performance, for Safe
> >> Rust?
> >>
> > 
> > Well by 3.6x longer, you mean ~1.3ns vs ~4.5ns, right? And in real world
> > code, the code in the interrupt disabling critical section would be more
> > than couples of nano seconds, hence the delta will probably be
> > noise-out. But again, yes if 3ns turns out to be a bottleneck in the
> > driver, we are happy to look into, but you need to show the data.
> > 
> 
> So this is what I'm asking about: given that we *already know* that we 
> have a performance drop in the micro-benchmark, is there any reasonable
> approach that avoids this? Or has a less noticeable impact?
> 

Lyude had tried another approach [1], which uses an unsafe public API
and doesn't work (easily) with CondVar or PREEMPT_RT. That eventually
triggered more discussion about a better API design, and as Thomas
pointed out [2]: "Stop worrying about mostly irrelevant low level
details which are not relevant to the primary audience of rust adoption.
We can worry about them when we replace the scheduler and the low level
interrupt handling code ten years down the road." And I agreed. The
current implementation is actually quite efficient and should even
out-perform the existing API in some cases, as I pointed out. More
importantly, it utilizes the Rust type system and makes it easy to use
(or hard to misuse).

That being said, if anyone has a better idea, feel free to bring it up.

> I'm asking early (see above: I agree that this is "premature"), because
> we have early data.
> 
> It would be nice to explore now, rather than later, after someone shows
> up with detailed perf data about their use case.
> 
> 

Not sure I fully agree with this: given that it's, to my knowledge, the
best solution at the moment, I feel it's hard to justify the cost of
exploring a better solution without a real usage. But then again, if
anyone has a better idea, feel free to bring it up.

[1]: https://lore.kernel.org/rust-for-linux/20240916213025.477225-2-lyude@redhat.com/
[2]: https://lore.kernel.org/rust-for-linux/87iktrahld.ffs@tglx/

Regards,
Boqun

> thanks,
> -- 
> John Hubbard
Re: [PATCH v14 00/16] Refcounted interrupts, SpinLockIrq for rust
Posted by John Hubbard 1 week, 2 days ago
On 11/21/25 7:35 PM, Boqun Feng wrote:
> On Fri, Nov 21, 2025 at 06:56:28PM -0800, John Hubbard wrote:
>> On 11/21/25 6:38 PM, Boqun Feng wrote:
> [...]
>>>
>>> Last but not least, safe Rust is preferred, but it doesn't mean unsafe
>>> code should be avoided completely, if we establish some data that shows
>>
>> Perhaps we need to be slightly more precise. I'm not sure if you are
>> referring to the usual practice of creating an unsafe block, wrapped
>> within a safe Rust function, or something else?
>>
> 
> I was referring to providing an unsafe API for core kernel
> functionality, for example local_irq_disable(), and then teaching how to
> use it correctly.

Ack.

> 
>>> some unsafe code provides better performance and we have clear guideline
>>> for the particular scenarios, then it's definitely OK. Hence I don't
>>> fully agree your saying "Safe Rust is the whole point of this project",
>>> to me understanding how we can utilize the type system and other tools
>>> is more of a realistic goal.
>>>
>>>> Is 3.6x longer really something we are stuck with? Or is there some other
>>>> way forward that could potentially provide higher performance, for Safe
>>>> Rust?
>>>>
>>>
>>> Well by 3.6x longer, you mean ~1.3ns vs ~4.5ns, right? And in real world
>>> code, the code in the interrupt disabling critical section would be more
>>> than couples of nano seconds, hence the delta will probably be
>>> noise-out. But again, yes if 3ns turns out to be a bottleneck in the
>>> driver, we are happy to look into, but you need to show the data.
>>>
>>
>> So this is what I'm asking about: given that we *already know* that we
>> have a performance drop in the micro-benchmark, is there any reasonable
>> approach that avoids this? Or has a less noticeable impact?
>>
> 
> Lyude had tried another approach [1], which uses an unsafe public API,
> and doesn't work (easily) with CondVar or PREEMPT_RT And that eventually
> triggered more discussion about a better API design, and as Thomas
> pointed out [2]: "Stop worrying about mostly irrelevant low level
> details which are not relevant to the primary audience of rust adoption.
> We can worry about them when we replace the scheduler and the low level
> interrupt handling code ten years down the road." And I agreed. The
> current implementation is actually quite efficient and should even
> out-perform the existing API in some cases as I pointed out. More
> importantly, it utilizes Rust type system and make it easy to use (or
> hard to mis-use).
> 
> That being said, if anyone has a better idea, feel free to bring it up.
> 
>> I'm asking early (see above: I agree that this is "premature"), because
>> we have early data.
>>
>> It would be nice to explore now, rather than later, after someone shows
>> up with detailed perf data about their use case.
>>
>>
> 
> Not sure I fully agree with this, given it's to my knowledge the best
> solution at the moment, I feel it's hard to justify the cost of
> exploring a better solution without a real usage. But then again, if
> anyone has any better idea feel free to bring it up.
> 
> [1]: https://lore.kernel.org/rust-for-linux/20240916213025.477225-2-lyude@redhat.com/
> [2]: https://lore.kernel.org/rust-for-linux/87iktrahld.ffs@tglx/
> 

Thanks for this context, I hadn't followed the earlier discussions,
and when looking at this v14, it seemed to gloss over the performance
implications (they were linked to, but not discussed).

I won't further harass you all about this, let's see how it goes. :)

Optionally, it might be helpful to include some top-level notes
that justify the choices made so far.

thanks,
-- 
John Hubbard
Re: [PATCH v14 00/16] Refcounted interrupts, SpinLockIrq for rust
Posted by Boqun Feng 1 week, 2 days ago
On Fri, Nov 21, 2025 at 08:14:59PM -0800, John Hubbard wrote:
[...]
> > 
> > Lyude had tried another approach [1], which uses an unsafe public API,
> > and doesn't work (easily) with CondVar or PREEMPT_RT And that eventually
> > triggered more discussion about a better API design, and as Thomas
> > pointed out [2]: "Stop worrying about mostly irrelevant low level
> > details which are not relevant to the primary audience of rust adoption.
> > We can worry about them when we replace the scheduler and the low level
> > interrupt handling code ten years down the road." And I agreed. The
> > current implementation is actually quite efficient and should even
> > out-perform the existing API in some cases as I pointed out. More
> > importantly, it utilizes Rust type system and make it easy to use (or
> > hard to mis-use).
> > 
> > That being said, if anyone has a better idea, feel free to bring it up.
> > 
> > > I'm asking early (see above: I agree that this is "premature"), because
> > > we have early data.
> > > 
> > > It would be nice to explore now, rather than later, after someone shows
> > > up with detailed perf data about their use case.
> > > 
> > > 
> > 
> > Not sure I fully agree with this, given it's to my knowledge the best
> > solution at the moment, I feel it's hard to justify the cost of
> > exploring a better solution without a real usage. But then again, if
> > anyone has any better idea feel free to bring it up.
> > 
> > [1]: https://lore.kernel.org/rust-for-linux/20240916213025.477225-2-lyude@redhat.com/
> > [2]: https://lore.kernel.org/rust-for-linux/87iktrahld.ffs@tglx/
> > 
> 
> Thanks for this context, I hadn't followed the earlier discussions,
> and when looking at this v14, it seemed to gloss over the performance
> implications (they were linked to, but not discussed).
> 

I could have provided this earlier ;-)

> I won't further harass you all about this, let's see how it goes. :)
> 
> Optionally, it might be helpful to include some top-level notes
> that justify the choices made so far.
> 

Sure, I will keep a note on that when applying the series. Thank you!

Regards,
Boqun

> thanks,
> -- 
> John Hubbard
>