io_uring currently lacks a dedicated mechanism for efficient inter-process
communication. Applications needing low-latency IPC must use pipes, Unix
domain sockets, or hand-roll shared memory with manual synchronization --
none of which integrate with the io_uring completion model or offer
built-in fan-out semantics.
This series adds an IPC channel primitive to io_uring that provides:
- Shared memory ring buffer for zero-copy-style message passing
- Lock-free CAS-based producer (no mutex on the send hot path)
- RCU-based subscriber lookup on send/recv paths
- Three delivery modes:
* Unicast: single consumer, shared consumer.head via cmpxchg
* Broadcast: all subscribers receive every message via
per-subscriber local_head tracking
* Multicast: round-robin distribution across receivers with
cached recv_count for O(1) target selection
- Lazy broadcast consumer.head advancement (amortized O(N) scan
every 16 messages instead of per-recv)
- Channel-based design supporting multiple subscribers across
processes via anonymous file + mmap
- Permission model based on Unix file permissions (mode bits)
- Four registration commands: CREATE, ATTACH, DETACH, DESTROY
- Two new opcodes: IORING_OP_IPC_SEND and IORING_OP_IPC_RECV
- Targeted unicast send via sqe->file_index for subscriber ID
Benchmark results (VM, 32 vCPUs, 32 GB RAM):
Point-to-point latency (1 sender, 1 receiver, ns/msg):
Msg Size    pipe    unix sock   shm+eventfd   io_uring unicast
--------   -----    ---------   -----------   ----------------
64 B         212          436           222                632
256 B        240          845           216                597
1 KB         424          613           550                640
4 KB         673        1,326           350                848
16 KB      1,982        1,477         2,169              1,893
32 KB      4,777        3,667         2,443              3,185
For point-to-point, io_uring IPC is within 1.5-2.5x of pipe/shm for
small messages due to the io_uring submission overhead. At 16-32 KB the
copy cost dominates and all mechanisms converge.
Broadcast to 16 receivers (ns/msg, sender-side):
Msg Size    pipe     unix sock   shm+eventfd   io_uring bcast
--------   ------    ---------   -----------   --------------
64 B       28,550       32,504         5,970            5,674
256 B      27,588       34,777         5,429            6,600
1 KB       28,072       34,845         6,542            6,095
4 KB       37,277       46,154        11,520            6,367
16 KB      57,998       58,348        34,969            7,592
32 KB      89,404       83,496        93,082            8,202
The shared-ring broadcast architecture is the key differentiator: at
16 receivers with 32 KB messages, io_uring broadcast is 10.9x faster
than pipe and 11.3x faster than shm+eventfd because data is written
once to the shared ring rather than copied N times.
Scaling from 1 to 16 receivers (64 B messages):
Mechanism        1 recv   16 recv   Degradation
---------        ------   -------   -----------
pipe                212    28,550          135x
unix sock           436    32,504           75x
shm+eventfd         222     5,970           27x
io_uring bcast      651     5,674          8.7x
io_uring mcast      569     1,406          2.5x
Multicast sender throughput (ns/msg):
Msg Size   1 recv   2 recv   4 recv   8 recv   16 recv
--------   ------   ------   ------   ------   -------
64 B          569      557      726      916     1,406
4 KB          825      763      829    1,395     1,630
32 KB       3,067    3,107    3,218    3,576     4,415
Multicast scales nearly flat because the CAS producer only contends
with the shared consumer.head, not with individual receivers.
Daniel Hodges (2):
io_uring: add high-performance IPC channel infrastructure
selftests/ipc: Add io_uring IPC selftest
MAINTAINERS | 1 +
include/linux/io_uring_types.h | 7 +
include/uapi/linux/io_uring.h | 74 ++
io_uring/Kconfig | 14 +
io_uring/Makefile | 1 +
io_uring/io_uring.c | 6 +
io_uring/ipc.c | 1002 ++++++++++++++++
io_uring/ipc.h | 161 +++
io_uring/opdef.c | 19 +
io_uring/register.c | 25 +
tools/testing/selftests/ipc/Makefile | 2 +-
tools/testing/selftests/ipc/io_uring_ipc.c | 1265 ++++++++++++++++++++
12 files changed, 2576 insertions(+), 1 deletion(-)
create mode 100644 io_uring/ipc.c
create mode 100644 io_uring/ipc.h
create mode 100644 tools/testing/selftests/ipc/io_uring_ipc.c
--
2.52.0
On Thu, Mar 13, 2026 at 01:07:37PM +0000, Daniel Hodges wrote:
> Performance (virtme-ng VM, single-socket, msg_size sweep 64B-32KB):
>
> Point-to-point latency (64B-32KB messages):
> io_uring unicast: 597-3,185 ns/msg (within 1.5-2.5x of pipe for small msgs)
Benchmark sources used to generate the numbers in the cover letter:
io_uring IPC modes (broadcast, multicast, unicast):
https://gist.github.com/hodgesds/fbcd8bb8497bc0ec2bf1f95244a984fe#file-io_uring_ipc_bench-c
IPC comparison (pipes, unix sockets, shm+eventfd):
https://gist.github.com/hodgesds/fbcd8bb8497bc0ec2bf1f95244a984fe#file-ipc_comparison_bench-c
On 3/14/26 7:50 AM, Daniel Hodges wrote:
> On Thu, Mar 13, 2026 at 01:07:37PM +0000, Daniel Hodges wrote:
>> Performance (virtme-ng VM, single-socket, msg_size sweep 64B-32KB):
>>
>> Point-to-point latency (64B-32KB messages):
>> io_uring unicast: 597-3,185 ns/msg (within 1.5-2.5x of pipe for small msgs)
>
> Benchmark sources used to generate the numbers in the cover letter:
>
> io_uring IPC modes (broadcast, multicast, unicast):
> https://gist.github.com/hodgesds/fbcd8bb8497bc0ec2bf1f95244a984fe#file-io_uring_ipc_bench-c
>
> IPC comparison (pipes, unix sockets, shm+eventfd):
> https://gist.github.com/hodgesds/fbcd8bb8497bc0ec2bf1f95244a984fe#file-ipc_comparison_bench-c

Thanks for sending these, was going to ask you about them. I'll take a
look at your patches Monday.

-- 
Jens Axboe
On Sat, Mar 14, 2026 at 10:54:15AM -0600, Jens Axboe wrote:
> On 3/14/26 7:50 AM, Daniel Hodges wrote:
> > On Thu, Mar 13, 2026 at 01:07:37PM +0000, Daniel Hodges wrote:
> >> Performance (virtme-ng VM, single-socket, msg_size sweep 64B-32KB):
> >>
> >> Point-to-point latency (64B-32KB messages):
> >> io_uring unicast: 597-3,185 ns/msg (within 1.5-2.5x of pipe for small msgs)
> >
> > Benchmark sources used to generate the numbers in the cover letter:
> >
> > io_uring IPC modes (broadcast, multicast, unicast):
> > https://gist.github.com/hodgesds/fbcd8bb8497bc0ec2bf1f95244a984fe#file-io_uring_ipc_bench-c
> >
> > IPC comparison (pipes, unix sockets, shm+eventfd):
> > https://gist.github.com/hodgesds/fbcd8bb8497bc0ec2bf1f95244a984fe#file-ipc_comparison_bench-c
>
> Thanks for sending these, was going to ask you about them. I'll take a
> look at your patches Monday.
>
> --
> Jens Axboe

No rush, thanks for taking the time!

-Daniel
On 3/16/26 6:49 AM, Daniel Hodges wrote:
> On Sat, Mar 14, 2026 at 10:54:15AM -0600, Jens Axboe wrote:
>> On 3/14/26 7:50 AM, Daniel Hodges wrote:
>>> On Thu, Mar 13, 2026 at 01:07:37PM +0000, Daniel Hodges wrote:
>>>> Performance (virtme-ng VM, single-socket, msg_size sweep 64B-32KB):
>>>>
>>>> Point-to-point latency (64B-32KB messages):
>>>> io_uring unicast: 597-3,185 ns/msg (within 1.5-2.5x of pipe for small msgs)
>>>
>>> Benchmark sources used to generate the numbers in the cover letter:
>>>
>>> io_uring IPC modes (broadcast, multicast, unicast):
>>> https://gist.github.com/hodgesds/fbcd8bb8497bc0ec2bf1f95244a984fe#file-io_uring_ipc_bench-c
>>>
>>> IPC comparison (pipes, unix sockets, shm+eventfd):
>>> https://gist.github.com/hodgesds/fbcd8bb8497bc0ec2bf1f95244a984fe#file-ipc_comparison_bench-c
>>
>> Thanks for sending these, was going to ask you about them. I'll take a
>> look at your patches Monday.
>>
>> --
>> Jens Axboe
>
> No rush, thanks for taking the time!

I took a look - and I think it's quite apparent that it's an AI vibe
coded patch. Hence my first question is, do you have a specific use case
in mind? Or phrased differently, was this done for a specific use case
you have and want to pursue, or was it more of a "let's see if we can do
this and what it'd look like" kind of thing?

I have a lot of comments on the patch itself, but let's establish the
motivation here first.

-- 
Jens Axboe
On Mon, Mar 16, 2026 at 04:17:05PM -0600, Jens Axboe wrote:
> On 3/16/26 6:49 AM, Daniel Hodges wrote:
> [...]
> I took a look - and I think it's quite apparent that it's an AI vibe
> coded patch. Hence my first question is, do you have a specific use case
> in mind? Or phrased differently, was this done for a specific use case
> you have and want to pursue, or was it more of a "let's see if we can do
> this and what it'd look like" kind of thing?
>
> I have a lot of comments on the patch itself, but let's establish the
> motivation here first.
>
> --
> Jens Axboe

I've been helping Alexandre prototype a D-Bus broker replacement that
scales better on large machines. Here's some docs/benchmarks:
https://github.com/fiorix/sbus/blob/main/sbus-broker/docs/analysis.md

The idea for this RFC came from trying to come up with a design for what
D-Bus would look like if it were built from the ground up to scale on
large machines. D-Bus was built because the kernel never really had a
broadcast/multicast solution for IPC, and kdbus demonstrated that moving
D-Bus into the kernel wasn't viable either. So that's where I sort of
landed on the idea of what if io_uring could be used for this type of
IPC.

There isn't a working io_uring-backed D-Bus implementation yet, as it
would require features that aren't in this patch, such as handling
credentials etc. I fully acknowledge I had AI help in working on this,
but if this idea makes sense I would appreciate some human direction. If
it seems like it could be feasible from your perspective I would like to
try to give it a proper attempt. Thanks!

-Daniel
On 3/16/26 5:13 PM, Daniel Hodges wrote:
> On Mon, Mar 16, 2026 at 04:17:05PM -0600, Jens Axboe wrote:
> [...]
>> I took a look - and I think it's quite apparent that it's an AI vibe
>> coded patch. Hence my first question is, do you have a specific use case
>> in mind? Or phrased differently, was this done for a specific use case
>> you have and want to pursue, or was it more of a "let's see if we can do
>> this and what it'd look like" kind of thing?
>>
>> I have a lot of comments on the patch itself, but let's establish the
>> motivation here first.
>
> I've been helping Alexandre prototype a D-Bus broker replacement that
> scales better on large machines. Here's some docs/benchmarks:
> https://github.com/fiorix/sbus/blob/main/sbus-broker/docs/analysis.md
>
> The idea for this RFC came from trying to come up with a design for what
> D-Bus would look like if it were built from the ground up to scale on
> large machines. D-Bus was built because the kernel never really had a
> broadcast/multicast solution for IPC, and kdbus demonstrated that moving
> D-Bus into the kernel wasn't viable either. So that's where I sort of
> landed on the idea of what if io_uring could be used for this type of
> IPC.
>
> There isn't a working io_uring-backed D-Bus implementation yet, as it
> would require features that aren't in this patch, such as handling
> credentials etc. I fully acknowledge I had AI help in working on this,
> but if this idea makes sense I would appreciate some human direction. If
> it seems like it could be feasible from your perspective I would like to
> try to give it a proper attempt. Thanks!

OK, thanks for the explanation! I do think it makes sense to do, and
starting with the basic mechanism first makes sense. I haven't read your
link yet, but I suppose that had details on what else would be needed
feature-wise on top of the base?

-- 
Jens Axboe
On Mon, Mar 16, 2026 at 07:13:42PM -0400, Daniel Hodges wrote:
> [...]
> I've been helping Alexandre prototype a D-Bus broker replacement that
> scales better on large machines. Here's some docs/benchmarks:
> https://github.com/fiorix/sbus/blob/main/sbus-broker/docs/analysis.md
> [...]

I just realized the link I sent is private, here's a link to the D-Bus
broker docs/benchmarks from my fork:
https://github.com/hodgesds/dbus-rust/blob/main/sbus-broker/docs/analysis.md
On 3/14/26 10:54 AM, Jens Axboe wrote:
> On 3/14/26 7:50 AM, Daniel Hodges wrote:
>> On Thu, Mar 13, 2026 at 01:07:37PM +0000, Daniel Hodges wrote:
>>> Performance (virtme-ng VM, single-socket, msg_size sweep 64B-32KB):
>>>
>>> Point-to-point latency (64B-32KB messages):
>>> io_uring unicast: 597-3,185 ns/msg (within 1.5-2.5x of pipe for small msgs)
>>
>> Benchmark sources used to generate the numbers in the cover letter:
>>
>> io_uring IPC modes (broadcast, multicast, unicast):
>> https://gist.github.com/hodgesds/fbcd8bb8497bc0ec2bf1f95244a984fe#file-io_uring_ipc_bench-c
>>
>> IPC comparison (pipes, unix sockets, shm+eventfd):
>> https://gist.github.com/hodgesds/fbcd8bb8497bc0ec2bf1f95244a984fe#file-ipc_comparison_bench-c
>
> Thanks for sending these, was going to ask you about them. I'll take a
> look at your patches Monday.

Just a side note since I peeked at it a bit - this is using the raw
interface? But more importantly, you'd definitely want
IORING_SETUP_DEFER_TASKRUN and IORING_SETUP_SINGLE_ISSUER set in those
init flags.

-- 
Jens Axboe