io_uring currently lacks a dedicated mechanism for efficient inter-process
communication. Applications needing low-latency IPC must use pipes, Unix
domain sockets, or hand-roll shared memory with manual synchronization --
none of which integrate with the io_uring completion model or offer
built-in fan-out semantics.
This series adds an IPC channel primitive to io_uring that provides:
- Shared memory ring buffer for zero-copy-style message passing
- Lock-free CAS-based producer (no mutex on the send hot path)
- RCU-based subscriber lookup on send/recv paths
- Three delivery modes:
* Unicast: single consumer, shared consumer.head via cmpxchg
* Broadcast: all subscribers receive every message via
per-subscriber local_head tracking
* Multicast: round-robin distribution across receivers with
cached recv_count for O(1) target selection
- Lazy broadcast consumer.head advancement (amortized O(N) scan
every 16 messages instead of per-recv)
- Channel-based design supporting multiple subscribers across
processes via anonymous file + mmap
- Permission model based on Unix file permissions (mode bits)
- Four registration commands: CREATE, ATTACH, DETACH, DESTROY
- Two new opcodes: IORING_OP_IPC_SEND and IORING_OP_IPC_RECV
- Targeted unicast send via sqe->file_index for subscriber ID
Benchmark results (VM, 32 vCPUs, 32 GB RAM):
Point-to-point latency (1 sender, 1 receiver, ns/msg):
Msg Size    pipe    unix sock   shm+eventfd   io_uring unicast
--------   -----    ---------   -----------   ----------------
64 B         212          436           222                632
256 B        240          845           216                597
1 KB         424          613           550                640
4 KB         673        1,326           350                848
16 KB      1,982        1,477         2,169              1,893
32 KB      4,777        3,667         2,443              3,185
For point-to-point, io_uring IPC is within 1.5-2.5x of pipe/shm for
small messages due to the io_uring submission overhead. At 16-32 KB the
copy cost dominates and all mechanisms converge.
Broadcast to 16 receivers (ns/msg, sender-side):
Msg Size    pipe     unix sock   shm+eventfd   io_uring bcast
--------   ------    ---------   -----------   --------------
64 B       28,550       32,504         5,970            5,674
256 B      27,588       34,777         5,429            6,600
1 KB       28,072       34,845         6,542            6,095
4 KB       37,277       46,154        11,520            6,367
16 KB      57,998       58,348        34,969            7,592
32 KB      89,404       83,496        93,082            8,202
The shared-ring broadcast architecture is the key differentiator: at
16 receivers with 32 KB messages, io_uring broadcast is 10.9x faster
than pipe and 11.3x faster than shm+eventfd because data is written
once to the shared ring rather than copied N times.
Scaling from 1 to 16 receivers (64 B messages):
Mechanism        1 recv   16 recv   Degradation
---------        ------   -------   -----------
pipe                212    28,550          135x
unix sock           436    32,504           75x
shm+eventfd         222     5,970           27x
io_uring bcast      651     5,674          8.7x
io_uring mcast      569     1,406          2.5x
Multicast sender throughput (ns/msg):
Msg Size   1 recv   2 recv   4 recv   8 recv   16 recv
--------   ------   ------   ------   ------   -------
64 B          569      557      726      916     1,406
4 KB          825      763      829    1,395     1,630
32 KB       3,067    3,107    3,218    3,576     4,415
Multicast scales nearly flat because the CAS producer only contends
with the shared consumer.head, not with individual receivers.
Daniel Hodges (2):
io_uring: add high-performance IPC channel infrastructure
selftests/ipc: Add io_uring IPC selftest
MAINTAINERS | 1 +
include/linux/io_uring_types.h | 7 +
include/uapi/linux/io_uring.h | 74 ++
io_uring/Kconfig | 14 +
io_uring/Makefile | 1 +
io_uring/io_uring.c | 6 +
io_uring/ipc.c | 1002 ++++++++++++++++
io_uring/ipc.h | 161 +++
io_uring/opdef.c | 19 +
io_uring/register.c | 25 +
tools/testing/selftests/ipc/Makefile | 2 +-
tools/testing/selftests/ipc/io_uring_ipc.c | 1265 ++++++++++++++++++++
12 files changed, 2576 insertions(+), 1 deletion(-)
create mode 100644 io_uring/ipc.c
create mode 100644 io_uring/ipc.h
create mode 100644 tools/testing/selftests/ipc/io_uring_ipc.c
--
2.52.0
On Thu, Mar 13, 2026 at 01:07:37PM +0000, Daniel Hodges wrote:
> Performance (virtme-ng VM, single-socket, msg_size sweep 64B-32KB):
>
> Point-to-point latency (64B-32KB messages):
> io_uring unicast: 597-3,185 ns/msg (within 1.5-2.5x of pipe for small msgs)
Benchmark sources used to generate the numbers in the cover letter:
io_uring IPC modes (broadcast, multicast, unicast):
https://gist.github.com/hodgesds/fbcd8bb8497bc0ec2bf1f95244a984fe#file-io_uring_ipc_bench-c
IPC comparison (pipes, unix sockets, shm+eventfd):
https://gist.github.com/hodgesds/fbcd8bb8497bc0ec2bf1f95244a984fe#file-ipc_comparison_bench-c
On 3/14/26 7:50 AM, Daniel Hodges wrote:
> On Thu, Mar 13, 2026 at 01:07:37PM +0000, Daniel Hodges wrote:
>> Performance (virtme-ng VM, single-socket, msg_size sweep 64B-32KB):
>>
>> Point-to-point latency (64B-32KB messages):
>> io_uring unicast: 597-3,185 ns/msg (within 1.5-2.5x of pipe for small msgs)
>
> Benchmark sources used to generate the numbers in the cover letter:
>
> io_uring IPC modes (broadcast, multicast, unicast):
> https://gist.github.com/hodgesds/fbcd8bb8497bc0ec2bf1f95244a984fe#file-io_uring_ipc_bench-c
>
> IPC comparison (pipes, unix sockets, shm+eventfd):
> https://gist.github.com/hodgesds/fbcd8bb8497bc0ec2bf1f95244a984fe#file-ipc_comparison_bench-c

Thanks for sending these, was going to ask you about them. I'll take a
look at your patches Monday.

-- 
Jens Axboe
On Sat, Mar 14, 2026 at 10:54:15AM -0600, Jens Axboe wrote:
> On 3/14/26 7:50 AM, Daniel Hodges wrote:
> > On Thu, Mar 13, 2026 at 01:07:37PM +0000, Daniel Hodges wrote:
> >> Performance (virtme-ng VM, single-socket, msg_size sweep 64B-32KB):
> >>
> >> Point-to-point latency (64B-32KB messages):
> >> io_uring unicast: 597-3,185 ns/msg (within 1.5-2.5x of pipe for small msgs)
> >
> > Benchmark sources used to generate the numbers in the cover letter:
> >
> > io_uring IPC modes (broadcast, multicast, unicast):
> > https://gist.github.com/hodgesds/fbcd8bb8497bc0ec2bf1f95244a984fe#file-io_uring_ipc_bench-c
> >
> > IPC comparison (pipes, unix sockets, shm+eventfd):
> > https://gist.github.com/hodgesds/fbcd8bb8497bc0ec2bf1f95244a984fe#file-ipc_comparison_bench-c
>
> Thanks for sending these, was going to ask you about them. I'll take a
> look at your patches Monday.
>
> --
> Jens Axboe

No rush, thanks for taking the time!

-Daniel
On 3/16/26 6:49 AM, Daniel Hodges wrote:
> On Sat, Mar 14, 2026 at 10:54:15AM -0600, Jens Axboe wrote:
>> On 3/14/26 7:50 AM, Daniel Hodges wrote:
>>> On Thu, Mar 13, 2026 at 01:07:37PM +0000, Daniel Hodges wrote:
>>>> Performance (virtme-ng VM, single-socket, msg_size sweep 64B-32KB):
>>>>
>>>> Point-to-point latency (64B-32KB messages):
>>>> io_uring unicast: 597-3,185 ns/msg (within 1.5-2.5x of pipe for small msgs)
>>>
>>> Benchmark sources used to generate the numbers in the cover letter:
>>>
>>> io_uring IPC modes (broadcast, multicast, unicast):
>>> https://gist.github.com/hodgesds/fbcd8bb8497bc0ec2bf1f95244a984fe#file-io_uring_ipc_bench-c
>>>
>>> IPC comparison (pipes, unix sockets, shm+eventfd):
>>> https://gist.github.com/hodgesds/fbcd8bb8497bc0ec2bf1f95244a984fe#file-ipc_comparison_bench-c
>>
>> Thanks for sending these, was going to ask you about them. I'll take a
>> look at your patches Monday.
>>
>> --
>> Jens Axboe
>
> No rush, thanks for taking the time!

I took a look - and I think it's quite apparent that it's an AI vibe
coded patch. Hence my first question is, do you have a specific use case
in mind? Or phrased differently, was this done for a specific use case
you have and want to pursue, or was it more of a "let's see if we can do
this and what it'd look like" kind of thing?

I have a lot of comments on the patch itself, but let's establish the
motivation here first.

-- 
Jens Axboe
On Mon, Mar 16, 2026 at 04:17:05PM -0600, Jens Axboe wrote:
> On 3/16/26 6:49 AM, Daniel Hodges wrote:
> [...]
> I took a look - and I think it's quite apparent that it's an AI vibe
> coded patch. Hence my first question is, do you have a specific use case
> in mind? Or phrased differently, was this done for a specific use case
> you have and want to pursue, or was it more of a "let's see if we can do
> this and what it'd look like" kind of thing?
>
> I have a lot of comments on the patch itself, but let's establish the
> motivation here first.
>
> --
> Jens Axboe

I've been helping Alexandre prototype a D-Bus broker replacement that
scales better on large machines. Here's some docs/benchmarks:
https://github.com/fiorix/sbus/blob/main/sbus-broker/docs/analysis.md

The idea for this RFC came from trying to come up with a design for what
D-Bus would look like if it were built from the ground up to scale on
large machines. D-Bus was built because the kernel never really had a
broadcast/multicast solution for IPC, and kdbus demonstrated that moving
D-Bus into the kernel wasn't viable either. So that's where I sort of
landed on the idea of what if io_uring could be used for this type of
IPC.

There isn't a working io_uring-backed D-Bus implementation yet, as it
would require features that aren't in this patch, such as handling
credentials etc. I fully acknowledge I had AI help in working on this,
but if this idea makes sense I would appreciate some human direction. If
it seems like it could be feasible from your perspective I would like to
try to give it a proper attempt. Thanks!

-Daniel
On 3/16/26 5:13 PM, Daniel Hodges wrote:
> On Mon, Mar 16, 2026 at 04:17:05PM -0600, Jens Axboe wrote:
> [...]
>> I took a look - and I think it's quite apparent that it's an AI vibe
>> coded patch. Hence my first question is, do you have a specific use case
>> in mind? Or phrased differently, was this done for a specific use case
>> you have and want to pursue, or was it more of a "let's see if we can do
>> this and what it'd look like" kind of thing?
>>
>> I have a lot of comments on the patch itself, but let's establish the
>> motivation here first.
>
> I've been helping Alexandre prototype a D-Bus broker replacement that
> scales better on large machines. Here's some docs/benchmarks:
> https://github.com/fiorix/sbus/blob/main/sbus-broker/docs/analysis.md
>
> The idea for this RFC came from trying to come up with a design for what
> D-Bus would look like if it were built from the ground up to scale on
> large machines. D-Bus was built because the kernel never really had a
> broadcast/multicast solution for IPC, and kdbus demonstrated that moving
> D-Bus into the kernel wasn't viable either. So that's where I sort of
> landed on the idea of what if io_uring could be used for this type of
> IPC.
>
> There isn't a working io_uring-backed D-Bus implementation yet, as it
> would require features that aren't in this patch, such as handling
> credentials etc. I fully acknowledge I had AI help in working on this,
> but if this idea makes sense I would appreciate some human direction. If
> it seems like it could be feasible from your perspective I would like to
> try to give it a proper attempt. Thanks!

OK, thanks for the explanation! I do think it makes sense to do, and
starting with the basic mechanism first makes sense. I haven't read your
link yet, but I suppose that had details on what else would be needed
feature-wise on top of the base?

-- 
Jens Axboe
On Mon, Mar 16, 2026 at 07:13:42PM -0400, Daniel Hodges wrote:
> [...]
> I've been helping Alexandre prototype a D-Bus broker replacement that
> scales better on large machines. Here's some docs/benchmarks:
> https://github.com/fiorix/sbus/blob/main/sbus-broker/docs/analysis.md
> [...]

I just realized the link I sent is private, here's a link to the D-Bus
broker docs/benchmarks from my fork:
https://github.com/hodgesds/dbus-rust/blob/main/sbus-broker/docs/analysis.md
On 3/14/26 10:54 AM, Jens Axboe wrote:
> On 3/14/26 7:50 AM, Daniel Hodges wrote:
>> On Thu, Mar 13, 2026 at 01:07:37PM +0000, Daniel Hodges wrote:
>>> Performance (virtme-ng VM, single-socket, msg_size sweep 64B-32KB):
>>>
>>> Point-to-point latency (64B-32KB messages):
>>> io_uring unicast: 597-3,185 ns/msg (within 1.5-2.5x of pipe for small msgs)
>>
>> Benchmark sources used to generate the numbers in the cover letter:
>>
>> io_uring IPC modes (broadcast, multicast, unicast):
>> https://gist.github.com/hodgesds/fbcd8bb8497bc0ec2bf1f95244a984fe#file-io_uring_ipc_bench-c
>>
>> IPC comparison (pipes, unix sockets, shm+eventfd):
>> https://gist.github.com/hodgesds/fbcd8bb8497bc0ec2bf1f95244a984fe#file-ipc_comparison_bench-c
>
> Thanks for sending these, was going to ask you about them. I'll take a
> look at your patches Monday.

Just a side note since I peeked at it a bit - this is using the raw
interface? But more importantly, you'd definitely want
IORING_SETUP_DEFER_TASKRUN and IORING_SETUP_SINGLE_ISSUER set in those
init flags.

-- 
Jens Axboe