From: Geliang Tang <tanggeliang@kylinos.cn>
This series (previously named "MPTCP support to NVMe over TCP") went
through three RFC versions, which were sent to Hannes in May 2025 and
revised based on his input. After that, I upstreamed the prerequisite
"mptcp: implement .read_sock" series, which was merged into the Linux
kernel in February 2026.
After several rounds of iteration on the MPTCP mailing list, this set
addresses all the reviewer comments (including Sashiko's) and fixes the
identified issues.
This topic was presented as a discussion item at LSF/MM/BPF 2026.
During the "NVMe over MPTCP" [1] discussion at the conference, it was
concluded that MPTCP should be treated as a new transport type, rather than
a TCP variant. A request will be submitted to the NVMe working group to
officially allocate a transport value for MPTCP.
This series requires no user-space changes (libnvme, nvme-cli).
Later, MPTCP KTLS support will be added, and a follow-up series will be
sent to enable TLS for NVMe over MPTCP.
Based on NVMe Multipath and Block Multiqueue, each TCP queue is converted
into one MPTCP queue. This is achieved by abstracting six socket helpers
(set_nodelay, set_reuseaddr, no_linger, etc.) into per-transport
structures. Inside each MPTCP queue, multiple subflows using different
IP addresses aggregate multi-NIC bandwidth and provide fail-over
resilience.
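
For readers unfamiliar with the approach, the per-transport abstraction
can be illustrated with a small user-space sketch. This is not the code
from the patches: the struct name xport_sock_ops, the helper names, and
xport_lookup()/xport_open() are all hypothetical, and only three of the
six helpers mentioned above are shown. It assumes a Linux system where
IPPROTO_MPTCP is 262. In user space the same setsockopt() calls work for
both protocols, so both table entries share one implementation; the
point is only the shape of the indirection, where the caller picks a
table entry by transport name and never hard-codes the protocol.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262	/* Linux value; older libc headers lack it */
#endif

/* Hypothetical per-transport table of socket helpers (illustration only). */
struct xport_sock_ops {
	const char *name;
	int protocol;			/* passed to socket(2) */
	int (*set_nodelay)(int fd);
	int (*set_reuseaddr)(int fd);
	int (*no_linger)(int fd);
};

static int sk_set_nodelay(int fd)
{
	int one = 1;

	return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
}

static int sk_set_reuseaddr(int fd)
{
	int one = 1;

	return setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
}

static int sk_no_linger(int fd)
{
	/* Linger enabled with a zero timeout: close() does not wait. */
	struct linger lg = { .l_onoff = 1, .l_linger = 0 };

	return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}

/* One entry per transport; assumed layout, not taken from the series. */
static const struct xport_sock_ops xport_table[] = {
	{ "tcp",   IPPROTO_TCP,   sk_set_nodelay, sk_set_reuseaddr, sk_no_linger },
	{ "mptcp", IPPROTO_MPTCP, sk_set_nodelay, sk_set_reuseaddr, sk_no_linger },
};

/* Return the ops for a transport name, or NULL if it is unknown. */
const struct xport_sock_ops *xport_lookup(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(xport_table) / sizeof(xport_table[0]); i++)
		if (strcmp(xport_table[i].name, name) == 0)
			return &xport_table[i];
	return NULL;
}

/* Create one queue socket through the table; returns an fd or -1. */
int xport_open(const struct xport_sock_ops *ops)
{
	int fd = socket(AF_INET, SOCK_STREAM, ops->protocol);

	if (fd < 0)
		return -1;
	if (ops->set_reuseaddr(fd) || ops->set_nodelay(fd) ||
	    ops->no_linger(fd)) {
		close(fd);
		return -1;
	}
	return fd;
}
```

A caller would use xport_open(xport_lookup("mptcp")) where it previously
created a plain TCP socket. On a kernel without MPTCP support the
socket() call fails and xport_open() returns -1.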
Patch 10 demonstrates that with a single NVMe multipath configuration and
four network interfaces, MPTCP achieves four times the bandwidth of TCP.
Patch 11 demonstrates that with four NVMe multipath paths, using the
round-robin I/O policy and a lossy four-interface environment, MPTCP
still achieves four times the bandwidth of TCP.
[1] https://lore.kernel.org/linux-nvme/a9f115aa5719e1088702a3fdeee766a3166611b1.camel@kernel.org/
Cc: Hannes Reinecke <hare@suse.de>
Cc: John Meneghini <jmeneghi@redhat.com>
Cc: Randy Jennings <randyj@purestorage.com>
Cc: Nilay Shroff <nilay@linux.ibm.com>
Co-developed-by: zhenwei pi <zhenwei.pi@linux.dev>
Signed-off-by: zhenwei pi <zhenwei.pi@linux.dev>
Co-developed-by: Hui Zhu <zhuhui@kylinos.cn>
Signed-off-by: Hui Zhu <zhuhui@kylinos.cn>
Co-developed-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Gang Yan <yangang@kylinos.cn>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Geliang Tang (11):
nvmet-tcp: define accept tcp_proto struct
nvmet-tcp: implement accept mptcp proto
nvmet-tcp: define listen socket ops
nvmet-tcp: register target mptcp transport
nvmet-tcp: implement mptcp listen socket ops
nvme-fabrics: compare transport in ip_options_match
nvme-tcp: define host tcp_proto struct
nvme-tcp: register host mptcp transport
nvme-tcp: implement host mptcp proto
selftests: mptcp: add nvme over mptcp test
selftests: mptcp: nvme: add iopolicy tests
drivers/nvme/host/fabrics.c | 1 +
drivers/nvme/host/tcp.c | 101 ++++-
drivers/nvme/target/configfs.c | 1 +
drivers/nvme/target/tcp.c | 128 +++++-
include/linux/nvme.h | 1 +
include/net/mptcp.h | 31 ++
net/mptcp/sockopt.c | 149 +++++++
tools/testing/selftests/net/mptcp/Makefile | 1 +
tools/testing/selftests/net/mptcp/config | 8 +
.../testing/selftests/net/mptcp/mptcp_lib.sh | 12 +
.../testing/selftests/net/mptcp/mptcp_nvme.sh | 397 ++++++++++++++++++
11 files changed, 813 insertions(+), 17 deletions(-)
create mode 100755 tools/testing/selftests/net/mptcp/mptcp_nvme.sh
--
2.53.0