[PATCH mptcp-next v15 0/6] implement mptcp read_sock

Geliang Tang posted 6 patches 1 week ago
git fetch https://github.com/multipath-tcp/mptcp_net-next tags/patchew/cover.1765023923.git.tanggeliang@kylinos.cn
[PATCH mptcp-next v15 0/6] implement mptcp read_sock
Posted by Geliang Tang 1 week ago
From: Geliang Tang <tanggeliang@kylinos.cn>

v15:
 - Patch 2, remove the maximum length limit as Mat suggested.
 - Move tcp_recv_should_stop() helper out of the series as Mat
   suggested.

v14:
 - Patch 2, new helper __mptcp_read_sock() with a noack parameter;
   this makes it more similar to __tcp_read_sock() and also prepares
   for the use of mptcp_read_sock_noack() in MPTCP KTLS support. Also
   invoke msk_owned_by_me() in it to make sure the socket is locked.
 - Patch 5, export tcp_splice_data_recv() as Paolo suggested in v7.
 - Patch 6, drop mptcp_splice_data_recv().
 - Link: https://patchwork.kernel.org/project/mptcp/cover/cover.1763974740.git.tanggeliang@kylinos.cn/

v13:
 - rebase on "mptcp: introduce backlog processing" v6
 - Link: https://patchwork.kernel.org/project/mptcp/cover/cover.1761198660.git.geliang@kernel.org/

v12:
 - rebase on "mptcp: receive path improvement" 1-7.
 - some cleanups.

v11:
 - drop "tcp: drop release and lock again in splice_read", and add this
   release and lock again in mptcp_splice_read too. (Thanks Mat, I didn't
   understand the intent of this code before.)
 - call mptcp_rps_record_subflows() in mptcp_splice_read as Mat
   suggested.

v10:
 - add an offset parameter for mptcp_recv_skb and make it more like
   tcp_recv_skb.
 - Link: https://patchwork.kernel.org/project/mptcp/cover/cover.1756780274.git.tanggeliang@kylinos.cn/

v9:
 - merge the squash-to patches.
 - a new patch "drop release and lock again in splice_read".
 - Link: https://patchwork.kernel.org/project/mptcp/cover/cover.1752399660.git.tanggeliang@kylinos.cn/

v8:
 - export struct tcp_splice_state and tcp_splice_data_recv() in net/tcp.h.
 - add a new helper mptcp_recv_should_stop.
 - add mptcp_connect_splice.sh.
 - update commit logs.

v7:
 - only patch 1 and patch 2 changed.
 - add a new helper mptcp_eat_recv_skb.
 - invoke skb_peek in mptcp_recv_skb().
 - use while ((skb = mptcp_recv_skb(sk)) != NULL) instead of
   skb_queue_walk_safe(&sk->sk_receive_queue, skb, tmp).

v6:
 - address Paolo's comments for v4, v5 (thanks)

v5:
 - extract the common code of __mptcp_recvmsg_mskq() and mptcp_read_sock()
   into a new helper __mptcp_recvmsg_desc() to reduce code duplication.

v4:
 - v3 doesn't work for the MPTCP fallback tests in mptcp_connect.sh; this
   set fixes it.
 - invoke __mptcp_move_skbs in mptcp_read_sock.
 - use INDIRECT_CALL_INET_1 in __tcp_splice_read.

v3:
 - merge the two squash-to patches.
 - use sk->sk_rcvbuf instead of INT_MAX as the max len in
   mptcp_read_sock().
 - add splice io mode for mptcp_connect and drop mptcp_splice.c test.
 - the splice test for packetdrill is also added here:
   https://github.com/multipath-tcp/packetdrill/pull/162

v2:
 - set splice_read of mptcp
 - add a splice selftest.

I have good news! I recently added MPTCP support to "NVME over TCP",
and my RFC patches are under review by the NVME maintainer, Hannes.

Switching "NVME over TCP" to MPTCP is very simple. I used IPPROTO_MPTCP
instead of IPPROTO_TCP to create MPTCP sockets on both the target and
host sides. These sockets are created in kernel space.

nvmet_tcp_add_port:

	ret = sock_create(port->addr.ss_family, SOCK_STREAM,
				IPPROTO_MPTCP, &port->sock);

nvme_tcp_alloc_queue:

	ret = sock_create_kern(current->nsproxy->net_ns,
			ctrl->addr.ss_family, SOCK_STREAM,
			IPPROTO_MPTCP, &queue->sock);
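
For illustration only (not part of this series): the same protocol switch
can be tried from user space. The sketch below assumes a Linux system and
uses a hypothetical helper, open_stream_socket(), which asks for an MPTCP
socket first and falls back to plain TCP when the kernel refuses. The
fallback policy is an assumption made for the example, not something the
NVME patches are stated to do.

```c
/* Illustrative user-space sketch; not taken from the patches. */
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262	/* value used by the Linux UAPI headers */
#endif

/*
 * Hypothetical helper: try to create an MPTCP socket and fall back to
 * plain TCP if that fails for any reason (e.g. MPTCP not built in or
 * disabled). Returns the fd, or -1 if both attempts fail, and reports
 * through *used_mptcp which protocol was attempted last.
 */
static int open_stream_socket(int family, int *used_mptcp)
{
	int fd;

	assert(used_mptcp != NULL);

	fd = socket(family, SOCK_STREAM, IPPROTO_MPTCP);
	if (fd >= 0) {
		*used_mptcp = 1;
		return fd;
	}

	*used_mptcp = 0;
	return socket(family, SOCK_STREAM, IPPROTO_TCP);
}
```

A caller would use the returned fd exactly like a TCP socket (bind(),
connect(), read(), ...); only the protocol argument differs.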

nvme_tcp_try_recv() needs to call the .read_sock interface of struct
proto_ops, which was not implemented in MPTCP. So I implemented it,
using __mptcp_recvmsg_mskq() as a reference.
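
As a rough user-space model of the read_sock pattern (not the kernel code
from patch 2): the caller supplies a descriptor and an "actor" callback,
and the loop peeks at the head buffer, lets the actor consume from it, and
only removes ("eats") the buffer once it is fully consumed. All type and
function names below (struct buf, struct rx_queue, struct read_desc,
copy_actor, peek_buf, model_read_sock) are hypothetical stand-ins chosen
for this sketch.

```c
/* User-space model of the read_sock pattern; all names are hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct buf {			/* stands in for struct sk_buff */
	const char *data;
	size_t len;
};

struct rx_queue {		/* stands in for the socket receive queue */
	struct buf *bufs;
	size_t nr;		/* number of buffers in the array */
	size_t head;		/* index of the first unread buffer */
	size_t offset;		/* bytes already consumed from the head */
};

struct read_desc {		/* stands in for read_descriptor_t */
	char *dst;
	size_t count;		/* room left in dst */
};

/* The actor returns how many bytes it consumed from the buffer. */
typedef size_t (*read_actor_t)(struct read_desc *desc, const struct buf *b,
			       size_t offset, size_t len);

/* Example actor: copy as much as fits into the descriptor's buffer. */
static size_t copy_actor(struct read_desc *desc, const struct buf *b,
			 size_t offset, size_t len)
{
	size_t n = len < desc->count ? len : desc->count;

	memcpy(desc->dst, b->data + offset, n);
	desc->dst += n;
	desc->count -= n;
	return n;
}

/* Peek at the head buffer without removing it from the queue. */
static const struct buf *peek_buf(const struct rx_queue *q)
{
	return q->head < q->nr ? &q->bufs[q->head] : NULL;
}

/* Feed queued data to the actor until the queue or the descriptor runs out. */
static size_t model_read_sock(struct rx_queue *q, struct read_desc *desc,
			      read_actor_t actor)
{
	const struct buf *b;
	size_t copied = 0;

	while ((b = peek_buf(q)) != NULL) {
		size_t avail = b->len - q->offset;
		size_t used = actor(desc, b, q->offset, avail);

		assert(used <= avail);
		copied += used;
		q->offset += used;

		if (q->offset == b->len) {	/* fully consumed: "eat" it */
			q->head++;
			q->offset = 0;
		}
		if (used == 0 || desc->count == 0)
			break;
	}
	return copied;
}
```

A partially consumed buffer stays at the head of the queue with its
offset advanced, so a later call resumes where the previous one stopped.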

Since the NVME patches are still under review, I am only sending the
MPTCP patches in this set to the MPTCP ML for your opinions.

Geliang Tang (6):
  mptcp: add eat_recv_skb helper
  mptcp: implement .read_sock
  tcp: export tcp_splice_state
  mptcp: implement .splice_read
  selftests: mptcp: add splice io mode
  selftests: mptcp: connect: cover splice mode

 include/net/tcp.h                             |  11 +
 net/ipv4/tcp.c                                |  13 +-
 net/mptcp/protocol.c                          | 215 +++++++++++++++++-
 tools/testing/selftests/net/mptcp/Makefile    |   1 +
 .../selftests/net/mptcp/mptcp_connect.c       |  63 ++++-
 .../net/mptcp/mptcp_connect_splice.sh         |   5 +
 6 files changed, 289 insertions(+), 19 deletions(-)
 create mode 100755 tools/testing/selftests/net/mptcp/mptcp_connect_splice.sh

-- 
2.51.0
Re: [PATCH mptcp-next v15 0/6] implement mptcp read_sock
Posted by Matthieu Baerts 2 days, 7 hours ago
Hi Geliang, Mat,

On 06/12/2025 13:33, Geliang Tang wrote:
> From: Geliang Tang <tanggeliang@kylinos.cn>
> 
> v15:
>  - Patch 2, remove the maximum length limit as Mat suggested.
>  - Move tcp_recv_should_stop() helper out of the series as Mat
>    suggested.

(...)

> Geliang Tang (6):
>   mptcp: add eat_recv_skb helper
>   mptcp: implement .read_sock
>   tcp: export tcp_splice_state
>   mptcp: implement .splice_read
>   selftests: mptcp: add splice io mode
>   selftests: mptcp: connect: cover splice mode

Thank you for this series and the reviews!

Now in our tree (I have one question on patch 2/6, but not blocking).

New patches for t/upstream:
- e78890a6ea4b: mptcp: add eat_recv_skb helper
- cd77bb397ce5: mptcp: implement .read_sock
- 5e585fdd542a: tcp: export tcp_splice_state
  - f7f50269c753: tg:msg: switch Mat's RvB to Acked-by
- 95e576d9d745: mptcp: implement .splice_read
- 5c48da682a24: selftests: mptcp: add splice io mode
- 8a1663feb989: selftests: mptcp: connect: cover splice mode
- Results: ff3fd5f60460..51ec4882d220 (export)

Tests are now in progress:

- export:
https://github.com/multipath-tcp/mptcp_net-next/commit/cdae7102afdc1eeafc0fa217f76f746ffd67c561/checks

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.
Re: [PATCH mptcp-next v15 0/6] implement mptcp read_sock
Posted by Mat Martineau 4 days ago
On Sat, 6 Dec 2025, Geliang Tang wrote:

> From: Geliang Tang <tanggeliang@kylinos.cn>
>
> v15:
> - Patch 2, remove the maximum length limit as Mat suggested.
> - Move tcp_recv_should_stop() helper out of the series as Mat
>   suggested.

Thanks Geliang, series LGTM:

Reviewed-by: Mat Martineau <martineau@kernel.org>

(except patch 2 which I will ack separately)


Re: [PATCH mptcp-next v15 0/6] implement mptcp read_sock
Posted by Mat Martineau 4 days ago
On Tue, 9 Dec 2025, Mat Martineau wrote:

> On Sat, 6 Dec 2025, Geliang Tang wrote:
>
>> From: Geliang Tang <tanggeliang@kylinos.cn>
>> 
>> v15:
>> - Patch 2, remove the maximum length limit as Mat suggested.
>> - Move tcp_recv_should_stop() helper out of the series as Mat
>>   suggested.
>
> Thanks Geliang, series LGTM:
>
> Reviewed-by: Mat Martineau <martineau@kernel.org>
>
> (except patch 2 which I will ack separately)

Actually, patch 3 :)
Re: [PATCH mptcp-next v15 0/6] implement mptcp read_sock
Posted by MPTCP CI 1 week ago
Hi Geliang,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal (except selftest_mptcp_join): Unstable: 1 failed test(s): selftest_simult_flows 🔴
- KVM Validation: normal (only selftest_mptcp_join): Success! ✅
- KVM Validation: debug (except selftest_mptcp_join): Success! ✅
- KVM Validation: debug (only selftest_mptcp_join): Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/19988767626

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/4bf455dd3ee0
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=1031070


If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-normal

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker


Please note that despite all the efforts already made to keep the test
suite stable when executed on a public CI like this one, it is possible
that some reported issues are not due to your modifications. Still, do not
hesitate to help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)
Re: [PATCH mptcp-next v15 0/6] implement mptcp read_sock
Posted by MPTCP CI 1 week ago
Hi Geliang,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal (except selftest_mptcp_join): Critical: Global Timeout ❌
- KVM Validation: normal (only selftest_mptcp_join): Success! ✅
- KVM Validation: debug (except selftest_mptcp_join): Success! ✅
- KVM Validation: debug (only selftest_mptcp_join): Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/19988767626

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/4bf455dd3ee0
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=1031070


Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)