[RFC mptcp-next] mptcp: support MSG_EOR in mptcp_sendmsg

Gang Yan posted 1 patch 4 days, 23 hours ago
git fetch https://github.com/multipath-tcp/mptcp_net-next tags/patchew/20260202040916.626066-1-gang.yan@linux.dev
There is a newer version of this series
[RFC mptcp-next] mptcp: support MSG_EOR in mptcp_sendmsg
Posted by Gang Yan 4 days, 23 hours ago
From: Gang Yan <yangang@kylinos.cn>

This patch adds support for the MSG_EOR flag in MPTCP's sendmsg path,
ensuring that data fragments marked with MSG_EOR are properly handled
to prevent coalescing with subsequent data.

Key changes:
1. Add an 'eor' field to struct mptcp_data_frag to track MSG_EOR marking
2. Initialize the eor field to 0 in mptcp_carve_data_frag()
3. In mptcp_sendmsg_frag(), when sending the last chunk of a data fragment
   that has MSG_EOR set, mark the corresponding skb with TCP_SKB_CB(skb)->eor = 1
   to prevent coalescing with subsequent data
4. Modify mptcp_sendmsg() to:
   - Preserve MSG_EOR flag in msg_flags filtering
   - Mark the last pending data fragment with eor = 1 when MSG_EOR is set
     in the message flags

This ensures that applications using MSG_EOR to indicate record boundaries
have their intent preserved across MPTCP subflows, maintaining proper
message segmentation semantics.

Signed-off-by: Gang Yan <yangang@kylinos.cn>
---

Notes:
    Hi Matt,
    
    I created two packetdrill scripts to test MSG_EOR with single and
    multiple subflows. The link is below:
    https://github.com/multipath-tcp/packetdrill/compare/mptcp-net-next...Dwyane-Yan:packetdrill:mptcp-net-next
    
    Please ignore the indentation problems; I will fix them once
    the MSG_EOR patch is accepted.
    
    Thanks
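
For context, a minimal userspace sketch of how an application might set
MSG_EOR on a stream socket. The helper names open_stream_socket() and
send_record() are hypothetical and not part of this patch or of the
packetdrill scripts; the fallback to plain TCP is an assumption made so
that the sketch also runs on kernels without MPTCP.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>

#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262	/* from linux/in.h; older libcs lack it */
#endif

/* Hypothetical helper: open an MPTCP socket, falling back to plain TCP
 * when the kernel refuses IPPROTO_MPTCP.
 */
int open_stream_socket(void)
{
	int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP);

	if (fd < 0)
		fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
	return fd;
}

/* Hypothetical helper: send one record and flag its end with MSG_EOR,
 * asking the stack not to merge it with data from a later send().
 */
ssize_t send_record(int fd, const void *buf, size_t len)
{
	return send(fd, buf, len, MSG_EOR | MSG_NOSIGNAL);
}
```

The receiver still sees a byte stream; MSG_EOR only influences how the
sender builds its packets.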

 net/mptcp/protocol.c | 22 +++++++++++++++++++---
 net/mptcp/protocol.h |  1 +
 2 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index c88882062c40..b8200765506f 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1174,6 +1174,7 @@ mptcp_carve_data_frag(const struct mptcp_sock *msk, struct page_frag *pfrag,
 	dfrag->offset = offset + sizeof(struct mptcp_data_frag);
 	dfrag->already_sent = 0;
 	dfrag->page = pfrag->page;
+	dfrag->eor = 0;
 
 	return dfrag;
 }
@@ -1434,6 +1435,13 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
 		mptcp_update_infinite_map(msk, ssk, mpext);
 	trace_mptcp_sendmsg_frag(mpext);
 	mptcp_subflow_ctx(ssk)->rel_write_seq += copy;
+
+	/* If this is the last chunk of a dfrag with MSG_EOR set,
+	 * mark the skb to prevent coalescing with subsequent data.
+	 */
+	if (dfrag->eor && info->sent + copy >= dfrag->data_len)
+		TCP_SKB_CB(skb)->eor = 1;
+
 	return copy;
 }
 
@@ -1894,7 +1902,8 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	long timeo;
 
 	/* silently ignore everything else */
-	msg->msg_flags &= MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL | MSG_FASTOPEN;
+	msg->msg_flags &= MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL |
+			  MSG_FASTOPEN | MSG_EOR;
 
 	lock_sock(sk);
 
@@ -2001,9 +2010,16 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 			goto do_error;
 	}
 
-	if (copied)
-		__mptcp_push_pending(sk, msg->msg_flags);
+	if (copied) {
+		/* Mark the last dfrag with EOR if MSG_EOR was set */
+		if (msg->msg_flags & MSG_EOR) {
+			struct mptcp_data_frag *dfrag = mptcp_pending_tail(sk);
 
+			if (dfrag)
+				dfrag->eor = 1;
+		}
+		__mptcp_push_pending(sk, msg->msg_flags);
+	}
 out:
 	release_sock(sk);
 	return copied;
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index cd5266099993..5bfe1002242d 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -267,6 +267,7 @@ struct mptcp_data_frag {
 	u16 overhead;
 	u16 already_sent;
 	struct page *page;
+	u8 eor;					/* Is MSG_EOR marked? Prevents coalescing with next frag */
 };
 
 /* Arbitrary compromise between as low as possible to react timely to subflow
-- 
2.43.0
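
The marking logic in the patch above can be modelled in userspace as a
rough sketch. The struct and function names below are hypothetical
stand-ins, not kernel APIs, and the model assumes that info->sent counts
the bytes of the current fragment that were already pushed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel structures; only the fields the
 * patch touches are modelled.
 */
struct dfrag_model {
	unsigned int data_len;	/* total bytes held by the fragment */
	bool eor;		/* set when sendmsg() carried MSG_EOR */
};

struct skb_model {
	bool eor;		/* mirrors TCP_SKB_CB(skb)->eor */
};

/* Mirror of the mptcp_sendmsg() change: flag the last queued fragment
 * when the caller passed MSG_EOR.
 */
void model_mark_tail(struct dfrag_model *frags, size_t n, bool msg_eor)
{
	if (msg_eor && n > 0)
		frags[n - 1].eor = true;
}

/* Mirror of the check added to mptcp_sendmsg_frag(): 'sent' bytes of
 * the fragment were already pushed and 'copy' more go into this skb.
 * Only the skb carrying the final bytes of an EOR fragment is marked.
 */
void model_mark_eor(const struct dfrag_model *dfrag, unsigned int sent,
		    unsigned int copy, struct skb_model *skb)
{
	if (dfrag->eor && sent + copy >= dfrag->data_len)
		skb->eor = true;
}
```

With a 100-byte fragment sent as 60 + 40 bytes, only the second call
marks its skb, so earlier chunks of the same record may still be
coalesced.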
Re: [RFC mptcp-next] mptcp: support MSG_EOR in mptcp_sendmsg
Posted by Matthieu Baerts 4 days, 16 hours ago
Hi Gang,

On 02/02/2026 05:09, Gang Yan wrote:
> From: Gang Yan <yangang@kylinos.cn>
> 
> This patch adds support for the MSG_EOR flag in MPTCP's sendmsg path,
> ensuring that data fragments marked with MSG_EOR are properly handled
> to prevent coalescing with subsequent data.
> 
> Key changes:
> 1. Add an 'eor' field to struct mptcp_data_frag to track MSG_EOR marking
> 2. Initialize the eor field to 0 in mptcp_carve_data_frag()
> 3. In mptcp_sendmsg_frag(), when sending the last chunk of a data fragment
>    that has MSG_EOR set, mark the corresponding skb with TCP_SKB_CB(skb)->eor = 1
>    to prevent coalescing with subsequent data
> 4. Modify mptcp_sendmsg() to:
>    - Preserve MSG_EOR flag in msg_flags filtering
>    - Mark the last pending data fragment with eor = 1 when MSG_EOR is set
>      in the message flags
> 
> This ensures that applications using MSG_EOR to indicate record boundaries
> have their intent preserved across MPTCP subflows, maintaining proper
> message segmentation semantics.

Thank you for looking at this.

> 
> Signed-off-by: Gang Yan <yangang@kylinos.cn>
> ---
> 
> Notes:
>     Hi Matt,
>     
>     I created two packetdrill scripts to test MSG_EOR with single and
>     multiple subflows. The link is below:
>     https://github.com/multipath-tcp/packetdrill/compare/mptcp-net-next...Dwyane-Yan:packetdrill:mptcp-net-next
>     
>     Please ignore the indentation problems; I will fix them once
>     the MSG_EOR patch is accepted.

I think it is better to be able to review the tests at the same time as
the implementation. Can you then fix the indent problems and open a PR
please?

In the meantime, it looks like checkpatch is complaining here:

> WARNING: Prefer a maximum 75 chars per line (possible unwrapped commit description?)
> #14: 
>    that has MSG_EOR set, mark the corresponding skb with TCP_SKB_CB(skb)->eor = 1
> 
> WARNING: line length of 107 exceeds 80 columns
> #95: FILE: net/mptcp/protocol.h:270:
> +	u8 eor;					/* Is MSG_EOR marked? Prevents coalescing with next frag */
> 
> total: 0 errors, 2 warnings, 0 checks, 54 lines checked

Can you look at that, please?

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.
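
One possible way to address the second checkpatch warning above is to
shorten the field comment so the line stays under 80 columns. This is
only a sketch: the struct below is an abridged userspace stand-in
listing just the fields visible in the hunk, and its name and typedefs
are assumptions made so that it compiles outside the kernel. The first
warning would be handled separately by rewrapping the commit message
below 75 characters.

```c
#include <stdint.h>

typedef uint8_t u8;		/* kernel-style aliases for userspace */
typedef uint16_t u16;
struct page;			/* opaque here; only a pointer is stored */

/* Abridged stand-in for struct mptcp_data_frag */
struct mptcp_data_frag_sketch {
	u16 overhead;
	u16 already_sent;
	struct page *page;
	u8 eor;		/* MSG_EOR set: do not coalesce with next frag */
};
```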
Re: [RFC mptcp-next] mptcp: support MSG_EOR in mptcp_sendmsg
Posted by MPTCP CI 4 days, 21 hours ago
Hi Gang,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal (except selftest_mptcp_join): Success! ✅
- KVM Validation: normal (only selftest_mptcp_join): Success! ✅
- KVM Validation: debug (except selftest_mptcp_join): Unstable: 3 failed test(s): packetdrill_dss packetdrill_mp_capable selftest_mptcp_connect_checksum 🔴
- KVM Validation: debug (only selftest_mptcp_join): Success! ✅
- KVM Validation: btf-normal (only bpftest_all): Success! ✅
- KVM Validation: btf-debug (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/21577318531

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/37cb6cc02ae8
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=1049557


If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-normal

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker


Please note that despite all the efforts already made to keep the test suite
stable when executed on a public CI like this one, it is possible that some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)