[PATCH v2] io_uring/net: don't check MSG_CTRUNC for IORING_OP_RECV

Hannes Furmans posted 1 patch 3 months, 1 week ago
io_uring/net.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Posted by Hannes Furmans 3 months, 1 week ago
IORING_OP_RECV sets up the msghdr with msg_control=NULL and
msg_controllen=0, as it has no cmsg support. Any socket layer that
calls put_cmsg() will find no buffer space and set MSG_CTRUNC in
msg_flags. This is expected — the caller didn't ask for control data.
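
For illustration only, the truncation rule described above can be modelled
in userspace. This is a sketch, not the kernel's put_cmsg(): the helpers
model_put_cmsg() and flags_after_cmsg() are hypothetical names, and the
model only tracks msg_flags without copying any control data.

```c
#include <stddef.h>
#include <sys/socket.h>

/*
 * Hypothetical userspace model of the truncation rule in put_cmsg().
 * It only updates msg_flags; no control data is copied.
 */
static void model_put_cmsg(struct msghdr *msg, size_t payload_len)
{
	size_t cmlen = CMSG_LEN(payload_len);

	/* No usable control buffer: flag truncation and stop. */
	if (msg->msg_control == NULL ||
	    msg->msg_controllen < sizeof(struct cmsghdr)) {
		msg->msg_flags |= MSG_CTRUNC;
		return;
	}

	/* Buffer present but too small for header plus payload. */
	if (msg->msg_controllen < cmlen)
		msg->msg_flags |= MSG_CTRUNC;
}

/* Returns the msg_flags seen after one model_put_cmsg() call. */
static int flags_after_cmsg(void *control, size_t controllen)
{
	struct msghdr msg = {0};

	msg.msg_control = control;
	msg.msg_controllen = controllen;
	model_put_cmsg(&msg, 1);	/* one byte, like a TLS record type */
	return msg.msg_flags;
}
```

Calling flags_after_cmsg(NULL, 0) mirrors the msghdr that IORING_OP_RECV
sets up and yields MSG_CTRUNC; passing a buffer large enough for one cmsg
yields no flags.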

However, io_recv checks:

    if ((flags & MSG_WAITALL) && (msg_flags & (MSG_TRUNC | MSG_CTRUNC)))
        req_set_fail(req);

This sets REQ_F_FAIL on a fully successful recv (ret >= min_ret) when
MSG_CTRUNC is set, which causes io_disarm_next() to cancel all linked
operations with -ECANCELED. The recv CQE shows the full requested byte
count, yet linked operations are cancelled.

This is triggered by kTLS, which calls put_cmsg(SOL_TLS,
TLS_GET_RECORD_TYPE) for every record in tls_record_content_type()
(tls_sw.c), but it affects any protocol that delivers cmsg data on
the kernel side.

The MSG_CTRUNC check was introduced by commit 0031275d119e ("io_uring:
call req_set_fail_links() on short send[msg]()/recv[msg]() with
MSG_WAITALL") whose commit message states "For IORING_OP_RECVMSG we
also check for the MSG_TRUNC and MSG_CTRUNC flags", but the code
applied the check to IORING_OP_RECV as well. MSG_CTRUNC is meaningful
for IORING_OP_RECVMSG where the user provides a cmsg buffer —
truncation there means lost metadata. It is meaningless for
IORING_OP_RECV which never provides a cmsg buffer.

Remove MSG_CTRUNC from the io_recv check. The io_recvmsg check is
left unchanged as MSG_CTRUNC is meaningful there.
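
As a rough sketch of the behavioural difference, the flag clause of the
check can be written as two standalone predicates. The function names are
hypothetical and exist only for this comparison; the surrounding
ret < min_ret handling in io_recv is not modelled.

```c
#include <stdbool.h>
#include <sys/socket.h>

/* Flag clause before this patch: MSG_CTRUNC alone fails the request. */
static bool recv_fails_link_old(int flags, int msg_flags)
{
	return (flags & MSG_WAITALL) &&
	       (msg_flags & (MSG_TRUNC | MSG_CTRUNC));
}

/* Flag clause after this patch: only data truncation fails the request. */
static bool recv_fails_link_new(int flags, int msg_flags)
{
	return (flags & MSG_WAITALL) && (msg_flags & MSG_TRUNC);
}
```

With flags = MSG_WAITALL and msg_flags = MSG_EOR | MSG_CTRUNC (the value
seen on the kTLS socket), the old predicate is true and the new one is
false. A recv that reports MSG_TRUNC fails under both.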

Fixes: 0031275d119e ("io_uring: call req_set_fail_links() on short send[msg]()/recv[msg]() with MSG_WAITALL")
Cc: stable@vger.kernel.org
Signed-off-by: Hannes Furmans <hannes@stillwind.ai>
---
v2: v1 incorrectly guarded req_set_fail() for all done_io > 0 cases.
    Stefan Metzmacher correctly pointed out that short MSG_WAITALL
    reads should still sever the link chain.

    Root-caused via ftrace + msg_flags inspection on a real kTLS
    connection (TLS 1.3, AES-128-GCM, S3 download):

    ftrace shows io_uring_fail_link firing immediately after
    io_uring_complete with result=67108864 (the full 64 MiB), from io-wq:

      iou-wrk-52242 io_uring_complete: req ..., result 67108864
      iou-wrk-52242 io_uring_fail_link: opcode RECV, link ...

    A debug recvmsg on the same kTLS socket shows:

      recvmsg: ret=67108864 msg_flags=0x88 (MSG_EOR | MSG_CTRUNC)

    MSG_CTRUNC is always set because kTLS calls put_cmsg() but
    IORING_OP_RECV provides no cmsg buffer.
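
    The 0x88 value above can be decoded with a small helper along these
    lines. This is a sketch: decode_msg_flags() is a hypothetical name,
    it only knows the three flags relevant here, and the numeric values
    assume the Linux definitions in <sys/socket.h>.

```c
#include <stdio.h>
#include <sys/socket.h>

/*
 * Hypothetical debug helper: writes the names of the known bits set in
 * msg_flags into buf, separated by " | ", and returns buf.
 */
static const char *decode_msg_flags(int msg_flags, char *buf, size_t len)
{
	static const struct {
		int bit;
		const char *name;
	} names[] = {
		{ MSG_EOR,    "MSG_EOR"    },
		{ MSG_TRUNC,  "MSG_TRUNC"  },
		{ MSG_CTRUNC, "MSG_CTRUNC" },
	};
	size_t used = 0;

	if (len == 0)
		return buf;
	buf[0] = '\0';

	for (size_t i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		int n;

		if (!(msg_flags & names[i].bit))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? " | " : "", names[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;	/* output did not fit; stop here */
		used += (size_t)n;
	}
	return buf;
}
```

    On Linux, decode_msg_flags(0x88, buf, sizeof(buf)) produces
    "MSG_EOR | MSG_CTRUNC", matching the trace.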

 io_uring/net.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/io_uring/net.c b/io_uring/net.c
index 8576c6cb2236..8baaf74e8f8d 100644
--- a/io_uring/net.c
+++ b/io_uring/net.c
@@ -1221,7 +1221,7 @@ int io_recv(struct io_kiocb *req, unsigned int issue_flags)
 		if (ret == -ERESTARTSYS)
 			ret = -EINTR;
 		req_set_fail(req);
-	} else if ((flags & MSG_WAITALL) && (kmsg->msg.msg_flags & (MSG_TRUNC | MSG_CTRUNC))) {
+	} else if ((flags & MSG_WAITALL) && (kmsg->msg.msg_flags & MSG_TRUNC)) {
 out_free:
 		req_set_fail(req);
 	}
-- 
2.53.0

Re: [PATCH v2] io_uring/net: don't check MSG_CTRUNC for IORING_OP_RECV
Posted by Hannes Furmans 2 months, 1 week ago
Gentle ping on this. This is a one-line fix for a real bug where IORING_OP_RECV on kTLS sockets spuriously fails linked ops because put_cmsg() sets MSG_CTRUNC when no cmsg buffer is provided.
Stefan indicated the approach looks correct. Would be great to get this into 7.0 if possible, as we’re in the RC window and this is a straightforward bug fix.
