net/kcm/kcmsock.c | 6 ++++++ 1 file changed, 6 insertions(+)
Flags passed in for splice() syscall should not end up in
skb_recv_datagram(). As SPLICE_F_NONBLOCK == MSG_PEEK, kernel gets
confused: skb isn't unlinked from a receive queue, while strp_msg::offset
and strp_msg::full_len are updated.
Unbreak the logic a bit more by mapping both O_NONBLOCK and
SPLICE_F_NONBLOCK to MSG_DONTWAIT. This way we align with man splice(2) in
regard to errno EAGAIN:
SPLICE_F_NONBLOCK was specified in flags or one of the file descriptors
had been marked as nonblocking (O_NONBLOCK), and the operation would
block.
Fixes: 5121197ecc5d ("kcm: close race conditions on sk_receive_queue")
Fixes: 91687355b927 ("kcm: Splice support")
Signed-off-by: Michal Luczaj <mhal@rbox.co>
---
net/kcm/kcmsock.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c
index 24aec295a51cf737912f1aefe81556bd9f23331e..c05047dad62d7e201c950ab98af6dc7f0d48276c 100644
--- a/net/kcm/kcmsock.c
+++ b/net/kcm/kcmsock.c
@@ -19,6 +19,7 @@
#include <linux/rculist.h>
#include <linux/skbuff.h>
#include <linux/socket.h>
+#include <linux/splice.h>
#include <linux/uaccess.h>
#include <linux/workqueue.h>
#include <linux/syscalls.h>
@@ -1030,6 +1031,11 @@ static ssize_t kcm_splice_read(struct socket *sock, loff_t *ppos,
ssize_t copied;
struct sk_buff *skb;
+ if (sock->file->f_flags & O_NONBLOCK || flags & SPLICE_F_NONBLOCK)
+ flags = MSG_DONTWAIT;
+ else
+ flags = 0;
+
/* Only support splice for SOCKSEQPACKET */
skb = skb_recv_datagram(sk, flags, &err);
---
base-commit: c8f13134349b4385ae739f1efe403d5d3949ef92
change-id: 20250619-kcm-splice-01cb39a5997b
Best regards,
--
Michal Luczaj <mhal@rbox.co>
On Fri, 25 Jul 2025 12:33:04 +0200 Michal Luczaj wrote: > Flags passed in for splice() syscall should not end up in > skb_recv_datagram(). As SPLICE_F_NONBLOCK == MSG_PEEK, kernel gets > confused: skb isn't unlinked from a receive queue, while strp_msg::offset > and strp_msg::full_len are updated. > > Unbreak the logic a bit more by mapping both O_NONBLOCK and > SPLICE_F_NONBLOCK to MSG_DONTWAIT. This way we align with man splice(2) in > regard to errno EAGAIN: > > SPLICE_F_NONBLOCK was specified in flags or one of the file descriptors > had been marked as nonblocking (O_NONBLOCK), and the operation would > block. Coincidentally looks like we're not honoring sock->file->f_flags & O_NONBLOCK in TLS..
On 7/31/25 03:02, Jakub Kicinski wrote: > On Fri, 25 Jul 2025 12:33:04 +0200 Michal Luczaj wrote: >> Flags passed in for splice() syscall should not end up in >> skb_recv_datagram(). As SPLICE_F_NONBLOCK == MSG_PEEK, kernel gets >> confused: skb isn't unlinked from a receive queue, while strp_msg::offset >> and strp_msg::full_len are updated. >> >> Unbreak the logic a bit more by mapping both O_NONBLOCK and >> SPLICE_F_NONBLOCK to MSG_DONTWAIT. This way we align with man splice(2) in >> regard to errno EAGAIN: >> >> SPLICE_F_NONBLOCK was specified in flags or one of the file descriptors >> had been marked as nonblocking (O_NONBLOCK), and the operation would >> block. > > Coincidentally looks like we're not honoring > > sock->file->f_flags & O_NONBLOCK > > in TLS.. I'm a bit confused. Comparing AF_UNIX and pure (non-TLS) TCP, I see two non-blocking-splice interpretations. Unix socket doesn't block on `f_flags & O_NONBLOCK || flags & SPLICE_F_NONBLOCK` (which this patch follows), while TCP, after commit 42324c627043 ("net: splice() from tcp to pipe should take into account O_NONBLOCK"), honours O_NONBLOCK and ignores SPLICE_F_NONBLOCK. Should KCM (and TLS) follow TCP behaviour instead? Thanks, Michal
On Sun, 3 Aug 2025 12:00:38 +0200 Michal Luczaj wrote: > On 7/31/25 03:02, Jakub Kicinski wrote: > > On Fri, 25 Jul 2025 12:33:04 +0200 Michal Luczaj wrote: > >> Flags passed in for splice() syscall should not end up in > >> skb_recv_datagram(). As SPLICE_F_NONBLOCK == MSG_PEEK, kernel gets > >> confused: skb isn't unlinked from a receive queue, while strp_msg::offset > >> and strp_msg::full_len are updated. > >> > >> Unbreak the logic a bit more by mapping both O_NONBLOCK and > >> SPLICE_F_NONBLOCK to MSG_DONTWAIT. This way we align with man splice(2) in > >> regard to errno EAGAIN: > >> > >> SPLICE_F_NONBLOCK was specified in flags or one of the file descriptors > >> had been marked as nonblocking (O_NONBLOCK), and the operation would > >> block. > > > > Coincidentally looks like we're not honoring > > > > sock->file->f_flags & O_NONBLOCK > > > > in TLS.. > > I'm a bit confused. > > Comparing AF_UNIX and pure (non-TLS) TCP, I see two non-blocking-splice > interpretations. Unix socket doesn't block on `f_flags & O_NONBLOCK || > flags & SPLICE_F_NONBLOCK` (which this patch follows), while TCP, after > commit 42324c627043 ("net: splice() from tcp to pipe should take into > account O_NONBLOCK"), honours O_NONBLOCK and ignores SPLICE_F_NONBLOCK. > > Should KCM (and TLS) follow TCP behaviour instead? I didn't look closely, but FWIW - yes, in principle KCM and TLS should copy TCP.
On 8/5/25 01:51, Jakub Kicinski wrote: > On Sun, 3 Aug 2025 12:00:38 +0200 Michal Luczaj wrote: >> On 7/31/25 03:02, Jakub Kicinski wrote: >>> On Fri, 25 Jul 2025 12:33:04 +0200 Michal Luczaj wrote: >>>> Flags passed in for splice() syscall should not end up in >>>> skb_recv_datagram(). As SPLICE_F_NONBLOCK == MSG_PEEK, kernel gets >>>> confused: skb isn't unlinked from a receive queue, while strp_msg::offset >>>> and strp_msg::full_len are updated. >>>> >>>> Unbreak the logic a bit more by mapping both O_NONBLOCK and >>>> SPLICE_F_NONBLOCK to MSG_DONTWAIT. This way we align with man splice(2) in >>>> regard to errno EAGAIN: >>>> >>>> SPLICE_F_NONBLOCK was specified in flags or one of the file descriptors >>>> had been marked as nonblocking (O_NONBLOCK), and the operation would >>>> block. >>> >>> Coincidentally looks like we're not honoring >>> >>> sock->file->f_flags & O_NONBLOCK >>> >>> in TLS.. >> >> I'm a bit confused. >> >> Comparing AF_UNIX and pure (non-TLS) TCP, I see two non-blocking-splice >> interpretations. Unix socket doesn't block on `f_flags & O_NONBLOCK || >> flags & SPLICE_F_NONBLOCK` (which this patch follows), while TCP, after >> commit 42324c627043 ("net: splice() from tcp to pipe should take into >> account O_NONBLOCK"), honours O_NONBLOCK and ignores SPLICE_F_NONBLOCK. >> >> Should KCM (and TLS) follow TCP behaviour instead? > > I didn't look closely, but FWIW - yes, in principle KCM and TLS should > copy TCP. Ugh, so this KCM patch is incorrect. Sorry, I'll submit a follow up tweaking KCM and TLS, as suggested. Note about SPLICE_F_NONBLOCK: besides AF_UNIX, it is also honoured in AF_SMC and tracefs.
© 2016 - 2025 Red Hat, Inc.