From nobody Sat Feb 7 18:15:19 2026
From: David Howells
To: netdev@vger.kernel.org
Cc: David Howells, Tom Herbert, Tom Herbert, Cong Wang,
	"David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Willem de Bruijn, David Ahern, Matthew Wilcox, Jens Axboe,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH net-next v2 1/2] kcm: Support MSG_SPLICE_PAGES
Date: Wed, 31 May 2023 12:04:21 +0100
Message-ID: <20230531110423.643196-2-dhowells@redhat.com>
In-Reply-To: <20230531110423.643196-1-dhowells@redhat.com>
References: <20230531110423.643196-1-dhowells@redhat.com>

Make AF_KCM sendmsg() support MSG_SPLICE_PAGES. This causes pages to be
spliced from the source iterator if possible.

This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.

Signed-off-by: David Howells
cc: Tom Herbert
cc: Tom Herbert
cc: Cong Wang
cc: Jakub Kicinski
cc: Eric Dumazet
cc: "David S. Miller"
cc: Paolo Abeni
cc: Jens Axboe
cc: Matthew Wilcox
cc: netdev@vger.kernel.org
---

Notes:
    ver #2)
     - Only account the amount actually copied.
     - Wrap at 80 chars.

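The v2 note "Only account the amount actually copied" can be illustrated
with a small userspace model. This is only a sketch under stated
assumptions: model_sock, model_splice() and model_send_spliced() are
hypothetical names invented for illustration, not kernel APIs, and the
"room" parameter simply stands in for whatever limits how much a splice
helper can take in one call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the socket's send-buffer accounting. */
struct model_sock {
	size_t wmem_queued;	/* bytes charged to the send buffer */
};

/* Hypothetical stand-in for a splice helper: it may take fewer bytes
 * than requested (capped here by 'room'), or fail with a negative value.
 */
static long model_splice(size_t requested, size_t room)
{
	if (room == 0)
		return -1;	/* nothing could be spliced */
	return (long)(requested < room ? requested : room);
}

/* Charge the socket only for what was actually spliced, not for the
 * amount that was asked for.  Returns bytes spliced or a negative error.
 */
static long model_send_spliced(struct model_sock *sk, size_t requested,
			       size_t room)
{
	long ret = model_splice(requested, room);

	if (ret < 0)
		return ret;	/* error path: nothing is charged */

	assert((size_t)ret <= requested);
	sk->wmem_queued += (size_t)ret;
	return ret;
}
```

In this model a short splice charges only the bytes that were taken, and
a failed splice leaves the accounting untouched, which is the behaviour
the note describes.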
 net/kcm/kcmsock.c | 57 +++++++++++++++++++++++++++++++++--------------
 1 file changed, 40 insertions(+), 17 deletions(-)

diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c
index cfe828bd7fc6..8555ede66333 100644
--- a/net/kcm/kcmsock.c
+++ b/net/kcm/kcmsock.c
@@ -989,29 +989,52 @@ static int kcm_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 			merge = false;
 		}
 
-		copy = min_t(int, msg_data_left(msg),
-			     pfrag->size - pfrag->offset);
+		if (msg->msg_flags & MSG_SPLICE_PAGES) {
+			copy = msg_data_left(msg);
+			if (!sk_wmem_schedule(sk, copy))
+				goto wait_for_memory;
 
-		if (!sk_wmem_schedule(sk, copy))
-			goto wait_for_memory;
+			err = skb_splice_from_iter(skb, &msg->msg_iter, copy,
+						   sk->sk_allocation);
+			if (err < 0) {
+				if (err == -EMSGSIZE)
+					goto wait_for_memory;
+				goto out_error;
+			}
 
-		err = skb_copy_to_page_nocache(sk, &msg->msg_iter, skb,
-					       pfrag->page,
-					       pfrag->offset,
-					       copy);
-		if (err)
-			goto out_error;
+			copy = err;
+			skb_shinfo(skb)->flags |= SKBFL_SHARED_FRAG;
+			sk_wmem_queued_add(sk, copy);
+			sk_mem_charge(sk, copy);
 
-		/* Update the skb. */
-		if (merge) {
-			skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], copy);
+			if (head != skb)
+				head->truesize += copy;
 		} else {
-			skb_fill_page_desc(skb, i, pfrag->page,
-					   pfrag->offset, copy);
-			get_page(pfrag->page);
+			copy = min_t(int, msg_data_left(msg),
+				     pfrag->size - pfrag->offset);
+			if (!sk_wmem_schedule(sk, copy))
+				goto wait_for_memory;
+
+			err = skb_copy_to_page_nocache(sk, &msg->msg_iter, skb,
+						       pfrag->page,
+						       pfrag->offset,
+						       copy);
+			if (err)
+				goto out_error;
+
+			/* Update the skb. */
+			if (merge) {
+				skb_frag_size_add(
+					&skb_shinfo(skb)->frags[i - 1], copy);
+			} else {
+				skb_fill_page_desc(skb, i, pfrag->page,
+						   pfrag->offset, copy);
+				get_page(pfrag->page);
+			}
+
+			pfrag->offset += copy;
 		}
 
-		pfrag->offset += copy;
 		copied += copy;
 		if (head != skb) {
 			head->len += copy;

From nobody Sat Feb 7 18:15:19 2026
From: David Howells
To: netdev@vger.kernel.org
Cc: David Howells, Tom Herbert, Tom Herbert, Cong Wang,
	"David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Willem de Bruijn, David Ahern, Matthew Wilcox, Jens Axboe,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH net-next v2 2/2] kcm: Convert kcm_sendpage() to use MSG_SPLICE_PAGES
Date: Wed, 31 May 2023 12:04:22 +0100
Message-ID: <20230531110423.643196-3-dhowells@redhat.com>
In-Reply-To: <20230531110423.643196-1-dhowells@redhat.com>
References: <20230531110423.643196-1-dhowells@redhat.com>

Convert kcm_sendpage() to use sendmsg() with MSG_SPLICE_PAGES rather than
directly splicing in the pages itself.

This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.

Signed-off-by: David Howells
cc: Tom Herbert
cc: Tom Herbert
cc: Cong Wang
cc: Jakub Kicinski
cc: Eric Dumazet
cc: "David S. Miller"
cc: Paolo Abeni
cc: Jens Axboe
cc: Matthew Wilcox
cc: netdev@vger.kernel.org
---
 net/kcm/kcmsock.c | 161 ++++++----------------------------------------
 1 file changed, 18 insertions(+), 143 deletions(-)

diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c
index 8555ede66333..ba22af16b96d 100644
--- a/net/kcm/kcmsock.c
+++ b/net/kcm/kcmsock.c
@@ -761,149 +761,6 @@ static void kcm_push(struct kcm_sock *kcm)
 		kcm_write_msgs(kcm);
 }
 
-static ssize_t kcm_sendpage(struct socket *sock, struct page *page,
-			    int offset, size_t size, int flags)
-
-{
-	struct sock *sk = sock->sk;
-	struct kcm_sock *kcm = kcm_sk(sk);
-	struct sk_buff *skb = NULL, *head = NULL;
-	long timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
-	bool eor;
-	int err = 0;
-	int i;
-
-	if (flags & MSG_SENDPAGE_NOTLAST)
-		flags |= MSG_MORE;
-
-	/* No MSG_EOR from splice, only look at MSG_MORE */
-	eor = !(flags & MSG_MORE);
-
-	lock_sock(sk);
-
-	sk_clear_bit(SOCKWQ_ASYNC_NOSPACE, sk);
-
-	err = -EPIPE;
-	if (sk->sk_err)
-		goto out_error;
-
-	if (kcm->seq_skb) {
-		/* Previously opened message */
-		head = kcm->seq_skb;
-		skb = kcm_tx_msg(head)->last_skb;
-		i = skb_shinfo(skb)->nr_frags;
-
-		if (skb_can_coalesce(skb, i, page, offset)) {
-			skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], size);
-			skb_shinfo(skb)->flags |= SKBFL_SHARED_FRAG;
-			goto coalesced;
-		}
-
-		if (i >= MAX_SKB_FRAGS) {
-			struct sk_buff *tskb;
-
-			tskb = alloc_skb(0, sk->sk_allocation);
-			while (!tskb) {
-				kcm_push(kcm);
-				err = sk_stream_wait_memory(sk, &timeo);
-				if (err)
-					goto out_error;
-			}
-
-			if (head == skb)
-				skb_shinfo(head)->frag_list = tskb;
-			else
-				skb->next = tskb;
-
-			skb = tskb;
-			skb->ip_summed = CHECKSUM_UNNECESSARY;
-			i = 0;
-		}
-	} else {
-		/* Call the sk_stream functions to manage the sndbuf mem. */
-		if (!sk_stream_memory_free(sk)) {
-			kcm_push(kcm);
-			set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
-			err = sk_stream_wait_memory(sk, &timeo);
-			if (err)
-				goto out_error;
-		}
-
-		head = alloc_skb(0, sk->sk_allocation);
-		while (!head) {
-			kcm_push(kcm);
-			err = sk_stream_wait_memory(sk, &timeo);
-			if (err)
-				goto out_error;
-		}
-
-		skb = head;
-		i = 0;
-	}
-
-	get_page(page);
-	skb_fill_page_desc_noacc(skb, i, page, offset, size);
-	skb_shinfo(skb)->flags |= SKBFL_SHARED_FRAG;
-
-coalesced:
-	skb->len += size;
-	skb->data_len += size;
-	skb->truesize += size;
-	sk->sk_wmem_queued += size;
-	sk_mem_charge(sk, size);
-
-	if (head != skb) {
-		head->len += size;
-		head->data_len += size;
-		head->truesize += size;
-	}
-
-	if (eor) {
-		bool not_busy = skb_queue_empty(&sk->sk_write_queue);
-
-		/* Message complete, queue it on send buffer */
-		__skb_queue_tail(&sk->sk_write_queue, head);
-		kcm->seq_skb = NULL;
-		KCM_STATS_INCR(kcm->stats.tx_msgs);
-
-		if (flags & MSG_BATCH) {
-			kcm->tx_wait_more = true;
-		} else if (kcm->tx_wait_more || not_busy) {
-			err = kcm_write_msgs(kcm);
-			if (err < 0) {
-				/* We got a hard error in write_msgs but have
-				 * already queued this message. Report an error
-				 * in the socket, but don't affect return value
-				 * from sendmsg
-				 */
-				pr_warn("KCM: Hard failure on kcm_write_msgs\n");
-				report_csk_error(&kcm->sk, -err);
-			}
-		}
-	} else {
-		/* Message not complete, save state */
-		kcm->seq_skb = head;
-		kcm_tx_msg(head)->last_skb = skb;
-	}
-
-	KCM_STATS_ADD(kcm->stats.tx_bytes, size);
-
-	release_sock(sk);
-	return size;
-
-out_error:
-	kcm_push(kcm);
-
-	err = sk_stream_error(sk, flags, err);
-
-	/* make sure we wake any epoll edge trigger waiter */
-	if (unlikely(skb_queue_len(&sk->sk_write_queue) == 0 && err == -EAGAIN))
-		sk->sk_write_space(sk);
-
-	release_sock(sk);
-	return err;
-}
-
 static int kcm_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 {
 	struct sock *sk = sock->sk;
@@ -1111,6 +968,24 @@ static int kcm_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	return err;
 }
 
+static ssize_t kcm_sendpage(struct socket *sock, struct page *page,
+			    int offset, size_t size, int flags)
+
+{
+	struct bio_vec bvec;
+	struct msghdr msg = { .msg_flags = flags | MSG_SPLICE_PAGES, };
+
+	if (flags & MSG_SENDPAGE_NOTLAST)
+		msg.msg_flags |= MSG_MORE;
+
+	if (flags & MSG_OOB)
+		return -EOPNOTSUPP;
+
+	bvec_set_page(&bvec, page, size, offset);
+	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
+	return kcm_sendmsg(sock, &msg, size);
+}
+
 static int kcm_recvmsg(struct socket *sock, struct msghdr *msg,
 		       size_t len, int flags)
 {
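
The flag handling in the new kcm_sendpage() wrapper can be modelled in
userspace to make the translation easier to follow. This is a sketch
only: the MODEL_MSG_* bit values and model_sendpage_flags() are
hypothetical and chosen for illustration; the kernel's real MSG_*
constants live in its own headers and have different values.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical bit values for this model only. */
#define MODEL_MSG_OOB			0x1
#define MODEL_MSG_MORE			0x2
#define MODEL_MSG_SENDPAGE_NOTLAST	0x4
#define MODEL_MSG_SPLICE_PAGES		0x8

/* The model relies on the splice bit not colliding with the others. */
static_assert((MODEL_MSG_SPLICE_PAGES &
	       (MODEL_MSG_OOB | MODEL_MSG_MORE |
		MODEL_MSG_SENDPAGE_NOTLAST)) == 0,
	      "model flag bits must be distinct");

/* Mirror the wrapper's flag handling: return the flags that would be
 * handed on to sendmsg(), or -EOPNOTSUPP if MSG_OOB was requested.
 */
static int model_sendpage_flags(int flags)
{
	/* The caller's flags are kept and the splice request is added. */
	int msg_flags = flags | MODEL_MSG_SPLICE_PAGES;

	/* "Not the last page" means more data follows. */
	if (flags & MODEL_MSG_SENDPAGE_NOTLAST)
		msg_flags |= MODEL_MSG_MORE;

	/* Out-of-band data is rejected. */
	if (flags & MODEL_MSG_OOB)
		return -EOPNOTSUPP;

	return msg_flags;
}
```

As in the patch, the original flags are passed through unchanged apart
from the two added bits, so the sendmsg() path sees everything the
sendpage caller asked for.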