From nobody Thu Mar 28 17:56:04 2024 Delivered-To: wpasupplicant.patchew@gmail.com Received: by 2002:a05:6638:d02:0:0:0:0 with SMTP id q2csp169473jaj; Thu, 2 Sep 2021 08:52:48 -0700 (PDT) X-Google-Smtp-Source: ABdhPJz62fu7ePxLNuhul/vl3zgiqZrKvW8IT6ZYrnvX7OUsqjbv0Z6ACazTPPkNDRRvMhYSybsX X-Received: by 2002:a62:1795:0:b0:3ff:2201:905c with SMTP id 143-20020a621795000000b003ff2201905cmr4118367pfx.72.1630597968778; Thu, 02 Sep 2021 08:52:48 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1630597968; cv=none; d=google.com; s=arc-20160816; b=IKAFV6qyUC8tjLXKNH/tsYuOXDRDJYQnwMiGCqAb3IdMtife8PeJgYONlg0JtiCZSX 8PNIKXBQ9SAdoKUNRIQNipCMVkooUbcG4oksWmQFcbAlse5x/h90ctCUROiawi8sRigt teB1CSdEs9f/e92XR6pOFhdP9odYUebVFFBbyArIKLAEGK0oO6yuM4OK8+WKCwVefG6L oYMocsy0RCpEmLvMzwZtGW51r3YCACppdDX24whyyzw8lO3n4iESCZlRZAOl9kKAeejV 2FirixhtqS+4kIm5Xd5oKEIrEhL6EwWni4KN6m3QrTUHkR+lIMkbZJWWuXq4jk0N7oRG f8Og== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id :date:subject:to:from:dkim-signature; bh=5mDJU6NE9VaROPqiItmVTwgDB2WVu7JPOayM+M/8Tak=; b=yumoVnybARK22pzX0FijCqWJ68sLmhngzPmhYrVZyOERY4u3ufYQypdlo6lBG+5nKR Y1WtEQ+zZhI41uAIgJ+RxmPVhLnTqTQla3TuA/HSNNCtdWtZGeYiQ6/kPxHpiLos/5vB 7tWVO8nVlfPiFA/zqFgVyvJHTG14H3xVCbzcM0Bo66KcYuHAFfee4I9ggHa4wQZD+8M4 RKPLuiFf/D0OdcEiPlKcC4IWcwRZeNb8YrUW1IKNFJLKo289QpNSgkE+nsppgOD34y8A woBKfS6gKzQpLyh439/sGpxaCxRSHlpbAKPSifmfUOLfF9juYK5NMd73J1wOq15ccCXO x9Gw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=cSh3a+xa; spf=pass (google.com: domain of mptcp+bounces-1817-wpasupplicant.patchew=gmail.com@lists.linux.dev designates 147.75.69.165 as permitted sender) smtp.mailfrom="mptcp+bounces-1817-wpasupplicant.patchew=gmail.com@lists.linux.dev"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from 
sjc.edge.kernel.org (sjc.edge.kernel.org. [147.75.69.165]) by mx.google.com with ESMTPS id c14si2633485pjg.46.2021.09.02.08.52.48 for (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Thu, 02 Sep 2021 08:52:48 -0700 (PDT) Received-SPF: pass (google.com: domain of mptcp+bounces-1817-wpasupplicant.patchew=gmail.com@lists.linux.dev designates 147.75.69.165 as permitted sender) client-ip=147.75.69.165; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=cSh3a+xa; spf=pass (google.com: domain of mptcp+bounces-1817-wpasupplicant.patchew=gmail.com@lists.linux.dev designates 147.75.69.165 as permitted sender) smtp.mailfrom="mptcp+bounces-1817-wpasupplicant.patchew=gmail.com@lists.linux.dev"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sjc.edge.kernel.org (Postfix) with ESMTPS id 8A4E93E0F3B for ; Thu, 2 Sep 2021 15:52:48 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id F134B2FB3; Thu, 2 Sep 2021 15:52:46 +0000 (UTC) X-Original-To: mptcp@lists.linux.dev Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CF3392FAE for ; Thu, 2 Sep 2021 15:52:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1630597964; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; 
bh=5mDJU6NE9VaROPqiItmVTwgDB2WVu7JPOayM+M/8Tak=; b=cSh3a+xa4cmZQAmJacVGepo0UwM6vnYSkNK3S95KATA1kzdN3Y7TUjV+468PwJ9RZuH1iU MOonAmar1+jRRH+DrwyY226VG6Oz4Nq71YD6U887vpVWeDEBwOunWLK2W39lELr1+Yw1kl 2jlJPeQvKbwBCwc6Q5Fjp5/avmB14qg= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-476-_YNa5noZOfKp69LdYFoDIw-1; Thu, 02 Sep 2021 11:52:43 -0400 X-MC-Unique: _YNa5noZOfKp69LdYFoDIw-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 20CE2107ACCD for ; Thu, 2 Sep 2021 15:52:43 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.39.194.237]) by smtp.corp.redhat.com (Postfix) with ESMTP id 8C5E55D9DC for ; Thu, 2 Sep 2021 15:52:42 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [PATCH v2 mptcp-next 1/4] tcp: expose the tcp_mark_push() and skb_entail() helpers Date: Thu, 2 Sep 2021 17:52:11 +0200 Message-Id: <1c29a0077ea95b4429a3dd82a4ea3ec826b3ae99.1630595171.git.pabeni@redhat.com> In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=pabeni@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" They will be used by the next patch. 
Signed-off-by: Paolo Abeni Reviewed-by: Mat Martineau --- include/net/tcp.h | 2 ++ net/ipv4/tcp.c | 4 ++-- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/include/net/tcp.h b/include/net/tcp.h index 3166dc15d7d6..dc52ea8adfc7 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -581,6 +581,8 @@ __u32 cookie_v6_init_sequence(const struct sk_buff *skb, __u16 *mss); #endif /* tcp_output.c */ +void skb_entail(struct sock *sk, struct sk_buff *skb); +void tcp_mark_push(struct tcp_sock *tp, struct sk_buff *skb); void __tcp_push_pending_frames(struct sock *sk, unsigned int cur_mss, int nonagle); int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb, int segs); diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index e8b48df73c85..7a3e632b0048 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -647,7 +647,7 @@ int tcp_ioctl(struct sock *sk, int cmd, unsigned long arg) } EXPORT_SYMBOL(tcp_ioctl); -static inline void tcp_mark_push(struct tcp_sock *tp, struct sk_buff *skb) +void tcp_mark_push(struct tcp_sock *tp, struct sk_buff *skb) { TCP_SKB_CB(skb)->tcp_flags |= TCPHDR_PSH; tp->pushed_seq = tp->write_seq; @@ -658,7 +658,7 @@ static inline bool forced_push(const struct tcp_sock *tp) return after(tp->write_seq, tp->pushed_seq + (tp->max_window >> 1)); } -static void skb_entail(struct sock *sk, struct sk_buff *skb) +void skb_entail(struct sock *sk, struct sk_buff *skb) { struct tcp_sock *tp = tcp_sk(sk); struct tcp_skb_cb *tcb = TCP_SKB_CB(skb); -- 2.26.3 From nobody Thu Mar 28 17:56:04 2024 Delivered-To: wpasupplicant.patchew@gmail.com Received: by 2002:a05:6638:d02:0:0:0:0 with SMTP id q2csp169495jaj; Thu, 2 Sep 2021 08:52:50 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyAkrTBO1UExOvGGCC+ToLEkx6BYRQwP0ODfvl5FHm4LR4Q4fp11daUujPbIceh2ChYmx/e X-Received: by 2002:ab0:74cc:: with SMTP id f12mr2411684uaq.85.1630597970598; Thu, 02 Sep 2021 08:52:50 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1630597970; cv=none; d=google.com; 
s=arc-20160816; b=hY4ux4jVVdvtR9PQzO81YI8TFRFAAy9LLOSEHg+3h+cSmZdB3IZgB5oklhZcxdJycH +htvlxCppIElbVBjMLbqGBj/AE9/UiwZ7HuLZPhu+hjThyxm8YIWSGAYFkD+kNhREFaf jXTLTnVYm8wUYhps43y7APWqMjM0ooJzGiCXtji4d6URGKNzTUHtKF8EGIsyCpKu0Ok7 fkzEgJHxfukzS9N32VhOQ8lmXJ9KbqZz/8Bo5YqdpBTqN530G+MvYzBguZGrRbSF1ajp P4kpwCDvrpUfkcxUNunJp3OTJ65ejT8fwXRBqJuy843cWwbOTGWZ7FiBfuU4U/socqPC 1uwA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id :date:subject:to:from:dkim-signature; bh=335rQ+WSjl3+MH56U7q8zQP/qJ1J5ng13PI/9oW4pPA=; b=MqY3r/mMnAAMPihfmzFLXQjzQdTHPfdJjDmHXEJZCcFgrKuFh7AJTmu0NQfmYBeero QAJfgLkSfD/JWdMc5CFifZqurjAgJL80eidOrcykX9+hwKM9JPaqSxGoMu7naDGcuuAN Ef9gLGdESn10+NVqstEtJ0dVC6nEipKgMypCOvgVSp3eazhdEmRBufsfdodIKhS7Unuf g33UFBW5Mja1bnmgYXhTv932+P9jdpZAQI9jejMOOfMW6Vnff1fwrnH2Zn0z314MUHGA TmmsTXgAoppSZQYTZNr4v4hAFqmlr/SlaDYCSWm8wdmBrLto6qhzt1v1wumd7GkIy2l2 rmyA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=UThBG8dw; spf=pass (google.com: domain of mptcp+bounces-1818-wpasupplicant.patchew=gmail.com@lists.linux.dev designates 147.75.197.195 as permitted sender) smtp.mailfrom="mptcp+bounces-1818-wpasupplicant.patchew=gmail.com@lists.linux.dev"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from ewr.edge.kernel.org (ewr.edge.kernel.org. 
[147.75.197.195]) by mx.google.com with ESMTPS id m74si945031vkm.90.2021.09.02.08.52.50 for (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Thu, 02 Sep 2021 08:52:50 -0700 (PDT) Received-SPF: pass (google.com: domain of mptcp+bounces-1818-wpasupplicant.patchew=gmail.com@lists.linux.dev designates 147.75.197.195 as permitted sender) client-ip=147.75.197.195; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=UThBG8dw; spf=pass (google.com: domain of mptcp+bounces-1818-wpasupplicant.patchew=gmail.com@lists.linux.dev designates 147.75.197.195 as permitted sender) smtp.mailfrom="mptcp+bounces-1818-wpasupplicant.patchew=gmail.com@lists.linux.dev"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ewr.edge.kernel.org (Postfix) with ESMTPS id 31C7A1C09FF for ; Thu, 2 Sep 2021 15:52:50 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 1A41E2FB6; Thu, 2 Sep 2021 15:52:48 +0000 (UTC) X-Original-To: mptcp@lists.linux.dev Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id ED93D2FB2 for ; Thu, 2 Sep 2021 15:52:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1630597966; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=335rQ+WSjl3+MH56U7q8zQP/qJ1J5ng13PI/9oW4pPA=; 
b=UThBG8dwNVfN8+evkfeTdAOoE/Hp0s4u3Cxks85hW6wU9CnjjxxFZTyRxELDfLJelaHTl7 9sHGZkHHjnI7/sD0jHly96z+8baHAl2Ie4qVoGoXAbdBydJkx1Uj3l2qpk1M8k6XvGJS3L UECcwbK/K6X136C/ifg1mntRicdLb5k= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-492-k_CxOsukNROgHcvpYbzKNg-1; Thu, 02 Sep 2021 11:52:44 -0400 X-MC-Unique: k_CxOsukNROgHcvpYbzKNg-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 193DA801ADA for ; Thu, 2 Sep 2021 15:52:44 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.39.194.237]) by smtp.corp.redhat.com (Postfix) with ESMTP id 8375D5D9DC for ; Thu, 2 Sep 2021 15:52:43 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [PATCH v2 mptcp-next 2/4] mptcp: stop relying on tcp_tx_skb_cache. Date: Thu, 2 Sep 2021 17:52:12 +0200 Message-Id: In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=pabeni@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" We want to revert the skb TX cache, but MPTCP is currently using it unconditionally. Rework the MPTCP tx code so that tcp_tx_skb_cache is no longer needed: do the whole coalescing check, skb allocation, and skb initialization/update inside mptcp_sendmsg_frag(), much like the current TCP code. 
Signed-off-by: Paolo Abeni Reviewed-by: Mat Martineau --- v1 -> v2: - hopefully fix OoB, fetching nr_frags on new skbs --- net/mptcp/protocol.c | 132 +++++++++++++++++++++++++------------------ 1 file changed, 77 insertions(+), 55 deletions(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index faf6e7000d18..101e61bb2a80 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -1224,6 +1224,7 @@ static struct sk_buff *__mptcp_do_alloc_tx_skb(struct sock *sk, gfp_t gfp) if (likely(__mptcp_add_ext(skb, gfp))) { skb_reserve(skb, MAX_TCP_HEADER); skb->reserved_tailroom = skb->end - skb->tail; + INIT_LIST_HEAD(&skb->tcp_tsorted_anchor); return skb; } __kfree_skb(skb); @@ -1233,31 +1234,23 @@ static struct sk_buff *__mptcp_do_alloc_tx_skb(struct sock *sk, gfp_t gfp) return NULL; } -static bool __mptcp_alloc_tx_skb(struct sock *sk, struct sock *ssk, gfp_t gfp) +static struct sk_buff *__mptcp_alloc_tx_skb(struct sock *sk, struct sock *ssk, gfp_t gfp) { struct sk_buff *skb; - if (ssk->sk_tx_skb_cache) { - skb = ssk->sk_tx_skb_cache; - if (unlikely(!skb_ext_find(skb, SKB_EXT_MPTCP) && - !__mptcp_add_ext(skb, gfp))) - return false; - return true; - } - skb = __mptcp_do_alloc_tx_skb(sk, gfp); if (!skb) - return false; + return NULL; if (likely(sk_wmem_schedule(ssk, skb->truesize))) { - ssk->sk_tx_skb_cache = skb; - return true; + skb_entail(ssk, skb); + return skb; } kfree_skb(skb); - return false; + return NULL; } -static bool mptcp_alloc_tx_skb(struct sock *sk, struct sock *ssk, bool data_lock_held) +static struct sk_buff *mptcp_alloc_tx_skb(struct sock *sk, struct sock *ssk, bool data_lock_held) { gfp_t gfp = data_lock_held ? 
GFP_ATOMIC : sk->sk_allocation; @@ -1287,23 +1280,29 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk, struct mptcp_sendmsg_info *info) { u64 data_seq = dfrag->data_seq + info->sent; + int offset = dfrag->offset + info->sent; struct mptcp_sock *msk = mptcp_sk(sk); bool zero_window_probe = false; struct mptcp_ext *mpext = NULL; - struct sk_buff *skb, *tail; - bool must_collapse = false; - int size_bias = 0; - int avail_size; - size_t ret = 0; + bool can_coalesce = false; + bool reuse_skb = true; + struct sk_buff *skb; + size_t copy; + int i; pr_debug("msk=%p ssk=%p sending dfrag at seq=%llu len=%u already sent=%u", msk, ssk, dfrag->data_seq, dfrag->data_len, info->sent); + if (WARN_ON_ONCE(info->sent > info->limit || + info->limit > dfrag->data_len)) + return 0; + /* compute send limit */ info->mss_now = tcp_send_mss(ssk, &info->size_goal, info->flags); - avail_size = info->size_goal; + copy = info->size_goal; + skb = tcp_write_queue_tail(ssk); - if (skb) { + if (skb && (copy > skb->len)) { /* Limit the write to the size available in the * current skb, if any, so that we create at most a new skb. 
* Explicitly tells TCP internals to avoid collapsing on later @@ -1316,53 +1315,76 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk, goto alloc_skb; } - must_collapse = (info->size_goal - skb->len > 0) && - (skb_shinfo(skb)->nr_frags < sysctl_max_skb_frags); - if (must_collapse) { - size_bias = skb->len; - avail_size = info->size_goal - skb->len; + i = skb_shinfo(skb)->nr_frags; + can_coalesce = skb_can_coalesce(skb, i, dfrag->page, offset); + if (!can_coalesce && i >= sysctl_max_skb_frags) { + tcp_mark_push(tcp_sk(ssk), skb); + goto alloc_skb; } - } + copy -= skb->len; + } else { alloc_skb: - if (!must_collapse && !ssk->sk_tx_skb_cache && - !mptcp_alloc_tx_skb(sk, ssk, info->data_lock_held)) - return 0; + skb = mptcp_alloc_tx_skb(sk, ssk, info->data_lock_held); + if (!skb) + return -ENOMEM; + + i = skb_shinfo(skb)->nr_frags; + reuse_skb = false; + mpext = skb_ext_find(skb, SKB_EXT_MPTCP); + } /* Zero window and all data acked? Probe. 
*/ - avail_size = mptcp_check_allowed_size(msk, data_seq, avail_size); - if (avail_size == 0) { + copy = mptcp_check_allowed_size(msk, data_seq, copy); + if (copy == 0) { u64 snd_una = READ_ONCE(msk->snd_una); - if (skb || snd_una != msk->snd_nxt) + if (skb || snd_una != msk->snd_nxt) { + tcp_remove_empty_skb(ssk, tcp_write_queue_tail(ssk)); return 0; + } + zero_window_probe = true; data_seq = snd_una - 1; - avail_size = 1; - } + copy = 1; - if (WARN_ON_ONCE(info->sent > info->limit || - info->limit > dfrag->data_len)) - return 0; + /* all mptcp-level data is acked, no skbs should be present into the + * ssk write queue + */ + WARN_ON_ONCE(reuse_skb); + } - ret = info->limit - info->sent; - tail = tcp_build_frag(ssk, avail_size + size_bias, info->flags, - dfrag->page, dfrag->offset + info->sent, &ret); - if (!tail) { - tcp_remove_empty_skb(sk, tcp_write_queue_tail(ssk)); + copy = min_t(size_t, copy, info->limit - info->sent); + if (!sk_wmem_schedule(ssk, copy)) { + tcp_remove_empty_skb(ssk, tcp_write_queue_tail(ssk)); return -ENOMEM; } - /* if the tail skb is still the cached one, collapsing really happened. 
- */ - if (skb == tail) { - TCP_SKB_CB(tail)->tcp_flags &= ~TCPHDR_PSH; - mpext->data_len += ret; + if (can_coalesce) { + skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], copy); + } else { + get_page(dfrag->page); + skb_fill_page_desc(skb, i, dfrag->page, offset, copy); + } + + skb->len += copy; + skb->data_len += copy; + skb->truesize += copy; + sk_wmem_queued_add(ssk, copy); + sk_mem_charge(ssk, copy); + skb->ip_summed = CHECKSUM_PARTIAL; + WRITE_ONCE(tcp_sk(ssk)->write_seq, tcp_sk(ssk)->write_seq + copy); + TCP_SKB_CB(skb)->end_seq += copy; + tcp_skb_pcount_set(skb, 0); + + /* on skb reuse we just need to update the DSS len */ + if (reuse_skb) { + TCP_SKB_CB(skb)->tcp_flags &= ~TCPHDR_PSH; + mpext->data_len += copy; WARN_ON_ONCE(zero_window_probe); goto out; } - mpext = skb_ext_find(tail, SKB_EXT_MPTCP); if (WARN_ON_ONCE(!mpext)) { /* should never reach here, stream corrupted */ return -EINVAL; @@ -1371,7 +1393,7 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk, memset(mpext, 0, sizeof(*mpext)); mpext->data_seq = data_seq; mpext->subflow_seq = mptcp_subflow_ctx(ssk)->rel_write_seq; - mpext->data_len = ret; + mpext->data_len = copy; mpext->use_map = 1; mpext->dsn64 = 1; @@ -1380,18 +1402,18 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk, mpext->dsn64); if (zero_window_probe) { - mptcp_subflow_ctx(ssk)->rel_write_seq += ret; + mptcp_subflow_ctx(ssk)->rel_write_seq += copy; mpext->frozen = 1; if (READ_ONCE(msk->csum_enabled)) - mptcp_update_data_checksum(tail, ret); + mptcp_update_data_checksum(skb, copy); tcp_push_pending_frames(ssk); return 0; } out: if (READ_ONCE(msk->csum_enabled)) - mptcp_update_data_checksum(tail, ret); - mptcp_subflow_ctx(ssk)->rel_write_seq += ret; - return ret; + mptcp_update_data_checksum(skb, copy); + mptcp_subflow_ctx(ssk)->rel_write_seq += copy; + return copy; } #define MPTCP_SEND_BURST_SIZE ((1 << 16) - \ -- 2.26.3 From nobody 
Thu Mar 28 17:56:04 2024 Delivered-To: wpasupplicant.patchew@gmail.com Received: by 2002:a05:6638:d02:0:0:0:0 with SMTP id q2csp169509jaj; Thu, 2 Sep 2021 08:52:51 -0700 (PDT) X-Google-Smtp-Source: ABdhPJztQGkwZSjj4PJEPtffMUH2X5E54qHaTtvL9BSXSV6ZsMyqkjBc52Uj85M7BiSLUmRzmW0j X-Received: by 2002:a05:6e02:1d06:: with SMTP id i6mr2875373ila.113.1630597971801; Thu, 02 Sep 2021 08:52:51 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1630597971; cv=none; d=google.com; s=arc-20160816; b=pwGcpXa067qVCXhvf3ep1/hScq6SLrp5PZpame2Rl7jU7ykmvMRCkztmaGMTpEbkiF FQbIrLwvu7OI1dUk3tmvo/eULxRMkOMKFgfmw2JXl8mtCnQfccPKrLiKU8Hy71BuksG7 sMC4w+6+S5ISHLJP05co28IdQWze39orO/A8J4OPM2fVRn4rdo/bRgOOir0zPWYR+kvJ FgTzbniZ1+j8v2CWWMbaF0Zc/mIOc1jtAyPYKoVZaCZAP3ATtjHLVw4BoHp126Qk+BIX N5FT9JRc9/LDsuJodJxcfws6nA+Ve5ByaKca+ihDFVLWA+Nei08tNW28/EP+f+Yskc77 rwAQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id :date:subject:to:from:dkim-signature; bh=Lvgijn04rkBbFuXU24CIyP2H68wXwzmEnNA1gqXKzi8=; b=ytTZM1SqRCLEAbLlbawP1aibg2E3A6v7BMg28DXCoto2Es6KVMEMhCtld4BqIHfgv8 H3DKOrlygP9OynSkaOrKU52wYw05t4fQ+yH5/SRQ9kH79vI9sq0MW2FTKxLBSHzrAdQB dIIT75SsOVxMSvF/ziYcZBT+uWDp+gEdILaNJ4E0AOb12y4lhleNPjywKXBclodlAEeE R6I5BE4oxwKyc4cbbOWSRrbPB4xL8EDP78h1JvVvxpjooSXUPtmafNGQ67eaFd0Ww3JR QceBh/Fvnp3t39R1ZtfhfjzcUvs2oeu11dMBAEQXw19qInSdfNqkwQ4EELWUA6k1Cvbb oJOw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="hAz/i5Wa"; spf=pass (google.com: domain of mptcp+bounces-1819-wpasupplicant.patchew=gmail.com@lists.linux.dev designates 2604:1380:1000:8100::1 as permitted sender) smtp.mailfrom="mptcp+bounces-1819-wpasupplicant.patchew=gmail.com@lists.linux.dev"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from sjc.edge.kernel.org (sjc.edge.kernel.org. 
[2604:1380:1000:8100::1]) by mx.google.com with ESMTPS id x11si2370621ion.51.2021.09.02.08.52.51 for (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Thu, 02 Sep 2021 08:52:51 -0700 (PDT) Received-SPF: pass (google.com: domain of mptcp+bounces-1819-wpasupplicant.patchew=gmail.com@lists.linux.dev designates 2604:1380:1000:8100::1 as permitted sender) client-ip=2604:1380:1000:8100::1; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="hAz/i5Wa"; spf=pass (google.com: domain of mptcp+bounces-1819-wpasupplicant.patchew=gmail.com@lists.linux.dev designates 2604:1380:1000:8100::1 as permitted sender) smtp.mailfrom="mptcp+bounces-1819-wpasupplicant.patchew=gmail.com@lists.linux.dev"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sjc.edge.kernel.org (Postfix) with ESMTPS id 1D3953E0F60 for ; Thu, 2 Sep 2021 15:52:51 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 9B99A2FAF; Thu, 2 Sep 2021 15:52:48 +0000 (UTC) X-Original-To: mptcp@lists.linux.dev Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 94E5B2FAE for ; Thu, 2 Sep 2021 15:52:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1630597966; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; 
bh=Lvgijn04rkBbFuXU24CIyP2H68wXwzmEnNA1gqXKzi8=; b=hAz/i5WaCImd4ziQXBDqvXbp0gWeg66GKWVQhyD+jD6sINoC8DSMEXz3F0hON/OwpkPCOS +fTSlp2Pa7sBE7Ad4UIM13oaJvght2uRHZoeY9ZEpGAeugUkjd+VxK5ag56uVvypbdksBN wKdwbmWmqxR8DsWc6I+YaNhRjL2SWzs= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-18-iwcANvgBOImM6k52pswlaQ-1; Thu, 02 Sep 2021 11:52:45 -0400 X-MC-Unique: iwcANvgBOImM6k52pswlaQ-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 02043801AFC for ; Thu, 2 Sep 2021 15:52:45 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.39.194.237]) by smtp.corp.redhat.com (Postfix) with ESMTP id 6D1725D9DC for ; Thu, 2 Sep 2021 15:52:44 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [PATCH v2 mptcp-next 3/4] Partially revert "tcp: factor out tcp_build_frag()" Date: Thu, 2 Sep 2021 17:52:13 +0200 Message-Id: <03dce55be4a164a139da069a3fa3f424cb3f8c95.1630595171.git.pabeni@redhat.com> In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=pabeni@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" This is a partial revert of commit b796d04bd014 ("tcp: factor out tcp_build_frag()"). MPTCP was the only user of the tcp_build_frag() helper, and after the previous patch MPTCP no longer uses it. Let's avoid exposing TCP internals. The revert is partial because tcp_remove_empty_skb(), exposed by the same commit, is still required. 
Signed-off-by: Paolo Abeni Reviewed-by: Mat Martineau --- include/net/tcp.h | 2 - net/ipv4/tcp.c | 117 ++++++++++++++++++++-------------------------- 2 files changed, 51 insertions(+), 68 deletions(-) diff --git a/include/net/tcp.h b/include/net/tcp.h index dc52ea8adfc7..91f4397c4c08 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -330,8 +330,6 @@ int tcp_sendpage(struct sock *sk, struct page *page, int offset, size_t size, int flags); int tcp_sendpage_locked(struct sock *sk, struct page *page, int offset, size_t size, int flags); -struct sk_buff *tcp_build_frag(struct sock *sk, int size_goal, int flags, - struct page *page, int offset, size_t *size); ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset, size_t size, int flags); int tcp_send_mss(struct sock *sk, int *size_goal, int flags); diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 7a3e632b0048..caf0c50d86bc 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -963,68 +963,6 @@ void tcp_remove_empty_skb(struct sock *sk, struct sk_buff *skb) } } -struct sk_buff *tcp_build_frag(struct sock *sk, int size_goal, int flags, - struct page *page, int offset, size_t *size) -{ - struct sk_buff *skb = tcp_write_queue_tail(sk); - struct tcp_sock *tp = tcp_sk(sk); - bool can_coalesce; - int copy, i; - - if (!skb || (copy = size_goal - skb->len) <= 0 || - !tcp_skb_can_collapse_to(skb)) { -new_segment: - if (!sk_stream_memory_free(sk)) - return NULL; - - skb = sk_stream_alloc_skb(sk, 0, sk->sk_allocation, - tcp_rtx_and_write_queues_empty(sk)); - if (!skb) - return NULL; - -#ifdef CONFIG_TLS_DEVICE - skb->decrypted = !!(flags & MSG_SENDPAGE_DECRYPTED); -#endif - skb_entail(sk, skb); - copy = size_goal; - } - - if (copy > *size) - copy = *size; - - i = skb_shinfo(skb)->nr_frags; - can_coalesce = skb_can_coalesce(skb, i, page, offset); - if (!can_coalesce && i >= sysctl_max_skb_frags) { - tcp_mark_push(tp, skb); - goto new_segment; - } - if (!sk_wmem_schedule(sk, 
copy)) - return NULL; - - if (can_coalesce) { - skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], copy); - } else { - get_page(page); - skb_fill_page_desc(skb, i, page, offset, copy); - } - - if (!(flags & MSG_NO_SHARED_FRAGS)) - skb_shinfo(skb)->flags |= SKBFL_SHARED_FRAG; - - skb->len += copy; - skb->data_len += copy; - skb->truesize += copy; - sk_wmem_queued_add(sk, copy); - sk_mem_charge(sk, copy); - skb->ip_summed = CHECKSUM_PARTIAL; - WRITE_ONCE(tp->write_seq, tp->write_seq + copy); - TCP_SKB_CB(skb)->end_seq += copy; - tcp_skb_pcount_set(skb, 0); - - *size = copy; - return skb; -} - ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset, size_t size, int flags) { @@ -1060,13 +998,60 @@ ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset, goto out_err; while (size > 0) { - struct sk_buff *skb; - size_t copy = size; + struct sk_buff *skb = tcp_write_queue_tail(sk); + int copy, i; + bool can_coalesce; - skb = tcp_build_frag(sk, size_goal, flags, page, offset, &copy); - if (!skb) + if (!skb || (copy = size_goal - skb->len) <= 0 || + !tcp_skb_can_collapse_to(skb)) { +new_segment: + if (!sk_stream_memory_free(sk)) + goto wait_for_space; + + skb = sk_stream_alloc_skb(sk, 0, sk->sk_allocation, + tcp_rtx_and_write_queues_empty(sk)); + if (!skb) + goto wait_for_space; + +#ifdef CONFIG_TLS_DEVICE + skb->decrypted = !!(flags & MSG_SENDPAGE_DECRYPTED); +#endif + skb_entail(sk, skb); + copy = size_goal; + } + + if (copy > size) + copy = size; + + i = skb_shinfo(skb)->nr_frags; + can_coalesce = skb_can_coalesce(skb, i, page, offset); + if (!can_coalesce && i >= sysctl_max_skb_frags) { + tcp_mark_push(tp, skb); + goto new_segment; + } + if (!sk_wmem_schedule(sk, copy)) goto wait_for_space; + if (can_coalesce) { + skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], copy); + } else { + get_page(page); + skb_fill_page_desc(skb, i, page, offset, copy); + } + + if (!(flags & MSG_NO_SHARED_FRAGS)) 
+ skb_shinfo(skb)->flags |= SKBFL_SHARED_FRAG; + + skb->len += copy; + skb->data_len += copy; + skb->truesize += copy; + sk_wmem_queued_add(sk, copy); + sk_mem_charge(sk, copy); + skb->ip_summed = CHECKSUM_PARTIAL; + WRITE_ONCE(tp->write_seq, tp->write_seq + copy); + TCP_SKB_CB(skb)->end_seq += copy; + tcp_skb_pcount_set(skb, 0); + if (!copied) TCP_SKB_CB(skb)->tcp_flags &= ~TCPHDR_PSH; -- 2.26.3 From nobody Thu Mar 28 17:56:04 2024 Delivered-To: wpasupplicant.patchew@gmail.com Received: by 2002:a05:6638:d02:0:0:0:0 with SMTP id q2csp169526jaj; Thu, 2 Sep 2021 08:52:53 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyG1C48BjD4GPUTFUs4pLW6xYAl6tDo9hEBn6NDpbxYVsICPGLl4F09nr8nxuCch88V0PJo X-Received: by 2002:a67:ed1a:: with SMTP id l26mr2938334vsp.8.1630597972938; Thu, 02 Sep 2021 08:52:52 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1630597972; cv=none; d=google.com; s=arc-20160816; b=zwGfs5UnbokAhYOujTUrzJ2jGH0OPNWda/zHhj4adBjJvfPHiTDRl2e/VapGjKDXow DXZMldeg4BnXeIQloP1harmfEl6zcJwCLs81F3XaaV8Gin4psmOBgnPktpKznxeCld64 eOpCQZT/xF3znzJAe0lyF3+7gUm8W+imlyzHZVft7BRAEAuasQ27weV7FZBbg8wPfVwd 7OiepbrKhIC6O9WVGd6rioQgoewXLo1XhooL/uqcJNC1LeSv/M36nSKgUjkBuBEd5Zyj 5+/r07py0fyntIcsDDipMbMKtkt4GFglSlLZD16QZ7cjJqaM7L6PPgyK0Ey03RLe/lbw +2SQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id :date:subject:to:from:dkim-signature; bh=0sXuSGNUZPcTS2HU99UWlUvle8o+zTUPolerURfGs60=; b=WWhi4tcEk9gDdOPiUPOOYxmMMtN7G3Ygp7k2leVyxjglPc2XK3nfl0+TZlgWZtfKTL CMS1rEYzhmdxkNyLnWG3UaFdTKQdsWavPIQQI1VBUKQtjM+jRUBE7h1ENaUcbybC4XV1 vWRlkWY3lQ1JcaRvsV7tfkglIkacMHhMpyJ4U0N/kWaTAgT4zOoWFgV07gQ+KwffwGAo DN6kLm5AtvpTp19c5oCK41nFHr+STIvX6tb4DGAZcpYKqnzKAv0mzUuH/I6vKHBR6ewM WKPEOkqyHaMxKX0DjlIm35M8L5gCkHbelJ6ltn9NXz+yhyTZe5F79iD5U9OMuWDoGKS7 8GZw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com 
header.s=mimecast20190719 header.b=YwqbZDm2; spf=pass (google.com: domain of mptcp+bounces-1820-wpasupplicant.patchew=gmail.com@lists.linux.dev designates 147.75.197.195 as permitted sender) smtp.mailfrom="mptcp+bounces-1820-wpasupplicant.patchew=gmail.com@lists.linux.dev"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from ewr.edge.kernel.org (ewr.edge.kernel.org. [147.75.197.195]) by mx.google.com with ESMTPS id o15si853176uap.87.2021.09.02.08.52.52 for (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Thu, 02 Sep 2021 08:52:52 -0700 (PDT) Received-SPF: pass (google.com: domain of mptcp+bounces-1820-wpasupplicant.patchew=gmail.com@lists.linux.dev designates 147.75.197.195 as permitted sender) client-ip=147.75.197.195; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=YwqbZDm2; spf=pass (google.com: domain of mptcp+bounces-1820-wpasupplicant.patchew=gmail.com@lists.linux.dev designates 147.75.197.195 as permitted sender) smtp.mailfrom="mptcp+bounces-1820-wpasupplicant.patchew=gmail.com@lists.linux.dev"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ewr.edge.kernel.org (Postfix) with ESMTPS id 833521C0B62 for ; Thu, 2 Sep 2021 15:52:52 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 0989B2FB2; Thu, 2 Sep 2021 15:52:50 +0000 (UTC) X-Original-To: mptcp@lists.linux.dev Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DAAE02FAE for ; Thu, 2 Sep 2021 15:52:48 
+0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1630597967; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0sXuSGNUZPcTS2HU99UWlUvle8o+zTUPolerURfGs60=; b=YwqbZDm2zGDp2baXEO+D7piUx1BLSzNI3vHVACr5L3cc+ihyQPJAVNKn7G9Z984CTMuvSR OXmH0dAxE6fZHlPMNIOQnNNOxA9sMzvuC9HacCCEGHJ6nmvaPSVFtUWtgYit3UOABkg2Fc XdL2xdA4i1NkLpltSlNiN2HcNzkc90E= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-21-Bhr7UB9DPB62cNIwhvBybA-1; Thu, 02 Sep 2021 11:52:46 -0400 X-MC-Unique: Bhr7UB9DPB62cNIwhvBybA-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id DF221189C444 for ; Thu, 2 Sep 2021 15:52:45 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.39.194.237]) by smtp.corp.redhat.com (Postfix) with ESMTP id 55DA35D9DC for ; Thu, 2 Sep 2021 15:52:45 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [PATCH v2 mptcp-next 4/4] tcp: remove sk_{tr}x_skb_cache Date: Thu, 2 Sep 2021 17:52:14 +0200 Message-Id: <8f93fb14f915b068c34f559e42d706a24792edbf.1630595171.git.pabeni@redhat.com> In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=pabeni@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Eric Dumazet This reverts the following patches : 
2e05fcae83c41eb2df10558338dc600dc783af47 ("tcp: fix compile error if !CONFIG_SYSCTL")
4f661542a40217713f2cee0bb6678fbb30d9d367 ("tcp: fix zerocopy and notsent_lowat issues")
472c2e07eef045145bc1493cc94a01c87140780a ("tcp: add one skb cache for tx")
8b27dae5a2e89a61c46c6dbc76c040c0e6d0ed4c ("tcp: add one skb cache for rx")

Having a cache of one skb (in each direction) per TCP socket is fragile,
since it can cause a significant increase of memory needs,
and not good enough for high speed flows anyway where more than one skb
is needed.

We want instead to add a generic infrastructure, with more flexible
per-cpu caches, for alien NUMA nodes.

Signed-off-by: Eric Dumazet
Reviewed-by: Mat Martineau
---
 Documentation/networking/ip-sysctl.rst |  8 --------
 include/net/sock.h                     | 19 -------------------
 net/ipv4/af_inet.c                     |  4 ----
 net/ipv4/sysctl_net_ipv4.c             | 12 ------------
 net/ipv4/tcp.c                         | 26 --------------------------
 net/ipv4/tcp_ipv4.c                    |  6 ------
 net/ipv6/tcp_ipv6.c                    |  6 ------
 7 files changed, 81 deletions(-)

diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index d91ab28718d4..16b8bf72feaf 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -989,14 +989,6 @@ tcp_challenge_ack_limit - INTEGER
 	in RFC 5961 (Improving TCP's Robustness to Blind In-Window Attacks)
 	Default: 1000
 
-tcp_rx_skb_cache - BOOLEAN
-	Controls a per TCP socket cache of one skb, that might help
-	performance of some workloads. This might be dangerous
-	on systems with a lot of TCP sockets, since it increases
-	memory usage.
-
-	Default: 0 (disabled)
-
 UDP variables
 =============
 
diff --git a/include/net/sock.h b/include/net/sock.h
index 66a9a90f9558..708b9de3cdbb 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -262,7 +262,6 @@ struct bpf_local_storage;
   *	@sk_dst_cache: destination cache
   *	@sk_dst_pending_confirm: need to confirm neighbour
   *	@sk_policy: flow policy
-  *	@sk_rx_skb_cache: cache copy of recently accessed RX skb
   *	@sk_receive_queue: incoming packets
   *	@sk_wmem_alloc: transmit queue bytes committed
   *	@sk_tsq_flags: TCP Small Queues flags
@@ -328,7 +327,6 @@ struct bpf_local_storage;
   *	@sk_peek_off: current peek_offset value
   *	@sk_send_head: front of stuff to transmit
   *	@tcp_rtx_queue: TCP re-transmit queue [union with @sk_send_head]
-  *	@sk_tx_skb_cache: cache copy of recently accessed TX skb
   *	@sk_security: used by security modules
   *	@sk_mark: generic packet mark
   *	@sk_cgrp_data: cgroup data for this cgroup
@@ -393,7 +391,6 @@ struct sock {
 	atomic_t		sk_drops;
 	int			sk_rcvlowat;
 	struct sk_buff_head	sk_error_queue;
-	struct sk_buff		*sk_rx_skb_cache;
 	struct sk_buff_head	sk_receive_queue;
 	/*
 	 * The backlog queue is special, it is always used with
@@ -442,7 +439,6 @@ struct sock {
 		struct sk_buff	*sk_send_head;
 		struct rb_root	tcp_rtx_queue;
 	};
-	struct sk_buff		*sk_tx_skb_cache;
 	struct sk_buff_head	sk_write_queue;
 	__s32			sk_peek_off;
 	int			sk_write_pending;
@@ -1555,18 +1551,10 @@ static inline void sk_mem_uncharge(struct sock *sk, int size)
 		__sk_mem_reclaim(sk, 1 << 20);
 }
 
-DECLARE_STATIC_KEY_FALSE(tcp_tx_skb_cache_key);
 static inline void sk_wmem_free_skb(struct sock *sk, struct sk_buff *skb)
 {
 	sk_wmem_queued_add(sk, -skb->truesize);
 	sk_mem_uncharge(sk, skb->truesize);
-	if (static_branch_unlikely(&tcp_tx_skb_cache_key) &&
-	    !sk->sk_tx_skb_cache && !skb_cloned(skb)) {
-		skb_ext_reset(skb);
-		skb_zcopy_clear(skb, true);
-		sk->sk_tx_skb_cache = skb;
-		return;
-	}
 	__kfree_skb(skb);
 }
 
@@ -2575,7 +2563,6 @@ static inline void skb_setup_tx_timestamp(struct sk_buff *skb, __u16 tsflags)
 			   &skb_shinfo(skb)->tskey);
 }
 
-DECLARE_STATIC_KEY_FALSE(tcp_rx_skb_cache_key);
 /**
  * sk_eat_skb - Release a skb if it is no longer needed
  * @sk: socket to eat this skb from
@@ -2587,12 +2574,6 @@ DECLARE_STATIC_KEY_FALSE(tcp_rx_skb_cache_key);
 static inline void sk_eat_skb(struct sock *sk, struct sk_buff *skb)
 {
 	__skb_unlink(skb, &sk->sk_receive_queue);
-	if (static_branch_unlikely(&tcp_rx_skb_cache_key) &&
-	    !sk->sk_rx_skb_cache) {
-		sk->sk_rx_skb_cache = skb;
-		skb_orphan(skb);
-		return;
-	}
 	__kfree_skb(skb);
 }
 
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 9dc7613e589d..63eda8cb0d26 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -133,10 +133,6 @@ void inet_sock_destruct(struct sock *sk)
 	struct inet_sock *inet = inet_sk(sk);
 
 	__skb_queue_purge(&sk->sk_receive_queue);
-	if (sk->sk_rx_skb_cache) {
-		__kfree_skb(sk->sk_rx_skb_cache);
-		sk->sk_rx_skb_cache = NULL;
-	}
 	__skb_queue_purge(&sk->sk_error_queue);
 
 	sk_mem_reclaim(sk);
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 6f1e64d49232..6eb43dc91218 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -594,18 +594,6 @@ static struct ctl_table ipv4_table[] = {
 		.extra1		= &sysctl_fib_sync_mem_min,
 		.extra2		= &sysctl_fib_sync_mem_max,
 	},
-	{
-		.procname	= "tcp_rx_skb_cache",
-		.data		= &tcp_rx_skb_cache_key.key,
-		.mode		= 0644,
-		.proc_handler	= proc_do_static_key,
-	},
-	{
-		.procname	= "tcp_tx_skb_cache",
-		.data		= &tcp_tx_skb_cache_key.key,
-		.mode		= 0644,
-		.proc_handler	= proc_do_static_key,
-	},
 	{ }
 };
 
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index caf0c50d86bc..cbb0f807be46 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -325,11 +325,6 @@ struct tcp_splice_state {
 unsigned long tcp_memory_pressure __read_mostly;
 EXPORT_SYMBOL_GPL(tcp_memory_pressure);
 
-DEFINE_STATIC_KEY_FALSE(tcp_rx_skb_cache_key);
-EXPORT_SYMBOL(tcp_rx_skb_cache_key);
-
-DEFINE_STATIC_KEY_FALSE(tcp_tx_skb_cache_key);
-
 void tcp_enter_memory_pressure(struct sock *sk)
 {
 	unsigned long val;
@@ -866,18 +861,6 @@ struct sk_buff *sk_stream_alloc_skb(struct sock *sk, int size, gfp_t gfp,
 {
 	struct sk_buff *skb;
 
-	if (likely(!size)) {
-		skb = sk->sk_tx_skb_cache;
-		if (skb) {
-			skb->truesize = SKB_TRUESIZE(skb_end_offset(skb));
-			sk->sk_tx_skb_cache = NULL;
-			pskb_trim(skb, 0);
-			INIT_LIST_HEAD(&skb->tcp_tsorted_anchor);
-			skb_shinfo(skb)->tx_flags = 0;
-			memset(TCP_SKB_CB(skb), 0, sizeof(struct tcp_skb_cb));
-			return skb;
-		}
-	}
 	/* The TCP header must be at least 32-bit aligned. */
 	size = ALIGN(size, 4);
 
@@ -2905,11 +2888,6 @@ void tcp_write_queue_purge(struct sock *sk)
 		sk_wmem_free_skb(sk, skb);
 	}
 	tcp_rtx_queue_purge(sk);
-	skb = sk->sk_tx_skb_cache;
-	if (skb) {
-		__kfree_skb(skb);
-		sk->sk_tx_skb_cache = NULL;
-	}
 	INIT_LIST_HEAD(&tcp_sk(sk)->tsorted_sent_queue);
 	sk_mem_reclaim(sk);
 	tcp_clear_all_retrans_hints(tcp_sk(sk));
@@ -2946,10 +2924,6 @@ int tcp_disconnect(struct sock *sk, int flags)
 
 	tcp_clear_xmit_timers(sk);
 	__skb_queue_purge(&sk->sk_receive_queue);
-	if (sk->sk_rx_skb_cache) {
-		__kfree_skb(sk->sk_rx_skb_cache);
-		sk->sk_rx_skb_cache = NULL;
-	}
 	WRITE_ONCE(tp->copied_seq, tp->rcv_nxt);
 	tp->urg_data = 0;
 	tcp_write_queue_purge(sk);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 2e62e0d6373a..29a57bd159f0 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1941,7 +1941,6 @@ static void tcp_v4_fill_cb(struct sk_buff *skb, const struct iphdr *iph,
 int tcp_v4_rcv(struct sk_buff *skb)
 {
 	struct net *net = dev_net(skb->dev);
-	struct sk_buff *skb_to_free;
 	int sdif = inet_sdif(skb);
 	int dif = inet_iif(skb);
 	const struct iphdr *iph;
@@ -2082,17 +2081,12 @@ int tcp_v4_rcv(struct sk_buff *skb)
 	tcp_segs_in(tcp_sk(sk), skb);
 	ret = 0;
 	if (!sock_owned_by_user(sk)) {
-		skb_to_free = sk->sk_rx_skb_cache;
-		sk->sk_rx_skb_cache = NULL;
 		ret = tcp_v4_do_rcv(sk, skb);
 	} else {
 		if (tcp_add_backlog(sk, skb))
 			goto discard_and_relse;
-		skb_to_free = NULL;
 	}
 	bh_unlock_sock(sk);
-	if (skb_to_free)
-		__kfree_skb(skb_to_free);
 
 put_and_return:
 	if (refcounted)
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 0ce52d46e4f8..8cf5ff2e9504 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1618,7 +1618,6 @@ static void tcp_v6_fill_cb(struct sk_buff *skb, const struct ipv6hdr *hdr,
 
 INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
 {
-	struct sk_buff *skb_to_free;
 	int sdif = inet6_sdif(skb);
 	int dif = inet6_iif(skb);
 	const struct tcphdr *th;
@@ -1754,17 +1753,12 @@ INDIRECT_CALLABLE_SCOPE int tcp_v6_rcv(struct sk_buff *skb)
 	tcp_segs_in(tcp_sk(sk), skb);
 	ret = 0;
 	if (!sock_owned_by_user(sk)) {
-		skb_to_free = sk->sk_rx_skb_cache;
-		sk->sk_rx_skb_cache = NULL;
 		ret = tcp_v6_do_rcv(sk, skb);
 	} else {
 		if (tcp_add_backlog(sk, skb))
 			goto discard_and_relse;
-		skb_to_free = NULL;
 	}
 	bh_unlock_sock(sk);
-	if (skb_to_free)
-		__kfree_skb(skb_to_free);
 put_and_return:
 	if (refcounted)
 		sock_put(sk);
-- 
2.26.3