From: Paolo Abeni
To: mptcp@lists.linux.dev
Subject: [PATCH mptcp-net v4 3/6] Squash-to: "mptcp: invoke MP_FAIL response when needed"
Date: Mon, 20 Jun 2022 13:26:33 +0200

This tries to address a few issues outstanding in the mentioned patch:
- we explicitly need to reset the timeout timer for mp_fail's sake
- we need to explicitly generate a tcp ack for mp_fail, otherwise
  there are no guarantees for such option being sent out
- the timeout timer handling needs some care, as it's
  still shared between mp_fail and msk socket timeout.
- we can re-use msk->first for msk->fail_ssk, as only the first/mpc
  subflow can fail without reset. That additionally avoids the need
  to clear fail_ssk on the relevant subflow close.
- fail_tout would need some additional annotation. Just to be on the
  safe side move its manipulation under the ssk socket lock.

Last 2 paragraphs of the squash-to commit should be replaced with:

"""
It leverages the fact that only the MPC/first subflow can gracefully
fail to avoid unneeded subflows traversal: the failing subflow can
be only msk->first.

A new 'fail_tout' field is added to the subflow context to record the
MP_FAIL response timeout and use such field to reliably share the
timeout timer between the MP_FAIL event and the MPTCP socket close
timeout.

Finally, a new ack is generated to send out MP_FAIL notification as soon
as we hit the relevant condition, instead of waiting a possibly unbound
time for the next data packet.
"""

Signed-off-by: Paolo Abeni
---
v3 -> v4:
 - fixed a couple of typos in commit message
---
 net/mptcp/pm.c       |  4 +++-
 net/mptcp/protocol.c | 50 ++++++++++++++++++++++++++++++++++++--------
 net/mptcp/protocol.h |  4 ++--
 net/mptcp/subflow.c  | 30 ++++++++++++++++++++++++++++--
 4 files changed, 74 insertions(+), 14 deletions(-)

diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c
index 3c7f07bb124e..45e2a48397b9 100644
--- a/net/mptcp/pm.c
+++ b/net/mptcp/pm.c
@@ -305,13 +305,15 @@ void mptcp_pm_mp_fail_received(struct sock *sk, u64 fail_seq)
 	if (!READ_ONCE(msk->allow_infinite_fallback))
 		return;
 
-	if (!msk->fail_ssk) {
+	if (!subflow->fail_tout) {
 		pr_debug("send MP_FAIL response and infinite map");
 
 		subflow->send_mp_fail = 1;
 		subflow->send_infinite_map = 1;
+		tcp_send_ack(sk);
 	} else {
 		pr_debug("MP_FAIL response received");
+		WRITE_ONCE(subflow->fail_tout, 0);
 	}
 }
 
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index a0f9f3831509..725fd417ebb1 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -500,7 +500,7 @@ static void mptcp_set_timeout(struct sock *sk)
 	__mptcp_set_timeout(sk, tout);
 }
 
-static bool tcp_can_send_ack(const struct sock *ssk)
+static inline bool tcp_can_send_ack(const struct sock *ssk)
 {
 	return !((1 << inet_sk_state_load(ssk)) &
 	       (TCPF_SYN_SENT | TCPF_SYN_RECV | TCPF_TIME_WAIT | TCPF_CLOSE | TCPF_LISTEN));
@@ -2490,24 +2490,50 @@ static void __mptcp_retrans(struct sock *sk)
 		mptcp_reset_timer(sk);
 }
 
+/* schedule the timeout timer for the relevant event: either close timeout
+ * or mp_fail timeout. The close timeout takes precedence on the mp_fail one
+ */
+void mptcp_reset_timeout(struct mptcp_sock *msk, unsigned long fail_tout)
+{
+	struct sock *sk = (struct sock *)msk;
+	unsigned long timeout, close_timeout;
+
+	if (!fail_tout && !sock_flag(sk, SOCK_DEAD))
+		return;
+
+	close_timeout = inet_csk(sk)->icsk_mtup.probe_timestamp - tcp_jiffies32 + jiffies + TCP_TIMEWAIT_LEN;
+
+	/* the close timeout takes precedence on the fail one, and here at least one of
+	 * them is active
+	 */
+	timeout = sock_flag(sk, SOCK_DEAD) ? close_timeout : fail_tout;
+
+	sk_reset_timer(sk, &sk->sk_timer, timeout);
+}
+
 static void mptcp_mp_fail_no_response(struct mptcp_sock *msk)
 {
-	struct sock *ssk = msk->fail_ssk;
+	struct sock *ssk = msk->first;
 	bool slow;
 
+	if (!ssk)
+		return;
+
 	pr_debug("MP_FAIL doesn't respond, reset the subflow");
 
 	slow = lock_sock_fast(ssk);
 	mptcp_subflow_reset(ssk);
+	WRITE_ONCE(mptcp_subflow_ctx(ssk)->fail_tout, 0);
 	unlock_sock_fast(ssk, slow);
 
-	msk->fail_ssk = NULL;
+	mptcp_reset_timeout(msk, 0);
 }
 
 static void mptcp_worker(struct work_struct *work)
 {
 	struct mptcp_sock *msk = container_of(work, struct mptcp_sock, work);
 	struct sock *sk = &msk->sk.icsk_inet.sk;
+	unsigned long fail_tout;
 	int state;
 
 	lock_sock(sk);
@@ -2544,7 +2570,8 @@ static void mptcp_worker(struct work_struct *work)
 	if (test_and_clear_bit(MPTCP_WORK_RTX, &msk->flags))
 		__mptcp_retrans(sk);
 
-	if (msk->fail_ssk && time_after(jiffies, msk->fail_tout))
+	fail_tout = msk->first ? READ_ONCE(mptcp_subflow_ctx(msk->first)->fail_tout) : 0;
+	if (fail_tout && time_after(jiffies, fail_tout))
 		mptcp_mp_fail_no_response(msk);
 
 unlock:
@@ -2572,8 +2599,6 @@ static int __mptcp_init_sock(struct sock *sk)
 	WRITE_ONCE(msk->csum_enabled, mptcp_is_checksum_enabled(sock_net(sk)));
 	WRITE_ONCE(msk->allow_infinite_fallback, true);
 	msk->recovery = false;
-	msk->fail_ssk = NULL;
-	msk->fail_tout = 0;
 
 	mptcp_pm_data_init(msk);
 
@@ -2804,6 +2829,7 @@ static void __mptcp_destroy_sock(struct sock *sk)
 static void mptcp_close(struct sock *sk, long timeout)
 {
 	struct mptcp_subflow_context *subflow;
+	struct mptcp_sock *msk = mptcp_sk(sk);
 	bool do_cancel_work = false;
 
 	lock_sock(sk);
@@ -2822,10 +2848,16 @@ static void mptcp_close(struct sock *sk, long timeout)
 cleanup:
 	/* orphan all the subflows */
 	inet_csk(sk)->icsk_mtup.probe_timestamp = tcp_jiffies32;
-	mptcp_for_each_subflow(mptcp_sk(sk), subflow) {
+	mptcp_for_each_subflow(msk, subflow) {
 		struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
 		bool slow = lock_sock_fast_nested(ssk);
 
+		/* since the close timeout takes precedence on the fail one,
+		 * cancel the latter
+		 */
+		if (ssk == msk->first)
+			subflow->fail_tout = 0;
+
 		sock_orphan(ssk);
 		unlock_sock_fast(ssk, slow);
 	}
@@ -2834,13 +2866,13 @@ static void mptcp_close(struct sock *sk, long timeout)
 	sock_hold(sk);
 	pr_debug("msk=%p state=%d", sk, sk->sk_state);
 	if (mptcp_sk(sk)->token)
-		mptcp_event(MPTCP_EVENT_CLOSED, mptcp_sk(sk), NULL, GFP_KERNEL);
+		mptcp_event(MPTCP_EVENT_CLOSED, msk, NULL, GFP_KERNEL);
 
 	if (sk->sk_state == TCP_CLOSE) {
 		__mptcp_destroy_sock(sk);
 		do_cancel_work = true;
 	} else {
-		sk_reset_timer(sk, &sk->sk_timer, jiffies + TCP_TIMEWAIT_LEN);
+		mptcp_reset_timeout(msk, 0);
 	}
 	release_sock(sk);
 	if (do_cancel_work)
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index bef7dea9f358..077a717799a0 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -306,8 +306,6 @@ struct mptcp_sock {
 
 	u32 setsockopt_seq;
 	char ca_name[TCP_CA_NAME_MAX];
-	struct sock *fail_ssk;
-	unsigned long fail_tout;
 };
 
 #define mptcp_data_lock(sk) spin_lock_bh(&(sk)->sk_lock.slock)
@@ -484,6 +482,7 @@ struct mptcp_subflow_context {
 	u8 stale_count;
 
 	long delegated_status;
+	unsigned long fail_tout;
 
 	);
 
@@ -677,6 +676,7 @@ void mptcp_get_options(const struct sk_buff *skb,
 
 void mptcp_finish_connect(struct sock *sk);
 void __mptcp_set_connected(struct sock *sk);
+void mptcp_reset_timeout(struct mptcp_sock *msk, unsigned long fail_tout);
 static inline bool mptcp_is_fully_established(struct sock *sk)
 {
 	return inet_sk_state_load(sk) == TCP_ESTABLISHED &&
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 98b12a9c4eb5..238330da3f1f 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -1158,6 +1158,33 @@ static bool subflow_can_fallback(struct mptcp_subflow_context *subflow)
 	return !subflow->fully_established;
 }
 
+static void mptcp_subflow_fail(struct mptcp_sock *msk, struct sock *ssk)
+{
+	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
+	unsigned long fail_tout;
+
+	/* graceful failure can happen only on the MPC subflow */
+	if (WARN_ON_ONCE(ssk != READ_ONCE(msk->first)))
+		return;
+
+	/* since the close timeout takes precedence on the fail one,
+	 * no need to start the latter when the first is already set
+	 */
+	if (sock_flag((struct sock *)msk, SOCK_DEAD))
+		return;
+
+	/* we don't need extreme accuracy here, use a zero fail_tout as special
+	 * value meaning no fail timeout at all;
+	 */
+	fail_tout = jiffies + TCP_RTO_MAX;
+	if (!fail_tout)
+		fail_tout = 1;
+	WRITE_ONCE(subflow->fail_tout, fail_tout);
+	tcp_send_ack(ssk);
+
+	mptcp_reset_timeout(msk, subflow->fail_tout);
+}
+
 static bool subflow_check_data_avail(struct sock *ssk)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
@@ -1233,8 +1260,7 @@ static bool subflow_check_data_avail(struct sock *ssk)
 				while ((skb = skb_peek(&ssk->sk_receive_queue)))
 					sk_eat_skb(ssk, skb);
 			} else {
-				msk->fail_ssk = ssk;
-				msk->fail_tout = jiffies + TCP_RTO_MAX;
+				mptcp_subflow_fail(msk, ssk);
 			}
 			WRITE_ONCE(subflow->data_avail, MPTCP_SUBFLOW_NODATA);
 			return true;
-- 
2.35.3
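
Illustration only, not part of the patch: a minimal userspace sketch of
the two ideas the changelog relies on, namely reserving a zero fail_tout
to mean "no MP_FAIL timeout armed" and letting the close timeout take
precedence when both events share one timer. The names ticks_t,
ticks_after(), fail_deadline() and pick_timeout() are hypothetical
stand-ins, not kernel APIs; ticks_after() only mirrors the idea behind
time_after().

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a jiffies-like tick counter. */
typedef unsigned long ticks_t;

/* Wrap-safe "a is later than b": compare the difference as a signed value.
 * The unsigned-to-signed conversion is implementation-defined in ISO C;
 * this sketch assumes the usual two's complement behaviour.
 */
static bool ticks_after(ticks_t a, ticks_t b)
{
	return (long)(b - a) < 0;
}

/* Compute the MP_FAIL deadline. Zero is reserved as "no timeout armed",
 * so a deadline that wraps to exactly zero is nudged to 1; one tick of
 * inaccuracy is acceptable here.
 */
static ticks_t fail_deadline(ticks_t now, ticks_t delay)
{
	ticks_t tout = now + delay;	/* unsigned overflow wraps, well defined */

	if (!tout)
		tout = 1;
	return tout;
}

/* Choose the expiry for the single shared timer. Returns false when
 * neither event is active and the timer must not be touched; otherwise
 * stores the expiry, with the close timeout winning over the fail one.
 */
static bool pick_timeout(bool closing, ticks_t close_tout, ticks_t fail_tout,
			 ticks_t *expiry)
{
	assert(expiry != NULL);

	if (!fail_tout && !closing)
		return false;

	*expiry = closing ? close_tout : fail_tout;
	return true;
}
```

A caller would store the fail_deadline() result in the per-subflow
field, pass it to pick_timeout() whenever the timer must be re-armed,
and later check expiry with ticks_after(now, fail_tout) only when
fail_tout is non-zero.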