From: Paolo Abeni
To: mptcp@lists.linux.dev
Subject: [PATCH mptcp-next 1/2] mptcp: cleanup MPJ subflow list handling
Date: Mon, 6 Dec 2021 18:34:46 +0100

We can simplify the join
list handling leveraging the mptcp_release_cb(): if we can acquire the
msk socket lock at mptcp_finish_join() time, move the new subflow
directly into the conn_list, otherwise place it on join_list and let
the release_cb process such list.

Since pending MPJ connections are now always processed in a timely way,
we can avoid flushing the join list every time we have to process all
the current subflows.

Additionally we can now use the mptcp data lock to protect the
join_list, removing the additional spin lock.

Finally, since the MPJ handshake is now always finalized under the msk
socket lock, we can drop the additional synchronization between
mptcp_finish_join() and mptcp_close().

Signed-off-by: Paolo Abeni
Reviewed-by: Mat Martineau
---
rfc -> v1:
 - fix list corruption in __mptcp_flush_join_list()
 - fix label name typo (Mat)
---
 net/mptcp/pm_netlink.c |   3 --
 net/mptcp/protocol.c   | 109 +++++++++++++++++------------------
 net/mptcp/protocol.h   |  15 +----
 net/mptcp/sockopt.c    |  24 +++------
 net/mptcp/subflow.c    |   5 +-
 5 files changed, 56 insertions(+), 100 deletions(-)

diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index 48963ae4c610..3162e8d25d05 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -165,7 +165,6 @@ select_local_address(const struct pm_nl_pernet *pernet,
 	msk_owned_by_me(msk);
 
 	rcu_read_lock();
-	__mptcp_flush_join_list(msk);
 	list_for_each_entry_rcu(entry, &pernet->local_addr_list, list) {
 		if (!(entry->flags & MPTCP_PM_ADDR_FLAG_SUBFLOW))
 			continue;
@@ -544,7 +543,6 @@ static unsigned int fill_local_addresses_vec(struct mptcp_sock *msk,
 	subflows_max = mptcp_pm_get_subflows_max(msk);
 
 	rcu_read_lock();
-	__mptcp_flush_join_list(msk);
 	list_for_each_entry_rcu(entry, &pernet->local_addr_list, list) {
 		if (!(entry->flags & MPTCP_PM_ADDR_FLAG_FULLMESH))
 			continue;
@@ -633,7 +631,6 @@ void mptcp_pm_nl_addr_send_ack(struct mptcp_sock *msk)
 	    !mptcp_pm_should_rm_signal(msk))
 		return;
 
-	__mptcp_flush_join_list(msk);
 	subflow = list_first_entry_or_null(&msk->conn_list, typeof(*subflow), node);
 	if (subflow) {
 		struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index accd109fd86d..38651c80b321 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -808,47 +808,31 @@ void mptcp_data_ready(struct sock *sk, struct sock *ssk)
 	mptcp_data_unlock(sk);
 }
 
-static bool mptcp_do_flush_join_list(struct mptcp_sock *msk)
+static bool __mptcp_finish_join(struct mptcp_sock *msk, struct sock *ssk)
 {
-	struct mptcp_subflow_context *subflow;
-	bool ret = false;
-
-	if (likely(list_empty(&msk->join_list)))
+	if (((struct sock *)msk)->sk_state != TCP_ESTABLISHED)
 		return false;
 
-	spin_lock_bh(&msk->join_list_lock);
-	list_for_each_entry(subflow, &msk->join_list, node) {
-		u32 sseq = READ_ONCE(subflow->setsockopt_seq);
-
-		mptcp_propagate_sndbuf((struct sock *)msk, mptcp_subflow_tcp_sock(subflow));
-		if (READ_ONCE(msk->setsockopt_seq) != sseq)
-			ret = true;
-	}
-	list_splice_tail_init(&msk->join_list, &msk->conn_list);
-	spin_unlock_bh(&msk->join_list_lock);
-
-	return ret;
-}
-
-void __mptcp_flush_join_list(struct mptcp_sock *msk)
-{
-	if (likely(!mptcp_do_flush_join_list(msk)))
-		return;
-
-	if (!test_and_set_bit(MPTCP_WORK_SYNC_SETSOCKOPT, &msk->flags))
-		mptcp_schedule_work((struct sock *)msk);
+	mptcp_propagate_sndbuf((struct sock *)msk, ssk);
+	mptcp_sockopt_sync_locked(msk, ssk);
+	WRITE_ONCE(msk->allow_infinite_fallback, false);
+	return true;
 }
 
-static void mptcp_flush_join_list(struct mptcp_sock *msk)
+static void __mptcp_flush_join_list(struct sock *sk)
 {
-	bool sync_needed = test_and_clear_bit(MPTCP_WORK_SYNC_SETSOCKOPT, &msk->flags);
-
-	might_sleep();
+	struct mptcp_subflow_context *tmp, *subflow;
+	struct mptcp_sock *msk = mptcp_sk(sk);
 
-	if (!mptcp_do_flush_join_list(msk) && !sync_needed)
-		return;
+	list_for_each_entry_safe(subflow, tmp, &msk->join_list, node) {
+		struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
+		bool slow = lock_sock_fast(ssk);
 
-	mptcp_sockopt_sync_all(msk);
+		list_move_tail(&subflow->node, &msk->conn_list);
+		if (!__mptcp_finish_join(msk, ssk))
+			mptcp_subflow_reset(ssk);
+		unlock_sock_fast(ssk, slow);
+	}
 }
 
 static bool mptcp_timer_pending(struct sock *sk)
@@ -1568,7 +1552,6 @@ void __mptcp_push_pending(struct sock *sk, unsigned int flags)
 			int ret = 0;
 
 			prev_ssk = ssk;
-			__mptcp_flush_join_list(msk);
 			ssk = mptcp_subflow_get_send(msk);
 
 			/* First check. If the ssk has changed since
@@ -1973,7 +1956,6 @@ static bool __mptcp_move_skbs(struct mptcp_sock *msk)
 	unsigned int moved = 0;
 	bool ret, done;
 
-	mptcp_flush_join_list(msk);
 	do {
 		struct sock *ssk = mptcp_subflow_recv_lookup(msk);
 		bool slowpath;
@@ -2510,7 +2492,6 @@ static void mptcp_worker(struct work_struct *work)
 		goto unlock;
 
 	mptcp_check_data_fin_ack(sk);
-	mptcp_flush_join_list(msk);
 
 	mptcp_check_fastclose(msk);
 
@@ -2549,8 +2530,6 @@ static int __mptcp_init_sock(struct sock *sk)
 {
 	struct mptcp_sock *msk = mptcp_sk(sk);
 
-	spin_lock_init(&msk->join_list_lock);
-
 	INIT_LIST_HEAD(&msk->conn_list);
 	INIT_LIST_HEAD(&msk->join_list);
 	INIT_LIST_HEAD(&msk->rtx_queue);
@@ -2729,7 +2708,6 @@ static void __mptcp_check_send_data_fin(struct sock *sk)
 		}
 	}
 
-	mptcp_flush_join_list(msk);
 	mptcp_for_each_subflow(msk, subflow) {
 		struct sock *tcp_sk = mptcp_subflow_tcp_sock(subflow);
 
@@ -2762,12 +2740,7 @@ static void __mptcp_destroy_sock(struct sock *sk)
 
 	might_sleep();
 
-	/* be sure to always acquire the join list lock, to sync vs
-	 * mptcp_finish_join().
-	 */
-	spin_lock_bh(&msk->join_list_lock);
-	list_splice_tail_init(&msk->join_list, &msk->conn_list);
-	spin_unlock_bh(&msk->join_list_lock);
+	/* join list will be eventually flushed (with rst) at sock lock release time*/
 	list_splice_init(&msk->conn_list, &conn_list);
 
 	sk_stop_timer(sk, &msk->sk.icsk_retransmit_timer);
@@ -2870,8 +2843,6 @@ static int mptcp_disconnect(struct sock *sk, int flags)
 	struct mptcp_subflow_context *subflow;
 	struct mptcp_sock *msk = mptcp_sk(sk);
 
-	mptcp_do_flush_join_list(msk);
-
 	inet_sk_state_store(sk, TCP_CLOSE);
 
 	mptcp_for_each_subflow(msk, subflow) {
@@ -3102,6 +3073,8 @@ static void mptcp_release_cb(struct sock *sk)
 			flags |= BIT(MPTCP_PUSH_PENDING);
 		if (test_and_clear_bit(MPTCP_RETRANSMIT, &mptcp_sk(sk)->flags))
 			flags |= BIT(MPTCP_RETRANSMIT);
+		if (test_and_clear_bit(MPTCP_FLUSH_JOIN_LIST, &mptcp_sk(sk)->flags))
+			flags |= BIT(MPTCP_FLUSH_JOIN_LIST);
 		if (!flags)
 			break;
 
@@ -3114,6 +3087,8 @@ static void mptcp_release_cb(struct sock *sk)
 		 */
 
 		spin_unlock_bh(&sk->sk_lock.slock);
+		if (flags & BIT(MPTCP_FLUSH_JOIN_LIST))
+			__mptcp_flush_join_list(sk);
 		if (flags & BIT(MPTCP_PUSH_PENDING))
 			__mptcp_push_pending(sk, 0);
 		if (flags & BIT(MPTCP_RETRANSMIT))
@@ -3259,7 +3234,7 @@ bool mptcp_finish_join(struct sock *ssk)
 	struct mptcp_sock *msk = mptcp_sk(subflow->conn);
 	struct sock *parent = (void *)msk;
 	struct socket *parent_sock;
-	bool ret;
+	bool ret = true;
 
 	pr_debug("msk=%p, subflow=%p", msk, subflow);
 
@@ -3272,24 +3247,32 @@ bool mptcp_finish_join(struct sock *ssk)
 	if (!msk->pm.server_side)
 		goto out;
 
-	if (!mptcp_pm_allow_new_subflow(msk)) {
-		subflow->reset_reason = MPTCP_RST_EPROHIBIT;
-		return false;
-	}
+	if (!mptcp_pm_allow_new_subflow(msk))
+		goto err_prohibited;
 
-	/* active connections are already on conn_list, and we can't acquire
-	 * msk lock here.
-	 * use the join list lock as synchronization point and double-check
-	 * msk status to avoid racing with __mptcp_destroy_sock()
+	if (WARN_ON_ONCE(!list_empty(&subflow->node)))
+		goto err_prohibited;
+
+	/* active connections are already on conn_list.
+	 * If we can't acquire msk socket lock here, let the release callback
+	 * handle it
 	 */
-	spin_lock_bh(&msk->join_list_lock);
-	ret = inet_sk_state_load(parent) == TCP_ESTABLISHED;
-	if (ret && !WARN_ON_ONCE(!list_empty(&subflow->node))) {
-		list_add_tail(&subflow->node, &msk->join_list);
+	mptcp_data_lock(parent);
+	if (!sock_owned_by_user(parent)) {
+		ret = __mptcp_finish_join(msk, ssk);
+		if (ret) {
+			sock_hold(ssk);
+			list_add_tail(&subflow->node, &msk->conn_list);
+		}
+	} else {
 		sock_hold(ssk);
+		list_add_tail(&subflow->node, &msk->join_list);
+		set_bit(MPTCP_FLUSH_JOIN_LIST, &msk->flags);
 	}
-	spin_unlock_bh(&msk->join_list_lock);
+	mptcp_data_unlock(parent);
+
 	if (!ret) {
+err_prohibited:
 		subflow->reset_reason = MPTCP_RST_EPROHIBIT;
 		return false;
 	}
@@ -3300,8 +3283,9 @@ bool mptcp_finish_join(struct sock *ssk)
 	parent_sock = READ_ONCE(parent->sk_socket);
 	if (parent_sock && !ssk->sk_socket)
 		mptcp_sock_graft(ssk, parent_sock);
+
 	subflow->map_seq = READ_ONCE(msk->ack_seq);
-	WRITE_ONCE(msk->allow_infinite_fallback, false);
+
 out:
 	mptcp_event(MPTCP_EVENT_SUB_ESTABLISHED, msk, ssk, GFP_ATOMIC);
 	return true;
@@ -3566,7 +3550,6 @@ static int mptcp_stream_accept(struct socket *sock, struct socket *newsock,
 		/* set ssk->sk_socket of accept()ed flows to mptcp socket.
 		 * This is needed so NOSPACE flag can be set from tcp stack.
 		 */
-		mptcp_flush_join_list(msk);
 		mptcp_for_each_subflow(msk, subflow) {
 			struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
 
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 3e7e541a013f..4a140aec68b3 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -120,7 +120,7 @@
 #define MPTCP_CLEAN_UNA		7
 #define MPTCP_ERROR_REPORT	8
 #define MPTCP_RETRANSMIT	9
-#define MPTCP_WORK_SYNC_SETSOCKOPT 10
+#define MPTCP_FLUSH_JOIN_LIST	10
 #define MPTCP_CONNECTED		11
 
 static inline bool before64(__u64 seq1, __u64 seq2)
@@ -255,7 +255,6 @@ struct mptcp_sock {
 	u8		recvmsg_inq:1,
 			cork:1,
 			nodelay:1;
-	spinlock_t	join_list_lock;
 	struct work_struct work;
 	struct sk_buff  *ooo_last_skb;
 	struct rb_root  out_of_order_queue;
@@ -505,15 +504,6 @@ mptcp_subflow_get_mapped_dsn(const struct mptcp_subflow_context *subflow)
 	return subflow->map_seq + mptcp_subflow_get_map_offset(subflow);
 }
 
-static inline void mptcp_add_pending_subflow(struct mptcp_sock *msk,
-					     struct mptcp_subflow_context *subflow)
-{
-	sock_hold(mptcp_subflow_tcp_sock(subflow));
-	spin_lock_bh(&msk->join_list_lock);
-	list_add_tail(&subflow->node, &msk->join_list);
-	spin_unlock_bh(&msk->join_list_lock);
-}
-
 void mptcp_subflow_process_delegated(struct sock *ssk);
 
 static inline void mptcp_subflow_delegate(struct mptcp_subflow_context *subflow, int action)
@@ -678,7 +668,6 @@ void __mptcp_data_acked(struct sock *sk);
 void __mptcp_error_report(struct sock *sk);
 void mptcp_subflow_eof(struct sock *sk);
 bool mptcp_update_rcv_data_fin(struct mptcp_sock *msk, u64 data_fin_seq, bool use_64bit);
-void __mptcp_flush_join_list(struct mptcp_sock *msk);
 static inline bool mptcp_data_fin_enabled(const struct mptcp_sock *msk)
 {
 	return READ_ONCE(msk->snd_data_fin_enable) &&
@@ -838,7 +827,7 @@ unsigned int mptcp_pm_get_subflows_max(struct mptcp_sock *msk);
 unsigned int mptcp_pm_get_local_addr_max(struct mptcp_sock *msk);
 
 void mptcp_sockopt_sync(struct mptcp_sock *msk, struct sock *ssk);
-void mptcp_sockopt_sync_all(struct mptcp_sock *msk);
+void mptcp_sockopt_sync_locked(struct mptcp_sock *msk, struct sock *ssk);
 
 static inline struct mptcp_ext *mptcp_get_ext(const struct sk_buff *skb)
 {
diff --git a/net/mptcp/sockopt.c b/net/mptcp/sockopt.c
index 3c3db22fd36a..962cfe3c463e 100644
--- a/net/mptcp/sockopt.c
+++ b/net/mptcp/sockopt.c
@@ -1286,27 +1286,15 @@ void mptcp_sockopt_sync(struct mptcp_sock *msk, struct sock *ssk)
 	}
 }
 
-void mptcp_sockopt_sync_all(struct mptcp_sock *msk)
+void mptcp_sockopt_sync_locked(struct mptcp_sock *msk, struct sock *ssk)
 {
-	struct mptcp_subflow_context *subflow;
-	struct sock *sk = (struct sock *)msk;
-	u32 seq;
-
-	seq = sockopt_seq_reset(sk);
+	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
 
-	mptcp_for_each_subflow(msk, subflow) {
-		struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
-		u32 sseq = READ_ONCE(subflow->setsockopt_seq);
+	msk_owned_by_me(msk);
 
-		if (sseq != msk->setsockopt_seq) {
-			__mptcp_sockopt_sync(msk, ssk);
-			WRITE_ONCE(subflow->setsockopt_seq, seq);
-		} else if (sseq != seq) {
-			WRITE_ONCE(subflow->setsockopt_seq, seq);
-		}
+	if (READ_ONCE(subflow->setsockopt_seq) != msk->setsockopt_seq) {
+		sync_socket_options(msk, ssk);
 
-		cond_resched();
+		subflow->setsockopt_seq = msk->setsockopt_seq;
 	}
-
-	msk->setsockopt_seq = seq;
 }
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 76556743e952..536a322a6fd0 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -1448,7 +1448,8 @@ int __mptcp_subflow_connect(struct sock *sk, const struct mptcp_addr_info *loc,
 	subflow->start_stamp = tcp_jiffies32;
 	mptcp_info2sockaddr(remote, &addr, ssk->sk_family);
 
-	mptcp_add_pending_subflow(msk, subflow);
+	sock_hold(ssk);
+	list_add_tail(&subflow->node, &msk->conn_list);
 	err = kernel_connect(sf, (struct sockaddr *)&addr, addrlen, O_NONBLOCK);
 	if (err && err != -EINPROGRESS)
 		goto failed_unlink;
@@ -1460,9 +1461,7 @@ int __mptcp_subflow_connect(struct sock *sk, const struct mptcp_addr_info *loc,
 	return err;
 
 failed_unlink:
-	spin_lock_bh(&msk->join_list_lock);
 	list_del(&subflow->node);
-	spin_unlock_bh(&msk->join_list_lock);
 	mptcp_pm_subflow_check_next(msk, ssk, subflow);
 	sock_put(mptcp_subflow_tcp_sock(subflow));
 
-- 
2.33.1
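
The locking scheme introduced by the patch above is easier to see in isolation:
mptcp_finish_join() now completes the join on the spot when nobody owns the msk
socket lock, and otherwise parks the subflow on join_list, sets
MPTCP_FLUSH_JOIN_LIST and leaves the flush to mptcp_release_cb(). The
self-contained userspace sketch below models that pattern with a pthread mutex
and a plain flag; the names (finish_join_now(), flush_join_list_flag, ...) and
the single-lock model are simplifications made up for this illustration and are
not the kernel API.

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

struct item {
	struct item *next;
	int id;
};

static pthread_mutex_t data_lock = PTHREAD_MUTEX_INITIALIZER;
static bool owned_by_user;		/* stands in for sock_owned_by_user() */
static bool flush_join_list_flag;	/* stands in for MPTCP_FLUSH_JOIN_LIST */
static struct item *join_list;		/* stands in for msk->join_list */

static void finish_join_now(struct item *it)
{
	printf("join %d finished immediately\n", it->id);
}

/* rough analogue of mptcp_finish_join(): only the data lock is taken */
static void finish_join(struct item *it)
{
	pthread_mutex_lock(&data_lock);
	if (!owned_by_user) {
		finish_join_now(it);
	} else {
		it->next = join_list;		/* park it on the join list */
		join_list = it;
		flush_join_list_flag = true;	/* ask the lock owner to flush */
	}
	pthread_mutex_unlock(&data_lock);
}

/* rough analogue of mptcp_release_cb(): run by the socket lock owner */
static void release_cb(void)
{
	pthread_mutex_lock(&data_lock);
	if (flush_join_list_flag) {
		flush_join_list_flag = false;
		while (join_list) {
			struct item *it = join_list;

			join_list = it->next;
			printf("join %d finished by release_cb\n", it->id);
		}
	}
	pthread_mutex_unlock(&data_lock);
}

int main(void)
{
	struct item a = { .id = 1 }, b = { .id = 2 };

	finish_join(&a);	/* lock not owned: handled at once */
	owned_by_user = true;
	finish_join(&b);	/* owner busy: deferred */
	owned_by_user = false;
	release_cb();
	return 0;
}

Built with cc -pthread, the demo prints one join handled immediately and one
handled by the release callback, mirroring the two branches of
mptcp_finish_join().
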
From: Paolo Abeni
To: mptcp@lists.linux.dev
Subject: [PATCH mptcp-next 2/2] mptcp: avoid atomic bit manipulation when possible
Date: Mon, 6 Dec 2021 18:34:47 +0100

Currently the msk->flags bitmask carries both state for the
mptcp_release_cb() - mostly touched under the mptcp data lock - and
other state info touched even outside such lock scope.

As a consequence, msk->flags is always manipulated with atomic
operations.

This change splits such bitmask into two separate fields, so that we
can use plain bit operations when touching the cb-related info.

The MPTCP_PUSH_PENDING bit needs additional care, as it is the only
CB-related field currently accessed either under the mptcp data lock
or the mptcp socket lock. Let's add another mask just for such bit's
sake.

Signed-off-by: Paolo Abeni
Reviewed-by: Mat Martineau
---
 net/mptcp/protocol.c | 46 +++++++++++++++++++++++---------------------
 net/mptcp/protocol.h | 18 ++++++++++-------
 net/mptcp/subflow.c  |  4 ++--
 3 files changed, 37 insertions(+), 31 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 38651c80b321..42c382d0eb01 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -763,7 +763,7 @@ static bool move_skbs_to_msk(struct mptcp_sock *msk, struct sock *ssk)
 		if (!sock_owned_by_user(sk))
 			__mptcp_error_report(sk);
 		else
-			set_bit(MPTCP_ERROR_REPORT, &msk->flags);
+			__set_bit(MPTCP_ERROR_REPORT, &msk->cb_flags);
 	}
 
 	/* If the moves have caught up with the DATA_FIN sequence number
@@ -1529,9 +1529,8 @@ static void mptcp_update_post_push(struct mptcp_sock *msk,
 
 void mptcp_check_and_set_pending(struct sock *sk)
 {
-	if (mptcp_send_head(sk) &&
-	    !test_bit(MPTCP_PUSH_PENDING, &mptcp_sk(sk)->flags))
-		set_bit(MPTCP_PUSH_PENDING, &mptcp_sk(sk)->flags);
+	if (mptcp_send_head(sk))
+		mptcp_sk(sk)->push_pending |= MPTCP_PUSH_PENDING;
 }
 
 void __mptcp_push_pending(struct sock *sk, unsigned int flags)
@@ -2146,7 +2145,7 @@ static void mptcp_retransmit_timer(struct timer_list *t)
 			mptcp_schedule_work(sk);
 	} else {
 		/* delegate our work to tcp_release_cb() */
-		set_bit(MPTCP_RETRANSMIT, &msk->flags);
+		__set_bit(MPTCP_RETRANSMIT, &msk->cb_flags);
 	}
 	bh_unlock_sock(sk);
 	sock_put(sk);
@@ -2859,7 +2858,9 @@ static int mptcp_disconnect(struct sock *sk, int flags)
 
 	mptcp_destroy_common(msk);
 	msk->last_snd = NULL;
-	msk->flags = 0;
+	WRITE_ONCE(msk->flags, 0);
+	msk->cb_flags = 0;
+	msk->push_pending = 0;
 	msk->recovery = false;
 	msk->can_ack = false;
 	msk->fully_established = false;
@@ -3040,7 +3041,7 @@ void __mptcp_data_acked(struct sock *sk)
 	if (!sock_owned_by_user(sk))
 		__mptcp_clean_una(sk);
 	else
-		set_bit(MPTCP_CLEAN_UNA, &mptcp_sk(sk)->flags);
+		__set_bit(MPTCP_CLEAN_UNA, &mptcp_sk(sk)->cb_flags);
 
 	if (mptcp_pending_data_fin_ack(sk))
 		mptcp_schedule_work(sk);
@@ -3059,22 +3060,22 @@ void __mptcp_check_push(struct sock *sk, struct sock *ssk)
 		else if (xmit_ssk)
 			mptcp_subflow_delegate(mptcp_subflow_ctx(xmit_ssk), MPTCP_DELEGATE_SEND);
 	} else {
-		set_bit(MPTCP_PUSH_PENDING, &mptcp_sk(sk)->flags);
+		__set_bit(MPTCP_PUSH_PENDING, &mptcp_sk(sk)->cb_flags);
 	}
 }
 
+#define MPTCP_FLAGS_PROCESS_CTX_NEED (MPTCP_PUSH_PENDING | \
+				      MPTCP_RETRANSMIT | \
+				      MPTCP_FLUSH_JOIN_LIST)
+
 /* processes deferred events and flush wmem */
 static void mptcp_release_cb(struct sock *sk)
 {
+	struct mptcp_sock *msk = mptcp_sk(sk);
+
 	for (;;) {
-		unsigned long flags = 0;
-
-		if (test_and_clear_bit(MPTCP_PUSH_PENDING, &mptcp_sk(sk)->flags))
-			flags |= BIT(MPTCP_PUSH_PENDING);
-		if (test_and_clear_bit(MPTCP_RETRANSMIT, &mptcp_sk(sk)->flags))
-			flags |= BIT(MPTCP_RETRANSMIT);
-		if (test_and_clear_bit(MPTCP_FLUSH_JOIN_LIST, &mptcp_sk(sk)->flags))
-			flags |= BIT(MPTCP_FLUSH_JOIN_LIST);
+		unsigned long flags = (msk->cb_flags & MPTCP_FLAGS_PROCESS_CTX_NEED) |
+				      msk->push_pending;
 		if (!flags)
 			break;
 
@@ -3085,7 +3086,8 @@ static void mptcp_release_cb(struct sock *sk)
 		 * datapath acquires the msk socket spinlock while helding
 		 * the subflow socket lock
 		 */
-
+		msk->push_pending = 0;
+		msk->cb_flags &= ~flags;
 		spin_unlock_bh(&sk->sk_lock.slock);
 		if (flags & BIT(MPTCP_FLUSH_JOIN_LIST))
 			__mptcp_flush_join_list(sk);
@@ -3101,11 +3103,11 @@ static void mptcp_release_cb(struct sock *sk)
 	/* be sure to set the current sk state before tacking actions
 	 * depending on sk_state
 	 */
-	if (test_and_clear_bit(MPTCP_CONNECTED, &mptcp_sk(sk)->flags))
+	if (__test_and_clear_bit(MPTCP_CONNECTED, &msk->cb_flags))
 		__mptcp_set_connected(sk);
-	if (test_and_clear_bit(MPTCP_CLEAN_UNA, &mptcp_sk(sk)->flags))
+	if (__test_and_clear_bit(MPTCP_CLEAN_UNA, &msk->cb_flags))
 		__mptcp_clean_una_wakeup(sk);
-	if (test_and_clear_bit(MPTCP_ERROR_REPORT, &mptcp_sk(sk)->flags))
+	if (__test_and_clear_bit(MPTCP_ERROR_REPORT, &msk->cb_flags))
 		__mptcp_error_report(sk);
 
 	__mptcp_update_rmem(sk);
@@ -3147,7 +3149,7 @@ void mptcp_subflow_process_delegated(struct sock *ssk)
 		if (!sock_owned_by_user(sk))
 			__mptcp_subflow_push_pending(sk, ssk);
 		else
-			set_bit(MPTCP_PUSH_PENDING, &mptcp_sk(sk)->flags);
+			__set_bit(MPTCP_PUSH_PENDING, &mptcp_sk(sk)->cb_flags);
 		mptcp_data_unlock(sk);
 		mptcp_subflow_delegated_done(subflow, MPTCP_DELEGATE_SEND);
 	}
@@ -3267,7 +3269,7 @@ bool mptcp_finish_join(struct sock *ssk)
 	} else {
 		sock_hold(ssk);
 		list_add_tail(&subflow->node, &msk->join_list);
-		set_bit(MPTCP_FLUSH_JOIN_LIST, &msk->flags);
+		__set_bit(MPTCP_FLUSH_JOIN_LIST, &msk->cb_flags);
 	}
 	mptcp_data_unlock(parent);
 
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index 4a140aec68b3..fca045f1936f 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -110,18 +110,20 @@
 /* MPTCP TCPRST flags */
 #define MPTCP_RST_TRANSIENT BIT(0)
 
-/* MPTCP socket flags */
+/* MPTCP socket atomic flags */
 #define MPTCP_NOSPACE		1
 #define MPTCP_WORK_RTX		2
 #define MPTCP_WORK_EOF		3
 #define MPTCP_FALLBACK_DONE	4
 #define MPTCP_WORK_CLOSE_SUBFLOW 5
-#define MPTCP_PUSH_PENDING	6
-#define MPTCP_CLEAN_UNA		7
-#define MPTCP_ERROR_REPORT	8
-#define MPTCP_RETRANSMIT	9
-#define MPTCP_FLUSH_JOIN_LIST	10
-#define MPTCP_CONNECTED		11
+
+/* MPTCP socket release cb flags */
+#define MPTCP_PUSH_PENDING	1
+#define MPTCP_CLEAN_UNA		2
+#define MPTCP_ERROR_REPORT	3
+#define MPTCP_RETRANSMIT	4
+#define MPTCP_FLUSH_JOIN_LIST	5
+#define MPTCP_CONNECTED		6
 
 static inline bool before64(__u64 seq1, __u64 seq2)
 {
@@ -243,6 +245,8 @@ struct mptcp_sock {
 	u32		token;
 	int		rmem_released;
 	unsigned long	flags;
+	unsigned long	cb_flags;
+	unsigned long	push_pending;
 	bool		recovery;		/* closing subflow write queue reinjected */
 	bool		can_ack;
 	bool		fully_established;
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 536a322a6fd0..09225b57c7f6 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -388,7 +388,7 @@ static void mptcp_set_connected(struct sock *sk)
 	if (!sock_owned_by_user(sk))
 		__mptcp_set_connected(sk);
 	else
-		set_bit(MPTCP_CONNECTED, &mptcp_sk(sk)->flags);
+		__set_bit(MPTCP_CONNECTED, &mptcp_sk(sk)->cb_flags);
 	mptcp_data_unlock(sk);
 }
 
@@ -1280,7 +1280,7 @@ static void subflow_error_report(struct sock *ssk)
 	if (!sock_owned_by_user(sk))
 		__mptcp_error_report(sk);
 	else
-		set_bit(MPTCP_ERROR_REPORT, &mptcp_sk(sk)->flags);
+		__set_bit(MPTCP_ERROR_REPORT, &mptcp_sk(sk)->cb_flags);
 	mptcp_data_unlock(sk);
 }
 
-- 
2.33.1
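
The gain from the second patch can also be modelled outside the kernel: one
word of flags that may be touched with no lock held and therefore needs atomic
read-modify-write operations, and a second word that is only ever touched under
the data lock and so can use plain loads and stores, which is what
__set_bit()/__test_and_clear_bit() amount to in the patch. The sketch below is
an illustrative userspace analogy only: it uses C11 atomics and bit masks
instead of the kernel's bit helpers and bit numbers, and the MSK_* names,
struct layout and helpers are invented for the example.

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define MSK_NOSPACE       (1UL << 1)	/* may be set with no lock held */
#define MSK_PUSH_PENDING  (1UL << 1)	/* only touched under data_lock; may
					 * reuse the bit value because it lives
					 * in a separate word */

struct msk {
	atomic_ulong	flags;		/* needs atomic RMW operations */
	unsigned long	cb_flags;	/* plain ops, data_lock protects it */
	pthread_mutex_t	data_lock;
};

static void set_nospace(struct msk *m)
{
	/* no lock held here, so an atomic OR is required */
	atomic_fetch_or(&m->flags, MSK_NOSPACE);
}

static void set_push_pending(struct msk *m)
{
	/* caller holds data_lock: a plain OR is enough, like __set_bit() */
	m->cb_flags |= MSK_PUSH_PENDING;
}

static void release_cb(struct msk *m)
{
	unsigned long todo;

	pthread_mutex_lock(&m->data_lock);
	todo = m->cb_flags;		/* snapshot and clear under the lock */
	m->cb_flags = 0;
	pthread_mutex_unlock(&m->data_lock);

	if (todo & MSK_PUSH_PENDING)
		printf("deferred push processed\n");
}

int main(void)
{
	struct msk m = { .data_lock = PTHREAD_MUTEX_INITIALIZER };

	set_nospace(&m);		/* atomic path, lockless caller */
	pthread_mutex_lock(&m.data_lock);
	set_push_pending(&m);		/* non-atomic path, lock held */
	pthread_mutex_unlock(&m.data_lock);
	release_cb(&m);
	return 0;
}

Because the two words have different locking rules, their bit values may even
collide (as MSK_NOSPACE and MSK_PUSH_PENDING do here) without any conflict,
which is why the patch can renumber the release-cb flags independently of the
remaining atomic ones.
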