From nobody Sun May 12 05:40:31 2024
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 13EFA13ACE;
	Wed, 4 Oct 2023 20:38:22 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="SErEAxL2"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5D719C433C9;
	Wed, 4 Oct 2023 20:38:22 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1696451902;
	bh=7b3RC+RCCWMFJYgE0lgix2Ut1RLI3dKTkHCq236cmvw=;
	h=From:Date:Subject:References:In-Reply-To:To:Cc:From;
	b=SErEAxL2/UStBJH3sdgd0ViHfFAN2t4MskNLkOIJlNNBc/H7LHeRM10n+ZHL26+xl
	 kCVwUXlJWUhcHftxrVCMI88P0DKHqVfTFl3vJ3ttSUMN59fE+8GTUIXB8eOZICRwPJ
	 oUmbszHgcaYkJF7ZvPbMQ/vXaXXNb5T3LNGlLoa+O9doEFR40QAvWGdyMf/29dxXaU
	 jsASdxsfNB2SUZMwzJ5v24jTAiOdh0ponLIra3BrRxc8JefAVyuD57y0BifUI8s8BI
	 y0UOkykPO3+wbBVNsIa6UM16K/C+jRiJw7BOJ/rCw+xVkwQxd3KwpFvSy16Vlse/T2
	 ywzYQHEIo++yA==
From: Mat Martineau
Date: Wed, 04 Oct 2023 13:38:11 -0700
Subject: [PATCH net 1/3] mptcp: fix delegated action races
Precedence: bulk
X-Mailing-List: mptcp@lists.linux.dev
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Message-Id: <20231004-send-net-20231004-v1-1-28de4ac663ae@kernel.org>
References: <20231004-send-net-20231004-v1-0-28de4ac663ae@kernel.org>
In-Reply-To: <20231004-send-net-20231004-v1-0-28de4ac663ae@kernel.org>
To: Matthieu Baerts , "David S.
	Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Matthieu Baerts
Cc: netdev@vger.kernel.org, mptcp@lists.linux.dev, Kishen Maloor , Florian Westphal , Mat Martineau , stable@vger.kernel.org
X-Mailer: b4 0.12.3

From: Paolo Abeni

The delegated action infrastructure is prone to the following race:
different CPUs can try to schedule different delegated actions on the
same subflow at the same time.

Each of them will check different bits via mptcp_subflow_delegate(),
and will try to schedule the action on the related per-cpu napi
instance.

Depending on the timing, both can observe an empty delegated list
node, causing the same entry to be added simultaneously to two
different lists.

The root cause is that the delegated actions infrastructure does not
provide a single synchronization point. Address the issue by reserving
an additional bit to mark the subflow as scheduled for delegation.
Acquiring that bit guarantees that the caller owns the delegated list
node and can safely schedule the subflow.

Clear the bit only when the subflow scheduling is completed, with the
proper barrier in place.

Additionally, swap the meaning of the delegated_action bitmask, to
allow the usage of the existing helper to set multiple bits at once.

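The ownership rule described above can be sketched in userspace with C11
atomics. This is only an illustration under stated assumptions, not the
kernel code: `struct subflow_sketch` and the `sketch_*()` helpers are
hypothetical names, `atomic_fetch_or()` stands in for the kernel's
`set_mask_bits()`, and the per-cpu list handling is left out. Only the bit
layout (one SCHEDULED bit plus the action bits) follows the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Bit layout mirrors the patch: bit 0 marks ownership of the list node,
 * the remaining bits are the pending actions.
 */
enum { DELEGATE_SCHEDULED = 0, DELEGATE_SEND = 1, DELEGATE_ACK = 2 };

#define BIT(n)		(1UL << (n))
#define ACTIONS_MASK	(~BIT(DELEGATE_SCHEDULED))

/* Hypothetical stand-in for the subflow's delegated_status word. */
struct subflow_sketch {
	atomic_ulong status;
};

/* Set SCHEDULED and the action bit in one atomic step. Only the caller
 * that observes SCHEDULED clear in the old value owns the list node and
 * may enqueue the subflow.
 */
static bool sketch_delegate(struct subflow_sketch *sf, int action)
{
	unsigned long old;

	old = atomic_fetch_or(&sf->status, BIT(DELEGATE_SCHEDULED) | BIT(action));
	return !(old & BIT(DELEGATE_SCHEDULED));
}

/* release_sock()-like path: take the pending actions but leave SCHEDULED
 * set, since the node is still queued on the scheduling CPU.
 */
static unsigned long sketch_take_actions(struct subflow_sketch *sf)
{
	return atomic_fetch_and(&sf->status, ~ACTIONS_MASK) & ACTIONS_MASK;
}

/* Poll-like path, run after the node has been dequeued: take everything,
 * including SCHEDULED, so the next delegate call can own the node again.
 */
static unsigned long sketch_poll(struct subflow_sketch *sf)
{
	return atomic_exchange(&sf->status, 0) & ACTIONS_MASK;
}
```

With this layout, two racing callers requesting different actions both record
their action bit, but only one of them sees SCHEDULED clear and enqueues the
node.
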
Fixes: bcd97734318d ("mptcp: use delegate action to schedule 3rd ack retrans")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau
Signed-off-by: Paolo Abeni
Signed-off-by: Mat Martineau
---
 net/mptcp/protocol.c | 28 ++++++++++++++--------------
 net/mptcp/protocol.h | 35 ++++++++++++-----------------------
 net/mptcp/subflow.c  | 10 ++++++++--
 3 files changed, 34 insertions(+), 39 deletions(-)

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index e252539b1e19..c3b83cb390d9 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -3425,24 +3425,21 @@ static void schedule_3rdack_retransmission(struct sock *ssk)
 	sk_reset_timer(ssk, &icsk->icsk_delack_timer, timeout);
 }
 
-void mptcp_subflow_process_delegated(struct sock *ssk)
+void mptcp_subflow_process_delegated(struct sock *ssk, long status)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
 	struct sock *sk = subflow->conn;
 
-	if (test_bit(MPTCP_DELEGATE_SEND, &subflow->delegated_status)) {
+	if (status & BIT(MPTCP_DELEGATE_SEND)) {
 		mptcp_data_lock(sk);
 		if (!sock_owned_by_user(sk))
 			__mptcp_subflow_push_pending(sk, ssk, true);
 		else
 			__set_bit(MPTCP_PUSH_PENDING, &mptcp_sk(sk)->cb_flags);
 		mptcp_data_unlock(sk);
-		mptcp_subflow_delegated_done(subflow, MPTCP_DELEGATE_SEND);
 	}
-	if (test_bit(MPTCP_DELEGATE_ACK, &subflow->delegated_status)) {
+	if (status & BIT(MPTCP_DELEGATE_ACK))
 		schedule_3rdack_retransmission(ssk);
-		mptcp_subflow_delegated_done(subflow, MPTCP_DELEGATE_ACK);
-	}
 }
 
 static int mptcp_hash(struct sock *sk)
@@ -3968,14 +3965,17 @@ static int mptcp_napi_poll(struct napi_struct *napi, int budget)
 		struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
 
 		bh_lock_sock_nested(ssk);
-		if (!sock_owned_by_user(ssk) &&
-		    mptcp_subflow_has_delegated_action(subflow))
-			mptcp_subflow_process_delegated(ssk);
-		/* ... elsewhere tcp_release_cb_override already processed
-		 * the action or will do at next release_sock().
-		 * In both case must dequeue the subflow here - on the same
-		 * CPU that scheduled it.
-		 */
+		if (!sock_owned_by_user(ssk)) {
+			mptcp_subflow_process_delegated(ssk, xchg(&subflow->delegated_status, 0));
+		} else {
+			/* tcp_release_cb_override already processed
+			 * the action or will do at next release_sock().
+			 * In both case must dequeue the subflow here - on the same
+			 * CPU that scheduled it.
+			 */
+			smp_wmb();
+			clear_bit(MPTCP_DELEGATE_SCHEDULED, &subflow->delegated_status);
+		}
 		bh_unlock_sock(ssk);
 		sock_put(ssk);
 
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h
index ed61d6850cce..3612545fa62e 100644
--- a/net/mptcp/protocol.h
+++ b/net/mptcp/protocol.h
@@ -444,9 +444,11 @@ struct mptcp_delegated_action {
 
 DECLARE_PER_CPU(struct mptcp_delegated_action, mptcp_delegated_actions);
 
-#define MPTCP_DELEGATE_SEND		0
-#define MPTCP_DELEGATE_ACK		1
+#define MPTCP_DELEGATE_SCHEDULED	0
+#define MPTCP_DELEGATE_SEND		1
+#define MPTCP_DELEGATE_ACK		2
 
+#define MPTCP_DELEGATE_ACTIONS_MASK	(~BIT(MPTCP_DELEGATE_SCHEDULED))
 /* MPTCP subflow context */
 struct mptcp_subflow_context {
 	struct list_head node;/* conn_list of subflows */
@@ -564,23 +566,24 @@ mptcp_subflow_get_mapped_dsn(const struct mptcp_subflow_context *subflow)
 	return subflow->map_seq + mptcp_subflow_get_map_offset(subflow);
 }
 
-void mptcp_subflow_process_delegated(struct sock *ssk);
+void mptcp_subflow_process_delegated(struct sock *ssk, long actions);
 
 static inline void mptcp_subflow_delegate(struct mptcp_subflow_context *subflow, int action)
 {
+	long old, set_bits = BIT(MPTCP_DELEGATE_SCHEDULED) | BIT(action);
 	struct mptcp_delegated_action *delegated;
 	bool schedule;
 
 	/* the caller held the subflow bh socket lock */
 	lockdep_assert_in_softirq();
 
-	/* The implied barrier pairs with mptcp_subflow_delegated_done(), and
-	 * ensures the below list check sees list updates done prior to status
-	 * bit changes
+	/* The implied barrier pairs with tcp_release_cb_override()
+	 * mptcp_napi_poll(), and ensures the below list check sees list
+	 * updates done prior to delegated status bits changes
 	 */
-	if (!test_and_set_bit(action, &subflow->delegated_status)) {
-		/* still on delegated list from previous scheduling */
-		if (!list_empty(&subflow->delegated_node))
+	old = set_mask_bits(&subflow->delegated_status, 0, set_bits);
+	if (!(old & BIT(MPTCP_DELEGATE_SCHEDULED))) {
+		if (WARN_ON_ONCE(!list_empty(&subflow->delegated_node)))
 			return;
 
 		delegated = this_cpu_ptr(&mptcp_delegated_actions);
@@ -605,20 +608,6 @@ mptcp_subflow_delegated_next(struct mptcp_delegated_action *delegated)
 	return ret;
 }
 
-static inline bool mptcp_subflow_has_delegated_action(const struct mptcp_subflow_context *subflow)
-{
-	return !!READ_ONCE(subflow->delegated_status);
-}
-
-static inline void mptcp_subflow_delegated_done(struct mptcp_subflow_context *subflow, int action)
-{
-	/* pairs with mptcp_subflow_delegate, ensures delegate_node is updated before
-	 * touching the status bit
-	 */
-	smp_wmb();
-	clear_bit(action, &subflow->delegated_status);
-}
-
 int mptcp_is_enabled(const struct net *net);
 unsigned int mptcp_get_add_addr_timeout(const struct net *net);
 int mptcp_is_checksum_enabled(const struct net *net);
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 918c1a235790..9c1f8d1d63d2 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -1956,9 +1956,15 @@ static void subflow_ulp_clone(const struct request_sock *req,
 static void tcp_release_cb_override(struct sock *ssk)
 {
 	struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
+	long status;
 
-	if (mptcp_subflow_has_delegated_action(subflow))
-		mptcp_subflow_process_delegated(ssk);
+	/* process and clear all the pending actions, but leave the subflow into
+	 * the napi queue. To respect locking, only the same CPU that originated
+	 * the action can touch the list. mptcp_napi_poll will take care of it.
+	 */
+	status = set_mask_bits(&subflow->delegated_status, MPTCP_DELEGATE_ACTIONS_MASK, 0);
+	if (status)
+		mptcp_subflow_process_delegated(ssk, status);
 
 	tcp_release_cb(ssk);
 }
-- 
2.41.0

From nobody Sun May 12 05:40:31 2024
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5240B200AB;
	Wed, 4 Oct 2023 20:38:23 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="KaxzsuLv"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9C8FAC433CD;
	Wed, 4 Oct 2023 20:38:22 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1696451902;
	bh=7IsvQZm94ACVEELdVb0zhCa5ptfJqE0xG7ACha379Qg=;
	h=From:Date:Subject:References:In-Reply-To:To:Cc:From;
	b=KaxzsuLvLrkVf8rTUJ7meAERSenH/kiAEf1fX7UKrSoSn6PhEo6+1jBjFqSij1FbP
	 wOFLvsSDiU0kPqCYrUi7a3vlLa1hfGfSHFRLrRE9TAW1Irn2L6xSdUHuZ70R+Sbndx
	 Y8FIrOFv2/p9KCaIi7cWWSM3Og5GWub9k5jjtA07ZbwsjOrFAKtM+vZSMwBdbRhQdQ
	 +3dPNUYsMcMDZlrk6RgTVy7uUqDzL5tIPKn1s9208qtTcnXo4DHHWVyn+E0s6LWRLg
	 USR6m3tX7fEZ7p+RFkQD+ZZy4d0Uk+VnGR0Xc4uqH1bEyxTwM9wtmxu6docAPuAvN1
	 KMdFqSUsVEbdg==
From: Mat Martineau
Date: Wed, 04 Oct 2023 13:38:12 -0700
Subject: [PATCH net 2/3] mptcp: userspace pm allow creating id 0 subflow
Precedence: bulk
X-Mailing-List: mptcp@lists.linux.dev
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Message-Id: <20231004-send-net-20231004-v1-2-28de4ac663ae@kernel.org>
References: <20231004-send-net-20231004-v1-0-28de4ac663ae@kernel.org>
In-Reply-To: <20231004-send-net-20231004-v1-0-28de4ac663ae@kernel.org>
To: Matthieu Baerts , "David S.
	Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Matthieu Baerts
Cc: netdev@vger.kernel.org, mptcp@lists.linux.dev, Kishen Maloor , Florian Westphal , Mat Martineau , Geliang Tang , stable@vger.kernel.org
X-Mailer: b4 0.12.3

From: Geliang Tang

This patch drops the ID 0 limitation in mptcp_nl_cmd_sf_create() to
allow creating additional subflows with the local addr ID 0.

There is no reason not to allow additional subflows from this local
address: we should be able to create new subflows from the initial
endpoint. This limitation was breaking fullmesh support from userspace.

Fixes: 702c2f646d42 ("mptcp: netlink: allow userspace-driven subflow establishment")
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/391
Cc: stable@vger.kernel.org
Suggested-by: Matthieu Baerts
Reviewed-by: Matthieu Baerts
Signed-off-by: Geliang Tang
Signed-off-by: Mat Martineau
---
 net/mptcp/pm_userspace.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/net/mptcp/pm_userspace.c b/net/mptcp/pm_userspace.c
index b5a8aa4c1ebd..d042d32beb4d 100644
--- a/net/mptcp/pm_userspace.c
+++ b/net/mptcp/pm_userspace.c
@@ -307,12 +307,6 @@ int mptcp_nl_cmd_sf_create(struct sk_buff *skb, struct genl_info *info)
 		goto create_err;
 	}
 
-	if (addr_l.id == 0) {
-		NL_SET_ERR_MSG_ATTR(info->extack, laddr, "missing local addr id");
-		err = -EINVAL;
-		goto create_err;
-	}
-
 	err = mptcp_pm_parse_addr(raddr, info, &addr_r);
 	if (err < 0) {
 		NL_SET_ERR_MSG_ATTR(info->extack, raddr, "error parsing remote addr");
-- 
2.41.0

From nobody Sun May 12 05:40:31 2024
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id AC833224F5;
	Wed, 4 Oct 2023 20:38:23 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="OQ/FBM+v"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id E21E1C43395;
	Wed, 4 Oct 2023 20:38:22 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1696451903;
	bh=tWSdTi8obnFsvIC2v1iHQX7NjJHX3oSF/bftWpn0C9I=;
	h=From:Date:Subject:References:In-Reply-To:To:Cc:From;
	b=OQ/FBM+vwdNEdAFLL59EakZsn7FKRfBEXOAy8NDHGXhx0fM1JXq7NGOmnHveavrRy
	 KK8xyD51c/UFymE6ON8DDtYfKLyXpXvcaag9nQ9ytV4sESUbd8DeToRk9RZYsSdSoq
	 MxQFMECDcgNF+hbRNowB1/WJP+wIL3EizgGhfs6ycyN7HOgXKGKi/G9NXLZp3Va9lp
	 re1PKrsFVTJyjx6qQxBim1W502Znqk3VMR1jieVEam2zO32vPvL5b8y7V1LogOvSEi
	 uxNa5IdqktxSp/8LLy0DIqxT6zxXktb2GCQuxFR07mlacl8edVyira/SOHyHciPBYW
	 +L/XCjmdlZ5rw==
From: Mat Martineau
Date: Wed, 04 Oct 2023 13:38:13 -0700
Subject: [PATCH net 3/3] MAINTAINERS: update Matthieu's email address
Precedence: bulk
X-Mailing-List: mptcp@lists.linux.dev
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Message-Id: <20231004-send-net-20231004-v1-3-28de4ac663ae@kernel.org>
References: <20231004-send-net-20231004-v1-0-28de4ac663ae@kernel.org>
In-Reply-To: <20231004-send-net-20231004-v1-0-28de4ac663ae@kernel.org>
To: Matthieu Baerts , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Matthieu Baerts
Cc: netdev@vger.kernel.org, mptcp@lists.linux.dev, Kishen Maloor , Florian Westphal , Mat Martineau
X-Mailer: b4 0.12.3

From: Matthieu Baerts

Use my kernel.org account instead. The other one will bounce by the end
of the year.

Signed-off-by: Matthieu Baerts
Signed-off-by: Mat Martineau
---
 .mailmap    | 1 +
 MAINTAINERS | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/.mailmap b/.mailmap
index a0a6efe87186..c80903efec75 100644
--- a/.mailmap
+++ b/.mailmap
@@ -377,6 +377,7 @@ Matthew Wilcox
 Matthew Wilcox
 Matthew Wilcox
 Matthias Fuchs
+Matthieu Baerts
 Matthieu CASTET
 Matti Vaittinen
 Matt Ranostay
diff --git a/MAINTAINERS b/MAINTAINERS
index 9275708c9b96..0bb5451e9b86 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14942,7 +14942,7 @@ K:	macsec
 K:	\bmdo_
 
 NETWORKING [MPTCP]
-M:	Matthieu Baerts
+M:	Matthieu Baerts
 M:	Mat Martineau
 L:	netdev@vger.kernel.org
 L:	mptcp@lists.linux.dev
-- 
2.41.0