From nobody Sat Jun 3 17:01:56 2023 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1507D1FAF for ; Tue, 17 Jan 2023 07:37:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1673941035; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=XcPA4sXVHI6jTcEXVxRdebV8orFAjwvdL7prR34Xm/0=; b=iR9wFx/4zOsKI9bDbvgQZWaUm4UkiJFFeKUfoEy2DOOEq4R/THuiaoaSd35AKAfJIuDu9e VFRNcvTKSdZxX3fW7UcZ3fBs21KJlJC7eh1Qy2bytSQNSZ+h9cxTMkhVSJEJAM5b5+3QtV Mqfhcnicg+78FRid8K4RBcXT6QlFJNg= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-6-sR345FC0Mf-QlRfH1CE85g-1; Tue, 17 Jan 2023 02:37:13 -0500 X-MC-Unique: sR345FC0Mf-QlRfH1CE85g-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 4067D85C6E0 for ; Tue, 17 Jan 2023 07:37:13 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.39.192.87]) by smtp.corp.redhat.com (Postfix) with ESMTP id C35E040C2064 for ; Tue, 17 Jan 2023 07:37:12 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [PATCH mptcp-next v2 01/13] mptcp: fix locking for setsockopt corner-case Date: Tue, 17 Jan 2023 08:36:21 +0100 Message-Id: In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.1 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true" We need to call the __mptcp_nmpc_socket(), and later subflow socket access under the msk socket lock, or e.g. a racing connect() could change the socket status under the wood, with unexpected results. Fixes: 54635bd04701 ("mptcp: add TCP_FASTOPEN_CONNECT socket option") Signed-off-by: Paolo Abeni --- this and the next patch are new v2. In both case the addressed issue is almost irrelevant until patch 8/12, but will cause lockdep splat after such change --- net/mptcp/sockopt.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/net/mptcp/sockopt.c b/net/mptcp/sockopt.c index 9986681aaf40..8a9656248b0f 100644 --- a/net/mptcp/sockopt.c +++ b/net/mptcp/sockopt.c @@ -760,14 +760,21 @@ static int mptcp_setsockopt_v4(struct mptcp_sock *msk= , int optname, static int mptcp_setsockopt_first_sf_only(struct mptcp_sock *msk, int leve= l, int optname, sockptr_t optval, unsigned int optlen) { + struct sock *sk =3D (struct sock *)msk; struct socket *sock; + int ret =3D -EINVAL; =20 /* Limit to first subflow, before the connection establishment */ + lock_sock(sk); sock =3D __mptcp_nmpc_socket(msk); if (!sock) - return -EINVAL; + goto unlock; =20 - return tcp_setsockopt(sock->sk, level, optname, optval, optlen); + ret =3D tcp_setsockopt(sock->sk, level, optname, optval, optlen); + +unlock: + release_sock(sk); + return ret; } =20 static int mptcp_setsockopt_sol_tcp(struct mptcp_sock *msk, int optname, --=20 2.39.0 From nobody Sat Jun 3 17:01:56 2023 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CB05C1FB0 for ; Tue, 17 Jan 2023 07:37:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1673941035; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0NQpx2O11ITsvam4FhyJVJSItB8MdbllQV8nyfkH+aU=; b=DrV+s45ck9G3UK2BFP/EWDvfq12HCRdS4lPU4cgeI7k4giPHQ34WjOBrcw1uvRgzF56Luq CkXQ4/tklAb8mjor/60EIRBO4e0KcMP/Z1FtUrtJFIYtgXUj9549EH8BbLTmwprDgxjwEq FhfaPR7EoLxiCKd7o6dZS9zRh9jVvOI= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-664-cTct4M7LMomB4236YlDdHg-1; Tue, 17 Jan 2023 02:37:14 -0500 X-MC-Unique: cTct4M7LMomB4236YlDdHg-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 068DC101A521 for ; Tue, 17 Jan 2023 07:37:14 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.39.192.87]) by smtp.corp.redhat.com (Postfix) with ESMTP id 86D4E40C2004 for ; Tue, 17 Jan 2023 07:37:13 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [PATCH mptcp-next v2 02/13] mptcp: fix locking for in-kernel listener creation. Date: Tue, 17 Jan 2023 08:36:22 +0100 Message-Id: In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.1 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true" For consistency, in mptcp_pm_nl_create_listen_socket(), we need to call the __mptcp_nmpc_socket() under the msk socket lock. Note that as a side effect, mptcp_subflow_create_socket() needs a 'nested' lockdep annotation, as it will acquire the subflow (kernel) socket lock under the in-kernel listener msk socket lock. The current lack of locking is almost harmless, because the relevant socket is not exposed to the user space, but in future we will add more complexity to the mentioned helper, let's play safe. Fixes: 1729cf186d8a ("mptcp: create the listening socket for new port") Signed-off-by: Paolo Abeni --- net/mptcp/pm_netlink.c | 10 ++++++---- net/mptcp/subflow.c | 2 +- 2 files changed, 7 insertions(+), 5 deletions(-) diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index e82a112b8779..155916174841 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -999,8 +999,8 @@ static int mptcp_pm_nl_create_listen_socket(struct sock= *sk, { int addrlen =3D sizeof(struct sockaddr_in); struct sockaddr_storage addr; - struct mptcp_sock *msk; struct socket *ssock; + struct sock *newsk; int backlog =3D 1024; int err; =20 @@ -1009,11 +1009,13 @@ static int mptcp_pm_nl_create_listen_socket(struct = sock *sk, if (err) return err; =20 - msk =3D mptcp_sk(entry->lsk->sk); - if (!msk) + newsk =3D entry->lsk->sk; + if (!newsk) return -EINVAL; =20 - ssock =3D __mptcp_nmpc_socket(msk); + lock_sock(newsk); + ssock =3D __mptcp_nmpc_socket(mptcp_sk(newsk)); + release_sock(newsk); if (!ssock) return -EINVAL; =20 diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index ec54413fb31f..a3e5026bee5b 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -1679,7 +1679,7 @@ int mptcp_subflow_create_socket(struct sock *sk, unsi= gned short family, if (err) return err; =20 - lock_sock(sf->sk); + lock_sock_nested(sf->sk, SINGLE_DEPTH_NESTING); =20 /* the newly created socket has to be in the same cgroup as its parent */ mptcp_attach_cgroup(sk, sf->sk); --=20 2.39.0 From nobody Sat Jun 3 17:01:56 2023 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D0E691FAF for ; Tue, 17 Jan 2023 07:37:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1673941037; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=5FuBEOGB8WbYsLnl0XuWDprH+mEUMqORoM+jRt7xwTQ=; b=K5t8UB7ke6hSeYb570foYAsrv9bz3Kh7IvpLEtSSUEQNeSIzJWgZ6fQqq4Xs3QYm9mlE8K eHUwi9wqSaCHGV3fl61hKsKQTkThbZwYtAEwUX3cWVCxq2zu5SdxSYW2Iu1m4y4qH2hCdf SvcN/C9DnucS7Z9VUrcbMBkMIjJITec= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-52-rQwuNkDqPK6PfFBuGQBBEQ-1; Tue, 17 Jan 2023 02:37:15 -0500 X-MC-Unique: rQwuNkDqPK6PfFBuGQBBEQ-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id BD64080D0E7 for ; Tue, 17 Jan 2023 07:37:14 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.39.192.87]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4D0F940C2064 for ; Tue, 17 Jan 2023 07:37:14 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [PATCH mptcp-next v2 03/13] mptcp: refactor passive socket initialization. Date: Tue, 17 Jan 2023 08:36:23 +0100 Message-Id: <03be07463840d00536e639f76fb0089c6f99a3ff.1673940640.git.pabeni@redhat.com> In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.1 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true" After commit 30e51b923e43 ("mptcp: fix unreleased socket in accept queue") unaccepted msk sockets go throu complete shutdown, we don't need anymore to delay inserting the first subflow into the subflow lists. The reference counting deserve some extra care, as __mptcp_close() is unaware of the request socket linkage to the first subflow. Signed-off-by: Paolo Abeni --- Notes: - this schema assumes that the TCP code will never drop a request socket from the receive queue after the 3whs. I tried to verify such assumption as strictily as I could, but more eyes more then welcome! - this will cause pktdrill failure for close_before_accept.pkt, because the msk will become fully established before accept() - imho a good thing - and send out add_addr earlier. The pktdrill test change should be easier. --- net/mptcp/protocol.c | 17 ----------------- net/mptcp/subflow.c | 31 ++++++++++++++++++++++++------- 2 files changed, 24 insertions(+), 24 deletions(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 003b44a79fce..d298d629b3b2 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -825,7 +825,6 @@ static bool __mptcp_finish_join(struct mptcp_sock *msk,= struct sock *ssk) if (sk->sk_socket && !ssk->sk_socket) mptcp_sock_graft(ssk, sk->sk_socket); =20 - mptcp_propagate_sndbuf((struct sock *)msk, ssk); mptcp_sockopt_sync_locked(msk, ssk); return true; } @@ -3753,22 +3752,6 @@ static int mptcp_stream_accept(struct socket *sock, = struct socket *newsock, =20 lock_sock(newsk); =20 - /* PM/worker can now acquire the first subflow socket - * lock without racing with listener queue cleanup, - * we can notify it, if needed. - * - * Even if remote has reset the initial subflow by now - * the refcnt is still at least one. - */ - subflow =3D mptcp_subflow_ctx(msk->first); - list_add(&subflow->node, &msk->conn_list); - sock_hold(msk->first); - if (mptcp_is_fully_established(newsk)) - mptcp_pm_fully_established(msk, msk->first, GFP_KERNEL); - - mptcp_rcv_space_init(msk, msk->first); - mptcp_propagate_sndbuf(newsk, msk->first); - /* set ssk->sk_socket of accept()ed flows to mptcp socket. * This is needed so NOSPACE flag can be set from tcp stack. */ diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index a3e5026bee5b..5e6752cd280b 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -749,6 +749,7 @@ static struct sock *subflow_syn_recv_sock(const struct = sock *sk, struct mptcp_options_received mp_opt; bool fallback, fallback_is_fatal; struct sock *new_msk =3D NULL; + struct mptcp_sock *owner; struct sock *child; =20 pr_debug("listener=3D%p, req=3D%p, conn=3D%p", listener, req, listener->c= onn); @@ -823,6 +824,8 @@ static struct sock *subflow_syn_recv_sock(const struct = sock *sk, ctx->setsockopt_seq =3D listener->setsockopt_seq; =20 if (ctx->mp_capable) { + owner =3D mptcp_sk(new_msk); + /* this can't race with mptcp_close(), as the msk is * not yet exposted to user-space */ @@ -831,14 +834,14 @@ static struct sock *subflow_syn_recv_sock(const struc= t sock *sk, /* record the newly created socket as the first msk * subflow, but don't link it yet into conn_list */ - WRITE_ONCE(mptcp_sk(new_msk)->first, child); + WRITE_ONCE(owner->first, child); =20 /* new mpc subflow takes ownership of the newly * created mptcp socket */ mptcp_sk(new_msk)->setsockopt_seq =3D ctx->setsockopt_seq; - mptcp_pm_new_connection(mptcp_sk(new_msk), child, 1); - mptcp_token_accept(subflow_req, mptcp_sk(new_msk)); + mptcp_pm_new_connection(owner, child, 1); + mptcp_token_accept(subflow_req, owner); ctx->conn =3D new_msk; new_msk =3D NULL; =20 @@ -846,15 +849,21 @@ static struct sock *subflow_syn_recv_sock(const struc= t sock *sk, * uses the correct data */ mptcp_copy_inaddrs(ctx->conn, child); + mptcp_propagate_sndbuf(ctx->conn, child); + + mptcp_rcv_space_init(owner, child); + list_add(&ctx->node, &owner->conn_list); + sock_hold(child); =20 /* with OoO packets we can reach here without ingress * mpc option */ - if (mp_opt.suboptions & OPTION_MPTCP_MPC_ACK) + if (mp_opt.suboptions & OPTION_MPTCP_MPC_ACK) { mptcp_subflow_fully_established(ctx, &mp_opt); + mptcp_pm_fully_established(owner, child, GFP_ATOMIC); + ctx->pm_notified =3D 1; + } } else if (ctx->mp_join) { - struct mptcp_sock *owner; - owner =3D subflow_req->msk; if (!owner) { subflow_add_reset_reason(skb, MPTCP_RST_EPROHIBIT); @@ -1836,9 +1845,17 @@ void mptcp_subflow_queue_clean(struct sock *listener= _sk, struct sock *listener_s sock_hold(sk); lock_sock_nested(sk, SINGLE_DEPTH_NESTING); next =3D msk->dl_next; - msk->first =3D NULL; msk->dl_next =3D NULL; =20 + /* The upcoming mptcp_close is going to drop all the references + * to the first subflow, ignoring that one of such reference is + * owned by the request socket still in the accept queue and that + * later inet_csk_listen_stop will drop it. + * Acquire an extra reference here to avoid an UaF at that point. + */ + if (msk->first) + sock_hold(msk->first); + do_cancel_work =3D __mptcp_close(sk, 0); release_sock(sk); if (do_cancel_work) { --=20 2.39.0 From nobody Sat Jun 3 17:01:56 2023 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9CC801FA7 for ; Tue, 17 Jan 2023 07:37:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1673941039; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=x1R/zNJXy/zImUspA339/XCm8+fn1opSE/YPAqPAbyY=; b=hpeqopCG2CPpFNZpmNEynBk/RxSc60Me8BiwvM/2Y5HBjrpuDNsdZMpTwfP/N1Wdyrg/4i RZ1I2KXFfF+UJWBv09DVaoXL4wD3vagAPbv/xy6oTkYfglLBAytKNSdFYbC/WbPlalw8vr jZODjbJvbhguO6JgbUG4eR0jRLlTNsU= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-645-Nty0h3SzOv-KhpYaiF7qeg-1; Tue, 17 Jan 2023 02:37:15 -0500 X-MC-Unique: Nty0h3SzOv-KhpYaiF7qeg-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 82006101A521 for ; Tue, 17 Jan 2023 07:37:15 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.39.192.87]) by smtp.corp.redhat.com (Postfix) with ESMTP id 1077340C2004 for ; Tue, 17 Jan 2023 07:37:14 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [PATCH mptcp-next v2 04/13] mptcp: drop unneeded argument Date: Tue, 17 Jan 2023 08:36:24 +0100 Message-Id: In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.1 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true" After the previous commit every mptcp_pm_fully_established() is always invoked with a GFP_ATOMIC argument. We can drop it. Signed-off-by: Paolo Abeni --- net/mptcp/options.c | 2 +- net/mptcp/pm.c | 4 ++-- net/mptcp/protocol.h | 2 +- net/mptcp/subflow.c | 2 +- 4 files changed, 5 insertions(+), 5 deletions(-) diff --git a/net/mptcp/options.c b/net/mptcp/options.c index b30cea2fbf3f..99c4f9e9bb90 100644 --- a/net/mptcp/options.c +++ b/net/mptcp/options.c @@ -1001,7 +1001,7 @@ static bool check_fully_established(struct mptcp_sock= *msk, struct sock *ssk, clear_3rdack_retransmission(ssk); mptcp_pm_subflow_established(msk); } else { - mptcp_pm_fully_established(msk, ssk, GFP_ATOMIC); + mptcp_pm_fully_established(msk, ssk); } return true; =20 diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c index 8e0cf6275e94..4ed4d29d9c11 100644 --- a/net/mptcp/pm.c +++ b/net/mptcp/pm.c @@ -126,7 +126,7 @@ static bool mptcp_pm_schedule_work(struct mptcp_sock *m= sk, return true; } =20 -void mptcp_pm_fully_established(struct mptcp_sock *msk, const struct sock = *ssk, gfp_t gfp) +void mptcp_pm_fully_established(struct mptcp_sock *msk, const struct sock = *ssk) { struct mptcp_pm_data *pm =3D &msk->pm; bool announce =3D false; @@ -150,7 +150,7 @@ void mptcp_pm_fully_established(struct mptcp_sock *msk,= const struct sock *ssk, spin_unlock_bh(&pm->lock); =20 if (announce) - mptcp_event(MPTCP_EVENT_ESTABLISHED, msk, ssk, gfp); + mptcp_event(MPTCP_EVENT_ESTABLISHED, msk, ssk, GFP_ATOMIC); } =20 void mptcp_pm_connection_closed(struct mptcp_sock *msk) diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index 871ec3e93314..5f1a30959b5c 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -799,7 +799,7 @@ bool mptcp_pm_addr_families_match(const struct sock *sk, void mptcp_pm_subflow_chk_stale(const struct mptcp_sock *msk, struct sock = *ssk); void mptcp_pm_nl_subflow_chk_stale(const struct mptcp_sock *msk, struct so= ck *ssk); void mptcp_pm_new_connection(struct mptcp_sock *msk, const struct sock *ss= k, int server_side); -void mptcp_pm_fully_established(struct mptcp_sock *msk, const struct sock = *ssk, gfp_t gfp); +void mptcp_pm_fully_established(struct mptcp_sock *msk, const struct sock = *ssk); bool mptcp_pm_allow_new_subflow(struct mptcp_sock *msk); void mptcp_pm_connection_closed(struct mptcp_sock *msk); void mptcp_pm_subflow_established(struct mptcp_sock *msk); diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index 5e6752cd280b..7b91dc57049e 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -860,7 +860,7 @@ static struct sock *subflow_syn_recv_sock(const struct = sock *sk, */ if (mp_opt.suboptions & OPTION_MPTCP_MPC_ACK) { mptcp_subflow_fully_established(ctx, &mp_opt); - mptcp_pm_fully_established(owner, child, GFP_ATOMIC); + mptcp_pm_fully_established(owner, child); ctx->pm_notified =3D 1; } } else if (ctx->mp_join) { --=20 2.39.0 From nobody Sat Jun 3 17:01:56 2023 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0C2831FB3 for ; Tue, 17 Jan 2023 07:37:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1673941038; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=barDMb2Xf6YqJ0tZHH4LwL7fLiEstH2PdV7UFCbjr2o=; b=R3KxFRO7+/4U1r3VfWTxIh7OCOIvDZZe6v3abd3FQPXBZfnLMMUaCBPv5Z4apGElKS5B1l +JfFb+GbjH0C6pascxE8zuBev8ZL+QL6L6PPEi1WoNwrSbsJf0WLMTSzkOoyampzQK1EhB 1CB49j6WOJ+eV94B3ZoyaKlL8DGXnj4= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-9-FO-d1X3aMkCwM_3eZAJ1mw-1; Tue, 17 Jan 2023 02:37:16 -0500 X-MC-Unique: FO-d1X3aMkCwM_3eZAJ1mw-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 5498580D0E3 for ; Tue, 17 Jan 2023 07:37:16 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.39.192.87]) by smtp.corp.redhat.com (Postfix) with ESMTP id C8CE640C2064 for ; Tue, 17 Jan 2023 07:37:15 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [PATCH mptcp-next v2 05/13] mptcp: drop legacy code. Date: Tue, 17 Jan 2023 08:36:25 +0100 Message-Id: <86e96a76504f819131cd3f6065ad0bb1b261d6b8.1673940640.git.pabeni@redhat.com> In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.1 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true" After the previous commits the PM worker can't race anymore with the unaccepted subflow close and disposal, as the msk keeps a reference to such subflow. We can remove the now irrelevant and confusing checks explicitly preventing the mentioned race. Signed-off-by: Paolo Abeni --- net/mptcp/options.c | 7 +------ net/mptcp/subflow.c | 7 +++---- 2 files changed, 4 insertions(+), 10 deletions(-) diff --git a/net/mptcp/options.c b/net/mptcp/options.c index 99c4f9e9bb90..91d5b59540e9 100644 --- a/net/mptcp/options.c +++ b/net/mptcp/options.c @@ -988,12 +988,7 @@ static bool check_fully_established(struct mptcp_sock = *msk, struct sock *ssk, mptcp_subflow_fully_established(subflow, mp_opt); =20 check_notify: - /* if the subflow is not already linked into the conn_list, we can't - * notify the PM: this subflow is still on the listener queue - * and the PM possibly acquiring the subflow lock could race with - * the listener close - */ - if (likely(subflow->pm_notified) || list_empty(&subflow->node)) + if (likely(subflow->pm_notified)) return true; =20 subflow->pm_notified =3D 1; diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index 7b91dc57049e..6f198d6e1b22 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -1933,11 +1933,10 @@ static void subflow_ulp_release(struct sock *ssk) =20 sk =3D ctx->conn; if (sk) { - /* if the msk has been orphaned, keep the ctx - * alive, will be freed by __mptcp_close_ssk(), - * when the subflow is still unaccepted + /* if the subflow has been closed by the TCP stack, keep + * the ctx alive, will be freed by __mptcp_close_ssk() */ - release =3D ctx->disposable || list_empty(&ctx->node); + release =3D ctx->disposable; sock_put(sk); } =20 --=20 2.39.0 From nobody Sat Jun 3 17:01:56 2023 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4EA971FB0 for ; Tue, 17 Jan 2023 07:37:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1673941038; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=3HjysDh572wY9Js9r3sfPKUTwXRf2DKKbBeaZSRym/g=; b=NKeAILqW46BVkuoxmA6c1glJyHJOIJSx5NAWmfE5mZ9Hm9pks3IrGFQ9n3nlPemLFmYFDX D8IjrxQHIGK/SdX4R2WJ9wWcCrnLLko5XDGsSYacQ6FMZiOSqEtBPQ9c7QaDe37lLnUbUr 8zZpBwEBbvX1cxxPKehf/51yqPegfBs= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-520-RMIZ-_zGMJ64jjcZ3KmzlA-1; Tue, 17 Jan 2023 02:37:17 -0500 X-MC-Unique: RMIZ-_zGMJ64jjcZ3KmzlA-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 1864D802D2A for ; Tue, 17 Jan 2023 07:37:17 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.39.192.87]) by smtp.corp.redhat.com (Postfix) with ESMTP id 9BDA340C2004 for ; Tue, 17 Jan 2023 07:37:16 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [PATCH mptcp-next v2 06/13] mptcp: avoid unneeded __mptcp_nmpc_socket() usage Date: Tue, 17 Jan 2023 08:36:26 +0100 Message-Id: In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.1 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true" In a few spots the mptcp code invokes the __mptcp_nmpc_socket() helper multiple times under the same socket lock scope. Additionally, in such places, the socket status ensure that threre is not an MP capable handshake running. Under the above condition we can replace the later __mptcp_nmpc_socket() helper invocation with direct access to the msk->subflow pointer and better document such access is not supposed to fail with WARN(). Signed-off-by: Paolo Abeni --- net/mptcp/protocol.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index d298d629b3b2..fc13f1e45137 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -3172,7 +3172,7 @@ static struct sock *mptcp_accept(struct sock *sk, int= flags, int *err, struct socket *listener; struct sock *newsk; =20 - listener =3D __mptcp_nmpc_socket(msk); + listener =3D msk->subflow; if (WARN_ON_ONCE(!listener)) { *err =3D -EINVAL; return NULL; @@ -3392,7 +3392,7 @@ static int mptcp_get_port(struct sock *sk, unsigned s= hort snum) struct mptcp_sock *msk =3D mptcp_sk(sk); struct socket *ssock; =20 - ssock =3D __mptcp_nmpc_socket(msk); + ssock =3D msk->subflow; pr_debug("msk=3D%p, subflow=3D%p", msk, ssock); if (WARN_ON_ONCE(!ssock)) return -EINVAL; @@ -3738,7 +3738,10 @@ static int mptcp_stream_accept(struct socket *sock, = struct socket *newsock, =20 pr_debug("msk=3D%p", msk); =20 - ssock =3D __mptcp_nmpc_socket(msk); + /* buggy applications can call accept on socket states other then LISTEN + * but no need to allocate the first subflow just to error out. + */ + ssock =3D msk->subflow; if (!ssock) return -EINVAL; =20 --=20 2.39.0 From nobody Sat Jun 3 17:01:56 2023 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 16AEF1FB6 for ; Tue, 17 Jan 2023 07:37:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1673941040; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=TMGg0Tfh/BNDpXs4p5HbUKbtIyGBffJA+AN6MG0y4aM=; b=GVj9W2NpqiB81gDoM73MsppDvf1o5yAjahdkvY0OQpayUJcEAMFsiGC6qZKXGqcCtikqbX d4klTUnHgOIcytXN/8GVGkCtOjuWpS1Oek2uaq3xJGKrV6OwQnbQHhO96TYQoqunX8nQ4r 6g44VCRZJbdLz7y+TKBw+TMa5N3Rows= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-524-GzpoIdafPGCXferUg0wB3g-1; Tue, 17 Jan 2023 02:37:18 -0500 X-MC-Unique: GzpoIdafPGCXferUg0wB3g-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id D2413811E6E for ; Tue, 17 Jan 2023 07:37:17 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.39.192.87]) by smtp.corp.redhat.com (Postfix) with ESMTP id 5F0A640C2064 for ; Tue, 17 Jan 2023 07:37:17 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [PATCH mptcp-next v2 07/13] mptcp: move fastopen subflow check inside mptcp_sendmsg_fastopen() Date: Tue, 17 Jan 2023 08:36:27 +0100 Message-Id: <3a304d195506dd6e5bf23015bbd0b97f105556ea.1673940640.git.pabeni@redhat.com> In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.1 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true" So that we can avoid a bunch of check in fastpath. Additionally we can specialize such check according to the specific fastopen method - defer_connect vs MSG_FASTOPEN. The latter bits will simplify the next patches. Signed-off-by: Paolo Abeni --- net/mptcp/protocol.c | 24 ++++++++++++++++++------ 1 file changed, 18 insertions(+), 6 deletions(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index fc13f1e45137..9c4c729bf271 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -1694,13 +1694,27 @@ static void mptcp_set_nospace(struct sock *sk) =20 static int mptcp_disconnect(struct sock *sk, int flags); =20 -static int mptcp_sendmsg_fastopen(struct sock *sk, struct sock *ssk, struc= t msghdr *msg, +static int mptcp_sendmsg_fastopen(struct sock *sk, struct msghdr *msg, size_t len, int *copied_syn) { unsigned int saved_flags =3D msg->msg_flags; struct mptcp_sock *msk =3D mptcp_sk(sk); + struct sock *ssk; int ret; =20 + /* on flags based fastopen the mptcp is supposed to create the + * first subflow right now. Otherwise we are in the defer_connect + * path, and the first subflow must be already present. + * Since the defer_connect flag is cleared after the first succsful + * fastopen attempt, no need to check for additional subflow status. + */ + if (msg->msg_flags & MSG_FASTOPEN && !__mptcp_nmpc_socket(msk)) + return -EINVAL; + if (!msk->first) + return -EINVAL; + + ssk =3D msk->first; + lock_sock(ssk); msg->msg_flags |=3D MSG_DONTWAIT; msk->connect_flags =3D O_NONBLOCK; @@ -1723,6 +1737,7 @@ static int mptcp_sendmsg_fastopen(struct sock *sk, st= ruct sock *ssk, struct msgh } else if (ret && ret !=3D -EINPROGRESS) { mptcp_disconnect(sk, 0); } + inet_sk(sk)->defer_connect =3D 0; =20 return ret; } @@ -1731,7 +1746,6 @@ static int mptcp_sendmsg(struct sock *sk, struct msgh= dr *msg, size_t len) { struct mptcp_sock *msk =3D mptcp_sk(sk); struct page_frag *pfrag; - struct socket *ssock; size_t copied =3D 0; int ret =3D 0; long timeo; @@ -1741,12 +1755,10 @@ static int mptcp_sendmsg(struct sock *sk, struct ms= ghdr *msg, size_t len) =20 lock_sock(sk); =20 - ssock =3D __mptcp_nmpc_socket(msk); - if (unlikely(ssock && (inet_sk(ssock->sk)->defer_connect || - msg->msg_flags & MSG_FASTOPEN))) { + if (unlikely(inet_sk(sk)->defer_connect || msg->msg_flags & MSG_FASTOPEN)= ) { int copied_syn =3D 0; =20 - ret =3D mptcp_sendmsg_fastopen(sk, ssock->sk, msg, len, &copied_syn); + ret =3D mptcp_sendmsg_fastopen(sk, msg, len, &copied_syn); copied +=3D copied_syn; if (ret =3D=3D -EINPROGRESS && copied_syn > 0) goto out; --=20 2.39.0 From nobody Sat Jun 3 17:01:56 2023 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D46071FB0 for ; Tue, 17 Jan 2023 07:37:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1673941039; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=STOWPGwSE773zhlawTEXcElOJSTVsESesy8J/2SHfTI=; b=REjnvtU3itUOiP85ch3lfyq6WRUzBjjtglDLZntbJWL16rYjDnLoCgXzfUbpsfuOda1Lqg 9qGua+2svQPl1nMljsxoJQVcy/MpicRjYeZJjKNHui6Wwp6Wjv9KP5CjJc/Zi4PtxspQnm GWDgrB6Y0bCpxpn8izuHamhN5Pk5XAU= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-526-zNqd7FJTOAOwB-FbJmfYpw-1; Tue, 17 Jan 2023 02:37:18 -0500 X-MC-Unique: zNqd7FJTOAOwB-FbJmfYpw-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 98B9980D0E3 for ; Tue, 17 Jan 2023 07:37:18 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.39.192.87]) by smtp.corp.redhat.com (Postfix) with ESMTP id 27D7740C2004 for ; Tue, 17 Jan 2023 07:37:18 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [PATCH mptcp-next v2 08/13] mptcp: move first subflow allocation at mpc access time Date: Tue, 17 Jan 2023 08:36:28 +0100 Message-Id: In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.1 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true" In the long run this will simplify the mptcp code and will allow for more consistent behavior. Move the first subflow allocation out of the sock->init ops into the __mptcp_nmpc_socket() helper. Since the first subflow creation can now happen after the first setsockopt() we additionally need to invoke mptcp_sockopt_sync() on it. Signed-off-by: Paolo Abeni --- v1 -> v2: - really remove subflow creation from init() (!!!) --- net/mptcp/pm_netlink.c | 4 +-- net/mptcp/protocol.c | 61 +++++++++++++++++++++++++----------------- net/mptcp/protocol.h | 2 +- net/mptcp/sockopt.c | 20 +++++++------- 4 files changed, 51 insertions(+), 36 deletions(-) diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index 155916174841..d1d859517d91 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -1016,8 +1016,8 @@ static int mptcp_pm_nl_create_listen_socket(struct so= ck *sk, lock_sock(newsk); ssock =3D __mptcp_nmpc_socket(mptcp_sk(newsk)); release_sock(newsk); - if (!ssock) - return -EINVAL; + if (IS_ERR(ssock)) + return PTR_ERR(ssock); =20 mptcp_info2sockaddr(&entry->addr, &addr, entry->addr.family); #if IS_ENABLED(CONFIG_MPTCP_IPV6) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 9c4c729bf271..4f7a71561efd 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -49,18 +49,6 @@ static void __mptcp_check_send_data_fin(struct sock *sk); DEFINE_PER_CPU(struct mptcp_delegated_action, mptcp_delegated_actions); static struct net_device mptcp_napi_dev; =20 -/* If msk has an initial subflow socket, and the MP_CAPABLE handshake has = not - * completed yet or has failed, return the subflow socket. - * Otherwise return NULL. - */ -struct socket *__mptcp_nmpc_socket(const struct mptcp_sock *msk) -{ - if (!msk->subflow || READ_ONCE(msk->can_ack)) - return NULL; - - return msk->subflow; -} - /* Returns end sequence number of the receiver's advertised window */ static u64 mptcp_wnd_end(const struct mptcp_sock *msk) { @@ -116,6 +104,31 @@ static int __mptcp_socket_create(struct mptcp_sock *ms= k) return 0; } =20 +/* Returns the first subflow socket if available and the MPC + * handshake is not started yet. + */ +struct socket *__mptcp_nmpc_socket(struct mptcp_sock *msk) +{ + struct sock *sk =3D (struct sock *)msk; + int ret; + + if (!((1 << sk->sk_state) & (TCPF_CLOSE | TCPF_LISTEN))) + return ERR_PTR(-EINVAL); + + if (!msk->subflow) { + if (msk->first) + return ERR_PTR(-EINVAL); + + ret =3D __mptcp_socket_create(msk); + if (ret) + return ERR_PTR(ret); + + mptcp_sockopt_sync(msk, msk->first); + } + + return msk->subflow; +} + static void mptcp_drop(struct sock *sk, struct sk_buff *skb) { sk_drops_add(sk, skb); @@ -1699,6 +1712,7 @@ static int mptcp_sendmsg_fastopen(struct sock *sk, st= ruct msghdr *msg, { unsigned int saved_flags =3D msg->msg_flags; struct mptcp_sock *msk =3D mptcp_sk(sk); + struct socket *ssock; struct sock *ssk; int ret; =20 @@ -1708,8 +1722,11 @@ static int mptcp_sendmsg_fastopen(struct sock *sk, s= truct msghdr *msg, * Since the defer_connect flag is cleared after the first succsful * fastopen attempt, no need to check for additional subflow status. */ - if (msg->msg_flags & MSG_FASTOPEN && !__mptcp_nmpc_socket(msk)) - return -EINVAL; + if (msg->msg_flags & MSG_FASTOPEN) { + ssock =3D __mptcp_nmpc_socket(msk); + if (IS_ERR(ssock)) + return PTR_ERR(ssock); + } if (!msk->first) return -EINVAL; =20 @@ -2768,10 +2785,6 @@ static int mptcp_init_sock(struct sock *sk) if (unlikely(!net->mib.mptcp_statistics) && !mptcp_mib_alloc(net)) return -ENOMEM; =20 - ret =3D __mptcp_socket_create(mptcp_sk(sk)); - if (ret) - return ret; - ret =3D mptcp_init_sched(mptcp_sk(sk), mptcp_sched_find(mptcp_get_scheduler(net))); if (ret) @@ -3592,8 +3605,8 @@ static int mptcp_connect(struct sock *sk, struct sock= addr *uaddr, int addr_len) int err =3D -EINVAL; =20 ssock =3D __mptcp_nmpc_socket(msk); - if (!ssock) - return -EINVAL; + if (IS_ERR(ssock)) + return PTR_ERR(ssock); =20 mptcp_token_destroy(msk); inet_sk_state_store(sk, TCP_SYN_SENT); @@ -3681,8 +3694,8 @@ static int mptcp_bind(struct socket *sock, struct soc= kaddr *uaddr, int addr_len) =20 lock_sock(sock->sk); ssock =3D __mptcp_nmpc_socket(msk); - if (!ssock) { - err =3D -EINVAL; + if (IS_ERR(ssock)) { + err =3D PTR_ERR(ssock); goto unlock; } =20 @@ -3718,8 +3731,8 @@ static int mptcp_listen(struct socket *sock, int back= log) =20 lock_sock(sk); ssock =3D __mptcp_nmpc_socket(msk); - if (!ssock) { - err =3D -EINVAL; + if (IS_ERR(ssock)) { + err =3D PTR_ERR(ssock); goto unlock; } =20 diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index 5f1a30959b5c..3a055438c65e 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -632,7 +632,7 @@ void __mptcp_subflow_send_ack(struct sock *ssk); void mptcp_subflow_reset(struct sock *ssk); void mptcp_subflow_queue_clean(struct sock *sk, struct sock *ssk); void mptcp_sock_graft(struct sock *sk, struct socket *parent); -struct socket *__mptcp_nmpc_socket(const struct mptcp_sock *msk); +struct socket *__mptcp_nmpc_socket(struct mptcp_sock *msk); bool __mptcp_close(struct sock *sk, long timeout); void mptcp_cancel_work(struct sock *sk); void mptcp_set_owner_r(struct sk_buff *skb, struct sock *sk); diff --git a/net/mptcp/sockopt.c b/net/mptcp/sockopt.c index 8a9656248b0f..9bddae9c5c58 100644 --- a/net/mptcp/sockopt.c +++ b/net/mptcp/sockopt.c @@ -301,9 +301,9 @@ static int mptcp_setsockopt_sol_socket(struct mptcp_soc= k *msk, int optname, case SO_BINDTOIFINDEX: lock_sock(sk); ssock =3D __mptcp_nmpc_socket(msk); - if (!ssock) { + if (IS_ERR(ssock)) { release_sock(sk); - return -EINVAL; + return PTR_ERR(ssock); } =20 ret =3D sock_setsockopt(ssock, SOL_SOCKET, optname, optval, optlen); @@ -396,9 +396,9 @@ static int mptcp_setsockopt_v6(struct mptcp_sock *msk, = int optname, case IPV6_FREEBIND: lock_sock(sk); ssock =3D __mptcp_nmpc_socket(msk); - if (!ssock) { + if (IS_ERR(ssock)) { release_sock(sk); - return -EINVAL; + return PTR_ERR(ssock); } =20 ret =3D tcp_setsockopt(ssock->sk, SOL_IPV6, optname, optval, optlen); @@ -693,9 +693,9 @@ static int mptcp_setsockopt_sol_ip_set_transparent(stru= ct mptcp_sock *msk, int o lock_sock(sk); =20 ssock =3D __mptcp_nmpc_socket(msk); - if (!ssock) { + if (IS_ERR(ssock)) { release_sock(sk); - return -EINVAL; + return PTR_ERR(ssock); } =20 issk =3D inet_sk(ssock->sk); @@ -762,13 +762,15 @@ static int mptcp_setsockopt_first_sf_only(struct mptc= p_sock *msk, int level, int { struct sock *sk =3D (struct sock *)msk; struct socket *sock; - int ret =3D -EINVAL; + int ret; =20 /* Limit to first subflow, before the connection establishment */ lock_sock(sk); sock =3D __mptcp_nmpc_socket(msk); - if (!sock) + if (IS_ERR(sock)) { + ret =3D PTR_ERR(sock); goto unlock; + } =20 ret =3D tcp_setsockopt(sock->sk, level, optname, optval, optlen); =20 @@ -872,7 +874,7 @@ static int mptcp_getsockopt_first_sf_only(struct mptcp_= sock *msk, int level, int } =20 ssock =3D __mptcp_nmpc_socket(msk); - if (!ssock) + if (IS_ERR(ssock)) goto out; =20 ret =3D tcp_getsockopt(ssock->sk, level, optname, optval, optlen); --=20 2.39.0 From nobody Sat Jun 3 17:01:56 2023 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1FAA21FB7 for ; Tue, 17 Jan 2023 07:37:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1673941041; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=hI7KDsVT/Mi0T+DWAF9+aOS8tyYS2rhVoEOXIYb/AF8=; b=evnzcEo2CaISeUKmrPyqMA8hqiWOkelSqqLRWuEye3bZAPO5kVlwPA7kH+i4UiM4cVF+ks XY1uY/HWlaldYwtzGCxTPuwzj7pkV0XcVonfpDjDo1VMo8AW3HYAMhPutGZcS+9NSiXOYP Y5R0G8TMdhCKWNDH4J9eBt2Ang1VRaU= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-609-rntRgAl9OS-wHYgoeaMOzQ-1; Tue, 17 Jan 2023 02:37:19 -0500 X-MC-Unique: rntRgAl9OS-wHYgoeaMOzQ-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 5C7492804127 for ; Tue, 17 Jan 2023 07:37:19 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.39.192.87]) by smtp.corp.redhat.com (Postfix) with ESMTP id DF5D540C2064 for ; Tue, 17 Jan 2023 07:37:18 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [PATCH mptcp-next v2 09/13] mptcp: do not keep around the first subflow after disconnect. Date: Tue, 17 Jan 2023 08:36:29 +0100 Message-Id: In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.1 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true" After the previous patch the first subflow is allocated as needed at bind, connect, listen time. We don't need anymore to keep alive the first subflow after a disconnect just to be able to perform such syscall. Overal this change makes the passive and active sockets consistent: even passive sockets will be allowed to complete life cycle after disconnect. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/290 Signed-off-by: Paolo Abeni --- net/mptcp/protocol.c | 24 +++++------------------- 1 file changed, 5 insertions(+), 19 deletions(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 4f7a71561efd..b867e3eec5b9 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -2371,11 +2371,9 @@ static void __mptcp_close_ssk(struct sock *sk, struc= t sock *ssk, unsigned int flags) { struct mptcp_sock *msk =3D mptcp_sk(sk); - bool need_push, dispose_it; + bool need_push; =20 - dispose_it =3D !msk->subflow || ssk !=3D msk->subflow->sk; - if (dispose_it) - list_del(&subflow->node); + list_del(&subflow->node); =20 lock_sock_nested(ssk, SINGLE_DEPTH_NESTING); =20 @@ -2389,15 +2387,6 @@ static void __mptcp_close_ssk(struct sock *sk, struc= t sock *ssk, } =20 need_push =3D (flags & MPTCP_CF_PUSH) && __mptcp_retransmit_pending_data(= sk); - if (!dispose_it) { - tcp_disconnect(ssk, 0); - msk->subflow->state =3D SS_UNCONNECTED; - mptcp_subflow_ctx_reset(subflow); - release_sock(ssk); - - goto out; - } - sock_orphan(ssk); subflow->disposable =3D 1; =20 @@ -2424,10 +2413,11 @@ static void __mptcp_close_ssk(struct sock *sk, stru= ct sock *ssk, =20 sock_put(ssk); =20 - if (ssk =3D=3D msk->first) + if (ssk =3D=3D msk->first) { msk->first =3D NULL; + mptcp_dispose_initial_subflow(msk); + } =20 -out: if (ssk =3D=3D msk->last_snd) msk->last_snd =3D NULL; =20 @@ -3270,10 +3260,6 @@ static void mptcp_destroy(struct sock *sk) { struct mptcp_sock *msk =3D mptcp_sk(sk); =20 - /* clears msk->subflow, allowing the following to close - * even the initial subflow - */ - mptcp_dispose_initial_subflow(msk); mptcp_destroy_common(msk, 0); sk_sockets_allocated_dec(sk); } --=20 2.39.0 From nobody Sat Jun 3 17:01:56 2023 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7E9011FB3 for ; Tue, 17 Jan 2023 07:37:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1673941041; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=jBNr5u5HARESPaBwPPbQQt0n5Q7p7/rsZplOVB5vUPk=; b=bKyQcWQZGeS7KBFhb1PDViwc7BfMW9CtSz9E961Wk35VsaKZFH+RrOSvpKOqvFgXhgTni2 kYAgKsKuPf5yoKkAagC8YTnpCxKGU/AggMjpKAV3n5aZK/GZUE9VwfeB39XNTEv2VjxJo3 fZfUf0oeZWoSMg+l+40GT62eq1YUsFk= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-671-sQl5Z7p2MhiPwHz4eniAvg-1; Tue, 17 Jan 2023 02:37:20 -0500 X-MC-Unique: sQl5Z7p2MhiPwHz4eniAvg-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 24DF3381494D for ; Tue, 17 Jan 2023 07:37:20 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.39.192.87]) by smtp.corp.redhat.com (Postfix) with ESMTP id A76BD40C2004 for ; Tue, 17 Jan 2023 07:37:19 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [PATCH mptcp-next v2 10/13] mptcp: fastclose msk when cleaning unaccepted sockets Date: Tue, 17 Jan 2023 08:36:30 +0100 Message-Id: <394c4a5a2cb8764519aa732fc27dad59fa2ba71c.1673940640.git.pabeni@redhat.com> In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.1 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true" When cleaning up unaccepted mptcp socket still laying inside the listener queue at listener close time, such sockets will go through a regular close, waiting for a timeout before shutting down the subflows. There is no need to keep the kernel resources in use for such a possibly long time: short-circuit to fast-close. Signed-off-by: Paolo Abeni --- net/mptcp/protocol.c | 7 +++++-- net/mptcp/subflow.c | 2 +- 2 files changed, 6 insertions(+), 3 deletions(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index b867e3eec5b9..f742d558a1b8 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -2977,10 +2977,13 @@ bool __mptcp_close(struct sock *sk, long timeout) goto cleanup; } =20 - if (mptcp_check_readable(msk)) { - /* the msk has read data, do the MPTCP equivalent of TCP reset */ + if (mptcp_check_readable(msk) || timeout < 0) { + /* If the msk has read data, or the caller explicitly ask it, + * do the MPTCP equivalent of TCP reset, aka MTCP fastclose + */ inet_sk_state_store(sk, TCP_CLOSE); mptcp_do_fastclose(sk); + timeout =3D 0; } else if (mptcp_close_state(sk)) { __mptcp_wr_shutdown(sk); } diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index 6f198d6e1b22..a1f8bb745c1b 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -1856,7 +1856,7 @@ void mptcp_subflow_queue_clean(struct sock *listener_= sk, struct sock *listener_s if (msk->first) sock_hold(msk->first); =20 - do_cancel_work =3D __mptcp_close(sk, 0); + do_cancel_work =3D __mptcp_close(sk, -1); release_sock(sk); if (do_cancel_work) { /* lockdep will report a false positive ABBA deadlock --=20 2.39.0 From nobody Sat Jun 3 17:01:56 2023 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 694E11FB6 for ; Tue, 17 Jan 2023 07:37:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1673941042; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=7L11CLY7hRFXPUClesgTsdftxeVUVk9UeBNwXIYWbmM=; b=Q1gp2UjQACV/NimzmxuDdarHiVKY+rW4vN61/tH2a2GvIPZltD/P6nMXOHnDvBMta4hd4C Xuf0N3nmuvF5uU2Gr3JxvVT3smUiohGxFgbemXYiEt2ik1QbPtVA2e1s763bQCFZzpQqcG IZ511VjPCXeCcqBx0ir8lNqJv7t0bW0= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-338-3gJmsd4yMlOpe4u7cpEZRg-1; Tue, 17 Jan 2023 02:37:21 -0500 X-MC-Unique: 3gJmsd4yMlOpe4u7cpEZRg-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id DF752858F09 for ; Tue, 17 Jan 2023 07:37:20 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.39.192.87]) by smtp.corp.redhat.com (Postfix) with ESMTP id 6C51A40C2006 for ; Tue, 17 Jan 2023 07:37:20 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [PATCH mptcp-next v2 11/13] mptcp: refactor mptcp_stream_accept() Date: Tue, 17 Jan 2023 08:36:31 +0100 Message-Id: In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.1 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true" Rewrite the mptcp socket accept op, partially open-codying inet_accept(), instead of indirectly calling it. This way we can avoid acquiring the new socket lock twice and we can avoid a couple of indirect calls. Signed-off-by: Paolo Abeni --- net/mptcp/protocol.c | 27 +++++++++++++++++++-------- 1 file changed, 19 insertions(+), 8 deletions(-) diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index f742d558a1b8..cf161deb64d8 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -3748,6 +3748,7 @@ static int mptcp_stream_accept(struct socket *sock, s= truct socket *newsock, { struct mptcp_sock *msk =3D mptcp_sk(sock->sk); struct socket *ssock; + struct sock *newsk; int err; =20 pr_debug("msk=3D%p", msk); @@ -3759,16 +3760,26 @@ static int mptcp_stream_accept(struct socket *sock,= struct socket *newsock, if (!ssock) return -EINVAL; =20 - err =3D ssock->ops->accept(sock, newsock, flags, kern); - if (err =3D=3D 0 && !mptcp_is_tcpsk(newsock->sk)) { - struct mptcp_sock *msk =3D mptcp_sk(newsock->sk); + newsk =3D mptcp_accept(sock->sk, flags, &err, kern); + if (!newsk) + return err; + + lock_sock(newsk); + + sock_rps_record_flow(newsk); + WARN_ON(!((1 << newsk->sk_state) & + (TCPF_ESTABLISHED | TCPF_SYN_RECV | + TCPF_CLOSE_WAIT | TCPF_CLOSE))); + + sock_graft(newsk, newsock); + + newsock->state =3D SS_CONNECTED; + if (!mptcp_is_tcpsk(newsock->sk)) { + struct mptcp_sock *msk =3D mptcp_sk(newsk); struct mptcp_subflow_context *subflow; - struct sock *newsk =3D newsock->sk; =20 set_bit(SOCK_CUSTOM_SOCKOPT, &newsock->flags); =20 - lock_sock(newsk); - /* set ssk->sk_socket of accept()ed flows to mptcp socket. * This is needed so NOSPACE flag can be set from tcp stack. */ @@ -3778,10 +3789,10 @@ static int mptcp_stream_accept(struct socket *sock,= struct socket *newsock, if (!ssk->sk_socket) mptcp_sock_graft(ssk, newsock); } - release_sock(newsk); } + release_sock(newsk); =20 - return err; + return 0; } =20 static __poll_t mptcp_check_writeable(struct mptcp_sock *msk) --=20 2.39.0 From nobody Sat Jun 3 17:01:56 2023 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 15AC51FB3 for ; Tue, 17 Jan 2023 07:37:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1673941043; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=vaFPD7Fhlx/TCNyPpZ+Wg+1rHaPKGhM2KhYUP5Nyg1Y=; b=RQbYtdbkhcDxqBcQLkyXkiUcBZLq2icPCGB+GnLDMrc7OmjXgdFGbI98WH5asiSI56ei8B gOenlZOHt91SYV9VeemkkRDQbtzhF5ovUiD7w0gMkvTUn5alvjuOrqctsRaiTWME2QElJO osxHPEmVfdSQ0CD0x2sROA4XzR9HqUs= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-41--a2Fzs8FO5ScmNiRH2U_rA-1; Tue, 17 Jan 2023 02:37:21 -0500 X-MC-Unique: -a2Fzs8FO5ScmNiRH2U_rA-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id A30AE85C6E4 for ; Tue, 17 Jan 2023 07:37:21 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.39.192.87]) by smtp.corp.redhat.com (Postfix) with ESMTP id 31DF940C2004 for ; Tue, 17 Jan 2023 07:37:21 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [PATCH mptcp-next v2 12/13] security, lsm: Introduce security_mptcp_add_subflow() Date: Tue, 17 Jan 2023 08:36:32 +0100 Message-Id: In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.1 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true" MPTCP can create subflows in kernel context, and later indirectly expose them to user-space, via the owning mptcp socket. As discussed in the reported link, the above causes unexpected failures for server, MPTCP-enabled applications. Let's introduce a new LSM hook to allow the security module to relabel the subflow according to the owing process. Note that the new hook requires both the mptcp socket and the new subflow. This could allow future extensions, e.g. explicitly validating the mptcp <-> subflow linkage. Link: https://lore.kernel.org/mptcp/CAHC9VhTNh-YwiyTds=3DP1e3rixEDqbRTFj22b= pya=3D+qJqfcaMfg@mail.gmail.com/ Signed-off-by: Paolo Abeni --- v1 -> v2: - fix build issue with !CONFIG_SECURITY_NETWORK --- include/linux/lsm_hook_defs.h | 1 + include/linux/lsm_hooks.h | 9 +++++++++ include/linux/security.h | 6 ++++++ net/mptcp/subflow.c | 6 ++++++ security/security.c | 5 +++++ 5 files changed, 27 insertions(+) diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h index ed6cb2ac55fa..860e11e3a26b 100644 --- a/include/linux/lsm_hook_defs.h +++ b/include/linux/lsm_hook_defs.h @@ -343,6 +343,7 @@ LSM_HOOK(void, LSM_RET_VOID, sctp_sk_clone, struct sctp= _association *asoc, struct sock *sk, struct sock *newsk) LSM_HOOK(int, 0, sctp_assoc_established, struct sctp_association *asoc, struct sk_buff *skb) +LSM_HOOK(int, 0, mptcp_add_subflow, struct sock *sk, struct sock *ssk) #endif /* CONFIG_SECURITY_NETWORK */ =20 #ifdef CONFIG_SECURITY_INFINIBAND diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h index 0a5ba81f7367..84c9c4d4341e 100644 --- a/include/linux/lsm_hooks.h +++ b/include/linux/lsm_hooks.h @@ -1096,6 +1096,15 @@ * @skb pointer to skbuff of association packet. * Return 0 if permission is granted. * + * Security hooks for MPTCP + * + * @mptcp_add_subflow + * Update the labeling for the given MPTCP subflow, to match to + * owning MPTCP socket. + * @sk: the owning MPTCP socket + * @ssk: the new subflow + * Return 0 if successful, otherwise < 0 error code. + * * Security hooks for Infiniband * * @ib_pkey_access: diff --git a/include/linux/security.h b/include/linux/security.h index 5b67f208f7de..82d50b2ba683 100644 --- a/include/linux/security.h +++ b/include/linux/security.h @@ -1479,6 +1479,7 @@ void security_sctp_sk_clone(struct sctp_association *= asoc, struct sock *sk, struct sock *newsk); int security_sctp_assoc_established(struct sctp_association *asoc, struct sk_buff *skb); +int security_mptcp_add_subflow(struct sock *sk, struct sock *ssk); =20 #else /* CONFIG_SECURITY_NETWORK */ static inline int security_unix_stream_connect(struct sock *sock, @@ -1706,6 +1707,11 @@ static inline int security_sctp_assoc_established(st= ruct sctp_association *asoc, { return 0; } + +static inline int security_mptcp_add_subflow(struct sock *sk, struct sock = *ssk) +{ + return 0; +} #endif /* CONFIG_SECURITY_NETWORK */ =20 #ifdef CONFIG_SECURITY_INFINIBAND diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index a1f8bb745c1b..c9a7460d469a 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -1690,6 +1690,10 @@ int mptcp_subflow_create_socket(struct sock *sk, uns= igned short family, =20 lock_sock_nested(sf->sk, SINGLE_DEPTH_NESTING); =20 + err =3D security_mptcp_add_subflow(sk, sf->sk); + if (err) + goto release_ssk; + /* the newly created socket has to be in the same cgroup as its parent */ mptcp_attach_cgroup(sk, sf->sk); =20 @@ -1702,6 +1706,8 @@ int mptcp_subflow_create_socket(struct sock *sk, unsi= gned short family, get_net_track(net, &sf->sk->ns_tracker, GFP_KERNEL); sock_inuse_add(net, 1); err =3D tcp_set_ulp(sf->sk, "mptcp"); + +release_ssk: release_sock(sf->sk); =20 if (err) { diff --git a/security/security.c b/security/security.c index d1571900a8c7..3491a4fc2b1f 100644 --- a/security/security.c +++ b/security/security.c @@ -2493,6 +2493,11 @@ int security_sctp_assoc_established(struct sctp_asso= ciation *asoc, } EXPORT_SYMBOL(security_sctp_assoc_established); =20 +int security_mptcp_add_subflow(struct sock *sk, struct sock *ssk) +{ + return call_int_hook(mptcp_add_subflow, 0, sk, ssk); +} + #endif /* CONFIG_SECURITY_NETWORK */ =20 #ifdef CONFIG_SECURITY_INFINIBAND --=20 2.39.0 From nobody Sat Jun 3 17:01:56 2023 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3AD481FB6 for ; Tue, 17 Jan 2023 07:37:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1673941044; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0EO/JH4gKngksQOeggNkg/wN952j8ab5v0INEKnLio0=; b=N9aco8VIgqgdfXj2wxpT4UajNVK8QinhzTNo79RGGqwe+CnrOVTAVmoHWex4H06etyXYVw XtoW/gnEuV5wUcude+WUM88Ne4Ac38u5ndZ3ia8W/jUBg7zphvz4Yc9/kRnTSWZyRFH692 FF4SCiWAOgqUd0TpMqUtzOPkIzjHe78= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-19-1-qUnE6fMX6khaE84q-4Hw-1; Tue, 17 Jan 2023 02:37:22 -0500 X-MC-Unique: 1-qUnE6fMX6khaE84q-4Hw-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 691793C0ED43 for ; Tue, 17 Jan 2023 07:37:22 +0000 (UTC) Received: from gerbillo.redhat.com (unknown [10.39.192.87]) by smtp.corp.redhat.com (Postfix) with ESMTP id EA5CF40C2064 for ; Tue, 17 Jan 2023 07:37:21 +0000 (UTC) From: Paolo Abeni To: mptcp@lists.linux.dev Subject: [PATCH mptcp-next v2 13/13] selinux: Implement mptcp_add_subflow hook Date: Tue, 17 Jan 2023 08:36:33 +0100 Message-Id: In-Reply-To: References: Precedence: bulk X-Mailing-List: mptcp@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.1 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"; x-default="true" Newly added subflows should inherit the LSM label from the associated msk socket regarless current context. This patch implements the above copying sid and class from the msk context, deleting the existing subflow label, if any, and then re-creating a new one. The new helper reuses the selinux_netlbl_sk_security_free() function, and the latter can end-up being called multiple times with the same argument; we additionally need to make it idempotent. Signed-off-by: Paolo Abeni --- v1 -> v2: - cleanup selinux_netlbl_sk_security_free() (Paul) - inherit label from msk (Paul) v1 -> v2: - fix build issue with !CONFIG_NETLABEL --- security/selinux/hooks.c | 16 ++++++++++++++++ security/selinux/netlabel.c | 8 ++++++-- 2 files changed, 22 insertions(+), 2 deletions(-) diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c index 3c5be76a9199..1e0ca10a6c02 100644 --- a/security/selinux/hooks.c +++ b/security/selinux/hooks.c @@ -5476,6 +5476,21 @@ static void selinux_sctp_sk_clone(struct sctp_associ= ation *asoc, struct sock *sk selinux_netlbl_sctp_sk_clone(sk, newsk); } =20 +static int selinux_mptcp_add_subflow(struct sock *sk, struct sock *ssk) +{ + struct sk_security_struct *ssksec =3D ssk->sk_security; + struct sk_security_struct *sksec =3D sk->sk_security; + + ssksec->sclass =3D sksec->sclass; + ssksec->sid =3D sksec->sid; + + /* replace the existing subflow label deleting the existing one + * and re-recrating a new label using the current context + */ + selinux_netlbl_sk_security_free(ssksec); + return selinux_netlbl_socket_post_create(ssk, ssk->sk_family); +} + static int selinux_inet_conn_request(const struct sock *sk, struct sk_buff= *skb, struct request_sock *req) { @@ -7216,6 +7231,7 @@ static struct security_hook_list selinux_hooks[] __ls= m_ro_after_init =3D { LSM_HOOK_INIT(sctp_sk_clone, selinux_sctp_sk_clone), LSM_HOOK_INIT(sctp_bind_connect, selinux_sctp_bind_connect), LSM_HOOK_INIT(sctp_assoc_established, selinux_sctp_assoc_established), + LSM_HOOK_INIT(mptcp_add_subflow, selinux_mptcp_add_subflow), LSM_HOOK_INIT(inet_conn_request, selinux_inet_conn_request), LSM_HOOK_INIT(inet_csk_clone, selinux_inet_csk_clone), LSM_HOOK_INIT(inet_conn_established, selinux_inet_conn_established), diff --git a/security/selinux/netlabel.c b/security/selinux/netlabel.c index 1321f15799e2..33187e38def7 100644 --- a/security/selinux/netlabel.c +++ b/security/selinux/netlabel.c @@ -155,8 +155,12 @@ void selinux_netlbl_err(struct sk_buff *skb, u16 famil= y, int error, int gateway) */ void selinux_netlbl_sk_security_free(struct sk_security_struct *sksec) { - if (sksec->nlbl_secattr !=3D NULL) - netlbl_secattr_free(sksec->nlbl_secattr); + if (!sksec->nlbl_secattr) + return; + + netlbl_secattr_free(sksec->nlbl_secattr); + sksec->nlbl_secattr =3D NULL; + sksec->nlbl_state =3D NLBL_UNSET; } =20 /** --=20 2.39.0