From nobody Thu Sep 18 23:35:48 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id F2F1FC4332F for ; Fri, 2 Dec 2022 00:24:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232071AbiLBAYG (ORCPT ); Thu, 1 Dec 2022 19:24:06 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34392 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231860AbiLBAXa (ORCPT ); Thu, 1 Dec 2022 19:23:30 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 20C86D3DFA for ; Thu, 1 Dec 2022 16:19:18 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1669940357; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0ItRdbm0UyoTawpAty3EqeXI5my06iwbuvMc90NvWqA=; b=Ha8pa+XxPLR3sSq+LelSWqNCcZK+EnALXfRYjQ/aOyLNoajm2JRXZBNecvgX1BkfgkWSLN LKuaBmxRbqNVkd3dcKhgFThQe2jFcximVmBIyf+8I8u03WwyC6gef3MuavrSQws1TzbmfV Bf9YmWxlWWRvTL5SszrbfRiY6urzttg= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-201-6fBdFN06N4KNyQyVfp0jPQ-1; Thu, 01 Dec 2022 19:19:14 -0500 X-MC-Unique: 6fBdFN06N4KNyQyVfp0jPQ-1 Received: from smtp.corp.redhat.com (int-mx10.intmail.prod.int.rdu2.redhat.com [10.11.54.10]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id CCF2085A59D; Fri, 2 Dec 2022 00:19:13 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.36]) by smtp.corp.redhat.com (Postfix) with ESMTP id 18F5E492B11; Fri, 2 Dec 2022 00:19:13 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 Subject: [PATCH net-next 29/36] rxrpc: Reduce the use of RCU in packet input From: David Howells To: netdev@vger.kernel.org Cc: Marc Dionne , linux-afs@lists.infradead.org, dhowells@redhat.com, linux-afs@lists.infradead.org, linux-kernel@vger.kernel.org Date: Fri, 02 Dec 2022 00:19:10 +0000 Message-ID: <166994035029.1732290.5164470094736354349.stgit@warthog.procyon.org.uk> In-Reply-To: <166994010342.1732290.13771061038178613124.stgit@warthog.procyon.org.uk> References: <166994010342.1732290.13771061038178613124.stgit@warthog.procyon.org.uk> User-Agent: StGit/1.5 MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.1 on 10.11.54.10 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Shrink the region of rxrpc_input_packet() that is covered by the RCU read lock so that it only covers the connection and call lookup. This means that the bits now outside of that can call sleepable functions such as kmalloc and sendmsg. Also take a ref on the conn or call we're going to use before we drop the RCU read lock. Signed-off-by: David Howells cc: Marc Dionne cc: linux-afs@lists.infradead.org --- net/rxrpc/ar-internal.h | 3 +- net/rxrpc/call_accept.c | 13 ++------- net/rxrpc/input.c | 7 ++--- net/rxrpc/io_thread.c | 68 ++++++++++++++++++++++++++++++++++++-------= ---- 4 files changed, 59 insertions(+), 32 deletions(-) diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h index 6af7298af39b..cfd16f1e5c83 100644 --- a/net/rxrpc/ar-internal.h +++ b/net/rxrpc/ar-internal.h @@ -961,8 +961,7 @@ void rxrpc_unpublish_service_conn(struct rxrpc_connecti= on *); * input.c */ void rxrpc_input_call_event(struct rxrpc_call *, struct sk_buff *); -void rxrpc_input_implicit_end_call(struct rxrpc_sock *, struct rxrpc_conne= ction *, - struct rxrpc_call *); +void rxrpc_input_implicit_end_call(struct rxrpc_connection *, struct rxrpc= _call *); =20 /* * io_thread.c diff --git a/net/rxrpc/call_accept.c b/net/rxrpc/call_accept.c index 5f978b0b2404..beb8efa2e7a9 100644 --- a/net/rxrpc/call_accept.c +++ b/net/rxrpc/call_accept.c @@ -336,13 +336,13 @@ static struct rxrpc_call *rxrpc_alloc_incoming_call(s= truct rxrpc_sock *rx, * If this is for a kernel service, when we allocate the call, it will have * three refs on it: (1) the kernel service, (2) the user_call_ID tree, (3= ) the * retainer ref obtained from the backlog buffer. Prealloc calls for user= space - * services only have the ref from the backlog buffer. We want to pass th= is - * ref to non-BH context to dispose of. + * services only have the ref from the backlog buffer. We pass this ref t= o the + * caller. * * If we want to report an error, we mark the skb with the packet type and * abort code and return NULL. * - * The call is returned with the user access mutex held. + * The call is returned with the user access mutex held and a ref on it. */ struct rxrpc_call *rxrpc_new_incoming_call(struct rxrpc_local *local, struct rxrpc_sock *rx, @@ -426,13 +426,6 @@ struct rxrpc_call *rxrpc_new_incoming_call(struct rxrp= c_local *local, =20 rxrpc_send_ping(call, skb); =20 - /* We have to discard the prealloc queue's ref here and rely on a - * combination of the RCU read lock and refs held either by the socket - * (recvmsg queue, to-be-accepted queue or user ID tree) or the kernel - * service to prevent the call from being deallocated too early. - */ - rxrpc_put_call(call, rxrpc_call_put_discard_prealloc); - if (hlist_unhashed(&call->error_link)) { spin_lock(&call->peer->lock); hlist_add_head(&call->error_link, &call->peer->error_targets); diff --git a/net/rxrpc/input.c b/net/rxrpc/input.c index 42addbcf59f9..01d32f817a7a 100644 --- a/net/rxrpc/input.c +++ b/net/rxrpc/input.c @@ -1072,8 +1072,7 @@ void rxrpc_input_call_event(struct rxrpc_call *call, = struct sk_buff *skb) * * TODO: If callNumber > call_id + 1, renegotiate security. */ -void rxrpc_input_implicit_end_call(struct rxrpc_sock *rx, - struct rxrpc_connection *conn, +void rxrpc_input_implicit_end_call(struct rxrpc_connection *conn, struct rxrpc_call *call) { switch (READ_ONCE(call->state)) { @@ -1091,7 +1090,7 @@ void rxrpc_input_implicit_end_call(struct rxrpc_sock = *rx, break; } =20 - spin_lock(&rx->incoming_lock); + spin_lock(&conn->bundle->channel_lock); __rxrpc_disconnect_call(conn, call); - spin_unlock(&rx->incoming_lock); + spin_unlock(&conn->bundle->channel_lock); } diff --git a/net/rxrpc/io_thread.c b/net/rxrpc/io_thread.c index 91b8ba5b90db..3b6927610677 100644 --- a/net/rxrpc/io_thread.c +++ b/net/rxrpc/io_thread.c @@ -257,6 +257,8 @@ static int rxrpc_input_packet(struct rxrpc_local *local= , struct sk_buff **_skb) if (sp->hdr.serviceId =3D=3D 0) goto bad_message; =20 + rcu_read_lock(); + if (rxrpc_to_server(sp)) { /* Weed out packets to services we're not offering. Packets * that would begin a call are explicitly rejected and the rest @@ -264,7 +266,9 @@ static int rxrpc_input_packet(struct rxrpc_local *local= , struct sk_buff **_skb) */ rx =3D rcu_dereference(local->service); if (!rx || (sp->hdr.serviceId !=3D rx->srx.srx_service && - sp->hdr.serviceId !=3D rx->second_service)) { + sp->hdr.serviceId !=3D rx->second_service) + ) { + rcu_read_unlock(); if (sp->hdr.type =3D=3D RXRPC_PACKET_TYPE_DATA && sp->hdr.seq =3D=3D 1) goto unsupported_service; @@ -293,7 +297,12 @@ static int rxrpc_input_packet(struct rxrpc_local *loca= l, struct sk_buff **_skb) if (sp->hdr.callNumber =3D=3D 0) { /* Connection-level packet */ _debug("CONN %p {%d}", conn, conn->debug_id); - rxrpc_post_packet_to_conn(conn, skb); + conn =3D rxrpc_get_connection_maybe(conn, rxrpc_conn_get_conn_input); + rcu_read_unlock(); + if (conn) { + rxrpc_post_packet_to_conn(conn, skb); + rxrpc_put_connection(conn, rxrpc_conn_put_conn_input); + } return 0; } =20 @@ -305,20 +314,26 @@ static int rxrpc_input_packet(struct rxrpc_local *loc= al, struct sk_buff **_skb) chan =3D &conn->channels[channel]; =20 /* Ignore really old calls */ - if (sp->hdr.callNumber < chan->last_call) + if (sp->hdr.callNumber < chan->last_call) { + rcu_read_unlock(); return 0; + } =20 if (sp->hdr.callNumber =3D=3D chan->last_call) { if (chan->call || - sp->hdr.type =3D=3D RXRPC_PACKET_TYPE_ABORT) + sp->hdr.type =3D=3D RXRPC_PACKET_TYPE_ABORT) { + rcu_read_unlock(); return 0; + } =20 /* For the previous service call, if completed * successfully, we discard all further packets. */ if (rxrpc_conn_is_service(conn) && - chan->last_type =3D=3D RXRPC_PACKET_TYPE_ACK) + chan->last_type =3D=3D RXRPC_PACKET_TYPE_ACK) { + rcu_read_unlock(); return 0; + } =20 /* But otherwise we need to retransmit the final packet * from data cached in the connection record. @@ -328,20 +343,32 @@ static int rxrpc_input_packet(struct rxrpc_local *loc= al, struct sk_buff **_skb) sp->hdr.seq, sp->hdr.serial, sp->hdr.flags); - rxrpc_post_packet_to_conn(conn, skb); + conn =3D rxrpc_get_connection_maybe(conn, rxrpc_conn_get_call_input); + rcu_read_unlock(); + if (conn) { + rxrpc_post_packet_to_conn(conn, skb); + rxrpc_put_connection(conn, rxrpc_conn_put_call_input); + } return 0; } =20 call =3D rcu_dereference(chan->call); =20 if (sp->hdr.callNumber > chan->call_id) { - if (rxrpc_to_client(sp)) + if (rxrpc_to_client(sp)) { + rcu_read_unlock(); goto reject_packet; - if (call) - rxrpc_input_implicit_end_call(rx, conn, call); - call =3D NULL; + } + if (call) { + rxrpc_input_implicit_end_call(conn, call); + chan->call =3D NULL; + call =3D NULL; + } } =20 + if (call && !rxrpc_try_get_call(call, rxrpc_call_get_input)) + call =3D NULL; + if (call) { if (sp->hdr.serviceId !=3D call->dest_srx.srx_service) call->dest_srx.srx_service =3D sp->hdr.serviceId; @@ -352,23 +379,33 @@ static int rxrpc_input_packet(struct rxrpc_local *loc= al, struct sk_buff **_skb) } } =20 - if (!call || refcount_read(&call->ref) =3D=3D 0) { + if (!call) { if (rxrpc_to_client(sp) || - sp->hdr.type !=3D RXRPC_PACKET_TYPE_DATA) + sp->hdr.type !=3D RXRPC_PACKET_TYPE_DATA) { + rcu_read_unlock(); goto bad_message; - if (sp->hdr.seq !=3D 1) + } + if (sp->hdr.seq !=3D 1) { + rcu_read_unlock(); return 0; + } call =3D rxrpc_new_incoming_call(local, rx, skb); - if (!call) + if (!call) { + rcu_read_unlock(); goto reject_packet; + } } =20 + rcu_read_unlock(); + /* Process a call packet. */ rxrpc_input_call_event(call, skb); + rxrpc_put_call(call, rxrpc_call_put_input); trace_rxrpc_rx_done(0, 0); return 0; =20 wrong_security: + rcu_read_unlock(); trace_rxrpc_abort(0, "SEC", sp->hdr.cid, sp->hdr.callNumber, sp->hdr.seq, RXKADINCONSISTENCY, EBADMSG); skb->priority =3D RXKADINCONSISTENCY; @@ -381,6 +418,7 @@ static int rxrpc_input_packet(struct rxrpc_local *local= , struct sk_buff **_skb) goto post_abort; =20 reupgrade: + rcu_read_unlock(); trace_rxrpc_abort(0, "UPG", sp->hdr.cid, sp->hdr.callNumber, sp->hdr.seq, RX_PROTOCOL_ERROR, EBADMSG); goto protocol_error; @@ -433,9 +471,7 @@ int rxrpc_io_thread(void *data) switch (skb->mark) { case RXRPC_SKB_MARK_PACKET: skb->priority =3D 0; - rcu_read_lock(); rxrpc_input_packet(local, &skb); - rcu_read_unlock(); trace_rxrpc_rx_done(skb->mark, skb->priority); rxrpc_free_skb(skb, rxrpc_skb_put_input); break;