From nobody Fri Dec 26 05:27:31 2025 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3F64439FF1 for ; Tue, 9 Jan 2024 15:10:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="I5B9sWXA" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1704813054; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=tkpr1RhaJrAP/oNWGGaaG9PEfXHE0nB87Qi7Q623/H0=; b=I5B9sWXA7Sj7C7nTAz6QySrfeOk+L399Ttm7n1WMbg/Dzq00d//HZoVuqZMvEAmETkAhHF e58AypO+IpTmrjA7qTIJssHXzyMzo2+Ac1989qXIbIhH8T/PSMeEDxAAf1aKyz/by281B/ GIqgeH9/tiWb9i8w7Hhaeg4PRdo0h0U= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-610-LZhzqSeqNpqZ_akE8bFssQ-1; Tue, 09 Jan 2024 10:10:50 -0500 X-MC-Unique: LZhzqSeqNpqZ_akE8bFssQ-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 78429185A780; Tue, 9 Jan 2024 15:10:50 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.42.28.67]) by smtp.corp.redhat.com (Postfix) with ESMTP id 1F8031121306; Tue, 9 Jan 2024 15:10:49 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 From: David Howells To: Marc Dionne cc: dhowells@redhat.com, "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , linux-afs@lists.infradead.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH net v3] rxrpc: Fix use of Don't Fragment flag Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-ID: <1581851.1704813048.1@warthog.procyon.org.uk> Content-Transfer-Encoding: quoted-printable Date: Tue, 09 Jan 2024 15:10:48 +0000 Message-ID: <1581852.1704813048@warthog.procyon.org.uk> X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.3 Content-Type: text/plain; charset="utf-8" =20 rxrpc normally has the Don't Fragment flag set on the UDP packets it transmits, except when it has decided that DATA packets aren't getting through - in which case it turns it off just for the DATA transmissions. This can be a problem, however, for RESPONSE packets that convey authentication and crypto data from the client to the server as ticket may be larger than can fit in the MTU. In such a case, rxrpc gets itself into an infinite loop as the sendmsg returns an error (EMSGSIZE), which causes rxkad_send_response() to return -EAGAIN - and the CHALLENGE packet is put back on the Rx queue to retry, leading to the I/O thread endlessly attempting to perform the transmission. Fix this by disabling DF on RESPONSE packets for now. The use of DF and best data MTU determination needs reconsidering at some point in the future. Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by u= serspace and kernel both") Reported-by: Marc Dionne Signed-off-by: David Howells cc: "David S. Miller" cc: Eric Dumazet cc: Jakub Kicinski cc: Paolo Abeni cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org Acked-by: Paolo Abeni --- net/rxrpc/ar-internal.h | 1 + net/rxrpc/local_object.c | 13 ++++++++++++- net/rxrpc/output.c | 6 ++---- net/rxrpc/rxkad.c | 2 ++ 4 files changed, 17 insertions(+), 5 deletions(-) diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h index e8e14c6f904d..e8b43408136a 100644 --- a/net/rxrpc/ar-internal.h +++ b/net/rxrpc/ar-internal.h @@ -1076,6 +1076,7 @@ void rxrpc_send_version_request(struct rxrpc_local *l= ocal, /* * local_object.c */ +void rxrpc_local_dont_fragment(const struct rxrpc_local *local, bool set); struct rxrpc_local *rxrpc_lookup_local(struct net *, const struct sockaddr= _rxrpc *); struct rxrpc_local *rxrpc_get_local(struct rxrpc_local *, enum rxrpc_local= _trace); struct rxrpc_local *rxrpc_get_local_maybe(struct rxrpc_local *, enum rxrpc= _local_trace); diff --git a/net/rxrpc/local_object.c b/net/rxrpc/local_object.c index c553a30e9c83..34d307368135 100644 --- a/net/rxrpc/local_object.c +++ b/net/rxrpc/local_object.c @@ -36,6 +36,17 @@ static void rxrpc_encap_err_rcv(struct sock *sk, struct = sk_buff *skb, int err, return ipv6_icmp_error(sk, skb, err, port, info, payload); } =20 +/* + * Set or clear the Don't Fragment flag on a socket. + */ +void rxrpc_local_dont_fragment(const struct rxrpc_local *local, bool set) +{ + if (set) + ip_sock_set_mtu_discover(local->socket->sk, IP_PMTUDISC_DO); + else + ip_sock_set_mtu_discover(local->socket->sk, IP_PMTUDISC_DONT); +} + /* * Compare a local to an address. Return -ve, 0 or +ve to indicate less t= han, * same or greater than. @@ -203,7 +214,7 @@ static int rxrpc_open_socket(struct rxrpc_local *local,= struct net *net) ip_sock_set_recverr(usk); =20 /* we want to set the don't fragment bit */ - ip_sock_set_mtu_discover(usk, IP_PMTUDISC_DO); + rxrpc_local_dont_fragment(local, true); =20 /* We want receive timestamps. */ sock_enable_timestamps(usk); diff --git a/net/rxrpc/output.c b/net/rxrpc/output.c index 5e53429c6922..a0906145e829 100644 --- a/net/rxrpc/output.c +++ b/net/rxrpc/output.c @@ -494,14 +494,12 @@ int rxrpc_send_data_packet(struct rxrpc_call *call, s= truct rxrpc_txbuf *txb) switch (conn->local->srx.transport.family) { case AF_INET6: case AF_INET: - ip_sock_set_mtu_discover(conn->local->socket->sk, - IP_PMTUDISC_DONT); + rxrpc_local_dont_fragment(conn->local, false); rxrpc_inc_stat(call->rxnet, stat_tx_data_send_frag); ret =3D do_udp_sendmsg(conn->local->socket, &msg, len); conn->peer->last_tx_at =3D ktime_get_seconds(); =20 - ip_sock_set_mtu_discover(conn->local->socket->sk, - IP_PMTUDISC_DO); + rxrpc_local_dont_fragment(conn->local, true); break; =20 default: diff --git a/net/rxrpc/rxkad.c b/net/rxrpc/rxkad.c index 1bf571a66e02..b52dedcebce0 100644 --- a/net/rxrpc/rxkad.c +++ b/net/rxrpc/rxkad.c @@ -724,7 +724,9 @@ static int rxkad_send_response(struct rxrpc_connection = *conn, serial =3D atomic_inc_return(&conn->serial); whdr.serial =3D htonl(serial); =20 + rxrpc_local_dont_fragment(conn->local, false); ret =3D kernel_sendmsg(conn->local->socket, &msg, iov, 3, len); + rxrpc_local_dont_fragment(conn->local, true); if (ret < 0) { trace_rxrpc_tx_fail(conn->debug_id, serial, ret, rxrpc_tx_point_rxkad_response);