From: Florian Fainelli
To: stable@vger.kernel.org
Cc: Ilya Maximets, Friedrich Weber, Aaron Conole, Jakub Kicinski,
    Sasha Levin, Carlos Soto, Florian Fainelli, "David S. Miller",
    Pravin B Shelar, Alexei Starovoitov, Daniel Borkmann,
    Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song,
    John Fastabend, KP Singh, Eric Dumazet, Willem de Bruijn,
    Paolo Abeni, Breno Leitao, Benoît Monin, Yan Zhai, Felix Huettner,
    Joe Stringer, Andy Zhou, Justin Pettit, Thomas Graf, Luca Czesla,
    Simon Horman,
    netdev@vger.kernel.org (open list:NETWORKING [GENERAL]),
    linux-kernel@vger.kernel.org (open list),
    dev@openvswitch.org (open list:OPENVSWITCH),
    bpf@vger.kernel.org (open list:BPF (Safe dynamic programs and tools))
Subject: [PATCH stable 5.4 v2 2/2] openvswitch: fix lockup on tx to unregistering netdev with carrier
Date: Tue, 25 Mar 2025 12:22:20 -0700
Message-Id: <20250325192220.1849902-3-florian.fainelli@broadcom.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250325192220.1849902-1-florian.fainelli@broadcom.com>
References: <20250325192220.1849902-1-florian.fainelli@broadcom.com>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Ilya Maximets

[ Upstream commit 82f433e8dd0629e16681edf6039d094b5518d8ed ]

Commit in a fixes tag attempted to fix the issue in the following
sequence of calls:

    do_output
    -> ovs_vport_send
    -> dev_queue_xmit
    -> __dev_queue_xmit
    -> netdev_core_pick_tx
    -> skb_tx_hash

When device is unregistering, the 'dev->real_num_tx_queues' goes to
zero and the 'while (unlikely(hash >= qcount))' loop inside the
'skb_tx_hash' becomes infinite, locking up the core forever.

But unfortunately, checking just the carrier status is not enough to
fix the issue, because some devices may still be in unregistering
state while reporting carrier status OK.

One example of such device is a net/dummy.  It sets carrier ON
on start, but it doesn't implement .ndo_stop to set the carrier off.
And it makes sense, because dummy doesn't really have a carrier.
Therefore, while this device is unregistering, it's still easy to hit
the infinite loop in the skb_tx_hash() from the OVS datapath.  There
might be other drivers that do the same, but dummy by itself is
important for the OVS ecosystem, because it is frequently used as a
packet sink for tcpdump while debugging OVS deployments.  And when the
issue is hit, the only way to recover is to reboot.

Fix that by also checking if the device is running.  The running
state is handled by the net core during unregistering, so it covers
unregistering case better, and we don't really need to send packets
to devices that are not running anyway.

While only checking the running state might be enough, the carrier
check is preserved.  The running and the carrier states seem disjoined
throughout the code and different drivers.  And other core functions
like __dev_direct_xmit() check both before attempting to transmit
a packet.  So, it seems safer to check both flags in OVS as well.
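The lockup described above can be modelled in user space. The following is only a sketch under stated assumptions: it copies the loop shape quoted in the commit message, while reduce_hash_bounded() and its max_iters safety bound are hypothetical names added for illustration and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * User-space model of the reduction loop quoted in the commit message:
 *
 *     while (unlikely(hash >= qcount))
 *             hash -= qcount;
 *
 * max_iters is a hypothetical bound added for this sketch only, so that
 * the zero-queue case can be observed instead of hanging the program.
 * Returns true and stores the reduced value in *out if the loop ended.
 */
bool reduce_hash_bounded(uint32_t hash, uint32_t qcount,
			 unsigned int max_iters, uint32_t *out)
{
	unsigned int i;

	assert(out != NULL);

	for (i = 0; hash >= qcount; i++) {
		if (i == max_iters)
			return false;	/* the unbounded loop would still spin */
		hash -= qcount;		/* makes no progress when qcount == 0 */
	}

	*out = hash;
	return true;
}
```

With a non-zero qcount the call returns true after a few subtractions. With qcount set to zero, which models 'dev->real_num_tx_queues' on an unregistering device, the condition is always true for an unsigned hash and the function gives up at the bound, which is where the real loop would spin forever.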

Fixes: 066b86787fa3 ("net: openvswitch: fix race on port output")
Reported-by: Friedrich Weber
Closes: https://mail.openvswitch.org/pipermail/ovs-discuss/2025-January/053423.html
Signed-off-by: Ilya Maximets
Tested-by: Friedrich Weber
Reviewed-by: Aaron Conole
Link: https://patch.msgid.link/20250109122225.4034688-1-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski
Signed-off-by: Sasha Levin
Signed-off-by: Carlos Soto
Signed-off-by: Florian Fainelli
---
 net/openvswitch/actions.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index aec20faadfcc..815a55fa7356 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -920,7 +920,9 @@ static void do_output(struct datapath *dp, struct sk_buff *skb, int out_port,
 {
 	struct vport *vport = ovs_vport_rcu(dp, out_port);
 
-	if (likely(vport && netif_carrier_ok(vport->dev))) {
+	if (likely(vport &&
+		   netif_running(vport->dev) &&
+		   netif_carrier_ok(vport->dev))) {
 		u16 mru = OVS_CB(skb)->mru;
 		u32 cutlen = OVS_CB(skb)->cutlen;
 
-- 
2.34.1
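For readers comparing the old and new conditions in do_output(), here is a small user-space sketch. It is not kernel code: struct fake_dev, old_can_output(), new_can_output() and demo_unregistering_dummy() are hypothetical stand-ins, with the two booleans modelling what netif_running() and netif_carrier_ok() would report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two device states tested by the patch. */
struct fake_dev {
	bool running;		/* models netif_running(dev) */
	bool carrier_ok;	/* models netif_carrier_ok(dev) */
};

/* Condition before the patch: carrier check only. */
bool old_can_output(const struct fake_dev *dev)
{
	return dev != NULL && dev->carrier_ok;
}

/* Condition after the patch: running and carrier must both hold. */
bool new_can_output(const struct fake_dev *dev)
{
	return dev != NULL && dev->running && dev->carrier_ok;
}

/*
 * Models the net/dummy case from the commit message: the device is no
 * longer running while it unregisters, yet still reports carrier OK.
 */
void demo_unregistering_dummy(void)
{
	struct fake_dev dummy = { .running = false, .carrier_ok = true };

	assert(old_can_output(&dummy));		/* old check lets the skb through */
	assert(!new_can_output(&dummy));	/* new check skips the transmit */
}
```

Calling demo_unregistering_dummy() exercises the one state combination in which the two conditions disagree, which is the case the commit message describes for dummy devices.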