X-Received: by 2002:a17:902:da4b:b0:265:47:a7bd with SMTP id d9443c01a7336-297e53e7ce6mr126106965ad.4.1762843495053;
        Mon, 10 Nov 2025 22:44:55 -0800 (PST)
Received: from localhost.localdomain ([116.232.109.229])
        by smtp.gmail.com with ESMTPSA id d9443c01a7336-297e2484bfbsm104161735ad.26.2025.11.10.22.44.49
        (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256);
        Mon, 10 Nov 2025 22:44:54 -0800 (PST)
From: Chuang Wang
To:
Cc: Chuang Wang,
	stable@vger.kernel.org,
	"David S. Miller",
	David Ahern,
	Eric Dumazet,
	Jakub Kicinski,
	Paolo Abeni,
	Simon Horman,
	netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: [PATCH net v1] ipv4: route: Prevent rt_bind_exception() from rebinding stale fnhe
Date: Tue, 11 Nov 2025 14:43:24 +0800
Message-ID: <20251111064328.24440-1-nashuiliang@gmail.com>
X-Mailer: git-send-email 2.50.1
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

The sit driver's packet transmission path calls
sit_tunnel_xmit() -> update_or_create_fnhe(), which leads to
fnhe_remove_oldest() being called to delete entries exceeding
FNHE_RECLAIM_DEPTH+random.

The race window is between fnhe_remove_oldest() selecting fnheX for
deletion and the subsequent kfree_rcu(). During this window, a concurrent
__mkroute_output() -> find_exception() can fetch the soon-to-be-deleted
fnheX, and rt_bind_exception() then binds it to a new dst, taking a
reference with dst_hold(). When the original fnheX is freed via RCU, that
dst reference is leaked permanently.

CPU 0                             CPU 1
__mkroute_output()
  find_exception() [fnheX]
                                  update_or_create_fnhe()
                                    fnhe_remove_oldest() [fnheX]
  rt_bind_exception() [bind dst]
                                  RCU callback [fnheX freed, dst leak]

This issue manifests as a device reference count leak and a warning in
dmesg when unregistering the net device:

  unregister_netdevice: waiting for sitX to become free. Usage count = N

Ido Schimmel provided a simple reproducer to validate the fix [1].

The fix clears 'oldest->fnhe_daddr' before calling fnhe_flush_routes().
Since rt_bind_exception() checks this field, setting it to zero prevents
the stale fnhe from being reused and bound to a new dst just before it is
freed.

[1]
ip netns add ns1
ip -n ns1 link set dev lo up
ip -n ns1 address add 192.0.2.1/32 dev lo
ip -n ns1 link add name dummy1 up type dummy
ip -n ns1 route add 192.0.2.2/32 dev dummy1
ip -n ns1 link add name gretap1 up arp off type gretap \
    local 192.0.2.1 remote 192.0.2.2
ip -n ns1 route add 198.51.0.0/16 dev gretap1
taskset -c 0 ip netns exec ns1 mausezahn gretap1 \
    -A 198.51.100.1 -B 198.51.0.0/16 -t udp -p 1000 -c 0 -q &
taskset -c 2 ip netns exec ns1 mausezahn gretap1 \
    -A 198.51.100.1 -B 198.51.0.0/16 -t udp -p 1000 -c 0 -q &
sleep 10
ip netns pids ns1 | xargs kill
ip netns del ns1

Cc: stable@vger.kernel.org
Fixes: 67d6d681e15b ("ipv4: make exception cache less predictible")
Signed-off-by: Chuang Wang
---
v0 -> v1:
- Expanded commit description to fully document the race condition,
  including the sit driver's call chain and stack trace.
- Added Ido Schimmel's validation method.
---
 net/ipv4/route.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 6d27d3610c1c..b549d6a57307 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -607,6 +607,11 @@ static void fnhe_remove_oldest(struct fnhe_hash_bucket *hash)
 			oldest_p = fnhe_p;
 		}
 	}
+
+	/* Clear oldest->fnhe_daddr to prevent this fnhe from being
+	 * rebound with new dsts in rt_bind_exception().
+	 */
+	oldest->fnhe_daddr = 0;
 	fnhe_flush_routes(oldest);
 	*oldest_p = oldest->fnhe_next;
 	kfree_rcu(oldest, rcu);
-- 
2.47.3
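
For readers who want to see the interleaving without a kernel build, here is
a minimal userspace sketch that replays the CPU 0 / CPU 1 ordering from the
commit message in a single thread. It is not kernel code: every name
prefixed with model_ is hypothetical, the structs are simplified stand-ins
for the fnhe and dst objects, and locking and RCU are left out entirely.
The only behaviour taken from the patch description is that binding is
refused when the daddr field no longer matches, and that the fix zeroes
that field before the routes are flushed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a dst entry: only the reference count. */
struct model_dst {
	int refcnt;
};

/* Hypothetical stand-in for a next-hop exception entry. */
struct model_fnhe {
	uint32_t daddr;          /* models fnhe_daddr; 0 after the fix runs */
	struct model_dst *bound; /* models the route cached on the entry */
};

/* Models the daddr check: bind only while the entry still matches. */
static bool model_bind_exception(struct model_fnhe *fnhe,
				 struct model_dst *dst, uint32_t daddr)
{
	assert(fnhe != NULL && dst != NULL);

	if (fnhe->daddr != daddr)
		return false;

	dst->refcnt++;		/* models dst_hold() */
	fnhe->bound = dst;
	return true;
}

/* Models the removal path; clear_daddr selects patched vs. unpatched. */
static void model_remove_oldest(struct model_fnhe *fnhe, bool clear_daddr)
{
	if (clear_daddr)
		fnhe->daddr = 0;

	/* Models the flush: drop whatever is bound at this moment. */
	if (fnhe->bound != NULL) {
		fnhe->bound->refcnt--;
		fnhe->bound = NULL;
	}
}

/* Replays the interleaving and returns the references left on the dst. */
static int model_replay(bool clear_daddr)
{
	struct model_dst dst = { .refcnt = 0 };
	/* 192.0.2.2 in host byte order, for illustration only. */
	struct model_fnhe fnhe = { .daddr = 0xC0000202u, .bound = NULL };
	uint32_t looked_up = fnhe.daddr;	/* CPU 0: lookup found fnheX */

	model_remove_oldest(&fnhe, clear_daddr);	/* CPU 1: removal */
	model_bind_exception(&fnhe, &dst, looked_up);	/* CPU 0: late bind */

	/* The entry is freed here; nothing drops a late reference. */
	return dst.refcnt;
}
```

In this model, model_replay(false) leaves one reference on the dst (the
leak), while model_replay(true) leaves none because the late bind is
refused once daddr has been cleared.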