From: Jiayuan Chen
To: netdev@vger.kernel.org
Cc: Jiayuan Chen, syzbot+e738404dcd14b620923c@syzkaller.appspotmail.com,
    Jiayuan Chen, "David S. Miller", David Ahern, Eric Dumazet,
    Jakub Kicinski, Paolo Abeni, Simon Horman, Herbert Xu,
    linux-kernel@vger.kernel.org
Subject: [PATCH net v3] xfrm: fix ip_rt_bug race in icmp_route_lookup reverse path
Date: Thu, 5 Feb 2026 15:02:02 +0800
Message-ID: <20260205070203.61560-1-jiayuan.chen@linux.dev>

From: Jiayuan Chen

icmp_route_lookup() performs multiple route lookups to find a suitable
route for sending ICMP error messages, with special handling for XFRM
(IPsec) policies.

The lookup sequence is:
1. First, lookup output route for ICMP reply (dst = original src)
2. Pass through xfrm_lookup() for policy check
3. If blocked (-EPERM) or dst is not local, enter "reverse path"
4. In reverse path, call xfrm_decode_session_reverse() to get fl4_dec
   which reverses the original packet's flow (saddr<->daddr swapped)
5. If fl4_dec.saddr is local (we are the original destination), use
   __ip_route_output_key() for output route lookup
6. If fl4_dec.saddr is NOT local (we are a forwarding node), use
   ip_route_input() to simulate the reverse packet's input path
7. Finally, pass rt2 through xfrm_lookup() with XFRM_LOOKUP_ICMP flag

The bug occurs in step 6: ip_route_input() is called with fl4_dec.daddr
(original packet's source) as destination. If this address becomes local
between the initial check and ip_route_input() call (e.g., due to
concurrent "ip addr add"), ip_route_input() returns a LOCAL route with
dst.output set to ip_rt_bug.

This route is then used for ICMP output, causing dst_output() to call
ip_rt_bug(), triggering a WARN_ON:

 ------------[ cut here ]------------
 WARNING: net/ipv4/route.c:1275 at ip_rt_bug+0x21/0x30, CPU#1
 Call Trace:
  ip_push_pending_frames+0x202/0x240
  icmp_push_reply+0x30d/0x430
  __icmp_send+0x1149/0x24f0
  ip_options_compile+0xa2/0xd0
  ip_rcv_finish_core+0x829/0x1950
  ip_rcv+0x2d7/0x420
  __netif_receive_skb_one_core+0x185/0x1f0
  netif_receive_skb+0x90/0x450
  tun_get_user+0x3413/0x3fb0
  tun_chr_write_iter+0xe4/0x220
  ...

Fix this by checking rt2->rt_type after ip_route_input(). If it's
RTN_LOCAL, the route cannot be used for output, so treat it as an error.

The reproducer requires kernel modification to widen the race window,
making it unsuitable as a selftest. It is available at:

  https://gist.github.com/mrpre/eae853b72ac6a750f5d45d64ddac1e81

Reported-by: syzbot+e738404dcd14b620923c@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/000000000000b1060905eada8881@google.com/T/
Closes: https://lore.kernel.org/r/20260128090523.356953-1-jiayuan.chen@linux.dev
Fixes: 8b7817f3a959 ("[IPSEC]: Add ICMP host relookup support")
Signed-off-by: Jiayuan Chen
Signed-off-by: Jiayuan Chen
---
v1 -> v3:
Suggested by Paolo Abeni:
- Resend it using net tree and using xfrm prefix
- Fix text string over 80 chars limit.
- Simplify commit message.
v1: https://lore.kernel.org/r/20260128090523.356953-1-jiayuan.chen@linux.dev
v2: https://lore.kernel.org/netdev/20260203063449.44737-1-jiayuan.chen@linux.dev/
---
 net/ipv4/icmp.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index 4abbec2f47ef..35816ac749bc 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -554,6 +554,22 @@ static struct rtable *icmp_route_lookup(struct net *net, struct flowi4 *fl4,
 		/* steal dst entry from skb_in, don't drop refcnt */
 		skb_dstref_steal(skb_in);
 		skb_dstref_restore(skb_in, orefdst);
+
+		/*
+		 * At this point, fl4_dec.daddr should NOT be local (we
+		 * checked fl4_dec.saddr above). However, a race condition
+		 * may occur if the address is added to the interface
+		 * concurrently. In that case, ip_route_input() returns a
+		 * LOCAL route with dst.output=ip_rt_bug, which must not
+		 * be used for output.
+		 */
+		if (!err && rt2 && rt2->rt_type == RTN_LOCAL) {
+			net_warn_ratelimited("detected local route for %pI4 "
+					     "during ICMP sending, src %pI4\n",
+					     &fl4_dec.daddr, &fl4_dec.saddr);
+			dst_release(&rt2->dst);
+			err = -EINVAL;
+		}
 	}
 
 	if (err)
-- 
2.43.0
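
For readers who want to reason about the added check outside the kernel
tree, here is a minimal user-space sketch of its control flow. It is an
illustration only, not the kernel code: every model_* name, the
simplified refcnt field, and the enum values are hypothetical stand-ins
for struct rtable, dst_release() and RTN_LOCAL, and the ratelimited
warning is omitted.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; these are NOT the kernel definitions. */
enum model_rt_type {
	MODEL_RTN_UNICAST,
	MODEL_RTN_LOCAL,
};

struct model_rtable {
	enum model_rt_type rt_type;
	int refcnt;	/* simplified reference count, assumption for the model */
};

/* Stand-in for dst_release(): drop one reference on the route. */
static void model_dst_release(struct model_rtable *rt)
{
	rt->refcnt--;
}

/*
 * Mirrors the shape of the check added after ip_route_input():
 * if the input lookup succeeded but produced a LOCAL route, release
 * the route and report -EINVAL so it is never used for output.
 * Any earlier error is passed through unchanged.
 */
static int model_check_reverse_route(int err, struct model_rtable *rt2)
{
	if (!err && rt2 && rt2->rt_type == MODEL_RTN_LOCAL) {
		model_dst_release(rt2);
		return -EINVAL;
	}
	return err;
}
```

In this model, a successful lookup that yields MODEL_RTN_LOCAL is turned
into -EINVAL and its reference is dropped, while a unicast route or an
already-failed lookup passes through untouched, which is the behaviour
the patch description asks for.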