From: Jiayuan Chen
To: netdev@vger.kernel.org, edumazet@google.com
Cc: Jiayuan Chen, Jiayuan Chen, Jay Vosburgh, Andrew Lunn, "David S. Miller", Jakub Kicinski, Paolo Abeni, Shuah Khan, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org
Subject: [PATCH net v1] selftests: bonding: add test for stacked bond header_parse recursion
Date: Sat, 14 Mar 2026 21:42:05 +0800
Message-ID: <20260314134211.33405-1-jiayuan.chen@linux.dev>

From: Jiayuan Chen

Add a selftest to reproduce the infinite recursion in bond_header_parse()
when bonds are stacked (bond1 -> bond0 -> gre). When a packet is received
via AF_PACKET SOCK_DGRAM on the topmost bond, dev_parse_header() calls
bond_header_parse() which used skb->dev (always the topmost bond) to get
the bonding struct. This caused it to recurse back into itself
indefinitely, leading to stack overflow.

Before Eric's fix [2], the test triggers:

  ./bond-stacked-header-parse.sh

  [ 71.999481] BUG: MAX_LOCK_DEPTH too low!
  [ 72.000170] turning off the locking correctness validator.
  [ 72.001029] Please attach the output of /proc/lock_stat to the bug report
  [ 72.002079] depth: 48 max: 48!
  ...

After Eric's fix [2], everything works fine:

  ./bond-stacked-header-parse.sh
  TEST: Stacked bond header_parse does not recurse [ OK ]

Also verified via make run_tests -C drivers/net/bonding:

  ...
  ok 3 selftests: drivers/net/bonding: bond-eth-type-change.sh
  # timeout set to 1200
  # selftests: drivers/net/bonding: bond-stacked-header-parse.sh
  # TEST: Stacked bond header_parse does not recurse [ OK ]
  ok 4 selftests: drivers/net/bonding: bond-stacked-header-parse.sh
  # timeout set to 1200
  # selftests: drivers/net/bonding: bond-lladdr-target.sh
  # PASS
  ...

[1] https://lore.kernel.org/netdev/CANn89iK2EURqsjtd=OVP4awYTJHGcR-UU-V9WovpWR1Z3f03oQ@mail.gmail.com/
[2] https://lore.kernel.org/netdev/20260314115650.3646361-1-edumazet@google.com/

Cc: Jiayuan Chen
Signed-off-by: Jiayuan Chen
---
 .../selftests/drivers/net/bonding/Makefile    |   1 +
 .../net/bonding/bond-stacked-header-parse.sh  | 142 ++++++++++++++++++
 2 files changed, 143 insertions(+)
 create mode 100755 tools/testing/selftests/drivers/net/bonding/bond-stacked-header-parse.sh

diff --git a/tools/testing/selftests/drivers/net/bonding/Makefile b/tools/testing/selftests/drivers/net/bonding/Makefile
index 6c5c60adb5e8..055f6af03b5d 100644
--- a/tools/testing/selftests/drivers/net/bonding/Makefile
+++ b/tools/testing/selftests/drivers/net/bonding/Makefile
@@ -5,6 +5,7 @@ TEST_PROGS := \
 	bond-arp-interval-causes-panic.sh \
 	bond-break-lacpdu-tx.sh \
 	bond-eth-type-change.sh \
+	bond-stacked-header-parse.sh \
 	bond-lladdr-target.sh \
 	bond_ipsec_offload.sh \
 	bond_lacp_prio.sh \
diff --git a/tools/testing/selftests/drivers/net/bonding/bond-stacked-header-parse.sh b/tools/testing/selftests/drivers/net/bonding/bond-stacked-header-parse.sh
new file mode 100755
index 000000000000..d377bedaef63
--- /dev/null
+++ b/tools/testing/selftests/drivers/net/bonding/bond-stacked-header-parse.sh
@@ -0,0 +1,142 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# Test that bond_header_parse() does not infinitely recurse with stacked bonds.
+#
+# When a non-Ethernet device (e.g. GRE) is enslaved to a bond that is itself
+# enslaved to another bond (bond1 -> bond0 -> gre), receiving a packet via
+# AF_PACKET SOCK_DGRAM triggers dev_parse_header() -> bond_header_parse().
+# Since parse() used skb->dev (always the topmost bond) instead of a passed-in
+# dev pointer, it would recurse back into itself indefinitely.
+
+ALL_TESTS="
+	bond_test_stacked_header_parse
+"
+REQUIRE_MZ=no
+NUM_NETIFS=0
+lib_dir=$(dirname "$0")
+source "$lib_dir"/../../../net/forwarding/lib.sh
+
+require_command()
+{
+	if ! command -v "$1" &>/dev/null; then
+		echo "SKIP: $1 is not installed"
+		exit "$ksft_skip"
+	fi
+}
+
+bond_test_stacked_header_parse()
+{
+	local devdummy="test-dummy0"
+	local devgre="test-gre0"
+	local devbond0="test-bond0"
+	local devbond1="test-bond1"
+
+	RET=0
+
+	# Setup: dummy -> gre -> bond0 -> bond1
+	modprobe dummy 2>/dev/null
+	modprobe ip_gre 2>/dev/null
+	modprobe bonding 2>/dev/null
+
+	ip link add name "$devdummy" type dummy
+	if [ $? -ne 0 ]; then
+		log_test_skip "could not create dummy device (CONFIG_DUMMY)"
+		return
+	fi
+	ip addr add 10.0.0.1/24 dev "$devdummy"
+	ip link set "$devdummy" up
+
+	ip link add name "$devgre" type gre local 10.0.0.1
+	if [ $? -ne 0 ]; then
+		log_test_skip "could not create GRE device (CONFIG_NET_IPGRE)"
+		ip link del "$devdummy" 2>/dev/null
+		return
+	fi
+
+	ip link add name "$devbond0" type bond mode active-backup
+	check_err $? "could not create bond0"
+	ip link add name "$devbond1" type bond mode active-backup
+	check_err $? "could not create bond1"
+
+	ip link set "$devgre" master "$devbond0"
+	check_err $? "could not enslave $devgre to $devbond0"
+	ip link set "$devbond0" master "$devbond1"
+	check_err $? "could not enslave $devbond0 to $devbond1"
+
+	ip link set "$devgre" up
+	ip link set "$devbond0" up
+	ip link set "$devbond1" up
+
+	# Send a GRE-encapsulated packet to 10.0.0.1 while an AF_PACKET
+	# SOCK_DGRAM socket is listening on bond1. The receive path calls
+	# dev_parse_header() which invokes bond_header_parse(). With the
+	# bug, this recurses infinitely and causes a stack overflow.
+	#
+	# Use Python to:
+	# 1. Open AF_PACKET SOCK_DGRAM on bond1
+	# 2. Send a GRE packet to 10.0.0.1 via raw socket
+	# 3. Try to receive (triggers parse path)
+	python3 -c "
+import socket, struct, time
+
+# AF_PACKET SOCK_DGRAM on bond1
+ETH_P_ALL = 0x0003
+pkt_fd = socket.socket(socket.AF_PACKET, socket.SOCK_DGRAM,
+                       socket.htons(ETH_P_ALL))
+pkt_fd.settimeout(2)
+pkt_fd.bind(('$devbond1', ETH_P_ALL))
+
+# Build GRE-encapsulated IP packet
+def build_ip_hdr(proto, saddr, daddr, payload_len):
+    ihl_ver = 0x45
+    total_len = 20 + payload_len
+    hdr = struct.pack('!BBHHHBBH4s4s',
+        ihl_ver, 0, total_len, 0, 0, 64, proto, 0,
+        socket.inet_aton(saddr), socket.inet_aton(daddr))
+    # compute checksum
+    words = struct.unpack('!10H', hdr)
+    s = sum(words)
+    while s >> 16:
+        s = (s & 0xffff) + (s >> 16)
+    chksum = ~s & 0xffff
+    hdr = hdr[:10] + struct.pack('!H', chksum) + hdr[12:]
+    return hdr
+
+inner = build_ip_hdr(17, '192.168.1.1', '192.168.1.2', 8) + b'\x00' * 8
+gre_hdr = struct.pack('!HH', 0, 0x0800)  # flags=0, proto=IP
+outer = build_ip_hdr(47, '10.0.0.2', '10.0.0.1', len(gre_hdr) + len(inner))
+pkt = outer + gre_hdr + inner
+
+raw_fd = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_RAW)
+raw_fd.setsockopt(socket.IPPROTO_IP, socket.IP_HDRINCL, 1)
+raw_fd.sendto(pkt, ('10.0.0.1', 0))
+raw_fd.close()
+
+try:
+    pkt_fd.recv(2048)
+except socket.timeout:
+    pass
+pkt_fd.close()
+" 2>/dev/null
+
+	# If we get here without a kernel crash/hang, the test passed.
+	# Also check dmesg for signs of the recursion bug.
+	if dmesg | tail -20 | grep -q "BUG: MAX_LOCK_DEPTH\|stack-overflow\|stack overflow"; then
+		check_err 1 "kernel detected recursion in bond_header_parse"
+	fi
+
+	# Cleanup
+	ip link del "$devbond1" 2>/dev/null
+	ip link del "$devbond0" 2>/dev/null
+	ip link del "$devgre" 2>/dev/null
+	ip link del "$devdummy" 2>/dev/null
+
+	log_test "Stacked bond header_parse does not recurse"
+}
+
+require_command python3
+
+tests_run
+
+exit "$EXIT_STATUS"
-- 
2.43.0
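
For anyone who wants to sanity-check the injected packet without root or
a bonding setup, here is a minimal standalone sketch (not part of the
patch). build_ip_hdr() mirrors the helper embedded in the script above;
ip_checksum_ok() and build_gre_packet() are hypothetical names introduced
only for this illustration. It relies on the usual property of the IPv4
header checksum: summing all ten 16-bit words of a valid header, checksum
included, folds to 0xffff.

```python
import socket
import struct


def build_ip_hdr(proto, saddr, daddr, payload_len):
    """Build a 20-byte IPv4 header with the same layout as the patch helper."""
    hdr = struct.pack('!BBHHHBBH4s4s',
                      0x45, 0, 20 + payload_len, 0, 0, 64, proto, 0,
                      socket.inet_aton(saddr), socket.inet_aton(daddr))
    # Ones' complement sum over the ten 16-bit words (checksum field is 0).
    s = sum(struct.unpack('!10H', hdr))
    while s >> 16:
        s = (s & 0xffff) + (s >> 16)
    return hdr[:10] + struct.pack('!H', ~s & 0xffff) + hdr[12:]


def ip_checksum_ok(hdr):
    """Hypothetical checker: a valid 20-byte header folds to 0xffff."""
    s = sum(struct.unpack('!10H', hdr[:20]))
    while s >> 16:
        s = (s & 0xffff) + (s >> 16)
    return s == 0xffff


def build_gre_packet():
    """Hypothetical wrapper: outer IP (proto 47) + GRE + inner IP/UDP stub."""
    inner = build_ip_hdr(17, '192.168.1.1', '192.168.1.2', 8) + b'\x00' * 8
    gre_hdr = struct.pack('!HH', 0, 0x0800)  # flags=0, proto=IP
    outer = build_ip_hdr(47, '10.0.0.2', '10.0.0.1',
                         len(gre_hdr) + len(inner))
    return outer + gre_hdr + inner


if __name__ == '__main__':
    pkt = build_gre_packet()
    print(len(pkt), ip_checksum_ok(pkt[:20]), ip_checksum_ok(pkt[24:44]))
```

The outer header occupies bytes 0..19, the GRE header bytes 20..23, and
the inner IP header bytes 24..43, so both IP headers can be checked
independently of the kernel.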