From nobody Fri Dec 19 20:35:48 2025
From: Akihiko Odaki
Date: Thu, 09 Jan 2025 16:13:39 +0900
Subject: [PATCH v6 1/6] virtio_net: Add functions for hashing
X-Mailing-List: linux-kernel@vger.kernel.org
Message-Id: <20250109-rss-v6-1-b1c90ad708f6@daynix.com>
References: <20250109-rss-v6-0-b1c90ad708f6@daynix.com>
In-Reply-To: <20250109-rss-v6-0-b1c90ad708f6@daynix.com>
To: Jonathan Corbet, Willem de Bruijn, Jason Wang, "David S. Miller",
 Eric Dumazet, Jakub Kicinski, Paolo Abeni, "Michael S. Tsirkin",
 Xuan Zhuo, Shuah Khan, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, netdev@vger.kernel.org, kvm@vger.kernel.org,
 virtualization@lists.linux-foundation.org, linux-kselftest@vger.kernel.org,
 Yuri Benditovich, Andrew Melnychenko, Stephen Hemminger,
 gur.stavi@huawei.com, Akihiko Odaki
X-Mailer: b4 0.14-dev-fd6e3

They are useful to implement VIRTIO_NET_F_RSS and
VIRTIO_NET_F_HASH_REPORT.

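For readers unfamiliar with the hash behind VIRTIO_NET_F_RSS, the following is a minimal userspace sketch of a bit-serial Toeplitz hash and of an indirection-table lookup. It is not the code added by this patch, which processes 32-bit words against a pre-converted key. The names toeplitz_hash() and rss_select_queue() are hypothetical and exist only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bit-serial Toeplitz hash (illustrative; hypothetical helper name).
 * For every set bit of the input, taken MSB first, XOR in the 32-bit
 * window of the key that starts at the same bit offset.
 * The key must hold at least len + 4 bytes.
 */
static uint32_t toeplitz_hash(const uint8_t *key, const uint8_t *data,
			      size_t len)
{
	/* The window initially covers key bits 0..31. */
	uint32_t window = (uint32_t)key[0] << 24 | (uint32_t)key[1] << 16 |
			  (uint32_t)key[2] << 8 | (uint32_t)key[3];
	uint32_t hash = 0;

	for (size_t i = 0; i < len; i++) {
		for (int bit = 7; bit >= 0; bit--) {
			if (data[i] & (1u << bit))
				hash ^= window;

			/* Slide the window one bit further into the key. */
			window = window << 1 | ((key[i + 4] >> bit) & 1u);
		}
	}

	return hash;
}

/*
 * Map a hash to a queue through an indirection table whose length is a
 * power of two, so that mask is length - 1 (hypothetical helper).
 */
static uint16_t rss_select_queue(uint32_t hash, const uint16_t *table,
				 uint32_t mask)
{
	return table[hash & mask];
}
```

The "len + 4" key requirement in the sketch corresponds to the "return len + 4" in virtio_net_hash_key_length() in the diff: the window extends 32 bits past the last input bit.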
Signed-off-by: Akihiko Odaki
Tested-by: Lei Yang
---
 include/linux/virtio_net.h | 188 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 188 insertions(+)

diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
index 02a9f4dc594d..3b25ca75710b 100644
--- a/include/linux/virtio_net.h
+++ b/include/linux/virtio_net.h
@@ -9,6 +9,194 @@
 #include
 #include
 
+struct virtio_net_hash {
+	u32 value;
+	u16 report;
+};
+
+struct virtio_net_toeplitz_state {
+	u32 hash;
+	const u32 *key;
+};
+
+#define VIRTIO_NET_SUPPORTED_HASH_TYPES (VIRTIO_NET_RSS_HASH_TYPE_IPv4 | \
+					 VIRTIO_NET_RSS_HASH_TYPE_TCPv4 | \
+					 VIRTIO_NET_RSS_HASH_TYPE_UDPv4 | \
+					 VIRTIO_NET_RSS_HASH_TYPE_IPv6 | \
+					 VIRTIO_NET_RSS_HASH_TYPE_TCPv6 | \
+					 VIRTIO_NET_RSS_HASH_TYPE_UDPv6)
+
+#define VIRTIO_NET_RSS_MAX_KEY_SIZE 40
+
+static inline void virtio_net_toeplitz_convert_key(u32 *input, size_t len)
+{
+	while (len >= sizeof(*input)) {
+		*input = be32_to_cpu((__force __be32)*input);
+		input++;
+		len -= sizeof(*input);
+	}
+}
+
+static inline void virtio_net_toeplitz_calc(struct virtio_net_toeplitz_state *state,
+					    const __be32 *input, size_t len)
+{
+	while (len >= sizeof(*input)) {
+		for (u32 map = be32_to_cpu(*input); map; map &= (map - 1)) {
+			u32 i = ffs(map);
+
+			state->hash ^= state->key[0] << (32 - i) |
+				       (u32)((u64)state->key[1] >> i);
+		}
+
+		state->key++;
+		input++;
+		len -= sizeof(*input);
+	}
+}
+
+static inline u8 virtio_net_hash_key_length(u32 types)
+{
+	size_t len = 0;
+
+	if (types & VIRTIO_NET_HASH_REPORT_IPv4)
+		len = max(len,
+			  sizeof(struct flow_dissector_key_ipv4_addrs));
+
+	if (types &
+	    (VIRTIO_NET_HASH_REPORT_TCPv4 | VIRTIO_NET_HASH_REPORT_UDPv4))
+		len = max(len,
+			  sizeof(struct flow_dissector_key_ipv4_addrs) +
+			  sizeof(struct flow_dissector_key_ports));
+
+	if (types & VIRTIO_NET_HASH_REPORT_IPv6)
+		len = max(len,
+			  sizeof(struct flow_dissector_key_ipv6_addrs));
+
+	if (types &
+	    (VIRTIO_NET_HASH_REPORT_TCPv6 | VIRTIO_NET_HASH_REPORT_UDPv6))
+		len = max(len,
+			  sizeof(struct flow_dissector_key_ipv6_addrs) +
+			  sizeof(struct flow_dissector_key_ports));
+
+	return len + 4;
+}
+
+static inline u32 virtio_net_hash_report(u32 types,
+					 const struct flow_keys_basic *keys)
+{
+	switch (keys->basic.n_proto) {
+	case cpu_to_be16(ETH_P_IP):
+		if (!(keys->control.flags & FLOW_DIS_IS_FRAGMENT)) {
+			if (keys->basic.ip_proto == IPPROTO_TCP &&
+			    (types & VIRTIO_NET_RSS_HASH_TYPE_TCPv4))
+				return VIRTIO_NET_HASH_REPORT_TCPv4;
+
+			if (keys->basic.ip_proto == IPPROTO_UDP &&
+			    (types & VIRTIO_NET_RSS_HASH_TYPE_UDPv4))
+				return VIRTIO_NET_HASH_REPORT_UDPv4;
+		}
+
+		if (types & VIRTIO_NET_RSS_HASH_TYPE_IPv4)
+			return VIRTIO_NET_HASH_REPORT_IPv4;
+
+		return VIRTIO_NET_HASH_REPORT_NONE;
+
+	case cpu_to_be16(ETH_P_IPV6):
+		if (!(keys->control.flags & FLOW_DIS_IS_FRAGMENT)) {
+			if (keys->basic.ip_proto == IPPROTO_TCP &&
+			    (types & VIRTIO_NET_RSS_HASH_TYPE_TCPv6))
+				return VIRTIO_NET_HASH_REPORT_TCPv6;
+
+			if (keys->basic.ip_proto == IPPROTO_UDP &&
+			    (types & VIRTIO_NET_RSS_HASH_TYPE_UDPv6))
+				return VIRTIO_NET_HASH_REPORT_UDPv6;
+		}
+
+		if (types & VIRTIO_NET_RSS_HASH_TYPE_IPv6)
+			return VIRTIO_NET_HASH_REPORT_IPv6;
+
+		return VIRTIO_NET_HASH_REPORT_NONE;
+
+	default:
+		return VIRTIO_NET_HASH_REPORT_NONE;
+	}
+}
+
+static inline void virtio_net_hash_rss(const struct sk_buff *skb,
+				       u32 types, const u32 *key,
+				       struct virtio_net_hash *hash)
+{
+	struct virtio_net_toeplitz_state toeplitz_state = { .key = key };
+	struct flow_keys flow;
+	struct flow_keys_basic flow_basic;
+	u16 report;
+
+	if (!skb_flow_dissect_flow_keys(skb, &flow, 0)) {
+		hash->report = VIRTIO_NET_HASH_REPORT_NONE;
+		return;
+	}
+
+	flow_basic = (struct flow_keys_basic) {
+		.control = flow.control,
+		.basic = flow.basic
+	};
+
+	report = virtio_net_hash_report(types, &flow_basic);
+
+	switch (report) {
+	case VIRTIO_NET_HASH_REPORT_IPv4:
+		virtio_net_toeplitz_calc(&toeplitz_state,
+					 (__be32 *)&flow.addrs.v4addrs,
+					 sizeof(flow.addrs.v4addrs));
+		break;
+
+	case VIRTIO_NET_HASH_REPORT_TCPv4:
+		virtio_net_toeplitz_calc(&toeplitz_state,
+					 (__be32 *)&flow.addrs.v4addrs,
+					 sizeof(flow.addrs.v4addrs));
+		virtio_net_toeplitz_calc(&toeplitz_state, &flow.ports.ports,
+					 sizeof(flow.ports.ports));
+		break;
+
+	case VIRTIO_NET_HASH_REPORT_UDPv4:
+		virtio_net_toeplitz_calc(&toeplitz_state,
+					 (__be32 *)&flow.addrs.v4addrs,
+					 sizeof(flow.addrs.v4addrs));
+		virtio_net_toeplitz_calc(&toeplitz_state, &flow.ports.ports,
+					 sizeof(flow.ports.ports));
+		break;
+
+	case VIRTIO_NET_HASH_REPORT_IPv6:
+		virtio_net_toeplitz_calc(&toeplitz_state,
+					 (__be32 *)&flow.addrs.v6addrs,
+					 sizeof(flow.addrs.v6addrs));
+		break;
+
+	case VIRTIO_NET_HASH_REPORT_TCPv6:
+		virtio_net_toeplitz_calc(&toeplitz_state,
+					 (__be32 *)&flow.addrs.v6addrs,
+					 sizeof(flow.addrs.v6addrs));
+		virtio_net_toeplitz_calc(&toeplitz_state, &flow.ports.ports,
+					 sizeof(flow.ports.ports));
+		break;
+
+	case VIRTIO_NET_HASH_REPORT_UDPv6:
+		virtio_net_toeplitz_calc(&toeplitz_state,
+					 (__be32 *)&flow.addrs.v6addrs,
+					 sizeof(flow.addrs.v6addrs));
+		virtio_net_toeplitz_calc(&toeplitz_state, &flow.ports.ports,
+					 sizeof(flow.ports.ports));
+		break;
+
+	default:
+		hash->report = VIRTIO_NET_HASH_REPORT_NONE;
+		return;
+	}
+
+	hash->value = toeplitz_state.hash;
+	hash->report = report;
+}
+
 static inline bool virtio_net_hdr_match_proto(__be16 protocol, __u8 gso_type)
 {
 	switch (gso_type & ~VIRTIO_NET_HDR_GSO_ECN) {
-- 
2.47.1

From nobody Fri Dec 19 20:35:48 2025
From: Akihiko Odaki
Date: Thu, 09 Jan 2025 16:13:40 +0900
Subject: [PATCH v6 2/6] net: flow_dissector: Export flow_keys_dissector_symmetric
X-Mailing-List: linux-kernel@vger.kernel.org
Message-Id: <20250109-rss-v6-2-b1c90ad708f6@daynix.com>
References: <20250109-rss-v6-0-b1c90ad708f6@daynix.com>
In-Reply-To: <20250109-rss-v6-0-b1c90ad708f6@daynix.com>
To: Jonathan Corbet, Willem de Bruijn, Jason Wang, "David S. Miller",
 Eric Dumazet, Jakub Kicinski, Paolo Abeni, "Michael S. Tsirkin",
 Xuan Zhuo, Shuah Khan, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, netdev@vger.kernel.org, kvm@vger.kernel.org,
 virtualization@lists.linux-foundation.org, linux-kselftest@vger.kernel.org,
 Yuri Benditovich, Andrew Melnychenko, Stephen Hemminger,
 gur.stavi@huawei.com, Akihiko Odaki
X-Mailer: b4 0.14-dev-fd6e3

flow_keys_dissector_symmetric is useful to derive a symmetric hash
and to know its source such as IPv4, IPv6, TCP, and UDP.

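A symmetric hash gives both directions of a connection the same value. One way to obtain that property, sketched here in userspace C, is to order the two endpoints before hashing. This only illustrates the property and is not how the flow dissector is implemented; struct flow_tuple, mix() and symmetric_flow_hash() are hypothetical names, and the mixing step is a toy function.

```c
#include <stdint.h>

/* Hypothetical flow tuple in host byte order, for illustration only. */
struct flow_tuple {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
};

/* Toy mixing step; not the hash function the kernel uses. */
static uint32_t mix(uint32_t h, uint32_t v)
{
	return (h ^ v) * 16777619u;
}

/*
 * Order the two endpoints before hashing so that a flow and its
 * reverse direction produce the same value.
 */
static uint32_t symmetric_flow_hash(struct flow_tuple t)
{
	uint32_t h = 2166136261u;

	if (t.saddr > t.daddr || (t.saddr == t.daddr && t.sport > t.dport)) {
		uint32_t addr = t.saddr;
		uint16_t port = t.sport;

		t.saddr = t.daddr;
		t.daddr = addr;
		t.sport = t.dport;
		t.dport = port;
	}

	h = mix(h, t.saddr);
	h = mix(h, t.daddr);
	h = mix(h, (uint32_t)t.sport << 16 | t.dport);
	return h;
}
```

With this ordering, a packet from A to B and the reply from B to A land on the same value, which is what makes a symmetric hash suitable for steering both directions of a flow to one queue.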
Signed-off-by: Akihiko Odaki
Tested-by: Lei Yang
---
 include/net/flow_dissector.h | 1 +
 net/core/flow_dissector.c    | 3 ++-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
index ced79dc8e856..d01c1ec77b7d 100644
--- a/include/net/flow_dissector.h
+++ b/include/net/flow_dissector.h
@@ -423,6 +423,7 @@ __be32 flow_get_u32_src(const struct flow_keys *flow);
 __be32 flow_get_u32_dst(const struct flow_keys *flow);
 
 extern struct flow_dissector flow_keys_dissector;
+extern struct flow_dissector flow_keys_dissector_symmetric;
 extern struct flow_dissector flow_keys_basic_dissector;
 
 /* struct flow_keys_digest:
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 0e638a37aa09..9822988f2d49 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -1852,7 +1852,8 @@ void make_flow_keys_digest(struct flow_keys_digest *digest,
 }
 EXPORT_SYMBOL(make_flow_keys_digest);
 
-static struct flow_dissector flow_keys_dissector_symmetric __read_mostly;
+struct flow_dissector flow_keys_dissector_symmetric __read_mostly;
+EXPORT_SYMBOL(flow_keys_dissector_symmetric);
 
 u32 __skb_get_hash_symmetric_net(const struct net *net, const struct sk_buff *skb)
 {
-- 
2.47.1

From nobody Fri Dec 19 20:35:48 2025
From: Akihiko Odaki
Date: Thu, 09 Jan 2025 16:13:41 +0900
Subject: [PATCH v6 3/6] tun: Introduce virtio-net hash feature
X-Mailing-List: linux-kernel@vger.kernel.org
Message-Id: <20250109-rss-v6-3-b1c90ad708f6@daynix.com>
References: <20250109-rss-v6-0-b1c90ad708f6@daynix.com>
In-Reply-To: <20250109-rss-v6-0-b1c90ad708f6@daynix.com>
To: Jonathan Corbet, Willem de Bruijn, Jason Wang, "David S. Miller",
 Eric Dumazet, Jakub Kicinski, Paolo Abeni, "Michael S. Tsirkin",
 Xuan Zhuo, Shuah Khan, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, netdev@vger.kernel.org, kvm@vger.kernel.org,
 virtualization@lists.linux-foundation.org, linux-kselftest@vger.kernel.org,
 Yuri Benditovich, Andrew Melnychenko, Stephen Hemminger,
 gur.stavi@huawei.com, Akihiko Odaki
X-Mailer: b4 0.14-dev-fd6e3

Hash reporting
--------------

Allow the guest to reuse the hash value to make receive steering
consistent between the host and guest, and to save hash computation.

RSS
---

RSS is a receive steering algorithm that can be negotiated to use with
virtio_net. Conventionally the hash calculation was done by the VMM.
However, computing the hash after the queue was chosen defeats the
purpose of RSS.

Another approach is to use an eBPF steering program. This approach has
another downside: it cannot report the calculated hash due to the
restrictive nature of eBPF steering programs.

Introduce code that performs RSS in the kernel in order to overcome
these challenges. An alternative solution is to extend the eBPF steering
program so that it can report the hash to userspace, but I didn't opt
for it because extending the current eBPF steering program mechanism as
is would build on legacy context rewriting, and introducing kfunc-based
eBPF would result in a non-UAPI dependency, while the other relevant
virtualization APIs such as KVM and vhost_net are UAPIs.

Tested-by: Lei Yang
Signed-off-by: Akihiko Odaki
---
 Documentation/networking/tuntap.rst |   7 ++
 drivers/net/Kconfig                 |   1 +
 drivers/net/tap.c                   |  50 ++++++++++-
 drivers/net/tun.c                   |  93 +++++++++++++++-----
 drivers/net/tun_vnet.c              | 167 +++++++++++++++++++++++++++++++++---
 drivers/net/tun_vnet.h              |  33 ++++++-
 include/linux/if_tap.h              |   2 +
 include/linux/skbuff.h              |   3 +
 include/uapi/linux/if_tun.h         |  75 ++++++++++++++++
 net/core/skbuff.c                   |   4 +
 10 files changed, 397 insertions(+), 38 deletions(-)

diff --git a/Documentation/networking/tuntap.rst b/Documentation/networking/tuntap.rst
index 4d7087f727be..86b4ae8caa8a 100644
--- a/Documentation/networking/tuntap.rst
+++ b/Documentation/networking/tuntap.rst
@@ -206,6 +206,13 @@ enable is true we enable it, otherwise we disable it::
       return ioctl(fd, TUNSETQUEUE, (void *)&ifr);
   }
 
+3.4 Reference
+-------------
+
+``linux/if_tun.h`` defines the interface described below:
+
+.. kernel-doc:: include/uapi/linux/if_tun.h
+
 Universal TUN/TAP device driver Frequently Asked Question
 =========================================================
 
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 255c8f9f1d7c..f7b0d9a89a71 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -395,6 +395,7 @@ config TUN
 	tristate "Universal TUN/TAP device driver support"
 	depends on INET
 	select CRC32
+	select SKB_EXTENSIONS
 	select TUN_VNET
 	help
 	  TUN/TAP provides packet reception and transmission for user space
diff --git a/drivers/net/tap.c b/drivers/net/tap.c
index fe9554ee5b8b..27659df1f96e 100644
--- a/drivers/net/tap.c
+++ b/drivers/net/tap.c
@@ -179,6 +179,16 @@ static void tap_put_queue(struct tap_queue *q)
 	sock_put(&q->sk);
 }
 
+static struct virtio_net_hash *tap_add_hash(struct sk_buff *skb)
+{
+	return (struct virtio_net_hash *)skb->cb;
+}
+
+static const struct virtio_net_hash *tap_find_hash(const struct sk_buff *skb)
+{
+	return (const struct virtio_net_hash *)skb->cb;
+}
+
 /*
  * Select a queue based on the rxq of the device on which this packet
  * arrived. If the incoming device is not mq, calculate a flow hash
@@ -189,6 +199,7 @@ static void tap_put_queue(struct tap_queue *q)
 static struct tap_queue *tap_get_queue(struct tap_dev *tap,
 				       struct sk_buff *skb)
 {
+	struct flow_keys_basic keys_basic;
 	struct tap_queue *queue = NULL;
 	/* Access to taps array is protected by rcu, but access to numvtaps
 	 * isn't. Below we use it to lookup a queue, but treat it as a hint
@@ -196,17 +207,41 @@ static struct tap_queue *tap_get_queue(struct tap_dev *tap,
 	 * racing against queue removal.
 	 */
 	int numvtaps = READ_ONCE(tap->numvtaps);
+	struct tun_vnet_hash_container *vnet_hash = rcu_dereference(tap->vnet_hash);
 	__u32 rxq;
 
+	*tap_add_hash(skb) = (struct virtio_net_hash) { .report = VIRTIO_NET_HASH_REPORT_NONE };
+
 	if (!numvtaps)
 		goto out;
 
 	if (numvtaps == 1)
 		goto single;
 
+	if (vnet_hash && (vnet_hash->common.flags & TUN_VNET_HASH_RSS)) {
+		rxq = tun_vnet_rss_select_queue(numvtaps, vnet_hash, skb, tap_add_hash);
+		queue = rcu_dereference(tap->taps[rxq]);
+		goto out;
+	}
+
+	if (!skb->l4_hash && !skb->sw_hash) {
+		struct flow_keys keys;
+
+		skb_flow_dissect_flow_keys(skb, &keys, FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL);
+		rxq = flow_hash_from_keys(&keys);
+		keys_basic = (struct flow_keys_basic) {
+			.control = keys.control,
+			.basic = keys.basic
+		};
+	} else {
+		skb_flow_dissect_flow_keys_basic(NULL, skb, &keys_basic, NULL, 0, 0, 0,
+						 FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL);
+		rxq = skb->hash;
+	}
+
 	/* Check if we can use flow to select a queue */
-	rxq = skb_get_hash(skb);
 	if (rxq) {
+		tun_vnet_hash_report(vnet_hash, skb, &keys_basic, rxq, tap_add_hash);
 		queue = rcu_dereference(tap->taps[rxq % numvtaps]);
 		goto out;
 	}
@@ -713,11 +748,12 @@ static ssize_t tap_put_user(struct tap_queue *q,
 	int total;
 
 	if (q->flags & IFF_VNET_HDR) {
-		struct virtio_net_hdr_v1 vnet_hdr;
+		struct virtio_net_hdr_v1_hash vnet_hdr;
 
 		vnet_hdr_len = READ_ONCE(q->vnet_hdr_sz);
 
-		ret = tun_vnet_hdr_from_skb(q->flags, NULL, skb, &vnet_hdr);
+		ret = tun_vnet_hdr_from_skb(vnet_hdr_len, q->flags, NULL, skb,
+					    tap_find_hash, &vnet_hdr);
 		if (ret < 0)
 			goto done;
 
@@ -1025,7 +1061,13 @@ static long tap_ioctl(struct file *file, unsigned int cmd,
 		return ret;
 
 	default:
-		return tun_vnet_ioctl(&q->vnet_hdr_sz, &q->flags, cmd, sp);
+		rtnl_lock();
+		tap = rtnl_dereference(q->tap);
+		ret = tun_vnet_ioctl(&q->vnet_hdr_sz, &q->flags,
+				     tap ? &tap->vnet_hash : NULL, -EINVAL,
+				     true, cmd, sp);
+		rtnl_unlock();
+		return ret;
 	}
 }
 
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index f211d0580887..efdbd2f65100 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -209,6 +209,7 @@ struct tun_struct {
 	struct bpf_prog __rcu *xdp_prog;
 	struct tun_prog __rcu *steering_prog;
 	struct tun_prog __rcu *filter_prog;
+	struct tun_vnet_hash_container __rcu *vnet_hash;
 	struct ethtool_link_ksettings link_ksettings;
 	/* init args */
 	struct file *file;
@@ -451,20 +452,37 @@ static inline void tun_flow_save_rps_rxhash(struct tun_flow_entry *e, u32 hash)
 		e->rps_rxhash = hash;
 }
 
+static struct virtio_net_hash *tun_add_hash(struct sk_buff *skb)
+{
+	return skb_ext_add(skb, SKB_EXT_TUN_VNET_HASH);
+}
+
+static const struct virtio_net_hash *tun_find_hash(const struct sk_buff *skb)
+{
+	return skb_ext_find(skb, SKB_EXT_TUN_VNET_HASH);
+}
+
 /* We try to identify a flow through its rxhash. The reason that
  * we do not check rxq no. is because some cards(e.g 82599), chooses
  * the rxq based on the txq where the last packet of the flow comes. As
  * the userspace application move between processors, we may get a
  * different rxq no. here.
  */
-static u16 tun_automq_select_queue(struct tun_struct *tun, struct sk_buff *skb)
+static u16 tun_automq_select_queue(struct tun_struct *tun,
+				   const struct tun_vnet_hash_container *vnet_hash,
+				   struct sk_buff *skb)
 {
+	struct flow_keys keys;
+	struct flow_keys_basic keys_basic;
 	struct tun_flow_entry *e;
 	u32 txq, numqueues;
 
 	numqueues = READ_ONCE(tun->numqueues);
 
-	txq = __skb_get_hash_symmetric(skb);
+	memset(&keys, 0, sizeof(keys));
+	skb_flow_dissect(skb, &flow_keys_dissector_symmetric, &keys, 0);
+
+	txq = flow_hash_from_keys(&keys);
 	e = tun_flow_find(&tun->flows[tun_hashfn(txq)], txq);
 	if (e) {
 		tun_flow_save_rps_rxhash(e, txq);
@@ -473,6 +491,13 @@ static u16 tun_automq_select_queue(struct tun_struct *tun, struct sk_buff *skb)
 		txq = reciprocal_scale(txq, numqueues);
 	}
 
+	keys_basic = (struct flow_keys_basic) {
+		.control = keys.control,
+		.basic = keys.basic
+	};
+	tun_vnet_hash_report(vnet_hash, skb, &keys_basic, skb->l4_hash ? skb->hash : txq,
+			     tun_add_hash);
+
 	return txq;
 }
 
@@ -500,10 +525,17 @@ static u16 tun_select_queue(struct net_device *dev, struct sk_buff *skb,
 	u16 ret;
 
 	rcu_read_lock();
-	if (rcu_dereference(tun->steering_prog))
+	if (rcu_dereference(tun->steering_prog)) {
 		ret = tun_ebpf_select_queue(tun, skb);
-	else
-		ret = tun_automq_select_queue(tun, skb);
+	} else {
+		struct tun_vnet_hash_container *vnet_hash = rcu_dereference(tun->vnet_hash);
+
+		if (vnet_hash && (vnet_hash->common.flags & TUN_VNET_HASH_RSS))
+			ret = tun_vnet_rss_select_queue(READ_ONCE(tun->numqueues), vnet_hash,
+							skb, tun_add_hash);
+		else
+			ret = tun_automq_select_queue(tun, vnet_hash, skb);
+	}
 	rcu_read_unlock();
 
 	return ret;
@@ -1991,8 +2023,8 @@ static ssize_t tun_put_user_xdp(struct tun_struct *tun,
 	size_t total;
 
 	if (tun->flags & IFF_VNET_HDR) {
-		struct virtio_net_hdr_v1 gso = {
-			.num_buffers = __virtio16_to_cpu(true, 1)
+		struct virtio_net_hdr_v1_hash gso = {
+			.hdr = { .num_buffers = __virtio16_to_cpu(true, 1) }
 		};
 
 		vnet_hdr_sz = READ_ONCE(tun->vnet_hdr_sz);
@@ -2021,7 +2053,6 @@ static ssize_t tun_put_user(struct tun_struct *tun,
 	int vlan_offset = 0;
 	int vlan_hlen = 0;
 	int vnet_hdr_sz = 0;
-	int ret;
 
 	if (skb_vlan_tag_present(skb))
 		vlan_hlen = VLAN_HLEN;
@@ -2046,9 +2077,11 @@ static ssize_t tun_put_user(struct tun_struct *tun,
 	}
 
 	if (vnet_hdr_sz) {
-		struct virtio_net_hdr_v1 gso;
+		struct virtio_net_hdr_v1_hash gso;
+		int ret;
 
-		ret = tun_vnet_hdr_from_skb(tun->flags, tun->dev, skb, &gso);
+		ret = tun_vnet_hdr_from_skb(vnet_hdr_sz, tun->flags, tun->dev,
+					    skb, tun_find_hash, &gso);
 		if (ret < 0)
 			goto done;
 
@@ -2229,6 +2262,9 @@ static void tun_free_netdev(struct net_device *dev)
 	security_tun_dev_free_security(tun->security);
 	__tun_set_ebpf(tun, &tun->steering_prog, NULL);
 	__tun_set_ebpf(tun, &tun->filter_prog, NULL);
+	rtnl_lock();
+	kfree_rcu_mightsleep(rtnl_dereference(tun->vnet_hash));
+	rtnl_unlock();
 }
 
 static void tun_setup(struct net_device *dev)
@@ -2927,13 +2963,9 @@ static int tun_set_queue(struct file *file, struct ifreq *ifr)
 }
 
 static int tun_set_ebpf(struct tun_struct *tun, struct tun_prog __rcu **prog_p,
-			void __user *data)
+			int fd)
 {
 	struct bpf_prog *prog;
-	int fd;
-
-	if (copy_from_user(&fd, data, sizeof(fd)))
-		return -EFAULT;
 
 	if (fd == -1) {
 		prog = NULL;
@@ -3000,6 +3032,7 @@ static long __tun_chr_ioctl(struct file *file, unsigned int cmd,
 	int sndbuf;
 	int ret;
 	bool do_notify = false;
+	struct tun_vnet_hash_container *vnet_hash;
 
 	if (cmd == TUNSETIFF || cmd == TUNSETQUEUE ||
 	    (_IOC_TYPE(cmd) == SOCK_IOC_TYPE && cmd != SIOCGSKNS)) {
@@ -3058,9 +3091,10 @@ static long __tun_chr_ioctl(struct file *file, unsigned int cmd,
 		goto unlock;
 	}
 
-	ret = -EBADFD;
-	if (!tun)
+	if (!tun) {
+		ret = tun_vnet_ioctl(NULL, NULL, NULL, -EBADFD, true, cmd, argp);
 		goto unlock;
+	}
 
 	netif_info(tun, drv, tun->dev, "tun_chr_ioctl cmd %u\n", cmd);
 
@@ -3236,11 +3270,27 @@ static long __tun_chr_ioctl(struct file *file, unsigned int cmd,
 		break;
 
 	case TUNSETSTEERINGEBPF:
-		ret = tun_set_ebpf(tun, &tun->steering_prog, argp);
+		if (get_user(ret, (int __user *)argp)) {
+			ret = -EFAULT;
+			break;
+		}
+
+		vnet_hash = rtnl_dereference(tun->vnet_hash);
+		if (ret != -1 && vnet_hash && (vnet_hash->common.flags & TUN_VNET_HASH_RSS)) {
+			ret = -EBUSY;
+			break;
+		}
+
+		ret = tun_set_ebpf(tun, &tun->steering_prog, ret);
 		break;
 
 	case TUNSETFILTEREBPF:
-		ret = tun_set_ebpf(tun, &tun->filter_prog, argp);
+		if (get_user(ret, (int __user *)argp)) {
+			ret = -EFAULT;
+			break;
+		}
+
+		ret = tun_set_ebpf(tun, &tun->filter_prog, ret);
 		break;
 
 	case TUNSETCARRIER:
@@ -3259,7 +3309,10 @@ static long __tun_chr_ioctl(struct file *file, unsigned int cmd,
 		break;
 
 	default:
-		ret = tun_vnet_ioctl(&tun->vnet_hdr_sz, &tun->flags, cmd, argp);
+		ret = tun_vnet_ioctl(&tun->vnet_hdr_sz, &tun->flags,
+				     &tun->vnet_hash, -EINVAL,
+				     !rtnl_dereference(tun->steering_prog),
+				     cmd, argp);
 	}
 
 	if (do_notify)
diff --git a/drivers/net/tun_vnet.c b/drivers/net/tun_vnet.c
index a7a7989fae56..d36ca3b23265 100644
--- a/drivers/net/tun_vnet.c
+++ b/drivers/net/tun_vnet.c
@@ -58,18 +58,33 @@ static __virtio16 cpu_to_tun_vnet16(unsigned int flags, u16 val)
 }
 
 long tun_vnet_ioctl(int *sz, unsigned int *flags,
-		    unsigned int cmd, int __user *sp)
+		    struct tun_vnet_hash_container __rcu **hashp,
+		    long fallback, bool can_rss,
+		    unsigned int cmd, void __user *argp)
 {
+	static const struct tun_vnet_hash cap = {
+		.flags = TUN_VNET_HASH_REPORT | TUN_VNET_HASH_RSS,
+		.types = VIRTIO_NET_SUPPORTED_HASH_TYPES
+	};
+	struct tun_vnet_hash hash_buf;
+	struct tun_vnet_hash_container *hash;
+	int __user *sp = argp;
 	int s;
 
 	switch (cmd) {
 	case TUNGETVNETHDRSZ:
+		if (!sz)
+			return -EBADFD;
+
 		s = *sz;
 		if (put_user(s, sp))
 			return -EFAULT;
 		return 0;
 
 	case TUNSETVNETHDRSZ:
+		if (!sz)
+			return -EBADFD;
+
 		if (get_user(s, sp))
 			return -EFAULT;
if (s < (int)sizeof(struct virtio_net_hdr)) @@ -79,12 +94,18 @@ long tun_vnet_ioctl(int *sz, unsigned int *flags, return 0; =20 case TUNGETVNETLE: + if (!flags) + return -EBADFD; + s =3D !!(*flags & TUN_VNET_LE); if (put_user(s, sp)) return -EFAULT; return 0; =20 case TUNSETVNETLE: + if (!flags) + return -EBADFD; + if (get_user(s, sp)) return -EFAULT; if (s) @@ -94,17 +115,132 @@ long tun_vnet_ioctl(int *sz, unsigned int *flags, return 0; =20 case TUNGETVNETBE: + if (!flags) + return -EBADFD; + return tun_vnet_get_be(*flags, sp); =20 case TUNSETVNETBE: + if (!flags) + return -EBADFD; + return tun_vnet_set_be(flags, sp); =20 + case TUNGETVNETHASHCAP: + return copy_to_user(argp, &cap, sizeof(cap)) ? -EFAULT : 0; + + case TUNSETVNETHASH: + if (!hashp) + return -EBADFD; + + if (copy_from_user(&hash_buf, argp, sizeof(hash_buf))) + return -EFAULT; + argp =3D (struct tun_vnet_hash __user *)argp + 1; + + if (hash_buf.flags & TUN_VNET_HASH_RSS) { + struct tun_vnet_hash_rss rss; + size_t indirection_table_size; + size_t key_size; + size_t size; + + if (!can_rss) + return -EBUSY; + + if (copy_from_user(&rss, argp, sizeof(rss))) + return -EFAULT; + argp =3D (struct tun_vnet_hash_rss __user *)argp + 1; + + indirection_table_size =3D ((size_t)rss.indirection_table_mask + 1) * 2; + key_size =3D virtio_net_hash_key_length(hash_buf.types); + size =3D struct_size(hash, rss_indirection_table, + (size_t)rss.indirection_table_mask + 1); + + hash =3D kmalloc(size, GFP_KERNEL); + if (!hash) + return -ENOMEM; + + if (copy_from_user(hash->rss_indirection_table, + argp, indirection_table_size)) { + kfree(hash); + return -EFAULT; + } + argp =3D (u16 __user *)argp + rss.indirection_table_mask + 1; + + if (copy_from_user(hash->rss_key, argp, key_size)) { + kfree(hash); + return -EFAULT; + } + + virtio_net_toeplitz_convert_key(hash->rss_key, key_size); + hash->rss =3D rss; + } else { + hash =3D kmalloc(sizeof(hash->common), GFP_KERNEL); + if (!hash) + return -ENOMEM; + } + + hash->common =3D 
hash_buf; + kfree_rcu_mightsleep(rcu_replace_pointer_rtnl(*hashp, hash)); + return 0; + default: - return -EINVAL; + return fallback; } } EXPORT_SYMBOL_GPL(tun_vnet_ioctl); =20 +void tun_vnet_hash_report(const struct tun_vnet_hash_container *hash, + struct sk_buff *skb, + const struct flow_keys_basic *keys, + u32 value, + tun_vnet_hash_add vnet_hash_add) +{ + struct virtio_net_hash *report; + + if (!hash || !(hash->common.flags & TUN_VNET_HASH_REPORT)) + return; + + report =3D vnet_hash_add(skb); + if (!report) + return; + + *report =3D (struct virtio_net_hash) { + .report =3D virtio_net_hash_report(hash->common.types, keys), + .value =3D value + }; +} +EXPORT_SYMBOL_GPL(tun_vnet_hash_report); + +u16 tun_vnet_rss_select_queue(u32 numqueues, + const struct tun_vnet_hash_container *hash, + struct sk_buff *skb, + tun_vnet_hash_add vnet_hash_add) +{ + struct virtio_net_hash *report; + struct virtio_net_hash ret; + u16 txq, index; + + if (!numqueues) + return 0; + + virtio_net_hash_rss(skb, hash->common.types, hash->rss_key, &ret); + + if (!ret.report) + return hash->rss.unclassified_queue % numqueues; + + if (hash->common.flags & TUN_VNET_HASH_REPORT) { + report =3D vnet_hash_add(skb); + if (report) + *report =3D ret; + } + + index =3D ret.value & hash->rss.indirection_table_mask; + txq =3D READ_ONCE(hash->rss_indirection_table[index]); + + return txq % numqueues; +} +EXPORT_SYMBOL_GPL(tun_vnet_rss_select_queue); + int tun_vnet_hdr_get(int sz, unsigned int flags, struct iov_iter *from, struct virtio_net_hdr *hdr) { @@ -130,7 +266,7 @@ int tun_vnet_hdr_get(int sz, unsigned int flags, struct= iov_iter *from, EXPORT_SYMBOL_GPL(tun_vnet_hdr_get); =20 int tun_vnet_hdr_put(int sz, struct iov_iter *iter, - const struct virtio_net_hdr_v1 *hdr) + const struct virtio_net_hdr_v1_hash *hdr) { int content_sz =3D MIN(sizeof(*hdr), sz); =20 @@ -154,11 +290,24 @@ int tun_vnet_hdr_to_skb(unsigned int flags, struct sk= _buff *skb, } EXPORT_SYMBOL_GPL(tun_vnet_hdr_to_skb); =20 -int 
tun_vnet_hdr_from_skb(unsigned int flags, const struct net_device *dev, +int tun_vnet_hdr_from_skb(int sz, unsigned int flags, + const struct net_device *dev, const struct sk_buff *skb, - struct virtio_net_hdr_v1 *hdr) + tun_vnet_hash_find vnet_hash_find, + struct virtio_net_hdr_v1_hash *hdr) { int vlan_hlen =3D skb_vlan_tag_present(skb) ? VLAN_HLEN : 0; + const struct virtio_net_hash *report =3D sz < sizeof(struct virtio_net_hd= r_v1_hash) ? + NULL : vnet_hash_find(skb); + + *hdr =3D (struct virtio_net_hdr_v1_hash) { + .hdr =3D { .num_buffers =3D __cpu_to_virtio16(true, 1) } + }; + + if (report) { + hdr->hash_value =3D cpu_to_le32(report->value); + hdr->hash_report =3D cpu_to_le16(report->report); + } =20 if (virtio_net_hdr_from_skb(skb, (struct virtio_net_hdr *)hdr, tun_vnet_is_little_endian(flags), true, @@ -167,19 +316,17 @@ int tun_vnet_hdr_from_skb(unsigned int flags, const s= truct net_device *dev, =20 if (net_ratelimit()) { netdev_err(dev, "unexpected GSO type: 0x%x, gso_size %d, hdr_len %d\n", - sinfo->gso_type, tun_vnet16_to_cpu(flags, hdr->gso_size), - tun_vnet16_to_cpu(flags, hdr->hdr_len)); + sinfo->gso_type, tun_vnet16_to_cpu(flags, hdr->hdr.gso_size), + tun_vnet16_to_cpu(flags, hdr->hdr.hdr_len)); print_hex_dump(KERN_ERR, "tun: ", DUMP_PREFIX_NONE, 16, 1, skb->head, - min(tun_vnet16_to_cpu(flags, hdr->hdr_len), 64), true); + min(tun_vnet16_to_cpu(flags, hdr->hdr.hdr_len), 64), true); } WARN_ON_ONCE(1); return -EINVAL; } =20 - hdr->num_buffers =3D 1; - return 0; } EXPORT_SYMBOL_GPL(tun_vnet_hdr_from_skb); diff --git a/drivers/net/tun_vnet.h b/drivers/net/tun_vnet.h index d8fd94094227..046fb051d089 100644 --- a/drivers/net/tun_vnet.h +++ b/drivers/net/tun_vnet.h @@ -5,20 +5,45 @@ #include #include =20 +typedef struct virtio_net_hash *(*tun_vnet_hash_add)(struct sk_buff *); +typedef const struct virtio_net_hash *(*tun_vnet_hash_find)(const struct s= k_buff *); + +struct tun_vnet_hash_container { + struct tun_vnet_hash common; + struct tun_vnet_hash_rss 
rss; + u32 rss_key[VIRTIO_NET_RSS_MAX_KEY_SIZE]; + u16 rss_indirection_table[]; +}; + long tun_vnet_ioctl(int *sz, unsigned int *flags, - unsigned int cmd, int __user *sp); + struct tun_vnet_hash_container __rcu **hashp, + long fallback, bool can_rss, + unsigned int cmd, void __user *argp); + +void tun_vnet_hash_report(const struct tun_vnet_hash_container *hash, + struct sk_buff *skb, + const struct flow_keys_basic *keys, + u32 value, + tun_vnet_hash_add vnet_hash_add); + +u16 tun_vnet_rss_select_queue(u32 numqueues, + const struct tun_vnet_hash_container *hash, + struct sk_buff *skb, + tun_vnet_hash_add vnet_hash_add); =20 int tun_vnet_hdr_get(int sz, unsigned int flags, struct iov_iter *from, struct virtio_net_hdr *hdr); =20 int tun_vnet_hdr_put(int sz, struct iov_iter *iter, - const struct virtio_net_hdr_v1 *hdr); + const struct virtio_net_hdr_v1_hash *hdr); =20 int tun_vnet_hdr_to_skb(unsigned int flags, struct sk_buff *skb, const struct virtio_net_hdr *hdr); =20 -int tun_vnet_hdr_from_skb(unsigned int flags, const struct net_device *dev, +int tun_vnet_hdr_from_skb(int sz, unsigned int flags, + const struct net_device *dev, const struct sk_buff *skb, - struct virtio_net_hdr_v1 *hdr); + tun_vnet_hash_find vnet_hash_find, + struct virtio_net_hdr_v1_hash *hdr); =20 #endif /* TUN_VNET_H */ diff --git a/include/linux/if_tap.h b/include/linux/if_tap.h index 553552fa635c..7334c46a3f10 100644 --- a/include/linux/if_tap.h +++ b/include/linux/if_tap.h @@ -31,6 +31,7 @@ static inline struct ptr_ring *tap_get_ptr_ring(struct fi= le *f) #define MAX_TAP_QUEUES 256 =20 struct tap_queue; +struct tun_vnet_hash_container; =20 struct tap_dev { struct net_device *dev; @@ -43,6 +44,7 @@ struct tap_dev { int numqueues; netdev_features_t tap_features; int minor; + struct tun_vnet_hash_container __rcu *vnet_hash; =20 void (*update_features)(struct tap_dev *tap, netdev_features_t features); void (*count_tx_dropped)(struct tap_dev *tap); diff --git a/include/linux/skbuff.h 
b/include/linux/skbuff.h index 58009fa66102..73214f1a378e 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -4813,6 +4813,9 @@ enum skb_ext_id { #endif #if IS_ENABLED(CONFIG_MCTP_FLOWS) SKB_EXT_MCTP, +#endif +#if IS_ENABLED(CONFIG_TUN) + SKB_EXT_TUN_VNET_HASH, #endif SKB_EXT_NUM, /* must be last */ }; diff --git a/include/uapi/linux/if_tun.h b/include/uapi/linux/if_tun.h index 287cdc81c939..4887f97500a8 100644 --- a/include/uapi/linux/if_tun.h +++ b/include/uapi/linux/if_tun.h @@ -62,6 +62,42 @@ #define TUNSETCARRIER _IOW('T', 226, int) #define TUNGETDEVNETNS _IO('T', 227) =20 +/** + * define TUNGETVNETHASHCAP - ioctl to get virtio_net hashing capability. + * + * The argument is a pointer to &struct tun_vnet_hash which will store the + * maximal virtio_net hashing configuration. + */ +#define TUNGETVNETHASHCAP _IOR('T', 228, struct tun_vnet_hash) + +/** + * define TUNSETVNETHASH - ioctl to configure virtio_net hashing + * + * The argument is a pointer to &struct tun_vnet_hash. + * + * The argument is a pointer to the compound of the following in order if + * %TUN_VNET_HASH_RSS is set: + * + * 1. &struct tun_vnet_hash + * 2. &struct tun_vnet_hash_rss + * 3. Indirection table + * 4. Key + * + * The %TUN_VNET_HASH_REPORT flag set with this ioctl will be effective on= ly + * after calling the %TUNSETVNETHDRSZ ioctl with a number greater than or = equal + * to the size of &struct virtio_net_hdr_v1_hash. + * + * The members added to the legacy header by %TUN_VNET_HASH_REPORT flag wi= ll + * always be little-endian. + * + * This ioctl results in %EBADFD if the underlying device is deleted. It a= ffects + * all queues attached to the same device. + * + * This ioctl currently has no effect on XDP packets and packets with + * queue_mapping set by TC. 
+ */ +#define TUNSETVNETHASH _IOW('T', 229, struct tun_vnet_hash) + /* TUNSETIFF ifr flags */ #define IFF_TUN 0x0001 #define IFF_TAP 0x0002 @@ -115,4 +151,43 @@ struct tun_filter { __u8 addr[][ETH_ALEN]; }; =20 +/** + * define TUN_VNET_HASH_REPORT - Request virtio_net hash reporting for vho= st + */ +#define TUN_VNET_HASH_REPORT 0x0001 + +/** + * define TUN_VNET_HASH_RSS - Request virtio_net RSS + * + * This is mutually exclusive with eBPF steering program. + */ +#define TUN_VNET_HASH_RSS 0x0002 + +/** + * struct tun_vnet_hash - virtio_net hashing configuration + * @flags: + * Bitmask consists of %TUN_VNET_HASH_REPORT and %TUN_VNET_HASH_RSS + * @pad: + * Should be filled with zero before passing to %TUNSETVNETHASH + * @types: + * Bitmask of allowed hash types + */ +struct tun_vnet_hash { + __u16 flags; + __u8 pad[2]; + __u32 types; +}; + +/** + * struct tun_vnet_hash_rss - virtio_net RSS configuration + * @indirection_table_mask: + * Bitmask to be applied to the indirection table index + * @unclassified_queue: + * The index of the queue to place unclassified packets in + */ +struct tun_vnet_hash_rss { + __u16 indirection_table_mask; + __u16 unclassified_queue; +}; + #endif /* _UAPI__IF_TUN_H */ diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 6841e61a6bd0..97b22833905d 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -64,6 +64,7 @@ #include #include #include +#include =20 #include #include @@ -5059,6 +5060,9 @@ static const u8 skb_ext_type_len[] =3D { #if IS_ENABLED(CONFIG_MCTP_FLOWS) [SKB_EXT_MCTP] =3D SKB_EXT_CHUNKSIZEOF(struct mctp_flow), #endif +#if IS_ENABLED(CONFIG_TUN) + [SKB_EXT_TUN_VNET_HASH] =3D SKB_EXT_CHUNKSIZEOF(struct virtio_net_hash), +#endif }; =20 static __always_inline unsigned int skb_ext_total_length(void) --=20 2.47.1 From nobody Fri Dec 19 20:35:48 2025 Received: from mail-pj1-f48.google.com (mail-pj1-f48.google.com [209.85.216.48]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate 
requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1AF9C21507E for ; Thu, 9 Jan 2025 07:14:36 +0000 (UTC)
From: Akihiko Odaki
Date: Thu, 09 Jan 2025 16:13:42 +0900
Subject: [PATCH v6 4/6] selftest: tun: Test vnet ioctls without device
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Message-Id: <20250109-rss-v6-4-b1c90ad708f6@daynix.com>
References: <20250109-rss-v6-0-b1c90ad708f6@daynix.com>
In-Reply-To: <20250109-rss-v6-0-b1c90ad708f6@daynix.com>
To: Jonathan Corbet, Willem de Bruijn, Jason Wang, "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni, "Michael S. Tsirkin", Xuan Zhuo, Shuah Khan, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, netdev@vger.kernel.org, kvm@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-kselftest@vger.kernel.org, Yuri Benditovich, Andrew Melnychenko, Stephen Hemminger, gur.stavi@huawei.com, Akihiko Odaki

Ensure that vnet ioctls result in EBADFD when the underlying device is deleted.
Signed-off-by: Akihiko Odaki Tested-by: Lei Yang --- tools/testing/selftests/net/tun.c | 74 +++++++++++++++++++++++++++++++++++= ++++ 1 file changed, 74 insertions(+) diff --git a/tools/testing/selftests/net/tun.c b/tools/testing/selftests/ne= t/tun.c index fa83918b62d1..463dd98f2b80 100644 --- a/tools/testing/selftests/net/tun.c +++ b/tools/testing/selftests/net/tun.c @@ -159,4 +159,78 @@ TEST_F(tun, reattach_close_delete) { EXPECT_EQ(tun_delete(self->ifname), 0); } =20 +FIXTURE(tun_deleted) +{ + char ifname[IFNAMSIZ]; + int fd; +}; + +FIXTURE_SETUP(tun_deleted) +{ + self->ifname[0] =3D 0; + self->fd =3D tun_alloc(self->ifname); + ASSERT_LE(0, self->fd); + + ASSERT_EQ(0, tun_delete(self->ifname)) + EXPECT_EQ(0, close(self->fd)); +} + +FIXTURE_TEARDOWN(tun_deleted) +{ + EXPECT_EQ(0, close(self->fd)); +} + +TEST_F(tun_deleted, getvnethdrsz) +{ + ASSERT_EQ(-1, ioctl(self->fd, TUNGETVNETHDRSZ)); + EXPECT_EQ(EBADFD, errno); +} + +TEST_F(tun_deleted, setvnethdrsz) +{ + ASSERT_EQ(-1, ioctl(self->fd, TUNSETVNETHDRSZ)); + EXPECT_EQ(EBADFD, errno); +} + +TEST_F(tun_deleted, getvnetle) +{ + ASSERT_EQ(-1, ioctl(self->fd, TUNGETVNETLE)); + EXPECT_EQ(EBADFD, errno); +} + +TEST_F(tun_deleted, setvnetle) +{ + ASSERT_EQ(-1, ioctl(self->fd, TUNSETVNETLE)); + EXPECT_EQ(EBADFD, errno); +} + +TEST_F(tun_deleted, getvnetbe) +{ + ASSERT_EQ(-1, ioctl(self->fd, TUNGETVNETBE)); + EXPECT_EQ(EBADFD, errno); +} + +TEST_F(tun_deleted, setvnetbe) +{ + ASSERT_EQ(-1, ioctl(self->fd, TUNSETVNETBE)); + EXPECT_EQ(EBADFD, errno); +} + +TEST_F(tun_deleted, getvnethashcap) +{ + struct tun_vnet_hash cap; + int i =3D ioctl(self->fd, TUNGETVNETHASHCAP, &cap); + + if (i =3D=3D -1 && errno =3D=3D EBADFD) + SKIP(return, "TUNGETVNETHASHCAP not supported"); + + EXPECT_EQ(0, i); +} + +TEST_F(tun_deleted, setvnethash) +{ + ASSERT_EQ(-1, ioctl(self->fd, TUNSETVNETHASH)); + EXPECT_EQ(EBADFD, errno); +} + TEST_HARNESS_MAIN --=20 2.47.1 From nobody Fri Dec 19 20:35:48 2025 Received: from mail-pj1-f45.google.com 
(mail-pj1-f45.google.com [209.85.216.45]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C4B232036F4 for ; Thu, 9 Jan 2025 07:14:44 +0000 (UTC)
From: Akihiko Odaki
Date: Thu, 09 Jan 2025 16:13:43 +0900
Subject: [PATCH v6 5/6] selftest: tun: Add tests for virtio-net hashing
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Message-Id: <20250109-rss-v6-5-b1c90ad708f6@daynix.com>
References: <20250109-rss-v6-0-b1c90ad708f6@daynix.com>
In-Reply-To: <20250109-rss-v6-0-b1c90ad708f6@daynix.com>
To: Jonathan Corbet, Willem de Bruijn, Jason Wang, "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni, "Michael S. Tsirkin", Xuan Zhuo, Shuah Khan, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, netdev@vger.kernel.org, kvm@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-kselftest@vger.kernel.org, Yuri Benditovich, Andrew Melnychenko, Stephen Hemminger, gur.stavi@huawei.com, Akihiko Odaki

The added tests confirm tun can perform RSS and hash reporting, and reject invalid configurations for them.
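
For readers checking the queue expectations in these tests, the RSS queue choice made by tun_vnet_rss_select_queue() in this series can be sketched in user space roughly as follows. This is an illustration only, not code from the patch: struct rss_sketch and rss_sketch_select_queue() are hypothetical names, and the "classified" flag stands in for a non-zero hash report from the kernel's hashing helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space mirror of the RSS parameters; not the kernel struct. */
struct rss_sketch {
	uint16_t indirection_table_mask;   /* applied to the hash to get an index */
	uint16_t unclassified_queue;       /* queue for packets with no hash report */
	const uint16_t *indirection_table; /* assumed to hold mask + 1 entries */
};

/*
 * Pick a queue the way the RSS path is described in the series:
 *   - no queues attached        -> 0
 *   - packet not classified     -> unclassified_queue % numqueues
 *   - otherwise                 -> table[hash & mask] % numqueues
 */
static uint16_t rss_sketch_select_queue(const struct rss_sketch *cfg,
					uint32_t numqueues,
					int classified, uint32_t hash)
{
	uint16_t index;

	assert(cfg && cfg->indirection_table);

	if (!numqueues)
		return 0;

	if (!classified)
		return cfg->unclassified_queue % numqueues;

	index = hash & cfg->indirection_table_mask;
	return cfg->indirection_table[index] % numqueues;
}
```

With a mask of 1 and an unclassified queue of 5 across three queues, as configured by the test fixture, an unclassified packet lands on queue 5 % 3 = 2, which is the fallback value tun_vnet_hash_check() uses for txq when no hash report is expected.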
Signed-off-by: Akihiko Odaki Tested-by: Lei Yang --- tools/testing/selftests/net/Makefile | 2 +- tools/testing/selftests/net/tun.c | 558 +++++++++++++++++++++++++++++++= +++- 2 files changed, 551 insertions(+), 9 deletions(-) diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests= /net/Makefile index cb2fc601de66..92762ce3ebd4 100644 --- a/tools/testing/selftests/net/Makefile +++ b/tools/testing/selftests/net/Makefile @@ -121,6 +121,6 @@ $(OUTPUT)/reuseport_bpf_numa: LDLIBS +=3D -lnuma $(OUTPUT)/tcp_mmap: LDLIBS +=3D -lpthread -lcrypto $(OUTPUT)/tcp_inq: LDLIBS +=3D -lpthread $(OUTPUT)/bind_bhash: LDLIBS +=3D -lpthread -$(OUTPUT)/io_uring_zerocopy_tx: CFLAGS +=3D -I../../../include/ +$(OUTPUT)/io_uring_zerocopy_tx $(OUTPUT)/tun: CFLAGS +=3D -I../../../inclu= de/ =20 include bpf.mk diff --git a/tools/testing/selftests/net/tun.c b/tools/testing/selftests/ne= t/tun.c index 463dd98f2b80..9424d897e341 100644 --- a/tools/testing/selftests/net/tun.c +++ b/tools/testing/selftests/net/tun.c @@ -2,21 +2,37 @@ =20 #define _GNU_SOURCE =20 +#include #include #include +#include #include #include #include #include -#include +#include +#include +#include +#include +#include +#include +#include #include +#include #include #include -#include -#include +#include +#include +#include +#include =20 #include "../kselftest_harness.h" =20 +#define TUN_HWADDR_SOURCE { 0x02, 0x00, 0x00, 0x00, 0x00, 0x00 } +#define TUN_HWADDR_DEST { 0x02, 0x00, 0x00, 0x00, 0x00, 0x01 } +#define TUN_IPADDR_SOURCE htonl((172 << 24) | (17 << 16) | 0) +#define TUN_IPADDR_DEST htonl((172 << 24) | (17 << 16) | 1) + static int tun_attach(int fd, char *dev) { struct ifreq ifr; @@ -39,7 +55,7 @@ static int tun_detach(int fd, char *dev) return ioctl(fd, TUNSETQUEUE, (void *) &ifr); } =20 -static int tun_alloc(char *dev) +static int tun_alloc(char *dev, short flags) { struct ifreq ifr; int fd, err; @@ -52,7 +68,8 @@ static int tun_alloc(char *dev) =20 memset(&ifr, 0, sizeof(ifr)); 
strcpy(ifr.ifr_name, dev); - ifr.ifr_flags =3D IFF_TAP | IFF_NAPI | IFF_MULTI_QUEUE; + ifr.ifr_flags =3D flags | IFF_TAP | IFF_NAPI | IFF_NO_PI | + IFF_MULTI_QUEUE; =20 err =3D ioctl(fd, TUNSETIFF, (void *) &ifr); if (err < 0) { @@ -64,6 +81,40 @@ static int tun_alloc(char *dev) return fd; } =20 +static bool tun_add_to_bridge(int local_fd, const char *name) +{ + struct ifreq ifreq =3D { + .ifr_name =3D "xbridge", + .ifr_ifindex =3D if_nametoindex(name) + }; + + if (!ifreq.ifr_ifindex) { + perror("if_nametoindex"); + return false; + } + + if (ioctl(local_fd, SIOCBRADDIF, &ifreq)) { + perror("SIOCBRADDIF"); + return false; + } + + return true; +} + +static bool tun_set_flags(int local_fd, const char *name, short flags) +{ + struct ifreq ifreq =3D { .ifr_flags =3D flags }; + + strcpy(ifreq.ifr_name, name); + + if (ioctl(local_fd, SIOCSIFFLAGS, &ifreq)) { + perror("SIOCSIFFLAGS"); + return false; + } + + return true; +} + static int tun_delete(char *dev) { struct { @@ -102,6 +153,159 @@ static int tun_delete(char *dev) return ret; } =20 +static uint32_t tun_sum(const void *buf, size_t len) +{ + const uint16_t *sbuf =3D buf; + uint32_t sum =3D 0; + + while (len > 1) { + sum +=3D *sbuf++; + len -=3D 2; + } + + if (len) + sum +=3D *(uint8_t *)sbuf; + + return sum; +} + +static uint16_t tun_build_ip_check(uint32_t sum) +{ + return ~((sum & 0xffff) + (sum >> 16)); +} + +static uint32_t tun_build_ip_pseudo_sum(const void *iphdr) +{ + uint16_t tot_len =3D ntohs(((struct iphdr *)iphdr)->tot_len); + + return tun_sum((char *)iphdr + offsetof(struct iphdr, saddr), 8) + + htons(((struct iphdr *)iphdr)->protocol) + + htons(tot_len - sizeof(struct iphdr)); +} + +static uint32_t tun_build_ipv6_pseudo_sum(const void *ipv6hdr) +{ + return tun_sum((char *)ipv6hdr + offsetof(struct ipv6hdr, saddr), 32) + + ((struct ipv6hdr *)ipv6hdr)->payload_len + + htons(((struct ipv6hdr *)ipv6hdr)->nexthdr); +} + +static void tun_build_ethhdr(struct ethhdr *ethhdr, uint16_t proto) +{ + *ethhdr =3D 
(struct ethhdr) {
+		.h_dest = TUN_HWADDR_DEST,
+		.h_source = TUN_HWADDR_SOURCE,
+		.h_proto = htons(proto)
+	};
+}
+
+static void tun_build_iphdr(void *dest, uint16_t len, uint8_t protocol)
+{
+	struct iphdr iphdr = {
+		.ihl = sizeof(iphdr) / 4,
+		.version = 4,
+		.tot_len = htons(sizeof(iphdr) + len),
+		.ttl = 255,
+		.protocol = protocol,
+		.saddr = TUN_IPADDR_SOURCE,
+		.daddr = TUN_IPADDR_DEST
+	};
+
+	iphdr.check = tun_build_ip_check(tun_sum(&iphdr, sizeof(iphdr)));
+	memcpy(dest, &iphdr, sizeof(iphdr));
+}
+
+static void tun_build_ipv6hdr(void *dest, uint16_t len, uint8_t protocol)
+{
+	struct ipv6hdr ipv6hdr = {
+		.version = 6,
+		.payload_len = htons(len),
+		.nexthdr = protocol,
+		.saddr = {
+			.s6_addr32 = {
+				htonl(0xffff0000), 0, 0, TUN_IPADDR_SOURCE
+			}
+		},
+		.daddr = {
+			.s6_addr32 = {
+				htonl(0xffff0000), 0, 0, TUN_IPADDR_DEST
+			}
+		},
+	};
+
+	memcpy(dest, &ipv6hdr, sizeof(ipv6hdr));
+}
+
+static void tun_build_tcphdr(void *dest, uint32_t sum)
+{
+	struct tcphdr tcphdr = {
+		.source = htons(9),
+		.dest = htons(9),
+		.fin = 1,
+		.doff = sizeof(tcphdr) / 4,
+	};
+	uint32_t tcp_sum = tun_sum(&tcphdr, sizeof(tcphdr));
+
+	tcphdr.check = tun_build_ip_check(sum + tcp_sum);
+	memcpy(dest, &tcphdr, sizeof(tcphdr));
+}
+
+static void tun_build_udphdr(void *dest, uint32_t sum)
+{
+	struct udphdr udphdr = {
+		.source = htons(9),
+		.dest = htons(9),
+		.len = htons(sizeof(udphdr)),
+	};
+	uint32_t udp_sum = tun_sum(&udphdr, sizeof(udphdr));
+
+	udphdr.check = tun_build_ip_check(sum + udp_sum);
+	memcpy(dest, &udphdr, sizeof(udphdr));
+}
+
+static bool tun_vnet_hash_check(int source_fd, const int *dest_fds,
+				const void *buffer, size_t len,
+				uint8_t flags,
+				uint16_t hash_report, uint32_t hash_value)
+{
+	size_t read_len = sizeof(struct virtio_net_hdr_v1_hash) + len;
+	struct virtio_net_hdr_v1_hash *read_buffer;
+	struct virtio_net_hdr_v1_hash hdr = {
+		.hdr = {
+			.flags = flags,
+			.num_buffers = htole16(1)
+		},
+		.hash_value = htole32(hash_value),
+		.hash_report = htole16(hash_report)
+	};
+	int ret;
+	int txq = hash_report ? hash_value & 1 : 2;
+
+	if (write(source_fd, buffer, len) != len) {
+		perror("write");
+		return false;
+	}
+
+	read_buffer = malloc(read_len);
+	if (!read_buffer) {
+		perror("malloc");
+		return false;
+	}
+
+	ret = read(dest_fds[txq], read_buffer, read_len);
+	if (ret != read_len) {
+		perror("read");
+		free(read_buffer);
+		return false;
+	}
+
+	ret = !memcmp(read_buffer, &hdr, sizeof(*read_buffer)) &&
+	      !memcmp(read_buffer + 1, buffer, len);
+
+	free(read_buffer);
+	return ret;
+}
+
 FIXTURE(tun)
 {
 	char ifname[IFNAMSIZ];
@@ -112,10 +316,10 @@ FIXTURE_SETUP(tun)
 {
 	memset(self->ifname, 0, sizeof(self->ifname));
 
-	self->fd = tun_alloc(self->ifname);
+	self->fd = tun_alloc(self->ifname, 0);
 	ASSERT_GE(self->fd, 0);
 
-	self->fd2 = tun_alloc(self->ifname);
+	self->fd2 = tun_alloc(self->ifname, 0);
 	ASSERT_GE(self->fd2, 0);
 }
 
@@ -168,7 +372,7 @@ FIXTURE(tun_deleted)
 FIXTURE_SETUP(tun_deleted)
 {
 	self->ifname[0] = 0;
-	self->fd = tun_alloc(self->ifname);
+	self->fd = tun_alloc(self->ifname, 0);
 	ASSERT_LE(0, self->fd);
 
 	ASSERT_EQ(0, tun_delete(self->ifname))
@@ -233,4 +437,342 @@ TEST_F(tun_deleted, setvnethash)
 	EXPECT_EQ(EBADFD, errno);
 }
 
+FIXTURE(tun_vnet_hash)
+{
+	int local_fd;
+	int source_fd;
+	int dest_fds[3];
+};
+
+FIXTURE_SETUP(tun_vnet_hash)
+{
+	static const struct {
+		struct tun_vnet_hash hdr;
+		struct tun_vnet_hash_rss rss;
+		uint16_t rss_indirection_table[2];
+		uint8_t rss_key[40];
+	} vnet_hash = {
+		.hdr = {
+			.flags = TUN_VNET_HASH_REPORT | TUN_VNET_HASH_RSS,
+			.types = VIRTIO_NET_RSS_HASH_TYPE_IPv4 |
+				 VIRTIO_NET_RSS_HASH_TYPE_TCPv4 |
+				 VIRTIO_NET_RSS_HASH_TYPE_UDPv4 |
+				 VIRTIO_NET_RSS_HASH_TYPE_IPv6 |
+				 VIRTIO_NET_RSS_HASH_TYPE_TCPv6 |
+				 VIRTIO_NET_RSS_HASH_TYPE_UDPv6
+		},
+		.rss = { .indirection_table_mask = 1, .unclassified_queue = 5 },
+		.rss_indirection_table = { 3, 4 },
+		.rss_key = {
+			0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
+			0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
+			0xd0, 0xca, 0x2b, 0xcb, 0xae, 0x7b, 0x30, 0xb4,
+			0x77, 0xcb, 0x2d, 0xa3, 0x80, 0x30, 0xf2, 0x0c,
+			0x6a, 0x42, 0xb7, 0x3b, 0xbe, 0xac, 0x01, 0xfa
+		}
+	};
+
+	struct {
+		struct virtio_net_hdr_v1_hash vnet_hdr;
+		struct ethhdr ethhdr;
+		struct arphdr arphdr;
+		unsigned char sender_hwaddr[6];
+		uint32_t sender_ipaddr;
+		unsigned char target_hwaddr[6];
+		uint32_t target_ipaddr;
+	} __packed packet = {
+		.ethhdr = {
+			.h_source = TUN_HWADDR_SOURCE,
+			.h_dest = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff },
+			.h_proto = htons(ETH_P_ARP)
+		},
+		.arphdr = {
+			.ar_hrd = htons(ARPHRD_ETHER),
+			.ar_pro = htons(ETH_P_IP),
+			.ar_hln = ETH_ALEN,
+			.ar_pln = 4,
+			.ar_op = htons(ARPOP_REQUEST)
+		},
+		.sender_hwaddr = TUN_HWADDR_DEST,
+		.sender_ipaddr = TUN_IPADDR_DEST,
+		.target_ipaddr = TUN_IPADDR_DEST
+	};
+
+	struct tun_vnet_hash cap;
+	char source_ifname[IFNAMSIZ] = "";
+	char dest_ifname[IFNAMSIZ] = "";
+	int i;
+
+	self->local_fd = socket(AF_LOCAL, SOCK_STREAM, 0);
+	ASSERT_LE(0, self->local_fd);
+
+	self->source_fd = tun_alloc(source_ifname, 0);
+	ASSERT_LE(0, self->source_fd) {
+		EXPECT_EQ(0, close(self->local_fd));
+	}
+
+	i = ioctl(self->source_fd, TUNGETVNETHASHCAP, &cap);
+	if (i == -1 && errno == EINVAL) {
+		EXPECT_EQ(0, close(self->local_fd));
+		SKIP(return, "TUNGETVNETHASHCAP not supported");
+	}
+
+	ASSERT_EQ(0, i)
+		EXPECT_EQ(0, close(self->local_fd));
+
+	if ((cap.flags & vnet_hash.hdr.flags) != vnet_hash.hdr.flags) {
+		EXPECT_EQ(0, close(self->local_fd));
+		SKIP(return, "Lacks some hash flag support");
+	}
+
+	if ((cap.types & vnet_hash.hdr.types) != vnet_hash.hdr.types) {
+		EXPECT_EQ(0, close(self->local_fd));
+		SKIP(return, "Lacks some hash type support");
+	}
+
+	ASSERT_TRUE(tun_set_flags(self->local_fd, source_ifname, IFF_UP))
+		EXPECT_EQ(0, close(self->local_fd));
+
+	self->dest_fds[0] = tun_alloc(dest_ifname, IFF_VNET_HDR);
+	ASSERT_LE(0, self->dest_fds[0]) {
+		EXPECT_EQ(0, close(self->source_fd));
+		EXPECT_EQ(0, close(self->local_fd));
+	}
+
+	i = sizeof(struct virtio_net_hdr_v1_hash);
+	ASSERT_EQ(0, ioctl(self->dest_fds[0], TUNSETVNETHDRSZ, &i)) {
+		EXPECT_EQ(0, close(self->dest_fds[0]));
+		EXPECT_EQ(0, close(self->source_fd));
+		EXPECT_EQ(0, close(self->local_fd));
+	}
+
+	i = 1;
+	ASSERT_EQ(0, ioctl(self->dest_fds[0], TUNSETVNETLE, &i)) {
+		EXPECT_EQ(0, close(self->dest_fds[0]));
+		EXPECT_EQ(0, close(self->source_fd));
+		EXPECT_EQ(0, close(self->local_fd));
+	}
+
+	ASSERT_TRUE(tun_set_flags(self->local_fd, dest_ifname, IFF_UP)) {
+		EXPECT_EQ(0, close(self->dest_fds[0]));
+		EXPECT_EQ(0, close(self->source_fd));
+		EXPECT_EQ(0, close(self->local_fd));
+	}
+
+	ASSERT_EQ(sizeof(packet),
+		  write(self->dest_fds[0], &packet, sizeof(packet))) {
+		EXPECT_EQ(0, close(self->dest_fds[0]));
+		EXPECT_EQ(0, close(self->source_fd));
+		EXPECT_EQ(0, close(self->local_fd));
+	}
+
+	ASSERT_EQ(0, ioctl(self->dest_fds[0], TUNSETVNETHASH, &vnet_hash)) {
+		EXPECT_EQ(0, close(self->dest_fds[0]));
+		EXPECT_EQ(0, close(self->source_fd));
+		EXPECT_EQ(0, close(self->local_fd));
+	}
+
+	for (i = 1; i < ARRAY_SIZE(self->dest_fds); i++) {
+		self->dest_fds[i] = tun_alloc(dest_ifname, IFF_VNET_HDR);
+		ASSERT_LE(0, self->dest_fds[i]) {
+			while (i) {
+				i--;
+				EXPECT_EQ(0, close(self->dest_fds[i]));
+			}
+
+			EXPECT_EQ(0, close(self->source_fd));
+			EXPECT_EQ(0, close(self->local_fd));
+		}
+	}
+
+	ASSERT_EQ(0, ioctl(self->local_fd, SIOCBRADDBR, "xbridge")) {
+		EXPECT_EQ(0, ioctl(self->local_fd, SIOCBRDELBR, "xbridge"));
+
+		for (i = 0; i < ARRAY_SIZE(self->dest_fds); i++)
+			EXPECT_EQ(0, close(self->dest_fds[i]));
+
+		EXPECT_EQ(0, close(self->source_fd));
+		EXPECT_EQ(0, close(self->local_fd));
+	}
+
+	ASSERT_TRUE(tun_add_to_bridge(self->local_fd, source_ifname)) {
+		EXPECT_EQ(0, ioctl(self->local_fd, SIOCBRDELBR, "xbridge"));
+
+		for (i = 0; i < ARRAY_SIZE(self->dest_fds); i++)
+			EXPECT_EQ(0, close(self->dest_fds[i]));
+
+		EXPECT_EQ(0, close(self->source_fd));
+		EXPECT_EQ(0, close(self->local_fd));
+	}
+
+	ASSERT_TRUE(tun_add_to_bridge(self->local_fd, dest_ifname)) {
+		EXPECT_EQ(0, ioctl(self->local_fd, SIOCBRDELBR, "xbridge"));
+
+		for (i = 0; i < ARRAY_SIZE(self->dest_fds); i++)
+			EXPECT_EQ(0, close(self->dest_fds[i]));
+
+		EXPECT_EQ(0, close(self->source_fd));
+		EXPECT_EQ(0, close(self->local_fd));
+	}
+
+	ASSERT_TRUE(tun_set_flags(self->local_fd, "xbridge", IFF_UP)) {
+		EXPECT_EQ(0, ioctl(self->local_fd, SIOCBRDELBR, "xbridge"));
+
+		for (i = 0; i < ARRAY_SIZE(self->dest_fds); i++)
+			EXPECT_EQ(0, close(self->dest_fds[i]));
+
+		EXPECT_EQ(0, close(self->source_fd));
+		EXPECT_EQ(0, close(self->local_fd));
+	}
+}
+
+FIXTURE_TEARDOWN(tun_vnet_hash)
+{
+	ASSERT_TRUE(tun_set_flags(self->local_fd, "xbridge", 0)) {
+		for (size_t i = 0; i < ARRAY_SIZE(self->dest_fds); i++)
+			EXPECT_EQ(0, close(self->dest_fds[i]));
+
+		EXPECT_EQ(0, close(self->source_fd));
+		EXPECT_EQ(0, close(self->local_fd));
+	}
+
+	EXPECT_EQ(0, ioctl(self->local_fd, SIOCBRDELBR, "xbridge"));
+
+	for (size_t i = 0; i < ARRAY_SIZE(self->dest_fds); i++)
+		EXPECT_EQ(0, close(self->dest_fds[i]));
+
+	EXPECT_EQ(0, close(self->source_fd));
+	EXPECT_EQ(0, close(self->local_fd));
+}
+
+TEST_F(tun_vnet_hash, unclassified)
+{
+	struct {
+		struct ethhdr ethhdr;
+		struct iphdr iphdr;
+	} __packed packet;
+
+	tun_build_ethhdr(&packet.ethhdr, ETH_P_LOOPBACK);
+
+	EXPECT_TRUE(tun_vnet_hash_check(self->source_fd, self->dest_fds,
+					&packet, sizeof(packet), 0,
+					VIRTIO_NET_HASH_REPORT_NONE, 0));
+}
+
+TEST_F(tun_vnet_hash, ipv4)
+{
+	struct {
+		struct ethhdr ethhdr;
+		struct iphdr iphdr;
+	} __packed packet;
+
+	tun_build_ethhdr(&packet.ethhdr, ETH_P_IP);
+	tun_build_iphdr(&packet.iphdr, 0, 253);
+
+	EXPECT_TRUE(tun_vnet_hash_check(self->source_fd, self->dest_fds,
+					&packet, sizeof(packet), 0,
+					VIRTIO_NET_HASH_REPORT_IPv4,
+					0x6e45d952));
+}
+
+TEST_F(tun_vnet_hash, tcpv4)
+{
+	struct {
+		struct ethhdr ethhdr;
+		struct iphdr iphdr;
+		struct tcphdr tcphdr;
+	} __packed packet;
+
+	tun_build_ethhdr(&packet.ethhdr, ETH_P_IP);
+	tun_build_iphdr(&packet.iphdr, sizeof(struct tcphdr), IPPROTO_TCP);
+
+	tun_build_tcphdr(&packet.tcphdr,
+			 tun_build_ip_pseudo_sum(&packet.iphdr));
+
+	EXPECT_TRUE(tun_vnet_hash_check(self->source_fd, self->dest_fds,
+					&packet, sizeof(packet),
+					VIRTIO_NET_HDR_F_DATA_VALID,
+					VIRTIO_NET_HASH_REPORT_TCPv4,
+					0xfb63539a));
+}
+
+TEST_F(tun_vnet_hash, udpv4)
+{
+	struct {
+		struct ethhdr ethhdr;
+		struct iphdr iphdr;
+		struct udphdr udphdr;
+	} __packed packet;
+
+	tun_build_ethhdr(&packet.ethhdr, ETH_P_IP);
+	tun_build_iphdr(&packet.iphdr, sizeof(struct udphdr), IPPROTO_UDP);
+
+	tun_build_udphdr(&packet.udphdr,
+			 tun_build_ip_pseudo_sum(&packet.iphdr));
+
+	EXPECT_TRUE(tun_vnet_hash_check(self->source_fd, self->dest_fds,
+					&packet, sizeof(packet),
+					VIRTIO_NET_HDR_F_DATA_VALID,
+					VIRTIO_NET_HASH_REPORT_UDPv4,
+					0xfb63539a));
+}
+
+TEST_F(tun_vnet_hash, ipv6)
+{
+	struct {
+		struct ethhdr ethhdr;
+		struct ipv6hdr ipv6hdr;
+	} __packed packet;
+
+	tun_build_ethhdr(&packet.ethhdr, ETH_P_IPV6);
+	tun_build_ipv6hdr(&packet.ipv6hdr, 0, 253);
+
+	EXPECT_TRUE(tun_vnet_hash_check(self->source_fd, self->dest_fds,
+					&packet, sizeof(packet), 0,
+					VIRTIO_NET_HASH_REPORT_IPv6,
+					0xd6eb560f));
+}
+
+TEST_F(tun_vnet_hash, tcpv6)
+{
+	struct {
+		struct ethhdr ethhdr;
+		struct ipv6hdr ipv6hdr;
+		struct tcphdr tcphdr;
+	} __packed packet;
+
+	tun_build_ethhdr(&packet.ethhdr, ETH_P_IPV6);
+	tun_build_ipv6hdr(&packet.ipv6hdr, sizeof(struct tcphdr), IPPROTO_TCP);
+
+	tun_build_tcphdr(&packet.tcphdr,
+			 tun_build_ipv6_pseudo_sum(&packet.ipv6hdr));
+
+	EXPECT_TRUE(tun_vnet_hash_check(self->source_fd, self->dest_fds,
+					&packet, sizeof(packet),
+					VIRTIO_NET_HDR_F_DATA_VALID,
+					VIRTIO_NET_HASH_REPORT_TCPv6,
+					0xc2b9f251));
+}
+
+TEST_F(tun_vnet_hash, udpv6)
+{
+	struct {
+		struct ethhdr ethhdr;
+		struct ipv6hdr ipv6hdr;
+		struct udphdr udphdr;
+	} __packed packet;
+
+	tun_build_ethhdr(&packet.ethhdr, ETH_P_IPV6);
+	tun_build_ipv6hdr(&packet.ipv6hdr, sizeof(struct udphdr), IPPROTO_UDP);
+
+	tun_build_udphdr(&packet.udphdr,
+			 tun_build_ipv6_pseudo_sum(&packet.ipv6hdr));
+
+	EXPECT_TRUE(tun_vnet_hash_check(self->source_fd, self->dest_fds,
+					&packet, sizeof(packet),
+					VIRTIO_NET_HDR_F_DATA_VALID,
+					VIRTIO_NET_HASH_REPORT_UDPv6,
+					0xc2b9f251));
+}
+
 TEST_HARNESS_MAIN
-- 
2.47.1

From nobody Fri Dec 19 20:35:48 2025
From: Akihiko Odaki
Date: Thu, 09 Jan 2025 16:13:44 +0900
Subject: [PATCH v6 6/6] vhost/net: Support VIRTIO_NET_F_HASH_REPORT
X-Mailing-List: linux-kernel@vger.kernel.org
Message-Id: <20250109-rss-v6-6-b1c90ad708f6@daynix.com>
References: <20250109-rss-v6-0-b1c90ad708f6@daynix.com>
In-Reply-To: <20250109-rss-v6-0-b1c90ad708f6@daynix.com>
To: Jonathan Corbet, Willem de Bruijn, Jason Wang, "David S. Miller",
 Eric Dumazet, Jakub Kicinski, Paolo Abeni, "Michael S. Tsirkin",
 Xuan Zhuo, Shuah Khan, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
 kvm@vger.kernel.org, virtualization@lists.linux-foundation.org,
 linux-kselftest@vger.kernel.org, Yuri Benditovich, Andrew Melnychenko,
 Stephen Hemminger, gur.stavi@huawei.com, Akihiko Odaki

VIRTIO_NET_F_HASH_REPORT allows reporting hash values calculated on the host.
When VHOST_NET_F_VIRTIO_NET_HDR is employed, it will report no hash
values (i.e., the hash_report member is always set to
VIRTIO_NET_HASH_REPORT_NONE). Otherwise, the values reported by the
underlying socket will be reported.

VIRTIO_NET_F_HASH_REPORT requires VIRTIO_F_VERSION_1.

Signed-off-by: Akihiko Odaki
Tested-by: Lei Yang
---
 drivers/vhost/net.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 9ad37c012189..ed1bf01a7fcf 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -73,6 +73,7 @@ enum {
 	VHOST_NET_FEATURES = VHOST_FEATURES |
 			 (1ULL << VHOST_NET_F_VIRTIO_NET_HDR) |
 			 (1ULL << VIRTIO_NET_F_MRG_RXBUF) |
+			 (1ULL << VIRTIO_NET_F_HASH_REPORT) |
 			 (1ULL << VIRTIO_F_ACCESS_PLATFORM) |
 			 (1ULL << VIRTIO_F_RING_RESET)
 };
@@ -1604,10 +1605,13 @@ static int vhost_net_set_features(struct vhost_net *n, u64 features)
 	size_t vhost_hlen, sock_hlen, hdr_len;
 	int i;
 
-	hdr_len = (features & ((1ULL << VIRTIO_NET_F_MRG_RXBUF) |
-			       (1ULL << VIRTIO_F_VERSION_1))) ?
-			sizeof(struct virtio_net_hdr_mrg_rxbuf) :
-			sizeof(struct virtio_net_hdr);
+	if (features & (1ULL << VIRTIO_NET_F_HASH_REPORT))
+		hdr_len = sizeof(struct virtio_net_hdr_v1_hash);
+	else if (features & ((1ULL << VIRTIO_NET_F_MRG_RXBUF) |
+			     (1ULL << VIRTIO_F_VERSION_1)))
+		hdr_len = sizeof(struct virtio_net_hdr_mrg_rxbuf);
+	else
+		hdr_len = sizeof(struct virtio_net_hdr);
 	if (features & (1 << VHOST_NET_F_VIRTIO_NET_HDR)) {
 		/* vhost provides vnet_hdr */
 		vhost_hlen = hdr_len;
@@ -1688,6 +1692,10 @@ static long vhost_net_ioctl(struct file *f, unsigned int ioctl,
 			return -EFAULT;
 		if (features & ~VHOST_NET_FEATURES)
 			return -EOPNOTSUPP;
+		if ((features & ((1ULL << VIRTIO_F_VERSION_1) |
+				 (1ULL << VIRTIO_NET_F_HASH_REPORT))) ==
+		    (1ULL << VIRTIO_NET_F_HASH_REPORT))
+			return -EINVAL;
 		return vhost_net_set_features(n, features);
 	case VHOST_GET_BACKEND_FEATURES:
 		features = VHOST_NET_BACKEND_FEATURES;
-- 
2.47.1
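
As a rough illustration of the two checks this patch adds to
drivers/vhost/net.c, the header-size selection and the feature
validation can be modelled in plain userspace C. This is only a sketch
under assumptions: the model_* structs and functions and the MODEL_*
macros are made-up names, the bit numbers are copied by hand and assumed
to match the virtio UAPI headers, and the structs merely imitate the
field layout of the real virtio_net_hdr variants. It is not the kernel
code itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Feature bit numbers, assumed to match the Linux virtio UAPI headers. */
#define MODEL_VIRTIO_NET_F_MRG_RXBUF	15
#define MODEL_VIRTIO_F_VERSION_1	32
#define MODEL_VIRTIO_NET_F_HASH_REPORT	57

#define MODEL_BIT(n)	(1ULL << (n))

/* Hypothetical stand-in for struct virtio_net_hdr. */
struct model_net_hdr {
	uint8_t flags;
	uint8_t gso_type;
	uint16_t hdr_len;
	uint16_t gso_size;
	uint16_t csum_start;
	uint16_t csum_offset;
};

/* Hypothetical stand-in for struct virtio_net_hdr_mrg_rxbuf. */
struct model_net_hdr_mrg_rxbuf {
	struct model_net_hdr hdr;
	uint16_t num_buffers;
};

/* Hypothetical stand-in for struct virtio_net_hdr_v1_hash. */
struct model_net_hdr_v1_hash {
	struct model_net_hdr_mrg_rxbuf hdr;
	uint32_t hash_value;
	uint16_t hash_report;
	uint16_t padding;
};

/*
 * Mirrors the check added to the VHOST_SET_FEATURES path:
 * HASH_REPORT without VERSION_1 is rejected (returns 0 here,
 * where the patch returns -EINVAL).
 */
static int model_features_valid(uint64_t features)
{
	uint64_t mask = MODEL_BIT(MODEL_VIRTIO_F_VERSION_1) |
			MODEL_BIT(MODEL_VIRTIO_NET_F_HASH_REPORT);

	return (features & mask) != MODEL_BIT(MODEL_VIRTIO_NET_F_HASH_REPORT);
}

/*
 * Mirrors the hdr_len selection in vhost_net_set_features():
 * HASH_REPORT takes priority, then MRG_RXBUF or VERSION_1,
 * then the legacy header.
 */
static size_t model_hdr_len(uint64_t features)
{
	if (features & MODEL_BIT(MODEL_VIRTIO_NET_F_HASH_REPORT))
		return sizeof(struct model_net_hdr_v1_hash);

	if (features & (MODEL_BIT(MODEL_VIRTIO_NET_F_MRG_RXBUF) |
			MODEL_BIT(MODEL_VIRTIO_F_VERSION_1)))
		return sizeof(struct model_net_hdr_mrg_rxbuf);

	return sizeof(struct model_net_hdr);
}
```

Under these assumptions, model_hdr_len() returns
sizeof(struct model_net_hdr_v1_hash) whenever the HASH_REPORT bit is
set, and model_features_valid() rejects HASH_REPORT without VERSION_1,
which corresponds to the -EINVAL path in the ioctl handler.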