Currently, IPVS skips MTU checks for GSO packets by excluding them with
the !skb_is_gso(skb) condition. This creates problems when IPVS tunnel
mode encapsulates GSO packets with IPIP headers.

The issue manifests in two ways:

1. MTU violation after encapsulation:
   When a GSO packet passes through IPVS tunnel mode, the original MTU
   check is bypassed. After adding the IPIP tunnel header, the packet
   size may exceed the outgoing interface MTU, leading to unexpected
   fragmentation at the IP layer.

2. Fragmentation with problematic IP IDs:
   When net.ipv4.vs.pmtu_disc=1 and a GSO packet with multiple segments
   is fragmented after encapsulation, each segment gets a sequentially
   incremented IP ID (0, 1, 2, ...). This happens because:

   a) The GSO packet bypasses MTU check and gets encapsulated
   b) At __ip_finish_output, the oversized GSO packet is split into
      separate SKBs (one per segment), with IP IDs incrementing
   c) Each SKB is then fragmented again based on the actual MTU

   This sequential IP ID allocation differs from the expected behavior
   and can cause issues with fragment reassembly and packet tracking.
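
The two-stage split in steps a) to c) can be sketched as plain arithmetic
in userspace C. This is only an illustrative model, not kernel code: the
names frags_needed() and wire_packets_after_ipip() are hypothetical, the
outer header is assumed to be a 20-byte IPv4 header without options, and
the model assumes the IP layer is allowed to fragment the outer packet.

```c
/* Hypothetical userspace model; none of these names exist in the kernel. */
#define OUTER_IP_HLEN 20u	/* assumed IPIP outer IPv4 header, no options */

/*
 * Number of IPv4 fragments needed for one packet of pkt_len bytes
 * (including its hdr_len-byte IP header) at the given MTU. Every
 * fragment except the last carries a payload that is a multiple of 8.
 */
static unsigned int frags_needed(unsigned int pkt_len, unsigned int hdr_len,
				 unsigned int mtu)
{
	unsigned int payload, per_frag;

	if (pkt_len <= mtu)
		return 1;
	if (mtu < hdr_len + 8)
		return 0;	/* MTU too small to carry any payload */

	payload = pkt_len - hdr_len;
	per_frag = ((mtu - hdr_len) / 8) * 8;
	return (payload + per_frag - 1) / per_frag;
}

/*
 * Packets on the wire for a GSO skb with nsegs segments, each
 * inner_seg_len bytes long at the inner IP level: first one skb per
 * segment (step b), then IP fragmentation of each one (step c).
 */
static unsigned int wire_packets_after_ipip(unsigned int nsegs,
					    unsigned int inner_seg_len,
					    unsigned int mtu)
{
	unsigned int outer_len = inner_seg_len + OUTER_IP_HLEN;

	return nsegs * frags_needed(outer_len, OUTER_IP_HLEN, mtu);
}
```

Under these assumptions, four 1500-byte inner segments sent over a
1500-byte MTU become 1520 bytes each after encapsulation, so every
segment is fragmented in two and eight packets reach the wire instead
of four.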

Fix this by removing the GSO packet exception from the MTU check and
properly validating GSO packets using skb_gso_validate_network_len().
This function correctly validates whether the GSO segments will fit
within the MTU after segmentation. If validation fails, send an ICMP
Fragmentation Needed message to enable proper PMTU discovery.

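The decision described above can be modelled in userspace C as follows.
This is a rough sketch under stated assumptions, not the kernel
implementation: struct gso_model and the model_*() helpers are
hypothetical stand-ins, the per-segment length is taken to be network
header plus transport header plus gso_size, and mtu stands for whatever
value the caller passes to the check.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the few skb fields that matter here. */
struct gso_model {
	unsigned int net_hdr_len;	/* network header, e.g. 20 for IPv4 */
	unsigned int trans_hdr_len;	/* transport header, e.g. 20 for TCP */
	unsigned int gso_size;		/* payload bytes per segment */
};

enum verdict { VERDICT_PASS, VERDICT_FRAG_NEEDED };

/* Length of one segment, measured from the start of the network header. */
static unsigned int model_network_seglen(const struct gso_model *p)
{
	return p->net_hdr_len + p->trans_hdr_len + p->gso_size;
}

/* Rough model of the per-segment check: true if each segment fits. */
static bool model_validate_network_len(const struct gso_model *p,
					unsigned int mtu)
{
	return model_network_seglen(p) <= mtu;
}

/*
 * Model of the patched branch for a packet with DF set: a GSO packet
 * longer than the MTU still passes if its individual segments fit;
 * otherwise the sender is told that fragmentation is needed.
 */
static enum verdict model_mtu_check(unsigned int skb_len, bool is_gso,
				    const struct gso_model *p,
				    unsigned int mtu)
{
	if (skb_len <= mtu)
		return VERDICT_PASS;
	if (is_gso && model_validate_network_len(p, mtu))
		return VERDICT_PASS;
	return VERDICT_FRAG_NEEDED;
}
```

In this model a 64000-byte GSO packet with 1480-byte segments passes at
an MTU of 1480 and is rejected at an MTU of 1460, while a non-GSO packet
is judged on its total length alone.
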
Fixes: 4cdd34084d53 ("netfilter: nf_conntrack_ipv6: improve fragmentation handling")
Signed-off-by: Yingnan Zhang <342144303@qq.com>
---
net/netfilter/ipvs/ip_vs_xmit.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/net/netfilter/ipvs/ip_vs_xmit.c b/net/netfilter/ipvs/ip_vs_xmit.c
index 3601eb86d..82f2e7a32 100644
--- a/net/netfilter/ipvs/ip_vs_xmit.c
+++ b/net/netfilter/ipvs/ip_vs_xmit.c
@@ -232,8 +232,15 @@ static inline bool ensure_mtu_is_adequate(struct netns_ipvs *ipvs, int skb_af,
 			return true;
 
 		if (unlikely(ip_hdr(skb)->frag_off & htons(IP_DF) &&
-			     skb->len > mtu && !skb_is_gso(skb) &&
+			     skb->len > mtu &&
 			     !ip_vs_iph_icmp(ipvsh))) {
+			if (skb_is_gso(skb)) {
+				if (skb_gso_validate_network_len(skb, mtu))
+					return true;
+				icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, htonl(mtu));
+				IP_VS_DBG(1, "frag needed for %pI4\n", &ip_hdr(skb)->saddr);
+				return false;
+			}
 			icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
 				  htonl(mtu));
 			IP_VS_DBG(1, "frag needed for %pI4\n",
--
2.51.0
Hello,
On Wed, 1 Apr 2026, Yingnan Zhang wrote:
> Currently, IPVS skips MTU checks for GSO packets by excluding them with
> the !skb_is_gso(skb) condition. This creates problems when IPVS tunnel
> mode encapsulates GSO packets with IPIP headers.
>
> The issue manifests in two ways:
>
> 1. MTU violation after encapsulation:
> When a GSO packet passes through IPVS tunnel mode, the original MTU
> check is bypassed. After adding the IPIP tunnel header, the packet
> size may exceed the outgoing interface MTU, leading to unexpected
> fragmentation at the IP layer.
>
> 2. Fragmentation with problematic IP IDs:
> When net.ipv4.vs.pmtu_disc=1 and a GSO packet with multiple segments
> is fragmented after encapsulation, each segment gets a sequentially
> incremented IP ID (0, 1, 2, ...). This happens because:
>
> a) The GSO packet bypasses MTU check and gets encapsulated
> b) At __ip_finish_output, the oversized GSO packet is split into
> separate SKBs (one per segment), with IP IDs incrementing
> c) Each SKB is then fragmented again based on the actual MTU
>
> This sequential IP ID allocation differs from the expected behavior
> and can cause issues with fragment reassembly and packet tracking.
>
> Fix this by removing the GSO packet exception from the MTU check and
> properly validating GSO packets using skb_gso_validate_network_len().
> This function correctly validates whether the GSO segments will fit
> within the MTU after segmentation. If validation fails, send an ICMP
> Fragmentation Needed message to enable proper PMTU discovery.
>
> Fixes: 4cdd34084d53 ("netfilter: nf_conntrack_ipv6: improve fragmentation handling")
> Signed-off-by: Yingnan Zhang <342144303@qq.com>
> ---
> net/netfilter/ipvs/ip_vs_xmit.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/net/netfilter/ipvs/ip_vs_xmit.c b/net/netfilter/ipvs/ip_vs_xmit.c
> index 3601eb86d..82f2e7a32 100644
> --- a/net/netfilter/ipvs/ip_vs_xmit.c
> +++ b/net/netfilter/ipvs/ip_vs_xmit.c
> @@ -232,8 +232,15 @@ static inline bool ensure_mtu_is_adequate(struct netns_ipvs *ipvs, int skb_af,
> return true;
>
> if (unlikely(ip_hdr(skb)->frag_off & htons(IP_DF) &&
> - skb->len > mtu && !skb_is_gso(skb) &&
> + skb->len > mtu &&
> !ip_vs_iph_icmp(ipvsh))) {
> + if (skb_is_gso(skb)) {
> + if (skb_gso_validate_network_len(skb, mtu))
> + return true;

	Should we add the same function call in
__mtu_check_toobig_v6() for IPv6? Compare it with
net/ipv6/ip6_output.c:ip6_pkt_too_big()...

> + icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, htonl(mtu));
> + IP_VS_DBG(1, "frag needed for %pI4\n", &ip_hdr(skb)->saddr);
> + return false;
> + }
> icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
> htonl(mtu));
> IP_VS_DBG(1, "frag needed for %pI4\n",
> --
> 2.51.0
Regards
--
Julian Anastasov <ja@ssi.bg>