From nobody Tue Feb 10 02:49:22 2026
From: Albert Huang <huangjie.albert@bytedance.com>
To: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
        pabeni@redhat.com
Cc: Albert Huang <huangjie.albert@bytedance.com>,
        Alexei Starovoitov <ast@kernel.org>,
        Daniel Borkmann <daniel@iogearbox.net>,
        Jesper Dangaard Brouer <hawk@kernel.org>,
        John Fastabend <john.fastabend@gmail.com>,
        =?UTF-8?q?Bj=C3=B6rn=20T=C3=B6pel?= <bjorn@kernel.org>,
        Magnus Karlsson <magnus.karlsson@intel.com>,
        Maciej Fijalkowski <maciej.fijalkowski@intel.com>,
        Jonathan Lemon <jonathan.lemon@gmail.com>,
        Pavel Begunkov <asml.silence@gmail.com>,
        Yunsheng Lin <linyunsheng@huawei.com>,
        Kees Cook <keescook@chromium.org>,
        Richard Gobert <richardbgobert@gmail.com>,
        "open list:NETWORKING DRIVERS" <netdev@vger.kernel.org>,
        open list <linux-kernel@vger.kernel.org>,
        "open list:XDP (eXpress Data Path)" <bpf@vger.kernel.org>
Subject: [RFC v3 Optimizing veth xsk performance 8/9] veth: af_xdp tx batch
 support for ipv4 udp
Date: Tue,  8 Aug 2023 11:19:12 +0800
Message-Id: <20230808031913.46965-9-huangjie.albert@bytedance.com>
X-Mailer: git-send-email 2.37.1 (Apple Git-137.1)
In-Reply-To: <20230808031913.46965-1-huangjie.albert@bytedance.com>
References: <20230808031913.46965-1-huangjie.albert@bytedance.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

A typical topology is shown below:
veth<--------veth-peer
        1       |
                |2
                |
              bridge<------->eth0(such as mlnx5 NIC)

When AF_XDP is used to send packets from a veth device to a physical
NIC, every packet has to traverse several software paths (veth peer,
bridge) before it reaches the NIC. We can borrow the idea behind
kernel GSO: when AF_XDP transmits from veth, aggregate consecutive
packets of the same flow and pass one large packet from the veth
virtual NIC towards the physical NIC.
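
As a rough illustration of the aggregation rule, the user-space sketch
below counts how many aggregated packets come out of a packet sequence
when consecutive packets with an identical key are merged, up to a
per-packet segment limit. struct flow_key, flow_key_equal() and
count_batches() are hypothetical names used only for this example; it
is a simplified model, and the driver code in the diff stays the
reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model of the per-flow key (not a kernel struct). */
struct flow_key {
	uint8_t  protocol;
	uint32_t saddr, daddr;
	uint16_t sport, dport;
	uint16_t udp_len;	/* only equal-sized datagrams are merged */
};

static bool flow_key_equal(const struct flow_key *a, const struct flow_key *b)
{
	return a->protocol == b->protocol &&
	       a->saddr == b->saddr && a->daddr == b->daddr &&
	       a->sport == b->sport && a->dport == b->dport &&
	       a->udp_len == b->udp_len;
}

/* Return the number of aggregated packets produced from n input packets
 * when consecutive packets with an equal key are merged and one
 * aggregate carries at most max_segs segments.
 */
static size_t count_batches(const struct flow_key *pkts, size_t n,
			    size_t max_segs)
{
	size_t batches = 0, segs = 0;

	assert(pkts || n == 0);

	for (size_t i = 0; i < n; i++) {
		bool same = i > 0 && flow_key_equal(&pkts[i - 1], &pkts[i]);

		/* start a new aggregate on a key change or a full batch */
		if (!same || segs >= max_segs) {
			batches++;
			segs = 0;
		}
		segs++;
	}
	return batches;
}
```

With five packets A A A B A (B differs from A only in the destination
port) and a limit of 17 segments, the model yields three aggregates:
AAA, B and A.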

Performance (tested with the libxdp library):
AF_XDP without batch : 480 Kpps (with ksoftirqd 100% cpu)
AF_XDP  with   batch : 1.5 Mpps (with ksoftirqd 15% cpu)

With AF_XDP batching enabled, the libxdp user-space program becomes
the bottleneck, so ksoftirqd no longer runs at 100% CPU.

Signed-off-by: Albert Huang <huangjie.albert@bytedance.com>
---
 drivers/net/veth.c | 408 ++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 387 insertions(+), 21 deletions(-)
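
Note on the header fix-up done before a batch is handed to the peer:
once N equal-sized UDP payloads share one IPv4/UDP header, the IP
total length, the UDP length and the IPv4 header checksum have to be
rewritten. Below is a minimal user-space sketch of that arithmetic,
assuming an 8-byte UDP header. agg_lengths() and ipv4_hdr_csum() are
hypothetical names, not kernel functions; the patch itself relies on
ip_send_check() and udp4_hwcsum().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define UDP_HDR_LEN 8u

/* Hypothetical result type: lengths to store in the merged headers. */
struct agg_len {
	uint16_t ip_tot_len;	/* IPv4 header + UDP header + all payloads */
	uint16_t udp_len;	/* UDP header + all payloads */
};

/* Compute the merged lengths; returns false if they do not fit in the
 * 16-bit IPv4 total-length field.
 */
static bool agg_lengths(size_t ip_hdr_len, size_t payload_len, size_t segs,
			struct agg_len *out)
{
	size_t udp_len = UDP_HDR_LEN + payload_len * segs;
	size_t tot_len = ip_hdr_len + udp_len;

	assert(out != NULL);
	if (tot_len > UINT16_MAX)
		return false;

	out->ip_tot_len = (uint16_t)tot_len;
	out->udp_len = (uint16_t)udp_len;
	return true;
}

/* RFC 1071 ones'-complement checksum over an IPv4 header of ihl_words
 * 32-bit words. The checksum field must be zero on entry. The result
 * is returned in host byte order.
 */
static uint16_t ipv4_hdr_csum(const uint8_t *hdr, size_t ihl_words)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < ihl_words * 4; i += 2)
		sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];

	/* fold the carries back into the low 16 bits */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}
```

For a 20-byte IPv4 header and 17 payloads of 100 bytes each, this
gives udp_len = 1708 and ip_tot_len = 1728.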

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index ac78d6a87416..70489d017b51 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -29,6 +29,7 @@
 #include <net/page_pool.h>
 #include <net/xdp_sock_drv.h>
 #include <net/xdp.h>
+#include <net/udp.h>
=20
 #define DRV_NAME	"veth"
 #define DRV_VERSION	"1.0"
@@ -103,6 +104,23 @@ struct veth_xdp_tx_bq {
 	unsigned int count;
 };
=20
+struct veth_batch_tuple {
+	__u8	protocol;
+	__be32	saddr;
+	__be32	daddr;
+	__be16	source;
+	__be16	dest;
+	u16	batch_size;
+	u16	batch_segs;
+	bool    batch_enable;
+	bool    batch_flush;
+};
+
+struct veth_seg_info {
+	u32 segs;
+	u64 desc[] ____cacheline_aligned_in_smp;
+};
+
 /*
  * ethtool interface
  */
@@ -1078,11 +1096,340 @@ static struct sk_buff *veth_build_skb(void *head, =
int headroom, int len,
 	return skb;
 }
=20
+static void veth_xsk_destruct_skb(struct sk_buff *skb)
+{
+	struct skb_shared_info *si =3D skb_shinfo(skb);
+	struct xsk_buff_pool *pool =3D (struct xsk_buff_pool *)si->destructor_arg=
_xsk_pool;
+	struct veth_seg_info *seg_info =3D (struct veth_seg_info *)si->destructor=
_arg;
+	unsigned long flags;
+	u32 index =3D 0;
+	u64 addr;
+
+	/* release cq */
+	spin_lock_irqsave(&pool->cq_lock, flags);
+	for (index =3D 0; index < seg_info->segs; index++) {
+		addr =3D (u64)(long)seg_info->desc[index];
+		xsk_tx_completed_addr(pool, addr);
+	}
+	spin_unlock_irqrestore(&pool->cq_lock, flags);
+
+	kfree(seg_info);
+	si->destructor_arg =3D NULL;
+	si->destructor_arg_xsk_pool =3D NULL;
+}
+
+static struct sk_buff *veth_build_gso_head_skb(struct net_device *dev,
+					       char *buff, u32 tot_len,
+					       u32 headroom, u32 iph_len,
+					       u32 th_len)
+{
+	struct sk_buff *skb =3D NULL;
+	int err =3D 0;
+
+	skb =3D alloc_skb(tot_len, GFP_ATOMIC);
+	if (unlikely(!skb))
+		return NULL;
+
+	/* header room contains the eth header */
+	skb_reserve(skb, headroom - ETH_HLEN);
+	skb_put(skb, ETH_HLEN + iph_len + th_len);
+	skb_shinfo(skb)->gso_segs =3D 0;
+
+	err =3D skb_store_bits(skb, 0, buff, ETH_HLEN + iph_len + th_len);
+	if (unlikely(err)) {
+		kfree_skb(skb);
+		return NULL;
+	}
+
+	skb->protocol =3D eth_type_trans(skb, dev);
+	skb->network_header =3D skb->mac_header + ETH_HLEN;
+	skb->transport_header =3D skb->network_header + iph_len;
+	skb->ip_summed =3D CHECKSUM_PARTIAL;
+
+	return skb;
+}
+
+/* Only IPv4 UDP flows are matched.
+ * TODO: TCP and IPv6.
+ */
+static inline bool veth_segment_match(struct veth_batch_tuple *tuple,
+				      struct iphdr *iph, struct udphdr *udph)
+{
+	if (tuple->protocol =3D=3D iph->protocol &&
+	    tuple->saddr =3D=3D iph->saddr &&
+	    tuple->daddr =3D=3D iph->daddr &&
+	    tuple->source =3D=3D udph->source &&
+	    tuple->dest =3D=3D udph->dest &&
+	    tuple->batch_size =3D=3D ntohs(udph->len)) {
+		tuple->batch_flush =3D false;
+		return true;
+	}
+
+	tuple->batch_flush =3D true;
+	return false;
+}
+
+static inline void veth_tuple_init(struct veth_batch_tuple *tuple,
+				   struct iphdr *iph, struct udphdr *udph)
+{
+	tuple->protocol =3D iph->protocol;
+	tuple->saddr =3D iph->saddr;
+	tuple->daddr =3D iph->daddr;
+	tuple->source =3D udph->source;
+	tuple->dest =3D udph->dest;
+	tuple->batch_flush =3D false;
+	tuple->batch_size =3D ntohs(udph->len);
+	tuple->batch_segs =3D 0;
+}
+
+static inline bool veth_batch_ip_check_v4(struct iphdr *iph, u32 len)
+{
+	if (len <=3D (ETH_HLEN + sizeof(*iph)))
+		return false;
+
+	if (iph->ihl < 5 || iph->version !=3D 4 || len < (iph->ihl * 4 + ETH_HLEN=
))
+		return false;
+
+	return true;
+}
+
+static struct sk_buff *veth_build_skb_batch_udp(struct net_device *dev,
+						struct xsk_buff_pool *pool,
+						struct xdp_desc *desc,
+						struct veth_batch_tuple *tuple,
+						struct sk_buff *prev_skb)
+{
+	u32 hr, len, ts, index, iph_len, th_len, data_offset, data_len, tot_len;
+	struct veth_seg_info *seg_info;
+	void *buffer;
+	struct udphdr *udph;
+	struct iphdr *iph;
+	struct sk_buff *skb;
+	struct page *page;
+	u32 seg_len =3D 0;
+	int hh_len =3D 0;
+	u64 addr;
+
+	addr =3D desc->addr;
+	len =3D desc->len;
+
+	/* l2 reserved len */
+	hh_len =3D LL_RESERVED_SPACE(dev);
+	hr =3D max(NET_SKB_PAD, L1_CACHE_ALIGN(hh_len));
+
+	/* data points to eth header */
+	buffer =3D (unsigned char *)xsk_buff_raw_get_data(pool, addr);
+
+	iph =3D (struct iphdr *)(buffer + ETH_HLEN);
+	iph_len =3D iph->ihl * 4;
+
+	udph =3D (struct udphdr *)(buffer + ETH_HLEN + iph_len);
+	th_len =3D sizeof(struct udphdr);
+
+	if (tuple->batch_flush)
+		veth_tuple_init(tuple, iph, udph);
+
+	ts =3D pool->unaligned ? len : pool->chunk_size;
+
+	data_offset =3D offset_in_page(buffer) + ETH_HLEN + iph_len + th_len;
+	data_len =3D len - (ETH_HLEN + iph_len + th_len);
+
+	/* head is null or this is a new 5 tuple */
+	if (!prev_skb || !veth_segment_match(tuple, iph, udph)) {
+		tot_len =3D hr + iph_len + th_len;
+		skb =3D veth_build_gso_head_skb(dev, buffer, tot_len, hr, iph_len, th_le=
n);
+		if (!skb) {
+			/* to do: handle here for skb */
+			return NULL;
+		}
+
+		/* store information for gso */
+		seg_len =3D struct_size(seg_info, desc, MAX_SKB_FRAGS);
+		seg_info =3D kmalloc(seg_len, GFP_ATOMIC);
+		if (!seg_info) {
+			/* to do */
+			kfree_skb(skb);
+			return NULL;
+		}
+	} else {
+		skb =3D prev_skb;
+		skb_shinfo(skb)->gso_type =3D SKB_GSO_UDP_L4 | SKB_GSO_PARTIAL;
+		skb_shinfo(skb)->gso_size =3D data_len;
+		skb->ip_summed =3D CHECKSUM_PARTIAL;
+
+		/* max segment is MAX_SKB_FRAGS */
+		if (skb_shinfo(skb)->gso_segs >=3D MAX_SKB_FRAGS - 1)
+			tuple->batch_flush =3D true;
+
+		seg_info =3D (struct veth_seg_info *)skb_shinfo(skb)->destructor_arg;
+	}
+
+	/* offset in umem pool buffer */
+	addr =3D buffer - pool->addrs;
+
+	/* get the page of the desc */
+	page =3D pool->umem->pgs[addr >> PAGE_SHIFT];
+
+	/* take a page reference so kfree_skb() does not free the umem page */
+	get_page(page);
+
+	/* the desc data must not span two pages */
+	skb_fill_page_desc(skb, skb_shinfo(skb)->gso_segs, page, data_offset, dat=
a_len);
+
+	skb->len +=3D data_len;
+	skb->data_len +=3D data_len;
+	skb->truesize +=3D ts;
+	skb->dev =3D dev;
+
+	/* later we will support gso for this */
+	index =3D skb_shinfo(skb)->gso_segs;
+	seg_info->desc[index] =3D desc->addr;
+	seg_info->segs =3D ++index;
+	skb_shinfo(skb)->gso_segs++;
+
+	skb_shinfo(skb)->destructor_arg =3D (void *)(long)seg_info;
+	skb_shinfo(skb)->destructor_arg_xsk_pool =3D (void *)(long)pool;
+	skb->destructor =3D veth_xsk_destruct_skb;
+
+	/* to do:
+	 *  add skb to sock. may be there is no need to do for this
+	 *  and this might be multiple xsk sockets involved, so it's
+	 *  difficult to determine which socket is sending the data.
+	 *  refcount_add(ts, &xs->sk.sk_wmem_alloc);
+	 */
+	return skb;
+}
+
+static inline struct sk_buff *veth_build_skb_def(struct net_device *dev,
+						 struct xsk_buff_pool *pool, struct xdp_desc *desc)
+{
+	struct sk_buff *skb =3D NULL;
+	struct page *page;
+	void *buffer;
+	void *vaddr;
+
+	page =3D dev_alloc_page();
+	if (!page)
+		return NULL;
+
+	buffer =3D (unsigned char *)xsk_buff_raw_get_data(pool, desc->addr);
+
+	vaddr =3D page_to_virt(page);
+	memcpy(vaddr + pool->headroom, buffer, desc->len);
+	skb =3D veth_build_skb(vaddr, pool->headroom, desc->len, PAGE_SIZE);
+	if (!skb) {
+		put_page(page);
+		return NULL;
+	}
+
+	skb->protocol =3D eth_type_trans(skb, dev);
+
+	return skb;
+}
+
+/* The following conditions must be met to call this function:
+ * 1. The packet must be a standard Ethernet packet.
+ * 2. The packet must support batch sending.
+ */
+static inline struct sk_buff *veth_build_skb_batch_v4(struct net_device *d=
ev,
+						      struct xsk_buff_pool *pool,
+						      struct xdp_desc *desc,
+						      struct veth_batch_tuple *tuple,
+						      struct sk_buff *prev_skb)
+{
+	struct iphdr *iph;
+	void *buffer;
+	u64 addr;
+
+	addr =3D desc->addr;
+	buffer =3D (unsigned char *)xsk_buff_raw_get_data(pool, addr);
+	iph =3D (struct iphdr *)(buffer + ETH_HLEN);
+	if (!veth_batch_ip_check_v4(iph, desc->len))
+		goto normal;
+
+	switch (iph->protocol) {
+	case IPPROTO_UDP:
+		return veth_build_skb_batch_udp(dev, pool, desc, tuple, prev_skb);
+	default:
+		break;
+	}
+normal:
+	tuple->batch_enable =3D false;
+	return veth_build_skb_def(dev, pool, desc);
+}
+
+/* Zero copy requires the following conditions:
+ * 1. The data of the tx desc must lie within one page.
+ * 2. The tx desc must support batch xmit, which is set by userspace.
+ */
+static inline bool veth_batch_desc_check(void *buff, u32 len)
+{
+	u32 offset;
+
+	offset =3D offset_in_page(buff);
+	if (PAGE_SIZE - offset < len)
+		return false;
+
+	return true;
+}
+
+/* only IPv4 is batched for now; everything else takes the normal path */
+static inline struct sk_buff *veth_build_skb_batch(struct net_device *dev,
+						   struct xsk_buff_pool *pool,
+						   struct xdp_desc *desc,
+						   struct veth_batch_tuple *tuple,
+						   struct sk_buff *prev_skb)
+{
+	const struct ethhdr *eth;
+	void *buffer;
+
+	buffer =3D xsk_buff_raw_get_data(pool, desc->addr);
+	if (!veth_batch_desc_check(buffer, desc->len))
+		goto normal;
+
+	eth =3D (struct ethhdr *)buffer;
+	switch (ntohs(eth->h_proto)) {
+	case ETH_P_IP:
+		tuple->batch_enable =3D true;
+		return veth_build_skb_batch_v4(dev, pool, desc, tuple, prev_skb);
+	/* to do: not support yet, just build skb, no batch */
+	case ETH_P_IPV6:
+		fallthrough;
+	default:
+		break;
+	}
+
+normal:
+	tuple->batch_flush =3D false;
+	tuple->batch_enable =3D false;
+	return veth_build_skb_def(dev, pool, desc);
+}
+
+/* Only IPv4 UDP batching is supported.
+ * TODO: IPv4 TCP and IPv6.
+ */
+static inline void veth_skb_batch_checksum(struct sk_buff *skb)
+{
+	struct iphdr *iph =3D ip_hdr(skb);
+	struct udphdr *uh =3D udp_hdr(skb);
+	int ip_tot_len =3D skb->len;
+	int udp_len =3D skb->len - (skb->transport_header - skb->network_header);
+
+	iph->tot_len =3D htons(ip_tot_len);
+	ip_send_check(iph);
+	uh->len =3D htons(udp_len);
+	uh->check =3D 0;
+
+	udp4_hwcsum(skb, iph->saddr, iph->daddr);
+}
+
 static int veth_xsk_tx_xmit(struct veth_sq *sq, struct xsk_buff_pool *xsk_=
pool, int budget)
 {
 	struct veth_priv *priv, *peer_priv;
 	struct net_device *dev, *peer_dev;
+	struct veth_batch_tuple tuple;
 	struct veth_stats stats =3D {};
+	struct sk_buff *prev_skb =3D NULL;
 	struct sk_buff *skb =3D NULL;
 	struct veth_rq *peer_rq;
 	struct xdp_desc desc;
@@ -1093,24 +1440,23 @@ static int veth_xsk_tx_xmit(struct veth_sq *sq, str=
uct xsk_buff_pool *xsk_pool,
 	peer_dev =3D priv->peer;
 	peer_priv =3D netdev_priv(peer_dev);
=20
-	/* todo: queue index must set before this */
+	/* queue_index is set when NAPI is enabled.
+	 * TODO: maybe select the rq by 5-tuple or hash instead.
+	 */
 	peer_rq =3D &peer_priv->rq[sq->queue_index];
=20
+	memset(&tuple, 0, sizeof(tuple));
+
 	/* set xsk wake up flag, to do: where to disable */
 	if (xsk_uses_need_wakeup(xsk_pool))
 		xsk_set_tx_need_wakeup(xsk_pool);
=20
 	while (budget-- > 0) {
 		unsigned int truesize =3D 0;
-		struct page *page;
-		void *vaddr;
-		void *addr;
=20
 		if (!xsk_tx_peek_desc(xsk_pool, &desc))
 			break;
=20
-		addr =3D xsk_buff_raw_get_data(xsk_pool, desc.addr);
-
 		/* can not hold all data in a page */
 		truesize =3D  SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
 		truesize +=3D desc.len + xsk_pool->headroom;
@@ -1120,30 +1466,50 @@ static int veth_xsk_tx_xmit(struct veth_sq *sq, str=
uct xsk_buff_pool *xsk_pool,
 			break;
 		}
=20
-		page =3D dev_alloc_page();
-		if (!page) {
+		skb =3D veth_build_skb_batch(peer_dev, xsk_pool, &desc, &tuple, prev_skb=
);
+		if (!skb) {
+			stats.rx_drops++;
 			xsk_tx_completed_addr(xsk_pool, desc.addr);
-			stats.xdp_drops++;
-			break;
+			if (prev_skb !=3D skb) {
+				napi_gro_receive(&peer_rq->xdp_napi, prev_skb);
+				prev_skb =3D NULL;
+			}
+			continue;
 		}
-		vaddr =3D page_to_virt(page);
-
-		memcpy(vaddr + xsk_pool->headroom, addr, desc.len);
-		xsk_tx_completed_addr(xsk_pool, desc.addr);
=20
-		skb =3D veth_build_skb(vaddr, xsk_pool->headroom, desc.len, PAGE_SIZE);
-		if (!skb) {
-			put_page(page);
-			stats.xdp_drops++;
-			break;
+		if (!tuple.batch_enable) {
+			xsk_tx_completed_addr(xsk_pool, desc.addr);
+			/* flush the prev skb first to avoid out of order */
+			if (prev_skb !=3D skb && prev_skb) {
+				veth_skb_batch_checksum(prev_skb);
+				napi_gro_receive(&peer_rq->xdp_napi, prev_skb);
+				prev_skb =3D NULL;
+			}
+			napi_gro_receive(&peer_rq->xdp_napi, skb);
+			skb =3D NULL;
+		} else {
+			if (prev_skb && tuple.batch_flush) {
+				veth_skb_batch_checksum(prev_skb);
+				napi_gro_receive(&peer_rq->xdp_napi, prev_skb);
+				if (prev_skb =3D=3D skb)
+					prev_skb =3D skb =3D NULL;
+				else
+					prev_skb =3D skb;
+			} else {
+				prev_skb =3D skb;
+			}
 		}
-		skb->protocol =3D eth_type_trans(skb, peer_dev);
-		napi_gro_receive(&peer_rq->xdp_napi, skb);
=20
 		stats.xdp_bytes +=3D desc.len;
 		done++;
 	}
=20
+	/* a batched skb is still pending; send it to peer_rq */
+	if (skb) {
+		veth_skb_batch_checksum(skb);
+		napi_gro_receive(&peer_rq->xdp_napi, skb);
+	}
+
 	/* release, move consumer=EF=BC=8Cand wakeup the producer */
 	if (done) {
 		napi_schedule(&peer_rq->xdp_napi);
--=20
2.20.1