From nobody Wed Nov 27 00:30:42 2024 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.13]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E80EA1DD0D2; Tue, 15 Oct 2024 14:54:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.13 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729004101; cv=none; b=W6ReIXhCW4ajd+IPJtUmiW4vJn4uT/ZvcQkS/PkRwz0woQ1Xg6rpFEu1rbRlurl65cUx0A6cXkZEzpJR2xfOzUJlFD67kizZhbPHwd0KDkYJ9bVvUKdHpFiraDZsZUgiNaoTGCIMzcXBIY9L2SmfFrBmz1fbHGOKEA3HKcnFzgM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1729004101; c=relaxed/simple; bh=5F6Zk1LHMTvHlxJgB5M9u/Xg3DSJT2pwk/emzwtaeKQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=AuBP4h6G3iSuP7ouClpwblJr9QCP4objCH5IARXwCht/DwM8K2DDb30AME72+uCHA7qk1t71AZnDUNrUBhWowa/wCRCBQyhCUvtlnJDhKRY4Kr391q+laChkqfNvPAN8zD53IJdc3OJG40eazCUEgwubrjvntv1Lzpza4F46XB8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=ThjiG9mc; arc=none smtp.client-ip=192.198.163.13 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="ThjiG9mc" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1729004100; x=1760540100; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=5F6Zk1LHMTvHlxJgB5M9u/Xg3DSJT2pwk/emzwtaeKQ=; 
b=ThjiG9mcjiCfRYU37K+eytsNHsOiUrMfa09ReNSz93zYXT1M0rZ11WY+ 8sqP9Zx8n4tl0CRBqlSGyWMDd1KxMIdon+jJXLyLiOfmHrD20bQmkjSMA nQR0nWpq8AqF7Y0JL6KVbxiBHL61GqCYzAZvT8tYx6rfYMBXWDYuzchOd 3ZfRSZHs/2rqoO6YCmT06cyHgykzujnS63uY0/zd/x9C1iyW6OjlrSAgA UbgFhA0hsGSfQuvWtw3Wyl8QJrAtsZfVS3rdmzCIEhuafxh8ylajFNaRt GsLedEdf9LY6y+cHmJ/wTYps4t+wWbd2XeLqz7fU2SvQB+TiAjHprq/97 w==; X-CSE-ConnectionGUID: wphZs2scQ3yFnFx4bQ85xQ== X-CSE-MsgGUID: vs61i+PkQKG3x2kYEUqi5Q== X-IronPort-AV: E=McAfee;i="6700,10204,11225"; a="31277571" X-IronPort-AV: E=Sophos;i="6.11,205,1725346800"; d="scan'208";a="31277571" Received: from orviesa003.jf.intel.com ([10.64.159.143]) by fmvoesa107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Oct 2024 07:54:59 -0700 X-CSE-ConnectionGUID: jZXXuYbzTgq6MTIbc6Gy9g== X-CSE-MsgGUID: 4OG+qwNmRQiVlXMzFiIEjw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.11,199,1725346800"; d="scan'208";a="82723109" Received: from newjersey.igk.intel.com ([10.102.20.203]) by orviesa003.jf.intel.com with ESMTP; 15 Oct 2024 07:54:55 -0700 From: Alexander Lobakin To: "David S. 
Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: Alexander Lobakin, Toke Høiland-Jørgensen, Alexei Starovoitov, Daniel Borkmann, John Fastabend, Andrii Nakryiko, Stanislav Fomichev, Magnus Karlsson, nex.sw.ncis.osdt.itp.upstreaming@intel.com, bpf@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH net-next v2 11/18] xdp: add generic xdp_buff_add_frag()
Date: Tue, 15 Oct 2024 16:53:43 +0200
Message-ID: <20241015145350.4077765-12-aleksander.lobakin@intel.com>
X-Mailer: git-send-email 2.46.2
In-Reply-To: <20241015145350.4077765-1-aleksander.lobakin@intel.com>
References: <20241015145350.4077765-1-aleksander.lobakin@intel.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

The code piece which would attach a frag to &xdp_buff is almost
identical across the drivers supporting XDP multi-buffer on Rx.
Make it a generic elegant one-liner.

Also, I see lots of drivers calculating frags_truesize as
`xdp->frame_sz * nr_frags`. I can't say this is fully correct, since
frags might be backed by chunks of different sizes, especially with
stuff like the header split. Even page_pool_alloc() can give you two
different truesizes on two subsequent requests to allocate the same
buffer size.
Add a field to &skb_shared_info (unionized as there's no free slot
currently on x86_64) to track the "true" truesize. It can be used
later when updating an skb.

Signed-off-by: Alexander Lobakin
Reviewed-by: Maciej Fijalkowski
---
 include/linux/skbuff.h | 16 ++++++--
 include/net/xdp.h      | 90 +++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 101 insertions(+), 5 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index c867df5b1051..6ec78c1598fe 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -607,11 +607,19 @@ struct skb_shared_info {
 	 * Warning : all fields before dataref are cleared in __alloc_skb()
 	 */
 	atomic_t dataref;
-	unsigned int xdp_frags_size;
 
-	/* Intermediate layers must ensure that destructor_arg
-	 * remains valid until skb destructor */
-	void * destructor_arg;
+	union {
+		struct {
+			u32 xdp_frags_size;
+			u32 xdp_frags_truesize;
+		};
+
+		/*
+		 * Intermediate layers must ensure that destructor_arg
+		 * remains valid until skb destructor.
+		 */
+		void *destructor_arg;
+	};
 
 	/* must be last field, see pskb_expand_head() */
 	skb_frag_t frags[MAX_SKB_FRAGS];
diff --git a/include/net/xdp.h b/include/net/xdp.h
index c4b408d22669..19d2b283b845 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -167,6 +167,88 @@ xdp_get_buff_len(const struct xdp_buff *xdp)
 	return len;
 }
 
+/**
+ * __xdp_buff_add_frag - attach a frag to an &xdp_buff
+ * @xdp: XDP buffer to attach the frag to
+ * @page: page containing the frag
+ * @offset: page offset at which the frag starts
+ * @size: size of the frag
+ * @truesize: truesize (page / page frag size) of the frag
+ * @try_coalesce: whether to try coalescing the frags
+ *
+ * Attach a frag to an XDP buffer. If it currently has no frags attached,
+ * initialize the related fields, otherwise check that the frag number
+ * didn't reach the limit of ``MAX_SKB_FRAGS``. If possible, try coalescing
+ * the frag with the previous one.
+ * The function doesn't check/update the pfmemalloc bit. Please use the
+ * non-underscored wrapper in drivers.
+ *
+ * Return: true on success, false if there's no space for the frag in
+ * the shared info struct.
+ */
+static inline bool __xdp_buff_add_frag(struct xdp_buff *xdp, struct page *page,
+				       u32 offset, u32 size, u32 truesize,
+				       bool try_coalesce)
+{
+	struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
+	skb_frag_t *prev;
+	u32 nr_frags;
+
+	if (!xdp_buff_has_frags(xdp)) {
+		xdp_buff_set_frags_flag(xdp);
+
+		nr_frags = 0;
+		sinfo->xdp_frags_size = 0;
+		sinfo->xdp_frags_truesize = 0;
+
+		goto fill;
+	}
+
+	nr_frags = sinfo->nr_frags;
+	if (unlikely(nr_frags == MAX_SKB_FRAGS))
+		return false;
+
+	prev = &sinfo->frags[nr_frags - 1];
+	if (try_coalesce && page == skb_frag_page(prev) &&
+	    offset == skb_frag_off(prev) + skb_frag_size(prev))
+		skb_frag_size_add(prev, size);
+	else
+fill:
+		__skb_fill_page_desc_noacc(sinfo, nr_frags++, page,
+					   offset, size);
+
+	sinfo->nr_frags = nr_frags;
+	sinfo->xdp_frags_size += size;
+	sinfo->xdp_frags_truesize += truesize;
+
+	return true;
+}
+
+/**
+ * xdp_buff_add_frag - attach a frag to an &xdp_buff
+ * @xdp: XDP buffer to attach the frag to
+ * @page: page containing the frag
+ * @offset: page offset at which the frag starts
+ * @size: size of the frag
+ * @truesize: truesize (page / page frag size) of the frag
+ *
+ * Version of __xdp_buff_add_frag() which takes care of the pfmemalloc bit.
+ *
+ * Return: true on success, false if there's no space for the frag in
+ * the shared info struct.
+ */
+static inline bool xdp_buff_add_frag(struct xdp_buff *xdp, struct page *page,
+				     u32 offset, u32 size, u32 truesize)
+{
+	if (!__xdp_buff_add_frag(xdp, page, offset, size, truesize, true))
+		return false;
+
+	if (unlikely(page_is_pfmemalloc(page)))
+		xdp_buff_set_frag_pfmemalloc(xdp);
+
+	return true;
+}
+
 struct xdp_frame {
 	void *data;
 	u32 len;
@@ -230,7 +312,13 @@ xdp_update_skb_shared_info(struct sk_buff *skb, u8 nr_frags,
 			   unsigned int size, unsigned int truesize,
 			   bool pfmemalloc)
 {
-	skb_shinfo(skb)->nr_frags = nr_frags;
+	struct skb_shared_info *sinfo = skb_shinfo(skb);
+
+	sinfo->nr_frags = nr_frags;
+	/* ``destructor_arg`` is unionized with ``xdp_frags_{,true}size``,
+	 * reset it once these fields are no longer used.
+	 */
+	sinfo->destructor_arg = NULL;
 
 	skb->len += size;
 	skb->data_len += size;
-- 
2.46.2
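
For readers who want to poke at the accounting outside the kernel, here is a
minimal userspace sketch of the logic in __xdp_buff_add_frag(). It is a model
under stated assumptions, not kernel code: every `model_*` name is
hypothetical, MODEL_MAX_FRAGS = 17 merely stands in for MAX_SKB_FRAGS, and a
plain bool replaces the frags flag kept in &xdp_buff. It keeps the same order
of checks as the patch (limit first, then coalescing) and shows how the
tracked truesize can differ from the `frame_sz * nr_frags` estimate.

```c
/* Userspace model only: none of the model_* names exist in the kernel. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MAX_FRAGS 17 /* assumption: stand-in for MAX_SKB_FRAGS */

struct model_frag {
	const void *page;
	uint32_t offset;
	uint32_t size;
};

struct model_sinfo {
	bool has_frags;          /* models the "has frags" flag of the buffer */
	uint32_t nr_frags;
	uint32_t frags_size;     /* models xdp_frags_size */
	uint32_t frags_truesize; /* models xdp_frags_truesize */
	struct model_frag frags[MODEL_MAX_FRAGS];
};

bool model_add_frag(struct model_sinfo *si, const void *page,
		    uint32_t offset, uint32_t size, uint32_t truesize,
		    bool try_coalesce)
{
	struct model_frag *prev;
	uint32_t nr_frags = 0;

	if (!si->has_frags) {
		/* First frag: initialize the counters, then fill slot 0. */
		si->has_frags = true;
		si->frags_size = 0;
		si->frags_truesize = 0;
	} else {
		nr_frags = si->nr_frags;
		/* As in the patch, the limit is checked before coalescing. */
		if (nr_frags == MODEL_MAX_FRAGS)
			return false;

		prev = &si->frags[nr_frags - 1];
		if (try_coalesce && page == prev->page &&
		    offset == prev->offset + prev->size) {
			/* Contiguous on the same page: grow the previous frag. */
			prev->size += size;
			goto account;
		}
	}

	si->frags[nr_frags++] = (struct model_frag){ page, offset, size };
account:
	si->nr_frags = nr_frags;
	si->frags_size += size;
	si->frags_truesize += truesize;

	return true;
}

/* Worked example: three chunks, the last two contiguous on one page. */
void model_example(void)
{
	static const char page_a[4096], page_b[4096];
	struct model_sinfo si = { 0 };
	bool ok;

	ok = model_add_frag(&si, page_a, 0, 1500, 2048, true) &&
	     model_add_frag(&si, page_b, 0, 2048, 2048, true) &&
	     model_add_frag(&si, page_b, 2048, 1000, 2048, true);
	assert(ok);
	(void)ok;

	assert(si.nr_frags == 2);           /* third chunk was coalesced */
	assert(si.frags[1].size == 3048);
	assert(si.frags_size == 4548);
	assert(si.frags_truesize == 6144);  /* 3 * 2048, tracked per chunk */
	assert(4096u * si.nr_frags == 8192); /* frame_sz * nr_frags estimate */
}
```

With a hypothetical frame_sz of 4096, the estimate gives 8192 bytes for the
buffer above, while the per-chunk accounting gives 6144, which is the kind of
mismatch the commit message describes for frags backed by chunks of different
sizes.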