From nobody Sat Feb 7 18:28:49 2026
From: Tariq Toukan <tariqt@nvidia.com>
To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn, "David S. Miller"
Cc: Saeed Mahameed, Leon Romanovsky, Tariq Toukan, Mark Bloch, Gal Pressman, Moshe Shemesh, Cosmin Ratiu, Dragos Tatulea
Subject: [PATCH net-next 1/3] net/mlx5e: RX, Drop oversized packets in non-linear mode
Date: Mon, 12 Jan 2026 15:22:07 +0200
Message-ID: <1768224129-1600265-2-git-send-email-tariqt@nvidia.com>
X-Mailer: git-send-email 2.8.0
In-Reply-To: <1768224129-1600265-1-git-send-email-tariqt@nvidia.com>
References: <1768224129-1600265-1-git-send-email-tariqt@nvidia.com>

From: Dragos Tatulea

Currently the driver has inconsistent behaviour between modes when it
comes to oversized packets that are not dropped by the physical MTU
check in HW. This can happen in Multi Host configurations where each
port has a different MTU.

Current behavior:
1) Striding RQ in linear mode drops the packet in SW and counts it in
   oversize_pkts_sw_drop.
2) Striding RQ in non-linear mode passes it up like a normal packet.
3) Legacy RQ can't receive oversized packets by design: the RX WQE uses
   MTU-sized packet buffers.

This inconsistency is not a violation of the netdev policy [1], but it
is better to be consistent across modes. This patch aligns (2) with (1)
and (3). One exception is added for LRO: don't drop the oversized
packet if it is an LRO packet.

As rq->hw_mtu now always needs to be updated during the MTU change
flow, drop the reset avoidance optimization from mlx5e_change_mtu().

Extract the reading of the CQE LRO segment count into a helper
function, as it is now used twice.

[1] Documentation/networking/netdevices.rst#L205

Signed-off-by: Dragos Tatulea
Signed-off-by: Tariq Toukan
Reviewed-by: Simon Horman
---
 .../net/ethernet/mellanox/mlx5/core/en_main.c | 25 ++-----------------
 .../net/ethernet/mellanox/mlx5/core/en_rx.c   | 11 +++++++-
 include/linux/mlx5/device.h                   |  6 +++++
 3 files changed, 18 insertions(+), 24 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 3ac47df83ac8..136fa8f05607 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -4664,7 +4664,6 @@ int mlx5e_change_mtu(struct net_device *netdev, int new_mtu,
 	struct mlx5e_priv *priv = netdev_priv(netdev);
 	struct mlx5e_params new_params;
 	struct mlx5e_params *params;
-	bool reset = true;
 	int err = 0;
 
 	mutex_lock(&priv->state_lock);
@@ -4690,28 +4689,8 @@ int mlx5e_change_mtu(struct net_device *netdev, int new_mtu,
 		goto out;
 	}
 
-	if (params->packet_merge.type == MLX5E_PACKET_MERGE_LRO)
-		reset = false;
-
-	if (params->rq_wq_type == MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ &&
-	    params->packet_merge.type != MLX5E_PACKET_MERGE_SHAMPO) {
-		bool is_linear_old = mlx5e_rx_mpwqe_is_linear_skb(priv->mdev, params, NULL);
-		bool is_linear_new = mlx5e_rx_mpwqe_is_linear_skb(priv->mdev,
-								  &new_params, NULL);
-		u8 sz_old = mlx5e_mpwqe_get_log_rq_size(priv->mdev, params, NULL);
-		u8 sz_new = mlx5e_mpwqe_get_log_rq_size(priv->mdev, &new_params, NULL);
-
-		/* Always reset in linear mode - hw_mtu is used in data path.
-		 * Check that the mode was non-linear and didn't change.
-		 * If XSK is active, XSK RQs are linear.
-		 * Reset if the RQ size changed, even if it's non-linear.
-		 */
-		if (!is_linear_old && !is_linear_new && !priv->xsk.refcnt &&
-		    sz_old == sz_new)
-			reset = false;
-	}
-
-	err = mlx5e_safe_switch_params(priv, &new_params, preactivate, NULL, reset);
+	err = mlx5e_safe_switch_params(priv, &new_params, preactivate, NULL,
+				       true);
 
 out:
 	WRITE_ONCE(netdev->mtu, params->sw_mtu);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 1f6930c77437..57e20beb05dc 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -1570,7 +1570,7 @@ static inline bool mlx5e_build_rx_skb(struct mlx5_cqe64 *cqe,
 				      struct mlx5e_rq *rq,
 				      struct sk_buff *skb)
 {
-	u8 lro_num_seg = be32_to_cpu(cqe->srqn) >> 24;
+	u8 lro_num_seg = get_cqe_lro_num_seg(cqe);
 	struct mlx5e_rq_stats *stats = rq->stats;
 	struct net_device *netdev = rq->netdev;
 
@@ -2054,6 +2054,15 @@ mlx5e_skb_from_cqe_mpwrq_nonlinear(struct mlx5e_rq *rq, struct mlx5e_mpw_info *w
 	u16 linear_hr;
 	void *va;
 
+	if (unlikely(cqe_bcnt > rq->hw_mtu)) {
+		u8 lro_num_seg = get_cqe_lro_num_seg(cqe);
+
+		if (lro_num_seg <= 1) {
+			rq->stats->oversize_pkts_sw_drop++;
+			return NULL;
+		}
+	}
+
 	prog = rcu_dereference(rq->xdp_prog);
 
 	if (prog) {
diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h
index d7f46a8fbfa1..6e08092a8e35 100644
--- a/include/linux/mlx5/device.h
+++ b/include/linux/mlx5/device.h
@@ -962,6 +962,12 @@ static inline u16 get_cqe_flow_tag(struct mlx5_cqe64 *cqe)
 	return be32_to_cpu(cqe->sop_drop_qpn) & 0xFFF;
 }
 
+
+static inline u8 get_cqe_lro_num_seg(struct mlx5_cqe64 *cqe)
+{
+	return be32_to_cpu(cqe->srqn) >> 24;
+}
+
 #define MLX5_MPWQE_LOG_NUM_STRIDES_EXT_BASE 3
 #define MLX5_MPWQE_LOG_NUM_STRIDES_BASE 9
 #define MLX5_MPWQE_LOG_NUM_STRIDES_MAX 16
-- 
2.31.1
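
The rule added above is compact enough to restate outside the driver. The
following standalone sketch is an illustration only: rx_queue and its field
names are simplified stand-ins, not the mlx5e types. It shows the intended
behaviour of the new check in mlx5e_skb_from_cqe_mpwrq_nonlinear(): drop and
count an oversized completion unless it is an LRO aggregate, which is
legitimately larger than the MTU.

#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the driver's RQ state (hypothetical names). */
struct rx_queue {
	uint32_t hw_mtu;                 /* max frame size configured on this RQ */
	uint64_t oversize_pkts_sw_drop;  /* software drop counter */
};

/*
 * Returns true when the completion should be dropped: its byte count
 * exceeds the RQ MTU and it is not an LRO aggregate (lro_num_seg > 1
 * marks a coalesced super-packet, which may exceed the MTU).
 */
static bool rx_should_drop_oversized(struct rx_queue *rq,
				     uint32_t cqe_bcnt, uint8_t lro_num_seg)
{
	if (cqe_bcnt > rq->hw_mtu && lro_num_seg <= 1) {
		rq->oversize_pkts_sw_drop++;
		return true;
	}
	return false;
}
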
From nobody Sat Feb 7 18:28:49 2026
From: Tariq Toukan <tariqt@nvidia.com>
To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn, "David S. Miller"
Cc: Saeed Mahameed, Leon Romanovsky, Tariq Toukan, Mark Bloch, Gal Pressman, Moshe Shemesh, Cosmin Ratiu, Dragos Tatulea
Subject: [PATCH net-next 2/3] net/mlx5e: SHAMPO, Improve allocation recovery
Date: Mon, 12 Jan 2026 15:22:08 +0200
Message-ID: <1768224129-1600265-3-git-send-email-tariqt@nvidia.com>
X-Mailer: git-send-email 2.8.0
In-Reply-To: <1768224129-1600265-1-git-send-email-tariqt@nvidia.com>
References: <1768224129-1600265-1-git-send-email-tariqt@nvidia.com>

From: Dragos Tatulea

When memory providers are used, there is a disconnect between the
page_pool size and the available memory in the provider. This means
that the page_pool can run out of memory if the user didn't provision
a large enough buffer.

Under these conditions, mlx5 gets stuck trying to allocate new buffers
without being able to release existing ones. This happens due to the
optimization introduced in commit 4c2a13236807 ("net/mlx5e: RX, Defer
page release in striding rq for better recycling") which delays WQE
releases to increase the chance of page_pool direct recycling. The
optimization was developed before memory providers existed, so this
circumstance was not considered.

This patch unblocks the queue by reclaiming pages from WQEs that can be
freed and doing a one-shot retry. A WQE can be freed when:
1) All its strides have been consumed (the WQE is no longer in the
   linked list).
2) The WQE pages/netmems have not been previously released.

This reclaim mechanism is useful for regular pages as well.

Note that provisioning memory that can't fill even one MPWQE (64 4K
pages) will still render the queue unusable. The same applies when the
application doesn't release its buffers for whatever reason, or a
combination of the two: a very small buffer is provisioned, the
application releases buffers in bulk, and the bulk size is never
reached => the queue is stuck.
Signed-off-by: Dragos Tatulea
Reviewed-by: Cosmin Ratiu
Signed-off-by: Tariq Toukan
Reviewed-by: Simon Horman
---
 .../net/ethernet/mellanox/mlx5/core/en_rx.c | 26 +++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 57e20beb05dc..aae4db392992 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -1083,11 +1083,24 @@ int mlx5e_poll_ico_cq(struct mlx5e_cq *cq)
 	return i;
 }
 
+static void mlx5e_reclaim_mpwqe_pages(struct mlx5e_rq *rq, int head,
+				      int reclaim)
+{
+	struct mlx5_wq_ll *wq = &rq->mpwqe.wq;
+
+	for (int i = 0; i < reclaim; i++) {
+		head = mlx5_wq_ll_get_wqe_next_ix(wq, head);
+
+		mlx5e_dealloc_rx_mpwqe(rq, head);
+	}
+}
+
 INDIRECT_CALLABLE_SCOPE bool mlx5e_post_rx_mpwqes(struct mlx5e_rq *rq)
 {
 	struct mlx5_wq_ll *wq = &rq->mpwqe.wq;
 	u8 umr_completed = rq->mpwqe.umr_completed;
 	struct mlx5e_icosq *sq = rq->icosq;
+	bool reclaimed = false;
 	int alloc_err = 0;
 	u8 missing, i;
 	u16 head;
@@ -1122,11 +1135,20 @@ INDIRECT_CALLABLE_SCOPE bool mlx5e_post_rx_mpwqes(struct mlx5e_rq *rq)
 		/* Deferred free for better page pool cache usage. */
 		mlx5e_free_rx_mpwqe(rq, wi);
 
+retry:
 		alloc_err = rq->xsk_pool ? mlx5e_xsk_alloc_rx_mpwqe(rq, head) :
 					   mlx5e_alloc_rx_mpwqe(rq, head);
+		if (unlikely(alloc_err)) {
+			int reclaim = i - 1;
 
-		if (unlikely(alloc_err))
-			break;
+			if (reclaimed || !reclaim)
+				break;
+
+			mlx5e_reclaim_mpwqe_pages(rq, head, reclaim);
+			reclaimed = true;
+
+			goto retry;
+		}
 		head = mlx5_wq_ll_get_wqe_next_ix(wq, head);
 	} while (--i);
 
-- 
2.31.1
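
To make the recovery flow easier to follow in isolation, here is a minimal
model of the one-shot reclaim-and-retry pattern. This is a sketch with
hypothetical alloc/reclaim callbacks, not the driver's mlx5e_post_rx_mpwqes()
itself: when refilling a batch of slots fails, the pages still held by the
not-yet-refilled slots (kept around for deferred release) are freed once and
the failed allocation is retried once.

#include <stdbool.h>

/* Hypothetical hooks: alloc() fails when the pool is exhausted,
 * reclaim() releases the pages a consumed slot still holds. */
typedef bool (*alloc_fn)(void *pool, int slot);
typedef void (*reclaim_fn)(void *pool, int slot);

/* Refill slots [first, first + count); on failure, reclaim the pages of
 * the remaining slots in the batch and retry exactly once. Returns the
 * number of slots successfully refilled. */
static int refill_slots(void *pool, int first, int count,
			alloc_fn alloc, reclaim_fn reclaim)
{
	bool reclaimed = false;
	int filled = 0;

	for (int i = 0; i < count; i++) {
retry:
		if (alloc(pool, first + i)) {
			filled++;
			continue;
		}
		/* Give up if we already retried, or nothing is reclaimable. */
		if (reclaimed || i == count - 1)
			break;
		/* Free the deferred pages of the slots not refilled yet. */
		for (int j = i + 1; j < count; j++)
			reclaim(pool, first + j);
		reclaimed = true;
		goto retry;
	}
	return filled;
}
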
From nobody Sat Feb 7 18:28:49 2026
From: Tariq Toukan <tariqt@nvidia.com>
To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn, "David S. Miller"
Cc: Saeed Mahameed, Leon Romanovsky, Tariq Toukan, Mark Bloch, Gal Pressman, Moshe Shemesh, Cosmin Ratiu, Dragos Tatulea
Subject: [PATCH net-next 3/3] net/mlx5e: SHAMPO, Switch to header memcpy
Date: Mon, 12 Jan 2026 15:22:09 +0200
Message-ID: <1768224129-1600265-4-git-send-email-tariqt@nvidia.com>
X-Mailer: git-send-email 2.8.0
In-Reply-To: <1768224129-1600265-1-git-send-email-tariqt@nvidia.com>
References: <1768224129-1600265-1-git-send-email-tariqt@nvidia.com>

From: Dragos Tatulea

Previously the HW-GRO code was using a separate page_pool for the
header buffer. The pages of the header buffer were replenished via
UMR. This mechanism has some drawbacks:
- Reference counting on the page_pool page frags is not cheap.
- UMRs have HW overhead for updating and also for access, especially
  for the KLM type which was previously used.
- The UMR code for headers is complex.

This patch switches to using a static memory area (static MTT MKEY)
for the header buffer and does a header memcpy. This happens only once
per GRO session. The SKB is allocated from the per-cpu NAPI SKB cache.

Performance numbers for x86:

+---------------------------------------------------------+
| Test                | Baseline   | Header Copy | Change |
|---------------------+------------+-------------+--------|
| iperf3 oncpu        | 59.5 Gbps  | 64.00 Gbps  | 7 %    |
| iperf3 offcpu       | 102.5 Gbps | 104.20 Gbps | 2 %    |
| kperf oncpu         | 115.0 Gbps | 130.00 Gbps | 12 %   |
| XDP_DROP (skb mode) | 3.9 Mpps   | 3.9 Mpps    | 0 %    |
+---------------------------------------------------------+

Notes on test:
- System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz
- oncpu: NAPI and application running on the same CPU
- offcpu: NAPI and application running on different CPUs
- MTU: 1500
- iperf3 tests are single stream, 60s, with IPv6 (for slightly larger
  headers)
- kperf version [1]

[1] git://git.kernel.dk/kperf.git

Suggested-by: Eric Dumazet
Signed-off-by: Dragos Tatulea
Signed-off-by: Tariq Toukan
Reviewed-by: Simon Horman
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h  |  20 +-
 .../net/ethernet/mellanox/mlx5/core/en/txrx.h |   1 -
 .../net/ethernet/mellanox/mlx5/core/en_main.c | 287 +++++++--------
 .../net/ethernet/mellanox/mlx5/core/en_rx.c   | 336 +++---------------
 4 files changed, 185 insertions(+), 459 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 262dc032e276..fcce50e46165 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -82,9 +82,10 @@ struct page_pool;
 
 #define MLX5E_RX_MAX_HEAD (256)
 #define MLX5E_SHAMPO_LOG_HEADER_ENTRY_SIZE (8)
-#define MLX5E_SHAMPO_LOG_MAX_HEADER_ENTRY_SIZE (9)
-#define MLX5E_SHAMPO_WQ_HEADER_PER_PAGE (PAGE_SIZE >> MLX5E_SHAMPO_LOG_MAX_HEADER_ENTRY_SIZE)
-#define MLX5E_SHAMPO_LOG_WQ_HEADER_PER_PAGE (PAGE_SHIFT - MLX5E_SHAMPO_LOG_MAX_HEADER_ENTRY_SIZE)
+#define MLX5E_SHAMPO_WQ_HEADER_PER_PAGE \
+	(PAGE_SIZE >> MLX5E_SHAMPO_LOG_HEADER_ENTRY_SIZE)
+#define MLX5E_SHAMPO_LOG_WQ_HEADER_PER_PAGE \
+	(PAGE_SHIFT - 
MLX5E_SHAMPO_LOG_HEADER_ENTRY_SIZE) #define MLX5E_SHAMPO_WQ_BASE_HEAD_ENTRY_SIZE_SHIFT (6) #define MLX5E_SHAMPO_WQ_RESRV_SIZE_BASE_SHIFT (12) #define MLX5E_SHAMPO_WQ_LOG_RESRV_SIZE (16) @@ -632,16 +633,11 @@ struct mlx5e_dma_info { }; =20 struct mlx5e_shampo_hd { - struct mlx5e_frag_page *pages; u32 hd_per_wq; - u32 hd_per_page; - u16 hd_per_wqe; - u8 log_hd_per_page; - u8 log_hd_entry_size; - unsigned long *bitmap; - u16 pi; - u16 ci; - __be32 mkey_be; + u32 hd_buf_size; + u32 mkey; + u32 nentries; + DECLARE_FLEX_ARRAY(struct mlx5e_dma_info, hd_buf_pages); }; =20 struct mlx5e_hw_gro_data { diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h b/drivers/ne= t/ethernet/mellanox/mlx5/core/en/txrx.h index 7e191e1569e8..f2a8453d8dce 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h @@ -65,7 +65,6 @@ ktime_t mlx5e_cqe_ts_to_ns(cqe_ts_to_ns func, struct mlx5= _clock *clock, u64 cqe_ enum mlx5e_icosq_wqe_type { MLX5E_ICOSQ_WQE_NOP, MLX5E_ICOSQ_WQE_UMR_RX, - MLX5E_ICOSQ_WQE_SHAMPO_HD_UMR, #ifdef CONFIG_MLX5_EN_TLS MLX5E_ICOSQ_WQE_UMR_TLS, MLX5E_ICOSQ_WQE_SET_PSV_TLS, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/ne= t/ethernet/mellanox/mlx5/core/en_main.c index 136fa8f05607..4ee92eea1324 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -492,40 +492,6 @@ static int mlx5e_create_umr_mkey(struct mlx5_core_dev = *mdev, return err; } =20 -static int mlx5e_create_umr_ksm_mkey(struct mlx5_core_dev *mdev, - u64 nentries, u8 log_entry_size, - u32 *umr_mkey) -{ - int inlen; - void *mkc; - u32 *in; - int err; - - inlen =3D MLX5_ST_SZ_BYTES(create_mkey_in); - - in =3D kvzalloc(inlen, GFP_KERNEL); - if (!in) - return -ENOMEM; - - mkc =3D MLX5_ADDR_OF(create_mkey_in, in, memory_key_mkey_entry); - - MLX5_SET(mkc, mkc, free, 1); - MLX5_SET(mkc, mkc, umr_en, 1); - MLX5_SET(mkc, mkc, lw, 1); - MLX5_SET(mkc, mkc, lr, 1); - MLX5_SET(mkc, mkc, access_mode_1_0, MLX5_MKC_ACCESS_MODE_KSM); - mlx5e_mkey_set_relaxed_ordering(mdev, mkc); - MLX5_SET(mkc, mkc, qpn, 0xffffff); - MLX5_SET(mkc, mkc, pd, mdev->mlx5e_res.hw_objs.pdn); - MLX5_SET(mkc, mkc, translations_octword_size, nentries); - MLX5_SET(mkc, mkc, log_page_size, log_entry_size); - MLX5_SET64(mkc, mkc, len, nentries << log_entry_size); - err =3D mlx5_core_create_mkey(mdev, umr_mkey, in, inlen); - - kvfree(in); - return err; -} - static int mlx5e_create_rq_umr_mkey(struct mlx5_core_dev *mdev, struct mlx= 5e_rq *rq) { u32 xsk_chunk_size =3D rq->xsk_pool ? 
rq->xsk_pool->chunk_size : 0; @@ -551,29 +517,6 @@ static int mlx5e_create_rq_umr_mkey(struct mlx5_core_d= ev *mdev, struct mlx5e_rq return err; } =20 -static int mlx5e_create_rq_hd_umr_mkey(struct mlx5_core_dev *mdev, - u16 hd_per_wq, __be32 *umr_mkey) -{ - u32 max_ksm_size =3D BIT(MLX5_CAP_GEN(mdev, log_max_klm_list_size)); - u32 mkey; - int err; - - if (max_ksm_size < hd_per_wq) { - mlx5_core_err(mdev, "max ksm list size 0x%x is smaller than shampo heade= r buffer list size 0x%x\n", - max_ksm_size, hd_per_wq); - return -EINVAL; - } - - err =3D mlx5e_create_umr_ksm_mkey(mdev, hd_per_wq, - MLX5E_SHAMPO_LOG_HEADER_ENTRY_SIZE, - &mkey); - if (err) - return err; - - *umr_mkey =3D cpu_to_be32(mkey); - return 0; -} - static void mlx5e_init_frags_partition(struct mlx5e_rq *rq) { struct mlx5e_wqe_frag_info next_frag =3D {}; @@ -754,145 +697,169 @@ static int mlx5e_init_rxq_rq(struct mlx5e_channel *= c, struct mlx5e_params *param xdp_frag_size); } =20 -static int mlx5e_rq_shampo_hd_info_alloc(struct mlx5e_rq *rq, u16 hd_per_w= q, - int node) +static void mlx5e_release_rq_hd_pages(struct mlx5e_rq *rq, + struct mlx5e_shampo_hd *shampo) + { - struct mlx5e_shampo_hd *shampo =3D rq->mpwqe.shampo; + for (int i =3D 0; i < shampo->nentries; i++) { + struct mlx5e_dma_info *info =3D &shampo->hd_buf_pages[i]; =20 - shampo->hd_per_wq =3D hd_per_wq; + if (!info->page) + continue; + + dma_unmap_page(rq->pdev, info->addr, PAGE_SIZE, + rq->buff.map_dir); + __free_page(info->page); + } +} + +static int mlx5e_alloc_rq_hd_pages(struct mlx5e_rq *rq, int node, + struct mlx5e_shampo_hd *shampo) +{ + int err, i; + + for (i =3D 0; i < shampo->nentries; i++) { + struct page *page =3D alloc_pages_node(node, GFP_KERNEL, 0); + dma_addr_t addr; + + if (!page) { + err =3D -ENOMEM; + goto err_free_pages; + } =20 - shampo->bitmap =3D bitmap_zalloc_node(hd_per_wq, GFP_KERNEL, node); - shampo->pages =3D kvzalloc_node(array_size(hd_per_wq, - sizeof(*shampo->pages)), - GFP_KERNEL, node); - if (!shampo->bitmap || !shampo->pages) - goto err_nomem; + addr =3D dma_map_page(rq->pdev, page, 0, PAGE_SIZE, + rq->buff.map_dir); + err =3D dma_mapping_error(rq->pdev, addr); + if (err) { + __free_page(page); + goto err_free_pages; + } + + shampo->hd_buf_pages[i].page =3D page; + shampo->hd_buf_pages[i].addr =3D addr; + } =20 return 0; =20 -err_nomem: - kvfree(shampo->pages); - bitmap_free(shampo->bitmap); +err_free_pages: + mlx5e_release_rq_hd_pages(rq, shampo); =20 - return -ENOMEM; + return err; } =20 -static void mlx5e_rq_shampo_hd_info_free(struct mlx5e_rq *rq) +static int mlx5e_create_rq_hd_mkey(struct mlx5_core_dev *mdev, + struct mlx5e_shampo_hd *shampo) { - kvfree(rq->mpwqe.shampo->pages); - bitmap_free(rq->mpwqe.shampo->bitmap); + enum mlx5e_mpwrq_umr_mode umr_mode =3D MLX5E_MPWRQ_UMR_MODE_ALIGNED; + struct mlx5_mtt *mtt; + void *mkc, *in; + int inlen, err; + u32 octwords; + + octwords =3D mlx5e_mpwrq_umr_octowords(shampo->nentries, umr_mode); + inlen =3D MLX5_FLEXIBLE_INLEN(mdev, MLX5_ST_SZ_BYTES(create_mkey_in), + MLX5_OCTWORD, octwords); + if (inlen < 0) + return inlen; + + in =3D kvzalloc(inlen, GFP_KERNEL); + if (!in) + return -ENOMEM; + + mkc =3D MLX5_ADDR_OF(create_mkey_in, in, memory_key_mkey_entry); + + MLX5_SET(mkc, mkc, lw, 1); + MLX5_SET(mkc, mkc, lr, 1); + MLX5_SET(mkc, mkc, access_mode_1_0, MLX5_MKC_ACCESS_MODE_MTT); + mlx5e_mkey_set_relaxed_ordering(mdev, mkc); + MLX5_SET(mkc, mkc, qpn, 0xffffff); + MLX5_SET(mkc, mkc, pd, mdev->mlx5e_res.hw_objs.pdn); + MLX5_SET64(mkc, mkc, len, shampo->hd_buf_size); + MLX5_SET(mkc, 
mkc, log_page_size, PAGE_SHIFT); + MLX5_SET(mkc, mkc, translations_octword_size, octwords); + MLX5_SET(create_mkey_in, in, translations_octword_actual_size, + octwords); + + mtt =3D MLX5_ADDR_OF(create_mkey_in, in, klm_pas_mtt); + for (int i =3D 0; i < shampo->nentries; i++) + mtt[i].ptag =3D cpu_to_be64(shampo->hd_buf_pages[i].addr); + + err =3D mlx5_core_create_mkey(mdev, &shampo->mkey, in, inlen); + + kvfree(in); + return err; } =20 static int mlx5_rq_shampo_alloc(struct mlx5_core_dev *mdev, struct mlx5e_params *params, struct mlx5e_rq_param *rqp, struct mlx5e_rq *rq, - u32 *pool_size, int node) { - void *wqc =3D MLX5_ADDR_OF(rqc, rqp->rqc, wq); - u8 log_hd_per_page, log_hd_entry_size; - u16 hd_per_wq, hd_per_wqe; - u32 hd_pool_size; - int wq_size; - int err; + struct mlx5e_shampo_hd *shampo; + int nentries, err, shampo_sz; + u32 hd_per_wq, hd_buf_size; =20 if (!test_bit(MLX5E_RQ_STATE_SHAMPO, &rq->state)) return 0; =20 - rq->mpwqe.shampo =3D kvzalloc_node(sizeof(*rq->mpwqe.shampo), - GFP_KERNEL, node); - if (!rq->mpwqe.shampo) - return -ENOMEM; - - /* split headers data structures */ hd_per_wq =3D mlx5e_shampo_hd_per_wq(mdev, params, rqp); - err =3D mlx5e_rq_shampo_hd_info_alloc(rq, hd_per_wq, node); - if (err) - goto err_shampo_hd_info_alloc; - - err =3D mlx5e_create_rq_hd_umr_mkey(mdev, hd_per_wq, - &rq->mpwqe.shampo->mkey_be); - if (err) - goto err_umr_mkey; - - hd_per_wqe =3D mlx5e_shampo_hd_per_wqe(mdev, params, rqp); - wq_size =3D BIT(MLX5_GET(wq, wqc, log_wq_sz)); - - BUILD_BUG_ON(MLX5E_SHAMPO_LOG_MAX_HEADER_ENTRY_SIZE > PAGE_SHIFT); - if (hd_per_wqe >=3D MLX5E_SHAMPO_WQ_HEADER_PER_PAGE) { - log_hd_per_page =3D MLX5E_SHAMPO_LOG_WQ_HEADER_PER_PAGE; - log_hd_entry_size =3D MLX5E_SHAMPO_LOG_MAX_HEADER_ENTRY_SIZE; - } else { - log_hd_per_page =3D order_base_2(hd_per_wqe); - log_hd_entry_size =3D order_base_2(PAGE_SIZE / hd_per_wqe); + hd_buf_size =3D hd_per_wq * BIT(MLX5E_SHAMPO_LOG_HEADER_ENTRY_SIZE); + nentries =3D hd_buf_size / PAGE_SIZE; + if (!nentries) { + mlx5_core_err(mdev, "SHAMPO header buffer size %u < %lu\n", + hd_buf_size, PAGE_SIZE); + return -EINVAL; } =20 - rq->mpwqe.shampo->hd_per_wqe =3D hd_per_wqe; - rq->mpwqe.shampo->hd_per_page =3D BIT(log_hd_per_page); - rq->mpwqe.shampo->log_hd_per_page =3D log_hd_per_page; - rq->mpwqe.shampo->log_hd_entry_size =3D log_hd_entry_size; - - hd_pool_size =3D (hd_per_wqe * wq_size) >> log_hd_per_page; - - if (netif_rxq_has_unreadable_mp(rq->netdev, rq->ix)) { - /* Separate page pool for shampo headers */ - struct page_pool_params pp_params =3D { }; + shampo_sz =3D struct_size(shampo, hd_buf_pages, nentries); + shampo =3D kvzalloc_node(shampo_sz, GFP_KERNEL, node); + if (!shampo) + return -ENOMEM; =20 - pp_params.order =3D 0; - pp_params.flags =3D PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV; - pp_params.pool_size =3D hd_pool_size; - pp_params.nid =3D node; - pp_params.dev =3D rq->pdev; - pp_params.napi =3D rq->cq.napi; - pp_params.netdev =3D rq->netdev; - pp_params.dma_dir =3D rq->buff.map_dir; - pp_params.max_len =3D PAGE_SIZE; + shampo->hd_per_wq =3D hd_per_wq; + shampo->hd_buf_size =3D hd_buf_size; + shampo->nentries =3D nentries; + err =3D mlx5e_alloc_rq_hd_pages(rq, node, shampo); + if (err) + goto err_free; =20 - rq->hd_page_pool =3D page_pool_create(&pp_params); - if (IS_ERR(rq->hd_page_pool)) { - err =3D PTR_ERR(rq->hd_page_pool); - rq->hd_page_pool =3D NULL; - goto err_hds_page_pool; - } - } else { - /* Common page pool, reserve space for headers. 
*/ - *pool_size +=3D hd_pool_size; - rq->hd_page_pool =3D NULL; - } + err =3D mlx5e_create_rq_hd_mkey(mdev, shampo); + if (err) + goto err_release_pages; =20 /* gro only data structures */ rq->hw_gro_data =3D kvzalloc_node(sizeof(*rq->hw_gro_data), GFP_KERNEL, n= ode); if (!rq->hw_gro_data) { err =3D -ENOMEM; - goto err_hw_gro_data; + goto err_destroy_mkey; } =20 + rq->mpwqe.shampo =3D shampo; + return 0; =20 -err_hw_gro_data: - page_pool_destroy(rq->hd_page_pool); -err_hds_page_pool: - mlx5_core_destroy_mkey(mdev, be32_to_cpu(rq->mpwqe.shampo->mkey_be)); -err_umr_mkey: - mlx5e_rq_shampo_hd_info_free(rq); -err_shampo_hd_info_alloc: - kvfree(rq->mpwqe.shampo); +err_destroy_mkey: + mlx5_core_destroy_mkey(mdev, shampo->mkey); +err_release_pages: + mlx5e_release_rq_hd_pages(rq, shampo); +err_free: + kvfree(shampo); + return err; } =20 static void mlx5e_rq_free_shampo(struct mlx5e_rq *rq) { - if (!test_bit(MLX5E_RQ_STATE_SHAMPO, &rq->state)) + struct mlx5e_shampo_hd *shampo =3D rq->mpwqe.shampo; + + if (!shampo) return; =20 kvfree(rq->hw_gro_data); - if (rq->hd_page_pool !=3D rq->page_pool) - page_pool_destroy(rq->hd_page_pool); - mlx5e_rq_shampo_hd_info_free(rq); - mlx5_core_destroy_mkey(rq->mdev, - be32_to_cpu(rq->mpwqe.shampo->mkey_be)); - kvfree(rq->mpwqe.shampo); + mlx5_core_destroy_mkey(rq->mdev, shampo->mkey); + mlx5e_release_rq_hd_pages(rq, shampo); + kvfree(shampo); } =20 static int mlx5e_alloc_rq(struct mlx5e_params *params, @@ -970,7 +937,7 @@ static int mlx5e_alloc_rq(struct mlx5e_params *params, if (err) goto err_rq_mkey; =20 - err =3D mlx5_rq_shampo_alloc(mdev, params, rqp, rq, &pool_size, node); + err =3D mlx5_rq_shampo_alloc(mdev, params, rqp, rq, node); if (err) goto err_free_mpwqe_info; =20 @@ -1165,8 +1132,7 @@ int mlx5e_create_rq(struct mlx5e_rq *rq, struct mlx5e= _rq_param *param, u16 q_cou if (test_bit(MLX5E_RQ_STATE_SHAMPO, &rq->state)) { MLX5_SET(wq, wq, log_headers_buffer_entry_num, order_base_2(rq->mpwqe.shampo->hd_per_wq)); - MLX5_SET(wq, wq, headers_mkey, - be32_to_cpu(rq->mpwqe.shampo->mkey_be)); + MLX5_SET(wq, wq, headers_mkey, rq->mpwqe.shampo->mkey); } =20 mlx5_fill_page_frag_array(&rq->wq_ctrl.buf, @@ -1326,14 +1292,6 @@ void mlx5e_free_rx_missing_descs(struct mlx5e_rq *rq) rq->mpwqe.actual_wq_head =3D wq->head; rq->mpwqe.umr_in_progress =3D 0; rq->mpwqe.umr_completed =3D 0; - - if (test_bit(MLX5E_RQ_STATE_SHAMPO, &rq->state)) { - struct mlx5e_shampo_hd *shampo =3D rq->mpwqe.shampo; - u16 len; - - len =3D (shampo->pi - shampo->ci) & shampo->hd_per_wq; - mlx5e_shampo_fill_umr(rq, len); - } } =20 void mlx5e_free_rx_descs(struct mlx5e_rq *rq) @@ -1356,9 +1314,6 @@ void mlx5e_free_rx_descs(struct mlx5e_rq *rq) mlx5_wq_ll_pop(wq, wqe_ix_be, &wqe->next.next_wqe_index); } - - if (test_bit(MLX5E_RQ_STATE_SHAMPO, &rq->state)) - mlx5e_shampo_dealloc_hd(rq); } else { struct mlx5_wq_cyc *wq =3D &rq->wqe.wq; u16 missing =3D mlx5_wq_cyc_missing(wq); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/= ethernet/mellanox/mlx5/core/en_rx.c index aae4db392992..5ab70e057a5c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c @@ -611,165 +611,6 @@ static void mlx5e_post_rx_mpwqe(struct mlx5e_rq *rq, = u8 n) mlx5_wq_ll_update_db_record(wq); } =20 -/* This function returns the size of the continuous free space inside a bi= tmap - * that starts from first and no longer than len including circular ones. 
- */
-static int bitmap_find_window(unsigned long *bitmap, int len,
-			      int bitmap_size, int first)
-{
-	int next_one, count;
-
-	next_one = find_next_bit(bitmap, bitmap_size, first);
-	if (next_one == bitmap_size) {
-		if (bitmap_size - first >= len)
-			return len;
-		next_one = find_next_bit(bitmap, bitmap_size, 0);
-		count = next_one + bitmap_size - first;
-	} else {
-		count = next_one - first;
-	}
-
-	return min(len, count);
-}
-
-static void build_ksm_umr(struct mlx5e_icosq *sq, struct mlx5e_umr_wqe *umr_wqe,
-			  __be32 key, u16 offset, u16 ksm_len)
-{
-	memset(umr_wqe, 0, offsetof(struct mlx5e_umr_wqe, inline_ksms));
-	umr_wqe->hdr.ctrl.opmod_idx_opcode =
-		cpu_to_be32((sq->pc << MLX5_WQE_CTRL_WQE_INDEX_SHIFT) |
-			    MLX5_OPCODE_UMR);
-	umr_wqe->hdr.ctrl.umr_mkey = key;
-	umr_wqe->hdr.ctrl.qpn_ds = cpu_to_be32((sq->sqn << MLX5_WQE_CTRL_QPN_SHIFT)
-					       | MLX5E_KSM_UMR_DS_CNT(ksm_len));
-	umr_wqe->hdr.uctrl.flags = MLX5_UMR_TRANSLATION_OFFSET_EN | MLX5_UMR_INLINE;
-	umr_wqe->hdr.uctrl.xlt_offset = cpu_to_be16(offset);
-	umr_wqe->hdr.uctrl.xlt_octowords = cpu_to_be16(ksm_len);
-	umr_wqe->hdr.uctrl.mkey_mask = cpu_to_be64(MLX5_MKEY_MASK_FREE);
-}
-
-static struct mlx5e_frag_page *mlx5e_shampo_hd_to_frag_page(struct mlx5e_rq *rq,
-							     int header_index)
-{
-	struct mlx5e_shampo_hd *shampo = rq->mpwqe.shampo;
-
-	return &shampo->pages[header_index >> shampo->log_hd_per_page];
-}
-
-static u64 mlx5e_shampo_hd_offset(struct mlx5e_rq *rq, int header_index)
-{
-	struct mlx5e_shampo_hd *shampo = rq->mpwqe.shampo;
-	u32 hd_per_page = shampo->hd_per_page;
-
-	return (header_index & (hd_per_page - 1)) << shampo->log_hd_entry_size;
-}
-
-static void mlx5e_free_rx_shampo_hd_entry(struct mlx5e_rq *rq, u16 header_index);
-
-static int mlx5e_build_shampo_hd_umr(struct mlx5e_rq *rq,
-				     struct mlx5e_icosq *sq,
-				     u16 ksm_entries, u16 index)
-{
-	struct mlx5e_shampo_hd *shampo = rq->mpwqe.shampo;
-	u16 pi, header_offset, err, wqe_bbs;
-	u32 lkey = rq->mdev->mlx5e_res.hw_objs.mkey;
-	struct mlx5e_umr_wqe *umr_wqe;
-	int headroom, i;
-
-	headroom = rq->buff.headroom;
-	wqe_bbs = MLX5E_KSM_UMR_WQEBBS(ksm_entries);
-	pi = mlx5e_icosq_get_next_pi(sq, wqe_bbs);
-	umr_wqe = mlx5_wq_cyc_get_wqe(&sq->wq, pi);
-	build_ksm_umr(sq, umr_wqe, shampo->mkey_be, index, ksm_entries);
-
-	for (i = 0; i < ksm_entries; i++, index++) {
-		struct mlx5e_frag_page *frag_page;
-		u64 addr;
-
-		frag_page = mlx5e_shampo_hd_to_frag_page(rq, index);
-		header_offset = mlx5e_shampo_hd_offset(rq, index);
-		if (!header_offset) {
-			err = mlx5e_page_alloc_fragmented(rq->hd_page_pool,
-							  frag_page);
-			if (err)
-				goto err_unmap;
-		}
-
-		addr = page_pool_get_dma_addr_netmem(frag_page->netmem);
-		umr_wqe->inline_ksms[i] = (struct mlx5_ksm) {
-			.key = cpu_to_be32(lkey),
-			.va = cpu_to_be64(addr + header_offset + headroom),
-		};
-	}
-
-	sq->db.wqe_info[pi] = (struct mlx5e_icosq_wqe_info) {
-		.wqe_type = MLX5E_ICOSQ_WQE_SHAMPO_HD_UMR,
-		.num_wqebbs = wqe_bbs,
-		.shampo.len = ksm_entries,
-	};
-
-	shampo->pi = (shampo->pi + ksm_entries) & (shampo->hd_per_wq - 1);
-	sq->pc += wqe_bbs;
-	sq->doorbell_cseg = &umr_wqe->hdr.ctrl;
-
-	return 0;
-
-err_unmap:
-	while (--i >= 0) {
-		--index;
-		header_offset = mlx5e_shampo_hd_offset(rq, index);
-		if (!header_offset) {
-			struct mlx5e_frag_page *frag_page = mlx5e_shampo_hd_to_frag_page(rq, index);
-
-			mlx5e_page_release_fragmented(rq->hd_page_pool,
-						      frag_page);
-		}
-	}
-
-	rq->stats->buff_alloc_err++;
-	return err;
-}
-
-static int mlx5e_alloc_rx_hd_mpwqe(struct mlx5e_rq *rq)
-{
-	struct mlx5e_shampo_hd *shampo = rq->mpwqe.shampo;
-	u16 ksm_entries, num_wqe, index, entries_before;
-	struct mlx5e_icosq *sq = rq->icosq;
-	int i, err, max_ksm_entries, len;
-
-	max_ksm_entries = MLX5E_MAX_KSM_PER_WQE(rq->mdev);
-	ksm_entries = bitmap_find_window(shampo->bitmap,
-					 shampo->hd_per_wqe,
-					 shampo->hd_per_wq, shampo->pi);
-	ksm_entries = ALIGN_DOWN(ksm_entries, shampo->hd_per_page);
-	if (!ksm_entries)
-		return 0;
-
-	/* pi is aligned to MLX5E_SHAMPO_WQ_HEADER_PER_PAGE */
-	index = shampo->pi;
-	entries_before = shampo->hd_per_wq - index;
-
-	if (unlikely(entries_before < ksm_entries))
-		num_wqe = DIV_ROUND_UP(entries_before, max_ksm_entries) +
-			  DIV_ROUND_UP(ksm_entries - entries_before, max_ksm_entries);
-	else
-		num_wqe = DIV_ROUND_UP(ksm_entries, max_ksm_entries);
-
-	for (i = 0; i < num_wqe; i++) {
-		len = (ksm_entries > max_ksm_entries) ? max_ksm_entries :
-							ksm_entries;
-		if (unlikely(index + len > shampo->hd_per_wq))
-			len = shampo->hd_per_wq - index;
-		err = mlx5e_build_shampo_hd_umr(rq, sq, len, index);
-		if (unlikely(err))
-			return err;
-		index = (index + len) & (rq->mpwqe.shampo->hd_per_wq - 1);
-		ksm_entries -= len;
-	}
-
-	return 0;
-}
-
 static int mlx5e_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix)
 {
 	struct mlx5e_mpw_info *wi = mlx5e_get_mpw_info(rq, ix);
@@ -782,12 +623,6 @@ static int mlx5e_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix)
 	int err;
 	int i;
 
-	if (test_bit(MLX5E_RQ_STATE_SHAMPO, &rq->state)) {
-		err = mlx5e_alloc_rx_hd_mpwqe(rq);
-		if (unlikely(err))
-			goto err;
-	}
-
 	pi = mlx5e_icosq_get_next_pi(sq, rq->mpwqe.umr_wqebbs);
 	umr_wqe = mlx5_wq_cyc_get_wqe(wq, pi);
 	memcpy(umr_wqe, &rq->mpwqe.umr_wqe, sizeof(struct mlx5e_umr_wqe));
@@ -848,34 +683,11 @@ static int mlx5e_alloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix)
 
 	bitmap_fill(wi->skip_release_bitmap, rq->mpwqe.pages_per_wqe);
 
-err:
 	rq->stats->buff_alloc_err++;
 
 	return err;
 }
 
-static void
-mlx5e_free_rx_shampo_hd_entry(struct mlx5e_rq *rq, u16 header_index)
-{
-	struct mlx5e_shampo_hd *shampo = rq->mpwqe.shampo;
-
-	if (((header_index + 1) & (shampo->hd_per_page - 1)) == 0) {
-		struct mlx5e_frag_page *frag_page = mlx5e_shampo_hd_to_frag_page(rq, header_index);
-
-		mlx5e_page_release_fragmented(rq->hd_page_pool, frag_page);
-	}
-	clear_bit(header_index, shampo->bitmap);
-}
-
-void mlx5e_shampo_dealloc_hd(struct mlx5e_rq *rq)
-{
-	struct mlx5e_shampo_hd *shampo = rq->mpwqe.shampo;
-	int i;
-
-	for_each_set_bit(i, shampo->bitmap, rq->mpwqe.shampo->hd_per_wq)
-		mlx5e_free_rx_shampo_hd_entry(rq, i);
-}
-
 static void mlx5e_dealloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix)
 {
 	struct mlx5e_mpw_info *wi = mlx5e_get_mpw_info(rq, ix);
@@ -968,33 +780,6 @@ void mlx5e_free_icosq_descs(struct mlx5e_icosq *sq)
 	sq->cc = sqcc;
 }
 
-void mlx5e_shampo_fill_umr(struct mlx5e_rq *rq, int len)
-{
-	struct mlx5e_shampo_hd *shampo = rq->mpwqe.shampo;
-	int end, from, full_len = len;
-
-	end = shampo->hd_per_wq;
-	from = shampo->ci;
-	if (from + len > end) {
-		len -= end - from;
-		bitmap_set(shampo->bitmap, from, end - from);
-		from = 0;
-	}
-
-	bitmap_set(shampo->bitmap, from, len);
-	shampo->ci = (shampo->ci + full_len) & (shampo->hd_per_wq - 1);
-}
-
-static void mlx5e_handle_shampo_hd_umr(struct mlx5e_shampo_umr umr,
-				       struct mlx5e_icosq *sq)
-{
-	struct mlx5e_channel *c = container_of(sq, struct mlx5e_channel, icosq);
-	/* assume 1:1 relationship between RQ and icosq */
-	struct mlx5e_rq *rq = &c->rq;
-
-	mlx5e_shampo_fill_umr(rq, umr.len);
-}
-
 int mlx5e_poll_ico_cq(struct mlx5e_cq *cq)
 {
 	struct mlx5e_icosq *sq = container_of(cq, struct mlx5e_icosq, cq);
@@ -1055,9 +840,6 @@ int mlx5e_poll_ico_cq(struct mlx5e_cq *cq)
 			break;
 		case MLX5E_ICOSQ_WQE_NOP:
 			break;
-		case MLX5E_ICOSQ_WQE_SHAMPO_HD_UMR:
-			mlx5e_handle_shampo_hd_umr(wi->shampo, sq);
-			break;
 #ifdef CONFIG_MLX5_EN_TLS
 		case MLX5E_ICOSQ_WQE_UMR_TLS:
 			break;
@@ -1245,15 +1027,6 @@ static unsigned int mlx5e_lro_update_hdr(struct sk_buff *skb,
 	return (unsigned int)((unsigned char *)tcp + tcp->doff * 4 - skb->data);
 }
 
-static void *mlx5e_shampo_get_packet_hd(struct mlx5e_rq *rq, u16 header_index)
-{
-	struct mlx5e_frag_page *frag_page = mlx5e_shampo_hd_to_frag_page(rq, header_index);
-	u16 head_offset = mlx5e_shampo_hd_offset(rq, header_index);
-	void *addr = netmem_address(frag_page->netmem);
-
-	return addr + head_offset + rq->buff.headroom;
-}
-
 static void mlx5e_shampo_update_ipv4_udp_hdr(struct mlx5e_rq *rq, struct iphdr *ipv4)
 {
 	int udp_off = rq->hw_gro_data->fk.control.thoff;
@@ -1292,15 +1065,41 @@ static void mlx5e_shampo_update_ipv6_udp_hdr(struct mlx5e_rq *rq, struct ipv6hdr
 	skb_shinfo(skb)->gso_type |= SKB_GSO_UDP_L4;
 }
 
+static inline u32 mlx5e_shampo_get_header_offset(int header_index)
+{
+	return (header_index & (MLX5E_SHAMPO_WQ_HEADER_PER_PAGE - 1)) *
+		BIT(MLX5E_SHAMPO_LOG_HEADER_ENTRY_SIZE);
+}
+
+static void *mlx5e_shampo_get_hdr(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe,
+				  int len)
+{
+	struct mlx5e_shampo_hd *shampo = rq->mpwqe.shampo;
+	u32 head_offset, header_index, di_index;
+	struct mlx5e_dma_info *di;
+
+	header_index = mlx5e_shampo_get_cqe_header_index(rq, cqe);
+	head_offset = mlx5e_shampo_get_header_offset(header_index);
+	di_index = header_index >> MLX5E_SHAMPO_LOG_WQ_HEADER_PER_PAGE;
+	di = &shampo->hd_buf_pages[di_index];
+
+	dma_sync_single_range_for_cpu(rq->pdev, di->addr, head_offset,
+				      len, rq->buff.map_dir);
+
+	return page_address(di->page) + head_offset;
+}
+
 static void mlx5e_shampo_update_fin_psh_flags(struct mlx5e_rq *rq, struct mlx5_cqe64 *cqe,
 					      struct tcphdr *skb_tcp_hd)
 {
-	u16 header_index = mlx5e_shampo_get_cqe_header_index(rq, cqe);
+	int nhoff = ETH_HLEN + rq->hw_gro_data->fk.control.thoff;
+	int len = nhoff + sizeof(struct tcphdr);
 	struct tcphdr *last_tcp_hd;
 	void *last_hd_addr;
 
-	last_hd_addr = mlx5e_shampo_get_packet_hd(rq, header_index);
-	last_tcp_hd = last_hd_addr + ETH_HLEN + rq->hw_gro_data->fk.control.thoff;
+	last_hd_addr = mlx5e_shampo_get_hdr(rq, cqe, len);
+	last_tcp_hd = (struct tcphdr *)(last_hd_addr + nhoff);
+
 	tcp_flag_word(skb_tcp_hd) |= tcp_flag_word(last_tcp_hd) & (TCP_FLAG_FIN | TCP_FLAG_PSH);
 }
 
@@ -2299,52 +2098,29 @@ static struct sk_buff *
 mlx5e_skb_from_cqe_shampo(struct mlx5e_rq *rq, struct mlx5e_mpw_info *wi,
 			  struct mlx5_cqe64 *cqe, u16 header_index)
 {
-	struct mlx5e_frag_page *frag_page = mlx5e_shampo_hd_to_frag_page(rq, header_index);
-	u16 head_offset = mlx5e_shampo_hd_offset(rq, header_index);
 	struct mlx5e_shampo_hd *shampo = rq->mpwqe.shampo;
 	u16 head_size = cqe->shampo.header_size;
-	u16 rx_headroom = rq->buff.headroom;
-	struct sk_buff *skb = NULL;
-	dma_addr_t page_dma_addr;
-	dma_addr_t dma_addr;
-	void *hdr, *data;
-	u32 frag_size;
-
-	page_dma_addr = page_pool_get_dma_addr_netmem(frag_page->netmem);
-	dma_addr = page_dma_addr + head_offset;
+	struct mlx5e_dma_info *di;
+	u32 head_offset, di_index;
+	struct sk_buff *skb;
+	int len;
 
-	hdr = netmem_address(frag_page->netmem) + head_offset;
-	data = hdr + rx_headroom;
-	frag_size = MLX5_SKB_FRAG_SZ(rx_headroom + head_size);
+	len = ALIGN(head_size, sizeof(long));
+	skb = napi_alloc_skb(rq->cq.napi, len);
+	if (unlikely(!skb)) {
+		rq->stats->buff_alloc_err++;
+		return NULL;
+	}
 
-	if (likely(frag_size <= BIT(shampo->log_hd_entry_size))) {
-		/* build SKB around header */
-		dma_sync_single_range_for_cpu(rq->pdev, dma_addr, 0, frag_size, rq->buff.map_dir);
-		net_prefetchw(hdr);
-		net_prefetch(data);
-		skb = mlx5e_build_linear_skb(rq, hdr, frag_size, rx_headroom, head_size, 0);
-		if (unlikely(!skb))
-			return NULL;
+	net_prefetchw(skb->data);
 
-		frag_page->frags++;
-	} else {
-		/* allocate SKB and copy header for large header */
-		rq->stats->gro_large_hds++;
-		skb = napi_alloc_skb(rq->cq.napi,
-				     ALIGN(head_size, sizeof(long)));
-		if (unlikely(!skb)) {
-			rq->stats->buff_alloc_err++;
-			return NULL;
-		}
+	head_offset = mlx5e_shampo_get_header_offset(header_index);
+	di_index = header_index >> MLX5E_SHAMPO_LOG_WQ_HEADER_PER_PAGE;
+	di = &shampo->hd_buf_pages[di_index];
 
-		net_prefetchw(skb->data);
-		mlx5e_copy_skb_header(rq, skb, frag_page->netmem, dma_addr,
-				      head_offset + rx_headroom,
-				      rx_headroom, head_size);
-		/* skb linear part was allocated with headlen and aligned to long */
-		skb->tail += head_size;
-		skb->len += head_size;
-	}
+	mlx5e_copy_skb_header(rq, skb, page_to_netmem(di->page), di->addr,
+			      head_offset, head_offset, len);
+	__skb_put(skb, head_size);
 
 	/* queue up for recycling/reuse */
 	skb_mark_for_recycle(skb);
@@ -2445,7 +2221,7 @@ static void mlx5e_handle_rx_cqe_mpwrq_shampo(struct mlx5e_rq *rq, struct mlx5_cq
 		 * prevent the kernel from touching it.
 		 */
 		if (unlikely(netmem_is_net_iov(frag_page->netmem)))
-			goto free_hd_entry;
+			goto mpwrq_cqe_out;
 		*skb = mlx5e_skb_from_cqe_mpwrq_nonlinear(rq, wi, cqe, cqe_bcnt,
 							  data_offset,
@@ -2453,19 +2229,22 @@ static void mlx5e_handle_rx_cqe_mpwrq_shampo(struct mlx5e_rq *rq, struct mlx5_cq
 		}
 
 		if (unlikely(!*skb))
-			goto free_hd_entry;
+			goto mpwrq_cqe_out;
 
 		NAPI_GRO_CB(*skb)->count = 1;
 		skb_shinfo(*skb)->gso_size = cqe_bcnt - head_size;
 	} else {
 		NAPI_GRO_CB(*skb)->count++;
+
 		if (NAPI_GRO_CB(*skb)->count == 2 &&
 		    rq->hw_gro_data->fk.basic.n_proto == htons(ETH_P_IP)) {
-			void *hd_addr = mlx5e_shampo_get_packet_hd(rq, header_index);
-			int nhoff = ETH_HLEN + rq->hw_gro_data->fk.control.thoff -
-				    sizeof(struct iphdr);
-			struct iphdr *iph = (struct iphdr *)(hd_addr + nhoff);
+			int len = ETH_HLEN + rq->hw_gro_data->fk.control.thoff;
+			int nhoff = len - sizeof(struct iphdr);
+			void *last_hd_addr;
+			struct iphdr *iph;
 
+			last_hd_addr = mlx5e_shampo_get_hdr(rq, cqe, len);
+			iph = (struct iphdr *)(last_hd_addr + nhoff);
 			rq->hw_gro_data->second_ip_id = ntohs(iph->id);
 		}
 	}
@@ -2487,13 +2266,10 @@ static void mlx5e_handle_rx_cqe_mpwrq_shampo(struct mlx5e_rq *rq, struct mlx5_cq
 
 	if (mlx5e_shampo_complete_rx_cqe(rq, cqe, cqe_bcnt, *skb)) {
 		*skb = NULL;
-		goto free_hd_entry;
+		goto mpwrq_cqe_out;
 	}
 	if (flush && rq->hw_gro_data->skb)
 		mlx5e_shampo_flush_skb(rq, cqe, match);
-free_hd_entry:
-	if (likely(head_size))
-		mlx5e_free_rx_shampo_hd_entry(rq, header_index);
 mpwrq_cqe_out:
 	if (likely(wi->consumed_strides < rq->mpwqe.num_strides))
 		return;
-- 
2.31.1