From nobody Tue Oct 7 11:39:05 2025
From: Tariq Toukan
To: Eric Dumazet , Jakub Kicinski , Paolo Abeni , Andrew Lunn , "David S.
 Miller"
CC: Saeed Mahameed , Gal Pressman , "Leon Romanovsky" , Saeed Mahameed , "Tariq Toukan" , Mark Bloch , Jonathan Corbet , , , , , Dragos Tatulea
Subject: [PATCH net-next V2 1/3] net/mlx5e: Create/destroy PCIe Congestion Event object
Date: Thu, 10 Jul 2025 09:51:30 +0300
Message-ID: <1752130292-22249-2-git-send-email-tariqt@nvidia.com>
X-Mailer: git-send-email 2.8.0
In-Reply-To: <1752130292-22249-1-git-send-email-tariqt@nvidia.com>
References: <1752130292-22249-1-git-send-email-tariqt@nvidia.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain;
 charset="utf-8"

From: Dragos Tatulea

Add initial infrastructure to create and destroy the PCIe Congestion
Event object if the object is supported.

The verb for the object creation function is "set" instead of "create"
because the function will accommodate the modify operation as well in a
subsequent patch.

The next patches will hook it up to the event handler and will add
actual functionality.

Signed-off-by: Dragos Tatulea
Signed-off-by: Tariq Toukan
---
 .../net/ethernet/mellanox/mlx5/core/Makefile  |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en.h  |   2 +
 .../mellanox/mlx5/core/en/pcie_cong_event.c   | 153 ++++++++++++++++++
 .../mellanox/mlx5/core/en/pcie_cong_event.h   |  11 ++
 .../net/ethernet/mellanox/mlx5/core/en_main.c |   3 +
 5 files changed, 170 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/pcie_cong_event.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/pcie_cong_event.h

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
index d292e6a9e22c..650df18a9216 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile
@@ -29,7 +29,7 @@ mlx5_core-$(CONFIG_MLX5_CORE_EN) += en/rqt.o en/tir.o en/rss.o en/rx_res.o \
 		en/reporter_tx.o en/reporter_rx.o en/params.o en/xsk/pool.o \
 		en/xsk/setup.o en/xsk/rx.o en/xsk/tx.o en/devlink.o en/ptp.o \
 		en/qos.o en/htb.o en/trap.o en/fs_tt_redirect.o en/selq.o \
-		lib/crypto.o lib/sd.o
+		lib/crypto.o lib/sd.o en/pcie_cong_event.o
 
 #
 # Netdev extra
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 64e69e616b1f..b6340e9453c0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -920,6 +920,8 @@ struct mlx5e_priv {
 	struct notifier_block events_nb;
 	struct notifier_block blocking_events_nb;
 
+	struct mlx5e_pcie_cong_event *cong_event;
+
 	struct udp_tunnel_nic_info nic_info;
 #ifdef CONFIG_MLX5_CORE_EN_DCB
 	struct mlx5e_dcbx dcbx;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/pcie_cong_event.c b/drivers/net/ethernet/mellanox/mlx5/core/en/pcie_cong_event.c
new file mode 100644
index 000000000000..95a6db9d30b3
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/pcie_cong_event.c
@@ -0,0 +1,153 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+// Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES.
+
+#include "en.h"
+#include "pcie_cong_event.h"
+
+struct mlx5e_pcie_cong_thresh {
+	u16 inbound_high;
+	u16 inbound_low;
+	u16 outbound_high;
+	u16 outbound_low;
+};
+
+struct mlx5e_pcie_cong_event {
+	u64 obj_id;
+
+	struct mlx5e_priv *priv;
+};
+
+/* In units of 0.01 % */
+static const struct mlx5e_pcie_cong_thresh default_thresh_config = {
+	.inbound_high = 9000,
+	.inbound_low = 7500,
+	.outbound_high = 9000,
+	.outbound_low = 7500,
+};
+
+static int
+mlx5_cmd_pcie_cong_event_set(struct mlx5_core_dev *dev,
+			     const struct mlx5e_pcie_cong_thresh *config,
+			     u64 *obj_id)
+{
+	u32 in[MLX5_ST_SZ_DW(pcie_cong_event_cmd_in)] = {};
+	u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)];
+	void *cong_obj;
+	void *hdr;
+	int err;
+
+	hdr = MLX5_ADDR_OF(pcie_cong_event_cmd_in, in, hdr);
+	cong_obj = MLX5_ADDR_OF(pcie_cong_event_cmd_in, in, cong_obj);
+
+	MLX5_SET(general_obj_in_cmd_hdr, hdr, opcode,
+		 MLX5_CMD_OP_CREATE_GENERAL_OBJECT);
+
+	MLX5_SET(general_obj_in_cmd_hdr, hdr, obj_type,
+		 MLX5_GENERAL_OBJECT_TYPES_PCIE_CONG_EVENT);
+
+	MLX5_SET(pcie_cong_event_obj, cong_obj, inbound_event_en, 1);
+	MLX5_SET(pcie_cong_event_obj, cong_obj, outbound_event_en, 1);
+
+	MLX5_SET(pcie_cong_event_obj, cong_obj,
+		 inbound_cong_high_threshold, config->inbound_high);
+	MLX5_SET(pcie_cong_event_obj, cong_obj,
+		 inbound_cong_low_threshold, config->inbound_low);
+
+	MLX5_SET(pcie_cong_event_obj, cong_obj,
+		 outbound_cong_high_threshold, config->outbound_high);
+	MLX5_SET(pcie_cong_event_obj, cong_obj,
+		 outbound_cong_low_threshold, config->outbound_low);
+
+	err = mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out));
+	if (err)
+		return err;
+
+	*obj_id = MLX5_GET(general_obj_out_cmd_hdr, out, obj_id);
+
+	mlx5_core_dbg(dev, "PCIe congestion event (obj_id=%llu) created. Config: in: [%u, %u], out: [%u, %u]\n",
+		      *obj_id,
+		      config->inbound_high, config->inbound_low,
+		      config->outbound_high, config->outbound_low);
+
+	return 0;
+}
+
+static int mlx5_cmd_pcie_cong_event_destroy(struct mlx5_core_dev *dev,
+					    u64 obj_id)
+{
+	u32 in[MLX5_ST_SZ_DW(pcie_cong_event_cmd_in)] = {};
+	u32 out[MLX5_ST_SZ_DW(general_obj_out_cmd_hdr)];
+	void *hdr;
+
+	hdr = MLX5_ADDR_OF(pcie_cong_event_cmd_in, in, hdr);
+	MLX5_SET(general_obj_in_cmd_hdr, hdr, opcode,
+		 MLX5_CMD_OP_DESTROY_GENERAL_OBJECT);
+	MLX5_SET(general_obj_in_cmd_hdr, hdr, obj_type,
+		 MLX5_GENERAL_OBJECT_TYPES_PCIE_CONG_EVENT);
+	MLX5_SET(general_obj_in_cmd_hdr, hdr, obj_id, obj_id);
+
+	return mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out));
+}
+
+bool mlx5e_pcie_cong_event_supported(struct mlx5_core_dev *dev)
+{
+	u64 features = MLX5_CAP_GEN_2_64(dev, general_obj_types_127_64);
+
+	if (!(features & MLX5_HCA_CAP_2_GENERAL_OBJECT_TYPES_PCIE_CONG_EVENT))
+		return false;
+
+	if (dev->sd)
+		return false;
+
+	return true;
+}
+
+int mlx5e_pcie_cong_event_init(struct mlx5e_priv *priv)
+{
+	struct mlx5e_pcie_cong_event *cong_event;
+	struct mlx5_core_dev *mdev = priv->mdev;
+	int err;
+
+	if (!mlx5e_pcie_cong_event_supported(mdev))
+		return 0;
+
+	cong_event = kvzalloc_node(sizeof(*cong_event), GFP_KERNEL,
+				   mdev->priv.numa_node);
+	if (!cong_event)
+		return -ENOMEM;
+
+	cong_event->priv = priv;
+
+	err = mlx5_cmd_pcie_cong_event_set(mdev, &default_thresh_config,
+					   &cong_event->obj_id);
+	if (err) {
+		mlx5_core_warn(mdev, "Error creating a PCIe congestion event object\n");
+		goto err_free;
+	}
+
+	priv->cong_event = cong_event;
+
+	return 0;
+
+err_free:
+	kvfree(cong_event);
+
+	return err;
+}
+
+void mlx5e_pcie_cong_event_cleanup(struct mlx5e_priv *priv)
+{
+	struct mlx5e_pcie_cong_event *cong_event = priv->cong_event;
+	struct mlx5_core_dev *mdev = priv->mdev;
+
+	if (!cong_event)
+		return;
+
+	priv->cong_event = NULL;
+
+	if (mlx5_cmd_pcie_cong_event_destroy(mdev, cong_event->obj_id))
+		mlx5_core_warn(mdev, "Error destroying PCIe congestion event (obj_id=%llu)\n",
+			       cong_event->obj_id);
+
+	kvfree(cong_event);
+}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/pcie_cong_event.h b/drivers/net/ethernet/mellanox/mlx5/core/en/pcie_cong_event.h
new file mode 100644
index 000000000000..bf1e3632d596
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/pcie_cong_event.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/* Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. */
+
+#ifndef __MLX5_PCIE_CONG_EVENT_H__
+#define __MLX5_PCIE_CONG_EVENT_H__
+
+bool mlx5e_pcie_cong_event_supported(struct mlx5_core_dev *dev);
+int mlx5e_pcie_cong_event_init(struct mlx5e_priv *priv);
+void mlx5e_pcie_cong_event_cleanup(struct mlx5e_priv *priv);
+
+#endif /* __MLX5_PCIE_CONG_EVENT_H__ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index fee323ade522..bd481f3384d0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -76,6 +76,7 @@
 #include "en/trap.h"
 #include "lib/devcom.h"
 #include "lib/sd.h"
+#include "en/pcie_cong_event.h"
 
 static bool mlx5e_hw_gro_supported(struct mlx5_core_dev *mdev)
 {
@@ -5989,6 +5990,7 @@ static void mlx5e_nic_enable(struct mlx5e_priv *priv)
 	if (mlx5e_monitor_counter_supported(priv))
 		mlx5e_monitor_counter_init(priv);
 
+	mlx5e_pcie_cong_event_init(priv);
 	mlx5e_hv_vhca_stats_create(priv);
 	if (netdev->reg_state != NETREG_REGISTERED)
 		return;
@@ -6028,6 +6030,7 @@ static void mlx5e_nic_disable(struct mlx5e_priv *priv)
 
 	mlx5e_nic_set_rx_mode(priv);
 
+	mlx5e_pcie_cong_event_cleanup(priv);
 	mlx5e_hv_vhca_stats_destroy(priv);
 	if (mlx5e_monitor_counter_supported(priv))
 		mlx5e_monitor_counter_cleanup(priv);
-- 
2.31.1

From nobody Tue Oct 7 11:39:05 2025
From: Tariq Toukan
To: Eric Dumazet , Jakub Kicinski , Paolo Abeni , Andrew Lunn , "David S.
 Miller"
CC: Saeed Mahameed , Gal Pressman , "Leon Romanovsky" , Saeed Mahameed , "Tariq Toukan" , Mark Bloch , Jonathan Corbet , , , , , Dragos Tatulea
Subject: [PATCH net-next V2 2/3] net/mlx5e: Add device PCIe congestion ethtool stats
Date: Thu, 10 Jul 2025 09:51:31 +0300
Message-ID: <1752130292-22249-3-git-send-email-tariqt@nvidia.com>
X-Mailer: git-send-email 2.8.0
In-Reply-To: <1752130292-22249-1-git-send-email-tariqt@nvidia.com>
References: <1752130292-22249-1-git-send-email-tariqt@nvidia.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain;
 charset="utf-8"

From: Dragos Tatulea

Implement the PCIe Congestion Event notifier which triggers a work item
to query the PCIe Congestion Event object. The result of the congestion
state is reflected in the new ethtool stats:

* pci_bw_inbound_high: the device has crossed the high threshold for
  inbound PCIe traffic.
* pci_bw_inbound_low: the device has crossed the low threshold for
  inbound PCIe traffic.
* pci_bw_outbound_high: the device has crossed the high threshold for
  outbound PCIe traffic.
* pci_bw_outbound_low: the device has crossed the low threshold for
  outbound PCIe traffic.

The high and low thresholds are currently configured at 90% and 75%.
These are hysteresis thresholds which help to check if the PCI bus on
the device side is in a congested state.

If low + 1 = high then the device is in a congested state. If low == high
then the device is not in a congested state.

The counters are also documented.

A follow-up patch will make the thresholds configurable.

Signed-off-by: Dragos Tatulea
Signed-off-by: Tariq Toukan
---
 .../ethernet/mellanox/mlx5/counters.rst       |  32 ++++
 .../mellanox/mlx5/core/en/pcie_cong_event.c   | 175 ++++++++++++++++++
 .../ethernet/mellanox/mlx5/core/en_stats.c    |   1 +
 .../ethernet/mellanox/mlx5/core/en_stats.h    |   1 +
 drivers/net/ethernet/mellanox/mlx5/core/eq.c  |   4 +
 5 files changed, 213 insertions(+)

diff --git a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst
index 43d72c8b713b..754c81436408 100644
--- a/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst
+++ b/Documentation/networking/device_drivers/ethernet/mellanox/mlx5/counters.rst
@@ -1341,3 +1341,35 @@ Device Counters
      - The number of times the device owned queue had not enough buffers
        allocated.
      - Error
+
+   * - `pci_bw_inbound_high`
+     - The number of times the device crossed the high inbound PCIe bandwidth
+       threshold. To be compared to pci_bw_inbound_low to check if the device
+       is in a congested state.
+       If pci_bw_inbound_high == pci_bw_inbound_low then the device is not congested.
+       If pci_bw_inbound_high > pci_bw_inbound_low then the device is congested.
+     - Informative
+
+   * - `pci_bw_inbound_low`
+     - The number of times the device crossed the low inbound PCIe bandwidth
+       threshold. To be compared to pci_bw_inbound_high to check if the device
+       is in a congested state.
+       If pci_bw_inbound_high == pci_bw_inbound_low then the device is not congested.
+       If pci_bw_inbound_high > pci_bw_inbound_low then the device is congested.
+     - Informative
+
+   * - `pci_bw_outbound_high`
+     - The number of times the device crossed the high outbound PCIe bandwidth
+       threshold. To be compared to pci_bw_outbound_low to check if the device
+       is in a congested state.
+       If pci_bw_outbound_high == pci_bw_outbound_low then the device is not congested.
+       If pci_bw_outbound_high > pci_bw_outbound_low then the device is congested.
+     - Informative
+
+   * - `pci_bw_outbound_low`
+     - The number of times the device crossed the low outbound PCIe bandwidth
+       threshold. To be compared to pci_bw_outbound_high to check if the device
+       is in a congested state.
+       If pci_bw_outbound_high == pci_bw_outbound_low then the device is not congested.
+       If pci_bw_outbound_high > pci_bw_outbound_low then the device is congested.
+     - Informative
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/pcie_cong_event.c b/drivers/net/ethernet/mellanox/mlx5/core/en/pcie_cong_event.c
index 95a6db9d30b3..a24e5465ceeb 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/pcie_cong_event.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/pcie_cong_event.c
@@ -4,6 +4,13 @@
 #include "en.h"
 #include "pcie_cong_event.h"
 
+#define MLX5E_CONG_HIGH_STATE 0x7
+
+enum {
+	MLX5E_INBOUND_CONG = BIT(0),
+	MLX5E_OUTBOUND_CONG = BIT(1),
+};
+
 struct mlx5e_pcie_cong_thresh {
 	u16 inbound_high;
 	u16 inbound_low;
@@ -11,10 +18,27 @@ struct mlx5e_pcie_cong_thresh {
 	u16 outbound_low;
 };
 
+struct mlx5e_pcie_cong_stats {
+	u32 pci_bw_inbound_high;
+	u32 pci_bw_inbound_low;
+	u32 pci_bw_outbound_high;
+	u32 pci_bw_outbound_low;
+};
+
 struct mlx5e_pcie_cong_event {
 	u64 obj_id;
 
 	struct mlx5e_priv *priv;
+
+	/* For event notifier and workqueue. */
+	struct work_struct work;
+	struct mlx5_nb nb;
+
+	/* Stores last read state. */
+	u8 state;
+
+	/* For ethtool stats group. */
+	struct mlx5e_pcie_cong_stats stats;
 };
 
 /* In units of 0.01 % */
@@ -25,6 +49,51 @@ static const struct mlx5e_pcie_cong_thresh default_thresh_config = {
 	.outbound_low = 7500,
 };
 
+static const struct counter_desc mlx5e_pcie_cong_stats_desc[] = {
+	{ MLX5E_DECLARE_STAT(struct mlx5e_pcie_cong_stats,
+			     pci_bw_inbound_high) },
+	{ MLX5E_DECLARE_STAT(struct mlx5e_pcie_cong_stats,
+			     pci_bw_inbound_low) },
+	{ MLX5E_DECLARE_STAT(struct mlx5e_pcie_cong_stats,
+			     pci_bw_outbound_high) },
+	{ MLX5E_DECLARE_STAT(struct mlx5e_pcie_cong_stats,
+			     pci_bw_outbound_low) },
+};
+
+#define NUM_PCIE_CONG_COUNTERS ARRAY_SIZE(mlx5e_pcie_cong_stats_desc)
+
+static MLX5E_DECLARE_STATS_GRP_OP_NUM_STATS(pcie_cong)
+{
+	return priv->cong_event ? NUM_PCIE_CONG_COUNTERS : 0;
+}
+
+static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(pcie_cong) {}
+
+static MLX5E_DECLARE_STATS_GRP_OP_FILL_STRS(pcie_cong)
+{
+	if (!priv->cong_event)
+		return;
+
+	for (int i = 0; i < NUM_PCIE_CONG_COUNTERS; i++)
+		ethtool_puts(data, mlx5e_pcie_cong_stats_desc[i].format);
+}
+
+static MLX5E_DECLARE_STATS_GRP_OP_FILL_STATS(pcie_cong)
+{
+	if (!priv->cong_event)
+		return;
+
+	for (int i = 0; i < NUM_PCIE_CONG_COUNTERS; i++) {
+		u32 ctr = MLX5E_READ_CTR32_CPU(&priv->cong_event->stats,
+					       mlx5e_pcie_cong_stats_desc,
+					       i);
+
+		mlx5e_ethtool_put_stat(data, ctr);
+	}
+}
+
+MLX5E_DEFINE_STATS_GRP(pcie_cong, 0);
+
 static int
 mlx5_cmd_pcie_cong_event_set(struct mlx5_core_dev *dev,
 			     const struct mlx5e_pcie_cong_thresh *config,
@@ -89,6 +158,97 @@ static int mlx5_cmd_pcie_cong_event_destroy(struct mlx5_core_dev *dev,
 	return mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out));
 }
 
+static int mlx5_cmd_pcie_cong_event_query(struct mlx5_core_dev *dev,
+					  u64 obj_id,
+					  u32 *state)
+{
+	u32 in[MLX5_ST_SZ_DW(pcie_cong_event_cmd_in)] = {};
+	u32 out[MLX5_ST_SZ_DW(pcie_cong_event_cmd_out)];
+	void *obj;
+	void *hdr;
+	u8 cong;
+	int err;
+
+	hdr = MLX5_ADDR_OF(pcie_cong_event_cmd_in, in, hdr);
+
+	MLX5_SET(general_obj_in_cmd_hdr, hdr, opcode,
+		 MLX5_CMD_OP_QUERY_GENERAL_OBJECT);
+	MLX5_SET(general_obj_in_cmd_hdr, hdr, obj_type,
+		 MLX5_GENERAL_OBJECT_TYPES_PCIE_CONG_EVENT);
+	MLX5_SET(general_obj_in_cmd_hdr, hdr, obj_id, obj_id);
+
+	err = mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out));
+	if (err)
+		return err;
+
+	obj = MLX5_ADDR_OF(pcie_cong_event_cmd_out, out, cong_obj);
+
+	if (state) {
+		cong = MLX5_GET(pcie_cong_event_obj, obj, inbound_cong_state);
+		if (cong == MLX5E_CONG_HIGH_STATE)
+			*state |= MLX5E_INBOUND_CONG;
+
+		cong = MLX5_GET(pcie_cong_event_obj, obj, outbound_cong_state);
+		if (cong == MLX5E_CONG_HIGH_STATE)
+			*state |= MLX5E_OUTBOUND_CONG;
+	}
+
+	return 0;
+}
+
+static void mlx5e_pcie_cong_event_work(struct work_struct *work)
+{
+	struct mlx5e_pcie_cong_event *cong_event;
+	struct mlx5_core_dev *dev;
+	struct mlx5e_priv *priv;
+	u32 new_cong_state = 0;
+	u32 changes;
+	int err;
+
+	cong_event = container_of(work, struct mlx5e_pcie_cong_event, work);
+	priv = cong_event->priv;
+	dev = priv->mdev;
+
+	err = mlx5_cmd_pcie_cong_event_query(dev, cong_event->obj_id,
+					     &new_cong_state);
+	if (err) {
+		mlx5_core_warn(dev, "Error %d when querying PCIe cong event object (obj_id=%llu).\n",
+			       err, cong_event->obj_id);
+		return;
+	}
+
+	changes = cong_event->state ^ new_cong_state;
+	if (!changes)
+		return;
+
+	cong_event->state = new_cong_state;
+
+	if (changes & MLX5E_INBOUND_CONG) {
+		if (new_cong_state & MLX5E_INBOUND_CONG)
+			cong_event->stats.pci_bw_inbound_high++;
+		else
+			cong_event->stats.pci_bw_inbound_low++;
+	}
+
+	if (changes & MLX5E_OUTBOUND_CONG) {
+		if (new_cong_state & MLX5E_OUTBOUND_CONG)
+			cong_event->stats.pci_bw_outbound_high++;
+		else
+			cong_event->stats.pci_bw_outbound_low++;
+	}
+}
+
+static int mlx5e_pcie_cong_event_handler(struct notifier_block *nb,
+					 unsigned long event, void *eqe)
+{
+	struct mlx5e_pcie_cong_event *cong_event;
+
+	cong_event = mlx5_nb_cof(nb, struct mlx5e_pcie_cong_event, nb);
+	queue_work(cong_event->priv->wq, &cong_event->work);
+
+	return NOTIFY_OK;
+}
+
 bool mlx5e_pcie_cong_event_supported(struct mlx5_core_dev *dev)
 {
 	u64 features = MLX5_CAP_GEN_2_64(dev, general_obj_types_127_64);
@@ -116,6 +276,10 @@ int mlx5e_pcie_cong_event_init(struct mlx5e_priv *priv)
 	if (!cong_event)
 		return -ENOMEM;
 
+	INIT_WORK(&cong_event->work, mlx5e_pcie_cong_event_work);
+	MLX5_NB_INIT(&cong_event->nb, mlx5e_pcie_cong_event_handler,
+		     OBJECT_CHANGE);
+
 	cong_event->priv = priv;
 
 	err = mlx5_cmd_pcie_cong_event_set(mdev, &default_thresh_config,
@@ -125,10 +289,18 @@ int mlx5e_pcie_cong_event_init(struct mlx5e_priv *priv)
 		goto err_free;
 	}
 
+	err = mlx5_eq_notifier_register(mdev, &cong_event->nb);
+	if (err) {
+		mlx5_core_warn(mdev, "Error registering notifier for the PCIe congestion event\n");
+		goto err_obj_destroy;
+	}
+
 	priv->cong_event = cong_event;
 
 	return 0;
 
+err_obj_destroy:
+	mlx5_cmd_pcie_cong_event_destroy(mdev, cong_event->obj_id);
 err_free:
 	kvfree(cong_event);
 
@@ -145,6 +317,9 @@ void mlx5e_pcie_cong_event_cleanup(struct mlx5e_priv *priv)
 
 	priv->cong_event = NULL;
 
+	mlx5_eq_notifier_unregister(mdev, &cong_event->nb);
+	cancel_work_sync(&cong_event->work);
+
 	if (mlx5_cmd_pcie_cong_event_destroy(mdev, cong_event->obj_id))
 		mlx5_core_warn(mdev, "Error destroying PCIe congestion event (obj_id=%llu)\n",
 			       cong_event->obj_id);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
index 19664fa7f217..87536f158d07 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
@@ -2612,6 +2612,7 @@ mlx5e_stats_grp_t mlx5e_nic_stats_grps[] = {
 #ifdef CONFIG_MLX5_MACSEC
 	&MLX5E_STATS_GRP(macsec_hw),
 #endif
+	&MLX5E_STATS_GRP(pcie_cong),
 };
 
 unsigned int mlx5e_nic_stats_grps_num(struct mlx5e_priv *priv)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
index def5dea1463d..72dbcc1928ef 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
@@ -535,5 +535,6 @@ extern MLX5E_DECLARE_STATS_GRP(ipsec_hw);
 extern MLX5E_DECLARE_STATS_GRP(ipsec_sw);
 extern MLX5E_DECLARE_STATS_GRP(ptp);
 extern MLX5E_DECLARE_STATS_GRP(macsec_hw);
+extern MLX5E_DECLARE_STATS_GRP(pcie_cong);
 
 #endif /* __MLX5_EN_STATS_H__ */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index dfb079e59d85..db54f6d26591 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -21,6 +21,7 @@
 #include "pci_irq.h"
 #include "devlink.h"
 #include "en_accel/ipsec.h"
+#include "en/pcie_cong_event.h"
 
 enum {
 	MLX5_EQE_OWNER_INIT_VAL = 0x1,
@@ -585,6 +586,9 @@ static void gather_async_events_mask(struct mlx5_core_dev *dev, u64 mask[4])
 		async_event_mask |=
 			(1ull << MLX5_EVENT_TYPE_OBJECT_CHANGE);
 
+	if (mlx5e_pcie_cong_event_supported(dev))
+		async_event_mask |= (1ull << MLX5_EVENT_TYPE_OBJECT_CHANGE);
+
 	mask[0] = async_event_mask;
 
 	if (MLX5_CAP_GEN(dev, event_cap))
-- 
2.31.1

From nobody Tue Oct 7 11:39:05 2025
From: Tariq Toukan
To: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Andrew Lunn, "David S. Miller"
CC: Saeed Mahameed, Gal Pressman, "Leon Romanovsky", Saeed Mahameed, "Tariq Toukan", Mark Bloch, Jonathan Corbet, Dragos Tatulea
Subject: [PATCH net-next V2 3/3] net/mlx5e: Make PCIe congestion event thresholds configurable
Date: Thu, 10 Jul 2025 09:51:32 +0300
Message-ID: <1752130292-22249-4-git-send-email-tariqt@nvidia.com>
X-Mailer: git-send-email 2.8.0
In-Reply-To: <1752130292-22249-1-git-send-email-tariqt@nvidia.com>
References: <1752130292-22249-1-git-send-email-tariqt@nvidia.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Dragos Tatulea

Add a new sysfs entry for reading and configuring the PCIe congestion
event thresholds. The format is the following:
<inbound_low> <inbound_high> <outbound_low> <outbound_high>

Units are 0.01 %. Accepted values are in range (0, 10000].

When new thresholds are configured, an object modify operation will
happen. The set function is updated accordingly to act as a modify
as well.

The threshold configuration is stored and queried directly
in the firmware.

To prevent fat fingering the numbers, read them initially as u64.

Signed-off-by: Dragos Tatulea
Signed-off-by: Tariq Toukan
---
 .../mellanox/mlx5/core/en/pcie_cong_event.c   | 152 +++++++++++++++++-
 1 file changed, 144 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/pcie_cong_event.c b/drivers/net/ethernet/mellanox/mlx5/core/en/pcie_cong_event.c
index a24e5465ceeb..a74d1e15c92e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/pcie_cong_event.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/pcie_cong_event.c
@@ -39,9 +39,13 @@ struct mlx5e_pcie_cong_event {
 
 	/* For ethtool stats group. */
 	struct mlx5e_pcie_cong_stats stats;
+
+	struct device_attribute attr;
 };
 
 /* In units of 0.01 % */
+#define MLX5E_PCIE_CONG_THRESH_MAX 10000
+
 static const struct mlx5e_pcie_cong_thresh default_thresh_config = {
 	.inbound_high = 9000,
 	.inbound_low = 7500,
@@ -97,6 +101,7 @@ MLX5E_DEFINE_STATS_GRP(pcie_cong, 0);
 static int
 mlx5_cmd_pcie_cong_event_set(struct mlx5_core_dev *dev,
 			     const struct mlx5e_pcie_cong_thresh *config,
+			     bool modify,
 			     u64 *obj_id)
 {
 	u32 in[MLX5_ST_SZ_DW(pcie_cong_event_cmd_in)] = {};
@@ -108,8 +113,16 @@ mlx5_cmd_pcie_cong_event_set(struct mlx5_core_dev *dev,
 	hdr = MLX5_ADDR_OF(pcie_cong_event_cmd_in, in, hdr);
 	cong_obj = MLX5_ADDR_OF(pcie_cong_event_cmd_in, in, cong_obj);
 
-	MLX5_SET(general_obj_in_cmd_hdr, hdr, opcode,
-		 MLX5_CMD_OP_CREATE_GENERAL_OBJECT);
+	if (!modify) {
+		MLX5_SET(general_obj_in_cmd_hdr, hdr, opcode,
+			 MLX5_CMD_OP_CREATE_GENERAL_OBJECT);
+	} else {
+		MLX5_SET(general_obj_in_cmd_hdr, hdr, opcode,
+			 MLX5_CMD_OP_MODIFY_GENERAL_OBJECT);
+		MLX5_SET(general_obj_in_cmd_hdr, in, obj_id, *obj_id);
+		MLX5_SET64(pcie_cong_event_obj, cong_obj, modify_select_field,
+			   MLX5_PCIE_CONG_EVENT_MOD_THRESH);
+	}
 
 	MLX5_SET(general_obj_in_cmd_hdr, hdr, obj_type,
 		 MLX5_GENERAL_OBJECT_TYPES_PCIE_CONG_EVENT);
@@ -131,10 +144,12 @@ mlx5_cmd_pcie_cong_event_set(struct mlx5_core_dev *dev,
 	if (err)
 		return err;
 
-	*obj_id = MLX5_GET(general_obj_out_cmd_hdr, out, obj_id);
+	if (!modify)
+		*obj_id = MLX5_GET(general_obj_out_cmd_hdr, out, obj_id);
 
-	mlx5_core_dbg(dev, "PCIe congestion event (obj_id=%llu) created. Config: in: [%u, %u], out: [%u, %u]\n",
+	mlx5_core_dbg(dev, "PCIe congestion event (obj_id=%llu) %s. Config: in: [%u, %u], out: [%u, %u]\n",
 		      *obj_id,
+		      modify ? "modified" : "created",
 		      config->inbound_high, config->inbound_low,
 		      config->outbound_high, config->outbound_low);
 
@@ -160,13 +175,13 @@ static int mlx5_cmd_pcie_cong_event_destroy(struct mlx5_core_dev *dev,
 
 static int mlx5_cmd_pcie_cong_event_query(struct mlx5_core_dev *dev,
 					  u64 obj_id,
-					  u32 *state)
+					  u32 *state,
+					  struct mlx5e_pcie_cong_thresh *config)
 {
 	u32 in[MLX5_ST_SZ_DW(pcie_cong_event_cmd_in)] = {};
 	u32 out[MLX5_ST_SZ_DW(pcie_cong_event_cmd_out)];
 	void *obj;
 	void *hdr;
-	u8 cong;
 	int err;
 
 	hdr = MLX5_ADDR_OF(pcie_cong_event_cmd_in, in, hdr);
@@ -184,6 +199,8 @@ static int mlx5_cmd_pcie_cong_event_query(struct mlx5_core_dev *dev,
 	obj = MLX5_ADDR_OF(pcie_cong_event_cmd_out, out, cong_obj);
 
 	if (state) {
+		u8 cong;
+
 		cong = MLX5_GET(pcie_cong_event_obj, obj, inbound_cong_state);
 		if (cong == MLX5E_CONG_HIGH_STATE)
 			*state |= MLX5E_INBOUND_CONG;
@@ -193,6 +210,19 @@ static int mlx5_cmd_pcie_cong_event_query(struct mlx5_core_dev *dev,
 			*state |= MLX5E_OUTBOUND_CONG;
 	}
 
+	if (config) {
+		*config = (struct mlx5e_pcie_cong_thresh) {
+			.inbound_low = MLX5_GET(pcie_cong_event_obj, obj,
+						inbound_cong_low_threshold),
+			.inbound_high = MLX5_GET(pcie_cong_event_obj, obj,
+						 inbound_cong_high_threshold),
+			.outbound_low = MLX5_GET(pcie_cong_event_obj, obj,
+						 outbound_cong_low_threshold),
+			.outbound_high = MLX5_GET(pcie_cong_event_obj, obj,
+						  outbound_cong_high_threshold),
+		};
+	}
+
 	return 0;
 }
 
@@ -210,7 +240,7 @@ static void mlx5e_pcie_cong_event_work(struct work_struct *work)
 	dev = priv->mdev;
 
 	err = mlx5_cmd_pcie_cong_event_query(dev, cong_event->obj_id,
-					     &new_cong_state);
+					     &new_cong_state, NULL);
 	if (err) {
 		mlx5_core_warn(dev, "Error %d when querying PCIe cong event object (obj_id=%llu).\n",
 			       err, cong_event->obj_id);
@@ -249,6 +279,101 @@ static int mlx5e_pcie_cong_event_handler(struct notifier_block *nb,
 	return NOTIFY_OK;
 }
 
+static bool mlx5e_thresh_check_val(u64 val)
+{
+	return val > 0 && val <= MLX5E_PCIE_CONG_THRESH_MAX;
+}
+
+static bool
+mlx5e_thresh_config_check_order(const struct mlx5e_pcie_cong_thresh *config)
+{
+	if (config->inbound_high <= config->inbound_low)
+		return false;
+
+	if (config->outbound_high <= config->outbound_low)
+		return false;
+
+	return true;
+}
+
+#define MLX5E_PCIE_CONG_THRESH_SYSFS_VALUES 4
+
+static ssize_t thresh_config_store(struct device *dev,
+				   struct device_attribute *attr,
+				   const char *buf,
+				   size_t count)
+{
+	struct mlx5e_pcie_cong_thresh config = {};
+	struct mlx5e_pcie_cong_event *cong_event;
+	u64 outbound_high, outbound_low;
+	u64 inbound_high, inbound_low;
+	struct mlx5e_priv *priv;
+	int ret;
+	int err;
+
+	cong_event = container_of(attr, struct mlx5e_pcie_cong_event, attr);
+	priv = cong_event->priv;
+
+	ret = sscanf(buf, "%llu %llu %llu %llu",
+		     &inbound_low, &inbound_high,
+		     &outbound_low, &outbound_high);
+	if (ret != MLX5E_PCIE_CONG_THRESH_SYSFS_VALUES) {
+		mlx5_core_err(priv->mdev, "Invalid format for PCIe congestion threshold configuration. Expected %d, got %d.\n",
+			      MLX5E_PCIE_CONG_THRESH_SYSFS_VALUES, ret);
+		return -EINVAL;
+	}
+
+	if (!mlx5e_thresh_check_val(inbound_high) ||
+	    !mlx5e_thresh_check_val(inbound_low) ||
+	    !mlx5e_thresh_check_val(outbound_high) ||
+	    !mlx5e_thresh_check_val(outbound_low)) {
+		mlx5_core_err(priv->mdev, "Invalid values for PCIe congestion threshold configuration. Valid range [1, %d]\n",
+			      MLX5E_PCIE_CONG_THRESH_MAX);
+		return -EINVAL;
+	}
+
+	config = (struct mlx5e_pcie_cong_thresh) {
+		.inbound_low = inbound_low,
+		.inbound_high = inbound_high,
+		.outbound_low = outbound_low,
+		.outbound_high = outbound_high,
+
+	};
+
+	if (!mlx5e_thresh_config_check_order(&config)) {
+		mlx5_core_err(priv->mdev, "Invalid order of values for PCIe congestion threshold configuration.\n");
+		return -EINVAL;
+	}
+
+	err = mlx5_cmd_pcie_cong_event_set(priv->mdev, &config,
+					   true, &cong_event->obj_id);
+
+	return err ? err : count;
+}
+
+static ssize_t thresh_config_show(struct device *dev,
+				  struct device_attribute *attr,
+				  char *buf)
+{
+	struct mlx5e_pcie_cong_event *cong_event;
+	struct mlx5e_pcie_cong_thresh config;
+	struct mlx5e_priv *priv;
+	int err;
+
+	cong_event = container_of(attr, struct mlx5e_pcie_cong_event, attr);
+	priv = cong_event->priv;
+
+	err = mlx5_cmd_pcie_cong_event_query(priv->mdev, cong_event->obj_id,
+					     NULL, &config);
+
+	if (err)
+		return err;
+
+	return sysfs_emit(buf, "%u %u %u %u\n",
+			  config.inbound_low, config.inbound_high,
+			  config.outbound_low, config.outbound_high);
+}
+
 bool mlx5e_pcie_cong_event_supported(struct mlx5_core_dev *dev)
 {
 	u64 features = MLX5_CAP_GEN_2_64(dev, general_obj_types_127_64);
@@ -283,7 +408,7 @@ int mlx5e_pcie_cong_event_init(struct mlx5e_priv *priv)
 	cong_event->priv = priv;
 
 	err = mlx5_cmd_pcie_cong_event_set(mdev, &default_thresh_config,
-					   &cong_event->obj_id);
+					   false, &cong_event->obj_id);
 	if (err) {
 		mlx5_core_warn(mdev, "Error creating a PCIe congestion event object\n");
 		goto err_free;
@@ -295,10 +420,20 @@ int mlx5e_pcie_cong_event_init(struct mlx5e_priv *priv)
 		goto err_obj_destroy;
 	}
 
+	cong_event->attr = (struct device_attribute)__ATTR_RW(thresh_config);
+	err = sysfs_create_file(&mdev->device->kobj,
+				&cong_event->attr.attr);
+	if (err) {
+		mlx5_core_warn(mdev, "Error creating a sysfs entry for pcie_cong limits.\n");
+		goto err_unregister_nb;
+	}
+
 	priv->cong_event = cong_event;
 
 	return 0;
 
+err_unregister_nb:
+	mlx5_eq_notifier_unregister(mdev, &cong_event->nb);
 err_obj_destroy:
 	mlx5_cmd_pcie_cong_event_destroy(mdev, cong_event->obj_id);
 err_free:
@@ -316,6 +451,7 @@ void mlx5e_pcie_cong_event_cleanup(struct mlx5e_priv *priv)
 		return;
 
 	priv->cong_event = NULL;
+	sysfs_remove_file(&mdev->device->kobj, &cong_event->attr.attr);
 
 	mlx5_eq_notifier_unregister(mdev, &cong_event->nb);
 	cancel_work_sync(&cong_event->work);
-- 
2.31.1