From: Jijie Shao
Subject: [PATCH net 1/3] net: hibmcge: fix rtnl deadlock issue
Date: Thu, 31 Jul 2025 21:47:47 +0800
Message-ID: <20250731134749.4090041-2-shaojijie@huawei.com>
In-Reply-To: <20250731134749.4090041-1-shaojijie@huawei.com>
References: <20250731134749.4090041-1-shaojijie@huawei.com>
X-Mailer: git-send-email 2.30.0
X-Mailing-List: linux-kernel@vger.kernel.org

Currently, the hibmcge netdev acquires the rtnl_lock in
pci_error_handlers.reset_prepare() and releases it in
pci_error_handlers.reset_done().

However, in the PCI framework:
pci_reset_bus - __pci_reset_slot - pci_slot_save_and_disable_locked -
 pci_dev_save_and_disable - err_handler->reset_prepare(dev);

In pci_slot_save_and_disable_locked():
	list_for_each_entry(dev, &slot->bus->devices, bus_list) {
		if (!dev->slot || dev->slot != slot)
			continue;
		pci_dev_save_and_disable(dev);
		if (dev->subordinate)
			pci_bus_save_and_disable_locked(dev->subordinate);
	}

This iterates through all devices under the current bus and executes
err_handler->reset_prepare() for each of them. With two hibmcge devices,
the second reset_prepare() requests the rtnl_lock while the first one
still holds it (it is only released in reset_done()), leading to a
deadlock.

Since the driver now executes netif_device_detach() before the reset
process, the reset will not run concurrently with other netdev APIs, so
there is no need to hold the rtnl_lock anymore.

Therefore, this patch removes the rtnl_lock from the reset process and
adjusts the position of HBG_NIC_STATE_RESETTING to ensure that multiple
resets are not executed concurrently.

Fixes: 3f5a61f6d504f ("net: hibmcge: Add reset supported in this module")
Signed-off-by: Jijie Shao
---
 drivers/net/ethernet/hisilicon/hibmcge/hbg_err.c | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/hisilicon/hibmcge/hbg_err.c b/drivers/net/ethernet/hisilicon/hibmcge/hbg_err.c
index 503cfbfb4a8a..94bc6f0da912 100644
--- a/drivers/net/ethernet/hisilicon/hibmcge/hbg_err.c
+++ b/drivers/net/ethernet/hisilicon/hibmcge/hbg_err.c
@@ -53,9 +53,11 @@ static int hbg_reset_prepare(struct hbg_priv *priv, enum hbg_reset_type type)
 {
 	int ret;
 
-	ASSERT_RTNL();
+	if (test_and_set_bit(HBG_NIC_STATE_RESETTING, &priv->state))
+		return -EBUSY;
 
 	if (netif_running(priv->netdev)) {
+		clear_bit(HBG_NIC_STATE_RESETTING, &priv->state);
 		dev_warn(&priv->pdev->dev,
 			 "failed to reset because port is up\n");
 		return -EBUSY;
@@ -64,7 +66,6 @@ static int hbg_reset_prepare(struct hbg_priv *priv, enum hbg_reset_type type)
 	netif_device_detach(priv->netdev);
 
 	priv->reset_type = type;
-	set_bit(HBG_NIC_STATE_RESETTING, &priv->state);
 	clear_bit(HBG_NIC_STATE_RESET_FAIL, &priv->state);
 	ret = hbg_hw_event_notify(priv, HBG_HW_EVENT_RESET);
 	if (ret) {
@@ -84,10 +85,8 @@ static int hbg_reset_done(struct hbg_priv *priv, enum hbg_reset_type type)
 	    type != priv->reset_type)
 		return 0;
 
-	ASSERT_RTNL();
-
-	clear_bit(HBG_NIC_STATE_RESETTING, &priv->state);
 	ret = hbg_rebuild(priv);
+	clear_bit(HBG_NIC_STATE_RESETTING, &priv->state);
 	if (ret) {
 		priv->stats.reset_fail_cnt++;
 		set_bit(HBG_NIC_STATE_RESET_FAIL, &priv->state);
@@ -101,12 +100,10 @@ static int hbg_reset_done(struct hbg_priv *priv, enum hbg_reset_type type)
 	return ret;
 }
 
-/* must be protected by rtnl lock */
 int hbg_reset(struct hbg_priv *priv)
 {
 	int ret;
 
-	ASSERT_RTNL();
 	ret = hbg_reset_prepare(priv, HBG_RESET_TYPE_FUNCTION);
 	if (ret)
 		return ret;
@@ -171,7 +168,6 @@ static void hbg_pci_err_reset_prepare(struct pci_dev *pdev)
 	struct net_device *netdev = pci_get_drvdata(pdev);
 	struct hbg_priv *priv = netdev_priv(netdev);
 
-	rtnl_lock();
 	hbg_reset_prepare(priv, HBG_RESET_TYPE_FLR);
 }
 
@@ -181,7 +177,6 @@ static void hbg_pci_err_reset_done(struct pci_dev *pdev)
 	struct hbg_priv *priv = netdev_priv(netdev);
 
 	hbg_reset_done(priv, HBG_RESET_TYPE_FLR);
-	rtnl_unlock();
 }
 
 static const struct pci_error_handlers hbg_pci_err_handler = {
-- 
2.33.0
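
The way this patch uses HBG_NIC_STATE_RESETTING as a "claim" on the reset
can be sketched in plain userspace C. This is only an illustration under
stated assumptions, not driver code: struct demo_priv, demo_reset_prepare(),
demo_reset_done() and demo_run() are hypothetical names, and C11
atomic_exchange() stands in for the kernel's test_and_set_bit().

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct hbg_priv: only what the sketch needs. */
struct demo_priv {
	atomic_bool resetting;	/* plays the role of HBG_NIC_STATE_RESETTING */
	bool port_up;		/* plays the role of netif_running() */
};

/* Claim the reset. Only one caller can succeed until demo_reset_done(). */
static int demo_reset_prepare(struct demo_priv *priv)
{
	/* atomic_exchange() returns the old value, like test_and_set_bit(). */
	if (atomic_exchange(&priv->resetting, true))
		return -EBUSY;

	if (priv->port_up) {
		/* Refusing the reset: release the claim before returning. */
		atomic_store(&priv->resetting, false);
		return -EBUSY;
	}

	return 0;
}

/* Release the claim only after the (omitted) rebuild work has finished. */
static void demo_reset_done(struct demo_priv *priv)
{
	atomic_store(&priv->resetting, false);
}

/* Walk through the cases the patch cares about; returns 0 on success. */
int demo_run(void)
{
	struct demo_priv priv = { .port_up = false };

	atomic_init(&priv.resetting, false);

	assert(demo_reset_prepare(&priv) == 0);		/* first reset wins */
	assert(demo_reset_prepare(&priv) == -EBUSY);	/* second is rejected */
	demo_reset_done(&priv);

	priv.port_up = true;
	assert(demo_reset_prepare(&priv) == -EBUSY);	/* port is up */

	priv.port_up = false;
	assert(demo_reset_prepare(&priv) == 0);		/* claim was released */
	demo_reset_done(&priv);

	return 0;
}
```

Unlike a lock held across reset_prepare() and reset_done(), a second caller
here gets -EBUSY immediately instead of blocking, and each device carries its
own flag, so preparing two devices back to back cannot deadlock.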