From nobody Thu Oct 9 09:02:00 2025 Received: from raptorengineering.com (mail.raptorengineering.com [23.155.224.40]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BA2FA2F0028; Wed, 18 Jun 2025 16:56:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=23.155.224.40 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750265798; cv=none; b=OSuxuI6qzErXvcxf5aw4jQszg4FJdUhQ3XVwfvCmV7k00aNn8q2h5t2R49CEAnHC48sjl4ES5N2hLoVXtxkaUt1ilGasKHOQvjSP3Ui3aGoXWw9XVuzjjLEnVR881+EmHVbqyyeEajDm2Fwj3eESuS+3u8rbq9UFqABuomMXRTU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750265798; c=relaxed/simple; bh=5v3Nyb+JZ0z7JMGCNWavJyrjZz4WWtoVrrD+w4tlVSo=; h=Date:From:To:Cc:Message-ID:In-Reply-To:References:Subject: MIME-Version:Content-Type; b=rZp4NB24snIRsyQIqjI+6gxDlFmJFw4a7gTz+BUjDoDfwwEDnrp0n/iFuuFJfhP1alwcTrdcNWYHyV+ghajz9b7+Kq+JDJjyrDKVvCTJ3WaHlAeDxh1csTWNM7O43QbtlLK7FWF80pEAWYXwzGDE9VrhPgJ3oy62H4UkulWyLuc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=raptorengineering.com; spf=pass smtp.mailfrom=raptorengineering.com; dkim=pass (1024-bit key) header.d=raptorengineering.com header.i=@raptorengineering.com header.b=cTB4k5UE; arc=none smtp.client-ip=23.155.224.40 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=raptorengineering.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=raptorengineering.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=raptorengineering.com header.i=@raptorengineering.com header.b="cTB4k5UE" Received: from localhost (localhost [127.0.0.1]) by mail.rptsys.com (Postfix) with ESMTP id 9EAAD8286FBD; Wed, 18 Jun 2025 11:56:34 -0500 (CDT) Received: from mail.rptsys.com ([127.0.0.1]) by localhost 
(vali.starlink.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id DoNQmYovNiVd; Wed, 18 Jun 2025 11:56:33 -0500 (CDT) Received: from localhost (localhost [127.0.0.1]) by mail.rptsys.com (Postfix) with ESMTP id 79BDE8287179; Wed, 18 Jun 2025 11:56:33 -0500 (CDT) DKIM-Filter: OpenDKIM Filter v2.10.3 mail.rptsys.com 79BDE8287179 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=raptorengineering.com; s=B8E824E6-0BE2-11E6-931D-288C65937AAD; t=1750265793; bh=ScUP9ReH/qFYDsUDSw3dDNroWmrhhRfY73Buoe8COh0=; h=Date:From:To:Message-ID:MIME-Version; b=cTB4k5UEGBXzmW7cFQHNwEV5iOXGlX2h8AEEdEZaxGjOoD0DJrNNtImilu2UEfgeh sXAEv/yhd2+wcPsTNDtBukyEgxwTNhqIRRD5BH9RjHRJ/ubKmPZNhjWqDVyCfc+6fy C65gb69PGeLWdhYaCk1Ap8vq9zOjmOsWfMdPK1m4= X-Virus-Scanned: amavisd-new at rptsys.com Received: from mail.rptsys.com ([127.0.0.1]) by localhost (vali.starlink.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id jlLpS7__9g6j; Wed, 18 Jun 2025 11:56:33 -0500 (CDT) Received: from vali.starlink.edu (localhost [127.0.0.1]) by mail.rptsys.com (Postfix) with ESMTP id 5428E8286FBD; Wed, 18 Jun 2025 11:56:33 -0500 (CDT) Date: Wed, 18 Jun 2025 11:56:33 -0500 (CDT) From: Timothy Pearson To: Timothy Pearson Cc: linuxppc-dev , linux-kernel , linux-pci , Madhavan Srinivasan , Michael Ellerman , christophe leroy , Naveen N Rao , Bjorn Helgaas , Shawn Anastasio Message-ID: <1472752760.1310634.1750265793260.JavaMail.zimbra@raptorengineeringinc.com> In-Reply-To: <581463409.1310624.1750265668004.JavaMail.zimbra@raptorengineeringinc.com> References: <581463409.1310624.1750265668004.JavaMail.zimbra@raptorengineeringinc.com> Subject: [PATCH v2 1/6] pci/hotplug/pnv_php: Properly clean up allocated IRQs on Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Mailer: Zimbra 8.5.0_GA_3042 (ZimbraWebClient - GC137 (Linux)/8.5.0_GA_3042) Thread-Topic: pci/hotplug/pnv_php: Properly clean up 
allocated IRQs on Thread-Index: 7ViWVrejj338yZQm64sXoMCfdWvE4HYiyUqA Content-Type: text/plain; charset="utf-8" unplug In cases where the root of a nested PCIe bridge configuration is unplugged, the pnv_php driver would leak the allocated IRQ resources for the child bridges' hotplug event notifications, resulting in a panic. Fix this by walking all child buses and deallocating all of their IRQ resources before calling pci_hp_remove_devices. Also modify the lifetime of the workqueue at struct pnv_php_slot::wq so that it is only destroyed in pnv_php_free_slot, instead of pnv_php_disable_irq. This is required since pnv_php_disable_irq will now be called by workers triggered by hot unplug interrupts, so the workqueue needs to stay allocated. The abridged kernel panic that occurs without this patch is as follows: WARNING: CPU: 0 PID: 687 at kernel/irq/msi.c:292 msi_device_data_release+0x6c/0x9c CPU: 0 UID: 0 PID: 687 Comm: bash Not tainted 6.14.0-rc5+ #2 Call Trace: msi_device_data_release+0x34/0x9c (unreliable) release_nodes+0x64/0x13c devres_release_all+0xc0/0x140 device_del+0x2d4/0x46c pci_destroy_dev+0x5c/0x194 pci_hp_remove_devices+0x90/0x128 pci_hp_remove_devices+0x44/0x128 pnv_php_disable_slot+0x54/0xd4 power_write_file+0xf8/0x18c pci_slot_attr_store+0x40/0x5c sysfs_kf_write+0x64/0x78 kernfs_fop_write_iter+0x1b0/0x290 vfs_write+0x3bc/0x50c ksys_write+0x84/0x140 system_call_exception+0x124/0x230 system_call_vectored_common+0x15c/0x2ec Signed-off-by: Shawn Anastasio Signed-off-by: Timothy Pearson --- drivers/pci/hotplug/pnv_php.c | 94 ++++++++++++++++++++++++++++------- 1 file changed, 75 insertions(+), 19 deletions(-) diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c index 573a41869c15..aec0a6d594ac 100644 --- a/drivers/pci/hotplug/pnv_php.c +++ b/drivers/pci/hotplug/pnv_php.c @@ -3,6 +3,7 @@ * PCI Hotplug Driver for PowerPC PowerNV platform. * * Copyright Gavin Shan, IBM Corporation 2016.
+ * Copyright (C) 2025 Raptor Engineering, LLC */ =20 #include @@ -36,8 +37,10 @@ static void pnv_php_register(struct device_node *dn); static void pnv_php_unregister_one(struct device_node *dn); static void pnv_php_unregister(struct device_node *dn); =20 +static void pnv_php_enable_irq(struct pnv_php_slot *php_slot); + static void pnv_php_disable_irq(struct pnv_php_slot *php_slot, - bool disable_device) + bool disable_device, bool disable_msi) { struct pci_dev *pdev =3D php_slot->pdev; u16 ctrl; @@ -53,19 +56,15 @@ static void pnv_php_disable_irq(struct pnv_php_slot *ph= p_slot, php_slot->irq =3D 0; } =20 - if (php_slot->wq) { - destroy_workqueue(php_slot->wq); - php_slot->wq =3D NULL; - } - - if (disable_device) { + if (disable_device || disable_msi) { if (pdev->msix_enabled) pci_disable_msix(pdev); else if (pdev->msi_enabled) pci_disable_msi(pdev); + } =20 + if (disable_device) pci_disable_device(pdev); - } } =20 static void pnv_php_free_slot(struct kref *kref) @@ -74,7 +73,8 @@ static void pnv_php_free_slot(struct kref *kref) struct pnv_php_slot, kref); =20 WARN_ON(!list_empty(&php_slot->children)); - pnv_php_disable_irq(php_slot, false); + pnv_php_disable_irq(php_slot, false, false); + destroy_workqueue(php_slot->wq); kfree(php_slot->name); kfree(php_slot); } @@ -561,8 +561,57 @@ static int pnv_php_reset_slot(struct hotplug_slot *slo= t, bool probe) static int pnv_php_enable_slot(struct hotplug_slot *slot) { struct pnv_php_slot *php_slot =3D to_pnv_php_slot(slot); + u32 prop32; + int ret; + + ret =3D pnv_php_enable(php_slot, true); + if (ret) + return ret; + + /* (Re-)enable interrupt if the slot supports surprise hotplug */ + ret =3D of_property_read_u32(php_slot->dn, "ibm,slot-surprise-pluggable",= &prop32); + if (!ret && prop32) + pnv_php_enable_irq(php_slot); + + return 0; +} + +/** + * Disable any hotplug interrupts for all slots on the provided bus, as we= ll as + * all downstream slots in preparation for a hot unplug. 
+ */ +static int pnv_php_disable_all_irqs(struct pci_bus *bus) +{ + struct pci_bus *child_bus; + struct pci_slot *cur_slot; + + /* First go down child busses */ + list_for_each_entry(child_bus, &bus->children, node) + pnv_php_disable_all_irqs(child_bus); + + /* Disable IRQs for all pnv_php slots on this bus */ + list_for_each_entry(cur_slot, &bus->slots, list) { + struct pnv_php_slot *php_slot =3D to_pnv_php_slot(cur_slot->hotplug); + + pnv_php_disable_irq(php_slot, false, true); + } =20 - return pnv_php_enable(php_slot, true); + return 0; +} + +/** + * Disable any hotplug interrupts for all downstream slots on the provided= bus in + * preparation for a hot unplug. + */ +static int pnv_php_disable_all_downstream_irqs(struct pci_bus *bus) +{ + struct pci_bus *child_bus; + + /* Go down child busses, recursively deactivating their IRQs */ + list_for_each_entry(child_bus, &bus->children, node) + pnv_php_disable_all_irqs(child_bus); + + return 0; } =20 static int pnv_php_disable_slot(struct hotplug_slot *slot) @@ -579,6 +628,12 @@ static int pnv_php_disable_slot(struct hotplug_slot *s= lot) php_slot->state !=3D PNV_PHP_STATE_REGISTERED) return 0; =20 + /* Free all IRQ resources from all child slots before remove. + * Note that we do not disable the root slot IRQ here as that + * would also deactivate the slot hot (re)plug interrupt! 
+ */ + pnv_php_disable_all_downstream_irqs(php_slot->bus); + /* Remove all devices behind the slot */ pci_lock_rescan_remove(); pci_hp_remove_devices(php_slot->bus); @@ -647,6 +702,15 @@ static struct pnv_php_slot *pnv_php_alloc_slot(struct = device_node *dn) return NULL; } =20 + /* Allocate workqueue for this slot's interrupt handling */ + php_slot->wq =3D alloc_workqueue("pciehp-%s", 0, 0, php_slot->name); + if (!php_slot->wq) { + SLOT_WARN(php_slot, "Cannot alloc workqueue\n"); + kfree(php_slot->name); + kfree(php_slot); + return NULL; + } + if (dn->child && PCI_DN(dn->child)) php_slot->slot_no =3D PCI_SLOT(PCI_DN(dn->child)->devfn); else @@ -843,14 +907,6 @@ static void pnv_php_init_irq(struct pnv_php_slot *php_= slot, int irq) u16 sts, ctrl; int ret; =20 - /* Allocate workqueue */ - php_slot->wq =3D alloc_workqueue("pciehp-%s", 0, 0, php_slot->name); - if (!php_slot->wq) { - SLOT_WARN(php_slot, "Cannot alloc workqueue\n"); - pnv_php_disable_irq(php_slot, true); - return; - } - /* Check PDC (Presence Detection Change) is broken or not */ ret =3D of_property_read_u32(php_slot->dn, "ibm,slot-broken-pdc", &broken_pdc); @@ -869,7 +925,7 @@ static void pnv_php_init_irq(struct pnv_php_slot *php_s= lot, int irq) ret =3D request_irq(irq, pnv_php_interrupt, IRQF_SHARED, php_slot->name, php_slot); if (ret) { - pnv_php_disable_irq(php_slot, true); + pnv_php_disable_irq(php_slot, true, true); SLOT_WARN(php_slot, "Error %d enabling IRQ %d\n", ret, irq); return; } --=20 2.39.5 From nobody Thu Oct 9 09:02:00 2025 Received: from raptorengineering.com (mail.raptorengineering.com [23.155.224.40]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4207E2FEE16; Wed, 18 Jun 2025 16:56:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=23.155.224.40 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750265819; cv=none; 
b=DJjYWDvzLSTXHmJjSgb1RVHaWClY63MyRx5cN6w1FVDf4Zk8VK6m47UfuUZx9Hn6pAQFJ+tALBh5xX9fEQrI0yOCUvwCh2HQY4SNNOcNYQvrOU+6U7Gg6CdmVjn5ZgrL5IFT/Dy7Ix4P70K1zHcwQzK+Cjy5kdfo6RQ/tuaFxws= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750265819; c=relaxed/simple; bh=GtU4YHodsG0/CIhLM5v4G6n95rh7zrUYrO5GWs4Y384=; h=Date:From:To:Cc:Message-ID:In-Reply-To:References:Subject: MIME-Version:Content-Type; b=KzBqJyeZsEcmsJKJDdwlHJfO0JQKXq8Ijh+SiO7TOLS1qy7ZELpVQ02xPUq+4+Xk1LtW9HRbl6/bOZb5pL8XxyiyMmgh+dYHBvGxzoHvSKBV41kRFj683b5atlb/n0vBHznTf+3dVacnv0x/3U1L0yjfUDMb12TKX5oSoq/daM4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=raptorengineering.com; spf=pass smtp.mailfrom=raptorengineering.com; dkim=pass (1024-bit key) header.d=raptorengineering.com header.i=@raptorengineering.com header.b=FPLMlgln; arc=none smtp.client-ip=23.155.224.40 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=raptorengineering.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=raptorengineering.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=raptorengineering.com header.i=@raptorengineering.com header.b="FPLMlgln" Received: from localhost (localhost [127.0.0.1]) by mail.rptsys.com (Postfix) with ESMTP id 499608286FBD; Wed, 18 Jun 2025 11:56:55 -0500 (CDT) Received: from mail.rptsys.com ([127.0.0.1]) by localhost (vali.starlink.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id OzwSo1i5lY-5; Wed, 18 Jun 2025 11:56:54 -0500 (CDT) Received: from localhost (localhost [127.0.0.1]) by mail.rptsys.com (Postfix) with ESMTP id A66358287179; Wed, 18 Jun 2025 11:56:54 -0500 (CDT) DKIM-Filter: OpenDKIM Filter v2.10.3 mail.rptsys.com A66358287179 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=raptorengineering.com; s=B8E824E6-0BE2-11E6-931D-288C65937AAD; t=1750265814; 
bh=sREPJsWJTR+II/tfen1J5UvqEzkTitKMBh+Xdyx4rAM=; h=Date:From:To:Message-ID:MIME-Version; b=FPLMlgln/Ncr+9I083bBPxQi6g4rRCiOdyqww8zDpkhfPuyneVgmqD2w9/pwRpEQm M3fFKnX2Os14BnzxrmV0UU58L8pJAKY3s7O9nO2yXk2lUbO99ufwZtrpcn/SZXjDOl 8o/6Ia84kDydEcBhqIdC2LGd+YvkTFgqtdT3OVTA= X-Virus-Scanned: amavisd-new at rptsys.com Received: from mail.rptsys.com ([127.0.0.1]) by localhost (vali.starlink.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id LlwZ-2WCS83M; Wed, 18 Jun 2025 11:56:54 -0500 (CDT) Received: from vali.starlink.edu (localhost [127.0.0.1]) by mail.rptsys.com (Postfix) with ESMTP id 7329E8286FBD; Wed, 18 Jun 2025 11:56:54 -0500 (CDT) Date: Wed, 18 Jun 2025 11:56:54 -0500 (CDT) From: Timothy Pearson To: Timothy Pearson Cc: linuxppc-dev , linux-kernel , linux-pci , Madhavan Srinivasan , Michael Ellerman , christophe leroy , Naveen N Rao , Bjorn Helgaas , Shawn Anastasio Message-ID: <1741778252.1310636.1750265814430.JavaMail.zimbra@raptorengineeringinc.com> In-Reply-To: <581463409.1310624.1750265668004.JavaMail.zimbra@raptorengineeringinc.com> References: <581463409.1310624.1750265668004.JavaMail.zimbra@raptorengineeringinc.com> Subject: [PATCH v2 2/6] pci/hotplug/pnv_php: Work around switches with broken Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Mailer: Zimbra 8.5.0_GA_3042 (ZimbraWebClient - GC137 (Linux)/8.5.0_GA_3042) Thread-Topic: pci/hotplug/pnv_php: Work around switches with broken Thread-Index: 7ViWVrejj338yZQm64sXoMCfdWvE4IAxp3+8 Content-Type: text/plain; charset="utf-8" presence detection The Microsemi Switchtec PM8533 PFX 48xG3 [11f8:8533] PCIe switch system was observed to incorrectly assert the Presence Detect Set bit in its capabilities when tested on a Raptor Computing Systems Blackbird system, resulting in the hot insert path never attempting a rescan of the bus and any downstream devices not being re-detected. 
Work around this by additionally checking whether the PCIe data link is active or not when performing presence detection on downstream switches' ports, similar to the pciehp_hpc.c driver. Signed-off-by: Shawn Anastasio Signed-off-by: Timothy Pearson --- drivers/pci/hotplug/pnv_php.c | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+) diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c index aec0a6d594ac..bac8af3df41a 100644 --- a/drivers/pci/hotplug/pnv_php.c +++ b/drivers/pci/hotplug/pnv_php.c @@ -391,6 +391,20 @@ static int pnv_php_get_power_state(struct hotplug_slot= *slot, u8 *state) return 0; } =20 +static int pcie_check_link_active(struct pci_dev *pdev) +{ + u16 lnk_status; + int ret; + + ret =3D pcie_capability_read_word(pdev, PCI_EXP_LNKSTA, &lnk_status); + if (ret =3D=3D PCIBIOS_DEVICE_NOT_FOUND || PCI_POSSIBLE_ERROR(lnk_status)) + return -ENODEV; + + ret =3D !!(lnk_status & PCI_EXP_LNKSTA_DLLLA); + + return ret; +} + static int pnv_php_get_adapter_state(struct hotplug_slot *slot, u8 *state) { struct pnv_php_slot *php_slot =3D to_pnv_php_slot(slot); @@ -403,6 +417,19 @@ static int pnv_php_get_adapter_state(struct hotplug_sl= ot *slot, u8 *state) */ ret =3D pnv_pci_get_presence_state(php_slot->id, &presence); if (ret >=3D 0) { + if (pci_pcie_type(php_slot->pdev) =3D=3D PCI_EXP_TYPE_DOWNSTREAM && + presence =3D=3D OPAL_PCI_SLOT_EMPTY) { + /* + * Similar to pciehp_hpc, check whether the Link Active + * bit is set to account for broken downstream bridges + * that don't properly assert Presence Detect State, as + * was observed on the Microsemi Switchtec PM8533 PFX + * [11f8:8533]. 
+ */ + if (pcie_check_link_active(php_slot->pdev) > 0) + presence =3D OPAL_PCI_SLOT_PRESENT; + } + *state =3D presence; ret =3D 0; } else { --=20 2.39.5 From nobody Thu Oct 9 09:02:00 2025 Received: from raptorengineering.com (mail.raptorengineering.com [23.155.224.40]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 88D3F2FEE36; Wed, 18 Jun 2025 16:57:18 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=23.155.224.40 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750265841; cv=none; b=LJDXUEd0r1Ww83Ep3tQFuUQ+hbpVaZgCZ8D+PQkvGedtNBR7pVN8/XPPMqMLXTS/hAMixzNMD8gAZj0dHyMMHHLyGhhQa+roZj8CBA/2Rr1jM4tZSQo1uqGo3v7888J01wTPKmUiSrBZm9DBz1befVbPmcoVxf1Acd0ZW2FzjTU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750265841; c=relaxed/simple; bh=EBV0EZxvuUqV5iGLVkQC/Gh2UeJ/c8RB1TdSZu699iE=; h=Date:From:To:Cc:Message-ID:In-Reply-To:References:Subject: MIME-Version:Content-Type; b=JhL7D3SSbLVsFKE3rASt9oVHVZN5Pr8u/7UhEPWyO6ALthh0/v//UV5g4tURu6YAUCfT1mCiJwdSUsfffpPtx3UlFAhj9t8lr4XOm5V0cHgBhgzC4tPsyWAMOdYnPiFyscPUhtLBdtRV9+4ZvpMQG7X8f4N7tU4a9M08KvZY8OE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=raptorengineering.com; spf=pass smtp.mailfrom=raptorengineering.com; dkim=pass (1024-bit key) header.d=raptorengineering.com header.i=@raptorengineering.com header.b=LFEN0DId; arc=none smtp.client-ip=23.155.224.40 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=raptorengineering.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=raptorengineering.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=raptorengineering.com header.i=@raptorengineering.com header.b="LFEN0DId" Received: from localhost (localhost [127.0.0.1]) 
by mail.rptsys.com (Postfix) with ESMTP id C20468286FBD; Wed, 18 Jun 2025 11:57:17 -0500 (CDT) Received: from mail.rptsys.com ([127.0.0.1]) by localhost (vali.starlink.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id kRel1SYbqiIX; Wed, 18 Jun 2025 11:57:17 -0500 (CDT) Received: from localhost (localhost [127.0.0.1]) by mail.rptsys.com (Postfix) with ESMTP id 319628287179; Wed, 18 Jun 2025 11:57:17 -0500 (CDT) DKIM-Filter: OpenDKIM Filter v2.10.3 mail.rptsys.com 319628287179 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=raptorengineering.com; s=B8E824E6-0BE2-11E6-931D-288C65937AAD; t=1750265837; bh=USAGms6qzWe7rESrb9AGGpmLmspKTjgFGVbO0Q+1BzA=; h=Date:From:To:Message-ID:MIME-Version; b=LFEN0DId7062kxNccMz/LnwhOwiWj4dMzKqnwFilyzqXGLpDuVEf3XpD8iZU4T0Nu n4Q2hgjUDoFxzS35f4zaig8s01sjPr2hz56/z6acnphLNHLFa6qNIQQjPU91iSiXkx Owd0NK0N5atfBk+HRHI+GZPxt1eeJZTgBsSC3FeI= X-Virus-Scanned: amavisd-new at rptsys.com Received: from mail.rptsys.com ([127.0.0.1]) by localhost (vali.starlink.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id JsKIlELgcYoO; Wed, 18 Jun 2025 11:57:17 -0500 (CDT) Received: from vali.starlink.edu (localhost [127.0.0.1]) by mail.rptsys.com (Postfix) with ESMTP id EDA928286FBD; Wed, 18 Jun 2025 11:57:16 -0500 (CDT) Date: Wed, 18 Jun 2025 11:57:16 -0500 (CDT) From: Timothy Pearson To: Timothy Pearson Cc: linuxppc-dev , linux-kernel , linux-pci , Madhavan Srinivasan , Michael Ellerman , christophe leroy , Naveen N Rao , Bjorn Helgaas , Shawn Anastasio Message-ID: <374686472.1310637.1750265836363.JavaMail.zimbra@raptorengineeringinc.com> In-Reply-To: <581463409.1310624.1750265668004.JavaMail.zimbra@raptorengineeringinc.com> References: <581463409.1310624.1750265668004.JavaMail.zimbra@raptorengineeringinc.com> Subject: [PATCH v2 3/6] powerpc/eeh: Export eeh_unfreeze_pe() Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable 
X-Mailer: Zimbra 8.5.0_GA_3042 (ZimbraWebClient - GC137 (Linux)/8.5.0_GA_3042) Thread-Topic: powerpc/eeh: Export eeh_unfreeze_pe() Thread-Index: 7ViWVrejj338yZQm64sXoMCfdWvE4GWR/U+K Content-Type: text/plain; charset="utf-8" The PowerNV hotplug driver needs to be able to clear any frozen PE(s) on the PHB after surprise removal of a downstream device. Export the eeh_unfreeze_pe() symbol to allow implementation of this functionality in the pnv_php module. Signed-off-by: Timothy Pearson --- arch/powerpc/kernel/eeh.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c index 83fe99861eb1..3a2a926fbd64 100644 --- a/arch/powerpc/kernel/eeh.c +++ b/arch/powerpc/kernel/eeh.c @@ -1139,6 +1139,7 @@ int eeh_unfreeze_pe(struct eeh_pe *pe) =20 return ret; } +EXPORT_SYMBOL_GPL(eeh_unfreeze_pe); =20 =20 static struct pci_device_id eeh_reset_ids[] =3D { --=20 2.39.5 From nobody Thu Oct 9 09:02:00 2025 Received: from raptorengineering.com (mail.raptorengineering.com [23.155.224.40]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 78B442FEE23; Wed, 18 Jun 2025 16:57:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=23.155.224.40 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750265862; cv=none; b=usSbwdS/qynp0vXP5emmiZu6o99WPOCsZQi6NRuwgtjXl8rPFonfGtReohrswMqvcBAAf0sLs9qkfJNxTSMSvRmEbf6qp55Cz+MgoF7V4wog+zy4Mv3lNb2TY5rz+JviRlOG2FtSqpT5//liFTj6l02rGWiPkO07o35pDaRm/i8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750265862; c=relaxed/simple; bh=U5Lk+VkVJJP5V8zUZPjwKlzGXzRmt+MR/UQ4a+BmKuk=; h=Date:From:To:Cc:Message-ID:In-Reply-To:References:Subject: MIME-Version:Content-Type;
b=qti2QVtnIEBfokcREZtw8GBNrqi/1uAh3tpLvGSWZCzFM1eRXaGlev0LVJu66kbb1Tr84lnYTbQdbEER9gLZFx0sjlWOTbr2OxZQgqCqZkTR7Qug20zUJr57M0XnJjzpZYzUMfz9vzA+BPSdEQ7iqBG+rjRq68EDGoilGu947Jc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=raptorengineering.com; spf=pass smtp.mailfrom=raptorengineering.com; dkim=pass (1024-bit key) header.d=raptorengineering.com header.i=@raptorengineering.com header.b=kjcV40oY; arc=none smtp.client-ip=23.155.224.40 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=raptorengineering.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=raptorengineering.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=raptorengineering.com header.i=@raptorengineering.com header.b="kjcV40oY" Received: from localhost (localhost [127.0.0.1]) by mail.rptsys.com (Postfix) with ESMTP id 7269C82879FF; Wed, 18 Jun 2025 11:57:39 -0500 (CDT) Received: from mail.rptsys.com ([127.0.0.1]) by localhost (vali.starlink.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id dQ1N6Vkh6LkL; Wed, 18 Jun 2025 11:57:38 -0500 (CDT) Received: from localhost (localhost [127.0.0.1]) by mail.rptsys.com (Postfix) with ESMTP id 303CC8287715; Wed, 18 Jun 2025 11:57:38 -0500 (CDT) DKIM-Filter: OpenDKIM Filter v2.10.3 mail.rptsys.com 303CC8287715 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=raptorengineering.com; s=B8E824E6-0BE2-11E6-931D-288C65937AAD; t=1750265858; bh=AzvxEw34wp1D5TRYWoqBxaNBDOvxPv48PtS4dQyEJ24=; h=Date:From:To:Message-ID:MIME-Version; b=kjcV40oYDXBtXvEpK0y+QaUZ+fr+gwsWxOGio6lZcjzPCxI5kr3fh6J9qftIeoedw I6m8w42BFgt7ntW3yrSVCzvat2uV/iw9dGUPks1eMjZggF4mcv4zZNC0ENYerjudFr z4jBpRi05Pmlu5x0bsOlCOjFW6+yqFOrKoPnmH40= X-Virus-Scanned: amavisd-new at rptsys.com Received: from mail.rptsys.com ([127.0.0.1]) by localhost (vali.starlink.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id DpQ4waGREUEg; Wed, 18 
Jun 2025 11:57:38 -0500 (CDT) Received: from vali.starlink.edu (localhost [127.0.0.1]) by mail.rptsys.com (Postfix) with ESMTP id EE5608286FBD; Wed, 18 Jun 2025 11:57:37 -0500 (CDT) Date: Wed, 18 Jun 2025 11:57:37 -0500 (CDT) From: Timothy Pearson To: Timothy Pearson Cc: linuxppc-dev , linux-kernel , linux-pci , Madhavan Srinivasan , Michael Ellerman , christophe leroy , Naveen N Rao , Bjorn Helgaas , Shawn Anastasio Message-ID: <1904950895.1310639.1750265857901.JavaMail.zimbra@raptorengineeringinc.com> In-Reply-To: <581463409.1310624.1750265668004.JavaMail.zimbra@raptorengineeringinc.com> References: <581463409.1310624.1750265668004.JavaMail.zimbra@raptorengineeringinc.com> Subject: [PATCH v2 4/6] powerpc/eeh: Make EEH driver device hotplug safe Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Mailer: Zimbra 8.5.0_GA_3042 (ZimbraWebClient - GC137 (Linux)/8.5.0_GA_3042) Thread-Topic: powerpc/eeh: Make EEH driver device hotplug safe Thread-Index: 7ViWVrejj338yZQm64sXoMCfdWvE4IaEh4Bm Content-Type: text/plain; charset="utf-8" Multiple race conditions existed between the PCIe hotplug driver and the EEH driver, leading to a variety of kernel oopses of the same general nature: A second class of oops is also seen when the underlying bus disappears during device recovery. Refactor the EEH module to be PCI rescan and remove safe. Also clean up a few minor formatting / readability issues.
Signed-off-by: Timothy Pearson --- arch/powerpc/kernel/eeh_driver.c | 48 +++++++++++++++++++++----------- arch/powerpc/kernel/eeh_pe.c | 10 ++++--- 2 files changed, 38 insertions(+), 20 deletions(-) diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_dri= ver.c index 7efe04c68f0f..dd50de91c438 100644 --- a/arch/powerpc/kernel/eeh_driver.c +++ b/arch/powerpc/kernel/eeh_driver.c @@ -257,13 +257,12 @@ static void eeh_pe_report_edev(struct eeh_dev *edev, = eeh_report_fn fn, struct pci_driver *driver; enum pci_ers_result new_result; =20 - pci_lock_rescan_remove(); pdev =3D edev->pdev; if (pdev) get_device(&pdev->dev); - pci_unlock_rescan_remove(); if (!pdev) { eeh_edev_info(edev, "no device"); + *result =3D PCI_ERS_RESULT_DISCONNECT; return; } device_lock(&pdev->dev); @@ -304,8 +303,9 @@ static void eeh_pe_report(const char *name, struct eeh_= pe *root, struct eeh_dev *edev, *tmp; =20 pr_info("EEH: Beginning: '%s'\n", name); - eeh_for_each_pe(root, pe) eeh_pe_for_each_dev(pe, edev, tmp) - eeh_pe_report_edev(edev, fn, result); + eeh_for_each_pe(root, pe) + eeh_pe_for_each_dev(pe, edev, tmp) + eeh_pe_report_edev(edev, fn, result); if (result) pr_info("EEH: Finished:'%s' with aggregate recovery state:'%s'\n", name, pci_ers_result_name(*result)); @@ -383,6 +383,8 @@ static void eeh_dev_restore_state(struct eeh_dev *edev,= void *userdata) if (!edev) return; =20 + pci_lock_rescan_remove(); + /* * The content in the config space isn't saved because * the blocked config space on some adapters. 
We have @@ -393,14 +395,19 @@ static void eeh_dev_restore_state(struct eeh_dev *ede= v, void *userdata) if (list_is_last(&edev->entry, &edev->pe->edevs)) eeh_pe_restore_bars(edev->pe); =20 + pci_unlock_rescan_remove(); return; } =20 pdev =3D eeh_dev_to_pci_dev(edev); - if (!pdev) + if (!pdev) { + pci_unlock_rescan_remove(); return; + } =20 pci_restore_state(pdev); + + pci_unlock_rescan_remove(); } =20 /** @@ -647,9 +654,7 @@ static int eeh_reset_device(struct eeh_pe *pe, struct p= ci_bus *bus, if (any_passed || driver_eeh_aware || (pe->type & EEH_PE_VF)) { eeh_pe_dev_traverse(pe, eeh_rmv_device, rmv_data); } else { - pci_lock_rescan_remove(); pci_hp_remove_devices(bus); - pci_unlock_rescan_remove(); } =20 /* @@ -665,8 +670,6 @@ static int eeh_reset_device(struct eeh_pe *pe, struct p= ci_bus *bus, if (rc) return rc; =20 - pci_lock_rescan_remove(); - /* Restore PE */ eeh_ops->configure_bridge(pe); eeh_pe_restore_bars(pe); @@ -674,7 +677,6 @@ static int eeh_reset_device(struct eeh_pe *pe, struct p= ci_bus *bus, /* Clear frozen state */ rc =3D eeh_clear_pe_frozen_state(pe, false); if (rc) { - pci_unlock_rescan_remove(); return rc; } =20 @@ -709,7 +711,6 @@ static int eeh_reset_device(struct eeh_pe *pe, struct p= ci_bus *bus, pe->tstamp =3D tstamp; pe->freeze_count =3D cnt; =20 - pci_unlock_rescan_remove(); return 0; } =20 @@ -843,10 +844,13 @@ void eeh_handle_normal_event(struct eeh_pe *pe) {LIST_HEAD_INIT(rmv_data.removed_vf_list), 0}; int devices =3D 0; =20 + pci_lock_rescan_remove(); + bus =3D eeh_pe_bus_get(pe); if (!bus) { pr_err("%s: Cannot find PCI bus for PHB#%x-PE#%x\n", __func__, pe->phb->global_number, pe->addr); + pci_unlock_rescan_remove(); return; } =20 @@ -1094,10 +1098,15 @@ void eeh_handle_normal_event(struct eeh_pe *pe) eeh_pe_state_clear(pe, EEH_PE_PRI_BUS, true); eeh_pe_dev_mode_mark(pe, EEH_DEV_REMOVED); =20 - pci_lock_rescan_remove(); - pci_hp_remove_devices(bus); - pci_unlock_rescan_remove(); + bus =3D eeh_pe_bus_get(pe); + if (bus) + 
pci_hp_remove_devices(bus); + else + pr_err("%s: PCI bus for PHB#%x-PE#%x disappeared\n", + __func__, pe->phb->global_number, pe->addr); + /* The passed PE should no longer be used */ + pci_unlock_rescan_remove(); return; } =20 @@ -1114,6 +1123,8 @@ void eeh_handle_normal_event(struct eeh_pe *pe) eeh_clear_slot_attention(edev->pdev); =20 eeh_pe_state_clear(pe, EEH_PE_RECOVERING, true); + + pci_unlock_rescan_remove(); } =20 /** @@ -1132,6 +1143,7 @@ void eeh_handle_special_event(void) unsigned long flags; int rc; =20 + pci_lock_rescan_remove(); =20 do { rc =3D eeh_ops->next_error(&pe); @@ -1171,10 +1183,12 @@ void eeh_handle_special_event(void) =20 break; case EEH_NEXT_ERR_NONE: + pci_unlock_rescan_remove(); return; default: pr_warn("%s: Invalid value %d from next_error()\n", __func__, rc); + pci_unlock_rescan_remove(); return; } =20 @@ -1186,7 +1200,9 @@ void eeh_handle_special_event(void) if (rc =3D=3D EEH_NEXT_ERR_FROZEN_PE || rc =3D=3D EEH_NEXT_ERR_FENCED_PHB) { eeh_pe_state_mark(pe, EEH_PE_RECOVERING); + pci_unlock_rescan_remove(); eeh_handle_normal_event(pe); + pci_lock_rescan_remove(); } else { eeh_for_each_pe(pe, tmp_pe) eeh_pe_for_each_dev(tmp_pe, edev, tmp_edev) @@ -1199,7 +1215,6 @@ void eeh_handle_special_event(void) eeh_report_failure, NULL); eeh_set_channel_state(pe, pci_channel_io_perm_failure); =20 - pci_lock_rescan_remove(); list_for_each_entry(hose, &hose_list, list_node) { phb_pe =3D eeh_phb_pe_get(hose); if (!phb_pe || @@ -1218,7 +1233,6 @@ void eeh_handle_special_event(void) } pci_hp_remove_devices(bus); } - pci_unlock_rescan_remove(); } =20 /* @@ -1228,4 +1242,6 @@ void eeh_handle_special_event(void) if (rc =3D=3D EEH_NEXT_ERR_DEAD_IOC) break; } while (rc !=3D EEH_NEXT_ERR_NONE); + + pci_unlock_rescan_remove(); } diff --git a/arch/powerpc/kernel/eeh_pe.c b/arch/powerpc/kernel/eeh_pe.c index d283d281d28e..e740101fadf3 100644 --- a/arch/powerpc/kernel/eeh_pe.c +++ b/arch/powerpc/kernel/eeh_pe.c @@ -671,10 +671,12 @@ static void 
eeh_bridge_check_link(struct eeh_dev *ede= v) eeh_ops->write_config(edev, cap + PCI_EXP_LNKCTL, 2, val); =20 /* Check link */ - if (!edev->pdev->link_active_reporting) { - eeh_edev_dbg(edev, "No link reporting capability\n"); - msleep(1000); - return; + if (edev->pdev) { + if (!edev->pdev->link_active_reporting) { + eeh_edev_dbg(edev, "No link reporting capability\n"); + msleep(1000); + return; + } } =20 /* Wait the link is up until timeout (5s) */ --=20 2.39.5 From nobody Thu Oct 9 09:02:00 2025 Received: from raptorengineering.com (mail.raptorengineering.com [23.155.224.40]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 92D192FEE06; Wed, 18 Jun 2025 16:58:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=23.155.224.40 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750265907; cv=none; b=O26iaKTsrj0h11A+4QFKRUdZlP+cqxHW/QRIpJaNQYWplrz0uFLQ2Ce+6bFSY5Cs4AscbL/U61zhxEVoEw5khiKM0USMTufVPsr+pb88gMQtOTsxj+qJOYdY8krqEYqei3/YGaWN/ttblkF6lYKK9zeX/b3sy5LjLcQLzBoSYRM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750265907; c=relaxed/simple; bh=K4xO/NvLZLmHFsh6LT2gQUQ/7RWKduLSHNDL1emCB0I=; h=Date:From:To:Cc:Message-ID:In-Reply-To:References:Subject: MIME-Version:Content-Type; b=YiUDfcm9s2vSUCQ3AbAulz6RoVgkW2dXIFxbEtxuKJveUn5YCBmDDL6uR2JI0d7gvfNjXDr6hn6XYqiBjUXVWQE0F8iYtUowJOhmcHn/4J8De/70ijJgzSsPM0dNxb90CJMqJRedph4WTpVoddh6W5pY8t/76Hc0n5nUtI6jb9Q= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=raptorengineering.com; spf=pass smtp.mailfrom=raptorengineering.com; dkim=pass (1024-bit key) header.d=raptorengineering.com header.i=@raptorengineering.com header.b=s/sJ8shh; arc=none smtp.client-ip=23.155.224.40 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) 
header.from=raptorengineering.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=raptorengineering.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=raptorengineering.com header.i=@raptorengineering.com header.b="s/sJ8shh" Received: from localhost (localhost [127.0.0.1]) by mail.rptsys.com (Postfix) with ESMTP id C8AEC8287715; Wed, 18 Jun 2025 11:58:24 -0500 (CDT) Received: from mail.rptsys.com ([127.0.0.1]) by localhost (vali.starlink.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id uziop6L64xOS; Wed, 18 Jun 2025 11:58:23 -0500 (CDT) Received: from localhost (localhost [127.0.0.1]) by mail.rptsys.com (Postfix) with ESMTP id 9B45B82879FF; Wed, 18 Jun 2025 11:58:23 -0500 (CDT) DKIM-Filter: OpenDKIM Filter v2.10.3 mail.rptsys.com 9B45B82879FF DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=raptorengineering.com; s=B8E824E6-0BE2-11E6-931D-288C65937AAD; t=1750265903; bh=Pcx4Y/gSefDYdeUAyqXJQfazbp0AxfDxq+BYdkzSYrU=; h=Date:From:To:Message-ID:MIME-Version; b=s/sJ8shhbgmPauWxMst2O7Uqt7RZgWFL216oq1FVlmrHdRbspHHw9wQnZR0oBUjPe XwabUIPp/CKySuMeXD0meGlnXOI9BozdFqAoyuVq/mJxxcX6G6LSVoMAmEUcaKLo9/ I/ua3xdbAn2q3vscvWAjrgYxGyGYXzg92alTPp0c= X-Virus-Scanned: amavisd-new at rptsys.com Received: from mail.rptsys.com ([127.0.0.1]) by localhost (vali.starlink.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id rOwciQdemcmg; Wed, 18 Jun 2025 11:58:23 -0500 (CDT) Received: from vali.starlink.edu (localhost [127.0.0.1]) by mail.rptsys.com (Postfix) with ESMTP id 619338287715; Wed, 18 Jun 2025 11:58:23 -0500 (CDT) Date: Wed, 18 Jun 2025 11:58:23 -0500 (CDT) From: Timothy Pearson To: Timothy Pearson Cc: linuxppc-dev , linux-kernel , linux-pci , Madhavan Srinivasan , Michael Ellerman , christophe leroy , Naveen N Rao , Bjorn Helgaas , Shawn Anastasio Message-ID: <317515920.1310655.1750265903281.JavaMail.zimbra@raptorengineeringinc.com> In-Reply-To: 
<581463409.1310624.1750265668004.JavaMail.zimbra@raptorengineeringinc.com>
References: <581463409.1310624.1750265668004.JavaMail.zimbra@raptorengineeringinc.com>
Subject: [PATCH v2 5/6] pci/hotplug/pnv_php: Fix surprise plug detection and recovery
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Mailer: Zimbra 8.5.0_GA_3042 (ZimbraWebClient - GC137 (Linux)/8.5.0_GA_3042)
Thread-Topic: pci/hotplug/pnv_php: Fix surprise plug detection and
Thread-Index: 7ViWVrejj338yZQm64sXoMCfdWvE4Nfh1kgU
Content-Type: text/plain; charset="utf-8"

The existing PowerNV hotplug code did not handle surprise plug events
correctly, leading to a complete failure of the hotplug system after
device removal and a required reboot to detect new devices.

This comes down to two issues:

1.) When a device is surprise removed, oftentimes the bridge upstream
    port will cause a PE freeze on the PHB. If this freeze is not
    cleared, the MSI interrupts from the bridge hotplug notification
    logic will not be received by the kernel, stalling all plug events
    on all slots associated with the PE.

2.) When a device is removed from a slot, regardless of surprise or
    programmatic removal, the associated PHB/PE is left frozen. If this
    freeze is not cleared via a fundamental reset, skiboot is unable to
    clear the freeze and cannot retrain / rescan the slot. This also
    requires a reboot to clear the freeze and redetect the device in
    the slot.

Issue the appropriate unfreeze and rescan commands on hotplug events,
and don't oops on hotplug if pci_bus_to_OF_node() returns NULL.

Signed-off-by: Timothy Pearson
---
 arch/powerpc/kernel/pci-hotplug.c |  3 ++
 drivers/pci/hotplug/pnv_php.c     | 53 ++++++++++++++++++++++++++++++-
 2 files changed, 55 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/pci-hotplug.c b/arch/powerpc/kernel/pci-hotplug.c
index 9ea74973d78d..6f444d0822d8 100644
--- a/arch/powerpc/kernel/pci-hotplug.c
+++ b/arch/powerpc/kernel/pci-hotplug.c
@@ -141,6 +141,9 @@ void pci_hp_add_devices(struct pci_bus *bus)
 	struct pci_controller *phb;
 	struct device_node *dn = pci_bus_to_OF_node(bus);
 
+	if (!dn)
+		return;
+
 	phb = pci_bus_to_host(bus);
 
 	mode = PCI_PROBE_NORMAL;
diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
index bac8af3df41a..0ceb4a2c3c79 100644
--- a/drivers/pci/hotplug/pnv_php.c
+++ b/drivers/pci/hotplug/pnv_php.c
@@ -10,6 +10,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -474,7 +475,7 @@ static int pnv_php_enable(struct pnv_php_slot *php_slot, bool rescan)
 	struct hotplug_slot *slot = &php_slot->slot;
 	uint8_t presence = OPAL_PCI_SLOT_EMPTY;
 	uint8_t power_status = OPAL_PCI_SLOT_POWER_ON;
-	int ret;
+	int ret, i;
 
 	/* Check if the slot has been configured */
 	if (php_slot->state != PNV_PHP_STATE_REGISTERED)
@@ -532,6 +533,27 @@ static int pnv_php_enable(struct pnv_php_slot *php_slot, bool rescan)
 
 	/* Power is off, turn it on and then scan the slot */
 	ret = pnv_php_set_slot_power_state(slot, OPAL_PCI_SLOT_POWER_ON);
+	if (ret) {
+		SLOT_WARN(php_slot, "PCI slot activation failed with error code %d, possible frozen PHB", ret);
+		SLOT_WARN(php_slot, "Attempting complete PHB reset before retrying slot activation\n");
+		for (i = 0; i < 3; i++) {
+			/* Slot activation failed, PHB may be fenced from a prior device failure
+			 * Use the OPAL fundamental reset call to both try a device reset and clear
+			 * any potentially active PHB fence / freeze
+			 */
+			SLOT_WARN(php_slot, "Try %d...\n", i + 1);
+			pci_set_pcie_reset_state(php_slot->pdev, 
pcie_warm_reset);
+			msleep(250);
+			pci_set_pcie_reset_state(php_slot->pdev, pcie_deassert_reset);
+
+			ret = pnv_php_set_slot_power_state(slot, OPAL_PCI_SLOT_POWER_ON);
+			if (!ret)
+				break;
+		}
+
+		if (i >= 3)
+			SLOT_WARN(php_slot, "Failed to bring slot online, aborting!\n");
+	}
 	if (ret)
 		return ret;
 
@@ -841,12 +863,41 @@ static void pnv_php_event_handler(struct work_struct *work)
 	struct pnv_php_event *event =
 		container_of(work, struct pnv_php_event, work);
 	struct pnv_php_slot *php_slot = event->php_slot;
+	struct pci_dev *pdev = php_slot->pdev;
+	struct eeh_dev *edev;
+	struct eeh_pe *pe;
+	int i, rc;
 
 	if (event->added)
 		pnv_php_enable_slot(&php_slot->slot);
 	else
 		pnv_php_disable_slot(&php_slot->slot);
 
+	if (!event->added) {
+		/* When a device is surprise removed from a downstream bridge slot, the upstream bridge port
+		 * can still end up frozen due to related EEH events, which will in turn block the MSI interrupts
+		 * for slot hotplug detection. Detect and thaw any frozen upstream PE after slot deactivation...
+		 */
+		edev = pci_dev_to_eeh_dev(pdev);
+		pe = edev ? 
edev->pe : NULL;
+		rc = eeh_pe_get_state(pe);
+		if ((rc == -ENODEV) || (rc == -ENOENT)) {
+			SLOT_WARN(php_slot, "Upstream bridge PE state unknown, hotplug detect may fail\n");
+		}
+		else {
+			if (pe->state & EEH_PE_ISOLATED) {
+				SLOT_WARN(php_slot, "Upstream bridge PE %02x frozen, thawing...\n", pe->addr);
+				for (i = 0; i < 3; i++)
+					if (!eeh_unfreeze_pe(pe))
+						break;
+				if (i >= 3)
+					SLOT_WARN(php_slot, "Unable to thaw PE %02x, hotplug detect will fail!\n", pe->addr);
+				else
+					SLOT_WARN(php_slot, "PE %02x thawed successfully\n", pe->addr);
+			}
+		}
+	}
+
 	kfree(event);
 }
 
-- 
2.39.5

From nobody Thu Oct 9 09:02:00 2025 Received: from raptorengineering.com (mail.raptorengineering.com [23.155.224.40]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DAE502FEE2A; Wed, 18 Jun 2025 16:59:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=23.155.224.40 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750265942; cv=none; b=aFdI6eB6EUa9V4jrk2hWiV5wAnq9uG50SD+o6xNBAw1N1OeaGmVcn1DjU1ZCt6ZHD4Q/L74r5ATt3Mm1HuVVRIYnh9OE0a9dqTGFI7cU65238EHmcew2Pti9GH8+WbJdTZ2AfqRUEm/4bx2X7cee7VwfXpzPhghY9fcxEwvFqBk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750265942; c=relaxed/simple; bh=xFaKsIUdVbLbloNEcOi9HJOJURbTFqzaCVjp5Ps04vY=; h=Date:From:To:Cc:Message-ID:In-Reply-To:References:Subject: MIME-Version:Content-Type; b=aHCpXR5FqCUcrzsVsgR6un4Lt1U6icAopjx7Jn0LI7NQWzx+03OxgQQI70En+NeLVkxzxcynFhhwJapOMlypIeuefBaafa2jk7l3JFADxhVFIjITQ7SO707OrRvkRmmELjwqr4aFDgkjkgbYMpba556Fj0/uqQDB72eUcfDW+Fs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=raptorengineering.com; spf=pass smtp.mailfrom=raptorengineering.com; dkim=pass (1024-bit key) header.d=raptorengineering.com header.i=@raptorengineering.com header.b=A2Hd7zzk; 
arc=none smtp.client-ip=23.155.224.40 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=raptorengineering.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=raptorengineering.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=raptorengineering.com header.i=@raptorengineering.com header.b="A2Hd7zzk" Received: from localhost (localhost [127.0.0.1]) by mail.rptsys.com (Postfix) with ESMTP id 5F8A98288467; Wed, 18 Jun 2025 11:59:00 -0500 (CDT) Received: from mail.rptsys.com ([127.0.0.1]) by localhost (vali.starlink.edu [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id 4xfm3ULz_6Gq; Wed, 18 Jun 2025 11:58:59 -0500 (CDT) Received: from localhost (localhost [127.0.0.1]) by mail.rptsys.com (Postfix) with ESMTP id 783828288670; Wed, 18 Jun 2025 11:58:59 -0500 (CDT) DKIM-Filter: OpenDKIM Filter v2.10.3 mail.rptsys.com 783828288670 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=raptorengineering.com; s=B8E824E6-0BE2-11E6-931D-288C65937AAD; t=1750265939; bh=OlONqedgdZu8PeQ6IQKPSJGIJsQmR5PwabMbkbNrUiE=; h=Date:From:To:Message-ID:MIME-Version; b=A2Hd7zzkPPZn5qnkUjG1tOwsNtgyxueq7hUjeHQEDPQEh6dHZ8JNQWk+g6ZOm+rFY gbDwIHh6+nBmg3hjk6V5VK1mfM/a65FORGDtjcz13dn35Em98yVI6Mggi/ASkT5uNr P5izOOF6fFUQjfdI0pnJjrIWAmtMJz/yc9R/IDgI= X-Virus-Scanned: amavisd-new at rptsys.com Received: from mail.rptsys.com ([127.0.0.1]) by localhost (vali.starlink.edu [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id mxrrclct1c8B; Wed, 18 Jun 2025 11:58:59 -0500 (CDT) Received: from vali.starlink.edu (localhost [127.0.0.1]) by mail.rptsys.com (Postfix) with ESMTP id 4EBDA8288467; Wed, 18 Jun 2025 11:58:59 -0500 (CDT) Date: Wed, 18 Jun 2025 11:58:59 -0500 (CDT) From: Timothy Pearson To: Timothy Pearson Cc: linuxppc-dev , linux-kernel , linux-pci , Madhavan Srinivasan , Michael Ellerman , christophe leroy , Naveen N Rao , Bjorn Helgaas , Shawn Anastasio Message-ID: 
<300098407.1310656.1750265939231.JavaMail.zimbra@raptorengineeringinc.com>
In-Reply-To: <581463409.1310624.1750265668004.JavaMail.zimbra@raptorengineeringinc.com>
References: <581463409.1310624.1750265668004.JavaMail.zimbra@raptorengineeringinc.com>
Subject: [PATCH v2 6/6] pci/hotplug/pnv_php: Enable third attention indicator state
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Mailer: Zimbra 8.5.0_GA_3042 (ZimbraWebClient - GC137 (Linux)/8.5.0_GA_3042)
Thread-Topic: pci/hotplug/pnv_php: Enable third attention indicator
Thread-Index: 7ViWVrejj338yZQm64sXoMCfdWvE4ADzdyG8
Content-Type: text/plain; charset="utf-8"

The PCIe specification allows three attention indicator states: on,
off, and blink. Enable all three states instead of basic on / off
control.

Signed-off-by: Timothy Pearson
---
 drivers/pci/hotplug/pnv_php.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c
index 0ceb4a2c3c79..c3005324be3d 100644
--- a/drivers/pci/hotplug/pnv_php.c
+++ b/drivers/pci/hotplug/pnv_php.c
@@ -440,10 +440,23 @@ static int pnv_php_get_adapter_state(struct hotplug_slot *slot, u8 *state)
 	return ret;
 }
 
+static int pnv_php_get_raw_indicator_status(struct hotplug_slot *slot, u8 *state)
+{
+	struct pnv_php_slot *php_slot = to_pnv_php_slot(slot);
+	struct pci_dev *bridge = php_slot->pdev;
+	u16 status;
+
+	pcie_capability_read_word(bridge, PCI_EXP_SLTCTL, &status);
+	*state = (status & (PCI_EXP_SLTCTL_AIC | PCI_EXP_SLTCTL_PIC)) >> 6;
+	return 0;
+}
+
+
 static int pnv_php_get_attention_state(struct hotplug_slot *slot, u8 *state)
 {
 	struct pnv_php_slot *php_slot = to_pnv_php_slot(slot);
 
+	pnv_php_get_raw_indicator_status(slot, &php_slot->attention_state);
 	*state = php_slot->attention_state;
 	return 0;
 }
@@ -461,7 +474,7 @@ static int 
pnv_php_set_attention_state(struct hotplug_slot *slot, u8 state)
 	mask = PCI_EXP_SLTCTL_AIC;
 
 	if (state)
-		new = PCI_EXP_SLTCTL_ATTN_IND_ON;
+		new = FIELD_PREP(PCI_EXP_SLTCTL_AIC, state);
 	else
 		new = PCI_EXP_SLTCTL_ATTN_IND_OFF;
 
-- 
2.39.5
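
For readers following the indicator change in patch 6/6, the Slot Control field arithmetic can be sketched in plain userspace C. This is only an illustration under stated assumptions: the SLTCTL_* constants copy the values of the corresponding PCI_EXP_SLTCTL_* definitions, and field_prep(), attention_state_to_sltctl() and raw_indicator_status() are hypothetical helpers standing in for FIELD_PREP() and the patched driver functions; none of them exist in the kernel under these names.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the PCIe Slot Control field values (assumed to match
 * the PCI_EXP_SLTCTL_* definitions the patch relies on). */
#define SLTCTL_AIC            0x00c0 /* Attention Indicator Control mask */
#define SLTCTL_ATTN_IND_ON    0x0040
#define SLTCTL_ATTN_IND_BLINK 0x0080
#define SLTCTL_ATTN_IND_OFF   0x00c0
#define SLTCTL_PIC            0x0300 /* Power Indicator Control mask */

/* Hypothetical userspace stand-in for the kernel's FIELD_PREP():
 * shift val up to the lowest set bit of mask, then clip to the mask. */
static uint16_t field_prep(uint16_t mask, uint16_t val)
{
	unsigned int shift = 0;

	assert(mask != 0);
	while (!((mask >> shift) & 1))
		shift++;
	return (uint16_t)((val << shift) & mask);
}

/* Mirrors the patched branch: a non-zero state is encoded directly into
 * the AIC field, zero selects the "off" encoding. */
static uint16_t attention_state_to_sltctl(uint8_t state)
{
	if (state)
		return field_prep(SLTCTL_AIC, state);
	return SLTCTL_ATTN_IND_OFF;
}

/* Mirrors the readback expression: both indicator fields shifted down
 * by 6, so bits 1:0 hold the attention field and bits 3:2 the power
 * field. */
static uint8_t raw_indicator_status(uint16_t sltctl)
{
	return (uint8_t)((sltctl & (SLTCTL_AIC | SLTCTL_PIC)) >> 6);
}
```

Under these assumptions, state 1 encodes to 0x0040 (on), state 2 to 0x0080 (blink), and state 0 to 0x00c0 (off). The readback in this sketch keeps the Power Indicator Control bits in bits 3:2 of the returned value, so a caller that wants only the attention state would need to mask the result with 0x3.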