From: Peng Fan <peng.fan@nxp.com>
When a PCI node is created using an overlay and the overlay is
reverted/destroyed, the "linux,pci-domain" property no longer exists,
so of_get_pci_domain_nr will return failure. Then
of_pci_bus_release_domain_nr will free the ID from the dynamic IDA, even
if the ID was allocated from the static IDA. So the flow is as below:

A: of_changeset_revert
     pci_host_common_remove
       pci_bus_release_domain_nr
         of_pci_bus_release_domain_nr
           of_get_pci_domain_nr      # fails because overlay is gone
           ida_free(&pci_domain_nr_dynamic_ida)

When the driver calls pci_host_common_remove explicitly, the flow becomes:

B: pci_host_common_remove
     pci_bus_release_domain_nr
       of_pci_bus_release_domain_nr
         of_get_pci_domain_nr      # succeeds in this order
         ida_free(&pci_domain_nr_static_ida)
A: of_changeset_revert
     pci_host_common_remove

With the updated flow, pci_host_common_remove will be called twice,
so 'bridge->bus' must be checked to avoid accessing an invalid pointer.

Fixes: c14f7ccc9f5d ("PCI: Assign PCI domain IDs by ida_alloc()")
Signed-off-by: Peng Fan <peng.fan@nxp.com>
---
V1:
Not sure whether to keep the Fixes tag here. I could drop it if it is
improper.
This is to revisit the patch [1], which was rejected last year. The
new flow follows Bjorn's suggestion.
But of_changeset_revert will still invoke platform_remove and then
pci_host_common_remove. So I worked out this patch together with a patch
to the jailhouse driver as below:
 static void destroy_vpci_of_overlay(void)
 {
+	struct device_node *vpci_node = NULL;
+
 	if (overlay_applied) {
+#if LINUX_VERSION_CODE >= KERNEL_VERSION(6,6,0)
+		vpci_node = of_find_node_by_path("/pci@0");
+		if (vpci_node) {
+			struct platform_device *pdev = of_find_device_by_node(vpci_node);
+			if (!pdev)
+				printk("Not found device for /pci@0\n");
+			else {
+				pci_host_common_remove(pdev);
+				platform_device_put(pdev);
+			}
+		}
+		of_node_put(vpci_node);
+#endif
+
 		of_changeset_revert(&overlay_changeset);

[1] https://lore.kernel.org/lkml/20230908224858.GA306960@bhelgaas/T/#md12e6097d91a012ede78c997fc5abf460029a569
drivers/pci/controller/pci-host-common.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/controller/pci-host-common.c b/drivers/pci/controller/pci-host-common.c
index cf5f59a745b3..5a9c29fc57cd 100644
--- a/drivers/pci/controller/pci-host-common.c
+++ b/drivers/pci/controller/pci-host-common.c
@@ -86,8 +86,10 @@ void pci_host_common_remove(struct platform_device *pdev)
 	struct pci_host_bridge *bridge = platform_get_drvdata(pdev);
 
 	pci_lock_rescan_remove();
-	pci_stop_root_bus(bridge->bus);
-	pci_remove_root_bus(bridge->bus);
+	if (bridge->bus) {
+		pci_stop_root_bus(bridge->bus);
+		pci_remove_root_bus(bridge->bus);
+	}
 	pci_unlock_rescan_remove();
 }
EXPORT_SYMBOL_GPL(pci_host_common_remove);
--
2.37.1
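
The static/dynamic IDA mismatch described in the commit message above can be modeled in userspace. The sketch below is only an illustration under stated assumptions: `id_pool`, `fake_node`, and every `fake_*` and `flow_*` name are hypothetical stand-ins, not kernel APIs, and the allocation logic is a simplified guess at how the release path chooses a pool by re-reading the property.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_DOMAINS 8

/* Hypothetical stand-in for a kernel IDA: one "in use" flag per ID. */
struct id_pool {
	bool used[MAX_DOMAINS];
};

/* Hypothetical stand-in for a DT node that may carry "linux,pci-domain". */
struct fake_node {
	bool has_domain_prop;
	int domain;
};

static struct id_pool static_pool;	/* models pci_domain_nr_static_ida */
static struct id_pool dynamic_pool;	/* models pci_domain_nr_dynamic_ida */

/* Models of_get_pci_domain_nr: the domain, or -1 if the property is absent. */
static int fake_get_domain_nr(const struct fake_node *np)
{
	return np->has_domain_prop ? np->domain : -1;
}

/* Allocate from the static pool if the property exists, else dynamically. */
static int fake_alloc_domain_nr(const struct fake_node *np)
{
	int nr = fake_get_domain_nr(np);

	if (nr >= 0) {
		if (nr >= MAX_DOMAINS || static_pool.used[nr])
			return -1;
		static_pool.used[nr] = true;
		return nr;
	}
	for (nr = 0; nr < MAX_DOMAINS; nr++) {
		if (!dynamic_pool.used[nr]) {
			dynamic_pool.used[nr] = true;
			return nr;
		}
	}
	return -1;
}

/* Release picks the pool by re-reading the property at release time. */
static void fake_release_domain_nr(const struct fake_node *np, int nr)
{
	if (fake_get_domain_nr(np) == nr)
		static_pool.used[nr] = false;
	else
		dynamic_pool.used[nr] = false;
}

/* Flow A: the property disappears first, so the static ID is never freed.
 * Call at most once per process, since the leaked ID stays allocated. */
bool flow_a_leaks_static_id(void)
{
	struct fake_node np = { .has_domain_prop = true, .domain = 2 };
	int nr = fake_alloc_domain_nr(&np);

	assert(nr == 2);
	np.has_domain_prop = false;		/* overlay reverted */
	fake_release_domain_nr(&np, nr);	/* clears the wrong pool */
	return static_pool.used[nr];		/* still marked in use */
}

/* Flow B: release while the property still exists, then revert. */
bool flow_b_frees_static_id(void)
{
	struct fake_node np = { .has_domain_prop = true, .domain = 3 };
	int nr = fake_alloc_domain_nr(&np);

	assert(nr == 3);
	fake_release_domain_nr(&np, nr);	/* property still present */
	np.has_domain_prop = false;		/* overlay reverted afterwards */
	return !static_pool.used[nr];
}
```

In this model both functions return true: flow A leaves the static ID allocated, while flow B frees it. The model clears a flag silently on a wrong-pool release; it does not attempt to reproduce whatever diagnostics the kernel emits in that case.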
On Mon, Oct 28, 2024 at 04:46:43PM +0800, Peng Fan (OSS) wrote:
> From: Peng Fan <peng.fan@nxp.com>
>
> When a PCI node is created using an overlay and the overlay is
> reverted/destroyed, the "linux,pci-domain" property no longer exists,
> so of_get_pci_domain_nr will return failure. Then
> of_pci_bus_release_domain_nr will free the ID from the dynamic IDA, even
> if the ID was allocated from the static IDA. So the flow is as below:
>
> A: of_changeset_revert
>      pci_host_common_remove
>        pci_bus_release_domain_nr
>          of_pci_bus_release_domain_nr
>            of_get_pci_domain_nr      # fails because overlay is gone
>            ida_free(&pci_domain_nr_dynamic_ida)
>
> When the driver calls pci_host_common_remove explicitly, the flow becomes:
>
> B: pci_host_common_remove
>      pci_bus_release_domain_nr
>        of_pci_bus_release_domain_nr
>          of_get_pci_domain_nr      # succeeds in this order
>          ida_free(&pci_domain_nr_static_ida)
> A: of_changeset_revert
>      pci_host_common_remove
>
> With the updated flow, pci_host_common_remove will be called twice,
> so 'bridge->bus' must be checked to avoid accessing an invalid pointer.

If/when you post a v2 of this, please:

  - Update the subject to say *why* this change is desirable.

  - Follow the capitalization convention (use "git log --oneline" to
    discover it).

  - Add "()" after function names in the text (no need in the call
    tree because that's obviously all functions).

  - Mention the user-visible problem this fixes, e.g., do you see an
    oops because of a NULL pointer dereference?
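
The double-removal case that the patch guards against can be sketched in userspace as follows. This is a minimal model, not kernel code: `fake_bus`, `fake_bridge`, and the `fake_*` functions are hypothetical, and the sketch assumes that root bus teardown leaves the bridge's bus pointer cleared, which is what makes the `if (bridge->bus)` check effective on the second call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins, not the kernel's structures. */
struct fake_bus {
	int domain_nr;
};

struct fake_bridge {
	struct fake_bus *bus;
};

static int teardown_count;	/* number of times teardown actually ran */

/* Models root bus teardown. ASSUMPTION: it clears bridge->bus when done. */
static void fake_remove_root_bus(struct fake_bridge *bridge)
{
	free(bridge->bus);
	bridge->bus = NULL;
	teardown_count++;
}

/* Guarded remove, mirroring the 'if (bridge->bus)' check in the patch. */
static void fake_host_common_remove(struct fake_bridge *bridge)
{
	if (bridge->bus)
		fake_remove_root_bus(bridge);
}

/* Calls remove twice, as in flow B followed by flow A.
 * Returns how many times the teardown has run so far. */
int double_remove_demo(void)
{
	struct fake_bridge bridge = { .bus = malloc(sizeof(struct fake_bus)) };

	assert(bridge.bus != NULL);
	fake_host_common_remove(&bridge);	/* explicit call from the driver */
	fake_host_common_remove(&bridge);	/* second call via overlay revert */
	assert(bridge.bus == NULL);
	return teardown_count;
}
```

In this model the teardown runs only once even though remove is called twice. Without the guard, the second call would hand a NULL bus pointer to the teardown path; whether that shows up as an oops on real hardware is the question raised in the review and is not answered by this sketch.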
On Mon, Oct 28, 2024 at 04:46:43PM +0800, Peng Fan (OSS) wrote:
> From: Peng Fan <peng.fan@nxp.com>
>
> When a PCI node is created using an overlay and the overlay is
> reverted/destroyed, the "linux,pci-domain" property no longer exists,
> so of_get_pci_domain_nr will return failure. Then
> of_pci_bus_release_domain_nr will free the ID from the dynamic IDA, even
> if the ID was allocated from the static IDA. So the flow is as below:
>
> A: of_changeset_revert
>      pci_host_common_remove
>        pci_bus_release_domain_nr
>          of_pci_bus_release_domain_nr
>            of_get_pci_domain_nr      # fails because overlay is gone
>            ida_free(&pci_domain_nr_dynamic_ida)
>
> When the driver calls pci_host_common_remove explicitly, the flow becomes:
>
> B: pci_host_common_remove
>      pci_bus_release_domain_nr
>        of_pci_bus_release_domain_nr
>          of_get_pci_domain_nr      # succeeds in this order
>          ida_free(&pci_domain_nr_static_ida)
> A: of_changeset_revert
>      pci_host_common_remove
>
> With the updated flow, pci_host_common_remove will be called twice,
> so 'bridge->bus' must be checked to avoid accessing an invalid pointer.
>
> Fixes: c14f7ccc9f5d ("PCI: Assign PCI domain IDs by ida_alloc()")
> Signed-off-by: Peng Fan <peng.fan@nxp.com>
I went through the previous discussion [1] and I couldn't see an agreement on
the point raised by Bjorn on 'removing the host bridge before the overlay'.
I do think this is a valid point and if you do not think so, please state the
reason.
- Mani
[1] https://lore.kernel.org/lkml/20230913115737.GA426735@bhelgaas/
--
மணிவண்ணன் சதாசிவம்