Today, it's possible for a PCI device to be created and
runtime-suspended before it is fully initialized. When that happens, the
device will remain in D0, but the suspend process may save an
intermediate version of that device's state -- for example, without
appropriate BAR configuration. When the device later resumes, we'll
restore invalid PCI state and the device may not function.
Prevent runtime suspend for PCI devices by deferring pm_runtime_enable()
until we've fully initialized the device.
More details on how exactly this may occur:
1. PCI device is created by pci_scan_slot() or similar
2. As part of pci_scan_slot(), pci_pm_init() enables runtime PM; the
device starts "active" and we initially prevent (pm_runtime_forbid())
suspend -- but see [*] footnote
3. Underlying 'struct device' is added to the system (device_add());
runtime PM can now be configured by user space
4. PCI device receives BAR configuration
(pci_assign_unassigned_bus_resources(), etc.)
5. PCI device is added to the system in pci_bus_add_device()
The device may potentially suspend between #3 and #4.
[*] By default, pm_runtime_forbid() prevents suspending a device; but by
design [**], this can be overridden by user space policy via
echo auto > /sys/bus/pci/devices/.../power/control
Thus, the above #3/#4 sequence is racy with user space (udev or
similar).
Notably, many PCI devices are enumerated at subsys_initcall time and so
will not race with user space. However, there are several scenarios
where PCI devices are created later on, such as with hotplug or when
drivers (pwrctrl or controller drivers) are built as modules.
---
[**] The relationship between pm_runtime_forbid(), pm_runtime_allow(),
/sys/.../power/control, and the runtime PM usage counter can be subtle.
It appears that the intention of pm_runtime_forbid() /
pm_runtime_allow() is twofold:
1. Allow the user to disable runtime_pm (force device to always be
powered on) through sysfs.
2. Allow the driver to start with runtime_pm disabled (device forced
on) and user space could later enable runtime_pm.
This conclusion comes from reading `Documentation/power/runtime_pm.rst`,
specifically the section starting "The user space can effectively
disallow".
This means that while pm_runtime_forbid() does technically increase the
runtime PM usage counter, this usage counter is not a guarantee of
functional correctness, because sysfs can decrease that count again.
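As a hedged illustration of that interaction (not part of this patch; the
function below is hypothetical, only the pm_runtime_*() calls are real
kernel APIs):

#include <linux/pm_runtime.h>

/*
 * Hypothetical example: the usage count taken by pm_runtime_forbid() can
 * be dropped again by user space policy ("echo auto > .../power/control"
 * ends up in pm_runtime_allow()), so it is a policy default, not a pin.
 */
static void forbid_vs_allow_example(struct device *dev)
{
	pm_runtime_forbid(dev);	/* usage_count++: device kept active by default */

	/*
	 * At any point after device_add(), udev may write "auto", which
	 * calls pm_runtime_allow() -> usage_count--, after which the PM
	 * core is free to runtime-suspend the device.
	 */

	/* Code that must touch the hardware has to take its own reference: */
	pm_runtime_get_sync(dev);
	/* ... hardware access ... */
	pm_runtime_put(dev);
}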
---
Note that we also move pm_runtime_set_active(), but leave
pm_runtime_forbid() in place earlier in the initialization sequence, to
avoid confusing user space. From Documentation/power/runtime_pm.rst:
"It should be noted, however, that if the user space has already
intentionally changed the value of /sys/devices/.../power/control to
"auto" to allow the driver to power manage the device at run time, the
driver may confuse it by using pm_runtime_forbid() this way."
Thus, we should ensure pm_runtime_forbid() is called before the device
is available to user space.
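For readability, a condensed sketch of the resulting ordering (function
bodies trimmed; see the diff below for the actual change):

/* drivers/pci/pci.c: runs during enumeration, before device_add() */
void pci_pm_init(struct pci_dev *dev)
{
	/* ... */
	pci_pm_power_up_and_verify_state(dev);
	/* Forbid runtime suspend before user space can see the device */
	pm_runtime_forbid(&dev->dev);
}

/* drivers/pci/bus.c: runs only after BAR assignment and final fixups */
void pci_bus_add_device(struct pci_dev *dev)
{
	/* ... */
	/* Only now mark the device active and enable runtime PM at all */
	pm_runtime_set_active(&dev->dev);
	pm_runtime_enable(&dev->dev);
	/* ... */
}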
Link: https://lore.kernel.org/all/20251016155335.1.I60a53c170a8596661883bd2b4ef475155c7aa72b@changeid/
Signed-off-by: Brian Norris <briannorris@chromium.org>
Cc: <stable@vger.kernel.org>
Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
---
Changes in v4:
* Move pm_runtime_set_active() too
Changes in v3:
* Add Link to initial discussion
* Add Rafael's Reviewed-by
* Add lengthier footnotes about forbid vs allow vs sysfs
Changes in v2:
* Update CC list
* Rework problem description
* Update solution: defer pm_runtime_enable(), instead of trying to
get()/put()
drivers/pci/bus.c | 4 ++++
drivers/pci/pci.c | 2 --
2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index f26aec6ff588..40ff954f416f 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -14,6 +14,7 @@
#include <linux/of.h>
#include <linux/of_platform.h>
#include <linux/platform_device.h>
+#include <linux/pm_runtime.h>
#include <linux/proc_fs.h>
#include <linux/slab.h>
@@ -375,6 +376,9 @@ void pci_bus_add_device(struct pci_dev *dev)
put_device(&pdev->dev);
}
+ pm_runtime_set_active(&dev->dev);
+ pm_runtime_enable(&dev->dev);
+
if (!dn || of_device_is_available(dn))
pci_dev_allow_binding(dev);
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b14dd064006c..234bf3608569 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3225,8 +3225,6 @@ void pci_pm_init(struct pci_dev *dev)
poweron:
pci_pm_power_up_and_verify_state(dev);
pm_runtime_forbid(&dev->dev);
- pm_runtime_set_active(&dev->dev);
- pm_runtime_enable(&dev->dev);
}
static unsigned long pci_ea_flags(struct pci_dev *dev, u8 prop)
--
2.51.1.821.gb6fe4d2222-goog
On Thu, Oct 23, 2025 at 02:09:01PM -0700, Brian Norris wrote:
> Today, it's possible for a PCI device to be created and
> runtime-suspended before it is fully initialized. When that happens, the
> device will remain in D0, but the suspend process may save an
> intermediate version of that device's state -- for example, without
> appropriate BAR configuration. When the device later resumes, we'll
> restore invalid PCI state and the device may not function.
>
> Prevent runtime suspend for PCI devices by deferring pm_runtime_enable()
> until we've fully initialized the device.
>
> More details on how exactly this may occur:
>
> 1. PCI device is created by pci_scan_slot() or similar
> 2. As part of pci_scan_slot(), pci_pm_init() enables runtime PM; the
> device starts "active" and we initially prevent (pm_runtime_forbid())
> suspend -- but see [*] footnote
> 3. Underlying 'struct device' is added to the system (device_add());
> runtime PM can now be configured by user space
> 4. PCI device receives BAR configuration
> (pci_assign_unassigned_bus_resources(), etc.)
> 5. PCI device is added to the system in pci_bus_add_device()
>
> The device may potentially suspend between #3 and #4.
>
> [*] By default, pm_runtime_forbid() prevents suspending a device; but by
> design [**], this can be overridden by user space policy via
>
> echo auto > /sys/bus/pci/devices/.../power/control
>
> Thus, the above #3/#4 sequence is racy with user space (udev or
> similar).
>
> Notably, many PCI devices are enumerated at subsys_initcall time and so
> will not race with user space. However, there are several scenarios
> where PCI devices are created later on, such as with hotplug or when
> drivers (pwrctrl or controller drivers) are built as modules.
>
> ---
>
> [**] The relationship between pm_runtime_forbid(), pm_runtime_allow(),
> /sys/.../power/control, and the runtime PM usage counter can be subtle.
> It appears that the intention of pm_runtime_forbid() /
> pm_runtime_allow() is twofold:
>
> 1. Allow the user to disable runtime_pm (force device to always be
> powered on) through sysfs.
> 2. Allow the driver to start with runtime_pm disabled (device forced
> on) and user space could later enable runtime_pm.
>
> This conclusion comes from reading `Documentation/power/runtime_pm.rst`,
> specifically the section starting "The user space can effectively
> disallow".
>
> This means that while pm_runtime_forbid() does technically increase the
> runtime PM usage counter, this usage counter is not a guarantee of
> functional correctness, because sysfs can decrease that count again.
>
> ---
>
> Note that we also move pm_runtime_set_active(), but leave
> pm_runtime_forbid() in place earlier in the initialization sequence, to
> avoid confusing user space. From Documentation/power/runtime_pm.rst:
>
> "It should be noted, however, that if the user space has already
> intentionally changed the value of /sys/devices/.../power/control to
> "auto" to allow the driver to power manage the device at run time, the
> driver may confuse it by using pm_runtime_forbid() this way."
>
> Thus, we should ensure pm_runtime_forbid() is called before the device
> is available to user space.
>
> Link: https://lore.kernel.org/all/20251016155335.1.I60a53c170a8596661883bd2b4ef475155c7aa72b@changeid/
> Signed-off-by: Brian Norris <briannorris@chromium.org>
> Cc: <stable@vger.kernel.org>
> Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
Applied to pci/pm for v6.20, thanks! I tried to simplify the commit
log so the issue isn't hidden by details. Happy to restore things if
I trimmed too much:
PCI/PM: Prevent runtime suspend until devices are fully initialized
Previously, it was possible for a PCI device to be runtime-suspended before
it was fully initialized. When that happened, the suspend process could
save invalid device state, for example, before BAR assignment. Restoring
the invalid state during resume may leave the device non-functional.
Prevent runtime suspend for PCI devices until they are fully initialized by
deferring pm_runtime_enable().
More details on how exactly this may occur:
1. PCI device is created by pci_scan_slot() or similar
2. As part of pci_scan_slot(), pci_pm_init() puts the device in D0 and
prevents runtime suspend via pm_runtime_forbid()
3. pci_device_add() adds the underlying 'struct device' via device_add(),
which means user space can allow runtime suspend, e.g.,
echo auto > /sys/bus/pci/devices/.../power/control
4. PCI device receives BAR configuration
(pci_assign_unassigned_bus_resources(), etc.)
5. pci_bus_add_device() applies final fixups, saves device state, and
tries to attach a driver
The device may potentially be suspended between #3 and #5, so this is racy
with user space (udev or similar).
Many PCI devices are enumerated at subsys_initcall time and so will not
race with user space, but devices created later by hotplug or modular
pwrctrl or host controller drivers are susceptible to this race.
More runtime PM details at the first Link: below.
Signed-off-by: Brian Norris <briannorris@chromium.org>
[bhelgaas: simplify commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20251016155335.1.I60a53c170a8596661883bd2b4ef475155c7aa72b@changeid/
Link: https://patch.msgid.link/20251023140901.v4.1.I60a53c170a8596661883bd2b4ef475155c7aa72b@changeid
> ---
>
> Changes in v4:
> * Move pm_runtime_set_active() too
>
> Changes in v3:
> * Add Link to initial discussion
> * Add Rafael's Reviewed-by
> * Add lengthier footnotes about forbid vs allow vs sysfs
>
> Changes in v2:
> * Update CC list
> * Rework problem description
> * Update solution: defer pm_runtime_enable(), instead of trying to
> get()/put()
>
> drivers/pci/bus.c | 4 ++++
> drivers/pci/pci.c | 2 --
> 2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
> index f26aec6ff588..40ff954f416f 100644
> --- a/drivers/pci/bus.c
> +++ b/drivers/pci/bus.c
> @@ -14,6 +14,7 @@
> #include <linux/of.h>
> #include <linux/of_platform.h>
> #include <linux/platform_device.h>
> +#include <linux/pm_runtime.h>
> #include <linux/proc_fs.h>
> #include <linux/slab.h>
>
> @@ -375,6 +376,9 @@ void pci_bus_add_device(struct pci_dev *dev)
> put_device(&pdev->dev);
> }
>
> + pm_runtime_set_active(&dev->dev);
> + pm_runtime_enable(&dev->dev);
> +
> if (!dn || of_device_is_available(dn))
> pci_dev_allow_binding(dev);
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index b14dd064006c..234bf3608569 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3225,8 +3225,6 @@ void pci_pm_init(struct pci_dev *dev)
> poweron:
> pci_pm_power_up_and_verify_state(dev);
> pm_runtime_forbid(&dev->dev);
> - pm_runtime_set_active(&dev->dev);
> - pm_runtime_enable(&dev->dev);
> }
>
> static unsigned long pci_ea_flags(struct pci_dev *dev, u8 prop)
> --
> 2.51.1.821.gb6fe4d2222-goog
>
On 06.01.2026 23:27, Bjorn Helgaas wrote:
> On Thu, Oct 23, 2025 at 02:09:01PM -0700, Brian Norris wrote:
>> Today, it's possible for a PCI device to be created and
>> runtime-suspended before it is fully initialized. When that happens, the
>> device will remain in D0, but the suspend process may save an
>> intermediate version of that device's state -- for example, without
>> appropriate BAR configuration. When the device later resumes, we'll
>> restore invalid PCI state and the device may not function.
>>
>> Prevent runtime suspend for PCI devices by deferring pm_runtime_enable()
>> until we've fully initialized the device.
>>
>> More details on how exactly this may occur:
>>
>> 1. PCI device is created by pci_scan_slot() or similar
>> 2. As part of pci_scan_slot(), pci_pm_init() enables runtime PM; the
>> device starts "active" and we initially prevent (pm_runtime_forbid())
>> suspend -- but see [*] footnote
>> 3. Underlying 'struct device' is added to the system (device_add());
>> runtime PM can now be configured by user space
>> 4. PCI device receives BAR configuration
>> (pci_assign_unassigned_bus_resources(), etc.)
>> 5. PCI device is added to the system in pci_bus_add_device()
>>
>> The device may potentially suspend between #3 and #4.
>>
>> [*] By default, pm_runtime_forbid() prevents suspending a device; but by
>> design [**], this can be overridden by user space policy via
>>
>> echo auto > /sys/bus/pci/devices/.../power/control
>>
>> Thus, the above #3/#4 sequence is racy with user space (udev or
>> similar).
>>
>> Notably, many PCI devices are enumerated at subsys_initcall time and so
>> will not race with user space. However, there are several scenarios
>> where PCI devices are created later on, such as with hotplug or when
>> drivers (pwrctrl or controller drivers) are built as modules.
>>
>> ---
>>
>> [**] The relationship between pm_runtime_forbid(), pm_runtime_allow(),
>> /sys/.../power/control, and the runtime PM usage counter can be subtle.
>> It appears that the intention of pm_runtime_forbid() /
>> pm_runtime_allow() is twofold:
>>
>> 1. Allow the user to disable runtime_pm (force device to always be
>> powered on) through sysfs.
>> 2. Allow the driver to start with runtime_pm disabled (device forced
>> on) and user space could later enable runtime_pm.
>>
>> This conclusion comes from reading `Documentation/power/runtime_pm.rst`,
>> specifically the section starting "The user space can effectively
>> disallow".
>>
>> This means that while pm_runtime_forbid() does technically increase the
>> runtime PM usage counter, this usage counter is not a guarantee of
>> functional correctness, because sysfs can decrease that count again.
>>
>> ---
>>
>> Note that we also move pm_runtime_set_active(), but leave
>> pm_runtime_forbid() in place earlier in the initialization sequence, to
>> avoid confusing user space. From Documentation/power/runtime_pm.rst:
>>
>> "It should be noted, however, that if the user space has already
>> intentionally changed the value of /sys/devices/.../power/control to
>> "auto" to allow the driver to power manage the device at run time, the
>> driver may confuse it by using pm_runtime_forbid() this way."
>>
>> Thus, we should ensure pm_runtime_forbid() is called before the device
>> is available to user space.
>>
>> Link: https://lore.kernel.org/all/20251016155335.1.I60a53c170a8596661883bd2b4ef475155c7aa72b@changeid/
>> Signed-off-by: Brian Norris <briannorris@chromium.org>
>> Cc: <stable@vger.kernel.org>
>> Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
> Applied to pci/pm for v6.20, thanks! I tried to simplify the commit
> log so the issue isn't hidden by details. Happy to restore things if
> I trimmed too much:
>
> PCI/PM: Prevent runtime suspend until devices are fully initialized
>
> Previously, it was possible for a PCI device to be runtime-suspended before
> it was fully initialized. When that happened, the suspend process could
> save invalid device state, for example, before BAR assignment. Restoring
> the invalid state during resume may leave the device non-functional.
>
> Prevent runtime suspend for PCI devices until they are fully initialized by
> deferring pm_runtime_enable().
>
> More details on how exactly this may occur:
>
> 1. PCI device is created by pci_scan_slot() or similar
>
> 2. As part of pci_scan_slot(), pci_pm_init() puts the device in D0 and
> prevents runtime suspend via pm_runtime_forbid()
>
> 3. pci_device_add() adds the underlying 'struct device' via device_add(),
> which means user space can allow runtime suspend, e.g.,
>
> echo auto > /sys/bus/pci/devices/.../power/control
>
> 4. PCI device receives BAR configuration
> (pci_assign_unassigned_bus_resources(), etc.)
>
> 5. pci_bus_add_device() applies final fixups, saves device state, and
> tries to attach a driver
>
> The device may potentially be suspended between #3 and #5, so this is racy
> with user space (udev or similar).
>
> Many PCI devices are enumerated at subsys_initcall time and so will not
> race with user space, but devices created later by hotplug or modular
> pwrctrl or host controller drivers are susceptible to this race.
>
> More runtime PM details at the first Link: below.
>
> Signed-off-by: Brian Norris <briannorris@chromium.org>
> [bhelgaas: simplify commit log]
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
> Cc: stable@vger.kernel.org
> Link: https://lore.kernel.org/all/20251016155335.1.I60a53c170a8596661883bd2b4ef475155c7aa72b@changeid/
> Link: https://patch.msgid.link/20251023140901.v4.1.I60a53c170a8596661883bd2b4ef475155c7aa72b@changeid
This patch landed recently in linux-next as commit c796513dc54e
("PCI/PM: Prevent runtime suspend until devices are fully initialized").
In my tests I found that it sometimes causes the "pci 0000:01:00.0:
runtime PM trying to activate child device 0000:01:00.0 but parent
(0000:00:00.0) is not active" warning on Qualcomm Robotics RB5 board
(arch/arm64/boot/dts/qcom/qrb5165-rb5.dts). This in turn causes a
lockdep warning about console lock, but this is just a consequence of
the runtime pm warning. Reverting $subject patch on top of current
linux-next hides this warning.
Here is a kernel log:
pci 0000:01:00.0: [17cb:1101] type 00 class 0xff0000 PCIe Endpoint
pci 0000:01:00.0: BAR 0 [mem 0x00000000-0x000fffff 64bit]
pci 0000:01:00.0: PME# supported from D0 D3hot D3cold
pci 0000:01:00.0: 4.000 Gb/s available PCIe bandwidth, limited by 5.0
GT/s PCIe x1 link at 0000:00:00.0 (capable of 7.876 Gb/s with 8.0 GT/s
PCIe x1 link)
pci 0000:01:00.0: Adding to iommu group 13
pci 0000:01:00.0: ASPM: default states L0s L1
pcieport 0000:00:00.0: bridge window [mem 0x60400000-0x604fffff]: assigned
pci 0000:01:00.0: BAR 0 [mem 0x60400000-0x604fffff 64bit]: assigned
pci 0000:01:00.0: runtime PM trying to activate child device
0000:01:00.0 but parent (0000:00:00.0) is not active
======================================================
WARNING: possible circular locking dependency detected
6.19.0-rc1+ #16398 Not tainted
------------------------------------------------------
kworker/3:0/33 is trying to acquire lock:
ffffcd182ff1ae98 (console_owner){..-.}-{0:0}, at:
console_lock_spinning_enable+0x44/0x78
but task is already holding lock:
ffff0000835c5238 (&dev->power.lock/1){....}-{3:3}, at:
__pm_runtime_set_status+0x240/0x384
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #3 (&dev->power.lock/1){....}-{3:3}:
_raw_spin_lock_nested+0x44/0x5c
__pm_runtime_set_status+0x240/0x384
arm_smmu_device_probe+0xbe0/0xe5c
platform_probe+0x5c/0xac
really_probe+0xbc/0x298
__driver_probe_device+0x78/0x12c
driver_probe_device+0x40/0x164
__driver_attach+0x9c/0x1ac
bus_for_each_dev+0x74/0xd0
driver_attach+0x24/0x30
bus_add_driver+0xe4/0x208
driver_register+0x60/0x128
__platform_driver_register+0x24/0x30
arm_smmu_driver_init+0x20/0x2c
do_one_initcall+0x64/0x308
kernel_init_freeable+0x284/0x500
kernel_init+0x20/0x1d8
ret_from_fork+0x10/0x20
-> #2 (&dev->power.lock){-...}-{3:3}:
_raw_spin_lock_irqsave+0x60/0x88
__pm_runtime_resume+0x4c/0xbc
__uart_start+0x4c/0x114
uart_write+0x98/0x278
n_tty_write+0x1dc/0x4f0
file_tty_write.constprop.0+0x12c/0x2bc
redirected_tty_write+0xc0/0x108
do_iter_readv_writev+0xdc/0x1c0
vfs_writev+0xf0/0x280
do_writev+0x74/0x13c
__arm64_sys_writev+0x20/0x2c
invoke_syscall+0x48/0x10c
el0_svc_common.constprop.0+0x40/0xe8
do_el0_svc+0x20/0x2c
el0_svc+0x50/0x2e8
el0t_64_sync_handler+0xa0/0xe4
el0t_64_sync+0x198/0x19c
-> #1 (&port_lock_key){-.-.}-{3:3}:
_raw_spin_lock_irqsave+0x60/0x88
qcom_geni_serial_console_write+0x50/0x344
console_flush_one_record+0x33c/0x474
console_unlock+0x80/0x14c
vprintk_emit+0x258/0x3d0
vprintk_default+0x38/0x44
vprintk+0x28/0x34
_printk+0x5c/0x84
register_console+0x278/0x510
serial_core_register_port+0x6cc/0x79c
serial_ctrl_register_port+0x10/0x1c
uart_add_one_port+0x10/0x1c
qcom_geni_serial_probe+0x2c0/0x448
platform_probe+0x5c/0xac
really_probe+0xbc/0x298
__driver_probe_device+0x78/0x12c
driver_probe_device+0x40/0x164
__device_attach_driver+0xb8/0x138
bus_for_each_drv+0x80/0xdc
__device_attach+0xa8/0x1b0
device_initial_probe+0x50/0x54
bus_probe_device+0x38/0xa8
device_add+0x540/0x720
of_device_add+0x44/0x60
of_platform_device_create_pdata+0x90/0x11c
of_platform_bus_create+0x17c/0x394
of_platform_populate+0x58/0xf8
devm_of_platform_populate+0x58/0xbc
geni_se_probe+0xdc/0x164
platform_probe+0x5c/0xac
really_probe+0xbc/0x298
__driver_probe_device+0x78/0x12c
driver_probe_device+0x40/0x164
__device_attach_driver+0xb8/0x138
bus_for_each_drv+0x80/0xdc
__device_attach+0xa8/0x1b0
device_initial_probe+0x50/0x54
bus_probe_device+0x38/0xa8
deferred_probe_work_func+0x8c/0xc8
process_one_work+0x208/0x604
worker_thread+0x244/0x388
kthread+0x150/0x228
ret_from_fork+0x10/0x20
-> #0 (console_owner){..-.}-{0:0}:
__lock_acquire+0x1408/0x2254
lock_acquire+0x1c8/0x354
console_lock_spinning_enable+0x68/0x78
console_flush_one_record+0x300/0x474
console_unlock+0x80/0x14c
vprintk_emit+0x258/0x3d0
dev_vprintk_emit+0xd8/0x1a0
dev_printk_emit+0x58/0x80
__dev_printk+0x3c/0x88
_dev_err+0x60/0x88
__pm_runtime_set_status+0x28c/0x384
pci_bus_add_device+0xa4/0x18c
pci_bus_add_devices+0x3c/0x88
pci_bus_add_devices+0x68/0x88
pci_rescan_bus+0x30/0x44
rescan_work_func+0x28/0x3c
process_one_work+0x208/0x604
worker_thread+0x244/0x388
kthread+0x150/0x228
ret_from_fork+0x10/0x20
other info that might help us debug this:
Chain exists of:
console_owner --> &dev->power.lock --> &dev->power.lock/1
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&dev->power.lock/1);
lock(&dev->power.lock);
lock(&dev->power.lock/1);
lock(console_owner);
*** DEADLOCK ***
7 locks held by kworker/3:0/33:
#0: ffff00008000d948 ((wq_completion)events){+.+.}-{0:0}, at:
process_one_work+0x18c/0x604
#1: ffff8000802a3de0 ((work_completion)(&pwrctrl->work)){+.+.}-{0:0},
at: process_one_work+0x1b4/0x604
#2: ffffcd18301138e8 (pci_rescan_remove_lock){+.+.}-{4:4}, at:
pci_lock_rescan_remove+0x1c/0x28
#3: ffff00008ac8a238 (&dev->power.lock){-...}-{3:3}, at:
__pm_runtime_set_status+0x1d4/0x384
#4: ffff0000835c5238 (&dev->power.lock/1){....}-{3:3}, at:
__pm_runtime_set_status+0x240/0x384
#5: ffffcd182ff1ac78 (console_lock){+.+.}-{0:0}, at:
dev_vprintk_emit+0xd8/0x1a0
#6: ffffcd182ff1acd8 (console_srcu){....}-{0:0}, at:
console_flush_one_record+0x0/0x474
stack backtrace:
CPU: 3 UID: 0 PID: 33 Comm: kworker/3:0 Not tainted 6.19.0-rc1+ #16398
PREEMPT
Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT)
Workqueue: events rescan_work_func
Call trace:
show_stack+0x18/0x24 (C)
dump_stack_lvl+0x90/0xd0
dump_stack+0x18/0x24
print_circular_bug+0x298/0x37c
check_noncircular+0x15c/0x170
__lock_acquire+0x1408/0x2254
lock_acquire+0x1c8/0x354
console_lock_spinning_enable+0x68/0x78
console_flush_one_record+0x300/0x474
console_unlock+0x80/0x14c
vprintk_emit+0x258/0x3d0
dev_vprintk_emit+0xd8/0x1a0
dev_printk_emit+0x58/0x80
__dev_printk+0x3c/0x88
_dev_err+0x60/0x88
__pm_runtime_set_status+0x28c/0x384
pci_bus_add_device+0xa4/0x18c
pci_bus_add_devices+0x3c/0x88
pci_bus_add_devices+0x68/0x88
pci_rescan_bus+0x30/0x44
rescan_work_func+0x28/0x3c
process_one_work+0x208/0x604
worker_thread+0x244/0x388
kthread+0x150/0x228
ret_from_fork+0x10/0x20
This looks a bit similar to the issue reported some time ago on a
different board:
https://lore.kernel.org/all/6d438995-4d6d-4a21-9ad2-8a0352482d44@samsung.com/
Let me know if I can somehow help debugging this.
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
Hi Marek,
On Wed, Jan 14, 2026 at 10:46:41AM +0100, Marek Szyprowski wrote:
> On 06.01.2026 23:27, Bjorn Helgaas wrote:
> > On Thu, Oct 23, 2025 at 02:09:01PM -0700, Brian Norris wrote:
> >> Today, it's possible for a PCI device to be created and
> >> runtime-suspended before it is fully initialized. When that happens, the
> >> device will remain in D0, but the suspend process may save an
> >> intermediate version of that device's state -- for example, without
> >> appropriate BAR configuration. When the device later resumes, we'll
> >> restore invalid PCI state and the device may not function.
> >>
> >> Prevent runtime suspend for PCI devices by deferring pm_runtime_enable()
> >> until we've fully initialized the device.
...
> This patch landed recently in linux-next as commit c796513dc54e
> ("PCI/PM: Prevent runtime suspend until devices are fully initialized").
> In my tests I found that it sometimes causes the "pci 0000:01:00.0:
> runtime PM trying to activate child device 0000:01:00.0 but parent
> (0000:00:00.0) is not active" warning on Qualcomm Robotics RB5 board
> (arch/arm64/boot/dts/qcom/qrb5165-rb5.dts). This in turn causes a
> lockdep warning about console lock, but this is just a consequence of
> the runtime pm warning. Reverting $subject patch on top of current
> linux-next hides this warning.
>
> Here is a kernel log:
>
> pci 0000:01:00.0: [17cb:1101] type 00 class 0xff0000 PCIe Endpoint
> pci 0000:01:00.0: BAR 0 [mem 0x00000000-0x000fffff 64bit]
> pci 0000:01:00.0: PME# supported from D0 D3hot D3cold
> pci 0000:01:00.0: 4.000 Gb/s available PCIe bandwidth, limited by 5.0
> GT/s PCIe x1 link at 0000:00:00.0 (capable of 7.876 Gb/s with 8.0 GT/s
> PCIe x1 link)
> pci 0000:01:00.0: Adding to iommu group 13
> pci 0000:01:00.0: ASPM: default states L0s L1
> pcieport 0000:00:00.0: bridge window [mem 0x60400000-0x604fffff]: assigned
> pci 0000:01:00.0: BAR 0 [mem 0x60400000-0x604fffff 64bit]: assigned
> pci 0000:01:00.0: runtime PM trying to activate child device
> 0000:01:00.0 but parent (0000:00:00.0) is not active
Thanks for the report. I'll try to look at reproducing this, or at least
getting a better mental model of exactly why this might fail (or,
"warn") this way. But if you have the time and desire to try things out
for me, can you give v1 a try?
https://lore.kernel.org/all/20251016155335.1.I60a53c170a8596661883bd2b4ef475155c7aa72b@changeid/
I'm pretty sure it would not invoke the same problem. I also suspect v3
might not, but I'm less sure:
https://lore.kernel.org/all/20251022141434.v3.1.I60a53c170a8596661883bd2b4ef475155c7aa72b@changeid/
> ======================================================
> WARNING: possible circular locking dependency detected
> 6.19.0-rc1+ #16398 Not tainted
> ------------------------------------------------------
> kworker/3:0/33 is trying to acquire lock:
> ffffcd182ff1ae98 (console_owner){..-.}-{0:0}, at:
> console_lock_spinning_enable+0x44/0x78
>
> but task is already holding lock:
> ffff0000835c5238 (&dev->power.lock/1){....}-{3:3}, at:
> __pm_runtime_set_status+0x240/0x384
>
> which lock already depends on the new lock.
The lockdep warning is a bit messier, and I'd also have to take some
more time to be sure, but in principle, this sounds like a totally
orthogonal problem. It seems like simply performing printk() to a qcom
UART in the "wrong" context is enough to cause this. If so, that's
definitely a console/UART bug (or maybe a lockdep false positive) and
not a PCI/runtime-PM bug.
> the existing dependency chain (in reverse order) is:
>
> -> #3 (&dev->power.lock/1){....}-{3:3}:
> _raw_spin_lock_nested+0x44/0x5c
> __pm_runtime_set_status+0x240/0x384
> arm_smmu_device_probe+0xbe0/0xe5c
> platform_probe+0x5c/0xac
> really_probe+0xbc/0x298
> __driver_probe_device+0x78/0x12c
> driver_probe_device+0x40/0x164
> __driver_attach+0x9c/0x1ac
> bus_for_each_dev+0x74/0xd0
> driver_attach+0x24/0x30
> bus_add_driver+0xe4/0x208
> driver_register+0x60/0x128
> __platform_driver_register+0x24/0x30
> arm_smmu_driver_init+0x20/0x2c
> do_one_initcall+0x64/0x308
> kernel_init_freeable+0x284/0x500
> kernel_init+0x20/0x1d8
> ret_from_fork+0x10/0x20
>
> -> #2 (&dev->power.lock){-...}-{3:3}:
> _raw_spin_lock_irqsave+0x60/0x88
> __pm_runtime_resume+0x4c/0xbc
> __uart_start+0x4c/0x114
> uart_write+0x98/0x278
> n_tty_write+0x1dc/0x4f0
> file_tty_write.constprop.0+0x12c/0x2bc
> redirected_tty_write+0xc0/0x108
> do_iter_readv_writev+0xdc/0x1c0
> vfs_writev+0xf0/0x280
> do_writev+0x74/0x13c
> __arm64_sys_writev+0x20/0x2c
> invoke_syscall+0x48/0x10c
> el0_svc_common.constprop.0+0x40/0xe8
> do_el0_svc+0x20/0x2c
> el0_svc+0x50/0x2e8
> el0t_64_sync_handler+0xa0/0xe4
> el0t_64_sync+0x198/0x19c
>
> -> #1 (&port_lock_key){-.-.}-{3:3}:
> _raw_spin_lock_irqsave+0x60/0x88
> qcom_geni_serial_console_write+0x50/0x344
> console_flush_one_record+0x33c/0x474
> console_unlock+0x80/0x14c
> vprintk_emit+0x258/0x3d0
> vprintk_default+0x38/0x44
> vprintk+0x28/0x34
> _printk+0x5c/0x84
> register_console+0x278/0x510
> serial_core_register_port+0x6cc/0x79c
> serial_ctrl_register_port+0x10/0x1c
> uart_add_one_port+0x10/0x1c
> qcom_geni_serial_probe+0x2c0/0x448
> platform_probe+0x5c/0xac
> really_probe+0xbc/0x298
> __driver_probe_device+0x78/0x12c
> driver_probe_device+0x40/0x164
> __device_attach_driver+0xb8/0x138
> bus_for_each_drv+0x80/0xdc
> __device_attach+0xa8/0x1b0
> device_initial_probe+0x50/0x54
> bus_probe_device+0x38/0xa8
> device_add+0x540/0x720
> of_device_add+0x44/0x60
> of_platform_device_create_pdata+0x90/0x11c
> of_platform_bus_create+0x17c/0x394
> of_platform_populate+0x58/0xf8
> devm_of_platform_populate+0x58/0xbc
> geni_se_probe+0xdc/0x164
> platform_probe+0x5c/0xac
> really_probe+0xbc/0x298
> __driver_probe_device+0x78/0x12c
> driver_probe_device+0x40/0x164
> __device_attach_driver+0xb8/0x138
> bus_for_each_drv+0x80/0xdc
> __device_attach+0xa8/0x1b0
> device_initial_probe+0x50/0x54
> bus_probe_device+0x38/0xa8
> deferred_probe_work_func+0x8c/0xc8
> process_one_work+0x208/0x604
> worker_thread+0x244/0x388
> kthread+0x150/0x228
> ret_from_fork+0x10/0x20
>
> -> #0 (console_owner){..-.}-{0:0}:
> __lock_acquire+0x1408/0x2254
> lock_acquire+0x1c8/0x354
> console_lock_spinning_enable+0x68/0x78
> console_flush_one_record+0x300/0x474
> console_unlock+0x80/0x14c
> vprintk_emit+0x258/0x3d0
> dev_vprintk_emit+0xd8/0x1a0
> dev_printk_emit+0x58/0x80
> __dev_printk+0x3c/0x88
> _dev_err+0x60/0x88
> __pm_runtime_set_status+0x28c/0x384
> pci_bus_add_device+0xa4/0x18c
> pci_bus_add_devices+0x3c/0x88
> pci_bus_add_devices+0x68/0x88
> pci_rescan_bus+0x30/0x44
> rescan_work_func+0x28/0x3c
> process_one_work+0x208/0x604
> worker_thread+0x244/0x388
> kthread+0x150/0x228
> ret_from_fork+0x10/0x20
>
> other info that might help us debug this:
>
> Chain exists of:
> console_owner --> &dev->power.lock --> &dev->power.lock/1
>
> Possible unsafe locking scenario:
>
> CPU0 CPU1
> ---- ----
> lock(&dev->power.lock/1);
> lock(&dev->power.lock);
> lock(&dev->power.lock/1);
> lock(console_owner);
>
> *** DEADLOCK ***
>
> 7 locks held by kworker/3:0/33:
> #0: ffff00008000d948 ((wq_completion)events){+.+.}-{0:0}, at:
> process_one_work+0x18c/0x604
> #1: ffff8000802a3de0 ((work_completion)(&pwrctrl->work)){+.+.}-{0:0},
> at: process_one_work+0x1b4/0x604
> #2: ffffcd18301138e8 (pci_rescan_remove_lock){+.+.}-{4:4}, at:
> pci_lock_rescan_remove+0x1c/0x28
> #3: ffff00008ac8a238 (&dev->power.lock){-...}-{3:3}, at:
> __pm_runtime_set_status+0x1d4/0x384
> #4: ffff0000835c5238 (&dev->power.lock/1){....}-{3:3}, at:
> __pm_runtime_set_status+0x240/0x384
> #5: ffffcd182ff1ac78 (console_lock){+.+.}-{0:0}, at:
> dev_vprintk_emit+0xd8/0x1a0
> #6: ffffcd182ff1acd8 (console_srcu){....}-{0:0}, at:
> console_flush_one_record+0x0/0x474
>
> stack backtrace:
> CPU: 3 UID: 0 PID: 33 Comm: kworker/3:0 Not tainted 6.19.0-rc1+ #16398
> PREEMPT
> Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT)
> Workqueue: events rescan_work_func
> Call trace:
> show_stack+0x18/0x24 (C)
> dump_stack_lvl+0x90/0xd0
> dump_stack+0x18/0x24
> print_circular_bug+0x298/0x37c
> check_noncircular+0x15c/0x170
> __lock_acquire+0x1408/0x2254
> lock_acquire+0x1c8/0x354
> console_lock_spinning_enable+0x68/0x78
> console_flush_one_record+0x300/0x474
> console_unlock+0x80/0x14c
> vprintk_emit+0x258/0x3d0
> dev_vprintk_emit+0xd8/0x1a0
> dev_printk_emit+0x58/0x80
> __dev_printk+0x3c/0x88
> _dev_err+0x60/0x88
> __pm_runtime_set_status+0x28c/0x384
> pci_bus_add_device+0xa4/0x18c
> pci_bus_add_devices+0x3c/0x88
> pci_bus_add_devices+0x68/0x88
> pci_rescan_bus+0x30/0x44
> rescan_work_func+0x28/0x3c
> process_one_work+0x208/0x604
> worker_thread+0x244/0x388
> kthread+0x150/0x228
> ret_from_fork+0x10/0x20
>
> This looks a bit similar to the issue reported some time ago on a
> different board:
>
> https://lore.kernel.org/all/6d438995-4d6d-4a21-9ad2-8a0352482d44@samsung.com/
Huh, yeah, the lockdep warning is rather similar looking. So that bug
(whether real or false positive) may have been around a while.
And the "Enabling runtime PM for inactive device with active children"
log is similar, but it involves a different set of devices -- now we're
dealing with the PCIe port and child device, whereas that report was
about the host bridge/controller device.
Brian
> Let me know if I can somehow help debugging this.
>
> Best regards
> --
> Marek Szyprowski, PhD
> Samsung R&D Institute Poland
>
Hi Brian,
On 14.01.2026 21:10, Brian Norris wrote:
> On Wed, Jan 14, 2026 at 10:46:41AM +0100, Marek Szyprowski wrote:
>> On 06.01.2026 23:27, Bjorn Helgaas wrote:
>>> On Thu, Oct 23, 2025 at 02:09:01PM -0700, Brian Norris wrote:
>>>> Today, it's possible for a PCI device to be created and
>>>> runtime-suspended before it is fully initialized. When that happens, the
>>>> device will remain in D0, but the suspend process may save an
>>>> intermediate version of that device's state -- for example, without
>>>> appropriate BAR configuration. When the device later resumes, we'll
>>>> restore invalid PCI state and the device may not function.
>>>>
>>>> Prevent runtime suspend for PCI devices by deferring pm_runtime_enable()
>>>> until we've fully initialized the device.
> ...
>> This patch landed recently in linux-next as commit c796513dc54e
>> ("PCI/PM: Prevent runtime suspend until devices are fully initialized").
>> In my tests I found that it sometimes causes the "pci 0000:01:00.0:
>> runtime PM trying to activate child device 0000:01:00.0 but parent
>> (0000:00:00.0) is not active" warning on Qualcomm Robotics RB5 board
>> (arch/arm64/boot/dts/qcom/qrb5165-rb5.dts). This in turn causes a
>> lockdep warning about console lock, but this is just a consequence of
>> the runtime pm warning. Reverting $subject patch on top of current
>> linux-next hides this warning.
>>
>> Here is a kernel log:
>>
>> pci 0000:01:00.0: [17cb:1101] type 00 class 0xff0000 PCIe Endpoint
>> pci 0000:01:00.0: BAR 0 [mem 0x00000000-0x000fffff 64bit]
>> pci 0000:01:00.0: PME# supported from D0 D3hot D3cold
>> pci 0000:01:00.0: 4.000 Gb/s available PCIe bandwidth, limited by 5.0
>> GT/s PCIe x1 link at 0000:00:00.0 (capable of 7.876 Gb/s with 8.0 GT/s
>> PCIe x1 link)
>> pci 0000:01:00.0: Adding to iommu group 13
>> pci 0000:01:00.0: ASPM: default states L0s L1
>> pcieport 0000:00:00.0: bridge window [mem 0x60400000-0x604fffff]: assigned
>> pci 0000:01:00.0: BAR 0 [mem 0x60400000-0x604fffff 64bit]: assigned
>> pci 0000:01:00.0: runtime PM trying to activate child device
>> 0000:01:00.0 but parent (0000:00:00.0) is not active
> Thanks for the report. I'll try to look at reproducing this, or at least
> getting a better mental model of exactly why this might fail (or,
> "warn") this way. But if you have the time and desire to try things out
> for me, can you give v1 a try?
>
> https://lore.kernel.org/all/20251016155335.1.I60a53c170a8596661883bd2b4ef475155c7aa72b@changeid/
>
> I'm pretty sure it would not invoke the same problem.
Right, this one works fine.
> I also suspect v3
> might not, but I'm less sure:
>
> https://lore.kernel.org/all/20251022141434.v3.1.I60a53c170a8596661883bd2b4ef475155c7aa72b@changeid/
This one too, at least I was not able to reproduce any fail.
>> ======================================================
>> WARNING: possible circular locking dependency detected
>> 6.19.0-rc1+ #16398 Not tainted
>> ------------------------------------------------------
>> kworker/3:0/33 is trying to acquire lock:
>> ffffcd182ff1ae98 (console_owner){..-.}-{0:0}, at:
>> console_lock_spinning_enable+0x44/0x78
>>
>> but task is already holding lock:
>> ffff0000835c5238 (&dev->power.lock/1){....}-{3:3}, at:
>> __pm_runtime_set_status+0x240/0x384
>>
>> which lock already depends on the new lock.
> The lockdep warning is a bit messier, and I'd also have to take some
> more time to be sure, but in principle, this sounds like a totally
> orthogonal problem. It seems like simply performing printk() to a qcom
> UART in the "wrong" context is enough to cause this. If so, that's
> definitely a console/UART bug (or maybe a lockdep false positive) and
> not a PCI/runtime-PM bug.
Yes, the lockdep warning is not really a problem, it is just a
consequence of printing the "runtime PM trying to activate child
device 0000:01:00.0 but parent (0000:00:00.0) is not active" message.
However, that message is itself a problem imho.
>> (...)
>>
>> This looks a bit similar to the issue reported some time ago on a
>> different board:
>>
>> https://lore.kernel.org/all/6d438995-4d6d-4a21-9ad2-8a0352482d44@samsung.com/
> Huh, yeah, the lockdep warning is rather similar looking. So that bug
> (whether real or false positive) may have been around a while.
>
> And the "Enabling runtime PM for inactive device with active children"
> log is similar, but it involves a different set of devices -- now we're
> dealing with the PCIe port and child device, whereas that report was
> about the host bridge/controller device.
Okay, so a bit different case. At least it confirms that the lockdep
issue is not really a problem.
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
Hi Marek,
On Thu, Jan 15, 2026 at 12:14:49PM +0100, Marek Szyprowski wrote:
> On 14.01.2026 21:10, Brian Norris wrote:
> > On Wed, Jan 14, 2026 at 10:46:41AM +0100, Marek Szyprowski wrote:
> >> On 06.01.2026 23:27, Bjorn Helgaas wrote:
> >>> On Thu, Oct 23, 2025 at 02:09:01PM -0700, Brian Norris wrote:
> >>>> Today, it's possible for a PCI device to be created and
> >>>> runtime-suspended before it is fully initialized. When that happens, the
> >>>> device will remain in D0, but the suspend process may save an
> >>>> intermediate version of that device's state -- for example, without
> >>>> appropriate BAR configuration. When the device later resumes, we'll
> >>>> restore invalid PCI state and the device may not function.
> >>>>
> >>>> Prevent runtime suspend for PCI devices by deferring pm_runtime_enable()
> >>>> until we've fully initialized the device.
> > ...
> >> This patch landed recently in linux-next as commit c796513dc54e
> >> ("PCI/PM: Prevent runtime suspend until devices are fully initialized").
> >> In my tests I found that it sometimes causes the "pci 0000:01:00.0:
> >> runtime PM trying to activate child device 0000:01:00.0 but parent
> >> (0000:00:00.0) is not active" warning on Qualcomm Robotics RB5 board
> >> (arch/arm64/boot/dts/qcom/qrb5165-rb5.dts). This in turn causes a
> >> lockdep warning about console lock, but this is just a consequence of
> >> the runtime pm warning. Reverting $subject patch on top of current
> >> linux-next hides this warning.
> >>
> >> Here is a kernel log:
> >>
> >> pci 0000:01:00.0: [17cb:1101] type 00 class 0xff0000 PCIe Endpoint
> >> pci 0000:01:00.0: BAR 0 [mem 0x00000000-0x000fffff 64bit]
> >> pci 0000:01:00.0: PME# supported from D0 D3hot D3cold
> >> pci 0000:01:00.0: 4.000 Gb/s available PCIe bandwidth, limited by 5.0
> >> GT/s PCIe x1 link at 0000:00:00.0 (capable of 7.876 Gb/s with 8.0 GT/s
> >> PCIe x1 link)
> >> pci 0000:01:00.0: Adding to iommu group 13
> >> pci 0000:01:00.0: ASPM: default states L0s L1
> >> pcieport 0000:00:00.0: bridge window [mem 0x60400000-0x604fffff]: assigned
> >> pci 0000:01:00.0: BAR 0 [mem 0x60400000-0x604fffff 64bit]: assigned
> >> pci 0000:01:00.0: runtime PM trying to activate child device
> >> 0000:01:00.0 but parent (0000:00:00.0) is not active
> > Thanks for the report. I'll try to look at reproducing this, or at least
> > getting a better mental model of exactly why this might fail (or,
> > "warn") this way. But if you have the time and desire to try things out
> > for me, can you give v1 a try?
> >
> > https://lore.kernel.org/all/20251016155335.1.I60a53c170a8596661883bd2b4ef475155c7aa72b@changeid/
> >
> > I'm pretty sure it would not invoke the same problem.
>
> Right, this one works fine.
>
> > I also suspect v3
> > might not, but I'm less sure:
> >
> > https://lore.kernel.org/all/20251022141434.v3.1.I60a53c170a8596661883bd2b4ef475155c7aa72b@changeid/
> This one too, at least I was not able to reproduce any fail.
Thanks for testing. I'm still not sure exactly how to reproduce your
failure, but it seems as if the root port is being allowed to suspend
before the endpoint is added to the system, and it remains so while the
endpoint is about to probe. device_initial_probe() will be OK with
respect to PM, since it will wake up the port if needed. But this
particular code is not OK, since it doesn't ensure the parent device is
active while preparing the endpoint power state.
I suppose one way to "solve" that is (untested):
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -380,8 +380,12 @@ void pci_bus_add_device(struct pci_dev *dev)
put_device(&pdev->dev);
}
+ if (dev->dev.parent)
+ pm_runtime_get_sync(dev->dev.parent);
pm_runtime_set_active(&dev->dev);
pm_runtime_enable(&dev->dev);
+ if (dev->dev.parent)
+ pm_runtime_put(dev->dev.parent);
if (!dn || of_device_is_available(dn))
pci_dev_allow_binding(dev);
Personally, I'm more inclined to go back to v1, since it prepares the
runtime PM status when the device is first discovered. That way, its
ancestors are still active, avoiding these sorts of problems. I'm
frankly not sure of all the reasons Rafael recommended I make the
v1->v3->v4 changes, and now that they cause problems, I'm inclined to
question them again.
Rafael, do you have any thoughts?
Brian
On Sat, Jan 17, 2026 at 2:19 AM Brian Norris <briannorris@chromium.org> wrote:
>
> Hi Marek,
>
> On Thu, Jan 15, 2026 at 12:14:49PM +0100, Marek Szyprowski wrote:
> > On 14.01.2026 21:10, Brian Norris wrote:
> > > On Wed, Jan 14, 2026 at 10:46:41AM +0100, Marek Szyprowski wrote:
> > >> On 06.01.2026 23:27, Bjorn Helgaas wrote:
> > >>> On Thu, Oct 23, 2025 at 02:09:01PM -0700, Brian Norris wrote:
> > >>>> Today, it's possible for a PCI device to be created and
> > >>>> runtime-suspended before it is fully initialized. When that happens, the
> > >>>> device will remain in D0, but the suspend process may save an
> > >>>> intermediate version of that device's state -- for example, without
> > >>>> appropriate BAR configuration. When the device later resumes, we'll
> > >>>> restore invalid PCI state and the device may not function.
> > >>>>
> > >>>> Prevent runtime suspend for PCI devices by deferring pm_runtime_enable()
> > >>>> until we've fully initialized the device.
> > > ...
> > >> This patch landed recently in linux-next as commit c796513dc54e
> > >> ("PCI/PM: Prevent runtime suspend until devices are fully initialized").
> > >> In my tests I found that it sometimes causes the "pci 0000:01:00.0:
> > >> runtime PM trying to activate child device 0000:01:00.0 but parent
> > >> (0000:00:00.0) is not active" warning on Qualcomm Robotics RB5 board
> > >> (arch/arm64/boot/dts/qcom/qrb5165-rb5.dts). This in turn causes a
> > >> lockdep warning about console lock, but this is just a consequence of
> > >> the runtime pm warning. Reverting $subject patch on top of current
> > >> linux-next hides this warning.
> > >>
> > >> Here is a kernel log:
> > >>
> > >> pci 0000:01:00.0: [17cb:1101] type 00 class 0xff0000 PCIe Endpoint
> > >> pci 0000:01:00.0: BAR 0 [mem 0x00000000-0x000fffff 64bit]
> > >> pci 0000:01:00.0: PME# supported from D0 D3hot D3cold
> > >> pci 0000:01:00.0: 4.000 Gb/s available PCIe bandwidth, limited by 5.0
> > >> GT/s PCIe x1 link at 0000:00:00.0 (capable of 7.876 Gb/s with 8.0 GT/s
> > >> PCIe x1 link)
> > >> pci 0000:01:00.0: Adding to iommu group 13
> > >> pci 0000:01:00.0: ASPM: default states L0s L1
> > >> pcieport 0000:00:00.0: bridge window [mem 0x60400000-0x604fffff]: assigned
> > >> pci 0000:01:00.0: BAR 0 [mem 0x60400000-0x604fffff 64bit]: assigned
> > >> pci 0000:01:00.0: runtime PM trying to activate child device
> > >> 0000:01:00.0 but parent (0000:00:00.0) is not active
> > > Thanks for the report. I'll try to look at reproducing this, or at least
> > > getting a better mental model of exactly why this might fail (or,
> > > "warn") this way. But if you have the time and desire to try things out
> > > for me, can you give v1 a try?
> > >
> > > https://lore.kernel.org/all/20251016155335.1.I60a53c170a8596661883bd2b4ef475155c7aa72b@changeid/
> > >
> > > I'm pretty sure it would not invoke the same problem.
> >
> > Right, this one works fine.
> >
> > > I also suspect v3
> > > might not, but I'm less sure:
> > >
> > > https://lore.kernel.org/all/20251022141434.v3.1.I60a53c170a8596661883bd2b4ef475155c7aa72b@changeid/
> > This one too, at least I was not able to reproduce any fail.
>
> Thanks for testing. I'm still not sure exactly how to reproduce your
> failure, but it seems as if the root port is being allowed to suspend
> before the endpoint is added to the system, and it remains so while the
> endpoint is about to probe. device_initial_probe() will be OK with
> respect to PM, since it will wake up the port if needed. But this
> particular code is not OK, since it doesn't ensure the parent device is
> active while preparing the endpoint power state.
>
> I suppose one way to "solve" that is (untested):
>
> --- a/drivers/pci/bus.c
> +++ b/drivers/pci/bus.c
> @@ -380,8 +380,12 @@ void pci_bus_add_device(struct pci_dev *dev)
> put_device(&pdev->dev);
> }
>
> + if (dev->dev.parent)
> + pm_runtime_get_sync(dev->dev.parent);
> pm_runtime_set_active(&dev->dev);
> pm_runtime_enable(&dev->dev);
> + if (dev->dev.parent)
> + pm_runtime_put(dev->dev.parent);
>
> if (!dn || of_device_is_available(dn))
> pci_dev_allow_binding(dev);
>
> Personally, I'm more inclined to go back to v1, since it prepares the
> runtime PM status when the device is first discovered. That way, its
> ancestors are still active, avoiding these sorts of problems. I'm
> frankly not sure of all the reasons Rafael recommended I make the
> v1->v3->v4 changes, and now that they cause problems, I'm inclined to
> question them again.
>
> Rafael, do you have any thoughts?
Yeah.
Move pm_runtime_set_active(&dev->dev) back to pm_runtime_init()
because that would prevent the parent from suspending and keep
pm_runtime_enable() here because that would prevent the device itself
from suspending between pm_runtime_init() and this place.
And I would add comments in both places.
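(A minimal sketch of that arrangement, assuming pci_pm_init() is the
function meant by pm_runtime_init() above; illustrative only, not a
literal patch:)

/* drivers/pci/pci.c */
void pci_pm_init(struct pci_dev *dev)
{
	/* ... */
	pci_pm_power_up_and_verify_state(dev);
	pm_runtime_forbid(&dev->dev);
	/*
	 * Mark the device active while its parent is still active; an
	 * active child also keeps the parent from runtime-suspending.
	 */
	pm_runtime_set_active(&dev->dev);
}

/* drivers/pci/bus.c */
void pci_bus_add_device(struct pci_dev *dev)
{
	/* ... after resource assignment and fixups ... */
	/*
	 * Enable runtime PM only now, so the device itself cannot be
	 * runtime-suspended between pci_pm_init() and this point.
	 */
	pm_runtime_enable(&dev->dev);
	/* ... */
}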
Hi Rafael,

Thanks for your thoughts!

On Sun, Jan 18, 2026 at 12:53:21PM +0100, Rafael J. Wysocki wrote:
> On Sat, Jan 17, 2026 at 2:19 AM Brian Norris <briannorris@chromium.org> wrote:
> > I suppose one way to "solve" that is (untested):
> >
> > --- a/drivers/pci/bus.c
> > +++ b/drivers/pci/bus.c
> > @@ -380,8 +380,12 @@ void pci_bus_add_device(struct pci_dev *dev)
> > put_device(&pdev->dev);
> > }
> >
> > + if (dev->dev.parent)
> > + pm_runtime_get_sync(dev->dev.parent);
> > pm_runtime_set_active(&dev->dev);
> > pm_runtime_enable(&dev->dev);
> > + if (dev->dev.parent)
> > + pm_runtime_put(dev->dev.parent);
> >
> > if (!dn || of_device_is_available(dn))
> > pci_dev_allow_binding(dev);
> >
> > Personally, I'm more inclined to go back to v1, since it prepares the
> > runtime PM status when the device is first discovered. That way, its
> > ancestors are still active, avoiding these sorts of problems. I'm
> > frankly not sure of all the reasons Rafael recommended I make the
> > v1->v3->v4 changes, and now that they cause problems, I'm inclined to
> > question them again.
> >
> > Rafael, do you have any thoughts?
>
> Yeah.
>
> Move pm_runtime_set_active(&dev->dev) back to pm_runtime_init()
> because that would prevent the parent from suspending and keep
> pm_runtime_enable() here because that would prevent the device itself
> from suspending between pm_runtime_init() and this place.

I'll admit, I was a little fuzzy on the details of the first part of the
sentence here -- specifically, that an "active" (but still "disabled")
device will prevent suspend of its parent. I suppose I'm more familiar
with the typical "disabled and suspended" device, which essentially has
no effect on its parent.

Anyway, that's basically v3, so I rerolled a v5 that looks similar.

> And I would add comments in both places.

I tried to add a short comment to each. It's an art form to write
exactly the right size of comment to make everyone happy (people
complain about too much commenting, and then others complain about
non-obvious behaviors that could have used more comments), especially
when it comes to something as tricky as runtime PM. At least I tried...

Brian
On Sun, Jan 18, 2026 at 12:53 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Sat, Jan 17, 2026 at 2:19 AM Brian Norris <briannorris@chromium.org> wrote:
> >
> > Hi Marek,
> >
> > On Thu, Jan 15, 2026 at 12:14:49PM +0100, Marek Szyprowski wrote:
> > > On 14.01.2026 21:10, Brian Norris wrote:
> > > > On Wed, Jan 14, 2026 at 10:46:41AM +0100, Marek Szyprowski wrote:
> > > >> On 06.01.2026 23:27, Bjorn Helgaas wrote:
> > > >>> On Thu, Oct 23, 2025 at 02:09:01PM -0700, Brian Norris wrote:
> > > >>>> Today, it's possible for a PCI device to be created and
> > > >>>> runtime-suspended before it is fully initialized. When that happens, the
> > > >>>> device will remain in D0, but the suspend process may save an
> > > >>>> intermediate version of that device's state -- for example, without
> > > >>>> appropriate BAR configuration. When the device later resumes, we'll
> > > >>>> restore invalid PCI state and the device may not function.
> > > >>>>
> > > >>>> Prevent runtime suspend for PCI devices by deferring pm_runtime_enable()
> > > >>>> until we've fully initialized the device.
> > > > ...
> > > >> This patch landed recently in linux-next as commit c796513dc54e
> > > >> ("PCI/PM: Prevent runtime suspend until devices are fully initialized").
> > > >> In my tests I found that it sometimes causes the "pci 0000:01:00.0:
> > > >> runtime PM trying to activate child device 0000:01:00.0 but parent
> > > >> (0000:00:00.0) is not active" warning on Qualcomm Robotics RB5 board
> > > >> (arch/arm64/boot/dts/qcom/qrb5165-rb5.dts). This in turn causes a
> > > >> lockdep warning about console lock, but this is just a consequence of
> > > >> the runtime pm warning. Reverting $subject patch on top of current
> > > >> linux-next hides this warning.
> > > >>
> > > >> Here is a kernel log:
> > > >>
> > > >> pci 0000:01:00.0: [17cb:1101] type 00 class 0xff0000 PCIe Endpoint
> > > >> pci 0000:01:00.0: BAR 0 [mem 0x00000000-0x000fffff 64bit]
> > > >> pci 0000:01:00.0: PME# supported from D0 D3hot D3cold
> > > >> pci 0000:01:00.0: 4.000 Gb/s available PCIe bandwidth, limited by 5.0
> > > >> GT/s PCIe x1 link at 0000:00:00.0 (capable of 7.876 Gb/s with 8.0 GT/s
> > > >> PCIe x1 link)
> > > >> pci 0000:01:00.0: Adding to iommu group 13
> > > >> pci 0000:01:00.0: ASPM: default states L0s L1
> > > >> pcieport 0000:00:00.0: bridge window [mem 0x60400000-0x604fffff]: assigned
> > > >> pci 0000:01:00.0: BAR 0 [mem 0x60400000-0x604fffff 64bit]: assigned
> > > >> pci 0000:01:00.0: runtime PM trying to activate child device
> > > >> 0000:01:00.0 but parent (0000:00:00.0) is not active
> > > > Thanks for the report. I'll try to look at reproducing this, or at least
> > > > getting a better mental model of exactly why this might fail (or,
> > > > "warn") this way. But if you have the time and desire to try things out
> > > > for me, can you give v1 a try?
> > > >
> > > > https://lore.kernel.org/all/20251016155335.1.I60a53c170a8596661883bd2b4ef475155c7aa72b@changeid/
> > > >
> > > > I'm pretty sure it would not invoke the same problem.
> > >
> > > Right, this one works fine.
> > >
> > > > I also suspect v3
> > > > might not, but I'm less sure:
> > > >
> > > > https://lore.kernel.org/all/20251022141434.v3.1.I60a53c170a8596661883bd2b4ef475155c7aa72b@changeid/
> > > This one too, at least I was not able to reproduce any fail.
> >
> > Thanks for testing. I'm still not sure exactly how to reproduce your
> > failure, but it seems as if the root port is being allowed to suspend
> > before the endpoint is added to the system, and it remains so while the
> > endpoint is about to probe. device_initial_probe() will be OK with
> > respect to PM, since it will wake up the port if needed. But this
> > particular code is not OK, since it doesn't ensure the parent device is
> > active while preparing the endpoint power state.
> >
> > I suppose one way to "solve" that is (untested):
> >
> > --- a/drivers/pci/bus.c
> > +++ b/drivers/pci/bus.c
> > @@ -380,8 +380,12 @@ void pci_bus_add_device(struct pci_dev *dev)
> > put_device(&pdev->dev);
> > }
> >
> > + if (dev->dev.parent)
> > + pm_runtime_get_sync(dev->dev.parent);
> > pm_runtime_set_active(&dev->dev);
> > pm_runtime_enable(&dev->dev);
> > + if (dev->dev.parent)
> > + pm_runtime_put(dev->dev.parent);
> >
> > if (!dn || of_device_is_available(dn))
> > pci_dev_allow_binding(dev);
> >
> > Personally, I'm more inclined to go back to v1, since it prepares the
> > runtime PM status when the device is first discovered. That way, its
> > ancestors are still active, avoiding these sorts of problems. I'm
> > frankly not sure of all the reasons Rafael recommended I make the
> > v1->v3->v4 changes, and now that they cause problems, I'm inclined to
> > question them again.
> >
> > Rafael, do you have any thoughts?
>
> Yeah.
>
> Move pm_runtime_set_active(&dev->dev) back to pm_runtime_init()
Or rather leave it there to be precise, but I think you know what I mean. :-)
> because that would prevent the parent from suspending and keep
> pm_runtime_enable() here because that would prevent the device itself
> from suspending between pm_runtime_init() and this place.
>
> And I would add comments in both places.
On 18.01.2026 12:59, Rafael J. Wysocki wrote:
> On Sun, Jan 18, 2026 at 12:53 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>> On Sat, Jan 17, 2026 at 2:19 AM Brian Norris <briannorris@chromium.org> wrote:
>>> On Thu, Jan 15, 2026 at 12:14:49PM +0100, Marek Szyprowski wrote:
>>>> On 14.01.2026 21:10, Brian Norris wrote:
>>>>> On Wed, Jan 14, 2026 at 10:46:41AM +0100, Marek Szyprowski wrote:
>>>>>> On 06.01.2026 23:27, Bjorn Helgaas wrote:
>>>>>>> On Thu, Oct 23, 2025 at 02:09:01PM -0700, Brian Norris wrote:
>>>>>>>> Today, it's possible for a PCI device to be created and
>>>>>>>> runtime-suspended before it is fully initialized. When that happens, the
>>>>>>>> device will remain in D0, but the suspend process may save an
>>>>>>>> intermediate version of that device's state -- for example, without
>>>>>>>> appropriate BAR configuration. When the device later resumes, we'll
>>>>>>>> restore invalid PCI state and the device may not function.
>>>>>>>>
>>>>>>>> Prevent runtime suspend for PCI devices by deferring pm_runtime_enable()
>>>>>>>> until we've fully initialized the device.
>>>>> ...
>>>>>> This patch landed recently in linux-next as commit c796513dc54e
>>>>>> ("PCI/PM: Prevent runtime suspend until devices are fully initialized").
>>>>>> In my tests I found that it sometimes causes the "pci 0000:01:00.0:
>>>>>> runtime PM trying to activate child device 0000:01:00.0 but parent
>>>>>> (0000:00:00.0) is not active" warning on Qualcomm Robotics RB5 board
>>>>>> (arch/arm64/boot/dts/qcom/qrb5165-rb5.dts). This in turn causes a
>>>>>> lockdep warning about console lock, but this is just a consequence of
>>>>>> the runtime pm warning. Reverting $subject patch on top of current
>>>>>> linux-next hides this warning.
>>>>>>
>>>>>> Here is a kernel log:
>>>>>>
>>>>>> pci 0000:01:00.0: [17cb:1101] type 00 class 0xff0000 PCIe Endpoint
>>>>>> pci 0000:01:00.0: BAR 0 [mem 0x00000000-0x000fffff 64bit]
>>>>>> pci 0000:01:00.0: PME# supported from D0 D3hot D3cold
>>>>>> pci 0000:01:00.0: 4.000 Gb/s available PCIe bandwidth, limited by 5.0
>>>>>> GT/s PCIe x1 link at 0000:00:00.0 (capable of 7.876 Gb/s with 8.0 GT/s
>>>>>> PCIe x1 link)
>>>>>> pci 0000:01:00.0: Adding to iommu group 13
>>>>>> pci 0000:01:00.0: ASPM: default states L0s L1
>>>>>> pcieport 0000:00:00.0: bridge window [mem 0x60400000-0x604fffff]: assigned
>>>>>> pci 0000:01:00.0: BAR 0 [mem 0x60400000-0x604fffff 64bit]: assigned
>>>>>> pci 0000:01:00.0: runtime PM trying to activate child device
>>>>>> 0000:01:00.0 but parent (0000:00:00.0) is not active
>>>>> Thanks for the report. I'll try to look at reproducing this, or at least
>>>>> getting a better mental model of exactly why this might fail (or,
>>>>> "warn") this way. But if you have the time and desire to try things out
>>>>> for me, can you give v1 a try?
>>>>>
>>>>> https://lore.kernel.org/all/20251016155335.1.I60a53c170a8596661883bd2b4ef475155c7aa72b@changeid/
>>>>>
>>>>> I'm pretty sure it would not invoke the same problem.
>>>> Right, this one works fine.
>>>>
>>>>> I also suspect v3
>>>>> might not, but I'm less sure:
>>>>>
>>>>> https://lore.kernel.org/all/20251022141434.v3.1.I60a53c170a8596661883bd2b4ef475155c7aa72b@changeid/
>>>> This one too, at least I was not able to reproduce any fail.
>>> Thanks for testing. I'm still not sure exactly how to reproduce your
>>> failure, but it seems as if the root port is being allowed to suspend
>>> before the endpoint is added to the system, and it remains so while the
>>> endpoint is about to probe. device_initial_probe() will be OK with
>>> respect to PM, since it will wake up the port if needed. But this
>>> particular code is not OK, since it doesn't ensure the parent device is
>>> active while preparing the endpoint power state.
>>>
>>> I suppose one way to "solve" that is (untested):
>>>
>>> --- a/drivers/pci/bus.c
>>> +++ b/drivers/pci/bus.c
>>> @@ -380,8 +380,12 @@ void pci_bus_add_device(struct pci_dev *dev)
>>> put_device(&pdev->dev);
>>> }
>>>
>>> + if (dev->dev.parent)
>>> + pm_runtime_get_sync(dev->dev.parent);
>>> pm_runtime_set_active(&dev->dev);
>>> pm_runtime_enable(&dev->dev);
>>> + if (dev->dev.parent)
>>> + pm_runtime_put(dev->dev.parent);
>>>
>>> if (!dn || of_device_is_available(dn))
>>> pci_dev_allow_binding(dev);
>>>
>>> Personally, I'm more inclined to go back to v1, since it prepares the
>>> runtime PM status when the device is first discovered. That way, its
>>> ancestors are still active, avoiding these sorts of problems. I'm
>>> frankly not sure of all the reasons Rafael recommended I make the
>>> v1->v3->v4 changes, and now that they cause problems, I'm inclined to
>>> question them again.
>>>
>>> Rafael, do you have any thoughts?
>> Yeah.
>>
>> Move pm_runtime_set_active(&dev->dev) back to pm_runtime_init()
> Or rather leave it there to be precise, but I think you know what I mean. :-)
>
>> because that would prevent the parent from suspending and keep
>> pm_runtime_enable() here because that would prevent the device itself
>> from suspending between pm_runtime_init() and this place.
>>
>> And I would add comments in both places.
Confirmed, the following change (compared to $subject patch) fixed my issue:
diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
index 3ef60c2fbd89..7e2b7e452d51 100644
--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ -381,7 +381,6 @@ void pci_bus_add_device(struct pci_dev *dev)
 	}
 
 	pm_runtime_set_active(&dev->dev);
-	pm_runtime_enable(&dev->dev);
 
 	if (!dn || of_device_is_available(dn))
 		pci_dev_allow_binding(dev);
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index fae5a683cf87..22b897416025 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3201,6 +3201,7 @@ void pci_pm_init(struct pci_dev *dev)
 poweron:
 	pci_pm_power_up_and_verify_state(dev);
 	pm_runtime_forbid(&dev->dev);
+	pm_runtime_enable(&dev->dev);
 }
 
 static unsigned long pci_ea_flags(struct pci_dev *dev, u8 prop)
Feel free to add:
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
On Mon, Jan 19, 2026 at 11:01 AM Marek Szyprowski
<m.szyprowski@samsung.com> wrote:
> [...]
>
> Confirmed, the following change (compared to $subject patch) fixed my issue:
>
> diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
> index 3ef60c2fbd89..7e2b7e452d51 100644
> --- a/drivers/pci/bus.c
> +++ b/drivers/pci/bus.c
> @@ -381,7 +381,6 @@ void pci_bus_add_device(struct pci_dev *dev)
> }
>
> pm_runtime_set_active(&dev->dev);
> - pm_runtime_enable(&dev->dev);
That works too, but it would defeat the purpose of the original
change, so I mean the other way around.
That is, leave the pm_runtime_enable() here and move the
pm_runtime_set_active() back to the other place.
>
> if (!dn || of_device_is_available(dn))
> pci_dev_allow_binding(dev);
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index fae5a683cf87..22b897416025 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3201,6 +3201,7 @@ void pci_pm_init(struct pci_dev *dev)
> poweron:
> pci_pm_power_up_and_verify_state(dev);
> pm_runtime_forbid(&dev->dev);
> + pm_runtime_enable(&dev->dev);
> }
>
> static unsigned long pci_ea_flags(struct pci_dev *dev, u8 prop)
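Put together, the arrangement Rafael describes above -- leave pm_runtime_enable() in pci_bus_add_device() and move pm_runtime_set_active() back into pci_pm_init(), with comments in both places -- would look roughly like the sketch below, expressed relative to the $subject patch. It is untested here, and the comment wording and exact placement within pci_pm_init() are illustrative:

--- a/drivers/pci/bus.c
+++ b/drivers/pci/bus.c
@@ ... @@ void pci_bus_add_device(struct pci_dev *dev)
 	}
 
-	pm_runtime_set_active(&dev->dev);
+	/*
+	 * Runtime PM was left disabled in pci_pm_init() so the device could
+	 * not be runtime-suspended before its resources (e.g. BARs) were
+	 * fully assigned; only enable it now that initialization is done.
+	 */
 	pm_runtime_enable(&dev->dev);
 
 	if (!dn || of_device_is_available(dn))
 		pci_dev_allow_binding(dev);
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ ... @@ void pci_pm_init(struct pci_dev *dev)
 poweron:
 	pci_pm_power_up_and_verify_state(dev);
+	/*
+	 * Set the runtime PM status while the parent bridge is still known
+	 * to be active; the active child also keeps the parent from
+	 * runtime-suspending. pm_runtime_enable() is deferred to
+	 * pci_bus_add_device().
+	 */
+	pm_runtime_set_active(&dev->dev);
 	pm_runtime_forbid(&dev->dev);
 }

Marek confirms below that this variant also fixes the observed warning.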
On 19.01.2026 13:26, Rafael J. Wysocki wrote:
> [...]
>
>> Confirmed, the following change (compared to $subject patch) fixed my issue:
>>
>> diff --git a/drivers/pci/bus.c b/drivers/pci/bus.c
>> index 3ef60c2fbd89..7e2b7e452d51 100644
>> --- a/drivers/pci/bus.c
>> +++ b/drivers/pci/bus.c
>> @@ -381,7 +381,6 @@ void pci_bus_add_device(struct pci_dev *dev)
>> }
>>
>> pm_runtime_set_active(&dev->dev);
>> - pm_runtime_enable(&dev->dev);
> That works too, but it would defeat the purpose of the original
> change, so I mean the other way around.
>
> That is, leave the pm_runtime_enable() here and move the
> pm_runtime_set_active() back to the other place.
Okay, I mixed that up. This way it works too and fixes the observed issue.
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
On Mon, Jan 19, 2026 at 2:13 PM Marek Szyprowski
<m.szyprowski@samsung.com> wrote:
> [...]
>
> Okay, I mixed that up. This way it works too and fixes the observed issue.
>
> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Cool, thanks for verifying!
On Wed, Jan 14, 2026 at 10:46:41AM +0100, Marek Szyprowski wrote:
> On 06.01.2026 23:27, Bjorn Helgaas wrote:
> > On Thu, Oct 23, 2025 at 02:09:01PM -0700, Brian Norris wrote:
> >> Today, it's possible for a PCI device to be created and
> >> runtime-suspended before it is fully initialized. When that happens, the
> >> device will remain in D0, but the suspend process may save an
> >> intermediate version of that device's state -- for example, without
> >> appropriate BAR configuration. When the device later resumes, we'll
> >> restore invalid PCI state and the device may not function.
> ...
> > Link: https://patch.msgid.link/20251023140901.v4.1.I60a53c170a8596661883bd2b4ef475155c7aa72b@changeid
>
> This patch landed recently in linux-next as commit c796513dc54e
> ("PCI/PM: Prevent runtime suspend until devices are fully initialized").
> In my tests I found that it sometimes causes the "pci 0000:01:00.0:
> runtime PM trying to activate child device 0000:01:00.0 but parent
> (0000:00:00.0) is not active" warning on Qualcomm Robotics RB5 board
> (arch/arm64/boot/dts/qcom/qrb5165-rb5.dts). This in turn causes a
> lockdep warning about console lock, but this is just a consequence of
> the runtime pm warning. Reverting $subject patch on top of current
> linux-next hides this warning.
I moved this patch from pci/pm to pci/pend to remove it from
linux-next while we figure this out. Thanks for the report and
debugging!
Bjorn
Hi Bjorn,

On Thu, Oct 23, 2025 at 02:09:01PM -0700, Brian Norris wrote:
> Today, it's possible for a PCI device to be created and
> runtime-suspended before it is fully initialized. When that happens, the
> device will remain in D0, but the suspend process may save an
> intermediate version of that device's state -- for example, without
> appropriate BAR configuration. When the device later resumes, we'll
> restore invalid PCI state and the device may not function.
>
> Prevent runtime suspend for PCI devices by deferring pm_runtime_enable()
> until we've fully initialized the device.

[...]

> Link: https://lore.kernel.org/all/20251016155335.1.I60a53c170a8596661883bd2b4ef475155c7aa72b@changeid/
> Signed-off-by: Brian Norris <briannorris@chromium.org>
> Cc: <stable@vger.kernel.org>
> Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
> ---
>
> Changes in v4:
> * Move pm_runtime_set_active() too
>
> Changes in v3:
> * Add Link to initial discussion
> * Add Rafael's Reviewed-by
> * Add lengthier footnotes about forbid vs allow vs sysfs
>
> Changes in v2:
> * Update CC list
> * Rework problem description
> * Update solution: defer pm_runtime_enable(), instead of trying to
>   get()/put()
>
> drivers/pci/bus.c | 4 ++++
> drivers/pci/pci.c | 2 --
> 2 files changed, 4 insertions(+), 2 deletions(-)

I'm wondering what the status of this patch is, as the next merge
window is approaching. It fixes a critical bug for me, and it has had
plenty of review.

Thanks,
Brian