[RFC 0/1] PCI: Fix pci devices double register WARNING in the kernel starting process

Shuan He posted 1 patch 7 months, 1 week ago
drivers/pci/proc.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
Posted by Shuan He 7 months, 1 week ago
Hi all,
I encountered a WARNING during kernel boot in my development
environment (RISC-V, 6.12 kernel, Debian 13).

WARN Trace:
[    0.518993] proc_dir_entry '000c:00/00.0' already registered
[    0.519187] WARNING: CPU: 2 PID: 179 at fs/proc/generic.c:375 proc_register+0xf6/0x180
[    0.519214] [<ffffffff804055a6>] proc_register+0xf6/0x180
[    0.519217] [<ffffffff80405a9e>] proc_create_data+0x3e/0x60
[    0.519220] [<ffffffff80616e44>] pci_proc_attach_device+0x74/0x130
[    0.509991] [<ffffffff805f1af2>] pci_bus_add_device+0x42/0x100
[    0.509997] [<ffffffff805f1c76>] pci_bus_add_devices+0xc6/0x110
[    0.519230] [<ffffffff8066763c>] acpi_pci_root_add+0x54c/0x810
[    0.519233] [<ffffffff8065d206>] acpi_bus_attach+0x196/0x2f0
[    0.519234] [<ffffffff8065d390>] acpi_scan_clear_dep_fn+0x30/0x70
[    0.519236] [<ffffffff800468fa>] process_one_work+0x19a/0x390
[    0.519239] [<ffffffff80047a6e>] worker_thread+0x2be/0x420
[    0.519241] [<ffffffff80050dc4>] kthread+0xc4/0xf0
[    0.519243] [<ffffffff80ad6ad2>] ret_from_fork+0xe/0x1c

After digging into this a little, I found that the double registration
of PCIe devices occurs in the following sequence:

Early:
static int __init pci_proc_init(void)
{
...
    for_each_pci_dev(dev)
        pci_proc_attach_device(dev);
        //000c:00:00.0 will be registered here for the first time (succeeded).
...
}

Later:
acpi_pci_root_add
-> pci_bus_add_devices
  -> pci_bus_add_device
    -> pci_proc_attach_device
    // the second attempt to register 000c:00:00.0 happens here
    // (it fails and triggers the WARNING trace)
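
The sequence above can be modelled in userspace. This is only a minimal
sketch under stated assumptions: every demo_* name is hypothetical, the
array stands in for the /proc/bus/pci directory, and the duplicate check
only imitates the condition under which proc_register() prints the
WARNING. It is not kernel code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEMO_MAX_ENTRIES 16

/* Hypothetical stand-in for the /proc/bus/pci directory. */
static const char *demo_entries[DEMO_MAX_ENTRIES];
static int demo_nr_entries;

/*
 * Returns 0 on success and -1 if the name is already present (the case
 * in which the real proc_register() emits the WARNING) or the table is
 * full.
 */
static int demo_proc_register(const char *name)
{
	assert(name != NULL);

	for (int i = 0; i < demo_nr_entries; i++) {
		if (strcmp(demo_entries[i], name) == 0) {
			fprintf(stderr,
				"proc_dir_entry '%s' already registered\n",
				name);
			return -1;
		}
	}
	if (demo_nr_entries == DEMO_MAX_ENTRIES)
		return -1;

	demo_entries[demo_nr_entries++] = name;
	return 0;
}

/* Early path: models the for_each_pci_dev() loop in pci_proc_init(). */
static int demo_pci_proc_init(const char *dev_name)
{
	return demo_proc_register(dev_name);
}

/* Later path: models acpi_pci_root_add() -> ... -> pci_proc_attach_device(). */
static int demo_pci_bus_add_device(const char *dev_name)
{
	return demo_proc_register(dev_name);
}
```

Calling demo_pci_proc_init("000c:00/00.0") and then
demo_pci_bus_add_device("000c:00/00.0") makes the second call fail,
because nothing records that the device was already attached.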

I tried adding two steps to the pci_proc_init() function
(shown in the attached patch):
1. Hold pci_rescan_remove_lock to prevent the concurrency issue.
2. Update the device's status after it has been registered
   successfully (by setting the PCI_DEV_ADDED bit in the
   device's flags).
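
A userspace sketch of these two steps, for illustration only. It is not
the attached patch: the model_* names are hypothetical, a pthread mutex
stands in for pci_rescan_remove_lock, a bool stands in for the
PCI_DEV_ADDED bit, and it assumes the later path takes the same lock and
skips devices that are already marked as added.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct pci_dev. */
struct model_pci_dev {
	const char *name;
	bool added;        /* models the PCI_DEV_ADDED bit */
	int proc_entries;  /* how many proc entries were created */
};

/* Models pci_rescan_remove_lock. */
static pthread_mutex_t model_rescan_remove_lock = PTHREAD_MUTEX_INITIALIZER;

/* Models pci_proc_attach_device(): creates one proc entry. */
static void model_proc_attach(struct model_pci_dev *dev)
{
	dev->proc_entries++;
}

/* Early path with the two extra steps applied. */
static void model_pci_proc_init(struct model_pci_dev *devs, size_t n)
{
	assert(devs != NULL);

	pthread_mutex_lock(&model_rescan_remove_lock);   /* step 1 */
	for (size_t i = 0; i < n; i++) {
		model_proc_attach(&devs[i]);
		devs[i].added = true;                    /* step 2 */
	}
	pthread_mutex_unlock(&model_rescan_remove_lock);
}

/* Later path: assumed to skip devices already marked as added. */
static void model_pci_bus_add_devices(struct model_pci_dev *devs, size_t n)
{
	assert(devs != NULL);

	pthread_mutex_lock(&model_rescan_remove_lock);
	for (size_t i = 0; i < n; i++) {
		if (devs[i].added)
			continue;
		model_proc_attach(&devs[i]);
		devs[i].added = true;
	}
	pthread_mutex_unlock(&model_rescan_remove_lock);
}
```

In this model, running model_pci_proc_init() and then
model_pci_bus_add_devices() on the same array creates exactly one entry
per device, since the second pass sees the flag and skips the device.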

With these changes the WARNING disappeared and the system worked well.

I understand that device_initcall(pci_proc_init) has been there for
about 20 years (since the initial import of the repository) and has
hardly changed since then.
So I wonder whether my patch is the RIGHT way to fix this WARNING.
I am not sure about this. :(

Any suggestions?

As a beginner Linux programmer, I am not sure whether I have included
all the relevant reviewers/maintainers in my email list, but I am
sincerely seeking your help, and any comments are very welcome.

Warm regards,
Shuan

Shuan He (1):
  PCI: Fix pci devices double register WARN in the kernel starting
    process

 drivers/pci/proc.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

-- 
2.39.5 (Apple Git-154)
Re: [RFC 0/1] PCI: Fix pci devices double register WARNING in the kernel starting process
Posted by Sunil V L 7 months, 1 week ago
On Wed, Jul 02, 2025 at 11:51:11PM +0800, Shuan He wrote:
> Hi All.
> I encountered a WARNING printed out during the kernel starting process
> on my developing environment.
> (with RISC-V arch, 6.12 kernel, and Debian 13 OS).
> 
> WARN Trace:
> [    0.518993] proc_dir_entry '000c:00/00.0' already registered
> [    0.519187] WARNING: CPU: 2 PID: 179 at fs/proc/generic.c:375 proc_register+0xf6/0x180
> [    0.519214] [<ffffffff804055a6>] proc_register+0xf6/0x180
> [    0.519217] [<ffffffff80405a9e>] proc_create_data+0x3e/0x60
> [    0.519220] [<ffffffff80616e44>] pci_proc_attach_device+0x74/0x130
> [    0.509991] [<ffffffff805f1af2>] pci_bus_add_device+0x42/0x100
> [    0.509997] [<ffffffff805f1c76>] pci_bus_add_devices+0xc6/0x110
> [    0.519230] [<ffffffff8066763c>] acpi_pci_root_add+0x54c/0x810
> [    0.519233] [<ffffffff8065d206>] acpi_bus_attach+0x196/0x2f0
> [    0.519234] [<ffffffff8065d390>] acpi_scan_clear_dep_fn+0x30/0x70
> [    0.519236] [<ffffffff800468fa>] process_one_work+0x19a/0x390
> [    0.519239] [<ffffffff80047a6e>] worker_thread+0x2be/0x420
> [    0.519241] [<ffffffff80050dc4>] kthread+0xc4/0xf0
> [    0.519243] [<ffffffff80ad6ad2>] ret_from_fork+0xe/0x1c
> 
This should not happen. I suspect some issue in ACPI namespace/_PRT. Can
you reproduce this on qemu virt machine?

Regards
Sunil
Re: [External] Re: [RFC 0/1] PCI: Fix pci devices double register WARNING in the kernel starting process
Posted by 拴何 7 months, 1 week ago
Hi Sunil,
Thanks for your reply! (I really appreciate it.)
The WARNING really does occur. With added debug output, I found
that the device was registered in proc via both the pci_proc_init and
acpi_pci_root_add paths, which ultimately triggered the warning
message.
Let me try to reproduce it on qemu first. I'll keep you updated.
Thanks again.

Regards,
Shuan

On Thu, Jul 3, 2025 at 2:36 PM Sunil V L <sunilvl@ventanamicro.com> wrote:
>
> On Wed, Jul 02, 2025 at 11:51:11PM +0800, Shuan He wrote:
> > Hi All.
> > I encountered a WARNING printed out during the kernel starting process
> > on my developing environment.
> > (with RISC-V arch, 6.12 kernel, and Debian 13 OS).
> >
> > WARN Trace:
> > [    0.518993] proc_dir_entry '000c:00/00.0' already registered
> > [    0.519187] WARNING: CPU: 2 PID: 179 at fs/proc/generic.c:375 proc_register+0xf6/0x180
> > [    0.519214] [<ffffffff804055a6>] proc_register+0xf6/0x180
> > [    0.519217] [<ffffffff80405a9e>] proc_create_data+0x3e/0x60
> > [    0.519220] [<ffffffff80616e44>] pci_proc_attach_device+0x74/0x130
> > [    0.509991] [<ffffffff805f1af2>] pci_bus_add_device+0x42/0x100
> > [    0.509997] [<ffffffff805f1c76>] pci_bus_add_devices+0xc6/0x110
> > [    0.519230] [<ffffffff8066763c>] acpi_pci_root_add+0x54c/0x810
> > [    0.519233] [<ffffffff8065d206>] acpi_bus_attach+0x196/0x2f0
> > [    0.519234] [<ffffffff8065d390>] acpi_scan_clear_dep_fn+0x30/0x70
> > [    0.519236] [<ffffffff800468fa>] process_one_work+0x19a/0x390
> > [    0.519239] [<ffffffff80047a6e>] worker_thread+0x2be/0x420
> > [    0.519241] [<ffffffff80050dc4>] kthread+0xc4/0xf0
> > [    0.519243] [<ffffffff80ad6ad2>] ret_from_fork+0xe/0x1c
> >
> This should not happen. I suspect some issue in ACPI namespace/_PRT. Can
> you reproduce this on qemu virt machine?
>
> Regards
> Sunil
>
Re: [External] Re: [RFC 0/1] PCI: Fix pci devices double register WARNING in the kernel starting process
Posted by Manivannan Sadhasivam 7 months, 1 week ago
On Thu, Jul 03, 2025 at 07:31:10PM GMT, 拴何 wrote:
> Hi Sunil,
> Thanks for your reply! (really appricate it).
> This WARNING truly occurred. Through the added debug info, I found
> that the device was registered to proc via pci_proc_init and
> acpi_pci_root_add paths respectively, which ultimately triggered
> the warning message.
> Let me try to reproduce it on qemu first. I'll keep you updated.
> Thanks again.
> 

I think you have uncovered a valid bug. Nothing (except the blessings of the
initcall order) prevents the occurrence of a race between pci_proc_init() and
pci_bus_add_device(). I think it went mostly unnoticed because pci_proc_init()
gets called very early, before any PCI devices are registered, so the body of
the for_each_pci_dev() loop never gets executed.

But in your case, it looks like the PCI device is somehow available before
pci_proc_init() gets executed. It is not very clear to me how the device
becomes available at this point. It might be due to some other issue. But in
any case, I think we need to get rid of calling pci_proc_attach_device() from
pci_proc_init(), as I don't see a reason to call this function from two
different places. pci_bus_add_device() should be the one calling this function,
as it is the one adding the PCI device.
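
A minimal userspace sketch of the single-call-site idea, for
illustration only. The sketch_* names are hypothetical and this is not
kernel code. It assumes that attaching a device fails while the proc
directory does not exist yet, which makes the ordering question visible:
a device added before the init step gets no entry in this model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct pci_dev. */
struct sketch_pci_dev {
	const char *name;
	int proc_entries;  /* number of proc entries created */
};

/* Models whether the proc directory has been created. */
static bool sketch_proc_dir_ready;

/*
 * Models pci_proc_init() with the device loop removed: it only sets up
 * the directory and never attaches devices itself.
 */
static void sketch_pci_proc_init(void)
{
	sketch_proc_dir_ready = true;
}

/*
 * Models pci_bus_add_device() as the only place that attaches a device.
 * Returns 0 on success, -1 if the directory is not ready yet (assumed).
 */
static int sketch_pci_bus_add_device(struct sketch_pci_dev *dev)
{
	assert(dev != NULL);

	if (!sketch_proc_dir_ready)
		return -1;

	dev->proc_entries++;
	return 0;
}
```

With one call site a duplicate entry cannot come from the init path, but
any device added before sketch_pci_proc_init() runs would need some
other way to get its entry. Whether that is why the loop exists is not
confirmed in this thread.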

Ironically, I do see a similar pattern for sysfs also. Maybe there is (or was) a
reason to create these files from two different places?

- Mani

-- 
மணிவண்ணன் சதாசிவம்
Re: [External] Re: [RFC 0/1] PCI: Fix pci devices double register WARNING in the kernel starting process
Posted by He Shuan 7 months, 1 week ago
Hi Mani,

Thanks for your comments.

>But in your case, looks like the PCI device is available somehow before
>pci_proc_init() gets executed. Now, it is not very clear to me how the device
>becomes available at this point. It might be due to some other issue. But in
>anycase, I think we need to get rid of calling pci_proc_attach_device() from
>pci_proc_init() as I don't see a reason to call this function from two
>different places. pci_bus_add_device() should be the one calling this function
>as it is the one adding the PCI device.
Got it. I need to figure out why the PCI device is already available before
pci_proc_init() is executed. (I haven't changed much source code yet; I am
basically running my tests on the upstream code.)

>Ironically, I do see a similar pattern for sysfs also. Maybe there is (or was) a
>reason to create these files from two different places?
Yes, I see the same double-registration pattern for sysfs
(pci_create_sysfs_dev_files), which is called from both pci_sysfs_init() and
pci_bus_add_device(). The pci_bus_add_device() paths do have concurrency
protection, so I agree that pci_proc_attach_device and
pci_create_sysfs_dev_files should be called from pci_bus_add_devices().

Anyway, it appears that a great deal of work is needed before this part
becomes clear. :(

Best,
Shuan