[PATCH] hw/arm/virt: preserve config for ACPI PCI hotplug

Sebastian Ott posted 1 patch 2 days, 7 hours ago
Failed in applying to current master
hw/arm/virt.c | 1 +
1 file changed, 1 insertion(+)
[PATCH] hw/arm/virt: preserve config for ACPI PCI hotplug
Posted by Sebastian Ott 2 days, 7 hours ago
ACPI PCI hotplug of a PCI device using a bar > 2M fails on ARM
with a resource allocation failure in the guest kernel (Linux):

qemu-system-aarch64 [..] -device pcie-root-port,id=rp1,bus=pcie.0 \
 		    -global acpi-ged.acpi-pci-hotplug-with-bridge-support=on \
 		    -pflash /usr/share/edk2/aarch64/QEMU_EFI-silent-pflash.qcow2

(qemu) device_add pci-testdev,bus=rp1,membar=4M

[   55.106283] pci 0000:01:00.0: [1b36:0005] type 00 class 0x00ff00 conventional PCI endpoint
[   55.107627] pci 0000:01:00.0: BAR 0 [mem 0x00000000-0x00000fff]
[   55.107864] pci 0000:01:00.0: BAR 1 [io  0x0000-0x00ff]
[   55.108015] pci 0000:01:00.0: BAR 2 [mem 0x00000000-0x003fffff 64bit pref]
[   55.118609] pci 0000:01:00.0: BAR 2 [mem size 0x00400000 64bit pref]: can't assign; no space
[   55.118653] pci 0000:01:00.0: BAR 2 [mem size 0x00400000 64bit pref]: failed to assign
[   55.118671] pci 0000:01:00.0: BAR 0 [mem 0x10000000-0x10000fff]: assigned
[   55.122828] pci 0000:01:00.0: BAR 1 [io  0x1000-0x10ff]: assigned
[   55.127090] pci 0000:01:00.0: BAR 0 [mem 0x10000000-0x10000fff]: releasing
[   55.127098] pci 0000:01:00.0: BAR 2 [mem size 0x00400000 64bit pref]: can't assign; no space
[   55.127102] pci 0000:01:00.0: BAR 2 [mem size 0x00400000 64bit pref]: failed to assign
[   55.127106] pci 0000:01:00.0: BAR 0 [mem 0x10000000-0x10000fff]: assigned

This behavior is exactly the same on x86 - except the busses are created
with a default window size of 256M there. If the PCI devices being
hotplugged use a bar > 256M you get the same failure as on ARM.

The workaround on x86 is to specify "pref64-reserve=" for the PCIe
root-port. On ARM this is currently ignored.

The following patch changes ACPI DSM #5 to return 0 (only when
ACPI HP is used) which basically tells the guest kernel to keep
all PCI resource allocations.

------->8
From 92cd64b8398f13e51b67ff8baf62f15cac300c94 Mon Sep 17 00:00:00 2001
From: Sebastian Ott <sebott@redhat.com>
Date: Wed, 20 May 2026 10:39:26 -0400
Subject: [PATCH] hw/arm/virt: preserve config for ACPI PCI hotplug

Change ACPI DSM #5 to return 0 when ACPI PCI hotplug is used
to tell the guest kernel to keep all PCI resource allocations.

This allows the pref64-reserve= option to be used for pcie-root-ports
and thus enables hotplug of PCI devices with a barsize > 2M.

Signed-off-by: Sebastian Ott <sebott@redhat.com>
---
  hw/arm/virt.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index b090233893..293a0c2163 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1094,6 +1094,7 @@ static inline DeviceState *create_acpi_ged(VirtMachineState *vms)
          pcihp_region_index = sysbus_mmio_map_name(sbdev, ACPI_PCIHP_REGION_NAME,
                                                    vms->memmap[VIRT_ACPI_PCIHP].base);
          assert(pcihp_region_index >= 0);
+        vms->pci_preserve_config = true;
      }

      sysbus_connect_irq(sbdev, 0, qdev_get_gpio_in(vms->gic, irq));
-- 
2.53.0
Re: [PATCH] hw/arm/virt: preserve config for ACPI PCI hotplug
Posted by Peter Maydell 2 days, 7 hours ago
On Thu, 28 May 2026 at 10:23, Sebastian Ott <sebott@redhat.com> wrote:
>
> ACPI PCI hotplug of a PCI device using a bar > 2M fails on ARM
> with a resource allocation failure in the guest kernel (Linux):
>
> qemu-system-aarch64 [..] -device pcie-root-port,id=rp1,bus=pcie.0 \
>                     -global acpi-ged.acpi-pci-hotplug-with-bridge-support=on \
>                     -pflash /usr/share/edk2/aarch64/QEMU_EFI-silent-pflash.qcow2
>
> (qemu) device_add pci-testdev,bus=rp1,membar=4M
>
> [   55.106283] pci 0000:01:00.0: [1b36:0005] type 00 class 0x00ff00 conventional PCI endpoint
> [   55.107627] pci 0000:01:00.0: BAR 0 [mem 0x00000000-0x00000fff]
> [   55.107864] pci 0000:01:00.0: BAR 1 [io  0x0000-0x00ff]
> [   55.108015] pci 0000:01:00.0: BAR 2 [mem 0x00000000-0x003fffff 64bit pref]
> [   55.118609] pci 0000:01:00.0: BAR 2 [mem size 0x00400000 64bit pref]: can't assign; no space
> [   55.118653] pci 0000:01:00.0: BAR 2 [mem size 0x00400000 64bit pref]: failed to assign
> [   55.118671] pci 0000:01:00.0: BAR 0 [mem 0x10000000-0x10000fff]: assigned
> [   55.122828] pci 0000:01:00.0: BAR 1 [io  0x1000-0x10ff]: assigned
> [   55.127090] pci 0000:01:00.0: BAR 0 [mem 0x10000000-0x10000fff]: releasing
> [   55.127098] pci 0000:01:00.0: BAR 2 [mem size 0x00400000 64bit pref]: can't assign; no space
> [   55.127102] pci 0000:01:00.0: BAR 2 [mem size 0x00400000 64bit pref]: failed to assign
> [   55.127106] pci 0000:01:00.0: BAR 0 [mem 0x10000000-0x10000fff]: assigned
>
> This behavior is exactly the same on x86 - except the busses are created
> with a default window size of 256M there. If the PCI devices being
> hotplugged use a bar > 256M you get the same failure as on ARM.
>
> The workaround on x86 is to specify "pref64-reserve=" for the PCIe
> root-port. On ARM this is currently ignored.

From the perspective of somebody who doesn't know much about
ACPI or PCI, this commit message doesn't really tell me why
QEMU needs to change. The above reads like "this is a kernel
problem" to me. Why is the kernel unable to allocate this BAR?

> The following patch changes ACPI DSM #5 to return 0 (only when
> ACPI HP is used) which basically tells the guest kernel to keep
> all PCI resource allocations.

If this makes things work why isn't it the default for all ACPI?
Why doesn't the kernel ignore the ACPI table setting and do
the thing that works all the time? Do we break some other use
case by setting it to 0 ?

> ------->8
> From 92cd64b8398f13e51b67ff8baf62f15cac300c94 Mon Sep 17 00:00:00 2001
> From: Sebastian Ott <sebott@redhat.com>
> Date: Wed, 20 May 2026 10:39:26 -0400
> Subject: [PATCH] hw/arm/virt: preserve config for ACPI PCI hotplug
>
> Change ACPI DSM #5 to return 0 when ACPI PCI hotplug is used
> to tell the guest kernel to keep all PCI resource allocations.
>
> This allows the pref64-reserve= option to be used for pcie-root-ports
> and thus enables hotplug of PCI devices with a barsize > 2M.
>
> Signed-off-by: Sebastian Ott <sebott@redhat.com>
> ---
>   hw/arm/virt.c | 1 +
>   1 file changed, 1 insertion(+)
>
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index b090233893..293a0c2163 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -1094,6 +1094,7 @@ static inline DeviceState *create_acpi_ged(VirtMachineState *vms)
>           pcihp_region_index = sysbus_mmio_map_name(sbdev, ACPI_PCIHP_REGION_NAME,
>                                                     vms->memmap[VIRT_ACPI_PCIHP].base);
>           assert(pcihp_region_index >= 0);
> +        vms->pci_preserve_config = true;
>       }

If we set this here, do we still need to set it in
virt_machine_device_pre_plug_cb(), or is that now redundant?

>       sysbus_connect_irq(sbdev, 0, qdev_get_gpio_in(vms->gic, irq));
> --

thanks
-- PMM
Re: [PATCH] hw/arm/virt: preserve config for ACPI PCI hotplug
Posted by Sebastian Ott 1 day, 3 hours ago
On Thu, 28 May 2026, Peter Maydell wrote:
> On Thu, 28 May 2026 at 10:23, Sebastian Ott <sebott@redhat.com> wrote:
>>
>> ACPI PCI hotplug of a PCI device using a bar > 2M fails on ARM
>> with a resource allocation failure in the guest kernel (Linux):
>>
>> qemu-system-aarch64 [..] -device pcie-root-port,id=rp1,bus=pcie.0 \
>>                     -global acpi-ged.acpi-pci-hotplug-with-bridge-support=on \
>>                     -pflash /usr/share/edk2/aarch64/QEMU_EFI-silent-pflash.qcow2
>>
>> (qemu) device_add pci-testdev,bus=rp1,membar=4M
>>
>> [   55.106283] pci 0000:01:00.0: [1b36:0005] type 00 class 0x00ff00 conventional PCI endpoint
>> [   55.107627] pci 0000:01:00.0: BAR 0 [mem 0x00000000-0x00000fff]
>> [   55.107864] pci 0000:01:00.0: BAR 1 [io  0x0000-0x00ff]
>> [   55.108015] pci 0000:01:00.0: BAR 2 [mem 0x00000000-0x003fffff 64bit pref]
>> [   55.118609] pci 0000:01:00.0: BAR 2 [mem size 0x00400000 64bit pref]: can't assign; no space
>> [   55.118653] pci 0000:01:00.0: BAR 2 [mem size 0x00400000 64bit pref]: failed to assign
>> [   55.118671] pci 0000:01:00.0: BAR 0 [mem 0x10000000-0x10000fff]: assigned
>> [   55.122828] pci 0000:01:00.0: BAR 1 [io  0x1000-0x10ff]: assigned
>> [   55.127090] pci 0000:01:00.0: BAR 0 [mem 0x10000000-0x10000fff]: releasing
>> [   55.127098] pci 0000:01:00.0: BAR 2 [mem size 0x00400000 64bit pref]: can't assign; no space
>> [   55.127102] pci 0000:01:00.0: BAR 2 [mem size 0x00400000 64bit pref]: failed to assign
>> [   55.127106] pci 0000:01:00.0: BAR 0 [mem 0x10000000-0x10000fff]: assigned
>>
>> This behavior is exactly the same on x86 - except the busses are created
>> with a default window size of 256M there. If the PCI devices being
>> hotplugged use a bar > 256M you get the same failure as on ARM.
>>
>> The workaround on x86 is to specify "pref64-reserve=" for the PCIe
>> root-port. On ARM this is currently ignored.
>
> From the perspective of somebody who doesn't know much about
> ACPI or PCI, this commit message doesn't really tell me why
> QEMU needs to change. The above reads like "this is a kernel
> problem" to me. Why is the kernel unable to allocate this BAR?

Lack of resources on the parent bus/bridge device: the bridge windows are
2M by default on ARM. The native PCIe hotplug controller would add the
additional resources to the parent bridge and retry the allocation. The
ACPI PCI hotplug controller does not do that. I'm currently playing with
a hack that adds this to acpiphp.
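
To make the "no space" failure concrete, here is a minimal sketch of the
fit check, assuming a simplified model in which a bridge window is one
contiguous range and a BAR must be naturally aligned to its size. The
struct and function names are hypothetical and do not correspond to
kernel or QEMU code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of a bridge memory window. */
struct bridge_window {
    uint64_t base;  /* first address forwarded by the bridge */
    uint64_t size;  /* window length in bytes */
    uint64_t used;  /* bytes already handed out, counted from base */
};

/* Round addr up to a power-of-two alignment. */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
    return (addr + align - 1) & ~(align - 1);
}

/*
 * A BAR of size N (a power of two) must sit on an N-byte boundary,
 * and it must end inside the parent window.
 */
static bool bar_fits(const struct bridge_window *w, uint64_t bar_size)
{
    uint64_t start = align_up(w->base + w->used, bar_size);

    return start + bar_size <= w->base + w->size;
}
```

Under these assumptions a 2M window rejects the 4M BAR from the log
above, while a larger window (8M is used in the check below purely as an
illustration) accepts it.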

Sebastian
Re: [PATCH] hw/arm/virt: preserve config for ACPI PCI hotplug
Posted by Sebastian Ott 2 days, 5 hours ago
On Thu, 28 May 2026, Peter Maydell wrote:
> On Thu, 28 May 2026 at 10:23, Sebastian Ott <sebott@redhat.com> wrote:
>>
>> ACPI PCI hotplug of a PCI device using a bar > 2M fails on ARM
>> with a resource allocation failure in the guest kernel (Linux):
>>
>> qemu-system-aarch64 [..] -device pcie-root-port,id=rp1,bus=pcie.0 \
>>                     -global acpi-ged.acpi-pci-hotplug-with-bridge-support=on \
>>                     -pflash /usr/share/edk2/aarch64/QEMU_EFI-silent-pflash.qcow2
>>
>> (qemu) device_add pci-testdev,bus=rp1,membar=4M
>>
>> [   55.106283] pci 0000:01:00.0: [1b36:0005] type 00 class 0x00ff00 conventional PCI endpoint
>> [   55.107627] pci 0000:01:00.0: BAR 0 [mem 0x00000000-0x00000fff]
>> [   55.107864] pci 0000:01:00.0: BAR 1 [io  0x0000-0x00ff]
>> [   55.108015] pci 0000:01:00.0: BAR 2 [mem 0x00000000-0x003fffff 64bit pref]
>> [   55.118609] pci 0000:01:00.0: BAR 2 [mem size 0x00400000 64bit pref]: can't assign; no space
>> [   55.118653] pci 0000:01:00.0: BAR 2 [mem size 0x00400000 64bit pref]: failed to assign
>> [   55.118671] pci 0000:01:00.0: BAR 0 [mem 0x10000000-0x10000fff]: assigned
>> [   55.122828] pci 0000:01:00.0: BAR 1 [io  0x1000-0x10ff]: assigned
>> [   55.127090] pci 0000:01:00.0: BAR 0 [mem 0x10000000-0x10000fff]: releasing
>> [   55.127098] pci 0000:01:00.0: BAR 2 [mem size 0x00400000 64bit pref]: can't assign; no space
>> [   55.127102] pci 0000:01:00.0: BAR 2 [mem size 0x00400000 64bit pref]: failed to assign
>> [   55.127106] pci 0000:01:00.0: BAR 0 [mem 0x10000000-0x10000fff]: assigned
>>
>> This behavior is exactly the same on x86 - except the busses are created
>> with a default window size of 256M there. If the PCI devices being
>> hotplugged use a bar > 256M you get the same failure as on ARM.
>>
>> The workaround on x86 is to specify "pref64-reserve=" for the PCIe
>> root-port. On ARM this is currently ignored.
>
> From the perspective of somebody who doesn't know much about
> ACPI or PCI, this commit message doesn't really tell me why
> QEMU needs to change. The above reads like "this is a kernel
> problem" to me. Why is the kernel unable to allocate this BAR?

DSM #5 defaults to 1 on ARM, which tells the OS that it may
ignore the firmware's boot-time PCI configuration. Linux does
just that. The ACPI PCI hotplug controller then can't find a
sufficient resource window on the parent bus and resource
allocation fails. DSM #5 == 0 tells the OS not to ignore the
boot-time config from firmware.
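
As a rough sketch of the two values, assuming a guest that reassigns
resources whenever it is allowed to (which is how the Linux behaviour is
described above), the decision could be modelled like this. The enum and
function names are hypothetical and are not taken from the kernel or
from QEMU.

```c
#include <stdbool.h>

/* Return values of _DSM function 5 as described in this thread. */
enum dsm5_value {
    DSM5_PRESERVE = 0,     /* OS shall not ignore the firmware's config */
    DSM5_MAY_REASSIGN = 1, /* OS may ignore it and rebalance resources */
};

/*
 * Hypothetical guest policy: throw away the firmware's assignments
 * whenever the firmware permits it. Value 1 only permits this; it
 * does not require it.
 */
static bool os_reassigns_resources(enum dsm5_value v)
{
    return v == DSM5_MAY_REASSIGN;
}

/* The firmware-sized bridge windows survive only if nothing is reassigned. */
static bool firmware_windows_survive(enum dsm5_value v)
{
    return !os_reassigns_resources(v);
}
```

With the ARM default of 1, this model drops the firmware's windows, so
any extra space the firmware set aside is lost.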

>> The following patch changes ACPI DSM #5 to return 0 (only when
>> ACPI HP is used) which basically tells the guest kernel to keep
>> all PCI resource allocations.
>
> If this makes things work why isn't it the default for all ACPI?
> Why doesn't the kernel ignore the ACPI table setting and do
> the thing that works all the time? Do we break some other use
> case by setting it to 0 ?

From what I was able to find in the archives this has been tried
before but led to regressions for legacy devices:
https://lore.kernel.org/all/20210724185234.GA2265457@roeck-us.net

It was recently selectively enabled for accel SMMU - I guess the
rationale was that you don't use accel SMMU with legacy devices:
45c57eaefd hw/arm/virt: Set PCI preserve_config for accel SMMUv3

>
>> ------->8
>> From 92cd64b8398f13e51b67ff8baf62f15cac300c94 Mon Sep 17 00:00:00 2001
>> From: Sebastian Ott <sebott@redhat.com>
>> Date: Wed, 20 May 2026 10:39:26 -0400
>> Subject: [PATCH] hw/arm/virt: preserve config for ACPI PCI hotplug
>>
>> Change ACPI DSM #5 to return 0 when ACPI PCI hotplug is used
>> to tell the guest kernel to keep all PCI resource allocations.
>>
>> This allows the pref64-reserve= option to be used for pcie-root-ports
>> and thus enables hotplug of PCI devices with a barsize > 2M.
>>
>> Signed-off-by: Sebastian Ott <sebott@redhat.com>
>> ---
>>   hw/arm/virt.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>> index b090233893..293a0c2163 100644
>> --- a/hw/arm/virt.c
>> +++ b/hw/arm/virt.c
>> @@ -1094,6 +1094,7 @@ static inline DeviceState *create_acpi_ged(VirtMachineState *vms)
>>           pcihp_region_index = sysbus_mmio_map_name(sbdev, ACPI_PCIHP_REGION_NAME,
>>                                                     vms->memmap[VIRT_ACPI_PCIHP].base);
>>           assert(pcihp_region_index >= 0);
>> +        vms->pci_preserve_config = true;
>>       }
>
> If we set this here, do we still need to set it in
> virt_machine_device_pre_plug_cb(), or is that now redundant?

I guess you can use accel SMMU without acpi pci hotplug and vice versa.
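
Put differently, the two features are independent reasons to preserve
the config, so neither assignment makes the other redundant. A minimal
sketch, assuming a stand-in struct rather than the real VirtMachineState
(the field and function names here are illustrative only):

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant machine options. */
struct machine_cfg {
    bool accel_smmu;  /* accel SMMUv3 in use */
    bool acpi_pcihp;  /* ACPI PCI hotplug in use */
};

/* Either feature on its own is enough to request preservation. */
static bool wants_preserve_config(const struct machine_cfg *c)
{
    return c->accel_smmu || c->acpi_pcihp;
}

/* DSM #5 returns 0 when the config must be preserved, 1 otherwise. */
static int dsm5_return_value(const struct machine_cfg *c)
{
    return wants_preserve_config(c) ? 0 : 1;
}
```

Only a machine with neither feature keeps the current default of 1 in
this model.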

Thanks,
Sebastian
Re: [PATCH] hw/arm/virt: preserve config for ACPI PCI hotplug
Posted by Peter Maydell 2 days, 5 hours ago
On Thu, 28 May 2026 at 12:23, Sebastian Ott <sebott@redhat.com> wrote:
>
> On Thu, 28 May 2026, Peter Maydell wrote:
> > On Thu, 28 May 2026 at 10:23, Sebastian Ott <sebott@redhat.com> wrote:
> >>
> >> ACPI PCI hotplug of a PCI device using a bar > 2M fails on ARM
> >> with a resource allocation failure in the guest kernel (Linux):
> >>
> >> qemu-system-aarch64 [..] -device pcie-root-port,id=rp1,bus=pcie.0 \
> >>                     -global acpi-ged.acpi-pci-hotplug-with-bridge-support=on \
> >>                     -pflash /usr/share/edk2/aarch64/QEMU_EFI-silent-pflash.qcow2
> >>
> >> (qemu) device_add pci-testdev,bus=rp1,membar=4M
> >>
> >> [   55.106283] pci 0000:01:00.0: [1b36:0005] type 00 class 0x00ff00 conventional PCI endpoint
> >> [   55.107627] pci 0000:01:00.0: BAR 0 [mem 0x00000000-0x00000fff]
> >> [   55.107864] pci 0000:01:00.0: BAR 1 [io  0x0000-0x00ff]
> >> [   55.108015] pci 0000:01:00.0: BAR 2 [mem 0x00000000-0x003fffff 64bit pref]
> >> [   55.118609] pci 0000:01:00.0: BAR 2 [mem size 0x00400000 64bit pref]: can't assign; no space
> >> [   55.118653] pci 0000:01:00.0: BAR 2 [mem size 0x00400000 64bit pref]: failed to assign
> >> [   55.118671] pci 0000:01:00.0: BAR 0 [mem 0x10000000-0x10000fff]: assigned
> >> [   55.122828] pci 0000:01:00.0: BAR 1 [io  0x1000-0x10ff]: assigned
> >> [   55.127090] pci 0000:01:00.0: BAR 0 [mem 0x10000000-0x10000fff]: releasing
> >> [   55.127098] pci 0000:01:00.0: BAR 2 [mem size 0x00400000 64bit pref]: can't assign; no space
> >> [   55.127102] pci 0000:01:00.0: BAR 2 [mem size 0x00400000 64bit pref]: failed to assign
> >> [   55.127106] pci 0000:01:00.0: BAR 0 [mem 0x10000000-0x10000fff]: assigned
> >>
> >> This behavior is exactly the same on x86 - except the busses are created
> >> with a default window size of 256M there. If the PCI devices being
> >> hotplugged use a bar > 256M you get the same failure as on ARM.
> >>
> >> The workaround on x86 is to specify "pref64-reserve=" for the PCIe
> >> root-port. On ARM this is currently ignored.
> >
> > From the perspective of somebody who doesn't know much about
> > ACPI or PCI, this commit message doesn't really tell me why
> > QEMU needs to change. The above reads like "this is a kernel
> > problem" to me. Why is the kernel unable to allocate this BAR?
>
> DSM #5 defaults to 1 on ARM, which tells the OS that it may
> ignore the firmware's boot-time PCI configuration. Linux does
> just that. The ACPI PCI hotplug controller then can't find a
> sufficient resource window on the parent bus and resource
> allocation fails. DSM #5 == 0 tells the OS not to ignore the
> boot-time config from firmware.

OK, but if the firmware can find a working PCI configuration,
why can't the kernel also do that?

> >> The following patch changes ACPI DSM #5 to return 0 (only when
> >> ACPI HP is used) which basically tells the guest kernel to keep
> >> all PCI resource allocations.
> >
> > If this makes things work why isn't it the default for all ACPI?
> > Why doesn't the kernel ignore the ACPI table setting and do
> > the thing that works all the time? Do we break some other use
> > case by setting it to 0 ?
>
> From what I was able to find in the archives this has been tried
> before but led to regressions for legacy devices:
> https://lore.kernel.org/all/20210724185234.GA2265457@roeck-us.net

> It was recently selectively enabled for accel SMMU - I guess the
> rationale was that you don't use accel SMMU with legacy devices:
> 45c57eaefd hw/arm/virt: Set PCI preserve_config for accel SMMUv3

    /*
     * Accel SMMU requires RMRs for MSI 1-1 mapping, which require _DSM
     * function 5 (_DSM for Preserving PCI Boot Configurations).
     */

This isn't really very informative to me either... (I have cc'd
some people involved in the smmu-accel stuff.)

The quote of the spec from that other thread on what the values mean is:

# 0: No (The operating system shall not ignore the PCI configuration
#  that firmware has done at boot time. However, the operating system
# is free to configure the devices in this hierarchy that have not
# been configured by the firmware. There may be a reduced level of
# hot plug capability support in this hierarchy due to resource
# constraints. This situation is the same as the legacy situation
# where this _DSM is not provided.)
# 1: Yes (The operating system may ignore the PCI configuration that
# the firmware has done at boot time, and reconfigure/rebalance the
# resources in the hierarchy.)

For "1", that's a "may", not a "must". So that sounds to me like
QEMU should always report 1, and the OS can choose whether it wants
to use the existing PCI config from the firmware, wipe it and start
afresh, or some mixture of the two.

Similarly, it's not clear to me either why the accel SMMU stuff
needs this to be set to 0.

To me this all sounds like it's a guest kernel problem, and we
should drop all the preserve_config handling and let the guest
OS sort out what it wants to do.

thanks
-- PMM
Re: [PATCH] hw/arm/virt: preserve config for ACPI PCI hotplug
Posted by Eric Auger 2 days, 4 hours ago
Hi Peter,

On 5/28/26 1:46 PM, Peter Maydell wrote:
> On Thu, 28 May 2026 at 12:23, Sebastian Ott <sebott@redhat.com> wrote:
>>
>> On Thu, 28 May 2026, Peter Maydell wrote:
>>> On Thu, 28 May 2026 at 10:23, Sebastian Ott <sebott@redhat.com> wrote:
>>>>
>>>> ACPI PCI hotplug of a PCI device using a bar > 2M fails on ARM
>>>> with a resource allocation failure in the guest kernel (Linux):
>>>>
>>>> qemu-system-aarch64 [..] -device pcie-root-port,id=rp1,bus=pcie.0 \
>>>>                     -global acpi-ged.acpi-pci-hotplug-with-bridge-support=on \
>>>>                     -pflash /usr/share/edk2/aarch64/QEMU_EFI-silent-pflash.qcow2
>>>>
>>>> (qemu) device_add pci-testdev,bus=rp1,membar=4M
>>>>
>>>> [   55.106283] pci 0000:01:00.0: [1b36:0005] type 00 class 0x00ff00 conventional PCI endpoint
>>>> [   55.107627] pci 0000:01:00.0: BAR 0 [mem 0x00000000-0x00000fff]
>>>> [   55.107864] pci 0000:01:00.0: BAR 1 [io  0x0000-0x00ff]
>>>> [   55.108015] pci 0000:01:00.0: BAR 2 [mem 0x00000000-0x003fffff 64bit pref]
>>>> [   55.118609] pci 0000:01:00.0: BAR 2 [mem size 0x00400000 64bit pref]: can't assign; no space
>>>> [   55.118653] pci 0000:01:00.0: BAR 2 [mem size 0x00400000 64bit pref]: failed to assign
>>>> [   55.118671] pci 0000:01:00.0: BAR 0 [mem 0x10000000-0x10000fff]: assigned
>>>> [   55.122828] pci 0000:01:00.0: BAR 1 [io  0x1000-0x10ff]: assigned
>>>> [   55.127090] pci 0000:01:00.0: BAR 0 [mem 0x10000000-0x10000fff]: releasing
>>>> [   55.127098] pci 0000:01:00.0: BAR 2 [mem size 0x00400000 64bit pref]: can't assign; no space
>>>> [   55.127102] pci 0000:01:00.0: BAR 2 [mem size 0x00400000 64bit pref]: failed to assign
>>>> [   55.127106] pci 0000:01:00.0: BAR 0 [mem 0x10000000-0x10000fff]: assigned
>>>>
>>>> This behavior is exactly the same on x86 - except the busses are created
>>>> with a default window size of 256M there. If the PCI devices being
>>>> hotplugged use a bar > 256M you get the same failure as on ARM.
>>>>
>>>> The workaround on x86 is to specify "pref64-reserve=" for the PCIe
>>>> root-port. On ARM this is currently ignored.
>>>
>>> From the perspective of somebody who doesn't know much about
>>> ACPI or PCI, this commit message doesn't really tell me why
>>> QEMU needs to change. The above reads like "this is a kernel
>>> problem" to me. Why is the kernel unable to allocate this BAR?
>>
>> DSM #5 defaults to 1 on ARM, which tells the OS that it may
>> ignore the firmware's boot-time PCI configuration. Linux does
>> just that. The ACPI PCI hotplug controller then can't find a
>> sufficient resource window on the parent bus and resource
>> allocation fails. DSM #5 == 0 tells the OS not to ignore the
>> boot-time config from firmware.
> 
> OK, but if the firmware can find a working PCI configuration,
> why can't the kernel also do that?
> 
>>>> The following patch changes ACPI DSM #5 to return 0 (only when
>>>> ACPI HP is used) which basically tells the guest kernel to keep
>>>> all PCI resource allocations.
>>>
>>> If this makes things work why isn't it the default for all ACPI?
>>> Why doesn't the kernel ignore the ACPI table setting and do
>>> the thing that works all the time? Do we break some other use
>>> case by setting it to 0 ?
>>
>> From what I was able to find in the archives this has been tried
>> before but led to regressions for legacy devices:
>> https://lore.kernel.org/all/20210724185234.GA2265457@roeck-us.net
> 
>> It was recently selectively enabled for accel SMMU - I guess the
>> rationale was that you don't use accel SMMU with legacy devices:
>> 45c57eaefd hw/arm/virt: Set PCI preserve_config for accel SMMUv3
> 
>     /*
>      * Accel SMMU requires RMRs for MSI 1-1 mapping, which require _DSM
>      * function 5 (_DSM for Preserving PCI Boot Configurations).
>      */

> 
> This isn't really very informative to me either... (I have cc'd
> some people involved in the smmu-accel stuff.)

The requirement stems from the IORT specification as we are relying on
reserved memory regions:

"If reserved memory regions are present, the OS must preserve PCIe
configuration performed by the boot firmware. This preservation is
required to ensure functional continuity of the endpoints that are using
the reserved memory regions. Therefore, RMR nodes must be supported by
the inclusion of the PCI Firmware defined _DSM for ignoring PCI boot
configuration, Function 5, in the ACPI device object of the PCIe host
bridge in ACPI namespace. The _DSM method should return a value of 0 to
indicate that the OS must honour the PCI configuration that the firmware
has done at boot time. See [PCIFW] for more details on this _DSM method."

Eric

> 
> The quote of the spec from that other thread on what the values mean is:
> 
> # 0: No (The operating system shall not ignore the PCI configuration
> #  that firmware has done at boot time. However, the operating system
> # is free to configure the devices in this hierarchy that have not
> # been configured by the firmware. There may be a reduced level of
> # hot plug capability support in this hierarchy due to resource
> # constraints. This situation is the same as the legacy situation
> # where this _DSM is not provided.)
> # 1: Yes (The operating system may ignore the PCI configuration that
> # the firmware has done at boot time, and reconfigure/rebalance the
> # resources in the hierarchy.)
> 
> For "1", that's a "may", not a "must". So that sounds to me like
> QEMU should always report 1, and the OS can choose whether it wants
> to use the existing PCI config from the firmware, wipe it and start
> afresh, or some mixture of the two.
> 
> Similarly, it's not clear to me either why the accel SMMU stuff
> needs this to be set to 0.
> 
> To me this all sounds like it's a guest kernel problem, and we
> should drop all the preserve_config handling and let the guest
> OS sort out what it wants to do.
> 
> thanks
> -- PMM
>