[PATCH] hvmloader: flip "ACPI data" to ACPI NVS type for ACPI table region

Igor Druzhinin posted 1 patch 2 weeks ago
git fetch https://github.com/patchew-project/xen tags/patchew/1602586216-27371-1-git-send-email-igor.druzhinin@citrix.com
Maintainers: Wei Liu <wl@xen.org>, Jan Beulich <jbeulich@suse.com>, "Roger Pau Monné" <roger.pau@citrix.com>, Ian Jackson <iwj@xenproject.org>, Andrew Cooper <andrew.cooper3@citrix.com>
tools/firmware/hvmloader/e820.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

[PATCH] hvmloader: flip "ACPI data" to ACPI NVS type for ACPI table region

Posted by Igor Druzhinin 2 weeks ago
The ACPI specification contains statements describing memory marked with the
regular "ACPI data" type as reclaimable by the guest. Although the guest
shouldn't really reclaim it if it wants kexec or similar functionality to work,
there could still be ambiguities in treating these regions as potentially
regular RAM.

One such example is SeaBIOS, which currently reports "ACPI data" regions as
RAM to the guest in its e801 call. The guest then tries to use this region
for initrd placement and gets stuck. While arguably SeaBIOS needs to be fixed
here, that is just one example of the potential problems from using a
reclaimable memory type.

Flip the type to "ACPI NVS", which doesn't have this ambiguity and is
described by the spec as non-reclaimable (so cannot ever be treated like RAM).

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
---
 tools/firmware/hvmloader/e820.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/firmware/hvmloader/e820.c b/tools/firmware/hvmloader/e820.c
index 38bcf18..8870099 100644
--- a/tools/firmware/hvmloader/e820.c
+++ b/tools/firmware/hvmloader/e820.c
@@ -202,16 +202,17 @@ int build_e820_table(struct e820entry *e820,
     nr++;
 
     /*
-     * Mark populated reserved memory that contains ACPI tables as ACPI data.
+     * Mark populated reserved memory that contains ACPI tables as ACPI NVS.
      * That should help the guest to treat it correctly later: e.g. pass to
-     * the next kernel on kexec or reclaim if necessary.
+     * the next kernel on kexec and prevent space reclaim which is possible
+     * with regular ACPI data type accoring to ACPI spec v6.3.
      */
 
     if ( acpi_enabled )
     {
         e820[nr].addr = RESERVED_MEMBASE;
         e820[nr].size = acpi_mem_end - RESERVED_MEMBASE;
-        e820[nr].type = E820_ACPI;
+        e820[nr].type = E820_NVS;
         nr++;
     }
 
-- 
2.7.4
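
For readers following along, here is a minimal sketch of the distinction the patch relies on. It is not hvmloader code: the type values follow the standard E820 numbering, while `may_become_ram()` is a hypothetical helper introduced only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Standard E820 address range types. */
#define E820_RAM      1
#define E820_RESERVED 2
#define E820_ACPI     3   /* "ACPI data": reclaimable after the tables are read */
#define E820_NVS      4   /* "ACPI NVS": must be preserved, never reclaimed */

struct e820entry {
    uint64_t addr;
    uint64_t size;
    uint32_t type;
};

/*
 * Hypothetical helper: can a consumer of the map ever end up treating
 * this range as ordinary RAM?  "ACPI data" can, "ACPI NVS" cannot.
 */
static bool may_become_ram(const struct e820entry *e)
{
    assert(e != NULL);
    return e->type == E820_RAM || e->type == E820_ACPI;
}
```

With the patch applied, the ACPI table region is reported with E820_NVS, so a check of this kind would return false for it.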


Re: [PATCH] hvmloader: flip "ACPI data" to ACPI NVS type for ACPI table region

Posted by Jan Beulich 2 weeks ago
On 13.10.2020 12:50, Igor Druzhinin wrote:
> The ACPI specification contains statements describing memory marked with the
> regular "ACPI data" type as reclaimable by the guest. Although the guest
> shouldn't really reclaim it if it wants kexec or similar functionality to work,
> there could still be ambiguities in treating these regions as potentially
> regular RAM.
> 
> One such example is SeaBIOS, which currently reports "ACPI data" regions as
> RAM to the guest in its e801 call. The guest then tries to use this region
> for initrd placement and gets stuck.

Any theory on why it would get stuck? Having read the thread rooting
at Sander's report, it hasn't become clear to me where the collision
there is. A consumer of E801 (rather than E820) intends to not use
ACPI data, and hence I consider SeaBIOS right in this regard (the
lack of considering holes is a problem, though).

> --- a/tools/firmware/hvmloader/e820.c
> +++ b/tools/firmware/hvmloader/e820.c
> @@ -202,16 +202,17 @@ int build_e820_table(struct e820entry *e820,
>      nr++;
>  
>      /*
> -     * Mark populated reserved memory that contains ACPI tables as ACPI data.
> +     * Mark populated reserved memory that contains ACPI tables as ACPI NVS.
>       * That should help the guest to treat it correctly later: e.g. pass to
> -     * the next kernel on kexec or reclaim if necessary.
> +     * the next kernel on kexec and prevent space reclaim which is possible
> +     * with regular ACPI data type accoring to ACPI spec v6.3.

Preventing space reclaim is not the business of hvmloader. As per above,
an ACPI unaware OS ought to be permitted to use as ordinary RAM all the
space the tables occupy. Therefore at the very least the comment needs
to reflect that this preventing of space reclaim is a workaround, not
correct behavior.

Also as a nit: "according".

As a consequence I think we will also want to adjust Xen itself to
automatically disable ACPI when it ends up consuming E801 data. Or
alternatively we should consider dropping all E801-related code (as
being inapplicable to 64-bit systems).

Jan

Re: [PATCH] hvmloader: flip "ACPI data" to ACPI NVS type for ACPI table region

Posted by Igor Druzhinin 2 weeks ago
On 13/10/2020 13:51, Jan Beulich wrote:
> On 13.10.2020 12:50, Igor Druzhinin wrote:
>> The ACPI specification contains statements describing memory marked with the
>> regular "ACPI data" type as reclaimable by the guest. Although the guest
>> shouldn't really reclaim it if it wants kexec or similar functionality to work,
>> there could still be ambiguities in treating these regions as potentially
>> regular RAM.
>>
>> One such example is SeaBIOS, which currently reports "ACPI data" regions as
>> RAM to the guest in its e801 call. The guest then tries to use this region
>> for initrd placement and gets stuck.
> 
> Any theory on why it would get stuck? Having read the thread rooting
> at Sander's report, it hasn't become clear to me where the collision
> there is. A consumer of E801 (rather than E820) intends to not use
> ACPI data, and hence I consider SeaBIOS right in this regard (the
> lack of considering holes is a problem, though).

QEMU's fw_cfg Linux boot loader (which is used by our direct kernel boot method)
is using E801 to find the top of RAM and places images below that address.
Since the reported top is now 0xfc00000, the images get located right in the
PCI hole below it, which causes the loader to hang.
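
A rough sketch of the failure mode described above. This is not SeaBIOS or QEMU code: `legacy_top_of_ram()` and its `count_acpi_as_ram` flag are hypothetical, and only model a firmware that derives a single "top of RAM" value from the E820 map without accounting for holes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define E820_RAM      1
#define E820_RESERVED 2
#define E820_ACPI     3
#define E820_NVS      4

struct e820entry {
    uint64_t addr;
    uint64_t size;
    uint32_t type;
};

/*
 * Hypothetical model: return the highest end address below 4GiB among
 * the entries counted as memory.  Holes between entries are ignored,
 * which is the problem: everything below the result looks like RAM.
 */
static uint32_t legacy_top_of_ram(const struct e820entry *map, size_t n,
                                  int count_acpi_as_ram)
{
    uint32_t top = 0;

    assert(map != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        uint64_t end = map[i].addr + map[i].size;
        int counted = map[i].type == E820_RAM ||
                      (count_acpi_as_ram && map[i].type == E820_ACPI);

        if (counted && end <= 0xffffffffull && end > top)
            top = (uint32_t)end;
    }

    return top;
}
```

With a map that has RAM up to some address and an "ACPI data" region well above it, the reported top lands at the end of the ACPI region, so a loader placing images just below that top ends up in the hole between the two. Reporting the same region as E820_NVS keeps the top at the end of real RAM.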

>> --- a/tools/firmware/hvmloader/e820.c
>> +++ b/tools/firmware/hvmloader/e820.c
>> @@ -202,16 +202,17 @@ int build_e820_table(struct e820entry *e820,
>>      nr++;
>>  
>>      /*
>> -     * Mark populated reserved memory that contains ACPI tables as ACPI data.
>> +     * Mark populated reserved memory that contains ACPI tables as ACPI NVS.
>>       * That should help the guest to treat it correctly later: e.g. pass to
>> -     * the next kernel on kexec or reclaim if necessary.
>> +     * the next kernel on kexec and prevent space reclaim which is possible
>> +     * with regular ACPI data type accoring to ACPI spec v6.3.
> 
> Preventing space reclaim is not the business of hvmloader. As per above,
> an ACPI unaware OS ought to be permitted to use as ordinary RAM all the
> space the tables occupy. Therefore at the very least the comment needs
> to reflect that this preventing of space reclaim is a workaround, not
> correct behavior.

Agree to modify the comment.

> Also as a nit: "according".
> 
> As a consequence I think we will also want to adjust Xen itself to
> automatically disable ACPI when it ends up consuming E801 data. Or
> alternatively we should consider dropping all E801-related code (as
> being inapplicable to 64-bit systems).

I'm not following here. What does Xen have to do with E801? It's a SeaBIOS-implemented
call that happened to be used by a QEMU option ROM. We cannot drop it from there
as it's part of the BIOS spec.

Igor

Re: [PATCH] hvmloader: flip "ACPI data" to ACPI NVS type for ACPI table region

Posted by Jan Beulich 2 weeks ago
On 13.10.2020 14:59, Igor Druzhinin wrote:
> On 13/10/2020 13:51, Jan Beulich wrote:
>> As a consequence I think we will also want to adjust Xen itself to
>> automatically disable ACPI when it ends up consuming E801 data. Or
>> alternatively we should consider dropping all E801-related code (as
>> being inapplicable to 64-bit systems).
> 
> I'm not following here. What does Xen have to do with E801? It's a SeaBIOS-implemented
> call that happened to be used by a QEMU option ROM. We cannot drop it from there
> as it's part of the BIOS spec.

Any ACPI aware OS has to use E820 (and nothing else). Hence our
own use of E801 should either be dropped, or lead to the
disabling of ACPI. Otherwise real firmware using logic similar
to SeaBIOS'es (but hopefully properly accounting for holes)
could make us use ACPI table space as normal RAM.

Jan

Re: [PATCH] hvmloader: flip "ACPI data" to ACPI NVS type for ACPI table region

Posted by Igor Druzhinin 2 weeks ago
On 13/10/2020 16:35, Jan Beulich wrote:
> On 13.10.2020 14:59, Igor Druzhinin wrote:
>> On 13/10/2020 13:51, Jan Beulich wrote:
>>> As a consequence I think we will also want to adjust Xen itself to
>>> automatically disable ACPI when it ends up consuming E801 data. Or
>>> alternatively we should consider dropping all E801-related code (as
>>> being inapplicable to 64-bit systems).
>>
>> I'm not following here. What does Xen have to do with E801? It's a SeaBIOS-implemented
>> call that happened to be used by a QEMU option ROM. We cannot drop it from there
>> as it's part of the BIOS spec.
> 
> Any ACPI aware OS has to use E820 (and nothing else). Hence our
> own use of E801 should either be dropped, or lead to the
> disabling of ACPI. Otherwise real firmware using logic similar
> to SeaBIOS'es (but hopefully properly accounting for holes)
> could make us use ACPI table space as normal RAM.

It's not us using it - it's a boot loader from QEMU in the form of an option ROM
that works in a 16-bit pre-OS environment, is not an OS, and relies on the e801 BIOS call.
I'm sure any ACPI aware OS does indeed use E820, but the problem here is not an OS.

The option ROM is loaded using fw_cfg from QEMU, so it's not our code. Technically
it's one piece of foreign code (QEMU boot loader) talking to another (SeaBIOS),
which provides information based on the E820 that we gave them.

So I'm afraid the decision to dynamically disable ACPI (whatever you mean by this)
cannot be made based solely on the usage of this call by a pre-OS boot loader.

Igor

Re: [PATCH] hvmloader: flip "ACPI data" to ACPI NVS type for ACPI table region

Posted by Jan Beulich 2 weeks ago
On 13.10.2020 17:47, Igor Druzhinin wrote:
> On 13/10/2020 16:35, Jan Beulich wrote:
>> On 13.10.2020 14:59, Igor Druzhinin wrote:
>>> On 13/10/2020 13:51, Jan Beulich wrote:
>>>> As a consequence I think we will also want to adjust Xen itself to
>>>> automatically disable ACPI when it ends up consuming E801 data. Or
>>>> alternatively we should consider dropping all E801-related code (as
>>>> being inapplicable to 64-bit systems).
>>>
>>> I'm not following here. What does Xen have to do with E801? It's a SeaBIOS-implemented
>>> call that happened to be used by a QEMU option ROM. We cannot drop it from there
>>> as it's part of the BIOS spec.
>>
>> Any ACPI aware OS has to use E820 (and nothing else). Hence our
>> own use of E801 should either be dropped, or lead to the
>> disabling of ACPI. Otherwise real firmware using logic similar
>> to SeaBIOS'es (but hopefully properly accounting for holes)
>> could make us use ACPI table space as normal RAM.
> 
> It's not us using it - it's a boot loader from QEMU in the form of an option ROM
> that works in a 16-bit pre-OS environment, is not an OS, and relies on the e801 BIOS call.
> I'm sure any ACPI aware OS does indeed use E820, but the problem here is not an OS.
> 
> The option ROM is loaded using fw_cfg from QEMU, so it's not our code. Technically
> it's one piece of foreign code (QEMU boot loader) talking to another (SeaBIOS),
> which provides information based on the E820 that we gave them.
> 
> So I'm afraid the decision to dynamically disable ACPI (whatever you mean by this)
> cannot be made based solely on the usage of this call by a pre-OS boot loader.

I guess this is simply a misunderstanding. I'm not talking about
your change or hvmloader or the boot loader at all. I was merely
noticing a consequence of your findings on the behavior of Xen
itself: Use of ACPI and use of E801 are exclusive of one another.

Jan
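
The exclusivity stated above can be sketched as a small policy function. This is purely illustrative and not Xen code: `memmap_source`, `boot_policy` and `choose_policy()` are hypothetical names. The idea is that E801 reports only sizes with no range types, so a consumer that falls back to it cannot tell where the ACPI tables live and should not use ACPI.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: where the memory map came from. */
enum memmap_source { MEMMAP_E820, MEMMAP_E801 };

struct boot_policy {
    enum memmap_source source;
    bool acpi_enabled;
};

/*
 * Hypothetical policy: prefer E820; if only E801 is available,
 * disable ACPI, because the E801 result may count ACPI table
 * space as ordinary RAM.
 */
static struct boot_policy choose_policy(bool e820_available,
                                        bool acpi_requested)
{
    struct boot_policy p;

    p.source = e820_available ? MEMMAP_E820 : MEMMAP_E801;
    p.acpi_enabled = acpi_requested && p.source == MEMMAP_E820;

    /* Invariant from the discussion: ACPI and E801 never coexist. */
    assert(!(p.acpi_enabled && p.source == MEMMAP_E801));

    return p;
}
```

The alternative mentioned earlier in the thread, dropping E801 support altogether, would make the fallback branch disappear rather than disabling ACPI in it.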

Re: [PATCH] hvmloader: flip "ACPI data" to ACPI NVS type for ACPI table region

Posted by Igor Druzhinin 2 weeks ago
On 13/10/2020 16:54, Jan Beulich wrote:
> On 13.10.2020 17:47, Igor Druzhinin wrote:
>> On 13/10/2020 16:35, Jan Beulich wrote:
>>> On 13.10.2020 14:59, Igor Druzhinin wrote:
>>>> On 13/10/2020 13:51, Jan Beulich wrote:
>>>>> As a consequence I think we will also want to adjust Xen itself to
>>>>> automatically disable ACPI when it ends up consuming E801 data. Or
>>>>> alternatively we should consider dropping all E801-related code (as
>>>>> being inapplicable to 64-bit systems).
>>>>
>>>> I'm not following here. What does Xen have to do with E801? It's a SeaBIOS-implemented
>>>> call that happened to be used by a QEMU option ROM. We cannot drop it from there
>>>> as it's part of the BIOS spec.
>>>
>>> Any ACPI aware OS has to use E820 (and nothing else). Hence our
>>> own use of E801 should either be dropped, or lead to the
>>> disabling of ACPI. Otherwise real firmware using logic similar
>>> to SeaBIOS'es (but hopefully properly accounting for holes)
>>> could make us use ACPI table space as normal RAM.
>>
>> It's not us using it - it's a boot loader from QEMU in the form of an option ROM
>> that works in a 16-bit pre-OS environment, is not an OS, and relies on the e801 BIOS call.
>> I'm sure any ACPI aware OS does indeed use E820, but the problem here is not an OS.
>>
>> The option ROM is loaded using fw_cfg from QEMU, so it's not our code. Technically
>> it's one piece of foreign code (QEMU boot loader) talking to another (SeaBIOS),
>> which provides information based on the E820 that we gave them.
>>
>> So I'm afraid the decision to dynamically disable ACPI (whatever you mean by this)
>> cannot be made based solely on the usage of this call by a pre-OS boot loader.
> 
> I guess this is simply a misunderstanding. I'm not talking about
> your change or hvmloader or the boot loader at all. I was merely
> noticing a consequence of your findings on the behavior of Xen
> itself: Use of ACPI and use of E801 are exclusive of one another.

Sorry, yes. I forgot e801 is also used by Xen as an alternative to e820.

Igor