[PATCH v2] tools: Mark ACPI SDTs as NVS in the PVH build path

Alejandro Vallejo posted 1 patch 9 months, 1 week ago
Patches applied successfully (tree, apply log)
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/20250311092905.991-1-alejandro.vallejo@cloud.com
tools/firmware/hvmloader/e820.c |  4 ++++
tools/libs/light/libxl_x86.c    | 17 ++++++++++++++++-
2 files changed, 20 insertions(+), 1 deletion(-)
[PATCH v2] tools: Mark ACPI SDTs as NVS in the PVH build path
Posted by Alejandro Vallejo 9 months, 1 week ago
Commit cefeffc7e583 marked ACPI tables as NVS in the hvmloader path
because SeaBIOS may otherwise just mark it as RAM. There is, however,
yet another reason to do it even in the PVH path. Xen's incarnation of
AML relies on having access to some ACPI tables (e.g: _STA of Processor
objects relies on reading the processor online bit in its MADT entry)

This is problematic if the OS tries to reclaim ACPI memory for page
tables as it's needed for runtime and can't be reclaimed after the OSPM
is up and running.

Fixes: de6d188a519f("hvmloader: flip "ACPI data" to "ACPI NVS" type for ACPI table region)"
Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
v1->v2:
  * Copy explanatory comment in hvmloader/e820.c to its libxl_x86.c counterpart

---
 tools/firmware/hvmloader/e820.c |  4 ++++
 tools/libs/light/libxl_x86.c    | 17 ++++++++++++++++-
 2 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/tools/firmware/hvmloader/e820.c b/tools/firmware/hvmloader/e820.c
index c490a0bc790c..86d39544e887 100644
--- a/tools/firmware/hvmloader/e820.c
+++ b/tools/firmware/hvmloader/e820.c
@@ -210,6 +210,10 @@ int build_e820_table(struct e820entry *e820,
      * space reuse by an ACPI unaware / buggy bootloader, option ROM, etc.
      * before an ACPI OS takes control. This is possible due to the fact that
      * ACPI NVS memory is explicitly described as non-reclaimable in ACPI spec.
+     *
+     * Furthermore, Xen relies on accessing ACPI tables from within the AML
+     * code exposed to guests. So Xen's ACPI tables are not, in general,
+     * reclaimable.
      */
 
     if ( acpi_enabled )
diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c
index a3164a3077fe..2ba96d12e595 100644
--- a/tools/libs/light/libxl_x86.c
+++ b/tools/libs/light/libxl_x86.c
@@ -737,12 +737,27 @@ static int domain_construct_memmap(libxl__gc *gc,
         nr++;
     }
 
+    /*
+     * Mark populated reserved memory that contains ACPI tables as ACPI NVS.
+     * That should help the guest to treat it correctly later: e.g. pass to
+     * the next kernel on kexec.
+     *
+     * Using NVS type instead of a regular one helps to prevent potential
+     * space reuse by an ACPI unaware / buggy bootloader, option ROM, etc.
+     * before an ACPI OS takes control. This is possible due to the fact that
+     * ACPI NVS memory is explicitly described as non-reclaimable in ACPI spec.
+     *
+     * Furthermore, Xen relies on accessing ACPI tables from within the AML
+     * code exposed to guests. So Xen's ACPI tables are not, in general,
+     * reclaimable.
+     */
+
     for (i = 0; i < MAX_ACPI_MODULES; i++) {
         if (dom->acpi_modules[i].length) {
             e820[nr].addr = dom->acpi_modules[i].guest_addr_out & ~(page_size - 1);
             e820[nr].size = dom->acpi_modules[i].length +
                 (dom->acpi_modules[i].guest_addr_out & (page_size - 1));
-            e820[nr].type = E820_ACPI;
+            e820[nr].type = E820_NVS;
             nr++;
         }
     }
-- 
2.48.1
Re: [PATCH v2] tools: Mark ACPI SDTs as NVS in the PVH build path
Posted by Jan Beulich 9 months, 1 week ago
On 11.03.2025 10:29, Alejandro Vallejo wrote:
> Commit cefeffc7e583 marked ACPI tables as NVS in the hvmloader path
> because SeaBIOS may otherwise just mark it as RAM. There is, however,
> yet another reason to do it even in the PVH path. Xen's incarnation of
> AML relies on having access to some ACPI tables (e.g: _STA of Processor
> objects relies on reading the processor online bit in its MADT entry)
> 
> This is problematic if the OS tries to reclaim ACPI memory for page
> tables as it's needed for runtime and can't be reclaimed after the OSPM
> is up and running.
> 
> Fixes: de6d188a519f("hvmloader: flip "ACPI data" to "ACPI NVS" type for ACPI table region)"
> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
> ---
> v1->v2:
>   * Copy explanatory comment in hvmloader/e820.c to its libxl_x86.c counterpart
> 
> ---
>  tools/firmware/hvmloader/e820.c |  4 ++++
>  tools/libs/light/libxl_x86.c    | 17 ++++++++++++++++-
>  2 files changed, 20 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/firmware/hvmloader/e820.c b/tools/firmware/hvmloader/e820.c
> index c490a0bc790c..86d39544e887 100644
> --- a/tools/firmware/hvmloader/e820.c
> +++ b/tools/firmware/hvmloader/e820.c
> @@ -210,6 +210,10 @@ int build_e820_table(struct e820entry *e820,
>       * space reuse by an ACPI unaware / buggy bootloader, option ROM, etc.
>       * before an ACPI OS takes control. This is possible due to the fact that
>       * ACPI NVS memory is explicitly described as non-reclaimable in ACPI spec.
> +     *
> +     * Furthermore, Xen relies on accessing ACPI tables from within the AML
> +     * code exposed to guests. So Xen's ACPI tables are not, in general,
> +     * reclaimable.
>       */
>  
>      if ( acpi_enabled )
> diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c
> index a3164a3077fe..2ba96d12e595 100644
> --- a/tools/libs/light/libxl_x86.c
> +++ b/tools/libs/light/libxl_x86.c
> @@ -737,12 +737,27 @@ static int domain_construct_memmap(libxl__gc *gc,
>          nr++;
>      }
>  
> +    /*
> +     * Mark populated reserved memory that contains ACPI tables as ACPI NVS.
> +     * That should help the guest to treat it correctly later: e.g. pass to
> +     * the next kernel on kexec.
> +     *
> +     * Using NVS type instead of a regular one helps to prevent potential
> +     * space reuse by an ACPI unaware / buggy bootloader, option ROM, etc.
> +     * before an ACPI OS takes control. This is possible due to the fact that
> +     * ACPI NVS memory is explicitly described as non-reclaimable in ACPI spec.
> +     *
> +     * Furthermore, Xen relies on accessing ACPI tables from within the AML
> +     * code exposed to guests. So Xen's ACPI tables are not, in general,
> +     * reclaimable.
> +     */

When asking for a comment here I really only was after what the last paragraph says.
Especially the middle paragraph seems questionable to me: It would not only be ACPI-
unawareness, but also E820-unawareness, for the range to be prematurely re-used. And
buggy bootloaders really would need fixing, I think - they'd put OSes into trouble on
real hardware as well.

In short - I'd like to ask that the middle paragraph be dropped from here (which
surely could be done while committing).

However, there's a second concern: You say "PVH" in the title, yet this function is
in use also for HVM, and ...

>      for (i = 0; i < MAX_ACPI_MODULES; i++) {
>          if (dom->acpi_modules[i].length) {
>              e820[nr].addr = dom->acpi_modules[i].guest_addr_out & ~(page_size - 1);
>              e820[nr].size = dom->acpi_modules[i].length +
>                  (dom->acpi_modules[i].guest_addr_out & (page_size - 1));
> -            e820[nr].type = E820_ACPI;
> +            e820[nr].type = E820_NVS;
>              nr++;
>          }
>      }

... this code is outside of any conditionals. This imo needs sorting one way or
another.

Jan
Re: [PATCH v2] tools: Mark ACPI SDTs as NVS in the PVH build path
Posted by Alejandro Vallejo 9 months, 1 week ago
On Thu Mar 13, 2025 at 1:14 PM GMT, Jan Beulich wrote:
> On 11.03.2025 10:29, Alejandro Vallejo wrote:
> > Commit cefeffc7e583 marked ACPI tables as NVS in the hvmloader path
> > because SeaBIOS may otherwise just mark it as RAM. There is, however,
> > yet another reason to do it even in the PVH path. Xen's incarnation of
> > AML relies on having access to some ACPI tables (e.g: _STA of Processor
> > objects relies on reading the processor online bit in its MADT entry)
> > 
> > This is problematic if the OS tries to reclaim ACPI memory for page
> > tables as it's needed for runtime and can't be reclaimed after the OSPM
> > is up and running.
> > 
> > Fixes: de6d188a519f("hvmloader: flip "ACPI data" to "ACPI NVS" type for ACPI table region)"
> > Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
> > ---
> > v1->v2:
> >   * Copy explanatory comment in hvmloader/e820.c to its libxl_x86.c counterpart
> > 
> > ---
> >  tools/firmware/hvmloader/e820.c |  4 ++++
> >  tools/libs/light/libxl_x86.c    | 17 ++++++++++++++++-
> >  2 files changed, 20 insertions(+), 1 deletion(-)
> > 
> > diff --git a/tools/firmware/hvmloader/e820.c b/tools/firmware/hvmloader/e820.c
> > index c490a0bc790c..86d39544e887 100644
> > --- a/tools/firmware/hvmloader/e820.c
> > +++ b/tools/firmware/hvmloader/e820.c
> > @@ -210,6 +210,10 @@ int build_e820_table(struct e820entry *e820,
> >       * space reuse by an ACPI unaware / buggy bootloader, option ROM, etc.
> >       * before an ACPI OS takes control. This is possible due to the fact that
> >       * ACPI NVS memory is explicitly described as non-reclaimable in ACPI spec.
> > +     *
> > +     * Furthermore, Xen relies on accessing ACPI tables from within the AML
> > +     * code exposed to guests. So Xen's ACPI tables are not, in general,
> > +     * reclaimable.
> >       */
> >  
> >      if ( acpi_enabled )
> > diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c
> > index a3164a3077fe..2ba96d12e595 100644
> > --- a/tools/libs/light/libxl_x86.c
> > +++ b/tools/libs/light/libxl_x86.c
> > @@ -737,12 +737,27 @@ static int domain_construct_memmap(libxl__gc *gc,
> >          nr++;
> >      }
> >  
> > +    /*
> > +     * Mark populated reserved memory that contains ACPI tables as ACPI NVS.
> > +     * That should help the guest to treat it correctly later: e.g. pass to
> > +     * the next kernel on kexec.
> > +     *
> > +     * Using NVS type instead of a regular one helps to prevent potential
> > +     * space reuse by an ACPI unaware / buggy bootloader, option ROM, etc.
> > +     * before an ACPI OS takes control. This is possible due to the fact that
> > +     * ACPI NVS memory is explicitly described as non-reclaimable in ACPI spec.
> > +     *
> > +     * Furthermore, Xen relies on accessing ACPI tables from within the AML
> > +     * code exposed to guests. So Xen's ACPI tables are not, in general,
> > +     * reclaimable.
> > +     */
>
> When asking for a comment here I really only was after what the last paragraph says.
> Especially the middle paragraph seems questionable to me: It would not only be ACPI-
> unawareness, but also E820-unawareness, for the range to be prematurely re-used. And
> buggy bootloaders really would need fixing, I think - they'd put OSes into trouble on
> real hardware as well.
>
> In short - I'd like to ask that the middle paragraph be dropped from here (which
> surely could be done while committing).

I feel the rationale is the same on both paths, so the comment blocks ought to
be aligned (whichever way). But I have no strong motivations and would be fine
dropping the middle paragraph here.

If that's your only remark, I'm happy for it to be dropped on commit.

>
> However, there's a second concern: You say "PVH" in the title, yet this function is
> in use also for HVM, and ...
>
> >      for (i = 0; i < MAX_ACPI_MODULES; i++) {
> >          if (dom->acpi_modules[i].length) {
> >              e820[nr].addr = dom->acpi_modules[i].guest_addr_out & ~(page_size - 1);
> >              e820[nr].size = dom->acpi_modules[i].length +
> >                  (dom->acpi_modules[i].guest_addr_out & (page_size - 1));
> > -            e820[nr].type = E820_ACPI;
> > +            e820[nr].type = E820_NVS;
> >              nr++;
> >          }
> >      }
>
> ... this code is outside of any conditionals. This imo needs sorting one way or
> another.
>
> Jan

ACPI tables are populated by hvmloader, while libxl generates those of PVH.

dom->acpi_modules are populated by libxl__dom_load_acpi(), which is gated on
the type being PVH (see the caller of this function). So this loop should be
effectively skipped.

I called it the PVH path because it happens to be at the moment. Nothing
prevents this path from being the HVM path too, but that involves rewiring
hvmloader.

Cheers,
Alejandro
Re: [PATCH v2] tools: Mark ACPI SDTs as NVS in the PVH build path
Posted by Jan Beulich 9 months, 1 week ago
On 13.03.2025 15:30, Alejandro Vallejo wrote:
> On Thu Mar 13, 2025 at 1:14 PM GMT, Jan Beulich wrote:
>> On 11.03.2025 10:29, Alejandro Vallejo wrote:
>>> Commit cefeffc7e583 marked ACPI tables as NVS in the hvmloader path
>>> because SeaBIOS may otherwise just mark it as RAM. There is, however,
>>> yet another reason to do it even in the PVH path. Xen's incarnation of
>>> AML relies on having access to some ACPI tables (e.g: _STA of Processor
>>> objects relies on reading the processor online bit in its MADT entry)
>>>
>>> This is problematic if the OS tries to reclaim ACPI memory for page
>>> tables as it's needed for runtime and can't be reclaimed after the OSPM
>>> is up and running.
>>>
>>> Fixes: de6d188a519f("hvmloader: flip "ACPI data" to "ACPI NVS" type for ACPI table region)"
>>> Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
>>> ---
>>> v1->v2:
>>>   * Copy explanatory comment in hvmloader/e820.c to its libxl_x86.c counterpart
>>>
>>> ---
>>>  tools/firmware/hvmloader/e820.c |  4 ++++
>>>  tools/libs/light/libxl_x86.c    | 17 ++++++++++++++++-
>>>  2 files changed, 20 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/tools/firmware/hvmloader/e820.c b/tools/firmware/hvmloader/e820.c
>>> index c490a0bc790c..86d39544e887 100644
>>> --- a/tools/firmware/hvmloader/e820.c
>>> +++ b/tools/firmware/hvmloader/e820.c
>>> @@ -210,6 +210,10 @@ int build_e820_table(struct e820entry *e820,
>>>       * space reuse by an ACPI unaware / buggy bootloader, option ROM, etc.
>>>       * before an ACPI OS takes control. This is possible due to the fact that
>>>       * ACPI NVS memory is explicitly described as non-reclaimable in ACPI spec.
>>> +     *
>>> +     * Furthermore, Xen relies on accessing ACPI tables from within the AML
>>> +     * code exposed to guests. So Xen's ACPI tables are not, in general,
>>> +     * reclaimable.
>>>       */
>>>  
>>>      if ( acpi_enabled )
>>> diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c
>>> index a3164a3077fe..2ba96d12e595 100644
>>> --- a/tools/libs/light/libxl_x86.c
>>> +++ b/tools/libs/light/libxl_x86.c
>>> @@ -737,12 +737,27 @@ static int domain_construct_memmap(libxl__gc *gc,
>>>          nr++;
>>>      }
>>>  
>>> +    /*
>>> +     * Mark populated reserved memory that contains ACPI tables as ACPI NVS.
>>> +     * That should help the guest to treat it correctly later: e.g. pass to
>>> +     * the next kernel on kexec.
>>> +     *
>>> +     * Using NVS type instead of a regular one helps to prevent potential
>>> +     * space reuse by an ACPI unaware / buggy bootloader, option ROM, etc.
>>> +     * before an ACPI OS takes control. This is possible due to the fact that
>>> +     * ACPI NVS memory is explicitly described as non-reclaimable in ACPI spec.
>>> +     *
>>> +     * Furthermore, Xen relies on accessing ACPI tables from within the AML
>>> +     * code exposed to guests. So Xen's ACPI tables are not, in general,
>>> +     * reclaimable.
>>> +     */
>>
>> When asking for a comment here I really only was after what the last paragraph says.
>> Especially the middle paragraph seems questionable to me: It would not only be ACPI-
>> unawareness, but also E820-unawareness, for the range to be prematurely re-used. And
>> buggy bootloaders really would need fixing, I think - they'd put OSes into trouble on
>> real hardware as well.
>>
>> In short - I'd like to ask that the middle paragraph be dropped from here (which
>> surely could be done while committing).
> 
> I feel the rationale is the same on both paths, so the comment blocks ought to
> be aligned (whichever way). But I have no strong motivations and would be fine
> dropping the middle paragraph here.
> 
> If that's your only remark, I'm happy for it to be dropped on commit.
> 
>>
>> However, there's a second concern: You say "PVH" in the title, yet this function is
>> in use also for HVM, and ...
>>
>>>      for (i = 0; i < MAX_ACPI_MODULES; i++) {
>>>          if (dom->acpi_modules[i].length) {
>>>              e820[nr].addr = dom->acpi_modules[i].guest_addr_out & ~(page_size - 1);
>>>              e820[nr].size = dom->acpi_modules[i].length +
>>>                  (dom->acpi_modules[i].guest_addr_out & (page_size - 1));
>>> -            e820[nr].type = E820_ACPI;
>>> +            e820[nr].type = E820_NVS;
>>>              nr++;
>>>          }
>>>      }
>>
>> ... this code is outside of any conditionals. This imo needs sorting one way or
>> another.
> 
> ACPI tables are populated by hvmloader, while libxl generates those of PVH.
> 
> dom->acpi_modules are populated by libxl__dom_load_acpi(), which is gated on
> the type being PVH (see the caller of this function). So this loop should be
> effectively skipped.
> 
> I called it the PVH path because it happens to be at the moment. Nothing
> prevents this path from being the HVM path too, but that involves rewiring
> hvmloader.

Oh, okay - what I was missing then is that ->acpi_modules[] is populated by
libxl__dom_load_acpi(), and that bails early for non-PVH. So the loop here
will have only no-op iterations for HVM.

Jan