[PATCH v2 1/2] libacpi: Prevent CPU hotplug AML from corrupting memory

Alejandro Vallejo posted 2 patches 2 days, 14 hours ago
[PATCH v2 1/2] libacpi: Prevent CPU hotplug AML from corrupting memory
Posted by Alejandro Vallejo 2 days, 14 hours ago
CPU hotplug relies on the online CPU bitmap being provided on PIO 0xaf00
by the device model. The GPE handler checks this and compares it against
the "online" flag on each MADT LAPIC entry, setting the flag to its
related bit in the bitmap and adjusting the table's checksum.

The bytecode doesn't, however, stop at NCPUS. It keeps comparing until it
reaches 128, even if that overflows the MADT into some other (hopefully
mapped) memory. The reading isn't as problematic as the writing though.

If an "entry" outside the MADT is deemed to disagree with the CPU bitmap
then the bit where the "online" flag would be is flipped, thus
corrupting that memory. And the MADT checksum gets adjusted for a flip
that happened outside its range. It's all terrible.

Note that this corruption happens regardless of the device-model being
present or not, because even if the bitmap holds 0s, the overflowed
memory might not at the bits corresponding to the "online" flag.

This patch adjusts the DSDT so entries >=NCPUS are skipped.

Fixes: 087543338924("hvmloader: limit CPUs exposed to guests")
Reported-by: Grygorii Strashko <grygorii_strashko@epam.com>
Signed-off-by: Alejandro Vallejo <alejandro.garciavallejo@amd.com>
---
v2:
  * Added early returns in AML rather than dubious And() statements
---
 tools/libacpi/mk_dsdt.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/tools/libacpi/mk_dsdt.c b/tools/libacpi/mk_dsdt.c
index 8ac4f9d0b4..aeeb71dfe6 100644
--- a/tools/libacpi/mk_dsdt.c
+++ b/tools/libacpi/mk_dsdt.c
@@ -231,6 +231,20 @@ int main(int argc, char **argv)
     stmt("Store", "ToBuffer(PRS), Local0");
     for ( cpu = 0; cpu < max_cpus; cpu++ )
     {
+        if ( cpu )
+        {
+            /*
+             * Check if we're still within the MADT bounds
+             *
+             * LLess() takes one byte, but LLessEqual() takes two. Increase
+             * `cpu` by 1, so we can avoid it. It does add up once you do it
+             * 127 times!
+             */
+            push_block("If", "LLess(\\_SB.NCPU, %d)", 1 + cpu);
+            stmt("Return", "One");
+            pop_block();
+        }
+
         /* Read a byte at a time from the PRST online-CPU bitmask. */
         if ( (cpu & 7) == 0 )
             stmt("Store", "DerefOf(Index(Local0, %u)), Local1", cpu/8);
-- 
2.43.0
Re: [PATCH v2 1/2] libacpi: Prevent CPU hotplug AML from corrupting memory
Posted by Jan Beulich 1 day, 23 hours ago
On 11.09.2025 18:23, Alejandro Vallejo wrote:
> CPU hotplug relies on the online CPU bitmap being provided on PIO 0xaf00
> by the device model. The GPE handler checks this and compares it against
> the "online" flag on each MADT LAPIC entry, setting the flag to its
> related bit in the bitmap and adjusting the table's checksum.
> 
> The bytecode doesn't, however, stop at NCPUS. It keeps comparing until it
> reaches 128, even if that overflows the MADT into some other (hopefully
> mapped) memory. The reading isn't as problematic as the writing though.
> 
> If an "entry" outside the MADT is deemed to disagree with the CPU bitmap
> then the bit where the "online" flag would be is flipped, thus
> corrupting that memory. And the MADT checksum gets adjusted for a flip
> that happened outside its range. It's all terrible.
> 
> Note that this corruption happens regardless of the device-model being
> present or not, because even if the bitmap holds 0s, the overflowed
> memory might not at the bits corresponding to the "online" flag.
> 
> This patch adjusts the DSDT so entries >=NCPUS are skipped.
> 
> Fixes: 087543338924("hvmloader: limit CPUs exposed to guests")
> Reported-by: Grygorii Strashko <grygorii_strashko@epam.com>
> Signed-off-by: Alejandro Vallejo <alejandro.garciavallejo@amd.com>

In principle:
Reviewed-by: Jan Beulich <jbeulich@suse.com>

However, ...

> --- a/tools/libacpi/mk_dsdt.c
> +++ b/tools/libacpi/mk_dsdt.c
> @@ -231,6 +231,20 @@ int main(int argc, char **argv)
>      stmt("Store", "ToBuffer(PRS), Local0");
>      for ( cpu = 0; cpu < max_cpus; cpu++ )
>      {
> +        if ( cpu )
> +        {
> +            /*
> +             * Check if we're still within the MADT bounds
> +             *
> +             * LLess() takes one byte, but LLessEqual() takes two. Increase
> +             * `cpu` by 1, so we can avoid it. It does add up once you do it
> +             * 127 times!
> +             */
> +            push_block("If", "LLess(\\_SB.NCPU, %d)", 1 + cpu);
> +            stmt("Return", "One");

... if you already care about size bloat in the conditional, why are the two
bytes per instance that this extra return requires not relevant? They too
add up, and they can be avoided by wrapping the If around the rest of the
code. I didn't count it, but I expect the If encoding to grow by at most one
byte, perhaps none at all.

Jan

> +            pop_block();
> +        }
> +
>          /* Read a byte at a time from the PRST online-CPU bitmask. */
>          if ( (cpu & 7) == 0 )
>              stmt("Store", "DerefOf(Index(Local0, %u)), Local1", cpu/8);
Re: [PATCH v2 1/2] libacpi: Prevent CPU hotplug AML from corrupting memory
Posted by Alejandro Vallejo 1 day, 21 hours ago
On Fri Sep 12, 2025 at 8:40 AM CEST, Jan Beulich wrote:
> On 11.09.2025 18:23, Alejandro Vallejo wrote:
>> CPU hotplug relies on the online CPU bitmap being provided on PIO 0xaf00
>> by the device model. The GPE handler checks this and compares it against
>> the "online" flag on each MADT LAPIC entry, setting the flag to its
>> related bit in the bitmap and adjusting the table's checksum.
>> 
>> The bytecode doesn't, however, stop at NCPUS. It keeps comparing until it
>> reaches 128, even if that overflows the MADT into some other (hopefully
>> mapped) memory. The reading isn't as problematic as the writing though.
>> 
>> If an "entry" outside the MADT is deemed to disagree with the CPU bitmap
>> then the bit where the "online" flag would be is flipped, thus
>> corrupting that memory. And the MADT checksum gets adjusted for a flip
>> that happened outside its range. It's all terrible.
>> 
>> Note that this corruption happens regardless of the device-model being
>> present or not, because even if the bitmap holds 0s, the overflowed
>> memory might not at the bits corresponding to the "online" flag.
>> 
>> This patch adjusts the DSDT so entries >=NCPUS are skipped.
>> 
>> Fixes: 087543338924("hvmloader: limit CPUs exposed to guests")
>> Reported-by: Grygorii Strashko <grygorii_strashko@epam.com>
>> Signed-off-by: Alejandro Vallejo <alejandro.garciavallejo@amd.com>
>
> In principle:
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Cheers,

>
> However, ...
>
>> --- a/tools/libacpi/mk_dsdt.c
>> +++ b/tools/libacpi/mk_dsdt.c
>> @@ -231,6 +231,20 @@ int main(int argc, char **argv)
>>      stmt("Store", "ToBuffer(PRS), Local0");
>>      for ( cpu = 0; cpu < max_cpus; cpu++ )
>>      {
>> +        if ( cpu )
>> +        {
>> +            /*
>> +             * Check if we're still within the MADT bounds
>> +             *
>> +             * LLess() takes one byte, but LLessEqual() takes two. Increase
>> +             * `cpu` by 1, so we can avoid it. It does add up once you do it
>> +             * 127 times!
>> +             */
>> +            push_block("If", "LLess(\\_SB.NCPU, %d)", 1 + cpu);
>> +            stmt("Return", "One");
>
> ... if you already care about size bloat in the conditional, why are the two
> bytes per instance that this extra return requires not relevant? They too
> add up, and they can be avoided by wrapping the If around the rest of the
> code. I didn't count it, but I expect the If encoding to grow by at most one
> byte, perhaps none at all.

I don't mind either way. Removing the "return" statement and the pop_block()
would save 254 bytes in tota at most. I don't think the conditional would grow
because the there wouldn't be that much contained within, but regardless the
early return is in the spirit of not going through 127 conditionals on every
GPE handle when you typically only have a handful of CPUs. The sooner we drop
out of AML, the better.

In due time I want to shrink this to be an AML loop in dsdt.asl so it can
be taken out of mk_dsdt.c. That will both shrink the DSDT (a ton) and accelerate
GPE handling, but I don't have time to do it at the moment.

Do you have a preference in table size vs execution-time?

Cheers,
Alejandro
Re: [PATCH v2 1/2] libacpi: Prevent CPU hotplug AML from corrupting memory
Posted by Jan Beulich 1 day, 18 hours ago
On 12.09.2025 11:32, Alejandro Vallejo wrote:
> On Fri Sep 12, 2025 at 8:40 AM CEST, Jan Beulich wrote:
>> On 11.09.2025 18:23, Alejandro Vallejo wrote:
>>> CPU hotplug relies on the online CPU bitmap being provided on PIO 0xaf00
>>> by the device model. The GPE handler checks this and compares it against
>>> the "online" flag on each MADT LAPIC entry, setting the flag to its
>>> related bit in the bitmap and adjusting the table's checksum.
>>>
>>> The bytecode doesn't, however, stop at NCPUS. It keeps comparing until it
>>> reaches 128, even if that overflows the MADT into some other (hopefully
>>> mapped) memory. The reading isn't as problematic as the writing though.
>>>
>>> If an "entry" outside the MADT is deemed to disagree with the CPU bitmap
>>> then the bit where the "online" flag would be is flipped, thus
>>> corrupting that memory. And the MADT checksum gets adjusted for a flip
>>> that happened outside its range. It's all terrible.
>>>
>>> Note that this corruption happens regardless of the device-model being
>>> present or not, because even if the bitmap holds 0s, the overflowed
>>> memory might not at the bits corresponding to the "online" flag.
>>>
>>> This patch adjusts the DSDT so entries >=NCPUS are skipped.
>>>
>>> Fixes: 087543338924("hvmloader: limit CPUs exposed to guests")
>>> Reported-by: Grygorii Strashko <grygorii_strashko@epam.com>
>>> Signed-off-by: Alejandro Vallejo <alejandro.garciavallejo@amd.com>
>>
>> In principle:
>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> 
> Cheers,
> 
>>
>> However, ...
>>
>>> --- a/tools/libacpi/mk_dsdt.c
>>> +++ b/tools/libacpi/mk_dsdt.c
>>> @@ -231,6 +231,20 @@ int main(int argc, char **argv)
>>>      stmt("Store", "ToBuffer(PRS), Local0");
>>>      for ( cpu = 0; cpu < max_cpus; cpu++ )
>>>      {
>>> +        if ( cpu )
>>> +        {
>>> +            /*
>>> +             * Check if we're still within the MADT bounds
>>> +             *
>>> +             * LLess() takes one byte, but LLessEqual() takes two. Increase
>>> +             * `cpu` by 1, so we can avoid it. It does add up once you do it
>>> +             * 127 times!
>>> +             */
>>> +            push_block("If", "LLess(\\_SB.NCPU, %d)", 1 + cpu);
>>> +            stmt("Return", "One");
>>
>> ... if you already care about size bloat in the conditional, why are the two
>> bytes per instance that this extra return requires not relevant? They too
>> add up, and they can be avoided by wrapping the If around the rest of the
>> code. I didn't count it, but I expect the If encoding to grow by at most one
>> byte, perhaps none at all.
> 
> I don't mind either way. Removing the "return" statement and the pop_block()
> would save 254 bytes in tota at most. I don't think the conditional would grow
> because the there wouldn't be that much contained within, but regardless the
> early return is in the spirit of not going through 127 conditionals on every
> GPE handle when you typically only have a handful of CPUs. The sooner we drop
> out of AML, the better.
> 
> In due time I want to shrink this to be an AML loop in dsdt.asl so it can
> be taken out of mk_dsdt.c. That will both shrink the DSDT (a ton) and accelerate
> GPE handling, but I don't have time to do it at the moment.
> 
> Do you have a preference in table size vs execution-time?

Personally I'd favor table size. The AML interpreter is slow anyway.

Jan