[v3] target/arm/kvm: Improve memory error handling

[PATCH v3 2/8] acpi/ghes: Increase GHES raw data maximal length to 4KiB

Posted by Gavin Shan 3 months, 1 week ago

The current GHES raw data maximal length isn't enough for 16 consecutive
CPER errors, which will be sent to a guest with 4KiB page size on a
erroneous 64KiB host page. Note those 16 CPER errors will be contained
in one single error block, meaning all CPER errors should be identical
in terms of type and severity and all of them should be delivered in
one shot.

Increase GHES raw data maximal length from 1KiB to 4KiB so that the
error block has enough storage space for 16 consecutive CPER errors.

Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 docs/specs/acpi_hest_ghes.rst | 2 +-
 hw/acpi/ghes.c                | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/specs/acpi_hest_ghes.rst b/docs/specs/acpi_hest_ghes.rst
index aaf7b1ad11..acf31d6eeb 100644
--- a/docs/specs/acpi_hest_ghes.rst
+++ b/docs/specs/acpi_hest_ghes.rst
@@ -68,7 +68,7 @@ Design Details
     and N Read Ack Register entries. The size for each entry is 8-byte.
     The Error Status Data Block table contains N Error Status Data Block
     entries. The size for each entry is defined at the source code as
-    ACPI_GHES_MAX_RAW_DATA_LENGTH (currently 1024 bytes). The total size
+    ACPI_GHES_MAX_RAW_DATA_LENGTH (currently 4096 bytes). The total size
     for the "etc/hardware_errors" fw_cfg blob is
     (N * 8 * 2 + N * ACPI_GHES_MAX_RAW_DATA_LENGTH) bytes.
     N is the number of the kinds of hardware error sources.
diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
index 06555905ce..a9c08e73c0 100644
--- a/hw/acpi/ghes.c
+++ b/hw/acpi/ghes.c
@@ -33,7 +33,7 @@
 #define ACPI_HEST_ADDR_FW_CFG_FILE          "etc/acpi_table_hest_addr"
 
 /* The max size in bytes for one error block */
-#define ACPI_GHES_MAX_RAW_DATA_LENGTH   (1 * KiB)
+#define ACPI_GHES_MAX_RAW_DATA_LENGTH   (4 * KiB)
 
 /* Generic Hardware Error Source version 2 */
 #define ACPI_GHES_SOURCE_GENERIC_ERROR_V2   10
-- 
2.51.0

Re: [PATCH v3 2/8] acpi/ghes: Increase GHES raw data maximal length to 4KiB

Posted by Igor Mammedov 3 months ago

On Wed,  5 Nov 2025 21:44:47 +1000
Gavin Shan <gshan@redhat.com> wrote:

> The current GHES raw data maximal length isn't enough for 16 consecutive
> CPER errors, which will be sent to a guest with 4KiB page size on a
> erroneous 64KiB host page. Note those 16 CPER errors will be contained
> in one single error block, meaning all CPER errors should be identical
> in terms of type and severity and all of them should be delivered in
> one shot.
> 
> Increase GHES raw data maximal length from 1KiB to 4KiB so that the
> error block has enough storage space for 16 consecutive CPER errors.
> 
> Signed-off-by: Gavin Shan <gshan@redhat.com>
> ---
>  docs/specs/acpi_hest_ghes.rst | 2 +-
>  hw/acpi/ghes.c                | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/docs/specs/acpi_hest_ghes.rst b/docs/specs/acpi_hest_ghes.rst
> index aaf7b1ad11..acf31d6eeb 100644
> --- a/docs/specs/acpi_hest_ghes.rst
> +++ b/docs/specs/acpi_hest_ghes.rst
> @@ -68,7 +68,7 @@ Design Details
>      and N Read Ack Register entries. The size for each entry is 8-byte.
>      The Error Status Data Block table contains N Error Status Data Block
>      entries. The size for each entry is defined at the source code as
> -    ACPI_GHES_MAX_RAW_DATA_LENGTH (currently 1024 bytes). The total size
> +    ACPI_GHES_MAX_RAW_DATA_LENGTH (currently 4096 bytes). The total size

is it safe to bump without compat glue?

consider VM migrated from old QEMU to new one,
it will have  etc/hardware_errors allocated with 1K GESB,
and more importantly error_block_addressN will have 1K offsets as well

however with ACPI_GHES_MAX_RAW_DATA_LENGTH all length checks will
let >1K blocks to be written into into 1K 'formated' etc/hardware_errors.

Thanks to previous refactoring we get all addresses right (1K version),
but if you write large GESB there it will either overlap with the next GESB
or a smaller GESB might overwrite tail of preceding large one.
And in works case it's OOB when writing large GESB in the last block.

Given we have to write GESB successfully or abort, there is no point
in adding compat knobs. But we still need to check if GEBS will fit into
whatever block size etc/hardware_errors inside guest RAM is laid out originally.

>      for the "etc/hardware_errors" fw_cfg blob is
>      (N * 8 * 2 + N * ACPI_GHES_MAX_RAW_DATA_LENGTH) bytes.
>      N is the number of the kinds of hardware error sources.
> diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
> index 06555905ce..a9c08e73c0 100644
> --- a/hw/acpi/ghes.c
> +++ b/hw/acpi/ghes.c
> @@ -33,7 +33,7 @@
>  #define ACPI_HEST_ADDR_FW_CFG_FILE          "etc/acpi_table_hest_addr"
>  
>  /* The max size in bytes for one error block */
> -#define ACPI_GHES_MAX_RAW_DATA_LENGTH   (1 * KiB)
> +#define ACPI_GHES_MAX_RAW_DATA_LENGTH   (4 * KiB)
>  
>  /* Generic Hardware Error Source version 2 */
>  #define ACPI_GHES_SOURCE_GENERIC_ERROR_V2   10

Re: [PATCH v3 2/8] acpi/ghes: Increase GHES raw data maximal length to 4KiB

Posted by Gavin Shan 3 months ago

Hi Igor,

On 11/11/25 12:11 AM, Igor Mammedov wrote:
> On Wed,  5 Nov 2025 21:44:47 +1000
> Gavin Shan <gshan@redhat.com> wrote:
> 
>> The current GHES raw data maximal length isn't enough for 16 consecutive
>> CPER errors, which will be sent to a guest with 4KiB page size on a
>> erroneous 64KiB host page. Note those 16 CPER errors will be contained
>> in one single error block, meaning all CPER errors should be identical
>> in terms of type and severity and all of them should be delivered in
>> one shot.
>>
>> Increase GHES raw data maximal length from 1KiB to 4KiB so that the
>> error block has enough storage space for 16 consecutive CPER errors.
>>
>> Signed-off-by: Gavin Shan <gshan@redhat.com>
>> ---
>>   docs/specs/acpi_hest_ghes.rst | 2 +-
>>   hw/acpi/ghes.c                | 2 +-
>>   2 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/docs/specs/acpi_hest_ghes.rst b/docs/specs/acpi_hest_ghes.rst
>> index aaf7b1ad11..acf31d6eeb 100644
>> --- a/docs/specs/acpi_hest_ghes.rst
>> +++ b/docs/specs/acpi_hest_ghes.rst
>> @@ -68,7 +68,7 @@ Design Details
>>       and N Read Ack Register entries. The size for each entry is 8-byte.
>>       The Error Status Data Block table contains N Error Status Data Block
>>       entries. The size for each entry is defined at the source code as
>> -    ACPI_GHES_MAX_RAW_DATA_LENGTH (currently 1024 bytes). The total size
>> +    ACPI_GHES_MAX_RAW_DATA_LENGTH (currently 4096 bytes). The total size
> 
> is it safe to bump without compat glue?
> 
> consider VM migrated from old QEMU to new one,
> it will have  etc/hardware_errors allocated with 1K GESB,
> and more importantly error_block_addressN will have 1K offsets as well
> 
> however with ACPI_GHES_MAX_RAW_DATA_LENGTH all length checks will
> let >1K blocks to be written into into 1K 'formated' etc/hardware_errors.
> 
> Thanks to previous refactoring we get all addresses right (1K version),
> but if you write large GESB there it will either overlap with the next GESB
> or a smaller GESB might overwrite tail of preceding large one.
> And in works case it's OOB when writing large GESB in the last block.
> 
> Given we have to write GESB successfully or abort, there is no point
> in adding compat knobs. But we still need to check if GEBS will fit into
> whatever block size etc/hardware_errors inside guest RAM is laid out originally.
> 

Good point. You're right that we're not safe for migration from old QEMU to
and new QEMU. So I think I need to bump vmstate_hest_state::minimum_version_id
in generic_event_device.c ?


>>       for the "etc/hardware_errors" fw_cfg blob is
>>       (N * 8 * 2 + N * ACPI_GHES_MAX_RAW_DATA_LENGTH) bytes.
>>       N is the number of the kinds of hardware error sources.
>> diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
>> index 06555905ce..a9c08e73c0 100644
>> --- a/hw/acpi/ghes.c
>> +++ b/hw/acpi/ghes.c
>> @@ -33,7 +33,7 @@
>>   #define ACPI_HEST_ADDR_FW_CFG_FILE          "etc/acpi_table_hest_addr"
>>   
>>   /* The max size in bytes for one error block */
>> -#define ACPI_GHES_MAX_RAW_DATA_LENGTH   (1 * KiB)
>> +#define ACPI_GHES_MAX_RAW_DATA_LENGTH   (4 * KiB)
>>   
>>   /* Generic Hardware Error Source version 2 */
>>   #define ACPI_GHES_SOURCE_GENERIC_ERROR_V2   10

Thanks,
Gavin

Re: [PATCH v3 2/8] acpi/ghes: Increase GHES raw data maximal length to 4KiB

Posted by Igor Mammedov 3 months ago

On Tue, 11 Nov 2025 14:05:23 +1000
Gavin Shan <gshan@redhat.com> wrote:

> Hi Igor,
> 
> On 11/11/25 12:11 AM, Igor Mammedov wrote:
> > On Wed,  5 Nov 2025 21:44:47 +1000
> > Gavin Shan <gshan@redhat.com> wrote:
> >   
> >> The current GHES raw data maximal length isn't enough for 16 consecutive
> >> CPER errors, which will be sent to a guest with 4KiB page size on a
> >> erroneous 64KiB host page. Note those 16 CPER errors will be contained
> >> in one single error block, meaning all CPER errors should be identical
> >> in terms of type and severity and all of them should be delivered in
> >> one shot.
> >>
> >> Increase GHES raw data maximal length from 1KiB to 4KiB so that the
> >> error block has enough storage space for 16 consecutive CPER errors.
> >>
> >> Signed-off-by: Gavin Shan <gshan@redhat.com>
> >> ---
> >>   docs/specs/acpi_hest_ghes.rst | 2 +-
> >>   hw/acpi/ghes.c                | 2 +-
> >>   2 files changed, 2 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/docs/specs/acpi_hest_ghes.rst b/docs/specs/acpi_hest_ghes.rst
> >> index aaf7b1ad11..acf31d6eeb 100644
> >> --- a/docs/specs/acpi_hest_ghes.rst
> >> +++ b/docs/specs/acpi_hest_ghes.rst
> >> @@ -68,7 +68,7 @@ Design Details
> >>       and N Read Ack Register entries. The size for each entry is 8-byte.
> >>       The Error Status Data Block table contains N Error Status Data Block
> >>       entries. The size for each entry is defined at the source code as
> >> -    ACPI_GHES_MAX_RAW_DATA_LENGTH (currently 1024 bytes). The total size
> >> +    ACPI_GHES_MAX_RAW_DATA_LENGTH (currently 4096 bytes). The total size  
> > 
> > is it safe to bump without compat glue?
> > 
> > consider VM migrated from old QEMU to new one,
> > it will have  etc/hardware_errors allocated with 1K GESB,
> > and more importantly error_block_addressN will have 1K offsets as well
> > 
> > however with ACPI_GHES_MAX_RAW_DATA_LENGTH all length checks will
> > let >1K blocks to be written into into 1K 'formated' etc/hardware_errors.
> > 
> > Thanks to previous refactoring we get all addresses right (1K version),
> > but if you write large GESB there it will either overlap with the next GESB
> > or a smaller GESB might overwrite tail of preceding large one.
> > And in works case it's OOB when writing large GESB in the last block.
> > 
> > Given we have to write GESB successfully or abort, there is no point
> > in adding compat knobs. But we still need to check if GEBS will fit into
> > whatever block size etc/hardware_errors inside guest RAM is laid out originally.
> >   
> 
> Good point. You're right that we're not safe for migration from old QEMU to
> and new QEMU. So I think I need to bump vmstate_hest_state::minimum_version_id
> in generic_event_device.c ?

that won't help,
what would help is creating compat property (in the owner of GHES MMIO registers),
and lower limits (to former value) for older machine types.
That way sizes would match even if you do ping pong migration
between old qemu and new one, since one would still be using old machine type
for that.

> 
> 
> >>       for the "etc/hardware_errors" fw_cfg blob is
> >>       (N * 8 * 2 + N * ACPI_GHES_MAX_RAW_DATA_LENGTH) bytes.
> >>       N is the number of the kinds of hardware error sources.
> >> diff --git a/hw/acpi/ghes.c b/hw/acpi/ghes.c
> >> index 06555905ce..a9c08e73c0 100644
> >> --- a/hw/acpi/ghes.c
> >> +++ b/hw/acpi/ghes.c
> >> @@ -33,7 +33,7 @@
> >>   #define ACPI_HEST_ADDR_FW_CFG_FILE          "etc/acpi_table_hest_addr"
> >>   
> >>   /* The max size in bytes for one error block */
> >> -#define ACPI_GHES_MAX_RAW_DATA_LENGTH   (1 * KiB)
> >> +#define ACPI_GHES_MAX_RAW_DATA_LENGTH   (4 * KiB)
> >>   
> >>   /* Generic Hardware Error Source version 2 */
> >>   #define ACPI_GHES_SOURCE_GENERIC_ERROR_V2   10  
> 
> Thanks,
> Gavin
>

Re: [PATCH v3 2/8] acpi/ghes: Increase GHES raw data maximal length to 4KiB

Posted by Jonathan Cameron via 3 months, 1 week ago

On Wed,  5 Nov 2025 21:44:47 +1000
Gavin Shan <gshan@redhat.com> wrote:

> The current GHES raw data maximal length isn't enough for 16 consecutive
> CPER errors, which will be sent to a guest with 4KiB page size on a
> erroneous 64KiB host page. Note those 16 CPER errors will be contained
> in one single error block, meaning all CPER errors should be identical
> in terms of type and severity and all of them should be delivered in
> one shot.
> 
> Increase GHES raw data maximal length from 1KiB to 4KiB so that the
> error block has enough storage space for 16 consecutive CPER errors.
> 
> Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>