[PATCH] hw/scsi/esp: fix assertion error in fifo8_push

Zheng Huang posted 1 patch 5 months, 3 weeks ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/37889706-8576-476c-8fea-c1a3a2858b1e@gmail.com
Maintainers: Paolo Bonzini <pbonzini@redhat.com>, Fam Zheng <fam@euphon.net>
hw/scsi/esp.c | 6 ++++++
1 file changed, 6 insertions(+)
[PATCH] hw/scsi/esp: fix assertion error in fifo8_push
Posted by Zheng Huang 5 months, 3 weeks ago
This patch add validation checks on FIFO structures in esp_post_load() to
avoid assertion error `assert(fifo->num < fifo->capacity);` in fifo8_push(),
which can occur if the inbound migration stream is malformed. By performing 
these checks during post-load, we can catch and handle such issues earlier, 
avoiding crashes due to corrupted state.

Signed-off-by: Zheng Huang <hz1624917200@gmail.com>
---
 hw/scsi/esp.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c
index ac841dc32e..ba77017087 100644
--- a/hw/scsi/esp.c
+++ b/hw/scsi/esp.c
@@ -1350,11 +1350,17 @@ static int esp_post_load(void *opaque, int version_id)
         /* Migrate ti_buf to fifo */
         len = s->mig_ti_wptr - s->mig_ti_rptr;
         for (i = 0; i < len; i++) {
+            if (&s->fifo.num >= &s->fifo.capacity) {
+                return -1;
+            }
             fifo8_push(&s->fifo, s->mig_ti_buf[i]);
         }
 
         /* Migrate cmdbuf to cmdfifo */
         for (i = 0; i < s->mig_cmdlen; i++) {
+            if (&s->cmdfifo.num >= &s->cmdfifo.capacity) {
+                return -1;
+            }
             fifo8_push(&s->cmdfifo, s->mig_cmdbuf[i]);
         }
     }
-- 
2.34.1
Re: [PATCH] hw/scsi/esp: fix assertion error in fifo8_push
Posted by Philippe Mathieu-Daudé 5 months, 3 weeks ago
Hi,

Cc'ing maintainers:

$ ./scripts/get_maintainer.pl -f hw/scsi/esp.c
Paolo Bonzini <pbonzini@redhat.com> (supporter:SCSI)
Fam Zheng <fam@euphon.net> (reviewer:SCSI)
$ ./scripts/get_maintainer.pl -f migration/
Peter Xu <peterx@redhat.com> (maintainer:Migration)
Fabiano Rosas <farosas@suse.de> (maintainer:Migration)

On 27/5/25 15:12, Zheng Huang wrote:
> This patch add validation checks on FIFO structures in esp_post_load() to
> avoid assertion error `assert(fifo->num < fifo->capacity);` in fifo8_push(),
> which can occur if the inbound migration stream is malformed. By performing
> these checks during post-load, we can catch and handle such issues earlier,
> avoiding crashes due to corrupted state.

How can that happen? Can you share a reproducer?

> 
> Signed-off-by: Zheng Huang <hz1624917200@gmail.com>
> ---
>   hw/scsi/esp.c | 6 ++++++
>   1 file changed, 6 insertions(+)
> 
> diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c
> index ac841dc32e..ba77017087 100644
> --- a/hw/scsi/esp.c
> +++ b/hw/scsi/esp.c
> @@ -1350,11 +1350,17 @@ static int esp_post_load(void *opaque, int version_id)
>           /* Migrate ti_buf to fifo */
>           len = s->mig_ti_wptr - s->mig_ti_rptr;
>           for (i = 0; i < len; i++) {
> +            if (&s->fifo.num >= &s->fifo.capacity) {
> +                return -1;
> +            }
>               fifo8_push(&s->fifo, s->mig_ti_buf[i]);
>           }
>   
>           /* Migrate cmdbuf to cmdfifo */
>           for (i = 0; i < s->mig_cmdlen; i++) {
> +            if (&s->cmdfifo.num >= &s->cmdfifo.capacity) {
> +                return -1;
> +            }
>               fifo8_push(&s->cmdfifo, s->mig_cmdbuf[i]);
>           }
>       }
Re: [PATCH] hw/scsi/esp: fix assertion error in fifo8_push
Posted by Mark Cave-Ayland 5 months, 3 weeks ago
On 27/05/2025 14:59, Philippe Mathieu-Daudé wrote:

> Hi,
> 
> Cc'ing maintainers:
> 
> $ ./scripts/get_maintainer.pl -f hw/scsi/esp.c
> Paolo Bonzini <pbonzini@redhat.com> (supporter:SCSI)
> Fam Zheng <fam@euphon.net> (reviewer:SCSI)
> $ ./scripts/get_maintainer.pl -f migration/
> Peter Xu <peterx@redhat.com> (maintainer:Migration)
> Fabiano Rosas <farosas@suse.de> (maintainer:Migration)
> 
> On 27/5/25 15:12, Zheng Huang wrote:
>> This patch add validation checks on FIFO structures in esp_post_load() to
>> avoid assertion error `assert(fifo->num < fifo->capacity);` in fifo8_push(),
>> which can occur if the inbound migration stream is malformed. By performing
>> these checks during post-load, we can catch and handle such issues earlier,
>> avoiding crashes due to corrupted state.
> 
> How can that happen? Can you share a reproducer?
> 
>>
>> Signed-off-by: Zheng Huang <hz1624917200@gmail.com>
>> ---
>>   hw/scsi/esp.c | 6 ++++++
>>   1 file changed, 6 insertions(+)
>>
>> diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c
>> index ac841dc32e..ba77017087 100644
>> --- a/hw/scsi/esp.c
>> +++ b/hw/scsi/esp.c
>> @@ -1350,11 +1350,17 @@ static int esp_post_load(void *opaque, int version_id)
>>           /* Migrate ti_buf to fifo */
>>           len = s->mig_ti_wptr - s->mig_ti_rptr;
>>           for (i = 0; i < len; i++) {
>> +            if (&s->fifo.num >= &s->fifo.capacity) {
>> +                return -1;
>> +            }
>>               fifo8_push(&s->fifo, s->mig_ti_buf[i]);
>>           }
>>           /* Migrate cmdbuf to cmdfifo */
>>           for (i = 0; i < s->mig_cmdlen; i++) {
>> +            if (&s->cmdfifo.num >= &s->cmdfifo.capacity) {
>> +                return -1;
>> +            }
>>               fifo8_push(&s->cmdfifo, s->mig_cmdbuf[i]);
>>           }
>>       }

This seems odd: this logic in esp_post_load() is for converting from pre-Fifo8 code 
to the current Fifo8 code, so why wouldn't we want to assert() for the case when the 
migration stream is intentionally malformed? Is there a case whereby the old code 
could generate an invalid migration stream like this?


ATB,

Mark.


Re: [PATCH] hw/scsi/esp: fix assertion error in fifo8_push
Posted by Zheng Huang 5 months, 3 weeks ago
On 2025/5/28 03:40, Mark Cave-Ayland wrote:
> On 27/05/2025 14:59, Philippe Mathieu-Daudé wrote:
> 
>> Hi,
>>
>> Cc'ing maintainers:
>>
>> $ ./scripts/get_maintainer.pl -f hw/scsi/esp.c
>> Paolo Bonzini <pbonzini@redhat.com> (supporter:SCSI)
>> Fam Zheng <fam@euphon.net> (reviewer:SCSI)
>> $ ./scripts/get_maintainer.pl -f migration/
>> Peter Xu <peterx@redhat.com> (maintainer:Migration)
>> Fabiano Rosas <farosas@suse.de> (maintainer:Migration)
>>
>> On 27/5/25 15:12, Zheng Huang wrote:
>>> This patch add validation checks on FIFO structures in esp_post_load() to
>>> avoid assertion error `assert(fifo->num < fifo->capacity);` in fifo8_push(),
>>> which can occur if the inbound migration stream is malformed. By performing
>>> these checks during post-load, we can catch and handle such issues earlier,
>>> avoiding crashes due to corrupted state.
>>
>> How can that happen? Can you share a reproducer?
>>
>>>
>>> Signed-off-by: Zheng Huang <hz1624917200@gmail.com>
>>> ---
>>>   hw/scsi/esp.c | 6 ++++++
>>>   1 file changed, 6 insertions(+)
>>>
>>> diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c
>>> index ac841dc32e..ba77017087 100644
>>> --- a/hw/scsi/esp.c
>>> +++ b/hw/scsi/esp.c
>>> @@ -1350,11 +1350,17 @@ static int esp_post_load(void *opaque, int version_id)
>>>           /* Migrate ti_buf to fifo */
>>>           len = s->mig_ti_wptr - s->mig_ti_rptr;
>>>           for (i = 0; i < len; i++) {
>>> +            if (&s->fifo.num >= &s->fifo.capacity) {
>>> +                return -1;
>>> +            }
>>>               fifo8_push(&s->fifo, s->mig_ti_buf[i]);
>>>           }
>>>           /* Migrate cmdbuf to cmdfifo */
>>>           for (i = 0; i < s->mig_cmdlen; i++) {
>>> +            if (&s->cmdfifo.num >= &s->cmdfifo.capacity) {
>>> +                return -1;
>>> +            }
>>>               fifo8_push(&s->cmdfifo, s->mig_cmdbuf[i]);
>>>           }
>>>       }
> 
> This seems odd: this logic in esp_post_load() is for converting from pre-Fifo8 code to the current Fifo8 code, so why wouldn't we want to assert() for the case when the migration stream is intentionally malformed? Is there a case whereby the old code could generate an invalid migration stream like this?
> 
> 
> ATB,
> 
> Mark.
> 

Hi Mark,

The malformed migration stream in question originates from QEMU itself—either accidentally, due to 
a bug, or maliciously crafted. If we allow unchecked data through in esp_post_load(), an attacker
controlling the migration source could send crafted values that trigger undefined behavior.
The commit https://gitlab.com/qemu-project/qemu/-/commit/b88cfee90268cad376682da8f99ccf024d7aa304
also check the migration stream integrity in post_load handler, which is suggested by Peter Maydell in
https://lists.gnu.org/archive/html/qemu-devel/2024-07/msg00099.html, 'to prevent the division-by-zero
in the "malicious inbound migration state" case'.

Also, I would appreciate your opinion on how we should handle such "malformed migration stream" case
more generally, if there are more severe issues than assertion error, such as FPE, UAF, etc.? Should
QEMU adopt a more systematic “post_load” validation pattern—verifying all critical fields across every
migration handler—to harden the migration subsystem against any tampering of the migration image?

Best regards,

Zheng.


Re: [PATCH] hw/scsi/esp: fix assertion error in fifo8_push
Posted by Zheng Huang 5 months, 3 weeks ago
On 2025/5/28 03:40, Mark Cave-Ayland wrote:
> On 27/05/2025 14:59, Philippe Mathieu-Daudé wrote:
> 
>> Hi,
>>
>> Cc'ing maintainers:
>>
>> $ ./scripts/get_maintainer.pl -f hw/scsi/esp.c
>> Paolo Bonzini <pbonzini@redhat.com> (supporter:SCSI)
>> Fam Zheng <fam@euphon.net> (reviewer:SCSI)
>> $ ./scripts/get_maintainer.pl -f migration/
>> Peter Xu <peterx@redhat.com> (maintainer:Migration)
>> Fabiano Rosas <farosas@suse.de> (maintainer:Migration)
>>
>> On 27/5/25 15:12, Zheng Huang wrote:
>>> This patch add validation checks on FIFO structures in esp_post_load() to
>>> avoid assertion error `assert(fifo->num < fifo->capacity);` in fifo8_push(),
>>> which can occur if the inbound migration stream is malformed. By performing
>>> these checks during post-load, we can catch and handle such issues earlier,
>>> avoiding crashes due to corrupted state.
>>
>> How can that happen? Can you share a reproducer?
>>
>>>
>>> Signed-off-by: Zheng Huang <hz1624917200@gmail.com>
>>> ---
>>>   hw/scsi/esp.c | 6 ++++++
>>>   1 file changed, 6 insertions(+)
>>>
>>> diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c
>>> index ac841dc32e..ba77017087 100644
>>> --- a/hw/scsi/esp.c
>>> +++ b/hw/scsi/esp.c
>>> @@ -1350,11 +1350,17 @@ static int esp_post_load(void *opaque, int version_id)
>>>           /* Migrate ti_buf to fifo */
>>>           len = s->mig_ti_wptr - s->mig_ti_rptr;
>>>           for (i = 0; i < len; i++) {
>>> +            if (&s->fifo.num >= &s->fifo.capacity) {
>>> +                return -1;
>>> +            }
>>>               fifo8_push(&s->fifo, s->mig_ti_buf[i]);
>>>           }
>>>           /* Migrate cmdbuf to cmdfifo */
>>>           for (i = 0; i < s->mig_cmdlen; i++) {
>>> +            if (&s->cmdfifo.num >= &s->cmdfifo.capacity) {
>>> +                return -1;
>>> +            }
>>>               fifo8_push(&s->cmdfifo, s->mig_cmdbuf[i]);
>>>           }
>>>       }
> 
> This seems odd: this logic in esp_post_load() is for converting from pre-Fifo8 code to the current Fifo8 code, so why wouldn't we want to assert() for the case when the migration stream is intentionally malformed? Is there a case whereby the old code could generate an invalid migration stream like this?
> 
> 
> ATB,
> 
> Mark.
> 

Hi Mark,

The malformed migration stream in question originates from QEMU itself—either accidentally, due to 
a bug, or maliciously crafted. If we allow unchecked data through in esp_post_load(), an attacker
controlling the migration source could send crafted values that trigger undefined behavior.
The commit https://gitlab.com/qemu-project/qemu/-/commit/b88cfee90268cad376682da8f99ccf024d7aa304
also check the migration stream integrity in post_load handler, which is suggested by Peter Maydell in
https://lists.gnu.org/archive/html/qemu-devel/2024-07/msg00099.html, 'to prevent the division-by-zero
in the "malicious inbound migration state" case'.

Also, I would appreciate your opinion on how we should handle such "malformed migration stream" case
more generally, if there are more severe issues than assertion error, such as FPE, UAF, etc.? Should
QEMU adopt a more systematic “post_load” validation pattern—verifying all critical fields across every
migration handler—to harden the migration subsystem against any tampering of the migration image?


Best regards,

Zheng.


Re: [PATCH] hw/scsi/esp: fix assertion error in fifo8_push
Posted by Fabiano Rosas 5 months, 3 weeks ago
Zheng Huang <hz1624917200@gmail.com> writes:

> On 2025/5/28 03:40, Mark Cave-Ayland wrote:
>> On 27/05/2025 14:59, Philippe Mathieu-Daudé wrote:
>> 
>>> Hi,
>>>
>>> Cc'ing maintainers:
>>>
>>> $ ./scripts/get_maintainer.pl -f hw/scsi/esp.c
>>> Paolo Bonzini <pbonzini@redhat.com> (supporter:SCSI)
>>> Fam Zheng <fam@euphon.net> (reviewer:SCSI)
>>> $ ./scripts/get_maintainer.pl -f migration/
>>> Peter Xu <peterx@redhat.com> (maintainer:Migration)
>>> Fabiano Rosas <farosas@suse.de> (maintainer:Migration)
>>>
>>> On 27/5/25 15:12, Zheng Huang wrote:
>>>> This patch add validation checks on FIFO structures in esp_post_load() to
>>>> avoid assertion error `assert(fifo->num < fifo->capacity);` in fifo8_push(),
>>>> which can occur if the inbound migration stream is malformed. By performing
>>>> these checks during post-load, we can catch and handle such issues earlier,
>>>> avoiding crashes due to corrupted state.
>>>
>>> How can that happen? Can you share a reproducer?
>>>
>>>>
>>>> Signed-off-by: Zheng Huang <hz1624917200@gmail.com>
>>>> ---
>>>>   hw/scsi/esp.c | 6 ++++++
>>>>   1 file changed, 6 insertions(+)
>>>>
>>>> diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c
>>>> index ac841dc32e..ba77017087 100644
>>>> --- a/hw/scsi/esp.c
>>>> +++ b/hw/scsi/esp.c
>>>> @@ -1350,11 +1350,17 @@ static int esp_post_load(void *opaque, int version_id)
>>>>           /* Migrate ti_buf to fifo */
>>>>           len = s->mig_ti_wptr - s->mig_ti_rptr;
>>>>           for (i = 0; i < len; i++) {
>>>> +            if (&s->fifo.num >= &s->fifo.capacity) {
>>>> +                return -1;
>>>> +            }
>>>>               fifo8_push(&s->fifo, s->mig_ti_buf[i]);
>>>>           }
>>>>           /* Migrate cmdbuf to cmdfifo */
>>>>           for (i = 0; i < s->mig_cmdlen; i++) {
>>>> +            if (&s->cmdfifo.num >= &s->cmdfifo.capacity) {
>>>> +                return -1;
>>>> +            }
>>>>               fifo8_push(&s->cmdfifo, s->mig_cmdbuf[i]);
>>>>           }
>>>>       }
>> 
>> This seems odd: this logic in esp_post_load() is for converting from pre-Fifo8 code to the current Fifo8 code, so why wouldn't we want to assert() for the case when the migration stream is intentionally malformed? Is there a case whereby the old code could generate an invalid migration stream like this?
>> 
>> 
>> ATB,
>> 
>> Mark.
>> 
>
> Hi Mark,
>
> The malformed migration stream in question originates from QEMU itself—either accidentally, due to 
> a bug, or maliciously crafted. If we allow unchecked data through in esp_post_load(), an attacker
> controlling the migration source could send crafted values that trigger undefined behavior.
> The commit https://gitlab.com/qemu-project/qemu/-/commit/b88cfee90268cad376682da8f99ccf024d7aa304
> also check the migration stream integrity in post_load handler, which is suggested by Peter Maydell in
> https://lists.gnu.org/archive/html/qemu-devel/2024-07/msg00099.html, 'to prevent the division-by-zero
> in the "malicious inbound migration state" case'.
>
> Also, I would appreciate your opinion on how we should handle such "malformed migration stream" case
> more generally, if there are more severe issues than assertion error, such as FPE, UAF, etc.? Should
> QEMU adopt a more systematic “post_load” validation pattern—verifying all critical fields across every
> migration handler—to harden the migration subsystem against any tampering of the migration image?
>

From the migration perspective it does make sense to validate the values
and return error. The migration stream should indeed be considered
untrusted.

But I agree that it would be nice to have some sort of reproducer. I
don't think it's practical to go around adding code to handle every
single hypothetical scenario. That creates some churn in the codebase
that might introduce bugs by itself.
Re: [PATCH] hw/scsi/esp: fix assertion error in fifo8_push
Posted by Zheng Huang 5 months, 2 weeks ago

On 2025/5/28 21:07, Fabiano Rosas wrote:
> Zheng Huang <hz1624917200@gmail.com> writes:
> 
>> On 2025/5/28 03:40, Mark Cave-Ayland wrote:
>>> On 27/05/2025 14:59, Philippe Mathieu-Daudé wrote:
>>>
>>>> Hi,
>>>>
>>>> Cc'ing maintainers:
>>>>
>>>> $ ./scripts/get_maintainer.pl -f hw/scsi/esp.c
>>>> Paolo Bonzini <pbonzini@redhat.com> (supporter:SCSI)
>>>> Fam Zheng <fam@euphon.net> (reviewer:SCSI)
>>>> $ ./scripts/get_maintainer.pl -f migration/
>>>> Peter Xu <peterx@redhat.com> (maintainer:Migration)
>>>> Fabiano Rosas <farosas@suse.de> (maintainer:Migration)
>>>>
>>>> On 27/5/25 15:12, Zheng Huang wrote:
>>>>> This patch add validation checks on FIFO structures in esp_post_load() to
>>>>> avoid assertion error `assert(fifo->num < fifo->capacity);` in fifo8_push(),
>>>>> which can occur if the inbound migration stream is malformed. By performing
>>>>> these checks during post-load, we can catch and handle such issues earlier,
>>>>> avoiding crashes due to corrupted state.
>>>>
>>>> How can that happen? Can you share a reproducer?
>>>>
>>>>>
>>>>> Signed-off-by: Zheng Huang <hz1624917200@gmail.com>
>>>>> ---
>>>>>   hw/scsi/esp.c | 6 ++++++
>>>>>   1 file changed, 6 insertions(+)
>>>>>
>>>>> diff --git a/hw/scsi/esp.c b/hw/scsi/esp.c
>>>>> index ac841dc32e..ba77017087 100644
>>>>> --- a/hw/scsi/esp.c
>>>>> +++ b/hw/scsi/esp.c
>>>>> @@ -1350,11 +1350,17 @@ static int esp_post_load(void *opaque, int version_id)
>>>>>           /* Migrate ti_buf to fifo */
>>>>>           len = s->mig_ti_wptr - s->mig_ti_rptr;
>>>>>           for (i = 0; i < len; i++) {
>>>>> +            if (&s->fifo.num >= &s->fifo.capacity) {
>>>>> +                return -1;
>>>>> +            }
>>>>>               fifo8_push(&s->fifo, s->mig_ti_buf[i]);
>>>>>           }
>>>>>           /* Migrate cmdbuf to cmdfifo */
>>>>>           for (i = 0; i < s->mig_cmdlen; i++) {
>>>>> +            if (&s->cmdfifo.num >= &s->cmdfifo.capacity) {
>>>>> +                return -1;
>>>>> +            }
>>>>>               fifo8_push(&s->cmdfifo, s->mig_cmdbuf[i]);
>>>>>           }
>>>>>       }
>>>
>>> This seems odd: this logic in esp_post_load() is for converting from pre-Fifo8 code to the current Fifo8 code, so why wouldn't we want to assert() for the case when the migration stream is intentionally malformed? Is there a case whereby the old code could generate an invalid migration stream like this?
>>>
>>>
>>> ATB,
>>>
>>> Mark.
>>>
>>
>> Hi Mark,
>>
>> The malformed migration stream in question originates from QEMU itself—either accidentally, due to 
>> a bug, or maliciously crafted. If we allow unchecked data through in esp_post_load(), an attacker
>> controlling the migration source could send crafted values that trigger undefined behavior.
>> The commit https://gitlab.com/qemu-project/qemu/-/commit/b88cfee90268cad376682da8f99ccf024d7aa304
>> also check the migration stream integrity in post_load handler, which is suggested by Peter Maydell in
>> https://lists.gnu.org/archive/html/qemu-devel/2024-07/msg00099.html, 'to prevent the division-by-zero
>> in the "malicious inbound migration state" case'.
>>
>> Also, I would appreciate your opinion on how we should handle such "malformed migration stream" case
>> more generally, if there are more severe issues than assertion error, such as FPE, UAF, etc.? Should
>> QEMU adopt a more systematic “post_load” validation pattern—verifying all critical fields across every
>> migration handler—to harden the migration subsystem against any tampering of the migration image?
>>
> 
> From the migration perspective it does make sense to validate the values
> and return error. The migration stream should indeed be considered
> untrusted.
> 
> But I agree that it would be nice to have some sort of reproducer. I
> don't think it's practical to go around adding code to handle every
> single hypothetical scenario. That creates some churn in the codebase
> that might introduce bugs by itself.

Hi Fibiano,

I'm very appreciated for your advice. I also agree that there's no need
and inpractical to check all fields of each device. So I'm focusing on
security or critical issues such as buffer overflow, UAF, etc.. For
instance, unmatching between buffer size and item number field will
cause an arbitrary address write, which can be exploited to attack host
machine. CVE-2013-4536(https://nvd.nist.gov/vuln/detail/CVE-2013-4536)
describes this scenario.

I've developed a fuzzer aiming to discover security issues caused 
by malformed migration stream, which has reported some memory corruption
crashes. I'm wondering if you will be interested in such issues that 
are more severe than assertion error? I'll be glad to have some further
discussion.