[PATCH] tests/qtest/boot-sector: Check that the guest did not panic

Thomas Huth posted 1 patch 4 years, 9 months ago
Test checkpatch passed
Patches applied successfully
git fetch https://github.com/patchew-project/qemu tags/patchew/20210212113141.854871-1-thuth@redhat.com
Maintainers: Paolo Bonzini <pbonzini@redhat.com>, Thomas Huth <thuth@redhat.com>, Laurent Vivier <lvivier@redhat.com>
[PATCH] tests/qtest/boot-sector: Check that the guest did not panic
Posted by Thomas Huth 4 years, 9 months ago
The s390-ccw bios code panics if it cannot boot successfully. In
this case, it does not make sense to wait the full 600 seconds for
the boot sector test to finish, since we can signal the failure
immediately. Thus let's check the status of the guest with the
"query-status" QMP command here, too.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
---
 tests/qtest/boot-sector.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/tests/qtest/boot-sector.c b/tests/qtest/boot-sector.c
index 24df5c4734..ea8f264661 100644
--- a/tests/qtest/boot-sector.c
+++ b/tests/qtest/boot-sector.c
@@ -138,6 +138,7 @@ void boot_sector_test(QTestState *qts)
     uint8_t signature_low;
     uint8_t signature_high;
     uint16_t signature;
+    QDict *qrsp, *qret;
     int i;
 
     /* Wait at most 600 seconds (test is slow with TCI and --enable-debug) */
@@ -155,6 +156,14 @@ void boot_sector_test(QTestState *qts)
         if (signature == SIGNATURE) {
             break;
         }
+
+        /* check that guest is still in "running" state and did not panic */
+        qrsp = qtest_qmp(qts, "{ 'execute': 'query-status' }");
+        qret = qdict_get_qdict(qrsp, "return");
+        g_assert_nonnull(qret);
+        g_assert_cmpstr(qdict_get_try_str(qret, "status"), ==, "running");
+        qobject_unref(qrsp);
+
         g_usleep(TEST_DELAY);
     }
 
-- 
2.27.0
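
The control flow added by the patch above can be sketched outside of C as follows. This is only an illustrative Python sketch, not the qtest code: `wait_for_signature`, `GuestNotRunning`, and the two callables passed in are hypothetical names standing in for the guest-memory read and the "query-status" QMP command.

```python
import time


class GuestNotRunning(Exception):
    """Raised when query-status reports a state other than "running"."""


def wait_for_signature(read_signature, query_status, expected,
                       attempts=600, delay=0.0):
    """Poll until read_signature() returns expected; fail fast on a bad state.

    read_signature and query_status are hypothetical callables that stand
    in for the guest-memory read and the QMP "query-status" command.
    """
    for _ in range(attempts):
        if read_signature() == expected:
            return True
        # Mirror the patch: look at the "status" field inside "return".
        reply = query_status()
        status = reply.get("return", {}).get("status")
        if status != "running":
            raise GuestNotRunning(f"guest state is {status!r}")
        time.sleep(delay)
    # All attempts used up without seeing the signature.
    return False
```

With a guest that panics, the loop stops on the first status check instead of sleeping through every remaining attempt.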


Re: [PATCH] tests/qtest/boot-sector: Check that the guest did not panic
Posted by Philippe Mathieu-Daudé 4 years, 9 months ago
On 2/12/21 12:31 PM, Thomas Huth wrote:
> The s390-ccw bios code panics if it can not boot successfully. In
> this case, it does not make sense that we wait the full 600 seconds
> for the boot sector test to finish and can signal the failure
> immediately, thus let's check the status of the guest with the
> "query-status" QMP command here, too.
> 
> Reported-by: Peter Maydell <peter.maydell@linaro.org>
> Signed-off-by: Thomas Huth <thuth@redhat.com>
> ---
>  tests/qtest/boot-sector.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/tests/qtest/boot-sector.c b/tests/qtest/boot-sector.c
> index 24df5c4734..ea8f264661 100644
> --- a/tests/qtest/boot-sector.c
> +++ b/tests/qtest/boot-sector.c
> @@ -138,6 +138,7 @@ void boot_sector_test(QTestState *qts)
>      uint8_t signature_low;
>      uint8_t signature_high;
>      uint16_t signature;
> +    QDict *qrsp, *qret;
>      int i;
>  
>      /* Wait at most 600 seconds (test is slow with TCI and --enable-debug) */
> @@ -155,6 +156,14 @@ void boot_sector_test(QTestState *qts)
>          if (signature == SIGNATURE) {
>              break;
>          }
> +
> +        /* check that guest is still in "running" state and did not panic */
> +        qrsp = qtest_qmp(qts, "{ 'execute': 'query-status' }");
> +        qret = qdict_get_qdict(qrsp, "return");
> +        g_assert_nonnull(qret);
> +        g_assert_cmpstr(qdict_get_try_str(qret, "status"), ==, "running");

Interesting idea. Does it make sense to have a similar (optional?) check
done in QEMUMachine? This could benefit integration tests, quicker exit
on failure.

> +        qobject_unref(qrsp);
> +
>          g_usleep(TEST_DELAY);
>      }
>  
> 


Re: [PATCH] tests/qtest/boot-sector: Check that the guest did not panic
Posted by Thomas Huth 4 years, 9 months ago
On 12/02/2021 14.18, Philippe Mathieu-Daudé wrote:
> Interesting idea. Does it make sense to have a similar (optional?) check
> done in QEMUMachine? This could benefit integration tests, quicker exit
> on failure.

Well, it only makes sense in cases where the guest triggers a panic event.
That's what the s390-ccw bios does, but other firmware does *not* panic
when it cannot boot the guest.

It might also be useful for the acceptance tests if they can trigger a panic
event, but I think we already check for "Kernel panic" in the console
output in most cases, so I guess that's enough already?

  Thomas
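
The console check mentioned in the message above amounts to scanning the guest's output for a success pattern while treating a failure pattern as an immediate error. A minimal sketch of that idea, assuming the console is available as an iterable of text lines; `scan_console` is a hypothetical helper, not the function the acceptance tests actually use:

```python
def scan_console(lines, success, failure="Kernel panic"):
    """Return True once `success` appears; raise if `failure` appears first.

    `lines` is any iterable of console output lines (an assumption made
    for this sketch; a real test would read from the console socket).
    """
    for line in lines:
        if success in line:
            return True
        if failure in line:
            raise RuntimeError(f"failure pattern found: {line.strip()}")
    # The output ended without either pattern showing up.
    return False
```

This only helps for guests that print something recognizable on failure, which is why a state check via QMP is the better fit for firmware that signals a panic event.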


Re: [PATCH] tests/qtest/boot-sector: Check that the guest did not panic
Posted by John Snow 4 years, 9 months ago
On 2/12/21 8:18 AM, Philippe Mathieu-Daudé wrote:
> Interesting idea. Does it make sense to have a similar (optional?) check
> done in QEMUMachine? This could benefit integration tests, quicker exit
> on failure.
> 

That might be the wrong layer to do it in. I am trying to keep
QEMUMachine about mechanism, not policy. Not all QEMUMachine
instances even have a QMP socket, either.

Having shared code (somewhere) that lets you do things like issue a
query-status every second, as a more proactive on-demand heartbeat
check, is probably a good idea, though:

e.g.

vm = QEMUManagedMachine(...)
with vm.start_heartbeat() as heartbeat:
    # ... do things prone to failure here ...
    ...
# as of here, the heartbeat has been stopped


It might be worth looking into creating a "value-added" version of 
QEMUMachine that offers stuff like this, in a manner similar to how 
iotests has its own extended versions of the QEMUMachine to offer 
test-specific behavior.

(Patches welcome!)

--js
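
One way the heartbeat idea from the message above could look is sketched below. It is only a sketch under stated assumptions: `heartbeat` is a hypothetical standalone context manager (not an existing QEMUMachine method), and `query_status` is a hypothetical callable that returns the guest's run state as a string, for example by wrapping a "query-status" QMP call.

```python
import threading
from contextlib import contextmanager


@contextmanager
def heartbeat(query_status, interval=1.0):
    """Poll query_status() in the background while the with-body runs.

    query_status is a hypothetical callable returning the run state as a
    string such as "running". The list yielded to the body collects the
    first non-running state that the background thread observes.
    """
    stop = threading.Event()
    failures = []

    def beat():
        # Event.wait() returns False on timeout and True once stop is set.
        while not stop.wait(interval):
            status = query_status()
            if status != "running":
                failures.append(status)
                return

    worker = threading.Thread(target=beat, daemon=True)
    worker.start()
    try:
        yield failures
    finally:
        stop.set()
        worker.join()
    # Reached only when the body finished without raising.
    if failures:
        raise RuntimeError(f"guest left the running state: {failures[0]}")
```

Note that this version does not interrupt the body; it reports the problem when the `with` block exits, and the body can inspect the yielded list earlier if it wants to bail out sooner.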