[PATCH] gitlab: Disable io-raw-194 for build-tcg-disabled

Richard Henderson posted 1 patch 11 months ago
Posted by Richard Henderson 11 months ago
This test consistently fails on Azure cloud build hosts in
a way that suggests a timing problem in the test itself:

--- .../194.out
+++ .../194.out.bad
@@ -14,7 +14,6 @@
 {"return": {}}
 {"data": {"status": "setup"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
 {"data": {"status": "active"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
-{"data": {"status": "postcopy-active"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
 {"data": {"status": "completed"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
 Gracefully ending the `drive-mirror` job on source...
 {"return": {}}

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 .gitlab-ci.d/buildtest.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
index 0f1be14cb6..000062483f 100644
--- a/.gitlab-ci.d/buildtest.yml
+++ b/.gitlab-ci.d/buildtest.yml
@@ -236,7 +236,7 @@ build-tcg-disabled:
     - cd tests/qemu-iotests/
     - ./check -raw 001 002 003 004 005 008 009 010 011 012 021 025 032 033 048
             052 063 077 086 101 104 106 113 148 150 151 152 157 159 160 163
-            170 171 183 184 192 194 208 221 226 227 236 253 277 image-fleecing
+            170 171 183 184 192 208 221 226 227 236 253 277 image-fleecing
     - ./check -qcow2 028 051 056 057 058 065 068 082 085 091 095 096 102 122
             124 132 139 142 144 145 151 152 155 157 165 194 196 200 202
             208 209 216 218 227 234 246 247 248 250 254 255 257 258
-- 
2.34.1
Re: [PATCH] gitlab: Disable io-raw-194 for build-tcg-disabled
Posted by Vladimir Sementsov-Ogievskiy 11 months ago
On 06.06.23 19:25, Richard Henderson wrote:
> This test consistently fails on Azure cloud build hosts in
> a way that suggests a timing problem in the test itself:
> 
> --- .../194.out
> +++ .../194.out.bad
> @@ -14,7 +14,6 @@
>   {"return": {}}
>   {"data": {"status": "setup"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
>   {"data": {"status": "active"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
> -{"data": {"status": "postcopy-active"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
>   {"data": {"status": "completed"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
>   Gracefully ending the `drive-mirror` job on source...
>   {"return": {}}
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   .gitlab-ci.d/buildtest.yml | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
> index 0f1be14cb6..000062483f 100644
> --- a/.gitlab-ci.d/buildtest.yml
> +++ b/.gitlab-ci.d/buildtest.yml
> @@ -236,7 +236,7 @@ build-tcg-disabled:
>       - cd tests/qemu-iotests/
>       - ./check -raw 001 002 003 004 005 008 009 010 011 012 021 025 032 033 048
>               052 063 077 086 101 104 106 113 148 150 151 152 157 159 160 163
> -            170 171 183 184 192 194 208 221 226 227 236 253 277 image-fleecing
> +            170 171 183 184 192 208 221 226 227 236 253 277 image-fleecing
>       - ./check -qcow2 028 051 056 057 058 065 068 082 085 091 095 096 102 122
>               124 132 139 142 144 145 151 152 155 157 165 194 196 200 202
>               208 209 216 218 227 234 246 247 248 250 254 255 257 258


There is actually a bug in the test, I've sent a patch:

<20230607143606.1557395-1-vsementsov@yandex-team.ru>
[PATCH] iotests: fix 194: filter out racy postcopy-active event

-- 
Best regards,
Vladimir
Re: [PATCH] gitlab: Disable io-raw-194 for build-tcg-disabled
Posted by Stefan Hajnoczi 11 months ago
On Wed, 7 Jun 2023 at 10:39, Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
>
> On 06.06.23 19:25, Richard Henderson wrote:
> > This test consistently fails on Azure cloud build hosts in
> > a way that suggests a timing problem in the test itself:
> >
> > --- .../194.out
> > +++ .../194.out.bad
> > @@ -14,7 +14,6 @@
> >   {"return": {}}
> >   {"data": {"status": "setup"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
> >   {"data": {"status": "active"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
> > -{"data": {"status": "postcopy-active"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
> >   {"data": {"status": "completed"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
> >   Gracefully ending the `drive-mirror` job on source...
> >   {"return": {}}
> >
> > Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> > ---
> >   .gitlab-ci.d/buildtest.yml | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
> > index 0f1be14cb6..000062483f 100644
> > --- a/.gitlab-ci.d/buildtest.yml
> > +++ b/.gitlab-ci.d/buildtest.yml
> > @@ -236,7 +236,7 @@ build-tcg-disabled:
> >       - cd tests/qemu-iotests/
> >       - ./check -raw 001 002 003 004 005 008 009 010 011 012 021 025 032 033 048
> >               052 063 077 086 101 104 106 113 148 150 151 152 157 159 160 163
> > -            170 171 183 184 192 194 208 221 226 227 236 253 277 image-fleecing
> > +            170 171 183 184 192 208 221 226 227 236 253 277 image-fleecing
> >       - ./check -qcow2 028 051 056 057 058 065 068 082 085 091 095 096 102 122
> >               124 132 139 142 144 145 151 152 155 157 165 194 196 200 202
> >               208 209 216 218 227 234 246 247 248 250 254 255 257 258
>
>
> There is actually a bug in the test, I've sent a patch:
>
> <20230607143606.1557395-1-vsementsov@yandex-team.ru>
> [PATCH] iotests: fix 194: filter out racy postcopy-active event

Awesome, thank you!

Stefan
Re: [PATCH] gitlab: Disable io-raw-194 for build-tcg-disabled
Posted by Stefan Hajnoczi 11 months ago
The line of output that has changed was originally added by the
following commit:

commit ae00aa2398476824f0eca80461da215e7cdc1c3b
Author: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Date:   Fri May 22 01:06:46 2020 +0300

    iotests: 194: test also migration of dirty bitmap

Vladimir: Any idea why the postcopy-active event may not be emitted in
some cases?

Stefan

Re: [PATCH] gitlab: Disable io-raw-194 for build-tcg-disabled
Posted by Vladimir Sementsov-Ogievskiy 11 months ago
On 07.06.23 15:44, Stefan Hajnoczi wrote:
> The line of output that has changed was originally added by the
> following commit:
> 
> commit ae00aa2398476824f0eca80461da215e7cdc1c3b
> Author: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Date:   Fri May 22 01:06:46 2020 +0300
> 
>      iotests: 194: test also migration of dirty bitmap
> 
> Vladimir: Any idea why the postcopy-active event may not be emitted in
> some cases?
> 

I think:

With a fast connection and little postcopy data, postcopy never actually starts: everything is migrated during downtime, while neither source nor target is running.

The test isn't meant to exercise postcopy specifically; it just checks that the bitmaps get migrated somehow.


So, we need something like this:

diff --git a/tests/qemu-iotests/194 b/tests/qemu-iotests/194
index 68894371f5..c0ce82dd25 100755
--- a/tests/qemu-iotests/194
+++ b/tests/qemu-iotests/194
@@ -74,6 +74,11 @@ with iotests.FilePath('source.img') as source_img_path, \
  
      while True:
          event1 = source_vm.event_wait('MIGRATION')
+        if event1['data']['status'] == 'postcopy-active':
+            # This event is racy: it depends on whether postcopy really
+            # happens or the bitmap was migrated during downtime (leaving no
+            # data for the postcopy phase). So, don't log it.
+            continue
          iotests.log(event1, filters=[iotests.filter_qmp_event])
          if event1['data']['status'] in ('completed', 'failed'):
              iotests.log('Gracefully ending the `drive-mirror` job on source...')
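
A minimal, standalone sketch of the filtering idea in the hunk above. It does not use the real `iotests` module: `log_migration_events` and `make_events` are hypothetical helpers, and the event dictionaries are simplified stand-ins for QMP `MIGRATION` events. It only illustrates that the logged sequence becomes the same whether or not `postcopy-active` is emitted.

```python
def log_migration_events(events, log):
    """Log MIGRATION statuses until a terminal one, skipping the racy event.

    `events` is any iterable of event dicts; `log` is a callable that
    receives each status that should appear in the test output.
    """
    for event in events:
        status = event['data']['status']
        if status == 'postcopy-active':
            # Only emitted when a postcopy phase really takes place,
            # so it must not influence the logged output.
            continue
        log(status)
        if status in ('completed', 'failed'):
            return status
    return None


def make_events(statuses):
    """Build simplified stand-ins for QMP MIGRATION events (hypothetical)."""
    return [{'event': 'MIGRATION', 'data': {'status': s}} for s in statuses]


# Slow host: the postcopy phase is entered.
with_postcopy = []
log_migration_events(
    make_events(['setup', 'active', 'postcopy-active', 'completed']),
    with_postcopy.append)

# Fast host: everything is migrated during downtime.
without_postcopy = []
log_migration_events(
    make_events(['setup', 'active', 'completed']),
    without_postcopy.append)

print(with_postcopy == without_postcopy)
```

With the filter in place, both runs log the same statuses, so a single reference output can match either timing.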



-- 
Best regards,
Vladimir
Re: [PATCH] gitlab: Disable io-raw-194 for build-tcg-disabled
Posted by Philippe Mathieu-Daudé 11 months ago
On 6/6/23 18:25, Richard Henderson wrote:
> This test consistently fails on Azure cloud build hosts in
> a way that suggests a timing problem in the test itself:
> 
> --- .../194.out
> +++ .../194.out.bad
> @@ -14,7 +14,6 @@
>   {"return": {}}
>   {"data": {"status": "setup"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
>   {"data": {"status": "active"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
> -{"data": {"status": "postcopy-active"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}

Is it useful to modify 194.out.bad ...

>   {"data": {"status": "completed"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
>   Gracefully ending the `drive-mirror` job on source...
>   {"return": {}}
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   .gitlab-ci.d/buildtest.yml | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
> index 0f1be14cb6..000062483f 100644
> --- a/.gitlab-ci.d/buildtest.yml
> +++ b/.gitlab-ci.d/buildtest.yml
> @@ -236,7 +236,7 @@ build-tcg-disabled:
>       - cd tests/qemu-iotests/
>       - ./check -raw 001 002 003 004 005 008 009 010 011 012 021 025 032 033 048
>               052 063 077 086 101 104 106 113 148 150 151 152 157 159 160 163
> -            170 171 183 184 192 194 208 221 226 227 236 253 277 image-fleecing
> +            170 171 183 184 192 208 221 226 227 236 253 277 image-fleecing

... if we don't run test #194 anymore?

>       - ./check -qcow2 028 051 056 057 058 065 068 082 085 091 095 096 102 122
>               124 132 139 142 144 145 151 152 155 157 165 194 196 200 202
>               208 209 216 218 227 234 246 247 248 250 254 255 257 258
Re: [PATCH] gitlab: Disable io-raw-194 for build-tcg-disabled
Posted by Richard Henderson 11 months ago
On 6/6/23 12:24, Philippe Mathieu-Daudé wrote:
> On 6/6/23 18:25, Richard Henderson wrote:
>> This test consistently fails on Azure cloud build hosts in
>> a way that suggests a timing problem in the test itself:
>>
>> --- .../194.out
>> +++ .../194.out.bad
>> @@ -14,7 +14,6 @@
>>   {"return": {}}
>>   {"data": {"status": "setup"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
>>   {"data": {"status": "active"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
>> -{"data": {"status": "postcopy-active"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
> 
> Is it useful to modify 194.out.bad ...

This is not a patch, it's the testsuite output.
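
To illustrate the point, here is a rough sketch of how such output can arise: a unified diff between a reference output and the output a test run actually produced. It uses Python's standard `difflib`, not the real iotests `check` harness, and the shortened event lines are simplified stand-ins, so treat it as an assumption about the general mechanism rather than the harness's actual code.

```python
import difflib

# Simplified stand-in for the reference output (194.out).
expected = [
    '{"return": {}}',
    '{"data": {"status": "setup"}, "event": "MIGRATION"}',
    '{"data": {"status": "active"}, "event": "MIGRATION"}',
    '{"data": {"status": "postcopy-active"}, "event": "MIGRATION"}',
    '{"data": {"status": "completed"}, "event": "MIGRATION"}',
]

# What a fast host produced: the postcopy-active event never showed up.
actual = [line for line in expected if 'postcopy-active' not in line]

diff = list(difflib.unified_diff(
    expected, actual,
    fromfile='194.out', tofile='194.out.bad', lineterm=''))

# Lines removed relative to the reference, excluding the '---' file header.
removed = [l for l in diff if l.startswith('-') and not l.startswith('---')]
# Lines added relative to the reference, excluding the '+++' file header.
added = [l for l in diff if l.startswith('+') and not l.startswith('+++')]

print('\n'.join(diff))
```

The `-` line in such a diff marks output that was expected but missing from the run; nothing in the source tree is being modified.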


r~

> 
>>   {"data": {"status": "completed"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
>>   Gracefully ending the `drive-mirror` job on source...
>>   {"return": {}}
>>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>>   .gitlab-ci.d/buildtest.yml | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
>> index 0f1be14cb6..000062483f 100644
>> --- a/.gitlab-ci.d/buildtest.yml
>> +++ b/.gitlab-ci.d/buildtest.yml
>> @@ -236,7 +236,7 @@ build-tcg-disabled:
>>       - cd tests/qemu-iotests/
>>       - ./check -raw 001 002 003 004 005 008 009 010 011 012 021 025 032 033 048
>>               052 063 077 086 101 104 106 113 148 150 151 152 157 159 160 163
>> -            170 171 183 184 192 194 208 221 226 227 236 253 277 image-fleecing
>> +            170 171 183 184 192 208 221 226 227 236 253 277 image-fleecing
> 
> ... if we don't run test #194 anymore?
> 
>>       - ./check -qcow2 028 051 056 057 058 065 068 082 085 091 095 096 102 122
>>               124 132 139 142 144 145 151 152 155 157 165 194 196 200 202
>>               208 209 216 218 227 234 246 247 248 250 254 255 257 258
> 


Re: [PATCH] gitlab: Disable io-raw-194 for build-tcg-disabled
Posted by Thomas Huth 11 months ago
On 06/06/2023 18.25, Richard Henderson wrote:
> This test consistently fails on Azure cloud build hosts in
> a way that suggests a timing problem in the test itself:
> 
> --- .../194.out
> +++ .../194.out.bad
> @@ -14,7 +14,6 @@
>   {"return": {}}
>   {"data": {"status": "setup"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
>   {"data": {"status": "active"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
> -{"data": {"status": "postcopy-active"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
>   {"data": {"status": "completed"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
>   Gracefully ending the `drive-mirror` job on source...
>   {"return": {}}
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   .gitlab-ci.d/buildtest.yml | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
> index 0f1be14cb6..000062483f 100644
> --- a/.gitlab-ci.d/buildtest.yml
> +++ b/.gitlab-ci.d/buildtest.yml
> @@ -236,7 +236,7 @@ build-tcg-disabled:
>       - cd tests/qemu-iotests/
>       - ./check -raw 001 002 003 004 005 008 009 010 011 012 021 025 032 033 048
>               052 063 077 086 101 104 106 113 148 150 151 152 157 159 160 163
> -            170 171 183 184 192 194 208 221 226 227 236 253 277 image-fleecing
> +            170 171 183 184 192 208 221 226 227 236 253 277 image-fleecing
>       - ./check -qcow2 028 051 056 057 058 065 068 082 085 091 095 096 102 122
>               124 132 139 142 144 145 151 152 155 157 165 194 196 200 202
>               208 209 216 218 227 234 246 247 248 250 254 255 257 258

It's fair to remove it: the iotests that are run here were only added 
because they were tested to succeed on the shared runners. If they ran 
successfully everywhere, we would have added them to the "auto" group 
instead so that they'd run during "make check-block". So if something is 
failing on the private runners now, it's OK to remove it from this list.
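
As a rough illustration of the group distinction, the sketch below checks whether a test's source declares the "auto" group. It assumes that each iotest declares its groups in a `# group:` header comment; the helper names and the two sample headers are hypothetical and are not copied from real tests.

```python
GROUP_PREFIX = '# group:'


def parse_groups(test_source):
    """Return the groups named on the first '# group:' line, if any."""
    for line in test_source.splitlines():
        if line.startswith(GROUP_PREFIX):
            return line[len(GROUP_PREFIX):].split()
    return []


def runs_in_check_block(test_source):
    """True if the test is in the 'auto' group (run by 'make check-block')."""
    return 'auto' in parse_groups(test_source)


# Hypothetical headers, for illustration only.
everywhere_test = '#!/usr/bin/env python3\n# group: rw auto quick\n'
ci_list_only_test = '#!/usr/bin/env python3\n# group: rw migration quick\n'

print(runs_in_check_block(everywhere_test))
print(runs_in_check_block(ci_list_only_test))
```

A test outside "auto" is only exercised in CI when a job names it explicitly, which is why dropping it from the job's list is a contained change.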

Reviewed-by: Thomas Huth <thuth@redhat.com>