[PATCH v4 4/6] iotests: 194: test also migration of dirty bitmap

Vladimir Sementsov-Ogievskiy posted 6 patches 5 years, 8 months ago
Maintainers: Eric Blake <eblake@redhat.com>, Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>, Max Reitz <mreitz@redhat.com>, John Snow <jsnow@redhat.com>, Kevin Wolf <kwolf@redhat.com>, Juan Quintela <quintela@redhat.com>, Stefan Hajnoczi <stefanha@redhat.com>, "Dr. David Alan Gilbert" <dgilbert@redhat.com>, Fam Zheng <fam@euphon.net>
[PATCH v4 4/6] iotests: 194: test also migration of dirty bitmap
Posted by Vladimir Sementsov-Ogievskiy 5 years, 8 months ago
Test that dirty bitmap migration works when we deal with mirror.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
---
 tests/qemu-iotests/194     | 14 ++++++++++----
 tests/qemu-iotests/194.out |  6 ++++++
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/tests/qemu-iotests/194 b/tests/qemu-iotests/194
index 8b1f720af4..3fad7c6c1a 100755
--- a/tests/qemu-iotests/194
+++ b/tests/qemu-iotests/194
@@ -42,6 +42,8 @@ with iotests.FilePath('source.img') as source_img_path, \
             .add_incoming('unix:{0}'.format(migration_sock_path))
             .launch())
 
+    source_vm.qmp_log('block-dirty-bitmap-add', node='drive0', name='bitmap0')
+
     iotests.log('Launching NBD server on destination...')
     iotests.log(dest_vm.qmp('nbd-server-start', addr={'type': 'unix', 'data': {'path': nbd_sock_path}}))
     iotests.log(dest_vm.qmp('nbd-server-add', device='drive0', writable=True))
@@ -61,12 +63,14 @@ with iotests.FilePath('source.img') as source_img_path, \
                 filters=[iotests.filter_qmp_event])
 
     iotests.log('Starting migration...')
-    source_vm.qmp('migrate-set-capabilities',
-                  capabilities=[{'capability': 'events', 'state': True}])
-    dest_vm.qmp('migrate-set-capabilities',
-                capabilities=[{'capability': 'events', 'state': True}])
+    capabilities = [{'capability': 'events', 'state': True},
+                    {'capability': 'dirty-bitmaps', 'state': True}]
+    source_vm.qmp('migrate-set-capabilities', capabilities=capabilities)
+    dest_vm.qmp('migrate-set-capabilities', capabilities=capabilities)
     iotests.log(source_vm.qmp('migrate', uri='unix:{0}'.format(migration_sock_path)))
 
+    source_vm.qmp_log('migrate-start-postcopy')
+
     while True:
         event1 = source_vm.event_wait('MIGRATION')
         iotests.log(event1, filters=[iotests.filter_qmp_event])
@@ -82,3 +86,5 @@ with iotests.FilePath('source.img') as source_img_path, \
             iotests.log('Stopping the NBD server on destination...')
             iotests.log(dest_vm.qmp('nbd-server-stop'))
             break
+
+    iotests.log(source_vm.qmp('query-block')['return'][0]['dirty-bitmaps'])
diff --git a/tests/qemu-iotests/194.out b/tests/qemu-iotests/194.out
index 71857853fb..dd60dcc14f 100644
--- a/tests/qemu-iotests/194.out
+++ b/tests/qemu-iotests/194.out
@@ -1,4 +1,6 @@
 Launching VMs...
+{"execute": "block-dirty-bitmap-add", "arguments": {"name": "bitmap0", "node": "drive0"}}
+{"return": {}}
 Launching NBD server on destination...
 {"return": {}}
 {"return": {}}
@@ -8,11 +10,15 @@ Waiting for `drive-mirror` to complete...
 {"data": {"device": "mirror-job0", "len": 1073741824, "offset": 1073741824, "speed": 0, "type": "mirror"}, "event": "BLOCK_JOB_READY", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
 Starting migration...
 {"return": {}}
+{"execute": "migrate-start-postcopy", "arguments": {}}
+{"return": {}}
 {"data": {"status": "setup"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
 {"data": {"status": "active"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
+{"data": {"status": "postcopy-active"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
 {"data": {"status": "completed"}, "event": "MIGRATION", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
 Gracefully ending the `drive-mirror` job on source...
 {"return": {}}
 {"data": {"device": "mirror-job0", "len": 1073741824, "offset": 1073741824, "speed": 0, "type": "mirror"}, "event": "BLOCK_JOB_COMPLETED", "timestamp": {"microseconds": "USECS", "seconds": "SECS"}}
 Stopping the NBD server on destination...
 {"return": {}}
+[{"busy": false, "count": 0, "granularity": 65536, "name": "bitmap0", "persistent": false, "recording": true, "status": "active"}]
-- 
2.21.0


Re: [PATCH v4 4/6] iotests: 194: test also migration of dirty bitmap
Posted by Thomas Huth 5 years, 8 months ago
On 22/05/2020 00.06, Vladimir Sementsov-Ogievskiy wrote:
> Test that dirty bitmap migration works when we deal with mirror.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> ---
>  tests/qemu-iotests/194     | 14 ++++++++++----
>  tests/qemu-iotests/194.out |  6 ++++++
>  2 files changed, 16 insertions(+), 4 deletions(-)

 Hi!

This test broke the iotest in the gitlab CI:

 https://gitlab.com/huth/qemu/-/jobs/578520599#L3780

it works again when I revert this commit.

Could the test be reworked so that it works in CI pipelines, too?
Otherwise, I think it's best if we disable it in the .gitlab-ci.yml file...

 Thomas


Re: [PATCH v4 4/6] iotests: 194: test also migration of dirty bitmap
Posted by Vladimir Sementsov-Ogievskiy 5 years, 8 months ago
03.06.2020 10:52, Thomas Huth wrote:
> On 22/05/2020 00.06, Vladimir Sementsov-Ogievskiy wrote:
>> Test that dirty bitmap migration works when we deal with mirror.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> Reviewed-by: Eric Blake <eblake@redhat.com>
>> ---
>>   tests/qemu-iotests/194     | 14 ++++++++++----
>>   tests/qemu-iotests/194.out |  6 ++++++
>>   2 files changed, 16 insertions(+), 4 deletions(-)
> 
>   Hi!
> 
> This test broke the iotest in the gitlab CI:
> 
>   https://gitlab.com/huth/qemu/-/jobs/578520599#L3780
> 
> it works again when I revert this commit.
> 
> Could the test be reworked so that it works in CI pipelines, too?
> Otherwise, I think it's best if we disable it in the .gitlab-ci.yml file...
> 
>   Thomas
> 

Hi!

from the link you provided:

...

../configure --cc=clang --enable-werror --disable-tcg --audio-drv-list=""

...


194      fail       [06:31:15] [06:31:16]                    output mismatch (see 194.out.bad)
--- /builds/huth/qemu/tests/qemu-iotests/194.out	2020-06-03 06:18:10.000000000 +0000
+++ /builds/huth/qemu/build/tests/qemu-iotests/194.out.bad	2020-06-03 06:31:16.000000000 +0000
@@ -22,3 +22,4 @@
  Stopping the NBD server on destination...
  {"return": {}}
  [{"busy": false, "count": 0, "granularity": 65536, "name": "bitmap0", "persistent": false, "recording": true, "status": "active"}]
+WARNING:qemu.machine:qemu received signal 6: /builds/huth/qemu/build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64 -display none -vga none -chardev socket,id=mon,path=/tmp/tmp.e9BvOjVl1W/qemudest-15756-monitor.sock -mon chardev=mon,mode=control -qtest unix:path=/tmp/tmp.e9BvOjVl1W/qemudest-15756-qtest.sock -accel qtest -nodefaults -display none -accel qtest -drive if=virtio,id=drive0,file=/builds/huth/qemu/build/tests/qemu-iotests/scratch/15756-dest.img,format=raw,cache=writeback,aio=threads -incoming unix:/tmp/tmp.e9BvOjVl1W/15756-migration.sock



- Qemu aborted. Not good. Definitely is better to fix it than just exclude the test.. I can't reproduce. Could you provide backtrace from coredump?


-- 
Best regards,
Vladimir

Re: [PATCH v4 4/6] iotests: 194: test also migration of dirty bitmap
Posted by Thomas Huth 5 years, 8 months ago
On 03/06/2020 10.06, Vladimir Sementsov-Ogievskiy wrote:
> 03.06.2020 10:52, Thomas Huth wrote:
>> On 22/05/2020 00.06, Vladimir Sementsov-Ogievskiy wrote:
>>> Test that dirty bitmap migration works when we deal with mirror.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> Reviewed-by: Eric Blake <eblake@redhat.com>
>>> ---
>>>   tests/qemu-iotests/194     | 14 ++++++++++----
>>>   tests/qemu-iotests/194.out |  6 ++++++
>>>   2 files changed, 16 insertions(+), 4 deletions(-)
>>
>>   Hi!
>>
>> This test broke the iotest in the gitlab CI:
>>
>>   https://gitlab.com/huth/qemu/-/jobs/578520599#L3780
>>
>> it works again when I revert this commit.
>>
>> Could the test be reworked so that it works in CI pipelines, too?
>> Otherwise, I think it's best if we disable it in the .gitlab-ci.yml
>> file...
[...]
> - Qemu aborted. Not good. Definitely is better to fix it than just
> exclude the test.. I can't reproduce. Could you provide backtrace from
> coredump?

It aborted in block/dirty-bitmap.c, line 295, that's the
"assert(!bdrv_dirty_bitmap_busy(bitmap));" if I got it right.

Full backtrace here:

 https://gitlab.com/huth/qemu/-/jobs/580553686#L3638

HTH,
 Thomas


Re: [PATCH v4 4/6] iotests: 194: test also migration of dirty bitmap
Posted by Vladimir Sementsov-Ogievskiy 5 years, 8 months ago
04.06.2020 10:21, Thomas Huth wrote:
> On 03/06/2020 10.06, Vladimir Sementsov-Ogievskiy wrote:
>> 03.06.2020 10:52, Thomas Huth wrote:
>>> On 22/05/2020 00.06, Vladimir Sementsov-Ogievskiy wrote:
>>>> Test that dirty bitmap migration works when we deal with mirror.
>>>>
>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>> Reviewed-by: Eric Blake <eblake@redhat.com>
>>>> ---
>>>>    tests/qemu-iotests/194     | 14 ++++++++++----
>>>>    tests/qemu-iotests/194.out |  6 ++++++
>>>>    2 files changed, 16 insertions(+), 4 deletions(-)
>>>
>>>    Hi!
>>>
>>> This test broke the iotest in the gitlab CI:
>>>
>>>    https://gitlab.com/huth/qemu/-/jobs/578520599#L3780
>>>
>>> it works again when I revert this commit.
>>>
>>> Could the test be reworked so that it works in CI pipelines, too?
>>> Otherwise, I think it's best if we disable it in the .gitlab-ci.yml
>>> file...
> [...]
>> - Qemu aborted. Not good. Definitely is better to fix it than just
>> exclude the test.. I can't reproduce. Could you provide backtrace from
>> coredump?
> 
> It aborted in block/dirty-bitmap.c, line 295, that's the
> "assert(!bdrv_dirty_bitmap_busy(bitmap));" if I got it right.
> 
> Full backtrace here:
> 
>   https://gitlab.com/huth/qemu/-/jobs/580553686#L3638
> 

Aha, missed it, thanks.

Hm. in 194 iotest we wait for migration finish on source, but not on target. I can assume, that in your setup target shutdown occurs earlier than migration finish, so we have busy bitmap on shutdown.
It's known bug (at least for me :), patch is in list:
[PATCH v2 00/22] Fix error handling during bitmap postcopy
   [..]
   [PATCH v2 10/22] migration/block-dirty-bitmap: cancel migration on shutdown  (https://lists.gnu.org/archive/html/qemu-devel/2020-02/msg04288.html)
      If target is turned of prior to postcopy finished, target crashes
      because busy bitmaps are found at shutdown.
      Canceling incoming migration helps, as it removes all unfinished (and
      therefore busy) bitmaps.
   [..]


Still, of course iotest 194 should be fixed too, to wait for migration finish on target too, I'll send a patch.

-- 
Best regards,
Vladimir

Re: [PATCH v4 4/6] iotests: 194: test also migration of dirty bitmap
Posted by Thomas Huth 5 years, 8 months ago
On 04/06/2020 09.51, Vladimir Sementsov-Ogievskiy wrote:
> 04.06.2020 10:21, Thomas Huth wrote:
>> On 03/06/2020 10.06, Vladimir Sementsov-Ogievskiy wrote:
>>> 03.06.2020 10:52, Thomas Huth wrote:
>>>> On 22/05/2020 00.06, Vladimir Sementsov-Ogievskiy wrote:
>>>>> Test that dirty bitmap migration works when we deal with mirror.
>>>>>
>>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>>> Reviewed-by: Eric Blake <eblake@redhat.com>
>>>>> ---
>>>>>    tests/qemu-iotests/194     | 14 ++++++++++----
>>>>>    tests/qemu-iotests/194.out |  6 ++++++
>>>>>    2 files changed, 16 insertions(+), 4 deletions(-)
>>>>
>>>>    Hi!
>>>>
>>>> This test broke the iotest in the gitlab CI:
>>>>
>>>>    https://gitlab.com/huth/qemu/-/jobs/578520599#L3780
>>>>
>>>> it works again when I revert this commit.
>>>>
>>>> Could the test be reworked so that it works in CI pipelines, too?
>>>> Otherwise, I think it's best if we disable it in the .gitlab-ci.yml
>>>> file...
>> [...]
>>> - Qemu aborted. Not good. Definitely is better to fix it than just
>>> exclude the test.. I can't reproduce. Could you provide backtrace from
>>> coredump?
>>
>> It aborted in block/dirty-bitmap.c, line 295, that's the
>> "assert(!bdrv_dirty_bitmap_busy(bitmap));" if I got it right.
>>
>> Full backtrace here:
>>
>>   https://gitlab.com/huth/qemu/-/jobs/580553686#L3638
>>
> 
> Aha, missed it, thanks.

No, you didn't miss it. That was a new run where I added some more
debugging magic to the gitlab-ci.yml file:

build-tcg-disabled:
 [...]
 - make -j"$JOBS"
 - sysctl -w kernel.core_pattern=/tmp/core
 - ulimit -c
 - cd tests/qemu-iotests/
 - ./check -raw 194 || find /tmp -name "*core*"
 - gdb -ex bt ../../x86_64-softmmu/qemu-system-x86_64 /tmp/core*

> Hm. in 194 iotest we wait for migration finish on source, but not on
> target. I can assume, that in your setup target shutdown occurs earlier
> than migration finish, so we have busy bitmap on shutdown.

Sounds plausible. Especially if you consider that these gitlab CI
containers only run with one virtual CPU, so there is likely more delay
here.

> It's known bug (at least for me :), patch is in list:
> [PATCH v2 00/22] Fix error handling during bitmap postcopy
>   [..]
>   [PATCH v2 10/22] migration/block-dirty-bitmap: cancel migration on
> shutdown 
> (https://lists.gnu.org/archive/html/qemu-devel/2020-02/msg04288.html)
>      If target is turned of prior to postcopy finished, target crashes
>      because busy bitmaps are found at shutdown.
>      Canceling incoming migration helps, as it removes all unfinished (and
>      therefore busy) bitmaps.
>   [..]
> 
> 
> Still, of course iotest 194 should be fixed too, to wait for migration
> finish on target too, I'll send a patch.

Thanks, I'll give it a try!

 Thomas