[Qemu-devel] [PATCH] iotests: fix 185

Vladimir Sementsov-Ogievskiy posted 1 patch 6 years, 7 months ago
Failed in applying to current master (apply log)
There is a newer version of this series
tests/qemu-iotests/185     | 2 +-
tests/qemu-iotests/185.out | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
[Qemu-devel] [PATCH] iotests: fix 185
Posted by Vladimir Sementsov-Ogievskiy 6 years, 7 months ago
185 iotest is broken.

How to test:
> i=0; while ./check -qcow2 -nocache 185; do ((i+=1)); echo N = $i; \
  done; echo N = $i

finished for me like this:

185 2s ... - output mismatch (see 185.out.bad)
--- /work/src/qemu/master/tests/qemu-iotests/185.out    2017-07-14 \
    15:14:29.520343805 +0300
+++ 185.out.bad 2017-08-07 16:51:02.231922900 +0300
@@ -37,7 +37,7 @@
 {"return": {}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, \
     "event": "SHUTDOWN", "data": {"guest": false}}
-{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, \
    "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", \
        "len": 4194304, "offset": 4194304, "speed": 65536, "type": \
            "mirror"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, \
    "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", \
        "len": 0, "offset": 0, "speed": 65536, "type": "mirror"}}

 === Start backup job and exit qemu ===

Failures: 185
Failed 1 of 1 tests
N = 8

It doesn't seems logical to expect the constant offset on cancel,
so let filter it out.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 tests/qemu-iotests/185     | 2 +-
 tests/qemu-iotests/185.out | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/185 b/tests/qemu-iotests/185
index 0eda371f27..446d78447e 100755
--- a/tests/qemu-iotests/185
+++ b/tests/qemu-iotests/185
@@ -157,7 +157,7 @@ _send_qemu_cmd $h \
     "return"
 
 _send_qemu_cmd $h "{ 'execute': 'quit' }" "return"
-wait=1 _cleanup_qemu
+wait=1 _cleanup_qemu | _filter_block_job_offset
 
 echo
 echo === Start backup job and exit qemu ===
diff --git a/tests/qemu-iotests/185.out b/tests/qemu-iotests/185.out
index 57eaf8d699..f51af627fe 100644
--- a/tests/qemu-iotests/185.out
+++ b/tests/qemu-iotests/185.out
@@ -37,7 +37,7 @@ Formatting 'TEST_DIR/t.qcow2.copy', fmt=qcow2 size=67108864 cluster_size=65536 l
 {"return": {}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "SHUTDOWN", "data": {"guest": false}}
-{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", "len": 4194304, "offset": 4194304, "speed": 65536, "type": "mirror"}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", "len": 4194304, "offset": OFFSET, "speed": 65536, "type": "mirror"}}
 
 === Start backup job and exit qemu ===
 
-- 
2.11.1


Re: [Qemu-devel] [PATCH for-2.10] iotests: fix 185
Posted by Eric Blake 6 years, 7 months ago
On 08/07/2017 09:16 AM, Vladimir Sementsov-Ogievskiy wrote:
> 185 iotest is broken.
> 
> How to test:
>> i=0; while ./check -qcow2 -nocache 185; do ((i+=1)); echo N = $i; \
>   done; echo N = $i
> 
> finished for me like this:
> 
> 185 2s ... - output mismatch (see 185.out.bad)
> --- /work/src/qemu/master/tests/qemu-iotests/185.out    2017-07-14 \
>     15:14:29.520343805 +0300
> +++ 185.out.bad 2017-08-07 16:51:02.231922900 +0300
> @@ -37,7 +37,7 @@
>  {"return": {}}
>  {"return": {}}
>  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, \
>      "event": "SHUTDOWN", "data": {"guest": false}}
> -{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, \
>     "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", \
>         "len": 4194304, "offset": 4194304, "speed": 65536, "type": \
>             "mirror"}}
> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, \
>     "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", \
>         "len": 0, "offset": 0, "speed": 65536, "type": "mirror"}}

This diff says both 'len' and 'offset' differ...

> 
>  === Start backup job and exit qemu ===
> 
> Failures: 185
> Failed 1 of 1 tests
> N = 8
> 
> It doesn't seems logical to expect the constant offset on cancel,
> so let filter it out.

s/let/let's/

> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  tests/qemu-iotests/185     | 2 +-
>  tests/qemu-iotests/185.out | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)

> -{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", "len": 4194304, "offset": 4194304, "speed": 65536, "type": "mirror"}}
> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", "len": 4194304, "offset": OFFSET, "speed": 65536, "type": "mirror"}}

...while this diff only touched 'offset'.  Did you copy-and-paste
incorrectly in the commit message?  If so, then with the commit message
fixed, I'm okay with:
Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [Qemu-devel] [PATCH for-2.10] iotests: fix 185
Posted by Vladimir Sementsov-Ogievskiy 6 years, 7 months ago
07.08.2017 17:29, Eric Blake wrote:
> On 08/07/2017 09:16 AM, Vladimir Sementsov-Ogievskiy wrote:
>> 185 iotest is broken.
>>
>> How to test:
>>> i=0; while ./check -qcow2 -nocache 185; do ((i+=1)); echo N = $i; \
>>    done; echo N = $i
>>
>> finished for me like this:
>>
>> 185 2s ... - output mismatch (see 185.out.bad)
>> --- /work/src/qemu/master/tests/qemu-iotests/185.out    2017-07-14 \
>>      15:14:29.520343805 +0300
>> +++ 185.out.bad 2017-08-07 16:51:02.231922900 +0300
>> @@ -37,7 +37,7 @@
>>   {"return": {}}
>>   {"return": {}}
>>   {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, \
>>       "event": "SHUTDOWN", "data": {"guest": false}}
>> -{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, \
>>      "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", \
>>          "len": 4194304, "offset": 4194304, "speed": 65536, "type": \
>>              "mirror"}}
>> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, \
>>      "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", \
>>          "len": 0, "offset": 0, "speed": 65536, "type": "mirror"}}
> This diff says both 'len' and 'offset' differ...
>
>>   === Start backup job and exit qemu ===
>>
>> Failures: 185
>> Failed 1 of 1 tests
>> N = 8
>>
>> It doesn't seems logical to expect the constant offset on cancel,
>> so let filter it out.
> s/let/let's/
>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   tests/qemu-iotests/185     | 2 +-
>>   tests/qemu-iotests/185.out | 2 +-
>>   2 files changed, 2 insertions(+), 2 deletions(-)
>> -{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", "len": 4194304, "offset": 4194304, "speed": 65536, "type": "mirror"}}
>> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", "len": 4194304, "offset": OFFSET, "speed": 65536, "type": "mirror"}}
> ...while this diff only touched 'offset'.  Did you copy-and-paste
> incorrectly in the commit message?  If so, then with the commit message
> fixed, I'm okay with:
> Reviewed-by: Eric Blake <eblake@redhat.com>
>
Hmm, I can't reproduce with "len = 0", and it looks impossible. But I've 
scrolled up in one of my terminals and found this reproduce, really, 
'"len": 0', so, that is possible.

And, after looking through the code it looks like it really possible - 
s->common.len is set not in the very beginning, but after for ex. call 
to block_job_pause_point().

So, the reproduction that I've copied into commit message was very good 
match.

But, I'm not sure, that len should be fixed in the same way. May be it 
would be better to set earlier and don't update on each iteration (why 
is it updated??)


-- 
Best regards,
Vladimir


Re: [Qemu-devel] [PATCH for-2.10] iotests: fix 185
Posted by Vladimir Sementsov-Ogievskiy 6 years, 7 months ago
07.08.2017 18:46, Vladimir Sementsov-Ogievskiy wrote:
> 07.08.2017 17:29, Eric Blake wrote:
>> On 08/07/2017 09:16 AM, Vladimir Sementsov-Ogievskiy wrote:
>>> 185 iotest is broken.
>>>
>>> How to test:
>>>> i=0; while ./check -qcow2 -nocache 185; do ((i+=1)); echo N = $i; \
>>>    done; echo N = $i
>>>
>>> finished for me like this:
>>>
>>> 185 2s ... - output mismatch (see 185.out.bad)
>>> --- /work/src/qemu/master/tests/qemu-iotests/185.out 2017-07-14 \
>>>      15:14:29.520343805 +0300
>>> +++ 185.out.bad 2017-08-07 16:51:02.231922900 +0300
>>> @@ -37,7 +37,7 @@
>>>   {"return": {}}
>>>   {"return": {}}
>>>   {"timestamp": {"seconds":  TIMESTAMP, "microseconds": TIMESTAMP}, \
>>>       "event": "SHUTDOWN", "data": {"guest": false}}
>>> -{"timestamp": {"seconds":  TIMESTAMP, "microseconds": TIMESTAMP}, \
>>>      "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", \
>>>          "len": 4194304, "offset": 4194304, "speed": 65536, "type": \
>>>              "mirror"}}
>>> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds": TIMESTAMP}, \
>>>      "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", \
>>>          "len": 0, "offset": 0, "speed": 65536, "type": "mirror"}}
>> This diff says both 'len' and 'offset' differ...
>>
>>>   === Start backup job and exit qemu ===
>>>
>>> Failures: 185
>>> Failed 1 of 1 tests
>>> N = 8
>>>
>>> It doesn't seems logical to expect the constant offset on cancel,
>>> so let filter it out.
>> s/let/let's/
>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> ---
>>>   tests/qemu-iotests/185     | 2 +-
>>>   tests/qemu-iotests/185.out | 2 +-
>>>   2 files changed, 2 insertions(+), 2 deletions(-)
>>> -{"timestamp": {"seconds":  TIMESTAMP, "microseconds": TIMESTAMP}, 
>>> "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", "len": 
>>> 4194304, "offset": 4194304, "speed": 65536, "type": "mirror"}}
>>> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds": TIMESTAMP}, 
>>> "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", "len": 
>>> 4194304, "offset": OFFSET, "speed": 65536, "type": "mirror"}}
>> ...while this diff only touched 'offset'.  Did you copy-and-paste
>> incorrectly in the commit message?  If so, then with the commit message
>> fixed, I'm okay with:
>> Reviewed-by: Eric Blake <eblake@redhat.com>
>>
> Hmm, I can't reproduce with "len = 0", and it looks impossible. But 
> I've scrolled up in one of my terminals and found this reproduce, 
> really, '"len": 0', so, that is possible.
>
> And, after looking through the code it looks like it really possible - 
> s->common.len is set not in the very beginning, but after for ex. call 
> to block_job_pause_point().
>
> So, the reproduction that I've copied into commit message was very 
> good match.
>
> But, I'm not sure, that len should be fixed in the same way. May be it 
> would be better to set earlier and don't update on each iteration (why 
> is it updated??)
>
>

So, IMHO, this patch is good (with s/"len": 0/"len": 4194304/ and 
s/let/let's/ in commit message)


-- 
Best regards,
Vladimir


Re: [Qemu-devel] [PATCH for-2.10] iotests: fix 185
Posted by Vladimir Sementsov-Ogievskiy 6 years, 7 months ago
07.08.2017 18:57, Vladimir Sementsov-Ogievskiy wrote:
> 07.08.2017 18:46, Vladimir Sementsov-Ogievskiy wrote:
>> 07.08.2017 17:29, Eric Blake wrote:
>>> On 08/07/2017 09:16 AM, Vladimir Sementsov-Ogievskiy wrote:
>>>> 185 iotest is broken.
>>>>
>>>> How to test:
>>>>> i=0; while ./check -qcow2 -nocache 185; do ((i+=1)); echo N = $i; \
>>>>    done; echo N = $i
>>>>
>>>> finished for me like this:
>>>>
>>>> 185 2s ... - output mismatch (see 185.out.bad)
>>>> --- /work/src/qemu/master/tests/qemu-iotests/185.out 2017-07-14 \
>>>>      15:14:29.520343805 +0300
>>>> +++ 185.out.bad 2017-08-07 16:51:02.231922900 +0300
>>>> @@ -37,7 +37,7 @@
>>>>   {"return": {}}
>>>>   {"return": {}}
>>>>   {"timestamp": {"seconds":  TIMESTAMP, "microseconds": TIMESTAMP}, \
>>>>       "event": "SHUTDOWN", "data": {"guest": false}}
>>>> -{"timestamp": {"seconds":  TIMESTAMP, "microseconds": TIMESTAMP}, \
>>>>      "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", \
>>>>          "len": 4194304, "offset": 4194304, "speed": 65536, "type": \
>>>>              "mirror"}}
>>>> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds": TIMESTAMP}, \
>>>>      "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", \
>>>>          "len": 0, "offset": 0, "speed": 65536, "type": "mirror"}}
>>> This diff says both 'len' and 'offset' differ...
>>>
>>>>   === Start backup job and exit qemu ===
>>>>
>>>> Failures: 185
>>>> Failed 1 of 1 tests
>>>> N = 8
>>>>
>>>> It doesn't seems logical to expect the constant offset on cancel,
>>>> so let filter it out.
>>> s/let/let's/
>>>
>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>> ---
>>>>   tests/qemu-iotests/185     | 2 +-
>>>>   tests/qemu-iotests/185.out | 2 +-
>>>>   2 files changed, 2 insertions(+), 2 deletions(-)
>>>> -{"timestamp": {"seconds":  TIMESTAMP, "microseconds": TIMESTAMP}, 
>>>> "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", "len": 
>>>> 4194304, "offset": 4194304, "speed": 65536, "type": "mirror"}}
>>>> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds": TIMESTAMP}, 
>>>> "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", "len": 
>>>> 4194304, "offset": OFFSET, "speed": 65536, "type": "mirror"}}
>>> ...while this diff only touched 'offset'.  Did you copy-and-paste
>>> incorrectly in the commit message?  If so, then with the commit message
>>> fixed, I'm okay with:
>>> Reviewed-by: Eric Blake <eblake@redhat.com>
>>>
>> Hmm, I can't reproduce with "len = 0", and it looks impossible. But 
>> I've scrolled up in one of my terminals and found this reproduce, 
>> really, '"len": 0', so, that is possible.

reproduced on N = 426


>>
>> And, after looking through the code it looks like it really possible 
>> - s->common.len is set not in the very beginning, but after for ex. 
>> call to block_job_pause_point().
>>
>> So, the reproduction that I've copied into commit message was very 
>> good match.
>>
>> But, I'm not sure, that len should be fixed in the same way. May be 
>> it would be better to set earlier and don't update on each iteration 
>> (why is it updated??)
>>
>>
>
> So, IMHO, this patch is good (with s/"len": 0/"len": 4194304/ and 
> s/let/let's/ in commit message)
>
>

-- 
Best regards,
Vladimir


Re: [Qemu-devel] [PATCH] iotests: fix 185
Posted by Kevin Wolf 6 years, 7 months ago
Am 07.08.2017 um 16:16 hat Vladimir Sementsov-Ogievskiy geschrieben:
> 185 iotest is broken.
> 
> How to test:
> > i=0; while ./check -qcow2 -nocache 185; do ((i+=1)); echo N = $i; \
>   done; echo N = $i
> 
> finished for me like this:
> 
> 185 2s ... - output mismatch (see 185.out.bad)
> --- /work/src/qemu/master/tests/qemu-iotests/185.out    2017-07-14 \
>     15:14:29.520343805 +0300
> +++ 185.out.bad 2017-08-07 16:51:02.231922900 +0300
> @@ -37,7 +37,7 @@
>  {"return": {}}
>  {"return": {}}
>  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, \
>      "event": "SHUTDOWN", "data": {"guest": false}}
> -{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, \
>     "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", \
>         "len": 4194304, "offset": 4194304, "speed": 65536, "type": \
>             "mirror"}}
> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, \
>     "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", \
>         "len": 0, "offset": 0, "speed": 65536, "type": "mirror"}}
> 
>  === Start backup job and exit qemu ===
> 
> Failures: 185
> Failed 1 of 1 tests
> N = 8
> 
> It doesn't seems logical to expect the constant offset on cancel,
> so let filter it out.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

I'm quoting 185:

# Note that the reference output intentionally includes the 'offset' field in
# BLOCK_JOB_CANCELLED events for all of the following block jobs. They are
# predictable and any change in the offsets would hint at a bug in the job
# throttling code.
#
# In order to achieve these predictable offsets, all of the following tests
# use speed=65536. Each job will perform exactly one iteration before it has
# to sleep at least for a second, which is plenty of time for the 'quit' QMP
# command to be received (after receiving the command, the rest runs
# synchronously, so jobs can arbitrarily continue or complete).
#
# The buffer size for commit and streaming is 512k (waiting for 8 seconds after
# the first request), for active commit and mirror it's large enough to cover
# the full 4M, and for backup it's the qcow2 cluster size, which we know is
# 64k. As all of these are at least as large as the speed, we are sure that the
# offset doesn't advance after the first iteration before qemu exits.

So before we change the expected output, can we explain why the offsets
aren't predictable, even if throttling is used and contrary to what the
comment says? Should we sleep a little before issuing 'quit'?

(By the way, I couldn't reproduce in N = 128 attempts, so it doesn't
look like I can look into what's happening in detail, except with code
review.)

Kevin

Re: [Qemu-devel] [PATCH] iotests: fix 185
Posted by Vladimir Sementsov-Ogievskiy 6 years, 7 months ago
07.08.2017 18:57, Kevin Wolf wrote:
> Am 07.08.2017 um 16:16 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> 185 iotest is broken.
>>
>> How to test:
>>> i=0; while ./check -qcow2 -nocache 185; do ((i+=1)); echo N = $i; \
>>    done; echo N = $i
>>
>> finished for me like this:
>>
>> 185 2s ... - output mismatch (see 185.out.bad)
>> --- /work/src/qemu/master/tests/qemu-iotests/185.out    2017-07-14 \
>>      15:14:29.520343805 +0300
>> +++ 185.out.bad 2017-08-07 16:51:02.231922900 +0300
>> @@ -37,7 +37,7 @@
>>   {"return": {}}
>>   {"return": {}}
>>   {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, \
>>       "event": "SHUTDOWN", "data": {"guest": false}}
>> -{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, \
>>      "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", \
>>          "len": 4194304, "offset": 4194304, "speed": 65536, "type": \
>>              "mirror"}}
>> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, \
>>      "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", \
>>          "len": 0, "offset": 0, "speed": 65536, "type": "mirror"}}
>>
>>   === Start backup job and exit qemu ===
>>
>> Failures: 185
>> Failed 1 of 1 tests
>> N = 8
>>
>> It doesn't seems logical to expect the constant offset on cancel,
>> so let filter it out.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> I'm quoting 185:
>
> # Note that the reference output intentionally includes the 'offset' field in
> # BLOCK_JOB_CANCELLED events for all of the following block jobs. They are
> # predictable and any change in the offsets would hint at a bug in the job
> # throttling code.
> #
> # In order to achieve these predictable offsets, all of the following tests
> # use speed=65536. Each job will perform exactly one iteration before it has
> # to sleep at least for a second, which is plenty of time for the 'quit' QMP
> # command to be received (after receiving the command, the rest runs
> # synchronously, so jobs can arbitrarily continue or complete).
> #
> # The buffer size for commit and streaming is 512k (waiting for 8 seconds after
> # the first request), for active commit and mirror it's large enough to cover
> # the full 4M, and for backup it's the qcow2 cluster size, which we know is
> # 64k. As all of these are at least as large as the speed, we are sure that the
> # offset doesn't advance after the first iteration before qemu exits.
>
> So before we change the expected output, can we explain why the offsets
> aren't predictable, even if throttling is used and contrary to what the
> comment says? Should we sleep a little before issuing 'quit'?

Throttling "guaranties" that there will not be more than one request. 
But what prevent less than one, i.e. zero, like in my reproduction?

>
> (By the way, I couldn't reproduce in N = 128 attempts, so it doesn't
> look like I can look into what's happening in detail, except with code
> review.)
>
> Kevin


-- 
Best regards,
Vladimir


Re: [Qemu-devel] [PATCH] iotests: fix 185
Posted by Kevin Wolf 6 years, 7 months ago
Am 08.08.2017 um 10:42 hat Vladimir Sementsov-Ogievskiy geschrieben:
> 07.08.2017 18:57, Kevin Wolf wrote:
> > Am 07.08.2017 um 16:16 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > 185 iotest is broken.
> > > 
> > > How to test:
> > > > i=0; while ./check -qcow2 -nocache 185; do ((i+=1)); echo N = $i; \
> > >    done; echo N = $i
> > > 
> > > finished for me like this:
> > > 
> > > 185 2s ... - output mismatch (see 185.out.bad)
> > > --- /work/src/qemu/master/tests/qemu-iotests/185.out    2017-07-14 \
> > >      15:14:29.520343805 +0300
> > > +++ 185.out.bad 2017-08-07 16:51:02.231922900 +0300
> > > @@ -37,7 +37,7 @@
> > >   {"return": {}}
> > >   {"return": {}}
> > >   {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, \
> > >       "event": "SHUTDOWN", "data": {"guest": false}}
> > > -{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, \
> > >      "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", \
> > >          "len": 4194304, "offset": 4194304, "speed": 65536, "type": \
> > >              "mirror"}}
> > > +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, \
> > >      "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", \
> > >          "len": 0, "offset": 0, "speed": 65536, "type": "mirror"}}
> > > 
> > >   === Start backup job and exit qemu ===
> > > 
> > > Failures: 185
> > > Failed 1 of 1 tests
> > > N = 8
> > > 
> > > It doesn't seems logical to expect the constant offset on cancel,
> > > so let filter it out.
> > > 
> > > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> > I'm quoting 185:
> > 
> > # Note that the reference output intentionally includes the 'offset' field in
> > # BLOCK_JOB_CANCELLED events for all of the following block jobs. They are
> > # predictable and any change in the offsets would hint at a bug in the job
> > # throttling code.
> > #
> > # In order to achieve these predictable offsets, all of the following tests
> > # use speed=65536. Each job will perform exactly one iteration before it has
> > # to sleep at least for a second, which is plenty of time for the 'quit' QMP
> > # command to be received (after receiving the command, the rest runs
> > # synchronously, so jobs can arbitrarily continue or complete).
> > #
> > # The buffer size for commit and streaming is 512k (waiting for 8 seconds after
> > # the first request), for active commit and mirror it's large enough to cover
> > # the full 4M, and for backup it's the qcow2 cluster size, which we know is
> > # 64k. As all of these are at least as large as the speed, we are sure that the
> > # offset doesn't advance after the first iteration before qemu exits.
> > 
> > So before we change the expected output, can we explain why the offsets
> > aren't predictable, even if throttling is used and contrary to what the
> > comment says? Should we sleep a little before issuing 'quit'?
> 
> Throttling "guaranties" that there will not be more than one request. But
> what prevent less than one, i.e. zero, like in my reproduction?

Yes, I understand. Can we somehow make sure that at least one iteration
is made? I'd really like to keep the functional test for block job
throttling. I suppose a simple 'sleep 0.1' would do the trick, though
it's not very clean.

Kevin

Re: [Qemu-devel] [PATCH] iotests: fix 185
Posted by Vladimir Sementsov-Ogievskiy 6 years, 7 months ago
08.08.2017 11:53, Kevin Wolf wrote:
> Am 08.08.2017 um 10:42 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> 07.08.2017 18:57, Kevin Wolf wrote:
>>> Am 07.08.2017 um 16:16 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>> 185 iotest is broken.
>>>>
>>>> How to test:
>>>>> i=0; while ./check -qcow2 -nocache 185; do ((i+=1)); echo N = $i; \
>>>>     done; echo N = $i
>>>>
>>>> finished for me like this:
>>>>
>>>> 185 2s ... - output mismatch (see 185.out.bad)
>>>> --- /work/src/qemu/master/tests/qemu-iotests/185.out    2017-07-14 \
>>>>       15:14:29.520343805 +0300
>>>> +++ 185.out.bad 2017-08-07 16:51:02.231922900 +0300
>>>> @@ -37,7 +37,7 @@
>>>>    {"return": {}}
>>>>    {"return": {}}
>>>>    {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, \
>>>>        "event": "SHUTDOWN", "data": {"guest": false}}
>>>> -{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, \
>>>>       "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", \
>>>>           "len": 4194304, "offset": 4194304, "speed": 65536, "type": \
>>>>               "mirror"}}
>>>> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, \
>>>>       "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", \
>>>>           "len": 0, "offset": 0, "speed": 65536, "type": "mirror"}}
>>>>
>>>>    === Start backup job and exit qemu ===
>>>>
>>>> Failures: 185
>>>> Failed 1 of 1 tests
>>>> N = 8
>>>>
>>>> It doesn't seems logical to expect the constant offset on cancel,
>>>> so let filter it out.
>>>>
>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> I'm quoting 185:
>>>
>>> # Note that the reference output intentionally includes the 'offset' field in
>>> # BLOCK_JOB_CANCELLED events for all of the following block jobs. They are
>>> # predictable and any change in the offsets would hint at a bug in the job
>>> # throttling code.
>>> #
>>> # In order to achieve these predictable offsets, all of the following tests
>>> # use speed=65536. Each job will perform exactly one iteration before it has
>>> # to sleep at least for a second, which is plenty of time for the 'quit' QMP
>>> # command to be received (after receiving the command, the rest runs
>>> # synchronously, so jobs can arbitrarily continue or complete).
>>> #
>>> # The buffer size for commit and streaming is 512k (waiting for 8 seconds after
>>> # the first request), for active commit and mirror it's large enough to cover
>>> # the full 4M, and for backup it's the qcow2 cluster size, which we know is
>>> # 64k. As all of these are at least as large as the speed, we are sure that the
>>> # offset doesn't advance after the first iteration before qemu exits.
>>>
>>> So before we change the expected output, can we explain why the offsets
>>> aren't predictable, even if throttling is used and contrary to what the
>>> comment says? Should we sleep a little before issuing 'quit'?
>> Throttling "guaranties" that there will not be more than one request. But
>> what prevent less than one, i.e. zero, like in my reproduction?
> Yes, I understand. Can we somehow make sure that at least one iteration
> is made? I'd really like to keep the functional test for block job
> throttling. I suppose a simple 'sleep 0.1' would do the trick, though
> it's not very clean.
>
> Kevin


I've started with 'sleep 0.5', now there are >100 successful 
iterations... The other way is to check in test that there was 0 or 1 
requests, but for this it looks better to rewrite it in python.


-- 
Best regards,
Vladimir


Re: [Qemu-devel] [PATCH] iotests: fix 185
Posted by Vladimir Sementsov-Ogievskiy 6 years, 7 months ago
08.08.2017 12:04, Vladimir Sementsov-Ogievskiy wrote:
> 08.08.2017 11:53, Kevin Wolf wrote:
>> Am 08.08.2017 um 10:42 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>> 07.08.2017 18:57, Kevin Wolf wrote:
>>>> Am 07.08.2017 um 16:16 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>>> 185 iotest is broken.
>>>>>
>>>>> How to test:
>>>>>> i=0; while ./check -qcow2 -nocache 185; do ((i+=1)); echo N = $i; \
>>>>>     done; echo N = $i
>>>>>
>>>>> finished for me like this:
>>>>>
>>>>> 185 2s ... - output mismatch (see 185.out.bad)
>>>>> --- /work/src/qemu/master/tests/qemu-iotests/185.out 2017-07-14 \
>>>>>       15:14:29.520343805 +0300
>>>>> +++ 185.out.bad 2017-08-07 16:51:02.231922900 +0300
>>>>> @@ -37,7 +37,7 @@
>>>>>    {"return": {}}
>>>>>    {"return": {}}
>>>>>    {"timestamp": {"seconds":  TIMESTAMP, "microseconds": 
>>>>> TIMESTAMP}, \
>>>>>        "event": "SHUTDOWN", "data": {"guest": false}}
>>>>> -{"timestamp": {"seconds":  TIMESTAMP, "microseconds": TIMESTAMP}, \
>>>>>       "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", \
>>>>>           "len": 4194304, "offset": 4194304, "speed": 65536, 
>>>>> "type": \
>>>>>               "mirror"}}
>>>>> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds": TIMESTAMP}, \
>>>>>       "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", \
>>>>>           "len": 0, "offset": 0, "speed": 65536, "type": "mirror"}}
>>>>>
>>>>>    === Start backup job and exit qemu ===
>>>>>
>>>>> Failures: 185
>>>>> Failed 1 of 1 tests
>>>>> N = 8
>>>>>
>>>>> It doesn't seems logical to expect the constant offset on cancel,
>>>>> so let filter it out.
>>>>>
>>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>>>>> <vsementsov@virtuozzo.com>
>>>> I'm quoting 185:
>>>>
>>>> # Note that the reference output intentionally includes the 
>>>> 'offset' field in
>>>> # BLOCK_JOB_CANCELLED events for all of the following block jobs. 
>>>> They are
>>>> # predictable and any change in the offsets would hint at a bug in 
>>>> the job
>>>> # throttling code.
>>>> #
>>>> # In order to achieve these predictable offsets, all of the 
>>>> following tests
>>>> # use speed=65536. Each job will perform exactly one iteration 
>>>> before it has
>>>> # to sleep at least for a second, which is plenty of time for the 
>>>> 'quit' QMP
>>>> # command to be received (after receiving the command, the rest runs
>>>> # synchronously, so jobs can arbitrarily continue or complete).
>>>> #
>>>> # The buffer size for commit and streaming is 512k (waiting for 8 
>>>> seconds after
>>>> # the first request), for active commit and mirror it's large 
>>>> enough to cover
>>>> # the full 4M, and for backup it's the qcow2 cluster size, which we 
>>>> know is
>>>> # 64k. As all of these are at least as large as the speed, we are 
>>>> sure that the
>>>> # offset doesn't advance after the first iteration before qemu exits.
>>>>
>>>> So before we change the expected output, can we explain why the 
>>>> offsets
>>>> aren't predictable, even if throttling is used and contrary to what 
>>>> the
>>>> comment says? Should we sleep a little before issuing 'quit'?
>>> Throttling "guaranties" that there will not be more than one 
>>> request. But
>>> what prevent less than one, i.e. zero, like in my reproduction?
>> Yes, I understand. Can we somehow make sure that at least one iteration
>> is made? I'd really like to keep the functional test for block job
>> throttling. I suppose a simple 'sleep 0.1' would do the trick, though
>> it's not very clean.
>>
>> Kevin
>
>
> I've started with 'sleep 0.5', now there are >100 successful 
> iterations... The other way is to check in test that there was 0 or 1 
> requests, but for this it looks better to rewrite it in python.
>
>

is sleep for ms portable?



-- 
Best regards,
Vladimir


Re: [Qemu-devel] [PATCH] iotests: fix 185
Posted by Vladimir Sementsov-Ogievskiy 6 years, 7 months ago
08.08.2017 12:04, Vladimir Sementsov-Ogievskiy wrote:
> 08.08.2017 12:04, Vladimir Sementsov-Ogievskiy wrote:
>> 08.08.2017 11:53, Kevin Wolf wrote:
>>> Am 08.08.2017 um 10:42 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>> 07.08.2017 18:57, Kevin Wolf wrote:
>>>>> Am 07.08.2017 um 16:16 hat Vladimir Sementsov-Ogievskiy geschrieben:
>>>>>> 185 iotest is broken.
>>>>>>
>>>>>> How to test:
>>>>>>> i=0; while ./check -qcow2 -nocache 185; do ((i+=1)); echo N = $i; \
>>>>>>     done; echo N = $i
>>>>>>
>>>>>> finished for me like this:
>>>>>>
>>>>>> 185 2s ... - output mismatch (see 185.out.bad)
>>>>>> --- /work/src/qemu/master/tests/qemu-iotests/185.out 2017-07-14 \
>>>>>>       15:14:29.520343805 +0300
>>>>>> +++ 185.out.bad 2017-08-07 16:51:02.231922900 +0300
>>>>>> @@ -37,7 +37,7 @@
>>>>>>    {"return": {}}
>>>>>>    {"return": {}}
>>>>>>    {"timestamp": {"seconds":  TIMESTAMP, "microseconds": 
>>>>>> TIMESTAMP}, \
>>>>>>        "event": "SHUTDOWN", "data": {"guest": false}}
>>>>>> -{"timestamp": {"seconds":  TIMESTAMP, "microseconds": TIMESTAMP}, \
>>>>>>       "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", \
>>>>>>           "len": 4194304, "offset": 4194304, "speed": 65536, 
>>>>>> "type": \
>>>>>>               "mirror"}}
>>>>>> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds": TIMESTAMP}, \
>>>>>>       "event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", \
>>>>>>           "len": 0, "offset": 0, "speed": 65536, "type": "mirror"}}
>>>>>>
>>>>>>    === Start backup job and exit qemu ===
>>>>>>
>>>>>> Failures: 185
>>>>>> Failed 1 of 1 tests
>>>>>> N = 8
>>>>>>
>>>>>> It doesn't seems logical to expect the constant offset on cancel,
>>>>>> so let filter it out.
>>>>>>
>>>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>>>>>> <vsementsov@virtuozzo.com>
>>>>> I'm quoting 185:
>>>>>
>>>>> # Note that the reference output intentionally includes the 
>>>>> 'offset' field in
>>>>> # BLOCK_JOB_CANCELLED events for all of the following block jobs. 
>>>>> They are
>>>>> # predictable and any change in the offsets would hint at a bug in 
>>>>> the job
>>>>> # throttling code.
>>>>> #
>>>>> # In order to achieve these predictable offsets, all of the 
>>>>> following tests
>>>>> # use speed=65536. Each job will perform exactly one iteration 
>>>>> before it has
>>>>> # to sleep at least for a second, which is plenty of time for the 
>>>>> 'quit' QMP
>>>>> # command to be received (after receiving the command, the rest runs
>>>>> # synchronously, so jobs can arbitrarily continue or complete).
>>>>> #
>>>>> # The buffer size for commit and streaming is 512k (waiting for 8 
>>>>> seconds after
>>>>> # the first request), for active commit and mirror it's large 
>>>>> enough to cover
>>>>> # the full 4M, and for backup it's the qcow2 cluster size, which 
>>>>> we know is
>>>>> # 64k. As all of these are at least as large as the speed, we are 
>>>>> sure that the
>>>>> # offset doesn't advance after the first iteration before qemu exits.
>>>>>
>>>>> So before we change the expected output, can we explain why the 
>>>>> offsets
>>>>> aren't predictable, even if throttling is used and contrary to 
>>>>> what the
>>>>> comment says? Should we sleep a little before issuing 'quit'?
>>>> Throttling "guaranties" that there will not be more than one 
>>>> request. But
>>>> what prevent less than one, i.e. zero, like in my reproduction?
>>> Yes, I understand. Can we somehow make sure that at least one iteration
>>> is made? I'd really like to keep the functional test for block job
>>> throttling. I suppose a simple 'sleep 0.1' would do the trick, though
>>> it's not very clean.
>>>
>>> Kevin
>>
>>
>> I've started with 'sleep 0.5', now there are >100 successful 
>> iterations... The other way is to check in test that there was 0 or 1 
>> requests, but for this it looks better to rewrite it in python.
>>
>>
>
> is sleep for ms portable?
>
>
>
 >1500 successful iterations, so, 'sleep 0.5' is OK for me.


-- 
Best regards,
Vladimir


Re: [Qemu-devel] [PATCH] iotests: fix 185
Posted by Eric Blake 6 years, 7 months ago
On 08/08/2017 04:04 AM, Vladimir Sementsov-Ogievskiy wrote:

>>>> Throttling "guaranties" that there will not be more than one
>>>> request. But
>>>> what prevent less than one, i.e. zero, like in my reproduction?
>>> Yes, I understand. Can we somehow make sure that at least one iteration
>>> is made? I'd really like to keep the functional test for block job
>>> throttling. I suppose a simple 'sleep 0.1' would do the trick, though
>>> it's not very clean.
>>>
>>> Kevin
>>
>>
>> I've started with 'sleep 0.5', now there are >100 successful
>> iterations... The other way is to check in test that there was 0 or 1
>> requests, but for this it looks better to rewrite it in python.
>>
>>
> 
> is sleep for ms portable?

Sadly, sub-second sleep is a GNU coreutils feature; I suspect the BSD
machines may fail to parse it.  (Of course, we could do some sort of
'sleep $SMALL', where $SMALL is 0.5 if sleep supports it, and 1 otherwise).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [Qemu-devel] [PATCH] iotests: fix 185
Posted by Vladimir Sementsov-Ogievskiy 6 years, 7 months ago
08.08.2017 18:07, Eric Blake wrote:
> On 08/08/2017 04:04 AM, Vladimir Sementsov-Ogievskiy wrote:
>
>>>>> Throttling "guaranties" that there will not be more than one
>>>>> request. But
>>>>> what prevent less than one, i.e. zero, like in my reproduction?
>>>> Yes, I understand. Can we somehow make sure that at least one iteration
>>>> is made? I'd really like to keep the functional test for block job
>>>> throttling. I suppose a simple 'sleep 0.1' would do the trick, though
>>>> it's not very clean.
>>>>
>>>> Kevin
>>>
>>> I've started with 'sleep 0.5', now there are >100 successful
>>> iterations... The other way is to check in test that there was 0 or 1
>>> requests, but for this it looks better to rewrite it in python.
>>>
>>>
>> is sleep for ms portable?
> Sadly, sub-second sleep is a GNU coreutils feature; I suspect the BSD
> machines may fail to parse it.  (Of course, we could do some sort of
> 'sleep $SMALL', where $SMALL is 0.5 if sleep supports it, and 1 otherwise).
>
sleep for 1 second may lead to more then one request done before qemu quite.


-- 
Best regards,
Vladimir


Re: [Qemu-devel] [PATCH] iotests: fix 185
Posted by Kevin Wolf 6 years, 7 months ago
Am 08.08.2017 um 17:10 hat Vladimir Sementsov-Ogievskiy geschrieben:
> 08.08.2017 18:07, Eric Blake wrote:
> > On 08/08/2017 04:04 AM, Vladimir Sementsov-Ogievskiy wrote:
> > 
> > > > > > Throttling "guaranties" that there will not be more than one
> > > > > > request. But
> > > > > > what prevent less than one, i.e. zero, like in my reproduction?
> > > > > Yes, I understand. Can we somehow make sure that at least one iteration
> > > > > is made? I'd really like to keep the functional test for block job
> > > > > throttling. I suppose a simple 'sleep 0.1' would do the trick, though
> > > > > it's not very clean.
> > > > > 
> > > > > Kevin
> > > > 
> > > > I've started with 'sleep 0.5', now there are >100 successful
> > > > iterations... The other way is to check in test that there was 0 or 1
> > > > requests, but for this it looks better to rewrite it in python.
> > > > 
> > > > 
> > > is sleep for ms portable?
> > Sadly, sub-second sleep is a GNU coreutils feature; I suspect the BSD
> > machines may fail to parse it.  (Of course, we could do some sort of
> > 'sleep $SMALL', where $SMALL is 0.5 if sleep supports it, and 1 otherwise).
> > 
> sleep for 1 second may lead to more then one request done before qemu quite.

_supported_os Linux

So do we really care about portability? And are all the other test cases
working on the BSDs?

Kevin

Re: [Qemu-devel] [PATCH] iotests: fix 185
Posted by Eric Blake 6 years, 7 months ago
On 08/08/2017 10:16 AM, Kevin Wolf wrote:

>>>> is sleep for ms portable?
>>> Sadly, sub-second sleep is a GNU coreutils feature; I suspect the BSD
>>> machines may fail to parse it.  (Of course, we could do some sort of
>>> 'sleep $SMALL', where $SMALL is 0.5 if sleep supports it, and 1 otherwise).
>>>
>> sleep for 1 second may lead to more then one request done before qemu quite.
> 
> _supported_os Linux
> 
> So do we really care about portability? And are all the other test cases
> working on the BSDs?

You've got valid points there - I suspect that a run of qemu-iotests on
BSD would turn up quite a few failures, some of which would be easy to
address; but it's not on my personal priority list.  I'm fine with an
approach of committing something that works where it was first tested,
then adding followup patches to improve portability, especially as long
as we are not currently getting anyone complaining about qemu-iotests
failures on BSD.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org