[PATCH] Revert "tests/migration: Reduce autoconverge initial bandwidth"

Michael S. Tsirkin posted 1 patch 3 years, 10 months ago
Test FreeBSD passed
Test docker-quick@centos7 passed
Test checkpatch passed
Test docker-mingw@fedora passed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20200623145506.439100-1-mst@redhat.com
Maintainers: Juan Quintela <quintela@redhat.com>, Laurent Vivier <lvivier@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, "Dr. David Alan Gilbert" <dgilbert@redhat.com>, Thomas Huth <thuth@redhat.com>
[PATCH] Revert "tests/migration: Reduce autoconverge initial bandwidth"
Posted by Michael S. Tsirkin 3 years, 10 months ago
This reverts commit 6d1da867e65f ("tests/migration: Reduce autoconverge initial bandwidth")
since that change makes unit tests much slower for all developers, while it's not
a robust way to fix migration tests. Migration tests need to find
a more robust way to discover a reasonable bandwidth without slowing
things down for everyone.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

---
 tests/qtest/migration-test.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index dc3490c9fa..21ea5ba1d2 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -1211,7 +1211,7 @@ static void test_migrate_auto_converge(void)
      * without throttling.
      */
     migrate_set_parameter_int(from, "downtime-limit", 1);
-    migrate_set_parameter_int(from, "max-bandwidth", 1000000); /* ~1Mb/s */
+    migrate_set_parameter_int(from, "max-bandwidth", 100000000); /* ~100Mb/s */
 
     /* To check remaining size after precopy */
     migrate_set_capability(from, "pause-before-switchover", true);
-- 
MST


Re: [PATCH] Revert "tests/migration: Reduce autoconverge initial bandwidth"
Posted by Philippe Mathieu-Daudé 3 years, 10 months ago
On 6/23/20 4:56 PM, Michael S. Tsirkin wrote:
> This reverts commit 6d1da867e65f ("tests/migration: Reduce autoconverge initial bandwidth")
> since that change makes unit tests much slower for all developers, while it's not
> a robust way to fix migration tests. Migration tests need to find
> a more robust way to discover a reasonable bandwidth without slowing
> things down for everyone.

Please also mention we can do this since 1de8e4c4dcf, which marked
the s390x job as "unstable" and allows it to fail.

But if nobody is going to look at it, let's instead disable
it until someone figures out the issue:

-- >8 --
diff --git a/.travis.yml b/.travis.yml
index 74158f741b..364e67b14b 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -507,6 +507,7 @@ jobs:

     - name: "[s390x] Clang (disable-tcg)"
       arch: s390x
+      if: false # Temporarily disabled due to issue testing migration (see commit 6d1da867e65).
       dist: bionic
       compiler: clang
       addons:
---

With the hunk amended (no need to mention 1de8e4c4d actually):
Acked-by: Philippe Mathieu-Daudé <philmd@redhat.com>

> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> 
> ---
>  tests/qtest/migration-test.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> index dc3490c9fa..21ea5ba1d2 100644
> --- a/tests/qtest/migration-test.c
> +++ b/tests/qtest/migration-test.c
> @@ -1211,7 +1211,7 @@ static void test_migrate_auto_converge(void)
>       * without throttling.
>       */
>      migrate_set_parameter_int(from, "downtime-limit", 1);
> -    migrate_set_parameter_int(from, "max-bandwidth", 1000000); /* ~1Mb/s */
> +    migrate_set_parameter_int(from, "max-bandwidth", 100000000); /* ~100Mb/s */
>  
>      /* To check remaining size after precopy */
>      migrate_set_capability(from, "pause-before-switchover", true);
> 


Re: [PATCH] Revert "tests/migration: Reduce autoconverge initial bandwidth"
Posted by Thomas Huth 3 years, 10 months ago
On 23/06/2020 17.39, Philippe Mathieu-Daudé wrote:
> On 6/23/20 4:56 PM, Michael S. Tsirkin wrote:
>> This reverts commit 6d1da867e65f ("tests/migration: Reduce autoconverge initial bandwidth")
>> since that change makes unit tests much slower for all developers, while it's not
>> a robust way to fix migration tests. Migration tests need to find
>> a more robust way to discover a reasonable bandwidth without slowing
>> things down for everyone.
> 
> Please also mention we can do this since 1de8e4c4dcf which allow
> marked the s390x job as "unstable" and allow it to fail.
> 
> But if nobody is going to look at it, instead lets disable
> it until someone figure out the issue:
> 
> -- >8 --
> diff --git a/.travis.yml b/.travis.yml
> index 74158f741b..364e67b14b 100644
> --- a/.travis.yml
> +++ b/.travis.yml
> @@ -507,6 +507,7 @@ jobs:
> 
>      - name: "[s390x] Clang (disable-tcg)"
>        arch: s390x
> +      if: false # Temporarily disabled due to issue testing migration
> (see commit 6d1da867e65).
>        dist: bionic
>        compiler: clang
>        addons:

Sorry, but that looks wrong. First, the disable-tcg test does not run
the qtests at all. So this is certainly the wrong location here. Second,
if just one of the qtests is failing, please only disable that single
failing qtest and not the whole test pipeline.

 Thanks,
  Thomas
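
The suggestion above (disable only the single failing qtest rather than a whole CI job) could look roughly like the sketch below. It is a minimal, self-contained illustration and not code from QEMU: the skip_rule table, the test_is_skipped() helper, and the test path string are all hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical rule: skip one named test on one architecture. */
struct skip_rule {
    const char *arch;
    const char *test_path;
};

/* Illustrative entry only; the real test path may differ. */
static const struct skip_rule skip_rules[] = {
    { "s390x", "/migration/auto_converge" },
};

/* Returns true if the given test should be skipped on the given arch. */
bool test_is_skipped(const char *arch, const char *test_path)
{
    assert(arch != NULL && test_path != NULL);

    for (size_t i = 0; i < sizeof(skip_rules) / sizeof(skip_rules[0]); i++) {
        if (strcmp(skip_rules[i].arch, arch) == 0 &&
            strcmp(skip_rules[i].test_path, test_path) == 0) {
            return true;
        }
    }
    return false;
}
```

A test harness would consult such a helper before registering each test, so every other test on that architecture keeps running.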


Re: [PATCH] Revert "tests/migration: Reduce autoconverge initial bandwidth"
Posted by Philippe Mathieu-Daudé 3 years, 10 months ago
On 6/23/20 7:07 PM, Thomas Huth wrote:
> On 23/06/2020 17.39, Philippe Mathieu-Daudé wrote:
>> On 6/23/20 4:56 PM, Michael S. Tsirkin wrote:
>>> This reverts commit 6d1da867e65f ("tests/migration: Reduce autoconverge initial bandwidth")
>>> since that change makes unit tests much slower for all developers, while it's not
>>> a robust way to fix migration tests. Migration tests need to find
>>> a more robust way to discover a reasonable bandwidth without slowing
>>> things down for everyone.
>>
>> Please also mention we can do this since 1de8e4c4dcf which allow
>> marked the s390x job as "unstable" and allow it to fail.
>>
>> But if nobody is going to look at it, instead lets disable
>> it until someone figure out the issue:
>>
>> -- >8 --
>> diff --git a/.travis.yml b/.travis.yml
>> index 74158f741b..364e67b14b 100644
>> --- a/.travis.yml
>> +++ b/.travis.yml
>> @@ -507,6 +507,7 @@ jobs:
>>
>>      - name: "[s390x] Clang (disable-tcg)"
>>        arch: s390x
>> +      if: false # Temporarily disabled due to issue testing migration
>> (see commit 6d1da867e65).
>>        dist: bionic
>>        compiler: clang
>>        addons:
> 
> Sorry, but that looks wrong. First, the disable-tcg test does not run
> the qtests at all. So this is certainly the wrong location here.

Indeed, this is the previous job:

-- >8 --
diff --git a/.travis.yml b/.travis.yml
index 74158f741b..b399e20078 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -464,6 +464,7 @@ jobs:
         - CONFIG="--disable-containers --target-list=ppc64-softmmu,ppc64le-linux-user"

     - name: "[s390x] GCC check-tcg"
+      if: false # Temporarily disabled due to issue testing migration (see commit 6d1da867e65).
       arch: s390x
       dist: bionic
       addons:
---

> Second,
> if just one of the qtests is failing, please only disable that single
> failing qtest and not the whole test pipeline.

Last time we talked about this, Dave was against that option:

https://www.mail-archive.com/qemu-devel@nongnu.org/msg690085.html


Re: [PATCH] Revert "tests/migration: Reduce autoconverge initial bandwidth"
Posted by Michael S. Tsirkin 3 years, 10 months ago
On Tue, Jun 23, 2020 at 07:35:34PM +0200, Philippe Mathieu-Daudé wrote:
> On 6/23/20 7:07 PM, Thomas Huth wrote:
> > On 23/06/2020 17.39, Philippe Mathieu-Daudé wrote:
> >> On 6/23/20 4:56 PM, Michael S. Tsirkin wrote:
> >>> This reverts commit 6d1da867e65f ("tests/migration: Reduce autoconverge initial bandwidth")
> >>> since that change makes unit tests much slower for all developers, while it's not
> >>> a robust way to fix migration tests. Migration tests need to find
> >>> a more robust way to discover a reasonable bandwidth without slowing
> >>> things down for everyone.
> >>
> >> Please also mention we can do this since 1de8e4c4dcf which allow
> >> marked the s390x job as "unstable" and allow it to fail.
> >>
> >> But if nobody is going to look at it, instead lets disable
> >> it until someone figure out the issue:
> >>
> >> -- >8 --
> >> diff --git a/.travis.yml b/.travis.yml
> >> index 74158f741b..364e67b14b 100644
> >> --- a/.travis.yml
> >> +++ b/.travis.yml
> >> @@ -507,6 +507,7 @@ jobs:
> >>
> >>      - name: "[s390x] Clang (disable-tcg)"
> >>        arch: s390x
> >> +      if: false # Temporarily disabled due to issue testing migration
> >> (see commit 6d1da867e65).
> >>        dist: bionic
> >>        compiler: clang
> >>        addons:
> > 
> > Sorry, but that looks wrong. First, the disable-tcg test does not run
> > the qtests at all. So this is certainly the wrong location here.
> 
> Indeed, this is the previous job:
> 
> -- >8 --
> diff --git a/.travis.yml b/.travis.yml
> index 74158f741b..b399e20078 100644
> --- a/.travis.yml
> +++ b/.travis.yml
> @@ -464,6 +464,7 @@ jobs:
>          - CONFIG="--disable-containers
> --target-list=ppc64-softmmu,ppc64le-linux-user"
> 
>      - name: "[s390x] GCC check-tcg"
> +      if: false # Temporarily disabled due to issue testing migration
> (see commit 6d1da867e65).
>        arch: s390x
>        dist: bionic
>        addons:
> ---


OK - can you submit this as a proper patch?

> > Second,
> > if just one of the qtests is failing, please only disable that single
> > failing qtest and not the whole test pipeline.
> 
> Last time we talked about this Dave was against that option:
> 
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg690085.html


Re: [PATCH] Revert "tests/migration: Reduce autoconverge initial bandwidth"
Posted by Thomas Huth 3 years, 9 months ago
On 23/06/2020 19.35, Philippe Mathieu-Daudé wrote:
> On 6/23/20 7:07 PM, Thomas Huth wrote:
>> On 23/06/2020 17.39, Philippe Mathieu-Daudé wrote:
>>> On 6/23/20 4:56 PM, Michael S. Tsirkin wrote:
>>>> This reverts commit 6d1da867e65f ("tests/migration: Reduce autoconverge initial bandwidth")
>>>> since that change makes unit tests much slower for all developers, while it's not
>>>> a robust way to fix migration tests. Migration tests need to find
>>>> a more robust way to discover a reasonable bandwidth without slowing
>>>> things down for everyone.
>>>
>>> Please also mention we can do this since 1de8e4c4dcf which allow
>>> marked the s390x job as "unstable" and allow it to fail.
>>>
>>> But if nobody is going to look at it, instead lets disable
>>> it until someone figure out the issue:
>>>
>>> -- >8 --
>>> diff --git a/.travis.yml b/.travis.yml
>>> index 74158f741b..364e67b14b 100644
>>> --- a/.travis.yml
>>> +++ b/.travis.yml
>>> @@ -507,6 +507,7 @@ jobs:
>>>
>>>      - name: "[s390x] Clang (disable-tcg)"
>>>        arch: s390x
>>> +      if: false # Temporarily disabled due to issue testing migration
>>> (see commit 6d1da867e65).
>>>        dist: bionic
>>>        compiler: clang
>>>        addons:
>>
>> Sorry, but that looks wrong. First, the disable-tcg test does not run
>> the qtests at all. So this is certainly the wrong location here.
> 
> Indeed, this is the previous job:
> 
> -- >8 --
> diff --git a/.travis.yml b/.travis.yml
> index 74158f741b..b399e20078 100644
> --- a/.travis.yml
> +++ b/.travis.yml
> @@ -464,6 +464,7 @@ jobs:
>          - CONFIG="--disable-containers
> --target-list=ppc64-softmmu,ppc64le-linux-user"
> 
>      - name: "[s390x] GCC check-tcg"
> +      if: false # Temporarily disabled due to issue testing migration
> (see commit 6d1da867e65).
>        arch: s390x
>        dist: bionic
>        addons:
> ---
> 
>> Second,
>> if just one of the qtests is failing, please only disable that single
>> failing qtest and not the whole test pipeline.
> 
> Last time we talked about this Dave was against that option:
> 
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg690085.html
> 

Was he? Citing his reply to the mail from your URL:

 "Before we take the hammer to it, could you try reducing it's initial
bandwidth"

So all I can see is that he first wanted to try something different than
disabling the test. And now, instead of using a small hammer to disable
just this test, you are even using a very *big* hammer to disable *all*
tests. That's just a very bad idea. Please don't.

 Thomas


Re: [PATCH] Revert "tests/migration: Reduce autoconverge initial bandwidth"
Posted by Philippe Mathieu-Daudé 3 years, 9 months ago
On 6/24/20 7:04 AM, Thomas Huth wrote:
> On 23/06/2020 19.35, Philippe Mathieu-Daudé wrote:
>> On 6/23/20 7:07 PM, Thomas Huth wrote:
>>> On 23/06/2020 17.39, Philippe Mathieu-Daudé wrote:
>>>> On 6/23/20 4:56 PM, Michael S. Tsirkin wrote:
>>>>> This reverts commit 6d1da867e65f ("tests/migration: Reduce autoconverge initial bandwidth")
>>>>> since that change makes unit tests much slower for all developers, while it's not
>>>>> a robust way to fix migration tests. Migration tests need to find
>>>>> a more robust way to discover a reasonable bandwidth without slowing
>>>>> things down for everyone.
>>>>
>>>> Please also mention we can do this since 1de8e4c4dcf which allow
>>>> marked the s390x job as "unstable" and allow it to fail.
>>>>
>>>> But if nobody is going to look at it, instead lets disable
>>>> it until someone figure out the issue:
>>>>
>>>> -- >8 --
>>>> diff --git a/.travis.yml b/.travis.yml
>>>> index 74158f741b..364e67b14b 100644
>>>> --- a/.travis.yml
>>>> +++ b/.travis.yml
>>>> @@ -507,6 +507,7 @@ jobs:
>>>>
>>>>      - name: "[s390x] Clang (disable-tcg)"
>>>>        arch: s390x
>>>> +      if: false # Temporarily disabled due to issue testing migration
>>>> (see commit 6d1da867e65).
>>>>        dist: bionic
>>>>        compiler: clang
>>>>        addons:
>>>
>>> Sorry, but that looks wrong. First, the disable-tcg test does not run
>>> the qtests at all. So this is certainly the wrong location here.
>>
>> Indeed, this is the previous job:
>>
>> -- >8 --
>> diff --git a/.travis.yml b/.travis.yml
>> index 74158f741b..b399e20078 100644
>> --- a/.travis.yml
>> +++ b/.travis.yml
>> @@ -464,6 +464,7 @@ jobs:
>>          - CONFIG="--disable-containers
>> --target-list=ppc64-softmmu,ppc64le-linux-user"
>>
>>      - name: "[s390x] GCC check-tcg"
>> +      if: false # Temporarily disabled due to issue testing migration
>> (see commit 6d1da867e65).
>>        arch: s390x
>>        dist: bionic
>>        addons:
>> ---
>>
>>> Second,
>>> if just one of the qtests is failing, please only disable that single
>>> failing qtest and not the whole test pipeline.
>>
>> Last time we talked about this Dave was against that option:
>>
>> https://www.mail-archive.com/qemu-devel@nongnu.org/msg690085.html
>>
> 
> Was he? Citing his reply to the mail from your URL:
> 
>  "Before we take the hammer to it, could you try reducing it's initial
> bandwidth"
> 
> So all I can see is that he first wanted to try something different than
> disabling the test. And now,  instead of using a small hammer to disable
> just this test, you now even use a very *big* hammer to disable *all*
> tests. That's just a very bad idea. Please don't.

You are right. I was concerned about keeping the CI working because
the longer it stays red, the less likely the community is to worry about
it, and I didn't want us to lose interest in testing (or discredit its
importance). I now understand that without CI gating, it is
pointless to try to keep it green (at the cost of making all local
testing slower; it is worse if maintainers stop their local
testing).

WRT this test, I have no idea what it is doing, nor why it
fails on s390x containers, so I sent a simple patch to fix the CI,
but failed to foresee its negative effect on the rest of the
developers.

Thanks Michael for fixing my mess with your patch:

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>

Regards,

Phil.


Re: [PATCH] Revert "tests/migration: Reduce autoconverge initial bandwidth"
Posted by Philippe Mathieu-Daudé 3 years, 9 months ago
Hi Thomas,

On 6/24/20 12:21 PM, Philippe Mathieu-Daudé wrote:
> On 6/24/20 7:04 AM, Thomas Huth wrote:
>> On 23/06/2020 19.35, Philippe Mathieu-Daudé wrote:
>>> On 6/23/20 7:07 PM, Thomas Huth wrote:
>>>> On 23/06/2020 17.39, Philippe Mathieu-Daudé wrote:
>>>>> On 6/23/20 4:56 PM, Michael S. Tsirkin wrote:
>>>>>> This reverts commit 6d1da867e65f ("tests/migration: Reduce autoconverge initial bandwidth")
>>>>>> since that change makes unit tests much slower for all developers, while it's not
>>>>>> a robust way to fix migration tests. Migration tests need to find
>>>>>> a more robust way to discover a reasonable bandwidth without slowing
>>>>>> things down for everyone.
>>>>>
>>>>> Please also mention we can do this since 1de8e4c4dcf which allow
>>>>> marked the s390x job as "unstable" and allow it to fail.
>>>>>
>>>>> But if nobody is going to look at it, instead lets disable
>>>>> it until someone figure out the issue:
>>>>>
>>>>> -- >8 --
>>>>> diff --git a/.travis.yml b/.travis.yml
>>>>> index 74158f741b..364e67b14b 100644
>>>>> --- a/.travis.yml
>>>>> +++ b/.travis.yml
>>>>> @@ -507,6 +507,7 @@ jobs:
>>>>>
>>>>>      - name: "[s390x] Clang (disable-tcg)"
>>>>>        arch: s390x
>>>>> +      if: false # Temporarily disabled due to issue testing migration
>>>>> (see commit 6d1da867e65).
>>>>>        dist: bionic
>>>>>        compiler: clang
>>>>>        addons:
>>>>
>>>> Sorry, but that looks wrong. First, the disable-tcg test does not run
>>>> the qtests at all. So this is certainly the wrong location here.
>>>
>>> Indeed, this is the previous job:
>>>
>>> -- >8 --
>>> diff --git a/.travis.yml b/.travis.yml
>>> index 74158f741b..b399e20078 100644
>>> --- a/.travis.yml
>>> +++ b/.travis.yml
>>> @@ -464,6 +464,7 @@ jobs:
>>>          - CONFIG="--disable-containers
>>> --target-list=ppc64-softmmu,ppc64le-linux-user"
>>>
>>>      - name: "[s390x] GCC check-tcg"
>>> +      if: false # Temporarily disabled due to issue testing migration
>>> (see commit 6d1da867e65).
>>>        arch: s390x
>>>        dist: bionic
>>>        addons:
>>> ---
>>>
>>>> Second,
>>>> if just one of the qtests is failing, please only disable that single
>>>> failing qtest and not the whole test pipeline.
>>>
>>> Last time we talked about this Dave was against that option:
>>>
>>> https://www.mail-archive.com/qemu-devel@nongnu.org/msg690085.html
>>>
>>
>> Was he? Citing his reply to the mail from your URL:
>>
>>  "Before we take the hammer to it, could you try reducing it's initial
>> bandwidth"
>>
>> So all I can see is that he first wanted to try something different than
>> disabling the test. And now,  instead of using a small hammer to disable
>> just this test, you now even use a very *big* hammer to disable *all*
>> tests. That's just a very bad idea. Please don't.

You asked on IRC for the CI failure history:

https://travis-ci.org/github/philmd/qemu/jobs/663607963
https://travis-ci.org/github/philmd/qemu/jobs/663622229
https://travis-ci.org/github/philmd/qemu/jobs/663642522

"No output has been received in the last 10m0s, this potentially
indicates a stalled build or something wrong with the build itself."

> 
> You are right. I was being concerned about having CI working because
> the more red it stay, the less likely the community will worry about
> it, and I didn't want we loose interest in testing (or discredit its
> importance). I now understand without having CI gating, it is
> pointless to try to keep it green (at the cost of having all local
> testing running slower, it is worst if maintainers stop their local
> testing).
> 
> WRT this test I have no idea what it is doing, furthermore why it
> fails on s390x containers, so I sent a simple patch to fix the CI,
> but failed to foreseen its negative effect on the rest of the
> developers.
> 
> Thanks Michael for fixing my mess with your patch:
> 
> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> 
> Regards,
> 
> Phil.
> 


Re: [PATCH] Revert "tests/migration: Reduce autoconverge initial bandwidth"
Posted by Thomas Huth 3 years, 9 months ago
On 24/06/2020 18.26, Philippe Mathieu-Daudé wrote:
> Hi Thomas,
> 
> On 6/24/20 12:21 PM, Philippe Mathieu-Daudé wrote:
>> On 6/24/20 7:04 AM, Thomas Huth wrote:
>>> On 23/06/2020 19.35, Philippe Mathieu-Daudé wrote:
>>>> On 6/23/20 7:07 PM, Thomas Huth wrote:
>>>>> On 23/06/2020 17.39, Philippe Mathieu-Daudé wrote:
>>>>>> On 6/23/20 4:56 PM, Michael S. Tsirkin wrote:
>>>>>>> This reverts commit 6d1da867e65f ("tests/migration: Reduce autoconverge initial bandwidth")
>>>>>>> since that change makes unit tests much slower for all developers, while it's not
>>>>>>> a robust way to fix migration tests. Migration tests need to find
>>>>>>> a more robust way to discover a reasonable bandwidth without slowing
>>>>>>> things down for everyone.
>>>>>>
>>>>>> Please also mention we can do this since 1de8e4c4dcf which allow
>>>>>> marked the s390x job as "unstable" and allow it to fail.
>>>>>>
>>>>>> But if nobody is going to look at it, instead lets disable
>>>>>> it until someone figure out the issue:
>>>>>>
>>>>>> -- >8 --
>>>>>> diff --git a/.travis.yml b/.travis.yml
>>>>>> index 74158f741b..364e67b14b 100644
>>>>>> --- a/.travis.yml
>>>>>> +++ b/.travis.yml
>>>>>> @@ -507,6 +507,7 @@ jobs:
>>>>>>
>>>>>>       - name: "[s390x] Clang (disable-tcg)"
>>>>>>         arch: s390x
>>>>>> +      if: false # Temporarily disabled due to issue testing migration
>>>>>> (see commit 6d1da867e65).
>>>>>>         dist: bionic
>>>>>>         compiler: clang
>>>>>>         addons:
>>>>>
>>>>> Sorry, but that looks wrong. First, the disable-tcg test does not run
>>>>> the qtests at all. So this is certainly the wrong location here.
>>>>
>>>> Indeed, this is the previous job:
>>>>
>>>> -- >8 --
>>>> diff --git a/.travis.yml b/.travis.yml
>>>> index 74158f741b..b399e20078 100644
>>>> --- a/.travis.yml
>>>> +++ b/.travis.yml
>>>> @@ -464,6 +464,7 @@ jobs:
>>>>           - CONFIG="--disable-containers
>>>> --target-list=ppc64-softmmu,ppc64le-linux-user"
>>>>
>>>>       - name: "[s390x] GCC check-tcg"
>>>> +      if: false # Temporarily disabled due to issue testing migration
>>>> (see commit 6d1da867e65).
>>>>         arch: s390x
>>>>         dist: bionic
>>>>         addons:
>>>> ---
>>>>
>>>>> Second,
>>>>> if just one of the qtests is failing, please only disable that single
>>>>> failing qtest and not the whole test pipeline.
>>>>
>>>> Last time we talked about this Dave was against that option:
>>>>
>>>> https://www.mail-archive.com/qemu-devel@nongnu.org/msg690085.html
>>>>
>>>
>>> Was he? Citing his reply to the mail from your URL:
>>>
>>>   "Before we take the hammer to it, could you try reducing it's initial
>>> bandwidth"
>>>
>>> So all I can see is that he first wanted to try something different than
>>> disabling the test. And now,  instead of using a small hammer to disable
>>> just this test, you now even use a very *big* hammer to disable *all*
>>> tests. That's just a very bad idea. Please don't.
> 
> You asked on IRC the CI failures history:
> 
> https://travis-ci.org/github/philmd/qemu/jobs/663607963
> https://travis-ci.org/github/philmd/qemu/jobs/663622229
> https://travis-ci.org/github/philmd/qemu/jobs/663642522
> 
> "No output has been received in the last 10m0s, this potentially
> indicates a stalled build or something wrong with the build itself."

Ok, but none of these hangs seem to be related to the migration test. I
assume that's just a generic unreliability of the builder machines,
which should have been addressed with Alex's patch 1de8e4c4dcf2af8e1?

  Thomas


Re: [PATCH] Revert "tests/migration: Reduce autoconverge initial bandwidth"
Posted by Michael S. Tsirkin 3 years, 10 months ago
On Tue, Jun 23, 2020 at 05:39:13PM +0200, Philippe Mathieu-Daudé wrote:
> On 6/23/20 4:56 PM, Michael S. Tsirkin wrote:
> > This reverts commit 6d1da867e65f ("tests/migration: Reduce autoconverge initial bandwidth")
> > since that change makes unit tests much slower for all developers, while it's not
> > a robust way to fix migration tests. Migration tests need to find
> > a more robust way to discover a reasonable bandwidth without slowing
> > things down for everyone.
> 
> Please also mention we can do this since 1de8e4c4dcf which allow
> marked the s390x job as "unstable" and allow it to fail.
> 
> But if nobody is going to look at it, instead lets disable

OK we can do this as an extra patch. Can you supply a S.O.B. pls?

> it until someone figure out the issue:
> 
> -- >8 --
> diff --git a/.travis.yml b/.travis.yml
> index 74158f741b..364e67b14b 100644
> --- a/.travis.yml
> +++ b/.travis.yml
> @@ -507,6 +507,7 @@ jobs:
> 
>      - name: "[s390x] Clang (disable-tcg)"
>        arch: s390x
> +      if: false # Temporarily disabled due to issue testing migration
> (see commit 6d1da867e65).
>        dist: bionic
>        compiler: clang
>        addons:
> ---
> 
> With the hunk amended (no need to mention 1de8e4c4d actually):
> Acked-by: Philippe Mathieu-Daudé <philmd@redhat.com>
> 
> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > 
> > ---
> >  tests/qtest/migration-test.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> > index dc3490c9fa..21ea5ba1d2 100644
> > --- a/tests/qtest/migration-test.c
> > +++ b/tests/qtest/migration-test.c
> > @@ -1211,7 +1211,7 @@ static void test_migrate_auto_converge(void)
> >       * without throttling.
> >       */
> >      migrate_set_parameter_int(from, "downtime-limit", 1);
> > -    migrate_set_parameter_int(from, "max-bandwidth", 1000000); /* ~1Mb/s */
> > +    migrate_set_parameter_int(from, "max-bandwidth", 100000000); /* ~100Mb/s */
> >  
> >      /* To check remaining size after precopy */
> >      migrate_set_capability(from, "pause-before-switchover", true);
> > 


Re: [PATCH] Revert "tests/migration: Reduce autoconverge initial bandwidth"
Posted by Dr. David Alan Gilbert 3 years, 9 months ago
* Michael S. Tsirkin (mst@redhat.com) wrote:
> This reverts commit 6d1da867e65f ("tests/migration: Reduce autoconverge initial bandwidth")
> since that change makes unit tests much slower for all developers, while it's not
> a robust way to fix migration tests. Migration tests need to find
> a more robust way to discover a reasonable bandwidth without slowing
> things down for everyone.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Yeh, I'd hoped something else could provide another way but I hadn't
realised how this worked. You don't hit auto-converge until you've done
two passes; since we're running ~100MByte of dirty memory, that means
it won't hit the autoconverge stage until
2 x 100MByte / (1MByte/s bandwidth) = 200 seconds.

I'm actually measuring 130 seconds, which seems sane to me, since
there's a lot of overlap; so yeh we need to find a different way to set
this up.
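
The back-of-the-envelope estimate above can be written down as a small helper. This is only a sketch of the arithmetic, assuming two full passes over the dirty memory and a bandwidth given in bytes per second; autoconverge_delay_secs() is a hypothetical name and not a QEMU function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Rough estimate, in seconds, of how long migration runs before the
 * auto-converge stage is reached: two full passes over the dirty
 * memory at the configured bandwidth. Ignores any overlap, so the
 * measured time can be lower.
 */
uint64_t autoconverge_delay_secs(uint64_t dirty_bytes,
                                 uint64_t max_bandwidth_bytes_per_sec)
{
    const uint64_t passes = 2; /* assumption taken from the text above */

    assert(max_bandwidth_bytes_per_sec > 0);
    return passes * dirty_bytes / max_bandwidth_bytes_per_sec;
}
```

With roughly 100 MB of dirty memory, a max-bandwidth of 1000000 gives 200 seconds, while 100000000 gives 2 seconds.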


Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>


> ---
>  tests/qtest/migration-test.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> index dc3490c9fa..21ea5ba1d2 100644
> --- a/tests/qtest/migration-test.c
> +++ b/tests/qtest/migration-test.c
> @@ -1211,7 +1211,7 @@ static void test_migrate_auto_converge(void)
>       * without throttling.
>       */
>      migrate_set_parameter_int(from, "downtime-limit", 1);
> -    migrate_set_parameter_int(from, "max-bandwidth", 1000000); /* ~1Mb/s */
> +    migrate_set_parameter_int(from, "max-bandwidth", 100000000); /* ~100Mb/s */
>  
>      /* To check remaining size after precopy */
>      migrate_set_capability(from, "pause-before-switchover", true);
> -- 
> MST
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


Re: [PATCH] Revert "tests/migration: Reduce autoconverge initial bandwidth"
Posted by Michael S. Tsirkin 3 years, 9 months ago
On Tue, Jun 23, 2020 at 10:57:02AM -0400, Michael S. Tsirkin wrote:
> This reverts commit 6d1da867e65f ("tests/migration: Reduce autoconverge initial bandwidth")
> since that change makes unit tests much slower for all developers, while it's not
> a robust way to fix migration tests. Migration tests need to find
> a more robust way to discover a reasonable bandwidth without slowing
> things down for everyone.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

What's the conclusion here? Should I merge this?


> ---
>  tests/qtest/migration-test.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> index dc3490c9fa..21ea5ba1d2 100644
> --- a/tests/qtest/migration-test.c
> +++ b/tests/qtest/migration-test.c
> @@ -1211,7 +1211,7 @@ static void test_migrate_auto_converge(void)
>       * without throttling.
>       */
>      migrate_set_parameter_int(from, "downtime-limit", 1);
> -    migrate_set_parameter_int(from, "max-bandwidth", 1000000); /* ~1Mb/s */
> +    migrate_set_parameter_int(from, "max-bandwidth", 100000000); /* ~100Mb/s */
>  
>      /* To check remaining size after precopy */
>      migrate_set_capability(from, "pause-before-switchover", true);
> -- 
> MST


Re: [PATCH] Revert "tests/migration: Reduce autoconverge initial bandwidth"
Posted by Thomas Huth 3 years, 9 months ago
On 30/06/2020 15.07, Michael S. Tsirkin wrote:
> On Tue, Jun 23, 2020 at 10:57:02AM -0400, Michael S. Tsirkin wrote:
>> This reverts commit 6d1da867e65f ("tests/migration: Reduce autoconverge initial bandwidth")
>> since that change makes unit tests much slower for all developers, while it's not
>> a robust way to fix migration tests. Migration tests need to find
>> a more robust way to discover a reasonable bandwidth without slowing
>> things down for everyone.
>>
>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> 
> What's the conclusion here? Should I merge this?

Fine for me (from the s390x side). The test should not run with TCG in 
the CI for s390x ... if it still does, we have to have another closer 
look at the check there instead.

  Thomas
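
The kind of check referred to above might be sketched as follows. This assumes the check is a probe for a usable KVM device node, which is an assumption made for illustration; kvm_usable() and should_run_migration_tests() are hypothetical helpers, and only the POSIX access() call is a real API.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <unistd.h>

/* True if the KVM device node at 'path' is readable and writable. */
bool kvm_usable(const char *path)
{
    assert(path != NULL);
    return access(path, R_OK | W_OK) == 0;
}

/*
 * Hypothetical policy: on s390x, run the migration tests only when
 * KVM can be used (i.e. never under TCG); other architectures are
 * left unrestricted in this sketch.
 */
bool should_run_migration_tests(const char *arch, const char *kvm_path)
{
    assert(arch != NULL);

    if (strcmp(arch, "s390x") == 0) {
        return kvm_usable(kvm_path);
    }
    return true;
}
```

A caller would typically pass "/dev/kvm" as the path; taking it as a parameter keeps the helper easy to exercise without a real KVM device.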


> 
>> ---
>>   tests/qtest/migration-test.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
>> index dc3490c9fa..21ea5ba1d2 100644
>> --- a/tests/qtest/migration-test.c
>> +++ b/tests/qtest/migration-test.c
>> @@ -1211,7 +1211,7 @@ static void test_migrate_auto_converge(void)
>>        * without throttling.
>>        */
>>       migrate_set_parameter_int(from, "downtime-limit", 1);
>> -    migrate_set_parameter_int(from, "max-bandwidth", 1000000); /* ~1Mb/s */
>> +    migrate_set_parameter_int(from, "max-bandwidth", 100000000); /* ~100Mb/s */
>>   
>>       /* To check remaining size after precopy */
>>       migrate_set_capability(from, "pause-before-switchover", true);
>> -- 
>> MST
> 
> 


Re: [PATCH] Revert "tests/migration: Reduce autoconverge initial bandwidth"
Posted by Michael S. Tsirkin 3 years, 9 months ago
On Tue, Jun 30, 2020 at 03:20:04PM +0200, Thomas Huth wrote:
> On 30/06/2020 15.07, Michael S. Tsirkin wrote:
> > On Tue, Jun 23, 2020 at 10:57:02AM -0400, Michael S. Tsirkin wrote:
> > > This reverts commit 6d1da867e65f ("tests/migration: Reduce autoconverge initial bandwidth")
> > > since that change makes unit tests much slower for all developers, while it's not
> > > a robust way to fix migration tests. Migration tests need to find
> > > a more robust way to discover a reasonable bandwidth without slowing
> > > things down for everyone.
> > > 
> > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > 
> > What's the conclusion here? Should I merge this?
> 
> Fine for me (from the s390x side). The test should not run with TCG in the
> CI for s390x ... if it still does, we have to have another closer look at
> the check there instead.
> 
>  Thomas

ack pls?

> 
> > 
> > > ---
> > >   tests/qtest/migration-test.c | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> > > index dc3490c9fa..21ea5ba1d2 100644
> > > --- a/tests/qtest/migration-test.c
> > > +++ b/tests/qtest/migration-test.c
> > > @@ -1211,7 +1211,7 @@ static void test_migrate_auto_converge(void)
> > >        * without throttling.
> > >        */
> > >       migrate_set_parameter_int(from, "downtime-limit", 1);
> > > -    migrate_set_parameter_int(from, "max-bandwidth", 1000000); /* ~1Mb/s */
> > > +    migrate_set_parameter_int(from, "max-bandwidth", 100000000); /* ~100Mb/s */
> > >       /* To check remaining size after precopy */
> > >       migrate_set_capability(from, "pause-before-switchover", true);
> > > -- 
> > > MST
> > 
> > 


Re: [PATCH] Revert "tests/migration: Reduce autoconverge initial bandwidth"
Posted by Thomas Huth 3 years, 9 months ago
On 30/06/2020 16.43, Michael S. Tsirkin wrote:
> On Tue, Jun 30, 2020 at 03:20:04PM +0200, Thomas Huth wrote:
>> On 30/06/2020 15.07, Michael S. Tsirkin wrote:
>>> On Tue, Jun 23, 2020 at 10:57:02AM -0400, Michael S. Tsirkin wrote:
>>>> This reverts commit 6d1da867e65f ("tests/migration: Reduce autoconverge initial bandwidth")
>>>> since that change makes unit tests much slower for all developers, while it's not
>>>> a robust way to fix migration tests. Migration tests need to find
>>>> a more robust way to discover a reasonable bandwidth without slowing
>>>> things down for everyone.
>>>>
>>>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>>>
>>> What's the conclusion here? Should I merge this?
>>
>> Fine for me (from the s390x side). The test should not run with TCG in the
>> CI for s390x ... if it still does, we have to have another closer look at
>> the check there instead.
>>
>>   Thomas
> 
> ack pls?

Acked-by: Thomas Huth <thuth@redhat.com>


Re: [PATCH] Revert "tests/migration: Reduce autoconverge initial bandwidth"
Posted by Dr. David Alan Gilbert 3 years, 9 months ago
* Michael S. Tsirkin (mst@redhat.com) wrote:
> On Tue, Jun 23, 2020 at 10:57:02AM -0400, Michael S. Tsirkin wrote:
> > This reverts commit 6d1da867e65f ("tests/migration: Reduce autoconverge initial bandwidth")
> > since that change makes unit tests much slower for all developers, while it's not
> > a robust way to fix migration tests. Migration tests need to find
> > a more robust way to discover a reasonable bandwidth without slowing
> > things down for everyone.
> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> 
> What's the conclusion here? Should I merge this?

Yes please; I need to rethink that.

Dave

> 
> > ---
> >  tests/qtest/migration-test.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> > index dc3490c9fa..21ea5ba1d2 100644
> > --- a/tests/qtest/migration-test.c
> > +++ b/tests/qtest/migration-test.c
> > @@ -1211,7 +1211,7 @@ static void test_migrate_auto_converge(void)
> >       * without throttling.
> >       */
> >      migrate_set_parameter_int(from, "downtime-limit", 1);
> > -    migrate_set_parameter_int(from, "max-bandwidth", 1000000); /* ~1Mb/s */
> > +    migrate_set_parameter_int(from, "max-bandwidth", 100000000); /* ~100Mb/s */
> >  
> >      /* To check remaining size after precopy */
> >      migrate_set_capability(from, "pause-before-switchover", true);
> > -- 
> > MST
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


Re: [PATCH] Revert "tests/migration: Reduce autoconverge initial bandwidth"
Posted by Michael S. Tsirkin 3 years, 9 months ago
On Tue, Jun 30, 2020 at 02:59:12PM +0100, Dr. David Alan Gilbert wrote:
> * Michael S. Tsirkin (mst@redhat.com) wrote:
> > On Tue, Jun 23, 2020 at 10:57:02AM -0400, Michael S. Tsirkin wrote:
> > > This reverts commit 6d1da867e65f ("tests/migration: Reduce autoconverge initial bandwidth")
> > > since that change makes unit tests much slower for all developers, while it's not
> > > a robust way to fix migration tests. Migration tests need to find
> > > a more robust way to discover a reasonable bandwidth without slowing
> > > things down for everyone.
> > > 
> > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > 
> > What's the conclusion here? Should I merge this?
> 
> Yes please; I need to rethink that.
> 
> Dave
> 
> > 
> > > ---
> > >  tests/qtest/migration-test.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> > > index dc3490c9fa..21ea5ba1d2 100644
> > > --- a/tests/qtest/migration-test.c
> > > +++ b/tests/qtest/migration-test.c
> > > @@ -1211,7 +1211,7 @@ static void test_migrate_auto_converge(void)
> > >       * without throttling.
> > >       */
> > >      migrate_set_parameter_int(from, "downtime-limit", 1);
> > > -    migrate_set_parameter_int(from, "max-bandwidth", 1000000); /* ~1Mb/s */
> > > +    migrate_set_parameter_int(from, "max-bandwidth", 100000000); /* ~100Mb/s */
> > >  
> > >      /* To check remaining size after precopy */
> > >      migrate_set_capability(from, "pause-before-switchover", true);
> > > -- 
> > > MST
> > 
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

ack pls?


Re: [PATCH] Revert "tests/migration: Reduce autoconverge initial bandwidth"
Posted by Dr. David Alan Gilbert 3 years, 9 months ago
* Michael S. Tsirkin (mst@redhat.com) wrote:
> On Tue, Jun 30, 2020 at 02:59:12PM +0100, Dr. David Alan Gilbert wrote:
> > * Michael S. Tsirkin (mst@redhat.com) wrote:
> > > On Tue, Jun 23, 2020 at 10:57:02AM -0400, Michael S. Tsirkin wrote:
> > > > This reverts commit 6d1da867e65f ("tests/migration: Reduce autoconverge initial bandwidth")
> > > > since that change makes unit tests much slower for all developers, while it's not
> > > > a robust way to fix migration tests. Migration tests need to find
> > > > a more robust way to discover a reasonable bandwidth without slowing
> > > > things down for everyone.
> > > > 
> > > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > > 
> > > What's the conclusion here? Should I merge this?
> > 
> > Yes please; I need to rethink that.
> > 
> > Dave
> > 
> > > 
> > > > ---
> > > >  tests/qtest/migration-test.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> > > > index dc3490c9fa..21ea5ba1d2 100644
> > > > --- a/tests/qtest/migration-test.c
> > > > +++ b/tests/qtest/migration-test.c
> > > > @@ -1211,7 +1211,7 @@ static void test_migrate_auto_converge(void)
> > > >       * without throttling.
> > > >       */
> > > >      migrate_set_parameter_int(from, "downtime-limit", 1);
> > > > -    migrate_set_parameter_int(from, "max-bandwidth", 1000000); /* ~1Mb/s */
> > > > +    migrate_set_parameter_int(from, "max-bandwidth", 100000000); /* ~100Mb/s */
> > > >  
> > > >      /* To check remaining size after precopy */
> > > >      migrate_set_capability(from, "pause-before-switchover", true);
> > > > -- 
> > > > MST
> > > 
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 
> ack pls?

Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


Re: [PATCH] Revert "tests/migration: Reduce autoconverge initial bandwidth"
Posted by Philippe Mathieu-Daudé 3 years, 9 months ago
On 6/23/20 4:56 PM, Michael S. Tsirkin wrote:
> This reverts commit 6d1da867e65f ("tests/migration: Reduce autoconverge initial bandwidth")
> since that change makes unit tests much slower for all developers, while it's not
> a robust way to fix migration tests. Migration tests need to find
> a more robust way to discover a reasonable bandwidth without slowing
> things down for everyone.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> 
> ---
>  tests/qtest/migration-test.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> index dc3490c9fa..21ea5ba1d2 100644
> --- a/tests/qtest/migration-test.c
> +++ b/tests/qtest/migration-test.c
> @@ -1211,7 +1211,7 @@ static void test_migrate_auto_converge(void)
>       * without throttling.
>       */
>      migrate_set_parameter_int(from, "downtime-limit", 1);
> -    migrate_set_parameter_int(from, "max-bandwidth", 1000000); /* ~1Mb/s */
> +    migrate_set_parameter_int(from, "max-bandwidth", 100000000); /* ~100Mb/s */
>  
>      /* To check remaining size after precopy */
>      migrate_set_capability(from, "pause-before-switchover", true);
> 

Acked-by: Philippe Mathieu-Daudé <philmd@redhat.com>