[PATCH] python: QEMUMachine: enable qmp accept timeout by default

Vladimir Sementsov-Ogievskiy posted 1 patch 3 years, 7 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20220624195252.175249-1-vsementsov@yandex-team.ru
Maintainers: John Snow <jsnow@redhat.com>, Cleber Rosa <crosa@redhat.com>, Beraldo Leal <bleal@redhat.com>
python/qemu/machine/machine.py | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
[PATCH] python: QEMUMachine: enable qmp accept timeout by default
Posted by Vladimir Sementsov-Ogievskiy 3 years, 7 months ago
I've spent much time trying to debug hanging pipeline in gitlab. I
started from and idea that I have problem in code in my series (which
has some timeouts). Finally I found that the problem is that I've used
QEMUMachine class directly to avoid qtest, and didn't add necessary
arguments. Qemu fails and we wait for qmp accept endlessly. In gitlab
it's just stopped by timeout (one hour) with no sign of what's going
wrong.

With timeout enabled, gitlab don't wait for an hour and prints all
needed information.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
---

Hi all!

Just compare this
  https://gitlab.com/vsementsov/qemu/-/pipelines/572232557
and this
  https://gitlab.com/vsementsov/qemu/-/pipelines/572526252

and you'll see that the latter is much better.

 python/qemu/machine/machine.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
index 37191f433b..01a12f6f73 100644
--- a/python/qemu/machine/machine.py
+++ b/python/qemu/machine/machine.py
@@ -131,7 +131,7 @@ def __init__(self,
                  drain_console: bool = False,
                  console_log: Optional[str] = None,
                  log_dir: Optional[str] = None,
-                 qmp_timer: Optional[float] = None):
+                 qmp_timer: float = 30):
         '''
         Initialize a QEMUMachine
 
-- 
2.25.1
Re: [PATCH] python: QEMUMachine: enable qmp accept timeout by default
Posted by John Snow 3 years, 7 months ago
On Fri, Jun 24, 2022 at 3:53 PM Vladimir Sementsov-Ogievskiy
<vsementsov@yandex-team.ru> wrote:
>
> I've spent much time trying to debug hanging pipeline in gitlab. I
> started from and idea that I have problem in code in my series (which
> has some timeouts). Finally I found that the problem is that I've used
> QEMUMachine class directly to avoid qtest, and didn't add necessary
> arguments. Qemu fails and we wait for qmp accept endlessly. In gitlab
> it's just stopped by timeout (one hour) with no sign of what's going
> wrong.
>
> With timeout enabled, gitlab don't wait for an hour and prints all
> needed information.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> ---
>
> Hi all!
>
> Just compare this
>   https://gitlab.com/vsementsov/qemu/-/pipelines/572232557
> and this
>   https://gitlab.com/vsementsov/qemu/-/pipelines/572526252
>
> and you'll see that the latter is much better.
>
>  python/qemu/machine/machine.py | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
> index 37191f433b..01a12f6f73 100644
> --- a/python/qemu/machine/machine.py
> +++ b/python/qemu/machine/machine.py
> @@ -131,7 +131,7 @@ def __init__(self,
>                   drain_console: bool = False,
>                   console_log: Optional[str] = None,
>                   log_dir: Optional[str] = None,
> -                 qmp_timer: Optional[float] = None):
> +                 qmp_timer: float = 30):
>          '''
>          Initialize a QEMUMachine
>
> --
> 2.25.1
>

Oh, this is because machine.py uses the qmp_timer for *all* timeouts,
and not just the QMP commands themselves, and this relates to the work
Marc Andre is doing with regards to changing the launch mechanism to
handle the race condition when the QEMU launch fails, but the QMP
connection just sits waiting.

I'm quite of the mind that it's really time to rewrite machine.py to
use the native asyncio interfaces I've been writing to help manage
this, but in the meantime I think this is probably a reasonable
concession and a more useful default.

...I think. Willing to take it for now and re-investigate when the
other fixes make it to the tree.

Reviewed-by: John Snow <jsnow@redhat.com>
Re: [PATCH] python: QEMUMachine: enable qmp accept timeout by default
Posted by John Snow 3 years, 7 months ago
On Mon, Jul 11, 2022 at 5:16 PM John Snow <jsnow@redhat.com> wrote:
>
> On Fri, Jun 24, 2022 at 3:53 PM Vladimir Sementsov-Ogievskiy
> <vsementsov@yandex-team.ru> wrote:
> >
> > I've spent much time trying to debug hanging pipeline in gitlab. I
> > started from and idea that I have problem in code in my series (which
> > has some timeouts). Finally I found that the problem is that I've used
> > QEMUMachine class directly to avoid qtest, and didn't add necessary
> > arguments. Qemu fails and we wait for qmp accept endlessly. In gitlab
> > it's just stopped by timeout (one hour) with no sign of what's going
> > wrong.
> >
> > With timeout enabled, gitlab don't wait for an hour and prints all
> > needed information.
> >
> > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> > ---
> >
> > Hi all!
> >
> > Just compare this
> >   https://gitlab.com/vsementsov/qemu/-/pipelines/572232557
> > and this
> >   https://gitlab.com/vsementsov/qemu/-/pipelines/572526252
> >
> > and you'll see that the latter is much better.
> >
> >  python/qemu/machine/machine.py | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
> > index 37191f433b..01a12f6f73 100644
> > --- a/python/qemu/machine/machine.py
> > +++ b/python/qemu/machine/machine.py
> > @@ -131,7 +131,7 @@ def __init__(self,
> >                   drain_console: bool = False,
> >                   console_log: Optional[str] = None,
> >                   log_dir: Optional[str] = None,
> > -                 qmp_timer: Optional[float] = None):
> > +                 qmp_timer: float = 30):
> >          '''
> >          Initialize a QEMUMachine
> >
> > --
> > 2.25.1
> >
>
> Oh, this is because machine.py uses the qmp_timer for *all* timeouts,
> and not just the QMP commands themselves, and this relates to the work
> Marc Andre is doing with regards to changing the launch mechanism to
> handle the race condition when the QEMU launch fails, but the QMP
> connection just sits waiting.
>
> I'm quite of the mind that it's really time to rewrite machine.py to
> use the native asyncio interfaces I've been writing to help manage
> this, but in the meantime I think this is probably a reasonable
> concession and a more useful default.
>
> ...I think. Willing to take it for now and re-investigate when the
> other fixes make it to the tree.
>
> Reviewed-by: John Snow <jsnow@redhat.com>

Oh, keep the type as Optional[float], though, so the timeout can be
disabled again, and keeps the type consistent with the qtest
derivative class. I've staged your patch with that change made, let me
know if that's not OK. Modified patch is on my python branch:

Thanks, merged.

--js
Re: [PATCH] python: QEMUMachine: enable qmp accept timeout by default
Posted by Vladimir Sementsov-Ogievskiy 3 years ago
On 7/12/22 00:21, John Snow wrote:
> On Mon, Jul 11, 2022 at 5:16 PM John Snow <jsnow@redhat.com> wrote:
>>
>> On Fri, Jun 24, 2022 at 3:53 PM Vladimir Sementsov-Ogievskiy
>> <vsementsov@yandex-team.ru> wrote:
>>>
>>> I've spent much time trying to debug hanging pipeline in gitlab. I
>>> started from and idea that I have problem in code in my series (which
>>> has some timeouts). Finally I found that the problem is that I've used
>>> QEMUMachine class directly to avoid qtest, and didn't add necessary
>>> arguments. Qemu fails and we wait for qmp accept endlessly. In gitlab
>>> it's just stopped by timeout (one hour) with no sign of what's going
>>> wrong.
>>>
>>> With timeout enabled, gitlab don't wait for an hour and prints all
>>> needed information.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>>> ---
>>>
>>> Hi all!
>>>
>>> Just compare this
>>>    https://gitlab.com/vsementsov/qemu/-/pipelines/572232557
>>> and this
>>>    https://gitlab.com/vsementsov/qemu/-/pipelines/572526252
>>>
>>> and you'll see that the latter is much better.
>>>
>>>   python/qemu/machine/machine.py | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
>>> index 37191f433b..01a12f6f73 100644
>>> --- a/python/qemu/machine/machine.py
>>> +++ b/python/qemu/machine/machine.py
>>> @@ -131,7 +131,7 @@ def __init__(self,
>>>                    drain_console: bool = False,
>>>                    console_log: Optional[str] = None,
>>>                    log_dir: Optional[str] = None,
>>> -                 qmp_timer: Optional[float] = None):
>>> +                 qmp_timer: float = 30):
>>>           '''
>>>           Initialize a QEMUMachine
>>>
>>> --
>>> 2.25.1
>>>
>>
>> Oh, this is because machine.py uses the qmp_timer for *all* timeouts,
>> and not just the QMP commands themselves, and this relates to the work
>> Marc Andre is doing with regards to changing the launch mechanism to
>> handle the race condition when the QEMU launch fails, but the QMP
>> connection just sits waiting.
>>
>> I'm quite of the mind that it's really time to rewrite machine.py to
>> use the native asyncio interfaces I've been writing to help manage
>> this, but in the meantime I think this is probably a reasonable
>> concession and a more useful default.
>>
>> ...I think. Willing to take it for now and re-investigate when the
>> other fixes make it to the tree.
>>
>> Reviewed-by: John Snow <jsnow@redhat.com>
> 
> Oh, keep the type as Optional[float], though, so the timeout can be
> disabled again, and keeps the type consistent with the qtest
> derivative class. I've staged your patch with that change made, let me
> know if that's not OK. Modified patch is on my python branch:
> 
> Thanks, merged.
> 

Hmm, seems that's lost.. I don't see it neither in master nor in your python branch..

-- 
Best regards,
Vladimir
Re: [PATCH] python: QEMUMachine: enable qmp accept timeout by default
Posted by John Snow 3 years ago
On Tue, Jan 10, 2023, 3:53 AM Vladimir Sementsov-Ogievskiy <
vsementsov@yandex-team.ru> wrote:

> On 7/12/22 00:21, John Snow wrote:
> > On Mon, Jul 11, 2022 at 5:16 PM John Snow <jsnow@redhat.com> wrote:
> >>
> >> On Fri, Jun 24, 2022 at 3:53 PM Vladimir Sementsov-Ogievskiy
> >> <vsementsov@yandex-team.ru> wrote:
> >>>
> >>> I've spent much time trying to debug hanging pipeline in gitlab. I
> >>> started from and idea that I have problem in code in my series (which
> >>> has some timeouts). Finally I found that the problem is that I've used
> >>> QEMUMachine class directly to avoid qtest, and didn't add necessary
> >>> arguments. Qemu fails and we wait for qmp accept endlessly. In gitlab
> >>> it's just stopped by timeout (one hour) with no sign of what's going
> >>> wrong.
> >>>
> >>> With timeout enabled, gitlab don't wait for an hour and prints all
> >>> needed information.
> >>>
> >>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru
> >
> >>> ---
> >>>
> >>> Hi all!
> >>>
> >>> Just compare this
> >>>    https://gitlab.com/vsementsov/qemu/-/pipelines/572232557
> >>> and this
> >>>    https://gitlab.com/vsementsov/qemu/-/pipelines/572526252
> >>>
> >>> and you'll see that the latter is much better.
> >>>
> >>>   python/qemu/machine/machine.py | 2 +-
> >>>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>> diff --git a/python/qemu/machine/machine.py
> b/python/qemu/machine/machine.py
> >>> index 37191f433b..01a12f6f73 100644
> >>> --- a/python/qemu/machine/machine.py
> >>> +++ b/python/qemu/machine/machine.py
> >>> @@ -131,7 +131,7 @@ def __init__(self,
> >>>                    drain_console: bool = False,
> >>>                    console_log: Optional[str] = None,
> >>>                    log_dir: Optional[str] = None,
> >>> -                 qmp_timer: Optional[float] = None):
> >>> +                 qmp_timer: float = 30):
> >>>           '''
> >>>           Initialize a QEMUMachine
> >>>
> >>> --
> >>> 2.25.1
> >>>
> >>
> >> Oh, this is because machine.py uses the qmp_timer for *all* timeouts,
> >> and not just the QMP commands themselves, and this relates to the work
> >> Marc Andre is doing with regards to changing the launch mechanism to
> >> handle the race condition when the QEMU launch fails, but the QMP
> >> connection just sits waiting.
> >>
> >> I'm quite of the mind that it's really time to rewrite machine.py to
> >> use the native asyncio interfaces I've been writing to help manage
> >> this, but in the meantime I think this is probably a reasonable
> >> concession and a more useful default.
> >>
> >> ...I think. Willing to take it for now and re-investigate when the
> >> other fixes make it to the tree.
> >>
> >> Reviewed-by: John Snow <jsnow@redhat.com>
> >
> > Oh, keep the type as Optional[float], though, so the timeout can be
> > disabled again, and keeps the type consistent with the qtest
> > derivative class. I've staged your patch with that change made, let me
> > know if that's not OK. Modified patch is on my python branch:
> >
> > Thanks, merged.
> >
>
> Hmm, seems that's lost.. I don't see it neither in master nor in your
> python branch..
>
> --
> Best regards,
> Vladimir
>

:(

I'll fix it. Thanks for resending the iotests series, too - the old version
was at the very top of my inbox :)

>
Re: [PATCH] python: QEMUMachine: enable qmp accept timeout by default
Posted by John Snow 3 years ago
On Tue, Jan 10, 2023 at 12:06 PM John Snow <jsnow@redhat.com> wrote:
>
>
>
> On Tue, Jan 10, 2023, 3:53 AM Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> wrote:
>>
>> On 7/12/22 00:21, John Snow wrote:
>> > On Mon, Jul 11, 2022 at 5:16 PM John Snow <jsnow@redhat.com> wrote:
>> >>
>> >> On Fri, Jun 24, 2022 at 3:53 PM Vladimir Sementsov-Ogievskiy
>> >> <vsementsov@yandex-team.ru> wrote:
>> >>>
>> >>> I've spent much time trying to debug hanging pipeline in gitlab. I
>> >>> started from and idea that I have problem in code in my series (which
>> >>> has some timeouts). Finally I found that the problem is that I've used
>> >>> QEMUMachine class directly to avoid qtest, and didn't add necessary
>> >>> arguments. Qemu fails and we wait for qmp accept endlessly. In gitlab
>> >>> it's just stopped by timeout (one hour) with no sign of what's going
>> >>> wrong.
>> >>>
>> >>> With timeout enabled, gitlab don't wait for an hour and prints all
>> >>> needed information.
>> >>>
>> >>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>> >>> ---
>> >>>
>> >>> Hi all!
>> >>>
>> >>> Just compare this
>> >>>    https://gitlab.com/vsementsov/qemu/-/pipelines/572232557
>> >>> and this
>> >>>    https://gitlab.com/vsementsov/qemu/-/pipelines/572526252
>> >>>
>> >>> and you'll see that the latter is much better.
>> >>>
>> >>>   python/qemu/machine/machine.py | 2 +-
>> >>>   1 file changed, 1 insertion(+), 1 deletion(-)
>> >>>
>> >>> diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
>> >>> index 37191f433b..01a12f6f73 100644
>> >>> --- a/python/qemu/machine/machine.py
>> >>> +++ b/python/qemu/machine/machine.py
>> >>> @@ -131,7 +131,7 @@ def __init__(self,
>> >>>                    drain_console: bool = False,
>> >>>                    console_log: Optional[str] = None,
>> >>>                    log_dir: Optional[str] = None,
>> >>> -                 qmp_timer: Optional[float] = None):
>> >>> +                 qmp_timer: float = 30):
>> >>>           '''
>> >>>           Initialize a QEMUMachine
>> >>>
>> >>> --
>> >>> 2.25.1
>> >>>
>> >>
>> >> Oh, this is because machine.py uses the qmp_timer for *all* timeouts,
>> >> and not just the QMP commands themselves, and this relates to the work
>> >> Marc Andre is doing with regards to changing the launch mechanism to
>> >> handle the race condition when the QEMU launch fails, but the QMP
>> >> connection just sits waiting.
>> >>
>> >> I'm quite of the mind that it's really time to rewrite machine.py to
>> >> use the native asyncio interfaces I've been writing to help manage
>> >> this, but in the meantime I think this is probably a reasonable
>> >> concession and a more useful default.
>> >>
>> >> ...I think. Willing to take it for now and re-investigate when the
>> >> other fixes make it to the tree.
>> >>
>> >> Reviewed-by: John Snow <jsnow@redhat.com>
>> >
>> > Oh, keep the type as Optional[float], though, so the timeout can be
>> > disabled again, and keeps the type consistent with the qtest
>> > derivative class. I've staged your patch with that change made, let me
>> > know if that's not OK. Modified patch is on my python branch:
>> >
>> > Thanks, merged.
>> >
>>
>> Hmm, seems that's lost.. I don't see it neither in master nor in your python branch..
>>
>> --
>> Best regards,
>> Vladimir
>
>
> :(
>
> I'll fix it. Thanks for resending the iotests series, too - the old version was at the very top of my inbox :)

Re-edited and Re-staged:

https://gitlab.com/jsnow/qemu/-/commits/python

--js