[v4] monitor: add asynchronous command type

[Qemu-devel] [PATCH v4 00/20] monitor: add asynchronous command type

Posted by Marc-André Lureau 5 years ago

Hi,

HMP and QMP commands are handled synchronously in qemu today. But
there are benefits to allowing the command handler to re-enter the main
loop if the command cannot be handled synchronously, or if it is
long-lasting. Some bugs such as rhbz#1230527 are difficult to solve
without it.

The common solution is to use a pair of command+event in this case.
But this approach has a number of issues:
- you can't "fix" an existing command: you need a new API, and ad-hoc
  documentation for that command+signal association, and old/broken
  command deprecation
- since the reply event is broadcast and 'id' is used for matching the
  request, it may conflict with other clients' request 'id' space
- it is arguably less efficient and elegant (weird API, useless return
  in most cases, broadcast reply, no cancelling on disconnect etc)

The following series implements an async command solution instead. By
introducing a session context and a command return handler, it can:
- defer the return, allowing the handler to re-enter the main loop
- return only to the caller (instead of broadcast events for reply)
- optionally allow cancellation when the client is gone
- track on-going qapi command(s) per client/session

and without introducing new QMP APIs or client-visible changes.
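
As a rough illustration of the session idea above, here is a minimal C
sketch. All names (ToySession, ToyReturn, toy_*) are hypothetical
stand-ins invented for this example; they are not the QmpSession or
QmpReturn types from the series. It assumes a session keeps a list of
pending commands, replies go only to the originating session, and a
disconnect marks whatever is still pending as cancelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins, not the types introduced by the series. */
typedef struct ToySession ToySession;

typedef struct ToyReturn {
    ToySession *session;     /* NULL once the client is gone */
    struct ToyReturn *next;  /* link in the session's pending list */
    int id;                  /* request 'id', private to this session */
} ToyReturn;

struct ToySession {
    ToyReturn *pending;      /* commands started but not yet answered */
    int replies_sent;        /* replies delivered to this client only */
};

/* Start tracking a command on behalf of one session. */
ToyReturn *toy_return_new(ToySession *s, int id)
{
    ToyReturn *r = calloc(1, sizeof(*r));
    assert(r != NULL);
    r->session = s;
    r->id = id;
    r->next = s->pending;
    s->pending = r;
    return r;
}

/* True when the client that issued the command has disconnected. */
bool toy_return_is_cancelled(const ToyReturn *r)
{
    return r->session == NULL;
}

/* Client disconnected: detach every pending command from the session. */
void toy_session_close(ToySession *s)
{
    for (ToyReturn *r = s->pending; r != NULL; ) {
        ToyReturn *next = r->next;
        r->session = NULL;
        r->next = NULL;
        r = next;
    }
    s->pending = NULL;
}

/* Reply to the originating session (if still there), then free. */
void toy_return_complete(ToyReturn *r)
{
    ToySession *s = r->session;
    if (s != NULL) {
        ToyReturn **p = &s->pending;
        while (*p != r) {
            assert(*p != NULL);  /* r must be in its session's list */
            p = &(*p)->next;
        }
        *p = r->next;            /* unlink from the pending list */
        s->replies_sent++;
    }
    free(r);
}
```

In this sketch a command that completes after its client went away
simply drops its reply, which is one possible reading of "cancellation
when the client is gone".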

Existing qemu commands can be gradually replaced by async:true
variants when needed, while carefully reviewing the concurrency
aspects. The marshaller helpers for async:true commands are split in
half: the calling function and the return function. The command is
called with a QmpReturn context and can return immediately or later,
using the generated return helper, which allows for a step-by-step
conversion.

The screendump command is converted to an async:true version to solve
rhbz#1230527. The command shows basic cancellation (this could be
extended if needed). It could be further improved to do asynchronous
IO writes as well.

v4:
- rebased, mostly adapting to new OOB code
  (there was not much feedback in v3 for the async command part,
   but preliminary patches got merged!)
- drop the RFC status

v3:
- complete rework, dropping the asynchronous commands visibility from
  the protocol side entirely (until there is a real need for it)
- rebased, with a few preliminary cleanup patches
- teach asynchronous commands to HMP

v2:
- documentation fixes and improvements
- fix calling async commands sync without id
- fix bad hmp monitor assert
- add a few extra asserts
- add async with no-id failure and screendump test

Marc-André Lureau (20):
  qmp: constify QmpCommand and list
  json-lexer: make it safe to call destroy multiple times
  qmp: add QmpSession
  QmpSession: add a return callback
  QmpSession: add json parser and use it in qga
  monitor: use qmp session to parse json feed
  qga: simplify dispatch_return_cb
  QmpSession: introduce QmpReturn
  qmp: simplify qmp_return_error()
  QmpSession: keep a queue of pending commands
  QmpSession: return orderly
  qmp: introduce asynchronous command type
  scripts: learn 'async' qapi commands
  qmp: add qmp_return_is_cancelled()
  monitor: add qmp_return_get_monitor()
  console: add graphic_hw_update_done()
  console: make screendump asynchronous
  monitor: start making qmp_human_monitor_command() asynchronous
  monitor: teach HMP about asynchronous commands
  hmp: call the asynchronous QMP screendump to fix outdated/glitches

 qapi/misc.json                          |   3 +-
 qapi/ui.json                            |   3 +-
 scripts/qapi/commands.py                | 151 ++++++++++++++---
 scripts/qapi/common.py                  |  15 +-
 scripts/qapi/doc.py                     |   3 +-
 scripts/qapi/introspect.py              |   3 +-
 hmp.h                                   |   3 +-
 include/monitor/monitor.h               |   3 +
 include/qapi/qmp/dispatch.h             |  89 +++++++++-
 include/qapi/qmp/json-parser.h          |   7 +-
 include/ui/console.h                    |   5 +
 hmp.c                                   |   6 +-
 hw/display/qxl-render.c                 |   9 +-
 hw/display/qxl.c                        |   1 +
 monitor.c                               | 198 ++++++++++++++--------
 qapi/qmp-dispatch.c                     | 214 +++++++++++++++++++-----
 qapi/qmp-registry.c                     |  33 +++-
 qga/commands.c                          |   2 +-
 qga/main.c                              |  51 ++----
 qobject/json-lexer.c                    |   5 +-
 qobject/json-streamer.c                 |   3 +-
 tests/test-qmp-cmds.c                   | 206 +++++++++++++++++++----
 ui/console.c                            | 100 +++++++++--
 hmp-commands.hx                         |   3 +-
 tests/qapi-schema/qapi-schema-test.json |   5 +
 tests/qapi-schema/qapi-schema-test.out  |   8 +
 tests/qapi-schema/test-qapi.py          |   8 +-
 27 files changed, 877 insertions(+), 260 deletions(-)

-- 
2.21.0.196.g041f5ea1cf

Re: [Qemu-devel] [PATCH v4 00/20] monitor: add asynchronous command type

Posted by no-reply@patchew.org 5 years ago

Patchew URL: https://patchew.org/QEMU/20190409161009.6322-1-marcandre.lureau@redhat.com/



Hi,

This series failed the asan build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
time make docker-test-debug@fedora TARGET_LIST=x86_64-softmmu J=14 NETWORK=1
=== TEST SCRIPT END ===




The full log is available at
http://patchew.org/logs/20190409161009.6322-1-marcandre.lureau@redhat.com/testing.asan/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

Re: [Qemu-devel] [PATCH v4 00/20] monitor: add asynchronous command type

Posted by Markus Armbruster 4 years, 11 months ago

Marc-André, before you invest your time to answer my questions below: my
bandwidth for non-trivial QAPI features like this one is painfully
limited.  To get your QAPI conditionals in, I had to postpone other QAPI
projects.  I don't regret doing that, I'm rather pleased with how QAPI
conditionals turned out.  But I can't keep postponing all other QAPI
projects.  Because of that, this one will be slow going, at best.  Sorry
about that.

Marc-André Lureau <marcandre.lureau@redhat.com> writes:

> Hi,
>
> HMP and QMP commands are handled synchronously in qemu today. But
> there are benefits allowing the command handler to re-enter the main
> loop if the command cannot be handled synchronously, or if it is
> long-lasting. Some bugs such as rhbz#1230527 are difficult to solve
> without it.
>
> The common solution is to use a pair of command+event in this case.

In particular, background jobs (qapi/jobs.json).  They grew out of block
jobs, and are still used only for "blocky" things.  Using them more
widely would probably make sense.

> But this approach has a number of issues:
> - you can't "fix" an existing command: you need a new API, and ad-hoc
>   documentation for that command+signal association, and old/broken
>   command deprecation

Making a synchronous command asynchronous is an incompatible change.  We
need to let the client opt in.  How is that done in this series?

> - since the reply event is broadcasted and 'id' is used for matching the
>   request, it may conflict with other clients request 'id' space

Any event that does that now is broken and needs to be fixed.  The
obvious fix is to include a monitor ID with the command ID.  For events
that can only ever be useful in the context of one particular monitor,
we could unicast to that monitor instead; see below.

Corollary: this is just a fixable bug, not a fundamental advantage of
the async feature.

> - it is arguably less efficient and elegant (weird API, useless return
>   in most cases, broadcast reply, no cancelling on disconnect etc)

The return value is useful for synchronously reporting failure to start
the background task.  I grant you that background tasks may exist that
won't ever fail to start.  I challenge the idea that it's most of them.

Broadcast reply could be avoided by unicasting events.  If I remember
correctly, Peter Xu even posted patches some time ago.  We ended up not
using them, because we found a better solution for the problem at hand.
My point is: this isn't a fundamental problem, it's just the way we
coded things up.

What do you mean by "no cancelling on disconnect"?

I'm ignoring "etc" unless you expand it into something specific.

I'm also not taking the "weird" bait :)

> The following series implements an async command solution instead. By
> introducing a session context and a command return handler, it can:
> - defer the return, allowing the mainloop to reenter
> - return only to the caller (instead of broadcast events for reply)
> - optionnally allow cancellation when the client is gone
> - track on-going qapi command(s) per client/session
>
> and without introduction of new QMP APIs or client visible change.

What do async commands provide that jobs lack?

Why do we want both?

I started to write a feature-by-feature comparison, but realized I don't
have the time to figure out either jobs or async from their (rather
sparse) documentation, let alone from code.

> Existing qemu commands can be gradually replaced by async:true
> variants when needed, while carefully reviewing the concurrency
> aspects. The async:true commands marshaller helpers are splitted in
> half, the calling and return functions. The command is called with a
> QmpReturn context, that can return immediately or later, using the
> generated return helper, which allows for a step-by-step conversion.
>
> The screendump command is converted to an async:true version to solve
> rhbz#1230527. The command shows basic cancellation (this could be
> extended if needed). It could be further improved to do asynchronous
> IO writes as well.

What is "basic cancellation"?

What extension(s) do you have in mind?

What's the impact of screendump writing synchronously?

Re: [Qemu-devel] [PATCH v4 00/20] monitor: add asynchronous command type

Posted by Marc-André Lureau 4 years, 11 months ago

Hi

On Tue, May 21, 2019 at 4:18 PM Markus Armbruster <armbru@redhat.com> wrote:
>
> Marc-André, before you invest your time to answer my questions below: my
> bandwidth for non-trivial QAPI features like this one is painfully
> limited.  To get your QAPI conditionals in, I had to postpone other QAPI
> projects.  I don't regret doing that, I'm rather pleased with how QAPI
> conditionals turned out.  But I can't keep postponing all other QAPI
> projects.  Because of that, this one will be slow going, at best.  Sorry
> about that.

We have different priorities, fair enough.

>
> Marc-André Lureau <marcandre.lureau@redhat.com> writes:
>
> > Hi,
> >
> > HMP and QMP commands are handled synchronously in qemu today. But
> > there are benefits allowing the command handler to re-enter the main
> > loop if the command cannot be handled synchronously, or if it is
> > long-lasting. Some bugs such as rhbz#1230527 are difficult to solve
> > without it.
> >
> > The common solution is to use a pair of command+event in this case.
>
> In particular, background jobs (qapi/jobs.json).  They grew out of block
> jobs, and are still used only for "blocky" things.  Using them more
> widely would probably make sense.
>
> > But this approach has a number of issues:
> > - you can't "fix" an existing command: you need a new API, and ad-hoc
> >   documentation for that command+signal association, and old/broken
> >   command deprecation
>
> Making a synchronous command asynchronous is an incompatible change.  We
> need to let the client needs opt in.  How is that done in this series?

No change visible on client side. I dropped the async command support
a while ago already, based on your recommendations. I can dig through
the archive for the discussion if necessary.

>
> > - since the reply event is broadcasted and 'id' is used for matching the
> >   request, it may conflict with other clients request 'id' space
>
> Any event that does that now is broken and needs to be fixed.  The
> obvious fix is to include a monitor ID with the command ID.  For events
> that can only ever be useful in the context of one particular monitor,
> we could unicast to that monitor instead; see below.
>
> Corollary: this is just a fixable bug, not a fundamental advantage of
> the async feature.

I am just pointing out today's drawbacks of turning a function async by
introducing new commands and signals.

>
> > - it is arguably less efficient and elegant (weird API, useless return
> >   in most cases, broadcast reply, no cancelling on disconnect etc)
>
> The return value is useful for synchronously reporting failure to start
> the background task.  I grant you that background tasks may exist that
> won't ever fail to start.  I challenge the idea that it's most of them.
>
> Broadcast reply could be avoided by unicasting events.  If I remember
> correctly, Peter Xu even posted patches some time ago.  We ended up not
> using them, because we found a better solution for the problem at hand.
> My point is: this isn't a fundamental problem, it's just the way we
> coded things up.
>
> What do you mean by "no cancelling on disconnect"?

When the client disconnects, the background task keeps running, and
there is no simple way to know about that event afaik. My proposal has
a simple API for that (see "qmp: add qmp_return_is_cancelled()"
patch).

>
> I'm ignoring "etc" unless you expand it into something specific.
>
> I'm also not taking the "weird" bait :)
> > The following series implements an async command solution instead. By
> > introducing a session context and a command return handler, it can:
> > - defer the return, allowing the mainloop to reenter
> > - return only to the caller (instead of broadcast events for reply)
> > - optionnally allow cancellation when the client is gone
> > - track on-going qapi command(s) per client/session
> >
> > and without introduction of new QMP APIs or client visible change.
>
> What do async commands provide that jobs lack?
>
> Why do we want both?

They are different things, last we discussed it: jobs are geared
toward block device operations, and do not provide simple qmp-level
facilities that I listed above. What I introduce is a way for an
*existing* QMP command to be split, so it can re-enter the main
loop sanely (and not by introducing new commands or signals or making
things unnecessarily more complicated).

My proposal is fairly small:
  27 files changed, 877 insertions(+), 260 deletions(-)

Including tests, and the qxl screendump fix, which account for about
1/3 of the series.

> I started to write a feature-by-feature comparison, but realized I don't
> have the time to figure out either jobs or async from their (rather
> sparse) documentation, let alone from code.
>
> > Existing qemu commands can be gradually replaced by async:true
> > variants when needed, while carefully reviewing the concurrency
> > aspects. The async:true commands marshaller helpers are splitted in
> > half, the calling and return functions. The command is called with a
> > QmpReturn context, that can return immediately or later, using the
> > generated return helper, which allows for a step-by-step conversion.
> >
> > The screendump command is converted to an async:true version to solve
> > rhbz#1230527. The command shows basic cancellation (this could be
> > extended if needed). It could be further improved to do asynchronous
> > IO writes as well.
>
> What is "basic cancellation"?
> What extension(s) do you have in mind?

It checks for cancellation in a few places, between IO. Full
cancellation would allow cancelling at any time.
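
A minimal sketch of what checking "between IO" could look like,
assuming the work is done in chunks. ToyDump and toy_dump_step are
hypothetical names for this example; the series exposes its check as
qmp_return_is_cancelled(), whose real signature is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical dump state, invented for illustration only. */
typedef struct ToyDump {
    bool cancelled;   /* set when the client goes away */
    size_t total;     /* bytes to write in all */
    size_t written;   /* bytes written so far */
} ToyDump;

/* Write one chunk; returns true while more work remains. */
bool toy_dump_step(ToyDump *d, size_t chunk)
{
    assert(d->written <= d->total);
    if (d->cancelled) {
        return false;             /* give up at this checkpoint */
    }
    size_t left = d->total - d->written;
    /* Stands in for one blocking write of at most 'chunk' bytes. */
    d->written += left < chunk ? left : chunk;
    return d->written < d->total;
}
```

A chunk that has already started always runs to completion here; only
the gaps between chunks are cancellation points, which is the "basic"
part.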

>
> What's the impact of screendump writing synchronously?

It can be pretty bad, think about 4k screens. It is 33177600 bytes,
written in PPM format, blocking the main loop.
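
For reference, the arithmetic behind that figure, assuming a 3840x2160
surface at 4 bytes per pixel (an assumption; the mail does not state
the pixel format). A binary PPM payload at 3 bytes per pixel would be
somewhat smaller. image_bytes is a helper name made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed for width x height pixels at bytes_per_pixel each. */
size_t image_bytes(size_t width, size_t height, size_t bytes_per_pixel)
{
    assert(bytes_per_pixel > 0);
    return width * height * bytes_per_pixel;
}

/*
 * image_bytes(3840, 2160, 4) is 33177600 (the figure quoted above).
 * image_bytes(3840, 2160, 3) is 24883200 (RGB payload, header excluded).
 */
```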

QMP operations doing large IO (dumps), or blocking on events, could be
switched to this async form without introducing user-visible change,
and with minimal effort compared to jobs.


-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH v4 00/20] monitor: add asynchronous command type

Posted by Markus Armbruster 4 years, 11 months ago

Marc-André Lureau <marcandre.lureau@gmail.com> writes:

> Hi
>
> On Tue, May 21, 2019 at 4:18 PM Markus Armbruster <armbru@redhat.com> wrote:
>>
>> Marc-André, before you invest your time to answer my questions below: my
>> bandwidth for non-trivial QAPI features like this one is painfully
>> limited.  To get your QAPI conditionals in, I had to postpone other QAPI
>> projects.  I don't regret doing that, I'm rather pleased with how QAPI
>> conditionals turned out.  But I can't keep postponing all other QAPI
>> projects.  Because of that, this one will be slow going, at best.  Sorry
>> about that.
>
> We have different priorities, fair enough.

I wish I could give you better service.  But no use pretending.

>> Marc-André Lureau <marcandre.lureau@redhat.com> writes:
>>
>> > Hi,
>> >
>> > HMP and QMP commands are handled synchronously in qemu today. But
>> > there are benefits allowing the command handler to re-enter the main
>> > loop if the command cannot be handled synchronously, or if it is
>> > long-lasting. Some bugs such as rhbz#1230527 are difficult to solve
>> > without it.
>> >
>> > The common solution is to use a pair of command+event in this case.
>>
>> In particular, background jobs (qapi/jobs.json).  They grew out of block
>> jobs, and are still used only for "blocky" things.  Using them more
>> widely would probably make sense.
>>
>> > But this approach has a number of issues:
>> > - you can't "fix" an existing command: you need a new API, and ad-hoc
>> >   documentation for that command+signal association, and old/broken
>> >   command deprecation
>>
>> Making a synchronous command asynchronous is an incompatible change.  We
>> need to let the client needs opt in.  How is that done in this series?
>
> No change visible on client side. I dropped the async command support
> a while ago already, based on your recommendations. I can dig the
> archive for the discussion if necessary.

Not right now.

>> > - since the reply event is broadcasted and 'id' is used for matching the
>> >   request, it may conflict with other clients request 'id' space
>>
>> Any event that does that now is broken and needs to be fixed.  The
>> obvious fix is to include a monitor ID with the command ID.  For events
>> that can only ever be useful in the context of one particular monitor,
>> we could unicast to that monitor instead; see below.
>>
>> Corollary: this is just a fixable bug, not a fundamental advantage of
>> the async feature.
>
> I am just pointing out today drawbacks of turning a function async by
> introducing new commands and signals.

And I'm just pointing out that some of today's drawbacks could also be
addressed differently :)

>> > - it is arguably less efficient and elegant (weird API, useless return
>> >   in most cases, broadcast reply, no cancelling on disconnect etc)
>>
>> The return value is useful for synchronously reporting failure to start
>> the background task.  I grant you that background tasks may exist that
>> won't ever fail to start.  I challenge the idea that it's most of them.
>>
>> Broadcast reply could be avoided by unicasting events.  If I remember
>> correctly, Peter Xu even posted patches some time ago.  We ended up not
>> using them, because we found a better solution for the problem at hand.
>> My point is: this isn't a fundamental problem, it's just the way we
>> coded things up.
>>
>> What do you mean by "no cancelling on disconnect"?
>
> When the client disconnects, the background task keeps running, and
> there is no simple way to know about that event afaik. My proposal has
> a simple API for that (see "qmp: add qmp_return_is_cancelled()"
> patch).

Auto-cancellation on client disconnect may be exactly what's wanted for
simple use cases.

Jobs are designed with more use cases in mind.  Consider a backup job
that takes some time.  We certainly don't want to cancel it just
because the management application hiccups and disconnects.  Instead, we
want to permit the management application to recover, reconnect, find
the backup job, examine its state, and resume managing it.  To support
this, jobs have a unique ID.  Job cancellation is explicit.

Jobs could acquire an "auto-cancel on disconnect" feature if there's a
need.

I'm not sure how asynchronous commands could support reconnect and
resume.

>> I'm ignoring "etc" unless you expand it into something specific.
>>
>> I'm also not taking the "weird" bait :)
>> > The following series implements an async command solution instead. By
>> > introducing a session context and a command return handler, it can:
>> > - defer the return, allowing the mainloop to reenter
>> > - return only to the caller (instead of broadcast events for reply)
>> > - optionnally allow cancellation when the client is gone
>> > - track on-going qapi command(s) per client/session
>> >
>> > and without introduction of new QMP APIs or client visible change.
>>
>> What do async commands provide that jobs lack?
>>
>> Why do we want both?
>
> They are different things, last we discussed it: jobs are geared
> toward block device operations,

Historical accident.  We've discussed using them for non-blocky stuff,
such as migration.  Of course, discussions are cheap, code is what
counts.

>                                 and do not provide simple qmp-level
> facilities that I listed above. What I introduce is a way for an
> *existing* QMP command to be splitted, so it can re-enter the main
> loop sanely (and not by introducing new commands or signals or making
> things unnecessarily more complicated).
>
> My proposal is fairly small:
>   27 files changed, 877 insertions(+), 260 deletions(-)
>
> Including test, and the qxl screendump fix, which account for about
> 1/3 of the series.
>
>> I started to write a feature-by-feature comparison, but realized I don't
>> have the time to figure out either jobs or async from their (rather
>> sparse) documentation, let alone from code.
>>
>> > Existing qemu commands can be gradually replaced by async:true
>> > variants when needed, while carefully reviewing the concurrency
>> > aspects. The async:true commands marshaller helpers are splitted in
>> > half, the calling and return functions. The command is called with a
>> > QmpReturn context, that can return immediately or later, using the
>> > generated return helper, which allows for a step-by-step conversion.
>> >
>> > The screendump command is converted to an async:true version to solve
>> > rhbz#1230527. The command shows basic cancellation (this could be
>> > extended if needed). It could be further improved to do asynchronous
>> > IO writes as well.
>>
>> What is "basic cancellation"?
>> What extension(s) do you have in mind?
>
> It checks for cancellation in a few places, between IO. Full
> cancellation would allow to cancel at any time.
>
>>
>> What's the impact of screendump writing synchronously?
>
> It can be pretty bad, think about 4k screens. It is 33177600 bytes,
> written in PPM format, blocking the main loop..

My question was specifically about "could be further improved to do
asynchronous IO writes as well".  What's the impact of not having this
improvement?  I *guess* it means that even with the asynchronous
command, the synchronous writes still block "something", but I'm not
sure what "something" may be, and how it could impact behavior.  Hence
my question.

> QMP operation doing large IO (dumps), or blocking on events, could be
> switched to this async form without introducing user-visible change,

Letting the next QMP command start before the current one is done is a
user-visible change.  We can discuss whether the change is harmless.

> and with minimal effort compared to jobs.

To gauge the difference in effort, we'd need actual code to compare.

Re: [Qemu-devel] [PATCH v4 00/20] monitor: add asynchronous command type

Posted by Marc-André Lureau 4 years, 11 months ago

Hi

On Thu, May 23, 2019 at 9:52 AM Markus Armbruster <armbru@redhat.com> wrote:
> I'm not sure how asynchronous commands could support reconnect and
> resume.

The same way as current commands, including job commands.

>
> >> I'm ignoring "etc" unless you expand it into something specific.
> >>
> >> I'm also not taking the "weird" bait :)
> >> > The following series implements an async command solution instead. By
> >> > introducing a session context and a command return handler, it can:
> >> > - defer the return, allowing the mainloop to reenter
> >> > - return only to the caller (instead of broadcast events for reply)
> >> > - optionnally allow cancellation when the client is gone
> >> > - track on-going qapi command(s) per client/session
> >> >
> >> > and without introduction of new QMP APIs or client visible change.
> >>
> >> What do async commands provide that jobs lack?
> >>
> >> Why do we want both?
> >
> > They are different things, last we discussed it: jobs are geared
> > toward block device operations,
>
> Historical accident.  We've discussed using them for non-blocky stuff,
> such as migration.  Of course, discussions are cheap, code is what
> counts.

Using the job API means providing new (& more complex) APIs to clients.

The screendump fix here doesn't need new API, it needs new internal
dispatch of QMP commands: the purpose of this series.

Whenever we can solve things on qemu side, I would rather not
deprecate current API.

> >                                 and do not provide simple qmp-level
> > facilities that I listed above. What I introduce is a way for an
> > *existing* QMP command to be splitted, so it can re-enter the main
> > loop sanely (and not by introducing new commands or signals or making
> > things unnecessarily more complicated).
> >
> > My proposal is fairly small:
> >   27 files changed, 877 insertions(+), 260 deletions(-)
> >
> > Including test, and the qxl screendump fix, which account for about
> > 1/3 of the series.
> >
> >> I started to write a feature-by-feature comparison, but realized I don't
> >> have the time to figure out either jobs or async from their (rather
> >> sparse) documentation, let alone from code.
> >>
> >> > Existing qemu commands can be gradually replaced by async:true
> >> > variants when needed, while carefully reviewing the concurrency
> >> > aspects. The async:true commands marshaller helpers are splitted in
> >> > half, the calling and return functions. The command is called with a
> >> > QmpReturn context, that can return immediately or later, using the
> >> > generated return helper, which allows for a step-by-step conversion.
> >> >
> >> > The screendump command is converted to an async:true version to solve
> >> > rhbz#1230527. The command shows basic cancellation (this could be
> >> > extended if needed). It could be further improved to do asynchronous
> >> > IO writes as well.
> >>
> >> What is "basic cancellation"?
> >> What extension(s) do you have in mind?
> >
> > It checks for cancellation in a few places, between IO. Full
> > cancellation would allow to cancel at any time.
> >
> >>
> >> What's the impact of screendump writing synchronously?
> >
> > It can be pretty bad, think about 4k screens. It is 33177600 bytes,
> > written in PPM format, blocking the main loop..
>
> My question was specifically about "could be further improved to do
> asynchronous IO writes as well".  What's the impact of not having this
> improvement?  I *guess* it means that even with the asynchronous
> command, the synchronous writes still block "something", but I'm not
> sure what "something" may be, and how it could impact behavior.  Hence
> my question.

It blocks many things since the BQL is taken.

The goal is not to improve responsiveness at this point, but to fix
the QXL screendump bug, by introducing a split dispatch in QMP
commands: callback for starting, and a separate return function. This
is not rocket science. See below.

>
> > QMP operation doing large IO (dumps), or blocking on events, could be
> > switched to this async form without introducing user-visible change,
>
> Letting the next QMP command start before the current one is done is a
> user-visible change.  We can discuss whether the change is harmless.

Agree, from cover letter:
Existing qemu commands can be gradually replaced by async:true
variants when needed, while carefully reviewing the concurrency
aspects.

>
> > and with minimal effort compared to jobs.
>
> To gauge the difference in effort, we'd need actual code to compare.

It's a no-go for me: you don't want to teach all users out there a
new job API for existing commands when you can improve or fix things
in QEMU.

The QEMU change for the command can't really be simpler than what I
propose. You go from:

qmp_foo() {
  // do foo synchronously and
  return something
}

to:

qmp_foo_async(QmpReturn *r) {
  // do foo asynchronously (or return synchronously)
}

foo_done() {
  // r is the QmpReturn that was passed to qmp_foo_async()
  qmp_foo_async_return(r, something)
}

See "scripts: learn 'async' qapi commands" for the details.
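
The split above can be sketched as compilable C under some loud
assumptions: ToyReturn, toy_foo_async, toy_foo_done and toy_foo_return
are hypothetical stand-ins for QmpReturn, qmp_foo_async, foo_done and
the generated qmp_foo_async_return. The real types and generated
helpers are defined by the series, not by this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reply context; the real QmpReturn comes from the series. */
typedef struct ToyReturn {
    bool done;    /* has the reply been delivered? */
    int value;    /* the "something" returned to the client */
} ToyReturn;

/* Stand-in for the generated return helper. */
void toy_foo_return(ToyReturn *r, int value)
{
    assert(!r->done);   /* a command replies exactly once */
    r->value = value;
    r->done = true;
}

/* Reply context saved until the deferred work completes. */
static ToyReturn *toy_foo_pending;

/* Calling half: reply right away when possible, otherwise defer. */
void toy_foo_async(ToyReturn *r, bool can_finish_now)
{
    if (can_finish_now) {
        toy_foo_return(r, 1);
        return;
    }
    toy_foo_pending = r;  /* handler returns; the main loop keeps running */
}

/* Completion half: called later, e.g. from a main-loop callback. */
void toy_foo_done(int value)
{
    ToyReturn *r = toy_foo_pending;
    assert(r != NULL);
    toy_foo_pending = NULL;
    toy_foo_return(r, value);
}
```

The same handler covers both cases, which is what makes a step-by-step
conversion possible: a converted command may keep replying
synchronously until its slow path is actually made asynchronous.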

thanks

-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH v4 00/20] monitor: add asynchronous command type

Posted by Markus Armbruster 4 years, 11 months ago

Marc-André Lureau <marcandre.lureau@gmail.com> writes:

> Hi
>
> On Thu, May 23, 2019 at 9:52 AM Markus Armbruster <armbru@redhat.com> wrote:
>> I'm not sure how asynchronous commands could support reconnect and
>> resume.
>
> The same way as current commands, including job commands.

Consider the following scenario: a management application such as
libvirt starts a long-running task with the intent to monitor it until
it finishes.  Half-way through, the management application needs to
disconnect and reconnect for some reason (systemctl restart, or crash &
recover, or whatever).

If the long-running task is a job, the management application can resume
after reconnect: the job's ID is as valid as it was before, and the
commands to query and control the job work as before.

What if it's an asynchronous command?

>> >> I'm ignoring "etc" unless you expand it into something specific.
>> >>
>> >> I'm also not taking the "weird" bait :)
>> >> > The following series implements an async command solution instead. By
>> >> > introducing a session context and a command return handler, it can:
>> >> > - defer the return, allowing the mainloop to reenter
>> >> > - return only to the caller (instead of broadcast events for reply)
>> >> > - optionnally allow cancellation when the client is gone
>> >> > - track on-going qapi command(s) per client/session
>> >> >
>> >> > and without introduction of new QMP APIs or client visible change.
>> >>
>> >> What do async commands provide that jobs lack?
>> >>
>> >> Why do we want both?
>> >
>> > They are different things, last we discussed it: jobs are geared
>> > toward block device operations,
>>
>> Historical accident.  We've discussed using them for non-blocky stuff,
>> such as migration.  Of course, discussions are cheap, code is what
>> counts.
>
> Using job API means providing new (& more complex) APIs to client.
>
> The screendump fix here doesn't need new API, it needs new internal
> dispatch of QMP commands: the purpose of this series.
>
> Whenever we can solve things on qemu side, I would rather not
> deprecate current API.

Making a synchronous command asynchronous definitely changes API.

You could still argue the change is easier to handle for QMP clients
than a replacement by a job.

[...]

Re: [Qemu-devel] [PATCH v4 00/20] monitor: add asynchronous command type

Posted by Gerd Hoffmann 4 years, 11 months ago

On Mon, May 27, 2019 at 10:18:42AM +0200, Markus Armbruster wrote:
> Marc-André Lureau <marcandre.lureau@gmail.com> writes:
> 
> > Hi
> >
> > On Thu, May 23, 2019 at 9:52 AM Markus Armbruster <armbru@redhat.com> wrote:
> >> I'm not sure how asynchronous commands could support reconnect and
> >> resume.
> >
> > The same way as current commands, including job commands.
> 
> Consider the following scenario: a management application such as
> libvirt starts a long-running task with the intent to monitor it until
> it finishes.  Half-way through, the management application needs to
> disconnect and reconnect for some reason (systemctl restart, or crash &
> recover, or whatever).
> 
> If the long-running task is a job, the management application can resume
> after reconnect: the job's ID is as valid as it was before, and the
> commands to query and control the job work as before.
> 
> What if it's an asynchronous command?

This is not meant for some long-running job which you have to manage.

Allowing commands to be asynchronous makes sense for things which (a)
typically don't take long, and (b) don't need any management.

So, if the connection goes down the job is simply canceled, and after
reconnecting the management can simply send the same command again.
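
As a rough illustration of that cancel-and-resend behaviour, here is a toy
model in Python. It is not QEMU code: ToySession, PendingReturn and
complete() are made-up names, and the only thing borrowed from the series
is the idea of a per-session list of pending returns that can be marked
cancelled when the client goes away.

```python
class PendingReturn:
    """Hypothetical stand-in for a deferred command return."""

    def __init__(self, name):
        self.name = name
        self.cancelled = False
        self.result = None


class ToySession:
    """Hypothetical per-client session tracking commands still in flight."""

    def __init__(self):
        self.pending = []

    def call(self, name):
        ret = PendingReturn(name)
        self.pending.append(ret)
        return ret

    def disconnect(self):
        # The client is gone: nobody is left to receive these replies.
        for ret in self.pending:
            ret.cancelled = True
        self.pending.clear()


def complete(ret, value):
    """Deliver the result unless the return was cancelled."""
    if ret.cancelled:
        return False
    ret.result = value
    return True


# First attempt: the connection drops before the command finishes.
old_session = ToySession()
first = old_session.call("screendump")
old_session.disconnect()
assert complete(first, "ok") is False

# After reconnecting, the client simply sends the same command again.
new_session = ToySession()
second = new_session.call("screendump")
assert complete(second, "ok") is True
```

Nothing has to survive the disconnect here, which is the difference from a
job with a persistent ID.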

> > Whenever we can solve things on qemu side, I would rather not
> > deprecate current API.
> 
> Making a synchronous command asynchronous definitely changes API.

Inside qemu yes, sure.  But for the QMP client nothing changes.

cheers,
  Gerd

Re: [Qemu-devel] [PATCH v4 00/20] monitor: add asynchronous command type

Posted by Markus Armbruster 4 years, 11 months ago

Gerd Hoffmann <kraxel@redhat.com> writes:

> On Mon, May 27, 2019 at 10:18:42AM +0200, Markus Armbruster wrote:
>> Marc-André Lureau <marcandre.lureau@gmail.com> writes:
>> 
>> > Hi
>> >
>> > On Thu, May 23, 2019 at 9:52 AM Markus Armbruster <armbru@redhat.com> wrote:
>> >> I'm not sure how asynchronous commands could support reconnect and
>> >> resume.
>> >
>> > The same way as current commands, including job commands.
>> 
>> Consider the following scenario: a management application such as
>> libvirt starts a long-running task with the intent to monitor it until
>> it finishes.  Half-way through, the management application needs to
>> disconnect and reconnect for some reason (systemctl restart, or crash &
>> recover, or whatever).
>> 
>> If the long-running task is a job, the management application can resume
>> after reconnect: the job's ID is as valid as it was before, and the
>> commands to query and control the job work as before.
>> 
>> What if it's an asynchronous command?
>
> This is not meant for some long-running job which you have to manage.
>
> Allowing commands to be asynchronous makes sense for things which (a)
> typically don't take long, and (b) don't need any management.
>
> So, if the connection goes down the job is simply canceled, and after
> reconnecting the management can simply send the same command again.

Is this worth its own infrastructure?

Would you hazard a guess on how many commands can take long enough to
demand a conversion to asynchronous, yet not need any management?

>> > Whenever we can solve things on qemu side, I would rather not
>> > deprecate current API.
>> 
>> Making a synchronous command asynchronous definitely changes API.
>
> Inside qemu yes, sure.  But for the QMP client nothing changes.

Command replies can arrive out of order, can't they?

Re: [Qemu-devel] [PATCH v4 00/20] monitor: add asynchronous command type

Posted by Marc-André Lureau 4 years, 11 months ago

Hi

On Mon, May 27, 2019 at 3:23 PM Markus Armbruster <armbru@redhat.com> wrote:
>
> Gerd Hoffmann <kraxel@redhat.com> writes:
>
> > On Mon, May 27, 2019 at 10:18:42AM +0200, Markus Armbruster wrote:
> >> Marc-André Lureau <marcandre.lureau@gmail.com> writes:
> >>
> >> > Hi
> >> >
> >> > On Thu, May 23, 2019 at 9:52 AM Markus Armbruster <armbru@redhat.com> wrote:
> >> >> I'm not sure how asynchronous commands could support reconnect and
> >> >> resume.
> >> >
> >> > The same way as current commands, including job commands.
> >>
> >> Consider the following scenario: a management application such as
> >> libvirt starts a long-running task with the intent to monitor it until
> >> it finishes.  Half-way through, the management application needs to
> >> disconnect and reconnect for some reason (systemctl restart, or crash &
> >> recover, or whatever).
> >>
> >> If the long-running task is a job, the management application can resume
> >> after reconnect: the job's ID is as valid as it was before, and the
> >> commands to query and control the job work as before.
> >>
> >> What if it's an asynchronous command?
> >
> > This is not meant for some long-running job which you have to manage.
> >
> > Allowing commands to be asynchronous makes sense for things which (a)
> > typically don't take long, and (b) don't need any management.
> >
> > So, if the connection goes down the job is simply canceled, and after
> > reconnecting the management can simply send the same command again.
>
> Is this worth its own infrastructure?

Yes, not having to change/break the client side API is worth some effort.

> Would you hazard a guess on how many commands can take long enough to
> demand a conversion to asynchronous, yet not need any management?

Some of the currently synchronous commands that are doing some
substantial task (many of them are not simply reading values from
memory) could be gradually converted, as needed.

> >> > Whenever we can solve things on qemu side, I would rather not
> >> > deprecate current API.
> >>
> >> Making a synchronous command asynchronous definitely changes API.
> >
> > Inside qemu yes, sure.  But for the QMP client nothing changes.
>
> Command replies can arrive out of order, can't they?

They are returned in order, see "QmpSession: return orderly".
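
To make the ordering point concrete, here is a small Python sketch of one
way a session can keep replies in request order even when handlers finish
out of order. It is only an illustration under my own assumptions, not the
code from the "QmpSession: return orderly" patch; the Session class and
its begin()/finish() methods are hypothetical names.

```python
from collections import deque


class Session:
    """Toy per-client session that replies strictly in request order."""

    def __init__(self):
        self.pending = deque()  # commands in arrival order
        self.sent = []          # replies actually delivered to the client

    def begin(self, cmd_id):
        entry = {"id": cmd_id, "reply": None, "done": False}
        self.pending.append(entry)
        return entry

    def finish(self, entry, reply):
        entry["reply"] = reply
        entry["done"] = True
        # Only flush from the head of the queue: a finished command waits
        # until every earlier command has been answered.
        while self.pending and self.pending[0]["done"]:
            head = self.pending.popleft()
            self.sent.append((head["id"], head["reply"]))


session = Session()
slow = session.begin(1)
fast = session.begin(2)

session.finish(fast, "fast result")  # finishes first, but is held back
assert session.sent == []

session.finish(slow, "slow result")  # head completes, both are flushed
assert session.sent == [(1, "slow result"), (2, "fast result")]
```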


-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH v4 00/20] monitor: add asynchronous command type

Posted by Gerd Hoffmann 4 years, 11 months ago

  Hi,

> > This is not meant for some long-running job which you have to manage.
> >
> > Allowing commands to be asynchronous makes sense for things which (a)
> > typically don't take long, and (b) don't need any management.
> >
> > So, if the connection goes down the job is simply canceled, and after
> > reconnecting the management can simply send the same command again.
> 
> Is this worth its own infrastructure?
> 
> Would you hazard a guess on how many commands can take long enough to
> demand a conversion to asynchronous, yet not need any management?

Required:
  screendump with qxl (needs round-trip to spice-server display thread
  for fully up-to-date screen content, due to lazy rendering).

Nice to have:
  Move anything which needs more than a millisecond to a thread or
  coroutine, so we avoid monitor commands causing guest-visible latency
  spikes due to holding the big qemu lock for too long.

  From a quick scan through the monitor help, hot candidates are
  screendump and pmemsave, because they might write rather large data
  files.

  Dunno about savevm/loadvm.  I think they stop the guest anyway.  So
  moving them to async probably doesn't buy us much, at least from a
  latency point of view.
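
  A rough Python sketch of the pattern, purely as illustration: hold the
  lock only long enough to grab the data, then do the slow write in a
  worker thread and report completion through a callback. big_lock,
  guest_memory and save_memory_async() are made-up stand-ins, and whether
  a real command could snapshot its data this cheaply is an assumption.

```python
import threading

big_lock = threading.Lock()              # stand-in for the big qemu lock
guest_memory = bytearray(b"\x2a" * 4096)  # stand-in for data to be saved


def save_memory_async(sink, on_done):
    """Copy under the lock, then do the slow output in a worker thread."""
    with big_lock:
        snapshot = bytes(guest_memory)   # short critical section

    def worker():
        sink.extend(snapshot)            # slow I/O stand-in, lock not held
        on_done(len(snapshot))           # deferred "command return"

    thread = threading.Thread(target=worker)
    thread.start()
    return thread


results = []
sink = bytearray()
thread = save_memory_async(sink, results.append)
thread.join()
assert results == [4096]
assert bytes(sink) == bytes(guest_memory)
```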

cheers,
  Gerd

Re: [Qemu-devel] [PATCH v4 00/20] monitor: add asynchronous command type

Posted by Kevin Wolf 4 years, 11 months ago

Am 27.05.2019 um 15:23 hat Markus Armbruster geschrieben:
> Gerd Hoffmann <kraxel@redhat.com> writes:
> 
> > On Mon, May 27, 2019 at 10:18:42AM +0200, Markus Armbruster wrote:
> >> Marc-André Lureau <marcandre.lureau@gmail.com> writes:
> >> 
> >> > Hi
> >> >
> >> > On Thu, May 23, 2019 at 9:52 AM Markus Armbruster <armbru@redhat.com> wrote:
> >> >> I'm not sure how asynchronous commands could support reconnect and
> >> >> resume.
> >> >
> >> > The same way as current commands, including job commands.
> >> 
> >> Consider the following scenario: a management application such as
> >> libvirt starts a long-running task with the intent to monitor it until
> >> it finishes.  Half-way through, the management application needs to
> >> disconnect and reconnect for some reason (systemctl restart, or crash &
> >> recover, or whatever).
> >> 
> >> If the long-running task is a job, the management application can resume
> >> after reconnect: the job's ID is as valid as it was before, and the
> >> commands to query and control the job work as before.
> >> 
> >> What if it's an asynchronous command?
> >
> > This is not meant for some long-running job which you have to manage.
> >
> > Allowing commands to be asynchronous makes sense for things which (a)
> > typically don't take long, and (b) don't need any management.
> >
> > So, if the connection goes down the job is simply canceled, and after
> > reconnecting the management can simply send the same command again.
> 
> Is this worth its own infrastructure?
> 
> Would you hazard a guess on how many commands can take long enough to
> demand a conversion to asynchronous, yet not need any management?

Candidates are any commands that perform I/O. You don't want to hold the
BQL while doing I/O. Probably most block layer commands fall into this
category.

In fact, even the commands to start a block job could probably make use
of this infrastructure because they typically do some I/O before
returning success for starting the job.

> >> > Whenever we can solve things on qemu side, I would rather not
> >> > deprecate current API.
> >> 
> >> Making a synchronous command asynchronous definitely changes API.
> >
> > Inside qemu yes, sure.  But for the QMP client nothing changes.
> 
> Command replies can arrive out of order, can't they?

My understanding is that this is just an internal change and commands
still aren't processed in parallel.

Kevin

Re: [Qemu-devel] [PATCH v4 00/20] monitor: add asynchronous command type

Posted by John Snow 4 years, 11 months ago


On 5/21/19 10:17 AM, Markus Armbruster wrote:
> Marc-André, before you invest your time to answer my questions below: my
> bandwidth for non-trivial QAPI features like this one is painfully
> limited.  To get your QAPI conditionals in, I had to postpone other QAPI
> projects.  I don't regret doing that, I'm rather pleased with how QAPI
> conditionals turned out.  But I can't keep postponing all other QAPI
> projects.  Because of that, this one will be slow going, at best.  Sorry
> about that.
> 
> Marc-André Lureau <marcandre.lureau@redhat.com> writes:
> 
>> Hi,
>>
>> HMP and QMP commands are handled synchronously in qemu today. But
>> there are benefits allowing the command handler to re-enter the main
>> loop if the command cannot be handled synchronously, or if it is
>> long-lasting. Some bugs such as rhbz#1230527 are difficult to solve
>> without it.
>>
>> The common solution is to use a pair of command+event in this case.
> 
> In particular, background jobs (qapi/jobs.json).  They grew out of block
> jobs, and are still used only for "blocky" things.  Using them more
> widely would probably make sense.
> 
>> But this approach has a number of issues:
>> - you can't "fix" an existing command: you need a new API, and ad-hoc
>>   documentation for that command+signal association, and old/broken
>>   command deprecation
> 
> Making a synchronous command asynchronous is an incompatible change.  We
> need to let the client opt in.  How is that done in this series?
> 
>> - since the reply event is broadcast and 'id' is used for matching the
>>   request, it may conflict with other clients request 'id' space
> 
> Any event that does that now is broken and needs to be fixed.  The
> obvious fix is to include a monitor ID with the command ID.  For events
> that can only ever be useful in the context of one particular monitor,
> we could unicast to that monitor instead; see below.
> 
> Corollary: this is just a fixable bug, not a fundamental advantage of
> the async feature.
> 
>> - it is arguably less efficient and elegant (weird API, useless return
>>   in most cases, broadcast reply, no cancelling on disconnect etc)
> 
> The return value is useful for synchronously reporting failure to start
> the background task.  I grant you that background tasks may exist that
> won't ever fail to start.  I challenge the idea that it's most of them.
> 
> Broadcast reply could be avoided by unicasting events.  If I remember
> correctly, Peter Xu even posted patches some time ago.  We ended up not
> using them, because we found a better solution for the problem at hand.
> My point is: this isn't a fundamental problem, it's just the way we
> coded things up.
> 
> What do you mean by "no cancelling on disconnect"?
> 
> I'm ignoring "etc" unless you expand it into something specific.
> 
> I'm also not taking the "weird" bait :)
> 
>> The following series implements an async command solution instead. By
>> introducing a session context and a command return handler, it can:
>> - defer the return, allowing the mainloop to reenter
>> - return only to the caller (instead of broadcast events for reply)
>> - optionally allow cancellation when the client is gone
>> - track on-going qapi command(s) per client/session
>>
>> and without introduction of new QMP APIs or client visible change.
> 
> What do async commands provide that jobs lack?
> 
> Why do we want both?
> 
> I started to write a feature-by-feature comparison, but realized I don't
> have the time to figure out either jobs or async from their (rather
> sparse) documentation, let alone from code.
> 

Sorry about that. I still have a todo item from you in my inbox to
improve the documentation there. I've been focusing on bitmaps
documentation lately instead, but I hope to branch out as part of
caring a bit more about docs.

I'll keep an eye out here. I don't really want to prohibit things on the
sole basis that they aren't jobs, but I do want to make sure that adding
a new lifecycle paradigm for commands doesn't complicate the jobs code
in accidental ways.

I'll try to look this over too, but I have a bit of a backlog right now.

>> Existing qemu commands can be gradually replaced by async:true
>> variants when needed, while carefully reviewing the concurrency
>> aspects. The async:true commands marshaller helpers are split in
>> half, the calling and return functions. The command is called with a
>> QmpReturn context, that can return immediately or later, using the
>> generated return helper, which allows for a step-by-step conversion.
>>
>> The screendump command is converted to an async:true version to solve
>> rhbz#1230527. The command shows basic cancellation (this could be
>> extended if needed). It could be further improved to do asynchronous
>> IO writes as well.
> 
> What is "basic cancellation"?
> 
> What extension(s) do you have in mind?
> 
> What's the impact of screendump writing synchronously?
>

Re: [Qemu-devel] [PATCH v4 00/20] monitor: add asynchronous command type

Posted by Marc-André Lureau 4 years, 11 months ago

On Tue, Apr 9, 2019 at 6:12 PM Marc-André Lureau
<marcandre.lureau@redhat.com> wrote:
>
> Hi,
>
> HMP and QMP commands are handled synchronously in qemu today. But
> there are benefits allowing the command handler to re-enter the main
> loop if the command cannot be handled synchronously, or if it is
> long-lasting. Some bugs such as rhbz#1230527 are difficult to solve
> without it.
>
> The common solution is to use a pair of command+event in this case.
> But this approach has a number of issues:
> - you can't "fix" an existing command: you need a new API, and ad-hoc
>   documentation for that command+signal association, and old/broken
>   command deprecation
> - since the reply event is broadcast and 'id' is used for matching the
>   request, it may conflict with other clients request 'id' space
> - it is arguably less efficient and elegant (weird API, useless return
>   in most cases, broadcast reply, no cancelling on disconnect etc)
>
> The following series implements an async command solution instead. By
> introducing a session context and a command return handler, it can:
> - defer the return, allowing the mainloop to reenter
> - return only to the caller (instead of broadcast events for reply)
> - optionally allow cancellation when the client is gone
> - track on-going qapi command(s) per client/session
>
> and without introduction of new QMP APIs or client visible change.
>
> Existing qemu commands can be gradually replaced by async:true
> variants when needed, while carefully reviewing the concurrency
> aspects. The async:true commands marshaller helpers are split in
> half, the calling and return functions. The command is called with a
> QmpReturn context, that can return immediately or later, using the
> generated return helper, which allows for a step-by-step conversion.
>
> The screendump command is converted to an async:true version to solve
> rhbz#1230527. The command shows basic cancellation (this could be
> extended if needed). It could be further improved to do asynchronous
> IO writes as well.
>
> v4:
> - rebased, mostly adapting to new OOB code
>   (there was not much feedback in v3 for the async command part,
>    but preliminary patches got merged!)
> - drop the RFC status

ping

>
> v3:
> - complete rework, dropping the asynchronous commands visibility from
>   the protocol side entirely (until there is a real need for it)
> - rebased, with a few preliminary cleanup patches
> - teach asynchronous commands to HMP
>
> v2:
> - documentation fixes and improvements
> - fix calling async commands sync without id
> - fix bad hmp monitor assert
> - add a few extra asserts
> - add async with no-id failure and screendump test
>
> Marc-André Lureau (20):
>   qmp: constify QmpCommand and list
>   json-lexer: make it safe to call destroy multiple times
>   qmp: add QmpSession
>   QmpSession: add a return callback
>   QmpSession: add json parser and use it in qga
>   monitor: use qmp session to parse json feed
>   qga: simplify dispatch_return_cb
>   QmpSession: introduce QmpReturn
>   qmp: simplify qmp_return_error()
>   QmpSession: keep a queue of pending commands
>   QmpSession: return orderly
>   qmp: introduce asynchronous command type
>   scripts: learn 'async' qapi commands
>   qmp: add qmp_return_is_cancelled()
>   monitor: add qmp_return_get_monitor()
>   console: add graphic_hw_update_done()
>   console: make screendump asynchronous
>   monitor: start making qmp_human_monitor_command() asynchronous
>   monitor: teach HMP about asynchronous commands
>   hmp: call the asynchronous QMP screendump to fix outdated/glitches
>
>  qapi/misc.json                          |   3 +-
>  qapi/ui.json                            |   3 +-
>  scripts/qapi/commands.py                | 151 ++++++++++++++---
>  scripts/qapi/common.py                  |  15 +-
>  scripts/qapi/doc.py                     |   3 +-
>  scripts/qapi/introspect.py              |   3 +-
>  hmp.h                                   |   3 +-
>  include/monitor/monitor.h               |   3 +
>  include/qapi/qmp/dispatch.h             |  89 +++++++++-
>  include/qapi/qmp/json-parser.h          |   7 +-
>  include/ui/console.h                    |   5 +
>  hmp.c                                   |   6 +-
>  hw/display/qxl-render.c                 |   9 +-
>  hw/display/qxl.c                        |   1 +
>  monitor.c                               | 198 ++++++++++++++--------
>  qapi/qmp-dispatch.c                     | 214 +++++++++++++++++++-----
>  qapi/qmp-registry.c                     |  33 +++-
>  qga/commands.c                          |   2 +-
>  qga/main.c                              |  51 ++----
>  qobject/json-lexer.c                    |   5 +-
>  qobject/json-streamer.c                 |   3 +-
>  tests/test-qmp-cmds.c                   | 206 +++++++++++++++++++----
>  ui/console.c                            | 100 +++++++++--
>  hmp-commands.hx                         |   3 +-
>  tests/qapi-schema/qapi-schema-test.json |   5 +
>  tests/qapi-schema/qapi-schema-test.out  |   8 +
>  tests/qapi-schema/test-qapi.py          |   8 +-
>  27 files changed, 877 insertions(+), 260 deletions(-)
>
> --
> 2.21.0.196.g041f5ea1cf
>
>


-- 
Marc-André Lureau

Re: [Qemu-devel] [PATCH v4 00/20] monitor: add asynchronous command type

Posted by no-reply@patchew.org 5 years ago

Patchew URL: https://patchew.org/QEMU/20190409161009.6322-1-marcandre.lureau@redhat.com/



Hi,

This series failed the docker-mingw@fedora build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
time make docker-test-mingw@fedora SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===




The full log is available at
http://patchew.org/logs/20190409161009.6322-1-marcandre.lureau@redhat.com/testing.docker-mingw@fedora/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com