[v5] monitor: enable OOB by default

[Qemu-devel] [PATCH v5 0/7] monitor: enable OOB by default

Posted by Peter Xu 7 years, 7 months ago

v5:
- more r-bs are collected
- make sure no patch contains extra s-o-bs by doing proper indents for
  diff quotes [Markus]
- misc commit message or comment changes [Markus]

v4:
- collect some r-bs
- remove one extra s-o-b of mine in one patch [Thomas, Markus]
- split the qmp response flush patch into two; apply changes to commit
  message [Markus]
- fix up the doc update patch [Markus]

v3:
- drop patch "tests: iotests: don't compare SHUTDOWN event", replace
  it with "monitor: flush qmp responses when CLOSED" to fix up the
  race. [Eric, Markus]
- tweak the oob revert patch to not break qmp-test [Eric]
- quite a few comment and commit message fix-ups [Eric]
- add more comment in commit message, mention about why MUX can't be
  used for Out-Of-Band [Markus]
- one new patch to comment on chardev CLOSED event [Stefan]
- one new patch to fix iotest breakage on 060 specifically

Tests: make check, iotests

Please review.  Thanks,

Peter Xu (7):
  chardev: comment details for CLOSED event
  monitor: rename *_pop_one to *_pop_any
  monitor: flush qmp responses when CLOSED
  tests: iotests: drop some stderr line
  docs: mention shared state protect for OOB
  monitor: remove "x-oob", turn oob on by default
  Revert "tests: Add parameter to qtest_init_without_qmp_handshake"

 docs/devel/qapi-code-gen.txt | 17 +++++---
 include/chardev/char.h       | 11 +++++-
 include/monitor/monitor.h    |  1 -
 tests/libqtest.h             |  4 +-
 monitor.c                    | 75 +++++++++++++++++++++---------------
 tests/libqtest.c             | 10 ++---
 tests/qmp-test.c             |  6 +--
 vl.c                         |  5 ---
 tests/qemu-iotests/060       | 10 ++++-
 tests/qemu-iotests/060.out   |  1 -
 10 files changed, 83 insertions(+), 57 deletions(-)

-- 
2.17.1

Re: [Qemu-devel] [PATCH v5 0/7] monitor: enable OOB by default

Posted by Markus Armbruster 7 years, 7 months ago

I'm going to merge the fixes and cleanups [PATCH 1-5].  We need a few
more fixes before we can enable OOB [PATCH 6-7].  I'm working on some.

[Qemu-devel] (no subject)

Posted by Markus Armbruster 7 years, 7 months ago

I fooled around a bit, and I think there are a few lose ends.

Lets update the examples in docs/interop/qmp-spec.txt to show the
current greeting (section 3.1) and how to accept a capability (section
3.2).  The capability negotiation documentation could use some polish.
I'll post a patch.

Talking to a QMP monitor that supports OOB:

    $ socat UNIX:test-qmp READLINE,history=$HOME/.qmp_history,prompt='QMP> '
    {"QMP": {"version": {"qemu": {"micro": 50, "minor": 12, "major": 2}, "package": "v2.12.0-1703-gb909799463"}, "capabilities": ["oob"]}}
    QMP> { "execute": "qmp_capabilities", "arguments": { "oob": true } }
    {"error": {"class": "GenericError", "desc": "Parameter 'oob' is unexpected"}}
    QMP> { "execute": "qmp_capabilities", "arguments": { "enable": ["oob"] } }
    {"return": {}}
    QMP> { "execute": "query-qmp-schema" }
    {"error": {"class": "GenericError", "desc": "Out-Of-Band capability requires that every command contains an 'id' field"}}

Why does every command require 'id'?

Talking to a QMP monitor that doesn't support OOB:

    {"QMP": {"version": {"qemu": {"micro": 50, "minor": 12, "major": 2}, "package": "v2.12.0-1703-gb909799463"}, "capabilities": []}}
    QMP> { "execute": "qmp_capabilities", "arguments": { "enable": ["oob"] } }
    {"error": {"class": "GenericError", "desc": "This monitor does not support Out-Of-Band (OOB)"}}
    QMP> { "execute": "qmp_capabilities" }
    {"return": {}}
    QMP> { "execute": "query-kvm" }
    {"return": {"enabled": true, "present": true}}
    QMP> { "execute": "query-kvm", "control": { "run-oob": true } }
    {"error": {"class": "GenericError", "desc": "Please enable Out-Of-Band first for the session during capabilities negotiation"}}

Telling people to enable OOB when that cannot be done is suboptimal.
More so when it cannot be used here anyway.  I'll post a patch.

Re: [Qemu-devel] monitor: enable OOB by default

Posted by Markus Armbruster 7 years, 7 months ago

Markus Armbruster <armbru@redhat.com> writes:

> I fooled around a bit, and I think there are a few lose ends.
[...]
> Talking to a QMP monitor that supports OOB:
>
>     $ socat UNIX:test-qmp READLINE,history=$HOME/.qmp_history,prompt='QMP> '
>     {"QMP": {"version": {"qemu": {"micro": 50, "minor": 12, "major": 2}, "package": "v2.12.0-1703-gb909799463"}, "capabilities": ["oob"]}}
>     QMP> { "execute": "qmp_capabilities", "arguments": { "oob": true } }
>     {"error": {"class": "GenericError", "desc": "Parameter 'oob' is unexpected"}}
>     QMP> { "execute": "qmp_capabilities", "arguments": { "enable": ["oob"] } }
>     {"return": {}}
>     QMP> { "execute": "query-qmp-schema" }
>     {"error": {"class": "GenericError", "desc": "Out-Of-Band capability requires that every command contains an 'id' field"}}
>
> Why does every command require 'id'?

I found one reason: event COMMAND_DROPPED wants it.  Any other reason?

[...]

Re: [Qemu-devel] monitor: enable OOB by default

Posted by Markus Armbruster 7 years, 7 months ago

Markus Armbruster <armbru@redhat.com> writes:

> Markus Armbruster <armbru@redhat.com> writes:
>
>> I fooled around a bit, and I think there are a few lose ends.
> [...]
>> Talking to a QMP monitor that supports OOB:
>>
>>     $ socat UNIX:test-qmp READLINE,history=$HOME/.qmp_history,prompt='QMP> '
>>     {"QMP": {"version": {"qemu": {"micro": 50, "minor": 12, "major": 2}, "package": "v2.12.0-1703-gb909799463"}, "capabilities": ["oob"]}}
>>     QMP> { "execute": "qmp_capabilities", "arguments": { "oob": true } }
>>     {"error": {"class": "GenericError", "desc": "Parameter 'oob' is unexpected"}}
>>     QMP> { "execute": "qmp_capabilities", "arguments": { "enable": ["oob"] } }
>>     {"return": {}}
>>     QMP> { "execute": "query-qmp-schema" }
>>     {"error": {"class": "GenericError", "desc": "Out-Of-Band capability requires that every command contains an 'id' field"}}
>>
>> Why does every command require 'id'?
>
> I found one reason: event COMMAND_DROPPED wants it.  Any other reason?
>
> [...]

Apropos COMMAND_DROPPED: we send an event rather than an error response
because we may send it out-of-order.  Makes sense.

However, broadcasting it to all monitors doesn't make sense.  We could
use a way to send an event to just one monitor.

Another use for that might be QMP "deprecated" notifications, because
those also can't be error responses, and also only make sense for the
client that caused them.

Re: [Qemu-devel] monitor: enable OOB by default

Posted by Daniel P. Berrangé 7 years, 7 months ago

On Wed, Jun 27, 2018 at 10:41:38AM +0200, Markus Armbruster wrote:
> Markus Armbruster <armbru@redhat.com> writes:
> 
> > Markus Armbruster <armbru@redhat.com> writes:
> >
> >> I fooled around a bit, and I think there are a few lose ends.
> > [...]
> >> Talking to a QMP monitor that supports OOB:
> >>
> >>     $ socat UNIX:test-qmp READLINE,history=$HOME/.qmp_history,prompt='QMP> '
> >>     {"QMP": {"version": {"qemu": {"micro": 50, "minor": 12, "major": 2}, "package": "v2.12.0-1703-gb909799463"}, "capabilities": ["oob"]}}
> >>     QMP> { "execute": "qmp_capabilities", "arguments": { "oob": true } }
> >>     {"error": {"class": "GenericError", "desc": "Parameter 'oob' is unexpected"}}
> >>     QMP> { "execute": "qmp_capabilities", "arguments": { "enable": ["oob"] } }
> >>     {"return": {}}
> >>     QMP> { "execute": "query-qmp-schema" }
> >>     {"error": {"class": "GenericError", "desc": "Out-Of-Band capability requires that every command contains an 'id' field"}}
> >>
> >> Why does every command require 'id'?
> >
> > I found one reason: event COMMAND_DROPPED wants it.  Any other reason?
> >
> > [...]
> 
> Apropos COMMAND_DROPPED: we send an event rather than an error response
> because we may send it out-of-order.  Makes sense.
> 
> However, broadcasting it to all monitors doesn't make sense.  We could
> use a way to send an event to just one monitor.

Worse than that - broadcasting to all monitors is categorically broken.
Different monitors make use the same "id" formatting scheme, so if you
broadcast COMMAND_DROPPED to a different monitor you might have clashing
"id" and thus incorrectly tell a client its command was dropped when in
fact it was processed. You'd have to be fairly unlucky in timing, but
it could happen.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [Qemu-devel] monitor: enable OOB by default

Posted by Markus Armbruster 7 years, 7 months ago

Daniel P. Berrangé <berrange@redhat.com> writes:

> On Wed, Jun 27, 2018 at 10:41:38AM +0200, Markus Armbruster wrote:
>> Markus Armbruster <armbru@redhat.com> writes:
>> 
>> > Markus Armbruster <armbru@redhat.com> writes:
>> >
>> >> I fooled around a bit, and I think there are a few lose ends.
>> > [...]
>> >> Talking to a QMP monitor that supports OOB:
>> >>
>> >>     $ socat UNIX:test-qmp READLINE,history=$HOME/.qmp_history,prompt='QMP> '
>> >>     {"QMP": {"version": {"qemu": {"micro": 50, "minor": 12, "major": 2}, "package": "v2.12.0-1703-gb909799463"}, "capabilities": ["oob"]}}
>> >>     QMP> { "execute": "qmp_capabilities", "arguments": { "oob": true } }
>> >>     {"error": {"class": "GenericError", "desc": "Parameter 'oob' is unexpected"}}
>> >>     QMP> { "execute": "qmp_capabilities", "arguments": { "enable": ["oob"] } }
>> >>     {"return": {}}
>> >>     QMP> { "execute": "query-qmp-schema" }
>> >>     {"error": {"class": "GenericError", "desc": "Out-Of-Band capability requires that every command contains an 'id' field"}}
>> >>
>> >> Why does every command require 'id'?
>> >
>> > I found one reason: event COMMAND_DROPPED wants it.  Any other reason?
>> >
>> > [...]
>> 
>> Apropos COMMAND_DROPPED: we send an event rather than an error response
>> because we may send it out-of-order.  Makes sense.
>> 
>> However, broadcasting it to all monitors doesn't make sense.  We could
>> use a way to send an event to just one monitor.
>
> Worse than that - broadcasting to all monitors is categorically broken.
> Different monitors make use the same "id" formatting scheme, so if you
> broadcast COMMAND_DROPPED to a different monitor you might have clashing
> "id" and thus incorrectly tell a client its command was dropped when in
> fact it was processed. You'd have to be fairly unlucky in timing, but
> it could happen.

Right.  Must fix bug.

I'm glad I went over this one more time, and in public!

Re: [Qemu-devel] monitor: enable OOB by default

Posted by Peter Xu 7 years, 7 months ago

On Wed, Jun 27, 2018 at 01:23:07PM +0200, Markus Armbruster wrote:
> Daniel P. Berrangé <berrange@redhat.com> writes:
> 
> > On Wed, Jun 27, 2018 at 10:41:38AM +0200, Markus Armbruster wrote:
> >> Markus Armbruster <armbru@redhat.com> writes:
> >> 
> >> > Markus Armbruster <armbru@redhat.com> writes:
> >> >
> >> >> I fooled around a bit, and I think there are a few lose ends.
> >> > [...]
> >> >> Talking to a QMP monitor that supports OOB:
> >> >>
> >> >>     $ socat UNIX:test-qmp READLINE,history=$HOME/.qmp_history,prompt='QMP> '
> >> >>     {"QMP": {"version": {"qemu": {"micro": 50, "minor": 12, "major": 2}, "package": "v2.12.0-1703-gb909799463"}, "capabilities": ["oob"]}}
> >> >>     QMP> { "execute": "qmp_capabilities", "arguments": { "oob": true } }
> >> >>     {"error": {"class": "GenericError", "desc": "Parameter 'oob' is unexpected"}}
> >> >>     QMP> { "execute": "qmp_capabilities", "arguments": { "enable": ["oob"] } }
> >> >>     {"return": {}}
> >> >>     QMP> { "execute": "query-qmp-schema" }
> >> >>     {"error": {"class": "GenericError", "desc": "Out-Of-Band capability requires that every command contains an 'id' field"}}
> >> >>
> >> >> Why does every command require 'id'?
> >> >
> >> > I found one reason: event COMMAND_DROPPED wants it.  Any other reason?
> >> >
> >> > [...]
> >> 
> >> Apropos COMMAND_DROPPED: we send an event rather than an error response
> >> because we may send it out-of-order.  Makes sense.
> >> 
> >> However, broadcasting it to all monitors doesn't make sense.  We could
> >> use a way to send an event to just one monitor.

True.

(Sorry for the late responses; I was on Linuxcon China in the past few
days)

> >
> > Worse than that - broadcasting to all monitors is categorically broken.
> > Different monitors make use the same "id" formatting scheme, so if you
> > broadcast COMMAND_DROPPED to a different monitor you might have clashing
> > "id" and thus incorrectly tell a client its command was dropped when in
> > fact it was processed. You'd have to be fairly unlucky in timing, but
> > it could happen.
> 
> Right.  Must fix bug.

Even more true.

> 
> I'm glad I went over this one more time, and in public!

I had a glance at current qmp-spec, it seems that we don't have any
restriction currently on "we must send events to all the monitors".
Does it mean that we should be free to have per-monitor events
starting from this event?

My current plan is that I can touch up scripts/qapi/events.py and
related stuff to allow QMPEventFuncEmit to take a monitor parameter,
then we pass in NULL when we want to send the event to all monitors.

Would that work?

Thanks,

-- 
Peter Xu

Re: [Qemu-devel] monitor: enable OOB by default

Posted by Eric Blake 7 years, 7 months ago

On 06/27/2018 07:07 AM, Peter Xu wrote:

>>> Worse than that - broadcasting to all monitors is categorically broken.
>>> Different monitors make use the same "id" formatting scheme, so if you
>>> broadcast COMMAND_DROPPED to a different monitor you might have clashing
>>> "id" and thus incorrectly tell a client its command was dropped when in
>>> fact it was processed. You'd have to be fairly unlucky in timing, but
>>> it could happen.
>>
>> Right.  Must fix bug.
> 

> 
> My current plan is that I can touch up scripts/qapi/events.py and
> related stuff to allow QMPEventFuncEmit to take a monitor parameter,
> then we pass in NULL when we want to send the event to all monitors.
> 
> Would that work?

Makes sense to me. Also, right now, ALL callers of qapi_event_send_* 
pass &error_abort as their final parameter. If you're refactoring 
everything anyways, you could get rid of that parameter on the 
presumption that it doesn't buy us anything.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [Qemu-devel] monitor: enable OOB by default

Posted by Markus Armbruster 7 years, 7 months ago

Eric Blake <eblake@redhat.com> writes:

> On 06/27/2018 07:07 AM, Peter Xu wrote:
>
>>>> Worse than that - broadcasting to all monitors is categorically broken.
>>>> Different monitors make use the same "id" formatting scheme, so if you
>>>> broadcast COMMAND_DROPPED to a different monitor you might have clashing
>>>> "id" and thus incorrectly tell a client its command was dropped when in
>>>> fact it was processed. You'd have to be fairly unlucky in timing, but
>>>> it could happen.
>>>
>>> Right.  Must fix bug.
>>
>
>>
>> My current plan is that I can touch up scripts/qapi/events.py and
>> related stuff to allow QMPEventFuncEmit to take a monitor parameter,
>> then we pass in NULL when we want to send the event to all monitors.
>>
>> Would that work?
>
> Makes sense to me. Also, right now, ALL callers of qapi_event_send_*
> pass &error_abort as their final parameter. If you're refactoring
> everything anyways, you could get rid of that parameter on the
> presumption that it doesn't buy us anything.

Let me try to turn presumption into fact :)

The generated qapi_event_send_FOO() can detect the following error
conditions:

* Visitor fails.  But the QObject output visitor can't actually fail.

  Since the errp parameter is part of the abstract visitor interface, we
  can't eliminate it.  We can pass &error_abort.

* emit() function fails.  Can't happen, either.

  Let's eliminate the unused errp parameter of QMPEventFuncEmit.

  If we can eliminate the somewhat silly indirection through
  qmp_event_get_func_emit(), even better.

Re: [Qemu-devel] monitor: enable OOB by default

Posted by Peter Xu 7 years, 7 months ago

On Thu, Jun 28, 2018 at 09:04:19AM +0200, Markus Armbruster wrote:
> Eric Blake <eblake@redhat.com> writes:
> 
> > On 06/27/2018 07:07 AM, Peter Xu wrote:
> >
> >>>> Worse than that - broadcasting to all monitors is categorically broken.
> >>>> Different monitors make use the same "id" formatting scheme, so if you
> >>>> broadcast COMMAND_DROPPED to a different monitor you might have clashing
> >>>> "id" and thus incorrectly tell a client its command was dropped when in
> >>>> fact it was processed. You'd have to be fairly unlucky in timing, but
> >>>> it could happen.
> >>>
> >>> Right.  Must fix bug.
> >>
> >
> >>
> >> My current plan is that I can touch up scripts/qapi/events.py and
> >> related stuff to allow QMPEventFuncEmit to take a monitor parameter,
> >> then we pass in NULL when we want to send the event to all monitors.
> >>
> >> Would that work?
> >
> > Makes sense to me. Also, right now, ALL callers of qapi_event_send_*
> > pass &error_abort as their final parameter. If you're refactoring
> > everything anyways, you could get rid of that parameter on the
> > presumption that it doesn't buy us anything.
> 
> Let me try to turn presumption into fact :)
> 
> The generated qapi_event_send_FOO() can detect the following error
> conditions:
> 
> * Visitor fails.  But the QObject output visitor can't actually fail.
> 
>   Since the errp parameter is part of the abstract visitor interface, we
>   can't eliminate it.  We can pass &error_abort.
> 
> * emit() function fails.  Can't happen, either.
> 
>   Let's eliminate the unused errp parameter of QMPEventFuncEmit.

Hmm, I think I got your point.

> 
>   If we can eliminate the somewhat silly indirection through
>   qmp_event_get_func_emit(), even better.

It confused me before on why we had that after all. Let me give it a
shot.

Regards,

-- 
Peter Xu

Re: [Qemu-devel] monitor: enable OOB by default

Posted by Markus Armbruster 7 years, 7 months ago

Peter Xu <peterx@redhat.com> writes:

> On Wed, Jun 27, 2018 at 01:23:07PM +0200, Markus Armbruster wrote:
>> Daniel P. Berrangé <berrange@redhat.com> writes:
>> 
>> > On Wed, Jun 27, 2018 at 10:41:38AM +0200, Markus Armbruster wrote:
>> >> Markus Armbruster <armbru@redhat.com> writes:
>> >> 
>> >> > Markus Armbruster <armbru@redhat.com> writes:
>> >> >
>> >> >> I fooled around a bit, and I think there are a few lose ends.
>> >> > [...]
>> >> >> Talking to a QMP monitor that supports OOB:
>> >> >>
>> >> >>     $ socat UNIX:test-qmp READLINE,history=$HOME/.qmp_history,prompt='QMP> '
>> >> >>     {"QMP": {"version": {"qemu": {"micro": 50, "minor": 12, "major": 2}, "package": "v2.12.0-1703-gb909799463"}, "capabilities": ["oob"]}}
>> >> >>     QMP> { "execute": "qmp_capabilities", "arguments": { "oob": true } }
>> >> >>     {"error": {"class": "GenericError", "desc": "Parameter 'oob' is unexpected"}}
>> >> >>     QMP> { "execute": "qmp_capabilities", "arguments": { "enable": ["oob"] } }
>> >> >>     {"return": {}}
>> >> >>     QMP> { "execute": "query-qmp-schema" }
>> >> >>     {"error": {"class": "GenericError", "desc": "Out-Of-Band capability requires that every command contains an 'id' field"}}
>> >> >>
>> >> >> Why does every command require 'id'?
>> >> >
>> >> > I found one reason: event COMMAND_DROPPED wants it.  Any other reason?
>> >> >
>> >> > [...]
>> >> 
>> >> Apropos COMMAND_DROPPED: we send an event rather than an error response
>> >> because we may send it out-of-order.  Makes sense.
>> >> 
>> >> However, broadcasting it to all monitors doesn't make sense.  We could
>> >> use a way to send an event to just one monitor.
>
> True.
>
> (Sorry for the late responses; I was on Linuxcon China in the past few
> days)

No need to be sorry :)

>> >
>> > Worse than that - broadcasting to all monitors is categorically broken.
>> > Different monitors make use the same "id" formatting scheme, so if you
>> > broadcast COMMAND_DROPPED to a different monitor you might have clashing
>> > "id" and thus incorrectly tell a client its command was dropped when in
>> > fact it was processed. You'd have to be fairly unlucky in timing, but
>> > it could happen.
>> 
>> Right.  Must fix bug.
>
> Even more true.
>
>> 
>> I'm glad I went over this one more time, and in public!
>
> I had a glance at current qmp-spec, it seems that we don't have any
> restriction currently on "we must send events to all the monitors".
> Does it mean that we should be free to have per-monitor events
> starting from this event?

Changing an existing event from broadcast to unicast is an observable
change in existing behavior.  Compatibility break unless we can show
nobody can use / uses the observation.

Creating a new event is not a compatibility break by itself[*],
regardless of broadcast vs. unicast.

> My current plan is that I can touch up scripts/qapi/events.py and
> related stuff to allow QMPEventFuncEmit to take a monitor parameter,
> then we pass in NULL when we want to send the event to all monitors.
>
> Would that work?

Think so.

Alternatively, a pair of functions:

    void qapi_event_bcast_EVENT(... event params ..., Error **errp);
    void qapi_event_send_EVENT(Monitor *mon, ... event params ..., Error **errp);

Slightly more compact in the broadcast case, might be a bit easier to
read.


[*] In the case of COMMAND_DROPPED, the compatibility break is dropping
commands (the event's trigger), caused by the command queuing feature.
That's why command queuing has to be opt-in.  Details discussed
elsewhere in this thread.

Re: [Qemu-devel] monitor: enable OOB by default

Posted by Eric Blake 7 years, 7 months ago

On 06/28/2018 01:55 AM, Markus Armbruster wrote:

> Changing an existing event from broadcast to unicast is an observable
> change in existing behavior.  Compatibility break unless we can show
> nobody can use / uses the observation.

And no one could have been relying on the broadcast COMMAND_DROPPED 
event semantics, since OOB was previously experimental and since we just 
proved that they are wrong if not limited to one monitor.

> 
> Creating a new event is not a compatibility break by itself[*],
> regardless of broadcast vs. unicast.
> 
>> My current plan is that I can touch up scripts/qapi/events.py and
>> related stuff to allow QMPEventFuncEmit to take a monitor parameter,
>> then we pass in NULL when we want to send the event to all monitors.
>>
>> Would that work?
> 
> Think so.
> 
> Alternatively, a pair of functions:
> 
>      void qapi_event_bcast_EVENT(... event params ..., Error **errp);
>      void qapi_event_send_EVENT(Monitor *mon, ... event params ..., Error **errp);
> 
> Slightly more compact in the broadcast case, might be a bit easier to
> read.

Also, fewer NULL arguments to add into existing call sites (although 
existing call sites would change to use the _bcast_ form, so you're 
already touching all call sites, which brings back the topic from the 
other mail on dropping useless errp while at it).

> 
> 
> [*] In the case of COMMAND_DROPPED, the compatibility break is dropping
> commands (the event's trigger), caused by the command queuing feature.
> That's why command queuing has to be opt-in.  Details discussed
> elsewhere in this thread.

Indeed, and I thought the conclusion there was that we DO promise that 
COMMAND_DROPPED (and dropped commands in general) won't happen unless 
you negotiate oob at initial connection.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [Qemu-devel] monitor: enable OOB by default

Posted by Peter Xu 7 years, 7 months ago

On Thu, Jun 28, 2018 at 08:55:48AM +0200, Markus Armbruster wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Wed, Jun 27, 2018 at 01:23:07PM +0200, Markus Armbruster wrote:
> >> Daniel P. Berrangé <berrange@redhat.com> writes:
> >> 
> >> > On Wed, Jun 27, 2018 at 10:41:38AM +0200, Markus Armbruster wrote:
> >> >> Markus Armbruster <armbru@redhat.com> writes:
> >> >> 
> >> >> > Markus Armbruster <armbru@redhat.com> writes:
> >> >> >
> >> >> >> I fooled around a bit, and I think there are a few lose ends.
> >> >> > [...]
> >> >> >> Talking to a QMP monitor that supports OOB:
> >> >> >>
> >> >> >>     $ socat UNIX:test-qmp READLINE,history=$HOME/.qmp_history,prompt='QMP> '
> >> >> >>     {"QMP": {"version": {"qemu": {"micro": 50, "minor": 12, "major": 2}, "package": "v2.12.0-1703-gb909799463"}, "capabilities": ["oob"]}}
> >> >> >>     QMP> { "execute": "qmp_capabilities", "arguments": { "oob": true } }
> >> >> >>     {"error": {"class": "GenericError", "desc": "Parameter 'oob' is unexpected"}}
> >> >> >>     QMP> { "execute": "qmp_capabilities", "arguments": { "enable": ["oob"] } }
> >> >> >>     {"return": {}}
> >> >> >>     QMP> { "execute": "query-qmp-schema" }
> >> >> >>     {"error": {"class": "GenericError", "desc": "Out-Of-Band capability requires that every command contains an 'id' field"}}
> >> >> >>
> >> >> >> Why does every command require 'id'?
> >> >> >
> >> >> > I found one reason: event COMMAND_DROPPED wants it.  Any other reason?
> >> >> >
> >> >> > [...]
> >> >> 
> >> >> Apropos COMMAND_DROPPED: we send an event rather than an error response
> >> >> because we may send it out-of-order.  Makes sense.
> >> >> 
> >> >> However, broadcasting it to all monitors doesn't make sense.  We could
> >> >> use a way to send an event to just one monitor.
> >
> > True.
> >
> > (Sorry for the late responses; I was on Linuxcon China in the past few
> > days)
> 
> No need to be sorry :)
> 
> >> >
> >> > Worse than that - broadcasting to all monitors is categorically broken.
> >> > Different monitors make use the same "id" formatting scheme, so if you
> >> > broadcast COMMAND_DROPPED to a different monitor you might have clashing
> >> > "id" and thus incorrectly tell a client its command was dropped when in
> >> > fact it was processed. You'd have to be fairly unlucky in timing, but
> >> > it could happen.
> >> 
> >> Right.  Must fix bug.
> >
> > Even more true.
> >
> >> 
> >> I'm glad I went over this one more time, and in public!
> >
> > I had a glance at current qmp-spec, it seems that we don't have any
> > restriction currently on "we must send events to all the monitors".
> > Does it mean that we should be free to have per-monitor events
> > starting from this event?
> 
> Changing an existing event from broadcast to unicast is an observable
> change in existing behavior.  Compatibility break unless we can show
> nobody can use / uses the observation.
> 
> Creating a new event is not a compatibility break by itself[*],
> regardless of broadcast vs. unicast.

Yeah.

Though this change will be a bit unique in that basically we should
not break anyone but fix potential problems that the clients might
encounter.

Firstly I really suspect anyone has really started to use OOB
commands...  But let's just assume there is.

Then if a client has implemented the COMMAND_DROPPED event, our change
should still keep that event be there when any of the command is
dropped for that monitor.  What we really changed in the behavior is
that we won't send incorrect COMMAND_DROPPED events to them which
should not belong to them.  From that point of view, it's not really
the spec/api that is changed, we just try to fix a bug from a broken
implementation (from me) of the spec. :)

So if you would agree I'd prefer to directly change that bit to change
COMMAND_DROPPED as per-monitor event.  It should only benefit existing
clients (from not receiving broken events) rather than breaking any of
them.

> 
> > My current plan is that I can touch up scripts/qapi/events.py and
> > related stuff to allow QMPEventFuncEmit to take a monitor parameter,
> > then we pass in NULL when we want to send the event to all monitors.
> >
> > Would that work?
> 
> Think so.
> 
> Alternatively, a pair of functions:
> 
>     void qapi_event_bcast_EVENT(... event params ..., Error **errp);
>     void qapi_event_send_EVENT(Monitor *mon, ... event params ..., Error **errp);
> 
> Slightly more compact in the broadcast case, might be a bit easier to
> read.

Yeah I can try this.

> 
> 
> [*] In the case of COMMAND_DROPPED, the compatibility break is dropping
> commands (the event's trigger), caused by the command queuing feature.
> That's why command queuing has to be opt-in.  Details discussed
> elsewhere in this thread.

Thanks,

-- 
Peter Xu

Re: [Qemu-devel] monitor: enable OOB by default

Posted by Markus Armbruster 7 years, 7 months ago

Monitor behavior changes even when the client rejects capability "oob".

Traditionally, the monitor reads, executes and responds to one command
after the other.  If the client sends in-band commands faster than the
server can execute them, the kernel will eventually refuse to buffer
more, and sending blocks or fails with EAGAIN.

To make OOB possible, we need to read and queue commands as we receive
them.  If the client sends in-band commands faster than the server can
execute them, the server will eventually drop commands to limit the
queue length.  The sever sends event COMMAND_DROPPED then.

However, we get the new behavior even when the client rejects capability
"oob".  We get the traditional behavior only when the server doesn't
offer "oob".

Is this what we want?

Re: [Qemu-devel] monitor: enable OOB by default

Posted by Daniel P. Berrangé 7 years, 7 months ago

On Wed, Jun 27, 2018 at 03:13:57PM +0200, Markus Armbruster wrote:
> Monitor behavior changes even when the client rejects capability "oob".
> 
> Traditionally, the monitor reads, executes and responds to one command
> after the other.  If the client sends in-band commands faster than the
> server can execute them, the kernel will eventually refuse to buffer
> more, and sending blocks or fails with EAGAIN.
> 
> To make OOB possible, we need to read and queue commands as we receive
> them.  If the client sends in-band commands faster than the server can
> execute them, the server will eventually drop commands to limit the
> queue length.  The sever sends event COMMAND_DROPPED then.
> 
> However, we get the new behavior even when the client rejects capability
> "oob".  We get the traditional behavior only when the server doesn't
> offer "oob".
> 
> Is this what we want?

IMHO the key benefit of allowing the client to reject the capability
is to enable backwards compat support. So this behaviour feels wrong.
Rejecting OOB should have same semantics as previous QEMU's without
OOB available, otherwise we now have 3 distinct ways the monitor
operates  (no OOB, OOB rejected, OOB accepted). This can only ever
lead to more bugs due to lack of testing of no OOB vs OOB rejected
scenarios.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [Qemu-devel] monitor: enable OOB by default

Posted by Markus Armbruster 7 years, 7 months ago

Daniel P. Berrangé <berrange@redhat.com> writes:

> On Wed, Jun 27, 2018 at 03:13:57PM +0200, Markus Armbruster wrote:
>> Monitor behavior changes even when the client rejects capability "oob".
>> 
>> Traditionally, the monitor reads, executes and responds to one command
>> after the other.  If the client sends in-band commands faster than the
>> server can execute them, the kernel will eventually refuse to buffer
>> more, and sending blocks or fails with EAGAIN.
>> 
>> To make OOB possible, we need to read and queue commands as we receive
>> them.  If the client sends in-band commands faster than the server can
>> execute them, the server will eventually drop commands to limit the
>> queue length.  The sever sends event COMMAND_DROPPED then.
>> 
>> However, we get the new behavior even when the client rejects capability
>> "oob".  We get the traditional behavior only when the server doesn't
>> offer "oob".
>> 
>> Is this what we want?
>
> IMHO the key benefit of allowing the client to reject the capability
> is to enable backwards compat support. So this behaviour feels wrong.
> Rejecting OOB should have same semantics as previous QEMU's without
> OOB available, otherwise we now have 3 distinct ways the monitor
> operates  (no OOB, OOB rejected, OOB accepted). This can only ever
> lead to more bugs due to lack of testing of no OOB vs OOB rejected
> scenarios.

Agreed.

We have three configuration cases

* OOB not offered (because MUX)

* OOB offered, but rejected by client

* OOB offered and accepted

We want to map them to two run time cases

* OOB off

* OOB on

We may use "server offered OOB" only for configuration purposes.

Keep that in mind when reading my reply to Peter's reply.

Aside: it would be nice to get rid of the configuration case "OOB not
offered", but that's for later.

Re: [Qemu-devel] monitor: enable OOB by default

Posted by Peter Xu 7 years, 7 months ago

On Wed, Jun 27, 2018 at 03:13:57PM +0200, Markus Armbruster wrote:
> Monitor behavior changes even when the client rejects capability "oob".
> 
> Traditionally, the monitor reads, executes and responds to one command
> after the other.  If the client sends in-band commands faster than the
> server can execute them, the kernel will eventually refuse to buffer
> more, and sending blocks or fails with EAGAIN.
> 
> To make OOB possible, we need to read and queue commands as we receive
> them.  If the client sends in-band commands faster than the server can
> execute them, the server will eventually drop commands to limit the
> queue length.  The sever sends event COMMAND_DROPPED then.
> 
> However, we get the new behavior even when the client rejects capability
> "oob".  We get the traditional behavior only when the server doesn't
> offer "oob".
> 
> Is this what we want?

Not really.  But I thought we kept that old behavior, haven't we?

In handle_qmp_command() we have this:

    /*
     * If OOB is not enabled on the current monitor, we'll emulate the
     * old behavior that we won't process the current monitor any more
     * until it has responded.  This helps make sure that as long as
     * OOB is not enabled, the server will never drop any command.
     */
    if (!qmp_oob_enabled(mon)) {
        monitor_suspend(mon);
        req_obj->need_resume = true;
    } else {
        /* Drop the request if queue is full. */
        if (mon->qmp.qmp_requests->length >= QMP_REQ_QUEUE_LEN_MAX) {
            qemu_mutex_unlock(&mon->qmp.qmp_queue_lock);
            qapi_event_send_command_dropped(id,
                                            COMMAND_DROP_REASON_QUEUE_FULL,
                                            &error_abort);
            qmp_request_free(req_obj);
            return;
        }
    }

Here assuming that we are not enabling OOB, then since we will suspend
the monitor when we receive one command, then monitor_can_read() later
will check with a result that "we should not read the chardev", then
it'll leave all the data in the input buffer.  Then AFAIU the QMP
client that didn't enable OOB will have similar behavior as before.

Also, we will never receive a COMMAND_DROPPED event in that legacy
mode as well since it's in the "else" block.   Am I right?

Regards,

-- 
Peter Xu

Re: [Qemu-devel] monitor: enable OOB by default

Posted by Markus Armbruster 7 years, 7 months ago

Peter Xu <peterx@redhat.com> writes:

> On Wed, Jun 27, 2018 at 03:13:57PM +0200, Markus Armbruster wrote:
>> Monitor behavior changes even when the client rejects capability "oob".
>> 
>> Traditionally, the monitor reads, executes and responds to one command
>> after the other.  If the client sends in-band commands faster than the
>> server can execute them, the kernel will eventually refuse to buffer
>> more, and sending blocks or fails with EAGAIN.
>> 
>> To make OOB possible, we need to read and queue commands as we receive
>> them.  If the client sends in-band commands faster than the server can
>> execute them, the server will eventually drop commands to limit the
>> queue length.  The sever sends event COMMAND_DROPPED then.
>> 
>> However, we get the new behavior even when the client rejects capability
>> "oob".  We get the traditional behavior only when the server doesn't
>> offer "oob".
>> 
>> Is this what we want?
>
> Not really.  But I thought we kept that old behavior, haven't we?
>
> In handle_qmp_command() we have this:
>
>     /*
>      * If OOB is not enabled on the current monitor, we'll emulate the
>      * old behavior that we won't process the current monitor any more
>      * until it has responded.  This helps make sure that as long as
>      * OOB is not enabled, the server will never drop any command.
>      */
>     if (!qmp_oob_enabled(mon)) {
>         monitor_suspend(mon);
>         req_obj->need_resume = true;
>     } else {
>         /* Drop the request if queue is full. */
>         if (mon->qmp.qmp_requests->length >= QMP_REQ_QUEUE_LEN_MAX) {
>             qemu_mutex_unlock(&mon->qmp.qmp_queue_lock);
>             qapi_event_send_command_dropped(id,
>                                             COMMAND_DROP_REASON_QUEUE_FULL,
>                                             &error_abort);
>             qmp_request_free(req_obj);
>             return;
>         }
>     }
>
> Here assuming that we are not enabling OOB, then since we will suspend
> the monitor when we receive one command, then monitor_can_read() later
> will check with a result that "we should not read the chardev", then
> it'll leave all the data in the input buffer.  Then AFAIU the QMP
> client that didn't enable OOB will have similar behavior as before.
>
> Also, we will never receive a COMMAND_DROPPED event in that legacy
> mode as well since it's in the "else" block.   Am I right?

Hmm.  Did I get confused?  Let me look again.

Some places check qmp_oob_enabled(), which is true when the server
offered capability "oob" and the client accepted.  In terms of my reply
to Daniel, it distinguishes the two run time cases "OOB off" and "OOB
on".

Other places check ->use_io_thr, which is true when

    (flags & MONITOR_USE_CONTROL) && !CHARDEV_IS_MUX(chr)

in monitor_init().

One of these places is get_qmp_greeting().  It makes the server offer
"oob" when ->use_io_thr.  Makes sense.

In terms of my reply to Daniel, ->use_io_thr distinguishes between "OOB
not offered" and ("OOB offered, but rejected by client" or "OOB offered
and accepted").

Uses could to violate 'may use "server offered OOB" only for
configuration purposes', so let's check them:

* monitor_json_emitter()

  If mon->use_io_thr, push to response queue.  Else emit directly.

  I'm afraid run time behavior differs for "OOB not offered" (emit
  directly) and "OOB offered by rejected" (queue).

* qmp_caps_check()

  If !mon->use_io_thr, reject client's acceptance of "oob".  Implicitly
  recomputing the set of offered capabilities here is less than elegant,
  but it's not wrong.

* monitor_resume()

  If mon->use_io_thr(), kick the monitor I/O thread.

  Again, different behavior for the two variations of "OOB off".

* get_qmp_greeting()

  Covered above, looks good to me.

* monitor_qmp_setup_handlers_bh()

  If mon->use_io_thr(), pass the monitor I/O thread's AIOContext to
  qemu_chr_fe_set_handlers(), else pass NULL.

  Again, different behavior for the two variations of "OOB off".

* monitor_init()

  If mon->use_io_thr, set up bottom half
  monitor_qmp_setup_handlers_bh(), else call qemu_chr_fe_set_handlers()
  directly.

  Again, different behavior for the two variations of "OOB off".

Either I am confused now, or the two variations of "OOB off" execute
different code at run time.  Which is it?

If it's different code, are the differences observable for the client?

Observable or not, I suspect the differences should not exist.

Re: [Qemu-devel] monitor: enable OOB by default

Posted by Peter Xu 7 years, 7 months ago

On Thu, Jun 28, 2018 at 03:20:41PM +0200, Markus Armbruster wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Wed, Jun 27, 2018 at 03:13:57PM +0200, Markus Armbruster wrote:
> >> Monitor behavior changes even when the client rejects capability "oob".
> >> 
> >> Traditionally, the monitor reads, executes and responds to one command
> >> after the other.  If the client sends in-band commands faster than the
> >> server can execute them, the kernel will eventually refuse to buffer
> >> more, and sending blocks or fails with EAGAIN.
> >> 
> >> To make OOB possible, we need to read and queue commands as we receive
> >> them.  If the client sends in-band commands faster than the server can
> >> execute them, the server will eventually drop commands to limit the
> >> queue length.  The sever sends event COMMAND_DROPPED then.
> >> 
> >> However, we get the new behavior even when the client rejects capability
> >> "oob".  We get the traditional behavior only when the server doesn't
> >> offer "oob".
> >> 
> >> Is this what we want?
> >
> > Not really.  But I thought we kept that old behavior, haven't we?
> >
> > In handle_qmp_command() we have this:
> >
> >     /*
> >      * If OOB is not enabled on the current monitor, we'll emulate the
> >      * old behavior that we won't process the current monitor any more
> >      * until it has responded.  This helps make sure that as long as
> >      * OOB is not enabled, the server will never drop any command.
> >      */
> >     if (!qmp_oob_enabled(mon)) {
> >         monitor_suspend(mon);
> >         req_obj->need_resume = true;
> >     } else {
> >         /* Drop the request if queue is full. */
> >         if (mon->qmp.qmp_requests->length >= QMP_REQ_QUEUE_LEN_MAX) {
> >             qemu_mutex_unlock(&mon->qmp.qmp_queue_lock);
> >             qapi_event_send_command_dropped(id,
> >                                             COMMAND_DROP_REASON_QUEUE_FULL,
> >                                             &error_abort);
> >             qmp_request_free(req_obj);
> >             return;
> >         }
> >     }
> >
> > Here assuming that we are not enabling OOB, then since we will suspend
> > the monitor when we receive one command, then monitor_can_read() later
> > will check with a result that "we should not read the chardev", then
> > it'll leave all the data in the input buffer.  Then AFAIU the QMP
> > client that didn't enable OOB will have similar behavior as before.
> >
> > Also, we will never receive a COMMAND_DROPPED event in that legacy
> > mode as well since it's in the "else" block.   Am I right?
> 
> Hmm.  Did I get confused?  Let me look again.
> 
> Some places check qmp_oob_enabled(), which is true when the server
> offered capability "oob" and the client accepted.  In terms of my reply
> to Daniel, it distinguishes the two run time cases "OOB off" and "OOB
> on".

Exactly.

> 
> Other places check ->use_io_thr, which is true when
> 
>     (flags & MONITOR_USE_CONTROL) && !CHARDEV_IS_MUX(chr)
> 
> in monitor_init().
> 
> One of these places is get_qmp_greeting().  It makes the server offer
> "oob" when ->use_io_thr.  Makes sense.
> 
> In terms of my reply to Daniel, ->use_io_thr distinguishes between "OOB
> not offered" and ("OOB offered, but rejected by client" or "OOB offered
> and accepted").

Exactly.

To be more clear, let's name the three cases (as you mentioned):

(A) OOB not offered
(B) OOB offered, but rejected by client
(C) OOB offered, and accepted

Then:

(1) qmp_oob_enabled() will only be return true if (C).
(2) ->use_io_thr will only be true if (B) or (C).

> 
> Uses could to violate 'may use "server offered OOB" only for
> configuration purposes', so let's check them:
> 
> * monitor_json_emitter()
> 
>   If mon->use_io_thr, push to response queue.  Else emit directly.
> 
>   I'm afraid run time behavior differs for "OOB not offered" (emit
>   directly) and "OOB offered by rejected" (queue).

Yeah it's different, but logically it's the same.  IMHO it's only a
fast path for main thread.  In other word, I think it can still work
correctly if we remove that else clause.  In that sense, it's not
really a "true" behavior change.

> 
> * qmp_caps_check()
> 
>   If !mon->use_io_thr, reject client's acceptance of "oob".  Implicitly
>   recomputing the set of offered capabilities here is less than elegant,
>   but it's not wrong.
> 
> * monitor_resume()
> 
>   If mon->use_io_thr(), kick the monitor I/O thread.
> 
>   Again, different behavior for the two variations of "OOB off".

This is another example like above - IMHO we can just kick it no
matter what, but we just don't need to do that for main thread (since
we are in main thread so we are sure it is not asleep).  It's just
another trivial enhancement on top of the main logic.

> 
> * get_qmp_greeting()
> 
>   Covered above, looks good to me.
> 
> * monitor_qmp_setup_handlers_bh()
> 
>   If mon->use_io_thr(), pass the monitor I/O thread's AIOContext to
>   qemu_chr_fe_set_handlers(), else pass NULL.
> 
>   Again, different behavior for the two variations of "OOB off".

This is different indeed, but IMHO it's not really "behavior
difference", it's just the context (thread to run the code) that is
different.  The major code path for case (A) and (B) (which are the
two "off" cases) should be the same (when I say "major", I excluded
those trivial enhancements that I mentioned).

> 
> * monitor_init()
> 
>   If mon->use_io_thr, set up bottom half
>   monitor_qmp_setup_handlers_bh(), else call qemu_chr_fe_set_handlers()
>   directly.

This is indeed a bit tricky (to avoid a potential race that Stefan has
pointed out), however after things are setup it'll be exactly the same
code path for both OFF cases.

And actually when I read the code I noticed that we can actually
simplify the code with this:

diff --git a/monitor.c b/monitor.c
index 0730a27172..5d6bff8d51 100644
--- a/monitor.c
+++ b/monitor.c
@@ -4635,18 +4635,14 @@ static void monitor_qmp_setup_handlers_bh(void *opaque)
     Monitor *mon = opaque;
     GMainContext *context;

-    if (mon->use_io_thr) {
-        /*
-         * When use_io_thr is set, we use the global shared dedicated
-         * IO thread for this monitor to handle input/output.
-         */
-        context = monitor_get_io_context();
-        /* We should have inited globals before reaching here. */
-        assert(context);
-    } else {
-        /* The default main loop, which is the main thread */
-        context = NULL;
-    }
+    assert(mon->use_io_thr);
+    /*
+     * When use_io_thr is set, we use the global shared dedicated
+     * IO thread for this monitor to handle input/output.
+     */
+    context = monitor_get_io_context();
+    /* We should have inited globals before reaching here. */
+    assert(context);

     qemu_chr_fe_set_handlers(&mon->chr, monitor_can_read, monitor_qmp_read,
                              monitor_qmp_event, NULL, mon, context, true);

Since monitor_qmp_setup_handlers_bh() will only be used for IOThread
case.  I can post a patch for that.

> 
>   Again, different behavior for the two variations of "OOB off".
> 
> Either I am confused now, or the two variations of "OOB off" execute
> different code at run time.  Which is it?

As mentioned, they should be running the same code at run time.
Though with some trivial differences, but if you see above most of
them should only be small enhancements which can actually be removed.

> 
> If it's different code, are the differences observable for the client?

AFAIU since the code path should mostly be the same for the two OOF
cases, IMHO there should have no difference observable from the
client, otherwise it's buggy and I'd be willing to fix it.

> 
> Observable or not, I suspect the differences should not exist.

We can remove those "trivial enhancements" if we want to make sure the
code paths are exactly the same, but I'm not sure whether that's
really what we need (or we just need to make sure it's unobservable).

Thanks!

-- 
Peter Xu

Re: [Qemu-devel] monitor: enable OOB by default

Posted by Markus Armbruster 7 years, 6 months ago

Marc-André, one question for you inline.  Search for your name.

Peter Xu <peterx@redhat.com> writes:

> On Thu, Jun 28, 2018 at 03:20:41PM +0200, Markus Armbruster wrote:
>> Peter Xu <peterx@redhat.com> writes:
>> 
>> > On Wed, Jun 27, 2018 at 03:13:57PM +0200, Markus Armbruster wrote:
>> >> Monitor behavior changes even when the client rejects capability "oob".
>> >> 
>> >> Traditionally, the monitor reads, executes and responds to one command
>> >> after the other.  If the client sends in-band commands faster than the
>> >> server can execute them, the kernel will eventually refuse to buffer
>> >> more, and sending blocks or fails with EAGAIN.
>> >> 
>> >> To make OOB possible, we need to read and queue commands as we receive
>> >> them.  If the client sends in-band commands faster than the server can
>> >> execute them, the server will eventually drop commands to limit the
>> >> queue length.  The sever sends event COMMAND_DROPPED then.
>> >> 
>> >> However, we get the new behavior even when the client rejects capability
>> >> "oob".  We get the traditional behavior only when the server doesn't
>> >> offer "oob".
>> >> 
>> >> Is this what we want?
>> >
>> > Not really.  But I thought we kept that old behavior, haven't we?
>> >
>> > In handle_qmp_command() we have this:
>> >
>> >     /*
>> >      * If OOB is not enabled on the current monitor, we'll emulate the
>> >      * old behavior that we won't process the current monitor any more
>> >      * until it has responded.  This helps make sure that as long as
>> >      * OOB is not enabled, the server will never drop any command.
>> >      */
>> >     if (!qmp_oob_enabled(mon)) {
>> >         monitor_suspend(mon);
>> >         req_obj->need_resume = true;
>> >     } else {
>> >         /* Drop the request if queue is full. */
>> >         if (mon->qmp.qmp_requests->length >= QMP_REQ_QUEUE_LEN_MAX) {
>> >             qemu_mutex_unlock(&mon->qmp.qmp_queue_lock);
>> >             qapi_event_send_command_dropped(id,
>> >                                             COMMAND_DROP_REASON_QUEUE_FULL,
>> >                                             &error_abort);
>> >             qmp_request_free(req_obj);
>> >             return;
>> >         }
>> >     }
>> >
>> > Here assuming that we are not enabling OOB, then since we will suspend
>> > the monitor when we receive one command, then monitor_can_read() later
>> > will check with a result that "we should not read the chardev", then
>> > it'll leave all the data in the input buffer.  Then AFAIU the QMP
>> > client that didn't enable OOB will have similar behavior as before.
>> >
>> > Also, we will never receive a COMMAND_DROPPED event in that legacy
>> > mode as well since it's in the "else" block.   Am I right?
>> 
>> Hmm.  Did I get confused?  Let me look again.
>> 
>> Some places check qmp_oob_enabled(), which is true when the server
>> offered capability "oob" and the client accepted.  In terms of my reply
>> to Daniel, it distinguishes the two run time cases "OOB off" and "OOB
>> on".
>
> Exactly.
>
>> 
>> Other places check ->use_io_thr, which is true when
>> 
>>     (flags & MONITOR_USE_CONTROL) && !CHARDEV_IS_MUX(chr)
>> 
>> in monitor_init().

->use_io_thr is now spelled ->use_io_thread.

>> One of these places is get_qmp_greeting().  It makes the server offer
>> "oob" when ->use_io_thr.  Makes sense.
>> 
>> In terms of my reply to Daniel, ->use_io_thr distinguishes between "OOB
>> not offered" and ("OOB offered, but rejected by client" or "OOB offered
>> and accepted").
>
> Exactly.
>
> To be more clear, let's name the three cases (as you mentioned):
>
> (A) OOB not offered
> (B) OOB offered, but rejected by client
> (C) OOB offered, and accepted
>
> Then:
>
> (1) qmp_oob_enabled() will only be return true if (C).
> (2) ->use_io_thr will only be true if (B) or (C).

Notes on (A) to be kept in mind for the remainder of the discussion:

* I don't expect (A) to occur in production

  (A) means the monitor has a chardev-mux backend.  chardev-mux
  multiplexes multiple character backends, e.g. multiplex a monitor and
  a console on stdio.  C-a c switches focus.  While such multiplexing
  may be convenient for humans, it's an unnecessary complication for
  machines, which could just as well use two UNIX domain sockets and not
  have to worry about focus and escape sequences.

* I do expect (A) to go away eventually

  (A) exists only because "not all the chardev frontends can run without
  main thread, or can run in multiple threads" [PATCH 6].  Hopefully,
  that'll be fixed eventually, so (A) can go away.

  Marc-André, your "[PATCH 04/12] Revert "qmp: isolate responses into io
  thread" states the "chardev write path is thread safe".  What's
  missing to make chardev-mux capable of supporting OOB?

>> Uses could to violate 'may use "server offered OOB" only for
>> configuration purposes', so let's check them:
>> 
>> * monitor_json_emitter()

This is now qmp_queue_response()

>>   If mon->use_io_thr, push to response queue.  Else emit directly.
>> 
>>   I'm afraid run time behavior differs for "OOB not offered" (emit
>>   directly) and "OOB offered by rejected" (queue).
>
> Yeah it's different, but logically it's the same.  IMHO it's only a
> fast path for main thread.  In other word, I think it can still work
> correctly if we remove that else clause.  In that sense, it's not
> really a "true" behavior change.

Remember that (A) should not occur in production, and should go away
eventually.  Maintaining a fast path just for that feels unjustified.

>> * qmp_caps_check()
>> 
>>   If !mon->use_io_thr, reject client's acceptance of "oob".  Implicitly
>>   recomputing the set of offered capabilities here is less than elegant,
>>   but it's not wrong.

This has been replaced by monitor_qmp_caps_reset().

If mon->use_io_thread, offer OOB.  Makes sense.

>> * monitor_resume()
>> 
>>   If mon->use_io_thr(), kick the monitor I/O thread.
>> 
>>   Again, different behavior for the two variations of "OOB off".
>
> This is another example like above - IMHO we can just kick it no
> matter what, but we just don't need to do that for main thread (since
> we are in main thread so we are sure it is not asleep).  It's just
> another trivial enhancement on top of the main logic.

Again, maintaining a special case just for (A) feels unjustified.

>> * get_qmp_greeting()
>> 
>>   Covered above, looks good to me.

Also replaced by monitor_qmp_caps_reset().

>> * monitor_qmp_setup_handlers_bh()
>> 
>>   If mon->use_io_thr(), pass the monitor I/O thread's AIOContext to
>>   qemu_chr_fe_set_handlers(), else pass NULL.
>> 
>>   Again, different behavior for the two variations of "OOB off".
>
> This is different indeed, but IMHO it's not really "behavior
> difference", it's just the context (thread to run the code) that is
> different.  The major code path for case (A) and (B) (which are the
> two "off" cases) should be the same (when I say "major", I excluded
> those trivial enhancements that I mentioned).

As far as I can tell, monitor_qmp_setup_handlers_bh() can run only if
scheduled by monitor_init() when mon->use_io_thread.  If that's true,
then the !mon->use_io_thread case is unreachable.  Ah, I see you've also
noticed that, and you're proposing a patch below.

>> * monitor_init()
>> 
>>   If mon->use_io_thr, set up bottom half
>>   monitor_qmp_setup_handlers_bh(), else call qemu_chr_fe_set_handlers()
>>   directly.
>
> This is indeed a bit tricky (to avoid a potential race that Stefan has
> pointed out), however after things are setup it'll be exactly the same
> code path for both OFF cases.
>
> And actually when I read the code I noticed that we can actually
> simplify the code with this:
>
> diff --git a/monitor.c b/monitor.c
> index 0730a27172..5d6bff8d51 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -4635,18 +4635,14 @@ static void monitor_qmp_setup_handlers_bh(void *opaque)
>      Monitor *mon = opaque;
>      GMainContext *context;
>
> -    if (mon->use_io_thr) {
> -        /*
> -         * When use_io_thr is set, we use the global shared dedicated
> -         * IO thread for this monitor to handle input/output.
> -         */
> -        context = monitor_get_io_context();
> -        /* We should have inited globals before reaching here. */
> -        assert(context);
> -    } else {
> -        /* The default main loop, which is the main thread */
> -        context = NULL;
> -    }
> +    assert(mon->use_io_thr);
> +    /*
> +     * When use_io_thr is set, we use the global shared dedicated
> +     * IO thread for this monitor to handle input/output.
> +     */
> +    context = monitor_get_io_context();
> +    /* We should have inited globals before reaching here. */
> +    assert(context);
>
>      qemu_chr_fe_set_handlers(&mon->chr, monitor_can_read, monitor_qmp_read,
>                               monitor_qmp_event, NULL, mon, context, true);
>
> Since monitor_qmp_setup_handlers_bh() will only be used for IOThread
> case.  I can post a patch for that.

Yes, please.

>> 
>>   Again, different behavior for the two variations of "OOB off".

The difference is

* timing: right away for (A) vs. bottom half for (B) and (C)

* context argument for qemu_chr_fe_set_handlers(): NULL, i.e. main loop
  thread for (A), monitor_get_io_context(), i.e. the I/O thread for (B)
  and (C)

Okay, I think.

>> Either I am confused now, or the two variations of "OOB off" execute
>> different code at run time.  Which is it?
>
> As mentioned, they should be running the same code at run time.
> Though with some trivial differences, but if you see above most of
> them should only be small enhancements which can actually be removed.

Please do.

>> If it's different code, are the differences observable for the client?
>
> AFAIU since the code path should mostly be the same for the two OOF
> cases, IMHO there should have no difference observable from the
> client, otherwise it's buggy and I'd be willing to fix it.
>
>> 
>> Observable or not, I suspect the differences should not exist.
>
> We can remove those "trivial enhancements" if we want to make sure the
> code paths are exactly the same, but I'm not sure whether that's
> really what we need (or we just need to make sure it's unobservable).

As long as we're running separate code for (A), "unobservable
difference" is a proposition we need to prove.

Reducing separate code should simplify the proof.

Eliminiating it will simplify it maximally :)

> Thanks!

Thank *you* for persevering :)

Re: [Qemu-devel] monitor: enable OOB by default

Posted by Peter Xu 7 years, 6 months ago

On Wed, Jul 18, 2018 at 05:08:21PM +0200, Markus Armbruster wrote:
> Marc-André, one question for you inline.  Search for your name.
> 
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Thu, Jun 28, 2018 at 03:20:41PM +0200, Markus Armbruster wrote:
> >> Peter Xu <peterx@redhat.com> writes:
> >> 
> >> > On Wed, Jun 27, 2018 at 03:13:57PM +0200, Markus Armbruster wrote:
> >> >> Monitor behavior changes even when the client rejects capability "oob".
> >> >> 
> >> >> Traditionally, the monitor reads, executes and responds to one command
> >> >> after the other.  If the client sends in-band commands faster than the
> >> >> server can execute them, the kernel will eventually refuse to buffer
> >> >> more, and sending blocks or fails with EAGAIN.
> >> >> 
> >> >> To make OOB possible, we need to read and queue commands as we receive
> >> >> them.  If the client sends in-band commands faster than the server can
> >> >> execute them, the server will eventually drop commands to limit the
> >> >> queue length.  The sever sends event COMMAND_DROPPED then.
> >> >> 
> >> >> However, we get the new behavior even when the client rejects capability
> >> >> "oob".  We get the traditional behavior only when the server doesn't
> >> >> offer "oob".
> >> >> 
> >> >> Is this what we want?
> >> >
> >> > Not really.  But I thought we kept that old behavior, haven't we?
> >> >
> >> > In handle_qmp_command() we have this:
> >> >
> >> >     /*
> >> >      * If OOB is not enabled on the current monitor, we'll emulate the
> >> >      * old behavior that we won't process the current monitor any more
> >> >      * until it has responded.  This helps make sure that as long as
> >> >      * OOB is not enabled, the server will never drop any command.
> >> >      */
> >> >     if (!qmp_oob_enabled(mon)) {
> >> >         monitor_suspend(mon);
> >> >         req_obj->need_resume = true;
> >> >     } else {
> >> >         /* Drop the request if queue is full. */
> >> >         if (mon->qmp.qmp_requests->length >= QMP_REQ_QUEUE_LEN_MAX) {
> >> >             qemu_mutex_unlock(&mon->qmp.qmp_queue_lock);
> >> >             qapi_event_send_command_dropped(id,
> >> >                                             COMMAND_DROP_REASON_QUEUE_FULL,
> >> >                                             &error_abort);
> >> >             qmp_request_free(req_obj);
> >> >             return;
> >> >         }
> >> >     }
> >> >
> >> > Here assuming that we are not enabling OOB, then since we will suspend
> >> > the monitor when we receive one command, then monitor_can_read() later
> >> > will check with a result that "we should not read the chardev", then
> >> > it'll leave all the data in the input buffer.  Then AFAIU the QMP
> >> > client that didn't enable OOB will have similar behavior as before.
> >> >
> >> > Also, we will never receive a COMMAND_DROPPED event in that legacy
> >> > mode as well since it's in the "else" block.   Am I right?
> >> 
> >> Hmm.  Did I get confused?  Let me look again.
> >> 
> >> Some places check qmp_oob_enabled(), which is true when the server
> >> offered capability "oob" and the client accepted.  In terms of my reply
> >> to Daniel, it distinguishes the two run time cases "OOB off" and "OOB
> >> on".
> >
> > Exactly.
> >
> >> 
> >> Other places check ->use_io_thr, which is true when
> >> 
> >>     (flags & MONITOR_USE_CONTROL) && !CHARDEV_IS_MUX(chr)
> >> 
> >> in monitor_init().
> 
> ->use_io_thr is now spelled ->use_io_thread.
> 
> >> One of these places is get_qmp_greeting().  It makes the server offer
> >> "oob" when ->use_io_thr.  Makes sense.
> >> 
> >> In terms of my reply to Daniel, ->use_io_thr distinguishes between "OOB
> >> not offered" and ("OOB offered, but rejected by client" or "OOB offered
> >> and accepted").
> >
> > Exactly.
> >
> > To be more clear, let's name the three cases (as you mentioned):
> >
> > (A) OOB not offered
> > (B) OOB offered, but rejected by client
> > (C) OOB offered, and accepted
> >
> > Then:
> >
> > (1) qmp_oob_enabled() will only be return true if (C).
> > (2) ->use_io_thr will only be true if (B) or (C).
> 
> Notes on (A) to be kept in mind for the remainder of the discussion:
> 
> * I don't expect (A) to occur in production
> 
>   (A) means the monitor has a chardev-mux backend.  chardev-mux
>   multiplexes multiple character backends, e.g. multiplex a monitor and
>   a console on stdio.  C-a c switches focus.  While such multiplexing
>   may be convenient for humans, it's an unnecessary complication for
>   machines, which could just as well use two UNIX domain sockets and not
>   have to worry about focus and escape sequences.
> 
> * I do expect (A) to go away eventually
> 
>   (A) exists only because "not all the chardev frontends can run without
>   main thread, or can run in multiple threads" [PATCH 6].  Hopefully,
>   that'll be fixed eventually, so (A) can go away.
> 
>   Marc-André, your "[PATCH 04/12] Revert "qmp: isolate responses into io
>   thread" states the "chardev write path is thread safe".  What's
>   missing to make chardev-mux capable of supporting OOB?
> 
> >> Uses could to violate 'may use "server offered OOB" only for
> >> configuration purposes', so let's check them:
> >> 
> >> * monitor_json_emitter()
> 
> This is now qmp_queue_response()
> 
> >>   If mon->use_io_thr, push to response queue.  Else emit directly.
> >> 
> >>   I'm afraid run time behavior differs for "OOB not offered" (emit
> >>   directly) and "OOB offered by rejected" (queue).
> >
> > Yeah it's different, but logically it's the same.  IMHO it's only a
> > fast path for main thread.  In other word, I think it can still work
> > correctly if we remove that else clause.  In that sense, it's not
> > really a "true" behavior change.
> 
> Remember that (A) should not occur in production, and should go away
> eventually.  Maintaining a fast path just for that feels unjustified.
> 
> >> * qmp_caps_check()
> >> 
> >>   If !mon->use_io_thr, reject client's acceptance of "oob".  Implicitly
> >>   recomputing the set of offered capabilities here is less than elegant,
> >>   but it's not wrong.
> 
> This has been replaced by monitor_qmp_caps_reset().
> 
> If mon->use_io_thread, offer OOB.  Makes sense.
> 
> >> * monitor_resume()
> >> 
> >>   If mon->use_io_thr(), kick the monitor I/O thread.
> >> 
> >>   Again, different behavior for the two variations of "OOB off".
> >
> > This is another example like above - IMHO we can just kick it no
> > matter what, but we just don't need to do that for main thread (since
> > we are in main thread so we are sure it is not asleep).  It's just
> > another trivial enhancement on top of the main logic.
> 
> Again, maintaining a special case just for (A) feels unjustified.
> 
> >> * get_qmp_greeting()
> >> 
> >>   Covered above, looks good to me.
> 
> Also replaced by monitor_qmp_caps_reset().
> 
> >> * monitor_qmp_setup_handlers_bh()
> >> 
> >>   If mon->use_io_thr(), pass the monitor I/O thread's AIOContext to
> >>   qemu_chr_fe_set_handlers(), else pass NULL.
> >> 
> >>   Again, different behavior for the two variations of "OOB off".
> >
> > This is different indeed, but IMHO it's not really "behavior
> > difference", it's just the context (thread to run the code) that is
> > different.  The major code path for case (A) and (B) (which are the
> > two "off" cases) should be the same (when I say "major", I excluded
> > those trivial enhancements that I mentioned).
> 
> As far as I can tell, monitor_qmp_setup_handlers_bh() can run only if
> scheduled by monitor_init() when mon->use_io_thread.  If that's true,
> then the !mon->use_io_thread case is unreachable.  Ah, I see you've also
> noticed that, and you're proposing a patch below.

Yeah, and...

> 
> >> * monitor_init()
> >> 
> >>   If mon->use_io_thr, set up bottom half
> >>   monitor_qmp_setup_handlers_bh(), else call qemu_chr_fe_set_handlers()
> >>   directly.
> >
> > This is indeed a bit tricky (to avoid a potential race that Stefan has
> > pointed out), however after things are setup it'll be exactly the same
> > code path for both OFF cases.
> >
> > And actually when I read the code I noticed that we can actually
> > simplify the code with this:
> >
> > diff --git a/monitor.c b/monitor.c
> > index 0730a27172..5d6bff8d51 100644
> > --- a/monitor.c
> > +++ b/monitor.c
> > @@ -4635,18 +4635,14 @@ static void monitor_qmp_setup_handlers_bh(void *opaque)
> >      Monitor *mon = opaque;
> >      GMainContext *context;
> >
> > -    if (mon->use_io_thr) {
> > -        /*
> > -         * When use_io_thr is set, we use the global shared dedicated
> > -         * IO thread for this monitor to handle input/output.
> > -         */
> > -        context = monitor_get_io_context();
> > -        /* We should have inited globals before reaching here. */
> > -        assert(context);
> > -    } else {
> > -        /* The default main loop, which is the main thread */
> > -        context = NULL;
> > -    }
> > +    assert(mon->use_io_thr);
> > +    /*
> > +     * When use_io_thr is set, we use the global shared dedicated
> > +     * IO thread for this monitor to handle input/output.
> > +     */
> > +    context = monitor_get_io_context();
> > +    /* We should have inited globals before reaching here. */
> > +    assert(context);
> >
> >      qemu_chr_fe_set_handlers(&mon->chr, monitor_can_read, monitor_qmp_read,
> >                               monitor_qmp_event, NULL, mon, context, true);
> >
> > Since monitor_qmp_setup_handlers_bh() will only be used for IOThread
> > case.  I can post a patch for that.
> 
> Yes, please.

... actually I have had that patch in my "turn-oob-on-default" series,
and even with your Reviewed-by. :)

  https://patchwork.kernel.org/patch/10506173/

I suppose I'll just repost it and the series after the release.

> 
> >> 
> >>   Again, different behavior for the two variations of "OOB off".
> 
> The difference is
> 
> * timing: right away for (A) vs. bottom half for (B) and (C)
> 
> * context argument for qemu_chr_fe_set_handlers(): NULL, i.e. main loop
>   thread for (A), monitor_get_io_context(), i.e. the I/O thread for (B)
>   and (C)
> 
> Okay, I think.
> 
> >> Either I am confused now, or the two variations of "OOB off" execute
> >> different code at run time.  Which is it?
> >
> > As mentioned, they should be running the same code at run time.
> > Though with some trivial differences, but if you see above most of
> > them should only be small enhancements which can actually be removed.
> 
> Please do.
> 
> >> If it's different code, are the differences observable for the client?
> >
> > AFAIU since the code path should mostly be the same for the two OOF
> > cases, IMHO there should have no difference observable from the
> > client, otherwise it's buggy and I'd be willing to fix it.
> >
> >> 
> >> Observable or not, I suspect the differences should not exist.
> >
> > We can remove those "trivial enhancements" if we want to make sure the
> > code paths are exactly the same, but I'm not sure whether that's
> > really what we need (or we just need to make sure it's unobservable).
> 
> As long as we're running separate code for (A), "unobservable
> difference" is a proposition we need to prove.
> 
> Reducing separate code should simplify the proof.
> 
> Eliminiating it will simplify it maximally :)

Ok, this I can try to do too after the release.

> 
> > Thanks!
> 
> Thank *you* for persevering :)

I should thank you for your continuous support on out-of-band (or even
before it was proposed and named by you :).

(I hope I didn't miss anything that I should comment on; let me know
 if I have)

Regards,

-- 
Peter Xu

Re: [Qemu-devel] monitor: enable OOB by default

Posted by Markus Armbruster 7 years, 7 months ago

Another lose end: event COMMAND_DROPPED seems to lack test coverage.

Re: [Qemu-devel] monitor: enable OOB by default

Posted by Markus Armbruster 7 years, 7 months ago

Markus Armbruster <armbru@redhat.com> writes:

> Another lose end: event COMMAND_DROPPED seems to lack test coverage.

Hmm, dropping commands serves to limit the request queue.  What limits
the response queue?

Before OOB, the monitor read at most one command, and wrote its response
with monitor_puts().

For input, this leaves queueing to the kernel: if the client sends
commands faster than the server can execute them, eventually the kernel
refuses to buffer more, and the client's send either blocks or fails
with EAGAIN.

Output is a mess.  monitor_puts() uses an output buffer.  It tries to
flush at newline.  Issues:

* If flushing runs into a partial write, the unwritten remainder remains
  in the output buffer until the next newline.  That newline may take
  its own sweet time to arrive.  Could even lead to deadlocks, where a
  client awaits complete output before it sends more input.  Bug,
  predates OOB, doesn't block this series.

* If the client fails to read, the output buffer can grow without bound.
  Not a security issue; the client is trusted.  Just bad workmanship.

OOB doesn't change this for monitors running in the main thread.  Only
mux chardevs run there.

Aside: keeping special case code around just for mux is a really bad
idea.  We need to get rid of it.

For monitors running in an I/O thread, we add another buffer: the
response queue.  It's drained by monitor_qmp_bh_responder().  I guess
that means the response queue is effectively bounded by timely draining.
Correct?

Buffering twice seems silly, but that could be addressed in follow-up
patches.

Re: [Qemu-devel] monitor: enable OOB by default

Posted by Peter Xu 7 years, 7 months ago

On Wed, Jun 27, 2018 at 10:35:15AM +0200, Markus Armbruster wrote:
> Markus Armbruster <armbru@redhat.com> writes:
> 
> > Another lose end: event COMMAND_DROPPED seems to lack test coverage.
> 
> Hmm, dropping commands serves to limit the request queue.  What limits
> the response queue?

As long as we have a request queue limitation, that'll somehow be
"part of" the limitation of response queue.  Since the real responses
(let's not consider the events first) should be no more than the
maximum QMP requests that we allow in the request queue (one response
for one request).  In that sense it seems fine to me.

> 
> Before OOB, the monitor read at most one command, and wrote its response
> with monitor_puts().
> 
> For input, this leaves queueing to the kernel: if the client sends
> commands faster than the server can execute them, eventually the kernel
> refuses to buffer more, and the client's send either blocks or fails
> with EAGAIN.
> 
> Output is a mess.  monitor_puts() uses an output buffer.  It tries to
> flush at newline.  Issues:
> 
> * If flushing runs into a partial write, the unwritten remainder remains
>   in the output buffer until the next newline.  That newline may take
>   its own sweet time to arrive.  Could even lead to deadlocks, where a
>   client awaits complete output before it sends more input.  Bug,
>   predates OOB, doesn't block this series.

True.  Though I noticed that we have a "hackish" line in
monitor_json_emitter_raw():

    qstring_append_chr(json, '\n');

So it seems that at least we should never encounter a deadlock, after
all there will always be a newline there. But I'd say I agree with you
on that it's at least not that "beautiful". :-)

> 
> * If the client fails to read, the output buffer can grow without bound.
>   Not a security issue; the client is trusted.  Just bad workmanship.

True.

> 
> OOB doesn't change this for monitors running in the main thread.  Only
> mux chardevs run there.
> 
> Aside: keeping special case code around just for mux is a really bad
> idea.  We need to get rid of it.

We should be running the same code path even for MUX-ed typed, right?
Do you mean to put MUX-ed typed handling onto iothreads as well when
you say "get rid of it"?

> 
> For monitors running in an I/O thread, we add another buffer: the
> response queue.  It's drained by monitor_qmp_bh_responder().  I guess
> that means the response queue is effectively bounded by timely draining.
> Correct?

I don't see a timely manner to flush it, but as long as we queue
anything (including events) onto the response queue, we'll poke the
bottom half (in monitor_json_emitter() we call qemu_bh_schedule()) so
we'll possibly drain the queue very soon, and there should be no
chance to have a stale message in that queue.

> 
> Buffering twice seems silly, but that could be addressed in follow-up
> patches.

Do you mean that we can write the response immediately into
Monitor.outbuf, then only flush it in iothread?  IMHO that's fine -
after all, the response queue, as mentioned above, should have a
natural restriction as well due to the request queue, then we won't
waste too much resources for that.  Meanwhile using a queue with QMP
response objects seems to be a bit more cleaner to me from design pov
(though I might be wrong).

Regards,

-- 
Peter Xu

Re: [Qemu-devel] monitor: enable OOB by default

Posted by Markus Armbruster 7 years, 7 months ago

Peter Xu <peterx@redhat.com> writes:

> On Wed, Jun 27, 2018 at 10:35:15AM +0200, Markus Armbruster wrote:
>> Markus Armbruster <armbru@redhat.com> writes:
>> 
>> > Another lose end: event COMMAND_DROPPED seems to lack test coverage.
>> 
>> Hmm, dropping commands serves to limit the request queue.  What limits
>> the response queue?
>
> As long as we have a request queue limitation, that'll somehow be
> "part of" the limitation of response queue.  Since the real responses
> (let's not consider the events first) should be no more than the
> maximum QMP requests that we allow in the request queue (one response
> for one request).  In that sense it seems fine to me.

"Normal" flow of the request through the server (QEMU):

    receive
 -> handle_qmp_command()

 -> request queue: mon->qmp.qmp_requests

 -> monitor_qmp_bh_dispatcher()
 -> monitor_qmp_dispatch_one()
 -> monitor_qmp_respond()
 -> monitor_json_emitter()

 -> response queue: mon->qmp.qmp_requests

 -> monitor_qmp_response_pop_one()
 -> monitor_qmp_bh_responder()
 -> monitor_json_emitter_raw()
 -> monitor_puts()

 -> output buffer: mon->outbuf

 -> monitor_flush_locked()

The purpose of event COMMAND_DROPPED is flow control notification: when
the client sends more than we're willing to buffer, we drop the excess
and notify the client.

If the client sends too many requests too quickly, the request queue
fills up, and flow control kicks in.  Good.

As long as the client sends requests at a moderate pace, the request
queue never fills up: the dispatch & execute code sitting between
request queue and response queue drains it just fine.

The response queue proper also doesn't fill up: the emitter code sitting
between response queue and output buffer drains it just fine.

However, output buffer can still grow without bounds!
monitor_flush_locked() requires the client to keep up to make progress.
Our flow control fails then.

Extreme case: a (misbehaving!) client that keeps sending requests at a
moderate pace while not reading any responses.  output buffer grows
without bounds.

Less extreme case: a client sends a small number of requests quickly,
then reads responses very slowly or not at all for some reason, say
because the network goes bad right at this time.  Here, the size of the
request queue does limit the size of output buffer, as you proposed, but
the size multiplier isn't really known.

My point is: the idea "limiting the request queue also limits the
response queue + output buffer" isn't entirely wrong, but it's not
entirely right, either.

Can we improve flow control to cover the complete flow, not just the
flow into the request queue?

>> Before OOB, the monitor read at most one command, and wrote its response
>> with monitor_puts().
>> 
>> For input, this leaves queueing to the kernel: if the client sends
>> commands faster than the server can execute them, eventually the kernel
>> refuses to buffer more, and the client's send either blocks or fails
>> with EAGAIN.
>> 
>> Output is a mess.  monitor_puts() uses an output buffer.  It tries to
>> flush at newline.  Issues:
>> 
>> * If flushing runs into a partial write, the unwritten remainder remains
>>   in the output buffer until the next newline.  That newline may take
>>   its own sweet time to arrive.

Hmm, it's also flushed via mon->out_watch.  If that works how I guess it
does, there's no deadlock.

>>                                  Could even lead to deadlocks, where a
>>   client awaits complete output before it sends more input.  Bug,
>>   predates OOB, doesn't block this series.
>
> True.  Though I noticed that we have a "hackish" line in
> monitor_json_emitter_raw():
>
>     qstring_append_chr(json, '\n');
>
> So it seems that at least we should never encounter a deadlock, after
> all there will always be a newline there. But I'd say I agree with you
> on that it's at least not that "beautiful". :-)

The newline ensures responses arrive on their own line.  JSON doesn't
care about lines (makes sense), and qmp-spec.txt doesn't care, either
(that's wrong, in my opinion).  Anyone playing with QMP by hand will
definitely care.  Even QMP clients might.

The newline also triggers the actual write(), because monitor_puts() is
line-buffered.  That buffering makes sense for HMP, but it's useless for
QMP.  Let's not worry about that right now.

The flushing, however, is not guaranteed to write anything!  If
qemu_chr_fe_write() fails with EAGAIN, mon->outbuf remains unchanged.

>> * If the client fails to read, the output buffer can grow without bound.
>>   Not a security issue; the client is trusted.  Just bad workmanship.
>
> True.
>
>> 
>> OOB doesn't change this for monitors running in the main thread.  Only
>> mux chardevs run there.
>> 
>> Aside: keeping special case code around just for mux is a really bad
>> idea.  We need to get rid of it.
>
> We should be running the same code path even for MUX-ed typed, right?
> Do you mean to put MUX-ed typed handling onto iothreads as well when
> you say "get rid of it"?

I figure I'll cover this in my reply to Daniel.  If not, I'll reply to
this one again.

>> For monitors running in an I/O thread, we add another buffer: the
>> response queue.  It's drained by monitor_qmp_bh_responder().  I guess
>> that means the response queue is effectively bounded by timely draining.
>> Correct?
>
> I don't see a timely manner to flush it, but as long as we queue
> anything (including events) onto the response queue, we'll poke the
> bottom half (in monitor_json_emitter() we call qemu_bh_schedule()) so
> we'll possibly drain the queue very soon, and there should be no
> chance to have a stale message in that queue.

As long as bottom halves work, the response queue remains small.  That's
okay.

>> Buffering twice seems silly, but that could be addressed in follow-up
>> patches.
>
> Do you mean that we can write the response immediately into
> Monitor.outbuf, then only flush it in iothread?  IMHO that's fine -
> after all, the response queue, as mentioned above, should have a
> natural restriction as well due to the request queue, then we won't
> waste too much resources for that.  Meanwhile using a queue with QMP
> response objects seems to be a bit more cleaner to me from design pov
> (though I might be wrong).

Again, let's not worry about this right now.

Re: [Qemu-devel] monitor: enable OOB by default

Posted by Peter Xu 7 years, 7 months ago

On Thu, Jun 28, 2018 at 11:29:30AM +0200, Markus Armbruster wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Wed, Jun 27, 2018 at 10:35:15AM +0200, Markus Armbruster wrote:
> >> Markus Armbruster <armbru@redhat.com> writes:
> >> 
> >> > Another lose end: event COMMAND_DROPPED seems to lack test coverage.
> >> 
> >> Hmm, dropping commands serves to limit the request queue.  What limits
> >> the response queue?
> >
> > As long as we have a request queue limitation, that'll somehow be
> > "part of" the limitation of response queue.  Since the real responses
> > (let's not consider the events first) should be no more than the
> > maximum QMP requests that we allow in the request queue (one response
> > for one request).  In that sense it seems fine to me.
> 
> "Normal" flow of the request through the server (QEMU):
> 
>     receive
>  -> handle_qmp_command()
> 
>  -> request queue: mon->qmp.qmp_requests
> 
>  -> monitor_qmp_bh_dispatcher()
>  -> monitor_qmp_dispatch_one()
>  -> monitor_qmp_respond()
>  -> monitor_json_emitter()
> 
>  -> response queue: mon->qmp.qmp_requests
> 
>  -> monitor_qmp_response_pop_one()
>  -> monitor_qmp_bh_responder()
>  -> monitor_json_emitter_raw()
>  -> monitor_puts()
> 
>  -> output buffer: mon->outbuf
> 
>  -> monitor_flush_locked()
> 
> The purpose of event COMMAND_DROPPED is flow control notification: when
> the client sends more than we're willing to buffer, we drop the excess
> and notify the client.
> 
> If the client sends too many requests too quickly, the request queue
> fills up, and flow control kicks in.  Good.
> 
> As long as the client sends requests at a moderate pace, the request
> queue never fills up: the dispatch & execute code sitting between
> request queue and response queue drains it just fine.
> 
> The response queue proper also doesn't fill up: the emitter code sitting
> between response queue and output buffer drains it just fine.
> 
> However, output buffer can still grow without bounds!
> monitor_flush_locked() requires the client to keep up to make progress.
> Our flow control fails then.
> 
> Extreme case: a (misbehaving!) client that keeps sending requests at a
> moderate pace while not reading any responses.  output buffer grows
> without bounds.
> 
> Less extreme case: a client sends a small number of requests quickly,
> then reads responses very slowly or not at all for some reason, say
> because the network goes bad right at this time.  Here, the size of the
> request queue does limit the size of output buffer, as you proposed, but
> the size multiplier isn't really known.
> 
> My point is: the idea "limiting the request queue also limits the
> response queue + output buffer" isn't entirely wrong, but it's not
> entirely right, either.

Good point!  I obviously overlooked this.

> 
> Can we improve flow control to cover the complete flow, not just the
> flow into the request queue?

How about we simply apply the queue length to the response queue as
well?  Here's the steps:

(1) A new CommandDropReason called COMMAND_DROP_REASON_RESPONSE_FULL,
    showing that one command is dropped since response queue of the
    monitor is full.

(2) When handle QMP command, we not only check request queue, but also
    response queue.  If response queue length is bigger than
    QMP_REQ_QUEUE_LEN_MAX, then we drop the command with the reason
    COMMAND_DROP_REASON_RESPONSE_FULL.

We can do more fine-grained flow control in the future, though I hope
this would work for us as the first step.

> 
> >> Before OOB, the monitor read at most one command, and wrote its response
> >> with monitor_puts().
> >> 
> >> For input, this leaves queueing to the kernel: if the client sends
> >> commands faster than the server can execute them, eventually the kernel
> >> refuses to buffer more, and the client's send either blocks or fails
> >> with EAGAIN.
> >> 
> >> Output is a mess.  monitor_puts() uses an output buffer.  It tries to
> >> flush at newline.  Issues:
> >> 
> >> * If flushing runs into a partial write, the unwritten remainder remains
> >>   in the output buffer until the next newline.  That newline may take
> >>   its own sweet time to arrive.
> 
> Hmm, it's also flushed via mon->out_watch.  If that works how I guess it
> does, there's no deadlock.
> 
> >>                                  Could even lead to deadlocks, where a
> >>   client awaits complete output before it sends more input.  Bug,
> >>   predates OOB, doesn't block this series.
> >
> > True.  Though I noticed that we have a "hackish" line in
> > monitor_json_emitter_raw():
> >
> >     qstring_append_chr(json, '\n');
> >
> > So it seems that at least we should never encounter a deadlock, after
> > all there will always be a newline there. But I'd say I agree with you
> > on that it's at least not that "beautiful". :-)
> 
> The newline ensures responses arrive on their own line.  JSON doesn't
> care about lines (makes sense), and qmp-spec.txt doesn't care, either
> (that's wrong, in my opinion).  Anyone playing with QMP by hand will
> definitely care.  Even QMP clients might.
> 
> The newline also triggers the actual write(), because monitor_puts() is
> line-buffered.  That buffering makes sense for HMP, but it's useless for
> QMP.  Let's not worry about that right now.

Yeah.

> 
> The flushing, however, is not guaranteed to write anything!  If
> qemu_chr_fe_write() fails with EAGAIN, mon->outbuf remains unchanged.

Ouch!  I know that watch, but I didn't notice that it's actually not
following the general semantics of "flush".  Though it's for sure
understandable since we don't want to hang-death the main thread, but
the name "flush" might be misleading.

> 
> >> * If the client fails to read, the output buffer can grow without bound.
> >>   Not a security issue; the client is trusted.  Just bad workmanship.
> >
> > True.
> >
> >> 
> >> OOB doesn't change this for monitors running in the main thread.  Only
> >> mux chardevs run there.
> >> 
> >> Aside: keeping special case code around just for mux is a really bad
> >> idea.  We need to get rid of it.
> >
> > We should be running the same code path even for MUX-ed typed, right?
> > Do you mean to put MUX-ed typed handling onto iothreads as well when
> > you say "get rid of it"?
> 
> I figure I'll cover this in my reply to Daniel.  If not, I'll reply to
> this one again.
> 
> >> For monitors running in an I/O thread, we add another buffer: the
> >> response queue.  It's drained by monitor_qmp_bh_responder().  I guess
> >> that means the response queue is effectively bounded by timely draining.
> >> Correct?
> >
> > I don't see a timely manner to flush it, but as long as we queue
> > anything (including events) onto the response queue, we'll poke the
> > bottom half (in monitor_json_emitter() we call qemu_bh_schedule()) so
> > we'll possibly drain the queue very soon, and there should be no
> > chance to have a stale message in that queue.
> 
> As long as bottom halves work, the response queue remains small.  That's
> okay.
> 
> >> Buffering twice seems silly, but that could be addressed in follow-up
> >> patches.
> >
> > Do you mean that we can write the response immediately into
> > Monitor.outbuf, then only flush it in iothread?  IMHO that's fine -
> > after all, the response queue, as mentioned above, should have a
> > natural restriction as well due to the request queue, then we won't
> > waste too much resources for that.  Meanwhile using a queue with QMP
> > response objects seems to be a bit more cleaner to me from design pov
> > (though I might be wrong).
> 
> Again, let's not worry about this right now.

Thanks,

-- 
Peter Xu

Re: [Qemu-devel] monitor: enable OOB by default

Posted by Peter Xu 7 years, 7 months ago

On Wed, Jun 27, 2018 at 09:40:40AM +0200, Markus Armbruster wrote:
> Another lose end: event COMMAND_DROPPED seems to lack test coverage.

This one I can do.  Since I'll try to prepare some patches to fix the
event issue, I can try to add the test alongside.

Thanks for mentioning!

-- 
Peter Xu

Re: [Qemu-devel] your mail

Posted by Peter Xu 7 years, 7 months ago

On Tue, Jun 26, 2018 at 07:21:49PM +0200, Markus Armbruster wrote:
> I fooled around a bit, and I think there are a few lose ends.
> 
> Lets update the examples in docs/interop/qmp-spec.txt to show the
> current greeting (section 3.1) and how to accept a capability (section
> 3.2).  The capability negotiation documentation could use some polish.
> I'll post a patch.

Thanks for doing that!

> 
> Talking to a QMP monitor that supports OOB:
> 
>     $ socat UNIX:test-qmp READLINE,history=$HOME/.qmp_history,prompt='QMP> '
>     {"QMP": {"version": {"qemu": {"micro": 50, "minor": 12, "major": 2}, "package": "v2.12.0-1703-gb909799463"}, "capabilities": ["oob"]}}
>     QMP> { "execute": "qmp_capabilities", "arguments": { "oob": true } }
>     {"error": {"class": "GenericError", "desc": "Parameter 'oob' is unexpected"}}
>     QMP> { "execute": "qmp_capabilities", "arguments": { "enable": ["oob"] } }
>     {"return": {}}
>     QMP> { "execute": "query-qmp-schema" }
>     {"error": {"class": "GenericError", "desc": "Out-Of-Band capability requires that every command contains an 'id' field"}}
> 
> Why does every command require 'id'?

The COMMAND_DROPPED event is one reason, as you mentioned in the other
thread. Meanwhile as long as we have OOB command it means that we can
have more than one commands running in parallel, then it's a must that
we can mark that out in the response to mark out which command we are
replying to.

> 
> Talking to a QMP monitor that doesn't support OOB:
> 
>     {"QMP": {"version": {"qemu": {"micro": 50, "minor": 12, "major": 2}, "package": "v2.12.0-1703-gb909799463"}, "capabilities": []}}
>     QMP> { "execute": "qmp_capabilities", "arguments": { "enable": ["oob"] } }
>     {"error": {"class": "GenericError", "desc": "This monitor does not support Out-Of-Band (OOB)"}}
>     QMP> { "execute": "qmp_capabilities" }
>     {"return": {}}
>     QMP> { "execute": "query-kvm" }
>     {"return": {"enabled": true, "present": true}}
>     QMP> { "execute": "query-kvm", "control": { "run-oob": true } }
>     {"error": {"class": "GenericError", "desc": "Please enable Out-Of-Band first for the session during capabilities negotiation"}}
> 
> Telling people to enable OOB when that cannot be done is suboptimal.
> More so when it cannot be used here anyway.  I'll post a patch.

Thanks again!

-- 
Peter Xu

Re: [Qemu-devel] your mail

Posted by Markus Armbruster 7 years, 7 months ago

Peter Xu <peterx@redhat.com> writes:

> On Tue, Jun 26, 2018 at 07:21:49PM +0200, Markus Armbruster wrote:
>> I fooled around a bit, and I think there are a few lose ends.
[...]
>> Talking to a QMP monitor that supports OOB:
>> 
>>     $ socat UNIX:test-qmp READLINE,history=$HOME/.qmp_history,prompt='QMP> '
>>     {"QMP": {"version": {"qemu": {"micro": 50, "minor": 12, "major": 2}, "package": "v2.12.0-1703-gb909799463"}, "capabilities": ["oob"]}}
>>     QMP> { "execute": "qmp_capabilities", "arguments": { "oob": true } }
>>     {"error": {"class": "GenericError", "desc": "Parameter 'oob' is unexpected"}}
>>     QMP> { "execute": "qmp_capabilities", "arguments": { "enable": ["oob"] } }
>>     {"return": {}}
>>     QMP> { "execute": "query-qmp-schema" }
>>     {"error": {"class": "GenericError", "desc": "Out-Of-Band capability requires that every command contains an 'id' field"}}
>> 
>> Why does every command require 'id'?

Why am I asking?  Requiring 'id' is rather onerous when you play with
QMP by hand.

> The COMMAND_DROPPED event is one reason, as you mentioned in the other
> thread. Meanwhile as long as we have OOB command it means that we can
> have more than one commands running in parallel, then it's a must that
> we can mark that out in the response to mark out which command we are
> replying to.

Without OOB, the client can match response to command, because it'll
receive responses in command issue order.

Example 1: after sending

    { "execute": "cmd1", ... }
    { "execute": "cmd2", ... }
    { "execute": "cmd3", ... }

the client will receive three responses, first one is cmd1's, second one
is cmd2's, third one is cmd3's.

With OOB, we have to independent command streams, in-band and
out-of-band.  Each separate stream's responses arrive in-order, but the
two streams may be interleaved.

Example 2: after sending

    { "execute": "cmd1", ... }
    { "execute": "cmd2", ... }
    { "execute": "cmd3", ... }
    { "execute": "oob1", "control": { "run-oob": true }, ... }
    { "execute": "oob2", "control": { "run-oob": true }, ... }
    { "execute": "oob3", "control": { "run-oob": true }, ... }

the client will receive six responses: cmd1's before cmd2's before
cmd3's, and oob1's before oob2's before oob3's.

If the client sends "id" with each command, it can match responses to
commands by id.

But that's not the only way to match.  Imagine the client had a perfect
oracle to tell it whether a response is in-band or out-of-band.  Then
matching can work pretty much as in example 1: the first in-band
response is cmd1's, the second one is cmd2's, and the third one is
cmd3's; the first out-of-band response is oob1's, the second one is
oob2's and the third one is oob3's.

Possible solutions other than requiring "id":

1. Make out-of-band responses recognizable

   Just like we copy "id" to the response, we could copy "run-oob", or
   maybe "control".

   Solves the "match response to command" problem, doesn't solve the
   "match COMMAND_DROPPED event to command" problem.

   Permit me a digression.  "control": { "run-oob": true } is awfully
   verbose.  Back when we hashed out the OOB design, I proposed to use
   "exec-oob" instead of "execute" to run a command OOB.  I missed when
   that morphed into flag "run-oob" wrapped in object "control".  Was it
   in response to reviewer feedback?  Can you give me a pointer?

   The counterpart to "exec-oob" would be "return-oob" and "error-oob".
   Hmm.

2. Do nothing; it's the client's problem

   If the client needs to match responses and events to commands, it
   should use the feature that was expressly designed for helping with
   that: "id".

   Note that QMP's initial design assumed asynchronous commands,
   i.e. respones that may arrive in any order.  Nevertheless, it did not
   require "id".  Same reasoning: if the client needs to match, it can
   use "id".

As so often when "do nothing" is a viable solution, it looks mighty
attractive to me :)

[...]

Re: [Qemu-devel] your mail

Posted by Eric Blake 7 years, 7 months ago

On 06/28/2018 03:31 AM, Markus Armbruster wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
>> On Tue, Jun 26, 2018 at 07:21:49PM +0200, Markus Armbruster wrote:
>>> I fooled around a bit, and I think there are a few lose ends.
> [...]
>>> Talking to a QMP monitor that supports OOB:
>>>
>>>      $ socat UNIX:test-qmp READLINE,history=$HOME/.qmp_history,prompt='QMP> '
>>>      {"QMP": {"version": {"qemu": {"micro": 50, "minor": 12, "major": 2}, "package": "v2.12.0-1703-gb909799463"}, "capabilities": ["oob"]}}
>>>      QMP> { "execute": "qmp_capabilities", "arguments": { "oob": true } }
>>>      {"error": {"class": "GenericError", "desc": "Parameter 'oob' is unexpected"}}
>>>      QMP> { "execute": "qmp_capabilities", "arguments": { "enable": ["oob"] } }
>>>      {"return": {}}
>>>      QMP> { "execute": "query-qmp-schema" }
>>>      {"error": {"class": "GenericError", "desc": "Out-Of-Band capability requires that every command contains an 'id' field"}}
>>>
>>> Why does every command require 'id'?
> 
> Why am I asking?  Requiring 'id' is rather onerous when you play with
> QMP by hand.
> 

> Possible solutions other than requiring "id":
> 
> 1. Make out-of-band responses recognizable
> 
>     Just like we copy "id" to the response, we could copy "run-oob", or
>     maybe "control".
> 
>     Solves the "match response to command" problem, doesn't solve the
>     "match COMMAND_DROPPED event to command" problem.
> 
>     Permit me a digression.  "control": { "run-oob": true } is awfully
>     verbose.  Back when we hashed out the OOB design, I proposed to use
>     "exec-oob" instead of "execute" to run a command OOB.  I missed when
>     that morphed into flag "run-oob" wrapped in object "control".  Was it
>     in response to reviewer feedback?  Can you give me a pointer?

It's not too late to change back.  Since OOB is still new to 3.0, we 
could indeed go with a shorter invocation of 'exec-oob' (and I'd 
actually be in favor of such a change).

> 
>     The counterpart to "exec-oob" would be "return-oob" and "error-oob".
>     Hmm.

In other words, the change in the keyword of the response will let you 
know that it is replying to the most recent oob command.  This works 
well if we guarantee that we never have more than one unprocessed oob 
command in the pipeline at a time; but fails if oob commands can be 
threaded against one another and start replying in a different order 
than original submission.  Or, we could state that if you use 
'exec-oob', then 'id' is mandatory (and 'id' in the 'return' or 'error' 
is then sufficient to tie it back); but when using plain 'execute', 'id' 
remains optional.

> 
> 2. Do nothing; it's the client's problem
> 
>     If the client needs to match responses and events to commands, it
>     should use the feature that was expressly designed for helping with
>     that: "id".
> 
>     Note that QMP's initial design assumed asynchronous commands,
>     i.e. respones that may arrive in any order.  Nevertheless, it did not
>     require "id".  Same reasoning: if the client needs to match, it can
>     use "id".
> 
> As so often when "do nothing" is a viable solution, it looks mighty
> attractive to me :)

Indeed, although the documentation should still be explicit on 
recommending the use of 'id' when oob is enabled.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [Qemu-devel] your mail

Posted by Daniel P. Berrangé 7 years, 7 months ago

On Thu, Jun 28, 2018 at 10:31:42AM +0200, Markus Armbruster wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Tue, Jun 26, 2018 at 07:21:49PM +0200, Markus Armbruster wrote:
> >> I fooled around a bit, and I think there are a few lose ends.
> [...]
> >> Talking to a QMP monitor that supports OOB:
> >> 
> >>     $ socat UNIX:test-qmp READLINE,history=$HOME/.qmp_history,prompt='QMP> '
> >>     {"QMP": {"version": {"qemu": {"micro": 50, "minor": 12, "major": 2}, "package": "v2.12.0-1703-gb909799463"}, "capabilities": ["oob"]}}
> >>     QMP> { "execute": "qmp_capabilities", "arguments": { "oob": true } }
> >>     {"error": {"class": "GenericError", "desc": "Parameter 'oob' is unexpected"}}
> >>     QMP> { "execute": "qmp_capabilities", "arguments": { "enable": ["oob"] } }
> >>     {"return": {}}
> >>     QMP> { "execute": "query-qmp-schema" }
> >>     {"error": {"class": "GenericError", "desc": "Out-Of-Band capability requires that every command contains an 'id' field"}}
> >> 
> >> Why does every command require 'id'?
> 
> Why am I asking?  Requiring 'id' is rather onerous when you play with
> QMP by hand.
> 
> > The COMMAND_DROPPED event is one reason, as you mentioned in the other
> > thread. Meanwhile as long as we have OOB command it means that we can
> > have more than one commands running in parallel, then it's a must that
> > we can mark that out in the response to mark out which command we are
> > replying to.
> 
> Without OOB, the client can match response to command, because it'll
> receive responses in command issue order.
> 
> Example 1: after sending
> 
>     { "execute": "cmd1", ... }
>     { "execute": "cmd2", ... }
>     { "execute": "cmd3", ... }
> 
> the client will receive three responses, first one is cmd1's, second one
> is cmd2's, third one is cmd3's.
> 
> With OOB, we have to independent command streams, in-band and
> out-of-band.  Each separate stream's responses arrive in-order, but the
> two streams may be interleaved.
> 
> Example 2: after sending
> 
>     { "execute": "cmd1", ... }
>     { "execute": "cmd2", ... }
>     { "execute": "cmd3", ... }
>     { "execute": "oob1", "control": { "run-oob": true }, ... }
>     { "execute": "oob2", "control": { "run-oob": true }, ... }
>     { "execute": "oob3", "control": { "run-oob": true }, ... }
> 
> the client will receive six responses: cmd1's before cmd2's before
> cmd3's, and oob1's before oob2's before oob3's.

I thought the intention was that oob1, oob2, and oob3 are defined
to return in any order. It just happens that we only hve a single
oob processing stream right now, but we may have more in future.

> If the client sends "id" with each command, it can match responses to
> commands by id.
> 
> But that's not the only way to match.  Imagine the client had a perfect
> oracle to tell it whether a response is in-band or out-of-band.  Then
> matching can work pretty much as in example 1: the first in-band
> response is cmd1's, the second one is cmd2's, and the third one is
> cmd3's; the first out-of-band response is oob1's, the second one is
> oob2's and the third one is oob3's.

Not if oob1/2/3 can retunr in any order in future, which I think we
should be allowing fore.

IMHO 'id' should be mandatory for oob usage.

> Possible solutions other than requiring "id":
> 
> 1. Make out-of-band responses recognizable
> 
>    Just like we copy "id" to the response, we could copy "run-oob", or
>    maybe "control".
> 
>    Solves the "match response to command" problem, doesn't solve the
>    "match COMMAND_DROPPED event to command" problem.
> 
>    Permit me a digression.  "control": { "run-oob": true } is awfully
>    verbose.  Back when we hashed out the OOB design, I proposed to use
>    "exec-oob" instead of "execute" to run a command OOB.  I missed when
>    that morphed into flag "run-oob" wrapped in object "control".  Was it
>    in response to reviewer feedback?  Can you give me a pointer?
> 
>    The counterpart to "exec-oob" would be "return-oob" and "error-oob".
>    Hmm.

I do prefer the less verbose proposal too.

> 2. Do nothing; it's the client's problem
> 
>    If the client needs to match responses and events to commands, it
>    should use the feature that was expressly designed for helping with
>    that: "id".
> 
>    Note that QMP's initial design assumed asynchronous commands,
>    i.e. respones that may arrive in any order.  Nevertheless, it did not
>    require "id".  Same reasoning: if the client needs to match, it can
>    use "id".

IMHO we should just mandate 'id' when using "exec-oob", as it is
conceptually broken to use OOB without a way to match responses.
We don't want clients relying on all OOB replies coming in sequence
now, and then breaking when we allow multiple OOB processing threads.
That would require us to add a eec-oob-no-really-i-mean-it command
to allow overlapping OOB responses later.

> 
> As so often when "do nothing" is a viable solution, it looks mighty
> attractive to me :)
> 
> [...]
> 

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [Qemu-devel] your mail

Posted by Peter Xu 7 years, 7 months ago

On Thu, Jun 28, 2018 at 01:00:44PM +0100, Daniel P. Berrangé wrote:
> On Thu, Jun 28, 2018 at 10:31:42AM +0200, Markus Armbruster wrote:
> > Peter Xu <peterx@redhat.com> writes:
> > 
> > > On Tue, Jun 26, 2018 at 07:21:49PM +0200, Markus Armbruster wrote:
> > >> I fooled around a bit, and I think there are a few lose ends.
> > [...]
> > >> Talking to a QMP monitor that supports OOB:
> > >> 
> > >>     $ socat UNIX:test-qmp READLINE,history=$HOME/.qmp_history,prompt='QMP> '
> > >>     {"QMP": {"version": {"qemu": {"micro": 50, "minor": 12, "major": 2}, "package": "v2.12.0-1703-gb909799463"}, "capabilities": ["oob"]}}
> > >>     QMP> { "execute": "qmp_capabilities", "arguments": { "oob": true } }
> > >>     {"error": {"class": "GenericError", "desc": "Parameter 'oob' is unexpected"}}
> > >>     QMP> { "execute": "qmp_capabilities", "arguments": { "enable": ["oob"] } }
> > >>     {"return": {}}
> > >>     QMP> { "execute": "query-qmp-schema" }
> > >>     {"error": {"class": "GenericError", "desc": "Out-Of-Band capability requires that every command contains an 'id' field"}}
> > >> 
> > >> Why does every command require 'id'?
> > 
> > Why am I asking?  Requiring 'id' is rather onerous when you play with
> > QMP by hand.
> > 
> > > The COMMAND_DROPPED event is one reason, as you mentioned in the other
> > > thread. Meanwhile as long as we have OOB command it means that we can
> > > have more than one commands running in parallel, then it's a must that
> > > we can mark that out in the response to mark out which command we are
> > > replying to.
> > 
> > Without OOB, the client can match response to command, because it'll
> > receive responses in command issue order.
> > 
> > Example 1: after sending
> > 
> >     { "execute": "cmd1", ... }
> >     { "execute": "cmd2", ... }
> >     { "execute": "cmd3", ... }
> > 
> > the client will receive three responses, first one is cmd1's, second one
> > is cmd2's, third one is cmd3's.
> > 
> > With OOB, we have to independent command streams, in-band and
> > out-of-band.  Each separate stream's responses arrive in-order, but the
> > two streams may be interleaved.
> > 
> > Example 2: after sending
> > 
> >     { "execute": "cmd1", ... }
> >     { "execute": "cmd2", ... }
> >     { "execute": "cmd3", ... }
> >     { "execute": "oob1", "control": { "run-oob": true }, ... }
> >     { "execute": "oob2", "control": { "run-oob": true }, ... }
> >     { "execute": "oob3", "control": { "run-oob": true }, ... }
> > 
> > the client will receive six responses: cmd1's before cmd2's before
> > cmd3's, and oob1's before oob2's before oob3's.
> 
> I thought the intention was that oob1, oob2, and oob3 are defined
> to return in any order. It just happens that we only hve a single
> oob processing stream right now, but we may have more in future.

Exactly.

> 
> > If the client sends "id" with each command, it can match responses to
> > commands by id.
> > 
> > But that's not the only way to match.  Imagine the client had a perfect
> > oracle to tell it whether a response is in-band or out-of-band.  Then
> > matching can work pretty much as in example 1: the first in-band
> > response is cmd1's, the second one is cmd2's, and the third one is
> > cmd3's; the first out-of-band response is oob1's, the second one is
> > oob2's and the third one is oob3's.
> 
> Not if oob1/2/3 can retunr in any order in future, which I think we
> should be allowing fore.
> 
> IMHO 'id' should be mandatory for oob usage.
> 
> > Possible solutions other than requiring "id":
> > 
> > 1. Make out-of-band responses recognizable
> > 
> >    Just like we copy "id" to the response, we could copy "run-oob", or
> >    maybe "control".
> > 
> >    Solves the "match response to command" problem, doesn't solve the
> >    "match COMMAND_DROPPED event to command" problem.
> > 
> >    Permit me a digression.  "control": { "run-oob": true } is awfully
> >    verbose.  Back when we hashed out the OOB design, I proposed to use
> >    "exec-oob" instead of "execute" to run a command OOB.  I missed when
> >    that morphed into flag "run-oob" wrapped in object "control".  Was it
> >    in response to reviewer feedback?  Can you give me a pointer?

It's me who proposed this, not from any reviewer's feedback.  It just
happened that no one was against it.

> > 
> >    The counterpart to "exec-oob" would be "return-oob" and "error-oob".
> >    Hmm.
> 
> I do prefer the less verbose proposal too.

But frankly speaking I still prefer current way.  QMP is majorly for
clients (computer programs) rather than users to use it.  Comparing to
verbosity, I would care more about coherency for how we define the
interface.  Say, OOB execution is still one way to execute a command,
so naturally it should still be using the same way that we execute a
QMP command, we just need some extra fields to tell the server that
"this message is more urgent than the other ones".

If we are discussing HMP interfaces, I'll have the same preference
with both of you to consider more about whether it's user-friendly or
not.  But now we are talking about QMP, then I'll prefer "control".

In the future, it's also possible to add some more sub-fields into the
"control" field.  When that happens, do we want to introduce another
standalone wording for that?  I would assume the answer is no.

> 
> > 2. Do nothing; it's the client's problem
> > 
> >    If the client needs to match responses and events to commands, it
> >    should use the feature that was expressly designed for helping with
> >    that: "id".
> > 
> >    Note that QMP's initial design assumed asynchronous commands,
> >    i.e. respones that may arrive in any order.  Nevertheless, it did not
> >    require "id".  Same reasoning: if the client needs to match, it can
> >    use "id".
> 
> IMHO we should just mandate 'id' when using "exec-oob", as it is
> conceptually broken to use OOB without a way to match responses.
> We don't want clients relying on all OOB replies coming in sequence
> now, and then breaking when we allow multiple OOB processing threads.
> That would require us to add a eec-oob-no-really-i-mean-it command
> to allow overlapping OOB responses later.

Agreed, again.

Thanks,

-- 
Peter Xu

Re: [Qemu-devel] your mail

Posted by Eric Blake 7 years, 7 months ago

On 06/29/2018 04:57 AM, Peter Xu wrote:

>>>     Permit me a digression.  "control": { "run-oob": true } is awfully
>>>     verbose.  Back when we hashed out the OOB design, I proposed to use
>>>     "exec-oob" instead of "execute" to run a command OOB.  I missed when
>>>     that morphed into flag "run-oob" wrapped in object "control".  Was it
>>>     in response to reviewer feedback?  Can you give me a pointer?
> 
> It's me who proposed this, not from any reviewer's feedback.  It just
> happened that no one was against it.

Only because I didn't notice the difference, and was helping clear the 
QAPI backlog before the release.  You've now had both the maintainer and 
the backup express a desire for the shorter form.

> 
>>>
>>>     The counterpart to "exec-oob" would be "return-oob" and "error-oob".
>>>     Hmm.
>>
>> I do prefer the less verbose proposal too.
> 
> But frankly speaking I still prefer current way.  QMP is majorly for
> clients (computer programs) rather than users to use it.  Comparing to
> verbosity, I would care more about coherency for how we define the
> interface.  Say, OOB execution is still one way to execute a command,
> so naturally it should still be using the same way that we execute a
> QMP command, we just need some extra fields to tell the server that
> "this message is more urgent than the other ones".

Code-wise, it's actually simpler for libvirt to write:

if (oob) {
     virJSONValueObjectCreate(&cmd, "s:exec-oob", cmdname, ...);
} else {
     virJSONValueObjectCreate(&cmd, "s:execute", cmdname, ...);
}

that it is to write:

virJSONValueObjectCreate(&cmd, "s:execute", cmdname, ...);
if (oob) {
     virJSONValuePtr control;
     virJSONValueObjectCreate(&control, "b:run-oob", true, NULL);
     virJSONValueObjectAppend(&cmd, "control", control);
}

(plus error-checking that I omitted).

In short, adding extra fields is MORE work than just using the command 
name AS the additional bit of information.

> 
> If we are discussing HMP interfaces, I'll have the same preference
> with both of you to consider more about whether it's user-friendly or
> not.  But now we are talking about QMP, then I'll prefer "control".

If you don't want to write the patch, then probably Markus or I will.

> 
> In the future, it's also possible to add some more sub-fields into the
> "control" field.  When that happens, do we want to introduce another
> standalone wording for that?  I would assume the answer is no.

We may add a "control" field at that time, and may even offer 
convenience syntax to allow "exec-oob" to behave as if a "control" field 
were passed.  But the addition of new control features is rare, so I'd 
rather deal with that when we have something to actually add, rather 
than making us suffer with unneeded verbosity for potentially years 
waiting for that next control field to materialize.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [Qemu-devel] monitor: enable OOB by default

Posted by Markus Armbruster 7 years, 7 months ago

More lose ends:

* scripts/qmp/ doesn't support OOB, yet.  qmp-shell.py in particular

* test-qmp-cmds neglects to cover the OOB additions to qmp-dispatch.c

Re: [Qemu-devel] monitor: enable OOB by default

Posted by Peter Xu 7 years, 7 months ago

On Mon, Jul 02, 2018 at 07:43:06AM +0200, Markus Armbruster wrote:
> More lose ends:
> 
> * scripts/qmp/ doesn't support OOB, yet.  qmp-shell.py in particular
> 
> * test-qmp-cmds neglects to cover the OOB additions to qmp-dispatch.c

Would you mind I put these aside for now?

I'm afraid things grow then we lose control, so not sure whether I can
just focus on the bugs first (e.g., the COMMAND_DROP event
broadcasting, and also the response queue flow control issue).

Thanks,

-- 
Peter Xu

Re: [Qemu-devel] monitor: enable OOB by default

Posted by Markus Armbruster 7 years, 7 months ago

Peter Xu <peterx@redhat.com> writes:

> On Mon, Jul 02, 2018 at 07:43:06AM +0200, Markus Armbruster wrote:
>> More lose ends:
>> 
>> * scripts/qmp/ doesn't support OOB, yet.  qmp-shell.py in particular
>> 
>> * test-qmp-cmds neglects to cover the OOB additions to qmp-dispatch.c
>
> Would you mind I put these aside for now?
>
> I'm afraid things grow then we lose control, so not sure whether I can
> just focus on the bugs first (e.g., the COMMAND_DROP event
> broadcasting, and also the response queue flow control issue).

I agree we need to focus and prioritize.

Any lose ends we can't fix right away we should document with suitable
TODO or FIXME comments.

We can also try to spread the load onto more shoulders.  Perhaps
qmp-shell could be updated by someone who actually uses it.