[Qemu-devel] [PATCH 56/56] docs/interop/qmp-spec: How to force known good parser state

Markus Armbruster posted 56 patches 7 years, 2 months ago
There is a newer version of this series
[Qemu-devel] [PATCH 56/56] docs/interop/qmp-spec: How to force known good parser state
Posted by Markus Armbruster 7 years, 2 months ago
Section "QGA Synchronization" specifies that sending "a raw 0xFF
sentinel byte" makes the server "reset its state and discard all
pending data prior to the sentinel."  What actually happens there is a
lexical error, which will produce one ore more error responses.
Moreover, it's not specific to QGA.

Create new section "Forcing the JSON parser into known-good state" to
document the technique properly.  Rewrite section "QGA
Synchronization" to document just the other direction, i.e. command
guest-sync-delimited.

Section "Protocol Specification" mentions "synchronization bytes
(documented below)".  Delete that.

While there, fix it not to claim '"Server" is QEMU itself', but
'"Server" is either QEMU or the QEMU Guest Agent'.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
---
 docs/interop/qmp-spec.txt | 37 ++++++++++++++++++++++++-------------
 1 file changed, 24 insertions(+), 13 deletions(-)

diff --git a/docs/interop/qmp-spec.txt b/docs/interop/qmp-spec.txt
index 1566b8ae5e..d4a42fe2cc 100644
--- a/docs/interop/qmp-spec.txt
+++ b/docs/interop/qmp-spec.txt
@@ -20,9 +20,9 @@ operating system.
 2. Protocol Specification
 =========================
 
-This section details the protocol format. For the purpose of this document
-"Client" is any application which is using QMP to communicate with QEMU and
-"Server" is QEMU itself.
+This section details the protocol format. For the purpose of this
+document, "Server" is either QEMU or the QEMU Guest Agent, and
+"Client" is any application communicating with it via QMP.
 
 JSON data structures, when mentioned in this document, are always in the
 following format:
@@ -34,9 +34,8 @@ by the JSON standard:
 
 http://www.ietf.org/rfc/rfc7159.txt
 
-The protocol is always encoded in UTF-8 except for synchronization
-bytes (documented below); although thanks to json-string escape
-sequences, the server will reply using only the strict ASCII subset.
+The sever expects its input to be encoded in UTF-8, and sends its
+output encoded in ASCII.
 
 For convenience, json-object members mentioned in this document will
 be in a certain order. However, in real protocol usage they can be in
@@ -215,16 +214,28 @@ Some events are rate-limited to at most one per second.  If additional
 dropped, and the last one is delayed.  "Similar" normally means same
 event type.  See qmp-events.txt for details.
 
-2.6 QGA Synchronization
+2.6 Forcing the JSON parser into known-good state
+-------------------------------------------------
+
+Incomplete or invalid input can leave the server's JSON parser in a
+state where it can't parse additional commands.  To get it back into
+known-good state, the client should provoke a lexical error.
+
+The cleanest way to do that is sending an ASCII control character
+other than '\t' (horizontal tab), '\r' (carriage return), and '\n'
+(new line).
+
+Sadly, older versions of QEMU can fail to flag this as an error.  If a
+client needs to deal with them, it should send a 0xFF byte.
+
+2.7 QGA Synchronization
 -----------------------
 
 When using QGA, an additional synchronization feature is built into
-the protocol.  If the Client sends a raw 0xFF sentinel byte (not valid
-JSON), then the Server will reset its state and discard all pending
-data prior to the sentinel.  Conversely, if the Client makes use of
-the 'guest-sync-delimited' command, the Server will send a raw 0xFF
-sentinel byte prior to its response, to aid the Client in discarding
-any data prior to the sentinel.
+the protocol. If the Client makes use of the 'guest-sync-delimited'
+command, the Server will send a raw 0xFF sentinel byte prior to its
+response, to aid the Client in discarding any data prior to the
+sentinel.
 
 
 3. QMP Examples
-- 
2.17.1


Re: [Qemu-devel] [PATCH 56/56] docs/interop/qmp-spec: How to force known good parser state
Posted by Eric Blake 7 years, 2 months ago
On 08/08/2018 07:03 AM, Markus Armbruster wrote:
> Section "QGA Synchronization" specifies that sending "a raw 0xFF
> sentinel byte" makes the server "reset its state and discard all
> pending data prior to the sentinel."  What actually happens there is a
> lexical error, which will produce one ore more error responses.
> Moreover, it's not specific to QGA.

Hoisting my review of this, as you may want to move it sooner in the series.

> 
> Create new section "Forcing the JSON parser into known-good state" to
> document the technique properly.  Rewrite section "QGA
> Synchronization" to document just the other direction, i.e. command
> guest-sync-delimited.
> 
> Section "Protocol Specification" mentions "synchronization bytes
> (documented below)".  Delete that.
> 
> While there, fix it not to claim '"Server" is QEMU itself', but
> '"Server" is either QEMU or the QEMU Guest Agent'.
> 
> Signed-off-by: Markus Armbruster <armbru@redhat.com>
> ---
>   docs/interop/qmp-spec.txt | 37 ++++++++++++++++++++++++-------------
>   1 file changed, 24 insertions(+), 13 deletions(-)
> 
> diff --git a/docs/interop/qmp-spec.txt b/docs/interop/qmp-spec.txt
> index 1566b8ae5e..d4a42fe2cc 100644
> --- a/docs/interop/qmp-spec.txt
> +++ b/docs/interop/qmp-spec.txt
> @@ -20,9 +20,9 @@ operating system.
>   2. Protocol Specification
>   =========================
>   
> -This section details the protocol format. For the purpose of this document
> -"Client" is any application which is using QMP to communicate with QEMU and
> -"Server" is QEMU itself.
> +This section details the protocol format. For the purpose of this
> +document, "Server" is either QEMU or the QEMU Guest Agent, and
> +"Client" is any application communicating with it via QMP.
>   

Broadens the term "QMP" to mean any client speaking to a qemu 
machine-readable server (previously, we tended to treat "QMP" as the 
direct-to-qemu service, and "QGA" as the guest agent service). I can 
live with that, especially since this document was already mentioning QGA.

>   JSON data structures, when mentioned in this document, are always in the
>   following format:
> @@ -34,9 +34,8 @@ by the JSON standard:
>   
>   http://www.ietf.org/rfc/rfc7159.txt
>   
> -The protocol is always encoded in UTF-8 except for synchronization
> -bytes (documented below); although thanks to json-string escape
> -sequences, the server will reply using only the strict ASCII subset.
> +The sever expects its input to be encoded in UTF-8, and sends its
> +output encoded in ASCII.
>   

Perhaps worth documenting is the range of JSON numbers produced by qemu 
(maybe as a separate patch). Libvirt just hit a bug with the jansson 
library making it extremely difficult to parse JSON containing numbers 
larger than INT64_MAX, when compared to yajl which had a way to support 
up to UINT64_MAX.

https://bugzilla.redhat.com/show_bug.cgi?id=1614569

Knowing that qemu sends numbers larger than INT64_MAX with the intent 
that they not be truncated/rounded by conversion to double can be a 
vital piece of information for implementing a client, when it comes to 
picking a particular library for JSON parsing.

>   For convenience, json-object members mentioned in this document will
>   be in a certain order. However, in real protocol usage they can be in
> @@ -215,16 +214,28 @@ Some events are rate-limited to at most one per second.  If additional
>   dropped, and the last one is delayed.  "Similar" normally means same
>   event type.  See qmp-events.txt for details.
>   
> -2.6 QGA Synchronization
> +2.6 Forcing the JSON parser into known-good state
> +-------------------------------------------------
> +
> +Incomplete or invalid input can leave the server's JSON parser in a
> +state where it can't parse additional commands.  To get it back into
> +known-good state, the client should provoke a lexical error.
> +
> +The cleanest way to do that is sending an ASCII control character
> +other than '\t' (horizontal tab), '\r' (carriage return), and '\n'

s/and/or/

> +(new line).
> +
> +Sadly, older versions of QEMU can fail to flag this as an error.  If a
> +client needs to deal with them, it should send a 0xFF byte.

Here's where we have the choice of whether to intentionally document 
0xff as an intentional parser reset, instead of a lexical error. If so, 
the advice to provoke a lexical error via an ASCII control (of which I 
would be most likely to use 0x00 NUL or 0x1b ESC) vs. an intentional use 
of 0xff may need different wording here.

But if you don't want to give 0xff any more special treatment than what 
it already has as a lexical error (and that ALL lexical errors result in 
a stream reset, but possibly after emitting error messages), then this 
wording seems okay.

> +
> +2.7 QGA Synchronization
>   -----------------------
>   
>   When using QGA, an additional synchronization feature is built into
> -the protocol.  If the Client sends a raw 0xFF sentinel byte (not valid
> -JSON), then the Server will reset its state and discard all pending
> -data prior to the sentinel.  Conversely, if the Client makes use of
> -the 'guest-sync-delimited' command, the Server will send a raw 0xFF
> -sentinel byte prior to its response, to aid the Client in discarding
> -any data prior to the sentinel.
> +the protocol. If the Client makes use of the 'guest-sync-delimited'
> +command, the Server will send a raw 0xFF sentinel byte prior to its
> +response, to aid the Client in discarding any data prior to the
> +sentinel.

Maybe worth mentioning "including error messages reported about any 
lexical errors received prior to the guest-sync-delimited command"

>   
>   
>   3. QMP Examples
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [Qemu-devel] [PATCH 56/56] docs/interop/qmp-spec: How to force known good parser state
Posted by Markus Armbruster 7 years, 2 months ago
Eric Blake <eblake@redhat.com> writes:

> On 08/08/2018 07:03 AM, Markus Armbruster wrote:
>> Section "QGA Synchronization" specifies that sending "a raw 0xFF
>> sentinel byte" makes the server "reset its state and discard all
>> pending data prior to the sentinel."  What actually happens there is a
>> lexical error, which will produce one ore more error responses.
>> Moreover, it's not specific to QGA.
>
> Hoisting my review of this, as you may want to move it sooner in the series.
>
>>
>> Create new section "Forcing the JSON parser into known-good state" to
>> document the technique properly.  Rewrite section "QGA
>> Synchronization" to document just the other direction, i.e. command
>> guest-sync-delimited.
>>
>> Section "Protocol Specification" mentions "synchronization bytes
>> (documented below)".  Delete that.
>>
>> While there, fix it not to claim '"Server" is QEMU itself', but
>> '"Server" is either QEMU or the QEMU Guest Agent'.
>>
>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
>> ---
>>   docs/interop/qmp-spec.txt | 37 ++++++++++++++++++++++++-------------
>>   1 file changed, 24 insertions(+), 13 deletions(-)
>>
>> diff --git a/docs/interop/qmp-spec.txt b/docs/interop/qmp-spec.txt
>> index 1566b8ae5e..d4a42fe2cc 100644
>> --- a/docs/interop/qmp-spec.txt
>> +++ b/docs/interop/qmp-spec.txt
>> @@ -20,9 +20,9 @@ operating system.
>>   2. Protocol Specification
>>   =========================
>>   -This section details the protocol format. For the purpose of this
>> document
>> -"Client" is any application which is using QMP to communicate with QEMU and
>> -"Server" is QEMU itself.
>> +This section details the protocol format. For the purpose of this
>> +document, "Server" is either QEMU or the QEMU Guest Agent, and
>> +"Client" is any application communicating with it via QMP.
>>   
>
> Broadens the term "QMP" to mean any client speaking to a qemu
> machine-readable server (previously, we tended to treat "QMP" as the
> direct-to-qemu service, and "QGA" as the guest agent service). I can
> live with that, especially since this document was already mentioning
> QGA.

And by that it already had QMP denote two disctinct things: the protocol
and one of its two applications.  I'm not really making this worse.  I'm
not really improving it, either.

>>   JSON data structures, when mentioned in this document, are always in the
>>   following format:
>> @@ -34,9 +34,8 @@ by the JSON standard:
>>     http://www.ietf.org/rfc/rfc7159.txt
>>   -The protocol is always encoded in UTF-8 except for
>> synchronization
>> -bytes (documented below); although thanks to json-string escape
>> -sequences, the server will reply using only the strict ASCII subset.
>> +The sever expects its input to be encoded in UTF-8, and sends its
>> +output encoded in ASCII.
>>   
>
> Perhaps worth documenting is the range of JSON numbers produced by
> qemu (maybe as a separate patch). Libvirt just hit a bug with the
> jansson library making it extremely difficult to parse JSON containing
> numbers larger than INT64_MAX, when compared to yajl which had a way
> to support up to UINT64_MAX.
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1614569
>
> Knowing that qemu sends numbers larger than INT64_MAX with the intent
> that they not be truncated/rounded by conversion to double can be a
> vital piece of information for implementing a client, when it comes to
> picking a particular library for JSON parsing.

Good point.  Doesn't really fit into this commit, though.  Care to
propose a patch?

>>   For convenience, json-object members mentioned in this document will
>>   be in a certain order. However, in real protocol usage they can be in
>> @@ -215,16 +214,28 @@ Some events are rate-limited to at most one per second.  If additional
>>   dropped, and the last one is delayed.  "Similar" normally means same
>>   event type.  See qmp-events.txt for details.
>>   -2.6 QGA Synchronization
>> +2.6 Forcing the JSON parser into known-good state
>> +-------------------------------------------------
>> +
>> +Incomplete or invalid input can leave the server's JSON parser in a
>> +state where it can't parse additional commands.  To get it back into
>> +known-good state, the client should provoke a lexical error.
>> +
>> +The cleanest way to do that is sending an ASCII control character
>> +other than '\t' (horizontal tab), '\r' (carriage return), and '\n'
>
> s/and/or/

Done.

>> +(new line).
>> +
>> +Sadly, older versions of QEMU can fail to flag this as an error.  If a
>> +client needs to deal with them, it should send a 0xFF byte.
>
> Here's where we have the choice of whether to intentionally document
> 0xff as an intentional parser reset, instead of a lexical error. If
> so, the advice to provoke a lexical error via an ASCII control (of
> which I would be most likely to use 0x00 NUL or 0x1b ESC) vs. an
> intentional use of 0xff may need different wording here.
>
> But if you don't want to give 0xff any more special treatment than
> what it already has as a lexical error (and that ALL lexical errors
> result in a stream reset, but possibly after emitting error messages),
> then this wording seems okay.
>
>> +
>> +2.7 QGA Synchronization
>>   -----------------------
>>     When using QGA, an additional synchronization feature is built
>> into
>> -the protocol.  If the Client sends a raw 0xFF sentinel byte (not valid
>> -JSON), then the Server will reset its state and discard all pending
>> -data prior to the sentinel.  Conversely, if the Client makes use of
>> -the 'guest-sync-delimited' command, the Server will send a raw 0xFF
>> -sentinel byte prior to its response, to aid the Client in discarding
>> -any data prior to the sentinel.
>> +the protocol. If the Client makes use of the 'guest-sync-delimited'
>> +command, the Server will send a raw 0xFF sentinel byte prior to its
>> +response, to aid the Client in discarding any data prior to the
>> +sentinel.
>
> Maybe worth mentioning "including error messages reported about any
> lexical errors received prior to the guest-sync-delimited command"
>
>>       3. QMP Examples
>>

Thanks!

Re: [Qemu-devel] [PATCH 56/56] docs/interop/qmp-spec: How to force known good parser state
Posted by Eric Blake 7 years, 2 months ago
On 08/17/2018 03:37 AM, Markus Armbruster wrote:

>> Perhaps worth documenting is the range of JSON numbers produced by
>> qemu (maybe as a separate patch). Libvirt just hit a bug with the
>> jansson library making it extremely difficult to parse JSON containing
>> numbers larger than INT64_MAX, when compared to yajl which had a way
>> to support up to UINT64_MAX.
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=1614569
>>
>> Knowing that qemu sends numbers larger than INT64_MAX with the intent
>> that they not be truncated/rounded by conversion to double can be a
>> vital piece of information for implementing a client, when it comes to
>> picking a particular library for JSON parsing.
> 
> Good point.  Doesn't really fit into this commit, though.  Care to
> propose a patch?

Will do, but I'll probably wait for your v2 series to land first.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [Qemu-devel] [PATCH 56/56] docs/interop/qmp-spec: How to force known good parser state
Posted by Markus Armbruster 7 years, 2 months ago
Eric Blake <eblake@redhat.com> writes:

> On 08/08/2018 07:03 AM, Markus Armbruster wrote:
>> Section "QGA Synchronization" specifies that sending "a raw 0xFF
>> sentinel byte" makes the server "reset its state and discard all
>> pending data prior to the sentinel."  What actually happens there is a
>> lexical error, which will produce one ore more error responses.
>> Moreover, it's not specific to QGA.
>
> Hoisting my review of this, as you may want to move it sooner in the series.
>
>>
>> Create new section "Forcing the JSON parser into known-good state" to
>> document the technique properly.  Rewrite section "QGA
>> Synchronization" to document just the other direction, i.e. command
>> guest-sync-delimited.
>>
>> Section "Protocol Specification" mentions "synchronization bytes
>> (documented below)".  Delete that.
>>
>> While there, fix it not to claim '"Server" is QEMU itself', but
>> '"Server" is either QEMU or the QEMU Guest Agent'.
>>
>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
>> ---
>>   docs/interop/qmp-spec.txt | 37 ++++++++++++++++++++++++-------------
>>   1 file changed, 24 insertions(+), 13 deletions(-)
>>
>> diff --git a/docs/interop/qmp-spec.txt b/docs/interop/qmp-spec.txt
>> index 1566b8ae5e..d4a42fe2cc 100644
>> --- a/docs/interop/qmp-spec.txt
>> +++ b/docs/interop/qmp-spec.txt
>> @@ -20,9 +20,9 @@ operating system.
>>   2. Protocol Specification
>>   =========================
>>   -This section details the protocol format. For the purpose of this
>> document
>> -"Client" is any application which is using QMP to communicate with QEMU and
>> -"Server" is QEMU itself.
>> +This section details the protocol format. For the purpose of this
>> +document, "Server" is either QEMU or the QEMU Guest Agent, and
>> +"Client" is any application communicating with it via QMP.
>>   
[...]
>>   JSON data structures, when mentioned in this document, are always in the
>>   following format:
>> @@ -34,9 +34,8 @@ by the JSON standard:
>>     http://www.ietf.org/rfc/rfc7159.txt
>>   -The protocol is always encoded in UTF-8 except for
>> synchronization
>> -bytes (documented below); although thanks to json-string escape
>> -sequences, the server will reply using only the strict ASCII subset.
>> +The sever expects its input to be encoded in UTF-8, and sends its
>> +output encoded in ASCII.
>>   
[...]
>>   For convenience, json-object members mentioned in this document will
>>   be in a certain order. However, in real protocol usage they can be in
>> @@ -215,16 +214,28 @@ Some events are rate-limited to at most one per second.  If additional
>>   dropped, and the last one is delayed.  "Similar" normally means same
>>   event type.  See qmp-events.txt for details.
>>   -2.6 QGA Synchronization
>> +2.6 Forcing the JSON parser into known-good state
>> +-------------------------------------------------
>> +
>> +Incomplete or invalid input can leave the server's JSON parser in a
>> +state where it can't parse additional commands.  To get it back into
>> +known-good state, the client should provoke a lexical error.
>> +
>> +The cleanest way to do that is sending an ASCII control character
>> +other than '\t' (horizontal tab), '\r' (carriage return), and '\n'
>
> s/and/or/
>
>> +(new line).
>> +
>> +Sadly, older versions of QEMU can fail to flag this as an error.  If a
>> +client needs to deal with them, it should send a 0xFF byte.
[...]
>> +
>> +2.7 QGA Synchronization
>>   -----------------------
>>     When using QGA, an additional synchronization feature is built
>> into
>> -the protocol.  If the Client sends a raw 0xFF sentinel byte (not valid
>> -JSON), then the Server will reset its state and discard all pending
>> -data prior to the sentinel.  Conversely, if the Client makes use of
>> -the 'guest-sync-delimited' command, the Server will send a raw 0xFF
>> -sentinel byte prior to its response, to aid the Client in discarding
>> -any data prior to the sentinel.
>> +the protocol. If the Client makes use of the 'guest-sync-delimited'
>> +command, the Server will send a raw 0xFF sentinel byte prior to its
>> +response, to aid the Client in discarding any data prior to the
>> +sentinel.
>
> Maybe worth mentioning "including error messages reported about any
> lexical errors received prior to the guest-sync-delimited command"
>
>>       3. QMP Examples
>>

What about:

    2.7 QGA Synchronization
    -----------------------

    When a client connects to QGA over a transport lacking proper
    connection semantics such as virtio-serial, QGA may have read partial
    input from a previous client.  The client needs to force QGA's parser
    into known-good state using the previous section's technique.
    Moreover, the client may receive output a previous client didn't read.
    To help with skipping that output, QGA provides the
    'guest-sync-delimited' command.  Refer to its documentation for
    details.

Re: [Qemu-devel] [PATCH 56/56] docs/interop/qmp-spec: How to force known good parser state
Posted by Eric Blake 7 years, 2 months ago
On 08/17/2018 06:16 AM, Markus Armbruster wrote:

>>> +2.7 QGA Synchronization
>>>    -----------------------
>>>      When using QGA, an additional synchronization feature is built
>>> into
>>> -the protocol.  If the Client sends a raw 0xFF sentinel byte (not valid
>>> -JSON), then the Server will reset its state and discard all pending
>>> -data prior to the sentinel.  Conversely, if the Client makes use of
>>> -the 'guest-sync-delimited' command, the Server will send a raw 0xFF
>>> -sentinel byte prior to its response, to aid the Client in discarding
>>> -any data prior to the sentinel.
>>> +the protocol. If the Client makes use of the 'guest-sync-delimited'
>>> +command, the Server will send a raw 0xFF sentinel byte prior to its
>>> +response, to aid the Client in discarding any data prior to the
>>> +sentinel.
>>
>> Maybe worth mentioning "including error messages reported about any
>> lexical errors received prior to the guest-sync-delimited command"
>>
>>>        3. QMP Examples
>>>
> 
> What about:
> 
>      2.7 QGA Synchronization
>      -----------------------
> 
>      When a client connects to QGA over a transport lacking proper
>      connection semantics such as virtio-serial, QGA may have read partial
>      input from a previous client.  The client needs to force QGA's parser
>      into known-good state using the previous section's technique.
>      Moreover, the client may receive output a previous client didn't read.
>      To help with skipping that output, QGA provides the
>      'guest-sync-delimited' command.  Refer to its documentation for
>      details.

That works for me.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org