[Qemu-devel] [RFC PATCH v2] qmp.py: Fix exception parsing partial JSON

Philippe Mathieu-Daudé posted 1 patch 5 years, 9 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20180606192731.6910-1-f4bug@amsat.org
Test checkpatch passed
Test docker-mingw@fedora passed
Test docker-quick@centos7 passed
Test s390x passed
scripts/qmp/qmp.py | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
[Qemu-devel] [RFC PATCH v2] qmp.py: Fix exception parsing partial JSON
Posted by Philippe Mathieu-Daudé 5 years, 9 months ago
The readline() call returns partial data.
Keep appending until the JSON buffer is complete.

This fixes:

    $ scripts/qmp/qmp-shell -v -p /tmp/qmp.sock
    Traceback (most recent call last):
      File "scripts/qmp/qmp-shell", line 456, in <module>
        main()
      File "scripts/qmp/qmp-shell", line 441, in main
        qemu.connect(negotiate)
      File "scripts/qmp/qmp-shell", line 284, in connect
        self._greeting = super(QMPShell, self).connect(negotiate)
      File "scripts/qmp/qmp.py", line 143, in connect
        return self.__negotiate_capabilities()
      File "scripts/qmp/qmp.py", line 71, in __negotiate_capabilities
        greeting = self.__json_read()
      File "scripts/qmp/qmp.py", line 85, in __json_read
        resp = json.loads(data)
      File "/usr/lib/python2.7/json/__init__.py", line 339, in loads
        return _default_decoder.decode(s)
      File "/usr/lib/python2.7/json/decoder.py", line 364, in decode
        obj, end = self.raw_decode(s, idx=_w(s, 0).end())
      File "/usr/lib/python2.7/json/decoder.py", line 380, in raw_decode
        obj, end = self.scan_once(s, idx)
    ValueError: Expecting object: line 1 column 3 (char 2)

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
---
Since v1:
- addressed Daniel review: clean data after json.loads() succeeds
- add a XXX comment

Daniel suggested this is due to blocking i/o.

I'm sure there is a nicer/more pythonic way to do this, but this works for me,
sorry :)

 scripts/qmp/qmp.py | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/scripts/qmp/qmp.py b/scripts/qmp/qmp.py
index 5c8cf6a056..14f0b48936 100644
--- a/scripts/qmp/qmp.py
+++ b/scripts/qmp/qmp.py
@@ -78,11 +78,18 @@ class QEMUMonitorProtocol(object):
         raise QMPCapabilitiesError
 
     def __json_read(self, only_event=False):
+        data = ""
         while True:
-            data = self.__sockfile.readline()
-            if not data:
+            tmp = self.__sockfile.readline()
+            if not tmp:
                 return
-            resp = json.loads(data)
+            data += tmp
+            try:
+                resp = json.loads(data)
+            except ValueError:
+                # XXX: blindly loop, even if QEMU ever sends malformed data
+                continue
+            data = ""
             if 'event' in resp:
                 self.logger.debug("<<< %s", resp)
                 self.__events.append(resp)
-- 
2.17.1


Re: [Qemu-devel] [RFC PATCH v2] qmp.py: Fix exception parsing partial JSON
Posted by Eduardo Habkost 5 years, 9 months ago
On Wed, Jun 06, 2018 at 04:27:31PM -0300, Philippe Mathieu-Daudé wrote:
> The readline() call returns partial data.

How can this be reproduced?  Despite not being forbidden by the
QMP specification, QEMU normally doesn't break QMP replies in
multiple lines, and readline() is not supposed to return a
partial line unless it encounters EOF.


> Keep appending until the JSON buffer is complete.
> 
> This fixes:
> 
>     $ scripts/qmp/qmp-shell -v -p /tmp/qmp.sock
>     Traceback (most recent call last):
>       File "scripts/qmp/qmp-shell", line 456, in <module>
>         main()
>       File "scripts/qmp/qmp-shell", line 441, in main
>         qemu.connect(negotiate)
>       File "scripts/qmp/qmp-shell", line 284, in connect
>         self._greeting = super(QMPShell, self).connect(negotiate)
>       File "scripts/qmp/qmp.py", line 143, in connect
>         return self.__negotiate_capabilities()
>       File "scripts/qmp/qmp.py", line 71, in __negotiate_capabilities
>         greeting = self.__json_read()
>       File "scripts/qmp/qmp.py", line 85, in __json_read
>         resp = json.loads(data)
>       File "/usr/lib/python2.7/json/__init__.py", line 339, in loads
>         return _default_decoder.decode(s)
>       File "/usr/lib/python2.7/json/decoder.py", line 364, in decode
>         obj, end = self.raw_decode(s, idx=_w(s, 0).end())
>       File "/usr/lib/python2.7/json/decoder.py", line 380, in raw_decode
>         obj, end = self.scan_once(s, idx)
>     ValueError: Expecting object: line 1 column 3 (char 2)
> 
> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
> ---
> Since v1:
> - addressed Daniel review: clean data after json.loads() succeeds
> - add a XXX comment
> 
> Daniel suggested this is due to blocking i/o.
> 
> I'm sure there is a nicer/more pythonic way to do this, but this works for me,
> sorry :)

It looks like there's no elegant solution for this:
https://stackoverflow.com/a/21709058

> 
>  scripts/qmp/qmp.py | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/scripts/qmp/qmp.py b/scripts/qmp/qmp.py
> index 5c8cf6a056..14f0b48936 100644
> --- a/scripts/qmp/qmp.py
> +++ b/scripts/qmp/qmp.py
> @@ -78,11 +78,18 @@ class QEMUMonitorProtocol(object):
>          raise QMPCapabilitiesError
>  
>      def __json_read(self, only_event=False):
> +        data = ""
>          while True:
> -            data = self.__sockfile.readline()
> -            if not data:
> +            tmp = self.__sockfile.readline()
> +            if not tmp:
>                  return
> -            resp = json.loads(data)
> +            data += tmp
> +            try:
> +                resp = json.loads(data)
> +            except ValueError:
> +                # XXX: blindly loop, even if QEMU ever sends malformed data
> +                continue

I was going to suggest using json.JSONDecoder.raw_decode() and
saving the remaining data in case we already read partial data
for a second JSON message.

But the QMP specification says all messages are terminated with
CRLF, so we we should never see the data for two different
messages in a single readline() call.  Noting this in a comment
wouldn't hurt, though.

The patch seems reasonable, but first I would like to understand
how this bug can be triggered.

> +            data = ""
>              if 'event' in resp:
>                  self.logger.debug("<<< %s", resp)
>                  self.__events.append(resp)
> -- 
> 2.17.1
> 

-- 
Eduardo

Re: [Qemu-devel] [RFC PATCH v2] qmp.py: Fix exception parsing partial JSON
Posted by Philippe Mathieu-Daudé 5 years, 9 months ago
On 06/06/2018 05:05 PM, Eduardo Habkost wrote:
> On Wed, Jun 06, 2018 at 04:27:31PM -0300, Philippe Mathieu-Daudé wrote:
>> The readline() call returns partial data.
> 
> How can this be reproduced?  Despite not being forbidden by the
> QMP specification, QEMU normally doesn't break QMP replies in
> multiple lines, and readline() is not supposed to return a
> partial line unless it encounters EOF.

$ git rev-parse HEAD
c1c2a435905ae76b159c573b0c0d6f095b45ebc6

config copy/pasted from:
https://wiki.qemu.org/index.php/Documentation/QMP#Trying_it
(now looking at it, it seems I'm mixing configs...)

$ cat qmp.conf
[chardev "qmp"]
  backend = "socket"
  path = "/tmp/qmp.sock"
  server = "on"
  wait = "off"
[mon "qmp"]
  mode = "control"
  chardev = "qmp"
  pretty = "on"

$ arm-softmmu/qemu-system-arm -M lm3s6965evb -kernel /dev/zero \
 -readconfig qmp.conf -S

> 
> 
>> Keep appending until the JSON buffer is complete.
>>
>> This fixes:
>>
>>     $ scripts/qmp/qmp-shell -v -p /tmp/qmp.sock
>>     Traceback (most recent call last):
>>       File "scripts/qmp/qmp-shell", line 456, in <module>
>>         main()
>>       File "scripts/qmp/qmp-shell", line 441, in main
>>         qemu.connect(negotiate)
>>       File "scripts/qmp/qmp-shell", line 284, in connect
>>         self._greeting = super(QMPShell, self).connect(negotiate)
>>       File "scripts/qmp/qmp.py", line 143, in connect
>>         return self.__negotiate_capabilities()
>>       File "scripts/qmp/qmp.py", line 71, in __negotiate_capabilities
>>         greeting = self.__json_read()
>>       File "scripts/qmp/qmp.py", line 85, in __json_read
>>         resp = json.loads(data)
>>       File "/usr/lib/python2.7/json/__init__.py", line 339, in loads
>>         return _default_decoder.decode(s)
>>       File "/usr/lib/python2.7/json/decoder.py", line 364, in decode
>>         obj, end = self.raw_decode(s, idx=_w(s, 0).end())
>>       File "/usr/lib/python2.7/json/decoder.py", line 380, in raw_decode
>>         obj, end = self.scan_once(s, idx)
>>     ValueError: Expecting object: line 1 column 3 (char 2)
>>
>> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
>> ---
>> Since v1:
>> - addressed Daniel review: clean data after json.loads() succeeds
>> - add a XXX comment
>>
>> Daniel suggested this is due to blocking i/o.
>>
>> I'm sure there is a nicer/more pythonic way to do this, but this works for me,
>> sorry :)
> 
> It looks like there's no elegant solution for this:
> https://stackoverflow.com/a/21709058
> 
>>
>>  scripts/qmp/qmp.py | 13 ++++++++++---
>>  1 file changed, 10 insertions(+), 3 deletions(-)
>>
>> diff --git a/scripts/qmp/qmp.py b/scripts/qmp/qmp.py
>> index 5c8cf6a056..14f0b48936 100644
>> --- a/scripts/qmp/qmp.py
>> +++ b/scripts/qmp/qmp.py
>> @@ -78,11 +78,18 @@ class QEMUMonitorProtocol(object):
>>          raise QMPCapabilitiesError
>>  
>>      def __json_read(self, only_event=False):
>> +        data = ""
>>          while True:
>> -            data = self.__sockfile.readline()
>> -            if not data:
>> +            tmp = self.__sockfile.readline()
>> +            if not tmp:
>>                  return
>> -            resp = json.loads(data)
>> +            data += tmp
>> +            try:
>> +                resp = json.loads(data)
>> +            except ValueError:
>> +                # XXX: blindly loop, even if QEMU ever sends malformed data
>> +                continue
> 
> I was going to suggest using json.JSONDecoder.raw_decode() and
> saving the remaining data in case we already read partial data
> for a second JSON message.
> 
> But the QMP specification says all messages are terminated with
> CRLF, so we we should never see the data for two different
> messages in a single readline() call.  Noting this in a comment
> wouldn't hurt, though.
> 
> The patch seems reasonable, but first I would like to understand
> how this bug can be triggered.
> 
>> +            data = ""
>>              if 'event' in resp:
>>                  self.logger.debug("<<< %s", resp)
>>                  self.__events.append(resp)
>> -- 
>> 2.17.1
>>
> 

Re: [Qemu-devel] [RFC PATCH v2] qmp.py: Fix exception parsing partial JSON
Posted by Lukáš Doktor 5 years, 9 months ago
Hello guys,

Dne 7.6.2018 v 01:06 Philippe Mathieu-Daudé napsal(a):
> On 06/06/2018 05:05 PM, Eduardo Habkost wrote:
>> On Wed, Jun 06, 2018 at 04:27:31PM -0300, Philippe Mathieu-Daudé wrote:
>>> The readline() call returns partial data.
>>
>> How can this be reproduced?  Despite not being forbidden by the
>> QMP specification, QEMU normally doesn't break QMP replies in
>> multiple lines, and readline() is not supposed to return a
>> partial line unless it encounters EOF.
> 
> $ git rev-parse HEAD
> c1c2a435905ae76b159c573b0c0d6f095b45ebc6
> 
> config copy/pasted from:
> https://wiki.qemu.org/index.php/Documentation/QMP#Trying_it
> (now looking at it, it seems I'm mixing configs...)
> 
> $ cat qmp.conf
> [chardev "qmp"]
>   backend = "socket"
>   path = "/tmp/qmp.sock"
>   server = "on"
>   wait = "off"
> [mon "qmp"]
>   mode = "control"
>   chardev = "qmp"
>   pretty = "on"
> 

nice, pretty printing..., didn't expected that.

> $ arm-softmmu/qemu-system-arm -M lm3s6965evb -kernel /dev/zero \
>  -readconfig qmp.conf -S
> 
>>
>>
>>> Keep appending until the JSON buffer is complete.
>>>
>>> This fixes:
>>>
>>>     $ scripts/qmp/qmp-shell -v -p /tmp/qmp.sock
>>>     Traceback (most recent call last):
>>>       File "scripts/qmp/qmp-shell", line 456, in <module>
>>>         main()
>>>       File "scripts/qmp/qmp-shell", line 441, in main
>>>         qemu.connect(negotiate)
>>>       File "scripts/qmp/qmp-shell", line 284, in connect
>>>         self._greeting = super(QMPShell, self).connect(negotiate)
>>>       File "scripts/qmp/qmp.py", line 143, in connect
>>>         return self.__negotiate_capabilities()
>>>       File "scripts/qmp/qmp.py", line 71, in __negotiate_capabilities
>>>         greeting = self.__json_read()
>>>       File "scripts/qmp/qmp.py", line 85, in __json_read
>>>         resp = json.loads(data)
>>>       File "/usr/lib/python2.7/json/__init__.py", line 339, in loads
>>>         return _default_decoder.decode(s)
>>>       File "/usr/lib/python2.7/json/decoder.py", line 364, in decode
>>>         obj, end = self.raw_decode(s, idx=_w(s, 0).end())
>>>       File "/usr/lib/python2.7/json/decoder.py", line 380, in raw_decode
>>>         obj, end = self.scan_once(s, idx)
>>>     ValueError: Expecting object: line 1 column 3 (char 2)
>>>
>>> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
>>> ---
>>> Since v1:
>>> - addressed Daniel review: clean data after json.loads() succeeds
>>> - add a XXX comment
>>>
>>> Daniel suggested this is due to blocking i/o.
>>>
>>> I'm sure there is a nicer/more pythonic way to do this, but this works for me,
>>> sorry :)
>>
>> It looks like there's no elegant solution for this:
>> https://stackoverflow.com/a/21709058
>>

Yep, that looks nicer, but even the original solution should not be that bad as it should be rarely used. What troubles me more is the possible infinite loop. Would you mind adding a timeout?

Lukáš

>>>
>>>  scripts/qmp/qmp.py | 13 ++++++++++---
>>>  1 file changed, 10 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/scripts/qmp/qmp.py b/scripts/qmp/qmp.py
>>> index 5c8cf6a056..14f0b48936 100644
>>> --- a/scripts/qmp/qmp.py
>>> +++ b/scripts/qmp/qmp.py
>>> @@ -78,11 +78,18 @@ class QEMUMonitorProtocol(object):
>>>          raise QMPCapabilitiesError
>>>  
>>>      def __json_read(self, only_event=False):
>>> +        data = ""
>>>          while True:
>>> -            data = self.__sockfile.readline()
>>> -            if not data:
>>> +            tmp = self.__sockfile.readline()
>>> +            if not tmp:
>>>                  return
>>> -            resp = json.loads(data)
>>> +            data += tmp
>>> +            try:
>>> +                resp = json.loads(data)
>>> +            except ValueError:
>>> +                # XXX: blindly loop, even if QEMU ever sends malformed data
>>> +                continue
>>>> I was going to suggest using json.JSONDecoder.raw_decode() and
>> saving the remaining data in case we already read partial data
>> for a second JSON message.
>>
>> But the QMP specification says all messages are terminated with
>> CRLF, so we we should never see the data for two different
>> messages in a single readline() call.  Noting this in a comment
>> wouldn't hurt, though.
>>
>> The patch seems reasonable, but first I would like to understand
>> how this bug can be triggered.
>>
>>> +            data = ""
>>>              if 'event' in resp:
>>>                  self.logger.debug("<<< %s", resp)
>>>                  self.__events.append(resp)
>>> -- 
>>> 2.17.1
>>>
>>


Re: [Qemu-devel] [RFC PATCH v2] qmp.py: Fix exception parsing partial JSON
Posted by Eduardo Habkost 5 years, 9 months ago
On Fri, Jun 08, 2018 at 07:57:55PM +0200, Lukáš Doktor wrote:
> Hello guys,
> 
> Dne 7.6.2018 v 01:06 Philippe Mathieu-Daudé napsal(a):
> > On 06/06/2018 05:05 PM, Eduardo Habkost wrote:
> >> On Wed, Jun 06, 2018 at 04:27:31PM -0300, Philippe Mathieu-Daudé wrote:
> >>> The readline() call returns partial data.
> >>
> >> How can this be reproduced?  Despite not being forbidden by the
> >> QMP specification, QEMU normally doesn't break QMP replies in
> >> multiple lines, and readline() is not supposed to return a
> >> partial line unless it encounters EOF.
> > 
> > $ git rev-parse HEAD
> > c1c2a435905ae76b159c573b0c0d6f095b45ebc6
> > 
> > config copy/pasted from:
> > https://wiki.qemu.org/index.php/Documentation/QMP#Trying_it
> > (now looking at it, it seems I'm mixing configs...)
> > 
> > $ cat qmp.conf
> > [chardev "qmp"]
> >   backend = "socket"
> >   path = "/tmp/qmp.sock"
> >   server = "on"
> >   wait = "off"
> > [mon "qmp"]
> >   mode = "control"
> >   chardev = "qmp"
> >   pretty = "on"
> > 
> 
> nice, pretty printing..., didn't expected that.

Oh, so *that*'s the root cause.  Thanks, I stared at this
configuration for a while and I couldn't see what could cause
extra newlines to appear in the output.  :)

[...]
> >>> I'm sure there is a nicer/more pythonic way to do this, but this works for me,
> >>> sorry :)
> >>
> >> It looks like there's no elegant solution for this:
> >> https://stackoverflow.com/a/21709058
> >>
> 
> Yep, that looks nicer, but even the original solution should
> not be that bad as it should be rarely used. What troubles me
> more is the possible infinite loop. Would you mind adding a
> timeout?

We already have a socket timeout, so I assume this is alrady
covered?

-- 
Eduardo