[Qemu-devel] [PATCH] qmp-shell: fix nested json regression

Marc-André Lureau posted 1 patch 5 years, 2 months ago
Test docker-mingw@fedora passed
Test asan passed
Test checkpatch passed
Test docker-clang@ubuntu passed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20190205134926.8312-1-marcandre.lureau@redhat.com
Maintainers: Markus Armbruster <armbru@redhat.com>, Eduardo Habkost <ehabkost@redhat.com>, Cleber Rosa <crosa@redhat.com>
scripts/qmp/qmp-shell | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
[Qemu-devel] [PATCH] qmp-shell: fix nested json regression
Posted by Marc-André Lureau 5 years, 2 months ago
Commit fcfab7541 ("qmp-shell: learn to send commands with quoted
arguments") introduces the usage of Python 'shlex' to handle quoted
arguments, but it accidentally broke generation of nested JSON
structs.

shlex drops quotes, which breaks parsing of the nested struct.

cmd='blockdev-create job-id="job0 foo" options={"driver":"qcow2","size":16384,"file":{"driver":"file","filename":"foo.qcow2"}}'

shlex.split(cmd)
['blockdev-create',
 'job-id=job0 foo',
 'options={driver:qcow2,size:16384,file:{driver:file,filename:foo.qcow2}}']

Replace with a regexp to split while respecting quoted strings and preserving quotes:

re.findall(r'''(?:[^\s"']|"(?:\\.|[^"])*"|'(?:\\.|[^'])*')+''', cmd)
['blockdev-create',
 'job-id="job0 foo"',
 'options={"driver":"qcow2","size":16384,"file":{"driver":"file","filename":"foo.qcow2"}}']

Fixes: fcfab7541 ("qmp-shell: learn to send commands with quoted arguments")
Reported-by: Kashyap Chamarthy <kchamart@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
 scripts/qmp/qmp-shell | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/scripts/qmp/qmp-shell b/scripts/qmp/qmp-shell
index 770140772d..813dd68232 100755
--- a/scripts/qmp/qmp-shell
+++ b/scripts/qmp/qmp-shell
@@ -74,7 +74,7 @@ import sys
 import os
 import errno
 import atexit
-import shlex
+import re
 
 class QMPCompleter(list):
     def complete(self, text, state):
@@ -220,7 +220,7 @@ class QMPShell(qmp.QEMUMonitorProtocol):
 
             < command-name > [ arg-name1=arg1 ] ... [ arg-nameN=argN ]
         """
-        cmdargs = shlex.split(cmdline)
+        cmdargs = re.findall(r'''(?:[^\s"']|"(?:\\.|[^"])*"|'(?:\\.|[^'])*')+''', cmdline)
 
         # Transactional CLI entry/exit:
         if cmdargs[0] == 'transaction(':
-- 
2.20.1.98.gecbdaf0899


Re: [Qemu-devel] [PATCH] qmp-shell: fix nested json regression
Posted by John Snow 5 years, 2 months ago

On 2/5/19 8:49 AM, Marc-André Lureau wrote:
> Commit fcfab7541 ("qmp-shell: learn to send commands with quoted
> arguments") introduces the usage of Python 'shlex' to handle quoted
> arguments, but it accidentally broke generation of nested JSON
> structs.
> 
> shlex drops quotes, which breaks parsing of the nested struct.
> 
> cmd='blockdev-create job-id="job0 foo" options={"driver":"qcow2","size":16384,"file":{"driver":"file","filename":"foo.qcow2"}}'
> 
> shlex.split(cmd)
> ['blockdev-create',
>  'job-id=job0 foo',
>  'options={driver:qcow2,size:16384,file:{driver:file,filename:foo.qcow2}}']
> 
> Replace with a regexp to split while respecting quoted strings and preserving quotes:
> 
> re.findall(r'''(?:[^\s"']|"(?:\\.|[^"])*"|'(?:\\.|[^'])*')+''', cmd)
> ['blockdev-create',
>  'job-id="job0 foo"',
>  'options={"driver":"qcow2","size":16384,"file":{"driver":"file","filename":"foo.qcow2"}}']
> 
> Fixes: fcfab7541 ("qmp-shell: learn to send commands with quoted arguments")
> Reported-by: Kashyap Chamarthy <kchamart@redhat.com>
> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> ---
>  scripts/qmp/qmp-shell | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/scripts/qmp/qmp-shell b/scripts/qmp/qmp-shell
> index 770140772d..813dd68232 100755
> --- a/scripts/qmp/qmp-shell
> +++ b/scripts/qmp/qmp-shell
> @@ -74,7 +74,7 @@ import sys
>  import os
>  import errno
>  import atexit
> -import shlex
> +import re
>  
>  class QMPCompleter(list):
>      def complete(self, text, state):
> @@ -220,7 +220,7 @@ class QMPShell(qmp.QEMUMonitorProtocol):
>  
>              < command-name > [ arg-name1=arg1 ] ... [ arg-nameN=argN ]
>          """
> -        cmdargs = shlex.split(cmdline)
> +        cmdargs = re.findall(r'''(?:[^\s"']|"(?:\\.|[^"])*"|'(?:\\.|[^'])*')+''', cmdline)

It might really be nice to have a comment briefly explaining the regex.
This is pretty close to symbol soup.

Though I suppose we are approaching the limits of what this hacky little
debug script can do for us...

thank you for fixing it.

>  
>          # Transactional CLI entry/exit:
>          if cmdargs[0] == 'transaction(':
> 

Re: [Qemu-devel] [PATCH] qmp-shell: fix nested json regression
Posted by Kashyap Chamarthy 5 years, 2 months ago
On Tue, Feb 05, 2019 at 08:44:12PM -0500, John Snow wrote:
> On 2/5/19 8:49 AM, Marc-André Lureau wrote:

[...]

> > Replace with a regexp to split while respecting quoted strings and preserving quotes:
> > 
> > re.findall(r'''(?:[^\s"']|"(?:\\.|[^"])*"|'(?:\\.|[^'])*')+''', cmd)
> > ['blockdev-create',
> >  'job-id="job0 foo"',
> >  'options={"driver":"qcow2","size":16384,"file":{"driver":"file","filename":"foo.qcow2"}}']
> > 
> > Fixes: fcfab7541 ("qmp-shell: learn to send commands with quoted arguments")
> > Reported-by: Kashyap Chamarthy <kchamart@redhat.com>
> > Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> > ---
> >  scripts/qmp/qmp-shell | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/scripts/qmp/qmp-shell b/scripts/qmp/qmp-shell
> > index 770140772d..813dd68232 100755
> > --- a/scripts/qmp/qmp-shell
> > +++ b/scripts/qmp/qmp-shell
> > @@ -74,7 +74,7 @@ import sys
> >  import os
> >  import errno
> >  import atexit
> > -import shlex
> > +import re
> >  
> >  class QMPCompleter(list):
> >      def complete(self, text, state):
> > @@ -220,7 +220,7 @@ class QMPShell(qmp.QEMUMonitorProtocol):
> >  
> >              < command-name > [ arg-name1=arg1 ] ... [ arg-nameN=argN ]
> >          """
> > -        cmdargs = shlex.split(cmdline)
> > +        cmdargs = re.findall(r'''(?:[^\s"']|"(?:\\.|[^"])*"|'(?:\\.|[^'])*')+''', cmdline)
> 
> It might really be nice to have a comment briefly explaining the regex.
> This is pretty close to symbol soup.

Yeah, a little comment explaining it would be nice.  

And thanks for the fix, indeed.  FWIW:

    Tested-by: Kashyap Chamarthy <kchamart@redhat.com>

> Though I suppose we are approaching the limits of what this hacky little
> debug script can do for us...

It is good enough for 80% of the cases. :-)

-- 
/kashyap

Re: [Qemu-devel] [PATCH] qmp-shell: fix nested json regression
Posted by Kashyap Chamarthy 5 years, 2 months ago
On Wed, Feb 06, 2019 at 11:48:51AM +0100, Kashyap Chamarthy wrote:
> On Tue, Feb 05, 2019 at 08:44:12PM -0500, John Snow wrote:
> > On 2/5/19 8:49 AM, Marc-André Lureau wrote:

[...]

> > >              < command-name > [ arg-name1=arg1 ] ... [ arg-nameN=argN ]
> > >          """
> > > -        cmdargs = shlex.split(cmdline)
> > > +        cmdargs = re.findall(r'''(?:[^\s"']|"(?:\\.|[^"])*"|'(?:\\.|[^'])*')+''', cmdline)

Dan Berrangé explained on IRC this way:

    In plain english what that is saying is: give me all blocks of text
    which are:
    
      (a) set of chars not including " or '
      
      (b) a set of chars surrounded by ".." 
      
      (c) a set of chars surrounded by '...  

> > It might really be nice to have a comment briefly explaining the regex.
> > This is pretty close to symbol soup.
> 
> Yeah, a little comment explaining it would be nice.  

For my own education today I learned that ?: is called a "non-capturing
match".  For reference, quoting from the `perldoc perlretut`:

[quote]

  Non-capturing groupings
    A group that is required to bundle a set of alternatives may or may not
    be useful as a capturing group. If it isn't, it just creates a
    superfluous addition to the set of available capture group values,
    inside as well as outside the regexp. Non-capturing groupings, denoted
    by "(?:regexp)", still allow the regexp to be treated as a single unit,
    but don't establish a capturing group at the same time. Both capturing
    and non-capturing groupings are allowed to co-exist in the same regexp.
    Because there is no extraction, non-capturing groupings are faster than
    capturing groupings. Non-capturing groupings are also handy for choosing
    exactly which parts of a regexp are to be extracted to matching
    variables:

        # match a number, $1-$4 are set, but we only want $1
        /([+-]?\ *(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?)/;

        # match a number faster , only $1 is set
        /([+-]?\ *(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?)/;

        # match a number, get $1 = whole number, $2 = exponent
        /([+-]?\ *(?:\d+(?:\.\d*)?|\.\d+)(?:[eE]([+-]?\d+))?)/;

    Non-capturing groupings are also useful for removing nuisance elements
    gathered from a split operation where parentheses are required for some
    reason:

        $x = '12aba34ba5';
        @num = split /(a|b)+/, $x;    # @num = ('12','a','34','a','5')
        @num = split /(?:a|b)+/, $x;  # @num = ('12','34','5')

[/quote]

[...]

-- 
/kashyap

Re: [Qemu-devel] [PATCH] qmp-shell: fix nested json regression
Posted by John Snow 5 years, 1 month ago

On 2/5/19 8:49 AM, Marc-André Lureau wrote:
> Commit fcfab7541 ("qmp-shell: learn to send commands with quoted
> arguments") introduces the usage of Python 'shlex' to handle quoted
> arguments, but it accidentally broke generation of nested JSON
> structs.
> 
> shlex drops quotes, which breaks parsing of the nested struct.
> 
> cmd='blockdev-create job-id="job0 foo" options={"driver":"qcow2","size":16384,"file":{"driver":"file","filename":"foo.qcow2"}}'
> 
> shlex.split(cmd)
> ['blockdev-create',
>  'job-id=job0 foo',
>  'options={driver:qcow2,size:16384,file:{driver:file,filename:foo.qcow2}}']
> 
> Replace with a regexp to split while respecting quoted strings and preserving quotes:
> 
> re.findall(r'''(?:[^\s"']|"(?:\\.|[^"])*"|'(?:\\.|[^'])*')+''', cmd)
> ['blockdev-create',
>  'job-id="job0 foo"',
>  'options={"driver":"qcow2","size":16384,"file":{"driver":"file","filename":"foo.qcow2"}}']
> 
> Fixes: fcfab7541 ("qmp-shell: learn to send commands with quoted arguments")
> Reported-by: Kashyap Chamarthy <kchamart@redhat.com>
> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>

Hi, did this get misplaced? Would be nice to have back for 4.0 release.

Re: [Qemu-devel] [PATCH] qmp-shell: fix nested json regression
Posted by Markus Armbruster 5 years, 1 month ago
John Snow <jsnow@redhat.com> writes:

> On 2/5/19 8:49 AM, Marc-André Lureau wrote:
>> Commit fcfab7541 ("qmp-shell: learn to send commands with quoted
>> arguments") introduces the usage of Python 'shlex' to handle quoted
>> arguments, but it accidentally broke generation of nested JSON
>> structs.
>> 
>> shlex drops quotes, which breaks parsing of the nested struct.
>> 
>> cmd='blockdev-create job-id="job0 foo" options={"driver":"qcow2","size":16384,"file":{"driver":"file","filename":"foo.qcow2"}}'
>> 
>> shlex.split(cmd)
>> ['blockdev-create',
>>  'job-id=job0 foo',
>>  'options={driver:qcow2,size:16384,file:{driver:file,filename:foo.qcow2}}']
>> 
>> Replace with a regexp to split while respecting quoted strings and preserving quotes:
>> 
>> re.findall(r'''(?:[^\s"']|"(?:\\.|[^"])*"|'(?:\\.|[^'])*')+''', cmd)
>> ['blockdev-create',
>>  'job-id="job0 foo"',
>>  'options={"driver":"qcow2","size":16384,"file":{"driver":"file","filename":"foo.qcow2"}}']
>> 
>> Fixes: fcfab7541 ("qmp-shell: learn to send commands with quoted arguments")
>> Reported-by: Kashyap Chamarthy <kchamart@redhat.com>
>> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
>
> Hi, did this get misplaced? Would be nice to have back for 4.0 release.

Eduardo, Cleber, I think this one's for you.

Re: [Qemu-devel] [PATCH] qmp-shell: fix nested json regression
Posted by Eduardo Habkost 5 years, 1 month ago
On Sat, Mar 09, 2019 at 10:01:58AM +0100, Markus Armbruster wrote:
> John Snow <jsnow@redhat.com> writes:
> 
> > On 2/5/19 8:49 AM, Marc-André Lureau wrote:
> >> Commit fcfab7541 ("qmp-shell: learn to send commands with quoted
> >> arguments") introduces the usage of Python 'shlex' to handle quoted
> >> arguments, but it accidentally broke generation of nested JSON
> >> structs.
> >> 
> >> shlex drops quotes, which breaks parsing of the nested struct.
> >> 
> >> cmd='blockdev-create job-id="job0 foo" options={"driver":"qcow2","size":16384,"file":{"driver":"file","filename":"foo.qcow2"}}'
> >> 
> >> shlex.split(cmd)
> >> ['blockdev-create',
> >>  'job-id=job0 foo',
> >>  'options={driver:qcow2,size:16384,file:{driver:file,filename:foo.qcow2}}']
> >> 
> >> Replace with a regexp to split while respecting quoted strings and preserving quotes:
> >> 
> >> re.findall(r'''(?:[^\s"']|"(?:\\.|[^"])*"|'(?:\\.|[^'])*')+''', cmd)
> >> ['blockdev-create',
> >>  'job-id="job0 foo"',
> >>  'options={"driver":"qcow2","size":16384,"file":{"driver":"file","filename":"foo.qcow2"}}']
> >> 
> >> Fixes: fcfab7541 ("qmp-shell: learn to send commands with quoted arguments")
> >> Reported-by: Kashyap Chamarthy <kchamart@redhat.com>
> >> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> >
> > Hi, did this get misplaced? Would be nice to have back for 4.0 release.
> 
> Eduardo, Cleber, I think this one's for you.

Queued, thanks.

I'm really bothered by the lack of tests for qmp-shell
command-line parsing.  Does anybody volunteer to write a small
set of doctest examples for __build_cmd()?

-- 
Eduardo