[PATCH v3 0/7] qobject: switch JSON parser to push

Paolo Bonzini posted 7 patches 16 hours ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20260525150503.393743-1-pbonzini@redhat.com
Maintainers: Markus Armbruster <armbru@redhat.com>
This rewrites the json-parser to use a push parser aka state machine.
While push parsers are inherently more complex than recursive descent,
the grammar for JSON is simple enough that the parser remains readable.
There is therefore no need to use e.g. QEMU coroutines.
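
For readers less familiar with the push style, here is a minimal sketch
of the idea on a toy grammar (nested arrays of numbers).  It is not the
code from this series: ToyToken, ToyParser and toy_parser_feed() are
hypothetical names made up for the illustration.  The caller pushes one
token at a time, and everything the parser has to remember between calls
lives in an explicit stack instead of the C call stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical token and result types; not QEMU's JSONToken. */
typedef enum { T_LSQUARE, T_RSQUARE, T_COMMA, T_NUMBER } ToyToken;
typedef enum { PUSH_MORE, PUSH_DONE, PUSH_ERROR } PushResult;

/* What the innermost open array expects next. */
typedef enum {
    S_ARRAY_START,        /* just saw '[': expect a value or ']' */
    S_ARRAY_AFTER_VALUE,  /* saw an element: expect ',' or ']' */
    S_ARRAY_AFTER_COMMA,  /* saw ',': expect a value */
} ToyState;

#define TOY_MAX_DEPTH 64

typedef struct {
    ToyState stack[TOY_MAX_DEPTH];
    size_t depth;         /* number of arrays currently open */
} ToyParser;

static void toy_parser_reset(ToyParser *p)
{
    p->depth = 0;
}

/* A complete value was seen: finish, or record it in the parent array. */
static PushResult toy_value_done(ToyParser *p)
{
    if (p->depth == 0) {
        return PUSH_DONE;
    }
    p->stack[p->depth - 1] = S_ARRAY_AFTER_VALUE;
    return PUSH_MORE;
}

/* Consume exactly one token and return without blocking. */
static PushResult toy_parser_feed(ToyParser *p, ToyToken tok)
{
    bool want_value = p->depth == 0 ||
                      p->stack[p->depth - 1] != S_ARRAY_AFTER_VALUE;

    switch (tok) {
    case T_NUMBER:
        return want_value ? toy_value_done(p) : PUSH_ERROR;
    case T_LSQUARE:
        if (!want_value || p->depth == TOY_MAX_DEPTH) {
            return PUSH_ERROR;
        }
        p->stack[p->depth++] = S_ARRAY_START;
        return PUSH_MORE;
    case T_COMMA:
        if (want_value) {
            return PUSH_ERROR;
        }
        assert(p->depth > 0);   /* AFTER_VALUE implies an open array */
        p->stack[p->depth - 1] = S_ARRAY_AFTER_COMMA;
        return PUSH_MORE;
    case T_RSQUARE:
        if (p->depth == 0 ||
            p->stack[p->depth - 1] == S_ARRAY_AFTER_COMMA) {
            return PUSH_ERROR;
        }
        p->depth--;
        return toy_value_done(p);
    }
    return PUSH_ERROR;
}
```

Feeding the tokens of "[1,[2],[]]" returns PUSH_MORE for each token
until the final ']' returns PUSH_DONE, while "[1,]" fails on the ']'
itself, as soon as that token is pushed.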

Unlike the suggestion in commit 62815d85aed ("json: Redesign the callback
to consume JSON values", 2018-08-24), I kept the json-streamer concept.
It helps in handling input limits, it performs error recovery, and it
converts the token-at-a-time push interface to callbacks---all things
that are more easily done in a separate layer to keep the parser clean.
However, it no longer needs to store partial JSON objects in tokenized
form, because the current state is stored in the push parser's stack.

Another benefit is that QEMU can report the first parsing error
immediately, without waiting for braces and brackets to be balanced or
for a lexing error.  Error recovery then proceeds as before (i.e., the
next parse still starts after balanced braces and brackets or a lexing
error).
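
The recovery rule can be sketched roughly as follows.  This is an
illustration under my own assumptions, not the json-streamer code:
ToyStreamer, toy_streamer_accept() and toy_streamer_error() are
hypothetical names.  The layer counts braces and brackets, forwards
tokens to the parser until the parser rejects one, and then discards
tokens until the counts are balanced again or the lexer reports an
error.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical token kinds; only nesting tokens matter for recovery. */
typedef enum {
    TK_LCURLY, TK_RCURLY, TK_LSQUARE, TK_RSQUARE, TK_OTHER, TK_LEX_ERROR
} TokKind;

typedef struct {
    unsigned brace_count;
    unsigned bracket_count;
    bool skipping;      /* an error was reported; discard until resync */
} ToyStreamer;

/*
 * Account for one token.  Returns true if the caller should forward
 * the token to the parser, false if it must be dropped.
 */
static bool toy_streamer_accept(ToyStreamer *s, TokKind tok)
{
    bool forward = !s->skipping && tok != TK_LEX_ERROR;

    switch (tok) {
    case TK_LCURLY:
        s->brace_count++;
        break;
    case TK_RCURLY:
        if (s->brace_count) {
            s->brace_count--;
        }
        break;
    case TK_LSQUARE:
        s->bracket_count++;
        break;
    case TK_RSQUARE:
        if (s->bracket_count) {
            s->bracket_count--;
        }
        break;
    default:
        break;
    }

    /* Resynchronize: the next token starts a fresh parse. */
    if (tok == TK_LEX_ERROR ||
        (s->brace_count == 0 && s->bracket_count == 0)) {
        s->brace_count = 0;
        s->bracket_count = 0;
        s->skipping = false;
    }
    return forward;
}

/* The parser rejected the token that was just forwarded. */
static void toy_streamer_error(ToyStreamer *s)
{
    /* Tokens are never forwarded while skipping, so this cannot nest. */
    assert(!s->skipping);
    /* Report here, right away; then skip only if still nested. */
    s->skipping = s->brace_count || s->bracket_count;
}
```

For the input "[1 2 [3]]" the error is flagged when "2" is pushed; the
remaining tokens up to the final ']' are dropped, and the token after
it begins a new parse.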

On top of the benefits intrinsic to the push architecture, it is now
really easy to add a location to JSON parsing errors, so the last patch
does that as well.
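
As a rough illustration of why this gets easy: a push parser detects
the error while it still holds the offending token, so whatever
position the lexer recorded for that token can go straight into the
message.  The sketch below is an assumption-laden example, not the
format used by the patch; LocToken, its fields and toy_format_error()
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical token carrying the position recorded by the lexer. */
typedef struct {
    int line;
    int column;
} LocToken;

/*
 * Format "LINE:COLUMN: MESSAGE" into buf.  Returns the snprintf()
 * result, i.e. the length the full message needs.
 */
static int toy_format_error(char *buf, size_t size,
                            const LocToken *tok, const char *msg)
{
    assert(buf != NULL && tok != NULL && msg != NULL);
    return snprintf(buf, size, "%d:%d: %s", tok->line, tok->column, msg);
}
```

With a token at line 3, column 7 and the message "expecting value",
this produces "3:7: expecting value".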

The diffstat is unfavorable, but most of the net increase in lines is
new comments explaining the grammar and state machines.

Paolo

v2->v3:
- accept interpolation for the key of a dictionary

v1->v2:
- remove part of the patch to pass around the lookahead token,
  it was hard to review and added little value
- separate patch to reuse the JSONParser
- separate patch to make brace/bracket count unsigned
- add comment with the structure of the stack
- add big comment with the grammar
- split long lines
- remove QObject **value argument to pop_entry()
- add assertions about the type of the top-of-stack
- change error to "key is not a string in object"
- split out json_parser_reset() already in the first patch
- rename json_parser_parse_token() to parse_token()
- do not use single quotes in commit messages
- move initialization of JSONToken close to usage


Paolo Bonzini (7):
  json-parser: constify JSONToken
  json-parser: replace with a push parser
  json-streamer: reuse parser
  json-streamer: make brace/bracket count unsigned
  json-streamer: remove token queue
  json-streamer: do not heap-allocate JSONToken
  json-parser: add location to JSON parsing errors

 include/qobject/json-parser.h |  16 +-
 qobject/json-parser-int.h     |  13 +-
 qobject/json-lexer.c          |  11 +-
 qobject/json-parser.c         | 580 +++++++++++++++++++---------------
 qobject/json-streamer.c       | 120 +++----
 5 files changed, 415 insertions(+), 325 deletions(-)

-- 
2.54.0