[Qemu-devel] [PATCH 16/56] json: Fix lexer to include the bad character in JSON_ERROR token

Markus Armbruster posted 56 patches 7 years, 2 months ago
There is a newer version of this series
Posted by Markus Armbruster 7 years, 2 months ago
json_lexer[] maps (lexer state, input character) to the new lexer
state.  The input character is consumed unless the new state is
terminal and the input character doesn't belong to this token,
i.e. the state transition uses look-ahead.  When this is the case,
input character '\0' would result in the same state transition.
TERMINAL_NEEDED_LOOKAHEAD() exploits this.

Except this is wrong for transitions to IN_ERROR.  There, the
offending input character is in fact consumed: case IN_ERROR returns.
It isn't added to the JSON_ERROR token, though.

Fix that by making TERMINAL_NEEDED_LOOKAHEAD() return false for
transitions to IN_ERROR.

There's a slight complication.  json_lexer_flush() passes input
character '\0' to flush an incomplete token.  If this results in
JSON_ERROR, we'd now add the '\0' to the token.  Suppress that.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
---
 qobject/json-lexer.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/qobject/json-lexer.c b/qobject/json-lexer.c
index 980ba159d6..7c0875d225 100644
--- a/qobject/json-lexer.c
+++ b/qobject/json-lexer.c
@@ -76,7 +76,7 @@ QEMU_BUILD_BUG_ON((int)JSON_MIN <= (int)IN_START);
    from OLD_STATE required lookahead.  This happens whenever the table
    below uses the TERMINAL macro.  */
 #define TERMINAL_NEEDED_LOOKAHEAD(old_state, terminal) \
-            (json_lexer[(old_state)][0] == (terminal))
+    ((terminal) != IN_ERROR && json_lexer[(old_state)][0] == (terminal))
 
 static const uint8_t json_lexer[][256] =  {
     /* Relies on default initialization to IN_ERROR! */
@@ -304,7 +304,7 @@ static int json_lexer_feed_char(JSONLexer *lexer, char ch, bool flush)
         assert(lexer->state <= ARRAY_SIZE(json_lexer));
         new_state = json_lexer[lexer->state][(uint8_t)ch];
         char_consumed = !TERMINAL_NEEDED_LOOKAHEAD(lexer->state, new_state);
-        if (char_consumed) {
+        if (char_consumed && !flush) {
             g_string_append_c(lexer->token, ch);
         }
 
-- 
2.17.1
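
To make the look-ahead reasoning in the commit message concrete, here is a
minimal sketch of a table-driven lexer with the same shape.  It is not the
QEMU code: the states (T_ERROR, T_INT, IN_MINUS, IN_DIGITS), the table
contents, and the helpers init_table(), needed_lookahead() and lex_first()
are all hypothetical names made up for illustration.  The "fixed" flag
switches between the old test (compare against the '\0' column only) and the
new one (never treat a transition to the error state as look-ahead).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy states (hypothetical, not QEMU's).  Terminals sort before IN_START. */
enum { T_ERROR = 0, T_INT, IN_START, IN_MINUS, IN_DIGITS, N_STATES };

/* Zero-initialised, so every entry that is never set means T_ERROR. */
static uint8_t table[N_STATES][256];

static void init_table(void)
{
    memset(table, 0, sizeof(table));
    /* After digits, any non-digit ends the integer by look-ahead... */
    memset(table[IN_DIGITS], T_INT, sizeof(table[IN_DIGITS]));
    for (int c = '0'; c <= '9'; c++) {
        table[IN_START][c] = IN_DIGITS;
        table[IN_MINUS][c] = IN_DIGITS;
        table[IN_DIGITS][c] = IN_DIGITS;   /* ...while a digit extends it */
    }
    table[IN_START]['-'] = IN_MINUS;       /* '-' must be followed by a digit */
}

/* Old test: same target as for input '\0'.  Fixed test: never for errors. */
static bool needed_lookahead(int old_state, int terminal, bool fixed)
{
    if (fixed && terminal == T_ERROR) {
        return false;
    }
    return table[old_state][0] == terminal;
}

/* Lex the first token of IN into TOK; return its terminal state. */
static int lex_first(const char *in, bool fixed, char *tok, size_t cap)
{
    size_t n = strlen(in), len = 0;
    int state = IN_START;

    assert(cap > n);
    for (size_t i = 0; i <= n; i++) {
        bool flush = (i == n);             /* the final '\0' only flushes */
        int new_state = table[state][(uint8_t)in[i]];
        bool consumed = !needed_lookahead(state, new_state, fixed);

        if (consumed && !flush) {          /* keep the flushing '\0' out */
            tok[len++] = in[i];
        }
        if (new_state < IN_START) {        /* reached a terminal state */
            tok[len] = '\0';
            return new_state;
        }
        state = new_state;
    }
    tok[len] = '\0';   /* not reached: column '\0' is terminal in every row */
    return T_ERROR;
}
```

Under these assumptions, after init_table(), lexing "-x" with fixed == false
yields an error token holding just "-", while fixed == true yields "-x".
Lexing "-" alone yields "-" in both modes, because the flushing '\0' is never
appended.  Valid input such as "12@" still ends the integer by look-ahead and
leaves '@' out of the token.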


Re: [Qemu-devel] [PATCH 16/56] json: Fix lexer to include the bad character in JSON_ERROR token
Posted by Eric Blake 7 years, 2 months ago
On 08/08/2018 07:02 AM, Markus Armbruster wrote:
> json_lexer[] maps (lexer state, input character) to the new lexer
> state.  The input character is consumed unless the new state is
> terminal and the input character doesn't belong to this token,
> i.e. the state transition uses look-ahead.  When this is the case,
> input character '\0' would result in the same state transition.
> TERMINAL_NEEDED_LOOKAHEAD() exploits this.
> 
> Except this is wrong for transitions to IN_ERROR.  There, the
> offending input character is in fact consumed: case IN_ERROR returns.
> It isn't added to the JSON_ERROR token, though.
> 
> Fix that by making TERMINAL_NEEDED_LOOKAHEAD() return false for
> transitions to IN_ERROR.
> 
> There's a slight complication.  json_lexer_flush() passes input
> character '\0' to flush an incomplete token.  If this results in
> JSON_ERROR, we'd now add the '\0' to the token.  Suppress that.
> 
> Signed-off-by: Markus Armbruster <armbru@redhat.com>
> ---
>   qobject/json-lexer.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)

Deceptively small change, but worthwhile.

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org