Signed-off-by: Markus Armbruster <armbru@redhat.com>
---
qobject/json-lexer.c | 9 +++++----
qobject/json-parser-int.h | 8 ++++----
2 files changed, 9 insertions(+), 8 deletions(-)
diff --git a/qobject/json-lexer.c b/qobject/json-lexer.c
index 39c7ce7adc..2a5561c917 100644
--- a/qobject/json-lexer.c
+++ b/qobject/json-lexer.c
@@ -100,8 +100,7 @@
*/
enum json_lexer_state {
- IN_ERROR = 0, /* must really be 0, see json_lexer[] */
- IN_RECOVERY,
+ IN_RECOVERY = 1,
IN_DQ_STRING_ESCAPE,
IN_DQ_STRING,
IN_SQ_STRING_ESCAPE,
@@ -121,6 +120,8 @@ enum json_lexer_state {
IN_START_INTERP, /* must be IN_START + 1 */
};
+QEMU_BUILD_BUG_ON(JSON_ERROR != 0);
+QEMU_BUILD_BUG_ON(IN_RECOVERY != JSON_ERROR + 1);
QEMU_BUILD_BUG_ON((int)JSON_MIN <= (int)IN_START_INTERP);
QEMU_BUILD_BUG_ON(JSON_MAX >= 0x80);
QEMU_BUILD_BUG_ON(IN_START_INTERP != IN_START + 1);
@@ -176,7 +177,7 @@ static const uint8_t json_lexer[][256] = {
/* Zero */
[IN_ZERO] = {
TERMINAL(JSON_INTEGER),
- ['0' ... '9'] = IN_ERROR,
+ ['0' ... '9'] = JSON_ERROR,
['.'] = IN_MANTISSA,
},
@@ -328,7 +329,7 @@ static void json_lexer_feed_char(JSONLexer *lexer, char ch, bool flush)
case IN_START:
new_state = lexer->start_state;
break;
- case IN_ERROR:
+ case JSON_ERROR:
json_message_process_token(lexer, lexer->token, JSON_ERROR,
lexer->x, lexer->y);
new_state = IN_RECOVERY;
diff --git a/qobject/json-parser-int.h b/qobject/json-parser-int.h
index abeec63af5..57cb8e79d3 100644
--- a/qobject/json-parser-int.h
+++ b/qobject/json-parser-int.h
@@ -16,10 +16,11 @@
#include "qapi/qmp/json-parser.h"
-
typedef enum json_token_type {
- JSON_MIN = 100,
- JSON_LCURLY = JSON_MIN,
+ JSON_ERROR = 0, /* must be zero, see json_lexer[] */
+ /* Gap for lexer states */
+ JSON_LCURLY = 100,
+ JSON_MIN = JSON_LCURLY,
JSON_RCURLY,
JSON_LSQUARE,
JSON_RSQUARE,
@@ -31,7 +32,6 @@ typedef enum json_token_type {
JSON_STRING,
JSON_INTERP,
JSON_SKIP,
- JSON_ERROR,
JSON_END_OF_INPUT,
JSON_MAX = JSON_END_OF_INPUT
} JSONTokenType;
--
2.17.1
On 08/27/2018 02:00 AM, Markus Armbruster wrote: > Signed-off-by: Markus Armbruster <armbru@redhat.com> > --- > qobject/json-lexer.c | 9 +++++---- > qobject/json-parser-int.h | 8 ++++---- > 2 files changed, 9 insertions(+), 8 deletions(-) > Reviewed-by: Eric Blake <eblake@redhat.com> -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
On 08/27/2018 02:00 AM, Markus Armbruster wrote:
> Signed-off-by: Markus Armbruster <armbru@redhat.com>
> ---
> qobject/json-lexer.c | 9 +++++----
> qobject/json-parser-int.h | 8 ++++----
> 2 files changed, 9 insertions(+), 8 deletions(-)
> -
> typedef enum json_token_type {
> - JSON_MIN = 100,
> - JSON_LCURLY = JSON_MIN,
> + JSON_ERROR = 0, /* must be zero, see json_lexer[] */
> + /* Gap for lexer states */
> + JSON_LCURLY = 100,
> + JSON_MIN = JSON_LCURLY,
In an earlier version of this type of cleanup, you swapped the IN_ and
JSON_ values and eliminated the gap, to make the overall table more
compact (no storage wasted on any of the states in the gap between the two).
https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg01178.html
Is it still worth trying to minimize the gap between the two sequences,
even if you now no longer swap them in order?
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
Eric Blake <eblake@redhat.com> writes:
> On 08/27/2018 02:00 AM, Markus Armbruster wrote:
>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
>> ---
>> qobject/json-lexer.c | 9 +++++----
>> qobject/json-parser-int.h | 8 ++++----
>> 2 files changed, 9 insertions(+), 8 deletions(-)
>
>> -
>> typedef enum json_token_type {
>> - JSON_MIN = 100,
>> - JSON_LCURLY = JSON_MIN,
>> + JSON_ERROR = 0, /* must be zero, see json_lexer[] */
>> + /* Gap for lexer states */
>> + JSON_LCURLY = 100,
>> + JSON_MIN = JSON_LCURLY,
>
> In an earlier version of this type of cleanup, you swapped the IN_ and
> JSON_ values and eliminated the gap, to make the overall table more
> compact (no storage wasted on any of the states in the gap between the
> two).
>
> https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg01178.html
>
> Is it still worth trying to minimize the gap between the two
> sequences, even if you now no longer swap them in order?
You caught me :)
Eliminating the gap actually enlarges the table. I first got confused,
then measured the size change backwards to confirm my confused ideas.
When I looked at the patch again, I realized my mistake, and silently
dropped this part of the change.
On 08/27/2018 11:40 PM, Markus Armbruster wrote:
>>> typedef enum json_token_type {
>>> - JSON_MIN = 100,
>>> - JSON_LCURLY = JSON_MIN,
>>> + JSON_ERROR = 0, /* must be zero, see json_lexer[] */
>>> + /* Gap for lexer states */
>>> + JSON_LCURLY = 100,
>>> + JSON_MIN = JSON_LCURLY,
>>
>> In an earlier version of this type of cleanup, you swapped the IN_ and
>> JSON_ values and eliminated the gap, to make the overall table more
>> compact (no storage wasted on any of the states in the gap between the
>> two).
>>
>> https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg01178.html
>>
>> Is it still worth trying to minimize the gap between the two
>> sequences, even if you now no longer swap them in order?
>
> You caught me :)
>
> Eliminating the gap actually enlarges the table.
Rather, switching the order enlarges the table.
> I first got confused,
> then measured the size change backwards to confirm my confused ideas.
> When I looked at the patch again, I realized my mistake, and silently
> dropped this part of the change.
The size of the table is determined by the fact that we must initialize
entry 0 (whether we spell it IN_ERROR or JSON_ERROR), then pay attention
to the largest value assigned. Re-reading json_lexer[], you are only
initializing IN_* states, and not JSON_* states; swapping JSON_* to come
first enlarged the table because you now have a bunch of additional rows
in the table that are all 0-initialized to JSON_ERROR transitions.
So at the end of the day, leaving IN_* to be first, and putting JSON_*
second, makes sense.
The question remains, then, if a fixed-size gap (by making JSON_MIN be
exactly 100) is any smarter than a contiguous layout (by making JSON_MIN
be IN_START_INTERP + 1). I can't see any strong reason for preferring
one form over the other, so keeping the gap doesn't hurt.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
On 08/28/2018 10:01 AM, Eric Blake wrote: > The question remains, then, if a fixed-size gap (by making JSON_MIN be > exactly 100) is any smarter than a contiguous layout (by making JSON_MIN > be IN_START_INTERP + 1). I can't see any strong reason for preferring > one form over the other, so keeping the gap doesn't hurt. And having said that, this patch series also introduced a second gap for LOOKAHEAD defined at 0x80, along with assertions that it doesn't collide with either IN_* or JSON_*. It may be just as easy to make JSON_MIN start at LOOKAHEAD+1 than to assert that all of JSON_* fits in the space between 100 and 0x80 - 1. Although in the long run, I seriously doubt we'll be adding many new enum values to either the lexer or the parser. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Eric Blake <eblake@redhat.com> writes: > On 08/28/2018 10:01 AM, Eric Blake wrote: >> The question remains, then, if a fixed-size gap (by making JSON_MIN >> be exactly 100) is any smarter than a contiguous layout (by making >> JSON_MIN be IN_START_INTERP + 1). I can't see any strong reason for >> preferring one form over the other, so keeping the gap doesn't hurt. > > And having said that, this patch series also introduced a second gap > for LOOKAHEAD defined at 0x80, along with assertions that it doesn't > collide with either IN_* or JSON_*. It may be just as easy to make > JSON_MIN start at LOOKAHEAD+1 than to assert that all of JSON_* fits > in the space between 100 and 0x80 - 1. Although in the long run, I > seriously doubt we'll be adding many new enum values to either the > lexer or the parser. LOOKAHEAD is semantically independent from the lexer state. Evidence: we use both IN_START and IN_START | LOOKAHEAD in json_lexer[].
Eric Blake <eblake@redhat.com> writes:
> On 08/27/2018 11:40 PM, Markus Armbruster wrote:
>
>>>> typedef enum json_token_type {
>>>> - JSON_MIN = 100,
>>>> - JSON_LCURLY = JSON_MIN,
>>>> + JSON_ERROR = 0, /* must be zero, see json_lexer[] */
>>>> + /* Gap for lexer states */
>>>> + JSON_LCURLY = 100,
>>>> + JSON_MIN = JSON_LCURLY,
>>>
>>> In an earlier version of this type of cleanup, you swapped the IN_ and
>>> JSON_ values and eliminated the gap, to make the overall table more
>>> compact (no storage wasted on any of the states in the gap between the
>>> two).
>>>
>>> https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg01178.html
>>>
>>> Is it still worth trying to minimize the gap between the two
>>> sequences, even if you now no longer swap them in order?
>>
>> You caught me :)
>>
>> Eliminating the gap actually enlarges the table.
>
> Rather, switching the order enlarges the table.
>
>> I first got confused,
>> then measured the size change backwards to confirm my confused ideas.
>> When I looked at the patch again, I realized my mistake, and silently
>> dropped this part of the change.
>
> The size of the table is determined by the fact that we must
> initialize entry 0 (whether we spell it IN_ERROR or JSON_ERROR), then
> pay attention to the largest value assigned. Re-reading json_lexer[],
> you are only initializing IN_* states, and not JSON_* states;
Correct.
The JSON_* states other than JSON_ERROR all go to the start state
regardless of lookahead and without consuming it. We implement that
state transition in code instead of putting it into the table:
case JSON_STRING:
json_message_process_token(lexer, lexer->token, new_state,
lexer->x, lexer->y);
/* fall through */
case JSON_SKIP:
g_string_truncate(lexer->token, 0);
/* fall through */
case IN_START:
--> new_state = lexer->start_state;
break;
JSON_ERROR goes to IN_RECOVERY instead:
case JSON_ERROR:
json_message_process_token(lexer, lexer->token, JSON_ERROR,
lexer->x, lexer->y);
---> new_state = IN_RECOVERY;
/* fall through */
case IN_RECOVERY:
> swapping
> JSON_* to come first enlarged the table because you now have a bunch
> of additional rows in the table that are all 0-initialized to
> JSON_ERROR transitions.
Yes. These rows are never used.
> So at the end of the day, leaving IN_* to be first, and putting JSON_*
> second, makes sense.
>
> The question remains, then, if a fixed-size gap (by making JSON_MIN be
> exactly 100) is any smarter than a contiguous layout (by making
> JSON_MIN be IN_START_INTERP + 1). I can't see any strong reason for
> preferring one form over the other, so keeping the gap doesn't hurt.
The gap lets us hide the IN_* in json-lexer.c. Not sure it's worth the
trouble. We could move the IN_* to json-parser-int.h for simplicity.
Not sure that's worth the trouble, either :)
© 2016 - 2025 Red Hat, Inc.