[Qemu-devel] [PATCH v2 6/9] qapi: Rewrite string-input-visitor

David Hildenbrand posted 9 patches 7 years, 2 months ago
[Qemu-devel] [PATCH v2 6/9] qapi: Rewrite string-input-visitor
Posted by David Hildenbrand 7 years, 2 months ago
The input visitor has some problems right now, especially
- unsigned type "Range" is used to process signed ranges, resulting in
  inconsistent behavior and ugly/magical code
- uint64_t are parsed like int64_t, so big uint64_t values are not
  supported and error messages are misleading
- lists/ranges of int64_t are accepted although no list is parsed and
  we should rather report an error
- lists/ranges are preparsed using int64_t, making it hard to
  implement uint64_t values or uint64_t lists
- types that don't support lists don't bail out
- visiting beyond the end of a list is not handled properly
- we don't actually parse lists, we parse *sets*: members are sorted,
  and duplicates eliminated

So let's rewrite it by getting rid of usage of the type "Range" and
properly supporting lists of int64_t and uint64_t (including ranges of
both types), fixing the above mentioned issues.

Lists of other types are not supported and will properly report an
error. Virtual walks are now supported.

Tests have to be fixed up:
- Two BUGs were hardcoded that are fixed now
- The string-input-visitor now actually returns a parsed list and not
  an ordered set.

Please note that no users/callers have to be fixed up. Candiates using
visit_type_uint16List() and friends are:
- backends/hostmem.c:host_memory_backend_set_host_nodes()
-- Code can deal with dupilcates/unsorted lists
- numa.c::query_memdev()
-- via object_property_get_uint16List(), the list will still be sorted
   and without duplicates (via host_memory_backend_get_host_nodes())
- qapi-visit.c::visit_type_Memdev_members()
- qapi-visit.c::visit_type_NumaNodeOptions_members()
- qapi-visit.c::visit_type_RockerOfDpaGroup_members
- qapi-visit.c::visit_type_RxFilterInfo_members()
-- Not used with string-input-visitor.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/qapi/string-input-visitor.h |   4 +-
 qapi/string-input-visitor.c         | 405 ++++++++++++++++------------
 tests/test-string-input-visitor.c   |  18 +-
 3 files changed, 234 insertions(+), 193 deletions(-)

diff --git a/include/qapi/string-input-visitor.h b/include/qapi/string-input-visitor.h
index 33551340e3..921f3875b9 100644
--- a/include/qapi/string-input-visitor.h
+++ b/include/qapi/string-input-visitor.h
@@ -19,8 +19,8 @@ typedef struct StringInputVisitor StringInputVisitor;
 
 /*
  * The string input visitor does not implement support for visiting
- * QAPI structs, alternates, null, or arbitrary QTypes.  It also
- * requires a non-null list argument to visit_start_list().
+ * QAPI structs, alternates, null, or arbitrary QTypes. Only flat lists
+ * of integers (except type "size") are supported.
  */
 Visitor *string_input_visitor_new(const char *str);
 
diff --git a/qapi/string-input-visitor.c b/qapi/string-input-visitor.c
index b89c6c4e06..bd92080667 100644
--- a/qapi/string-input-visitor.c
+++ b/qapi/string-input-visitor.c
@@ -4,10 +4,10 @@
  * Copyright Red Hat, Inc. 2012-2016
  *
  * Author: Paolo Bonzini <pbonzini@redhat.com>
+ *         David Hildenbrand <david@redhat.com>
  *
  * This work is licensed under the terms of the GNU LGPL, version 2.1 or later.
  * See the COPYING.LIB file in the top-level directory.
- *
  */
 
 #include "qemu/osdep.h"
@@ -18,21 +18,42 @@
 #include "qapi/qmp/qerror.h"
 #include "qapi/qmp/qnull.h"
 #include "qemu/option.h"
-#include "qemu/queue.h"
-#include "qemu/range.h"
 #include "qemu/cutils.h"
 
+typedef enum ListMode {
+    /* no list parsing active / no list expected */
+    LM_NONE,
+    /* we have an unparsed string remaining */
+    LM_UNPARSED,
+    /* we have an unfinished int64 range */
+    LM_INT64_RANGE,
+    /* we have an unfinished uint64 range */
+    LM_UINT64_RANGE,
+    /* we have parsed the string completely and no range is remaining */
+    LM_END,
+} ListMode;
+
+/* protect against DOS attacks, limit the amount of elements per range */
+#define RANGE_MAX_ELEMENTS 65536
+
+typedef union RangeElement {
+    int64_t i64;
+    uint64_t u64;
+} RangeElement;
 
 struct StringInputVisitor
 {
     Visitor visitor;
 
-    GList *ranges;
-    GList *cur_range;
-    int64_t cur;
+    /* List parsing state */
+    ListMode lm;
+    RangeElement rangeNext;
+    RangeElement rangeEnd;
+    const char *unparsed_string;
+    void *list;
 
+    /* The original string to parse */
     const char *string;
-    void *list; /* Only needed for sanity checking the caller */
 };
 
 static StringInputVisitor *to_siv(Visitor *v)
@@ -40,136 +61,42 @@ static StringInputVisitor *to_siv(Visitor *v)
     return container_of(v, StringInputVisitor, visitor);
 }
 
-static void free_range(void *range, void *dummy)
-{
-    g_free(range);
-}
-
-static int parse_str(StringInputVisitor *siv, const char *name, Error **errp)
-{
-    char *str = (char *) siv->string;
-    long long start, end;
-    Range *cur;
-    char *endptr;
-
-    if (siv->ranges) {
-        return 0;
-    }
-
-    if (!*str) {
-        return 0;
-    }
-
-    do {
-        errno = 0;
-        start = strtoll(str, &endptr, 0);
-        if (errno == 0 && endptr > str) {
-            if (*endptr == '\0') {
-                cur = g_malloc0(sizeof(*cur));
-                range_set_bounds(cur, start, start);
-                siv->ranges = range_list_insert(siv->ranges, cur);
-                cur = NULL;
-                str = NULL;
-            } else if (*endptr == '-') {
-                str = endptr + 1;
-                errno = 0;
-                end = strtoll(str, &endptr, 0);
-                if (errno == 0 && endptr > str && start <= end &&
-                    (start > INT64_MAX - 65536 ||
-                     end < start + 65536)) {
-                    if (*endptr == '\0') {
-                        cur = g_malloc0(sizeof(*cur));
-                        range_set_bounds(cur, start, end);
-                        siv->ranges = range_list_insert(siv->ranges, cur);
-                        cur = NULL;
-                        str = NULL;
-                    } else if (*endptr == ',') {
-                        str = endptr + 1;
-                        cur = g_malloc0(sizeof(*cur));
-                        range_set_bounds(cur, start, end);
-                        siv->ranges = range_list_insert(siv->ranges, cur);
-                        cur = NULL;
-                    } else {
-                        goto error;
-                    }
-                } else {
-                    goto error;
-                }
-            } else if (*endptr == ',') {
-                str = endptr + 1;
-                cur = g_malloc0(sizeof(*cur));
-                range_set_bounds(cur, start, start);
-                siv->ranges = range_list_insert(siv->ranges, cur);
-                cur = NULL;
-            } else {
-                goto error;
-            }
-        } else {
-            goto error;
-        }
-    } while (str);
-
-    return 0;
-error:
-    g_list_foreach(siv->ranges, free_range, NULL);
-    g_list_free(siv->ranges);
-    siv->ranges = NULL;
-    error_setg(errp, QERR_INVALID_PARAMETER_VALUE, name ? name : "null",
-               "an int64 value or range");
-    return -1;
-}
-
-static void
-start_list(Visitor *v, const char *name, GenericList **list, size_t size,
-           Error **errp)
+static void start_list(Visitor *v, const char *name, GenericList **list,
+                       size_t size, Error **errp)
 {
     StringInputVisitor *siv = to_siv(v);
 
-    /* We don't support visits without a list */
-    assert(list);
+    assert(siv->lm == LM_NONE);
     siv->list = list;
+    siv->unparsed_string = siv->string;
 
-    if (parse_str(siv, name, errp) < 0) {
-        *list = NULL;
-        return;
-    }
-
-    siv->cur_range = g_list_first(siv->ranges);
-    if (siv->cur_range) {
-        Range *r = siv->cur_range->data;
-        if (r) {
-            siv->cur = range_lob(r);
+    if (!siv->string[0]) {
+        if (list) {
+            *list = NULL;
         }
-        *list = g_malloc0(size);
+        siv->lm = LM_END;
     } else {
-        *list = NULL;
+        if (list) {
+            *list = g_malloc0(size);
+        }
+        siv->lm = LM_UNPARSED;
     }
 }
 
 static GenericList *next_list(Visitor *v, GenericList *tail, size_t size)
 {
     StringInputVisitor *siv = to_siv(v);
-    Range *r;
-
-    if (!siv->ranges || !siv->cur_range) {
-        return NULL;
-    }
 
-    r = siv->cur_range->data;
-    if (!r) {
+    switch (siv->lm) {
+    case LM_END:
         return NULL;
-    }
-
-    if (!range_contains(r, siv->cur)) {
-        siv->cur_range = g_list_next(siv->cur_range);
-        if (!siv->cur_range) {
-            return NULL;
-        }
-        r = siv->cur_range->data;
-        if (!r) {
-            return NULL;
-        }
-        siv->cur = range_lob(r);
+    case LM_INT64_RANGE:
+    case LM_UINT64_RANGE:
+    case LM_UNPARSED:
+        /* we have an unparsed string or something left in a range */
+        break;
+    default:
+        abort();
     }
 
     tail->next = g_malloc0(size);
@@ -179,88 +106,208 @@ static GenericList *next_list(Visitor *v, GenericList *tail, size_t size)
 static void check_list(Visitor *v, Error **errp)
 {
     const StringInputVisitor *siv = to_siv(v);
-    Range *r;
-    GList *cur_range;
 
-    if (!siv->ranges || !siv->cur_range) {
+    switch (siv->lm) {
+    case LM_INT64_RANGE:
+    case LM_UINT64_RANGE:
+    case LM_UNPARSED:
+        error_setg(errp, "Fewer list elements expected");
         return;
-    }
-
-    r = siv->cur_range->data;
-    if (!r) {
+    case LM_END:
         return;
+    default:
+        abort();
     }
-
-    if (!range_contains(r, siv->cur)) {
-        cur_range = g_list_next(siv->cur_range);
-        if (!cur_range) {
-            return;
-        }
-        r = cur_range->data;
-        if (!r) {
-            return;
-        }
-    }
-
-    error_setg(errp, "Range contains too many values");
 }
 
 static void end_list(Visitor *v, void **obj)
 {
     StringInputVisitor *siv = to_siv(v);
 
+    assert(siv->lm != LM_NONE);
     assert(siv->list == obj);
+    siv->list = NULL;
+    siv->unparsed_string = NULL;
+    siv->lm = LM_NONE;
+}
+
+static int try_parse_int64_list_entry(StringInputVisitor *siv, int64_t *obj)
+{
+    const char *endptr;
+    int64_t start, end;
+
+    /* parse a simple int64 or range */
+    if (qemu_strtoi64(siv->unparsed_string, &endptr, 0, &start)) {
+        return -EINVAL;
+    }
+    end = start;
+
+    switch (endptr[0]) {
+    case '\0':
+        siv->unparsed_string = endptr;
+        break;
+    case ',':
+        siv->unparsed_string = endptr + 1;
+        break;
+    case '-':
+        /* parse the end of the range */
+        if (qemu_strtoi64(endptr + 1, &endptr, 0, &end)) {
+            return -EINVAL;
+        }
+        if (start > end || end - start >= RANGE_MAX_ELEMENTS) {
+            return -EINVAL;
+        }
+        switch (endptr[0]) {
+        case '\0':
+            siv->unparsed_string = endptr;
+            break;
+        case ',':
+            siv->unparsed_string = endptr + 1;
+            break;
+        default:
+            return -EINVAL;
+        }
+        break;
+    default:
+        return -EINVAL;
+    }
+
+    /* we have a proper range (with maybe only one element) */
+    siv->lm = LM_INT64_RANGE;
+    siv->rangeNext.i64 = start;
+    siv->rangeEnd.i64 = end;
+    return 0;
 }
 
 static void parse_type_int64(Visitor *v, const char *name, int64_t *obj,
                              Error **errp)
 {
     StringInputVisitor *siv = to_siv(v);
-
-    if (parse_str(siv, name, errp) < 0) {
+    int64_t val;
+
+    switch (siv->lm) {
+    case LM_NONE:
+        /* just parse a simple int64, bail out if not completely consumed */
+        if (qemu_strtoi64(siv->string, NULL, 0, &val)) {
+                error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
+                           name ? name : "null", "int64");
+            return;
+        }
+        *obj = val;
         return;
+    case LM_UNPARSED:
+        if (try_parse_int64_list_entry(siv, obj)) {
+            error_setg(errp, QERR_INVALID_PARAMETER_VALUE, name ? name : "null",
+                       "list of int64 values or ranges");
+            return;
+        }
+        assert(siv->lm == LM_INT64_RANGE);
+        /* fall through */
+    case LM_INT64_RANGE:
+        /* return the next element in the range */
+        assert(siv->rangeNext.i64 <= siv->rangeEnd.i64);
+        *obj = siv->rangeNext.i64++;
+
+        if (siv->rangeNext.i64 > siv->rangeEnd.i64 || *obj == INT64_MAX) {
+            /* end of range, check if there is more to parse */
+            siv->lm = siv->unparsed_string[0] ? LM_UNPARSED : LM_END;
+        }
+        return;
+    case LM_END:
+        error_setg(errp, "Fewer list elements expected");
+        return;
+    default:
+        abort();
     }
+}
 
-    if (!siv->ranges) {
-        goto error;
-    }
-
-    if (!siv->cur_range) {
-        Range *r;
+static int try_parse_uint64_list_entry(StringInputVisitor *siv, uint64_t *obj)
+{
+    const char *endptr;
+    uint64_t start, end;
 
-        siv->cur_range = g_list_first(siv->ranges);
-        if (!siv->cur_range) {
-            goto error;
+    /* parse a simple uint64 or range */
+    if (qemu_strtou64(siv->unparsed_string, &endptr, 0, &start)) {
+        return -EINVAL;
+    }
+    end = start;
+
+    switch (endptr[0]) {
+    case '\0':
+        siv->unparsed_string = endptr;
+        break;
+    case ',':
+        siv->unparsed_string = endptr + 1;
+        break;
+    case '-':
+        /* parse the end of the range */
+        if (qemu_strtou64(endptr + 1, &endptr, 0, &end)) {
+            return -EINVAL;
         }
-
-        r = siv->cur_range->data;
-        if (!r) {
-            goto error;
+        if (start > end || end - start >= RANGE_MAX_ELEMENTS) {
+            return -EINVAL;
         }
-
-        siv->cur = range_lob(r);
+        switch (endptr[0]) {
+        case '\0':
+            siv->unparsed_string = endptr;
+            break;
+        case ',':
+            siv->unparsed_string = endptr + 1;
+            break;
+        default:
+            return -EINVAL;
+        }
+        break;
+    default:
+        return -EINVAL;
     }
 
-    *obj = siv->cur;
-    siv->cur++;
-    return;
-
-error:
-    error_setg(errp, QERR_INVALID_PARAMETER_VALUE, name ? name : "null",
-               "an int64 value or range");
+    /* we have a proper range (with maybe only one element) */
+    siv->lm = LM_UINT64_RANGE;
+    siv->rangeNext.u64 = start;
+    siv->rangeEnd.u64 = end;
+    return 0;
 }
 
 static void parse_type_uint64(Visitor *v, const char *name, uint64_t *obj,
                               Error **errp)
 {
-    /* FIXME: parse_type_int64 mishandles values over INT64_MAX */
-    int64_t i;
-    Error *err = NULL;
-    parse_type_int64(v, name, &i, &err);
-    if (err) {
-        error_propagate(errp, err);
-    } else {
-        *obj = i;
+    StringInputVisitor *siv = to_siv(v);
+    uint64_t val;
+
+    switch (siv->lm) {
+    case LM_NONE:
+        /* just parse a simple uint64, bail out if not completely consumed */
+        if (qemu_strtou64(siv->string, NULL, 0, &val)) {
+            error_setg(errp, QERR_INVALID_PARAMETER_VALUE, name ? name : "null",
+                       "uint64");
+            return;
+        }
+        *obj = val;
+        return;
+    case LM_UNPARSED:
+        if (try_parse_uint64_list_entry(siv, obj)) {
+            error_setg(errp, QERR_INVALID_PARAMETER_VALUE, name ? name : "null",
+                       "list of uint64 values or ranges");
+            return;
+        }
+        assert(siv->lm == LM_UINT64_RANGE);
+        /* fall through */
+    case LM_UINT64_RANGE:
+        /* return the next element in the range */
+        assert(siv->rangeNext.u64 <= siv->rangeEnd.u64);
+        *obj = siv->rangeNext.u64++;
+
+        if (siv->rangeNext.u64 > siv->rangeEnd.u64 || *obj == UINT64_MAX) {
+            /* end of range, check if there is more to parse */
+            siv->lm = siv->unparsed_string[0] ? LM_UNPARSED : LM_END;
+        }
+        return;
+    case LM_END:
+        error_setg(errp, "Fewer list elements expected");
+        return;
+    default:
+        abort();
     }
 }
 
@@ -271,6 +318,7 @@ static void parse_type_size(Visitor *v, const char *name, uint64_t *obj,
     Error *err = NULL;
     uint64_t val;
 
+    assert(siv->lm == LM_NONE);
     parse_option_size(name, siv->string, &val, &err);
     if (err) {
         error_propagate(errp, err);
@@ -285,6 +333,7 @@ static void parse_type_bool(Visitor *v, const char *name, bool *obj,
 {
     StringInputVisitor *siv = to_siv(v);
 
+    assert(siv->lm == LM_NONE);
     if (!strcasecmp(siv->string, "on") ||
         !strcasecmp(siv->string, "yes") ||
         !strcasecmp(siv->string, "true")) {
@@ -307,6 +356,7 @@ static void parse_type_str(Visitor *v, const char *name, char **obj,
 {
     StringInputVisitor *siv = to_siv(v);
 
+    assert(siv->lm == LM_NONE);
     *obj = g_strdup(siv->string);
 }
 
@@ -316,6 +366,7 @@ static void parse_type_number(Visitor *v, const char *name, double *obj,
     StringInputVisitor *siv = to_siv(v);
     double val;
 
+    assert(siv->lm == LM_NONE);
     if (qemu_strtod_finite(siv->string, NULL, &val)) {
         error_setg(errp, QERR_INVALID_PARAMETER_TYPE, name ? name : "null",
                    "number");
@@ -330,9 +381,10 @@ static void parse_type_null(Visitor *v, const char *name, QNull **obj,
 {
     StringInputVisitor *siv = to_siv(v);
 
+    assert(siv->lm == LM_NONE);
     *obj = NULL;
 
-    if (!siv->string || siv->string[0]) {
+    if (siv->string[0]) {
         error_setg(errp, QERR_INVALID_PARAMETER_TYPE, name ? name : "null",
                    "null");
         return;
@@ -345,8 +397,6 @@ static void string_input_free(Visitor *v)
 {
     StringInputVisitor *siv = to_siv(v);
 
-    g_list_foreach(siv->ranges, free_range, NULL);
-    g_list_free(siv->ranges);
     g_free(siv);
 }
 
@@ -372,5 +422,6 @@ Visitor *string_input_visitor_new(const char *str)
     v->visitor.free = string_input_free;
 
     v->string = str;
+    v->lm = LM_NONE;
     return &v->visitor;
 }
diff --git a/tests/test-string-input-visitor.c b/tests/test-string-input-visitor.c
index 8ee0d1b284..c5379365a6 100644
--- a/tests/test-string-input-visitor.c
+++ b/tests/test-string-input-visitor.c
@@ -92,16 +92,6 @@ static void check_ulist(Visitor *v, uint64_t *expected, size_t n)
     uint64List *tail;
     int i;
 
-    /* BUG: unsigned numbers above INT64_MAX don't work */
-    for (i = 0; i < n; i++) {
-        if (expected[i] > INT64_MAX) {
-            Error *err = NULL;
-            visit_type_uint64List(v, NULL, &res, &err);
-            error_free_or_abort(&err);
-            return;
-        }
-    }
-
     visit_type_uint64List(v, NULL, &res, &error_abort);
     tail = res;
     for (i = 0; i < n; i++) {
@@ -117,10 +107,10 @@ static void check_ulist(Visitor *v, uint64_t *expected, size_t n)
 static void test_visitor_in_intList(TestInputVisitorData *data,
                                     const void *unused)
 {
-    /* Note: the visitor *sorts* ranges *unsigned* */
-    int64_t expect1[] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 20 };
+    int64_t expect1[] = { 1, 2, 0, 2, 3, 4, 20, 5, 6, 7,
+                          8, 9, 1, 2, 3, 4, 5, 6, 7, 8 };
     int64_t expect2[] = { 32767, -32768, -32767 };
-    int64_t expect3[] = { INT64_MAX, INT64_MIN };
+    int64_t expect3[] = { INT64_MIN, INT64_MAX };
     int64_t expect4[] = { 1 };
     uint64_t expect5[] = { UINT64_MAX };
     Error *err = NULL;
@@ -226,7 +216,7 @@ static void test_visitor_in_intList(TestInputVisitorData *data,
     visit_type_int64(v, NULL, &tail->value, &err);
     g_assert_cmpint(tail->value, ==, 0);
     visit_type_int64(v, NULL, &val, &err);
-    g_assert_cmpint(val, ==, 1); /* BUG */
+    error_free_or_abort(&err);
     visit_check_list(v, &error_abort);
     visit_end_list(v, (void **)&res);
 
-- 
2.17.2


Re: [Qemu-devel] [PATCH v2 6/9] qapi: Rewrite string-input-visitor
Posted by Eric Blake 7 years, 2 months ago
On 11/20/18 3:25 AM, David Hildenbrand wrote:
> The input visitor has some problems right now, especially
> - unsigned type "Range" is used to process signed ranges, resulting in
>    inconsistent behavior and ugly/magical code
> - uint64_t are parsed like int64_t, so big uint64_t values are not
>    supported and error messages are misleading
> - lists/ranges of int64_t are accepted although no list is parsed and
>    we should rather report an error
> - lists/ranges are preparsed using int64_t, making it hard to
>    implement uint64_t values or uint64_t lists
> - types that don't support lists don't bail out
> - visiting beyond the end of a list is not handled properly
> - we don't actually parse lists, we parse *sets*: members are sorted,
>    and duplicates eliminated
> 
> So let's rewrite it by getting rid of usage of the type "Range" and
> properly supporting lists of int64_t and uint64_t (including ranges of
> both types), fixing the above mentioned issues.
> 
> Lists of other types are not supported and will properly report an
> error. Virtual walks are now supported.
> 
> Tests have to be fixed up:
> - Two BUGs were hardcoded that are fixed now
> - The string-input-visitor now actually returns a parsed list and not
>    an ordered set.
> 
> Please note that no users/callers have to be fixed up. Candiates using

s/Candiates/Candidates/

> visit_type_uint16List() and friends are:
> - backends/hostmem.c:host_memory_backend_set_host_nodes()
> -- Code can deal with dupilcates/unsorted lists

s/dupilcates/duplicates/

> - numa.c::query_memdev()
> -- via object_property_get_uint16List(), the list will still be sorted
>     and without duplicates (via host_memory_backend_get_host_nodes())
> - qapi-visit.c::visit_type_Memdev_members()
> - qapi-visit.c::visit_type_NumaNodeOptions_members()
> - qapi-visit.c::visit_type_RockerOfDpaGroup_members
> - qapi-visit.c::visit_type_RxFilterInfo_members()
> -- Not used with string-input-visitor.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>   include/qapi/string-input-visitor.h |   4 +-
>   qapi/string-input-visitor.c         | 405 ++++++++++++++++------------
>   tests/test-string-input-visitor.c   |  18 +-
>   3 files changed, 234 insertions(+), 193 deletions(-)
> 

>   struct StringInputVisitor
>   {
>       Visitor visitor;
>   
> -    GList *ranges;
> -    GList *cur_range;
> -    int64_t cur;
> +    /* List parsing state */
> +    ListMode lm;
> +    RangeElement rangeNext;
> +    RangeElement rangeEnd;
> +    const char *unparsed_string;
> +    void *list;
>   
> +    /* The original string to parse */
>       const char *string;
> -    void *list; /* Only needed for sanity checking the caller */
>   };
>   

Makes sense.

> @@ -179,88 +106,208 @@ static GenericList *next_list(Visitor *v, GenericList *tail, size_t size)
>   static void check_list(Visitor *v, Error **errp)
>   {
>       const StringInputVisitor *siv = to_siv(v);
> -    Range *r;
> -    GList *cur_range;
>   
> -    if (!siv->ranges || !siv->cur_range) {
> +    switch (siv->lm) {
> +    case LM_INT64_RANGE:
> +    case LM_UINT64_RANGE:
> +    case LM_UNPARSED:
> +        error_setg(errp, "Fewer list elements expected");

Bike-shedding - I don't know if "Too many list elements supplied" would 
make the error any more legible.

>   
>   static void parse_type_int64(Visitor *v, const char *name, int64_t *obj,
>                                Error **errp)
>   {
>       StringInputVisitor *siv = to_siv(v);
> -
> -    if (parse_str(siv, name, errp) < 0) {
> +    int64_t val;
> +
> +    switch (siv->lm) {
> +    case LM_NONE:
> +        /* just parse a simple int64, bail out if not completely consumed */
> +        if (qemu_strtoi64(siv->string, NULL, 0, &val)) {
> +                error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
> +                           name ? name : "null", "int64");
> +            return;
> +        }
> +        *obj = val;
>           return;
> +    case LM_UNPARSED:
> +        if (try_parse_int64_list_entry(siv, obj)) {
> +            error_setg(errp, QERR_INVALID_PARAMETER_VALUE, name ? name : "null",
> +                       "list of int64 values or ranges");

The error message might be a bit misleading for a range larger than 64k, 
but that's not too bad.

> +            return;
> +        }
> +        assert(siv->lm == LM_INT64_RANGE);
> +        /* fall through */
> +    case LM_INT64_RANGE:
> +        /* return the next element in the range */
> +        assert(siv->rangeNext.i64 <= siv->rangeEnd.i64);
> +        *obj = siv->rangeNext.i64++;
> +
> +        if (siv->rangeNext.i64 > siv->rangeEnd.i64 || *obj == INT64_MAX) {

I think our compiler options guarantee that we have sane signed 
wraparound and thus this is a safe comparison on overflow; but if you 
were to swap it so that the *obj == INT64_MAX check is performed first, 
you wouldn't even have to debate about whether undefined C semantics are 
being invoked.

> +            /* end of range, check if there is more to parse */
> +            siv->lm = siv->unparsed_string[0] ? LM_UNPARSED : LM_END;
> +        }
> +        return;
> +    case LM_END:
> +        error_setg(errp, "Fewer list elements expected");

Again, bikeshedding if "too many list elements supplied" would make any 
more sense.

> +static int try_parse_uint64_list_entry(StringInputVisitor *siv, uint64_t *obj)
> +{
> +    const char *endptr;
> +    uint64_t start, end;
>   
> -        siv->cur_range = g_list_first(siv->ranges);
> -        if (!siv->cur_range) {
> -            goto error;
> +    /* parse a simple uint64 or range */
> +    if (qemu_strtou64(siv->unparsed_string, &endptr, 0, &start)) {

Lots of duplication between the signed and unsigned variants.  But I 
don't see any easy way to factor it out into a common helper, as there 
are just too many places where signed vs. unsigned does not easily lend 
itself to common code.

> @@ -330,9 +381,10 @@ static void parse_type_null(Visitor *v, const char *name, QNull **obj,
>   {
>       StringInputVisitor *siv = to_siv(v);
>   
> +    assert(siv->lm == LM_NONE);
>       *obj = NULL;
>   
> -    if (!siv->string || siv->string[0]) {
> +    if (siv->string[0]) {

Why did this condition change?

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [Qemu-devel] [PATCH v2 6/9] qapi: Rewrite string-input-visitor
Posted by Markus Armbruster 7 years, 2 months ago
I think the title should be something like

    qapi: Rewrite string-input-visitor's integer and list parsing

because you don't actually rewrite all of it.

Eric Blake <eblake@redhat.com> writes:

> On 11/20/18 3:25 AM, David Hildenbrand wrote:
>> The input visitor has some problems right now, especially
>> - unsigned type "Range" is used to process signed ranges, resulting in
>>    inconsistent behavior and ugly/magical code
>> - uint64_t are parsed like int64_t, so big uint64_t values are not
>>    supported and error messages are misleading
>> - lists/ranges of int64_t are accepted although no list is parsed and
>>    we should rather report an error
>> - lists/ranges are preparsed using int64_t, making it hard to
>>    implement uint64_t values or uint64_t lists
>> - types that don't support lists don't bail out
>> - visiting beyond the end of a list is not handled properly
>> - we don't actually parse lists, we parse *sets*: members are sorted,
>>    and duplicates eliminated
>>
>> So let's rewrite it by getting rid of usage of the type "Range" and
>> properly supporting lists of int64_t and uint64_t (including ranges of
>> both types), fixing the above mentioned issues.
>>
>> Lists of other types are not supported and will properly report an
>> error. Virtual walks are now supported.
>>
>> Tests have to be fixed up:
>> - Two BUGs were hardcoded that are fixed now
>> - The string-input-visitor now actually returns a parsed list and not
>>    an ordered set.
>>
>> Please note that no users/callers have to be fixed up. Candiates using
>
> s/Candiates/Candidates/
>
>> visit_type_uint16List() and friends are:
>> - backends/hostmem.c:host_memory_backend_set_host_nodes()
>> -- Code can deal with dupilcates/unsorted lists
>
> s/dupilcates/duplicates/
>
>> - numa.c::query_memdev()
>> -- via object_property_get_uint16List(), the list will still be sorted
>>     and without duplicates (via host_memory_backend_get_host_nodes())
>> - qapi-visit.c::visit_type_Memdev_members()
>> - qapi-visit.c::visit_type_NumaNodeOptions_members()
>> - qapi-visit.c::visit_type_RockerOfDpaGroup_members
>> - qapi-visit.c::visit_type_RxFilterInfo_members()
>> -- Not used with string-input-visitor.
>>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>   include/qapi/string-input-visitor.h |   4 +-
>>   qapi/string-input-visitor.c         | 405 ++++++++++++++++------------
>>   tests/test-string-input-visitor.c   |  18 +-
>>   3 files changed, 234 insertions(+), 193 deletions(-)
>>
>
>>   struct StringInputVisitor
>>   {
>>       Visitor visitor;
>>   -    GList *ranges;
>> -    GList *cur_range;
>> -    int64_t cur;
>> +    /* List parsing state */
>> +    ListMode lm;
>> +    RangeElement rangeNext;
>> +    RangeElement rangeEnd;
>> +    const char *unparsed_string;
>> +    void *list;
>>   +    /* The original string to parse */
>>       const char *string;
>> -    void *list; /* Only needed for sanity checking the caller */
>>   };
>>   
>
> Makes sense.
>
>> @@ -179,88 +106,208 @@ static GenericList *next_list(Visitor *v, GenericList *tail, size_t size)
>>   static void check_list(Visitor *v, Error **errp)
>>   {
>>       const StringInputVisitor *siv = to_siv(v);
>> -    Range *r;
>> -    GList *cur_range;
>>   -    if (!siv->ranges || !siv->cur_range) {
>> +    switch (siv->lm) {
>> +    case LM_INT64_RANGE:
>> +    case LM_UINT64_RANGE:
>> +    case LM_UNPARSED:
>> +        error_setg(errp, "Fewer list elements expected");
>
> Bike-shedding - I don't know if "Too many list elements supplied"
> would make the error any more legible.

See my review of PATCH RFC 3/6:

    Hmm.  qobject_input_check_list() reports "Only %u list elements expected
    in %s" here.  Looks unintuitive, until you remember how it's normally
    used: to parse user input.  The error gets reported when the parsing
    didn't consume the whole list.  Can only happen with a virtual walk.
    And when it happens, the issue to report is "you provided a longer list
    than I can accept".  qobject_input_check_list() does exactly that.  Your
    message is less clear.

David didn't feel counting elements just for that was worthwhile here,
and I agreed.

>>     static void parse_type_int64(Visitor *v, const char *name,
>> int64_t *obj,
>>                                Error **errp)
>>   {
>>       StringInputVisitor *siv = to_siv(v);
>> -
>> -    if (parse_str(siv, name, errp) < 0) {
>> +    int64_t val;
>> +
>> +    switch (siv->lm) {
>> +    case LM_NONE:
>> +        /* just parse a simple int64, bail out if not completely consumed */
>> +        if (qemu_strtoi64(siv->string, NULL, 0, &val)) {
>> +                error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
>> +                           name ? name : "null", "int64");
>> +            return;
>> +        }
>> +        *obj = val;
>>           return;
>> +    case LM_UNPARSED:
>> +        if (try_parse_int64_list_entry(siv, obj)) {
>> +            error_setg(errp, QERR_INVALID_PARAMETER_VALUE, name ? name : "null",
>> +                       "list of int64 values or ranges");
>
> The error message might be a bit misleading for a range larger than
> 64k, but that's not too bad.
>
>> +            return;
>> +        }
>> +        assert(siv->lm == LM_INT64_RANGE);
>> +        /* fall through */
>> +    case LM_INT64_RANGE:
>> +        /* return the next element in the range */
>> +        assert(siv->rangeNext.i64 <= siv->rangeEnd.i64);
>> +        *obj = siv->rangeNext.i64++;
>> +
>> +        if (siv->rangeNext.i64 > siv->rangeEnd.i64 || *obj == INT64_MAX) {
>
> I think our compiler options guarantee that we have sane signed
> wraparound and thus this is a safe comparison on overflow; but if you
> were to swap it so that the *obj == INT64_MAX check is performed
> first, you wouldn't even have to debate about whether undefined C
> semantics are being invoked.
>
>> +            /* end of range, check if there is more to parse */
>> +            siv->lm = siv->unparsed_string[0] ? LM_UNPARSED : LM_END;
>> +        }
>> +        return;
>> +    case LM_END:
>> +        error_setg(errp, "Fewer list elements expected");
>
> Again, bikeshedding if "too many list elements supplied" would make
> any more sense.

Same argument.

>> +static int try_parse_uint64_list_entry(StringInputVisitor *siv, uint64_t *obj)
>> +{
>> +    const char *endptr;
>> +    uint64_t start, end;
>>   -        siv->cur_range = g_list_first(siv->ranges);
>> -        if (!siv->cur_range) {
>> -            goto error;
>> +    /* parse a simple uint64 or range */
>> +    if (qemu_strtou64(siv->unparsed_string, &endptr, 0, &start)) {
>
> Lots of duplication between the signed and unsigned variants.  But I
> don't see any easy way to factor it out into a common helper, as there
> are just too many places where signed vs. unsigned does not easily
> lend itself to common code.

Yes.

>> @@ -330,9 +381,10 @@ static void parse_type_null(Visitor *v, const char *name, QNull **obj,
>>   {
>>       StringInputVisitor *siv = to_siv(v);
>>   +    assert(siv->lm == LM_NONE);
>>       *obj = NULL;
>>   -    if (!siv->string || siv->string[0]) {
>> +    if (siv->string[0]) {
>
> Why did this condition change?

As far as I can tell, siv->string can't ever be null.  Sticking the
change into this patch is perhaps debatable.  I'm okay with it.

> Reviewed-by: Eric Blake <eblake@redhat.com>

With the commit message improved once more:
Reviewed-by: Markus Armbruster <armbru@redhat.com>

Re: [Qemu-devel] [PATCH v2 6/9] qapi: Rewrite string-input-visitor
Posted by David Hildenbrand 7 years, 2 months ago
On 20.11.18 21:58, Markus Armbruster wrote:
> I think the title should be something like
> 
>     qapi: Rewrite string-input-visitor's integer and list parsing
> 
> because you don't actually rewrite all of it.
> 
> Eric Blake <eblake@redhat.com> writes:
> 
>> On 11/20/18 3:25 AM, David Hildenbrand wrote:
>>> The input visitor has some problems right now, especially
>>> - unsigned type "Range" is used to process signed ranges, resulting in
>>>    inconsistent behavior and ugly/magical code
>>> - uint64_t are parsed like int64_t, so big uint64_t values are not
>>>    supported and error messages are misleading
>>> - lists/ranges of int64_t are accepted although no list is parsed and
>>>    we should rather report an error
>>> - lists/ranges are preparsed using int64_t, making it hard to
>>>    implement uint64_t values or uint64_t lists
>>> - types that don't support lists don't bail out
>>> - visiting beyond the end of a list is not handled properly
>>> - we don't actually parse lists, we parse *sets*: members are sorted,
>>>    and duplicates eliminated
>>>
>>> So let's rewrite it by getting rid of usage of the type "Range" and
>>> properly supporting lists of int64_t and uint64_t (including ranges of
>>> both types), fixing the above mentioned issues.
>>>
>>> Lists of other types are not supported and will properly report an
>>> error. Virtual walks are now supported.
>>>
>>> Tests have to be fixed up:
>>> - Two BUGs were hardcoded that are fixed now
>>> - The string-input-visitor now actually returns a parsed list and not
>>>    an ordered set.
>>>
>>> Please note that no users/callers have to be fixed up. Candiates using
>>
>> s/Candiates/Candidates/
>>
>>> visit_type_uint16List() and friends are:
>>> - backends/hostmem.c:host_memory_backend_set_host_nodes()
>>> -- Code can deal with dupilcates/unsorted lists
>>
>> s/dupilcates/duplicates/

Thanks, both fixed.

> 
>>> @@ -330,9 +381,10 @@ static void parse_type_null(Visitor *v, const char *name, QNull **obj,
>>>   {
>>>       StringInputVisitor *siv = to_siv(v);
>>>   +    assert(siv->lm == LM_NONE);
>>>       *obj = NULL;
>>>   -    if (!siv->string || siv->string[0]) {
>>> +    if (siv->string[0]) {
>>
>> Why did this condition change?
> 
> As far as I can tell, siv->string can't ever be null.  Sticking the
> change into this patch is perhaps debatable.  I'm okay with it.

Yes, we have an assertion when creating the visitor. Do you want me to
pull this into a separate patch?

(It made sense under the old patch subject ;) )

> 
>> Reviewed-by: Eric Blake <eblake@redhat.com>
> 
> With the commit message improved once more:
> Reviewed-by: Markus Armbruster <armbru@redhat.com>
> 
Thanks!

-- 

Thanks,

David / dhildenb

Re: [Qemu-devel] [PATCH v2 6/9] qapi: Rewrite string-input-visitor
Posted by Markus Armbruster 7 years, 2 months ago
David Hildenbrand <david@redhat.com> writes:

> On 20.11.18 21:58, Markus Armbruster wrote:
>> I think the title should be something like
>> 
>>     qapi: Rewrite string-input-visitor's integer and list parsing
>> 
>> because you don't actually rewrite all of it.
>> 
>> Eric Blake <eblake@redhat.com> writes:
>> 
>>> On 11/20/18 3:25 AM, David Hildenbrand wrote:
>>>> The input visitor has some problems right now, especially
>>>> - unsigned type "Range" is used to process signed ranges, resulting in
>>>>    inconsistent behavior and ugly/magical code
>>>> - uint64_t are parsed like int64_t, so big uint64_t values are not
>>>>    supported and error messages are misleading
>>>> - lists/ranges of int64_t are accepted although no list is parsed and
>>>>    we should rather report an error
>>>> - lists/ranges are preparsed using int64_t, making it hard to
>>>>    implement uint64_t values or uint64_t lists
>>>> - types that don't support lists don't bail out
>>>> - visiting beyond the end of a list is not handled properly
>>>> - we don't actually parse lists, we parse *sets*: members are sorted,
>>>>    and duplicates eliminated
>>>>
>>>> So let's rewrite it by getting rid of usage of the type "Range" and
>>>> properly supporting lists of int64_t and uint64_t (including ranges of
>>>> both types), fixing the above mentioned issues.
>>>>
>>>> Lists of other types are not supported and will properly report an
>>>> error. Virtual walks are now supported.
>>>>
>>>> Tests have to be fixed up:
>>>> - Two BUGs were hardcoded that are fixed now
>>>> - The string-input-visitor now actually returns a parsed list and not
>>>>    an ordered set.
>>>>
>>>> Please note that no users/callers have to be fixed up. Candiates using
>>>
>>> s/Candiates/Candidates/
>>>
>>>> visit_type_uint16List() and friends are:
>>>> - backends/hostmem.c:host_memory_backend_set_host_nodes()
>>>> -- Code can deal with dupilcates/unsorted lists
>>>
>>> s/dupilcates/duplicates/
>
> Thanks, both fixed.
>
>> 
>>>> @@ -330,9 +381,10 @@ static void parse_type_null(Visitor *v, const char *name, QNull **obj,
>>>>   {
>>>>       StringInputVisitor *siv = to_siv(v);
>>>>   +    assert(siv->lm == LM_NONE);
>>>>       *obj = NULL;
>>>>   -    if (!siv->string || siv->string[0]) {
>>>> +    if (siv->string[0]) {
>>>
>>> Why did this condition change?
>> 
>> As far as I can tell, siv->string can't ever be null.  Sticking the
>> change into this patch is perhaps debatable.  I'm okay with it.
>
> Yes, we have an assertion when creating the visitor. Do you want me to
> pull this into a separate patch?

I'm okay with it as is.  Up to you.

> (It made sense under the old patch subject ;) )

It did :)

>>> Reviewed-by: Eric Blake <eblake@redhat.com>
>> 
>> With the commit message improved once more:
>> Reviewed-by: Markus Armbruster <armbru@redhat.com>
>> 
> Thanks!