[v1] qapi: statically type schema.py

[PATCH 15/19] qapi/parser: demote QAPIExpression to Dict[str, Any]

Posted by John Snow 2 years, 2 months ago

Dict[str, object] is a stricter type, but with the way that code is
currently arranged, it is infeasible to enforce this strictness.

In particular, although expr.py's entire raison d'être is normalization
and type-checking of QAPI Expressions, that type information is not
"remembered" in any meaningful way by mypy because each individual
expression is not downcast to a specific expression type that holds all
the details of each expression's unique form.

As a result, all of the code in schema.py that deals with actually
creating type-safe specialized structures has no guarantee (myopically)
that the data it is being passed is correct.

There are two ways to solve this:

(1) Re-assert that the incoming data is in the shape we expect it to be, or
(2) Disable type checking for this data.

(1) is appealing to my sense of strictness, but I gotta concede that it
is asinine to re-check the shape of a QAPIExpression in schema.py when
expr.py has just completed that work at length. The duplication of code
and the nightmare thought of needing to update both locations if and
when we change the shape of these structures makes me extremely
reluctant to go down this route.

(2) allows us the chance to miss updating types in the case that types
are updated in expr.py, but it *is* an awful lot simpler and,
importantly, gets us closer to type checking schema.py *at
all*. Something is better than nothing, I'd argue.

So, do the simpler dumber thing and worry about future strictness
improvements later.

Signed-off-by: John Snow <jsnow@redhat.com>
---
 scripts/qapi/parser.py | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/scripts/qapi/parser.py b/scripts/qapi/parser.py
index bf31018aef0..b7f08cf36f2 100644
--- a/scripts/qapi/parser.py
+++ b/scripts/qapi/parser.py
@@ -19,6 +19,7 @@
 import re
 from typing import (
     TYPE_CHECKING,
+    Any,
     Dict,
     List,
     Mapping,
@@ -43,7 +44,7 @@
 _ExprValue = Union[List[object], Dict[str, object], str, bool]
 
 
-class QAPIExpression(Dict[str, object]):
+class QAPIExpression(Dict[str, Any]):
     # pylint: disable=too-few-public-methods
     def __init__(self,
                  data: Mapping[str, object],
-- 
2.41.0

Re: [PATCH 15/19] qapi/parser: demote QAPIExpression to Dict[str, Any]

Posted by Markus Armbruster 2 years, 2 months ago

John Snow <jsnow@redhat.com> writes:

> Dict[str, object] is a stricter type, but with the way that code is
> currently arranged, it is infeasible to enforce this strictness.
>
> In particular, although expr.py's entire raison d'être is normalization
> and type-checking of QAPI Expressions, that type information is not
> "remembered" in any meaningful way by mypy because each individual
> expression is not downcast to a specific expression type that holds all
> the details of each expression's unique form.
>
> As a result, all of the code in schema.py that deals with actually
> creating type-safe specialized structures has no guarantee (myopically)
> that the data it is being passed is correct.
>
> There are two ways to solve this:
>
> (1) Re-assert that the incoming data is in the shape we expect it to be, or
> (2) Disable type checking for this data.
>
> (1) is appealing to my sense of strictness, but I gotta concede that it
> is asinine to re-check the shape of a QAPIExpression in schema.py when
> expr.py has just completed that work at length. The duplication of code
> and the nightmare thought of needing to update both locations if and
> when we change the shape of these structures makes me extremely
> reluctant to go down this route.
>
> (2) allows us the chance to miss updating types in the case that types
> are updated in expr.py, but it *is* an awful lot simpler and,
> importantly, gets us closer to type checking schema.py *at
> all*. Something is better than nothing, I'd argue.
>
> So, do the simpler dumber thing and worry about future strictness
> improvements later.

Yes.

While Dict[str, object] is stricter than Dict[str, Any], both are miles
away from the actual, recursive type.

> Signed-off-by: John Snow <jsnow@redhat.com>
> ---
>  scripts/qapi/parser.py | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/scripts/qapi/parser.py b/scripts/qapi/parser.py
> index bf31018aef0..b7f08cf36f2 100644
> --- a/scripts/qapi/parser.py
> +++ b/scripts/qapi/parser.py
> @@ -19,6 +19,7 @@
>  import re
>  from typing import (
>      TYPE_CHECKING,
> +    Any,
>      Dict,
>      List,
>      Mapping,
> @@ -43,7 +44,7 @@
>  _ExprValue = Union[List[object], Dict[str, object], str, bool]
>  
>  
> -class QAPIExpression(Dict[str, object]):
> +class QAPIExpression(Dict[str, Any]):
>      # pylint: disable=too-few-public-methods
>      def __init__(self,
>                   data: Mapping[str, object],

There are several occurences of Dict[str, object] elsewhere.  Would your
argument for dumbing down QAPIExpression apply to (some of) them, too?

Skimming them, I found this in introspect.py:

    # These types are based on structures defined in QEMU's schema, so we
    # lack precise types for them here. Python 3.6 does not offer
    # TypedDict constructs, so they are broadly typed here as simple
    # Python Dicts.
    SchemaInfo = Dict[str, object]
    SchemaInfoEnumMember = Dict[str, object]
    SchemaInfoObject = Dict[str, object]
    SchemaInfoObjectVariant = Dict[str, object]
    SchemaInfoObjectMember = Dict[str, object]
    SchemaInfoCommand = Dict[str, object]

Can we do better now we have 3.8?

Re: [PATCH 15/19] qapi/parser: demote QAPIExpression to Dict[str, Any]

Posted by John Snow 2 years, 1 month ago

On Thu, Nov 23, 2023 at 9:12 AM Markus Armbruster <armbru@redhat.com> wrote:
>
> John Snow <jsnow@redhat.com> writes:
>
> > Dict[str, object] is a stricter type, but with the way that code is
> > currently arranged, it is infeasible to enforce this strictness.
> >
> > In particular, although expr.py's entire raison d'être is normalization
> > and type-checking of QAPI Expressions, that type information is not
> > "remembered" in any meaningful way by mypy because each individual
> > expression is not downcast to a specific expression type that holds all
> > the details of each expression's unique form.
> >
> > As a result, all of the code in schema.py that deals with actually
> > creating type-safe specialized structures has no guarantee (myopically)
> > that the data it is being passed is correct.
> >
> > There are two ways to solve this:
> >
> > (1) Re-assert that the incoming data is in the shape we expect it to be, or
> > (2) Disable type checking for this data.
> >
> > (1) is appealing to my sense of strictness, but I gotta concede that it
> > is asinine to re-check the shape of a QAPIExpression in schema.py when
> > expr.py has just completed that work at length. The duplication of code
> > and the nightmare thought of needing to update both locations if and
> > when we change the shape of these structures makes me extremely
> > reluctant to go down this route.
> >
> > (2) allows us the chance to miss updating types in the case that types
> > are updated in expr.py, but it *is* an awful lot simpler and,
> > importantly, gets us closer to type checking schema.py *at
> > all*. Something is better than nothing, I'd argue.
> >
> > So, do the simpler dumber thing and worry about future strictness
> > improvements later.
>
> Yes.
>

(You were right, again.)

> While Dict[str, object] is stricter than Dict[str, Any], both are miles
> away from the actual, recursive type.
>
> > Signed-off-by: John Snow <jsnow@redhat.com>
> > ---
> >  scripts/qapi/parser.py | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/scripts/qapi/parser.py b/scripts/qapi/parser.py
> > index bf31018aef0..b7f08cf36f2 100644
> > --- a/scripts/qapi/parser.py
> > +++ b/scripts/qapi/parser.py
> > @@ -19,6 +19,7 @@
> >  import re
> >  from typing import (
> >      TYPE_CHECKING,
> > +    Any,
> >      Dict,
> >      List,
> >      Mapping,
> > @@ -43,7 +44,7 @@
> >  _ExprValue = Union[List[object], Dict[str, object], str, bool]
> >
> >
> > -class QAPIExpression(Dict[str, object]):
> > +class QAPIExpression(Dict[str, Any]):
> >      # pylint: disable=too-few-public-methods
> >      def __init__(self,
> >                   data: Mapping[str, object],
>
> There are several occurences of Dict[str, object] elsewhere.  Would your
> argument for dumbing down QAPIExpression apply to (some of) them, too?

When and if they piss me off, sure. I'm just wary of making the types
too permissive because it can obscure typing errors; by using Any, you
really disable any further checks and might lead to false confidence
in the static checker. I still have a weird grudge against Any and
would like to fully eliminate it from any statically checked Python
code, but it's just not always feasible and I have to admit that "good
enough" is good enough. Doesn't have me running to lessen the
strictness in areas that didn't cause me pain, though...

> Skimming them, I found this in introspect.py:
>
>     # These types are based on structures defined in QEMU's schema, so we
>     # lack precise types for them here. Python 3.6 does not offer
>     # TypedDict constructs, so they are broadly typed here as simple
>     # Python Dicts.
>     SchemaInfo = Dict[str, object]
>     SchemaInfoEnumMember = Dict[str, object]
>     SchemaInfoObject = Dict[str, object]
>     SchemaInfoObjectVariant = Dict[str, object]
>     SchemaInfoObjectMember = Dict[str, object]
>     SchemaInfoCommand = Dict[str, object]
>
> Can we do better now we have 3.8?

A little bit, but it involves reproducing these types -- which are
ultimately meant to represent QAPI types defined in introspect.json --
with "redundant" type info. i.e. I have to reproduce the existing type
definitions in Python-ese, and then we have the maintenance burden of
making sure they match.

Maybe too much work to come up with a crazy dynamic definition thing
where we take the QAPI definition and build Python types from them ...
without some pretty interesting work to avoid the Ouroboros that'd
result. introspection.py wants static types based on types defined
dynamically by the schema definition; but we are not guaranteed to
have a suitable schema with these types at all. I'm not sure how to
express this kind of dependency without some interesting re-work. This
is a rare circumstance of the QAPI generator relying on the contents
of the Schema to provide static type assistance.

Now, I COULD do it statically, since I don't expect these types to
change much, but I'm wary of how quickly it might get out of hand
trying to achieve better type specificity.

General impression: "Not worth the hassle for this series, but we can
discuss proposals for future improvements"

Re: [PATCH 15/19] qapi/parser: demote QAPIExpression to Dict[str, Any]

Posted by Markus Armbruster 2 years, 1 month ago

John Snow <jsnow@redhat.com> writes:

> On Thu, Nov 23, 2023 at 9:12 AM Markus Armbruster <armbru@redhat.com> wrote:
>>
>> John Snow <jsnow@redhat.com> writes:
>>
>> > Dict[str, object] is a stricter type, but with the way that code is
>> > currently arranged, it is infeasible to enforce this strictness.
>> >
>> > In particular, although expr.py's entire raison d'être is normalization
>> > and type-checking of QAPI Expressions, that type information is not
>> > "remembered" in any meaningful way by mypy because each individual
>> > expression is not downcast to a specific expression type that holds all
>> > the details of each expression's unique form.
>> >
>> > As a result, all of the code in schema.py that deals with actually
>> > creating type-safe specialized structures has no guarantee (myopically)
>> > that the data it is being passed is correct.
>> >
>> > There are two ways to solve this:
>> >
>> > (1) Re-assert that the incoming data is in the shape we expect it to be, or
>> > (2) Disable type checking for this data.
>> >
>> > (1) is appealing to my sense of strictness, but I gotta concede that it
>> > is asinine to re-check the shape of a QAPIExpression in schema.py when
>> > expr.py has just completed that work at length. The duplication of code
>> > and the nightmare thought of needing to update both locations if and
>> > when we change the shape of these structures makes me extremely
>> > reluctant to go down this route.
>> >
>> > (2) allows us the chance to miss updating types in the case that types
>> > are updated in expr.py, but it *is* an awful lot simpler and,
>> > importantly, gets us closer to type checking schema.py *at
>> > all*. Something is better than nothing, I'd argue.
>> >
>> > So, do the simpler dumber thing and worry about future strictness
>> > improvements later.
>>
>> Yes.
>
> (You were right, again.)

Occasionally happens ;)

>> While Dict[str, object] is stricter than Dict[str, Any], both are miles
>> away from the actual, recursive type.
>>
>> > Signed-off-by: John Snow <jsnow@redhat.com>
>> > ---
>> >  scripts/qapi/parser.py | 3 ++-
>> >  1 file changed, 2 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/scripts/qapi/parser.py b/scripts/qapi/parser.py
>> > index bf31018aef0..b7f08cf36f2 100644
>> > --- a/scripts/qapi/parser.py
>> > +++ b/scripts/qapi/parser.py
>> > @@ -19,6 +19,7 @@
>> >  import re
>> >  from typing import (
>> >      TYPE_CHECKING,
>> > +    Any,
>> >      Dict,
>> >      List,
>> >      Mapping,
>> > @@ -43,7 +44,7 @@
>> >  _ExprValue = Union[List[object], Dict[str, object], str, bool]
>> >
>> >
>> > -class QAPIExpression(Dict[str, object]):
>> > +class QAPIExpression(Dict[str, Any]):
>> >      # pylint: disable=too-few-public-methods
>> >      def __init__(self,
>> >                   data: Mapping[str, object],
>>
>> There are several occurences of Dict[str, object] elsewhere.  Would your
>> argument for dumbing down QAPIExpression apply to (some of) them, too?
>
> When and if they piss me off, sure. I'm just wary of making the types
> too permissive because it can obscure typing errors; by using Any, you
> really disable any further checks and might lead to false confidence
> in the static checker. I still have a weird grudge against Any and
> would like to fully eliminate it from any statically checked Python
> code, but it's just not always feasible and I have to admit that "good
> enough" is good enough. Doesn't have me running to lessen the
> strictness in areas that didn't cause me pain, though...
>
>> Skimming them, I found this in introspect.py:
>>
>>     # These types are based on structures defined in QEMU's schema, so we
>>     # lack precise types for them here. Python 3.6 does not offer
>>     # TypedDict constructs, so they are broadly typed here as simple
>>     # Python Dicts.
>>     SchemaInfo = Dict[str, object]
>>     SchemaInfoEnumMember = Dict[str, object]
>>     SchemaInfoObject = Dict[str, object]
>>     SchemaInfoObjectVariant = Dict[str, object]
>>     SchemaInfoObjectMember = Dict[str, object]
>>     SchemaInfoCommand = Dict[str, object]
>>
>> Can we do better now we have 3.8?
>
> A little bit, but it involves reproducing these types -- which are
> ultimately meant to represent QAPI types defined in introspect.json --
> with "redundant" type info. i.e. I have to reproduce the existing type
> definitions in Python-ese, and then we have the maintenance burden of
> making sure they match.
>
> Maybe too much work to come up with a crazy dynamic definition thing
> where we take the QAPI definition and build Python types from them ...
> without some pretty interesting work to avoid the Ouroboros that'd
> result. introspection.py wants static types based on types defined
> dynamically by the schema definition; but we are not guaranteed to
> have a suitable schema with these types at all. I'm not sure how to
> express this kind of dependency without some interesting re-work. This
> is a rare circumstance of the QAPI generator relying on the contents
> of the Schema to provide static type assistance.
>
> Now, I COULD do it statically, since I don't expect these types to
> change much, but I'm wary of how quickly it might get out of hand
> trying to achieve better type specificity.
>
> General impression: "Not worth the hassle for this series, but we can
> discuss proposals for future improvements"

Fair enough.

Thanks!