[PATCH libvirt-python 0/5] Fixes from adding type annotation

Philipp Hahn posted 5 patches 3 years, 11 months ago
Failed in applying to current master (apply log)
generator.py                          |  6 +++---
libvirt-override-virDomainSnapshot.py |  2 +-
libvirt-qemu-override-api.xml         |  4 ++--
libvirtaio.py                         | 18 ++++++++++++------
4 files changed, 18 insertions(+), 12 deletions(-)
[PATCH libvirt-python 0/5] Fixes from adding type annotation
Posted by Philipp Hahn 3 years, 11 months ago
Hello,

as announced a long time ago with
<https://www.redhat.com/archives/libvir-list/2018-November/msg00291.html>
and recently refreshed with
<https://www.redhat.com/archives/libvir-list/2020-April/msg00892.html>
I'm working on adding PEP 484 type hints
<https://www.python.org/dev/peps/pep-0484/> to the Python binding of
libvirt.

I have finished this work now and have a working version at
<https://github.com/univention/libvirt-python/tree/typing> which
consists of 90 patches in total as I has to go over evry file to
understand and fix all things.

As that patch bomb is quiet large I'm going to submit them in smaller
chunks to make them more reviewable. Today I start with the first round
consisting of "real" bugs in the current code:

Philipp Hahn (5):
  generator: Fix undefined variables file
  generator: Fix string formatting
  generator: Fix domainSnapshot.listAllChildren()
  qemu-api: Fix return type
  libvirtaio: Fix return types of callback


 generator.py                          |  6 +++---
 libvirt-override-virDomainSnapshot.py |  2 +-
 libvirt-qemu-override-api.xml         |  4 ++--
 libvirtaio.py                         | 18 ++++++++++++------
 4 files changed, 18 insertions(+), 12 deletions(-)


After that I plan to continue with:

2. fix examples/ to work with Python 3
3. Cleanup code tree-wide
4. Cleanup generator.py
5. Cleanup sanitytest.py
6. Teach generator.py to add PEP 484 annotation
7. Assorted cleanups

(the order and chunking is not final yet)

> examples/domipaddrs: Convert to python 3 print()
> examples/domipaddrs: Fix Python 2 dict.iteritems()
> examples/*: Remove stray semicolon
> example/dhcp*: Fix None comparison
> examples/event-test: Remove unneeded global statement
> examples/event-test: Work with old version of python-libvirt
> examples/event-test: Use atexit for Python 3
> examples/esxlist: Fix Python 2 raw_input()
> examples/consolecallback: Add var to save callback
> examples/consolecallback: Fix assorted errors
> examples: Add missing return values

> libvirtaio: Drop object(*args, **kwargs)
> libvirtaio: Fix return type
> libvirtaio: assert callback type

> Do not use bare except
> Cleanup imports
> Fix white space
> Remove legacy libvirtError arguments

> stream: Fix exception traceback handling
> override: Simplify exception handling
> generator: Simplify exception handling
> generator: Change type of quiet to bool
> generator: Remove unneeded line continuation
> generator: Convert to 'not in' and 'is not'
> generator: Remove dead variable assignments
> generator: Remove skipped_modules
> generator: Remove useless sort key
> generator: Fix return type on failure
> generator: Merge now identical if-elif-else cases
> generator: Use more string formatting
> generator: Simplify string concatentaion
> generator: Use enumerate()
> generator: Use increment assignment
> generator: Use string concatenation
> generator: Remove global declarations
> generator: Initialize function_classes directly
> generator: Check contained in hash
> generator: Use dict.item() to walk keys and values
> generator: Walk only the values
> generator: Directly get dict length
> generator: Just walk the dict
> generator: Use splitlines()
> generator: Open file with context manager
> generator: Refactor parser creation
> generator: Remove unused SAX content handler methods
> generator: Use SAX method names
> generator: Use string formatting
> generator: Convert in_function to boolean
> generator: Simplify XML attribute fetching
> generator: Initialize with empty strings
> generator: Expand tuple to names in for loop
> generator: Store arguments and return as tuple
> generator: Fixed writing cached=None
> generator: Simplify sorting
> generator: Simplify loop break
> generator: Simplify boolean condition
> generator: Convert dict() to set()
> generator: Converto to defaultdict()
> generator: Add PEP 484 type annotations
> override: Add manual PEP 484 type annotations
> sanitytest: Skip type annotations
> stream: Simplify boolean condition
> domain: Fix None comparison
> stream: no type change
> stream: Convert type() to isinstance()
> stream: Return None from callback
> connect: Just clear all event handlers
> override: no type change
> sanitytest: Do not re-declare set
> sanitytest: Drop else:pass
> sanitytest: Drop Python 2 compatibility
> sanitytest: Add PEP 484 type annotations
> sanitytest: Use 3-tuple for basicklassmap
> sanitytest: Use 3-tuple for finalklassmap
> sanitytest: Use set for tracking used functions
> sanitytest: Use str.startswith() instead of str[0]
> generator: Generate PEP 484 type annotation
> override: Catch type error
> generator: Special handling for virStoragePool.listAllVolumes
> generator: Merge code for __init__ genration
> generator: Use empty string instead of None
> generator: break lines in generated code
> generator: Expand tuple to names in for loop
> generator: Work around type change
> generator: use pointer wrapper for all objects

> examples: Add/fix PEP 484 type annotation
-- 
2.20.1


Re: [PATCH libvirt-python 0/5] Fixes from adding type annotation
Posted by Philipp Hahn 3 years, 8 months ago
Hello,

Am 27.04.20 um 15:44 schrieb Philipp Hahn:
> as announced a long time ago with
> <https://www.redhat.com/archives/libvir-list/2018-November/msg00291.html>
> and recently refreshed with
> <https://www.redhat.com/archives/libvir-list/2020-April/msg00892.html>
> I'm working on adding PEP 484 type hints
> <https://www.python.org/dev/peps/pep-0484/> to the Python binding of
> libvirt.
> 
> I have finished this work now and have a working version at
> <https://github.com/univention/libvirt-python/tree/typing> which
> consists of 90 patches in total as I has to go over evry file to
> understand and fix all things.

I just opened a merge request
<https://gitlab.com/libvirt/libvirt-python/-/merge_requests/9> for my
code at <https://gitlab.com/pmhahn/libvirt-python/-/tree/typing>.

This also unearthed severl bugs where C-pointer point to the wrong
types, which lead to strange crashes. The fixes for them are at the
start of my current branch, except
<https://gitlab.com/pmhahn/libvirt-python/-/commit/9fdf8e9ece4b6695bcddaeb2f998bc11f57e2735>
which is more tricky.

If you want to fast-pache them I can try to extract those patches into a
separate branch, which can be merged without my other work.

Philipp

libvirt-python: API change List → (Named)Tuple?
Posted by Philipp Hahn 3 years, 8 months ago
Hello,

Am 25.07.20 um 23:45 schrieb Philipp Hahn:
> Am 27.04.20 um 15:44 schrieb Philipp Hahn:
>> I'm working on adding PEP 484 type hints
>> <https://www.python.org/dev/peps/pep-0484/> to the Python binding of
>> libvirt.
...
> I just opened a merge request
> <https://gitlab.com/libvirt/libvirt-python/-/merge_requests/9> for my
> code at <https://gitlab.com/pmhahn/libvirt-python/-/tree/typing>.

While working on that I stumbled over some annoyances in the current
Python API: There are many methods which return a List[int], for example:
  virDomainGetInfo
  virDomainGetState
  virDomainGetControlInfo
  virDomainGetBlockInfo
  virDomainGetJobInfo
  virStoragePoolGetInfo
  virStorageVolGetInfo
  virStorageVolGetInfoFlags

There are more function returning similar information as plain List:
  virNodeGetSecurityModel
  virNodeGetSecurityModel
  virDomainGetSecurityLabelList
  virDomainBlockStats
  virDomainInterfaceStats
  virDomainGetVcpus
  virNodeGetCPUMap

The worst offender is `virNodeGetInfo`, which returns `Tuple[str,
int*7]`: The problem for type annotation is that `List` is unlimited in
the number of elements, that is you cannot specify the number of entries
the list must or will contain.
Also all elements of a list must have the same (super-)type; for "str"
and "int" of "virNodeGetInfo()" that would be "Any", which is not very
useful here.

A better type for those `List`s would be `Tuple`, which has a fixed
length and can have different types for each position.

But that would be an API change: In most cases users of those functions
should not notice the difference as indexing tuples or lists is the same.
It would break code where someone would do something like this:
  ret = virFunction_returning_list()
  ret += [1, 2, 3]
which breaks if that function would return a `Tuple` in the future:

> $ python -c '() + []'
> Traceback (most recent call last):
>   File "<string>", line 1, in <module>
> TypeError: can only concatenate tuple (not "list") to tuple

Even better then plain `Tuple`s would be
<https://docs.python.org/3/library/collections.html#collections.namedtuple>,
which would allow us to use the return value of `virNodeGetInfo()` as this:

> from collections import namedtuple
> import  libvirt
> virNodeInfo = namedtuple("virNodeInfo", ["model", "memory", "cpus", "mhz", "nodes", "sockets", "cores", "threads"])
> c = libvirt.open('test:///default')
> info = virNodeInfo(*c.getInfo())
> print(info.model)
> print(info)
> # virNodeInfo(model='i686', memory=3072, cpus=16, mhz=1400, nodes=2, sockets=2, cores=2, threads=2)

<https://libvirt.org/html/libvirt-libvirt-host.html#virNodeInfo>

This could be improved even more by using
<https://mypy.readthedocs.io/en/stable/kinds_of_types.html?highlight=NamedTuple#named-tuples>,
which allows to add type information:

> from typing import NamedTuple
> virNodeInfo = NamedTuple("virNodeInfo", [("model", str), ("memory", int), ("cpus", int), ("mhz", int), ("nodes", int), ("sockets", int), ("cores", int), ("threads", int)])

or with Python 3.6 the more readable form

> class virNodeInfo(NamedTuple):
>   model: str
>   memory: int
>   cpus: int
>   mhz: int
>   nodes: int
>   sockets: int
>   cores: int
>   threads: int

IMHO that would be a real usability improvement as I have to lookup that
information every time I have to use those functions myself.
Indexing with `namedtuple` and `NamedTuple` still works, so you can
still use `info[0]` above to get the model.

What do you think of that?
When would be a good time to make such a change?

Philipp


Re: libvirt-python: API change List → (Named)Tuple?
Posted by Erik Skultety 3 years, 8 months ago
On Mon, Jul 27, 2020 at 08:32:29AM +0200, Philipp Hahn wrote:
> Hello,
> 
> Am 25.07.20 um 23:45 schrieb Philipp Hahn:
> > Am 27.04.20 um 15:44 schrieb Philipp Hahn:
> >> I'm working on adding PEP 484 type hints
> >> <https://www.python.org/dev/peps/pep-0484/> to the Python binding of
> >> libvirt.
> ...
> > I just opened a merge request
> > <https://gitlab.com/libvirt/libvirt-python/-/merge_requests/9> for my
> > code at <https://gitlab.com/pmhahn/libvirt-python/-/tree/typing>.
> 
> While working on that I stumbled over some annoyances in the current
> Python API: There are many methods which return a List[int], for example:
>   virDomainGetInfo
>   virDomainGetState
>   virDomainGetControlInfo
>   virDomainGetBlockInfo
>   virDomainGetJobInfo
>   virStoragePoolGetInfo
>   virStorageVolGetInfo
>   virStorageVolGetInfoFlags
> 
> There are more function returning similar information as plain List:
>   virNodeGetSecurityModel
>   virNodeGetSecurityModel
>   virDomainGetSecurityLabelList
>   virDomainBlockStats
>   virDomainInterfaceStats
>   virDomainGetVcpus
>   virNodeGetCPUMap
> 
> The worst offender is `virNodeGetInfo`, which returns `Tuple[str,
> int*7]`: The problem for type annotation is that `List` is unlimited in
> the number of elements, that is you cannot specify the number of entries
> the list must or will contain.
> Also all elements of a list must have the same (super-)type; for "str"
> and "int" of "virNodeGetInfo()" that would be "Any", which is not very
> useful here.
> 
> A better type for those `List`s would be `Tuple`, which has a fixed
> length and can have different types for each position.
> 
> But that would be an API change: In most cases users of those functions
> should not notice the difference as indexing tuples or lists is the same.
> It would break code where someone would do something like this:
>   ret = virFunction_returning_list()
>   ret += [1, 2, 3]

I would argue that ^this was never the intended way of using the
returned object and would lean towards using a named tuple, since like
you've pointed out it would be a transparent change in the vast majority
of cases.  However, since we don't document anywhere how the return
value should be treated, from that perspective it's still valid to
change the returned list for whatever purposes and therefore we are
breaking the API contract, which we can't do even though the change
itself is reasonable.

Regards,
Erik

Re: libvirt-python: API change List → (Named)Tuple?
Posted by Daniel P. Berrangé 3 years, 8 months ago
On Mon, Jul 27, 2020 at 11:19:46AM +0200, Erik Skultety wrote:
> On Mon, Jul 27, 2020 at 08:32:29AM +0200, Philipp Hahn wrote:
> > Hello,
> > 
> > Am 25.07.20 um 23:45 schrieb Philipp Hahn:
> > > Am 27.04.20 um 15:44 schrieb Philipp Hahn:
> > >> I'm working on adding PEP 484 type hints
> > >> <https://www.python.org/dev/peps/pep-0484/> to the Python binding of
> > >> libvirt.
> > ...
> > > I just opened a merge request
> > > <https://gitlab.com/libvirt/libvirt-python/-/merge_requests/9> for my
> > > code at <https://gitlab.com/pmhahn/libvirt-python/-/tree/typing>.
> > 
> > While working on that I stumbled over some annoyances in the current
> > Python API: There are many methods which return a List[int], for example:
> >   virDomainGetInfo
> >   virDomainGetState
> >   virDomainGetControlInfo
> >   virDomainGetBlockInfo
> >   virDomainGetJobInfo
> >   virStoragePoolGetInfo
> >   virStorageVolGetInfo
> >   virStorageVolGetInfoFlags
> > 
> > There are more function returning similar information as plain List:
> >   virNodeGetSecurityModel
> >   virNodeGetSecurityModel
> >   virDomainGetSecurityLabelList
> >   virDomainBlockStats
> >   virDomainInterfaceStats
> >   virDomainGetVcpus
> >   virNodeGetCPUMap
> > 
> > The worst offender is `virNodeGetInfo`, which returns `Tuple[str,
> > int*7]`: The problem for type annotation is that `List` is unlimited in
> > the number of elements, that is you cannot specify the number of entries
> > the list must or will contain.
> > Also all elements of a list must have the same (super-)type; for "str"
> > and "int" of "virNodeGetInfo()" that would be "Any", which is not very
> > useful here.
> > 
> > A better type for those `List`s would be `Tuple`, which has a fixed
> > length and can have different types for each position.
> > 
> > But that would be an API change: In most cases users of those functions
> > should not notice the difference as indexing tuples or lists is the same.
> > It would break code where someone would do something like this:
> >   ret = virFunction_returning_list()
> >   ret += [1, 2, 3]
> 
> I would argue that ^this was never the intended way of using the
> returned object and would lean towards using a named tuple, since like
> you've pointed out it would be a transparent change in the vast majority
> of cases.  However, since we don't document anywhere how the return
> value should be treated, from that perspective it's still valid to
> change the returned list for whatever purposes and therefore we are
> breaking the API contract, which we can't do even though the change
> itself is reasonable.

I think the above example is just plain crazy. While it may be technically
possible, I don't think that example is a compelling usage to justify not
doing the tuple change. Even if it does break some app (I very much doubt
it will), it'll still be a net win for all the other sane apps to change
to using a named tuple.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

libvirt-python: Extending libvirt-*override-api.xml?
Posted by Philipp Hahn 3 years, 8 months ago
Hello,

Am 25.07.20 um 23:45 schrieb Philipp Hahn:
> Am 27.04.20 um 15:44 schrieb Philipp Hahn:
>> I'm working on adding PEP 484 type hints
>> <https://www.python.org/dev/peps/pep-0484/> to the Python binding of
>> libvirt.
...
> I just opened a merge request
> <https://gitlab.com/libvirt/libvirt-python/-/merge_requests/9> for my
> code at <https://gitlab.com/pmhahn/libvirt-python/-/tree/typing>.

While working on that I stumbled over two issues:

1. generator.skip_impl contained a list of the function names, which
were also defined in `libvirt-override-api.xml`. Adding an override
requires adding it in *two* locations.
With
<https://gitlab.com/libvirt/libvirt-python/-/merge_requests/9/diffs?commit_id=96e7414e488af84f7be04a50f5cd7457142c898d>
it took the liberty to remove the list from generator.py and to extends
it by parsing "api.xml" where file="python*".

2. Historically "api*.xml" used "char *" to indicate some custom Python
type. Mapping this to "str" is obviously wrong, so I had to review them
and changed many of them to "Any" or more specific types with
<https://gitlab.com/libvirt/libvirt-python/-/merge_requests/9/diffs?commit_id=c3b26fccfc5e4e23ad93db51a41a85275f0ea5b5>.
As that information is both used to generate the low-level "C-to-Python"
mapping for "libvirtmod" but also the high-level Python module
"libvirt", changing `type` to something other then a C-type might breaks
the C-level wrapper.
Using a Python-type there which must be declared in "generator.py" is
also somehow cumbersome as there are many types which are used only once.
So I would like to extend this file with a new attribute like "pytype",
which then can be used to overwrite the type used by generator.py.

> <function name='virDomainGetVcpus' file='python'>
>   <info>Extract information about virtual CPUs of domain, store it in info array and also in cpumaps.</info>
>   <return type='char *' info='None in case of error, returns a tuple of vcpu info and vcpu map.'/>
>   <return type='PyAny' info='None in case of error, returns a tuple of vcpu info and vcpu map.'/><!-- Tuple[List[Tuple[int, int, int, int]], List[Tutple[int, ...]]] -->
>   <arg name='domain' type='virDomainPtr' info='pointer to domain object, or NULL for Domain0'/>
> </function>

Is that okay?

Philipp