[RFC 00/29] RFC: Generate object-model code based on relax-ng files

Shi Lei posted 29 patches 2 weeks ago
Test syntax-check failed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/libvirt tags/patchew/20200325071209.20841-1-shi_lei@massclouds.com
docs/schemas/basictypes.rng     |   15 +
docs/schemas/network.rng        |   76 ++
docs/schemas/networkcommon.rng  |    2 +-
po/POTFILES.in                  |    2 +
rng2c/directive.py              | 1693 +++++++++++++++++++++++++++++++
rng2c/generator.py              |  504 +++++++++
rng2c/go                        |    8 +
rng2c/schema.json               |  113 +++
rng2c/utils.py                  |  163 +++
src/Makefile.am                 |   16 +-
src/access/Makefile.inc.am      |    2 +-
src/conf/Makefile.inc.am        |   12 +-
src/conf/network_conf.c         |  739 ++++----------
src/conf/network_conf.h         |   51 +-
src/esx/Makefile.inc.am         |    2 +-
src/interface/Makefile.inc.am   |    2 +-
src/internal.h                  |   19 +
src/lxc/Makefile.inc.am         |    1 +
src/network/Makefile.inc.am     |    2 +-
src/network/bridge_driver.c     |    2 +-
src/node_device/Makefile.inc.am |    2 +-
src/nwfilter/Makefile.inc.am    |    2 +-
src/qemu/Makefile.inc.am        |    1 +
src/remote/Makefile.inc.am      |    2 +-
src/secret/Makefile.inc.am      |    2 +-
src/storage/Makefile.inc.am     |    2 +-
src/test/Makefile.inc.am        |    2 +-
src/util/Makefile.inc.am        |   13 +-
src/util/virenum.c              |   15 -
src/util/virenum.h              |   39 +-
src/util/virsocketaddr.c        |   27 +
src/util/virsocketaddr.h        |   10 +
src/util/virstring.c            |   37 +
src/util/virstring.h            |   10 +
src/util/virxml.c               |   57 ++
src/util/virxml.h               |    3 +
src/vbox/Makefile.inc.am        |    2 +-
tests/Makefile.am               |    2 +
tools/Makefile.am               |    2 +
39 files changed, 3000 insertions(+), 654 deletions(-)
create mode 100644 rng2c/directive.py
create mode 100755 rng2c/generator.py
create mode 100755 rng2c/go
create mode 100644 rng2c/schema.json
create mode 100644 rng2c/utils.py

[RFC 00/29] RFC: Generate object-model code based on relax-ng files

Posted by Shi Lei 2 weeks ago
 Outline
=========

In libvirt world, objects(like domain, network, etc.) are described with two
representations: structures in c-language and XML specified by relax-ng.
Since c-language-implementation and xml restricted by relax-ng are merely
the different expressions of the same object-model, we can make a tool to
generate those C-language codes based on relax-ng files.


 RNG2C tool
===========

RNG2C is a python tool which can translate hierarchy of notations
in relax-ng into c-language structures. In general, <element> is converted to
struct in C, struct's members derive from its <attribute>s and its sub <element>s;
<choice> with multiple <value>s is converted to enum in C; <data> are converted
into various builtin-type in C, such as int, char *.

Also, RNG2C can generate relative conversion functions: vir[XXX]ParseXML,
vir[XXX]FormatBuf and vir[XXX]Clear.

These structures and functions can be used to replace hardcoded codes in current
libvirt source.

The tool RNG2C has three subcommands: 'list', 'show' and 'generate'. The 'list' and
'show' are used for previewing generated code, and the 'generate' is prepared to be
invoked by Makefile.

1) list - list all types generated by this tool.

Example:

# ./rng2c/go list

SHORT_ID  META              LOCATION
--------  ----             ----------
3df63b7a  String            String
9b81a5f5  UInt              UInt
9d10c738  Enum              /basictypes.rng/virOnOff.define/choice
5a024c44  Struct            /network.rng/network.define/network.element/dns.element
f6f0c927  Struct            /network.rng/network.define/network.element
... ...


2) show - show details of the target type. The target is specified with
   SHORT_ID or LOCATION. Option '-k' specifies kinds of output information,
   'scpf' stands for [s]tructure | [c]learfunc | [p]arsefunc | [f]ormatfunc.

Example:

# ./rng2c/go show 5a024c44 -kscpf

Or

# ./rng2c/go show /network.rng/network.define/network.element/dns.element -kscpf

It will show four aspects of this Type: the directive of it; the declaration
and implementation of its structure; code of its clearfunc; code of its parsefunc;
code of its formatfunc.


3) generate - just like show, but it dosen't print information on screen. It is
   only for Makefile. It will generate c-language-code, such as struct/enum/
   parsefunc/etc., to take the place of the relative hardcoded stuff.


 Problems
=========

If we replace hardcoded code with generated code directly, there're still
several problems.

Problem1 - name convension
  Generated code follows specific name convension, now there're some differences
  between generated code and current libvirt codes.
  For example, generated struct's name will be like virXXXDef, parsefunc's name
  will be like virXXXDefParseXML; the 'XXX' is a kind of Camel-Case name which
  implies the struct's place in the hierarchy.
  But in current code, there're many variations of name.
  So if we replace current codes with generated codes directly, it requires
  a lot of modifications in the caller codes.

Problem2 - error-checking codes in the parsefunc
  Most of current parse functions have error-checking codes after parsing XML
  code. These code lines validate existence and parsing results.
  Now RNG2C can't generate error-checking codes in the parsefunc. So we
  should find a way to handle it.

Problem3 - hierarchy consistence
  The RNG2C generate a 'struct'(C code) for each 'element'(relax-ng). Most
  of current code follows this convention, but there're still some exceptions.
  For example, the element 'bridge'(in network.rng) should be translated into a
  struct called 'virNetworkBridgeDef'; but in fact, no this struct in the current
  codes, and its members (stp/delay/...) are exposed to the struct 'virNetworkDef'.

Problem4 - target output files for generated code
  The current codes to be replaced exist in various source files. They
  include other source files, and are included by other source files. So we
  should specify proper target files for generated codes to take the place of
  old codes and inherit those original relationships.


 Directive
===========

To solve these problems, the directive mechanism is introduced.

           stage1                      stage2
relax-ng ---------> directives(json) ---------> c-language-code
  ^                     |
  |                     |
  |---------------------|
    feedback-injection


The basic process of generating codes includes two internal stages:
Stage1: RNG2C parses the content of relax-ng files and generates directives.
Stage2: RNG2C generates c-language code based on directives. So the code's form
depends on directives directly.

Injecting directives with new values into relax-ng files can override the
default directives of stage1. Then in the stage2, the final c-code will be different.

Directives are in the form of json and then wrapped as an special XML comment.
So when common relax-ng parsers parse these relax-ng files, these directives will be
ignored. Only RNG2C can recognize and handle them.

For example:

We want to replace the hardcoded virNetworkDNSHostDef with the generated structure.
The original hardcoded structure is as below:

    typedef struct _virNetworkDNSHostDef virNetworkDNSHostDef;
    typedef virNetworkDNSHostDef *virNetworkDNSHostDefPtr;
    struct _virNetworkDNSHostDef {
        virSocketAddr ip;
        size_t nnames;
        char **names;
    };

Take several steps as below:

 1) Execute subcommand list to find it

    # ./rng2c/go list | grep dns | grep host

        SHORT_ID  META    LOCATION
        7892979d  Struct  /network.rng/network.define/network.element/dns.element/host.element

 2) Execute subcommand show with SHORT-ID to check its generated structure

    # ./rng2c/go show 7892979d -ks


        ###### Directive ######

        <!-- VIRT:DIRECTIVE {
          "id": "7892979d7e7a89882e9a324d95198d05d851d02b",
          "location": "/network.rng/network.define/network.element/dns.element/host.element",
          "name": "virNetworkDNSHostDef",
          "meta": "Struct",
          "members": [
            {"id": "ip", "type": "String:b48b4186"},
            {"id": "hostname", "more": true, "type": "String:7d3736b9"}
          ]
        } -->

        ###### Structure (Disabled: NO OUTPUT for "structure".) ######

        [.h]

        typedef struct _virNetworkDNSHostDef virNetworkDNSHostDef;
        typedef virNetworkDNSHostDef *virNetworkDNSHostDefPtr;
        struct _virNetworkDNSHostDef {
            virSocketAddr ip;
            size_t nhostnames;
            char **hostnames;
        };


    We can see, the name of the generated struct depends on root property "name"
    of the directive. Fortunately, it's the same and we don't need override it.

    There're two differences between generated structure and hardcoded stuff:
    In original struct, names of the second and third member are 'nnames' and
    'names'; but in generated code, they are 'nhostnames' and 'hostnames'.

    These names are controlled by this part:

      "members": [{"id": "hostname", ...}].

 3) Override properties of the directive and reinject it into the relax-ng

    Based on "location" of the directive, we find the injection point and only need
    to fill changed parts, just as below:

    <!-- VIRT:DIRECTIVE {
      "members": [{"id": "hostname", "name": "name"}]
    } -->
    <element name="host">
      ... ...

    We need to specify which member will be altered by property "id".

 4) Re-execute subcommand 'show' to confirm the sameness

    After we bridge the gap between the generated structure and the hardcoded
    stuff, we can perform the substitution.


 Directive properties
=====================

The schema of directive are defined in rng2c/schema.json.
Directive consists of properties. Properties retain or derive from the notations
in the relax-ng.

1) root properties

id - string, READ-ONLY.
  Unique identity of the type.
  The id is generated by SHA-1. Its leftmost 8 digits are taken out as SHORT-ID.

location - string, READ-ONLY.
  The location where the type derive from.
  When injecting a directive into relax-ng file, the injection point is based on
  this location.

tag - string, READ-ONLY.
  The type's source tag. It can be 'element', 'data' or 'choice'.

name - string.
  The type's name.
  For struct and enum, name is generated automatically. For struct, its default
  form is vir[XXX]Def; for enum, its default form is vir[XXX]Type. The main
  part 'XXX' represents its position in the hierarchy.
  If name is specified, it will override the default name.
  For example, inject directive as below:

  <!-- VIRT:DIRECTIVE {"name": "virNetDevIPRoute"} -->
  <element name="route">
  ... ...

  Then the name of the generated struct will be 'virNetDevIPRoute'.

  This root property 'name' can be used to solve Problem1.

meta - string.
  The type's meta.
  For struct, it is 'Struct'; for enum, it is 'Enum'; for builtins,
  it is the builtin's meta.

  Meta can be overrided. For example, inject directive as below:

  <define name="UUID">
    <!-- VIRT:DIRECTIVE {
      "meta": "UChars",
      "structure": {"array.size": "VIR_UUID_BUFLEN"}
    } -->
    <data type="string">
      ... ...

  Then this builtin will change into UChars from String(i.e. char *).
  Note: RNG2C don't generate implementation for builtin, they need to be
  hardcoded in libvirt source.

unpack - boolean. Only for struct.
  Only valid for struct. On setting it true, this type will expose its members
  to its upper-level struct rather than act as a standalone struct.
  For example, inject directive on element 'bridge' as below:

  <!-- VIRT:DIRECTIVE {
    "unpack": true,
    "members": [ ... ... ]
  } -->
  <element name="bridge">

  So struct virNetworkBridgeDef will not appear. All of its members, such as
  bridgeName/bridgeZone/stp/delay, are exposed to its parent struct virNetworkDef.

  The 'unpack' and 'pack' can be used to solve Problem3.

pack - boolean.
  It can only work on <define> in the relax-ng.
  Setting it true, children of <define> are converted into members; and then
  these members are packed into a struct for reuse.
  For example, inject directive as below:

  <!-- VIRT:DIRECTIVE {
    "name": "virPCIDeviceAddress",
    "pack": true,
    ... ...
  } -->
  <define name="pciaddress">
    ... ...

  The <define name="pciaddress"> will be packed as a struct virPCIDeviceAddress
  which name is specified by the property "name".

union - string.
  It can only work on <choice> in the relax-ng.
  By default, all children of <choice> will act as members and be exposed to
  the upper-level struct. When setting 'union' a non-null string which is called
  union-id, those members will be packed into a struct; then, a new member
  defined by this struct will be exposed to the upper-level struct.
  The new member's id and name are specified by the union-id.
  For example, inject directive as below:

  <!-- VIRT:DIRECTIVE {
    "union": "if",
    "name": "virNetworkForwardIfDef"
  } -->
  <choice>
    <group>
      <zeroOrMore>
        <!-- VIRT:DIRECTIVE {"unpack": true} -->
        <element name='interface'>
        ...
    </group>
    <group>
      <zeroOrMore>
      <!-- VIRT:DIRECTIVE {"unpack": true} -->
        <element name='address'>
        ...
    </group>
  </choice>

  A new struct virNetworkForwardIfDef will be created; then a new member named 'if'
  is exposed to the parent struct.

namespace - boolean. Only for struct.
  By default, there's no stuff about namespace in the struct. When setting
  it true, additional namespace members will be inserted into the struct.
  And the struct's clearfunc, parsefunc and formatfunc will include relative codes
  to handle these namespace stuff.
  For example, inject directive as below:

  <!-- VIRT:DIRECTIVE {
    "namespace": true,
    ... ...
  } -->
  <element name="network">

  Then in the struct virNetworkDef, there're two additional members:

  typedef virNetworkDef *virNetworkDefPtr;
  struct _virNetworkDef {
      ... ...
      void *namespaceData;
      virXMLNamespace ns;
  }

  In the virNetworkDefClear, there're the codes to invoke ns.free;
  in the virNetworkDefParseXML, there're the codes to invoke ns.parse;
  in the virNetworkDefFormatBuf, there're the codes to invoke ns.format.

values - array. Only for enum.
  All values of an enum except 'none' and 'last'.
  It can be overrided.
  For example, inject directive as below:

  <!-- VIRT:DIRECTIVE {
    "name": "virNetworkForwardType",
    "values": ["nat", "route", "open", "bridge",
               "private", "vepa", "passthrough", "hostdev"]
  } -->
  <choice>
    ... ...
    <value>passthrough</value>
    <value>private</value>
    <value>vepa</value>
    <value>hostdev</value>
  </choice>

  Generally the property 'values' doesn't need to be overrided,
  only if the order of values in the enum need to be adjusted.

members - array. Only for struct.
  Property 'members' is used to display all the members of the struct.
  But when altering a member, only fill in this member's 'id' and
  properties to be changed.
  For example, inject directive as below:

  <!-- VIRT:DIRECTIVE {
    "members": [
      {"id": "localPtr", "name": "localPTR"},
      {"id": "tftp", "name": "tftproot"}
    ]
  } -->
  <element name="ip">

  Specify the member "localPtr" and "tftp" by their id, then only
  these two members will be altered. In this case, their names are changed.
  For more member properties, refer to 6)

PRESERVE and APPLY - string.
  By default, directives take effect on the injection point.
  But sometimes, we need preserve a directive at first and then
  apply it somewhere else.
  APPLY corresponds to PRESERVE through their value called preserve-id.
  This preserve-id can be any form of string.

  These two properties can be used on two notations, <choice> and <ref>.
  Here's an example for using them on <choice>, inject directive as below:

  <define name="virtualPortProfile">
    <!-- VIRT:DIRECTIVE {
      "PRESERVE": "virtualport.element",
      "name": "virNetDevVPortProfile",
      ... ...
    } -->
    <choice>
      <group>
        <!-- VIRT:DIRECTIVE {"APPLY": "virtualport.element"} -->
        <element name="virtualport">
      </group>
      <group>
        <!-- VIRT:DIRECTIVE {"APPLY": "virtualport.element"} -->
        <element name="virtualport">
      </group>
      ... ...

  In this case, there're many <element name="virtualport"> notation
  under <choice>. We can define a preserve directive and apply it to all
  the <element name="virtualport">.

  Here's an example for using them on <ref>, inject PRESERVE as below:

  <!-- VIRT:DIRECTIVE {
    "PRESERVE": "port.element",
    "unpack": true,
    ... ...
    "members": [{"id": "isolated", "name": "isolatedPort"}]
  } -->
  <ref name="portOptions"/>
    ... ...

  And in <define>, inject APPLY as below:

  <define name="portOptions">
    <!-- VIRT:DIRECTIVE {"APPLY": "port.element"} -->
    <element name="port">
    ... ...

  It is useful in a case: a <define> is referenced by several <ref>s.
  We can define different directives on different <ref>s and expect different
  effects when they are applied in the <define>.

TOUCH - boolean.
  Request to handle the target <define> explicitly.
  By default, the process of RNG2C starts at <start>, and handles a <define>
  only when a <ref> references to it. If a <define> isn't touched, RNG2C cannot
  find it.
  Inject a directive with TOUCH on a <define> to touch it.

  For example, the <define> 'zpciaddress' hasn't been referenced in the process
  of parsing network.rng (also networkcommon.rng and basictypes.rng), but we hope
  that this notation will be handle.
  Just inject the directive as below:

  <!-- VIRT:DIRECTIVE {"TOUCH": true} -->
  <define name="zpciaddress">
    <element name="zpci">
    ... ...

  Then it can be discovered when executing subcommand 'list'.


2) structure properties

output - string. Only for struct and enum.
  Control whether to output the structure code and where to output those code.
  Refer to 7)

enum.default - string. Only for enum.
  By default, for enum, first item's name is 'none'.
  Override it with other name.
  For example, inject directive as below:

  <define name="macTableManager">
    <!-- VIRT:DIRECTIVE {
      "name": "virNetworkBridgeMACTableManagerType",
      "structure": {"enum.default": "default"}
    } -->
    <choice>
      <value>kernel</value>
      ... ...

  Then the declaration of enum virNetworkBridgeMACTableManagerType will be like:

  typedef enum {
      VIR_NETWORK_BRIDGE_MAC_TABLE_MANAGER_DEFAULT = 0,
      VIR_NETWORK_BRIDGE_MAC_TABLE_MANAGER_KERNEL,
      ... ...
  } virNetworkBridgeMACTableManagerType;

  The first item is suffixed with '_DEFAULT' rather than '_NONE'.

array.size - string. Only for char array or unsigned char array.
  It specifies length of array. The value can be a macro or a literal number.
  For example, inject directive as below:

  <define name="UUID">
    <choice>
    <!-- VIRT:DIRECTIVE {
      "meta": "UChars",
      "structure": {"array.size": "VIR_UUID_BUFLEN"}
    } -->
    <data type="string">
    ... ...

  Then it will generate an array type:
  unsigned char XXX[VIR_UUID_BUFLEN];

3) clearfunc properties

output - string. Only for struct.
  Control whether to output the clearfunc code and where to output those code.
  Refer to 7)

name - string.
  The clearfunc's name.
  By default, its name is based type's name and suffixed 'Clear'.
  It can be overrided with any other name.

args - array.
  Extra formal args for clearfunc. But now it hasn't been used.

4) parsefunc properties
  The parsefunc's declaration and implementation are controlled by properties.
  The basic form of parsefunc's declaration is as below:

  int
  ${parsefunc.name}(xmlNodePtr curnode,
                    ${type.name}Ptr def,
                    xmlXPathContextPtr ctxt,  /* omit if args.noctxt is true */
                    const char *instanceName, /* exist if args.instname is true */
                    ${parent.name}Ptr parent, /* exist if args.parent is true */
                    ${args});                 /* exist if args is NOT none */

output - string. Only for struct.
  Control whether to output the parsefunc code and where to output those code.
  Refer to 7)

name - string.
  The parsefunc's name.
  By default, its name is based type's name and suffixed 'ParseXML'.
  It can be overrided with any other name.

args - array.
  Extra formal args for parsefunc.
  Each arg can have four properties: name, type, pointer and ctype.
  The first three are according to the usage of namesake properties in member.
  And ctype is an alternate to type when type can't express the argument's type.
  For example, inject directive as below:

  <!-- VIRT:DIRECTIVE {
    "parsefunc": {
      "args": [{"name": "partialOkay", "type": "Bool"}]
    }
  } -->
  <element name="txt">
  ... ...

  Then the args of parsefunc will be appended by a arg, as below:

  int
  virNetworkDNSTxtDefParseXML(xmlNodePtr curnode,
                              virNetworkDNSTxtDefPtr def,
                              xmlXPathContextPtr ctxt,
                              bool partialOkay);

args.noctxt - boolean.
  By default, there's a arg 'ctxt' in the function arguments.
  When 'args.noctxt' is set to true, omit ctxt from the arguments.
  For example, inject directive as below:

  <!-- VIRT:DIRECTIVE {
    "parsefunc": {
      "args.noctxt": true,
    }
  } -->
  <element name="txt">
  ... ...

  Then the arg 'ctxt' will be removed from the default args.

  int
  virNetworkDNSTxtDefParseXML(xmlNodePtr curnode,
                              virNetworkDNSTxtDefPtr def);

  At the same time, the generated implementation of
  virNetworkDNSTxtDefParseXML will be different according to
  this property.

args.instname - boolean.
  By default, there's no instance-name in the function arguments.
  When it is set to true, add instname into the arguments.
  For example, inject directive as below:

  <!-- VIRT:DIRECTIVE {
    "parsefunc": {
      "args.instname": true,
    }
  } -->
  <element name="txt">
  ... ...

  Then the arg 'instname' will be added into the args.

  int
  virNetworkDNSTxtDefParseXML(xmlNodePtr curnode,
                              virNetworkDNSTxtDefPtr def,
                              xmlXPathContextPtr ctxt,
                              const char *instanceName);

args.parent - boolean.
  By default, there's no parent struct pointer name in the function arguments.
  When it is set to true, add parent struct pointer into the arguments.
  For example, inject directive as below:

  <!-- VIRT:DIRECTIVE {
    "parsefunc": {
      "args.parent": true
    }
  } -->
  <element name="dhcp">

  Then the arg 'parent' will be added into the args.

  int
  virNetworkIPDHCPDefParseXML(xmlNodePtr curnode,
                              virNetworkIPDHCPDefPtr def,
                              xmlXPathContextPtr ctxt,
                              virNetworkIPDefPtr parentdef);

post - boolean.
  By default, there's no post hook in the parse function.
  This property enables a post hook for parsefunc. The hook is invoked after
  all members have been parsed. The post hook function's name is based on
  parsefunc's name, and suffxied with 'Post'.
  To solve the foregoing Problem2, extract error-checking code from the current
  parsefunc and move them to the post hook function, then current parsefunc
  can be replaced with generated parsefunc.

  For example, inject directive as below:

  <!-- VIRT:DIRECTIVE {
    "parsefunc": {"post": true}
  } -->
  <element name="dns">
  ... ...

  The declaration of the post hook will be generated, as below:

  int
  virNetworkDNSDefParseXMLPost(xmlNodePtr curnode,
                               virNetworkDNSDefPtr def,
                               xmlXPathContextPtr ctxt,
                               const char *enableStr,
                               const char *forwardPlainNamesStr,
                               int nForwarderNodes,
                               int nTxtNodes,
                               int nSrvNodes,
                               int nHostNodes);

  Implement it. It will be invoked by generated parsefunc automatically.

post.notmpvars - boolean. It is valid only when post is set.
  By default, all the temporary variables in the parsefunc will be passed to
  the post hook function. When it is set to true, there's no those temporary
  variables as arguments in the post hook functions.

  For example, inject directive as below:

  <!-- VIRT:DIRECTIVE {
    "parsefunc": {"post": true, "post.notmpvars": true}
  } -->
  <element name="dns">
  ... ...

  The declaration of the post hook will be generated, as below:

  int
  virNetworkDNSDefParseXMLPost(xmlNodePtr curnode,
                               virNetworkDNSDefPtr def,
                               xmlXPathContextPtr ctxt);

  Now all tmpvars are removed from the args of parse hook function.

5) formatfunc properties

output - string. Only for struct.
  Control whether to output the formatfunc code and where to output those code.
  Refer to 7)

name - string.
  The formatfunc's name.
  By default, its name is based type's name and suffixed 'FormatBuf'.
  It can be overrided with any other name.

args - array.
  Extra formal args for formatfunc. Just like the args of parsefunc.

shorthand.ignore - boolean.
  By default, formatfunc adopts shorthand closing tag notation when there're no
  elements to output. If shorthand.ignore is set to true, this default
  optimization will be prevented.

  For example, virNetworkIPDefFormat does output <ip ...></ip> rather than
  <ip .../> when there're no child element 'tftp' and 'dhcp'.
  So this property is used to meet the need.

order - array.
  It is a array which item is member's id.
  In formatfunc, the relative order of formatting members is sensitive.
  If the order is not correct, setting this property to re-arrange the order.
  And when the array includes only part of ids, other members will follow and
  keep their original relative order.

  For example, virPortGroupDefFormat has a special order to format its members.

  <!-- VIRT:DIRECTIVE {
    "formatfunc": {
      "order": ["name", "default", "trustGuestRxFilters", "vlan"]
    }
  } -->

  Then format order will be altered according to this.

6) member properties
  Member properties can only appear in the members' array. Use these properties
  to override the struct members' default settings.

id - string, READ-ONLY.
  When id is set, this property specify the member to be changed.
  When id is ignored, it means this member is a new member and needed to append
  to the struct's members.

  For example, inject a directive as below:

  <!-- VIRT:DIRECTIVE {
    "members": [
      {"id": "dev", "name": "devname"},
      {"name": "connections", "type": "Int"}
    ]
  } -->
  <element name='pf'>
  ... ...

  The the member 'dev' will be altered; and then, a new member 'connections' will
  be added. For a new member, its type must be specified.

name - string.
  Member's name.
  By default, name is the same with id.
  It can be overrided with any name.

opt - boolean.
  Whether optional or not. In the parsefunc, non-existence for optional member
  is legal; otherwise, there's a error-checking to check its existence and
  report error.
  For example, inject a directive as below:

  <!-- VIRT:DIRECTIVE {
    "members": [
      {"id": "value", "opt": true}
    ]
  } -->
  <element name="txt">
    ... ...

  Then in the virNetworkDNSTxtDefParseXML, it's ok when the notation 'value'
  is missing.

more - boolean.
  When it is set, the member is a list.
  For example, inject a directive as below:

  <!-- VIRT:DIRECTIVE {
    "members": [
      {"id": "pf", "more": true}
    ]
  } -->
  <element name="forward">
    ... ...

  Then the member 'pf' will be declared as below:

  size_t npfs;
  virNetworkForwardPfDefPtr pfs;

tag - string.
  Display the source's tag.
  When a new member is to add, it can be specified.

type - string.
  Member's type. Its form is "meta[:short_id]", meta is the type's meta, short_id
  is the leftmost 8 digits of type's id. For builtins, ignore colon and short_id.
  This property can be overrided with other type.

pointer - boolean.
  By default, member is an instance defined by its type.
  When this property is set to true, member is a pointer of its type.
  For example, inject directive as below:

  <!-- VIRT:DIRECTIVE {
    "members": [
      {"id": "bandwidth", "pointer": true}
    ]
  } -->

  Then type of the member 'bandwidth' will be changed to pointer, as below:
  virBandwidthDefPtr bandwidth;

specified - boolean.
  When this property is set to true, member will be attached with a new member
  which specifies the existence of its master. The new additional member's form
  is "bool XXX_specified;".
  For example, inject directive as below:

  <!-- VIRT:DIRECTIVE {
    "members": [
      {"id": "managerid", "name": "managerID", "specified": true},
      ... ...
    ]
  } -->
  <element name="parameters">
  ... ...

  An addional member 'managerID_specified' will follow the 'managerID'.

  uint8_t       managerID;
  bool          managerID_specified;

declare.comment - string.
  Specify a comment behind member's declaration in struct.
  For example, inject a directive like below:

  <!-- VIRT:DIRECTIVE {
    "members": [
      {"id": "uid", "declare.comment": "exempt from syntax-check"}
    ]
  } -->
  <element name="zpci">
  ... ...

  Then the declaration of the member 'uid' is like below:
  int uid;  /* exempt from syntax-check */

parse.disable - boolean.
  When this property is set to true, member will not appear in parsefunc.

parse.args - array. Only for struct-type member.
  Specify the extra actual arguments when member is parsed by calling parsefunc.
  If member's type is struct and the struct's parsefunc has extra formal
  argument, passing actual arguments by this property.

  For example, type of the member 'host' is virNetworkDHCPHostDef, and the parsefunc
  related to the type has an extra arg partialOkey(specified by "args" in "parsefunc").
  Here this extra arg needs to be specified a value.
  Directive is like below:

  <!-- VIRT:DIRECTIVE {
    "members": [
      {"id": "host", "parse.args": [{"name": "partialOkay", "value": "false"}]}
    ]
  } -->
  <element name="dhcp">
  ... ...

  Then the parsing code for 'host' in the parsefunc is like below:

  if (virNetworkIPDHCPHostDefParseXML(node, &def->hosts[i], ctxt, false) < 0)
    goto error;

  A specific value is passed into the function as an actual arg.

parse.instname - string. Only for struct-type member.
  When member's type is struct and the struct's parsefunc needs a instance-name
  argument, passing instance name through this property.

  For example:

  <!-- VIRT:DIRECTIVE {
    "members": [
      {"id": "ip", "parse.instname": "def->name"}
    ]
  } -->
  <element name="network">
  ... ...

  Then parsing code for 'ip' in the parsefunc will be:

  if (virNetworkIPDefParseXML(node, &def->ips[i], ctxt, def->name) < 0)
    goto error;

  The 'def->name' is passed in as instance-name.

parse.default - string.
  Specify a value for member when it doesn't get value from XML.

  For example:
  <!-- VIRT:DIRECTIVE {
    "members": [
      {"id": "mode", "parse.default": "VIR_NETWORK_FORWARD_NAT"}
    ]
  } -->

  Then when the member 'mode' dosen't get value in parsefunc, it will be
  assigned the default value 'VIR_NETWORK_FORWARD_NAT'.

format.disable - boolean.
  When this property is set to true, member will not be skipped in formatfunc.

format.nocheck - boolean.
  Format and dump member to XML despite of the conditions.
  By default, a member needs to be check whether its value is empty before
  handling it in formatfunc. When this property is set to true, ignore these
  checks and always output.

  For example:
  <!-- VIRT:DIRECTIVE {
    "members": [
      {"id": "domain", "format.nocheck": true}
    ]
  } -->
  <define name="pciaddress">
  ... ...

  Then whether or not the member 'domain' is zero, it will be formatted and output.

format.precheck - string. Only valid when format.nocheck isn't set.
  In formatfunc, invoke a function as condition-check rather than default checks.
  The function should be hardcoded in advance and its name is specified by this
  property.

  For example, in virNetworkDefFormatBuf, the member 'connections' will be formatted
  and output only when some conditions evaluate to true. So set this property to
  enable a check function named 'checkXMLInactive', and put logic of checking
  conditions into the function.

  For example:
  <!-- VIRT:DIRECTIVE {
    "members": [
      {"id": "connections", "format.precheck": "checkXMLInactive"}
    ]
  } -->
  <define name="network">
  ... ...

  Then RNG2C generates the declaration of checkXMLInactive.

  bool
  checkXMLInactive(const unsigned int def G_GNUC_UNUSED,
                   const virNetworkDef *parent G_GNUC_UNUSED,
                   unsigned int flags);

  Implement this function in libvirt source.

format.fmt - string.
  For every builtin, there's a default format when they're formatting.
  This property can be overrided to change the default format.

  For example:
  <!-- VIRT:DIRECTIVE {
    "members": [
      {"id": "slot", "format.fmt": "0x%02x"}
    ]
  } -->
  <define name="pciaddress">
  ... ...

  Then the member 'slot' will be formatting based on format "0x%02x".

format.args - array. Only for struct-type member.
  This property is just like parse.args, except that it is for formatfunc.
  Now it hasn't been used.

hint - string. READ-ONLY.
  Display special cases for member.
  The 'unpack' indicates member's type is unpack; 'pack' indicates member's type
  is pack; 'union' indicates member's type is union; 'new' means this member
  is added.

7) output property
  This property is for structure, clearfunc, parsefunc and formatfunc.
  It has two effects:

  First, it controls whether or not to output the code to the files.
  Second, it controls what file to output.

  Value of the property indicates a base path, and it is suffixed with
  ".generated.[c|h]" to form two generated files.
  The base path should be according to the file in which the code is about to
  be replaced with generated code.
  So the relationship between original files and generated files is as below:

  original.c ---> original.h ---------
                     ^               |
                     |               v
            original.generated.c  original.generated.h

  The original.generated.c includes original.h automatically, and the
  original.h needs to be inserted with a line '#include "original.generated.h"'
  manually.

  This method can solve the Problem4.


 Steps of each patch
=====================

Replacing hardcoded code with generated stuff, we need a series of patches.
Each patch just replaces a target, i.e. a structure or a function.
The steps are as below:

 1) Execute subcommand 'list' to query the target's id and location.

 2) Execute subcommand 'show' to preview the target's directive and generated code.

 3) Modify the directive and re-inject it into the relax-ng file to bridge
   the gap between current code and generated code.

 4) Execute subcommand 'show' again to check the new directive and new generated code,
   until generated code can take the place of hardcoded stuff.

 5) Remove the hardcoded code.

 6) Compile & Test the libvirt source.


 About the series of patches
============================

Now I select network as a test target and make a try to generate
virNetworkDef, virNetworkDefClear, virNetworkDefParseXML and
virNetworkDefFormatBuf and their subordinate stuff automatically.

But this needs around 130 patches. That's too many!

So this series just include about 30 patches for evaluation.


Thanks!


Shi Lei (29):
  rng2c: Add tool RNG2C to translate relax-ng into c-language-code.
  maint: Call RNG2C automatically when relax-ng files change
  util: Add two helper functions virXMLChildNode and virXMLChildNodeSet
  util: Move VIR_ENUM_IMPL and VIR_ENUM_DECL from virenum.h to
    internal.h
  util: Replace virTristateBool|virTristateSwitch(hardcoded) with
    namesakes(generated)
  util: Add a helper function virStringParseOnOff
  util: Add some helper functions for virSocketAddr
  util: Add a helper function virStrToLong_u8p
  conf: Replace virNetworkDNSTxtDef(hardcoded) with namesake(generated)
  conf: Replace virNetworkDNSSrvDef(hardcoded) with namesake(generated)
  conf: Replace virNetworkDNSHostDef(hardcoded) with namesake(generated)
  conf: Replace virNetworkDNSForwarder(hardcoded) with
    namesake(generated)
  conf: Replace virNetworkDNSDef(hardcoded) with namesake(generated)
  conf: Replace virNetworkDNSDefClear(hardcoded) with
    namesake(generated)
  conf: Extract error-checking code from virNetworkDNSHostDefParseXML
  conf: Replace virNetworkDNSHostDefParseXML(hardcoded) with
    namesake(generated)
  conf: Extract error-checking code from virNetworkDNSSrvDefParseXML
  conf: Replace virNetworkDNSSrvDefParseXML(hardcoded) with
    namesake(generated)
  conf: Extract error-checking code from virNetworkDNSTxtDefParseXML
  conf: Replace virNetworkDNSTxtDefParseXML(hardcoded) with
    namesake(generated)
  conf: Add virNetworkDNSForwarderParseXML and
    virNetworkDNSForwarderParseXMLPost
  conf: Replace virNetworkDNSForwarderParseXML(hardcoded) with
    namesake(generated)
  conf: Extract error-checking code from virNetworkDNSDefParseXML
  conf: Replace virNetworkDNSDefParseXML(hardcoded) with
    namesake(generated)
  conf: Apply virNetworkDNSForwarderFormatBuf(generated) in
    virNetworkDNSDefFormat
  conf: Apply virNetworkDNSTxtDefFormatBuf(generated) in
    virNetworkDNSDefFormat
  conf: Apply virNetworkDNSSrvDefFormatBuf(generated) in
    virNetworkDNSDefFormat
  conf: Apply virNetworkDNSHostDefFormatBuf(generated) in
    virNetworkDNSDefFormat
  conf: Replace virNetworkDNSDefFormat(hardcoded) with
    virNetworkDNSDefFormatBuf(generated)

 docs/schemas/basictypes.rng     |   15 +
 docs/schemas/network.rng        |   76 ++
 docs/schemas/networkcommon.rng  |    2 +-
 po/POTFILES.in                  |    2 +
 rng2c/directive.py              | 1693 +++++++++++++++++++++++++++++++
 rng2c/generator.py              |  504 +++++++++
 rng2c/go                        |    8 +
 rng2c/schema.json               |  113 +++
 rng2c/utils.py                  |  163 +++
 src/Makefile.am                 |   16 +-
 src/access/Makefile.inc.am      |    2 +-
 src/conf/Makefile.inc.am        |   12 +-
 src/conf/network_conf.c         |  739 ++++----------
 src/conf/network_conf.h         |   51 +-
 src/esx/Makefile.inc.am         |    2 +-
 src/interface/Makefile.inc.am   |    2 +-
 src/internal.h                  |   19 +
 src/lxc/Makefile.inc.am         |    1 +
 src/network/Makefile.inc.am     |    2 +-
 src/network/bridge_driver.c     |    2 +-
 src/node_device/Makefile.inc.am |    2 +-
 src/nwfilter/Makefile.inc.am    |    2 +-
 src/qemu/Makefile.inc.am        |    1 +
 src/remote/Makefile.inc.am      |    2 +-
 src/secret/Makefile.inc.am      |    2 +-
 src/storage/Makefile.inc.am     |    2 +-
 src/test/Makefile.inc.am        |    2 +-
 src/util/Makefile.inc.am        |   13 +-
 src/util/virenum.c              |   15 -
 src/util/virenum.h              |   39 +-
 src/util/virsocketaddr.c        |   27 +
 src/util/virsocketaddr.h        |   10 +
 src/util/virstring.c            |   37 +
 src/util/virstring.h            |   10 +
 src/util/virxml.c               |   57 ++
 src/util/virxml.h               |    3 +
 src/vbox/Makefile.inc.am        |    2 +-
 tests/Makefile.am               |    2 +
 tools/Makefile.am               |    2 +
 39 files changed, 3000 insertions(+), 654 deletions(-)
 create mode 100644 rng2c/directive.py
 create mode 100755 rng2c/generator.py
 create mode 100755 rng2c/go
 create mode 100644 rng2c/schema.json
 create mode 100644 rng2c/utils.py

-- 
2.17.1



Re: [RFC 00/29] RFC: Generate object-model code based on relax-ng files

Posted by Daniel P. Berrangé 2 weeks ago
On Wed, Mar 25, 2020 at 03:11:40PM +0800, Shi Lei wrote:
>  Outline
> =========
> 
> In libvirt world, objects(like domain, network, etc.) are described with two
> representations: structures in c-language and XML specified by relax-ng.
> Since c-language-implementation and xml restricted by relax-ng are merely
> the different expressions of the same object-model, we can make a tool to
> generate those C-language codes based on relax-ng files.
> 
> 
>  RNG2C tool
> ===========
> 
> RNG2C is a python tool which can translate hierarchy of notations
> in relax-ng into c-language structures. In general, <element> is converted to
> struct in C, struct's members derive from its <attribute>s and its sub <element>s;
> <choice> with multiple <value>s is converted to enum in C; <data> are converted
> into various builtin-type in C, such as int, char *.
> 
> Also, RNG2C can generate relative conversion functions: vir[XXX]ParseXML,
> vir[XXX]FormatBuf and vir[XXX]Clear.
> 
> These structures and functions can be used to replace hardcoded codes in current
> libvirt source.
> 
> The tool RNG2C has three subcommands: 'list', 'show' and 'generate'. The 'list' and
> 'show' are used for previewing generated code, and the 'generate' is prepared to be
> invoked by Makefile.

[snip]

Wow, this is a very impressive and ambitious bit of work, so
thanks for proposing this idea !

The idea of generating the XML parsers is pretty appealing and the
kind of thing I wish we had done right back at the very start of
libvirt development. I really came to like the idea of a metadata
driven XML parser when I created the libvirt-go-xml project structs.
As you already no doubt found, doing the auto-generation for integrating
into our existing code is even more tricky.

I expect the hardest parts are probably going to be bits of the schema
which are discriminated enums. These were what caused me most pain in
the Go XML bindings.

eg with disks where the "type=xxx" attribute on <disk> in turn controls
which child elements we accept in the parser. In the RNG schema we have
often been lazy and not represented the restrictions which are imposed
by the actual parser. Similarly in the actual structs, we often don;t
represent the discriminated enums conceptually.

IOW if we go down the route of generating everything from the RNG schema,
we're quite likely to need to do work to make the RNG schema more accurate.
We'll probably aso need to change a fair number of the struts we use. The
domain XML schema is the one most affected by this. All the other XML
schemas are fairly simple in general. None of this is a bad thing, but it
is potentially a lot of work. 

This is a big & complex series / topic, so it will take a little time
for me to properly review this idea and give some more detailed
feedback. So it will probably be next week before i reply again.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [RFC 00/29] RFC: Generate object-model code based on relax-ng files

Posted by Peter Krempa 2 weeks ago
On Thu, Mar 26, 2020 at 18:19:20 +0000, Daniel Berrange wrote:
> On Wed, Mar 25, 2020 at 03:11:40PM +0800, Shi Lei wrote:

[...]

> IOW if we go down the route of generating everything from the RNG schema,
> we're quite likely to need to do work to make the RNG schema more accurate.
> We'll probably aso need to change a fair number of the struts we use. The
> domain XML schema is the one most affected by this. All the other XML
> schemas are fairly simple in general. None of this is a bad thing, but it
> is potentially a lot of work. 

What I'm afraid of is making the internal structs closer to the RNG
schema in many cases. I spent some time sanitizing the storage of data
in the internal structs which don't necessarily match the RNG schema.

Making them look more like the schema would be a pretty big regression
in understandibility of the code and I'd like to avoid that at all
costs. Specifically since we can't change the schema.

Most notable is the ultra-messy NUMA topology definition, where some
things are split accross multiple not-very appropriate elements and
fixing that will be unpleasant.

Specifically NUMA topology is a child of <cpu>, properties of numa nodes
are under <numatune> etc ... These must not be changed otherwise it will
be a mess.

One other thing to worry about is, that schema validation is not
mandatory. Coupled with guarantee of backwards compatibility, we must
preserve the quirks of the parser, including parsing of stuff which is
not in the schema and this will be extremely hard to preserve.

The other extreme is when the schema is too generic and there's specific
checking in the parser. We've got plenty of that too.

As you've pointed out in terms of the disk schema, there's plenty of the
above going on, including under (we were accepting seclabels even for
disks which don't use them, it came handy once) and overspecified
schema and also plenty of non-conformat storage. Specifically the <disk>
element contains the type, which actually belongs to the storage source.

The storage source also has 'backingStore' but in the XML it's not a
child but a sibling. Changing this to the format in the schema would
make the code, messy and more unmaintainable than it is now.

In this regards, the best thing that we could do, is to generate the
parser and then hand-write transformation from the XML schema into what
we use internally, but that is not really better than the current state.