From nobody Sat Nov 23 18:20:19 2024
From: Rayhan Faizel
To: devel@lists.libvirt.org
CC: Rayhan Faizel
Subject: [PATCH 05/14] scripts: Add script to convert relaxNG to protobuf
Date: Mon, 19 Aug 2024 21:39:43 +0530
Message-Id: <20240819160952.351383-6-rayhan.faizel@gmail.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240819160952.351383-1-rayhan.faizel@gmail.com>
References: <20240819160952.351383-1-rayhan.faizel@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

This script converts relaxNG schemas to an equivalent protobuf definition
file. The script captures the general structure of the XML schema and
tries to guess the attribute datatypes.

The protobuf definitions give the fuzzers awareness of the XML schema. The
protobuf files will be used by the fuzzers to mutate protobuf data and
serialize them to XML.

Signed-off-by: Rayhan Faizel
---
 build-aux/syntax-check.mk   |   1 +
 scripts/meson.build         |   1 +
 scripts/relaxng-to-proto.py | 505 ++++++++++++++++++++++++++++++++++++
 3 files changed, 507 insertions(+)
 create mode 100644 scripts/relaxng-to-proto.py

diff --git a/build-aux/syntax-check.mk b/build-aux/syntax-check.mk
index 0759372b2b..a60e4a8082 100644
--- a/build-aux/syntax-check.mk
+++ b/build-aux/syntax-check.mk
@@ -844,6 +844,7 @@ http_sites += www.inkscape.org
 http_sites += www.innotek.de
 http_sites += www.w3.org
 http_sites += xmlns
+http_sites += relaxng.org
 
 # Links in licenses
 http_sites += scripts.sil.org
diff --git a/scripts/meson.build b/scripts/meson.build
index 2798e302ab..7249346e45 100644
--- a/scripts/meson.build
+++ b/scripts/meson.build
@@ -32,6 +32,7 @@ scripts = [
   'mock-noinline.py',
   'prohibit-duplicate-header.py',
   'qemu-replies-tool.py',
+  'relaxng-to-proto.py',
 ]
 
 foreach name : scripts
diff --git a/scripts/relaxng-to-proto.py b/scripts/relaxng-to-proto.py
new file mode 100644
index 0000000000..f13d6f7e40
--- /dev/null
+++ b/scripts/relaxng-to-proto.py
@@ -0,0 +1,505 @@
+#!/usr/bin/env python3
+
+import re
+import sys
+import xml.etree.ElementTree as ET
+import argparse
+
+# Track XML tree objects of all <define> tags
+define_table = {}
+
+# Store parsed tree of <define> tag
+define_trees = {}
+
+relaxng_ns = "{http://relaxng.org/ns/structure/1.0}"
+
+integer_refs = ["positiveInteger", "unsignedInt", "uint8", "uint16", "uint24", "uint32", "hexuint"]
+integer_datatypes = ["positiveInteger", "unsignedInt", "int", "long", "unsignedLong", "integer"]
+
+# Override attribute based on ref
+custom_ref_table = {
+    "virYesNo": {"type": "bool"},
+    "virOnOff": {"type": "Switch"},
+    "ipAddr": {"type": "IPAddr"},
+    "ipv4Addr": {"type": "IPAddr"},
+    "ipv6Addr": {"type": "IPAddr"},
+    "diskTargetDev": {"type": "TargetDev"},
+    "UUID": {"type": "DummyUUID"},
+    "usbIdDefault": {"values": ["-1",]},
+    "usbClass": {"type": "uint32"},
+    "usbId": {"type": "uint32"},
+    "usbVersion": {"type": "uint32"},
+    "usbAddr": {"type": "uint32"},
+    "usbPort": {"type": "uint32"},
+    "virtioserialPort": {"type": "uint32"},
+    "timeDelta": {"type": "uint32"},
+    "absFilePath": {"type": "DummyPath"},
+    "filePath": {"type": "DummyPath"},
+    "absDirPath": {"type": "DummyPath"},
+    "dirPath": {"type": "DummyPath"},
+    "cpuset": {"type": "CPUSet"},
+    "pciSlot": {"type": "uint32"},
+    "pciFunc": {"type": "uint32"},
+    "ccidSlot": {"type": "uint32"},
+    "ccwCssidRange": {"type": "uint32"},
+    "ccwSsidRange": {"type": "uint32"},
+    "ccwDevnoRange": {"type": "uint32"},
+    "driveController": {"type": "uint32"},
+    "driveBus": {"type": "uint32"},
+    "driveSCSITarget": {"type": "uint32"},
+    "driveUnit": {"type": "uint32"},
+    "irq": {"type": "uint32"},
+    "iobase": {"type": "uint32"},
+    "uniMacAddr": {"type": "MacAddr"},
+}
+
+net_model_names = ["virtio", "virtio-transitional", "virtio-non-transitional", "e1000", "e1000e", "igb",
+                   "rtl8139", "netfront", "usb-net", "spapr-vlan", "lan9118", "scm91c111", "vlance", "vmxnet",
+                   "vmxnet2", "vmxnet3", "Am79C970A", "Am79C973", "82540EM", "82545EM", "82543GC"]
+
+# Override attribute based on paths
+attr_path_table = {
+    "domain.devices.interface.model.type": {"values": net_model_names},
+}
+
+# Tag paths end with a dot while attributes don't.
+xml_modify_path = {
+    "domain.devices.smartcard.certificate.": {"repeated": True}
+}
+
+def tree_add_tag(tree, element_name, set_repeat):
+    if "tags" not in tree:
+        tree["tags"] = {}
+
+    if element_name not in tree["tags"]:
+        tree["tags"][element_name] = {}
+
+    if set_repeat:
+        tree["tags"][element_name]["repeated"] = True
+
+def tree_add_attribute(tree, child, attrib_name, path):
+    if "attributes" not in tree:
+        tree["attributes"] = {}
+
+    if attrib_name not in tree["attributes"]:
+        tree["attributes"][attrib_name] = {}
+
+    if "type_list" not in tree["attributes"][attrib_name]:
+        tree["attributes"][attrib_name]["type_list"] = []
+
+    if "value_list" not in tree["attributes"][attrib_name]:
+        tree["attributes"][attrib_name]["value_list"] = []
+
+    parse_attribute(child, attrib_name, tree["attributes"][attrib_name], path + attrib_name)
+
+def parse_datatype(root):
+    datatype = root.attrib["type"]
+    if datatype in integer_datatypes:
+        return "uint32"
+    else:
+        return "DummyString"
+
+# Parse <ref>, which may point to either a primitive data type, a type
+# defined in custom_ref_table or simply an XML tree pointer.
+def parse_ref_node(ref_node):
+    ref_tree = define_table[ref_node.attrib["name"]]
+    ref_name = ref_node.attrib["name"]
+
+    if ref_name in integer_refs:
+        return {"type": "uint32"}
+    elif ref_name in custom_ref_table:
+        ref_type = custom_ref_table[ref_name]
+        return ref_type
+    else:
+        return {"tree": ref_tree}
+
+def add_to_attr_list(l, new_val):
+    if new_val not in l:
+        l.append(new_val)
+
+# Map custom ref table entries to appropriate values in value_list
+# and type_list of an attribute tree.
+def add_ref_type_to_attr_lists(attr_type, value_list, type_list):
+    if "types" in attr_type:
+        for type_name in attr_type["types"]:
+            add_to_attr_list(type_list, type_name)
+    elif "type" in attr_type:
+        add_to_attr_list(type_list, attr_type["type"])
+
+    if "values" in attr_type:
+        for value in attr_type["values"]:
+            add_to_attr_list(value_list, value)
+
+# Parse <choice> inside <attribute>.
+#
+# <choice> may contain one or more <value>, <ref> or <data> tags
+def parse_attribute_choices(attribute_node, value_list, type_list):
+    for value_node in attribute_node:
+        if (value_node.tag == relaxng_ns + "value"):
+            value = value_node.text
+            add_to_attr_list(value_list, value)
+        elif (value_node.tag == relaxng_ns + "ref"):
+            ref_parse = parse_ref_node(value_node)
+
+            if "tree" not in ref_parse:
+                add_ref_type_to_attr_lists(ref_parse, value_list, type_list)
+            else:
+                ref_tree = ref_parse["tree"]
+                parse_attribute_choices(ref_tree, value_list, type_list)
+        elif (value_node.tag == relaxng_ns + "data"):
+            datatype = parse_datatype(value_node)
+            add_to_attr_list(type_list, datatype)
+        elif (value_node.tag == relaxng_ns + "text"):
+            add_to_attr_list(type_list, "DummyString")
+        else:
+            parse_attribute_choices(value_node, value_list, type_list)
+
+# Parse <attribute> and generate an attribute tree
+#
+# An attribute tree consists of:
+# 1. 'type_list': List of data types (Eg: uint32, string, etc.)
+# 2. 'value_list': List of enum values.
+#
+# type_list and value_list can be extended further throughout the
+# parsing of the XML tree.
+def parse_attribute(root, attribute_name, attribute_tree, path):
+    type_list = attribute_tree["type_list"]
+    value_list = attribute_tree["value_list"]
+
+    if path in attr_path_table:
+        add_ref_type_to_attr_lists(attr_path_table[path], value_list, type_list)
+        return
+
+    if (len(root) == 0):
+        # If there is nothing in <attribute>, assume string.
+        add_to_attr_list(type_list, "DummyString")
+        return
+
+    attribute_node = root[0]
+
+    if attribute_node.tag == relaxng_ns + "value":
+        # Single <value> corresponds to mono-valued enum
+        value = attribute_node.text
+        add_to_attr_list(value_list, value)
+    elif attribute_node.tag == relaxng_ns + "choice":
+        # Parse <choice>
+        parse_attribute_choices(attribute_node, value_list, type_list)
+    elif attribute_node.tag == relaxng_ns + "data":
+        # Primitive datatypes can be mapped to protobuf types directly
+        data_type = parse_datatype(attribute_node)
+        add_to_attr_list(type_list, data_type)
+    elif attribute_node.tag == relaxng_ns + "ref":
+        ref_name = attribute_node.attrib["name"]
+        ref_parse = parse_ref_node(attribute_node)
+        if "tree" not in ref_parse:
+            add_ref_type_to_attr_lists(ref_parse, value_list, type_list)
+        else:
+            # Recurse into ref
+            parse_attribute(define_table[ref_name], attribute_name, attribute_tree, path)
+        return
+    elif attribute_node.tag == relaxng_ns + "text":
+        # <text> is simply a generic string
+        add_to_attr_list(type_list, "DummyString")
+    else:
+        # We should never reach here
+        raise ValueError(f"Attribute {attribute_name} has unknown datatype")
+
+# Store XML text node data
+def initialize_text_tree(tree):
+    if "text" not in tree:
+        tree["text"] = {"value_list": [], "type_list": []}
+
+# Parse <define> and store data in intermediate tree.
+#
+# An intermediate tree will consist of
+# 1. 'tags': List of nested tag trees, which may contain other tags or attributes.
+# 2. 'attributes': List of attribute trees
+# 3. 'text': Similar in structure to attribute tree, representing an XML text node.
+def parse_define(root, tree, ref_traverse, path="", set_repeat=False):
+    if path in xml_modify_path:
+        # TODO: Allow overriding more stuff when required
+        xml_modify_path_entry = xml_modify_path[path]
+        if "repeated" in xml_modify_path_entry:
+            tree["repeated"] = xml_modify_path_entry["repeated"]
+
+    for child in root:
+        tag = child.tag
+        attrib = child.attrib
+
+        # Handle <element> tags which will be represented as T_ fields in
+        # the protobuf.
+        if tag == relaxng_ns + "element":
+            if "name" not in attrib:
+                continue
+
+            element_name = attrib["name"]
+
+            tree_add_tag(tree, element_name, set_repeat)
+
+            parse_define(child, tree["tags"][element_name], ref_traverse, path + element_name + ".")
+
+        # Handle <attribute> tags which will be represented as A_ fields in
+        # the protobuf.
+        elif tag == relaxng_ns + "attribute":
+            attrib_name = attrib["name"]
+
+            tree_add_attribute(tree, child, attrib_name, path)
+
+        # <ref> points to another <define> which is recursively traversed.
+        elif tag == relaxng_ns + "ref":
+            ref_name = attrib["name"]
+
+            # If ref encapsulates datatype, generate V_ field instead of traversing inside
+            ref_parse = parse_ref_node(child)
+            if ("tree" not in ref_parse):
+                initialize_text_tree(tree)
+                add_ref_type_to_attr_lists(ref_parse, tree["text"]["value_list"], tree["text"]["type_list"])
+                continue
+
+            # Handle infinitely recursive refs
+            if (define_table[ref_name] in ref_traverse):
+                pass
+            else:
+                parse_define(define_table[ref_name], tree, ref_traverse + [define_table[ref_name]], path, set_repeat)
+
+        # If <oneOrMore> or <zeroOrMore> is used,
+        # immediate elements under it will have 'repeated' specifier in the
+        # final protobuf.
+        elif tag == relaxng_ns + "oneOrMore" or tag == relaxng_ns + "zeroOrMore":
+            parse_define(child, tree, ref_traverse, path, True)
+
+        # <value>, <data> or <text> residing outside of <attribute> are
+        # XML text nodes, represented by V_ fields.
+        elif tag == relaxng_ns + "value":
+            initialize_text_tree(tree)
+            add_to_attr_list(tree["text"]["value_list"], child.text)
+        elif tag == relaxng_ns + "data":
+            initialize_text_tree(tree)
+            add_to_attr_list(tree["text"]["type_list"], parse_datatype(child))
+        elif tag == relaxng_ns + "text":
+            initialize_text_tree(tree)
+            add_to_attr_list(tree["text"]["type_list"], "DummyString")
+        else:
+            parse_define(child, tree, ref_traverse, path, set_repeat)
+
+# Find all <define> tags and store them to resolve <ref> tags later
+#
+# Also parse all <include> tags in order to add more <define> tags to the
+# table
+def get_defines(schema_path):
+    schema_tree = ET.parse(schema_path)
+    root = schema_tree.getroot()
+
+    for child in root:
+        tag = child.tag
+        attrib = child.attrib
+
+        if tag == relaxng_ns + "start":
+            define_table["rng_entrypoint"] = child
+        if tag == relaxng_ns + "define":
+            define_name = attrib["name"]
+            define_table[define_name] = child
+        elif tag == relaxng_ns + "include":
+            include_href = attrib["href"]
+            get_defines(f"../src/conf/schemas/{include_href}")
+
+def padding(text, level):
+    return " " * level * 4 + text
+
+# Generate enum protobuf
+def enum_to_proto(tree, level, scope):
+    proto = ""
+    enum_index = 0
+    restricted_words = ["unix", "linux"]
+    for value in tree["values"]:
+        formatted_value = re.sub("[^a-zA-Z0-9_]", "_", value)
+
+        if re.match("^[0-9]", formatted_value):
+            formatted_value = "_" + formatted_value
+
+        if formatted_value in restricted_words:
+            formatted_value = "const_" + formatted_value
+
+        while formatted_value in scope:
+            formatted_value = "_" + formatted_value
+
+        proto += padding(f"{formatted_value} = {enum_index}", level)
+
+        if formatted_value != value:
+            proto += f" [(real_value) = '{value}'];\n"
+        else:
+            proto += ";\n"
+
+        scope.add(formatted_value)
+        enum_index += 1
+
+    return proto
+
+# Generate oneof protobuf containing multiple protobuf fields.
+def oneof_to_proto(tree, attribute, protobuf_index, level, proto_opt, scope):
+    proto = ""
+    if "enum" in tree["types"]:
+        proto += padding(f"enum {attribute}Enum {{\n", level)
+        proto += enum_to_proto(tree["types"]["enum"], level + 1, scope)
+        proto += padding("}\n", level)
+
+    optnum = 0
+    proto += padding(f"oneof {attribute}Option {{\n", level)
+    for datatype in tree["types"]:
+        if datatype == "enum":
+            datatype = f"{attribute}Enum"
+        proto += padding(f"{datatype} A_OPT{str(optnum).zfill(2)}_{attribute} = {protobuf_index}{proto_opt};\n", level + 1)
+        protobuf_index += 1
+        optnum += 1
+
+    proto += padding(f"}}\n", level)
+
+    return (proto, protobuf_index - 1)
+
+# Given an attribute tree with type_list and value_list,
+# determine how the protobuf field must be generated, i.e.
+# what field type it is and if it can take on multiple types.
+def generate_attribute_type(attribute_tree):
+    result = {}
+
+    type_list = attribute_tree["type_list"]
+    value_list = attribute_tree["value_list"]
+    # Number of data types possible for an attribute
+    # (enum values count as an additional type)
+    type_count = len(type_list) + (1 if len(value_list) > 0 else 0)
+
+    if type_count == 1:
+        if len(type_list) == 1:
+            result["type"] = type_list[0]
+        elif len(value_list) > 0:
+            result["type"] = "enum"
+            result["values"] = value_list
+    else:
+        # If there is more than one data type for the attribute,
+        # it should be oneof in the protobuf.
+        result["type"] = "oneof"
+        result["types"] = {}
+        for datatype in type_list:
+            result["types"][datatype] = {"type": datatype}
+
+        if (len(value_list) > 0):
+            result["types"]["enum"] = {"type": "enum", "values": value_list}
+
+    return result
+
+# Convert intermediate tree to protobuf
+def define_tree_to_proto(tree, level):
+    tags = tree.get("tags", {})
+    attributes = tree.get("attributes", {})
+    content_type = tree.get("content_type", None)
+
+    # Due to how protobuf scoping works, we can't have the same enum identifiers
+    # under the same message. We need to keep track of the scope ourselves.
+    current_scope = set()
+
+    proto = ""
+    protobuf_index = 1
+
+    for attribute in attributes:
+        renamed_attr = attribute
+        proto_opt = ""
+        if re.search("[^a-zA-Z0-9_]", attribute):
+            renamed_attr = re.sub("[^a-zA-Z0-9_]", "_", attribute)
+            proto_opt = f" [(real_name) = '{attribute}']"
+
+        attribute_type = generate_attribute_type(attributes[attribute])
+        datatype = attribute_type["type"]
+
+        if datatype == "oneof":
+            new_proto, new_index = oneof_to_proto(attribute_type, renamed_attr, protobuf_index, level, proto_opt, current_scope)
+            proto += new_proto
+            protobuf_index = new_index
+        elif datatype == "enum":
+            proto += padding(f"enum {renamed_attr}Enum {{\n", level)
+            proto += enum_to_proto(attribute_type, level + 1, current_scope)
+            proto += padding("}\n", level)
+            proto += padding(f"optional {renamed_attr}Enum A_{renamed_attr} = {protobuf_index}{proto_opt};\n", level)
+        else:
+            proto += padding(f"optional {datatype} A_{renamed_attr} = {protobuf_index}{proto_opt};\n", level)
+
+        protobuf_index += 1
+
+    protobuf_tag_index = 10000
+
+    if "text" in tree:
+        # Note that if both V_ and T_ fields are present, V_ will be favoured
+        # if its presence returns true (since it's optional), otherwise T_ fields
+        # will be used.
+        text_tree = tree["text"]
+        text_type = generate_attribute_type(text_tree)
+        datatype = text_type["type"]
+
+        if datatype == "oneof":
+            print("WARN: oneof of V_ not yet supported!")
+        elif datatype == "enum":
+            proto += padding(f"enum ValueEnum {{\n", level)
+            proto += enum_to_proto(text_type, level + 1, current_scope)
+            proto += padding("}\n", level)
+            proto += padding(f"optional ValueEnum V_value = {protobuf_tag_index};\n", level)
+        else:
+            proto += padding(f"optional {datatype} V_value = {protobuf_tag_index};\n", level)
+
+        protobuf_tag_index += 1
+
+    for tag in tags:
+        renamed_tag = tag
+        proto_opt = ""
+        if re.search("[^a-zA-Z0-9_]", tag):
+            renamed_tag = re.sub("[^a-zA-Z0-9_]", "_", tag)
+            proto_opt += f" [(real_name) = '{tag}']"
+
+        proto += padding(f"message {renamed_tag}Tag {{\n", level)
+        proto += define_tree_to_proto(tags[tag], level + 1)
+        proto += padding("}\n", level)
+
+        specifier = "optional"
+        if (tags[tag].get("repeated", False)):
+            specifier = "repeated"
+
+        if level != 0:
+            proto += padding(f"{specifier} {renamed_tag}Tag T_{renamed_tag} = {protobuf_tag_index}{proto_opt};\n", level)
+
+        protobuf_tag_index += 1
+
+    return proto
+
+parser = argparse.ArgumentParser(formatter_class=argparse.RawDescriptionHelpFormatter,
+                                 description="RelaxNG schema to protobuf converter")
+
+parser.add_argument('rngfile', help='Specify .rng file to process')
+
+parser.add_argument('protofile', help='Specify .proto file to output')
+
+parser.add_argument('--defines', nargs='*', default=[ 'rng_entrypoint' ],
+                    help='Specify defines to be converted to equivalent protobuf messages',)
+
+args = parser.parse_args()
+
+allowed_defines = args.defines
+infile = args.rngfile
+outfile = args.protofile
+
+get_defines(infile)
+
+for define_name in allowed_defines:
+    define_trees[define_name] = {}
+    parse_define(define_table[define_name], define_trees[define_name], [])
+
+prologue = """\
+syntax = 'proto2';
+package libvirt;
+
+import 'xml_datatypes.proto';
+"""
+
+with open(outfile, "w") as out_file:
+    out_file.write(prologue)
+
+    for define_name in allowed_defines:
+        out_file.write(define_tree_to_proto(define_trees[define_name], 0))
+        out_file.write("\n")
-- 
2.34.1