From nobody Tue Oct 28 21:07:40 2025 From: Richard Henderson To: qemu-devel@nongnu.org Date: Mon, 18 Dec 2017 09:45:30 -0800 Message-Id: <20171218174552.18871-2-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20171218174552.18871-1-richard.henderson@linaro.org> References: <20171218174552.18871-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c01::232 Subject: [Qemu-devel] [PATCH 01/23] scripts: Add decodetree.py X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" To be used to decode ARM SVE, but could be used for any 32-bit RISC. It would need additional work to extend to insn sizes other than 32-bit. Signed-off-by: Richard Henderson --- scripts/decodetree.py | 984 ++++++++++++++++++++++++++++++++++++++++++++++= ++++ 1 file changed, 984 insertions(+) create mode 100755 scripts/decodetree.py diff --git a/scripts/decodetree.py b/scripts/decodetree.py new file mode 100755 index 0000000000..acb0243915 --- /dev/null +++ b/scripts/decodetree.py @@ -0,0 +1,984 @@ +#!/usr/bin/env python +# +# Generate a decoding tree from a specification file. +# +# The tree is built from instruction "patterns". A pattern may represent +# a single architectural instruction or a group of same, depending on what +# is convenient for further processing. +# +# Each pattern has "fixedbits" & "fixedmask", the combination of which +# describes the condition under which the pattern is matched: +# +# (insn & fixedmask) =3D=3D fixedbits +# +# Each pattern may have "fields", which are extracted from the insn and +# passed along to the translator. Examples of such are registers, +# immediates, and sub-opcodes. +# +# In support of patterns, one may declare fields, argument sets, and +# formats, each of which may be re-used to simplify further definitions. +# +## Field syntax: +# +# field_def :=3D '%' identifier ( unnamed_field )+ ( !function=3Didentifie= r )? 
+# unnamed_field :=3D number ':' ( 's' )? number +# +# For unnamed_field, the first number is the least-significant bit positio= n of +# the field and the second number is the length of the field. If the 's' = is +# present, the field is considered signed. If multiple unnamed_fields are +# present, they are concatenated. In this way one can define disjoint fie= lds. +# +# If !function is specified, the concatenated result is passed through the +# named function, taking and returning an integral value. +# +# FIXME: the fields of the structure into which this result will be stored +# are restricted to "int", which means that we cannot expand 64-bit items. +# +# Field examples: +# +# %disp 0:s16 -- sextract(i, 0, 16) +# %imm9 16:6 10:3 -- extract(i, 16, 6) << 3 | extract(i, 10, 3) +# %disp12 0:s1 1:1 2:10 -- sextract(i, 0, 1) << 11 +# | extract(i, 1, 1) << 10 +# | extract(i, 2, 10) +# %shimm8 5:s8 13:1 !function=3Dexpand_shimm8 +# -- expand_shimm8(sextract(i, 5, 8) << 1 +# | extract(i, 13, 1)) +# +## Argument set syntax: +# +# args_def :=3D '&' identifier ( args_elt )+ +# args_elt :=3D identifier +# +# Each args_elt defines an argument within the argument set. +# Each argument set will be rendered as a C structure "arg_$name" +# with each of the fields being one of the member arguments. +# +# Argument set examples: +# +# &reg3 ra rb rc +# &loadstore reg base offset +# +## Format syntax: +# +# fmt_def :=3D '@' identifier ( fmt_elt )+ +# fmt_elt :=3D fixedbit_elt | field_elt | field_ref | args_ref +# fixedbit_elt :=3D [01.]+ +# field_elt :=3D identifier ':' 's'? number +# field_ref :=3D '%' identifier | identifier '=3D' '%' identifier +# args_ref :=3D '&' identifier +# +# Defining a format is a handy way to avoid replicating groups of fields +# across many instruction patterns. +# +# A fixedbit_elt describes a contiguous sequence of bits that must +# be 1, 0, or "." for don't care. 
+# +# A field_elt describes a simple field only given a width; the position of +# the field is implied by its position with respect to other fixedbit_elt +# and field_elt. +# +# If any fixedbit_elt or field_elt appears, then all 32 bits must be defined. +# Padding with a fixedbit_elt of all '.' is an easy way to accomplish that. +# +# A field_ref incorporates a field by reference. This is the only way to +# add a complex field to a format. A field may be renamed in the process +# via assignment to another identifier. This is intended to allow the +# same argument set to be used with disjoint named fields. +# +# A single args_ref may specify an argument set to use for the format. +# The set of fields in the format must be a subset of the arguments in +# the argument set. If an argument set is not specified, one will be +# inferred from the set of fields. +# +# It is recommended, but not required, that all field_ref and args_ref +# appear at the end of the line, not interleaving with fixedbit_elt or +# field_elt. +# +# Format examples: +# +# @opr ...... ra:5 rb:5 ... 0 ....... rc:5 +# @opi ...... ra:5 lit:8 1 ....... rc:5 +# +## Pattern syntax: +# +# pat_def :=3D identifier ( pat_elt )+ +# pat_elt :=3D fixedbit_elt | field_elt | field_ref +# | args_ref | fmt_ref | const_elt +# fmt_ref :=3D '@' identifier +# const_elt :=3D identifier '=3D' number +# +# The fixedbit_elt and field_elt specifiers are unchanged from formats. +# A pattern that does not specify a named format will have one inferred +# from a referenced argument set (if present) and the set of fields. +# +# A const_elt allows an argument to be set to a constant value. This may +# come in handy when fields overlap between patterns and one has to +# include the values in the fixedbit_elt instead. +# +# The decoder will call a translator function for each pattern matched. +# +# Pattern examples: +# +# addl_r 010000 ..... ..... .... 0000000 ..... @opr +# addl_i 010000 ..... ..... .... 0000000 ..... 
@opi +# +# which will, in part, invoke +# +# trans_addl_r(ctx, &arg_opr, insn) +# and +# trans_addl_i(ctx, &arg_opi, insn) +# + +import io +import re +import sys +import getopt +import pdb + +# ??? Parameterize insn_width from 32. +fields =3D {} +arguments =3D {} +formats =3D {} +patterns =3D [] + +translate_prefix =3D 'trans' +output_file =3D sys.stdout + +re_ident =3D '[a-zA-Z][a-zA-Z0-9_]*' + +def error(lineno, *args): + if lineno: + r =3D 'error:{0}:'.format(lineno) + else: + r =3D 'error:' + for a in args: + r +=3D ' ' + str(a) + r +=3D '\n' + sys.stderr.write(r) + exit(1) + +def output(*args): + global output_file + for a in args: + output_file.write(a) + +if sys.version_info >=3D (3, 0): + re_fullmatch =3D re.fullmatch +else: + def re_fullmatch(pat, str): + return re.match('^' + pat + '$', str) + +def output_autogen(): + output('/* This file is autogenerated. */\n\n') + +def str_indent(c): + """Return a string with C spaces""" + r =3D '' + for i in range(0, c): + r +=3D ' ' + return r + +def str_fields(fields): + """Return a string uniquely identifing FIELDS""" + + r =3D '' + for n in sorted(fields.keys()): + r +=3D '_' + n; + return r[1:] + +def str_match_bits(bits, mask): + """Return a string pretty-printing BITS/MASK""" + i =3D 0x80000000 + space =3D 0x01010100 + r =3D '' + while i !=3D 0: + if i & mask: + if i & bits: + r +=3D '1' + else: + r +=3D '0' + else: + r +=3D '.' 
+ if i & space: + r +=3D ' ' + i >>=3D 1 + return r + +def is_pow2(bits): + return (bits & (bits - 1)) =3D=3D 0 + +def popcount(b): + b =3D (b & 0x55555555) + ((b >> 1) & 0x55555555) + b =3D (b & 0x33333333) + ((b >> 2) & 0x33333333) + b =3D (b & 0x0f0f0f0f) + ((b >> 4) & 0x0f0f0f0f) + b =3D (b + (b >> 8)) & 0x00ff00ff + b =3D (b + (b >> 16)) & 0xffff + return b + +def ctz(b): + r =3D 0 + while ((b >> r) & 1) =3D=3D 0: + r +=3D 1 + return r + +def is_contiguous(bits): + shift =3D ctz(bits) + if is_pow2((bits >> shift) + 1): + return shift + else: + return -1 + +def bit_iterate(bits): + iter =3D 0 + yield iter + if bits =3D=3D 0: + return + while True: + this =3D bits + while True: + lsb =3D this & -this + if iter & lsb: + iter ^=3D lsb + this ^=3D lsb + else: + iter ^=3D lsb + break + if this =3D=3D 0: + return + yield iter + +def eq_fields_for_args(flds_a, flds_b): + if len(flds_a) !=3D len(flds_b): + return False + for k, a in flds_a.items(): + if k not in flds_b: + return False + return True + +def eq_fields_for_fmts(flds_a, flds_b): + if len(flds_a) !=3D len(flds_b): + return False + for k, a in flds_a.items(): + if k not in flds_b: + return False + b =3D flds_b[k] + if a.__class__ !=3D b.__class__ or a !=3D b: + return False + return True + +class Field: + """Class representing a simple instruction field""" + def __init__(self, sign, pos, len): + self.sign =3D sign + self.pos =3D pos + self.len =3D len + self.mask =3D ((1 << len) - 1) << pos + + def __str__(self): + if self.sign: + s =3D 's' + else: + s =3D '' + return str(self.pos) + ':' + s + str(self.len) + + def str_extract(self): + if self.sign: + extr =3D 'sextract32' + else: + extr =3D 'extract32' + return '{0}(insn, {1}, {2})'.format(extr, self.pos, self.len) + + def __eq__(self, other): + return (self.sign =3D=3D other.sign and self.pos =3D=3D other.pos + and self.len =3D=3D other.len) + + def __ne__(self, other): + return not self.__eq__(other) +# end Field + +class MultiField: + """Class representing a compound instruction field""" + def 
__init__(self, subs): + self.subs =3D subs + self.sign =3D subs[0].sign + mask =3D 0 + for s in subs: + mask |=3D s.mask + self.mask =3D mask + + def __str__(self): + return str(self.subs) + + def str_extract(self): + ret =3D '0' + pos =3D 0 + for f in reversed(self.subs): + if pos =3D=3D 0: + ret =3D f.str_extract() + else: + ret =3D 'deposit32({0}, {1}, {2}, {3})'.format(ret, pos, 3= 2 - pos, f.str_extract()) + pos +=3D f.len + return ret + + def __ne__(self, other): + if len(self.subs) !=3D len(other.subs): + return True + for a, b in zip(self.subs, other.subs): + if a.__class__ !=3D b.__class__ or a !=3D b: + return True + return False; + + def __eq__(self, other): + return not self.__ne__(other) +# end MultiField + +class ConstField: + """Class representing an argument field with constant value""" + def __init__(self, value): + self.value =3D value + self.mask =3D 0 + self.sign =3D value < 0 + + def __str__(self): + return str(self.value) + + def str_extract(self): + return str(self.value) + + def __cmp__(self, other): + return self.value - other.value +# end ConstField + +class FunctionField: + """Class representing a field passed through an expander""" + def __init__(self, func, base): + self.mask =3D base.mask + self.sign =3D base.sign + self.base =3D base + self.func =3D func + + def __str__(self): + return self.func + '(' + str(self.base) + ')' + + def str_extract(self): + return self.func + '(' + self.base.str_extract() + ')' + + def __eq__(self, other): + return self.func =3D=3D other.func and self.base =3D=3D other.base + def __ne__(self, other): + return not self.__eq__(other) +# end FunctionField + +class Arguments: + """Class representing the extracted fields of a format""" + def __init__(self, nm, flds): + self.name =3D nm + self.fields =3D sorted(flds) + + def __str__(self): + return self.name + ' ' + str(self.fields) + + def struct_name(self): + return 'arg_' + self.name + + def output_def(self): + output('typedef struct {\n') + for n in 
self.fields: + output(' int ', n, ';\n') + output('} ', self.struct_name(), ';\n\n') +# end Arguments + +class General: + """Common code between instruction formats and instruction patterns""" + def __init__(self, name, base, fixb, fixm, fldm, flds): + self.name =3D name + self.base =3D base + self.fixedbits =3D fixb + self.fixedmask =3D fixm + self.fieldmask =3D fldm + self.fields =3D flds + + def __str__(self): + r =3D self.name + if self.base: + r =3D r + ' ' + self.base.name + else: + r =3D r + ' ' + str(self.fields) + r =3D r + ' ' + str_match_bits(self.fixedbits, self.fixedmask) + return r + + def str1(self, i): + return str_indent(i) + self.__str__() +# end General + +class Format(General): + """Class representing an instruction format""" + + def extract_name(self): + return 'extract_' + self.name + + def output_extract(self): + output('static void ', self.extract_name(), '(', + self.base.struct_name(), ' *a, uint32_t insn)\n{\n') + for n, f in self.fields.items(): + output(' a->', n, ' =3D ', f.str_extract(), ';\n') + output('}\n\n') +# end Format + +class Pattern(General): + """Class representing an instruction pattern""" + + def output_decl(self): + global translate_prefix + output('typedef ', self.base.base.struct_name(), + ' arg_', self.name, ';\n') + output('void ', translate_prefix, '_', self.name, + '(DisasContext *ctx, arg_', self.name, + ' *a, uint32_t insn);\n') + + def output_code(self, i, extracted, outerbits, outermask): + global translate_prefix + ind =3D str_indent(i) + arg =3D self.base.base.name + if not extracted: + output(ind, self.base.extract_name(), '(&u.f_', arg, ', insn);= \n') + for n, f in self.fields.items(): + output(ind, 'u.f_', arg, '.', n, ' =3D ', f.str_extract(), ';\= n') + output(ind, translate_prefix, '_', self.name, + '(ctx, &u.f_', arg, ', insn);\n') + output(ind, 'return true;\n') +# end Pattern + +def parse_field(lineno, name, toks): + """Parse one instruction field from TOKS at LINENO""" + global fields + global 
re_ident + + # A "simple" field will have only one entry; a "multifield" will have = several. + subs =3D [] + width =3D 0 + func =3D None + for t in toks: + if re_fullmatch('!function=3D' + re_ident, t): + if func: + error(lineno, 'duplicate function') + func =3D t.split('=3D') + func =3D func[1] + continue + + if re_fullmatch('[0-9]+:s[0-9]+', t): + # Signed field extract + subtoks =3D t.split(':s') + sign =3D True + elif re_fullmatch('[0-9]+:[0-9]+', t): + # Unsigned field extract + subtoks =3D t.split(':') + sign =3D False + else: + error(lineno, 'invalid field token "{0}"'.format(t)) + p =3D int(subtoks[0]) + l =3D int(subtoks[1]) + if p + l > 32: + error(lineno, 'field {0} too large'.format(t)) + f =3D Field(sign, p, l) + subs.append(f) + width +=3D l + + if width > 32: + error(lineno, 'field too large') + if len(subs) =3D=3D 1: + f =3D subs[0] + else: + f =3D MultiField(subs) + if func: + f =3D FunctionField(func, f) + + if name in fields: + error(lineno, 'duplicate field', name) + fields[name] =3D f +# end parse_field + +def parse_arguments(lineno, name, toks): + """Parse one argument set from TOKS at LINENO""" + global arguments + global re_ident + + flds =3D [] + for t in toks: + if not re_fullmatch(re_ident, t): + error(lineno, 'invalid argument set token "{0}"'.format(t)) + flds.append(t) + + if name in arguments: + error(lineno, 'duplicate argument set', name) + arguments[name] =3D Arguments(name, flds) +# end parse_arguments + +def lookup_field(lineno, name): + global fields + if name in fields: + return fields[name] + error(lineno, 'undefined field', name) + +def add_field(lineno, flds, new_name, f): + if new_name in flds: + error(lineno, 'duplicate field', new_name) + flds[new_name] =3D f + return flds + +def add_field_byname(lineno, flds, new_name, old_name): + return add_field(lineno, flds, new_name, lookup_field(lineno, old_name= )) + +def infer_argument_set(flds): + global arguments + + for arg in arguments.values(): + if eq_fields_for_args(flds, 
arg.fields): + return arg + + name =3D str(len(arguments)) + arg =3D Arguments(name, flds.keys()) + arguments[name] =3D arg + return arg + +def infer_format(arg, fieldmask, flds): + global arguments + global formats + + const_flds =3D {} + var_flds =3D {} + for n, c in flds.items(): + if isinstance(c, ConstField): + const_flds[n] =3D c + else: + var_flds[n] =3D c + + # Look for an existing format with the same argument set and fields + for fmt in formats.values(): + if arg and fmt.base !=3D arg: + continue + if fieldmask !=3D fmt.fieldmask: + continue + if not eq_fields_for_fmts(var_flds, fmt.fields): + continue + return (fmt, const_flds) + + name =3D 'Fmt_' + str(len(formats)) + if not arg: + arg =3D infer_argument_set(flds) + + fmt =3D Format(name, arg, 0, 0, fieldmask, var_flds) + formats[name] =3D fmt + + return (fmt, const_flds) +# end infer_format + +def parse_generic(lineno, is_format, name, toks): + """Parse one instruction format from TOKS at LINENO""" + global fields + global arguments + global formats + global patterns + global re_ident + + fixedmask =3D 0 + fixedbits =3D 0 + width =3D 0 + flds =3D {} + arg =3D None + fmt =3D None + for t in toks: + # '&Foo' gives a format an explicit argument set. + if t[0] =3D=3D '&': + tt =3D t[1:] + if arg: + error(lineno, 'multiple argument sets') + if tt in arguments: + arg =3D arguments[tt] + else: + error(lineno, 'undefined argument set', t) + continue + + # '@Foo' gives a pattern an explicit format. + if t[0] =3D=3D '@': + tt =3D t[1:] + if fmt: + error(lineno, 'multiple formats') + if tt in formats: + fmt =3D formats[tt] + else: + error(lineno, 'undefined format', t) + continue + + # '%Foo' imports a field. + if t[0] =3D=3D '%': + tt =3D t[1:] + flds =3D add_field_byname(lineno, flds, tt, tt) + continue + + # 'Foo=3D%Bar' imports a field with a different name. 
+ if re_fullmatch(re_ident + '=3D%' + re_ident, t): + (fname, iname) =3D t.split('=3D%') + flds =3D add_field_byname(lineno, flds, fname, iname) + continue + + # 'Foo=3Dnumber' sets an argument field to a constant value + if re_fullmatch(re_ident + '=3D[0-9]+', t): + (fname, value) =3D t.split('=3D') + value =3D int(value) + flds =3D add_field(lineno, flds, fname, ConstField(value)) + continue + + # Pattern of 0s, 1s and dots indicate required zeros, + # required ones, or dont-cares. + if re_fullmatch('[01.]+', t): + shift =3D len(t) + fms =3D t.replace('0','1') + fms =3D fms.replace('.','0') + fbs =3D t.replace('.','0') + fms =3D int(fms, 2) + fbs =3D int(fbs, 2) + fixedbits =3D (fixedbits << shift) | fbs + fixedmask =3D (fixedmask << shift) | fms + # Otherwise, fieldname:fieldwidth + elif re_fullmatch(re_ident + ':s?[0-9]+', t): + (fname, flen) =3D t.split(':') + sign =3D False; + if flen[0] =3D=3D 's': + sign =3D True + flen =3D flen[1:] + shift =3D int(flen, 10) + f =3D Field(sign, 32 - width - shift, shift) + flds =3D add_field(lineno, flds, fname, f) + fixedbits <<=3D shift + fixedmask <<=3D shift + else: + error(lineno, 'invalid token "{0}"'.format(t)) + width +=3D shift + + # We should have filled in all of the bits of the instruction. + if width !=3D 32: + error(lineno, 'definition has {0} bits'.format(width)) + + # The fields that we add, or import, cannot overlap bits that we speci= fy + fieldmask =3D 0 + for f in flds.values(): + fieldmask |=3D f.mask + + # Fix up what we've parsed to match either a format or a pattern. + if is_format: + # Formats cannot reference formats. + if fmt: + error(lineno, 'format referencing format') + # If an argument set is given, then there should be no fields + # without a place to store it. 
+ if arg: + for f in flds.keys(): + if f not in arg.fields: + error(lineno, 'field {0} not in argument set {1}'.form= at(f, arg.name)) + else: + arg =3D infer_argument_set(flds) + if name in formats: + error(lineno, 'duplicate format name', name) + fmt =3D Format(name, arg, fixedbits, fixedmask, fieldmask, flds) + formats[name] =3D fmt + else: + # Patterns can reference a format ... + if fmt: + # ... but not an argument simultaneously + if arg: + error(lineno, 'pattern specifies both format and argument = set') + fieldmask |=3D fmt.fieldmask + fixedbits |=3D fmt.fixedbits + fixedmask |=3D fmt.fixedmask + else: + (fmt, flds) =3D infer_format(arg, fieldmask, flds) + arg =3D fmt.base + for f in flds.keys(): + if f not in arg.fields: + error(lineno, 'field {0} not in argument set {1}'.format(f= , arg.name)) + pat =3D Pattern(name, fmt, fixedbits, fixedmask, fieldmask, flds) + patterns.append(pat) + + if fieldmask & fixedmask: + error(lineno, 'fieldmask overlaps fixedmask (0x{0:08x} & 0x{1:08x}= )'.format(fieldmask, fixedmask)) +# end parse_general + +def parse_file(f): + """Parse all of the patterns within a file""" + + # Read all of the lines of the file. Concatenate lines + # ending in backslash; discard empty lines and comments. + toks =3D [] + lineno =3D 0 + for line in f: + lineno +=3D 1 + + # Discard comments + end =3D line.find('#') + if end >=3D 0: + line =3D line[:end] + + t =3D line.split() + if len(toks) !=3D 0: + # Next line after continuation + toks.extend(t) + elif len(t) =3D=3D 0: + # Empty line + continue + else: + toks =3D t + + # Continuation? + if toks[-1] =3D=3D '\\': + toks.pop() + continue + + if len(toks) < 2: + error(lineno, 'short line') + + name =3D toks[0] + del toks[0] + + # Determine the type of object needing to be parsed. 
+ if name[0] =3D=3D '%': + parse_field(lineno, name[1:], toks) + elif name[0] =3D=3D '&': + parse_arguments(lineno, name[1:], toks) + elif name[0] =3D=3D '@': + parse_generic(lineno, True, name[1:], toks) + else: + parse_generic(lineno, False, name, toks) + toks =3D [] +# end parse_file + +class Tree: + """Class representing a node in a decode tree""" + + def __init__(self, fm, tm): + self.fixedmask =3D fm + self.thismask =3D tm + self.subs =3D [] + self.base =3D None + + def str1(self, i): + ind =3D str_indent(i) + r =3D '{0}{1:08x}'.format(ind, self.fixedmask) + if self.base: + r +=3D ' ' + self.base.name + r +=3D ' [\n' + for (b, s) in self.subs: + r +=3D '{0} {1:08x}:\n'.format(ind, b) + r +=3D s.str1(i + 4) + '\n' + r +=3D ind + ']' + return r + + def __str__(self): + return self.str1(0) + + def output_code(self, i, extracted, outerbits, outermask): + ind =3D str_indent(i) + + # If we identified that all nodes below have the same format, + # extract the fields now. + if not extracted and self.base: + output(ind, self.base.extract_name(), + '(&u.f_', self.base.base.name, ', insn);\n') + extracted =3D True + + # Attempt to aid the compiler in producing compact switch statemen= ts. + # If the bits in the mask are contiguous, extract them. 
+ sh =3D is_contiguous(self.thismask) + if sh > 0: + str_switch =3D lambda b: \ + '(insn >> {0}) & 0x{1:x}'.format(sh, b >> sh) + str_case =3D lambda b: '0x{0:x}'.format(b >> sh) + else: + str_switch =3D lambda b: 'insn & 0x{0:08x}'.format(b) + str_case =3D lambda b: '0x{0:08x}'.format(b) + + output(ind, 'switch (', str_switch(self.thismask), ') {\n') + for b, s in sorted(self.subs): + rept =3D self.thismask & ~s.fixedmask + innermask =3D outermask | (self.thismask & ~rept) + innerbits =3D outerbits | b + for bb in bit_iterate(rept): + output(ind, 'case ', str_case(b | bb), ':\n') + output(ind, ' /* ', + str_match_bits(innerbits, innermask), ' */\n') + s.output_code(i + 4, extracted, innerbits, innermask) + output(ind, '}\n') + output(ind, 'return false;\n') +# end Tree + +def build_tree(pats, outerbits, outermask): + # Find the intersection of all remaining fixedmask. + innermask =3D ~outermask + for i in pats: + innermask &=3D i.fixedmask + + if innermask =3D=3D 0: + pnames =3D [] + for p in pats: + pnames.append(p.name) + #pdb.set_trace() + error(0, 'overlapping patterns:', pnames) + + fullmask =3D outermask | innermask + extramask =3D 0 + + # If there are few enough items, see how many undecoded bits remain. + # Otherwise, attempt to avoid a subsequent Tree level testing one bit. + if len(pats) < 8: + for i in pats: + extramask |=3D i.fixedmask & ~fullmask + else: + for i in pats: + e =3D i.fixedmask & ~fullmask + if e !=3D 0 and popcount(e) <=3D 2: + extramask |=3D e + + if popcount(extramask) < 4: + innermask |=3D extramask + fullmask |=3D extramask + + # Sort each element of pats into the bin selected by the mask. + bins =3D {} + for i in pats: + fb =3D i.fixedbits & innermask + if fb in bins: + bins[fb].append(i) + else: + bins[fb] =3D [i] + + # We must recurse if any bin has more than one element or if + # the single element in the bin has not been fully matched. 
+ t =3D Tree(fullmask, innermask) + + for b, l in bins.items(): + s =3D l[0] + if len(l) > 1 or s.fixedmask & ~fullmask !=3D 0: + s =3D build_tree(l, b | outerbits, fullmask) + t.subs.append((b, s)) + + return t +# end build_tree + +def prop_format(tree): + """Propagate Format objects into the decode tree""" + + # Depth first search. + for (b, s) in tree.subs: + if isinstance(s, Tree): + prop_format(s) + + # If all entries in SUBS have the same format, then + # propagate that into the tree. + f =3D None + for (b, s) in tree.subs: + if f is None: + f =3D s.base + if f is None: + return + if f is not s.base: + return + tree.base =3D f +# end prop_format + + +def main(): + global arguments + global formats + global patterns + global translate_prefix + global output_file + + h_file =3D None + c_file =3D None + decode_function =3D 'decode' + + long_opts =3D [ 'decode=3D', 'translate=3D', 'header=3D', 'output=3D' ] + try: + (opts, args) =3D getopt.getopt(sys.argv[1:], 'h:o:', long_opts) + except getopt.GetoptError as err: + error(0, err) + for o, a in opts: + if o in ('-h', '--header'): + h_file =3D a + elif o in ('-o', '--output'): + c_file =3D a + elif o =3D=3D '--decode': + decode_function =3D a + elif o =3D=3D '--translate': + translate_prefix =3D a + else: + assert False, 'unhandled option' + + if len(args) < 1: + error(0, 'missing input file') + f =3D open(args[0], 'r') + parse_file(f) + f.close() + + t =3D build_tree(patterns, 0, 0) + prop_format(t) + + if h_file: + output_file =3D open(h_file, 'w') + elif c_file: + output_file =3D open(c_file, 'w') + else: + output_file =3D sys.stdout + + output_autogen() + for n in sorted(arguments.keys()): + f =3D arguments[n] + f.output_def() + + if h_file: + output('bool ', decode_function, + '(DisasContext *ctx, uint32_t insn);\n\n') + + # A single translate function can be invoked for different patterns. + # Make sure that the argument sets are the same, and declare the + # function only once. 
+ out_pats =3D {} + for i in patterns: + if i.name in out_pats: + p =3D out_pats[i.name] + if i.base.base !=3D p.base.base: + error(0, i.name, ' has conflicting argument sets') + else: + i.output_decl() + out_pats[i.name] =3D i + + if h_file: + output_file.close() + if c_file: + output_file =3D open(c_file, 'w') + output_autogen() + + for n in sorted(formats.keys()): + f =3D formats[n] + f.output_extract() + + output('bool ', decode_function, + '(DisasContext *ctx, uint32_t insn)\n{\n') + + i4 =3D str_indent(4) + output(i4, 'union {\n') + for n in sorted(arguments.keys()): + f =3D arguments[n] + output(i4, i4, f.struct_name(), ' f_', f.name, ';\n') + output(i4, '} u;\n\n') + + t.output_code(4, False, 0, 0) + + output('}\n') + + if c_file: + output_file.close() +#end main + +if __name__ =3D=3D '__main__': + main() --=20 2.14.3
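
Note on the matching rule and the field examples in the header comment of decodetree.py: the sketch below restates them in plain Python so they can be checked by hand. It is illustrative only and not part of the patch. extract32 and sextract32 here are Python stand-ins named after the C helpers that the generated code calls; parse_fixedbits, matches, and concat_fields are hypothetical names introduced for this note.

```python
def extract32(value, start, length):
    """Unsigned bitfield: LENGTH bits of VALUE starting at bit START."""
    return (value >> start) & ((1 << length) - 1)


def sextract32(value, start, length):
    """Signed bitfield: as extract32, then sign-extend from the top bit."""
    field = extract32(value, start, length)
    sign_bit = 1 << (length - 1)
    return (field ^ sign_bit) - sign_bit


def concat_fields(insn, subfields):
    """Concatenate unsigned (pos, len) subfields, first one most significant.

    Hypothetical helper; mirrors "%imm9 16:6 10:3" from the header comment.
    """
    result = 0
    for pos, length in subfields:
        result = (result << length) | extract32(insn, pos, length)
    return result


def parse_fixedbits(tokens):
    """Fold fixedbit_elt tokens ('0', '1', '.') into (fixedbits, fixedmask).

    Hypothetical helper; handles only fixedbit_elt tokens, not field_elt.
    """
    fixedbits = 0
    fixedmask = 0
    width = 0
    for tok in tokens:
        for ch in tok:
            fixedbits = (fixedbits << 1) | (1 if ch == '1' else 0)
            fixedmask = (fixedmask << 1) | (0 if ch == '.' else 1)
            width += 1
    if width != 32:
        raise ValueError('definition has {0} bits'.format(width))
    return fixedbits, fixedmask


def matches(insn, fixedbits, fixedmask):
    """The match condition from the header: (insn & fixedmask) == fixedbits."""
    return (insn & fixedmask) == fixedbits


if __name__ == '__main__':
    bits, mask = parse_fixedbits(
        ['010000', '.....', '.....', '....', '0000000', '.....'])
    print('fixedbits=0x{0:08x} fixedmask=0x{1:08x}'.format(bits, mask))
```

For the fixed bits of the addl_r example this yields fixedbits 0x40000000 and fixedmask 0xfc000fe0. In the script itself the pattern's mask is additionally OR-ed with the fixed bits of the referenced format, so the sketch covers only the pattern's own tokens.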