From nobody Tue Oct 28 12:15:16 2025
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org
Date: Mon, 18 Dec 2017 09:45:30 -0800
Message-Id: <20171218174552.18871-2-richard.henderson@linaro.org>
In-Reply-To: <20171218174552.18871-1-richard.henderson@linaro.org>
References: <20171218174552.18871-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH 01/23] scripts: Add decodetree.py

To be used to decode ARM SVE, but could be used for any 32-bit RISC.
It would need additional work to extend to insn sizes other than 32-bit.

Signed-off-by: Richard Henderson
---
 scripts/decodetree.py | 984 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 984 insertions(+)
 create mode 100755 scripts/decodetree.py

diff --git a/scripts/decodetree.py b/scripts/decodetree.py
new file mode 100755
index 0000000000..acb0243915
--- /dev/null
+++ b/scripts/decodetree.py
@@ -0,0 +1,984 @@
+#!/usr/bin/env python
+#
+# Generate a decoding tree from a specification file.
+#
+# The tree is built from instruction "patterns". A pattern may represent
+# a single architectural instruction or a group of same, depending on what
+# is convenient for further processing.
+#
+# Each pattern has "fixedbits" & "fixedmask", the combination of which
+# describes the condition under which the pattern is matched:
+#
+#   (insn & fixedmask) == fixedbits
+#
+# Each pattern may have "fields", which are extracted from the insn and
+# passed along to the translator. Examples of such are registers,
+# immediates, and sub-opcodes.
+#
+# In support of patterns, one may declare fields, argument sets, and
+# formats, each of which may be re-used to simplify further definitions.
+#
+## Field syntax:
+#
+# field_def     := '%' identifier ( unnamed_field )+ ( !function=identifier )?
+# unnamed_field := number ':' ( 's' )? number
+#
+# For unnamed_field, the first number is the least-significant bit position of
+# the field and the second number is the length of the field. If the 's' is
+# present, the field is considered signed. If multiple unnamed_fields are
+# present, they are concatenated. In this way one can define disjoint fields.
+#
+# If !function is specified, the concatenated result is passed through the
+# named function, taking and returning an integral value.
+#
+# FIXME: the fields of the structure into which this result will be stored
+# are restricted to "int". Which means that we cannot expand 64-bit items.
+#
+# Field examples:
+#
+#   %disp   0:s16          -- sextract(i, 0, 16)
+#   %imm9   16:6 10:3      -- extract(i, 16, 6) << 3 | extract(i, 10, 3)
+#   %disp12 0:s1 1:1 2:10  -- sextract(i, 0, 1) << 11
+#                             | extract(i, 1, 1) << 10
+#                             | extract(i, 2, 10)
+#   %shimm8 5:s8 13:1 !function=expand_shimm8
+#                          -- expand_shimm8(sextract(i, 5, 8) << 1
+#                                           | extract(i, 13, 1))
+#
+## Argument set syntax:
+#
+# args_def    := '&' identifier ( args_elt )+
+# args_elt    := identifier
+#
+# Each args_elt defines an argument within the argument set.
+# Each argument set will be rendered as a C structure "arg_$name"
+# with each of the fields being one of the member arguments.
+#
+# Argument set examples:
+#
+#   &reg3       ra rb rc
+#   &loadstore  reg base offset
+#
+## Format syntax:
+#
+# fmt_def      := '@' identifier ( fmt_elt )+
+# fmt_elt      := fixedbit_elt | field_elt | field_ref | args_ref
+# fixedbit_elt := [01.]+
+# field_elt    := identifier ':' 's'? number
+# field_ref    := '%' identifier | identifier '=' '%' identifier
+# args_ref     := '&' identifier
+#
+# Defining a format is a handy way to avoid replicating groups of fields
+# across many instruction patterns.
+#
+# A fixedbit_elt describes a contiguous sequence of bits that must
+# be 1, 0, or "." for don't care.
+#
+# A field_elt describes a simple field only given a width; the position of
+# the field is implied by its position with respect to other fixedbit_elt
+# and field_elt.
+#
+# If any fixedbit_elt or field_elt appear then all 32 bits must be defined.
+# Padding with a fixedbit_elt of all '.' is an easy way to accomplish that.
+#
+# A field_ref incorporates a field by reference. This is the only way to
+# add a complex field to a format. A field may be renamed in the process
+# via assignment to another identifier. This is intended to allow the
+# same argument set to be used with disjoint named fields.
+#
+# A single args_ref may specify an argument set to use for the format.
+# The set of fields in the format must be a subset of the arguments in
+# the argument set. If an argument set is not specified, one will be
+# inferred from the set of fields.
+#
+# It is recommended, but not required, that all field_ref and args_ref
+# appear at the end of the line, not interleaving with fixedbit_elt or
+# field_elt.
+#
+# Format examples:
+#
+#   @opr    ...... ra:5 rb:5 ... 0 ....... rc:5
+#   @opi    ...... ra:5 lit:8    1 ....... rc:5
+#
+## Pattern syntax:
+#
+# pat_def      := identifier ( pat_elt )+
+# pat_elt      := fixedbit_elt | field_elt | field_ref
+#               | args_ref | fmt_ref | const_elt
+# fmt_ref      := '@' identifier
+# const_elt    := identifier '=' number
+#
+# The fixedbit_elt and field_elt specifiers are unchanged from formats.
+# A pattern that does not specify a named format will have one inferred
+# from a referenced argument set (if present) and the set of fields.
+#
+# A const_elt allows an argument to be set to a constant value. This may
+# come in handy when fields overlap between patterns and one has to
+# include the values in the fixedbit_elt instead.
+#
+# The decoder will call a translator function for each pattern matched.
+#
+# Pattern examples:
+#
+#   addl_r   010000 ..... ..... .... 0000000 ..... @opr
+#   addl_i   010000 ..... ..... .... 0000000 .....
@opi
+#
+# which will, in part, invoke
+#
+#   trans_addl_r(ctx, &arg_opr, insn)
+# and
+#   trans_addl_i(ctx, &arg_opi, insn)
+#
+
+import io
+import re
+import sys
+import getopt
+import pdb
+
+# ??? Parameterize insn_width from 32.
+fields = {}
+arguments = {}
+formats = {}
+patterns = []
+
+translate_prefix = 'trans'
+output_file = sys.stdout
+
+re_ident = '[a-zA-Z][a-zA-Z0-9_]*'
+
+def error(lineno, *args):
+    if lineno:
+        r = 'error:{0}:'.format(lineno)
+    else:
+        r = 'error:'
+    for a in args:
+        r += ' ' + str(a)
+    r += '\n'
+    sys.stderr.write(r)
+    exit(1)
+
+def output(*args):
+    global output_file
+    for a in args:
+        output_file.write(a)
+
+if sys.version_info >= (3, 0):
+    re_fullmatch = re.fullmatch
+else:
+    def re_fullmatch(pat, str):
+        return re.match('^' + pat + '$', str)
+
+def output_autogen():
+    output('/* This file is autogenerated. */\n\n')
+
+def str_indent(c):
+    """Return a string with C spaces"""
+    r = ''
+    for i in range(0, c):
+        r += ' '
+    return r
+
+def str_fields(fields):
+    """Return a string uniquely identifying FIELDS"""
+
+    r = ''
+    for n in sorted(fields.keys()):
+        r += '_' + n
+    return r[1:]
+
+def str_match_bits(bits, mask):
+    """Return a string pretty-printing BITS/MASK"""
+    i = 0x80000000
+    space = 0x01010100
+    r = ''
+    while i != 0:
+        if i & mask:
+            if i & bits:
+                r += '1'
+            else:
+                r += '0'
+        else:
+            r += '.'
+        if i & space:
+            r += ' '
+        i >>= 1
+    return r
+
+def is_pow2(bits):
+    return (bits & (bits - 1)) == 0
+
+def popcount(b):
+    b = (b & 0x55555555) + ((b >> 1) & 0x55555555)
+    b = (b & 0x33333333) + ((b >> 2) & 0x33333333)
+    b = (b & 0x0f0f0f0f) + ((b >> 4) & 0x0f0f0f0f)
+    b = (b + (b >> 8)) & 0x00ff00ff
+    b = (b + (b >> 16)) & 0xffff
+    return b
+
+def ctz(b):
+    r = 0
+    while ((b >> r) & 1) == 0:
+        r += 1
+    return r
+
+def is_contiguous(bits):
+    shift = ctz(bits)
+    if is_pow2((bits >> shift) + 1):
+        return shift
+    else:
+        return -1
+
+def bit_iterate(bits):
+    iter = 0
+    yield iter
+    if bits == 0:
+        return
+    while True:
+        this = bits
+        while True:
+            lsb = this & -this
+            if iter & lsb:
+                iter ^= lsb
+                this ^= lsb
+            else:
+                iter ^= lsb
+                break
+            if this == 0:
+                return
+        yield iter
+
+def eq_fields_for_args(flds_a, flds_b):
+    if len(flds_a) != len(flds_b):
+        return False
+    for k, a in flds_a.items():
+        if k not in flds_b:
+            return False
+    return True
+
+def eq_fields_for_fmts(flds_a, flds_b):
+    if len(flds_a) != len(flds_b):
+        return False
+    for k, a in flds_a.items():
+        if k not in flds_b:
+            return False
+        b = flds_b[k]
+        if a.__class__ != b.__class__ or a != b:
+            return False
+    return True
+
+class Field:
+    """Class representing a simple instruction field"""
+    def __init__(self, sign, pos, len):
+        self.sign = sign
+        self.pos = pos
+        self.len = len
+        self.mask = ((1 << len) - 1) << pos
+
+    def __str__(self):
+        if self.sign:
+            s = 's'
+        else:
+            s = ''
+        return str(self.pos) + ':' + s + str(self.len)
+
+    def str_extract(self):
+        if self.sign:
+            extr = 'sextract32'
+        else:
+            extr = 'extract32'
+        return '{0}(insn, {1}, {2})'.format(extr, self.pos, self.len)
+
+    def __eq__(self, other):
+        return self.sign == other.sign and self.mask == other.mask
+
+    def __ne__(self, other):
+        return not self.__eq__(other)
+# end Field
+
+class MultiField:
+    """Class representing a compound instruction field"""
+    def
__init__(self, subs):
+        self.subs = subs
+        self.sign = subs[0].sign
+        mask = 0
+        for s in subs:
+            mask |= s.mask
+        self.mask = mask
+
+    def __str__(self):
+        return str(self.subs)
+
+    def str_extract(self):
+        ret = '0'
+        pos = 0
+        for f in reversed(self.subs):
+            if pos == 0:
+                ret = f.str_extract()
+            else:
+                ret = 'deposit32({0}, {1}, {2}, {3})'.format(ret, pos, 32 - pos, f.str_extract())
+            pos += f.len
+        return ret
+
+    def __ne__(self, other):
+        if len(self.subs) != len(other.subs):
+            return True
+        for a, b in zip(self.subs, other.subs):
+            if a.__class__ != b.__class__ or a != b:
+                return True
+        return False
+
+    def __eq__(self, other):
+        return not self.__ne__(other)
+# end MultiField
+
+class ConstField:
+    """Class representing an argument field with constant value"""
+    def __init__(self, value):
+        self.value = value
+        self.mask = 0
+        self.sign = value < 0
+
+    def __str__(self):
+        return str(self.value)
+
+    def str_extract(self):
+        return str(self.value)
+
+    def __cmp__(self, other):
+        return self.value - other.value
+# end ConstField
+
+class FunctionField:
+    """Class representing a field passed through an expander"""
+    def __init__(self, func, base):
+        self.mask = base.mask
+        self.sign = base.sign
+        self.base = base
+        self.func = func
+
+    def __str__(self):
+        return self.func + '(' + str(self.base) + ')'
+
+    def str_extract(self):
+        return self.func + '(' + self.base.str_extract() + ')'
+
+    def __eq__(self, other):
+        return self.func == other.func and self.base == other.base
+    def __ne__(self, other):
+        return not self.__eq__(other)
+# end FunctionField
+
+class Arguments:
+    """Class representing the extracted fields of a format"""
+    def __init__(self, nm, flds):
+        self.name = nm
+        self.fields = sorted(flds)
+
+    def __str__(self):
+        return self.name + ' ' + str(self.fields)
+
+    def struct_name(self):
+        return 'arg_' + self.name
+
+    def output_def(self):
+        output('typedef struct {\n')
+        for n in
self.fields:
+            output(' int ', n, ';\n')
+        output('} ', self.struct_name(), ';\n\n')
+# end Arguments
+
+class General:
+    """Common code between instruction formats and instruction patterns"""
+    def __init__(self, name, base, fixb, fixm, fldm, flds):
+        self.name = name
+        self.base = base
+        self.fixedbits = fixb
+        self.fixedmask = fixm
+        self.fieldmask = fldm
+        self.fields = flds
+
+    def __str__(self):
+        r = self.name
+        if self.base:
+            r = r + ' ' + self.base.name
+        else:
+            r = r + ' ' + str(self.fields)
+        r = r + ' ' + str_match_bits(self.fixedbits, self.fixedmask)
+        return r
+
+    def str1(self, i):
+        return str_indent(i) + self.__str__()
+# end General
+
+class Format(General):
+    """Class representing an instruction format"""
+
+    def extract_name(self):
+        return 'extract_' + self.name
+
+    def output_extract(self):
+        output('static void ', self.extract_name(), '(',
+               self.base.struct_name(), ' *a, uint32_t insn)\n{\n')
+        for n, f in self.fields.items():
+            output(' a->', n, ' = ', f.str_extract(), ';\n')
+        output('}\n\n')
+# end Format
+
+class Pattern(General):
+    """Class representing an instruction pattern"""
+
+    def output_decl(self):
+        global translate_prefix
+        output('typedef ', self.base.base.struct_name(),
+               ' arg_', self.name, ';\n')
+        output('void ', translate_prefix, '_', self.name,
+               '(DisasContext *ctx, arg_', self.name,
+               ' *a, uint32_t insn);\n')
+
+    def output_code(self, i, extracted, outerbits, outermask):
+        global translate_prefix
+        ind = str_indent(i)
+        arg = self.base.base.name
+        if not extracted:
+            output(ind, self.base.extract_name(), '(&u.f_', arg, ', insn);\n')
+        for n, f in self.fields.items():
+            output(ind, 'u.f_', arg, '.', n, ' = ', f.str_extract(), ';\n')
+        output(ind, translate_prefix, '_', self.name,
+               '(ctx, &u.f_', arg, ', insn);\n')
+        output(ind, 'return true;\n')
+# end Pattern
+
+def parse_field(lineno, name, toks):
+    """Parse one instruction field from TOKS at LINENO"""
+    global fields
+    global
re_ident
+
+    # A "simple" field will have only one entry; a "multifield" will have several.
+    subs = []
+    width = 0
+    func = None
+    for t in toks:
+        if re_fullmatch('!function=' + re_ident, t):
+            if func:
+                error(lineno, 'duplicate function')
+            func = t.split('=')
+            func = func[1]
+            continue
+
+        if re_fullmatch('[0-9]+:s[0-9]+', t):
+            # Signed field extract
+            subtoks = t.split(':s')
+            sign = True
+        elif re_fullmatch('[0-9]+:[0-9]+', t):
+            # Unsigned field extract
+            subtoks = t.split(':')
+            sign = False
+        else:
+            error(lineno, 'invalid field token "{0}"'.format(t))
+        p = int(subtoks[0])
+        l = int(subtoks[1])
+        if p + l > 32:
+            error(lineno, 'field {0} too large'.format(t))
+        f = Field(sign, p, l)
+        subs.append(f)
+        width += l
+
+    if width > 32:
+        error(lineno, 'field too large')
+    if len(subs) == 1:
+        f = subs[0]
+    else:
+        f = MultiField(subs)
+    if func:
+        f = FunctionField(func, f)
+
+    if name in fields:
+        error(lineno, 'duplicate field', name)
+    fields[name] = f
+# end parse_field
+
+def parse_arguments(lineno, name, toks):
+    """Parse one argument set from TOKS at LINENO"""
+    global arguments
+    global re_ident
+
+    flds = []
+    for t in toks:
+        if not re_fullmatch(re_ident, t):
+            error(lineno, 'invalid argument set token "{0}"'.format(t))
+        flds.append(t)
+
+    if name in arguments:
+        error(lineno, 'duplicate argument set', name)
+    arguments[name] = Arguments(name, flds)
+# end parse_arguments
+
+def lookup_field(lineno, name):
+    global fields
+    if name in fields:
+        return fields[name]
+    error(lineno, 'undefined field', name)
+
+def add_field(lineno, flds, new_name, f):
+    if new_name in flds:
+        error(lineno, 'duplicate field', new_name)
+    flds[new_name] = f
+    return flds
+
+def add_field_byname(lineno, flds, new_name, old_name):
+    return add_field(lineno, flds, new_name, lookup_field(lineno, old_name))
+
+def infer_argument_set(flds):
+    global arguments
+
+    for arg in arguments.values():
+        if eq_fields_for_args(flds,
arg.fields):
+            return arg
+
+    name = str(len(arguments))
+    arg = Arguments(name, flds.keys())
+    arguments[name] = arg
+    return arg
+
+def infer_format(arg, fieldmask, flds):
+    global arguments
+    global formats
+
+    const_flds = {}
+    var_flds = {}
+    for n, c in flds.items():
+        if isinstance(c, ConstField):
+            const_flds[n] = c
+        else:
+            var_flds[n] = c
+
+    # Look for an existing format with the same argument set and fields
+    for fmt in formats.values():
+        if arg and fmt.base != arg:
+            continue
+        if fieldmask != fmt.fieldmask:
+            continue
+        if not eq_fields_for_fmts(flds, fmt.fields):
+            continue
+        return (fmt, const_flds)
+
+    name = 'Fmt_' + str(len(formats))
+    if not arg:
+        arg = infer_argument_set(flds)
+
+    fmt = Format(name, arg, 0, 0, fieldmask, var_flds)
+    formats[name] = fmt
+
+    return (fmt, const_flds)
+# end infer_format
+
+def parse_generic(lineno, is_format, name, toks):
+    """Parse one instruction format or pattern from TOKS at LINENO"""
+    global fields
+    global arguments
+    global formats
+    global patterns
+    global re_ident
+
+    fixedmask = 0
+    fixedbits = 0
+    width = 0
+    flds = {}
+    arg = None
+    fmt = None
+    for t in toks:
+        # '&Foo' gives a format an explicit argument set.
+        if t[0] == '&':
+            tt = t[1:]
+            if arg:
+                error(lineno, 'multiple argument sets')
+            if tt in arguments:
+                arg = arguments[tt]
+            else:
+                error(lineno, 'undefined argument set', t)
+            continue
+
+        # '@Foo' gives a pattern an explicit format.
+        if t[0] == '@':
+            tt = t[1:]
+            if fmt:
+                error(lineno, 'multiple formats')
+            if tt in formats:
+                fmt = formats[tt]
+            else:
+                error(lineno, 'undefined format', t)
+            continue
+
+        # '%Foo' imports a field.
+        if t[0] == '%':
+            tt = t[1:]
+            flds = add_field_byname(lineno, flds, tt, tt)
+            continue
+
+        # 'Foo=%Bar' imports a field with a different name.
+        if re_fullmatch(re_ident + '=%' + re_ident, t):
+            (fname, iname) = t.split('=%')
+            flds = add_field_byname(lineno, flds, fname, iname)
+            continue
+
+        # 'Foo=number' sets an argument field to a constant value
+        if re_fullmatch(re_ident + '=[0-9]+', t):
+            (fname, value) = t.split('=')
+            value = int(value)
+            flds = add_field(lineno, flds, fname, ConstField(value))
+            continue
+
+        # Pattern of 0s, 1s and dots indicate required zeros,
+        # required ones, or don't-cares.
+        if re_fullmatch('[01.]+', t):
+            shift = len(t)
+            fms = t.replace('0','1')
+            fms = fms.replace('.','0')
+            fbs = t.replace('.','0')
+            fms = int(fms, 2)
+            fbs = int(fbs, 2)
+            fixedbits = (fixedbits << shift) | fbs
+            fixedmask = (fixedmask << shift) | fms
+        # Otherwise, fieldname:fieldwidth
+        elif re_fullmatch(re_ident + ':s?[0-9]+', t):
+            (fname, flen) = t.split(':')
+            sign = False
+            if flen[0] == 's':
+                sign = True
+                flen = flen[1:]
+            shift = int(flen, 10)
+            f = Field(sign, 32 - width - shift, shift)
+            flds = add_field(lineno, flds, fname, f)
+            fixedbits <<= shift
+            fixedmask <<= shift
+        else:
+            error(lineno, 'invalid token "{0}"'.format(t))
+        width += shift
+
+    # We should have filled in all of the bits of the instruction.
+    if width != 32:
+        error(lineno, 'definition has {0} bits'.format(width))
+
+    # The fields that we add, or import, cannot overlap bits that we specify
+    fieldmask = 0
+    for f in flds.values():
+        fieldmask |= f.mask
+
+    # Fix up what we've parsed to match either a format or a pattern.
+    if is_format:
+        # Formats cannot reference formats.
+        if fmt:
+            error(lineno, 'format referencing format')
+        # If an argument set is given, then there should be no fields
+        # without a place to store it.
+        if arg:
+            for f in flds.keys():
+                if f not in arg.fields:
+                    error(lineno, 'field {0} not in argument set {1}'.format(f, arg.name))
+        else:
+            arg = infer_argument_set(flds)
+        if name in formats:
+            error(lineno, 'duplicate format name', name)
+        fmt = Format(name, arg, fixedbits, fixedmask, fieldmask, flds)
+        formats[name] = fmt
+    else:
+        # Patterns can reference a format ...
+        if fmt:
+            # ... but not an argument simultaneously
+            if arg:
+                error(lineno, 'pattern specifies both format and argument set')
+            fieldmask |= fmt.fieldmask
+            fixedbits |= fmt.fixedbits
+            fixedmask |= fmt.fixedmask
+        else:
+            (fmt, flds) = infer_format(arg, fieldmask, flds)
+        arg = fmt.base
+        for f in flds.keys():
+            if f not in arg.fields:
+                error(lineno, 'field {0} not in argument set {1}'.format(f, arg.name))
+        pat = Pattern(name, fmt, fixedbits, fixedmask, fieldmask, flds)
+        patterns.append(pat)
+
+    if fieldmask & fixedmask:
+        error(lineno, 'fieldmask overlaps fixedmask (0x{0:08x} & 0x{1:08x})'.format(fieldmask, fixedmask))
+# end parse_generic
+
+def parse_file(f):
+    """Parse all of the patterns within a file"""
+
+    # Read all of the lines of the file. Concatenate lines
+    # ending in backslash; discard empty lines and comments.
+    toks = []
+    lineno = 0
+    for line in f:
+        lineno += 1
+
+        # Discard comments
+        end = line.find('#')
+        if end >= 0:
+            line = line[:end]
+
+        t = line.split()
+        if len(toks) != 0:
+            # Next line after continuation
+            toks.extend(t)
+        elif len(t) == 0:
+            # Empty line
+            continue
+        else:
+            toks = t
+
+        # Continuation?
+        if toks[-1] == '\\':
+            toks.pop()
+            continue
+
+        if len(toks) < 2:
+            error(lineno, 'short line')
+
+        name = toks[0]
+        del toks[0]
+
+        # Determine the type of object needing to be parsed.
+        if name[0] == '%':
+            parse_field(lineno, name[1:], toks)
+        elif name[0] == '&':
+            parse_arguments(lineno, name[1:], toks)
+        elif name[0] == '@':
+            parse_generic(lineno, True, name[1:], toks)
+        else:
+            parse_generic(lineno, False, name, toks)
+        toks = []
+# end parse_file
+
+class Tree:
+    """Class representing a node in a decode tree"""
+
+    def __init__(self, fm, tm):
+        self.fixedmask = fm
+        self.thismask = tm
+        self.subs = []
+        self.base = None
+
+    def str1(self, i):
+        ind = str_indent(i)
+        r = '{0}{1:08x}'.format(ind, self.fixedmask)
+        if self.base:
+            r += ' ' + self.base.name
+        r += ' [\n'
+        for (b, s) in self.subs:
+            r += '{0} {1:08x}:\n'.format(ind, b)
+            r += s.str1(i + 4) + '\n'
+        r += ind + ']'
+        return r
+
+    def __str__(self):
+        return self.str1(0)
+
+    def output_code(self, i, extracted, outerbits, outermask):
+        ind = str_indent(i)
+
+        # If we identified that all nodes below have the same format,
+        # extract the fields now.
+        if not extracted and self.base:
+            output(ind, self.base.extract_name(),
+                   '(&u.f_', self.base.base.name, ', insn);\n')
+            extracted = True
+
+        # Attempt to aid the compiler in producing compact switch statements.
+        # If the bits in the mask are contiguous, extract them.
+        sh = is_contiguous(self.thismask)
+        if sh > 0:
+            str_switch = lambda b: \
+                '(insn >> {0}) & 0x{1:x}'.format(sh, b >> sh)
+            str_case = lambda b: '0x{0:x}'.format(b >> sh)
+        else:
+            str_switch = lambda b: 'insn & 0x{0:08x}'.format(b)
+            str_case = lambda b: '0x{0:08x}'.format(b)
+
+        output(ind, 'switch (', str_switch(self.thismask), ') {\n')
+        for b, s in sorted(self.subs):
+            rept = self.thismask & ~s.fixedmask
+            innermask = outermask | (self.thismask & ~rept)
+            innerbits = outerbits | b
+            for bb in bit_iterate(rept):
+                output(ind, 'case ', str_case(b | bb), ':\n')
+            output(ind, ' /* ',
+                   str_match_bits(innerbits, innermask), ' */\n')
+            s.output_code(i + 4, extracted, innerbits, innermask)
+        output(ind, '}\n')
+        output(ind, 'return false;\n')
+# end Tree
+
+def build_tree(pats, outerbits, outermask):
+    # Find the intersection of all remaining fixedmask.
+    innermask = ~outermask
+    for i in pats:
+        innermask &= i.fixedmask
+
+    if innermask == 0:
+        pnames = []
+        for p in pats:
+            pnames.append(p.name)
+        #pdb.set_trace()
+        error(0, 'overlapping patterns:', pnames)
+
+    fullmask = outermask | innermask
+    extramask = 0
+
+    # If there are few enough items, see how many undecoded bits remain.
+    # Otherwise, attempt to avoid a subsequent Tree level testing one bit.
+    if len(pats) < 8:
+        for i in pats:
+            extramask |= i.fixedmask & ~fullmask
+    else:
+        for i in pats:
+            e = i.fixedmask & ~fullmask
+            if e != 0 and popcount(e) <= 2:
+                extramask |= e
+
+    if popcount(extramask) < 4:
+        innermask |= extramask
+        fullmask |= extramask
+
+    # Sort each element of pats into the bin selected by the mask.
+    bins = {}
+    for i in pats:
+        fb = i.fixedbits & innermask
+        if fb in bins:
+            bins[fb].append(i)
+        else:
+            bins[fb] = [i]
+
+    # We must recurse if any bin has more than one element or if
+    # the single element in the bin has not been fully matched.
+    t = Tree(fullmask, innermask)
+
+    for b, l in bins.items():
+        s = l[0]
+        if len(l) > 1 or s.fixedmask & ~fullmask != 0:
+            s = build_tree(l, b | outerbits, fullmask)
+        t.subs.append((b, s))
+
+    return t
+# end build_tree
+
+def prop_format(tree):
+    """Propagate Format objects into the decode tree"""
+
+    # Depth first search.
+    for (b, s) in tree.subs:
+        if isinstance(s, Tree):
+            prop_format(s)
+
+    # If all entries in SUBS have the same format, then
+    # propagate that into the tree.
+    f = None
+    for (b, s) in tree.subs:
+        if f is None:
+            f = s.base
+            if f is None:
+                return
+        if f is not s.base:
+            return
+    tree.base = f
+# end prop_format
+
+
+def main():
+    global arguments
+    global formats
+    global patterns
+    global translate_prefix
+    global output_file
+
+    h_file = None
+    c_file = None
+    decode_function = 'decode'
+
+    long_opts = [ 'decode=', 'translate=', 'header=', 'output=' ]
+    try:
+        (opts, args) = getopt.getopt(sys.argv[1:], 'h:o:', long_opts)
+    except getopt.GetoptError as err:
+        error(0, err)
+    for o, a in opts:
+        if o in ('-h', '--header'):
+            h_file = a
+        elif o in ('-o', '--output'):
+            c_file = a
+        elif o == '--decode':
+            decode_function = a
+        elif o == '--translate':
+            translate_prefix = a
+        else:
+            assert False, 'unhandled option'
+
+    if len(args) < 1:
+        error(0, 'missing input file')
+    f = open(args[0], 'r')
+    parse_file(f)
+    f.close()
+
+    t = build_tree(patterns, 0, 0)
+    prop_format(t)
+
+    if h_file:
+        output_file = open(h_file, 'w')
+    elif c_file:
+        output_file = open(c_file, 'w')
+    else:
+        output_file = sys.stdout
+
+    output_autogen()
+    for n in sorted(arguments.keys()):
+        f = arguments[n]
+        f.output_def()
+
+    if h_file:
+        output('bool ', decode_function,
+               '(DisasContext *ctx, uint32_t insn);\n\n')
+
+    # A single translate function can be invoked for different patterns.
+    # Make sure that the argument sets are the same, and declare the
+    # function only once.
+    out_pats = {}
+    for i in patterns:
+        if i.name in out_pats:
+            p = out_pats[i.name]
+            if i.base.base != p.base.base:
+                error(0, i.name, ' has conflicting argument sets')
+        else:
+            i.output_decl()
+            out_pats[i.name] = i
+
+    if h_file:
+        output_file.close()
+        if c_file:
+            output_file = open(c_file, 'w')
+            output_autogen()
+
+    for n in sorted(formats.keys()):
+        f = formats[n]
+        f.output_extract()
+
+    output('bool ', decode_function,
+           '(DisasContext *ctx, uint32_t insn)\n{\n')
+
+    i4 = str_indent(4)
+    output(i4, 'union {\n')
+    for n in sorted(arguments.keys()):
+        f = arguments[n]
+        output(i4, i4, f.struct_name(), ' f_', f.name, ';\n')
+    output(i4, '} u;\n\n')
+
+    t.output_code(4, False, 0, 0)
+
+    output('}\n')
+
+    if c_file:
+        output_file.close()
+#end main
+
+if __name__ == '__main__':
+    main()
-- 
2.14.3

From nobody Tue Oct 28 12:15:16 2025
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org
Date: Mon, 18 Dec 2017 09:45:31 -0800
Message-Id: <20171218174552.18871-3-richard.henderson@linaro.org>
In-Reply-To: <20171218174552.18871-1-richard.henderson@linaro.org>
References: <20171218174552.18871-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH 02/23] target/arm: Add SVE decode skeleton

Including only 4, as-yet unimplemented, instruction patterns
so that the whole thing compiles.

Signed-off-by: Richard Henderson
---
 target/arm/translate-a64.h | 111 +++++++++++++++++++++++++++++++++++++++++++++
 target/arm/translate-a64.c |  91 +++++++------------------------------
 target/arm/translate-sve.c |  48 ++++++++++++++++++++
 .gitignore                 |   1 +
 target/arm/Makefile.objs   |  11 +++++
 target/arm/sve.def         |  45 ++++++++++++++++++
 6 files changed, 233 insertions(+), 74 deletions(-)
 create mode 100644 target/arm/translate-a64.h
 create mode 100644 target/arm/translate-sve.c
 create mode 100644 target/arm/sve.def

diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h
new file mode 100644
index 0000000000..9014b5bf8b
--- /dev/null
+++ b/target/arm/translate-a64.h
@@ -0,0 +1,111 @@
+/*
+ * AArch64 translation, common definitions.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see .
+ */
+
+#ifndef TARGET_ARM_TRANSLATE_A64_H
+#define TARGET_ARM_TRANSLATE_A64_H
+
+void unallocated_encoding(DisasContext *s);
+
+#define unsupported_encoding(s, insn)                                    \
+    do {                                                                 \
+        qemu_log_mask(LOG_UNIMP,                                         \
+                      "%s:%d: unsupported instruction encoding 0x%08x "  \
+                      "at pc=%016" PRIx64 "\n",                          \
+                      __FILE__, __LINE__, insn, s->pc - 4);              \
+        unallocated_encoding(s);                                         \
+    } while (0);
+
+TCGv_i64 new_tmp_a64(DisasContext *s);
+TCGv_i64 new_tmp_a64_zero(DisasContext *s);
+TCGv_i64 cpu_reg(DisasContext *s, int reg);
+TCGv_i64 cpu_reg_sp(DisasContext *s, int reg);
+TCGv_i64 read_cpu_reg(DisasContext *s, int reg, int sf);
+TCGv_i64 read_cpu_reg_sp(DisasContext *s, int reg, int sf);
+
+/* We should have at some point before trying to access an FP register
+ * done the necessary access check, so assert that
+ * (a) we did the check and
+ * (b) we didn't then just plough ahead anyway if it failed.
+ * Print the instruction pattern in the abort message so we can figure
+ * out what we need to fix if a user encounters this problem in the wild.
+ */
+static inline void assert_fp_access_checked(DisasContext *s)
+{
+#ifdef CONFIG_DEBUG_TCG
+    if (unlikely(!s->fp_access_checked || s->fp_excp_el)) {
+        fprintf(stderr, "target-arm: FP access check missing for "
+                "instruction 0x%08x\n", s->insn);
+        abort();
+    }
+#endif
+}
+
+/* Return the offset into CPUARMState of an element of specified
+ * size, 'element' places in from the least significant end of
+ * the FP/vector register Qn.
+ */
+static inline int vec_reg_offset(DisasContext *s, int regno,
+                                 int element, TCGMemOp size)
+{
+    int offs = 0;
+#ifdef HOST_WORDS_BIGENDIAN
+    /* This is complicated slightly because vfp.zregs[n].d[0] is
+     * still the low half and vfp.zregs[n].d[1] the high half
+     * of the 128 bit vector, even on big endian systems.
+     * Calculate the offset assuming a fully bigendian 128 bits,
+     * then XOR to account for the order of the two 64 bit halves.
+     */
+    offs += (16 - ((element + 1) * (1 << size)));
+    offs ^= 8;
+#else
+    offs += element * (1 << size);
+#endif
+    offs += offsetof(CPUARMState, vfp.zregs[regno]);
+    assert_fp_access_checked(s);
+    return offs;
+}
+
+/* Return the offset into CPUARMState of the "whole" vector register Qn. */
+static inline int vec_full_reg_offset(DisasContext *s, int regno)
+{
+    assert_fp_access_checked(s);
+    return offsetof(CPUARMState, vfp.zregs[regno]);
+}
+
+/* Return the offset into CPUARMState of the predicate vector register Pn.
+ * Note for this purpose, FFR is P16. */
+static inline int pred_full_reg_offset(DisasContext *s, int regno)
+{
+    assert_fp_access_checked(s);
+    return offsetof(CPUARMState, vfp.pregs[regno]);
+}
+
+/* Return the byte size of the "whole" vector register, VL / 8. */
+static inline int vec_full_reg_size(DisasContext *s)
+{
+    return s->sve_len;
+}
+
+/* Return the byte size of the whole predicate register, VL / 64.
*/
+static inline int pred_full_reg_size(DisasContext *s)
+{
+    return s->sve_len >> 3;
+}
+
+bool disas_sve(DisasContext *, uint32_t);
+
+#endif /* TARGET_ARM_TRANSLATE_A64_H */
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index ecb72e4d9c..8be1660661 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -36,13 +36,13 @@
 #include "exec/log.h"
 
 #include "trace-tcg.h"
+#include "translate-a64.h"
 
 static TCGv_i64 cpu_X[32];
 static TCGv_i64 cpu_pc;
 
 /* Load/store exclusive handling */
 static TCGv_i64 cpu_exclusive_high;
-static TCGv_i64 cpu_reg(DisasContext *s, int reg);
 
 static const char *regnames[] = {
     "x0", "x1", "x2", "x3", "x4", "x5", "x6", "x7",
@@ -390,22 +390,13 @@ static inline void gen_goto_tb(DisasContext *s, int n, uint64_t dest)
     }
 }
 
-static void unallocated_encoding(DisasContext *s)
+void unallocated_encoding(DisasContext *s)
 {
     /* Unallocated and reserved encodings are uncategorized */
     gen_exception_insn(s, 4, EXCP_UDEF, syn_uncategorized(),
                        default_exception_el(s));
 }
 
-#define unsupported_encoding(s, insn)                                    \
-    do {                                                                 \
-        qemu_log_mask(LOG_UNIMP,                                         \
-                      "%s:%d: unsupported instruction encoding 0x%08x "  \
-                      "at pc=%016" PRIx64 "\n",                          \
-                      __FILE__, __LINE__, insn, s->pc - 4);              \
-        unallocated_encoding(s);                                         \
-    } while (0);
-
 static void init_tmp_a64_array(DisasContext *s)
 {
 #ifdef CONFIG_DEBUG_TCG
@@ -423,13 +414,13 @@ static void free_tmp_a64(DisasContext *s)
     init_tmp_a64_array(s);
 }
 
-static TCGv_i64 new_tmp_a64(DisasContext *s)
+TCGv_i64 new_tmp_a64(DisasContext *s)
 {
     assert(s->tmp_a64_count < TMP_A64_MAX);
     return s->tmp_a64[s->tmp_a64_count++] = tcg_temp_new_i64();
 }
 
-static TCGv_i64 new_tmp_a64_zero(DisasContext *s)
+TCGv_i64 new_tmp_a64_zero(DisasContext *s)
 {
     TCGv_i64 t = new_tmp_a64(s);
     tcg_gen_movi_i64(t, 0);
@@ -451,7 +442,7 @@ static TCGv_i64 new_tmp_a64_zero(DisasContext *s)
  * to cpu_X[31] and ZR accesses to a temporary which can be discarded.
  * This is the point of the _sp forms.
  */
-static TCGv_i64 cpu_reg(DisasContext *s, int reg)
+TCGv_i64 cpu_reg(DisasContext *s, int reg)
 {
     if (reg == 31) {
         return new_tmp_a64_zero(s);
@@ -461,7 +452,7 @@ static TCGv_i64 cpu_reg(DisasContext *s, int reg)
 }
 
 /* register access for when 31 == SP */
-static TCGv_i64 cpu_reg_sp(DisasContext *s, int reg)
+TCGv_i64 cpu_reg_sp(DisasContext *s, int reg)
 {
     return cpu_X[reg];
 }
@@ -470,7 +461,7 @@ static TCGv_i64 cpu_reg_sp(DisasContext *s, int reg)
  * representing the register contents. This TCGv is an auto-freed
  * temporary so it need not be explicitly freed, and may be modified.
  */
-static TCGv_i64 read_cpu_reg(DisasContext *s, int reg, int sf)
+TCGv_i64 read_cpu_reg(DisasContext *s, int reg, int sf)
 {
     TCGv_i64 v = new_tmp_a64(s);
     if (reg != 31) {
@@ -485,7 +476,7 @@ static TCGv_i64 read_cpu_reg(DisasContext *s, int reg, int sf)
     return v;
 }
 
-static TCGv_i64 read_cpu_reg_sp(DisasContext *s, int reg, int sf)
+TCGv_i64 read_cpu_reg_sp(DisasContext *s, int reg, int sf)
 {
     TCGv_i64 v = new_tmp_a64(s);
     if (sf) {
@@ -496,62 +487,6 @@ static TCGv_i64 read_cpu_reg_sp(DisasContext *s, int reg, int sf)
     return v;
 }
 
-/* We should have at some point before trying to access an FP register
- * done the necessary access check, so assert that
- * (a) we did the check and
- * (b) we didn't then just plough ahead anyway if it failed.
- * Print the instruction pattern in the abort message so we can figure
- * out what we need to fix if a user encounters this problem in the wild.
- */
-static inline void assert_fp_access_checked(DisasContext *s)
-{
-#ifdef CONFIG_DEBUG_TCG
-    if (unlikely(!s->fp_access_checked || s->fp_excp_el)) {
-        fprintf(stderr, "target-arm: FP access check missing for "
-                "instruction 0x%08x\n", s->insn);
-        abort();
-    }
-#endif
-}
-
-/* Return the offset into CPUARMState of an element of specified
- * size, 'element' places in from the least significant end of
- * the FP/vector register Qn.
- */
-static inline int vec_reg_offset(DisasContext *s, int regno,
-                                 int element, TCGMemOp size)
-{
-    int offs = 0;
-#ifdef HOST_WORDS_BIGENDIAN
-    /* This is complicated slightly because vfp.zregs[n].d[0] is
-     * still the low half and vfp.zregs[n].d[1] the high half
-     * of the 128 bit vector, even on big endian systems.
-     * Calculate the offset assuming a fully bigendian 128 bits,
-     * then XOR to account for the order of the two 64 bit halves.
-     */
-    offs += (16 - ((element + 1) * (1 << size)));
-    offs ^= 8;
-#else
-    offs += element * (1 << size);
-#endif
-    offs += offsetof(CPUARMState, vfp.zregs[regno]);
-    assert_fp_access_checked(s);
-    return offs;
-}
-
-/* Return the offset info CPUARMState of the "whole" vector register Qn. */
-static inline int vec_full_reg_offset(DisasContext *s, int regno)
-{
-    assert_fp_access_checked(s);
-    return offsetof(CPUARMState, vfp.zregs[regno]);
-}
-
-/* Return the byte size of the "whole" vector register, VL / 8. */
-static inline int vec_full_reg_size(DisasContext *s)
-{
-    return s->sve_len;
-}
-
 /* Return a newly allocated pointer to the vector register.
*/
 static TCGv_ptr vec_full_reg_ptr(DisasContext *s, int regno)
 {
@@ -12705,7 +12640,15 @@ static void disas_a64_insn(CPUARMState *env, DisasContext *s)
     s->fp_access_checked = false;
 
     switch (extract32(insn, 25, 4)) {
-    case 0x0: case 0x1: case 0x2: case 0x3: /* UNALLOCATED */
+    case 0x0: case 0x1: case 0x3: /* UNALLOCATED */
+        unallocated_encoding(s);
+        break;
+    case 0x2:
+        if (arm_dc_feature(s, ARM_FEATURE_SVE)) {
+            if (!fp_access_check(s) || disas_sve(s, insn)) {
+                break;
+            }
+        }
         unallocated_encoding(s);
         break;
     case 0x8: case 0x9: /* Data processing - immediate */
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
new file mode 100644
index 0000000000..67ad94e310
--- /dev/null
+++ b/target/arm/translate-sve.c
@@ -0,0 +1,48 @@
+/*
+ * AArch64 SVE translation
+ *
+ * Copyright (c) 2017 Linaro, Ltd
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see .
+ */
+
+#include "qemu/osdep.h"
+#include "cpu.h"
+#include "exec/exec-all.h"
+#include "tcg-op.h"
+#include "tcg-op-gvec.h"
+#include "qemu/log.h"
+#include "arm_ldst.h"
+#include "translate.h"
+#include "internals.h"
+#include "exec/helper-proto.h"
+#include "exec/helper-gen.h"
+#include "exec/log.h"
+#include "trace-tcg.h"
+#include "translate-a64.h"
+
+/*
+ * Include the generated decoder.
+ */
+
+#include "decode-sve.inc.c"
+
+/*
+ * Implement all of the translator functions referenced by the decoder.
+ */
+
+void trans_AND_zzz(DisasContext *s, arg_AND_zzz *a, uint32_t insn) { unsupported_encoding(s, insn); }
+void trans_ORR_zzz(DisasContext *s, arg_ORR_zzz *a, uint32_t insn) { unsupported_encoding(s, insn); }
+void trans_EOR_zzz(DisasContext *s, arg_EOR_zzz *a, uint32_t insn) { unsupported_encoding(s, insn); }
+void trans_BIC_zzz(DisasContext *s, arg_BIC_zzz *a, uint32_t insn) { unsupported_encoding(s, insn); }
diff --git a/.gitignore b/.gitignore
index 588769b250..e5fc04de07 100644
--- a/.gitignore
+++ b/.gitignore
@@ -140,3 +140,4 @@ trace-dtrace-root.h
 trace-dtrace-root.dtrace
 trace-ust-all.h
 trace-ust-all.c
+/target/arm/decode-sve.inc.c
diff --git a/target/arm/Makefile.objs b/target/arm/Makefile.objs
index c2d32988f9..d1ca1f799b 100644
--- a/target/arm/Makefile.objs
+++ b/target/arm/Makefile.objs
@@ -10,3 +10,14 @@ obj-y += gdbstub.o
 obj-$(TARGET_AARCH64) += cpu64.o translate-a64.o helper-a64.o gdbstub64.o
 obj-y += crypto_helper.o
 obj-$(CONFIG_SOFTMMU) += arm-powerctl.o
+
+DECODETREE = $(SRC_PATH)/scripts/decodetree.py
+
+target/arm/decode-sve.inc.c: $(SRC_PATH)/target/arm/sve.def $(DECODETREE)
+	$(call quiet-command,\
+	  $(PYTHON) $(DECODETREE) -o $@ --decode disas_sve \
+		$(SRC_PATH)/target/arm/sve.def || rm -f $@, \
+	  "GEN", $@)
+
+target/arm/translate-sve.o: target/arm/decode-sve.inc.c
+obj-$(TARGET_AARCH64) += translate-sve.o
diff --git a/target/arm/sve.def b/target/arm/sve.def
new file mode 100644
index 0000000000..0f47a21ef0
--- /dev/null
+++ b/target/arm/sve.def
@@ -0,0 +1,45 @@
+# AArch64 SVE instruction descriptions
+#
+# Copyright (c) 2017 Linaro, Ltd
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library; if not, see .
+
+#
+# This file is processed by scripts/decodetree.py
+#
+
+###########################################################################
+# Named attribute sets. These are used to make nice(er) names
+# when creating helpers common to those for the individual
+# instruction patterns.
+
+&rrr_esz rd rn rm esz
+
+###########################################################################
+# Named instruction formats. These are generally used to
+# reduce the amount of duplication between instruction patterns.
+
+# Three operand with unused vector element size
+@rd_rn_rm ........ ... rm:5 ... ... rn:5 rd:5 &rrr_esz esz=0
+
+###########################################################################
+# Instruction patterns. Grouped according to the SVE encodingindex.xhtml.
+
+### SVE Logical - Unpredicated Group
+
+# SVE bitwise logical operations (unpredicated)
+AND_zzz 00000100 00 1 ..... 001 100 ..... ..... @rd_rn_rm
+ORR_zzz 00000100 01 1 ..... 001 100 ..... ..... @rd_rn_rm
+EOR_zzz 00000100 10 1 ..... 001 100 ..... ..... @rd_rn_rm
+BIC_zzz 00000100 11 1 ..... 001 100 ..... .....
@rd_rn_rm --=20 2.14.3 From nobody Tue Oct 28 12:15:16 2025 Delivered-To: importer@patchew.org Received-SPF: temperror (zoho.com: Error in retrieving data from DNS) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=temperror (zoho.com: Error in retrieving data from DNS) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1513621279371159.76883679938624; Mon, 18 Dec 2017 10:21:19 -0800 (PST) Received: from localhost ([::1]:36473 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eR02B-0000XP-TF for importer@patchew.org; Mon, 18 Dec 2017 13:21:03 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:55475) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eQzUI-00081V-1C for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:08 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1eQzUG-0001nn-LM for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:02 -0500 Received: from mail-pg0-x241.google.com ([2607:f8b0:400e:c05::241]:40516) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1eQzUG-0001nA-F9 for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:00 -0500 Received: by mail-pg0-x241.google.com with SMTP id k15so9432616pgr.7 for ; Mon, 18 Dec 2017 09:46:00 -0800 (PST) Received: from cloudburst.twiddle.net (174-21-7-63.tukw.qwest.net. 
[174.21.7.63]) by smtp.gmail.com with ESMTPSA id t84sm26209657pfe.160.2017.12.18.09.45.57 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Mon, 18 Dec 2017 09:45:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=gQ2kkF9An4v3zz2RnIXaDTstLgGEHB4P1MCQCEXfaFM=; b=WYuwpBbxF9PgrWrWhb68ihzzbJKA+no5t1Q183MlePzFyH5atBgsxiJghFo3nANy8X 1ongYR6mkT0owXaJX5eCPvgfzgUK2K98A9VwqvgZsFXgHWCQirgIzSFSs1t/bHNTgDPU cGuwTvPStDCH6W+/Yz+gQxYedPqHRzFowgJNA= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=gQ2kkF9An4v3zz2RnIXaDTstLgGEHB4P1MCQCEXfaFM=; b=b1eIgdCdww9nSsPfXacvuLotYBxt6Ltl54Ag8jgvGlweqXh98MekUHpb4W93B/0GSV Ya3aIVzfmilHMCfHdBACSBprSsLSSVxoIu4U+T0cS/tnVKm//0yCjuJKHmk1ld5X92tL UGw58hyyyW/lLAtYwvTB4GGVMkyKzm42d93PNZLxhLxGs+pUkaGrzHhNZWCZd7Dw9Ipf t5opITrV2mU24xgYn3i+ey+uF7hC0Co1nMnCeqm0tpLWIONv9EDt5tVmVdUrM1wAbUnp SgSdIXax/V9nkb3hh0KGgh2GysO/QVs+epA26f0/zRyFELhitx6FDYgXK308wPRLr9q9 Oo7A== X-Gm-Message-State: AKGB3mKEi8nQ/zgyRIxIj7o54jUJ1ucyUZ3dlH9MsaQOXOTf3KFExjNM tarslFCH0xyc3MAozEF1yJR1pgVeYcg= X-Google-Smtp-Source: ACJfBotIG0f6ic3m8OMy4XluDk42wtiNOEELGKLs228lJRPOz6mDBVAXtxRKmwy+JFm7Hm7zj+0cmg== X-Received: by 10.101.101.73 with SMTP id a9mr436893pgw.148.1513619159147; Mon, 18 Dec 2017 09:45:59 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Mon, 18 Dec 2017 09:45:32 -0800 Message-Id: <20171218174552.18871-4-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20171218174552.18871-1-richard.henderson@linaro.org> References: <20171218174552.18871-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c05::241 Subject: [Qemu-devel] [PATCH 03/23] target/arm: Implement SVE Bitwise Logical - Unpredicated Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_6 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" These were the instructions that were stubbed out when introducing the decode skeleton. Signed-off-by: Richard Henderson --- target/arm/translate-sve.c | 61 ++++++++++++++++++++++++++++++++++++++++++= +--- 1 file changed, 57 insertions(+), 4 deletions(-) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 67ad94e310..43420fa124 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -32,6 +32,10 @@ #include "trace-tcg.h" #include "translate-a64.h" =20 +typedef void GVecGen2Fn(unsigned, uint32_t, uint32_t, uint32_t, uint32_t); +typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t, + uint32_t, uint32_t, uint32_t); + /* * Include the generated decoder. */ @@ -42,7 +46,56 @@ * Implement all of the translator functions referenced by the decoder. 
*/ =20 -void trans_AND_zzz(DisasContext *s, arg_AND_zzz *a, uint32_t insn) { unsup= ported_encoding(s, insn); } -void trans_ORR_zzz(DisasContext *s, arg_ORR_zzz *a, uint32_t insn) { unsup= ported_encoding(s, insn); } -void trans_EOR_zzz(DisasContext *s, arg_EOR_zzz *a, uint32_t insn) { unsup= ported_encoding(s, insn); } -void trans_BIC_zzz(DisasContext *s, arg_BIC_zzz *a, uint32_t insn) { unsup= ported_encoding(s, insn); } +static unsigned size_for_gvec(unsigned s) +{ + if (s <=3D 8) { + return 8; + } else { + return QEMU_ALIGN_UP(s, 16); + } +} + +static void do_genfn2(DisasContext *s, GVecGen2Fn *gvec_fn, + int esz, int rd, int rn) +{ + unsigned vsz =3D size_for_gvec(vec_full_reg_size(s)); + gvec_fn(esz, vec_full_reg_offset(s, rd), + vec_full_reg_offset(s, rn), vsz, vsz); +} + +static void do_genfn3(DisasContext *s, GVecGen3Fn *gvec_fn, + int esz, int rd, int rn, int rm) +{ + unsigned vsz =3D size_for_gvec(vec_full_reg_size(s)); + gvec_fn(esz, vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn), + vec_full_reg_offset(s, rm), vsz, vsz); +} + +static void do_zzz_genfn(DisasContext *s, arg_rrr_esz *a, GVecGen3Fn *gvec= _fn) +{ + do_genfn3(s, gvec_fn, a->esz, a->rd, a->rn, a->rm); +} + +void trans_AND_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_zzz_genfn(s, a, tcg_gen_gvec_and); +} + +void trans_ORR_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + if (a->rn =3D=3D a->rm) { /* MOV */ + do_genfn2(s, tcg_gen_gvec_mov, 0, a->rd, a->rn); + } else { + do_genfn3(s, tcg_gen_gvec_or, 0, a->rd, a->rn, a->rm); + } +} + +void trans_EOR_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_zzz_genfn(s, a, tcg_gen_gvec_xor); +} + +void trans_BIC_zzz(DisasContext *s, arg_BIC_zzz *a, uint32_t insn) +{ + do_zzz_genfn(s, a, tcg_gen_gvec_andc); +} --=20 2.14.3 From nobody Tue Oct 28 12:15:16 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; 
envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 15136225538611006.5546533563565; Mon, 18 Dec 2017 10:42:33 -0800 (PST) Received: from localhost ([::1]:52232 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eR0Mv-0007ix-Bt for importer@patchew.org; Mon, 18 Dec 2017 13:42:29 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:55508) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eQzUK-00083P-70 for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:06 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1eQzUI-0001pM-6x for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:04 -0500 Received: from mail-pf0-x241.google.com ([2607:f8b0:400e:c00::241]:35853) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1eQzUH-0001oV-V8 for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:02 -0500 Received: by mail-pf0-x241.google.com with SMTP id p84so9976341pfd.3 for ; Mon, 18 Dec 2017 09:46:01 -0800 (PST) Received: from cloudburst.twiddle.net (174-21-7-63.tukw.qwest.net. 
[174.21.7.63]) by smtp.gmail.com with ESMTPSA id t84sm26209657pfe.160.2017.12.18.09.45.59 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Mon, 18 Dec 2017 09:45:59 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=zOMNyAcRduqLy3rPHjhFl619bBAjJ5WdPzQQou1JeW8=; b=aV3hinnEjUUyGOCP6X/9vCfCQESge3iMy7lsox0hE4O7mk+34aW4Rf3kVCsOO/zKZm 7/UTJgS86mSZWKgxrSiI/9p1feh0lxFiPJ+H3CbG0te+wqRdPyVj7vg8tRErr8RB1itQ cuEFvPy3a2bX++YMvPmn59Uw/bJ4syGg7mcX4= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=zOMNyAcRduqLy3rPHjhFl619bBAjJ5WdPzQQou1JeW8=; b=Jtcaq+ZDSMNtNgyF17TSKliOSraDXlotNP5/Jby/qL4NV8DvVk6Dlxew+cRvXga4NV Oc5nah9o7T1SsSupWTINPjPHDsCJ5jiAhJduohU8pT7Tupoy153eUqgZlwQdu+JqgHbP pLGwob4ir97RYXG85u3E9atd4WB6c2s/Bx3lhw77XZxlWEdYACjArtGozVA/Y3pPO+4x FsnI1dizUEWeWUJBBCDhUkpV/pjphRr8kBxilzZ2vHc120xsQAST7K5Ds1h5i6z0nH6B Cw9PnmXrtCij6J2LBWPxXqw5O2XZeukvJYzLDG9orHmetVTfDpx7hqzygv+xXiWpP7xj RSsw== X-Gm-Message-State: AKGB3mI5gYQwTX8x2Fx0z+nLFicrapTCin3FwWWAXRKQ7UQ7QhIt0Is5 90fgh5qcddDC6IR0ypsc4CoC/60nCvc= X-Google-Smtp-Source: ACJfBoscLhzZ+3T3kdFuVNTSgu9v0FyQTAk567OrxBvAuNA5Ku0DpFgSsUqebl2dePpdorOiJLmWug== X-Received: by 10.99.125.69 with SMTP id m5mr389166pgn.415.1513619160510; Mon, 18 Dec 2017 09:46:00 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Mon, 18 Dec 2017 09:45:33 -0800 Message-Id: <20171218174552.18871-5-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20171218174552.18871-1-richard.henderson@linaro.org> References: <20171218174552.18871-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c00::241 Subject: [Qemu-devel] [PATCH 04/23] target/arm: Implement PTRUE, PFALSE, SETFFR X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/translate-sve.c | 117 +++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/sve.def | 12 +++++ 2 files changed, 129 insertions(+) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 43420fa124..fabf6f0a67 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -99,3 +99,120 @@ void trans_BIC_zzz(DisasContext *s, arg_BIC_zzz *a, uin= t32_t insn) { do_zzz_genfn(s, a, tcg_gen_gvec_andc); } + +static uint64_t pred_esz_mask[4] =3D { + 0xffffffffffffffffull, 0x5555555555555555ull, + 0x1111111111111111ull, 0x0101010101010101ull +}; + +/* See the ARM pseudocode DecodePredCount. 
*/
+static unsigned decode_pred_count(unsigned fullsz, int pattern, int esz)
+{
+    unsigned elements = fullsz >> esz;
+
+    switch (pattern) {
+    case 0x0: /* POW2 */
+        return pow2floor(elements);
+    case 0x1: /* VL1 */
+    case 0x2: /* VL2 */
+    case 0x3: /* VL3 */
+    case 0x4: /* VL4 */
+    case 0x5: /* VL5 */
+    case 0x6: /* VL6 */
+    case 0x7: /* VL7 */
+    case 0x8: /* VL8 */
+        return MIN(pattern, elements);
+    case 0x9: /* VL16 */
+    case 0xa: /* VL32 */
+    case 0xb: /* VL64 */
+    case 0xc: /* VL128 */
+    case 0xd: /* VL256 */
+        return MIN(16 << (pattern - 9), elements);
+    case 0x1d: /* MUL4 */
+        return elements - elements % 4;
+    case 0x1e: /* MUL3 */
+        return elements - elements % 3;
+    case 0x1f: /* ALL */
+        return elements;
+    default: /* #uimm5 */
+        return 0;
+    }
+}
+
+/* For PTRUE, PTRUES, PFALSE, SETFFR. */
+void trans_pred_set(DisasContext *s, arg_pred_set *a, uint32_t insn)
+{
+    unsigned fullsz = vec_full_reg_size(s);
+    unsigned numelem, setsz, setalign, allalign, ofs;
+    uint64_t word, lastword;
+    TCGv_i64 t;
+
+    numelem = decode_pred_count(fullsz, a->pat, a->esz);
+
+    /* Determine what we must store into each bit, and how many. */
+    if (numelem == 0 || a->i == 0) {
+        lastword = word = 0;
+        setsz = fullsz;
+    } else {
+        setsz = numelem << a->esz;
+        lastword = word = pred_esz_mask[a->esz];
+        if (setsz % 64) {
+            lastword &= ~(-1ull << (setsz % 64));
+        }
+    }
+
+    /* Rescale from bits to bytes. */
+    fullsz /= 8;
+    setsz /= 8;
+
+    ofs = pred_full_reg_offset(s, a->rd);
+    setalign = QEMU_ALIGN_DOWN(setsz, 8);
+    allalign = QEMU_ALIGN_UP(fullsz, 16);
+
+    /* Perform the stores. Use the vector infrastructure if the sizes
+       are large enough.
*/
+    if (fullsz > 8) {
+        if (setsz >= 16 && setsz % 16 == 0) {
+            tcg_gen_gvec_dup64i(ofs, setsz, allalign, word);
+        } else if (setsz <= 8 && fullsz > 16) {
+            tcg_gen_gvec_dup64i(ofs, allalign, allalign, 0);
+        } else if (fullsz - setsz <= 8 && fullsz > 16) {
+            tcg_gen_gvec_dup64i(ofs, allalign, allalign, word);
+        } else {
+            unsigned i = 0;
+
+            t = tcg_temp_new_i64();
+            if (setalign > 0) {
+                tcg_gen_movi_i64(t, word);
+                for (; i < setalign; i += 8) {
+                    tcg_gen_st_i64(t, cpu_env, ofs + i);
+                }
+            }
+            if (lastword != word) {
+                tcg_gen_movi_i64(t, lastword);
+                tcg_gen_st_i64(t, cpu_env, ofs + i);
+                i += 8;
+            }
+            if (i < fullsz) {
+                tcg_gen_movi_i64(t, 0);
+                for (; i < fullsz; i += 8) {
+                    tcg_gen_st_i64(t, cpu_env, ofs + i);
+                }
+            }
+            tcg_temp_free_i64(t);
+            goto done;
+        }
+    }
+    t = tcg_const_i64(lastword);
+    tcg_gen_st_i64(t, cpu_env, ofs + setalign);
+    tcg_temp_free_i64(t);
+
+ done:
+    /* PTRUES */
+    if (a->s) {
+        tcg_gen_movi_i32(cpu_NF, -(lastword != 0));
+        tcg_gen_movi_i32(cpu_CF, lastword != 0);
+        tcg_gen_movi_i32(cpu_ZF, lastword == 0);
+        tcg_gen_movi_i32(cpu_VF, 0);
+    }
+}
diff --git a/target/arm/sve.def b/target/arm/sve.def
index 0f47a21ef0..f802031f51 100644
--- a/target/arm/sve.def
+++ b/target/arm/sve.def
@@ -25,6 +25,7 @@
 # instruction patterns.
 
 &rrr_esz rd rn rm esz
+&pred_set rd pat esz i s
 
 ###########################################################################
 # Named instruction formats. These are generally used to
@@ -43,3 +44,14 @@ AND_zzz 00000100 00 1 ..... 001 100 ..... ..... @rd_rn_rm
 ORR_zzz 00000100 01 1 ..... 001 100 ..... ..... @rd_rn_rm
 EOR_zzz 00000100 10 1 ..... 001 100 ..... ..... @rd_rn_rm
 BIC_zzz 00000100 11 1 ..... 001 100 ..... .....
@rd_rn_rm + +### SVE Predicate Generation Group + +# SVE initialize predicate (PTRUE, PTRUES) +pred_set 00100101 esz:2 011 00 s:1 111000 pat:5 0 rd:4 &pred_set i=3D1 + +# SVE zero predicate register (PFALSE) +pred_set 00100101 00 011 000 1110 0100 0000 rd:4 &pred_set pat=3D31 esz= =3D0 i=3D0 s=3D0 + +# SVE initialize FFR (SETFFR) +pred_set 00100101 0010 1100 1001 0000 0000 0000 &pred_set pat=3D31 esz= =3D0 rd=3D16 i=3D1 s=3D0 --=20 2.14.3 From nobody Tue Oct 28 12:15:16 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1513621456038985.4385039867964; Mon, 18 Dec 2017 10:24:16 -0800 (PST) Received: from localhost ([::1]:36589 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eR054-0002vw-Q5 for importer@patchew.org; Mon, 18 Dec 2017 13:24:02 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:55555) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eQzUM-00085X-An for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:09 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1eQzUJ-0001qj-Li for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:06 -0500 Received: from mail-pg0-x241.google.com ([2607:f8b0:400e:c05::241]:45266) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1eQzUJ-0001q6-EI for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:03 -0500 Received: by mail-pg0-x241.google.com with SMTP id m25so9422165pgv.12 for ; Mon, 18 Dec 2017 
09:46:03 -0800 (PST) Received: from cloudburst.twiddle.net (174-21-7-63.tukw.qwest.net. [174.21.7.63]) by smtp.gmail.com with ESMTPSA id t84sm26209657pfe.160.2017.12.18.09.46.00 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Mon, 18 Dec 2017 09:46:00 -0800 (PST) X-Received: by 10.101.80.200 with SMTP id s8mr436342pgp.260.1513619161861; Mon, 18 Dec 2017 09:46:01 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Mon, 18 Dec 2017 09:45:34 -0800 Message-Id: <20171218174552.18871-6-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20171218174552.18871-1-richard.henderson@linaro.org> References: <20171218174552.18871-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 2607:f8b0:400e:c05::241 Subject: [Qemu-devel] [PATCH 05/23] target/arm: Implement SVE predicate logical operations X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 37 ++++++ target/arm/helper.h | 1 + target/arm/sve_helper.c | 126 ++++++++++++++++++ target/arm/translate-sve.c | 314 +++++++++++++++++++++++++++++++++++++++++= +++- target/arm/Makefile.objs | 2 +- target/arm/sve.def | 21 +++ 6 files changed, 498 insertions(+), 3 deletions(-) create mode 100644 target/arm/helper-sve.h create mode 100644 target/arm/sve_helper.c diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h new file mode 100644 index 0000000000..4a923a33b8 --- /dev/null +++ b/target/arm/helper-sve.h @@ -0,0 +1,37 @@ +/* + * AArch64 SVE specific helper definitions + * + * Copyright (c) 2017 Linaro, Ltd + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, see . 
+ */ + +DEF_HELPER_FLAGS_5(sve_and_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) +DEF_HELPER_FLAGS_5(sve_bic_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) +DEF_HELPER_FLAGS_5(sve_eor_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) +DEF_HELPER_FLAGS_5(sve_sel_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) +DEF_HELPER_FLAGS_5(sve_orr_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) +DEF_HELPER_FLAGS_5(sve_orn_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) +DEF_HELPER_FLAGS_5(sve_nor_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) +DEF_HELPER_FLAGS_5(sve_nand_pred, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_ands_pred, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, ptr= , i32) +DEF_HELPER_FLAGS_5(sve_bics_pred, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, ptr= , i32) +DEF_HELPER_FLAGS_5(sve_eors_pred, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, ptr= , i32) +DEF_HELPER_FLAGS_5(sve_orrs_pred, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, ptr= , i32) +DEF_HELPER_FLAGS_5(sve_orns_pred, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, ptr= , i32) +DEF_HELPER_FLAGS_5(sve_nors_pred, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, ptr= , i32) +DEF_HELPER_FLAGS_5(sve_nands_pred, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/helper.h b/target/arm/helper.h index 206e39a207..3c4fca220e 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -587,4 +587,5 @@ DEF_HELPER_FLAGS_5(gvec_fcmlad, TCG_CALL_NO_RWG, =20 #ifdef TARGET_AARCH64 #include "helper-a64.h" +#include "helper-sve.h" #endif diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c new file mode 100644 index 0000000000..5d2a6b2239 --- /dev/null +++ b/target/arm/sve_helper.c @@ -0,0 +1,126 @@ +/* + * ARM SVE Operations + * + * Copyright (c) 2017 Linaro + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either 
+ * version 2 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, see . + */ + +#include "qemu/osdep.h" +#include "cpu.h" +#include "exec/exec-all.h" +#include "exec/helper-proto.h" +#include "tcg/tcg-gvec-desc.h" + + +/* Note that vector data is stored in host-endian 64-bit chunks, + so addressing units smaller than that needs a host-endian fixup. */ +#ifdef HOST_WORDS_BIGENDIAN +#define H1(x) ((x) ^ 7) +#define H2(x) ((x) ^ 3) +#define H4(x) ((x) ^ 1) +#else +#define H1(x) (x) +#define H2(x) (x) +#define H4(x) (x) +#endif + + +/* Given the first and last word of the result, the first and last word + of the governing mask, and the sum of the result, return a mask that + can be used to quickly set NZCV. 
*/ +static uint32_t predtest(uint64_t first_d, uint64_t first_g, uint64_t last= _d, + uint64_t last_g, uint64_t sum_d, uint64_t size_ma= sk) +{ + first_g &=3D size_mask; + first_d &=3D first_g & -first_g; + last_g &=3D size_mask; + last_d &=3D pow2floor(last_g); + + return ((first_d !=3D 0) << 31) | ((sum_d !=3D 0) << 1) | (last_d =3D= =3D 0); +} + +#define LOGICAL_PRED(NAME, FUNC) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \ +{ \ + uintptr_t opr_sz =3D simd_oprsz(desc); = \ + uint64_t *d =3D vd, *n =3D vn, *m =3D vm, *g =3D vg; = \ + uintptr_t i; \ + for (i =3D 0; i < opr_sz / 8; ++i) { = \ + d[i] =3D FUNC(n[i], m[i], g[i]); = \ + } \ +} + +#define LOGICAL_PRED_FLAGS(NAME, FUNC) \ +uint32_t HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t bit= s) \ +{ = \ + uint64_t *d =3D vd, *n =3D vn, *m =3D vm, *g =3D vg; = \ + uint64_t first_d =3D 0, first_g =3D 0, last_d =3D 0, last_g =3D 0, sum= _d =3D 0; \ + uintptr_t i =3D 0; = \ + for (; i < bits / 64; ++i) { = \ + last_g =3D g[i]; = \ + d[i] =3D last_d =3D FUNC(n[i], m[i], last_g); = \ + sum_d |=3D last_d; = \ + if (i =3D=3D 0) { = \ + first_g =3D last_g, first_d =3D last_d; = \ + } = \ + d[i] =3D last_d; = \ + } = \ + if (bits % 64) { = \ + last_g =3D g[i] & ~(-1ull << bits % 64); = \ + d[i] =3D last_d =3D FUNC(n[i], m[i], last_g); = \ + sum_d |=3D last_d; = \ + if (i =3D=3D 0) { = \ + first_g =3D last_g, first_d =3D last_d; = \ + } = \ + } = \ + return predtest(first_d, first_g, last_d, last_g, sum_d, -1); = \ +} + +#define DO_AND(N, M, G) (((N) & (M)) & (G)) +#define DO_BIC(N, M, G) (((N) & ~(M)) & (G)) +#define DO_EOR(N, M, G) (((N) ^ (M)) & (G)) +#define DO_ORR(N, M, G) (((N) | (M)) & (G)) +#define DO_ORN(N, M, G) (((N) | ~(M)) & (G)) +#define DO_NOR(N, M, G) (~((N) | (M)) & (G)) +#define DO_NAND(N, M, G) (~((N) & (M)) & (G)) +#define DO_SEL(N, M, G) (((N) & (G)) | ((M) & ~(G))) + +LOGICAL_PRED(sve_and_pred, DO_AND) +LOGICAL_PRED(sve_bic_pred, DO_BIC) 
+LOGICAL_PRED(sve_eor_pred, DO_EOR) +LOGICAL_PRED(sve_sel_pred, DO_SEL) +LOGICAL_PRED(sve_orr_pred, DO_ORR) +LOGICAL_PRED(sve_orn_pred, DO_ORN) +LOGICAL_PRED(sve_nor_pred, DO_NOR) +LOGICAL_PRED(sve_nand_pred, DO_NAND) + +LOGICAL_PRED_FLAGS(sve_ands_pred, DO_AND) +LOGICAL_PRED_FLAGS(sve_bics_pred, DO_BIC) +LOGICAL_PRED_FLAGS(sve_eors_pred, DO_EOR) +LOGICAL_PRED_FLAGS(sve_orrs_pred, DO_ORR) +LOGICAL_PRED_FLAGS(sve_orns_pred, DO_ORN) +LOGICAL_PRED_FLAGS(sve_nors_pred, DO_NOR) +LOGICAL_PRED_FLAGS(sve_nands_pred, DO_NAND) + +#undef LOGICAL_PRED +#undef LOGICAL_PRED_FLAGS +#undef DO_AND +#undef DO_BIC +#undef DO_EOR +#undef DO_ORR +#undef DO_ORN +#undef DO_NOR +#undef DO_NAND +#undef DO_SEL diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index fabf6f0a67..ab03ead000 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -63,6 +63,14 @@ static void do_genfn2(DisasContext *s, GVecGen2Fn *gvec_= fn, vec_full_reg_offset(s, rn), vsz, vsz); } =20 +static void do_genfn2_p(DisasContext *s, GVecGen2Fn *gvec_fn, + int esz, int rd, int rn) +{ + unsigned vsz =3D size_for_gvec(pred_full_reg_size(s)); + gvec_fn(esz, pred_full_reg_offset(s, rd), + pred_full_reg_offset(s, rn), vsz, vsz); +} + static void do_genfn3(DisasContext *s, GVecGen3Fn *gvec_fn, int esz, int rd, int rn, int rm) { @@ -71,9 +79,27 @@ static void do_genfn3(DisasContext *s, GVecGen3Fn *gvec_= fn, vec_full_reg_offset(s, rm), vsz, vsz); } =20 -static void do_zzz_genfn(DisasContext *s, arg_rrr_esz *a, GVecGen3Fn *gvec= _fn) +static void do_genfn3_p(DisasContext *s, GVecGen3Fn *gvec_fn, + int esz, int rd, int rn, int rm) +{ + unsigned vsz =3D size_for_gvec(pred_full_reg_size(s)); + gvec_fn(esz, pred_full_reg_offset(s, rd), pred_full_reg_offset(s, rn), + pred_full_reg_offset(s, rm), vsz, vsz); +} + +static void do_genop4_p(DisasContext *s, const GVecGen4 *gvec_op, + int rd, int rn, int rm, int pg) +{ + unsigned vsz =3D size_for_gvec(pred_full_reg_size(s)); +
tcg_gen_gvec_4(pred_full_reg_offset(s, rd), pred_full_reg_offset(s, rn= ), + pred_full_reg_offset(s, rm), pred_full_reg_offset(s, pg= ), + vsz, vsz, gvec_op); +} + + +static void do_zzz_genfn(DisasContext *s, arg_rrr_esz *a, GVecGen3Fn *fn) { - do_genfn3(s, gvec_fn, a->esz, a->rd, a->rn, a->rm); + do_genfn3(s, fn, a->esz, a->rd, a->rn, a->rm); } =20 void trans_AND_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn) @@ -216,3 +242,287 @@ void trans_pred_set(DisasContext *s, arg_pred_set *a,= uint32_t insn) tcg_gen_movi_i32(cpu_VF, 0); } } + +static void do_mov_p(DisasContext *s, int rd, int rn) +{ + do_genfn2_p(s, tcg_gen_gvec_mov, 0, rd, rn); +} + +static void gen_and_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64= pg) +{ + tcg_gen_and_i64(pd, pn, pm); + tcg_gen_and_i64(pd, pd, pg); +} + +static void gen_and_pg_vec(unsigned vece, TCGv_vec pd, TCGv_vec pn, + TCGv_vec pm, TCGv_vec pg) +{ + tcg_gen_and_vec(vece, pd, pn, pm); + tcg_gen_and_vec(vece, pd, pd, pg); +} + +void trans_AND_pppp(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +{ + static const GVecGen4 and_pg =3D { + .fni8 =3D gen_and_pg_i64, + .fniv =3D gen_and_pg_vec, + .fno =3D gen_helper_sve_and_pred, + .prefer_i64 =3D TCG_TARGET_REG_BITS =3D=3D 64, + }; + + if (a->pg =3D=3D a->rn && a->rn =3D=3D a->rm) { + do_mov_p(s, a->rd, a->rn); + } else if (a->pg =3D=3D a->rn || a->pg =3D=3D a->rm) { + do_genfn3_p(s, tcg_gen_gvec_and, 0, a->rd, a->rn, a->rm); + } else { + do_genop4_p(s, &and_pg, a->rd, a->rn, a->rm, a->pg); + } +} + +static void gen_bic_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64= pg) +{ + tcg_gen_andc_i64(pd, pn, pm); + tcg_gen_and_i64(pd, pd, pg); +} + +static void gen_bic_pg_vec(unsigned vece, TCGv_vec pd, TCGv_vec pn, + TCGv_vec pm, TCGv_vec pg) +{ + tcg_gen_andc_vec(vece, pd, pn, pm); + tcg_gen_and_vec(vece, pd, pd, pg); +} + +void trans_BIC_pppp(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +{ + static const GVecGen4 bic_pg =3D { + .fni8 =3D gen_bic_pg_i64, + .fniv =3D 
gen_bic_pg_vec, + .fno =3D gen_helper_sve_bic_pred, + .prefer_i64 =3D TCG_TARGET_REG_BITS =3D=3D 64, + }; + + if (a->pg =3D=3D a->rn) { + do_genfn3_p(s, tcg_gen_gvec_andc, 0, a->rd, a->rn, a->rm); + } else { + do_genop4_p(s, &bic_pg, a->rd, a->rn, a->rm, a->pg); + } +} + +static void gen_eor_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64= pg) +{ + tcg_gen_xor_i64(pd, pn, pm); + tcg_gen_and_i64(pd, pd, pg); +} + +static void gen_eor_pg_vec(unsigned vece, TCGv_vec pd, TCGv_vec pn, + TCGv_vec pm, TCGv_vec pg) +{ + tcg_gen_xor_vec(vece, pd, pn, pm); + tcg_gen_and_vec(vece, pd, pd, pg); +} + +void trans_EOR_pppp(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +{ + static const GVecGen4 eor_pg =3D { + .fni8 =3D gen_eor_pg_i64, + .fniv =3D gen_eor_pg_vec, + .fno =3D gen_helper_sve_eor_pred, + .prefer_i64 =3D TCG_TARGET_REG_BITS =3D=3D 64, + }; + do_genop4_p(s, &eor_pg, a->rd, a->rn, a->rm, a->pg); +} + +static void gen_sel_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64= pg) +{ + tcg_gen_and_i64(pn, pn, pg); + tcg_gen_andc_i64(pm, pm, pg); + tcg_gen_or_i64(pd, pn, pm); +} + +static void gen_sel_pg_vec(unsigned vece, TCGv_vec pd, TCGv_vec pn, + TCGv_vec pm, TCGv_vec pg) +{ + tcg_gen_and_vec(vece, pn, pn, pg); + tcg_gen_andc_vec(vece, pm, pm, pg); + tcg_gen_or_vec(vece, pd, pn, pm); +} + +void trans_SEL_pppp(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +{ + static const GVecGen4 sel_pg =3D { + .fni8 =3D gen_sel_pg_i64, + .fniv =3D gen_sel_pg_vec, + .fno =3D gen_helper_sve_sel_pred, + .prefer_i64 =3D TCG_TARGET_REG_BITS =3D=3D 64, + }; + do_genop4_p(s, &sel_pg, a->rd, a->rn, a->rm, a->pg); +} + +static void gen_orr_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64= pg) +{ + tcg_gen_or_i64(pd, pn, pm); + tcg_gen_and_i64(pd, pd, pg); +} + +static void gen_orr_pg_vec(unsigned vece, TCGv_vec pd, TCGv_vec pn, + TCGv_vec pm, TCGv_vec pg) +{ + tcg_gen_or_vec(vece, pd, pn, pm); + tcg_gen_and_vec(vece, pd, pd, pg); +} + +void trans_ORR_pppp(DisasContext *s, 
arg_rprr_esz *a, uint32_t insn) +{ + static const GVecGen4 orr_pg =3D { + .fni8 =3D gen_orr_pg_i64, + .fniv =3D gen_orr_pg_vec, + .fno =3D gen_helper_sve_orr_pred, + .prefer_i64 =3D TCG_TARGET_REG_BITS =3D=3D 64, + }; + + if (a->pg =3D=3D a->rn && a->rn =3D=3D a->rm) { + do_mov_p(s, a->rd, a->rn); + } else { + do_genop4_p(s, &orr_pg, a->rd, a->rn, a->rm, a->pg); + } +} + +static void gen_orn_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64= pg) +{ + tcg_gen_orc_i64(pd, pn, pm); + tcg_gen_and_i64(pd, pd, pg); +} + +static void gen_orn_pg_vec(unsigned vece, TCGv_vec pd, TCGv_vec pn, + TCGv_vec pm, TCGv_vec pg) +{ + tcg_gen_orc_vec(vece, pd, pn, pm); + tcg_gen_and_vec(vece, pd, pd, pg); +} + +void trans_ORN_pppp(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +{ + static const GVecGen4 orn_pg =3D { + .fni8 =3D gen_orn_pg_i64, + .fniv =3D gen_orn_pg_vec, + .fno =3D gen_helper_sve_orn_pred, + .prefer_i64 =3D TCG_TARGET_REG_BITS =3D=3D 64, + }; + do_genop4_p(s, &orn_pg, a->rd, a->rn, a->rm, a->pg); +} + +static void gen_nor_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64= pg) +{ + tcg_gen_or_i64(pd, pn, pm); + tcg_gen_andc_i64(pd, pg, pd); +} + +static void gen_nor_pg_vec(unsigned vece, TCGv_vec pd, TCGv_vec pn, + TCGv_vec pm, TCGv_vec pg) +{ + tcg_gen_or_vec(vece, pd, pn, pm); + tcg_gen_andc_vec(vece, pd, pg, pd); +} + +void trans_NOR_pppp(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +{ + static const GVecGen4 nor_pg =3D { + .fni8 =3D gen_nor_pg_i64, + .fniv =3D gen_nor_pg_vec, + .fno =3D gen_helper_sve_nor_pred, + .prefer_i64 =3D TCG_TARGET_REG_BITS =3D=3D 64, + }; + do_genop4_p(s, &nor_pg, a->rd, a->rn, a->rm, a->pg); +} + +static void gen_nand_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i6= 4 pg) +{ + tcg_gen_and_i64(pd, pn, pm); + tcg_gen_andc_i64(pd, pg, pd); +} + +static void gen_nand_pg_vec(unsigned vece, TCGv_vec pd, TCGv_vec pn, + TCGv_vec pm, TCGv_vec pg) +{ + tcg_gen_and_vec(vece, pd, pn, pm); + tcg_gen_andc_vec(vece, pd, pg, pd); +} 
+ +void trans_NAND_pppp(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +{ + static const GVecGen4 nand_pg =3D { + .fni8 =3D gen_nand_pg_i64, + .fniv =3D gen_nand_pg_vec, + .fno =3D gen_helper_sve_nand_pred, + .prefer_i64 =3D TCG_TARGET_REG_BITS =3D=3D 64, + }; + do_genop4_p(s, &nand_pg, a->rd, a->rn, a->rm, a->pg); +} + +/* A predicate logical operation that sets the flags is always implemented + out of line. The helper returns a 3-bit mask to set N,Z,C -- + N in bit 31, Z in bit 1, and C in bit 0. */ +static void do_logical_pppp_flags(DisasContext *s, arg_rprr_esz *a, + void (*gen_fn)(TCGv_i32, TCGv_ptr, TCGv_= ptr, + TCGv_ptr, TCGv_ptr, TCGv_= i32)) +{ + TCGv_i32 t =3D tcg_const_i32(vec_full_reg_size(s)); + TCGv_ptr pd =3D tcg_temp_new_ptr(); + TCGv_ptr pn =3D tcg_temp_new_ptr(); + TCGv_ptr pm =3D tcg_temp_new_ptr(); + TCGv_ptr pg =3D tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(pd, cpu_env, pred_full_reg_offset(s, a->rd)); + tcg_gen_addi_ptr(pn, cpu_env, pred_full_reg_offset(s, a->rn)); + tcg_gen_addi_ptr(pm, cpu_env, pred_full_reg_offset(s, a->rm)); + tcg_gen_addi_ptr(pg, cpu_env, pred_full_reg_offset(s, a->pg)); + + gen_fn(t, pd, pn, pm, pg, t); + + tcg_temp_free_ptr(pd); + tcg_temp_free_ptr(pn); + tcg_temp_free_ptr(pm); + tcg_temp_free_ptr(pg); + + tcg_gen_sari_i32(cpu_NF, t, 31); + tcg_gen_andi_i32(cpu_ZF, t, 2); + tcg_gen_andi_i32(cpu_CF, t, 1); + tcg_gen_movi_i32(cpu_VF, 0); + + tcg_temp_free_i32(t); +} + +void trans_ANDS_pppp(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +{ + do_logical_pppp_flags(s, a, gen_helper_sve_ands_pred); +} + +void trans_BICS_pppp(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +{ + do_logical_pppp_flags(s, a, gen_helper_sve_bics_pred); +} + +void trans_EORS_pppp(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +{ + do_logical_pppp_flags(s, a, gen_helper_sve_eors_pred); +} + +void trans_ORRS_pppp(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +{ + do_logical_pppp_flags(s, a, gen_helper_sve_orrs_pred); +} + +void
trans_ORNS_pppp(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +{ + do_logical_pppp_flags(s, a, gen_helper_sve_orns_pred); +} + +void trans_NORS_pppp(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +{ + do_logical_pppp_flags(s, a, gen_helper_sve_nors_pred); +} + +void trans_NANDS_pppp(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +{ + do_logical_pppp_flags(s, a, gen_helper_sve_nands_pred); +} diff --git a/target/arm/Makefile.objs b/target/arm/Makefile.objs index d1ca1f799b..edcd32db88 100644 --- a/target/arm/Makefile.objs +++ b/target/arm/Makefile.objs @@ -20,4 +20,4 @@ target/arm/decode-sve.inc.c: $(SRC_PATH)/target/arm/sve.d= ef $(DECODETREE) "GEN", $@) =20 target/arm/translate-sve.o: target/arm/decode-sve.inc.c -obj-$(TARGET_AARCH64) +=3D translate-sve.o +obj-$(TARGET_AARCH64) +=3D translate-sve.o sve_helper.o diff --git a/target/arm/sve.def b/target/arm/sve.def index f802031f51..77f96510d8 100644 --- a/target/arm/sve.def +++ b/target/arm/sve.def @@ -25,6 +25,7 @@ # instruction patterns. =20 &rrr_esz rd rn rm esz +&rprr_esz rd pg rn rm esz &pred_set rd pat esz i s =20 ########################################################################### @@ -34,6 +35,9 @@ # Three operand with unused vector element size @rd_rn_rm ........ ... rm:5 ... ... rn:5 rd:5 &rrr_esz esz=3D0 =20 +# Three prediate operand, with governing predicate, unused vector element = size +@pd_pg_pn_pm ........ .... rm:4 .. pg:4 . rn:4 . rd:4 &rprr_esz esz=3D0 + ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. =20 @@ -55,3 +59,20 @@ pred_set 00100101 00 011 000 1110 0100 0000 rd:4 &pred= _set pat=3D31 esz=3D0 i=3D0 s=3D =20 # SVE initialize FFR (SETFFR) pred_set 00100101 0010 1100 1001 0000 0000 0000 &pred_set pat=3D31 esz= =3D0 rd=3D16 i=3D1 s=3D0 + +# SVE predicate logical operations +AND_pppp 00100101 00 00 .... 01 .... 0 .... 0 .... @pd_pg_pn_pm +BIC_pppp 00100101 00 00 .... 01 .... 
0 .... 1 .... @pd_pg_pn_pm +EOR_pppp 00100101 00 00 .... 01 .... 1 .... 0 .... @pd_pg_pn_pm +SEL_pppp 00100101 00 00 .... 01 .... 1 .... 1 .... @pd_pg_pn_pm +ANDS_pppp 00100101 01 00 .... 01 .... 0 .... 0 .... @pd_pg_pn_pm +BICS_pppp 00100101 01 00 .... 01 .... 0 .... 1 .... @pd_pg_pn_pm +EORS_pppp 00100101 01 00 .... 01 .... 1 .... 0 .... @pd_pg_pn_pm +ORR_pppp 00100101 10 00 .... 01 .... 0 .... 0 .... @pd_pg_pn_pm +ORN_pppp 00100101 10 00 .... 01 .... 0 .... 1 .... @pd_pg_pn_pm +NOR_pppp 00100101 10 00 .... 01 .... 1 .... 0 .... @pd_pg_pn_pm +NAND_pppp 00100101 10 00 .... 01 .... 1 .... 1 .... @pd_pg_pn_pm +ORRS_pppp 00100101 11 00 .... 01 .... 0 .... 0 .... @pd_pg_pn_pm +ORNS_pppp 00100101 11 00 .... 01 .... 0 .... 1 .... @pd_pg_pn_pm +NORS_pppp 00100101 11 00 .... 01 .... 1 .... 0 .... @pd_pg_pn_pm +NANDS_pppp 00100101 11 00 .... 01 .... 1 .... 1 .... @pd_pg_pn_pm --=20 2.14.3 From nobody Tue Oct 28 12:15:16 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 1513622424660752.6962037459455; Mon, 18 Dec 2017 10:40:24 -0800 (PST) Received: from localhost ([::1]:52043 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eR0Km-0005O1-1J for importer@patchew.org; Mon, 18 Dec 2017 13:40:16 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:55569) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eQzUN-00086M-05 for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:08 -0500 Received: from Debian-exim by eggs.gnu.org with 
spam-scanned (Exim 4.71) (envelope-from ) id 1eQzUL-0001sB-A2 for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:07 -0500 Received: from mail-pg0-x244.google.com ([2607:f8b0:400e:c05::244]:46628) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1eQzUL-0001rY-34 for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:05 -0500 Received: by mail-pg0-x244.google.com with SMTP id b11so9421599pgu.13 for ; Mon, 18 Dec 2017 09:46:04 -0800 (PST) Received: from cloudburst.twiddle.net (174-21-7-63.tukw.qwest.net. [174.21.7.63]) by smtp.gmail.com with ESMTPSA id t84sm26209657pfe.160.2017.12.18.09.46.01 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Mon, 18 Dec 2017 09:46:02 -0800 (PST) X-Received: by 10.101.101.215 with SMTP id y23mr394201pgv.391.1513619163634; Mon, 18 Dec 2017 09:46:03 -0800
(PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Mon, 18 Dec 2017 09:45:35 -0800 Message-Id: <20171218174552.18871-7-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20171218174552.18871-1-richard.henderson@linaro.org> References: <20171218174552.18871-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::244 Subject: [Qemu-devel] [PATCH 06/23] target/arm: Implement SVE load vector/predicate X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 2 ++ target/arm/sve_helper.c | 31 +++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 32 ++++++++++++++++++++++++++++++++ target/arm/sve.def | 16 ++++++++++++++++ 4 files changed, 81 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 4a923a33b8..8b382a962d 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -35,3 +35,5 @@ DEF_HELPER_FLAGS_5(sve_orns_pred, TCG_CALL_NO_RWG, i32, p= tr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_nors_pred, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_nands_pred, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_ldr, TCG_CALL_NO_WG, void, env, ptr, tl, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 5d2a6b2239..a605e623f7 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -20,6 +20,7 @@ #include "qemu/osdep.h" 
#include "cpu.h" #include "exec/exec-all.h" +#include "exec/cpu_ldst.h" #include "exec/helper-proto.h" #include "tcg/tcg-gvec-desc.h" =20 @@ -124,3 +125,33 @@ LOGICAL_PRED_FLAGS(sve_nands_pred, DO_NAND) #undef DO_NOR #undef DO_NAND #undef DO_SEL + +void HELPER(sve_ldr)(CPUARMState *env, void *d, target_ulong addr, uint32_= t len) +{ + intptr_t i, len_align =3D QEMU_ALIGN_DOWN(len, 8); + + for (i =3D 0; i < len_align; i +=3D 8) { + *(uint64_t *)(d + i) =3D cpu_ldq_data(env, addr + i); + } + + /* For LDR of predicate registers, we can have any multiple of 2. */ + switch (len % 8) { + case 0: + break; + case 2: + *(uint64_t *)(d + i) =3D cpu_lduw_data(env, addr + i); + break; + case 4: + *(uint64_t *)(d + i) =3D (uint32_t)cpu_ldl_data(env, addr + i); + break; + case 6: + { + uint32_t t0 =3D cpu_ldl_data(env, addr + i); + uint32_t t1 =3D cpu_lduw_data(env, addr + i + 4); + *(uint64_t *)(d + i) =3D deposit64(t0, 32, 32, t1); + } + break; + default: + g_assert_not_reached(); + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index ab03ead000..0e988c03aa 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -526,3 +526,35 @@ void trans_NANDS_pppp(DisasContext *s, arg_rprr_esz *a= , uint32_t insn) { do_logical_pppp_flags(s, a, gen_helper_sve_nands_pred); } + +static void do_ldr(DisasContext *s, uint32_t vofs, uint32_t len, + int rn, int imm) +{ + TCGv_ptr vptr; + TCGv_i32 tlen; + TCGv_i64 addr =3D tcg_temp_new_i64(); + + tcg_gen_addi_i64(addr, cpu_reg_sp(s, rn), imm); + + vptr =3D tcg_temp_new_ptr(); + tlen =3D tcg_const_i32(len); + tcg_gen_addi_ptr(vptr, cpu_env, vofs); + + gen_helper_sve_ldr(cpu_env, vptr, addr, tlen); + + tcg_temp_free_ptr(vptr); + tcg_temp_free_i32(tlen); + tcg_temp_free_i64(addr); +} + +void trans_LDR_zri(DisasContext *s, arg_rri *a, uint32_t insn) +{ + int size =3D vec_full_reg_size(s); + do_ldr(s, vec_full_reg_offset(s, a->rd), size, a->rn, a->imm * size); +} + +void trans_LDR_pri(DisasContext *s,
arg_rri *a, uint32_t insn) +{ + int size =3D pred_full_reg_size(s); + do_ldr(s, pred_full_reg_offset(s, a->rd), size, a->rn, a->imm * size); +} diff --git a/target/arm/sve.def b/target/arm/sve.def index 77f96510d8..d1172296e0 100644 --- a/target/arm/sve.def +++ b/target/arm/sve.def @@ -19,11 +19,17 @@ # This file is processed by scripts/decodetree.py # =20 +########################################################################### +# Named fields. These are primarily for disjoint fields. + +%imm9_16_10 16:s6 10:3 + ########################################################################### # Named attribute sets. These are used to make nice(er) names # when creating helpers common to those for the individual # instruction patterns. =20 +&rri rd rn imm &rrr_esz rd rn rm esz &rprr_esz rd pg rn rm esz &pred_set rd pat esz i s @@ -38,6 +44,10 @@ # Three prediate operand, with governing predicate, unused vector element = size @pd_pg_pn_pm ........ .... rm:4 .. pg:4 . rn:4 . rd:4 &rprr_esz esz=3D0 =20 +# Basic Load/Store with 9-bit immediate offset +@pd_rn_i9 ........ ........ ...... rn:5 . rd:4 &rri imm=3D%imm9_16_10 +@rd_rn_i9 ........ ........ ...... rn:5 rd:5 &rri imm=3D%imm9_16_10 + ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. =20 @@ -76,3 +86,9 @@ ORRS_pppp 00100101 11 00 .... 01 .... 0 .... 0 .... @pd_= pg_pn_pm ORNS_pppp 00100101 11 00 .... 01 .... 0 .... 1 .... @pd_pg_pn_pm NORS_pppp 00100101 11 00 .... 01 .... 1 .... 0 .... @pd_pg_pn_pm NANDS_pppp 00100101 11 00 .... 01 .... 1 .... 1 .... @pd_pg_pn_pm + +# SVE load predicate register +LDR_pri 10000101 10 ...... 000 ... ..... 0 .... @pd_rn_i9 + +# SVE load vector register +LDR_zri 1000010110 ...... 010 ... ..... ..... 
@rd_rn_i9 --=20 2.14.3 From nobody Tue Oct 28 12:15:16 2025 Delivered-To: importer@patchew.org Received-SPF: temperror (zoho.com: Error in retrieving data from DNS) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=temperror (zoho.com: Error in retrieving data from DNS) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1513620683023797.402488480187; Mon, 18 Dec 2017 10:11:23 -0800 (PST) Received: from localhost ([::1]:60223 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eQzsn-0006Ry-Pz for importer@patchew.org; Mon, 18 Dec 2017 13:11:21 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:55662) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eQzUQ-0008AO-DV for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:12 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1eQzUN-0001ue-F3 for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:10 -0500 Received: from mail-pg0-x241.google.com ([2607:f8b0:400e:c05::241]:35886) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1eQzUN-0001tp-7C for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:07 -0500 Received: by mail-pg0-x241.google.com with SMTP id k134so9433356pga.3 for ; Mon, 18 Dec 2017 09:46:07 -0800 (PST) Received: from cloudburst.twiddle.net (174-21-7-63.tukw.qwest.net. 
[174.21.7.63]) by smtp.gmail.com with ESMTPSA id t84sm26209657pfe.160.2017.12.18.09.46.03 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Mon, 18 Dec 2017 09:46:04 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=c7cxL5hsUIoyNpOX1VT77sen2WskOrn4OyokdNkVw08=; b=dydhplHLgxeyiFBYfL1c6zLWhXXMYIjYfr0eCcc3F4H1VmDQr2gjBBE7J75pIypUio RzBCO/4ZPteBrajvH0KMaLn5rnP2O+nAAKXmYx54JY1UmCZvm8JoKsEK1TmmrT+PGt50 TsQRczYlBiu7ttPRcCpMN38Oo2tDLQ037qOVA= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=c7cxL5hsUIoyNpOX1VT77sen2WskOrn4OyokdNkVw08=; b=gP0qNyVmjRRCxGdEIGd6JWxXpXQcmUf2Kg/BxkRkns9RtzVdy6yY9P6GZeDk6cQc/W jnr2VtGT+vJffzjql0Fi/p/d8q6BpX9vG/zg6p4Ztekz34z8P8EoOfdkyftBZ0RdpjcQ vfi2c9LHbVXOVt0MdkySMUpXtwtqtaV4jLWngP5ve2S8jsFbItAnOWZKQqRp/FDBbf0g Z997TFzYwI67M0ktOVRelk15moWuFyUfcKBzsR8wE945+Tf/cCoA0ysvMdG/LoE/Kaa5 xVvZTx7BTsK9f1PvsifXLFb+x1W53/GDSXLzWfYQlXvzPKnA3WMnvIQK6PH2h7UeL8zx Dajg== X-Gm-Message-State: AKGB3mIrClX2PILWT0/MnAMP2fUpp/0gAxvVfhejKnXjSPVdS32S9GqZ wIV0kEGq6gGTP8Cdcddo3kbpmY9aaHk= X-Google-Smtp-Source: ACJfBouSuwH7qAaLjUCAw11J5sIanoqFkXwSWpM0kPREAds4lOtU3O2v2a4PgWpVa1zrtjCKmXNlpQ== X-Received: by 10.99.95.23 with SMTP id t23mr422164pgb.338.1513619165571; Mon, 18 Dec 2017 09:46:05 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Mon, 18 Dec 2017 09:45:36 -0800 Message-Id: <20171218174552.18871-8-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20171218174552.18871-1-richard.henderson@linaro.org> References: <20171218174552.18871-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c05::241 Subject: [Qemu-devel] [PATCH 07/23] target/arm: Implement SVE Integer Binary Arithmetic - Predicated Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_6 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 145 ++++++++++++++++++++++++++++++++ target/arm/sve_helper.c | 203 +++++++++++++++++++++++++++++++++++++++++= ++-- target/arm/translate-sve.c | 75 +++++++++++++++++ target/arm/sve.def | 39 +++++++++ 4 files changed, 455 insertions(+), 7 deletions(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 8b382a962d..964b15b104 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -17,6 +17,151 @@ * License along with this library; if not, see . 
*/ =20 +DEF_HELPER_FLAGS_5(sve_and_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_and_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_and_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_and_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_eor_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_eor_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_eor_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_eor_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_orr_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_orr_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_orr_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_orr_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_bic_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_bic_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_bic_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_bic_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_add_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_add_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_add_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_add_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_sub_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_sub_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_sub_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) 
+DEF_HELPER_FLAGS_5(sve_sub_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_smax_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_smax_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_smax_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_smax_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_umax_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_umax_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_umax_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_umax_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_smin_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_smin_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_smin_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_smin_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_umin_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_umin_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_umin_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_umin_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_sabd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_sabd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_sabd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_sabd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_uabd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_uabd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, 
ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_uabd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_uabd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_mul_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_mul_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_mul_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_mul_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_smulh_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_smulh_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_smulh_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_smulh_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_umulh_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_umulh_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_umulh_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_umulh_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_sdiv_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_sdiv_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_udiv_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_udiv_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_bic_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_eor_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index a605e623f7..b617ea2c04 100644 --- a/target/arm/sve_helper.c +++ 
b/target/arm/sve_helper.c @@ -28,13 +28,17 @@ /* Note that vector data is stored in host-endian 64-bit chunks, so addressing units smaller than that needs a host-endian fixup. */ #ifdef HOST_WORDS_BIGENDIAN -#define H1(x) ((x) ^ 7) -#define H2(x) ((x) ^ 3) -#define H4(x) ((x) ^ 1) +#define H1(x) ((x) ^ 7) +#define H1_2(x) ((x) ^ 6) +#define H1_4(x) ((x) ^ 4) +#define H2(x) ((x) ^ 3) +#define H4(x) ((x) ^ 1) #else -#define H1(x) (x) -#define H2(x) (x) -#define H4(x) (x) +#define H1(x) (x) +#define H1_2(x) (x) +#define H1_4(x) (x) +#define H2(x) (x) +#define H4(x) (x) #endif =20 =20 @@ -117,7 +121,7 @@ LOGICAL_PRED_FLAGS(sve_nands_pred, DO_NAND) =20 #undef LOGICAL_PRED #undef LOGICAL_PRED_FLAGS -#undef DO_ADD +#undef DO_AND #undef DO_BIC #undef DO_EOR #undef DO_ORR @@ -126,6 +130,191 @@ LOGICAL_PRED_FLAGS(sve_nands_pred, DO_NAND) #undef DO_NAND #undef DO_SEL =20 +/* Fully general three-operand expander, controlled by a predicate. + * This is complicated by the host-endian storage of the register file. + */ +/* ??? I don't expect the compiler could ever vectorize this itself. + * With some tables we can convert bit masks to byte masks, and with + * extra care wrt byte/word ordering we could use gcc generic vectors + * and do 16 bytes at a time. + */ +#define DO_ZPZZ(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \ +{ \ + intptr_t iv =3D 0, ib =3D 0, opr_sz =3D simd_oprsz(desc); = \ + for (iv =3D ib =3D 0; iv < opr_sz; iv +=3D 16, ib +=3D 2) { = \ + uint16_t pg =3D *(uint16_t *)(vg + H2(ib)); \ + intptr_t i =3D 0; \ + do { \ + if (pg & 1) { \ + TYPE nn =3D *(TYPE *)(vn + iv + H(i)); \ + TYPE mm =3D *(TYPE *)(vm + iv + H(i)); \ + *(TYPE *)(vd + iv + H(i)) =3D OP(nn, mm); \ + } \ + i +=3D sizeof(TYPE), pg >>=3D sizeof(TYPE); = \ + } while (pg); \ + } \ +} + +/* Similarly, specialized for 64-bit operands. 
*/ +#define DO_ZPZZ_D(NAME, TYPE, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \ +{ \ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; \ + TYPE *d =3D vd, *n =3D vn, *m =3D vm; \ + uint8_t *pg =3D vg; \ + for (i =3D 0; i < opr_sz; i +=3D 1) { \ + if (pg[H1(i)] & 1) { \ + TYPE nn =3D n[i], mm =3D m[i]; \ + d[i] =3D OP(nn, mm); \ + } \ + } \ +} + +#define DO_AND(N, M) (N & M) +#define DO_EOR(N, M) (N ^ M) +#define DO_ORR(N, M) (N | M) +#define DO_BIC(N, M) (N &~ M) + +DO_ZPZZ(sve_and_zpzz_b, uint8_t, H1, DO_AND) +DO_ZPZZ(sve_orr_zpzz_b, uint8_t, H1, DO_ORR) +DO_ZPZZ(sve_eor_zpzz_b, uint8_t, H1, DO_EOR) +DO_ZPZZ(sve_bic_zpzz_b, uint8_t, H1, DO_BIC) + +DO_ZPZZ(sve_and_zpzz_h, uint16_t, H1_2, DO_AND) +DO_ZPZZ(sve_orr_zpzz_h, uint16_t, H1_2, DO_ORR) +DO_ZPZZ(sve_eor_zpzz_h, uint16_t, H1_2, DO_EOR) +DO_ZPZZ(sve_bic_zpzz_h, uint16_t, H1_2, DO_BIC) + +DO_ZPZZ(sve_and_zpzz_s, uint32_t, H1_4, DO_AND) +DO_ZPZZ(sve_orr_zpzz_s, uint32_t, H1_4, DO_ORR) +DO_ZPZZ(sve_eor_zpzz_s, uint32_t, H1_4, DO_EOR) +DO_ZPZZ(sve_bic_zpzz_s, uint32_t, H1_4, DO_BIC) + +DO_ZPZZ_D(sve_and_zpzz_d, uint64_t, DO_AND) +DO_ZPZZ_D(sve_orr_zpzz_d, uint64_t, DO_ORR) +DO_ZPZZ_D(sve_eor_zpzz_d, uint64_t, DO_EOR) +DO_ZPZZ_D(sve_bic_zpzz_d, uint64_t, DO_BIC) + +#undef DO_AND +#undef DO_ORR +#undef DO_EOR +#undef DO_BIC + +#define DO_ADD(N, M) (N + M) +#define DO_SUB(N, M) (N - M) + +DO_ZPZZ(sve_add_zpzz_b, uint8_t, H1, DO_ADD) +DO_ZPZZ(sve_sub_zpzz_b, uint8_t, H1, DO_SUB) + +DO_ZPZZ(sve_add_zpzz_h, uint16_t, H1_2, DO_ADD) +DO_ZPZZ(sve_sub_zpzz_h, uint16_t, H1_2, DO_SUB) + +DO_ZPZZ(sve_add_zpzz_s, uint32_t, H1_4, DO_ADD) +DO_ZPZZ(sve_sub_zpzz_s, uint32_t, H1_4, DO_SUB) + +DO_ZPZZ_D(sve_add_zpzz_d, uint64_t, DO_ADD) +DO_ZPZZ_D(sve_sub_zpzz_d, uint64_t, DO_SUB) + +#undef DO_ADD +#undef DO_SUB + +#define DO_MAX(N, M) ((N) >=3D (M) ? (N) : (M)) +#define DO_MIN(N, M) ((N) >=3D (M) ? (M) : (N)) +#define DO_ABD(N, M) ((N) >=3D (M) ? 
(N) - (M) : (M) - (N)) + +DO_ZPZZ(sve_smax_zpzz_b, int8_t, H1, DO_MAX) +DO_ZPZZ(sve_umax_zpzz_b, uint8_t, H1, DO_MAX) +DO_ZPZZ(sve_smin_zpzz_b, int8_t, H1, DO_MIN) +DO_ZPZZ(sve_umin_zpzz_b, uint8_t, H1, DO_MIN) +DO_ZPZZ(sve_sabd_zpzz_b, int8_t, H1, DO_ABD) +DO_ZPZZ(sve_uabd_zpzz_b, uint8_t, H1, DO_ABD) + +DO_ZPZZ(sve_smax_zpzz_h, int16_t, H1_2, DO_MAX) +DO_ZPZZ(sve_umax_zpzz_h, uint16_t, H1_2, DO_MAX) +DO_ZPZZ(sve_smin_zpzz_h, int16_t, H1_2, DO_MIN) +DO_ZPZZ(sve_umin_zpzz_h, uint16_t, H1_2, DO_MIN) +DO_ZPZZ(sve_sabd_zpzz_h, int16_t, H1_2, DO_ABD) +DO_ZPZZ(sve_uabd_zpzz_h, uint16_t, H1_2, DO_ABD) + +DO_ZPZZ(sve_smax_zpzz_s, int32_t, H1_4, DO_MAX) +DO_ZPZZ(sve_umax_zpzz_s, uint32_t, H1_4, DO_MAX) +DO_ZPZZ(sve_smin_zpzz_s, int32_t, H1_4, DO_MIN) +DO_ZPZZ(sve_umin_zpzz_s, uint32_t, H1_4, DO_MIN) +DO_ZPZZ(sve_sabd_zpzz_s, int32_t, H1_4, DO_ABD) +DO_ZPZZ(sve_uabd_zpzz_s, uint32_t, H1_4, DO_ABD) + +DO_ZPZZ_D(sve_smax_zpzz_d, int64_t, DO_MAX) +DO_ZPZZ_D(sve_umax_zpzz_d, uint64_t, DO_MAX) +DO_ZPZZ_D(sve_smin_zpzz_d, int64_t, DO_MIN) +DO_ZPZZ_D(sve_umin_zpzz_d, uint64_t, DO_MIN) +DO_ZPZZ_D(sve_sabd_zpzz_d, int64_t, DO_ABD) +DO_ZPZZ_D(sve_uabd_zpzz_d, uint64_t, DO_ABD) + +#undef DO_MAX +#undef DO_MIN +#undef DO_ABD + +#define DO_MUL(N, M) (N * M) +#define DO_DIV(N, M) (M ? N / M : 0) + +/* Because the computation type is at least twice as large as required, + these work for both signed and unsigned source types. 
*/ +static inline uint8_t do_mulh_b(int32_t n, int32_t m) +{ + return (n * m) >> 8; +} + +static inline uint16_t do_mulh_h(int32_t n, int32_t m) +{ + return (n * m) >> 16; +} + +static inline uint32_t do_mulh_s(int64_t n, int64_t m) +{ + return (n * m) >> 32; +} + +static inline uint64_t do_smulh_d(uint64_t n, uint64_t m) +{ + uint64_t lo, hi; + muls64(&lo, &hi, n, m); + return hi; +} + +static inline uint64_t do_umulh_d(uint64_t n, uint64_t m) +{ + uint64_t lo, hi; + mulu64(&lo, &hi, n, m); + return hi; +} + +DO_ZPZZ(sve_mul_zpzz_b, uint8_t, H1, DO_MUL) +DO_ZPZZ(sve_smulh_zpzz_b, int8_t, H1, do_mulh_b) +DO_ZPZZ(sve_umulh_zpzz_b, uint8_t, H1, do_mulh_b) + +DO_ZPZZ(sve_mul_zpzz_h, uint16_t, H1_2, DO_MUL) +DO_ZPZZ(sve_smulh_zpzz_h, int16_t, H1_2, do_mulh_h) +DO_ZPZZ(sve_umulh_zpzz_h, uint16_t, H1_2, do_mulh_h) + +DO_ZPZZ(sve_mul_zpzz_s, uint32_t, H1_4, DO_MUL) +DO_ZPZZ(sve_smulh_zpzz_s, int32_t, H1_4, do_mulh_s) +DO_ZPZZ(sve_umulh_zpzz_s, uint32_t, H1_4, do_mulh_s) +DO_ZPZZ(sve_sdiv_zpzz_s, int32_t, H1_4, DO_DIV) +DO_ZPZZ(sve_udiv_zpzz_s, uint32_t, H1_4, DO_DIV) + +DO_ZPZZ_D(sve_mul_zpzz_d, uint64_t, DO_MUL) +DO_ZPZZ_D(sve_smulh_zpzz_d, uint64_t, do_smulh_d) +DO_ZPZZ_D(sve_umulh_zpzz_d, uint64_t, do_umulh_d) +DO_ZPZZ_D(sve_sdiv_zpzz_d, int64_t, DO_DIV) +DO_ZPZZ_D(sve_udiv_zpzz_d, uint64_t, DO_DIV) + +#undef DO_MUL +#undef DO_DIV + +#undef DO_ZPZZ +#undef DO_ZPZZ_D + void HELPER(sve_ldr)(CPUARMState *env, void *d, target_ulong addr, uint32_= t len) { intptr_t i, len_align =3D QEMU_ALIGN_DOWN(len, 8); diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 0e988c03aa..d8b34020bb 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -126,6 +126,81 @@ void trans_BIC_zzz(DisasContext *s, arg_BIC_zzz *a, ui= nt32_t insn) do_zzz_genfn(s, a, tcg_gen_gvec_andc); } =20 +static void do_zpzz_ool(DisasContext *s, arg_rprr_esz *a, gen_helper_gvec_= 4 *fn) +{ + unsigned vsz =3D size_for_gvec(vec_full_reg_size(s)); + 
tcg_gen_gvec_4_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + pred_full_reg_offset(s, a->pg), + vsz, vsz, 0, fn); +} + +#define DO_ZPZZ(NAME, name) \ +void trans_##NAME##_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn) \ +{ \ + static gen_helper_gvec_4 * const fns[4] =3D { = \ + gen_helper_sve_##name##_zpzz_b, gen_helper_sve_##name##_zpzz_h, \ + gen_helper_sve_##name##_zpzz_s, gen_helper_sve_##name##_zpzz_d, \ + }; \ + do_zpzz_ool(s, a, fns[a->esz]); \ +} + +DO_ZPZZ(AND, and) +DO_ZPZZ(EOR, eor) +DO_ZPZZ(ORR, orr) +DO_ZPZZ(BIC, bic) + +DO_ZPZZ(ADD, add) +DO_ZPZZ(SUB, sub) + +DO_ZPZZ(SMAX, smax) +DO_ZPZZ(UMAX, umax) +DO_ZPZZ(SMIN, smin) +DO_ZPZZ(UMIN, umin) +DO_ZPZZ(SABD, sabd) +DO_ZPZZ(UABD, uabd) + +DO_ZPZZ(MUL, mul) +DO_ZPZZ(SMULH, smulh) +DO_ZPZZ(UMULH, umulh) + +void trans_SDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +{ + gen_helper_gvec_4 *fn; + switch (a->esz) { + default: + unallocated_encoding(s); + return; + case 2: + fn =3D gen_helper_sve_sdiv_zpzz_s; + break; + case 3: + fn =3D gen_helper_sve_sdiv_zpzz_d; + break; + }; + do_zpzz_ool(s, a, fn); +} + +void trans_UDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +{ + gen_helper_gvec_4 *fn; + switch (a->esz) { + default: + unallocated_encoding(s); + return; + case 2: + fn =3D gen_helper_sve_udiv_zpzz_s; + break; + case 3: + fn =3D gen_helper_sve_udiv_zpzz_d; + break; + }; + do_zpzz_ool(s, a, fn); +} + +#undef DO_ZPZZ + static uint64_t pred_esz_mask[4] =3D { 0xffffffffffffffffull, 0x5555555555555555ull, 0x1111111111111111ull, 0x0101010101010101ull diff --git a/target/arm/sve.def b/target/arm/sve.def index d1172296e0..3bb4faaf89 100644 --- a/target/arm/sve.def +++ b/target/arm/sve.def @@ -24,6 +24,10 @@ =20 %imm9_16_10 16:s6 10:3 =20 +# Either a copy of rd (at bit 0), or a different source +# as propagated via the MOVPRFX instruction. 
+%reg_movprfx 0:5 + ########################################################################### # Named attribute sets. These are used to make nice(er) names # when creating helpers common to those for the individual @@ -44,6 +48,10 @@ # Three prediate operand, with governing predicate, unused vector element = size @pd_pg_pn_pm ........ .... rm:4 .. pg:4 . rn:4 . rd:4 &rprr_esz esz=3D0 =20 +# Two register operand, with governing predicate, vector element size +@rdn_pg_rm_esz ........ esz:2 ... ... ... pg:3 rm:5 rd:5 &rprr_esz rn=3D%= reg_movprfx +@rdm_pg_rn_esz ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rprr_esz rm=3D%= reg_movprfx + # Basic Load/Store with 9-bit immediate offset @pd_rn_i9 ........ ........ ...... rn:5 . rd:4 &rri imm=3D%imm9_16_10 @rd_rn_i9 ........ ........ ...... rn:5 rd:5 &rri imm=3D%imm9_16_10 @@ -51,6 +59,37 @@ ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. =20 +### SVE Integer Arithmetic - Binary Predicated Group + +# SVE bitwise logical vector operations (predicated) +ORR_zpzz 00000100 .. 011 000 000 ... ..... ..... @rdn_pg_rm_esz +EOR_zpzz 00000100 .. 011 001 000 ... ..... ..... @rdn_pg_rm_esz +AND_zpzz 00000100 .. 011 010 000 ... ..... ..... @rdn_pg_rm_esz +BIC_zpzz 00000100 .. 011 011 000 ... ..... ..... @rdn_pg_rm_esz + +# SVE integer add/subtract vectors (predicated) +ADD_zpzz 00000100 .. 000 000 000 ... ..... ..... @rdn_pg_rm_esz +SUB_zpzz 00000100 .. 000 001 000 ... ..... ..... @rdn_pg_rm_esz +SUB_zpzz 00000100 .. 000 011 000 ... ..... ..... @rdm_pg_rn_esz # SUBR + +# SVE integer min/max/difference (predicated) +SMAX_zpzz 00000100 .. 001 000 000 ... ..... ..... @rdn_pg_rm_esz +UMAX_zpzz 00000100 .. 001 001 000 ... ..... ..... @rdn_pg_rm_esz +SMIN_zpzz 00000100 .. 001 010 000 ... ..... ..... @rdn_pg_rm_esz +UMIN_zpzz 00000100 .. 001 011 000 ... ..... ..... @rdn_pg_rm_esz +SABD_zpzz 00000100 .. 001 100 000 ... ..... ..... 
@rdn_pg_rm_esz +UABD_zpzz 00000100 .. 001 101 000 ... ..... ..... @rdn_pg_rm_esz + +# SVE integer multiply/divide (predicated) +MUL_zpzz 00000100 .. 010 000 000 ... ..... ..... @rdn_pg_rm_esz +SMULH_zpzz 00000100 .. 010 010 000 ... ..... ..... @rdn_pg_rm_esz +UMULH_zpzz 00000100 .. 010 011 000 ... ..... ..... @rdn_pg_rm_esz +# Note that divide requires size >=3D 2; below 2 is unallocated. +SDIV_zpzz 00000100 .. 010 100 000 ... ..... ..... @rdn_pg_rm_esz +UDIV_zpzz 00000100 .. 010 101 000 ... ..... ..... @rdn_pg_rm_esz +SDIV_zpzz 00000100 .. 010 110 000 ... ..... ..... @rdm_pg_rn_esz # SDIVR +UDIV_zpzz 00000100 .. 010 111 000 ... ..... ..... @rdm_pg_rn_esz # UDIVR + ### SVE Logical - Unpredicated Group =20 # SVE bitwise logical operations (unpredicated) --=20 2.14.3 From nobody Tue Oct 28 12:15:16 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 1513621493478639.9831360537692; Mon, 18 Dec 2017 10:24:53 -0800 (PST) Received: from localhost ([::1]:36703 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eR05p-00044r-3k for importer@patchew.org; Mon, 18 Dec 2017 13:24:49 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:55643) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eQzUP-00089l-V6 for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:11 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1eQzUO-0001ww-TN for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:09 -0500 
Received: from mail-pf0-x244.google.com ([2607:f8b0:400e:c00::244]:41801) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1eQzUO-0001vm-L9 for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:08 -0500 Received: by mail-pf0-x244.google.com with SMTP id j28so9963315pfk.8 for ; Mon, 18 Dec 2017 09:46:08 -0800 (PST) Received: from cloudburst.twiddle.net (174-21-7-63.tukw.qwest.net. [174.21.7.63]) by smtp.gmail.com with ESMTPSA id t84sm26209657pfe.160.2017.12.18.09.46.05 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Mon, 18 Dec 2017 09:46:06 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=k8F0HjW9dFA0ovtDS/CZjiEOnmVa5Jw57Xx7i6w7EhA=; b=Rua+aG9oZv539Nr+S7cb4QQctCvYIN6UazkxzjzNI1elClLXMdtg5TuzQVZaTphbdY WgLGQdb/8yPG3kkVW6VO4wKYL7gLIMV7yeusMiPMJatsmu5uDukYnttvQ5/ym/c9Drsx d1HhapimgdhwwPN6gmHp05weNJCEC/Mw0SHUg= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=k8F0HjW9dFA0ovtDS/CZjiEOnmVa5Jw57Xx7i6w7EhA=; b=Bgdmu+OxM7wpXWHtuyViEpNKZXxl8NGwBrbAzJZO2XanpvtyJlcnYLLyYuIJ4aifu4 oy0WHoCoejoGcRODe5goJaNDtAkJOGwI909vdmQIj8FFge65EswEoIDd4LmLfuZ96Wfm KEgLzoB7MC5zfYNepmGmStu1Lk7nSJ9flfx0NGU6FLlwwKI8PoUDMevh8ampIP0JfZTN MjjUpRM2DGahXCrWHoK8C5rY4CJOzjNYcHXmEqJDXgqM6iFTkfABMqhG+p9VXKt9f0Jt Ut8Wttbgpj6nhmGO5zo033eQH6B2hqR7by9qRrEMDhibCg29ywQhTlqbLoF8V45XcwUA howg== X-Gm-Message-State: AKGB3mL/t/Yi/1r5vUmqjGkTznLDWyQycGLtbrMAFro1Yv9/k1g9D06b A1EIJ5OMSlSQyXWRKIdvTsrlWw3pV+8= X-Google-Smtp-Source: ACJfBouKufQhxjGdaeibSgpVYPQYI1XOP6Nqlt8wdIa2ncDmHAW1KXONkvd0/6dh6MFUbYK8DGPgMA== X-Received: by 10.99.132.195 with SMTP id k186mr447536pgd.130.1513619167266; Mon, 18 Dec 2017 09:46:07 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Mon, 18 Dec 2017 09:45:37 -0800 Message-Id: 
<20171218174552.18871-9-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20171218174552.18871-1-richard.henderson@linaro.org> References: <20171218174552.18871-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::244 Subject: [Qemu-devel] [PATCH 08/23] target/arm: Handle SVE registers in write_fp_dreg X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" When storing to an AdvSIMD FP register, all of the high bits of the SVE register are zeroed. At the same time, export the function for use in translate-sve.c. 
Signed-off-by: Richard Henderson --- target/arm/translate-a64.h | 1 + target/arm/translate-a64.c | 32 ++++++++++++++++++-------------- 2 files changed, 19 insertions(+), 14 deletions(-) diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h index 9014b5bf8b..07861fa9c6 100644 --- a/target/arm/translate-a64.h +++ b/target/arm/translate-a64.h @@ -35,6 +35,7 @@ TCGv_i64 cpu_reg(DisasContext *s, int reg); TCGv_i64 cpu_reg_sp(DisasContext *s, int reg); TCGv_i64 read_cpu_reg(DisasContext *s, int reg, int sf); TCGv_i64 read_cpu_reg_sp(DisasContext *s, int reg, int sf); +void write_fp_dreg(DisasContext *s, int reg, TCGv_i64 v); =20 /* We should have at some point before trying to access an FP register * done the necessary access check, so assert that diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 8be1660661..b951045820 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -533,13 +533,28 @@ static TCGv_i32 read_fp_sreg(DisasContext *s, int reg) return v; } =20 -static void write_fp_dreg(DisasContext *s, int reg, TCGv_i64 v) +/* Clear the bits above an 64-bit vector. + * If SVE is not enabled, then there are only 128 bits in the vector. 
+ */ +static void clear_vec_high(DisasContext *s, int rd) { + unsigned ofs =3D fp_reg_offset(s, rd, MO_64); + unsigned vsz =3D vec_full_reg_size(s); TCGv_i64 tcg_zero =3D tcg_const_i64(0); =20 - tcg_gen_st_i64(v, cpu_env, fp_reg_offset(s, reg, MO_64)); - tcg_gen_st_i64(tcg_zero, cpu_env, fp_reg_hi_offset(s, reg)); + tcg_gen_st_i64(tcg_zero, cpu_env, ofs + 8); tcg_temp_free_i64(tcg_zero); + if (vsz > 16) { + tcg_gen_gvec_dup8i(ofs + 16, vsz - 16, vsz - 16, 0); + } +} + +void write_fp_dreg(DisasContext *s, int reg, TCGv_i64 v) +{ + unsigned ofs =3D fp_reg_offset(s, reg, MO_64); + + tcg_gen_st_i64(v, cpu_env, ofs); + clear_vec_high(s, reg); } =20 static void write_fp_sreg(DisasContext *s, int reg, TCGv_i32 v) @@ -1015,17 +1030,6 @@ static void write_vec_element_i32(DisasContext *s, T= CGv_i32 tcg_src, } } =20 -/* Clear the high 64 bits of a 128 bit vector (in general non-quad - * vector ops all need to do this). - */ -static void clear_vec_high(DisasContext *s, int rd) -{ - TCGv_i64 tcg_zero =3D tcg_const_i64(0); - - write_vec_element(s, tcg_zero, rd, 1, MO_64); - tcg_temp_free_i64(tcg_zero); -} - /* Store from vector register to memory */ static void do_vec_st(DisasContext *s, int srcidx, int element, TCGv_i64 tcg_addr, int size) --=20 2.14.3 From nobody Tue Oct 28 12:15:16 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 1513622916949577.0953740536079; Mon, 18 Dec 2017 10:48:36 -0800 (PST) Received: from localhost ([::1]:53191 helo=lists.gnu.org) by lists.gnu.org 
with esmtp (Exim 4.71) (envelope-from ) id 1eR0Sm-0005b4-LM for importer@patchew.org; Mon, 18 Dec 2017 13:48:32 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:55704) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eQzUS-0008D1-Qq for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:14 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1eQzUQ-00020t-HE for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:12 -0500 Received: from mail-pl0-x241.google.com ([2607:f8b0:400e:c01::241]:34835) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1eQzUQ-0001zG-9N for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:10 -0500 Received: by mail-pl0-x241.google.com with SMTP id b96so5246524pli.2 for ; Mon, 18 Dec 2017 09:46:10 -0800 (PST) Received: from cloudburst.twiddle.net (174-21-7-63.tukw.qwest.net. [174.21.7.63]) by smtp.gmail.com with ESMTPSA id t84sm26209657pfe.160.2017.12.18.09.46.07 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Mon, 18 Dec 2017 09:46:07 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=B9w/xv6dl+dsoGKHoDlcAQW8Ls7nRo+YPhDgGizIB2U=; b=AS/hvW+wQGZ7T1x5egFaHVGg31uyBicf4pApqN41rVHsepYAfXnh4nYbtxHMODa7gA TN8XUu8Yptuc57Gk6tQmdzTMuy3AZkUqU1ZgYMvuqQqBvxJb8G7UGKDICOZHW2v1U6nr SO71ddve34BnGkyZr+NcsApHfN5ZpZZM+TZ+Y= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=B9w/xv6dl+dsoGKHoDlcAQW8Ls7nRo+YPhDgGizIB2U=; b=uY29g+QqgBuG1tMbqIPPyIGK8BZHutS/4BDu0fPxqCIbH0STl8kEsOMmo/U2Z296CA tK6/rU++0SGgnLrH0nfCAnuA1mofXQgEYehjhltklFcGusZJy/zI5GfDtu5qczXnQipJ qXYBVS69vFiMZ/JGXcj5vqdVoGF4EtbbI9JtHAhfkEANmp/tbq1wp/xcmxl6w1B8ITNs qnkWf6to6IRYQHFw3u2+akQ5whIolct3DmImq6QSubusdN/k/EObc19dFjlQQTe4opp1 
2OGq+xzxBSSOtTGqJcqbBj9SrhvIFyUsnVQL3fcpyt+hFn5sn7IqPRaBhr4gDSZga/gy SwfA== X-Gm-Message-State: AKGB3mKohPcPeUeD3ZdtYE1Q0l67N0QYGat6OJtfEDmWIeebW0kfrp4/ 3O4+RXGEjau8DpSSDUBoCiJwZJLLHhA= X-Google-Smtp-Source: ACJfBouq3XDOcFu32b6UlHxSFS0bxlUmrDJgnaSQo4afCumj5bAwLmprj3Krzo28qDgpm5KDj7qv0Q== X-Received: by 10.84.209.136 with SMTP id y8mr442838plh.439.1513619168832; Mon, 18 Dec 2017 09:46:08 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Mon, 18 Dec 2017 09:45:38 -0800 Message-Id: <20171218174552.18871-10-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20171218174552.18871-1-richard.henderson@linaro.org> References: <20171218174552.18871-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::241 Subject: [Qemu-devel] [PATCH 09/23] target/arm: Handle SVE registers when using clear_vec_high X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" When storing to an AdvSIMD FP register, all of the high bits of the SVE register are zeroed. Therefore, call clear_vec_high more often, passing is_q as a parameter, so that quad operations also zero the high bits of the SVE register.
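As a rough illustration of the intended semantics only (not the TCG code in this patch), the zeroing can be modelled on a plain byte array. The name model_clear_vec_high and the byte-array view of a register are assumptions made for this sketch; the real function emits TCG ops against cpu_env.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model, not QEMU code: one vector register viewed as a
 * byte array of vsz bytes (16 without SVE, larger when SVE is enabled).
 * Zero everything above the N-bit result, for N = (is_q ? 128 : 64).
 */
static void model_clear_vec_high(uint8_t *reg, size_t vsz, bool is_q)
{
    size_t keep = is_q ? 16 : 8;    /* bytes of the result that stay live */

    assert(vsz >= 16);              /* a register is never below 128 bits */
    memset(reg + keep, 0, vsz - keep);
}
```

With a 32-byte register, is_q = false leaves bytes 0..7 untouched and zeroes bytes 8..31; is_q = true leaves bytes 0..15 untouched and zeroes bytes 16..31.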
Signed-off-by: Richard Henderson --- target/arm/translate-a64.c | 157 +++++++++++++++--------------------------= ---- 1 file changed, 51 insertions(+), 106 deletions(-) diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index b951045820..9e15a4b1ae 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -533,17 +533,19 @@ static TCGv_i32 read_fp_sreg(DisasContext *s, int reg) return v; } =20 -/* Clear the bits above an 64-bit vector. +/* Clear the bits above an N-bit vector, for N =3D (is_q ? 128 : 64). * If SVE is not enabled, then there are only 128 bits in the vector. */ -static void clear_vec_high(DisasContext *s, int rd) +static void clear_vec_high(DisasContext *s, bool is_q, int rd) { unsigned ofs =3D fp_reg_offset(s, rd, MO_64); unsigned vsz =3D vec_full_reg_size(s); - TCGv_i64 tcg_zero =3D tcg_const_i64(0); =20 - tcg_gen_st_i64(tcg_zero, cpu_env, ofs + 8); - tcg_temp_free_i64(tcg_zero); + if (!is_q) { + TCGv_i64 tcg_zero =3D tcg_const_i64(0); + tcg_gen_st_i64(tcg_zero, cpu_env, ofs + 8); + tcg_temp_free_i64(tcg_zero); + } if (vsz > 16) { tcg_gen_gvec_dup8i(ofs + 16, vsz - 16, vsz - 16, 0); } @@ -554,7 +556,7 @@ void write_fp_dreg(DisasContext *s, int reg, TCGv_i64 v) unsigned ofs =3D fp_reg_offset(s, reg, MO_64); =20 tcg_gen_st_i64(v, cpu_env, ofs); - clear_vec_high(s, reg); + clear_vec_high(s, false, reg); } =20 static void write_fp_sreg(DisasContext *s, int reg, TCGv_i32 v) @@ -915,6 +917,8 @@ static void do_fp_ld(DisasContext *s, int destidx, TCGv= _i64 tcg_addr, int size) =20 tcg_temp_free_i64(tmplo); tcg_temp_free_i64(tmphi); + + clear_vec_high(s, true, destidx); } =20 /* @@ -2670,12 +2674,13 @@ static void disas_ldst_multiple_struct(DisasContext= *s, uint32_t insn) /* For non-quad operations, setting a slice of the low * 64 bits of the register clears the high 64 bits (in * the ARM ARM pseudocode this is implicit in the fact - * that 'rval' is a 64 bit wide variable). 
We optimize - * by noticing that we only need to do this the first - * time we touch a register. + * that 'rval' is a 64 bit wide variable). + * For quad operations, we might still need to zero the + * high bits of SVE. We optimize by noticing that we = only + * need to do this the first time we touch a register. */ - if (!is_q && e =3D=3D 0 && (r =3D=3D 0 || xs =3D=3D se= lem - 1)) { - clear_vec_high(s, tt); + if (e =3D=3D 0 && (r =3D=3D 0 || xs =3D=3D selem - 1))= { + clear_vec_high(s, is_q, tt); } } tcg_gen_addi_i64(tcg_addr, tcg_addr, ebytes); @@ -2818,10 +2823,9 @@ static void disas_ldst_single_struct(DisasContext *s= , uint32_t insn) write_vec_element(s, tcg_tmp, rt, 0, MO_64); if (is_q) { write_vec_element(s, tcg_tmp, rt, 1, MO_64); - } else { - clear_vec_high(s, rt); } tcg_temp_free_i64(tcg_tmp); + clear_vec_high(s, is_q, rt); } else { /* Load/store one element per register */ if (is_load) { @@ -6659,7 +6663,6 @@ static void handle_vec_simd_sqshrn(DisasContext *s, b= ool is_scalar, bool is_q, } =20 if (!is_q) { - clear_vec_high(s, rd); write_vec_element(s, tcg_final, rd, 0, MO_64); } else { write_vec_element(s, tcg_final, rd, 1, MO_64); @@ -6672,7 +6675,8 @@ static void handle_vec_simd_sqshrn(DisasContext *s, b= ool is_scalar, bool is_q, tcg_temp_free_i64(tcg_rd); tcg_temp_free_i32(tcg_rd_narrowed); tcg_temp_free_i64(tcg_final); - return; + + clear_vec_high(s, is_q, rd); } =20 /* SQSHLU, UQSHL, SQSHL: saturating left shifts */ @@ -6736,10 +6740,7 @@ static void handle_simd_qshl(DisasContext *s, bool s= calar, bool is_q, tcg_temp_free_i64(tcg_op); } tcg_temp_free_i64(tcg_shift); - - if (!is_q) { - clear_vec_high(s, rd); - } + clear_vec_high(s, is_q, rd); } else { TCGv_i32 tcg_shift =3D tcg_const_i32(shift); static NeonGenTwoOpEnvFn * const fns[2][2][3] =3D { @@ -6788,8 +6789,8 @@ static void handle_simd_qshl(DisasContext *s, bool sc= alar, bool is_q, } tcg_temp_free_i32(tcg_shift); =20 - if (!is_q && !scalar) { - clear_vec_high(s, rd); + if (!scalar) { + 
clear_vec_high(s, is_q, rd); } } } @@ -6831,10 +6832,8 @@ static void handle_simd_intfp_conv(DisasContext *s, = int rd, int rn, write_vec_element(s, tcg_double, rd, pass, MO_64); } } - tcg_temp_free_i64(tcg_int64); tcg_temp_free_i64(tcg_double); - } else { TCGv_i32 tcg_int32 =3D tcg_temp_new_i32(); TCGv_i32 tcg_float =3D tcg_temp_new_i32(); @@ -6887,20 +6886,17 @@ static void handle_simd_intfp_conv(DisasContext *s,= int rd, int rn, write_vec_element_i32(s, tcg_float, rd, pass, size); } } - tcg_temp_free_i32(tcg_int32); tcg_temp_free_i32(tcg_float); - - if ((size =3D=3D MO_32 && elements =3D=3D 2) || - (size =3D=3D MO_16 && elements =3D=3D 4)) { - clear_vec_high(s, rd); - } } =20 tcg_temp_free_ptr(tcg_fpst); if (fracbits || size =3D=3D MO_64) { tcg_temp_free_i32(tcg_shift); } + if (elements > 1) { + clear_vec_high(s, (elements << size) > 8, rd); + } } =20 /* UCVTF/SCVTF - Integer to FP conversion */ @@ -6988,9 +6984,7 @@ static void handle_simd_shift_fpint_conv(DisasContext= *s, bool is_scalar, write_vec_element(s, tcg_op, rd, pass, MO_64); tcg_temp_free_i64(tcg_op); } - if (!is_q) { - clear_vec_high(s, rd); - } + clear_vec_high(s, is_q, rd); } else { int maxpass =3D is_scalar ? 1 : is_q ? 4 : 2; for (pass =3D 0; pass < maxpass; pass++) { @@ -7009,8 +7003,8 @@ static void handle_simd_shift_fpint_conv(DisasContext= *s, bool is_scalar, } tcg_temp_free_i32(tcg_op); } - if (!is_q && !is_scalar) { - clear_vec_high(s, rd); + if (!is_scalar) { + clear_vec_high(s, is_q, rd); } } =20 @@ -7491,13 +7485,9 @@ static void handle_3same_float(DisasContext *s, int = size, int elements, tcg_temp_free_i32(tcg_op2); } } - tcg_temp_free_ptr(fpst); =20 - if ((elements << size) < 4) { - /* scalar, or non-quad vector op */ - clear_vec_high(s, rd); - } + clear_vec_high(s, elements * (size ? 
8 : 4) > 8, rd); } =20 /* AdvSIMD scalar three same @@ -8005,13 +7995,10 @@ static void handle_2misc_fcmp_zero(DisasContext *s,= int opcode, } write_vec_element(s, tcg_res, rd, pass, MO_64); } - if (is_scalar) { - clear_vec_high(s, rd); - } - tcg_temp_free_i64(tcg_res); tcg_temp_free_i64(tcg_zero); tcg_temp_free_i64(tcg_op); + clear_vec_high(s, !is_scalar, rd); } else { TCGv_i32 tcg_op =3D tcg_temp_new_i32(); TCGv_i32 tcg_zero =3D tcg_const_i32(0); @@ -8063,8 +8050,8 @@ static void handle_2misc_fcmp_zero(DisasContext *s, i= nt opcode, tcg_temp_free_i32(tcg_res); tcg_temp_free_i32(tcg_zero); tcg_temp_free_i32(tcg_op); - if (!is_q && !is_scalar) { - clear_vec_high(s, rd); + if (!is_scalar) { + clear_vec_high(s, is_q, rd); } } =20 @@ -8100,12 +8087,9 @@ static void handle_2misc_reciprocal(DisasContext *s,= int opcode, } write_vec_element(s, tcg_res, rd, pass, MO_64); } - if (is_scalar) { - clear_vec_high(s, rd); - } - tcg_temp_free_i64(tcg_res); tcg_temp_free_i64(tcg_op); + clear_vec_high(s, !is_scalar, rd); } else { TCGv_i32 tcg_op =3D tcg_temp_new_i32(); TCGv_i32 tcg_res =3D tcg_temp_new_i32(); @@ -8145,8 +8129,8 @@ static void handle_2misc_reciprocal(DisasContext *s, = int opcode, } tcg_temp_free_i32(tcg_res); tcg_temp_free_i32(tcg_op); - if (!is_q && !is_scalar) { - clear_vec_high(s, rd); + if (!is_scalar) { + clear_vec_high(s, is_q, rd); } } tcg_temp_free_ptr(fpst); @@ -8259,9 +8243,7 @@ static void handle_2misc_narrow(DisasContext *s, bool= scalar, write_vec_element_i32(s, tcg_res[pass], rd, destelt + pass, MO_32); tcg_temp_free_i32(tcg_res[pass]); } - if (!is_q) { - clear_vec_high(s, rd); - } + clear_vec_high(s, is_q, rd); } =20 /* Remaining saturating accumulating ops */ @@ -8286,12 +8268,9 @@ static void handle_2misc_satacc(DisasContext *s, boo= l is_scalar, bool is_u, } write_vec_element(s, tcg_rd, rd, pass, MO_64); } - if (is_scalar) { - clear_vec_high(s, rd); - } - tcg_temp_free_i64(tcg_rd); tcg_temp_free_i64(tcg_rn); + clear_vec_high(s, !is_scalar, rd); } 
else { TCGv_i32 tcg_rn =3D tcg_temp_new_i32(); TCGv_i32 tcg_rd =3D tcg_temp_new_i32(); @@ -8349,13 +8328,9 @@ static void handle_2misc_satacc(DisasContext *s, boo= l is_scalar, bool is_u, } write_vec_element_i32(s, tcg_rd, rd, pass, MO_32); } - - if (!is_q) { - clear_vec_high(s, rd); - } - tcg_temp_free_i32(tcg_rd); tcg_temp_free_i32(tcg_rn); + clear_vec_high(s, is_q, rd); } } =20 @@ -8855,9 +8830,7 @@ static void handle_vec_simd_shri(DisasContext *s, boo= l is_q, bool is_u, tcg_temp_free_i64(tcg_round); =20 done: - if (!is_q) { - clear_vec_high(s, rd); - } + clear_vec_high(s, is_q, rd); } =20 static void gen_shl8_ins_i64(TCGv_i64 d, TCGv_i64 a, unsigned shift) @@ -9045,19 +9018,18 @@ static void handle_vec_simd_shrn(DisasContext *s, b= ool is_q, } =20 if (!is_q) { - clear_vec_high(s, rd); write_vec_element(s, tcg_final, rd, 0, MO_64); } else { write_vec_element(s, tcg_final, rd, 1, MO_64); } - if (round) { tcg_temp_free_i64(tcg_round); } tcg_temp_free_i64(tcg_rn); tcg_temp_free_i64(tcg_rd); tcg_temp_free_i64(tcg_final); - return; + + clear_vec_high(s, is_q, rd); } =20 =20 @@ -9451,9 +9423,7 @@ static void handle_3rd_narrowing(DisasContext *s, int= is_q, int is_u, int size, write_vec_element_i32(s, tcg_res[pass], rd, pass + part, MO_32); tcg_temp_free_i32(tcg_res[pass]); } - if (!is_q) { - clear_vec_high(s, rd); - } + clear_vec_high(s, is_q, rd); } =20 static void handle_pmull_64(DisasContext *s, int is_q, int rd, int rn, int= rm) @@ -9877,9 +9847,7 @@ static void handle_simd_3same_pair(DisasContext *s, i= nt is_q, int u, int opcode, write_vec_element_i32(s, tcg_res[pass], rd, pass, MO_32); tcg_temp_free_i32(tcg_res[pass]); } - if (!is_q) { - clear_vec_high(s, rd); - } + clear_vec_high(s, is_q, rd); } =20 if (fpst) { @@ -10372,10 +10340,7 @@ static void disas_simd_3same_int(DisasContext *s, = uint32_t insn) tcg_temp_free_i32(tcg_op2); } } - - if (!is_q) { - clear_vec_high(s, rd); - } + clear_vec_high(s, is_q, rd); } =20 /* AdvSIMD three same @@ -10611,10 +10576,7 
@@ static void disas_simd_three_reg_same_fp16(DisasCo= ntext *s, uint32_t insn) =20 tcg_temp_free_ptr(fpst); =20 - if (!is_q) { - /* non-quad vector op */ - clear_vec_high(s, rd); - } + clear_vec_high(s, is_q, rd); } =20 /* AdvSIMD three same extra @@ -10846,9 +10808,7 @@ static void handle_rev(DisasContext *s, int opcode,= bool u, write_vec_element(s, tcg_tmp, rd, i, grp_size); tcg_temp_free_i64(tcg_tmp); } - if (!is_q) { - clear_vec_high(s, rd); - } + clear_vec_high(s, is_q, rd); } else { int revmask =3D (1 << grp_size) - 1; int esize =3D 8 << size; @@ -11499,9 +11459,7 @@ static void disas_simd_two_reg_misc(DisasContext *s= , uint32_t insn) tcg_temp_free_i32(tcg_op); } } - if (!is_q) { - clear_vec_high(s, rd); - } + clear_vec_high(s, is_q, rd); =20 if (need_rmode) { gen_helper_set_rmode(tcg_rmode, tcg_rmode, cpu_env); @@ -11778,9 +11736,7 @@ static void disas_simd_two_reg_misc_fp16(DisasConte= xt *s, uint32_t insn) tcg_temp_free_i32(tcg_op); } =20 - if (!is_q) { - clear_vec_high(s, rd); - } + clear_vec_high(s, is_q, rd); } =20 if (need_rmode) { @@ -12029,12 +11985,8 @@ static void disas_simd_indexed(DisasContext *s, ui= nt32_t insn) tcg_temp_free_i64(tcg_op); tcg_temp_free_i64(tcg_res); } - - if (is_scalar) { - clear_vec_high(s, rd); - } - tcg_temp_free_i64(tcg_idx); + clear_vec_high(s, !is_scalar, rd); } else if (!is_long) { /* 32 bit floating point, or 16 or 32 bit integer. 
* For the 16 bit scalar case we use the usual Neon helpers and @@ -12198,12 +12150,8 @@ static void disas_simd_indexed(DisasContext *s, ui= nt32_t insn) tcg_temp_free_i32(tcg_op); tcg_temp_free_i32(tcg_res); } - tcg_temp_free_i32(tcg_idx); - - if (!is_q) { - clear_vec_high(s, rd); - } + clear_vec_high(s, is_q, rd); } else { /* long ops: 16x16->32 or 32x32->64 */ TCGv_i64 tcg_res[2]; @@ -12279,10 +12227,7 @@ static void disas_simd_indexed(DisasContext *s, ui= nt32_t insn) tcg_temp_free_i64(tcg_passres); } tcg_temp_free_i64(tcg_idx); - - if (is_scalar) { - clear_vec_high(s, rd); - } + clear_vec_high(s, !is_scalar, rd); } else { TCGv_i32 tcg_idx =3D tcg_temp_new_i32(); =20 --=20 2.14.3 From nobody Tue Oct 28 12:15:16 2025 Delivered-To: importer@patchew.org Received-SPF: temperror (zoho.com: Error in retrieving data from DNS) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=temperror (zoho.com: Error in retrieving data from DNS) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1513622820514949.6410052410761; Mon, 18 Dec 2017 10:47:00 -0800 (PST) Received: from localhost ([::1]:52837 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eR0R5-0003SR-CI for importer@patchew.org; Mon, 18 Dec 2017 13:46:47 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:55737) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eQzUU-0008Ev-QF for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:17 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1eQzUS-00023z-5i for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:14 -0500 Received: from mail-pl0-x243.google.com ([2607:f8b0:400e:c01::243]:44080) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 
4.71) (envelope-from ) id 1eQzUR-000232-Uy for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:12 -0500 Received: by mail-pl0-x243.google.com with SMTP id n13so5240946plp.11 for ; Mon, 18 Dec 2017 09:46:11 -0800 (PST) Received: from cloudburst.twiddle.net (174-21-7-63.tukw.qwest.net. [174.21.7.63]) by smtp.gmail.com with ESMTPSA id t84sm26209657pfe.160.2017.12.18.09.46.09 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Mon, 18 Dec 2017 09:46:09 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=SbAQzKsUC2EG/9+FPyO+p9UvPXJQnnAVP1KI6UDjMAw=; b=PqqxPFZc5+IOkEqbRmiACcz0tFDE+mV3NH6pXTYQzWAeeEO/x2NTt3/5VtER0S0uNQ uvi/sePFJyVUGOiNP6pIKWnk8YPFLHWZokHPatsSd11sWHAfggP3lxq3cVTwS5SeGBLZ SfKLmrEOFIwb2iJKVG1I8nAwW0ikzaQKeakdI= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=SbAQzKsUC2EG/9+FPyO+p9UvPXJQnnAVP1KI6UDjMAw=; b=T2fwUZlyKHIJQ6K7IwmdSbl2ChB0olI7KoIuB3LfwSrNcXUYZNpsR7d2t8GJEobINe jlfLDUSxvQoZze2arnTPLYxOfjENQTULf61GpE3ihAZohPEl0Dzx0OOMafeUO6ANQYKm NnX4HDvAJ7L9CpAl1q7fkPfetnizm1AQIZIa7RM/rNc2UvIo0JEpvyhV4WRF9gUWI6Yh +7A6TAN7SIdI8yJBGWP/8zcNrgXyR5aiVlMyzXPCpV951EcPRxvlpEVIwHLXAjYL9uPM IydBkMAbo9aJ+4m09X6m6aiT6eNo3rJtJMYDSp2RR5bETcSv3sfjkyFgPiueLY28Q5gp yZOQ== X-Gm-Message-State: AKGB3mIhxQG7F/fYE4u3je3zNKT5zC7OKdl9/Ps5Bwx+yhTyaRgH503q h0pdcRwcJ1kSsb+B1NJP3Ba+hZ1dxOY= X-Google-Smtp-Source: ACJfBov+wz6n3+ABqhWKDJmfxDFIJjTTAV2I7KefwYGrk/YEfGv5JJoiPKlf7I1qkc9xWABfGXZ8Zw== X-Received: by 10.84.129.7 with SMTP id 7mr511269plb.104.1513619170514; Mon, 18 Dec 2017 09:46:10 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Mon, 18 Dec 2017 09:45:39 -0800 Message-Id: <20171218174552.18871-11-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20171218174552.18871-1-richard.henderson@linaro.org> 
References: <20171218174552.18871-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::243 Subject: [Qemu-devel] [PATCH 10/23] target/arm: Implement SVE Integer Reduction Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_6 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Excepting MOVPRFX, which isn't a reduction. Presumably it is placed within the group because of its encoding. Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 44 +++++++++++++++++ target/arm/sve_helper.c | 116 +++++++++++++++++++++++++++++++++++++++--= ---- target/arm/translate-sve.c | 64 +++++++++++++++++++++++++ target/arm/sve.def | 22 +++++++++ 4 files changed, 231 insertions(+), 15 deletions(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 964b15b104..937598d6f8 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -162,6 +162,50 @@ DEF_HELPER_FLAGS_5(sve_udiv_zpzz_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_udiv_zpzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) =20 +DEF_HELPER_FLAGS_3(sve_orv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_orv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_orv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_orv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_eorv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_eorv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_eorv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) 
+DEF_HELPER_FLAGS_3(sve_eorv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_andv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_andv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_andv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_andv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_saddv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_saddv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_saddv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_uaddv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_uaddv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_uaddv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_uaddv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_smaxv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_smaxv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_smaxv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_smaxv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_umaxv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_umaxv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_umaxv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_umaxv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_sminv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_sminv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_sminv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_sminv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_uminv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_uminv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_uminv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_uminv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) + 
DEF_HELPER_FLAGS_5(sve_and_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_bic_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_eor_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index b617ea2c04..fca17440e7 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -196,11 +196,6 @@ DO_ZPZZ_D(sve_orr_zpzz_d, uint64_t, DO_ORR) DO_ZPZZ_D(sve_eor_zpzz_d, uint64_t, DO_EOR) DO_ZPZZ_D(sve_bic_zpzz_d, uint64_t, DO_BIC) =20 -#undef DO_AND -#undef DO_ORR -#undef DO_EOR -#undef DO_BIC - #define DO_ADD(N, M) (N + M) #define DO_SUB(N, M) (N - M) =20 @@ -216,9 +211,6 @@ DO_ZPZZ(sve_sub_zpzz_s, uint32_t, H1_4, DO_SUB) DO_ZPZZ_D(sve_add_zpzz_d, uint64_t, DO_ADD) DO_ZPZZ_D(sve_sub_zpzz_d, uint64_t, DO_SUB) =20 -#undef DO_ADD -#undef DO_SUB - #define DO_MAX(N, M) ((N) >=3D (M) ? (N) : (M)) #define DO_MIN(N, M) ((N) >=3D (M) ? (M) : (N)) #define DO_ABD(N, M) ((N) >=3D (M) ? (N) - (M) : (M) - (N)) @@ -251,10 +243,6 @@ DO_ZPZZ_D(sve_umin_zpzz_d, uint64_t, DO_MIN) DO_ZPZZ_D(sve_sabd_zpzz_d, int64_t, DO_ABD) DO_ZPZZ_D(sve_uabd_zpzz_d, uint64_t, DO_ABD) =20 -#undef DO_MAX -#undef DO_MIN -#undef DO_ABD - #define DO_MUL(N, M) (N * M) #define DO_DIV(N, M) (M ? N / M : 0) =20 @@ -309,12 +297,110 @@ DO_ZPZZ_D(sve_umulh_zpzz_d, uint64_t, do_umulh_d) DO_ZPZZ_D(sve_sdiv_zpzz_d, int64_t, DO_DIV) DO_ZPZZ_D(sve_udiv_zpzz_d, uint64_t, DO_DIV) =20 -#undef DO_MUL -#undef DO_DIV - #undef DO_ZPZZ #undef DO_ZPZZ_D =20 +/* Two-operand reduction expander, controlled by a predicate. + * The difference between TYPERED and TYPERET has to do with + * sign-extension. E.g. for SMAX, TYPERED must be signed, + * but TYPERET must be unsigned so that e.g. a 32-bit value + * is not sign-extended to the ABI uint64_t return type. + */ +/* ??? If we were to vectorize this by hand the reduction ordering + * would change. For integer operands, this is perfectly fine. 
+ */ +#define DO_VPZ(NAME, TYPEELT, TYPERED, TYPERET, H, INIT, OP) \ +uint64_t HELPER(NAME)(void *vn, void *vg, uint32_t desc) \ +{ \ + intptr_t iv, ib, opr_sz =3D simd_oprsz(desc); \ + TYPERED ret =3D INIT; \ + for (iv =3D ib =3D 0; iv < opr_sz; iv +=3D 16, ib +=3D 2) { \ + uint16_t pg =3D *(uint16_t *)(vg + H2(ib)); \ + intptr_t i =3D 0; \ + do { \ + TYPEELT nn =3D *(TYPEELT *)(vn + iv + H(i)); \ + ret =3D OP(ret, nn); \ + i +=3D sizeof(TYPEELT), pg >>=3D sizeof(TYPEELT); \ + } while (pg); \ + } \ + return (TYPERET)ret; \ +} + +#define DO_VPZ_D(NAME, TYPEE, TYPER, INIT, OP) \ +uint64_t HELPER(NAME)(void *vn, void *vg, uint32_t desc) \ +{ \ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; \ + TYPEE *n =3D vn; \ + uint8_t *pg =3D vg; \ + TYPER ret =3D INIT; \ + for (i =3D 0; i < opr_sz; i +=3D 1) { \ + if (pg[H1(i)] & 1) { \ + TYPEE nn =3D n[i]; \ + ret =3D OP(ret, nn); \ + } \ + } \ + return ret; \ +} + +DO_VPZ(sve_orv_b, uint8_t, uint8_t, uint8_t, H1, 0, DO_ORR) +DO_VPZ(sve_orv_h, uint16_t, uint16_t, uint16_t, H1_2, 0, DO_ORR) +DO_VPZ(sve_orv_s, uint32_t, uint32_t, uint32_t, H1_4, 0, DO_ORR) +DO_VPZ_D(sve_orv_d, uint64_t, uint64_t, 0, DO_ORR) + +DO_VPZ(sve_eorv_b, uint8_t, uint8_t, uint8_t, H1, 0, DO_EOR) +DO_VPZ(sve_eorv_h, uint16_t, uint16_t, uint16_t, H1_2, 0, DO_EOR) +DO_VPZ(sve_eorv_s, uint32_t, uint32_t, uint32_t, H1_4, 0, DO_EOR) +DO_VPZ_D(sve_eorv_d, uint64_t, uint64_t, 0, DO_EOR) + +DO_VPZ(sve_andv_b, uint8_t, uint8_t, uint8_t, H1, -1, DO_AND) +DO_VPZ(sve_andv_h, uint16_t, uint16_t, uint16_t, H1_2, -1, DO_AND) +DO_VPZ(sve_andv_s, uint32_t, uint32_t, uint32_t, H1_4, -1, DO_AND) +DO_VPZ_D(sve_andv_d, uint64_t, uint64_t, -1, DO_AND) + +DO_VPZ(sve_saddv_b, int8_t, uint64_t, uint64_t, H1, 0, DO_ADD) +DO_VPZ(sve_saddv_h, int16_t, uint64_t, uint64_t, H1_2, 0, DO_ADD) +DO_VPZ(sve_saddv_s, int32_t, uint64_t, uint64_t, H1_4, 0, DO_ADD) + +DO_VPZ(sve_uaddv_b, uint8_t, uint64_t, uint64_t, H1, 0, DO_ADD) +DO_VPZ(sve_uaddv_h, uint16_t, uint64_t, uint64_t, H1_2, 0, 
DO_ADD) +DO_VPZ(sve_uaddv_s, uint32_t, uint64_t, uint64_t, H1_4, 0, DO_ADD) +DO_VPZ_D(sve_uaddv_d, uint64_t, uint64_t, 0, DO_ADD) + +DO_VPZ(sve_smaxv_b, int8_t, int8_t, uint8_t, H1, INT8_MIN, DO_MAX) +DO_VPZ(sve_smaxv_h, int16_t, int16_t, uint16_t, H1_2, INT16_MIN, DO_MAX) +DO_VPZ(sve_smaxv_s, int32_t, int32_t, uint32_t, H1_4, INT32_MIN, DO_MAX) +DO_VPZ_D(sve_smaxv_d, int64_t, int64_t, INT64_MIN, DO_MAX) + +DO_VPZ(sve_umaxv_b, uint8_t, uint8_t, uint8_t, H1, 0, DO_MAX) +DO_VPZ(sve_umaxv_h, uint16_t, uint16_t, uint16_t, H1_2, 0, DO_MAX) +DO_VPZ(sve_umaxv_s, uint32_t, uint32_t, uint32_t, H1_4, 0, DO_MAX) +DO_VPZ_D(sve_umaxv_d, uint64_t, uint64_t, 0, DO_MAX) + +DO_VPZ(sve_sminv_b, int8_t, int8_t, uint8_t, H1, INT8_MAX, DO_MIN) +DO_VPZ(sve_sminv_h, int16_t, int16_t, uint16_t, H1_2, INT16_MAX, DO_MIN) +DO_VPZ(sve_sminv_s, int32_t, int32_t, uint32_t, H1_4, INT32_MAX, DO_MIN) +DO_VPZ_D(sve_sminv_d, int64_t, int64_t, INT64_MAX, DO_MIN) + +DO_VPZ(sve_uminv_b, uint8_t, uint8_t, uint8_t, H1, -1, DO_MIN) +DO_VPZ(sve_uminv_h, uint16_t, uint16_t, uint16_t, H1_2, -1, DO_MIN) +DO_VPZ(sve_uminv_s, uint32_t, uint32_t, uint32_t, H1_4, -1, DO_MIN) +DO_VPZ_D(sve_uminv_d, uint64_t, uint64_t, -1, DO_MIN) + +#undef DO_VPZ +#undef DO_VPZ_D + +#undef DO_AND +#undef DO_ORR +#undef DO_EOR +#undef DO_BIC +#undef DO_ADD +#undef DO_SUB +#undef DO_MAX +#undef DO_MIN +#undef DO_ABD +#undef DO_MUL +#undef DO_DIV + void HELPER(sve_ldr)(CPUARMState *env, void *d, target_ulong addr, uint32_= t len) { intptr_t i, len_align =3D QEMU_ALIGN_DOWN(len, 8); diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index d8b34020bb..4abc66ba5f 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -22,6 +22,7 @@ #include "exec/exec-all.h" #include "tcg-op.h" #include "tcg-op-gvec.h" +#include "tcg-gvec-desc.h" #include "qemu/log.h" #include "arm_ldst.h" #include "translate.h" @@ -201,6 +202,69 @@ void trans_UDIV_zpzz(DisasContext *s, arg_rprr_esz *a,= uint32_t insn) =20 #undef 
DO_ZPZZ =20 +typedef void gen_helper_gvec_reduc(TCGv_i64, TCGv_ptr, TCGv_ptr, TCGv_i32); +static void do_vpz_ool(DisasContext *s, arg_rpr_esz *a, + gen_helper_gvec_reduc *fn) +{ + unsigned vsz =3D size_for_gvec(vec_full_reg_size(s)); + TCGv_i32 desc =3D tcg_const_i32(simd_desc(vsz, vsz, 0)); + TCGv_i64 temp =3D tcg_temp_new_i64(); + TCGv_ptr t_zn =3D tcg_temp_new_ptr(); + TCGv_ptr t_pg =3D tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(t_zn, cpu_env, vec_full_reg_offset(s, a->rn)); + tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, a->pg)); + fn(temp, t_zn, t_pg, desc); + tcg_temp_free_ptr(t_zn); + tcg_temp_free_ptr(t_pg); + tcg_temp_free_i32(desc); + + write_fp_dreg(s, a->rd, temp); + tcg_temp_free_i64(temp); +} + +#define DO_VPZ(NAME, name) \ +void trans_##NAME(DisasContext *s, arg_rpr_esz *a, uint32_t insn) \ +{ \ + static gen_helper_gvec_reduc * const fns[4] =3D { = \ + gen_helper_sve_##name##_b, gen_helper_sve_##name##_h, \ + gen_helper_sve_##name##_s, gen_helper_sve_##name##_d, \ + }; \ + do_vpz_ool(s, a, fns[a->esz]); \ +} + +DO_VPZ(ORV, orv) +DO_VPZ(ANDV, andv) +DO_VPZ(EORV, eorv) + +DO_VPZ(UADDV, uaddv) +DO_VPZ(SMAXV, smaxv) +DO_VPZ(UMAXV, umaxv) +DO_VPZ(SMINV, sminv) +DO_VPZ(UMINV, uminv) + +void trans_SADDV(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + gen_helper_gvec_reduc *fn; + switch (a->esz) { + case 0: + fn =3D gen_helper_sve_saddv_b; + break; + case 1: + fn =3D gen_helper_sve_saddv_h; + break; + case 2: + fn =3D gen_helper_sve_saddv_s; + break; + default: + unallocated_encoding(s); + return; + }; + do_vpz_ool(s, a, fn); +} + +#undef DO_VPZ + static uint64_t pred_esz_mask[4] =3D { 0xffffffffffffffffull, 0x5555555555555555ull, 0x1111111111111111ull, 0x0101010101010101ull diff --git a/target/arm/sve.def b/target/arm/sve.def index 3bb4faaf89..c26b1377e8 100644 --- a/target/arm/sve.def +++ b/target/arm/sve.def @@ -35,6 +35,7 @@ =20 &rri rd rn imm &rrr_esz rd rn rm esz +&rpr_esz rd pg rn esz &rprr_esz rd pg rn rm esz &pred_set rd pat esz i s =20 
@@ -52,6 +53,9 @@ @rdn_pg_rm_esz ........ esz:2 ... ... ... pg:3 rm:5 rd:5 &rprr_esz rn=3D%= reg_movprfx @rdm_pg_rn_esz ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rprr_esz rm=3D%= reg_movprfx =20 +# One register operand, with governing predicate, vector element size +@rd_pg_rn_esz ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz + # Basic Load/Store with 9-bit immediate offset @pd_rn_i9 ........ ........ ...... rn:5 . rd:4 &rri imm=3D%imm9_16_10 @rd_rn_i9 ........ ........ ...... rn:5 rd:5 &rri imm=3D%imm9_16_10 @@ -90,6 +94,24 @@ UDIV_zpzz 00000100 .. 010 101 000 ... ..... ..... @rdn= _pg_rm_esz SDIV_zpzz 00000100 .. 010 110 000 ... ..... ..... @rdm_pg_rn_esz # SDIVR UDIV_zpzz 00000100 .. 010 111 000 ... ..... ..... @rdm_pg_rn_esz # UDIVR =20 +### SVE Integer Reduction Group + +# SVE bitwise logical reduction (predicated) +ORV 00000100 .. 011 000 001 ... ..... ..... @rd_pg_rn_esz +EORV 00000100 .. 011 001 001 ... ..... ..... @rd_pg_rn_esz +ANDV 00000100 .. 011 010 001 ... ..... ..... @rd_pg_rn_esz + +# SVE integer add reduction (predicated) +UADDV 00000100 .. 000 001 001 ... ..... ..... @rd_pg_rn_esz +# Note that saddv requires size !=3D 3, which is unallocated. +SADDV 00000100 .. 000 000 001 ... ..... ..... @rd_pg_rn_esz + +# SVE integer min/max reduction (predicated) +SMAXV 00000100 .. 001 000 001 ... ..... ..... @rd_pg_rn_esz +UMAXV 00000100 .. 001 001 001 ... ..... ..... @rd_pg_rn_esz +SMINV 00000100 .. 001 010 001 ... ..... ..... @rd_pg_rn_esz +UMINV 00000100 .. 001 011 001 ... ..... ..... 
@rd_pg_rn_esz + ### SVE Logical - Unpredicated Group =20 # SVE bitwise logical operations (unpredicated) --=20 2.14.3 From nobody Tue Oct 28 12:15:16 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1513623222893817.0178338507422; Mon, 18 Dec 2017 10:53:42 -0800 (PST) Received: from localhost ([::1]:54036 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eR0Xl-00030T-GS for importer@patchew.org; Mon, 18 Dec 2017 13:53:41 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:55776) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eQzUW-0008Gl-FM for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:19 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1eQzUT-00025V-PL for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:16 -0500 Received: from mail-pl0-x244.google.com ([2607:f8b0:400e:c01::244]:45800) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1eQzUT-00024o-H1 for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:13 -0500 Received: by mail-pl0-x244.google.com with SMTP id o2so5242185plk.12 for ; Mon, 18 Dec 2017 09:46:13 -0800 (PST) Received: from cloudburst.twiddle.net (174-21-7-63.tukw.qwest.net. 
[174.21.7.63]) by smtp.gmail.com with ESMTPSA id t84sm26209657pfe.160.2017.12.18.09.46.10 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Mon, 18 Dec 2017 09:46:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=C+rsjKOajr3qlz7m+SpnMY8r8lTEeVW31kxAfY1GHdI=; b=K2sFfXs+VTxQW/FBGsV0PPv9uQB7ZjqBZBgUs6HeHhWfrXqVoszoCMQG3e78Fj5gRm aKRqXEqbYiGPAw8T7TtXtFb+FARQ0hRmpBL6V7IGdR/u5cBHnMiIgaAUtQFqRWGGsSJh FOMOGnALEkNue0uYYRwHG2Sld+BHWq38gZMrs= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=C+rsjKOajr3qlz7m+SpnMY8r8lTEeVW31kxAfY1GHdI=; b=rAYcCblS9e77xEPWItwnN5LFLm4KW9aJT08dve5B/j3NQ6sYydmZqk+Zs6HFF+0p7t SO1kaHKvMHxP0BVplZdEPES3xHwWCMQwLrMnjinvxW8PMQ+Uy+kXmP5jONptgL7PQMPM 1SARph4xeSMEiiU56qg2/eAI8R3Gye9AiIg5aXbF7cRI3kYk7sS+VIbDdTMBjGLW1E+S UUZVztHuJwKHNe3r6EyUUBjEJIYCKdblZQU2PMkClKexX33hpbWw1oQkAs4NftVJyWPz 1ZPydyKZG533ovlOHRKmSsYeqJYpr0ZqOVPGCJ4dznT6LCoWyV9lv/gRIROKU+cxPpI0 hfpw== X-Gm-Message-State: AKGB3mKc7OiTJlXhGfPIlImGFEx8ALcRpMzfoppqi0AxSnfZwJMK2S5a aN/MdILVdd0T99sNnzyQXqz6jh8QJdI= X-Google-Smtp-Source: ACJfBoutRi1LaAc2iFM1SWpUaIcYNnm5QaJnCx8v8EOS370z8700yq5rE0JtXH9MmCHp0lEjsecAwg== X-Received: by 10.159.244.12 with SMTP id x12mr491707plr.312.1513619171839; Mon, 18 Dec 2017 09:46:11 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Mon, 18 Dec 2017 09:45:40 -0800 Message-Id: <20171218174552.18871-12-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20171218174552.18871-1-richard.henderson@linaro.org> References: <20171218174552.18871-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c01::244 Subject: [Qemu-devel] [PATCH 11/23] target/arm: Implement SVE bitwise shift by immediate (predicated) X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 25 +++++ target/arm/sve_helper.c | 265 +++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/translate-sve.c | 124 +++++++++++++++++++++ target/arm/sve.def | 21 ++++ 4 files changed, 435 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 937598d6f8..2b265e9892 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -206,6 +206,31 @@ DEF_HELPER_FLAGS_3(sve_uminv_h, TCG_CALL_NO_RWG, i64, = ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_uminv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_uminv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) =20 +DEF_HELPER_FLAGS_3(sve_clr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_clr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_clr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_clr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_asr_zpzi_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i= 32) +DEF_HELPER_FLAGS_4(sve_asr_zpzi_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i= 32) +DEF_HELPER_FLAGS_4(sve_asr_zpzi_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i= 32) +DEF_HELPER_FLAGS_4(sve_asr_zpzi_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i= 32) + +DEF_HELPER_FLAGS_4(sve_lsr_zpzi_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i= 32) 
+DEF_HELPER_FLAGS_4(sve_lsr_zpzi_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i= 32) +DEF_HELPER_FLAGS_4(sve_lsr_zpzi_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i= 32) +DEF_HELPER_FLAGS_4(sve_lsr_zpzi_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i= 32) + +DEF_HELPER_FLAGS_4(sve_lsl_zpzi_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i= 32) +DEF_HELPER_FLAGS_4(sve_lsl_zpzi_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i= 32) +DEF_HELPER_FLAGS_4(sve_lsl_zpzi_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i= 32) +DEF_HELPER_FLAGS_4(sve_lsl_zpzi_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i= 32) + +DEF_HELPER_FLAGS_4(sve_asrd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_asrd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_asrd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_asrd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_bic_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_eor_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index fca17440e7..9146e35e5b 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -42,6 +42,201 @@ #endif =20 =20 +/* Expand active predicate bits to bytes, for byte elements. 
+ * for (i =3D 0; i < 256; ++i) { + * unsigned long m =3D 0; + * for (j =3D 0; j < 8; j++) { + * if ((i >> j) & 1) { + * m |=3D 0xfful << (j << 3); + * } + * } + * printf("0x%016lx,\n", m); + * } + */ +static inline uint64_t expand_pred_b(uint8_t byte) +{ + static const uint64_t word[256] =3D { + 0x0000000000000000, 0x00000000000000ff, 0x000000000000ff00, + 0x000000000000ffff, 0x0000000000ff0000, 0x0000000000ff00ff, + 0x0000000000ffff00, 0x0000000000ffffff, 0x00000000ff000000, + 0x00000000ff0000ff, 0x00000000ff00ff00, 0x00000000ff00ffff, + 0x00000000ffff0000, 0x00000000ffff00ff, 0x00000000ffffff00, + 0x00000000ffffffff, 0x000000ff00000000, 0x000000ff000000ff, + 0x000000ff0000ff00, 0x000000ff0000ffff, 0x000000ff00ff0000, + 0x000000ff00ff00ff, 0x000000ff00ffff00, 0x000000ff00ffffff, + 0x000000ffff000000, 0x000000ffff0000ff, 0x000000ffff00ff00, + 0x000000ffff00ffff, 0x000000ffffff0000, 0x000000ffffff00ff, + 0x000000ffffffff00, 0x000000ffffffffff, 0x0000ff0000000000, + 0x0000ff00000000ff, 0x0000ff000000ff00, 0x0000ff000000ffff, + 0x0000ff0000ff0000, 0x0000ff0000ff00ff, 0x0000ff0000ffff00, + 0x0000ff0000ffffff, 0x0000ff00ff000000, 0x0000ff00ff0000ff, + 0x0000ff00ff00ff00, 0x0000ff00ff00ffff, 0x0000ff00ffff0000, + 0x0000ff00ffff00ff, 0x0000ff00ffffff00, 0x0000ff00ffffffff, + 0x0000ffff00000000, 0x0000ffff000000ff, 0x0000ffff0000ff00, + 0x0000ffff0000ffff, 0x0000ffff00ff0000, 0x0000ffff00ff00ff, + 0x0000ffff00ffff00, 0x0000ffff00ffffff, 0x0000ffffff000000, + 0x0000ffffff0000ff, 0x0000ffffff00ff00, 0x0000ffffff00ffff, + 0x0000ffffffff0000, 0x0000ffffffff00ff, 0x0000ffffffffff00, + 0x0000ffffffffffff, 0x00ff000000000000, 0x00ff0000000000ff, + 0x00ff00000000ff00, 0x00ff00000000ffff, 0x00ff000000ff0000, + 0x00ff000000ff00ff, 0x00ff000000ffff00, 0x00ff000000ffffff, + 0x00ff0000ff000000, 0x00ff0000ff0000ff, 0x00ff0000ff00ff00, + 0x00ff0000ff00ffff, 0x00ff0000ffff0000, 0x00ff0000ffff00ff, + 0x00ff0000ffffff00, 0x00ff0000ffffffff, 0x00ff00ff00000000, + 0x00ff00ff000000ff, 
0x00ff00ff0000ff00, 0x00ff00ff0000ffff, + 0x00ff00ff00ff0000, 0x00ff00ff00ff00ff, 0x00ff00ff00ffff00, + 0x00ff00ff00ffffff, 0x00ff00ffff000000, 0x00ff00ffff0000ff, + 0x00ff00ffff00ff00, 0x00ff00ffff00ffff, 0x00ff00ffffff0000, + 0x00ff00ffffff00ff, 0x00ff00ffffffff00, 0x00ff00ffffffffff, + 0x00ffff0000000000, 0x00ffff00000000ff, 0x00ffff000000ff00, + 0x00ffff000000ffff, 0x00ffff0000ff0000, 0x00ffff0000ff00ff, + 0x00ffff0000ffff00, 0x00ffff0000ffffff, 0x00ffff00ff000000, + 0x00ffff00ff0000ff, 0x00ffff00ff00ff00, 0x00ffff00ff00ffff, + 0x00ffff00ffff0000, 0x00ffff00ffff00ff, 0x00ffff00ffffff00, + 0x00ffff00ffffffff, 0x00ffffff00000000, 0x00ffffff000000ff, + 0x00ffffff0000ff00, 0x00ffffff0000ffff, 0x00ffffff00ff0000, + 0x00ffffff00ff00ff, 0x00ffffff00ffff00, 0x00ffffff00ffffff, + 0x00ffffffff000000, 0x00ffffffff0000ff, 0x00ffffffff00ff00, + 0x00ffffffff00ffff, 0x00ffffffffff0000, 0x00ffffffffff00ff, + 0x00ffffffffffff00, 0x00ffffffffffffff, 0xff00000000000000, + 0xff000000000000ff, 0xff0000000000ff00, 0xff0000000000ffff, + 0xff00000000ff0000, 0xff00000000ff00ff, 0xff00000000ffff00, + 0xff00000000ffffff, 0xff000000ff000000, 0xff000000ff0000ff, + 0xff000000ff00ff00, 0xff000000ff00ffff, 0xff000000ffff0000, + 0xff000000ffff00ff, 0xff000000ffffff00, 0xff000000ffffffff, + 0xff0000ff00000000, 0xff0000ff000000ff, 0xff0000ff0000ff00, + 0xff0000ff0000ffff, 0xff0000ff00ff0000, 0xff0000ff00ff00ff, + 0xff0000ff00ffff00, 0xff0000ff00ffffff, 0xff0000ffff000000, + 0xff0000ffff0000ff, 0xff0000ffff00ff00, 0xff0000ffff00ffff, + 0xff0000ffffff0000, 0xff0000ffffff00ff, 0xff0000ffffffff00, + 0xff0000ffffffffff, 0xff00ff0000000000, 0xff00ff00000000ff, + 0xff00ff000000ff00, 0xff00ff000000ffff, 0xff00ff0000ff0000, + 0xff00ff0000ff00ff, 0xff00ff0000ffff00, 0xff00ff0000ffffff, + 0xff00ff00ff000000, 0xff00ff00ff0000ff, 0xff00ff00ff00ff00, + 0xff00ff00ff00ffff, 0xff00ff00ffff0000, 0xff00ff00ffff00ff, + 0xff00ff00ffffff00, 0xff00ff00ffffffff, 0xff00ffff00000000, + 0xff00ffff000000ff, 
0xff00ffff0000ff00, 0xff00ffff0000ffff, + 0xff00ffff00ff0000, 0xff00ffff00ff00ff, 0xff00ffff00ffff00, + 0xff00ffff00ffffff, 0xff00ffffff000000, 0xff00ffffff0000ff, + 0xff00ffffff00ff00, 0xff00ffffff00ffff, 0xff00ffffffff0000, + 0xff00ffffffff00ff, 0xff00ffffffffff00, 0xff00ffffffffffff, + 0xffff000000000000, 0xffff0000000000ff, 0xffff00000000ff00, + 0xffff00000000ffff, 0xffff000000ff0000, 0xffff000000ff00ff, + 0xffff000000ffff00, 0xffff000000ffffff, 0xffff0000ff000000, + 0xffff0000ff0000ff, 0xffff0000ff00ff00, 0xffff0000ff00ffff, + 0xffff0000ffff0000, 0xffff0000ffff00ff, 0xffff0000ffffff00, + 0xffff0000ffffffff, 0xffff00ff00000000, 0xffff00ff000000ff, + 0xffff00ff0000ff00, 0xffff00ff0000ffff, 0xffff00ff00ff0000, + 0xffff00ff00ff00ff, 0xffff00ff00ffff00, 0xffff00ff00ffffff, + 0xffff00ffff000000, 0xffff00ffff0000ff, 0xffff00ffff00ff00, + 0xffff00ffff00ffff, 0xffff00ffffff0000, 0xffff00ffffff00ff, + 0xffff00ffffffff00, 0xffff00ffffffffff, 0xffffff0000000000, + 0xffffff00000000ff, 0xffffff000000ff00, 0xffffff000000ffff, + 0xffffff0000ff0000, 0xffffff0000ff00ff, 0xffffff0000ffff00, + 0xffffff0000ffffff, 0xffffff00ff000000, 0xffffff00ff0000ff, + 0xffffff00ff00ff00, 0xffffff00ff00ffff, 0xffffff00ffff0000, + 0xffffff00ffff00ff, 0xffffff00ffffff00, 0xffffff00ffffffff, + 0xffffffff00000000, 0xffffffff000000ff, 0xffffffff0000ff00, + 0xffffffff0000ffff, 0xffffffff00ff0000, 0xffffffff00ff00ff, + 0xffffffff00ffff00, 0xffffffff00ffffff, 0xffffffffff000000, + 0xffffffffff0000ff, 0xffffffffff00ff00, 0xffffffffff00ffff, + 0xffffffffffff0000, 0xffffffffffff00ff, 0xffffffffffffff00, + 0xffffffffffffffff, + }; + return word[byte]; +} + +/* Similarly for half-word elements. 
+ * for (i =3D 0; i < 256; ++i) { + * unsigned long m =3D 0; + * if (i & 0xaa) { + * continue; + * } + * for (j =3D 0; j < 8; j +=3D 2) { + * if ((i >> j) & 1) { + * m |=3D 0xfffful << (j << 3); + * } + * } + * printf("[0x%x] =3D 0x%016lx,\n", i, m); + * } + */ +static inline uint64_t expand_pred_h(uint8_t byte) +{ + static const uint64_t word[] =3D { + [0x01] =3D 0x000000000000ffff, [0x04] =3D 0x00000000ffff0000, + [0x05] =3D 0x00000000ffffffff, [0x10] =3D 0x0000ffff00000000, + [0x11] =3D 0x0000ffff0000ffff, [0x14] =3D 0x0000ffffffff0000, + [0x15] =3D 0x0000ffffffffffff, [0x40] =3D 0xffff000000000000, + [0x41] =3D 0xffff00000000ffff, [0x44] =3D 0xffff0000ffff0000, + [0x45] =3D 0xffff0000ffffffff, [0x50] =3D 0xffffffff00000000, + [0x51] =3D 0xffffffff0000ffff, [0x54] =3D 0xffffffffffff0000, + [0x55] =3D 0xffffffffffffffff, + }; + return word[byte & 0x55]; +} + +/* Similarly for single word elements. */ +static inline uint64_t expand_pred_s(uint8_t byte) +{ + static const uint64_t word[] =3D { + [0x01] =3D 0x00000000ffffffffull, + [0x10] =3D 0xffffffff00000000ull, + [0x11] =3D 0xffffffffffffffffull, + }; + return word[byte & 0x11]; +} + +/* Store zero into every active element of Zd. We will use this for two + * and three-operand predicated instructions for which logic dictates a + * zero result. In particular, logical shift by element size, which is + * otherwise undefined on the host. + * + * For element sizes smaller than uint64_t, we use tables to expand + * the N bits of the controlling predicate to a byte mask, and clear + * those bytes. 
+ */ +void HELPER(sve_clr_b)(void *vd, void *vg, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; + uint64_t *d =3D vd; + uint8_t *pg =3D vg; + for (i =3D 0; i < opr_sz; i +=3D 1) { + d[i] &=3D ~expand_pred_b(pg[H1(i)]); + } +} + +void HELPER(sve_clr_h)(void *vd, void *vg, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; + uint64_t *d =3D vd; + uint8_t *pg =3D vg; + for (i =3D 0; i < opr_sz; i +=3D 1) { + d[i] &=3D ~expand_pred_h(pg[H1(i)]); + } +} + +void HELPER(sve_clr_s)(void *vd, void *vg, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; + uint64_t *d =3D vd; + uint8_t *pg =3D vg; + for (i =3D 0; i < opr_sz; i +=3D 1) { + d[i] &=3D ~expand_pred_s(pg[H1(i)]); + } +} + +void HELPER(sve_clr_d)(void *vd, void *vg, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; + uint64_t *d =3D vd; + uint8_t *pg =3D vg; + for (i =3D 0; i < opr_sz; i +=3D 1) { + if (pg[H1(i)] & 1) { + d[i] =3D 0; + } + } +} + /* Given the first and last word of the result, the first and last word of the governing mask, and the sum of the result, return a mask that can be used to quickly set NZCV. */ @@ -401,6 +596,76 @@ DO_VPZ_D(sve_uminv_d, uint64_t, uint64_t, -1, DO_MIN) #undef DO_MUL #undef DO_DIV =20 +/* Three-operand expander, immediate operand, controlled by a predicate. + */ +#define DO_ZPZI(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vg, uint32_t desc) \ +{ \ + intptr_t iv =3D 0, ib =3D 0, opr_sz =3D simd_oprsz(desc); = \ + TYPE imm =3D simd_data(desc); \ + for (iv =3D ib =3D 0; iv < opr_sz; iv +=3D 16, ib +=3D 2) { = \ + uint16_t pg =3D *(uint16_t *)(vg + H2(ib)); \ + intptr_t i =3D 0; \ + do { \ + if (pg & 1) { \ + TYPE nn =3D *(TYPE *)(vn + iv + H(i)); \ + *(TYPE *)(vd + iv + H(i)) =3D OP(nn, imm); \ + } \ + i +=3D sizeof(TYPE), pg >>=3D sizeof(TYPE); = \ + } while (pg); \ + } \ +} + +/* Similarly, specialized for 64-bit operands. 
*/ +#define DO_ZPZI_D(NAME, TYPE, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vg, uint32_t desc) \ +{ \ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; \ + TYPE *d =3D vd, *n =3D vn; \ + TYPE imm =3D simd_data(desc); \ + uint8_t *pg =3D vg; \ + for (i =3D 0; i < opr_sz; i +=3D 1) { \ + if (pg[H1(i)] & 1) { \ + TYPE nn =3D n[i]; \ + d[i] =3D OP(nn, imm); \ + } \ + } \ +} + +#define DO_SHR(N, M) (N >> M) +#define DO_SHL(N, M) (N << M) + +/* Arithmetic shift right for division. This rounds negative numbers + toward zero as per signed division. Therefore before shifting, + when N is negative, add 2**M-1. */ +#define DO_ASRD(N, M) ((N + (N < 0 ? ((__typeof(N))1 << M) - 1 : 0)) >> M) + +DO_ZPZI(sve_asr_zpzi_b, int8_t, H1, DO_SHR) +DO_ZPZI(sve_asr_zpzi_h, int16_t, H1_2, DO_SHR) +DO_ZPZI(sve_asr_zpzi_s, int32_t, H1_4, DO_SHR) +DO_ZPZI_D(sve_asr_zpzi_d, int64_t, DO_SHR) + +DO_ZPZI(sve_lsr_zpzi_b, uint8_t, H1, DO_SHR) +DO_ZPZI(sve_lsr_zpzi_h, uint16_t, H1_2, DO_SHR) +DO_ZPZI(sve_lsr_zpzi_s, uint32_t, H1_4, DO_SHR) +DO_ZPZI_D(sve_lsr_zpzi_d, uint64_t, DO_SHR) + +DO_ZPZI(sve_lsl_zpzi_b, uint8_t, H1, DO_SHL) +DO_ZPZI(sve_lsl_zpzi_h, uint16_t, H1_2, DO_SHL) +DO_ZPZI(sve_lsl_zpzi_s, uint32_t, H1_4, DO_SHL) +DO_ZPZI_D(sve_lsl_zpzi_d, uint64_t, DO_SHL) + +DO_ZPZI(sve_asrd_b, int8_t, H1, DO_ASRD) +DO_ZPZI(sve_asrd_h, int16_t, H1_2, DO_ASRD) +DO_ZPZI(sve_asrd_s, int32_t, H1_4, DO_ASRD) +DO_ZPZI_D(sve_asrd_d, int64_t, DO_ASRD) + +#undef DO_ZPZI +#undef DO_ZPZI_D +#undef DO_SHR +#undef DO_SHL +#undef DO_ASRD + void HELPER(sve_ldr)(CPUARMState *env, void *d, target_ulong addr, uint32_= t len) { intptr_t i, len_align =3D QEMU_ALIGN_DOWN(len, 8); diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 4abc66ba5f..08388c0a07 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -37,6 +37,30 @@ typedef void GVecGen2Fn(unsigned, uint32_t, uint32_t, ui= nt32_t, uint32_t); typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t, uint32_t, uint32_t, 
uint32_t); =20 +/* + * Helpers for extracting complex instruction fields. + */ + +/* See e.g. ASR (immediate, predicated). + * Returns -1 for unallocated encoding; diagnose later. + */ +static int tszimm_esz(int x) +{ + x >>=3D 3; /* discard imm3 */ + return 31 - clz32(x); +} + +static int tszimm_shr(int x) +{ + return (16 << tszimm_esz(x)) - x; +} + +/* See e.g. LSL (immediate, predicated). */ +static int tszimm_shl(int x) +{ + return x - (8 << tszimm_esz(x)); +} + /* * Include the generated decoder. */ @@ -265,6 +289,106 @@ void trans_SADDV(DisasContext *s, arg_rpr_esz *a, uin= t32_t insn) =20 #undef DO_VPZ =20 +/* Store zero into every active element of Zd. We will use this for two + * and three-operand predicated instructions for which logic dictates a + * zero result. + */ +static void do_zp_clr(DisasContext *s, int rd, int pg, int esz) +{ + static gen_helper_gvec_2 * const fns[4] =3D { + gen_helper_sve_clr_b, gen_helper_sve_clr_h, + gen_helper_sve_clr_s, gen_helper_sve_clr_d, + }; + unsigned vsz =3D size_for_gvec(vec_full_reg_size(s)); + tcg_gen_gvec_2_ool(vec_full_reg_offset(s, rd), + pred_full_reg_offset(s, pg), + vsz, vsz, 0, fns[esz]); +} + +static void do_zpzi_ool(DisasContext *s, arg_rpri_esz *a, + gen_helper_gvec_3 *fn) +{ + unsigned vsz =3D size_for_gvec(vec_full_reg_size(s)); + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + pred_full_reg_offset(s, a->pg), + vsz, vsz, a->imm, fn); +} + +void trans_ASR_zpzi(DisasContext *s, arg_rpri_esz *a, uint32_t insn) +{ + static gen_helper_gvec_3 * const fns[4] =3D { + gen_helper_sve_asr_zpzi_b, gen_helper_sve_asr_zpzi_h, + gen_helper_sve_asr_zpzi_s, gen_helper_sve_asr_zpzi_d, + }; + if (a->esz < 0) { + /* Invalid tsz encoding -- see tszimm_esz. */ + unallocated_encoding(s); + return; + } + /* Shift by element size is architecturally valid. For + arithmetic right-shift, it's the same as by one less.
*/ + a->imm =3D MIN(a->imm, (8 << a->esz) - 1); + do_zpzi_ool(s, a, fns[a->esz]); +} + +void trans_LSR_zpzi(DisasContext *s, arg_rpri_esz *a, uint32_t insn) +{ + static gen_helper_gvec_3 * const fns[4] =3D { + gen_helper_sve_lsr_zpzi_b, gen_helper_sve_lsr_zpzi_h, + gen_helper_sve_lsr_zpzi_s, gen_helper_sve_lsr_zpzi_d, + }; + if (a->esz < 0) { + unallocated_encoding(s); + return; + } + /* Shift by element size is architecturally valid. + For logical shifts, it is a zeroing operation. */ + if (a->imm >=3D (8 << a->esz)) { + do_zp_clr(s, a->rd, a->pg, a->esz); + } else { + do_zpzi_ool(s, a, fns[a->esz]); + } +} + +void trans_LSL_zpzi(DisasContext *s, arg_rpri_esz *a, uint32_t insn) +{ + static gen_helper_gvec_3 * const fns[4] =3D { + gen_helper_sve_lsl_zpzi_b, gen_helper_sve_lsl_zpzi_h, + gen_helper_sve_lsl_zpzi_s, gen_helper_sve_lsl_zpzi_d, + }; + if (a->esz < 0) { + unallocated_encoding(s); + return; + } + /* Shift by element size is architecturally valid. + For logical shifts, it is a zeroing operation. */ + if (a->imm >=3D (8 << a->esz)) { + do_zp_clr(s, a->rd, a->pg, a->esz); + } else { + do_zpzi_ool(s, a, fns[a->esz]); + } +} + +void trans_ASRD(DisasContext *s, arg_rpri_esz *a, uint32_t insn) +{ + static gen_helper_gvec_3 * const fns[4] =3D { + gen_helper_sve_asrd_b, gen_helper_sve_asrd_h, + gen_helper_sve_asrd_s, gen_helper_sve_asrd_d, + }; + if (a->esz < 0) { + unallocated_encoding(s); + return; + } + /* Shift by element size is architecturally valid. For arithmetic + right shift for division, it is a zeroing operation. */ + if (a->imm >=3D (8 << a->esz)) { + do_zp_clr(s, a->rd, a->pg, a->esz); + } else { + do_zpzi_ool(s, a, fns[a->esz]); + } +} + static uint64_t pred_esz_mask[4] =3D { 0xffffffffffffffffull, 0x5555555555555555ull, 0x1111111111111111ull, 0x0101010101010101ull diff --git a/target/arm/sve.def b/target/arm/sve.def index c26b1377e8..f1d2801b94 100644 --- a/target/arm/sve.def +++ b/target/arm/sve.def @@ -23,6 +23,14 @@ # Named fields. 
These are primarily for disjoint fields. =20 %imm9_16_10 16:s6 10:3 +%imm6_22_5 22:1 5:5 + +# A combination of tsz:imm3 -- extract esize. +%tszimm_esz 22:2 5:5 !function=3Dtszimm_esz +# A combination of tsz:imm3 -- extract (2 * esize) - (tsz:imm3) +%tszimm_shr 22:2 5:5 !function=3Dtszimm_shr +# A combination of tsz:imm3 -- extract (tsz:imm3) - esize +%tszimm_shl 22:2 5:5 !function=3Dtszimm_shl =20 # Either a copy of rd (at bit 0), or a different source # as propagated via the MOVPRFX instruction. @@ -37,6 +45,7 @@ &rrr_esz rd rn rm esz &rpr_esz rd pg rn esz &rprr_esz rd pg rn rm esz +&rpri_esz rd pg rn imm esz &pred_set rd pat esz i s =20 ########################################################################### @@ -56,6 +65,10 @@ # One register operand, with governing predicate, vector element size @rd_pg_rn_esz ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz =20 +# Two register operand, one immediate operand, with predicate, element siz= e encoded as TSZHL. +# User must fill in imm. +@rdn_pg_tszimm ........ .. ... ... ... pg:3 ..... rd:5 &rpri_esz rn=3D%r= eg_movprfx esz=3D%tszimm_esz + # Basic Load/Store with 9-bit immediate offset @pd_rn_i9 ........ ........ ...... rn:5 . rd:4 &rri imm=3D%imm9_16_10 @rd_rn_i9 ........ ........ ...... rn:5 rd:5 &rri imm=3D%imm9_16_10 @@ -112,6 +125,14 @@ UMAXV 00000100 .. 001 001 001 ... ..... ..... @rd_p= g_rn_esz SMINV 00000100 .. 001 010 001 ... ..... ..... @rd_pg_rn_esz UMINV 00000100 .. 001 011 001 ... ..... ..... @rd_pg_rn_esz =20 +### SVE Shift by Immediate - Predicated Group + +# SVE bitwise shift by immediate (predicated) +ASR_zpzi 00000100 .. 000 000 100 ... .. ... ..... @rdn_pg_tszimm imm=3D%t= szimm_shr +LSR_zpzi 00000100 .. 000 001 100 ... .. ... ..... @rdn_pg_tszimm imm=3D%t= szimm_shr +LSL_zpzi 00000100 .. 000 011 100 ... .. ... ..... @rdn_pg_tszimm imm=3D%t= szimm_shl +ASRD 00000100 .. 000 100 100 ... .. ... .....
@rdn_pg_tszimm imm=3D%tszi= mm_shr + ### SVE Logical - Unpredicated Group =20 # SVE bitwise logical operations (unpredicated) --=20 2.14.3 From nobody Tue Oct 28 12:15:16 2025 Delivered-To: importer@patchew.org Received-SPF: temperror (zoho.com: Error in retrieving data from DNS) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=temperror (zoho.com: Error in retrieving data from DNS) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 1513621589687904.5737590284684; Mon, 18 Dec 2017 10:26:29 -0800 (PST) Received: from localhost ([::1]:44142 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eR07L-0006zz-3G for importer@patchew.org; Mon, 18 Dec 2017 13:26:23 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:55773) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eQzUW-0008Gj-E0 for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:17 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1eQzUV-00026f-8Q for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:16 -0500 Received: from mail-pg0-x243.google.com ([2607:f8b0:400e:c05::243]:45268) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1eQzUV-000266-1B for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:15 -0500 Received: by mail-pg0-x243.google.com with SMTP id m25so9422366pgv.12 for ; Mon, 18 Dec 2017 09:46:14 -0800 (PST) Received: from cloudburst.twiddle.net (174-21-7-63.tukw.qwest.net. 
[174.21.7.63]) by smtp.gmail.com with ESMTPSA id t84sm26209657pfe.160.2017.12.18.09.46.11 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Mon, 18 Dec 2017 09:46:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=Ig4kdYIw6qOuzDY1F9Ia70Qay4cvLKadEMa5FjRQcZw=; b=OpChPybH9SFEICQQfVL9N3G4TLi3b2VGHHvT8BiGSOCn5mHV+TzpvgV6FWliEgdb8I hZjpPGNx/KXAPG+iNBuah9F+iZTxu0dn/ZoBI6Gl3YCvbmZul2IYiCTREGRRxZj1z9WB 9+Ad7hbWsZt0JqLKZTC9NwsTA2ec3eaigdIUE= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=Ig4kdYIw6qOuzDY1F9Ia70Qay4cvLKadEMa5FjRQcZw=; b=EvZ7blA9LXkd7fZisJdG/Vsvn7sDTzZmOfVO+saG4dsCSKWymPz1eDmNfn+mJ+jxIH +X9ayj9ttgZ+7GMIpy9qfejvkCHnFNX8DYJp7Nw/dHXDzml/sXIhGpGDYtuM8UTUP9gj TfrvHLM3IYujlbBRI6V7FyvtGRP1Wiy0Jmm5V25UhIPT/bGsdUK5O4qfr7BlXilZvDGJ CZYTErVqlG1EhbmypWvOMBoxTZx8op8ifEffe36abMfMHA7kIcUz4PKOheoVFAm9Powq ELh0dPvCh/AZ0YtLHnwz49O9TcAFNlCA6c/FpMnk1qhp6Vp0v2BJ/RtOVZ/IZyKPdaee Pnrg== X-Gm-Message-State: AKGB3mKgzK+J+xjjKQVIqTNZa8EtwEdCR4j8wIiVZgTok0SPZg5WesvH qGw63zlQl6yhECMqpTevmFaVCt+u6l0= X-Google-Smtp-Source: ACJfBouMixZtdSFIeMtyYhf3OFwrsByROj88m/Ns3kK32imyhp2ElEWNuoCwbf9u7anEVfGZ3r6t4A== X-Received: by 10.98.56.69 with SMTP id f66mr498722pfa.38.1513619173513; Mon, 18 Dec 2017 09:46:13 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Mon, 18 Dec 2017 09:45:41 -0800 Message-Id: <20171218174552.18871-13-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20171218174552.18871-1-richard.henderson@linaro.org> References: <20171218174552.18871-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c05::243 Subject: [Qemu-devel] [PATCH 12/23] target/arm: Implement SVE bitwise shift by vector (predicated) X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_6 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 27 +++++++++++++++++++++++++++ target/arm/sve_helper.c | 25 +++++++++++++++++++++++++ target/arm/translate-sve.c | 4 ++++ target/arm/sve.def | 8 ++++++++ 4 files changed, 64 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 2b265e9892..61b1287269 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -162,6 +162,33 @@ DEF_HELPER_FLAGS_5(sve_udiv_zpzz_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_udiv_zpzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) =20 +DEF_HELPER_FLAGS_5(sve_asr_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_asr_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_asr_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_asr_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_lsr_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_lsr_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_lsr_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_lsr_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_lsl_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) 
+DEF_HELPER_FLAGS_5(sve_lsl_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_lsl_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_lsl_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_3(sve_orv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_orv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_orv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 9146e35e5b..20f1e60fda 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -492,6 +492,28 @@ DO_ZPZZ_D(sve_umulh_zpzz_d, uint64_t, do_umulh_d) DO_ZPZZ_D(sve_sdiv_zpzz_d, int64_t, DO_DIV) DO_ZPZZ_D(sve_udiv_zpzz_d, uint64_t, DO_DIV) =20 +/* Note that all bits of the shift are significant + and not modulo the element size. */ +#define DO_ASR(N, M) (N >> MIN(M, sizeof(N) * 8 - 1)) +#define DO_LSR(N, M) (M < sizeof(N) * 8 ? N >> M : 0) +#define DO_LSL(N, M) (M < sizeof(N) * 8 ? N << M : 0) + +DO_ZPZZ(sve_asr_zpzz_b, int8_t, H1, DO_ASR) +DO_ZPZZ(sve_lsr_zpzz_b, uint8_t, H1, DO_LSR) +DO_ZPZZ(sve_lsl_zpzz_b, uint8_t, H1, DO_LSL) + +DO_ZPZZ(sve_asr_zpzz_h, int16_t, H1_2, DO_ASR) +DO_ZPZZ(sve_lsr_zpzz_h, uint16_t, H1_2, DO_LSR) +DO_ZPZZ(sve_lsl_zpzz_h, uint16_t, H1_2, DO_LSL) + +DO_ZPZZ(sve_asr_zpzz_s, int32_t, H1_4, DO_ASR) +DO_ZPZZ(sve_lsr_zpzz_s, uint32_t, H1_4, DO_LSR) +DO_ZPZZ(sve_lsl_zpzz_s, uint32_t, H1_4, DO_LSL) + +DO_ZPZZ_D(sve_asr_zpzz_d, int64_t, DO_ASR) +DO_ZPZZ_D(sve_lsr_zpzz_d, uint64_t, DO_LSR) +DO_ZPZZ_D(sve_lsl_zpzz_d, uint64_t, DO_LSL) + #undef DO_ZPZZ #undef DO_ZPZZ_D =20 @@ -595,6 +617,9 @@ DO_VPZ_D(sve_uminv_d, uint64_t, uint64_t, -1, DO_MIN) #undef DO_ABD #undef DO_MUL #undef DO_DIV +#undef DO_ASR +#undef DO_LSR +#undef DO_LSL =20 /* Three-operand expander, immediate operand, controlled by a predicate.
*/ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 08388c0a07..685a3ba249 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -190,6 +190,10 @@ DO_ZPZZ(MUL, mul) DO_ZPZZ(SMULH, smulh) DO_ZPZZ(UMULH, umulh) =20 +DO_ZPZZ(ASR, asr) +DO_ZPZZ(LSR, lsr) +DO_ZPZZ(LSL, lsl) + void trans_SDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn) { gen_helper_gvec_4 *fn; diff --git a/target/arm/sve.def b/target/arm/sve.def index f1d2801b94..9f9c0803a0 100644 --- a/target/arm/sve.def +++ b/target/arm/sve.def @@ -133,6 +133,14 @@ LSR_zpzi 00000100 .. 000 001 100 ... .. ... ..... @rd= n_pg_tszimm imm=3D%tszimm_sh LSL_zpzi 00000100 .. 000 011 100 ... .. ... ..... @rdn_pg_tszimm imm=3D%t= szimm_shl ASRD 00000100 .. 000 100 100 ... .. ... ..... @rdn_pg_tszimm imm=3D%tszi= mm_shr =20 +# SVE bitwise shift by vector (predicated) +ASR_zpzz 00000100 .. 010 000 100 ... ..... ..... @rdn_pg_rm_esz +LSR_zpzz 00000100 .. 010 001 100 ... ..... ..... @rdn_pg_rm_esz +LSL_zpzz 00000100 .. 010 011 100 ... ..... ..... @rdn_pg_rm_esz +ASR_zpzz 00000100 .. 010 100 100 ... ..... ..... @rdm_pg_rn_esz # ASRR +LSR_zpzz 00000100 .. 010 101 100 ... ..... ..... @rdm_pg_rn_esz # LSRR +LSL_zpzz 00000100 .. 010 111 100 ... ..... .....
@rdm_pg_rn_esz # LSLR + ### SVE Logical - Unpredicated Group =20 # SVE bitwise logical operations (unpredicated) --=20 2.14.3 From nobody Tue Oct 28 12:15:16 2025 Delivered-To: importer@patchew.org Received-SPF: temperror (zoho.com: Error in retrieving data from DNS) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=temperror (zoho.com: Error in retrieving data from DNS) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 1513621946673700.5053138448084; Mon, 18 Dec 2017 10:32:26 -0800 (PST) Received: from localhost ([::1]:51595 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eR0Cr-0006lc-Gk for importer@patchew.org; Mon, 18 Dec 2017 13:32:05 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:55807) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eQzUX-0008H9-Jj for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:19 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1eQzUW-00027u-HA for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:17 -0500 Received: from mail-pl0-x244.google.com ([2607:f8b0:400e:c01::244]:40363) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1eQzUW-00027C-9n for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:16 -0500 Received: by mail-pl0-x244.google.com with SMTP id 62so3801146pld.7 for ; Mon, 18 Dec 2017 09:46:16 -0800 (PST) Received: from cloudburst.twiddle.net (174-21-7-63.tukw.qwest.net. 
[174.21.7.63]) by smtp.gmail.com with ESMTPSA id t84sm26209657pfe.160.2017.12.18.09.46.13 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Mon, 18 Dec 2017 09:46:14 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=eISu4UfSsAnxjFAeFFh1n1p1VMBmX+zbvHQEoYGoTBI=; b=Y7zwAIS5EX1EadgXdaf81dDgxkPnP8QuwleVD+CS/V+hjiORAgEAyZNFU8EF5BX5iu TJBF+/6x3TMS+vkRuxXzhi0hpsUCLJqeTfg2fbMKSsS/PKwf/NompMcK9xkJH1uXurUN U8MFc7YY45PNERj8SSgMcLnONwS42ijZOPrxo= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=eISu4UfSsAnxjFAeFFh1n1p1VMBmX+zbvHQEoYGoTBI=; b=Az4kMNMXIZleKZGlZzeqVJ4NdiZsUSxN1r9ZqbXYbv3WSoYnvgYUXCXyn1d24AXhEB oBxbTrUbhMz9CzKxKPFZ3cZFCj1HivGmmEtZJ6WGfC83zbibk1PBhOvJXGKJEBnOd5hn 3ciWYeK/ElPxU7gVw6NM63tELgdfMQq2oBYAecTxdbQUwHLQK64Ep0zCcehl9eubwmZd nf7UEBuYCnWQhUdSeUjMHAsv9dQqEr4RBiNpnF0Tm000CMD17yIaKhQhtU4PRJUbSf2q JeldBlAWCfB18jfNHQedZ+2goS6lEEl9cDXhXCJ62LyDfYh85dv1CMy79AWQED0UqGTw vbYA== X-Gm-Message-State: AKGB3mJpzNbWsry39dhsMKW/OfxVOycHkn9LKgePaoFlis7uemeFjX98 N7ZeezdUaDmgaHd3ZjOEp2yOLpnNyQo= X-Google-Smtp-Source: ACJfBoumnQWz5OFvsWPnR2Fcgd6Tx3B+uDc+To30PlsMxw6Pw3MUnKx74Is5hXloh+rUPismrN1GCw== X-Received: by 10.84.133.132 with SMTP id f4mr452596plf.413.1513619174988; Mon, 18 Dec 2017 09:46:14 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Mon, 18 Dec 2017 09:45:42 -0800 Message-Id: <20171218174552.18871-14-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20171218174552.18871-1-richard.henderson@linaro.org> References: <20171218174552.18871-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c01::244 Subject: [Qemu-devel] [PATCH 13/23] target/arm: Implement SVE bitwise shift by wide elements (predicated) X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_6 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 21 +++++++++++++++++++++ target/arm/sve_helper.c | 36 ++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 20 ++++++++++++++++++++ target/arm/sve.def | 6 ++++++ 4 files changed, 83 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 61b1287269..a2db3e2fd9 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -189,6 +189,27 @@ DEF_HELPER_FLAGS_5(sve_lsl_zpzz_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_lsl_zpzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) =20 +DEF_HELPER_FLAGS_5(sve_asr_zpzw_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_asr_zpzw_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_asr_zpzw_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_lsr_zpzw_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_lsr_zpzw_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_lsr_zpzw_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_lsl_zpzw_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_lsl_zpzw_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_lsl_zpzw_s, TCG_CALL_NO_RWG, + void, ptr, ptr, 
ptr, ptr, i32) + DEF_HELPER_FLAGS_3(sve_orv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_orv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_orv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 20f1e60fda..3be6d1ae05 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -517,6 +517,42 @@ DO_ZPZZ_D(sve_lsl_zpzz_d, uint64_t, DO_LSL) #undef DO_ZPZZ #undef DO_ZPZZ_D =20 +/* Three-operand expander, controlled by a predicate, in which the + * third operand is "wide". That is, for D =3D N op M, the same 64-bit + * value of M is used with all of the narrower values of N. + */ +#define DO_ZPZW(NAME, TYPE, TYPEW, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \ +{ \ + intptr_t iv =3D 0, ib =3D 0, opr_sz =3D simd_oprsz(desc); = \ + for (iv =3D ib =3D 0; iv < opr_sz; iv +=3D 16, ib +=3D 2) { = \ + uint16_t pg =3D *(uint16_t *)(vg + H2(ib)); \ + TYPEW mm =3D *(TYPEW *)(vm + iv); \ + intptr_t i =3D 0; \ + do { \ + if (pg & 1) { \ + TYPE nn =3D *(TYPE *)(vn + iv + H(i)); \ + *(TYPE *)(vd + iv + H(i)) =3D OP(nn, mm); \ + } \ + i +=3D sizeof(TYPE), pg >>=3D sizeof(TYPE); = \ + } while (pg); \ + } \ +} + +DO_ZPZW(sve_asr_zpzw_b, int8_t, uint64_t, H1, DO_ASR) +DO_ZPZW(sve_lsr_zpzw_b, uint8_t, uint64_t, H1, DO_LSR) +DO_ZPZW(sve_lsl_zpzw_b, uint8_t, uint64_t, H1, DO_LSL) + +DO_ZPZW(sve_asr_zpzw_h, int16_t, uint64_t, H1_2, DO_ASR) +DO_ZPZW(sve_lsr_zpzw_h, uint16_t, uint64_t, H1_2, DO_LSR) +DO_ZPZW(sve_lsl_zpzw_h, uint16_t, uint64_t, H1_2, DO_LSL) + +DO_ZPZW(sve_asr_zpzw_s, int32_t, uint64_t, H1_4, DO_ASR) +DO_ZPZW(sve_lsr_zpzw_s, uint32_t, uint64_t, H1_4, DO_LSR) +DO_ZPZW(sve_lsl_zpzw_s, uint32_t, uint64_t, H1_4, DO_LSL) + +#undef DO_ZPZW + /* Two-operand reduction expander, controlled by a predicate. * The difference between TYPERED and TYPERET has to do with * sign-extension. E.g. 
for SMAX, TYPERED must be signed, diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 685a3ba249..91f07d57e3 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -230,6 +230,26 @@ void trans_UDIV_zpzz(DisasContext *s, arg_rprr_esz *a,= uint32_t insn) =20 #undef DO_ZPZZ =20 +#define DO_ZPZW(NAME, name) \ +void trans_##NAME##_zpzw(DisasContext *s, arg_rprr_esz *a, uint32_t insn) \ +{ \ + static gen_helper_gvec_4 * const fns[3] =3D { = \ + gen_helper_sve_##name##_zpzw_b, gen_helper_sve_##name##_zpzw_h, \ + gen_helper_sve_##name##_zpzw_s, \ + }; \ + if ((unsigned)a->esz < 3) { \ + do_zpzz_ool(s, a, fns[a->esz]); \ + } else { \ + unallocated_encoding(s); \ + } \ +} + +DO_ZPZW(ASR, asr) +DO_ZPZW(LSR, lsr) +DO_ZPZW(LSL, lsl) + +#undef DO_ZPZW + typedef void gen_helper_gvec_reduc(TCGv_i64, TCGv_ptr, TCGv_ptr, TCGv_i32); static void do_vpz_ool(DisasContext *s, arg_rpr_esz *a, gen_helper_gvec_reduc *fn) diff --git a/target/arm/sve.def b/target/arm/sve.def index 9f9c0803a0..66be950ca5 100644 --- a/target/arm/sve.def +++ b/target/arm/sve.def @@ -141,6 +141,12 @@ ASR_zpzz 00000100 .. 010 100 100 ... ..... ..... @rd= m_pg_rn_esz # ASRR LSR_zpzz 00000100 .. 010 101 100 ... ..... ..... @rdm_pg_rn_esz # LSRR LSL_zpzz 00000100 .. 010 111 100 ... ..... ..... @rdm_pg_rn_esz # LSLR =20 +# SVE bitwise shift by wide elements (predicated) +# Note these require size !=3D 3. +ASR_zpzw 00000100 .. 011 000 100 ... ..... ..... @rdn_pg_rm_esz +LSR_zpzw 00000100 .. 011 001 100 ... ..... ..... @rdn_pg_rm_esz +LSL_zpzw 00000100 .. 011 011 100 ... ..... ..... 
@rdn_pg_rm_esz + ### SVE Logical - Unpredicated Group =20 # SVE bitwise logical operations (unpredicated) --=20 2.14.3 From nobody Tue Oct 28 12:15:16 2025 Delivered-To: importer@patchew.org Received-SPF: temperror (zoho.com: Error in retrieving data from DNS) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=temperror (zoho.com: Error in retrieving data from DNS) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 1513622128681444.76888319755665; Mon, 18 Dec 2017 10:35:28 -0800 (PST) Received: from localhost ([::1]:51803 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eR0Fq-0001Ez-8g for importer@patchew.org; Mon, 18 Dec 2017 13:35:10 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:55849) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eQzUZ-0008Jj-ND for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:21 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1eQzUX-00029I-UD for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:19 -0500 Received: from mail-pl0-x241.google.com ([2607:f8b0:400e:c01::241]:34548) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1eQzUX-00028b-M9 for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:17 -0500 Received: by mail-pl0-x241.google.com with SMTP id d21so5238067pll.1 for ; Mon, 18 Dec 2017 09:46:17 -0800 (PST) Received: from cloudburst.twiddle.net (174-21-7-63.tukw.qwest.net. 
[174.21.7.63]) by smtp.gmail.com with ESMTPSA id t84sm26209657pfe.160.2017.12.18.09.46.15 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Mon, 18 Dec 2017 09:46:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=qmnu1cV1FQddZr64hDlrwGSfeH9uIDzZPj4kfJrRg2E=; b=IonaUb2MvDcwPDi9j0VLhRg9fPrj5auu0LL9zOf+k5Jpol+1uv6mvO0C0i2YdRCet9 X1/kN2AkhjCDNqVva6OSA9HnWiDDaRcWWgRRAWCXFaO14AimvHu8pDAZO9g2Aq/LZ9bk RpVIcYKp6sKu2cA+kSN0fOdabKSW97OHYoMvQ= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=qmnu1cV1FQddZr64hDlrwGSfeH9uIDzZPj4kfJrRg2E=; b=OAtBy3l4ijl1n0WJcV+z5b1xalNhrJE+RUkvefJoSsDWbI3+6WvqzANL1eEzuzNYzl cDO53bqGsP7YEGt/O8V4sSggTO2RTi2gcxN3xzG7gVCYAyUYK0/wb+quzhbYO1d17ZZY wHLkbR/T7gWfLlcDDFD/KMmyrU4J3Dm79xyGdfOLNYWJ1xs6AqW3Y6huUvgi1spuwinf kGGzYxpzQ+zhbBwqhESrAUau5UPY2FaQMv4ghspuWG+WX3Ctza1idFVfeTTn+HSziiND 4QQY8YHdL2q0peBqvFKzUmEx++t2YNwZpwD2/Xsn+cMjfEAADwg+ctou3OyOxMJ4WapV jARQ== X-Gm-Message-State: AKGB3mLsvyP0CoZuYbhogK2Hhv01rE2PnvMqluvfsYqAOfgAFIKc1Swn 88tz0I+YOYuoUOPl+v/UtnVpPT0HsjI= X-Google-Smtp-Source: ACJfBounwgTb0qRZkfr/aV3CG9zn0FedDHsj9oiYkE3vA/ZvhJVniuFhu2D5nJtegdQvCkDpOoz4pw== X-Received: by 10.159.203.137 with SMTP id ay9mr484745plb.380.1513619176266; Mon, 18 Dec 2017 09:46:16 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Mon, 18 Dec 2017 09:45:43 -0800 Message-Id: <20171218174552.18871-15-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20171218174552.18871-1-richard.henderson@linaro.org> References: <20171218174552.18871-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c01::241 Subject: [Qemu-devel] [PATCH 14/23] target/arm: Implement SVE Integer Arithmetic - Unary Predicated Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_6 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 60 +++++++++++++++++++++ target/arm/sve_helper.c | 128 +++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/translate-sve.c | 107 +++++++++++++++++++++++++++++++++++++ target/arm/sve.def | 21 ++++++++ 4 files changed, 316 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index a2db3e2fd9..e9382de300 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -279,6 +279,66 @@ DEF_HELPER_FLAGS_4(sve_asrd_h, TCG_CALL_NO_RWG, void, = ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_asrd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_asrd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) =20 +DEF_HELPER_FLAGS_4(sve_cls_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cls_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cls_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cls_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_clz_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_clz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_clz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_clz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_cnt_zpz_b, 
TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i3= 2) +DEF_HELPER_FLAGS_4(sve_cnt_zpz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i3= 2) +DEF_HELPER_FLAGS_4(sve_cnt_zpz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i3= 2) +DEF_HELPER_FLAGS_4(sve_cnt_zpz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i3= 2) + +DEF_HELPER_FLAGS_4(sve_cnot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cnot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cnot_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cnot_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_fabs_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fabs_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fabs_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_fneg_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fneg_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fneg_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_not_zpz_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i3= 2) +DEF_HELPER_FLAGS_4(sve_not_zpz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i3= 2) +DEF_HELPER_FLAGS_4(sve_not_zpz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i3= 2) +DEF_HELPER_FLAGS_4(sve_not_zpz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i3= 2) + +DEF_HELPER_FLAGS_4(sve_sxtb_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_sxtb_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_sxtb_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_uxtb_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_uxtb_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_uxtb_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_sxth_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_sxth_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + 
+DEF_HELPER_FLAGS_4(sve_uxth_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_uxth_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_sxtw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_uxtw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_abs_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_abs_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_abs_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_abs_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_neg_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_neg_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_neg_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_neg_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_bic_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_eor_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 3be6d1ae05..481b3bdefe 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -553,6 +553,134 @@ DO_ZPZW(sve_lsl_zpzw_s, uint32_t, uint64_t, H1_4, DO_= LSL) =20 #undef DO_ZPZW =20 +/* Fully general two-operand expander, controlled by a predicate. 
+ */ +#define DO_ZPZ(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vg, uint32_t desc) \ +{ \ + intptr_t iv =3D 0, ib =3D 0, opr_sz =3D simd_oprsz(desc); \ + for (iv =3D ib =3D 0; iv < opr_sz; iv +=3D 16, ib +=3D 2) { \ + uint16_t pg =3D *(uint16_t *)(vg + H2(ib)); \ + intptr_t i =3D 0; \ + do { \ + if (pg & 1) { \ + TYPE nn =3D *(TYPE *)(vn + iv + H(i)); \ + *(TYPE *)(vd + iv + H(i)) =3D OP(nn); \ + } \ + i +=3D sizeof(TYPE), pg >>=3D sizeof(TYPE); \ + } while (pg); \ + } \ +} + +/* Similarly, specialized for 64-bit operands. */ +#define DO_ZPZ_D(NAME, TYPE, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vg, uint32_t desc) \ +{ \ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; \ + TYPE *d =3D vd, *n =3D vn; \ + uint8_t *pg =3D vg; \ + for (i =3D 0; i < opr_sz; i +=3D 1) { \ + if (pg[H1(i)] & 1) { \ + TYPE nn =3D n[i]; \ + d[i] =3D OP(nn); \ + } \ + } \ +} + +#define DO_CLS_B(N) (clrsb32(N) - 24) +#define DO_CLS_H(N) (clrsb32(N) - 16) + +DO_ZPZ(sve_cls_b, int8_t, H1, DO_CLS_B) +DO_ZPZ(sve_cls_h, int16_t, H1_2, DO_CLS_H) +DO_ZPZ(sve_cls_s, int32_t, H1_4, clrsb32) +DO_ZPZ_D(sve_cls_d, int64_t, clrsb64) + +#define DO_CLZ_B(N) (clz32(N) - 24) +#define DO_CLZ_H(N) (clz32(N) - 16) + +DO_ZPZ(sve_clz_b, uint8_t, H1, DO_CLZ_B) +DO_ZPZ(sve_clz_h, uint16_t, H1_2, DO_CLZ_H) +DO_ZPZ(sve_clz_s, uint32_t, H1_4, clz32) +DO_ZPZ_D(sve_clz_d, uint64_t, clz64) + +DO_ZPZ(sve_cnt_zpz_b, uint8_t, H1, ctpop8) +DO_ZPZ(sve_cnt_zpz_h, uint16_t, H1_2, ctpop16) +DO_ZPZ(sve_cnt_zpz_s, uint32_t, H1_4, ctpop32) +DO_ZPZ_D(sve_cnt_zpz_d, uint64_t, ctpop64) + +#define DO_CNOT(N) (N =3D=3D 0) + +DO_ZPZ(sve_cnot_b, uint8_t, H1, DO_CNOT) +DO_ZPZ(sve_cnot_h, uint16_t, H1_2, DO_CNOT) +DO_ZPZ(sve_cnot_s, uint32_t, H1_4, DO_CNOT) +DO_ZPZ_D(sve_cnot_d, uint64_t, DO_CNOT) + +#define DO_FABS(N) (N & ((__typeof(N))-1 >> 1)) + +DO_ZPZ(sve_fabs_h, uint16_t, H1_2, DO_FABS) +DO_ZPZ(sve_fabs_s, uint32_t, H1_4, DO_FABS) +DO_ZPZ_D(sve_fabs_d, uint64_t, DO_FABS) + +#define DO_FNEG(N) (N ^ 
~((__typeof(N))-1 >> 1)) + +DO_ZPZ(sve_fneg_h, uint16_t, H1_2, DO_FNEG) +DO_ZPZ(sve_fneg_s, uint32_t, H1_4, DO_FNEG) +DO_ZPZ_D(sve_fneg_d, uint64_t, DO_FNEG) + +#define DO_NOT(N) (~N) + +DO_ZPZ(sve_not_zpz_b, uint8_t, H1, DO_NOT) +DO_ZPZ(sve_not_zpz_h, uint16_t, H1_2, DO_NOT) +DO_ZPZ(sve_not_zpz_s, uint32_t, H1_4, DO_NOT) +DO_ZPZ_D(sve_not_zpz_d, uint64_t, DO_NOT) + +#define DO_SXTB(N) ((int8_t)N) +#define DO_SXTH(N) ((int16_t)N) +#define DO_SXTS(N) ((int32_t)N) +#define DO_UXTB(N) ((uint8_t)N) +#define DO_UXTH(N) ((uint16_t)N) +#define DO_UXTS(N) ((uint32_t)N) + +DO_ZPZ(sve_sxtb_h, uint16_t, H1_2, DO_SXTB) +DO_ZPZ(sve_sxtb_s, uint32_t, H1_4, DO_SXTB) +DO_ZPZ(sve_sxth_s, uint32_t, H1_4, DO_SXTH) +DO_ZPZ_D(sve_sxtb_d, uint64_t, DO_SXTB) +DO_ZPZ_D(sve_sxth_d, uint64_t, DO_SXTH) +DO_ZPZ_D(sve_sxtw_d, uint64_t, DO_SXTS) + +DO_ZPZ(sve_uxtb_h, uint16_t, H1_2, DO_UXTB) +DO_ZPZ(sve_uxtb_s, uint32_t, H1_4, DO_UXTB) +DO_ZPZ(sve_uxth_s, uint32_t, H1_4, DO_UXTH) +DO_ZPZ_D(sve_uxtb_d, uint64_t, DO_UXTB) +DO_ZPZ_D(sve_uxth_d, uint64_t, DO_UXTH) +DO_ZPZ_D(sve_uxtw_d, uint64_t, DO_UXTS) + +#define DO_ABS(N) (N < 0 ? -N : N) + +DO_ZPZ(sve_abs_b, int8_t, H1, DO_ABS) +DO_ZPZ(sve_abs_h, int16_t, H1_2, DO_ABS) +DO_ZPZ(sve_abs_s, int32_t, H1_4, DO_ABS) +DO_ZPZ_D(sve_abs_d, int64_t, DO_ABS) + +#define DO_NEG(N) (-N) + +DO_ZPZ(sve_neg_b, uint8_t, H1, DO_NEG) +DO_ZPZ(sve_neg_h, uint16_t, H1_2, DO_NEG) +DO_ZPZ(sve_neg_s, uint32_t, H1_4, DO_NEG) +DO_ZPZ_D(sve_neg_d, uint64_t, DO_NEG) + +#undef DO_CLS_B +#undef DO_CLS_H +#undef DO_CLZ_B +#undef DO_CLZ_H +#undef DO_CNOT +#undef DO_FABS +#undef DO_FNEG +#undef DO_ABS +#undef DO_NEG +#undef DO_ZPZ +#undef DO_ZPZ_D + /* Two-operand reduction expander, controlled by a predicate. * The difference between TYPERED and TYPERET has to do with * sign-extension. E.g. 
for SMAX, TYPERED must be signed, diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 91f07d57e3..7dbc43fb6e 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -250,6 +250,113 @@ DO_ZPZW(LSL, lsl) =20 #undef DO_ZPZW =20 +static void do_zpz_ool(DisasContext *s, arg_rpr_esz *a, gen_helper_gvec_3 = *fn) +{ + unsigned vsz =3D size_for_gvec(vec_full_reg_size(s)); + if (fn =3D=3D NULL) { + unallocated_encoding(s); + return; + } + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + pred_full_reg_offset(s, a->pg), + vsz, vsz, 0, fn); +} + +#define DO_ZPZ(NAME, name) \ +void trans_##NAME(DisasContext *s, arg_rpr_esz *a, uint32_t insn) \ +{ \ + static gen_helper_gvec_3 * const fns[4] =3D { \ + gen_helper_sve_##name##_b, gen_helper_sve_##name##_h, \ + gen_helper_sve_##name##_s, gen_helper_sve_##name##_d, \ + }; \ + do_zpz_ool(s, a, fns[a->esz]); \ +} + +DO_ZPZ(CLS, cls) +DO_ZPZ(CLZ, clz) +DO_ZPZ(CNT_zpz, cnt_zpz) +DO_ZPZ(CNOT, cnot) +DO_ZPZ(NOT_zpz, not_zpz) +DO_ZPZ(ABS, abs) +DO_ZPZ(NEG, neg) + +void trans_FABS(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_3 * const fns[4] =3D { + NULL, + gen_helper_sve_fabs_h, + gen_helper_sve_fabs_s, + gen_helper_sve_fabs_d + }; + do_zpz_ool(s, a, fns[a->esz]); +} + +void trans_FNEG(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_3 * const fns[4] =3D { + NULL, + gen_helper_sve_fneg_h, + gen_helper_sve_fneg_s, + gen_helper_sve_fneg_d + }; + do_zpz_ool(s, a, fns[a->esz]); +} + +void trans_SXTB(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_3 * const fns[4] =3D { + NULL, + gen_helper_sve_sxtb_h, + gen_helper_sve_sxtb_s, + gen_helper_sve_sxtb_d + }; + do_zpz_ool(s, a, fns[a->esz]); +} + +void trans_UXTB(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_3 * const fns[4] =3D { + NULL, + gen_helper_sve_uxtb_h, + gen_helper_sve_uxtb_s, + 
gen_helper_sve_uxtb_d + }; + do_zpz_ool(s, a, fns[a->esz]); +} + +void trans_SXTH(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_3 * const fns[4] =3D { + NULL, NULL, + gen_helper_sve_sxth_s, + gen_helper_sve_sxth_d + }; + do_zpz_ool(s, a, fns[a->esz]); +} + +void trans_UXTH(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_3 * const fns[4] =3D { + NULL, NULL, + gen_helper_sve_uxth_s, + gen_helper_sve_uxth_d + }; + do_zpz_ool(s, a, fns[a->esz]); +} + +void trans_SXTW(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ool(s, a, a->esz =3D=3D 3 ? gen_helper_sve_sxtw_d : NULL); +} + +void trans_UXTW(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ool(s, a, a->esz =3D=3D 3 ? gen_helper_sve_uxtw_d : NULL); +} + +#undef DO_ZPZ + typedef void gen_helper_gvec_reduc(TCGv_i64, TCGv_ptr, TCGv_ptr, TCGv_i32); static void do_vpz_ool(DisasContext *s, arg_rpr_esz *a, gen_helper_gvec_reduc *fn) diff --git a/target/arm/sve.def b/target/arm/sve.def index 66be950ca5..955a0275a1 100644 --- a/target/arm/sve.def +++ b/target/arm/sve.def @@ -147,6 +147,27 @@ ASR_zpzw 00000100 .. 011 000 100 ... ..... ..... @rd= n_pg_rm_esz LSR_zpzw 00000100 .. 011 001 100 ... ..... ..... @rdn_pg_rm_esz LSL_zpzw 00000100 .. 011 011 100 ... ..... ..... @rdn_pg_rm_esz =20 +### SVE Integer Arithmetic - Unary Predicated Group + +# SVE unary bit operations (predicated) +CLS 00000100 .. 011 000 101 ... ..... ..... @rd_pg_rn_esz +CLZ 00000100 .. 011 001 101 ... ..... ..... @rd_pg_rn_esz +CNT_zpz 00000100 .. 011 010 101 ... ..... ..... @rd_pg_rn_esz +CNOT 00000100 .. 011 011 101 ... ..... ..... @rd_pg_rn_esz +NOT_zpz 00000100 .. 011 110 101 ... ..... ..... @rd_pg_rn_esz +FABS 00000100 .. 011 100 101 ... ..... ..... @rd_pg_rn_esz # Note size = !=3D 0 +FNEG 00000100 .. 011 101 101 ... ..... ..... @rd_pg_rn_esz # Note size = !=3D 0 + +# SVE integer unary operations (predicated) +ABS 00000100 .. 010 110 101 ... ..... ..... 
@rd_pg_rn_esz +NEG 00000100 .. 010 111 101 ... ..... ..... @rd_pg_rn_esz +SXTB 00000100 .. 010 000 101 ... ..... ..... @rd_pg_rn_esz # Note size = !=3D 0 +UXTB 00000100 .. 010 001 101 ... ..... ..... @rd_pg_rn_esz # Note size = !=3D 0 +SXTH 00000100 .. 010 010 101 ... ..... ..... @rd_pg_rn_esz # Note size = > 1 +UXTH 00000100 .. 010 011 101 ... ..... ..... @rd_pg_rn_esz # Note size = > 1 +SXTW 00000100 .. 010 100 101 ... ..... ..... @rd_pg_rn_esz # Note size = =3D=3D 3 +UXTW 00000100 .. 010 101 101 ... ..... ..... @rd_pg_rn_esz # Note size = =3D=3D 3 + ### SVE Logical - Unpredicated Group =20 # SVE bitwise logical operations (unpredicated) --=20 2.14.3 From nobody Tue Oct 28 12:15:16 2025 Delivered-To: importer@patchew.org Received-SPF: temperror (zoho.com: Error in retrieving data from DNS) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=temperror (zoho.com: Error in retrieving data from DNS) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1513623037956553.9247473402352; Mon, 18 Dec 2017 10:50:37 -0800 (PST) Received: from localhost ([::1]:53406 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eR0Ua-0007jo-Q3 for importer@patchew.org; Mon, 18 Dec 2017 13:50:24 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:55875) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eQzUb-0008Kt-3p for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:23 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1eQzUZ-0002Ao-Mc for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:21 -0500 Received: from mail-pg0-x243.google.com ([2607:f8b0:400e:c05::243]:37484) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 
1eQzUZ-0002AB-F3 for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:19 -0500 Received: by mail-pg0-x243.google.com with SMTP id y6so9433270pgp.4 for ; Mon, 18 Dec 2017 09:46:19 -0800 (PST) Received: from cloudburst.twiddle.net (174-21-7-63.tukw.qwest.net. [174.21.7.63]) by smtp.gmail.com with ESMTPSA id t84sm26209657pfe.160.2017.12.18.09.46.16 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Mon, 18 Dec 2017 09:46:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=3KZMEM2yXkoIROfRl/TBnbTATpSIAJgB7Mut5/GGdwo=; b=WGGZXLZl+4zA+Sdcu7UlHwIGS/4ae2brUC5BFgwnbQlz2D0CINtF94qL6CrHLWZcmQ fk7GWgrlELjQrsRcgBXSMWoYeil8N8657z9vVNjblG0rdraB3kt+iMRscny4kyaTDSXB 7OFPhzVZZnYpnNZak4kGYR+GywcvaRQjklDFs= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=3KZMEM2yXkoIROfRl/TBnbTATpSIAJgB7Mut5/GGdwo=; b=szdJCWI1BKJ8gaw2Da67jAK07yuTDb9Y1HqTznMvyJrg4Ye+FRAC5nAzDDv44MyDOI vusVEZ/UPI1KoW+f7lJXDhvAuDe4iC43r5OnxGFBvOu6oWp4IrLREkhcAeJIZFsRe/mu 57EPhX5P3ZoVLRrRu8xDG41q+iMQHrKhfo32D+USyDsCV6r8GQj9N+jhwdFkmiHWYwyQ mkCV/TtE/StozLBFvwLXoSd08LsKukKoz6cKM2pbt5El7AJedd0QGmUVQg2ClQ699m0z d/m8hnqvJiY7kHdolavOlhtCcVy9UZXCG/AjHWTbOJafbPiCAVTHa6szMOIvjU5BVgcc sWFw== X-Gm-Message-State: AKGB3mIO8hoCCibckExJPZGqWm5NgW3H06FZFRSZZA6VGZ938HWf3D77 5a2M7+rdLNm32Wrh+HHcwK3HQS3RvbI= X-Google-Smtp-Source: ACJfBovuFg3RPIVT7TwewcNC3cL597rGeUbHumPU6wgamO6ILRTp20tKmpkmrT8QX1WOqNFQbLok8g== X-Received: by 10.98.64.153 with SMTP id f25mr453439pfd.213.1513619178078; Mon, 18 Dec 2017 09:46:18 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Mon, 18 Dec 2017 09:45:44 -0800 Message-Id: <20171218174552.18871-16-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20171218174552.18871-1-richard.henderson@linaro.org> References: 
<20171218174552.18871-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::243 Subject: [Qemu-devel] [PATCH 15/23] target/arm: Implement SVE Integer Multiply-Add Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_6 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 18 ++++++++++++++ target/arm/sve_helper.c | 58 ++++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/translate-sve.c | 27 +++++++++++++++++++++ target/arm/sve.def | 15 ++++++++++++ 4 files changed, 118 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index e9382de300..abed625123 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -339,6 +339,24 @@ DEF_HELPER_FLAGS_4(sve_neg_h, TCG_CALL_NO_RWG, void, p= tr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_neg_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_neg_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) =20 +DEF_HELPER_FLAGS_6(sve_mla_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_mla_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_mla_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_mla_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_mls_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_mls_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) 
+DEF_HELPER_FLAGS_6(sve_mls_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_mls_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_bic_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_eor_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 481b3bdefe..8235784a82 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -855,6 +855,64 @@ DO_ZPZI_D(sve_asrd_d, int64_t, DO_ASRD) #undef DO_SHL #undef DO_ASRD =20 +/* Fully general four-operand expander, controlled by a predicate. + */ +#define DO_ZPZZZ(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *va, void *vn, void *vm, \ + void *vg, uint32_t desc) \ +{ \ + intptr_t iv =3D 0, ib =3D 0, opr_sz =3D simd_oprsz(desc); \ + for (iv =3D ib =3D 0; iv < opr_sz; iv +=3D 16, ib +=3D 2) { \ + uint16_t pg =3D *(uint16_t *)(vg + H2(ib)); \ + intptr_t i =3D 0; \ + do { \ + if (pg & 1) { \ + TYPE nn =3D *(TYPE *)(vn + iv + H(i)); \ + TYPE mm =3D *(TYPE *)(vm + iv + H(i)); \ + TYPE aa =3D *(TYPE *)(va + iv + H(i)); \ + *(TYPE *)(vd + iv + H(i)) =3D OP(aa, nn, mm); \ + } \ + i +=3D sizeof(TYPE), pg >>=3D sizeof(TYPE); \ + } while (pg); \ + } \ +} + +/* Similarly, specialized for 64-bit operands. 
*/ +#define DO_ZPZZZ_D(NAME, TYPE, OP) \ +void HELPER(NAME)(void *vd, void *va, void *vn, void *vm, \ + void *vg, uint32_t desc) \ +{ \ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; \ + TYPE *d =3D vd, *a =3D va, *n =3D vn, *m =3D vm; \ + uint8_t *pg =3D vg; \ + for (i =3D 0; i < opr_sz; i +=3D 1) { \ + if (pg[H1(i)] & 1) { \ + TYPE aa =3D a[i], nn =3D n[i], mm =3D m[i]; \ + d[i] =3D OP(aa, nn, mm); \ + } \ + } \ +} + +#define DO_MLA(A, N, M) (A + N * M) +#define DO_MLS(A, N, M) (A - N * M) + +DO_ZPZZZ(sve_mla_b, uint8_t, H1, DO_MLA) +DO_ZPZZZ(sve_mls_b, uint8_t, H1, DO_MLS) + +DO_ZPZZZ(sve_mla_h, uint16_t, H1_2, DO_MLA) +DO_ZPZZZ(sve_mls_h, uint16_t, H1_2, DO_MLS) + +DO_ZPZZZ(sve_mla_s, uint32_t, H1_4, DO_MLA) +DO_ZPZZZ(sve_mls_s, uint32_t, H1_4, DO_MLS) + +DO_ZPZZZ_D(sve_mla_d, uint64_t, DO_MLA) +DO_ZPZZZ_D(sve_mls_d, uint64_t, DO_MLS) + +#undef DO_MLA +#undef DO_MLS +#undef DO_ZPZZZ +#undef DO_ZPZZZ_D + void HELPER(sve_ldr)(CPUARMState *env, void *d, target_ulong addr, uint32_= t len) { intptr_t i, len_align =3D QEMU_ALIGN_DOWN(len, 8); diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 7dbc43fb6e..83793ab169 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -520,6 +520,33 @@ void trans_ASRD(DisasContext *s, arg_rpri_esz *a, uint= 32_t insn) } } =20 +static void do_zpzzz_ool(DisasContext *s, arg_rprrr_esz *a, + gen_helper_gvec_5 *fn) +{ + unsigned vsz =3D size_for_gvec(vec_full_reg_size(s)); + tcg_gen_gvec_5_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->ra), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + pred_full_reg_offset(s, a->pg), + vsz, vsz, 0, fn); +} + +#define DO_ZPZZZ(NAME, name) \ +void trans_##NAME(DisasContext *s, arg_rprrr_esz *a, uint32_t insn) \ +{ \ + static gen_helper_gvec_5 * const fns[4] =3D { \ + gen_helper_sve_##name##_b, gen_helper_sve_##name##_h, \ + gen_helper_sve_##name##_s, gen_helper_sve_##name##_d, \ + }; \ + do_zpzzz_ool(s, a, fns[a->esz]); \ +} 
+ +DO_ZPZZZ(MLA, mla) +DO_ZPZZZ(MLS, mls) + +#undef DO_ZPZZZ + static uint64_t pred_esz_mask[4] =3D { 0xffffffffffffffffull, 0x5555555555555555ull, 0x1111111111111111ull, 0x0101010101010101ull diff --git a/target/arm/sve.def b/target/arm/sve.def index 955a0275a1..3ae871394c 100644 --- a/target/arm/sve.def +++ b/target/arm/sve.def @@ -45,6 +45,7 @@ &rrr_esz rd rn rm esz &rpr_esz rd pg rn esz &rprr_esz rd pg rn rm esz +&rprrr_esz rd pg rn rm ra esz &rpri_esz rd pg rn imm esz &pred_set rd pat esz i s =20 @@ -62,6 +63,10 @@ @rdn_pg_rm_esz ........ esz:2 ... ... ... pg:3 rm:5 rd:5 &rprr_esz rn=3D%= reg_movprfx @rdm_pg_rn_esz ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rprr_esz rm=3D%= reg_movprfx =20 +# Three register operand, with governing predicate, vector element size +@rda_pg_rn_rm_esz ........ esz:2 . rm:5 ... pg:3 rn:5 rd:5 &rprrr_esz ra= =3D%reg_movprfx +@rdn_pg_ra_rm_esz ........ esz:2 . rm:5 ... pg:3 ra:5 rd:5 &rprrr_esz rn= =3D%reg_movprfx + # One register operand, with governing predicate, vector element size @rd_pg_rn_esz ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz =20 @@ -168,6 +173,16 @@ UXTH 00000100 .. 010 011 101 ... ..... ..... @rd_pg= _rn_esz # Note size > 1 SXTW 00000100 .. 010 100 101 ... ..... ..... @rd_pg_rn_esz # Note size = =3D=3D 3 UXTW 00000100 .. 010 101 101 ... ..... ..... @rd_pg_rn_esz # Note size = =3D=3D 3 =20 +### SVE Integer Multiply-Add Group + +# SVE integer multiply-add writing addend (predicated) +MLA 00000100 .. 0 ..... 010 ... ..... ..... @rda_pg_rn_rm_esz +MLS 00000100 .. 0 ..... 011 ... ..... ..... @rda_pg_rn_rm_esz + +# SVE integer multiply-add writing multiplicand (predicated) +MLA 00000100 .. 0 ..... 110 ... ..... ..... @rdn_pg_ra_rm_esz # MAD +MLS 00000100 .. 0 ..... 111 ... ..... ..... 
@rdn_pg_ra_rm_esz # MSB + ### SVE Logical - Unpredicated Group =20 # SVE bitwise logical operations (unpredicated) --=20 2.14.3 From nobody Tue Oct 28 12:15:16 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 1513622289035783.9670478904283; Mon, 18 Dec 2017 10:38:09 -0800 (PST) Received: from localhost ([::1]:51924 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eR0Id-0003XJ-1l for importer@patchew.org; Mon, 18 Dec 2017 13:38:03 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:55921) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eQzUd-0008Nn-99 for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:24 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1eQzUb-0002By-2f for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:23 -0500 Received: from mail-pg0-x243.google.com ([2607:f8b0:400e:c05::243]:40519) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1eQzUa-0002BG-Sw for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:20 -0500 Received: by mail-pg0-x243.google.com with SMTP id k15so9432982pgr.7 for ; Mon, 18 Dec 2017 09:46:20 -0800 (PST) Received: from cloudburst.twiddle.net (174-21-7-63.tukw.qwest.net. 
[174.21.7.63]) by smtp.gmail.com with ESMTPSA id t84sm26209657pfe.160.2017.12.18.09.46.18 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Mon, 18 Dec 2017 09:46:18 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=AY0VrenhwgfxRVh8W8VPt2M/vsScMF29p9zDK0Mm/X4=; b=Stx9aPgnYJTueC+9k3sO3WSTQ/X9TarUuGKcrOBi7M/I3tl/CO0AuBbIB6Zkqdz7XH 7NDwoGB8KBMekVxrPVeSysWfv2Lxgz1otjiTCB/shlpKeJ5CcaGJY+EiN2pvzr7yWyL4 /m2T5AzthujkxCHSnlwlpiaAmcUAlJteZelDA= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=AY0VrenhwgfxRVh8W8VPt2M/vsScMF29p9zDK0Mm/X4=; b=djo8jhhxE3suyLETQQVRXC3v2X9gWsQxFvSllAaZrJMiW/ts9CuGjScGu6Qj4O5mXP 6hCa8BuB35jObauxVJzQf5uI10rrjzdBGXfSTtkTSqqNM8qv7gg0srM81iRnU0JOcNBw uVJnRdWtydknyLBt3n3lQv/azbcbAcwht0Zpo97Nth2/zbxREQ9NUjUGsWon370pmizz 0M+8mxtevJjDaUIXqxfS20tT1+G0l+gOStYx6K0/7enSQ5H6EKlgH+umM/9+mQ3xfpo3 Qls9LOXBrsZvxrPwB120Duowbf21eKVg1hA0MZoi0U2GtSBpPQHytNTTOpSiMr6JbXTj 6EYQ== X-Gm-Message-State: AKGB3mIcwkrl9XePXSZ7sJnZIkVyV3kiwNwLthD07AKL/fasodQLCtUl XkVR7XHImN0yY7cnmUApL61P9x2hGVY= X-Google-Smtp-Source: ACJfBosmRluOwXIEIC/JthpDjYMnukld+Hg0tIqmQTwN/xNQd68TldRwlGtRcfvRI6Rhvnu5M+Zt7A== X-Received: by 10.99.45.67 with SMTP id t64mr439169pgt.146.1513619179575; Mon, 18 Dec 2017 09:46:19 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Mon, 18 Dec 2017 09:45:45 -0800 Message-Id: <20171218174552.18871-17-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20171218174552.18871-1-richard.henderson@linaro.org> References: <20171218174552.18871-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c05::243 Subject: [Qemu-devel] [PATCH 16/23] target/arm: Implement SVE Integer Arithmetic - Unpredicated Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/translate-sve.c | 30 ++++++++++++++++++++++++++++++ target/arm/sve.def | 13 +++++++++++++ 2 files changed, 43 insertions(+) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 83793ab169..7edec8ba96 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -127,6 +127,36 @@ static void do_zzz_genfn(DisasContext *s, arg_rrr_esz = *a, GVecGen3Fn *fn) do_genfn3(s, fn, a->esz, a->rd, a->rn, a->rm); } =20 +void trans_ADD_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_zzz_genfn(s, a, tcg_gen_gvec_add); +} + +void trans_SUB_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_zzz_genfn(s, a, tcg_gen_gvec_sub); +} + +void trans_SQADD_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_zzz_genfn(s, a, tcg_gen_gvec_ssadd); +} + +void trans_SQSUB_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_zzz_genfn(s, a, tcg_gen_gvec_sssub); +} + +void trans_UQADD_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_zzz_genfn(s, a, tcg_gen_gvec_usadd); +} + +void trans_UQSUB_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_zzz_genfn(s, a, tcg_gen_gvec_ussub); +} + void trans_AND_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn) { do_zzz_genfn(s, a, tcg_gen_gvec_and); diff --git a/target/arm/sve.def 
b/target/arm/sve.def index 3ae871394c..a33fec4f33 100644 --- a/target/arm/sve.def +++ b/target/arm/sve.def @@ -53,6 +53,9 @@ # Named instruction formats. These are generally used to # reduce the amount of duplication between instruction patterns. =20 +# Three operand +@rd_rn_rm_esz ........ esz:2 . rm:5 ... ... rn:5 rd:5 &rrr_esz + # Three operand with unused vector element size @rd_rn_rm ........ ... rm:5 ... ... rn:5 rd:5 &rrr_esz esz=3D0 =20 @@ -183,6 +186,16 @@ MLS 00000100 .. 0 ..... 011 ... ..... ..... @rda_pg= _rn_rm_esz MLA 00000100 .. 0 ..... 110 ... ..... ..... @rdn_pg_ra_rm_esz # MAD MLS 00000100 .. 0 ..... 111 ... ..... ..... @rdn_pg_ra_rm_esz # MSB =20 +### SVE Integer Arithmetic - Unpredicated Group + +# SVE integer add/subtract vectors (unpredicated) +ADD_zzz 00000100 .. 1 ..... 000 000 ..... ..... @rd_rn_rm_esz +SUB_zzz 00000100 .. 1 ..... 000 001 ..... ..... @rd_rn_rm_esz +SQADD_zzz 00000100 .. 1 ..... 000 100 ..... ..... @rd_rn_rm_esz +UQADD_zzz 00000100 .. 1 ..... 000 101 ..... ..... @rd_rn_rm_esz +SQSUB_zzz 00000100 .. 1 ..... 000 110 ..... ..... @rd_rn_rm_esz +UQSUB_zzz 00000100 .. 1 ..... 000 111 ..... ..... 
@rd_rn_rm_esz + ### SVE Logical - Unpredicated Group =20 # SVE bitwise logical operations (unpredicated) --=20 2.14.3 From nobody Tue Oct 28 12:15:16 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 1513621537159433.62731900096674; Mon, 18 Dec 2017 10:25:37 -0800 (PST) Received: from localhost ([::1]:44125 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eR06U-0006zf-Qr for importer@patchew.org; Mon, 18 Dec 2017 13:25:30 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:55936) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eQzUd-0008OJ-P2 for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:25 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1eQzUc-0002DS-IO for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:23 -0500 Received: from mail-pg0-x241.google.com ([2607:f8b0:400e:c05::241]:34081) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1eQzUc-0002Cn-Af for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:22 -0500 Received: by mail-pg0-x241.google.com with SMTP id j4so9441910pgp.1 for ; Mon, 18 Dec 2017 09:46:22 -0800 (PST) Received: from cloudburst.twiddle.net (174-21-7-63.tukw.qwest.net. 
[174.21.7.63]) by smtp.gmail.com with ESMTPSA id t84sm26209657pfe.160.2017.12.18.09.46.19 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Mon, 18 Dec 2017 09:46:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=79qaL/sSWdRs8yI9GhQnKs5bk8BLHvgB2DUy18WjB6U=; b=kSvDTtrFI+o8qyQfGxKP/S1D7BPPXJBSaLLHY0BOZO1klfgH41L804x2HIyRvKFoJF CONuczPevtr9DIHUZRv4KuuvqwBL7tVGJvUmriQwyIRJn81ZPkXNBi7jeJ1pW/u9klIX 3KopfZDKy1c/FwvUHKw3fpG3KDKP55MKtZskg= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=79qaL/sSWdRs8yI9GhQnKs5bk8BLHvgB2DUy18WjB6U=; b=CcOymnnH6SAHZryHc/j1mqOgRZKmBaaSR9u4SCX7ofD0ej1OYEHccoLnwrEhDqJcVV AZMFUZNTGkXyew+HDcR4OGRQ+o5HQM/QvAW0q0Iio3sN4Tccr7ajsr48YV/s/9B3uHBs GvnVVt9e1TEnN54CIhr4GrZQFr8EdHvwEL7Z+1oVd1I6o/kphirJBrN28mk6usWp7lap ENnvGjXoYHHjIrC2NQvm+yq8aPpVP+7H7nJcq0v9f4mz6QNKnCC7+NFRoq/xq8G6u4H0 24qcSh52XxhzWHY3B21fH8ccnAyAxSJfEzoEoU06PZnqzrBEyVfCYL4IXb2OZBjPs7Fy DYoQ== X-Gm-Message-State: AKGB3mI+RVZhn38Pd+FieQkJeNtiBGw/7E3PVMm0MrzWw+NT6ag6aCPi UgmpkYrHc+kj33XzhnqMNJNu5pQdno0= X-Google-Smtp-Source: ACJfBovxQz5DCgDG0moCndV91vqH2PUpnH1sQ4cRjes3N1M/fxdvMYG8cO9PYrCA35A+1twvJ/PDpQ== X-Received: by 10.101.66.136 with SMTP id j8mr429452pgp.78.1513619180991; Mon, 18 Dec 2017 09:46:20 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Mon, 18 Dec 2017 09:45:46 -0800 Message-Id: <20171218174552.18871-18-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20171218174552.18871-1-richard.henderson@linaro.org> References: <20171218174552.18871-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c05::241 Subject: [Qemu-devel] [PATCH 17/23] target/arm: Implement SVE Index Generation Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 5 ++++ target/arm/sve_helper.c | 40 ++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 62 ++++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/sve.def | 14 +++++++++++ 4 files changed, 121 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index abed625123..c8eae5eb62 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -357,6 +357,11 @@ DEF_HELPER_FLAGS_6(sve_mls_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_6(sve_mls_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) =20 +DEF_HELPER_FLAGS_4(sve_index_b, TCG_CALL_NO_RWG, void, ptr, i32, i32, i32) +DEF_HELPER_FLAGS_4(sve_index_h, TCG_CALL_NO_RWG, void, ptr, i32, i32, i32) +DEF_HELPER_FLAGS_4(sve_index_s, TCG_CALL_NO_RWG, void, ptr, i32, i32, i32) +DEF_HELPER_FLAGS_4(sve_index_d, TCG_CALL_NO_RWG, void, ptr, i64, i64, i32) + DEF_HELPER_FLAGS_5(sve_and_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_bic_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_eor_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 8235784a82..d8684b9457 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -913,6 +913,46 @@ DO_ZPZZZ_D(sve_mls_d, uint64_t, DO_MLS) 
#undef DO_ZPZZZ #undef DO_ZPZZZ_D =20 +void HELPER(sve_index_b)(void *vd, uint32_t start, + uint32_t incr, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc); + uint8_t *d =3D vd; + for (i =3D 0; i < opr_sz; i +=3D 1) { + d[H1(i)] =3D start + i * incr; + } +} + +void HELPER(sve_index_h)(void *vd, uint32_t start, + uint32_t incr, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 2; + uint16_t *d =3D vd; + for (i =3D 0; i < opr_sz; i +=3D 1) { + d[H2(i)] =3D start + i * incr; + } +} + +void HELPER(sve_index_s)(void *vd, uint32_t start, + uint32_t incr, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 4; + uint32_t *d =3D vd; + for (i =3D 0; i < opr_sz; i +=3D 1) { + d[H4(i)] =3D start + i * incr; + } +} + +void HELPER(sve_index_d)(void *vd, uint64_t start, + uint64_t incr, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; + uint64_t *d =3D vd; + for (i =3D 0; i < opr_sz; i +=3D 1) { + d[i] =3D start + i * incr; + } +} + void HELPER(sve_ldr)(CPUARMState *env, void *d, target_ulong addr, uint32_= t len) { intptr_t i, len_align =3D QEMU_ALIGN_DOWN(len, 8); diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 7edec8ba96..7e1bf7d623 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -577,6 +577,68 @@ DO_ZPZZZ(MLS, mls) =20 #undef DO_ZPZZZ =20 +static void do_index(DisasContext *s, int esz, int rd, + TCGv_i64 start, TCGv_i64 incr) +{ + unsigned vsz =3D size_for_gvec(vec_full_reg_size(s)); + TCGv_i32 desc =3D tcg_const_i32(simd_desc(vsz, vsz, 0)); + TCGv_ptr t_zd =3D tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(t_zd, cpu_env, vec_full_reg_offset(s, rd)); + if (esz =3D=3D 3) { + gen_helper_sve_index_d(t_zd, start, incr, desc); + } else { + static void (*fns[3])(TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32) =3D { + gen_helper_sve_index_b, + gen_helper_sve_index_h, + gen_helper_sve_index_s, + }; + TCGv_i32 s32 =3D tcg_temp_new_i32(); + TCGv_i32 i32 =3D tcg_temp_new_i32(); + + 
tcg_gen_extrl_i64_i32(s32, start); + tcg_gen_extrl_i64_i32(i32, incr); + fns[esz](t_zd, s32, i32, desc); + + tcg_temp_free_i32(s32); + tcg_temp_free_i32(i32); + } + tcg_temp_free_ptr(t_zd); + tcg_temp_free_i32(desc); +} + +void trans_INDEX_ii(DisasContext *s, arg_INDEX_ii *a, uint32_t insn) +{ + TCGv_i64 start =3D tcg_const_i64(a->imm1); + TCGv_i64 incr =3D tcg_const_i64(a->imm2); + do_index(s, a->esz, a->rd, start, incr); + tcg_temp_free_i64(start); + tcg_temp_free_i64(incr); +} + +void trans_INDEX_ir(DisasContext *s, arg_INDEX_ir *a, uint32_t insn) +{ + TCGv_i64 start =3D tcg_const_i64(a->imm); + TCGv_i64 incr =3D cpu_reg(s, a->rm); + do_index(s, a->esz, a->rd, start, incr); + tcg_temp_free_i64(start); +} + +void trans_INDEX_ri(DisasContext *s, arg_INDEX_ri *a, uint32_t insn) +{ + TCGv_i64 start =3D cpu_reg(s, a->rn); + TCGv_i64 incr =3D tcg_const_i64(a->imm); + do_index(s, a->esz, a->rd, start, incr); + tcg_temp_free_i64(incr); +} + +void trans_INDEX_rr(DisasContext *s, arg_INDEX_rr *a, uint32_t insn) +{ + TCGv_i64 start =3D cpu_reg(s, a->rn); + TCGv_i64 incr =3D cpu_reg(s, a->rm); + do_index(s, a->esz, a->rd, start, incr); +} + static uint64_t pred_esz_mask[4] =3D { 0xffffffffffffffffull, 0x5555555555555555ull, 0x1111111111111111ull, 0x0101010101010101ull diff --git a/target/arm/sve.def b/target/arm/sve.def index a33fec4f33..0cac3a974f 100644 --- a/target/arm/sve.def +++ b/target/arm/sve.def @@ -204,6 +204,20 @@ ORR_zzz 00000100 01 1 ..... 001 100 ..... ..... @rd= _rn_rm EOR_zzz 00000100 10 1 ..... 001 100 ..... ..... @rd_rn_rm BIC_zzz 00000100 11 1 ..... 001 100 ..... ..... 
@rd_rn_rm =20 +### SVE Index Generation Group + +# SVE index generation (immediate start, immediate increment) +INDEX_ii 00000100 esz:2 1 imm2:s5 010000 imm1:s5 rd:5 + +# SVE index generation (immediate start, register increment) +INDEX_ir 00000100 esz:2 1 rm:5 010010 imm:s5 rd:5 + +# SVE index generation (register start, immediate increment) +INDEX_ri 00000100 esz:2 1 imm:s5 010001 rn:5 rd:5 + +# SVE index generation (register start, register increment) +INDEX_rr 00000100 .. 1 ..... 010011 ..... ..... @rd_rn_rm_esz + ### SVE Predicate Generation Group =20 # SVE initialize predicate (PTRUE, PTRUES) --=20 2.14.3 From nobody Tue Oct 28 12:15:16 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1513622635797935.1126499809811; Mon, 18 Dec 2017 10:43:55 -0800 (PST) Received: from localhost ([::1]:52338 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eR0OG-0000e4-UQ for importer@patchew.org; Mon, 18 Dec 2017 13:43:52 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:55975) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eQzUe-0008Pi-Rw for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:25 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1eQzUd-0002Eh-SN for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:24 -0500 Received: from mail-pf0-x244.google.com ([2607:f8b0:400e:c00::244]:39419) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) 
id 1eQzUd-0002Do-NC for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:23 -0500 Received: by mail-pf0-x244.google.com with SMTP id l24so9973261pfj.6 for ; Mon, 18 Dec 2017 09:46:23 -0800 (PST) Received: from cloudburst.twiddle.net (174-21-7-63.tukw.qwest.net. [174.21.7.63]) by smtp.gmail.com with ESMTPSA id t84sm26209657pfe.160.2017.12.18.09.46.21 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Mon, 18 Dec 2017 09:46:21 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=6YYbCElCJ2gbI5x/aO1NF4xxoziErKY0Fw/koAe5yac=; b=HBLgCoNFfj09n9q/9Aw+C/mreeqD7sKIaQio3VTv/nwS97jtpHMYp9UjRn71eR2CCP t3ygxhKyaSKVJVThGb5r72E6ZbD8+4W7dacCU06urADM0LTL2+ahwdHyVI+7QvkObfa1 g4KAwP6EFRYu6GKWS5XcA/wZ+7WJqXojkWESk= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=6YYbCElCJ2gbI5x/aO1NF4xxoziErKY0Fw/koAe5yac=; b=L7EArzVhTYmZTKjJ5vMla1sBbrT8X2+KqFvMQsQrZn7GpvIMq2YLtSKgWSJB7EwinF ORJhdaaweSR8nZlYFSlx7MNcxq5rU647c+ZR+nWS2i/UAZEpb88uqPMBu+PWYqxUtenc rypIzQqZZouuRAeb8WPwpvOaJIvkXpfHrWBlG5ZnCXC0TXjEVNXzzLDy1Fdj86+DBMSE E8r1SjhKXqT+niZbkQr3Y1QKpwYY9uh+JYL/gvQkkJZAsrFqYeyzFi5yZ+cf2Sax72XC QdKiHKovwtP3S8RcQF1E7I8mswz15Ps2H8EJYZ0wGLiurPHiroMkEh7LcesUM0Va62Y9 0t7w== X-Gm-Message-State: AKGB3mIB88FskkxmA/bmNpE1G0+TJSpFq61NLcg/uJb1/9j9X/D7j1xT LCGhzYXDO3dxkLvhDyAkdakhJ4dWZXg= X-Google-Smtp-Source: ACJfBovxcn0Z4CB7+ledXOF/8Di7vus6ZrPbintzeHWO4L6FhCPyH2baV9QITp/dvSybaDkFnLmIag== X-Received: by 10.99.45.4 with SMTP id t4mr436031pgt.254.1513619182426; Mon, 18 Dec 2017 09:46:22 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Mon, 18 Dec 2017 09:45:47 -0800 Message-Id: <20171218174552.18871-19-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20171218174552.18871-1-richard.henderson@linaro.org> References: 
<20171218174552.18871-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::244 Subject: [Qemu-devel] [PATCH 18/23] target/arm: Implement SVE Stack Allocation Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/translate-sve.c | 18 ++++++++++++++++++ target/arm/sve.def | 12 ++++++++++++ 2 files changed, 30 insertions(+) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 7e1bf7d623..026af7a162 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -639,6 +639,24 @@ void trans_INDEX_rr(DisasContext *s, arg_INDEX_rr *a, = uint32_t insn) do_index(s, a->esz, a->rd, start, incr); } =20 +void trans_ADDVL(DisasContext *s, arg_ADDVL *a, uint32_t insn) +{ + TCGv_i64 reg =3D cpu_reg_sp(s, a->rd); + tcg_gen_addi_i64(reg, reg, a->imm * vec_full_reg_size(s)); +} + +void trans_ADDPL(DisasContext *s, arg_ADDPL *a, uint32_t insn) +{ + TCGv_i64 reg =3D cpu_reg_sp(s, a->rd); + tcg_gen_addi_i64(reg, reg, a->imm * pred_full_reg_size(s)); +} + +void trans_RDVL(DisasContext *s, arg_RDVL *a, uint32_t insn) +{ + TCGv_i64 reg =3D cpu_reg(s, a->rd); + tcg_gen_movi_i64(reg, a->imm * vec_full_reg_size(s)); +} + static uint64_t pred_esz_mask[4] =3D { 0xffffffffffffffffull, 0x5555555555555555ull, 0x1111111111111111ull, 0x0101010101010101ull diff --git a/target/arm/sve.def b/target/arm/sve.def index 0cac3a974f..7428ebc5cd 100644 --- a/target/arm/sve.def +++ 
b/target/arm/sve.def @@ -73,6 +73,9 @@ # One register operand, with governing predicate, vector element size @rd_pg_rn_esz ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz =20 +# Two register operands with a 6-bit signed immediate. +@rd_rn_i6 ........ ... rn:5 ..... imm:s6 rd:5 &rri + # Two register operand, one immediate operand, with predicate, element siz= e encoded as TSZHL. # User must fill in imm. @rdn_pg_tszimm ........ .. ... ... ... pg:3 ..... rd:5 &rpri_esz rn=3D%r= eg_movprfx esz=3D%tszimm_esz @@ -218,6 +221,15 @@ INDEX_ri 00000100 esz:2 1 imm:s5 010001 rn:5 rd:5 # SVE index generation (register start, register increment) INDEX_rr 00000100 .. 1 ..... 010011 ..... ..... @rd_rn_rm_esz =20 +### SVE Stack Allocation Group + +# SVE stack frame adjustment +ADDVL 00000100 001 ..... 01010 ...... ..... @rd_rn_i6 +ADDPL 00000100 011 ..... 01010 ...... ..... @rd_rn_i6 + +# SVE stack frame size +RDVL 00000100 101 11111 01010 imm:s6 rd:5 + ### SVE Predicate Generation Group =20 # SVE initialize predicate (PTRUE, PTRUES) --=20 2.14.3 From nobody Tue Oct 28 12:15:16 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 1513623423617583.1647371324365; Mon, 18 Dec 2017 10:57:03 -0800 (PST) Received: from localhost ([::1]:54711 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eR0an-0006Wv-8J for importer@patchew.org; Mon, 18 Dec 2017 13:56:49 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:56023) by lists.gnu.org with esmtp (Exim 4.71) 
(envelope-from ) id 1eQzUh-0008SR-1e for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:28 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1eQzUf-0002GB-IW for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:27 -0500 Received: from mail-pg0-x244.google.com ([2607:f8b0:400e:c05::244]:38711) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1eQzUf-0002Fc-An for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:25 -0500 Received: by mail-pg0-x244.google.com with SMTP id f12so9433645pgo.5 for ; Mon, 18 Dec 2017 09:46:25 -0800 (PST) Received: from cloudburst.twiddle.net (174-21-7-63.tukw.qwest.net. [174.21.7.63]) by smtp.gmail.com with ESMTPSA id t84sm26209657pfe.160.2017.12.18.09.46.22 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Mon, 18 Dec 2017 09:46:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=eh9/WnyrV/yK2DjsUOEgKf0P6VP8p4vqDCBDRPGgVy0=; b=aJXwTcQyLFo35AYd0z7pfuX5ltwmussNGFBhZEBo+i/NNTRcDdtVC5ORRQnMM61UJY i+BWmUH2FHAKbP+Dzo0qSenuF4/mi3sTwe4zOdMDIzgdTU6N8P0IZ44wcEYEy35gT5q5 /5BJiwmC33lnyyDnUHq19324GFNQdEvno4lTI= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=eh9/WnyrV/yK2DjsUOEgKf0P6VP8p4vqDCBDRPGgVy0=; b=qNyY+qTHSYAp93D7xTtOnaUCk3OwlMRHaNhjG5AxReYFg9pPROP3FdEQFNvA6EgoB/ pK4HOc6EcV/34V/EVWJ2B0DZm+BlYhY0bPEXC+yyQ1qrRxPH8z84hvHyp8r6HZacKHNS EuYlTwA5gipJjimuJHpa3iREmo2nV1cr8z5biDkZO0k51aumG04CBILqGV2KG97+pqNc kId78sItrozbG4teQLloD2Oj83M8873zfYJOw0FRYGP50NnBcRWJozirxnZV+7gR4bwJ vBv6xz7BYRelyd10++dnEvNRZ/1yenml6XwZ2015ws4Wgpn3xGHeV8OKVc8gmbxBHF0k YWkQ== X-Gm-Message-State: AKGB3mLIzeCmP1iWuZAx6T00xVHnMkw0k+AdyohNoHpIKaUh7mSYoyYF J1LlCs+ioIQJ6oTWrNCVF/PmBmIP36w= X-Google-Smtp-Source: 
ACJfBosc0MAh0n7R/K14FXHuHbQJlR4JTXYE8BjQ/X7zvxDJ6LSd+1wkRlAlRw4yPWPJcmA5/10c1g== X-Received: by 10.98.153.221 with SMTP id t90mr476687pfk.210.1513619183875; Mon, 18 Dec 2017 09:46:23 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Mon, 18 Dec 2017 09:45:48 -0800 Message-Id: <20171218174552.18871-20-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20171218174552.18871-1-richard.henderson@linaro.org> References: <20171218174552.18871-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::244 Subject: [Qemu-devel] [PATCH 19/23] target/arm: Implement SVE Bitwise Shift - Unpredicated Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 12 +++++++++++ target/arm/sve_helper.c | 30 ++++++++++++++++++++++++++ target/arm/translate-sve.c | 53 ++++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/sve.def | 21 ++++++++++++++++++ 4 files changed, 116 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index c8eae5eb62..c0e23e7a83 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -362,6 +362,18 @@ DEF_HELPER_FLAGS_4(sve_index_h, TCG_CALL_NO_RWG, void,= ptr, i32, i32, i32) DEF_HELPER_FLAGS_4(sve_index_s, TCG_CALL_NO_RWG, void, ptr, i32, i32, i32) DEF_HELPER_FLAGS_4(sve_index_d, TCG_CALL_NO_RWG, void, ptr, i64, i64, i32) =20 +DEF_HELPER_FLAGS_4(sve_asr_zzw_b, TCG_CALL_NO_RWG, void, ptr, 
ptr, ptr, i3= 2) +DEF_HELPER_FLAGS_4(sve_asr_zzw_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i3= 2) +DEF_HELPER_FLAGS_4(sve_asr_zzw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i3= 2) + +DEF_HELPER_FLAGS_4(sve_lsr_zzw_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i3= 2) +DEF_HELPER_FLAGS_4(sve_lsr_zzw_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i3= 2) +DEF_HELPER_FLAGS_4(sve_lsr_zzw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i3= 2) + +DEF_HELPER_FLAGS_4(sve_lsl_zzw_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i3= 2) +DEF_HELPER_FLAGS_4(sve_lsl_zzw_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i3= 2) +DEF_HELPER_FLAGS_4(sve_lsl_zzw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i3= 2) + DEF_HELPER_FLAGS_5(sve_and_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_bic_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_eor_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index d8684b9457..b6aca18d22 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -669,6 +669,36 @@ DO_ZPZ(sve_neg_h, uint16_t, H1_2, DO_NEG) DO_ZPZ(sve_neg_s, uint32_t, H1_4, DO_NEG) DO_ZPZ_D(sve_neg_d, uint64_t, DO_NEG) =20 +/* Three-operand expander, unpredicated, in which the third operand is "wi= de". 
+ */ +#define DO_ZZW(NAME, TYPE, TYPEW, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz =3D simd_oprsz(desc); \ + for (i =3D 0; i < opr_sz; ) { \ + TYPEW mm =3D *(TYPEW *)(vm + i); \ + do { \ + TYPE nn =3D *(TYPE *)(vn + H(i)); \ + *(TYPE *)(vd + H(i)) =3D OP(nn, mm); \ + i +=3D sizeof(TYPE); \ + } while (i & 7); \ + } \ +} + +DO_ZZW(sve_asr_zzw_b, int8_t, uint64_t, H1, DO_ASR) +DO_ZZW(sve_lsr_zzw_b, uint8_t, uint64_t, H1, DO_LSR) +DO_ZZW(sve_lsl_zzw_b, uint8_t, uint64_t, H1, DO_LSL) + +DO_ZZW(sve_asr_zzw_h, int16_t, uint64_t, H1_2, DO_ASR) +DO_ZZW(sve_lsr_zzw_h, uint16_t, uint64_t, H1_2, DO_LSR) +DO_ZZW(sve_lsl_zzw_h, uint16_t, uint64_t, H1_2, DO_LSL) + +DO_ZZW(sve_asr_zzw_s, int32_t, uint64_t, H1_4, DO_ASR) +DO_ZZW(sve_lsr_zzw_s, uint32_t, uint64_t, H1_4, DO_LSR) +DO_ZZW(sve_lsl_zzw_s, uint32_t, uint64_t, H1_4, DO_LSL) + +#undef DO_ZZW + #undef DO_CLS_B #undef DO_CLS_H #undef DO_CLZ_B diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 026af7a162..d8e7cc7570 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -657,6 +657,59 @@ void trans_RDVL(DisasContext *s, arg_RDVL *a, uint32_t= insn) tcg_gen_movi_i64(reg, a->imm * vec_full_reg_size(s)); } =20 +static void do_shift_imm(DisasContext *s, arg_rri_esz *a, + void (*gvec_fn)(unsigned, uint32_t, uint32_t, + uint32_t, uint32_t, unsigned)) +{ + unsigned vsz =3D size_for_gvec(vec_full_reg_size(s)); + gvec_fn(a->esz, vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), vsz, vsz, a->imm); +} + +void trans_ASR_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn) +{ + do_shift_imm(s, a, tcg_gen_gvec_sari); +} + +void trans_LSR_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn) +{ + do_shift_imm(s, a, tcg_gen_gvec_shri); +} + +void trans_LSL_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn) +{ + do_shift_imm(s, a, tcg_gen_gvec_shli); +} + +static void do_zzw_ool(DisasContext *s, arg_rrr_esz *a, gen_helper_gvec_3 = 
*fn) +{ + unsigned vsz =3D size_for_gvec(vec_full_reg_size(s)); + if (fn =3D=3D NULL) { + unallocated_encoding(s); + return; + } + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vsz, vsz, 0, fn); +} + +#define DO_ZZW(NAME, name) \ +void trans_##NAME##_zzw(DisasContext *s, arg_rrr_esz *a, uint32_t insn) \ +{ \ + static gen_helper_gvec_3 * const fns[4] =3D { = \ + gen_helper_sve_##name##_zzw_b, gen_helper_sve_##name##_zzw_h, \ + gen_helper_sve_##name##_zzw_s, NULL \ + }; \ + do_zzw_ool(s, a, fns[a->esz]); \ +} + +DO_ZZW(ASR, asr) +DO_ZZW(LSR, lsr) +DO_ZZW(LSL, lsl) + +#undef DO_ZZW + static uint64_t pred_esz_mask[4] =3D { 0xffffffffffffffffull, 0x5555555555555555ull, 0x1111111111111111ull, 0x0101010101010101ull diff --git a/target/arm/sve.def b/target/arm/sve.def index 7428ebc5cd..9caed8fc66 100644 --- a/target/arm/sve.def +++ b/target/arm/sve.def @@ -32,6 +32,11 @@ # A combination of tsz:imm3 -- extract (tsz:imm3) - esize %tszimm_shl 22:2 5:5 !function=3Dtszimm_shl =20 +# Similarly for the tszh/tszl pair at 22/16 for zzi +%tszimm16_esz 22:2 16:5 !function=3Dtszimm_esz +%tszimm16_shr 22:2 16:5 !function=3Dtszimm_shr +%tszimm16_shl 22:2 16:5 !function=3Dtszimm_shl + # Either a copy of rd (at bit 0), or a different source # as propagated via the MOVPRFX instruction. %reg_movprfx 0:5 @@ -42,6 +47,7 @@ # instruction patterns. =20 &rri rd rn imm +&rri_esz rd rn imm esz &rrr_esz rd rn rm esz &rpr_esz rd pg rn esz &rprr_esz rd pg rn rm esz @@ -80,6 +86,9 @@ # User must fill in imm. @rdn_pg_tszimm ........ .. ... ... ... pg:3 ..... rd:5 &rpri_esz rn=3D%r= eg_movprfx esz=3D%tszimm_esz =20 +# Similarly without predicate. +@rd_rn_tszimm ........ .. ... ... ...... rn:5 rd:5 &rri_esz esz=3D%tszim= m16_esz + # Basic Load/Store with 9-bit immediate offset @pd_rn_i9 ........ ........ ...... rn:5 . rd:4 &rri imm=3D%imm9_16_10 @rd_rn_i9 ........ ........ ...... 
rn:5 rd:5 &rri imm=3D%imm9_16_10 @@ -230,6 +239,18 @@ ADDPL 00000100 011 ..... 01010 ...... ..... @rd_rn_= i6 # SVE stack frame size RDVL 00000100 101 11111 01010 imm:s6 rd:5 =20 +### SVE Bitwise Shift - Unpredicated Group + +# SVE bitwise shift by immediate (unpredicated) +ASR_zzi 00000100 .. 1 ..... 1001 00 ..... ..... @rd_rn_tszimm imm=3D%ts= zimm16_shr +LSR_zzi 00000100 .. 1 ..... 1001 01 ..... ..... @rd_rn_tszimm imm=3D%ts= zimm16_shr +LSL_zzi 00000100 .. 1 ..... 1001 11 ..... ..... @rd_rn_tszimm imm=3D%ts= zimm16_shl + +# SVE bitwise shift by wide elements (unpredicated) +ASR_zzw 00000100 .. 1 ..... 1000 00 ..... ..... @rd_rn_rm_esz # Note si= ze !=3D 3 +LSR_zzw 00000100 .. 1 ..... 1000 01 ..... ..... @rd_rn_rm_esz # Note si= ze !=3D 3 +LSL_zzw 00000100 .. 1 ..... 1000 11 ..... ..... @rd_rn_rm_esz # Note si= ze !=3D 3 + ### SVE Predicate Generation Group =20 # SVE initialize predicate (PTRUE, PTRUES) --=20 2.14.3 From nobody Tue Oct 28 12:15:16 2025 Delivered-To: importer@patchew.org Received-SPF: temperror (zoho.com: Error in retrieving data from DNS) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=temperror (zoho.com: Error in retrieving data from DNS) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 1513622784014769.6327654532232; Mon, 18 Dec 2017 10:46:24 -0800 (PST) Received: from localhost ([::1]:52522 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eR0QJ-0002dU-PI for importer@patchew.org; Mon, 18 Dec 2017 13:45:59 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:56045) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eQzUi-0008TN-1k for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:29 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 
4.71) (envelope-from ) id 1eQzUg-0002Hi-VZ for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:28 -0500 Received: from mail-pg0-x241.google.com ([2607:f8b0:400e:c05::241]:43367) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1eQzUg-0002H3-Nb for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:26 -0500 Received: by mail-pg0-x241.google.com with SMTP id b18so9423314pgv.10 for ; Mon, 18 Dec 2017 09:46:26 -0800 (PST) Received: from cloudburst.twiddle.net (174-21-7-63.tukw.qwest.net. [174.21.7.63]) by smtp.gmail.com with ESMTPSA id t84sm26209657pfe.160.2017.12.18.09.46.23 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Mon, 18 Dec 2017 09:46:24 -0800 (PST) From: Richard
Henderson To: qemu-devel@nongnu.org Date: Mon, 18 Dec 2017 09:45:49 -0800 Message-Id: <20171218174552.18871-21-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20171218174552.18871-1-richard.henderson@linaro.org> References: <20171218174552.18871-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::241 Subject: [Qemu-devel] [PATCH 20/23] target/arm: Implement SVE Compute Vector Address Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_6 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 5 +++++ target/arm/sve_helper.c | 40 ++++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 29 +++++++++++++++++++++++++++++ target/arm/sve.def | 12 ++++++++++++ 4 files changed, 86 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index c0e23e7a83..a9fcf25b95 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -374,6 +374,11 @@ DEF_HELPER_FLAGS_4(sve_lsl_zzw_b, TCG_CALL_NO_RWG, voi= d, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_lsl_zzw_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i3= 2) DEF_HELPER_FLAGS_4(sve_lsl_zzw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i3= 2) =20 +DEF_HELPER_FLAGS_4(sve_adr_p32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_adr_p64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_adr_s32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_adr_u32, TCG_CALL_NO_RWG, void, 
ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_bic_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_eor_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index b6aca18d22..33b3c3432d 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -983,6 +983,46 @@ void HELPER(sve_index_d)(void *vd, uint64_t start, } } =20 +void HELPER(sve_adr_p32)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 4; + uint32_t sh =3D simd_data(desc); + uint32_t *d =3D vd, *n =3D vn, *m =3D vm; + for (i =3D 0; i < opr_sz; i +=3D 1) { + d[i] =3D n[i] + (m[i] << sh); + } +} + +void HELPER(sve_adr_p64)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; + uint64_t sh =3D simd_data(desc); + uint64_t *d =3D vd, *n =3D vn, *m =3D vm; + for (i =3D 0; i < opr_sz; i +=3D 1) { + d[i] =3D n[i] + (m[i] << sh); + } +} + +void HELPER(sve_adr_s32)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; + uint64_t sh =3D simd_data(desc); + uint64_t *d =3D vd, *n =3D vn, *m =3D vm; + for (i =3D 0; i < opr_sz; i +=3D 1) { + d[i] =3D n[i] + ((uint64_t)(int32_t)m[i] << sh); + } +} + +void HELPER(sve_adr_u32)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; + uint64_t sh =3D simd_data(desc); + uint64_t *d =3D vd, *n =3D vn, *m =3D vm; + for (i =3D 0; i < opr_sz; i +=3D 1) { + d[i] =3D n[i] + ((uint64_t)(uint32_t)m[i] << sh); + } +} + void HELPER(sve_ldr)(CPUARMState *env, void *d, target_ulong addr, uint32_= t len) { intptr_t i, len_align =3D QEMU_ALIGN_DOWN(len, 8); diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index d8e7cc7570..fcb5c4929e 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -710,6 +710,35 
@@ DO_ZZW(LSL, lsl) =20 #undef DO_ZZW =20 +static void do_adr(DisasContext *s, arg_rrri *a, gen_helper_gvec_3 *fn) +{ + unsigned vsz =3D size_for_gvec(vec_full_reg_size(s)); + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vsz, vsz, a->imm, fn); +} + +void trans_ADR_p32(DisasContext *s, arg_rrri *a, uint32_t insn) +{ + do_adr(s, a, gen_helper_sve_adr_p32); +} + +void trans_ADR_p64(DisasContext *s, arg_rrri *a, uint32_t insn) +{ + do_adr(s, a, gen_helper_sve_adr_p64); +} + +void trans_ADR_s32(DisasContext *s, arg_rrri *a, uint32_t insn) +{ + do_adr(s, a, gen_helper_sve_adr_s32); +} + +void trans_ADR_u32(DisasContext *s, arg_rrri *a, uint32_t insn) +{ + do_adr(s, a, gen_helper_sve_adr_u32); +} + static uint64_t pred_esz_mask[4] =3D { 0xffffffffffffffffull, 0x5555555555555555ull, 0x1111111111111111ull, 0x0101010101010101ull diff --git a/target/arm/sve.def b/target/arm/sve.def index 9caed8fc66..66a88f59bc 100644 --- a/target/arm/sve.def +++ b/target/arm/sve.def @@ -47,6 +47,7 @@ # instruction patterns. =20 &rri rd rn imm +&rrri rd rn rm imm &rri_esz rd rn imm esz &rrr_esz rd rn rm esz &rpr_esz rd pg rn esz @@ -65,6 +66,9 @@ # Three operand with unused vector element size @rd_rn_rm ........ ... rm:5 ... ... rn:5 rd:5 &rrr_esz esz=3D0 =20 +# Three operand with "memory" size, aka immediate left shift +@rd_rn_msz_rm ........ ... rm:5 .... imm:2 rn:5 rd:5 &rrri + # Three prediate operand, with governing predicate, unused vector element = size @pd_pg_pn_pm ........ .... rm:4 .. pg:4 . rn:4 . rd:4 &rprr_esz esz=3D0 =20 @@ -251,6 +255,14 @@ ASR_zzw 00000100 .. 1 ..... 1000 00 ..... ..... @rd= _rn_rm_esz # Note size !=3D LSR_zzw 00000100 .. 1 ..... 1000 01 ..... ..... @rd_rn_rm_esz # Note si= ze !=3D 3 LSL_zzw 00000100 .. 1 ..... 1000 11 ..... ..... @rd_rn_rm_esz # Note si= ze !=3D 3 =20 +### SVE Compute Vector Address Group + +# SVE vector address generation +ADR_s32 00000100 00 1 ..... 1010 .. ..... ..... 
@rd_rn_msz_rm +ADR_u32 00000100 01 1 ..... 1010 .. ..... ..... @rd_rn_msz_rm +ADR_p32 00000100 10 1 ..... 1010 .. ..... ..... @rd_rn_msz_rm +ADR_p64 00000100 11 1 ..... 1010 .. ..... ..... @rd_rn_msz_rm + ### SVE Predicate Generation Group =20 # SVE initialize predicate (PTRUE, PTRUES) --=20 2.14.3 From nobody Tue Oct 28 12:15:16 2025 Delivered-To: importer@patchew.org Received-SPF: temperror (zoho.com: Error in retrieving data from DNS) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=temperror (zoho.com: Error in retrieving data from DNS) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 15136217641981020.6400003791815; Mon, 18 Dec 2017 10:29:24 -0800 (PST) Received: from localhost ([::1]:45062 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eR09p-00027k-VY for importer@patchew.org; Mon, 18 Dec 2017 13:28:58 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:56114) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eQzUm-0008WC-7t for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:33 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1eQzUk-0002MO-1I for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:32 -0500 Received: from mail-pg0-x243.google.com ([2607:f8b0:400e:c05::243]:37485) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1eQzUj-0002Ke-Pn for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:29 -0500 Received: by mail-pg0-x243.google.com with SMTP id y6so9433444pgp.4 for ; Mon, 18 Dec 2017 09:46:29 -0800 (PST) Received: from cloudburst.twiddle.net (174-21-7-63.tukw.qwest.net. 
[174.21.7.63]) by smtp.gmail.com with ESMTPSA id t84sm26209657pfe.160.2017.12.18.09.46.25 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Mon, 18 Dec 2017 09:46:25 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Mon, 18 Dec 2017 09:45:50 -0800 Message-Id: <20171218174552.18871-22-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20171218174552.18871-1-richard.henderson@linaro.org> References: <20171218174552.18871-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 2607:f8b0:400e:c05::243 Subject: [Qemu-devel] [PATCH 21/23] target/arm: Implement SVE floating-point exponential accelerator X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_6 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 4 +++ target/arm/sve_helper.c | 81 ++++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/translate-sve.c | 18 +++++++++++ target/arm/sve.def | 11 ++++++- 4 files changed, 113 insertions(+), 1 deletion(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index a9fcf25b95..c72ae3390f 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -379,6 +379,10 @@ DEF_HELPER_FLAGS_4(sve_adr_p64, TCG_CALL_NO_RWG, void,= ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_adr_s32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_adr_u32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) =20 +DEF_HELPER_FLAGS_3(sve_fexpa_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fexpa_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fexpa_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_bic_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_eor_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 33b3c3432d..936a6ec648 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1023,6 +1023,87 @@ void HELPER(sve_adr_u32)(void 
*vd, void *vn, void *v= m, uint32_t desc) } } =20 +void HELPER(sve_fexpa_h)(void *vd, void *vn, uint32_t desc) +{ + static const uint16_t coeff[] =3D { + 0x0000, 0x0016, 0x002d, 0x0045, 0x005d, 0x0075, 0x008e, 0x00a8, + 0x00c2, 0x00dc, 0x00f8, 0x0114, 0x0130, 0x014d, 0x016b, 0x0189, + 0x01a8, 0x01c8, 0x01e8, 0x0209, 0x022b, 0x024e, 0x0271, 0x0295, + 0x02ba, 0x02e0, 0x0306, 0x032e, 0x0356, 0x037f, 0x03a9, 0x03d4, + }; + intptr_t i, opr_sz =3D simd_oprsz(desc) / 2; + uint16_t *d =3D vd, *n =3D vn; + + for (i =3D 0; i < opr_sz; i++) { + uint16_t nn =3D n[i]; + intptr_t idx =3D extract32(nn, 0, 5); + uint16_t exp =3D extract32(nn, 5, 5); + d[i] =3D coeff[idx] | (exp << 10); + } +} + +void HELPER(sve_fexpa_s)(void *vd, void *vn, uint32_t desc) +{ + static const uint32_t coeff[] =3D { + 0x000000, 0x0164d2, 0x02cd87, 0x043a29, + 0x05aac3, 0x071f62, 0x08980f, 0x0a14d5, + 0x0b95c2, 0x0d1adf, 0x0ea43a, 0x1031dc, + 0x11c3d3, 0x135a2b, 0x14f4f0, 0x16942d, + 0x1837f0, 0x19e046, 0x1b8d3a, 0x1d3eda, + 0x1ef532, 0x20b051, 0x227043, 0x243516, + 0x25fed7, 0x27cd94, 0x29a15b, 0x2b7a3a, + 0x2d583f, 0x2f3b79, 0x3123f6, 0x3311c4, + 0x3504f3, 0x36fd92, 0x38fbaf, 0x3aff5b, + 0x3d08a4, 0x3f179a, 0x412c4d, 0x4346cd, + 0x45672a, 0x478d75, 0x49b9be, 0x4bec15, + 0x4e248c, 0x506334, 0x52a81e, 0x54f35b, + 0x5744fd, 0x599d16, 0x5bfbb8, 0x5e60f5, + 0x60ccdf, 0x633f89, 0x65b907, 0x68396a, + 0x6ac0c7, 0x6d4f30, 0x6fe4ba, 0x728177, + 0x75257d, 0x77d0df, 0x7a83b3, 0x7d3e0c, + }; + intptr_t i, opr_sz =3D simd_oprsz(desc) / 4; + uint32_t *d =3D vd, *n =3D vn; + + for (i =3D 0; i < opr_sz; i++) { + uint32_t nn =3D n[i]; + intptr_t idx =3D extract32(nn, 0, 6); + uint32_t exp =3D extract32(nn, 6, 8); + d[i] =3D coeff[idx] | (exp << 23); + } +} + +void HELPER(sve_fexpa_d)(void *vd, void *vn, uint32_t desc) +{ + static const uint64_t coeff[] =3D { + 0x0000000000000, 0x02C9A3E778061, 0x059B0D3158574, 0x0874518759BC8, + 0x0B5586CF9890F, 0x0E3EC32D3D1A2, 0x11301D0125B51, 0x1429AAEA92DE0, + 0x172B83C7D517B, 
0x1A35BEB6FCB75, 0x1D4873168B9AA, 0x2063B88628CD6, + 0x2387A6E756238, 0x26B4565E27CDD, 0x29E9DF51FDEE1, 0x2D285A6E4030B, + 0x306FE0A31B715, 0x33C08B26416FF, 0x371A7373AA9CB, 0x3A7DB34E59FF7, + 0x3DEA64C123422, 0x4160A21F72E2A, 0x44E086061892D, 0x486A2B5C13CD0, + 0x4BFDAD5362A27, 0x4F9B2769D2CA7, 0x5342B569D4F82, 0x56F4736B527DA, + 0x5AB07DD485429, 0x5E76F15AD2148, 0x6247EB03A5585, 0x6623882552225, + 0x6A09E667F3BCD, 0x6DFB23C651A2F, 0x71F75E8EC5F74, 0x75FEB564267C9, + 0x7A11473EB0187, 0x7E2F336CF4E62, 0x82589994CCE13, 0x868D99B4492ED, + 0x8ACE5422AA0DB, 0x8F1AE99157736, 0x93737B0CDC5E5, 0x97D829FDE4E50, + 0x9C49182A3F090, 0xA0C667B5DE565, 0xA5503B23E255D, 0xA9E6B5579FDBF, + 0xAE89F995AD3AD, 0xB33A2B84F15FB, 0xB7F76F2FB5E47, 0xBCC1E904BC1D2, + 0xC199BDD85529C, 0xC67F12E57D14B, 0xCB720DCEF9069, 0xD072D4A07897C, + 0xD5818DCFBA487, 0xDA9E603DB3285, 0xDFC97337B9B5F, 0xE502EE78B3FF6, + 0xEA4AFA2A490DA, 0xEFA1BEE615A27, 0xF50765B6E4540, 0xFA7C1819E90D8, + }; + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; + uint64_t *d =3D vd, *n =3D vn; + + for (i =3D 0; i < opr_sz; i++) { + uint64_t nn =3D n[i]; + intptr_t idx =3D extract32(nn, 0, 6); + uint64_t exp =3D extract32(nn, 6, 11); + d[i] =3D coeff[idx] | (exp << 52); + } +} + void HELPER(sve_ldr)(CPUARMState *env, void *d, target_ulong addr, uint32_= t len) { intptr_t i, len_align =3D QEMU_ALIGN_DOWN(len, 8); diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index fcb5c4929e..b671462611 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -739,6 +739,24 @@ void trans_ADR_u32(DisasContext *s, arg_rrri *a, uint3= 2_t insn) do_adr(s, a, gen_helper_sve_adr_u32); } =20 +void trans_FEXPA(DisasContext *s, arg_rr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_2 * const fns[4] =3D { + NULL, + gen_helper_sve_fexpa_h, + gen_helper_sve_fexpa_s, + gen_helper_sve_fexpa_d, + }; + unsigned vsz =3D size_for_gvec(vec_full_reg_size(s)); + if (a->esz =3D=3D 0) { + unallocated_encoding(s); + return; + 
} + tcg_gen_gvec_2_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vsz, vsz, 0, fns[a->esz]); +} + static uint64_t pred_esz_mask[4] =3D { 0xffffffffffffffffull, 0x5555555555555555ull, 0x1111111111111111ull, 0x0101010101010101ull diff --git a/target/arm/sve.def b/target/arm/sve.def index 66a88f59bc..c0fc8b7665 100644 --- a/target/arm/sve.def +++ b/target/arm/sve.def @@ -49,6 +49,7 @@ &rri rd rn imm &rrri rd rn rm imm &rri_esz rd rn imm esz +&rr_esz rd rn esz &rrr_esz rd rn rm esz &rpr_esz rd pg rn esz &rprr_esz rd pg rn rm esz @@ -60,8 +61,11 @@ # Named instruction formats. These are generally used to # reduce the amount of duplication between instruction patterns. =20 +# Two operand +@rd_rn_esz ........ esz:2 ...... ...... rn:5 rd:5 &rr_esz + # Three operand -@rd_rn_rm_esz ........ esz:2 . rm:5 ... ... rn:5 rd:5 &rrr_esz +@rd_rn_rm_esz ........ esz:2 . rm:5 ... ... rn:5 rd:5 &rrr_esz =20 # Three operand with unused vector element size @rd_rn_rm ........ ... rm:5 ... ... rn:5 rd:5 &rrr_esz esz=3D0 @@ -263,6 +267,11 @@ ADR_u32 00000100 01 1 ..... 1010 .. ..... ..... @rd= _rn_msz_rm ADR_p32 00000100 10 1 ..... 1010 .. ..... ..... @rd_rn_msz_rm ADR_p64 00000100 11 1 ..... 1010 .. ..... ..... @rd_rn_msz_rm =20 +### SVE Integer Misc - Unpredicated Group + +# SVE floating-point exponential accelerator +FEXPA 00000100 .. 1 00000 101110 ..... ..... 
@rd_rn_esz # Note size != =3D 0 + ### SVE Predicate Generation Group =20 # SVE initialize predicate (PTRUE, PTRUES) --=20 2.14.3 From nobody Tue Oct 28 12:15:16 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1513623632719520.7218099652652; Mon, 18 Dec 2017 11:00:32 -0800 (PST) Received: from localhost ([::1]:55169 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eR0eK-00026o-Gy for importer@patchew.org; Mon, 18 Dec 2017 14:00:28 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:56119) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eQzUm-0008WJ-9r for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:34 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1eQzUk-0002Mn-5Z for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:32 -0500 Received: from mail-pl0-x243.google.com ([2607:f8b0:400e:c01::243]:41953) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1eQzUk-0002LB-0v for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:30 -0500 Received: by mail-pl0-x243.google.com with SMTP id g2so5248770pli.8 for ; Mon, 18 Dec 2017 09:46:29 -0800 (PST) Received: from cloudburst.twiddle.net (174-21-7-63.tukw.qwest.net. 
[174.21.7.63]) by smtp.gmail.com with ESMTPSA id t84sm26209657pfe.160.2017.12.18.09.46.27 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Mon, 18 Dec 2017 09:46:27 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Mon, 18 Dec 2017 09:45:51 -0800 Message-Id: <20171218174552.18871-23-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20171218174552.18871-1-richard.henderson@linaro.org> References: <20171218174552.18871-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 2607:f8b0:400e:c01::243 Subject: [Qemu-devel] [PATCH 22/23] target/arm: Implement SVE floating-point trig select coefficient X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 4 ++++ target/arm/sve_helper.c | 42 ++++++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 19 +++++++++++++++++++ target/arm/sve.def | 3 +++ 4 files changed, 68 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index c72ae3390f..ccf5405d63 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -383,6 +383,10 @@ DEF_HELPER_FLAGS_3(sve_fexpa_h, TCG_CALL_NO_RWG, void,= ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_fexpa_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_fexpa_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) =20 +DEF_HELPER_FLAGS_4(sve_ftssel_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_ftssel_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_ftssel_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_bic_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) DEF_HELPER_FLAGS_5(sve_eor_pred, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr= , i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 936a6ec648..5341f6d0e5 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1104,6 +1104,48 @@ void HELPER(sve_fexpa_d)(void *vd, void *vn, 
uint32_= t desc) } } =20 +void HELPER(sve_ftssel_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 2; + uint16_t *d =3D vd, *n =3D vn, *m =3D vm; + for (i =3D 0; i < opr_sz; i +=3D 1) { + uint16_t nn =3D n[i]; + uint16_t mm =3D m[i]; + if (mm & 1) { + nn =3D float16_one; + } + d[i] =3D nn ^ (mm & 2) << 14; + } +} + +void HELPER(sve_ftssel_s)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 4; + uint32_t *d =3D vd, *n =3D vn, *m =3D vm; + for (i =3D 0; i < opr_sz; i +=3D 1) { + uint32_t nn =3D n[i]; + uint32_t mm =3D m[i]; + if (mm & 1) { + nn =3D float32_one; + } + d[i] =3D nn ^ (mm & 2) << 30; + } +} + +void HELPER(sve_ftssel_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz =3D simd_oprsz(desc) / 8; + uint64_t *d =3D vd, *n =3D vn, *m =3D vm; + for (i =3D 0; i < opr_sz; i +=3D 1) { + uint64_t nn =3D n[i]; + uint64_t mm =3D m[i]; + if (mm & 1) { + nn =3D float64_one; + } + d[i] =3D nn ^ (mm & 2) << 62; + } +} + void HELPER(sve_ldr)(CPUARMState *env, void *d, target_ulong addr, uint32_= t len) { intptr_t i, len_align =3D QEMU_ALIGN_DOWN(len, 8); diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index b671462611..a6c31e0e9c 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -757,6 +757,25 @@ void trans_FEXPA(DisasContext *s, arg_rr_esz *a, uint3= 2_t insn) vsz, vsz, 0, fns[a->esz]); } =20 +void trans_FTSSEL(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_3 * const fns[4] =3D { + NULL, + gen_helper_sve_ftssel_h, + gen_helper_sve_ftssel_s, + gen_helper_sve_ftssel_d, + }; + unsigned vsz =3D size_for_gvec(vec_full_reg_size(s)); + if (a->esz =3D=3D 0) { + unallocated_encoding(s); + return; + } + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vsz, vsz, 0, fns[a->esz]); +} + static uint64_t pred_esz_mask[4] =3D { 
0xffffffffffffffffull, 0x5555555555555555ull, 0x1111111111111111ull, 0x0101010101010101ull diff --git a/target/arm/sve.def b/target/arm/sve.def index c0fc8b7665..df2730eb73 100644 --- a/target/arm/sve.def +++ b/target/arm/sve.def @@ -272,6 +272,9 @@ ADR_p64 00000100 11 1 ..... 1010 .. ..... ..... @rd_= rn_msz_rm # SVE floating-point exponential accelerator FEXPA 00000100 .. 1 00000 101110 ..... ..... @rd_rn_esz # Note size != =3D 0 =20 +# SVE floating-point trig select coefficient +FTSSEL 00000100 .. 1 ..... 101100 ..... ..... @rd_rn_rm_esz # Note size= !=3D 0 + ### SVE Predicate Generation Group =20 # SVE initialize predicate (PTRUE, PTRUES) --=20 2.14.3 From nobody Tue Oct 28 12:15:16 2025 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1513623243844662.8867868111997; Mon, 18 Dec 2017 10:54:03 -0800 (PST) Received: from localhost ([::1]:54113 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eR0Y6-0003Jw-Of for importer@patchew.org; Mon, 18 Dec 2017 13:54:02 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:56129) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1eQzUn-00005M-1l for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:34 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1eQzUl-0002PO-Q8 for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:33 -0500 Received: from mail-pg0-x243.google.com ([2607:f8b0:400e:c05::243]:46629) by eggs.gnu.org with esmtps 
(TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1eQzUl-0002OU-IF for qemu-devel@nongnu.org; Mon, 18 Dec 2017 12:46:31 -0500 Received: by mail-pg0-x243.google.com with SMTP id b11so9422099pgu.13 for ; Mon, 18 Dec 2017 09:46:31 -0800 (PST) Received: from cloudburst.twiddle.net (174-21-7-63.tukw.qwest.net. [174.21.7.63]) by smtp.gmail.com with ESMTPSA id t84sm26209657pfe.160.2017.12.18.09.46.28 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Mon, 18 Dec 2017 09:46:29 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Mon, 18 Dec 2017 09:45:52 -0800 Message-Id: <20171218174552.18871-24-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To:
<20171218174552.18871-1-richard.henderson@linaro.org> References: <20171218174552.18871-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::243 Subject: [Qemu-devel] [PATCH 23/23] target/arm: Implement SVE Element Count Group, register destinations X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZohoMail: RDKM_2 RSF_0 Z_629925259 SPT_0 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Signed-off-by: Richard Henderson --- target/arm/translate-sve.c | 103 +++++++++++++++++++++++++++++++++++++++++= ++++ target/arm/sve.def | 18 ++++++++ 2 files changed, 121 insertions(+) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index a6c31e0e9c..91eb4e797a 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -61,6 +61,11 @@ static int tszimm_shl(int x) return x - tszimm_esz(x); } =20 +static inline int plus1(int x) +{ + return x + 1; +} + /* * Include the generated decoder. */ @@ -815,6 +820,104 @@ static unsigned decode_pred_count(unsigned fullsz, in= t pattern, int esz) } } =20 +void trans_CNT_r(DisasContext *s, arg_CNT_r *a, uint32_t insn) +{ + unsigned fullsz =3D vec_full_reg_size(s); + unsigned numelem =3D decode_pred_count(fullsz, a->pat, a->esz); + + tcg_gen_movi_i64(cpu_reg(s, a->rd), numelem * a->imm); +} + +void trans_INC_DEC_r(DisasContext *s, arg_incdec_cnt *a, uint32_t insn) +{ + unsigned fullsz =3D vec_full_reg_size(s); + unsigned numelem =3D decode_pred_count(fullsz, a->pat, a->esz); + int inc =3D numelem * a->imm * (a->d ? 
-1 : 1); + TCGv_i64 reg =3D cpu_reg(s, a->rd); + + tcg_gen_addi_i64(reg, reg, inc); +} + +void trans_sat_INC_DEC_r_32(DisasContext *s, arg_incdec_cnt *a, uint32_t i= nsn) +{ + unsigned fullsz =3D vec_full_reg_size(s); + unsigned numelem =3D decode_pred_count(fullsz, a->pat, a->esz); + int inc =3D numelem * a->imm * (a->d ? -1 : 1); + int64_t ibound; + TCGv_i64 reg =3D cpu_reg(s, a->rd); + TCGv_i64 bound; + TCGCond cond; + + /* Use normal 64-bit arithmetic to detect 32-bit overflow. */ + if (a->u) { + tcg_gen_ext32u_i64(reg, reg); + } else { + tcg_gen_ext32s_i64(reg, reg); + } + tcg_gen_addi_i64(reg, reg, inc); + if (a->d) { + if (a->u) { + ibound =3D 0; + cond =3D TCG_COND_LT; + } else { + ibound =3D INT32_MIN; + cond =3D TCG_COND_LT; + } + } else { + if (a->u) { + ibound =3D UINT32_MAX; + cond =3D TCG_COND_GTU; + } else { + ibound =3D INT32_MAX; + cond =3D TCG_COND_GT; + } + } + bound =3D tcg_const_i64(ibound); + tcg_gen_movcond_i64(cond, reg, reg, bound, bound, reg); + tcg_temp_free_i64(bound); +} + +void trans_sat_INC_DEC_r_64(DisasContext *s, arg_incdec_cnt *a, uint32_t i= nsn) +{ + unsigned fullsz =3D vec_full_reg_size(s); + unsigned numelem =3D decode_pred_count(fullsz, a->pat, a->esz); + int inc =3D numelem * a->imm * (a->d ? -1 : 1); + TCGv_i64 reg =3D cpu_reg(s, a->rd); + TCGv_i64 t0 =3D tcg_temp_new_i64(); + TCGv_i64 t1 =3D tcg_temp_new_i64(); + TCGv_i64 zero; + + if (a->u) { + tcg_gen_addi_i64(t0, reg, inc); + + /* Bound the result. */ + if (a->d) { + tcg_gen_movi_i64(t1, 0); + tcg_gen_movcond_i64(TCG_COND_LTU, reg, t0, reg, t0, t1); + } else { + tcg_gen_movi_i64(t1, -1); + tcg_gen_movcond_i64(TCG_COND_LTU, reg, reg, t0, t0, t1); + } + } else { + /* Detect signed overflow for addition. */ + tcg_gen_xori_i64(t0, reg, inc); + tcg_gen_addi_i64(reg, reg, inc); + tcg_gen_xori_i64(t1, reg, inc); + tcg_gen_andc_i64(t0, t1, t0); + + /* Because we know the increment, we know which way it overflowed.= */ + tcg_gen_movi_i64(t1, a->d ?
INT64_MIN : INT64_MAX); + + /* Bound the result. */ + zero =3D tcg_const_i64(0); + tcg_gen_movcond_i64(TCG_COND_LT, reg, t0, zero, t1, reg); + + tcg_temp_free_i64(zero); + } + tcg_temp_free_i64(t0); + tcg_temp_free_i64(t1); +} + /* For PTRUE, PTRUES, PFALSE, SETFFR. */ void trans_pred_set(DisasContext *s, arg_pred_set *a, uint32_t insn) { diff --git a/target/arm/sve.def b/target/arm/sve.def index df2730eb73..da533ba666 100644 --- a/target/arm/sve.def +++ b/target/arm/sve.def @@ -24,6 +24,7 @@ =20 %imm9_16_10 16:s6 10:3 %imm6_22_5 22:1 5:5 +%imm4_16_p1 16:4 !function=3Dplus1 =20 # A combination of tsz:imm3 -- extract esize. %tszimm_esz 22:2 5:5 !function=3Dtszimm_esz @@ -56,6 +57,7 @@ &rprrr_esz rd pg rn rm ra esz &rpri_esz rd pg rn imm esz &pred_set rd pat esz i s +&incdec_cnt rd pat esz imm d u =20 ########################################################################### # Named instruction formats. These are generally used to @@ -101,6 +103,10 @@ @pd_rn_i9 ........ ........ ...... rn:5 . rd:4 &rri imm=3D%imm9_16_10 @rd_rn_i9 ........ ........ ...... rn:5 rd:5 &rri imm=3D%imm9_16_10 =20 +# One register, pattern, and uint4+1. +# User must fill in U and D. +@incdec_cnt ........ esz:2 .. .... ...... pat:5 rd:5 &incdec_cnt imm=3D%i= mm4_16_p1 + ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. =20 @@ -275,6 +281,18 @@ FEXPA 00000100 .. 1 00000 101110 ..... ..... @rd_rn= _esz # Note size !=3D 0 # SVE floating-point trig select coefficient FTSSEL 00000100 .. 1 ..... 101100 ..... ..... @rd_rn_rm_esz # Note size= !=3D 0 =20 +### SVE Element Count Group + +# SVE element count +CNT_r 00000100 .. 10 .... 1110 0 0 ..... ..... @incdec_cnt d=3D0 u= =3D1 + +# SVE inc/dec register by element count +INC_DEC_r 00000100 .. 11 .... 1110 0 d:1 ..... ..... @incdec_cnt u=3D1 + +# SVE saturating inc/dec register by element count +sat_INC_DEC_r_32 00000100 .. 10 .... 1111 d:1 u:1 ..... 
..... @incdec_cnt +sat_INC_DEC_r_64 00000100 .. 11 .... 1111 d:1 u:1 ..... ..... @incdec_cnt + ### SVE Predicate Generation Group =20 # SVE initialize predicate (PTRUE, PTRUES) --=20 2.14.3
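For reference when reviewing the saturating cases, here is a minimal plain-C sketch of the intended result of the 32-bit and signed 64-bit paths. It is not part of the patch and does not use TCG. The function names (sat_add_u32, sat_add_s32, sat_add_s64) are hypothetical, and `inc` stands for the already-signed value numelem * imm * (d ? -1 : 1). The 64-bit version detects overflow with the same sign-bit test the translator is meant to emit: overflow occurs when the operands agree in sign and the sum disagrees with the increment.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the unsigned 32-bit case: widen, add, clamp. */
static uint32_t sat_add_u32(uint32_t reg, int64_t inc)
{
    int64_t wide = (int64_t)reg + inc;   /* cannot overflow 64 bits here */

    if (wide < 0) {
        return 0;                        /* decrement went below zero */
    }
    if (wide > (int64_t)UINT32_MAX) {
        return UINT32_MAX;               /* increment went past 2^32 - 1 */
    }
    return (uint32_t)wide;
}

/* Hypothetical model of the signed 32-bit case. */
static int32_t sat_add_s32(int32_t reg, int64_t inc)
{
    int64_t wide = (int64_t)reg + inc;

    if (wide < INT32_MIN) {
        return INT32_MIN;
    }
    if (wide > INT32_MAX) {
        return INT32_MAX;
    }
    return (int32_t)wide;
}

/* Hypothetical model of the signed 64-bit case, using the xor/andc test. */
static int64_t sat_add_s64(int64_t reg, int64_t inc)
{
    /* Work in unsigned arithmetic so wraparound is well defined. */
    uint64_t before = (uint64_t)reg ^ (uint64_t)inc;  /* sign clear: same sign */
    uint64_t sum = (uint64_t)reg + (uint64_t)inc;
    uint64_t after = sum ^ (uint64_t)inc;             /* sign set: sum flipped */

    if ((after & ~before) >> 63) {
        /* The sign of the increment tells us which bound was crossed. */
        return inc < 0 ? INT64_MIN : INT64_MAX;
    }
    /* Assumes the usual two's complement conversion back to signed. */
    return (int64_t)sum;
}

/* Small usage example covering one saturating case per helper. */
static void sat_model_self_check(void)
{
    assert(sat_add_u32(5, -8) == 0);
    assert(sat_add_s32(INT32_MAX, 1) == INT32_MAX);
    assert(sat_add_s64(INT64_MIN + 1, -16) == INT64_MIN);
}
```

A model like this can be compared against the emulated instruction results for a few boundary inputs; it deliberately leaves out the unsigned 64-bit case, which needs a carry/borrow test rather than a sign test.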