From nobody Tue Feb 10 03:39:01 2026
To: qemu-devel@nongnu.org
Cc: ale@rev.ng, tsimpson@quicinc.com, bcain@quicinc.com, mlambert@quicinc.com,
 babush@rev.ng, nizzo@rev.ng, richard.henderson@linaro.org
Subject: [PATCH v8 02/12] target/hexagon: import README for idef-parser
Date: Wed,  9 Feb 2022 18:03:02 +0100
Message-Id: <20220209170312.30662-3-anjo@rev.ng>
In-Reply-To: <20220209170312.30662-1-anjo@rev.ng>
References: <20220209170312.30662-1-anjo@rev.ng>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <https://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel" <qemu-devel-bounces+importer=patchew.org@nongnu.org>
Reply-to: Anton Johansson <anjo@rev.ng>
From: Anton Johansson via <qemu-devel@nongnu.org>

From: Alessandro Di Federico <ale@rev.ng>

Signed-off-by: Alessandro Di Federico <ale@rev.ng>
Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Taylor Simpson <tsimpson@quicinc.com>
---
 target/hexagon/README                 |   5 +
 target/hexagon/idef-parser/README.rst | 722 ++++++++++++++++++++++++++
 2 files changed, 727 insertions(+)
 create mode 100644 target/hexagon/idef-parser/README.rst

diff --git a/target/hexagon/README b/target/hexagon/README
index 372e24747c..6cb5affddb 100644
--- a/target/hexagon/README
+++ b/target/hexagon/README
@@ -27,6 +27,10 @@ Hexagon-specific code are
         encode*.def             Encoding patterns for each instruction
         iclass.def              Instruction class definitions used to dete=
rmine
                                 legal VLIW slots for each instruction
+    qemu/target/hexagon/idef-parser
+        Parser that, given the high-level definitions of an instruction,
+        produces a C function generating equivalent tiny code instructions.
+        See README.rst.
     qemu/linux-user/hexagon
         Helpers for loading the ELF file and making Linux system calls,
         signals, etc
@@ -47,6 +51,7 @@ header files in <BUILD_DIR>/target/hexagon
         gen_tcg_funcs.py                -> tcg_funcs_generated.c.inc
         gen_tcg_func_table.py           -> tcg_func_table_generated.c.inc
         gen_helper_funcs.py             -> helper_funcs_generated.c.inc
+        gen_idef_parser_funcs.py        -> idef_parser_input.h.inc
=20
 Qemu helper functions have 3 parts
     DEF_HELPER declaration indicates the signature of the helper
diff --git a/target/hexagon/idef-parser/README.rst b/target/hexagon/idef-pa=
rser/README.rst
new file mode 100644
index 0000000000..65e6bf4ee5
--- /dev/null
+++ b/target/hexagon/idef-parser/README.rst
@@ -0,0 +1,722 @@
+Hexagon ISA instruction definitions to tinycode generator compiler
+------------------------------------------------------------------
+
+idef-parser is a small compiler able to translate the Hexagon ISA descript=
ion
+language into tinycode generator code that can be easily integrated into =
QEMU.
+
+Compilation Example
+-------------------
+
+To better understand the scope of idef-parser, we'll explore a practical
+example. Let's start with one of the simplest Hexagon instructions: the =
``add``.
+
+The ISA description language represents the ``add`` instruction as
+follows:
+
+.. code:: c
+
+   A2_add(RdV, in RsV, in RtV) {
+       { RdV=3DRsV+RtV;}
+   }
+
+idef-parser will compile the above code into the following code:
+
+.. code:: c
+
+   /* A2_add */
+   void emit_A2_add(DisasContext *ctx, Insn *insn, Packet *pkt, TCGv_i32 R=
dV,
+                    TCGv_i32 RsV, TCGv_i32 RtV)
+   /*  { RdV=3DRsV+RtV;} */
+   {
+       TCGv_i32 tmp_0 =3D tcg_temp_new_i32();
+       tcg_gen_add_i32(tmp_0, RsV, RtV);
+       tcg_gen_mov_i32(RdV, tmp_0);
+       tcg_temp_free_i32(tmp_0);
+   }
+
+The output of the compilation process will be a function, containing the
+tinycode generator code, implementing the correct semantics. That function=
 will
+not access any global variable, because all the accessed data structures w=
ill be
+passed explicitly as function parameters. Among the passed parameters we w=
ill
+have TCGv (tinycode variables) representing the input and output registers=
 of
+the architecture, integers representing the immediates that come from the =
code,
+and other data structures which hold information about the disassembly
+context (``DisasContext`` struct).
+
+Let's begin by describing the input code. The ``add`` instruction is assoc=
iated
+with a unique identifier, in this case ``A2_add``, which distinguishes
+variants of the same instruction, and expresses the class to which the
+instruction belongs, in this case ``A2`` corresponds to the Hexagon
+``ALU32/ALU`` instruction subclass.
+
+After the instruction identifier, we have a series of parameters that repr=
esent
+TCG variables that will be passed to the generated function. Parameters ma=
rked
+with ``in`` are already initialized, while the others are output parameter=
s.
+
+We use this information to:
+
+-  Fill in the output function signature with the correct TCGv registers
+-  Fill in the output function signature with the immediate integers
+-  Keep track of which registers, among the declared ones, have been
+   initialized
+
+Let's now observe the actual instruction description code, in this case:
+
+.. code:: c
+
+   { RdV=3DRsV+RtV;}
+
+This code is written in a subset of the C syntax, and is the result of
+applying some macro definitions contained in the ``macros.h`` file.
+
+This file reduces the complexity of the input language: complex variants
+of similar constructs are mapped to a single primitive, so that
+idef-parser has to handle fewer computation primitives.
+
+As you may notice, the description code modifies the registers which have =
been
+declared by the declaration statements. In this case all three registers
+will be declared, ``RsV`` and ``RtV`` will also be read and ``RdV`` will be
+written.
+
+Now let's have a quick look at the generated code, line by line.
+
+::
+
+   TCGv_i32 tmp_0 =3D tcg_temp_new_i32();
+
+This code starts by declaring a temporary TCGv to hold the result from the=
 sum
+operation.
+
+::
+
+   tcg_gen_add_i32(tmp_0, RsV, RtV);
+
+Then we generate the addition tinycode operation on the selected
+registers, storing the result in the newly declared temporary.
+
+::
+
+   tcg_gen_mov_i32(RdV, tmp_0);
+
+The addition result is now stored in the temporary, so we move it into the
+correct destination register. This code may seem inefficient, but QEMU will
+perform some optimizations on the tinycode, removing the unnecessary copy.
+
+::
+
+   tcg_temp_free_i32(tmp_0);
+
+Finally, we free the temporary we used to hold the addition result.
+
+Parser Input
+------------
+
+Before moving on to the structure of idef-parser itself, let us spend some=
 words
+on its input. There are two preprocessing steps applied to the generated
+instruction semantics in ``semantics_generated.pyinc`` that we need to con=
sider.
+Firstly,
+
+::
+
+    gen_idef_parser_funcs.py
+
+which translates the instruction semantics in ``semantics_generated.pyinc=
``
+into C-like pseudo code, written to ``idef_parser_input.h.inc``. For
+instance, consider the ``J2_jumpr`` instruction, which jumps to an address
+stored in a register argument. This instruction is defined as
+
+::
+
+    SEMANTICS( \
+        "J2_jumpr", \
+        "jumpr Rs32", \
+        """{fJUMPR(RsN,RsV,COF_TYPE_JUMPR);}""" \
+    )
+
+in ``semantics_generated.pyinc``. Running ``gen_idef_parser_funcs.py``
+we obtain the pseudo code
+
+::
+
+    J2_jumpr(in RsV) {
+        {fJUMPR(RsN,RsV,COF_TYPE_JUMPR);}
+    }
+
+with macros such as ``fJUMPR`` intact.
+
+The second step is to expand macros into a form suitable for our parser.
+These macros are defined in ``idef-parser/macros.inc`` and the step is
+carried out by the ``prepare`` script which runs the C preprocessor on
+``idef_parser_input.h.inc`` to produce
+``idef_parser_input.preprocessed.h.inc``.
+
+To finish the above example, after preprocessing ``J2_jumpr`` we obtain
+
+::
+
+    J2_jumpr(in RsV) {
+        {(PC =3D RsV);}
+    }
+
+where ``fJUMPR(RsN,RsV,COF_TYPE_JUMPR);`` was expanded to ``(PC =3D RsV)``,
+signifying a write to the Program Counter ``PC``. Note that ``PC`` in
+this expression is not a variable in the strict C sense since it is not
+declared anywhere, but rather a symbol which is easy to match in
+idef-parser later on.
+
+Parser Structure
+----------------
+
+idef-parser is built using ``flex`` and ``bison``.
+
+``flex`` is used to split the input string into tokens, each described usi=
ng a
+regular expression. The token description is contained in the
+``idef-parser.lex`` source file. The flex-generated scanner also extracts
+other meaningful information from the input text, e.g., the numerical
+value in case of an immediate constant, and decorates the token with the
+extracted information.
+
+``bison`` is used to generate the actual parser, starting from the parsing
+description contained in the ``idef-parser.y`` file. The generated parser
+executes the ``main`` function at the end of the ``idef-parser.y`` file, w=
hich
+opens input and output files, creates the parsing context, and eventually =
calls
+the ``yyparse()`` function, which starts the execution of the LALR(1) pars=
er
+(see `Wikipedia <https://en.wikipedia.org/wiki/LALR_parser>`__ for more
+information about LALR parsing techniques). The LALR(1) parser, whenever i=
t has
+to shift a token, calls the ``yylex()`` function, which is defined by the
+flex-generated code, and reads the input file returning the next scanned t=
oken.
+
+The tokens are mapped onto the source language grammar, defined in the
+``idef-parser.y`` file, to build a unique syntactic tree according to the
+specified operator precedences and associativity rules.
+
+The grammar describes the whole file which contains the Hexagon instruction
+descriptions, so it starts from the ``input`` nonterminal, which is a list
+of instructions. Each instruction is represented by the following grammar
+rule, reflecting the structure of the input file shown above:
+
+::
+
+   instruction : INAME arguments code
+               | error
+
+   arguments : '(' ')'
+             | '(' argument_list ')';
+
+   argument_list : argument_decl ',' argument_list
+                 | argument_decl
+
+   argument_decl : REG
+                 | PRED
+                 | IN REG
+                 | IN PRED
+                 | IMM
+                 | var
+                 ;
+
+   code        : '{' statements '}'
+
+   statements  : statements statement
+               | statement
+
+   statement   : control_statement
+               | var_decl ';'
+               | rvalue ';'
+               | code_block
+               | ';'
+
+   code_block  : '{' statements '}'
+               | '{' '}'
+
+With this initial portion of the grammar we are defining the instruction, =
its
+arguments, and its statements. Each argument is defined by the
+``argument_decl`` rule, and can be one of the following:
+
+::
+
+    Description                  Example
+    ----------------------------------------
+    output register              RsV
+    output predicate register    P0
+    input register               in RsV
+    input predicate register     in P0
+    immediate value              1234
+    local variable               EA
+
+Note that the only local variable allowed to be used as an argument is the
+effective address ``EA``. Similarly, each statement can be a
+``control_statement``, a variable declaration such as ``int a;``, a code
+block, which is just a bracket-enclosed list of statements, a ``';'``,
+which is a ``nop`` instruction, or an ``rvalue ';'``.
+
+Expressions
+~~~~~~~~~~~
+
+Allowed in the input code are C language expressions with a few exceptions
+to simplify parsing. For instance, variable names such as ``RdV``, ``RssV`=
`,
+``PdV``, ``CsV``, and other idiomatic register names from Hexagon, are
+reserved specifically for register arguments. These arguments then map to
+``TCGv_i32`` or ``TCGv_i64`` depending on the register size. Similarly, ``=
UiV``,
+``riV``, etc. refer to immediate arguments and will map to C integers.
+
+Also, as mentioned earlier, the names ``PC``, ``SP``, ``FP``, etc. are use=
d to
+refer to Hexagon registers such as the program counter, stack pointer, and=
 frame
+pointer seen here. Writes to these registers then correspond to assignments
+``PC =3D ...``, and reads correspond to uses of the variable ``PC``.
+
+Another such exception is the selective expansion of macros present in
+``macros.h``. As an example, consider the ``fABS`` macro, which in plain C
+is defined as
+
+::
+
+    #define fABS(A) (((A) < 0) ? (-(A)) : (A))
+
+and returns the absolute value of the argument ``A``. This macro is not
+included in ``idef-parser/macros.inc`` and as such is not expanded, but
+kept as a "call" ``fABS(...)``. The reason is that ``fABS`` is easier to
+match and map to ``tcg_gen_abs_<width>`` than the full ternary expression
+above. Many macros in ``macros.h`` are kept unexpanded to aid parsing, as
+seen above; for more information, see ``idef-parser/idef-parser.lex``.
+
+Finally, in mapping these input expressions to tinycode generators,
+idef-parser tries to perform as much as possible in plain C, for example
+by performing binary operations in C instead of tinycode generators, thus
+effectively constant folding the expression.
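+
+As a rough illustration of this constant folding, the standalone sketch
+below prints the code a generator could emit for an addition. The
+``Operand`` struct and ``emit_add`` function are hypothetical and not part
+of idef-parser; ``tcg_gen_add_i32`` and ``tcg_gen_addi_i32`` are existing
+TCG generators, but the exact code idef-parser emits may differ.
+
+.. code:: c
+
+   #include <stdio.h>
+
+   /* Hypothetical operand: an immediate or a named TCGv register. */
+   typedef struct {
+       int is_imm;
+       long imm;
+       const char *name;
+   } Operand;
+
+   /* Print the code to emit for a + b, folding when both are known. */
+   static void emit_add(Operand a, Operand b)
+   {
+       if (a.is_imm && b.is_imm) {
+           /* Both values known at translation time: add in plain C. */
+           printf("int32_t res =3D %ld;\n", a.imm + b.imm);
+       } else if (b.is_imm) {
+           printf("tcg_gen_addi_i32(res, %s, %ld);\n", a.name, b.imm);
+       } else if (a.is_imm) {
+           printf("tcg_gen_addi_i32(res, %s, %ld);\n", b.name, a.imm);
+       } else {
+           printf("tcg_gen_add_i32(res, %s, %s);\n", a.name, b.name);
+       }
+   }
+
+   int main(void)
+   {
+       Operand two =3D { 1, 2, NULL };
+       Operand three =3D { 1, 3, NULL };
+       Operand rs =3D { 0, 0, "RsV" };
+
+       emit_add(two, three); /* folded: no tinycode generator call */
+       emit_add(rs, three);  /* register plus immediate */
+       return 0;
+   }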
+
+Variables and Variable Declarations
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Similarly to C, variables in the input code must be explicitly declared, s=
uch as
+``int var1;`` which declares an uninitialized variable ``var1``. Initializ=
ation
+``int var2 =3D 0;`` is also allowed and behaves as expected. In tinycode
+generators the previous declarations are mapped to
+
+::
+
+    int var1;           ->      TCGv_i32 var1 =3D tcg_temp_local_new_i32();
+
+    int var2 =3D 0;       ->      TCGv_i32 var2 =3D tcg_temp_local_new_i32=
();
+                                tcg_gen_movi_i32(var2, ((int64_t) 0ULL));
+
+which are later automatically freed at the end of the function they're dec=
lared
+in. Contrary to C, we only allow variables to be declared with an integer =
type
+specified in the following table (without permutation of keywords)
+
+::
+
+    type                        bit-width    signedness
+    ----------------------------------------------------------
+    int                         32           signed
+    signed
+    signed int
+
+    unsigned                    32           unsigned
+    unsigned int
+
+    long                        64           signed
+    long int
+    signed long
+    signed long int
+
+    unsigned long               64           unsigned
+    unsigned long int
+
+    long long                   64           signed
+    long long int
+    signed long long
+    signed long long int
+
+    unsigned long long          64           unsigned
+    unsigned long long int
+
+    size[1,2,4,8][s,u]_t        8-64         signed or unsigned
+
+In idef-parser, variable names are matched by a generic ``VARID`` token,
+which will feature the variable name as a decoration. For a variable decla=
ration
+idef-parser calls ``gen_varid_allocate`` with the ``VARID`` token to save =
the
+name, size, and bit width of the newly declared variable. In addition, this
+function also ensures that variables aren't declared multiple times, and p=
rints
+an error message if that is the case. Upon use of a variable, the ``VARID=
``
+token is used to look up the size and bit width of the variable.
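+
+A minimal sketch of this bookkeeping is shown below. The ``Var`` struct,
+the fixed-size table, ``varid_lookup`` and ``varid_allocate`` are
+hypothetical simplifications, not the actual ``gen_varid_allocate``
+implementation.
+
+.. code:: c
+
+   #include <stdio.h>
+   #include <string.h>
+
+   #define MAX_VARS 16
+
+   typedef struct {
+       const char *name;
+       unsigned bit_width;
+       int is_signed;
+   } Var;
+
+   static Var vars[MAX_VARS];
+   static int n_vars;
+
+   /* Return the entry for name, or NULL if it was never declared. */
+   static Var *varid_lookup(const char *name)
+   {
+       for (int i =3D 0; i < n_vars; i++) {
+           if (!strcmp(vars[i].name, name)) {
+               return &vars[i];
+           }
+       }
+       return NULL;
+   }
+
+   /* Record a declaration; return -1 on redeclaration or full table. */
+   static int varid_allocate(const char *name, unsigned width, int sign)
+   {
+       if (varid_lookup(name)) {
+           fprintf(stderr, "error: '%s' declared twice\n", name);
+           return -1;
+       }
+       if (n_vars < MAX_VARS) {
+           vars[n_vars++] =3D (Var){ name, width, sign };
+           return 0;
+       }
+       return -1;
+   }
+
+   int main(void)
+   {
+       varid_allocate("var1", 32, 1);
+       varid_allocate("var1", 64, 0); /* rejected: already declared */
+       printf("var1 is %u bits wide\n", varid_lookup("var1")->bit_width);
+       return 0;
+   }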
+
+Type System
+~~~~~~~~~~~
+
+idef-parser features a simple type system which is used to correctly imple=
ment
+the signedness and bit width of the operations.
+
+The type of each ``rvalue`` is determined by two attributes: its bit width
+(``unsigned bit_width``) and its signedness (``HexSignedness signedness``).
+
+For each operation, the type of the ``rvalue``\ s influences the way in
+which the operands are handled and emitted. For example, a right shift
+between signed operands will be an arithmetic shift, while one between
+unsigned operands will be a logical shift. If one of the two operands is
+signed, and the other is unsigned, the operation will be signed.
+
+The bit width also influences the outcome of the operations. In
+particular, while the input language features a fine-grained type system,
+with types of 8, 16, 32, 64 (and more for vector instructions) bits, the
+tinycode only features 32 and 64 bit widths. We propagate the fine-grained
+type as far as possible, until the value has to be used inside an
+operation between ``rvalue``\ s; in that case, if one of the two operands
+is wider than 32 bits, we promote the whole operation to 64 bits, taking
+care of properly extending the two operands. Fortunately, the most
+critical instructions already feature explicit casts and zero/sign
+extensions which are properly propagated down to our parser.
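+
+The promotion and shift rules above can be sketched in standalone C as
+follows. The ``Type`` struct and the two helper functions are hypothetical
+names chosen for illustration; ``tcg_gen_sar_*`` and ``tcg_gen_shr_*`` are
+the existing arithmetic and logical right shift generators.
+
+.. code:: c
+
+   #include <stdio.h>
+
+   typedef struct {
+       unsigned bit_width; /* 8, 16, 32 or 64 */
+       int is_signed;
+   } Type;
+
+   /* Tinycode has only 32 and 64 bit values: widen if either is > 32. */
+   static unsigned op_width(Type a, Type b)
+   {
+       return (a.bit_width > 32 || b.bit_width > 32) ? 64 : 32;
+   }
+
+   /* Signed values shift arithmetically, unsigned ones logically. */
+   static const char *shr_generator(Type value, unsigned width)
+   {
+       if (width > 32) {
+           return value.is_signed ? "tcg_gen_sar_i64" : "tcg_gen_shr_i64";
+       }
+       return value.is_signed ? "tcg_gen_sar_i32" : "tcg_gen_shr_i32";
+   }
+
+   int main(void)
+   {
+       Type s16 =3D { 16, 1 };
+       Type u64 =3D { 64, 0 };
+
+       printf("%u\n", op_width(s16, u64));
+       printf("%s\n", shr_generator(s16, op_width(s16, s16)));
+       return 0;
+   }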
+
+The combination of ``rvalue``\ s is handled through the use of the
+``gen_bin_op`` and ``gen_bin_cmp`` helper functions. These two functions h=
andle
+the appropriate compile-time or run-time emission of operations to perform=
 the
+required computation.
+
+Control Statements
+~~~~~~~~~~~~~~~~~~
+
+``control_statement``\ s are all the statements which modify the order of
+execution of the generated code according to input parameters. They are ex=
panded
+by the following grammar rule:
+
+::
+
+   control_statement : frame_check
+                     | cancel_statement
+                     | if_statement
+                     | for_statement
+                     | fpart1_statement
+
+``if_statement``\ s require the emission of labels and branch instructions=
 which
+effectively perform conditional jumps (``tcg_gen_brcondi``) according to t=
he
+value of an expression. Note that the tinycode generators we produce for
+conditional statements do not perfectly mirror C; for instance, we do not
+reproduce short-circuiting of the ``&&`` operator, and use of the ``||``
+operator is disallowed. All the predicated instructions, and in general al=
l the
+instructions where there could be alternative values assigned to an ``lval=
ue``,
+like C-style ternary expressions:
+
+::
+
+   rvalue            : rvalue QMARK rvalue COLON rvalue
+
+are handled using the conditional move tinycode instruction
+(``tcg_gen_movcond``), which avoids the additional complexity of managing =
labels
+and jumps.
+
+``for`` loops, instead, always iterate over immediate values, so their
+iteration ranges are always known at compile time, and we implement them
+by emitting plain C ``for`` loops. This is possible because the loops are
+executed in the QEMU code, which leads to the unrolling of the loop: the
+tinycode generator instructions are executed multiple times, and the
+respective generated tinycode represents the unrolled execution of the
+loop.
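+
+The effect can be illustrated with a standalone sketch, in which ``printf``
+stands in for the emission of one tinycode operation. The loop bound and
+the ``acc`` operand are made up for the example.
+
+.. code:: c
+
+   #include <stdio.h>
+
+   int main(void)
+   {
+       /* The bound is an immediate, known when this C code runs. */
+       for (int i =3D 0; i < 4; i++) {
+           /* One operation is emitted per iteration, and no branch. */
+           printf("add_i32 acc,acc,$0x%x\n", i);
+       }
+       return 0;
+   }
+
+Running the program prints four operations in a row, which is the unrolled
+form of the loop.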
+
+Parsing Context
+~~~~~~~~~~~~~~~
+
+All the helper functions in ``idef-parser.y`` carry two fixed parameters, =
which
+are the parsing context ``c`` and the ``YYLLOC`` location information. The
+context is explicitly passed to all the functions because the parser we ge=
nerate
+is a reentrant one, meaning that it does not have any global variable, and
+therefore the instruction compilation could easily be parallelized in the
+future. Finally, for each rule we propagate information about the =
location of the
+involved tokens to generate pretty error reporting, able to highlight the
+portion of the input code which generated each error.
+
+Debugging
+---------
+
+Developing the idef-parser can lead to two types of errors: compile-time e=
rrors
+and parsing errors.
+
+Compile-time errors in Bison-generated parsers are usually due to conflict=
s in
+the described grammar. Conflicts prevent the grammar from producing a =
unique
+derivation tree, thus must be resolved (except for the dangling else =
problem,
+which is marked as expected through the ``%expect 1`` Bison option).
+
+For solving conflicts you need a basic understanding of `shift-reduce conf=
licts
+<https://www.gnu.org/software/Bison/manual/html_node/Shift_002fReduce.html=
>`__
+and `reduce-reduce conflicts
+<https://www.gnu.org/software/Bison/manual/html_node/Reduce_002fReduce.htm=
l>`__,
+then, if you are using a Bison version > 3.7.1, you can ask Bison to =
generate
+some counterexamples which highlight ambiguous derivations by passing the
+``-Wcex`` option to Bison. In general shift/reduce conflicts are solved by
+redesigning the grammar in an unambiguous way or by setting the token prio=
rity
+correctly, while reduce/reduce conflicts are solved by redesigning the
+affected part of the grammar.
+
+Run-time errors can be divided between lexing and parsing errors. Lexing
+errors are hard to detect, since the ``var`` token will catch everything
+which is not caught by other tokens, but easy to fix, because most of the
+time a simple regex edit will be enough.
+
+idef-parser features a fancy parsing error reporting scheme, which for each
+parsing error reports the fragment of the input text which was involved in=
 the
+parsing rule that generated an error.
+
+Implementing an instruction goes through several sequential steps; here ar=
e some
+suggestions to make each instruction proceed to the next step.
+
+-  not-emitted
+
+   Means that the parsing of the input code relative to that instruction f=
ailed.
+   This could be due to a lexical error or to some mismatch between the or=
der of
+   valid tokens and a parser rule. You should check that tokens are correc=
tly
+   identified and mapped, and that there is a rule matching the token sequ=
ence
+   that you need to parse.
+
+-  emitted
+
+   This instruction class contains all the instructions which are emitted =
but
+   fail to compile when included in QEMU. The compilation errors are shown=
 by
+   the QEMU building process and will lead to fixing the bug. The most
+   common errors are mismatched parameters for tinycode generator
+   functions, which boil down to errors in the idef-parser type system.
+
+-  compiled
+
+   These instructions generate valid tinycode generator code, which fails
+   the QEMU or the harness tests. These cases must be handled manually by
+   looking into the failing tests and looking at the generated tinycode
+   generator instructions and at the generated tinycode itself. Tip: handl=
e the
+   failing harness tests first, because they usually feature only a single
+   instruction, and thus require less execution trace navigation. If a
+   multi-threaded test fails, fixing all the other tests will be the =
easier
+   option, hoping that the multi-threaded one will be indirectly fixed.
+
+   An example of debugging this type of failure is provided in the followi=
ng
+   section.
+
+-  tests-passed
+
+   This is the final goal for each instruction, meaning that the instructi=
on
+   passes the test suite.
+
+Another approach to fixing QEMU system tests, where many instructions
+might fail, is to compare the execution trace of your implementation with
+the reference implementations already present in QEMU. To do so, obtain a
+QEMU build where the instruction passes the test, and run it as follows:
+
+::
+
+   sudo unshare -p sudo -u <USER> bash -c \
+   'env -i <qemu-hexagon full path> -d cpu <TEST>'
+
+Do the same for your implementation; the generated execution traces will =
be
+inherently aligned and can be inspected for behavioral differences using t=
he
+``diff`` tool.
+
+Example of debugging erroneous tinycode generator code
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The goal of this section is to provide a complete example of debugging
+incorrectly emitted tinycode generator for a single instruction.
+
+Let's first introduce a bug in the tinycode generator of the ``A2_add``
+instruction,
+
+::
+
+    void emit_A2_add(DisasContext *ctx, Insn *insn, Packet *pkt, TCGv_i32 =
RdV,
+                     TCGv_i32 RsV, TCGv_i32 RtV)
+    /*  { RdV=3DRsV+RtV;} */
+    {
+        TCGv_i32 tmp_0 =3D tcg_temp_new_i32();
+        tcg_gen_add_i32(tmp_0, RsV, RsV);
+        tcg_gen_mov_i32(RdV, tmp_0);
+        tcg_temp_free_i32(tmp_0);
+    }
+
+Here the bug, albeit hard to spot, is in ``tcg_gen_add_i32(tmp_0, RsV, RsV=
);``
+where we compute ``RsV + RsV`` instead of ``RsV + RtV``, as would be expec=
ted.
+This particular bug is a bit tricky to pinpoint when debugging, since the
+``A2_add`` instruction is so ubiquitous. As a result, pretty much all test=
s will
+fail and therefore not provide a lot of information about the bug.
+
+For example, let's run the ``check-tcg`` tests
+
+::
+
+    make check-tcg TIMEOUT=3D1200 \
+                   DOCKER_IMAGE=3Ddebian-hexagon-cross \
+                   ENGINE=3Dpodman V=3D1 \
+                   DOCKER_CROSS_CC_GUEST=3Dhexagon-unknown-linux-musl-clang
+
+In the output, we find a failure in the very first test case ``float_convs=
``
+due to a segmentation fault. Similarly, all harness and libc tests will fa=
il as
+well. At this point we have no clue where the actual bug lies, and need to=
 start
+ruling out instructions. As such, a good starting point is to use the =
debug
+options ``-d in_asm,cpu`` of QEMU to inspect the Hexagon instructions bein=
g run,
+alongside the CPU state. We additionally need a working version of the emu=
lator
+to compare our buggy CPU state against, running
+
+::
+
+    meson configure -Dhexagon_idef_parser_enabled=3Dfalse
+
+will disable the idef-parser for all instructions and fall back on manual
+tinycode generator overrides, or on helper function implementations. Recom=
piling
+gives us ``qemu-hexagon`` which passes all tests. If ``qemu-hexagon-buggy`=
` is
+our binary with the incorrect tinycode generators, we can compare the CPU =
state
+between the two versions
+
+::
+
+    ./qemu-hexagon-buggy -d in_asm,cpu float_convs &> out_buggy
+    ./qemu-hexagon       -d in_asm,cpu float_convs &> out_working
+
+Looking at ``diff -u out_buggy out_working`` shows us that the CPU state b=
egins
+to diverge on line 141, with an incorrect value in the ``R1`` register
+
+::
+
+    @@ -138,7 +138,7 @@
+
+     General Purpose Registers =3D {
+       r0 =3D 0x4100f9c0
+    -  r1 =3D 0x00042108
+    +  r1 =3D 0x00000000
+       r2 =3D 0x00021084
+       r3 =3D 0x00000000
+       r4 =3D 0x00000000
+
+If we also look into ``out_buggy`` directly we can inspect the input assem=
bly
+which caused the incorrect CPU state. Around line 141 we find
+
+::
+
+    116 |  ----------------
+    117 |  IN: _start_c
+    118 |  0x000210b0:  0xa09dc002	{	allocframe(R29,#0x10):raw }
+    ... |  ...
+    137 |  0x000210fc:  0x5a00c4aa	{	call PC+2388 }
+    138 |
+    139 |  General Purpose Registers =3D {
+    140 |    r0 =3D 0x4100fa70
+    141 |    r1 =3D 0x00042108
+    142 |    r2 =3D 0x00021084
+    143 |    r3 =3D 0x00000000
+
+Importantly, we see some Hexagon assembly followed by a dump of the CPU st=
ate.
+The CPU state is, however, dumped before the input assembly above is run.
+As such, we are actually interested in the instructions run before this.
+
+Scrolling up a bit, we find
+
+::
+
+    54 |  ----------------
+    55 |  IN: _start
+    56 |  0x00021088:  0x6a09c002	{	R2 =3D C9/pc }
+    57 |  0x0002108c:  0xbfe2ff82	{	R2 =3D add(R2,#0xfffffffc) }
+    58 |  0x00021090:  0x9182c001	{	R1 =3D memw(R2+#0x0) }
+    59 |  0x00021094:  0xf302c101	{	R1 =3D add(R2,R1) }
+    60 |  0x00021098:  0x7800c01e	{	R30 =3D #0x0 }
+    61 |  0x0002109c:  0x707dc000	{	R0 =3D R29 }
+    62 |  0x000210a0:  0x763dfe1d	{	R29 =3D and(R29,#0xfffffff0) }
+    63 |  0x000210a4:  0xa79dfdfe	{	memw(R29+#0xfffffff8) =3D R29 }
+    64 |  0x000210a8:  0xbffdff1d	{	R29 =3D add(R29,#0xfffffff8) }
+    65 |  0x000210ac:  0x5a00c002	{	call PC+4 }
+    66 |
+    67 |  General Purpose Registers =3D {
+    68 |    r0 =3D 0x00000000
+    69 |    r1 =3D 0x00000000
+    70 |    r2 =3D 0x00000000
+    71 |    r3 =3D 0x00000000
+
+Remember, the instructions on lines 56-65 are run on the CPU state shown
+below the instructions, and as the CPU state has not diverged at this
+point, we know the starting state is accurate. The bug must then lie
+within the instructions shown here. Next we may notice that ``R1`` is
+only touched by lines 58 and 59, that is by
+
+::
+
+    58 |  0x00021090:  0x9182c001	{	R1 =3D memw(R2+#0x0) }
+    59 |  0x00021094:  0xf302c101	{	R1 =3D add(R2,R1) }
+
+Therefore, we are dealing with either an incorrect load instruction
+``R1 =3D memw(R2+#0x0)`` or an incorrect add ``R1 =3D add(R2,R1)``. At
+this point it might be easy enough to go directly to the emitted code
+for the instructions mentioned and look for bugs, but we could also run
+``./qemu-hexagon -d op,in_asm float_convs``, where we find the following
+tinycode for the Hexagon ``add`` instruction
+
+::
+
+   ---- 00021094
+   mov_i32 pkt_has_store_s1,$0x0
+   add_i32 tmp0,r2,r2
+   mov_i32 loc2,tmp0
+   mov_i32 new_r1,loc2
+   mov_i32 r1,new_r1
+
+Here we have finally located our bug: ``add_i32 tmp0,r2,r2`` adds ``r2``
+to itself instead of adding ``r2`` and ``r1``.
+
+Limitations and Future Development
+----------------------------------
+
+The main limitation of the current parser comes from the syntax-driven
+nature of Bison-generated parsers. As a consequence, code can only be
+generated in the order in which the rules are evaluated, with no way to
+backtrack and alter code that has already been generated.
+
+An example limitation is highlighted by this statement of the input langua=
ge:
+
+::
+
+   { (PsV=3D=3D0xff) ? (PdV=3D0xff) : (PdV=3D0x00); }
+
+This ternary assignment, when written in this form, requires us to emit
+proper control flow statements that jump to either the first or the
+second code block. Implementing this is extremely convoluted because, by
+the time the ternary assignment is matched, the code evaluating the two
+assignments has already been generated.
+
+Instead we pre-process that statement, making it become:
+
+::
+
+   { PdV =3D ((PsV=3D=3D0xff)) ? 0xff : 0x00; }
+
+Which can be easily matched by the following parser rules:
+
+::
+
+   statement             | rvalue ';'
+
+   rvalue                : rvalue QMARK rvalue COLON rvalue
+                         | rvalue EQ rvalue
+                         | LPAR rvalue RPAR
+                         | assign_statement
+                         | IMM
+
+   assign_statement      : pred ASSIGN rvalue
+
+Another example that highlights the limitation of the flex/bison parser
+can be found even in the add operation we already saw:
+
+::
+
+   TCGv_i32 tmp_0 =3D tcg_temp_new_i32();
+   tcg_gen_add_i32(tmp_0, RsV, RtV);
+   tcg_gen_mov_i32(RdV, tmp_0);
+
+The fact that we cannot directly use ``RdV`` as the destination of the
+sum is a consequence of the syntax-driven nature of the parser. In fact,
+when we parse the assignment, the ``rvalue`` token representing the sum
+has already been reduced, so its code has been emitted and cannot be
+changed. We rely on QEMU to optimize our code, eliminating the useless
+move operations and the associated temporaries.
+
+A possible improvement of the parser is support for vector and floating
+point instructions. This will require extending the scanner and the
+parser, and partially redesigning the type system, so that vector
+semantics can be built on top of the available vector tinycode generator
+primitives.
+
+A more radical improvement would use the parser not to generate tinycode
+generator code directly, but to generate an intermediate representation
+such as LLVM IR, which in turn could be compiled using the clang TCG
+backend. That code could be optimized further, overcoming the
+limitations of syntax-driven parsing, and could lead to more optimized
+generated code.
--=20
2.34.1