From: Taylor Simpson
To: qemu-devel@nongnu.org
Cc: riku.voipio@iki.fi, richard.henderson@linaro.org, laurent@vivier.eu,
    Taylor Simpson, philmd@redhat.com, aleksandar.m.mail@gmail.com
Subject: [RFC PATCH v2 02/67] Hexagon README
Date: Fri, 28 Feb 2020 10:42:58 -0600
Message-Id: <1582908244-304-3-git-send-email-tsimpson@quicinc.com>
In-Reply-To: <1582908244-304-1-git-send-email-tsimpson@quicinc.com>
References: <1582908244-304-1-git-send-email-tsimpson@quicinc.com>

Gives an introduction and overview to the Hexagon target

Signed-off-by: Taylor Simpson
---
 target/hexagon/README | 296 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 296 insertions(+)
 create mode 100644 target/hexagon/README

diff --git a/target/hexagon/README b/target/hexagon/README
new file mode 100644
index 0000000..6de71e2
--- /dev/null
+++ b/target/hexagon/README
@@ -0,0 +1,296 @@
+Hexagon is Qualcomm's very long instruction word (VLIW) digital signal
+processor (DSP).  We also support Hexagon Vector eXtensions (HVX).  HVX
+is a wide vector coprocessor designed for high performance computer vision,
+image processing, machine learning, and other workloads.
+
+The following versions of the Hexagon core are supported
+    Scalar core: v67
+    https://developer.qualcomm.com/downloads/qualcomm-hexagon-v67-programmer-s-reference-manual
+    HVX extension: v66
+    https://developer.qualcomm.com/downloads/qualcomm-hexagon-v66-hvx-programmer-s-reference-manual
+
+We presented an overview of the project at the 2019 KVM Forum.
+    https://kvmforum2019.sched.com/event/Tmwc/qemu-hexagon-automatic-translation-of-the-isa-manual-pseudcode-to-tiny-code-instructions-of-a-vliw-architecture-niccolo-izzo-revng-taylor-simpson-qualcomm-innovation-center
+
+*** Tour of the code ***
+
+The qemu-hexagon implementation is a combination of qemu and the Hexagon
+architecture library (aka archlib).
+The three primary directories with Hexagon-specific code are
+
+    qemu/target/hexagon
+        This has all the instruction and packet semantics
+    qemu/target/hexagon/imported
+        These files are imported with very little modification from archlib
+        *.idef                  Instruction semantics definition
+        macros.def              Mapping of macros to instruction attributes
+        encode*.def             Encoding patterns for each instruction
+        iclass.def              Instruction class definitions used to determine
+                                legal VLIW slots for each instruction
+    qemu/linux-user/hexagon
+        Helpers for loading the ELF file and making Linux system calls,
+        signals, etc
+
+We start with a script that generates a qemu helper for each instruction.  This
+is a two step process.  The first step is to use the C preprocessor to expand
+macros inside the architecture definition files.  This is done in
+target/hexagon/semantics.c.  This step produces
+    /hexagon-linux-user/semantics_generated.pyinc.
+That file is consumed by the do_qemu.py script.  This script generates
+several files.  All of the generated files end in "_generated.*".  The
+primary file produced is
+    /hexagon-linux-user/qemu_def_generated.h
+
+Qemu helper functions have 3 parts
+    DEF_HELPER declaration indicates the signature of the helper
+    gen_helper_ will generate a TCG call to the helper function
+    The helper implementation
+
+In the qemu_def_generated.h file, there is a DEF_QEMU macro for each user-space
+instruction.  The file is included several times with DEF_QEMU defined
+differently, depending on the context.  The macro has five arguments
+    The instruction tag
+    The semantics_short code
+    DEF_HELPER declaration
+    Call to the helper
+    Helper implementation
+
+Here's an example of the A2_add instruction.
+    Instruction tag        A2_add
+    Assembly syntax        "Rd32=add(Rs32,Rt32)"
+    Instruction semantics  "{ RdV=RsV+RtV;}"
+
+By convention, the operands are identified by letter
+    RdV is the destination register
+    RsV, RtV are source registers
+
+The generator uses the operand naming conventions (see large comment in
+do_qemu.py) to determine the signature of the helper function.  Here is the
+result for A2_add from qemu_def_generated.h
+
+DEF_QEMU(A2_add,{ RdV=RsV+RtV;},
+#ifndef fWRAP_A2_add
+DEF_HELPER_3(A2_add, s32, env, s32, s32)
+#endif
+,
+{
+/* A2_add */
+DECL_RREG_d(RdV, RdN, 0, 0);
+DECL_RREG_s(RsV, RsN, 1, 0);
+DECL_RREG_t(RtV, RtN, 2, 0);
+READ_RREG_s(RsV, RsN);
+READ_RREG_t(RtV, RtN);
+fWRAP_A2_add(
+do {
+gen_helper_A2_add(RdV, cpu_env, RsV, RtV);
+} while (0),
+{ RdV=RsV+RtV;});
+WRITE_RREG_d(RdN, RdV);
+FREE_RREG_d(RdV);
+FREE_RREG_s(RsV);
+FREE_RREG_t(RtV);
+/* A2_add */
+},
+#ifndef fWRAP_A2_add
+int32_t HELPER(A2_add)(CPUHexagonState *env, int32_t RsV, int32_t RtV)
+{
+uint32_t slot = 4; slot = slot;
+int32_t RdV = 0;
+{ RdV=RsV+RtV;}
+COUNT_HELPER(A2_add);
+return RdV;
+}
+#endif
+)
+
+For each operand, there are macros for DECL, FREE, READ, WRITE.  These are
+defined in macros.h.  Note that we append the operand type to the macro name,
+which allows us to specialize the TCG code generated.  For read-only operands,
+DECL simply declares the TCGv variable (no need for tcg_temp_local_new()),
+and READ will assign from the TCGv corresponding to the GPR, and FREE doesn't
+have to do anything.  Also, note that the WRITE macros update the disassembly
+context to be processed when the packet commits (see "Packet Semantics" below).
+
+Note the fWRAP_A2_add macro around the gen_helper call.  Each instruction has a fWRAP_ macro that takes 2 arguments
+    gen_helper call
+    C semantics (aka short code)
+
+This allows the code generator to override the auto-generated code.  In some
+cases this is necessary for correct execution.
+We can also override for faster emulation.  For example, calling a helper
+for add is more expensive than generating a TCG add operation.
+
+The qemu_wrap_generated.h file contains a default fWRAP_ for each
+instruction.  The default is to invoke the gen_helper code.
+    #ifndef fWRAP_A2_add
+    #define fWRAP_A2_add(GENHLPR, SHORTCODE) GENHLPR
+    #endif
+
+The helper_overrides.h file has any overrides.  For example,
+    #define fWRAP_A2_add(GENHLPR, SHORTCODE) \
+        tcg_gen_add_tl(RdV, RsV, RtV)
+
+This file is included twice
+1) In genptr.c, it overrides the semantics of the desired instructions
+2) In helper.h, it prevents the generation of helpers for overridden
+   instructions.  Notice the #ifndef fWRAP_A2_add above.
+
+The instruction semantics C code relies heavily on macros.  In cases where
+the C semantics are specified only with macros, we can override the default
+with the short semantics option and #define the macros to generate TCG code.
+One example is Y2_dczeroa (dc == data cache, zero == zero out the cache line,
+a == address: zero out the data cache line at the given address):
+    Instruction tag        Y2_dczeroa
+    Assembly syntax        "dczeroa(Rs32)"
+    Instruction semantics  "{fEA_REG(RsV); fDCZEROA(EA);}"
+
+In helper_overrides.h, we use the shortcode
+#define fWRAP_Y2_dczeroa(GENHLPR, SHORTCODE) SHORTCODE
+
+In other cases, just a little bit of wrapper code needs to be written.
+    #define fWRAP_tmp(SHORTCODE) \
+    { \
+        TCGv tmp = tcg_temp_new(); \
+        SHORTCODE; \
+        tcg_temp_free(tmp); \
+    }
+
+For example, some load instructions use a temporary for address computation.
+The SL2_loadrd_sp instruction needs a temporary to hold the value of the stack
+pointer (r29)
+    Instruction tag        SL2_loadrd_sp
+    Assembly syntax        "Rdd8=memd(r29+#u5:3)"
+    Instruction semantics  "{fEA_RI(fREAD_SP(),uiV); fLOAD(1,8,u,EA,RddV);}"
+
+In helper_overrides.h you'll see
+    #define fWRAP_SL2_loadrd_sp(GENHLPR, SHORTCODE)      fWRAP_tmp(SHORTCODE)
+
+There are also cases where we brute force the TCG code generation.  The
+allocframe and deallocframe instructions are examples.  Other examples are
+instructions with multiple definitions.  These require special handling
+because qemu helpers can only return a single value.
+
+In addition to instruction semantics, we use a generator to create the decode
+tree.  This generation is also a two step process.  The first step is to run
+target/hexagon/gen_dectree_import.c to produce
+    /hexagon-linux-user/iset.py
+This file is imported by target/hexagon/dectree.py to produce
+    /hexagon-linux-user/dectree_generated.h
+
+*** Key Files ***
+
+cpu.h
+
+This file contains the definition of the CPUHexagonState struct.  It is the
+runtime information for each thread and contains state such as the GPR and
+predicate registers.
+
+macros.h
+mmvec/macros.h
+
+The Hexagon arch lib relies heavily on macros for the instruction semantics.
+This is a great advantage for qemu because we can override them for different
+purposes.  You will also notice there are sometimes two definitions of a macro.
+The QEMU_GENERATE variable determines whether we want the macro to generate TCG
+code.  If QEMU_GENERATE is not defined, we want the macro to generate vanilla
+C code that will work in the helper implementation.
+
+translate.c
+
+The functions in this file generate TCG code for a translation block.
+Some important functions in this file are
+
+    gen_start_packet - initialize the data structures for packet semantics
+    gen_commit_packet - commit the register writes, stores, etc for a packet
+    decode_packet - disassemble a packet and generate code
+
+genptr.c
+genptr_helpers.h
+helper_overrides.h
+
+These files create a function for each instruction.  They are mostly composed
+of fWRAP_ definitions followed by including qemu_def_generated.h.  The
+genptr_helpers.h file contains helper functions that are invoked by the macros
+in helper_overrides.h and macros.h
+
+op_helper.c
+
+This file contains the implementations of all the helpers.  There are a few
+general purpose helpers, but most of them are generated by including
+qemu_def_generated.h.  There are also several helpers used for debugging.
+
+
+*** Packet Semantics ***
+
+VLIW packet semantics differ from serial semantics in that all input operands
+are read, then the operations are performed, then all the results are written.
+For example, this packet performs a swap of registers r0 and r1
+    { r0 = r1; r1 = r0 }
+Note that the result is different if the instructions are executed serially.
+
+Packet semantics dictate that we defer any changes of state until the entire
+packet is committed.  We record the results of each instruction in a side data
+structure, and update the visible processor state when we commit the packet.
+
+The data structures are divided between the runtime state and the translation
+context.
+
+During the TCG generation (see translate.[ch]), we use the DisasContext to
+track what needs to be done during packet commit.
+Here are the relevant fields
+
+    ctx_reg_log            list of registers written
+    ctx_reg_log_idx        index into ctx_reg_log
+    ctx_pred_log           list of predicates written
+    ctx_pred_log_idx       index into ctx_pred_log
+    ctx_store_width        width of stores (indexed by slot)
+
+During runtime, the following fields in CPUHexagonState (see cpu.h) are used
+
+    new_value              new value of a given register
+    reg_written            boolean indicating if register was written
+    new_pred_value         new value of a predicate register
+    pred_written           boolean indicating if predicate was written
+    mem_log_stores         record of the stores (indexed by slot)
+
+For Hexagon Vector eXtensions (HVX), the following fields are used
+
+    future_VRegs
+    tmp_VRegs
+    future_ZRegs
+    ZRegs_updated
+    VRegs_updated_tmp
+    VRegs_updated
+    VRegs_select
+
+*** Debugging ***
+
+You can turn on a lot of debugging by changing the HEX_DEBUG macro to 1 in
+internal.h.  This will stream a lot of information as it generates TCG and
+executes the code.
+
+To track down nasty issues with Hexagon->TCG generation, we compare the
+execution results with actual hardware running on a Hexagon Linux target.
+Run qemu with the "-d cpu" option.  Then, we can diff the results and figure
+out where qemu and hardware behave differently.
+
+The stacks are located at different locations.  We handle this by changing
+env->stack_adjust in translate.c.  First, set this to zero and run qemu.
+Then, change env->stack_adjust to the difference between the two stack
+locations.  Then rebuild qemu and run again.  That will produce a very
+clean diff.
+
+Here are some handy places to set breakpoints
+
+    At the call to gen_start_packet for a given PC (note that the line number
+        might change in the future)
+        br translate.c:602 if ctx->base.pc_next == 0xdeadbeef
+    The helper function for each instruction is named helper_, so here's
+        an example that will set a breakpoint at the start
+        br helper_V6_vgathermh
+    If you have the HEX_DEBUG macro set, the following will be useful
+        At the start of execution of a packet for a given PC
+            br helper_debug_start_packet if env->gpr[41] == 0xdeadbeef
+        At the end of execution of a packet for a given PC
+            br helper_debug_commit_end if env->this_PC == 0xdeadbeef
+
-- 
2.7.4