:p
atchew
Login
The following changes since commit 2417cbd5916d043e0c56408221fbe9935d0bc8da: Merge tag 'ak-pull-request' of https://gitlab.com/berrange/qemu into staging (2022-05-26 07:00:04 -0700) are available in the Git repository at: https://gitlab.com/danielhb/qemu.git tags/pull-ppc-20220526 for you to fetch changes up to 96c343cc774b52b010e464a219d13f8e55e1003f: linux-user: Add PowerPC ISA 3.1 and MMA to hwcap (2022-05-26 17:11:33 -0300) ---------------------------------------------------------------- ppc patch queue for 2022-05-26: Most of the changes are enhancements/fixes made in TCG ppc emulation code. Several bugs fixes were made across the board as well. Changes include: - tcg and target/ppc: VSX MMA implementation, fixes in helper declarations to use call flags, memory ordering, tlbie and others - pseries: fixed stdout-path setting with -machine graphics=off - pseries: allow use of elf parser for kernel address - other assorted fixes and improvements ---------------------------------------------------------------- Alexey Kardashevskiy (2): spapr: Use address from elf parser for kernel address spapr/docs: Add a few words about x-vof Bernhard Beschow (1): hw/ppc/e500: Remove unused BINARY_DEVICE_TREE_FILE Frederic Barrat (1): pnv/xive2: Don't overwrite PC registers when writing TCTXT registers Joel Stanley (1): linux-user: Add PowerPC ISA 3.1 and MMA to hwcap Leandro Lupori (1): target/ppc: Fix tlbie Lucas Mateus Castro (alqotel) (7): target/ppc: Implement xxm[tf]acc and xxsetaccz target/ppc: Implemented xvi*ger* instructions target/ppc: Implemented pmxvi*ger* instructions target/ppc: Implemented xvf*ger* target/ppc: Implemented xvf16ger* target/ppc: Implemented pmxvf*ger* target/ppc: Implemented [pm]xvbf16ger2* Matheus Ferst (12): target/ppc: declare darn32/darn64 helpers with TCG_CALL_NO_RWG target/ppc: use TCG_CALL_NO_RWG in vector helpers without env target/ppc: use TCG_CALL_NO_RWG in BCD helpers target/ppc: use TCG_CALL_NO_RWG in VSX helpers without env target/ppc: Use TCG_CALL_NO_RWG_SE in fsel helper target/ppc: declare xscvspdpn helper with call flags target/ppc: declare xvxsigsp helper with call flags target/ppc: declare xxextractuw and xxinsertw helpers with call flags target/ppc: introduce do_va_helper target/ppc: declare vmsum[um]bm helpers with call flags target/ppc: declare vmsumuh[ms] helper with call flags target/ppc: declare vmsumsh[ms] helper with call flags Murilo Opsfelder Araujo (1): mos6522: fix linking error when CONFIG_MOS6522 is not set Nicholas Piggin (4): target/ppc: Fix eieio memory ordering semantics tcg/ppc: ST_ST memory ordering is not provided with eieio tcg/ppc: Optimize memory ordering generation with lwsync target/ppc: Implement lwsync with weaker memory ordering Paolo Bonzini (1): pseries: allow setting stdout-path even on machines with a VGA Víctor Colombo (3): target/ppc: Fix FPSCR.FI bit being cleared when it shouldn't target/ppc: Fix FPSCR.FI changing in float_overflow_excp() target/ppc: Rename sfprf to sfifprf where it's also used as set fi flag docs/system/ppc/pseries.rst | 29 ++ hmp-commands-info.hx | 2 +- hw/intc/pnv_xive2.c | 3 - hw/ppc/e500.c | 1 - hw/ppc/spapr.c | 25 +- include/hw/ppc/spapr.h | 2 +- linux-user/elfload.c | 4 + monitor/misc.c | 3 + target/ppc/cpu.h | 19 +- target/ppc/cpu_init.c | 13 +- target/ppc/fpu_helper.c | 571 ++++++++++++++++++++++++++++-------- target/ppc/helper.h | 259 +++++++++------- target/ppc/helper_regs.c | 2 +- target/ppc/insn32.decode | 80 ++++- target/ppc/insn64.decode | 79 +++++ target/ppc/int_helper.c | 152 +++++++++- target/ppc/internal.h | 15 + target/ppc/machine.c | 3 +- target/ppc/translate.c | 35 ++- target/ppc/translate/fp-impl.c.inc | 30 +- target/ppc/translate/fp-ops.c.inc | 1 - target/ppc/translate/vmx-impl.c.inc | 54 ++-- target/ppc/translate/vmx-ops.c.inc | 4 - target/ppc/translate/vsx-impl.c.inc | 237 ++++++++++++--- target/ppc/translate/vsx-ops.c.inc | 4 - tcg/ppc/tcg-target.c.inc | 12 +- 26 files changed, 1286 insertions(+), 353 deletions(-)
From: Paolo Bonzini <pbonzini@redhat.com> -machine graphics=off is the usual way to tell the firmware or the OS that the user wants a serial console. The pseries machine however does not support this, and never adds the stdout-path node to the device tree if a VGA device is provided. This is in addition to the other magic behavior of VGA devices, which is to add a keyboard and mouse to the default USB bus. Split spapr->has_graphics in two variables so that the two behaviors can be separated: the USB devices remains the same, but the stdout-path is added even with "-device VGA -machine graphics=off". Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20220507054826.124936-1-pbonzini@redhat.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- hw/ppc/spapr.c | 12 ++++++++---- include/hw/ppc/spapr.h | 2 +- 2 files changed, 9 insertions(+), 5 deletions(-) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index XXXXXXX..XXXXXXX 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -XXX,XX +XXX,XX @@ static void spapr_dt_chosen(SpaprMachineState *spapr, void *fdt, bool reset) _FDT(fdt_setprop_string(fdt, chosen, "qemu,boot-device", boot_device)); } - if (!spapr->has_graphics && stdout_path) { + if (spapr->want_stdout_path && stdout_path) { /* * "linux,stdout-path" and "stdout" properties are * deprecated by linux kernel. New platforms should only @@ -XXX,XX +XXX,XX @@ static void spapr_machine_init(MachineState *machine) const char *kernel_filename = machine->kernel_filename; const char *initrd_filename = machine->initrd_filename; PCIHostState *phb; + bool has_vga; int i; MemoryRegion *sysmem = get_system_memory(); long load_limit, fw_size; @@ -XXX,XX +XXX,XX @@ static void spapr_machine_init(MachineState *machine) } /* Graphics */ - if (spapr_vga_init(phb->bus, &error_fatal)) { - spapr->has_graphics = true; + has_vga = spapr_vga_init(phb->bus, &error_fatal); + if (has_vga) { + spapr->want_stdout_path = !machine->enable_graphics; machine->usb |= defaults_enabled() && !machine->usb_disabled; + } else { + spapr->want_stdout_path = true; } if (machine->usb) { @@ -XXX,XX +XXX,XX @@ static void spapr_machine_init(MachineState *machine) pci_create_simple(phb->bus, -1, "nec-usb-xhci"); } - if (spapr->has_graphics) { + if (has_vga) { USBBus *usb_bus = usb_bus_find(-1); usb_create_simple(usb_bus, "usb-kbd"); diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h index XXXXXXX..XXXXXXX 100644 --- a/include/hw/ppc/spapr.h +++ b/include/hw/ppc/spapr.h @@ -XXX,XX +XXX,XX @@ struct SpaprMachineState { Vof *vof; uint64_t rtc_offset; /* Now used only during incoming migration */ struct PPCTimebase tb; - bool has_graphics; + bool want_stdout_path; uint32_t vsmt; /* Virtual SMT mode (KVM's "core stride") */ /* Nested HV support (TCG only) */ -- 2.36.1
From: Bernhard Beschow <shentey@gmail.com> Commit 28290f37e20cda27574f15be9e9499493e3d0fe8 'PPC: E500: Generate device tree on reset' improved device tree generation and made BINARY_DEVICE_TREE_FILE obsolete. Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Message-Id: <20220505161805.11116-8-shentey@gmail.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- hw/ppc/e500.c | 1 - 1 file changed, 1 deletion(-) diff --git a/hw/ppc/e500.c b/hw/ppc/e500.c index XXXXXXX..XXXXXXX 100644 --- a/hw/ppc/e500.c +++ b/hw/ppc/e500.c @@ -XXX,XX +XXX,XX @@ #include "hw/irq.h" #define EPAPR_MAGIC (0x45504150) -#define BINARY_DEVICE_TREE_FILE "mpc8544ds.dtb" #define DTC_LOAD_PAD 0x1800000 #define DTC_PAD_MASK 0xFFFFF #define DTB_MAX_SIZE (8 * MiB) -- 2.36.1
From: Alexey Kardashevskiy <aik@ozlabs.ru> tl;dr: This allows Big Endian zImage booting via -kernel + x-vof=on. QEMU loads the kernel at 0x400000 by default which works most of the time as Linux kernels are relocatable, 64bit and compiled with "-pie" (position independent code). This works for a little endian zImage too. However a big endian zImage is compiled without -pie, is 32bit, linked to 0x4000000 so current QEMU ends up loading it at 0x4400000 but keeps spapr->kernel_addr unchanged so booting fails. This uses the kernel address returned from load_elf(). If the default kernel_addr is used, there is no change in behavior (as translate_kernel_address() takes care of this), which is: LE/BE vmlinux and LE zImage boot, BE zImage does not. If the VM created with "-machine kernel-addr=0,x-vof=on", then QEMU prints a warning and BE zImage boots. Note #1: SLOF (x-vof=off) still cannot boot a big endian zImage as SLOF enables MSR_SF for everything loaded by QEMU and this leads to early crash of 32bit zImage. Note #2: BE/LE vmlinux images set MSR_SF in early boot so these just work; a LE zImage restores MSR_SF after every CI call and we are lucky enough not to crash before the first CI call. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Tested-by: Joel Stanley <joel@jms.id.au> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20220504065536.3534488-1-aik@ozlabs.ru> [danielhb: use PRIx64 instead of lx in warn_report] Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- hw/ppc/spapr.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index XXXXXXX..XXXXXXX 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -XXX,XX +XXX,XX @@ static void spapr_machine_init(MachineState *machine) } if (kernel_filename) { + uint64_t loaded_addr = 0; + spapr->kernel_size = load_elf(kernel_filename, NULL, translate_kernel_address, spapr, - NULL, NULL, NULL, NULL, 1, + NULL, &loaded_addr, NULL, NULL, 1, PPC_ELF_MACHINE, 0, 0); if (spapr->kernel_size == ELF_LOAD_WRONG_ENDIAN) { spapr->kernel_size = load_elf(kernel_filename, NULL, translate_kernel_address, spapr, - NULL, NULL, NULL, NULL, 0, + NULL, &loaded_addr, NULL, NULL, 0, PPC_ELF_MACHINE, 0, 0); spapr->kernel_le = spapr->kernel_size > 0; } @@ -XXX,XX +XXX,XX @@ static void spapr_machine_init(MachineState *machine) exit(1); } + if (spapr->kernel_addr != loaded_addr) { + warn_report("spapr: kernel_addr changed from 0x%"PRIx64 + " to 0x%"PRIx64, + spapr->kernel_addr, loaded_addr); + spapr->kernel_addr = loaded_addr; + } + /* load initrd */ if (initrd_filename) { /* Try to locate the initrd in the gap between the kernel -- 2.36.1
From: Alexey Kardashevskiy <aik@ozlabs.ru> The alternative small firmware needs a few words of what it can and absolutely cannot do; this adds those words. Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Message-Id: <20220506055124.3822112-1-aik@ozlabs.ru> [danielhb: added linebreaks before and after table] Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- docs/system/ppc/pseries.rst | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/docs/system/ppc/pseries.rst b/docs/system/ppc/pseries.rst index XXXXXXX..XXXXXXX 100644 --- a/docs/system/ppc/pseries.rst +++ b/docs/system/ppc/pseries.rst @@ -XXX,XX +XXX,XX @@ Missing devices Firmware ======== +The pSeries platform in QEMU comes with 2 firmwares: + `SLOF <https://github.com/aik/SLOF>`_ (Slimline Open Firmware) is an implementation of the `IEEE 1275-1994, Standard for Boot (Initialization Configuration) Firmware: Core Requirements and Practices <https://standards.ieee.org/standard/1275-1994.html>`_. +SLOF performs bus scanning, PCI resource allocation, provides the client +interface to boot from block devices and network. + QEMU includes a prebuilt image of SLOF which is updated when a more recent version is required. +VOF (Virtual Open Firmware) is a minimalistic firmware to work with +``-machine pseries,x-vof=on``. When enabled, the firmware acts as a slim +shim and QEMU implements parts of the IEEE 1275 Open Firmware interface. + +VOF does not have device drivers, does not do PCI resource allocation and +relies on ``-kernel`` used with Linux kernels recent enough (v5.4+) +to PCI resource assignment. It is ideal to use with petitboot. + +Booting via ``-kernel`` supports the following: + ++-------------------+-------------------+------------------+ +| kernel | pseries,x-vof=off | pseries,x-vof=on | ++===================+===================+==================+ +| vmlinux BE | ✓ | ✓ | ++-------------------+-------------------+------------------+ +| vmlinux LE | ✓ | ✓ | ++-------------------+-------------------+------------------+ +| zImage.pseries BE | x | ✓¹ | ++-------------------+-------------------+------------------+ +| zImage.pseries LE | ✓ | ✓ | ++-------------------+-------------------+------------------+ + +¹ must set kernel-addr=0 + Build directions ================ -- 2.36.1
From: Murilo Opsfelder Araujo <muriloo@linux.ibm.com> When CONFIG_MOS6522 is not set, building ppc64-softmmu target fails: /usr/bin/ld: libqemu-ppc64-softmmu.fa.p/monitor_misc.c.o:(.data+0x1158): undefined reference to `hmp_info_via' Make devices configuration available in hmp-commands*.hx and check for CONFIG_MOS6522. Fixes: 409e9f7131e5 (mos6522: add "info via" HMP command for debugging) Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com> Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Cc: Fabiano Rosas <farosas@linux.ibm.com> Cc: Thomas Huth <thuth@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <20220510235439.54775-1-muriloo@linux.ibm.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- hmp-commands-info.hx | 2 +- monitor/misc.c | 3 +++ 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx index XXXXXXX..XXXXXXX 100644 --- a/hmp-commands-info.hx +++ b/hmp-commands-info.hx @@ -XXX,XX +XXX,XX @@ SRST Show intel SGX information. ERST -#if defined(TARGET_M68K) || defined(TARGET_PPC) +#if defined(CONFIG_MOS6522) { .name = "via", .args_type = "", diff --git a/monitor/misc.c b/monitor/misc.c index XXXXXXX..XXXXXXX 100644 --- a/monitor/misc.c +++ b/monitor/misc.c @@ -XXX,XX +XXX,XX @@ #include "hw/s390x/storage-attributes.h" #endif +/* Make devices configuration available for use in hmp-commands*.hx templates */ +#include CONFIG_DEVICES + /* file descriptors passed via SCM_RIGHTS */ typedef struct mon_fd_t mon_fd_t; struct mon_fd_t { -- 2.36.1
From: Leandro Lupori <leandro.lupori@eldorado.org.br> Commit 74c4912f097bab98 changed check_tlb_flush() to use tlb_flush_all_cpus_synced() instead of calling tlb_flush() on each CPU. However, as side effect of this, a CPU executing a ptesync after a tlbie will have its TLB flushed only after exiting its current Translation Block (TB). This causes memory accesses to invalid pages to succeed, if they happen to be on the same TB as the ptesync. To fix this, use tlb_flush_all_cpus() instead, that immediately flushes the TLB of the CPU executing the ptesync instruction. Fixes: 74c4912f097bab98 ("target/ppc: Fix synchronization of mttcg with broadcast TLB flushes") Signed-off-by: Leandro Lupori <leandro.lupori@eldorado.org.br> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Message-Id: <20220503163904.22575-1-leandro.lupori@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- target/ppc/helper_regs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/target/ppc/helper_regs.c b/target/ppc/helper_regs.c index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/helper_regs.c +++ b/target/ppc/helper_regs.c @@ -XXX,XX +XXX,XX @@ void check_tlb_flush(CPUPPCState *env, bool global) if (global && (env->tlb_need_flush & TLB_NEED_GLOBAL_FLUSH)) { env->tlb_need_flush &= ~TLB_NEED_GLOBAL_FLUSH; env->tlb_need_flush &= ~TLB_NEED_LOCAL_FLUSH; - tlb_flush_all_cpus_synced(cs); + tlb_flush_all_cpus(cs); return; } -- 2.36.1
From: Víctor Colombo <victor.colombo@eldorado.org.br> According to Power ISA, the FI bit in FPSCR is non-sticky. This means that if an instruction is said to modify the FI bit, then it should be set or cleared depending on the result of the instruction. Otherwise, it should be kept as was before. However, the following inconsistency was found when comparing results from the hardware (tested on both a Power 9 processor and in Power 10 Mambo): (FI bit is set before the execution of the instruction) Hardware: xscmpeqdp(0xff..ff, 0xff..ff) = FI: SET -> SET QEMU: xscmpeqdp(0xff..ff, 0xff..ff) = FI: SET -> CLEARED As the FI bit is non-sticky, and xscmpeqdp does not list it as a field that is changed by the instruction, it should not be changed after its execution. This is happening to multiple instructions in the vsx implementations. If the ISA does not list the FI bit as altered for a particular instruction, then it should be kept as it was before the instruction. QEMU is not following this behavior. Affected instructions include: - xv* (all vsx-vector instructions); - xscmp*, xsmax*, xsmin*; - xstdivdp and similars; (to identify the affected instructions, just search in the ISA for the instructions that does not list FI in "Special Registers Altered") Most instructions use the function do_float_check_status() to commit changes in the inexact flag. So the fix is to add a parameter to it that will control if the bit FI should be changed or not. All users of do_float_check_status() are then modified to provide this argument, controlling if that specific instruction changes bit FI or not. Some macro helpers are responsible for both instructions that change and instructions that aren't suposed to change FI. This seems to always overlap with the sfprf flag. So, reuse this flag for this purpose when applicable. Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220517161522.36132-2-victor.colombo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- target/ppc/cpu.h | 2 + target/ppc/fpu_helper.c | 122 +++++++++++++++++++++------------------- 2 files changed, 66 insertions(+), 58 deletions(-) diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/cpu.h +++ b/target/ppc/cpu.h @@ -XXX,XX +XXX,XX @@ enum { (1 << FPSCR_VXSOFT) | (1 << FPSCR_VXSQRT) | \ (1 << FPSCR_VXCVI)) +FIELD(FPSCR, FI, FPSCR_FI, 1) + #define FP_DRN2 (1ull << FPSCR_DRN2) #define FP_DRN1 (1ull << FPSCR_DRN1) #define FP_DRN0 (1ull << FPSCR_DRN0) diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/fpu_helper.c +++ b/target/ppc/fpu_helper.c @@ -XXX,XX +XXX,XX @@ static inline void float_inexact_excp(CPUPPCState *env) { CPUState *cs = env_cpu(env); - env->fpscr |= FP_FI; env->fpscr |= FP_XX; /* Update the floating-point exception summary */ env->fpscr |= FP_FX; @@ -XXX,XX +XXX,XX @@ void helper_fpscr_check_status(CPUPPCState *env) } } -static void do_float_check_status(CPUPPCState *env, uintptr_t raddr) +static void do_float_check_status(CPUPPCState *env, bool change_fi, + uintptr_t raddr) { CPUState *cs = env_cpu(env); int status = get_float_exception_flags(&env->fp_status); @@ -XXX,XX +XXX,XX @@ static void do_float_check_status(CPUPPCState *env, uintptr_t raddr) } if (status & float_flag_inexact) { float_inexact_excp(env); - } else { - env->fpscr &= ~FP_FI; /* clear the FPSCR[FI] bit */ + } + if (change_fi) { + env->fpscr = FIELD_DP64(env->fpscr, FPSCR, FI, + !!(status & float_flag_inexact)); } if (cs->exception_index == POWERPC_EXCP_PROGRAM && @@ -XXX,XX +XXX,XX @@ static void do_float_check_status(CPUPPCState *env, uintptr_t raddr) void helper_float_check_status(CPUPPCState *env) { - do_float_check_status(env, GETPC()); + do_float_check_status(env, true, GETPC()); } void helper_reset_fpstatus(CPUPPCState *env) @@ -XXX,XX +XXX,XX @@ uint64_t helper_##op(CPUPPCState *env, uint64_t arg) \ } else { \ farg.d = cvtr(arg, &env->fp_status); \ } \ - do_float_check_status(env, GETPC()); \ + do_float_check_status(env, true, GETPC()); \ return farg.ll; \ } @@ -XXX,XX +XXX,XX @@ static uint64_t do_fri(CPUPPCState *env, uint64_t arg, /* fri* does not set FPSCR[XX] */ set_float_exception_flags(flags & ~float_flag_inexact, &env->fp_status); - do_float_check_status(env, GETPC()); + do_float_check_status(env, true, GETPC()); return arg; } @@ -XXX,XX +XXX,XX @@ void helper_##name(CPUPPCState *env, ppc_vsr_t *xt, \ } \ } \ *xt = t; \ - do_float_check_status(env, GETPC()); \ + do_float_check_status(env, sfprf, GETPC()); \ } VSX_ADD_SUB(xsadddp, add, 1, float64, VsrD(0), 1, 0) @@ -XXX,XX +XXX,XX @@ void helper_xsaddqp(CPUPPCState *env, uint32_t opcode, helper_compute_fprf_float128(env, t.f128); *xt = t; - do_float_check_status(env, GETPC()); + do_float_check_status(env, true, GETPC()); } /* @@ -XXX,XX +XXX,XX @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \ } \ \ *xt = t; \ - do_float_check_status(env, GETPC()); \ + do_float_check_status(env, sfprf, GETPC()); \ } VSX_MUL(xsmuldp, 1, float64, VsrD(0), 1, 0) @@ -XXX,XX +XXX,XX @@ void helper_xsmulqp(CPUPPCState *env, uint32_t opcode, helper_compute_fprf_float128(env, t.f128); *xt = t; - do_float_check_status(env, GETPC()); + do_float_check_status(env, true, GETPC()); } /* @@ -XXX,XX +XXX,XX @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \ } \ \ *xt = t; \ - do_float_check_status(env, GETPC()); \ + do_float_check_status(env, sfprf, GETPC()); \ } VSX_DIV(xsdivdp, 1, float64, VsrD(0), 1, 0) @@ -XXX,XX +XXX,XX @@ void helper_xsdivqp(CPUPPCState *env, uint32_t opcode, helper_compute_fprf_float128(env, t.f128); *xt = t; - do_float_check_status(env, GETPC()); + do_float_check_status(env, true, GETPC()); } /* @@ -XXX,XX +XXX,XX @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ } \ \ *xt = t; \ - do_float_check_status(env, GETPC()); \ + do_float_check_status(env, sfprf, GETPC()); \ } VSX_RE(xsredp, 1, float64, VsrD(0), 1, 0) @@ -XXX,XX +XXX,XX @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ } \ \ *xt = t; \ - do_float_check_status(env, GETPC()); \ + do_float_check_status(env, sfprf, GETPC()); \ } VSX_SQRT(xssqrtdp, 1, float64, VsrD(0), 1, 0) @@ -XXX,XX +XXX,XX @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ } \ \ *xt = t; \ - do_float_check_status(env, GETPC()); \ + do_float_check_status(env, sfprf, GETPC()); \ } VSX_RSQRTE(xsrsqrtedp, 1, float64, VsrD(0), 1, 0) @@ -XXX,XX +XXX,XX @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \ } \ } \ *xt = t; \ - do_float_check_status(env, GETPC()); \ + do_float_check_status(env, sfprf, GETPC()); \ } VSX_MADD(XSMADDDP, 1, float64, VsrD(0), MADD_FLGS, 1) @@ -XXX,XX +XXX,XX @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *s1, ppc_vsr_t *s2,\ \ helper_compute_fprf_float128(env, t.f128); \ *xt = t; \ - do_float_check_status(env, GETPC()); \ + do_float_check_status(env, true, GETPC()); \ } VSX_MADDQ(XSMADDQP, MADD_FLGS, 0) @@ -XXX,XX +XXX,XX @@ VSX_MADDQ(XSNMSUBQPO, NMSUB_FLGS, 0) \ memset(xt, 0, sizeof(*xt)); \ memset(&xt->fld, -r, sizeof(xt->fld)); \ - do_float_check_status(env, GETPC()); \ + do_float_check_status(env, false, GETPC()); \ } VSX_SCALAR_CMP(XSCMPEQDP, float64, eq, VsrD(0), 0) @@ -XXX,XX +XXX,XX @@ void helper_xscmpexpdp(CPUPPCState *env, uint32_t opcode, env->fpscr |= cc << FPSCR_FPCC; env->crf[BF(opcode)] = cc; - do_float_check_status(env, GETPC()); + do_float_check_status(env, false, GETPC()); } void helper_xscmpexpqp(CPUPPCState *env, uint32_t opcode, @@ -XXX,XX +XXX,XX @@ void helper_xscmpexpqp(CPUPPCState *env, uint32_t opcode, env->fpscr |= cc << FPSCR_FPCC; env->crf[BF(opcode)] = cc; - do_float_check_status(env, GETPC()); + do_float_check_status(env, false, GETPC()); } static inline void do_scalar_cmp(CPUPPCState *env, ppc_vsr_t *xa, ppc_vsr_t *xb, @@ -XXX,XX +XXX,XX @@ static inline void do_scalar_cmp(CPUPPCState *env, ppc_vsr_t *xa, ppc_vsr_t *xb, float_invalid_op_vxvc(env, 0, GETPC()); } - do_float_check_status(env, GETPC()); + do_float_check_status(env, false, GETPC()); } void helper_xscmpodp(CPUPPCState *env, uint32_t opcode, ppc_vsr_t *xa, @@ -XXX,XX +XXX,XX @@ static inline void do_scalar_cmpq(CPUPPCState *env, ppc_vsr_t *xa, float_invalid_op_vxvc(env, 0, GETPC()); } - do_float_check_status(env, GETPC()); + do_float_check_status(env, false, GETPC()); } void helper_xscmpoqp(CPUPPCState *env, uint32_t opcode, ppc_vsr_t *xa, @@ -XXX,XX +XXX,XX @@ void helper_##name(CPUPPCState *env, ppc_vsr_t *xt, \ } \ \ *xt = t; \ - do_float_check_status(env, GETPC()); \ + do_float_check_status(env, false, GETPC()); \ } VSX_MAX_MIN(xsmaxdp, maxnum, 1, float64, VsrD(0)) @@ -XXX,XX +XXX,XX @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ } \ \ *xt = t; \ - do_float_check_status(env, GETPC()); \ + do_float_check_status(env, sfprf, GETPC()); \ } VSX_CVT_FP_TO_FP(xscvspdp, 1, float32, float64, VsrW(0), VsrD(0), 1) @@ -XXX,XX +XXX,XX @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ } \ \ *xt = t; \ - do_float_check_status(env, GETPC()); \ + do_float_check_status(env, sfprf, GETPC()); \ } VSX_CVT_FP_TO_FP2(xvcvdpsp, 2, float64, float32, 0) @@ -XXX,XX +XXX,XX @@ void helper_##op(CPUPPCState *env, uint32_t opcode, \ } \ \ *xt = t; \ - do_float_check_status(env, GETPC()); \ + do_float_check_status(env, true, GETPC()); \ } VSX_CVT_FP_TO_FP_VECTOR(xscvdpqp, 1, float64, float128, VsrD(0), f128, 1) @@ -XXX,XX +XXX,XX @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ } \ \ *xt = t; \ - do_float_check_status(env, GETPC()); \ + do_float_check_status(env, sfprf, GETPC()); \ } VSX_CVT_FP_TO_FP_HP(xscvdphp, 1, float64, float16, VsrD(0), VsrH(3), 1) @@ -XXX,XX +XXX,XX @@ void helper_XVCVSPBF16(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) } *xt = t; - do_float_check_status(env, GETPC()); + do_float_check_status(env, false, GETPC()); } void helper_XSCVQPDP(CPUPPCState *env, uint32_t ro, ppc_vsr_t *xt, @@ -XXX,XX +XXX,XX @@ void helper_XSCVQPDP(CPUPPCState *env, uint32_t ro, ppc_vsr_t *xt, helper_compute_fprf_float64(env, t.VsrD(0)); *xt = t; - do_float_check_status(env, GETPC()); + do_float_check_status(env, true, GETPC()); } uint64_t helper_xscvdpspn(CPUPPCState *env, uint64_t xb) @@ -XXX,XX +XXX,XX @@ uint64_t helper_xscvspdpn(CPUPPCState *env, uint64_t xb) * ttp - target type (int32, uint32, int64 or uint64) * sfld - source vsr_t field * tfld - target vsr_t field + * sfi - set FI * rnan - resulting NaN */ -#define VSX_CVT_FP_TO_INT(op, nels, stp, ttp, sfld, tfld, rnan) \ +#define VSX_CVT_FP_TO_INT(op, nels, stp, ttp, sfld, tfld, sfi, rnan) \ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ { \ int all_flags = env->fp_status.float_exception_flags, flags; \ @@ -XXX,XX +XXX,XX @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ \ *xt = t; \ env->fp_status.float_exception_flags = all_flags; \ - do_float_check_status(env, GETPC()); \ + do_float_check_status(env, sfi, GETPC()); \ } -VSX_CVT_FP_TO_INT(xscvdpsxds, 1, float64, int64, VsrD(0), VsrD(0), \ +VSX_CVT_FP_TO_INT(xscvdpsxds, 1, float64, int64, VsrD(0), VsrD(0), true, \ 0x8000000000000000ULL) -VSX_CVT_FP_TO_INT(xscvdpuxds, 1, float64, uint64, VsrD(0), VsrD(0), 0ULL) -VSX_CVT_FP_TO_INT(xvcvdpsxds, 2, float64, int64, VsrD(i), VsrD(i), \ +VSX_CVT_FP_TO_INT(xscvdpuxds, 1, float64, uint64, VsrD(0), VsrD(0), true, 0ULL) +VSX_CVT_FP_TO_INT(xvcvdpsxds, 2, float64, int64, VsrD(i), VsrD(i), false, \ 0x8000000000000000ULL) -VSX_CVT_FP_TO_INT(xvcvdpuxds, 2, float64, uint64, VsrD(i), VsrD(i), 0ULL) -VSX_CVT_FP_TO_INT(xvcvspsxds, 2, float32, int64, VsrW(2 * i), VsrD(i), \ +VSX_CVT_FP_TO_INT(xvcvdpuxds, 2, float64, uint64, VsrD(i), VsrD(i), false, \ + 0ULL) +VSX_CVT_FP_TO_INT(xvcvspsxds, 2, float32, int64, VsrW(2 * i), VsrD(i), false, \ 0x8000000000000000ULL) -VSX_CVT_FP_TO_INT(xvcvspsxws, 4, float32, int32, VsrW(i), VsrW(i), 0x80000000U) -VSX_CVT_FP_TO_INT(xvcvspuxds, 2, float32, uint64, VsrW(2 * i), VsrD(i), 0ULL) -VSX_CVT_FP_TO_INT(xvcvspuxws, 4, float32, uint32, VsrW(i), VsrW(i), 0U) +VSX_CVT_FP_TO_INT(xvcvspsxws, 4, float32, int32, VsrW(i), VsrW(i), false, \ + 0x80000000ULL) +VSX_CVT_FP_TO_INT(xvcvspuxds, 2, float32, uint64, VsrW(2 * i), VsrD(i), \ + false, 0ULL) +VSX_CVT_FP_TO_INT(xvcvspuxws, 4, float32, uint32, VsrW(i), VsrW(i), false, 0U) #define VSX_CVT_FP_TO_INT128(op, tp, rnan) \ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ @@ -XXX,XX +XXX,XX @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ } \ \ *xt = t; \ - do_float_check_status(env, GETPC()); \ + do_float_check_status(env, true, GETPC()); \ } VSX_CVT_FP_TO_INT128(XSCVQPUQZ, uint128, 0) @@ -XXX,XX +XXX,XX @@ VSX_CVT_FP_TO_INT128(XSCVQPSQZ, int128, 0x8000000000000000ULL); * words 0 and 1 (and words 2 and 3) of the result register, as * is required by this version of the architecture. */ -#define VSX_CVT_FP_TO_INT2(op, nels, stp, ttp, rnan) \ +#define VSX_CVT_FP_TO_INT2(op, nels, stp, ttp, sfi, rnan) \ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ { \ int all_flags = env->fp_status.float_exception_flags, flags; \ @@ -XXX,XX +XXX,XX @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ \ *xt = t; \ env->fp_status.float_exception_flags = all_flags; \ - do_float_check_status(env, GETPC()); \ + do_float_check_status(env, sfi, GETPC()); \ } -VSX_CVT_FP_TO_INT2(xscvdpsxws, 1, float64, int32, 0x80000000U) -VSX_CVT_FP_TO_INT2(xscvdpuxws, 1, float64, uint32, 0U) -VSX_CVT_FP_TO_INT2(xvcvdpsxws, 2, float64, int32, 0x80000000U) -VSX_CVT_FP_TO_INT2(xvcvdpuxws, 2, float64, uint32, 0U) +VSX_CVT_FP_TO_INT2(xscvdpsxws, 1, float64, int32, true, 0x80000000U) +VSX_CVT_FP_TO_INT2(xscvdpuxws, 1, float64, uint32, true, 0U) +VSX_CVT_FP_TO_INT2(xvcvdpsxws, 2, float64, int32, false, 0x80000000U) +VSX_CVT_FP_TO_INT2(xvcvdpuxws, 2, float64, uint32, false, 0U) /* * VSX_CVT_FP_TO_INT_VECTOR - VSX floating point to integer conversion @@ -XXX,XX +XXX,XX @@ void helper_##op(CPUPPCState *env, uint32_t opcode, \ } \ \ *xt = t; \ - do_float_check_status(env, GETPC()); \ + do_float_check_status(env, true, GETPC()); \ } VSX_CVT_FP_TO_INT_VECTOR(xscvqpsdz, float128, int64, f128, VsrD(0), \ @@ -XXX,XX +XXX,XX @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ } \ \ *xt = t; \ - do_float_check_status(env, GETPC()); \ + do_float_check_status(env, sfprf, GETPC()); \ } VSX_CVT_INT_TO_FP(xscvsxddp, 1, int64, float64, VsrD(0), VsrD(0), 1, 0) @@ -XXX,XX +XXX,XX @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ } \ \ *xt = t; \ - do_float_check_status(env, GETPC()); \ + do_float_check_status(env, false, GETPC()); \ } VSX_CVT_INT_TO_FP2(xvcvsxdsp, int64, float32) @@ -XXX,XX +XXX,XX @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb)\ helper_reset_fpstatus(env); \ xt->f128 = tp##_to_float128(xb->s128, &env->fp_status); \ helper_compute_fprf_float128(env, xt->f128); \ - do_float_check_status(env, GETPC()); \ + do_float_check_status(env, true, GETPC()); \ } VSX_CVT_INT128_TO_FP(XSCVUQQP, uint128); @@ -XXX,XX +XXX,XX @@ void helper_##op(CPUPPCState *env, uint32_t opcode, \ helper_compute_fprf_##ttp(env, t.tfld); \ \ *xt = t; \ - do_float_check_status(env, GETPC()); \ + do_float_check_status(env, true, GETPC()); \ } VSX_CVT_INT_TO_FP_VECTOR(xscvsdqp, int64, float128, VsrD(0), f128) @@ -XXX,XX +XXX,XX @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ } \ \ *xt = t; \ - do_float_check_status(env, GETPC()); \ + do_float_check_status(env, sfprf, GETPC()); \ } VSX_ROUND(xsrdpi, 1, float64, VsrD(0), float_round_ties_away, 1) @@ -XXX,XX +XXX,XX @@ uint64_t helper_xsrsp(CPUPPCState *env, uint64_t xb) uint64_t xt = do_frsp(env, xb, GETPC()); helper_compute_fprf_float64(env, xt); - do_float_check_status(env, GETPC()); + do_float_check_status(env, true, GETPC()); return xt; } @@ -XXX,XX +XXX,XX @@ void helper_xsrqpi(CPUPPCState *env, uint32_t opcode, } helper_compute_fprf_float128(env, t.f128); - do_float_check_status(env, GETPC()); + do_float_check_status(env, true, GETPC()); *xt = t; } @@ -XXX,XX +XXX,XX @@ void helper_xsrqpxp(CPUPPCState *env, uint32_t opcode, helper_compute_fprf_float128(env, t.f128); *xt = t; - do_float_check_status(env, GETPC()); + do_float_check_status(env, true, GETPC()); } void helper_xssqrtqp(CPUPPCState *env, uint32_t opcode, @@ -XXX,XX +XXX,XX @@ void helper_xssqrtqp(CPUPPCState *env, uint32_t opcode, helper_compute_fprf_float128(env, t.f128); *xt = t; - do_float_check_status(env, GETPC()); + do_float_check_status(env, true, GETPC()); } void helper_xssubqp(CPUPPCState *env, uint32_t opcode, @@ -XXX,XX +XXX,XX @@ void helper_xssubqp(CPUPPCState *env, uint32_t opcode, helper_compute_fprf_float128(env, t.f128); *xt = t; - do_float_check_status(env, GETPC()); + do_float_check_status(env, true, GETPC()); } -- 2.36.1
From: Víctor Colombo <victor.colombo@eldorado.org.br> This patch fixes another not-so-clear situation in Power ISA regarding the inexact bits in FPSCR. The ISA states that: """ When Overflow Exception is disabled (OE=0) and an Overflow Exception occurs, the following actions are taken: ... 2. Inexact Exception is set XX <- 1 ... FI is set to 1 ... """ However, when tested on a Power 9 hardware, some instructions that trigger an OX don't set the FI bit: xvcvdpsp(0x4050533fcdb7b95ff8d561c40bf90996) = FI: CLEARED -> CLEARED xvnmsubmsp(0xf3c0c1fc8f3230, 0xbeaab9c5) = FI: CLEARED -> CLEARED (just a few examples. Other instructions are also affected) The root cause for this seems to be that only instructions that list the bit FI in the "Special Registers Altered" should modify it. QEMU is, today, not working like the hardware: xvcvdpsp(0x4050533fcdb7b95ff8d561c40bf90996) = FI: CLEARED -> SET xvnmsubmsp(0xf3c0c1fc8f3230, 0xbeaab9c5) = FI: CLEARED -> SET (all tests assume FI is cleared beforehand) Fix this by making float_overflow_excp() return float_flag_inexact if it should update the inexact flags. Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com> Message-Id: <20220517161522.36132-3-victor.colombo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- target/ppc/fpu_helper.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/fpu_helper.c +++ b/target/ppc/fpu_helper.c @@ -XXX,XX +XXX,XX @@ static inline void float_zero_divide_excp(CPUPPCState *env, uintptr_t raddr) } } -static inline void float_overflow_excp(CPUPPCState *env) +static inline int float_overflow_excp(CPUPPCState *env) { CPUState *cs = env_cpu(env); env->fpscr |= FP_OX; /* Update the floating-point exception summary */ env->fpscr |= FP_FX; - if (env->fpscr & FP_OE) { + + bool overflow_enabled = !!(env->fpscr & FP_OE); + if (overflow_enabled) { /* XXX: should adjust the result */ /* Update the floating-point enabled exception summary */ env->fpscr |= FP_FEX; /* We must update the target FPR before raising the exception */ cs->exception_index = POWERPC_EXCP_PROGRAM; env->error_code = POWERPC_EXCP_FP | POWERPC_EXCP_FP_OX; - } else { - env->fpscr |= FP_XX; - env->fpscr |= FP_FI; } + + return overflow_enabled ? 0 : float_flag_inexact; } static inline void float_underflow_excp(CPUPPCState *env) @@ -XXX,XX +XXX,XX @@ static void do_float_check_status(CPUPPCState *env, bool change_fi, int status = get_float_exception_flags(&env->fp_status); if (status & float_flag_overflow) { - float_overflow_excp(env); + status |= float_overflow_excp(env); } else if (status & float_flag_underflow) { float_underflow_excp(env); } -- 2.36.1
From: Víctor Colombo <victor.colombo@eldorado.org.br> The bit FI fix used the sfprf flag as a flag for the set_fi parameter in do_float_check_status where applicable. Now, this patch rename this flag to sfifprf to state this dual usage. Signed-off-by: Víctor Colombo <victor.colombo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com> Message-Id: <20220517161522.36132-4-victor.colombo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- target/ppc/fpu_helper.c | 112 ++++++++++++++++++++-------------------- 1 file changed, 56 insertions(+), 56 deletions(-) diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/fpu_helper.c +++ b/target/ppc/fpu_helper.c @@ -XXX,XX +XXX,XX @@ uint32_t helper_efdcmpeq(CPUPPCState *env, uint64_t op1, uint64_t op2) * nels - number of elements (1, 2 or 4) * tp - type (float32 or float64) * fld - vsr_t field (VsrD(*) or VsrW(*)) - * sfprf - set FPRF + * sfifprf - set FI and FPRF */ -#define VSX_ADD_SUB(name, op, nels, tp, fld, sfprf, r2sp) \ +#define VSX_ADD_SUB(name, op, nels, tp, fld, sfifprf, r2sp) \ void helper_##name(CPUPPCState *env, ppc_vsr_t *xt, \ ppc_vsr_t *xa, ppc_vsr_t *xb) \ { \ @@ -XXX,XX +XXX,XX @@ void helper_##name(CPUPPCState *env, ppc_vsr_t *xt, \ \ if (unlikely(tstat.float_exception_flags & float_flag_invalid)) { \ float_invalid_op_addsub(env, tstat.float_exception_flags, \ - sfprf, GETPC()); \ + sfifprf, GETPC()); \ } \ \ if (r2sp) { \ t.fld = do_frsp(env, t.fld, GETPC()); \ } \ \ - if (sfprf) { \ + if (sfifprf) { \ helper_compute_fprf_float64(env, t.fld); \ } \ } \ *xt = t; \ - do_float_check_status(env, sfprf, GETPC()); \ + do_float_check_status(env, sfifprf, GETPC()); \ } VSX_ADD_SUB(xsadddp, add, 1, float64, VsrD(0), 1, 0) @@ -XXX,XX +XXX,XX @@ void helper_xsaddqp(CPUPPCState *env, uint32_t opcode, * nels - number of elements (1, 2 or 4) * tp - type (float32 or float64) * fld - vsr_t field (VsrD(*) or VsrW(*)) - * sfprf - set FPRF + * sfifprf - set FI and FPRF */ -#define VSX_MUL(op, nels, tp, fld, sfprf, r2sp) \ +#define VSX_MUL(op, nels, tp, fld, sfifprf, r2sp) \ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \ ppc_vsr_t *xa, ppc_vsr_t *xb) \ { \ @@ -XXX,XX +XXX,XX @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \ \ if (unlikely(tstat.float_exception_flags & float_flag_invalid)) { \ float_invalid_op_mul(env, tstat.float_exception_flags, \ - sfprf, GETPC()); \ + sfifprf, GETPC()); \ } \ \ if (r2sp) { \ t.fld = do_frsp(env, t.fld, GETPC()); \ } \ \ - if (sfprf) { \ + if (sfifprf) { \ helper_compute_fprf_float64(env, t.fld); \ } \ } \ \ *xt = t; \ - do_float_check_status(env, sfprf, GETPC()); \ + do_float_check_status(env, sfifprf, GETPC()); \ } VSX_MUL(xsmuldp, 1, float64, VsrD(0), 1, 0) @@ -XXX,XX +XXX,XX @@ void helper_xsmulqp(CPUPPCState *env, uint32_t opcode, * nels - number of elements (1, 2 or 4) * tp - type (float32 or float64) * fld - vsr_t field (VsrD(*) or VsrW(*)) - * sfprf - set FPRF + * sfifprf - set FI and FPRF */ -#define VSX_DIV(op, nels, tp, fld, sfprf, r2sp) \ +#define VSX_DIV(op, nels, tp, fld, sfifprf, r2sp) \ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \ ppc_vsr_t *xa, ppc_vsr_t *xb) \ { \ @@ -XXX,XX +XXX,XX @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \ \ if (unlikely(tstat.float_exception_flags & float_flag_invalid)) { \ float_invalid_op_div(env, tstat.float_exception_flags, \ - sfprf, GETPC()); \ + sfifprf, GETPC()); \ } \ if (unlikely(tstat.float_exception_flags & float_flag_divbyzero)) { \ float_zero_divide_excp(env, GETPC()); \ @@ -XXX,XX +XXX,XX @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \ t.fld = do_frsp(env, t.fld, GETPC()); \ } \ \ - if (sfprf) { \ + if (sfifprf) { \ helper_compute_fprf_float64(env, t.fld); \ } \ } \ \ *xt = t; \ - do_float_check_status(env, sfprf, GETPC()); \ + do_float_check_status(env, sfifprf, GETPC()); \ } VSX_DIV(xsdivdp, 1, float64, VsrD(0), 1, 0) @@ -XXX,XX +XXX,XX @@ void helper_xsdivqp(CPUPPCState *env, uint32_t opcode, * nels - number of elements (1, 2 or 4) * tp - type (float32 or float64) * fld - vsr_t field (VsrD(*) or VsrW(*)) - * sfprf - set FPRF + * sfifprf - set FI and FPRF */ -#define VSX_RE(op, nels, tp, fld, sfprf, r2sp) \ +#define VSX_RE(op, nels, tp, fld, sfifprf, r2sp) \ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ { \ ppc_vsr_t t = { }; \ @@ -XXX,XX +XXX,XX @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ t.fld = do_frsp(env, t.fld, GETPC()); \ } \ \ - if (sfprf) { \ + if (sfifprf) { \ helper_compute_fprf_float64(env, t.fld); \ } \ } \ \ *xt = t; \ - do_float_check_status(env, sfprf, GETPC()); \ + do_float_check_status(env, sfifprf, GETPC()); \ } VSX_RE(xsredp, 1, float64, VsrD(0), 1, 0) @@ -XXX,XX +XXX,XX @@ VSX_RE(xvresp, 4, float32, VsrW(i), 0, 0) * nels - number of elements (1, 2 or 4) * tp - type (float32 or float64) * fld - vsr_t field (VsrD(*) or VsrW(*)) - * sfprf - set FPRF + * sfifprf - set FI and FPRF */ -#define VSX_SQRT(op, nels, tp, fld, sfprf, r2sp) \ +#define VSX_SQRT(op, nels, tp, fld, sfifprf, r2sp) \ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ { \ ppc_vsr_t t = { }; \ @@ -XXX,XX +XXX,XX @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ \ if (unlikely(tstat.float_exception_flags & float_flag_invalid)) { \ float_invalid_op_sqrt(env, tstat.float_exception_flags, \ - sfprf, GETPC()); \ + sfifprf, GETPC()); \ } \ \ if (r2sp) { \ t.fld = do_frsp(env, t.fld, GETPC()); \ } \ \ - if (sfprf) { \ + if (sfifprf) { \ helper_compute_fprf_float64(env, t.fld); \ } \ } \ \ *xt = t; \ - do_float_check_status(env, sfprf, GETPC()); \ + do_float_check_status(env, sfifprf, GETPC()); \ } VSX_SQRT(xssqrtdp, 1, float64, VsrD(0), 1, 0) @@ -XXX,XX +XXX,XX @@ VSX_SQRT(xvsqrtsp, 4, float32, VsrW(i), 0, 0) * nels - number of elements (1, 2 or 4) * tp - type (float32 or float64) * fld - vsr_t field (VsrD(*) or VsrW(*)) - * sfprf - set FPRF + * sfifprf - set FI and FPRF */ -#define VSX_RSQRTE(op, nels, tp, fld, sfprf, r2sp) \ +#define VSX_RSQRTE(op, nels, tp, fld, sfifprf, r2sp) \ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ { \ ppc_vsr_t t = { }; \ @@ -XXX,XX +XXX,XX @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ env->fp_status.float_exception_flags |= tstat.float_exception_flags; \ if (unlikely(tstat.float_exception_flags & float_flag_invalid)) { \ float_invalid_op_sqrt(env, tstat.float_exception_flags, \ - sfprf, GETPC()); \ + sfifprf, GETPC()); \ } \ if (r2sp) { \ t.fld = do_frsp(env, t.fld, GETPC()); \ } \ \ - if (sfprf) { \ + if (sfifprf) { \ helper_compute_fprf_float64(env, t.fld); \ } \ } \ \ *xt = t; \ - do_float_check_status(env, sfprf, GETPC()); \ + do_float_check_status(env, sfifprf, GETPC()); \ } VSX_RSQRTE(xsrsqrtedp, 1, float64, VsrD(0), 1, 0) @@ -XXX,XX +XXX,XX @@ VSX_TSQRT(xvtsqrtsp, 4, float32, VsrW(i), -126, 23) * fld - vsr_t field (VsrD(*) or VsrW(*)) * maddflgs - flags for the float*muladd routine that control the * various forms (madd, msub, nmadd, nmsub) - * sfprf - set FPRF + * sfifprf - set FI and FPRF */ -#define VSX_MADD(op, nels, tp, fld, maddflgs, sfprf) \ +#define VSX_MADD(op, nels, tp, fld, maddflgs, sfifprf) \ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \ ppc_vsr_t *s1, ppc_vsr_t *s2, ppc_vsr_t *s3) \ { \ @@ -XXX,XX +XXX,XX @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \ \ if (unlikely(tstat.float_exception_flags & float_flag_invalid)) { \ float_invalid_op_madd(env, tstat.float_exception_flags, \ - sfprf, GETPC()); \ + sfifprf, GETPC()); \ } \ \ - if (sfprf) { \ + if (sfifprf) { \ helper_compute_fprf_float64(env, t.fld); \ } \ } \ *xt = t; \ - do_float_check_status(env, sfprf, GETPC()); \ + do_float_check_status(env, sfifprf, GETPC()); \ } VSX_MADD(XSMADDDP, 1, float64, VsrD(0), MADD_FLGS, 1) @@ -XXX,XX +XXX,XX @@ VSX_CMP(xvcmpnesp, 4, float32, VsrW(i), eq, 0, 0) * ttp - target type (float32 or float64) * sfld - source vsr_t field * tfld - target vsr_t field (f32 or f64) - * sfprf - set FPRF + * sfifprf - set FI and FPRF */ -#define VSX_CVT_FP_TO_FP(op, nels, stp, ttp, sfld, tfld, sfprf) \ +#define VSX_CVT_FP_TO_FP(op, nels, stp, ttp, sfld, tfld, sfifprf) \ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ { \ ppc_vsr_t t = { }; \ @@ -XXX,XX +XXX,XX @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ float_invalid_op_vxsnan(env, GETPC()); \ t.tfld = ttp##_snan_to_qnan(t.tfld); \ } \ - if (sfprf) { \ + if (sfifprf) { \ helper_compute_fprf_##ttp(env, t.tfld); \ } \ } \ \ *xt = t; \ - do_float_check_status(env, sfprf, GETPC()); \ + do_float_check_status(env, sfifprf, GETPC()); \ } VSX_CVT_FP_TO_FP(xscvspdp, 1, float32, float64, VsrW(0), VsrD(0), 1) VSX_CVT_FP_TO_FP(xvcvspdp, 2, float32, float64, VsrW(2 * i), VsrD(i), 0) -#define VSX_CVT_FP_TO_FP2(op, nels, stp, ttp, sfprf) \ +#define VSX_CVT_FP_TO_FP2(op, nels, stp, ttp, sfifprf) \ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ { \ ppc_vsr_t t = { }; \ @@ -XXX,XX +XXX,XX @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ float_invalid_op_vxsnan(env, GETPC()); \ t.VsrW(2 * i) = ttp##_snan_to_qnan(t.VsrW(2 * i)); \ } \ - if (sfprf) { \ + if (sfifprf) { \ helper_compute_fprf_##ttp(env, t.VsrW(2 * i)); \ } \ t.VsrW(2 * i + 1) = t.VsrW(2 * i); \ } \ \ *xt = t; \ - do_float_check_status(env, sfprf, GETPC()); \ + do_float_check_status(env, sfifprf, GETPC()); \ } VSX_CVT_FP_TO_FP2(xvcvdpsp, 2, float64, float32, 0) @@ -XXX,XX +XXX,XX @@ VSX_CVT_FP_TO_FP2(xscvdpsp, 1, float64, float32, 1) * tfld - target vsr_t field (f32 or f64) * sfprf - set FPRF */ -#define VSX_CVT_FP_TO_FP_VECTOR(op, nels, stp, ttp, sfld, tfld, sfprf) \ -void helper_##op(CPUPPCState *env, uint32_t opcode, \ - ppc_vsr_t *xt, ppc_vsr_t *xb) \ +#define VSX_CVT_FP_TO_FP_VECTOR(op, nels, stp, ttp, sfld, tfld, sfprf) \ +void helper_##op(CPUPPCState *env, uint32_t opcode, \ + ppc_vsr_t *xt, ppc_vsr_t *xb) \ { \ ppc_vsr_t t = *xt; \ int i; \ @@ -XXX,XX +XXX,XX @@ VSX_CVT_FP_TO_FP_VECTOR(xscvdpqp, 1, float64, float128, VsrD(0), f128, 1) * ttp - target type * sfld - source vsr_t field * tfld - target vsr_t field - * sfprf - set FPRF + * sfifprf - set FI and FPRF */ -#define VSX_CVT_FP_TO_FP_HP(op, nels, stp, ttp, sfld, tfld, sfprf) \ +#define VSX_CVT_FP_TO_FP_HP(op, nels, stp, ttp, sfld, tfld, sfifprf) \ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ { \ ppc_vsr_t t = { }; \ @@ -XXX,XX +XXX,XX @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ float_invalid_op_vxsnan(env, GETPC()); \ t.tfld = ttp##_snan_to_qnan(t.tfld); \ } \ - if (sfprf) { \ + if (sfifprf) { \ helper_compute_fprf_##ttp(env, t.tfld); \ } \ } \ \ *xt = t; \ - do_float_check_status(env, sfprf, GETPC()); \ + do_float_check_status(env, sfifprf, GETPC()); \ } VSX_CVT_FP_TO_FP_HP(xscvdphp, 1, float64, float16, VsrD(0), VsrH(3), 1) @@ -XXX,XX +XXX,XX @@ VSX_CVT_FP_TO_INT_VECTOR(xscvqpuwz, float128, uint32, f128, VsrD(0), 0x0ULL) * sfld - source vsr_t field * tfld - target vsr_t field * jdef - definition of the j index (i or 2*i) - * sfprf - set FPRF + * sfifprf - set FI and FPRF */ -#define VSX_CVT_INT_TO_FP(op, nels, stp, ttp, sfld, tfld, sfprf, r2sp) \ +#define VSX_CVT_INT_TO_FP(op, nels, stp, ttp, sfld, tfld, sfifprf, r2sp)\ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ { \ ppc_vsr_t t = { }; \ @@ -XXX,XX +XXX,XX @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ if (r2sp) { \ t.tfld = do_frsp(env, t.tfld, GETPC()); \ } \ - if (sfprf) { \ + if (sfifprf) { \ helper_compute_fprf_float64(env, t.tfld); \ } \ } \ \ *xt = t; \ - do_float_check_status(env, sfprf, GETPC()); \ + do_float_check_status(env, sfifprf, GETPC()); \ } VSX_CVT_INT_TO_FP(xscvsxddp, 1, int64, float64, VsrD(0), VsrD(0), 1, 0) @@ -XXX,XX +XXX,XX @@ VSX_CVT_INT_TO_FP_VECTOR(xscvudqp, uint64, float128, VsrD(0), f128) * tp - type (float32 or float64) * fld - vsr_t field (VsrD(*) or VsrW(*)) * rmode - rounding mode - * sfprf - set FPRF + * sfifprf - set FI and FPRF */ -#define VSX_ROUND(op, nels, tp, fld, rmode, sfprf) \ +#define VSX_ROUND(op, nels, tp, fld, rmode, sfifprf) \ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ { \ ppc_vsr_t t = { }; \ @@ -XXX,XX +XXX,XX @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ } else { \ t.fld = tp##_round_to_int(xb->fld, &env->fp_status); \ } \ - if (sfprf) { \ + if (sfifprf) { \ helper_compute_fprf_float64(env, t.fld); \ } \ } \ @@ -XXX,XX +XXX,XX @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) \ } \ \ *xt = t; \ - do_float_check_status(env, sfprf, GETPC()); \ + do_float_check_status(env, sfifprf, GETPC()); \ } VSX_ROUND(xsrdpi, 1, float64, VsrD(0), float_round_ties_away, 1) -- 2.36.1
From: Frederic Barrat <fbarrat@linux.ibm.com> When writing a register from the TCTXT memory region (4th page within the IC BAR), we were overwriting the Presentation Controller (PC) register at the same offset. It looks like a silly cut and paste error. We were somehow lucky: the TCTXT registers being touched are TCTXT_ENx/_SET/_RESET to enable physical threads and the PC registers at the same offset are either not used by our model or the update was harmless. Found through code inspection. Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Message-Id: <20220523151859.72283-1-fbarrat@linux.ibm.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- hw/intc/pnv_xive2.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/hw/intc/pnv_xive2.c b/hw/intc/pnv_xive2.c index XXXXXXX..XXXXXXX 100644 --- a/hw/intc/pnv_xive2.c +++ b/hw/intc/pnv_xive2.c @@ -XXX,XX +XXX,XX @@ static void pnv_xive2_ic_tctxt_write(void *opaque, hwaddr offset, uint64_t val, unsigned size) { PnvXive2 *xive = PNV_XIVE2(opaque); - uint32_t reg = offset >> 3; switch (offset) { /* @@ -XXX,XX +XXX,XX @@ static void pnv_xive2_ic_tctxt_write(void *opaque, hwaddr offset, xive2_error(xive, "TCTXT: invalid write @%"HWADDR_PRIx, offset); return; } - - xive->pc_regs[reg] = val; } static const MemoryRegionOps pnv_xive2_ic_tctxt_ops = { -- 2.36.1
From: Matheus Ferst <matheus.ferst@eldorado.org.br> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220517123929.284511-2-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- target/ppc/helper.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/target/ppc/helper.h b/target/ppc/helper.h index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/helper.h +++ b/target/ppc/helper.h @@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_2(cmpeqb, TCG_CALL_NO_RWG_SE, i32, tl, tl) DEF_HELPER_FLAGS_1(popcntw, TCG_CALL_NO_RWG_SE, tl, tl) DEF_HELPER_FLAGS_2(bpermd, TCG_CALL_NO_RWG_SE, i64, i64, i64) DEF_HELPER_3(srad, tl, env, tl, tl) -DEF_HELPER_0(darn32, tl) -DEF_HELPER_0(darn64, tl) +DEF_HELPER_FLAGS_0(darn32, TCG_CALL_NO_RWG, tl) +DEF_HELPER_FLAGS_0(darn64, TCG_CALL_NO_RWG, tl) #endif DEF_HELPER_FLAGS_1(cntlsw32, TCG_CALL_NO_RWG_SE, i32, i32) -- 2.36.1
From: Matheus Ferst <matheus.ferst@eldorado.org.br> Helpers of vector instructions without cpu_env as an argument do not access globals. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220517123929.284511-3-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- target/ppc/helper.h | 162 ++++++++++++++++++++++---------------------- 1 file changed, 81 insertions(+), 81 deletions(-) diff --git a/target/ppc/helper.h b/target/ppc/helper.h index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/helper.h +++ b/target/ppc/helper.h @@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_1(ftsqrt, TCG_CALL_NO_RWG_SE, i32, i64) #define dh_ctype_vsr ppc_vsr_t * #define dh_typecode_vsr dh_typecode_ptr -DEF_HELPER_3(vavgub, void, avr, avr, avr) -DEF_HELPER_3(vavguh, void, avr, avr, avr) -DEF_HELPER_3(vavguw, void, avr, avr, avr) -DEF_HELPER_3(vabsdub, void, avr, avr, avr) -DEF_HELPER_3(vabsduh, void, avr, avr, avr) -DEF_HELPER_3(vabsduw, void, avr, avr, avr) -DEF_HELPER_3(vavgsb, void, avr, avr, avr) -DEF_HELPER_3(vavgsh, void, avr, avr, avr) -DEF_HELPER_3(vavgsw, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(vavgub, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(vavguh, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(vavguw, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(vabsdub, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(vabsduh, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(vabsduw, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(vavgsb, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(vavgsh, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(vavgsw, TCG_CALL_NO_RWG, void, avr, avr, avr) DEF_HELPER_4(vcmpeqfp, void, env, avr, avr, avr) DEF_HELPER_4(vcmpgefp, void, env, avr, avr, avr) DEF_HELPER_4(vcmpgtfp, void, env, avr, avr, avr) @@ -XXX,XX +XXX,XX @@ DEF_HELPER_4(vcmpeqfp_dot, void, env, avr, avr, avr) DEF_HELPER_4(vcmpgefp_dot, void, env, avr, avr, avr) DEF_HELPER_4(vcmpgtfp_dot, void, env, avr, avr, avr) DEF_HELPER_4(vcmpbfp_dot, void, env, avr, avr, avr) -DEF_HELPER_3(vmrglb, void, avr, avr, avr) -DEF_HELPER_3(vmrglh, void, avr, avr, avr) -DEF_HELPER_3(vmrglw, void, avr, avr, avr) -DEF_HELPER_3(vmrghb, void, avr, avr, avr) -DEF_HELPER_3(vmrghh, void, avr, avr, avr) -DEF_HELPER_3(vmrghw, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(vmrglb, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(vmrglh, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(vmrglw, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(vmrghb, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(vmrghh, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(vmrghw, TCG_CALL_NO_RWG, void, avr, avr, avr) DEF_HELPER_FLAGS_3(VMULESB, TCG_CALL_NO_RWG, void, avr, avr, avr) DEF_HELPER_FLAGS_3(VMULESH, TCG_CALL_NO_RWG, void, avr, avr, avr) DEF_HELPER_FLAGS_3(VMULESW, TCG_CALL_NO_RWG, void, avr, avr, avr) @@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(VMULOSW, TCG_CALL_NO_RWG, void, avr, avr, avr) DEF_HELPER_FLAGS_3(VMULOUB, TCG_CALL_NO_RWG, void, avr, avr, avr) DEF_HELPER_FLAGS_3(VMULOUH, TCG_CALL_NO_RWG, void, avr, avr, avr) DEF_HELPER_FLAGS_3(VMULOUW, TCG_CALL_NO_RWG, void, avr, avr, avr) -DEF_HELPER_3(vslo, void, avr, avr, avr) -DEF_HELPER_3(vsro, void, avr, avr, avr) -DEF_HELPER_3(vsrv, void, avr, avr, avr) -DEF_HELPER_3(vslv, void, avr, avr, avr) -DEF_HELPER_3(vaddcuw, void, avr, avr, avr) -DEF_HELPER_2(vprtybw, void, avr, avr) -DEF_HELPER_2(vprtybd, void, avr, avr) -DEF_HELPER_2(vprtybq, void, avr, avr) -DEF_HELPER_3(vsubcuw, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(vslo, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(vsro, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(vsrv, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(vslv, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(vaddcuw, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_2(vprtybw, TCG_CALL_NO_RWG, void, avr, avr) +DEF_HELPER_FLAGS_2(vprtybd, TCG_CALL_NO_RWG, void, avr, avr) +DEF_HELPER_FLAGS_2(vprtybq, TCG_CALL_NO_RWG, void, avr, avr) +DEF_HELPER_FLAGS_3(vsubcuw, TCG_CALL_NO_RWG, void, avr, avr, avr) DEF_HELPER_FLAGS_5(vaddsbs, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) DEF_HELPER_FLAGS_5(vaddshs, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) DEF_HELPER_FLAGS_5(vaddsws, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) @@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(vadduws, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) DEF_HELPER_FLAGS_5(vsububs, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) DEF_HELPER_FLAGS_5(vsubuhs, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) DEF_HELPER_FLAGS_5(vsubuws, TCG_CALL_NO_RWG, void, avr, avr, avr, avr, i32) -DEF_HELPER_3(vadduqm, void, avr, avr, avr) -DEF_HELPER_4(vaddecuq, void, avr, avr, avr, avr) -DEF_HELPER_4(vaddeuqm, void, avr, avr, avr, avr) -DEF_HELPER_3(vaddcuq, void, avr, avr, avr) -DEF_HELPER_3(vsubuqm, void, avr, avr, avr) -DEF_HELPER_4(vsubecuq, void, avr, avr, avr, avr) -DEF_HELPER_4(vsubeuqm, void, avr, avr, avr, avr) -DEF_HELPER_3(vsubcuq, void, avr, avr, avr) -DEF_HELPER_4(vsldoi, void, avr, avr, avr, i32) -DEF_HELPER_3(vextractub, void, avr, avr, i32) -DEF_HELPER_3(vextractuh, void, avr, avr, i32) -DEF_HELPER_3(vextractuw, void, avr, avr, i32) -DEF_HELPER_3(vextractd, void, avr, avr, i32) +DEF_HELPER_FLAGS_3(vadduqm, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_4(vaddecuq, TCG_CALL_NO_RWG, void, avr, avr, avr, avr) +DEF_HELPER_FLAGS_4(vaddeuqm, TCG_CALL_NO_RWG, void, avr, avr, avr, avr) +DEF_HELPER_FLAGS_3(vaddcuq, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(vsubuqm, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_4(vsubecuq, TCG_CALL_NO_RWG, void, avr, avr, avr, avr) +DEF_HELPER_FLAGS_4(vsubeuqm, TCG_CALL_NO_RWG, void, avr, avr, avr, avr) +DEF_HELPER_FLAGS_3(vsubcuq, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_4(vsldoi, TCG_CALL_NO_RWG, void, avr, avr, avr, i32) +DEF_HELPER_FLAGS_3(vextractub, TCG_CALL_NO_RWG, void, avr, avr, i32) +DEF_HELPER_FLAGS_3(vextractuh, TCG_CALL_NO_RWG, void, avr, avr, i32) +DEF_HELPER_FLAGS_3(vextractuw, TCG_CALL_NO_RWG, void, avr, avr, i32) +DEF_HELPER_FLAGS_3(vextractd, TCG_CALL_NO_RWG, void, avr, avr, i32) DEF_HELPER_4(VINSBLX, void, env, avr, i64, tl) DEF_HELPER_4(VINSHLX, void, env, avr, i64, tl) DEF_HELPER_4(VINSWLX, void, env, avr, i64, tl) @@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_2(VSTRIBL, TCG_CALL_NO_RWG, i32, avr, avr) DEF_HELPER_FLAGS_2(VSTRIBR, TCG_CALL_NO_RWG, i32, avr, avr) DEF_HELPER_FLAGS_2(VSTRIHL, TCG_CALL_NO_RWG, i32, avr, avr) DEF_HELPER_FLAGS_2(VSTRIHR, TCG_CALL_NO_RWG, i32, avr, avr) -DEF_HELPER_2(vnegw, void, avr, avr) -DEF_HELPER_2(vnegd, void, avr, avr) -DEF_HELPER_2(vupkhpx, void, avr, avr) -DEF_HELPER_2(vupklpx, void, avr, avr) -DEF_HELPER_2(vupkhsb, void, avr, avr) -DEF_HELPER_2(vupkhsh, void, avr, avr) -DEF_HELPER_2(vupkhsw, void, avr, avr) -DEF_HELPER_2(vupklsb, void, avr, avr) -DEF_HELPER_2(vupklsh, void, avr, avr) -DEF_HELPER_2(vupklsw, void, avr, avr) +DEF_HELPER_FLAGS_2(vnegw, TCG_CALL_NO_RWG, void, avr, avr) +DEF_HELPER_FLAGS_2(vnegd, TCG_CALL_NO_RWG, void, avr, avr) +DEF_HELPER_FLAGS_2(vupkhpx, TCG_CALL_NO_RWG, void, avr, avr) +DEF_HELPER_FLAGS_2(vupklpx, TCG_CALL_NO_RWG, void, avr, avr) +DEF_HELPER_FLAGS_2(vupkhsb, TCG_CALL_NO_RWG, void, avr, avr) +DEF_HELPER_FLAGS_2(vupkhsh, TCG_CALL_NO_RWG, void, avr, avr) +DEF_HELPER_FLAGS_2(vupkhsw, TCG_CALL_NO_RWG, void, avr, avr) +DEF_HELPER_FLAGS_2(vupklsb, TCG_CALL_NO_RWG, void, avr, avr) +DEF_HELPER_FLAGS_2(vupklsh, TCG_CALL_NO_RWG, void, avr, avr) +DEF_HELPER_FLAGS_2(vupklsw, TCG_CALL_NO_RWG, void, avr, avr) DEF_HELPER_5(vmsumubm, void, env, avr, avr, avr, avr) DEF_HELPER_5(vmsummbm, void, env, avr, avr, avr, avr) DEF_HELPER_FLAGS_4(VPERM, TCG_CALL_NO_RWG, void, avr, avr, avr, avr) @@ -XXX,XX +XXX,XX @@ DEF_HELPER_4(vpkudus, void, env, avr, avr, avr) DEF_HELPER_4(vpkuhum, void, env, avr, avr, avr) DEF_HELPER_4(vpkuwum, void, env, avr, avr, avr) DEF_HELPER_4(vpkudum, void, env, avr, avr, avr) -DEF_HELPER_3(vpkpx, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(vpkpx, TCG_CALL_NO_RWG, void, avr, avr, avr) DEF_HELPER_5(vmhaddshs, void, env, avr, avr, avr, avr) DEF_HELPER_5(vmhraddshs, void, env, avr, avr, avr, avr) DEF_HELPER_5(vmsumuhm, void, env, avr, avr, avr, avr) DEF_HELPER_5(vmsumuhs, void, env, avr, avr, avr, avr) DEF_HELPER_5(vmsumshm, void, env, avr, avr, avr, avr) DEF_HELPER_5(vmsumshs, void, env, avr, avr, avr, avr) -DEF_HELPER_4(vmladduhm, void, avr, avr, avr, avr) +DEF_HELPER_FLAGS_4(vmladduhm, TCG_CALL_NO_RWG, void, avr, avr, avr, avr) DEF_HELPER_FLAGS_2(mtvscr, TCG_CALL_NO_RWG, void, env, i32) DEF_HELPER_FLAGS_1(mfvscr, TCG_CALL_NO_RWG, i32, env) DEF_HELPER_3(lvebx, void, env, avr, tl) @@ -XXX,XX +XXX,XX @@ DEF_HELPER_4(vcfsx, void, env, avr, avr, i32) DEF_HELPER_4(vctuxs, void, env, avr, avr, i32) DEF_HELPER_4(vctsxs, void, env, avr, avr, i32) -DEF_HELPER_2(vclzb, void, avr, avr) -DEF_HELPER_2(vclzh, void, avr, avr) -DEF_HELPER_2(vctzb, void, avr, avr) -DEF_HELPER_2(vctzh, void, avr, avr) -DEF_HELPER_2(vctzw, void, avr, avr) -DEF_HELPER_2(vctzd, void, avr, avr) -DEF_HELPER_2(vpopcntb, void, avr, avr) -DEF_HELPER_2(vpopcnth, void, avr, avr) -DEF_HELPER_2(vpopcntw, void, avr, avr) -DEF_HELPER_2(vpopcntd, void, avr, avr) -DEF_HELPER_1(vclzlsbb, tl, avr) -DEF_HELPER_1(vctzlsbb, tl, avr) -DEF_HELPER_3(vbpermd, void, avr, avr, avr) -DEF_HELPER_3(vbpermq, void, avr, avr, avr) -DEF_HELPER_3(vpmsumb, void, avr, avr, avr) -DEF_HELPER_3(vpmsumh, void, avr, avr, avr) -DEF_HELPER_3(vpmsumw, void, avr, avr, avr) -DEF_HELPER_3(vpmsumd, void, avr, avr, avr) -DEF_HELPER_2(vextublx, tl, tl, avr) -DEF_HELPER_2(vextuhlx, tl, tl, avr) -DEF_HELPER_2(vextuwlx, tl, tl, avr) -DEF_HELPER_2(vextubrx, tl, tl, avr) -DEF_HELPER_2(vextuhrx, tl, tl, avr) -DEF_HELPER_2(vextuwrx, tl, tl, avr) +DEF_HELPER_FLAGS_2(vclzb, TCG_CALL_NO_RWG, void, avr, avr) +DEF_HELPER_FLAGS_2(vclzh, TCG_CALL_NO_RWG, void, avr, avr) +DEF_HELPER_FLAGS_2(vctzb, TCG_CALL_NO_RWG, void, avr, avr) +DEF_HELPER_FLAGS_2(vctzh, TCG_CALL_NO_RWG, void, avr, avr) +DEF_HELPER_FLAGS_2(vctzw, TCG_CALL_NO_RWG, void, avr, avr) +DEF_HELPER_FLAGS_2(vctzd, TCG_CALL_NO_RWG, void, avr, avr) +DEF_HELPER_FLAGS_2(vpopcntb, TCG_CALL_NO_RWG, void, avr, avr) +DEF_HELPER_FLAGS_2(vpopcnth, TCG_CALL_NO_RWG, void, avr, avr) +DEF_HELPER_FLAGS_2(vpopcntw, TCG_CALL_NO_RWG, void, avr, avr) +DEF_HELPER_FLAGS_2(vpopcntd, TCG_CALL_NO_RWG, void, avr, avr) +DEF_HELPER_FLAGS_1(vclzlsbb, TCG_CALL_NO_RWG, tl, avr) +DEF_HELPER_FLAGS_1(vctzlsbb, TCG_CALL_NO_RWG, tl, avr) +DEF_HELPER_FLAGS_3(vbpermd, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(vbpermq, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(vpmsumb, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(vpmsumh, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(vpmsumw, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(vpmsumd, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_2(vextublx, TCG_CALL_NO_RWG, tl, tl, avr) +DEF_HELPER_FLAGS_2(vextuhlx, TCG_CALL_NO_RWG, tl, tl, avr) +DEF_HELPER_FLAGS_2(vextuwlx, TCG_CALL_NO_RWG, tl, tl, avr) +DEF_HELPER_FLAGS_2(vextubrx, TCG_CALL_NO_RWG, tl, tl, avr) +DEF_HELPER_FLAGS_2(vextuhrx, TCG_CALL_NO_RWG, tl, tl, avr) +DEF_HELPER_FLAGS_2(vextuwrx, TCG_CALL_NO_RWG, tl, tl, avr) DEF_HELPER_5(VEXTDUBVLX, void, env, avr, avr, avr, tl) DEF_HELPER_5(VEXTDUHVLX, void, env, avr, avr, avr, tl) DEF_HELPER_5(VEXTDUWVLX, void, env, avr, avr, avr, tl) DEF_HELPER_5(VEXTDDVLX, void, env, avr, avr, avr, tl) -DEF_HELPER_2(vsbox, void, avr, avr) -DEF_HELPER_3(vcipher, void, avr, avr, avr) -DEF_HELPER_3(vcipherlast, void, avr, avr, avr) -DEF_HELPER_3(vncipher, void, avr, avr, avr) -DEF_HELPER_3(vncipherlast, void, avr, avr, avr) -DEF_HELPER_3(vshasigmaw, void, avr, avr, i32) -DEF_HELPER_3(vshasigmad, void, avr, avr, i32) -DEF_HELPER_4(vpermxor, void, avr, avr, avr, avr) +DEF_HELPER_FLAGS_2(vsbox, TCG_CALL_NO_RWG, void, avr, avr) +DEF_HELPER_FLAGS_3(vcipher, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(vcipherlast, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(vncipher, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(vncipherlast, TCG_CALL_NO_RWG, void, avr, avr, avr) +DEF_HELPER_FLAGS_3(vshasigmaw, TCG_CALL_NO_RWG, void, avr, avr, i32) +DEF_HELPER_FLAGS_3(vshasigmad, TCG_CALL_NO_RWG, void, avr, avr, i32) +DEF_HELPER_FLAGS_4(vpermxor, TCG_CALL_NO_RWG, void, avr, avr, avr, avr) DEF_HELPER_4(bcdadd, i32, avr, avr, avr, i32) DEF_HELPER_4(bcdsub, i32, avr, avr, avr, i32) -- 2.36.1
From: Matheus Ferst <matheus.ferst@eldorado.org.br> Helpers of BCD instructions only access the VSRs supplied by the TCGv_ptr arguments, no globals are accessed. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220517123929.284511-4-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- target/ppc/helper.h | 30 +++++++++++++++--------------- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/target/ppc/helper.h b/target/ppc/helper.h index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/helper.h +++ b/target/ppc/helper.h @@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_3(vshasigmaw, TCG_CALL_NO_RWG, void, avr, avr, i32) DEF_HELPER_FLAGS_3(vshasigmad, TCG_CALL_NO_RWG, void, avr, avr, i32) DEF_HELPER_FLAGS_4(vpermxor, TCG_CALL_NO_RWG, void, avr, avr, avr, avr) -DEF_HELPER_4(bcdadd, i32, avr, avr, avr, i32) -DEF_HELPER_4(bcdsub, i32, avr, avr, avr, i32) -DEF_HELPER_3(bcdcfn, i32, avr, avr, i32) -DEF_HELPER_3(bcdctn, i32, avr, avr, i32) -DEF_HELPER_3(bcdcfz, i32, avr, avr, i32) -DEF_HELPER_3(bcdctz, i32, avr, avr, i32) -DEF_HELPER_3(bcdcfsq, i32, avr, avr, i32) -DEF_HELPER_3(bcdctsq, i32, avr, avr, i32) -DEF_HELPER_4(bcdcpsgn, i32, avr, avr, avr, i32) -DEF_HELPER_3(bcdsetsgn, i32, avr, avr, i32) -DEF_HELPER_4(bcds, i32, avr, avr, avr, i32) -DEF_HELPER_4(bcdus, i32, avr, avr, avr, i32) -DEF_HELPER_4(bcdsr, i32, avr, avr, avr, i32) -DEF_HELPER_4(bcdtrunc, i32, avr, avr, avr, i32) -DEF_HELPER_4(bcdutrunc, i32, avr, avr, avr, i32) +DEF_HELPER_FLAGS_4(bcdadd, TCG_CALL_NO_RWG, i32, avr, avr, avr, i32) +DEF_HELPER_FLAGS_4(bcdsub, TCG_CALL_NO_RWG, i32, avr, avr, avr, i32) +DEF_HELPER_FLAGS_3(bcdcfn, TCG_CALL_NO_RWG, i32, avr, avr, i32) +DEF_HELPER_FLAGS_3(bcdctn, TCG_CALL_NO_RWG, i32, avr, avr, i32) +DEF_HELPER_FLAGS_3(bcdcfz, TCG_CALL_NO_RWG, i32, avr, avr, i32) +DEF_HELPER_FLAGS_3(bcdctz, TCG_CALL_NO_RWG, i32, avr, avr, i32) +DEF_HELPER_FLAGS_3(bcdcfsq, TCG_CALL_NO_RWG, i32, avr, avr, i32) +DEF_HELPER_FLAGS_3(bcdctsq, TCG_CALL_NO_RWG, i32, avr, avr, i32) +DEF_HELPER_FLAGS_4(bcdcpsgn, TCG_CALL_NO_RWG, i32, avr, avr, avr, i32) +DEF_HELPER_FLAGS_3(bcdsetsgn, TCG_CALL_NO_RWG, i32, avr, avr, i32) +DEF_HELPER_FLAGS_4(bcds, TCG_CALL_NO_RWG, i32, avr, avr, avr, i32) +DEF_HELPER_FLAGS_4(bcdus, TCG_CALL_NO_RWG, i32, avr, avr, avr, i32) +DEF_HELPER_FLAGS_4(bcdsr, TCG_CALL_NO_RWG, i32, avr, avr, avr, i32) +DEF_HELPER_FLAGS_4(bcdtrunc, TCG_CALL_NO_RWG, i32, avr, avr, avr, i32) +DEF_HELPER_FLAGS_4(bcdutrunc, TCG_CALL_NO_RWG, i32, avr, avr, avr, i32) DEF_HELPER_4(xsadddp, void, env, vsr, vsr, vsr) DEF_HELPER_5(xsaddqp, void, env, i32, vsr, vsr, vsr) -- 2.36.1
From: Matheus Ferst <matheus.ferst@eldorado.org.br> Helpers of VSX instructions without cpu_env as an argument do not access globals. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220517123929.284511-5-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- target/ppc/helper.h | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/target/ppc/helper.h b/target/ppc/helper.h index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/helper.h +++ b/target/ppc/helper.h @@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(XXPERMX, TCG_CALL_NO_RWG, void, vsr, vsr, vsr, vsr, tl) DEF_HELPER_4(xxinsertw, void, env, vsr, vsr, i32) DEF_HELPER_3(xvxsigsp, void, env, vsr, vsr) DEF_HELPER_FLAGS_5(XXEVAL, TCG_CALL_NO_RWG, void, vsr, vsr, vsr, vsr, i32) -DEF_HELPER_5(XXBLENDVB, void, vsr, vsr, vsr, vsr, i32) -DEF_HELPER_5(XXBLENDVH, void, vsr, vsr, vsr, vsr, i32) -DEF_HELPER_5(XXBLENDVW, void, vsr, vsr, vsr, vsr, i32) -DEF_HELPER_5(XXBLENDVD, void, vsr, vsr, vsr, vsr, i32) +DEF_HELPER_FLAGS_5(XXBLENDVB, TCG_CALL_NO_RWG, void, vsr, vsr, vsr, vsr, i32) +DEF_HELPER_FLAGS_5(XXBLENDVH, TCG_CALL_NO_RWG, void, vsr, vsr, vsr, vsr, i32) +DEF_HELPER_FLAGS_5(XXBLENDVW, TCG_CALL_NO_RWG, void, vsr, vsr, vsr, vsr, i32) +DEF_HELPER_FLAGS_5(XXBLENDVD, TCG_CALL_NO_RWG, void, vsr, vsr, vsr, vsr, i32) DEF_HELPER_2(efscfsi, i32, env, i32) DEF_HELPER_2(efscfui, i32, env, i32) -- 2.36.1
From: Matheus Ferst <matheus.ferst@eldorado.org.br> fsel doesn't change FPSCR and CR1 is handled by gen_set_cr1_from_fpscr, so helper_fsel doesn't need the env argument and can be declared with TCG_CALL_NO_RWG_SE. We also take this opportunity to move the insn to decodetree. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220517123929.284511-6-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- target/ppc/fpu_helper.c | 15 +++++++-------- target/ppc/helper.h | 2 +- target/ppc/insn32.decode | 7 +++++++ target/ppc/translate/fp-impl.c.inc | 30 ++++++++++++++++++++++++++++-- target/ppc/translate/fp-ops.c.inc | 1 - 5 files changed, 43 insertions(+), 12 deletions(-) diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/fpu_helper.c +++ b/target/ppc/fpu_helper.c @@ -XXX,XX +XXX,XX @@ float64 helper_frsqrtes(CPUPPCState *env, float64 arg) } /* fsel - fsel. */ -uint64_t helper_fsel(CPUPPCState *env, uint64_t arg1, uint64_t arg2, - uint64_t arg3) +uint64_t helper_FSEL(uint64_t a, uint64_t b, uint64_t c) { - CPU_DoubleU farg1; + CPU_DoubleU fa; - farg1.ll = arg1; + fa.ll = a; - if ((!float64_is_neg(farg1.d) || float64_is_zero(farg1.d)) && - !float64_is_any_nan(farg1.d)) { - return arg2; + if ((!float64_is_neg(fa.d) || float64_is_zero(fa.d)) && + !float64_is_any_nan(fa.d)) { + return c; } else { - return arg3; + return b; } } diff --git a/target/ppc/helper.h b/target/ppc/helper.h index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/helper.h +++ b/target/ppc/helper.h @@ -XXX,XX +XXX,XX @@ DEF_HELPER_2(fre, i64, env, i64) DEF_HELPER_2(fres, i64, env, i64) DEF_HELPER_2(frsqrte, i64, env, i64) DEF_HELPER_2(frsqrtes, i64, env, i64) -DEF_HELPER_4(fsel, i64, env, i64, i64, i64) +DEF_HELPER_FLAGS_3(FSEL, TCG_CALL_NO_RWG_SE, i64, i64, i64, i64) DEF_HELPER_FLAGS_2(ftdiv, TCG_CALL_NO_RWG_SE, i32, i64, i64) DEF_HELPER_FLAGS_1(ftsqrt, TCG_CALL_NO_RWG_SE, i32, i64) diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/insn32.decode +++ b/target/ppc/insn32.decode @@ -XXX,XX +XXX,XX @@ # License along with this library; if not, see <http://www.gnu.org/licenses/>. # +&A frt fra frb frc rc:bool +@A ...... frt:5 fra:5 frb:5 frc:5 ..... rc:1 &A + &D rt ra si:int64_t @D ...... rt:5 ra:5 si:s16 &D @@ -XXX,XX +XXX,XX @@ STFDU 110111 ..... ...... ............... @D STFDX 011111 ..... ...... .... 1011010111 - @X STFDUX 011111 ..... ...... .... 1011110111 - @X +### Floating-Point Select Instruction + +FSEL 111111 ..... ..... ..... ..... 10111 . @A + ### Move To/From System Register Instructions SETBC 011111 ..... ..... ----- 0110000000 - @X_bi diff --git a/target/ppc/translate/fp-impl.c.inc b/target/ppc/translate/fp-impl.c.inc index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/translate/fp-impl.c.inc +++ b/target/ppc/translate/fp-impl.c.inc @@ -XXX,XX +XXX,XX @@ static void gen_frsqrtes(DisasContext *ctx) tcg_temp_free_i64(t1); } -/* fsel */ -_GEN_FLOAT_ACB(sel, 0x3F, 0x17, 0, PPC_FLOAT_FSEL); +static bool trans_FSEL(DisasContext *ctx, arg_A *a) +{ + TCGv_i64 t0, t1, t2; + + REQUIRE_INSNS_FLAGS(ctx, FLOAT_FSEL); + REQUIRE_FPU(ctx); + + t0 = tcg_temp_new_i64(); + t1 = tcg_temp_new_i64(); + t2 = tcg_temp_new_i64(); + + get_fpr(t0, a->fra); + get_fpr(t1, a->frb); + get_fpr(t2, a->frc); + + gen_helper_FSEL(t0, t0, t1, t2); + set_fpr(a->frt, t0); + if (a->rc) { + gen_set_cr1_from_fpscr(ctx); + } + + tcg_temp_free_i64(t0); + tcg_temp_free_i64(t1); + tcg_temp_free_i64(t2); + + return true; +} + /* fsub - fsubs */ GEN_FLOAT_AB(sub, 0x14, 0x000007C0, 1, PPC_FLOAT); /* Optional: */ diff --git a/target/ppc/translate/fp-ops.c.inc b/target/ppc/translate/fp-ops.c.inc index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/translate/fp-ops.c.inc +++ b/target/ppc/translate/fp-ops.c.inc @@ -XXX,XX +XXX,XX @@ GEN_FLOAT_AC(mul, 0x19, 0x0000F800, 1, PPC_FLOAT), GEN_FLOAT_BS(re, 0x3F, 0x18, 1, PPC_FLOAT_EXT), GEN_FLOAT_BS(res, 0x3B, 0x18, 1, PPC_FLOAT_FRES), GEN_FLOAT_BS(rsqrte, 0x3F, 0x1A, 1, PPC_FLOAT_FRSQRTE), -_GEN_FLOAT_ACB(sel, sel, 0x3F, 0x17, 0, 0, PPC_FLOAT_FSEL), GEN_FLOAT_AB(sub, 0x14, 0x000007C0, 1, PPC_FLOAT), GEN_FLOAT_ACB(madd, 0x1D, 1, PPC_FLOAT), GEN_FLOAT_ACB(msub, 0x1C, 1, PPC_FLOAT), -- 2.36.1
From: Matheus Ferst <matheus.ferst@eldorado.org.br> Move xscvspdpn to decodetree, declare helper_xscvspdpn with TCG_CALL_NO_RWG_SE and drop the unused env argument. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220517123929.284511-7-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- target/ppc/fpu_helper.c | 2 +- target/ppc/helper.h | 2 +- target/ppc/insn32.decode | 1 + target/ppc/translate/vsx-impl.c.inc | 22 +++++++++++++++++++++- target/ppc/translate/vsx-ops.c.inc | 1 - 5 files changed, 24 insertions(+), 4 deletions(-) diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/fpu_helper.c +++ b/target/ppc/fpu_helper.c @@ -XXX,XX +XXX,XX @@ uint64_t helper_xscvdpspn(CPUPPCState *env, uint64_t xb) return (result << 32) | result; } -uint64_t helper_xscvspdpn(CPUPPCState *env, uint64_t xb) +uint64_t helper_XSCVSPDPN(uint64_t xb) { return helper_todouble(xb >> 32); } diff --git a/target/ppc/helper.h b/target/ppc/helper.h index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/helper.h +++ b/target/ppc/helper.h @@ -XXX,XX +XXX,XX @@ DEF_HELPER_3(XSCVSQQP, void, env, vsr, vsr) DEF_HELPER_3(xscvhpdp, void, env, vsr, vsr) DEF_HELPER_4(xscvsdqp, void, env, i32, vsr, vsr) DEF_HELPER_3(xscvspdp, void, env, vsr, vsr) -DEF_HELPER_2(xscvspdpn, i64, env, i64) +DEF_HELPER_FLAGS_1(XSCVSPDPN, TCG_CALL_NO_RWG_SE, i64, i64) DEF_HELPER_3(xscvdpsxds, void, env, vsr, vsr) DEF_HELPER_3(xscvdpsxws, void, env, vsr, vsr) DEF_HELPER_3(xscvdpuxds, void, env, vsr, vsr) diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/insn32.decode +++ b/target/ppc/insn32.decode @@ -XXX,XX +XXX,XX @@ XSCVUQQP 111111 ..... 00011 ..... 1101000100 - @X_tb XSCVSQQP 111111 ..... 01011 ..... 1101000100 - @X_tb XVCVBF16SPN 111100 ..... 10000 ..... 111011011 .. @XX2 XVCVSPBF16 111100 ..... 10001 ..... 111011011 .. @XX2 +XSCVSPDPN 111100 ..... ----- ..... 101001011 .. @XX2 ## VSX Vector Test Least-Significant Bit by Byte Instruction diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/translate/vsx-impl.c.inc +++ b/target/ppc/translate/vsx-impl.c.inc @@ -XXX,XX +XXX,XX @@ GEN_VSX_HELPER_R2(xscvqpuwz, 0x04, 0x1A, 0x01, PPC2_ISA300) GEN_VSX_HELPER_X2(xscvhpdp, 0x16, 0x15, 0x10, PPC2_ISA300) GEN_VSX_HELPER_R2(xscvsdqp, 0x04, 0x1A, 0x0A, PPC2_ISA300) GEN_VSX_HELPER_X2(xscvspdp, 0x12, 0x14, 0, PPC2_VSX) -GEN_VSX_HELPER_XT_XB_ENV(xscvspdpn, 0x16, 0x14, 0, PPC2_VSX207) + +bool trans_XSCVSPDPN(DisasContext *ctx, arg_XX2 *a) +{ + TCGv_i64 tmp; + + REQUIRE_INSNS_FLAGS2(ctx, VSX207); + REQUIRE_VSX(ctx); + + tmp = tcg_temp_new_i64(); + get_cpu_vsr(tmp, a->xb, true); + + gen_helper_XSCVSPDPN(tmp, tmp); + + set_cpu_vsr(a->xt, tmp, true); + set_cpu_vsr(a->xt, tcg_constant_i64(0), false); + + tcg_temp_free_i64(tmp); + + return true; +} + GEN_VSX_HELPER_X2(xscvdpsxds, 0x10, 0x15, 0, PPC2_VSX) GEN_VSX_HELPER_X2(xscvdpsxws, 0x10, 0x05, 0, PPC2_VSX) GEN_VSX_HELPER_X2(xscvdpuxds, 0x10, 0x14, 0, PPC2_VSX) diff --git a/target/ppc/translate/vsx-ops.c.inc b/target/ppc/translate/vsx-ops.c.inc index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/translate/vsx-ops.c.inc +++ b/target/ppc/translate/vsx-ops.c.inc @@ -XXX,XX +XXX,XX @@ GEN_XX2FORM(xscvdpspn, 0x16, 0x10, PPC2_VSX207), GEN_XX2FORM_EO(xscvhpdp, 0x16, 0x15, 0x10, PPC2_ISA300), GEN_VSX_XFORM_300_EO(xscvsdqp, 0x04, 0x1A, 0x0A, 0x00000001), GEN_XX2FORM(xscvspdp, 0x12, 0x14, PPC2_VSX), -GEN_XX2FORM(xscvspdpn, 0x16, 0x14, PPC2_VSX207), GEN_XX2FORM(xscvdpsxds, 0x10, 0x15, PPC2_VSX), GEN_XX2FORM(xscvdpsxws, 0x10, 0x05, PPC2_VSX), GEN_XX2FORM(xscvdpuxds, 0x10, 0x14, PPC2_VSX), -- 2.36.1
From: Matheus Ferst <matheus.ferst@eldorado.org.br> Move xvxsigsp to decodetree, declare helper_xvxsigsp with TCG_CALL_NO_RWG, and drop the unused env argument. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220517123929.284511-8-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- target/ppc/fpu_helper.c | 2 +- target/ppc/helper.h | 2 +- target/ppc/insn32.decode | 4 ++++ target/ppc/translate/vsx-impl.c.inc | 18 +++++++++++++++++- target/ppc/translate/vsx-ops.c.inc | 1 - 5 files changed, 23 insertions(+), 4 deletions(-) diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/fpu_helper.c +++ b/target/ppc/fpu_helper.c @@ -XXX,XX +XXX,XX @@ uint64_t helper_xsrsp(CPUPPCState *env, uint64_t xb) return xt; } -void helper_xvxsigsp(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb) +void helper_XVXSIGSP(ppc_vsr_t *xt, ppc_vsr_t *xb) { ppc_vsr_t t = { }; uint32_t exp, i, fraction; diff --git a/target/ppc/helper.h b/target/ppc/helper.h index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/helper.h +++ b/target/ppc/helper.h @@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_2(XXGENPCVDM_le_comp, TCG_CALL_NO_RWG, void, vsr, avr) DEF_HELPER_4(xxextractuw, void, env, vsr, vsr, i32) DEF_HELPER_FLAGS_5(XXPERMX, TCG_CALL_NO_RWG, void, vsr, vsr, vsr, vsr, tl) DEF_HELPER_4(xxinsertw, void, env, vsr, vsr, i32) -DEF_HELPER_3(xvxsigsp, void, env, vsr, vsr) +DEF_HELPER_FLAGS_2(XVXSIGSP, TCG_CALL_NO_RWG, void, vsr, vsr) DEF_HELPER_FLAGS_5(XXEVAL, TCG_CALL_NO_RWG, void, vsr, vsr, vsr, vsr, i32) DEF_HELPER_FLAGS_5(XXBLENDVB, TCG_CALL_NO_RWG, void, vsr, vsr, vsr, vsr, i32) DEF_HELPER_FLAGS_5(XXBLENDVH, TCG_CALL_NO_RWG, void, vsr, vsr, vsr, vsr, i32) diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/insn32.decode +++ b/target/ppc/insn32.decode @@ -XXX,XX +XXX,XX @@ XVCVBF16SPN 111100 ..... 10000 ..... 111011011 .. @XX2 XVCVSPBF16 111100 ..... 10001 ..... 111011011 .. @XX2 XSCVSPDPN 111100 ..... ----- ..... 101001011 .. @XX2 +## VSX Binary Floating-Point Math Support Instructions + +XVXSIGSP 111100 ..... 01001 ..... 111011011 .. @XX2 + ## VSX Vector Test Least-Significant Bit by Byte Instruction XVTLSBB 111100 ... -- 00010 ..... 111011011 . - @XX2_bf_xb diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/translate/vsx-impl.c.inc +++ b/target/ppc/translate/vsx-impl.c.inc @@ -XXX,XX +XXX,XX @@ static void gen_xvxexpdp(DisasContext *ctx) tcg_temp_free_i64(xbl); } -GEN_VSX_HELPER_X2(xvxsigsp, 0x00, 0x04, 0, PPC2_ISA300) +static bool trans_XVXSIGSP(DisasContext *ctx, arg_XX2 *a) +{ + TCGv_ptr t, b; + + REQUIRE_INSNS_FLAGS2(ctx, ISA300); + REQUIRE_VSX(ctx); + + t = gen_vsr_ptr(a->xt); + b = gen_vsr_ptr(a->xb); + + gen_helper_XVXSIGSP(t, b); + + tcg_temp_free_ptr(t); + tcg_temp_free_ptr(b); + + return true; +} static void gen_xvxsigdp(DisasContext *ctx) { diff --git a/target/ppc/translate/vsx-ops.c.inc b/target/ppc/translate/vsx-ops.c.inc index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/translate/vsx-ops.c.inc +++ b/target/ppc/translate/vsx-ops.c.inc @@ -XXX,XX +XXX,XX @@ GEN_XX3FORM(xviexpdp, 0x00, 0x1F, PPC2_ISA300), GEN_XX2FORM_EO(xvxexpdp, 0x16, 0x1D, 0x00, PPC2_ISA300), GEN_XX2FORM_EO(xvxsigdp, 0x16, 0x1D, 0x01, PPC2_ISA300), GEN_XX2FORM_EO(xvxexpsp, 0x16, 0x1D, 0x08, PPC2_ISA300), -GEN_XX2FORM_EO(xvxsigsp, 0x16, 0x1D, 0x09, PPC2_ISA300), /* DCMX = bit[25] << 6 | bit[29] << 5 | bit[11:15] */ #define GEN_XX2FORM_DCMX(name, opc2, opc3, fl2) \ -- 2.36.1
From: Matheus Ferst <matheus.ferst@eldorado.org.br> Move xxextractuw and xxinsertw to decodetree, declare both helpers with TCG_CALL_NO_RWG, and drop the unused env argument. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220517123929.284511-9-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- target/ppc/helper.h | 4 +- target/ppc/insn32.decode | 9 +++- target/ppc/int_helper.c | 6 +-- target/ppc/translate/vsx-impl.c.inc | 67 +++++++++++++---------------- target/ppc/translate/vsx-ops.c.inc | 2 - 5 files changed, 41 insertions(+), 47 deletions(-) diff --git a/target/ppc/helper.h b/target/ppc/helper.h index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/helper.h +++ b/target/ppc/helper.h @@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_2(XXGENPCVDM_be_exp, TCG_CALL_NO_RWG, void, vsr, avr) DEF_HELPER_FLAGS_2(XXGENPCVDM_be_comp, TCG_CALL_NO_RWG, void, vsr, avr) DEF_HELPER_FLAGS_2(XXGENPCVDM_le_exp, TCG_CALL_NO_RWG, void, vsr, avr) DEF_HELPER_FLAGS_2(XXGENPCVDM_le_comp, TCG_CALL_NO_RWG, void, vsr, avr) -DEF_HELPER_4(xxextractuw, void, env, vsr, vsr, i32) +DEF_HELPER_FLAGS_3(XXEXTRACTUW, TCG_CALL_NO_RWG, void, vsr, vsr, i32) DEF_HELPER_FLAGS_5(XXPERMX, TCG_CALL_NO_RWG, void, vsr, vsr, vsr, vsr, tl) -DEF_HELPER_4(xxinsertw, void, env, vsr, vsr, i32) +DEF_HELPER_FLAGS_3(XXINSERTW, TCG_CALL_NO_RWG, void, vsr, vsr, i32) DEF_HELPER_FLAGS_2(XVXSIGSP, TCG_CALL_NO_RWG, void, vsr, vsr) DEF_HELPER_FLAGS_5(XXEVAL, TCG_CALL_NO_RWG, void, vsr, vsr, vsr, vsr, i32) DEF_HELPER_FLAGS_5(XXBLENDVB, TCG_CALL_NO_RWG, void, vsr, vsr, vsr, vsr, i32) diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/insn32.decode +++ b/target/ppc/insn32.decode @@ -XXX,XX +XXX,XX @@ &XX2 xt xb @XX2 ...... ..... ..... ..... ......... .. &XX2 xt=%xx_xt xb=%xx_xb -&XX2_uim2 xt xb uim:uint8_t -@XX2_uim2 ...... ..... ... uim:2 ..... ......... .. &XX2_uim2 xt=%xx_xt xb=%xx_xb +&XX2_uim xt xb uim:uint8_t +@XX2_uim2 ...... ..... ... uim:2 ..... ......... .. &XX2_uim xt=%xx_xt xb=%xx_xb + +@XX2_uim4 ...... ..... . uim:4 ..... ......... .. &XX2_uim xt=%xx_xt xb=%xx_xb &XX2_bf_xb bf xb @XX2_bf_xb ...... bf:3 .. ..... ..... ......... . . &XX2_bf_xb xb=%xx_xb @@ -XXX,XX +XXX,XX @@ XXSPLTW 111100 ..... ---.. ..... 010100100 . . @XX2_uim2 ## VSX Permute Instructions +XXEXTRACTUW 111100 ..... - .... ..... 010100101 .. @XX2_uim4 +XXINSERTW 111100 ..... - .... ..... 010110101 .. @XX2_uim4 + XXPERM 111100 ..... ..... ..... 00011010 ... @XX3 XXPERMR 111100 ..... ..... ..... 00111010 ... @XX3 XXPERMDI 111100 ..... ..... ..... 0 .. 01010 ... @XX3_dm diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/int_helper.c +++ b/target/ppc/int_helper.c @@ -XXX,XX +XXX,XX @@ VSTRI(VSTRIHL, H, 8, true) VSTRI(VSTRIHR, H, 8, false) #undef VSTRI -void helper_xxextractuw(CPUPPCState *env, ppc_vsr_t *xt, - ppc_vsr_t *xb, uint32_t index) +void helper_XXEXTRACTUW(ppc_vsr_t *xt, ppc_vsr_t *xb, uint32_t index) { ppc_vsr_t t = { }; size_t es = sizeof(uint32_t); @@ -XXX,XX +XXX,XX @@ void helper_xxextractuw(CPUPPCState *env, ppc_vsr_t *xt, *xt = t; } -void helper_xxinsertw(CPUPPCState *env, ppc_vsr_t *xt, - ppc_vsr_t *xb, uint32_t index) +void helper_XXINSERTW(ppc_vsr_t *xt, ppc_vsr_t *xb, uint32_t index) { ppc_vsr_t t = *xt; size_t es = sizeof(uint32_t); diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/translate/vsx-impl.c.inc +++ b/target/ppc/translate/vsx-impl.c.inc @@ -XXX,XX +XXX,XX @@ static bool trans_XXSEL(DisasContext *ctx, arg_XX4 *a) return true; } -static bool trans_XXSPLTW(DisasContext *ctx, arg_XX2_uim2 *a) +static bool trans_XXSPLTW(DisasContext *ctx, arg_XX2_uim *a) { int tofs, bofs; @@ -XXX,XX +XXX,XX @@ static void gen_xxsldwi(DisasContext *ctx) tcg_temp_free_i64(xtl); } -#define VSX_EXTRACT_INSERT(name) \ -static void gen_##name(DisasContext *ctx) \ -{ \ - TCGv_ptr xt, xb; \ - TCGv_i32 t0; \ - TCGv_i64 t1; \ - uint8_t uimm = UIMM4(ctx->opcode); \ - \ - if (unlikely(!ctx->vsx_enabled)) { \ - gen_exception(ctx, POWERPC_EXCP_VSXU); \ - return; \ - } \ - xt = gen_vsr_ptr(xT(ctx->opcode)); \ - xb = gen_vsr_ptr(xB(ctx->opcode)); \ - t0 = tcg_temp_new_i32(); \ - t1 = tcg_temp_new_i64(); \ - /* \ - * uimm > 15 out of bound and for \ - * uimm > 12 handle as per hardware in helper \ - */ \ - if (uimm > 15) { \ - tcg_gen_movi_i64(t1, 0); \ - set_cpu_vsr(xT(ctx->opcode), t1, true); \ - set_cpu_vsr(xT(ctx->opcode), t1, false); \ - return; \ - } \ - tcg_gen_movi_i32(t0, uimm); \ - gen_helper_##name(cpu_env, xt, xb, t0); \ - tcg_temp_free_ptr(xb); \ - tcg_temp_free_ptr(xt); \ - tcg_temp_free_i32(t0); \ - tcg_temp_free_i64(t1); \ -} - -VSX_EXTRACT_INSERT(xxextractuw) -VSX_EXTRACT_INSERT(xxinsertw) +static bool do_vsx_extract_insert(DisasContext *ctx, arg_XX2_uim *a, + void (*gen_helper)(TCGv_ptr, TCGv_ptr, TCGv_i32)) +{ + TCGv_i64 zero = tcg_constant_i64(0); + TCGv_ptr xt, xb; + + REQUIRE_INSNS_FLAGS2(ctx, ISA300); + REQUIRE_VSX(ctx); + + /* + * uim > 15 out of bound and for + * uim > 12 handle as per hardware in helper + */ + if (a->uim > 15) { + set_cpu_vsr(a->xt, zero, true); + set_cpu_vsr(a->xt, zero, false); + } else { + xt = gen_vsr_ptr(a->xt); + xb = gen_vsr_ptr(a->xb); + gen_helper(xt, xb, tcg_constant_i32(a->uim)); + tcg_temp_free_ptr(xb); + tcg_temp_free_ptr(xt); + } + + return true; +} + +TRANS(XXEXTRACTUW, do_vsx_extract_insert, gen_helper_XXEXTRACTUW) +TRANS(XXINSERTW, do_vsx_extract_insert, gen_helper_XXINSERTW) #ifdef TARGET_PPC64 static void gen_xsxexpdp(DisasContext *ctx) diff --git a/target/ppc/translate/vsx-ops.c.inc b/target/ppc/translate/vsx-ops.c.inc index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/translate/vsx-ops.c.inc +++ b/target/ppc/translate/vsx-ops.c.inc @@ -XXX,XX +XXX,XX @@ VSX_LOGICAL(xxlorc, 0x8, 0x15, PPC2_VSX207), GEN_XX3FORM(xxmrghw, 0x08, 0x02, PPC2_VSX), GEN_XX3FORM(xxmrglw, 0x08, 0x06, PPC2_VSX), GEN_XX3FORM_DM(xxsldwi, 0x08, 0x00), -GEN_XX2FORM_EXT(xxextractuw, 0x0A, 0x0A, PPC2_ISA300), -GEN_XX2FORM_EXT(xxinsertw, 0x0A, 0x0B, PPC2_ISA300), -- 2.36.1
From: Matheus Ferst <matheus.ferst@eldorado.org.br> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220517123929.284511-10-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- target/ppc/translate/vmx-impl.c.inc | 32 +++++------------------------ 1 file changed, 5 insertions(+), 27 deletions(-) diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/translate/vmx-impl.c.inc +++ b/target/ppc/translate/vmx-impl.c.inc @@ -XXX,XX +XXX,XX @@ static void gen_vmladduhm(DisasContext *ctx) tcg_temp_free_ptr(rd); } -static bool trans_VPERM(DisasContext *ctx, arg_VA *a) +static bool do_va_helper(DisasContext *ctx, arg_VA *a, + void (*gen_helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr)) { TCGv_ptr vrt, vra, vrb, vrc; - - REQUIRE_INSNS_FLAGS(ctx, ALTIVEC); REQUIRE_VECTOR(ctx); vrt = gen_avr_ptr(a->vrt); vra = gen_avr_ptr(a->vra); vrb = gen_avr_ptr(a->vrb); vrc = gen_avr_ptr(a->rc); - - gen_helper_VPERM(vrt, vra, vrb, vrc); - + gen_helper(vrt, vra, vrb, vrc); tcg_temp_free_ptr(vrt); tcg_temp_free_ptr(vra); tcg_temp_free_ptr(vrb); @@ -XXX,XX +XXX,XX @@ static bool trans_VPERM(DisasContext *ctx, arg_VA *a) return true; } -static bool trans_VPERMR(DisasContext *ctx, arg_VA *a) -{ - TCGv_ptr vrt, vra, vrb, vrc; - - REQUIRE_INSNS_FLAGS2(ctx, ISA300); - REQUIRE_VECTOR(ctx); - - vrt = gen_avr_ptr(a->vrt); - vra = gen_avr_ptr(a->vra); - vrb = gen_avr_ptr(a->vrb); - vrc = gen_avr_ptr(a->rc); - - gen_helper_VPERMR(vrt, vra, vrb, vrc); - - tcg_temp_free_ptr(vrt); - tcg_temp_free_ptr(vra); - tcg_temp_free_ptr(vrb); - tcg_temp_free_ptr(vrc); - - return true; -} +TRANS_FLAGS(ALTIVEC, VPERM, do_va_helper, gen_helper_VPERM) +TRANS_FLAGS2(ISA300, VPERMR, do_va_helper, gen_helper_VPERMR) static bool trans_VSEL(DisasContext *ctx, arg_VA *a) { -- 2.36.1
From: Matheus Ferst <matheus.ferst@eldorado.org.br> Move vmsumubm and vmsummbm to decodetree, declare both helpers with TCG_CALL_NO_RWG, and drop the unused env argument. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220517123929.284511-11-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- target/ppc/helper.h | 4 ++-- target/ppc/insn32.decode | 3 +++ target/ppc/int_helper.c | 6 ++---- target/ppc/translate/vmx-impl.c.inc | 5 ++++- target/ppc/translate/vmx-ops.c.inc | 2 -- 5 files changed, 11 insertions(+), 9 deletions(-) diff --git a/target/ppc/helper.h b/target/ppc/helper.h index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/helper.h +++ b/target/ppc/helper.h @@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_2(vupkhsw, TCG_CALL_NO_RWG, void, avr, avr) DEF_HELPER_FLAGS_2(vupklsb, TCG_CALL_NO_RWG, void, avr, avr) DEF_HELPER_FLAGS_2(vupklsh, TCG_CALL_NO_RWG, void, avr, avr) DEF_HELPER_FLAGS_2(vupklsw, TCG_CALL_NO_RWG, void, avr, avr) -DEF_HELPER_5(vmsumubm, void, env, avr, avr, avr, avr) -DEF_HELPER_5(vmsummbm, void, env, avr, avr, avr, avr) +DEF_HELPER_FLAGS_4(VMSUMUBM, TCG_CALL_NO_RWG, void, avr, avr, avr, avr) +DEF_HELPER_FLAGS_4(VMSUMMBM, TCG_CALL_NO_RWG, void, avr, avr, avr, avr) DEF_HELPER_FLAGS_4(VPERM, TCG_CALL_NO_RWG, void, avr, avr, avr, avr) DEF_HELPER_FLAGS_4(VPERMR, TCG_CALL_NO_RWG, void, avr, avr, avr, avr) DEF_HELPER_4(vpkshss, void, env, avr, avr, avr) diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/insn32.decode +++ b/target/ppc/insn32.decode @@ -XXX,XX +XXX,XX @@ VMULLD 000100 ..... ..... ..... 00111001001 @VX ## Vector Multiply-Sum Instructions +VMSUMUBM 000100 ..... ..... ..... ..... 100100 @VA +VMSUMMBM 000100 ..... ..... ..... ..... 100101 @VA + VMSUMCUD 000100 ..... ..... ..... ..... 010111 @VA VMSUMUDM 000100 ..... ..... ..... ..... 100011 @VA diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/int_helper.c +++ b/target/ppc/int_helper.c @@ -XXX,XX +XXX,XX @@ VMRG(w, u32, VsrW) #undef VMRG_DO #undef VMRG -void helper_vmsummbm(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, - ppc_avr_t *b, ppc_avr_t *c) +void helper_VMSUMMBM(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c) { int32_t prod[16]; int i; @@ -XXX,XX +XXX,XX @@ void helper_vmsumshs(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, } } -void helper_vmsumubm(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, - ppc_avr_t *b, ppc_avr_t *c) +void helper_VMSUMUBM(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c) { uint16_t prod[16]; int i; diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/translate/vmx-impl.c.inc +++ b/target/ppc/translate/vmx-impl.c.inc @@ -XXX,XX +XXX,XX @@ static bool trans_VSEL(DisasContext *ctx, arg_VA *a) return true; } -GEN_VAFORM_PAIRED(vmsumubm, vmsummbm, 18) GEN_VAFORM_PAIRED(vmsumuhm, vmsumuhs, 19) GEN_VAFORM_PAIRED(vmsumshm, vmsumshs, 20) + +TRANS_FLAGS(ALTIVEC, VMSUMUBM, do_va_helper, gen_helper_VMSUMUBM) +TRANS_FLAGS(ALTIVEC, VMSUMMBM, do_va_helper, gen_helper_VMSUMMBM) + GEN_VAFORM_PAIRED(vmaddfp, vnmsubfp, 23) GEN_VXFORM_NOA(vclzb, 1, 28) diff --git a/target/ppc/translate/vmx-ops.c.inc b/target/ppc/translate/vmx-ops.c.inc index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/translate/vmx-ops.c.inc +++ b/target/ppc/translate/vmx-ops.c.inc @@ -XXX,XX +XXX,XX @@ GEN_VXFORM_UIMM(vcfsx, 5, 13), GEN_VXFORM_UIMM(vctuxs, 5, 14), GEN_VXFORM_UIMM(vctsxs, 5, 15), - #define GEN_VAFORM_PAIRED(name0, name1, opc2) \ GEN_HANDLER(name0##_##name1, 0x04, opc2, 0xFF, 0x00000000, PPC_ALTIVEC) GEN_VAFORM_PAIRED(vmhaddshs, vmhraddshs, 16), -GEN_VAFORM_PAIRED(vmsumubm, vmsummbm, 18), GEN_VAFORM_PAIRED(vmsumuhm, vmsumuhs, 19), GEN_VAFORM_PAIRED(vmsumshm, vmsumshs, 20), GEN_VAFORM_PAIRED(vmaddfp, vnmsubfp, 23), -- 2.36.1
From: Matheus Ferst <matheus.ferst@eldorado.org.br> Move vmsumuhm and vmsumuhs to decodetree, declare vmsumuhm helper with TCG_CALL_NO_RWG, and drop the unused env argument. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220517123929.284511-12-matheus.ferst@eldorado.org.br> [danielhb: added #undef VMSUMUHM to fix ppc64 build] Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- target/ppc/helper.h | 4 ++-- target/ppc/insn32.decode | 2 ++ target/ppc/int_helper.c | 5 ++--- target/ppc/translate/vmx-impl.c.inc | 24 ++++++++++++++++++++++-- target/ppc/translate/vmx-ops.c.inc | 1 - tcg/ppc/tcg-target.c.inc | 1 + 6 files changed, 29 insertions(+), 8 deletions(-) diff --git a/target/ppc/helper.h b/target/ppc/helper.h index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/helper.h +++ b/target/ppc/helper.h @@ -XXX,XX +XXX,XX @@ DEF_HELPER_4(vpkudum, void, env, avr, avr, avr) DEF_HELPER_FLAGS_3(vpkpx, TCG_CALL_NO_RWG, void, avr, avr, avr) DEF_HELPER_5(vmhaddshs, void, env, avr, avr, avr, avr) DEF_HELPER_5(vmhraddshs, void, env, avr, avr, avr, avr) -DEF_HELPER_5(vmsumuhm, void, env, avr, avr, avr, avr) -DEF_HELPER_5(vmsumuhs, void, env, avr, avr, avr, avr) +DEF_HELPER_FLAGS_4(VMSUMUHM, TCG_CALL_NO_RWG, void, avr, avr, avr, avr) +DEF_HELPER_5(VMSUMUHS, void, env, avr, avr, avr, avr) DEF_HELPER_5(vmsumshm, void, env, avr, avr, avr, avr) DEF_HELPER_5(vmsumshs, void, env, avr, avr, avr, avr) DEF_HELPER_FLAGS_4(vmladduhm, TCG_CALL_NO_RWG, void, avr, avr, avr, avr) diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/insn32.decode +++ b/target/ppc/insn32.decode @@ -XXX,XX +XXX,XX @@ VMULLD 000100 ..... ..... ..... 00111001001 @VX VMSUMUBM 000100 ..... ..... ..... ..... 100100 @VA VMSUMMBM 000100 ..... ..... ..... ..... 100101 @VA +VMSUMUHM 000100 ..... ..... ..... ..... 100110 @VA +VMSUMUHS 000100 ..... ..... ..... ..... 100111 @VA VMSUMCUD 000100 ..... ..... ..... ..... 010111 @VA VMSUMUDM 000100 ..... ..... ..... ..... 100011 @VA diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/int_helper.c +++ b/target/ppc/int_helper.c @@ -XXX,XX +XXX,XX @@ void helper_VMSUMUBM(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c) } } -void helper_vmsumuhm(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, - ppc_avr_t *b, ppc_avr_t *c) +void helper_VMSUMUHM(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c) { uint32_t prod[8]; int i; @@ -XXX,XX +XXX,XX @@ void helper_vmsumuhm(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, } } -void helper_vmsumuhs(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, +void helper_VMSUMUHS(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c) { uint32_t prod[8]; diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/translate/vmx-impl.c.inc +++ b/target/ppc/translate/vmx-impl.c.inc @@ -XXX,XX +XXX,XX @@ static bool trans_VSEL(DisasContext *ctx, arg_VA *a) return true; } -GEN_VAFORM_PAIRED(vmsumuhm, vmsumuhs, 19) GEN_VAFORM_PAIRED(vmsumshm, vmsumshs, 20) - TRANS_FLAGS(ALTIVEC, VMSUMUBM, do_va_helper, gen_helper_VMSUMUBM) TRANS_FLAGS(ALTIVEC, VMSUMMBM, do_va_helper, gen_helper_VMSUMMBM) +TRANS_FLAGS(ALTIVEC, VMSUMUHM, do_va_helper, gen_helper_VMSUMUHM) + +static bool do_va_env_helper(DisasContext *ctx, arg_VA *a, + void (*gen_helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr)) +{ + TCGv_ptr vrt, vra, vrb, vrc; + REQUIRE_VECTOR(ctx); + + vrt = gen_avr_ptr(a->vrt); + vra = gen_avr_ptr(a->vra); + vrb = gen_avr_ptr(a->vrb); + vrc = gen_avr_ptr(a->rc); + gen_helper(cpu_env, vrt, vra, vrb, vrc); + tcg_temp_free_ptr(vrt); + tcg_temp_free_ptr(vra); + tcg_temp_free_ptr(vrb); + tcg_temp_free_ptr(vrc); + + return true; +} + +TRANS_FLAGS(ALTIVEC, VMSUMUHS, do_va_env_helper, gen_helper_VMSUMUHS) GEN_VAFORM_PAIRED(vmaddfp, vnmsubfp, 23) diff --git a/target/ppc/translate/vmx-ops.c.inc b/target/ppc/translate/vmx-ops.c.inc index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/translate/vmx-ops.c.inc +++ b/target/ppc/translate/vmx-ops.c.inc @@ -XXX,XX +XXX,XX @@ GEN_VXFORM_UIMM(vctsxs, 5, 15), #define GEN_VAFORM_PAIRED(name0, name1, opc2) \ GEN_HANDLER(name0##_##name1, 0x04, opc2, 0xFF, 0x00000000, PPC_ALTIVEC) GEN_VAFORM_PAIRED(vmhaddshs, vmhraddshs, 16), -GEN_VAFORM_PAIRED(vmsumuhm, vmsumuhs, 19), GEN_VAFORM_PAIRED(vmsumshm, vmsumshs, 20), GEN_VAFORM_PAIRED(vmaddfp, vnmsubfp, 23), diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc index XXXXXXX..XXXXXXX 100644 --- a/tcg/ppc/tcg-target.c.inc +++ b/tcg/ppc/tcg-target.c.inc @@ -XXX,XX +XXX,XX @@ void tcg_register_jit(const void *buf, size_t buf_size) #undef VMULOUB #undef VMULOUH #undef VMULOUW +#undef VMSUMUHM -- 2.36.1
From: Matheus Ferst <matheus.ferst@eldorado.org.br> Move vmsumshm and vmsumshs to decodetree, declare vmsumshm helper with TCG_CALL_NO_RWG, and drop the unused env argument. Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220517123929.284511-13-matheus.ferst@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- target/ppc/helper.h | 4 ++-- target/ppc/insn32.decode | 2 ++ target/ppc/int_helper.c | 5 ++--- target/ppc/translate/vmx-impl.c.inc | 3 ++- target/ppc/translate/vmx-ops.c.inc | 1 - 5 files changed, 8 insertions(+), 7 deletions(-) diff --git a/target/ppc/helper.h b/target/ppc/helper.h index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/helper.h +++ b/target/ppc/helper.h @@ -XXX,XX +XXX,XX @@ DEF_HELPER_5(vmhaddshs, void, env, avr, avr, avr, avr) DEF_HELPER_5(vmhraddshs, void, env, avr, avr, avr, avr) DEF_HELPER_FLAGS_4(VMSUMUHM, TCG_CALL_NO_RWG, void, avr, avr, avr, avr) DEF_HELPER_5(VMSUMUHS, void, env, avr, avr, avr, avr) -DEF_HELPER_5(vmsumshm, void, env, avr, avr, avr, avr) -DEF_HELPER_5(vmsumshs, void, env, avr, avr, avr, avr) +DEF_HELPER_FLAGS_4(VMSUMSHM, TCG_CALL_NO_RWG, void, avr, avr, avr, avr) +DEF_HELPER_5(VMSUMSHS, void, env, avr, avr, avr, avr) DEF_HELPER_FLAGS_4(vmladduhm, TCG_CALL_NO_RWG, void, avr, avr, avr, avr) DEF_HELPER_FLAGS_2(mtvscr, TCG_CALL_NO_RWG, void, env, i32) DEF_HELPER_FLAGS_1(mfvscr, TCG_CALL_NO_RWG, i32, env) diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/insn32.decode +++ b/target/ppc/insn32.decode @@ -XXX,XX +XXX,XX @@ VMULLD 000100 ..... ..... ..... 00111001001 @VX VMSUMUBM 000100 ..... ..... ..... ..... 100100 @VA VMSUMMBM 000100 ..... ..... ..... ..... 100101 @VA +VMSUMSHM 000100 ..... ..... ..... ..... 101000 @VA +VMSUMSHS 000100 ..... ..... ..... ..... 101001 @VA VMSUMUHM 000100 ..... ..... ..... ..... 100110 @VA VMSUMUHS 000100 ..... ..... ..... ..... 100111 @VA diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/int_helper.c +++ b/target/ppc/int_helper.c @@ -XXX,XX +XXX,XX @@ void helper_VMSUMMBM(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c) } } -void helper_vmsumshm(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, - ppc_avr_t *b, ppc_avr_t *c) +void helper_VMSUMSHM(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c) { int32_t prod[8]; int i; @@ -XXX,XX +XXX,XX @@ void helper_vmsumshm(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, } } -void helper_vmsumshs(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, +void helper_VMSUMSHS(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c) { int32_t prod[8]; diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/translate/vmx-impl.c.inc +++ b/target/ppc/translate/vmx-impl.c.inc @@ -XXX,XX +XXX,XX @@ static bool trans_VSEL(DisasContext *ctx, arg_VA *a) return true; } -GEN_VAFORM_PAIRED(vmsumshm, vmsumshs, 20) TRANS_FLAGS(ALTIVEC, VMSUMUBM, do_va_helper, gen_helper_VMSUMUBM) TRANS_FLAGS(ALTIVEC, VMSUMMBM, do_va_helper, gen_helper_VMSUMMBM) +TRANS_FLAGS(ALTIVEC, VMSUMSHM, do_va_helper, gen_helper_VMSUMSHM) TRANS_FLAGS(ALTIVEC, VMSUMUHM, do_va_helper, gen_helper_VMSUMUHM) static bool do_va_env_helper(DisasContext *ctx, arg_VA *a, @@ -XXX,XX +XXX,XX @@ static bool do_va_env_helper(DisasContext *ctx, arg_VA *a, } TRANS_FLAGS(ALTIVEC, VMSUMUHS, do_va_env_helper, gen_helper_VMSUMUHS) +TRANS_FLAGS(ALTIVEC, VMSUMSHS, do_va_env_helper, gen_helper_VMSUMSHS) GEN_VAFORM_PAIRED(vmaddfp, vnmsubfp, 23) diff --git a/target/ppc/translate/vmx-ops.c.inc b/target/ppc/translate/vmx-ops.c.inc index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/translate/vmx-ops.c.inc +++ b/target/ppc/translate/vmx-ops.c.inc @@ -XXX,XX +XXX,XX @@ GEN_VXFORM_UIMM(vctsxs, 5, 15), #define GEN_VAFORM_PAIRED(name0, name1, opc2) \ GEN_HANDLER(name0##_##name1, 0x04, opc2, 0xFF, 0x00000000, PPC_ALTIVEC) GEN_VAFORM_PAIRED(vmhaddshs, vmhraddshs, 16), -GEN_VAFORM_PAIRED(vmsumshm, vmsumshs, 20), GEN_VAFORM_PAIRED(vmaddfp, vnmsubfp, 23), GEN_VXFORM_DUAL(vclzb, vpopcntb, 1, 28, PPC_NONE, PPC2_ALTIVEC_207), -- 2.36.1
From: Nicholas Piggin <npiggin@gmail.com> The generated eieio memory ordering semantics do not match the instruction definition in the architecture. Add a big comment to explain this strange instruction and correct the memory ordering behaviour. Signed-off: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220519135908.21282-2-npiggin@gmail.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- target/ppc/translate.c | 27 ++++++++++++++++++++++++++- 1 file changed, 26 insertions(+), 1 deletion(-) diff --git a/target/ppc/translate.c b/target/ppc/translate.c index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/translate.c +++ b/target/ppc/translate.c @@ -XXX,XX +XXX,XX @@ static void gen_stswx(DisasContext *ctx) /* eieio */ static void gen_eieio(DisasContext *ctx) { - TCGBar bar = TCG_MO_LD_ST; + TCGBar bar = TCG_MO_ALL; + + /* + * eieio has complex semanitcs. It provides memory ordering between + * operations in the set: + * - loads from CI memory. + * - stores to CI memory. + * - stores to WT memory. + * + * It separately also orders memory for operations in the set: + * - stores to cacheble memory. + * + * It also serializes instructions: + * - dcbt and dcbst. + * + * It separately serializes: + * - tlbie and tlbsync. + * + * And separately serializes: + * - slbieg, slbiag, and slbsync. + * + * The end result is that CI memory ordering requires TCG_MO_ALL + * and it is not possible to special-case more relaxed ordering for + * cacheable accesses. TCG_BAR_SC is required to provide this + * serialization. + */ /* * POWER9 has a eieio instruction variant using bit 6 as a hint to -- 2.36.1
From: Nicholas Piggin <npiggin@gmail.com> eieio does not provide ordering between stores to CI memory and stores to cacheable memory so it can't be used as a general ST_ST barrier. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-of-by: Nicholas Piggin <npiggin@gmail.com> Message-Id: <20220519135908.21282-3-npiggin@gmail.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- tcg/ppc/tcg-target.c.inc | 2 -- 1 file changed, 2 deletions(-) diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc index XXXXXXX..XXXXXXX 100644 --- a/tcg/ppc/tcg-target.c.inc +++ b/tcg/ppc/tcg-target.c.inc @@ -XXX,XX +XXX,XX @@ static void tcg_out_mb(TCGContext *s, TCGArg a0) a0 &= TCG_MO_ALL; if (a0 == TCG_MO_LD_LD) { insn = LWSYNC; - } else if (a0 == TCG_MO_ST_ST) { - insn = EIEIO; } tcg_out32(s, insn); } -- 2.36.1
From: Nicholas Piggin <npiggin@gmail.com> lwsync orders more than just LD_LD, importantly it matches x86 and s390 default memory ordering. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220519135908.21282-4-npiggin@gmail.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- tcg/ppc/tcg-target.c.inc | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc index XXXXXXX..XXXXXXX 100644 --- a/tcg/ppc/tcg-target.c.inc +++ b/tcg/ppc/tcg-target.c.inc @@ -XXX,XX +XXX,XX @@ static void tcg_out_brcond2 (TCGContext *s, const TCGArg *args, static void tcg_out_mb(TCGContext *s, TCGArg a0) { - uint32_t insn = HWSYNC; - a0 &= TCG_MO_ALL; - if (a0 == TCG_MO_LD_LD) { + uint32_t insn; + + if (a0 & TCG_MO_ST_LD) { + insn = HWSYNC; + } else { insn = LWSYNC; } + tcg_out32(s, insn); } -- 2.36.1
From: Nicholas Piggin <npiggin@gmail.com> This allows an x86 host to no-op lwsyncs, and ppc host can use lwsync rather than sync. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220519135908.21282-5-npiggin@gmail.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- target/ppc/cpu.h | 4 +++- target/ppc/cpu_init.c | 13 +++++++------ target/ppc/machine.c | 3 ++- target/ppc/translate.c | 8 +++++++- 4 files changed, 19 insertions(+), 9 deletions(-) diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/cpu.h +++ b/target/ppc/cpu.h @@ -XXX,XX +XXX,XX @@ enum { PPC2_ISA300 = 0x0000000000080000ULL, /* POWER ISA 3.1 */ PPC2_ISA310 = 0x0000000000100000ULL, + /* lwsync instruction */ + PPC2_MEM_LWSYNC = 0x0000000000200000ULL, #define PPC_TCG_INSNS2 (PPC2_BOOKE206 | PPC2_VSX | PPC2_PRCNTL | PPC2_DBRX | \ PPC2_ISA205 | PPC2_VSX207 | PPC2_PERM_ISA206 | \ @@ -XXX,XX +XXX,XX @@ enum { PPC2_BCTAR_ISA207 | PPC2_LSQ_ISA207 | \ PPC2_ALTIVEC_207 | PPC2_ISA207S | PPC2_DFP | \ PPC2_FP_CVT_S64 | PPC2_TM | PPC2_PM_ISA206 | \ - PPC2_ISA300 | PPC2_ISA310) + PPC2_ISA300 | PPC2_ISA310 | PPC2_MEM_LWSYNC) }; /*****************************************************************************/ diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/cpu_init.c +++ b/target/ppc/cpu_init.c @@ -XXX,XX +XXX,XX @@ POWERPC_FAMILY(970)(ObjectClass *oc, void *data) PPC_MEM_TLBIE | PPC_MEM_TLBSYNC | PPC_64B | PPC_ALTIVEC | PPC_SEGMENT_64B | PPC_SLBI; - pcc->insns_flags2 = PPC2_FP_CVT_S64; + pcc->insns_flags2 = PPC2_FP_CVT_S64 | PPC2_MEM_LWSYNC; pcc->msr_mask = (1ull << MSR_SF) | (1ull << MSR_VR) | (1ull << MSR_POW) | @@ -XXX,XX +XXX,XX @@ POWERPC_FAMILY(POWER5P)(ObjectClass *oc, void *data) PPC_64B | PPC_POPCNTB | PPC_SEGMENT_64B | PPC_SLBI; - pcc->insns_flags2 = PPC2_FP_CVT_S64; + pcc->insns_flags2 = PPC2_FP_CVT_S64 | PPC2_MEM_LWSYNC; pcc->msr_mask = (1ull << MSR_SF) | (1ull << MSR_VR) | (1ull << MSR_POW) | @@ -XXX,XX +XXX,XX @@ POWERPC_FAMILY(POWER7)(ObjectClass *oc, void *data) PPC2_PERM_ISA206 | PPC2_DIVE_ISA206 | PPC2_ATOMIC_ISA206 | PPC2_FP_CVT_ISA206 | PPC2_FP_TST_ISA206 | PPC2_FP_CVT_S64 | - PPC2_PM_ISA206; + PPC2_PM_ISA206 | PPC2_MEM_LWSYNC; pcc->msr_mask = (1ull << MSR_SF) | (1ull << MSR_VR) | (1ull << MSR_VSX) | @@ -XXX,XX +XXX,XX @@ POWERPC_FAMILY(POWER8)(ObjectClass *oc, void *data) PPC2_FP_TST_ISA206 | PPC2_BCTAR_ISA207 | PPC2_LSQ_ISA207 | PPC2_ALTIVEC_207 | PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 | - PPC2_TM | PPC2_PM_ISA206; + PPC2_TM | PPC2_PM_ISA206 | PPC2_MEM_LWSYNC; pcc->msr_mask = (1ull << MSR_SF) | (1ull << MSR_HV) | (1ull << MSR_TM) | @@ -XXX,XX +XXX,XX @@ POWERPC_FAMILY(POWER9)(ObjectClass *oc, void *data) PPC2_FP_TST_ISA206 | PPC2_BCTAR_ISA207 | PPC2_LSQ_ISA207 | PPC2_ALTIVEC_207 | PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 | - PPC2_TM | PPC2_ISA300 | PPC2_PRCNTL; + PPC2_TM | PPC2_ISA300 | PPC2_PRCNTL | PPC2_MEM_LWSYNC; pcc->msr_mask = (1ull << MSR_SF) | (1ull << MSR_HV) | (1ull << MSR_TM) | @@ -XXX,XX +XXX,XX @@ POWERPC_FAMILY(POWER10)(ObjectClass *oc, void *data) PPC2_FP_TST_ISA206 | PPC2_BCTAR_ISA207 | PPC2_LSQ_ISA207 | PPC2_ALTIVEC_207 | PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 | - PPC2_TM | PPC2_ISA300 | PPC2_PRCNTL | PPC2_ISA310; + PPC2_TM | PPC2_ISA300 | PPC2_PRCNTL | PPC2_ISA310 | + PPC2_MEM_LWSYNC; pcc->msr_mask = (1ull << MSR_SF) | (1ull << MSR_HV) | (1ull << MSR_TM) | diff --git a/target/ppc/machine.c b/target/ppc/machine.c index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/machine.c +++ b/target/ppc/machine.c @@ -XXX,XX +XXX,XX @@ static int cpu_pre_save(void *opaque) | PPC2_ATOMIC_ISA206 | PPC2_FP_CVT_ISA206 | PPC2_FP_TST_ISA206 | PPC2_BCTAR_ISA207 | PPC2_LSQ_ISA207 | PPC2_ALTIVEC_207 - | PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 | PPC2_TM; + | PPC2_ISA205 | PPC2_ISA207S | PPC2_FP_CVT_S64 | PPC2_TM + | PPC2_MEM_LWSYNC; env->spr[SPR_LR] = env->lr; env->spr[SPR_CTR] = env->ctr; diff --git a/target/ppc/translate.c b/target/ppc/translate.c index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/translate.c +++ b/target/ppc/translate.c @@ -XXX,XX +XXX,XX @@ static void gen_stqcx_(DisasContext *ctx) /* sync */ static void gen_sync(DisasContext *ctx) { + TCGBar bar = TCG_MO_ALL; uint32_t l = (ctx->opcode >> 21) & 3; + if ((l == 1) && (ctx->insns_flags2 & PPC2_MEM_LWSYNC)) { + bar = TCG_MO_LD_LD | TCG_MO_LD_ST | TCG_MO_ST_ST; + } + /* * We may need to check for a pending TLB flush. * @@ -XXX,XX +XXX,XX @@ static void gen_sync(DisasContext *ctx) if (((l == 2) || !(ctx->insns_flags & PPC_64B)) && !ctx->pr) { gen_check_tlb_flush(ctx, true); } - tcg_gen_mb(TCG_MO_ALL | TCG_BAR_SC); + + tcg_gen_mb(bar | TCG_BAR_SC); } /* wait */ -- 2.36.1
From: "Lucas Mateus Castro (alqotel)" <lucas.araujo@eldorado.org.br> Implement the following PowerISA v3.1 instructions: xxmfacc: VSX Move From Accumulator xxmtacc: VSX Move To Accumulator xxsetaccz: VSX Set Accumulator to Zero The PowerISA 3.1 mentions that for the current version of the architecture, "the hardware implementation provides the effect of ACC[i] and VSRs 4*i to 4*i + 3 logically containing the same data" and "The Accumulators introduce no new logical state at this time" (page 501). For now it seems unnecessary to create new structures, so this patch just uses ACC[i] as VSRs 4*i to 4*i+3 and therefore move to and from accumulators are no-ops. Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220524140537.27451-2-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- target/ppc/cpu.h | 5 +++++ target/ppc/insn32.decode | 9 +++++++++ target/ppc/translate/vsx-impl.c.inc | 31 +++++++++++++++++++++++++++++ 3 files changed, 45 insertions(+) diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/cpu.h +++ b/target/ppc/cpu.h @@ -XXX,XX +XXX,XX @@ static inline int vsr_full_offset(int i) return offsetof(CPUPPCState, vsr[i].u64[0]); } +static inline int acc_full_offset(int i) +{ + return vsr_full_offset(i * 4); +} + static inline int fpr_offset(int i) { return vsr64_offset(i, true); diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/insn32.decode +++ b/target/ppc/insn32.decode @@ -XXX,XX +XXX,XX @@ &X_vrt_frbp vrt frbp @X_vrt_frbp ...... vrt:5 ..... ....0 .......... . &X_vrt_frbp frbp=%x_frbp +&X_a ra +@X_a ...... ra:3 .. ..... ..... .......... . &X_a + %xx_xt 0:1 21:5 %xx_xb 1:1 11:5 %xx_xa 2:1 16:5 @@ -XXX,XX +XXX,XX @@ XVTLSBB 111100 ... -- 00010 ..... 111011011 . - @XX2_bf_xb &XL_s s:uint8_t @XL_s ......-------------- s:1 .......... - &XL_s RFEBB 010011-------------- . 0010010010 - @XL_s + +## Accumulator Instructions + +XXMFACC 011111 ... -- 00000 ----- 0010110001 - @X_a +XXMTACC 011111 ... -- 00001 ----- 0010110001 - @X_a +XXSETACCZ 011111 ... -- 00011 ----- 0010110001 - @X_a diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/translate/vsx-impl.c.inc +++ b/target/ppc/translate/vsx-impl.c.inc @@ -XXX,XX +XXX,XX @@ static bool trans_XVCVBF16SPN(DisasContext *ctx, arg_XX2 *a) return true; } + /* + * The PowerISA 3.1 mentions that for the current version of the + * architecture, "the hardware implementation provides the effect of + * ACC[i] and VSRs 4*i to 4*i + 3 logically containing the same data" + * and "The Accumulators introduce no new logical state at this time" + * (page 501). For now it seems unnecessary to create new structures, + * so ACC[i] is the same as VSRs 4*i to 4*i+3 and therefore + * move to and from accumulators are no-ops. + */ +static bool trans_XXMFACC(DisasContext *ctx, arg_X_a *a) +{ + REQUIRE_INSNS_FLAGS2(ctx, ISA310); + REQUIRE_VSX(ctx); + return true; +} + +static bool trans_XXMTACC(DisasContext *ctx, arg_X_a *a) +{ + REQUIRE_INSNS_FLAGS2(ctx, ISA310); + REQUIRE_VSX(ctx); + return true; +} + +static bool trans_XXSETACCZ(DisasContext *ctx, arg_X_a *a) +{ + REQUIRE_INSNS_FLAGS2(ctx, ISA310); + REQUIRE_VSX(ctx); + tcg_gen_gvec_dup_imm(MO_64, acc_full_offset(a->ra), 64, 64, 0); + return true; +} + #undef GEN_XX2FORM #undef GEN_XX3FORM #undef GEN_XX2IFORM -- 2.36.1
From: "Lucas Mateus Castro (alqotel)" <lucas.araujo@eldorado.org.br> Implement the following PowerISA v3.1 instructions: xvi4ger8: VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) xvi4ger8pp: VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) Positive multiply, Positive accumulate xvi8ger4: VSX Vector 4-bit Signed Integer GER (rank-8 update) xvi8ger4pp: VSX Vector 4-bit Signed Integer GER (rank-8 update) Positive multiply, Positive accumulate xvi8ger4spp: VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) with Saturate Positive multiply, Positive accumulate xvi16ger2: VSX Vector 16-bit Signed Integer GER (rank-2 update) xvi16ger2pp: VSX Vector 16-bit Signed Integer GER (rank-2 update) Positive multiply, Positive accumulate xvi16ger2s: VSX Vector 16-bit Signed Integer GER (rank-2 update) with Saturation xvi16ger2spp: VSX Vector 16-bit Signed Integer GER (rank-2 update) with Saturation Positive multiply, Positive accumulate Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220524140537.27451-3-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- target/ppc/cpu.h | 1 + target/ppc/helper.h | 13 +++ target/ppc/insn32.decode | 18 ++++ target/ppc/int_helper.c | 130 ++++++++++++++++++++++++++++ target/ppc/internal.h | 15 ++++ target/ppc/translate/vsx-impl.c.inc | 41 +++++++++ 6 files changed, 218 insertions(+) diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/cpu.h +++ b/target/ppc/cpu.h @@ -XXX,XX +XXX,XX @@ typedef union _ppc_vsr_t { typedef ppc_vsr_t ppc_avr_t; typedef ppc_vsr_t ppc_fprp_t; +typedef ppc_vsr_t ppc_acc_t; #if !defined(CONFIG_USER_ONLY) /* Software TLB cache */ diff --git a/target/ppc/helper.h b/target/ppc/helper.h index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/helper.h +++ b/target/ppc/helper.h @@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_1(ftsqrt, TCG_CALL_NO_RWG_SE, i32, i64) #define dh_ctype_vsr ppc_vsr_t * #define dh_typecode_vsr dh_typecode_ptr +#define dh_alias_acc ptr +#define dh_ctype_acc ppc_acc_t * +#define dh_typecode_acc dh_typecode_ptr + DEF_HELPER_FLAGS_3(vavgub, TCG_CALL_NO_RWG, void, avr, avr, avr) DEF_HELPER_FLAGS_3(vavguh, TCG_CALL_NO_RWG, void, avr, avr, avr) DEF_HELPER_FLAGS_3(vavguw, TCG_CALL_NO_RWG, void, avr, avr, avr) @@ -XXX,XX +XXX,XX @@ DEF_HELPER_FLAGS_5(XXBLENDVB, TCG_CALL_NO_RWG, void, vsr, vsr, vsr, vsr, i32) DEF_HELPER_FLAGS_5(XXBLENDVH, TCG_CALL_NO_RWG, void, vsr, vsr, vsr, vsr, i32) DEF_HELPER_FLAGS_5(XXBLENDVW, TCG_CALL_NO_RWG, void, vsr, vsr, vsr, vsr, i32) DEF_HELPER_FLAGS_5(XXBLENDVD, TCG_CALL_NO_RWG, void, vsr, vsr, vsr, vsr, i32) +DEF_HELPER_5(XVI4GER8, void, env, vsr, vsr, acc, i32) +DEF_HELPER_5(XVI4GER8PP, void, env, vsr, vsr, acc, i32) +DEF_HELPER_5(XVI8GER4, void, env, vsr, vsr, acc, i32) +DEF_HELPER_5(XVI8GER4PP, void, env, vsr, vsr, acc, i32) +DEF_HELPER_5(XVI8GER4SPP, void, env, vsr, vsr, acc, i32) +DEF_HELPER_5(XVI16GER2, void, env, vsr, vsr, acc, i32) +DEF_HELPER_5(XVI16GER2S, void, env, vsr, vsr, acc, i32) +DEF_HELPER_5(XVI16GER2PP, void, env, vsr, vsr, acc, i32) +DEF_HELPER_5(XVI16GER2SPP, void, env, vsr, vsr, acc, i32) DEF_HELPER_2(efscfsi, i32, env, i32) DEF_HELPER_2(efscfui, i32, env, i32) diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/insn32.decode +++ b/target/ppc/insn32.decode @@ -XXX,XX +XXX,XX @@ &XX3 xt xa xb @XX3 ...... ..... ..... ..... ........ ... &XX3 xt=%xx_xt xa=%xx_xa xb=%xx_xb +# 32 bit GER instructions have all mask bits considered 1 +&MMIRR_XX3 xa xb xt pmsk xmsk ymsk +%xx_at 23:3 +@XX3_at ...... ... .. ..... ..... ........ ... &MMIRR_XX3 xt=%xx_at xb=%xx_xb \ + pmsk=255 xmsk=15 ymsk=15 + &XX3_dm xt xa xb dm @XX3_dm ...... ..... ..... ..... . dm:2 ..... ... &XX3_dm xt=%xx_xt xa=%xx_xa xb=%xx_xb @@ -XXX,XX +XXX,XX @@ RFEBB 010011-------------- . 0010010010 - @XL_s XXMFACC 011111 ... -- 00000 ----- 0010110001 - @X_a XXMTACC 011111 ... -- 00001 ----- 0010110001 - @X_a XXSETACCZ 011111 ... -- 00011 ----- 0010110001 - @X_a + +## VSX GER instruction + +XVI4GER8 111011 ... -- ..... ..... 00100011 ..- @XX3_at xa=%xx_xa +XVI4GER8PP 111011 ... -- ..... ..... 00100010 ..- @XX3_at xa=%xx_xa +XVI8GER4 111011 ... -- ..... ..... 00000011 ..- @XX3_at xa=%xx_xa +XVI8GER4PP 111011 ... -- ..... ..... 00000010 ..- @XX3_at xa=%xx_xa +XVI16GER2 111011 ... -- ..... ..... 01001011 ..- @XX3_at xa=%xx_xa +XVI16GER2PP 111011 ... -- ..... ..... 01101011 ..- @XX3_at xa=%xx_xa +XVI8GER4SPP 111011 ... -- ..... ..... 01100011 ..- @XX3_at xa=%xx_xa +XVI16GER2S 111011 ... -- ..... ..... 00101011 ..- @XX3_at xa=%xx_xa +XVI16GER2SPP 111011 ... -- ..... ..... 00101010 ..- @XX3_at xa=%xx_xa diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/int_helper.c +++ b/target/ppc/int_helper.c @@ -XXX,XX +XXX,XX @@ VCT(uxs, cvtsduw, u32) VCT(sxs, cvtsdsw, s32) #undef VCT +typedef int64_t do_ger(uint32_t, uint32_t, uint32_t); + +static int64_t ger_rank8(uint32_t a, uint32_t b, uint32_t mask) +{ + int64_t psum = 0; + for (int i = 0; i < 8; i++, mask >>= 1) { + if (mask & 1) { + psum += sextract32(a, 4 * i, 4) * sextract32(b, 4 * i, 4); + } + } + return psum; +} + +static int64_t ger_rank4(uint32_t a, uint32_t b, uint32_t mask) +{ + int64_t psum = 0; + for (int i = 0; i < 4; i++, mask >>= 1) { + if (mask & 1) { + psum += sextract32(a, 8 * i, 8) * (int64_t)extract32(b, 8 * i, 8); + } + } + return psum; +} + +static int64_t ger_rank2(uint32_t a, uint32_t b, uint32_t mask) +{ + int64_t psum = 0; + for (int i = 0; i < 2; i++, mask >>= 1) { + if (mask & 1) { + psum += sextract32(a, 16 * i, 16) * sextract32(b, 16 * i, 16); + } + } + return psum; +} + +static void xviger(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, ppc_acc_t *at, + uint32_t mask, bool sat, bool acc, do_ger ger) +{ + uint8_t pmsk = FIELD_EX32(mask, GER_MSK, PMSK), + xmsk = FIELD_EX32(mask, GER_MSK, XMSK), + ymsk = FIELD_EX32(mask, GER_MSK, YMSK); + uint8_t xmsk_bit, ymsk_bit; + int64_t psum; + int i, j; + for (i = 0, xmsk_bit = 1 << 3; i < 4; i++, xmsk_bit >>= 1) { + for (j = 0, ymsk_bit = 1 << 3; j < 4; j++, ymsk_bit >>= 1) { + if ((xmsk_bit & xmsk) && (ymsk_bit & ymsk)) { + psum = ger(a->VsrW(i), b->VsrW(j), pmsk); + if (acc) { + psum += at[i].VsrSW(j); + } + if (sat && psum > INT32_MAX) { + set_vscr_sat(env); + at[i].VsrSW(j) = INT32_MAX; + } else if (sat && psum < INT32_MIN) { + set_vscr_sat(env); + at[i].VsrSW(j) = INT32_MIN; + } else { + at[i].VsrSW(j) = (int32_t) psum; + } + } else { + at[i].VsrSW(j) = 0; + } + } + } +} + +QEMU_FLATTEN +void helper_XVI4GER8(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, + ppc_acc_t *at, uint32_t mask) +{ + xviger(env, a, b, at, mask, false, false, ger_rank8); +} + +QEMU_FLATTEN +void helper_XVI4GER8PP(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, + ppc_acc_t *at, uint32_t mask) +{ + xviger(env, a, b, at, mask, false, true, ger_rank8); +} + +QEMU_FLATTEN +void helper_XVI8GER4(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, + ppc_acc_t *at, uint32_t mask) +{ + xviger(env, a, b, at, mask, false, false, ger_rank4); +} + +QEMU_FLATTEN +void helper_XVI8GER4PP(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, + ppc_acc_t *at, uint32_t mask) +{ + xviger(env, a, b, at, mask, false, true, ger_rank4); +} + +QEMU_FLATTEN +void helper_XVI8GER4SPP(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, + ppc_acc_t *at, uint32_t mask) +{ + xviger(env, a, b, at, mask, true, true, ger_rank4); +} + +QEMU_FLATTEN +void helper_XVI16GER2(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, + ppc_acc_t *at, uint32_t mask) +{ + xviger(env, a, b, at, mask, false, false, ger_rank2); +} + +QEMU_FLATTEN +void helper_XVI16GER2S(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, + ppc_acc_t *at, uint32_t mask) +{ + xviger(env, a, b, at, mask, true, false, ger_rank2); +} + +QEMU_FLATTEN +void helper_XVI16GER2PP(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, + ppc_acc_t *at, uint32_t mask) +{ + xviger(env, a, b, at, mask, false, true, ger_rank2); +} + +QEMU_FLATTEN +void helper_XVI16GER2SPP(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, + ppc_acc_t *at, uint32_t mask) +{ + xviger(env, a, b, at, mask, true, true, ger_rank2); +} + target_ulong helper_vclzlsbb(ppc_avr_t *r) { target_ulong count = 0; diff --git a/target/ppc/internal.h b/target/ppc/internal.h index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/internal.h +++ b/target/ppc/internal.h @@ -XXX,XX +XXX,XX @@ #ifndef PPC_INTERNAL_H #define PPC_INTERNAL_H +#include "hw/registerfields.h" + #define FUNC_MASK(name, ret_type, size, max_val) \ static inline ret_type name(uint##size##_t start, \ uint##size##_t end) \ @@ -XXX,XX +XXX,XX @@ G_NORETURN void ppc_cpu_do_unaligned_access(CPUState *cs, vaddr addr, uintptr_t retaddr); #endif +FIELD(GER_MSK, XMSK, 0, 4) +FIELD(GER_MSK, YMSK, 4, 4) +FIELD(GER_MSK, PMSK, 8, 8) + +static inline int ger_pack_masks(int pmsk, int ymsk, int xmsk) +{ + int msk = 0; + msk = FIELD_DP32(msk, GER_MSK, XMSK, xmsk); + msk = FIELD_DP32(msk, GER_MSK, YMSK, ymsk); + msk = FIELD_DP32(msk, GER_MSK, PMSK, pmsk); + return msk; +} + #endif /* PPC_INTERNAL_H */ diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/translate/vsx-impl.c.inc +++ b/target/ppc/translate/vsx-impl.c.inc @@ -XXX,XX +XXX,XX @@ static inline TCGv_ptr gen_vsr_ptr(int reg) return r; } +static inline TCGv_ptr gen_acc_ptr(int reg) +{ + TCGv_ptr r = tcg_temp_new_ptr(); + tcg_gen_addi_ptr(r, cpu_env, acc_full_offset(reg)); + return r; +} + #define VSX_LOAD_SCALAR(name, operation) \ static void gen_##name(DisasContext *ctx) \ { \ @@ -XXX,XX +XXX,XX @@ static bool trans_XXSETACCZ(DisasContext *ctx, arg_X_a *a) return true; } +static bool do_ger(DisasContext *ctx, arg_MMIRR_XX3 *a, + void (*helper)(TCGv_env, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32)) +{ + uint32_t mask; + TCGv_ptr xt, xa, xb; + REQUIRE_INSNS_FLAGS2(ctx, ISA310); + REQUIRE_VSX(ctx); + if (unlikely((a->xa / 4 == a->xt) || (a->xb / 4 == a->xt))) { + gen_invalid(ctx); + return true; + } + + xt = gen_acc_ptr(a->xt); + xa = gen_vsr_ptr(a->xa); + xb = gen_vsr_ptr(a->xb); + + mask = ger_pack_masks(a->pmsk, a->ymsk, a->xmsk); + helper(cpu_env, xa, xb, xt, tcg_constant_i32(mask)); + tcg_temp_free_ptr(xt); + tcg_temp_free_ptr(xa); + tcg_temp_free_ptr(xb); + return true; +} + +TRANS(XVI4GER8, do_ger, gen_helper_XVI4GER8) +TRANS(XVI4GER8PP, do_ger, gen_helper_XVI4GER8PP) +TRANS(XVI8GER4, do_ger, gen_helper_XVI8GER4) +TRANS(XVI8GER4PP, do_ger, gen_helper_XVI8GER4PP) +TRANS(XVI8GER4SPP, do_ger, gen_helper_XVI8GER4SPP) +TRANS(XVI16GER2, do_ger, gen_helper_XVI16GER2) +TRANS(XVI16GER2PP, do_ger, gen_helper_XVI16GER2PP) +TRANS(XVI16GER2S, do_ger, gen_helper_XVI16GER2S) +TRANS(XVI16GER2SPP, do_ger, gen_helper_XVI16GER2SPP) + #undef GEN_XX2FORM #undef GEN_XX3FORM #undef GEN_XX2IFORM -- 2.36.1
From: "Lucas Mateus Castro (alqotel)" <lucas.araujo@eldorado.org.br> Implement the following PowerISA v3.1 instructions: pmxvi4ger8: Prefixed Masked VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) pmxvi4ger8pp: Prefixed Masked VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) Positive multiply, Positive accumulate pmxvi8ger4: Prefixed Masked VSX Vector 4-bit Signed Integer GER (rank-8 update) pmxvi8ger4pp: Prefixed Masked VSX Vector 4-bit Signed Integer GER (rank-8 update) Positive multiply, Positive accumulate pmxvi8ger4spp: Prefixed Masked VSX Vector 8-bit Signed/Unsigned Integer GER (rank-4 update) with Saturate Positive multiply, Positive accumulate pmxvi16ger2: Prefixed Masked VSX Vector 16-bit Signed Integer GER (rank-2 update) pmxvi16ger2pp: Prefixed Masked VSX Vector 16-bit Signed Integer GER (rank-2 update) Positive multiply, Positive accumulate pmxvi16ger2s: Prefixed Masked VSX Vector 16-bit Signed Integer GER (rank-2 update) with Saturation pmxvi16ger2spp: Prefixed Masked VSX Vector 16-bit Signed Integer GER (rank-2 update) with Saturation Positive multiply, Positive accumulate Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220524140537.27451-4-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- target/ppc/insn64.decode | 30 +++++++++++++++++++++++++++++ target/ppc/translate/vsx-impl.c.inc | 10 ++++++++++ 2 files changed, 40 insertions(+) diff --git a/target/ppc/insn64.decode b/target/ppc/insn64.decode index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/insn64.decode +++ b/target/ppc/insn64.decode @@ -XXX,XX +XXX,XX @@ ...... ..... ..... ..... ..... .. .... \ &8RR_XX4_uim3 xt=%8rr_xx_xt xa=%8rr_xx_xa xb=%8rr_xx_xb xc=%8rr_xx_xc +# Format MMIRR:XX3 +&MMIRR_XX3 !extern xa xb xt pmsk xmsk ymsk +%xx3_xa 2:1 16:5 +%xx3_xb 1:1 11:5 +%xx3_at 23:3 +@MMIRR_XX3 ...... .. .... .. . . ........ xmsk:4 ymsk:4 \ + ...... ... .. ..... ..... ........ ... \ + &MMIRR_XX3 xa=%xx3_xa xb=%xx3_xb xt=%xx3_at + ### Fixed-Point Load Instructions PLBZ 000001 10 0--.-- .................. \ @@ -XXX,XX +XXX,XX @@ PSTFS 000001 10 0--.-- .................. \ PSTFD 000001 10 0--.-- .................. \ 110110 ..... ..... ................ @PLS_D +## VSX GER instruction + +PMXVI4GER8 000001 11 1001 -- - - pmsk:8 ........ \ + 111011 ... -- ..... ..... 00100011 ..- @MMIRR_XX3 +PMXVI4GER8PP 000001 11 1001 -- - - pmsk:8 ........ \ + 111011 ... -- ..... ..... 00100010 ..- @MMIRR_XX3 +PMXVI8GER4 000001 11 1001 -- - - pmsk:4 ---- ........ \ + 111011 ... -- ..... ..... 00000011 ..- @MMIRR_XX3 +PMXVI8GER4PP 000001 11 1001 -- - - pmsk:4 ---- ........ \ + 111011 ... -- ..... ..... 00000010 ..- @MMIRR_XX3 +PMXVI16GER2 000001 11 1001 -- - - pmsk:2 ------ ........ \ + 111011 ... -- ..... ..... 01001011 ..- @MMIRR_XX3 +PMXVI16GER2PP 000001 11 1001 -- - - pmsk:2 ------ ........ \ + 111011 ... -- ..... ..... 01101011 ..- @MMIRR_XX3 +PMXVI8GER4SPP 000001 11 1001 -- - - pmsk:4 ---- ........ \ + 111011 ... -- ..... ..... 01100011 ..- @MMIRR_XX3 +PMXVI16GER2S 000001 11 1001 -- - - pmsk:2 ------ ........ \ + 111011 ... -- ..... ..... 00101011 ..- @MMIRR_XX3 +PMXVI16GER2SPP 000001 11 1001 -- - - pmsk:2 ------ ........ \ + 111011 ... -- ..... ..... 00101010 ..- @MMIRR_XX3 + ### Prefixed No-operation Instruction @PNOP 000001 11 0000-- 000000000000000000 \ diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/translate/vsx-impl.c.inc +++ b/target/ppc/translate/vsx-impl.c.inc @@ -XXX,XX +XXX,XX @@ TRANS(XVI16GER2PP, do_ger, gen_helper_XVI16GER2PP) TRANS(XVI16GER2S, do_ger, gen_helper_XVI16GER2S) TRANS(XVI16GER2SPP, do_ger, gen_helper_XVI16GER2SPP) +TRANS64(PMXVI4GER8, do_ger, gen_helper_XVI4GER8) +TRANS64(PMXVI4GER8PP, do_ger, gen_helper_XVI4GER8PP) +TRANS64(PMXVI8GER4, do_ger, gen_helper_XVI8GER4) +TRANS64(PMXVI8GER4PP, do_ger, gen_helper_XVI8GER4PP) +TRANS64(PMXVI8GER4SPP, do_ger, gen_helper_XVI8GER4SPP) +TRANS64(PMXVI16GER2, do_ger, gen_helper_XVI16GER2) +TRANS64(PMXVI16GER2PP, do_ger, gen_helper_XVI16GER2PP) +TRANS64(PMXVI16GER2S, do_ger, gen_helper_XVI16GER2S) +TRANS64(PMXVI16GER2SPP, do_ger, gen_helper_XVI16GER2SPP) + #undef GEN_XX2FORM #undef GEN_XX3FORM #undef GEN_XX2IFORM -- 2.36.1
From: "Lucas Mateus Castro (alqotel)" <lucas.araujo@eldorado.org.br> Implement the following PowerISA v3.1 instructions: xvf32ger: VSX Vector 32-bit Floating-Point GER (rank-1 update) xvf32gernn: VSX Vector 32-bit Floating-Point GER (rank-1 update) Negative multiply, Negative accumulate xvf32gernp: VSX Vector 32-bit Floating-Point GER (rank-1 update) Negative multiply, Positive accumulate xvf32gerpn: VSX Vector 32-bit Floating-Point GER (rank-1 update) Positive multiply, Negative accumulate xvf32gerpp: VSX Vector 32-bit Floating-Point GER (rank-1 update) Positive multiply, Positive accumulate xvf64ger: VSX Vector 64-bit Floating-Point GER (rank-1 update) xvf64gernn: VSX Vector 64-bit Floating-Point GER (rank-1 update) Negative multiply, Negative accumulate xvf64gernp: VSX Vector 64-bit Floating-Point GER (rank-1 update) Negative multiply, Positive accumulate xvf64gerpn: VSX Vector 64-bit Floating-Point GER (rank-1 update) Positive multiply, Negative accumulate xvf64gerpp: VSX Vector 64-bit Floating-Point GER (rank-1 update) Positive multiply, Positive accumulate Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220524140537.27451-5-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- target/ppc/cpu.h | 4 + target/ppc/fpu_helper.c | 194 +++++++++++++++++++++++++++- target/ppc/helper.h | 10 ++ target/ppc/insn32.decode | 13 ++ target/ppc/translate/vsx-impl.c.inc | 12 ++ 5 files changed, 231 insertions(+), 2 deletions(-) diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/cpu.h +++ b/target/ppc/cpu.h @@ -XXX,XX +XXX,XX @@ static inline bool lsw_reg_in_range(int start, int nregs, int rx) #define VsrSW(i) s32[i] #define VsrD(i) u64[i] #define VsrSD(i) s64[i] +#define VsrSF(i) f32[i] +#define VsrDF(i) f64[i] #else #define VsrB(i) u8[15 - (i)] #define VsrSB(i) s8[15 - (i)] @@ -XXX,XX +XXX,XX @@ static inline bool lsw_reg_in_range(int start, int nregs, int rx) #define VsrSW(i) s32[3 - (i)] #define VsrD(i) u64[1 - (i)] #define VsrSD(i) s64[1 - (i)] +#define VsrSF(i) f32[3 - (i)] +#define VsrDF(i) f64[1 - (i)] #endif static inline int vsr64_offset(int i, bool high) diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/fpu_helper.c +++ b/target/ppc/fpu_helper.c @@ -XXX,XX +XXX,XX @@ void helper_store_fpscr(CPUPPCState *env, uint64_t val, uint32_t nibbles) ppc_store_fpscr(env, val); } -void helper_fpscr_check_status(CPUPPCState *env) +static void do_fpscr_check_status(CPUPPCState *env, uintptr_t raddr) { CPUState *cs = env_cpu(env); target_ulong fpscr = env->fpscr; @@ -XXX,XX +XXX,XX @@ void helper_fpscr_check_status(CPUPPCState *env) } cs->exception_index = POWERPC_EXCP_PROGRAM; env->error_code = error | POWERPC_EXCP_FP; + env->fpscr |= error ? FP_FEX : 0; /* Deferred floating-point exception after target FPSCR update */ if (fp_exceptions_enabled(env)) { raise_exception_err_ra(env, cs->exception_index, - env->error_code, GETPC()); + env->error_code, raddr); } } +void helper_fpscr_check_status(CPUPPCState *env) +{ + do_fpscr_check_status(env, GETPC()); +} + static void do_float_check_status(CPUPPCState *env, bool change_fi, uintptr_t raddr) { @@ -XXX,XX +XXX,XX @@ void helper_xssubqp(CPUPPCState *env, uint32_t opcode, *xt = t; do_float_check_status(env, true, GETPC()); } + +static inline void vsxger_excp(CPUPPCState *env, uintptr_t retaddr) +{ + /* + * XV*GER instructions execute and set the FPSCR as if exceptions + * are disabled and only at the end throw an exception + */ + target_ulong enable; + enable = env->fpscr & (FP_ENABLES | FP_FI | FP_FR); + env->fpscr &= ~(FP_ENABLES | FP_FI | FP_FR); + int status = get_float_exception_flags(&env->fp_status); + if (unlikely(status & float_flag_invalid)) { + if (status & float_flag_invalid_snan) { + float_invalid_op_vxsnan(env, 0); + } + if (status & float_flag_invalid_imz) { + float_invalid_op_vximz(env, false, 0); + } + if (status & float_flag_invalid_isi) { + float_invalid_op_vxisi(env, false, 0); + } + } + do_float_check_status(env, false, retaddr); + env->fpscr |= enable; + do_fpscr_check_status(env, retaddr); +} + +typedef void vsxger_zero(ppc_vsr_t *at, int, int); + +typedef void vsxger_muladd_f(ppc_vsr_t *, ppc_vsr_t *, ppc_vsr_t *, int, int, + int flags, float_status *s); + +static void vsxger_muladd32(ppc_vsr_t *at, ppc_vsr_t *a, ppc_vsr_t *b, int i, + int j, int flags, float_status *s) +{ + at[i].VsrSF(j) = float32_muladd(a->VsrSF(i), b->VsrSF(j), + at[i].VsrSF(j), flags, s); +} + +static void vsxger_mul32(ppc_vsr_t *at, ppc_vsr_t *a, ppc_vsr_t *b, int i, + int j, int flags, float_status *s) +{ + at[i].VsrSF(j) = float32_mul(a->VsrSF(i), b->VsrSF(j), s); +} + +static void vsxger_zero32(ppc_vsr_t *at, int i, int j) +{ + at[i].VsrSF(j) = float32_zero; +} + +static void vsxger_muladd64(ppc_vsr_t *at, ppc_vsr_t *a, ppc_vsr_t *b, int i, + int j, int flags, float_status *s) +{ + if (j >= 2) { + j -= 2; + at[i].VsrDF(j) = float64_muladd(a[i / 2].VsrDF(i % 2), b->VsrDF(j), + at[i].VsrDF(j), flags, s); + } +} + +static void vsxger_mul64(ppc_vsr_t *at, ppc_vsr_t *a, ppc_vsr_t *b, int i, + int j, int flags, float_status *s) +{ + if (j >= 2) { + j -= 2; + at[i].VsrDF(j) = float64_mul(a[i / 2].VsrDF(i % 2), b->VsrDF(j), s); + } +} + +static void vsxger_zero64(ppc_vsr_t *at, int i, int j) +{ + if (j >= 2) { + j -= 2; + at[i].VsrDF(j) = float64_zero; + } +} + +static void vsxger(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, + ppc_acc_t *at, uint32_t mask, bool acc, bool neg_mul, + bool neg_acc, vsxger_muladd_f mul, vsxger_muladd_f muladd, + vsxger_zero zero) +{ + int i, j, xmsk_bit, ymsk_bit, op_flags; + uint8_t xmsk = mask & 0x0F; + uint8_t ymsk = (mask >> 4) & 0x0F; + float_status *excp_ptr = &env->fp_status; + op_flags = (neg_acc ^ neg_mul) ? float_muladd_negate_c : 0; + op_flags |= (neg_mul) ? float_muladd_negate_result : 0; + helper_reset_fpstatus(env); + for (i = 0, xmsk_bit = 1 << 3; i < 4; i++, xmsk_bit >>= 1) { + for (j = 0, ymsk_bit = 1 << 3; j < 4; j++, ymsk_bit >>= 1) { + if ((xmsk_bit & xmsk) && (ymsk_bit & ymsk)) { + if (acc) { + muladd(at, a, b, i, j, op_flags, excp_ptr); + } else { + mul(at, a, b, i, j, op_flags, excp_ptr); + } + } else { + zero(at, i, j); + } + } + } + vsxger_excp(env, GETPC()); +} + +QEMU_FLATTEN +void helper_XVF32GER(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, + ppc_acc_t *at, uint32_t mask) +{ + vsxger(env, a, b, at, mask, false, false, false, vsxger_mul32, + vsxger_muladd32, vsxger_zero32); +} + +QEMU_FLATTEN +void helper_XVF32GERPP(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, + ppc_acc_t *at, uint32_t mask) +{ + vsxger(env, a, b, at, mask, true, false, false, vsxger_mul32, + vsxger_muladd32, vsxger_zero32); +} + +QEMU_FLATTEN +void helper_XVF32GERPN(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, + ppc_acc_t *at, uint32_t mask) +{ + vsxger(env, a, b, at, mask, true, false, true, vsxger_mul32, + vsxger_muladd32, vsxger_zero32); +} + +QEMU_FLATTEN +void helper_XVF32GERNP(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, + ppc_acc_t *at, uint32_t mask) +{ + vsxger(env, a, b, at, mask, true, true, false, vsxger_mul32, + vsxger_muladd32, vsxger_zero32); +} + +QEMU_FLATTEN +void helper_XVF32GERNN(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, + ppc_acc_t *at, uint32_t mask) +{ + vsxger(env, a, b, at, mask, true, true, true, vsxger_mul32, + vsxger_muladd32, vsxger_zero32); +} + +QEMU_FLATTEN +void helper_XVF64GER(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, + ppc_acc_t *at, uint32_t mask) +{ + vsxger(env, a, b, at, mask, false, false, false, vsxger_mul64, + vsxger_muladd64, vsxger_zero64); +} + +QEMU_FLATTEN +void helper_XVF64GERPP(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, + ppc_acc_t *at, uint32_t mask) +{ + vsxger(env, a, b, at, mask, true, false, false, vsxger_mul64, + vsxger_muladd64, vsxger_zero64); +} + +QEMU_FLATTEN +void helper_XVF64GERPN(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, + ppc_acc_t *at, uint32_t mask) +{ + vsxger(env, a, b, at, mask, true, false, true, vsxger_mul64, + vsxger_muladd64, vsxger_zero64); +} + +QEMU_FLATTEN +void helper_XVF64GERNP(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, + ppc_acc_t *at, uint32_t mask) +{ + vsxger(env, a, b, at, mask, true, true, false, vsxger_mul64, + vsxger_muladd64, vsxger_zero64); +} + +QEMU_FLATTEN +void helper_XVF64GERNN(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, + ppc_acc_t *at, uint32_t mask) +{ + vsxger(env, a, b, at, mask, true, true, true, vsxger_mul64, + vsxger_muladd64, vsxger_zero64); +} diff --git a/target/ppc/helper.h b/target/ppc/helper.h index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/helper.h +++ b/target/ppc/helper.h @@ -XXX,XX +XXX,XX @@ DEF_HELPER_5(XVI16GER2, void, env, vsr, vsr, acc, i32) DEF_HELPER_5(XVI16GER2S, void, env, vsr, vsr, acc, i32) DEF_HELPER_5(XVI16GER2PP, void, env, vsr, vsr, acc, i32) DEF_HELPER_5(XVI16GER2SPP, void, env, vsr, vsr, acc, i32) +DEF_HELPER_5(XVF32GER, void, env, vsr, vsr, acc, i32) +DEF_HELPER_5(XVF32GERPP, void, env, vsr, vsr, acc, i32) +DEF_HELPER_5(XVF32GERPN, void, env, vsr, vsr, acc, i32) +DEF_HELPER_5(XVF32GERNP, void, env, vsr, vsr, acc, i32) +DEF_HELPER_5(XVF32GERNN, void, env, vsr, vsr, acc, i32) +DEF_HELPER_5(XVF64GER, void, env, vsr, vsr, acc, i32) +DEF_HELPER_5(XVF64GERPP, void, env, vsr, vsr, acc, i32) +DEF_HELPER_5(XVF64GERPN, void, env, vsr, vsr, acc, i32) +DEF_HELPER_5(XVF64GERNP, void, env, vsr, vsr, acc, i32) +DEF_HELPER_5(XVF64GERNN, void, env, vsr, vsr, acc, i32) DEF_HELPER_2(efscfsi, i32, env, i32) DEF_HELPER_2(efscfui, i32, env, i32) diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/insn32.decode +++ b/target/ppc/insn32.decode @@ -XXX,XX +XXX,XX @@ # 32 bit GER instructions have all mask bits considered 1 &MMIRR_XX3 xa xb xt pmsk xmsk ymsk %xx_at 23:3 +%xx_xa_pair 2:1 17:4 !function=times_2 @XX3_at ...... ... .. ..... ..... ........ ... &MMIRR_XX3 xt=%xx_at xb=%xx_xb \ pmsk=255 xmsk=15 ymsk=15 @@ -XXX,XX +XXX,XX @@ XVI16GER2PP 111011 ... -- ..... ..... 01101011 ..- @XX3_at xa=%xx_xa XVI8GER4SPP 111011 ... -- ..... ..... 01100011 ..- @XX3_at xa=%xx_xa XVI16GER2S 111011 ... -- ..... ..... 00101011 ..- @XX3_at xa=%xx_xa XVI16GER2SPP 111011 ... -- ..... ..... 00101010 ..- @XX3_at xa=%xx_xa + +XVF32GER 111011 ... -- ..... ..... 00011011 ..- @XX3_at xa=%xx_xa +XVF32GERPP 111011 ... -- ..... ..... 00011010 ..- @XX3_at xa=%xx_xa +XVF32GERPN 111011 ... -- ..... ..... 10011010 ..- @XX3_at xa=%xx_xa +XVF32GERNP 111011 ... -- ..... ..... 01011010 ..- @XX3_at xa=%xx_xa +XVF32GERNN 111011 ... -- ..... ..... 11011010 ..- @XX3_at xa=%xx_xa + +XVF64GER 111011 ... -- .... 0 ..... 00111011 ..- @XX3_at xa=%xx_xa_pair +XVF64GERPP 111011 ... -- .... 0 ..... 00111010 ..- @XX3_at xa=%xx_xa_pair +XVF64GERPN 111011 ... -- .... 0 ..... 10111010 ..- @XX3_at xa=%xx_xa_pair +XVF64GERNP 111011 ... -- .... 0 ..... 01111010 ..- @XX3_at xa=%xx_xa_pair +XVF64GERNN 111011 ... -- .... 0 ..... 11111010 ..- @XX3_at xa=%xx_xa_pair diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/translate/vsx-impl.c.inc +++ b/target/ppc/translate/vsx-impl.c.inc @@ -XXX,XX +XXX,XX @@ TRANS64(PMXVI16GER2PP, do_ger, gen_helper_XVI16GER2PP) TRANS64(PMXVI16GER2S, do_ger, gen_helper_XVI16GER2S) TRANS64(PMXVI16GER2SPP, do_ger, gen_helper_XVI16GER2SPP) +TRANS(XVF32GER, do_ger, gen_helper_XVF32GER) +TRANS(XVF32GERPP, do_ger, gen_helper_XVF32GERPP) +TRANS(XVF32GERPN, do_ger, gen_helper_XVF32GERPN) +TRANS(XVF32GERNP, do_ger, gen_helper_XVF32GERNP) +TRANS(XVF32GERNN, do_ger, gen_helper_XVF32GERNN) + +TRANS(XVF64GER, do_ger, gen_helper_XVF64GER) +TRANS(XVF64GERPP, do_ger, gen_helper_XVF64GERPP) +TRANS(XVF64GERPN, do_ger, gen_helper_XVF64GERPN) +TRANS(XVF64GERNP, do_ger, gen_helper_XVF64GERNP) +TRANS(XVF64GERNN, do_ger, gen_helper_XVF64GERNN) + #undef GEN_XX2FORM #undef GEN_XX3FORM #undef GEN_XX2IFORM -- 2.36.1
From: "Lucas Mateus Castro (alqotel)" <lucas.araujo@eldorado.org.br> Implement the following PowerISA v3.1 instructions: xvf16ger2: VSX Vector 16-bit Floating-Point GER (rank-2 update) xvf16ger2nn: VSX Vector 16-bit Floating-Point GER (rank-2 update) Negative multiply, Negative accumulate xvf16ger2np: VSX Vector 16-bit Floating-Point GER (rank-2 update) Negative multiply, Positive accumulate xvf16ger2pn: VSX Vector 16-bit Floating-Point GER (rank-2 update) Positive multiply, Negative accumulate xvf16ger2pp: VSX Vector 16-bit Floating-Point GER (rank-2 update) Positive multiply, Positive accumulate Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220524140537.27451-6-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- target/ppc/cpu.h | 3 + target/ppc/fpu_helper.c | 95 +++++++++++++++++++++++++++++ target/ppc/helper.h | 5 ++ target/ppc/insn32.decode | 6 ++ target/ppc/translate/vsx-impl.c.inc | 6 ++ 5 files changed, 115 insertions(+) diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/cpu.h +++ b/target/ppc/cpu.h @@ -XXX,XX +XXX,XX @@ typedef union _ppc_vsr_t { int16_t s16[8]; int32_t s32[4]; int64_t s64[2]; + float16 f16[8]; float32 f32[4]; float64 f64[2]; float128 f128; @@ -XXX,XX +XXX,XX @@ static inline bool lsw_reg_in_range(int start, int nregs, int rx) #define VsrSW(i) s32[i] #define VsrD(i) u64[i] #define VsrSD(i) s64[i] +#define VsrHF(i) f16[i] #define VsrSF(i) f32[i] #define VsrDF(i) f64[i] #else @@ -XXX,XX +XXX,XX @@ static inline bool lsw_reg_in_range(int start, int nregs, int rx) #define VsrSW(i) s32[3 - (i)] #define VsrD(i) u64[1 - (i)] #define VsrSD(i) s64[1 - (i)] +#define VsrHF(i) f16[7 - (i)] #define VsrSF(i) f32[3 - (i)] #define VsrDF(i) f64[1 - (i)] #endif diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/fpu_helper.c +++ b/target/ppc/fpu_helper.c @@ -XXX,XX +XXX,XX @@ static inline float128 float128_snan_to_qnan(float128 x) #define float32_snan_to_qnan(x) ((x) | 0x00400000) #define float16_snan_to_qnan(x) ((x) | 0x0200) +static inline float32 bfp32_neg(float32 a) +{ + if (unlikely(float32_is_any_nan(a))) { + return a; + } else { + return float32_chs(a); + } +} + static inline bool fp_exceptions_enabled(CPUPPCState *env) { #ifdef CONFIG_USER_ONLY @@ -XXX,XX +XXX,XX @@ static inline void vsxger_excp(CPUPPCState *env, uintptr_t retaddr) do_fpscr_check_status(env, retaddr); } +typedef float64 extract_f16(float16, float_status *); + +static float64 extract_hf16(float16 in, float_status *fp_status) +{ + return float16_to_float64(in, true, fp_status); +} + +static void vsxger16(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, + ppc_acc_t *at, uint32_t mask, bool acc, + bool neg_mul, bool neg_acc, extract_f16 extract) +{ + float32 r, aux_acc; + float64 psum, va, vb, vc, vd; + int i, j, xmsk_bit, ymsk_bit; + uint8_t pmsk = FIELD_EX32(mask, GER_MSK, PMSK), + xmsk = FIELD_EX32(mask, GER_MSK, XMSK), + ymsk = FIELD_EX32(mask, GER_MSK, YMSK); + float_status *excp_ptr = &env->fp_status; + for (i = 0, xmsk_bit = 1 << 3; i < 4; i++, xmsk_bit >>= 1) { + for (j = 0, ymsk_bit = 1 << 3; j < 4; j++, ymsk_bit >>= 1) { + if ((xmsk_bit & xmsk) && (ymsk_bit & ymsk)) { + va = !(pmsk & 2) ? float64_zero : + extract(a->VsrHF(2 * i), excp_ptr); + vb = !(pmsk & 2) ? float64_zero : + extract(b->VsrHF(2 * j), excp_ptr); + vc = !(pmsk & 1) ? float64_zero : + extract(a->VsrHF(2 * i + 1), excp_ptr); + vd = !(pmsk & 1) ? float64_zero : + extract(b->VsrHF(2 * j + 1), excp_ptr); + psum = float64_mul(va, vb, excp_ptr); + psum = float64r32_muladd(vc, vd, psum, 0, excp_ptr); + r = float64_to_float32(psum, excp_ptr); + if (acc) { + aux_acc = at[i].VsrSF(j); + if (neg_mul) { + r = bfp32_neg(r); + } + if (neg_acc) { + aux_acc = bfp32_neg(aux_acc); + } + r = float32_add(r, aux_acc, excp_ptr); + } + at[i].VsrSF(j) = r; + } else { + at[i].VsrSF(j) = float32_zero; + } + } + } + vsxger_excp(env, GETPC()); +} + typedef void vsxger_zero(ppc_vsr_t *at, int, int); typedef void vsxger_muladd_f(ppc_vsr_t *, ppc_vsr_t *, ppc_vsr_t *, int, int, @@ -XXX,XX +XXX,XX @@ static void vsxger(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, vsxger_excp(env, GETPC()); } +QEMU_FLATTEN +void helper_XVF16GER2(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, + ppc_acc_t *at, uint32_t mask) +{ + vsxger16(env, a, b, at, mask, false, false, false, extract_hf16); +} + +QEMU_FLATTEN +void helper_XVF16GER2PP(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, + ppc_acc_t *at, uint32_t mask) +{ + vsxger16(env, a, b, at, mask, true, false, false, extract_hf16); +} + +QEMU_FLATTEN +void helper_XVF16GER2PN(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, + ppc_acc_t *at, uint32_t mask) +{ + vsxger16(env, a, b, at, mask, true, false, true, extract_hf16); +} + +QEMU_FLATTEN +void helper_XVF16GER2NP(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, + ppc_acc_t *at, uint32_t mask) +{ + vsxger16(env, a, b, at, mask, true, true, false, extract_hf16); +} + +QEMU_FLATTEN +void helper_XVF16GER2NN(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, + ppc_acc_t *at, uint32_t mask) +{ + vsxger16(env, a, b, at, mask, true, true, true, extract_hf16); +} + QEMU_FLATTEN void helper_XVF32GER(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, ppc_acc_t *at, uint32_t mask) diff --git a/target/ppc/helper.h b/target/ppc/helper.h index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/helper.h +++ b/target/ppc/helper.h @@ -XXX,XX +XXX,XX @@ DEF_HELPER_5(XVI16GER2, void, env, vsr, vsr, acc, i32) DEF_HELPER_5(XVI16GER2S, void, env, vsr, vsr, acc, i32) DEF_HELPER_5(XVI16GER2PP, void, env, vsr, vsr, acc, i32) DEF_HELPER_5(XVI16GER2SPP, void, env, vsr, vsr, acc, i32) +DEF_HELPER_5(XVF16GER2, void, env, vsr, vsr, acc, i32) +DEF_HELPER_5(XVF16GER2PP, void, env, vsr, vsr, acc, i32) +DEF_HELPER_5(XVF16GER2PN, void, env, vsr, vsr, acc, i32) +DEF_HELPER_5(XVF16GER2NP, void, env, vsr, vsr, acc, i32) +DEF_HELPER_5(XVF16GER2NN, void, env, vsr, vsr, acc, i32) DEF_HELPER_5(XVF32GER, void, env, vsr, vsr, acc, i32) DEF_HELPER_5(XVF32GERPP, void, env, vsr, vsr, acc, i32) DEF_HELPER_5(XVF32GERPN, void, env, vsr, vsr, acc, i32) diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/insn32.decode +++ b/target/ppc/insn32.decode @@ -XXX,XX +XXX,XX @@ XVI8GER4SPP 111011 ... -- ..... ..... 01100011 ..- @XX3_at xa=%xx_xa XVI16GER2S 111011 ... -- ..... ..... 00101011 ..- @XX3_at xa=%xx_xa XVI16GER2SPP 111011 ... -- ..... ..... 00101010 ..- @XX3_at xa=%xx_xa +XVF16GER2 111011 ... -- ..... ..... 00010011 ..- @XX3_at xa=%xx_xa +XVF16GER2PP 111011 ... -- ..... ..... 00010010 ..- @XX3_at xa=%xx_xa +XVF16GER2PN 111011 ... -- ..... ..... 10010010 ..- @XX3_at xa=%xx_xa +XVF16GER2NP 111011 ... -- ..... ..... 01010010 ..- @XX3_at xa=%xx_xa +XVF16GER2NN 111011 ... -- ..... ..... 11010010 ..- @XX3_at xa=%xx_xa + XVF32GER 111011 ... -- ..... ..... 00011011 ..- @XX3_at xa=%xx_xa XVF32GERPP 111011 ... -- ..... ..... 00011010 ..- @XX3_at xa=%xx_xa XVF32GERPN 111011 ... -- ..... ..... 10011010 ..- @XX3_at xa=%xx_xa diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/translate/vsx-impl.c.inc +++ b/target/ppc/translate/vsx-impl.c.inc @@ -XXX,XX +XXX,XX @@ TRANS64(PMXVI16GER2PP, do_ger, gen_helper_XVI16GER2PP) TRANS64(PMXVI16GER2S, do_ger, gen_helper_XVI16GER2S) TRANS64(PMXVI16GER2SPP, do_ger, gen_helper_XVI16GER2SPP) +TRANS(XVF16GER2, do_ger, gen_helper_XVF16GER2) +TRANS(XVF16GER2PP, do_ger, gen_helper_XVF16GER2PP) +TRANS(XVF16GER2PN, do_ger, gen_helper_XVF16GER2PN) +TRANS(XVF16GER2NP, do_ger, gen_helper_XVF16GER2NP) +TRANS(XVF16GER2NN, do_ger, gen_helper_XVF16GER2NN) + TRANS(XVF32GER, do_ger, gen_helper_XVF32GER) TRANS(XVF32GERPP, do_ger, gen_helper_XVF32GERPP) TRANS(XVF32GERPN, do_ger, gen_helper_XVF32GERPN) -- 2.36.1
From: "Lucas Mateus Castro (alqotel)" <lucas.araujo@eldorado.org.br> Implement the following PowerISA v3.1 instructions: pmxvf16ger2: Prefixed Masked VSX Vector 16-bit Floating-Point GER (rank-2 update) pmxvf16ger2nn: Prefixed Masked VSX Vector 16-bit Floating-Point GER (rank-2 update) Negative multiply, Negative accumulate pmxvf16ger2np: Prefixed Masked VSX Vector 16-bit Floating-Point GER (rank-2 update) Negative multiply, Positive accumulate pmxvf16ger2pn: Prefixed Masked VSX Vector 16-bit Floating-Point GER (rank-2 update) Positive multiply, Negative accumulate pmxvf16ger2pp: Prefixed Masked VSX Vector 16-bit Floating-Point GER (rank-2 update) Positive multiply, Positive accumulate pmxvf32ger: Prefixed Masked VSX Vector 32-bit Floating-Point GER (rank-1 update) pmxvf32gernn: Prefixed Masked VSX Vector 32-bit Floating-Point GER (rank-1 update) Negative multiply, Negative accumulate pmxvf32gernp: Prefixed Masked VSX Vector 32-bit Floating-Point GER (rank-1 update) Negative multiply, Positive accumulate pmxvf32gerpn: Prefixed Masked VSX Vector 32-bit Floating-Point GER (rank-1 update) Positive multiply, Negative accumulate pmxvf32gerpp: Prefixed Masked VSX Vector 32-bit Floating-Point GER (rank-1 update) Positive multiply, Positive accumulate pmxvf64ger: Prefixed Masked VSX Vector 64-bit Floating-Point GER (rank-1 update) pmxvf64gernn: Prefixed Masked VSX Vector 64-bit Floating-Point GER (rank-1 update) Negative multiply, Negative accumulate pmxvf64gernp: Prefixed Masked VSX Vector 64-bit Floating-Point GER (rank-1 update) Negative multiply, Positive accumulate pmxvf64gerpn: Prefixed Masked VSX Vector 64-bit Floating-Point GER (rank-1 update) Positive multiply, Negative accumulate pmxvf64gerpp: Prefixed Masked VSX Vector 64-bit Floating-Point GER (rank-1 update) Positive multiply, Positive accumulate Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220524140537.27451-7-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- target/ppc/insn64.decode | 38 +++++++++++++++++++++++++++++ target/ppc/translate/vsx-impl.c.inc | 18 ++++++++++++++ 2 files changed, 56 insertions(+) diff --git a/target/ppc/insn64.decode b/target/ppc/insn64.decode index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/insn64.decode +++ b/target/ppc/insn64.decode @@ -XXX,XX +XXX,XX @@ %xx3_xa 2:1 16:5 %xx3_xb 1:1 11:5 %xx3_at 23:3 +%xx3_xa_pair 2:1 17:4 !function=times_2 @MMIRR_XX3 ...... .. .... .. . . ........ xmsk:4 ymsk:4 \ ...... ... .. ..... ..... ........ ... \ &MMIRR_XX3 xa=%xx3_xa xb=%xx3_xb xt=%xx3_at +@MMIRR_XX3_NO_P ...... .. .... .. . . ........ xmsk:4 .... \ + ...... ... .. ..... ..... ........ ... \ + &MMIRR_XX3 xb=%xx3_xb xt=%xx3_at pmsk=1 + ### Fixed-Point Load Instructions PLBZ 000001 10 0--.-- .................. \ @@ -XXX,XX +XXX,XX @@ PMXVI16GER2S 000001 11 1001 -- - - pmsk:2 ------ ........ \ PMXVI16GER2SPP 000001 11 1001 -- - - pmsk:2 ------ ........ \ 111011 ... -- ..... ..... 00101010 ..- @MMIRR_XX3 +PMXVF16GER2 000001 11 1001 -- - - pmsk:2 ------ ........ \ + 111011 ... -- ..... ..... 00010011 ..- @MMIRR_XX3 +PMXVF16GER2PP 000001 11 1001 -- - - pmsk:2 ------ ........ \ + 111011 ... -- ..... ..... 00010010 ..- @MMIRR_XX3 +PMXVF16GER2PN 000001 11 1001 -- - - pmsk:2 ------ ........ \ + 111011 ... -- ..... ..... 10010010 ..- @MMIRR_XX3 +PMXVF16GER2NP 000001 11 1001 -- - - pmsk:2 ------ ........ \ + 111011 ... -- ..... ..... 01010010 ..- @MMIRR_XX3 +PMXVF16GER2NN 000001 11 1001 -- - - pmsk:2 ------ ........ \ + 111011 ... -- ..... ..... 11010010 ..- @MMIRR_XX3 + +PMXVF32GER 000001 11 1001 -- - - -------- .... ymsk:4 \ + 111011 ... -- ..... ..... 00011011 ..- @MMIRR_XX3_NO_P xa=%xx3_xa +PMXVF32GERPP 000001 11 1001 -- - - -------- .... ymsk:4 \ + 111011 ... -- ..... ..... 00011010 ..- @MMIRR_XX3_NO_P xa=%xx3_xa +PMXVF32GERPN 000001 11 1001 -- - - -------- .... ymsk:4 \ + 111011 ... -- ..... ..... 10011010 ..- @MMIRR_XX3_NO_P xa=%xx3_xa +PMXVF32GERNP 000001 11 1001 -- - - -------- .... ymsk:4 \ + 111011 ... -- ..... ..... 01011010 ..- @MMIRR_XX3_NO_P xa=%xx3_xa +PMXVF32GERNN 000001 11 1001 -- - - -------- .... ymsk:4 \ + 111011 ... -- ..... ..... 11011010 ..- @MMIRR_XX3_NO_P xa=%xx3_xa + +PMXVF64GER 000001 11 1001 -- - - -------- .... ymsk:2 -- \ + 111011 ... -- ....0 ..... 00111011 ..- @MMIRR_XX3_NO_P xa=%xx3_xa_pair +PMXVF64GERPP 000001 11 1001 -- - - -------- .... ymsk:2 -- \ + 111011 ... -- ....0 ..... 00111010 ..- @MMIRR_XX3_NO_P xa=%xx3_xa_pair +PMXVF64GERPN 000001 11 1001 -- - - -------- .... ymsk:2 -- \ + 111011 ... -- ....0 ..... 10111010 ..- @MMIRR_XX3_NO_P xa=%xx3_xa_pair +PMXVF64GERNP 000001 11 1001 -- - - -------- .... ymsk:2 -- \ + 111011 ... -- ....0 ..... 01111010 ..- @MMIRR_XX3_NO_P xa=%xx3_xa_pair +PMXVF64GERNN 000001 11 1001 -- - - -------- .... ymsk:2 -- \ + 111011 ... -- ....0 ..... 11111010 ..- @MMIRR_XX3_NO_P xa=%xx3_xa_pair + ### Prefixed No-operation Instruction @PNOP 000001 11 0000-- 000000000000000000 \ diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/translate/vsx-impl.c.inc +++ b/target/ppc/translate/vsx-impl.c.inc @@ -XXX,XX +XXX,XX @@ TRANS(XVF64GERPN, do_ger, gen_helper_XVF64GERPN) TRANS(XVF64GERNP, do_ger, gen_helper_XVF64GERNP) TRANS(XVF64GERNN, do_ger, gen_helper_XVF64GERNN) +TRANS64(PMXVF16GER2, do_ger, gen_helper_XVF16GER2) +TRANS64(PMXVF16GER2PP, do_ger, gen_helper_XVF16GER2PP) +TRANS64(PMXVF16GER2PN, do_ger, gen_helper_XVF16GER2PN) +TRANS64(PMXVF16GER2NP, do_ger, gen_helper_XVF16GER2NP) +TRANS64(PMXVF16GER2NN, do_ger, gen_helper_XVF16GER2NN) + +TRANS64(PMXVF32GER, do_ger, gen_helper_XVF32GER) +TRANS64(PMXVF32GERPP, do_ger, gen_helper_XVF32GERPP) +TRANS64(PMXVF32GERPN, do_ger, gen_helper_XVF32GERPN) +TRANS64(PMXVF32GERNP, do_ger, gen_helper_XVF32GERNP) +TRANS64(PMXVF32GERNN, do_ger, gen_helper_XVF32GERNN) + +TRANS64(PMXVF64GER, do_ger, gen_helper_XVF64GER) +TRANS64(PMXVF64GERPP, do_ger, gen_helper_XVF64GERPP) +TRANS64(PMXVF64GERPN, do_ger, gen_helper_XVF64GERPN) +TRANS64(PMXVF64GERNP, do_ger, gen_helper_XVF64GERNP) +TRANS64(PMXVF64GERNN, do_ger, gen_helper_XVF64GERNN) + #undef GEN_XX2FORM #undef GEN_XX3FORM #undef GEN_XX2IFORM -- 2.36.1
From: "Lucas Mateus Castro (alqotel)" <lucas.araujo@eldorado.org.br> Implement the following PowerISA v3.1 instructions: xvbf16ger2: VSX Vector bfloat16 GER (rank-2 update) xvbf16ger2nn: VSX Vector bfloat16 GER (rank-2 update) Negative multiply, Negative accumulate xvbf16ger2np: VSX Vector bfloat16 GER (rank-2 update) Negative multiply, Positive accumulate xvbf16ger2pn: VSX Vector bfloat16 GER (rank-2 update) Positive multiply, Negative accumulate xvbf16ger2pp: VSX Vector bfloat16 GER (rank-2 update) Positive multiply, Positive accumulate pmxvbf16ger2: Prefixed Masked VSX Vector bfloat16 GER (rank-2 update) pmxvbf16ger2nn: Prefixed Masked VSX Vector bfloat16 GER (rank-2 update) Negative multiply, Negative accumulate pmxvbf16ger2np: Prefixed Masked VSX Vector bfloat16 GER (rank-2 update) Negative multiply, Positive accumulate pmxvbf16ger2pn: Prefixed Masked VSX Vector bfloat16 GER (rank-2 update) Positive multiply, Negative accumulate pmxvbf16ger2pp: Prefixed Masked VSX Vector bfloat16 GER (rank-2 update) Positive multiply, Positive accumulate Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220524140537.27451-8-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- target/ppc/fpu_helper.c | 40 +++++++++++++++++++++++++++++ target/ppc/helper.h | 5 ++++ target/ppc/insn32.decode | 6 +++++ target/ppc/insn64.decode | 11 ++++++++ target/ppc/translate/vsx-impl.c.inc | 12 +++++++++ 5 files changed, 74 insertions(+) diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/fpu_helper.c +++ b/target/ppc/fpu_helper.c @@ -XXX,XX +XXX,XX @@ static float64 extract_hf16(float16 in, float_status *fp_status) return float16_to_float64(in, true, fp_status); } +static float64 extract_bf16(bfloat16 in, float_status *fp_status) +{ + return bfloat16_to_float64(in, fp_status); +} + static void vsxger16(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, ppc_acc_t *at, uint32_t mask, bool acc, bool neg_mul, bool neg_acc, extract_f16 extract) @@ -XXX,XX +XXX,XX @@ static void vsxger(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, vsxger_excp(env, GETPC()); } +QEMU_FLATTEN +void helper_XVBF16GER2(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, + ppc_acc_t *at, uint32_t mask) +{ + vsxger16(env, a, b, at, mask, false, false, false, extract_bf16); +} + +QEMU_FLATTEN +void helper_XVBF16GER2PP(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, + ppc_acc_t *at, uint32_t mask) +{ + vsxger16(env, a, b, at, mask, true, false, false, extract_bf16); +} + +QEMU_FLATTEN +void helper_XVBF16GER2PN(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, + ppc_acc_t *at, uint32_t mask) +{ + vsxger16(env, a, b, at, mask, true, false, true, extract_bf16); +} + +QEMU_FLATTEN +void helper_XVBF16GER2NP(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, + ppc_acc_t *at, uint32_t mask) +{ + vsxger16(env, a, b, at, mask, true, true, false, extract_bf16); +} + +QEMU_FLATTEN +void helper_XVBF16GER2NN(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, + ppc_acc_t *at, uint32_t mask) +{ + vsxger16(env, a, b, at, mask, true, true, true, extract_bf16); +} + QEMU_FLATTEN void helper_XVF16GER2(CPUPPCState *env, ppc_vsr_t *a, ppc_vsr_t *b, ppc_acc_t *at, uint32_t mask) diff --git a/target/ppc/helper.h b/target/ppc/helper.h index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/helper.h +++ b/target/ppc/helper.h @@ -XXX,XX +XXX,XX @@ DEF_HELPER_5(XVF16GER2PP, void, env, vsr, vsr, acc, i32) DEF_HELPER_5(XVF16GER2PN, void, env, vsr, vsr, acc, i32) DEF_HELPER_5(XVF16GER2NP, void, env, vsr, vsr, acc, i32) DEF_HELPER_5(XVF16GER2NN, void, env, vsr, vsr, acc, i32) +DEF_HELPER_5(XVBF16GER2, void, env, vsr, vsr, acc, i32) +DEF_HELPER_5(XVBF16GER2PP, void, env, vsr, vsr, acc, i32) +DEF_HELPER_5(XVBF16GER2PN, void, env, vsr, vsr, acc, i32) +DEF_HELPER_5(XVBF16GER2NP, void, env, vsr, vsr, acc, i32) +DEF_HELPER_5(XVBF16GER2NN, void, env, vsr, vsr, acc, i32) DEF_HELPER_5(XVF32GER, void, env, vsr, vsr, acc, i32) DEF_HELPER_5(XVF32GERPP, void, env, vsr, vsr, acc, i32) DEF_HELPER_5(XVF32GERPN, void, env, vsr, vsr, acc, i32) diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/insn32.decode +++ b/target/ppc/insn32.decode @@ -XXX,XX +XXX,XX @@ XVI8GER4SPP 111011 ... -- ..... ..... 01100011 ..- @XX3_at xa=%xx_xa XVI16GER2S 111011 ... -- ..... ..... 00101011 ..- @XX3_at xa=%xx_xa XVI16GER2SPP 111011 ... -- ..... ..... 00101010 ..- @XX3_at xa=%xx_xa +XVBF16GER2 111011 ... -- ..... ..... 00110011 ..- @XX3_at xa=%xx_xa +XVBF16GER2PP 111011 ... -- ..... ..... 00110010 ..- @XX3_at xa=%xx_xa +XVBF16GER2PN 111011 ... -- ..... ..... 10110010 ..- @XX3_at xa=%xx_xa +XVBF16GER2NP 111011 ... -- ..... ..... 01110010 ..- @XX3_at xa=%xx_xa +XVBF16GER2NN 111011 ... -- ..... ..... 11110010 ..- @XX3_at xa=%xx_xa + XVF16GER2 111011 ... -- ..... ..... 00010011 ..- @XX3_at xa=%xx_xa XVF16GER2PP 111011 ... -- ..... ..... 00010010 ..- @XX3_at xa=%xx_xa XVF16GER2PN 111011 ... -- ..... ..... 10010010 ..- @XX3_at xa=%xx_xa diff --git a/target/ppc/insn64.decode b/target/ppc/insn64.decode index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/insn64.decode +++ b/target/ppc/insn64.decode @@ -XXX,XX +XXX,XX @@ PMXVI16GER2S 000001 11 1001 -- - - pmsk:2 ------ ........ \ PMXVI16GER2SPP 000001 11 1001 -- - - pmsk:2 ------ ........ \ 111011 ... -- ..... ..... 00101010 ..- @MMIRR_XX3 +PMXVBF16GER2 000001 11 1001 -- - - pmsk:2 ------ ........ \ + 111011 ... -- ..... ..... 00110011 ..- @MMIRR_XX3 +PMXVBF16GER2PP 000001 11 1001 -- - - pmsk:2 ------ ........ \ + 111011 ... -- ..... ..... 00110010 ..- @MMIRR_XX3 +PMXVBF16GER2PN 000001 11 1001 -- - - pmsk:2 ------ ........ \ + 111011 ... -- ..... ..... 10110010 ..- @MMIRR_XX3 +PMXVBF16GER2NP 000001 11 1001 -- - - pmsk:2 ------ ........ \ + 111011 ... -- ..... ..... 01110010 ..- @MMIRR_XX3 +PMXVBF16GER2NN 000001 11 1001 -- - - pmsk:2 ------ ........ \ + 111011 ... -- ..... ..... 11110010 ..- @MMIRR_XX3 + PMXVF16GER2 000001 11 1001 -- - - pmsk:2 ------ ........ \ 111011 ... -- ..... ..... 00010011 ..- @MMIRR_XX3 PMXVF16GER2PP 000001 11 1001 -- - - pmsk:2 ------ ........ \ diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc index XXXXXXX..XXXXXXX 100644 --- a/target/ppc/translate/vsx-impl.c.inc +++ b/target/ppc/translate/vsx-impl.c.inc @@ -XXX,XX +XXX,XX @@ TRANS64(PMXVI16GER2PP, do_ger, gen_helper_XVI16GER2PP) TRANS64(PMXVI16GER2S, do_ger, gen_helper_XVI16GER2S) TRANS64(PMXVI16GER2SPP, do_ger, gen_helper_XVI16GER2SPP) +TRANS(XVBF16GER2, do_ger, gen_helper_XVBF16GER2) +TRANS(XVBF16GER2PP, do_ger, gen_helper_XVBF16GER2PP) +TRANS(XVBF16GER2PN, do_ger, gen_helper_XVBF16GER2PN) +TRANS(XVBF16GER2NP, do_ger, gen_helper_XVBF16GER2NP) +TRANS(XVBF16GER2NN, do_ger, gen_helper_XVBF16GER2NN) + TRANS(XVF16GER2, do_ger, gen_helper_XVF16GER2) TRANS(XVF16GER2PP, do_ger, gen_helper_XVF16GER2PP) TRANS(XVF16GER2PN, do_ger, gen_helper_XVF16GER2PN) @@ -XXX,XX +XXX,XX @@ TRANS(XVF64GERPN, do_ger, gen_helper_XVF64GERPN) TRANS(XVF64GERNP, do_ger, gen_helper_XVF64GERNP) TRANS(XVF64GERNN, do_ger, gen_helper_XVF64GERNN) +TRANS64(PMXVBF16GER2, do_ger, gen_helper_XVBF16GER2) +TRANS64(PMXVBF16GER2PP, do_ger, gen_helper_XVBF16GER2PP) +TRANS64(PMXVBF16GER2PN, do_ger, gen_helper_XVBF16GER2PN) +TRANS64(PMXVBF16GER2NP, do_ger, gen_helper_XVBF16GER2NP) +TRANS64(PMXVBF16GER2NN, do_ger, gen_helper_XVBF16GER2NN) + TRANS64(PMXVF16GER2, do_ger, gen_helper_XVF16GER2) TRANS64(PMXVF16GER2PP, do_ger, gen_helper_XVF16GER2PP) TRANS64(PMXVF16GER2PN, do_ger, gen_helper_XVF16GER2PN) -- 2.36.1
From: Joel Stanley <joel@jms.id.au> These are new hwcap bits added for power10. Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20220524140537.27451-9-lucas.araujo@eldorado.org.br> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com> --- linux-user/elfload.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/linux-user/elfload.c b/linux-user/elfload.c index XXXXXXX..XXXXXXX 100644 --- a/linux-user/elfload.c +++ b/linux-user/elfload.c @@ -XXX,XX +XXX,XX @@ enum { QEMU_PPC_FEATURE2_DARN = 0x00200000, /* darn random number insn */ QEMU_PPC_FEATURE2_SCV = 0x00100000, /* scv syscall */ QEMU_PPC_FEATURE2_HTM_NO_SUSPEND = 0x00080000, /* TM w/o suspended state */ + QEMU_PPC_FEATURE2_ARCH_3_1 = 0x00040000, /* ISA 3.1 */ + QEMU_PPC_FEATURE2_MMA = 0x00020000, /* Matrix-Multiply Assist */ }; #define ELF_HWCAP get_elf_hwcap() @@ -XXX,XX +XXX,XX @@ static uint32_t get_elf_hwcap2(void) QEMU_PPC_FEATURE2_VEC_CRYPTO); GET_FEATURE2(PPC2_ISA300, QEMU_PPC_FEATURE2_ARCH_3_00 | QEMU_PPC_FEATURE2_DARN | QEMU_PPC_FEATURE2_HAS_IEEE128); + GET_FEATURE2(PPC2_ISA310, QEMU_PPC_FEATURE2_ARCH_3_1 | + QEMU_PPC_FEATURE2_MMA); #undef GET_FEATURE #undef GET_FEATURE2 -- 2.36.1
The following changes since commit 76b56fdfc9fa43ec6e5986aee33f108c6c6a511e: Merge tag 'block-pull-request' of https://gitlab.com/stefanha/qemu into staging (2021-12-14 12:46:18 -0800) are available in the Git repository at: https://github.com/legoater/qemu/ tags/pull-ppc-20211217 for you to fetch changes up to 0e6232bc3cb96bdf6fac1b5d7659aa9887afe657: ppc/pnv: Use QOM hierarchy to scan PEC PHB4 devices (2021-12-17 17:57:19 +0100) Changes in v3: - Fixed patch "docs: Introducing pseries documentation" with a newline and checked documentation generation with : make docker-test-build@ubuntu1804 TARGET_LIST=i386-softmmu Changes in v2: - Fixed patch "docs: rSTify ppc-spapr-hcalls.txt" with a newline - dropped patch "target/ppc: do not silence SNaN in xscvspdpn" which still had some comments pending. ---------------------------------------------------------------- ppc 7.0 queue: * General cleanup for Mac machines (Peter) * Fixes for FPU exceptions (Lucas) * Support for new ISA31 instructions (Matheus) * Fixes for ivshmem (Daniel) * Cleanups for PowerNV PHB (Christophe and Cedric) * Updates of PowerNV and pSeries documentation (Leonardo and Daniel) * Fixes for PowerNV (Daniel) * Large cleanup of FPU implementation (Richard) * Removal of SoftTLBs support for PPC74x CPUs (Fabiano) * Fixes for exception models in MPCx and 60x CPUs (Fabiano) * Removal of 401/403 CPUs (Cedric) * Deprecation of taihu machine (Thomas) * Large rework of PPC405 machine (Cedric) * Fixes for VSX instructions (Victor and Matheus) * Fix for e6500 CPU (Fabiano) * Initial support for PMU (Daniel) ---------------------------------------------------------------- Alexey Kardashevskiy (1): pseries: Update SLOF firmware image Christophe Lombard (1): pci-host: Allow extended config space access for PowerNV PHB4 model Cédric Le Goater (28): Merge tag 'qemu-slof-20211112' of github.com:aik/qemu into ppc-next target/ppc: remove 401/403 CPUs ppc/ppc405: Change kernel load address ppc: Add trace-events for DCR accesses ppc/ppc405: Convert printfs to trace-events ppc/ppc405: Drop flag parameter in ppc405_set_bootinfo() ppc/ppc405: Change ppc405ep_init() return value ppc/ppc405: Add some address space definitions ppc/ppc405: Remove flash support ppc/ppc405: Rework FW load ppc/ppc405: Introduce ppc405_set_default_bootinfo() ppc/ppc405: Fix boot from kernel ppc/ppc405: Change default PLL values at reset ppc/ppc405: Fix bi_pci_enetaddr2 field in U-Boot board information ppc/ppc405: Add update of bi_procfreq field ppc/pnv: Introduce a "chip" property under PHB3 ppc/pnv: Use the chip class to check the index of PHB3 devices ppc/pnv: Drop the "num-phbs" property ppc/pnv: Move mapping of the PHB3 CQ regions under pnv_pbcq_realize() ppc/pnv: Use QOM hierarchy to scan PHB3 devices ppc/pnv: Introduce a num_pecs class attribute for PHB4 PEC devices ppc/pnv: Introduce version and device_id class atributes for PHB4 devices ppc/pnv: Introduce a "chip" property under the PHB4 model ppc/pnv: Introduce a num_stack class attribute ppc/pnv: Compute the PHB index from the PHB4 PEC model ppc/pnv: Remove "system-memory" property from PHB4 PEC ppc/pnv: Move realize of PEC stacks under the PEC model ppc/pnv: Use QOM hierarchy to scan PEC PHB4 devices Daniel Henrique Barboza (13): ivshmem.c: change endianness to LITTLE_ENDIAN ivshmem-test.c: enable test_ivshmem_server for ppc64 arch ppc/pnv.c: add a friendly warning when accel=kvm is used docs/system/ppc/powernv.rst: document KVM support status ppc/pnv.c: fix "system-id" FDT when -uuid is set target/ppc: introduce PMUEventType and PMU overflow timers target/ppc: PMU basic cycle count for pseries TCG target/ppc: PMU: update counters on PMCs r/w target/ppc: PMU: update counters on MMCR1 write target/ppc: enable PMU counter overflow with cycle events target/ppc: enable PMU instruction count target/ppc/power8-pmu.c: add PM_RUN_INST_CMPL (0xFA) event PPC64/TCG: Implement 'rfebb' instruction Fabiano Rosas (8): target/ppc: Disable software TLB for the 7450 family target/ppc: Disable unused facilities in the e600 CPU target/ppc: Remove the software TLB model of 7450 CPUs target/ppc: Fix MPCxxx FPU interrupt address target/ppc: Remove 603e exception model target/ppc: Set 601v exception model id target/ppc: Fix e6500 boot Revert "target/ppc: Move SPR_DSISR setting to powerpc_excp" Leonardo Garcia (5): docs: Minor updates on the powernv documentation. docs: Introducing pseries documentation. docs: rSTify ppc-spapr-hcalls.txt docs: Rename ppc-spapr-hcalls.txt to ppc-spapr-hcalls.rst. Link new ppc-spapr-hcalls.rst file to pseries.rst. Lucas Mateus Castro (alqotel) (3): target/ppc: Fixed call to deferred exception test/tcg/ppc64le: test mtfsf target/ppc: ppc_store_fpscr doesn't update bits 0 to 28 and 52 Matheus Ferst (5): target/ppc: Implement Vector Expand Mask target/ppc: Implement Vector Extract Mask target/ppc: Implement Vector Mask Move insns target/ppc: fix xscvqpdp register access target/ppc: move xscvqpdp to decodetree Peter Maydell (1): hw/ppc/mac.h: Remove MAX_CPUS macro Richard Henderson (34): softfloat: Extend float_exception_flags to 16 bits softfloat: Add flag specific to Inf - Inf softfloat: Add flag specific to Inf * 0 softfloat: Add flags specific to Inf / Inf and 0 / 0 softfloat: Add flag specific to sqrt(-x) softfloat: Add flag specific to convert non-nan to int softfloat: Add flag specific to signaling nans target/ppc: Update float_invalid_op_addsub for new flags target/ppc: Update float_invalid_op_mul for new flags target/ppc: Update float_invalid_op_div for new flags target/ppc: Move float_check_status from FPU_FCTI to translate target/ppc: Update float_invalid_cvt for new flags target/ppc: Fix VXCVI return value target/ppc: Remove inline from do_fri target/ppc: Use FloatRoundMode in do_fri target/ppc: Tidy inexact handling in do_fri target/ppc: Clean up do_fri target/ppc: Update fmadd for new flags target/ppc: Split out do_fmadd target/ppc: Do not call do_float_check_status from do_fmadd target/ppc: Split out do_frsp target/ppc: Update do_frsp for new flags target/ppc: Use helper_todouble in do_frsp target/ppc: Update sqrt for new flags target/ppc: Update xsrqpi and xsrqpxp to new flags target/ppc: Update fre to new flags softfloat: Add float64r32 arithmetic routines target/ppc: Add helpers for fmadds et al target/ppc: Add helper for fsqrts target/ppc: Add helpers for fadds, fsubs, fdivs target/ppc: Add helper for fmuls target/ppc: Add helper for frsqrtes target/ppc: Update fres to new flags and float64r32 target/ppc: Use helper_todouble/tosingle in helper_xststdcsp Thomas Huth (1): ppc: Mark the 'taihu' machine as deprecated Victor Colombo (2): target/ppc: Fix xs{max, min}[cj]dp to use VSX registers target/ppc: Move xs{max,min}[cj]dp to decodetree docs/about/deprecated.rst | 9 + docs/specs/ppc-spapr-hcalls.rst | 100 +++++ docs/specs/ppc-spapr-hcalls.txt | 78 ---- docs/system/ppc/powernv.rst | 68 ++-- docs/system/ppc/pseries.rst | 226 +++++++++++ hw/ppc/mac.h | 3 - hw/ppc/ppc405.h | 14 +- include/fpu/softfloat-types.h | 23 +- include/fpu/softfloat.h | 14 +- include/hw/pci-host/pnv_phb3.h | 3 + include/hw/pci-host/pnv_phb4.h | 5 + include/hw/ppc/pnv.h | 2 + target/ppc/cpu-models.h | 19 - target/ppc/cpu-qom.h | 12 +- target/ppc/cpu.h | 63 +++- target/ppc/helper.h | 29 +- target/ppc/power8-pmu.h | 26 ++ target/ppc/spr_tcg.h | 5 + target/ppc/insn32.decode | 54 ++- fpu/softfloat.c | 114 +++++- hw/misc/ivshmem.c | 2 +- hw/pci-host/pnv_phb3.c | 3 +- hw/pci-host/pnv_phb3_pbcq.c | 11 + hw/pci-host/pnv_phb4.c | 1 + hw/pci-host/pnv_phb4_pec.c | 75 +++- hw/ppc/mac_newworld.c | 3 +- hw/ppc/mac_oldworld.c | 3 +- hw/ppc/pnv.c | 177 +++++---- hw/ppc/ppc.c | 2 + hw/ppc/ppc405_boards.c | 245 ++++++------ hw/ppc/ppc405_uc.c | 225 ++++++----- hw/ppc/spapr_cpu_core.c | 1 + target/ppc/cpu-models.c | 34 -- target/ppc/cpu.c | 2 +- target/ppc/cpu_init.c | 658 +++------------------------------ target/ppc/excp_helper.c | 95 +++-- target/ppc/fpu_helper.c | 593 +++++++++++++++-------------- target/ppc/helper_regs.c | 7 + target/ppc/mmu_common.c | 60 +-- target/ppc/mmu_helper.c | 32 -- target/ppc/power8-pmu.c | 350 ++++++++++++++++++ target/ppc/translate.c | 104 ++++-- tests/qtest/ivshmem-test.c | 5 +- tests/tcg/ppc64le/mtfsf.c | 61 +++ fpu/softfloat-parts.c.inc | 57 +-- fpu/softfloat-specialize.c.inc | 12 +- target/ppc/power8-pmu-regs.c.inc | 69 +++- target/ppc/translate/branch-impl.c.inc | 33 ++ target/ppc/translate/fp-impl.c.inc | 53 +-- target/ppc/translate/vmx-impl.c.inc | 231 ++++++++++++ target/ppc/translate/vsx-impl.c.inc | 55 ++- target/ppc/translate/vsx-ops.c.inc | 5 - hw/ppc/trace-events | 23 ++ pc-bios/README | 2 +- pc-bios/slof.bin | Bin 991744 -> 991920 bytes roms/SLOF | 2 +- target/ppc/meson.build | 1 + tests/tcg/ppc64/Makefile.target | 1 + tests/tcg/ppc64le/Makefile.target | 1 + 59 files changed, 2514 insertions(+), 1647 deletions(-) create mode 100644 docs/specs/ppc-spapr-hcalls.rst delete mode 100644 docs/specs/ppc-spapr-hcalls.txt create mode 100644 target/ppc/power8-pmu.h create mode 100644 target/ppc/power8-pmu.c create mode 100644 tests/tcg/ppc64le/mtfsf.c create mode 100644 target/ppc/translate/branch-impl.c.inc