From nobody Thu May 2 14:26:58 2024
From: Paul Brook <paul@nowt.org>
To: qemu-devel@nongnu.org
Cc: Eduardo Habkost, Paolo Bonzini, Richard Henderson, Paul Brook
Subject: [PATCH 1/4] Add AVX_EN hflag
Date: Mon, 18 Apr 2022 18:39:01 +0100
Message-Id: <20220418173904.3746036-2-paul@nowt.org>
In-Reply-To: <20220418173904.3746036-1-paul@nowt.org>
References: <20220418173904.3746036-1-paul@nowt.org>

Add a new hflag bit to indicate whether AVX instructions are allowed,
i.e. whether CR4.OSXSAVE is set and both the SSE and YMM state
components are enabled in XCR0.
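For illustration, the condition this flag caches can be written as a
standalone predicate. This is only a sketch, not part of the patch:
avx_enabled() is a hypothetical helper, and the masks simply mirror
the architectural bit positions (CR4.OSXSAVE is bit 18; in XCR0 the
SSE state component is bit 1 and the YMM state component is bit 2):

#include <stdbool.h>
#include <stdint.h>

#define CR4_OSXSAVE_MASK (1u << 18)  /* CR4.OSXSAVE */
#define XSTATE_SSE_MASK  (1u << 1)   /* XCR0: SSE state enabled */
#define XSTATE_YMM_MASK  (1u << 2)   /* XCR0: YMM (AVX) state enabled */

static bool avx_enabled(uint64_t cr4, uint64_t xcr0)
{
    const uint64_t need = XSTATE_SSE_MASK | XSTATE_YMM_MASK;

    /* AVX is usable only when the OS has enabled XSAVE and both the
       SSE and YMM state components are enabled in XCR0. */
    return (cr4 & CR4_OSXSAVE_MASK) && (xcr0 & need) == need;
}

Caching this in hflags lets the translator test a single bit instead
of re-deriving the condition from CR4 and XCR0 at translation time.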
Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/cpu.h            |  3 +++
 target/i386/helper.c         | 12 ++++++++++++
 target/i386/tcg/fpu_helper.c |  1 +
 3 files changed, 16 insertions(+)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 982c532353..0c7162e2fd 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -168,6 +168,7 @@ typedef enum X86Seg {
 #define HF_MPX_EN_SHIFT     25 /* MPX Enabled (CR4+XCR0+BNDCFGx) */
 #define HF_MPX_IU_SHIFT     26 /* BND registers in-use */
 #define HF_UMIP_SHIFT       27 /* CR4.UMIP */
+#define HF_AVX_EN_SHIFT     28 /* AVX Enabled (CR4+XCR0) */
 
 #define HF_CPL_MASK          (3 << HF_CPL_SHIFT)
 #define HF_INHIBIT_IRQ_MASK  (1 << HF_INHIBIT_IRQ_SHIFT)
@@ -194,6 +195,7 @@ typedef enum X86Seg {
 #define HF_MPX_EN_MASK       (1 << HF_MPX_EN_SHIFT)
 #define HF_MPX_IU_MASK       (1 << HF_MPX_IU_SHIFT)
 #define HF_UMIP_MASK         (1 << HF_UMIP_SHIFT)
+#define HF_AVX_EN_MASK       (1 << HF_AVX_EN_SHIFT)
 
 /* hflags2 */
 
@@ -2045,6 +2047,7 @@ void host_cpuid(uint32_t function, uint32_t count,
 
 /* helper.c */
 void x86_cpu_set_a20(X86CPU *cpu, int a20_state);
+void cpu_sync_avx_hflag(CPUX86State *env);
 
 #ifndef CONFIG_USER_ONLY
 static inline int x86_asidx_from_attrs(CPUState *cs, MemTxAttrs attrs)
diff --git a/target/i386/helper.c b/target/i386/helper.c
index fa409e9c44..30083c9cff 100644
--- a/target/i386/helper.c
+++ b/target/i386/helper.c
@@ -29,6 +29,17 @@
 #endif
 #include "qemu/log.h"
 
+void cpu_sync_avx_hflag(CPUX86State *env)
+{
+    if ((env->cr[4] & CR4_OSXSAVE_MASK)
+        && (env->xcr0 & (XSTATE_SSE_MASK | XSTATE_YMM_MASK))
+            == (XSTATE_SSE_MASK | XSTATE_YMM_MASK)) {
+        env->hflags |= HF_AVX_EN_MASK;
+    } else {
+        env->hflags &= ~HF_AVX_EN_MASK;
+    }
+}
+
 void cpu_sync_bndcs_hflags(CPUX86State *env)
 {
     uint32_t hflags = env->hflags;
@@ -209,6 +220,7 @@ void cpu_x86_update_cr4(CPUX86State *env, uint32_t new_cr4)
     env->hflags = hflags;
 
     cpu_sync_bndcs_hflags(env);
+    cpu_sync_avx_hflag(env);
 }
 
 #if !defined(CONFIG_USER_ONLY)
diff --git a/target/i386/tcg/fpu_helper.c b/target/i386/tcg/fpu_helper.c
index ebf5e73df9..b391b69635 100644
--- a/target/i386/tcg/fpu_helper.c
+++ b/target/i386/tcg/fpu_helper.c
@@ -2943,6 +2943,7 @@ void helper_xsetbv(CPUX86State *env, uint32_t ecx, uint64_t mask)
 
     env->xcr0 = mask;
     cpu_sync_bndcs_hflags(env);
+    cpu_sync_avx_hflag(env);
     return;
 
 do_gpf:
-- 
2.35.2

From nobody Thu May 2 14:26:58 2024
From: Paul Brook <paul@nowt.org>
To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
Cc: "open list:All patches CC here" <qemu-devel@nongnu.org>, Paul Brook
Subject: [PATCH v2 01/42] i386: pcmpestr 64-bit sign extension bug
Date: Sun, 24 Apr 2022 23:01:23 +0100
Message-Id: <20220424220204.2493824-2-paul@nowt.org>
In-Reply-To: <20220418173904.3746036-1-paul@nowt.org>
References: <20220418173904.3746036-1-paul@nowt.org>

The abs1 function in ops_sse.h only works correctly when the result
fits in a signed int. This is fine most of the time because we're only
dealing with byte-sized values. However, the pcmp_elen helper uses
abs1 to calculate the absolute value of a cpu register. This
incorrectly truncates to 32 bits, and will give the wrong answer for
the most negative value.

Fix by open-coding the saturation check before taking the absolute
value.
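To see the bug, note that abs1() takes and returns int, so a 64-bit
register value is truncated before the absolute value is taken. A
minimal standalone demonstration (not part of the patch; abs1() is
copied in the same shape as the ops_sse.h helper):

#include <stdint.h>
#include <stdio.h>

static inline int abs1(int a)
{
    if (a < 0) {
        return -a;
    } else {
        return a;
    }
}

int main(void)
{
    int64_t reg = INT64_MIN;    /* most negative 64-bit value */

    /* The argument is converted to int, which on the usual
       two's-complement targets keeps only the low 32 bits: 0 here.
       The old code then compared this 0 against the 8/16 element
       limit and returned 0 instead of saturating to the limit. */
    printf("%d\n", abs1(reg));  /* prints 0 */
    return 0;
}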
Signed-off-by: Paul Brook <paul@nowt.org>
Reviewed-by: Richard Henderson
---
 target/i386/ops_sse.h | 20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index e4d74b814a..535440f882 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -2011,25 +2011,23 @@ SSE_HELPER_Q(helper_pcmpgtq, FCMPGTQ)
 
 static inline int pcmp_elen(CPUX86State *env, int reg, uint32_t ctrl)
 {
-    int val;
+    target_long val, limit;
 
     /* Presence of REX.W is indicated by a bit higher than 7 set */
     if (ctrl >> 8) {
-        val = abs1((int64_t)env->regs[reg]);
+        val = (target_long)env->regs[reg];
     } else {
-        val = abs1((int32_t)env->regs[reg]);
+        val = (int32_t)env->regs[reg];
     }
-
     if (ctrl & 1) {
-        if (val > 8) {
-            return 8;
-        }
+        limit = 8;
     } else {
-        if (val > 16) {
-            return 16;
-        }
+        limit = 16;
     }
-    return val;
+    if ((val > limit) || (val < -limit)) {
+        return limit;
+    }
+    return abs1(val);
 }
 
 static inline int pcmp_ilen(Reg *r, uint8_t ctrl)
-- 
2.36.0

From nobody Thu May 2 14:26:58 2024
From: Paul Brook <paul@nowt.org>
To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
Cc: "open list:All patches CC here" <qemu-devel@nongnu.org>, Paul Brook
Subject: [PATCH v2 02/42] i386: DPPS rounding fix
Date: Sun, 24 Apr 2022 23:01:24 +0100
Message-Id: <20220424220204.2493824-3-paul@nowt.org>
In-Reply-To: <20220418173904.3746036-1-paul@nowt.org>
References: <20220418173904.3746036-1-paul@nowt.org>

The DPPS (Dot Product) instruction is defined to first sum pairs of
intermediate results, then sum those values to get the final result,
i.e. (A+B)+(C+D). We incrementally sum the results, i.e. ((A+B)+C)+D,
which can result in incorrect rounding.

For consistency, also remove the redundant (but harmless) add
operation from DPPD.
Signed-off-by: Paul Brook <paul@nowt.org>
---
 target/i386/ops_sse.h | 47 +++++++++++++++++++++++--------------------
 1 file changed, 25 insertions(+), 22 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 535440f882..a5a48a20f6 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -1934,32 +1934,36 @@ SSE_HELPER_I(helper_pblendw, W, 8, FBLENDP)
 
 void glue(helper_dpps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t mask)
 {
-    float32 iresult = float32_zero;
+    float32 prod, iresult, iresult2;
 
+    /*
+     * We must evaluate (A+B)+(C+D), not ((A+B)+C)+D
+     * to correctly round the intermediate results
+     */
     if (mask & (1 << 4)) {
-        iresult = float32_add(iresult,
-                              float32_mul(d->ZMM_S(0), s->ZMM_S(0),
-                                          &env->sse_status),
-                              &env->sse_status);
+        iresult = float32_mul(d->ZMM_S(0), s->ZMM_S(0), &env->sse_status);
+    } else {
+        iresult = float32_zero;
     }
     if (mask & (1 << 5)) {
-        iresult = float32_add(iresult,
-                              float32_mul(d->ZMM_S(1), s->ZMM_S(1),
-                                          &env->sse_status),
-                              &env->sse_status);
+        prod = float32_mul(d->ZMM_S(1), s->ZMM_S(1), &env->sse_status);
+    } else {
+        prod = float32_zero;
     }
+    iresult = float32_add(iresult, prod, &env->sse_status);
     if (mask & (1 << 6)) {
-        iresult = float32_add(iresult,
-                              float32_mul(d->ZMM_S(2), s->ZMM_S(2),
-                                          &env->sse_status),
-                              &env->sse_status);
+        iresult2 = float32_mul(d->ZMM_S(2), s->ZMM_S(2), &env->sse_status);
+    } else {
+        iresult2 = float32_zero;
     }
     if (mask & (1 << 7)) {
-        iresult = float32_add(iresult,
-                              float32_mul(d->ZMM_S(3), s->ZMM_S(3),
-                                          &env->sse_status),
-                              &env->sse_status);
+        prod = float32_mul(d->ZMM_S(3), s->ZMM_S(3), &env->sse_status);
+    } else {
+        prod = float32_zero;
     }
+    iresult2 = float32_add(iresult2, prod, &env->sse_status);
+    iresult = float32_add(iresult, iresult2, &env->sse_status);
+
     d->ZMM_S(0) = (mask & (1 << 0)) ? iresult : float32_zero;
     d->ZMM_S(1) = (mask & (1 << 1)) ? iresult : float32_zero;
     d->ZMM_S(2) = (mask & (1 << 2)) ? iresult : float32_zero;
@@ -1968,13 +1972,12 @@ void glue(helper_dpps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t mask)
 
 void glue(helper_dppd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t mask)
 {
-    float64 iresult = float64_zero;
+    float64 iresult;
 
     if (mask & (1 << 4)) {
-        iresult = float64_add(iresult,
-                              float64_mul(d->ZMM_D(0), s->ZMM_D(0),
-                                          &env->sse_status),
-                              &env->sse_status);
+        iresult = float64_mul(d->ZMM_D(0), s->ZMM_D(0), &env->sse_status);
+    } else {
+        iresult = float64_zero;
     }
     if (mask & (1 << 5)) {
         iresult = float64_add(iresult,
-- 
2.36.0

From nobody Thu May 2 14:26:58 2024
From: Paul Brook <paul@nowt.org>
To: qemu-devel@nongnu.org
Cc: Eduardo Habkost, Paolo Bonzini, Richard Henderson, Paul Brook
Subject: [PATCH 2/4] TCG support for AVX
Date: Mon, 18 Apr 2022 18:39:02 +0100
Message-Id: <20220418173904.3746036-3-paul@nowt.org>
In-Reply-To: <20220418173904.3746036-1-paul@nowt.org>
References: <20220418173904.3746036-1-paul@nowt.org>

Add TCG translation of guest AVX/AVX2 instructions.

This comprises:

* VEX encodings of most (all?) "legacy" SSE operations. These
  typically add an extra source operand, and clear the unused half of
  the destination register (SSE encodings leave this unchanged, as
  illustrated below). Previously we were incorrectly translating VEX
  encoded instructions as if they were legacy SSE encodings.

* 256-bit variants of many instructions. AVX adds floating-point
  operations; AVX2 adds integer operations.

* A few new instructions (VBROADCAST, VGATHER, VZERO).
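The first point is the behavioural difference that matters most for
correctness. A minimal model of the two write behaviours (illustration
only; YMMReg and the two helpers are hypothetical stand-ins for QEMU's
register types, with the 256-bit register viewed as four 64-bit
quadwords):

#include <stdint.h>

typedef struct {
    uint64_t q[4];    /* bits 63..0, 127..64, 191..128, 255..192 */
} YMMReg;

/* Legacy SSE encoding: write the low 128 bits, leave bits 255..128
   of the destination unchanged. */
static void write_dest_sse(YMMReg *d, uint64_t lo, uint64_t hi)
{
    d->q[0] = lo;
    d->q[1] = hi;     /* q[2] and q[3] deliberately untouched */
}

/* VEX encoding: write the low 128 bits and zero bits 255..128. */
static void write_dest_vex(YMMReg *d, uint64_t lo, uint64_t hi)
{
    d->q[0] = lo;
    d->q[1] = hi;
    d->q[2] = 0;
    d->q[3] = 0;
}

Translating a VEX-encoded instruction as if it were the legacy SSE
form therefore leaves stale data in the upper half of the destination,
which guest code can observe.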
"Qemu-devel" X-ZM-MESSAGEID: 1650304089155100001 Content-Type: text/plain; charset="utf-8" Add TCG translation of guest AVX/AVX2 instructions This comprises: * VEX encodings of most (all?) "legacy" SSE operations. These typically add an extra source operand, and clear the unused half of the destination register (SSE encodings leave this unchanged) Previously we were incorrectly translating VEX encoded instructions as if they were legacy SSE encodings. * 256-bit variants of many instructions. AVX adds floating point operations. AVX2 adds integer operations. * A few new instructions (VBROADCAST, VGATHER, VZERO) Signed-off-by: Paul Brook --- target/i386/cpu.c | 8 +- target/i386/helper.h | 2 + target/i386/ops_sse.h | 2606 ++++++++++++++++++++++++---------- target/i386/ops_sse_header.h | 364 +++-- target/i386/tcg/fpu_helper.c | 3 + target/i386/tcg/translate.c | 1902 +++++++++++++++++++------ 6 files changed, 3597 insertions(+), 1288 deletions(-) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index cb6b5467d0..494f01959d 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -625,12 +625,12 @@ void x86_cpu_vendor_words2str(char *dst, uint32_t ven= dor1, CPUID_EXT_SSE41 | CPUID_EXT_SSE42 | CPUID_EXT_POPCNT | \ CPUID_EXT_XSAVE | /* CPUID_EXT_OSXSAVE is dynamic */ \ CPUID_EXT_MOVBE | CPUID_EXT_AES | CPUID_EXT_HYPERVISOR | \ - CPUID_EXT_RDRAND) + CPUID_EXT_RDRAND | CPUID_EXT_AVX) /* missing: CPUID_EXT_DTES64, CPUID_EXT_DSCPL, CPUID_EXT_VMX, CPUID_EXT_SMX, CPUID_EXT_EST, CPUID_EXT_TM2, CPUID_EXT_CID, CPUID_EXT_FMA, CPUID_EXT_XTPR, CPUID_EXT_PDCM, CPUID_EXT_PCID, CPUID_EXT_DCA, - CPUID_EXT_X2APIC, CPUID_EXT_TSC_DEADLINE_TIMER, CPUID_EXT_AVX, + CPUID_EXT_X2APIC, CPUID_EXT_TSC_DEADLINE_TIMER, CPUID_EXT_F16C */ =20 #ifdef TARGET_X86_64 @@ -653,9 +653,9 @@ void x86_cpu_vendor_words2str(char *dst, uint32_t vendo= r1, CPUID_7_0_EBX_BMI1 | CPUID_7_0_EBX_BMI2 | CPUID_7_0_EBX_ADX | \ CPUID_7_0_EBX_PCOMMIT | CPUID_7_0_EBX_CLFLUSHOPT | \ CPUID_7_0_EBX_CLWB | CPUID_7_0_EBX_MPX | CPUID_7_0_EBX_FSGSBASE = | \ - CPUID_7_0_EBX_ERMS) + CPUID_7_0_EBX_ERMS | CPUID_7_0_EBX_AVX2) /* missing: - CPUID_7_0_EBX_HLE, CPUID_7_0_EBX_AVX2, + CPUID_7_0_EBX_HLE CPUID_7_0_EBX_INVPCID, CPUID_7_0_EBX_RTM, CPUID_7_0_EBX_RDSEED */ #define TCG_7_0_ECX_FEATURES (CPUID_7_0_ECX_UMIP | CPUID_7_0_ECX_PKU | \ diff --git a/target/i386/helper.h b/target/i386/helper.h index ac3b4d1ee3..3da5df98b9 100644 --- a/target/i386/helper.h +++ b/target/i386/helper.h @@ -218,6 +218,8 @@ DEF_HELPER_3(movq, void, env, ptr, ptr) #include "ops_sse_header.h" #define SHIFT 1 #include "ops_sse_header.h" +#define SHIFT 2 +#include "ops_sse_header.h" =20 DEF_HELPER_3(rclb, tl, env, tl, tl) DEF_HELPER_3(rclw, tl, env, tl, tl) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index 6f1fc174b3..9cd7b2875e 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -23,6 +23,7 @@ #if SHIFT =3D=3D 0 #define Reg MMXReg #define XMM_ONLY(...) +#define YMM_ONLY(...) #define B(n) MMX_B(n) #define W(n) MMX_W(n) #define L(n) MMX_L(n) @@ -35,260 +36,355 @@ #define W(n) ZMM_W(n) #define L(n) ZMM_L(n) #define Q(n) ZMM_Q(n) +#if SHIFT =3D=3D 1 #define SUFFIX _xmm +#define YMM_ONLY(...) +#else +#define SUFFIX _ymm +#define YMM_ONLY(...) 
__VA_ARGS__ +#endif +#endif + +#if SHIFT =3D=3D 0 +#define SHIFT_HELPER_BODY(n, elem, F) do { \ + d->elem(0) =3D F(s->elem(0), shift); \ + if ((n) > 1) { \ + d->elem(1) =3D F(s->elem(1), shift); \ + } \ + if ((n) > 2) { \ + d->elem(2) =3D F(s->elem(2), shift); \ + d->elem(3) =3D F(s->elem(3), shift); \ + } \ + if ((n) > 4) { \ + d->elem(4) =3D F(s->elem(4), shift); \ + d->elem(5) =3D F(s->elem(5), shift); \ + d->elem(6) =3D F(s->elem(6), shift); \ + d->elem(7) =3D F(s->elem(7), shift); \ + } \ + if ((n) > 8) { \ + d->elem(8) =3D F(s->elem(8), shift); \ + d->elem(9) =3D F(s->elem(9), shift); \ + d->elem(10) =3D F(s->elem(10), shift); \ + d->elem(11) =3D F(s->elem(11), shift); \ + d->elem(12) =3D F(s->elem(12), shift); \ + d->elem(13) =3D F(s->elem(13), shift); \ + d->elem(14) =3D F(s->elem(14), shift); \ + d->elem(15) =3D F(s->elem(15), shift); \ + } \ + } while (0) + +#define FPSRL(x, c) ((x) >> shift) +#define FPSRAW(x, c) ((int16_t)(x) >> shift) +#define FPSRAL(x, c) ((int32_t)(x) >> shift) +#define FPSLL(x, c) ((x) << shift) #endif =20 -void glue(helper_psrlw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_psrlw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c) { int shift; - - if (s->Q(0) > 15) { + if (c->Q(0) > 15) { d->Q(0) =3D 0; -#if SHIFT =3D=3D 1 - d->Q(1) =3D 0; -#endif + XMM_ONLY(d->Q(1) =3D 0;) + YMM_ONLY( + d->Q(2) =3D 0; + d->Q(3) =3D 0; + ) } else { - shift =3D s->B(0); - d->W(0) >>=3D shift; - d->W(1) >>=3D shift; - d->W(2) >>=3D shift; - d->W(3) >>=3D shift; -#if SHIFT =3D=3D 1 - d->W(4) >>=3D shift; - d->W(5) >>=3D shift; - d->W(6) >>=3D shift; - d->W(7) >>=3D shift; -#endif + shift =3D c->B(0); + SHIFT_HELPER_BODY(4 << SHIFT, W, FPSRL); } } =20 -void glue(helper_psraw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_psllw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c) { int shift; - - if (s->Q(0) > 15) { - shift =3D 15; + if (c->Q(0) > 15) { + d->Q(0) =3D 0; + XMM_ONLY(d->Q(1) =3D 0;) + YMM_ONLY( + d->Q(2) =3D 0; + d->Q(3) =3D 0; + ) } else { - shift =3D s->B(0); + shift =3D c->B(0); + SHIFT_HELPER_BODY(4 << SHIFT, W, FPSLL); } - d->W(0) =3D (int16_t)d->W(0) >> shift; - d->W(1) =3D (int16_t)d->W(1) >> shift; - d->W(2) =3D (int16_t)d->W(2) >> shift; - d->W(3) =3D (int16_t)d->W(3) >> shift; -#if SHIFT =3D=3D 1 - d->W(4) =3D (int16_t)d->W(4) >> shift; - d->W(5) =3D (int16_t)d->W(5) >> shift; - d->W(6) =3D (int16_t)d->W(6) >> shift; - d->W(7) =3D (int16_t)d->W(7) >> shift; -#endif } =20 -void glue(helper_psllw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_psraw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c) { int shift; - - if (s->Q(0) > 15) { - d->Q(0) =3D 0; -#if SHIFT =3D=3D 1 - d->Q(1) =3D 0; -#endif + if (c->Q(0) > 15) { + shift =3D 15; } else { - shift =3D s->B(0); - d->W(0) <<=3D shift; - d->W(1) <<=3D shift; - d->W(2) <<=3D shift; - d->W(3) <<=3D shift; -#if SHIFT =3D=3D 1 - d->W(4) <<=3D shift; - d->W(5) <<=3D shift; - d->W(6) <<=3D shift; - d->W(7) <<=3D shift; -#endif + shift =3D c->B(0); } + SHIFT_HELPER_BODY(4 << SHIFT, W, FPSRAW); } =20 -void glue(helper_psrld, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_psrld, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c) { int shift; - - if (s->Q(0) > 31) { + if (c->Q(0) > 31) { d->Q(0) =3D 0; -#if SHIFT =3D=3D 1 - d->Q(1) =3D 0; -#endif + XMM_ONLY(d->Q(1) =3D 0;) + YMM_ONLY( + d->Q(2) =3D 0; + d->Q(3) =3D 0; + ) } else { - shift =3D s->B(0); - d->L(0) >>=3D shift; - d->L(1) >>=3D shift; -#if SHIFT =3D=3D 1 - d->L(2) >>=3D shift; - d->L(3) >>=3D shift; 
-#endif + shift =3D c->B(0); + SHIFT_HELPER_BODY(2 << SHIFT, L, FPSRL); } } =20 -void glue(helper_psrad, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_pslld, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c) { int shift; - - if (s->Q(0) > 31) { - shift =3D 31; + if (c->Q(0) > 31) { + d->Q(0) =3D 0; + XMM_ONLY(d->Q(1) =3D 0;) + YMM_ONLY( + d->Q(2) =3D 0; + d->Q(3) =3D 0; + ) } else { - shift =3D s->B(0); + shift =3D c->B(0); + SHIFT_HELPER_BODY(2 << SHIFT, L, FPSLL); } - d->L(0) =3D (int32_t)d->L(0) >> shift; - d->L(1) =3D (int32_t)d->L(1) >> shift; -#if SHIFT =3D=3D 1 - d->L(2) =3D (int32_t)d->L(2) >> shift; - d->L(3) =3D (int32_t)d->L(3) >> shift; -#endif } =20 -void glue(helper_pslld, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_psrad, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c) { int shift; - - if (s->Q(0) > 31) { - d->Q(0) =3D 0; -#if SHIFT =3D=3D 1 - d->Q(1) =3D 0; -#endif + if (c->Q(0) > 31) { + shift =3D 31; } else { - shift =3D s->B(0); - d->L(0) <<=3D shift; - d->L(1) <<=3D shift; -#if SHIFT =3D=3D 1 - d->L(2) <<=3D shift; - d->L(3) <<=3D shift; -#endif + shift =3D c->B(0); } + SHIFT_HELPER_BODY(2 << SHIFT, L, FPSRAL); } =20 -void glue(helper_psrlq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_psrlq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c) { int shift; - - if (s->Q(0) > 63) { + if (c->Q(0) > 63) { d->Q(0) =3D 0; -#if SHIFT =3D=3D 1 - d->Q(1) =3D 0; -#endif + XMM_ONLY(d->Q(1) =3D 0;) + YMM_ONLY( + d->Q(2) =3D 0; + d->Q(3) =3D 0; + ) } else { - shift =3D s->B(0); - d->Q(0) >>=3D shift; -#if SHIFT =3D=3D 1 - d->Q(1) >>=3D shift; -#endif + shift =3D c->B(0); + SHIFT_HELPER_BODY(1 << SHIFT, Q, FPSRL); } } =20 -void glue(helper_psllq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_psllq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c) { int shift; - - if (s->Q(0) > 63) { + if (c->Q(0) > 63) { d->Q(0) =3D 0; -#if SHIFT =3D=3D 1 - d->Q(1) =3D 0; -#endif + XMM_ONLY(d->Q(1) =3D 0;) + YMM_ONLY( + d->Q(2) =3D 0; + d->Q(3) =3D 0; + ) } else { - shift =3D s->B(0); - d->Q(0) <<=3D shift; -#if SHIFT =3D=3D 1 - d->Q(1) <<=3D shift; -#endif + shift =3D c->B(0); + SHIFT_HELPER_BODY(1 << SHIFT, Q, FPSLL); } } =20 -#if SHIFT =3D=3D 1 -void glue(helper_psrldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +#if SHIFT >=3D 1 +void glue(helper_psrldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c) { int shift, i; =20 - shift =3D s->L(0); + shift =3D c->L(0); if (shift > 16) { shift =3D 16; } for (i =3D 0; i < 16 - shift; i++) { - d->B(i) =3D d->B(i + shift); + d->B(i) =3D s->B(i + shift); } for (i =3D 16 - shift; i < 16; i++) { d->B(i) =3D 0; } +#if SHIFT =3D=3D 2 + for (i =3D 0; i < 16 - shift; i++) { + d->B(i + 16) =3D s->B(i + 16 + shift); + } + for (i =3D 16 - shift; i < 16; i++) { + d->B(i + 16) =3D 0; + } +#endif } =20 -void glue(helper_pslldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_pslldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c) { int shift, i; =20 - shift =3D s->L(0); + shift =3D c->L(0); if (shift > 16) { shift =3D 16; } for (i =3D 15; i >=3D shift; i--) { - d->B(i) =3D d->B(i - shift); + d->B(i) =3D s->B(i - shift); } for (i =3D 0; i < shift; i++) { d->B(i) =3D 0; } +#if SHIFT =3D=3D 2 + for (i =3D 15; i >=3D shift; i--) { + d->B(i + 16) =3D s->B(i + 16 - shift); + } + for (i =3D 0; i < shift; i++) { + d->B(i + 16) =3D 0; + } +#endif } #endif =20 -#define SSE_HELPER_B(name, F) \ +#define SSE_HELPER_1(name, elem, num, F) = \ void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ { \ - 
d->B(0) =3D F(d->B(0), s->B(0)); \ - d->B(1) =3D F(d->B(1), s->B(1)); \ - d->B(2) =3D F(d->B(2), s->B(2)); \ - d->B(3) =3D F(d->B(3), s->B(3)); \ - d->B(4) =3D F(d->B(4), s->B(4)); \ - d->B(5) =3D F(d->B(5), s->B(5)); \ - d->B(6) =3D F(d->B(6), s->B(6)); \ - d->B(7) =3D F(d->B(7), s->B(7)); \ + d->elem(0) =3D F(s->elem(0)); \ + d->elem(1) =3D F(s->elem(1)); \ + if ((num << SHIFT) > 2) { \ + d->elem(2) =3D F(s->elem(2)); \ + d->elem(3) =3D F(s->elem(3)); \ + } \ + if ((num << SHIFT) > 4) { \ + d->elem(4) =3D F(s->elem(4)); \ + d->elem(5) =3D F(s->elem(5)); \ + d->elem(6) =3D F(s->elem(6)); \ + d->elem(7) =3D F(s->elem(7)); \ + } \ + if ((num << SHIFT) > 8) { \ + d->elem(8) =3D F(s->elem(8)); \ + d->elem(9) =3D F(s->elem(9)); \ + d->elem(10) =3D F(s->elem(10)); \ + d->elem(11) =3D F(s->elem(11)); \ + d->elem(12) =3D F(s->elem(12)); \ + d->elem(13) =3D F(s->elem(13)); \ + d->elem(14) =3D F(s->elem(14)); \ + d->elem(15) =3D F(s->elem(15)); \ + } \ + if ((num << SHIFT) > 16) { \ + d->elem(16) =3D F(s->elem(16)); \ + d->elem(17) =3D F(s->elem(17)); \ + d->elem(18) =3D F(s->elem(18)); \ + d->elem(19) =3D F(s->elem(19)); \ + d->elem(20) =3D F(s->elem(20)); \ + d->elem(21) =3D F(s->elem(21)); \ + d->elem(22) =3D F(s->elem(22)); \ + d->elem(23) =3D F(s->elem(23)); \ + d->elem(24) =3D F(s->elem(24)); \ + d->elem(25) =3D F(s->elem(25)); \ + d->elem(26) =3D F(s->elem(26)); \ + d->elem(27) =3D F(s->elem(27)); \ + d->elem(28) =3D F(s->elem(28)); \ + d->elem(29) =3D F(s->elem(29)); \ + d->elem(30) =3D F(s->elem(30)); \ + d->elem(31) =3D F(s->elem(31)); \ + } \ + } + +#define SSE_HELPER_B(name, F) \ + void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) \ + { \ + d->B(0) =3D F(v->B(0), s->B(0)); \ + d->B(1) =3D F(v->B(1), s->B(1)); \ + d->B(2) =3D F(v->B(2), s->B(2)); \ + d->B(3) =3D F(v->B(3), s->B(3)); \ + d->B(4) =3D F(v->B(4), s->B(4)); \ + d->B(5) =3D F(v->B(5), s->B(5)); \ + d->B(6) =3D F(v->B(6), s->B(6)); \ + d->B(7) =3D F(v->B(7), s->B(7)); \ XMM_ONLY( \ - d->B(8) =3D F(d->B(8), s->B(8)); \ - d->B(9) =3D F(d->B(9), s->B(9)); \ - d->B(10) =3D F(d->B(10), s->B(10)); \ - d->B(11) =3D F(d->B(11), s->B(11)); \ - d->B(12) =3D F(d->B(12), s->B(12)); \ - d->B(13) =3D F(d->B(13), s->B(13)); \ - d->B(14) =3D F(d->B(14), s->B(14)); \ - d->B(15) =3D F(d->B(15), s->B(15)); \ + d->B(8) =3D F(v->B(8), s->B(8)); \ + d->B(9) =3D F(v->B(9), s->B(9)); \ + d->B(10) =3D F(v->B(10), s->B(10)); \ + d->B(11) =3D F(v->B(11), s->B(11)); \ + d->B(12) =3D F(v->B(12), s->B(12)); \ + d->B(13) =3D F(v->B(13), s->B(13)); \ + d->B(14) =3D F(v->B(14), s->B(14)); \ + d->B(15) =3D F(v->B(15), s->B(15)); \ + ) \ + YMM_ONLY( \ + d->B(16) =3D F(v->B(16), s->B(16)); \ + d->B(17) =3D F(v->B(17), s->B(17)); \ + d->B(18) =3D F(v->B(18), s->B(18)); \ + d->B(19) =3D F(v->B(19), s->B(19)); \ + d->B(20) =3D F(v->B(20), s->B(20)); \ + d->B(21) =3D F(v->B(21), s->B(21)); \ + d->B(22) =3D F(v->B(22), s->B(22)); \ + d->B(23) =3D F(v->B(23), s->B(23)); \ + d->B(24) =3D F(v->B(24), s->B(24)); \ + d->B(25) =3D F(v->B(25), s->B(25)); \ + d->B(26) =3D F(v->B(26), s->B(26)); \ + d->B(27) =3D F(v->B(27), s->B(27)); \ + d->B(28) =3D F(v->B(28), s->B(28)); \ + d->B(29) =3D F(v->B(29), s->B(29)); \ + d->B(30) =3D F(v->B(30), s->B(30)); \ + d->B(31) =3D F(v->B(31), s->B(31)); \ ) \ } =20 #define SSE_HELPER_W(name, F) \ - void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ + void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) \ { \ - d->W(0) =3D F(d->W(0), s->W(0)); \ - d->W(1) =3D F(d->W(1), s->W(1)); \ - d->W(2) =3D 
F(d->W(2), s->W(2)); \ - d->W(3) =3D F(d->W(3), s->W(3)); \ + d->W(0) =3D F(v->W(0), s->W(0)); \ + d->W(1) =3D F(v->W(1), s->W(1)); \ + d->W(2) =3D F(v->W(2), s->W(2)); \ + d->W(3) =3D F(v->W(3), s->W(3)); \ XMM_ONLY( \ - d->W(4) =3D F(d->W(4), s->W(4)); \ - d->W(5) =3D F(d->W(5), s->W(5)); \ - d->W(6) =3D F(d->W(6), s->W(6)); \ - d->W(7) =3D F(d->W(7), s->W(7)); \ + d->W(4) =3D F(v->W(4), s->W(4)); \ + d->W(5) =3D F(v->W(5), s->W(5)); \ + d->W(6) =3D F(v->W(6), s->W(6)); \ + d->W(7) =3D F(v->W(7), s->W(7)); \ + ) \ + YMM_ONLY( \ + d->W(8) =3D F(v->W(8), s->W(8)); \ + d->W(9) =3D F(v->W(9), s->W(9)); \ + d->W(10) =3D F(v->W(10), s->W(10)); \ + d->W(11) =3D F(v->W(11), s->W(11)); \ + d->W(12) =3D F(v->W(12), s->W(12)); \ + d->W(13) =3D F(v->W(13), s->W(13)); \ + d->W(14) =3D F(v->W(14), s->W(14)); \ + d->W(15) =3D F(v->W(15), s->W(15)); \ ) \ } =20 #define SSE_HELPER_L(name, F) \ - void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ + void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) \ { \ - d->L(0) =3D F(d->L(0), s->L(0)); \ - d->L(1) =3D F(d->L(1), s->L(1)); \ + d->L(0) =3D F(v->L(0), s->L(0)); \ + d->L(1) =3D F(v->L(1), s->L(1)); \ XMM_ONLY( \ - d->L(2) =3D F(d->L(2), s->L(2)); \ - d->L(3) =3D F(d->L(3), s->L(3)); \ + d->L(2) =3D F(v->L(2), s->L(2)); \ + d->L(3) =3D F(v->L(3), s->L(3)); \ + ) \ + YMM_ONLY( \ + d->L(4) =3D F(v->L(4), s->L(4)); \ + d->L(5) =3D F(v->L(5), s->L(5)); \ + d->L(6) =3D F(v->L(6), s->L(6)); \ + d->L(7) =3D F(v->L(7), s->L(7)); \ ) \ } =20 #define SSE_HELPER_Q(name, F) \ - void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ + void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) \ { \ - d->Q(0) =3D F(d->Q(0), s->Q(0)); \ + d->Q(0) =3D F(v->Q(0), s->Q(0)); \ XMM_ONLY( \ - d->Q(1) =3D F(d->Q(1), s->Q(1)); \ + d->Q(1) =3D F(v->Q(1), s->Q(1)); \ + ) \ + YMM_ONLY( \ + d->Q(2) =3D F(v->Q(2), s->Q(2)); \ + d->Q(3) =3D F(v->Q(3), s->Q(3)); \ ) \ } =20 @@ -411,30 +507,41 @@ SSE_HELPER_W(helper_pcmpeqw, FCMPEQ) SSE_HELPER_L(helper_pcmpeql, FCMPEQ) =20 SSE_HELPER_W(helper_pmullw, FMULLW) -#if SHIFT =3D=3D 0 -SSE_HELPER_W(helper_pmulhrw, FMULHRW) -#endif SSE_HELPER_W(helper_pmulhuw, FMULHUW) SSE_HELPER_W(helper_pmulhw, FMULHW) =20 +#if SHIFT =3D=3D 0 +void glue(helper_pmulhrw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +{ + d->W(0) =3D FMULHRW(d->W(0), s->W(0)); + d->W(1) =3D FMULHRW(d->W(1), s->W(1)); + d->W(2) =3D FMULHRW(d->W(2), s->W(2)); + d->W(3) =3D FMULHRW(d->W(3), s->W(3)); +} +#endif + SSE_HELPER_B(helper_pavgb, FAVG) SSE_HELPER_W(helper_pavgw, FAVG) =20 -void glue(helper_pmuludq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_pmuludq, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) { - d->Q(0) =3D (uint64_t)s->L(0) * (uint64_t)d->L(0); -#if SHIFT =3D=3D 1 - d->Q(1) =3D (uint64_t)s->L(2) * (uint64_t)d->L(2); + d->Q(0) =3D (uint64_t)s->L(0) * (uint64_t)v->L(0); +#if SHIFT >=3D 1 + d->Q(1) =3D (uint64_t)s->L(2) * (uint64_t)v->L(2); +#if SHIFT =3D=3D 2 + d->Q(2) =3D (uint64_t)s->L(4) * (uint64_t)v->L(4); + d->Q(3) =3D (uint64_t)s->L(6) * (uint64_t)v->L(6); +#endif #endif } =20 -void glue(helper_pmaddwd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_pmaddwd, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) { int i; =20 for (i =3D 0; i < (2 << SHIFT); i++) { - d->L(i) =3D (int16_t)s->W(2 * i) * (int16_t)d->W(2 * i) + - (int16_t)s->W(2 * i + 1) * (int16_t)d->W(2 * i + 1); + d->L(i) =3D (int16_t)s->W(2 * i) * (int16_t)v->W(2 * i) + + (int16_t)s->W(2 * i + 1) * (int16_t)v->W(2 * i + 1); } } =20 @@ 
-448,34 +555,57 @@ static inline int abs1(int a) } } #endif -void glue(helper_psadbw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_psadbw, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) { unsigned int val; =20 val =3D 0; - val +=3D abs1(d->B(0) - s->B(0)); - val +=3D abs1(d->B(1) - s->B(1)); - val +=3D abs1(d->B(2) - s->B(2)); - val +=3D abs1(d->B(3) - s->B(3)); - val +=3D abs1(d->B(4) - s->B(4)); - val +=3D abs1(d->B(5) - s->B(5)); - val +=3D abs1(d->B(6) - s->B(6)); - val +=3D abs1(d->B(7) - s->B(7)); + val +=3D abs1(v->B(0) - s->B(0)); + val +=3D abs1(v->B(1) - s->B(1)); + val +=3D abs1(v->B(2) - s->B(2)); + val +=3D abs1(v->B(3) - s->B(3)); + val +=3D abs1(v->B(4) - s->B(4)); + val +=3D abs1(v->B(5) - s->B(5)); + val +=3D abs1(v->B(6) - s->B(6)); + val +=3D abs1(v->B(7) - s->B(7)); d->Q(0) =3D val; -#if SHIFT =3D=3D 1 +#if SHIFT >=3D 1 val =3D 0; - val +=3D abs1(d->B(8) - s->B(8)); - val +=3D abs1(d->B(9) - s->B(9)); - val +=3D abs1(d->B(10) - s->B(10)); - val +=3D abs1(d->B(11) - s->B(11)); - val +=3D abs1(d->B(12) - s->B(12)); - val +=3D abs1(d->B(13) - s->B(13)); - val +=3D abs1(d->B(14) - s->B(14)); - val +=3D abs1(d->B(15) - s->B(15)); + val +=3D abs1(v->B(8) - s->B(8)); + val +=3D abs1(v->B(9) - s->B(9)); + val +=3D abs1(v->B(10) - s->B(10)); + val +=3D abs1(v->B(11) - s->B(11)); + val +=3D abs1(v->B(12) - s->B(12)); + val +=3D abs1(v->B(13) - s->B(13)); + val +=3D abs1(v->B(14) - s->B(14)); + val +=3D abs1(v->B(15) - s->B(15)); d->Q(1) =3D val; +#if SHIFT =3D=3D 2 + val =3D 0; + val +=3D abs1(v->B(16) - s->B(16)); + val +=3D abs1(v->B(17) - s->B(17)); + val +=3D abs1(v->B(18) - s->B(18)); + val +=3D abs1(v->B(19) - s->B(19)); + val +=3D abs1(v->B(20) - s->B(20)); + val +=3D abs1(v->B(21) - s->B(21)); + val +=3D abs1(v->B(22) - s->B(22)); + val +=3D abs1(v->B(23) - s->B(23)); + d->Q(2) =3D val; + val =3D 0; + val +=3D abs1(v->B(24) - s->B(24)); + val +=3D abs1(v->B(25) - s->B(25)); + val +=3D abs1(v->B(26) - s->B(26)); + val +=3D abs1(v->B(27) - s->B(27)); + val +=3D abs1(v->B(28) - s->B(28)); + val +=3D abs1(v->B(29) - s->B(29)); + val +=3D abs1(v->B(30) - s->B(30)); + val +=3D abs1(v->B(31) - s->B(31)); + d->Q(3) =3D val; +#endif #endif } =20 +#if SHIFT < 2 void glue(helper_maskmov, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, target_ulong a0) { @@ -487,13 +617,18 @@ void glue(helper_maskmov, SUFFIX)(CPUX86State *env, R= eg *d, Reg *s, } } } +#endif =20 void glue(helper_movl_mm_T0, SUFFIX)(Reg *d, uint32_t val) { d->L(0) =3D val; d->L(1) =3D 0; -#if SHIFT =3D=3D 1 +#if SHIFT >=3D 1 d->Q(1) =3D 0; +#if SHIFT =3D=3D 2 + d->Q(2) =3D 0; + d->Q(3) =3D 0; +#endif #endif } =20 @@ -501,114 +636,152 @@ void glue(helper_movl_mm_T0, SUFFIX)(Reg *d, uint32= _t val) void glue(helper_movq_mm_T0, SUFFIX)(Reg *d, uint64_t val) { d->Q(0) =3D val; -#if SHIFT =3D=3D 1 +#if SHIFT >=3D 1 d->Q(1) =3D 0; +#if SHIFT =3D=3D 2 + d->Q(2) =3D 0; + d->Q(3) =3D 0; +#endif #endif } #endif =20 +#define SHUFFLE4(F, a, b, offset) do { \ + r0 =3D a->F((order & 3) + offset); \ + r1 =3D a->F(((order >> 2) & 3) + offset); \ + r2 =3D b->F(((order >> 4) & 3) + offset); \ + r3 =3D b->F(((order >> 6) & 3) + offset); \ + d->F(offset) =3D r0; \ + d->F(offset + 1) =3D r1; \ + d->F(offset + 2) =3D r2; \ + d->F(offset + 3) =3D r3; \ + } while (0) + #if SHIFT =3D=3D 0 void glue(helper_pshufw, SUFFIX)(Reg *d, Reg *s, int order) { - Reg r; + uint16_t r0, r1, r2, r3; =20 - r.W(0) =3D s->W(order & 3); - r.W(1) =3D s->W((order >> 2) & 3); - r.W(2) =3D s->W((order >> 4) & 3); - r.W(3) =3D s->W((order >> 6) & 3); - 
*d =3D r; + SHUFFLE4(W, s, s, 0); } #else -void helper_shufps(Reg *d, Reg *s, int order) +void glue(helper_shufps, SUFFIX)(Reg *d, Reg *v, Reg *s, int order) { - Reg r; + uint32_t r0, r1, r2, r3; =20 - r.L(0) =3D d->L(order & 3); - r.L(1) =3D d->L((order >> 2) & 3); - r.L(2) =3D s->L((order >> 4) & 3); - r.L(3) =3D s->L((order >> 6) & 3); - *d =3D r; + SHUFFLE4(L, v, s, 0); +#if SHIFT =3D=3D 2 + SHUFFLE4(L, v, s, 4); +#endif } =20 -void helper_shufpd(Reg *d, Reg *s, int order) +void glue(helper_shufpd, SUFFIX)(Reg *d, Reg *v, Reg *s, int order) { - Reg r; + uint64_t r0, r1; =20 - r.Q(0) =3D d->Q(order & 1); - r.Q(1) =3D s->Q((order >> 1) & 1); - *d =3D r; + r0 =3D v->Q(order & 1); + r1 =3D s->Q((order >> 1) & 1); + d->Q(0) =3D r0; + d->Q(1) =3D r1; +#if SHIFT =3D=3D 2 + r0 =3D v->Q(((order >> 2) & 1) + 2); + r1 =3D s->Q(((order >> 3) & 1) + 2); + d->Q(2) =3D r0; + d->Q(3) =3D r1; +#endif } =20 void glue(helper_pshufd, SUFFIX)(Reg *d, Reg *s, int order) { - Reg r; + uint32_t r0, r1, r2, r3; =20 - r.L(0) =3D s->L(order & 3); - r.L(1) =3D s->L((order >> 2) & 3); - r.L(2) =3D s->L((order >> 4) & 3); - r.L(3) =3D s->L((order >> 6) & 3); - *d =3D r; + SHUFFLE4(L, s, s, 0); +#if SHIFT =3D=3D 2 + SHUFFLE4(L, s, s, 4); +#endif } =20 void glue(helper_pshuflw, SUFFIX)(Reg *d, Reg *s, int order) { - Reg r; + uint16_t r0, r1, r2, r3; =20 - r.W(0) =3D s->W(order & 3); - r.W(1) =3D s->W((order >> 2) & 3); - r.W(2) =3D s->W((order >> 4) & 3); - r.W(3) =3D s->W((order >> 6) & 3); - r.Q(1) =3D s->Q(1); - *d =3D r; + SHUFFLE4(W, s, s, 0); + d->Q(1) =3D s->Q(1); +#if SHIFT =3D=3D 2 + SHUFFLE4(W, s, s, 8); + d->Q(3) =3D s->Q(3); +#endif } =20 void glue(helper_pshufhw, SUFFIX)(Reg *d, Reg *s, int order) { - Reg r; + uint16_t r0, r1, r2, r3; =20 - r.Q(0) =3D s->Q(0); - r.W(4) =3D s->W(4 + (order & 3)); - r.W(5) =3D s->W(4 + ((order >> 2) & 3)); - r.W(6) =3D s->W(4 + ((order >> 4) & 3)); - r.W(7) =3D s->W(4 + ((order >> 6) & 3)); - *d =3D r; + d->Q(0) =3D s->Q(0); + SHUFFLE4(W, s, s, 4); +#if SHIFT =3D=3D 2 + d->Q(2) =3D s->Q(2); + SHUFFLE4(W, s, s, 12); +#endif } #endif =20 -#if SHIFT =3D=3D 1 +#if SHIFT >=3D 1 /* FPU ops */ /* XXX: not accurate */ =20 -#define SSE_HELPER_S(name, F) \ - void helper_ ## name ## ps(CPUX86State *env, Reg *d, Reg *s) \ +#define SSE_HELPER_P(name, F) \ + void glue(helper_ ## name ## ps, SUFFIX)(CPUX86State *env, \ + Reg *d, Reg *v, Reg *s) \ { \ - d->ZMM_S(0) =3D F(32, d->ZMM_S(0), s->ZMM_S(0)); \ - d->ZMM_S(1) =3D F(32, d->ZMM_S(1), s->ZMM_S(1)); \ - d->ZMM_S(2) =3D F(32, d->ZMM_S(2), s->ZMM_S(2)); \ - d->ZMM_S(3) =3D F(32, d->ZMM_S(3), s->ZMM_S(3)); \ + d->ZMM_S(0) =3D F(32, v->ZMM_S(0), s->ZMM_S(0)); \ + d->ZMM_S(1) =3D F(32, v->ZMM_S(1), s->ZMM_S(1)); \ + d->ZMM_S(2) =3D F(32, v->ZMM_S(2), s->ZMM_S(2)); \ + d->ZMM_S(3) =3D F(32, v->ZMM_S(3), s->ZMM_S(3)); \ + YMM_ONLY( \ + d->ZMM_S(4) =3D F(32, v->ZMM_S(4), s->ZMM_S(4)); \ + d->ZMM_S(5) =3D F(32, v->ZMM_S(5), s->ZMM_S(5)); \ + d->ZMM_S(6) =3D F(32, v->ZMM_S(6), s->ZMM_S(6)); \ + d->ZMM_S(7) =3D F(32, v->ZMM_S(7), s->ZMM_S(7)); \ + ) \ } \ \ - void helper_ ## name ## ss(CPUX86State *env, Reg *d, Reg *s) \ + void glue(helper_ ## name ## pd, SUFFIX)(CPUX86State *env, \ + Reg *d, Reg *v, Reg *s) \ { \ - d->ZMM_S(0) =3D F(32, d->ZMM_S(0), s->ZMM_S(0)); \ - } \ + d->ZMM_D(0) =3D F(64, v->ZMM_D(0), s->ZMM_D(0)); \ + d->ZMM_D(1) =3D F(64, v->ZMM_D(1), s->ZMM_D(1)); \ + YMM_ONLY( \ + d->ZMM_D(2) =3D F(64, v->ZMM_D(2), s->ZMM_D(2)); \ + d->ZMM_D(3) =3D F(64, v->ZMM_D(3), s->ZMM_D(3)); \ + ) \ + } + +#if SHIFT =3D=3D 1 + +#define 
SSE_HELPER_S(name, F) \ + SSE_HELPER_P(name, F) \ \ - void helper_ ## name ## pd(CPUX86State *env, Reg *d, Reg *s) \ + void helper_ ## name ## ss(CPUX86State *env, Reg *d, Reg *v, Reg *s)\ { \ - d->ZMM_D(0) =3D F(64, d->ZMM_D(0), s->ZMM_D(0)); \ - d->ZMM_D(1) =3D F(64, d->ZMM_D(1), s->ZMM_D(1)); \ + d->ZMM_S(0) =3D F(32, v->ZMM_S(0), s->ZMM_S(0)); \ } \ \ - void helper_ ## name ## sd(CPUX86State *env, Reg *d, Reg *s) \ + void helper_ ## name ## sd(CPUX86State *env, Reg *d, Reg *v, Reg *s)\ { \ - d->ZMM_D(0) =3D F(64, d->ZMM_D(0), s->ZMM_D(0)); \ + d->ZMM_D(0) =3D F(64, v->ZMM_D(0), s->ZMM_D(0)); \ } =20 +#else + +#define SSE_HELPER_S(name, F) SSE_HELPER_P(name, F) + +#endif + #define FPU_ADD(size, a, b) float ## size ## _add(a, b, &env->sse_status) #define FPU_SUB(size, a, b) float ## size ## _sub(a, b, &env->sse_status) #define FPU_MUL(size, a, b) float ## size ## _mul(a, b, &env->sse_status) #define FPU_DIV(size, a, b) float ## size ## _div(a, b, &env->sse_status) -#define FPU_SQRT(size, a, b) float ## size ## _sqrt(b, &env->sse_status) =20 /* Note that the choice of comparison op here is important to get the * special cases right: for min and max Intel specifies that (-0,0), @@ -625,27 +798,76 @@ SSE_HELPER_S(mul, FPU_MUL) SSE_HELPER_S(div, FPU_DIV) SSE_HELPER_S(min, FPU_MIN) SSE_HELPER_S(max, FPU_MAX) -SSE_HELPER_S(sqrt, FPU_SQRT) =20 +void glue(helper_sqrtps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +{ + d->ZMM_S(0) =3D float32_sqrt(s->ZMM_S(0), &env->sse_status); + d->ZMM_S(1) =3D float32_sqrt(s->ZMM_S(1), &env->sse_status); + d->ZMM_S(2) =3D float32_sqrt(s->ZMM_S(2), &env->sse_status); + d->ZMM_S(3) =3D float32_sqrt(s->ZMM_S(3), &env->sse_status); +#if SHIFT =3D=3D 2 + d->ZMM_S(4) =3D float32_sqrt(s->ZMM_S(4), &env->sse_status); + d->ZMM_S(5) =3D float32_sqrt(s->ZMM_S(5), &env->sse_status); + d->ZMM_S(6) =3D float32_sqrt(s->ZMM_S(6), &env->sse_status); + d->ZMM_S(7) =3D float32_sqrt(s->ZMM_S(7), &env->sse_status); +#endif +} + +void glue(helper_sqrtpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +{ + d->ZMM_D(0) =3D float64_sqrt(s->ZMM_D(0), &env->sse_status); + d->ZMM_D(1) =3D float64_sqrt(s->ZMM_D(1), &env->sse_status); +#if SHIFT =3D=3D 2 + d->ZMM_D(2) =3D float64_sqrt(s->ZMM_D(2), &env->sse_status); + d->ZMM_D(3) =3D float64_sqrt(s->ZMM_D(3), &env->sse_status); +#endif +} + +#if SHIFT =3D=3D 1 +void helper_sqrtss(CPUX86State *env, Reg *d, Reg *s) +{ + d->ZMM_S(0) =3D float32_sqrt(s->ZMM_S(0), &env->sse_status); +} + +void helper_sqrtsd(CPUX86State *env, Reg *d, Reg *s) +{ + d->ZMM_D(0) =3D float64_sqrt(s->ZMM_D(0), &env->sse_status); +} +#endif =20 /* float to float conversions */ -void helper_cvtps2pd(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_cvtps2pd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { float32 s0, s1; =20 s0 =3D s->ZMM_S(0); s1 =3D s->ZMM_S(1); +#if SHIFT =3D=3D 2 + float32 s2, s3; + s2 =3D s->ZMM_S(2); + s3 =3D s->ZMM_S(3); + d->ZMM_D(2) =3D float32_to_float64(s2, &env->sse_status); + d->ZMM_D(3) =3D float32_to_float64(s3, &env->sse_status); +#endif d->ZMM_D(0) =3D float32_to_float64(s0, &env->sse_status); d->ZMM_D(1) =3D float32_to_float64(s1, &env->sse_status); } =20 -void helper_cvtpd2ps(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_cvtpd2ps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { d->ZMM_S(0) =3D float64_to_float32(s->ZMM_D(0), &env->sse_status); d->ZMM_S(1) =3D float64_to_float32(s->ZMM_D(1), &env->sse_status); +#if SHIFT =3D=3D 2 + d->ZMM_S(2) =3D float64_to_float32(s->ZMM_D(2), &env->sse_status); + d->ZMM_S(3) =3D 
float64_to_float32(s->ZMM_D(3), &env->sse_status); + d->Q(2) =3D 0; + d->Q(3) =3D 0; +#else d->Q(1) =3D 0; +#endif } =20 +#if SHIFT =3D=3D 1 void helper_cvtss2sd(CPUX86State *env, Reg *d, Reg *s) { d->ZMM_D(0) =3D float32_to_float64(s->ZMM_S(0), &env->sse_status); @@ -655,26 +877,41 @@ void helper_cvtsd2ss(CPUX86State *env, Reg *d, Reg *s) { d->ZMM_S(0) =3D float64_to_float32(s->ZMM_D(0), &env->sse_status); } +#endif =20 /* integer to float */ -void helper_cvtdq2ps(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_cvtdq2ps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { d->ZMM_S(0) =3D int32_to_float32(s->ZMM_L(0), &env->sse_status); d->ZMM_S(1) =3D int32_to_float32(s->ZMM_L(1), &env->sse_status); d->ZMM_S(2) =3D int32_to_float32(s->ZMM_L(2), &env->sse_status); d->ZMM_S(3) =3D int32_to_float32(s->ZMM_L(3), &env->sse_status); +#if SHIFT =3D=3D 2 + d->ZMM_S(4) =3D int32_to_float32(s->ZMM_L(4), &env->sse_status); + d->ZMM_S(5) =3D int32_to_float32(s->ZMM_L(5), &env->sse_status); + d->ZMM_S(6) =3D int32_to_float32(s->ZMM_L(6), &env->sse_status); + d->ZMM_S(7) =3D int32_to_float32(s->ZMM_L(7), &env->sse_status); +#endif } =20 -void helper_cvtdq2pd(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_cvtdq2pd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { int32_t l0, l1; =20 l0 =3D (int32_t)s->ZMM_L(0); l1 =3D (int32_t)s->ZMM_L(1); +#if SHIFT =3D=3D 2 + int32_t l2, l3; + l2 =3D (int32_t)s->ZMM_L(2); + l3 =3D (int32_t)s->ZMM_L(3); + d->ZMM_D(2) =3D int32_to_float64(l2, &env->sse_status); + d->ZMM_D(3) =3D int32_to_float64(l3, &env->sse_status); +#endif d->ZMM_D(0) =3D int32_to_float64(l0, &env->sse_status); d->ZMM_D(1) =3D int32_to_float64(l1, &env->sse_status); } =20 +#if SHIFT =3D=3D 1 void helper_cvtpi2ps(CPUX86State *env, ZMMReg *d, MMXReg *s) { d->ZMM_S(0) =3D int32_to_float32(s->MMX_L(0), &env->sse_status); @@ -709,8 +946,11 @@ void helper_cvtsq2sd(CPUX86State *env, ZMMReg *d, uint= 64_t val) } #endif =20 +#endif + /* float to integer */ =20 +#if SHIFT =3D=3D 1 /* * x86 mandates that we return the indefinite integer value for the result * of any float-to-integer conversion that raises the 'invalid' exception. 
@@ -741,22 +981,37 @@ WRAP_FLOATCONV(int64_t, float32_to_int64, float32, IN= T64_MIN) WRAP_FLOATCONV(int64_t, float32_to_int64_round_to_zero, float32, INT64_MIN) WRAP_FLOATCONV(int64_t, float64_to_int64, float64, INT64_MIN) WRAP_FLOATCONV(int64_t, float64_to_int64_round_to_zero, float64, INT64_MIN) +#endif =20 -void helper_cvtps2dq(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_cvtps2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { d->ZMM_L(0) =3D x86_float32_to_int32(s->ZMM_S(0), &env->sse_status); d->ZMM_L(1) =3D x86_float32_to_int32(s->ZMM_S(1), &env->sse_status); d->ZMM_L(2) =3D x86_float32_to_int32(s->ZMM_S(2), &env->sse_status); d->ZMM_L(3) =3D x86_float32_to_int32(s->ZMM_S(3), &env->sse_status); +#if SHIFT =3D=3D 2 + d->ZMM_L(4) =3D x86_float32_to_int32(s->ZMM_S(4), &env->sse_status); + d->ZMM_L(5) =3D x86_float32_to_int32(s->ZMM_S(5), &env->sse_status); + d->ZMM_L(6) =3D x86_float32_to_int32(s->ZMM_S(6), &env->sse_status); + d->ZMM_L(7) =3D x86_float32_to_int32(s->ZMM_S(7), &env->sse_status); +#endif } =20 -void helper_cvtpd2dq(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_cvtpd2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { d->ZMM_L(0) =3D x86_float64_to_int32(s->ZMM_D(0), &env->sse_status); d->ZMM_L(1) =3D x86_float64_to_int32(s->ZMM_D(1), &env->sse_status); +#if SHIFT =3D=3D 2 + d->ZMM_L(2) =3D x86_float64_to_int32(s->ZMM_D(2), &env->sse_status); + d->ZMM_L(3) =3D x86_float64_to_int32(s->ZMM_D(3), &env->sse_status); + d->Q(2) =3D 0; + d->Q(3) =3D 0; +#else d->ZMM_Q(1) =3D 0; +#endif } =20 +#if SHIFT =3D=3D 1 void helper_cvtps2pi(CPUX86State *env, MMXReg *d, ZMMReg *s) { d->MMX_L(0) =3D x86_float32_to_int32(s->ZMM_S(0), &env->sse_status); @@ -790,33 +1045,64 @@ int64_t helper_cvtsd2sq(CPUX86State *env, ZMMReg *s) return x86_float64_to_int64(s->ZMM_D(0), &env->sse_status); } #endif +#endif =20 /* float to integer truncated */ -void helper_cvttps2dq(CPUX86State *env, ZMMReg *d, ZMMReg *s) -{ - d->ZMM_L(0) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(0), &env->= sse_status); - d->ZMM_L(1) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(1), &env->= sse_status); - d->ZMM_L(2) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(2), &env->= sse_status); - d->ZMM_L(3) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(3), &env->= sse_status); +void glue(helper_cvttps2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) +{ + d->ZMM_L(0) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(0), + &env->sse_status); + d->ZMM_L(1) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(1), + &env->sse_status); + d->ZMM_L(2) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(2), + &env->sse_status); + d->ZMM_L(3) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(3), + &env->sse_status); +#if SHIFT =3D=3D 2 + d->ZMM_L(4) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(4), + &env->sse_status); + d->ZMM_L(5) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(5), + &env->sse_status); + d->ZMM_L(6) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(6), + &env->sse_status); + d->ZMM_L(7) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(7), + &env->sse_status); +#endif } =20 -void helper_cvttpd2dq(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_cvttpd2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { - d->ZMM_L(0) =3D x86_float64_to_int32_round_to_zero(s->ZMM_D(0), &env->= sse_status); - d->ZMM_L(1) =3D x86_float64_to_int32_round_to_zero(s->ZMM_D(1), &env->= sse_status); + d->ZMM_L(0) =3D x86_float64_to_int32_round_to_zero(s->ZMM_D(0), + &env->sse_status); + d->ZMM_L(1) =3D 
x86_float64_to_int32_round_to_zero(s->ZMM_D(1), + &env->sse_status); +#if SHIFT =3D=3D 2 + d->ZMM_L(2) =3D x86_float64_to_int32_round_to_zero(s->ZMM_D(2), + &env->sse_status); + d->ZMM_L(3) =3D x86_float64_to_int32_round_to_zero(s->ZMM_D(3), + &env->sse_status); + d->ZMM_Q(2) =3D 0; + d->ZMM_Q(3) =3D 0; +#else d->ZMM_Q(1) =3D 0; +#endif } =20 +#if SHIFT =3D=3D 1 void helper_cvttps2pi(CPUX86State *env, MMXReg *d, ZMMReg *s) { - d->MMX_L(0) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(0), &env->= sse_status); - d->MMX_L(1) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(1), &env->= sse_status); + d->MMX_L(0) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(0), + &env->sse_status); + d->MMX_L(1) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(1), + &env->sse_status); } =20 void helper_cvttpd2pi(CPUX86State *env, MMXReg *d, ZMMReg *s) { - d->MMX_L(0) =3D x86_float64_to_int32_round_to_zero(s->ZMM_D(0), &env->= sse_status); - d->MMX_L(1) =3D x86_float64_to_int32_round_to_zero(s->ZMM_D(1), &env->= sse_status); + d->MMX_L(0) =3D x86_float64_to_int32_round_to_zero(s->ZMM_D(0), + &env->sse_status); + d->MMX_L(1) =3D x86_float64_to_int32_round_to_zero(s->ZMM_D(1), + &env->sse_status); } =20 int32_t helper_cvttss2si(CPUX86State *env, ZMMReg *s) @@ -840,8 +1126,9 @@ int64_t helper_cvttsd2sq(CPUX86State *env, ZMMReg *s) return x86_float64_to_int64_round_to_zero(s->ZMM_D(0), &env->sse_statu= s); } #endif +#endif =20 -void helper_rsqrtps(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_rsqrtps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { uint8_t old_flags =3D get_float_exception_flags(&env->sse_status); d->ZMM_S(0) =3D float32_div(float32_one, @@ -856,9 +1143,24 @@ void helper_rsqrtps(CPUX86State *env, ZMMReg *d, ZMMR= eg *s) d->ZMM_S(3) =3D float32_div(float32_one, float32_sqrt(s->ZMM_S(3), &env->sse_status), &env->sse_status); +#if SHIFT =3D=3D 2 + d->ZMM_S(4) =3D float32_div(float32_one, + float32_sqrt(s->ZMM_S(4), &env->sse_status), + &env->sse_status); + d->ZMM_S(5) =3D float32_div(float32_one, + float32_sqrt(s->ZMM_S(5), &env->sse_status), + &env->sse_status); + d->ZMM_S(6) =3D float32_div(float32_one, + float32_sqrt(s->ZMM_S(6), &env->sse_status), + &env->sse_status); + d->ZMM_S(7) =3D float32_div(float32_one, + float32_sqrt(s->ZMM_S(7), &env->sse_status), + &env->sse_status); +#endif set_float_exception_flags(old_flags, &env->sse_status); } =20 +#if SHIFT =3D=3D 1 void helper_rsqrtss(CPUX86State *env, ZMMReg *d, ZMMReg *s) { uint8_t old_flags =3D get_float_exception_flags(&env->sse_status); @@ -867,24 +1169,34 @@ void helper_rsqrtss(CPUX86State *env, ZMMReg *d, ZMM= Reg *s) &env->sse_status); set_float_exception_flags(old_flags, &env->sse_status); } +#endif =20 -void helper_rcpps(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_rcpps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { uint8_t old_flags =3D get_float_exception_flags(&env->sse_status); d->ZMM_S(0) =3D float32_div(float32_one, s->ZMM_S(0), &env->sse_status= ); d->ZMM_S(1) =3D float32_div(float32_one, s->ZMM_S(1), &env->sse_status= ); d->ZMM_S(2) =3D float32_div(float32_one, s->ZMM_S(2), &env->sse_status= ); d->ZMM_S(3) =3D float32_div(float32_one, s->ZMM_S(3), &env->sse_status= ); +#if SHIFT =3D=3D 2 + d->ZMM_S(4) =3D float32_div(float32_one, s->ZMM_S(4), &env->sse_status= ); + d->ZMM_S(5) =3D float32_div(float32_one, s->ZMM_S(5), &env->sse_status= ); + d->ZMM_S(6) =3D float32_div(float32_one, s->ZMM_S(6), &env->sse_status= ); + d->ZMM_S(7) =3D float32_div(float32_one, s->ZMM_S(7), &env->sse_status= ); 
+#endif set_float_exception_flags(old_flags, &env->sse_status); } =20 +#if SHIFT =3D=3D 1 void helper_rcpss(CPUX86State *env, ZMMReg *d, ZMMReg *s) { uint8_t old_flags =3D get_float_exception_flags(&env->sse_status); d->ZMM_S(0) =3D float32_div(float32_one, s->ZMM_S(0), &env->sse_status= ); set_float_exception_flags(old_flags, &env->sse_status); } +#endif =20 +#if SHIFT =3D=3D 1 static inline uint64_t helper_extrq(uint64_t src, int shift, int len) { uint64_t mask; @@ -928,113 +1240,213 @@ void helper_insertq_i(CPUX86State *env, ZMMReg *d,= int index, int length) { d->ZMM_Q(0) =3D helper_insertq(d->ZMM_Q(0), index, length); } +#endif =20 -void helper_haddps(CPUX86State *env, ZMMReg *d, ZMMReg *s) -{ - ZMMReg r; - - r.ZMM_S(0) =3D float32_add(d->ZMM_S(0), d->ZMM_S(1), &env->sse_status); - r.ZMM_S(1) =3D float32_add(d->ZMM_S(2), d->ZMM_S(3), &env->sse_status); - r.ZMM_S(2) =3D float32_add(s->ZMM_S(0), s->ZMM_S(1), &env->sse_status); - r.ZMM_S(3) =3D float32_add(s->ZMM_S(2), s->ZMM_S(3), &env->sse_status); - *d =3D r; +void glue(helper_haddps, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) +{ + float32 r0, r1, r2, r3; + + r0 =3D float32_add(v->ZMM_S(0), v->ZMM_S(1), &env->sse_status); + r1 =3D float32_add(v->ZMM_S(2), v->ZMM_S(3), &env->sse_status); + r2 =3D float32_add(s->ZMM_S(0), s->ZMM_S(1), &env->sse_status); + r3 =3D float32_add(s->ZMM_S(2), s->ZMM_S(3), &env->sse_status); + d->ZMM_S(0) =3D r0; + d->ZMM_S(1) =3D r1; + d->ZMM_S(2) =3D r2; + d->ZMM_S(3) =3D r3; +#if SHIFT =3D=3D 2 + r0 =3D float32_add(v->ZMM_S(4), v->ZMM_S(5), &env->sse_status); + r1 =3D float32_add(v->ZMM_S(6), v->ZMM_S(7), &env->sse_status); + r2 =3D float32_add(s->ZMM_S(4), s->ZMM_S(5), &env->sse_status); + r3 =3D float32_add(s->ZMM_S(6), s->ZMM_S(7), &env->sse_status); + d->ZMM_S(4) =3D r0; + d->ZMM_S(5) =3D r1; + d->ZMM_S(6) =3D r2; + d->ZMM_S(7) =3D r3; +#endif } =20 -void helper_haddpd(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_haddpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) { - ZMMReg r; + float64 r0, r1; =20 - r.ZMM_D(0) =3D float64_add(d->ZMM_D(0), d->ZMM_D(1), &env->sse_status); - r.ZMM_D(1) =3D float64_add(s->ZMM_D(0), s->ZMM_D(1), &env->sse_status); - *d =3D r; + r0 =3D float64_add(v->ZMM_D(0), v->ZMM_D(1), &env->sse_status); + r1 =3D float64_add(s->ZMM_D(0), s->ZMM_D(1), &env->sse_status); + d->ZMM_D(0) =3D r0; + d->ZMM_D(1) =3D r1; +#if SHIFT =3D=3D 2 + r0 =3D float64_add(v->ZMM_D(2), v->ZMM_D(3), &env->sse_status); + r1 =3D float64_add(s->ZMM_D(2), s->ZMM_D(3), &env->sse_status); + d->ZMM_D(2) =3D r0; + d->ZMM_D(3) =3D r1; +#endif } =20 -void helper_hsubps(CPUX86State *env, ZMMReg *d, ZMMReg *s) -{ - ZMMReg r; - - r.ZMM_S(0) =3D float32_sub(d->ZMM_S(0), d->ZMM_S(1), &env->sse_status); - r.ZMM_S(1) =3D float32_sub(d->ZMM_S(2), d->ZMM_S(3), &env->sse_status); - r.ZMM_S(2) =3D float32_sub(s->ZMM_S(0), s->ZMM_S(1), &env->sse_status); - r.ZMM_S(3) =3D float32_sub(s->ZMM_S(2), s->ZMM_S(3), &env->sse_status); - *d =3D r; +void glue(helper_hsubps, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) +{ + float32 r0, r1, r2, r3; + + r0 =3D float32_sub(v->ZMM_S(0), v->ZMM_S(1), &env->sse_status); + r1 =3D float32_sub(v->ZMM_S(2), v->ZMM_S(3), &env->sse_status); + r2 =3D float32_sub(s->ZMM_S(0), s->ZMM_S(1), &env->sse_status); + r3 =3D float32_sub(s->ZMM_S(2), s->ZMM_S(3), &env->sse_status); + d->ZMM_S(0) =3D r0; + d->ZMM_S(1) =3D r1; + d->ZMM_S(2) =3D r2; + d->ZMM_S(3) =3D r3; +#if SHIFT =3D=3D 2 + r0 =3D float32_sub(v->ZMM_S(4), v->ZMM_S(5), &env->sse_status); + r1 =3D 
-void helper_hsubps(CPUX86State *env, ZMMReg *d, ZMMReg *s)
-{
-    ZMMReg r;
-
-    r.ZMM_S(0) = float32_sub(d->ZMM_S(0), d->ZMM_S(1), &env->sse_status);
-    r.ZMM_S(1) = float32_sub(d->ZMM_S(2), d->ZMM_S(3), &env->sse_status);
-    r.ZMM_S(2) = float32_sub(s->ZMM_S(0), s->ZMM_S(1), &env->sse_status);
-    r.ZMM_S(3) = float32_sub(s->ZMM_S(2), s->ZMM_S(3), &env->sse_status);
-    *d = r;
+void glue(helper_hsubps, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
+{
+    float32 r0, r1, r2, r3;
+
+    r0 = float32_sub(v->ZMM_S(0), v->ZMM_S(1), &env->sse_status);
+    r1 = float32_sub(v->ZMM_S(2), v->ZMM_S(3), &env->sse_status);
+    r2 = float32_sub(s->ZMM_S(0), s->ZMM_S(1), &env->sse_status);
+    r3 = float32_sub(s->ZMM_S(2), s->ZMM_S(3), &env->sse_status);
+    d->ZMM_S(0) = r0;
+    d->ZMM_S(1) = r1;
+    d->ZMM_S(2) = r2;
+    d->ZMM_S(3) = r3;
+#if SHIFT == 2
+    r0 = float32_sub(v->ZMM_S(4), v->ZMM_S(5), &env->sse_status);
+    r1 = float32_sub(v->ZMM_S(6), v->ZMM_S(7), &env->sse_status);
+    r2 = float32_sub(s->ZMM_S(4), s->ZMM_S(5), &env->sse_status);
+    r3 = float32_sub(s->ZMM_S(6), s->ZMM_S(7), &env->sse_status);
+    d->ZMM_S(4) = r0;
+    d->ZMM_S(5) = r1;
+    d->ZMM_S(6) = r2;
+    d->ZMM_S(7) = r3;
+#endif
 }

-void helper_hsubpd(CPUX86State *env, ZMMReg *d, ZMMReg *s)
+void glue(helper_hsubpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
-    ZMMReg r;
+    float64 r0, r1;

-    r.ZMM_D(0) = float64_sub(d->ZMM_D(0), d->ZMM_D(1), &env->sse_status);
-    r.ZMM_D(1) = float64_sub(s->ZMM_D(0), s->ZMM_D(1), &env->sse_status);
-    *d = r;
+    r0 = float64_sub(v->ZMM_D(0), v->ZMM_D(1), &env->sse_status);
+    r1 = float64_sub(s->ZMM_D(0), s->ZMM_D(1), &env->sse_status);
+    d->ZMM_D(0) = r0;
+    d->ZMM_D(1) = r1;
+#if SHIFT == 2
+    r0 = float64_sub(v->ZMM_D(2), v->ZMM_D(3), &env->sse_status);
+    r1 = float64_sub(s->ZMM_D(2), s->ZMM_D(3), &env->sse_status);
+    d->ZMM_D(2) = r0;
+    d->ZMM_D(3) = r1;
+#endif
 }

-void helper_addsubps(CPUX86State *env, ZMMReg *d, ZMMReg *s)
+void glue(helper_addsubps, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
-    d->ZMM_S(0) = float32_sub(d->ZMM_S(0), s->ZMM_S(0), &env->sse_status);
-    d->ZMM_S(1) = float32_add(d->ZMM_S(1), s->ZMM_S(1), &env->sse_status);
-    d->ZMM_S(2) = float32_sub(d->ZMM_S(2), s->ZMM_S(2), &env->sse_status);
-    d->ZMM_S(3) = float32_add(d->ZMM_S(3), s->ZMM_S(3), &env->sse_status);
+    d->ZMM_S(0) = float32_sub(v->ZMM_S(0), s->ZMM_S(0), &env->sse_status);
+    d->ZMM_S(1) = float32_add(v->ZMM_S(1), s->ZMM_S(1), &env->sse_status);
+    d->ZMM_S(2) = float32_sub(v->ZMM_S(2), s->ZMM_S(2), &env->sse_status);
+    d->ZMM_S(3) = float32_add(v->ZMM_S(3), s->ZMM_S(3), &env->sse_status);
+#if SHIFT == 2
+    d->ZMM_S(4) = float32_sub(v->ZMM_S(4), s->ZMM_S(4), &env->sse_status);
+    d->ZMM_S(5) = float32_add(v->ZMM_S(5), s->ZMM_S(5), &env->sse_status);
+    d->ZMM_S(6) = float32_sub(v->ZMM_S(6), s->ZMM_S(6), &env->sse_status);
+    d->ZMM_S(7) = float32_add(v->ZMM_S(7), s->ZMM_S(7), &env->sse_status);
+#endif
 }

-void helper_addsubpd(CPUX86State *env, ZMMReg *d, ZMMReg *s)
+void glue(helper_addsubpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
-    d->ZMM_D(0) = float64_sub(d->ZMM_D(0), s->ZMM_D(0), &env->sse_status);
-    d->ZMM_D(1) = float64_add(d->ZMM_D(1), s->ZMM_D(1), &env->sse_status);
+    d->ZMM_D(0) = float64_sub(v->ZMM_D(0), s->ZMM_D(0), &env->sse_status);
+    d->ZMM_D(1) = float64_add(v->ZMM_D(1), s->ZMM_D(1), &env->sse_status);
+#if SHIFT == 2
+    d->ZMM_D(2) = float64_sub(v->ZMM_D(2), s->ZMM_D(2), &env->sse_status);
+    d->ZMM_D(3) = float64_add(v->ZMM_D(3), s->ZMM_D(3), &env->sse_status);
+#endif
 }

-/* XXX: unordered */
-#define SSE_HELPER_CMP(name, F)                                 \
-    void helper_ ## name ## ps(CPUX86State *env, Reg *d, Reg *s)    \
-    {                                                           \
-        d->ZMM_L(0) = F(32, d->ZMM_S(0), s->ZMM_S(0));          \
-        d->ZMM_L(1) = F(32, d->ZMM_S(1), s->ZMM_S(1));          \
-        d->ZMM_L(2) = F(32, d->ZMM_S(2), s->ZMM_S(2));          \
-        d->ZMM_L(3) = F(32, d->ZMM_S(3), s->ZMM_S(3));          \
-    }                                                           \
-                                                                \
-    void helper_ ## name ## ss(CPUX86State *env, Reg *d, Reg *s)    \
-    {                                                           \
-        d->ZMM_L(0) = F(32, d->ZMM_S(0), s->ZMM_S(0));          \
-    }                                                           \
-                                                                \
-    void helper_ ## name ## pd(CPUX86State *env, Reg *d, Reg *s)    \
+#define SSE_HELPER_CMP_P(name, F, C)                            \
+    void glue(helper_ ## name ## ps, SUFFIX)(CPUX86State *env,  \
+                                             Reg *d, Reg *v, Reg *s) \
    {                                                           \
-        d->ZMM_Q(0) = F(64, d->ZMM_D(0), s->ZMM_D(0));          \
-        d->ZMM_Q(1) = F(64, d->ZMM_D(1), s->ZMM_D(1));          \
+        d->ZMM_L(0) = F(32, C, v->ZMM_S(0), s->ZMM_S(0));       \
+        d->ZMM_L(1) = F(32, C, v->ZMM_S(1), s->ZMM_S(1));       \
+        d->ZMM_L(2) = F(32, C, v->ZMM_S(2), s->ZMM_S(2));       \
+        d->ZMM_L(3) = F(32, C, v->ZMM_S(3), s->ZMM_S(3));       \
+        YMM_ONLY(                                               \
+        d->ZMM_L(4) = F(32, C, v->ZMM_S(4), s->ZMM_S(4));       \
+        d->ZMM_L(5) = F(32, C, v->ZMM_S(5), s->ZMM_S(5));       \
+        d->ZMM_L(6) = F(32, C, v->ZMM_S(6), s->ZMM_S(6));       \
+        d->ZMM_L(7) = F(32, C, v->ZMM_S(7), s->ZMM_S(7));       \
+        )                                                       \
    }                                                           \
                                                                \
-    void helper_ ## name ## sd(CPUX86State *env, Reg *d, Reg *s)    \
+    void glue(helper_ ## name ## pd, SUFFIX)(CPUX86State *env,  \
+                                             Reg *d, Reg *v, Reg *s) \
    {                                                           \
-        d->ZMM_Q(0) = F(64, d->ZMM_D(0), s->ZMM_D(0));          \
-    }
-
-#define FPU_CMPEQ(size, a, b)                                           \
-    (float ## size ## _eq_quiet(a, b, &env->sse_status) ? -1 : 0)
-#define FPU_CMPLT(size, a, b)                                           \
-    (float ## size ## _lt(a, b, &env->sse_status) ? -1 : 0)
-#define FPU_CMPLE(size, a, b)                                           \
-    (float ## size ## _le(a, b, &env->sse_status) ? -1 : 0)
-#define FPU_CMPUNORD(size, a, b)                                        \
-    (float ## size ## _unordered_quiet(a, b, &env->sse_status) ? -1 : 0)
-#define FPU_CMPNEQ(size, a, b)                                          \
-    (float ## size ## _eq_quiet(a, b, &env->sse_status) ? 0 : -1)
-#define FPU_CMPNLT(size, a, b)                                          \
-    (float ## size ## _lt(a, b, &env->sse_status) ? 0 : -1)
-#define FPU_CMPNLE(size, a, b)                                          \
-    (float ## size ## _le(a, b, &env->sse_status) ? 0 : -1)
-#define FPU_CMPORD(size, a, b)                                          \
-    (float ## size ## _unordered_quiet(a, b, &env->sse_status) ? 0 : -1)
-
-SSE_HELPER_CMP(cmpeq, FPU_CMPEQ)
-SSE_HELPER_CMP(cmplt, FPU_CMPLT)
-SSE_HELPER_CMP(cmple, FPU_CMPLE)
-SSE_HELPER_CMP(cmpunord, FPU_CMPUNORD)
-SSE_HELPER_CMP(cmpneq, FPU_CMPNEQ)
-SSE_HELPER_CMP(cmpnlt, FPU_CMPNLT)
-SSE_HELPER_CMP(cmpnle, FPU_CMPNLE)
-SSE_HELPER_CMP(cmpord, FPU_CMPORD)
+        d->ZMM_Q(0) = F(64, C, v->ZMM_D(0), s->ZMM_D(0));       \
+        d->ZMM_Q(1) = F(64, C, v->ZMM_D(1), s->ZMM_D(1));       \
+        YMM_ONLY(                                               \
+        d->ZMM_Q(2) = F(64, C, v->ZMM_D(2), s->ZMM_D(2));       \
+        d->ZMM_Q(3) = F(64, C, v->ZMM_D(3), s->ZMM_D(3));       \
+        )                                                       \
+    }

+#if SHIFT == 1
+#define SSE_HELPER_CMP(name, F, C)                                        \
+    SSE_HELPER_CMP_P(name, F, C)                                          \
+    void helper_ ## name ## ss(CPUX86State *env, Reg *d, Reg *v, Reg *s)  \
+    {                                                                     \
+        d->ZMM_L(0) = F(32, C, v->ZMM_S(0), s->ZMM_S(0));                 \
+    }                                                                     \
+                                                                          \
+    void helper_ ## name ## sd(CPUX86State *env, Reg *d, Reg *v, Reg *s)  \
+    {                                                                     \
+        d->ZMM_Q(0) = F(64, C, v->ZMM_D(0), s->ZMM_D(0));                 \
+    }
+
+static inline bool FPU_EQU(FloatRelation x)
+{
+    return (x == float_relation_equal || x == float_relation_unordered);
+}
+static inline bool FPU_GE(FloatRelation x)
+{
+    return (x == float_relation_equal || x == float_relation_greater);
+}
+#define FPU_EQ(x) (x == float_relation_equal)
+#define FPU_LT(x) (x == float_relation_less)
+#define FPU_LE(x) (x <= float_relation_equal)
+#define FPU_GT(x) (x == float_relation_greater)
+#define FPU_UNORD(x) (x == float_relation_unordered)
+#define FPU_FALSE(x) 0
+
+#define FPU_CMPQ(size, COND, a, b) \
+    (COND(float ## size ## _compare_quiet(a, b, &env->sse_status)) ? -1 : 0)
+#define FPU_CMPS(size, COND, a, b) \
+    (COND(float ## size ## _compare(a, b, &env->sse_status)) ? -1 : 0)
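/*
 * Illustrative sketch, not part of the patch: how one AVX predicate
 * expands.  FPU_CMPS(32, FPU_LE, a, b) performs a signalling compare and
 * classifies the FloatRelation result with FPU_LE; unordered inputs fail
 * the predicate because float_relation_unordered sorts above _equal.
 */
static uint32_t cmple_sketch(float32 a, float32 b, float_status *st)
{
    FloatRelation rel = float32_compare(a, b, st);  /* signalling ("S") */
    return (rel <= float_relation_equal) ? -1 : 0;  /* FPU_LE */
}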
+
+#else
+#define SSE_HELPER_CMP(name, F, C) SSE_HELPER_CMP_P(name, F, C)
+#endif

+SSE_HELPER_CMP(cmpeq, FPU_CMPQ, FPU_EQ)
+SSE_HELPER_CMP(cmplt, FPU_CMPS, FPU_LT)
+SSE_HELPER_CMP(cmple, FPU_CMPS, FPU_LE)
+SSE_HELPER_CMP(cmpunord, FPU_CMPQ, FPU_UNORD)
+SSE_HELPER_CMP(cmpneq, FPU_CMPQ, !FPU_EQ)
+SSE_HELPER_CMP(cmpnlt, FPU_CMPS, !FPU_LT)
+SSE_HELPER_CMP(cmpnle, FPU_CMPS, !FPU_LE)
+SSE_HELPER_CMP(cmpord, FPU_CMPQ, !FPU_UNORD)
+
+SSE_HELPER_CMP(cmpequ, FPU_CMPQ, FPU_EQU)
+SSE_HELPER_CMP(cmpnge, FPU_CMPS, !FPU_GE)
+SSE_HELPER_CMP(cmpngt, FPU_CMPS, !FPU_GT)
+SSE_HELPER_CMP(cmpfalse, FPU_CMPQ, FPU_FALSE)
+SSE_HELPER_CMP(cmpnequ, FPU_CMPQ, !FPU_EQU)
+SSE_HELPER_CMP(cmpge, FPU_CMPS, FPU_GE)
+SSE_HELPER_CMP(cmpgt, FPU_CMPS, FPU_GT)
+SSE_HELPER_CMP(cmptrue, FPU_CMPQ, !FPU_FALSE)
+
+SSE_HELPER_CMP(cmpeqs, FPU_CMPS, FPU_EQ)
+SSE_HELPER_CMP(cmpltq, FPU_CMPQ, FPU_LT)
+SSE_HELPER_CMP(cmpleq, FPU_CMPQ, FPU_LE)
+SSE_HELPER_CMP(cmpunords, FPU_CMPS, FPU_UNORD)
+SSE_HELPER_CMP(cmpneqq, FPU_CMPS, !FPU_EQ)
+SSE_HELPER_CMP(cmpnltq, FPU_CMPQ, !FPU_LT)
+SSE_HELPER_CMP(cmpnleq, FPU_CMPQ, !FPU_LE)
+SSE_HELPER_CMP(cmpords, FPU_CMPS, !FPU_UNORD)
+
+SSE_HELPER_CMP(cmpequs, FPU_CMPS, FPU_EQU)
+SSE_HELPER_CMP(cmpngeq, FPU_CMPQ, !FPU_GE)
+SSE_HELPER_CMP(cmpngtq, FPU_CMPQ, !FPU_GT)
+SSE_HELPER_CMP(cmpfalses, FPU_CMPS, FPU_FALSE)
+SSE_HELPER_CMP(cmpnequs, FPU_CMPS, !FPU_EQU)
+SSE_HELPER_CMP(cmpgeq, FPU_CMPQ, FPU_GE)
+SSE_HELPER_CMP(cmpgtq, FPU_CMPQ, FPU_GT)
+SSE_HELPER_CMP(cmptrues, FPU_CMPS, !FPU_FALSE)
+
+#if SHIFT == 1
 static const int comis_eflags[4] = {CC_C, CC_Z, 0, CC_Z | CC_P | CC_C};

 void helper_ucomiss(CPUX86State *env, Reg *d, Reg *s)
@@ -1080,25 +1492,38 @@ void helper_comisd(CPUX86State *env, Reg *d, Reg *s)
     ret = float64_compare(d0, d1, &env->sse_status);
     CC_SRC = comis_eflags[ret + 1];
 }
+#endif

-uint32_t helper_movmskps(CPUX86State *env, Reg *s)
+uint32_t glue(helper_movmskps, SUFFIX)(CPUX86State *env, Reg *s)
 {
-    int b0, b1, b2, b3;
+    uint32_t mask;

-    b0 = s->ZMM_L(0) >> 31;
-    b1 = s->ZMM_L(1) >> 31;
-    b2 = s->ZMM_L(2) >> 31;
-    b3 = s->ZMM_L(3) >> 31;
-    return b0 | (b1 << 1) | (b2 << 2) | (b3 << 3);
+    mask = 0;
+    mask |= (s->ZMM_L(0) >> (31 - 0)) & (1 << 0);
+    mask |= (s->ZMM_L(1) >> (31 - 1)) & (1 << 1);
+    mask |= (s->ZMM_L(2) >> (31 - 2)) & (1 << 2);
+    mask |= (s->ZMM_L(3) >> (31 - 3)) & (1 << 3);
+#if SHIFT == 2
+    mask |= (s->ZMM_L(4) >> (31 - 4)) & (1 << 4);
+    mask |= (s->ZMM_L(5) >> (31 - 5)) & (1 << 5);
+    mask |= (s->ZMM_L(6) >> (31 - 6)) & (1 << 6);
+    mask |= (s->ZMM_L(7) >> (31 - 7)) & (1 << 7);
+#endif
+    return mask;
 }

-uint32_t helper_movmskpd(CPUX86State *env, Reg *s)
+uint32_t glue(helper_movmskpd, SUFFIX)(CPUX86State *env, Reg *s)
 {
-    int b0, b1;
+    uint32_t mask;

-    b0 = s->ZMM_L(1) >> 31;
-    b1 = s->ZMM_L(3) >> 31;
-    return b0 | (b1 << 1);
+    mask = 0;
+    mask |= (s->ZMM_L(1) >> (31 - 0)) & (1 << 0);
+    mask |= (s->ZMM_L(3) >> (31 - 1)) & (1 << 1);
+#if SHIFT == 2
+    mask |= (s->ZMM_L(5) >> (31 - 2)) & (1 << 2);
+    mask |= (s->ZMM_L(7) >> (31 - 3)) & (1 << 3);
+#endif
+    return mask;
 }

 #endif
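/*
 * Illustrative sketch, not part of the patch: the shift trick the new
 * movmskps uses.  Element i's sign bit sits at bit 31, so shifting right
 * by (31 - i) parks it at bit i, and the AND keeps only that bit.
 */
static uint32_t movmskps_sketch(const uint32_t e[4])
{
    uint32_t mask = 0;
    for (int i = 0; i < 4; i++) {
        mask |= (e[i] >> (31 - i)) & (1u << i);
    }
    return mask;  /* e.g. signs {1,0,0,1} -> 0b1001 */
}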
@@ -1116,7 +1541,7 @@ uint32_t glue(helper_pmovmskb, SUFFIX)(CPUX86State *env, Reg *s)
     val |= (s->B(5) >> 2) & 0x20;
     val |= (s->B(6) >> 1) & 0x40;
     val |= (s->B(7)) & 0x80;
-#if SHIFT == 1
+#if SHIFT >= 1
     val |= (s->B(8) << 1) & 0x0100;
     val |= (s->B(9) << 2) & 0x0200;
     val |= (s->B(10) << 3) & 0x0400;
@@ -1125,160 +1550,243 @@ uint32_t glue(helper_pmovmskb, SUFFIX)(CPUX86State *env, Reg *s)
     val |= (s->B(13) << 6) & 0x2000;
     val |= (s->B(14) << 7) & 0x4000;
     val |= (s->B(15) << 8) & 0x8000;
+#if SHIFT == 2
+    val |= ((uint32_t)s->B(16) << 9) & 0x00010000;
+    val |= ((uint32_t)s->B(17) << 10) & 0x00020000;
+    val |= ((uint32_t)s->B(18) << 11) & 0x00040000;
+    val |= ((uint32_t)s->B(19) << 12) & 0x00080000;
+    val |= ((uint32_t)s->B(20) << 13) & 0x00100000;
+    val |= ((uint32_t)s->B(21) << 14) & 0x00200000;
+    val |= ((uint32_t)s->B(22) << 15) & 0x00400000;
+    val |= ((uint32_t)s->B(23) << 16) & 0x00800000;
+    val |= ((uint32_t)s->B(24) << 17) & 0x01000000;
+    val |= ((uint32_t)s->B(25) << 18) & 0x02000000;
+    val |= ((uint32_t)s->B(26) << 19) & 0x04000000;
+    val |= ((uint32_t)s->B(27) << 20) & 0x08000000;
+    val |= ((uint32_t)s->B(28) << 21) & 0x10000000;
+    val |= ((uint32_t)s->B(29) << 22) & 0x20000000;
+    val |= ((uint32_t)s->B(30) << 23) & 0x40000000;
+    val |= ((uint32_t)s->B(31) << 24) & 0x80000000;
+#endif
 #endif
     return val;
 }

-void glue(helper_packsswb, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-{
-    Reg r;
-
-    r.B(0) = satsb((int16_t)d->W(0));
-    r.B(1) = satsb((int16_t)d->W(1));
-    r.B(2) = satsb((int16_t)d->W(2));
-    r.B(3) = satsb((int16_t)d->W(3));
-#if SHIFT == 1
-    r.B(4) = satsb((int16_t)d->W(4));
-    r.B(5) = satsb((int16_t)d->W(5));
-    r.B(6) = satsb((int16_t)d->W(6));
-    r.B(7) = satsb((int16_t)d->W(7));
-#endif
-    r.B((4 << SHIFT) + 0) = satsb((int16_t)s->W(0));
-    r.B((4 << SHIFT) + 1) = satsb((int16_t)s->W(1));
-    r.B((4 << SHIFT) + 2) = satsb((int16_t)s->W(2));
-    r.B((4 << SHIFT) + 3) = satsb((int16_t)s->W(3));
-#if SHIFT == 1
-    r.B(12) = satsb((int16_t)s->W(4));
-    r.B(13) = satsb((int16_t)s->W(5));
-    r.B(14) = satsb((int16_t)s->W(6));
-    r.B(15) = satsb((int16_t)s->W(7));
+#if SHIFT == 0
+#define PACK_WIDTH 4
+#else
+#define PACK_WIDTH 8
 #endif
-    *d = r;
-}
-
-void glue(helper_packuswb, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-{
-    Reg r;

-    r.B(0) = satub((int16_t)d->W(0));
-    r.B(1) = satub((int16_t)d->W(1));
-    r.B(2) = satub((int16_t)d->W(2));
-    r.B(3) = satub((int16_t)d->W(3));
-#if SHIFT == 1
-    r.B(4) = satub((int16_t)d->W(4));
-    r.B(5) = satub((int16_t)d->W(5));
-    r.B(6) = satub((int16_t)d->W(6));
-    r.B(7) = satub((int16_t)d->W(7));
-#endif
-    r.B((4 << SHIFT) + 0) = satub((int16_t)s->W(0));
-    r.B((4 << SHIFT) + 1) = satub((int16_t)s->W(1));
-    r.B((4 << SHIFT) + 2) = satub((int16_t)s->W(2));
-    r.B((4 << SHIFT) + 3) = satub((int16_t)s->W(3));
-#if SHIFT == 1
-    r.B(12) = satub((int16_t)s->W(4));
-    r.B(13) = satub((int16_t)s->W(5));
-    r.B(14) = satub((int16_t)s->W(6));
-    r.B(15) = satub((int16_t)s->W(7));
-#endif
-    *d = r;
+#define PACK4(F, to, reg, from) do {                \
+        r[to + 0] = F((int16_t)reg->W(from + 0));   \
+        r[to + 1] = F((int16_t)reg->W(from + 1));   \
+        r[to + 2] = F((int16_t)reg->W(from + 2));   \
+        r[to + 3] = F((int16_t)reg->W(from + 3));   \
+    } while (0)
+
+#define PACK_HELPER_B(name, F)                          \
+void glue(helper_pack ## name, SUFFIX)(CPUX86State *env,\
+                                       Reg *d, Reg *v, Reg *s) \
+{                                                       \
+    uint8_t r[PACK_WIDTH * 2];                          \
+    int i;                                              \
+    PACK4(F, 0, v, 0);                                  \
+    PACK4(F, PACK_WIDTH, s, 0);                         \
+    XMM_ONLY(                                           \
+        PACK4(F, 4, v, 4);                              \
+        PACK4(F, 12, s, 4);                             \
+    )                                                   \
+    for (i = 0; i < PACK_WIDTH * 2; i++) {              \
+        d->B(i) = r[i];                                 \
+    }                                                   \
+    YMM_ONLY(                                           \
+        PACK4(F, 0, v, 8);                              \
+        PACK4(F, 4, v, 12);                             \
+        PACK4(F, 8, s, 8);                              \
+        PACK4(F, 12, s, 12);                            \
+        for (i = 0; i < 16; i++) {                      \
+            d->B(i + 16) = r[i];                        \
+        }                                               \
+    )                                                   \
 }

-void glue(helper_packssdw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+PACK_HELPER_B(sswb, satsb)
+PACK_HELPER_B(uswb, satub)
+
+void glue(helper_packssdw, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
-    Reg r;
+    uint16_t r[PACK_WIDTH];
+    int i;

-    r.W(0) = satsw(d->L(0));
-    r.W(1) = satsw(d->L(1));
-#if SHIFT == 1
-    r.W(2) = satsw(d->L(2));
-    r.W(3) = satsw(d->L(3));
+    r[0] = satsw(v->L(0));
+    r[1] = satsw(v->L(1));
+    r[PACK_WIDTH / 2 + 0] = satsw(s->L(0));
+    r[PACK_WIDTH / 2 + 1] = satsw(s->L(1));
+#if SHIFT >= 1
+    r[2] = satsw(v->L(2));
+    r[3] = satsw(v->L(3));
+    r[6] = satsw(s->L(2));
+    r[7] = satsw(s->L(3));
 #endif
-    r.W((2 << SHIFT) + 0) = satsw(s->L(0));
-    r.W((2 << SHIFT) + 1) = satsw(s->L(1));
-#if SHIFT == 1
-    r.W(6) = satsw(s->L(2));
-    r.W(7) = satsw(s->L(3));
+    for (i = 0; i < PACK_WIDTH; i++) {
+        d->W(i) = r[i];
+    }
+#if SHIFT == 2
+    r[0] = satsw(v->L(4));
+    r[1] = satsw(v->L(5));
+    r[2] = satsw(v->L(6));
+    r[3] = satsw(v->L(7));
+    r[4] = satsw(s->L(4));
+    r[5] = satsw(s->L(5));
+    r[6] = satsw(s->L(6));
+    r[7] = satsw(s->L(7));
+    for (i = 0; i < 8; i++) {
+        d->W(i + 8) = r[i];
+    }
 #endif
-    *d = r;
 }
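/*
 * Illustrative sketch, not part of the patch: what satsw does inside
 * packssdw.  Each 32-bit input is clamped to the signed 16-bit range, so
 * 70000 packs as 32767 and -70000 as -32768.
 */
static int16_t satsw_sketch(int32_t x)
{
    if (x < INT16_MIN) {
        return INT16_MIN;
    }
    if (x > INT16_MAX) {
        return INT16_MAX;
    }
    return x;
}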

 #define UNPCK_OP(base_name, base)                                       \
                                                                        \
    void glue(helper_punpck ## base_name ## bw, SUFFIX)(CPUX86State *env,\
-                                                Reg *d, Reg *s)         \
+                                                Reg *d, Reg *v, Reg *s) \
    {                                                                   \
-        Reg r;                                                          \
+        uint8_t r[PACK_WIDTH * 2];                                      \
+        int i;                                                          \
                                                                        \
-        r.B(0) = d->B((base << (SHIFT + 2)) + 0);                       \
-        r.B(1) = s->B((base << (SHIFT + 2)) + 0);                       \
-        r.B(2) = d->B((base << (SHIFT + 2)) + 1);                       \
-        r.B(3) = s->B((base << (SHIFT + 2)) + 1);                       \
-        r.B(4) = d->B((base << (SHIFT + 2)) + 2);                       \
-        r.B(5) = s->B((base << (SHIFT + 2)) + 2);                       \
-        r.B(6) = d->B((base << (SHIFT + 2)) + 3);                       \
-        r.B(7) = s->B((base << (SHIFT + 2)) + 3);                       \
+        r[0] = v->B((base * PACK_WIDTH) + 0);                           \
+        r[1] = s->B((base * PACK_WIDTH) + 0);                           \
+        r[2] = v->B((base * PACK_WIDTH) + 1);                           \
+        r[3] = s->B((base * PACK_WIDTH) + 1);                           \
+        r[4] = v->B((base * PACK_WIDTH) + 2);                           \
+        r[5] = s->B((base * PACK_WIDTH) + 2);                           \
+        r[6] = v->B((base * PACK_WIDTH) + 3);                           \
+        r[7] = s->B((base * PACK_WIDTH) + 3);                           \
        XMM_ONLY(                                                       \
-            r.B(8) = d->B((base << (SHIFT + 2)) + 4);                   \
-            r.B(9) = s->B((base << (SHIFT + 2)) + 4);                   \
-            r.B(10) = d->B((base << (SHIFT + 2)) + 5);                  \
-            r.B(11) = s->B((base << (SHIFT + 2)) + 5);                  \
-            r.B(12) = d->B((base << (SHIFT + 2)) + 6);                  \
-            r.B(13) = s->B((base << (SHIFT + 2)) + 6);                  \
-            r.B(14) = d->B((base << (SHIFT + 2)) + 7);                  \
-            r.B(15) = s->B((base << (SHIFT + 2)) + 7);                  \
+            r[8] = v->B((base * PACK_WIDTH) + 4);                       \
+            r[9] = s->B((base * PACK_WIDTH) + 4);                       \
+            r[10] = v->B((base * PACK_WIDTH) + 5);                      \
+            r[11] = s->B((base * PACK_WIDTH) + 5);                      \
+            r[12] = v->B((base * PACK_WIDTH) + 6);                      \
+            r[13] = s->B((base * PACK_WIDTH) + 6);                      \
+            r[14] = v->B((base * PACK_WIDTH) + 7);                      \
+            r[15] = s->B((base * PACK_WIDTH) + 7);                      \
+        )                                                               \
+        for (i = 0; i < PACK_WIDTH * 2; i++) {                          \
+            d->B(i) = r[i];                                             \
+        }                                                               \
+        YMM_ONLY(                                                       \
+            r[0] = v->B((base * 8) + 16);                               \
+            r[1] = s->B((base * 8) + 16);                               \
+            r[2] = v->B((base * 8) + 17);                               \
+            r[3] = s->B((base * 8) + 17);                               \
+            r[4] = v->B((base * 8) + 18);                               \
+            r[5] = s->B((base * 8) + 18);                               \
+            r[6] = v->B((base * 8) + 19);                               \
+            r[7] = s->B((base * 8) + 19);                               \
+            r[8] = v->B((base * 8) + 20);                               \
+            r[9] = s->B((base * 8) + 20);                               \
+            r[10] = v->B((base * 8) + 21);                              \
+            r[11] = s->B((base * 8) + 21);                              \
+            r[12] = v->B((base * 8) + 22);                              \
+            r[13] = s->B((base * 8) + 22);                              \
+            r[14] = v->B((base * 8) + 23);                              \
+            r[15] = s->B((base * 8) + 23);                              \
+            for (i = 0; i < PACK_WIDTH * 2; i++) {                      \
+                d->B(16 + i) = r[i];                                    \
+            }                                                           \
        )                                                               \
-        *d = r;                                                         \
    }                                                                   \
                                                                        \
    void glue(helper_punpck ## base_name ## wd, SUFFIX)(CPUX86State *env,\
-                                                Reg *d, Reg *s)         \
+                                                Reg *d, Reg *v, Reg *s) \
    {                                                                   \
-        Reg r;                                                          \
+        uint16_t r[PACK_WIDTH];                                         \
+        int i;                                                          \
                                                                        \
-        r.W(0) = d->W((base << (SHIFT + 1)) + 0);                       \
-        r.W(1) = s->W((base << (SHIFT + 1)) + 0);                       \
-        r.W(2) = d->W((base << (SHIFT + 1)) + 1);                       \
-        r.W(3) = s->W((base << (SHIFT + 1)) + 1);                       \
+        r[0] = v->W((base * (PACK_WIDTH / 2)) + 0);                     \
+        r[1] = s->W((base * (PACK_WIDTH / 2)) + 0);                     \
+        r[2] = v->W((base * (PACK_WIDTH / 2)) + 1);                     \
+        r[3] = s->W((base * (PACK_WIDTH / 2)) + 1);                     \
        XMM_ONLY(                                                       \
-            r.W(4) = d->W((base << (SHIFT + 1)) + 2);                   \
-            r.W(5) = s->W((base << (SHIFT + 1)) + 2);                   \
-            r.W(6) = d->W((base << (SHIFT + 1)) + 3);                   \
-            r.W(7) = s->W((base << (SHIFT + 1)) + 3);                   \
+            r[4] = v->W((base * 4) + 2);                                \
+            r[5] = s->W((base * 4) + 2);                                \
+            r[6] = v->W((base * 4) + 3);                                \
+            r[7] = s->W((base * 4) + 3);                                \
+        )                                                               \
+        for (i = 0; i < PACK_WIDTH; i++) {                              \
+            d->W(i) = r[i];                                             \
+        }                                                               \
+        YMM_ONLY(                                                       \
+            r[0] = v->W((base * 4) + 8);                                \
+            r[1] = s->W((base * 4) + 8);                                \
+            r[2] = v->W((base * 4) + 9);                                \
+            r[3] = s->W((base * 4) + 9);                                \
+            r[4] = v->W((base * 4) + 10);                               \
+            r[5] = s->W((base * 4) + 10);                               \
+            r[6] = v->W((base * 4) + 11);                               \
+            r[7] = s->W((base * 4) + 11);                               \
+            for (i = 0; i < PACK_WIDTH; i++) {                          \
+                d->W(i + 8) = r[i];                                     \
+            }                                                           \
        )                                                               \
-        *d = r;                                                         \
    }                                                                   \
                                                                        \
    void glue(helper_punpck ## base_name ## dq, SUFFIX)(CPUX86State *env,\
-                                                Reg *d, Reg *s)         \
+                                                Reg *d, Reg *v, Reg *s) \
    {                                                                   \
-        Reg r;                                                          \
+        uint32_t r[4];                                                  \
                                                                        \
-        r.L(0) = d->L((base << SHIFT) + 0);                             \
-        r.L(1) = s->L((base << SHIFT) + 0);                             \
+        r[0] = v->L((base * (PACK_WIDTH / 4)) + 0);                     \
+        r[1] = s->L((base * (PACK_WIDTH / 4)) + 0);                     \
        XMM_ONLY(                                                       \
-            r.L(2) = d->L((base << SHIFT) + 1);                         \
-            r.L(3) = s->L((base << SHIFT) + 1);                         \
+            r[2] = v->L((base * 2) + 1);                                \
+            r[3] = s->L((base * 2) + 1);                                \
+            d->L(2) = r[2];                                             \
+            d->L(3) = r[3];                                             \
+        )                                                               \
+        d->L(0) = r[0];                                                 \
+        d->L(1) = r[1];                                                 \
+        YMM_ONLY(                                                       \
+            r[0] = v->L((base * 2) + 4);                                \
+            r[1] = s->L((base * 2) + 4);                                \
+            r[2] = v->L((base * 2) + 5);                                \
+            r[3] = s->L((base * 2) + 5);                                \
+            d->L(4) = r[0];                                             \
+            d->L(5) = r[1];                                             \
+            d->L(6) = r[2];                                             \
+            d->L(7) = r[3];                                             \
        )                                                               \
-        *d = r;                                                         \
    }                                                                   \
                                                                        \
    XMM_ONLY(                                                           \
-              void glue(helper_punpck ## base_name ## qdq, SUFFIX)(CPUX86State \
-                                                                    *env, \
-                                                                    Reg *d, \
-                                                                    Reg *s) \
+              void glue(helper_punpck ## base_name ## qdq, SUFFIX)(     \
+                        CPUX86State *env, Reg *d, Reg *v, Reg *s)       \
              {                                                         \
-                  Reg r;                                                \
+                  uint64_t r[2];                                        \
                                                                        \
-                  r.Q(0) = d->Q(base);                                  \
-                  r.Q(1) = s->Q(base);                                  \
-                  *d = r;                                               \
+                  r[0] = v->Q(base);                                    \
+                  r[1] = s->Q(base);                                    \
+                  d->Q(0) = r[0];                                       \
+                  d->Q(1) = r[1];                                       \
+                  YMM_ONLY(                                             \
+                      r[0] = v->Q(base + 2);                            \
+                      r[1] = s->Q(base + 2);                            \
+                      d->Q(2) = r[0];                                   \
+                      d->Q(3) = r[1];                                   \
+                  )                                                     \
              }                                                         \
              )

 UNPCK_OP(l, 0)
 UNPCK_OP(h, 1)

+#undef PACK_WIDTH
+#undef PACK_HELPER_B
+#undef PACK4
+
+
 /* 3DNow! float ops */
 #if SHIFT == 0
 void helper_pi2fd(CPUX86State *env, MMXReg *d, MMXReg *s)
@@ -1429,123 +1937,176 @@ void helper_pswapd(CPUX86State *env, MMXReg *d, MMXReg *s)
 #endif

 /* SSSE3 op helpers */
-void glue(helper_pshufb, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_pshufb, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
     int i;
-    Reg r;
+#if SHIFT == 0
+    uint8_t r[8];

-    for (i = 0; i < (8 << SHIFT); i++) {
-        r.B(i) = (s->B(i) & 0x80) ? 0 : (d->B(s->B(i) & ((8 << SHIFT) - 1)));
+    for (i = 0; i < 8; i++) {
+        r[i] = (s->B(i) & 0x80) ? 0 : (v->B(s->B(i) & 7));
     }
+    for (i = 0; i < 8; i++) {
+        d->B(i) = r[i];
+    }
+#else
+    uint8_t r[16];

-    *d = r;
+    for (i = 0; i < 16; i++) {
+        r[i] = (s->B(i) & 0x80) ? 0 : (v->B(s->B(i) & 0xf));
+    }
+    for (i = 0; i < 16; i++) {
+        d->B(i) = r[i];
+    }
+#if SHIFT == 2
+    for (i = 0; i < 16; i++) {
+        r[i] = (s->B(i + 16) & 0x80) ? 0 : (v->B((s->B(i + 16) & 0xf) + 16));
+    }
+    for (i = 0; i < 16; i++) {
+        d->B(i + 16) = r[i];
+    }
+#endif
+#endif
 }
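/*
 * Illustrative sketch, not part of the patch: PSHUFB as implemented
 * above.  Each control byte selects a source byte from the same 16-byte
 * lane (low nibble), and a set high bit forces the result byte to zero;
 * with SHIFT == 2 the second lane shuffles independently.
 */
static void pshufb_lane_sketch(uint8_t d[16], const uint8_t v[16],
                               const uint8_t ctl[16])
{
    for (int i = 0; i < 16; i++) {
        d[i] = (ctl[i] & 0x80) ? 0 : v[ctl[i] & 0x0f];
    }
}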

-void glue(helper_phaddw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-{
+#if SHIFT == 0

-    Reg r;
+#define SSE_HELPER_HW(name, F)                                              \
+void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) \
+{                                                                           \
+    uint16_t r[4];                                                          \
+    r[0] = F(v->W(0), v->W(1));                                             \
+    r[1] = F(v->W(2), v->W(3));                                             \
+    r[2] = F(s->W(0), s->W(1));                                             \
+    r[3] = F(s->W(2), s->W(3));                                             \
+    d->W(0) = r[0];                                                         \
+    d->W(1) = r[1];                                                         \
+    d->W(2) = r[2];                                                         \
+    d->W(3) = r[3];                                                         \
+}
+
+#define SSE_HELPER_HL(name, F)                                              \
+void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) \
+{                                                                           \
+    uint32_t r0, r1;                                                        \
+    r0 = F(v->L(0), v->L(1));                                               \
+    r1 = F(s->L(0), s->L(1));                                               \
+    d->L(0) = r0;                                                           \
+    d->L(1) = r1;                                                           \
+}

-    r.W(0) = (int16_t)d->W(0) + (int16_t)d->W(1);
-    r.W(1) = (int16_t)d->W(2) + (int16_t)d->W(3);
-    XMM_ONLY(r.W(2) = (int16_t)d->W(4) + (int16_t)d->W(5));
-    XMM_ONLY(r.W(3) = (int16_t)d->W(6) + (int16_t)d->W(7));
-    r.W((2 << SHIFT) + 0) = (int16_t)s->W(0) + (int16_t)s->W(1);
-    r.W((2 << SHIFT) + 1) = (int16_t)s->W(2) + (int16_t)s->W(3);
-    XMM_ONLY(r.W(6) = (int16_t)s->W(4) + (int16_t)s->W(5));
-    XMM_ONLY(r.W(7) = (int16_t)s->W(6) + (int16_t)s->W(7));
+#else

-    *d = r;
+#define SSE_HELPER_HW(name, F)                                              \
+void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) \
+{                                                                           \
+    int32_t r[8];                                                           \
+    r[0] = F(v->W(0), v->W(1));                                             \
+    r[1] = F(v->W(2), v->W(3));                                             \
+    r[2] = F(v->W(4), v->W(5));                                             \
+    r[3] = F(v->W(6), v->W(7));                                             \
+    r[4] = F(s->W(0), s->W(1));                                             \
+    r[5] = F(s->W(2), s->W(3));                                             \
+    r[6] = F(s->W(4), s->W(5));                                             \
+    r[7] = F(s->W(6), s->W(7));                                             \
+    d->W(0) = r[0];                                                         \
+    d->W(1) = r[1];                                                         \
+    d->W(2) = r[2];                                                         \
+    d->W(3) = r[3];                                                         \
+    d->W(4) = r[4];                                                         \
+    d->W(5) = r[5];                                                         \
+    d->W(6) = r[6];                                                         \
+    d->W(7) = r[7];                                                         \
+    YMM_ONLY(                                                               \
+        r[0] = F(v->W(8), v->W(9));                                         \
+        r[1] = F(v->W(10), v->W(11));                                       \
+        r[2] = F(v->W(12), v->W(13));                                       \
+        r[3] = F(v->W(14), v->W(15));                                       \
+        r[4] = F(s->W(8), s->W(9));                                         \
+        r[5] = F(s->W(10), s->W(11));                                       \
+        r[6] = F(s->W(12), s->W(13));                                       \
+        r[7] = F(s->W(14), s->W(15));                                       \
+        d->W(8) = r[0];                                                     \
+        d->W(9) = r[1];                                                     \
+        d->W(10) = r[2];                                                    \
+        d->W(11) = r[3];                                                    \
+        d->W(12) = r[4];                                                    \
+        d->W(13) = r[5];                                                    \
+        d->W(14) = r[6];                                                    \
+        d->W(15) = r[7];                                                    \
+    )                                                                       \
+}
+
+#define SSE_HELPER_HL(name, F)                                              \
+void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) \
+{                                                                           \
+    int32_t r0, r1, r2, r3;                                                 \
+    r0 = F(v->L(0), v->L(1));                                               \
+    r1 = F(v->L(2), v->L(3));                                               \
+    r2 = F(s->L(0), s->L(1));                                               \
+    r3 = F(s->L(2), s->L(3));                                               \
+    d->L(0) = r0;                                                           \
+    d->L(1) = r1;                                                           \
+    d->L(2) = r2;                                                           \
+    d->L(3) = r3;                                                           \
+    YMM_ONLY(                                                               \
+        r0 = F(v->L(4), v->L(5));                                           \
+        r1 = F(v->L(6), v->L(7));                                           \
+        r2 = F(s->L(4), s->L(5));                                           \
+        r3 = F(s->L(6), s->L(7));                                           \
+        d->L(4) = r0;                                                       \
+        d->L(5) = r1;                                                       \
+        d->L(6) = r2;                                                       \
+        d->L(7) = r3;                                                       \
+    )                                                                       \
 }

-void glue(helper_phaddd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-{
-    Reg r;
-
-    r.L(0) = (int32_t)d->L(0) + (int32_t)d->L(1);
-    XMM_ONLY(r.L(1) = (int32_t)d->L(2) + (int32_t)d->L(3));
-    r.L((1 << SHIFT) + 0) = (int32_t)s->L(0) + (int32_t)s->L(1);
-    XMM_ONLY(r.L(3) = (int32_t)s->L(2) + (int32_t)s->L(3));
-
-    *d = r;
-}
-
-void glue(helper_phaddsw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-{
-    Reg r;
-
-    r.W(0) = satsw((int16_t)d->W(0) + (int16_t)d->W(1));
-    r.W(1) = satsw((int16_t)d->W(2) + (int16_t)d->W(3));
-    XMM_ONLY(r.W(2) = satsw((int16_t)d->W(4) + (int16_t)d->W(5)));
-    XMM_ONLY(r.W(3) = satsw((int16_t)d->W(6) + (int16_t)d->W(7)));
-    r.W((2 << SHIFT) + 0) = satsw((int16_t)s->W(0) + (int16_t)s->W(1));
-    r.W((2 << SHIFT) + 1) = satsw((int16_t)s->W(2) + (int16_t)s->W(3));
-    XMM_ONLY(r.W(6) = satsw((int16_t)s->W(4) + (int16_t)s->W(5)));
-    XMM_ONLY(r.W(7) = satsw((int16_t)s->W(6) + (int16_t)s->W(7)));
-
-    *d = r;
-}
-
-void glue(helper_pmaddubsw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-{
-    d->W(0) = satsw((int8_t)s->B(0) * (uint8_t)d->B(0) +
-                    (int8_t)s->B(1) * (uint8_t)d->B(1));
-    d->W(1) = satsw((int8_t)s->B(2) * (uint8_t)d->B(2) +
-                    (int8_t)s->B(3) * (uint8_t)d->B(3));
-    d->W(2) = satsw((int8_t)s->B(4) * (uint8_t)d->B(4) +
-                    (int8_t)s->B(5) * (uint8_t)d->B(5));
-    d->W(3) = satsw((int8_t)s->B(6) * (uint8_t)d->B(6) +
-                    (int8_t)s->B(7) * (uint8_t)d->B(7));
-#if SHIFT == 1
-    d->W(4) = satsw((int8_t)s->B(8) * (uint8_t)d->B(8) +
-                    (int8_t)s->B(9) * (uint8_t)d->B(9));
-    d->W(5) = satsw((int8_t)s->B(10) * (uint8_t)d->B(10) +
-                    (int8_t)s->B(11) * (uint8_t)d->B(11));
-    d->W(6) = satsw((int8_t)s->B(12) * (uint8_t)d->B(12) +
-                    (int8_t)s->B(13) * (uint8_t)d->B(13));
-    d->W(7) = satsw((int8_t)s->B(14) * (uint8_t)d->B(14) +
-                    (int8_t)s->B(15) * (uint8_t)d->B(15));
 #endif
-}

-void glue(helper_phsubw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-{
-    d->W(0) = (int16_t)d->W(0) - (int16_t)d->W(1);
-    d->W(1) = (int16_t)d->W(2) - (int16_t)d->W(3);
-    XMM_ONLY(d->W(2) = (int16_t)d->W(4) - (int16_t)d->W(5));
-    XMM_ONLY(d->W(3) = (int16_t)d->W(6) - (int16_t)d->W(7));
-    d->W((2 << SHIFT) + 0) = (int16_t)s->W(0) - (int16_t)s->W(1);
-    d->W((2 << SHIFT) + 1) = (int16_t)s->W(2) - (int16_t)s->W(3);
-    XMM_ONLY(d->W(6) = (int16_t)s->W(4) - (int16_t)s->W(5));
-    XMM_ONLY(d->W(7) = (int16_t)s->W(6) - (int16_t)s->W(7));
-}
-
-void glue(helper_phsubd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-{
-    d->L(0) = (int32_t)d->L(0) - (int32_t)d->L(1);
-    XMM_ONLY(d->L(1) = (int32_t)d->L(2) - (int32_t)d->L(3));
-    d->L((1 << SHIFT) + 0) = (int32_t)s->L(0) - (int32_t)s->L(1);
-    XMM_ONLY(d->L(3) = (int32_t)s->L(2) - (int32_t)s->L(3));
-}
-
-void glue(helper_phsubsw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-{
-    d->W(0) = satsw((int16_t)d->W(0) - (int16_t)d->W(1));
-    d->W(1) = satsw((int16_t)d->W(2) - (int16_t)d->W(3));
-    XMM_ONLY(d->W(2) = satsw((int16_t)d->W(4) - (int16_t)d->W(5)));
-    XMM_ONLY(d->W(3) = satsw((int16_t)d->W(6) - (int16_t)d->W(7)));
-    d->W((2 << SHIFT) + 0) = satsw((int16_t)s->W(0) - (int16_t)s->W(1));
-    d->W((2 << SHIFT) + 1) = satsw((int16_t)s->W(2) - (int16_t)s->W(3));
-    XMM_ONLY(d->W(6) = satsw((int16_t)s->W(4) - (int16_t)s->W(5)));
-    XMM_ONLY(d->W(7) = satsw((int16_t)s->W(6) - (int16_t)s->W(7)));
+SSE_HELPER_HW(phaddw, FADD)
+SSE_HELPER_HW(phsubw, FSUB)
+SSE_HELPER_HW(phaddsw, FADDSW)
+SSE_HELPER_HW(phsubsw, FSUBSW)
+SSE_HELPER_HL(phaddd, FADD)
+SSE_HELPER_HL(phsubd, FSUB)
+
+#undef SSE_HELPER_HW
+#undef SSE_HELPER_HL
+
+void glue(helper_pmaddubsw, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
+{
+    d->W(0) = satsw((int8_t)s->B(0) * (uint8_t)v->B(0) +
+                    (int8_t)s->B(1) * (uint8_t)v->B(1));
+    d->W(1) = satsw((int8_t)s->B(2) * (uint8_t)v->B(2) +
+                    (int8_t)s->B(3) * (uint8_t)v->B(3));
+    d->W(2) = satsw((int8_t)s->B(4) * (uint8_t)v->B(4) +
+                    (int8_t)s->B(5) * (uint8_t)v->B(5));
+    d->W(3) = satsw((int8_t)s->B(6) * (uint8_t)v->B(6) +
+                    (int8_t)s->B(7) * (uint8_t)v->B(7));
+#if SHIFT >= 1
+    d->W(4) = satsw((int8_t)s->B(8) * (uint8_t)v->B(8) +
+                    (int8_t)s->B(9) * (uint8_t)v->B(9));
+    d->W(5) = satsw((int8_t)s->B(10) * (uint8_t)v->B(10) +
+                    (int8_t)s->B(11) * (uint8_t)v->B(11));
+    d->W(6) = satsw((int8_t)s->B(12) * (uint8_t)v->B(12) +
+                    (int8_t)s->B(13) * (uint8_t)v->B(13));
+    d->W(7) = satsw((int8_t)s->B(14) * (uint8_t)v->B(14) +
+                    (int8_t)s->B(15) * (uint8_t)v->B(15));
#if SHIFT == 2
+    int i;
+    for (i = 8; i < 16; i++) {
+        d->W(i) = satsw((int8_t)s->B(i * 2) * (uint8_t)v->B(i * 2) +
+                        (int8_t)s->B(i * 2 + 1) * (uint8_t)v->B(i * 2 + 1));
+    }
+#endif
+#endif
 }
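/*
 * Illustrative sketch, not part of the patch: one output word of
 * pmaddubsw.  Bytes of v are treated as unsigned, bytes of s as signed,
 * and the two products are summed with signed-16 saturation, so
 * 255 * 127 + 255 * 127 = 64770 clamps to 32767.
 */
static int16_t pmaddubsw_word_sketch(uint8_t v0, uint8_t v1,
                                     int8_t s0, int8_t s1)
{
    int32_t sum = v0 * s0 + v1 * s1;
    if (sum > INT16_MAX) {
        sum = INT16_MAX;
    } else if (sum < INT16_MIN) {
        sum = INT16_MIN;
    }
    return sum;
}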

-#define FABSB(_, x) (x > INT8_MAX  ? -(int8_t)x : x)
-#define FABSW(_, x) (x > INT16_MAX ? -(int16_t)x : x)
-#define FABSL(_, x) (x > INT32_MAX ? -(int32_t)x : x)
-SSE_HELPER_B(helper_pabsb, FABSB)
-SSE_HELPER_W(helper_pabsw, FABSW)
-SSE_HELPER_L(helper_pabsd, FABSL)
+#define FABSB(x) (x > INT8_MAX  ? -(int8_t)x : x)
+#define FABSW(x) (x > INT16_MAX ? -(int16_t)x : x)
+#define FABSL(x) (x > INT32_MAX ? -(int32_t)x : x)
+SSE_HELPER_1(helper_pabsb, B, 8, FABSB)
+SSE_HELPER_1(helper_pabsw, W, 4, FABSW)
+SSE_HELPER_1(helper_pabsd, L, 2, FABSL)

 #define FMULHRSW(d, s) (((int16_t) d * (int16_t)s + 0x4000) >> 15)
 SSE_HELPER_W(helper_pmulhrsw, FMULHRSW)
@@ -1557,104 +2118,119 @@ SSE_HELPER_B(helper_psignb, FSIGNB)
 SSE_HELPER_W(helper_psignw, FSIGNW)
 SSE_HELPER_L(helper_psignd, FSIGNL)

-void glue(helper_palignr, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
+void glue(helper_palignr, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s,
                                   int32_t shift)
 {
-    Reg r;
     /* XXX could be checked during translation */
-    if (shift >= (16 << SHIFT)) {
-        r.Q(0) = 0;
-        XMM_ONLY(r.Q(1) = 0);
+    if (shift >= (SHIFT ? 32 : 16)) {
+        d->Q(0) = 0;
+        XMM_ONLY(d->Q(1) = 0);
+#if SHIFT == 2
+        d->Q(2) = 0;
+        d->Q(3) = 0;
+#endif
     } else {
         shift <<= 3;
#define SHR(v, i) (i < 64 && i > -64 ? i > 0 ? v >> (i) : (v << -(i)) : 0)
 #if SHIFT == 0
-        r.Q(0) = SHR(s->Q(0), shift - 0) |
-            SHR(d->Q(0), shift - 64);
+        d->Q(0) = SHR(s->Q(0), shift - 0) |
+            SHR(v->Q(0), shift - 64);
 #else
-        r.Q(0) = SHR(s->Q(0), shift - 0) |
-            SHR(s->Q(1), shift - 64) |
-            SHR(d->Q(0), shift - 128) |
-            SHR(d->Q(1), shift - 192);
-        r.Q(1) = SHR(s->Q(0), shift + 64) |
-            SHR(s->Q(1), shift - 0) |
-            SHR(d->Q(0), shift - 64) |
-            SHR(d->Q(1), shift - 128);
+        uint64_t r0, r1;
+
+        r0 = SHR(s->Q(0), shift - 0) |
+            SHR(s->Q(1), shift - 64) |
+            SHR(v->Q(0), shift - 128) |
+            SHR(v->Q(1), shift - 192);
+        r1 = SHR(s->Q(0), shift + 64) |
+            SHR(s->Q(1), shift - 0) |
+            SHR(v->Q(0), shift - 64) |
+            SHR(v->Q(1), shift - 128);
+        d->Q(0) = r0;
+        d->Q(1) = r1;
+#if SHIFT == 2
+        r0 = SHR(s->Q(2), shift - 0) |
+            SHR(s->Q(3), shift - 64) |
+            SHR(v->Q(2), shift - 128) |
+            SHR(v->Q(3), shift - 192);
+        r1 = SHR(s->Q(2), shift + 64) |
+            SHR(s->Q(3), shift - 0) |
+            SHR(v->Q(2), shift - 64) |
+            SHR(v->Q(3), shift - 128);
+        d->Q(2) = r0;
+        d->Q(3) = r1;
+#endif
 #endif
 #undef SHR
     }
-
-    *d = r;
 }
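/*
 * Illustrative sketch, not part of the patch: PALIGNR on one 128-bit
 * lane.  Conceptually v:s form a 32-byte value with s in the low half;
 * the result is the 16 bytes starting "shift" bytes into it, which is
 * what the SHR() double-shift expression above computes 64 bits at a time.
 */
static void palignr_lane_sketch(uint8_t d[16], const uint8_t v[16],
                                const uint8_t s[16], unsigned shift)
{
    uint8_t cat[32];
    for (int i = 0; i < 16; i++) {
        cat[i] = s[i];        /* low 16 bytes come from s */
        cat[i + 16] = v[i];   /* high 16 bytes come from v */
    }
    for (int i = 0; i < 16; i++) {
        d[i] = (shift + i < 32) ? cat[shift + i] : 0;
    }
}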

-#define XMM0 (env->xmm_regs[0])
+#if SHIFT >= 1
+
+#define BLEND_V128(elem, num, F, b) do {                                    \
+    d->elem(b + 0) = F(v->elem(b + 0), s->elem(b + 0), m->elem(b + 0));     \
+    d->elem(b + 1) = F(v->elem(b + 1), s->elem(b + 1), m->elem(b + 1));     \
+    if (num > 2) {                                                          \
+        d->elem(b + 2) = F(v->elem(b + 2), s->elem(b + 2), m->elem(b + 2)); \
+        d->elem(b + 3) = F(v->elem(b + 3), s->elem(b + 3), m->elem(b + 3)); \
+    }                                                                       \
+    if (num > 4) {                                                          \
+        d->elem(b + 4) = F(v->elem(b + 4), s->elem(b + 4), m->elem(b + 4)); \
+        d->elem(b + 5) = F(v->elem(b + 5), s->elem(b + 5), m->elem(b + 5)); \
+        d->elem(b + 6) = F(v->elem(b + 6), s->elem(b + 6), m->elem(b + 6)); \
+        d->elem(b + 7) = F(v->elem(b + 7), s->elem(b + 7), m->elem(b + 7)); \
+    }                                                                       \
+    if (num > 8) {                                                          \
+        d->elem(b + 8) = F(v->elem(b + 8), s->elem(b + 8), m->elem(b + 8)); \
+        d->elem(b + 9) = F(v->elem(b + 9), s->elem(b + 9), m->elem(b + 9)); \
+        d->elem(b + 10) = F(v->elem(b + 10), s->elem(b + 10), m->elem(b + 10));\
+        d->elem(b + 11) = F(v->elem(b + 11), s->elem(b + 11), m->elem(b + 11));\
+        d->elem(b + 12) = F(v->elem(b + 12), s->elem(b + 12), m->elem(b + 12));\
+        d->elem(b + 13) = F(v->elem(b + 13), s->elem(b + 13), m->elem(b + 13));\
+        d->elem(b + 14) = F(v->elem(b + 14), s->elem(b + 14), m->elem(b + 14));\
+        d->elem(b + 15) = F(v->elem(b + 15), s->elem(b + 15), m->elem(b + 15));\
+    }                                                                       \
+    } while (0)

-#if SHIFT == 1
 #define SSE_HELPER_V(name, elem, num, F)                                \
-    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)           \
+    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s,   \
+                            Reg *m)                                     \
    {                                                                   \
-        d->elem(0) = F(d->elem(0), s->elem(0), XMM0.elem(0));           \
-        d->elem(1) = F(d->elem(1), s->elem(1), XMM0.elem(1));           \
-        if (num > 2) {                                                  \
-            d->elem(2) = F(d->elem(2), s->elem(2), XMM0.elem(2));       \
-            d->elem(3) = F(d->elem(3), s->elem(3), XMM0.elem(3));       \
-            if (num > 4) {                                              \
-                d->elem(4) = F(d->elem(4), s->elem(4), XMM0.elem(4));   \
-                d->elem(5) = F(d->elem(5), s->elem(5), XMM0.elem(5));   \
-                d->elem(6) = F(d->elem(6), s->elem(6), XMM0.elem(6));   \
-                d->elem(7) = F(d->elem(7), s->elem(7), XMM0.elem(7));   \
-                if (num > 8) {                                          \
-                    d->elem(8) = F(d->elem(8), s->elem(8), XMM0.elem(8)); \
-                    d->elem(9) = F(d->elem(9), s->elem(9), XMM0.elem(9)); \
-                    d->elem(10) = F(d->elem(10), s->elem(10), XMM0.elem(10)); \
-                    d->elem(11) = F(d->elem(11), s->elem(11), XMM0.elem(11)); \
-                    d->elem(12) = F(d->elem(12), s->elem(12), XMM0.elem(12)); \
-                    d->elem(13) = F(d->elem(13), s->elem(13), XMM0.elem(13)); \
-                    d->elem(14) = F(d->elem(14), s->elem(14), XMM0.elem(14)); \
-                    d->elem(15) = F(d->elem(15), s->elem(15), XMM0.elem(15)); \
-                }                                                       \
-            }                                                           \
-        }                                                               \
-    }
+        BLEND_V128(elem, num, F, 0);                                    \
+        YMM_ONLY(BLEND_V128(elem, num, F, num);)                        \
+    }
+
+#define BLEND_I128(elem, num, F, b) do {                                    \
+    d->elem(b + 0) = F(v->elem(b + 0), s->elem(b + 0), ((imm >> 0) & 1));   \
+    d->elem(b + 1) = F(v->elem(b + 1), s->elem(b + 1), ((imm >> 1) & 1));   \
+    if (num > 2) {                                                          \
+        d->elem(b + 2) = F(v->elem(b + 2), s->elem(b + 2), ((imm >> 2) & 1)); \
+        d->elem(b + 3) = F(v->elem(b + 3), s->elem(b + 3), ((imm >> 3) & 1)); \
+    }                                                                       \
+    if (num > 4) {                                                          \
+        d->elem(b + 4) = F(v->elem(b + 4), s->elem(b + 4), ((imm >> 4) & 1)); \
+        d->elem(b + 5) = F(v->elem(b + 5), s->elem(b + 5), ((imm >> 5) & 1)); \
+        d->elem(b + 6) = F(v->elem(b + 6), s->elem(b + 6), ((imm >> 6) & 1)); \
+        d->elem(b + 7) = F(v->elem(b + 7), s->elem(b + 7), ((imm >> 7) & 1)); \
+    }                                                                       \
+    } while (0)

 #define SSE_HELPER_I(name, elem, num, F)                                \
-    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t imm) \
+    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s,   \
+                            uint32_t imm)                               \
    {                                                                   \
-        d->elem(0) = F(d->elem(0), s->elem(0), ((imm >> 0) & 1));       \
-        d->elem(1) = F(d->elem(1), s->elem(1), ((imm >> 1) & 1));       \
-        if (num > 2) {                                                  \
-            d->elem(2) = F(d->elem(2), s->elem(2), ((imm >> 2) & 1));   \
-            d->elem(3) = F(d->elem(3), s->elem(3), ((imm >> 3) & 1));   \
-            if (num > 4) {                                              \
-                d->elem(4) = F(d->elem(4), s->elem(4), ((imm >> 4) & 1)); \
-                d->elem(5) = F(d->elem(5), s->elem(5), ((imm >> 5) & 1)); \
-                d->elem(6) = F(d->elem(6), s->elem(6), ((imm >> 6) & 1)); \
-                d->elem(7) = F(d->elem(7), s->elem(7), ((imm >> 7) & 1)); \
-                if (num > 8) {                                          \
-                    d->elem(8) = F(d->elem(8), s->elem(8), ((imm >> 8) & 1)); \
-                    d->elem(9) = F(d->elem(9), s->elem(9), ((imm >> 9) & 1)); \
-                    d->elem(10) = F(d->elem(10), s->elem(10),           \
-                                    ((imm >> 10) & 1));                 \
-                    d->elem(11) = F(d->elem(11), s->elem(11),           \
-                                    ((imm >> 11) & 1));                 \
-                    d->elem(12) = F(d->elem(12), s->elem(12),           \
-                                    ((imm >> 12) & 1));                 \
-                    d->elem(13) = F(d->elem(13), s->elem(13),           \
-                                    ((imm >> 13) & 1));                 \
-                    d->elem(14) = F(d->elem(14), s->elem(14),           \
-                                    ((imm >> 14) & 1));                 \
-                    d->elem(15) = F(d->elem(15), s->elem(15),           \
-                                    ((imm >> 15) & 1));                 \
-                }                                                       \
-            }                                                           \
-        }                                                               \
+        BLEND_I128(elem, num, F, 0);                                    \
+        YMM_ONLY(                                                       \
+            if (num < 8)                                                \
+                imm >>= num;                                            \
+            BLEND_I128(elem, num, F, num);                              \
+        )                                                               \
    }

 /* SSE4.1 op helpers */
-#define FBLENDVB(d, s, m) ((m & 0x80) ? s : d)
-#define FBLENDVPS(d, s, m) ((m & 0x80000000) ? s : d)
-#define FBLENDVPD(d, s, m) ((m & 0x8000000000000000LL) ? s : d)
+#define FBLENDVB(v, s, m) ((m & 0x80) ? s : v)
+#define FBLENDVPS(v, s, m) ((m & 0x80000000) ? s : v)
+#define FBLENDVPD(v, s, m) ((m & 0x8000000000000000LL) ? s : v)
 SSE_HELPER_V(helper_pblendvb, B, 16, FBLENDVB)
 SSE_HELPER_V(helper_blendvps, L, 4, FBLENDVPS)
 SSE_HELPER_V(helper_blendvpd, Q, 2, FBLENDVPD)
@@ -1664,14 +2240,28 @@ void glue(helper_ptest, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
     uint64_t zf = (s->Q(0) & d->Q(0)) | (s->Q(1) & d->Q(1));
     uint64_t cf = (s->Q(0) & ~d->Q(0)) | (s->Q(1) & ~d->Q(1));

+#if SHIFT == 2
+    zf |= (s->Q(2) & d->Q(2)) | (s->Q(3) & d->Q(3));
+    cf |= (s->Q(2) & ~d->Q(2)) | (s->Q(3) & ~d->Q(3));
+#endif
     CC_SRC = (zf ? 0 : CC_Z) | (cf ? 0 : CC_C);
 }
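/*
 * Illustrative sketch, not part of the patch: the PTEST condition codes
 * computed above.  ZF is set when s AND d is all zero, CF when s AND NOT d
 * is all zero; the helper just ORs per-quadword partial results.  The
 * CC_Z/CC_C values are placeholders here.
 */
static uint32_t ptest_sketch(const uint64_t d[2], const uint64_t s[2])
{
    enum { SK_CC_C = 0x0001, SK_CC_Z = 0x0040 };  /* x86 EFLAGS bit values */
    uint64_t zf = (s[0] & d[0]) | (s[1] & d[1]);
    uint64_t cf = (s[0] & ~d[0]) | (s[1] & ~d[1]);
    return (zf ? 0 : SK_CC_Z) | (cf ? 0 : SK_CC_C);
}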

 #define SSE_HELPER_F(name, elem, num, F)                        \
    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)   \
    {                                                           \
-        if (num > 2) {                                          \
-            if (num > 4) {                                      \
+        if (num * SHIFT > 2) {                                  \
+            if (num * SHIFT > 8) {                              \
+                d->elem(15) = F(15);                            \
+                d->elem(14) = F(14);                            \
+                d->elem(13) = F(13);                            \
+                d->elem(12) = F(12);                            \
+                d->elem(11) = F(11);                            \
+                d->elem(10) = F(10);                            \
+                d->elem(9) = F(9);                              \
+                d->elem(8) = F(8);                              \
+            }                                                   \
+            if (num * SHIFT > 4) {                              \
                d->elem(7) = F(7);                              \
                d->elem(6) = F(6);                              \
                d->elem(5) = F(5);                              \
@@ -1697,28 +2287,57 @@ SSE_HELPER_F(helper_pmovzxwd, L, 4, s->W)
 SSE_HELPER_F(helper_pmovzxwq, Q, 2, s->W)
 SSE_HELPER_F(helper_pmovzxdq, Q, 2, s->L)

-void glue(helper_pmuldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_pmuldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
-    d->Q(0) = (int64_t)(int32_t) d->L(0) * (int32_t) s->L(0);
-    d->Q(1) = (int64_t)(int32_t) d->L(2) * (int32_t) s->L(2);
+    d->Q(0) = (int64_t)(int32_t) v->L(0) * (int32_t) s->L(0);
+    d->Q(1) = (int64_t)(int32_t) v->L(2) * (int32_t) s->L(2);
+#if SHIFT == 2
+    d->Q(2) = (int64_t)(int32_t) v->L(4) * (int32_t) s->L(4);
+    d->Q(3) = (int64_t)(int32_t) v->L(6) * (int32_t) s->L(6);
+#endif
 }

 #define FCMPEQQ(d, s) (d == s ? -1 : 0)
 SSE_HELPER_Q(helper_pcmpeqq, FCMPEQQ)

-void glue(helper_packusdw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-{
-    Reg r;
-
-    r.W(0) = satuw((int32_t) d->L(0));
-    r.W(1) = satuw((int32_t) d->L(1));
-    r.W(2) = satuw((int32_t) d->L(2));
-    r.W(3) = satuw((int32_t) d->L(3));
-    r.W(4) = satuw((int32_t) s->L(0));
-    r.W(5) = satuw((int32_t) s->L(1));
-    r.W(6) = satuw((int32_t) s->L(2));
-    r.W(7) = satuw((int32_t) s->L(3));
-    *d = r;
+void glue(helper_packusdw, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
+{
+    uint16_t r[8];
+
+    r[0] = satuw((int32_t) v->L(0));
+    r[1] = satuw((int32_t) v->L(1));
+    r[2] = satuw((int32_t) v->L(2));
+    r[3] = satuw((int32_t) v->L(3));
+    r[4] = satuw((int32_t) s->L(0));
+    r[5] = satuw((int32_t) s->L(1));
+    r[6] = satuw((int32_t) s->L(2));
+    r[7] = satuw((int32_t) s->L(3));
+    d->W(0) = r[0];
+    d->W(1) = r[1];
+    d->W(2) = r[2];
+    d->W(3) = r[3];
+    d->W(4) = r[4];
+    d->W(5) = r[5];
+    d->W(6) = r[6];
+    d->W(7) = r[7];
+#if SHIFT == 2
+    r[0] = satuw((int32_t) v->L(4));
+    r[1] = satuw((int32_t) v->L(5));
+    r[2] = satuw((int32_t) v->L(6));
+    r[3] = satuw((int32_t) v->L(7));
+    r[4] = satuw((int32_t) s->L(4));
+    r[5] = satuw((int32_t) s->L(5));
+    r[6] = satuw((int32_t) s->L(6));
+    r[7] = satuw((int32_t) s->L(7));
+    d->W(8) = r[0];
+    d->W(9) = r[1];
+    d->W(10) = r[2];
+    d->W(11) = r[3];
+    d->W(12) = r[4];
+    d->W(13) = r[5];
+    d->W(14) = r[6];
+    d->W(15) = r[7];
+#endif
 }

 #define FMINSB(d, s) MIN((int8_t)d, (int8_t)s)
@@ -1737,6 +2356,7 @@ SSE_HELPER_L(helper_pmaxud, MAX)
 #define FMULLD(d, s) ((int32_t)d * (int32_t)s)
 SSE_HELPER_L(helper_pmulld, FMULLD)

+#if SHIFT == 1
 void glue(helper_phminposuw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
     int idx = 0;
@@ -1768,6 +2388,7 @@ void glue(helper_phminposuw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
     d->L(1) = 0;
     d->Q(1) = 0;
 }
+#endif

 void glue(helper_roundps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
                                   uint32_t mode)
@@ -1797,6 +2418,12 @@ void glue(helper_roundps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
     d->ZMM_S(1) = float32_round_to_int(s->ZMM_S(1), &env->sse_status);
     d->ZMM_S(2) = float32_round_to_int(s->ZMM_S(2), &env->sse_status);
    d->ZMM_S(3) = float32_round_to_int(s->ZMM_S(3), &env->sse_status);
+#if SHIFT == 2
+    d->ZMM_S(4) = float32_round_to_int(s->ZMM_S(4), &env->sse_status);
+    d->ZMM_S(5) = float32_round_to_int(s->ZMM_S(5), &env->sse_status);
+    d->ZMM_S(6) = float32_round_to_int(s->ZMM_S(6), &env->sse_status);
+    d->ZMM_S(7) = float32_round_to_int(s->ZMM_S(7), &env->sse_status);
+#endif

     if (mode & (1 << 3) && !(old_flags & float_flag_inexact)) {
         set_float_exception_flags(get_float_exception_flags(&env->sse_status) &
@@ -1832,6 +2459,10 @@ void glue(helper_roundpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,

     d->ZMM_D(0) = float64_round_to_int(s->ZMM_D(0), &env->sse_status);
     d->ZMM_D(1) = float64_round_to_int(s->ZMM_D(1), &env->sse_status);
+#if SHIFT == 2
+    d->ZMM_D(2) = float64_round_to_int(s->ZMM_D(2), &env->sse_status);
+    d->ZMM_D(3) = float64_round_to_int(s->ZMM_D(3), &env->sse_status);
+#endif

     if (mode & (1 << 3) && !(old_flags & float_flag_inexact)) {
         set_float_exception_flags(get_float_exception_flags(&env->sse_status) &
@@ -1841,7 +2472,8 @@ void glue(helper_roundpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
     env->sse_status.float_rounding_mode = prev_rounding_mode;
 }

-void glue(helper_roundss, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
+#if SHIFT == 1
+void helper_roundss_xmm(CPUX86State *env, Reg *d, Reg *s,
                                   uint32_t mode)
 {
     uint8_t old_flags = get_float_exception_flags(&env->sse_status);
@@ -1875,7 +2507,7 @@ void glue(helper_roundss, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
     env->sse_status.float_rounding_mode = prev_rounding_mode;
 }

-void glue(helper_roundsd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
+void helper_roundsd_xmm(CPUX86State *env, Reg *d, Reg *s,
                                   uint32_t mode)
 {
     uint8_t old_flags = get_float_exception_flags(&env->sse_status);
@@ -1908,99 +2540,158 @@ void glue(helper_roundsd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
     }
     env->sse_status.float_rounding_mode = prev_rounding_mode;
 }
+#endif

-#define FBLENDP(d, s, m) (m ? s : d)
+#define FBLENDP(v, s, m) (m ? s : v)
 SSE_HELPER_I(helper_blendps, L, 4, FBLENDP)
 SSE_HELPER_I(helper_blendpd, Q, 2, FBLENDP)
 SSE_HELPER_I(helper_pblendw, W, 8, FBLENDP)

-void glue(helper_dpps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t mask)
+void glue(helper_dpps, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s,
+                               uint32_t mask)
 {
-    float32 iresult = float32_zero;
+    float32 prod, iresult, iresult2;

+    /*
+     * We must evaluate (A+B)+(C+D), not ((A+B)+C)+D
+     * to correctly round the intermediate results
+     */
     if (mask & (1 << 4)) {
-        iresult = float32_add(iresult,
-                              float32_mul(d->ZMM_S(0), s->ZMM_S(0),
-                                          &env->sse_status),
-                              &env->sse_status);
+        iresult = float32_mul(v->ZMM_S(0), s->ZMM_S(0), &env->sse_status);
+    } else {
+        iresult = float32_zero;
     }
     if (mask & (1 << 5)) {
-        iresult = float32_add(iresult,
-                              float32_mul(d->ZMM_S(1), s->ZMM_S(1),
-                                          &env->sse_status),
-                              &env->sse_status);
+        prod = float32_mul(v->ZMM_S(1), s->ZMM_S(1), &env->sse_status);
+    } else {
+        prod = float32_zero;
     }
+    iresult = float32_add(iresult, prod, &env->sse_status);
     if (mask & (1 << 6)) {
-        iresult = float32_add(iresult,
-                              float32_mul(d->ZMM_S(2), s->ZMM_S(2),
-                                          &env->sse_status),
-                              &env->sse_status);
+        iresult2 = float32_mul(v->ZMM_S(2), s->ZMM_S(2), &env->sse_status);
+    } else {
+        iresult2 = float32_zero;
     }
     if (mask & (1 << 7)) {
-        iresult = float32_add(iresult,
-                              float32_mul(d->ZMM_S(3), s->ZMM_S(3),
-                                          &env->sse_status),
-                              &env->sse_status);
+        prod = float32_mul(v->ZMM_S(3), s->ZMM_S(3), &env->sse_status);
+    } else {
+        prod = float32_zero;
     }
+    iresult2 = float32_add(iresult2, prod, &env->sse_status);
+    iresult = float32_add(iresult, iresult2, &env->sse_status);
+
     d->ZMM_S(0) = (mask & (1 << 0)) ? iresult : float32_zero;
     d->ZMM_S(1) = (mask & (1 << 1)) ? iresult : float32_zero;
     d->ZMM_S(2) = (mask & (1 << 2)) ? iresult : float32_zero;
     d->ZMM_S(3) = (mask & (1 << 3)) ? iresult : float32_zero;
+#if SHIFT == 2
+    if (mask & (1 << 4)) {
+        iresult = float32_mul(v->ZMM_S(4), s->ZMM_S(4), &env->sse_status);
+    } else {
+        iresult = float32_zero;
+    }
+    if (mask & (1 << 5)) {
+        prod = float32_mul(v->ZMM_S(5), s->ZMM_S(5), &env->sse_status);
+    } else {
+        prod = float32_zero;
+    }
+    iresult = float32_add(iresult, prod, &env->sse_status);
+    if (mask & (1 << 6)) {
+        iresult2 = float32_mul(v->ZMM_S(6), s->ZMM_S(6), &env->sse_status);
+    } else {
+        iresult2 = float32_zero;
+    }
+    if (mask & (1 << 7)) {
+        prod = float32_mul(v->ZMM_S(7), s->ZMM_S(7), &env->sse_status);
+    } else {
+        prod = float32_zero;
+    }
+    iresult2 = float32_add(iresult2, prod, &env->sse_status);
+    iresult = float32_add(iresult, iresult2, &env->sse_status);
+
+    d->ZMM_S(4) = (mask & (1 << 0)) ? iresult : float32_zero;
+    d->ZMM_S(5) = (mask & (1 << 1)) ? iresult : float32_zero;
+    d->ZMM_S(6) = (mask & (1 << 2)) ? iresult : float32_zero;
+    d->ZMM_S(7) = (mask & (1 << 3)) ? iresult : float32_zero;
+#endif
 }
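/*
 * Illustrative sketch, not part of the patch: why the dot product above
 * sums pairwise.  With A = 1.0f, B = 0.0f and C = D = 0x1p-24f, the
 * sequential ((A+B)+C)+D rounds to 1.0f twice and loses both tiny terms,
 * while the pairwise (A+B)+(C+D) first forms 0x1p-23f and yields the
 * correctly rounded 1.0f + 0x1p-23f (assumes round-to-nearest-even).
 */
static float dpps_sum_orders_sketch(void)
{
    float a = 1.0f, b = 0.0f, c = 0x1p-24f, d = 0x1p-24f;
    float sequential = ((a + b) + c) + d;   /* 1.0f */
    float pairwise = (a + b) + (c + d);     /* 1.0f + 0x1p-23f */
    return pairwise - sequential;           /* 0x1p-23f, not zero */
}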

-void glue(helper_dppd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t mask)
+#if SHIFT == 1
+/* Oddly, there is no ymm version of dppd */
+void glue(helper_dppd, SUFFIX)(CPUX86State *env,
+                               Reg *d, Reg *v, Reg *s, uint32_t mask)
 {
-    float64 iresult = float64_zero;
+    float64 iresult;

     if (mask & (1 << 4)) {
-        iresult = float64_add(iresult,
-                              float64_mul(d->ZMM_D(0), s->ZMM_D(0),
-                                          &env->sse_status),
-                              &env->sse_status);
+        iresult = float64_mul(v->ZMM_D(0), s->ZMM_D(0), &env->sse_status);
+    } else {
+        iresult = float64_zero;
     }
+
     if (mask & (1 << 5)) {
         iresult = float64_add(iresult,
-                              float64_mul(d->ZMM_D(1), s->ZMM_D(1),
+                              float64_mul(v->ZMM_D(1), s->ZMM_D(1),
                                           &env->sse_status),
                               &env->sse_status);
     }
     d->ZMM_D(0) = (mask & (1 << 0)) ? iresult : float64_zero;
     d->ZMM_D(1) = (mask & (1 << 1)) ? iresult : float64_zero;
 }
+#endif

-void glue(helper_mpsadbw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
+void glue(helper_mpsadbw, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s,
                                   uint32_t offset)
 {
     int s0 = (offset & 3) << 2;
     int d0 = (offset & 4) << 0;
     int i;
-    Reg r;
+    uint16_t r[8];

     for (i = 0; i < 8; i++, d0++) {
-        r.W(i) = 0;
-        r.W(i) += abs1(d->B(d0 + 0) - s->B(s0 + 0));
-        r.W(i) += abs1(d->B(d0 + 1) - s->B(s0 + 1));
-        r.W(i) += abs1(d->B(d0 + 2) - s->B(s0 + 2));
-        r.W(i) += abs1(d->B(d0 + 3) - s->B(s0 + 3));
+        r[i] = 0;
+        r[i] += abs1(v->B(d0 + 0) - s->B(s0 + 0));
+        r[i] += abs1(v->B(d0 + 1) - s->B(s0 + 1));
+        r[i] += abs1(v->B(d0 + 2) - s->B(s0 + 2));
+        r[i] += abs1(v->B(d0 + 3) - s->B(s0 + 3));
     }
+    for (i = 0; i < 8; i++) {
+        d->W(i) = r[i];
+    }
+#if SHIFT == 2
+    s0 = ((offset & 0x18) >> 1) + 16;
+    d0 = ((offset & 0x20) >> 3) + 16;

-    *d = r;
+    for (i = 0; i < 8; i++, d0++) {
+        r[i] = 0;
+        r[i] += abs1(v->B(d0 + 0) - s->B(s0 + 0));
+        r[i] += abs1(v->B(d0 + 1) - s->B(s0 + 1));
+        r[i] += abs1(v->B(d0 + 2) - s->B(s0 + 2));
+        r[i] += abs1(v->B(d0 + 3) - s->B(s0 + 3));
+    }
+    for (i = 0; i < 8; i++) {
+        d->W(i + 8) = r[i];
+    }
+#endif
 }

 /* SSE4.2 op helpers */
 #define FCMPGTQ(d, s) ((int64_t)d > (int64_t)s ? -1 : 0)
 SSE_HELPER_Q(helper_pcmpgtq, FCMPGTQ)

+#if SHIFT == 1
 static inline int pcmp_elen(CPUX86State *env, int reg, uint32_t ctrl)
 {
-    int val;
+    int64_t val;

     /* Presence of REX.W is indicated by a bit higher than 7 set */
     if (ctrl >> 8) {
-        val = abs1((int64_t)env->regs[reg]);
+        val = env->regs[reg];
     } else {
-        val = abs1((int32_t)env->regs[reg]);
+        val = (int32_t)env->regs[reg];
     }
+    if (val < 0) {
+        val = 16;
+    }

     if (ctrl & 1) {
         if (val > 8) {
@@ -2213,14 +2904,16 @@ target_ulong helper_crc32(uint32_t crc1, target_ulong msg, uint32_t len)
     return crc;
 }

-void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
-                                    uint32_t ctrl)
+#endif
+
+#if SHIFT == 1
+static void clmulq(uint64_t *dest_l, uint64_t *dest_h,
+                   uint64_t a, uint64_t b)
 {
-    uint64_t ah, al, b, resh, resl;
+    uint64_t al, ah, resh, resl;

     ah = 0;
-    al = d->Q((ctrl & 1) != 0);
-    b = s->Q((ctrl & 16) != 0);
+    al = a;
     resh = resl = 0;

     while (b) {
@@ -2233,71 +2926,115 @@ void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
         b >>= 1;
     }

-    d->Q(0) = resl;
-    d->Q(1) = resh;
+    *dest_l = resl;
+    *dest_h = resh;
 }
+#endif

-void glue(helper_aesdec, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s,
+                                    uint32_t ctrl)
+{
+    uint64_t a, b;
+
+    a = v->Q((ctrl & 1) != 0);
+    b = s->Q((ctrl & 16) != 0);
+    clmulq(&d->Q(0), &d->Q(1), a, b);
+#if SHIFT == 2
+    a = v->Q(((ctrl & 1) != 0) + 2);
+    b = s->Q(((ctrl & 16) != 0) + 2);
+    clmulq(&d->Q(2), &d->Q(3), a, b);
+#endif
+}
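/*
 * Illustrative sketch, not part of the patch: carry-less multiplication
 * over small operands.  Partial products are combined with XOR instead of
 * ADD, so clmul(3, 3) = 0b11 ^ 0b110 = 5 rather than 9.
 */
static uint64_t clmul_sketch(uint64_t a, uint64_t b)
{
    uint64_t res = 0;
    while (b) {
        if (b & 1) {
            res ^= a;   /* XOR replaces the carrying add */
        }
        a <<= 1;        /* low 64 result bits only in this sketch */
        b >>= 1;
    }
    return res;
}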

+void glue(helper_aesdec, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
     int i;
-    Reg st = *d;
+    Reg st = *v;
     Reg rk = *s;

     for (i = 0 ; i < 4 ; i++) {
-        d->L(i) = rk.L(i) ^ bswap32(AES_Td0[st.B(AES_ishifts[4*i+0])] ^
-                                    AES_Td1[st.B(AES_ishifts[4*i+1])] ^
-                                    AES_Td2[st.B(AES_ishifts[4*i+2])] ^
-                                    AES_Td3[st.B(AES_ishifts[4*i+3])]);
+        d->L(i) = rk.L(i) ^ bswap32(AES_Td0[st.B(AES_ishifts[4 * i + 0])] ^
+                                    AES_Td1[st.B(AES_ishifts[4 * i + 1])] ^
+                                    AES_Td2[st.B(AES_ishifts[4 * i + 2])] ^
+                                    AES_Td3[st.B(AES_ishifts[4 * i + 3])]);
     }
+#if SHIFT == 2
+    for (i = 0 ; i < 4 ; i++) {
+        d->L(i + 4) = rk.L(i + 4) ^ bswap32(
+                AES_Td0[st.B(AES_ishifts[4 * i + 0] + 16)] ^
+                AES_Td1[st.B(AES_ishifts[4 * i + 1] + 16)] ^
+                AES_Td2[st.B(AES_ishifts[4 * i + 2] + 16)] ^
+                AES_Td3[st.B(AES_ishifts[4 * i + 3] + 16)]);
+    }
+#endif
 }

-void glue(helper_aesdeclast, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_aesdeclast, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
     int i;
-    Reg st = *d;
+    Reg st = *v;
     Reg rk = *s;

     for (i = 0; i < 16; i++) {
         d->B(i) = rk.B(i) ^ (AES_isbox[st.B(AES_ishifts[i])]);
     }
+#if SHIFT == 2
+    for (i = 0; i < 16; i++) {
+        d->B(i + 16) = rk.B(i + 16) ^ (AES_isbox[st.B(AES_ishifts[i] + 16)]);
+    }
+#endif
 }

-void glue(helper_aesenc, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_aesenc, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
     int i;
-    Reg st = *d;
+    Reg st = *v;
     Reg rk = *s;

     for (i = 0 ; i < 4 ; i++) {
-        d->L(i) = rk.L(i) ^ bswap32(AES_Te0[st.B(AES_shifts[4*i+0])] ^
-                                    AES_Te1[st.B(AES_shifts[4*i+1])] ^
-                                    AES_Te2[st.B(AES_shifts[4*i+2])] ^
-                                    AES_Te3[st.B(AES_shifts[4*i+3])]);
+        d->L(i) = rk.L(i) ^ bswap32(AES_Te0[st.B(AES_shifts[4 * i + 0])] ^
+                                    AES_Te1[st.B(AES_shifts[4 * i + 1])] ^
+                                    AES_Te2[st.B(AES_shifts[4 * i + 2])] ^
+                                    AES_Te3[st.B(AES_shifts[4 * i + 3])]);
+    }
+#if SHIFT == 2
+    for (i = 0 ; i < 4 ; i++) {
+        d->L(i + 4) = rk.L(i + 4) ^ bswap32(
+                AES_Te0[st.B(AES_shifts[4 * i + 0] + 16)] ^
+                AES_Te1[st.B(AES_shifts[4 * i + 1] + 16)] ^
+                AES_Te2[st.B(AES_shifts[4 * i + 2] + 16)] ^
+                AES_Te3[st.B(AES_shifts[4 * i + 3] + 16)]);
     }
+#endif
 }

-void glue(helper_aesenclast, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_aesenclast, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
     int i;
-    Reg st = *d;
+    Reg st = *v;
     Reg rk = *s;

     for (i = 0; i < 16; i++) {
         d->B(i) = rk.B(i) ^ (AES_sbox[st.B(AES_shifts[i])]);
     }
-
+#if SHIFT == 2
+    for (i = 0; i < 16; i++) {
+        d->B(i + 16) = rk.B(i + 16) ^ (AES_sbox[st.B(AES_shifts[i] + 16)]);
+    }
+#endif
 }

+#if SHIFT == 1
 void glue(helper_aesimc, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
     int i;
     Reg tmp = *s;

     for (i = 0 ; i < 4 ; i++) {
-        d->L(i) = bswap32(AES_imc[tmp.B(4*i+0)][0] ^
-                          AES_imc[tmp.B(4*i+1)][1] ^
-                          AES_imc[tmp.B(4*i+2)][2] ^
-                          AES_imc[tmp.B(4*i+3)][3]);
+        d->L(i) = bswap32(AES_imc[tmp.B(4 * i + 0)][0] ^
+                          AES_imc[tmp.B(4 * i + 1)][1] ^
+                          AES_imc[tmp.B(4 * i + 2)][2] ^
+                          AES_imc[tmp.B(4 * i + 3)][3]);
     }
 }

@@ -2315,9 +3052,430 @@ void glue(helper_aeskeygenassist, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
     d->L(3) = (d->L(2) << 24 | d->L(2) >> 8) ^ ctrl;
 }
 #endif
+#endif
+
+#if SHIFT >= 1
+void glue(helper_vbroadcastb, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+    uint8_t val = s->B(0);
+    int i;
+
+    for (i = 0; i < 16 * SHIFT; i++) {
+        d->B(i) = val;
+    }
+}
+
+void glue(helper_vbroadcastw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+    uint16_t val = s->W(0);
+    int i;
+
+    for (i = 0; i < 8 * SHIFT; i++) {
+        d->W(i) = val;
+    }
+}
+
+void glue(helper_vbroadcastl, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+    uint32_t val = s->L(0);
+    int i;
+
+    for (i = 0; i < 8 * SHIFT; i++) {
+        d->L(i) = val;
+    }
+}
+
+void glue(helper_vbroadcastq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+    uint64_t val = s->Q(0);
+    d->Q(0) = val;
+    d->Q(1) = val;
+#if SHIFT == 2
+    d->Q(2) = val;
+    d->Q(3) = val;
+#endif
+}
+
+void glue(helper_vpermilpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
+{
+    uint64_t r0, r1;
+
+    r0 = v->Q((s->Q(0) >> 1) & 1);
+    r1 = v->Q((s->Q(1) >> 1) & 1);
+    d->Q(0) = r0;
+    d->Q(1) = r1;
+#if SHIFT == 2
+    r0 = v->Q(((s->Q(2) >> 1) & 1) + 2);
+    r1 = v->Q(((s->Q(3) >> 1) & 1) + 2);
+    d->Q(2) = r0;
+    d->Q(3) = r1;
+#endif
+}
+
+void glue(helper_vpermilps, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
+{
+    uint32_t r0, r1, r2, r3;
+
+    r0 = v->L(s->L(0) & 3);
+    r1 = v->L(s->L(1) & 3);
+    r2 = v->L(s->L(2) & 3);
+    r3 = v->L(s->L(3) & 3);
+    d->L(0) = r0;
+    d->L(1) = r1;
+    d->L(2) = r2;
+    d->L(3) = r3;
+#if SHIFT == 2
+    r0 = v->L((s->L(4) & 3) + 4);
+    r1 = v->L((s->L(5) & 3) + 4);
+    r2 = v->L((s->L(6) & 3) + 4);
+    r3 = v->L((s->L(7) & 3) + 4);
+    d->L(4) = r0;
+    d->L(5) = r1;
+    d->L(6) = r2;
+    d->L(7) = r3;
+#endif
+}
+
+void glue(helper_vpermilpd_imm, SUFFIX)(CPUX86State *env,
+                                        Reg *d, Reg *s, uint32_t order)
+{
+    uint64_t r0, r1;
+
+    r0 = s->Q((order >> 0) & 1);
+    r1 = s->Q((order >> 1) & 1);
+    d->Q(0) = r0;
+    d->Q(1) = r1;
+#if SHIFT == 2
+    r0 = s->Q(((order >> 2) & 1) + 2);
+    r1 = s->Q(((order >> 3) & 1) + 2);
+    d->Q(2) = r0;
+    d->Q(3) = r1;
+#endif
+}
+
+void glue(helper_vpermilps_imm, SUFFIX)(CPUX86State *env,
+                                        Reg *d, Reg *s, uint32_t order)
+{
+    uint32_t r0, r1, r2, r3;
+
+    r0 = s->L((order >> 0) & 3);
+    r1 = s->L((order >> 2) & 3);
+    r2 = s->L((order >> 4) & 3);
+    r3 = s->L((order >> 6) & 3);
+    d->L(0) = r0;
+    d->L(1) = r1;
+    d->L(2) = r2;
+    d->L(3) = r3;
+#if SHIFT == 2
+    r0 = s->L(((order >> 0) & 3) + 4);
+    r1 = s->L(((order >> 2) & 3) + 4);
+    r2 = s->L(((order >> 4) & 3) + 4);
+    r3 = s->L(((order >> 6) & 3) + 4);
+    d->L(4) = r0;
+    d->L(5) = r1;
+    d->L(6) = r2;
+    d->L(7) = r3;
+#endif
+}
+
+#if SHIFT == 1
+#define FPSRLVD(x, c) (c < 32 ? ((x) >> c) : 0)
+#define FPSRLVQ(x, c) (c < 64 ? ((x) >> c) : 0)
+#define FPSRAVD(x, c) ((int32_t)(x) >> (c < 32 ? c : 31))
+#define FPSRAVQ(x, c) ((int64_t)(x) >> (c < 64 ? c : 63))
+#define FPSLLVD(x, c) (c < 32 ? ((x) << c) : 0)
+#define FPSLLVQ(x, c) (c < 64 ? ((x) << c) : 0)
+#endif
+
+SSE_HELPER_L(helper_vpsrlvd, FPSRLVD)
+SSE_HELPER_L(helper_vpsravd, FPSRAVD)
+SSE_HELPER_L(helper_vpsllvd, FPSLLVD)
+
+SSE_HELPER_Q(helper_vpsrlvq, FPSRLVQ)
+SSE_HELPER_Q(helper_vpsravq, FPSRAVQ)
+SSE_HELPER_Q(helper_vpsllvq, FPSLLVQ)
+
+void glue(helper_vtestps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+    uint32_t zf = (s->L(0) & d->L(0)) | (s->L(1) & d->L(1));
+    uint32_t cf = (s->L(0) & ~d->L(0)) | (s->L(1) & ~d->L(1));
+
+    zf |= (s->L(2) & d->L(2)) | (s->L(3) & d->L(3));
+    cf |= (s->L(2) & ~d->L(2)) | (s->L(3) & ~d->L(3));
+#if SHIFT == 2
+    zf |= (s->L(4) & d->L(4)) | (s->L(5) & d->L(5));
+    cf |= (s->L(4) & ~d->L(4)) | (s->L(5) & ~d->L(5));
+    zf |= (s->L(6) & d->L(6)) | (s->L(7) & d->L(7));
+    cf |= (s->L(6) & ~d->L(6)) | (s->L(7) & ~d->L(7));
+#endif
+    CC_SRC = ((zf >> 31) ? 0 : CC_Z) | ((cf >> 31) ? 0 : CC_C);
+}
+
+void glue(helper_vtestpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+    uint64_t zf = (s->Q(0) & d->Q(0)) | (s->Q(1) & d->Q(1));
+    uint64_t cf = (s->Q(0) & ~d->Q(0)) | (s->Q(1) & ~d->Q(1));
+
+#if SHIFT == 2
+    zf |= (s->Q(2) & d->Q(2)) | (s->Q(3) & d->Q(3));
+    cf |= (s->Q(2) & ~d->Q(2)) | (s->Q(3) & ~d->Q(3));
+#endif
+    CC_SRC = ((zf >> 63) ? 0 : CC_Z) | ((cf >> 63) ? 0 : CC_C);
+}
+
+void glue(helper_vpmaskmovd_st, SUFFIX)(CPUX86State *env,
+                                        Reg *s, Reg *v, target_ulong a0)
+{
+    int i;
+
+    for (i = 0; i < (2 << SHIFT); i++) {
+        if (v->L(i) >> 31) {
+            cpu_stl_data_ra(env, a0 + i * 4, s->L(i), GETPC());
+        }
+    }
+}
+
+void glue(helper_vpmaskmovq_st, SUFFIX)(CPUX86State *env,
+                                        Reg *s, Reg *v, target_ulong a0)
+{
+    int i;
+
+    for (i = 0; i < (1 << SHIFT); i++) {
+        if (v->Q(i) >> 63) {
+            cpu_stq_data_ra(env, a0 + i * 8, s->Q(i), GETPC());
+        }
+    }
+}
+
+void glue(helper_vpmaskmovd, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
+{
+    d->L(0) = (v->L(0) >> 31) ? s->L(0) : 0;
+    d->L(1) = (v->L(1) >> 31) ? s->L(1) : 0;
+    d->L(2) = (v->L(2) >> 31) ? s->L(2) : 0;
+    d->L(3) = (v->L(3) >> 31) ? s->L(3) : 0;
+#if SHIFT == 2
+    d->L(4) = (v->L(4) >> 31) ? s->L(4) : 0;
+    d->L(5) = (v->L(5) >> 31) ? s->L(5) : 0;
+    d->L(6) = (v->L(6) >> 31) ? s->L(6) : 0;
+    d->L(7) = (v->L(7) >> 31) ? s->L(7) : 0;
+#endif
+}
+
+void glue(helper_vpmaskmovq, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
+{
+    d->Q(0) = (v->Q(0) >> 63) ? s->Q(0) : 0;
+    d->Q(1) = (v->Q(1) >> 63) ? s->Q(1) : 0;
+#if SHIFT == 2
+    d->Q(2) = (v->Q(2) >> 63) ? s->Q(2) : 0;
+    d->Q(3) = (v->Q(3) >> 63) ? s->Q(3) : 0;
+#endif
+}
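/*
 * Illustrative sketch, not part of the patch: VPMASKMOV's store side as
 * implemented above.  Only elements whose mask sign bit is set are
 * written, so masked-off lanes generate no memory access at all and
 * cannot fault, which is why the helper loops element by element.
 */
static void maskmov_store_sketch(uint32_t *mem, const uint32_t *src,
                                 const uint32_t *mask, int n)
{
    for (int i = 0; i < n; i++) {
        if (mask[i] >> 31) {
            mem[i] = src[i];
        }
    }
}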
s->Q(3) : 0; +#endif +} + +#define VGATHER_HELPER(scale) \ +void glue(helper_vpgatherdd ## scale, SUFFIX)(CPUX86State *env, \ + Reg *d, Reg *v, Reg *s, target_ulong a0) \ +{ \ + int i; \ + for (i =3D 0; i < (2 << SHIFT); i++) { \ + if (v->L(i) >> 31) { \ + target_ulong addr =3D a0 \ + + ((target_ulong)(int32_t)s->L(i) << scale); \ + d->L(i) =3D cpu_ldl_data_ra(env, addr, GETPC()); \ + } \ + v->L(i) =3D 0; \ + } \ +} \ +void glue(helper_vpgatherdq ## scale, SUFFIX)(CPUX86State *env, \ + Reg *d, Reg *v, Reg *s, target_ulong a0) \ +{ \ + int i; \ + for (i =3D 0; i < (1 << SHIFT); i++) { \ + if (v->Q(i) >> 63) { \ + target_ulong addr =3D a0 \ + + ((target_ulong)(int32_t)s->L(i) << scale); \ + d->Q(i) =3D cpu_ldq_data_ra(env, addr, GETPC()); \ + } \ + v->Q(i) =3D 0; \ + } \ +} \ +void glue(helper_vpgatherqd ## scale, SUFFIX)(CPUX86State *env, \ + Reg *d, Reg *v, Reg *s, target_ulong a0) \ +{ \ + int i; \ + for (i =3D 0; i < (1 << SHIFT); i++) { \ + if (v->L(i) >> 31) { \ + target_ulong addr =3D a0 \ + + ((target_ulong)(int64_t)s->Q(i) << scale); \ + d->L(i) =3D cpu_ldl_data_ra(env, addr, GETPC()); \ + } \ + v->L(i) =3D 0; \ + } \ + d->Q(SHIFT) =3D 0; \ + v->Q(SHIFT) =3D 0; \ + YMM_ONLY( \ + d->Q(3) =3D 0; \ + v->Q(3) =3D 0; \ + ) \ +} \ +void glue(helper_vpgatherqq ## scale, SUFFIX)(CPUX86State *env, \ + Reg *d, Reg *v, Reg *s, target_ulong a0) \ +{ \ + int i; \ + for (i =3D 0; i < (1 << SHIFT); i++) { \ + if (v->Q(i) >> 63) { \ + target_ulong addr =3D a0 \ + + ((target_ulong)(int64_t)s->Q(i) << scale); \ + d->Q(i) =3D cpu_ldq_data_ra(env, addr, GETPC()); \ + } \ + v->Q(i) =3D 0; \ + } \ +} + +VGATHER_HELPER(0) +VGATHER_HELPER(1) +VGATHER_HELPER(2) +VGATHER_HELPER(3) + +#if SHIFT =3D=3D 2 +void glue(helper_vbroadcastdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +{ + d->Q(0) =3D s->Q(0); + d->Q(1) =3D s->Q(1); + d->Q(2) =3D s->Q(0); + d->Q(3) =3D s->Q(1); +} + +void helper_vzeroall(CPUX86State *env) +{ + int i; + + for (i =3D 0; i < 8; i++) { + env->xmm_regs[i].ZMM_Q(0) =3D 0; + env->xmm_regs[i].ZMM_Q(1) =3D 0; + env->xmm_regs[i].ZMM_Q(2) =3D 0; + env->xmm_regs[i].ZMM_Q(3) =3D 0; + } +} + +void helper_vzeroupper(CPUX86State *env) +{ + int i; + + for (i =3D 0; i < 8; i++) { + env->xmm_regs[i].ZMM_Q(2) =3D 0; + env->xmm_regs[i].ZMM_Q(3) =3D 0; + } +} + +#ifdef TARGET_X86_64 +void helper_vzeroall_hi8(CPUX86State *env) +{ + int i; + + for (i =3D 8; i < 16; i++) { + env->xmm_regs[i].ZMM_Q(0) =3D 0; + env->xmm_regs[i].ZMM_Q(1) =3D 0; + env->xmm_regs[i].ZMM_Q(2) =3D 0; + env->xmm_regs[i].ZMM_Q(3) =3D 0; + } +} + +void helper_vzeroupper_hi8(CPUX86State *env) +{ + int i; + + for (i =3D 8; i < 16; i++) { + env->xmm_regs[i].ZMM_Q(2) =3D 0; + env->xmm_regs[i].ZMM_Q(3) =3D 0; + } +} +#endif + +void helper_vpermdq_ymm(CPUX86State *env, + Reg *d, Reg *v, Reg *s, uint32_t order) +{ + uint64_t r0, r1, r2, r3; + + switch (order & 3) { + case 0: + r0 =3D v->Q(0); + r1 =3D v->Q(1); + break; + case 1: + r0 =3D v->Q(2); + r1 =3D v->Q(3); + break; + case 2: + r0 =3D s->Q(0); + r1 =3D s->Q(1); + break; + case 3: + r0 =3D s->Q(2); + r1 =3D s->Q(3); + break; + } + switch ((order >> 4) & 3) { + case 0: + r2 =3D v->Q(0); + r3 =3D v->Q(1); + break; + case 1: + r2 =3D v->Q(2); + r3 =3D v->Q(3); + break; + case 2: + r2 =3D s->Q(0); + r3 =3D s->Q(1); + break; + case 3: + r2 =3D s->Q(2); + r3 =3D s->Q(3); + break; + } + d->Q(0) =3D r0; + d->Q(1) =3D r1; + d->Q(2) =3D r2; + d->Q(3) =3D r3; +} + +void helper_vpermq_ymm(CPUX86State *env, Reg *d, Reg *s, uint32_t order) +{ + uint64_t r0, r1, r2, r3; + r0 =3D s->Q(order & 3); + r1 
=3D s->Q((order >> 2) & 3); + r2 =3D s->Q((order >> 4) & 3); + r3 =3D s->Q((order >> 6) & 3); + d->Q(0) =3D r0; + d->Q(1) =3D r1; + d->Q(2) =3D r2; + d->Q(3) =3D r3; +} + +void helper_vpermd_ymm(CPUX86State *env, Reg *d, Reg *v, Reg *s) +{ + uint32_t r[8]; + int i; + + for (i =3D 0; i < 8; i++) { + r[i] =3D s->L(v->L(i) & 7); + } + for (i =3D 0; i < 8; i++) { + d->L(i) =3D r[i]; + } +} + +#endif +#endif + +#undef SHIFT_HELPER_W +#undef SHIFT_HELPER_L +#undef SHIFT_HELPER_Q +#undef SSE_HELPER_S +#undef SSE_HELPER_CMP =20 #undef SHIFT #undef XMM_ONLY +#undef YMM_ONLY #undef Reg #undef B #undef W diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h index cef28f2aae..83efb8ab41 100644 --- a/target/i386/ops_sse_header.h +++ b/target/i386/ops_sse_header.h @@ -21,7 +21,11 @@ #define SUFFIX _mmx #else #define Reg ZMMReg +#if SHIFT =3D=3D 1 #define SUFFIX _xmm +#else +#define SUFFIX _ymm +#endif #endif =20 #define dh_alias_Reg ptr @@ -34,31 +38,31 @@ #define dh_typecode_ZMMReg dh_typecode_ptr #define dh_typecode_MMXReg dh_typecode_ptr =20 -DEF_HELPER_3(glue(psrlw, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(psraw, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(psllw, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(psrld, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(psrad, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pslld, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(psrlq, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(psllq, SUFFIX), void, env, Reg, Reg) - -#if SHIFT =3D=3D 1 -DEF_HELPER_3(glue(psrldq, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pslldq, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_4(glue(psrlw, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(psraw, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(psllw, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(psrld, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(psrad, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(pslld, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(psrlq, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(psllq, SUFFIX), void, env, Reg, Reg, Reg) + +#if SHIFT >=3D 1 +DEF_HELPER_4(glue(psrldq, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(pslldq, SUFFIX), void, env, Reg, Reg, Reg) #endif =20 #define SSE_HELPER_B(name, F)\ - DEF_HELPER_3(glue(name, SUFFIX), void, env, Reg, Reg) + DEF_HELPER_4(glue(name, SUFFIX), void, env, Reg, Reg, Reg) =20 #define SSE_HELPER_W(name, F)\ - DEF_HELPER_3(glue(name, SUFFIX), void, env, Reg, Reg) + DEF_HELPER_4(glue(name, SUFFIX), void, env, Reg, Reg, Reg) =20 #define SSE_HELPER_L(name, F)\ - DEF_HELPER_3(glue(name, SUFFIX), void, env, Reg, Reg) + DEF_HELPER_4(glue(name, SUFFIX), void, env, Reg, Reg, Reg) =20 #define SSE_HELPER_Q(name, F)\ - DEF_HELPER_3(glue(name, SUFFIX), void, env, Reg, Reg) + DEF_HELPER_4(glue(name, SUFFIX), void, env, Reg, Reg, Reg) =20 SSE_HELPER_B(paddb, FADD) SSE_HELPER_W(paddw, FADD) @@ -101,7 +105,7 @@ SSE_HELPER_L(pcmpeql, FCMPEQ) =20 SSE_HELPER_W(pmullw, FMULLW) #if SHIFT =3D=3D 0 -SSE_HELPER_W(pmulhrw, FMULHRW) +DEF_HELPER_3(glue(pmulhrw, SUFFIX), void, env, Reg, Reg) #endif SSE_HELPER_W(pmulhuw, FMULHUW) SSE_HELPER_W(pmulhw, FMULHW) @@ -109,11 +113,13 @@ SSE_HELPER_W(pmulhw, FMULHW) SSE_HELPER_B(pavgb, FAVG) SSE_HELPER_W(pavgw, FAVG) =20 -DEF_HELPER_3(glue(pmuludq, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pmaddwd, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_4(glue(pmuludq, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(pmaddwd, SUFFIX), void, env, Reg, 
Reg, Reg) =20 -DEF_HELPER_3(glue(psadbw, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_4(glue(psadbw, SUFFIX), void, env, Reg, Reg, Reg) +#if SHIFT < 2 DEF_HELPER_4(glue(maskmov, SUFFIX), void, env, Reg, Reg, tl) +#endif DEF_HELPER_2(glue(movl_mm_T0, SUFFIX), void, Reg, i32) #ifdef TARGET_X86_64 DEF_HELPER_2(glue(movq_mm_T0, SUFFIX), void, Reg, i64) @@ -122,38 +128,63 @@ DEF_HELPER_2(glue(movq_mm_T0, SUFFIX), void, Reg, i64) #if SHIFT =3D=3D 0 DEF_HELPER_3(glue(pshufw, SUFFIX), void, Reg, Reg, int) #else -DEF_HELPER_3(shufps, void, Reg, Reg, int) -DEF_HELPER_3(shufpd, void, Reg, Reg, int) DEF_HELPER_3(glue(pshufd, SUFFIX), void, Reg, Reg, int) DEF_HELPER_3(glue(pshuflw, SUFFIX), void, Reg, Reg, int) DEF_HELPER_3(glue(pshufhw, SUFFIX), void, Reg, Reg, int) #endif =20 -#if SHIFT =3D=3D 1 +#if SHIFT >=3D 1 /* FPU ops */ /* XXX: not accurate */ =20 -#define SSE_HELPER_S(name, F) \ - DEF_HELPER_3(name ## ps, void, env, Reg, Reg) \ - DEF_HELPER_3(name ## ss, void, env, Reg, Reg) \ - DEF_HELPER_3(name ## pd, void, env, Reg, Reg) \ - DEF_HELPER_3(name ## sd, void, env, Reg, Reg) +#define SSE_HELPER_P4(name, ...) \ + DEF_HELPER_4(glue(name ## ps, SUFFIX), __VA_ARGS__) \ + DEF_HELPER_4(glue(name ## pd, SUFFIX), __VA_ARGS__) + +#define SSE_HELPER_P3(name, ...) \ + DEF_HELPER_3(glue(name ## ps, SUFFIX), __VA_ARGS__) \ + DEF_HELPER_3(glue(name ## pd, SUFFIX), __VA_ARGS__) + +#if SHIFT =3D=3D 1 +#define SSE_HELPER_S4(name, ...) \ + SSE_HELPER_P4(name, __VA_ARGS__) \ + DEF_HELPER_4(name ## ss, __VA_ARGS__) \ + DEF_HELPER_4(name ## sd, __VA_ARGS__) +#define SSE_HELPER_S3(name, ...) \ + SSE_HELPER_P3(name, __VA_ARGS__) \ + DEF_HELPER_3(name ## ss, __VA_ARGS__) \ + DEF_HELPER_3(name ## sd, __VA_ARGS__) +#else +#define SSE_HELPER_S4(name, ...) SSE_HELPER_P4(name, __VA_ARGS__) +#define SSE_HELPER_S3(name, ...) 
SSE_HELPER_P3(name, __VA_ARGS__) +#endif + +DEF_HELPER_4(glue(shufps, SUFFIX), void, Reg, Reg, Reg, int) +DEF_HELPER_4(glue(shufpd, SUFFIX), void, Reg, Reg, Reg, int) + +SSE_HELPER_S4(add, void, env, Reg, Reg, Reg) +SSE_HELPER_S4(sub, void, env, Reg, Reg, Reg) +SSE_HELPER_S4(mul, void, env, Reg, Reg, Reg) +SSE_HELPER_S4(div, void, env, Reg, Reg, Reg) +SSE_HELPER_S4(min, void, env, Reg, Reg, Reg) +SSE_HELPER_S4(max, void, env, Reg, Reg, Reg) + +SSE_HELPER_S3(sqrt, void, env, Reg, Reg) =20 -SSE_HELPER_S(add, FPU_ADD) -SSE_HELPER_S(sub, FPU_SUB) -SSE_HELPER_S(mul, FPU_MUL) -SSE_HELPER_S(div, FPU_DIV) -SSE_HELPER_S(min, FPU_MIN) -SSE_HELPER_S(max, FPU_MAX) -SSE_HELPER_S(sqrt, FPU_SQRT) +DEF_HELPER_3(glue(cvtps2pd, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_3(glue(cvtpd2ps, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_3(glue(cvtdq2ps, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_3(glue(cvtdq2pd, SUFFIX), void, env, Reg, Reg) =20 +DEF_HELPER_3(glue(cvtps2dq, SUFFIX), void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(cvtpd2dq, SUFFIX), void, env, ZMMReg, ZMMReg) =20 -DEF_HELPER_3(cvtps2pd, void, env, Reg, Reg) -DEF_HELPER_3(cvtpd2ps, void, env, Reg, Reg) +DEF_HELPER_3(glue(cvttps2dq, SUFFIX), void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(cvttpd2dq, SUFFIX), void, env, ZMMReg, ZMMReg) + +#if SHIFT =3D=3D 1 DEF_HELPER_3(cvtss2sd, void, env, Reg, Reg) DEF_HELPER_3(cvtsd2ss, void, env, Reg, Reg) -DEF_HELPER_3(cvtdq2ps, void, env, Reg, Reg) -DEF_HELPER_3(cvtdq2pd, void, env, Reg, Reg) DEF_HELPER_3(cvtpi2ps, void, env, ZMMReg, MMXReg) DEF_HELPER_3(cvtpi2pd, void, env, ZMMReg, MMXReg) DEF_HELPER_3(cvtsi2ss, void, env, ZMMReg, i32) @@ -164,8 +195,6 @@ DEF_HELPER_3(cvtsq2ss, void, env, ZMMReg, i64) DEF_HELPER_3(cvtsq2sd, void, env, ZMMReg, i64) #endif =20 -DEF_HELPER_3(cvtps2dq, void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(cvtpd2dq, void, env, ZMMReg, ZMMReg) DEF_HELPER_3(cvtps2pi, void, env, MMXReg, ZMMReg) DEF_HELPER_3(cvtpd2pi, void, env, MMXReg, ZMMReg) DEF_HELPER_2(cvtss2si, s32, env, ZMMReg) @@ -175,8 +204,6 @@ DEF_HELPER_2(cvtss2sq, s64, env, ZMMReg) DEF_HELPER_2(cvtsd2sq, s64, env, ZMMReg) #endif =20 -DEF_HELPER_3(cvttps2dq, void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(cvttpd2dq, void, env, ZMMReg, ZMMReg) DEF_HELPER_3(cvttps2pi, void, env, MMXReg, ZMMReg) DEF_HELPER_3(cvttpd2pi, void, env, MMXReg, ZMMReg) DEF_HELPER_2(cvttss2si, s32, env, ZMMReg) @@ -185,60 +212,88 @@ DEF_HELPER_2(cvttsd2si, s32, env, ZMMReg) DEF_HELPER_2(cvttss2sq, s64, env, ZMMReg) DEF_HELPER_2(cvttsd2sq, s64, env, ZMMReg) #endif +#endif =20 -DEF_HELPER_3(rsqrtps, void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(rsqrtps, SUFFIX), void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(rcpps, SUFFIX), void, env, ZMMReg, ZMMReg) + +#if SHIFT =3D=3D 1 DEF_HELPER_3(rsqrtss, void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(rcpps, void, env, ZMMReg, ZMMReg) DEF_HELPER_3(rcpss, void, env, ZMMReg, ZMMReg) DEF_HELPER_3(extrq_r, void, env, ZMMReg, ZMMReg) DEF_HELPER_4(extrq_i, void, env, ZMMReg, int, int) DEF_HELPER_3(insertq_r, void, env, ZMMReg, ZMMReg) DEF_HELPER_4(insertq_i, void, env, ZMMReg, int, int) -DEF_HELPER_3(haddps, void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(haddpd, void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(hsubps, void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(hsubpd, void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(addsubps, void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(addsubpd, void, env, ZMMReg, ZMMReg) - -#define SSE_HELPER_CMP(name, F) \ - DEF_HELPER_3(name ## ps, void, env, Reg, Reg) \ - DEF_HELPER_3(name ## ss, void, env, Reg, Reg) \ - DEF_HELPER_3(name ## pd, void, env, Reg, 
Reg) \ - DEF_HELPER_3(name ## sd, void, env, Reg, Reg) - -SSE_HELPER_CMP(cmpeq, FPU_CMPEQ) -SSE_HELPER_CMP(cmplt, FPU_CMPLT) -SSE_HELPER_CMP(cmple, FPU_CMPLE) -SSE_HELPER_CMP(cmpunord, FPU_CMPUNORD) -SSE_HELPER_CMP(cmpneq, FPU_CMPNEQ) -SSE_HELPER_CMP(cmpnlt, FPU_CMPNLT) -SSE_HELPER_CMP(cmpnle, FPU_CMPNLE) -SSE_HELPER_CMP(cmpord, FPU_CMPORD) +#endif + +SSE_HELPER_P4(hadd, void, env, Reg, Reg, Reg) +SSE_HELPER_P4(hsub, void, env, Reg, Reg, Reg) +SSE_HELPER_P4(addsub, void, env, Reg, Reg, Reg) + +#define SSE_HELPER_CMP(name, F, C) SSE_HELPER_S4(name, void, env, Reg, Reg= , Reg) + +SSE_HELPER_CMP(cmpeq, FPU_CMPQ, FPU_EQ) +SSE_HELPER_CMP(cmplt, FPU_CMPS, FPU_LT) +SSE_HELPER_CMP(cmple, FPU_CMPS, FPU_LE) +SSE_HELPER_CMP(cmpunord, FPU_CMPQ, FPU_UNORD) +SSE_HELPER_CMP(cmpneq, FPU_CMPQ, !FPU_EQ) +SSE_HELPER_CMP(cmpnlt, FPU_CMPS, !FPU_LT) +SSE_HELPER_CMP(cmpnle, FPU_CMPS, !FPU_LE) +SSE_HELPER_CMP(cmpord, FPU_CMPQ, !FPU_UNORD) + +SSE_HELPER_CMP(cmpequ, FPU_CMPQ, FPU_EQU) +SSE_HELPER_CMP(cmpnge, FPU_CMPS, !FPU_GE) +SSE_HELPER_CMP(cmpngt, FPU_CMPS, !FPU_GT) +SSE_HELPER_CMP(cmpfalse, FPU_CMPQ, FPU_FALSE) +SSE_HELPER_CMP(cmpnequ, FPU_CMPQ, FPU_EQU) +SSE_HELPER_CMP(cmpge, FPU_CMPS, FPU_GE) +SSE_HELPER_CMP(cmpgt, FPU_CMPS, FPU_GT) +SSE_HELPER_CMP(cmptrue, FPU_CMPQ, !FPU_FALSE) + +SSE_HELPER_CMP(cmpeqs, FPU_CMPS, FPU_EQ) +SSE_HELPER_CMP(cmpltq, FPU_CMPQ, FPU_LT) +SSE_HELPER_CMP(cmpleq, FPU_CMPQ, FPU_LE) +SSE_HELPER_CMP(cmpunords, FPU_CMPS, FPU_UNORD) +SSE_HELPER_CMP(cmpneqq, FPU_CMPS, !FPU_EQ) +SSE_HELPER_CMP(cmpnltq, FPU_CMPQ, !FPU_LT) +SSE_HELPER_CMP(cmpnleq, FPU_CMPQ, !FPU_LE) +SSE_HELPER_CMP(cmpords, FPU_CMPS, !FPU_UNORD) + +SSE_HELPER_CMP(cmpequs, FPU_CMPS, FPU_EQU) +SSE_HELPER_CMP(cmpngeq, FPU_CMPQ, !FPU_GE) +SSE_HELPER_CMP(cmpngtq, FPU_CMPQ, !FPU_GT) +SSE_HELPER_CMP(cmpfalses, FPU_CMPS, FPU_FALSE) +SSE_HELPER_CMP(cmpnequs, FPU_CMPS, FPU_EQU) +SSE_HELPER_CMP(cmpgeq, FPU_CMPQ, FPU_GE) +SSE_HELPER_CMP(cmpgtq, FPU_CMPQ, FPU_GT) +SSE_HELPER_CMP(cmptrues, FPU_CMPS, !FPU_FALSE) =20 +#if SHIFT =3D=3D 1 DEF_HELPER_3(ucomiss, void, env, Reg, Reg) DEF_HELPER_3(comiss, void, env, Reg, Reg) DEF_HELPER_3(ucomisd, void, env, Reg, Reg) DEF_HELPER_3(comisd, void, env, Reg, Reg) -DEF_HELPER_2(movmskps, i32, env, Reg) -DEF_HELPER_2(movmskpd, i32, env, Reg) +#endif + +DEF_HELPER_2(glue(movmskps, SUFFIX), i32, env, Reg) +DEF_HELPER_2(glue(movmskpd, SUFFIX), i32, env, Reg) #endif =20 DEF_HELPER_2(glue(pmovmskb, SUFFIX), i32, env, Reg) -DEF_HELPER_3(glue(packsswb, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(packuswb, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(packssdw, SUFFIX), void, env, Reg, Reg) -#define UNPCK_OP(base_name, base) \ - DEF_HELPER_3(glue(punpck ## base_name ## bw, SUFFIX), void, env, Reg, = Reg) \ - DEF_HELPER_3(glue(punpck ## base_name ## wd, SUFFIX), void, env, Reg, = Reg) \ - DEF_HELPER_3(glue(punpck ## base_name ## dq, SUFFIX), void, env, Reg, = Reg) +DEF_HELPER_4(glue(packsswb, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(packuswb, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(packssdw, SUFFIX), void, env, Reg, Reg, Reg) +#define UNPCK_OP(name, base) \ + DEF_HELPER_4(glue(punpck ## name ## bw, SUFFIX), void, env, Reg, Reg, = Reg) \ + DEF_HELPER_4(glue(punpck ## name ## wd, SUFFIX), void, env, Reg, Reg, = Reg) \ + DEF_HELPER_4(glue(punpck ## name ## dq, SUFFIX), void, env, Reg, Reg, = Reg) =20 UNPCK_OP(l, 0) UNPCK_OP(h, 1) =20 -#if SHIFT =3D=3D 1 -DEF_HELPER_3(glue(punpcklqdq, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(punpckhqdq, SUFFIX), void, env, Reg, Reg) +#if 
SHIFT >=3D 1 +DEF_HELPER_4(glue(punpcklqdq, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(punpckhqdq, SUFFIX), void, env, Reg, Reg, Reg) #endif =20 /* 3DNow! float ops */ @@ -265,28 +320,28 @@ DEF_HELPER_3(pswapd, void, env, MMXReg, MMXReg) #endif =20 /* SSSE3 op helpers */ -DEF_HELPER_3(glue(phaddw, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(phaddd, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(phaddsw, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(phsubw, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(phsubd, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(phsubsw, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_4(glue(phaddw, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(phaddd, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(phaddsw, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(phsubw, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(phsubd, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(phsubsw, SUFFIX), void, env, Reg, Reg, Reg) DEF_HELPER_3(glue(pabsb, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(glue(pabsw, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(glue(pabsd, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pmaddubsw, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pmulhrsw, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pshufb, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(psignb, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(psignw, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(psignd, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_4(glue(palignr, SUFFIX), void, env, Reg, Reg, s32) +DEF_HELPER_4(glue(pmaddubsw, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(pmulhrsw, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(pshufb, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(psignb, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(psignw, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(psignd, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_5(glue(palignr, SUFFIX), void, env, Reg, Reg, Reg, s32) =20 /* SSE4.1 op helpers */ -#if SHIFT =3D=3D 1 -DEF_HELPER_3(glue(pblendvb, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(blendvps, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(blendvpd, SUFFIX), void, env, Reg, Reg) +#if SHIFT >=3D 1 +DEF_HELPER_5(glue(pblendvb, SUFFIX), void, env, Reg, Reg, Reg, Reg) +DEF_HELPER_5(glue(blendvps, SUFFIX), void, env, Reg, Reg, Reg, Reg) +DEF_HELPER_5(glue(blendvpd, SUFFIX), void, env, Reg, Reg, Reg, Reg) DEF_HELPER_3(glue(ptest, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(glue(pmovsxbw, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(glue(pmovsxbd, SUFFIX), void, env, Reg, Reg) @@ -300,34 +355,42 @@ DEF_HELPER_3(glue(pmovzxbq, SUFFIX), void, env, Reg, = Reg) DEF_HELPER_3(glue(pmovzxwd, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(glue(pmovzxwq, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(glue(pmovzxdq, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pmuldq, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pcmpeqq, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(packusdw, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pminsb, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pminsd, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pminuw, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pminud, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pmaxsb, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pmaxsd, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pmaxuw, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pmaxud, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pmulld, 
SUFFIX), void, env, Reg, Reg) +DEF_HELPER_4(glue(pmuldq, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(pcmpeqq, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(packusdw, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(pminsb, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(pminsd, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(pminuw, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(pminud, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(pmaxsb, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(pmaxsd, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(pmaxuw, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(pmaxud, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(pmulld, SUFFIX), void, env, Reg, Reg, Reg) +#if SHIFT =3D=3D 1 DEF_HELPER_3(glue(phminposuw, SUFFIX), void, env, Reg, Reg) +#endif DEF_HELPER_4(glue(roundps, SUFFIX), void, env, Reg, Reg, i32) DEF_HELPER_4(glue(roundpd, SUFFIX), void, env, Reg, Reg, i32) -DEF_HELPER_4(glue(roundss, SUFFIX), void, env, Reg, Reg, i32) -DEF_HELPER_4(glue(roundsd, SUFFIX), void, env, Reg, Reg, i32) -DEF_HELPER_4(glue(blendps, SUFFIX), void, env, Reg, Reg, i32) -DEF_HELPER_4(glue(blendpd, SUFFIX), void, env, Reg, Reg, i32) -DEF_HELPER_4(glue(pblendw, SUFFIX), void, env, Reg, Reg, i32) -DEF_HELPER_4(glue(dpps, SUFFIX), void, env, Reg, Reg, i32) -DEF_HELPER_4(glue(dppd, SUFFIX), void, env, Reg, Reg, i32) -DEF_HELPER_4(glue(mpsadbw, SUFFIX), void, env, Reg, Reg, i32) +#if SHIFT =3D=3D 1 +DEF_HELPER_4(roundss_xmm, void, env, Reg, Reg, i32) +DEF_HELPER_4(roundsd_xmm, void, env, Reg, Reg, i32) +#endif +DEF_HELPER_5(glue(blendps, SUFFIX), void, env, Reg, Reg, Reg, i32) +DEF_HELPER_5(glue(blendpd, SUFFIX), void, env, Reg, Reg, Reg, i32) +DEF_HELPER_5(glue(pblendw, SUFFIX), void, env, Reg, Reg, Reg, i32) +DEF_HELPER_5(glue(dpps, SUFFIX), void, env, Reg, Reg, Reg, i32) +#if SHIFT =3D=3D 1 +DEF_HELPER_5(glue(dppd, SUFFIX), void, env, Reg, Reg, Reg, i32) +#endif +DEF_HELPER_5(glue(mpsadbw, SUFFIX), void, env, Reg, Reg, Reg, i32) #endif =20 /* SSE4.2 op helpers */ +#if SHIFT >=3D 1 +DEF_HELPER_4(glue(pcmpgtq, SUFFIX), void, env, Reg, Reg, Reg) +#endif #if SHIFT =3D=3D 1 -DEF_HELPER_3(glue(pcmpgtq, SUFFIX), void, env, Reg, Reg) DEF_HELPER_4(glue(pcmpestri, SUFFIX), void, env, Reg, Reg, i32) DEF_HELPER_4(glue(pcmpestrm, SUFFIX), void, env, Reg, Reg, i32) DEF_HELPER_4(glue(pcmpistri, SUFFIX), void, env, Reg, Reg, i32) @@ -336,14 +399,68 @@ DEF_HELPER_3(crc32, tl, i32, tl, i32) #endif =20 /* AES-NI op helpers */ +#if SHIFT >=3D 1 +DEF_HELPER_4(glue(aesdec, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(aesdeclast, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(aesenc, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(aesenclast, SUFFIX), void, env, Reg, Reg, Reg) #if SHIFT =3D=3D 1 -DEF_HELPER_3(glue(aesdec, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(aesdeclast, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(aesenc, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(aesenclast, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(glue(aesimc, SUFFIX), void, env, Reg, Reg) DEF_HELPER_4(glue(aeskeygenassist, SUFFIX), void, env, Reg, Reg, i32) -DEF_HELPER_4(glue(pclmulqdq, SUFFIX), void, env, Reg, Reg, i32) +#endif +DEF_HELPER_5(glue(pclmulqdq, SUFFIX), void, env, Reg, Reg, Reg, i32) +#endif + +/* AVX helpers */ +#if SHIFT >=3D 1 +DEF_HELPER_3(glue(vbroadcastb, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_3(glue(vbroadcastw, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_3(glue(vbroadcastl, SUFFIX), void, 
env, Reg, Reg) +DEF_HELPER_3(glue(vbroadcastq, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_4(glue(vpermilpd, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(vpermilps, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(vpermilpd_imm, SUFFIX), void, env, Reg, Reg, i32) +DEF_HELPER_4(glue(vpermilps_imm, SUFFIX), void, env, Reg, Reg, i32) +DEF_HELPER_4(glue(vpsrlvd, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(vpsravd, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(vpsllvd, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(vpsrlvq, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(vpsravq, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(vpsllvq, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_3(glue(vtestps, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_3(glue(vtestpd, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_4(glue(vpmaskmovd_st, SUFFIX), void, env, Reg, Reg, tl) +DEF_HELPER_4(glue(vpmaskmovq_st, SUFFIX), void, env, Reg, Reg, tl) +DEF_HELPER_4(glue(vpmaskmovd, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_5(glue(vpgatherdd0, SUFFIX), void, env, Reg, Reg, Reg, tl) +DEF_HELPER_5(glue(vpgatherdq0, SUFFIX), void, env, Reg, Reg, Reg, tl) +DEF_HELPER_5(glue(vpgatherqd0, SUFFIX), void, env, Reg, Reg, Reg, tl) +DEF_HELPER_5(glue(vpgatherqq0, SUFFIX), void, env, Reg, Reg, Reg, tl) +DEF_HELPER_5(glue(vpgatherdd1, SUFFIX), void, env, Reg, Reg, Reg, tl) +DEF_HELPER_5(glue(vpgatherdq1, SUFFIX), void, env, Reg, Reg, Reg, tl) +DEF_HELPER_5(glue(vpgatherqd1, SUFFIX), void, env, Reg, Reg, Reg, tl) +DEF_HELPER_5(glue(vpgatherqq1, SUFFIX), void, env, Reg, Reg, Reg, tl) +DEF_HELPER_5(glue(vpgatherdd2, SUFFIX), void, env, Reg, Reg, Reg, tl) +DEF_HELPER_5(glue(vpgatherdq2, SUFFIX), void, env, Reg, Reg, Reg, tl) +DEF_HELPER_5(glue(vpgatherqd2, SUFFIX), void, env, Reg, Reg, Reg, tl) +DEF_HELPER_5(glue(vpgatherqq2, SUFFIX), void, env, Reg, Reg, Reg, tl) +DEF_HELPER_5(glue(vpgatherdd3, SUFFIX), void, env, Reg, Reg, Reg, tl) +DEF_HELPER_5(glue(vpgatherdq3, SUFFIX), void, env, Reg, Reg, Reg, tl) +DEF_HELPER_5(glue(vpgatherqd3, SUFFIX), void, env, Reg, Reg, Reg, tl) +DEF_HELPER_5(glue(vpgatherqq3, SUFFIX), void, env, Reg, Reg, Reg, tl) +DEF_HELPER_4(glue(vpmaskmovq, SUFFIX), void, env, Reg, Reg, Reg) +#if SHIFT =3D=3D 2 +DEF_HELPER_3(glue(vbroadcastdq, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_1(vzeroall, void, env) +DEF_HELPER_1(vzeroupper, void, env) +#ifdef TARGET_X86_64 +DEF_HELPER_1(vzeroall_hi8, void, env) +DEF_HELPER_1(vzeroupper_hi8, void, env) +#endif +DEF_HELPER_5(vpermdq_ymm, void, env, Reg, Reg, Reg, i32) +DEF_HELPER_4(vpermq_ymm, void, env, Reg, Reg, i32) +DEF_HELPER_4(vpermd_ymm, void, env, Reg, Reg, Reg) +#endif #endif =20 #undef SHIFT @@ -354,6 +471,9 @@ DEF_HELPER_4(glue(pclmulqdq, SUFFIX), void, env, Reg, R= eg, i32) #undef SSE_HELPER_W #undef SSE_HELPER_L #undef SSE_HELPER_Q -#undef SSE_HELPER_S +#undef SSE_HELPER_S3 +#undef SSE_HELPER_S4 +#undef SSE_HELPER_P3 +#undef SSE_HELPER_P4 #undef SSE_HELPER_CMP #undef UNPCK_OP diff --git a/target/i386/tcg/fpu_helper.c b/target/i386/tcg/fpu_helper.c index b391b69635..74cf86c986 100644 --- a/target/i386/tcg/fpu_helper.c +++ b/target/i386/tcg/fpu_helper.c @@ -3053,3 +3053,6 @@ void helper_movq(CPUX86State *env, void *d, void *s) =20 #define SHIFT 1 #include "ops_sse.h" + +#define SHIFT 2 +#include "ops_sse.h" diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index c393913fe0..f1c7ab4455 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -125,6 +125,7 @@ typedef struct DisasContext 
{ TCGv tmp4; TCGv_ptr ptr0; TCGv_ptr ptr1; + TCGv_ptr ptr2; TCGv_i32 tmp2_i32; TCGv_i32 tmp3_i32; TCGv_i64 tmp1_i64; @@ -2739,6 +2740,29 @@ static inline void gen_ldo_env_A0(DisasContext *s, i= nt offset) tcg_gen_st_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(1= ))); } =20 +static inline void gen_ldo_env_A0_ymmh(DisasContext *s, int offset) +{ + int mem_index =3D s->mem_index; + tcg_gen_qemu_ld_i64(s->tmp1_i64, s->A0, mem_index, MO_LEUQ); + tcg_gen_st_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(2= ))); + tcg_gen_addi_tl(s->tmp0, s->A0, 8); + tcg_gen_qemu_ld_i64(s->tmp1_i64, s->tmp0, mem_index, MO_LEUQ); + tcg_gen_st_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(3= ))); +} + +/* Load 256-bit ymm register value */ +static inline void gen_ldy_env_A0(DisasContext *s, int offset) +{ + int mem_index =3D s->mem_index; + gen_ldo_env_A0(s, offset); + tcg_gen_addi_tl(s->tmp0, s->A0, 16); + tcg_gen_qemu_ld_i64(s->tmp1_i64, s->tmp0, mem_index, MO_LEUQ); + tcg_gen_st_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(2= ))); + tcg_gen_addi_tl(s->tmp0, s->A0, 24); + tcg_gen_qemu_ld_i64(s->tmp1_i64, s->tmp0, mem_index, MO_LEUQ); + tcg_gen_st_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(3= ))); +} + static inline void gen_sto_env_A0(DisasContext *s, int offset) { int mem_index =3D s->mem_index; @@ -2749,6 +2773,29 @@ static inline void gen_sto_env_A0(DisasContext *s, i= nt offset) tcg_gen_qemu_st_i64(s->tmp1_i64, s->tmp0, mem_index, MO_LEUQ); } =20 +static inline void gen_sto_env_A0_ymmh(DisasContext *s, int offset) +{ + int mem_index =3D s->mem_index; + tcg_gen_ld_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(2= ))); + tcg_gen_qemu_st_i64(s->tmp1_i64, s->A0, mem_index, MO_LEUQ); + tcg_gen_addi_tl(s->tmp0, s->A0, 8); + tcg_gen_ld_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(3= ))); + tcg_gen_qemu_st_i64(s->tmp1_i64, s->tmp0, mem_index, MO_LEUQ); +} + +/* Store 256-bit ymm register value */ +static inline void gen_sty_env_A0(DisasContext *s, int offset) +{ + int mem_index =3D s->mem_index; + gen_sto_env_A0(s, offset); + tcg_gen_addi_tl(s->tmp0, s->A0, 16); + tcg_gen_ld_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(2= ))); + tcg_gen_qemu_st_i64(s->tmp1_i64, s->tmp0, mem_index, MO_LEUQ); + tcg_gen_addi_tl(s->tmp0, s->A0, 24); + tcg_gen_ld_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(3= ))); + tcg_gen_qemu_st_i64(s->tmp1_i64, s->tmp0, mem_index, MO_LEUQ); +} + static inline void gen_op_movo(DisasContext *s, int d_offset, int s_offset) { tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset + offsetof(ZMMReg, ZMM_Q= (0))); @@ -2757,6 +2804,32 @@ static inline void gen_op_movo(DisasContext *s, int = d_offset, int s_offset) tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset + offsetof(ZMMReg, ZMM_Q= (1))); } =20 +static inline void gen_op_movo_ymmh(DisasContext *s, int d_offset, int s_o= ffset) +{ + tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset + offsetof(ZMMReg, ZMM_Q= (2))); + tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset + offsetof(ZMMReg, ZMM_Q= (2))); + tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset + offsetof(ZMMReg, ZMM_Q= (3))); + tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset + offsetof(ZMMReg, ZMM_Q= (3))); +} + +static inline void gen_op_movo_ymm_l2h(DisasContext *s, + int d_offset, int s_offset) +{ + tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset + offsetof(ZMMReg, ZMM_Q= (0))); + tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset + offsetof(ZMMReg, ZMM_Q= (2))); + tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset + 
offsetof(ZMMReg, ZMM_Q= (1))); + tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset + offsetof(ZMMReg, ZMM_Q= (3))); +} + +static inline void gen_op_movo_ymm_h2l(DisasContext *s, + int d_offset, int s_offset) +{ + tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset + offsetof(ZMMReg, ZMM_Q= (2))); + tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset + offsetof(ZMMReg, ZMM_Q= (0))); + tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset + offsetof(ZMMReg, ZMM_Q= (3))); + tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset + offsetof(ZMMReg, ZMM_Q= (1))); +} + static inline void gen_op_movq(DisasContext *s, int d_offset, int s_offset) { tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset); @@ -2775,170 +2848,270 @@ static inline void gen_op_movq_env_0(DisasContext= *s, int d_offset) tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset); } =20 +#define XMM_OFFSET(reg) offsetof(CPUX86State, xmm_regs[reg]) + +/* + * Clear the top half of the ymm register after a VEX.128 instruction + * This could be optimized by tracking this in env->hflags + */ +static void gen_clear_ymmh(DisasContext *s, int reg) +{ + if (s->prefix & PREFIX_VEX) { + gen_op_movq_env_0(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(2))= ); + gen_op_movq_env_0(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(3))= ); + } +} + +typedef void (*SSEFunc_0_pp)(TCGv_ptr reg_a, TCGv_ptr reg_b); typedef void (*SSEFunc_i_ep)(TCGv_i32 val, TCGv_ptr env, TCGv_ptr reg); typedef void (*SSEFunc_l_ep)(TCGv_i64 val, TCGv_ptr env, TCGv_ptr reg); typedef void (*SSEFunc_0_epi)(TCGv_ptr env, TCGv_ptr reg, TCGv_i32 val); typedef void (*SSEFunc_0_epl)(TCGv_ptr env, TCGv_ptr reg, TCGv_i64 val); typedef void (*SSEFunc_0_epp)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_b= ); +typedef void (*SSEFunc_0_eppp)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_= b, + TCGv_ptr reg_c); +typedef void (*SSEFunc_0_epppp)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg= _b, + TCGv_ptr reg_c, TCGv_ptr reg_d); typedef void (*SSEFunc_0_eppi)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_= b, TCGv_i32 val); +typedef void (*SSEFunc_0_epppi)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg= _b, + TCGv_ptr reg_c, TCGv_i32 val); typedef void (*SSEFunc_0_ppi)(TCGv_ptr reg_a, TCGv_ptr reg_b, TCGv_i32 val= ); +typedef void (*SSEFunc_0_pppi)(TCGv_ptr reg_a, TCGv_ptr reg_b, TCGv_ptr re= g_c, + TCGv_i32 val); typedef void (*SSEFunc_0_eppt)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_= b, TCGv val); +typedef void (*SSEFunc_0_epppt)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg= _b, + TCGv_ptr reg_c, TCGv val); + +#define SSE_OPF_V0 (1 << 0) /* vex.v must be 1111b (only 2 operands= ) */ +#define SSE_OPF_CMP (1 << 1) /* does not write for first operand */ +#define SSE_OPF_BLENDV (1 << 2) /* blendv* instruction */ +#define SSE_OPF_SPECIAL (1 << 3) /* magic */ +#define SSE_OPF_3DNOW (1 << 4) /* 3DNow! 
instruction */ +#define SSE_OPF_MMX (1 << 5) /* MMX/integer/AVX2 instruction */ +#define SSE_OPF_SCALAR (1 << 6) /* Has SSE scalar variants */ +#define SSE_OPF_AVX2 (1 << 7) /* AVX2 instruction */ +#define SSE_OPF_SHUF (1 << 9) /* pshufx/shufpx */ + +#define OP(op, flags, a, b, c, d, e, f, g, h) \ + {flags, {{.op =3D a}, {.op =3D b}, {.op =3D c}, {.op =3D d}, \ + {.op =3D e}, {.op =3D f}, {.op =3D g}, {.op =3D h} } } + +#define MMX_OP(x) OP(op2, SSE_OPF_MMX, \ + gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm, NULL, NULL, \ + NULL, gen_helper_ ## x ## _ymm, NULL, NULL) + +#define SSE_FOP(name) OP(op2, SSE_OPF_SCALAR, \ + gen_helper_##name##ps_xmm, gen_helper_##name##pd_xmm, \ + gen_helper_##name##ss, gen_helper_##name##sd, \ + gen_helper_##name##ps_ymm, gen_helper_##name##pd_ymm, NULL, NULL) +#define SSE_OP(sname, dname, op, flags) OP(op, flags, \ + gen_helper_##sname##_xmm, gen_helper_##dname##_xmm, NULL, NULL, \ + gen_helper_##sname##_ymm, gen_helper_##dname##_ymm, NULL, NULL) + +struct SSEOpHelper_table1 { + int flags; + union { + SSEFunc_0_epp op1; + SSEFunc_0_ppi op1i; + SSEFunc_0_eppt op1t; + SSEFunc_0_eppp op2; + SSEFunc_0_pppi op2i; + } fn[8]; +}; =20 -#define SSE_SPECIAL ((void *)1) -#define SSE_DUMMY ((void *)2) - -#define MMX_OP2(x) { gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm } -#define SSE_FOP(x) { gen_helper_ ## x ## ps, gen_helper_ ## x ## pd, \ - gen_helper_ ## x ## ss, gen_helper_ ## x ## sd, } +#define SSE_3DNOW { SSE_OPF_3DNOW } +#define SSE_SPECIAL { SSE_OPF_SPECIAL } =20 -static const SSEFunc_0_epp sse_op_table1[256][4] =3D { +static const struct SSEOpHelper_table1 sse_op_table1[256] =3D { /* 3DNow! extensions */ - [0x0e] =3D { SSE_DUMMY }, /* femms */ - [0x0f] =3D { SSE_DUMMY }, /* pf... */ + [0x0e] =3D SSE_SPECIAL, /* femms */ + [0x0f] =3D SSE_3DNOW, /* pf... 
(sse_op_table5) */ /* pure SSE operations */ - [0x10] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* = movups, movupd, movss, movsd */ - [0x11] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* = movups, movupd, movss, movsd */ - [0x12] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* = movlps, movlpd, movsldup, movddup */ - [0x13] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* movlps, movlpd */ - [0x14] =3D { gen_helper_punpckldq_xmm, gen_helper_punpcklqdq_xmm }, - [0x15] =3D { gen_helper_punpckhdq_xmm, gen_helper_punpckhqdq_xmm }, - [0x16] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* movhps, movh= pd, movshdup */ - [0x17] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* movhps, movhpd */ - - [0x28] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* movaps, movapd */ - [0x29] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* movaps, movapd */ - [0x2a] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* = cvtpi2ps, cvtpi2pd, cvtsi2ss, cvtsi2sd */ - [0x2b] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* = movntps, movntpd, movntss, movntsd */ - [0x2c] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* = cvttps2pi, cvttpd2pi, cvttsd2si, cvttss2si */ - [0x2d] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* = cvtps2pi, cvtpd2pi, cvtsd2si, cvtss2si */ - [0x2e] =3D { gen_helper_ucomiss, gen_helper_ucomisd }, - [0x2f] =3D { gen_helper_comiss, gen_helper_comisd }, - [0x50] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* movmskps, movmskpd */ - [0x51] =3D SSE_FOP(sqrt), - [0x52] =3D { gen_helper_rsqrtps, NULL, gen_helper_rsqrtss, NULL }, - [0x53] =3D { gen_helper_rcpps, NULL, gen_helper_rcpss, NULL }, - [0x54] =3D { gen_helper_pand_xmm, gen_helper_pand_xmm }, /* andps, and= pd */ - [0x55] =3D { gen_helper_pandn_xmm, gen_helper_pandn_xmm }, /* andnps, = andnpd */ - [0x56] =3D { gen_helper_por_xmm, gen_helper_por_xmm }, /* orps, orpd */ - [0x57] =3D { gen_helper_pxor_xmm, gen_helper_pxor_xmm }, /* xorps, xor= pd */ + [0x10] =3D SSE_SPECIAL, /* movups, movupd, movss, movsd */ + [0x11] =3D SSE_SPECIAL, /* movups, movupd, movss, movsd */ + [0x12] =3D SSE_SPECIAL, /* movlps, movlpd, movsldup, movddup */ + [0x13] =3D SSE_SPECIAL, /* movlps, movlpd */ + [0x14] =3D SSE_OP(punpckldq, punpcklqdq, op2, 0), /* unpcklps, unpcklp= d */ + [0x15] =3D SSE_OP(punpckhdq, punpckhqdq, op2, 0), /* unpckhps, unpckhp= d */ + [0x16] =3D SSE_SPECIAL, /* movhps, movhpd, movshdup */ + [0x17] =3D SSE_SPECIAL, /* movhps, movhpd */ + + [0x28] =3D SSE_SPECIAL, /* movaps, movapd */ + [0x29] =3D SSE_SPECIAL, /* movaps, movapd */ + [0x2a] =3D SSE_SPECIAL, /* cvtpi2ps, cvtpi2pd, cvtsi2ss, cvtsi2sd */ + [0x2b] =3D SSE_SPECIAL, /* movntps, movntpd, movntss, movntsd */ + [0x2c] =3D SSE_SPECIAL, /* cvttps2pi, cvttpd2pi, cvttsd2si, cvttss2si = */ + [0x2d] =3D SSE_SPECIAL, /* cvtps2pi, cvtpd2pi, cvtsd2si, cvtss2si */ + [0x2e] =3D OP(op1, SSE_OPF_CMP | SSE_OPF_SCALAR | SSE_OPF_V0, + gen_helper_ucomiss, gen_helper_ucomisd, NULL, NULL, + NULL, NULL, NULL, NULL), + [0x2f] =3D OP(op1, SSE_OPF_CMP | SSE_OPF_SCALAR | SSE_OPF_V0, + gen_helper_comiss, gen_helper_comisd, NULL, NULL, + NULL, NULL, NULL, NULL), + [0x50] =3D SSE_SPECIAL, /* movmskps, movmskpd */ + [0x51] =3D OP(op1, SSE_OPF_SCALAR | SSE_OPF_V0, + gen_helper_sqrtps_xmm, gen_helper_sqrtpd_xmm, + gen_helper_sqrtss, gen_helper_sqrtsd, + gen_helper_sqrtps_ymm, gen_helper_sqrtpd_ymm, NULL, NULL), + [0x52] =3D OP(op1, SSE_OPF_SCALAR | SSE_OPF_V0, + gen_helper_rsqrtps_xmm, NULL, gen_helper_rsqrtss, NULL, + gen_helper_rsqrtps_ymm, NULL, NULL, NULL), + 
[0x53] =3D OP(op1, SSE_OPF_SCALAR | SSE_OPF_V0, + gen_helper_rcpps_xmm, NULL, gen_helper_rcpss, NULL, + gen_helper_rcpps_ymm, NULL, NULL, NULL), + [0x54] =3D SSE_OP(pand, pand, op2, 0), /* andps, andpd */ + [0x55] =3D SSE_OP(pandn, pandn, op2, 0), /* andnps, andnpd */ + [0x56] =3D SSE_OP(por, por, op2, 0), /* orps, orpd */ + [0x57] =3D SSE_OP(pxor, pxor, op2, 0), /* xorps, xorpd */ [0x58] =3D SSE_FOP(add), [0x59] =3D SSE_FOP(mul), - [0x5a] =3D { gen_helper_cvtps2pd, gen_helper_cvtpd2ps, - gen_helper_cvtss2sd, gen_helper_cvtsd2ss }, - [0x5b] =3D { gen_helper_cvtdq2ps, gen_helper_cvtps2dq, gen_helper_cvtt= ps2dq }, + [0x5a] =3D OP(op1, SSE_OPF_SCALAR | SSE_OPF_V0, + gen_helper_cvtps2pd_xmm, gen_helper_cvtpd2ps_xmm, + gen_helper_cvtss2sd, gen_helper_cvtsd2ss, + gen_helper_cvtps2pd_ymm, gen_helper_cvtpd2ps_ymm, NULL, NU= LL), + [0x5b] =3D OP(op1, SSE_OPF_V0, + gen_helper_cvtdq2ps_xmm, gen_helper_cvtps2dq_xmm, + gen_helper_cvttps2dq_xmm, NULL, + gen_helper_cvtdq2ps_ymm, gen_helper_cvtps2dq_ymm, + gen_helper_cvttps2dq_ymm, NULL), [0x5c] =3D SSE_FOP(sub), [0x5d] =3D SSE_FOP(min), [0x5e] =3D SSE_FOP(div), [0x5f] =3D SSE_FOP(max), =20 - [0xc2] =3D SSE_FOP(cmpeq), - [0xc6] =3D { (SSEFunc_0_epp)gen_helper_shufps, - (SSEFunc_0_epp)gen_helper_shufpd }, /* XXX: casts */ + [0xc2] =3D SSE_FOP(cmpeq), /* sse_op_table4 */ + [0xc6] =3D SSE_OP(shufps, shufpd, op2i, SSE_OPF_SHUF), =20 /* SSSE3, SSE4, MOVBE, CRC32, BMI1, BMI2, ADX. */ - [0x38] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, - [0x3a] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, + [0x38] =3D SSE_SPECIAL, + [0x3a] =3D SSE_SPECIAL, =20 /* MMX ops and their SSE extensions */ - [0x60] =3D MMX_OP2(punpcklbw), - [0x61] =3D MMX_OP2(punpcklwd), - [0x62] =3D MMX_OP2(punpckldq), - [0x63] =3D MMX_OP2(packsswb), - [0x64] =3D MMX_OP2(pcmpgtb), - [0x65] =3D MMX_OP2(pcmpgtw), - [0x66] =3D MMX_OP2(pcmpgtl), - [0x67] =3D MMX_OP2(packuswb), - [0x68] =3D MMX_OP2(punpckhbw), - [0x69] =3D MMX_OP2(punpckhwd), - [0x6a] =3D MMX_OP2(punpckhdq), - [0x6b] =3D MMX_OP2(packssdw), - [0x6c] =3D { NULL, gen_helper_punpcklqdq_xmm }, - [0x6d] =3D { NULL, gen_helper_punpckhqdq_xmm }, - [0x6e] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* movd mm, ea */ - [0x6f] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* movq, movdqa,= , movqdu */ - [0x70] =3D { (SSEFunc_0_epp)gen_helper_pshufw_mmx, - (SSEFunc_0_epp)gen_helper_pshufd_xmm, - (SSEFunc_0_epp)gen_helper_pshufhw_xmm, - (SSEFunc_0_epp)gen_helper_pshuflw_xmm }, /* XXX: casts */ - [0x71] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* shiftw */ - [0x72] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* shiftd */ - [0x73] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* shiftq */ - [0x74] =3D MMX_OP2(pcmpeqb), - [0x75] =3D MMX_OP2(pcmpeqw), - [0x76] =3D MMX_OP2(pcmpeql), - [0x77] =3D { SSE_DUMMY }, /* emms */ - [0x78] =3D { NULL, SSE_SPECIAL, NULL, SSE_SPECIAL }, /* extrq_i, inser= tq_i */ - [0x79] =3D { NULL, gen_helper_extrq_r, NULL, gen_helper_insertq_r }, - [0x7c] =3D { NULL, gen_helper_haddpd, NULL, gen_helper_haddps }, - [0x7d] =3D { NULL, gen_helper_hsubpd, NULL, gen_helper_hsubps }, - [0x7e] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* movd, movd, ,= movq */ - [0x7f] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* movq, movdqa,= movdqu */ - [0xc4] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* pinsrw */ - [0xc5] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* pextrw */ - [0xd0] =3D { NULL, gen_helper_addsubpd, NULL, gen_helper_addsubps }, - [0xd1] =3D MMX_OP2(psrlw), - [0xd2] =3D MMX_OP2(psrld), - [0xd3] =3D MMX_OP2(psrlq), - [0xd4] =3D MMX_OP2(paddq), - 
[0xd5] =3D MMX_OP2(pmullw), - [0xd6] =3D { NULL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, - [0xd7] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* pmovmskb */ - [0xd8] =3D MMX_OP2(psubusb), - [0xd9] =3D MMX_OP2(psubusw), - [0xda] =3D MMX_OP2(pminub), - [0xdb] =3D MMX_OP2(pand), - [0xdc] =3D MMX_OP2(paddusb), - [0xdd] =3D MMX_OP2(paddusw), - [0xde] =3D MMX_OP2(pmaxub), - [0xdf] =3D MMX_OP2(pandn), - [0xe0] =3D MMX_OP2(pavgb), - [0xe1] =3D MMX_OP2(psraw), - [0xe2] =3D MMX_OP2(psrad), - [0xe3] =3D MMX_OP2(pavgw), - [0xe4] =3D MMX_OP2(pmulhuw), - [0xe5] =3D MMX_OP2(pmulhw), - [0xe6] =3D { NULL, gen_helper_cvttpd2dq, gen_helper_cvtdq2pd, gen_help= er_cvtpd2dq }, - [0xe7] =3D { SSE_SPECIAL , SSE_SPECIAL }, /* movntq, movntq */ - [0xe8] =3D MMX_OP2(psubsb), - [0xe9] =3D MMX_OP2(psubsw), - [0xea] =3D MMX_OP2(pminsw), - [0xeb] =3D MMX_OP2(por), - [0xec] =3D MMX_OP2(paddsb), - [0xed] =3D MMX_OP2(paddsw), - [0xee] =3D MMX_OP2(pmaxsw), - [0xef] =3D MMX_OP2(pxor), - [0xf0] =3D { NULL, NULL, NULL, SSE_SPECIAL }, /* lddqu */ - [0xf1] =3D MMX_OP2(psllw), - [0xf2] =3D MMX_OP2(pslld), - [0xf3] =3D MMX_OP2(psllq), - [0xf4] =3D MMX_OP2(pmuludq), - [0xf5] =3D MMX_OP2(pmaddwd), - [0xf6] =3D MMX_OP2(psadbw), - [0xf7] =3D { (SSEFunc_0_epp)gen_helper_maskmov_mmx, - (SSEFunc_0_epp)gen_helper_maskmov_xmm }, /* XXX: casts */ - [0xf8] =3D MMX_OP2(psubb), - [0xf9] =3D MMX_OP2(psubw), - [0xfa] =3D MMX_OP2(psubl), - [0xfb] =3D MMX_OP2(psubq), - [0xfc] =3D MMX_OP2(paddb), - [0xfd] =3D MMX_OP2(paddw), - [0xfe] =3D MMX_OP2(paddl), + [0x60] =3D MMX_OP(punpcklbw), + [0x61] =3D MMX_OP(punpcklwd), + [0x62] =3D MMX_OP(punpckldq), + [0x63] =3D MMX_OP(packsswb), + [0x64] =3D MMX_OP(pcmpgtb), + [0x65] =3D MMX_OP(pcmpgtw), + [0x66] =3D MMX_OP(pcmpgtl), + [0x67] =3D MMX_OP(packuswb), + [0x68] =3D MMX_OP(punpckhbw), + [0x69] =3D MMX_OP(punpckhwd), + [0x6a] =3D MMX_OP(punpckhdq), + [0x6b] =3D MMX_OP(packssdw), + [0x6c] =3D OP(op2, SSE_OPF_MMX, + NULL, gen_helper_punpcklqdq_xmm, NULL, NULL, + NULL, gen_helper_punpcklqdq_ymm, NULL, NULL), + [0x6d] =3D OP(op2, SSE_OPF_MMX, + NULL, gen_helper_punpckhqdq_xmm, NULL, NULL, + NULL, gen_helper_punpckhqdq_ymm, NULL, NULL), + [0x6e] =3D SSE_SPECIAL, /* movd mm, ea */ + [0x6f] =3D SSE_SPECIAL, /* movq, movdqa, , movqdu */ + [0x70] =3D OP(op1i, SSE_OPF_SHUF | SSE_OPF_MMX | SSE_OPF_V0, + gen_helper_pshufw_mmx, gen_helper_pshufd_xmm, + gen_helper_pshufhw_xmm, gen_helper_pshuflw_xmm, + NULL, gen_helper_pshufd_ymm, + gen_helper_pshufhw_ymm, gen_helper_pshuflw_ymm), + [0x71] =3D SSE_SPECIAL, /* shiftw */ + [0x72] =3D SSE_SPECIAL, /* shiftd */ + [0x73] =3D SSE_SPECIAL, /* shiftq */ + [0x74] =3D MMX_OP(pcmpeqb), + [0x75] =3D MMX_OP(pcmpeqw), + [0x76] =3D MMX_OP(pcmpeql), + [0x77] =3D SSE_SPECIAL, /* emms */ + [0x78] =3D SSE_SPECIAL, /* extrq_i, insertq_i (sse4a) */ + [0x79] =3D OP(op1, SSE_OPF_V0, + NULL, gen_helper_extrq_r, NULL, gen_helper_insertq_r, + NULL, NULL, NULL, NULL), + [0x7c] =3D OP(op2, 0, + NULL, gen_helper_haddpd_xmm, NULL, gen_helper_haddps_xmm, + NULL, gen_helper_haddpd_ymm, NULL, gen_helper_haddps_ymm), + [0x7d] =3D OP(op2, 0, + NULL, gen_helper_hsubpd_xmm, NULL, gen_helper_hsubps_xmm, + NULL, gen_helper_hsubpd_ymm, NULL, gen_helper_hsubps_ymm), + [0x7e] =3D SSE_SPECIAL, /* movd, movd, , movq */ + [0x7f] =3D SSE_SPECIAL, /* movq, movdqa, movdqu */ + [0xc4] =3D SSE_SPECIAL, /* pinsrw */ + [0xc5] =3D SSE_SPECIAL, /* pextrw */ + [0xd0] =3D OP(op2, 0, + NULL, gen_helper_addsubpd_xmm, NULL, gen_helper_addsubps_x= mm, + NULL, gen_helper_addsubpd_ymm, NULL, gen_helper_addsubps_y= mm), + [0xd1] =3D 
MMX_OP(psrlw), + [0xd2] =3D MMX_OP(psrld), + [0xd3] =3D MMX_OP(psrlq), + [0xd4] =3D MMX_OP(paddq), + [0xd5] =3D MMX_OP(pmullw), + [0xd6] =3D SSE_SPECIAL, + [0xd7] =3D SSE_SPECIAL, /* pmovmskb */ + [0xd8] =3D MMX_OP(psubusb), + [0xd9] =3D MMX_OP(psubusw), + [0xda] =3D MMX_OP(pminub), + [0xdb] =3D MMX_OP(pand), + [0xdc] =3D MMX_OP(paddusb), + [0xdd] =3D MMX_OP(paddusw), + [0xde] =3D MMX_OP(pmaxub), + [0xdf] =3D MMX_OP(pandn), + [0xe0] =3D MMX_OP(pavgb), + [0xe1] =3D MMX_OP(psraw), + [0xe2] =3D MMX_OP(psrad), + [0xe3] =3D MMX_OP(pavgw), + [0xe4] =3D MMX_OP(pmulhuw), + [0xe5] =3D MMX_OP(pmulhw), + [0xe6] =3D OP(op1, SSE_OPF_V0, + NULL, gen_helper_cvttpd2dq_xmm, + gen_helper_cvtdq2pd_xmm, gen_helper_cvtpd2dq_xmm, + NULL, gen_helper_cvttpd2dq_ymm, + gen_helper_cvtdq2pd_ymm, gen_helper_cvtpd2dq_ymm), + [0xe7] =3D SSE_SPECIAL, /* movntq, movntq */ + [0xe8] =3D MMX_OP(psubsb), + [0xe9] =3D MMX_OP(psubsw), + [0xea] =3D MMX_OP(pminsw), + [0xeb] =3D MMX_OP(por), + [0xec] =3D MMX_OP(paddsb), + [0xed] =3D MMX_OP(paddsw), + [0xee] =3D MMX_OP(pmaxsw), + [0xef] =3D MMX_OP(pxor), + [0xf0] =3D SSE_SPECIAL, /* lddqu */ + [0xf1] =3D MMX_OP(psllw), + [0xf2] =3D MMX_OP(pslld), + [0xf3] =3D MMX_OP(psllq), + [0xf4] =3D MMX_OP(pmuludq), + [0xf5] =3D MMX_OP(pmaddwd), + [0xf6] =3D MMX_OP(psadbw), + [0xf7] =3D OP(op1t, SSE_OPF_MMX | SSE_OPF_V0, + gen_helper_maskmov_mmx, gen_helper_maskmov_xmm, NULL, NULL, + NULL, NULL, NULL, NULL), + [0xf8] =3D MMX_OP(psubb), + [0xf9] =3D MMX_OP(psubw), + [0xfa] =3D MMX_OP(psubl), + [0xfb] =3D MMX_OP(psubq), + [0xfc] =3D MMX_OP(paddb), + [0xfd] =3D MMX_OP(paddw), + [0xfe] =3D MMX_OP(paddl), }; - -static const SSEFunc_0_epp sse_op_table2[3 * 8][2] =3D { - [0 + 2] =3D MMX_OP2(psrlw), - [0 + 4] =3D MMX_OP2(psraw), - [0 + 6] =3D MMX_OP2(psllw), - [8 + 2] =3D MMX_OP2(psrld), - [8 + 4] =3D MMX_OP2(psrad), - [8 + 6] =3D MMX_OP2(pslld), - [16 + 2] =3D MMX_OP2(psrlq), - [16 + 3] =3D { NULL, gen_helper_psrldq_xmm }, - [16 + 6] =3D MMX_OP2(psllq), - [16 + 7] =3D { NULL, gen_helper_pslldq_xmm }, +#undef MMX_OP +#undef OP +#undef SSE_FOP +#undef SSE_OP +#undef SSE_SPECIAL + +#define MMX_OP(x) { gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm, \ + gen_helper_ ## x ## _ymm} +static const SSEFunc_0_eppp sse_op_table2[3 * 8][3] =3D { + [0 + 2] =3D MMX_OP(psrlw), + [0 + 4] =3D MMX_OP(psraw), + [0 + 6] =3D MMX_OP(psllw), + [8 + 2] =3D MMX_OP(psrld), + [8 + 4] =3D MMX_OP(psrad), + [8 + 6] =3D MMX_OP(pslld), + [16 + 2] =3D MMX_OP(psrlq), + [16 + 3] =3D { NULL, gen_helper_psrldq_xmm, gen_helper_psrldq_ymm}, + [16 + 6] =3D MMX_OP(psllq), + [16 + 7] =3D { NULL, gen_helper_pslldq_xmm, gen_helper_pslldq_ymm}, }; +#undef MMX_OP =20 static const SSEFunc_0_epi sse_op_table3ai[] =3D { gen_helper_cvtsi2ss, @@ -2968,16 +3141,53 @@ static const SSEFunc_l_ep sse_op_table3bq[] =3D { }; #endif =20 -static const SSEFunc_0_epp sse_op_table4[8][4] =3D { - SSE_FOP(cmpeq), - SSE_FOP(cmplt), - SSE_FOP(cmple), - SSE_FOP(cmpunord), - SSE_FOP(cmpneq), - SSE_FOP(cmpnlt), - SSE_FOP(cmpnle), - SSE_FOP(cmpord), +#define SSE_CMP(x) { \ + gen_helper_ ## x ## ps ## _xmm, gen_helper_ ## x ## pd ## _xmm, \ + gen_helper_ ## x ## ss, gen_helper_ ## x ## sd, \ + gen_helper_ ## x ## ps ## _ymm, gen_helper_ ## x ## pd ## _ymm} +static const SSEFunc_0_eppp sse_op_table4[32][6] =3D { + SSE_CMP(cmpeq), + SSE_CMP(cmplt), + SSE_CMP(cmple), + SSE_CMP(cmpunord), + SSE_CMP(cmpneq), + SSE_CMP(cmpnlt), + SSE_CMP(cmpnle), + SSE_CMP(cmpord), + + SSE_CMP(cmpequ), + SSE_CMP(cmpnge), + SSE_CMP(cmpngt), + SSE_CMP(cmpfalse), + SSE_CMP(cmpnequ), + 
SSE_CMP(cmpge), + SSE_CMP(cmpgt), + SSE_CMP(cmptrue), + + SSE_CMP(cmpeqs), + SSE_CMP(cmpltq), + SSE_CMP(cmpleq), + SSE_CMP(cmpunords), + SSE_CMP(cmpneqq), + SSE_CMP(cmpnltq), + SSE_CMP(cmpnleq), + SSE_CMP(cmpords), + + SSE_CMP(cmpequs), + SSE_CMP(cmpngeq), + SSE_CMP(cmpngtq), + SSE_CMP(cmpfalses), + SSE_CMP(cmpnequs), + SSE_CMP(cmpgeq), + SSE_CMP(cmpgtq), + SSE_CMP(cmptrues), }; +#undef SSE_CMP + +static void gen_helper_pavgusb(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_= b) +{ + gen_helper_pavgb_mmx(env, reg_a, reg_a, reg_b); +} =20 static const SSEFunc_0_epp sse_op_table5[256] =3D { [0x0c] =3D gen_helper_pi2fw, @@ -3003,117 +3213,291 @@ static const SSEFunc_0_epp sse_op_table5[256] =3D= { [0xb6] =3D gen_helper_movq, /* pfrcpit2 */ [0xb7] =3D gen_helper_pmulhrw_mmx, [0xbb] =3D gen_helper_pswapd, - [0xbf] =3D gen_helper_pavgb_mmx /* pavgusb */ + [0xbf] =3D gen_helper_pavgusb, }; =20 -struct SSEOpHelper_epp { - SSEFunc_0_epp op[2]; +struct SSEOpHelper_table6 { + union { + SSEFunc_0_epp op1; + SSEFunc_0_eppp op2; + SSEFunc_0_epppp op3; + } fn[3]; /* [0] =3D mmx, [1] =3D xmm, fn[2] =3D ymm */ uint32_t ext_mask; + int flags; }; =20 -struct SSEOpHelper_eppi { - SSEFunc_0_eppi op[2]; +struct SSEOpHelper_table7 { + union { + SSEFunc_0_eppi op1; + SSEFunc_0_epppi op2; + SSEFunc_0_epppp op3; + } fn[3]; uint32_t ext_mask; + int flags; +}; + +#define gen_helper_special_xmm NULL +#define gen_helper_special_ymm NULL + +#define OP(name, op, flags, ext, mmx_name) \ + {{{.op =3D mmx_name}, {.op =3D gen_helper_ ## name ## _xmm}, \ + {.op =3D gen_helper_ ## name ## _ymm} }, CPUID_EXT_ ## ext, flags} +#define BINARY_OP_MMX(name, ext) \ + OP(name, op2, SSE_OPF_MMX, ext, gen_helper_ ## name ## _mmx) +#define BINARY_OP(name, ext, flags) \ + OP(name, op2, flags, ext, NULL) +#define UNARY_OP_MMX(name, ext) \ + OP(name, op1, SSE_OPF_V0 | SSE_OPF_MMX, ext, gen_helper_ ## name ## _m= mx) +#define UNARY_OP(name, ext, flags) \ + OP(name, op1, SSE_OPF_V0 | flags, ext, NULL) +#define BLENDV_OP(name, ext, flags) OP(name, op3, SSE_OPF_BLENDV, ext, NUL= L) +#define CMP_OP(name, ext) OP(name, op1, SSE_OPF_CMP | SSE_OPF_V0, ext, NUL= L) +#define SPECIAL_OP(ext) OP(special, op1, SSE_OPF_SPECIAL, ext, NULL) + +/* prefix [66] 0f 38 */ +static const struct SSEOpHelper_table6 sse_op_table6[256] =3D { + [0x00] =3D BINARY_OP_MMX(pshufb, SSSE3), + [0x01] =3D BINARY_OP_MMX(phaddw, SSSE3), + [0x02] =3D BINARY_OP_MMX(phaddd, SSSE3), + [0x03] =3D BINARY_OP_MMX(phaddsw, SSSE3), + [0x04] =3D BINARY_OP_MMX(pmaddubsw, SSSE3), + [0x05] =3D BINARY_OP_MMX(phsubw, SSSE3), + [0x06] =3D BINARY_OP_MMX(phsubd, SSSE3), + [0x07] =3D BINARY_OP_MMX(phsubsw, SSSE3), + [0x08] =3D BINARY_OP_MMX(psignb, SSSE3), + [0x09] =3D BINARY_OP_MMX(psignw, SSSE3), + [0x0a] =3D BINARY_OP_MMX(psignd, SSSE3), + [0x0b] =3D BINARY_OP_MMX(pmulhrsw, SSSE3), + [0x0c] =3D BINARY_OP(vpermilps, AVX, 0), + [0x0d] =3D BINARY_OP(vpermilpd, AVX, 0), + [0x0e] =3D CMP_OP(vtestps, AVX), + [0x0f] =3D CMP_OP(vtestpd, AVX), + [0x10] =3D BLENDV_OP(pblendvb, SSE41, SSE_OPF_MMX), + [0x14] =3D BLENDV_OP(blendvps, SSE41, 0), + [0x15] =3D BLENDV_OP(blendvpd, SSE41, 0), +#define gen_helper_vpermd_xmm NULL + [0x16] =3D BINARY_OP(vpermd, AVX, SSE_OPF_AVX2), /* vpermps */ + [0x17] =3D CMP_OP(ptest, SSE41), + /* TODO:Some vbroadcast variants require AVX2 */ + [0x18] =3D UNARY_OP(vbroadcastl, AVX, SSE_OPF_SCALAR), /* vbroadcastss= */ + [0x19] =3D UNARY_OP(vbroadcastq, AVX, SSE_OPF_SCALAR), /* vbroadcastsd= */ +#define gen_helper_vbroadcastdq_xmm NULL + [0x1a] =3D UNARY_OP(vbroadcastdq, AVX, 
SSE_OPF_SCALAR), /* vbroadcastf= 128 */ + [0x1c] =3D UNARY_OP_MMX(pabsb, SSSE3), + [0x1d] =3D UNARY_OP_MMX(pabsw, SSSE3), + [0x1e] =3D UNARY_OP_MMX(pabsd, SSSE3), + [0x20] =3D UNARY_OP(pmovsxbw, SSE41, SSE_OPF_MMX), + [0x21] =3D UNARY_OP(pmovsxbd, SSE41, SSE_OPF_MMX), + [0x22] =3D UNARY_OP(pmovsxbq, SSE41, SSE_OPF_MMX), + [0x23] =3D UNARY_OP(pmovsxwd, SSE41, SSE_OPF_MMX), + [0x24] =3D UNARY_OP(pmovsxwq, SSE41, SSE_OPF_MMX), + [0x25] =3D UNARY_OP(pmovsxdq, SSE41, SSE_OPF_MMX), + [0x28] =3D BINARY_OP(pmuldq, SSE41, SSE_OPF_MMX), + [0x29] =3D BINARY_OP(pcmpeqq, SSE41, SSE_OPF_MMX), + [0x2a] =3D SPECIAL_OP(SSE41), /* movntqda */ + [0x2b] =3D BINARY_OP(packusdw, SSE41, SSE_OPF_MMX), + [0x2c] =3D BINARY_OP(vpmaskmovd, AVX, 0), /* vmaskmovps */ + [0x2d] =3D BINARY_OP(vpmaskmovq, AVX, 0), /* vmaskmovpd */ + [0x2e] =3D SPECIAL_OP(AVX), /* vmaskmovps */ + [0x2f] =3D SPECIAL_OP(AVX), /* vmaskmovpd */ + [0x30] =3D UNARY_OP(pmovzxbw, SSE41, SSE_OPF_MMX), + [0x31] =3D UNARY_OP(pmovzxbd, SSE41, SSE_OPF_MMX), + [0x32] =3D UNARY_OP(pmovzxbq, SSE41, SSE_OPF_MMX), + [0x33] =3D UNARY_OP(pmovzxwd, SSE41, SSE_OPF_MMX), + [0x34] =3D UNARY_OP(pmovzxwq, SSE41, SSE_OPF_MMX), + [0x35] =3D UNARY_OP(pmovzxdq, SSE41, SSE_OPF_MMX), + [0x36] =3D BINARY_OP(vpermd, AVX, SSE_OPF_AVX2), /* vpermd */ + [0x37] =3D BINARY_OP(pcmpgtq, SSE41, SSE_OPF_MMX), + [0x38] =3D BINARY_OP(pminsb, SSE41, SSE_OPF_MMX), + [0x39] =3D BINARY_OP(pminsd, SSE41, SSE_OPF_MMX), + [0x3a] =3D BINARY_OP(pminuw, SSE41, SSE_OPF_MMX), + [0x3b] =3D BINARY_OP(pminud, SSE41, SSE_OPF_MMX), + [0x3c] =3D BINARY_OP(pmaxsb, SSE41, SSE_OPF_MMX), + [0x3d] =3D BINARY_OP(pmaxsd, SSE41, SSE_OPF_MMX), + [0x3e] =3D BINARY_OP(pmaxuw, SSE41, SSE_OPF_MMX), + [0x3f] =3D BINARY_OP(pmaxud, SSE41, SSE_OPF_MMX), + [0x40] =3D BINARY_OP(pmulld, SSE41, SSE_OPF_MMX), +#define gen_helper_phminposuw_ymm NULL + [0x41] =3D UNARY_OP(phminposuw, SSE41, 0), + [0x45] =3D BINARY_OP(vpsrlvd, AVX, SSE_OPF_AVX2), + [0x46] =3D BINARY_OP(vpsravd, AVX, SSE_OPF_AVX2), + [0x47] =3D BINARY_OP(vpsllvd, AVX, SSE_OPF_AVX2), + /* vpbroadcastd */ + [0x58] =3D UNARY_OP(vbroadcastl, AVX, SSE_OPF_SCALAR | SSE_OPF_MMX), + /* vpbroadcastq */ + [0x59] =3D UNARY_OP(vbroadcastq, AVX, SSE_OPF_SCALAR | SSE_OPF_MMX), + /* vbroadcasti128 */ + [0x5a] =3D UNARY_OP(vbroadcastdq, AVX, SSE_OPF_SCALAR | SSE_OPF_MMX), + /* vpbroadcastb */ + [0x78] =3D UNARY_OP(vbroadcastb, AVX, SSE_OPF_SCALAR | SSE_OPF_MMX), + /* vpbroadcastw */ + [0x79] =3D UNARY_OP(vbroadcastw, AVX, SSE_OPF_SCALAR | SSE_OPF_MMX), + /* vpmaskmovd, vpmaskmovq */ + [0x8c] =3D BINARY_OP(vpmaskmovd, AVX, SSE_OPF_AVX2), + [0x8e] =3D SPECIAL_OP(AVX), /* vpmaskmovd, vpmaskmovq */ + [0x90] =3D SPECIAL_OP(AVX), /* vpgatherdd, vpgatherdq */ + [0x91] =3D SPECIAL_OP(AVX), /* vpgatherqd, vpgatherqq */ + [0x92] =3D SPECIAL_OP(AVX), /* vgatherdpd, vgatherdps */ + [0x93] =3D SPECIAL_OP(AVX), /* vgatherqpd, vgatherqps */ +#define gen_helper_aesimc_ymm NULL + [0xdb] =3D UNARY_OP(aesimc, AES, 0), + [0xdc] =3D BINARY_OP(aesenc, AES, 0), + [0xdd] =3D BINARY_OP(aesenclast, AES, 0), + [0xde] =3D BINARY_OP(aesdec, AES, 0), + [0xdf] =3D BINARY_OP(aesdeclast, AES, 0), +}; + +/* prefix [66] 0f 3a */ +static const struct SSEOpHelper_table7 sse_op_table7[256] =3D { +#define gen_helper_vpermq_xmm NULL + [0x00] =3D UNARY_OP(vpermq, AVX, SSE_OPF_AVX2), + [0x01] =3D UNARY_OP(vpermq, AVX, SSE_OPF_AVX2), /* vpermpd */ + [0x02] =3D BINARY_OP(blendps, AVX, SSE_OPF_AVX2), /* vpblendd */ + [0x04] =3D UNARY_OP(vpermilps_imm, AVX, 0), + [0x05] =3D UNARY_OP(vpermilpd_imm, AVX, 0), +#define 
gen_helper_vpermdq_xmm NULL + [0x06] =3D BINARY_OP(vpermdq, AVX, 0), /* vperm2f128 */ + [0x08] =3D UNARY_OP(roundps, SSE41, 0), + [0x09] =3D UNARY_OP(roundpd, SSE41, 0), +#define gen_helper_roundss_ymm NULL + [0x0a] =3D UNARY_OP(roundss, SSE41, SSE_OPF_SCALAR), +#define gen_helper_roundsd_ymm NULL + [0x0b] =3D UNARY_OP(roundsd, SSE41, SSE_OPF_SCALAR), + [0x0c] =3D BINARY_OP(blendps, SSE41, 0), + [0x0d] =3D BINARY_OP(blendpd, SSE41, 0), + [0x0e] =3D BINARY_OP(pblendw, SSE41, SSE_OPF_MMX), + [0x0f] =3D BINARY_OP_MMX(palignr, SSSE3), + [0x14] =3D SPECIAL_OP(SSE41), /* pextrb */ + [0x15] =3D SPECIAL_OP(SSE41), /* pextrw */ + [0x16] =3D SPECIAL_OP(SSE41), /* pextrd/pextrq */ + [0x17] =3D SPECIAL_OP(SSE41), /* extractps */ + [0x18] =3D SPECIAL_OP(AVX), /* vinsertf128 */ + [0x19] =3D SPECIAL_OP(AVX), /* vextractf128 */ + [0x20] =3D SPECIAL_OP(SSE41), /* pinsrb */ + [0x21] =3D SPECIAL_OP(SSE41), /* insertps */ + [0x22] =3D SPECIAL_OP(SSE41), /* pinsrd/pinsrq */ + [0x38] =3D SPECIAL_OP(AVX), /* vinserti128 */ + [0x39] =3D SPECIAL_OP(AVX), /* vextracti128 */ + [0x40] =3D BINARY_OP(dpps, SSE41, 0), +#define gen_helper_dppd_ymm NULL + [0x41] =3D BINARY_OP(dppd, SSE41, 0), + [0x42] =3D BINARY_OP(mpsadbw, SSE41, SSE_OPF_MMX), + [0x44] =3D BINARY_OP(pclmulqdq, PCLMULQDQ, 0), + [0x46] =3D BINARY_OP(vpermdq, AVX, SSE_OPF_AVX2), /* vperm2i128 */ + [0x4a] =3D BLENDV_OP(blendvps, SSE41, 0), + [0x4b] =3D BLENDV_OP(blendvpd, SSE41, 0), + [0x4c] =3D BLENDV_OP(pblendvb, SSE41, SSE_OPF_MMX), +#define gen_helper_pcmpestrm_ymm NULL + [0x60] =3D CMP_OP(pcmpestrm, SSE42), +#define gen_helper_pcmpestri_ymm NULL + [0x61] =3D CMP_OP(pcmpestri, SSE42), +#define gen_helper_pcmpistrm_ymm NULL + [0x62] =3D CMP_OP(pcmpistrm, SSE42), +#define gen_helper_pcmpistri_ymm NULL + [0x63] =3D CMP_OP(pcmpistri, SSE42), +#define gen_helper_aeskeygenassist_ymm NULL + [0xdf] =3D UNARY_OP(aeskeygenassist, AES, 0), }; =20 -#define SSSE3_OP(x) { MMX_OP2(x), CPUID_EXT_SSSE3 } -#define SSE41_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, CPUID_EXT_SSE41 } -#define SSE42_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, CPUID_EXT_SSE42 } -#define SSE41_SPECIAL { { NULL, SSE_SPECIAL }, CPUID_EXT_SSE41 } -#define PCLMULQDQ_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, \ - CPUID_EXT_PCLMULQDQ } -#define AESNI_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, CPUID_EXT_AES } - -static const struct SSEOpHelper_epp sse_op_table6[256] =3D { - [0x00] =3D SSSE3_OP(pshufb), - [0x01] =3D SSSE3_OP(phaddw), - [0x02] =3D SSSE3_OP(phaddd), - [0x03] =3D SSSE3_OP(phaddsw), - [0x04] =3D SSSE3_OP(pmaddubsw), - [0x05] =3D SSSE3_OP(phsubw), - [0x06] =3D SSSE3_OP(phsubd), - [0x07] =3D SSSE3_OP(phsubsw), - [0x08] =3D SSSE3_OP(psignb), - [0x09] =3D SSSE3_OP(psignw), - [0x0a] =3D SSSE3_OP(psignd), - [0x0b] =3D SSSE3_OP(pmulhrsw), - [0x10] =3D SSE41_OP(pblendvb), - [0x14] =3D SSE41_OP(blendvps), - [0x15] =3D SSE41_OP(blendvpd), - [0x17] =3D SSE41_OP(ptest), - [0x1c] =3D SSSE3_OP(pabsb), - [0x1d] =3D SSSE3_OP(pabsw), - [0x1e] =3D SSSE3_OP(pabsd), - [0x20] =3D SSE41_OP(pmovsxbw), - [0x21] =3D SSE41_OP(pmovsxbd), - [0x22] =3D SSE41_OP(pmovsxbq), - [0x23] =3D SSE41_OP(pmovsxwd), - [0x24] =3D SSE41_OP(pmovsxwq), - [0x25] =3D SSE41_OP(pmovsxdq), - [0x28] =3D SSE41_OP(pmuldq), - [0x29] =3D SSE41_OP(pcmpeqq), - [0x2a] =3D SSE41_SPECIAL, /* movntqda */ - [0x2b] =3D SSE41_OP(packusdw), - [0x30] =3D SSE41_OP(pmovzxbw), - [0x31] =3D SSE41_OP(pmovzxbd), - [0x32] =3D SSE41_OP(pmovzxbq), - [0x33] =3D SSE41_OP(pmovzxwd), - [0x34] =3D SSE41_OP(pmovzxwq), - [0x35] =3D SSE41_OP(pmovzxdq), - [0x37] =3D 
SSE42_OP(pcmpgtq), - [0x38] =3D SSE41_OP(pminsb), - [0x39] =3D SSE41_OP(pminsd), - [0x3a] =3D SSE41_OP(pminuw), - [0x3b] =3D SSE41_OP(pminud), - [0x3c] =3D SSE41_OP(pmaxsb), - [0x3d] =3D SSE41_OP(pmaxsd), - [0x3e] =3D SSE41_OP(pmaxuw), - [0x3f] =3D SSE41_OP(pmaxud), - [0x40] =3D SSE41_OP(pmulld), - [0x41] =3D SSE41_OP(phminposuw), - [0xdb] =3D AESNI_OP(aesimc), - [0xdc] =3D AESNI_OP(aesenc), - [0xdd] =3D AESNI_OP(aesenclast), - [0xde] =3D AESNI_OP(aesdec), - [0xdf] =3D AESNI_OP(aesdeclast), +#define SSE_OP(name) \ + {gen_helper_ ## name ##_xmm, gen_helper_ ## name ##_ymm} +static const SSEFunc_0_eppp sse_op_table8[3][2] =3D { + SSE_OP(vpsrlvq), + SSE_OP(vpsravq), + SSE_OP(vpsllvq), }; =20 -static const struct SSEOpHelper_eppi sse_op_table7[256] =3D { - [0x08] =3D SSE41_OP(roundps), - [0x09] =3D SSE41_OP(roundpd), - [0x0a] =3D SSE41_OP(roundss), - [0x0b] =3D SSE41_OP(roundsd), - [0x0c] =3D SSE41_OP(blendps), - [0x0d] =3D SSE41_OP(blendpd), - [0x0e] =3D SSE41_OP(pblendw), - [0x0f] =3D SSSE3_OP(palignr), - [0x14] =3D SSE41_SPECIAL, /* pextrb */ - [0x15] =3D SSE41_SPECIAL, /* pextrw */ - [0x16] =3D SSE41_SPECIAL, /* pextrd/pextrq */ - [0x17] =3D SSE41_SPECIAL, /* extractps */ - [0x20] =3D SSE41_SPECIAL, /* pinsrb */ - [0x21] =3D SSE41_SPECIAL, /* insertps */ - [0x22] =3D SSE41_SPECIAL, /* pinsrd/pinsrq */ - [0x40] =3D SSE41_OP(dpps), - [0x41] =3D SSE41_OP(dppd), - [0x42] =3D SSE41_OP(mpsadbw), - [0x44] =3D PCLMULQDQ_OP(pclmulqdq), - [0x60] =3D SSE42_OP(pcmpestrm), - [0x61] =3D SSE42_OP(pcmpestri), - [0x62] =3D SSE42_OP(pcmpistrm), - [0x63] =3D SSE42_OP(pcmpistri), - [0xdf] =3D AESNI_OP(aeskeygenassist), +static const SSEFunc_0_eppt sse_op_table9[2][2] =3D { + SSE_OP(vpmaskmovd_st), + SSE_OP(vpmaskmovq_st), }; =20 +static const SSEFunc_0_epppt sse_op_table10[16][2] =3D { + SSE_OP(vpgatherdd0), + SSE_OP(vpgatherdq0), + SSE_OP(vpgatherqd0), + SSE_OP(vpgatherqq0), + SSE_OP(vpgatherdd1), + SSE_OP(vpgatherdq1), + SSE_OP(vpgatherqd1), + SSE_OP(vpgatherqq1), + SSE_OP(vpgatherdd2), + SSE_OP(vpgatherdq2), + SSE_OP(vpgatherqd2), + SSE_OP(vpgatherqq2), + SSE_OP(vpgatherdd3), + SSE_OP(vpgatherdq3), + SSE_OP(vpgatherqd3), + SSE_OP(vpgatherqq3), +}; +#undef SSE_OP + +#undef OP +#undef BINARY_OP_MMX +#undef BINARY_OP +#undef UNARY_OP_MMX +#undef UNARY_OP +#undef BLENDV_OP +#undef SPECIAL_OP + +/* VEX prefix not allowed */ +#define CHECK_NO_VEX(s) do { \ + if (s->prefix & PREFIX_VEX) \ + goto illegal_op; \ + } while (0) + +/* + * VEX encodings require AVX + * Allow legacy SSE encodings even if AVX not enabled + */ +#define CHECK_AVX(s) do { \ + if ((s->prefix & PREFIX_VEX) \ + && !(env->hflags & HF_AVX_EN_MASK)) \ + goto illegal_op; \ + } while (0) + +/* If a VEX prefix is used then it must have V=3D1111b */ +#define CHECK_AVX_V0(s) do { \ + CHECK_AVX(s); \ + if ((s->prefix & PREFIX_VEX) && (s->vex_v !=3D 0)) \ + goto illegal_op; \ + } while (0) + +/* If a VEX prefix is used then it must have L=3D0 */ +#define CHECK_AVX_128(s) do { \ + CHECK_AVX(s); \ + if ((s->prefix & PREFIX_VEX) && (s->vex_l !=3D 0)) \ + goto illegal_op; \ + } while (0) + +/* If a VEX prefix is used then it must have V=3D1111b and L=3D0 */ +#define CHECK_AVX_V0_128(s) do { \ + CHECK_AVX(s); \ + if ((s->prefix & PREFIX_VEX) && (s->vex_v !=3D 0 || s->vex_l !=3D 0)) \ + goto illegal_op; \ + } while (0) + +/* 256-bit (ymm) variants require AVX2 */ +#define CHECK_AVX2_256(s) do { \ + if (s->vex_l && !(s->cpuid_7_0_ebx_features & CPUID_7_0_EBX_AVX2)) \ + goto illegal_op; \ + } while (0) + +/* Requires AVX2 and VEX encoding */ +#define 
CHECK_AVX2(s) do { \ + if ((s->prefix & PREFIX_VEX) =3D=3D 0 \ + || !(s->cpuid_7_0_ebx_features & CPUID_7_0_EBX_AVX2)) \ + goto illegal_op; \ + } while (0) + static void gen_sse(CPUX86State *env, DisasContext *s, int b, target_ulong pc_start) { - int b1, op1_offset, op2_offset, is_xmm, val; - int modrm, mod, rm, reg; - SSEFunc_0_epp sse_fn_epp; - SSEFunc_0_eppi sse_fn_eppi; - SSEFunc_0_ppi sse_fn_ppi; - SSEFunc_0_eppt sse_fn_eppt; + int b1, op1_offset, op2_offset, v_offset, is_xmm, val, scalar_op; + int modrm, mod, rm, reg, reg_v; + struct SSEOpHelper_table1 sse_op; + struct SSEOpHelper_table6 op6; + struct SSEOpHelper_table7 op7; MemOp ot; =20 b &=3D 0xff; @@ -3125,10 +3509,7 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, b1 =3D 3; else b1 =3D 0; - sse_fn_epp =3D sse_op_table1[b][b1]; - if (!sse_fn_epp) { - goto unknown_op; - } + sse_op =3D sse_op_table1[b]; if ((b <=3D 0x5f && b >=3D 0x10) || b =3D=3D 0xc6 || b =3D=3D 0xc2) { is_xmm =3D 1; } else { @@ -3139,20 +3520,28 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, is_xmm =3D 1; } } + if (sse_op.flags & SSE_OPF_3DNOW) { + if (!(s->cpuid_ext2_features & CPUID_EXT2_3DNOW)) { + goto illegal_op; + } + } /* simple MMX/SSE operation */ if (s->flags & HF_TS_MASK) { gen_exception(s, EXCP07_PREX, pc_start - s->cs_base); return; } - if (s->flags & HF_EM_MASK) { - illegal_op: - gen_illegal_opcode(s); - return; - } - if (is_xmm - && !(s->flags & HF_OSFXSR_MASK) - && (b !=3D 0x38 && b !=3D 0x3a)) { - goto unknown_op; + /* VEX encoded instuctions ignore EM bit. See also CHECK_AVX */ + if (!(s->prefix & PREFIX_VEX)) { + if (s->flags & HF_EM_MASK) { + illegal_op: + gen_illegal_opcode(s); + return; + } + if (is_xmm + && !(s->flags & HF_OSFXSR_MASK) + && (b !=3D 0x38 && b !=3D 0x3a)) { + goto unknown_op; + } } if (b =3D=3D 0x0e) { if (!(s->cpuid_ext2_features & CPUID_EXT2_3DNOW)) { @@ -3164,9 +3553,29 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, return; } if (b =3D=3D 0x77) { - /* emms */ - gen_helper_emms(cpu_env); - return; + if (s->prefix & PREFIX_VEX) { + CHECK_AVX(s); + if (s->vex_l) { + gen_helper_vzeroall(cpu_env); +#ifdef TARGET_X86_64 + if (CODE64(s)) { + gen_helper_vzeroall_hi8(cpu_env); + } +#endif + } else { + gen_helper_vzeroupper(cpu_env); +#ifdef TARGET_X86_64 + if (CODE64(s)) { + gen_helper_vzeroupper_hi8(cpu_env); + } +#endif + } + return; + } else { + /* emms */ + gen_helper_emms(cpu_env); + return; + } } /* prepare MMX state (XXX: optimize by storing fptt and fptags in the static cpu state) */ @@ -3179,11 +3588,17 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, if (is_xmm) { reg |=3D REX_R(s); } + if (s->prefix & PREFIX_VEX) { + reg_v =3D s->vex_v; + } else { + reg_v =3D reg; + } mod =3D (modrm >> 6) & 3; - if (sse_fn_epp =3D=3D SSE_SPECIAL) { + if (sse_op.flags & SSE_OPF_SPECIAL) { b |=3D (b1 << 8); switch(b) { case 0x0e7: /* movntq */ + CHECK_NO_VEX(s); if (mod =3D=3D 3) { goto illegal_op; } @@ -3193,19 +3608,31 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, case 0x1e7: /* movntdq */ case 0x02b: /* movntps */ case 0x12b: /* movntps */ + CHECK_AVX_V0(s); if (mod =3D=3D 3) goto illegal_op; gen_lea_modrm(env, s, modrm); - gen_sto_env_A0(s, offsetof(CPUX86State, xmm_regs[reg])); + if (s->vex_l) { + gen_sty_env_A0(s, XMM_OFFSET(reg)); + } else { + gen_sto_env_A0(s, XMM_OFFSET(reg)); + } break; case 0x3f0: /* lddqu */ + CHECK_AVX_V0(s); if (mod =3D=3D 3) goto illegal_op; gen_lea_modrm(env, s, modrm); - gen_ldo_env_A0(s, offsetof(CPUX86State, 
xmm_regs[reg])); + if (s->vex_l) { + gen_ldy_env_A0(s, XMM_OFFSET(reg)); + } else { + gen_ldo_env_A0(s, XMM_OFFSET(reg)); + gen_clear_ymmh(s, reg); + } break; case 0x22b: /* movntss */ case 0x32b: /* movntsd */ + CHECK_AVX_V0_128(s); if (mod =3D=3D 3) goto illegal_op; gen_lea_modrm(env, s, modrm); @@ -3219,6 +3646,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } break; case 0x6e: /* movd mm, ea */ + CHECK_NO_VEX(s); #ifdef TARGET_X86_64 if (s->dflag =3D=3D MO_64) { gen_ldst_modrm(env, s, modrm, MO_64, OR_TMP0, 0); @@ -3235,23 +3663,24 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, } break; case 0x16e: /* movd xmm, ea */ + CHECK_AVX_V0_128(s); #ifdef TARGET_X86_64 if (s->dflag =3D=3D MO_64) { gen_ldst_modrm(env, s, modrm, MO_64, OR_TMP0, 0); - tcg_gen_addi_ptr(s->ptr0, cpu_env, - offsetof(CPUX86State,xmm_regs[reg])); + tcg_gen_addi_ptr(s->ptr0, cpu_env, XMM_OFFSET(reg)); gen_helper_movq_mm_T0_xmm(s->ptr0, s->T0); } else #endif { gen_ldst_modrm(env, s, modrm, MO_32, OR_TMP0, 0); - tcg_gen_addi_ptr(s->ptr0, cpu_env, - offsetof(CPUX86State,xmm_regs[reg])); + tcg_gen_addi_ptr(s->ptr0, cpu_env, XMM_OFFSET(reg)); tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0); gen_helper_movl_mm_T0_xmm(s->ptr0, s->tmp2_i32); } + gen_clear_ymmh(s, reg); break; case 0x6f: /* movq mm, ea */ + CHECK_NO_VEX(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); gen_ldq_env_A0(s, offsetof(CPUX86State, fpregs[reg].mmx)); @@ -3269,17 +3698,28 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, case 0x128: /* movapd */ case 0x16f: /* movdqa xmm, ea */ case 0x26f: /* movdqu xmm, ea */ + CHECK_AVX_V0(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); - gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg])); + if (s->vex_l) { + gen_ldy_env_A0(s, XMM_OFFSET(reg)); + } else { + gen_ldo_env_A0(s, XMM_OFFSET(reg)); + } } else { rm =3D (modrm & 7) | REX_B(s); - gen_op_movo(s, offsetof(CPUX86State, xmm_regs[reg]), - offsetof(CPUX86State,xmm_regs[rm])); + gen_op_movo(s, XMM_OFFSET(reg), XMM_OFFSET(rm)); + if (s->vex_l) { + gen_op_movo_ymmh(s, XMM_OFFSET(reg), XMM_OFFSET(rm)); + } + } + if (!s->vex_l) { + gen_clear_ymmh(s, reg); } break; case 0x210: /* movss xmm, ea */ if (mod !=3D 3) { + CHECK_AVX_V0_128(s); gen_lea_modrm(env, s, modrm); gen_op_ld_v(s, MO_32, s->T0, s->A0); tcg_gen_st32_tl(s->T0, cpu_env, @@ -3292,13 +3732,21 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, tcg_gen_st32_tl(s->T0, cpu_env, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(= 3))); } else { + CHECK_AVX_128(s); rm =3D (modrm & 7) | REX_B(s); - gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0= )), - offsetof(CPUX86State,xmm_regs[rm].ZMM_L(0))); + tcg_gen_ld_i32(s->tmp2_i32, cpu_env, + offsetof(CPUX86State, xmm_regs[rm].ZMM_L(0)= )); + if (reg !=3D reg_v) { + gen_op_movo(s, XMM_OFFSET(reg), XMM_OFFSET(reg_v)); + } + tcg_gen_st_i32(s->tmp2_i32, cpu_env, + offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0= ))); } + gen_clear_ymmh(s, reg); break; case 0x310: /* movsd xmm, ea */ if (mod !=3D 3) { + CHECK_AVX_V0_128(s); gen_lea_modrm(env, s, modrm); gen_ldq_env_A0(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0))); @@ -3308,13 +3756,21 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, tcg_gen_st32_tl(s->T0, cpu_env, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(= 3))); } else { + CHECK_AVX_128(s); rm =3D (modrm & 7) | REX_B(s); + if (reg !=3D reg_v) { + gen_op_movq(s, + offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(1)), + offsetof(CPUX86State, xmm_regs[reg_v].ZMM_Q(1)= )); + } 
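/*
 * A minimal C model of the register-form VMOVSD merge implemented in
 * this case, assuming an xmm register is two 64-bit lanes (the xmm_t
 * type and function name below are illustrative, not QEMU APIs):
 * lane 0 comes from the rm operand, lane 1 is merged from the
 * vex.vvvv operand, and gen_clear_ymmh() then zeroes bits 255:128.
 */
#include <stdint.h>
typedef struct { uint64_t q[2]; } xmm_t;
static xmm_t vmovsd_reg_model(xmm_t rm, xmm_t vvvv)
{
    xmm_t dst;
    dst.q[0] = rm.q[0];   /* scalar double from the second source */
    dst.q[1] = vvvv.q[1]; /* upper quadword merged from vex.vvvv */
    return dst;
}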
gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0= )), - offsetof(CPUX86State,xmm_regs[rm].ZMM_Q(0))); + offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(0))); } + gen_clear_ymmh(s, reg); break; case 0x012: /* movlps */ case 0x112: /* movlpd */ + CHECK_AVX_128(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); gen_ldq_env_A0(s, offsetof(CPUX86State, @@ -3323,40 +3779,84 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, /* movhlps */ rm =3D (modrm & 7) | REX_B(s); gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0= )), - offsetof(CPUX86State,xmm_regs[rm].ZMM_Q(1))); + offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(1))); + } + if (reg !=3D reg_v) { + gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(1= )), + offsetof(CPUX86State, xmm_regs[reg_v].ZMM_Q(1)= )); } + gen_clear_ymmh(s, reg); break; case 0x212: /* movsldup */ + CHECK_AVX_V0(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); - gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg])); + if (s->vex_l) { + gen_ldy_env_A0(s, XMM_OFFSET(reg)); + } else { + gen_ldo_env_A0(s, XMM_OFFSET(reg)); + } } else { rm =3D (modrm & 7) | REX_B(s); gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0= )), - offsetof(CPUX86State,xmm_regs[rm].ZMM_L(0))); + offsetof(CPUX86State, xmm_regs[rm].ZMM_L(0))); gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(2= )), - offsetof(CPUX86State,xmm_regs[rm].ZMM_L(2))); + offsetof(CPUX86State, xmm_regs[rm].ZMM_L(2))); + if (s->vex_l) { + gen_op_movl(s, + offsetof(CPUX86State, xmm_regs[reg].ZMM_L(= 4)), + offsetof(CPUX86State, xmm_regs[rm].ZMM_L(4= ))); + gen_op_movl(s, + offsetof(CPUX86State, xmm_regs[reg].ZMM_L(= 6)), + offsetof(CPUX86State, xmm_regs[rm].ZMM_L(6= ))); + } } gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(1)), - offsetof(CPUX86State,xmm_regs[reg].ZMM_L(0))); + offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0))); gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(3)), - offsetof(CPUX86State,xmm_regs[reg].ZMM_L(2))); + offsetof(CPUX86State, xmm_regs[reg].ZMM_L(2))); + if (s->vex_l) { + gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(5= )), + offsetof(CPUX86State, xmm_regs[reg].ZMM_L(4))); + gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(7= )), + offsetof(CPUX86State, xmm_regs[reg].ZMM_L(6))); + } else { + gen_clear_ymmh(s, reg); + } break; case 0x312: /* movddup */ + CHECK_AVX_V0(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); gen_ldq_env_A0(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0))); + if (s->vex_l) { + tcg_gen_addi_tl(s->A0, s->A0, 16); + gen_ldq_env_A0(s, offsetof(CPUX86State, + xmm_regs[reg].ZMM_Q(2))); + } } else { rm =3D (modrm & 7) | REX_B(s); gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0= )), - offsetof(CPUX86State,xmm_regs[rm].ZMM_Q(0))); + offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(0))); + if (s->vex_l) { + gen_op_movq(s, + offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(= 2)), + offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(2= ))); + } } gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(1)), - offsetof(CPUX86State,xmm_regs[reg].ZMM_Q(0))); + offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0))); + if (s->vex_l) { + gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(3= )), + offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(2))); + } else { + gen_clear_ymmh(s, reg); + } break; case 0x016: /* movhps */ case 0x116: /* movhpd */ + CHECK_AVX_128(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); gen_ldq_env_A0(s, offsetof(CPUX86State, @@ -3365,27 +3865,54 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int 
b, /* movlhps */ rm =3D (modrm & 7) | REX_B(s); gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(1= )), - offsetof(CPUX86State,xmm_regs[rm].ZMM_Q(0))); + offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(0))); + } + if (reg !=3D reg_v) { + gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0= )), + offsetof(CPUX86State, xmm_regs[reg_v].ZMM_Q(0)= )); } + gen_clear_ymmh(s, reg); break; case 0x216: /* movshdup */ + CHECK_AVX_V0(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); - gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg])); + if (s->vex_l) { + gen_ldy_env_A0(s, XMM_OFFSET(reg)); + } else { + gen_ldo_env_A0(s, XMM_OFFSET(reg)); + } } else { rm =3D (modrm & 7) | REX_B(s); gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(1= )), - offsetof(CPUX86State,xmm_regs[rm].ZMM_L(1))); + offsetof(CPUX86State, xmm_regs[rm].ZMM_L(1))); gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(3= )), - offsetof(CPUX86State,xmm_regs[rm].ZMM_L(3))); + offsetof(CPUX86State, xmm_regs[rm].ZMM_L(3))); + if (s->vex_l) { + gen_op_movl(s, + offsetof(CPUX86State, xmm_regs[reg].ZMM_L(= 5)), + offsetof(CPUX86State, xmm_regs[rm].ZMM_L(5= ))); + gen_op_movl(s, + offsetof(CPUX86State, xmm_regs[reg].ZMM_L(= 7)), + offsetof(CPUX86State, xmm_regs[rm].ZMM_L(7= ))); + } } gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0)), - offsetof(CPUX86State,xmm_regs[reg].ZMM_L(1))); + offsetof(CPUX86State, xmm_regs[reg].ZMM_L(1))); gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(2)), - offsetof(CPUX86State,xmm_regs[reg].ZMM_L(3))); + offsetof(CPUX86State, xmm_regs[reg].ZMM_L(3))); + if (s->vex_l) { + gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(4= )), + offsetof(CPUX86State, xmm_regs[reg].ZMM_L(5))); + gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(6= )), + offsetof(CPUX86State, xmm_regs[reg].ZMM_L(7))); + } else { + gen_clear_ymmh(s, reg); + } break; case 0x178: case 0x378: + CHECK_NO_VEX(s); { int bit_index, field_length; =20 @@ -3393,8 +3920,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, goto illegal_op; field_length =3D x86_ldub_code(env, s) & 0x3F; bit_index =3D x86_ldub_code(env, s) & 0x3F; - tcg_gen_addi_ptr(s->ptr0, cpu_env, - offsetof(CPUX86State,xmm_regs[reg])); + tcg_gen_addi_ptr(s->ptr0, cpu_env, XMM_OFFSET(reg)); if (b1 =3D=3D 1) gen_helper_extrq_i(cpu_env, s->ptr0, tcg_const_i32(bit_index), @@ -3406,6 +3932,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } break; case 0x7e: /* movd ea, mm */ + CHECK_NO_VEX(s); #ifdef TARGET_X86_64 if (s->dflag =3D=3D MO_64) { tcg_gen_ld_i64(s->T0, cpu_env, @@ -3420,20 +3947,22 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, } break; case 0x17e: /* movd ea, xmm */ + CHECK_AVX_V0_128(s); #ifdef TARGET_X86_64 if (s->dflag =3D=3D MO_64) { tcg_gen_ld_i64(s->T0, cpu_env, - offsetof(CPUX86State,xmm_regs[reg].ZMM_Q(0)= )); + offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0= ))); gen_ldst_modrm(env, s, modrm, MO_64, OR_TMP0, 1); } else #endif { tcg_gen_ld32u_tl(s->T0, cpu_env, - offsetof(CPUX86State,xmm_regs[reg].ZMM_L(= 0))); + offsetof(CPUX86State, xmm_regs[reg].ZMM_L= (0))); gen_ldst_modrm(env, s, modrm, MO_32, OR_TMP0, 1); } break; case 0x27e: /* movq xmm, ea */ + CHECK_AVX_V0_128(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); gen_ldq_env_A0(s, offsetof(CPUX86State, @@ -3441,11 +3970,13 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, } else { rm =3D (modrm & 7) | REX_B(s); gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0= )), - 
offsetof(CPUX86State,xmm_regs[rm].ZMM_Q(0))); + offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(0))); } gen_op_movq_env_0(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q= (1))); + gen_clear_ymmh(s, reg); break; case 0x7f: /* movq ea, mm */ + CHECK_NO_VEX(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); gen_stq_env_A0(s, offsetof(CPUX86State, fpregs[reg].mmx)); @@ -3461,40 +3992,64 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, case 0x129: /* movapd */ case 0x17f: /* movdqa ea, xmm */ case 0x27f: /* movdqu ea, xmm */ + CHECK_AVX_V0(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); - gen_sto_env_A0(s, offsetof(CPUX86State, xmm_regs[reg])); + if (s->vex_l) { + gen_sty_env_A0(s, XMM_OFFSET(reg)); + } else { + gen_sto_env_A0(s, XMM_OFFSET(reg)); + } } else { rm =3D (modrm & 7) | REX_B(s); - gen_op_movo(s, offsetof(CPUX86State, xmm_regs[rm]), - offsetof(CPUX86State,xmm_regs[reg])); + gen_op_movo(s, XMM_OFFSET(rm), XMM_OFFSET(reg)); + if (s->vex_l) { + gen_op_movo_ymmh(s, XMM_OFFSET(rm), XMM_OFFSET(reg)); + } else { + gen_clear_ymmh(s, rm); + } } break; case 0x211: /* movss ea, xmm */ if (mod !=3D 3) { + CHECK_AVX_V0_128(s); gen_lea_modrm(env, s, modrm); tcg_gen_ld32u_tl(s->T0, cpu_env, offsetof(CPUX86State, xmm_regs[reg].ZMM_L= (0))); gen_op_st_v(s, MO_32, s->T0, s->A0); } else { + CHECK_AVX_128(s); rm =3D (modrm & 7) | REX_B(s); + if (rm !=3D reg_v) { + gen_op_movo(s, XMM_OFFSET(rm), XMM_OFFSET(reg_v)); + } gen_op_movl(s, offsetof(CPUX86State, xmm_regs[rm].ZMM_L(0)= ), - offsetof(CPUX86State,xmm_regs[reg].ZMM_L(0))); + offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0))); + gen_clear_ymmh(s, rm); } break; case 0x311: /* movsd ea, xmm */ if (mod !=3D 3) { + CHECK_AVX_V0_128(s); gen_lea_modrm(env, s, modrm); gen_stq_env_A0(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0))); } else { + CHECK_AVX_128(s); rm =3D (modrm & 7) | REX_B(s); + if (rm !=3D reg_v) { + gen_op_movq(s, + offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(1)), + offsetof(CPUX86State, xmm_regs[reg_v].ZMM_Q(1)= )); + } gen_op_movq(s, offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(0)= ), - offsetof(CPUX86State,xmm_regs[reg].ZMM_Q(0))); + offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0))); + gen_clear_ymmh(s, rm); } break; case 0x013: /* movlps */ case 0x113: /* movlpd */ + CHECK_AVX_V0_128(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); gen_stq_env_A0(s, offsetof(CPUX86State, @@ -3505,6 +4060,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, break; case 0x017: /* movhps */ case 0x117: /* movhpd */ + CHECK_AVX_V0_128(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); gen_stq_env_A0(s, offsetof(CPUX86State, @@ -3521,65 +4077,91 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, case 0x173: val =3D x86_ldub_code(env, s); if (is_xmm) { + CHECK_AVX(s); + CHECK_AVX2_256(s); tcg_gen_movi_tl(s->T0, val); tcg_gen_st32_tl(s->T0, cpu_env, offsetof(CPUX86State, xmm_t0.ZMM_L(0))); tcg_gen_movi_tl(s->T0, 0); tcg_gen_st32_tl(s->T0, cpu_env, offsetof(CPUX86State, xmm_t0.ZMM_L(1))); - op1_offset =3D offsetof(CPUX86State,xmm_t0); + op1_offset =3D offsetof(CPUX86State, xmm_t0); } else { + CHECK_NO_VEX(s); tcg_gen_movi_tl(s->T0, val); tcg_gen_st32_tl(s->T0, cpu_env, offsetof(CPUX86State, mmx_t0.MMX_L(0))); tcg_gen_movi_tl(s->T0, 0); tcg_gen_st32_tl(s->T0, cpu_env, offsetof(CPUX86State, mmx_t0.MMX_L(1))); - op1_offset =3D offsetof(CPUX86State,mmx_t0); + op1_offset =3D offsetof(CPUX86State, mmx_t0); } assert(b1 < 2); - sse_fn_epp =3D sse_op_table2[((b - 1) & 3) * 8 + + if (s->vex_l) { + b1 =3D 2; + } + SSEFunc_0_eppp fn =3D 
sse_op_table2[((b - 1) & 3) * 8 + (((modrm >> 3)) & 7)][b1]; - if (!sse_fn_epp) { + if (!fn) { goto unknown_op; } if (is_xmm) { rm =3D (modrm & 7) | REX_B(s); - op2_offset =3D offsetof(CPUX86State,xmm_regs[rm]); + op2_offset =3D XMM_OFFSET(rm); + if (s->prefix & PREFIX_VEX) { + v_offset =3D XMM_OFFSET(reg_v); + } else { + v_offset =3D op2_offset; + } } else { rm =3D (modrm & 7); op2_offset =3D offsetof(CPUX86State,fpregs[rm].mmx); + v_offset =3D op2_offset; + } + tcg_gen_addi_ptr(s->ptr0, cpu_env, v_offset); + tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); + tcg_gen_addi_ptr(s->ptr2, cpu_env, op1_offset); + fn(cpu_env, s->ptr0, s->ptr1, s->ptr2); + if (!s->vex_l) { + gen_clear_ymmh(s, reg_v); } - tcg_gen_addi_ptr(s->ptr0, cpu_env, op2_offset); - tcg_gen_addi_ptr(s->ptr1, cpu_env, op1_offset); - sse_fn_epp(cpu_env, s->ptr0, s->ptr1); break; case 0x050: /* movmskps */ + CHECK_AVX_V0(s); rm =3D (modrm & 7) | REX_B(s); tcg_gen_addi_ptr(s->ptr0, cpu_env, - offsetof(CPUX86State,xmm_regs[rm])); - gen_helper_movmskps(s->tmp2_i32, cpu_env, s->ptr0); + offsetof(CPUX86State, xmm_regs[rm])); + if (s->vex_l) { + gen_helper_movmskps_ymm(s->tmp2_i32, cpu_env, s->ptr0); + } else { + gen_helper_movmskps_xmm(s->tmp2_i32, cpu_env, s->ptr0); + } tcg_gen_extu_i32_tl(cpu_regs[reg], s->tmp2_i32); break; case 0x150: /* movmskpd */ + CHECK_AVX_V0(s); rm =3D (modrm & 7) | REX_B(s); - tcg_gen_addi_ptr(s->ptr0, cpu_env, - offsetof(CPUX86State,xmm_regs[rm])); - gen_helper_movmskpd(s->tmp2_i32, cpu_env, s->ptr0); + tcg_gen_addi_ptr(s->ptr0, cpu_env, XMM_OFFSET(rm)); + if (s->vex_l) { + gen_helper_movmskpd_ymm(s->tmp2_i32, cpu_env, s->ptr0); + } else { + gen_helper_movmskpd_xmm(s->tmp2_i32, cpu_env, s->ptr0); + } tcg_gen_extu_i32_tl(cpu_regs[reg], s->tmp2_i32); break; case 0x02a: /* cvtpi2ps */ case 0x12a: /* cvtpi2pd */ - gen_helper_enter_mmx(cpu_env); + CHECK_NO_VEX(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); op2_offset =3D offsetof(CPUX86State,mmx_t0); gen_ldq_env_A0(s, op2_offset); } else { + gen_helper_enter_mmx(cpu_env); rm =3D (modrm & 7); op2_offset =3D offsetof(CPUX86State,fpregs[rm].mmx); } - op1_offset =3D offsetof(CPUX86State,xmm_regs[reg]); + op1_offset =3D XMM_OFFSET(reg); tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); switch(b >> 8) { @@ -3594,9 +4176,14 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, break; case 0x22a: /* cvtsi2ss */ case 0x32a: /* cvtsi2sd */ + CHECK_AVX(s); ot =3D mo_64_32(s->dflag); gen_ldst_modrm(env, s, modrm, ot, OR_TMP0, 0); - op1_offset =3D offsetof(CPUX86State,xmm_regs[reg]); + op1_offset =3D XMM_OFFSET(reg); + v_offset =3D XMM_OFFSET(reg_v); + if (op1_offset !=3D v_offset) { + gen_op_movo(s, op1_offset, v_offset); + } tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); if (ot =3D=3D MO_32) { SSEFunc_0_epi sse_fn_epi =3D sse_op_table3ai[(b >> 8) & 1]; @@ -3610,19 +4197,21 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, goto illegal_op; #endif } + gen_clear_ymmh(s, reg); break; case 0x02c: /* cvttps2pi */ case 0x12c: /* cvttpd2pi */ case 0x02d: /* cvtps2pi */ case 0x12d: /* cvtpd2pi */ + CHECK_NO_VEX(s); gen_helper_enter_mmx(cpu_env); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); - op2_offset =3D offsetof(CPUX86State,xmm_t0); + op2_offset =3D offsetof(CPUX86State, xmm_t0); gen_ldo_env_A0(s, op2_offset); } else { rm =3D (modrm & 7) | REX_B(s); - op2_offset =3D offsetof(CPUX86State,xmm_regs[rm]); + op2_offset =3D XMM_OFFSET(rm); } op1_offset =3D offsetof(CPUX86State,fpregs[reg & 7].mmx); 
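/*
 * A rough model of the 0x2c/0x2d conversions dispatched just below:
 * both convert packed floats to 32-bit integers in an MMX register,
 * but the 't' (0x2c) forms always truncate toward zero while the
 * 0x2d forms honour the current rounding mode.  This sketch assumes
 * round-to-nearest and ignores the invalid-operand cases the real
 * helpers must raise (function name illustrative):
 */
#include <math.h>
#include <stdint.h>
static void cvtps2pi_model(int32_t dst[2], const float src[2], int trunc)
{
    for (int i = 0; i < 2; i++) {
        dst[i] = trunc ? (int32_t)src[i]          /* cvttps2pi */
                       : (int32_t)lrintf(src[i]); /* cvtps2pi */
    }
}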
tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); @@ -3646,6 +4235,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, case 0x32c: /* cvttsd2si */ case 0x22d: /* cvtss2si */ case 0x32d: /* cvtsd2si */ + CHECK_AVX_V0(s); ot =3D mo_64_32(s->dflag); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); @@ -3656,10 +4246,10 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, tcg_gen_st32_tl(s->T0, cpu_env, offsetof(CPUX86State, xmm_t0.ZMM_L(0))= ); } - op2_offset =3D offsetof(CPUX86State,xmm_t0); + op2_offset =3D offsetof(CPUX86State, xmm_t0); } else { rm =3D (modrm & 7) | REX_B(s); - op2_offset =3D offsetof(CPUX86State,xmm_regs[rm]); + op2_offset =3D XMM_OFFSET(rm); } tcg_gen_addi_ptr(s->ptr0, cpu_env, op2_offset); if (ot =3D=3D MO_32) { @@ -3680,21 +4270,28 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, break; case 0xc4: /* pinsrw */ case 0x1c4: + CHECK_AVX_128(s); s->rip_offset =3D 1; gen_ldst_modrm(env, s, modrm, MO_16, OR_TMP0, 0); val =3D x86_ldub_code(env, s); + if (reg !=3D reg_v) { + gen_op_movo(s, XMM_OFFSET(reg), XMM_OFFSET(reg_v)); + } if (b1) { val &=3D 7; tcg_gen_st16_tl(s->T0, cpu_env, - offsetof(CPUX86State,xmm_regs[reg].ZMM_W(v= al))); + offsetof(CPUX86State, xmm_regs[reg].ZMM_W(val))); } else { + CHECK_NO_VEX(s); val &=3D 3; tcg_gen_st16_tl(s->T0, cpu_env, - offsetof(CPUX86State,fpregs[reg].mmx.MMX_W= (val))); + offsetof(CPUX86State, fpregs[reg].mmx.MMX_W(val))); } + gen_clear_ymmh(s, reg); break; case 0xc5: /* pextrw */ case 0x1c5: + CHECK_AVX_V0_128(s); if (mod !=3D 3) goto illegal_op; ot =3D mo_64_32(s->dflag); @@ -3703,17 +4300,18 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, val &=3D 7; rm =3D (modrm & 7) | REX_B(s); tcg_gen_ld16u_tl(s->T0, cpu_env, - offsetof(CPUX86State,xmm_regs[rm].ZMM_W(v= al))); + offsetof(CPUX86State, xmm_regs[rm].ZMM_W(val))); } else { val &=3D 3; rm =3D (modrm & 7); tcg_gen_ld16u_tl(s->T0, cpu_env, - offsetof(CPUX86State,fpregs[rm].mmx.MMX_W(= val))); + offsetof(CPUX86State, fpregs[rm].mmx.MMX_W(val))); } reg =3D ((modrm >> 3) & 7) | REX_R(s); gen_op_mov_reg_v(s, ot, reg, s->T0); break; case 0x1d6: /* movq ea, xmm */ + CHECK_AVX_V0_128(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); gen_stq_env_A0(s, offsetof(CPUX86State, @@ -3721,12 +4319,13 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, } else { rm =3D (modrm & 7) | REX_B(s); gen_op_movq(s, offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(0)= ), - offsetof(CPUX86State,xmm_regs[reg].ZMM_Q(0))); + offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0))); gen_op_movq_env_0(s, offsetof(CPUX86State, xmm_regs[rm].ZMM_Q= (1))); } break; case 0x2d6: /* movq2dq */ + CHECK_NO_VEX(s); gen_helper_enter_mmx(cpu_env); rm =3D (modrm & 7); gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0)), @@ -3734,21 +4333,27 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, gen_op_movq_env_0(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q= (1))); break; case 0x3d6: /* movdq2q */ + CHECK_NO_VEX(s); gen_helper_enter_mmx(cpu_env); rm =3D (modrm & 7) | REX_B(s); gen_op_movq(s, offsetof(CPUX86State, fpregs[reg & 7].mmx), - offsetof(CPUX86State,xmm_regs[rm].ZMM_Q(0))); + offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(0))); break; case 0xd7: /* pmovmskb */ case 0x1d7: if (mod !=3D 3) goto illegal_op; if (b1) { + CHECK_AVX_V0(s); rm =3D (modrm & 7) | REX_B(s); - tcg_gen_addi_ptr(s->ptr0, cpu_env, - offsetof(CPUX86State, xmm_regs[rm])); - gen_helper_pmovmskb_xmm(s->tmp2_i32, cpu_env, s->ptr0); + tcg_gen_addi_ptr(s->ptr0, cpu_env, 
XMM_OFFSET(rm)); + if (s->vex_l) { + gen_helper_pmovmskb_ymm(s->tmp2_i32, cpu_env, s->ptr0); + } else { + gen_helper_pmovmskb_xmm(s->tmp2_i32, cpu_env, s->ptr0); + } } else { + CHECK_NO_VEX(s); rm =3D (modrm & 7); tcg_gen_addi_ptr(s->ptr0, cpu_env, offsetof(CPUX86State, fpregs[rm].mmx)); @@ -3768,50 +4373,241 @@ static void gen_sse(CPUX86State *env, DisasContext= *s, int b, rm =3D modrm & 7; reg =3D ((modrm >> 3) & 7) | REX_R(s); mod =3D (modrm >> 6) & 3; + if (s->prefix & PREFIX_VEX) { + reg_v =3D s->vex_v; + } else { + reg_v =3D reg; + } =20 assert(b1 < 2); - sse_fn_epp =3D sse_op_table6[b].op[b1]; - if (!sse_fn_epp) { + op6 =3D sse_op_table6[b]; + if (op6.ext_mask =3D=3D 0) { goto unknown_op; } - if (!(s->cpuid_ext_features & sse_op_table6[b].ext_mask)) + if (!(s->cpuid_ext_features & op6.ext_mask)) { + goto illegal_op; + } + + if (op6.ext_mask =3D=3D CPUID_EXT_AVX + && (s->prefix & PREFIX_VEX) =3D=3D 0) { goto illegal_op; + } + if (op6.flags & SSE_OPF_AVX2) { + CHECK_AVX2(s); + } =20 if (b1) { - op1_offset =3D offsetof(CPUX86State,xmm_regs[reg]); + if (op6.flags & SSE_OPF_V0) { + CHECK_AVX_V0(s); + } else { + CHECK_AVX(s); + } + + op1_offset =3D XMM_OFFSET(reg); + + if ((b & 0xfc) =3D=3D 0x90) { /* vgather */ + int scale, index, base; + target_long disp =3D 0; + CHECK_AVX2(s); + if (mod =3D=3D 3 || rm !=3D 4) { + goto illegal_op; + } + + /* Vector SIB */ + val =3D x86_ldub_code(env, s); + scale =3D (val >> 6) & 3; + index =3D ((val >> 3) & 7) | REX_X(s); + base =3D (val & 7) | REX_B(s); + switch (mod) { + case 0: + if (base =3D=3D 5) { + base =3D -1; + disp =3D (int32_t)x86_ldl_code(env, s); + } + break; + case 1: + disp =3D (int8_t)x86_ldub_code(env, s); + break; + default: + case 2: + disp =3D (int32_t)x86_ldl_code(env, s); + break; + } + + /* destination, index and mask registers must not over= lap */ + if (reg =3D=3D index || reg =3D=3D reg_v) { + goto illegal_op; + } + + tcg_gen_addi_tl(s->A0, cpu_regs[base], disp); + gen_add_A0_ds_seg(s); + op2_offset =3D XMM_OFFSET(index); + v_offset =3D XMM_OFFSET(reg_v); + tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); + tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); + tcg_gen_addi_ptr(s->ptr2, cpu_env, v_offset); + b1 =3D REX_W(s) | ((b & 1) << 1) | (scale << 2); + sse_op_table10[b1][s->vex_l](cpu_env, + s->ptr0, s->ptr2, s->ptr1, s->A0); + if (!s->vex_l) { + gen_clear_ymmh(s, reg); + gen_clear_ymmh(s, reg_v); + } + return; + } + + if (op6.flags & SSE_OPF_MMX) { + CHECK_AVX2_256(s); + } + if (op6.flags & SSE_OPF_BLENDV) { + /* + * VEX encodings of the blendv opcodes are not valid + * they use a different opcode with an 0f 3a prefix + */ + CHECK_NO_VEX(s); + } + if (mod =3D=3D 3) { - op2_offset =3D offsetof(CPUX86State,xmm_regs[rm | REX_= B(s)]); + op2_offset =3D XMM_OFFSET(rm | REX_B(s)); } else { - op2_offset =3D offsetof(CPUX86State,xmm_t0); + int size; + op2_offset =3D offsetof(CPUX86State, xmm_t0); gen_lea_modrm(env, s, modrm); switch (b) { + case 0x78: /* vpbroadcastb */ + size =3D 8; + break; + case 0x79: /* vpbroadcastw */ + size =3D 16; + break; + case 0x18: /* vbroadcastss */ + case 0x58: /* vpbroadcastd */ + size =3D 32; + break; + case 0x19: /* vbroadcastsd */ + case 0x59: /* vpbroadcastq */ + size =3D 64; + break; + case 0x1a: /* vbroadcastf128 */ + case 0x5a: /* vbroadcasti128 */ + size =3D 128; + break; case 0x20: case 0x30: /* pmovsxbw, pmovzxbw */ case 0x23: case 0x33: /* pmovsxwd, pmovzxwd */ case 0x25: case 0x35: /* pmovsxdq, pmovzxdq */ - gen_ldq_env_A0(s, op2_offset + - offsetof(ZMMReg, ZMM_Q(0))); + size =3D 64; break; 
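/*
 * The load sizes chosen in this switch all follow one rule: a
 * pmovsx/pmovzx memory source is exactly wide enough to fill the
 * destination after widening.  In bits (function name illustrative):
 *
 *   pmovsxbw xmm: 128 * 8/16 = 64    pmovsxbq xmm: 128 * 8/64 = 16
 *
 * with the result doubled for 256-bit (vex.l) forms, as the code
 * after this switch does for non-scalar ops.
 */
static int pmovx_load_bits(int vec_bits, int src_elt_bits, int dst_elt_bits)
{
    return vec_bits * src_elt_bits / dst_elt_bits;
}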
case 0x21: case 0x31: /* pmovsxbd, pmovzxbd */ case 0x24: case 0x34: /* pmovsxwq, pmovzxwq */ - tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0, - s->mem_index, MO_LEUL); - tcg_gen_st_i32(s->tmp2_i32, cpu_env, op2_offset + - offsetof(ZMMReg, ZMM_L(0))); + size =3D 32; break; case 0x22: case 0x32: /* pmovsxbq, pmovzxbq */ + size =3D 16; + break; + case 0x2a: /* movntqda */ + if (s->vex_l) { + gen_ldy_env_A0(s, op1_offset); + } else { + gen_ldo_env_A0(s, op1_offset); + gen_clear_ymmh(s, reg); + } + return; + case 0x2e: /* maskmovpd */ + b1 =3D 0; + goto vpmaskmov; + case 0x2f: /* maskmovpd */ + b1 =3D 1; + goto vpmaskmov; + case 0x8e: /* vpmaskmovd, vpmaskmovq */ + CHECK_AVX2(s); + b1 =3D REX_W(s); + vpmaskmov: + tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); + v_offset =3D XMM_OFFSET(reg_v); + tcg_gen_addi_ptr(s->ptr2, cpu_env, v_offset); + sse_op_table9[b1][s->vex_l](cpu_env, + s->ptr0, s->ptr2, s->A0); + return; + default: + size =3D 128; + } + if ((op6.flags & SSE_OPF_SCALAR) =3D=3D 0 && s->vex_l)= { + size *=3D 2; + } + switch (size) { + case 8: + tcg_gen_qemu_ld_tl(s->tmp0, s->A0, + s->mem_index, MO_UB); + tcg_gen_st16_tl(s->tmp0, cpu_env, op2_offset + + offsetof(ZMMReg, ZMM_B(0))); + break; + case 16: tcg_gen_qemu_ld_tl(s->tmp0, s->A0, s->mem_index, MO_LEUW); tcg_gen_st16_tl(s->tmp0, cpu_env, op2_offset + offsetof(ZMMReg, ZMM_W(0))); break; - case 0x2a: /* movntqda */ - gen_ldo_env_A0(s, op1_offset); - return; - default: + case 32: + tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0, + s->mem_index, MO_LEUL); + tcg_gen_st_i32(s->tmp2_i32, cpu_env, op2_offset + + offsetof(ZMMReg, ZMM_L(0))); + break; + case 64: + gen_ldq_env_A0(s, op2_offset + + offsetof(ZMMReg, ZMM_Q(0))); + break; + case 128: gen_ldo_env_A0(s, op2_offset); + break; + case 256: + gen_ldy_env_A0(s, op2_offset); + break; + } + } + tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); + tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); + if (s->vex_l) { + b1 =3D 2; + } + if (!op6.fn[b1].op1) { + goto illegal_op; + } + if (op6.flags & SSE_OPF_V0) { + op6.fn[b1].op1(cpu_env, s->ptr0, s->ptr1); + } else { + v_offset =3D XMM_OFFSET(reg_v); + tcg_gen_addi_ptr(s->ptr2, cpu_env, v_offset); + if (op6.flags & SSE_OPF_BLENDV) { + TCGv_ptr mask =3D tcg_temp_new_ptr(); + tcg_gen_addi_ptr(mask, cpu_env, XMM_OFFSET(0)); + op6.fn[b1].op3(cpu_env, s->ptr0, s->ptr2, s->ptr1, + mask); + tcg_temp_free_ptr(mask); + } else { + SSEFunc_0_eppp fn =3D op6.fn[b1].op2; + if (REX_W(s)) { + if (b >=3D 0x45 && b <=3D 0x47) { + fn =3D sse_op_table8[b - 0x45][b1 - 1]; + } else if (b =3D=3D 0x8c) { + if (s->vex_l) { + fn =3D gen_helper_vpmaskmovq_ymm; + } else { + fn =3D gen_helper_vpmaskmovq_xmm; + } + } + } + fn(cpu_env, s->ptr0, s->ptr2, s->ptr1); } } + if ((op6.flags & SSE_OPF_CMP) =3D=3D 0 && s->vex_l =3D=3D = 0) { + gen_clear_ymmh(s, reg); + } } else { + CHECK_NO_VEX(s); + if ((op6.flags & SSE_OPF_MMX) =3D=3D 0) { + goto unknown_op; + } op1_offset =3D offsetof(CPUX86State,fpregs[reg].mmx); if (mod =3D=3D 3) { op2_offset =3D offsetof(CPUX86State,fpregs[rm].mmx); @@ -3820,16 +4616,16 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, gen_lea_modrm(env, s, modrm); gen_ldq_env_A0(s, op2_offset); } + tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); + tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); + if (op6.flags & SSE_OPF_V0) { + op6.fn[0].op1(cpu_env, s->ptr0, s->ptr1); + } else { + op6.fn[0].op2(cpu_env, s->ptr0, s->ptr0, s->ptr1); + } } - if (sse_fn_epp =3D=3D SSE_SPECIAL) { - goto unknown_op; - } - - tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); - 
tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); - sse_fn_epp(cpu_env, s->ptr0, s->ptr1); =20 - if (b =3D=3D 0x17) { + if (op6.flags & SSE_OPF_CMP) { set_cc_op(s, CC_OP_EFLAGS); } break; @@ -3846,6 +4642,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, case 0x3f0: /* crc32 Gd,Eb */ case 0x3f1: /* crc32 Gd,Ey */ do_crc32: + CHECK_NO_VEX(s); if (!(s->cpuid_ext_features & CPUID_EXT_SSE42)) { goto illegal_op; } @@ -3877,6 +4674,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, /* FALLTHRU */ case 0x0f0: /* movbe Gy,My */ case 0x0f1: /* movbe My,Gy */ + CHECK_NO_VEX(s); if (!(s->cpuid_ext_features & CPUID_EXT_MOVBE)) { goto illegal_op; } @@ -4043,6 +4841,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, =20 case 0x1f6: /* adcx Gy, Ey */ case 0x2f6: /* adox Gy, Ey */ + CHECK_NO_VEX(s); if (!(s->cpuid_7_0_ebx_features & CPUID_7_0_EBX_ADX)) { goto illegal_op; } else { @@ -4196,18 +4995,28 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, rm =3D modrm & 7; reg =3D ((modrm >> 3) & 7) | REX_R(s); mod =3D (modrm >> 6) & 3; + if (s->prefix & PREFIX_VEX) { + reg_v =3D s->vex_v; + } else { + reg_v =3D reg; + } =20 assert(b1 < 2); - sse_fn_eppi =3D sse_op_table7[b].op[b1]; - if (!sse_fn_eppi) { + op7 =3D sse_op_table7[b]; + if (op7.ext_mask =3D=3D 0) { goto unknown_op; } - if (!(s->cpuid_ext_features & sse_op_table7[b].ext_mask)) + if (!(s->cpuid_ext_features & op7.ext_mask)) { goto illegal_op; + } =20 s->rip_offset =3D 1; =20 - if (sse_fn_eppi =3D=3D SSE_SPECIAL) { + if (op7.flags & SSE_OPF_SPECIAL) { + /* None of the "special" ops are valid on mmx registers */ + if (b1 =3D=3D 0) { + goto illegal_op; + } ot =3D mo_64_32(s->dflag); rm =3D (modrm & 7) | REX_B(s); if (mod !=3D 3) @@ -4216,6 +5025,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, val =3D x86_ldub_code(env, s); switch (b) { case 0x14: /* pextrb */ + CHECK_AVX_V0_128(s); tcg_gen_ld8u_tl(s->T0, cpu_env, offsetof(CPUX86State, xmm_regs[reg].ZMM_B(val & 15))= ); if (mod =3D=3D 3) { @@ -4226,6 +5036,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } break; case 0x15: /* pextrw */ + CHECK_AVX_V0_128(s); tcg_gen_ld16u_tl(s->T0, cpu_env, offsetof(CPUX86State, xmm_regs[reg].ZMM_W(val & 7))); if (mod =3D=3D 3) { @@ -4236,6 +5047,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } break; case 0x16: + CHECK_AVX_V0_128(s); if (ot =3D=3D MO_32) { /* pextrd */ tcg_gen_ld_i32(s->tmp2_i32, cpu_env, offsetof(CPUX86State, @@ -4263,6 +5075,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } break; case 0x17: /* extractps */ + CHECK_AVX_V0_128(s); tcg_gen_ld32u_tl(s->T0, cpu_env, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(val & 3))); if (mod =3D=3D 3) { @@ -4273,6 +5086,10 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, } break; case 0x20: /* pinsrb */ + CHECK_AVX_128(s); + if (reg !=3D reg_v) { + gen_op_movo(s, XMM_OFFSET(reg), XMM_OFFSET(reg_v)); + } if (mod =3D=3D 3) { gen_op_mov_v_reg(s, MO_32, s->T0, rm); } else { @@ -4281,18 +5098,23 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, } tcg_gen_st8_tl(s->T0, cpu_env, offsetof(CPUX86State, xmm_regs[reg].ZMM_B(val & 15))= ); + gen_clear_ymmh(s, reg); break; case 0x21: /* insertps */ + CHECK_AVX_128(s); if (mod =3D=3D 3) { tcg_gen_ld_i32(s->tmp2_i32, cpu_env, - offsetof(CPUX86State,xmm_regs[rm] + offsetof(CPUX86State, xmm_regs[rm] .ZMM_L((val >> 6) & 3))); } else { tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0, s->mem_index, MO_LEUL); } 
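/*
 * For reference, a compact model of the INSERTPS immediate decoded in
 * the surrounding lines, treating an xmm register as four 32-bit
 * lanes (illustrative code, not the QEMU helper API): imm bits 7:6
 * pick the source lane for a register source, bits 5:4 the
 * destination lane, and bits 3:0 are a zero mask applied afterwards.
 */
#include <stdint.h>
static void insertps_model(uint32_t dst[4], const uint32_t src[4],
                           uint8_t imm)
{
    dst[(imm >> 4) & 3] = src[(imm >> 6) & 3];
    for (int i = 0; i < 4; i++) {
        if (imm & (1 << i)) {
            dst[i] = 0; /* zmask clears individual lanes */
        }
    }
}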
+ if (reg !=3D reg_v) { + gen_op_movo(s, XMM_OFFSET(reg), XMM_OFFSET(reg_v)); + } tcg_gen_st_i32(s->tmp2_i32, cpu_env, - offsetof(CPUX86State,xmm_regs[reg] + offsetof(CPUX86State, xmm_regs[reg] .ZMM_L((val >> 4) & 3))); if ((val >> 0) & 1) tcg_gen_st_i32(tcg_const_i32(0 /*float32_zero*/), @@ -4310,8 +5132,13 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, tcg_gen_st_i32(tcg_const_i32(0 /*float32_zero*/), cpu_env, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(3))); + gen_clear_ymmh(s, reg); break; case 0x22: + CHECK_AVX_128(s); + if (reg !=3D reg_v) { + gen_op_movo(s, XMM_OFFSET(reg), XMM_OFFSET(reg_v)); + } if (ot =3D=3D MO_32) { /* pinsrd */ if (mod =3D=3D 3) { tcg_gen_trunc_tl_i32(s->tmp2_i32, cpu_regs[rm]= ); @@ -4337,21 +5164,91 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, goto illegal_op; #endif } + gen_clear_ymmh(s, reg); + break; + case 0x38: /* vinserti128 */ + CHECK_AVX2_256(s); + /* fall through */ + case 0x18: /* vinsertf128 */ + CHECK_AVX(s); + if ((s->prefix & PREFIX_VEX) =3D=3D 0 || s->vex_l =3D= =3D 0) { + goto illegal_op; + } + if (mod =3D=3D 3) { + if (val & 1) { + gen_op_movo_ymm_l2h(s, XMM_OFFSET(reg), + XMM_OFFSET(rm)); + } else { + gen_op_movo(s, XMM_OFFSET(reg), XMM_OFFSET(rm)= ); + } + } else { + if (val & 1) { + gen_ldo_env_A0_ymmh(s, XMM_OFFSET(reg)); + } else { + gen_ldo_env_A0(s, XMM_OFFSET(reg)); + } + } + if (reg !=3D reg_v) { + if (val & 1) { + gen_op_movo(s, XMM_OFFSET(reg), XMM_OFFSET(reg= _v)); + } else { + gen_op_movo_ymmh(s, XMM_OFFSET(reg), + XMM_OFFSET(reg_v)); + } + } + break; + case 0x39: /* vextracti128 */ + CHECK_AVX2_256(s); + /* fall through */ + case 0x19: /* vextractf128 */ + CHECK_AVX_V0(s); + if ((s->prefix & PREFIX_VEX) =3D=3D 0 || s->vex_l =3D= =3D 0) { + goto illegal_op; + } + if (mod =3D=3D 3) { + op1_offset =3D XMM_OFFSET(rm); + if (val & 1) { + gen_op_movo_ymm_h2l(s, XMM_OFFSET(rm), + XMM_OFFSET(reg)); + } else { + gen_op_movo(s, XMM_OFFSET(rm), XMM_OFFSET(reg)= ); + } + gen_clear_ymmh(s, rm); + } else{ + if (val & 1) { + gen_sto_env_A0_ymmh(s, XMM_OFFSET(reg)); + } else { + gen_sto_env_A0(s, XMM_OFFSET(reg)); + } + } break; + default: + goto unknown_op; } return; } =20 - if (b1) { - op1_offset =3D offsetof(CPUX86State,xmm_regs[reg]); - if (mod =3D=3D 3) { - op2_offset =3D offsetof(CPUX86State,xmm_regs[rm | REX_= B(s)]); - } else { - op2_offset =3D offsetof(CPUX86State,xmm_t0); - gen_lea_modrm(env, s, modrm); - gen_ldo_env_A0(s, op2_offset); + CHECK_AVX(s); + scalar_op =3D (s->prefix & PREFIX_VEX) + && (op7.flags & SSE_OPF_SCALAR) + && !(op7.flags & SSE_OPF_CMP); + if (is_xmm && (op7.flags & SSE_OPF_MMX)) { + CHECK_AVX2_256(s); + } + if (op7.flags & SSE_OPF_AVX2) { + CHECK_AVX2(s); + } + if ((op7.flags & SSE_OPF_V0) && !scalar_op) { + CHECK_AVX_V0(s); + } + + if (b1 =3D=3D 0) { + CHECK_NO_VEX(s); + /* MMX */ + if ((op7.flags & SSE_OPF_MMX) =3D=3D 0) { + goto illegal_op; } - } else { + op1_offset =3D offsetof(CPUX86State,fpregs[reg].mmx); if (mod =3D=3D 3) { op2_offset =3D offsetof(CPUX86State,fpregs[rm].mmx); @@ -4360,9 +5257,37 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, gen_lea_modrm(env, s, modrm); gen_ldq_env_A0(s, op2_offset); } + val =3D x86_ldub_code(env, s); + tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); + tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); + + /* We only actually have one MMX instuction (palignr) */ + assert(b =3D=3D 0x0f); + + op7.fn[0].op2(cpu_env, s->ptr0, s->ptr0, s->ptr1, + tcg_const_i32(val)); + break; + } + + /* SSE */ + if (op7.flags & 
SSE_OPF_BLENDV && !(s->prefix & PREFIX_VEX)) { + /* Only VEX encodings are valid for these blendv opcodes */ + goto illegal_op; + } + op1_offset =3D XMM_OFFSET(reg); + if (mod =3D=3D 3) { + op2_offset =3D XMM_OFFSET(rm | REX_B(s)); + } else { + op2_offset =3D offsetof(CPUX86State, xmm_t0); + gen_lea_modrm(env, s, modrm); + if (s->vex_l) { + gen_ldy_env_A0(s, op2_offset); + } else { + gen_ldo_env_A0(s, op2_offset); + } } - val =3D x86_ldub_code(env, s); =20 + val =3D x86_ldub_code(env, s); if ((b & 0xfc) =3D=3D 0x60) { /* pcmpXstrX */ set_cc_op(s, CC_OP_EFLAGS); =20 @@ -4370,11 +5295,49 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, /* The helper must use entire 64-bit gp registers */ val |=3D 1 << 8; } + if ((b & 1) =3D=3D 0) /* pcmpXsrtm */ + gen_clear_ymmh(s, 0); } =20 + if (s->vex_l) { + b1 =3D 2; + } + v_offset =3D XMM_OFFSET(reg_v); + /* + * Populate the top part of the destination register for VEX + * encoded scalar operations + */ + if (scalar_op && op1_offset !=3D v_offset) { + if (b =3D=3D 0x0a) { /* roundss */ + gen_op_movl(s, + offsetof(CPUX86State, xmm_regs[reg].ZMM_L(1)), + offsetof(CPUX86State, xmm_regs[reg_v].ZMM_L(1)= )); + } + gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(1= )), + offsetof(CPUX86State, xmm_regs[reg_v].ZMM_Q(1)= )); + } tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); - sse_fn_eppi(cpu_env, s->ptr0, s->ptr1, tcg_const_i32(val)); + if (op7.flags & SSE_OPF_V0) { + op7.fn[b1].op1(cpu_env, s->ptr0, s->ptr1, tcg_const_i32(va= l)); + } else { + tcg_gen_addi_ptr(s->ptr2, cpu_env, v_offset); + if (op7.flags & SSE_OPF_BLENDV) { + TCGv_ptr mask =3D tcg_temp_new_ptr(); + tcg_gen_addi_ptr(mask, cpu_env, XMM_OFFSET(val >> 4)); + op7.fn[b1].op3(cpu_env, s->ptr0, s->ptr2, s->ptr1, mas= k); + tcg_temp_free_ptr(mask); + } else { + op7.fn[b1].op2(cpu_env, s->ptr0, s->ptr2, s->ptr1, + tcg_const_i32(val)); + } + } + if ((op7.flags & SSE_OPF_CMP) =3D=3D 0 && s->vex_l =3D=3D 0) { + gen_clear_ymmh(s, reg); + } + if (op7.flags & SSE_OPF_CMP) { + set_cc_op(s, CC_OP_EFLAGS); + } break; =20 case 0x33a: @@ -4424,34 +5387,49 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, default: break; } + if (s->vex_l) { + b1 +=3D 4; + } + if ((sse_op.flags & SSE_OPF_3DNOW) =3D=3D 0 && !sse_op.fn[b1].op1)= { + goto unknown_op; + } if (is_xmm) { - op1_offset =3D offsetof(CPUX86State,xmm_regs[reg]); + scalar_op =3D (s->prefix & PREFIX_VEX) + && (sse_op.flags & SSE_OPF_SCALAR) + && !(sse_op.flags & SSE_OPF_CMP) + && (b1 =3D=3D 2 || b1 =3D=3D 3); + /* VEX encoded scalar ops always have 3 operands! */ + if ((sse_op.flags & SSE_OPF_V0) && !scalar_op) { + CHECK_AVX_V0(s); + } else { + CHECK_AVX(s); + } + if (sse_op.flags & SSE_OPF_MMX) { + CHECK_AVX2_256(s); + } + op1_offset =3D XMM_OFFSET(reg); if (mod !=3D 3) { - int sz =3D 4; + int sz =3D s->vex_l ? 5 : 4; =20 gen_lea_modrm(env, s, modrm); - op2_offset =3D offsetof(CPUX86State,xmm_t0); - - switch (b) { - case 0x50 ... 0x5a: - case 0x5c ... 0x5f: - case 0xc2: - /* Most sse scalar operations. */ - if (b1 =3D=3D 2) { - sz =3D 2; - } else if (b1 =3D=3D 3) { - sz =3D 3; - } - break; + op2_offset =3D offsetof(CPUX86State, xmm_t0); =20 - case 0x2e: /* ucomis[sd] */ - case 0x2f: /* comis[sd] */ - if (b1 =3D=3D 0) { - sz =3D 2; + if (sse_op.flags & SSE_OPF_SCALAR) { + if (sse_op.flags & SSE_OPF_CMP) { + /* ucomis[sd], comis[sd] */ + if (b1 =3D=3D 0) { + sz =3D 2; + } else { + sz =3D 3; + } } else { - sz =3D 3; + /* Most sse scalar operations. 
*/ + if (b1 =3D=3D 2) { + sz =3D 2; + } else if (b1 =3D=3D 3) { + sz =3D 3; + } } - break; } =20 switch (sz) { @@ -4459,22 +5437,29 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, /* 32 bit access */ gen_op_ld_v(s, MO_32, s->T0, s->A0); tcg_gen_st32_tl(s->T0, cpu_env, - offsetof(CPUX86State,xmm_t0.ZMM_L(0))); + offsetof(CPUX86State, xmm_t0.ZMM_L(0))= ); break; case 3: /* 64 bit access */ gen_ldq_env_A0(s, offsetof(CPUX86State, xmm_t0.ZMM_D(0= ))); break; - default: + case 4: /* 128 bit access */ gen_ldo_env_A0(s, op2_offset); break; + case 5: + /* 256 bit access */ + gen_ldy_env_A0(s, op2_offset); + break; } } else { rm =3D (modrm & 7) | REX_B(s); - op2_offset =3D offsetof(CPUX86State,xmm_regs[rm]); + op2_offset =3D XMM_OFFSET(rm); } + v_offset =3D XMM_OFFSET(reg_v); } else { + CHECK_NO_VEX(s); + scalar_op =3D 0; op1_offset =3D offsetof(CPUX86State,fpregs[reg].mmx); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); @@ -4484,60 +5469,100 @@ static void gen_sse(CPUX86State *env, DisasContext= *s, int b, rm =3D (modrm & 7); op2_offset =3D offsetof(CPUX86State,fpregs[rm].mmx); } - } - switch(b) { - case 0x0f: /* 3DNow! data insns */ - val =3D x86_ldub_code(env, s); - sse_fn_epp =3D sse_op_table5[val]; - if (!sse_fn_epp) { - goto unknown_op; + if (sse_op.flags & SSE_OPF_3DNOW) { + /* 3DNow! data insns */ + val =3D x86_ldub_code(env, s); + SSEFunc_0_epp sse_fn_epp =3D sse_op_table5[val]; + if (!sse_fn_epp) { + goto unknown_op; + } + tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); + tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); + sse_fn_epp(cpu_env, s->ptr0, s->ptr1); + return; } - if (!(s->cpuid_ext2_features & CPUID_EXT2_3DNOW)) { - goto illegal_op; + v_offset =3D op1_offset; + } + + /* + * Populate the top part of the destination register for VEX + * encoded scalar operations + */ + if (scalar_op && op1_offset !=3D v_offset) { + if (b =3D=3D 0x5a) { + /* + * Scalar conversions are tricky because the src and dest + * may be different sizes + */ + if (op1_offset =3D=3D op2_offset) { + /* + * The the second source operand overlaps the + * destination, so we need to copy the value + */ + op2_offset =3D offsetof(CPUX86State, xmm_t0); + gen_op_movq(s, op2_offset, op1_offset); + } + gen_op_movo(s, op1_offset, v_offset); + } else { + if (b1 =3D=3D 2) { /* ss */ + gen_op_movl(s, + offsetof(CPUX86State, xmm_regs[reg].ZMM_L(1)), + offsetof(CPUX86State, xmm_regs[reg_v].ZMM_L(1)= )); + } + gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(1= )), + offsetof(CPUX86State, xmm_regs[reg_v].ZMM_Q(1)= )); } - tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); - tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); - sse_fn_epp(cpu_env, s->ptr0, s->ptr1); - break; - case 0x70: /* pshufx insn */ - case 0xc6: /* pshufx insn */ - val =3D x86_ldub_code(env, s); - tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); - tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); - /* XXX: introduce a new table? 
*/ - sse_fn_ppi =3D (SSEFunc_0_ppi)sse_fn_epp; - sse_fn_ppi(s->ptr0, s->ptr1, tcg_const_i32(val)); - break; - case 0xc2: - /* compare insns, bits 7:3 (7:5 for AVX) are ignored */ - val =3D x86_ldub_code(env, s) & 7; - sse_fn_epp =3D sse_op_table4[val][b1]; + } =20 - tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); - tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); - sse_fn_epp(cpu_env, s->ptr0, s->ptr1); - break; - case 0xf7: - /* maskmov : we must prepare A0 */ - if (mod !=3D 3) - goto illegal_op; - tcg_gen_mov_tl(s->A0, cpu_regs[R_EDI]); - gen_extu(s->aflag, s->A0); - gen_add_A0_ds_seg(s); + tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); + tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); + if (sse_op.flags & SSE_OPF_V0) { + if (sse_op.flags & SSE_OPF_SHUF) { + val =3D x86_ldub_code(env, s); + sse_op.fn[b1].op1i(s->ptr0, s->ptr1, tcg_const_i32(val)); + } else if (b =3D=3D 0xf7) { + /* maskmov : we must prepare A0 */ + if (mod !=3D 3) { + goto illegal_op; + } + tcg_gen_mov_tl(s->A0, cpu_regs[R_EDI]); + gen_extu(s->aflag, s->A0); + gen_add_A0_ds_seg(s); + + tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); + tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); + sse_op.fn[b1].op1t(cpu_env, s->ptr0, s->ptr1, s->A0); + /* Does not write to the fist operand */ + return; + } else { + sse_op.fn[b1].op1(cpu_env, s->ptr0, s->ptr1); + } + } else { + tcg_gen_addi_ptr(s->ptr2, cpu_env, v_offset); + if (sse_op.flags & SSE_OPF_SHUF) { + val =3D x86_ldub_code(env, s); + sse_op.fn[b1].op2i(s->ptr0, s->ptr2, s->ptr1, + tcg_const_i32(val)); + } else { + SSEFunc_0_eppp fn =3D sse_op.fn[b1].op2; + if (b =3D=3D 0xc2) { + /* compare insns */ + val =3D x86_ldub_code(env, s); + if (s->prefix & PREFIX_VEX) { + val &=3D 0x1f; + } else { + val &=3D 7; + } + fn =3D sse_op_table4[val][b1]; + } + fn(cpu_env, s->ptr0, s->ptr2, s->ptr1); + } + } =20 - tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); - tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); - /* XXX: introduce a new table? 
*/ - sse_fn_eppt =3D (SSEFunc_0_eppt)sse_fn_epp; - sse_fn_eppt(cpu_env, s->ptr0, s->ptr1, s->A0); - break; - default: - tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); - tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); - sse_fn_epp(cpu_env, s->ptr0, s->ptr1); - break; + if (s->vex_l =3D=3D 0 && (sse_op.flags & SSE_OPF_CMP) =3D=3D 0) { + gen_clear_ymmh(s, reg); } - if (b =3D=3D 0x2e || b =3D=3D 0x2f) { + if (sse_op.flags & SSE_OPF_CMP) { set_cc_op(s, CC_OP_EFLAGS); } } @@ -8619,6 +9644,7 @@ static void i386_tr_init_disas_context(DisasContextBa= se *dcbase, CPUState *cpu) dc->tmp4 =3D tcg_temp_new(); dc->ptr0 =3D tcg_temp_new_ptr(); dc->ptr1 =3D tcg_temp_new_ptr(); + dc->ptr2 =3D tcg_temp_new_ptr(); dc->cc_srcT =3D tcg_temp_local_new(); } =20 --=20 2.35.2 From nobody Thu May 2 14:26:58 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1650838097629579.1494171614237; Sun, 24 Apr 2022 15:08:17 -0700 (PDT) Received: from localhost ([::1]:54564 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nikOu-0004Ob-MQ for importer@patchew.org; Sun, 24 Apr 2022 18:08:16 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:49296) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nikJC-00056O-Jp for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:02:22 -0400 Received: from nowt.default.pbrook.uk0.bigv.io ([2001:41c8:51:832:fcff:ff:fe00:46dd]:58684) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nikJA-0001L6-LN for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:02:22 -0400 Received: from cpc91554-seac25-2-0-cust857.7-2.cable.virginm.net ([82.27.199.90] helo=wren.home) by nowt.default.pbrook.uk0.bigv.io with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1nikJ5-0001ea-7p; Sun, 24 Apr 2022 23:02:15 +0100 From: Paul Brook To: Paolo Bonzini , Richard Henderson , Eduardo Habkost Subject: [PATCH v2 03/42] Add AVX_EN hflag Date: Sun, 24 Apr 2022 23:01:25 +0100 Message-Id: <20220424220204.2493824-4-paul@nowt.org> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220418173904.3746036-1-paul@nowt.org> References: <20220418173904.3746036-1-paul@nowt.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2001:41c8:51:832:fcff:ff:fe00:46dd; envelope-from=paul@nowt.org; helo=nowt.default.pbrook.uk0.bigv.io X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "open list:All patches CC here" , Paul Brook Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZM-MESSAGEID: 1650838099142100003 Content-Type: text/plain; charset="utf-8" Add a new hflag bit to 
determine whether AVX instructions are allowed Signed-off-by: Paul Brook Reviewed-by: Richard Henderson --- target/i386/cpu.h | 3 +++ target/i386/helper.c | 12 ++++++++++++ target/i386/tcg/fpu_helper.c | 1 + 3 files changed, 16 insertions(+) diff --git a/target/i386/cpu.h b/target/i386/cpu.h index 9661f9fbd1..65200a1917 100644 --- a/target/i386/cpu.h +++ b/target/i386/cpu.h @@ -169,6 +169,7 @@ typedef enum X86Seg { #define HF_MPX_EN_SHIFT 25 /* MPX Enabled (CR4+XCR0+BNDCFGx) */ #define HF_MPX_IU_SHIFT 26 /* BND registers in-use */ #define HF_UMIP_SHIFT 27 /* CR4.UMIP */ +#define HF_AVX_EN_SHIFT 28 /* AVX Enabled (CR4+XCR0) */ =20 #define HF_CPL_MASK (3 << HF_CPL_SHIFT) #define HF_INHIBIT_IRQ_MASK (1 << HF_INHIBIT_IRQ_SHIFT) @@ -195,6 +196,7 @@ typedef enum X86Seg { #define HF_MPX_EN_MASK (1 << HF_MPX_EN_SHIFT) #define HF_MPX_IU_MASK (1 << HF_MPX_IU_SHIFT) #define HF_UMIP_MASK (1 << HF_UMIP_SHIFT) +#define HF_AVX_EN_MASK (1 << HF_AVX_EN_SHIFT) =20 /* hflags2 */ =20 @@ -2035,6 +2037,7 @@ void host_cpuid(uint32_t function, uint32_t count, =20 /* helper.c */ void x86_cpu_set_a20(X86CPU *cpu, int a20_state); +void cpu_sync_avx_hflag(CPUX86State *env); =20 #ifndef CONFIG_USER_ONLY static inline int x86_asidx_from_attrs(CPUState *cs, MemTxAttrs attrs) diff --git a/target/i386/helper.c b/target/i386/helper.c index fa409e9c44..30083c9cff 100644 --- a/target/i386/helper.c +++ b/target/i386/helper.c @@ -29,6 +29,17 @@ #endif #include "qemu/log.h" =20 +void cpu_sync_avx_hflag(CPUX86State *env) +{ + if ((env->cr[4] & CR4_OSXSAVE_MASK) + && (env->xcr0 & (XSTATE_SSE_MASK | XSTATE_YMM_MASK)) + =3D=3D (XSTATE_SSE_MASK | XSTATE_YMM_MASK)) { + env->hflags |=3D HF_AVX_EN_MASK; + } else{ + env->hflags &=3D ~HF_AVX_EN_MASK; + } +} + void cpu_sync_bndcs_hflags(CPUX86State *env) { uint32_t hflags =3D env->hflags; @@ -209,6 +220,7 @@ void cpu_x86_update_cr4(CPUX86State *env, uint32_t new_= cr4) env->hflags =3D hflags; =20 cpu_sync_bndcs_hflags(env); + cpu_sync_avx_hflag(env); } =20 #if !defined(CONFIG_USER_ONLY) diff --git a/target/i386/tcg/fpu_helper.c b/target/i386/tcg/fpu_helper.c index ebf5e73df9..b391b69635 100644 --- a/target/i386/tcg/fpu_helper.c +++ b/target/i386/tcg/fpu_helper.c @@ -2943,6 +2943,7 @@ void helper_xsetbv(CPUX86State *env, uint32_t ecx, ui= nt64_t mask) =20 env->xcr0 =3D mask; cpu_sync_bndcs_hflags(env); + cpu_sync_avx_hflag(env); return; =20 do_gpf: --=20 2.36.0 From nobody Thu May 2 14:26:58 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1650303809910417.5055636931605; Mon, 18 Apr 2022 10:43:29 -0700 (PDT) Received: from localhost ([::1]:37922 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ngVPM-0007BE-Bq for importer@patchew.org; Mon, 18 Apr 2022 13:43:28 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50508) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ngVLk-0004wr-1h for qemu-devel@nongnu.org; Mon, 18 Apr 2022 13:39:44 -0400 Received: from nowt.default.pbrook.uk0.bigv.io ([2001:41c8:51:832:fcff:ff:fe00:46dd]:41359) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1ngVLi-0006cK-IF for qemu-devel@nongnu.org; Mon, 18 Apr 2022 13:39:43 -0400 
Received: from cpc91554-seac25-2-0-cust857.7-2.cable.virginm.net ([82.27.199.90] helo=wren.home) by nowt.default.pbrook.uk0.bigv.io with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1ngVLg-000364-0T; Mon, 18 Apr 2022 18:39:40 +0100 From: Paul Brook To: qemu-devel@nongnu.org Subject: [PATCH 3/4] Enable all x86-64 cpu features in user mode Date: Mon, 18 Apr 2022 18:39:03 +0100 Message-Id: <20220418173904.3746036-4-paul@nowt.org> X-Mailer: git-send-email 2.35.2 In-Reply-To: <20220418173904.3746036-1-paul@nowt.org> References: <20220418173904.3746036-1-paul@nowt.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2001:41c8:51:832:fcff:ff:fe00:46dd; envelope-from=paul@nowt.org; helo=nowt.default.pbrook.uk0.bigv.io X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Eduardo Habkost , Paolo Bonzini , Richard Henderson , Laurent Vivier , Paul Brook Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZM-MESSAGEID: 1650303810476100001 Content-Type: text/plain; charset="utf-8" We don't have any migration concerns for usermode emulation, so we may as well enable all available CPU features by default. Signed-off-by: Paul Brook --- linux-user/x86_64/target_elf.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/linux-user/x86_64/target_elf.h b/linux-user/x86_64/target_elf.h index 7b76a90de8..3f628f8d66 100644 --- a/linux-user/x86_64/target_elf.h +++ b/linux-user/x86_64/target_elf.h @@ -9,6 +9,6 @@ #define X86_64_TARGET_ELF_H static inline const char *cpu_get_model(uint32_t eflags) { - return "qemu64"; + return "max"; } #endif --=20 2.35.2 From nobody Thu May 2 14:26:58 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1650837884075586.7030402540581; Sun, 24 Apr 2022 15:04:44 -0700 (PDT) Received: from localhost ([::1]:47542 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nikLS-000835-EE for importer@patchew.org; Sun, 24 Apr 2022 18:04:42 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:49354) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nikJE-000571-NZ for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:02:34 -0400 Received: from nowt.default.pbrook.uk0.bigv.io ([2001:41c8:51:832:fcff:ff:fe00:46dd]:58691) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nikJA-0001L8-LI for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:02:24 -0400 Received: from cpc91554-seac25-2-0-cust857.7-2.cable.virginm.net ([82.27.199.90] helo=wren.home) by nowt.default.pbrook.uk0.bigv.io with esmtpsa 
(TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1nikJ5-0001ea-Ej; Sun, 24 Apr 2022 23:02:15 +0100 From: Paul Brook To: Paolo Bonzini , Richard Henderson , Eduardo Habkost Subject: [PATCH v2 04/42] i386: Rework sse_op_table1 Date: Sun, 24 Apr 2022 23:01:26 +0100 Message-Id: <20220424220204.2493824-5-paul@nowt.org> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220418173904.3746036-1-paul@nowt.org> References: <20220418173904.3746036-1-paul@nowt.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2001:41c8:51:832:fcff:ff:fe00:46dd; envelope-from=paul@nowt.org; helo=nowt.default.pbrook.uk0.bigv.io X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "open list:All patches CC here" , Paul Brook Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZM-MESSAGEID: 1650837886364100001 Content-Type: text/plain; charset="utf-8" Add a flags field to each row in sse_op_table1. Initially this is only used as a replacement for the magic SSE_SPECIAL and SSE_DUMMY pointers; the other flags will become relevant as the rest of the AVX implementation is built out. Signed-off-by: Paul Brook --- target/i386/tcg/translate.c | 316 +++++++++++++++++++++--------------- 1 file changed, 186 insertions(+), 130 deletions(-) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index b7972f0ff5..7fec582358 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -2788,146 +2788,196 @@ typedef void (*SSEFunc_0_ppi)(TCGv_ptr reg_a, TCG= v_ptr reg_b, TCGv_i32 val); typedef void (*SSEFunc_0_eppt)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_= b, TCGv val); =20 -#define SSE_SPECIAL ((void *)1) -#define SSE_DUMMY ((void *)2) +#define SSE_OPF_V0 (1 << 0) /* vex.v must be 1111b (only 2 operands= ) */ +#define SSE_OPF_CMP (1 << 1) /* does not write for first operand */ +#define SSE_OPF_BLENDV (1 << 2) /* blendv* instruction */ +#define SSE_OPF_SPECIAL (1 << 3) /* magic */ +#define SSE_OPF_3DNOW (1 << 4) /* 3DNow!
instruction */ +#define SSE_OPF_MMX (1 << 5) /* MMX/integer/AVX2 instruction */ +#define SSE_OPF_SCALAR (1 << 6) /* Has SSE scalar variants */ +#define SSE_OPF_AVX2 (1 << 7) /* AVX2 instruction */ +#define SSE_OPF_SHUF (1 << 9) /* pshufx/shufpx */ + +#define OP(op, flags, a, b, c, d) \ + {flags, {a, b, c, d} } + +#define MMX_OP(x) OP(op2, SSE_OPF_MMX, \ + gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm, NULL, NULL) + +#define SSE_FOP(name) OP(op2, SSE_OPF_SCALAR, \ + gen_helper_##name##ps, gen_helper_##name##pd, \ + gen_helper_##name##ss, gen_helper_##name##sd) +#define SSE_OP(sname, dname, op, flags) OP(op, flags, \ + gen_helper_##sname##_xmm, gen_helper_##dname##_xmm, NULL, NULL) + +struct SSEOpHelper_table1 { + int flags; + SSEFunc_0_epp op[4]; +}; =20 -#define MMX_OP2(x) { gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm } -#define SSE_FOP(x) { gen_helper_ ## x ## ps, gen_helper_ ## x ## pd, \ - gen_helper_ ## x ## ss, gen_helper_ ## x ## sd, } +#define SSE_3DNOW { SSE_OPF_3DNOW } +#define SSE_SPECIAL { SSE_OPF_SPECIAL } =20 -static const SSEFunc_0_epp sse_op_table1[256][4] =3D { +static const struct SSEOpHelper_table1 sse_op_table1[256] =3D { /* 3DNow! extensions */ - [0x0e] =3D { SSE_DUMMY }, /* femms */ - [0x0f] =3D { SSE_DUMMY }, /* pf... */ + [0x0e] =3D SSE_SPECIAL, /* femms */ + [0x0f] =3D SSE_3DNOW, /* pf... (sse_op_table5) */ /* pure SSE operations */ - [0x10] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* = movups, movupd, movss, movsd */ - [0x11] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* = movups, movupd, movss, movsd */ - [0x12] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* = movlps, movlpd, movsldup, movddup */ - [0x13] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* movlps, movlpd */ - [0x14] =3D { gen_helper_punpckldq_xmm, gen_helper_punpcklqdq_xmm }, - [0x15] =3D { gen_helper_punpckhdq_xmm, gen_helper_punpckhqdq_xmm }, - [0x16] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* movhps, movh= pd, movshdup */ - [0x17] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* movhps, movhpd */ - - [0x28] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* movaps, movapd */ - [0x29] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* movaps, movapd */ - [0x2a] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* = cvtpi2ps, cvtpi2pd, cvtsi2ss, cvtsi2sd */ - [0x2b] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* = movntps, movntpd, movntss, movntsd */ - [0x2c] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* = cvttps2pi, cvttpd2pi, cvttsd2si, cvttss2si */ - [0x2d] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* = cvtps2pi, cvtpd2pi, cvtsd2si, cvtss2si */ - [0x2e] =3D { gen_helper_ucomiss, gen_helper_ucomisd }, - [0x2f] =3D { gen_helper_comiss, gen_helper_comisd }, - [0x50] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* movmskps, movmskpd */ - [0x51] =3D SSE_FOP(sqrt), - [0x52] =3D { gen_helper_rsqrtps, NULL, gen_helper_rsqrtss, NULL }, - [0x53] =3D { gen_helper_rcpps, NULL, gen_helper_rcpss, NULL }, - [0x54] =3D { gen_helper_pand_xmm, gen_helper_pand_xmm }, /* andps, and= pd */ - [0x55] =3D { gen_helper_pandn_xmm, gen_helper_pandn_xmm }, /* andnps, = andnpd */ - [0x56] =3D { gen_helper_por_xmm, gen_helper_por_xmm }, /* orps, orpd */ - [0x57] =3D { gen_helper_pxor_xmm, gen_helper_pxor_xmm }, /* xorps, xor= pd */ + [0x10] =3D SSE_SPECIAL, /* movups, movupd, movss, movsd */ + [0x11] =3D SSE_SPECIAL, /* movups, movupd, movss, movsd */ + [0x12] =3D SSE_SPECIAL, /* movlps, movlpd, movsldup, movddup */ + [0x13] =3D SSE_SPECIAL, /* movlps, 
movlpd */ + [0x14] =3D SSE_OP(punpckldq, punpcklqdq, op2, 0), /* unpcklps, unpcklp= d */ + [0x15] =3D SSE_OP(punpckhdq, punpckhqdq, op2, 0), /* unpckhps, unpckhp= d */ + [0x16] =3D SSE_SPECIAL, /* movhps, movhpd, movshdup */ + [0x17] =3D SSE_SPECIAL, /* movhps, movhpd */ + + [0x28] =3D SSE_SPECIAL, /* movaps, movapd */ + [0x29] =3D SSE_SPECIAL, /* movaps, movapd */ + [0x2a] =3D SSE_SPECIAL, /* cvtpi2ps, cvtpi2pd, cvtsi2ss, cvtsi2sd */ + [0x2b] =3D SSE_SPECIAL, /* movntps, movntpd, movntss, movntsd */ + [0x2c] =3D SSE_SPECIAL, /* cvttps2pi, cvttpd2pi, cvttsd2si, cvttss2si = */ + [0x2d] =3D SSE_SPECIAL, /* cvtps2pi, cvtpd2pi, cvtsd2si, cvtss2si */ + [0x2e] =3D OP(op1, SSE_OPF_CMP | SSE_OPF_SCALAR | SSE_OPF_V0, + gen_helper_ucomiss, gen_helper_ucomisd, NULL, NULL), + [0x2f] =3D OP(op1, SSE_OPF_CMP | SSE_OPF_SCALAR | SSE_OPF_V0, + gen_helper_comiss, gen_helper_comisd, NULL, NULL), + [0x50] =3D SSE_SPECIAL, /* movmskps, movmskpd */ + [0x51] =3D OP(op1, SSE_OPF_SCALAR | SSE_OPF_V0, + gen_helper_sqrtps, gen_helper_sqrtpd, + gen_helper_sqrtss, gen_helper_sqrtsd), + [0x52] =3D OP(op1, SSE_OPF_SCALAR | SSE_OPF_V0, + gen_helper_rsqrtps, NULL, gen_helper_rsqrtss, NULL), + [0x53] =3D OP(op1, SSE_OPF_SCALAR | SSE_OPF_V0, + gen_helper_rcpps, NULL, gen_helper_rcpss, NULL), + [0x54] =3D SSE_OP(pand, pand, op2, 0), /* andps, andpd */ + [0x55] =3D SSE_OP(pandn, pandn, op2, 0), /* andnps, andnpd */ + [0x56] =3D SSE_OP(por, por, op2, 0), /* orps, orpd */ + [0x57] =3D SSE_OP(pxor, pxor, op2, 0), /* xorps, xorpd */ [0x58] =3D SSE_FOP(add), [0x59] =3D SSE_FOP(mul), - [0x5a] =3D { gen_helper_cvtps2pd, gen_helper_cvtpd2ps, - gen_helper_cvtss2sd, gen_helper_cvtsd2ss }, - [0x5b] =3D { gen_helper_cvtdq2ps, gen_helper_cvtps2dq, gen_helper_cvtt= ps2dq }, + [0x5a] =3D OP(op1, SSE_OPF_SCALAR | SSE_OPF_V0, + gen_helper_cvtps2pd, gen_helper_cvtpd2ps, + gen_helper_cvtss2sd, gen_helper_cvtsd2ss), + [0x5b] =3D OP(op1, SSE_OPF_V0, + gen_helper_cvtdq2ps, gen_helper_cvtps2dq, + gen_helper_cvttps2dq, NULL), [0x5c] =3D SSE_FOP(sub), [0x5d] =3D SSE_FOP(min), [0x5e] =3D SSE_FOP(div), [0x5f] =3D SSE_FOP(max), =20 - [0xc2] =3D SSE_FOP(cmpeq), - [0xc6] =3D { (SSEFunc_0_epp)gen_helper_shufps, - (SSEFunc_0_epp)gen_helper_shufpd }, /* XXX: casts */ + [0xc2] =3D SSE_FOP(cmpeq), /* sse_op_table4 */ + [0xc6] =3D OP(dummy, SSE_OPF_SHUF, (SSEFunc_0_epp)gen_helper_shufps, + (SSEFunc_0_epp)gen_helper_shufpd, NULL, NULL), =20 /* SSSE3, SSE4, MOVBE, CRC32, BMI1, BMI2, ADX. 
*/ - [0x38] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, - [0x3a] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, + [0x38] =3D SSE_SPECIAL, + [0x3a] =3D SSE_SPECIAL, =20 /* MMX ops and their SSE extensions */ - [0x60] =3D MMX_OP2(punpcklbw), - [0x61] =3D MMX_OP2(punpcklwd), - [0x62] =3D MMX_OP2(punpckldq), - [0x63] =3D MMX_OP2(packsswb), - [0x64] =3D MMX_OP2(pcmpgtb), - [0x65] =3D MMX_OP2(pcmpgtw), - [0x66] =3D MMX_OP2(pcmpgtl), - [0x67] =3D MMX_OP2(packuswb), - [0x68] =3D MMX_OP2(punpckhbw), - [0x69] =3D MMX_OP2(punpckhwd), - [0x6a] =3D MMX_OP2(punpckhdq), - [0x6b] =3D MMX_OP2(packssdw), - [0x6c] =3D { NULL, gen_helper_punpcklqdq_xmm }, - [0x6d] =3D { NULL, gen_helper_punpckhqdq_xmm }, - [0x6e] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* movd mm, ea */ - [0x6f] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* movq, movdqa,= , movqdu */ - [0x70] =3D { (SSEFunc_0_epp)gen_helper_pshufw_mmx, - (SSEFunc_0_epp)gen_helper_pshufd_xmm, - (SSEFunc_0_epp)gen_helper_pshufhw_xmm, - (SSEFunc_0_epp)gen_helper_pshuflw_xmm }, /* XXX: casts */ - [0x71] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* shiftw */ - [0x72] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* shiftd */ - [0x73] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* shiftq */ - [0x74] =3D MMX_OP2(pcmpeqb), - [0x75] =3D MMX_OP2(pcmpeqw), - [0x76] =3D MMX_OP2(pcmpeql), - [0x77] =3D { SSE_DUMMY }, /* emms */ - [0x78] =3D { NULL, SSE_SPECIAL, NULL, SSE_SPECIAL }, /* extrq_i, inser= tq_i */ - [0x79] =3D { NULL, gen_helper_extrq_r, NULL, gen_helper_insertq_r }, - [0x7c] =3D { NULL, gen_helper_haddpd, NULL, gen_helper_haddps }, - [0x7d] =3D { NULL, gen_helper_hsubpd, NULL, gen_helper_hsubps }, - [0x7e] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* movd, movd, ,= movq */ - [0x7f] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* movq, movdqa,= movdqu */ - [0xc4] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* pinsrw */ - [0xc5] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* pextrw */ - [0xd0] =3D { NULL, gen_helper_addsubpd, NULL, gen_helper_addsubps }, - [0xd1] =3D MMX_OP2(psrlw), - [0xd2] =3D MMX_OP2(psrld), - [0xd3] =3D MMX_OP2(psrlq), - [0xd4] =3D MMX_OP2(paddq), - [0xd5] =3D MMX_OP2(pmullw), - [0xd6] =3D { NULL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, - [0xd7] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* pmovmskb */ - [0xd8] =3D MMX_OP2(psubusb), - [0xd9] =3D MMX_OP2(psubusw), - [0xda] =3D MMX_OP2(pminub), - [0xdb] =3D MMX_OP2(pand), - [0xdc] =3D MMX_OP2(paddusb), - [0xdd] =3D MMX_OP2(paddusw), - [0xde] =3D MMX_OP2(pmaxub), - [0xdf] =3D MMX_OP2(pandn), - [0xe0] =3D MMX_OP2(pavgb), - [0xe1] =3D MMX_OP2(psraw), - [0xe2] =3D MMX_OP2(psrad), - [0xe3] =3D MMX_OP2(pavgw), - [0xe4] =3D MMX_OP2(pmulhuw), - [0xe5] =3D MMX_OP2(pmulhw), - [0xe6] =3D { NULL, gen_helper_cvttpd2dq, gen_helper_cvtdq2pd, gen_help= er_cvtpd2dq }, - [0xe7] =3D { SSE_SPECIAL , SSE_SPECIAL }, /* movntq, movntq */ - [0xe8] =3D MMX_OP2(psubsb), - [0xe9] =3D MMX_OP2(psubsw), - [0xea] =3D MMX_OP2(pminsw), - [0xeb] =3D MMX_OP2(por), - [0xec] =3D MMX_OP2(paddsb), - [0xed] =3D MMX_OP2(paddsw), - [0xee] =3D MMX_OP2(pmaxsw), - [0xef] =3D MMX_OP2(pxor), - [0xf0] =3D { NULL, NULL, NULL, SSE_SPECIAL }, /* lddqu */ - [0xf1] =3D MMX_OP2(psllw), - [0xf2] =3D MMX_OP2(pslld), - [0xf3] =3D MMX_OP2(psllq), - [0xf4] =3D MMX_OP2(pmuludq), - [0xf5] =3D MMX_OP2(pmaddwd), - [0xf6] =3D MMX_OP2(psadbw), - [0xf7] =3D { (SSEFunc_0_epp)gen_helper_maskmov_mmx, - (SSEFunc_0_epp)gen_helper_maskmov_xmm }, /* XXX: casts */ - [0xf8] =3D MMX_OP2(psubb), - [0xf9] =3D MMX_OP2(psubw), - [0xfa] =3D MMX_OP2(psubl), - [0xfb] =3D MMX_OP2(psubq), - [0xfc] =3D 
MMX_OP2(paddb), - [0xfd] =3D MMX_OP2(paddw), - [0xfe] =3D MMX_OP2(paddl), + [0x60] =3D MMX_OP(punpcklbw), + [0x61] =3D MMX_OP(punpcklwd), + [0x62] =3D MMX_OP(punpckldq), + [0x63] =3D MMX_OP(packsswb), + [0x64] =3D MMX_OP(pcmpgtb), + [0x65] =3D MMX_OP(pcmpgtw), + [0x66] =3D MMX_OP(pcmpgtl), + [0x67] =3D MMX_OP(packuswb), + [0x68] =3D MMX_OP(punpckhbw), + [0x69] =3D MMX_OP(punpckhwd), + [0x6a] =3D MMX_OP(punpckhdq), + [0x6b] =3D MMX_OP(packssdw), + [0x6c] =3D OP(op2, SSE_OPF_MMX, + NULL, gen_helper_punpcklqdq_xmm, NULL, NULL), + [0x6d] =3D OP(op2, SSE_OPF_MMX, + NULL, gen_helper_punpckhqdq_xmm, NULL, NULL), + [0x6e] =3D SSE_SPECIAL, /* movd mm, ea */ + [0x6f] =3D SSE_SPECIAL, /* movq, movdqa, , movqdu */ + [0x70] =3D OP(op1i, SSE_OPF_SHUF | SSE_OPF_MMX | SSE_OPF_V0, + (SSEFunc_0_epp)gen_helper_pshufw_mmx, + (SSEFunc_0_epp)gen_helper_pshufd_xmm, + (SSEFunc_0_epp)gen_helper_pshufhw_xmm, + (SSEFunc_0_epp)gen_helper_pshuflw_xmm), + [0x71] =3D SSE_SPECIAL, /* shiftw */ + [0x72] =3D SSE_SPECIAL, /* shiftd */ + [0x73] =3D SSE_SPECIAL, /* shiftq */ + [0x74] =3D MMX_OP(pcmpeqb), + [0x75] =3D MMX_OP(pcmpeqw), + [0x76] =3D MMX_OP(pcmpeql), + [0x77] =3D SSE_SPECIAL, /* emms */ + [0x78] =3D SSE_SPECIAL, /* extrq_i, insertq_i (sse4a) */ + [0x79] =3D OP(op1, SSE_OPF_V0, + NULL, gen_helper_extrq_r, NULL, gen_helper_insertq_r), + [0x7c] =3D OP(op2, 0, + NULL, gen_helper_haddpd, NULL, gen_helper_haddps), + [0x7d] =3D OP(op2, 0, + NULL, gen_helper_hsubpd, NULL, gen_helper_hsubps), + [0x7e] =3D SSE_SPECIAL, /* movd, movd, , movq */ + [0x7f] =3D SSE_SPECIAL, /* movq, movdqa, movdqu */ + [0xc4] =3D SSE_SPECIAL, /* pinsrw */ + [0xc5] =3D SSE_SPECIAL, /* pextrw */ + [0xd0] =3D OP(op2, 0, + NULL, gen_helper_addsubpd, NULL, gen_helper_addsubps), + [0xd1] =3D MMX_OP(psrlw), + [0xd2] =3D MMX_OP(psrld), + [0xd3] =3D MMX_OP(psrlq), + [0xd4] =3D MMX_OP(paddq), + [0xd5] =3D MMX_OP(pmullw), + [0xd6] =3D SSE_SPECIAL, + [0xd7] =3D SSE_SPECIAL, /* pmovmskb */ + [0xd8] =3D MMX_OP(psubusb), + [0xd9] =3D MMX_OP(psubusw), + [0xda] =3D MMX_OP(pminub), + [0xdb] =3D MMX_OP(pand), + [0xdc] =3D MMX_OP(paddusb), + [0xdd] =3D MMX_OP(paddusw), + [0xde] =3D MMX_OP(pmaxub), + [0xdf] =3D MMX_OP(pandn), + [0xe0] =3D MMX_OP(pavgb), + [0xe1] =3D MMX_OP(psraw), + [0xe2] =3D MMX_OP(psrad), + [0xe3] =3D MMX_OP(pavgw), + [0xe4] =3D MMX_OP(pmulhuw), + [0xe5] =3D MMX_OP(pmulhw), + [0xe6] =3D OP(op1, SSE_OPF_V0, + NULL, gen_helper_cvttpd2dq, + gen_helper_cvtdq2pd, gen_helper_cvtpd2dq), + [0xe7] =3D SSE_SPECIAL, /* movntq, movntq */ + [0xe8] =3D MMX_OP(psubsb), + [0xe9] =3D MMX_OP(psubsw), + [0xea] =3D MMX_OP(pminsw), + [0xeb] =3D MMX_OP(por), + [0xec] =3D MMX_OP(paddsb), + [0xed] =3D MMX_OP(paddsw), + [0xee] =3D MMX_OP(pmaxsw), + [0xef] =3D MMX_OP(pxor), + [0xf0] =3D SSE_SPECIAL, /* lddqu */ + [0xf1] =3D MMX_OP(psllw), + [0xf2] =3D MMX_OP(pslld), + [0xf3] =3D MMX_OP(psllq), + [0xf4] =3D MMX_OP(pmuludq), + [0xf5] =3D MMX_OP(pmaddwd), + [0xf6] =3D MMX_OP(psadbw), + [0xf7] =3D OP(op1t, SSE_OPF_MMX | SSE_OPF_V0, + (SSEFunc_0_epp)gen_helper_maskmov_mmx, + (SSEFunc_0_epp)gen_helper_maskmov_xmm, NULL, NULL), + [0xf8] =3D MMX_OP(psubb), + [0xf9] =3D MMX_OP(psubw), + [0xfa] =3D MMX_OP(psubl), + [0xfb] =3D MMX_OP(psubq), + [0xfc] =3D MMX_OP(paddb), + [0xfd] =3D MMX_OP(paddw), + [0xfe] =3D MMX_OP(paddl), }; +#undef MMX_OP +#undef OP +#undef SSE_FOP +#undef SSE_OP +#undef SSE_SPECIAL + +#define MMX_OP2(x) { gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm } +#define SSE_SPECIAL_FN ((void *)1) =20 static const SSEFunc_0_epp sse_op_table2[3 * 8][2] =3D { [0 + 2] 
=3D MMX_OP2(psrlw), @@ -2970,6 +3020,8 @@ static const SSEFunc_l_ep sse_op_table3bq[] =3D { }; #endif =20 +#define SSE_FOP(x) { gen_helper_ ## x ## ps, gen_helper_ ## x ## pd, \ + gen_helper_ ## x ## ss, gen_helper_ ## x ## sd, } static const SSEFunc_0_epp sse_op_table4[8][4] =3D { SSE_FOP(cmpeq), SSE_FOP(cmplt), @@ -2980,6 +3032,7 @@ static const SSEFunc_0_epp sse_op_table4[8][4] =3D { SSE_FOP(cmpnle), SSE_FOP(cmpord), }; +#undef SSE_FOP =20 static const SSEFunc_0_epp sse_op_table5[256] =3D { [0x0c] =3D gen_helper_pi2fw, @@ -3021,7 +3074,7 @@ struct SSEOpHelper_eppi { #define SSSE3_OP(x) { MMX_OP2(x), CPUID_EXT_SSSE3 } #define SSE41_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, CPUID_EXT_SSE41 } #define SSE42_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, CPUID_EXT_SSE42 } -#define SSE41_SPECIAL { { NULL, SSE_SPECIAL }, CPUID_EXT_SSE41 } +#define SSE41_SPECIAL { { NULL, SSE_SPECIAL_FN }, CPUID_EXT_SSE41 } #define PCLMULQDQ_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, \ CPUID_EXT_PCLMULQDQ } #define AESNI_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, CPUID_EXT_AES } @@ -3112,6 +3165,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, { int b1, op1_offset, op2_offset, is_xmm, val; int modrm, mod, rm, reg; + struct SSEOpHelper_table1 sse_op; SSEFunc_0_epp sse_fn_epp; SSEFunc_0_eppi sse_fn_eppi; SSEFunc_0_ppi sse_fn_ppi; @@ -3127,8 +3181,10 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, b1 =3D 3; else b1 =3D 0; - sse_fn_epp =3D sse_op_table1[b][b1]; - if (!sse_fn_epp) { + sse_op =3D sse_op_table1[b]; + sse_fn_epp =3D sse_op.op[b1]; + if ((sse_op.flags & (SSE_OPF_SPECIAL | SSE_OPF_3DNOW)) =3D=3D 0 + && !sse_fn_epp) { goto unknown_op; } if ((b <=3D 0x5f && b >=3D 0x10) || b =3D=3D 0xc6 || b =3D=3D 0xc2) { @@ -3182,7 +3238,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, reg |=3D REX_R(s); } mod =3D (modrm >> 6) & 3; - if (sse_fn_epp =3D=3D SSE_SPECIAL) { + if (sse_op.flags & SSE_OPF_SPECIAL) { b |=3D (b1 << 8); switch(b) { case 0x0e7: /* movntq */ @@ -3823,7 +3879,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, gen_ldq_env_A0(s, op2_offset); } } - if (sse_fn_epp =3D=3D SSE_SPECIAL) { + if (sse_fn_epp =3D=3D SSE_SPECIAL_FN) { goto unknown_op; } =20 @@ -4209,7 +4265,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, =20 s->rip_offset =3D 1; =20 - if (sse_fn_eppi =3D=3D SSE_SPECIAL) { + if (sse_fn_eppi =3D=3D SSE_SPECIAL_FN) { ot =3D mo_64_32(s->dflag); rm =3D (modrm & 7) | REX_B(s); if (mod !=3D 3) --=20 2.36.0 From nobody Thu May 2 14:26:58 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1650304118849292.942275786741; Mon, 18 Apr 2022 10:48:38 -0700 (PDT) Received: from localhost ([::1]:44282 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ngVUL-0003E0-5a for importer@patchew.org; Mon, 18 Apr 2022 13:48:37 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50534) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ngVLu-0004yJ-U6 for qemu-devel@nongnu.org; Mon, 18 Apr 2022 13:39:56 -0400 Received: from nowt.default.pbrook.uk0.bigv.io ([2001:41c8:51:832:fcff:ff:fe00:46dd]:41372) by eggs.gnu.org with esmtps 
(TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1ngVLo-0006cf-3i for qemu-devel@nongnu.org; Mon, 18 Apr 2022 13:39:54 -0400 Received: from cpc91554-seac25-2-0-cust857.7-2.cable.virginm.net ([82.27.199.90] helo=wren.home) by nowt.default.pbrook.uk0.bigv.io with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1ngVLl-000364-4s; Mon, 18 Apr 2022 18:39:46 +0100 From: Paul Brook To: qemu-devel@nongnu.org Subject: [PATCH 4/4] AVX tests Date: Mon, 18 Apr 2022 18:39:04 +0100 Message-Id: <20220418173904.3746036-5-paul@nowt.org> X-Mailer: git-send-email 2.35.2 In-Reply-To: <20220418173904.3746036-1-paul@nowt.org> References: <20220418173904.3746036-1-paul@nowt.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2001:41c8:51:832:fcff:ff:fe00:46dd; envelope-from=paul@nowt.org; helo=nowt.default.pbrook.uk0.bigv.io X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01, WEIRD_QUOTING=0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Eduardo Habkost , Paolo Bonzini , Richard Henderson , Paul Brook Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZM-MESSAGEID: 1650304119762100001 Content-Type: text/plain; charset="utf-8" Tests for correct operation of most x86-64 SSE and AVX instructions. It should cover all combinations of overlapping register and memory operands on a set of random-ish data. Results are bit-identical to an Intel i5-8500, with the exception of the RCPSS and RSQRT approximations where the real CPU gives less accurate results (the Intel spec allows relative errors up to 1.5 * 2^-12) Signed-off-by: Paul Brook Acked-by: Alex Benn=C3=A9e --- tests/tcg/i386/Makefile.target | 10 +- tests/tcg/i386/README | 9 + tests/tcg/i386/test-avx.c | 347 +++ tests/tcg/i386/test-avx.py | 352 +++ tests/tcg/i386/x86.csv | 4658 ++++++++++++++++++++++++++++++++ 5 files changed, 5374 insertions(+), 2 deletions(-) create mode 100644 tests/tcg/i386/test-avx.c create mode 100755 tests/tcg/i386/test-avx.py create mode 100644 tests/tcg/i386/x86.csv diff --git a/tests/tcg/i386/Makefile.target b/tests/tcg/i386/Makefile.target index e1c0310be6..f1c3275e2e 100644 --- a/tests/tcg/i386/Makefile.target +++ b/tests/tcg/i386/Makefile.target @@ -7,8 +7,8 @@ VPATH +=3D $(I386_SRC) =20 I386_SRCS=3D$(notdir $(wildcard $(I386_SRC)/*.c)) ALL_X86_TESTS=3D$(I386_SRCS:.c=3D) -SKIP_I386_TESTS=3Dtest-i386-ssse3 -X86_64_TESTS:=3D$(filter test-i386-ssse3, $(ALL_X86_TESTS)) +SKIP_I386_TESTS=3Dtest-i386-ssse3 test-avx +X86_64_TESTS:=3D$(filter test-i386-ssse3 test-avx, $(ALL_X86_TESTS)) =20 test-i386-sse-exceptions: CFLAGS +=3D -msse4.1 -mfpmath=3Dsse run-test-i386-sse-exceptions: QEMU_OPTS +=3D -cpu max @@ -80,3 +80,9 @@ run-sha512-sse: QEMU_OPTS+=3D-cpu max run-plugin-sha512-sse-with-%: QEMU_OPTS+=3D-cpu max =20 TESTS+=3Dsha512-sse + +test-avx.h: test-avx.py x86.csv + $(PYTHON) $(I386_SRC)/test-avx.py $(I386_SRC)/x86.csv $@ + +test-avx: CFLAGS +=3D -mavx -masm=3Dintel -O -I. 
+test-avx: test-avx.h diff --git a/tests/tcg/i386/README b/tests/tcg/i386/README index 09e88f30dc..403d10dad8 100644 --- a/tests/tcg/i386/README +++ b/tests/tcg/i386/README @@ -15,6 +15,15 @@ The Linux system call vm86() is used to test vm86 emulat= ion. Various exceptions are raised to test most of the x86 user space exception reporting. =20 +test-avx +-------- + +This program executes most SSE/AVX instructions and generates a text outpu= t, +for comparison with the output obtained with a real CPU or another emulato= r. + +test-avx.h is generated from x86.csv by test-avx.py +x86.csv comes from https://github.com/quasilyte/avx512test + linux-test ---------- =20 diff --git a/tests/tcg/i386/test-avx.c b/tests/tcg/i386/test-avx.c new file mode 100644 index 0000000000..953e2906fe --- /dev/null +++ b/tests/tcg/i386/test-avx.c @@ -0,0 +1,347 @@ +#include <stdint.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> + +typedef void (*testfn)(void); + +typedef struct { + uint64_t q0, q1, q2, q3; +} __attribute__((aligned(32))) v4di; + +typedef struct { + uint64_t mm[8]; + v4di ymm[16]; + uint64_t r[16]; + uint64_t flags; + uint32_t ff; + uint64_t pad; + v4di mem[4]; + v4di mem0[4]; +} reg_state; + +typedef struct { + int n; + testfn fn; + const char *s; + reg_state *init; +} TestDef; + +reg_state initI; +reg_state initF32; +reg_state initF64; + +static void dump_ymm(const char *name, int n, const v4di *r, int ff) +{ + printf("%s%d =3D %016lx %016lx %016lx %016lx\n", + name, n, r->q3, r->q2, r->q1, r->q0); + if (ff =3D=3D 64) { + double v[4]; + memcpy(v, r, sizeof(v)); + printf(" %16g %16g %16g %16g\n", + v[3], v[2], v[1], v[0]); + } else if (ff =3D=3D 32) { + float v[8]; + memcpy(v, r, sizeof(v)); + printf(" %8g %8g %8g %8g %8g %8g %8g %8g\n", + v[7], v[6], v[5], v[4], v[3], v[2], v[1], v[0]); + } +} + +static void dump_regs(reg_state *s) +{ + int i; + + for (i =3D 0; i < 16; i++) { + dump_ymm("ymm", i, &s->ymm[i], 0); + } + for (i =3D 0; i < 4; i++) { + dump_ymm("mem", i, &s->mem0[i], 0); + } +} + +static void compare_state(const reg_state *a, const reg_state *b) +{ + int i; + for (i =3D 0; i < 8; i++) { + if (a->mm[i] !=3D b->mm[i]) { + printf("MM%d =3D %016lx\n", i, b->mm[i]); + } + } + for (i =3D 0; i < 16; i++) { + if (a->r[i] !=3D b->r[i]) { + printf("r%d =3D %016lx\n", i, b->r[i]); + } + } + for (i =3D 0; i < 16; i++) { + if (memcmp(&a->ymm[i], &b->ymm[i], 32)) { + dump_ymm("ymm", i, &b->ymm[i], a->ff); + } + } + for (i =3D 0; i < 4; i++) { + if (memcmp(&a->mem0[i], &a->mem[i], 32)) { + dump_ymm("mem", i, &a->mem[i], a->ff); + } + } + if (a->flags !=3D b->flags) { + printf("FLAGS =3D %016lx\n", b->flags); + } +} + +#define LOADMM(r, o) "movq " #r ", " #o "[%0]\n\t" +#define LOADYMM(r, o) "vmovdqa " #r ", " #o "[%0]\n\t" +#define STOREMM(r, o) "movq " #o "[%1], " #r "\n\t" +#define STOREYMM(r, o) "vmovdqa " #o "[%1], " #r "\n\t" +#define MMREG(F) \ + F(mm0, 0x00) \ + F(mm1, 0x08) \ + F(mm2, 0x10) \ + F(mm3, 0x18) \ + F(mm4, 0x20) \ + F(mm5, 0x28) \ + F(mm6, 0x30) \ + F(mm7, 0x38) +#define YMMREG(F) \ + F(ymm0, 0x040) \ + F(ymm1, 0x060) \ + F(ymm2, 0x080) \ + F(ymm3, 0x0a0) \ + F(ymm4, 0x0c0) \ + F(ymm5, 0x0e0) \ + F(ymm6, 0x100) \ + F(ymm7, 0x120) \ + F(ymm8, 0x140) \ + F(ymm9, 0x160) \ + F(ymm10, 0x180) \ + F(ymm11, 0x1a0) \ + F(ymm12, 0x1c0) \ + F(ymm13, 0x1e0) \ + F(ymm14, 0x200) \ + F(ymm15, 0x220) +#define LOADREG(r, o) "mov " #r ", " #o "[rax]\n\t" +#define STOREREG(r, o) "mov " #o "[rax], " #r "\n\t" +#define REG(F) \ + F(rbx, 0x248) \ + F(rcx, 0x250) \ + F(rdx, 0x258) \ + F(rsi, 0x260) \ + F(rdi, 0x268) \ + F(r8,
0x280) \ + F(r9, 0x288) \ + F(r10, 0x290) \ + F(r11, 0x298) \ + F(r12, 0x2a0) \ + F(r13, 0x2a8) \ + F(r14, 0x2b0) \ + F(r15, 0x2b8) \ + +static void run_test(const TestDef *t) +{ + reg_state result; + reg_state *init =3D t->init; + memcpy(init->mem, init->mem0, sizeof(init->mem)); + printf("%5d %s\n", t->n, t->s); + asm volatile( + MMREG(LOADMM) + YMMREG(LOADYMM) + "sub rsp, 128\n\t" + "push rax\n\t" + "push rbx\n\t" + "push rcx\n\t" + "push rdx\n\t" + "push %1\n\t" + "push %2\n\t" + "mov rax, %0\n\t" + "pushf\n\t" + "pop rbx\n\t" + "shr rbx, 8\n\t" + "shl rbx, 8\n\t" + "mov rcx, 0x2c0[rax]\n\t" + "and rcx, 0xff\n\t" + "or rbx, rcx\n\t" + "push rbx\n\t" + "popf\n\t" + REG(LOADREG) + "mov rax, 0x240[rax]\n\t" + "call [rsp]\n\t" + "mov [rsp], rax\n\t" + "mov rax, 8[rsp]\n\t" + REG(STOREREG) + "mov rbx, [rsp]\n\t" + "mov 0x240[rax], rbx\n\t" + "mov rbx, 0\n\t" + "mov 0x270[rax], rbx\n\t" + "mov 0x278[rax], rbx\n\t" + "pushf\n\t" + "pop rbx\n\t" + "and rbx, 0xff\n\t" + "mov 0x2c0[rax], rbx\n\t" + "add rsp, 16\n\t" + "pop rdx\n\t" + "pop rcx\n\t" + "pop rbx\n\t" + "pop rax\n\t" + "add rsp, 128\n\t" + MMREG(STOREMM) + YMMREG(STOREYMM) + : : "r"(init), "r"(&result), "r"(t->fn) + : "memory", "cc", + "rsi", "rdi", + "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15", + "mm0", "mm1", "mm2", "mm3", "mm4", "mm5", "mm6", "mm7", + "ymm0", "ymm1", "ymm2", "ymm3", "ymm4", "ymm5", + "ymm6", "ymm7", "ymm8", "ymm9", "ymm10", "ymm11", + "ymm12", "ymm13", "ymm14", "ymm15" + ); + compare_state(init, &result); +} + +#define TEST(n, cmd, type) \ +static void __attribute__((naked)) test_##n(void) \ +{ \ + asm volatile(cmd); \ + asm volatile("ret"); \ +} +#include "test-avx.h" + + +static const TestDef test_table[] =3D { +#define TEST(n, cmd, type) {n, test_##n, cmd, &init##type}, +#include "test-avx.h" + {-1, NULL, "", NULL} +}; + +static void run_all(void) +{ + const TestDef *t; + for (t =3D test_table; t->fn; t++) { + run_test(t); + } +} + +#define ARRAY_LEN(x) (sizeof(x) / sizeof(x[0])) + +float val_f32[] =3D {2.0, -1.0, 4.8, 0.8, 3, -42.0, 5e6, 7.5, 8.3}; +double val_f64[] =3D {2.0, -1.0, 4.8, 0.8, 3, -42.0, 5e6, 7.5}; +v4di val_i64[] =3D { + {0x3d6b3b6a9e4118f2lu, 0x355ae76d2774d78clu, + 0xac3ff76c4daa4b28lu, 0xe7fabd204cb54083lu}, + {0xd851c54a56bf1f29lu, 0x4a84d1d50bf4c4fflu, + 0x56621e553d52b56clu, 0xd0069553da8f584alu}, + {0x5826475e2c5fd799lu, 0xfd32edc01243f5e9lu, + 0x738ba2c66d3fe126lu, 0x5707219c6e6c26b4lu}, +}; + +v4di deadbeef =3D {0xa5a5a5a5deadbeefull, 0xa5a5a5a5deadbeefull, + 0xa5a5a5a5deadbeefull, 0xa5a5a5a5deadbeefull}; +v4di indexq =3D {0x000000000000001full, 0x000000000000008full, + 0xffffffffffffffffull, 0xffffffffffffff5full}; +v4di indexd =3D {0x00000002000000efull, 0xfffffff500000010ull, + 0x0000000afffffff0ull, 0x000000000000000eull}; + +v4di gather_mem[0x20]; + +void init_f32reg(v4di *r) +{ + static int n; + float v[8]; + int i; + for (i =3D 0; i < 8; i++) { + v[i] =3D val_f32[n++]; + if (n =3D=3D ARRAY_LEN(val_f32)) { + n =3D 0; + } + } + memcpy(r, v, sizeof(*r)); +} + +void init_f64reg(v4di *r) +{ + static int n; + double v[4]; + int i; + for (i =3D 0; i < 4; i++) { + v[i] =3D val_f64[n++]; + if (n =3D=3D ARRAY_LEN(val_f64)) { + n =3D 0; + } + } + memcpy(r, v, sizeof(*r)); +} + +void init_intreg(v4di *r) +{ + static uint64_t mask; + static int n; + + r->q0 =3D val_i64[n].q0 ^ mask; + r->q1 =3D val_i64[n].q1 ^ mask; + r->q2 =3D val_i64[n].q2 ^ mask; + r->q3 =3D val_i64[n].q3 ^ mask; + n++; + if (n =3D=3D ARRAY_LEN(val_i64)) { + n =3D 0; + mask *=3D 0x104C11DB7; + } +} + +static void 
init_all(reg_state *s) +{ + int i; + + s->r[3] =3D (uint64_t)&s->mem[0]; /* rdx */ + s->r[4] =3D (uint64_t)&gather_mem[ARRAY_LEN(gather_mem) / 2]; /* rsi */ + s->r[5] =3D (uint64_t)&s->mem[2]; /* rdi */ + s->flags =3D 2; + for (i =3D 0; i < 16; i++) { + s->ymm[i] =3D deadbeef; + } + s->ymm[13] =3D indexd; + s->ymm[14] =3D indexq; + for (i =3D 0; i < 4; i++) { + s->mem0[i] =3D deadbeef; + } +} + +int main(int argc, char *argv[]) +{ + int i; + + init_all(&initI); + init_intreg(&initI.ymm[10]); + init_intreg(&initI.ymm[11]); + init_intreg(&initI.ymm[12]); + init_intreg(&initI.mem0[1]); + printf("Int:\n"); + dump_regs(&initI); + + init_all(&initF32); + init_f32reg(&initF32.ymm[10]); + init_f32reg(&initF32.ymm[11]); + init_f32reg(&initF32.ymm[12]); + init_f32reg(&initF32.mem0[1]); + initF32.ff =3D 32; + printf("F32:\n"); + dump_regs(&initF32); + + init_all(&initF64); + init_f64reg(&initF64.ymm[10]); + init_f64reg(&initF64.ymm[11]); + init_f64reg(&initF64.ymm[12]); + init_f64reg(&initF64.mem0[1]); + initF64.ff =3D 64; + printf("F64:\n"); + dump_regs(&initF64); + + for (i =3D 0; i < ARRAY_LEN(gather_mem); i++) { + init_intreg(&gather_mem[i]); + } + + if (argc > 1) { + int n =3D atoi(argv[1]); + run_test(&test_table[n]); + } else { + run_all(); + } + return 0; +} diff --git a/tests/tcg/i386/test-avx.py b/tests/tcg/i386/test-avx.py new file mode 100755 index 0000000000..0b2d799c5c --- /dev/null +++ b/tests/tcg/i386/test-avx.py @@ -0,0 +1,352 @@ +#! /usr/bin/env python3 + +# Generate test-avx.h from x86.csv + +import csv +import sys +from fnmatch import fnmatch + +archs =3D [ + # TODO: MMX? + "SSE", "SSE2", "SSE3", "SSSE3", "SSE4_1", "SSE4_2", + "AVX", "AVX2", "AES+AVX", # "VAES+AVX", +] + +ignore =3D set(["FISTTP", + "LDMXCSR", "VLDMXCSR", "STMXCSR", "VSTMXCSR"]) + +imask =3D { + 'vBLENDPD': 0xff, + 'vBLENDPS': 0x0f, + 'CMP[PS][SD]': 0x07, + 'VCMP[PS][SD]': 0x1f, + 'vDPPD': 0x33, + 'vDPPS': 0xff, + 'vEXTRACTPS': 0x03, + 'vINSERTPS': 0xff, + 'MPSADBW': 0x7, + 'VMPSADBW': 0x3f, + 'vPALIGNR': 0x3f, + 'vPBLENDW': 0xff, + 'vPCMP[EI]STR*': 0x0f, + 'vPEXTRB': 0x0f, + 'vPEXTRW': 0x07, + 'vPEXTRD': 0x03, + 'vPEXTRQ': 0x01, + 'vPINSRB': 0x0f, + 'vPINSRW': 0x07, + 'vPINSRD': 0x03, + 'vPINSRQ': 0x01, + 'vPSHUF[DW]': 0xff, + 'vPSHUF[LH]W': 0xff, + 'vPS[LR][AL][WDQ]': 0x3f, + 'vPS[RL]LDQ': 0x1f, + 'vROUND[PS][SD]': 0x7, + 'vSHUFPD': 0x0f, + 'vSHUFPS': 0xff, + 'vAESKEYGENASSIST': 0, + 'VEXTRACT[FI]128': 0x01, + 'VINSERT[FI]128': 0x01, + 'VPBLENDD': 0xff, + 'VPERM2[FI]128': 0x33, + 'VPERMPD': 0xff, + 'VPERMQ': 0xff, + 'VPERMILPS': 0xff, + 'VPERMILPD': 0x0f, + } + +def strip_comments(x): + for l in x: + if l !=3D '' and l[0] !=3D '#': + yield l + +def reg_w(w): + if w =3D=3D 8: + return 'al' + elif w =3D=3D 16: + return 'ax' + elif w =3D=3D 32: + return 'eax' + elif w =3D=3D 64: + return 'rax' + raise Exception("bad reg_w %d" % w) + +def mem_w(w): + if w =3D=3D 8: + t =3D "BYTE" + elif w =3D=3D 16: + t =3D "WORD" + elif w =3D=3D 32: + t =3D "DWORD" + elif w =3D=3D 64: + t =3D "QWORD" + elif w =3D=3D 128: + t =3D "XMMWORD" + elif w =3D=3D 256: + t =3D "YMMWORD" + else: + raise Exception() + + return t + " PTR 32[rdx]" + +class XMMArg(): + isxmm =3D True + def __init__(self, reg, mw): + if mw not in [0, 8, 16, 32, 64, 128, 256]: + raise Exception("Bad /m width: %s" % mw) + self.reg =3D reg + self.mw =3D mw + self.ismem =3D mw !=3D 0 + def regstr(self, n): + if n < 0: + return mem_w(self.mw) + else: + return "%smm%d" % (self.reg, n) + +class MMArg(): + isxmm =3D True + ismem =3D False # TODO + def regstr(self, n): +
return "mm%d" % (n & 7) + +def match(op, pattern): + if pattern[0] =3D=3D 'v': + return fnmatch(op, pattern[1:]) or fnmatch(op, 'V'+pattern[1:]) + return fnmatch(op, pattern) + +class ArgVSIB(): + isxmm =3D True + ismem =3D False + def __init__(self, reg, w): + if w not in [32, 64]: + raise Exception("Bad vsib width: %s" % w) + self.w =3D w + self.reg =3D reg + def regstr(self, n): + reg =3D "%smm%d" % (self.reg, n >> 2) + return "[rsi + %s * %d]" % (reg, 1 << (n & 3)) + +class ArgImm8u(): + isxmm =3D False + ismem =3D False + def __init__(self, op): + for k, v in imask.items(): + if match(op, k): + self.mask =3D imask[k]; + return + raise Exception("Unknown immediate") + def vals(self): + mask =3D self.mask + yield 0 + n =3D 0 + while n !=3D mask: + n +=3D 1 + while (n & ~mask) !=3D 0: + n +=3D (n & ~mask) + yield n + +class ArgRM(): + isxmm =3D False + def __init__(self, rw, mw): + if rw not in [8, 16, 32, 64]: + raise Exception("Bad r/w width: %s" % w) + if mw not in [0, 8, 16, 32, 64]: + raise Exception("Bad r/w width: %s" % w) + self.rw =3D rw + self.mw =3D mw + self.ismem =3D mw !=3D 0 + def regstr(self, n): + if n < 0: + return mem_w(self.mw) + else: + return reg_w(self.rw) + +class ArgMem(): + isxmm =3D False + ismem =3D True + def __init__(self, w): + if w not in [8, 16, 32, 64, 128, 256]: + raise Exception("Bad mem width: %s" % w) + self.w =3D w + def regstr(self, n): + return mem_w(self.w) + +def ArgGenerator(arg, op): + if arg[:3] =3D=3D 'xmm' or arg[:3] =3D=3D "ymm": + if "/" in arg: + r, m =3D arg.split('/') + if (m[0] !=3D 'm'): + raise Exception("Expected /m: %s", arg) + return XMMArg(arg[0], int(m[1:])); + else: + return XMMArg(arg[0], 0); + elif arg[:2] =3D=3D 'mm': + return MMArg(); + elif arg[:4] =3D=3D 'imm8': + return ArgImm8u(op); + elif arg =3D=3D '': + return None + elif arg[0] =3D=3D 'r': + if '/m' in arg: + r, m =3D arg.split('/') + if (m[0] !=3D 'm'): + raise Exception("Expected /m: %s", arg) + mw =3D int(m[1:]) + if r =3D=3D 'r': + rw =3D mw + else: + rw =3D int(r[1:]) + return ArgRM(rw, mw) + + return ArgRM(int(arg[1:]), 0); + elif arg[0] =3D=3D 'm': + return ArgMem(int(arg[1:])) + elif arg[:2] =3D=3D 'vm': + return ArgVSIB(arg[-1], int(arg[2:-1])) + else: + raise Exception("Unrecognised arg: %s", arg) + +class InsnGenerator: + def __init__(self, op, args): + self.op =3D op + if op[-2:] in ["PS", "PD", "SS", "SD"]: + if op[-1] =3D=3D 'S': + self.optype =3D 'F32' + else: + self.optype =3D 'F64' + else: + self.optype =3D 'I' + + try: + self.args =3D list(ArgGenerator(a, op) for a in args) + if len(self.args) > 0 and self.args[-1] is None: + self.args =3D self.args[:-1] + except Exception as e: + raise Exception("Bad arg %s: %s" % (op, e)) + + def gen(self): + regs =3D (10, 11, 12) + dest =3D 9 + + nreg =3D len(self.args) + if nreg =3D=3D 0: + yield self.op + return + if isinstance(self.args[-1], ArgImm8u): + nreg -=3D 1 + immarg =3D self.args[-1] + else: + immarg =3D None + memarg =3D -1 + for n, arg in enumerate(self.args): + if arg.ismem: + memarg =3D n + + if (self.op.startswith("VGATHER") or self.op.startswith("VPGATHER"= )): + if "GATHERD" in self.op: + ireg =3D 13 << 2 + else: + ireg =3D 14 << 2 + regset =3D [ + (dest, ireg | 0, regs[0]), + (dest, ireg | 1, regs[0]), + (dest, ireg | 2, regs[0]), + (dest, ireg | 3, regs[0]), + ] + if memarg >=3D 0: + raise Exception("vsib with memory: %s" % self.op) + elif nreg =3D=3D 1: + regset =3D [(regs[0],)] + if memarg =3D=3D 0: + regset +=3D [(-1,)] + elif nreg =3D=3D 2: + regset =3D [ + (regs[0], regs[1]), + 
(regs[0], regs[0]), + ] + if memarg =3D=3D 0: + regset +=3D [(-1, regs[0])] + elif memarg =3D=3D 1: + regset +=3D [(dest, -1)] + elif nreg =3D=3D 3: + regset =3D [ + (dest, regs[0], regs[1]), + (dest, regs[0], regs[0]), + (regs[0], regs[0], regs[1]), + (regs[0], regs[1], regs[0]), + (regs[0], regs[0], regs[0]), + ] + if memarg =3D=3D 2: + regset +=3D [ + (dest, regs[0], -1), + (regs[0], regs[0], -1), + ] + elif memarg > 0: + raise Exception("Memarg %d" % memarg) + elif nreg =3D=3D 4: + regset =3D [ + (dest, regs[0], regs[1], regs[2]), + (dest, regs[0], regs[0], regs[1]), + (dest, regs[0], regs[1], regs[0]), + (dest, regs[1], regs[0], regs[0]), + (dest, regs[0], regs[0], regs[0]), + (regs[0], regs[0], regs[1], regs[2]), + (regs[0], regs[1], regs[0], regs[2]), + (regs[0], regs[1], regs[2], regs[0]), + (regs[0], regs[0], regs[0], regs[1]), + (regs[0], regs[0], regs[1], regs[0]), + (regs[0], regs[1], regs[0], regs[0]), + (regs[0], regs[0], regs[0], regs[0]), + ] + if memarg =3D=3D 2: + regset +=3D [ + (dest, regs[0], -1, regs[1]), + (dest, regs[0], -1, regs[0]), + (regs[0], regs[0], -1, regs[1]), + (regs[0], regs[1], -1, regs[0]), + (regs[0], regs[0], -1, regs[0]), + ] + elif memarg > 0: + raise Exception("Memarg4 %d" % memarg) + else: + raise Exception("Too many regs: %s(%d)" % (self.op, nreg)) + + for regv in regset: + argstr =3D [] + for i in range(nreg): + arg =3D self.args[i] + argstr.append(arg.regstr(regv[i])) + if immarg is None: + yield self.op + ' ' + ','.join(argstr) + else: + for immval in immarg.vals(): + yield self.op + ' ' + ','.join(argstr) + ',' + str(imm= val) + +def split0(s): + if s =3D=3D '': + return [] + return s.split(',') + +def main(): + n =3D 0 + if len(sys.argv) !=3D 3: + print("Usage: test-avx.py x86.csv test-avx.h") + exit(1) + csvfile =3D open(sys.argv[1], 'r', newline=3D'') + with open(sys.argv[2], "w") as outf: + outf.write("// Generated by test-avx.py. Do not edit.\n") + for row in csv.reader(strip_comments(csvfile)): + insn =3D row[0].replace(',', '').split() + if insn[0] in ignore: + continue + cpuid =3D row[6] + if cpuid in archs: + g =3D InsnGenerator(insn[0], insn[1:]) + for insn in g.gen(): + outf.write('TEST(%d, "%s", %s)\n' % (n, insn, g.optype= )) + n +=3D 1 + outf.write("#undef TEST\n") + csvfile.close() + +if __name__ =3D=3D "__main__": + main() diff --git a/tests/tcg/i386/x86.csv b/tests/tcg/i386/x86.csv new file mode 100644 index 0000000000..d5d0c17f1b --- /dev/null +++ b/tests/tcg/i386/x86.csv @@ -0,0 +1,4658 @@ +# x86 instruction set description version 0.2x, 2018-05-08 +# +# https://golang.org/x/arch/x86 +# +# The latest version of the CSV file is +# available online at https://golang.org/s/x86.csv. +# +# This file contains a block of comment lines, each beginning with #, +# followed by entries in CSV format. All the # comments are at the top +# of the file, so a reader can skip past the comments and hand the +# rest of the file to a standard CSV reader. +# Each CSV line contains these fields: +# +# 1. The Intel manual instruction mnemonic. For example, "SHR r/m32, imm8". +# +# 2. The Go assembler instruction mnemonic. For example, "SHRL imm8, r/m32= ". +# +# 3. The GNU binutils instruction mnemonic. For example, "shrl imm8, r/m32= ". +# +# 4. The instruction encoding. For example, "C1 /4 ib". +# +# 5. The validity of the instruction in 32-bit (aka compatiblity, legacy) = mode. +# +# 6. The validity of the instruction in 64-bit mode. +# +# 7. The CPUID feature flags that signal support for the instruction. +# +# 8. 
Additional comma-separated tags containing hints about the instructio= n. +# +# 9. The read/write actions of the instruction on the arguments used in +# the Intel mnemonic. For example, "rw,r" to denote that "SHR r/m32, imm8" +# reads and writes its first argument but only reads its second argument. +# +# 10. Whether the opcode used in the Intel mnemonic has encoding forms +# distinguished only by operand size, like most arithmetic instructions. +# The string "Y" indicates yes, the string "" indicates no. +# +# 11. The data size of the operation in bits. In general this is the size = corresponding +# to the Go and GNU assembler opcode suffix. +# Mnemonics (the opcode string) +# +# The instruction mnemonics are as used in the Intel manual, with a few ex= ceptions. +# +# Mnemonics claiming general memory forms but that really require fixed ad= dressing modes +# are omitted in favor of their equivalents with implicit arguments.. +# For example, "CMPS m16, m16" (really CMPS [SI], [DI]) is omitted in favo= r of "CMPSW". +# +# Instruction forms with an explicit REP, REPE, or REPNE prefix are also o= mitted. +# Encoders and decoders are expected to handle those prefixes separately. +# +# Perhaps most significantly, the argument syntaxes used in the mnemonic i= ndicate +# exactly how to derive the argument from the instruction encoding, or vic= e versa. +# +# Immediate values: imm8, imm8u, imm16, imm16u, imm32, imm64. +# Immediates are signed by default; the u suffixes indicates an unsigned v= alue. +# Immediates may have bitfield-like modifier that specifies how much bits +# are used. For example, imm8u:4 is encoded like 8bit immediate, +# but only 4bits are meaningful while the others are ignored or must be 0. +# +# Memory operands. The forms m, m128, m14/28byte, m16, m16&16, m16&32, m16= &64, m16:16, m16:32, +# m16:64, m16int, m256, m2byte, m32, m32&32, m32fp, m32int, m512byte, m64,= m64fp, m64int, +# m8, m80bcd, m80dec, m80fp, m94/108byte. These operands always correspond= to the +# memory address specified by the r/m half of the modrm encoding. +# +# Integer registers. +# The forms r8, r16, r32, r64 indicate a register selected by the modrm re= g encoding. +# The forms rmr16, rmr32, rmr64 indicate a register (never memory) selecte= d by the modrm r/m encoding. +# The forms r/m8, r/m16, r/m32, and r/m64 indicate a register or memory se= lected by the modrm r/m encoding. +# Forms with two sizes, like r32/m16 also indicate a register or memory se= lected by the modrm r/m encodng, +# but the size for a register argument differs from the size of a memory a= rgument. +# The forms r8V, r16V, r32V, r64V indicate a register selected by the VEX.= vvvv bits. +# +# Multimedia registers. +# The forms mm1, xmm1, and ymm1 indicate a multimedia register selected by= the +# modrm reg encoding. +# The forms mm2, xmm2, and ymm2 indicate a register (never memory) selecte= d by +# the modrm r/m encoding. +# The forms mm2/m64, xmm2/m128, and so on indicate a register or memory +# selected by the modrm r/m encoding. +# The forms xmmV and ymmV indicate a register selected by the VEX.vvvv bit= s. +# The forms xmmI and ymmI indicate a register selected by the top four bit= s of an /is4 immediate byte. +# +# Bound registers. +# The form bnd1 indicates a bound register selected by the modrm reg encod= ing. +# The form bnd2 indicates a bound register (never memory) selected by the = modrm r/m encoding. +# The forms bnd2/m64 and bnd2/m128 indicate a register or memorys selected= by the modrm r/m encoding. 
+# TODO: Describe mib. +# +# One-of-a-kind operands: rel8, rel16, rel32, ptr16:16, ptr16:32, +# moffs8, moffs16, moffs32, moffs64, vm32x, vm32y, vm64x, and vm64y +# are all as in the Intel manual. +# +# Encodings +# +# The encodings are also as used in the Intel manual, with automated corre= ctions. +# For example, the Intel manual sometimes omits the modrm /r indicator or = other trailing bytes, +# and it also contains typographical errors. +# These problems are corrected so that the CSV data may be used to generate +# tools for processing x86 machine code. +# See https://golang.org/x/arch/x86/x86map for one such generator. +# +# Valid32 and Valid64 +# +# These columns hold validity abbreviations as defined in the Intel manual: +# V, I, N.E., N.P., N.S., or N.I. +# Tools processing the data are typically only concerned with whether the +# column is "V" (valid) or not. +# This data is also corrected compared to the manual. +# For example, the manual lists many instruction forms using REX bytes +# with an incorrect "V" in the Valid32 column. +# +# CPUID Feature Flags +# +# This column specifies CPUID feature flags that must be present in order +# to use the instruction. If multiple flags are required, +# they are listed separated by plus signs, as in PCLMULQDQ+AVX. +# The column can also list one of the values 486, Pentium, PentiumII, and = P6, +# indicating that the instruction was introduced on that architecture vers= ion. +# +# Tags +# +# The tag column does not correspond to a traditional column in the Intel = manual tables. +# Instead, it is itself a comma-separated list of tags or hints derived by= analysis +# of the instruction set or the instruction encodings. +# +# The tags address16, address32, and address64 indicate that the instructi= on form +# applies when using the specified addressing size. It may therefore be ne= cessary to use an +# address size prefix byte to access the instruction. +# If two address tags are listed, the instruction can be used with either = of those +# address sizes. An instruction will never list all three address sizes. +# (In fact, today, no instruction lists two address sizes, but that may ch= ange.) +# +# The tags operand16, operand32, and operand64 indicate that the instructi= on form +# applies when using the specified operand size. It may therefore be neces= sary to use an +# operand size prefix byte to access the instruction. +# If two operand tags are listed, the instruction can be used with either= of those +# operand sizes. An instruction will never list all three operand sizes. +# For some instructions, default64 is used instead of operand64, +# which specifies data promotion to 64-bit. +# For instructions with different possible data sizes, +# it also describes that default data size is 64-bit instead of 32-bit. +# Using refining prefix like 0x66 will lead to 32-bit operation (if suppor= ted). +# +# The tags modrm_regonly or modrm_memonly indicate that the modrm byte's +# r/m encoding must specify a register or memory, respectively. +# Especially in newer instructions, the modrm constraint may be the only w= ay +# to distinguish two instruction forms. For example the MOVHLPS and MOVLPS +# instructions share the same encoding, except that the former requires the +# modrm byte's r/m to indicate a register, while the latter requires it to= indicate memory. 
+# +# The tags pseudo and pseudo64 indicate that this instruction form is redu= ndant +# with others listed in the table and should be ignored when generating di= sassembly +# or instruction scanning programs. The pseudo64 tag is reserved for the c= ase where +# the manual lists an instruction twice, once with the optional 64-bit mod= e REX byte. +# Since most decoders will handle the REX byte separately, the form with t= he +# unnecessary REX is tagged pseudo64. +# +# The amd tag marks AMD-specific instructions. +# As an example, all instructions of SSE4a have such tag. +# +# The AVX512-specific tags: scaleX and bscaleX. +# scale1, scale2, scale4, scale8, scale16, scale32, scale64 specify +# the compressed displacement multiplier (scaling). +# For example, if displacement is 128 and scale32 is set, +# disp8 value should be calculated as 128/32. +# bscale4 and bscale8 have the same meaning, but are used +# when instruction uses embedded broadcast feature. +# If instruction does not have bscaleX tag, it does not support EVEX broad= casting. +# +# Related packages (can be a good source of additional documentation): +# x86csv - read and manipulate x86.csv +# x86spec - x86.csv generator +# x86map - x86asm table generator based on x86.csv +# x86avxgen - cmd/internal/obj/x86 optab generator based x86.csv +# All listed packages are located at golang.org/x/arch/x86/. +"PUSH imm32","-/PUSHL/PUSHQ imm32","-/pushl/pushq imm32","68 id","V","N.S.= ","","operand32","r","Y","" +"PUSH imm32","-/PUSHL/PUSHQ imm32","-/pushl/pushq imm32","68 id","N.S.","V= ","","default64","r","Y","" +"AAA","AAA","aaa","37","V","N.S.","","","","","" +"AAD","AAD","aad","D5 0A","V","I","","pseudo","","","" +"AAD imm8u","AAD imm8u","aad imm8u","D5 ib","V","N.S.","","","r","","" +"AAM","AAM","aam","D4 0A","V","I","","pseudo","","","" +"AAM imm8u","AAM imm8u","aam imm8u","D4 ib","V","N.S.","","","r","","" +"AAS","AAS","aas","3F","V","N.S.","","","","","" +"ADC AL, imm8","ADCB imm8, AL","adcb imm8, AL","14 ib","V","V","","","rw,r= ","Y","8" +"ADC r/m8, imm8","ADCB imm8, r/m8","adcb imm8, r/m8","80 /2 ib","V","V",""= ,"","rw,r","Y","8" +"ADC r/m8, imm8","ADCB imm8, r/m8","adcb imm8, r/m8","82 /2 ib","V","N.S."= ,"","","rw,r","Y","8" +"ADC r/m8, imm8","ADCB imm8, r/m8","adcb imm8, r/m8","REX 80 /2 ib","N.E."= ,"V","","pseudo64","rw,r","Y","8" +"ADC r8, r/m8","ADCB r/m8, r8","adcb r/m8, r8","12 /r","V","V","","","rw,r= ","Y","8" +"ADC r8, r/m8","ADCB r/m8, r8","adcb r/m8, r8","REX 12 /r","N.E.","V","","= pseudo64","rw,r","Y","8" +"ADC r/m8, r8","ADCB r8, r/m8","adcb r8, r/m8","10 /r","V","V","","","rw,r= ","Y","8" +"ADC r/m8, r8","ADCB r8, r/m8","adcb r8, r/m8","REX 10 /r","N.E.","V","","= pseudo64","rw,r","Y","8" +"ADC EAX, imm32","ADCL imm32, EAX","adcl imm32, EAX","15 id","V","V","","o= perand32","rw,r","Y","32" +"ADC r/m32, imm32","ADCL imm32, r/m32","adcl imm32, r/m32","81 /2 id","V",= "V","","operand32","rw,r","Y","32" +"ADC r/m32, imm8","ADCL imm8, r/m32","adcl imm8, r/m32","83 /2 ib","V","V"= ,"","operand32","rw,r","Y","32" +"ADC r32, r/m32","ADCL r/m32, r32","adcl r/m32, r32","13 /r","V","V","","o= perand32","rw,r","Y","32" +"ADC r/m32, r32","ADCL r32, r/m32","adcl r32, r/m32","11 /r","V","V","","o= perand32","rw,r","Y","32" +"ADC RAX, imm32","ADCQ imm32, RAX","adcq imm32, RAX","REX.W 15 id","N.S.",= "V","","","rw,r","Y","64" +"ADC r/m64, imm32","ADCQ imm32, r/m64","adcq imm32, r/m64","REX.W 81 /2 id= ","N.S.","V","","","rw,r","Y","64" +"ADC r/m64, imm8","ADCQ imm8, r/m64","adcq imm8, r/m64","REX.W 83 /2 ib","= 
N.S.","V","","","rw,r","Y","64" +"ADC r64, r/m64","ADCQ r/m64, r64","adcq r/m64, r64","REX.W 13 /r","N.S.",= "V","","","rw,r","Y","64" +"ADC r/m64, r64","ADCQ r64, r/m64","adcq r64, r/m64","REX.W 11 /r","N.S.",= "V","","","rw,r","Y","64" +"ADC AX, imm16","ADCW imm16, AX","adcw imm16, AX","15 iw","V","V","","oper= and16","rw,r","Y","16" +"ADC r/m16, imm16","ADCW imm16, r/m16","adcw imm16, r/m16","81 /2 iw","V",= "V","","operand16","rw,r","Y","16" +"ADC r/m16, imm8","ADCW imm8, r/m16","adcw imm8, r/m16","83 /2 ib","V","V"= ,"","operand16","rw,r","Y","16" +"ADC r16, r/m16","ADCW r/m16, r16","adcw r/m16, r16","13 /r","V","V","","o= perand16","rw,r","Y","16" +"ADC r/m16, r16","ADCW r16, r/m16","adcw r16, r/m16","11 /r","V","V","","o= perand16","rw,r","Y","16" +"ADCX r32, r/m32","ADCXL r/m32, r32","adcxl r/m32, r32","66 0F 38 F6 /r","= V","V","ADX","operand16,operand32","rw,r","Y","32" +"ADCX r64, r/m64","ADCXQ r/m64, r64","adcxq r/m64, r64","66 REX.W 0F 38 F6= /r","N.S.","V","ADX","","rw,r","Y","64" +"ADD AL, imm8","ADDB imm8, AL","addb imm8, AL","04 ib","V","V","","","rw,r= ","Y","8" +"ADD r/m8, imm8","ADDB imm8, r/m8","addb imm8, r/m8","80 /0 ib","V","V",""= ,"","rw,r","Y","8" +"ADD r/m8, imm8","ADDB imm8, r/m8","addb imm8, r/m8","82 /0 ib","V","N.S."= ,"","","rw,r","Y","8" +"ADD r/m8, imm8","ADDB imm8, r/m8","addb imm8, r/m8","REX 80 /0 ib","N.E."= ,"V","","pseudo64","rw,r","Y","8" +"ADD r8, r/m8","ADDB r/m8, r8","addb r/m8, r8","02 /r","V","V","","","rw,r= ","Y","8" +"ADD r8, r/m8","ADDB r/m8, r8","addb r/m8, r8","REX 02 /r","N.E.","V","","= pseudo64","rw,r","Y","8" +"ADD r/m8, r8","ADDB r8, r/m8","addb r8, r/m8","00 /r","V","V","","","rw,r= ","Y","8" +"ADD r/m8, r8","ADDB r8, r/m8","addb r8, r/m8","REX 00 /r","N.E.","V","","= pseudo64","rw,r","Y","8" +"ADD EAX, imm32","ADDL imm32, EAX","addl imm32, EAX","05 id","V","V","","o= perand32","rw,r","Y","32" +"ADD r/m32, imm32","ADDL imm32, r/m32","addl imm32, r/m32","81 /0 id","V",= "V","","operand32","rw,r","Y","32" +"ADD r/m32, imm8","ADDL imm8, r/m32","addl imm8, r/m32","83 /0 ib","V","V"= ,"","operand32","rw,r","Y","32" +"ADD r32, r/m32","ADDL r/m32, r32","addl r/m32, r32","03 /r","V","V","","o= perand32","rw,r","Y","32" +"ADD r/m32, r32","ADDL r32, r/m32","addl r32, r/m32","01 /r","V","V","","o= perand32","rw,r","Y","32" +"ADDPD xmm1, xmm2/m128","ADDPD xmm2/m128, xmm1","addpd xmm2/m128, xmm1","6= 6 0F 58 /r","V","V","SSE2","","rw,r","","" +"ADDPS xmm1, xmm2/m128","ADDPS xmm2/m128, xmm1","addps xmm2/m128, xmm1","0= F 58 /r","V","V","SSE","","rw,r","","" +"ADD RAX, imm32","ADDQ imm32, RAX","addq imm32, RAX","REX.W 05 id","N.S.",= "V","","","rw,r","Y","64" +"ADD r/m64, imm32","ADDQ imm32, r/m64","addq imm32, r/m64","REX.W 81 /0 id= ","N.S.","V","","","rw,r","Y","64" +"ADD r/m64, imm8","ADDQ imm8, r/m64","addq imm8, r/m64","REX.W 83 /0 ib","= N.S.","V","","","rw,r","Y","64" +"ADD r64, r/m64","ADDQ r/m64, r64","addq r/m64, r64","REX.W 03 /r","N.S.",= "V","","","rw,r","Y","64" +"ADD r/m64, r64","ADDQ r64, r/m64","addq r64, r/m64","REX.W 01 /r","N.S.",= "V","","","rw,r","Y","64" +"ADDSD xmm1, xmm2/m64","ADDSD xmm2/m64, xmm1","addsd xmm2/m64, xmm1","F2 0= F 58 /r","V","V","SSE2","","rw,r","","" +"ADDSS xmm1, xmm2/m32","ADDSS xmm2/m32, xmm1","addss xmm2/m32, xmm1","F3 0= F 58 /r","V","V","SSE","","rw,r","","" +"ADDSUBPD xmm1, xmm2/m128","ADDSUBPD xmm2/m128, xmm1","addsubpd xmm2/m128,= xmm1","66 0F D0 /r","V","V","SSE3","","rw,r","","" +"ADDSUBPS xmm1, xmm2/m128","ADDSUBPS xmm2/m128, xmm1","addsubps xmm2/m128,= xmm1","F2 0F D0 
/r","V","V","SSE3","","rw,r","","" +"ADD AX, imm16","ADDW imm16, AX","addw imm16, AX","05 iw","V","V","","oper= and16","rw,r","Y","16" +"ADD r/m16, imm16","ADDW imm16, r/m16","addw imm16, r/m16","81 /0 iw","V",= "V","","operand16","rw,r","Y","16" +"ADD r/m16, imm8","ADDW imm8, r/m16","addw imm8, r/m16","83 /0 ib","V","V"= ,"","operand16","rw,r","Y","16" +"ADD r16, r/m16","ADDW r/m16, r16","addw r/m16, r16","03 /r","V","V","","o= perand16","rw,r","Y","16" +"ADD r/m16, r16","ADDW r16, r/m16","addw r16, r/m16","01 /r","V","V","","o= perand16","rw,r","Y","16" +"ADOX r32, r/m32","ADOXL r/m32, r32","adoxl r/m32, r32","F3 0F 38 F6 /r","= V","V","ADX","operand16,operand32","rw,r","Y","32" +"ADOX r64, r/m64","ADOXQ r/m64, r64","adoxq r/m64, r64","F3 REX.W 0F 38 F6= /r","N.S.","V","ADX","","rw,r","Y","64" +"AESDEC xmm1, xmm2/m128","AESDEC xmm2/m128, xmm1","aesdec xmm2/m128, xmm1"= ,"66 0F 38 DE /r","V","V","AES","","rw,r","","" +"AESDECLAST xmm1, xmm2/m128","AESDECLAST xmm2/m128, xmm1","aesdeclast xmm2= /m128, xmm1","66 0F 38 DF /r","V","V","AES","","rw,r","","" +"AESENC xmm1, xmm2/m128","AESENC xmm2/m128, xmm1","aesenc xmm2/m128, xmm1"= ,"66 0F 38 DC /r","V","V","AES","","rw,r","","" +"AESENCLAST xmm1, xmm2/m128","AESENCLAST xmm2/m128, xmm1","aesenclast xmm2= /m128, xmm1","66 0F 38 DD /r","V","V","AES","","rw,r","","" +"AESIMC xmm1, xmm2/m128","AESIMC xmm2/m128, xmm1","aesimc xmm2/m128, xmm1"= ,"66 0F 38 DB /r","V","V","AES","","w,r","","" +"AESKEYGENASSIST xmm1, xmm2/m128, imm8u","AESKEYGENASSIST imm8u, xmm2/m128= , xmm1","aeskeygenassist imm8u, xmm2/m128, xmm1","66 0F 3A DF /r ib","V","V= ","AES","","w,r,r","","" +"AND AL, imm8","ANDB imm8, AL","andb imm8, AL","24 ib","V","V","","","rw,r= ","Y","8" +"AND r/m8, imm8","ANDB imm8, r/m8","andb imm8, r/m8","REX 80 /4 ib","N.E."= ,"V","","pseudo64","rw,r","Y","8" +"AND r/m8, imm8u","ANDB imm8u, r/m8","andb imm8u, r/m8","80 /4 ib","V","V"= ,"","","rw,r","Y","8" +"AND r/m8, imm8u","ANDB imm8u, r/m8","andb imm8u, r/m8","82 /4 ib","V","N.= S.","","","rw,r","Y","8" +"AND r8, r/m8","ANDB r/m8, r8","andb r/m8, r8","22 /r","V","V","","","rw,r= ","Y","8" +"AND r8, r/m8","ANDB r/m8, r8","andb r/m8, r8","REX 22 /r","N.E.","V","","= pseudo64","rw,r","Y","8" +"AND r/m8, r8","ANDB r8, r/m8","andb r8, r/m8","20 /r","V","V","","","rw,r= ","Y","8" +"AND r/m8, r8","ANDB r8, r/m8","andb r8, r/m8","REX 20 /r","N.E.","V","","= pseudo64","rw,r","Y","8" +"AND EAX, imm32","ANDL imm32, EAX","andl imm32, EAX","25 id","V","V","","o= perand32","rw,r","Y","32" +"AND r/m32, imm32","ANDL imm32, r/m32","andl imm32, r/m32","81 /4 id","V",= "V","","operand32","rw,r","Y","32" +"AND r/m32, imm8","ANDL imm8, r/m32","andl imm8, r/m32","83 /4 ib","V","V"= ,"","operand32","rw,r","Y","32" +"AND r32, r/m32","ANDL r/m32, r32","andl r/m32, r32","23 /r","V","V","","o= perand32","rw,r","Y","32" +"AND r/m32, r32","ANDL r32, r/m32","andl r32, r/m32","21 /r","V","V","","o= perand32","rw,r","Y","32" +"ANDN r32, r32V, r/m32","ANDNL r/m32, r32V, r32","andnl r/m32, r32V, r32",= "VEX.DDS.128.0F38.W0 F2 /r","V","V","BMI1","","rw,r,r","Y","32" +"ANDNPD xmm1, xmm2/m128","ANDNPD xmm2/m128, xmm1","andnpd xmm2/m128, xmm1"= ,"66 0F 55 /r","V","V","SSE2","","rw,r","","" +"ANDNPS xmm1, xmm2/m128","ANDNPS xmm2/m128, xmm1","andnps xmm2/m128, xmm1"= ,"0F 55 /r","V","V","SSE","","rw,r","","" +"ANDN r64, r64V, r/m64","ANDNQ r/m64, r64V, r64","andnq r/m64, r64V, r64",= "VEX.DDS.128.0F38.W1 F2 /r","N.S.","V","BMI1","","rw,r,r","Y","64" +"ANDPD xmm1, xmm2/m128","ANDPD xmm2/m128, xmm1","andpd xmm2/m128, xmm1","6= 6 0F 54 
/r","V","V","SSE2","","rw,r","","" +"ANDPS xmm1, xmm2/m128","ANDPS xmm2/m128, xmm1","andps xmm2/m128, xmm1","0= F 54 /r","V","V","SSE","","rw,r","","" +"AND RAX, imm32","ANDQ imm32, RAX","andq imm32, RAX","REX.W 25 id","N.S.",= "V","","","rw,r","Y","64" +"AND r/m64, imm32","ANDQ imm32, r/m64","andq imm32, r/m64","REX.W 81 /4 id= ","N.S.","V","","","rw,r","Y","64" +"AND r/m64, imm8","ANDQ imm8, r/m64","andq imm8, r/m64","REX.W 83 /4 ib","= N.S.","V","","","rw,r","Y","64" +"AND r64, r/m64","ANDQ r/m64, r64","andq r/m64, r64","REX.W 23 /r","N.S.",= "V","","","rw,r","Y","64" +"AND r/m64, r64","ANDQ r64, r/m64","andq r64, r/m64","REX.W 21 /r","N.S.",= "V","","","rw,r","Y","64" +"AND AX, imm16","ANDW imm16, AX","andw imm16, AX","25 iw","V","V","","oper= and16","rw,r","Y","16" +"AND r/m16, imm16","ANDW imm16, r/m16","andw imm16, r/m16","81 /4 iw","V",= "V","","operand16","rw,r","Y","16" +"AND r/m16, imm8","ANDW imm8, r/m16","andw imm8, r/m16","83 /4 ib","V","V"= ,"","operand16","rw,r","Y","16" +"AND r16, r/m16","ANDW r/m16, r16","andw r/m16, r16","23 /r","V","V","","o= perand16","rw,r","Y","16" +"AND r/m16, r16","ANDW r16, r/m16","andw r16, r/m16","21 /r","V","V","","o= perand16","rw,r","Y","16" +"ARPL r/m16, r16","ARPL r16, r/m16","arpl r16, r/m16","63 /r","V","N.S.","= ","","rw,r","","" +"BEXTR r32, r/m32, r32V","BEXTRL r32V, r/m32, r32","bextrl r32V, r/m32, r3= 2","VEX.NDS.128.0F38.W0 F7 /r","V","V","BMI1","","w,r,r","Y","32" +"BEXTR r64, r/m64, r64V","BEXTRQ r64V, r/m64, r64","bextrq r64V, r/m64, r6= 4","VEX.NDS.128.0F38.W1 F7 /r","N.S.","V","BMI1","","w,r,r","Y","64" +"BEXTR_XOP r32, r/m32, imm32u","BEXTR_XOPL imm32u, r/m32, r32","bextr_xopl= imm32u, r/m32, r32","XOP.128.0A.WIG 10 /r","V","V","TBM","amd,operand16,op= erand32","w,r,r","Y","32" +"BEXTR_XOP r64, r/m64, imm32u","BEXTR_XOPQ imm32u, r/m64, r64","bextr_xopq= imm32u, r/m64, r64","XOP.128.0A.WIG 10 /r","N.S.","V","TBM","amd,operand64= ","w,r,r","Y","64" +"BLCFILL r32V, r/m32","BLCFILLL r/m32, r32V","blcfill r/m32, r32V","XOP.ND= D.128.09.WIG 01 /1","V","V","TBM","amd,operand16,operand32","w,r","Y","32" +"BLCFILL r64V, r/m64","BLCFILLQ r/m64, r64V","blcfill r/m64, r64V","XOP.ND= D.128.09.W1 01 /1","N.S.","V","TBM","amd,operand64","w,r","Y","64" +"BLCIC r32V, r/m32","BLCICL r/m32, r32V","blcicl r/m32, r32V","XOP.NDD.128= .09.WIG 01 /5","V","V","TBM","amd,operand16,operand32","w,r","Y","32" +"BLCIC r64V, r/m64","BLCICQ r/m64, r64V","blcicq r/m64, r64V","XOP.NDD.128= .09.WIG 01 /5","N.S.","V","TBM","amd,operand64","w,r","Y","64" +"BLCI r32V, r/m32","BLCIL r/m32, r32V","blcil r/m32, r32V","XOP.NDD.128.09= .WIG 02 /6","V","V","TBM","amd,operand16,operand32","w,r","Y","32" +"BLCI r64V, r/m64","BLCIQ r/m64, r64V","blciq r/m64, r64V","XOP.NDD.128.09= .WIG 02 /6","N.S.","V","TBM","amd,operand64","w,r","Y","64" +"BLCMSK r32V, r/m32","BLCMSKL r/m32, r32V","blcmskl r/m32, r32V","XOP.NDD.= 128.09.WIG 02 /1","V","V","TBM","amd,operand16,operand32","w,r","Y","32" +"BLCMSK r64V, r/m64","BLCMSKQ r/m64, r64V","blcmskq r/m64, r64V","XOP.NDD.= 128.09.WIG 02 /1","N.S.","V","TBM","amd,operand64","w,r","Y","64" +"BLCS r32V, r/m32","BLCSL r/m32, r32V","blcsl r/m32, r32V","XOP.NDD.128.09= .WIG 01 /3","V","V","TBM","amd,operand16,operand32","w,r","Y","32" +"BLCS r64V, r/m64","BLCSQ r/m64, r64V","blcsq r/m64, r64V","XOP.NDD.128.09= .WIG 01 /3","N.S.","V","TBM","amd,operand64","w,r","Y","64" +"BLENDPD xmm1, xmm2/m128, imm8u","BLENDPD imm8u, xmm2/m128, xmm1","blendpd= imm8u, xmm2/m128, xmm1","66 0F 3A 0D /r ib","V","V","SSE4_1","","rw,r,r","= ","" +"BLENDPS xmm1, 
+"BLENDVPD xmm1, xmm2/m128, <XMM0>","BLENDVPD <XMM0>, xmm2/m128, xmm1","blendvpd <XMM0>, xmm2/m128, xmm1","66 0F 38 15 /r","V","V","SSE4_1","","rw,r,r","",""
+"BLENDVPS xmm1, xmm2/m128, <XMM0>","BLENDVPS <XMM0>, xmm2/m128, xmm1","blendvps <XMM0>, xmm2/m128, xmm1","66 0F 38 14 /r","V","V","SSE4_1","","rw,r,r","",""
+"BLSFILL r32V, r/m32","BLSFILLL r/m32, r32V","blsfill r/m32, r32V","XOP.NDD.128.09.WIG 01 /2","V","V","TBM","amd,operand16,operand32","w,r","Y","32"
+"BLSFILL r64V, r/m64","BLSFILLQ r/m64, r64V","blsfill r/m64, r64V","XOP.NDD.128.09.W1 01 /2","N.S.","V","TBM","amd,operand64","w,r","Y","64"
+"BLSIC r32V, r/m32","BLSICL r/m32, r32V","blsicl r/m32, r32V","XOP.NDD.128.09.WIG 01 /6","V","V","TBM","amd,operand16,operand32","w,r","Y","32"
+"BLSIC r64V, r/m64","BLSICQ r/m64, r64V","blsicq r/m64, r64V","XOP.NDD.128.09.WIG 01 /6","N.S.","V","TBM","amd,operand64","w,r","Y","64"
+"BLSI r32V, r/m32","BLSIL r/m32, r32V","blsil r/m32, r32V","VEX.NDD.128.0F38.W0 F3 /3","V","V","BMI1","","w,r","Y","32"
+"BLSI r64V, r/m64","BLSIQ r/m64, r64V","blsiq r/m64, r64V","VEX.NDD.128.0F38.W1 F3 /3","N.S.","V","BMI1","","w,r","Y","64"
+"BLSMSK r32V, r/m32","BLSMSKL r/m32, r32V","blsmskl r/m32, r32V","VEX.NDD.128.0F38.W0 F3 /2","V","V","BMI1","","w,r","Y","32"
+"BLSMSK r64V, r/m64","BLSMSKQ r/m64, r64V","blsmskq r/m64, r64V","VEX.NDD.128.0F38.W1 F3 /2","N.S.","V","BMI1","","w,r","Y","64"
+"BLSR r32V, r/m32","BLSRL r/m32, r32V","blsrl r/m32, r32V","VEX.NDD.128.0F38.W0 F3 /1","V","V","BMI1","","w,r","Y","32"
+"BLSR r64V, r/m64","BLSRQ r/m64, r64V","blsrq r/m64, r64V","VEX.NDD.128.0F38.W1 F3 /1","N.S.","V","BMI1","","w,r","Y","64"
+"BNDCL bnd1, r/m32","BNDCL r/m32, bnd1","bndcl r/m32, bnd1","F3 0F 1A /r","V","N.S.","MPX","","r,r","",""
+"BNDCL bnd1, r/m64","BNDCL r/m64, bnd1","bndcl r/m64, bnd1","F3 0F 1A /r","N.S.","V","MPX","","r,r","",""
+"BNDCN bnd1, r/m32","BNDCN r/m32, bnd1","bndcn r/m32, bnd1","F2 0F 1B /r","V","N.S.","MPX","","r,r","",""
+"BNDCN bnd1, r/m64","BNDCN r/m64, bnd1","bndcn r/m64, bnd1","F2 0F 1B /r","N.S.","V","MPX","","r,r","",""
+"BNDCU bnd1, r/m32","BNDCU r/m32, bnd1","bndcu r/m32, bnd1","F2 0F 1A /r","V","N.S.","MPX","","r,r","",""
+"BNDCU bnd1, r/m64","BNDCU r/m64, bnd1","bndcu r/m64, bnd1","F2 0F 1A /r","N.S.","V","MPX","","r,r","",""
+"BNDLDX bnd1, mib","BNDLDX mib, bnd1","bndldx mib, bnd1","0F 1A /r","V","V","MPX","modrm_memonly","w,r","",""
+"BNDMK bnd1, m32","BNDMK m32, bnd1","bndmk m32, bnd1","F3 0F 1B /r","V","N.S.","MPX","modrm_memonly","w,r","",""
+"BNDMK bnd1, m64","BNDMK m64, bnd1","bndmk m64, bnd1","F3 0F 1B /r","N.S.","V","MPX","modrm_memonly","w,r","",""
+"BNDMOV bnd2/m128, bnd1","BNDMOV bnd1, bnd2/m128","bndmov bnd1, bnd2/m128","66 0F 1B /r","N.S.","V","MPX","","w,r","",""
+"BNDMOV bnd2/m64, bnd1","BNDMOV bnd1, bnd2/m64","bndmov bnd1, bnd2/m64","66 0F 1B /r","V","N.S.","MPX","","w,r","",""
+"BNDMOV bnd1, bnd2/m128","BNDMOV bnd2/m128, bnd1","bndmov bnd2/m128, bnd1","66 0F 1A /r","N.S.","V","MPX","","w,r","",""
+"BNDMOV bnd1, bnd2/m64","BNDMOV bnd2/m64, bnd1","bndmov bnd2/m64, bnd1","66 0F 1A /r","V","N.S.","MPX","","w,r","",""
+"BNDSTX mib, bnd1","BNDSTX bnd1, mib","bndstx bnd1, mib","0F 1B /r","V","V","MPX","modrm_memonly","w,r","",""
+"BOUND r32, m32&32","BOUNDL m32&32, r32","boundl r32, m32&32","62 /r","V","N.S.","","modrm_memonly,operand32","r,r","Y","32"
+"BOUND r16, m16&16","BOUNDW m16&16, r16","boundw r16, m16&16","62 /r","V","N.S.","","modrm_memonly,operand16","r,r","Y","16"
+"BSF r32, r/m32","BSFL r/m32, r32","bsfl r/m32, r32","0F BC /r","V","V","","operand32","rw,r","Y","32"
+"BSF r32, r/m32","BSFL r/m32, r32","bsfl r/m32, r32","F3 0F BC /r","V","V","","operand32","rw,r","Y","32"
+"BSF r64, r/m64","BSFQ r/m64, r64","bsfq r/m64, r64","F3 REX.W 0F BC /r","N.S.","V","","","rw,r","Y","64"
+"BSF r64, r/m64","BSFQ r/m64, r64","bsfq r/m64, r64","REX.W 0F BC /r","N.S.","V","","","rw,r","Y","64"
+"BSF r16, r/m16","BSFW r/m16, r16","bsfw r/m16, r16","0F BC /r","V","V","","operand16","rw,r","Y","16"
+"BSF r16, r/m16","BSFW r/m16, r16","bsfw r/m16, r16","F3 0F BC /r","V","V","","operand16","rw,r","Y","16"
+"BSR r32, r/m32","BSRL r/m32, r32","bsrl r/m32, r32","0F BD /r","V","V","","operand32","rw,r","Y","32"
+"BSR r32, r/m32","BSRL r/m32, r32","bsrl r/m32, r32","F3 0F BD /r","V","V","","operand32","rw,r","Y","32"
+"BSR r64, r/m64","BSRQ r/m64, r64","bsrq r/m64, r64","F3 REX.W 0F BD /r","N.S.","V","","","rw,r","Y","64"
+"BSR r64, r/m64","BSRQ r/m64, r64","bsrq r/m64, r64","REX.W 0F BD /r","N.S.","V","","","rw,r","Y","64"
+"BSR r16, r/m16","BSRW r/m16, r16","bsrw r/m16, r16","0F BD /r","V","V","","operand16","rw,r","Y","16"
+"BSR r16, r/m16","BSRW r/m16, r16","bsrw r/m16, r16","F3 0F BD /r","V","V","","operand16","rw,r","Y","16"
+"BSWAP r32op","BSWAPL r32op","bswap r32op","0F C8+rd","V","V","486","operand32","rw","Y","32"
+"BSWAP r64op","BSWAPQ r64op","bswap r64op","REX.W 0F C8+ro","N.S.","V","486","","rw","Y","64"
+"BSWAP r16op","BSWAPW r16op","bswap r16op","0F C8+rw","V","V","486","operand16","rw","Y","16"
+"BTC r/m32, imm8u","BTCL imm8u, r/m32","btcl imm8u, r/m32","0F BA /7 ib","V","V","","operand32","rw,r","Y","32"
+"BTC r/m32, r32","BTCL r32, r/m32","btcl r32, r/m32","0F BB /r","V","V","","operand32","rw,r","Y","32"
+"BTC r/m64, imm8u","BTCQ imm8u, r/m64","btcq imm8u, r/m64","REX.W 0F BA /7 ib","N.S.","V","","","rw,r","Y","64"
+"BTC r/m64, r64","BTCQ r64, r/m64","btcq r64, r/m64","REX.W 0F BB /r","N.S.","V","","","rw,r","Y","64"
+"BTC r/m16, imm8u","BTCW imm8u, r/m16","btcw imm8u, r/m16","0F BA /7 ib","V","V","","operand16","rw,r","Y","16"
+"BTC r/m16, r16","BTCW r16, r/m16","btcw r16, r/m16","0F BB /r","V","V","","operand16","rw,r","Y","16"
+"BT r/m32, imm8u","BTL imm8u, r/m32","btl imm8u, r/m32","0F BA /4 ib","V","V","","operand32","r,r","Y","32"
+"BT r/m32, r32","BTL r32, r/m32","btl r32, r/m32","0F A3 /r","V","V","","operand32","r,r","Y","32"
+"BT r/m64, imm8u","BTQ imm8u, r/m64","btq imm8u, r/m64","REX.W 0F BA /4 ib","N.S.","V","","","r,r","Y","64"
+"BT r/m64, r64","BTQ r64, r/m64","btq r64, r/m64","REX.W 0F A3 /r","N.S.","V","","","r,r","Y","64"
+"BTR r/m32, imm8u","BTRL imm8u, r/m32","btrl imm8u, r/m32","0F BA /6 ib","V","V","","operand32","rw,r","Y","32"
+"BTR r/m32, r32","BTRL r32, r/m32","btrl r32, r/m32","0F B3 /r","V","V","","operand32","rw,r","Y","32"
+"BTR r/m64, imm8u","BTRQ imm8u, r/m64","btrq imm8u, r/m64","REX.W 0F BA /6 ib","N.S.","V","","","rw,r","Y","64"
+"BTR r/m64, r64","BTRQ r64, r/m64","btrq r64, r/m64","REX.W 0F B3 /r","N.S.","V","","","rw,r","Y","64"
+"BTR r/m16, imm8u","BTRW imm8u, r/m16","btrw imm8u, r/m16","0F BA /6 ib","V","V","","operand16","rw,r","Y","16"
+"BTR r/m16, r16","BTRW r16, r/m16","btrw r16, r/m16","0F B3 /r","V","V","","operand16","rw,r","Y","16"
+"BTS r/m32, imm8u","BTSL imm8u, r/m32","btsl imm8u, r/m32","0F BA /5 ib","V","V","","operand32","rw,r","Y","32"
+"BTS r/m32, r32","BTSL r32, r/m32","btsl r32, r/m32","0F AB /r","V","V","","operand32","rw,r","Y","32"
+"BTS r/m64, imm8u","BTSQ imm8u, r/m64","btsq imm8u, r/m64","REX.W 0F BA /5 ib","N.S.","V","","","rw,r","Y","64"
+"BTS r/m64, r64","BTSQ r64, r/m64","btsq r64, r/m64","REX.W 0F AB /r","N.S.","V","","","rw,r","Y","64"
+"BTS r/m16, imm8u","BTSW imm8u, r/m16","btsw imm8u, r/m16","0F BA /5 ib","V","V","","operand16","rw,r","Y","16"
+"BTS r/m16, r16","BTSW r16, r/m16","btsw r16, r/m16","0F AB /r","V","V","","operand16","rw,r","Y","16"
+"BT r/m16, imm8u","BTW imm8u, r/m16","btw imm8u, r/m16","0F BA /4 ib","V","V","","operand16","r,r","Y","16"
+"BT r/m16, r16","BTW r16, r/m16","btw r16, r/m16","0F A3 /r","V","V","","operand16","r,r","Y","16"
+"BZHI r32, r/m32, r32V","BZHIL r32V, r/m32, r32","bzhil r32V, r/m32, r32","VEX.NDS.128.0F38.W0 F5 /r","V","V","BMI2","","w,r,r","Y","32"
+"BZHI r64, r/m64, r64V","BZHIQ r64V, r/m64, r64","bzhiq r64V, r/m64, r64","VEX.NDS.128.0F38.W1 F5 /r","N.S.","V","BMI2","","w,r,r","Y","64"
+"CALL rel16","CALL rel16","call rel16","E8 cw","V","N.S.","","operand16","r","Y",""
+"CALL rel32","CALL rel32","call rel32","E8 cd","V","N.S.","","operand32","r","Y",""
+"CALL rel32","CALL rel32","call rel32","E8 cd","N.S.","V","","default64","r","Y",""
+"CALL r/m32","CALLL* r/m32","calll* r/m32","FF /2","V","N.S.","","operand32","r","Y","32"
+"CALL r/m64","CALLQ* r/m64","callq* r/m64","FF /2","N.S.","V","","default64","r","Y","64"
+"CALL r/m16","CALLW* r/m16","callw* r/m16","FF /2","V","N.S.","","operand16","r","Y","16"
+"CBW","CBW","cbtw","98","V","V","","operand16","","",""
+"CDQ","CDQ","cltd","99","V","V","","operand32","","",""
+"CDQE","CDQE","cltq","REX.W 98","N.S.","V","","","","",""
+"CLAC","CLAC","clac","0F 01 CA","V","V","","","","",""
+"CLC","CLC","clc","F8","V","V","","","","",""
+"CLD","CLD","cld","FC","V","V","","","","",""
+"CLFLUSH m8","CLFLUSH m8","clflush m8","0F AE /7","V","V","","modrm_memonly","r","",""
+"CLFLUSHOPT m8","CLFLUSHOPT m8","clflushopt m8","66 0F AE /7","V","V","","modrm_memonly","r","",""
+"CLGI","CLGI","clgi","0F 01 DD","V","V","SVM","amd","","",""
+"CLI","CLI","cli","FA","V","V","","","","",""
+"CLRSSBSY m64","CLRSSBSY m64","clrssbsy m64","F3 0F AE /6","V","V","CET","modrm_memonly","w","",""
+"CLTS","CLTS","clts","0F 06","V","V","","","","",""
+"CLWB m8","CLWB m8","clwb m8","66 0F AE /6","V","V","CLWB","modrm_memonly","r","",""
+"CLZERO EAX","CLZEROL EAX","clzerol EAX","0F 01 FC","V","V","CLZERO","amd,modrm_regonly,operand32","r","Y","32"
+"CLZERO RAX","CLZEROQ RAX","clzeroq RAX","REX.W 0F 01 FC","N.S.","V","CLZERO","amd,modrm_regonly","r","Y","64"
+"CLZERO AX","CLZEROW AX","clzerow AX","0F 01 FC","V","V","CLZERO","amd,modrm_regonly,operand16","r","Y","16"
+"CMC","CMC","cmc","F5","V","V","","","","",""
+"CMOVC r16, r/m16","CMOVC r/m16, r16","cmovc r/m16, r16","0F 42 /r","V","V","","P6,operand16,pseudo","rw,r","",""
+"CMOVC r32, r/m32","CMOVC r/m32, r32","cmovc r/m32, r32","0F 42 /r","V","V","","P6,operand32,pseudo","rw,r","",""
+"CMOVC r64, r/m64","CMOVC r/m64, r64","cmovc r/m64, r64","REX.W 0F 42 /r","N.E.","V","","pseudo","rw,r","",""
+"CMOVAE r32, r/m32","CMOVLCC r/m32, r32","cmovael r/m32, r32","0F 43 /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVB r32, r/m32","CMOVLCS r/m32, r32","cmovbl r/m32, r32","0F 42 /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVE r32, r/m32","CMOVLEQ r/m32, r32","cmovel r/m32, r32","0F 44 /r","V","V","","P6,operand32","rw,r","Y","32"
+"CMOVGE r32, r/m32","CMOVLGE r/m32, r32","cmovgel r/m32, r32","0F 4D /r","V","V","","P6,operand32","rw,r","Y","32"
r32","cmovgel r/m32, r32","0F 4D /r","= V","V","","P6,operand32","rw,r","Y","32" +"CMOVG r32, r/m32","CMOVLGT r/m32, r32","cmovgl r/m32, r32","0F 4F /r","V"= ,"V","","P6,operand32","rw,r","Y","32" +"CMOVA r32, r/m32","CMOVLHI r/m32, r32","cmoval r/m32, r32","0F 47 /r","V"= ,"V","","P6,operand32","rw,r","Y","32" +"CMOVLE r32, r/m32","CMOVLLE r/m32, r32","cmovlel r/m32, r32","0F 4E /r","= V","V","","P6,operand32","rw,r","Y","32" +"CMOVBE r32, r/m32","CMOVLLS r/m32, r32","cmovbel r/m32, r32","0F 46 /r","= V","V","","P6,operand32","rw,r","Y","32" +"CMOVL r32, r/m32","CMOVLLT r/m32, r32","cmovll r/m32, r32","0F 4C /r","V"= ,"V","","P6,operand32","rw,r","Y","32" +"CMOVS r32, r/m32","CMOVLMI r/m32, r32","cmovsl r/m32, r32","0F 48 /r","V"= ,"V","","P6,operand32","rw,r","Y","32" +"CMOVNE r32, r/m32","CMOVLNE r/m32, r32","cmovnel r/m32, r32","0F 45 /r","= V","V","","P6,operand32","rw,r","Y","32" +"CMOVNO r32, r/m32","CMOVLOC r/m32, r32","cmovnol r/m32, r32","0F 41 /r","= V","V","","P6,operand32","rw,r","Y","32" +"CMOVO r32, r/m32","CMOVLOS r/m32, r32","cmovol r/m32, r32","0F 40 /r","V"= ,"V","","P6,operand32","rw,r","Y","32" +"CMOVNP r32, r/m32","CMOVLPC r/m32, r32","cmovnpl r/m32, r32","0F 4B /r","= V","V","","P6,operand32","rw,r","Y","32" +"CMOVNS r32, r/m32","CMOVLPL r/m32, r32","cmovnsl r/m32, r32","0F 49 /r","= V","V","","P6,operand32","rw,r","Y","32" +"CMOVP r32, r/m32","CMOVLPS r/m32, r32","cmovpl r/m32, r32","0F 4A /r","V"= ,"V","","P6,operand32","rw,r","Y","32" +"CMOVNA r16, r/m16","CMOVNA r/m16, r16","cmovna r/m16, r16","0F 46 /r","V"= ,"V","","P6,operand16,pseudo","rw,r","","" +"CMOVNA r32, r/m32","CMOVNA r/m32, r32","cmovna r/m32, r32","0F 46 /r","V"= ,"V","","P6,operand32,pseudo","rw,r","","" +"CMOVNA r64, r/m64","CMOVNA r/m64, r64","cmovna r/m64, r64","REX.W 0F 46 /= r","N.E.","V","","pseudo","rw,r","","" +"CMOVNAE r16, r/m16","CMOVNAE r/m16, r16","cmovnae r/m16, r16","0F 42 /r",= "V","V","","P6,operand16,pseudo","rw,r","","" +"CMOVNAE r32, r/m32","CMOVNAE r/m32, r32","cmovnae r/m32, r32","0F 42 /r",= "V","V","","P6,operand32,pseudo","rw,r","","" +"CMOVNAE r64, r/m64","CMOVNAE r/m64, r64","cmovnae r/m64, r64","REX.W 0F 4= 2 /r","N.E.","V","","pseudo","rw,r","","" +"CMOVNB r16, r/m16","CMOVNB r/m16, r16","cmovnb r/m16, r16","0F 43 /r","V"= ,"V","","P6,operand16,pseudo","rw,r","","" +"CMOVNB r32, r/m32","CMOVNB r/m32, r32","cmovnb r/m32, r32","0F 43 /r","V"= ,"V","","P6,operand32,pseudo","rw,r","","" +"CMOVNB r64, r/m64","CMOVNB r/m64, r64","cmovnb r/m64, r64","REX.W 0F 43 /= r","N.E.","V","","pseudo","rw,r","","" +"CMOVNBE r16, r/m16","CMOVNBE r/m16, r16","cmovnbe r/m16, r16","0F 47 /r",= "V","V","","P6,operand16,pseudo","rw,r","","" +"CMOVNBE r32, r/m32","CMOVNBE r/m32, r32","cmovnbe r/m32, r32","0F 47 /r",= "V","V","","P6,operand32,pseudo","rw,r","","" +"CMOVNBE r64, r/m64","CMOVNBE r/m64, r64","cmovnbe r/m64, r64","REX.W 0F 4= 7 /r","N.E.","V","","pseudo","rw,r","","" +"CMOVNC r16, r/m16","CMOVNC r/m16, r16","cmovnc r/m16, r16","0F 43 /r","V"= ,"V","","P6,operand16,pseudo","rw,r","","" +"CMOVNC r32, r/m32","CMOVNC r/m32, r32","cmovnc r/m32, r32","0F 43 /r","V"= ,"V","","P6,operand32,pseudo","rw,r","","" +"CMOVNC r64, r/m64","CMOVNC r/m64, r64","cmovnc r/m64, r64","REX.W 0F 43 /= r","N.E.","V","","pseudo","rw,r","","" +"CMOVNG r16, r/m16","CMOVNG r/m16, r16","cmovng r/m16, r16","0F 4E /r","V"= ,"V","","P6,operand16,pseudo","rw,r","","" +"CMOVNG r32, r/m32","CMOVNG r/m32, r32","cmovng r/m32, r32","0F 4E /r","V"= ,"V","","P6,operand32,pseudo","rw,r","","" +"CMOVNG r64, r/m64","CMOVNG 
r/m64, r64","cmovng r/m64, r64","REX.W 0F 4E /= r","N.E.","V","","pseudo","rw,r","","" +"CMOVNGE r16, r/m16","CMOVNGE r/m16, r16","cmovnge r/m16, r16","0F 4C /r",= "V","V","","P6,operand16,pseudo","rw,r","","" +"CMOVNGE r32, r/m32","CMOVNGE r/m32, r32","cmovnge r/m32, r32","0F 4C /r",= "V","V","","P6,operand32,pseudo","rw,r","","" +"CMOVNGE r64, r/m64","CMOVNGE r/m64, r64","cmovnge r/m64, r64","REX.W 0F 4= C /r","N.E.","V","","pseudo","rw,r","","" +"CMOVNL r16, r/m16","CMOVNL r/m16, r16","cmovnl r/m16, r16","0F 4D /r","V"= ,"V","","P6,operand16,pseudo","rw,r","","" +"CMOVNL r32, r/m32","CMOVNL r/m32, r32","cmovnl r/m32, r32","0F 4D /r","V"= ,"V","","P6,operand32,pseudo","rw,r","","" +"CMOVNL r64, r/m64","CMOVNL r/m64, r64","cmovnl r/m64, r64","REX.W 0F 4D /= r","N.E.","V","","pseudo","rw,r","","" +"CMOVNLE r16, r/m16","CMOVNLE r/m16, r16","cmovnle r/m16, r16","0F 4F /r",= "V","V","","P6,operand16,pseudo","rw,r","","" +"CMOVNLE r32, r/m32","CMOVNLE r/m32, r32","cmovnle r/m32, r32","0F 4F /r",= "V","V","","P6,operand32,pseudo","rw,r","","" +"CMOVNLE r64, r/m64","CMOVNLE r/m64, r64","cmovnle r/m64, r64","REX.W 0F 4= F /r","N.E.","V","","pseudo","rw,r","","" +"CMOVNZ r16, r/m16","CMOVNZ r/m16, r16","cmovnz r/m16, r16","0F 45 /r","V"= ,"V","","P6,operand16,pseudo","rw,r","","" +"CMOVNZ r32, r/m32","CMOVNZ r/m32, r32","cmovnz r/m32, r32","0F 45 /r","V"= ,"V","","P6,operand32,pseudo","rw,r","","" +"CMOVNZ r64, r/m64","CMOVNZ r/m64, r64","cmovnz r/m64, r64","REX.W 0F 45 /= r","N.E.","V","","pseudo","rw,r","","" +"CMOVPE r16, r/m16","CMOVPE r/m16, r16","cmovpe r/m16, r16","0F 4A /r","V"= ,"V","","P6,operand16,pseudo","rw,r","","" +"CMOVPE r32, r/m32","CMOVPE r/m32, r32","cmovpe r/m32, r32","0F 4A /r","V"= ,"V","","P6,operand32,pseudo","rw,r","","" +"CMOVPE r64, r/m64","CMOVPE r/m64, r64","cmovpe r/m64, r64","REX.W 0F 4A /= r","N.E.","V","","pseudo","rw,r","","" +"CMOVPO r16, r/m16","CMOVPO r/m16, r16","cmovpo r/m16, r16","0F 4B /r","V"= ,"V","","P6,operand16,pseudo","rw,r","","" +"CMOVPO r32, r/m32","CMOVPO r/m32, r32","cmovpo r/m32, r32","0F 4B /r","V"= ,"V","","P6,operand32,pseudo","rw,r","","" +"CMOVPO r64, r/m64","CMOVPO r/m64, r64","cmovpo r/m64, r64","REX.W 0F 4B /= r","N.E.","V","","pseudo","rw,r","","" +"CMOVAE r64, r/m64","CMOVQCC r/m64, r64","cmovaeq r/m64, r64","REX.W 0F 43= /r","N.S.","V","","","rw,r","Y","64" +"CMOVB r64, r/m64","CMOVQCS r/m64, r64","cmovbq r/m64, r64","REX.W 0F 42 /= r","N.S.","V","","","rw,r","Y","64" +"CMOVE r64, r/m64","CMOVQEQ r/m64, r64","cmoveq r/m64, r64","REX.W 0F 44 /= r","N.S.","V","","","rw,r","Y","64" +"CMOVGE r64, r/m64","CMOVQGE r/m64, r64","cmovgeq r/m64, r64","REX.W 0F 4D= /r","N.S.","V","","","rw,r","Y","64" +"CMOVG r64, r/m64","CMOVQGT r/m64, r64","cmovgq r/m64, r64","REX.W 0F 4F /= r","N.S.","V","","","rw,r","Y","64" +"CMOVA r64, r/m64","CMOVQHI r/m64, r64","cmovaq r/m64, r64","REX.W 0F 47 /= r","N.S.","V","","","rw,r","Y","64" +"CMOVLE r64, r/m64","CMOVQLE r/m64, r64","cmovleq r/m64, r64","REX.W 0F 4E= /r","N.S.","V","","","rw,r","Y","64" +"CMOVBE r64, r/m64","CMOVQLS r/m64, r64","cmovbeq r/m64, r64","REX.W 0F 46= /r","N.S.","V","","","rw,r","Y","64" +"CMOVL r64, r/m64","CMOVQLT r/m64, r64","cmovlq r/m64, r64","REX.W 0F 4C /= r","N.S.","V","","","rw,r","Y","64" +"CMOVS r64, r/m64","CMOVQMI r/m64, r64","cmovsq r/m64, r64","REX.W 0F 48 /= r","N.S.","V","","","rw,r","Y","64" +"CMOVNE r64, r/m64","CMOVQNE r/m64, r64","cmovneq r/m64, r64","REX.W 0F 45= /r","N.S.","V","","","rw,r","Y","64" +"CMOVNO r64, r/m64","CMOVQOC r/m64, r64","cmovnoq r/m64, 
r64","REX.W 0F 41= /r","N.S.","V","","","rw,r","Y","64" +"CMOVO r64, r/m64","CMOVQOS r/m64, r64","cmovoq r/m64, r64","REX.W 0F 40 /= r","N.S.","V","","","rw,r","Y","64" +"CMOVNP r64, r/m64","CMOVQPC r/m64, r64","cmovnpq r/m64, r64","REX.W 0F 4B= /r","N.S.","V","","","rw,r","Y","64" +"CMOVNS r64, r/m64","CMOVQPL r/m64, r64","cmovnsq r/m64, r64","REX.W 0F 49= /r","N.S.","V","","","rw,r","Y","64" +"CMOVP r64, r/m64","CMOVQPS r/m64, r64","cmovpq r/m64, r64","REX.W 0F 4A /= r","N.S.","V","","","rw,r","Y","64" +"CMOVAE r16, r/m16","CMOVWCC r/m16, r16","cmovaew r/m16, r16","0F 43 /r","= V","V","","P6,operand16","rw,r","Y","16" +"CMOVB r16, r/m16","CMOVWCS r/m16, r16","cmovbw r/m16, r16","0F 42 /r","V"= ,"V","","P6,operand16","rw,r","Y","16" +"CMOVE r16, r/m16","CMOVWEQ r/m16, r16","cmovew r/m16, r16","0F 44 /r","V"= ,"V","","P6,operand16","rw,r","Y","16" +"CMOVGE r16, r/m16","CMOVWGE r/m16, r16","cmovgew r/m16, r16","0F 4D /r","= V","V","","P6,operand16","rw,r","Y","16" +"CMOVG r16, r/m16","CMOVWGT r/m16, r16","cmovgw r/m16, r16","0F 4F /r","V"= ,"V","","P6,operand16","rw,r","Y","16" +"CMOVA r16, r/m16","CMOVWHI r/m16, r16","cmovaw r/m16, r16","0F 47 /r","V"= ,"V","","P6,operand16","rw,r","Y","16" +"CMOVLE r16, r/m16","CMOVWLE r/m16, r16","cmovlew r/m16, r16","0F 4E /r","= V","V","","P6,operand16","rw,r","Y","16" +"CMOVBE r16, r/m16","CMOVWLS r/m16, r16","cmovbew r/m16, r16","0F 46 /r","= V","V","","P6,operand16","rw,r","Y","16" +"CMOVL r16, r/m16","CMOVWLT r/m16, r16","cmovlw r/m16, r16","0F 4C /r","V"= ,"V","","P6,operand16","rw,r","Y","16" +"CMOVS r16, r/m16","CMOVWMI r/m16, r16","cmovsw r/m16, r16","0F 48 /r","V"= ,"V","","P6,operand16","rw,r","Y","16" +"CMOVNE r16, r/m16","CMOVWNE r/m16, r16","cmovnew r/m16, r16","0F 45 /r","= V","V","","P6,operand16","rw,r","Y","16" +"CMOVNO r16, r/m16","CMOVWOC r/m16, r16","cmovnow r/m16, r16","0F 41 /r","= V","V","","P6,operand16","rw,r","Y","16" +"CMOVO r16, r/m16","CMOVWOS r/m16, r16","cmovow r/m16, r16","0F 40 /r","V"= ,"V","","P6,operand16","rw,r","Y","16" +"CMOVNP r16, r/m16","CMOVWPC r/m16, r16","cmovnpw r/m16, r16","0F 4B /r","= V","V","","P6,operand16","rw,r","Y","16" +"CMOVNS r16, r/m16","CMOVWPL r/m16, r16","cmovnsw r/m16, r16","0F 49 /r","= V","V","","P6,operand16","rw,r","Y","16" +"CMOVP r16, r/m16","CMOVWPS r/m16, r16","cmovpw r/m16, r16","0F 4A /r","V"= ,"V","","P6,operand16","rw,r","Y","16" +"CMOVZ r16, r/m16","CMOVZ r/m16, r16","cmovz r/m16, r16","0F 44 /r","V","V= ","","P6,operand16,pseudo","rw,r","","" +"CMOVZ r32, r/m32","CMOVZ r/m32, r32","cmovz r/m32, r32","0F 44 /r","V","V= ","","P6,operand32,pseudo","rw,r","","" +"CMOVZ r64, r/m64","CMOVZ r/m64, r64","cmovz r/m64, r64","REX.W 0F 44 /r",= "N.E.","V","","pseudo","rw,r","","" +"CMP AL, imm8","CMPB AL, imm8","cmpb imm8, AL","3C ib","V","V","","","r,r"= ,"Y","8" +"CMP r/m8, imm8","CMPB r/m8, imm8","cmpb imm8, r/m8","80 /7 ib","V","V",""= ,"","r,r","Y","8" +"CMP r/m8, imm8","CMPB r/m8, imm8","cmpb imm8, r/m8","82 /7 ib","V","N.S."= ,"","","r,r","Y","8" +"CMP r/m8, imm8","CMPB r/m8, imm8","cmpb imm8, r/m8","REX 80 /7 ib","N.E."= ,"V","","pseudo64","r,r","Y","8" +"CMP r/m8, r8","CMPB r/m8, r8","cmpb r8, r/m8","38 /r","V","V","","","r,r"= ,"Y","8" +"CMP r/m8, r8","CMPB r/m8, r8","cmpb r8, r/m8","REX 38 /r","N.E.","V","","= pseudo64","r,r","Y","8" +"CMP r8, r/m8","CMPB r8, r/m8","cmpb r/m8, r8","3A /r","V","V","","","r,r"= ,"Y","8" +"CMP r8, r/m8","CMPB r8, r/m8","cmpb r/m8, r8","REX 3A /r","N.E.","V","","= pseudo64","r,r","Y","8" +"CMP EAX, imm32","CMPL EAX, imm32","cmpl imm32, EAX","3D 
id","V","V","","o= perand32","r,r","Y","32" +"CMP r/m32, imm32","CMPL r/m32, imm32","cmpl imm32, r/m32","81 /7 id","V",= "V","","operand32","r,r","Y","32" +"CMP r/m32, imm8","CMPL r/m32, imm8","cmpl imm8, r/m32","83 /7 ib","V","V"= ,"","operand32","r,r","Y","32" +"CMP r/m32, r32","CMPL r/m32, r32","cmpl r32, r/m32","39 /r","V","V","","o= perand32","r,r","Y","32" +"CMP r32, r/m32","CMPL r32, r/m32","cmpl r/m32, r32","3B /r","V","V","","o= perand32","r,r","Y","32" +"CMPPD xmm1, xmm2/m128, imm8u","CMPPD imm8u, xmm1, xmm2/m128","cmppd imm8u= , xmm2/m128, xmm1","66 0F C2 /r ib","V","V","SSE2","","rw,r,r","","" +"CMPPS xmm1, xmm2/m128, imm8u","CMPPS imm8u, xmm1, xmm2/m128","cmpps imm8u= , xmm2/m128, xmm1","0F C2 /r ib","V","V","SSE","","rw,r,r","","" +"CMP RAX, imm32","CMPQ RAX, imm32","cmpq imm32, RAX","REX.W 3D id","N.S.",= "V","","","r,r","Y","64" +"CMP r/m64, imm32","CMPQ r/m64, imm32","cmpq imm32, r/m64","REX.W 81 /7 id= ","N.S.","V","","","r,r","Y","64" +"CMP r/m64, imm8","CMPQ r/m64, imm8","cmpq imm8, r/m64","REX.W 83 /7 ib","= N.S.","V","","","r,r","Y","64" +"CMP r/m64, r64","CMPQ r/m64, r64","cmpq r64, r/m64","REX.W 39 /r","N.S.",= "V","","","r,r","Y","64" +"CMP r64, r/m64","CMPQ r64, r/m64","cmpq r/m64, r64","REX.W 3B /r","N.S.",= "V","","","r,r","Y","64" +"CMPSB","CMPSB","cmpsb","A6","V","V","","","","","" +"CMPSD xmm1, xmm2/m64, imm8u","CMPSD imm8u, xmm1, xmm2/m64","cmpsd imm8u, = xmm2/m64, xmm1","F2 0F C2 /r ib","V","V","SSE2","","rw,r,r","","" +"CMPSD","CMPSL","cmpsl","A7","V","V","","operand32","","","" +"CMPSQ","CMPSQ","cmpsq","REX.W A7","N.S.","V","","","","","" +"CMPSS xmm1, xmm2/m32, imm8u","CMPSS imm8u, xmm1, xmm2/m32","cmpss imm8u, = xmm2/m32, xmm1","F3 0F C2 /r ib","V","V","SSE","","rw,r,r","","" +"CMPSW","CMPSW","cmpsw","A7","V","V","","operand16","","","" +"CMP AX, imm16","CMPW AX, imm16","cmpw imm16, AX","3D iw","V","V","","oper= and16","r,r","Y","16" +"CMP r/m16, imm16","CMPW r/m16, imm16","cmpw imm16, r/m16","81 /7 iw","V",= "V","","operand16","r,r","Y","16" +"CMP r/m16, imm8","CMPW r/m16, imm8","cmpw imm8, r/m16","83 /7 ib","V","V"= ,"","operand16","r,r","Y","16" +"CMP r/m16, r16","CMPW r/m16, r16","cmpw r16, r/m16","39 /r","V","V","","o= perand16","r,r","Y","16" +"CMP r16, r/m16","CMPW r16, r/m16","cmpw r/m16, r16","3B /r","V","V","","o= perand16","r,r","Y","16" +"CMPXCHG16B m128","CMPXCHG16B m128","cmpxchg16b m128","REX.W 0F C7 /1","N.= S.","V","","modrm_memonly","rw","","" +"CMPXCHG8B m64","CMPXCHG8B m64","cmpxchg8b m64","0F C7 /1","V","V","Pentiu= m","modrm_memonly,operand16,operand32","rw","","" +"CMPXCHG r/m8, r8","CMPXCHGB r8, r/m8","cmpxchgb r8, r/m8","0F B0 /r","V",= "V","486","","rw,r","Y","8" +"CMPXCHG r/m8, r8","CMPXCHGB r8, r/m8","cmpxchgb r8, r/m8","REX 0F B0 /r",= "N.E.","V","","pseudo64","rw,r","Y","8" +"CMPXCHG r/m32, r32","CMPXCHGL r32, r/m32","cmpxchgl r32, r/m32","0F B1 /r= ","V","V","486","operand32","rw,r","Y","32" +"CMPXCHG r/m64, r64","CMPXCHGQ r64, r/m64","cmpxchgq r64, r/m64","REX.W 0F= B1 /r","N.S.","V","486","","rw,r","Y","64" +"CMPXCHG r/m16, r16","CMPXCHGW r16, r/m16","cmpxchgw r16, r/m16","0F B1 /r= ","V","V","486","operand16","rw,r","Y","16" +"COMISD xmm1, xmm2/m64","COMISD xmm2/m64, xmm1","comisd xmm2/m64, xmm1","6= 6 0F 2F /r","V","V","SSE2","","r,r","","" +"COMISS xmm1, xmm2/m32","COMISS xmm2/m32, xmm1","comiss xmm2/m32, xmm1","0= F 2F /r","V","V","SSE","","r,r","","" +"CPUID","CPUID","cpuid","0F A2","V","V","486","","","","" +"CQO","CQO","cqto","REX.W 99","N.S.","V","","","","","" +"CRC32 r32, r/m8","CRC32B r/m8, r32","crc32b r/m8, 
r32","F2 0F 38 F0 /r","= V","V","SSE4_2","operand16,operand32","rw,r","Y","8" +"CRC32 r32, r/m8","CRC32B r/m8, r32","crc32b r/m8, r32","F2 REX 0F 38 F0 /= r","N.E.","V","","pseudo64","rw,r","Y","8" +"CRC32 r64, r/m8","CRC32B r/m8, r64","crc32b r/m8, r64","F2 REX.W 0F 38 F0= /r","N.S.","V","SSE4_2","","rw,r","Y","8" +"CRC32 r32, r/m32","CRC32L r/m32, r32","crc32l r/m32, r32","F2 0F 38 F1 /r= ","V","V","SSE4_2","operand32","rw,r","Y","32" +"CRC32 r64, r/m64","CRC32Q r/m64, r64","crc32q r/m64, r64","F2 REX.W 0F 38= F1 /r","N.S.","V","SSE4_2","","rw,r","Y","64" +"CRC32 r32, r/m16","CRC32W r/m16, r32","crc32w r/m16, r32","F2 0F 38 F1 /r= ","V","V","SSE4_2","operand16","rw,r","Y","16" +"CVTPD2PI mm1, xmm2/m128","CVTPD2PI xmm2/m128, mm1","cvtpd2pi xmm2/m128, m= m1","66 0F 2D /r","V","V","SSE2","","w,r","","" +"CVTPD2DQ xmm1, xmm2/m128","CVTPD2PL xmm2/m128, xmm1","cvtpd2dq xmm2/m128,= xmm1","F2 0F E6 /r","V","V","SSE2","","w,r","","" +"CVTPD2PS xmm1, xmm2/m128","CVTPD2PS xmm2/m128, xmm1","cvtpd2ps xmm2/m128,= xmm1","66 0F 5A /r","V","V","SSE2","","w,r","","" +"CVTPI2PD xmm1, mm2/m64","CVTPI2PD mm2/m64, xmm1","cvtpi2pd mm2/m64, xmm1"= ,"66 0F 2A /r","V","V","SSE2","","w,r","","" +"CVTPI2PS xmm1, mm2/m64","CVTPI2PS mm2/m64, xmm1","cvtpi2ps mm2/m64, xmm1"= ,"0F 2A /r","V","V","SSE","","w,r","","" +"CVTDQ2PD xmm1, xmm2/m64","CVTPL2PD xmm2/m64, xmm1","cvtdq2pd xmm2/m64, xm= m1","F3 0F E6 /r","V","V","SSE2","","w,r","","" +"CVTDQ2PS xmm1, xmm2/m128","CVTPL2PS xmm2/m128, xmm1","cvtdq2ps xmm2/m128,= xmm1","0F 5B /r","V","V","SSE2","","w,r","","" +"CVTPS2PD xmm1, xmm2/m64","CVTPS2PD xmm2/m64, xmm1","cvtps2pd xmm2/m64, xm= m1","0F 5A /r","V","V","SSE2","","w,r","","" +"CVTPS2PI mm1, xmm2/m64","CVTPS2PI xmm2/m64, mm1","cvtps2pi xmm2/m64, mm1"= ,"0F 2D /r","V","V","SSE","","w,r","","" +"CVTPS2DQ xmm1, xmm2/m128","CVTPS2PL xmm2/m128, xmm1","cvtps2dq xmm2/m128,= xmm1","66 0F 5B /r","V","V","SSE2","","w,r","","" +"CVTSD2SI r32, xmm2/m64","CVTSD2SL xmm2/m64, r32","cvtsd2si xmm2/m64, r32"= ,"F2 0F 2D /r","V","V","SSE2","operand16,operand32","w,r","Y","32" +"CVTSD2SI r64, xmm2/m64","CVTSD2SL xmm2/m64, r64","cvtsd2siq xmm2/m64, r64= ","F2 REX.W 0F 2D /r","N.S.","V","SSE2","","w,r","Y","64" +"CVTSD2SS xmm1, xmm2/m64","CVTSD2SS xmm2/m64, xmm1","cvtsd2ss xmm2/m64, xm= m1","F2 0F 5A /r","V","V","SSE2","","w,r","","" +"CVTSI2SD xmm1, r/m32","CVTSL2SD r/m32, xmm1","cvtsi2sdl r/m32, xmm1","F2 = 0F 2A /r","V","V","SSE2","operand16,operand32","w,r","Y","32" +"CVTSI2SS xmm1, r/m32","CVTSL2SS r/m32, xmm1","cvtsi2ssl r/m32, xmm1","F3 = 0F 2A /r","V","V","SSE","operand16,operand32","w,r","Y","32" +"CVTSI2SD xmm1, r/m64","CVTSQ2SD r/m64, xmm1","cvtsi2sdq r/m64, xmm1","F2 = REX.W 0F 2A /r","N.S.","V","SSE2","","w,r","Y","64" +"CVTSI2SS xmm1, r/m64","CVTSQ2SS r/m64, xmm1","cvtsi2ssq r/m64, xmm1","F3 = REX.W 0F 2A /r","N.S.","V","SSE","","w,r","Y","64" +"CVTSS2SD xmm1, xmm2/m32","CVTSS2SD xmm2/m32, xmm1","cvtss2sd xmm2/m32, xm= m1","F3 0F 5A /r","V","V","SSE2","","w,r","","" +"CVTSS2SI r32, xmm2/m32","CVTSS2SL xmm2/m32, r32","cvtss2si xmm2/m32, r32"= ,"F3 0F 2D /r","V","V","SSE","operand16,operand32","w,r","Y","32" +"CVTSS2SI r64, xmm2/m32","CVTSS2SL xmm2/m32, r64","cvtss2siq xmm2/m32, r64= ","F3 REX.W 0F 2D /r","N.S.","V","SSE","","w,r","Y","64" +"CVTTPD2PI mm1, xmm2/m128","CVTTPD2PI xmm2/m128, mm1","cvttpd2pi xmm2/m128= , mm1","66 0F 2C /r","V","V","SSE2","","w,r","","" +"CVTTPD2DQ xmm1, xmm2/m128","CVTTPD2PL xmm2/m128, xmm1","cvttpd2dq xmm2/m1= 28, xmm1","66 0F E6 /r","V","V","SSE2","","w,r","","" +"CVTTPS2PI mm1, 
xmm2/m64","CVTTPS2PI xmm2/m64, mm1","cvttps2pi xmm2/m64, m= m1","0F 2C /r","V","V","SSE","","w,r","","" +"CVTTPS2DQ xmm1, xmm2/m128","CVTTPS2PL xmm2/m128, xmm1","cvttps2dq xmm2/m1= 28, xmm1","F3 0F 5B /r","V","V","SSE2","","w,r","","" +"CVTTSD2SI r32, xmm2/m64","CVTTSD2SL xmm2/m64, r32","cvttsd2si xmm2/m64, r= 32","F2 0F 2C /r","V","V","SSE2","operand16,operand32","w,r","Y","32" +"CVTTSD2SI r64, xmm2/m64","CVTTSD2SL xmm2/m64, r64","cvttsd2siq xmm2/m64, = r64","F2 REX.W 0F 2C /r","N.S.","V","SSE2","","w,r","Y","64" +"CVTTSS2SI r32, xmm2/m32","CVTTSS2SL xmm2/m32, r32","cvttss2si xmm2/m32, r= 32","F3 0F 2C /r","V","V","SSE","operand16,operand32","w,r","Y","32" +"CVTTSS2SI r64, xmm2/m32","CVTTSS2SL xmm2/m32, r64","cvttss2siq xmm2/m32, = r64","F3 REX.W 0F 2C /r","N.S.","V","SSE","","w,r","Y","64" +"CWD","CWD","cwtd","99","V","V","","operand16","","","" +"CWDE","CWDE","cwtl","98","V","V","","operand32","","","" +"DAA","DAA","daa","27","V","N.S.","","","","","" +"DAS","DAS","das","2F","V","N.S.","","","","","" +"DEC r/m8","DECB r/m8","decb r/m8","FE /1","V","V","","","rw","Y","8" +"DEC r/m8","DECB r/m8","decb r/m8","REX FE /1","N.E.","V","","pseudo64","r= w","Y","8" +"DEC r/m32","DECL r/m32","decl r/m32","FF /1","V","V","","operand32","rw",= "Y","32" +"DEC r32op","DECL r32op","decl r32op","48+rd","V","N.S.","","operand32","r= w","Y","32" +"DEC r/m64","DECQ r/m64","decq r/m64","REX.W FF /1","N.S.","V","","","rw",= "Y","64" +"DEC r/m16","DECW r/m16","decw r/m16","FF /1","V","V","","operand16","rw",= "Y","16" +"DEC r16op","DECW r16op","decw r16op","48+rw","V","N.S.","","operand16","r= w","Y","16" +"DIV r/m8","DIVB r/m8","divb r/m8","F6 /6","V","V","","","r","Y","8" +"DIV r/m8","DIVB r/m8","divb r/m8","REX F6 /6","N.E.","V","","pseudo64","w= ","Y","8" +"DIV r/m32","DIVL r/m32","divl r/m32","F7 /6","V","V","","operand32","r","= Y","32" +"DIVPD xmm1, xmm2/m128","DIVPD xmm2/m128, xmm1","divpd xmm2/m128, xmm1","6= 6 0F 5E /r","V","V","SSE2","","rw,r","","" +"DIVPS xmm1, xmm2/m128","DIVPS xmm2/m128, xmm1","divps xmm2/m128, xmm1","0= F 5E /r","V","V","SSE","","rw,r","","" +"DIV r/m64","DIVQ r/m64","divq r/m64","REX.W F7 /6","N.S.","V","","","r","= Y","64" +"DIVSD xmm1, xmm2/m64","DIVSD xmm2/m64, xmm1","divsd xmm2/m64, xmm1","F2 0= F 5E /r","V","V","SSE2","","rw,r","","" +"DIVSS xmm1, xmm2/m32","DIVSS xmm2/m32, xmm1","divss xmm2/m32, xmm1","F3 0= F 5E /r","V","V","SSE","","rw,r","","" +"DIV r/m16","DIVW r/m16","divw r/m16","F7 /6","V","V","","operand16","r","= Y","16" +"DPPD xmm1, xmm2/m128, imm8u","DPPD imm8u, xmm2/m128, xmm1","dppd imm8u, x= mm2/m128, xmm1","66 0F 3A 41 /r ib","V","V","SSE4_1","","rw,r,r","","" +"DPPS xmm1, xmm2/m128, imm8u","DPPS imm8u, xmm2/m128, xmm1","dpps imm8u, x= mm2/m128, xmm1","66 0F 3A 40 /r ib","V","V","SSE4_1","","rw,r,r","","" +"EMMS","EMMS","emms","0F 77","V","V","MMX","","","","" +"ENCLS","ENCLS","encls","0F 01 CF","V","V","","","","","" +"ENCLU","ENCLU","enclu","0F 01 D7","V","V","","","","","" +"ENDBR32","ENDBR32","endbr32","F3 0F 1E FB","V","V","CET","","","","" +"ENDBR64","ENDBR64","endbr64","F3 0F 1E FA","V","V","CET","","","Y","" +"ENTER imm16, 0","ENTER 0, imm16","enter imm16, 0","C8 iw 00","V","V","","= pseudo","r,r","","" +"ENTER imm16, 1","ENTER 1, imm16","enter imm16, 1","C8 iw 01","V","V","","= pseudo","r,r","","" +"ENTER imm16, imm8b","ENTERW/ENTERL/ENTERQ imm8b, imm16","enterw/enterl/en= terq imm16, imm8b","C8 iw ib","V","V","","","r,r","","" +"EXTRACTPS r/m32, xmm1, imm8u:2","EXTRACTPS imm8u:2, xmm1, r/m32","extract= ps imm8u:2, xmm1, r/m32","66 0F 3A 17 /r 
ib","V","V","SSE4_1","","w,r,r",""= ,"" +"EXTRQ xmm1, imm8u, imm8u","EXTRQ imm8u, imm8u, xmm1","extrq imm8u, imm8u,= xmm1","66 0F 78 /0 ib ib","V","V","SSE4a","amd,modrm_regonly","w,r,r","","" +"EXTRQ xmm1, xmm2","EXTRQ xmm2, xmm1","extrq xmm2, xmm1","66 0F 79 /r","V"= ,"V","SSE4a","amd,modrm_regonly","w,r","","" +"F2XM1","F2XM1","f2xm1","D9 F0","V","V","","","","","" +"FABS","FABS","fabs","D9 E1","V","V","","","","","" +"FADD ST(i), ST(0)","FADDD ST(0), ST(i)","fadd ST(0), ST(i)","DC C0+i","V"= ,"V","","","rw,r","Y","" +"FADD ST(0), ST(i)","FADDD ST(i), ST(0)","fadd ST(i), ST(0)","D8 C0+i","V"= ,"V","","","rw,r","Y","" +"FADD ST(0), m32fp","FADDD m32fp, ST(0)","fadds m32fp, ST(0)","D8 /0","V",= "V","","","rw,r","Y","32" +"FADD ST(0), m64fp","FADDD m64fp, ST(0)","faddl m64fp, ST(0)","DC /0","V",= "V","","","rw,r","Y","64" +"FADDP","FADDDP","faddp","DE C1","V","V","","pseudo","","","" +"FADDP ST(i), ST(0)","FADDDP ST(0), ST(i)","faddp ST(0), ST(i)","DE C0+i",= "V","V","","","rw,r","","" +"FBLD ST(0), m80dec","FBLD m80dec, ST(0)","fbld m80dec, ST(0)","DF /4","V"= ,"V","","","w,r","","" +"FBSTP m80dec, ST(0)","FBSTP ST(0), m80dec","fbstp ST(0), m80dec","DF /6",= "V","V","","","w,r","","" +"FCHS","FCHS","fchs","D9 E0","V","V","","","","","" +"FCLEX","FCLEX","fclex","9B DB E2","V","V","","pseudo","","","" +"FCMOVB ST(0), ST(i)","FCMOVB ST(i), ST(0)","fcmovb ST(i), ST(0)","DA C0+i= ","V","V","","P6","rw,r","","" +"FCMOVBE ST(0), ST(i)","FCMOVBE ST(i), ST(0)","fcmovbe ST(i), ST(0)","DA D= 0+i","V","V","","P6","rw,r","","" +"FCMOVE ST(0), ST(i)","FCMOVE ST(i), ST(0)","fcmove ST(i), ST(0)","DA C8+i= ","V","V","","P6","rw,r","","" +"FCMOVNB ST(0), ST(i)","FCMOVNB ST(i), ST(0)","fcmovnb ST(i), ST(0)","DB C= 0+i","V","V","","P6","rw,r","","" +"FCMOVNBE ST(0), ST(i)","FCMOVNBE ST(i), ST(0)","fcmovnbe ST(i), ST(0)","D= B D0+i","V","V","","P6","rw,r","","" +"FCMOVNE ST(0), ST(i)","FCMOVNE ST(i), ST(0)","fcmovne ST(i), ST(0)","DB C= 8+i","V","V","","P6","rw,r","","" +"FCMOVNU ST(0), ST(i)","FCMOVNU ST(i), ST(0)","fcmovnu ST(i), ST(0)","DB D= 8+i","V","V","","P6","rw,r","","" +"FCMOVU ST(0), ST(i)","FCMOVU ST(i), ST(0)","fcmovu ST(i), ST(0)","DA D8+i= ","V","V","","P6","rw,r","","" +"FCOM","FCOMD","fcom","D8 D1","V","V","","pseudo","","Y","" +"FCOM ST(0), ST(i)","FCOMD ST(i), ST(0)","fcom ST(i), ST(0)","D8 D0+i","V"= ,"V","","","r,r","Y","" +"FCOM ST(0), ST(i)","FCOMD ST(i), ST(0)","fcom ST(i), ST(0)","DC D0+i","V"= ,"V","","","r,r","Y","" +"FCOM ST(0), m32fp","FCOMD m32fp, ST(0)","fcoms m32fp, ST(0)","D8 /2","V",= "V","","","r,r","Y","32" +"FCOM ST(0), m64fp","FCOMD m64fp, ST(0)","fcoml m64fp, ST(0)","DC /2","V",= "V","","","r,r","Y","64" +"FCOMP ST(0), m32fp","FCOMFP m32fp, ST(0)","fcomps m32fp, ST(0)","D8 /3","= V","V","","","r,r","Y","32" +"FCOMI ST(0), ST(i)","FCOMI ST(i), ST(0)","fcomi ST(i), ST(0)","DB F0+i","= V","V","PPRO","P6","r,r","","" +"FCOMIP ST(0), ST(i)","FCOMIP ST(i), ST(0)","fcomip ST(i), ST(0)","DF F0+i= ","V","V","PPRO","P6","r,r","","" +"FCOMP","FCOMP","fcomp","D8 D9","V","V","","pseudo","","Y","" +"FCOMP ST(0), ST(i)","FCOMP ST(i), ST(0)","fcomp ST(i), ST(0)","D8 D8+i","= V","V","","","r,r","Y","" +"FCOMP ST(0), ST(i)","FCOMP ST(i), ST(0)","fcomp ST(i), ST(0)","DC D8+i","= V","V","","","r,r","Y","" +"FCOMP ST(0), ST(i)","FCOMP ST(i), ST(0)","fcomp ST(i), ST(0)","DE D0+i","= V","V","","","r,r","Y","" +"FCOMP ST(0), m64fp","FCOMPL m64fp, ST(0)","fcompl m64fp, ST(0)","DC /3","= V","V","","","r,r","Y","64" +"FCOMPP","FCOMPP","fcompp","DE D9","V","V","","","","","" 
+"FCOS","FCOS","fcos","D9 FF","V","V","","","","","" +"FDECSTP","FDECSTP","fdecstp","D9 F6","V","V","","","","","" +"FDISI8087_NOP","FDISI8087_NOP","fdisi8087_nop","DB E1","V","V","","","","= ","" +"FDIVR ST(i), ST(0)","FDIVD ST(0), ST(i)","fdiv ST(0), ST(i)","DC F0+i","V= ","V","","","rw,r","Y","" +"FDIV ST(i), ST(0)","FDIVD ST(0), ST(i)","fdivr ST(0), ST(i)","DC F8+i","V= ","V","","","rw,r","Y","" +"FDIV ST(0), ST(i)","FDIVD ST(i), ST(0)","fdiv ST(i), ST(0)","D8 F0+i","V"= ,"V","","","rw,r","Y","" +"FDIV ST(0), m32fp","FDIVD m32fp, ST(0)","fdivs m32fp, ST(0)","D8 /6","V",= "V","","","rw,r","Y","32" +"FDIV ST(0), m64fp","FDIVD m64fp, ST(0)","fdivl m64fp, ST(0)","DC /6","V",= "V","","","rw,r","Y","64" +"FDIVR ST(0), m32fp","FDIVFR m32fp, ST(0)","fdivrs m32fp, ST(0)","D8 /7","= V","V","","","rw,r","Y","32" +"FDIVP","FDIVP","fdivp","DE F9","V","V","","pseudo","","","" +"FDIVRP ST(i), ST(0)","FDIVP ST(0), ST(i)","fdivp ST(0), ST(i)","DE F0+i",= "V","V","","","rw,r","","" +"FDIVR ST(0), ST(i)","FDIVR ST(i), ST(0)","fdivr ST(i), ST(0)","D8 F8+i","= V","V","","","rw,r","Y","" +"FDIVR ST(0), m64fp","FDIVRL m64fp, ST(0)","fdivrl m64fp, ST(0)","DC /7","= V","V","","","rw,r","Y","64" +"FDIVRP","FDIVRP","fdivrp","DE F1","V","V","","pseudo","","","" +"FDIVP ST(i), ST(0)","FDIVRP ST(0), ST(i)","fdivrp ST(0), ST(i)","DE F8+i"= ,"V","V","","","rw,r","","" +"FEMMS","FEMMS","femms","0F 0E","V","V","3DNOW","amd","","","" +"FENI8087_NOP","FENI8087_NOP","feni8087_nop","DB E0","V","V","","","","","" +"FFREE ST(i)","FFREE ST(i)","ffree ST(i)","DD C0+i","V","V","","","r","","" +"FFREEP ST(i)","FFREEP ST(i)","ffreep ST(i)","DF C0+i","V","V","","","r","= ","" +"FIADD ST(0), m16int","FIADD m16int, ST(0)","fiadd m16int, ST(0)","DE /0",= "V","V","","","rw,r","Y","" +"FIADD ST(0), m32int","FIADDL m32int, ST(0)","fiaddl m32int, ST(0)","DA /0= ","V","V","","","rw,r","Y","32" +"FICOM ST(0), m16int","FICOM m16int, ST(0)","ficom m16int, ST(0)","DE /2",= "V","V","","","r,r","Y","" +"FICOM ST(0), m32int","FICOML m32int, ST(0)","ficoml m32int, ST(0)","DA /2= ","V","V","","","r,r","Y","32" +"FICOMP ST(0), m16int","FICOMP m16int, ST(0)","ficomp m16int, ST(0)","DE /= 3","V","V","","","r,r","Y","" +"FICOMP ST(0), m32int","FICOMPL m32int, ST(0)","ficompl m32int, ST(0)","DA= /3","V","V","","","r,r","Y","32" +"FIDIV ST(0), m16int","FIDIV m16int, ST(0)","fidiv m16int, ST(0)","DE /6",= "V","V","","","rw,r","Y","" +"FIDIV ST(0), m32int","FIDIVL m32int, ST(0)","fidivl m32int, ST(0)","DA /6= ","V","V","","","rw,r","Y","32" +"FIDIVR ST(0), m16int","FIDIVR m16int, ST(0)","fidivr m16int, ST(0)","DE /= 7","V","V","","","rw,r","Y","" +"FIDIVR ST(0), m32int","FIDIVRL m32int, ST(0)","fidivrl m32int, ST(0)","DA= /7","V","V","","","rw,r","Y","32" +"FILD ST(0), m16int","FILD m16int, ST(0)","fild m16int, ST(0)","DF /0","V"= ,"V","","","w,r","Y","" +"FILD ST(0), m32int","FILDL m32int, ST(0)","fildl m32int, ST(0)","DB /0","= V","V","","","w,r","Y","32" +"FILD ST(0), m64int","FILDLL m64int, ST(0)","fildll m64int, ST(0)","DF /5"= ,"V","V","","","w,r","Y","64" +"FIMUL ST(0), m16int","FIMUL m16int, ST(0)","fimul m16int, ST(0)","DE /1",= "V","V","","","rw,r","Y","" +"FIMUL ST(0), m32int","FIMULL m32int, ST(0)","fimull m32int, ST(0)","DA /1= ","V","V","","","rw,r","Y","32" +"FINCSTP","FINCSTP","fincstp","D9 F7","V","V","","","","","" +"FINIT","FINIT","finit","9B DB E3","V","V","","pseudo","","","" +"FIST m16int, ST(0)","FIST ST(0), m16int","fist ST(0), m16int","DF /2","V"= ,"V","","","w,r","Y","" +"FIST m32int, ST(0)","FISTL ST(0), m32int","fistl ST(0), 
m32int","DB /2","= V","V","","","w,r","Y","32" +"FISTP m16int, ST(0)","FISTP ST(0), m16int","fistp ST(0), m16int","DF /3",= "V","V","","","w,r","Y","" +"FISTP m32int, ST(0)","FISTPL ST(0), m32int","fistpl ST(0), m32int","DB /3= ","V","V","","","w,r","Y","32" +"FISTP m64int, ST(0)","FISTPLL ST(0), m64int","fistpll ST(0), m64int","DF = /7","V","V","","","w,r","Y","64" +"FISTTP m16int, ST(0)","FISTTP ST(0), m16int","fisttp ST(0), m16int","DF /= 1","V","V","SSE3","modrm_memonly","w,r","Y","" +"FISTTP m32int, ST(0)","FISTTPL ST(0), m32int","fisttpl ST(0), m32int","DB= /1","V","V","SSE3","modrm_memonly","w,r","Y","32" +"FISTTP m64int, ST(0)","FISTTPLL ST(0), m64int","fisttpll ST(0), m64int","= DD /1","V","V","SSE3","modrm_memonly","w,r","Y","64" +"FISUB ST(0), m16int","FISUB m16int, ST(0)","fisub m16int, ST(0)","DE /4",= "V","V","","","rw,r","Y","" +"FISUB ST(0), m32int","FISUBL m32int, ST(0)","fisubl m32int, ST(0)","DA /4= ","V","V","","","rw,r","Y","32" +"FISUBR ST(0), m16int","FISUBR m16int, ST(0)","fisubr m16int, ST(0)","DE /= 5","V","V","","","rw,r","Y","" +"FISUBR ST(0), m32int","FISUBRL m32int, ST(0)","fisubrl m32int, ST(0)","DA= /5","V","V","","","rw,r","Y","32" +"FLD ST(0), ST(i)","FLD ST(i), ST(0)","fld ST(i), ST(0)","D9 C0+i","V","V"= ,"","","w,r","Y","" +"FLD1","FLD1","fld1","D9 E8","V","V","","","","","" +"FLDCW m2byte","FLDCW m2byte","fldcw m2byte","D9 /5","V","V","","","r","",= "" +"FLDENV m28byte","FLDENV m28byte","fldenv m28byte","D9 /4","V","V","","ope= rand32,operand64","r","","" +"FLDENV m14byte","FLDENVS m14byte","fldenv m14byte","D9 /4","V","V","","op= erand16","r","","" +"FLD ST(0), m64fp","FLDL m64fp, ST(0)","fldl m64fp, ST(0)","DD /0","V","V"= ,"","","w,r","Y","64" +"FLDL2E","FLDL2E","fldl2e","D9 EA","V","V","","","","","" +"FLDL2T","FLDL2T","fldl2t","D9 E9","V","V","","","","","" +"FLDLG2","FLDLG2","fldlg2","D9 EC","V","V","","","","","" +"FLDLN2","FLDLN2","fldln2","D9 ED","V","V","","","","","" +"FLDPI","FLDPI","fldpi","D9 EB","V","V","","","","","" +"FLD ST(0), m32fp","FLDS m32fp, ST(0)","flds m32fp, ST(0)","D9 /0","V","V"= ,"","","w,r","Y","32" +"FLD ST(0), m80fp","FLDT m80fp, ST(0)","fldt m80fp, ST(0)","DB /5","V","V"= ,"","","w,r","Y","80" +"FLDZ","FLDZ","fldz","D9 EE","V","V","","","","","" +"FMUL ST(i), ST(0)","FMUL ST(0), ST(i)","fmul ST(0), ST(i)","DC C8+i","V",= "V","","","rw,r","Y","" +"FMUL ST(0), ST(i)","FMUL ST(i), ST(0)","fmul ST(i), ST(0)","D8 C8+i","V",= "V","","","rw,r","Y","" +"FMUL ST(0), m64fp","FMULL m64fp, ST(0)","fmull m64fp, ST(0)","DC /1","V",= "V","","","rw,r","Y","64" +"FMULP","FMULP","fmulp","DE C9","V","V","","pseudo","","","" +"FMULP ST(i), ST(0)","FMULP ST(0), ST(i)","fmulp ST(0), ST(i)","DE C8+i","= V","V","","","rw,r","","" +"FMUL ST(0), m32fp","FMULS m32fp, ST(0)","fmuls m32fp, ST(0)","D8 /1","V",= "V","","","rw,r","Y","32" +"FNCLEX","FNCLEX","fnclex","DB E2","V","V","","","","","" +"FNINIT","FNINIT","fninit","DB E3","V","V","","","","","" +"FNOP","FNOP","fnop","D9 D0","V","V","","","","","" +"FNSAVE m108byte","FNSAVE m108byte","fnsave m108byte","DD /6","V","V","","= operand32,operand64","w","","" +"FNSAVE m94byte","FNSAVES m94byte","fnsave m94byte","DD /6","V","V","","op= erand16","w","","" +"FNSTCW m2byte","FNSTCW m2byte","fnstcw m2byte","D9 /7","V","V","","","w",= "","" +"FNSTENV m28byte","FNSTENV m28byte","fnstenv m28byte","D9 /6","V","V","","= operand32,operand64","w","","" +"FNSTENV m14byte","FNSTENVS m14byte","fnstenv m14byte","D9 /6","V","V","",= "operand16","w","","" +"FNSTSW AX","FNSTSW AX","fnstsw AX","DF 
E0","V","V","","","w","","" +"FNSTSW m2byte","FNSTSW m2byte","fnstsw m2byte","DD /7","V","V","","","w",= "","" +"FPATAN","FPATAN","fpatan","D9 F3","V","V","","","","","" +"FPREM","FPREM","fprem","D9 F8","V","V","","","","","" +"FPREM1","FPREM1","fprem1","D9 F5","V","V","","","","","" +"FPTAN","FPTAN","fptan","D9 F2","V","V","","","","","" +"FRNDINT","FRNDINT","frndint","D9 FC","V","V","","","","","" +"FRSTOR m108byte","FRSTOR m108byte","frstor m108byte","DD /4","V","V","","= operand32,operand64","r","","" +"FRSTOR m94byte","FRSTORS m94byte","frstor m94byte","DD /4","V","V","","op= erand16","r","","" +"FSAVE m94/108byte","FSAVE m94/108byte","fsave m94/108byte","9B DD /6","V"= ,"V","","pseudo","w","","" +"FSCALE","FSCALE","fscale","D9 FD","V","V","","","","","" +"FSETPM287_NOP","FSETPM287_NOP","fsetpm287_nop","DB E4","V","V","","","","= ","" +"FSIN","FSIN","fsin","D9 FE","V","V","","","","","" +"FSINCOS","FSINCOS","fsincos","D9 FB","V","V","","","","","" +"FSQRT","FSQRT","fsqrt","D9 FA","V","V","","","","","" +"FST ST(i), ST(0)","FST ST(0), ST(i)","fst ST(0), ST(i)","DD D0+i","V","V"= ,"","","w,r","Y","" +"FSTCW m2byte","FSTCW m2byte","fstcw m2byte","9B D9 /7","V","V","","pseudo= ","w","","" +"FSTENV m14/28byte","FSTENV m14/28byte","fstenv m14/28byte","9B D9 /6","V"= ,"V","","pseudo","w","","" +"FST m64fp, ST(0)","FSTL ST(0), m64fp","fstl ST(0), m64fp","DD /2","V","V"= ,"","","w,r","Y","64" +"FSTP ST(i), ST(0)","FSTP ST(0), ST(i)","fstp ST(0), ST(i)","DD D8+i","V",= "V","","","w,r","Y","" +"FSTP ST(i), ST(0)","FSTP ST(0), ST(i)","fstp ST(0), ST(i)","DF D0+i","V",= "V","","","w,r","Y","" +"FSTP ST(i), ST(0)","FSTP ST(0), ST(i)","fstp ST(0), ST(i)","DF D8+i","V",= "V","","","w,r","Y","" +"FSTP m64fp, ST(0)","FSTPL ST(0), m64fp","fstpl ST(0), m64fp","DD /3","V",= "V","","","w,r","Y","64" +"FSTPNCE ST(i), ST(0)","FSTPNCE ST(0), ST(i)","fstpnce ST(0), ST(i)","D9 D= 8+i","V","V","","","w,r","","" +"FSTP m32fp, ST(0)","FSTPS ST(0), m32fp","fstps ST(0), m32fp","D9 /3","V",= "V","","","w,r","Y","32" +"FSTP m80fp, ST(0)","FSTPT ST(0), m80fp","fstpt ST(0), m80fp","DB /7","V",= "V","","","w,r","Y","80" +"FST m32fp, ST(0)","FSTS ST(0), m32fp","fsts ST(0), m32fp","D9 /2","V","V"= ,"","","w,r","Y","32" +"FSTSW AX","FSTSW AX","fstsw AX","9B DF E0","V","V","","pseudo","w","","" +"FSTSW m2byte","FSTSW m2byte","fstsw m2byte","9B DD /7","V","V","","pseudo= ","w","","" +"FSUBR ST(i), ST(0)","FSUB ST(0), ST(i)","fsub ST(0), ST(i)","DC E0+i","V"= ,"V","","","rw,r","Y","" +"FSUB ST(0), ST(i)","FSUB ST(i), ST(0)","fsub ST(i), ST(0)","D8 E0+i","V",= "V","","","rw,r","Y","" +"FSUB ST(0), m64fp","FSUBL m64fp, ST(0)","fsubl m64fp, ST(0)","DC /4","V",= "V","","","rw,r","Y","64" +"FSUBP","FSUBP","fsubp","DE E9","V","V","","pseudo","","","" +"FSUBRP ST(i), ST(0)","FSUBP ST(0), ST(i)","fsubp ST(0), ST(i)","DE E0+i",= "V","V","","","rw,r","","" +"FSUB ST(i), ST(0)","FSUBR ST(0), ST(i)","fsubr ST(0), ST(i)","DC E8+i","V= ","V","","","rw,r","Y","" +"FSUBR ST(0), ST(i)","FSUBR ST(i), ST(0)","fsubr ST(i), ST(0)","D8 E8+i","= V","V","","","rw,r","Y","" +"FSUBR ST(0), m64fp","FSUBRL m64fp, ST(0)","fsubrl m64fp, ST(0)","DC /5","= V","V","","","rw,r","Y","64" +"FSUBRP","FSUBRP","fsubrp","DE E1","V","V","","pseudo","","","" +"FSUBP ST(i), ST(0)","FSUBRP ST(0), ST(i)","fsubrp ST(0), ST(i)","DE E8+i"= ,"V","V","","","rw,r","","" +"FSUBR ST(0), m32fp","FSUBRS m32fp, ST(0)","fsubrs m32fp, ST(0)","D8 /5","= V","V","","","rw,r","Y","32" +"FSUB ST(0), m32fp","FSUBS m32fp, ST(0)","fsubs m32fp, ST(0)","D8 /4","V",= "V","","","rw,r","Y","32" 
+"FTST","FTST","ftst","D9 E4","V","V","","","","","" +"FUCOM","FUCOM","fucom","DD E1","V","V","","pseudo","","","" +"FUCOM ST(0), ST(i)","FUCOM ST(i), ST(0)","fucom ST(i), ST(0)","DD E0+i","= V","V","","","r,r","","" +"FUCOMI ST(0), ST(i)","FUCOMI ST(i), ST(0)","fucomi ST(i), ST(0)","DB E8+i= ","V","V","PPRO","P6","r,r","","" +"FUCOMIP ST(0), ST(i)","FUCOMIP ST(i), ST(0)","fucomip ST(i), ST(0)","DF E= 8+i","V","V","PPRO","P6","r,r","","" +"FUCOMP","FUCOMP","fucomp","DD E9","V","V","","pseudo","","","" +"FUCOMP ST(0), ST(i)","FUCOMP ST(i), ST(0)","fucomp ST(i), ST(0)","DD E8+i= ","V","V","","","r,r","","" +"FUCOMPP","FUCOMPP","fucompp","DA E9","V","V","","","","","" +"FWAIT","FWAIT","fwait","9B","V","V","","","","","" +"FXAM","FXAM","fxam","D9 E5","V","V","","","","","" +"FXCH","FXCH","fxch","D9 C9","V","V","","pseudo","","","" +"FXCH ST(0), ST(i)","FXCH ST(i), ST(0)","fxch ST(i), ST(0)","D9 C8+i","V",= "V","","","rw,rw","","" +"FXCH_ALIAS1 ST(0), ST(i)","FXCH_ALIAS1 ST(i), ST(0)","fxch_alias1 ST(i), = ST(0)","DD C8+i","V","V","","","rw,rw","","" +"FXCH_ALIAS2 ST(0), ST(i)","FXCH_ALIAS2 ST(i), ST(0)","fxch_alias2 ST(i), = ST(0)","DF C8+i","V","V","","","rw,rw","","" +"FXRSTOR m512byte","FXRSTOR m512byte","fxrstor m512byte","0F AE /1","V","V= ","","modrm_memonly,operand16,operand32","r","","" +"FXRSTOR64 m512byte","FXRSTOR64 m512byte","fxrstor64 m512byte","REX.W 0F A= E /1","N.S.","V","","modrm_memonly","r","","" +"FXSAVE m512byte","FXSAVE m512byte","fxsave m512byte","0F AE /0","V","V","= ","modrm_memonly,operand16,operand32","w","","" +"FXSAVE64 m512byte","FXSAVE64 m512byte","fxsave64 m512byte","REX.W 0F AE /= 0","N.S.","V","","modrm_memonly","w","","" +"FXTRACT","FXTRACT","fxtract","D9 F4","V","V","","","","","" +"FYL2X","FYL2X","fyl2x","D9 F1","V","V","","","","","" +"FYL2XP1","FYL2XP1","fyl2xp1","D9 F9","V","V","","","","","" +"GETSEC","GETSEC","getsec","0F 37","V","V","SMX","","","","" +"GF2P8AFFINEINVQB xmm1, xmm2/m128, imm8u","GF2P8AFFINEINVQB imm8u, xmm2/m1= 28, xmm1","gf2p8affineinvqb imm8u, xmm2/m128, xmm1","66 0F 3A CF /r ib","V"= ,"V","GFNI","","rw,r,r","","" +"GF2P8AFFINEQB xmm1, xmm2/m128, imm8u","GF2P8AFFINEQB imm8u, xmm2/m128, xm= m1","gf2p8affineqb imm8u, xmm2/m128, xmm1","66 0F 3A CE /r ib","V","V","GFN= I","","rw,r,r","","" +"GF2P8MULB xmm1, xmm2/m128","GF2P8MULB xmm2/m128, xmm1","gf2p8mulb xmm2/m1= 28, xmm1","66 0F 38 CF /r","V","V","GFNI","","rw,r","","" +"HADDPD xmm1, xmm2/m128","HADDPD xmm2/m128, xmm1","haddpd xmm2/m128, xmm1"= ,"66 0F 7C /r","V","V","SSE3","","rw,r","","" +"HADDPS xmm1, xmm2/m128","HADDPS xmm2/m128, xmm1","haddps xmm2/m128, xmm1"= ,"F2 0F 7C /r","V","V","SSE3","","rw,r","","" +"HLT","HLT","hlt","F4","V","V","","","","","" +"HSUBPD xmm1, xmm2/m128","HSUBPD xmm2/m128, xmm1","hsubpd xmm2/m128, xmm1"= ,"66 0F 7D /r","V","V","SSE3","","rw,r","","" +"HSUBPS xmm1, xmm2/m128","HSUBPS xmm2/m128, xmm1","hsubps xmm2/m128, xmm1"= ,"F2 0F 7D /r","V","V","SSE3","","rw,r","","" +"ICEBP","ICEBP","icebp","F1","V","V","","","","","" +"IDIV r/m8","IDIVB r/m8","idivb r/m8","F6 /7","V","V","","","r","Y","8" +"IDIV r/m8","IDIVB r/m8","idivb r/m8","REX F6 /7","N.E.","V","","pseudo64"= ,"r","Y","8" +"IDIV r/m32","IDIVL r/m32","idivl r/m32","F7 /7","V","V","","operand32","r= ","Y","32" +"IDIV r/m64","IDIVQ r/m64","idivq r/m64","REX.W F7 /7","N.S.","V","","","r= ","Y","64" +"IDIV r/m16","IDIVW r/m16","idivw r/m16","F7 /7","V","V","","operand16","r= ","Y","16" +"IMUL r32, r/m32, imm32","IMUL3 imm32, r/m32, r32","imull imm32, r/m32, r3= 2","69 /r 
id","V","V","","operand32","w,r,r","Y","32" +"IMUL r64, r/m64, imm32","IMUL3 imm32, r/m64, r64","imulq imm32, r/m64, r6= 4","REX.W 69 /r id","N.S.","V","","","w,r,r","Y","64" +"IMUL r16, r/m16, imm8","IMUL3 imm8, r/m16, r16","imulw imm8, r/m16, r16",= "6B /r ib","V","V","","operand16","w,r,r","Y","16" +"IMUL r32, r/m32, imm8","IMUL3 imm8, r/m32, r32","imull imm8, r/m32, r32",= "6B /r ib","V","V","","operand32","w,r,r","Y","32" +"IMUL r64, r/m64, imm8","IMUL3 imm8, r/m64, r64","imulq imm8, r/m64, r64",= "REX.W 6B /r ib","N.S.","V","","","w,r,r","Y","64" +"IMUL r/m8","IMULB r/m8","imulb r/m8","F6 /5","V","V","","","r","Y","8" +"IMUL r/m32","IMULL r/m32","imull r/m32","F7 /5","V","V","","operand32","r= ","Y","32" +"IMUL r32, r/m32","IMULL r/m32, r32","imull r/m32, r32","0F AF /r","V","V"= ,"","operand32","rw,r","Y","32" +"IMUL r/m64","IMULQ r/m64","imulq r/m64","REX.W F7 /5","N.S.","V","","","r= ","Y","64" +"IMUL r64, r/m64","IMULQ r/m64, r64","imulq r/m64, r64","REX.W 0F AF /r","= N.S.","V","","","rw,r","Y","64" +"IMUL r16, r/m16, imm16","IMULW imm16, r/m16, r16","imulw imm16, r/m16, r1= 6","69 /r iw","V","V","","operand16","w,r,r","Y","16" +"IMUL r/m16","IMULW r/m16","imulw r/m16","F7 /5","V","V","","operand16","r= ","Y","16" +"IMUL r16, r/m16","IMULW r/m16, r16","imulw r/m16, r16","0F AF /r","V","V"= ,"","operand16","rw,r","Y","16" +"IN AL, DX","INB DX, AL","inb DX, AL","EC","V","V","","","w,r","Y","8" +"IN AL, imm8u","INB imm8u, AL","inb imm8u, AL","E4 ib","V","V","","","w,r"= ,"Y","8" +"INC r/m8","INCB r/m8","incb r/m8","FE /0","V","V","","","rw","Y","8" +"INC r/m8","INCB r/m8","incb r/m8","REX FE /0","N.E.","V","","pseudo64","r= w","Y","8" +"INC r/m32","INCL r/m32","incl r/m32","FF /0","V","V","","operand32","rw",= "Y","32" +"INC r32op","INCL r32op","incl r32op","40+rd","V","N.S.","","operand32","r= w","Y","32" +"INC r/m64","INCQ r/m64","incq r/m64","REX.W FF /0","N.S.","V","","","rw",= "Y","64" +"INCSSPD rmr32","INCSSPD rmr32","incsspd rmr32","F3 0F AE /5","V","V","CET= ","modrm_regonly,operand16,operand32","r","","" +"INCSSPQ rmr64","INCSSPQ rmr64","incsspq rmr64","F3 REX.W 0F AE /5","N.S."= ,"V","CET","modrm_regonly","r","","" +"INC r/m16","INCW r/m16","incw r/m16","FF /0","V","V","","operand16","rw",= "Y","16" +"INC r16op","INCW r16op","incw r16op","40+rw","V","N.S.","","operand16","r= w","Y","16" +"IN EAX, DX","INL DX, EAX","inl DX, EAX","ED","V","V","","operand32,operan= d64","w,r","Y","32" +"IN EAX, imm8u","INL imm8u, EAX","inl imm8u, EAX","E5 ib","V","V","","oper= and32,operand64","w,r","Y","32" +"INSB","INSB","insb","6C","V","V","","","","","" +"INSERTPS xmm1, xmm2/m32, imm8u","INSERTPS imm8u, xmm2/m32, xmm1","insertp= s imm8u, xmm2/m32, xmm1","66 0F 3A 21 /r ib","V","V","SSE4_1","","rw,r,r","= ","" +"INSERTQ xmm1, xmm2, imm8u, imm8u","INSERTQ imm8u, imm8u, xmm2, xmm1","ins= ertq imm8u, imm8u, xmm2, xmm1","F2 0F 78 /r ib ib","V","V","SSE4a","amd,mod= rm_regonly","w,r,r,r","","" +"INSERTQ xmm1, xmm2","INSERTQ xmm2, xmm1","insertq xmm2, xmm1","F2 0F 79 /= r","V","V","SSE4a","amd,modrm_regonly","w,r","","" +"INSD","INSL","insl","6D","V","V","","operand32,operand64","","","" +"INSW","INSW","insw","6D","V","V","","operand16","","","" +"INT 3","INT 3","int 3","CC","V","V","","","r","","" +"INT imm8u","INT imm8u","int imm8u","CD ib","V","V","","","r","","" +"INTO","INTO","into","CE","V","N.S.","","","","","" +"INVD","INVD","invd","0F 08","V","V","486","","","","" +"INVEPT r32, m128","INVEPT m128, r32","invept m128, r32","66 0F 38 80 /r",= "V","N.S.","VTX","modrm_memonly","r,r","","" 
+"INVEPT r64, m128","INVEPT m128, r64","invept m128, r64","66 0F 38 80 /r",= "N.S.","V","VTX","default64,modrm_memonly","r,r","","" +"INVLPG m","INVLPG m","invlpg m","0F 01 /7","V","V","486","modrm_memonly",= "r","","" +"INVLPGA EAX, ECX","INVLPGAL ECX, EAX","invlpgal ECX, EAX","0F 01 DF","V",= "V","SVM","amd,modrm_regonly,operand32","r,r","Y","32" +"INVLPGA RAX, ECX","INVLPGAQ ECX, RAX","invlpgaq ECX, RAX","REX.W 0F 01 DF= ","N.S.","V","SVM","amd,modrm_regonly","r,r","Y","64" +"INVLPGA AX, ECX","INVLPGAW ECX, AX","invlpgaw ECX, AX","0F 01 DF","V","V"= ,"SVM","amd,modrm_regonly,operand16","r,r","Y","16" +"INVPCID r32, m128","INVPCID m128, r32","invpcid m128, r32","66 0F 38 82 /= r","V","N.S.","INVPCID","modrm_memonly","r,r","","" +"INVPCID r64, m128","INVPCID m128, r64","invpcid m128, r64","66 0F 38 82 /= r","N.S.","V","INVPCID","default64,modrm_memonly","r,r","","" +"INVVPID r32, m128","INVVPID m128, r32","invvpid m128, r32","66 0F 38 81 /= r","V","N.S.","VTX","modrm_memonly","r,r","","" +"INVVPID r64, m128","INVVPID m128, r64","invvpid m128, r64","66 0F 38 81 /= r","N.S.","V","VTX","default64,modrm_memonly","r,r","","" +"IN AX, DX","INW DX, AX","inw DX, AX","ED","V","V","","operand16","w,r","Y= ","16" +"IN AX, imm8u","INW imm8u, AX","inw imm8u, AX","E5 ib","V","V","","operand= 16","w,r","Y","16" +"IRETD","IRETL","iretl","CF","V","V","","operand32","","","" +"IRETQ","IRETQ","iretq","REX.W CF","N.S.","V","","","","","" +"IRET","IRETW","iretw","CF","V","V","","operand16","","","" +"JA rel16","JA rel16","ja rel16","0F 87 cw","V","N.S.","","operand16","r",= "","" +"JA rel32","JA rel32","ja rel32","0F 87 cd","V","N.S.","","operand32","r",= "","" +"JA rel32","JA rel32","ja rel32","0F 87 cd","N.S.","V","","default64","r",= "","" +"JA rel8","JA rel8","ja rel8","77 cb","N.S.","V","","default64","r","","" +"JA rel8","JA rel8","ja rel8","77 cb","V","N.S.","","","r","","" +"JAE rel16","JAE rel16","jae rel16","0F 83 cw","V","N.S.","","operand16","= r","","" +"JAE rel32","JAE rel32","jae rel32","0F 83 cd","N.S.","V","","default64","= r","","" +"JAE rel32","JAE rel32","jae rel32","0F 83 cd","V","N.S.","","operand32","= r","","" +"JAE rel8","JAE rel8","jae rel8","73 cb","V","N.S.","","","r","","" +"JAE rel8","JAE rel8","jae rel8","73 cb","N.S.","V","","default64","r","",= "" +"JB rel16","JB rel16","jb rel16","0F 82 cw","V","N.S.","","operand16","r",= "","" +"JB rel32","JB rel32","jb rel32","0F 82 cd","V","N.S.","","operand32","r",= "","" +"JB rel32","JB rel32","jb rel32","0F 82 cd","N.S.","V","","default64","r",= "","" +"JB rel8","JB rel8","jb rel8","72 cb","N.S.","V","","default64","r","","" +"JB rel8","JB rel8","jb rel8","72 cb","V","N.S.","","","r","","" +"JBE rel16","JBE rel16","jbe rel16","0F 86 cw","V","N.S.","","operand16","= r","","" +"JBE rel32","JBE rel32","jbe rel32","0F 86 cd","V","N.S.","","operand32","= r","","" +"JBE rel32","JBE rel32","jbe rel32","0F 86 cd","N.S.","V","","default64","= r","","" +"JBE rel8","JBE rel8","jbe rel8","76 cb","V","N.S.","","","r","","" +"JBE rel8","JBE rel8","jbe rel8","76 cb","N.S.","V","","default64","r","",= "" +"JC rel16","JC rel16","jc rel16","0F 82 cw","V","N.S.","","pseudo","r","",= "" +"JC rel32","JC rel32","jc rel32","0F 82 cd","V","V","","pseudo","r","","" +"JC rel8","JC rel8","jc rel8","72 cb","V","V","","pseudo","r","","" +"JCXZ rel8","JCXZ rel8","jcxz rel8","E3 cb","V","N.S.","","address16","r",= "","" +"JE rel16","JE rel16","je rel16","0F 84 cw","V","N.S.","","operand16","r",= "","" +"JE rel32","JE rel32","je rel32","0F 84 
cd","V","N.S.","","operand32","r",= "","" +"JE rel32","JE rel32","je rel32","0F 84 cd","N.S.","V","","default64","r",= "","" +"JE rel8","JE rel8","je rel8","74 cb","N.S.","V","","default64","r","","" +"JE rel8","JE rel8","je rel8","74 cb","V","N.S.","","","r","","" +"JECXZ rel8","JECXZ rel8","jecxz rel8","E3 cb","V","V","","address32","r",= "","" +"JG rel16","JG rel16","jg rel16","0F 8F cw","V","N.S.","","operand16","r",= "","" +"JG rel32","JG rel32","jg rel32","0F 8F cd","N.S.","V","","default64","r",= "","" +"JG rel32","JG rel32","jg rel32","0F 8F cd","V","N.S.","","operand32","r",= "","" +"JG rel8","JG rel8","jg rel8","7F cb","V","N.S.","","","r","","" +"JG rel8","JG rel8","jg rel8","7F cb","N.S.","V","","default64","r","","" +"JGE rel16","JGE rel16","jge rel16","0F 8D cw","V","N.S.","","operand16","= r","","" +"JGE rel32","JGE rel32","jge rel32","0F 8D cd","V","N.S.","","operand32","= r","","" +"JGE rel32","JGE rel32","jge rel32","0F 8D cd","N.S.","V","","default64","= r","","" +"JGE rel8","JGE rel8","jge rel8","7D cb","N.S.","V","","default64","r","",= "" +"JGE rel8","JGE rel8","jge rel8","7D cb","V","N.S.","","","r","","" +"JL rel16","JL rel16","jl rel16","0F 8C cw","V","N.S.","","operand16","r",= "","" +"JL rel32","JL rel32","jl rel32","0F 8C cd","V","N.S.","","operand32","r",= "","" +"JL rel32","JL rel32","jl rel32","0F 8C cd","N.S.","V","","default64","r",= "","" +"JL rel8","JL rel8","jl rel8","7C cb","V","N.S.","","","r","","" +"JL rel8","JL rel8","jl rel8","7C cb","N.S.","V","","default64","r","","" +"JLE rel16","JLE rel16","jle rel16","0F 8E cw","V","N.S.","","operand16","= r","","" +"JLE rel32","JLE rel32","jle rel32","0F 8E cd","V","N.S.","","operand32","= r","","" +"JLE rel32","JLE rel32","jle rel32","0F 8E cd","N.S.","V","","default64","= r","","" +"JLE rel8","JLE rel8","jle rel8","7E cb","N.S.","V","","default64","r","",= "" +"JLE rel8","JLE rel8","jle rel8","7E cb","V","N.S.","","","r","","" +"JMP rel16","JMP rel16","jmp rel16","E9 cw","V","N.S.","","operand16","r",= "Y","" +"JMP rel32","JMP rel32","jmp rel32","E9 cd","N.S.","V","","default64","r",= "Y","" +"JMP rel32","JMP rel32","jmp rel32","E9 cd","V","N.S.","","operand32","r",= "Y","" +"JMP rel8","JMP rel8","jmp rel8","EB cb","N.S.","V","","default64","r","Y"= ,"" +"JMP rel8","JMP rel8","jmp rel8","EB cb","V","N.S.","","","r","Y","" +"JMP r/m32","JMPL* r/m32","jmpl* r/m32","FF /4","V","N.S.","","operand32",= "r","Y","32" +"JMP r/m64","JMPQ* r/m64","jmpq* r/m64","FF /4","N.S.","V","","","r","Y","= 64" +"JMP r/m16","JMPW* r/m16","jmpw* r/m16","FF /4","V","N.S.","","operand16",= "r","Y","16" +"JNA rel16","JNA rel16","jna rel16","0F 86 cw","V","N.S.","","pseudo","r",= "","" +"JNA rel32","JNA rel32","jna rel32","0F 86 cd","V","V","","pseudo","r","",= "" +"JNA rel8","JNA rel8","jna rel8","76 cb","V","V","","pseudo","r","","" +"JNAE rel16","JNAE rel16","jnae rel16","0F 82 cw","V","N.S.","","pseudo","= r","","" +"JNAE rel32","JNAE rel32","jnae rel32","0F 82 cd","V","V","","pseudo","r",= "","" +"JNAE rel8","JNAE rel8","jnae rel8","72 cb","V","V","","pseudo","r","","" +"JNB rel16","JNB rel16","jnb rel16","0F 83 cw","V","N.S.","","pseudo","r",= "","" +"JNB rel32","JNB rel32","jnb rel32","0F 83 cd","V","V","","pseudo","r","",= "" +"JNB rel8","JNB rel8","jnb rel8","73 cb","V","V","","pseudo","r","","" +"JNBE rel16","JNBE rel16","jnbe rel16","0F 87 cw","V","N.S.","","pseudo","= r","","" +"JNBE rel32","JNBE rel32","jnbe rel32","0F 87 cd","V","V","","pseudo","r",= "","" +"JNBE rel8","JNBE rel8","jnbe rel8","77 
cb","V","V","","pseudo","r","","" +"JNC rel16","JNC rel16","jnc rel16","0F 83 cw","V","N.S.","","pseudo","r",= "","" +"JNC rel32","JNC rel32","jnc rel32","0F 83 cd","V","V","","pseudo","r","",= "" +"JNC rel8","JNC rel8","jnc rel8","73 cb","V","V","","pseudo","r","","" +"JNE rel16","JNE rel16","jne rel16","0F 85 cw","V","N.S.","","operand16","= r","","" +"JNE rel32","JNE rel32","jne rel32","0F 85 cd","N.S.","V","","default64","= r","","" +"JNE rel32","JNE rel32","jne rel32","0F 85 cd","V","N.S.","","operand32","= r","","" +"JNE rel8","JNE rel8","jne rel8","75 cb","V","N.S.","","","r","","" +"JNE rel8","JNE rel8","jne rel8","75 cb","N.S.","V","","default64","r","",= "" +"JNG rel16","JNG rel16","jng rel16","0F 8E cw","V","N.S.","","pseudo","r",= "","" +"JNG rel32","JNG rel32","jng rel32","0F 8E cd","V","V","","pseudo","r","",= "" +"JNG rel8","JNG rel8","jng rel8","7E cb","V","V","","pseudo","r","","" +"JNGE rel16","JNGE rel16","jnge rel16","0F 8C cw","V","N.S.","","pseudo","= r","","" +"JNGE rel32","JNGE rel32","jnge rel32","0F 8C cd","V","V","","pseudo","r",= "","" +"JNGE rel8","JNGE rel8","jnge rel8","7C cb","V","V","","pseudo","r","","" +"JNL rel16","JNL rel16","jnl rel16","0F 8D cw","V","N.S.","","pseudo","r",= "","" +"JNL rel32","JNL rel32","jnl rel32","0F 8D cd","V","V","","pseudo","r","",= "" +"JNL rel8","JNL rel8","jnl rel8","7D cb","V","V","","pseudo","r","","" +"JNLE rel16","JNLE rel16","jnle rel16","0F 8F cw","V","N.S.","","pseudo","= r","","" +"JNLE rel32","JNLE rel32","jnle rel32","0F 8F cd","V","V","","pseudo","r",= "","" +"JNLE rel8","JNLE rel8","jnle rel8","7F cb","V","V","","pseudo","r","","" +"JNO rel16","JNO rel16","jno rel16","0F 81 cw","V","N.S.","","operand16","= r","","" +"JNO rel32","JNO rel32","jno rel32","0F 81 cd","V","N.S.","","operand32","= r","","" +"JNO rel32","JNO rel32","jno rel32","0F 81 cd","N.S.","V","","default64","= r","","" +"JNO rel8","JNO rel8","jno rel8","71 cb","V","N.S.","","","r","","" +"JNO rel8","JNO rel8","jno rel8","71 cb","N.S.","V","","default64","r","",= "" +"JNP rel16","JNP rel16","jnp rel16","0F 8B cw","V","N.S.","","operand16","= r","","" +"JNP rel32","JNP rel32","jnp rel32","0F 8B cd","V","N.S.","","operand32","= r","","" +"JNP rel32","JNP rel32","jnp rel32","0F 8B cd","N.S.","V","","default64","= r","","" +"JNP rel8","JNP rel8","jnp rel8","7B cb","N.S.","V","","default64","r","",= "" +"JNP rel8","JNP rel8","jnp rel8","7B cb","V","N.S.","","","r","","" +"JNS rel16","JNS rel16","jns rel16","0F 89 cw","V","N.S.","","operand16","= r","","" +"JNS rel32","JNS rel32","jns rel32","0F 89 cd","N.S.","V","","default64","= r","","" +"JNS rel32","JNS rel32","jns rel32","0F 89 cd","V","N.S.","","operand32","= r","","" +"JNS rel8","JNS rel8","jns rel8","79 cb","V","N.S.","","","r","","" +"JNS rel8","JNS rel8","jns rel8","79 cb","N.S.","V","","default64","r","",= "" +"JNZ rel16","JNZ rel16","jnz rel16","0F 85 cw","V","N.S.","","pseudo","r",= "","" +"JNZ rel32","JNZ rel32","jnz rel32","0F 85 cd","V","V","","pseudo","r","",= "" +"JNZ rel8","JNZ rel8","jnz rel8","75 cb","V","V","","pseudo","r","","" +"JO rel16","JO rel16","jo rel16","0F 80 cw","V","N.S.","","operand16","r",= "","" +"JO rel32","JO rel32","jo rel32","0F 80 cd","V","N.S.","","operand32","r",= "","" +"JO rel32","JO rel32","jo rel32","0F 80 cd","N.S.","V","","default64","r",= "","" +"JO rel8","JO rel8","jo rel8","70 cb","V","N.S.","","","r","","" +"JO rel8","JO rel8","jo rel8","70 cb","N.S.","V","","default64","r","","" +"JP rel16","JP rel16","jp rel16","0F 8A 
cw","V","N.S.","","operand16","r",= "","" +"JP rel32","JP rel32","jp rel32","0F 8A cd","N.S.","V","","default64","r",= "","" +"JP rel32","JP rel32","jp rel32","0F 8A cd","V","N.S.","","operand32","r",= "","" +"JP rel8","JP rel8","jp rel8","7A cb","N.S.","V","","default64","r","","" +"JP rel8","JP rel8","jp rel8","7A cb","V","N.S.","","","r","","" +"JPE rel16","JPE rel16","jpe rel16","0F 8A cw","V","N.S.","","pseudo","r",= "","" +"JPE rel32","JPE rel32","jpe rel32","0F 8A cd","V","V","","pseudo","r","",= "" +"JPE rel8","JPE rel8","jpe rel8","7A cb","V","V","","pseudo","r","","" +"JPO rel16","JPO rel16","jpo rel16","0F 8B cw","V","N.S.","","pseudo","r",= "","" +"JPO rel32","JPO rel32","jpo rel32","0F 8B cd","V","V","","pseudo","r","",= "" +"JPO rel8","JPO rel8","jpo rel8","7B cb","V","V","","pseudo","r","","" +"JRCXZ rel8","JRCXZ rel8","jrcxz rel8","E3 cb","N.S.","V","","address64","= r","","" +"JS rel16","JS rel16","js rel16","0F 88 cw","V","N.S.","","operand16","r",= "","" +"JS rel32","JS rel32","js rel32","0F 88 cd","V","N.S.","","operand32","r",= "","" +"JS rel32","JS rel32","js rel32","0F 88 cd","N.S.","V","","default64","r",= "","" +"JS rel8","JS rel8","js rel8","78 cb","V","N.S.","","","r","","" +"JS rel8","JS rel8","js rel8","78 cb","N.S.","V","","default64","r","","" +"JZ rel16","JZ rel16","jz rel16","0F 84 cw","V","N.S.","","operand16,pseud= o","r","","" +"JZ rel32","JZ rel32","jz rel32","0F 84 cd","V","V","","operand32,pseudo",= "r","","" +"JZ rel8","JZ rel8","jz rel8","74 cb","V","V","","pseudo","r","","" +"KADDB k1, kV, k2","KADDB k2, kV, k1","kaddb k2, kV, k1","VEX.NDS.256.66.0= F.W0 4A /r","V","V","AVX512DQ","modrm_regonly","w,r,r","","" +"KADDD k1, kV, k2","KADDD k2, kV, k1","kaddd k2, kV, k1","VEX.NDS.256.66.0= F.W1 4A /r","V","V","AVX512BW","modrm_regonly","w,r,r","","" +"KADDQ k1, kV, k2","KADDQ k2, kV, k1","kaddq k2, kV, k1","VEX.NDS.256.0F.W= 1 4A /r","V","V","AVX512BW","modrm_regonly","w,r,r","","" +"KADDW k1, kV, k2","KADDW k2, kV, k1","kaddw k2, kV, k1","VEX.NDS.256.0F.W= 0 4A /r","V","V","AVX512DQ","modrm_regonly","w,r,r","","" +"KANDB k1, kV, k2","KANDB k2, kV, k1","kandb k2, kV, k1","VEX.NDS.256.66.0= F.W0 41 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","","" +"KANDD k1, kV, k2","KANDD k2, kV, k1","kandd k2, kV, k1","VEX.NDS.256.66.0= F.W1 41 /r","V","V","AVX512BW","modrm_regonly","w,r,r","","" +"KANDNB k1, kV, k2","KANDNB k2, kV, k1","kandnb k2, kV, k1","VEX.NDS.256.6= 6.0F.W0 42 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","","" +"KANDND k1, kV, k2","KANDND k2, kV, k1","kandnd k2, kV, k1","VEX.NDS.256.6= 6.0F.W1 42 /r","V","V","AVX512BW","modrm_regonly","w,r,r","","" +"KANDNQ k1, kV, k2","KANDNQ k2, kV, k1","kandnq k2, kV, k1","VEX.NDS.256.0= F.W1 42 /r","V","V","AVX512BW","modrm_regonly","w,r,r","","" +"KANDNW k1, kV, k2","KANDNW k2, kV, k1","kandnw k2, kV, k1","VEX.NDS.256.0= F.W0 42 /r","V","V","AVX512F","modrm_regonly","w,r,r","","" +"KANDQ k1, kV, k2","KANDQ k2, kV, k1","kandq k2, kV, k1","VEX.NDS.256.0F.W= 1 41 /r","V","V","AVX512BW","modrm_regonly","w,r,r","","" +"KANDW k1, kV, k2","KANDW k2, kV, k1","kandw k2, kV, k1","VEX.NDS.256.0F.W= 0 41 /r","V","V","AVX512F","modrm_regonly","w,r,r","","" +"KMOVB m8, k1","KMOVB k1, m8","kmovb k1, m8","VEX.128.66.0F.W0 91 /r","V",= "V","AVX512DQ","modrm_memonly","w,r","","" +"KMOVB r32, k2","KMOVB k2, r32","kmovb k2, r32","VEX.128.66.0F.W0 93 /r","= V","V","AVX512DQ","modrm_regonly","w,r","","" +"KMOVB k1, k2/m8","KMOVB k2/m8, k1","kmovb k2/m8, k1","VEX.128.66.0F.W0 90= /r","V","V","AVX512DQ","","w,r","","" +"KMOVB 
k1, rmr32","KMOVB rmr32, k1","kmovb rmr32, k1","VEX.128.66.0F.W0 92= /r","V","V","AVX512DQ","modrm_regonly","w,r","","" +"KMOVD m32, k1","KMOVD k1, m32","kmovd k1, m32","VEX.128.66.0F.W1 91 /r","= V","V","AVX512BW","modrm_memonly","w,r","","" +"KMOVD r32, k2","KMOVD k2, r32","kmovd k2, r32","VEX.128.F2.0F.W0 93 /r","= V","V","AVX512BW","modrm_regonly","w,r","","" +"KMOVD k1, k2/m32","KMOVD k2/m32, k1","kmovd k2/m32, k1","VEX.128.66.0F.W1= 90 /r","V","V","AVX512BW","","w,r","","" +"KMOVD k1, rmr32","KMOVD rmr32, k1","kmovd rmr32, k1","VEX.128.F2.0F.W0 92= /r","V","V","AVX512BW","modrm_regonly","w,r","","" +"KMOVQ m64, k1","KMOVQ k1, m64","kmovq k1, m64","VEX.128.0F.W1 91 /r","V",= "V","AVX512BW","modrm_memonly","w,r","","" +"KMOVQ r64, k2","KMOVQ k2, r64","kmovq k2, r64","VEX.128.F2.0F.W1 93 /r","= N.S.","V","AVX512BW","modrm_regonly","w,r","","" +"KMOVQ k1, k2/m64","KMOVQ k2/m64, k1","kmovq k2/m64, k1","VEX.128.0F.W1 90= /r","V","V","AVX512BW","","w,r","","" +"KMOVQ k1, rmr64","KMOVQ rmr64, k1","kmovq rmr64, k1","VEX.128.F2.0F.W1 92= /r","N.S.","V","AVX512BW","modrm_regonly","w,r","","" +"KMOVW m16, k1","KMOVW k1, m16","kmovw k1, m16","VEX.128.0F.W0 91 /r","V",= "V","AVX512F","modrm_memonly","w,r","","" +"KMOVW r32, k2","KMOVW k2, r32","kmovw k2, r32","VEX.128.0F.W0 93 /r","V",= "V","AVX512F","modrm_regonly","w,r","","" +"KMOVW k1, k2/m16","KMOVW k2/m16, k1","kmovw k2/m16, k1","VEX.128.0F.W0 90= /r","V","V","AVX512F","","w,r","","" +"KMOVW k1, rmr32","KMOVW rmr32, k1","kmovw rmr32, k1","VEX.128.0F.W0 92 /r= ","V","V","AVX512F","modrm_regonly","w,r","","" +"KNOTB k1, k2","KNOTB k2, k1","knotb k2, k1","VEX.128.66.0F.W0 44 /r","V",= "V","AVX512DQ","modrm_regonly","w,r","","" +"KNOTD k1, k2","KNOTD k2, k1","knotd k2, k1","VEX.128.66.0F.W1 44 /r","V",= "V","AVX512BW","modrm_regonly","w,r","","" +"KNOTQ k1, k2","KNOTQ k2, k1","knotq k2, k1","VEX.128.0F.W1 44 /r","V","V"= ,"AVX512BW","modrm_regonly","w,r","","" +"KNOTW k1, k2","KNOTW k2, k1","knotw k2, k1","VEX.128.0F.W0 44 /r","V","V"= ,"AVX512F","modrm_regonly","w,r","","" +"KORB k1, kV, k2","KORB k2, kV, k1","korb k2, kV, k1","VEX.NDS.256.66.0F.W= 0 45 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","","" +"KORD k1, kV, k2","KORD k2, kV, k1","kord k2, kV, k1","VEX.NDS.256.66.0F.W= 1 45 /r","V","V","AVX512BW","modrm_regonly","w,r,r","","" +"KORQ k1, kV, k2","KORQ k2, kV, k1","korq k2, kV, k1","VEX.NDS.256.0F.W1 4= 5 /r","V","V","AVX512BW","modrm_regonly","w,r,r","","" +"KORTESTB k1, k2","KORTESTB k2, k1","kortestb k2, k1","VEX.128.66.0F.W0 98= /r","V","V","AVX512DQ","modrm_regonly","r,r","","" +"KORTESTD k1, k2","KORTESTD k2, k1","kortestd k2, k1","VEX.128.66.0F.W1 98= /r","V","V","AVX512BW","modrm_regonly","r,r","","" +"KORTESTQ k1, k2","KORTESTQ k2, k1","kortestq k2, k1","VEX.128.0F.W1 98 /r= ","V","V","AVX512BW","modrm_regonly","r,r","","" +"KORTESTW k1, k2","KORTESTW k2, k1","kortestw k2, k1","VEX.128.0F.W0 98 /r= ","V","V","AVX512F","modrm_regonly","r,r","","" +"KORW k1, kV, k2","KORW k2, kV, k1","korw k2, kV, k1","VEX.NDS.256.0F.W0 4= 5 /r","V","V","AVX512F","modrm_regonly","w,r,r","","" +"KSHIFTLB k1, k2, imm8u","KSHIFTLB imm8u, k2, k1","kshiftlb imm8u, k2, k1"= ,"VEX.128.66.0F3A.W0 32 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r","= ","" +"KSHIFTLD k1, k2, imm8u","KSHIFTLD imm8u, k2, k1","kshiftld imm8u, k2, k1"= ,"VEX.128.66.0F3A.W0 33 /r ib","V","V","AVX512BW","modrm_regonly","w,r,r","= ","" +"KSHIFTLQ k1, k2, imm8u","KSHIFTLQ imm8u, k2, k1","kshiftlq imm8u, k2, k1"= ,"VEX.128.66.0F3A.W1 33 /r 
ib","V","V","AVX512BW","modrm_regonly","w,r,r","= ","" +"KSHIFTLW k1, k2, imm8u","KSHIFTLW imm8u, k2, k1","kshiftlw imm8u, k2, k1"= ,"VEX.128.66.0F3A.W1 32 /r ib","V","V","AVX512F","modrm_regonly","w,r,r",""= ,"" +"KSHIFTRB k1, k2, imm8u","KSHIFTRB imm8u, k2, k1","kshiftrb imm8u, k2, k1"= ,"VEX.128.66.0F3A.W0 30 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r","= ","" +"KSHIFTRD k1, k2, imm8u","KSHIFTRD imm8u, k2, k1","kshiftrd imm8u, k2, k1"= ,"VEX.128.66.0F3A.W0 31 /r ib","V","V","AVX512BW","modrm_regonly","w,r,r","= ","" +"KSHIFTRQ k1, k2, imm8u","KSHIFTRQ imm8u, k2, k1","kshiftrq imm8u, k2, k1"= ,"VEX.128.66.0F3A.W1 31 /r ib","V","V","AVX512BW","modrm_regonly","w,r,r","= ","" +"KSHIFTRW k1, k2, imm8u","KSHIFTRW imm8u, k2, k1","kshiftrw imm8u, k2, k1"= ,"VEX.128.66.0F3A.W1 30 /r ib","V","V","AVX512F","modrm_regonly","w,r,r",""= ,"" +"KTESTB k1, k2","KTESTB k2, k1","ktestb k2, k1","VEX.128.66.0F.W0 99 /r","= V","V","AVX512DQ","modrm_regonly","r,r","","" +"KTESTD k1, k2","KTESTD k2, k1","ktestd k2, k1","VEX.128.66.0F.W1 99 /r","= V","V","AVX512BW","modrm_regonly","r,r","","" +"KTESTQ k1, k2","KTESTQ k2, k1","ktestq k2, k1","VEX.128.0F.W1 99 /r","V",= "V","AVX512BW","modrm_regonly","r,r","","" +"KTESTW k1, k2","KTESTW k2, k1","ktestw k2, k1","VEX.128.0F.W0 99 /r","V",= "V","AVX512DQ","modrm_regonly","r,r","","" +"KUNPCKBW k1, kV, k2","KUNPCKBW k2, kV, k1","kunpckbw k2, kV, k1","VEX.NDS= .256.66.0F.W0 4B /r","V","V","AVX512F","modrm_regonly","w,r,r","","" +"KUNPCKDQ k1, kV, k2","KUNPCKDQ k2, kV, k1","kunpckdq k2, kV, k1","VEX.NDS= .256.0F.W1 4B /r","V","V","AVX512BW","modrm_regonly","w,r,r","","" +"KUNPCKWD k1, kV, k2","KUNPCKWD k2, kV, k1","kunpckwd k2, kV, k1","VEX.NDS= .256.0F.W0 4B /r","V","V","AVX512BW","modrm_regonly","w,r,r","","" +"KXNORB k1, kV, k2","KXNORB k2, kV, k1","kxnorb k2, kV, k1","VEX.NDS.256.6= 6.0F.W0 46 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","","" +"KXNORD k1, kV, k2","KXNORD k2, kV, k1","kxnord k2, kV, k1","VEX.NDS.256.6= 6.0F.W1 46 /r","V","V","AVX512BW","modrm_regonly","w,r,r","","" +"KXNORQ k1, kV, k2","KXNORQ k2, kV, k1","kxnorq k2, kV, k1","VEX.NDS.256.0= F.W1 46 /r","V","V","AVX512BW","modrm_regonly","w,r,r","","" +"KXNORW k1, kV, k2","KXNORW k2, kV, k1","kxnorw k2, kV, k1","VEX.NDS.256.0= F.W0 46 /r","V","V","AVX512F","modrm_regonly","w,r,r","","" +"KXORB k1, kV, k2","KXORB k2, kV, k1","kxorb k2, kV, k1","VEX.NDS.256.66.0= F.W0 47 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","","" +"KXORD k1, kV, k2","KXORD k2, kV, k1","kxord k2, kV, k1","VEX.NDS.256.66.0= F.W1 47 /r","V","V","AVX512BW","modrm_regonly","w,r,r","","" +"KXORQ k1, kV, k2","KXORQ k2, kV, k1","kxorq k2, kV, k1","VEX.NDS.256.0F.W= 1 47 /r","V","V","AVX512BW","modrm_regonly","w,r,r","","" +"KXORW k1, kV, k2","KXORW k2, kV, k1","kxorw k2, kV, k1","VEX.NDS.256.0F.W= 0 47 /r","V","V","AVX512F","modrm_regonly","w,r,r","","" +"LAHF","LAHF","lahf","9F","V","V","LAHFSAHF","","","","" +"LAR r32, r32/m16","LARL r32/m16, r32","larl r32/m16, r32","0F 02 /r","V",= "V","","operand32","rw,r","Y","32" +"LAR r64, r64/m16","LARQ r64/m16, r64","larq r64/m16, r64","REX.W 0F 02 /r= ","N.S.","V","","","rw,r","Y","64" +"LAR r16, r/m16","LARW r/m16, r16","larw r/m16, r16","0F 02 /r","V","V",""= ,"operand16","rw,r","Y","16" +"CALL_FAR ptr16:32","LCALLL ptr16:32","lcalll ptr16:32","9A cd iw","V","N.= S.","","operand32","r","Y","" +"CALL_FAR m16:32","LCALLL* m16:32","lcalll* m16:32","FF /3","V","V","","mo= drm_memonly,operand32","r","Y","" +"CALL_FAR m16:64","LCALLQ* m16:64","lcallq* m16:64","REX.W FF /3","N.S.","= 
V","","modrm_memonly","r","Y","" +"CALL_FAR ptr16:16","LCALLW ptr16:16","lcallw ptr16:16","9A cw iw","V","N.= S.","","operand16","r","Y","" +"CALL_FAR m16:16","LCALLW* m16:16","lcallw* m16:16","FF /3","V","V","","mo= drm_memonly,operand16","r","Y","" +"LDDQU xmm1, m128","LDDQU m128, xmm1","lddqu m128, xmm1","F2 0F F0 /r","V"= ,"V","SSE3","modrm_memonly","w,r","","" +"LDMXCSR m32","LDMXCSR m32","ldmxcsr m32","0F AE /2","V","V","SSE","modrm_= memonly","r","","" +"LDS r32, m16:32","LDSL m16:32, r32","ldsl m16:32, r32","C5 /r","V","N.S."= ,"","modrm_memonly,operand32","w,r","Y","32" +"LDS r16, m16:16","LDSW m16:16, r16","ldsw m16:16, r16","C5 /r","V","N.S."= ,"","modrm_memonly,operand16","w,r","Y","16" +"LEA r32, m","LEAL m, r32","leal m, r32","8D /r","V","V","","modrm_memonly= ,operand32","w,r","Y","32" +"LEA r64, m","LEAQ m, r64","leaq m, r64","REX.W 8D /r","N.S.","V","","modr= m_memonly","w,r","Y","64" +"LEAVE","LEAVEW/LEAVEL/LEAVEQ","leavew/leavel/leaveq","C9","N.S.","V","","= default64","","Y","" +"LEAVE","LEAVEW/LEAVEL/LEAVEQ","leavew/leavel/leaveq","C9","V","N.S.","","= operand32","","Y","" +"LEAVE","LEAVEW/LEAVEL/LEAVEQ","leavew/leavel/leaveq","C9","V","V","","ope= rand16","","Y","" +"LEA r16, m","LEAW m, r16","leaw m, r16","8D /r","V","V","","modrm_memonly= ,operand16","w,r","Y","16" +"LES r32, m16:32","LESL m16:32, r32","lesl m16:32, r32","C4 /r","V","N.S."= ,"","modrm_memonly,operand32","w,r","Y","32" +"LES r16, m16:16","LESW m16:16, r16","lesw m16:16, r16","C4 /r","V","N.S."= ,"","modrm_memonly,operand16","w,r","Y","16" +"LFENCE","LFENCE","lfence","0F AE /5","V","V","SSE2","","","","" +"LFS r32, m16:32","LFSL m16:32, r32","lfsl m16:32, r32","0F B4 /r","V","V"= ,"","modrm_memonly,operand32","w,r","Y","32" +"LFS r64, m16:64","LFSQ m16:64, r64","lfsq m16:64, r64","REX.W 0F B4 /r","= N.S.","V","","modrm_memonly","w,r","Y","64" +"LFS r16, m16:16","LFSW m16:16, r16","lfsw m16:16, r16","0F B4 /r","V","V"= ,"","modrm_memonly,operand16","w,r","Y","16" +"LGDT m16&64","LGDT m16&64","lgdt m16&64","0F 01 /2","N.S.","V","","defaul= t64,modrm_memonly","r","","" +"LGDT m16&32","LGDTW/LGDTL m16&32","lgdtw/lgdtl m16&32","0F 01 /2","V","N.= S.","","modrm_memonly","r","","" +"LGS r32, m16:32","LGSL m16:32, r32","lgsl m16:32, r32","0F B5 /r","V","V"= ,"","modrm_memonly,operand32","w,r","Y","32" +"LGS r64, m16:64","LGSQ m16:64, r64","lgsq m16:64, r64","REX.W 0F B5 /r","= N.S.","V","","modrm_memonly","w,r","Y","64" +"LGS r16, m16:16","LGSW m16:16, r16","lgsw m16:16, r16","0F B5 /r","V","V"= ,"","modrm_memonly,operand16","w,r","Y","16" +"LIDT m16&64","LIDT m16&64","lidt m16&64","0F 01 /3","N.S.","V","","defaul= t64,modrm_memonly","r","","" +"LIDT m16&32","LIDTW/LIDTL m16&32","lidtw/lidtl m16&32","0F 01 /3","V","N.= S.","","modrm_memonly","r","","" +"JMP_FAR ptr16:32","LJMPL ptr16:32","ljmpl ptr16:32","EA cd iw","V","N.S."= ,"","operand32","r","Y","" +"JMP_FAR m16:32","LJMPL* m16:32","ljmpl* m16:32","FF /5","V","V","","modrm= _memonly,operand32","r","Y","" +"JMP_FAR m16:64","LJMPQ* m16:64","ljmpq* m16:64","REX.W FF /5","N.S.","V",= "","modrm_memonly","r","Y","" +"JMP_FAR ptr16:16","LJMPW ptr16:16","ljmpw ptr16:16","EA cw iw","V","N.S."= ,"","operand16","r","Y","" +"JMP_FAR m16:16","LJMPW* m16:16","ljmpw* m16:16","FF /5","V","V","","modrm= _memonly,operand16","r","Y","" +"LLDT r/m16","LLDT r/m16","lldt r/m16","0F 00 /2","V","V","","","r","","" +"LLWPCB rmr32","LLWPCBL rmr32","llwpcbl rmr32","XOP.128.09.W0 12 /0","V","= V","XOP","amd,modrm_regonly,operand16,operand32","w","Y","32" +"LLWPCB rmr64","LLWPCBQ 
rmr64","llwpcbq rmr64","XOP.128.09.W0 12 /0","N.S.= ","V","XOP","amd,modrm_regonly,operand64","w","Y","64" +"LMSW r/m16","LMSW r/m16","lmsw r/m16","0F 01 /6","V","V","","","r","","" +"LOCK","LOCK","lock","F0","V","V","","pseudo","","","" +"LODSB","LODSB","lodsb","AC","V","V","","","","","" +"LODSD","LODSL","lodsl","AD","V","V","","operand32","","","" +"LODSQ","LODSQ","lodsq","REX.W AD","N.S.","V","","","","","" +"LODSW","LODSW","lodsw","AD","V","V","","operand16","","","" +"LOOP rel8","LOOP rel8","loop rel8","E2 cb","V","V","","","r","","" +"LOOPE rel8","LOOPEQ rel8","loope rel8","E1 cb","V","V","","","r","","" +"LOOPNE rel8","LOOPNE rel8","loopne rel8","E0 cb","V","V","","","r","","" +"LSL r32, r32/m16","LSLL r32/m16, r32","lsll r32/m16, r32","0F 03 /r","V",= "V","","operand32","rw,r","Y","32" +"LSL r64, r32/m16","LSLQ r32/m16, r64","lslq r32/m16, r64","REX.W 0F 03 /r= ","N.S.","V","","","rw,r","Y","64" +"LSL r16, r/m16","LSLW r/m16, r16","lslw r/m16, r16","0F 03 /r","V","V",""= ,"operand16","rw,r","Y","16" +"LSS r32, m16:32","LSSL m16:32, r32","lssl m16:32, r32","0F B2 /r","V","V"= ,"","modrm_memonly,operand32","w,r","Y","32" +"LSS r64, m16:64","LSSQ m16:64, r64","lssq m16:64, r64","REX.W 0F B2 /r","= N.S.","V","","modrm_memonly","w,r","Y","64" +"LSS r16, m16:16","LSSW m16:16, r16","lssw m16:16, r16","0F B2 /r","V","V"= ,"","modrm_memonly,operand16","w,r","Y","16" +"LTR r/m16","LTR r/m16","ltr r/m16","0F 00 /3","V","V","","","r","","" +"LWPINS r32V, r/m32, imm32u","LWPINS imm32u, r/m32, r32V","lwpins imm32u, = r/m32, r32V","XOP.NDD.128.0A.W0 12 /0","V","V","XOP","amd,operand16,operand= 32","w,r,r","","" +"LWPINS r64V, r64/m32, imm32u","LWPINS imm32u, r64/m32, r64V","lwpins imm3= 2u, r64/m32, r64V","XOP.NDD.128.0A.W0 12 /0","N.S.","V","XOP","amd,operand6= 4","w,r,r","","" +"LWPVAL r32V, r/m32, imm32u","LWPVAL imm32u, r/m32, r32V","lwpval imm32u, = r/m32, r32V","XOP.NDD.128.0A.W0 12 /1","V","V","XOP","amd,operand16,operand= 32","w,r,r","","" +"LWPVAL r64V, r64/m32, imm32u","LWPVAL imm32u, r64/m32, r64V","lwpval imm3= 2u, r64/m32, r64V","XOP.NDD.128.0A.W0 12 /1","N.S.","V","XOP","amd,operand6= 4","w,r,r","","" +"LZCNT r32, r/m32","LZCNTL r/m32, r32","lzcntl r/m32, r32","F3 0F BD /r","= V","V","LZCNT","operand32","w,r","Y","32" +"LZCNT r32, r/m32","LZCNTL r/m32, r32","lzcntl r/m32, r32","F3 0F BD /r","= V","V","AMD","amd,operand32","w,r","Y","32" +"LZCNT r64, r/m64","LZCNTQ r/m64, r64","lzcntq r/m64, r64","F3 REX.W 0F BD= /r","N.S.","V","AMD","amd","w,r","Y","64" +"LZCNT r64, r/m64","LZCNTQ r/m64, r64","lzcntq r/m64, r64","F3 REX.W 0F BD= /r","N.S.","V","LZCNT","","w,r","Y","64" +"LZCNT r16, r/m16","LZCNTW r/m16, r16","lzcntw r/m16, r16","F3 0F BD /r","= V","V","AMD","amd,operand16","w,r","Y","16" +"LZCNT r16, r/m16","LZCNTW r/m16, r16","lzcntw r/m16, r16","F3 0F BD /r","= V","V","LZCNT","operand16","w,r","Y","16" +"MASKMOVDQU xmm1, xmm2","MASKMOVOU xmm2, xmm1","maskmovdqu xmm2, xmm1","66= 0F F7 /r","V","V","SSE2","modrm_regonly","r,r","","" +"MASKMOVQ mm1, mm2","MASKMOVQ mm2, mm1","maskmovq mm2, mm1","0F F7 /r","V"= ,"V","MMX","modrm_regonly","r,r","","" +"MAXPD xmm1, xmm2/m128","MAXPD xmm2/m128, xmm1","maxpd xmm2/m128, xmm1","6= 6 0F 5F /r","V","V","SSE2","","rw,r","","" +"MAXPS xmm1, xmm2/m128","MAXPS xmm2/m128, xmm1","maxps xmm2/m128, xmm1","0= F 5F /r","V","V","SSE","","rw,r","","" +"MAXSD xmm1, xmm2/m64","MAXSD xmm2/m64, xmm1","maxsd xmm2/m64, xmm1","F2 0= F 5F /r","V","V","SSE2","","rw,r","","" +"MAXSS xmm1, xmm2/m32","MAXSS xmm2/m32, xmm1","maxss xmm2/m32, xmm1","F3 0= F 5F 
/r","V","V","SSE","","rw,r","","" +"MFENCE","MFENCE","mfence","0F AE /6","V","V","SSE2","","","","" +"MINPD xmm1, xmm2/m128","MINPD xmm2/m128, xmm1","minpd xmm2/m128, xmm1","6= 6 0F 5D /r","V","V","SSE2","","rw,r","","" +"MINPS xmm1, xmm2/m128","MINPS xmm2/m128, xmm1","minps xmm2/m128, xmm1","0= F 5D /r","V","V","SSE","","rw,r","","" +"MINSD xmm1, xmm2/m64","MINSD xmm2/m64, xmm1","minsd xmm2/m64, xmm1","F2 0= F 5D /r","V","V","SSE2","","rw,r","","" +"MINSS xmm1, xmm2/m32","MINSS xmm2/m32, xmm1","minss xmm2/m32, xmm1","F3 0= F 5D /r","V","V","SSE","","rw,r","","" +"MONITOR","MONITOR","monitor","0F 01 C8","V","V","MONITOR","","","","" +"MOVAPD xmm2/m128, xmm1","MOVAPD xmm1, xmm2/m128","movapd xmm1, xmm2/m128"= ,"66 0F 29 /r","V","V","SSE2","","w,r","","" +"MOVAPD xmm1, xmm2/m128","MOVAPD xmm2/m128, xmm1","movapd xmm2/m128, xmm1"= ,"66 0F 28 /r","V","V","SSE2","","w,r","","" +"MOVAPS xmm2/m128, xmm1","MOVAPS xmm1, xmm2/m128","movaps xmm1, xmm2/m128"= ,"0F 29 /r","V","V","SSE","","w,r","","" +"MOVAPS xmm1, xmm2/m128","MOVAPS xmm2/m128, xmm1","movaps xmm2/m128, xmm1"= ,"0F 28 /r","V","V","SSE","","w,r","","" +"MOV r/m8, imm8u","MOVB imm8u, r/m8","movb imm8u, r/m8","C6 /0 ib","V","V"= ,"","","w,r","Y","8" +"MOV r/m8, imm8u","MOVB imm8u, r/m8","movb imm8u, r/m8","REX C6 /0 ib","N.= E.","V","","pseudo64","w,r","Y","8" +"MOV r8op, imm8u","MOVB imm8u, r8op","movb imm8u, r8op","B0+rb ib","V","V"= ,"","","w,r","Y","8" +"MOV r8op, imm8u","MOVB imm8u, r8op","movb imm8u, r8op","REX B0+rb ib","N.= E.","V","","pseudo64","w,r","Y","8" +"MOV r8, r/m8","MOVB r/m8, r8","movb r/m8, r8","8A /r","V","V","","","w,r"= ,"Y","8" +"MOV r8, r/m8","MOVB r/m8, r8","movb r/m8, r8","REX 8A /r","N.E.","V","","= pseudo64","w,r","Y","8" +"MOV r/m8, r8","MOVB r8, r/m8","movb r8, r/m8","88 /r","V","V","","","w,r"= ,"Y","8" +"MOV r/m8, r8","MOVB r8, r/m8","movb r8, r/m8","REX 88 /r","N.E.","V","","= pseudo64","w,r","Y","8" +"MOV moffs8, AL","MOVB/MOVB/MOVABSB AL, moffs8","movb/movb/movabsb AL, mof= fs8","A2 cm","V","V","","","w,r","Y","8" +"MOV moffs8, AL","MOVB/MOVB/MOVABSB AL, moffs8","movb/movb/movabsb AL, mof= fs8","REX.W A2 cm","N.E.","V","","pseudo","w,r","Y","8" +"MOV AL, moffs8","MOVB/MOVB/MOVABSB moffs8, AL","movb/movb/movabsb moffs8,= AL","A0 cm","V","V","","","w,r","Y","8" +"MOV AL, moffs8","MOVB/MOVB/MOVABSB moffs8, AL","movb/movb/movabsb moffs8,= AL","REX.W A0 cm","N.E.","V","","pseudo","w,r","Y","8" +"MOVBE r32, m32","MOVBELL m32, r32","movbell m32, r32","0F 38 F0 /r","V","= V","MOVBE","modrm_memonly,operand32","w,r","Y","32" +"MOVBE m32, r32","MOVBELL r32, m32","movbell r32, m32","0F 38 F1 /r","V","= V","MOVBE","modrm_memonly,operand32","w,r","Y","32" +"MOVBE r64, m64","MOVBEQQ m64, r64","movbeqq m64, r64","REX.W 0F 38 F0 /r"= ,"N.S.","V","MOVBE","modrm_memonly","w,r","Y","64" +"MOVBE m64, r64","MOVBEQQ r64, m64","movbeqq r64, m64","REX.W 0F 38 F1 /r"= ,"N.S.","V","MOVBE","modrm_memonly","w,r","Y","64" +"MOVBE r16, m16","MOVBEWW m16, r16","movbeww m16, r16","0F 38 F0 /r","V","= V","MOVBE","modrm_memonly,operand16","w,r","Y","16" +"MOVBE m16, r16","MOVBEWW r16, m16","movbeww r16, m16","0F 38 F1 /r","V","= V","MOVBE","modrm_memonly,operand16","w,r","Y","16" +"MOVSX r32, r/m8","MOVBLSX r/m8, r32","movsbl r/m8, r32","0F BE /r","V","V= ","","operand32","w,r","Y","32" +"MOVZX r32, r/m8","MOVBLZX r/m8, r32","movzbl r/m8, r32","0F B6 /r","V","V= ","","operand32","w,r","Y","32" +"MOVSX r64, r/m8","MOVBQSX r/m8, r64","movsbq r/m8, r64","REX.W 0F BE /r",= "N.S.","V","","","w,r","Y","64" +"MOVZX r64, r/m8","MOVBQZX r/m8, 
r64","movzbq r/m8, r64","REX.W 0F B6 /r",= "N.S.","V","","","w,r","Y","64" +"MOVSX r16, r/m8","MOVBWSX r/m8, r16","movsbw r/m8, r16","0F BE /r","V","V= ","","operand16","w,r","Y","16" +"MOVZX r16, r/m8","MOVBWZX r/m8, r16","movzbw r/m8, r16","0F B6 /r","V","V= ","","operand16","w,r","Y","16" +"MOVD r/m32, mm1","MOVD mm1, r/m32","movd mm1, r/m32","0F 7E /r","V","V","= MMX","operand16,operand32","w,r","","" +"MOVD mm1, r/m32","MOVD r/m32, mm1","movd r/m32, mm1","0F 6E /r","V","V","= MMX","operand16,operand32","w,r","","" +"MOVD xmm1, r/m32","MOVD r/m32, xmm1","movd r/m32, xmm1","66 0F 6E /r","V"= ,"V","SSE2","operand16,operand32","w,r","","" +"MOVD r/m32, xmm1","MOVD xmm1, r/m32","movd xmm1, r/m32","66 0F 7E /r","V"= ,"V","SSE2","operand16,operand32","w,r","","" +"MOVDDUP xmm1, xmm2/m64","MOVDDUP xmm2/m64, xmm1","movddup xmm2/m64, xmm1"= ,"F2 0F 12 /r","V","V","SSE3","","w,r","","" +"MOVHLPS xmm1, xmm2","MOVHLPS xmm2, xmm1","movhlps xmm2, xmm1","0F 12 /r",= "V","V","SSE","modrm_regonly","w,r","","" +"MOVHPD xmm1, m64","MOVHPD m64, xmm1","movhpd m64, xmm1","66 0F 16 /r","V"= ,"V","SSE2","modrm_memonly","w,r","","" +"MOVHPD m64, xmm1","MOVHPD xmm1, m64","movhpd xmm1, m64","66 0F 17 /r","V"= ,"V","SSE2","modrm_memonly","w,r","","" +"MOVHPS xmm1, m64","MOVHPS m64, xmm1","movhps m64, xmm1","0F 16 /r","V","V= ","SSE","modrm_memonly","w,r","","" +"MOVHPS m64, xmm1","MOVHPS xmm1, m64","movhps xmm1, m64","0F 17 /r","V","V= ","SSE","modrm_memonly","w,r","","" +"MOV rmr32, CR0-CR7","MOVL CR0-CR7, rmr32","movl CR0-CR7, rmr32","0F 20 /r= ","V","N.S.","","","w,r","Y","32" +"MOV rmr32, DR0-DR7","MOVL DR0-DR7, rmr32","movl DR0-DR7, rmr32","0F 21 /r= ","V","N.S.","","","w,r","Y","32" +"MOV moffs32, EAX","MOVL EAX, moffs32","movl EAX, moffs32","A3 cm","V","V"= ,"","operand32","w,r","Y","32" +"MOV r/m32, imm32","MOVL imm32, r/m32","movl imm32, r/m32","C7 /0 id","V",= "V","","operand32","w,r","Y","32" +"MOV r32op, imm32u","MOVL imm32u, r32op","movl imm32u, r32op","B8+rd id","= V","V","","operand32","w,r","Y","32" +"MOV EAX, moffs32","MOVL moffs32, EAX","movl moffs32, EAX","A1 cm","V","V"= ,"","operand32","w,r","Y","32" +"MOV r32, r/m32","MOVL r/m32, r32","movl r/m32, r32","8B /r","V","V","","o= perand32","w,r","Y","32" +"MOV r/m32, r32","MOVL r32, r/m32","movl r32, r/m32","89 /r","V","V","","o= perand32","w,r","Y","32" +"MOV CR0-CR7, rmr32","MOVL rmr32, CR0-CR7","movl rmr32, CR0-CR7","0F 22 /r= ","V","N.S.","","","w,r","Y","32" +"MOV DR0-DR7, rmr32","MOVL rmr32, DR0-DR7","movl rmr32, DR0-DR7","0F 23 /r= ","V","N.S.","","","w,r","Y","32" +"MOVLHPS xmm1, xmm2","MOVLHPS xmm2, xmm1","movlhps xmm2, xmm1","0F 16 /r",= "V","V","SSE","modrm_regonly","w,r","","" +"MOVLPD xmm1, m64","MOVLPD m64, xmm1","movlpd m64, xmm1","66 0F 12 /r","V"= ,"V","SSE2","modrm_memonly","w,r","","" +"MOVLPD m64, xmm1","MOVLPD xmm1, m64","movlpd xmm1, m64","66 0F 13 /r","V"= ,"V","SSE2","modrm_memonly","w,r","","" +"MOVLPS xmm1, m64","MOVLPS m64, xmm1","movlps m64, xmm1","0F 12 /r","V","V= ","SSE","modrm_memonly","w,r","","" +"MOVLPS m64, xmm1","MOVLPS xmm1, m64","movlps xmm1, m64","0F 13 /r","V","V= ","SSE","modrm_memonly","w,r","","" +"MOVSXD r32, r/m32","MOVLQSX r/m32, r32","movsxdl r/m32, r32","63 /r","N.S= .","V","","operand32","w,r","Y","32" +"MOVSXD r64, r/m32","MOVLQSX r/m32, r64","movslq r/m32, r64","REX.W 63 /r"= ,"N.S.","V","","","w,r","Y","64" +"MOVMSKPD r32, xmm2","MOVMSKPD xmm2, r32","movmskpd xmm2, r32","66 0F 50 /= r","V","V","SSE2","modrm_regonly","w,r","","" +"MOVMSKPS r32, xmm2","MOVMSKPS xmm2, r32","movmskps xmm2, r32","0F 
50 /r",= "V","V","SSE","modrm_regonly","w,r","","" +"MOVNTDQA xmm1, m128","MOVNTDQA m128, xmm1","movntdqa m128, xmm1","66 0F 3= 8 2A /r","V","V","SSE4_1","modrm_memonly","w,r","","" +"MOVNTI m32, r32","MOVNTIL r32, m32","movntil r32, m32","0F C3 /r","V","V"= ,"SSE2","modrm_memonly,operand16,operand32","w,r","Y","32" +"MOVNTI m64, r64","MOVNTIQ r64, m64","movntiq r64, m64","REX.W 0F C3 /r","= N.S.","V","SSE2","modrm_memonly","w,r","Y","64" +"MOVNTDQ m128, xmm1","MOVNTO xmm1, m128","movntdq xmm1, m128","66 0F E7 /r= ","V","V","SSE2","modrm_memonly","w,r","","" +"MOVNTPD m128, xmm1","MOVNTPD xmm1, m128","movntpd xmm1, m128","66 0F 2B /= r","V","V","SSE2","modrm_memonly","w,r","","" +"MOVNTPS m128, xmm1","MOVNTPS xmm1, m128","movntps xmm1, m128","0F 2B /r",= "V","V","SSE","modrm_memonly","w,r","","" +"MOVNTQ m64, mm1","MOVNTQ mm1, m64","movntq mm1, m64","0F E7 /r","V","V","= MMX","modrm_memonly","w,r","","" +"MOVNTSD m64, xmm1","MOVNTSD xmm1, m64","movntsd xmm1, m64","F2 0F 2B /r",= "V","V","SSE4a","amd,modrm_memonly","w,r","","" +"MOVNTSS m32, xmm1","MOVNTSS xmm1, m32","movntss xmm1, m32","F3 0F 2B /r",= "V","V","SSE4a","amd,modrm_memonly","w,r","","" +"MOVDQA xmm2/m128, xmm1","MOVO xmm1, xmm2/m128","movdqa xmm1, xmm2/m128","= 66 0F 7F /r","V","V","SSE2","","w,r","","" +"MOVDQA xmm1, xmm2/m128","MOVO xmm2/m128, xmm1","movdqa xmm2/m128, xmm1","= 66 0F 6F /r","V","V","SSE2","","w,r","","" +"MOVDQU xmm2/m128, xmm1","MOVOU xmm1, xmm2/m128","movdqu xmm1, xmm2/m128",= "F3 0F 7F /r","V","V","SSE2","","w,r","","" +"MOVDQU xmm1, xmm2/m128","MOVOU xmm2/m128, xmm1","movdqu xmm2/m128, xmm1",= "F3 0F 6F /r","V","V","SSE2","","w,r","","" +"MOV rmr64, CR0-CR7","MOVQ CR0-CR7, rmr64","movq CR0-CR7, rmr64","0F 20 /r= ","N.S.","V","","default64","w,r","Y","64" +"MOV rmr64, CR8","MOVQ CR8, rmr64","movq CR8, rmr64","REX.R + 0F 20 /0","N= .E.","V","","modrm_regonly,pseudo","w,r","Y","64" +"MOV rmr64, DR0-DR7","MOVQ DR0-DR7, rmr64","movq DR0-DR7, rmr64","0F 21 /r= ","N.S.","V","","default64","w,r","Y","64" +"MOV moffs64, RAX","MOVQ RAX, moffs64","movabsq RAX, moffs64","REX.W A3 cm= ","N.S.","V","","","w,r","Y","64" +"MOV r/m64, imm32","MOVQ imm32, r/m64","movq imm32, r/m64","REX.W C7 /0 id= ","N.S.","V","","","w,r","Y","64" +"MOV r64op, imm64u","MOVQ imm64u, r64op","movq imm64u, r64op","REX.W B8+ro= io","N.S.","V","","","w,r","Y","64" +"MOVQ mm2/m64, mm1","MOVQ mm1, mm2/m64","movq mm1, mm2/m64","0F 7F /r","V"= ,"V","MMX","","w,r","","" +"MOVQ r/m64, mm1","MOVQ mm1, r/m64","movq mm1, r/m64","REX.W 0F 7E /r","N.= S.","V","MMX","","w,r","","" +"MOVQ mm1, mm2/m64","MOVQ mm2/m64, mm1","movq mm2/m64, mm1","0F 6F /r","V"= ,"V","MMX","","w,r","","" +"MOV RAX, moffs64","MOVQ moffs64, RAX","movabsq moffs64, RAX","REX.W A1 cm= ","N.S.","V","","","w,r","Y","64" +"MOVQ mm1, r/m64","MOVQ r/m64, mm1","movq r/m64, mm1","REX.W 0F 6E /r","N.= S.","V","MMX","","w,r","","" +"MOV r64, r/m64","MOVQ r/m64, r64","movq r/m64, r64","REX.W 8B /r","N.S.",= "V","","","w,r","Y","64" +"MOVQ xmm1, r/m64","MOVQ r/m64, xmm1","movq r/m64, xmm1","66 REX.W 0F 6E /= r","N.S.","V","SSE2","","w,r","","" +"MOV r/m64, r64","MOVQ r64, r/m64","movq r64, r/m64","REX.W 89 /r","N.S.",= "V","","","w,r","Y","64" +"MOV CR0-CR7, rmr64","MOVQ rmr64, CR0-CR7","movq rmr64, CR0-CR7","0F 22 /r= ","N.S.","V","","default64","w,r","Y","64" +"MOV CR8, rmr64","MOVQ rmr64, CR8","movq rmr64, CR8","REX.R + 0F 22 /0","N= .E.","V","","modrm_regonly,pseudo","w,r","Y","64" +"MOV DR0-DR7, rmr64","MOVQ rmr64, DR0-DR7","movq rmr64, DR0-DR7","0F 23 /r= 
","N.S.","V","","default64","w,r","Y","64" +"MOVQ r/m64, xmm1","MOVQ xmm1, r/m64","movq xmm1, r/m64","66 REX.W 0F 7E /= r","N.S.","V","SSE2","","w,r","","" +"MOVQ xmm2/m64, xmm1","MOVQ xmm1, xmm2/m64","movq xmm1, xmm2/m64","66 0F D= 6 /r","V","V","SSE2","","w,r","","" +"MOVDQ2Q mm1, xmm2","MOVQ xmm2, mm1","movdq2q xmm2, mm1","F2 0F D6 /r","V"= ,"V","SSE2","modrm_regonly","w,r","","" +"MOVQ xmm1, xmm2/m64","MOVQ xmm2/m64, xmm1","movq xmm2/m64, xmm1","F3 0F 7= E /r","V","V","SSE2","","w,r","","" +"MOVQ2DQ xmm1, mm2","MOVQOZX mm2, xmm1","movq2dq mm2, xmm1","F3 0F D6 /r",= "V","V","SSE2","modrm_regonly","w,r","","" +"MOVSB","MOVSB","movsb","A4","V","V","","","","","" +"MOVSD xmm2/m64, xmm1","MOVSD xmm1, xmm2/m64","movsd xmm1, xmm2/m64","F2 0= F 11 /r","V","V","SSE2","","w,r","","" +"MOVSD xmm1, xmm2/m64","MOVSD xmm2/m64, xmm1","movsd xmm2/m64, xmm1","F2 0= F 10 /r","V","V","SSE2","","w,r","","" +"MOVSHDUP xmm1, xmm2/m128","MOVSHDUP xmm2/m128, xmm1","movshdup xmm2/m128,= xmm1","F3 0F 16 /r","V","V","SSE3","","w,r","","" +"MOVSD","MOVSL","movsl","A5","V","V","","operand32","","","" +"MOVSLDUP xmm1, xmm2/m128","MOVSLDUP xmm2/m128, xmm1","movsldup xmm2/m128,= xmm1","F3 0F 12 /r","V","V","SSE3","","w,r","","" +"MOVSQ","MOVSQ","movsq","REX.W A5","N.S.","V","","","","","" +"MOVSS xmm2/m32, xmm1","MOVSS xmm1, xmm2/m32","movss xmm1, xmm2/m32","F3 0= F 11 /r","V","V","SSE","","w,r","","" +"MOVSS xmm1, xmm2/m32","MOVSS xmm2/m32, xmm1","movss xmm2/m32, xmm1","F3 0= F 10 /r","V","V","SSE","","w,r","","" +"MOVSW","MOVSW","movsw","A5","V","V","","operand16","","","" +"MOVSX r16, r/m16","MOVSWW r/m16, r16","movsww r/m16, r16","0F BF /r","V",= "V","","operand16","w,r","Y","16" +"MOVUPD xmm2/m128, xmm1","MOVUPD xmm1, xmm2/m128","movupd xmm1, xmm2/m128"= ,"66 0F 11 /r","V","V","SSE2","","w,r","","" +"MOVUPD xmm1, xmm2/m128","MOVUPD xmm2/m128, xmm1","movupd xmm2/m128, xmm1"= ,"66 0F 10 /r","V","V","SSE2","","w,r","","" +"MOVUPS xmm2/m128, xmm1","MOVUPS xmm1, xmm2/m128","movups xmm1, xmm2/m128"= ,"0F 11 /r","V","V","SSE","","w,r","","" +"MOVUPS xmm1, xmm2/m128","MOVUPS xmm2/m128, xmm1","movups xmm2/m128, xmm1"= ,"0F 10 /r","V","V","SSE","","w,r","","" +"MOV moffs16, AX","MOVW AX, moffs16","movw AX, moffs16","A3 cm","V","V",""= ,"operand16","w,r","Y","16" +"MOV r/m16, Sreg","MOVW Sreg, r/m16","movw Sreg, r/m16","8C /r","V","V",""= ,"operand16","w,r","Y","16" +"MOV r/m16, imm16","MOVW imm16, r/m16","movw imm16, r/m16","C7 /0 iw","V",= "V","","operand16","w,r","Y","16" +"MOV r16op, imm16u","MOVW imm16u, r16op","movw imm16u, r16op","B8+rw iw","= V","V","","operand16","w,r","Y","16" +"MOV AX, moffs16","MOVW moffs16, AX","movw moffs16, AX","A1 cm","V","V",""= ,"operand16","w,r","Y","16" +"MOV Sreg, r/m16","MOVW r/m16, Sreg","movw r/m16, Sreg","8E /r","V","V",""= ,"","w,r","Y","16" +"MOV r16, r/m16","MOVW r/m16, r16","movw r/m16, r16","8B /r","V","V","","o= perand16","w,r","Y","16" +"MOV r/m16, r16","MOVW r16, r/m16","movw r16, r/m16","89 /r","V","V","","o= perand16","w,r","Y","16" +"MOVSX r32, r/m16","MOVWLSX r/m16, r32","movswl r/m16, r32","0F BF /r","V"= ,"V","","operand32","w,r","Y","32" +"MOVZX r32, r/m16","MOVWLZX r/m16, r32","movzwl r/m16, r32","0F B7 /r","V"= ,"V","","operand32","w,r","Y","32" +"MOVSX r64, r/m16","MOVWQSX r/m16, r64","movswq r/m16, r64","REX.W 0F BF /= r","N.S.","V","","","w,r","Y","64" +"MOVSXD r16, r/m32","MOVWQSX r/m32, r16","movsxdw r/m32, r16","63 /r","N.S= .","V","","operand16","w,r","Y","16" +"MOVZX r64, r/m16","MOVWQZX r/m16, r64","movzwq r/m16, r64","REX.W 0F B7 /= 
r","N.S.","V","","","w,r","Y","64" +"MOVZX r16, r/m16","MOVZWW r/m16, r16","movzww r/m16, r16","0F B7 /r","V",= "V","","operand16","w,r","Y","16" +"MOV r32/m16, Sreg","MOV{L/W} Sreg, r32/m16","mov{l/w} Sreg, r32/m16","8C = /r","V","V","","operand32","w,r","Y","" +"MOV r64/m16, Sreg","MOV{Q/W} Sreg, r64/m16","mov{q/w} Sreg, r64/m16","REX= .W 8C /r","N.S.","V","","","w,r","Y","" +"MPSADBW xmm1, xmm2/m128, imm8u","MPSADBW imm8u, xmm2/m128, xmm1","mpsadbw= imm8u, xmm2/m128, xmm1","66 0F 3A 42 /r ib","V","V","SSE4_1","","rw,r,r","= ","" +"MUL r/m8","MULB r/m8","mulb r/m8","F6 /4","V","V","","","r","Y","8" +"MUL r/m8","MULB r/m8","mulb r/m8","REX F6 /4","N.E.","V","","pseudo64","r= ","Y","8" +"MUL r/m32","MULL r/m32","mull r/m32","F7 /4","V","V","","operand32","r","= Y","32" +"MULPD xmm1, xmm2/m128","MULPD xmm2/m128, xmm1","mulpd xmm2/m128, xmm1","6= 6 0F 59 /r","V","V","SSE2","","rw,r","","" +"MULPS xmm1, xmm2/m128","MULPS xmm2/m128, xmm1","mulps xmm2/m128, xmm1","0= F 59 /r","V","V","SSE","","rw,r","","" +"MUL r/m64","MULQ r/m64","mulq r/m64","REX.W F7 /4","N.S.","V","","","r","= Y","64" +"MULSD xmm1, xmm2/m64","MULSD xmm2/m64, xmm1","mulsd xmm2/m64, xmm1","F2 0= F 59 /r","V","V","SSE2","","rw,r","","" +"MULSS xmm1, xmm2/m32","MULSS xmm2/m32, xmm1","mulss xmm2/m32, xmm1","F3 0= F 59 /r","V","V","SSE","","rw,r","","" +"MUL r/m16","MULW r/m16","mulw r/m16","F7 /4","V","V","","operand16","r","= Y","16" +"MULX r32, r32V, r/m32","MULXL r/m32, r32V, r32","mulxl r/m32, r32V, r32",= "VEX.NDD.128.F2.0F38.W0 F6 /r","V","V","BMI2","","w,w,r","Y","32" +"MULX r64, r64V, r/m64","MULXQ r/m64, r64V, r64","mulxq r/m64, r64V, r64",= "VEX.NDD.128.F2.0F38.W1 F6 /r","N.S.","V","BMI2","","w,w,r","Y","64" +"MWAIT","MWAIT","mwait","0F 01 C9","V","V","MONITOR","","","","" +"NEG r/m8","NEGB r/m8","negb r/m8","F6 /3","V","V","","","rw","Y","8" +"NEG r/m8","NEGB r/m8","negb r/m8","REX F6 /3","N.E.","V","","pseudo64","r= w","Y","8" +"NEG r/m32","NEGL r/m32","negl r/m32","F7 /3","V","V","","operand32","rw",= "Y","32" +"NEG r/m64","NEGQ r/m64","negq r/m64","REX.W F7 /3","N.S.","V","","","rw",= "Y","64" +"NEG r/m16","NEGW r/m16","negw r/m16","F7 /3","V","V","","operand16","rw",= "Y","16" +"NOP","NOP","nop","90","V","V","","pseudo","","Y","" +"NOP","NOP","nop","90+rd","V","V","","operand32,operand64","","Y","" +"NOP","NOP","nop","90+rw","V","V","","operand16,operand64","","Y","" +"NOP","NOP","nop","F3 90+rd","V","V","","operand32","","Y","" +"NOP","NOP","nop","F3 90+rw","V","V","","operand16","","Y","" +"NOP r/m32","NOPL r/m32","nopl r/m32","0F 18 /4","V","V","","operand32","r= ","Y","32" +"NOP r/m32","NOPL r/m32","nopl r/m32","0F 18 /5","V","V","","operand32","r= ","Y","32" +"NOP r/m32","NOPL r/m32","nopl r/m32","0F 18 /6","V","V","","operand32","r= ","Y","32" +"NOP r/m32","NOPL r/m32","nopl r/m32","0F 18 /7","V","V","","operand32","r= ","Y","32" +"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 19 /r","V","V",""= ,"operand32","r,r","Y","32" +"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1A /r","V","V",""= ,"operand32","r,r","Y","32" +"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1B /r","V","V",""= ,"operand32","r,r","Y","32" +"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1C /r","V","V",""= ,"operand32","r,r","Y","32" +"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1D /r","V","V",""= ,"operand32","r,r","Y","32" +"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1E /r","V","V","P= PRO","operand32","r,r","Y","32" +"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 
1E /r","V","V",""= ,"operand32","r,r","Y","32" +"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1F /r","V","V",""= ,"operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","0F 0D /r","V","V","P= RFCHW","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","0F 1A /r","V","V","P= PRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","0F 1B /r","V","V","P= PRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","66 0F 1E /r","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F2 0F 1E /r","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1B /r","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /0","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /1","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /2","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /3","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /4","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /5","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /6","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E F8","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E F9","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E FA","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E FB","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E FC","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E FD","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E FE","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E FF","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32","NOPL rmr32","nopl rmr32","0F 18 /0","V","V","","modrm_regonly= ,operand32","r","Y","32" +"NOP rmr32","NOPL rmr32","nopl rmr32","0F 18 /1","V","V","","modrm_regonly= ,operand32","r","Y","32" +"NOP rmr32","NOPL rmr32","nopl rmr32","0F 18 /2","V","V","","modrm_regonly= ,operand32","r","Y","32" +"NOP rmr32","NOPL rmr32","nopl rmr32","0F 18 /3","V","V","","modrm_regonly= ,operand32","r","Y","32" +"NOP r/m64","NOPQ r/m64","nopq r/m64","REX.W 0F 18 /4","N.S.","V","","","r= ","Y","64" +"NOP r/m64","NOPQ r/m64","nopq r/m64","REX.W 0F 18 /5","N.S.","V","","","r= ","Y","64" +"NOP r/m64","NOPQ r/m64","nopq r/m64","REX.W 0F 18 /6","N.S.","V","","","r= ","Y","64" +"NOP r/m64","NOPQ r/m64","nopq r/m64","REX.W 0F 18 /7","N.S.","V","","","r= ","Y","64" 
+"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 19 /r","N.S= .","V","","","r,r","Y","64" +"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1A /r","N.S= .","V","","","r,r","Y","64" +"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1B /r","N.S= .","V","","","r,r","Y","64" +"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1C /r","N.S= .","V","","","r,r","Y","64" +"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1D /r","N.S= .","V","","","r,r","Y","64" +"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1E /r","N.S= .","V","","","r,r","Y","64" +"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1E /r","N.S= .","V","PPRO","","r,r","Y","64" +"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1F /r","N.S= .","V","","","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","66 REX.W 0F 1E /r","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F2 REX.W 0F 1E /r","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1B /r","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /0","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /1","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /2","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /3","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /4","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /5","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /6","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E F8","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E F9","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E FA","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E FB","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E FC","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E FD","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E FE","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E FF","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","REX.W 0F 0D /r","N.S= .","V","PRFCHW","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","REX.W 0F 1A /r","N.S= .","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","REX.W 0F 1B /r","N.S= .","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64","NOPQ rmr64","nopq rmr64","REX.W 0F 18 /0","N.S.","V","","modr= 
m_regonly","r","Y","64" +"NOP rmr64","NOPQ rmr64","nopq rmr64","REX.W 0F 18 /1","N.S.","V","","modr= m_regonly","r","Y","64" +"NOP rmr64","NOPQ rmr64","nopq rmr64","REX.W 0F 18 /2","N.S.","V","","modr= m_regonly","r","Y","64" +"NOP rmr64","NOPQ rmr64","nopq rmr64","REX.W 0F 18 /3","N.S.","V","","modr= m_regonly","r","Y","64" +"NOP r/m16","NOPW r/m16","nopw r/m16","0F 18 /4","V","V","","operand16","r= ","Y","16" +"NOP r/m16","NOPW r/m16","nopw r/m16","0F 18 /5","V","V","","operand16","r= ","Y","16" +"NOP r/m16","NOPW r/m16","nopw r/m16","0F 18 /6","V","V","","operand16","r= ","Y","16" +"NOP r/m16","NOPW r/m16","nopw r/m16","0F 18 /7","V","V","","operand16","r= ","Y","16" +"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 19 /r","V","V",""= ,"operand16","r,r","Y","16" +"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1A /r","V","V",""= ,"operand16","r,r","Y","16" +"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1B /r","V","V",""= ,"operand16","r,r","Y","16" +"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1C /r","V","V",""= ,"operand16","r,r","Y","16" +"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1D /r","V","V",""= ,"operand16","r,r","Y","16" +"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1E /r","V","V",""= ,"operand16","r,r","Y","16" +"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1E /r","V","V","P= PRO","operand16","r,r","Y","16" +"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1F /r","V","V",""= ,"operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","0F 0D /r","V","V","P= RFCHW","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","0F 1A /r","V","V","P= PRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","0F 1B /r","V","V","P= PRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","66 0F 1E /r","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F2 0F 1E /r","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1B /r","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /0","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /1","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /2","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /3","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /4","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /5","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /6","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E F8","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E F9","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E FA","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, 
r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E FB","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E FC","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E FD","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E FE","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E FF","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16","NOPW rmr16","nopw rmr16","0F 18 /0","V","V","","modrm_regonly= ,operand16","r","Y","16" +"NOP rmr16","NOPW rmr16","nopw rmr16","0F 18 /1","V","V","","modrm_regonly= ,operand16","r","Y","16" +"NOP rmr16","NOPW rmr16","nopw rmr16","0F 18 /2","V","V","","modrm_regonly= ,operand16","r","Y","16" +"NOP rmr16","NOPW rmr16","nopw rmr16","0F 18 /3","V","V","","modrm_regonly= ,operand16","r","Y","16" +"NOT r/m8","NOTB r/m8","notb r/m8","F6 /2","V","V","","","rw","Y","8" +"NOT r/m8","NOTB r/m8","notb r/m8","REX F6 /2","N.E.","V","","pseudo64","r= w","Y","8" +"NOT r/m32","NOTL r/m32","notl r/m32","F7 /2","V","V","","operand32","rw",= "Y","32" +"NOT r/m64","NOTQ r/m64","notq r/m64","REX.W F7 /2","N.S.","V","","","rw",= "Y","64" +"NOT r/m16","NOTW r/m16","notw r/m16","F7 /2","V","V","","operand16","rw",= "Y","16" +"OR r/m8, imm8","ORB imm8, r/m8","orb imm8, r/m8","80 /1 ib","V","V","",""= ,"rw,r","Y","8" +"OR r/m8, imm8","ORB imm8, r/m8","orb imm8, r/m8","82 /1 ib","V","N.S.",""= ,"","rw,r","Y","8" +"OR r/m8, imm8","ORB imm8, r/m8","orb imm8, r/m8","REX 80 /1 ib","N.E.","V= ","","pseudo64","rw,r","Y","8" +"OR AL, imm8u","ORB imm8u, AL","orb imm8u, AL","0C ib","V","V","","","rw,r= ","Y","8" +"OR r8, r/m8","ORB r/m8, r8","orb r/m8, r8","0A /r","V","V","","","rw,r","= Y","8" +"OR r8, r/m8","ORB r/m8, r8","orb r/m8, r8","REX 0A /r","N.E.","V","","pse= udo64","rw,r","Y","8" +"OR r/m8, r8","ORB r8, r/m8","orb r8, r/m8","08 /r","V","V","","","rw,r","= Y","8" +"OR r/m8, r8","ORB r8, r/m8","orb r8, r/m8","REX 08 /r","N.E.","V","","pse= udo64","rw,r","Y","8" +"OR EAX, imm32","ORL imm32, EAX","orl imm32, EAX","0D id","V","V","","oper= and32","rw,r","Y","32" +"OR r/m32, imm32","ORL imm32, r/m32","orl imm32, r/m32","81 /1 id","V","V"= ,"","operand32","rw,r","Y","32" +"OR r/m32, imm8","ORL imm8, r/m32","orl imm8, r/m32","83 /1 ib","V","V",""= ,"operand32","rw,r","Y","32" +"OR r32, r/m32","ORL r/m32, r32","orl r/m32, r32","0B /r","V","V","","oper= and32","rw,r","Y","32" +"OR r/m32, r32","ORL r32, r/m32","orl r32, r/m32","09 /r","V","V","","oper= and32","rw,r","Y","32" +"ORPD xmm1, xmm2/m128","ORPD xmm2/m128, xmm1","orpd xmm2/m128, xmm1","66 0= F 56 /r","V","V","SSE2","","rw,r","","" +"ORPS xmm1, xmm2/m128","ORPS xmm2/m128, xmm1","orps xmm2/m128, xmm1","0F 5= 6 /r","V","V","SSE","","rw,r","","" +"OR RAX, imm32","ORQ imm32, RAX","orq imm32, RAX","REX.W 0D id","N.S.","V"= ,"","","rw,r","Y","64" +"OR r/m64, imm32","ORQ imm32, r/m64","orq imm32, r/m64","REX.W 81 /1 id","= N.S.","V","","","rw,r","Y","64" +"OR r/m64, imm8","ORQ imm8, r/m64","orq imm8, r/m64","REX.W 83 /1 ib","N.S= .","V","","","rw,r","Y","64" +"OR r64, r/m64","ORQ r/m64, r64","orq r/m64, r64","REX.W 0B /r","N.S.","V"= ,"","","rw,r","Y","64" +"OR r/m64, r64","ORQ r64, r/m64","orq r64, r/m64","REX.W 09 /r","N.S.","V"= ,"","","rw,r","Y","64" +"OR AX, imm16","ORW imm16, AX","orw imm16, AX","0D iw","V","V","","operand= 16","rw,r","Y","16" +"OR 
r/m16, imm16","ORW imm16, r/m16","orw imm16, r/m16","81 /1 iw","V","V"= ,"","operand16","rw,r","Y","16" +"OR r/m16, imm8","ORW imm8, r/m16","orw imm8, r/m16","83 /1 ib","V","V",""= ,"operand16","rw,r","Y","16" +"OR r16, r/m16","ORW r/m16, r16","orw r/m16, r16","0B /r","V","V","","oper= and16","rw,r","Y","16" +"OR r/m16, r16","ORW r16, r/m16","orw r16, r/m16","09 /r","V","V","","oper= and16","rw,r","Y","16" +"OUT DX, AL","OUTB AL, DX","outb AL, DX","EE","V","V","","","r,r","Y","8" +"OUT imm8u, AL","OUTB AL, imm8u","outb AL, imm8u","E6 ib","V","V","","","r= ,r","Y","8" +"OUT DX, EAX","OUTL EAX, DX","outl EAX, DX","EF","V","V","","operand32,ope= rand64","r,r","Y","32" +"OUT imm8u, EAX","OUTL EAX, imm8u","outl EAX, imm8u","E7 ib","V","V","","o= perand32,operand64","r,r","Y","32" +"OUTSB","OUTSB","outsb","6E","V","V","","","","","" +"OUTSD","OUTSL","outsl","6F","V","V","","operand32,operand64","","","" +"OUTSW","OUTSW","outsw","6F","V","V","","operand16","","","" +"OUT DX, AX","OUTW AX, DX","outw AX, DX","EF","V","V","","operand16","r,r"= ,"Y","16" +"OUT imm8u, AX","OUTW AX, imm8u","outw AX, imm8u","E7 ib","V","V","","oper= and16","r,r","Y","16" +"PABSB mm1, mm2/m64","PABSB mm2/m64, mm1","pabsb mm2/m64, mm1","0F 38 1C /= r","V","V","SSSE3","","w,r","","" +"PABSB xmm1, xmm2/m128","PABSB xmm2/m128, xmm1","pabsb xmm2/m128, xmm1","6= 6 0F 38 1C /r","V","V","SSSE3","","w,r","","" +"PABSD mm1, mm2/m64","PABSD mm2/m64, mm1","pabsd mm2/m64, mm1","0F 38 1E /= r","V","V","SSSE3","","w,r","","" +"PABSD xmm1, xmm2/m128","PABSD xmm2/m128, xmm1","pabsd xmm2/m128, xmm1","6= 6 0F 38 1E /r","V","V","SSSE3","","w,r","","" +"PABSW mm1, mm2/m64","PABSW mm2/m64, mm1","pabsw mm2/m64, mm1","0F 38 1D /= r","V","V","SSSE3","","w,r","","" +"PABSW xmm1, xmm2/m128","PABSW xmm2/m128, xmm1","pabsw xmm2/m128, xmm1","6= 6 0F 38 1D /r","V","V","SSSE3","","w,r","","" +"PACKSSDW mm1, mm2/m64","PACKSSLW mm2/m64, mm1","packssdw mm2/m64, mm1","0= F 6B /r","V","V","MMX","","rw,r","","" +"PACKSSDW xmm1, xmm2/m128","PACKSSLW xmm2/m128, xmm1","packssdw xmm2/m128,= xmm1","66 0F 6B /r","V","V","SSE2","","rw,r","","" +"PACKSSWB mm1, mm2/m64","PACKSSWB mm2/m64, mm1","packsswb mm2/m64, mm1","0= F 63 /r","V","V","MMX","","rw,r","","" +"PACKSSWB xmm1, xmm2/m128","PACKSSWB xmm2/m128, xmm1","packsswb xmm2/m128,= xmm1","66 0F 63 /r","V","V","SSE2","","rw,r","","" +"PACKUSDW xmm1, xmm2/m128","PACKUSDW xmm2/m128, xmm1","packusdw xmm2/m128,= xmm1","66 0F 38 2B /r","V","V","SSE4_1","","rw,r","","" +"PACKUSWB mm1, mm2/m64","PACKUSWB mm2/m64, mm1","packuswb mm2/m64, mm1","0= F 67 /r","V","V","MMX","","rw,r","","" +"PACKUSWB xmm1, xmm2/m128","PACKUSWB xmm2/m128, xmm1","packuswb xmm2/m128,= xmm1","66 0F 67 /r","V","V","SSE2","","rw,r","","" +"PADDB mm1, mm2/m64","PADDB mm2/m64, mm1","paddb mm2/m64, mm1","0F FC /r",= "V","V","MMX","","rw,r","","" +"PADDB xmm1, xmm2/m128","PADDB xmm2/m128, xmm1","paddb xmm2/m128, xmm1","6= 6 0F FC /r","V","V","SSE2","","rw,r","","" +"PADDD mm1, mm2/m64","PADDL mm2/m64, mm1","paddd mm2/m64, mm1","0F FE /r",= "V","V","MMX","","rw,r","","" +"PADDD xmm1, xmm2/m128","PADDL xmm2/m128, xmm1","paddd xmm2/m128, xmm1","6= 6 0F FE /r","V","V","SSE2","","rw,r","","" +"PADDQ mm1, mm2/m64","PADDQ mm2/m64, mm1","paddq mm2/m64, mm1","0F D4 /r",= "V","V","SSE2","","rw,r","","" +"PADDQ xmm1, xmm2/m128","PADDQ xmm2/m128, xmm1","paddq xmm2/m128, xmm1","6= 6 0F D4 /r","V","V","SSE2","","rw,r","","" +"PADDSB mm1, mm2/m64","PADDSB mm2/m64, mm1","paddsb mm2/m64, mm1","0F EC /= r","V","V","MMX","","rw,r","","" +"PADDSB xmm1, xmm2/m128","PADDSB 
xmm2/m128, xmm1","paddsb xmm2/m128, xmm1"= ,"66 0F EC /r","V","V","SSE2","","rw,r","","" +"PADDSW mm1, mm2/m64","PADDSW mm2/m64, mm1","paddsw mm2/m64, mm1","0F ED /= r","V","V","MMX","","rw,r","","" +"PADDSW xmm1, xmm2/m128","PADDSW xmm2/m128, xmm1","paddsw xmm2/m128, xmm1"= ,"66 0F ED /r","V","V","SSE2","","rw,r","","" +"PADDUSB mm1, mm2/m64","PADDUSB mm2/m64, mm1","paddusb mm2/m64, mm1","0F D= C /r","V","V","MMX","","rw,r","","" +"PADDUSB xmm1, xmm2/m128","PADDUSB xmm2/m128, xmm1","paddusb xmm2/m128, xm= m1","66 0F DC /r","V","V","SSE2","","rw,r","","" +"PADDUSW mm1, mm2/m64","PADDUSW mm2/m64, mm1","paddusw mm2/m64, mm1","0F D= D /r","V","V","MMX","","rw,r","","" +"PADDUSW xmm1, xmm2/m128","PADDUSW xmm2/m128, xmm1","paddusw xmm2/m128, xm= m1","66 0F DD /r","V","V","SSE2","","rw,r","","" +"PADDW mm1, mm2/m64","PADDW mm2/m64, mm1","paddw mm2/m64, mm1","0F FD /r",= "V","V","MMX","","rw,r","","" +"PADDW xmm1, xmm2/m128","PADDW xmm2/m128, xmm1","paddw xmm2/m128, xmm1","6= 6 0F FD /r","V","V","SSE2","","rw,r","","" +"PALIGNR mm1, mm2/m64, imm8u","PALIGNR imm8u, mm2/m64, mm1","palignr imm8u= , mm2/m64, mm1","0F 3A 0F /r ib","V","V","SSSE3","","rw,r,r","","" +"PALIGNR xmm1, xmm2/m128, imm8u","PALIGNR imm8u, xmm2/m128, xmm1","palignr= imm8u, xmm2/m128, xmm1","66 0F 3A 0F /r ib","V","V","SSSE3","","rw,r,r",""= ,"" +"PAND mm1, mm2/m64","PAND mm2/m64, mm1","pand mm2/m64, mm1","0F DB /r","V"= ,"V","MMX","","rw,r","","" +"PAND xmm1, xmm2/m128","PAND xmm2/m128, xmm1","pand xmm2/m128, xmm1","66 0= F DB /r","V","V","SSE2","","rw,r","","" +"PANDN mm1, mm2/m64","PANDN mm2/m64, mm1","pandn mm2/m64, mm1","0F DF /r",= "V","V","MMX","","rw,r","","" +"PANDN xmm1, xmm2/m128","PANDN xmm2/m128, xmm1","pandn xmm2/m128, xmm1","6= 6 0F DF /r","V","V","SSE2","","rw,r","","" +"PAUSE","PAUSE","pause","F3 90","V","V","","pseudo","","","" +"PAUSE","PAUSE","pause","F3 90+rd","V","V","","operand32","","Y","" +"PAUSE","PAUSE","pause","F3 90+rw","V","V","","operand16,operand64","","Y"= ,"" +"PAVGB mm1, mm2/m64","PAVGB mm2/m64, mm1","pavgb mm2/m64, mm1","0F E0 /r",= "V","V","MMX","","rw,r","","" +"PAVGB xmm1, xmm2/m128","PAVGB xmm2/m128, xmm1","pavgb xmm2/m128, xmm1","6= 6 0F E0 /r","V","V","SSE2","","rw,r","","" +"PAVGUSB mm1, mm2/m64","PAVGUSB mm2/m64, mm1","pavgusb mm2/m64, mm1","0F 0= F BF /r","V","V","3DNOW","amd","rw,r","","" +"PAVGW mm1, mm2/m64","PAVGW mm2/m64, mm1","pavgw mm2/m64, mm1","0F E3 /r",= "V","V","MMX","","rw,r","","" +"PAVGW xmm1, xmm2/m128","PAVGW xmm2/m128, xmm1","pavgw xmm2/m128, xmm1","6= 6 0F E3 /r","V","V","SSE2","","rw,r","","" +"PBLENDVB xmm1, xmm2/m128, ","PBLENDVB , xmm2/m128, xmm1","pbl= endvb , xmm2/m128, xmm1","66 0F 38 10 /r","V","V","SSE4_1","","rw,r,r= ","","" +"PBLENDW xmm1, xmm2/m128, imm8u","PBLENDW imm8u, xmm2/m128, xmm1","pblendw= imm8u, xmm2/m128, xmm1","66 0F 3A 0E /r ib","V","V","SSE4_1","","rw,r,r","= ","" +"PCLMULQDQ xmm1, xmm2/m128, imm8u","PCLMULQDQ imm8u, xmm2/m128, xmm1","pcl= mulqdq imm8u, xmm2/m128, xmm1","66 0F 3A 44 /r ib","V","V","PCLMULQDQ","","= rw,r,r","","" +"PCMPEQB mm1, mm2/m64","PCMPEQB mm2/m64, mm1","pcmpeqb mm2/m64, mm1","0F 7= 4 /r","V","V","MMX","","rw,r","","" +"PCMPEQB xmm1, xmm2/m128","PCMPEQB xmm2/m128, xmm1","pcmpeqb xmm2/m128, xm= m1","66 0F 74 /r","V","V","SSE2","","rw,r","","" +"PCMPEQD mm1, mm2/m64","PCMPEQL mm2/m64, mm1","pcmpeqd mm2/m64, mm1","0F 7= 6 /r","V","V","MMX","","rw,r","","" +"PCMPEQD xmm1, xmm2/m128","PCMPEQL xmm2/m128, xmm1","pcmpeqd xmm2/m128, xm= m1","66 0F 76 /r","V","V","SSE2","","rw,r","","" +"PCMPEQQ xmm1, xmm2/m128","PCMPEQQ 
xmm2/m128, xmm1","pcmpeqq xmm2/m128, xm= m1","66 0F 38 29 /r","V","V","SSE4_1","","rw,r","","" +"PCMPEQW mm1, mm2/m64","PCMPEQW mm2/m64, mm1","pcmpeqw mm2/m64, mm1","0F 7= 5 /r","V","V","MMX","","rw,r","","" +"PCMPEQW xmm1, xmm2/m128","PCMPEQW xmm2/m128, xmm1","pcmpeqw xmm2/m128, xm= m1","66 0F 75 /r","V","V","SSE2","","rw,r","","" +"PCMPESTRI xmm1, xmm2/m128, imm8u","PCMPESTRI imm8u, xmm2/m128, xmm1","pcm= pestri imm8u, xmm2/m128, xmm1","66 0F 3A 61 /r ib","V","V","SSE4_2","","r,r= ,r","","" +"PCMPESTRM xmm1, xmm2/m128, imm8u","PCMPESTRM imm8u, xmm2/m128, xmm1","pcm= pestrm imm8u, xmm2/m128, xmm1","66 0F 3A 60 /r ib","V","V","SSE4_2","","r,r= ,r","","" +"PCMPGTB mm1, mm2/m64","PCMPGTB mm2/m64, mm1","pcmpgtb mm2/m64, mm1","0F 6= 4 /r","V","V","MMX","","rw,r","","" +"PCMPGTB xmm1, xmm2/m128","PCMPGTB xmm2/m128, xmm1","pcmpgtb xmm2/m128, xm= m1","66 0F 64 /r","V","V","SSE2","","rw,r","","" +"PCMPGTD mm1, mm2/m64","PCMPGTL mm2/m64, mm1","pcmpgtd mm2/m64, mm1","0F 6= 6 /r","V","V","MMX","","rw,r","","" +"PCMPGTD xmm1, xmm2/m128","PCMPGTL xmm2/m128, xmm1","pcmpgtd xmm2/m128, xm= m1","66 0F 66 /r","V","V","SSE2","","rw,r","","" +"PCMPGTQ xmm1, xmm2/m128","PCMPGTQ xmm2/m128, xmm1","pcmpgtq xmm2/m128, xm= m1","66 0F 38 37 /r","V","V","SSE4_2","","rw,r","","" +"PCMPGTW mm1, mm2/m64","PCMPGTW mm2/m64, mm1","pcmpgtw mm2/m64, mm1","0F 6= 5 /r","V","V","MMX","","rw,r","","" +"PCMPGTW xmm1, xmm2/m128","PCMPGTW xmm2/m128, xmm1","pcmpgtw xmm2/m128, xm= m1","66 0F 65 /r","V","V","SSE2","","rw,r","","" +"PCMPISTRI xmm1, xmm2/m128, imm8u","PCMPISTRI imm8u, xmm2/m128, xmm1","pcm= pistri imm8u, xmm2/m128, xmm1","66 0F 3A 63 /r ib","V","V","SSE4_2","","r,r= ,r","","" +"PCMPISTRM xmm1, xmm2/m128, imm8u","PCMPISTRM imm8u, xmm2/m128, xmm1","pcm= pistrm imm8u, xmm2/m128, xmm1","66 0F 3A 62 /r ib","V","V","SSE4_2","","r,r= ,r","","" +"PDEP r32, r32V, r/m32","PDEPL r/m32, r32V, r32","pdepl r/m32, r32V, r32",= "VEX.DDS.128.F2.0F38.W0 F5 /r","V","V","BMI2","","rw,r,r","Y","32" +"PDEP r64, r64V, r/m64","PDEPQ r/m64, r64V, r64","pdepq r/m64, r64V, r64",= "VEX.DDS.128.F2.0F38.W1 F5 /r","N.S.","V","BMI2","","rw,r,r","Y","64" +"PEXT r32, r32V, r/m32","PEXTL r/m32, r32V, r32","pextl r/m32, r32V, r32",= "VEX.DDS.128.F3.0F38.W0 F5 /r","V","V","BMI2","","rw,r,r","Y","32" +"PEXT r64, r64V, r/m64","PEXTQ r/m64, r64V, r64","pextq r/m64, r64V, r64",= "VEX.DDS.128.F3.0F38.W1 F5 /r","N.S.","V","BMI2","","rw,r,r","Y","64" +"PEXTRB r32/m8, xmm1, imm8u","PEXTRB imm8u, xmm1, r32/m8","pextrb imm8u, x= mm1, r32/m8","66 0F 3A 14 /r ib","V","V","SSE4_1","","w,r,r","","" +"PEXTRD r/m32, xmm1, imm8u","PEXTRD imm8u, xmm1, r/m32","pextrd imm8u, xmm= 1, r/m32","66 0F 3A 16 /r ib","V","V","SSE4_1","operand16,operand32","w,r,r= ","","" +"PEXTRQ r/m64, xmm1, imm8u","PEXTRQ imm8u, xmm1, r/m64","pextrq imm8u, xmm= 1, r/m64","66 REX.W 0F 3A 16 /r ib","N.S.","V","SSE4_1","","w,r,r","","" +"PEXTRW r32, mm2, imm8u","PEXTRW imm8u, mm2, r32","pextrw imm8u, mm2, r32"= ,"0F C5 /r ib","V","V","MMX","modrm_regonly","w,r,r","","" +"PEXTRW r32/m16, xmm1, imm8u","PEXTRW imm8u, xmm1, r32/m16","pextrw imm8u,= xmm1, r32/m16","66 0F 3A 15 /r ib","V","V","SSE4_1","","w,r,r","","" +"PEXTRW r32, xmm2, imm8u","PEXTRW imm8u, xmm2, r32","pextrw imm8u, xmm2, r= 32","66 0F C5 /r ib","V","V","SSE2","modrm_regonly","w,r,r","","" +"PF2ID mm1, mm2/m64","PF2ID mm2/m64, mm1","pf2id mm2/m64, mm1","0F 0F 1D /= r","V","V","3DNOW","amd","rw,r","","" +"PF2IW mm1, mm2/m64","PF2IW mm2/m64, mm1","pf2iw mm2/m64, mm1","0F 0F 1C /= r","V","V","3DNOW","amd","rw,r","","" +"PFACC mm1, 
mm2/m64","PFACC mm2/m64, mm1","pfacc mm2/m64, mm1","0F 0F AE /= r","V","V","3DNOW","amd","rw,r","","" +"PFADD mm1, mm2/m64","PFADD mm2/m64, mm1","pfadd mm2/m64, mm1","0F 0F 9E /= r","V","V","3DNOW","amd","rw,r","","" +"PFCMPEQ mm1, mm2/m64","PFCMPEQ mm2/m64, mm1","pfcmpeq mm2/m64, mm1","0F 0= F B0 /r","V","V","3DNOW","amd","rw,r","","" +"PFCMPGE mm1, mm2/m64","PFCMPGE mm2/m64, mm1","pfcmpge mm2/m64, mm1","0F 0= F 90 /r","V","V","3DNOW","amd","rw,r","","" +"PFCMPGT mm1, mm2/m64","PFCMPGT mm2/m64, mm1","pfcmpgt mm2/m64, mm1","0F 0= F A0 /r","V","V","3DNOW","amd","rw,r","","" +"PFCPIT1 mm1, mm2/m64","PFCPIT1 mm2/m64, mm1","pfcpit1 mm2/m64, mm1","0F 0= F A6 /r","V","V","3DNOW","amd","rw,r","","" +"PFMAX mm1, mm2/m64","PFMAX mm2/m64, mm1","pfmax mm2/m64, mm1","0F 0F A4 /= r","V","V","3DNOW","amd","rw,r","","" +"PFMIN mm1, mm2/m64","PFMIN mm2/m64, mm1","pfmin mm2/m64, mm1","0F 0F 94 /= r","V","V","3DNOW","amd","rw,r","","" +"PFMUL mm1, mm2/m64","PFMUL mm2/m64, mm1","pfmul mm2/m64, mm1","0F 0F B4 /= r","V","V","3DNOW","amd","rw,r","","" +"PFNACC mm1, mm2/m64","PFNACC mm2/m64, mm1","pfnacc mm2/m64, mm1","0F 0F 8= A /r","V","V","3DNOW","amd","rw,r","","" +"PFPNACC mm1, mm2/m64","PFPNACC mm2/m64, mm1","pfpnacc mm2/m64, mm1","0F 0= F 8E /r","V","V","3DNOW","amd","rw,r","","" +"PFRCP mm1, mm2/m64","PFRCP mm2/m64, mm1","pfrcp mm2/m64, mm1","0F 0F 96 /= r","V","V","3DNOW","amd","rw,r","","" +"PFRCPIT2 mm1, mm2/m64","PFRCPIT2 mm2/m64, mm1","pfrcpit2 mm2/m64, mm1","0= F 0F B6 /r","V","V","3DNOW","amd","rw,r","","" +"PFRSQIT1 mm1, mm2/m64","PFRSQIT1 mm2/m64, mm1","pfrsqit1 mm2/m64, mm1","0= F 0F A7 /r","V","V","3DNOW","amd","rw,r","","" +"PFSQRT mm1, mm2/m64","PFSQRT mm2/m64, mm1","pfsqrt mm2/m64, mm1","0F 0F 9= 7 /r","V","V","3DNOW","amd","rw,r","","" +"PFSUB mm1, mm2/m64","PFSUB mm2/m64, mm1","pfsub mm2/m64, mm1","0F 0F 9A /= r","V","V","3DNOW","amd","rw,r","","" +"PFSUBR mm1, mm2/m64","PFSUBR mm2/m64, mm1","pfsubr mm2/m64, mm1","0F 0F A= A /r","V","V","3DNOW","amd","rw,r","","" +"PHADDD mm1, mm2/m64","PHADDD mm2/m64, mm1","phaddd mm2/m64, mm1","0F 38 0= 2 /r","V","V","SSSE3","","rw,r","","" +"PHADDD xmm1, xmm2/m128","PHADDD xmm2/m128, xmm1","phaddd xmm2/m128, xmm1"= ,"66 0F 38 02 /r","V","V","SSSE3","","rw,r","","" +"PHADDSW mm1, mm2/m64","PHADDSW mm2/m64, mm1","phaddsw mm2/m64, mm1","0F 3= 8 03 /r","V","V","SSSE3","","rw,r","","" +"PHADDSW xmm1, xmm2/m128","PHADDSW xmm2/m128, xmm1","phaddsw xmm2/m128, xm= m1","66 0F 38 03 /r","V","V","SSSE3","","rw,r","","" +"PHADDW mm1, mm2/m64","PHADDW mm2/m64, mm1","phaddw mm2/m64, mm1","0F 38 0= 1 /r","V","V","SSSE3","","rw,r","","" +"PHADDW xmm1, xmm2/m128","PHADDW xmm2/m128, xmm1","phaddw xmm2/m128, xmm1"= ,"66 0F 38 01 /r","V","V","SSSE3","","rw,r","","" +"PHMINPOSUW xmm1, xmm2/m128","PHMINPOSUW xmm2/m128, xmm1","phminposuw xmm2= /m128, xmm1","66 0F 38 41 /r","V","V","SSE4_1","","w,r","","" +"PHSUBD mm1, mm2/m64","PHSUBD mm2/m64, mm1","phsubd mm2/m64, mm1","0F 38 0= 6 /r","V","V","SSSE3","","rw,r","","" +"PHSUBD xmm1, xmm2/m128","PHSUBD xmm2/m128, xmm1","phsubd xmm2/m128, xmm1"= ,"66 0F 38 06 /r","V","V","SSSE3","","rw,r","","" +"PHSUBSW mm1, mm2/m64","PHSUBSW mm2/m64, mm1","phsubsw mm2/m64, mm1","0F 3= 8 07 /r","V","V","SSSE3","","rw,r","","" +"PHSUBSW xmm1, xmm2/m128","PHSUBSW xmm2/m128, xmm1","phsubsw xmm2/m128, xm= m1","66 0F 38 07 /r","V","V","SSSE3","","rw,r","","" +"PHSUBW mm1, mm2/m64","PHSUBW mm2/m64, mm1","phsubw mm2/m64, mm1","0F 38 0= 5 /r","V","V","SSSE3","","rw,r","","" +"PHSUBW xmm1, xmm2/m128","PHSUBW xmm2/m128, xmm1","phsubw xmm2/m128, xmm1"= 
,"66 0F 38 05 /r","V","V","SSSE3","","rw,r","","" +"PI2FD mm1, mm2/m64","PI2FD mm2/m64, mm1","pi2fd mm2/m64, mm1","0F 0F 0D /= r","V","V","3DNOW","amd","rw,r","","" +"PI2FW mm1, mm2/m64","PI2FW mm2/m64, mm1","pi2fw mm2/m64, mm1","0F 0F 0C /= r","V","V","3DNOW","amd","rw,r","","" +"PINSRB xmm1, r32/m8, imm8u","PINSRB imm8u, r32/m8, xmm1","pinsrb imm8u, r= 32/m8, xmm1","66 0F 3A 20 /r ib","V","V","SSE4_1","","rw,r,r","","" +"PINSRD xmm1, r/m32, imm8u","PINSRD imm8u, r/m32, xmm1","pinsrd imm8u, r/m= 32, xmm1","66 0F 3A 22 /r ib","V","V","SSE4_1","operand16,operand32","rw,r,= r","","" +"PINSRQ xmm1, r/m64, imm8u","PINSRQ imm8u, r/m64, xmm1","pinsrq imm8u, r/m= 64, xmm1","66 REX.W 0F 3A 22 /r ib","N.S.","V","SSE4_1","","rw,r,r","","" +"PINSRW mm1, r32/m16, imm8u","PINSRW imm8u, r32/m16, mm1","pinsrw imm8u, r= 32/m16, mm1","0F C4 /r ib","V","V","MMX","","rw,r,r","","" +"PINSRW xmm1, r32/m16, imm8u","PINSRW imm8u, r32/m16, xmm1","pinsrw imm8u,= r32/m16, xmm1","66 0F C4 /r ib","V","V","SSE2","","rw,r,r","","" +"PMADDUBSW mm1, mm2/m64","PMADDUBSW mm2/m64, mm1","pmaddubsw mm2/m64, mm1"= ,"0F 38 04 /r","V","V","SSSE3","","rw,r","","" +"PMADDUBSW xmm1, xmm2/m128","PMADDUBSW xmm2/m128, xmm1","pmaddubsw xmm2/m1= 28, xmm1","66 0F 38 04 /r","V","V","SSSE3","","rw,r","","" +"PMADDWD mm1, mm2/m64","PMADDWL mm2/m64, mm1","pmaddwd mm2/m64, mm1","0F F= 5 /r","V","V","MMX","","rw,r","","" +"PMADDWD xmm1, xmm2/m128","PMADDWL xmm2/m128, xmm1","pmaddwd xmm2/m128, xm= m1","66 0F F5 /r","V","V","SSE2","","rw,r","","" +"PMAXSB xmm1, xmm2/m128","PMAXSB xmm2/m128, xmm1","pmaxsb xmm2/m128, xmm1"= ,"66 0F 38 3C /r","V","V","SSE4_1","","rw,r","","" +"PMAXSD xmm1, xmm2/m128","PMAXSD xmm2/m128, xmm1","pmaxsd xmm2/m128, xmm1"= ,"66 0F 38 3D /r","V","V","SSE4_1","","rw,r","","" +"PMAXSW mm1, mm2/m64","PMAXSW mm2/m64, mm1","pmaxsw mm2/m64, mm1","0F EE /= r","V","V","MMX","","rw,r","","" +"PMAXSW xmm1, xmm2/m128","PMAXSW xmm2/m128, xmm1","pmaxsw xmm2/m128, xmm1"= ,"66 0F EE /r","V","V","SSE2","","rw,r","","" +"PMAXUB mm1, mm2/m64","PMAXUB mm2/m64, mm1","pmaxub mm2/m64, mm1","0F DE /= r","V","V","MMX","","rw,r","","" +"PMAXUB xmm1, xmm2/m128","PMAXUB xmm2/m128, xmm1","pmaxub xmm2/m128, xmm1"= ,"66 0F DE /r","V","V","SSE2","","rw,r","","" +"PMAXUD xmm1, xmm2/m128","PMAXUD xmm2/m128, xmm1","pmaxud xmm2/m128, xmm1"= ,"66 0F 38 3F /r","V","V","SSE4_1","","rw,r","","" +"PMAXUW xmm1, xmm2/m128","PMAXUW xmm2/m128, xmm1","pmaxuw xmm2/m128, xmm1"= ,"66 0F 38 3E /r","V","V","SSE4_1","","rw,r","","" +"PMINSB xmm1, xmm2/m128","PMINSB xmm2/m128, xmm1","pminsb xmm2/m128, xmm1"= ,"66 0F 38 38 /r","V","V","SSE4_1","","rw,r","","" +"PMINSD xmm1, xmm2/m128","PMINSD xmm2/m128, xmm1","pminsd xmm2/m128, xmm1"= ,"66 0F 38 39 /r","V","V","SSE4_1","","rw,r","","" +"PMINSW mm1, mm2/m64","PMINSW mm2/m64, mm1","pminsw mm2/m64, mm1","0F EA /= r","V","V","MMX","","rw,r","","" +"PMINSW xmm1, xmm2/m128","PMINSW xmm2/m128, xmm1","pminsw xmm2/m128, xmm1"= ,"66 0F EA /r","V","V","SSE2","","rw,r","","" +"PMINUB mm1, mm2/m64","PMINUB mm2/m64, mm1","pminub mm2/m64, mm1","0F DA /= r","V","V","MMX","","rw,r","","" +"PMINUB xmm1, xmm2/m128","PMINUB xmm2/m128, xmm1","pminub xmm2/m128, xmm1"= ,"66 0F DA /r","V","V","SSE2","","rw,r","","" +"PMINUD xmm1, xmm2/m128","PMINUD xmm2/m128, xmm1","pminud xmm2/m128, xmm1"= ,"66 0F 38 3B /r","V","V","SSE4_1","","rw,r","","" +"PMINUW xmm1, xmm2/m128","PMINUW xmm2/m128, xmm1","pminuw xmm2/m128, xmm1"= ,"66 0F 38 3A /r","V","V","SSE4_1","","rw,r","","" +"PMOVMSKB r32, mm2","PMOVMSKB mm2, r32","pmovmskb mm2, r32","0F D7 /r","V"= 
,"V","SSE","modrm_regonly","w,r","","" +"PMOVMSKB r32, xmm2","PMOVMSKB xmm2, r32","pmovmskb xmm2, r32","66 0F D7 /= r","V","V","SSE2","modrm_regonly","w,r","","" +"PMOVSXBD xmm1, xmm2/m32","PMOVSXBD xmm2/m32, xmm1","pmovsxbd xmm2/m32, xm= m1","66 0F 38 21 /r","V","V","SSE4_1","","w,r","","" +"PMOVSXBQ xmm1, xmm2/m16","PMOVSXBQ xmm2/m16, xmm1","pmovsxbq xmm2/m16, xm= m1","66 0F 38 22 /r","V","V","SSE4_1","","w,r","","" +"PMOVSXBW xmm1, xmm2/m64","PMOVSXBW xmm2/m64, xmm1","pmovsxbw xmm2/m64, xm= m1","66 0F 38 20 /r","V","V","SSE4_1","","w,r","","" +"PMOVSXDQ xmm1, xmm2/m64","PMOVSXDQ xmm2/m64, xmm1","pmovsxdq xmm2/m64, xm= m1","66 0F 38 25 /r","V","V","SSE4_1","","w,r","","" +"PMOVSXWD xmm1, xmm2/m64","PMOVSXWD xmm2/m64, xmm1","pmovsxwd xmm2/m64, xm= m1","66 0F 38 23 /r","V","V","SSE4_1","","w,r","","" +"PMOVSXWQ xmm1, xmm2/m32","PMOVSXWQ xmm2/m32, xmm1","pmovsxwq xmm2/m32, xm= m1","66 0F 38 24 /r","V","V","SSE4_1","","w,r","","" +"PMOVZXBD xmm1, xmm2/m32","PMOVZXBD xmm2/m32, xmm1","pmovzxbd xmm2/m32, xm= m1","66 0F 38 31 /r","V","V","SSE4_1","","w,r","","" +"PMOVZXBQ xmm1, xmm2/m16","PMOVZXBQ xmm2/m16, xmm1","pmovzxbq xmm2/m16, xm= m1","66 0F 38 32 /r","V","V","SSE4_1","","w,r","","" +"PMOVZXBW xmm1, xmm2/m64","PMOVZXBW xmm2/m64, xmm1","pmovzxbw xmm2/m64, xm= m1","66 0F 38 30 /r","V","V","SSE4_1","","w,r","","" +"PMOVZXDQ xmm1, xmm2/m64","PMOVZXDQ xmm2/m64, xmm1","pmovzxdq xmm2/m64, xm= m1","66 0F 38 35 /r","V","V","SSE4_1","","w,r","","" +"PMOVZXWD xmm1, xmm2/m64","PMOVZXWD xmm2/m64, xmm1","pmovzxwd xmm2/m64, xm= m1","66 0F 38 33 /r","V","V","SSE4_1","","w,r","","" +"PMOVZXWQ xmm1, xmm2/m32","PMOVZXWQ xmm2/m32, xmm1","pmovzxwq xmm2/m32, xm= m1","66 0F 38 34 /r","V","V","SSE4_1","","w,r","","" +"PMULDQ xmm1, xmm2/m128","PMULDQ xmm2/m128, xmm1","pmuldq xmm2/m128, xmm1"= ,"66 0F 38 28 /r","V","V","SSE4_1","","rw,r","","" +"PMULHRSW mm1, mm2/m64","PMULHRSW mm2/m64, mm1","pmulhrsw mm2/m64, mm1","0= F 38 0B /r","V","V","SSSE3","","rw,r","","" +"PMULHRSW xmm1, xmm2/m128","PMULHRSW xmm2/m128, xmm1","pmulhrsw xmm2/m128,= xmm1","66 0F 38 0B /r","V","V","SSSE3","","rw,r","","" +"PMULHRW mm1, mm2/m64","PMULHRW mm2/m64, mm1","pmulhrw mm2/m64, mm1","0F 0= F B7 /r","V","V","3DNOW","amd","rw,r","","" +"PMULHUW mm1, mm2/m64","PMULHUW mm2/m64, mm1","pmulhuw mm2/m64, mm1","0F E= 4 /r","V","V","MMX","","rw,r","","" +"PMULHUW xmm1, xmm2/m128","PMULHUW xmm2/m128, xmm1","pmulhuw xmm2/m128, xm= m1","66 0F E4 /r","V","V","SSE2","","rw,r","","" +"PMULHW mm1, mm2/m64","PMULHW mm2/m64, mm1","pmulhw mm2/m64, mm1","0F E5 /= r","V","V","MMX","","rw,r","","" +"PMULHW xmm1, xmm2/m128","PMULHW xmm2/m128, xmm1","pmulhw xmm2/m128, xmm1"= ,"66 0F E5 /r","V","V","SSE2","","rw,r","","" +"PMULLD xmm1, xmm2/m128","PMULLD xmm2/m128, xmm1","pmulld xmm2/m128, xmm1"= ,"66 0F 38 40 /r","V","V","SSE4_1","","rw,r","","" +"PMULLW mm1, mm2/m64","PMULLW mm2/m64, mm1","pmullw mm2/m64, mm1","0F D5 /= r","V","V","MMX","","rw,r","","" +"PMULLW xmm1, xmm2/m128","PMULLW xmm2/m128, xmm1","pmullw xmm2/m128, xmm1"= ,"66 0F D5 /r","V","V","SSE2","","rw,r","","" +"PMULUDQ mm1, mm2/m64","PMULULQ mm2/m64, mm1","pmuludq mm2/m64, mm1","0F F= 4 /r","V","V","SSE2","","rw,r","","" +"PMULUDQ xmm1, xmm2/m128","PMULULQ xmm2/m128, xmm1","pmuludq xmm2/m128, xm= m1","66 0F F4 /r","V","V","SSE2","","rw,r","","" +"POPAD","POPAL","popal","61","V","N.S.","","operand32","","","" +"POPA","POPAW","popaw","61","V","N.S.","","operand16","","","" +"POPCNT r32, r/m32","POPCNTL r/m32, r32","popcntl r/m32, r32","F3 0F B8 /r= ","V","V","POPCNT","operand32","w,r","Y","32" 
+"POPCNT r64, r/m64","POPCNTQ r/m64, r64","popcntq r/m64, r64","F3 REX.W 0F= B8 /r","N.S.","V","POPCNT","","w,r","Y","64" +"POPCNT r16, r/m16","POPCNTW r/m16, r16","popcntw r/m16, r16","F3 0F B8 /r= ","V","V","POPCNT","operand16","w,r","Y","16" +"POPFD","POPFL","popfl","9D","V","N.S.","","operand32","","","" +"POPFQ","POPFQ","popfq","9D","N.S.","V","","default64","","","" +"POPF","POPFW","popfw","9D","V","V","","operand16","","","" +"POP r/m32","POPL r/m32","popl r/m32","8F /0","V","N.S.","","operand32","w= ","Y","32" +"POP r32op","POPL r32op","popl r32op","58+rd","V","N.S.","","operand32","w= ","Y","32" +"POP r/m64","POPQ r/m64","popq r/m64","8F /0","N.S.","V","","default64","w= ","Y","64" +"POP r64op","POPQ r64op","popq r64op","58+ro","N.S.","V","","default64","w= ","Y","64" +"POP r/m16","POPW r/m16","popw r/m16","8F /0","V","V","","operand16","w","= Y","16" +"POP r16op","POPW r16op","popw r16op","58+rw","V","V","","operand16","w","= Y","16" +"POP DS","POPW/POPL/POPQ DS","popw/popl/popq DS","1F","V","N.S.","","","w"= ,"Y","" +"POP ES","POPW/POPL/POPQ ES","popw/popl/popq ES","07","V","N.S.","","","w"= ,"Y","" +"POP FS","POPW/POPL/POPQ FS","popw/popl/popq FS","0F A1","N.S.","V","","de= fault64","w","Y","" +"POP FS","POPW/POPL/POPQ FS","popw/popl/popq FS","0F A1","V","N.S.","","op= erand32","w","Y","" +"POP FS","POPW/POPL/POPQ FS","popw/popl/popq FS","0F A1","V","V","","opera= nd16","w","Y","" +"POP GS","POPW/POPL/POPQ GS","popw/popl/popq GS","0F A9","N.S.","V","","de= fault64","w","Y","" +"POP GS","POPW/POPL/POPQ GS","popw/popl/popq GS","0F A9","V","V","","opera= nd16","w","Y","" +"POP GS","POPW/POPL/POPQ GS","popw/popl/popq GS","0F A9","V","N.S.","","op= erand32","w","Y","" +"POP SS","POPW/POPL/POPQ SS","popw/popl/popq SS","17","V","N.S.","","","w"= ,"Y","" +"POR mm1, mm2/m64","POR mm2/m64, mm1","por mm2/m64, mm1","0F EB /r","V","V= ","MMX","","rw,r","","" +"POR xmm1, xmm2/m128","POR xmm2/m128, xmm1","por xmm2/m128, xmm1","66 0F E= B /r","V","V","SSE2","","rw,r","","" +"PREFETCHNTA m8","PREFETCHNTA m8","prefetchnta m8","0F 18 /0","V","V","","= modrm_memonly","r","","" +"PREFETCHT0 m8","PREFETCHT0 m8","prefetcht0 m8","0F 18 /1","V","V","","mod= rm_memonly","r","","" +"PREFETCHT1 m8","PREFETCHT1 m8","prefetcht1 m8","0F 18 /2","V","V","","mod= rm_memonly","r","","" +"PREFETCHT2 m8","PREFETCHT2 m8","prefetcht2 m8","0F 18 /3","V","V","","mod= rm_memonly","r","","" +"PREFETCHW m8","PREFETCHW m8","prefetchw m8","0F 0D /1","V","V","PRFCHW","= modrm_memonly","r","","" +"PREFETCHWT1 m8","PREFETCHWT1 m8","prefetchwt1 m8","0F 0D /2","V","V","PRE= FETCHWT1","modrm_memonly","r","","" +"PREFETCHW_ALIAS m8","PREFETCHW_ALIAS m8","prefetchw_alias m8","0F 0D /3",= "V","V","PRFCHW","modrm_memonly","r","","" +"PREFETCH_EXCLUSIVE m8","PREFETCH_EXCLUSIVE m8","prefetch_exclusive m8","0= F 0D /0","V","V","PRFCHW","modrm_memonly","r","","" +"PREFETCH_RESERVED m8","PREFETCH_RESERVED m8","prefetch_reserved m8","0F 0= D /2","V","V","PRFCHW","modrm_memonly","r","Y","" +"PREFETCH_RESERVED m8","PREFETCH_RESERVED m8","prefetch_reserved m8","0F 0= D /4","V","V","PRFCHW","modrm_memonly","r","Y","" +"PREFETCH_RESERVED m8","PREFETCH_RESERVED m8","prefetch_reserved m8","0F 0= D /5","V","V","PRFCHW","modrm_memonly","r","Y","" +"PREFETCH_RESERVED m8","PREFETCH_RESERVED m8","prefetch_reserved m8","0F 0= D /6","V","V","PRFCHW","modrm_memonly","r","Y","" +"PREFETCH_RESERVED m8","PREFETCH_RESERVED m8","prefetch_reserved m8","0F 0= D /7","V","V","PRFCHW","modrm_memonly","r","Y","" +"PSADBW mm1, mm2/m64","PSADBW mm2/m64, mm1","psadbw 
mm2/m64, mm1","0F F6 /= r","V","V","MMX","","rw,r","","" +"PSADBW xmm1, xmm2/m128","PSADBW xmm2/m128, xmm1","psadbw xmm2/m128, xmm1"= ,"66 0F F6 /r","V","V","SSE2","","rw,r","","" +"PSHUFB mm1, mm2/m64","PSHUFB mm2/m64, mm1","pshufb mm2/m64, mm1","0F 38 0= 0 /r","V","V","SSSE3","","rw,r","","" +"PSHUFB xmm1, xmm2/m128","PSHUFB xmm2/m128, xmm1","pshufb xmm2/m128, xmm1"= ,"66 0F 38 00 /r","V","V","SSSE3","","rw,r","","" +"PSHUFD xmm1, xmm2/m128, imm8u","PSHUFD imm8u, xmm2/m128, xmm1","pshufd im= m8u, xmm2/m128, xmm1","66 0F 70 /r ib","V","V","SSE2","","w,r,r","","" +"PSHUFHW xmm1, xmm2/m128, imm8u","PSHUFHW imm8u, xmm2/m128, xmm1","pshufhw= imm8u, xmm2/m128, xmm1","F3 0F 70 /r ib","V","V","SSE2","","w,r,r","","" +"PSHUFLW xmm1, xmm2/m128, imm8u","PSHUFLW imm8u, xmm2/m128, xmm1","pshuflw= imm8u, xmm2/m128, xmm1","F2 0F 70 /r ib","V","V","SSE2","","w,r,r","","" +"PSHUFW mm1, mm2/m64, imm8u","PSHUFW imm8u, mm2/m64, mm1","pshufw imm8u, m= m2/m64, mm1","0F 70 /r ib","V","V","MMX","","w,r,r","","" +"PSIGNB mm1, mm2/m64","PSIGNB mm2/m64, mm1","psignb mm2/m64, mm1","0F 38 0= 8 /r","V","V","SSSE3","","rw,r","","" +"PSIGNB xmm1, xmm2/m128","PSIGNB xmm2/m128, xmm1","psignb xmm2/m128, xmm1"= ,"66 0F 38 08 /r","V","V","SSSE3","","rw,r","","" +"PSIGND mm1, mm2/m64","PSIGND mm2/m64, mm1","psignd mm2/m64, mm1","0F 38 0= A /r","V","V","SSSE3","","rw,r","","" +"PSIGND xmm1, xmm2/m128","PSIGND xmm2/m128, xmm1","psignd xmm2/m128, xmm1"= ,"66 0F 38 0A /r","V","V","SSSE3","","rw,r","","" +"PSIGNW mm1, mm2/m64","PSIGNW mm2/m64, mm1","psignw mm2/m64, mm1","0F 38 0= 9 /r","V","V","SSSE3","","rw,r","","" +"PSIGNW xmm1, xmm2/m128","PSIGNW xmm2/m128, xmm1","psignw xmm2/m128, xmm1"= ,"66 0F 38 09 /r","V","V","SSSE3","","rw,r","","" +"PSLLD mm2, imm8u","PSLLL imm8u, mm2","pslld imm8u, mm2","0F 72 /6 ib","V"= ,"V","MMX","modrm_regonly","rw,r","","" +"PSLLD xmm2, imm8u","PSLLL imm8u, xmm2","pslld imm8u, xmm2","66 0F 72 /6 i= b","V","V","SSE2","modrm_regonly","rw,r","","" +"PSLLD mm1, mm2/m64","PSLLL mm2/m64, mm1","pslld mm2/m64, mm1","0F F2 /r",= "V","V","MMX","","rw,r","","" +"PSLLD xmm1, xmm2/m128","PSLLL xmm2/m128, xmm1","pslld xmm2/m128, xmm1","6= 6 0F F2 /r","V","V","SSE2","","rw,r","","" +"PSLLDQ xmm2, imm8u","PSLLO imm8u, xmm2","pslldq imm8u, xmm2","66 0F 73 /7= ib","V","V","SSE2","modrm_regonly","rw,r","","" +"PSLLQ mm2, imm8u","PSLLQ imm8u, mm2","psllq imm8u, mm2","0F 73 /6 ib","V"= ,"V","MMX","modrm_regonly","rw,r","","" +"PSLLQ xmm2, imm8u","PSLLQ imm8u, xmm2","psllq imm8u, xmm2","66 0F 73 /6 i= b","V","V","SSE2","modrm_regonly","rw,r","","" +"PSLLQ mm1, mm2/m64","PSLLQ mm2/m64, mm1","psllq mm2/m64, mm1","0F F3 /r",= "V","V","MMX","","rw,r","","" +"PSLLQ xmm1, xmm2/m128","PSLLQ xmm2/m128, xmm1","psllq xmm2/m128, xmm1","6= 6 0F F3 /r","V","V","SSE2","","rw,r","","" +"PSLLW mm2, imm8u","PSLLW imm8u, mm2","psllw imm8u, mm2","0F 71 /6 ib","V"= ,"V","MMX","modrm_regonly","rw,r","","" +"PSLLW xmm2, imm8u","PSLLW imm8u, xmm2","psllw imm8u, xmm2","66 0F 71 /6 i= b","V","V","SSE2","modrm_regonly","rw,r","","" +"PSLLW mm1, mm2/m64","PSLLW mm2/m64, mm1","psllw mm2/m64, mm1","0F F1 /r",= "V","V","MMX","","rw,r","","" +"PSLLW xmm1, xmm2/m128","PSLLW xmm2/m128, xmm1","psllw xmm2/m128, xmm1","6= 6 0F F1 /r","V","V","SSE2","","rw,r","","" +"PSRAD mm2, imm8u","PSRAL imm8u, mm2","psrad imm8u, mm2","0F 72 /4 ib","V"= ,"V","MMX","modrm_regonly","rw,r","","" +"PSRAD xmm2, imm8u","PSRAL imm8u, xmm2","psrad imm8u, xmm2","66 0F 72 /4 i= b","V","V","SSE2","modrm_regonly","rw,r","","" +"PSRAD mm1, mm2/m64","PSRAL mm2/m64, mm1","psrad 
mm2/m64, mm1","0F E2 /r",= "V","V","MMX","","rw,r","","" +"PSRAD xmm1, xmm2/m128","PSRAL xmm2/m128, xmm1","psrad xmm2/m128, xmm1","6= 6 0F E2 /r","V","V","SSE2","","rw,r","","" +"PSRAW mm2, imm8u","PSRAW imm8u, mm2","psraw imm8u, mm2","0F 71 /4 ib","V"= ,"V","MMX","modrm_regonly","rw,r","","" +"PSRAW xmm2, imm8u","PSRAW imm8u, xmm2","psraw imm8u, xmm2","66 0F 71 /4 i= b","V","V","SSE2","modrm_regonly","rw,r","","" +"PSRAW mm1, mm2/m64","PSRAW mm2/m64, mm1","psraw mm2/m64, mm1","0F E1 /r",= "V","V","MMX","","rw,r","","" +"PSRAW xmm1, xmm2/m128","PSRAW xmm2/m128, xmm1","psraw xmm2/m128, xmm1","6= 6 0F E1 /r","V","V","SSE2","","rw,r","","" +"PSRLD mm2, imm8u","PSRLL imm8u, mm2","psrld imm8u, mm2","0F 72 /2 ib","V"= ,"V","MMX","modrm_regonly","rw,r","","" +"PSRLD xmm2, imm8u","PSRLL imm8u, xmm2","psrld imm8u, xmm2","66 0F 72 /2 i= b","V","V","SSE2","modrm_regonly","rw,r","","" +"PSRLD mm1, mm2/m64","PSRLL mm2/m64, mm1","psrld mm2/m64, mm1","0F D2 /r",= "V","V","MMX","","rw,r","","" +"PSRLD xmm1, xmm2/m128","PSRLL xmm2/m128, xmm1","psrld xmm2/m128, xmm1","6= 6 0F D2 /r","V","V","SSE2","","rw,r","","" +"PSRLDQ xmm2, imm8u","PSRLO imm8u, xmm2","psrldq imm8u, xmm2","66 0F 73 /3= ib","V","V","SSE2","modrm_regonly","rw,r","","" +"PSRLQ mm2, imm8u","PSRLQ imm8u, mm2","psrlq imm8u, mm2","0F 73 /2 ib","V"= ,"V","MMX","modrm_regonly","rw,r","","" +"PSRLQ xmm2, imm8u","PSRLQ imm8u, xmm2","psrlq imm8u, xmm2","66 0F 73 /2 i= b","V","V","SSE2","modrm_regonly","rw,r","","" +"PSRLQ mm1, mm2/m64","PSRLQ mm2/m64, mm1","psrlq mm2/m64, mm1","0F D3 /r",= "V","V","MMX","","rw,r","","" +"PSRLQ xmm1, xmm2/m128","PSRLQ xmm2/m128, xmm1","psrlq xmm2/m128, xmm1","6= 6 0F D3 /r","V","V","SSE2","","rw,r","","" +"PSRLW mm2, imm8u","PSRLW imm8u, mm2","psrlw imm8u, mm2","0F 71 /2 ib","V"= ,"V","MMX","modrm_regonly","rw,r","","" +"PSRLW xmm2, imm8u","PSRLW imm8u, xmm2","psrlw imm8u, xmm2","66 0F 71 /2 i= b","V","V","SSE2","modrm_regonly","rw,r","","" +"PSRLW mm1, mm2/m64","PSRLW mm2/m64, mm1","psrlw mm2/m64, mm1","0F D1 /r",= "V","V","MMX","","rw,r","","" +"PSRLW xmm1, xmm2/m128","PSRLW xmm2/m128, xmm1","psrlw xmm2/m128, xmm1","6= 6 0F D1 /r","V","V","SSE2","","rw,r","","" +"PSUBB mm1, mm2/m64","PSUBB mm2/m64, mm1","psubb mm2/m64, mm1","0F F8 /r",= "V","V","MMX","","rw,r","","" +"PSUBB xmm1, xmm2/m128","PSUBB xmm2/m128, xmm1","psubb xmm2/m128, xmm1","6= 6 0F F8 /r","V","V","SSE2","","rw,r","","" +"PSUBD mm1, mm2/m64","PSUBL mm2/m64, mm1","psubd mm2/m64, mm1","0F FA /r",= "V","V","MMX","","rw,r","","" +"PSUBD xmm1, xmm2/m128","PSUBL xmm2/m128, xmm1","psubd xmm2/m128, xmm1","6= 6 0F FA /r","V","V","SSE2","","rw,r","","" +"PSUBQ mm1, mm2/m64","PSUBQ mm2/m64, mm1","psubq mm2/m64, mm1","0F FB /r",= "V","V","SSE2","","rw,r","","" +"PSUBQ xmm1, xmm2/m128","PSUBQ xmm2/m128, xmm1","psubq xmm2/m128, xmm1","6= 6 0F FB /r","V","V","SSE2","","rw,r","","" +"PSUBSB mm1, mm2/m64","PSUBSB mm2/m64, mm1","psubsb mm2/m64, mm1","0F E8 /= r","V","V","MMX","","rw,r","","" +"PSUBSB xmm1, xmm2/m128","PSUBSB xmm2/m128, xmm1","psubsb xmm2/m128, xmm1"= ,"66 0F E8 /r","V","V","SSE2","","rw,r","","" +"PSUBSW mm1, mm2/m64","PSUBSW mm2/m64, mm1","psubsw mm2/m64, mm1","0F E9 /= r","V","V","MMX","","rw,r","","" +"PSUBSW xmm1, xmm2/m128","PSUBSW xmm2/m128, xmm1","psubsw xmm2/m128, xmm1"= ,"66 0F E9 /r","V","V","SSE2","","rw,r","","" +"PSUBUSB mm1, mm2/m64","PSUBUSB mm2/m64, mm1","psubusb mm2/m64, mm1","0F D= 8 /r","V","V","MMX","","rw,r","","" +"PSUBUSB xmm1, xmm2/m128","PSUBUSB xmm2/m128, xmm1","psubusb xmm2/m128, xm= m1","66 0F D8 
/r","V","V","SSE2","","rw,r","","" +"PSUBUSW mm1, mm2/m64","PSUBUSW mm2/m64, mm1","psubusw mm2/m64, mm1","0F D= 9 /r","V","V","MMX","","rw,r","","" +"PSUBUSW xmm1, xmm2/m128","PSUBUSW xmm2/m128, xmm1","psubusw xmm2/m128, xm= m1","66 0F D9 /r","V","V","SSE2","","rw,r","","" +"PSUBW mm1, mm2/m64","PSUBW mm2/m64, mm1","psubw mm2/m64, mm1","0F F9 /r",= "V","V","MMX","","rw,r","","" +"PSUBW xmm1, xmm2/m128","PSUBW xmm2/m128, xmm1","psubw xmm2/m128, xmm1","6= 6 0F F9 /r","V","V","SSE2","","rw,r","","" +"PSWAPD mm1, mm2/m64","PSWAPD mm2/m64, mm1","pswapd mm2/m64, mm1","0F 0F B= B /r","V","V","3DNOW","amd","rw,r","","" +"PTEST xmm1, xmm2/m128","PTEST xmm2/m128, xmm1","ptest xmm2/m128, xmm1","6= 6 0F 38 17 /r","V","V","SSE4_1","","r,r","","" +"PTWRITE r/m32","PTWRITEL r/m32","ptwritel r/m32","F3 0F AE /4","V","V",""= ,"operand16,operand32","r","Y","32" +"PTWRITE r/m64","PTWRITEQ r/m64","ptwriteq r/m64","F3 REX.W 0F AE /4","N.S= .","V","","","r","Y","64" +"PUNPCKHBW mm1, mm2/m64","PUNPCKHBW mm2/m64, mm1","punpckhbw mm2/m64, mm1"= ,"0F 68 /r","V","V","MMX","","rw,r","","" +"PUNPCKHBW xmm1, xmm2/m128","PUNPCKHBW xmm2/m128, xmm1","punpckhbw xmm2/m1= 28, xmm1","66 0F 68 /r","V","V","SSE2","","rw,r","","" +"PUNPCKHDQ mm1, mm2/m64","PUNPCKHLQ mm2/m64, mm1","punpckhdq mm2/m64, mm1"= ,"0F 6A /r","V","V","MMX","","rw,r","","" +"PUNPCKHDQ xmm1, xmm2/m128","PUNPCKHLQ xmm2/m128, xmm1","punpckhdq xmm2/m1= 28, xmm1","66 0F 6A /r","V","V","SSE2","","rw,r","","" +"PUNPCKHQDQ xmm1, xmm2/m128","PUNPCKHQDQ xmm2/m128, xmm1","punpckhqdq xmm2= /m128, xmm1","66 0F 6D /r","V","V","SSE2","","rw,r","","" +"PUNPCKHWD mm1, mm2/m64","PUNPCKHWL mm2/m64, mm1","punpckhwd mm2/m64, mm1"= ,"0F 69 /r","V","V","MMX","","rw,r","","" +"PUNPCKHWD xmm1, xmm2/m128","PUNPCKHWL xmm2/m128, xmm1","punpckhwd xmm2/m1= 28, xmm1","66 0F 69 /r","V","V","SSE2","","rw,r","","" +"PUNPCKLBW mm1, mm2/m32","PUNPCKLBW mm2/m32, mm1","punpcklbw mm2/m32, mm1"= ,"0F 60 /r","V","V","MMX","","rw,r","","" +"PUNPCKLBW xmm1, xmm2/m128","PUNPCKLBW xmm2/m128, xmm1","punpcklbw xmm2/m1= 28, xmm1","66 0F 60 /r","V","V","SSE2","","rw,r","","" +"PUNPCKLDQ mm1, mm2/m32","PUNPCKLLQ mm2/m32, mm1","punpckldq mm2/m32, mm1"= ,"0F 62 /r","V","V","MMX","","rw,r","","" +"PUNPCKLDQ xmm1, xmm2/m128","PUNPCKLLQ xmm2/m128, xmm1","punpckldq xmm2/m1= 28, xmm1","66 0F 62 /r","V","V","SSE2","","rw,r","","" +"PUNPCKLQDQ xmm1, xmm2/m128","PUNPCKLQDQ xmm2/m128, xmm1","punpcklqdq xmm2= /m128, xmm1","66 0F 6C /r","V","V","SSE2","","rw,r","","" +"PUNPCKLWD mm1, mm2/m32","PUNPCKLWL mm2/m32, mm1","punpcklwd mm2/m32, mm1"= ,"0F 61 /r","V","V","MMX","","rw,r","","" +"PUNPCKLWD xmm1, xmm2/m128","PUNPCKLWL xmm2/m128, xmm1","punpcklwd xmm2/m1= 28, xmm1","66 0F 61 /r","V","V","SSE2","","rw,r","","" +"PUSHAD","PUSHAL","pushal","60","V","N.S.","","operand32","","","" +"PUSHA","PUSHAW","pushaw","60","V","N.S.","","operand16","","","" +"PUSHFD","PUSHFL","pushfl","9C","V","N.S.","","operand32","","","" +"PUSHFQ","PUSHFQ","pushfq","9C","N.S.","V","","default64","","","" +"PUSHF","PUSHFW","pushfw","9C","V","V","","operand16","","","" +"PUSH r/m32","PUSHL r/m32","pushl r/m32","FF /6","V","N.S.","","operand32"= ,"r","Y","32" +"PUSH r32op","PUSHL r32op","pushl r32op","50+rd","V","N.S.","","operand32"= ,"r","Y","32" +"PUSH r/m64","PUSHQ r/m64","pushq r/m64","FF /6","N.S.","V","","default64"= ,"r","Y","64" +"PUSH r64op","PUSHQ r64op","pushq r64op","50+ro","N.S.","V","","default64"= ,"r","Y","64" +"PUSH imm16","PUSHW imm16","pushw imm16","68 iw","V","V","","operand16","r= ","Y","" +"PUSH r/m16","PUSHW r/m16","pushw 
r/m16","FF /6","V","V","","operand16","r= ","Y","16" +"PUSH r16op","PUSHW r16op","pushw r16op","50+rw","V","V","","operand16","r= ","Y","16" +"PUSH CS","PUSHW/PUSHL/PUSHQ CS","pushw/pushl/pushq CS","0E","V","N.S.",""= ,"","r","Y","" +"PUSH DS","PUSHW/PUSHL/PUSHQ DS","pushw/pushl/pushq DS","1E","V","N.S.",""= ,"","r","Y","" +"PUSH ES","PUSHW/PUSHL/PUSHQ ES","pushw/pushl/pushq ES","06","V","N.S.",""= ,"","r","Y","" +"PUSH FS","PUSHW/PUSHL/PUSHQ FS","pushw/pushl/pushq FS","0F A0","V","V",""= ,"operand16","r","Y","" +"PUSH FS","PUSHW/PUSHL/PUSHQ FS","pushw/pushl/pushq FS","0F A0","N.S.","V"= ,"","default64","r","Y","" +"PUSH FS","PUSHW/PUSHL/PUSHQ FS","pushw/pushl/pushq FS","0F A0","V","N.S."= ,"","operand32","r","Y","" +"PUSH GS","PUSHW/PUSHL/PUSHQ GS","pushw/pushl/pushq GS","0F A8","N.S.","V"= ,"","default64","r","Y","" +"PUSH GS","PUSHW/PUSHL/PUSHQ GS","pushw/pushl/pushq GS","0F A8","V","N.S."= ,"","operand32","r","Y","" +"PUSH GS","PUSHW/PUSHL/PUSHQ GS","pushw/pushl/pushq GS","0F A8","V","V",""= ,"operand16","r","Y","" +"PUSH SS","PUSHW/PUSHL/PUSHQ SS","pushw/pushl/pushq SS","16","V","N.S.",""= ,"","r","Y","" +"PUSH imm8","PUSHW/PUSHL/PUSHQ imm8","pushw/pushl/pushq imm8","6A ib","V",= "N.S.","","operand32","r","Y","" +"PUSH imm8","PUSHW/PUSHL/PUSHQ imm8","pushw/pushl/pushq imm8","6A ib","N.S= .","V","","default64","r","Y","" +"PUSH imm8","PUSHW/PUSHL/PUSHQ imm8","pushw/pushl/pushq imm8","6A ib","V",= "V","","operand16","r","Y","" +"PXOR mm1, mm2/m64","PXOR mm2/m64, mm1","pxor mm2/m64, mm1","0F EF /r","V"= ,"V","MMX","","rw,r","","" +"PXOR xmm1, xmm2/m128","PXOR xmm2/m128, xmm1","pxor xmm2/m128, xmm1","66 0= F EF /r","V","V","SSE2","","rw,r","","" +"RCL r/m8, 1","RCLB 1, r/m8","rclb 1, r/m8","D0 /2","V","V","","","rw,r","= Y","8" +"RCL r/m8, 1","RCLB 1, r/m8","rclb 1, r/m8","REX D0 /2","N.E.","V","","pse= udo64","w,r","Y","8" +"RCL r/m8, CL","RCLB CL, r/m8","rclb CL, r/m8","D2 /2","V","V","","","rw,r= ","Y","8" +"RCL r/m8, CL","RCLB CL, r/m8","rclb CL, r/m8","REX D2 /2","N.E.","V","","= pseudo64","w,r","Y","8" +"RCL r/m8, imm8","RCLB imm8, r/m8","rclb imm8, r/m8","REX C0 /2 ib","N.E."= ,"V","","pseudo64","w,r","Y","8" +"RCL r/m8, imm8u","RCLB imm8u, r/m8","rclb imm8u, r/m8","C0 /2 ib","V","V"= ,"","","rw,r","Y","8" +"RCL r/m32, 1","RCLL 1, r/m32","rcll 1, r/m32","D1 /2","V","V","","operand= 32","rw,r","Y","32" +"RCL r/m32, CL","RCLL CL, r/m32","rcll CL, r/m32","D3 /2","V","V","","oper= and32","rw,r","Y","32" +"RCL r/m32, imm8u","RCLL imm8u, r/m32","rcll imm8u, r/m32","C1 /2 ib","V",= "V","","operand32","rw,r","Y","32" +"RCL r/m64, 1","RCLQ 1, r/m64","rclq 1, r/m64","REX.W D1 /2","N.S.","V",""= ,"","rw,r","Y","64" +"RCL r/m64, CL","RCLQ CL, r/m64","rclq CL, r/m64","REX.W D3 /2","N.S.","V"= ,"","","rw,r","Y","64" +"RCL r/m64, imm8u","RCLQ imm8u, r/m64","rclq imm8u, r/m64","REX.W C1 /2 ib= ","N.S.","V","","","rw,r","Y","64" +"RCL r/m16, 1","RCLW 1, r/m16","rclw 1, r/m16","D1 /2","V","V","","operand= 16","rw,r","Y","16" +"RCL r/m16, CL","RCLW CL, r/m16","rclw CL, r/m16","D3 /2","V","V","","oper= and16","rw,r","Y","16" +"RCL r/m16, imm8u","RCLW imm8u, r/m16","rclw imm8u, r/m16","C1 /2 ib","V",= "V","","operand16","rw,r","Y","16" +"RCPPS xmm1, xmm2/m128","RCPPS xmm2/m128, xmm1","rcpps xmm2/m128, xmm1","0= F 53 /r","V","V","SSE","","w,r","","" +"RCPSS xmm1, xmm2/m32","RCPSS xmm2/m32, xmm1","rcpss xmm2/m32, xmm1","F3 0= F 53 /r","V","V","SSE","","w,r","","" +"RCR r/m8, 1","RCRB 1, r/m8","rcrb 1, r/m8","D0 /3","V","V","","","rw,r","= Y","8" +"RCR r/m8, 1","RCRB 1, r/m8","rcrb 1, r/m8","REX D0 
/3","N.E.","V","","pse= udo64","w,r","Y","8" +"RCR r/m8, CL","RCRB CL, r/m8","rcrb CL, r/m8","D2 /3","V","V","","","rw,r= ","Y","8" +"RCR r/m8, CL","RCRB CL, r/m8","rcrb CL, r/m8","REX D2 /3","N.E.","V","","= pseudo64","w,r","Y","8" +"RCR r/m8, imm8","RCRB imm8, r/m8","rcrb imm8, r/m8","REX C0 /3 ib","N.E."= ,"V","","pseudo64","w,r","Y","8" +"RCR r/m8, imm8u","RCRB imm8u, r/m8","rcrb imm8u, r/m8","C0 /3 ib","V","V"= ,"","","rw,r","Y","8" +"RCR r/m32, 1","RCRL 1, r/m32","rcrl 1, r/m32","D1 /3","V","V","","operand= 32","rw,r","Y","32" +"RCR r/m32, CL","RCRL CL, r/m32","rcrl CL, r/m32","D3 /3","V","V","","oper= and32","rw,r","Y","32" +"RCR r/m32, imm8u","RCRL imm8u, r/m32","rcrl imm8u, r/m32","C1 /3 ib","V",= "V","","operand32","rw,r","Y","32" +"RCR r/m64, 1","RCRQ 1, r/m64","rcrq 1, r/m64","REX.W D1 /3","N.S.","V",""= ,"","rw,r","Y","64" +"RCR r/m64, CL","RCRQ CL, r/m64","rcrq CL, r/m64","REX.W D3 /3","N.S.","V"= ,"","","rw,r","Y","64" +"RCR r/m64, imm8u","RCRQ imm8u, r/m64","rcrq imm8u, r/m64","REX.W C1 /3 ib= ","N.S.","V","","","rw,r","Y","64" +"RCR r/m16, 1","RCRW 1, r/m16","rcrw 1, r/m16","D1 /3","V","V","","operand= 16","rw,r","Y","16" +"RCR r/m16, CL","RCRW CL, r/m16","rcrw CL, r/m16","D3 /3","V","V","","oper= and16","rw,r","Y","16" +"RCR r/m16, imm8u","RCRW imm8u, r/m16","rcrw imm8u, r/m16","C1 /3 ib","V",= "V","","operand16","rw,r","Y","16" +"RDFSBASE rmr32","RDFSBASEL rmr32","rdfsbase rmr32","F3 0F AE /0","N.S.","= V","FSGSBASE","modrm_regonly,operand16,operand32","w","Y","32" +"RDFSBASE rmr64","RDFSBASEQ rmr64","rdfsbase rmr64","F3 REX.W 0F AE /0","N= .S.","V","FSGSBASE","modrm_regonly","w","Y","64" +"RDGSBASE rmr32","RDGSBASEL rmr32","rdgsbase rmr32","F3 0F AE /1","N.S.","= V","FSGSBASE","modrm_regonly,operand16,operand32","w","Y","32" +"RDGSBASE rmr64","RDGSBASEQ rmr64","rdgsbase rmr64","F3 REX.W 0F AE /1","N= .S.","V","FSGSBASE","modrm_regonly","w","Y","64" +"RDMSR","RDMSR","rdmsr","0F 32","V","V","Pentium","","","","" +"RDPKRU","RDPKRU","rdpkru","0F 01 EE","V","V","PKU","","","","" +"RDPMC","RDPMC","rdpmc","0F 33","V","V","","","","","" +"RDRAND rmr32","RDRANDL rmr32","rdrand rmr32","0F C7 /6","V","V","RDRAND",= "modrm_regonly,operand32","w","Y","32" +"RDRAND rmr64","RDRANDQ rmr64","rdrand rmr64","REX.W 0F C7 /6","N.S.","V",= "RDRAND","modrm_regonly","w","Y","64" +"RDRAND rmr16","RDRANDW rmr16","rdrand rmr16","0F C7 /6","V","V","RDRAND",= "modrm_regonly,operand16","w","Y","16" +"RDSEED rmr32","RDSEEDL rmr32","rdseed rmr32","0F C7 /7","V","V","RDSEED",= "modrm_regonly,operand32","w","Y","32" +"RDSEED rmr64","RDSEEDQ rmr64","rdseed rmr64","REX.W 0F C7 /7","N.S.","V",= "RDSEED","modrm_regonly","w","Y","64" +"RDSEED rmr16","RDSEEDW rmr16","rdseed rmr16","0F C7 /7","V","V","RDSEED",= "modrm_regonly,operand16","w","Y","16" +"RDSSPD rmr32","RDSSPD rmr32","rdsspd rmr32","F3 0F 1E /1","V","V","CET","= modrm_regonly,operand16,operand32","w","","" +"RDSSPQ rmr64","RDSSPQ rmr64","rdsspq rmr64","F3 REX.W 0F 1E /1","N.S.","V= ","CET","modrm_regonly","w","","" +"RDTSC","RDTSC","rdtsc","0F 31","V","V","Pentium","","","","" +"RDTSCP","RDTSCP","rdtscp","0F 01 F9","V","V","RDTSCP","","","","" +"RET_FAR","RETFW/RETFL/RETFQ","lretw/lretl/lretl","CB","V","V","","","",""= ,"" +"RET_FAR imm16u","RETFW/RETFL/RETFQ imm16u","lretw/lretl/lretl imm16u","CA= iw","V","V","","","r","","" +"RET","RETW/RETL/RETQ","retw/retl/retq","C3","N.S.","V","","default64","",= "","" +"RET","RETW/RETL/RETQ","retw/retl/retq","C3","V","N.S.","","","","","" +"RET imm16u","RETW/RETL/RETQ imm16u","retw/retl/retq imm16u","C2 
iw","N.S.= ","V","","default64","r","","" +"RET imm16u","RETW/RETL/RETQ imm16u","retw/retl/retq imm16u","C2 iw","V","= N.S.","","","r","","" +"ROL r/m8, 1","ROLB 1, r/m8","rolb 1, r/m8","D0 /0","V","V","","","rw,r","= Y","8" +"ROL r/m8, 1","ROLB 1, r/m8","rolb 1, r/m8","REX D0 /0","N.E.","V","","pse= udo64","w,r","Y","8" +"ROL r/m8, CL","ROLB CL, r/m8","rolb CL, r/m8","D2 /0","V","V","","","rw,r= ","Y","8" +"ROL r/m8, CL","ROLB CL, r/m8","rolb CL, r/m8","REX D2 /0","N.E.","V","","= pseudo64","w,r","Y","8" +"ROL r/m8, imm8","ROLB imm8, r/m8","rolb imm8, r/m8","REX C0 /0 ib","N.E."= ,"V","","pseudo64","w,r","Y","8" +"ROL r/m8, imm8u","ROLB imm8u, r/m8","rolb imm8u, r/m8","C0 /0 ib","V","V"= ,"","","rw,r","Y","8" +"ROL r/m32, 1","ROLL 1, r/m32","roll 1, r/m32","D1 /0","V","V","","operand= 32","rw,r","Y","32" +"ROL r/m32, CL","ROLL CL, r/m32","roll CL, r/m32","D3 /0","V","V","","oper= and32","rw,r","Y","32" +"ROL r/m32, imm8u","ROLL imm8u, r/m32","roll imm8u, r/m32","C1 /0 ib","V",= "V","","operand32","rw,r","Y","32" +"ROL r/m64, 1","ROLQ 1, r/m64","rolq 1, r/m64","REX.W D1 /0","N.S.","V",""= ,"","rw,r","Y","64" +"ROL r/m64, CL","ROLQ CL, r/m64","rolq CL, r/m64","REX.W D3 /0","N.S.","V"= ,"","","rw,r","Y","64" +"ROL r/m64, imm8u","ROLQ imm8u, r/m64","rolq imm8u, r/m64","REX.W C1 /0 ib= ","N.S.","V","","","rw,r","Y","64" +"ROL r/m16, 1","ROLW 1, r/m16","rolw 1, r/m16","D1 /0","V","V","","operand= 16","rw,r","Y","16" +"ROL r/m16, CL","ROLW CL, r/m16","rolw CL, r/m16","D3 /0","V","V","","oper= and16","rw,r","Y","16" +"ROL r/m16, imm8u","ROLW imm8u, r/m16","rolw imm8u, r/m16","C1 /0 ib","V",= "V","","operand16","rw,r","Y","16" +"ROR r/m8, 1","RORB 1, r/m8","rorb 1, r/m8","D0 /1","V","V","","","rw,r","= Y","8" +"ROR r/m8, 1","RORB 1, r/m8","rorb 1, r/m8","REX D0 /1","N.E.","V","","pse= udo64","w,r","Y","8" +"ROR r/m8, CL","RORB CL, r/m8","rorb CL, r/m8","D2 /1","V","V","","","rw,r= ","Y","8" +"ROR r/m8, CL","RORB CL, r/m8","rorb CL, r/m8","REX D2 /1","N.E.","V","","= pseudo64","w,r","Y","8" +"ROR r/m8, imm8","RORB imm8, r/m8","rorb imm8, r/m8","REX C0 /1 ib","N.E."= ,"V","","pseudo64","w,r","Y","8" +"ROR r/m8, imm8u","RORB imm8u, r/m8","rorb imm8u, r/m8","C0 /1 ib","V","V"= ,"","","rw,r","Y","8" +"ROR r/m32, 1","RORL 1, r/m32","rorl 1, r/m32","D1 /1","V","V","","operand= 32","rw,r","Y","32" +"ROR r/m32, CL","RORL CL, r/m32","rorl CL, r/m32","D3 /1","V","V","","oper= and32","rw,r","Y","32" +"ROR r/m32, imm8u","RORL imm8u, r/m32","rorl imm8u, r/m32","C1 /1 ib","V",= "V","","operand32","rw,r","Y","32" +"ROR r/m64, 1","RORQ 1, r/m64","rorq 1, r/m64","REX.W D1 /1","N.S.","V",""= ,"","rw,r","Y","64" +"ROR r/m64, CL","RORQ CL, r/m64","rorq CL, r/m64","REX.W D3 /1","N.S.","V"= ,"","","rw,r","Y","64" +"ROR r/m64, imm8u","RORQ imm8u, r/m64","rorq imm8u, r/m64","REX.W C1 /1 ib= ","N.S.","V","","","rw,r","Y","64" +"ROR r/m16, 1","RORW 1, r/m16","rorw 1, r/m16","D1 /1","V","V","","operand= 16","rw,r","Y","16" +"ROR r/m16, CL","RORW CL, r/m16","rorw CL, r/m16","D3 /1","V","V","","oper= and16","rw,r","Y","16" +"ROR r/m16, imm8u","RORW imm8u, r/m16","rorw imm8u, r/m16","C1 /1 ib","V",= "V","","operand16","rw,r","Y","16" +"RORX r32, r/m32, imm8u","RORXL imm8u, r/m32, r32","rorxl imm8u, r/m32, r3= 2","VEX.128.F2.0F3A.W0 F0 /r ib","V","V","BMI2","","w,r,r","Y","32" +"RORX r64, r/m64, imm8u","RORXQ imm8u, r/m64, r64","rorxq imm8u, r/m64, r6= 4","VEX.128.F2.0F3A.W1 F0 /r ib","N.S.","V","BMI2","","w,r,r","Y","64" +"ROUNDPD xmm1, xmm2/m128, imm8u","ROUNDPD imm8u, xmm2/m128, xmm1","roundpd= imm8u, xmm2/m128, xmm1","66 0F 
3A 09 /r ib","V","V","SSE4_1","","w,r,r",""= ,"" +"ROUNDPS xmm1, xmm2/m128, imm8u","ROUNDPS imm8u, xmm2/m128, xmm1","roundps= imm8u, xmm2/m128, xmm1","66 0F 3A 08 /r ib","V","V","SSE4_1","","w,r,r",""= ,"" +"ROUNDSD xmm1, xmm2/m64, imm8u","ROUNDSD imm8u, xmm2/m64, xmm1","roundsd i= mm8u, xmm2/m64, xmm1","66 0F 3A 0B /r ib","V","V","SSE4_1","","w,r,r","","" +"ROUNDSS xmm1, xmm2/m32, imm8u","ROUNDSS imm8u, xmm2/m32, xmm1","roundss i= mm8u, xmm2/m32, xmm1","66 0F 3A 0A /r ib","V","V","SSE4_1","","w,r,r","","" +"RSM","RSM","rsm","0F AA","V","V","","","","","" +"RSQRTPS xmm1, xmm2/m128","RSQRTPS xmm2/m128, xmm1","rsqrtps xmm2/m128, xm= m1","0F 52 /r","V","V","SSE","","w,r","","" +"RSQRTSS xmm1, xmm2/m32","RSQRTSS xmm2/m32, xmm1","rsqrtss xmm2/m32, xmm1"= ,"F3 0F 52 /r","V","V","SSE","","w,r","","" +"RSTORSSP m64","RSTORSSP m64","rstorssp m64","F3 0F 01 /5","V","V","CET","= modrm_memonly","rw","","" +"SAHF","SAHF","sahf","9E","V","V","LAHFSAHF","","","","" +"SAL r/m8, 1","SALB 1, r/m8","salb 1, r/m8","D0 /4","V","V","","pseudo","r= w,r","Y","8" +"SAL r/m8, 1","SALB 1, r/m8","salb 1, r/m8","REX D0 /4","N.E.","V","","pse= udo","rw,r","Y","8" +"SAL r/m8, CL","SALB CL, r/m8","salb CL, r/m8","D2 /4","V","V","","pseudo"= ,"rw,r","Y","8" +"SAL r/m8, CL","SALB CL, r/m8","salb CL, r/m8","REX D2 /4","N.E.","V","","= pseudo","rw,r","Y","8" +"SAL r/m8, imm8","SALB imm8, r/m8","salb imm8, r/m8","C0 /4 ib","V","V",""= ,"pseudo","rw,r","Y","8" +"SAL r/m8, imm8","SALB imm8, r/m8","salb imm8, r/m8","REX C0 /4 ib","N.E."= ,"V","","pseudo","rw,r","Y","8" +"SALC","SALC","salc","D6","V","N.S.","","","","","" +"SAL r/m32, 1","SALL 1, r/m32","sall 1, r/m32","D1 /4","V","V","","operand= 32,pseudo","rw,r","Y","32" +"SAL r/m32, CL","SALL CL, r/m32","sall CL, r/m32","D3 /4","V","V","","oper= and32,pseudo","rw,r","Y","32" +"SAL r/m32, imm8","SALL imm8, r/m32","sall imm8, r/m32","C1 /4 ib","V","V"= ,"","operand32,pseudo","rw,r","Y","32" +"SAL r/m64, 1","SALQ 1, r/m64","salq 1, r/m64","REX.W D1 /4","N.E.","V",""= ,"pseudo","rw,r","Y","64" +"SAL r/m64, CL","SALQ CL, r/m64","salq CL, r/m64","REX.W D3 /4","N.E.","V"= ,"","pseudo","rw,r","Y","64" +"SAL r/m64, imm8","SALQ imm8, r/m64","salq imm8, r/m64","REX.W C1 /4 ib","= N.E.","V","","pseudo","rw,r","Y","64" +"SAL r/m16, 1","SALW 1, r/m16","salw 1, r/m16","D1 /4","V","V","","operand= 16,pseudo","rw,r","Y","16" +"SAL r/m16, CL","SALW CL, r/m16","salw CL, r/m16","D3 /4","V","V","","oper= and16,pseudo","rw,r","Y","16" +"SAL r/m16, imm8","SALW imm8, r/m16","salw imm8, r/m16","C1 /4 ib","V","V"= ,"","operand16,pseudo","rw,r","Y","16" +"SAR r/m8, 1","SARB 1, r/m8","sarb 1, r/m8","D0 /7","V","V","","","rw,r","= Y","8" +"SAR r/m8, 1","SARB 1, r/m8","sarb 1, r/m8","REX D0 /7","N.E.","V","","pse= udo64","rw,r","Y","8" +"SAR r/m8, CL","SARB CL, r/m8","sarb CL, r/m8","D2 /7","V","V","","","rw,r= ","Y","8" +"SAR r/m8, CL","SARB CL, r/m8","sarb CL, r/m8","REX D2 /7","N.E.","V","","= pseudo64","rw,r","Y","8" +"SAR r/m8, imm8","SARB imm8, r/m8","sarb imm8, r/m8","REX C0 /7 ib","N.E."= ,"V","","pseudo64","rw,r","Y","8" +"SAR r/m8, imm8u","SARB imm8u, r/m8","sarb imm8u, r/m8","C0 /7 ib","V","V"= ,"","","rw,r","Y","8" +"SAR r/m32, 1","SARL 1, r/m32","sarl 1, r/m32","D1 /7","V","V","","operand= 32","rw,r","Y","32" +"SAR r/m32, CL","SARL CL, r/m32","sarl CL, r/m32","D3 /7","V","V","","oper= and32","rw,r","Y","32" +"SAR r/m32, imm8u","SARL imm8u, r/m32","sarl imm8u, r/m32","C1 /7 ib","V",= "V","","operand32","rw,r","Y","32" +"SAR r/m64, 1","SARQ 1, r/m64","sarq 1, r/m64","REX.W D1 /7","N.S.","V",""= 
,"","rw,r","Y","64" +"SAR r/m64, CL","SARQ CL, r/m64","sarq CL, r/m64","REX.W D3 /7","N.S.","V"= ,"","","rw,r","Y","64" +"SAR r/m64, imm8u","SARQ imm8u, r/m64","sarq imm8u, r/m64","REX.W C1 /7 ib= ","N.S.","V","","","rw,r","Y","64" +"SAR r/m16, 1","SARW 1, r/m16","sarw 1, r/m16","D1 /7","V","V","","operand= 16","rw,r","Y","16" +"SAR r/m16, CL","SARW CL, r/m16","sarw CL, r/m16","D3 /7","V","V","","oper= and16","rw,r","Y","16" +"SAR r/m16, imm8u","SARW imm8u, r/m16","sarw imm8u, r/m16","C1 /7 ib","V",= "V","","operand16","rw,r","Y","16" +"SARX r32, r/m32, r32V","SARXL r32V, r/m32, r32","sarxl r32V, r/m32, r32",= "VEX.NDS.128.F3.0F38.W0 F7 /r","V","V","BMI2","","w,r,r","Y","32" +"SARX r64, r/m64, r64V","SARXQ r64V, r/m64, r64","sarxq r64V, r/m64, r64",= "VEX.NDS.128.F3.0F38.W1 F7 /r","N.S.","V","BMI2","","w,r,r","Y","64" +"SAVESSP","SAVESSP","savessp","F3 0F 01 EA","V","V","CET","","","","" +"SBB AL, imm8","SBBB imm8, AL","sbbb imm8, AL","1C ib","V","V","","","rw,r= ","Y","8" +"SBB r/m8, imm8","SBBB imm8, r/m8","sbbb imm8, r/m8","80 /3 ib","V","V",""= ,"","rw,r","Y","8" +"SBB r/m8, imm8","SBBB imm8, r/m8","sbbb imm8, r/m8","82 /3 ib","V","N.S."= ,"","","rw,r","Y","8" +"SBB r/m8, imm8","SBBB imm8, r/m8","sbbb imm8, r/m8","REX 80 /3 ib","N.E."= ,"V","","pseudo64","w,r","Y","8" +"SBB r8, r/m8","SBBB r/m8, r8","sbbb r/m8, r8","1A /r","V","V","","","rw,r= ","Y","8" +"SBB r8, r/m8","SBBB r/m8, r8","sbbb r/m8, r8","REX 1A /r","N.E.","V","","= pseudo64","w,r","Y","8" +"SBB r/m8, r8","SBBB r8, r/m8","sbbb r8, r/m8","18 /r","V","V","","","rw,r= ","Y","8" +"SBB r/m8, r8","SBBB r8, r/m8","sbbb r8, r/m8","REX 18 /r","N.E.","V","","= pseudo64","w,r","Y","8" +"SBB EAX, imm32","SBBL imm32, EAX","sbbl imm32, EAX","1D id","V","V","","o= perand32","rw,r","Y","32" +"SBB r/m32, imm32","SBBL imm32, r/m32","sbbl imm32, r/m32","81 /3 id","V",= "V","","operand32","rw,r","Y","32" +"SBB r/m32, imm8","SBBL imm8, r/m32","sbbl imm8, r/m32","83 /3 ib","V","V"= ,"","operand32","rw,r","Y","32" +"SBB r32, r/m32","SBBL r/m32, r32","sbbl r/m32, r32","1B /r","V","V","","o= perand32","rw,r","Y","32" +"SBB r/m32, r32","SBBL r32, r/m32","sbbl r32, r/m32","19 /r","V","V","","o= perand32","rw,r","Y","32" +"SBB RAX, imm32","SBBQ imm32, RAX","sbbq imm32, RAX","REX.W 1D id","N.S.",= "V","","","rw,r","Y","64" +"SBB r/m64, imm32","SBBQ imm32, r/m64","sbbq imm32, r/m64","REX.W 81 /3 id= ","N.S.","V","","","rw,r","Y","64" +"SBB r/m64, imm8","SBBQ imm8, r/m64","sbbq imm8, r/m64","REX.W 83 /3 ib","= N.S.","V","","","rw,r","Y","64" +"SBB r64, r/m64","SBBQ r/m64, r64","sbbq r/m64, r64","REX.W 1B /r","N.S.",= "V","","","rw,r","Y","64" +"SBB r/m64, r64","SBBQ r64, r/m64","sbbq r64, r/m64","REX.W 19 /r","N.S.",= "V","","","rw,r","Y","64" +"SBB AX, imm16","SBBW imm16, AX","sbbw imm16, AX","1D iw","V","V","","oper= and16","rw,r","Y","16" +"SBB r/m16, imm16","SBBW imm16, r/m16","sbbw imm16, r/m16","81 /3 iw","V",= "V","","operand16","rw,r","Y","16" +"SBB r/m16, imm8","SBBW imm8, r/m16","sbbw imm8, r/m16","83 /3 ib","V","V"= ,"","operand16","rw,r","Y","16" +"SBB r16, r/m16","SBBW r/m16, r16","sbbw r/m16, r16","1B /r","V","V","","o= perand16","rw,r","Y","16" +"SBB r/m16, r16","SBBW r16, r/m16","sbbw r16, r/m16","19 /r","V","V","","o= perand16","rw,r","Y","16" +"SCASB","SCASB","scasb","AE","V","V","","","","","" +"SCASD","SCASL","scasl","AF","V","V","","operand32","","","" +"SCASQ","SCASQ","scasq","REX.W AF","N.S.","V","","","","","" +"SCASW","SCASW","scasw","AF","V","V","","operand16","","","" +"SETAE r/m8","SETCC r/m8","setae r/m8","0F 93 
/r","V","V","","","w","","" +"SETNB r/m8","SETCC r/m8","setnb r/m8","0F 93 /r","V","V","","pseudo","r",= "","" +"SETNC r/m8","SETCC r/m8","setnc r/m8","0F 93 /r","V","V","","pseudo","r",= "","" +"SETAE r/m8","SETCC r/m8","setae r/m8","REX 0F 93 /r","N.E.","V","","pseud= o64","r","","" +"SETNB r/m8","SETCC r/m8","setnb r/m8","REX 0F 93 /r","N.E.","V","","pseud= o","r","","" +"SETNC r/m8","SETCC r/m8","setnc r/m8","REX 0F 93 /r","N.E.","V","","pseud= o","r","","" +"SETB r/m8","SETCS r/m8","setb r/m8","0F 92 /r","V","V","","","w","","" +"SETC r/m8","SETCS r/m8","setc r/m8","0F 92 /r","V","V","","pseudo","r",""= ,"" +"SETNAE r/m8","SETCS r/m8","setnae r/m8","0F 92 /r","V","V","","pseudo","r= ","","" +"SETB r/m8","SETCS r/m8","setb r/m8","REX 0F 92 /r","N.E.","V","","pseudo6= 4","r","","" +"SETC r/m8","SETCS r/m8","setc r/m8","REX 0F 92 /r","N.E.","V","","pseudo"= ,"r","","" +"SETNAE r/m8","SETCS r/m8","setnae r/m8","REX 0F 92 /r","N.E.","V","","pse= udo","r","","" +"SETE r/m8","SETEQ r/m8","sete r/m8","0F 94 /r","V","V","","","w","","" +"SETZ r/m8","SETEQ r/m8","setz r/m8","0F 94 /r","V","V","","pseudo","r",""= ,"" +"SETE r/m8","SETEQ r/m8","sete r/m8","REX 0F 94 /r","N.E.","V","","pseudo6= 4","r","","" +"SETZ r/m8","SETEQ r/m8","setz r/m8","REX 0F 94 /r","N.E.","V","","pseudo"= ,"r","","" +"SETGE r/m8","SETGE r/m8","setge r/m8","0F 9D /r","V","V","","","w","","" +"SETNL r/m8","SETGE r/m8","setnl r/m8","0F 9D /r","V","V","","pseudo","r",= "","" +"SETGE r/m8","SETGE r/m8","setge r/m8","REX 0F 9D /r","N.E.","V","","pseud= o64","r","","" +"SETNL r/m8","SETGE r/m8","setnl r/m8","REX 0F 9D /r","N.E.","V","","pseud= o","r","","" +"SETG r/m8","SETGT r/m8","setg r/m8","0F 9F /r","V","V","","","w","","" +"SETNLE r/m8","SETGT r/m8","setnle r/m8","0F 9F /r","V","V","","pseudo","r= ","","" +"SETG r/m8","SETGT r/m8","setg r/m8","REX 0F 9F /r","N.E.","V","","pseudo6= 4","r","","" +"SETNLE r/m8","SETGT r/m8","setnle r/m8","REX 0F 9F /r","N.E.","V","","pse= udo","r","","" +"SETA r/m8","SETHI r/m8","seta r/m8","0F 97 /r","V","V","","","w","","" +"SETNBE r/m8","SETHI r/m8","setnbe r/m8","0F 97 /r","V","V","","pseudo","r= ","","" +"SETA r/m8","SETHI r/m8","seta r/m8","REX 0F 97 /r","N.E.","V","","pseudo6= 4","r","","" +"SETNBE r/m8","SETHI r/m8","setnbe r/m8","REX 0F 97 /r","N.E.","V","","pse= udo","r","","" +"SETLE r/m8","SETLE r/m8","setle r/m8","0F 9E /r","V","V","","","w","","" +"SETNG r/m8","SETLE r/m8","setng r/m8","0F 9E /r","V","V","","pseudo","r",= "","" +"SETLE r/m8","SETLE r/m8","setle r/m8","REX 0F 9E /r","N.E.","V","","pseud= o64","r","","" +"SETNG r/m8","SETLE r/m8","setng r/m8","REX 0F 9E /r","N.E.","V","","pseud= o","r","","" +"SETBE r/m8","SETLS r/m8","setbe r/m8","0F 96 /r","V","V","","","w","","" +"SETNA r/m8","SETLS r/m8","setna r/m8","0F 96 /r","V","V","","pseudo","r",= "","" +"SETBE r/m8","SETLS r/m8","setbe r/m8","REX 0F 96 /r","N.E.","V","","pseud= o64","r","","" +"SETNA r/m8","SETLS r/m8","setna r/m8","REX 0F 96 /r","N.E.","V","","pseud= o","r","","" +"SETL r/m8","SETLT r/m8","setl r/m8","0F 9C /r","V","V","","","w","","" +"SETNGE r/m8","SETLT r/m8","setnge r/m8","0F 9C /r","V","V","","pseudo","r= ","","" +"SETL r/m8","SETLT r/m8","setl r/m8","REX 0F 9C /r","N.E.","V","","pseudo6= 4","r","","" +"SETNGE r/m8","SETLT r/m8","setnge r/m8","REX 0F 9C /r","N.E.","V","","pse= udo","r","","" +"SETS r/m8","SETMI r/m8","sets r/m8","0F 98 /r","V","V","","","w","","" +"SETS r/m8","SETMI r/m8","sets r/m8","REX 0F 98 /r","N.E.","V","","pseudo6= 4","r","","" +"SETNE r/m8","SETNE r/m8","setne r/m8","0F 
95 /r","V","V","","","w","","" +"SETNZ r/m8","SETNE r/m8","setnz r/m8","0F 95 /r","V","V","","pseudo","r",= "","" +"SETNE r/m8","SETNE r/m8","setne r/m8","REX 0F 95 /r","N.E.","V","","pseud= o64","r","","" +"SETNZ r/m8","SETNE r/m8","setnz r/m8","REX 0F 95 /r","N.E.","V","","pseud= o","r","","" +"SETNO r/m8","SETOC r/m8","setno r/m8","0F 91 /r","V","V","","","w","","" +"SETNO r/m8","SETOC r/m8","setno r/m8","REX 0F 91 /r","N.E.","V","","pseud= o64","r","","" +"SETO r/m8","SETOS r/m8","seto r/m8","0F 90 /r","V","V","","","w","","" +"SETO r/m8","SETOS r/m8","seto r/m8","REX 0F 90 /r","N.E.","V","","pseudo6= 4","r","","" +"SETNP r/m8","SETPC r/m8","setnp r/m8","0F 9B /r","V","V","","","w","","" +"SETPO r/m8","SETPC r/m8","setpo r/m8","0F 9B /r","V","V","","pseudo","r",= "","" +"SETNP r/m8","SETPC r/m8","setnp r/m8","REX 0F 9B /r","N.E.","V","","pseud= o64","r","","" +"SETPO r/m8","SETPC r/m8","setpo r/m8","REX 0F 9B /r","N.E.","V","","pseud= o","r","","" +"SETNS r/m8","SETPL r/m8","setns r/m8","0F 99 /r","V","V","","","w","","" +"SETNS r/m8","SETPL r/m8","setns r/m8","REX 0F 99 /r","N.E.","V","","pseud= o64","r","","" +"SETP r/m8","SETPS r/m8","setp r/m8","0F 9A /r","V","V","","","w","","" +"SETPE r/m8","SETPS r/m8","setpe r/m8","0F 9A /r","V","V","","pseudo","r",= "","" +"SETP r/m8","SETPS r/m8","setp r/m8","REX 0F 9A /r","N.E.","V","","pseudo6= 4","r","","" +"SETPE r/m8","SETPS r/m8","setpe r/m8","REX 0F 9A /r","N.E.","V","","pseud= o","r","","" +"SETSSBSY","SETSSBSY","setssbsy","F3 0F 01 E8","V","V","CET","","","","" +"SFENCE","SFENCE","sfence","0F AE /7","V","V","SSE","","","","" +"SGDT m16&32","SGDT m16&32","sgdt m16&32","0F 01 /0","V","N.S.","","modrm_= memonly","w","","" +"SGDT m16&64","SGDT m16&64","sgdt m16&64","0F 01 /0","N.S.","V","","defaul= t64,modrm_memonly","w","","" +"SHA1MSG1 xmm1, xmm2/m128","SHA1MSG1 xmm2/m128, xmm1","sha1msg1 xmm2/m128,= xmm1","0F 38 C9 /r","V","V","SHA","","rw,r","","" +"SHA1MSG2 xmm1, xmm2/m128","SHA1MSG2 xmm2/m128, xmm1","sha1msg2 xmm2/m128,= xmm1","0F 38 CA /r","V","V","SHA","","rw,r","","" +"SHA1NEXTE xmm1, xmm2/m128","SHA1NEXTE xmm2/m128, xmm1","sha1nexte xmm2/m1= 28, xmm1","0F 38 C8 /r","V","V","SHA","","rw,r","","" +"SHA1RNDS4 xmm1, xmm2/m128, imm8u:2","SHA1RNDS4 imm8u:2, xmm2/m128, xmm1",= "sha1rnds4 imm8u:2, xmm2/m128, xmm1","0F 3A CC /r ib","V","V","SHA","","rw,= r,r","","" +"SHA256MSG1 xmm1, xmm2/m128","SHA256MSG1 xmm2/m128, xmm1","sha256msg1 xmm2= /m128, xmm1","0F 38 CC /r","V","V","SHA","","rw,r","","" +"SHA256MSG2 xmm1, xmm2/m128","SHA256MSG2 xmm2/m128, xmm1","sha256msg2 xmm2= /m128, xmm1","0F 38 CD /r","V","V","SHA","","rw,r","","" +"SHA256RNDS2 xmm1, xmm2/m128, ","SHA256RNDS2 , xmm2/m128, xmm1= ","sha256rnds2 , xmm2/m128, xmm1","0F 38 CB /r","V","V","SHA","","rw,= r,r","","" +"SHL r/m8, 1","SHLB 1, r/m8","shlb 1, r/m8","D0 /4","V","V","","","rw,r","= Y","8" +"SHL r/m8, 1","SHLB 1, r/m8","shlb 1, r/m8","D0 /6","V","V","","","rw,r","= Y","8" +"SHL r/m8, 1","SHLB 1, r/m8","shlb 1, r/m8","REX D0 /4","N.E.","V","","pse= udo64","rw,r","Y","8" +"SHL r/m8, CL","SHLB CL, r/m8","shlb CL, r/m8","D2 /4","V","V","","","rw,r= ","Y","8" +"SHL r/m8, CL","SHLB CL, r/m8","shlb CL, r/m8","D2 /6","V","V","","","rw,r= ","Y","8" +"SHL r/m8, CL","SHLB CL, r/m8","shlb CL, r/m8","REX D2 /4","N.E.","V","","= pseudo64","rw,r","Y","8" +"SHL r/m8, imm8","SHLB imm8, r/m8","shlb imm8, r/m8","REX C0 /4 ib","N.E."= ,"V","","pseudo64","rw,r","Y","8" +"SHL r/m8, imm8u","SHLB imm8u, r/m8","shlb imm8u, r/m8","C0 /4 ib","V","V"= ,"","","rw,r","Y","8" +"SHL r/m8, imm8u","SHLB 
imm8u, r/m8","shlb imm8u, r/m8","C0 /6 ib","V","V"= ,"","","rw,r","Y","8" +"SHL r/m32, 1","SHLL 1, r/m32","shll 1, r/m32","D1 /4","V","V","","operand= 32","rw,r","Y","32" +"SHL r/m32, 1","SHLL 1, r/m32","shll 1, r/m32","D1 /6","V","V","","operand= 32","rw,r","Y","32" +"SHL r/m32, CL","SHLL CL, r/m32","shll CL, r/m32","D3 /4","V","V","","oper= and32","rw,r","Y","32" +"SHL r/m32, CL","SHLL CL, r/m32","shll CL, r/m32","D3 /6","V","V","","oper= and32","rw,r","Y","32" +"SHLD r/m32, r32, CL","SHLL CL, r32, r/m32","shldl CL, r32, r/m32","0F A5 = /r","V","V","","operand32","rw,r,r","Y","32" +"SHL r/m32, imm8u","SHLL imm8u, r/m32","shll imm8u, r/m32","C1 /4 ib","V",= "V","","operand32","rw,r","Y","32" +"SHL r/m32, imm8u","SHLL imm8u, r/m32","shll imm8u, r/m32","C1 /6 ib","V",= "V","","operand32","rw,r","Y","32" +"SHLD r/m32, r32, imm8u","SHLL imm8u, r32, r/m32","shldl imm8u, r32, r/m32= ","0F A4 /r ib","V","V","","operand32","rw,r,r","Y","32" +"SHL r/m64, 1","SHLQ 1, r/m64","shlq 1, r/m64","REX.W D1 /4","N.S.","V",""= ,"","rw,r","Y","64" +"SHL r/m64, 1","SHLQ 1, r/m64","shlq 1, r/m64","REX.W D1 /6","N.S.","V",""= ,"","rw,r","Y","64" +"SHL r/m64, CL","SHLQ CL, r/m64","shlq CL, r/m64","REX.W D3 /4","N.S.","V"= ,"","","rw,r","Y","64" +"SHL r/m64, CL","SHLQ CL, r/m64","shlq CL, r/m64","REX.W D3 /6","N.S.","V"= ,"","","rw,r","Y","64" +"SHLD r/m64, r64, CL","SHLQ CL, r64, r/m64","shldq CL, r64, r/m64","REX.W = 0F A5 /r","N.S.","V","","","rw,r,r","Y","64" +"SHL r/m64, imm8u","SHLQ imm8u, r/m64","shlq imm8u, r/m64","REX.W C1 /4 ib= ","N.S.","V","","","rw,r","Y","64" +"SHL r/m64, imm8u","SHLQ imm8u, r/m64","shlq imm8u, r/m64","REX.W C1 /6 ib= ","N.S.","V","","","rw,r","Y","64" +"SHLD r/m64, r64, imm8u","SHLQ imm8u, r64, r/m64","shldq imm8u, r64, r/m64= ","REX.W 0F A4 /r ib","N.S.","V","","","rw,r,r","Y","64" +"SHL r/m16, 1","SHLW 1, r/m16","shlw 1, r/m16","D1 /4","V","V","","operand= 16","rw,r","Y","16" +"SHL r/m16, 1","SHLW 1, r/m16","shlw 1, r/m16","D1 /6","V","V","","operand= 16","rw,r","Y","16" +"SHL r/m16, CL","SHLW CL, r/m16","shlw CL, r/m16","D3 /4","V","V","","oper= and16","rw,r","Y","16" +"SHL r/m16, CL","SHLW CL, r/m16","shlw CL, r/m16","D3 /6","V","V","","oper= and16","rw,r","Y","16" +"SHLD r/m16, r16, CL","SHLW CL, r16, r/m16","shldw CL, r16, r/m16","0F A5 = /r","V","V","","operand16","rw,r,r","Y","16" +"SHL r/m16, imm8u","SHLW imm8u, r/m16","shlw imm8u, r/m16","C1 /4 ib","V",= "V","","operand16","rw,r","Y","16" +"SHL r/m16, imm8u","SHLW imm8u, r/m16","shlw imm8u, r/m16","C1 /6 ib","V",= "V","","operand16","rw,r","Y","16" +"SHLD r/m16, r16, imm8u","SHLW imm8u, r16, r/m16","shldw imm8u, r16, r/m16= ","0F A4 /r ib","V","V","","operand16","rw,r,r","Y","16" +"SHLX r32, r/m32, r32V","SHLXL r32V, r/m32, r32","shlxl r32V, r/m32, r32",= "VEX.NDS.128.66.0F38.W0 F7 /r","V","V","BMI2","","w,r,r","Y","32" +"SHLX r64, r/m64, r64V","SHLXQ r64V, r/m64, r64","shlxq r64V, r/m64, r64",= "VEX.NDS.128.66.0F38.W1 F7 /r","N.S.","V","BMI2","","w,r,r","Y","64" +"SHR r/m8, 1","SHRB 1, r/m8","shrb 1, r/m8","D0 /5","V","V","","","rw,r","= Y","8" +"SHR r/m8, 1","SHRB 1, r/m8","shrb 1, r/m8","REX D0 /5","N.E.","V","","pse= udo64","rw,r","Y","8" +"SHR r/m8, CL","SHRB CL, r/m8","shrb CL, r/m8","D2 /5","V","V","","","rw,r= ","Y","8" +"SHR r/m8, CL","SHRB CL, r/m8","shrb CL, r/m8","REX D2 /5","N.E.","V","","= pseudo64","rw,r","Y","8" +"SHR r/m8, imm8","SHRB imm8, r/m8","shrb imm8, r/m8","REX C0 /5 ib","N.E."= ,"V","","pseudo64","rw,r","Y","8" +"SHR r/m8, imm8u","SHRB imm8u, r/m8","shrb imm8u, r/m8","C0 /5 ib","V","V"= 
,"","","rw,r","Y","8" +"SHR r/m32, 1","SHRL 1, r/m32","shrl 1, r/m32","D1 /5","V","V","","operand= 32","rw,r","Y","32" +"SHR r/m32, CL","SHRL CL, r/m32","shrl CL, r/m32","D3 /5","V","V","","oper= and32","rw,r","Y","32" +"SHRD r/m32, r32, CL","SHRL CL, r32, r/m32","shrdl CL, r32, r/m32","0F AD = /r","V","V","","operand32","rw,r,r","Y","32" +"SHR r/m32, imm8u","SHRL imm8u, r/m32","shrl imm8u, r/m32","C1 /5 ib","V",= "V","","operand32","rw,r","Y","32" +"SHRD r/m32, r32, imm8u","SHRL imm8u, r32, r/m32","shrdl imm8u, r32, r/m32= ","0F AC /r ib","V","V","","operand32","rw,r,r","Y","32" +"SHR r/m64, 1","SHRQ 1, r/m64","shrq 1, r/m64","REX.W D1 /5","N.S.","V",""= ,"","rw,r","Y","64" +"SHR r/m64, CL","SHRQ CL, r/m64","shrq CL, r/m64","REX.W D3 /5","N.S.","V"= ,"","","rw,r","Y","64" +"SHRD r/m64, r64, CL","SHRQ CL, r64, r/m64","shrdq CL, r64, r/m64","REX.W = 0F AD /r","N.S.","V","","","rw,r,r","Y","64" +"SHR r/m64, imm8u","SHRQ imm8u, r/m64","shrq imm8u, r/m64","REX.W C1 /5 ib= ","N.S.","V","","","rw,r","Y","64" +"SHRD r/m64, r64, imm8u","SHRQ imm8u, r64, r/m64","shrdq imm8u, r64, r/m64= ","REX.W 0F AC /r ib","N.S.","V","","","rw,r,r","Y","64" +"SHR r/m16, 1","SHRW 1, r/m16","shrw 1, r/m16","D1 /5","V","V","","operand= 16","rw,r","Y","16" +"SHR r/m16, CL","SHRW CL, r/m16","shrw CL, r/m16","D3 /5","V","V","","oper= and16","rw,r","Y","16" +"SHRD r/m16, r16, CL","SHRW CL, r16, r/m16","shrdw CL, r16, r/m16","0F AD = /r","V","V","","operand16","rw,r,r","Y","16" +"SHR r/m16, imm8u","SHRW imm8u, r/m16","shrw imm8u, r/m16","C1 /5 ib","V",= "V","","operand16","rw,r","Y","16" +"SHRD r/m16, r16, imm8u","SHRW imm8u, r16, r/m16","shrdw imm8u, r16, r/m16= ","0F AC /r ib","V","V","","operand16","rw,r,r","Y","16" +"SHRX r32, r/m32, r32V","SHRXL r32V, r/m32, r32","shrxl r32V, r/m32, r32",= "VEX.NDS.128.F2.0F38.W0 F7 /r","V","V","BMI2","","w,r,r","Y","32" +"SHRX r64, r/m64, r64V","SHRXQ r64V, r/m64, r64","shrxq r64V, r/m64, r64",= "VEX.NDS.128.F2.0F38.W1 F7 /r","N.S.","V","BMI2","","w,r,r","Y","64" +"SHUFPD xmm1, xmm2/m128, imm8u","SHUFPD imm8u, xmm2/m128, xmm1","shufpd im= m8u, xmm2/m128, xmm1","66 0F C6 /r ib","V","V","SSE2","","rw,r,r","","" +"SHUFPS xmm1, xmm2/m128, imm8u","SHUFPS imm8u, xmm2/m128, xmm1","shufps im= m8u, xmm2/m128, xmm1","0F C6 /r ib","V","V","SSE","","rw,r,r","","" +"SIDT m16&32","SIDT m16&32","sidt m16&32","0F 01 /1","V","N.S.","","modrm_= memonly","w","","" +"SIDT m16&64","SIDT m16&64","sidt m16&64","0F 01 /1","N.S.","V","","defaul= t64,modrm_memonly","w","","" +"SKINIT EAX","SKINIT EAX","skinit EAX","0F 01 DE","V","V","SVM","amd,modrm= _regonly","r","","" +"SLDT r/m16","SLDTW r/m16","sldtw r/m16","0F 00 /0","V","V","","operand16"= ,"w","Y","16" +"SLDT r32/m16","SLDT{L/W} r32/m16","sldt{l/w} r32/m16","0F 00 /0","V","V",= "","operand32","w","Y","" +"SLDT r64/m16","SLDT{Q/W} r64/m16","sldt{q/w} r64/m16","REX.W 0F 00 /0","N= .S.","V","","","w","Y","" +"SLWPCB rmr32","SLWPCBL rmr32","slwpcbl rmr32","XOP.128.09.W0 12 /1","V","= V","XOP","amd,modrm_regonly,operand16,operand32","w","Y","32" +"SLWPCB rmr64","SLWPCBQ rmr64","slwpcbq rmr64","XOP.128.09.W0 12 /1","N.S.= ","V","XOP","amd,modrm_regonly,operand64","w","Y","64" +"SMSW r/m16","SMSWW r/m16","smsww r/m16","0F 01 /4","V","V","","operand16"= ,"w","Y","16" +"SMSW r32/m16","SMSW{L/W} r32/m16","smsw{l/w} r32/m16","0F 01 /4","V","V",= "","operand32","w","Y","" +"SMSW r64/m16","SMSW{Q/W} r64/m16","smsw{q/w} r64/m16","REX.W 0F 01 /4","N= .S.","V","","","w","Y","" +"SQRTPD xmm1, xmm2/m128","SQRTPD xmm2/m128, xmm1","sqrtpd xmm2/m128, xmm1"= ,"66 0F 51 
/r","V","V","SSE2","","w,r","","" +"SQRTPS xmm1, xmm2/m128","SQRTPS xmm2/m128, xmm1","sqrtps xmm2/m128, xmm1"= ,"0F 51 /r","V","V","SSE","","w,r","","" +"SQRTSD xmm1, xmm2/m64","SQRTSD xmm2/m64, xmm1","sqrtsd xmm2/m64, xmm1","F= 2 0F 51 /r","V","V","SSE2","","w,r","","" +"SQRTSS xmm1, xmm2/m32","SQRTSS xmm2/m32, xmm1","sqrtss xmm2/m32, xmm1","F= 3 0F 51 /r","V","V","SSE","","w,r","","" +"STAC","STAC","stac","0F 01 CB","V","V","","","","","" +"STC","STC","stc","F9","V","V","","","","","" +"STD","STD","std","FD","V","V","","","","","" +"STGI","STGI","stgi","0F 01 DC","V","V","SVM","amd","","","" +"STI","STI","sti","FB","V","V","","","","","" +"STMXCSR m32","STMXCSR m32","stmxcsr m32","0F AE /3","V","V","SSE","modrm_= memonly","w","","" +"STOSB","STOSB","stosb","AA","V","V","","","","","" +"STOSD","STOSL","stosl","AB","V","V","","operand32","","","" +"STOSQ","STOSQ","stosq","REX.W AB","N.S.","V","","","","","" +"STOSW","STOSW","stosw","AB","V","V","","operand16","","","" +"STR r/m16","STRW r/m16","strw r/m16","0F 00 /1","V","V","","operand16","w= ","Y","16" +"STR r32/m16","STR{L/W} r32/m16","str{l/w} r32/m16","0F 00 /1","V","V","",= "operand32","w","Y","" +"STR r64/m16","STR{Q/W} r64/m16","str{q/w} r64/m16","REX.W 0F 00 /1","N.S.= ","V","","","w","Y","" +"SUB AL, imm8","SUBB imm8, AL","subb imm8, AL","2C ib","V","V","","","rw,r= ","Y","8" +"SUB r/m8, imm8","SUBB imm8, r/m8","subb imm8, r/m8","80 /5 ib","V","V",""= ,"","rw,r","Y","8" +"SUB r/m8, imm8","SUBB imm8, r/m8","subb imm8, r/m8","82 /5 ib","V","N.S."= ,"","","rw,r","Y","8" +"SUB r/m8, imm8","SUBB imm8, r/m8","subb imm8, r/m8","REX 80 /5 ib","N.E."= ,"V","","pseudo64","rw,r","Y","8" +"SUB r8, r/m8","SUBB r/m8, r8","subb r/m8, r8","2A /r","V","V","","","rw,r= ","Y","8" +"SUB r8, r/m8","SUBB r/m8, r8","subb r/m8, r8","REX 2A /r","N.E.","V","","= pseudo64","rw,r","Y","8" +"SUB r/m8, r8","SUBB r8, r/m8","subb r8, r/m8","28 /r","V","V","","","rw,r= ","Y","8" +"SUB r/m8, r8","SUBB r8, r/m8","subb r8, r/m8","REX 28 /r","N.E.","V","","= pseudo64","rw,r","Y","8" +"SUB EAX, imm32","SUBL imm32, EAX","subl imm32, EAX","2D id","V","V","","o= perand32","rw,r","Y","32" +"SUB r/m32, imm32","SUBL imm32, r/m32","subl imm32, r/m32","81 /5 id","V",= "V","","operand32","rw,r","Y","32" +"SUB r/m32, imm8","SUBL imm8, r/m32","subl imm8, r/m32","83 /5 ib","V","V"= ,"","operand32","rw,r","Y","32" +"SUB r32, r/m32","SUBL r/m32, r32","subl r/m32, r32","2B /r","V","V","","o= perand32","rw,r","Y","32" +"SUB r/m32, r32","SUBL r32, r/m32","subl r32, r/m32","29 /r","V","V","","o= perand32","rw,r","Y","32" +"SUBPD xmm1, xmm2/m128","SUBPD xmm2/m128, xmm1","subpd xmm2/m128, xmm1","6= 6 0F 5C /r","V","V","SSE2","","rw,r","","" +"SUBPS xmm1, xmm2/m128","SUBPS xmm2/m128, xmm1","subps xmm2/m128, xmm1","0= F 5C /r","V","V","SSE","","rw,r","","" +"SUB RAX, imm32","SUBQ imm32, RAX","subq imm32, RAX","REX.W 2D id","N.S.",= "V","","","rw,r","Y","64" +"SUB r/m64, imm32","SUBQ imm32, r/m64","subq imm32, r/m64","REX.W 81 /5 id= ","N.S.","V","","","rw,r","Y","64" +"SUB r/m64, imm8","SUBQ imm8, r/m64","subq imm8, r/m64","REX.W 83 /5 ib","= N.S.","V","","","rw,r","Y","64" +"SUB r64, r/m64","SUBQ r/m64, r64","subq r/m64, r64","REX.W 2B /r","N.S.",= "V","","","rw,r","Y","64" +"SUB r/m64, r64","SUBQ r64, r/m64","subq r64, r/m64","REX.W 29 /r","N.S.",= "V","","","rw,r","Y","64" +"SUBSD xmm1, xmm2/m64","SUBSD xmm2/m64, xmm1","subsd xmm2/m64, xmm1","F2 0= F 5C /r","V","V","SSE2","","rw,r","","" +"SUBSS xmm1, xmm2/m32","SUBSS xmm2/m32, xmm1","subss xmm2/m32, xmm1","F3 0= F 5C 
/r","V","V","SSE","","rw,r","","" +"SUB AX, imm16","SUBW imm16, AX","subw imm16, AX","2D iw","V","V","","oper= and16","rw,r","Y","16" +"SUB r/m16, imm16","SUBW imm16, r/m16","subw imm16, r/m16","81 /5 iw","V",= "V","","operand16","rw,r","Y","16" +"SUB r/m16, imm8","SUBW imm8, r/m16","subw imm8, r/m16","83 /5 ib","V","V"= ,"","operand16","rw,r","Y","16" +"SUB r16, r/m16","SUBW r/m16, r16","subw r/m16, r16","2B /r","V","V","","o= perand16","rw,r","Y","16" +"SUB r/m16, r16","SUBW r16, r/m16","subw r16, r/m16","29 /r","V","V","","o= perand16","rw,r","Y","16" +"SWAPGS","SWAPGS","swapgs","0F 01 F8","N.S.","V","","","","","" +"SYSCALL","SYSCALL","syscall","0F 05","N.S.","V","","default64","","","" +"SYSCALL","SYSCALL","syscall","0F 05","V","N.S.","AMD","amd","","","" +"SYSENTER","SYSENTER","sysenter","0F 34","V","V","PPRO","","","","" +"SYSEXIT","SYSEXIT","sysexit","0F 35","V","V","PPRO","","","","" +"SYSEXIT","SYSEXIT","sysexit","REX.W 0F 35","N.E.","V","","pseudo","","","" +"SYSRET","SYSRET","sysretw/sysretl/sysretl","0F 07","V","N.S.","AMD","amd"= ,"","","" +"SYSRET","SYSRET","sysretw/sysretl/sysretl","0F 07","N.S.","V","","operand= 32,operand64","","","" +"SYSRET","SYSRET","sysretw/sysretl/sysretl","REX.W 0F 07","I","V","","pseu= do","","","" +"T1MSKC r32V, r/m32","T1MSKCL r/m32, r32V","t1mskcl r/m32, r32V","XOP.NDD.= 128.09.WIG 01 /7","V","V","TBM","amd,operand16,operand32","w,r","Y","32" +"T1MSKC r64V, r/m64","T1MSKCQ r/m64, r64V","t1mskcq r/m64, r64V","XOP.NDD.= 128.09.WIG 01 /7","N.S.","V","TBM","amd,operand64","w,r","Y","64" +"TEST AL, imm8","TESTB imm8, AL","testb imm8, AL","A8 ib","V","V","","","r= ,r","Y","8" +"TEST r/m8, imm8","TESTB imm8, r/m8","testb imm8, r/m8","F6 /0 ib","V","V"= ,"","","r,r","Y","8" +"TEST r/m8, imm8","TESTB imm8, r/m8","testb imm8, r/m8","F6 /1 ib","V","V"= ,"","","r,r","Y","8" +"TEST r/m8, imm8","TESTB imm8, r/m8","testb imm8, r/m8","REX F6 /0 ib","N.= E.","V","","pseudo64","r,r","Y","8" +"TEST r/m8, r8","TESTB r8, r/m8","testb r8, r/m8","84 /r","V","V","","","r= ,r","Y","8" +"TEST r/m8, r8","TESTB r8, r/m8","testb r8, r/m8","REX 84 /r","N.E.","V","= ","pseudo64","r,r","Y","8" +"TEST EAX, imm32","TESTL imm32, EAX","testl imm32, EAX","A9 id","V","V",""= ,"operand32","r,r","Y","32" +"TEST r/m32, imm32","TESTL imm32, r/m32","testl imm32, r/m32","F7 /0 id","= V","V","","operand32","r,r","Y","32" +"TEST r/m32, imm32","TESTL imm32, r/m32","testl imm32, r/m32","F7 /1 id","= V","V","","operand32","r,r","Y","32" +"TEST r/m32, r32","TESTL r32, r/m32","testl r32, r/m32","85 /r","V","V",""= ,"operand32","r,r","Y","32" +"TEST RAX, imm32","TESTQ imm32, RAX","testq imm32, RAX","REX.W A9 id","N.S= .","V","","","r,r","Y","64" +"TEST r/m64, imm32","TESTQ imm32, r/m64","testq imm32, r/m64","REX.W F7 /0= id","N.S.","V","","","r,r","Y","64" +"TEST r/m64, imm32","TESTQ imm32, r/m64","testq imm32, r/m64","REX.W F7 /1= id","N.S.","V","","","r,r","Y","64" +"TEST r/m64, r64","TESTQ r64, r/m64","testq r64, r/m64","REX.W 85 /r","N.S= .","V","","","r,r","Y","64" +"TEST AX, imm16","TESTW imm16, AX","testw imm16, AX","A9 iw","V","V","","o= perand16","r,r","Y","16" +"TEST r/m16, imm16","TESTW imm16, r/m16","testw imm16, r/m16","F7 /0 iw","= V","V","","operand16","r,r","Y","16" +"TEST r/m16, imm16","TESTW imm16, r/m16","testw imm16, r/m16","F7 /1 iw","= V","V","","operand16","r,r","Y","16" +"TEST r/m16, r16","TESTW r16, r/m16","testw r16, r/m16","85 /r","V","V",""= ,"operand16","r,r","Y","16" +"TZCNT r32, r/m32","TZCNTL r/m32, r32","tzcntl r/m32, r32","F3 0F BC /r","= 
V","V","BMI1","operand32","w,r","Y","32" +"TZCNT r64, r/m64","TZCNTQ r/m64, r64","tzcntq r/m64, r64","F3 REX.W 0F BC= /r","N.S.","V","BMI1","","w,r","Y","64" +"TZCNT r16, r/m16","TZCNTW r/m16, r16","tzcntw r/m16, r16","F3 0F BC /r","= V","V","BMI1","operand16","w,r","Y","16" +"TZMSK r32V, r/m32","TZMSKL r/m32, r32V","tzmskl r/m32, r32V","XOP.NDD.128= .09.WIG 01 /4","V","V","TBM","amd,operand16,operand32","w,r","Y","32" +"TZMSK r64V, r/m64","TZMSKQ r/m64, r64V","tzmskq r/m64, r64V","XOP.NDD.128= .09.WIG 01 /4","N.S.","V","TBM","amd,operand64","w,r","Y","64" +"UCOMISD xmm1, xmm2/m64","UCOMISD xmm2/m64, xmm1","ucomisd xmm2/m64, xmm1"= ,"66 0F 2E /r","V","V","SSE2","","r,r","","" +"UCOMISS xmm1, xmm2/m32","UCOMISS xmm2/m32, xmm1","ucomiss xmm2/m32, xmm1"= ,"0F 2E /r","V","V","SSE","","r,r","","" +"UD0 r32, r/m32","UD0 r/m32, r32","ud0 r/m32, r32","0F FF /r","V","V","PPR= O","","r,r","","" +"UD1 r32, r/m32","UD1 r/m32, r32","ud1 r/m32, r32","0F B9 /r","V","V","PPR= O","","r,r","","" +"UD2","UD2","ud2","0F 0B","V","V","PPRO","","","","" +"UNPCKHPD xmm1, xmm2/m128","UNPCKHPD xmm2/m128, xmm1","unpckhpd xmm2/m128,= xmm1","66 0F 15 /r","V","V","SSE2","","rw,r","","" +"UNPCKHPS xmm1, xmm2/m128","UNPCKHPS xmm2/m128, xmm1","unpckhps xmm2/m128,= xmm1","0F 15 /r","V","V","SSE","","rw,r","","" +"UNPCKLPD xmm1, xmm2/m128","UNPCKLPD xmm2/m128, xmm1","unpcklpd xmm2/m128,= xmm1","66 0F 14 /r","V","V","SSE2","","rw,r","","" +"UNPCKLPS xmm1, xmm2/m128","UNPCKLPS xmm2/m128, xmm1","unpcklps xmm2/m128,= xmm1","0F 14 /r","V","V","SSE","","rw,r","","" +"V4FMADDPS zmm1, {k}{z}, zmmV+3, m128","V4FMADDPS m128, zmmV+3, {k}{z}, zm= m1","v4fmaddps m128, zmmV+3, {k}{z}, zmm1","EVEX.DDS.512.F2.0F38.W0 9A /r",= "V","V","AVX512_4FMAPS","modrm_memonly,scale16","rw,r,r,r","","" +"V4FMADDSS xmm1, {k}{z}, xmmV+3, m128","V4FMADDSS m128, xmmV+3, {k}{z}, xm= m1","v4fmaddss m128, xmmV+3, {k}{z}, xmm1","EVEX.DDS.LIG.F2.0F38.W0 9B /r",= "V","V","AVX512_4FMAPS","modrm_memonly,scale16","rw,r,r,r","","" +"V4FNMADDPS zmm1, {k}{z}, zmmV+3, m128","V4FNMADDPS m128, zmmV+3, {k}{z}, = zmm1","v4fnmaddps m128, zmmV+3, {k}{z}, zmm1","EVEX.DDS.512.F2.0F38.W0 AA /= r","V","V","AVX512_4FMAPS","modrm_memonly,scale16","rw,r,r,r","","" +"V4FNMADDSS xmm1, {k}{z}, xmmV+3, m128","V4FNMADDSS m128, xmmV+3, {k}{z}, = xmm1","v4fnmaddss m128, xmmV+3, {k}{z}, xmm1","EVEX.DDS.LIG.F2.0F38.W0 AB /= r","V","V","AVX512_4FMAPS","modrm_memonly,scale16","rw,r,r,r","","" +"VADDPD xmm1, xmmV, xmm2/m128","VADDPD xmm2/m128, xmmV, xmm1","vaddpd xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 58 /r","V","V","AVX","","w,r,r","= ","" +"VADDPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VADDPD xmm2/m128/m64bcst, = xmmV, {k}{z}, xmm1","vaddpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F.W1 58 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r= ","","" +"VADDPD ymm1, ymmV, ymm2/m256","VADDPD ymm2/m256, ymmV, ymm1","vaddpd ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 58 /r","V","V","AVX","","w,r,r","= ","" +"VADDPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VADDPD ymm2/m256/m64bcst, = ymmV, {k}{z}, ymm1","vaddpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F.W1 58 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r= ","","" +"VADDPD zmm1{er}, {k}{z}, zmmV, zmm2","VADDPD zmm2, zmmV, {k}{z}, zmm1{er}= ","vaddpd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.66.0F.W1 58 /r","V","= V","AVX512F","modrm_regonly","w,r,r,r","","" +"VADDPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VADDPD zmm2/m512/m64bcst, = zmmV, {k}{z}, zmm1","vaddpd zmm2/m512/m64bcst, 
zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F.W1 58 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VADDPS xmm1, xmmV, xmm2/m128","VADDPS xmm2/m128, xmmV, xmm1","vaddps xmm2= /m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 58 /r","V","V","AVX","","w,r,r","","" +"VADDPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VADDPS xmm2/m128/m32bcst, = xmmV, {k}{z}, xmm1","vaddps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.0F.W0 58 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","= ","" +"VADDPS ymm1, ymmV, ymm2/m256","VADDPS ymm2/m256, ymmV, ymm1","vaddps ymm2= /m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 58 /r","V","V","AVX","","w,r,r","","" +"VADDPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VADDPS ymm2/m256/m32bcst, = ymmV, {k}{z}, ymm1","vaddps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.0F.W0 58 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","= ","" +"VADDPS zmm1{er}, {k}{z}, zmmV, zmm2","VADDPS zmm2, zmmV, {k}{z}, zmm1{er}= ","vaddps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.0F.W0 58 /r","V","V",= "AVX512F","modrm_regonly","w,r,r,r","","" +"VADDPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VADDPS zmm2/m512/m32bcst, = zmmV, {k}{z}, zmm1","vaddps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.0F.W0 58 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VADDSD xmm1{er}, {k}{z}, xmmV, xmm2","VADDSD xmm2, xmmV, {k}{z}, xmm1{er}= ","vaddsd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F2.0F.W1 58 /r","V","= V","AVX512F","modrm_regonly","w,r,r,r","","" +"VADDSD xmm1, xmmV, xmm2/m64","VADDSD xmm2/m64, xmmV, xmm1","vaddsd xmm2/m= 64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 58 /r","V","V","AVX","","w,r,r","","" +"VADDSD xmm1, {k}{z}, xmmV, xmm2/m64","VADDSD xmm2/m64, xmmV, {k}{z}, xmm1= ","vaddsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 58 /r","V","= V","AVX512F","scale8","w,r,r,r","","" +"VADDSS xmm1{er}, {k}{z}, xmmV, xmm2","VADDSS xmm2, xmmV, {k}{z}, xmm1{er}= ","vaddss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F3.0F.W0 58 /r","V","= V","AVX512F","modrm_regonly","w,r,r,r","","" +"VADDSS xmm1, xmmV, xmm2/m32","VADDSS xmm2/m32, xmmV, xmm1","vaddss xmm2/m= 32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 58 /r","V","V","AVX","","w,r,r","","" +"VADDSS xmm1, {k}{z}, xmmV, xmm2/m32","VADDSS xmm2/m32, xmmV, {k}{z}, xmm1= ","vaddss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 58 /r","V","= V","AVX512F","scale4","w,r,r,r","","" +"VADDSUBPD xmm1, xmmV, xmm2/m128","VADDSUBPD xmm2/m128, xmmV, xmm1","vadds= ubpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D0 /r","V","V","AVX","",= "w,r,r","","" +"VADDSUBPD ymm1, ymmV, ymm2/m256","VADDSUBPD ymm2/m256, ymmV, ymm1","vadds= ubpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D0 /r","V","V","AVX","",= "w,r,r","","" +"VADDSUBPS xmm1, xmmV, xmm2/m128","VADDSUBPS xmm2/m128, xmmV, xmm1","vadds= ubps xmm2/m128, xmmV, xmm1","VEX.NDS.128.F2.0F.WIG D0 /r","V","V","AVX","",= "w,r,r","","" +"VADDSUBPS ymm1, ymmV, ymm2/m256","VADDSUBPS ymm2/m256, ymmV, ymm1","vadds= ubps ymm2/m256, ymmV, ymm1","VEX.NDS.256.F2.0F.WIG D0 /r","V","V","AVX","",= "w,r,r","","" +"VAESDEC xmm1, xmmV, xmm2/m128","VAESDEC xmm2/m128, xmmV, xmm1","vaesdec x= mm2/m128, xmmV, xmm1","EVEX.NDS.128.66.0F38.WIG DE /r","V","V","AES+AVX512V= L","scale16","w,r,r","","" +"VAESDEC xmm1, xmmV, xmm2/m128","VAESDEC xmm2/m128, xmmV, xmm1","vaesdec x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG DE /r","V","V","AES+AVX","",= "w,r,r","","" +"VAESDEC ymm1, ymmV, ymm2/m256","VAESDEC ymm2/m256, ymmV, ymm1","vaesdec y= mm2/m256, ymmV, ymm1","EVEX.NDS.256.66.0F38.WIG DE 
/r","V","V","AES+AVX512V= L","scale32","w,r,r","","" +"VAESDEC ymm1, ymmV, ymm2/m256","VAESDEC ymm2/m256, ymmV, ymm1","vaesdec y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG DE /r","V","V","VAES+AVX",""= ,"w,r,r","","" +"VAESDEC zmm1, zmmV, zmm2/m512","VAESDEC zmm2/m512, zmmV, zmm1","vaesdec z= mm2/m512, zmmV, zmm1","EVEX.NDS.512.66.0F38.WIG DE /r","V","V","AES+AVX512F= ","scale64","w,r,r","","" +"VAESDECLAST xmm1, xmmV, xmm2/m128","VAESDECLAST xmm2/m128, xmmV, xmm1","v= aesdeclast xmm2/m128, xmmV, xmm1","EVEX.NDS.128.66.0F38.WIG DF /r","V","V",= "AES+AVX512VL","scale16","w,r,r","","" +"VAESDECLAST xmm1, xmmV, xmm2/m128","VAESDECLAST xmm2/m128, xmmV, xmm1","v= aesdeclast xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG DF /r","V","V","= AES+AVX","","w,r,r","","" +"VAESDECLAST ymm1, ymmV, ymm2/m256","VAESDECLAST ymm2/m256, ymmV, ymm1","v= aesdeclast ymm2/m256, ymmV, ymm1","EVEX.NDS.256.66.0F38.WIG DF /r","V","V",= "AES+AVX512VL","scale32","w,r,r","","" +"VAESDECLAST ymm1, ymmV, ymm2/m256","VAESDECLAST ymm2/m256, ymmV, ymm1","v= aesdeclast ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG DF /r","V","V","= VAES+AVX","","w,r,r","","" +"VAESDECLAST zmm1, zmmV, zmm2/m512","VAESDECLAST zmm2/m512, zmmV, zmm1","v= aesdeclast zmm2/m512, zmmV, zmm1","EVEX.NDS.512.66.0F38.WIG DF /r","V","V",= "AES+AVX512F","scale64","w,r,r","","" +"VAESENC xmm1, xmmV, xmm2/m128","VAESENC xmm2/m128, xmmV, xmm1","vaesenc x= mm2/m128, xmmV, xmm1","EVEX.NDS.128.66.0F38.WIG DC /r","V","V","AES+AVX512V= L","scale16","w,r,r","","" +"VAESENC xmm1, xmmV, xmm2/m128","VAESENC xmm2/m128, xmmV, xmm1","vaesenc x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG DC /r","V","V","AES+AVX","",= "w,r,r","","" +"VAESENC ymm1, ymmV, ymm2/m256","VAESENC ymm2/m256, ymmV, ymm1","vaesenc y= mm2/m256, ymmV, ymm1","EVEX.NDS.256.66.0F38.WIG DC /r","V","V","AES+AVX512V= L","scale32","w,r,r","","" +"VAESENC ymm1, ymmV, ymm2/m256","VAESENC ymm2/m256, ymmV, ymm1","vaesenc y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG DC /r","V","V","VAES+AVX",""= ,"w,r,r","","" +"VAESENC zmm1, zmmV, zmm2/m512","VAESENC zmm2/m512, zmmV, zmm1","vaesenc z= mm2/m512, zmmV, zmm1","EVEX.NDS.512.66.0F38.WIG DC /r","V","V","AES+AVX512F= ","scale64","w,r,r","","" +"VAESENCLAST xmm1, xmmV, xmm2/m128","VAESENCLAST xmm2/m128, xmmV, xmm1","v= aesenclast xmm2/m128, xmmV, xmm1","EVEX.NDS.128.66.0F38.WIG DD /r","V","V",= "AES+AVX512VL","scale16","w,r,r","","" +"VAESENCLAST xmm1, xmmV, xmm2/m128","VAESENCLAST xmm2/m128, xmmV, xmm1","v= aesenclast xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG DD /r","V","V","= AES+AVX","","w,r,r","","" +"VAESENCLAST ymm1, ymmV, ymm2/m256","VAESENCLAST ymm2/m256, ymmV, ymm1","v= aesenclast ymm2/m256, ymmV, ymm1","EVEX.NDS.256.66.0F38.WIG DD /r","V","V",= "AES+AVX512VL","scale32","w,r,r","","" +"VAESENCLAST ymm1, ymmV, ymm2/m256","VAESENCLAST ymm2/m256, ymmV, ymm1","v= aesenclast ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG DD /r","V","V","= VAES+AVX","","w,r,r","","" +"VAESENCLAST zmm1, zmmV, zmm2/m512","VAESENCLAST zmm2/m512, zmmV, zmm1","v= aesenclast zmm2/m512, zmmV, zmm1","EVEX.NDS.512.66.0F38.WIG DD /r","V","V",= "AES+AVX512F","scale64","w,r,r","","" +"VAESIMC xmm1, xmm2/m128","VAESIMC xmm2/m128, xmm1","vaesimc xmm2/m128, xm= m1","VEX.128.66.0F38.WIG DB /r","V","V","AES+AVX","","w,r","","" +"VAESKEYGENASSIST xmm1, xmm2/m128, imm8u","VAESKEYGENASSIST imm8u, xmm2/m1= 28, xmm1","vaeskeygenassist imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG DF= /r ib","V","V","AES+AVX","","w,r,r","","" +"VALIGND xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, 
imm8u","VALIGND imm8u, xmm= 2/m128/m32bcst, xmmV, {k}{z}, xmm1","valignd imm8u, xmm2/m128/m32bcst, xmmV= , {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W0 03 /r ib","V","V","AVX512F+AVX512V= L","bscale4,scale16","w,r,r,r,r","","" +"VALIGND ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VALIGND imm8u, ymm= 2/m256/m32bcst, ymmV, {k}{z}, ymm1","valignd imm8u, ymm2/m256/m32bcst, ymmV= , {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 03 /r ib","V","V","AVX512F+AVX512V= L","bscale4,scale32","w,r,r,r,r","","" +"VALIGND zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VALIGND imm8u, zmm= 2/m512/m32bcst, zmmV, {k}{z}, zmm1","valignd imm8u, zmm2/m512/m32bcst, zmmV= , {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 03 /r ib","V","V","AVX512F","bscal= e4,scale64","w,r,r,r,r","","" +"VALIGNQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VALIGNQ imm8u, xmm= 2/m128/m64bcst, xmmV, {k}{z}, xmm1","valignq imm8u, xmm2/m128/m64bcst, xmmV= , {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 03 /r ib","V","V","AVX512F+AVX512V= L","bscale8,scale16","w,r,r,r,r","","" +"VALIGNQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VALIGNQ imm8u, ymm= 2/m256/m64bcst, ymmV, {k}{z}, ymm1","valignq imm8u, ymm2/m256/m64bcst, ymmV= , {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 03 /r ib","V","V","AVX512F+AVX512V= L","bscale8,scale32","w,r,r,r,r","","" +"VALIGNQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VALIGNQ imm8u, zmm= 2/m512/m64bcst, zmmV, {k}{z}, zmm1","valignq imm8u, zmm2/m512/m64bcst, zmmV= , {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 03 /r ib","V","V","AVX512F","bscal= e8,scale64","w,r,r,r,r","","" +"VANDNPD xmm1, xmmV, xmm2/m128","VANDNPD xmm2/m128, xmmV, xmm1","vandnpd x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 55 /r","V","V","AVX","","w,r,r= ","","" +"VANDNPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VANDNPD xmm2/m128/m64bcst= , xmmV, {k}{z}, xmm1","vandnpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F.W1 55 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r= ,r,r","","" +"VANDNPD ymm1, ymmV, ymm2/m256","VANDNPD ymm2/m256, ymmV, ymm1","vandnpd y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 55 /r","V","V","AVX","","w,r,r= ","","" +"VANDNPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VANDNPD ymm2/m256/m64bcst= , ymmV, {k}{z}, ymm1","vandnpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F.W1 55 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r= ,r,r","","" +"VANDNPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VANDNPD zmm2/m512/m64bcst= , zmmV, {k}{z}, zmm1","vandnpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F.W1 55 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r,r","",= "" +"VANDNPS xmm1, xmmV, xmm2/m128","VANDNPS xmm2/m128, xmmV, xmm1","vandnps x= mm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 55 /r","V","V","AVX","","w,r,r","= ","" +"VANDNPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VANDNPS xmm2/m128/m32bcst= , xmmV, {k}{z}, xmm1","vandnps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.0F.W0 55 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r,= r","","" +"VANDNPS ymm1, ymmV, ymm2/m256","VANDNPS ymm2/m256, ymmV, ymm1","vandnps y= mm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 55 /r","V","V","AVX","","w,r,r","= ","" +"VANDNPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VANDNPS ymm2/m256/m32bcst= , ymmV, {k}{z}, ymm1","vandnps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.0F.W0 55 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale32","w,r,r,= r","","" +"VANDNPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VANDNPS zmm2/m512/m32bcst= , zmmV, {k}{z}, zmm1","vandnps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX= 
.NDS.512.0F.W0 55 /r","V","V","AVX512DQ","bscale4,scale64","w,r,r,r","","" +"VANDPD xmm1, xmmV, xmm2/m128","VANDPD xmm2/m128, xmmV, xmm1","vandpd xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 54 /r","V","V","AVX","","w,r,r","= ","" +"VANDPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VANDPD xmm2/m128/m64bcst, = xmmV, {k}{z}, xmm1","vandpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F.W1 54 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r,= r","","" +"VANDPD ymm1, ymmV, ymm2/m256","VANDPD ymm2/m256, ymmV, ymm1","vandpd ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 54 /r","V","V","AVX","","w,r,r","= ","" +"VANDPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VANDPD ymm2/m256/m64bcst, = ymmV, {k}{z}, ymm1","vandpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F.W1 54 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r,= r","","" +"VANDPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VANDPD zmm2/m512/m64bcst, = zmmV, {k}{z}, zmm1","vandpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F.W1 54 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r,r","","" +"VANDPS xmm1, xmmV, xmm2/m128","VANDPS xmm2/m128, xmmV, xmm1","vandps xmm2= /m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 54 /r","V","V","AVX","","w,r,r","","" +"VANDPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VANDPS xmm2/m128/m32bcst, = xmmV, {k}{z}, xmm1","vandps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.0F.W0 54 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r,r",= "","" +"VANDPS ymm1, ymmV, ymm2/m256","VANDPS ymm2/m256, ymmV, ymm1","vandps ymm2= /m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 54 /r","V","V","AVX","","w,r,r","","" +"VANDPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VANDPS ymm2/m256/m32bcst, = ymmV, {k}{z}, ymm1","vandps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.0F.W0 54 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale32","w,r,r,r",= "","" +"VANDPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VANDPS zmm2/m512/m32bcst, = zmmV, {k}{z}, zmm1","vandps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.0F.W0 54 /r","V","V","AVX512DQ","bscale4,scale64","w,r,r,r","","" +"VBLENDMPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VBLENDMPD xmm2/m128/m64= bcst, xmmV, {k}{z}, xmm1","vblendmpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.NDS.128.66.0F38.W1 65 /r","V","V","AVX512F+AVX512VL","bscale8,scale1= 6","w,r,r,r","","" +"VBLENDMPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VBLENDMPD ymm2/m256/m64= bcst, ymmV, {k}{z}, ymm1","vblendmpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.NDS.256.66.0F38.W1 65 /r","V","V","AVX512F+AVX512VL","bscale8,scale3= 2","w,r,r,r","","" +"VBLENDMPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VBLENDMPD zmm2/m512/m64= bcst, zmmV, {k}{z}, zmm1","vblendmpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.NDS.512.66.0F38.W1 65 /r","V","V","AVX512F","bscale8,scale64","w,r,r= ,r","","" +"VBLENDMPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VBLENDMPS xmm2/m128/m32= bcst, xmmV, {k}{z}, xmm1","vblendmps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.NDS.128.66.0F38.W0 65 /r","V","V","AVX512F+AVX512VL","bscale4,scale1= 6","w,r,r,r","","" +"VBLENDMPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VBLENDMPS ymm2/m256/m32= bcst, ymmV, {k}{z}, ymm1","vblendmps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.NDS.256.66.0F38.W0 65 /r","V","V","AVX512F+AVX512VL","bscale4,scale3= 2","w,r,r,r","","" +"VBLENDMPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VBLENDMPS zmm2/m512/m32= bcst, zmmV, {k}{z}, zmm1","vblendmps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.NDS.512.66.0F38.W0 65 
/r","V","V","AVX512F","bscale4,scale64","w,r,r= ,r","","" +"VBLENDPD xmm1, xmmV, xmm2/m128, imm8u","VBLENDPD imm8u, xmm2/m128, xmmV, = xmm1","vblendpd imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 0D /= r ib","V","V","AVX","","w,r,r,r","","" +"VBLENDPD ymm1, ymmV, ymm2/m256, imm8u","VBLENDPD imm8u, ymm2/m256, ymmV, = ymm1","vblendpd imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WIG 0D /= r ib","V","V","AVX","","w,r,r,r","","" +"VBLENDPS xmm1, xmmV, xmm2/m128, imm8u","VBLENDPS imm8u, xmm2/m128, xmmV, = xmm1","vblendps imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 0C /= r ib","V","V","AVX","","w,r,r,r","","" +"VBLENDPS ymm1, ymmV, ymm2/m256, imm8u","VBLENDPS imm8u, ymm2/m256, ymmV, = ymm1","vblendps imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WIG 0C /= r ib","V","V","AVX","","w,r,r,r","","" +"VBLENDVPD xmm1, xmmV, xmm2/m128, xmmIH","VBLENDVPD xmmIH, xmm2/m128, xmmV= , xmm1","vblendvpd xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 4B= /r /is4","V","V","AVX","","w,r,r,r","","" +"VBLENDVPD ymm1, ymmV, ymm2/m256, ymmIH","VBLENDVPD ymmIH, ymm2/m256, ymmV= , ymm1","vblendvpd ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 4B= /r /is4","V","V","AVX","","w,r,r,r","","" +"VBLENDVPS xmm1, xmmV, xmm2/m128, xmmIH","VBLENDVPS xmmIH, xmm2/m128, xmmV= , xmm1","vblendvps xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 4A= /r /is4","V","V","AVX","","w,r,r,r","","" +"VBLENDVPS ymm1, ymmV, ymm2/m256, ymmIH","VBLENDVPS ymmIH, ymm2/m256, ymmV= , ymm1","vblendvps ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 4A= /r /is4","V","V","AVX","","w,r,r,r","","" +"VBROADCASTF128 ymm1, m128","VBROADCASTF128 m128, ymm1","vbroadcastf128 m1= 28, ymm1","VEX.256.66.0F38.W0 1A /r","V","V","AVX","modrm_memonly","w,r",""= ,"" +"VBROADCASTF32X2 ymm1, {k}{z}, xmm2/m64","VBROADCASTF32X2 xmm2/m64, {k}{z}= , ymm1","vbroadcastf32x2 xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.W0 19 /r= ","V","V","AVX512DQ+AVX512VL","scale8","w,r,r","","" +"VBROADCASTF32X2 zmm1, {k}{z}, xmm2/m64","VBROADCASTF32X2 xmm2/m64, {k}{z}= , zmm1","vbroadcastf32x2 xmm2/m64, {k}{z}, zmm1","EVEX.512.66.0F38.W0 19 /r= ","V","V","AVX512DQ","scale8","w,r,r","","" +"VBROADCASTF32X4 ymm1, {k}{z}, m128","VBROADCASTF32X4 m128, {k}{z}, ymm1",= "vbroadcastf32x4 m128, {k}{z}, ymm1","EVEX.256.66.0F38.W0 1A /r","V","V","A= VX512F+AVX512VL","modrm_memonly,scale16","w,r,r","","" +"VBROADCASTF32X4 zmm1, {k}{z}, m128","VBROADCASTF32X4 m128, {k}{z}, zmm1",= "vbroadcastf32x4 m128, {k}{z}, zmm1","EVEX.512.66.0F38.W0 1A /r","V","V","A= VX512F","modrm_memonly,scale16","w,r,r","","" +"VBROADCASTF32X8 zmm1, {k}{z}, m256","VBROADCASTF32X8 m256, {k}{z}, zmm1",= "vbroadcastf32x8 m256, {k}{z}, zmm1","EVEX.512.66.0F38.W0 1B /r","V","V","A= VX512DQ","modrm_memonly,scale32","w,r,r","","" +"VBROADCASTF64X2 ymm1, {k}{z}, m128","VBROADCASTF64X2 m128, {k}{z}, ymm1",= "vbroadcastf64x2 m128, {k}{z}, ymm1","EVEX.256.66.0F38.W1 1A /r","V","V","A= VX512DQ+AVX512VL","modrm_memonly,scale16","w,r,r","","" +"VBROADCASTF64X2 zmm1, {k}{z}, m128","VBROADCASTF64X2 m128, {k}{z}, zmm1",= "vbroadcastf64x2 m128, {k}{z}, zmm1","EVEX.512.66.0F38.W1 1A /r","V","V","A= VX512DQ","modrm_memonly,scale16","w,r,r","","" +"VBROADCASTF64X4 zmm1, {k}{z}, m256","VBROADCASTF64X4 m256, {k}{z}, zmm1",= "vbroadcastf64x4 m256, {k}{z}, zmm1","EVEX.512.66.0F38.W1 1B /r","V","V","A= VX512F","modrm_memonly,scale32","w,r,r","","" +"VBROADCASTI128 ymm1, m128","VBROADCASTI128 m128, ymm1","vbroadcasti128 m1= 28, ymm1","VEX.256.66.0F38.W0 5A /r","V","V","AVX2","modrm_memonly","w,r","= 
","" +"VBROADCASTI32X2 xmm1, {k}{z}, xmm2/m64","VBROADCASTI32X2 xmm2/m64, {k}{z}= , xmm1","vbroadcasti32x2 xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.W0 59 /r= ","V","V","AVX512DQ+AVX512VL","scale8","w,r,r","","" +"VBROADCASTI32X2 ymm1, {k}{z}, xmm2/m64","VBROADCASTI32X2 xmm2/m64, {k}{z}= , ymm1","vbroadcasti32x2 xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.W0 59 /r= ","V","V","AVX512DQ+AVX512VL","scale8","w,r,r","","" +"VBROADCASTI32X2 zmm1, {k}{z}, xmm2/m64","VBROADCASTI32X2 xmm2/m64, {k}{z}= , zmm1","vbroadcasti32x2 xmm2/m64, {k}{z}, zmm1","EVEX.512.66.0F38.W0 59 /r= ","V","V","AVX512DQ","scale8","w,r,r","","" +"VBROADCASTI32X4 ymm1, {k}{z}, m128","VBROADCASTI32X4 m128, {k}{z}, ymm1",= "vbroadcasti32x4 m128, {k}{z}, ymm1","EVEX.256.66.0F38.W0 5A /r","V","V","A= VX512F+AVX512VL","modrm_memonly,scale16","w,r,r","","" +"VBROADCASTI32X4 zmm1, {k}{z}, m128","VBROADCASTI32X4 m128, {k}{z}, zmm1",= "vbroadcasti32x4 m128, {k}{z}, zmm1","EVEX.512.66.0F38.W0 5A /r","V","V","A= VX512F","modrm_memonly,scale16","w,r,r","","" +"VBROADCASTI32X8 zmm1, {k}{z}, m256","VBROADCASTI32X8 m256, {k}{z}, zmm1",= "vbroadcasti32x8 m256, {k}{z}, zmm1","EVEX.512.66.0F38.W0 5B /r","V","V","A= VX512DQ","modrm_memonly,scale32","w,r,r","","" +"VBROADCASTI64X2 ymm1, {k}{z}, m128","VBROADCASTI64X2 m128, {k}{z}, ymm1",= "vbroadcasti64x2 m128, {k}{z}, ymm1","EVEX.256.66.0F38.W1 5A /r","V","V","A= VX512DQ+AVX512VL","modrm_memonly,scale16","w,r,r","","" +"VBROADCASTI64X2 zmm1, {k}{z}, m128","VBROADCASTI64X2 m128, {k}{z}, zmm1",= "vbroadcasti64x2 m128, {k}{z}, zmm1","EVEX.512.66.0F38.W1 5A /r","V","V","A= VX512DQ","modrm_memonly,scale16","w,r,r","","" +"VBROADCASTI64X4 zmm1, {k}{z}, m256","VBROADCASTI64X4 m256, {k}{z}, zmm1",= "vbroadcasti64x4 m256, {k}{z}, zmm1","EVEX.512.66.0F38.W1 5B /r","V","V","A= VX512F","modrm_memonly,scale32","w,r,r","","" +"VBROADCASTSD ymm1, m64","VBROADCASTSD m64, ymm1","vbroadcastsd m64, ymm1"= ,"VEX.256.66.0F38.W0 19 /r","V","V","AVX","modrm_memonly","w,r","","" +"VBROADCASTSD ymm1, xmm2","VBROADCASTSD xmm2, ymm1","vbroadcastsd xmm2, ym= m1","VEX.256.66.0F38.W0 19 /r","V","V","AVX2","modrm_regonly","w,r","","" +"VBROADCASTSD ymm1, {k}{z}, xmm2/m64","VBROADCASTSD xmm2/m64, {k}{z}, ymm1= ","vbroadcastsd xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.W1 19 /r","V","V"= ,"AVX512F+AVX512VL","scale8","w,r,r","","" +"VBROADCASTSD zmm1, {k}{z}, xmm2/m64","VBROADCASTSD xmm2/m64, {k}{z}, zmm1= ","vbroadcastsd xmm2/m64, {k}{z}, zmm1","EVEX.512.66.0F38.W1 19 /r","V","V"= ,"AVX512F","scale8","w,r,r","","" +"VBROADCASTSS xmm1, m32","VBROADCASTSS m32, xmm1","vbroadcastss m32, xmm1"= ,"VEX.128.66.0F38.W0 18 /r","V","V","AVX","modrm_memonly","w,r","","" +"VBROADCASTSS ymm1, m32","VBROADCASTSS m32, ymm1","vbroadcastss m32, ymm1"= ,"VEX.256.66.0F38.W0 18 /r","V","V","AVX","modrm_memonly","w,r","","" +"VBROADCASTSS xmm1, xmm2","VBROADCASTSS xmm2, xmm1","vbroadcastss xmm2, xm= m1","VEX.128.66.0F38.W0 18 /r","V","V","AVX2","modrm_regonly","w,r","","" +"VBROADCASTSS ymm1, xmm2","VBROADCASTSS xmm2, ymm1","vbroadcastss xmm2, ym= m1","VEX.256.66.0F38.W0 18 /r","V","V","AVX2","modrm_regonly","w,r","","" +"VBROADCASTSS xmm1, {k}{z}, xmm2/m32","VBROADCASTSS xmm2/m32, {k}{z}, xmm1= ","vbroadcastss xmm2/m32, {k}{z}, xmm1","EVEX.128.66.0F38.W0 18 /r","V","V"= ,"AVX512F+AVX512VL","scale4","w,r,r","","" +"VBROADCASTSS ymm1, {k}{z}, xmm2/m32","VBROADCASTSS xmm2/m32, {k}{z}, ymm1= ","vbroadcastss xmm2/m32, {k}{z}, ymm1","EVEX.256.66.0F38.W0 18 /r","V","V"= ,"AVX512F+AVX512VL","scale4","w,r,r","","" +"VBROADCASTSS zmm1, {k}{z}, 
xmm2/m32","VBROADCASTSS xmm2/m32, {k}{z}, zmm1= ","vbroadcastss xmm2/m32, {k}{z}, zmm1","EVEX.512.66.0F38.W0 18 /r","V","V"= ,"AVX512F","scale4","w,r,r","","" +"VCMPPD xmm1, xmmV, xmm2/m128, imm8u","VCMPPD imm8u, xmm2/m128, xmmV, xmm1= ","vcmppd imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG C2 /r ib","V= ","V","AVX","","w,r,r,r","","" +"VCMPPD k1, {k}, xmmV, xmm2/m128/m64bcst, imm8u","VCMPPD imm8u, xmm2/m128/= m64bcst, xmmV, {k}, k1","vcmppd imm8u, xmm2/m128/m64bcst, xmmV, {k}, k1","E= VEX.NDS.128.66.0F.W1 C2 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale16"= ,"w,r,r,r,r","","" +"VCMPPD ymm1, ymmV, ymm2/m256, imm8u","VCMPPD imm8u, ymm2/m256, ymmV, ymm1= ","vcmppd imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG C2 /r ib","V= ","V","AVX","","w,r,r,r","","" +"VCMPPD k1, {k}, ymmV, ymm2/m256/m64bcst, imm8u","VCMPPD imm8u, ymm2/m256/= m64bcst, ymmV, {k}, k1","vcmppd imm8u, ymm2/m256/m64bcst, ymmV, {k}, k1","E= VEX.NDS.256.66.0F.W1 C2 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32"= ,"w,r,r,r,r","","" +"VCMPPD k1{sae}, {k}, zmmV, zmm2, imm8u","VCMPPD imm8u, zmm2, zmmV, {k}, k= 1{sae}","vcmppd imm8u, zmm2, zmmV, {k}, k1{sae}","EVEX.NDS.512.66.0F.W1 C2 = /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r,r","","" +"VCMPPD k1, {k}, zmmV, zmm2/m512/m64bcst, imm8u","VCMPPD imm8u, zmm2/m512/= m64bcst, zmmV, {k}, k1","vcmppd imm8u, zmm2/m512/m64bcst, zmmV, {k}, k1","E= VEX.NDS.512.66.0F.W1 C2 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r= ,r","","" +"VCMPPS xmm1, xmmV, xmm2/m128, imm8u","VCMPPS imm8u, xmm2/m128, xmmV, xmm1= ","vcmpps imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG C2 /r ib","V","= V","AVX","","w,r,r,r","","" +"VCMPPS k1, {k}, xmmV, xmm2/m128/m32bcst, imm8u","VCMPPS imm8u, xmm2/m128/= m32bcst, xmmV, {k}, k1","vcmpps imm8u, xmm2/m128/m32bcst, xmmV, {k}, k1","E= VEX.NDS.128.0F.W0 C2 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w= ,r,r,r,r","","" +"VCMPPS ymm1, ymmV, ymm2/m256, imm8u","VCMPPS imm8u, ymm2/m256, ymmV, ymm1= ","vcmpps imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG C2 /r ib","V","= V","AVX","","w,r,r,r","","" +"VCMPPS k1, {k}, ymmV, ymm2/m256/m32bcst, imm8u","VCMPPS imm8u, ymm2/m256/= m32bcst, ymmV, {k}, k1","vcmpps imm8u, ymm2/m256/m32bcst, ymmV, {k}, k1","E= VEX.NDS.256.0F.W0 C2 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w= ,r,r,r,r","","" +"VCMPPS k1{sae}, {k}, zmmV, zmm2, imm8u","VCMPPS imm8u, zmm2, zmmV, {k}, k= 1{sae}","vcmpps imm8u, zmm2, zmmV, {k}, k1{sae}","EVEX.NDS.512.0F.W0 C2 /r = ib","V","V","AVX512F","modrm_regonly","w,r,r,r,r","","" +"VCMPPS k1, {k}, zmmV, zmm2/m512/m32bcst, imm8u","VCMPPS imm8u, zmm2/m512/= m32bcst, zmmV, {k}, k1","vcmpps imm8u, zmm2/m512/m32bcst, zmmV, {k}, k1","E= VEX.NDS.512.0F.W0 C2 /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r,r"= ,"","" +"VCMPSD k1{sae}, {k}, xmmV, xmm2, imm8u","VCMPSD imm8u, xmm2, xmmV, {k}, k= 1{sae}","vcmpsd imm8u, xmm2, xmmV, {k}, k1{sae}","EVEX.NDS.128.F2.0F.W1 C2 = /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r,r","","" +"VCMPSD xmm1, xmmV, xmm2/m64, imm8u","VCMPSD imm8u, xmm2/m64, xmmV, xmm1",= "vcmpsd imm8u, xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG C2 /r ib","V","= V","AVX","","w,r,r,r","","" +"VCMPSD k1, {k}, xmmV, xmm2/m64, imm8u","VCMPSD imm8u, xmm2/m64, xmmV, {k}= , k1","vcmpsd imm8u, xmm2/m64, xmmV, {k}, k1","EVEX.NDS.LIG.F2.0F.W1 C2 /r = ib","V","V","AVX512F","scale8","w,r,r,r,r","","" +"VCMPSS k1{sae}, {k}, xmmV, xmm2, imm8u","VCMPSS imm8u, xmm2, xmmV, {k}, k= 1{sae}","vcmpss imm8u, xmm2, xmmV, {k}, k1{sae}","EVEX.NDS.128.F3.0F.W0 C2 = /r 
ib","V","V","AVX512F","modrm_regonly","w,r,r,r,r","","" +"VCMPSS xmm1, xmmV, xmm2/m32, imm8u","VCMPSS imm8u, xmm2/m32, xmmV, xmm1",= "vcmpss imm8u, xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG C2 /r ib","V","= V","AVX","","w,r,r,r","","" +"VCMPSS k1, {k}, xmmV, xmm2/m32, imm8u","VCMPSS imm8u, xmm2/m32, xmmV, {k}= , k1","vcmpss imm8u, xmm2/m32, xmmV, {k}, k1","EVEX.NDS.LIG.F3.0F.W0 C2 /r = ib","V","V","AVX512F","scale4","w,r,r,r,r","","" +"VCOMISD xmm1{sae}, xmm2","VCOMISD xmm2, xmm1{sae}","vcomisd xmm2, xmm1{sa= e}","EVEX.128.66.0F.W1 2F /r","V","V","AVX512F","modrm_regonly","r,r","","" +"VCOMISD xmm1, xmm2/m64","VCOMISD xmm2/m64, xmm1","vcomisd xmm2/m64, xmm1"= ,"EVEX.LIG.66.0F.W1 2F /r","V","V","AVX512F","scale8","r,r","","" +"VCOMISD xmm1, xmm2/m64","VCOMISD xmm2/m64, xmm1","vcomisd xmm2/m64, xmm1"= ,"VEX.LIG.66.0F.WIG 2F /r","V","V","AVX","","r,r","","" +"VCOMISS xmm1{sae}, xmm2","VCOMISS xmm2, xmm1{sae}","vcomiss xmm2, xmm1{sa= e}","EVEX.128.0F.W0 2F /r","V","V","AVX512F","modrm_regonly","r,r","","" +"VCOMISS xmm1, xmm2/m32","VCOMISS xmm2/m32, xmm1","vcomiss xmm2/m32, xmm1"= ,"EVEX.LIG.0F.W0 2F /r","V","V","AVX512F","scale4","r,r","","" +"VCOMISS xmm1, xmm2/m32","VCOMISS xmm2/m32, xmm1","vcomiss xmm2/m32, xmm1"= ,"VEX.LIG.0F.WIG 2F /r","V","V","AVX","","r,r","","" +"VCOMPRESSPD xmm2/m128, {k}{z}, xmm1","VCOMPRESSPD xmm1, {k}{z}, xmm2/m128= ","vcompresspd xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F38.W1 8A /r","V","V"= ,"AVX512F+AVX512VL","scale8","w,r,r","","" +"VCOMPRESSPD ymm2/m256, {k}{z}, ymm1","VCOMPRESSPD ymm1, {k}{z}, ymm2/m256= ","vcompresspd ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F38.W1 8A /r","V","V"= ,"AVX512F+AVX512VL","scale8","w,r,r","","" +"VCOMPRESSPD zmm2/m512, {k}{z}, zmm1","VCOMPRESSPD zmm1, {k}{z}, zmm2/m512= ","vcompresspd zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F38.W1 8A /r","V","V"= ,"AVX512F","scale8","w,r,r","","" +"VCOMPRESSPS xmm2/m128, {k}{z}, xmm1","VCOMPRESSPS xmm1, {k}{z}, xmm2/m128= ","vcompressps xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F38.W0 8A /r","V","V"= ,"AVX512F+AVX512VL","scale4","w,r,r","","" +"VCOMPRESSPS ymm2/m256, {k}{z}, ymm1","VCOMPRESSPS ymm1, {k}{z}, ymm2/m256= ","vcompressps ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F38.W0 8A /r","V","V"= ,"AVX512F+AVX512VL","scale4","w,r,r","","" +"VCOMPRESSPS zmm2/m512, {k}{z}, zmm1","VCOMPRESSPS zmm1, {k}{z}, zmm2/m512= ","vcompressps zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F38.W0 8A /r","V","V"= ,"AVX512F","scale4","w,r,r","","" +"VCVTDQ2PD ymm1, xmm2/m128","VCVTDQ2PD xmm2/m128, ymm1","vcvtdq2pd xmm2/m1= 28, ymm1","VEX.256.F3.0F.WIG E6 /r","V","V","AVX","","w,r","","" +"VCVTDQ2PD xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTDQ2PD xmm2/m128/m32bcst, = {k}{z}, xmm1","vcvtdq2pd xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.F3.0F.W= 0 E6 /r","V","V","AVX512F+AVX512VL","bscale4,scale8","w,r,r","","" +"VCVTDQ2PD ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTDQ2PD xmm2/m256/m32bcst, = {k}{z}, ymm1","vcvtdq2pd xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.F3.0F.W= 0 E6 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","","" +"VCVTDQ2PD xmm1, xmm2/m64","VCVTDQ2PD xmm2/m64, xmm1","vcvtdq2pd xmm2/m64,= xmm1","VEX.128.F3.0F.WIG E6 /r","V","V","AVX","","w,r","","" +"VCVTDQ2PD zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTDQ2PD ymm2/m512/m32bcst, = {k}{z}, zmm1","vcvtdq2pd ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.F3.0F.W= 0 E6 /r","V","V","AVX512F","bscale4,scale32","w,r,r","","" +"VCVTDQ2PS xmm1, xmm2/m128","VCVTDQ2PS xmm2/m128, xmm1","vcvtdq2ps xmm2/m1= 28, xmm1","VEX.128.0F.WIG 5B /r","V","V","AVX","","w,r","","" +"VCVTDQ2PS xmm1, {k}{z}, 
xmm2/m128/m32bcst","VCVTDQ2PS xmm2/m128/m32bcst, = {k}{z}, xmm1","vcvtdq2ps xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.0F.W0 5= B /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","","" +"VCVTDQ2PS ymm1, ymm2/m256","VCVTDQ2PS ymm2/m256, ymm1","vcvtdq2ps ymm2/m2= 56, ymm1","VEX.256.0F.WIG 5B /r","V","V","AVX","","w,r","","" +"VCVTDQ2PS ymm1, {k}{z}, ymm2/m256/m32bcst","VCVTDQ2PS ymm2/m256/m32bcst, = {k}{z}, ymm1","vcvtdq2ps ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.0F.W0 5= B /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","","" +"VCVTDQ2PS zmm1{er}, {k}{z}, zmm2","VCVTDQ2PS zmm2, {k}{z}, zmm1{er}","vcv= tdq2ps zmm2, {k}{z}, zmm1{er}","EVEX.512.0F.W0 5B /r","V","V","AVX512F","mo= drm_regonly","w,r,r","","" +"VCVTDQ2PS zmm1, {k}{z}, zmm2/m512/m32bcst","VCVTDQ2PS zmm2/m512/m32bcst, = {k}{z}, zmm1","vcvtdq2ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.0F.W0 5= B /r","V","V","AVX512F","bscale4,scale64","w,r,r","","" +"VCVTPD2DQ ymm1{er}, {k}{z}, zmm2","VCVTPD2DQ zmm2, {k}{z}, ymm1{er}","vcv= tpd2dq zmm2, {k}{z}, ymm1{er}","EVEX.512.F2.0F.W1 E6 /r","V","V","AVX512F",= "modrm_regonly","w,r,r","Y","" +"VCVTPD2DQ ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTPD2DQ zmm2/m512/m64bcst, = {k}{z}, ymm1","vcvtpd2dq zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.F2.0F.W= 1 E6 /r","V","V","AVX512F","bscale8,scale64","w,r,r","Y","512" +"VCVTPD2DQ xmm1, xmm2/m128","VCVTPD2DQX xmm2/m128, xmm1","vcvtpd2dqx xmm2/= m128, xmm1","VEX.128.F2.0F.WIG E6 /r","V","V","AVX","","w,r","Y","128" +"VCVTPD2DQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTPD2DQX xmm2/m128/m64bcst,= {k}{z}, xmm1","vcvtpd2dqx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.F2.0F= .W1 E6 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","Y","128" +"VCVTPD2DQ xmm1, ymm2/m256","VCVTPD2DQY ymm2/m256, xmm1","vcvtpd2dqy ymm2/= m256, xmm1","VEX.256.F2.0F.WIG E6 /r","V","V","AVX","","w,r","Y","256" +"VCVTPD2DQ xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTPD2DQY ymm2/m256/m64bcst,= {k}{z}, xmm1","vcvtpd2dqy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256.F2.0F= .W1 E6 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","Y","256" +"VCVTPD2PS ymm1{er}, {k}{z}, zmm2","VCVTPD2PS zmm2, {k}{z}, ymm1{er}","vcv= tpd2ps zmm2, {k}{z}, ymm1{er}","EVEX.512.66.0F.W1 5A /r","V","V","AVX512F",= "modrm_regonly","w,r,r","Y","" +"VCVTPD2PS ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTPD2PS zmm2/m512/m64bcst, = {k}{z}, ymm1","vcvtpd2ps zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.66.0F.W= 1 5A /r","V","V","AVX512F","bscale8,scale64","w,r,r","Y","512" +"VCVTPD2PS xmm1, xmm2/m128","VCVTPD2PSX xmm2/m128, xmm1","vcvtpd2psx xmm2/= m128, xmm1","VEX.128.66.0F.WIG 5A /r","V","V","AVX","","w,r","Y","128" +"VCVTPD2PS xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTPD2PSX xmm2/m128/m64bcst,= {k}{z}, xmm1","vcvtpd2psx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F= .W1 5A /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","Y","128" +"VCVTPD2PS xmm1, ymm2/m256","VCVTPD2PSY ymm2/m256, xmm1","vcvtpd2psy ymm2/= m256, xmm1","VEX.256.66.0F.WIG 5A /r","V","V","AVX","","w,r","Y","256" +"VCVTPD2PS xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTPD2PSY ymm2/m256/m64bcst,= {k}{z}, xmm1","vcvtpd2psy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256.66.0F= .W1 5A /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","Y","256" +"VCVTPD2QQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTPD2QQ xmm2/m128/m64bcst, = {k}{z}, xmm1","vcvtpd2qq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F.W= 1 7B /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","","" +"VCVTPD2QQ ymm1, {k}{z}, ymm2/m256/m64bcst","VCVTPD2QQ ymm2/m256/m64bcst, = {k}{z}, 
ymm1","vcvtpd2qq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F.W= 1 7B /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","","" +"VCVTPD2QQ zmm1{er}, {k}{z}, zmm2","VCVTPD2QQ zmm2, {k}{z}, zmm1{er}","vcv= tpd2qq zmm2, {k}{z}, zmm1{er}","EVEX.512.66.0F.W1 7B /r","V","V","AVX512DQ"= ,"modrm_regonly","w,r,r","","" +"VCVTPD2QQ zmm1, {k}{z}, zmm2/m512/m64bcst","VCVTPD2QQ zmm2/m512/m64bcst, = {k}{z}, zmm1","vcvtpd2qq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F.W= 1 7B /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","","" +"VCVTPD2UDQ ymm1{er}, {k}{z}, zmm2","VCVTPD2UDQ zmm2, {k}{z}, ymm1{er}","v= cvtpd2udq zmm2, {k}{z}, ymm1{er}","EVEX.512.0F.W1 79 /r","V","V","AVX512F",= "modrm_regonly","w,r,r","Y","" +"VCVTPD2UDQ ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTPD2UDQ zmm2/m512/m64bcst= , {k}{z}, ymm1","vcvtpd2udq zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.0F.W= 1 79 /r","V","V","AVX512F","bscale8,scale64","w,r,r","Y","512" +"VCVTPD2UDQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTPD2UDQX xmm2/m128/m64bcs= t, {k}{z}, xmm1","vcvtpd2udqx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.0F= .W1 79 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","Y","128" +"VCVTPD2UDQ xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTPD2UDQY ymm2/m256/m64bcs= t, {k}{z}, xmm1","vcvtpd2udqy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256.0F= .W1 79 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","Y","256" +"VCVTPD2UQQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTPD2UQQ xmm2/m128/m64bcst= , {k}{z}, xmm1","vcvtpd2uqq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0= F.W1 79 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","","" +"VCVTPD2UQQ ymm1, {k}{z}, ymm2/m256/m64bcst","VCVTPD2UQQ ymm2/m256/m64bcst= , {k}{z}, ymm1","vcvtpd2uqq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0= F.W1 79 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","","" +"VCVTPD2UQQ zmm1{er}, {k}{z}, zmm2","VCVTPD2UQQ zmm2, {k}{z}, zmm1{er}","v= cvtpd2uqq zmm2, {k}{z}, zmm1{er}","EVEX.512.66.0F.W1 79 /r","V","V","AVX512= DQ","modrm_regonly","w,r,r","","" +"VCVTPD2UQQ zmm1, {k}{z}, zmm2/m512/m64bcst","VCVTPD2UQQ zmm2/m512/m64bcst= , {k}{z}, zmm1","vcvtpd2uqq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0= F.W1 79 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","","" +"VCVTPH2PS ymm1, xmm2/m128","VCVTPH2PS xmm2/m128, ymm1","vcvtph2ps xmm2/m1= 28, ymm1","VEX.256.66.0F38.W0 13 /r","V","V","F16C","","w,r","","" +"VCVTPH2PS ymm1, {k}{z}, xmm2/m128","VCVTPH2PS xmm2/m128, {k}{z}, ymm1","v= cvtph2ps xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.W0 13 /r","V","V","AVX5= 12F+AVX512VL","scale16","w,r,r","","" +"VCVTPH2PS xmm1, xmm2/m64","VCVTPH2PS xmm2/m64, xmm1","vcvtph2ps xmm2/m64,= xmm1","VEX.128.66.0F38.W0 13 /r","V","V","F16C","","w,r","","" +"VCVTPH2PS xmm1, {k}{z}, xmm2/m64","VCVTPH2PS xmm2/m64, {k}{z}, xmm1","vcv= tph2ps xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.W0 13 /r","V","V","AVX512F= +AVX512VL","scale8","w,r,r","","" +"VCVTPH2PS zmm1{sae}, {k}{z}, ymm2","VCVTPH2PS ymm2, {k}{z}, zmm1{sae}","v= cvtph2ps ymm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W0 13 /r","V","V","AVX5= 12F","modrm_regonly","w,r,r","","" +"VCVTPH2PS zmm1, {k}{z}, ymm2/m256","VCVTPH2PS ymm2/m256, {k}{z}, zmm1","v= cvtph2ps ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.W0 13 /r","V","V","AVX5= 12F","scale32","w,r,r","","" +"VCVTPS2DQ xmm1, xmm2/m128","VCVTPS2DQ xmm2/m128, xmm1","vcvtps2dq xmm2/m1= 28, xmm1","VEX.128.66.0F.WIG 5B /r","V","V","AVX","","w,r","","" +"VCVTPS2DQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTPS2DQ xmm2/m128/m32bcst, = {k}{z}, xmm1","vcvtps2dq xmm2/m128/m32bcst, 
{k}{z}, xmm1","EVEX.128.66.0F.W= 0 5B /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","","" +"VCVTPS2DQ ymm1, ymm2/m256","VCVTPS2DQ ymm2/m256, ymm1","vcvtps2dq ymm2/m2= 56, ymm1","VEX.256.66.0F.WIG 5B /r","V","V","AVX","","w,r","","" +"VCVTPS2DQ ymm1, {k}{z}, ymm2/m256/m32bcst","VCVTPS2DQ ymm2/m256/m32bcst, = {k}{z}, ymm1","vcvtps2dq ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F.W= 0 5B /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","","" +"VCVTPS2DQ zmm1{er}, {k}{z}, zmm2","VCVTPS2DQ zmm2, {k}{z}, zmm1{er}","vcv= tps2dq zmm2, {k}{z}, zmm1{er}","EVEX.512.66.0F.W0 5B /r","V","V","AVX512F",= "modrm_regonly","w,r,r","","" +"VCVTPS2DQ zmm1, {k}{z}, zmm2/m512/m32bcst","VCVTPS2DQ zmm2/m512/m32bcst, = {k}{z}, zmm1","vcvtps2dq zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F.W= 0 5B /r","V","V","AVX512F","bscale4,scale64","w,r,r","","" +"VCVTPS2PD ymm1, xmm2/m128","VCVTPS2PD xmm2/m128, ymm1","vcvtps2pd xmm2/m1= 28, ymm1","VEX.256.0F.WIG 5A /r","V","V","AVX","","w,r","","" +"VCVTPS2PD xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTPS2PD xmm2/m128/m32bcst, = {k}{z}, xmm1","vcvtps2pd xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.0F.W0 5= A /r","V","V","AVX512F+AVX512VL","bscale4,scale8","w,r,r","","" +"VCVTPS2PD ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTPS2PD xmm2/m256/m32bcst, = {k}{z}, ymm1","vcvtps2pd xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.0F.W0 5= A /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","","" +"VCVTPS2PD xmm1, xmm2/m64","VCVTPS2PD xmm2/m64, xmm1","vcvtps2pd xmm2/m64,= xmm1","VEX.128.0F.WIG 5A /r","V","V","AVX","","w,r","","" +"VCVTPS2PD zmm1{sae}, {k}{z}, ymm2","VCVTPS2PD ymm2, {k}{z}, zmm1{sae}","v= cvtps2pd ymm2, {k}{z}, zmm1{sae}","EVEX.512.0F.W0 5A /r","V","V","AVX512F",= "modrm_regonly","w,r,r","","" +"VCVTPS2PD zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTPS2PD ymm2/m512/m32bcst, = {k}{z}, zmm1","vcvtps2pd ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.0F.W0 5= A /r","V","V","AVX512F","bscale4,scale32","w,r,r","","" +"VCVTPS2PH xmm2/m64, xmm1, imm8u","VCVTPS2PH imm8u, xmm1, xmm2/m64","vcvtp= s2ph imm8u, xmm1, xmm2/m64","VEX.128.66.0F3A.W0 1D /r ib","V","V","F16C",""= ,"w,r,r","","" +"VCVTPS2PH xmm2/m64, {k}{z}, xmm1, imm8u","VCVTPS2PH imm8u, xmm1, {k}{z}, = xmm2/m64","vcvtps2ph imm8u, xmm1, {k}{z}, xmm2/m64","EVEX.128.66.0F3A.W0 1D= /r ib","V","V","AVX512F+AVX512VL","scale8","w,r,r,r","","" +"VCVTPS2PH xmm2/m128, ymm1, imm8u","VCVTPS2PH imm8u, ymm1, xmm2/m128","vcv= tps2ph imm8u, ymm1, xmm2/m128","VEX.256.66.0F3A.W0 1D /r ib","V","V","F16C"= ,"","w,r,r","","" +"VCVTPS2PH xmm2/m128, {k}{z}, ymm1, imm8u","VCVTPS2PH imm8u, ymm1, {k}{z},= xmm2/m128","vcvtps2ph imm8u, ymm1, {k}{z}, xmm2/m128","EVEX.256.66.0F3A.W0= 1D /r ib","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","","" +"VCVTPS2PH ymm2/m256, {k}{z}, zmm1, imm8u","VCVTPS2PH imm8u, zmm1, {k}{z},= ymm2/m256","vcvtps2ph imm8u, zmm1, {k}{z}, ymm2/m256","EVEX.512.66.0F3A.W0= 1D /r ib","V","V","AVX512F","scale32","w,r,r,r","","" +"VCVTPS2PH ymm2{sae}, {k}{z}, zmm1, imm8u","VCVTPS2PH imm8u, zmm1, {k}{z},= ymm2{sae}","vcvtps2ph imm8u, zmm1, {k}{z}, ymm2{sae}","EVEX.512.66.0F3A.W0= 1D /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VCVTPS2QQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTPS2QQ xmm2/m128/m32bcst, = {k}{z}, xmm1","vcvtps2qq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F.W= 0 7B /r","V","V","AVX512DQ+AVX512VL","bscale4,scale8","w,r,r","","" +"VCVTPS2QQ ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTPS2QQ xmm2/m256/m32bcst, = {k}{z}, ymm1","vcvtps2qq xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F.W= 0 7B 
/r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r","","" +"VCVTPS2QQ zmm1{er}, {k}{z}, ymm2","VCVTPS2QQ ymm2, {k}{z}, zmm1{er}","vcv= tps2qq ymm2, {k}{z}, zmm1{er}","EVEX.512.66.0F.W0 7B /r","V","V","AVX512DQ"= ,"modrm_regonly","w,r,r","","" +"VCVTPS2QQ zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTPS2QQ ymm2/m512/m32bcst, = {k}{z}, zmm1","vcvtps2qq ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F.W= 0 7B /r","V","V","AVX512DQ","bscale4,scale32","w,r,r","","" +"VCVTPS2UDQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTPS2UDQ xmm2/m128/m32bcst= , {k}{z}, xmm1","vcvtps2udq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.0F.W= 0 79 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","","" +"VCVTPS2UDQ ymm1, {k}{z}, ymm2/m256/m32bcst","VCVTPS2UDQ ymm2/m256/m32bcst= , {k}{z}, ymm1","vcvtps2udq ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.0F.W= 0 79 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","","" +"VCVTPS2UDQ zmm1{er}, {k}{z}, zmm2","VCVTPS2UDQ zmm2, {k}{z}, zmm1{er}","v= cvtps2udq zmm2, {k}{z}, zmm1{er}","EVEX.512.0F.W0 79 /r","V","V","AVX512F",= "modrm_regonly","w,r,r","","" +"VCVTPS2UDQ zmm1, {k}{z}, zmm2/m512/m32bcst","VCVTPS2UDQ zmm2/m512/m32bcst= , {k}{z}, zmm1","vcvtps2udq zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.0F.W= 0 79 /r","V","V","AVX512F","bscale4,scale64","w,r,r","","" +"VCVTPS2UQQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTPS2UQQ xmm2/m128/m32bcst= , {k}{z}, xmm1","vcvtps2uqq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0= F.W0 79 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale8","w,r,r","","" +"VCVTPS2UQQ ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTPS2UQQ xmm2/m256/m32bcst= , {k}{z}, ymm1","vcvtps2uqq xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0= F.W0 79 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r","","" +"VCVTPS2UQQ zmm1{er}, {k}{z}, ymm2","VCVTPS2UQQ ymm2, {k}{z}, zmm1{er}","v= cvtps2uqq ymm2, {k}{z}, zmm1{er}","EVEX.512.66.0F.W0 79 /r","V","V","AVX512= DQ","modrm_regonly","w,r,r","","" +"VCVTPS2UQQ zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTPS2UQQ ymm2/m512/m32bcst= , {k}{z}, zmm1","vcvtps2uqq ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0= F.W0 79 /r","V","V","AVX512DQ","bscale4,scale32","w,r,r","","" +"VCVTQQ2PD xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTQQ2PD xmm2/m128/m64bcst, = {k}{z}, xmm1","vcvtqq2pd xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.F3.0F.W= 1 E6 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","","" +"VCVTQQ2PD ymm1, {k}{z}, ymm2/m256/m64bcst","VCVTQQ2PD ymm2/m256/m64bcst, = {k}{z}, ymm1","vcvtqq2pd ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.F3.0F.W= 1 E6 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","","" +"VCVTQQ2PD zmm1{er}, {k}{z}, zmm2","VCVTQQ2PD zmm2, {k}{z}, zmm1{er}","vcv= tqq2pd zmm2, {k}{z}, zmm1{er}","EVEX.512.F3.0F.W1 E6 /r","V","V","AVX512DQ"= ,"modrm_regonly","w,r,r","","" +"VCVTQQ2PD zmm1, {k}{z}, zmm2/m512/m64bcst","VCVTQQ2PD zmm2/m512/m64bcst, = {k}{z}, zmm1","vcvtqq2pd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.F3.0F.W= 1 E6 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","","" +"VCVTQQ2PS ymm1{er}, {k}{z}, zmm2","VCVTQQ2PS zmm2, {k}{z}, ymm1{er}","vcv= tqq2ps zmm2, {k}{z}, ymm1{er}","EVEX.512.0F.W1 5B /r","V","V","AVX512DQ","m= odrm_regonly","w,r,r","Y","" +"VCVTQQ2PS ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTQQ2PS zmm2/m512/m64bcst, = {k}{z}, ymm1","vcvtqq2ps zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.0F.W1 5= B /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","Y","512" +"VCVTQQ2PS xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTQQ2PSX xmm2/m128/m64bcst,= {k}{z}, xmm1","vcvtqq2psx xmm2/m128/m64bcst, {k}{z}, 
xmm1","EVEX.128.0F.W1= 5B /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","Y","128" +"VCVTQQ2PS xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTQQ2PSY ymm2/m256/m64bcst,= {k}{z}, xmm1","vcvtqq2psy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256.0F.W1= 5B /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","Y","256" +"VCVTSD2SI r32{er}, xmm2","VCVTSD2SI xmm2, r32{er}","vcvtsd2si xmm2, r32{e= r}","EVEX.128.F2.0F.W0 2D /r","V","V","AVX512F","modrm_regonly","w,r","Y","= 32" +"VCVTSD2SI r32, xmm2/m64","VCVTSD2SI xmm2/m64, r32","vcvtsd2si xmm2/m64, r= 32","EVEX.LIG.F2.0F.W0 2D /r","V","V","AVX512F","scale8","w,r","Y","32" +"VCVTSD2SI r32, xmm2/m64","VCVTSD2SI xmm2/m64, r32","vcvtsd2si xmm2/m64, r= 32","VEX.LIG.F2.0F.W0 2D /r","V","V","AVX","","w,r","Y","32" +"VCVTSD2SI r64{er}, xmm2","VCVTSD2SIQ xmm2, r64{er}","vcvtsd2siq xmm2, r64= {er}","EVEX.128.F2.0F.W1 2D /r","N.S.","V","AVX512F","modrm_regonly","w,r",= "Y","64" +"VCVTSD2SI r64, xmm2/m64","VCVTSD2SIQ xmm2/m64, r64","vcvtsd2siq xmm2/m64,= r64","EVEX.LIG.F2.0F.W1 2D /r","N.S.","V","AVX512F","scale8","w,r","Y","64" +"VCVTSD2SI r64, xmm2/m64","VCVTSD2SIQ xmm2/m64, r64","vcvtsd2siq xmm2/m64,= r64","VEX.LIG.F2.0F.W1 2D /r","N.S.","V","AVX","","w,r","Y","64" +"VCVTSD2SS xmm1{er}, {k}{z}, xmmV, xmm2","VCVTSD2SS xmm2, xmmV, {k}{z}, xm= m1{er}","vcvtsd2ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F2.0F.W1 5A = /r","V","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VCVTSD2SS xmm1, xmmV, xmm2/m64","VCVTSD2SS xmm2/m64, xmmV, xmm1","vcvtsd2= ss xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 5A /r","V","V","AVX","","w,= r,r","","" +"VCVTSD2SS xmm1, {k}{z}, xmmV, xmm2/m64","VCVTSD2SS xmm2/m64, xmmV, {k}{z}= , xmm1","vcvtsd2ss xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 5A = /r","V","V","AVX512F","scale8","w,r,r,r","","" +"VCVTSD2USI r32{er}, xmm2","VCVTSD2USIL xmm2, r32{er}","vcvtsd2usi xmm2, r= 32{er}","EVEX.128.F2.0F.W0 79 /r","V","V","AVX512F","modrm_regonly","w,r","= Y","32" +"VCVTSD2USI r32, xmm2/m64","VCVTSD2USIL xmm2/m64, r32","vcvtsd2usi xmm2/m6= 4, r32","EVEX.LIG.F2.0F.W0 79 /r","V","V","AVX512F","scale8","w,r","Y","32" +"VCVTSD2USI r64{er}, xmm2","VCVTSD2USIQ xmm2, r64{er}","vcvtsd2usi xmm2, r= 64{er}","EVEX.128.F2.0F.W1 79 /r","N.S.","V","AVX512F","modrm_regonly","w,r= ","Y","64" +"VCVTSD2USI r64, xmm2/m64","VCVTSD2USIQ xmm2/m64, r64","vcvtsd2usi xmm2/m6= 4, r64","EVEX.LIG.F2.0F.W1 79 /r","N.S.","V","AVX512F","scale8","w,r","Y","= 64" +"VCVTSI2SD xmm1, xmmV, r/m32","VCVTSI2SDL r/m32, xmmV, xmm1","vcvtsi2sdl r= /m32, xmmV, xmm1","EVEX.NDS.LIG.F2.0F.W0 2A /r","V","V","AVX512F","scale4",= "w,r,r","Y","32" +"VCVTSI2SD xmm1, xmmV, r/m32","VCVTSI2SDL r/m32, xmmV, xmm1","vcvtsi2sdl r= /m32, xmmV, xmm1","VEX.NDS.LIG.F2.0F.W0 2A /r","V","V","AVX","","w,r,r","Y"= ,"32" +"VCVTSI2SD xmm1, xmmV, r/m64","VCVTSI2SDQ r/m64, xmmV, xmm1","vcvtsi2sdq r= /m64, xmmV, xmm1","EVEX.NDS.LIG.F2.0F.W1 2A /r","N.S.","V","AVX512F","scale= 8","w,r,r","Y","64" +"VCVTSI2SD xmm1, xmmV, r/m64","VCVTSI2SDQ r/m64, xmmV, xmm1","vcvtsi2sdq r= /m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.W1 2A /r","N.S.","V","AVX","","w,r,r",= "Y","64" +"VCVTSI2SD xmm1{er}, xmmV, rmr64","VCVTSI2SDQ rmr64, xmmV, xmm1{er}","vcvt= si2sdq rmr64, xmmV, xmm1{er}","EVEX.NDS.128.F2.0F.W1 2A /r","N.S.","V","AVX= 512F","modrm_regonly","w,r,r","Y","64" +"VCVTSI2SS xmm1, xmmV, r/m32","VCVTSI2SSL r/m32, xmmV, xmm1","vcvtsi2ssl r= /m32, xmmV, xmm1","EVEX.NDS.LIG.F3.0F.W0 2A /r","V","V","AVX512F","scale4",= "w,r,r","Y","32" +"VCVTSI2SS xmm1, xmmV, r/m32","VCVTSI2SSL r/m32, xmmV, xmm1","vcvtsi2ssl r= /m32, xmmV, 
xmm1","VEX.NDS.LIG.F3.0F.W0 2A /r","V","V","AVX","","w,r,r","Y"= ,"32" +"VCVTSI2SS xmm1{er}, xmmV, rmr32","VCVTSI2SSL rmr32, xmmV, xmm1{er}","vcvt= si2ssl rmr32, xmmV, xmm1{er}","EVEX.NDS.128.F3.0F.W0 2A /r","V","V","AVX512= F","modrm_regonly","w,r,r","Y","32" +"VCVTSI2SS xmm1, xmmV, r/m64","VCVTSI2SSQ r/m64, xmmV, xmm1","vcvtsi2ssq r= /m64, xmmV, xmm1","EVEX.NDS.LIG.F3.0F.W1 2A /r","N.S.","V","AVX512F","scale= 8","w,r,r","Y","64" +"VCVTSI2SS xmm1, xmmV, r/m64","VCVTSI2SSQ r/m64, xmmV, xmm1","vcvtsi2ssq r= /m64, xmmV, xmm1","VEX.NDS.LIG.F3.0F.W1 2A /r","N.S.","V","AVX","","w,r,r",= "Y","64" +"VCVTSI2SS xmm1{er}, xmmV, rmr64","VCVTSI2SSQ rmr64, xmmV, xmm1{er}","vcvt= si2ssq rmr64, xmmV, xmm1{er}","EVEX.NDS.128.F3.0F.W1 2A /r","N.S.","V","AVX= 512F","modrm_regonly","w,r,r","Y","64" +"VCVTSS2SD xmm1{sae}, {k}{z}, xmmV, xmm2","VCVTSS2SD xmm2, xmmV, {k}{z}, x= mm1{sae}","vcvtss2sd xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.F3.0F.W0 = 5A /r","V","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VCVTSS2SD xmm1, xmmV, xmm2/m32","VCVTSS2SD xmm2/m32, xmmV, xmm1","vcvtss2= sd xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 5A /r","V","V","AVX","","w,= r,r","","" +"VCVTSS2SD xmm1, {k}{z}, xmmV, xmm2/m32","VCVTSS2SD xmm2/m32, xmmV, {k}{z}= , xmm1","vcvtss2sd xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 5A = /r","V","V","AVX512F","scale4","w,r,r,r","","" +"VCVTSS2SI r32{er}, xmm2","VCVTSS2SI xmm2, r32{er}","vcvtss2si xmm2, r32{e= r}","EVEX.128.F3.0F.W0 2D /r","V","V","AVX512F","modrm_regonly","w,r","Y","= 32" +"VCVTSS2SI r32, xmm2/m32","VCVTSS2SI xmm2/m32, r32","vcvtss2si xmm2/m32, r= 32","EVEX.LIG.F3.0F.W0 2D /r","V","V","AVX512F","scale4","w,r","Y","32" +"VCVTSS2SI r32, xmm2/m32","VCVTSS2SI xmm2/m32, r32","vcvtss2si xmm2/m32, r= 32","VEX.LIG.F3.0F.W0 2D /r","V","V","AVX","","w,r","Y","32" +"VCVTSS2SI r64{er}, xmm2","VCVTSS2SIQ xmm2, r64{er}","vcvtss2siq xmm2, r64= {er}","EVEX.128.F3.0F.W1 2D /r","N.S.","V","AVX512F","modrm_regonly","w,r",= "Y","64" +"VCVTSS2SI r64, xmm2/m32","VCVTSS2SIQ xmm2/m32, r64","vcvtss2siq xmm2/m32,= r64","EVEX.LIG.F3.0F.W1 2D /r","N.S.","V","AVX512F","scale4","w,r","Y","64" +"VCVTSS2SI r64, xmm2/m32","VCVTSS2SIQ xmm2/m32, r64","vcvtss2siq xmm2/m32,= r64","VEX.LIG.F3.0F.W1 2D /r","N.S.","V","AVX","","w,r","Y","64" +"VCVTSS2USI r32{er}, xmm2","VCVTSS2USIL xmm2, r32{er}","vcvtss2usil xmm2, = r32{er}","EVEX.128.F3.0F.W0 79 /r","V","V","AVX512F","modrm_regonly","w,r",= "Y","32" +"VCVTSS2USI r32, xmm2/m32","VCVTSS2USIL xmm2/m32, r32","vcvtss2usil xmm2/m= 32, r32","EVEX.LIG.F3.0F.W0 79 /r","V","V","AVX512F","scale4","w,r","Y","32" +"VCVTSS2USI r64{er}, xmm2","VCVTSS2USIQ xmm2, r64{er}","vcvtss2usiq xmm2, = r64{er}","EVEX.128.F3.0F.W1 79 /r","N.S.","V","AVX512F","modrm_regonly","w,= r","Y","64" +"VCVTSS2USI r64, xmm2/m32","VCVTSS2USIQ xmm2/m32, r64","vcvtss2usiq xmm2/m= 32, r64","EVEX.LIG.F3.0F.W1 79 /r","N.S.","V","AVX512F","scale4","w,r","Y",= "64" +"VCVTTPD2DQ ymm1{sae}, {k}{z}, zmm2","VCVTTPD2DQ zmm2, {k}{z}, ymm1{sae}",= "vcvttpd2dq zmm2, {k}{z}, ymm1{sae}","EVEX.512.66.0F.W1 E6 /r","V","V","AVX= 512F","modrm_regonly","w,r,r","Y","" +"VCVTTPD2DQ ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTTPD2DQ zmm2/m512/m64bcst= , {k}{z}, ymm1","vcvttpd2dq zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.66.0= F.W1 E6 /r","V","V","AVX512F","bscale8,scale64","w,r,r","Y","512" +"VCVTTPD2DQ xmm1, xmm2/m128","VCVTTPD2DQX xmm2/m128, xmm1","vcvttpd2dqx xm= m2/m128, xmm1","VEX.128.66.0F.WIG E6 /r","V","V","AVX","","w,r","Y","128" +"VCVTTPD2DQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTTPD2DQX xmm2/m128/m64bcs= t, 
{k}{z}, xmm1","vcvttpd2dqx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66= .0F.W1 E6 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","Y","128" +"VCVTTPD2DQ xmm1, ymm2/m256","VCVTTPD2DQY ymm2/m256, xmm1","vcvttpd2dqy ym= m2/m256, xmm1","VEX.256.66.0F.WIG E6 /r","V","V","AVX","","w,r","Y","256" +"VCVTTPD2DQ xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTTPD2DQY ymm2/m256/m64bcs= t, {k}{z}, xmm1","vcvttpd2dqy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256.66= .0F.W1 E6 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","Y","256" +"VCVTTPD2QQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTTPD2QQ xmm2/m128/m64bcst= , {k}{z}, xmm1","vcvttpd2qq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0= F.W1 7A /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","","" +"VCVTTPD2QQ ymm1, {k}{z}, ymm2/m256/m64bcst","VCVTTPD2QQ ymm2/m256/m64bcst= , {k}{z}, ymm1","vcvttpd2qq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0= F.W1 7A /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","","" +"VCVTTPD2QQ zmm1{sae}, {k}{z}, zmm2","VCVTTPD2QQ zmm2, {k}{z}, zmm1{sae}",= "vcvttpd2qq zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F.W1 7A /r","V","V","AVX= 512DQ","modrm_regonly","w,r,r","","" +"VCVTTPD2QQ zmm1, {k}{z}, zmm2/m512/m64bcst","VCVTTPD2QQ zmm2/m512/m64bcst= , {k}{z}, zmm1","vcvttpd2qq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0= F.W1 7A /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","","" +"VCVTTPD2UDQ ymm1{sae}, {k}{z}, zmm2","VCVTTPD2UDQ zmm2, {k}{z}, ymm1{sae}= ","vcvttpd2udq zmm2, {k}{z}, ymm1{sae}","EVEX.512.0F.W1 78 /r","V","V","AVX= 512F","modrm_regonly","w,r,r","Y","" +"VCVTTPD2UDQ ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTTPD2UDQ zmm2/m512/m64bc= st, {k}{z}, ymm1","vcvttpd2udq zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.0= F.W1 78 /r","V","V","AVX512F","bscale8,scale64","w,r,r","Y","512" +"VCVTTPD2UDQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTTPD2UDQX xmm2/m128/m64b= cst, {k}{z}, xmm1","vcvttpd2udqx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128= .0F.W1 78 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","Y","128" +"VCVTTPD2UDQ xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTTPD2UDQY ymm2/m256/m64b= cst, {k}{z}, xmm1","vcvttpd2udqy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256= .0F.W1 78 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","Y","256" +"VCVTTPD2UQQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTTPD2UQQ xmm2/m128/m64bc= st, {k}{z}, xmm1","vcvttpd2uqq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.6= 6.0F.W1 78 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","","" +"VCVTTPD2UQQ ymm1, {k}{z}, ymm2/m256/m64bcst","VCVTTPD2UQQ ymm2/m256/m64bc= st, {k}{z}, ymm1","vcvttpd2uqq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.6= 6.0F.W1 78 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","","" +"VCVTTPD2UQQ zmm1{sae}, {k}{z}, zmm2","VCVTTPD2UQQ zmm2, {k}{z}, zmm1{sae}= ","vcvttpd2uqq zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F.W1 78 /r","V","V","= AVX512DQ","modrm_regonly","w,r,r","","" +"VCVTTPD2UQQ zmm1, {k}{z}, zmm2/m512/m64bcst","VCVTTPD2UQQ zmm2/m512/m64bc= st, {k}{z}, zmm1","vcvttpd2uqq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.6= 6.0F.W1 78 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","","" +"VCVTTPS2DQ xmm1, xmm2/m128","VCVTTPS2DQ xmm2/m128, xmm1","vcvttps2dq xmm2= /m128, xmm1","VEX.128.F3.0F.WIG 5B /r","V","V","AVX","","w,r","","" +"VCVTTPS2DQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTTPS2DQ xmm2/m128/m32bcst= , {k}{z}, xmm1","vcvttps2dq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.F3.0= F.W0 5B /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","","" +"VCVTTPS2DQ ymm1, ymm2/m256","VCVTTPS2DQ 
ymm2/m256, ymm1","vcvttps2dq ymm2= /m256, ymm1","VEX.256.F3.0F.WIG 5B /r","V","V","AVX","","w,r","","" +"VCVTTPS2DQ ymm1, {k}{z}, ymm2/m256/m32bcst","VCVTTPS2DQ ymm2/m256/m32bcst= , {k}{z}, ymm1","vcvttps2dq ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.F3.0= F.W0 5B /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","","" +"VCVTTPS2DQ zmm1{sae}, {k}{z}, zmm2","VCVTTPS2DQ zmm2, {k}{z}, zmm1{sae}",= "vcvttps2dq zmm2, {k}{z}, zmm1{sae}","EVEX.512.F3.0F.W0 5B /r","V","V","AVX= 512F","modrm_regonly","w,r,r","","" +"VCVTTPS2DQ zmm1, {k}{z}, zmm2/m512/m32bcst","VCVTTPS2DQ zmm2/m512/m32bcst= , {k}{z}, zmm1","vcvttps2dq zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.F3.0= F.W0 5B /r","V","V","AVX512F","bscale4,scale64","w,r,r","","" +"VCVTTPS2QQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTTPS2QQ xmm2/m128/m32bcst= , {k}{z}, xmm1","vcvttps2qq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0= F.W0 7A /r","V","V","AVX512DQ+AVX512VL","bscale4,scale8","w,r,r","","" +"VCVTTPS2QQ ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTTPS2QQ xmm2/m256/m32bcst= , {k}{z}, ymm1","vcvttps2qq xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0= F.W0 7A /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r","","" +"VCVTTPS2QQ zmm1{sae}, {k}{z}, ymm2","VCVTTPS2QQ ymm2, {k}{z}, zmm1{sae}",= "vcvttps2qq ymm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F.W0 7A /r","V","V","AVX= 512DQ","modrm_regonly","w,r,r","","" +"VCVTTPS2QQ zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTTPS2QQ ymm2/m512/m32bcst= , {k}{z}, zmm1","vcvttps2qq ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0= F.W0 7A /r","V","V","AVX512DQ","bscale4,scale32","w,r,r","","" +"VCVTTPS2UDQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTTPS2UDQ xmm2/m128/m32bc= st, {k}{z}, xmm1","vcvttps2udq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.0= F.W0 78 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","","" +"VCVTTPS2UDQ ymm1, {k}{z}, ymm2/m256/m32bcst","VCVTTPS2UDQ ymm2/m256/m32bc= st, {k}{z}, ymm1","vcvttps2udq ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.0= F.W0 78 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","","" +"VCVTTPS2UDQ zmm1{sae}, {k}{z}, zmm2","VCVTTPS2UDQ zmm2, {k}{z}, zmm1{sae}= ","vcvttps2udq zmm2, {k}{z}, zmm1{sae}","EVEX.512.0F.W0 78 /r","V","V","AVX= 512F","modrm_regonly","w,r,r","","" +"VCVTTPS2UDQ zmm1, {k}{z}, zmm2/m512/m32bcst","VCVTTPS2UDQ zmm2/m512/m32bc= st, {k}{z}, zmm1","vcvttps2udq zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.0= F.W0 78 /r","V","V","AVX512F","bscale4,scale64","w,r,r","","" +"VCVTTPS2UQQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTTPS2UQQ xmm2/m128/m32bc= st, {k}{z}, xmm1","vcvttps2uqq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.6= 6.0F.W0 78 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale8","w,r,r","","" +"VCVTTPS2UQQ ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTTPS2UQQ xmm2/m256/m32bc= st, {k}{z}, ymm1","vcvttps2uqq xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.6= 6.0F.W0 78 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r","","" +"VCVTTPS2UQQ zmm1{sae}, {k}{z}, ymm2","VCVTTPS2UQQ ymm2, {k}{z}, zmm1{sae}= ","vcvttps2uqq ymm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F.W0 78 /r","V","V","= AVX512DQ","modrm_regonly","w,r,r","","" +"VCVTTPS2UQQ zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTTPS2UQQ ymm2/m512/m32bc= st, {k}{z}, zmm1","vcvttps2uqq ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.6= 6.0F.W0 78 /r","V","V","AVX512DQ","bscale4,scale32","w,r,r","","" +"VCVTTSD2SI r32{sae}, xmm2","VCVTTSD2SI xmm2, r32{sae}","vcvttsd2si xmm2, = r32{sae}","EVEX.128.F2.0F.W0 2C /r","V","V","AVX512F","modrm_regonly","w,r"= ,"Y","32" +"VCVTTSD2SI r32, xmm2/m64","VCVTTSD2SI xmm2/m64, r32","vcvttsd2si 
xmm2/m64= , r32","EVEX.LIG.F2.0F.W0 2C /r","V","V","AVX512F","scale8","w,r","Y","32" +"VCVTTSD2SI r32, xmm2/m64","VCVTTSD2SI xmm2/m64, r32","vcvttsd2si xmm2/m64= , r32","VEX.LIG.F2.0F.W0 2C /r","V","V","AVX","","w,r","Y","32" +"VCVTTSD2SI r64{sae}, xmm2","VCVTTSD2SIQ xmm2, r64{sae}","vcvttsd2siq xmm2= , r64{sae}","EVEX.128.F2.0F.W1 2C /r","N.S.","V","AVX512F","modrm_regonly",= "w,r","Y","64" +"VCVTTSD2SI r64, xmm2/m64","VCVTTSD2SIQ xmm2/m64, r64","vcvttsd2siq xmm2/m= 64, r64","EVEX.LIG.F2.0F.W1 2C /r","N.S.","V","AVX512F","scale8","w,r","Y",= "64" +"VCVTTSD2SI r64, xmm2/m64","VCVTTSD2SIQ xmm2/m64, r64","vcvttsd2siq xmm2/m= 64, r64","VEX.LIG.F2.0F.W1 2C /r","N.S.","V","AVX","","w,r","Y","64" +"VCVTTSD2USI r32{sae}, xmm2","VCVTTSD2USIL xmm2, r32{sae}","vcvttsd2usil x= mm2, r32{sae}","EVEX.128.F2.0F.W0 78 /r","V","V","AVX512F","modrm_regonly",= "w,r","Y","32" +"VCVTTSD2USI r32, xmm2/m64","VCVTTSD2USIL xmm2/m64, r32","vcvttsd2usil xmm= 2/m64, r32","EVEX.LIG.F2.0F.W0 78 /r","V","V","AVX512F","scale8","w,r","Y",= "32" +"VCVTTSD2USI r64{sae}, xmm2","VCVTTSD2USIQ xmm2, r64{sae}","vcvttsd2usiq x= mm2, r64{sae}","EVEX.128.F2.0F.W1 78 /r","N.S.","V","AVX512F","modrm_regonl= y","w,r","Y","64" +"VCVTTSD2USI r64, xmm2/m64","VCVTTSD2USIQ xmm2/m64, r64","vcvttsd2usiq xmm= 2/m64, r64","EVEX.LIG.F2.0F.W1 78 /r","N.S.","V","AVX512F","scale8","w,r","= Y","64" +"VCVTTSS2SI r32{sae}, xmm2","VCVTTSS2SI xmm2, r32{sae}","vcvttss2si xmm2, = r32{sae}","EVEX.128.F3.0F.W0 2C /r","V","V","AVX512F","modrm_regonly","w,r"= ,"Y","32" +"VCVTTSS2SI r32, xmm2/m32","VCVTTSS2SI xmm2/m32, r32","vcvttss2si xmm2/m32= , r32","EVEX.LIG.F3.0F.W0 2C /r","V","V","AVX512F","scale4","w,r","Y","32" +"VCVTTSS2SI r32, xmm2/m32","VCVTTSS2SI xmm2/m32, r32","vcvttss2si xmm2/m32= , r32","VEX.LIG.F3.0F.W0 2C /r","V","V","AVX","","w,r","Y","32" +"VCVTTSS2SI r64{sae}, xmm2","VCVTTSS2SIQ xmm2, r64{sae}","vcvttss2siq xmm2= , r64{sae}","EVEX.128.F3.0F.W1 2C /r","N.S.","V","AVX512F","modrm_regonly",= "w,r","Y","64" +"VCVTTSS2SI r64, xmm2/m32","VCVTTSS2SIQ xmm2/m32, r64","vcvttss2siq xmm2/m= 32, r64","EVEX.LIG.F3.0F.W1 2C /r","N.S.","V","AVX512F","scale4","w,r","Y",= "64" +"VCVTTSS2SI r64, xmm2/m32","VCVTTSS2SIQ xmm2/m32, r64","vcvttss2siq xmm2/m= 32, r64","VEX.LIG.F3.0F.W1 2C /r","N.S.","V","AVX","","w,r","Y","64" +"VCVTTSS2USI r32{sae}, xmm2","VCVTTSS2USIL xmm2, r32{sae}","vcvttss2usil x= mm2, r32{sae}","EVEX.128.F3.0F.W0 78 /r","V","V","AVX512F","modrm_regonly",= "w,r","Y","32" +"VCVTTSS2USI r32, xmm2/m32","VCVTTSS2USIL xmm2/m32, r32","vcvttss2usil xmm= 2/m32, r32","EVEX.LIG.F3.0F.W0 78 /r","V","V","AVX512F","scale4","w,r","Y",= "32" +"VCVTTSS2USI r64{sae}, xmm2","VCVTTSS2USIQ xmm2, r64{sae}","vcvttss2usiq x= mm2, r64{sae}","EVEX.128.F3.0F.W1 78 /r","N.S.","V","AVX512F","modrm_regonl= y","w,r","Y","64" +"VCVTTSS2USI r64, xmm2/m32","VCVTTSS2USIQ xmm2/m32, r64","vcvttss2usiq xmm= 2/m32, r64","EVEX.LIG.F3.0F.W1 78 /r","N.S.","V","AVX512F","scale4","w,r","= Y","64" +"VCVTUDQ2PD xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTUDQ2PD xmm2/m128/m32bcst= , {k}{z}, xmm1","vcvtudq2pd xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.F3.0= F.W0 7A /r","V","V","AVX512F+AVX512VL","bscale4,scale8","w,r,r","","" +"VCVTUDQ2PD ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTUDQ2PD xmm2/m256/m32bcst= , {k}{z}, ymm1","vcvtudq2pd xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.F3.0= F.W0 7A /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","","" +"VCVTUDQ2PD zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTUDQ2PD ymm2/m512/m32bcst= , {k}{z}, zmm1","vcvtudq2pd ymm2/m512/m32bcst, {k}{z}, 
zmm1","EVEX.512.F3.0= F.W0 7A /r","V","V","AVX512F","bscale4,scale32","w,r,r","","" +"VCVTUDQ2PS xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTUDQ2PS xmm2/m128/m32bcst= , {k}{z}, xmm1","vcvtudq2ps xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.F2.0= F.W0 7A /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","","" +"VCVTUDQ2PS ymm1, {k}{z}, ymm2/m256/m32bcst","VCVTUDQ2PS ymm2/m256/m32bcst= , {k}{z}, ymm1","vcvtudq2ps ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.F2.0= F.W0 7A /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","","" +"VCVTUDQ2PS zmm1{er}, {k}{z}, zmm2","VCVTUDQ2PS zmm2, {k}{z}, zmm1{er}","v= cvtudq2ps zmm2, {k}{z}, zmm1{er}","EVEX.512.F2.0F.W0 7A /r","V","V","AVX512= F","modrm_regonly","w,r,r","","" +"VCVTUDQ2PS zmm1, {k}{z}, zmm2/m512/m32bcst","VCVTUDQ2PS zmm2/m512/m32bcst= , {k}{z}, zmm1","vcvtudq2ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.F2.0= F.W0 7A /r","V","V","AVX512F","bscale4,scale64","w,r,r","","" +"VCVTUQQ2PD xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTUQQ2PD xmm2/m128/m64bcst= , {k}{z}, xmm1","vcvtuqq2pd xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.F3.0= F.W1 7A /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","","" +"VCVTUQQ2PD ymm1, {k}{z}, ymm2/m256/m64bcst","VCVTUQQ2PD ymm2/m256/m64bcst= , {k}{z}, ymm1","vcvtuqq2pd ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.F3.0= F.W1 7A /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","","" +"VCVTUQQ2PD zmm1{er}, {k}{z}, zmm2","VCVTUQQ2PD zmm2, {k}{z}, zmm1{er}","v= cvtuqq2pd zmm2, {k}{z}, zmm1{er}","EVEX.512.F3.0F.W1 7A /r","V","V","AVX512= DQ","modrm_regonly","w,r,r","","" +"VCVTUQQ2PD zmm1, {k}{z}, zmm2/m512/m64bcst","VCVTUQQ2PD zmm2/m512/m64bcst= , {k}{z}, zmm1","vcvtuqq2pd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.F3.0= F.W1 7A /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","","" +"VCVTUQQ2PS ymm1{er}, {k}{z}, zmm2","VCVTUQQ2PS zmm2, {k}{z}, ymm1{er}","v= cvtuqq2ps zmm2, {k}{z}, ymm1{er}","EVEX.512.F2.0F.W1 7A /r","V","V","AVX512= DQ","modrm_regonly","w,r,r","Y","" +"VCVTUQQ2PS ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTUQQ2PS zmm2/m512/m64bcst= , {k}{z}, ymm1","vcvtuqq2ps zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.F2.0= F.W1 7A /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","Y","512" +"VCVTUQQ2PS xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTUQQ2PSX xmm2/m128/m64bcs= t, {k}{z}, xmm1","vcvtuqq2psx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.F2= .0F.W1 7A /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","Y","12= 8" +"VCVTUQQ2PS xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTUQQ2PSY ymm2/m256/m64bcs= t, {k}{z}, xmm1","vcvtuqq2psy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256.F2= .0F.W1 7A /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","Y","25= 6" +"VCVTUSI2SD xmm1, xmmV, r/m32","VCVTUSI2SDL r/m32, xmmV, xmm1","vcvtusi2sd= r/m32, xmmV, xmm1","EVEX.NDS.LIG.F2.0F.W0 7B /r","V","V","AVX512F","scale4= ","w,r,r","Y","32" +"VCVTUSI2SD xmm1, xmmV, r/m64","VCVTUSI2SDQ r/m64, xmmV, xmm1","vcvtusi2sd= r/m64, xmmV, xmm1","EVEX.NDS.LIG.F2.0F.W1 7B /r","N.S.","V","AVX512F","sca= le8","w,r,r","Y","64" +"VCVTUSI2SD xmm1{er}, xmmV, rmr64","VCVTUSI2SDQ rmr64, xmmV, xmm1{er}","vc= vtusi2sd rmr64, xmmV, xmm1{er}","EVEX.NDS.128.F2.0F.W1 7B /r","N.S.","V","A= VX512F","modrm_regonly","w,r,r","Y","64" +"VCVTUSI2SS xmm1, xmmV, r/m32","VCVTUSI2SSL r/m32, xmmV, xmm1","vcvtusi2ss= l r/m32, xmmV, xmm1","EVEX.NDS.LIG.F3.0F.W0 7B /r","V","V","AVX512F","scale= 4","w,r,r","Y","32" +"VCVTUSI2SS xmm1{er}, xmmV, rmr32","VCVTUSI2SSL rmr32, xmmV, xmm1{er}","vc= vtusi2ssl rmr32, xmmV, xmm1{er}","EVEX.NDS.128.F3.0F.W0 7B /r","V","V","AVX= 
512F","modrm_regonly","w,r,r","Y","32" +"VCVTUSI2SS xmm1, xmmV, r/m64","VCVTUSI2SSQ r/m64, xmmV, xmm1","vcvtusi2ss= q r/m64, xmmV, xmm1","EVEX.NDS.LIG.F3.0F.W1 7B /r","N.S.","V","AVX512F","sc= ale8","w,r,r","Y","64" +"VCVTUSI2SS xmm1{er}, xmmV, rmr64","VCVTUSI2SSQ rmr64, xmmV, xmm1{er}","vc= vtusi2ssq rmr64, xmmV, xmm1{er}","EVEX.NDS.128.F3.0F.W1 7B /r","N.S.","V","= AVX512F","modrm_regonly","w,r,r","Y","64" +"VDBPSADBW xmm1, {k}{z}, xmmV, xmm2/m128, imm8u","VDBPSADBW imm8u, xmm2/m1= 28, xmmV, {k}{z}, xmm1","vdbpsadbw imm8u, xmm2/m128, xmmV, {k}{z}, xmm1","E= VEX.NDS.128.66.0F3A.W0 42 /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r= ,r,r,r","","" +"VDBPSADBW ymm1, {k}{z}, ymmV, ymm2/m256, imm8u","VDBPSADBW imm8u, ymm2/m2= 56, ymmV, {k}{z}, ymm1","vdbpsadbw imm8u, ymm2/m256, ymmV, {k}{z}, ymm1","E= VEX.NDS.256.66.0F3A.W0 42 /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r= ,r,r,r","","" +"VDBPSADBW zmm1, {k}{z}, zmmV, zmm2/m512, imm8u","VDBPSADBW imm8u, zmm2/m5= 12, zmmV, {k}{z}, zmm1","vdbpsadbw imm8u, zmm2/m512, zmmV, {k}{z}, zmm1","E= VEX.NDS.512.66.0F3A.W0 42 /r ib","V","V","AVX512BW","scale64","w,r,r,r,r","= ","" +"VDIVPD xmm1, xmmV, xmm2/m128","VDIVPD xmm2/m128, xmmV, xmm1","vdivpd xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 5E /r","V","V","AVX","","w,r,r","= ","" +"VDIVPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VDIVPD xmm2/m128/m64bcst, = xmmV, {k}{z}, xmm1","vdivpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F.W1 5E /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r= ","","" +"VDIVPD ymm1, ymmV, ymm2/m256","VDIVPD ymm2/m256, ymmV, ymm1","vdivpd ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 5E /r","V","V","AVX","","w,r,r","= ","" +"VDIVPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VDIVPD ymm2/m256/m64bcst, = ymmV, {k}{z}, ymm1","vdivpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F.W1 5E /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r= ","","" +"VDIVPD zmm1{er}, {k}{z}, zmmV, zmm2","VDIVPD zmm2, zmmV, {k}{z}, zmm1{er}= ","vdivpd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.66.0F.W1 5E /r","V","= V","AVX512F","modrm_regonly","w,r,r,r","","" +"VDIVPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VDIVPD zmm2/m512/m64bcst, = zmmV, {k}{z}, zmm1","vdivpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F.W1 5E /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VDIVPS xmm1, xmmV, xmm2/m128","VDIVPS xmm2/m128, xmmV, xmm1","vdivps xmm2= /m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 5E /r","V","V","AVX","","w,r,r","","" +"VDIVPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VDIVPS xmm2/m128/m32bcst, = xmmV, {k}{z}, xmm1","vdivps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.0F.W0 5E /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","= ","" +"VDIVPS ymm1, ymmV, ymm2/m256","VDIVPS ymm2/m256, ymmV, ymm1","vdivps ymm2= /m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 5E /r","V","V","AVX","","w,r,r","","" +"VDIVPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VDIVPS ymm2/m256/m32bcst, = ymmV, {k}{z}, ymm1","vdivps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.0F.W0 5E /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","= ","" +"VDIVPS zmm1{er}, {k}{z}, zmmV, zmm2","VDIVPS zmm2, zmmV, {k}{z}, zmm1{er}= ","vdivps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.0F.W0 5E /r","V","V",= "AVX512F","modrm_regonly","w,r,r,r","","" +"VDIVPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VDIVPS zmm2/m512/m32bcst, = zmmV, {k}{z}, zmm1","vdivps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.0F.W0 5E 
/r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VDIVSD xmm1{er}, {k}{z}, xmmV, xmm2","VDIVSD xmm2, xmmV, {k}{z}, xmm1{er}= ","vdivsd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F2.0F.W1 5E /r","V","= V","AVX512F","modrm_regonly","w,r,r,r","","" +"VDIVSD xmm1, xmmV, xmm2/m64","VDIVSD xmm2/m64, xmmV, xmm1","vdivsd xmm2/m= 64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 5E /r","V","V","AVX","","w,r,r","","" +"VDIVSD xmm1, {k}{z}, xmmV, xmm2/m64","VDIVSD xmm2/m64, xmmV, {k}{z}, xmm1= ","vdivsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 5E /r","V","= V","AVX512F","scale8","w,r,r,r","","" +"VDIVSS xmm1{er}, {k}{z}, xmmV, xmm2","VDIVSS xmm2, xmmV, {k}{z}, xmm1{er}= ","vdivss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F3.0F.W0 5E /r","V","= V","AVX512F","modrm_regonly","w,r,r,r","","" +"VDIVSS xmm1, xmmV, xmm2/m32","VDIVSS xmm2/m32, xmmV, xmm1","vdivss xmm2/m= 32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 5E /r","V","V","AVX","","w,r,r","","" +"VDIVSS xmm1, {k}{z}, xmmV, xmm2/m32","VDIVSS xmm2/m32, xmmV, {k}{z}, xmm1= ","vdivss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 5E /r","V","= V","AVX512F","scale4","w,r,r,r","","" +"VDPPD xmm1, xmmV, xmm2/m128, imm8u","VDPPD imm8u, xmm2/m128, xmmV, xmm1",= "vdppd imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 41 /r ib","V"= ,"V","AVX","","w,r,r,r","","" +"VDPPS xmm1, xmmV, xmm2/m128, imm8u","VDPPS imm8u, xmm2/m128, xmmV, xmm1",= "vdpps imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 40 /r ib","V"= ,"V","AVX","","w,r,r,r","","" +"VDPPS ymm1, ymmV, ymm2/m256, imm8u","VDPPS imm8u, ymm2/m256, ymmV, ymm1",= "vdpps imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WIG 40 /r ib","V"= ,"V","AVX","","w,r,r,r","","" +"VERR r/m16","VERR r/m16","verr r/m16","0F 00 /4","V","V","","","r","","" +"VERW r/m16","VERW r/m16","verw r/m16","0F 00 /5","V","V","","","r","","" +"VEXP2PD zmm1{sae}, {k}{z}, zmm2","VEXP2PD zmm2, {k}{z}, zmm1{sae}","vexp2= pd zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W1 C8 /r","V","V","AVX512ER",= "modrm_regonly","w,r,r","","" +"VEXP2PD zmm1, {k}{z}, zmm2/m512/m64bcst","VEXP2PD zmm2/m512/m64bcst, {k}{= z}, zmm1","vexp2pd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1 C8= /r","V","V","AVX512ER","bscale8,scale64","w,r,r","","" +"VEXP2PS zmm1{sae}, {k}{z}, zmm2","VEXP2PS zmm2, {k}{z}, zmm1{sae}","vexp2= ps zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W0 C8 /r","V","V","AVX512ER",= "modrm_regonly","w,r,r","","" +"VEXP2PS zmm1, {k}{z}, zmm2/m512/m32bcst","VEXP2PS zmm2/m512/m32bcst, {k}{= z}, zmm1","vexp2ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0 C8= /r","V","V","AVX512ER","bscale4,scale64","w,r,r","","" +"VEXPANDPD xmm1, {k}{z}, xmm2/m128","VEXPANDPD xmm2/m128, {k}{z}, xmm1","v= expandpd xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W1 88 /r","V","V","AVX5= 12F+AVX512VL","scale8","w,r,r","","" +"VEXPANDPD ymm1, {k}{z}, ymm2/m256","VEXPANDPD ymm2/m256, {k}{z}, ymm1","v= expandpd ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W1 88 /r","V","V","AVX5= 12F+AVX512VL","scale8","w,r,r","","" +"VEXPANDPD zmm1, {k}{z}, zmm2/m512","VEXPANDPD zmm2/m512, {k}{z}, zmm1","v= expandpd zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W1 88 /r","V","V","AVX5= 12F","scale8","w,r,r","","" +"VEXPANDPS xmm1, {k}{z}, xmm2/m128","VEXPANDPS xmm2/m128, {k}{z}, xmm1","v= expandps xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W0 88 /r","V","V","AVX5= 12F+AVX512VL","scale4","w,r,r","","" +"VEXPANDPS ymm1, {k}{z}, ymm2/m256","VEXPANDPS ymm2/m256, {k}{z}, ymm1","v= expandps ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W0 88 /r","V","V","AVX5= 
12F+AVX512VL","scale4","w,r,r","","" +"VEXPANDPS zmm1, {k}{z}, zmm2/m512","VEXPANDPS zmm2/m512, {k}{z}, zmm1","v= expandps zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W0 88 /r","V","V","AVX5= 12F","scale4","w,r,r","","" +"VEXTRACTF128 xmm2/m128, ymm1, imm8u:1","VEXTRACTF128 imm8u:1, ymm1, xmm2/= m128","vextractf128 imm8u:1, ymm1, xmm2/m128","VEX.256.66.0F3A.W0 19 /r ib"= ,"V","V","AVX","","w,r,r","","" +"VEXTRACTF32X4 xmm2/m128, {k}{z}, ymm1, imm8u:1","VEXTRACTF32X4 imm8u:1, y= mm1, {k}{z}, xmm2/m128","vextractf32x4 imm8u:1, ymm1, {k}{z}, xmm2/m128","E= VEX.256.66.0F3A.W0 19 /r ib","V","V","AVX512F+AVX512VL","scale16","w,r,r,r"= ,"","" +"VEXTRACTF32X4 xmm2/m128, {k}{z}, zmm1, imm8u:2","VEXTRACTF32X4 imm8u:2, z= mm1, {k}{z}, xmm2/m128","vextractf32x4 imm8u:2, zmm1, {k}{z}, xmm2/m128","E= VEX.512.66.0F3A.W0 19 /r ib","V","V","AVX512F","scale16","w,r,r,r","","" +"VEXTRACTF32X8 ymm2/m256, {k}{z}, zmm1, imm8u:1","VEXTRACTF32X8 imm8u:1, z= mm1, {k}{z}, ymm2/m256","vextractf32x8 imm8u:1, zmm1, {k}{z}, ymm2/m256","E= VEX.512.66.0F3A.W0 1B /r ib","V","V","AVX512DQ","scale32","w,r,r,r","","" +"VEXTRACTF64X2 xmm2/m128, {k}{z}, ymm1, imm8u:1","VEXTRACTF64X2 imm8u:1, y= mm1, {k}{z}, xmm2/m128","vextractf64x2 imm8u:1, ymm1, {k}{z}, xmm2/m128","E= VEX.256.66.0F3A.W1 19 /r ib","V","V","AVX512DQ+AVX512VL","scale16","w,r,r,r= ","","" +"VEXTRACTF64X2 xmm2/m128, {k}{z}, zmm1, imm8u:2","VEXTRACTF64X2 imm8u:2, z= mm1, {k}{z}, xmm2/m128","vextractf64x2 imm8u:2, zmm1, {k}{z}, xmm2/m128","E= VEX.512.66.0F3A.W1 19 /r ib","V","V","AVX512DQ","scale16","w,r,r,r","","" +"VEXTRACTF64X4 ymm2/m256, {k}{z}, zmm1, imm8u","VEXTRACTF64X4 imm8u, zmm1,= {k}{z}, ymm2/m256","vextractf64x4 imm8u, zmm1, {k}{z}, ymm2/m256","EVEX.51= 2.66.0F3A.W1 1B /r ib","V","V","AVX512F","scale32","w,r,r,r","","" +"VEXTRACTI128 xmm2/m128, ymm1, imm8u:1","VEXTRACTI128 imm8u:1, ymm1, xmm2/= m128","vextracti128 imm8u:1, ymm1, xmm2/m128","VEX.256.66.0F3A.W0 39 /r ib"= ,"V","V","AVX2","","w,r,r","","" +"VEXTRACTI32X4 xmm2/m128, {k}{z}, ymm1, imm8u:1","VEXTRACTI32X4 imm8u:1, y= mm1, {k}{z}, xmm2/m128","vextracti32x4 imm8u:1, ymm1, {k}{z}, xmm2/m128","E= VEX.256.66.0F3A.W0 39 /r ib","V","V","AVX512F+AVX512VL","scale16","w,r,r,r"= ,"","" +"VEXTRACTI32X4 xmm2/m128, {k}{z}, zmm1, imm8u:2","VEXTRACTI32X4 imm8u:2, z= mm1, {k}{z}, xmm2/m128","vextracti32x4 imm8u:2, zmm1, {k}{z}, xmm2/m128","E= VEX.512.66.0F3A.W0 39 /r ib","V","V","AVX512F","scale16","w,r,r,r","","" +"VEXTRACTI32X8 ymm2/m256, {k}{z}, zmm1, imm8u:1","VEXTRACTI32X8 imm8u:1, z= mm1, {k}{z}, ymm2/m256","vextracti32x8 imm8u:1, zmm1, {k}{z}, ymm2/m256","E= VEX.512.66.0F3A.W0 3B /r ib","V","V","AVX512DQ","scale32","w,r,r,r","","" +"VEXTRACTI64X2 xmm2/m128, {k}{z}, ymm1, imm8u:1","VEXTRACTI64X2 imm8u:1, y= mm1, {k}{z}, xmm2/m128","vextracti64x2 imm8u:1, ymm1, {k}{z}, xmm2/m128","E= VEX.256.66.0F3A.W1 39 /r ib","V","V","AVX512DQ+AVX512VL","scale16","w,r,r,r= ","","" +"VEXTRACTI64X2 xmm2/m128, {k}{z}, zmm1, imm8u:2","VEXTRACTI64X2 imm8u:2, z= mm1, {k}{z}, xmm2/m128","vextracti64x2 imm8u:2, zmm1, {k}{z}, xmm2/m128","E= VEX.512.66.0F3A.W1 39 /r ib","V","V","AVX512DQ","scale16","w,r,r,r","","" +"VEXTRACTI64X4 ymm2/m256, {k}{z}, zmm1, imm8u:1","VEXTRACTI64X4 imm8u:1, z= mm1, {k}{z}, ymm2/m256","vextracti64x4 imm8u:1, zmm1, {k}{z}, ymm2/m256","E= VEX.512.66.0F3A.W1 3B /r ib","V","V","AVX512F","scale32","w,r,r,r","","" +"VEXTRACTPS r/m32, xmm1, imm8u:2","VEXTRACTPS imm8u:2, xmm1, r/m32","vextr= actps imm8u:2, xmm1, r/m32","EVEX.128.66.0F3A.WIG 17 /r ib","V","V","AVX512= F+AVX512VL","scale4","w,r,r","","" 
+"VEXTRACTPS r/m32, xmm1, imm8u:2","VEXTRACTPS imm8u:2, xmm1, r/m32","vextr= actps imm8u:2, xmm1, r/m32","VEX.128.66.0F3A.WIG 17 /r ib","V","V","AVX",""= ,"w,r,r","","" +"VFIXUPIMMPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VFIXUPIMMPD im= m8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfixupimmpd imm8u, xmm2/m128/m= 64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F3A.W1 54 /r ib","V","V","AVX= 512F+AVX512VL","bscale8,scale16","rw,r,r,r,r","","" +"VFIXUPIMMPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VFIXUPIMMPD im= m8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfixupimmpd imm8u, ymm2/m256/m= 64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F3A.W1 54 /r ib","V","V","AVX= 512F+AVX512VL","bscale8,scale32","rw,r,r,r,r","","" +"VFIXUPIMMPD zmm1{sae}, {k}{z}, zmmV, zmm2, imm8u","VFIXUPIMMPD imm8u, zmm= 2, zmmV, {k}{z}, zmm1{sae}","vfixupimmpd imm8u, zmm2, zmmV, {k}{z}, zmm1{sa= e}","EVEX.DDS.512.66.0F3A.W1 54 /r ib","V","V","AVX512F","modrm_regonly","r= w,r,r,r,r","","" +"VFIXUPIMMPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VFIXUPIMMPD im= m8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfixupimmpd imm8u, zmm2/m512/m= 64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F3A.W1 54 /r ib","V","V","AVX= 512F","bscale8,scale64","rw,r,r,r,r","","" +"VFIXUPIMMPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u","VFIXUPIMMPS im= m8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfixupimmps imm8u, xmm2/m128/m= 32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F3A.W0 54 /r ib","V","V","AVX= 512F+AVX512VL","bscale4,scale16","rw,r,r,r,r","","" +"VFIXUPIMMPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VFIXUPIMMPS im= m8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfixupimmps imm8u, ymm2/m256/m= 32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F3A.W0 54 /r ib","V","V","AVX= 512F+AVX512VL","bscale4,scale32","rw,r,r,r,r","","" +"VFIXUPIMMPS zmm1{sae}, {k}{z}, zmmV, zmm2, imm8u","VFIXUPIMMPS imm8u, zmm= 2, zmmV, {k}{z}, zmm1{sae}","vfixupimmps imm8u, zmm2, zmmV, {k}{z}, zmm1{sa= e}","EVEX.DDS.512.66.0F3A.W0 54 /r ib","V","V","AVX512F","modrm_regonly","r= w,r,r,r,r","","" +"VFIXUPIMMPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VFIXUPIMMPS im= m8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfixupimmps imm8u, zmm2/m512/m= 32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F3A.W0 54 /r ib","V","V","AVX= 512F","bscale4,scale64","rw,r,r,r,r","","" +"VFIXUPIMMSD xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u","VFIXUPIMMSD imm8u, xmm= 2, xmmV, {k}{z}, xmm1{sae}","vfixupimmsd imm8u, xmm2, xmmV, {k}{z}, xmm1{sa= e}","EVEX.DDS.128.66.0F3A.W1 55 /r ib","V","V","AVX512F","modrm_regonly","r= w,r,r,r,r","","" +"VFIXUPIMMSD xmm1, {k}{z}, xmmV, xmm2/m64, imm8u","VFIXUPIMMSD imm8u, xmm2= /m64, xmmV, {k}{z}, xmm1","vfixupimmsd imm8u, xmm2/m64, xmmV, {k}{z}, xmm1"= ,"EVEX.DDS.LIG.66.0F3A.W1 55 /r ib","V","V","AVX512F","scale8","rw,r,r,r,r"= ,"","" +"VFIXUPIMMSS xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u","VFIXUPIMMSS imm8u, xmm= 2, xmmV, {k}{z}, xmm1{sae}","vfixupimmss imm8u, xmm2, xmmV, {k}{z}, xmm1{sa= e}","EVEX.DDS.128.66.0F3A.W0 55 /r ib","V","V","AVX512F","modrm_regonly","r= w,r,r,r,r","","" +"VFIXUPIMMSS xmm1, {k}{z}, xmmV, xmm2/m32, imm8u","VFIXUPIMMSS imm8u, xmm2= /m32, xmmV, {k}{z}, xmm1","vfixupimmss imm8u, xmm2/m32, xmmV, {k}{z}, xmm1"= ,"EVEX.DDS.LIG.66.0F3A.W0 55 /r ib","V","V","AVX512F","scale4","rw,r,r,r,r"= ,"","" +"VFMADD132PD xmm1, xmmV, xmm2/m128","VFMADD132PD xmm2/m128, xmmV, xmm1","v= fmadd132pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 98 /r","V","V","F= MA","","rw,r,r","","" +"VFMADD132PD xmm1, {k}{z}, xmmV, 
xmm2/m128/m64bcst","VFMADD132PD xmm2/m128= /m64bcst, xmmV, {k}{z}, xmm1","vfmadd132pd xmm2/m128/m64bcst, xmmV, {k}{z},= xmm1","EVEX.DDS.128.66.0F38.W1 98 /r","V","V","AVX512F+AVX512VL","bscale8,= scale16","rw,r,r,r","","" +"VFMADD132PD ymm1, ymmV, ymm2/m256","VFMADD132PD ymm2/m256, ymmV, ymm1","v= fmadd132pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 98 /r","V","V","F= MA","","rw,r,r","","" +"VFMADD132PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMADD132PD ymm2/m256= /m64bcst, ymmV, {k}{z}, ymm1","vfmadd132pd ymm2/m256/m64bcst, ymmV, {k}{z},= ymm1","EVEX.DDS.256.66.0F38.W1 98 /r","V","V","AVX512F+AVX512VL","bscale8,= scale32","rw,r,r,r","","" +"VFMADD132PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMADD132PD zmm2, zmmV, {k}{z}= , zmm1{er}","vfmadd132pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F3= 8.W1 98 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADD132PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMADD132PD zmm2/m512= /m64bcst, zmmV, {k}{z}, zmm1","vfmadd132pd zmm2/m512/m64bcst, zmmV, {k}{z},= zmm1","EVEX.DDS.512.66.0F38.W1 98 /r","V","V","AVX512F","bscale8,scale64",= "rw,r,r,r","","" +"VFMADD132PS xmm1, xmmV, xmm2/m128","VFMADD132PS xmm2/m128, xmmV, xmm1","v= fmadd132ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 98 /r","V","V","F= MA","","rw,r,r","","" +"VFMADD132PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMADD132PS xmm2/m128= /m32bcst, xmmV, {k}{z}, xmm1","vfmadd132ps xmm2/m128/m32bcst, xmmV, {k}{z},= xmm1","EVEX.DDS.128.66.0F38.W0 98 /r","V","V","AVX512F+AVX512VL","bscale4,= scale16","rw,r,r,r","","" +"VFMADD132PS ymm1, ymmV, ymm2/m256","VFMADD132PS ymm2/m256, ymmV, ymm1","v= fmadd132ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 98 /r","V","V","F= MA","","rw,r,r","","" +"VFMADD132PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMADD132PS ymm2/m256= /m32bcst, ymmV, {k}{z}, ymm1","vfmadd132ps ymm2/m256/m32bcst, ymmV, {k}{z},= ymm1","EVEX.DDS.256.66.0F38.W0 98 /r","V","V","AVX512F+AVX512VL","bscale4,= scale32","rw,r,r,r","","" +"VFMADD132PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMADD132PS zmm2, zmmV, {k}{z}= , zmm1{er}","vfmadd132ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F3= 8.W0 98 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADD132PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMADD132PS zmm2/m512= /m32bcst, zmmV, {k}{z}, zmm1","vfmadd132ps zmm2/m512/m32bcst, zmmV, {k}{z},= zmm1","EVEX.DDS.512.66.0F38.W0 98 /r","V","V","AVX512F","bscale4,scale64",= "rw,r,r,r","","" +"VFMADD132SD xmm1{er}, {k}{z}, xmmV, xmm2","VFMADD132SD xmm2, xmmV, {k}{z}= , xmm1{er}","vfmadd132sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F3= 8.W1 99 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADD132SD xmm1, xmmV, xmm2/m64","VFMADD132SD xmm2/m64, xmmV, xmm1","vfm= add132sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 99 /r","V","V","FMA"= ,"","rw,r,r","","" +"VFMADD132SD xmm1, {k}{z}, xmmV, xmm2/m64","VFMADD132SD xmm2/m64, xmmV, {k= }{z}, xmm1","vfmadd132sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F3= 8.W1 99 /r","V","V","AVX512F","scale8","rw,r,r,r","","" +"VFMADD132SS xmm1{er}, {k}{z}, xmmV, xmm2","VFMADD132SS xmm2, xmmV, {k}{z}= , xmm1{er}","vfmadd132ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F3= 8.W0 99 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADD132SS xmm1, xmmV, xmm2/m32","VFMADD132SS xmm2/m32, xmmV, xmm1","vfm= add132ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 99 /r","V","V","FMA"= ,"","rw,r,r","","" +"VFMADD132SS xmm1, {k}{z}, xmmV, xmm2/m32","VFMADD132SS xmm2/m32, xmmV, {k= }{z}, xmm1","vfmadd132ss xmm2/m32, xmmV, 
{k}{z}, xmm1","EVEX.DDS.LIG.66.0F3= 8.W0 99 /r","V","V","AVX512F","scale4","rw,r,r,r","","" +"VFMADD213PD xmm1, xmmV, xmm2/m128","VFMADD213PD xmm2/m128, xmmV, xmm1","v= fmadd213pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 A8 /r","V","V","F= MA","","rw,r,r","","" +"VFMADD213PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMADD213PD xmm2/m128= /m64bcst, xmmV, {k}{z}, xmm1","vfmadd213pd xmm2/m128/m64bcst, xmmV, {k}{z},= xmm1","EVEX.DDS.128.66.0F38.W1 A8 /r","V","V","AVX512F+AVX512VL","bscale8,= scale16","rw,r,r,r","","" +"VFMADD213PD ymm1, ymmV, ymm2/m256","VFMADD213PD ymm2/m256, ymmV, ymm1","v= fmadd213pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 A8 /r","V","V","F= MA","","rw,r,r","","" +"VFMADD213PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMADD213PD ymm2/m256= /m64bcst, ymmV, {k}{z}, ymm1","vfmadd213pd ymm2/m256/m64bcst, ymmV, {k}{z},= ymm1","EVEX.DDS.256.66.0F38.W1 A8 /r","V","V","AVX512F+AVX512VL","bscale8,= scale32","rw,r,r,r","","" +"VFMADD213PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMADD213PD zmm2, zmmV, {k}{z}= , zmm1{er}","vfmadd213pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F3= 8.W1 A8 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADD213PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMADD213PD zmm2/m512= /m64bcst, zmmV, {k}{z}, zmm1","vfmadd213pd zmm2/m512/m64bcst, zmmV, {k}{z},= zmm1","EVEX.DDS.512.66.0F38.W1 A8 /r","V","V","AVX512F","bscale8,scale64",= "rw,r,r,r","","" +"VFMADD213PS xmm1, xmmV, xmm2/m128","VFMADD213PS xmm2/m128, xmmV, xmm1","v= fmadd213ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 A8 /r","V","V","F= MA","","rw,r,r","","" +"VFMADD213PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMADD213PS xmm2/m128= /m32bcst, xmmV, {k}{z}, xmm1","vfmadd213ps xmm2/m128/m32bcst, xmmV, {k}{z},= xmm1","EVEX.DDS.128.66.0F38.W0 A8 /r","V","V","AVX512F+AVX512VL","bscale4,= scale16","rw,r,r,r","","" +"VFMADD213PS ymm1, ymmV, ymm2/m256","VFMADD213PS ymm2/m256, ymmV, ymm1","v= fmadd213ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 A8 /r","V","V","F= MA","","rw,r,r","","" +"VFMADD213PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMADD213PS ymm2/m256= /m32bcst, ymmV, {k}{z}, ymm1","vfmadd213ps ymm2/m256/m32bcst, ymmV, {k}{z},= ymm1","EVEX.DDS.256.66.0F38.W0 A8 /r","V","V","AVX512F+AVX512VL","bscale4,= scale32","rw,r,r,r","","" +"VFMADD213PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMADD213PS zmm2, zmmV, {k}{z}= , zmm1{er}","vfmadd213ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F3= 8.W0 A8 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADD213PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMADD213PS zmm2/m512= /m32bcst, zmmV, {k}{z}, zmm1","vfmadd213ps zmm2/m512/m32bcst, zmmV, {k}{z},= zmm1","EVEX.DDS.512.66.0F38.W0 A8 /r","V","V","AVX512F","bscale4,scale64",= "rw,r,r,r","","" +"VFMADD213SD xmm1{er}, {k}{z}, xmmV, xmm2","VFMADD213SD xmm2, xmmV, {k}{z}= , xmm1{er}","vfmadd213sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F3= 8.W1 A9 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADD213SD xmm1, xmmV, xmm2/m64","VFMADD213SD xmm2/m64, xmmV, xmm1","vfm= add213sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 A9 /r","V","V","FMA"= ,"","rw,r,r","","" +"VFMADD213SD xmm1, {k}{z}, xmmV, xmm2/m64","VFMADD213SD xmm2/m64, xmmV, {k= }{z}, xmm1","vfmadd213sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F3= 8.W1 A9 /r","V","V","AVX512F","scale8","rw,r,r,r","","" +"VFMADD213SS xmm1{er}, {k}{z}, xmmV, xmm2","VFMADD213SS xmm2, xmmV, {k}{z}= , xmm1{er}","vfmadd213ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F3= 8.W0 A9 
/r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADD213SS xmm1, xmmV, xmm2/m32","VFMADD213SS xmm2/m32, xmmV, xmm1","vfm= add213ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 A9 /r","V","V","FMA"= ,"","rw,r,r","","" +"VFMADD213SS xmm1, {k}{z}, xmmV, xmm2/m32","VFMADD213SS xmm2/m32, xmmV, {k= }{z}, xmm1","vfmadd213ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F3= 8.W0 A9 /r","V","V","AVX512F","scale4","rw,r,r,r","","" +"VFMADD231PD xmm1, xmmV, xmm2/m128","VFMADD231PD xmm2/m128, xmmV, xmm1","v= fmadd231pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 B8 /r","V","V","F= MA","","rw,r,r","","" +"VFMADD231PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMADD231PD xmm2/m128= /m64bcst, xmmV, {k}{z}, xmm1","vfmadd231pd xmm2/m128/m64bcst, xmmV, {k}{z},= xmm1","EVEX.DDS.128.66.0F38.W1 B8 /r","V","V","AVX512F+AVX512VL","bscale8,= scale16","rw,r,r,r","","" +"VFMADD231PD ymm1, ymmV, ymm2/m256","VFMADD231PD ymm2/m256, ymmV, ymm1","v= fmadd231pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 B8 /r","V","V","F= MA","","rw,r,r","","" +"VFMADD231PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMADD231PD ymm2/m256= /m64bcst, ymmV, {k}{z}, ymm1","vfmadd231pd ymm2/m256/m64bcst, ymmV, {k}{z},= ymm1","EVEX.DDS.256.66.0F38.W1 B8 /r","V","V","AVX512F+AVX512VL","bscale8,= scale32","rw,r,r,r","","" +"VFMADD231PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMADD231PD zmm2, zmmV, {k}{z}= , zmm1{er}","vfmadd231pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F3= 8.W1 B8 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADD231PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMADD231PD zmm2/m512= /m64bcst, zmmV, {k}{z}, zmm1","vfmadd231pd zmm2/m512/m64bcst, zmmV, {k}{z},= zmm1","EVEX.DDS.512.66.0F38.W1 B8 /r","V","V","AVX512F","bscale8,scale64",= "rw,r,r,r","","" +"VFMADD231PS xmm1, xmmV, xmm2/m128","VFMADD231PS xmm2/m128, xmmV, xmm1","v= fmadd231ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 B8 /r","V","V","F= MA","","rw,r,r","","" +"VFMADD231PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMADD231PS xmm2/m128= /m32bcst, xmmV, {k}{z}, xmm1","vfmadd231ps xmm2/m128/m32bcst, xmmV, {k}{z},= xmm1","EVEX.DDS.128.66.0F38.W0 B8 /r","V","V","AVX512F+AVX512VL","bscale4,= scale16","rw,r,r,r","","" +"VFMADD231PS ymm1, ymmV, ymm2/m256","VFMADD231PS ymm2/m256, ymmV, ymm1","v= fmadd231ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 B8 /r","V","V","F= MA","","rw,r,r","","" +"VFMADD231PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMADD231PS ymm2/m256= /m32bcst, ymmV, {k}{z}, ymm1","vfmadd231ps ymm2/m256/m32bcst, ymmV, {k}{z},= ymm1","EVEX.DDS.256.66.0F38.W0 B8 /r","V","V","AVX512F+AVX512VL","bscale4,= scale32","rw,r,r,r","","" +"VFMADD231PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMADD231PS zmm2, zmmV, {k}{z}= , zmm1{er}","vfmadd231ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F3= 8.W0 B8 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADD231PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMADD231PS zmm2/m512= /m32bcst, zmmV, {k}{z}, zmm1","vfmadd231ps zmm2/m512/m32bcst, zmmV, {k}{z},= zmm1","EVEX.DDS.512.66.0F38.W0 B8 /r","V","V","AVX512F","bscale4,scale64",= "rw,r,r,r","","" +"VFMADD231SD xmm1{er}, {k}{z}, xmmV, xmm2","VFMADD231SD xmm2, xmmV, {k}{z}= , xmm1{er}","vfmadd231sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F3= 8.W1 B9 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADD231SD xmm1, xmmV, xmm2/m64","VFMADD231SD xmm2/m64, xmmV, xmm1","vfm= add231sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 B9 /r","V","V","FMA"= ,"","rw,r,r","","" +"VFMADD231SD xmm1, {k}{z}, xmmV, xmm2/m64","VFMADD231SD xmm2/m64, 
xmmV, {k= }{z}, xmm1","vfmadd231sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F3= 8.W1 B9 /r","V","V","AVX512F","scale8","rw,r,r,r","","" +"VFMADD231SS xmm1{er}, {k}{z}, xmmV, xmm2","VFMADD231SS xmm2, xmmV, {k}{z}= , xmm1{er}","vfmadd231ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F3= 8.W0 B9 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADD231SS xmm1, xmmV, xmm2/m32","VFMADD231SS xmm2/m32, xmmV, xmm1","vfm= add231ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 B9 /r","V","V","FMA"= ,"","rw,r,r","","" +"VFMADD231SS xmm1, {k}{z}, xmmV, xmm2/m32","VFMADD231SS xmm2/m32, xmmV, {k= }{z}, xmm1","vfmadd231ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F3= 8.W0 B9 /r","V","V","AVX512F","scale4","rw,r,r,r","","" +"VFMADDPD xmm1, xmmV, xmmIH, xmm2/m128","VFMADDPD xmm2/m128, xmmIH, xmmV, = xmm1","vfmaddpd xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 69 /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDPD xmm1, xmmV, xmm2/m128, xmmIH","VFMADDPD xmmIH, xmm2/m128, xmmV, = xmm1","vfmaddpd xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 69 /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDPD ymm1, ymmV, ymmIH, ymm2/m256","VFMADDPD ymm2/m256, ymmIH, ymmV, = ymm1","vfmaddpd ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 69 /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDPD ymm1, ymmV, ymm2/m256, ymmIH","VFMADDPD ymmIH, ymm2/m256, ymmV, = ymm1","vfmaddpd ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 69 /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDPS xmm1, xmmV, xmmIH, xmm2/m128","VFMADDPS xmm2/m128, xmmIH, xmmV, = xmm1","vfmaddps xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 68 /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDPS xmm1, xmmV, xmm2/m128, xmmIH","VFMADDPS xmmIH, xmm2/m128, xmmV, = xmm1","vfmaddps xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 68 /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDPS ymm1, ymmV, ymmIH, ymm2/m256","VFMADDPS ymm2/m256, ymmIH, ymmV, = ymm1","vfmaddps ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 68 /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDPS ymm1, ymmV, ymm2/m256, ymmIH","VFMADDPS ymmIH, ymm2/m256, ymmV, = ymm1","vfmaddps ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 68 /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDSD xmm1, xmmV, xmmIH, xmm2/m64","VFMADDSD xmm2/m64, xmmIH, xmmV, xm= m1","vfmaddsd xmm2/m64, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 6B /r /i= s4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDSD xmm1, xmmV, xmm2/m64, xmmIH","VFMADDSD xmmIH, xmm2/m64, xmmV, xm= m1","vfmaddsd xmmIH, xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 6B /r /i= s4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDSS xmm1, xmmV, xmmIH, xmm2/m32","VFMADDSS xmm2/m32, xmmIH, xmmV, xm= m1","vfmaddss xmm2/m32, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 6A /r /i= s4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDSS xmm1, xmmV, xmm2/m32, xmmIH","VFMADDSS xmmIH, xmm2/m32, xmmV, xm= m1","vfmaddss xmmIH, xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 6A /r /i= s4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDSUB132PD xmm1, xmmV, xmm2/m128","VFMADDSUB132PD xmm2/m128, xmmV, xm= m1","vfmaddsub132pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 96 /r","= V","V","FMA","","rw,r,r","","" +"VFMADDSUB132PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMADDSUB132PD xmm= 2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmaddsub132pd xmm2/m128/m64bcst, xmmV= , {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 96 /r","V","V","AVX512F+AVX512VL",= "bscale8,scale16","rw,r,r,r","","" 
+"VFMADDSUB132PD ymm1, ymmV, ymm2/m256","VFMADDSUB132PD ymm2/m256, ymmV, ym= m1","vfmaddsub132pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 96 /r","= V","V","FMA","","rw,r,r","","" +"VFMADDSUB132PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMADDSUB132PD ymm= 2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmaddsub132pd ymm2/m256/m64bcst, ymmV= , {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 96 /r","V","V","AVX512F+AVX512VL",= "bscale8,scale32","rw,r,r,r","","" +"VFMADDSUB132PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMADDSUB132PD zmm2, zmmV, = {k}{z}, zmm1{er}","vfmaddsub132pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.5= 12.66.0F38.W1 96 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADDSUB132PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMADDSUB132PD zmm= 2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmaddsub132pd zmm2/m512/m64bcst, zmmV= , {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 96 /r","V","V","AVX512F","bscale8,= scale64","rw,r,r,r","","" +"VFMADDSUB132PS xmm1, xmmV, xmm2/m128","VFMADDSUB132PS xmm2/m128, xmmV, xm= m1","vfmaddsub132ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 96 /r","= V","V","FMA","","rw,r,r","","" +"VFMADDSUB132PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMADDSUB132PS xmm= 2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmaddsub132ps xmm2/m128/m32bcst, xmmV= , {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 96 /r","V","V","AVX512F+AVX512VL",= "bscale4,scale16","rw,r,r,r","","" +"VFMADDSUB132PS ymm1, ymmV, ymm2/m256","VFMADDSUB132PS ymm2/m256, ymmV, ym= m1","vfmaddsub132ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 96 /r","= V","V","FMA","","rw,r,r","","" +"VFMADDSUB132PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMADDSUB132PS ymm= 2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmaddsub132ps ymm2/m256/m32bcst, ymmV= , {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 96 /r","V","V","AVX512F+AVX512VL",= "bscale4,scale32","rw,r,r,r","","" +"VFMADDSUB132PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMADDSUB132PS zmm2, zmmV, = {k}{z}, zmm1{er}","vfmaddsub132ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.5= 12.66.0F38.W0 96 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADDSUB132PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMADDSUB132PS zmm= 2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmaddsub132ps zmm2/m512/m32bcst, zmmV= , {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 96 /r","V","V","AVX512F","bscale4,= scale64","rw,r,r,r","","" +"VFMADDSUB213PD xmm1, xmmV, xmm2/m128","VFMADDSUB213PD xmm2/m128, xmmV, xm= m1","vfmaddsub213pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 A6 /r","= V","V","FMA","","rw,r,r","","" +"VFMADDSUB213PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMADDSUB213PD xmm= 2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmaddsub213pd xmm2/m128/m64bcst, xmmV= , {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 A6 /r","V","V","AVX512F+AVX512VL",= "bscale8,scale16","rw,r,r,r","","" +"VFMADDSUB213PD ymm1, ymmV, ymm2/m256","VFMADDSUB213PD ymm2/m256, ymmV, ym= m1","vfmaddsub213pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 A6 /r","= V","V","FMA","","rw,r,r","","" +"VFMADDSUB213PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMADDSUB213PD ymm= 2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmaddsub213pd ymm2/m256/m64bcst, ymmV= , {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 A6 /r","V","V","AVX512F+AVX512VL",= "bscale8,scale32","rw,r,r,r","","" +"VFMADDSUB213PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMADDSUB213PD zmm2, zmmV, = {k}{z}, zmm1{er}","vfmaddsub213pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.5= 12.66.0F38.W1 A6 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADDSUB213PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMADDSUB213PD zmm= 2/m512/m64bcst, zmmV, {k}{z}, 
zmm1","vfmaddsub213pd zmm2/m512/m64bcst, zmmV= , {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 A6 /r","V","V","AVX512F","bscale8,= scale64","rw,r,r,r","","" +"VFMADDSUB213PS xmm1, xmmV, xmm2/m128","VFMADDSUB213PS xmm2/m128, xmmV, xm= m1","vfmaddsub213ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 A6 /r","= V","V","FMA","","rw,r,r","","" +"VFMADDSUB213PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMADDSUB213PS xmm= 2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmaddsub213ps xmm2/m128/m32bcst, xmmV= , {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 A6 /r","V","V","AVX512F+AVX512VL",= "bscale4,scale16","rw,r,r,r","","" +"VFMADDSUB213PS ymm1, ymmV, ymm2/m256","VFMADDSUB213PS ymm2/m256, ymmV, ym= m1","vfmaddsub213ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 A6 /r","= V","V","FMA","","rw,r,r","","" +"VFMADDSUB213PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMADDSUB213PS ymm= 2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmaddsub213ps ymm2/m256/m32bcst, ymmV= , {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 A6 /r","V","V","AVX512F+AVX512VL",= "bscale4,scale32","rw,r,r,r","","" +"VFMADDSUB213PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMADDSUB213PS zmm2, zmmV, = {k}{z}, zmm1{er}","vfmaddsub213ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.5= 12.66.0F38.W0 A6 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADDSUB213PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMADDSUB213PS zmm= 2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmaddsub213ps zmm2/m512/m32bcst, zmmV= , {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 A6 /r","V","V","AVX512F","bscale4,= scale64","rw,r,r,r","","" +"VFMADDSUB231PD xmm1, xmmV, xmm2/m128","VFMADDSUB231PD xmm2/m128, xmmV, xm= m1","vfmaddsub231pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 B6 /r","= V","V","FMA","","rw,r,r","","" +"VFMADDSUB231PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMADDSUB231PD xmm= 2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmaddsub231pd xmm2/m128/m64bcst, xmmV= , {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 B6 /r","V","V","AVX512F+AVX512VL",= "bscale8,scale16","rw,r,r,r","","" +"VFMADDSUB231PD ymm1, ymmV, ymm2/m256","VFMADDSUB231PD ymm2/m256, ymmV, ym= m1","vfmaddsub231pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 B6 /r","= V","V","FMA","","rw,r,r","","" +"VFMADDSUB231PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMADDSUB231PD ymm= 2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmaddsub231pd ymm2/m256/m64bcst, ymmV= , {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 B6 /r","V","V","AVX512F+AVX512VL",= "bscale8,scale32","rw,r,r,r","","" +"VFMADDSUB231PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMADDSUB231PD zmm2, zmmV, = {k}{z}, zmm1{er}","vfmaddsub231pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.5= 12.66.0F38.W1 B6 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADDSUB231PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMADDSUB231PD zmm= 2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmaddsub231pd zmm2/m512/m64bcst, zmmV= , {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 B6 /r","V","V","AVX512F","bscale8,= scale64","rw,r,r,r","","" +"VFMADDSUB231PS xmm1, xmmV, xmm2/m128","VFMADDSUB231PS xmm2/m128, xmmV, xm= m1","vfmaddsub231ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 B6 /r","= V","V","FMA","","rw,r,r","","" +"VFMADDSUB231PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMADDSUB231PS xmm= 2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmaddsub231ps xmm2/m128/m32bcst, xmmV= , {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 B6 /r","V","V","AVX512F+AVX512VL",= "bscale4,scale16","rw,r,r,r","","" +"VFMADDSUB231PS ymm1, ymmV, ymm2/m256","VFMADDSUB231PS ymm2/m256, ymmV, ym= m1","vfmaddsub231ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 B6 /r","= 
V","V","FMA","","rw,r,r","","" +"VFMADDSUB231PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMADDSUB231PS ymm= 2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmaddsub231ps ymm2/m256/m32bcst, ymmV= , {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 B6 /r","V","V","AVX512F+AVX512VL",= "bscale4,scale32","rw,r,r,r","","" +"VFMADDSUB231PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMADDSUB231PS zmm2, zmmV, = {k}{z}, zmm1{er}","vfmaddsub231ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.5= 12.66.0F38.W0 B6 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADDSUB231PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMADDSUB231PS zmm= 2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmaddsub231ps zmm2/m512/m32bcst, zmmV= , {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 B6 /r","V","V","AVX512F","bscale4,= scale64","rw,r,r,r","","" +"VFMADDSUBPD xmm1, xmmV, xmmIH, xmm2/m128","VFMADDSUBPD xmm2/m128, xmmIH, = xmmV, xmm1","vfmaddsubpd xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A= .W1 5D /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDSUBPD xmm1, xmmV, xmm2/m128, xmmIH","VFMADDSUBPD xmmIH, xmm2/m128, = xmmV, xmm1","vfmaddsubpd xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A= .W0 5D /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDSUBPD ymm1, ymmV, ymmIH, ymm2/m256","VFMADDSUBPD ymm2/m256, ymmIH, = ymmV, ymm1","vfmaddsubpd ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A= .W1 5D /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDSUBPD ymm1, ymmV, ymm2/m256, ymmIH","VFMADDSUBPD ymmIH, ymm2/m256, = ymmV, ymm1","vfmaddsubpd ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A= .W0 5D /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDSUBPS xmm1, xmmV, xmmIH, xmm2/m128","VFMADDSUBPS xmm2/m128, xmmIH, = xmmV, xmm1","vfmaddsubps xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A= .W1 5C /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDSUBPS xmm1, xmmV, xmm2/m128, xmmIH","VFMADDSUBPS xmmIH, xmm2/m128, = xmmV, xmm1","vfmaddsubps xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A= .W0 5C /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDSUBPS ymm1, ymmV, ymmIH, ymm2/m256","VFMADDSUBPS ymm2/m256, ymmIH, = ymmV, ymm1","vfmaddsubps ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A= .W1 5C /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDSUBPS ymm1, ymmV, ymm2/m256, ymmIH","VFMADDSUBPS ymmIH, ymm2/m256, = ymmV, ymm1","vfmaddsubps ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A= .W0 5C /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUB132PD xmm1, xmmV, xmm2/m128","VFMSUB132PD xmm2/m128, xmmV, xmm1","v= fmsub132pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 9A /r","V","V","F= MA","","rw,r,r","","" +"VFMSUB132PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMSUB132PD xmm2/m128= /m64bcst, xmmV, {k}{z}, xmm1","vfmsub132pd xmm2/m128/m64bcst, xmmV, {k}{z},= xmm1","EVEX.DDS.128.66.0F38.W1 9A /r","V","V","AVX512F+AVX512VL","bscale8,= scale16","rw,r,r,r","","" +"VFMSUB132PD ymm1, ymmV, ymm2/m256","VFMSUB132PD ymm2/m256, ymmV, ymm1","v= fmsub132pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 9A /r","V","V","F= MA","","rw,r,r","","" +"VFMSUB132PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMSUB132PD ymm2/m256= /m64bcst, ymmV, {k}{z}, ymm1","vfmsub132pd ymm2/m256/m64bcst, ymmV, {k}{z},= ymm1","EVEX.DDS.256.66.0F38.W1 9A /r","V","V","AVX512F+AVX512VL","bscale8,= scale32","rw,r,r,r","","" +"VFMSUB132PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUB132PD zmm2, zmmV, {k}{z}= , zmm1{er}","vfmsub132pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F3= 8.W1 9A /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUB132PD zmm1, {k}{z}, zmmV, 
zmm2/m512/m64bcst","VFMSUB132PD zmm2/m512= /m64bcst, zmmV, {k}{z}, zmm1","vfmsub132pd zmm2/m512/m64bcst, zmmV, {k}{z},= zmm1","EVEX.DDS.512.66.0F38.W1 9A /r","V","V","AVX512F","bscale8,scale64",= "rw,r,r,r","","" +"VFMSUB132PS xmm1, xmmV, xmm2/m128","VFMSUB132PS xmm2/m128, xmmV, xmm1","v= fmsub132ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 9A /r","V","V","F= MA","","rw,r,r","","" +"VFMSUB132PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMSUB132PS xmm2/m128= /m32bcst, xmmV, {k}{z}, xmm1","vfmsub132ps xmm2/m128/m32bcst, xmmV, {k}{z},= xmm1","EVEX.DDS.128.66.0F38.W0 9A /r","V","V","AVX512F+AVX512VL","bscale4,= scale16","rw,r,r,r","","" +"VFMSUB132PS ymm1, ymmV, ymm2/m256","VFMSUB132PS ymm2/m256, ymmV, ymm1","v= fmsub132ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 9A /r","V","V","F= MA","","rw,r,r","","" +"VFMSUB132PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMSUB132PS ymm2/m256= /m32bcst, ymmV, {k}{z}, ymm1","vfmsub132ps ymm2/m256/m32bcst, ymmV, {k}{z},= ymm1","EVEX.DDS.256.66.0F38.W0 9A /r","V","V","AVX512F+AVX512VL","bscale4,= scale32","rw,r,r,r","","" +"VFMSUB132PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUB132PS zmm2, zmmV, {k}{z}= , zmm1{er}","vfmsub132ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F3= 8.W0 9A /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUB132PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMSUB132PS zmm2/m512= /m32bcst, zmmV, {k}{z}, zmm1","vfmsub132ps zmm2/m512/m32bcst, zmmV, {k}{z},= zmm1","EVEX.DDS.512.66.0F38.W0 9A /r","V","V","AVX512F","bscale4,scale64",= "rw,r,r,r","","" +"VFMSUB132SD xmm1{er}, {k}{z}, xmmV, xmm2","VFMSUB132SD xmm2, xmmV, {k}{z}= , xmm1{er}","vfmsub132sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F3= 8.W1 9B /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUB132SD xmm1, xmmV, xmm2/m64","VFMSUB132SD xmm2/m64, xmmV, xmm1","vfm= sub132sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 9B /r","V","V","FMA"= ,"","rw,r,r","","" +"VFMSUB132SD xmm1, {k}{z}, xmmV, xmm2/m64","VFMSUB132SD xmm2/m64, xmmV, {k= }{z}, xmm1","vfmsub132sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F3= 8.W1 9B /r","V","V","AVX512F","scale8","rw,r,r,r","","" +"VFMSUB132SS xmm1{er}, {k}{z}, xmmV, xmm2","VFMSUB132SS xmm2, xmmV, {k}{z}= , xmm1{er}","vfmsub132ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F3= 8.W0 9B /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUB132SS xmm1, xmmV, xmm2/m32","VFMSUB132SS xmm2/m32, xmmV, xmm1","vfm= sub132ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 9B /r","V","V","FMA"= ,"","rw,r,r","","" +"VFMSUB132SS xmm1, {k}{z}, xmmV, xmm2/m32","VFMSUB132SS xmm2/m32, xmmV, {k= }{z}, xmm1","vfmsub132ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F3= 8.W0 9B /r","V","V","AVX512F","scale4","rw,r,r,r","","" +"VFMSUB213PD xmm1, xmmV, xmm2/m128","VFMSUB213PD xmm2/m128, xmmV, xmm1","v= fmsub213pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 AA /r","V","V","F= MA","","rw,r,r","","" +"VFMSUB213PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMSUB213PD xmm2/m128= /m64bcst, xmmV, {k}{z}, xmm1","vfmsub213pd xmm2/m128/m64bcst, xmmV, {k}{z},= xmm1","EVEX.DDS.128.66.0F38.W1 AA /r","V","V","AVX512F+AVX512VL","bscale8,= scale16","rw,r,r,r","","" +"VFMSUB213PD ymm1, ymmV, ymm2/m256","VFMSUB213PD ymm2/m256, ymmV, ymm1","v= fmsub213pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 AA /r","V","V","F= MA","","rw,r,r","","" +"VFMSUB213PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMSUB213PD ymm2/m256= /m64bcst, ymmV, {k}{z}, ymm1","vfmsub213pd ymm2/m256/m64bcst, ymmV, {k}{z},= ymm1","EVEX.DDS.256.66.0F38.W1 AA 
/r","V","V","AVX512F+AVX512VL","bscale8,= scale32","rw,r,r,r","","" +"VFMSUB213PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUB213PD zmm2, zmmV, {k}{z}= , zmm1{er}","vfmsub213pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F3= 8.W1 AA /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUB213PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMSUB213PD zmm2/m512= /m64bcst, zmmV, {k}{z}, zmm1","vfmsub213pd zmm2/m512/m64bcst, zmmV, {k}{z},= zmm1","EVEX.DDS.512.66.0F38.W1 AA /r","V","V","AVX512F","bscale8,scale64",= "rw,r,r,r","","" +"VFMSUB213PS xmm1, xmmV, xmm2/m128","VFMSUB213PS xmm2/m128, xmmV, xmm1","v= fmsub213ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 AA /r","V","V","F= MA","","rw,r,r","","" +"VFMSUB213PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMSUB213PS xmm2/m128= /m32bcst, xmmV, {k}{z}, xmm1","vfmsub213ps xmm2/m128/m32bcst, xmmV, {k}{z},= xmm1","EVEX.DDS.128.66.0F38.W0 AA /r","V","V","AVX512F+AVX512VL","bscale4,= scale16","rw,r,r,r","","" +"VFMSUB213PS ymm1, ymmV, ymm2/m256","VFMSUB213PS ymm2/m256, ymmV, ymm1","v= fmsub213ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 AA /r","V","V","F= MA","","rw,r,r","","" +"VFMSUB213PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMSUB213PS ymm2/m256= /m32bcst, ymmV, {k}{z}, ymm1","vfmsub213ps ymm2/m256/m32bcst, ymmV, {k}{z},= ymm1","EVEX.DDS.256.66.0F38.W0 AA /r","V","V","AVX512F+AVX512VL","bscale4,= scale32","rw,r,r,r","","" +"VFMSUB213PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUB213PS zmm2, zmmV, {k}{z}= , zmm1{er}","vfmsub213ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F3= 8.W0 AA /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUB213PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMSUB213PS zmm2/m512= /m32bcst, zmmV, {k}{z}, zmm1","vfmsub213ps zmm2/m512/m32bcst, zmmV, {k}{z},= zmm1","EVEX.DDS.512.66.0F38.W0 AA /r","V","V","AVX512F","bscale4,scale64",= "rw,r,r,r","","" +"VFMSUB213SD xmm1{er}, {k}{z}, xmmV, xmm2","VFMSUB213SD xmm2, xmmV, {k}{z}= , xmm1{er}","vfmsub213sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F3= 8.W1 AB /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUB213SD xmm1, xmmV, xmm2/m64","VFMSUB213SD xmm2/m64, xmmV, xmm1","vfm= sub213sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 AB /r","V","V","FMA"= ,"","rw,r,r","","" +"VFMSUB213SD xmm1, {k}{z}, xmmV, xmm2/m64","VFMSUB213SD xmm2/m64, xmmV, {k= }{z}, xmm1","vfmsub213sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F3= 8.W1 AB /r","V","V","AVX512F","scale8","rw,r,r,r","","" +"VFMSUB213SS xmm1{er}, {k}{z}, xmmV, xmm2","VFMSUB213SS xmm2, xmmV, {k}{z}= , xmm1{er}","vfmsub213ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F3= 8.W0 AB /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUB213SS xmm1, xmmV, xmm2/m32","VFMSUB213SS xmm2/m32, xmmV, xmm1","vfm= sub213ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 AB /r","V","V","FMA"= ,"","rw,r,r","","" +"VFMSUB213SS xmm1, {k}{z}, xmmV, xmm2/m32","VFMSUB213SS xmm2/m32, xmmV, {k= }{z}, xmm1","vfmsub213ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F3= 8.W0 AB /r","V","V","AVX512F","scale4","rw,r,r,r","","" +"VFMSUB231PD xmm1, xmmV, xmm2/m128","VFMSUB231PD xmm2/m128, xmmV, xmm1","v= fmsub231pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 BA /r","V","V","F= MA","","rw,r,r","","" +"VFMSUB231PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMSUB231PD xmm2/m128= /m64bcst, xmmV, {k}{z}, xmm1","vfmsub231pd xmm2/m128/m64bcst, xmmV, {k}{z},= xmm1","EVEX.DDS.128.66.0F38.W1 BA /r","V","V","AVX512F+AVX512VL","bscale8,= scale16","rw,r,r,r","","" +"VFMSUB231PD ymm1, ymmV, ymm2/m256","VFMSUB231PD ymm2/m256, 
ymmV, ymm1","v= fmsub231pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 BA /r","V","V","F= MA","","rw,r,r","","" +"VFMSUB231PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMSUB231PD ymm2/m256= /m64bcst, ymmV, {k}{z}, ymm1","vfmsub231pd ymm2/m256/m64bcst, ymmV, {k}{z},= ymm1","EVEX.DDS.256.66.0F38.W1 BA /r","V","V","AVX512F+AVX512VL","bscale8,= scale32","rw,r,r,r","","" +"VFMSUB231PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUB231PD zmm2, zmmV, {k}{z}= , zmm1{er}","vfmsub231pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F3= 8.W1 BA /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUB231PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMSUB231PD zmm2/m512= /m64bcst, zmmV, {k}{z}, zmm1","vfmsub231pd zmm2/m512/m64bcst, zmmV, {k}{z},= zmm1","EVEX.DDS.512.66.0F38.W1 BA /r","V","V","AVX512F","bscale8,scale64",= "rw,r,r,r","","" +"VFMSUB231PS xmm1, xmmV, xmm2/m128","VFMSUB231PS xmm2/m128, xmmV, xmm1","v= fmsub231ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 BA /r","V","V","F= MA","","rw,r,r","","" +"VFMSUB231PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMSUB231PS xmm2/m128= /m32bcst, xmmV, {k}{z}, xmm1","vfmsub231ps xmm2/m128/m32bcst, xmmV, {k}{z},= xmm1","EVEX.DDS.128.66.0F38.W0 BA /r","V","V","AVX512F+AVX512VL","bscale4,= scale16","rw,r,r,r","","" +"VFMSUB231PS ymm1, ymmV, ymm2/m256","VFMSUB231PS ymm2/m256, ymmV, ymm1","v= fmsub231ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 BA /r","V","V","F= MA","","rw,r,r","","" +"VFMSUB231PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMSUB231PS ymm2/m256= /m32bcst, ymmV, {k}{z}, ymm1","vfmsub231ps ymm2/m256/m32bcst, ymmV, {k}{z},= ymm1","EVEX.DDS.256.66.0F38.W0 BA /r","V","V","AVX512F+AVX512VL","bscale4,= scale32","rw,r,r,r","","" +"VFMSUB231PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUB231PS zmm2, zmmV, {k}{z}= , zmm1{er}","vfmsub231ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F3= 8.W0 BA /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUB231PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMSUB231PS zmm2/m512= /m32bcst, zmmV, {k}{z}, zmm1","vfmsub231ps zmm2/m512/m32bcst, zmmV, {k}{z},= zmm1","EVEX.DDS.512.66.0F38.W0 BA /r","V","V","AVX512F","bscale4,scale64",= "rw,r,r,r","","" +"VFMSUB231SD xmm1{er}, {k}{z}, xmmV, xmm2","VFMSUB231SD xmm2, xmmV, {k}{z}= , xmm1{er}","vfmsub231sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F3= 8.W1 BB /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUB231SD xmm1, xmmV, xmm2/m64","VFMSUB231SD xmm2/m64, xmmV, xmm1","vfm= sub231sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 BB /r","V","V","FMA"= ,"","rw,r,r","","" +"VFMSUB231SD xmm1, {k}{z}, xmmV, xmm2/m64","VFMSUB231SD xmm2/m64, xmmV, {k= }{z}, xmm1","vfmsub231sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F3= 8.W1 BB /r","V","V","AVX512F","scale8","rw,r,r,r","","" +"VFMSUB231SS xmm1{er}, {k}{z}, xmmV, xmm2","VFMSUB231SS xmm2, xmmV, {k}{z}= , xmm1{er}","vfmsub231ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F3= 8.W0 BB /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUB231SS xmm1, xmmV, xmm2/m32","VFMSUB231SS xmm2/m32, xmmV, xmm1","vfm= sub231ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 BB /r","V","V","FMA"= ,"","rw,r,r","","" +"VFMSUB231SS xmm1, {k}{z}, xmmV, xmm2/m32","VFMSUB231SS xmm2/m32, xmmV, {k= }{z}, xmm1","vfmsub231ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F3= 8.W0 BB /r","V","V","AVX512F","scale4","rw,r,r,r","","" +"VFMSUBADD132PD xmm1, xmmV, xmm2/m128","VFMSUBADD132PD xmm2/m128, xmmV, xm= m1","vfmsubadd132pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 97 /r","= V","V","FMA","","rw,r,r","","" 
+"VFMSUBADD132PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMSUBADD132PD xmm= 2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmsubadd132pd xmm2/m128/m64bcst, xmmV= , {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 97 /r","V","V","AVX512F+AVX512VL",= "bscale8,scale16","rw,r,r,r","","" +"VFMSUBADD132PD ymm1, ymmV, ymm2/m256","VFMSUBADD132PD ymm2/m256, ymmV, ym= m1","vfmsubadd132pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 97 /r","= V","V","FMA","","rw,r,r","","" +"VFMSUBADD132PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMSUBADD132PD ymm= 2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmsubadd132pd ymm2/m256/m64bcst, ymmV= , {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 97 /r","V","V","AVX512F+AVX512VL",= "bscale8,scale32","rw,r,r,r","","" +"VFMSUBADD132PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUBADD132PD zmm2, zmmV, = {k}{z}, zmm1{er}","vfmsubadd132pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.5= 12.66.0F38.W1 97 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUBADD132PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMSUBADD132PD zmm= 2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmsubadd132pd zmm2/m512/m64bcst, zmmV= , {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 97 /r","V","V","AVX512F","bscale8,= scale64","rw,r,r,r","","" +"VFMSUBADD132PS xmm1, xmmV, xmm2/m128","VFMSUBADD132PS xmm2/m128, xmmV, xm= m1","vfmsubadd132ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 97 /r","= V","V","FMA","","rw,r,r","","" +"VFMSUBADD132PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMSUBADD132PS xmm= 2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmsubadd132ps xmm2/m128/m32bcst, xmmV= , {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 97 /r","V","V","AVX512F+AVX512VL",= "bscale4,scale16","rw,r,r,r","","" +"VFMSUBADD132PS ymm1, ymmV, ymm2/m256","VFMSUBADD132PS ymm2/m256, ymmV, ym= m1","vfmsubadd132ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 97 /r","= V","V","FMA","","rw,r,r","","" +"VFMSUBADD132PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMSUBADD132PS ymm= 2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmsubadd132ps ymm2/m256/m32bcst, ymmV= , {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 97 /r","V","V","AVX512F+AVX512VL",= "bscale4,scale32","rw,r,r,r","","" +"VFMSUBADD132PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUBADD132PS zmm2, zmmV, = {k}{z}, zmm1{er}","vfmsubadd132ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.5= 12.66.0F38.W0 97 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUBADD132PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMSUBADD132PS zmm= 2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmsubadd132ps zmm2/m512/m32bcst, zmmV= , {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 97 /r","V","V","AVX512F","bscale4,= scale64","rw,r,r,r","","" +"VFMSUBADD213PD xmm1, xmmV, xmm2/m128","VFMSUBADD213PD xmm2/m128, xmmV, xm= m1","vfmsubadd213pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 A7 /r","= V","V","FMA","","rw,r,r","","" +"VFMSUBADD213PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMSUBADD213PD xmm= 2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmsubadd213pd xmm2/m128/m64bcst, xmmV= , {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 A7 /r","V","V","AVX512F+AVX512VL",= "bscale8,scale16","rw,r,r,r","","" +"VFMSUBADD213PD ymm1, ymmV, ymm2/m256","VFMSUBADD213PD ymm2/m256, ymmV, ym= m1","vfmsubadd213pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 A7 /r","= V","V","FMA","","rw,r,r","","" +"VFMSUBADD213PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMSUBADD213PD ymm= 2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmsubadd213pd ymm2/m256/m64bcst, ymmV= , {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 A7 /r","V","V","AVX512F+AVX512VL",= "bscale8,scale32","rw,r,r,r","","" +"VFMSUBADD213PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUBADD213PD 
zmm2, zmmV, = {k}{z}, zmm1{er}","vfmsubadd213pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.5= 12.66.0F38.W1 A7 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUBADD213PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMSUBADD213PD zmm= 2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmsubadd213pd zmm2/m512/m64bcst, zmmV= , {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 A7 /r","V","V","AVX512F","bscale8,= scale64","rw,r,r,r","","" +"VFMSUBADD213PS xmm1, xmmV, xmm2/m128","VFMSUBADD213PS xmm2/m128, xmmV, xm= m1","vfmsubadd213ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 A7 /r","= V","V","FMA","","rw,r,r","","" +"VFMSUBADD213PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMSUBADD213PS xmm= 2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmsubadd213ps xmm2/m128/m32bcst, xmmV= , {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 A7 /r","V","V","AVX512F+AVX512VL",= "bscale4,scale16","rw,r,r,r","","" +"VFMSUBADD213PS ymm1, ymmV, ymm2/m256","VFMSUBADD213PS ymm2/m256, ymmV, ym= m1","vfmsubadd213ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 A7 /r","= V","V","FMA","","rw,r,r","","" +"VFMSUBADD213PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMSUBADD213PS ymm= 2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmsubadd213ps ymm2/m256/m32bcst, ymmV= , {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 A7 /r","V","V","AVX512F+AVX512VL",= "bscale4,scale32","rw,r,r,r","","" +"VFMSUBADD213PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUBADD213PS zmm2, zmmV, = {k}{z}, zmm1{er}","vfmsubadd213ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.5= 12.66.0F38.W0 A7 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUBADD213PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMSUBADD213PS zmm= 2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmsubadd213ps zmm2/m512/m32bcst, zmmV= , {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 A7 /r","V","V","AVX512F","bscale4,= scale64","rw,r,r,r","","" +"VFMSUBADD231PD xmm1, xmmV, xmm2/m128","VFMSUBADD231PD xmm2/m128, xmmV, xm= m1","vfmsubadd231pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 B7 /r","= V","V","FMA","","rw,r,r","","" +"VFMSUBADD231PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMSUBADD231PD xmm= 2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmsubadd231pd xmm2/m128/m64bcst, xmmV= , {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 B7 /r","V","V","AVX512F+AVX512VL",= "bscale8,scale16","rw,r,r,r","","" +"VFMSUBADD231PD ymm1, ymmV, ymm2/m256","VFMSUBADD231PD ymm2/m256, ymmV, ym= m1","vfmsubadd231pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 B7 /r","= V","V","FMA","","rw,r,r","","" +"VFMSUBADD231PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMSUBADD231PD ymm= 2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmsubadd231pd ymm2/m256/m64bcst, ymmV= , {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 B7 /r","V","V","AVX512F+AVX512VL",= "bscale8,scale32","rw,r,r,r","","" +"VFMSUBADD231PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUBADD231PD zmm2, zmmV, = {k}{z}, zmm1{er}","vfmsubadd231pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.5= 12.66.0F38.W1 B7 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUBADD231PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMSUBADD231PD zmm= 2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmsubadd231pd zmm2/m512/m64bcst, zmmV= , {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 B7 /r","V","V","AVX512F","bscale8,= scale64","rw,r,r,r","","" +"VFMSUBADD231PS xmm1, xmmV, xmm2/m128","VFMSUBADD231PS xmm2/m128, xmmV, xm= m1","vfmsubadd231ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 B7 /r","= V","V","FMA","","rw,r,r","","" +"VFMSUBADD231PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMSUBADD231PS xmm= 2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmsubadd231ps xmm2/m128/m32bcst, xmmV= , {k}{z}, 
xmm1","EVEX.DDS.128.66.0F38.W0 B7 /r","V","V","AVX512F+AVX512VL",= "bscale4,scale16","rw,r,r,r","","" +"VFMSUBADD231PS ymm1, ymmV, ymm2/m256","VFMSUBADD231PS ymm2/m256, ymmV, ym= m1","vfmsubadd231ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 B7 /r","= V","V","FMA","","rw,r,r","","" +"VFMSUBADD231PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMSUBADD231PS ymm= 2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmsubadd231ps ymm2/m256/m32bcst, ymmV= , {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 B7 /r","V","V","AVX512F+AVX512VL",= "bscale4,scale32","rw,r,r,r","","" +"VFMSUBADD231PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUBADD231PS zmm2, zmmV, = {k}{z}, zmm1{er}","vfmsubadd231ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.5= 12.66.0F38.W0 B7 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUBADD231PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMSUBADD231PS zmm= 2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmsubadd231ps zmm2/m512/m32bcst, zmmV= , {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 B7 /r","V","V","AVX512F","bscale4,= scale64","rw,r,r,r","","" +"VFMSUBADDPD xmm1, xmmV, xmmIH, xmm2/m128","VFMSUBADDPD xmm2/m128, xmmIH, = xmmV, xmm1","vfmsubaddpd xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A= .W1 5F /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBADDPD xmm1, xmmV, xmm2/m128, xmmIH","VFMSUBADDPD xmmIH, xmm2/m128, = xmmV, xmm1","vfmsubaddpd xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A= .W0 5F /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBADDPD ymm1, ymmV, ymmIH, ymm2/m256","VFMSUBADDPD ymm2/m256, ymmIH, = ymmV, ymm1","vfmsubaddpd ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A= .W1 5F /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBADDPD ymm1, ymmV, ymm2/m256, ymmIH","VFMSUBADDPD ymmIH, ymm2/m256, = ymmV, ymm1","vfmsubaddpd ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A= .W0 5F /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBADDPS xmm1, xmmV, xmmIH, xmm2/m128","VFMSUBADDPS xmm2/m128, xmmIH, = xmmV, xmm1","vfmsubaddps xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A= .W1 5E /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBADDPS xmm1, xmmV, xmm2/m128, xmmIH","VFMSUBADDPS xmmIH, xmm2/m128, = xmmV, xmm1","vfmsubaddps xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A= .W0 5E /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBADDPS ymm1, ymmV, ymmIH, ymm2/m256","VFMSUBADDPS ymm2/m256, ymmIH, = ymmV, ymm1","vfmsubaddps ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A= .W1 5E /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBADDPS ymm1, ymmV, ymm2/m256, ymmIH","VFMSUBADDPS ymmIH, ymm2/m256, = ymmV, ymm1","vfmsubaddps ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A= .W0 5E /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBPD xmm1, xmmV, xmmIH, xmm2/m128","VFMSUBPD xmm2/m128, xmmIH, xmmV, = xmm1","vfmsubpd xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 6D /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBPD xmm1, xmmV, xmm2/m128, xmmIH","VFMSUBPD xmmIH, xmm2/m128, xmmV, = xmm1","vfmsubpd xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 6D /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBPD ymm1, ymmV, ymmIH, ymm2/m256","VFMSUBPD ymm2/m256, ymmIH, ymmV, = ymm1","vfmsubpd ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 6D /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBPD ymm1, ymmV, ymm2/m256, ymmIH","VFMSUBPD ymmIH, ymm2/m256, ymmV, = ymm1","vfmsubpd ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 6D /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBPS xmm1, xmmV, xmmIH, xmm2/m128","VFMSUBPS xmm2/m128, xmmIH, xmmV, 
= xmm1","vfmsubps xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 6C /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBPS xmm1, xmmV, xmm2/m128, xmmIH","VFMSUBPS xmmIH, xmm2/m128, xmmV, = xmm1","vfmsubps xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 6C /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBPS ymm1, ymmV, ymmIH, ymm2/m256","VFMSUBPS ymm2/m256, ymmIH, ymmV, = ymm1","vfmsubps ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 6C /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBPS ymm1, ymmV, ymm2/m256, ymmIH","VFMSUBPS ymmIH, ymm2/m256, ymmV, = ymm1","vfmsubps ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 6C /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBSD xmm1, xmmV, xmmIH, xmm2/m64","VFMSUBSD xmm2/m64, xmmIH, xmmV, xm= m1","vfmsubsd xmm2/m64, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 6F /r /i= s4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBSD xmm1, xmmV, xmm2/m64, xmmIH","VFMSUBSD xmmIH, xmm2/m64, xmmV, xm= m1","vfmsubsd xmmIH, xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 6F /r /i= s4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBSS xmm1, xmmV, xmmIH, xmm2/m32","VFMSUBSS xmm2/m32, xmmIH, xmmV, xm= m1","vfmsubss xmm2/m32, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 6E /r /i= s4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBSS xmm1, xmmV, xmm2/m32, xmmIH","VFMSUBSS xmmIH, xmm2/m32, xmmV, xm= m1","vfmsubss xmmIH, xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 6E /r /i= s4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMADD132PD xmm1, xmmV, xmm2/m128","VFNMADD132PD xmm2/m128, xmmV, xmm1",= "vfnmadd132pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 9C /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMADD132PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFNMADD132PD xmm2/m1= 28/m64bcst, xmmV, {k}{z}, xmm1","vfnmadd132pd xmm2/m128/m64bcst, xmmV, {k}{= z}, xmm1","EVEX.DDS.128.66.0F38.W1 9C /r","V","V","AVX512F+AVX512VL","bscal= e8,scale16","rw,r,r,r","","" +"VFNMADD132PD ymm1, ymmV, ymm2/m256","VFNMADD132PD ymm2/m256, ymmV, ymm1",= "vfnmadd132pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 9C /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMADD132PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFNMADD132PD ymm2/m2= 56/m64bcst, ymmV, {k}{z}, ymm1","vfnmadd132pd ymm2/m256/m64bcst, ymmV, {k}{= z}, ymm1","EVEX.DDS.256.66.0F38.W1 9C /r","V","V","AVX512F+AVX512VL","bscal= e8,scale32","rw,r,r,r","","" +"VFNMADD132PD zmm1{er}, {k}{z}, zmmV, zmm2","VFNMADD132PD zmm2, zmmV, {k}{= z}, zmm1{er}","vfnmadd132pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.= 0F38.W1 9C /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMADD132PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFNMADD132PD zmm2/m5= 12/m64bcst, zmmV, {k}{z}, zmm1","vfnmadd132pd zmm2/m512/m64bcst, zmmV, {k}{= z}, zmm1","EVEX.DDS.512.66.0F38.W1 9C /r","V","V","AVX512F","bscale8,scale6= 4","rw,r,r,r","","" +"VFNMADD132PS xmm1, xmmV, xmm2/m128","VFNMADD132PS xmm2/m128, xmmV, xmm1",= "vfnmadd132ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 9C /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMADD132PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFNMADD132PS xmm2/m1= 28/m32bcst, xmmV, {k}{z}, xmm1","vfnmadd132ps xmm2/m128/m32bcst, xmmV, {k}{= z}, xmm1","EVEX.DDS.128.66.0F38.W0 9C /r","V","V","AVX512F+AVX512VL","bscal= e4,scale16","rw,r,r,r","","" +"VFNMADD132PS ymm1, ymmV, ymm2/m256","VFNMADD132PS ymm2/m256, ymmV, ymm1",= "vfnmadd132ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 9C /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMADD132PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFNMADD132PS ymm2/m2= 56/m32bcst, ymmV, 
{k}{z}, ymm1","vfnmadd132ps ymm2/m256/m32bcst, ymmV, {k}{= z}, ymm1","EVEX.DDS.256.66.0F38.W0 9C /r","V","V","AVX512F+AVX512VL","bscal= e4,scale32","rw,r,r,r","","" +"VFNMADD132PS zmm1{er}, {k}{z}, zmmV, zmm2","VFNMADD132PS zmm2, zmmV, {k}{= z}, zmm1{er}","vfnmadd132ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.= 0F38.W0 9C /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMADD132PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFNMADD132PS zmm2/m5= 12/m32bcst, zmmV, {k}{z}, zmm1","vfnmadd132ps zmm2/m512/m32bcst, zmmV, {k}{= z}, zmm1","EVEX.DDS.512.66.0F38.W0 9C /r","V","V","AVX512F","bscale4,scale6= 4","rw,r,r,r","","" +"VFNMADD132SD xmm1{er}, {k}{z}, xmmV, xmm2","VFNMADD132SD xmm2, xmmV, {k}{= z}, xmm1{er}","vfnmadd132sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.= 0F38.W1 9D /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMADD132SD xmm1, xmmV, xmm2/m64","VFNMADD132SD xmm2/m64, xmmV, xmm1","v= fnmadd132sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 9D /r","V","V","F= MA","","rw,r,r","","" +"VFNMADD132SD xmm1, {k}{z}, xmmV, xmm2/m64","VFNMADD132SD xmm2/m64, xmmV, = {k}{z}, xmm1","vfnmadd132sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.= 0F38.W1 9D /r","V","V","AVX512F","scale8","rw,r,r,r","","" +"VFNMADD132SS xmm1{er}, {k}{z}, xmmV, xmm2","VFNMADD132SS xmm2, xmmV, {k}{= z}, xmm1{er}","vfnmadd132ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.= 0F38.W0 9D /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMADD132SS xmm1, xmmV, xmm2/m32","VFNMADD132SS xmm2/m32, xmmV, xmm1","v= fnmadd132ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 9D /r","V","V","F= MA","","rw,r,r","","" +"VFNMADD132SS xmm1, {k}{z}, xmmV, xmm2/m32","VFNMADD132SS xmm2/m32, xmmV, = {k}{z}, xmm1","vfnmadd132ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.= 0F38.W0 9D /r","V","V","AVX512F","scale4","rw,r,r,r","","" +"VFNMADD213PD xmm1, xmmV, xmm2/m128","VFNMADD213PD xmm2/m128, xmmV, xmm1",= "vfnmadd213pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 AC /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMADD213PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFNMADD213PD xmm2/m1= 28/m64bcst, xmmV, {k}{z}, xmm1","vfnmadd213pd xmm2/m128/m64bcst, xmmV, {k}{= z}, xmm1","EVEX.DDS.128.66.0F38.W1 AC /r","V","V","AVX512F+AVX512VL","bscal= e8,scale16","rw,r,r,r","","" +"VFNMADD213PD ymm1, ymmV, ymm2/m256","VFNMADD213PD ymm2/m256, ymmV, ymm1",= "vfnmadd213pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 AC /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMADD213PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFNMADD213PD ymm2/m2= 56/m64bcst, ymmV, {k}{z}, ymm1","vfnmadd213pd ymm2/m256/m64bcst, ymmV, {k}{= z}, ymm1","EVEX.DDS.256.66.0F38.W1 AC /r","V","V","AVX512F+AVX512VL","bscal= e8,scale32","rw,r,r,r","","" +"VFNMADD213PD zmm1{er}, {k}{z}, zmmV, zmm2","VFNMADD213PD zmm2, zmmV, {k}{= z}, zmm1{er}","vfnmadd213pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.= 0F38.W1 AC /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMADD213PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFNMADD213PD zmm2/m5= 12/m64bcst, zmmV, {k}{z}, zmm1","vfnmadd213pd zmm2/m512/m64bcst, zmmV, {k}{= z}, zmm1","EVEX.DDS.512.66.0F38.W1 AC /r","V","V","AVX512F","bscale8,scale6= 4","rw,r,r,r","","" +"VFNMADD213PS xmm1, xmmV, xmm2/m128","VFNMADD213PS xmm2/m128, xmmV, xmm1",= "vfnmadd213ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 AC /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMADD213PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFNMADD213PS xmm2/m1= 28/m32bcst, xmmV, {k}{z}, xmm1","vfnmadd213ps xmm2/m128/m32bcst, xmmV, {k}{= z}, 
xmm1","EVEX.DDS.128.66.0F38.W0 AC /r","V","V","AVX512F+AVX512VL","bscal= e4,scale16","rw,r,r,r","","" +"VFNMADD213PS ymm1, ymmV, ymm2/m256","VFNMADD213PS ymm2/m256, ymmV, ymm1",= "vfnmadd213ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 AC /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMADD213PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFNMADD213PS ymm2/m2= 56/m32bcst, ymmV, {k}{z}, ymm1","vfnmadd213ps ymm2/m256/m32bcst, ymmV, {k}{= z}, ymm1","EVEX.DDS.256.66.0F38.W0 AC /r","V","V","AVX512F+AVX512VL","bscal= e4,scale32","rw,r,r,r","","" +"VFNMADD213PS zmm1{er}, {k}{z}, zmmV, zmm2","VFNMADD213PS zmm2, zmmV, {k}{= z}, zmm1{er}","vfnmadd213ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.= 0F38.W0 AC /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMADD213PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFNMADD213PS zmm2/m5= 12/m32bcst, zmmV, {k}{z}, zmm1","vfnmadd213ps zmm2/m512/m32bcst, zmmV, {k}{= z}, zmm1","EVEX.DDS.512.66.0F38.W0 AC /r","V","V","AVX512F","bscale4,scale6= 4","rw,r,r,r","","" +"VFNMADD213SD xmm1{er}, {k}{z}, xmmV, xmm2","VFNMADD213SD xmm2, xmmV, {k}{= z}, xmm1{er}","vfnmadd213sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.= 0F38.W1 AD /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMADD213SD xmm1, xmmV, xmm2/m64","VFNMADD213SD xmm2/m64, xmmV, xmm1","v= fnmadd213sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 AD /r","V","V","F= MA","","rw,r,r","","" +"VFNMADD213SD xmm1, {k}{z}, xmmV, xmm2/m64","VFNMADD213SD xmm2/m64, xmmV, = {k}{z}, xmm1","vfnmadd213sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.= 0F38.W1 AD /r","V","V","AVX512F","scale8","rw,r,r,r","","" +"VFNMADD213SS xmm1{er}, {k}{z}, xmmV, xmm2","VFNMADD213SS xmm2, xmmV, {k}{= z}, xmm1{er}","vfnmadd213ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.= 0F38.W0 AD /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMADD213SS xmm1, xmmV, xmm2/m32","VFNMADD213SS xmm2/m32, xmmV, xmm1","v= fnmadd213ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 AD /r","V","V","F= MA","","rw,r,r","","" +"VFNMADD213SS xmm1, {k}{z}, xmmV, xmm2/m32","VFNMADD213SS xmm2/m32, xmmV, = {k}{z}, xmm1","vfnmadd213ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.= 0F38.W0 AD /r","V","V","AVX512F","scale4","rw,r,r,r","","" +"VFNMADD231PD xmm1, xmmV, xmm2/m128","VFNMADD231PD xmm2/m128, xmmV, xmm1",= "vfnmadd231pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 BC /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMADD231PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFNMADD231PD xmm2/m1= 28/m64bcst, xmmV, {k}{z}, xmm1","vfnmadd231pd xmm2/m128/m64bcst, xmmV, {k}{= z}, xmm1","EVEX.DDS.128.66.0F38.W1 BC /r","V","V","AVX512F+AVX512VL","bscal= e8,scale16","rw,r,r,r","","" +"VFNMADD231PD ymm1, ymmV, ymm2/m256","VFNMADD231PD ymm2/m256, ymmV, ymm1",= "vfnmadd231pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 BC /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMADD231PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFNMADD231PD ymm2/m2= 56/m64bcst, ymmV, {k}{z}, ymm1","vfnmadd231pd ymm2/m256/m64bcst, ymmV, {k}{= z}, ymm1","EVEX.DDS.256.66.0F38.W1 BC /r","V","V","AVX512F+AVX512VL","bscal= e8,scale32","rw,r,r,r","","" +"VFNMADD231PD zmm1{er}, {k}{z}, zmmV, zmm2","VFNMADD231PD zmm2, zmmV, {k}{= z}, zmm1{er}","vfnmadd231pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.= 0F38.W1 BC /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMADD231PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFNMADD231PD zmm2/m5= 12/m64bcst, zmmV, {k}{z}, zmm1","vfnmadd231pd zmm2/m512/m64bcst, zmmV, {k}{= z}, zmm1","EVEX.DDS.512.66.0F38.W1 BC 
/r","V","V","AVX512F","bscale8,scale6= 4","rw,r,r,r","","" +"VFNMADD231PS xmm1, xmmV, xmm2/m128","VFNMADD231PS xmm2/m128, xmmV, xmm1",= "vfnmadd231ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 BC /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMADD231PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFNMADD231PS xmm2/m1= 28/m32bcst, xmmV, {k}{z}, xmm1","vfnmadd231ps xmm2/m128/m32bcst, xmmV, {k}{= z}, xmm1","EVEX.DDS.128.66.0F38.W0 BC /r","V","V","AVX512F+AVX512VL","bscal= e4,scale16","rw,r,r,r","","" +"VFNMADD231PS ymm1, ymmV, ymm2/m256","VFNMADD231PS ymm2/m256, ymmV, ymm1",= "vfnmadd231ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 BC /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMADD231PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFNMADD231PS ymm2/m2= 56/m32bcst, ymmV, {k}{z}, ymm1","vfnmadd231ps ymm2/m256/m32bcst, ymmV, {k}{= z}, ymm1","EVEX.DDS.256.66.0F38.W0 BC /r","V","V","AVX512F+AVX512VL","bscal= e4,scale32","rw,r,r,r","","" +"VFNMADD231PS zmm1{er}, {k}{z}, zmmV, zmm2","VFNMADD231PS zmm2, zmmV, {k}{= z}, zmm1{er}","vfnmadd231ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.= 0F38.W0 BC /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMADD231PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFNMADD231PS zmm2/m5= 12/m32bcst, zmmV, {k}{z}, zmm1","vfnmadd231ps zmm2/m512/m32bcst, zmmV, {k}{= z}, zmm1","EVEX.DDS.512.66.0F38.W0 BC /r","V","V","AVX512F","bscale4,scale6= 4","rw,r,r,r","","" +"VFNMADD231SD xmm1{er}, {k}{z}, xmmV, xmm2","VFNMADD231SD xmm2, xmmV, {k}{= z}, xmm1{er}","vfnmadd231sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.= 0F38.W1 BD /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMADD231SD xmm1, xmmV, xmm2/m64","VFNMADD231SD xmm2/m64, xmmV, xmm1","v= fnmadd231sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 BD /r","V","V","F= MA","","rw,r,r","","" +"VFNMADD231SD xmm1, {k}{z}, xmmV, xmm2/m64","VFNMADD231SD xmm2/m64, xmmV, = {k}{z}, xmm1","vfnmadd231sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.= 0F38.W1 BD /r","V","V","AVX512F","scale8","rw,r,r,r","","" +"VFNMADD231SS xmm1{er}, {k}{z}, xmmV, xmm2","VFNMADD231SS xmm2, xmmV, {k}{= z}, xmm1{er}","vfnmadd231ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.= 0F38.W0 BD /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMADD231SS xmm1, xmmV, xmm2/m32","VFNMADD231SS xmm2/m32, xmmV, xmm1","v= fnmadd231ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 BD /r","V","V","F= MA","","rw,r,r","","" +"VFNMADD231SS xmm1, {k}{z}, xmmV, xmm2/m32","VFNMADD231SS xmm2/m32, xmmV, = {k}{z}, xmm1","vfnmadd231ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.= 0F38.W0 BD /r","V","V","AVX512F","scale4","rw,r,r,r","","" +"VFNMADDPD xmm1, xmmV, xmmIH, xmm2/m128","VFNMADDPD xmm2/m128, xmmIH, xmmV= , xmm1","vfnmaddpd xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 79= /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMADDPD xmm1, xmmV, xmm2/m128, xmmIH","VFNMADDPD xmmIH, xmm2/m128, xmmV= , xmm1","vfnmaddpd xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 79= /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMADDPD ymm1, ymmV, ymmIH, ymm2/m256","VFNMADDPD ymm2/m256, ymmIH, ymmV= , ymm1","vfnmaddpd ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 79= /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMADDPD ymm1, ymmV, ymm2/m256, ymmIH","VFNMADDPD ymmIH, ymm2/m256, ymmV= , ymm1","vfnmaddpd ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 79= /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMADDPS xmm1, xmmV, xmmIH, xmm2/m128","VFNMADDPS xmm2/m128, xmmIH, xmmV= , xmm1","vfnmaddps xmm2/m128, xmmIH, xmmV, 
xmm1","VEX.NDS.128.66.0F3A.W1 78= /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMADDPS xmm1, xmmV, xmm2/m128, xmmIH","VFNMADDPS xmmIH, xmm2/m128, xmmV= , xmm1","vfnmaddps xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 78= /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMADDPS ymm1, ymmV, ymmIH, ymm2/m256","VFNMADDPS ymm2/m256, ymmIH, ymmV= , ymm1","vfnmaddps ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 78= /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMADDPS ymm1, ymmV, ymm2/m256, ymmIH","VFNMADDPS ymmIH, ymm2/m256, ymmV= , ymm1","vfnmaddps ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 78= /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMADDSD xmm1, xmmV, xmmIH, xmm2/m64","VFNMADDSD xmm2/m64, xmmIH, xmmV, = xmm1","vfnmaddsd xmm2/m64, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 7B /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMADDSD xmm1, xmmV, xmm2/m64, xmmIH","VFNMADDSD xmmIH, xmm2/m64, xmmV, = xmm1","vfnmaddsd xmmIH, xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 7B /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMADDSS xmm1, xmmV, xmmIH, xmm2/m32","VFNMADDSS xmm2/m32, xmmIH, xmmV, = xmm1","vfnmaddss xmm2/m32, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 7A /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMADDSS xmm1, xmmV, xmm2/m32, xmmIH","VFNMADDSS xmmIH, xmm2/m32, xmmV, = xmm1","vfnmaddss xmmIH, xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 7A /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMSUB132PD xmm1, xmmV, xmm2/m128","VFNMSUB132PD xmm2/m128, xmmV, xmm1",= "vfnmsub132pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 9E /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMSUB132PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFNMSUB132PD xmm2/m1= 28/m64bcst, xmmV, {k}{z}, xmm1","vfnmsub132pd xmm2/m128/m64bcst, xmmV, {k}{= z}, xmm1","EVEX.DDS.128.66.0F38.W1 9E /r","V","V","AVX512F+AVX512VL","bscal= e8,scale16","rw,r,r,r","","" +"VFNMSUB132PD ymm1, ymmV, ymm2/m256","VFNMSUB132PD ymm2/m256, ymmV, ymm1",= "vfnmsub132pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 9E /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMSUB132PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFNMSUB132PD ymm2/m2= 56/m64bcst, ymmV, {k}{z}, ymm1","vfnmsub132pd ymm2/m256/m64bcst, ymmV, {k}{= z}, ymm1","EVEX.DDS.256.66.0F38.W1 9E /r","V","V","AVX512F+AVX512VL","bscal= e8,scale32","rw,r,r,r","","" +"VFNMSUB132PD zmm1{er}, {k}{z}, zmmV, zmm2","VFNMSUB132PD zmm2, zmmV, {k}{= z}, zmm1{er}","vfnmsub132pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.= 0F38.W1 9E /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMSUB132PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFNMSUB132PD zmm2/m5= 12/m64bcst, zmmV, {k}{z}, zmm1","vfnmsub132pd zmm2/m512/m64bcst, zmmV, {k}{= z}, zmm1","EVEX.DDS.512.66.0F38.W1 9E /r","V","V","AVX512F","bscale8,scale6= 4","rw,r,r,r","","" +"VFNMSUB132PS xmm1, xmmV, xmm2/m128","VFNMSUB132PS xmm2/m128, xmmV, xmm1",= "vfnmsub132ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 9E /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMSUB132PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFNMSUB132PS xmm2/m1= 28/m32bcst, xmmV, {k}{z}, xmm1","vfnmsub132ps xmm2/m128/m32bcst, xmmV, {k}{= z}, xmm1","EVEX.DDS.128.66.0F38.W0 9E /r","V","V","AVX512F+AVX512VL","bscal= e4,scale16","rw,r,r,r","","" +"VFNMSUB132PS ymm1, ymmV, ymm2/m256","VFNMSUB132PS ymm2/m256, ymmV, ymm1",= "vfnmsub132ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 9E /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMSUB132PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFNMSUB132PS ymm2/m2= 56/m32bcst, ymmV, {k}{z}, 
ymm1","vfnmsub132ps ymm2/m256/m32bcst, ymmV, {k}{= z}, ymm1","EVEX.DDS.256.66.0F38.W0 9E /r","V","V","AVX512F+AVX512VL","bscal= e4,scale32","rw,r,r,r","","" +"VFNMSUB132PS zmm1{er}, {k}{z}, zmmV, zmm2","VFNMSUB132PS zmm2, zmmV, {k}{= z}, zmm1{er}","vfnmsub132ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.= 0F38.W0 9E /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMSUB132PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFNMSUB132PS zmm2/m5= 12/m32bcst, zmmV, {k}{z}, zmm1","vfnmsub132ps zmm2/m512/m32bcst, zmmV, {k}{= z}, zmm1","EVEX.DDS.512.66.0F38.W0 9E /r","V","V","AVX512F","bscale4,scale6= 4","rw,r,r,r","","" +"VFNMSUB132SD xmm1{er}, {k}{z}, xmmV, xmm2","VFNMSUB132SD xmm2, xmmV, {k}{= z}, xmm1{er}","vfnmsub132sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.= 0F38.W1 9F /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMSUB132SD xmm1, xmmV, xmm2/m64","VFNMSUB132SD xmm2/m64, xmmV, xmm1","v= fnmsub132sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 9F /r","V","V","F= MA","","rw,r,r","","" +"VFNMSUB132SD xmm1, {k}{z}, xmmV, xmm2/m64","VFNMSUB132SD xmm2/m64, xmmV, = {k}{z}, xmm1","vfnmsub132sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.= 0F38.W1 9F /r","V","V","AVX512F","scale8","rw,r,r,r","","" +"VFNMSUB132SS xmm1{er}, {k}{z}, xmmV, xmm2","VFNMSUB132SS xmm2, xmmV, {k}{= z}, xmm1{er}","vfnmsub132ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.= 0F38.W0 9F /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMSUB132SS xmm1, xmmV, xmm2/m32","VFNMSUB132SS xmm2/m32, xmmV, xmm1","v= fnmsub132ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 9F /r","V","V","F= MA","","rw,r,r","","" +"VFNMSUB132SS xmm1, {k}{z}, xmmV, xmm2/m32","VFNMSUB132SS xmm2/m32, xmmV, = {k}{z}, xmm1","vfnmsub132ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.= 0F38.W0 9F /r","V","V","AVX512F","scale4","rw,r,r,r","","" +"VFNMSUB213PD xmm1, xmmV, xmm2/m128","VFNMSUB213PD xmm2/m128, xmmV, xmm1",= "vfnmsub213pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 AE /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMSUB213PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFNMSUB213PD xmm2/m1= 28/m64bcst, xmmV, {k}{z}, xmm1","vfnmsub213pd xmm2/m128/m64bcst, xmmV, {k}{= z}, xmm1","EVEX.DDS.128.66.0F38.W1 AE /r","V","V","AVX512F+AVX512VL","bscal= e8,scale16","rw,r,r,r","","" +"VFNMSUB213PD ymm1, ymmV, ymm2/m256","VFNMSUB213PD ymm2/m256, ymmV, ymm1",= "vfnmsub213pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 AE /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMSUB213PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFNMSUB213PD ymm2/m2= 56/m64bcst, ymmV, {k}{z}, ymm1","vfnmsub213pd ymm2/m256/m64bcst, ymmV, {k}{= z}, ymm1","EVEX.DDS.256.66.0F38.W1 AE /r","V","V","AVX512F+AVX512VL","bscal= e8,scale32","rw,r,r,r","","" +"VFNMSUB213PD zmm1{er}, {k}{z}, zmmV, zmm2","VFNMSUB213PD zmm2, zmmV, {k}{= z}, zmm1{er}","vfnmsub213pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.= 0F38.W1 AE /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMSUB213PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFNMSUB213PD zmm2/m5= 12/m64bcst, zmmV, {k}{z}, zmm1","vfnmsub213pd zmm2/m512/m64bcst, zmmV, {k}{= z}, zmm1","EVEX.DDS.512.66.0F38.W1 AE /r","V","V","AVX512F","bscale8,scale6= 4","rw,r,r,r","","" +"VFNMSUB213PS xmm1, xmmV, xmm2/m128","VFNMSUB213PS xmm2/m128, xmmV, xmm1",= "vfnmsub213ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 AE /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMSUB213PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFNMSUB213PS xmm2/m1= 28/m32bcst, xmmV, {k}{z}, xmm1","vfnmsub213ps xmm2/m128/m32bcst, xmmV, {k}{= z}, 
xmm1","EVEX.DDS.128.66.0F38.W0 AE /r","V","V","AVX512F+AVX512VL","bscal= e4,scale16","rw,r,r,r","","" +"VFNMSUB213PS ymm1, ymmV, ymm2/m256","VFNMSUB213PS ymm2/m256, ymmV, ymm1",= "vfnmsub213ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 AE /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMSUB213PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFNMSUB213PS ymm2/m2= 56/m32bcst, ymmV, {k}{z}, ymm1","vfnmsub213ps ymm2/m256/m32bcst, ymmV, {k}{= z}, ymm1","EVEX.DDS.256.66.0F38.W0 AE /r","V","V","AVX512F+AVX512VL","bscal= e4,scale32","rw,r,r,r","","" +"VFNMSUB213PS zmm1{er}, {k}{z}, zmmV, zmm2","VFNMSUB213PS zmm2, zmmV, {k}{= z}, zmm1{er}","vfnmsub213ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.= 0F38.W0 AE /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMSUB213PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFNMSUB213PS zmm2/m5= 12/m32bcst, zmmV, {k}{z}, zmm1","vfnmsub213ps zmm2/m512/m32bcst, zmmV, {k}{= z}, zmm1","EVEX.DDS.512.66.0F38.W0 AE /r","V","V","AVX512F","bscale4,scale6= 4","rw,r,r,r","","" +"VFNMSUB213SD xmm1{er}, {k}{z}, xmmV, xmm2","VFNMSUB213SD xmm2, xmmV, {k}{= z}, xmm1{er}","vfnmsub213sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.= 0F38.W1 AF /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMSUB213SD xmm1, xmmV, xmm2/m64","VFNMSUB213SD xmm2/m64, xmmV, xmm1","v= fnmsub213sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 AF /r","V","V","F= MA","","rw,r,r","","" +"VFNMSUB213SD xmm1, {k}{z}, xmmV, xmm2/m64","VFNMSUB213SD xmm2/m64, xmmV, = {k}{z}, xmm1","vfnmsub213sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.= 0F38.W1 AF /r","V","V","AVX512F","scale8","rw,r,r,r","","" +"VFNMSUB213SS xmm1{er}, {k}{z}, xmmV, xmm2","VFNMSUB213SS xmm2, xmmV, {k}{= z}, xmm1{er}","vfnmsub213ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.= 0F38.W0 AF /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMSUB213SS xmm1, xmmV, xmm2/m32","VFNMSUB213SS xmm2/m32, xmmV, xmm1","v= fnmsub213ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 AF /r","V","V","F= MA","","rw,r,r","","" +"VFNMSUB213SS xmm1, {k}{z}, xmmV, xmm2/m32","VFNMSUB213SS xmm2/m32, xmmV, = {k}{z}, xmm1","vfnmsub213ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.= 0F38.W0 AF /r","V","V","AVX512F","scale4","rw,r,r,r","","" +"VFNMSUB231PD xmm1, xmmV, xmm2/m128","VFNMSUB231PD xmm2/m128, xmmV, xmm1",= "vfnmsub231pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 BE /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMSUB231PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFNMSUB231PD xmm2/m1= 28/m64bcst, xmmV, {k}{z}, xmm1","vfnmsub231pd xmm2/m128/m64bcst, xmmV, {k}{= z}, xmm1","EVEX.DDS.128.66.0F38.W1 BE /r","V","V","AVX512F+AVX512VL","bscal= e8,scale16","rw,r,r,r","","" +"VFNMSUB231PD ymm1, ymmV, ymm2/m256","VFNMSUB231PD ymm2/m256, ymmV, ymm1",= "vfnmsub231pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 BE /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMSUB231PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFNMSUB231PD ymm2/m2= 56/m64bcst, ymmV, {k}{z}, ymm1","vfnmsub231pd ymm2/m256/m64bcst, ymmV, {k}{= z}, ymm1","EVEX.DDS.256.66.0F38.W1 BE /r","V","V","AVX512F+AVX512VL","bscal= e8,scale32","rw,r,r,r","","" +"VFNMSUB231PD zmm1{er}, {k}{z}, zmmV, zmm2","VFNMSUB231PD zmm2, zmmV, {k}{= z}, zmm1{er}","vfnmsub231pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.= 0F38.W1 BE /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMSUB231PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFNMSUB231PD zmm2/m5= 12/m64bcst, zmmV, {k}{z}, zmm1","vfnmsub231pd zmm2/m512/m64bcst, zmmV, {k}{= z}, zmm1","EVEX.DDS.512.66.0F38.W1 BE 
/r","V","V","AVX512F","bscale8,scale6= 4","rw,r,r,r","","" +"VFNMSUB231PS xmm1, xmmV, xmm2/m128","VFNMSUB231PS xmm2/m128, xmmV, xmm1",= "vfnmsub231ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 BE /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMSUB231PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFNMSUB231PS xmm2/m1= 28/m32bcst, xmmV, {k}{z}, xmm1","vfnmsub231ps xmm2/m128/m32bcst, xmmV, {k}{= z}, xmm1","EVEX.DDS.128.66.0F38.W0 BE /r","V","V","AVX512F+AVX512VL","bscal= e4,scale16","rw,r,r,r","","" +"VFNMSUB231PS ymm1, ymmV, ymm2/m256","VFNMSUB231PS ymm2/m256, ymmV, ymm1",= "vfnmsub231ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 BE /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMSUB231PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFNMSUB231PS ymm2/m2= 56/m32bcst, ymmV, {k}{z}, ymm1","vfnmsub231ps ymm2/m256/m32bcst, ymmV, {k}{= z}, ymm1","EVEX.DDS.256.66.0F38.W0 BE /r","V","V","AVX512F+AVX512VL","bscal= e4,scale32","rw,r,r,r","","" +"VFNMSUB231PS zmm1{er}, {k}{z}, zmmV, zmm2","VFNMSUB231PS zmm2, zmmV, {k}{= z}, zmm1{er}","vfnmsub231ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.= 0F38.W0 BE /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMSUB231PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFNMSUB231PS zmm2/m5= 12/m32bcst, zmmV, {k}{z}, zmm1","vfnmsub231ps zmm2/m512/m32bcst, zmmV, {k}{= z}, zmm1","EVEX.DDS.512.66.0F38.W0 BE /r","V","V","AVX512F","bscale4,scale6= 4","rw,r,r,r","","" +"VFNMSUB231SD xmm1{er}, {k}{z}, xmmV, xmm2","VFNMSUB231SD xmm2, xmmV, {k}{= z}, xmm1{er}","vfnmsub231sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.= 0F38.W1 BF /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMSUB231SD xmm1, xmmV, xmm2/m64","VFNMSUB231SD xmm2/m64, xmmV, xmm1","v= fnmsub231sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 BF /r","V","V","F= MA","","rw,r,r","","" +"VFNMSUB231SD xmm1, {k}{z}, xmmV, xmm2/m64","VFNMSUB231SD xmm2/m64, xmmV, = {k}{z}, xmm1","vfnmsub231sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.= 0F38.W1 BF /r","V","V","AVX512F","scale8","rw,r,r,r","","" +"VFNMSUB231SS xmm1{er}, {k}{z}, xmmV, xmm2","VFNMSUB231SS xmm2, xmmV, {k}{= z}, xmm1{er}","vfnmsub231ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.= 0F38.W0 BF /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMSUB231SS xmm1, xmmV, xmm2/m32","VFNMSUB231SS xmm2/m32, xmmV, xmm1","v= fnmsub231ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 BF /r","V","V","F= MA","","rw,r,r","","" +"VFNMSUB231SS xmm1, {k}{z}, xmmV, xmm2/m32","VFNMSUB231SS xmm2/m32, xmmV, = {k}{z}, xmm1","vfnmsub231ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.= 0F38.W0 BF /r","V","V","AVX512F","scale4","rw,r,r,r","","" +"VFNMSUBPD xmm1, xmmV, xmmIH, xmm2/m128","VFNMSUBPD xmm2/m128, xmmIH, xmmV= , xmm1","vfnmsubpd xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 7D= /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMSUBPD xmm1, xmmV, xmm2/m128, xmmIH","VFNMSUBPD xmmIH, xmm2/m128, xmmV= , xmm1","vfnmsubpd xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 7D= /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMSUBPD ymm1, ymmV, ymmIH, ymm2/m256","VFNMSUBPD ymm2/m256, ymmIH, ymmV= , ymm1","vfnmsubpd ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 7D= /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMSUBPD ymm1, ymmV, ymm2/m256, ymmIH","VFNMSUBPD ymmIH, ymm2/m256, ymmV= , ymm1","vfnmsubpd ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 7D= /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMSUBPS xmm1, xmmV, xmmIH, xmm2/m128","VFNMSUBPS xmm2/m128, xmmIH, xmmV= , xmm1","vfnmsubps xmm2/m128, xmmIH, xmmV, 
xmm1","VEX.NDS.128.66.0F3A.W1 7C= /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMSUBPS xmm1, xmmV, xmm2/m128, xmmIH","VFNMSUBPS xmmIH, xmm2/m128, xmmV= , xmm1","vfnmsubps xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 7C= /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMSUBPS ymm1, ymmV, ymmIH, ymm2/m256","VFNMSUBPS ymm2/m256, ymmIH, ymmV= , ymm1","vfnmsubps ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 7C= /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMSUBPS ymm1, ymmV, ymm2/m256, ymmIH","VFNMSUBPS ymmIH, ymm2/m256, ymmV= , ymm1","vfnmsubps ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 7C= /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMSUBSD xmm1, xmmV, xmmIH, xmm2/m64","VFNMSUBSD xmm2/m64, xmmIH, xmmV, = xmm1","vfnmsubsd xmm2/m64, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 7F /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMSUBSD xmm1, xmmV, xmm2/m64, xmmIH","VFNMSUBSD xmmIH, xmm2/m64, xmmV, = xmm1","vfnmsubsd xmmIH, xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 7F /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMSUBSS xmm1, xmmV, xmmIH, xmm2/m32","VFNMSUBSS xmm2/m32, xmmIH, xmmV, = xmm1","vfnmsubss xmm2/m32, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 7E /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMSUBSS xmm1, xmmV, xmm2/m32, xmmIH","VFNMSUBSS xmmIH, xmm2/m32, xmmV, = xmm1","vfnmsubss xmmIH, xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 7E /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFPCLASSPD k1, {k}, xmm2/m128/m64bcst, imm8u","VFPCLASSPDX imm8u, xmm2/m1= 28/m64bcst, {k}, k1","vfpclasspdx imm8u, xmm2/m128/m64bcst, {k}, k1","EVEX.= 128.66.0F3A.W1 66 /r ib","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r= ,r,r","Y","128" +"VFPCLASSPD k1, {k}, ymm2/m256/m64bcst, imm8u","VFPCLASSPDY imm8u, ymm2/m2= 56/m64bcst, {k}, k1","vfpclasspdy imm8u, ymm2/m256/m64bcst, {k}, k1","EVEX.= 256.66.0F3A.W1 66 /r ib","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r= ,r,r","Y","256" +"VFPCLASSPD k1, {k}, zmm2/m512/m64bcst, imm8u","VFPCLASSPDZ imm8u, zmm2/m5= 12/m64bcst, {k}, k1","vfpclasspdz imm8u, zmm2/m512/m64bcst, {k}, k1","EVEX.= 512.66.0F3A.W1 66 /r ib","V","V","AVX512DQ","bscale8,scale64","w,r,r,r","Y"= ,"512" +"VFPCLASSPS k1, {k}, xmm2/m128/m32bcst, imm8u","VFPCLASSPSX imm8u, xmm2/m1= 28/m32bcst, {k}, k1","vfpclasspsx imm8u, xmm2/m128/m32bcst, {k}, k1","EVEX.= 128.66.0F3A.W0 66 /r ib","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r= ,r,r","Y","128" +"VFPCLASSPS k1, {k}, ymm2/m256/m32bcst, imm8u","VFPCLASSPSY imm8u, ymm2/m2= 56/m32bcst, {k}, k1","vfpclasspsy imm8u, ymm2/m256/m32bcst, {k}, k1","EVEX.= 256.66.0F3A.W0 66 /r ib","V","V","AVX512DQ+AVX512VL","bscale4,scale32","w,r= ,r,r","Y","256" +"VFPCLASSPS k1, {k}, zmm2/m512/m32bcst, imm8u","VFPCLASSPSZ imm8u, zmm2/m5= 12/m32bcst, {k}, k1","vfpclasspsz imm8u, zmm2/m512/m32bcst, {k}, k1","EVEX.= 512.66.0F3A.W0 66 /r ib","V","V","AVX512DQ","bscale4,scale64","w,r,r,r","Y"= ,"512" +"VFPCLASSSD k1, {k}, xmm2/m64, imm8u","VFPCLASSSD imm8u, xmm2/m64, {k}, k1= ","vfpclasssd imm8u, xmm2/m64, {k}, k1","EVEX.LIG.66.0F3A.W1 67 /r ib","V",= "V","AVX512DQ","scale8","w,r,r,r","","" +"VFPCLASSSS k1, {k}, xmm2/m32, imm8u","VFPCLASSSS imm8u, xmm2/m32, {k}, k1= ","vfpclassss imm8u, xmm2/m32, {k}, k1","EVEX.LIG.66.0F3A.W0 67 /r ib","V",= "V","AVX512DQ","scale4","w,r,r,r","","" +"VFRCZPD xmm1, xmm2/m128","VFRCZPD xmm2/m128, xmm1","vfrczpd xmm2/m128, xm= m1","XOP.128.09.W0 81 /r","V","V","XOP","amd","w,r","","" +"VFRCZPD ymm1, ymm2/m256","VFRCZPD ymm2/m256, ymm1","vfrczpd ymm2/m256, ym= m1","XOP.256.09.W0 81 
/r","V","V","XOP","amd","w,r","","" +"VFRCZPS xmm1, xmm2/m128","VFRCZPS xmm2/m128, xmm1","vfrczps xmm2/m128, xm= m1","XOP.128.09.W0 80 /r","V","V","XOP","amd","w,r","","" +"VFRCZPS ymm1, ymm2/m256","VFRCZPS ymm2/m256, ymm1","vfrczps ymm2/m256, ym= m1","XOP.256.09.W0 80 /r","V","V","XOP","amd","w,r","","" +"VFRCZSD xmm1, xmm2/m64","VFRCZSD xmm2/m64, xmm1","vfrczsd xmm2/m64, xmm1"= ,"XOP.128.09.W0 83 /r","V","V","XOP","amd","w,r","","" +"VFRCZSS xmm1, xmm2/m32","VFRCZSS xmm2/m32, xmm1","vfrczss xmm2/m32, xmm1"= ,"XOP.128.09.W0 82 /r","V","V","XOP","amd","w,r","","" +"VGATHERDPD xmm1, {k1-k7}, vm32x","VGATHERDPD vm32x, {k1-k7}, xmm1","vgath= erdpd vm32x, {k1-k7}, xmm1","EVEX.128.66.0F38.W1 92 /vsib","V","V","AVX512F= +AVX512VL","modrm_memonly,scale8","w,rw,r","","" +"VGATHERDPD ymm1, {k1-k7}, vm32x","VGATHERDPD vm32x, {k1-k7}, ymm1","vgath= erdpd vm32x, {k1-k7}, ymm1","EVEX.256.66.0F38.W1 92 /vsib","V","V","AVX512F= +AVX512VL","modrm_memonly,scale8","w,rw,r","","" +"VGATHERDPD zmm1, {k1-k7}, vm32y","VGATHERDPD vm32y, {k1-k7}, zmm1","vgath= erdpd vm32y, {k1-k7}, zmm1","EVEX.512.66.0F38.W1 92 /vsib","V","V","AVX512F= ","modrm_memonly,scale8","w,rw,r","","" +"VGATHERDPD xmm1, vm32x, xmmV","VGATHERDPD xmmV, vm32x, xmm1","vgatherdpd = xmmV, vm32x, xmm1","VEX.DDS.128.66.0F38.W1 92 /r","V","V","AVX2","modrm_mem= only","rw,r,rw","","" +"VGATHERDPD ymm1, vm32x, ymmV","VGATHERDPD ymmV, vm32x, ymm1","vgatherdpd = ymmV, vm32x, ymm1","VEX.DDS.256.66.0F38.W1 92 /r","V","V","AVX2","modrm_mem= only","rw,r,rw","","" +"VGATHERDPS xmm1, {k1-k7}, vm32x","VGATHERDPS vm32x, {k1-k7}, xmm1","vgath= erdps vm32x, {k1-k7}, xmm1","EVEX.128.66.0F38.W0 92 /vsib","V","V","AVX512F= +AVX512VL","modrm_memonly,scale4","w,rw,r","","" +"VGATHERDPS ymm1, {k1-k7}, vm32y","VGATHERDPS vm32y, {k1-k7}, ymm1","vgath= erdps vm32y, {k1-k7}, ymm1","EVEX.256.66.0F38.W0 92 /vsib","V","V","AVX512F= +AVX512VL","modrm_memonly,scale4","w,rw,r","","" +"VGATHERDPS zmm1, {k1-k7}, vm32z","VGATHERDPS vm32z, {k1-k7}, zmm1","vgath= erdps vm32z, {k1-k7}, zmm1","EVEX.512.66.0F38.W0 92 /vsib","V","V","AVX512F= ","modrm_memonly,scale4","w,rw,r","","" +"VGATHERDPS xmm1, vm32x, xmmV","VGATHERDPS xmmV, vm32x, xmm1","vgatherdps = xmmV, vm32x, xmm1","VEX.DDS.128.66.0F38.W0 92 /r","V","V","AVX2","modrm_mem= only","rw,r,rw","","" +"VGATHERDPS ymm1, vm32y, ymmV","VGATHERDPS ymmV, vm32y, ymm1","vgatherdps = ymmV, vm32y, ymm1","VEX.DDS.256.66.0F38.W0 92 /r","V","V","AVX2","modrm_mem= only","rw,r,rw","","" +"VGATHERPF0DPD vm32y, {k1-k7}","VGATHERPF0DPD {k1-k7}, vm32y","vgatherpf0d= pd {k1-k7}, vm32y","EVEX.512.66.0F38.W1 C6 /1","V","V","AVX512PF","modrm_me= monly,scale8","r,rw","","" +"VGATHERPF0DPS vm32z, {k1-k7}","VGATHERPF0DPS {k1-k7}, vm32z","vgatherpf0d= ps {k1-k7}, vm32z","EVEX.512.66.0F38.W0 C6 /1","V","V","AVX512PF","modrm_me= monly,scale4","r,rw","","" +"VGATHERPF0QPD vm64z, {k1-k7}","VGATHERPF0QPD {k1-k7}, vm64z","vgatherpf0q= pd {k1-k7}, vm64z","EVEX.512.66.0F38.W1 C7 /1","V","V","AVX512PF","modrm_me= monly,scale8","r,rw","","" +"VGATHERPF0QPS vm64z, {k1-k7}","VGATHERPF0QPS {k1-k7}, vm64z","vgatherpf0q= ps {k1-k7}, vm64z","EVEX.512.66.0F38.W0 C7 /1","V","V","AVX512PF","modrm_me= monly,scale4","r,rw","","" +"VGATHERPF1DPD vm32y, {k1-k7}","VGATHERPF1DPD {k1-k7}, vm32y","vgatherpf1d= pd {k1-k7}, vm32y","EVEX.512.66.0F38.W1 C6 /2","V","V","AVX512PF","modrm_me= monly,scale8","r,rw","","" +"VGATHERPF1DPS vm32z, {k1-k7}","VGATHERPF1DPS {k1-k7}, vm32z","vgatherpf1d= ps {k1-k7}, vm32z","EVEX.512.66.0F38.W0 C6 /2","V","V","AVX512PF","modrm_me= 
monly,scale4","r,rw","","" +"VGATHERPF1QPD vm64z, {k1-k7}","VGATHERPF1QPD {k1-k7}, vm64z","vgatherpf1q= pd {k1-k7}, vm64z","EVEX.512.66.0F38.W1 C7 /2","V","V","AVX512PF","modrm_me= monly,scale8","r,rw","","" +"VGATHERPF1QPS vm64z, {k1-k7}","VGATHERPF1QPS {k1-k7}, vm64z","vgatherpf1q= ps {k1-k7}, vm64z","EVEX.512.66.0F38.W0 C7 /2","V","V","AVX512PF","modrm_me= monly,scale4","r,rw","","" +"VGATHERQPD xmm1, {k1-k7}, vm64x","VGATHERQPD vm64x, {k1-k7}, xmm1","vgath= erqpd vm64x, {k1-k7}, xmm1","EVEX.128.66.0F38.W1 93 /vsib","V","V","AVX512F= +AVX512VL","modrm_memonly,scale8","w,rw,r","","" +"VGATHERQPD ymm1, {k1-k7}, vm64y","VGATHERQPD vm64y, {k1-k7}, ymm1","vgath= erqpd vm64y, {k1-k7}, ymm1","EVEX.256.66.0F38.W1 93 /vsib","V","V","AVX512F= +AVX512VL","modrm_memonly,scale8","w,rw,r","","" +"VGATHERQPD zmm1, {k1-k7}, vm64z","VGATHERQPD vm64z, {k1-k7}, zmm1","vgath= erqpd vm64z, {k1-k7}, zmm1","EVEX.512.66.0F38.W1 93 /vsib","V","V","AVX512F= ","modrm_memonly,scale8","w,rw,r","","" +"VGATHERQPD xmm1, vm64x, xmmV","VGATHERQPD xmmV, vm64x, xmm1","vgatherqpd = xmmV, vm64x, xmm1","VEX.DDS.128.66.0F38.W1 93 /r","V","V","AVX2","modrm_mem= only","rw,r,rw","","" +"VGATHERQPD ymm1, vm64y, ymmV","VGATHERQPD ymmV, vm64y, ymm1","vgatherqpd = ymmV, vm64y, ymm1","VEX.DDS.256.66.0F38.W1 93 /r","V","V","AVX2","modrm_mem= only","rw,r,rw","","" +"VGATHERQPS xmm1, {k1-k7}, vm64x","VGATHERQPS vm64x, {k1-k7}, xmm1","vgath= erqps vm64x, {k1-k7}, xmm1","EVEX.128.66.0F38.W0 93 /vsib","V","V","AVX512F= +AVX512VL","modrm_memonly,scale4","w,rw,r","","" +"VGATHERQPS xmm1, {k1-k7}, vm64y","VGATHERQPS vm64y, {k1-k7}, xmm1","vgath= erqps vm64y, {k1-k7}, xmm1","EVEX.256.66.0F38.W0 93 /vsib","V","V","AVX512F= +AVX512VL","modrm_memonly,scale4","w,rw,r","","" +"VGATHERQPS ymm1, {k1-k7}, vm64z","VGATHERQPS vm64z, {k1-k7}, ymm1","vgath= erqps vm64z, {k1-k7}, ymm1","EVEX.512.66.0F38.W0 93 /vsib","V","V","AVX512F= ","modrm_memonly,scale4","w,rw,r","","" +"VGATHERQPS xmm1, vm64x, xmmV","VGATHERQPS xmmV, vm64x, xmm1","vgatherqps = xmmV, vm64x, xmm1","VEX.DDS.128.66.0F38.W0 93 /r","V","V","AVX2","modrm_mem= only","rw,r,rw","","" +"VGATHERQPS xmm1, vm64y, xmmV","VGATHERQPS xmmV, vm64y, xmm1","vgatherqps = xmmV, vm64y, xmm1","VEX.DDS.256.66.0F38.W0 93 /r","V","V","AVX2","modrm_mem= only","rw,r,rw","","" +"VGETEXPPD xmm1, {k}{z}, xmm2/m128/m64bcst","VGETEXPPD xmm2/m128/m64bcst, = {k}{z}, xmm1","vgetexppd xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F38= .W1 42 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","","" +"VGETEXPPD ymm1, {k}{z}, ymm2/m256/m64bcst","VGETEXPPD ymm2/m256/m64bcst, = {k}{z}, ymm1","vgetexppd ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F38= .W1 42 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","","" +"VGETEXPPD zmm1{sae}, {k}{z}, zmm2","VGETEXPPD zmm2, {k}{z}, zmm1{sae}","v= getexppd zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W1 42 /r","V","V","AVX5= 12F","modrm_regonly","w,r,r","","" +"VGETEXPPD zmm1, {k}{z}, zmm2/m512/m64bcst","VGETEXPPD zmm2/m512/m64bcst, = {k}{z}, zmm1","vgetexppd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38= .W1 42 /r","V","V","AVX512F","bscale8,scale64","w,r,r","","" +"VGETEXPPS xmm1, {k}{z}, xmm2/m128/m32bcst","VGETEXPPS xmm2/m128/m32bcst, = {k}{z}, xmm1","vgetexpps xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F38= .W0 42 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","","" +"VGETEXPPS ymm1, {k}{z}, ymm2/m256/m32bcst","VGETEXPPS ymm2/m256/m32bcst, = {k}{z}, ymm1","vgetexpps ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F38= .W0 42 
/r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","","" +"VGETEXPPS zmm1{sae}, {k}{z}, zmm2","VGETEXPPS zmm2, {k}{z}, zmm1{sae}","v= getexpps zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W0 42 /r","V","V","AVX5= 12F","modrm_regonly","w,r,r","","" +"VGETEXPPS zmm1, {k}{z}, zmm2/m512/m32bcst","VGETEXPPS zmm2/m512/m32bcst, = {k}{z}, zmm1","vgetexpps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38= .W0 42 /r","V","V","AVX512F","bscale4,scale64","w,r,r","","" +"VGETEXPSD xmm1{sae}, {k}{z}, xmmV, xmm2","VGETEXPSD xmm2, xmmV, {k}{z}, x= mm1{sae}","vgetexpsd xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F38.W= 1 43 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VGETEXPSD xmm1, {k}{z}, xmmV, xmm2/m64","VGETEXPSD xmm2/m64, xmmV, {k}{z}= , xmm1","vgetexpsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W1 4= 3 /r","V","V","AVX512F","scale8","w,r,r,r","","" +"VGETEXPSS xmm1{sae}, {k}{z}, xmmV, xmm2","VGETEXPSS xmm2, xmmV, {k}{z}, x= mm1{sae}","vgetexpss xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F38.W= 0 43 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VGETEXPSS xmm1, {k}{z}, xmmV, xmm2/m32","VGETEXPSS xmm2/m32, xmmV, {k}{z}= , xmm1","vgetexpss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W0 4= 3 /r","V","V","AVX512F","scale4","w,r,r,r","","" +"VGETMANTPD xmm1, {k}{z}, xmm2/m128/m64bcst, imm8u:4","VGETMANTPD imm8u:4,= xmm2/m128/m64bcst, {k}{z}, xmm1","vgetmantpd imm8u:4, xmm2/m128/m64bcst, {= k}{z}, xmm1","EVEX.128.66.0F3A.W1 26 /r ib","V","V","AVX512F+AVX512VL","bsc= ale8,scale16","w,r,r,r","","" +"VGETMANTPD ymm1, {k}{z}, ymm2/m256/m64bcst, imm8u:4","VGETMANTPD imm8u:4,= ymm2/m256/m64bcst, {k}{z}, ymm1","vgetmantpd imm8u:4, ymm2/m256/m64bcst, {= k}{z}, ymm1","EVEX.256.66.0F3A.W1 26 /r ib","V","V","AVX512F+AVX512VL","bsc= ale8,scale32","w,r,r,r","","" +"VGETMANTPD zmm1{sae}, {k}{z}, zmm2, imm8u:4","VGETMANTPD imm8u:4, zmm2, {= k}{z}, zmm1{sae}","vgetmantpd imm8u:4, zmm2, {k}{z}, zmm1{sae}","EVEX.512.6= 6.0F3A.W1 26 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VGETMANTPD zmm1, {k}{z}, zmm2/m512/m64bcst, imm8u:4","VGETMANTPD imm8u:4,= zmm2/m512/m64bcst, {k}{z}, zmm1","vgetmantpd imm8u:4, zmm2/m512/m64bcst, {= k}{z}, zmm1","EVEX.512.66.0F3A.W1 26 /r ib","V","V","AVX512F","bscale8,scal= e64","w,r,r,r","","" +"VGETMANTPS xmm1, {k}{z}, xmm2/m128/m32bcst, imm8u:4","VGETMANTPS imm8u:4,= xmm2/m128/m32bcst, {k}{z}, xmm1","vgetmantps imm8u:4, xmm2/m128/m32bcst, {= k}{z}, xmm1","EVEX.128.66.0F3A.W0 26 /r ib","V","V","AVX512F+AVX512VL","bsc= ale4,scale16","w,r,r,r","","" +"VGETMANTPS ymm1, {k}{z}, ymm2/m256/m32bcst, imm8u:4","VGETMANTPS imm8u:4,= ymm2/m256/m32bcst, {k}{z}, ymm1","vgetmantps imm8u:4, ymm2/m256/m32bcst, {= k}{z}, ymm1","EVEX.256.66.0F3A.W0 26 /r ib","V","V","AVX512F+AVX512VL","bsc= ale4,scale32","w,r,r,r","","" +"VGETMANTPS zmm1{sae}, {k}{z}, zmm2, imm8u:4","VGETMANTPS imm8u:4, zmm2, {= k}{z}, zmm1{sae}","vgetmantps imm8u:4, zmm2, {k}{z}, zmm1{sae}","EVEX.512.6= 6.0F3A.W0 26 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VGETMANTPS zmm1, {k}{z}, zmm2/m512/m32bcst, imm8u:4","VGETMANTPS imm8u:4,= zmm2/m512/m32bcst, {k}{z}, zmm1","vgetmantps imm8u:4, zmm2/m512/m32bcst, {= k}{z}, zmm1","EVEX.512.66.0F3A.W0 26 /r ib","V","V","AVX512F","bscale4,scal= e64","w,r,r,r","","" +"VGETMANTSD xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u:4","VGETMANTSD imm8u:4, x= mm2, xmmV, {k}{z}, xmm1{sae}","vgetmantsd imm8u:4, xmm2, xmmV, {k}{z}, xmm1= {sae}","EVEX.NDS.128.66.0F3A.W1 27 /r ib","V","V","AVX512F","modrm_regonly"= ,"w,r,r,r,r","","" +"VGETMANTSD 
xmm1, {k}{z}, xmmV, xmm2/m64, imm8u:4","VGETMANTSD imm8u:4, xm= m2/m64, xmmV, {k}{z}, xmm1","vgetmantsd imm8u:4, xmm2/m64, xmmV, {k}{z}, xm= m1","EVEX.NDS.LIG.66.0F3A.W1 27 /r ib","V","V","AVX512F","scale8","w,r,r,r,= r","","" +"VGETMANTSS xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u:4","VGETMANTSS imm8u:4, x= mm2, xmmV, {k}{z}, xmm1{sae}","vgetmantss imm8u:4, xmm2, xmmV, {k}{z}, xmm1= {sae}","EVEX.NDS.128.66.0F3A.W0 27 /r ib","V","V","AVX512F","modrm_regonly"= ,"w,r,r,r,r","","" +"VGETMANTSS xmm1, {k}{z}, xmmV, xmm2/m32, imm8u:4","VGETMANTSS imm8u:4, xm= m2/m32, xmmV, {k}{z}, xmm1","vgetmantss imm8u:4, xmm2/m32, xmmV, {k}{z}, xm= m1","EVEX.NDS.LIG.66.0F3A.W0 27 /r ib","V","V","AVX512F","scale4","w,r,r,r,= r","","" +"VGF2P8AFFINEINVQB xmm1, xmmV, xmm2/m128, imm8u","VGF2P8AFFINEINVQB imm8u,= xmm2/m128, xmmV, xmm1","vgf2p8affineinvqb imm8u, xmm2/m128, xmmV, xmm1","V= EX.NDS.128.66.0F3A.W1 CF /r ib","V","V","GFNI+AVX","","w,r,r,r","","" +"VGF2P8AFFINEINVQB xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VGF2P8AF= FINEINVQB imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vgf2p8affineinvqb = imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 CF /= r ib","V","V","GFNI+AVX512VL","bscale8,scale16","w,r,r,r,r","","" +"VGF2P8AFFINEINVQB ymm1, ymmV, ymm2/m256, imm8u","VGF2P8AFFINEINVQB imm8u,= ymm2/m256, ymmV, ymm1","vgf2p8affineinvqb imm8u, ymm2/m256, ymmV, ymm1","V= EX.NDS.256.66.0F3A.W1 CF /r ib","V","V","GFNI+AVX","","w,r,r,r","","" +"VGF2P8AFFINEINVQB ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VGF2P8AF= FINEINVQB imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vgf2p8affineinvqb = imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 CF /= r ib","V","V","GFNI+AVX512VL","bscale8,scale32","w,r,r,r,r","","" +"VGF2P8AFFINEINVQB zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VGF2P8AF= FINEINVQB imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vgf2p8affineinvqb = imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 CF /= r ib","V","V","GFNI+AVX512F","bscale8,scale64","w,r,r,r,r","","" +"VGF2P8AFFINEQB xmm1, xmmV, xmm2/m128, imm8u","VGF2P8AFFINEQB imm8u, xmm2/= m128, xmmV, xmm1","vgf2p8affineqb imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.12= 8.66.0F3A.W1 CE /r ib","V","V","GFNI+AVX","","w,r,r,r","","" +"VGF2P8AFFINEQB xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VGF2P8AFFIN= EQB imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vgf2p8affineqb imm8u, xm= m2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 CE /r ib","V"= ,"V","GFNI+AVX512VL","bscale8,scale16","w,r,r,r,r","","" +"VGF2P8AFFINEQB ymm1, ymmV, ymm2/m256, imm8u","VGF2P8AFFINEQB imm8u, ymm2/= m256, ymmV, ymm1","vgf2p8affineqb imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.25= 6.66.0F3A.W1 CE /r ib","V","V","GFNI+AVX","","w,r,r,r","","" +"VGF2P8AFFINEQB ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VGF2P8AFFIN= EQB imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vgf2p8affineqb imm8u, ym= m2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 CE /r ib","V"= ,"V","GFNI+AVX512VL","bscale8,scale32","w,r,r,r,r","","" +"VGF2P8AFFINEQB zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VGF2P8AFFIN= EQB imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vgf2p8affineqb imm8u, zm= m2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 CE /r ib","V"= ,"V","GFNI+AVX512F","bscale8,scale64","w,r,r,r,r","","" +"VGF2P8MULB xmm1, xmmV, xmm2/m128","VGF2P8MULB xmm2/m128, xmmV, xmm1","vgf= 2p8mulb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 CF /r","V","V","GFNI= +AVX","","w,r,r","","" +"VGF2P8MULB xmm1, {k}{z}, 
xmmV, xmm2/m128","VGF2P8MULB xmm2/m128, xmmV, {k= }{z}, xmm1","vgf2p8mulb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3= 8.W0 CF /r","V","V","GFNI+AVX512VL","scale16","w,r,r,r","","" +"VGF2P8MULB ymm1, ymmV, ymm2/m256","VGF2P8MULB ymm2/m256, ymmV, ymm1","vgf= 2p8mulb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 CF /r","V","V","GFNI= +AVX","","w,r,r","","" +"VGF2P8MULB ymm1, {k}{z}, ymmV, ymm2/m256","VGF2P8MULB ymm2/m256, ymmV, {k= }{z}, ymm1","vgf2p8mulb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3= 8.W0 CF /r","V","V","GFNI+AVX512VL","scale32","w,r,r,r","","" +"VGF2P8MULB zmm1, {k}{z}, zmmV, zmm2/m512","VGF2P8MULB zmm2/m512, zmmV, {k= }{z}, zmm1","vgf2p8mulb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3= 8.W0 CF /r","V","V","GFNI+AVX512F","scale64","w,r,r,r","","" +"VHADDPD xmm1, xmmV, xmm2/m128","VHADDPD xmm2/m128, xmmV, xmm1","vhaddpd x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 7C /r","V","V","AVX","","w,r,r= ","","" +"VHADDPD ymm1, ymmV, ymm2/m256","VHADDPD ymm2/m256, ymmV, ymm1","vhaddpd y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 7C /r","V","V","AVX","","w,r,r= ","","" +"VHADDPS xmm1, xmmV, xmm2/m128","VHADDPS xmm2/m128, xmmV, xmm1","vhaddps x= mm2/m128, xmmV, xmm1","VEX.NDS.128.F2.0F.WIG 7C /r","V","V","AVX","","w,r,r= ","","" +"VHADDPS ymm1, ymmV, ymm2/m256","VHADDPS ymm2/m256, ymmV, ymm1","vhaddps y= mm2/m256, ymmV, ymm1","VEX.NDS.256.F2.0F.WIG 7C /r","V","V","AVX","","w,r,r= ","","" +"VHSUBPD xmm1, xmmV, xmm2/m128","VHSUBPD xmm2/m128, xmmV, xmm1","vhsubpd x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 7D /r","V","V","AVX","","w,r,r= ","","" +"VHSUBPD ymm1, ymmV, ymm2/m256","VHSUBPD ymm2/m256, ymmV, ymm1","vhsubpd y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 7D /r","V","V","AVX","","w,r,r= ","","" +"VHSUBPS xmm1, xmmV, xmm2/m128","VHSUBPS xmm2/m128, xmmV, xmm1","vhsubps x= mm2/m128, xmmV, xmm1","VEX.NDS.128.F2.0F.WIG 7D /r","V","V","AVX","","w,r,r= ","","" +"VHSUBPS ymm1, ymmV, ymm2/m256","VHSUBPS ymm2/m256, ymmV, ymm1","vhsubps y= mm2/m256, ymmV, ymm1","VEX.NDS.256.F2.0F.WIG 7D /r","V","V","AVX","","w,r,r= ","","" +"VINSERTF128 ymm1, ymmV, xmm2/m128, imm8u:1","VINSERTF128 imm8u:1, xmm2/m1= 28, ymmV, ymm1","vinsertf128 imm8u:1, xmm2/m128, ymmV, ymm1","VEX.NDS.256.6= 6.0F3A.W0 18 /r ib","V","V","AVX","","w,r,r,r","","" +"VINSERTF32X4 ymm1, {k}{z}, ymmV, xmm2/m128, imm8u:1","VINSERTF32X4 imm8u:= 1, xmm2/m128, ymmV, {k}{z}, ymm1","vinsertf32x4 imm8u:1, xmm2/m128, ymmV, {= k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 18 /r ib","V","V","AVX512F+AVX512VL",= "scale16","w,r,r,r,r","","" +"VINSERTF32X4 zmm1, {k}{z}, zmmV, xmm2/m128, imm8u:2","VINSERTF32X4 imm8u:= 2, xmm2/m128, zmmV, {k}{z}, zmm1","vinsertf32x4 imm8u:2, xmm2/m128, zmmV, {= k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 18 /r ib","V","V","AVX512F","scale16"= ,"w,r,r,r,r","","" +"VINSERTF32X8 zmm1, {k}{z}, zmmV, ymm2/m256, imm8u:1","VINSERTF32X8 imm8u:= 1, ymm2/m256, zmmV, {k}{z}, zmm1","vinsertf32x8 imm8u:1, ymm2/m256, zmmV, {= k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 1A /r ib","V","V","AVX512DQ","scale32= ","w,r,r,r,r","","" +"VINSERTF64X2 ymm1, {k}{z}, ymmV, xmm2/m128, imm8u:1","VINSERTF64X2 imm8u:= 1, xmm2/m128, ymmV, {k}{z}, ymm1","vinsertf64x2 imm8u:1, xmm2/m128, ymmV, {= k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 18 /r ib","V","V","AVX512DQ+AVX512VL"= ,"scale16","w,r,r,r,r","","" +"VINSERTF64X2 zmm1, {k}{z}, zmmV, xmm2/m128, imm8u:2","VINSERTF64X2 imm8u:= 2, xmm2/m128, zmmV, {k}{z}, zmm1","vinsertf64x2 imm8u:2, xmm2/m128, zmmV, {= k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 18 /r ib","V","V","AVX512DQ","scale16= 
","w,r,r,r,r","","" +"VINSERTF64X4 zmm1, {k}{z}, zmmV, ymm2/m256, imm8u:1","VINSERTF64X4 imm8u:= 1, ymm2/m256, zmmV, {k}{z}, zmm1","vinsertf64x4 imm8u:1, ymm2/m256, zmmV, {= k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 1A /r ib","V","V","AVX512F","scale32"= ,"w,r,r,r,r","","" +"VINSERTI128 ymm1, ymmV, xmm2/m128, imm8u:1","VINSERTI128 imm8u:1, xmm2/m1= 28, ymmV, ymm1","vinserti128 imm8u:1, xmm2/m128, ymmV, ymm1","VEX.NDS.256.6= 6.0F3A.W0 38 /r ib","V","V","AVX2","","w,r,r,r","","" +"VINSERTI32X4 ymm1, {k}{z}, ymmV, xmm2/m128, imm8u:1","VINSERTI32X4 imm8u:= 1, xmm2/m128, ymmV, {k}{z}, ymm1","vinserti32x4 imm8u:1, xmm2/m128, ymmV, {= k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 38 /r ib","V","V","AVX512F+AVX512VL",= "scale16","w,r,r,r,r","","" +"VINSERTI32X4 zmm1, {k}{z}, zmmV, xmm2/m128, imm8u:2","VINSERTI32X4 imm8u:= 2, xmm2/m128, zmmV, {k}{z}, zmm1","vinserti32x4 imm8u:2, xmm2/m128, zmmV, {= k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 38 /r ib","V","V","AVX512F","scale16"= ,"w,r,r,r,r","","" +"VINSERTI32X8 zmm1, {k}{z}, zmmV, ymm2/m256, imm8u:1","VINSERTI32X8 imm8u:= 1, ymm2/m256, zmmV, {k}{z}, zmm1","vinserti32x8 imm8u:1, ymm2/m256, zmmV, {= k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 3A /r ib","V","V","AVX512DQ","scale32= ","w,r,r,r,r","","" +"VINSERTI64X2 ymm1, {k}{z}, ymmV, xmm2/m128, imm8u:1","VINSERTI64X2 imm8u:= 1, xmm2/m128, ymmV, {k}{z}, ymm1","vinserti64x2 imm8u:1, xmm2/m128, ymmV, {= k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 38 /r ib","V","V","AVX512DQ+AVX512VL"= ,"scale16","w,r,r,r,r","","" +"VINSERTI64X2 zmm1, {k}{z}, zmmV, xmm2/m128, imm8u:2","VINSERTI64X2 imm8u:= 2, xmm2/m128, zmmV, {k}{z}, zmm1","vinserti64x2 imm8u:2, xmm2/m128, zmmV, {= k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 38 /r ib","V","V","AVX512DQ","scale16= ","w,r,r,r,r","","" +"VINSERTI64X4 zmm1, {k}{z}, zmmV, ymm2/m256, imm8u:1","VINSERTI64X4 imm8u:= 1, ymm2/m256, zmmV, {k}{z}, zmm1","vinserti64x4 imm8u:1, ymm2/m256, zmmV, {= k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 3A /r ib","V","V","AVX512F","scale32"= ,"w,r,r,r,r","","" +"VINSERTPS xmm1, xmmV, xmm2/m32, imm8u","VINSERTPS imm8u, xmm2/m32, xmmV, = xmm1","vinsertps imm8u, xmm2/m32, xmmV, xmm1","EVEX.NDS.128.66.0F3A.W0 21 /= r ib","V","V","AVX512F+AVX512VL","scale4","w,r,r,r","","" +"VINSERTPS xmm1, xmmV, xmm2/m32, imm8u","VINSERTPS imm8u, xmm2/m32, xmmV, = xmm1","vinsertps imm8u, xmm2/m32, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 21 /= r ib","V","V","AVX","","w,r,r,r","","" +"VLDDQU xmm1, m128","VLDDQU m128, xmm1","vlddqu m128, xmm1","VEX.128.F2.0F= .WIG F0 /r","V","V","AVX","modrm_memonly","w,r","","" +"VLDDQU ymm1, m256","VLDDQU m256, ymm1","vlddqu m256, ymm1","VEX.256.F2.0F= .WIG F0 /r","V","V","AVX","modrm_memonly","w,r","","" +"VLDMXCSR m32","VLDMXCSR m32","vldmxcsr m32","VEX.128.0F.WIG AE /2","V","V= ","AVX","modrm_memonly","r","","" +"VMASKMOVDQU xmm1, xmm2","VMASKMOVDQU xmm2, xmm1","vmaskmovdqu xmm2, xmm1"= ,"VEX.128.66.0F.WIG F7 /r","V","V","AVX","modrm_regonly","r,r","","" +"VMASKMOVPD xmm1, xmmV, m128","VMASKMOVPD m128, xmmV, xmm1","vmaskmovpd m1= 28, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 2D /r","V","V","AVX","modrm_memonly= ","w,r,r","","" +"VMASKMOVPD ymm1, ymmV, m256","VMASKMOVPD m256, ymmV, ymm1","vmaskmovpd m2= 56, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 2D /r","V","V","AVX","modrm_memonly= ","w,r,r","","" +"VMASKMOVPD m128, xmmV, xmm1","VMASKMOVPD xmm1, xmmV, m128","vmaskmovpd xm= m1, xmmV, m128","VEX.NDS.128.66.0F38.W0 2F /r","V","V","AVX","modrm_memonly= ","w,r,r","","" +"VMASKMOVPD m256, ymmV, ymm1","VMASKMOVPD ymm1, ymmV, m256","vmaskmovpd ym= m1, ymmV, m256","VEX.NDS.256.66.0F38.W0 2F 
/r","V","V","AVX","modrm_memonly= ","w,r,r","","" +"VMASKMOVPS xmm1, xmmV, m128","VMASKMOVPS m128, xmmV, xmm1","vmaskmovps m1= 28, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 2C /r","V","V","AVX","modrm_memonly= ","w,r,r","","" +"VMASKMOVPS ymm1, ymmV, m256","VMASKMOVPS m256, ymmV, ymm1","vmaskmovps m2= 56, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 2C /r","V","V","AVX","modrm_memonly= ","w,r,r","","" +"VMASKMOVPS m128, xmmV, xmm1","VMASKMOVPS xmm1, xmmV, m128","vmaskmovps xm= m1, xmmV, m128","VEX.NDS.128.66.0F38.W0 2E /r","V","V","AVX","modrm_memonly= ","w,r,r","","" +"VMASKMOVPS m256, ymmV, ymm1","VMASKMOVPS ymm1, ymmV, m256","vmaskmovps ym= m1, ymmV, m256","VEX.NDS.256.66.0F38.W0 2E /r","V","V","AVX","modrm_memonly= ","w,r,r","","" +"VMAXPD xmm1, xmmV, xmm2/m128","VMAXPD xmm2/m128, xmmV, xmm1","vmaxpd xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 5F /r","V","V","AVX","","w,r,r","= ","" +"VMAXPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VMAXPD xmm2/m128/m64bcst, = xmmV, {k}{z}, xmm1","vmaxpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F.W1 5F /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r= ","","" +"VMAXPD ymm1, ymmV, ymm2/m256","VMAXPD ymm2/m256, ymmV, ymm1","vmaxpd ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 5F /r","V","V","AVX","","w,r,r","= ","" +"VMAXPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VMAXPD ymm2/m256/m64bcst, = ymmV, {k}{z}, ymm1","vmaxpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F.W1 5F /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r= ","","" +"VMAXPD zmm1{sae}, {k}{z}, zmmV, zmm2","VMAXPD zmm2, zmmV, {k}{z}, zmm1{sa= e}","vmaxpd zmm2, zmmV, {k}{z}, zmm1{sae}","EVEX.NDS.512.66.0F.W1 5F /r","V= ","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VMAXPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VMAXPD zmm2/m512/m64bcst, = zmmV, {k}{z}, zmm1","vmaxpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F.W1 5F /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VMAXPS xmm1, xmmV, xmm2/m128","VMAXPS xmm2/m128, xmmV, xmm1","vmaxps xmm2= /m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 5F /r","V","V","AVX","","w,r,r","","" +"VMAXPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VMAXPS xmm2/m128/m32bcst, = xmmV, {k}{z}, xmm1","vmaxps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.0F.W0 5F /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","= ","" +"VMAXPS ymm1, ymmV, ymm2/m256","VMAXPS ymm2/m256, ymmV, ymm1","vmaxps ymm2= /m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 5F /r","V","V","AVX","","w,r,r","","" +"VMAXPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VMAXPS ymm2/m256/m32bcst, = ymmV, {k}{z}, ymm1","vmaxps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.0F.W0 5F /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","= ","" +"VMAXPS zmm1{sae}, {k}{z}, zmmV, zmm2","VMAXPS zmm2, zmmV, {k}{z}, zmm1{sa= e}","vmaxps zmm2, zmmV, {k}{z}, zmm1{sae}","EVEX.NDS.512.0F.W0 5F /r","V","= V","AVX512F","modrm_regonly","w,r,r,r","","" +"VMAXPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VMAXPS zmm2/m512/m32bcst, = zmmV, {k}{z}, zmm1","vmaxps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.0F.W0 5F /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VMAXSD xmm1{sae}, {k}{z}, xmmV, xmm2","VMAXSD xmm2, xmmV, {k}{z}, xmm1{sa= e}","vmaxsd xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.F2.0F.W1 5F /r","V= ","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VMAXSD xmm1, xmmV, xmm2/m64","VMAXSD xmm2/m64, xmmV, xmm1","vmaxsd xmm2/m= 64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 5F /r","V","V","AVX","","w,r,r","","" +"VMAXSD xmm1, {k}{z}, xmmV, 
xmm2/m64","VMAXSD xmm2/m64, xmmV, {k}{z}, xmm1= ","vmaxsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 5F /r","V","= V","AVX512F","scale8","w,r,r,r","","" +"VMAXSS xmm1{sae}, {k}{z}, xmmV, xmm2","VMAXSS xmm2, xmmV, {k}{z}, xmm1{sa= e}","vmaxss xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.F3.0F.W0 5F /r","V= ","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VMAXSS xmm1, xmmV, xmm2/m32","VMAXSS xmm2/m32, xmmV, xmm1","vmaxss xmm2/m= 32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 5F /r","V","V","AVX","","w,r,r","","" +"VMAXSS xmm1, {k}{z}, xmmV, xmm2/m32","VMAXSS xmm2/m32, xmmV, {k}{z}, xmm1= ","vmaxss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 5F /r","V","= V","AVX512F","scale4","w,r,r,r","","" +"VMCALL","VMCALL","vmcall","0F 01 C1","V","V","VTX","","","","" +"VMCLEAR m64","VMCLEAR m64","vmclear m64","66 0F C7 /6","V","V","VTX","mod= rm_memonly","r","","" +"VMFUNC","VMFUNC","vmfunc","0F 01 D4","V","V","","","","","" +"VMINPD xmm1, xmmV, xmm2/m128","VMINPD xmm2/m128, xmmV, xmm1","vminpd xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 5D /r","V","V","AVX","","w,r,r","= ","" +"VMINPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VMINPD xmm2/m128/m64bcst, = xmmV, {k}{z}, xmm1","vminpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F.W1 5D /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r= ","","" +"VMINPD ymm1, ymmV, ymm2/m256","VMINPD ymm2/m256, ymmV, ymm1","vminpd ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 5D /r","V","V","AVX","","w,r,r","= ","" +"VMINPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VMINPD ymm2/m256/m64bcst, = ymmV, {k}{z}, ymm1","vminpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F.W1 5D /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r= ","","" +"VMINPD zmm1{sae}, {k}{z}, zmmV, zmm2","VMINPD zmm2, zmmV, {k}{z}, zmm1{sa= e}","vminpd zmm2, zmmV, {k}{z}, zmm1{sae}","EVEX.NDS.512.66.0F.W1 5D /r","V= ","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VMINPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VMINPD zmm2/m512/m64bcst, = zmmV, {k}{z}, zmm1","vminpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F.W1 5D /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VMINPS xmm1, xmmV, xmm2/m128","VMINPS xmm2/m128, xmmV, xmm1","vminps xmm2= /m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 5D /r","V","V","AVX","","w,r,r","","" +"VMINPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VMINPS xmm2/m128/m32bcst, = xmmV, {k}{z}, xmm1","vminps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.0F.W0 5D /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","= ","" +"VMINPS ymm1, ymmV, ymm2/m256","VMINPS ymm2/m256, ymmV, ymm1","vminps ymm2= /m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 5D /r","V","V","AVX","","w,r,r","","" +"VMINPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VMINPS ymm2/m256/m32bcst, = ymmV, {k}{z}, ymm1","vminps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.0F.W0 5D /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","= ","" +"VMINPS zmm1{sae}, {k}{z}, zmmV, zmm2","VMINPS zmm2, zmmV, {k}{z}, zmm1{sa= e}","vminps zmm2, zmmV, {k}{z}, zmm1{sae}","EVEX.NDS.512.0F.W0 5D /r","V","= V","AVX512F","modrm_regonly","w,r,r,r","","" +"VMINPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VMINPS zmm2/m512/m32bcst, = zmmV, {k}{z}, zmm1","vminps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.0F.W0 5D /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VMINSD xmm1{sae}, {k}{z}, xmmV, xmm2","VMINSD xmm2, xmmV, {k}{z}, xmm1{sa= e}","vminsd xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.F2.0F.W1 5D /r","V= 
","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VMINSD xmm1, xmmV, xmm2/m64","VMINSD xmm2/m64, xmmV, xmm1","vminsd xmm2/m= 64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 5D /r","V","V","AVX","","w,r,r","","" +"VMINSD xmm1, {k}{z}, xmmV, xmm2/m64","VMINSD xmm2/m64, xmmV, {k}{z}, xmm1= ","vminsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 5D /r","V","= V","AVX512F","scale8","w,r,r,r","","" +"VMINSS xmm1{sae}, {k}{z}, xmmV, xmm2","VMINSS xmm2, xmmV, {k}{z}, xmm1{sa= e}","vminss xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.F3.0F.W0 5D /r","V= ","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VMINSS xmm1, xmmV, xmm2/m32","VMINSS xmm2/m32, xmmV, xmm1","vminss xmm2/m= 32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 5D /r","V","V","AVX","","w,r,r","","" +"VMINSS xmm1, {k}{z}, xmmV, xmm2/m32","VMINSS xmm2/m32, xmmV, {k}{z}, xmm1= ","vminss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 5D /r","V","= V","AVX512F","scale4","w,r,r,r","","" +"VMLAUNCH","VMLAUNCH","vmlaunch","0F 01 C2","V","V","VTX","","","","" +"VMLOAD EAX","VMLOADL EAX","vmloadl EAX","0F 01 DA","V","V","SVM","amd,mod= rm_regonly,operand32","r","Y","32" +"VMLOAD RAX","VMLOADQ RAX","vmloadq RAX","REX.W 0F 01 DA","N.S.","V","SVM"= ,"amd,modrm_regonly","r","Y","64" +"VMLOAD AX","VMLOADW AX","vmloadw AX","0F 01 DA","V","V","SVM","amd,modrm_= regonly,operand16","r","Y","16" +"VMMCALL","VMMCALL","vmmcall","0F 01 D9","V","V","SVM","amd","","","" +"VMOVAPD xmm2/m128, xmm1","VMOVAPD xmm1, xmm2/m128","vmovapd xmm1, xmm2/m1= 28","VEX.128.66.0F.WIG 29 /r","V","V","AVX","","w,r","","" +"VMOVAPD xmm2/m128, {k}{z}, xmm1","VMOVAPD xmm1, {k}{z}, xmm2/m128","vmova= pd xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F.W1 29 /r","V","V","AVX512F+AVX5= 12VL","scale16","w,r,r","","" +"VMOVAPD xmm1, xmm2/m128","VMOVAPD xmm2/m128, xmm1","vmovapd xmm2/m128, xm= m1","VEX.128.66.0F.WIG 28 /r","V","V","AVX","","w,r","","" +"VMOVAPD xmm1, {k}{z}, xmm2/m128","VMOVAPD xmm2/m128, {k}{z}, xmm1","vmova= pd xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F.W1 28 /r","V","V","AVX512F+AVX5= 12VL","scale16","w,r,r","","" +"VMOVAPD ymm2/m256, ymm1","VMOVAPD ymm1, ymm2/m256","vmovapd ymm1, ymm2/m2= 56","VEX.256.66.0F.WIG 29 /r","V","V","AVX","","w,r","","" +"VMOVAPD ymm2/m256, {k}{z}, ymm1","VMOVAPD ymm1, {k}{z}, ymm2/m256","vmova= pd ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F.W1 29 /r","V","V","AVX512F+AVX5= 12VL","scale32","w,r,r","","" +"VMOVAPD ymm1, ymm2/m256","VMOVAPD ymm2/m256, ymm1","vmovapd ymm2/m256, ym= m1","VEX.256.66.0F.WIG 28 /r","V","V","AVX","","w,r","","" +"VMOVAPD ymm1, {k}{z}, ymm2/m256","VMOVAPD ymm2/m256, {k}{z}, ymm1","vmova= pd ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F.W1 28 /r","V","V","AVX512F+AVX5= 12VL","scale32","w,r,r","","" +"VMOVAPD zmm2/m512, {k}{z}, zmm1","VMOVAPD zmm1, {k}{z}, zmm2/m512","vmova= pd zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F.W1 29 /r","V","V","AVX512F","sc= ale64","w,r,r","","" +"VMOVAPD zmm1, {k}{z}, zmm2/m512","VMOVAPD zmm2/m512, {k}{z}, zmm1","vmova= pd zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F.W1 28 /r","V","V","AVX512F","sc= ale64","w,r,r","","" +"VMOVAPS xmm2/m128, xmm1","VMOVAPS xmm1, xmm2/m128","vmovaps xmm1, xmm2/m1= 28","VEX.128.0F.WIG 29 /r","V","V","AVX","","w,r","","" +"VMOVAPS xmm2/m128, {k}{z}, xmm1","VMOVAPS xmm1, {k}{z}, xmm2/m128","vmova= ps xmm1, {k}{z}, xmm2/m128","EVEX.128.0F.W0 29 /r","V","V","AVX512F+AVX512V= L","scale16","w,r,r","","" +"VMOVAPS xmm1, xmm2/m128","VMOVAPS xmm2/m128, xmm1","vmovaps xmm2/m128, xm= m1","VEX.128.0F.WIG 28 /r","V","V","AVX","","w,r","","" +"VMOVAPS xmm1, {k}{z}, xmm2/m128","VMOVAPS xmm2/m128, 
{k}{z}, xmm1","vmova= ps xmm2/m128, {k}{z}, xmm1","EVEX.128.0F.W0 28 /r","V","V","AVX512F+AVX512V= L","scale16","w,r,r","","" +"VMOVAPS ymm2/m256, ymm1","VMOVAPS ymm1, ymm2/m256","vmovaps ymm1, ymm2/m2= 56","VEX.256.0F.WIG 29 /r","V","V","AVX","","w,r","","" +"VMOVAPS ymm2/m256, {k}{z}, ymm1","VMOVAPS ymm1, {k}{z}, ymm2/m256","vmova= ps ymm1, {k}{z}, ymm2/m256","EVEX.256.0F.W0 29 /r","V","V","AVX512F+AVX512V= L","scale32","w,r,r","","" +"VMOVAPS ymm1, ymm2/m256","VMOVAPS ymm2/m256, ymm1","vmovaps ymm2/m256, ym= m1","VEX.256.0F.WIG 28 /r","V","V","AVX","","w,r","","" +"VMOVAPS ymm1, {k}{z}, ymm2/m256","VMOVAPS ymm2/m256, {k}{z}, ymm1","vmova= ps ymm2/m256, {k}{z}, ymm1","EVEX.256.0F.W0 28 /r","V","V","AVX512F+AVX512V= L","scale32","w,r,r","","" +"VMOVAPS zmm2/m512, {k}{z}, zmm1","VMOVAPS zmm1, {k}{z}, zmm2/m512","vmova= ps zmm1, {k}{z}, zmm2/m512","EVEX.512.0F.W0 29 /r","V","V","AVX512F","scale= 64","w,r,r","","" +"VMOVAPS zmm1, {k}{z}, zmm2/m512","VMOVAPS zmm2/m512, {k}{z}, zmm1","vmova= ps zmm2/m512, {k}{z}, zmm1","EVEX.512.0F.W0 28 /r","V","V","AVX512F","scale= 64","w,r,r","","" +"VMOVD xmm1, r/m32","VMOVD r/m32, xmm1","vmovd r/m32, xmm1","EVEX.128.66.0= F.W0 6E /r","V","V","AVX512F+AVX512VL","scale4","w,r","","" +"VMOVD xmm1, r/m32","VMOVD r/m32, xmm1","vmovd r/m32, xmm1","VEX.128.66.0F= .W0 6E /r","V","V","AVX","","w,r","","" +"VMOVD r/m32, xmm1","VMOVD xmm1, r/m32","vmovd xmm1, r/m32","EVEX.128.66.0= F.W0 7E /r","V","V","AVX512F+AVX512VL","scale4","w,r","","" +"VMOVD r/m32, xmm1","VMOVD xmm1, r/m32","vmovd xmm1, r/m32","VEX.128.66.0F= .W0 7E /r","V","V","AVX","","w,r","","" +"VMOVDDUP xmm1, xmm2/m64","VMOVDDUP xmm2/m64, xmm1","vmovddup xmm2/m64, xm= m1","VEX.128.F2.0F.WIG 12 /r","V","V","AVX","","w,r","","" +"VMOVDDUP xmm1, {k}{z}, xmm2/m64","VMOVDDUP xmm2/m64, {k}{z}, xmm1","vmovd= dup xmm2/m64, {k}{z}, xmm1","EVEX.128.F2.0F.W1 12 /r","V","V","AVX512F+AVX5= 12VL","scale8","w,r,r","","" +"VMOVDDUP ymm1, ymm2/m256","VMOVDDUP ymm2/m256, ymm1","vmovddup ymm2/m256,= ymm1","VEX.256.F2.0F.WIG 12 /r","V","V","AVX","","w,r","","" +"VMOVDDUP ymm1, {k}{z}, ymm2/m256","VMOVDDUP ymm2/m256, {k}{z}, ymm1","vmo= vddup ymm2/m256, {k}{z}, ymm1","EVEX.256.F2.0F.W1 12 /r","V","V","AVX512F+A= VX512VL","scale32","w,r,r","","" +"VMOVDDUP zmm1, {k}{z}, zmm2/m512","VMOVDDUP zmm2/m512, {k}{z}, zmm1","vmo= vddup zmm2/m512, {k}{z}, zmm1","EVEX.512.F2.0F.W1 12 /r","V","V","AVX512F",= "scale64","w,r,r","","" +"VMOVDQA xmm2/m128, xmm1","VMOVDQA xmm1, xmm2/m128","vmovdqa xmm1, xmm2/m1= 28","VEX.128.66.0F.WIG 7F /r","V","V","AVX","","w,r","","" +"VMOVDQA xmm1, xmm2/m128","VMOVDQA xmm2/m128, xmm1","vmovdqa xmm2/m128, xm= m1","VEX.128.66.0F.WIG 6F /r","V","V","AVX","","w,r","","" +"VMOVDQA ymm2/m256, ymm1","VMOVDQA ymm1, ymm2/m256","vmovdqa ymm1, ymm2/m2= 56","VEX.256.66.0F.WIG 7F /r","V","V","AVX","","w,r","","" +"VMOVDQA ymm1, ymm2/m256","VMOVDQA ymm2/m256, ymm1","vmovdqa ymm2/m256, ym= m1","VEX.256.66.0F.WIG 6F /r","V","V","AVX","","w,r","","" +"VMOVDQA32 xmm2/m128, {k}{z}, xmm1","VMOVDQA32 xmm1, {k}{z}, xmm2/m128","v= movdqa32 xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F.W0 7F /r","V","V","AVX512= F+AVX512VL","scale16","w,r,r","","" +"VMOVDQA32 xmm1, {k}{z}, xmm2/m128","VMOVDQA32 xmm2/m128, {k}{z}, xmm1","v= movdqa32 xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F.W0 6F /r","V","V","AVX512= F+AVX512VL","scale16","w,r,r","","" +"VMOVDQA32 ymm2/m256, {k}{z}, ymm1","VMOVDQA32 ymm1, {k}{z}, ymm2/m256","v= movdqa32 ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F.W0 7F /r","V","V","AVX512= F+AVX512VL","scale32","w,r,r","","" +"VMOVDQA32 
ymm1, {k}{z}, ymm2/m256","VMOVDQA32 ymm2/m256, {k}{z}, ymm1","v= movdqa32 ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F.W0 6F /r","V","V","AVX512= F+AVX512VL","scale32","w,r,r","","" +"VMOVDQA32 zmm2/m512, {k}{z}, zmm1","VMOVDQA32 zmm1, {k}{z}, zmm2/m512","v= movdqa32 zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F.W0 7F /r","V","V","AVX512= F","scale64","w,r,r","","" +"VMOVDQA32 zmm1, {k}{z}, zmm2/m512","VMOVDQA32 zmm2/m512, {k}{z}, zmm1","v= movdqa32 zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F.W0 6F /r","V","V","AVX512= F","scale64","w,r,r","","" +"VMOVDQA64 xmm2/m128, {k}{z}, xmm1","VMOVDQA64 xmm1, {k}{z}, xmm2/m128","v= movdqa64 xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F.W1 7F /r","V","V","AVX512= F+AVX512VL","scale16","w,r,r","Y","128" +"VMOVDQA64 xmm1, {k}{z}, xmm2/m128","VMOVDQA64 xmm2/m128, {k}{z}, xmm1","v= movdqa64 xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F.W1 6F /r","V","V","AVX512= F+AVX512VL","scale16","w,r,r","Y","128" +"VMOVDQA64 ymm2/m256, {k}{z}, ymm1","VMOVDQA64 ymm1, {k}{z}, ymm2/m256","v= movdqa64 ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F.W1 7F /r","V","V","AVX512= F+AVX512VL","scale32","w,r,r","Y","256" +"VMOVDQA64 ymm1, {k}{z}, ymm2/m256","VMOVDQA64 ymm2/m256, {k}{z}, ymm1","v= movdqa64 ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F.W1 6F /r","V","V","AVX512= F+AVX512VL","scale32","w,r,r","Y","256" +"VMOVDQA64 zmm2/m512, {k}{z}, zmm1","VMOVDQA64 zmm1, {k}{z}, zmm2/m512","v= movdqa64 zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F.W1 7F /r","V","V","AVX512= F","scale64","w,r,r","Y","512" +"VMOVDQA64 zmm1, {k}{z}, zmm2/m512","VMOVDQA64 zmm2/m512, {k}{z}, zmm1","v= movdqa64 zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F.W1 6F /r","V","V","AVX512= F","scale64","w,r,r","Y","512" +"VMOVDQU xmm2/m128, xmm1","VMOVDQU xmm1, xmm2/m128","vmovdqu xmm1, xmm2/m1= 28","VEX.128.F3.0F.WIG 7F /r","V","V","AVX","","w,r","","" +"VMOVDQU xmm1, xmm2/m128","VMOVDQU xmm2/m128, xmm1","vmovdqu xmm2/m128, xm= m1","VEX.128.F3.0F.WIG 6F /r","V","V","AVX","","w,r","","" +"VMOVDQU ymm2/m256, ymm1","VMOVDQU ymm1, ymm2/m256","vmovdqu ymm1, ymm2/m2= 56","VEX.256.F3.0F.WIG 7F /r","V","V","AVX","","w,r","","" +"VMOVDQU ymm1, ymm2/m256","VMOVDQU ymm2/m256, ymm1","vmovdqu ymm2/m256, ym= m1","VEX.256.F3.0F.WIG 6F /r","V","V","AVX","","w,r","","" +"VMOVDQU16 xmm2/m128, {k}{z}, xmm1","VMOVDQU16 xmm1, {k}{z}, xmm2/m128","v= movdqu16 xmm1, {k}{z}, xmm2/m128","EVEX.128.F2.0F.W1 7F /r","V","V","AVX512= BW+AVX512VL","scale16","w,r,r","","" +"VMOVDQU16 xmm1, {k}{z}, xmm2/m128","VMOVDQU16 xmm2/m128, {k}{z}, xmm1","v= movdqu16 xmm2/m128, {k}{z}, xmm1","EVEX.128.F2.0F.W1 6F /r","V","V","AVX512= BW+AVX512VL","scale16","w,r,r","","" +"VMOVDQU16 ymm2/m256, {k}{z}, ymm1","VMOVDQU16 ymm1, {k}{z}, ymm2/m256","v= movdqu16 ymm1, {k}{z}, ymm2/m256","EVEX.256.F2.0F.W1 7F /r","V","V","AVX512= BW+AVX512VL","scale32","w,r,r","","" +"VMOVDQU16 ymm1, {k}{z}, ymm2/m256","VMOVDQU16 ymm2/m256, {k}{z}, ymm1","v= movdqu16 ymm2/m256, {k}{z}, ymm1","EVEX.256.F2.0F.W1 6F /r","V","V","AVX512= BW+AVX512VL","scale32","w,r,r","","" +"VMOVDQU16 zmm2/m512, {k}{z}, zmm1","VMOVDQU16 zmm1, {k}{z}, zmm2/m512","v= movdqu16 zmm1, {k}{z}, zmm2/m512","EVEX.512.F2.0F.W1 7F /r","V","V","AVX512= BW","scale64","w,r,r","","" +"VMOVDQU16 zmm1, {k}{z}, zmm2/m512","VMOVDQU16 zmm2/m512, {k}{z}, zmm1","v= movdqu16 zmm2/m512, {k}{z}, zmm1","EVEX.512.F2.0F.W1 6F /r","V","V","AVX512= BW","scale64","w,r,r","","" +"VMOVDQU32 xmm2/m128, {k}{z}, xmm1","VMOVDQU32 xmm1, {k}{z}, xmm2/m128","v= movdqu32 xmm1, {k}{z}, xmm2/m128","EVEX.128.F3.0F.W0 7F /r","V","V","AVX512= F+AVX512VL","scale16","w,r,r","Y","128" 
+"VMOVDQU32 xmm1, {k}{z}, xmm2/m128","VMOVDQU32 xmm2/m128, {k}{z}, xmm1","v= movdqu32 xmm2/m128, {k}{z}, xmm1","EVEX.128.F3.0F.W0 6F /r","V","V","AVX512= F+AVX512VL","scale16","w,r,r","Y","128" +"VMOVDQU32 ymm2/m256, {k}{z}, ymm1","VMOVDQU32 ymm1, {k}{z}, ymm2/m256","v= movdqu32 ymm1, {k}{z}, ymm2/m256","EVEX.256.F3.0F.W0 7F /r","V","V","AVX512= F+AVX512VL","scale32","w,r,r","Y","256" +"VMOVDQU32 ymm1, {k}{z}, ymm2/m256","VMOVDQU32 ymm2/m256, {k}{z}, ymm1","v= movdqu32 ymm2/m256, {k}{z}, ymm1","EVEX.256.F3.0F.W0 6F /r","V","V","AVX512= F+AVX512VL","scale32","w,r,r","Y","256" +"VMOVDQU32 zmm2/m512, {k}{z}, zmm1","VMOVDQU32 zmm1, {k}{z}, zmm2/m512","v= movdqu32 zmm1, {k}{z}, zmm2/m512","EVEX.512.F3.0F.W0 7F /r","V","V","AVX512= F","scale64","w,r,r","Y","512" +"VMOVDQU32 zmm1, {k}{z}, zmm2/m512","VMOVDQU32 zmm2/m512, {k}{z}, zmm1","v= movdqu32 zmm2/m512, {k}{z}, zmm1","EVEX.512.F3.0F.W0 6F /r","V","V","AVX512= F","scale64","w,r,r","Y","512" +"VMOVDQU64 xmm2/m128, {k}{z}, xmm1","VMOVDQU64 xmm1, {k}{z}, xmm2/m128","v= movdqu64 xmm1, {k}{z}, xmm2/m128","EVEX.128.F3.0F.W1 7F /r","V","V","AVX512= F+AVX512VL","scale16","w,r,r","Y","128" +"VMOVDQU64 xmm1, {k}{z}, xmm2/m128","VMOVDQU64 xmm2/m128, {k}{z}, xmm1","v= movdqu64 xmm2/m128, {k}{z}, xmm1","EVEX.128.F3.0F.W1 6F /r","V","V","AVX512= F+AVX512VL","scale16","w,r,r","Y","128" +"VMOVDQU64 ymm2/m256, {k}{z}, ymm1","VMOVDQU64 ymm1, {k}{z}, ymm2/m256","v= movdqu64 ymm1, {k}{z}, ymm2/m256","EVEX.256.F3.0F.W1 7F /r","V","V","AVX512= F+AVX512VL","scale32","w,r,r","Y","256" +"VMOVDQU64 ymm1, {k}{z}, ymm2/m256","VMOVDQU64 ymm2/m256, {k}{z}, ymm1","v= movdqu64 ymm2/m256, {k}{z}, ymm1","EVEX.256.F3.0F.W1 6F /r","V","V","AVX512= F+AVX512VL","scale32","w,r,r","Y","256" +"VMOVDQU64 zmm2/m512, {k}{z}, zmm1","VMOVDQU64 zmm1, {k}{z}, zmm2/m512","v= movdqu64 zmm1, {k}{z}, zmm2/m512","EVEX.512.F3.0F.W1 7F /r","V","V","AVX512= F","scale64","w,r,r","Y","512" +"VMOVDQU64 zmm1, {k}{z}, zmm2/m512","VMOVDQU64 zmm2/m512, {k}{z}, zmm1","v= movdqu64 zmm2/m512, {k}{z}, zmm1","EVEX.512.F3.0F.W1 6F /r","V","V","AVX512= F","scale64","w,r,r","Y","512" +"VMOVDQU8 xmm2/m128, {k}{z}, xmm1","VMOVDQU8 xmm1, {k}{z}, xmm2/m128","vmo= vdqu8 xmm1, {k}{z}, xmm2/m128","EVEX.128.F2.0F.W0 7F /r","V","V","AVX512BW+= AVX512VL","scale16","w,r,r","Y","128" +"VMOVDQU8 xmm1, {k}{z}, xmm2/m128","VMOVDQU8 xmm2/m128, {k}{z}, xmm1","vmo= vdqu8 xmm2/m128, {k}{z}, xmm1","EVEX.128.F2.0F.W0 6F /r","V","V","AVX512BW+= AVX512VL","scale16","w,r,r","Y","128" +"VMOVDQU8 ymm2/m256, {k}{z}, ymm1","VMOVDQU8 ymm1, {k}{z}, ymm2/m256","vmo= vdqu8 ymm1, {k}{z}, ymm2/m256","EVEX.256.F2.0F.W0 7F /r","V","V","AVX512BW+= AVX512VL","scale32","w,r,r","Y","256" +"VMOVDQU8 ymm1, {k}{z}, ymm2/m256","VMOVDQU8 ymm2/m256, {k}{z}, ymm1","vmo= vdqu8 ymm2/m256, {k}{z}, ymm1","EVEX.256.F2.0F.W0 6F /r","V","V","AVX512BW+= AVX512VL","scale32","w,r,r","Y","256" +"VMOVDQU8 zmm2/m512, {k}{z}, zmm1","VMOVDQU8 zmm1, {k}{z}, zmm2/m512","vmo= vdqu8 zmm1, {k}{z}, zmm2/m512","EVEX.512.F2.0F.W0 7F /r","V","V","AVX512BW"= ,"scale64","w,r,r","Y","512" +"VMOVDQU8 zmm1, {k}{z}, zmm2/m512","VMOVDQU8 zmm2/m512, {k}{z}, zmm1","vmo= vdqu8 zmm2/m512, {k}{z}, zmm1","EVEX.512.F2.0F.W0 6F /r","V","V","AVX512BW"= ,"scale64","w,r,r","Y","512" +"VMOVHLPS xmm1, xmmV, xmm2","VMOVHLPS xmm2, xmmV, xmm1","vmovhlps xmm2, xm= mV, xmm1","EVEX.NDS.128.0F.W0 12 /r","V","V","AVX512F+AVX512VL","modrm_rego= nly","w,r,r","","" +"VMOVHLPS xmm1, xmmV, xmm2","VMOVHLPS xmm2, xmmV, xmm1","vmovhlps xmm2, xm= mV, xmm1","VEX.NDS.128.0F.WIG 12 
/r","V","V","AVX","modrm_regonly","w,r,r",= "","" +"VMOVHPD xmm1, xmmV, m64","VMOVHPD m64, xmmV, xmm1","vmovhpd m64, xmmV, xm= m1","EVEX.NDS.LIG.66.0F.W1 16 /r","V","V","AVX512F+AVX512VL","modrm_memonly= ,scale8","w,r,r","","" +"VMOVHPD xmm1, xmmV, m64","VMOVHPD m64, xmmV, xmm1","vmovhpd m64, xmmV, xm= m1","VEX.NDS.128.66.0F.WIG 16 /r","V","V","AVX","modrm_memonly","w,r,r","",= "" +"VMOVHPD m64, xmm1","VMOVHPD xmm1, m64","vmovhpd xmm1, m64","EVEX.LIG.66.0= F.W1 17 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,r","","" +"VMOVHPD m64, xmm1","VMOVHPD xmm1, m64","vmovhpd xmm1, m64","VEX.128.66.0F= .WIG 17 /r","V","V","AVX","modrm_memonly","w,r","","" +"VMOVHPS xmm1, xmmV, m64","VMOVHPS m64, xmmV, xmm1","vmovhps m64, xmmV, xm= m1","EVEX.NDS.128.0F.W0 16 /r","V","V","AVX512F+AVX512VL","modrm_memonly,sc= ale8","w,r,r","","" +"VMOVHPS xmm1, xmmV, m64","VMOVHPS m64, xmmV, xmm1","vmovhps m64, xmmV, xm= m1","VEX.NDS.128.0F.WIG 16 /r","V","V","AVX","modrm_memonly","w,r,r","","" +"VMOVHPS m64, xmm1","VMOVHPS xmm1, m64","vmovhps xmm1, m64","EVEX.128.0F.W= 0 17 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,r","","" +"VMOVHPS m64, xmm1","VMOVHPS xmm1, m64","vmovhps xmm1, m64","VEX.128.0F.WI= G 17 /r","V","V","AVX","modrm_memonly","w,r","","" +"VMOVLHPS xmm1, xmmV, xmm2","VMOVLHPS xmm2, xmmV, xmm1","vmovlhps xmm2, xm= mV, xmm1","EVEX.NDS.128.0F.W0 16 /r","V","V","AVX512F+AVX512VL","modrm_rego= nly","w,r,r","","" +"VMOVLHPS xmm1, xmmV, xmm2","VMOVLHPS xmm2, xmmV, xmm1","vmovlhps xmm2, xm= mV, xmm1","VEX.NDS.128.0F.WIG 16 /r","V","V","AVX","modrm_regonly","w,r,r",= "","" +"VMOVLPD xmm1, xmmV, m64","VMOVLPD m64, xmmV, xmm1","vmovlpd m64, xmmV, xm= m1","EVEX.NDS.LIG.66.0F.W1 12 /r","V","V","AVX512F+AVX512VL","modrm_memonly= ,scale8","w,r,r","","" +"VMOVLPD xmm1, xmmV, m64","VMOVLPD m64, xmmV, xmm1","vmovlpd m64, xmmV, xm= m1","VEX.NDS.128.66.0F.WIG 12 /r","V","V","AVX","modrm_memonly","w,r,r","",= "" +"VMOVLPD m64, xmm1","VMOVLPD xmm1, m64","vmovlpd xmm1, m64","EVEX.LIG.66.0= F.W1 13 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,r","","" +"VMOVLPD m64, xmm1","VMOVLPD xmm1, m64","vmovlpd xmm1, m64","VEX.128.66.0F= .WIG 13 /r","V","V","AVX","modrm_memonly","w,r","","" +"VMOVLPS xmm1, xmmV, m64","VMOVLPS m64, xmmV, xmm1","vmovlps m64, xmmV, xm= m1","EVEX.NDS.128.0F.W0 12 /r","V","V","AVX512F+AVX512VL","modrm_memonly,sc= ale8","w,r,r","","" +"VMOVLPS xmm1, xmmV, m64","VMOVLPS m64, xmmV, xmm1","vmovlps m64, xmmV, xm= m1","VEX.NDS.128.0F.WIG 12 /r","V","V","AVX","modrm_memonly","w,r,r","","" +"VMOVLPS m64, xmm1","VMOVLPS xmm1, m64","vmovlps xmm1, m64","EVEX.128.0F.W= 0 13 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,r","","" +"VMOVLPS m64, xmm1","VMOVLPS xmm1, m64","vmovlps xmm1, m64","VEX.128.0F.WI= G 13 /r","V","V","AVX","modrm_memonly","w,r","","" +"VMOVMSKPD r32, xmm2","VMOVMSKPD xmm2, r32","vmovmskpd xmm2, r32","VEX.128= .66.0F.WIG 50 /r","V","V","AVX","modrm_regonly","w,r","","" +"VMOVMSKPD r32, ymm2","VMOVMSKPD ymm2, r32","vmovmskpd ymm2, r32","VEX.256= .66.0F.WIG 50 /r","V","V","AVX","modrm_regonly","w,r","","" +"VMOVMSKPS r32, xmm2","VMOVMSKPS xmm2, r32","vmovmskps xmm2, r32","VEX.128= .0F.WIG 50 /r","V","V","AVX","modrm_regonly","w,r","","" +"VMOVMSKPS r32, ymm2","VMOVMSKPS ymm2, r32","vmovmskps ymm2, r32","VEX.256= .0F.WIG 50 /r","V","V","AVX","modrm_regonly","w,r","","" +"VMOVNTDQ m128, xmm1","VMOVNTDQ xmm1, m128","vmovntdq xmm1, m128","EVEX.12= 8.66.0F.W0 E7 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale16","w,r",= "","" +"VMOVNTDQ m128, xmm1","VMOVNTDQ 
xmm1, m128","vmovntdq xmm1, m128","VEX.128= .66.0F.WIG E7 /r","V","V","AVX","modrm_memonly","w,r","","" +"VMOVNTDQ m256, ymm1","VMOVNTDQ ymm1, m256","vmovntdq ymm1, m256","EVEX.25= 6.66.0F.W0 E7 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale32","w,r",= "","" +"VMOVNTDQ m256, ymm1","VMOVNTDQ ymm1, m256","vmovntdq ymm1, m256","VEX.256= .66.0F.WIG E7 /r","V","V","AVX","modrm_memonly","w,r","","" +"VMOVNTDQ m512, zmm1","VMOVNTDQ zmm1, m512","vmovntdq zmm1, m512","EVEX.51= 2.66.0F.W0 E7 /r","V","V","AVX512F","modrm_memonly,scale64","w,r","","" +"VMOVNTDQA xmm1, m128","VMOVNTDQA m128, xmm1","vmovntdqa m128, xmm1","EVEX= .128.66.0F38.W0 2A /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale16","= w,r","","" +"VMOVNTDQA xmm1, m128","VMOVNTDQA m128, xmm1","vmovntdqa m128, xmm1","VEX.= 128.66.0F38.WIG 2A /r","V","V","AVX","modrm_memonly","w,r","","" +"VMOVNTDQA ymm1, m256","VMOVNTDQA m256, ymm1","vmovntdqa m256, ymm1","EVEX= .256.66.0F38.W0 2A /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale32","= w,r","","" +"VMOVNTDQA ymm1, m256","VMOVNTDQA m256, ymm1","vmovntdqa m256, ymm1","VEX.= 256.66.0F38.WIG 2A /r","V","V","AVX2","modrm_memonly","w,r","","" +"VMOVNTDQA zmm1, m512","VMOVNTDQA m512, zmm1","vmovntdqa m512, zmm1","EVEX= .512.66.0F38.W0 2A /r","V","V","AVX512F","modrm_memonly,scale64","w,r","","" +"VMOVNTPD m128, xmm1","VMOVNTPD xmm1, m128","vmovntpd xmm1, m128","EVEX.12= 8.66.0F.W1 2B /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale16","w,r",= "","" +"VMOVNTPD m128, xmm1","VMOVNTPD xmm1, m128","vmovntpd xmm1, m128","VEX.128= .66.0F.WIG 2B /r","V","V","AVX","modrm_memonly","w,r","","" +"VMOVNTPD m256, ymm1","VMOVNTPD ymm1, m256","vmovntpd ymm1, m256","EVEX.25= 6.66.0F.W1 2B /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale32","w,r",= "","" +"VMOVNTPD m256, ymm1","VMOVNTPD ymm1, m256","vmovntpd ymm1, m256","VEX.256= .66.0F.WIG 2B /r","V","V","AVX","modrm_memonly","w,r","","" +"VMOVNTPD m512, zmm1","VMOVNTPD zmm1, m512","vmovntpd zmm1, m512","EVEX.51= 2.66.0F.W1 2B /r","V","V","AVX512F","modrm_memonly,scale64","w,r","","" +"VMOVNTPS m128, xmm1","VMOVNTPS xmm1, m128","vmovntps xmm1, m128","EVEX.12= 8.0F.W0 2B /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale16","w,r","",= "" +"VMOVNTPS m128, xmm1","VMOVNTPS xmm1, m128","vmovntps xmm1, m128","VEX.128= .0F.WIG 2B /r","V","V","AVX","modrm_memonly","w,r","","" +"VMOVNTPS m256, ymm1","VMOVNTPS ymm1, m256","vmovntps ymm1, m256","EVEX.25= 6.0F.W0 2B /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale32","w,r","",= "" +"VMOVNTPS m256, ymm1","VMOVNTPS ymm1, m256","vmovntps ymm1, m256","VEX.256= .0F.WIG 2B /r","V","V","AVX","modrm_memonly","w,r","","" +"VMOVNTPS m512, zmm1","VMOVNTPS zmm1, m512","vmovntps zmm1, m512","EVEX.51= 2.0F.W0 2B /r","V","V","AVX512F","modrm_memonly,scale64","w,r","","" +"VMOVQ xmm1, r/m64","VMOVQ r/m64, xmm1","vmovq r/m64, xmm1","EVEX.128.66.0= F.W1 6E /r","N.S.","V","AVX512F+AVX512VL","scale8","w,r","","" +"VMOVQ xmm1, r/m64","VMOVQ r/m64, xmm1","vmovq r/m64, xmm1","VEX.128.66.0F= .W1 6E /r","N.S.","V","AVX","","w,r","","" +"VMOVQ r/m64, xmm1","VMOVQ xmm1, r/m64","vmovq xmm1, r/m64","EVEX.128.66.0= F.W1 7E /r","N.S.","V","AVX512F+AVX512VL","scale8","w,r","","" +"VMOVQ r/m64, xmm1","VMOVQ xmm1, r/m64","vmovq xmm1, r/m64","VEX.128.66.0F= .W1 7E /r","N.S.","V","AVX","","w,r","","" +"VMOVQ xmm2/m64, xmm1","VMOVQ xmm1, xmm2/m64","vmovq xmm1, xmm2/m64","EVEX= .LIG.66.0F.W1 D6 /r","V","V","AVX512F+AVX512VL","scale8","w,r","","" +"VMOVQ xmm2/m64, xmm1","VMOVQ xmm1, xmm2/m64","vmovq xmm1, xmm2/m64","VEX.= 128.66.0F.WIG D6 
/r","V","V","AVX","","w,r","","" +"VMOVQ xmm1, xmm2/m64","VMOVQ xmm2/m64, xmm1","vmovq xmm2/m64, xmm1","EVEX= .LIG.F3.0F.W1 7E /r","V","V","AVX512F+AVX512VL","scale8","w,r","","" +"VMOVQ xmm1, xmm2/m64","VMOVQ xmm2/m64, xmm1","vmovq xmm2/m64, xmm1","VEX.= 128.F3.0F.WIG 7E /r","V","V","AVX","","w,r","","" +"VMOVSD xmm1, m64","VMOVSD m64, xmm1","vmovsd m64, xmm1","VEX.LIG.F2.0F.WI= G 10 /r","V","V","AVX","modrm_memonly","w,r","","" +"VMOVSD xmm1, {k}{z}, m64","VMOVSD m64, {k}{z}, xmm1","vmovsd m64, {k}{z},= xmm1","EVEX.LIG.F2.0F.W1 10 /r","V","V","AVX512F","modrm_memonly,scale8","= w,r,r","","" +"VMOVSD m64, xmm1","VMOVSD xmm1, m64","vmovsd xmm1, m64","VEX.LIG.F2.0F.WI= G 11 /r","V","V","AVX","modrm_memonly","w,r","","" +"VMOVSD xmm2, xmmV, xmm1","VMOVSD xmm1, xmmV, xmm2","vmovsd xmm1, xmmV, xm= m2","VEX.NDS.LIG.F2.0F.WIG 11 /r","V","V","AVX","modrm_regonly","w,r,r","",= "" +"VMOVSD xmm2, {k}{z}, xmmV, xmm1","VMOVSD xmm1, xmmV, {k}{z}, xmm2","vmovs= d xmm1, xmmV, {k}{z}, xmm2","EVEX.NDS.LIG.F2.0F.W1 11 /r","V","V","AVX512F"= ,"modrm_regonly","w,r,r,r","","" +"VMOVSD m64, {k}, xmm1","VMOVSD xmm1, {k}, m64","vmovsd xmm1, {k}, m64","E= VEX.LIG.F2.0F.W1 11 /r","V","V","AVX512F","modrm_memonly,scale8","w,r,r",""= ,"" +"VMOVSD xmm1, xmmV, xmm2","VMOVSD xmm2, xmmV, xmm1","vmovsd xmm2, xmmV, xm= m1","VEX.NDS.LIG.F2.0F.WIG 10 /r","V","V","AVX","modrm_regonly","w,r,r","",= "" +"VMOVSD xmm1, {k}{z}, xmmV, xmm2","VMOVSD xmm2, xmmV, {k}{z}, xmm1","vmovs= d xmm2, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 10 /r","V","V","AVX512F"= ,"modrm_regonly","w,r,r,r","","" +"VMOVSHDUP xmm1, xmm2/m128","VMOVSHDUP xmm2/m128, xmm1","vmovshdup xmm2/m1= 28, xmm1","VEX.128.F3.0F.WIG 16 /r","V","V","AVX","","w,r","","" +"VMOVSHDUP xmm1, {k}{z}, xmm2/m128","VMOVSHDUP xmm2/m128, {k}{z}, xmm1","v= movshdup xmm2/m128, {k}{z}, xmm1","EVEX.128.F3.0F.W0 16 /r","V","V","AVX512= F+AVX512VL","scale16","w,r,r","","" +"VMOVSHDUP ymm1, ymm2/m256","VMOVSHDUP ymm2/m256, ymm1","vmovshdup ymm2/m2= 56, ymm1","VEX.256.F3.0F.WIG 16 /r","V","V","AVX","","w,r","","" +"VMOVSHDUP ymm1, {k}{z}, ymm2/m256","VMOVSHDUP ymm2/m256, {k}{z}, ymm1","v= movshdup ymm2/m256, {k}{z}, ymm1","EVEX.256.F3.0F.W0 16 /r","V","V","AVX512= F+AVX512VL","scale32","w,r,r","","" +"VMOVSHDUP zmm1, {k}{z}, zmm2/m512","VMOVSHDUP zmm2/m512, {k}{z}, zmm1","v= movshdup zmm2/m512, {k}{z}, zmm1","EVEX.512.F3.0F.W0 16 /r","V","V","AVX512= F","scale64","w,r,r","","" +"VMOVSLDUP xmm1, xmm2/m128","VMOVSLDUP xmm2/m128, xmm1","vmovsldup xmm2/m1= 28, xmm1","VEX.128.F3.0F.WIG 12 /r","V","V","AVX","","w,r","","" +"VMOVSLDUP xmm1, {k}{z}, xmm2/m128","VMOVSLDUP xmm2/m128, {k}{z}, xmm1","v= movsldup xmm2/m128, {k}{z}, xmm1","EVEX.128.F3.0F.W0 12 /r","V","V","AVX512= F+AVX512VL","scale16","w,r,r","","" +"VMOVSLDUP ymm1, ymm2/m256","VMOVSLDUP ymm2/m256, ymm1","vmovsldup ymm2/m2= 56, ymm1","VEX.256.F3.0F.WIG 12 /r","V","V","AVX","","w,r","","" +"VMOVSLDUP ymm1, {k}{z}, ymm2/m256","VMOVSLDUP ymm2/m256, {k}{z}, ymm1","v= movsldup ymm2/m256, {k}{z}, ymm1","EVEX.256.F3.0F.W0 12 /r","V","V","AVX512= F+AVX512VL","scale32","w,r,r","","" +"VMOVSLDUP zmm1, {k}{z}, zmm2/m512","VMOVSLDUP zmm2/m512, {k}{z}, zmm1","v= movsldup zmm2/m512, {k}{z}, zmm1","EVEX.512.F3.0F.W0 12 /r","V","V","AVX512= F","scale64","w,r,r","","" +"VMOVSS xmm1, m32","VMOVSS m32, xmm1","vmovss m32, xmm1","VEX.LIG.F3.0F.WI= G 10 /r","V","V","AVX","modrm_memonly","w,r","","" +"VMOVSS xmm1, {k}{z}, m32","VMOVSS m32, {k}{z}, xmm1","vmovss m32, {k}{z},= xmm1","EVEX.LIG.F3.0F.W0 10 /r","V","V","AVX512F","modrm_memonly,scale4","= 
w,r,r","","" +"VMOVSS m32, xmm1","VMOVSS xmm1, m32","vmovss xmm1, m32","VEX.LIG.F3.0F.WI= G 11 /r","V","V","AVX","modrm_memonly","w,r","","" +"VMOVSS xmm2, xmmV, xmm1","VMOVSS xmm1, xmmV, xmm2","vmovss xmm1, xmmV, xm= m2","VEX.NDS.LIG.F3.0F.WIG 11 /r","V","V","AVX","modrm_regonly","w,r,r","",= "" +"VMOVSS xmm2, {k}{z}, xmmV, xmm1","VMOVSS xmm1, xmmV, {k}{z}, xmm2","vmovs= s xmm1, xmmV, {k}{z}, xmm2","EVEX.NDS.LIG.F3.0F.W0 11 /r","V","V","AVX512F"= ,"modrm_regonly","w,r,r,r","","" +"VMOVSS m32, {k}, xmm1","VMOVSS xmm1, {k}, m32","vmovss xmm1, {k}, m32","E= VEX.LIG.F3.0F.W0 11 /r","V","V","AVX512F","modrm_memonly,scale4","w,r,r",""= ,"" +"VMOVSS xmm1, xmmV, xmm2","VMOVSS xmm2, xmmV, xmm1","vmovss xmm2, xmmV, xm= m1","VEX.NDS.LIG.F3.0F.WIG 10 /r","V","V","AVX","modrm_regonly","w,r,r","",= "" +"VMOVSS xmm1, {k}{z}, xmmV, xmm2","VMOVSS xmm2, xmmV, {k}{z}, xmm1","vmovs= s xmm2, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 10 /r","V","V","AVX512F"= ,"modrm_regonly","w,r,r,r","","" +"VMOVUPD xmm2/m128, xmm1","VMOVUPD xmm1, xmm2/m128","vmovupd xmm1, xmm2/m1= 28","VEX.128.66.0F.WIG 11 /r","V","V","AVX","","w,r","","" +"VMOVUPD xmm2/m128, {k}{z}, xmm1","VMOVUPD xmm1, {k}{z}, xmm2/m128","vmovu= pd xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F.W1 11 /r","V","V","AVX512F+AVX5= 12VL","scale16","w,r,r","","" +"VMOVUPD xmm1, xmm2/m128","VMOVUPD xmm2/m128, xmm1","vmovupd xmm2/m128, xm= m1","VEX.128.66.0F.WIG 10 /r","V","V","AVX","","w,r","","" +"VMOVUPD xmm1, {k}{z}, xmm2/m128","VMOVUPD xmm2/m128, {k}{z}, xmm1","vmovu= pd xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F.W1 10 /r","V","V","AVX512F+AVX5= 12VL","scale16","w,r,r","","" +"VMOVUPD ymm2/m256, ymm1","VMOVUPD ymm1, ymm2/m256","vmovupd ymm1, ymm2/m2= 56","VEX.256.66.0F.WIG 11 /r","V","V","AVX","","w,r","","" +"VMOVUPD ymm2/m256, {k}{z}, ymm1","VMOVUPD ymm1, {k}{z}, ymm2/m256","vmovu= pd ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F.W1 11 /r","V","V","AVX512F+AVX5= 12VL","scale32","w,r,r","","" +"VMOVUPD ymm1, ymm2/m256","VMOVUPD ymm2/m256, ymm1","vmovupd ymm2/m256, ym= m1","VEX.256.66.0F.WIG 10 /r","V","V","AVX","","w,r","","" +"VMOVUPD ymm1, {k}{z}, ymm2/m256","VMOVUPD ymm2/m256, {k}{z}, ymm1","vmovu= pd ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F.W1 10 /r","V","V","AVX512F+AVX5= 12VL","scale32","w,r,r","","" +"VMOVUPD zmm2/m512, {k}{z}, zmm1","VMOVUPD zmm1, {k}{z}, zmm2/m512","vmovu= pd zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F.W1 11 /r","V","V","AVX512F","sc= ale64","w,r,r","","" +"VMOVUPD zmm1, {k}{z}, zmm2/m512","VMOVUPD zmm2/m512, {k}{z}, zmm1","vmovu= pd zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F.W1 10 /r","V","V","AVX512F","sc= ale64","w,r,r","","" +"VMOVUPS xmm2/m128, xmm1","VMOVUPS xmm1, xmm2/m128","vmovups xmm1, xmm2/m1= 28","VEX.128.0F.WIG 11 /r","V","V","AVX","","w,r","","" +"VMOVUPS xmm2/m128, {k}{z}, xmm1","VMOVUPS xmm1, {k}{z}, xmm2/m128","vmovu= ps xmm1, {k}{z}, xmm2/m128","EVEX.128.0F.W0 11 /r","V","V","AVX512F+AVX512V= L","scale16","w,r,r","","" +"VMOVUPS xmm1, xmm2/m128","VMOVUPS xmm2/m128, xmm1","vmovups xmm2/m128, xm= m1","VEX.128.0F.WIG 10 /r","V","V","AVX","","w,r","","" +"VMOVUPS xmm1, {k}{z}, xmm2/m128","VMOVUPS xmm2/m128, {k}{z}, xmm1","vmovu= ps xmm2/m128, {k}{z}, xmm1","EVEX.128.0F.W0 10 /r","V","V","AVX512F+AVX512V= L","scale16","w,r,r","","" +"VMOVUPS ymm2/m256, ymm1","VMOVUPS ymm1, ymm2/m256","vmovups ymm1, ymm2/m2= 56","VEX.256.0F.WIG 11 /r","V","V","AVX","","w,r","","" +"VMOVUPS ymm2/m256, {k}{z}, ymm1","VMOVUPS ymm1, {k}{z}, ymm2/m256","vmovu= ps ymm1, {k}{z}, ymm2/m256","EVEX.256.0F.W0 11 /r","V","V","AVX512F+AVX512V= 
L","scale32","w,r,r","","" +"VMOVUPS ymm1, ymm2/m256","VMOVUPS ymm2/m256, ymm1","vmovups ymm2/m256, ym= m1","VEX.256.0F.WIG 10 /r","V","V","AVX","","w,r","","" +"VMOVUPS ymm1, {k}{z}, ymm2/m256","VMOVUPS ymm2/m256, {k}{z}, ymm1","vmovu= ps ymm2/m256, {k}{z}, ymm1","EVEX.256.0F.W0 10 /r","V","V","AVX512F+AVX512V= L","scale32","w,r,r","","" +"VMOVUPS zmm2/m512, {k}{z}, zmm1","VMOVUPS zmm1, {k}{z}, zmm2/m512","vmovu= ps zmm1, {k}{z}, zmm2/m512","EVEX.512.0F.W0 11 /r","V","V","AVX512F","scale= 64","w,r,r","","" +"VMOVUPS zmm1, {k}{z}, zmm2/m512","VMOVUPS zmm2/m512, {k}{z}, zmm1","vmovu= ps zmm2/m512, {k}{z}, zmm1","EVEX.512.0F.W0 10 /r","V","V","AVX512F","scale= 64","w,r,r","","" +"VMPSADBW xmm1, xmmV, xmm2/m128, imm8u","VMPSADBW imm8u, xmm2/m128, xmmV, = xmm1","vmpsadbw imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 42 /= r ib","V","V","AVX","","w,r,r,r","","" +"VMPSADBW ymm1, ymmV, ymm2/m256, imm8u","VMPSADBW imm8u, ymm2/m256, ymmV, = ymm1","vmpsadbw imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WIG 42 /= r ib","V","V","AVX2","","w,r,r,r","","" +"VMPTRLD m64","VMPTRLD m64","vmptrld m64","0F C7 /6","V","V","VTX","modrm_= memonly","r","","" +"VMPTRST m64","VMPTRST m64","vmptrst m64","0F C7 /7","V","V","VTX","modrm_= memonly","w","","" +"VMREAD r/m32, r32","VMREAD r32, r/m32","vmread r32, r/m32","0F 78 /r","V"= ,"N.S.","VTX","","rw,r","","" +"VMREAD r/m64, r64","VMREAD r64, r/m64","vmread r64, r/m64","0F 78 /r","N.= S.","V","VTX","default64","rw,r","","" +"VMRESUME","VMRESUME","vmresume","0F 01 C3","V","V","VTX","","","","" +"VMRUN EAX","VMRUNL EAX","vmrunl EAX","0F 01 D8","V","V","SVM","amd,modrm_= regonly,operand32","r","Y","32" +"VMRUN RAX","VMRUNQ RAX","vmrunq RAX","REX.W 0F 01 D8","N.S.","V","SVM","a= md,modrm_regonly","r","Y","64" +"VMRUN AX","VMRUNW AX","vmrunw AX","0F 01 D8","V","V","SVM","amd,modrm_reg= only,operand16","r","Y","16" +"VMSAVE","VMSAVE","vmsave","0F 01 DB","V","V","SVM","amd","","","" +"VMULPD xmm1, xmmV, xmm2/m128","VMULPD xmm2/m128, xmmV, xmm1","vmulpd xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 59 /r","V","V","AVX","","w,r,r","= ","" +"VMULPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VMULPD xmm2/m128/m64bcst, = xmmV, {k}{z}, xmm1","vmulpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F.W1 59 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r= ","","" +"VMULPD ymm1, ymmV, ymm2/m256","VMULPD ymm2/m256, ymmV, ymm1","vmulpd ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 59 /r","V","V","AVX","","w,r,r","= ","" +"VMULPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VMULPD ymm2/m256/m64bcst, = ymmV, {k}{z}, ymm1","vmulpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F.W1 59 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r= ","","" +"VMULPD zmm1{er}, {k}{z}, zmmV, zmm2","VMULPD zmm2, zmmV, {k}{z}, zmm1{er}= ","vmulpd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.66.0F.W1 59 /r","V","= V","AVX512F","modrm_regonly","w,r,r,r","","" +"VMULPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VMULPD zmm2/m512/m64bcst, = zmmV, {k}{z}, zmm1","vmulpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F.W1 59 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VMULPS xmm1, xmmV, xmm2/m128","VMULPS xmm2/m128, xmmV, xmm1","vmulps xmm2= /m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 59 /r","V","V","AVX","","w,r,r","","" +"VMULPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VMULPS xmm2/m128/m32bcst, = xmmV, {k}{z}, xmm1","vmulps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.0F.W0 59 
/r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","= ","" +"VMULPS ymm1, ymmV, ymm2/m256","VMULPS ymm2/m256, ymmV, ymm1","vmulps ymm2= /m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 59 /r","V","V","AVX","","w,r,r","","" +"VMULPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VMULPS ymm2/m256/m32bcst, = ymmV, {k}{z}, ymm1","vmulps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.0F.W0 59 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","= ","" +"VMULPS zmm1{er}, {k}{z}, zmmV, zmm2","VMULPS zmm2, zmmV, {k}{z}, zmm1{er}= ","vmulps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.0F.W0 59 /r","V","V",= "AVX512F","modrm_regonly","w,r,r,r","","" +"VMULPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VMULPS zmm2/m512/m32bcst, = zmmV, {k}{z}, zmm1","vmulps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.0F.W0 59 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VMULSD xmm1{er}, {k}{z}, xmmV, xmm2","VMULSD xmm2, xmmV, {k}{z}, xmm1{er}= ","vmulsd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F2.0F.W1 59 /r","V","= V","AVX512F","modrm_regonly","w,r,r,r","","" +"VMULSD xmm1, xmmV, xmm2/m64","VMULSD xmm2/m64, xmmV, xmm1","vmulsd xmm2/m= 64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 59 /r","V","V","AVX","","w,r,r","","" +"VMULSD xmm1, {k}{z}, xmmV, xmm2/m64","VMULSD xmm2/m64, xmmV, {k}{z}, xmm1= ","vmulsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 59 /r","V","= V","AVX512F","scale8","w,r,r,r","","" +"VMULSS xmm1{er}, {k}{z}, xmmV, xmm2","VMULSS xmm2, xmmV, {k}{z}, xmm1{er}= ","vmulss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F3.0F.W0 59 /r","V","= V","AVX512F","modrm_regonly","w,r,r,r","","" +"VMULSS xmm1, xmmV, xmm2/m32","VMULSS xmm2/m32, xmmV, xmm1","vmulss xmm2/m= 32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 59 /r","V","V","AVX","","w,r,r","","" +"VMULSS xmm1, {k}{z}, xmmV, xmm2/m32","VMULSS xmm2/m32, xmmV, {k}{z}, xmm1= ","vmulss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 59 /r","V","= V","AVX512F","scale4","w,r,r,r","","" +"VMWRITE r32, r/m32","VMWRITE r/m32, r32","vmwrite r/m32, r32","0F 79 /r",= "V","N.S.","VTX","","r,r","","" +"VMWRITE r64, r/m64","VMWRITE r/m64, r64","vmwrite r/m64, r64","0F 79 /r",= "N.S.","V","VTX","default64","r,r","","" +"VMXOFF","VMXOFF","vmxoff","0F 01 C4","V","V","VTX","","","","" +"VMXON m64","VMXON m64","vmxon m64","F3 0F C7 /6","V","V","VTX","modrm_mem= only","r","","" +"VORPD xmm1, xmmV, xmm2/m128","VORPD xmm2/m128, xmmV, xmm1","vorpd xmm2/m1= 28, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 56 /r","V","V","AVX","","w,r,r","","" +"VORPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VORPD xmm2/m128/m64bcst, xm= mV, {k}{z}, xmm1","vorpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.1= 28.66.0F.W1 56 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r,r",= "","" +"VORPD ymm1, ymmV, ymm2/m256","VORPD ymm2/m256, ymmV, ymm1","vorpd ymm2/m2= 56, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 56 /r","V","V","AVX","","w,r,r","","" +"VORPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VORPD ymm2/m256/m64bcst, ym= mV, {k}{z}, ymm1","vorpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.2= 56.66.0F.W1 56 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r,r",= "","" +"VORPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VORPD zmm2/m512/m64bcst, zm= mV, {k}{z}, zmm1","vorpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.5= 12.66.0F.W1 56 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r,r","","" +"VORPS xmm1, xmmV, xmm2/m128","VORPS xmm2/m128, xmmV, xmm1","vorps xmm2/m1= 28, xmmV, xmm1","VEX.NDS.128.0F.WIG 56 /r","V","V","AVX","","w,r,r","","" +"VORPS xmm1, {k}{z}, xmmV, 
xmm2/m128/m32bcst","VORPS xmm2/m128/m32bcst, xm= mV, {k}{z}, xmm1","vorps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.1= 28.0F.W0 56 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r,r","",= "" +"VORPS ymm1, ymmV, ymm2/m256","VORPS ymm2/m256, ymmV, ymm1","vorps ymm2/m2= 56, ymmV, ymm1","VEX.NDS.256.0F.WIG 56 /r","V","V","AVX","","w,r,r","","" +"VORPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VORPS ymm2/m256/m32bcst, ym= mV, {k}{z}, ymm1","vorps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.2= 56.0F.W0 56 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale32","w,r,r,r","",= "" +"VORPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VORPS zmm2/m512/m32bcst, zm= mV, {k}{z}, zmm1","vorps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.5= 12.0F.W0 56 /r","V","V","AVX512DQ","bscale4,scale64","w,r,r,r","","" +"VP4DPWSSD zmm1, {k}{z}, zmmV+3, m128","VP4DPWSSD m128, zmmV+3, {k}{z}, zm= m1","vp4dpwssd m128, zmmV+3, {k}{z}, zmm1","EVEX.DDS.512.F2.0F38.W0 52 /r",= "V","V","AVX512_4VNNIW","modrm_memonly,scale16","rw,r,r,r","","" +"VP4DPWSSDS zmm1, {k}{z}, zmmV+3, m128","VP4DPWSSDS m128, zmmV+3, {k}{z}, = zmm1","vp4dpwssds m128, zmmV+3, {k}{z}, zmm1","EVEX.DDS.512.F2.0F38.W0 53 /= r","V","V","AVX512_4VNNIW","modrm_memonly,scale16","rw,r,r,r","","" +"VPABSB xmm1, xmm2/m128","VPABSB xmm2/m128, xmm1","vpabsb xmm2/m128, xmm1"= ,"VEX.128.66.0F38.WIG 1C /r","V","V","AVX","","w,r","","" +"VPABSB xmm1, {k}{z}, xmm2/m128","VPABSB xmm2/m128, {k}{z}, xmm1","vpabsb = xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 1C /r","V","V","AVX512BW+AVX= 512VL","scale16","w,r,r","","" +"VPABSB ymm1, ymm2/m256","VPABSB ymm2/m256, ymm1","vpabsb ymm2/m256, ymm1"= ,"VEX.256.66.0F38.WIG 1C /r","V","V","AVX2","","w,r","","" +"VPABSB ymm1, {k}{z}, ymm2/m256","VPABSB ymm2/m256, {k}{z}, ymm1","vpabsb = ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 1C /r","V","V","AVX512BW+AVX= 512VL","scale32","w,r,r","","" +"VPABSB zmm1, {k}{z}, zmm2/m512","VPABSB zmm2/m512, {k}{z}, zmm1","vpabsb = zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 1C /r","V","V","AVX512BW","s= cale64","w,r,r","","" +"VPABSD xmm1, xmm2/m128","VPABSD xmm2/m128, xmm1","vpabsd xmm2/m128, xmm1"= ,"VEX.128.66.0F38.WIG 1E /r","V","V","AVX","","w,r","","" +"VPABSD xmm1, {k}{z}, xmm2/m128/m32bcst","VPABSD xmm2/m128/m32bcst, {k}{z}= , xmm1","vpabsd xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W0 1E /r= ","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","","" +"VPABSD ymm1, ymm2/m256","VPABSD ymm2/m256, ymm1","vpabsd ymm2/m256, ymm1"= ,"VEX.256.66.0F38.WIG 1E /r","V","V","AVX2","","w,r","","" +"VPABSD ymm1, {k}{z}, ymm2/m256/m32bcst","VPABSD ymm2/m256/m32bcst, {k}{z}= , ymm1","vpabsd ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W0 1E /r= ","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","","" +"VPABSD zmm1, {k}{z}, zmm2/m512/m32bcst","VPABSD zmm2/m512/m32bcst, {k}{z}= , zmm1","vpabsd zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0 1E /r= ","V","V","AVX512F","bscale4,scale64","w,r,r","","" +"VPABSQ xmm1, {k}{z}, xmm2/m128/m64bcst","VPABSQ xmm2/m128/m64bcst, {k}{z}= , xmm1","vpabsq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W1 1F /r= ","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","","" +"VPABSQ ymm1, {k}{z}, ymm2/m256/m64bcst","VPABSQ ymm2/m256/m64bcst, {k}{z}= , ymm1","vpabsq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W1 1F /r= ","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","","" +"VPABSQ zmm1, {k}{z}, zmm2/m512/m64bcst","VPABSQ zmm2/m512/m64bcst, {k}{z}= , zmm1","vpabsq zmm2/m512/m64bcst, {k}{z}, 
zmm1","EVEX.512.66.0F38.W1 1F /r= ","V","V","AVX512F","bscale8,scale64","w,r,r","","" +"VPABSW xmm1, xmm2/m128","VPABSW xmm2/m128, xmm1","vpabsw xmm2/m128, xmm1"= ,"VEX.128.66.0F38.WIG 1D /r","V","V","AVX","","w,r","","" +"VPABSW xmm1, {k}{z}, xmm2/m128","VPABSW xmm2/m128, {k}{z}, xmm1","vpabsw = xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 1D /r","V","V","AVX512BW+AVX= 512VL","scale16","w,r,r","","" +"VPABSW ymm1, ymm2/m256","VPABSW ymm2/m256, ymm1","vpabsw ymm2/m256, ymm1"= ,"VEX.256.66.0F38.WIG 1D /r","V","V","AVX2","","w,r","","" +"VPABSW ymm1, {k}{z}, ymm2/m256","VPABSW ymm2/m256, {k}{z}, ymm1","vpabsw = ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 1D /r","V","V","AVX512BW+AVX= 512VL","scale32","w,r,r","","" +"VPABSW zmm1, {k}{z}, zmm2/m512","VPABSW zmm2/m512, {k}{z}, zmm1","vpabsw = zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 1D /r","V","V","AVX512BW","s= cale64","w,r,r","","" +"VPACKSSDW xmm1, xmmV, xmm2/m128","VPACKSSDW xmm2/m128, xmmV, xmm1","vpack= ssdw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 6B /r","V","V","AVX","",= "w,r,r","","" +"VPACKSSDW xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPACKSSDW xmm2/m128/m32= bcst, xmmV, {k}{z}, xmm1","vpackssdw xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.NDS.128.66.0F.W0 6B /r","V","V","AVX512BW+AVX512VL","bscale4,scale16= ","w,r,r,r","","" +"VPACKSSDW ymm1, ymmV, ymm2/m256","VPACKSSDW ymm2/m256, ymmV, ymm1","vpack= ssdw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 6B /r","V","V","AVX2",""= ,"w,r,r","","" +"VPACKSSDW ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPACKSSDW ymm2/m256/m32= bcst, ymmV, {k}{z}, ymm1","vpackssdw ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.NDS.256.66.0F.W0 6B /r","V","V","AVX512BW+AVX512VL","bscale4,scale32= ","w,r,r,r","","" +"VPACKSSDW zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPACKSSDW zmm2/m512/m32= bcst, zmmV, {k}{z}, zmm1","vpackssdw zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.NDS.512.66.0F.W0 6B /r","V","V","AVX512BW","bscale4,scale64","w,r,r,= r","","" +"VPACKSSWB xmm1, xmmV, xmm2/m128","VPACKSSWB xmm2/m128, xmmV, xmm1","vpack= sswb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 63 /r","V","V","AVX","",= "w,r,r","","" +"VPACKSSWB xmm1, {k}{z}, xmmV, xmm2/m128","VPACKSSWB xmm2/m128, xmmV, {k}{= z}, xmm1","vpacksswb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG= 63 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPACKSSWB ymm1, ymmV, ymm2/m256","VPACKSSWB ymm2/m256, ymmV, ymm1","vpack= sswb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 63 /r","V","V","AVX2",""= ,"w,r,r","","" +"VPACKSSWB ymm1, {k}{z}, ymmV, ymm2/m256","VPACKSSWB ymm2/m256, ymmV, {k}{= z}, ymm1","vpacksswb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG= 63 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPACKSSWB zmm1, {k}{z}, zmmV, zmm2/m512","VPACKSSWB zmm2/m512, zmmV, {k}{= z}, zmm1","vpacksswb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG= 63 /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPACKUSDW xmm1, xmmV, xmm2/m128","VPACKUSDW xmm2/m128, xmmV, xmm1","vpack= usdw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 2B /r","V","V","AVX","= ","w,r,r","","" +"VPACKUSDW xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPACKUSDW xmm2/m128/m32= bcst, xmmV, {k}{z}, xmm1","vpackusdw xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.NDS.128.66.0F38.W0 2B /r","V","V","AVX512BW+AVX512VL","bscale4,scale= 16","w,r,r,r","","" +"VPACKUSDW ymm1, ymmV, ymm2/m256","VPACKUSDW ymm2/m256, ymmV, ymm1","vpack= usdw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 2B /r","V","V","AVX2",= "","w,r,r","","" +"VPACKUSDW 
ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPACKUSDW ymm2/m256/m32= bcst, ymmV, {k}{z}, ymm1","vpackusdw ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.NDS.256.66.0F38.W0 2B /r","V","V","AVX512BW+AVX512VL","bscale4,scale= 32","w,r,r,r","","" +"VPACKUSDW zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPACKUSDW zmm2/m512/m32= bcst, zmmV, {k}{z}, zmm1","vpackusdw zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.NDS.512.66.0F38.W0 2B /r","V","V","AVX512BW","bscale4,scale64","w,r,= r,r","","" +"VPACKUSWB xmm1, xmmV, xmm2/m128","VPACKUSWB xmm2/m128, xmmV, xmm1","vpack= uswb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 67 /r","V","V","AVX","",= "w,r,r","","" +"VPACKUSWB xmm1, {k}{z}, xmmV, xmm2/m128","VPACKUSWB xmm2/m128, xmmV, {k}{= z}, xmm1","vpackuswb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG= 67 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPACKUSWB ymm1, ymmV, ymm2/m256","VPACKUSWB ymm2/m256, ymmV, ymm1","vpack= uswb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 67 /r","V","V","AVX2",""= ,"w,r,r","","" +"VPACKUSWB ymm1, {k}{z}, ymmV, ymm2/m256","VPACKUSWB ymm2/m256, ymmV, {k}{= z}, ymm1","vpackuswb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG= 67 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPACKUSWB zmm1, {k}{z}, zmmV, zmm2/m512","VPACKUSWB zmm2/m512, zmmV, {k}{= z}, zmm1","vpackuswb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG= 67 /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPADDB xmm1, xmmV, xmm2/m128","VPADDB xmm2/m128, xmmV, xmm1","vpaddb xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG FC /r","V","V","AVX","","w,r,r","= ","" +"VPADDB xmm1, {k}{z}, xmmV, xmm2/m128","VPADDB xmm2/m128, xmmV, {k}{z}, xm= m1","vpaddb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG FC /r","= V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPADDB ymm1, ymmV, ymm2/m256","VPADDB ymm2/m256, ymmV, ymm1","vpaddb ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG FC /r","V","V","AVX2","","w,r,r",= "","" +"VPADDB ymm1, {k}{z}, ymmV, ymm2/m256","VPADDB ymm2/m256, ymmV, {k}{z}, ym= m1","vpaddb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG FC /r","= V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPADDB zmm1, {k}{z}, zmmV, zmm2/m512","VPADDB zmm2/m512, zmmV, {k}{z}, zm= m1","vpaddb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG FC /r","= V","V","AVX512BW","scale64","w,r,r,r","","" +"VPADDD xmm1, xmmV, xmm2/m128","VPADDD xmm2/m128, xmmV, xmm1","vpaddd xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG FE /r","V","V","AVX","","w,r,r","= ","" +"VPADDD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPADDD xmm2/m128/m32bcst, = xmmV, {k}{z}, xmm1","vpaddd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F.W0 FE /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r= ","","" +"VPADDD ymm1, ymmV, ymm2/m256","VPADDD ymm2/m256, ymmV, ymm1","vpaddd ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG FE /r","V","V","AVX2","","w,r,r",= "","" +"VPADDD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPADDD ymm2/m256/m32bcst, = ymmV, {k}{z}, ymm1","vpaddd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F.W0 FE /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r= ","","" +"VPADDD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPADDD zmm2/m512/m32bcst, = zmmV, {k}{z}, zmm1","vpaddd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F.W0 FE /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VPADDQ xmm1, xmmV, xmm2/m128","VPADDQ xmm2/m128, xmmV, xmm1","vpaddq xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D4 
/r","V","V","AVX","","w,r,r","= ","" +"VPADDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPADDQ xmm2/m128/m64bcst, = xmmV, {k}{z}, xmm1","vpaddq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F.W1 D4 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r= ","","" +"VPADDQ ymm1, ymmV, ymm2/m256","VPADDQ ymm2/m256, ymmV, ymm1","vpaddq ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D4 /r","V","V","AVX2","","w,r,r",= "","" +"VPADDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPADDQ ymm2/m256/m64bcst, = ymmV, {k}{z}, ymm1","vpaddq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F.W1 D4 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r= ","","" +"VPADDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPADDQ zmm2/m512/m64bcst, = zmmV, {k}{z}, zmm1","vpaddq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F.W1 D4 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VPADDSB xmm1, xmmV, xmm2/m128","VPADDSB xmm2/m128, xmmV, xmm1","vpaddsb x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG EC /r","V","V","AVX","","w,r,r= ","","" +"VPADDSB xmm1, {k}{z}, xmmV, xmm2/m128","VPADDSB xmm2/m128, xmmV, {k}{z}, = xmm1","vpaddsb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG EC /r= ","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPADDSB ymm1, ymmV, ymm2/m256","VPADDSB ymm2/m256, ymmV, ymm1","vpaddsb y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG EC /r","V","V","AVX2","","w,r,= r","","" +"VPADDSB ymm1, {k}{z}, ymmV, ymm2/m256","VPADDSB ymm2/m256, ymmV, {k}{z}, = ymm1","vpaddsb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG EC /r= ","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPADDSB zmm1, {k}{z}, zmmV, zmm2/m512","VPADDSB zmm2/m512, zmmV, {k}{z}, = zmm1","vpaddsb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG EC /r= ","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPADDSW xmm1, xmmV, xmm2/m128","VPADDSW xmm2/m128, xmmV, xmm1","vpaddsw x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG ED /r","V","V","AVX","","w,r,r= ","","" +"VPADDSW xmm1, {k}{z}, xmmV, xmm2/m128","VPADDSW xmm2/m128, xmmV, {k}{z}, = xmm1","vpaddsw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG ED /r= ","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPADDSW ymm1, ymmV, ymm2/m256","VPADDSW ymm2/m256, ymmV, ymm1","vpaddsw y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG ED /r","V","V","AVX2","","w,r,= r","","" +"VPADDSW ymm1, {k}{z}, ymmV, ymm2/m256","VPADDSW ymm2/m256, ymmV, {k}{z}, = ymm1","vpaddsw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG ED /r= ","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPADDSW zmm1, {k}{z}, zmmV, zmm2/m512","VPADDSW zmm2/m512, zmmV, {k}{z}, = zmm1","vpaddsw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG ED /r= ","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPADDUSB xmm1, xmmV, xmm2/m128","VPADDUSB xmm2/m128, xmmV, xmm1","vpaddus= b xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG DC /r","V","V","AVX","","w,= r,r","","" +"VPADDUSB xmm1, {k}{z}, xmmV, xmm2/m128","VPADDUSB xmm2/m128, xmmV, {k}{z}= , xmm1","vpaddusb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG DC= /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPADDUSB ymm1, ymmV, ymm2/m256","VPADDUSB ymm2/m256, ymmV, ymm1","vpaddus= b ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG DC /r","V","V","AVX2","","w= ,r,r","","" +"VPADDUSB ymm1, {k}{z}, ymmV, ymm2/m256","VPADDUSB ymm2/m256, ymmV, {k}{z}= , ymm1","vpaddusb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG DC= /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" 
+"VPADDUSB zmm1, {k}{z}, zmmV, zmm2/m512","VPADDUSB zmm2/m512, zmmV, {k}{z}= , zmm1","vpaddusb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG DC= /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPADDUSW xmm1, xmmV, xmm2/m128","VPADDUSW xmm2/m128, xmmV, xmm1","vpaddus= w xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG DD /r","V","V","AVX","","w,= r,r","","" +"VPADDUSW xmm1, {k}{z}, xmmV, xmm2/m128","VPADDUSW xmm2/m128, xmmV, {k}{z}= , xmm1","vpaddusw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG DD= /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPADDUSW ymm1, ymmV, ymm2/m256","VPADDUSW ymm2/m256, ymmV, ymm1","vpaddus= w ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG DD /r","V","V","AVX2","","w= ,r,r","","" +"VPADDUSW ymm1, {k}{z}, ymmV, ymm2/m256","VPADDUSW ymm2/m256, ymmV, {k}{z}= , ymm1","vpaddusw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG DD= /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPADDUSW zmm1, {k}{z}, zmmV, zmm2/m512","VPADDUSW zmm2/m512, zmmV, {k}{z}= , zmm1","vpaddusw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG DD= /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPADDW xmm1, xmmV, xmm2/m128","VPADDW xmm2/m128, xmmV, xmm1","vpaddw xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG FD /r","V","V","AVX","","w,r,r","= ","" +"VPADDW xmm1, {k}{z}, xmmV, xmm2/m128","VPADDW xmm2/m128, xmmV, {k}{z}, xm= m1","vpaddw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG FD /r","= V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPADDW ymm1, ymmV, ymm2/m256","VPADDW ymm2/m256, ymmV, ymm1","vpaddw ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG FD /r","V","V","AVX2","","w,r,r",= "","" +"VPADDW ymm1, {k}{z}, ymmV, ymm2/m256","VPADDW ymm2/m256, ymmV, {k}{z}, ym= m1","vpaddw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG FD /r","= V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPADDW zmm1, {k}{z}, zmmV, zmm2/m512","VPADDW zmm2/m512, zmmV, {k}{z}, zm= m1","vpaddw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG FD /r","= V","V","AVX512BW","scale64","w,r,r,r","","" +"VPALIGNR xmm1, xmmV, xmm2/m128, imm8u","VPALIGNR imm8u, xmm2/m128, xmmV, = xmm1","vpalignr imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 0F /= r ib","V","V","AVX","","w,r,r,r","","" +"VPALIGNR xmm1, {k}{z}, xmmV, xmm2/m128, imm8u","VPALIGNR imm8u, xmm2/m128= , xmmV, {k}{z}, xmm1","vpalignr imm8u, xmm2/m128, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F3A.WIG 0F /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r= ,r,r","","" +"VPALIGNR ymm1, ymmV, ymm2/m256, imm8u","VPALIGNR imm8u, ymm2/m256, ymmV, = ymm1","vpalignr imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WIG 0F /= r ib","V","V","AVX2","","w,r,r,r","","" +"VPALIGNR ymm1, {k}{z}, ymmV, ymm2/m256, imm8u","VPALIGNR imm8u, ymm2/m256= , ymmV, {k}{z}, ymm1","vpalignr imm8u, ymm2/m256, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F3A.WIG 0F /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r= ,r,r","","" +"VPALIGNR zmm1, {k}{z}, zmmV, zmm2/m512, imm8u","VPALIGNR imm8u, zmm2/m512= , zmmV, {k}{z}, zmm1","vpalignr imm8u, zmm2/m512, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F3A.WIG 0F /r ib","V","V","AVX512BW","scale64","w,r,r,r,r","",= "" +"VPAND xmm1, xmmV, xmm2/m128","VPAND xmm2/m128, xmmV, xmm1","vpand xmm2/m1= 28, xmmV, xmm1","VEX.NDS.128.66.0F.WIG DB /r","V","V","AVX","","w,r,r","","" +"VPAND ymm1, ymmV, ymm2/m256","VPAND ymm2/m256, ymmV, ymm1","vpand ymm2/m2= 56, ymmV, ymm1","VEX.NDS.256.66.0F.WIG DB /r","V","V","AVX2","","w,r,r","",= "" +"VPANDD xmm1, {k}{z}, xmmV, 
xmm2/m128/m32bcst","VPANDD xmm2/m128/m32bcst, = xmmV, {k}{z}, xmm1","vpandd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F.W0 DB /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r= ","","" +"VPANDD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPANDD ymm2/m256/m32bcst, = ymmV, {k}{z}, ymm1","vpandd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F.W0 DB /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r= ","","" +"VPANDD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPANDD zmm2/m512/m32bcst, = zmmV, {k}{z}, zmm1","vpandd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F.W0 DB /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VPANDN xmm1, xmmV, xmm2/m128","VPANDN xmm2/m128, xmmV, xmm1","vpandn xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG DF /r","V","V","AVX","","w,r,r","= ","" +"VPANDN ymm1, ymmV, ymm2/m256","VPANDN ymm2/m256, ymmV, ymm1","vpandn ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG DF /r","V","V","AVX2","","w,r,r",= "","" +"VPANDND xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPANDND xmm2/m128/m32bcst= , xmmV, {k}{z}, xmm1","vpandnd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F.W0 DF /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,= r,r","","" +"VPANDND ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPANDND ymm2/m256/m32bcst= , ymmV, {k}{z}, ymm1","vpandnd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F.W0 DF /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,= r,r","","" +"VPANDND zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPANDND zmm2/m512/m32bcst= , zmmV, {k}{z}, zmm1","vpandnd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F.W0 DF /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VPANDNQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPANDNQ xmm2/m128/m64bcst= , xmmV, {k}{z}, xmm1","vpandnq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F.W1 DF /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,= r,r","","" +"VPANDNQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPANDNQ ymm2/m256/m64bcst= , ymmV, {k}{z}, ymm1","vpandnq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F.W1 DF /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,= r,r","","" +"VPANDNQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPANDNQ zmm2/m512/m64bcst= , zmmV, {k}{z}, zmm1","vpandnq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F.W1 DF /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VPANDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPANDQ xmm2/m128/m64bcst, = xmmV, {k}{z}, xmm1","vpandq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F.W1 DB /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r= ","","" +"VPANDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPANDQ ymm2/m256/m64bcst, = ymmV, {k}{z}, ymm1","vpandq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F.W1 DB /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r= ","","" +"VPANDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPANDQ zmm2/m512/m64bcst, = zmmV, {k}{z}, zmm1","vpandq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F.W1 DB /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VPAVGB xmm1, xmmV, xmm2/m128","VPAVGB xmm2/m128, xmmV, xmm1","vpavgb xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E0 /r","V","V","AVX","","w,r,r","= ","" +"VPAVGB xmm1, {k}{z}, xmmV, xmm2/m128","VPAVGB xmm2/m128, xmmV, {k}{z}, xm= m1","vpavgb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG E0 /r","= V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPAVGB ymm1, ymmV, ymm2/m256","VPAVGB ymm2/m256, ymmV, 
ymm1","vpavgb ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E0 /r","V","V","AVX2","","w,r,r",= "","" +"VPAVGB ymm1, {k}{z}, ymmV, ymm2/m256","VPAVGB ymm2/m256, ymmV, {k}{z}, ym= m1","vpavgb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG E0 /r","= V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPAVGB zmm1, {k}{z}, zmmV, zmm2/m512","VPAVGB zmm2/m512, zmmV, {k}{z}, zm= m1","vpavgb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG E0 /r","= V","V","AVX512BW","scale64","w,r,r,r","","" +"VPAVGW xmm1, xmmV, xmm2/m128","VPAVGW xmm2/m128, xmmV, xmm1","vpavgw xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E3 /r","V","V","AVX","","w,r,r","= ","" +"VPAVGW xmm1, {k}{z}, xmmV, xmm2/m128","VPAVGW xmm2/m128, xmmV, {k}{z}, xm= m1","vpavgw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG E3 /r","= V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPAVGW ymm1, ymmV, ymm2/m256","VPAVGW ymm2/m256, ymmV, ymm1","vpavgw ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E3 /r","V","V","AVX2","","w,r,r",= "","" +"VPAVGW ymm1, {k}{z}, ymmV, ymm2/m256","VPAVGW ymm2/m256, ymmV, {k}{z}, ym= m1","vpavgw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG E3 /r","= V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPAVGW zmm1, {k}{z}, zmmV, zmm2/m512","VPAVGW zmm2/m512, zmmV, {k}{z}, zm= m1","vpavgw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG E3 /r","= V","V","AVX512BW","scale64","w,r,r,r","","" +"VPBLENDD xmm1, xmmV, xmm2/m128, imm8u","VPBLENDD imm8u, xmm2/m128, xmmV, = xmm1","vpblendd imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 02 /r= ib","V","V","AVX2","","w,r,r,r","","" +"VPBLENDD ymm1, ymmV, ymm2/m256, imm8u","VPBLENDD imm8u, ymm2/m256, ymmV, = ymm1","vpblendd imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 02 /r= ib","V","V","AVX2","","w,r,r,r","","" +"VPBLENDMB xmm1, {k}{z}, xmmV, xmm2/m128","VPBLENDMB xmm2/m128, xmmV, {k}{= z}, xmm1","vpblendmb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W= 0 66 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPBLENDMB ymm1, {k}{z}, ymmV, ymm2/m256","VPBLENDMB ymm2/m256, ymmV, {k}{= z}, ymm1","vpblendmb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W= 0 66 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPBLENDMB zmm1, {k}{z}, zmmV, zmm2/m512","VPBLENDMB zmm2/m512, zmmV, {k}{= z}, zmm1","vpblendmb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W= 0 66 /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPBLENDMD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPBLENDMD xmm2/m128/m32= bcst, xmmV, {k}{z}, xmm1","vpblendmd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.NDS.128.66.0F38.W0 64 /r","V","V","AVX512F+AVX512VL","bscale4,scale1= 6","w,r,r,r","","" +"VPBLENDMD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPBLENDMD ymm2/m256/m32= bcst, ymmV, {k}{z}, ymm1","vpblendmd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.NDS.256.66.0F38.W0 64 /r","V","V","AVX512F+AVX512VL","bscale4,scale3= 2","w,r,r,r","","" +"VPBLENDMD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPBLENDMD zmm2/m512/m32= bcst, zmmV, {k}{z}, zmm1","vpblendmd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.NDS.512.66.0F38.W0 64 /r","V","V","AVX512F","bscale4,scale64","w,r,r= ,r","","" +"VPBLENDMQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPBLENDMQ xmm2/m128/m64= bcst, xmmV, {k}{z}, xmm1","vpblendmq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.NDS.128.66.0F38.W1 64 /r","V","V","AVX512F+AVX512VL","bscale8,scale1= 6","w,r,r,r","","" +"VPBLENDMQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPBLENDMQ ymm2/m256/m64= bcst, ymmV, {k}{z}, 
ymm1","vpblendmq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.NDS.256.66.0F38.W1 64 /r","V","V","AVX512F+AVX512VL","bscale8,scale3= 2","w,r,r,r","","" +"VPBLENDMQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPBLENDMQ zmm2/m512/m64= bcst, zmmV, {k}{z}, zmm1","vpblendmq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.NDS.512.66.0F38.W1 64 /r","V","V","AVX512F","bscale8,scale64","w,r,r= ,r","","" +"VPBLENDMW xmm1, {k}{z}, xmmV, xmm2/m128","VPBLENDMW xmm2/m128, xmmV, {k}{= z}, xmm1","vpblendmw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W= 1 66 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPBLENDMW ymm1, {k}{z}, ymmV, ymm2/m256","VPBLENDMW ymm2/m256, ymmV, {k}{= z}, ymm1","vpblendmw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W= 1 66 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPBLENDMW zmm1, {k}{z}, zmmV, zmm2/m512","VPBLENDMW zmm2/m512, zmmV, {k}{= z}, zmm1","vpblendmw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W= 1 66 /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPBLENDVB xmm1, xmmV, xmm2/m128, xmmIH","VPBLENDVB xmmIH, xmm2/m128, xmmV= , xmm1","vpblendvb xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 4C= /r /is4","V","V","AVX","","w,r,r,r","","" +"VPBLENDVB ymm1, ymmV, ymm2/m256, ymmIH","VPBLENDVB ymmIH, ymm2/m256, ymmV= , ymm1","vpblendvb ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 4C= /r /is4","V","V","AVX2","","w,r,r,r","","" +"VPBLENDW xmm1, xmmV, xmm2/m128, imm8u","VPBLENDW imm8u, xmm2/m128, xmmV, = xmm1","vpblendw imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 0E /= r ib","V","V","AVX","","w,r,r,r","","" +"VPBLENDW ymm1, ymmV, ymm2/m256, imm8u","VPBLENDW imm8u, ymm2/m256, ymmV, = ymm1","vpblendw imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WIG 0E /= r ib","V","V","AVX2","","w,r,r,r","","" +"VPBROADCASTB xmm1, {k}{z}, rmr32","VPBROADCASTB rmr32, {k}{z}, xmm1","vpb= roadcastb rmr32, {k}{z}, xmm1","EVEX.128.66.0F38.W0 7A /r","V","V","AVX512B= W+AVX512VL","modrm_regonly","w,r,r","","" +"VPBROADCASTB ymm1, {k}{z}, rmr32","VPBROADCASTB rmr32, {k}{z}, ymm1","vpb= roadcastb rmr32, {k}{z}, ymm1","EVEX.256.66.0F38.W0 7A /r","V","V","AVX512B= W+AVX512VL","modrm_regonly","w,r,r","","" +"VPBROADCASTB zmm1, {k}{z}, rmr32","VPBROADCASTB rmr32, {k}{z}, zmm1","vpb= roadcastb rmr32, {k}{z}, zmm1","EVEX.512.66.0F38.W0 7A /r","V","V","AVX512B= W","modrm_regonly","w,r,r","","" +"VPBROADCASTB xmm1, xmm2/m8","VPBROADCASTB xmm2/m8, xmm1","vpbroadcastb xm= m2/m8, xmm1","VEX.128.66.0F38.W0 78 /r","V","V","AVX2","","w,r","","" +"VPBROADCASTB ymm1, xmm2/m8","VPBROADCASTB xmm2/m8, ymm1","vpbroadcastb xm= m2/m8, ymm1","VEX.256.66.0F38.W0 78 /r","V","V","AVX2","","w,r","","" +"VPBROADCASTB xmm1, {k}{z}, xmm2/m8","VPBROADCASTB xmm2/m8, {k}{z}, xmm1",= "vpbroadcastb xmm2/m8, {k}{z}, xmm1","EVEX.128.66.0F38.W0 78 /r","V","V","A= VX512BW+AVX512VL","scale1","w,r,r","","" +"VPBROADCASTB ymm1, {k}{z}, xmm2/m8","VPBROADCASTB xmm2/m8, {k}{z}, ymm1",= "vpbroadcastb xmm2/m8, {k}{z}, ymm1","EVEX.256.66.0F38.W0 78 /r","V","V","A= VX512BW+AVX512VL","scale1","w,r,r","","" +"VPBROADCASTB zmm1, {k}{z}, xmm2/m8","VPBROADCASTB xmm2/m8, {k}{z}, zmm1",= "vpbroadcastb xmm2/m8, {k}{z}, zmm1","EVEX.512.66.0F38.W0 78 /r","V","V","A= VX512BW","scale1","w,r,r","","" +"VPBROADCASTD xmm1, {k}{z}, rmr32","VPBROADCASTD rmr32, {k}{z}, xmm1","vpb= roadcastd rmr32, {k}{z}, xmm1","EVEX.128.66.0F38.W0 7C /r","V","V","AVX512F= +AVX512VL","modrm_regonly","w,r,r","","" +"VPBROADCASTD ymm1, {k}{z}, rmr32","VPBROADCASTD rmr32, {k}{z}, ymm1","vpb= roadcastd rmr32, 
{k}{z}, ymm1","EVEX.256.66.0F38.W0 7C /r","V","V","AVX512F= +AVX512VL","modrm_regonly","w,r,r","","" +"VPBROADCASTD zmm1, {k}{z}, rmr32","VPBROADCASTD rmr32, {k}{z}, zmm1","vpb= roadcastd rmr32, {k}{z}, zmm1","EVEX.512.66.0F38.W0 7C /r","V","V","AVX512F= ","modrm_regonly","w,r,r","","" +"VPBROADCASTD xmm1, xmm2/m32","VPBROADCASTD xmm2/m32, xmm1","vpbroadcastd = xmm2/m32, xmm1","VEX.128.66.0F38.W0 58 /r","V","V","AVX2","","w,r","","" +"VPBROADCASTD ymm1, xmm2/m32","VPBROADCASTD xmm2/m32, ymm1","vpbroadcastd = xmm2/m32, ymm1","VEX.256.66.0F38.W0 58 /r","V","V","AVX2","","w,r","","" +"VPBROADCASTD xmm1, {k}{z}, xmm2/m32","VPBROADCASTD xmm2/m32, {k}{z}, xmm1= ","vpbroadcastd xmm2/m32, {k}{z}, xmm1","EVEX.128.66.0F38.W0 58 /r","V","V"= ,"AVX512F+AVX512VL","scale4","w,r,r","","" +"VPBROADCASTD ymm1, {k}{z}, xmm2/m32","VPBROADCASTD xmm2/m32, {k}{z}, ymm1= ","vpbroadcastd xmm2/m32, {k}{z}, ymm1","EVEX.256.66.0F38.W0 58 /r","V","V"= ,"AVX512F+AVX512VL","scale4","w,r,r","","" +"VPBROADCASTD zmm1, {k}{z}, xmm2/m32","VPBROADCASTD xmm2/m32, {k}{z}, zmm1= ","vpbroadcastd xmm2/m32, {k}{z}, zmm1","EVEX.512.66.0F38.W0 58 /r","V","V"= ,"AVX512F","scale4","w,r,r","","" +"VPBROADCASTMB2Q xmm1, k2","VPBROADCASTMB2Q k2, xmm1","vpbroadcastmb2q k2,= xmm1","EVEX.128.F3.0F38.W1 2A /r","V","V","AVX512CD+AVX512VL","modrm_regon= ly","w,r","","" +"VPBROADCASTMB2Q ymm1, k2","VPBROADCASTMB2Q k2, ymm1","vpbroadcastmb2q k2,= ymm1","EVEX.256.F3.0F38.W1 2A /r","V","V","AVX512CD+AVX512VL","modrm_regon= ly","w,r","","" +"VPBROADCASTMB2Q zmm1, k2","VPBROADCASTMB2Q k2, zmm1","vpbroadcastmb2q k2,= zmm1","EVEX.512.F3.0F38.W1 2A /r","V","V","AVX512CD","modrm_regonly","w,r"= ,"","" +"VPBROADCASTMW2D xmm1, k2","VPBROADCASTMW2D k2, xmm1","vpbroadcastmw2d k2,= xmm1","EVEX.128.F3.0F38.W0 3A /r","V","V","AVX512CD+AVX512VL","modrm_regon= ly","w,r","","" +"VPBROADCASTMW2D ymm1, k2","VPBROADCASTMW2D k2, ymm1","vpbroadcastmw2d k2,= ymm1","EVEX.256.F3.0F38.W0 3A /r","V","V","AVX512CD+AVX512VL","modrm_regon= ly","w,r","","" +"VPBROADCASTMW2D zmm1, k2","VPBROADCASTMW2D k2, zmm1","vpbroadcastmw2d k2,= zmm1","EVEX.512.F3.0F38.W0 3A /r","V","V","AVX512CD","modrm_regonly","w,r"= ,"","" +"VPBROADCASTQ xmm1, {k}{z}, rmr64","VPBROADCASTQ rmr64, {k}{z}, xmm1","vpb= roadcastq rmr64, {k}{z}, xmm1","EVEX.128.66.0F38.W1 7C /r","N.S.","V","AVX5= 12F+AVX512VL","modrm_regonly","w,r,r","","" +"VPBROADCASTQ ymm1, {k}{z}, rmr64","VPBROADCASTQ rmr64, {k}{z}, ymm1","vpb= roadcastq rmr64, {k}{z}, ymm1","EVEX.256.66.0F38.W1 7C /r","N.S.","V","AVX5= 12F+AVX512VL","modrm_regonly","w,r,r","","" +"VPBROADCASTQ zmm1, {k}{z}, rmr64","VPBROADCASTQ rmr64, {k}{z}, zmm1","vpb= roadcastq rmr64, {k}{z}, zmm1","EVEX.512.66.0F38.W1 7C /r","N.S.","V","AVX5= 12F","modrm_regonly","w,r,r","","" +"VPBROADCASTQ xmm1, xmm2/m64","VPBROADCASTQ xmm2/m64, xmm1","vpbroadcastq = xmm2/m64, xmm1","VEX.128.66.0F38.W0 59 /r","V","V","AVX2","","w,r","","" +"VPBROADCASTQ ymm1, xmm2/m64","VPBROADCASTQ xmm2/m64, ymm1","vpbroadcastq = xmm2/m64, ymm1","VEX.256.66.0F38.W0 59 /r","V","V","AVX2","","w,r","","" +"VPBROADCASTQ xmm1, {k}{z}, xmm2/m64","VPBROADCASTQ xmm2/m64, {k}{z}, xmm1= ","vpbroadcastq xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.W1 59 /r","V","V"= ,"AVX512F+AVX512VL","scale8","w,r,r","","" +"VPBROADCASTQ ymm1, {k}{z}, xmm2/m64","VPBROADCASTQ xmm2/m64, {k}{z}, ymm1= ","vpbroadcastq xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.W1 59 /r","V","V"= ,"AVX512F+AVX512VL","scale8","w,r,r","","" +"VPBROADCASTQ zmm1, {k}{z}, xmm2/m64","VPBROADCASTQ xmm2/m64, {k}{z}, zmm1= ","vpbroadcastq xmm2/m64, {k}{z}, 
zmm1","EVEX.512.66.0F38.W1 59 /r","V","V"= ,"AVX512F","scale8","w,r,r","","" +"VPBROADCASTW xmm1, {k}{z}, rmr32","VPBROADCASTW rmr32, {k}{z}, xmm1","vpb= roadcastw rmr32, {k}{z}, xmm1","EVEX.128.66.0F38.W0 7B /r","V","V","AVX512B= W+AVX512VL","modrm_regonly","w,r,r","","" +"VPBROADCASTW ymm1, {k}{z}, rmr32","VPBROADCASTW rmr32, {k}{z}, ymm1","vpb= roadcastw rmr32, {k}{z}, ymm1","EVEX.256.66.0F38.W0 7B /r","V","V","AVX512B= W+AVX512VL","modrm_regonly","w,r,r","","" +"VPBROADCASTW zmm1, {k}{z}, rmr32","VPBROADCASTW rmr32, {k}{z}, zmm1","vpb= roadcastw rmr32, {k}{z}, zmm1","EVEX.512.66.0F38.W0 7B /r","V","V","AVX512B= W","modrm_regonly","w,r,r","","" +"VPBROADCASTW xmm1, xmm2/m16","VPBROADCASTW xmm2/m16, xmm1","vpbroadcastw = xmm2/m16, xmm1","VEX.128.66.0F38.W0 79 /r","V","V","AVX2","","w,r","","" +"VPBROADCASTW ymm1, xmm2/m16","VPBROADCASTW xmm2/m16, ymm1","vpbroadcastw = xmm2/m16, ymm1","VEX.256.66.0F38.W0 79 /r","V","V","AVX2","","w,r","","" +"VPBROADCASTW xmm1, {k}{z}, xmm2/m16","VPBROADCASTW xmm2/m16, {k}{z}, xmm1= ","vpbroadcastw xmm2/m16, {k}{z}, xmm1","EVEX.128.66.0F38.W0 79 /r","V","V"= ,"AVX512BW+AVX512VL","scale2","w,r,r","","" +"VPBROADCASTW ymm1, {k}{z}, xmm2/m16","VPBROADCASTW xmm2/m16, {k}{z}, ymm1= ","vpbroadcastw xmm2/m16, {k}{z}, ymm1","EVEX.256.66.0F38.W0 79 /r","V","V"= ,"AVX512BW+AVX512VL","scale2","w,r,r","","" +"VPBROADCASTW zmm1, {k}{z}, xmm2/m16","VPBROADCASTW xmm2/m16, {k}{z}, zmm1= ","vpbroadcastw xmm2/m16, {k}{z}, zmm1","EVEX.512.66.0F38.W0 79 /r","V","V"= ,"AVX512BW","scale2","w,r,r","","" +"VPCLMULQDQ xmm1, xmmV, xmm2/m128, imm8u","VPCLMULQDQ imm8u, xmm2/m128, xm= mV, xmm1","vpclmulqdq imm8u, xmm2/m128, xmmV, xmm1","EVEX.NDS.128.66.0F3A.W= IG 44 /r ib","V","V","VPCLMULQDQ+AVX512VL","scale16","w,r,r,r","","" +"VPCLMULQDQ xmm1, xmmV, xmm2/m128, imm8u","VPCLMULQDQ imm8u, xmm2/m128, xm= mV, xmm1","vpclmulqdq imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WI= G 44 /r ib","V","V","PCLMULQDQ+AVX","","w,r,r,r","","" +"VPCLMULQDQ ymm1, ymmV, ymm2/m256, imm8u","VPCLMULQDQ imm8u, ymm2/m256, ym= mV, ymm1","vpclmulqdq imm8u, ymm2/m256, ymmV, ymm1","EVEX.NDS.256.66.0F3A.W= IG 44 /r ib","V","V","VPCLMULQDQ+AVX512VL","scale32","w,r,r,r","","" +"VPCLMULQDQ ymm1, ymmV, ymm2/m256, imm8u","VPCLMULQDQ imm8u, ymm2/m256, ym= mV, ymm1","vpclmulqdq imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WI= G 44 /r ib","V","V","VPCLMULQDQ","","w,r,r,r","","" +"VPCLMULQDQ zmm1, zmmV, zmm2/m512, imm8u","VPCLMULQDQ imm8u, zmm2/m512, zm= mV, zmm1","vpclmulqdq imm8u, zmm2/m512, zmmV, zmm1","EVEX.NDS.512.66.0F3A.W= IG 44 /r ib","V","V","VPCLMULQDQ+AVX512F","scale64","w,r,r,r","","" +"VPCMOV xmm1, xmmV, xmmIH, xmm2/m128","VPCMOV xmm2/m128, xmmIH, xmmV, xmm1= ","vpcmov xmm2/m128, xmmIH, xmmV, xmm1","XOP.NDS.128.08.W1 A2 /r /is4","V",= "V","XOP","amd","w,r,r,r","","" +"VPCMOV xmm1, xmmV, xmm2/m128, xmmIH","VPCMOV xmmIH, xmm2/m128, xmmV, xmm1= ","vpcmov xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 A2 /r /is4","V",= "V","XOP","amd","w,r,r,r","","" +"VPCMOV ymm1, ymmV, ymmIH, ymm2/m256","VPCMOV ymm2/m256, ymmIH, ymmV, ymm1= ","vpcmov ymm2/m256, ymmIH, ymmV, ymm1","XOP.NDS.256.08.W1 A2 /r /is4","V",= "V","XOP","amd","w,r,r,r","","" +"VPCMOV ymm1, ymmV, ymm2/m256, ymmIH","VPCMOV ymmIH, ymm2/m256, ymmV, ymm1= ","vpcmov ymmIH, ymm2/m256, ymmV, ymm1","XOP.NDS.256.08.W0 A2 /r /is4","V",= "V","XOP","amd","w,r,r,r","","" +"VPCMPB k1, {k}, xmmV, xmm2/m128, imm8u","VPCMPB imm8u, xmm2/m128, xmmV, {= k}, k1","vpcmpb imm8u, xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F3A.W0 3= F /r 
ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r,r","","" +"VPCMPB k1, {k}, ymmV, ymm2/m256, imm8u","VPCMPB imm8u, ymm2/m256, ymmV, {= k}, k1","vpcmpb imm8u, ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F3A.W0 3= F /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r,r","","" +"VPCMPB k1, {k}, zmmV, zmm2/m512, imm8u","VPCMPB imm8u, zmm2/m512, zmmV, {= k}, k1","vpcmpb imm8u, zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F3A.W0 3= F /r ib","V","V","AVX512BW","scale64","w,r,r,r,r","","" +"VPCMPD k1, {k}, xmmV, xmm2/m128/m32bcst, imm8u","VPCMPD imm8u, xmm2/m128/= m32bcst, xmmV, {k}, k1","vpcmpd imm8u, xmm2/m128/m32bcst, xmmV, {k}, k1","E= VEX.NDS.128.66.0F3A.W0 1F /r ib","V","V","AVX512F+AVX512VL","bscale4,scale1= 6","w,r,r,r,r","","" +"VPCMPD k1, {k}, ymmV, ymm2/m256/m32bcst, imm8u","VPCMPD imm8u, ymm2/m256/= m32bcst, ymmV, {k}, k1","vpcmpd imm8u, ymm2/m256/m32bcst, ymmV, {k}, k1","E= VEX.NDS.256.66.0F3A.W0 1F /r ib","V","V","AVX512F+AVX512VL","bscale4,scale3= 2","w,r,r,r,r","","" +"VPCMPD k1, {k}, zmmV, zmm2/m512/m32bcst, imm8u","VPCMPD imm8u, zmm2/m512/= m32bcst, zmmV, {k}, k1","vpcmpd imm8u, zmm2/m512/m32bcst, zmmV, {k}, k1","E= VEX.NDS.512.66.0F3A.W0 1F /r ib","V","V","AVX512F","bscale4,scale64","w,r,r= ,r,r","","" +"VPCMPEQB xmm1, xmmV, xmm2/m128","VPCMPEQB xmm2/m128, xmmV, xmm1","vpcmpeq= b xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 74 /r","V","V","AVX","","w,= r,r","","" +"VPCMPEQB k1, {k}, xmmV, xmm2/m128","VPCMPEQB xmm2/m128, xmmV, {k}, k1","v= pcmpeqb xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F.WIG 74 /r","V","V","A= VX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPCMPEQB ymm1, ymmV, ymm2/m256","VPCMPEQB ymm2/m256, ymmV, ymm1","vpcmpeq= b ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 74 /r","V","V","AVX2","","w= ,r,r","","" +"VPCMPEQB k1, {k}, ymmV, ymm2/m256","VPCMPEQB ymm2/m256, ymmV, {k}, k1","v= pcmpeqb ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F.WIG 74 /r","V","V","A= VX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPCMPEQB k1, {k}, zmmV, zmm2/m512","VPCMPEQB zmm2/m512, zmmV, {k}, k1","v= pcmpeqb zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F.WIG 74 /r","V","V","A= VX512BW","scale64","w,r,r,r","","" +"VPCMPEQD xmm1, xmmV, xmm2/m128","VPCMPEQD xmm2/m128, xmmV, xmm1","vpcmpeq= d xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 76 /r","V","V","AVX","","w,= r,r","","" +"VPCMPEQD k1, {k}, xmmV, xmm2/m128/m32bcst","VPCMPEQD xmm2/m128/m32bcst, x= mmV, {k}, k1","vpcmpeqd xmm2/m128/m32bcst, xmmV, {k}, k1","EVEX.NDS.128.66.= 0F.W0 76 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","","" +"VPCMPEQD ymm1, ymmV, ymm2/m256","VPCMPEQD ymm2/m256, ymmV, ymm1","vpcmpeq= d ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 76 /r","V","V","AVX2","","w= ,r,r","","" +"VPCMPEQD k1, {k}, ymmV, ymm2/m256/m32bcst","VPCMPEQD ymm2/m256/m32bcst, y= mmV, {k}, k1","vpcmpeqd ymm2/m256/m32bcst, ymmV, {k}, k1","EVEX.NDS.256.66.= 0F.W0 76 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","","" +"VPCMPEQD k1, {k}, zmmV, zmm2/m512/m32bcst","VPCMPEQD zmm2/m512/m32bcst, z= mmV, {k}, k1","vpcmpeqd zmm2/m512/m32bcst, zmmV, {k}, k1","EVEX.NDS.512.66.= 0F.W0 76 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VPCMPEQQ xmm1, xmmV, xmm2/m128","VPCMPEQQ xmm2/m128, xmmV, xmm1","vpcmpeq= q xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 29 /r","V","V","AVX","","= w,r,r","","" +"VPCMPEQQ k1, {k}, xmmV, xmm2/m128/m64bcst","VPCMPEQQ xmm2/m128/m64bcst, x= mmV, {k}, k1","vpcmpeqq xmm2/m128/m64bcst, xmmV, {k}, k1","EVEX.NDS.128.66.= 0F38.W1 29 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","","" 
+"VPCMPEQQ ymm1, ymmV, ymm2/m256","VPCMPEQQ ymm2/m256, ymmV, ymm1","vpcmpeq= q ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 29 /r","V","V","AVX2","",= "w,r,r","","" +"VPCMPEQQ k1, {k}, ymmV, ymm2/m256/m64bcst","VPCMPEQQ ymm2/m256/m64bcst, y= mmV, {k}, k1","vpcmpeqq ymm2/m256/m64bcst, ymmV, {k}, k1","EVEX.NDS.256.66.= 0F38.W1 29 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","","" +"VPCMPEQQ k1, {k}, zmmV, zmm2/m512/m64bcst","VPCMPEQQ zmm2/m512/m64bcst, z= mmV, {k}, k1","vpcmpeqq zmm2/m512/m64bcst, zmmV, {k}, k1","EVEX.NDS.512.66.= 0F38.W1 29 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VPCMPEQW xmm1, xmmV, xmm2/m128","VPCMPEQW xmm2/m128, xmmV, xmm1","vpcmpeq= w xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 75 /r","V","V","AVX","","w,= r,r","","" +"VPCMPEQW k1, {k}, xmmV, xmm2/m128","VPCMPEQW xmm2/m128, xmmV, {k}, k1","v= pcmpeqw xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F.WIG 75 /r","V","V","A= VX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPCMPEQW ymm1, ymmV, ymm2/m256","VPCMPEQW ymm2/m256, ymmV, ymm1","vpcmpeq= w ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 75 /r","V","V","AVX2","","w= ,r,r","","" +"VPCMPEQW k1, {k}, ymmV, ymm2/m256","VPCMPEQW ymm2/m256, ymmV, {k}, k1","v= pcmpeqw ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F.WIG 75 /r","V","V","A= VX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPCMPEQW k1, {k}, zmmV, zmm2/m512","VPCMPEQW zmm2/m512, zmmV, {k}, k1","v= pcmpeqw zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F.WIG 75 /r","V","V","A= VX512BW","scale64","w,r,r,r","","" +"VPCMPESTRI xmm1, xmm2/m128, imm8u","VPCMPESTRI imm8u, xmm2/m128, xmm1","v= pcmpestri imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG 61 /r ib","V","V","A= VX","","r,r,r","","" +"VPCMPESTRM xmm1, xmm2/m128, imm8u","VPCMPESTRM imm8u, xmm2/m128, xmm1","v= pcmpestrm imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG 60 /r ib","V","V","A= VX","","r,r,r","","" +"VPCMPGTB xmm1, xmmV, xmm2/m128","VPCMPGTB xmm2/m128, xmmV, xmm1","vpcmpgt= b xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 64 /r","V","V","AVX","","w,= r,r","","" +"VPCMPGTB k1, {k}, xmmV, xmm2/m128","VPCMPGTB xmm2/m128, xmmV, {k}, k1","v= pcmpgtb xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F.WIG 64 /r","V","V","A= VX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPCMPGTB ymm1, ymmV, ymm2/m256","VPCMPGTB ymm2/m256, ymmV, ymm1","vpcmpgt= b ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 64 /r","V","V","AVX2","","w= ,r,r","","" +"VPCMPGTB k1, {k}, ymmV, ymm2/m256","VPCMPGTB ymm2/m256, ymmV, {k}, k1","v= pcmpgtb ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F.WIG 64 /r","V","V","A= VX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPCMPGTB k1, {k}, zmmV, zmm2/m512","VPCMPGTB zmm2/m512, zmmV, {k}, k1","v= pcmpgtb zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F.WIG 64 /r","V","V","A= VX512BW","scale64","w,r,r,r","","" +"VPCMPGTD xmm1, xmmV, xmm2/m128","VPCMPGTD xmm2/m128, xmmV, xmm1","vpcmpgt= d xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 66 /r","V","V","AVX","","w,= r,r","","" +"VPCMPGTD k1, {k}, xmmV, xmm2/m128/m32bcst","VPCMPGTD xmm2/m128/m32bcst, x= mmV, {k}, k1","vpcmpgtd xmm2/m128/m32bcst, xmmV, {k}, k1","EVEX.NDS.128.66.= 0F.W0 66 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","","" +"VPCMPGTD ymm1, ymmV, ymm2/m256","VPCMPGTD ymm2/m256, ymmV, ymm1","vpcmpgt= d ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 66 /r","V","V","AVX2","","w= ,r,r","","" +"VPCMPGTD k1, {k}, ymmV, ymm2/m256/m32bcst","VPCMPGTD ymm2/m256/m32bcst, y= mmV, {k}, k1","vpcmpgtd ymm2/m256/m32bcst, ymmV, {k}, k1","EVEX.NDS.256.66.= 0F.W0 66 
/r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","","" +"VPCMPGTD k1, {k}, zmmV, zmm2/m512/m32bcst","VPCMPGTD zmm2/m512/m32bcst, z= mmV, {k}, k1","vpcmpgtd zmm2/m512/m32bcst, zmmV, {k}, k1","EVEX.NDS.512.66.= 0F.W0 66 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VPCMPGTQ xmm1, xmmV, xmm2/m128","VPCMPGTQ xmm2/m128, xmmV, xmm1","vpcmpgt= q xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 37 /r","V","V","AVX","","= w,r,r","","" +"VPCMPGTQ k1, {k}, xmmV, xmm2/m128/m64bcst","VPCMPGTQ xmm2/m128/m64bcst, x= mmV, {k}, k1","vpcmpgtq xmm2/m128/m64bcst, xmmV, {k}, k1","EVEX.NDS.128.66.= 0F38.W1 37 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","","" +"VPCMPGTQ ymm1, ymmV, ymm2/m256","VPCMPGTQ ymm2/m256, ymmV, ymm1","vpcmpgt= q ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 37 /r","V","V","AVX2","",= "w,r,r","","" +"VPCMPGTQ k1, {k}, ymmV, ymm2/m256/m64bcst","VPCMPGTQ ymm2/m256/m64bcst, y= mmV, {k}, k1","vpcmpgtq ymm2/m256/m64bcst, ymmV, {k}, k1","EVEX.NDS.256.66.= 0F38.W1 37 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","","" +"VPCMPGTQ k1, {k}, zmmV, zmm2/m512/m64bcst","VPCMPGTQ zmm2/m512/m64bcst, z= mmV, {k}, k1","vpcmpgtq zmm2/m512/m64bcst, zmmV, {k}, k1","EVEX.NDS.512.66.= 0F38.W1 37 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VPCMPGTW xmm1, xmmV, xmm2/m128","VPCMPGTW xmm2/m128, xmmV, xmm1","vpcmpgt= w xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 65 /r","V","V","AVX","","w,= r,r","","" +"VPCMPGTW k1, {k}, xmmV, xmm2/m128","VPCMPGTW xmm2/m128, xmmV, {k}, k1","v= pcmpgtw xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F.WIG 65 /r","V","V","A= VX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPCMPGTW ymm1, ymmV, ymm2/m256","VPCMPGTW ymm2/m256, ymmV, ymm1","vpcmpgt= w ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 65 /r","V","V","AVX2","","w= ,r,r","","" +"VPCMPGTW k1, {k}, ymmV, ymm2/m256","VPCMPGTW ymm2/m256, ymmV, {k}, k1","v= pcmpgtw ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F.WIG 65 /r","V","V","A= VX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPCMPGTW k1, {k}, zmmV, zmm2/m512","VPCMPGTW zmm2/m512, zmmV, {k}, k1","v= pcmpgtw zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F.WIG 65 /r","V","V","A= VX512BW","scale64","w,r,r,r","","" +"VPCMPISTRI xmm1, xmm2/m128, imm8u","VPCMPISTRI imm8u, xmm2/m128, xmm1","v= pcmpistri imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG 63 /r ib","V","V","A= VX","","r,r,r","","" +"VPCMPISTRM xmm1, xmm2/m128, imm8u","VPCMPISTRM imm8u, xmm2/m128, xmm1","v= pcmpistrm imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG 62 /r ib","V","V","A= VX","","r,r,r","","" +"VPCMPQ k1, {k}, xmmV, xmm2/m128/m64bcst, imm8u","VPCMPQ imm8u, xmm2/m128/= m64bcst, xmmV, {k}, k1","vpcmpq imm8u, xmm2/m128/m64bcst, xmmV, {k}, k1","E= VEX.NDS.128.66.0F3A.W1 1F /r ib","V","V","AVX512F+AVX512VL","bscale8,scale1= 6","w,r,r,r,r","","" +"VPCMPQ k1, {k}, ymmV, ymm2/m256/m64bcst, imm8u","VPCMPQ imm8u, ymm2/m256/= m64bcst, ymmV, {k}, k1","vpcmpq imm8u, ymm2/m256/m64bcst, ymmV, {k}, k1","E= VEX.NDS.256.66.0F3A.W1 1F /r ib","V","V","AVX512F+AVX512VL","bscale8,scale3= 2","w,r,r,r,r","","" +"VPCMPQ k1, {k}, zmmV, zmm2/m512/m64bcst, imm8u","VPCMPQ imm8u, zmm2/m512/= m64bcst, zmmV, {k}, k1","vpcmpq imm8u, zmm2/m512/m64bcst, zmmV, {k}, k1","E= VEX.NDS.512.66.0F3A.W1 1F /r ib","V","V","AVX512F","bscale8,scale64","w,r,r= ,r,r","","" +"VPCMPUB k1, {k}, xmmV, xmm2/m128, imm8u","VPCMPUB imm8u, xmm2/m128, xmmV,= {k}, k1","vpcmpub imm8u, xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F3A.W= 0 3E /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r,r","","" +"VPCMPUB k1, {k}, 
ymmV, ymm2/m256, imm8u","VPCMPUB imm8u, ymm2/m256, ymmV,= {k}, k1","vpcmpub imm8u, ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F3A.W= 0 3E /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r,r","","" +"VPCMPUB k1, {k}, zmmV, zmm2/m512, imm8u","VPCMPUB imm8u, zmm2/m512, zmmV,= {k}, k1","vpcmpub imm8u, zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F3A.W= 0 3E /r ib","V","V","AVX512BW","scale64","w,r,r,r,r","","" +"VPCMPUD k1, {k}, xmmV, xmm2/m128/m32bcst, imm8u","VPCMPUD imm8u, xmm2/m12= 8/m32bcst, xmmV, {k}, k1","vpcmpud imm8u, xmm2/m128/m32bcst, xmmV, {k}, k1"= ,"EVEX.NDS.128.66.0F3A.W0 1E /r ib","V","V","AVX512F+AVX512VL","bscale4,sca= le16","w,r,r,r,r","","" +"VPCMPUD k1, {k}, ymmV, ymm2/m256/m32bcst, imm8u","VPCMPUD imm8u, ymm2/m25= 6/m32bcst, ymmV, {k}, k1","vpcmpud imm8u, ymm2/m256/m32bcst, ymmV, {k}, k1"= ,"EVEX.NDS.256.66.0F3A.W0 1E /r ib","V","V","AVX512F+AVX512VL","bscale4,sca= le32","w,r,r,r,r","","" +"VPCMPUD k1, {k}, zmmV, zmm2/m512/m32bcst, imm8u","VPCMPUD imm8u, zmm2/m51= 2/m32bcst, zmmV, {k}, k1","vpcmpud imm8u, zmm2/m512/m32bcst, zmmV, {k}, k1"= ,"EVEX.NDS.512.66.0F3A.W0 1E /r ib","V","V","AVX512F","bscale4,scale64","w,= r,r,r,r","","" +"VPCMPUQ k1, {k}, xmmV, xmm2/m128/m64bcst, imm8u","VPCMPUQ imm8u, xmm2/m12= 8/m64bcst, xmmV, {k}, k1","vpcmpuq imm8u, xmm2/m128/m64bcst, xmmV, {k}, k1"= ,"EVEX.NDS.128.66.0F3A.W1 1E /r ib","V","V","AVX512F+AVX512VL","bscale8,sca= le16","w,r,r,r,r","","" +"VPCMPUQ k1, {k}, ymmV, ymm2/m256/m64bcst, imm8u","VPCMPUQ imm8u, ymm2/m25= 6/m64bcst, ymmV, {k}, k1","vpcmpuq imm8u, ymm2/m256/m64bcst, ymmV, {k}, k1"= ,"EVEX.NDS.256.66.0F3A.W1 1E /r ib","V","V","AVX512F+AVX512VL","bscale8,sca= le32","w,r,r,r,r","","" +"VPCMPUQ k1, {k}, zmmV, zmm2/m512/m64bcst, imm8u","VPCMPUQ imm8u, zmm2/m51= 2/m64bcst, zmmV, {k}, k1","vpcmpuq imm8u, zmm2/m512/m64bcst, zmmV, {k}, k1"= ,"EVEX.NDS.512.66.0F3A.W1 1E /r ib","V","V","AVX512F","bscale8,scale64","w,= r,r,r,r","","" +"VPCMPUW k1, {k}, xmmV, xmm2/m128, imm8u","VPCMPUW imm8u, xmm2/m128, xmmV,= {k}, k1","vpcmpuw imm8u, xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F3A.W= 1 3E /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r,r","","" +"VPCMPUW k1, {k}, ymmV, ymm2/m256, imm8u","VPCMPUW imm8u, ymm2/m256, ymmV,= {k}, k1","vpcmpuw imm8u, ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F3A.W= 1 3E /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r,r","","" +"VPCMPUW k1, {k}, zmmV, zmm2/m512, imm8u","VPCMPUW imm8u, zmm2/m512, zmmV,= {k}, k1","vpcmpuw imm8u, zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F3A.W= 1 3E /r ib","V","V","AVX512BW","scale64","w,r,r,r,r","","" +"VPCMPW k1, {k}, xmmV, xmm2/m128, imm8u","VPCMPW imm8u, xmm2/m128, xmmV, {= k}, k1","vpcmpw imm8u, xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F3A.W1 3= F /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r,r","","" +"VPCMPW k1, {k}, ymmV, ymm2/m256, imm8u","VPCMPW imm8u, ymm2/m256, ymmV, {= k}, k1","vpcmpw imm8u, ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F3A.W1 3= F /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r,r","","" +"VPCMPW k1, {k}, zmmV, zmm2/m512, imm8u","VPCMPW imm8u, zmm2/m512, zmmV, {= k}, k1","vpcmpw imm8u, zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F3A.W1 3= F /r ib","V","V","AVX512BW","scale64","w,r,r,r,r","","" +"VPCOMB xmm1, xmmV, xmm2/m128, imm8u","VPCOMB imm8u, xmm2/m128, xmmV, xmm1= ","vpcomb imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 CC /r ib","V","V= ","XOP","amd","w,r,r,r","","" +"VPCOMD xmm1, xmmV, xmm2/m128, imm8u","VPCOMD imm8u, xmm2/m128, xmmV, xmm1= ","vpcomd imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 CE /r 
ib","V","V= ","XOP","amd","w,r,r,r","","" +"VPCOMPRESSB xmm2/m128, {k}{z}, xmm1","VPCOMPRESSB xmm1, {k}{z}, xmm2/m128= ","vpcompressb xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F38.W0 63 /r","V","V"= ,"AVX512_VBMI2+AVX512VL","scale1","w,r,r","","" +"VPCOMPRESSB ymm2/m256, {k}{z}, ymm1","VPCOMPRESSB ymm1, {k}{z}, ymm2/m256= ","vpcompressb ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F38.W0 63 /r","V","V"= ,"AVX512_VBMI2+AVX512VL","scale1","w,r,r","","" +"VPCOMPRESSB zmm2/m512, {k}{z}, zmm1","VPCOMPRESSB zmm1, {k}{z}, zmm2/m512= ","vpcompressb zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F38.W0 63 /r","V","V"= ,"AVX512_VBMI2","scale1","w,r,r","","" +"VPCOMPRESSD xmm2/m128, {k}{z}, xmm1","VPCOMPRESSD xmm1, {k}{z}, xmm2/m128= ","vpcompressd xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F38.W0 8B /r","V","V"= ,"AVX512F+AVX512VL","scale4","w,r,r","","" +"VPCOMPRESSD ymm2/m256, {k}{z}, ymm1","VPCOMPRESSD ymm1, {k}{z}, ymm2/m256= ","vpcompressd ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F38.W0 8B /r","V","V"= ,"AVX512F+AVX512VL","scale4","w,r,r","","" +"VPCOMPRESSD zmm2/m512, {k}{z}, zmm1","VPCOMPRESSD zmm1, {k}{z}, zmm2/m512= ","vpcompressd zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F38.W0 8B /r","V","V"= ,"AVX512F","scale4","w,r,r","","" +"VPCOMPRESSQ xmm2/m128, {k}{z}, xmm1","VPCOMPRESSQ xmm1, {k}{z}, xmm2/m128= ","vpcompressq xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F38.W1 8B /r","V","V"= ,"AVX512F+AVX512VL","scale8","w,r,r","","" +"VPCOMPRESSQ ymm2/m256, {k}{z}, ymm1","VPCOMPRESSQ ymm1, {k}{z}, ymm2/m256= ","vpcompressq ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F38.W1 8B /r","V","V"= ,"AVX512F+AVX512VL","scale8","w,r,r","","" +"VPCOMPRESSQ zmm2/m512, {k}{z}, zmm1","VPCOMPRESSQ zmm1, {k}{z}, zmm2/m512= ","vpcompressq zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F38.W1 8B /r","V","V"= ,"AVX512F","scale8","w,r,r","","" +"VPCOMPRESSW xmm2/m128, {k}{z}, xmm1","VPCOMPRESSW xmm1, {k}{z}, xmm2/m128= ","vpcompressw xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F38.W1 63 /r","V","V"= ,"AVX512_VBMI2+AVX512VL","scale2","w,r,r","","" +"VPCOMPRESSW ymm2/m256, {k}{z}, ymm1","VPCOMPRESSW ymm1, {k}{z}, ymm2/m256= ","vpcompressw ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F38.W1 63 /r","V","V"= ,"AVX512_VBMI2+AVX512VL","scale2","w,r,r","","" +"VPCOMPRESSW zmm2/m512, {k}{z}, zmm1","VPCOMPRESSW zmm1, {k}{z}, zmm2/m512= ","vpcompressw zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F38.W1 63 /r","V","V"= ,"AVX512_VBMI2","scale2","w,r,r","","" +"VPCOMQ xmm1, xmmV, xmm2/m128, imm8u","VPCOMQ imm8u, xmm2/m128, xmmV, xmm1= ","vpcomq imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 CF /r ib","V","V= ","XOP","amd","w,r,r,r","","" +"VPCOMUB xmm1, xmmV, xmm2/m128, imm8u","VPCOMUB imm8u, xmm2/m128, xmmV, xm= m1","vpcomub imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 EC /r ib","V"= ,"V","XOP","amd","w,r,r,r","","" +"VPCOMUD xmm1, xmmV, xmm2/m128, imm8u","VPCOMUD imm8u, xmm2/m128, xmmV, xm= m1","vpcomud imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 EE /r ib","V"= ,"V","XOP","amd","w,r,r,r","","" +"VPCOMUQ xmm1, xmmV, xmm2/m128, imm8u","VPCOMUQ imm8u, xmm2/m128, xmmV, xm= m1","vpcomuq imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 EF /r ib","V"= ,"V","XOP","amd","w,r,r,r","","" +"VPCOMUW xmm1, xmmV, xmm2/m128, imm8u","VPCOMUW imm8u, xmm2/m128, xmmV, xm= m1","vpcomuw imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 ED /r ib","V"= ,"V","XOP","amd","w,r,r,r","","" +"VPCOMW xmm1, xmmV, xmm2/m128, imm8u","VPCOMW imm8u, xmm2/m128, xmmV, xmm1= ","vpcomw imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 CD /r ib","V","V= ","XOP","amd","w,r,r,r","","" +"VPCONFLICTD xmm1, {k}{z}, 
xmm2/m128/m32bcst","VPCONFLICTD xmm2/m128/m32bc= st, {k}{z}, xmm1","vpconflictd xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.6= 6.0F38.W0 C4 /r","V","V","AVX512CD+AVX512VL","bscale4,scale16","w,r,r","","" +"VPCONFLICTD ymm1, {k}{z}, ymm2/m256/m32bcst","VPCONFLICTD ymm2/m256/m32bc= st, {k}{z}, ymm1","vpconflictd ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.6= 6.0F38.W0 C4 /r","V","V","AVX512CD+AVX512VL","bscale4,scale32","w,r,r","","" +"VPCONFLICTD zmm1, {k}{z}, zmm2/m512/m32bcst","VPCONFLICTD zmm2/m512/m32bc= st, {k}{z}, zmm1","vpconflictd zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.6= 6.0F38.W0 C4 /r","V","V","AVX512CD","bscale4,scale64","w,r,r","","" +"VPCONFLICTQ xmm1, {k}{z}, xmm2/m128/m64bcst","VPCONFLICTQ xmm2/m128/m64bc= st, {k}{z}, xmm1","vpconflictq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.6= 6.0F38.W1 C4 /r","V","V","AVX512CD+AVX512VL","bscale8,scale16","w,r,r","","" +"VPCONFLICTQ ymm1, {k}{z}, ymm2/m256/m64bcst","VPCONFLICTQ ymm2/m256/m64bc= st, {k}{z}, ymm1","vpconflictq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.6= 6.0F38.W1 C4 /r","V","V","AVX512CD+AVX512VL","bscale8,scale32","w,r,r","","" +"VPCONFLICTQ zmm1, {k}{z}, zmm2/m512/m64bcst","VPCONFLICTQ zmm2/m512/m64bc= st, {k}{z}, zmm1","vpconflictq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.6= 6.0F38.W1 C4 /r","V","V","AVX512CD","bscale8,scale64","w,r,r","","" +"VPDPBUSD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPDPBUSD xmm2/m128/m32bc= st, xmmV, {k}{z}, xmm1","vpdpbusd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","E= VEX.DDS.128.66.0F38.W0 50 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,scale= 16","rw,r,r,r","","" +"VPDPBUSD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPDPBUSD ymm2/m256/m32bc= st, ymmV, {k}{z}, ymm1","vpdpbusd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","E= VEX.DDS.256.66.0F38.W0 50 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,scale= 32","rw,r,r,r","","" +"VPDPBUSD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPDPBUSD zmm2/m512/m32bc= st, zmmV, {k}{z}, zmm1","vpdpbusd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","E= VEX.DDS.512.66.0F38.W0 50 /r","V","V","AVX512_VNNI","bscale4,scale64","rw,r= ,r,r","","" +"VPDPBUSDS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPDPBUSDS xmm2/m128/m32= bcst, xmmV, {k}{z}, xmm1","vpdpbusds xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.DDS.128.66.0F38.W0 51 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,sc= ale16","rw,r,r,r","","" +"VPDPBUSDS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPDPBUSDS ymm2/m256/m32= bcst, ymmV, {k}{z}, ymm1","vpdpbusds ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.DDS.256.66.0F38.W0 51 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,sc= ale32","rw,r,r,r","","" +"VPDPBUSDS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPDPBUSDS zmm2/m512/m32= bcst, zmmV, {k}{z}, zmm1","vpdpbusds zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.DDS.512.66.0F38.W0 51 /r","V","V","AVX512_VNNI","bscale4,scale64","r= w,r,r,r","","" +"VPDPWSSD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPDPWSSD xmm2/m128/m32bc= st, xmmV, {k}{z}, xmm1","vpdpwssd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","E= VEX.DDS.128.66.0F38.W0 52 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,scale= 16","rw,r,r,r","","" +"VPDPWSSD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPDPWSSD ymm2/m256/m32bc= st, ymmV, {k}{z}, ymm1","vpdpwssd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","E= VEX.DDS.256.66.0F38.W0 52 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,scale= 32","rw,r,r,r","","" +"VPDPWSSD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPDPWSSD zmm2/m512/m32bc= st, zmmV, {k}{z}, zmm1","vpdpwssd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","E= VEX.DDS.512.66.0F38.W0 52 
/r","V","V","AVX512_VNNI","bscale4,scale64","rw,r= ,r,r","","" +"VPDPWSSDS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPDPWSSDS xmm2/m128/m32= bcst, xmmV, {k}{z}, xmm1","vpdpwssds xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.DDS.128.66.0F38.W0 53 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,sc= ale16","rw,r,r,r","","" +"VPDPWSSDS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPDPWSSDS ymm2/m256/m32= bcst, ymmV, {k}{z}, ymm1","vpdpwssds ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.DDS.256.66.0F38.W0 53 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,sc= ale32","rw,r,r,r","","" +"VPDPWSSDS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPDPWSSDS zmm2/m512/m32= bcst, zmmV, {k}{z}, zmm1","vpdpwssds zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.DDS.512.66.0F38.W0 53 /r","V","V","AVX512_VNNI","bscale4,scale64","r= w,r,r,r","","" +"VPERM2F128 ymm1, ymmV, ymm2/m256, imm8u","VPERM2F128 imm8u, ymm2/m256, ym= mV, ymm1","vperm2f128 imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0= 06 /r ib","V","V","AVX","","w,r,r,r","","" +"VPERM2I128 ymm1, ymmV, ymm2/m256, imm8u","VPERM2I128 imm8u, ymm2/m256, ym= mV, ymm1","vperm2i128 imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0= 46 /r ib","V","V","AVX2","","w,r,r,r","","" +"VPERMB xmm1, {k}{z}, xmmV, xmm2/m128","VPERMB xmm2/m128, xmmV, {k}{z}, xm= m1","vpermb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 8D /r",= "V","V","AVX512_VBMI+AVX512VL","scale16","w,r,r,r","","" +"VPERMB ymm1, {k}{z}, ymmV, ymm2/m256","VPERMB ymm2/m256, ymmV, {k}{z}, ym= m1","vpermb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 8D /r",= "V","V","AVX512_VBMI+AVX512VL","scale32","w,r,r,r","","" +"VPERMB zmm1, {k}{z}, zmmV, zmm2/m512","VPERMB zmm2/m512, zmmV, {k}{z}, zm= m1","vpermb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 8D /r",= "V","V","AVX512_VBMI","scale64","w,r,r,r","","" +"VPERMD ymm1, ymmV, ymm2/m256","VPERMD ymm2/m256, ymmV, ymm1","vpermd ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 36 /r","V","V","AVX2","","w,r,r"= ,"","" +"VPERMD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMD ymm2/m256/m32bcst, = ymmV, {k}{z}, ymm1","vpermd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F38.W0 36 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r= ,r","","" +"VPERMD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMD zmm2/m512/m32bcst, = zmmV, {k}{z}, zmm1","vpermd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F38.W0 36 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VPERMI2B xmm1, {k}{z}, xmmV, xmm2/m128","VPERMI2B xmm2/m128, xmmV, {k}{z}= , xmm1","vpermi2b xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 7= 5 /r","V","V","AVX512_VBMI+AVX512VL","scale16","rw,r,r,r","","" +"VPERMI2B ymm1, {k}{z}, ymmV, ymm2/m256","VPERMI2B ymm2/m256, ymmV, {k}{z}= , ymm1","vpermi2b ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 7= 5 /r","V","V","AVX512_VBMI+AVX512VL","scale32","rw,r,r,r","","" +"VPERMI2B zmm1, {k}{z}, zmmV, zmm2/m512","VPERMI2B zmm2/m512, zmmV, {k}{z}= , zmm1","vpermi2b zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 7= 5 /r","V","V","AVX512_VBMI","scale64","rw,r,r,r","","" +"VPERMI2D xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPERMI2D xmm2/m128/m32bc= st, xmmV, {k}{z}, xmm1","vpermi2d xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","E= VEX.DDS.128.66.0F38.W0 76 /r","V","V","AVX512F+AVX512VL","bscale4,scale16",= "rw,r,r,r","","" +"VPERMI2D ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMI2D ymm2/m256/m32bc= st, ymmV, {k}{z}, ymm1","vpermi2d ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","E= VEX.DDS.256.66.0F38.W0 76 
/r","V","V","AVX512F+AVX512VL","bscale4,scale32",= "rw,r,r,r","","" +"VPERMI2D zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMI2D zmm2/m512/m32bc= st, zmmV, {k}{z}, zmm1","vpermi2d zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","E= VEX.DDS.512.66.0F38.W0 76 /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r= ","","" +"VPERMI2PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPERMI2PD xmm2/m128/m64= bcst, xmmV, {k}{z}, xmm1","vpermi2pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.DDS.128.66.0F38.W1 77 /r","V","V","AVX512F+AVX512VL","bscale8,scale1= 6","rw,r,r,r","","" +"VPERMI2PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMI2PD ymm2/m256/m64= bcst, ymmV, {k}{z}, ymm1","vpermi2pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.DDS.256.66.0F38.W1 77 /r","V","V","AVX512F+AVX512VL","bscale8,scale3= 2","rw,r,r,r","","" +"VPERMI2PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMI2PD zmm2/m512/m64= bcst, zmmV, {k}{z}, zmm1","vpermi2pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.DDS.512.66.0F38.W1 77 /r","V","V","AVX512F","bscale8,scale64","rw,r,= r,r","","" +"VPERMI2PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPERMI2PS xmm2/m128/m32= bcst, xmmV, {k}{z}, xmm1","vpermi2ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.DDS.128.66.0F38.W0 77 /r","V","V","AVX512F+AVX512VL","bscale4,scale1= 6","rw,r,r,r","","" +"VPERMI2PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMI2PS ymm2/m256/m32= bcst, ymmV, {k}{z}, ymm1","vpermi2ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.DDS.256.66.0F38.W0 77 /r","V","V","AVX512F+AVX512VL","bscale4,scale3= 2","rw,r,r,r","","" +"VPERMI2PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMI2PS zmm2/m512/m32= bcst, zmmV, {k}{z}, zmm1","vpermi2ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.DDS.512.66.0F38.W0 77 /r","V","V","AVX512F","bscale4,scale64","rw,r,= r,r","","" +"VPERMI2Q xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPERMI2Q xmm2/m128/m64bc= st, xmmV, {k}{z}, xmm1","vpermi2q xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","E= VEX.DDS.128.66.0F38.W1 76 /r","V","V","AVX512F+AVX512VL","bscale8,scale16",= "rw,r,r,r","","" +"VPERMI2Q ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMI2Q ymm2/m256/m64bc= st, ymmV, {k}{z}, ymm1","vpermi2q ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","E= VEX.DDS.256.66.0F38.W1 76 /r","V","V","AVX512F+AVX512VL","bscale8,scale32",= "rw,r,r,r","","" +"VPERMI2Q zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMI2Q zmm2/m512/m64bc= st, zmmV, {k}{z}, zmm1","vpermi2q zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","E= VEX.DDS.512.66.0F38.W1 76 /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r= ","","" +"VPERMI2W xmm1, {k}{z}, xmmV, xmm2/m128","VPERMI2W xmm2/m128, xmmV, {k}{z}= , xmm1","vpermi2w xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 7= 5 /r","V","V","AVX512BW+AVX512VL","scale16","rw,r,r,r","","" +"VPERMI2W ymm1, {k}{z}, ymmV, ymm2/m256","VPERMI2W ymm2/m256, ymmV, {k}{z}= , ymm1","vpermi2w ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 7= 5 /r","V","V","AVX512BW+AVX512VL","scale32","rw,r,r,r","","" +"VPERMI2W zmm1, {k}{z}, zmmV, zmm2/m512","VPERMI2W zmm2/m512, zmmV, {k}{z}= , zmm1","vpermi2w zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 7= 5 /r","V","V","AVX512BW","scale64","rw,r,r,r","","" +"VPERMIL2PD xmm1, xmmV, xmmIH, xmm2/m128, imm8u","VPERMIL2PD imm8u, xmm2/m= 128, xmmIH, xmmV, xmm1","vpermil2pd imm8u, xmm2/m128, xmmIH, xmmV, xmm1","V= EX.NDS.128.66.0F3A.W1 49 /r /is4","V","V","XOP","amd","w,r,r,r,r","","" +"VPERMIL2PD xmm1, xmmV, xmm2/m128, xmmIH, imm8u","VPERMIL2PD imm8u, xmmIH,= xmm2/m128, xmmV, xmm1","vpermil2pd imm8u, xmmIH, xmm2/m128, xmmV, xmm1","V= 
EX.NDS.128.66.0F3A.W0 49 /r /is4","V","V","XOP","amd","w,r,r,r,r","","" +"VPERMIL2PD ymm1, ymmV, ymmIH, ymm2/m256, imm8u","VPERMIL2PD imm8u, ymm2/m= 256, ymmIH, ymmV, ymm1","vpermil2pd imm8u, ymm2/m256, ymmIH, ymmV, ymm1","V= EX.NDS.256.66.0F3A.W1 49 /r /is4","V","V","XOP","amd","w,r,r,r,r","","" +"VPERMIL2PD ymm1, ymmV, ymm2/m256, ymmIH, imm8u","VPERMIL2PD imm8u, ymmIH,= ymm2/m256, ymmV, ymm1","vpermil2pd imm8u, ymmIH, ymm2/m256, ymmV, ymm1","V= EX.NDS.256.66.0F3A.W0 49 /r /is4","V","V","XOP","amd","w,r,r,r,r","","" +"VPERMIL2PS xmm1, xmmV, xmmIH, xmm2/m128, imm8u","VPERMIL2PS imm8u, xmm2/m= 128, xmmIH, xmmV, xmm1","vpermil2ps imm8u, xmm2/m128, xmmIH, xmmV, xmm1","V= EX.NDS.128.66.0F3A.W1 48 /r /is4","V","V","XOP","amd","w,r,r,r,r","","" +"VPERMIL2PS xmm1, xmmV, xmm2/m128, xmmIH, imm8u","VPERMIL2PS imm8u, xmmIH,= xmm2/m128, xmmV, xmm1","vpermil2ps imm8u, xmmIH, xmm2/m128, xmmV, xmm1","V= EX.NDS.128.66.0F3A.W0 48 /r /is4","V","V","XOP","amd","w,r,r,r,r","","" +"VPERMIL2PS ymm1, ymmV, ymmIH, ymm2/m256, imm8u","VPERMIL2PS imm8u, ymm2/m= 256, ymmIH, ymmV, ymm1","vpermil2ps imm8u, ymm2/m256, ymmIH, ymmV, ymm1","V= EX.NDS.256.66.0F3A.W1 48 /r /is4","V","V","XOP","amd","w,r,r,r,r","","" +"VPERMIL2PS ymm1, ymmV, ymm2/m256, ymmIH, imm8u","VPERMIL2PS imm8u, ymmIH,= ymm2/m256, ymmV, ymm1","vpermil2ps imm8u, ymmIH, ymm2/m256, ymmV, ymm1","V= EX.NDS.256.66.0F3A.W0 48 /r /is4","V","V","XOP","amd","w,r,r,r,r","","" +"VPERMILPD xmm1, xmm2/m128, imm8u","VPERMILPD imm8u, xmm2/m128, xmm1","vpe= rmilpd imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.W0 05 /r ib","V","V","AVX",= "","w,r,r","","" +"VPERMILPD xmm1, {k}{z}, xmm2/m128/m64bcst, imm8u","VPERMILPD imm8u, xmm2/= m128/m64bcst, {k}{z}, xmm1","vpermilpd imm8u, xmm2/m128/m64bcst, {k}{z}, xm= m1","EVEX.128.66.0F3A.W1 05 /r ib","V","V","AVX512F+AVX512VL","bscale8,scal= e16","w,r,r,r","","" +"VPERMILPD ymm1, ymm2/m256, imm8u","VPERMILPD imm8u, ymm2/m256, ymm1","vpe= rmilpd imm8u, ymm2/m256, ymm1","VEX.256.66.0F3A.W0 05 /r ib","V","V","AVX",= "","w,r,r","","" +"VPERMILPD ymm1, {k}{z}, ymm2/m256/m64bcst, imm8u","VPERMILPD imm8u, ymm2/= m256/m64bcst, {k}{z}, ymm1","vpermilpd imm8u, ymm2/m256/m64bcst, {k}{z}, ym= m1","EVEX.256.66.0F3A.W1 05 /r ib","V","V","AVX512F+AVX512VL","bscale8,scal= e32","w,r,r,r","","" +"VPERMILPD zmm1, {k}{z}, zmm2/m512/m64bcst, imm8u","VPERMILPD imm8u, zmm2/= m512/m64bcst, {k}{z}, zmm1","vpermilpd imm8u, zmm2/m512/m64bcst, {k}{z}, zm= m1","EVEX.512.66.0F3A.W1 05 /r ib","V","V","AVX512F","bscale8,scale64","w,r= ,r,r","","" +"VPERMILPD xmm1, xmmV, xmm2/m128","VPERMILPD xmm2/m128, xmmV, xmm1","vperm= ilpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 0D /r","V","V","AVX",""= ,"w,r,r","","" +"VPERMILPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPERMILPD xmm2/m128/m64= bcst, xmmV, {k}{z}, xmm1","vpermilpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.NDS.128.66.0F38.W1 0D /r","V","V","AVX512F+AVX512VL","bscale8,scale1= 6","w,r,r,r","","" +"VPERMILPD ymm1, ymmV, ymm2/m256","VPERMILPD ymm2/m256, ymmV, ymm1","vperm= ilpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 0D /r","V","V","AVX",""= ,"w,r,r","","" +"VPERMILPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMILPD ymm2/m256/m64= bcst, ymmV, {k}{z}, ymm1","vpermilpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.NDS.256.66.0F38.W1 0D /r","V","V","AVX512F+AVX512VL","bscale8,scale3= 2","w,r,r,r","","" +"VPERMILPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMILPD zmm2/m512/m64= bcst, zmmV, {k}{z}, zmm1","vpermilpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.NDS.512.66.0F38.W1 0D 
/r","V","V","AVX512F","bscale8,scale64","w,r,r= ,r","","" +"VPERMILPS xmm1, xmm2/m128, imm8u","VPERMILPS imm8u, xmm2/m128, xmm1","vpe= rmilps imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.W0 04 /r ib","V","V","AVX",= "","w,r,r","","" +"VPERMILPS xmm1, {k}{z}, xmm2/m128/m32bcst, imm8u","VPERMILPS imm8u, xmm2/= m128/m32bcst, {k}{z}, xmm1","vpermilps imm8u, xmm2/m128/m32bcst, {k}{z}, xm= m1","EVEX.128.66.0F3A.W0 04 /r ib","V","V","AVX512F+AVX512VL","bscale4,scal= e16","w,r,r,r","","" +"VPERMILPS ymm1, ymm2/m256, imm8u","VPERMILPS imm8u, ymm2/m256, ymm1","vpe= rmilps imm8u, ymm2/m256, ymm1","VEX.256.66.0F3A.W0 04 /r ib","V","V","AVX",= "","w,r,r","","" +"VPERMILPS ymm1, {k}{z}, ymm2/m256/m32bcst, imm8u","VPERMILPS imm8u, ymm2/= m256/m32bcst, {k}{z}, ymm1","vpermilps imm8u, ymm2/m256/m32bcst, {k}{z}, ym= m1","EVEX.256.66.0F3A.W0 04 /r ib","V","V","AVX512F+AVX512VL","bscale4,scal= e32","w,r,r,r","","" +"VPERMILPS zmm1, {k}{z}, zmm2/m512/m32bcst, imm8u","VPERMILPS imm8u, zmm2/= m512/m32bcst, {k}{z}, zmm1","vpermilps imm8u, zmm2/m512/m32bcst, {k}{z}, zm= m1","EVEX.512.66.0F3A.W0 04 /r ib","V","V","AVX512F","bscale4,scale64","w,r= ,r,r","","" +"VPERMILPS xmm1, xmmV, xmm2/m128","VPERMILPS xmm2/m128, xmmV, xmm1","vperm= ilps xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 0C /r","V","V","AVX",""= ,"w,r,r","","" +"VPERMILPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPERMILPS xmm2/m128/m32= bcst, xmmV, {k}{z}, xmm1","vpermilps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.NDS.128.66.0F38.W0 0C /r","V","V","AVX512F+AVX512VL","bscale4,scale1= 6","w,r,r,r","","" +"VPERMILPS ymm1, ymmV, ymm2/m256","VPERMILPS ymm2/m256, ymmV, ymm1","vperm= ilps ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 0C /r","V","V","AVX",""= ,"w,r,r","","" +"VPERMILPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMILPS ymm2/m256/m32= bcst, ymmV, {k}{z}, ymm1","vpermilps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.NDS.256.66.0F38.W0 0C /r","V","V","AVX512F+AVX512VL","bscale4,scale3= 2","w,r,r,r","","" +"VPERMILPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMILPS zmm2/m512/m32= bcst, zmmV, {k}{z}, zmm1","vpermilps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.NDS.512.66.0F38.W0 0C /r","V","V","AVX512F","bscale4,scale64","w,r,r= ,r","","" +"VPERMPD ymm1, ymm2/m256, imm8u","VPERMPD imm8u, ymm2/m256, ymm1","vpermpd= imm8u, ymm2/m256, ymm1","VEX.256.66.0F3A.W1 01 /r ib","V","V","AVX2","","w= ,r,r","","" +"VPERMPD ymm1, {k}{z}, ymm2/m256/m64bcst, imm8u","VPERMPD imm8u, ymm2/m256= /m64bcst, {k}{z}, ymm1","vpermpd imm8u, ymm2/m256/m64bcst, {k}{z}, ymm1","E= VEX.256.66.0F3A.W1 01 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","= w,r,r,r","","" +"VPERMPD zmm1, {k}{z}, zmm2/m512/m64bcst, imm8u","VPERMPD imm8u, zmm2/m512= /m64bcst, {k}{z}, zmm1","vpermpd imm8u, zmm2/m512/m64bcst, {k}{z}, zmm1","E= VEX.512.66.0F3A.W1 01 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r",= "","" +"VPERMPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMPD ymm2/m256/m64bcst= , ymmV, {k}{z}, ymm1","vpermpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W1 16 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,= r,r,r","","" +"VPERMPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMPD zmm2/m512/m64bcst= , zmmV, {k}{z}, zmm1","vpermpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W1 16 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r",""= ,"" +"VPERMPS ymm1, ymmV, ymm2/m256","VPERMPS ymm2/m256, ymmV, ymm1","vpermps y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 16 /r","V","V","AVX2","","w,r= ,r","","" +"VPERMPS ymm1, {k}{z}, ymmV, 
ymm2/m256/m32bcst","VPERMPS ymm2/m256/m32bcst= , ymmV, {k}{z}, ymm1","vpermps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W0 16 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,= r,r,r","","" +"VPERMPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMPS zmm2/m512/m32bcst= , zmmV, {k}{z}, zmm1","vpermps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W0 16 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r",""= ,"" +"VPERMQ ymm1, ymm2/m256, imm8u","VPERMQ imm8u, ymm2/m256, ymm1","vpermq im= m8u, ymm2/m256, ymm1","VEX.256.66.0F3A.W1 00 /r ib","V","V","AVX2","","w,r,= r","","" +"VPERMQ ymm1, {k}{z}, ymm2/m256/m64bcst, imm8u","VPERMQ imm8u, ymm2/m256/m= 64bcst, {k}{z}, ymm1","vpermq imm8u, ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX= .256.66.0F3A.W1 00 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r= ,r,r","","" +"VPERMQ zmm1, {k}{z}, zmm2/m512/m64bcst, imm8u","VPERMQ imm8u, zmm2/m512/m= 64bcst, {k}{z}, zmm1","vpermq imm8u, zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX= .512.66.0F3A.W1 00 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","",= "" +"VPERMQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMQ ymm2/m256/m64bcst, = ymmV, {k}{z}, ymm1","vpermq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F38.W1 36 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r= ,r","","" +"VPERMQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMQ zmm2/m512/m64bcst, = zmmV, {k}{z}, zmm1","vpermq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F38.W1 36 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VPERMT2B xmm1, {k}{z}, xmmV, xmm2/m128","VPERMT2B xmm2/m128, xmmV, {k}{z}= , xmm1","vpermt2b xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 7= D /r","V","V","AVX512_VBMI+AVX512VL","scale16","rw,r,r,r","","" +"VPERMT2B ymm1, {k}{z}, ymmV, ymm2/m256","VPERMT2B ymm2/m256, ymmV, {k}{z}= , ymm1","vpermt2b ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 7= D /r","V","V","AVX512_VBMI+AVX512VL","scale32","rw,r,r,r","","" +"VPERMT2B zmm1, {k}{z}, zmmV, zmm2/m512","VPERMT2B zmm2/m512, zmmV, {k}{z}= , zmm1","vpermt2b zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 7= D /r","V","V","AVX512_VBMI","scale64","rw,r,r,r","","" +"VPERMT2D xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPERMT2D xmm2/m128/m32bc= st, xmmV, {k}{z}, xmm1","vpermt2d xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","E= VEX.DDS.128.66.0F38.W0 7E /r","V","V","AVX512F+AVX512VL","bscale4,scale16",= "rw,r,r,r","","" +"VPERMT2D ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMT2D ymm2/m256/m32bc= st, ymmV, {k}{z}, ymm1","vpermt2d ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","E= VEX.DDS.256.66.0F38.W0 7E /r","V","V","AVX512F+AVX512VL","bscale4,scale32",= "rw,r,r,r","","" +"VPERMT2D zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMT2D zmm2/m512/m32bc= st, zmmV, {k}{z}, zmm1","vpermt2d zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","E= VEX.DDS.512.66.0F38.W0 7E /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r= ","","" +"VPERMT2PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPERMT2PD xmm2/m128/m64= bcst, xmmV, {k}{z}, xmm1","vpermt2pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.DDS.128.66.0F38.W1 7F /r","V","V","AVX512F+AVX512VL","bscale8,scale1= 6","rw,r,r,r","","" +"VPERMT2PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMT2PD ymm2/m256/m64= bcst, ymmV, {k}{z}, ymm1","vpermt2pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.DDS.256.66.0F38.W1 7F /r","V","V","AVX512F+AVX512VL","bscale8,scale3= 2","rw,r,r,r","","" +"VPERMT2PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMT2PD zmm2/m512/m64= bcst, zmmV, {k}{z}, 
zmm1","vpermt2pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.DDS.512.66.0F38.W1 7F /r","V","V","AVX512F","bscale8,scale64","rw,r,= r,r","","" +"VPERMT2PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPERMT2PS xmm2/m128/m32= bcst, xmmV, {k}{z}, xmm1","vpermt2ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.DDS.128.66.0F38.W0 7F /r","V","V","AVX512F+AVX512VL","bscale4,scale1= 6","rw,r,r,r","","" +"VPERMT2PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMT2PS ymm2/m256/m32= bcst, ymmV, {k}{z}, ymm1","vpermt2ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.DDS.256.66.0F38.W0 7F /r","V","V","AVX512F+AVX512VL","bscale4,scale3= 2","rw,r,r,r","","" +"VPERMT2PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMT2PS zmm2/m512/m32= bcst, zmmV, {k}{z}, zmm1","vpermt2ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.DDS.512.66.0F38.W0 7F /r","V","V","AVX512F","bscale4,scale64","rw,r,= r,r","","" +"VPERMT2Q xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPERMT2Q xmm2/m128/m64bc= st, xmmV, {k}{z}, xmm1","vpermt2q xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","E= VEX.DDS.128.66.0F38.W1 7E /r","V","V","AVX512F+AVX512VL","bscale8,scale16",= "rw,r,r,r","","" +"VPERMT2Q ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMT2Q ymm2/m256/m64bc= st, ymmV, {k}{z}, ymm1","vpermt2q ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","E= VEX.DDS.256.66.0F38.W1 7E /r","V","V","AVX512F+AVX512VL","bscale8,scale32",= "rw,r,r,r","","" +"VPERMT2Q zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMT2Q zmm2/m512/m64bc= st, zmmV, {k}{z}, zmm1","vpermt2q zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","E= VEX.DDS.512.66.0F38.W1 7E /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r= ","","" +"VPERMT2W xmm1, {k}{z}, xmmV, xmm2/m128","VPERMT2W xmm2/m128, xmmV, {k}{z}= , xmm1","vpermt2w xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 7= D /r","V","V","AVX512BW+AVX512VL","scale16","rw,r,r,r","","" +"VPERMT2W ymm1, {k}{z}, ymmV, ymm2/m256","VPERMT2W ymm2/m256, ymmV, {k}{z}= , ymm1","vpermt2w ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 7= D /r","V","V","AVX512BW+AVX512VL","scale32","rw,r,r,r","","" +"VPERMT2W zmm1, {k}{z}, zmmV, zmm2/m512","VPERMT2W zmm2/m512, zmmV, {k}{z}= , zmm1","vpermt2w zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 7= D /r","V","V","AVX512BW","scale64","rw,r,r,r","","" +"VPERMW xmm1, {k}{z}, xmmV, xmm2/m128","VPERMW xmm2/m128, xmmV, {k}{z}, xm= m1","vpermw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 8D /r",= "V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPERMW ymm1, {k}{z}, ymmV, ymm2/m256","VPERMW ymm2/m256, ymmV, {k}{z}, ym= m1","vpermw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 8D /r",= "V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPERMW zmm1, {k}{z}, zmmV, zmm2/m512","VPERMW zmm2/m512, zmmV, {k}{z}, zm= m1","vpermw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 8D /r",= "V","V","AVX512BW","scale64","w,r,r,r","","" +"VPEXPANDB xmm1, {k}{z}, xmm2/m128","VPEXPANDB xmm2/m128, {k}{z}, xmm1","v= pexpandb xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W0 62 /r","V","V","AVX5= 12_VBMI2+AVX512VL","scale1","w,r,r","","" +"VPEXPANDB ymm1, {k}{z}, ymm2/m256","VPEXPANDB ymm2/m256, {k}{z}, ymm1","v= pexpandb ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W0 62 /r","V","V","AVX5= 12_VBMI2+AVX512VL","scale1","w,r,r","","" +"VPEXPANDB zmm1, {k}{z}, zmm2/m512","VPEXPANDB zmm2/m512, {k}{z}, zmm1","v= pexpandb zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W0 62 /r","V","V","AVX5= 12_VBMI2","scale1","w,r,r","","" +"VPEXPANDD xmm1, {k}{z}, xmm2/m128","VPEXPANDD xmm2/m128, {k}{z}, xmm1","v= pexpandd xmm2/m128, 
{k}{z}, xmm1","EVEX.128.66.0F38.W0 89 /r","V","V","AVX5= 12F+AVX512VL","scale4","w,r,r","","" +"VPEXPANDD ymm1, {k}{z}, ymm2/m256","VPEXPANDD ymm2/m256, {k}{z}, ymm1","v= pexpandd ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W0 89 /r","V","V","AVX5= 12F+AVX512VL","scale4","w,r,r","","" +"VPEXPANDD zmm1, {k}{z}, zmm2/m512","VPEXPANDD zmm2/m512, {k}{z}, zmm1","v= pexpandd zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W0 89 /r","V","V","AVX5= 12F","scale4","w,r,r","","" +"VPEXPANDQ xmm1, {k}{z}, xmm2/m128","VPEXPANDQ xmm2/m128, {k}{z}, xmm1","v= pexpandq xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W1 89 /r","V","V","AVX5= 12F+AVX512VL","scale8","w,r,r","","" +"VPEXPANDQ ymm1, {k}{z}, ymm2/m256","VPEXPANDQ ymm2/m256, {k}{z}, ymm1","v= pexpandq ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W1 89 /r","V","V","AVX5= 12F+AVX512VL","scale8","w,r,r","","" +"VPEXPANDQ zmm1, {k}{z}, zmm2/m512","VPEXPANDQ zmm2/m512, {k}{z}, zmm1","v= pexpandq zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W1 89 /r","V","V","AVX5= 12F","scale8","w,r,r","","" +"VPEXPANDW xmm1, {k}{z}, xmm2/m128","VPEXPANDW xmm2/m128, {k}{z}, xmm1","v= pexpandw xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W1 62 /r","V","V","AVX5= 12_VBMI2+AVX512VL","scale2","w,r,r","","" +"VPEXPANDW ymm1, {k}{z}, ymm2/m256","VPEXPANDW ymm2/m256, {k}{z}, ymm1","v= pexpandw ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W1 62 /r","V","V","AVX5= 12_VBMI2+AVX512VL","scale2","w,r,r","","" +"VPEXPANDW zmm1, {k}{z}, zmm2/m512","VPEXPANDW zmm2/m512, {k}{z}, zmm1","v= pexpandw zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W1 62 /r","V","V","AVX5= 12_VBMI2","scale2","w,r,r","","" +"VPEXTRB r32/m8, xmm1, imm8u","VPEXTRB imm8u, xmm1, r32/m8","vpextrb imm8u= , xmm1, r32/m8","EVEX.128.66.0F3A.WIG 14 /r ib","V","V","AVX512BW+AVX512VL"= ,"scale1","w,r,r","","" +"VPEXTRB r32/m8, xmm1, imm8u","VPEXTRB imm8u, xmm1, r32/m8","vpextrb imm8u= , xmm1, r32/m8","VEX.128.66.0F3A.WIG 14 /r ib","V","V","AVX","","w,r,r","",= "" +"VPEXTRD r/m32, xmm1, imm8u","VPEXTRD imm8u, xmm1, r/m32","vpextrd imm8u, = xmm1, r/m32","EVEX.128.66.0F3A.W0 16 /r ib","V","V","AVX512DQ+AVX512VL","sc= ale4","w,r,r","","" +"VPEXTRD r/m32, xmm1, imm8u","VPEXTRD imm8u, xmm1, r/m32","vpextrd imm8u, = xmm1, r/m32","VEX.128.66.0F3A.W0 16 /r ib","V","V","AVX","","w,r,r","","" +"VPEXTRQ r/m64, xmm1, imm8u","VPEXTRQ imm8u, xmm1, r/m64","vpextrq imm8u, = xmm1, r/m64","EVEX.128.66.0F3A.W1 16 /r ib","N.S.","V","AVX512DQ+AVX512VL",= "scale8","w,r,r","","" +"VPEXTRQ r/m64, xmm1, imm8u","VPEXTRQ imm8u, xmm1, r/m64","vpextrq imm8u, = xmm1, r/m64","VEX.128.66.0F3A.W1 16 /r ib","N.S.","V","AVX","","w,r,r","","" +"VPEXTRW r32/m16, xmm1, imm8u","VPEXTRW imm8u, xmm1, r32/m16","vpextrw imm= 8u, xmm1, r32/m16","EVEX.128.66.0F3A.WIG 15 /r ib","V","V","AVX512BW+AVX512= VL","scale2","w,r,r","","" +"VPEXTRW r32/m16, xmm1, imm8u","VPEXTRW imm8u, xmm1, r32/m16","vpextrw imm= 8u, xmm1, r32/m16","VEX.128.66.0F3A.WIG 15 /r ib","V","V","AVX","","w,r,r",= "","" +"VPEXTRW r32, xmm2, imm8u","VPEXTRW imm8u, xmm2, r32","vpextrw imm8u, xmm2= , r32","EVEX.128.66.0F.WIG C5 /r ib","V","V","AVX512BW+AVX512VL","modrm_reg= only","w,r,r","","" +"VPEXTRW r32, xmm2, imm8u","VPEXTRW imm8u, xmm2, r32","vpextrw imm8u, xmm2= , r32","VEX.128.66.0F.WIG C5 /r ib","V","V","AVX","modrm_regonly","w,r,r","= ","" +"VPGATHERDD xmm1, {k1-k7}, vm32x","VPGATHERDD vm32x, {k1-k7}, xmm1","vpgat= herdd vm32x, {k1-k7}, xmm1","EVEX.128.66.0F38.W0 90 /vsib","V","V","AVX512F= +AVX512VL","modrm_memonly,scale4","w,rw,r","","" +"VPGATHERDD ymm1, {k1-k7}, vm32y","VPGATHERDD vm32y, {k1-k7}, ymm1","vpgat= 
herdd vm32y, {k1-k7}, ymm1","EVEX.256.66.0F38.W0 90 /vsib","V","V","AVX512F= +AVX512VL","modrm_memonly,scale4","w,rw,r","","" +"VPGATHERDD zmm1, {k1-k7}, vm32z","VPGATHERDD vm32z, {k1-k7}, zmm1","vpgat= herdd vm32z, {k1-k7}, zmm1","EVEX.512.66.0F38.W0 90 /vsib","V","V","AVX512F= ","modrm_memonly,scale4","w,rw,r","","" +"VPGATHERDD xmm1, vm32x, xmmV","VPGATHERDD xmmV, vm32x, xmm1","vpgatherdd = xmmV, vm32x, xmm1","VEX.DDS.128.66.0F38.W0 90 /r","V","V","AVX2","modrm_mem= only","rw,r,rw","","" +"VPGATHERDD ymm1, vm32y, ymmV","VPGATHERDD ymmV, vm32y, ymm1","vpgatherdd = ymmV, vm32y, ymm1","VEX.DDS.256.66.0F38.W0 90 /r","V","V","AVX2","modrm_mem= only","rw,r,rw","","" +"VPGATHERDQ xmm1, {k1-k7}, vm32x","VPGATHERDQ vm32x, {k1-k7}, xmm1","vpgat= herdq vm32x, {k1-k7}, xmm1","EVEX.128.66.0F38.W1 90 /vsib","V","V","AVX512F= +AVX512VL","modrm_memonly,scale8","w,rw,r","","" +"VPGATHERDQ ymm1, {k1-k7}, vm32x","VPGATHERDQ vm32x, {k1-k7}, ymm1","vpgat= herdq vm32x, {k1-k7}, ymm1","EVEX.256.66.0F38.W1 90 /vsib","V","V","AVX512F= +AVX512VL","modrm_memonly,scale8","w,rw,r","","" +"VPGATHERDQ zmm1, {k1-k7}, vm32y","VPGATHERDQ vm32y, {k1-k7}, zmm1","vpgat= herdq vm32y, {k1-k7}, zmm1","EVEX.512.66.0F38.W1 90 /vsib","V","V","AVX512F= ","modrm_memonly,scale8","w,rw,r","","" +"VPGATHERDQ xmm1, vm32x, xmmV","VPGATHERDQ xmmV, vm32x, xmm1","vpgatherdq = xmmV, vm32x, xmm1","VEX.DDS.128.66.0F38.W1 90 /r","V","V","AVX2","modrm_mem= only","rw,r,rw","","" +"VPGATHERDQ ymm1, vm32x, ymmV","VPGATHERDQ ymmV, vm32x, ymm1","vpgatherdq = ymmV, vm32x, ymm1","VEX.DDS.256.66.0F38.W1 90 /r","V","V","AVX2","modrm_mem= only","rw,r,rw","","" +"VPGATHERQD xmm1, {k1-k7}, vm64x","VPGATHERQD vm64x, {k1-k7}, xmm1","vpgat= herqd vm64x, {k1-k7}, xmm1","EVEX.128.66.0F38.W0 91 /vsib","V","V","AVX512F= +AVX512VL","modrm_memonly,scale4","w,rw,r","","" +"VPGATHERQD xmm1, {k1-k7}, vm64y","VPGATHERQD vm64y, {k1-k7}, xmm1","vpgat= herqd vm64y, {k1-k7}, xmm1","EVEX.256.66.0F38.W0 91 /vsib","V","V","AVX512F= +AVX512VL","modrm_memonly,scale4","w,rw,r","","" +"VPGATHERQD ymm1, {k1-k7}, vm64z","VPGATHERQD vm64z, {k1-k7}, ymm1","vpgat= herqd vm64z, {k1-k7}, ymm1","EVEX.512.66.0F38.W0 91 /vsib","V","V","AVX512F= ","modrm_memonly,scale4","w,rw,r","","" +"VPGATHERQD xmm1, vm64x, xmmV","VPGATHERQD xmmV, vm64x, xmm1","vpgatherqd = xmmV, vm64x, xmm1","VEX.DDS.128.66.0F38.W0 91 /r","V","V","AVX2","modrm_mem= only","rw,r,rw","","" +"VPGATHERQD xmm1, vm64y, xmmV","VPGATHERQD xmmV, vm64y, xmm1","vpgatherqd = xmmV, vm64y, xmm1","VEX.DDS.256.66.0F38.W0 91 /r","V","V","AVX2","modrm_mem= only","rw,r,rw","","" +"VPGATHERQQ xmm1, {k1-k7}, vm64x","VPGATHERQQ vm64x, {k1-k7}, xmm1","vpgat= herqq vm64x, {k1-k7}, xmm1","EVEX.128.66.0F38.W1 91 /vsib","V","V","AVX512F= +AVX512VL","modrm_memonly,scale8","w,rw,r","","" +"VPGATHERQQ ymm1, {k1-k7}, vm64y","VPGATHERQQ vm64y, {k1-k7}, ymm1","vpgat= herqq vm64y, {k1-k7}, ymm1","EVEX.256.66.0F38.W1 91 /vsib","V","V","AVX512F= +AVX512VL","modrm_memonly,scale8","w,rw,r","","" +"VPGATHERQQ zmm1, {k1-k7}, vm64z","VPGATHERQQ vm64z, {k1-k7}, zmm1","vpgat= herqq vm64z, {k1-k7}, zmm1","EVEX.512.66.0F38.W1 91 /vsib","V","V","AVX512F= ","modrm_memonly,scale8","w,rw,r","","" +"VPGATHERQQ xmm1, vm64x, xmmV","VPGATHERQQ xmmV, vm64x, xmm1","vpgatherqq = xmmV, vm64x, xmm1","VEX.DDS.128.66.0F38.W1 91 /r","V","V","AVX2","modrm_mem= only","rw,r,rw","","" +"VPGATHERQQ ymm1, vm64y, ymmV","VPGATHERQQ ymmV, vm64y, ymm1","vpgatherqq = ymmV, vm64y, ymm1","VEX.DDS.256.66.0F38.W1 91 /r","V","V","AVX2","modrm_mem= only","rw,r,rw","","" +"VPHADDBD xmm1, 
xmm2/m128","VPHADDBD xmm2/m128, xmm1","vphaddbd xmm2/m128,= xmm1","XOP.128.09.W0 C2 /r","V","V","XOP","amd","w,r","","" +"VPHADDBQ xmm1, xmm2/m128","VPHADDBQ xmm2/m128, xmm1","vphaddbq xmm2/m128,= xmm1","XOP.128.09.W0 C3 /r","V","V","XOP","amd","w,r","","" +"VPHADDBW xmm1, xmm2/m128","VPHADDBW xmm2/m128, xmm1","vphaddbw xmm2/m128,= xmm1","XOP.128.09.W0 C1 /r","V","V","XOP","amd","w,r","","" +"VPHADDD xmm1, xmmV, xmm2/m128","VPHADDD xmm2/m128, xmmV, xmm1","vphaddd x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 02 /r","V","V","AVX","","w,r= ,r","","" +"VPHADDD ymm1, ymmV, ymm2/m256","VPHADDD ymm2/m256, ymmV, ymm1","vphaddd y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 02 /r","V","V","AVX2","","w,= r,r","","" +"VPHADDDQ xmm1, xmm2/m128","VPHADDDQ xmm2/m128, xmm1","vphadddq xmm2/m128,= xmm1","XOP.128.09.W0 CB /r","V","V","XOP","amd","w,r","","" +"VPHADDSW xmm1, xmmV, xmm2/m128","VPHADDSW xmm2/m128, xmmV, xmm1","vphadds= w xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 03 /r","V","V","AVX","","= w,r,r","","" +"VPHADDSW ymm1, ymmV, ymm2/m256","VPHADDSW ymm2/m256, ymmV, ymm1","vphadds= w ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 03 /r","V","V","AVX2","",= "w,r,r","","" +"VPHADDUBD xmm1, xmm2/m128","VPHADDUBD xmm2/m128, xmm1","vphaddubd xmm2/m1= 28, xmm1","XOP.128.09.W0 D2 /r","V","V","XOP","amd","w,r","","" +"VPHADDUBQ xmm1, xmm2/m128","VPHADDUBQ xmm2/m128, xmm1","vphaddubq xmm2/m1= 28, xmm1","XOP.128.09.W0 D3 /r","V","V","XOP","amd","w,r","","" +"VPHADDUBW xmm1, xmm2/m128","VPHADDUBW xmm2/m128, xmm1","vphaddubw xmm2/m1= 28, xmm1","XOP.128.09.W0 D1 /r","V","V","XOP","amd","w,r","","" +"VPHADDUDQ xmm1, xmm2/m128","VPHADDUDQ xmm2/m128, xmm1","vphaddudq xmm2/m1= 28, xmm1","XOP.128.09.W0 DB /r","V","V","XOP","amd","w,r","","" +"VPHADDUWD xmm1, xmm2/m128","VPHADDUWD xmm2/m128, xmm1","vphadduwd xmm2/m1= 28, xmm1","XOP.128.09.W0 D6 /r","V","V","XOP","amd","w,r","","" +"VPHADDUWQ xmm1, xmm2/m128","VPHADDUWQ xmm2/m128, xmm1","vphadduwq xmm2/m1= 28, xmm1","XOP.128.09.W0 D7 /r","V","V","XOP","amd","w,r","","" +"VPHADDW xmm1, xmmV, xmm2/m128","VPHADDW xmm2/m128, xmmV, xmm1","vphaddw x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 01 /r","V","V","AVX","","w,r= ,r","","" +"VPHADDW ymm1, ymmV, ymm2/m256","VPHADDW ymm2/m256, ymmV, ymm1","vphaddw y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 01 /r","V","V","AVX2","","w,= r,r","","" +"VPHADDWD xmm1, xmm2/m128","VPHADDWD xmm2/m128, xmm1","vphaddwd xmm2/m128,= xmm1","XOP.128.09.W0 C6 /r","V","V","XOP","amd","w,r","","" +"VPHADDWQ xmm1, xmm2/m128","VPHADDWQ xmm2/m128, xmm1","vphaddwq xmm2/m128,= xmm1","XOP.128.09.W0 C7 /r","V","V","XOP","amd","w,r","","" +"VPHMINPOSUW xmm1, xmm2/m128","VPHMINPOSUW xmm2/m128, xmm1","vphminposuw x= mm2/m128, xmm1","VEX.128.66.0F38.WIG 41 /r","V","V","AVX","","w,r","","" +"VPHSUBBW xmm1, xmm2/m128","VPHSUBBW xmm2/m128, xmm1","vphsubbw xmm2/m128,= xmm1","XOP.128.09.W0 E1 /r","V","V","XOP","amd","w,r","","" +"VPHSUBD xmm1, xmmV, xmm2/m128","VPHSUBD xmm2/m128, xmmV, xmm1","vphsubd x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 06 /r","V","V","AVX","","w,r= ,r","","" +"VPHSUBD ymm1, ymmV, ymm2/m256","VPHSUBD ymm2/m256, ymmV, ymm1","vphsubd y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 06 /r","V","V","AVX2","","w,= r,r","","" +"VPHSUBDQ xmm1, xmm2/m128","VPHSUBDQ xmm2/m128, xmm1","vphsubdq xmm2/m128,= xmm1","XOP.128.09.W0 E3 /r","V","V","XOP","amd","w,r","","" +"VPHSUBSW xmm1, xmmV, xmm2/m128","VPHSUBSW xmm2/m128, xmmV, xmm1","vphsubs= w xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 07 /r","V","V","AVX","","= 
w,r,r","","" +"VPHSUBSW ymm1, ymmV, ymm2/m256","VPHSUBSW ymm2/m256, ymmV, ymm1","vphsubs= w ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 07 /r","V","V","AVX2","",= "w,r,r","","" +"VPHSUBW xmm1, xmmV, xmm2/m128","VPHSUBW xmm2/m128, xmmV, xmm1","vphsubw x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 05 /r","V","V","AVX","","w,r= ,r","","" +"VPHSUBW ymm1, ymmV, ymm2/m256","VPHSUBW ymm2/m256, ymmV, ymm1","vphsubw y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 05 /r","V","V","AVX2","","w,= r,r","","" +"VPHSUBWD xmm1, xmm2/m128","VPHSUBWD xmm2/m128, xmm1","vphsubwd xmm2/m128,= xmm1","XOP.128.09.W0 E2 /r","V","V","XOP","amd","w,r","","" +"VPINSRB xmm1, xmmV, r32/m8, imm8u","VPINSRB imm8u, r32/m8, xmmV, xmm1","v= pinsrb imm8u, r32/m8, xmmV, xmm1","EVEX.NDS.128.66.0F3A.WIG 20 /r ib","V","= V","AVX512BW+AVX512VL","scale1","w,r,r,r","","" +"VPINSRB xmm1, xmmV, r32/m8, imm8u","VPINSRB imm8u, r32/m8, xmmV, xmm1","v= pinsrb imm8u, r32/m8, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 20 /r ib","V","V= ","AVX","","w,r,r,r","","" +"VPINSRD xmm1, xmmV, r/m32, imm8u","VPINSRD imm8u, r/m32, xmmV, xmm1","vpi= nsrd imm8u, r/m32, xmmV, xmm1","EVEX.NDS.128.66.0F3A.W0 22 /r ib","V","V","= AVX512DQ+AVX512VL","scale4","w,r,r,r","","" +"VPINSRD xmm1, xmmV, r/m32, imm8u","VPINSRD imm8u, r/m32, xmmV, xmm1","vpi= nsrd imm8u, r/m32, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 22 /r ib","V","V","A= VX","","w,r,r,r","","" +"VPINSRQ xmm1, xmmV, r/m64, imm8u","VPINSRQ imm8u, r/m64, xmmV, xmm1","vpi= nsrq imm8u, r/m64, xmmV, xmm1","EVEX.NDS.128.66.0F3A.W1 22 /r ib","N.S.","V= ","AVX512DQ+AVX512VL","scale8","w,r,r,r","","" +"VPINSRQ xmm1, xmmV, r/m64, imm8u","VPINSRQ imm8u, r/m64, xmmV, xmm1","vpi= nsrq imm8u, r/m64, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 22 /r ib","N.S.","V"= ,"AVX","","w,r,r,r","","" +"VPINSRW xmm1, xmmV, r32/m16, imm8u","VPINSRW imm8u, r32/m16, xmmV, xmm1",= "vpinsrw imm8u, r32/m16, xmmV, xmm1","EVEX.NDS.128.66.0F.WIG C4 /r ib","V",= "V","AVX512BW+AVX512VL","scale2","w,r,r,r","","" +"VPINSRW xmm1, xmmV, r32/m16, imm8u","VPINSRW imm8u, r32/m16, xmmV, xmm1",= "vpinsrw imm8u, r32/m16, xmmV, xmm1","VEX.NDS.128.66.0F.WIG C4 /r ib","V","= V","AVX","","w,r,r,r","","" +"VPLZCNTD xmm1, {k}{z}, xmm2/m128/m32bcst","VPLZCNTD xmm2/m128/m32bcst, {k= }{z}, xmm1","vplzcntd xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W0= 44 /r","V","V","AVX512CD+AVX512VL","bscale4,scale16","w,r,r","","" +"VPLZCNTD ymm1, {k}{z}, ymm2/m256/m32bcst","VPLZCNTD ymm2/m256/m32bcst, {k= }{z}, ymm1","vplzcntd ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W0= 44 /r","V","V","AVX512CD+AVX512VL","bscale4,scale32","w,r,r","","" +"VPLZCNTD zmm1, {k}{z}, zmm2/m512/m32bcst","VPLZCNTD zmm2/m512/m32bcst, {k= }{z}, zmm1","vplzcntd zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0= 44 /r","V","V","AVX512CD","bscale4,scale64","w,r,r","","" +"VPLZCNTQ xmm1, {k}{z}, xmm2/m128/m64bcst","VPLZCNTQ xmm2/m128/m64bcst, {k= }{z}, xmm1","vplzcntq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W1= 44 /r","V","V","AVX512CD+AVX512VL","bscale8,scale16","w,r,r","","" +"VPLZCNTQ ymm1, {k}{z}, ymm2/m256/m64bcst","VPLZCNTQ ymm2/m256/m64bcst, {k= }{z}, ymm1","vplzcntq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W1= 44 /r","V","V","AVX512CD+AVX512VL","bscale8,scale32","w,r,r","","" +"VPLZCNTQ zmm1, {k}{z}, zmm2/m512/m64bcst","VPLZCNTQ zmm2/m512/m64bcst, {k= }{z}, zmm1","vplzcntq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1= 44 /r","V","V","AVX512CD","bscale8,scale64","w,r,r","","" +"VPMACSDD xmm1, xmmV, xmm2/m128, xmmIH","VPMACSDD xmmIH, xmm2/m128, xmmV, 
= xmm1","vpmacsdd xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 9E /r /is4= ","V","V","XOP","amd","w,r,r,r","","" +"VPMACSDQH xmm1, xmmV, xmm2/m128, xmmIH","VPMACSDQH xmmIH, xmm2/m128, xmmV= , xmm1","vpmacsdqh xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 9F /r /= is4","V","V","XOP","amd","w,r,r,r","","" +"VPMACSDQL xmm1, xmmV, xmm2/m128, xmmIH","VPMACSDQL xmmIH, xmm2/m128, xmmV= , xmm1","vpmacsdql xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 97 /r /= is4","V","V","XOP","amd","w,r,r,r","","" +"VPMACSSDD xmm1, xmmV, xmm2/m128, xmmIH","VPMACSSDD xmmIH, xmm2/m128, xmmV= , xmm1","vpmacssdd xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 8E /r /= is4","V","V","XOP","amd","w,r,r,r","","" +"VPMACSSDQH xmm1, xmmV, xmm2/m128, xmmIH","VPMACSSDQH xmmIH, xmm2/m128, xm= mV, xmm1","vpmacssdqh xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 8F /= r /is4","V","V","XOP","amd","w,r,r,r","","" +"VPMACSSDQL xmm1, xmmV, xmm2/m128, xmmIH","VPMACSSDQL xmmIH, xmm2/m128, xm= mV, xmm1","vpmacssdql xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 87 /= r /is4","V","V","XOP","amd","w,r,r,r","","" +"VPMACSSWD xmm1, xmmV, xmm2/m128, xmmIH","VPMACSSWD xmmIH, xmm2/m128, xmmV= , xmm1","vpmacsswd xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 86 /r /= is4","V","V","XOP","amd","w,r,r,r","","" +"VPMACSSWW xmm1, xmmV, xmm2/m128, xmmIH","VPMACSSWW xmmIH, xmm2/m128, xmmV= , xmm1","vpmacssww xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 85 /r /= is4","V","V","XOP","amd","w,r,r,r","","" +"VPMACSWD xmm1, xmmV, xmm2/m128, xmmIH","VPMACSWD xmmIH, xmm2/m128, xmmV, = xmm1","vpmacswd xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 96 /r /is4= ","V","V","XOP","amd","w,r,r,r","","" +"VPMACSWW xmm1, xmmV, xmm2/m128, xmmIH","VPMACSWW xmmIH, xmm2/m128, xmmV, = xmm1","vpmacsww xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 95 /r /is4= ","V","V","XOP","amd","w,r,r,r","","" +"VPMADCSSWD xmm1, xmmV, xmm2/m128, xmmIH","VPMADCSSWD xmmIH, xmm2/m128, xm= mV, xmm1","vpmadcsswd xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 A6 /= r /is4","V","V","XOP","amd","w,r,r,r","","" +"VPMADCSWD xmm1, xmmV, xmm2/m128, xmmIH","VPMADCSWD xmmIH, xmm2/m128, xmmV= , xmm1","vpmadcswd xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 B6 /r /= is4","V","V","XOP","amd","w,r,r,r","","" +"VPMADD52HUQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMADD52HUQ xmm2/m128= /m64bcst, xmmV, {k}{z}, xmm1","vpmadd52huq xmm2/m128/m64bcst, xmmV, {k}{z},= xmm1","EVEX.DDS.128.66.0F38.W1 B5 /r","V","V","AVX512_IFMA+AVX512VL","bsca= le8,scale16","rw,r,r,r","","" +"VPMADD52HUQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMADD52HUQ ymm2/m256= /m64bcst, ymmV, {k}{z}, ymm1","vpmadd52huq ymm2/m256/m64bcst, ymmV, {k}{z},= ymm1","EVEX.DDS.256.66.0F38.W1 B5 /r","V","V","AVX512_IFMA+AVX512VL","bsca= le8,scale32","rw,r,r,r","","" +"VPMADD52HUQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMADD52HUQ zmm2/m512= /m64bcst, zmmV, {k}{z}, zmm1","vpmadd52huq zmm2/m512/m64bcst, zmmV, {k}{z},= zmm1","EVEX.DDS.512.66.0F38.W1 B5 /r","V","V","AVX512_IFMA","bscale8,scale= 64","rw,r,r,r","","" +"VPMADD52LUQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMADD52LUQ xmm2/m128= /m64bcst, xmmV, {k}{z}, xmm1","vpmadd52luq xmm2/m128/m64bcst, xmmV, {k}{z},= xmm1","EVEX.DDS.128.66.0F38.W1 B4 /r","V","V","AVX512_IFMA+AVX512VL","bsca= le8,scale16","rw,r,r,r","","" +"VPMADD52LUQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMADD52LUQ ymm2/m256= /m64bcst, ymmV, {k}{z}, ymm1","vpmadd52luq ymm2/m256/m64bcst, ymmV, {k}{z},= ymm1","EVEX.DDS.256.66.0F38.W1 B4 /r","V","V","AVX512_IFMA+AVX512VL","bsca= 
le8,scale32","rw,r,r,r","","" +"VPMADD52LUQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMADD52LUQ zmm2/m512= /m64bcst, zmmV, {k}{z}, zmm1","vpmadd52luq zmm2/m512/m64bcst, zmmV, {k}{z},= zmm1","EVEX.DDS.512.66.0F38.W1 B4 /r","V","V","AVX512_IFMA","bscale8,scale= 64","rw,r,r,r","","" +"VPMADDUBSW xmm1, xmmV, xmm2/m128","VPMADDUBSW xmm2/m128, xmmV, xmm1","vpm= addubsw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 04 /r","V","V","AVX= ","","w,r,r","","" +"VPMADDUBSW xmm1, {k}{z}, xmmV, xmm2/m128","VPMADDUBSW xmm2/m128, xmmV, {k= }{z}, xmm1","vpmaddubsw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3= 8.WIG 04 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPMADDUBSW ymm1, ymmV, ymm2/m256","VPMADDUBSW ymm2/m256, ymmV, ymm1","vpm= addubsw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 04 /r","V","V","AVX= 2","","w,r,r","","" +"VPMADDUBSW ymm1, {k}{z}, ymmV, ymm2/m256","VPMADDUBSW ymm2/m256, ymmV, {k= }{z}, ymm1","vpmaddubsw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3= 8.WIG 04 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPMADDUBSW zmm1, {k}{z}, zmmV, zmm2/m512","VPMADDUBSW zmm2/m512, zmmV, {k= }{z}, zmm1","vpmaddubsw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3= 8.WIG 04 /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPMADDWD xmm1, xmmV, xmm2/m128","VPMADDWD xmm2/m128, xmmV, xmm1","vpmaddw= d xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F5 /r","V","V","AVX","","w,= r,r","","" +"VPMADDWD xmm1, {k}{z}, xmmV, xmm2/m128","VPMADDWD xmm2/m128, xmmV, {k}{z}= , xmm1","vpmaddwd xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG F5= /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPMADDWD ymm1, ymmV, ymm2/m256","VPMADDWD ymm2/m256, ymmV, ymm1","vpmaddw= d ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F5 /r","V","V","AVX2","","w= ,r,r","","" +"VPMADDWD ymm1, {k}{z}, ymmV, ymm2/m256","VPMADDWD ymm2/m256, ymmV, {k}{z}= , ymm1","vpmaddwd ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG F5= /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPMADDWD zmm1, {k}{z}, zmmV, zmm2/m512","VPMADDWD zmm2/m512, zmmV, {k}{z}= , zmm1","vpmaddwd zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG F5= /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPMASKMOVD xmm1, xmmV, m128","VPMASKMOVD m128, xmmV, xmm1","vpmaskmovd m1= 28, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 8C /r","V","V","AVX2","modrm_memonl= y","w,r,r","","" +"VPMASKMOVD ymm1, ymmV, m256","VPMASKMOVD m256, ymmV, ymm1","vpmaskmovd m2= 56, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 8C /r","V","V","AVX2","modrm_memonl= y","w,r,r","","" +"VPMASKMOVD m128, xmmV, xmm1","VPMASKMOVD xmm1, xmmV, m128","vpmaskmovd xm= m1, xmmV, m128","VEX.NDS.128.66.0F38.W0 8E /r","V","V","AVX2","modrm_memonl= y","w,r,r","","" +"VPMASKMOVD m256, ymmV, ymm1","VPMASKMOVD ymm1, ymmV, m256","vpmaskmovd ym= m1, ymmV, m256","VEX.NDS.256.66.0F38.W0 8E /r","V","V","AVX2","modrm_memonl= y","w,r,r","","" +"VPMASKMOVQ xmm1, xmmV, m128","VPMASKMOVQ m128, xmmV, xmm1","vpmaskmovq m1= 28, xmmV, xmm1","VEX.NDS.128.66.0F38.W1 8C /r","V","V","AVX2","modrm_memonl= y","w,r,r","","" +"VPMASKMOVQ ymm1, ymmV, m256","VPMASKMOVQ m256, ymmV, ymm1","vpmaskmovq m2= 56, ymmV, ymm1","VEX.NDS.256.66.0F38.W1 8C /r","V","V","AVX2","modrm_memonl= y","w,r,r","","" +"VPMASKMOVQ m128, xmmV, xmm1","VPMASKMOVQ xmm1, xmmV, m128","vpmaskmovq xm= m1, xmmV, m128","VEX.NDS.128.66.0F38.W1 8E /r","V","V","AVX2","modrm_memonl= y","w,r,r","","" +"VPMASKMOVQ m256, ymmV, ymm1","VPMASKMOVQ ymm1, ymmV, m256","vpmaskmovq ym= m1, ymmV, m256","VEX.NDS.256.66.0F38.W1 
8E /r","V","V","AVX2","modrm_memonl= y","w,r,r","","" +"VPMAXSB xmm1, xmmV, xmm2/m128","VPMAXSB xmm2/m128, xmmV, xmm1","vpmaxsb x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 3C /r","V","V","AVX","","w,r= ,r","","" +"VPMAXSB xmm1, {k}{z}, xmmV, xmm2/m128","VPMAXSB xmm2/m128, xmmV, {k}{z}, = xmm1","vpmaxsb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.WIG 3C = /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPMAXSB ymm1, ymmV, ymm2/m256","VPMAXSB ymm2/m256, ymmV, ymm1","vpmaxsb y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 3C /r","V","V","AVX2","","w,= r,r","","" +"VPMAXSB ymm1, {k}{z}, ymmV, ymm2/m256","VPMAXSB ymm2/m256, ymmV, {k}{z}, = ymm1","vpmaxsb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.WIG 3C = /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPMAXSB zmm1, {k}{z}, zmmV, zmm2/m512","VPMAXSB zmm2/m512, zmmV, {k}{z}, = zmm1","vpmaxsb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.WIG 3C = /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPMAXSD xmm1, xmmV, xmm2/m128","VPMAXSD xmm2/m128, xmmV, xmm1","vpmaxsd x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 3D /r","V","V","AVX","","w,r= ,r","","" +"VPMAXSD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPMAXSD xmm2/m128/m32bcst= , xmmV, {k}{z}, xmm1","vpmaxsd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W0 3D /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,= r,r,r","","" +"VPMAXSD ymm1, ymmV, ymm2/m256","VPMAXSD ymm2/m256, ymmV, ymm1","vpmaxsd y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 3D /r","V","V","AVX2","","w,= r,r","","" +"VPMAXSD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPMAXSD ymm2/m256/m32bcst= , ymmV, {k}{z}, ymm1","vpmaxsd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W0 3D /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,= r,r,r","","" +"VPMAXSD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPMAXSD zmm2/m512/m32bcst= , zmmV, {k}{z}, zmm1","vpmaxsd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W0 3D /r","V","V","AVX512F","bscale4,scale64","w,r,r,r",""= ,"" +"VPMAXSQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMAXSQ xmm2/m128/m64bcst= , xmmV, {k}{z}, xmm1","vpmaxsq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W1 3D /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,= r,r,r","","" +"VPMAXSQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMAXSQ ymm2/m256/m64bcst= , ymmV, {k}{z}, ymm1","vpmaxsq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W1 3D /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,= r,r,r","","" +"VPMAXSQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMAXSQ zmm2/m512/m64bcst= , zmmV, {k}{z}, zmm1","vpmaxsq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W1 3D /r","V","V","AVX512F","bscale8,scale64","w,r,r,r",""= ,"" +"VPMAXSW xmm1, xmmV, xmm2/m128","VPMAXSW xmm2/m128, xmmV, xmm1","vpmaxsw x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG EE /r","V","V","AVX","","w,r,r= ","","" +"VPMAXSW xmm1, {k}{z}, xmmV, xmm2/m128","VPMAXSW xmm2/m128, xmmV, {k}{z}, = xmm1","vpmaxsw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG EE /r= ","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPMAXSW ymm1, ymmV, ymm2/m256","VPMAXSW ymm2/m256, ymmV, ymm1","vpmaxsw y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG EE /r","V","V","AVX2","","w,r,= r","","" +"VPMAXSW ymm1, {k}{z}, ymmV, ymm2/m256","VPMAXSW ymm2/m256, ymmV, {k}{z}, = ymm1","vpmaxsw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG EE /r= ","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPMAXSW zmm1, {k}{z}, 
zmmV, zmm2/m512","VPMAXSW zmm2/m512, zmmV, {k}{z}, = zmm1","vpmaxsw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG EE /r= ","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPMAXUB xmm1, xmmV, xmm2/m128","VPMAXUB xmm2/m128, xmmV, xmm1","vpmaxub x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG DE /r","V","V","AVX","","w,r,r= ","","" +"VPMAXUB xmm1, {k}{z}, xmmV, xmm2/m128","VPMAXUB xmm2/m128, xmmV, {k}{z}, = xmm1","vpmaxub xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG DE /r= ","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPMAXUB ymm1, ymmV, ymm2/m256","VPMAXUB ymm2/m256, ymmV, ymm1","vpmaxub y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG DE /r","V","V","AVX2","","w,r,= r","","" +"VPMAXUB ymm1, {k}{z}, ymmV, ymm2/m256","VPMAXUB ymm2/m256, ymmV, {k}{z}, = ymm1","vpmaxub ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG DE /r= ","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPMAXUB zmm1, {k}{z}, zmmV, zmm2/m512","VPMAXUB zmm2/m512, zmmV, {k}{z}, = zmm1","vpmaxub zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG DE /r= ","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPMAXUD xmm1, xmmV, xmm2/m128","VPMAXUD xmm2/m128, xmmV, xmm1","vpmaxud x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 3F /r","V","V","AVX","","w,r= ,r","","" +"VPMAXUD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPMAXUD xmm2/m128/m32bcst= , xmmV, {k}{z}, xmm1","vpmaxud xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W0 3F /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,= r,r,r","","" +"VPMAXUD ymm1, ymmV, ymm2/m256","VPMAXUD ymm2/m256, ymmV, ymm1","vpmaxud y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 3F /r","V","V","AVX2","","w,= r,r","","" +"VPMAXUD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPMAXUD ymm2/m256/m32bcst= , ymmV, {k}{z}, ymm1","vpmaxud ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W0 3F /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,= r,r,r","","" +"VPMAXUD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPMAXUD zmm2/m512/m32bcst= , zmmV, {k}{z}, zmm1","vpmaxud zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W0 3F /r","V","V","AVX512F","bscale4,scale64","w,r,r,r",""= ,"" +"VPMAXUQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMAXUQ xmm2/m128/m64bcst= , xmmV, {k}{z}, xmm1","vpmaxuq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W1 3F /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,= r,r,r","","" +"VPMAXUQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMAXUQ ymm2/m256/m64bcst= , ymmV, {k}{z}, ymm1","vpmaxuq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W1 3F /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,= r,r,r","","" +"VPMAXUQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMAXUQ zmm2/m512/m64bcst= , zmmV, {k}{z}, zmm1","vpmaxuq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W1 3F /r","V","V","AVX512F","bscale8,scale64","w,r,r,r",""= ,"" +"VPMAXUW xmm1, xmmV, xmm2/m128","VPMAXUW xmm2/m128, xmmV, xmm1","vpmaxuw x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 3E /r","V","V","AVX","","w,r= ,r","","" +"VPMAXUW xmm1, {k}{z}, xmmV, xmm2/m128","VPMAXUW xmm2/m128, xmmV, {k}{z}, = xmm1","vpmaxuw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.WIG 3E = /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPMAXUW ymm1, ymmV, ymm2/m256","VPMAXUW ymm2/m256, ymmV, ymm1","vpmaxuw y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 3E /r","V","V","AVX2","","w,= r,r","","" +"VPMAXUW ymm1, {k}{z}, ymmV, ymm2/m256","VPMAXUW ymm2/m256, ymmV, {k}{z}, = ymm1","vpmaxuw ymm2/m256, ymmV, 
{k}{z}, ymm1","EVEX.NDS.256.66.0F38.WIG 3E = /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPMAXUW zmm1, {k}{z}, zmmV, zmm2/m512","VPMAXUW zmm2/m512, zmmV, {k}{z}, = zmm1","vpmaxuw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.WIG 3E = /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPMINSB xmm1, xmmV, xmm2/m128","VPMINSB xmm2/m128, xmmV, xmm1","vpminsb x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 38 /r","V","V","AVX","","w,r= ,r","","" +"VPMINSB xmm1, {k}{z}, xmmV, xmm2/m128","VPMINSB xmm2/m128, xmmV, {k}{z}, = xmm1","vpminsb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.WIG 38 = /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPMINSB ymm1, ymmV, ymm2/m256","VPMINSB ymm2/m256, ymmV, ymm1","vpminsb y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 38 /r","V","V","AVX2","","w,= r,r","","" +"VPMINSB ymm1, {k}{z}, ymmV, ymm2/m256","VPMINSB ymm2/m256, ymmV, {k}{z}, = ymm1","vpminsb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.WIG 38 = /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPMINSB zmm1, {k}{z}, zmmV, zmm2/m512","VPMINSB zmm2/m512, zmmV, {k}{z}, = zmm1","vpminsb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.WIG 38 = /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPMINSD xmm1, xmmV, xmm2/m128","VPMINSD xmm2/m128, xmmV, xmm1","vpminsd x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 39 /r","V","V","AVX","","w,r= ,r","","" +"VPMINSD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPMINSD xmm2/m128/m32bcst= , xmmV, {k}{z}, xmm1","vpminsd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W0 39 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,= r,r,r","","" +"VPMINSD ymm1, ymmV, ymm2/m256","VPMINSD ymm2/m256, ymmV, ymm1","vpminsd y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 39 /r","V","V","AVX2","","w,= r,r","","" +"VPMINSD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPMINSD ymm2/m256/m32bcst= , ymmV, {k}{z}, ymm1","vpminsd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W0 39 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,= r,r,r","","" +"VPMINSD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPMINSD zmm2/m512/m32bcst= , zmmV, {k}{z}, zmm1","vpminsd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W0 39 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r",""= ,"" +"VPMINSQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMINSQ xmm2/m128/m64bcst= , xmmV, {k}{z}, xmm1","vpminsq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W1 39 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,= r,r,r","","" +"VPMINSQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMINSQ ymm2/m256/m64bcst= , ymmV, {k}{z}, ymm1","vpminsq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W1 39 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,= r,r,r","","" +"VPMINSQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMINSQ zmm2/m512/m64bcst= , zmmV, {k}{z}, zmm1","vpminsq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W1 39 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r",""= ,"" +"VPMINSW xmm1, xmmV, xmm2/m128","VPMINSW xmm2/m128, xmmV, xmm1","vpminsw x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG EA /r","V","V","AVX","","w,r,r= ","","" +"VPMINSW xmm1, {k}{z}, xmmV, xmm2/m128","VPMINSW xmm2/m128, xmmV, {k}{z}, = xmm1","vpminsw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG EA /r= ","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPMINSW ymm1, ymmV, ymm2/m256","VPMINSW ymm2/m256, ymmV, ymm1","vpminsw y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG EA 
/r","V","V","AVX2","","w,r,= r","","" +"VPMINSW ymm1, {k}{z}, ymmV, ymm2/m256","VPMINSW ymm2/m256, ymmV, {k}{z}, = ymm1","vpminsw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG EA /r= ","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPMINSW zmm1, {k}{z}, zmmV, zmm2/m512","VPMINSW zmm2/m512, zmmV, {k}{z}, = zmm1","vpminsw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG EA /r= ","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPMINUB xmm1, xmmV, xmm2/m128","VPMINUB xmm2/m128, xmmV, xmm1","vpminub x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG DA /r","V","V","AVX","","w,r,r= ","","" +"VPMINUB xmm1, {k}{z}, xmmV, xmm2/m128","VPMINUB xmm2/m128, xmmV, {k}{z}, = xmm1","vpminub xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG DA /r= ","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPMINUB ymm1, ymmV, ymm2/m256","VPMINUB ymm2/m256, ymmV, ymm1","vpminub y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG DA /r","V","V","AVX2","","w,r,= r","","" +"VPMINUB ymm1, {k}{z}, ymmV, ymm2/m256","VPMINUB ymm2/m256, ymmV, {k}{z}, = ymm1","vpminub ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG DA /r= ","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPMINUB zmm1, {k}{z}, zmmV, zmm2/m512","VPMINUB zmm2/m512, zmmV, {k}{z}, = zmm1","vpminub zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG DA /r= ","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPMINUD xmm1, xmmV, xmm2/m128","VPMINUD xmm2/m128, xmmV, xmm1","vpminud x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 3B /r","V","V","AVX","","w,r= ,r","","" +"VPMINUD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPMINUD xmm2/m128/m32bcst= , xmmV, {k}{z}, xmm1","vpminud xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W0 3B /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,= r,r,r","","" +"VPMINUD ymm1, ymmV, ymm2/m256","VPMINUD ymm2/m256, ymmV, ymm1","vpminud y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 3B /r","V","V","AVX2","","w,= r,r","","" +"VPMINUD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPMINUD ymm2/m256/m32bcst= , ymmV, {k}{z}, ymm1","vpminud ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W0 3B /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,= r,r,r","","" +"VPMINUD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPMINUD zmm2/m512/m32bcst= , zmmV, {k}{z}, zmm1","vpminud zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W0 3B /r","V","V","AVX512F","bscale4,scale64","w,r,r,r",""= ,"" +"VPMINUQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMINUQ xmm2/m128/m64bcst= , xmmV, {k}{z}, xmm1","vpminuq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W1 3B /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,= r,r,r","","" +"VPMINUQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMINUQ ymm2/m256/m64bcst= , ymmV, {k}{z}, ymm1","vpminuq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W1 3B /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,= r,r,r","","" +"VPMINUQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMINUQ zmm2/m512/m64bcst= , zmmV, {k}{z}, zmm1","vpminuq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W1 3B /r","V","V","AVX512F","bscale8,scale64","w,r,r,r",""= ,"" +"VPMINUW xmm1, xmmV, xmm2/m128","VPMINUW xmm2/m128, xmmV, xmm1","vpminuw x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 3A /r","V","V","AVX","","w,r= ,r","","" +"VPMINUW xmm1, {k}{z}, xmmV, xmm2/m128","VPMINUW xmm2/m128, xmmV, {k}{z}, = xmm1","vpminuw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.WIG 3A = /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" 
+"VPMINUW ymm1, ymmV, ymm2/m256","VPMINUW ymm2/m256, ymmV, ymm1","vpminuw y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 3A /r","V","V","AVX2","","w,= r,r","","" +"VPMINUW ymm1, {k}{z}, ymmV, ymm2/m256","VPMINUW ymm2/m256, ymmV, {k}{z}, = ymm1","vpminuw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.WIG 3A = /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPMINUW zmm1, {k}{z}, zmmV, zmm2/m512","VPMINUW zmm2/m512, zmmV, {k}{z}, = zmm1","vpminuw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.WIG 3A = /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPMOVB2M k1, xmm2","VPMOVB2M xmm2, k1","vpmovb2m xmm2, k1","EVEX.128.F3.0= F38.W0 29 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","","" +"VPMOVB2M k1, ymm2","VPMOVB2M ymm2, k1","vpmovb2m ymm2, k1","EVEX.256.F3.0= F38.W0 29 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","","" +"VPMOVB2M k1, zmm2","VPMOVB2M zmm2, k1","vpmovb2m zmm2, k1","EVEX.512.F3.0= F38.W0 29 /r","V","V","AVX512BW","modrm_regonly","w,r","","" +"VPMOVD2M k1, xmm2","VPMOVD2M xmm2, k1","vpmovd2m xmm2, k1","EVEX.128.F3.0= F38.W0 39 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","","" +"VPMOVD2M k1, ymm2","VPMOVD2M ymm2, k1","vpmovd2m ymm2, k1","EVEX.256.F3.0= F38.W0 39 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","","" +"VPMOVD2M k1, zmm2","VPMOVD2M zmm2, k1","vpmovd2m zmm2, k1","EVEX.512.F3.0= F38.W0 39 /r","V","V","AVX512DQ","modrm_regonly","w,r","","" +"VPMOVDB xmm2/m32, {k}{z}, xmm1","VPMOVDB xmm1, {k}{z}, xmm2/m32","vpmovdb= xmm1, {k}{z}, xmm2/m32","EVEX.128.F3.0F38.W0 31 /r","V","V","AVX512F+AVX51= 2VL","scale4","w,r,r","","" +"VPMOVDB xmm2/m64, {k}{z}, ymm1","VPMOVDB ymm1, {k}{z}, xmm2/m64","vpmovdb= ymm1, {k}{z}, xmm2/m64","EVEX.256.F3.0F38.W0 31 /r","V","V","AVX512F+AVX51= 2VL","scale8","w,r,r","","" +"VPMOVDB xmm2/m128, {k}{z}, zmm1","VPMOVDB zmm1, {k}{z}, xmm2/m128","vpmov= db zmm1, {k}{z}, xmm2/m128","EVEX.512.F3.0F38.W0 31 /r","V","V","AVX512F","= scale16","w,r,r","","" +"VPMOVDW xmm2/m64, {k}{z}, xmm1","VPMOVDW xmm1, {k}{z}, xmm2/m64","vpmovdw= xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 33 /r","V","V","AVX512F+AVX51= 2VL","scale8","w,r,r","","" +"VPMOVDW xmm2/m128, {k}{z}, ymm1","VPMOVDW ymm1, {k}{z}, xmm2/m128","vpmov= dw ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 33 /r","V","V","AVX512F+AV= X512VL","scale16","w,r,r","","" +"VPMOVDW ymm2/m256, {k}{z}, zmm1","VPMOVDW zmm1, {k}{z}, ymm2/m256","vpmov= dw zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 33 /r","V","V","AVX512F","= scale32","w,r,r","","" +"VPMOVM2B xmm1, k2","VPMOVM2B k2, xmm1","vpmovm2b k2, xmm1","EVEX.128.F3.0= F38.W0 28 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","","" +"VPMOVM2B ymm1, k2","VPMOVM2B k2, ymm1","vpmovm2b k2, ymm1","EVEX.256.F3.0= F38.W0 28 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","","" +"VPMOVM2B zmm1, k2","VPMOVM2B k2, zmm1","vpmovm2b k2, zmm1","EVEX.512.F3.0= F38.W0 28 /r","V","V","AVX512BW","modrm_regonly","w,r","","" +"VPMOVM2D xmm1, k2","VPMOVM2D k2, xmm1","vpmovm2d k2, xmm1","EVEX.128.F3.0= F38.W0 38 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","","" +"VPMOVM2D ymm1, k2","VPMOVM2D k2, ymm1","vpmovm2d k2, ymm1","EVEX.256.F3.0= F38.W0 38 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","","" +"VPMOVM2D zmm1, k2","VPMOVM2D k2, zmm1","vpmovm2d k2, zmm1","EVEX.512.F3.0= F38.W0 38 /r","V","V","AVX512DQ","modrm_regonly","w,r","","" +"VPMOVM2Q xmm1, k2","VPMOVM2Q k2, xmm1","vpmovm2q k2, xmm1","EVEX.128.F3.0= F38.W1 38 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","","" +"VPMOVM2Q ymm1, 
k2","VPMOVM2Q k2, ymm1","vpmovm2q k2, ymm1","EVEX.256.F3.0= F38.W1 38 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","","" +"VPMOVM2Q zmm1, k2","VPMOVM2Q k2, zmm1","vpmovm2q k2, zmm1","EVEX.512.F3.0= F38.W1 38 /r","V","V","AVX512DQ","modrm_regonly","w,r","","" +"VPMOVM2W xmm1, k2","VPMOVM2W k2, xmm1","vpmovm2w k2, xmm1","EVEX.128.F3.0= F38.W1 28 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","","" +"VPMOVM2W ymm1, k2","VPMOVM2W k2, ymm1","vpmovm2w k2, ymm1","EVEX.256.F3.0= F38.W1 28 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","","" +"VPMOVM2W zmm1, k2","VPMOVM2W k2, zmm1","vpmovm2w k2, zmm1","EVEX.512.F3.0= F38.W1 28 /r","V","V","AVX512BW","modrm_regonly","w,r","","" +"VPMOVMSKB r32, xmm2","VPMOVMSKB xmm2, r32","vpmovmskb xmm2, r32","VEX.128= .66.0F.WIG D7 /r","V","V","AVX","modrm_regonly","w,r","","" +"VPMOVMSKB r32, ymm2","VPMOVMSKB ymm2, r32","vpmovmskb ymm2, r32","VEX.256= .66.0F.WIG D7 /r","V","V","AVX2","modrm_regonly","w,r","","" +"VPMOVQ2M k1, xmm2","VPMOVQ2M xmm2, k1","vpmovq2m xmm2, k1","EVEX.128.F3.0= F38.W1 39 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","","" +"VPMOVQ2M k1, ymm2","VPMOVQ2M ymm2, k1","vpmovq2m ymm2, k1","EVEX.256.F3.0= F38.W1 39 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","","" +"VPMOVQ2M k1, zmm2","VPMOVQ2M zmm2, k1","vpmovq2m zmm2, k1","EVEX.512.F3.0= F38.W1 39 /r","V","V","AVX512DQ","modrm_regonly","w,r","","" +"VPMOVQB xmm2/m16, {k}{z}, xmm1","VPMOVQB xmm1, {k}{z}, xmm2/m16","vpmovqb= xmm1, {k}{z}, xmm2/m16","EVEX.128.F3.0F38.W0 32 /r","V","V","AVX512F+AVX51= 2VL","scale2","w,r,r","","" +"VPMOVQB xmm2/m32, {k}{z}, ymm1","VPMOVQB ymm1, {k}{z}, xmm2/m32","vpmovqb= ymm1, {k}{z}, xmm2/m32","EVEX.256.F3.0F38.W0 32 /r","V","V","AVX512F+AVX51= 2VL","scale4","w,r,r","","" +"VPMOVQB xmm2/m64, {k}{z}, zmm1","VPMOVQB zmm1, {k}{z}, xmm2/m64","vpmovqb= zmm1, {k}{z}, xmm2/m64","EVEX.512.F3.0F38.W0 32 /r","V","V","AVX512F","sca= le8","w,r,r","","" +"VPMOVQD xmm2/m64, {k}{z}, xmm1","VPMOVQD xmm1, {k}{z}, xmm2/m64","vpmovqd= xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 35 /r","V","V","AVX512F+AVX51= 2VL","scale8","w,r,r","","" +"VPMOVQD xmm2/m128, {k}{z}, ymm1","VPMOVQD ymm1, {k}{z}, xmm2/m128","vpmov= qd ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 35 /r","V","V","AVX512F+AV= X512VL","scale16","w,r,r","","" +"VPMOVQD ymm2/m256, {k}{z}, zmm1","VPMOVQD zmm1, {k}{z}, ymm2/m256","vpmov= qd zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 35 /r","V","V","AVX512F","= scale32","w,r,r","","" +"VPMOVQW xmm2/m32, {k}{z}, xmm1","VPMOVQW xmm1, {k}{z}, xmm2/m32","vpmovqw= xmm1, {k}{z}, xmm2/m32","EVEX.128.F3.0F38.W0 34 /r","V","V","AVX512F+AVX51= 2VL","scale4","w,r,r","","" +"VPMOVQW xmm2/m64, {k}{z}, ymm1","VPMOVQW ymm1, {k}{z}, xmm2/m64","vpmovqw= ymm1, {k}{z}, xmm2/m64","EVEX.256.F3.0F38.W0 34 /r","V","V","AVX512F+AVX51= 2VL","scale8","w,r,r","","" +"VPMOVQW xmm2/m128, {k}{z}, zmm1","VPMOVQW zmm1, {k}{z}, xmm2/m128","vpmov= qw zmm1, {k}{z}, xmm2/m128","EVEX.512.F3.0F38.W0 34 /r","V","V","AVX512F","= scale16","w,r,r","","" +"VPMOVSDB xmm2/m32, {k}{z}, xmm1","VPMOVSDB xmm1, {k}{z}, xmm2/m32","vpmov= sdb xmm1, {k}{z}, xmm2/m32","EVEX.128.F3.0F38.W0 21 /r","V","V","AVX512F+AV= X512VL","scale4","w,r,r","","" +"VPMOVSDB xmm2/m64, {k}{z}, ymm1","VPMOVSDB ymm1, {k}{z}, xmm2/m64","vpmov= sdb ymm1, {k}{z}, xmm2/m64","EVEX.256.F3.0F38.W0 21 /r","V","V","AVX512F+AV= X512VL","scale8","w,r,r","","" +"VPMOVSDB xmm2/m128, {k}{z}, zmm1","VPMOVSDB zmm1, {k}{z}, xmm2/m128","vpm= ovsdb zmm1, {k}{z}, xmm2/m128","EVEX.512.F3.0F38.W0 21 /r","V","V","AVX512F= 
","scale16","w,r,r","","" +"VPMOVSDW xmm2/m64, {k}{z}, xmm1","VPMOVSDW xmm1, {k}{z}, xmm2/m64","vpmov= sdw xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 23 /r","V","V","AVX512F+AV= X512VL","scale8","w,r,r","","" +"VPMOVSDW xmm2/m128, {k}{z}, ymm1","VPMOVSDW ymm1, {k}{z}, xmm2/m128","vpm= ovsdw ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 23 /r","V","V","AVX512F= +AVX512VL","scale16","w,r,r","","" +"VPMOVSDW ymm2/m256, {k}{z}, zmm1","VPMOVSDW zmm1, {k}{z}, ymm2/m256","vpm= ovsdw zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 23 /r","V","V","AVX512F= ","scale32","w,r,r","","" +"VPMOVSQB xmm2/m16, {k}{z}, xmm1","VPMOVSQB xmm1, {k}{z}, xmm2/m16","vpmov= sqb xmm1, {k}{z}, xmm2/m16","EVEX.128.F3.0F38.W0 22 /r","V","V","AVX512F+AV= X512VL","scale2","w,r,r","","" +"VPMOVSQB xmm2/m32, {k}{z}, ymm1","VPMOVSQB ymm1, {k}{z}, xmm2/m32","vpmov= sqb ymm1, {k}{z}, xmm2/m32","EVEX.256.F3.0F38.W0 22 /r","V","V","AVX512F+AV= X512VL","scale4","w,r,r","","" +"VPMOVSQB xmm2/m64, {k}{z}, zmm1","VPMOVSQB zmm1, {k}{z}, xmm2/m64","vpmov= sqb zmm1, {k}{z}, xmm2/m64","EVEX.512.F3.0F38.W0 22 /r","V","V","AVX512F","= scale8","w,r,r","","" +"VPMOVSQD xmm2/m64, {k}{z}, xmm1","VPMOVSQD xmm1, {k}{z}, xmm2/m64","vpmov= sqd xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 25 /r","V","V","AVX512F+AV= X512VL","scale8","w,r,r","","" +"VPMOVSQD xmm2/m128, {k}{z}, ymm1","VPMOVSQD ymm1, {k}{z}, xmm2/m128","vpm= ovsqd ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 25 /r","V","V","AVX512F= +AVX512VL","scale16","w,r,r","","" +"VPMOVSQD ymm2/m256, {k}{z}, zmm1","VPMOVSQD zmm1, {k}{z}, ymm2/m256","vpm= ovsqd zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 25 /r","V","V","AVX512F= ","scale32","w,r,r","","" +"VPMOVSQW xmm2/m32, {k}{z}, xmm1","VPMOVSQW xmm1, {k}{z}, xmm2/m32","vpmov= sqw xmm1, {k}{z}, xmm2/m32","EVEX.128.F3.0F38.W0 24 /r","V","V","AVX512F+AV= X512VL","scale4","w,r,r","","" +"VPMOVSQW xmm2/m64, {k}{z}, ymm1","VPMOVSQW ymm1, {k}{z}, xmm2/m64","vpmov= sqw ymm1, {k}{z}, xmm2/m64","EVEX.256.F3.0F38.W0 24 /r","V","V","AVX512F+AV= X512VL","scale8","w,r,r","","" +"VPMOVSQW xmm2/m128, {k}{z}, zmm1","VPMOVSQW zmm1, {k}{z}, xmm2/m128","vpm= ovsqw zmm1, {k}{z}, xmm2/m128","EVEX.512.F3.0F38.W0 24 /r","V","V","AVX512F= ","scale16","w,r,r","","" +"VPMOVSWB xmm2/m64, {k}{z}, xmm1","VPMOVSWB xmm1, {k}{z}, xmm2/m64","vpmov= swb xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 20 /r","V","V","AVX512BW+A= VX512VL","scale8","w,r,r","","" +"VPMOVSWB xmm2/m128, {k}{z}, ymm1","VPMOVSWB ymm1, {k}{z}, xmm2/m128","vpm= ovswb ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 20 /r","V","V","AVX512B= W+AVX512VL","scale16","w,r,r","","" +"VPMOVSWB ymm2/m256, {k}{z}, zmm1","VPMOVSWB zmm1, {k}{z}, ymm2/m256","vpm= ovswb zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 20 /r","V","V","AVX512B= W","scale32","w,r,r","","" +"VPMOVSXBD zmm1, {k}{z}, xmm2/m128","VPMOVSXBD xmm2/m128, {k}{z}, zmm1","v= pmovsxbd xmm2/m128, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 21 /r","V","V","AVX= 512F","scale16","w,r,r","","" +"VPMOVSXBD xmm1, xmm2/m32","VPMOVSXBD xmm2/m32, xmm1","vpmovsxbd xmm2/m32,= xmm1","VEX.128.66.0F38.WIG 21 /r","V","V","AVX","","w,r","","" +"VPMOVSXBD xmm1, {k}{z}, xmm2/m32","VPMOVSXBD xmm2/m32, {k}{z}, xmm1","vpm= ovsxbd xmm2/m32, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 21 /r","V","V","AVX512= F+AVX512VL","scale4","w,r,r","","" +"VPMOVSXBD ymm1, xmm2/m64","VPMOVSXBD xmm2/m64, ymm1","vpmovsxbd xmm2/m64,= ymm1","VEX.256.66.0F38.WIG 21 /r","V","V","AVX2","","w,r","","" +"VPMOVSXBD ymm1, {k}{z}, xmm2/m64","VPMOVSXBD xmm2/m64, {k}{z}, ymm1","vpm= ovsxbd xmm2/m64, {k}{z}, 
ymm1","EVEX.256.66.0F38.WIG 21 /r","V","V","AVX512= F+AVX512VL","scale8","w,r,r","","" +"VPMOVSXBQ xmm1, xmm2/m16","VPMOVSXBQ xmm2/m16, xmm1","vpmovsxbq xmm2/m16,= xmm1","VEX.128.66.0F38.WIG 22 /r","V","V","AVX","","w,r","","" +"VPMOVSXBQ xmm1, {k}{z}, xmm2/m16","VPMOVSXBQ xmm2/m16, {k}{z}, xmm1","vpm= ovsxbq xmm2/m16, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 22 /r","V","V","AVX512= F+AVX512VL","scale2","w,r,r","","" +"VPMOVSXBQ ymm1, xmm2/m32","VPMOVSXBQ xmm2/m32, ymm1","vpmovsxbq xmm2/m32,= ymm1","VEX.256.66.0F38.WIG 22 /r","V","V","AVX2","","w,r","","" +"VPMOVSXBQ ymm1, {k}{z}, xmm2/m32","VPMOVSXBQ xmm2/m32, {k}{z}, ymm1","vpm= ovsxbq xmm2/m32, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 22 /r","V","V","AVX512= F+AVX512VL","scale4","w,r,r","","" +"VPMOVSXBQ zmm1, {k}{z}, xmm2/m64","VPMOVSXBQ xmm2/m64, {k}{z}, zmm1","vpm= ovsxbq xmm2/m64, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 22 /r","V","V","AVX512= F","scale8","w,r,r","","" +"VPMOVSXBW ymm1, xmm2/m128","VPMOVSXBW xmm2/m128, ymm1","vpmovsxbw xmm2/m1= 28, ymm1","VEX.256.66.0F38.WIG 20 /r","V","V","AVX2","","w,r","","" +"VPMOVSXBW ymm1, {k}{z}, xmm2/m128","VPMOVSXBW xmm2/m128, {k}{z}, ymm1","v= pmovsxbw xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 20 /r","V","V","AVX= 512BW+AVX512VL","scale16","w,r,r","","" +"VPMOVSXBW xmm1, xmm2/m64","VPMOVSXBW xmm2/m64, xmm1","vpmovsxbw xmm2/m64,= xmm1","VEX.128.66.0F38.WIG 20 /r","V","V","AVX","","w,r","","" +"VPMOVSXBW xmm1, {k}{z}, xmm2/m64","VPMOVSXBW xmm2/m64, {k}{z}, xmm1","vpm= ovsxbw xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 20 /r","V","V","AVX512= BW+AVX512VL","scale8","w,r,r","","" +"VPMOVSXBW zmm1, {k}{z}, ymm2/m256","VPMOVSXBW ymm2/m256, {k}{z}, zmm1","v= pmovsxbw ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 20 /r","V","V","AVX= 512BW","scale32","w,r,r","","" +"VPMOVSXDQ ymm1, xmm2/m128","VPMOVSXDQ xmm2/m128, ymm1","vpmovsxdq xmm2/m1= 28, ymm1","VEX.256.66.0F38.WIG 25 /r","V","V","AVX2","","w,r","","" +"VPMOVSXDQ ymm1, {k}{z}, xmm2/m128","VPMOVSXDQ xmm2/m128, {k}{z}, ymm1","v= pmovsxdq xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.W0 25 /r","V","V","AVX5= 12F+AVX512VL","scale16","w,r,r","","" +"VPMOVSXDQ xmm1, xmm2/m64","VPMOVSXDQ xmm2/m64, xmm1","vpmovsxdq xmm2/m64,= xmm1","VEX.128.66.0F38.WIG 25 /r","V","V","AVX","","w,r","","" +"VPMOVSXDQ xmm1, {k}{z}, xmm2/m64","VPMOVSXDQ xmm2/m64, {k}{z}, xmm1","vpm= ovsxdq xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.W0 25 /r","V","V","AVX512F= +AVX512VL","scale8","w,r,r","","" +"VPMOVSXDQ zmm1, {k}{z}, ymm2/m256","VPMOVSXDQ ymm2/m256, {k}{z}, zmm1","v= pmovsxdq ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.W0 25 /r","V","V","AVX5= 12F","scale32","w,r,r","","" +"VPMOVSXWD ymm1, xmm2/m128","VPMOVSXWD xmm2/m128, ymm1","vpmovsxwd xmm2/m1= 28, ymm1","VEX.256.66.0F38.WIG 23 /r","V","V","AVX2","","w,r","","" +"VPMOVSXWD ymm1, {k}{z}, xmm2/m128","VPMOVSXWD xmm2/m128, {k}{z}, ymm1","v= pmovsxwd xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 23 /r","V","V","AVX= 512F+AVX512VL","scale16","w,r,r","","" +"VPMOVSXWD xmm1, xmm2/m64","VPMOVSXWD xmm2/m64, xmm1","vpmovsxwd xmm2/m64,= xmm1","VEX.128.66.0F38.WIG 23 /r","V","V","AVX","","w,r","","" +"VPMOVSXWD xmm1, {k}{z}, xmm2/m64","VPMOVSXWD xmm2/m64, {k}{z}, xmm1","vpm= ovsxwd xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 23 /r","V","V","AVX512= F+AVX512VL","scale8","w,r,r","","" +"VPMOVSXWD zmm1, {k}{z}, ymm2/m256","VPMOVSXWD ymm2/m256, {k}{z}, zmm1","v= pmovsxwd ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 23 /r","V","V","AVX= 512F","scale32","w,r,r","","" +"VPMOVSXWQ zmm1, {k}{z}, xmm2/m128","VPMOVSXWQ xmm2/m128, {k}{z}, 
zmm1","v= pmovsxwq xmm2/m128, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 24 /r","V","V","AVX= 512F","scale16","w,r,r","","" +"VPMOVSXWQ xmm1, xmm2/m32","VPMOVSXWQ xmm2/m32, xmm1","vpmovsxwq xmm2/m32,= xmm1","VEX.128.66.0F38.WIG 24 /r","V","V","AVX","","w,r","","" +"VPMOVSXWQ xmm1, {k}{z}, xmm2/m32","VPMOVSXWQ xmm2/m32, {k}{z}, xmm1","vpm= ovsxwq xmm2/m32, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 24 /r","V","V","AVX512= F+AVX512VL","scale4","w,r,r","","" +"VPMOVSXWQ ymm1, xmm2/m64","VPMOVSXWQ xmm2/m64, ymm1","vpmovsxwq xmm2/m64,= ymm1","VEX.256.66.0F38.WIG 24 /r","V","V","AVX2","","w,r","","" +"VPMOVSXWQ ymm1, {k}{z}, xmm2/m64","VPMOVSXWQ xmm2/m64, {k}{z}, ymm1","vpm= ovsxwq xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 24 /r","V","V","AVX512= F+AVX512VL","scale8","w,r,r","","" +"VPMOVUSDB xmm2/m32, {k}{z}, xmm1","VPMOVUSDB xmm1, {k}{z}, xmm2/m32","vpm= ovusdb xmm1, {k}{z}, xmm2/m32","EVEX.128.F3.0F38.W0 11 /r","V","V","AVX512F= +AVX512VL","scale4","w,r,r","","" +"VPMOVUSDB xmm2/m64, {k}{z}, ymm1","VPMOVUSDB ymm1, {k}{z}, xmm2/m64","vpm= ovusdb ymm1, {k}{z}, xmm2/m64","EVEX.256.F3.0F38.W0 11 /r","V","V","AVX512F= +AVX512VL","scale8","w,r,r","","" +"VPMOVUSDB xmm2/m128, {k}{z}, zmm1","VPMOVUSDB zmm1, {k}{z}, xmm2/m128","v= pmovusdb zmm1, {k}{z}, xmm2/m128","EVEX.512.F3.0F38.W0 11 /r","V","V","AVX5= 12F","scale16","w,r,r","","" +"VPMOVUSDW xmm2/m64, {k}{z}, xmm1","VPMOVUSDW xmm1, {k}{z}, xmm2/m64","vpm= ovusdw xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 13 /r","V","V","AVX512F= +AVX512VL","scale8","w,r,r","","" +"VPMOVUSDW xmm2/m128, {k}{z}, ymm1","VPMOVUSDW ymm1, {k}{z}, xmm2/m128","v= pmovusdw ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 13 /r","V","V","AVX5= 12F+AVX512VL","scale16","w,r,r","","" +"VPMOVUSDW ymm2/m256, {k}{z}, zmm1","VPMOVUSDW zmm1, {k}{z}, ymm2/m256","v= pmovusdw zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 13 /r","V","V","AVX5= 12F","scale32","w,r,r","","" +"VPMOVUSQB xmm2/m16, {k}{z}, xmm1","VPMOVUSQB xmm1, {k}{z}, xmm2/m16","vpm= ovusqb xmm1, {k}{z}, xmm2/m16","EVEX.128.F3.0F38.W0 12 /r","V","V","AVX512F= +AVX512VL","scale2","w,r,r","","" +"VPMOVUSQB xmm2/m32, {k}{z}, ymm1","VPMOVUSQB ymm1, {k}{z}, xmm2/m32","vpm= ovusqb ymm1, {k}{z}, xmm2/m32","EVEX.256.F3.0F38.W0 12 /r","V","V","AVX512F= +AVX512VL","scale4","w,r,r","","" +"VPMOVUSQB xmm2/m64, {k}{z}, zmm1","VPMOVUSQB zmm1, {k}{z}, xmm2/m64","vpm= ovusqb zmm1, {k}{z}, xmm2/m64","EVEX.512.F3.0F38.W0 12 /r","V","V","AVX512F= ","scale8","w,r,r","","" +"VPMOVUSQD xmm2/m64, {k}{z}, xmm1","VPMOVUSQD xmm1, {k}{z}, xmm2/m64","vpm= ovusqd xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 15 /r","V","V","AVX512F= +AVX512VL","scale8","w,r,r","","" +"VPMOVUSQD xmm2/m128, {k}{z}, ymm1","VPMOVUSQD ymm1, {k}{z}, xmm2/m128","v= pmovusqd ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 15 /r","V","V","AVX5= 12F+AVX512VL","scale16","w,r,r","","" +"VPMOVUSQD ymm2/m256, {k}{z}, zmm1","VPMOVUSQD zmm1, {k}{z}, ymm2/m256","v= pmovusqd zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 15 /r","V","V","AVX5= 12F","scale32","w,r,r","","" +"VPMOVUSQW xmm2/m32, {k}{z}, xmm1","VPMOVUSQW xmm1, {k}{z}, xmm2/m32","vpm= ovusqw xmm1, {k}{z}, xmm2/m32","EVEX.128.F3.0F38.W0 14 /r","V","V","AVX512F= +AVX512VL","scale4","w,r,r","","" +"VPMOVUSQW xmm2/m64, {k}{z}, ymm1","VPMOVUSQW ymm1, {k}{z}, xmm2/m64","vpm= ovusqw ymm1, {k}{z}, xmm2/m64","EVEX.256.F3.0F38.W0 14 /r","V","V","AVX512F= +AVX512VL","scale8","w,r,r","","" +"VPMOVUSQW xmm2/m128, {k}{z}, zmm1","VPMOVUSQW zmm1, {k}{z}, xmm2/m128","v= pmovusqw zmm1, {k}{z}, xmm2/m128","EVEX.512.F3.0F38.W0 14 /r","V","V","AVX5= 
12F","scale16","w,r,r","","" +"VPMOVUSWB xmm2/m64, {k}{z}, xmm1","VPMOVUSWB xmm1, {k}{z}, xmm2/m64","vpm= ovuswb xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 10 /r","V","V","AVX512B= W+AVX512VL","scale8","w,r,r","","" +"VPMOVUSWB xmm2/m128, {k}{z}, ymm1","VPMOVUSWB ymm1, {k}{z}, xmm2/m128","v= pmovuswb ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 10 /r","V","V","AVX5= 12BW+AVX512VL","scale16","w,r,r","","" +"VPMOVUSWB ymm2/m256, {k}{z}, zmm1","VPMOVUSWB zmm1, {k}{z}, ymm2/m256","v= pmovuswb zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 10 /r","V","V","AVX5= 12BW","scale32","w,r,r","","" +"VPMOVW2M k1, xmm2","VPMOVW2M xmm2, k1","vpmovw2m xmm2, k1","EVEX.128.F3.0= F38.W1 29 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","","" +"VPMOVW2M k1, ymm2","VPMOVW2M ymm2, k1","vpmovw2m ymm2, k1","EVEX.256.F3.0= F38.W1 29 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","","" +"VPMOVW2M k1, zmm2","VPMOVW2M zmm2, k1","vpmovw2m zmm2, k1","EVEX.512.F3.0= F38.W1 29 /r","V","V","AVX512BW","modrm_regonly","w,r","","" +"VPMOVWB xmm2/m64, {k}{z}, xmm1","VPMOVWB xmm1, {k}{z}, xmm2/m64","vpmovwb= xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 30 /r","V","V","AVX512BW+AVX5= 12VL","scale8","w,r,r","","" +"VPMOVWB xmm2/m128, {k}{z}, ymm1","VPMOVWB ymm1, {k}{z}, xmm2/m128","vpmov= wb ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 30 /r","V","V","AVX512BW+A= VX512VL","scale16","w,r,r","","" +"VPMOVWB ymm2/m256, {k}{z}, zmm1","VPMOVWB zmm1, {k}{z}, ymm2/m256","vpmov= wb zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 30 /r","V","V","AVX512BW",= "scale32","w,r,r","","" +"VPMOVZXBD zmm1, {k}{z}, xmm2/m128","VPMOVZXBD xmm2/m128, {k}{z}, zmm1","v= pmovzxbd xmm2/m128, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 31 /r","V","V","AVX= 512F","scale16","w,r,r","","" +"VPMOVZXBD xmm1, xmm2/m32","VPMOVZXBD xmm2/m32, xmm1","vpmovzxbd xmm2/m32,= xmm1","VEX.128.66.0F38.WIG 31 /r","V","V","AVX","","w,r","","" +"VPMOVZXBD xmm1, {k}{z}, xmm2/m32","VPMOVZXBD xmm2/m32, {k}{z}, xmm1","vpm= ovzxbd xmm2/m32, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 31 /r","V","V","AVX512= F+AVX512VL","scale4","w,r,r","","" +"VPMOVZXBD ymm1, xmm2/m64","VPMOVZXBD xmm2/m64, ymm1","vpmovzxbd xmm2/m64,= ymm1","VEX.256.66.0F38.WIG 31 /r","V","V","AVX2","","w,r","","" +"VPMOVZXBD ymm1, {k}{z}, xmm2/m64","VPMOVZXBD xmm2/m64, {k}{z}, ymm1","vpm= ovzxbd xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 31 /r","V","V","AVX512= F+AVX512VL","scale8","w,r,r","","" +"VPMOVZXBQ xmm1, xmm2/m16","VPMOVZXBQ xmm2/m16, xmm1","vpmovzxbq xmm2/m16,= xmm1","VEX.128.66.0F38.WIG 32 /r","V","V","AVX","","w,r","","" +"VPMOVZXBQ xmm1, {k}{z}, xmm2/m16","VPMOVZXBQ xmm2/m16, {k}{z}, xmm1","vpm= ovzxbq xmm2/m16, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 32 /r","V","V","AVX512= F+AVX512VL","scale2","w,r,r","","" +"VPMOVZXBQ ymm1, xmm2/m32","VPMOVZXBQ xmm2/m32, ymm1","vpmovzxbq xmm2/m32,= ymm1","VEX.256.66.0F38.WIG 32 /r","V","V","AVX2","","w,r","","" +"VPMOVZXBQ ymm1, {k}{z}, xmm2/m32","VPMOVZXBQ xmm2/m32, {k}{z}, ymm1","vpm= ovzxbq xmm2/m32, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 32 /r","V","V","AVX512= F+AVX512VL","scale4","w,r,r","","" +"VPMOVZXBQ zmm1, {k}{z}, xmm2/m64","VPMOVZXBQ xmm2/m64, {k}{z}, zmm1","vpm= ovzxbq xmm2/m64, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 32 /r","V","V","AVX512= F","scale8","w,r,r","","" +"VPMOVZXBW ymm1, xmm2/m128","VPMOVZXBW xmm2/m128, ymm1","vpmovzxbw xmm2/m1= 28, ymm1","VEX.256.66.0F38.WIG 30 /r","V","V","AVX2","","w,r","","" +"VPMOVZXBW ymm1, {k}{z}, xmm2/m128","VPMOVZXBW xmm2/m128, {k}{z}, ymm1","v= pmovzxbw xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 30 /r","V","V","AVX= 
512BW+AVX512VL","scale16","w,r,r","","" +"VPMOVZXBW xmm1, xmm2/m64","VPMOVZXBW xmm2/m64, xmm1","vpmovzxbw xmm2/m64,= xmm1","VEX.128.66.0F38.WIG 30 /r","V","V","AVX","","w,r","","" +"VPMOVZXBW xmm1, {k}{z}, xmm2/m64","VPMOVZXBW xmm2/m64, {k}{z}, xmm1","vpm= ovzxbw xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 30 /r","V","V","AVX512= BW+AVX512VL","scale8","w,r,r","","" +"VPMOVZXBW zmm1, {k}{z}, ymm2/m256","VPMOVZXBW ymm2/m256, {k}{z}, zmm1","v= pmovzxbw ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 30 /r","V","V","AVX= 512BW","scale32","w,r,r","","" +"VPMOVZXDQ ymm1, xmm2/m128","VPMOVZXDQ xmm2/m128, ymm1","vpmovzxdq xmm2/m1= 28, ymm1","VEX.256.66.0F38.WIG 35 /r","V","V","AVX2","","w,r","","" +"VPMOVZXDQ ymm1, {k}{z}, xmm2/m128","VPMOVZXDQ xmm2/m128, {k}{z}, ymm1","v= pmovzxdq xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.W0 35 /r","V","V","AVX5= 12F+AVX512VL","scale16","w,r,r","","" +"VPMOVZXDQ xmm1, xmm2/m64","VPMOVZXDQ xmm2/m64, xmm1","vpmovzxdq xmm2/m64,= xmm1","VEX.128.66.0F38.WIG 35 /r","V","V","AVX","","w,r","","" +"VPMOVZXDQ xmm1, {k}{z}, xmm2/m64","VPMOVZXDQ xmm2/m64, {k}{z}, xmm1","vpm= ovzxdq xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.W0 35 /r","V","V","AVX512F= +AVX512VL","scale8","w,r,r","","" +"VPMOVZXDQ zmm1, {k}{z}, ymm2/m256","VPMOVZXDQ ymm2/m256, {k}{z}, zmm1","v= pmovzxdq ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.W0 35 /r","V","V","AVX5= 12F","scale32","w,r,r","","" +"VPMOVZXWD ymm1, xmm2/m128","VPMOVZXWD xmm2/m128, ymm1","vpmovzxwd xmm2/m1= 28, ymm1","VEX.256.66.0F38.WIG 33 /r","V","V","AVX2","","w,r","","" +"VPMOVZXWD ymm1, {k}{z}, xmm2/m128","VPMOVZXWD xmm2/m128, {k}{z}, ymm1","v= pmovzxwd xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 33 /r","V","V","AVX= 512F+AVX512VL","scale16","w,r,r","","" +"VPMOVZXWD xmm1, xmm2/m64","VPMOVZXWD xmm2/m64, xmm1","vpmovzxwd xmm2/m64,= xmm1","VEX.128.66.0F38.WIG 33 /r","V","V","AVX","","w,r","","" +"VPMOVZXWD xmm1, {k}{z}, xmm2/m64","VPMOVZXWD xmm2/m64, {k}{z}, xmm1","vpm= ovzxwd xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 33 /r","V","V","AVX512= F+AVX512VL","scale8","w,r,r","","" +"VPMOVZXWD zmm1, {k}{z}, ymm2/m256","VPMOVZXWD ymm2/m256, {k}{z}, zmm1","v= pmovzxwd ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 33 /r","V","V","AVX= 512F","scale32","w,r,r","","" +"VPMOVZXWQ zmm1, {k}{z}, xmm2/m128","VPMOVZXWQ xmm2/m128, {k}{z}, zmm1","v= pmovzxwq xmm2/m128, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 34 /r","V","V","AVX= 512F","scale16","w,r,r","","" +"VPMOVZXWQ xmm1, xmm2/m32","VPMOVZXWQ xmm2/m32, xmm1","vpmovzxwq xmm2/m32,= xmm1","VEX.128.66.0F38.WIG 34 /r","V","V","AVX","","w,r","","" +"VPMOVZXWQ xmm1, {k}{z}, xmm2/m32","VPMOVZXWQ xmm2/m32, {k}{z}, xmm1","vpm= ovzxwq xmm2/m32, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 34 /r","V","V","AVX512= F+AVX512VL","scale4","w,r,r","","" +"VPMOVZXWQ ymm1, xmm2/m64","VPMOVZXWQ xmm2/m64, ymm1","vpmovzxwq xmm2/m64,= ymm1","VEX.256.66.0F38.WIG 34 /r","V","V","AVX2","","w,r","","" +"VPMOVZXWQ ymm1, {k}{z}, xmm2/m64","VPMOVZXWQ xmm2/m64, {k}{z}, ymm1","vpm= ovzxwq xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 34 /r","V","V","AVX512= F+AVX512VL","scale8","w,r,r","","" +"VPMULDQ xmm1, xmmV, xmm2/m128","VPMULDQ xmm2/m128, xmmV, xmm1","vpmuldq x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 28 /r","V","V","AVX","","w,r= ,r","","" +"VPMULDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMULDQ xmm2/m128/m64bcst= , xmmV, {k}{z}, xmm1","vpmuldq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W1 28 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,= r,r,r","","" +"VPMULDQ ymm1, ymmV, ymm2/m256","VPMULDQ 
ymm2/m256, ymmV, ymm1","vpmuldq y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 28 /r","V","V","AVX2","","w,= r,r","","" +"VPMULDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMULDQ ymm2/m256/m64bcst= , ymmV, {k}{z}, ymm1","vpmuldq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W1 28 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,= r,r,r","","" +"VPMULDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMULDQ zmm2/m512/m64bcst= , zmmV, {k}{z}, zmm1","vpmuldq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W1 28 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r",""= ,"" +"VPMULHRSW xmm1, xmmV, xmm2/m128","VPMULHRSW xmm2/m128, xmmV, xmm1","vpmul= hrsw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 0B /r","V","V","AVX","= ","w,r,r","","" +"VPMULHRSW xmm1, {k}{z}, xmmV, xmm2/m128","VPMULHRSW xmm2/m128, xmmV, {k}{= z}, xmm1","vpmulhrsw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W= IG 0B /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPMULHRSW ymm1, ymmV, ymm2/m256","VPMULHRSW ymm2/m256, ymmV, ymm1","vpmul= hrsw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 0B /r","V","V","AVX2",= "","w,r,r","","" +"VPMULHRSW ymm1, {k}{z}, ymmV, ymm2/m256","VPMULHRSW ymm2/m256, ymmV, {k}{= z}, ymm1","vpmulhrsw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W= IG 0B /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPMULHRSW zmm1, {k}{z}, zmmV, zmm2/m512","VPMULHRSW zmm2/m512, zmmV, {k}{= z}, zmm1","vpmulhrsw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W= IG 0B /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPMULHUW xmm1, xmmV, xmm2/m128","VPMULHUW xmm2/m128, xmmV, xmm1","vpmulhu= w xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E4 /r","V","V","AVX","","w,= r,r","","" +"VPMULHUW xmm1, {k}{z}, xmmV, xmm2/m128","VPMULHUW xmm2/m128, xmmV, {k}{z}= , xmm1","vpmulhuw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG E4= /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPMULHUW ymm1, ymmV, ymm2/m256","VPMULHUW ymm2/m256, ymmV, ymm1","vpmulhu= w ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E4 /r","V","V","AVX2","","w= ,r,r","","" +"VPMULHUW ymm1, {k}{z}, ymmV, ymm2/m256","VPMULHUW ymm2/m256, ymmV, {k}{z}= , ymm1","vpmulhuw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG E4= /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPMULHUW zmm1, {k}{z}, zmmV, zmm2/m512","VPMULHUW zmm2/m512, zmmV, {k}{z}= , zmm1","vpmulhuw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG E4= /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPMULHW xmm1, xmmV, xmm2/m128","VPMULHW xmm2/m128, xmmV, xmm1","vpmulhw x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E5 /r","V","V","AVX","","w,r,r= ","","" +"VPMULHW xmm1, {k}{z}, xmmV, xmm2/m128","VPMULHW xmm2/m128, xmmV, {k}{z}, = xmm1","vpmulhw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG E5 /r= ","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPMULHW ymm1, ymmV, ymm2/m256","VPMULHW ymm2/m256, ymmV, ymm1","vpmulhw y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E5 /r","V","V","AVX2","","w,r,= r","","" +"VPMULHW ymm1, {k}{z}, ymmV, ymm2/m256","VPMULHW ymm2/m256, ymmV, {k}{z}, = ymm1","vpmulhw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG E5 /r= ","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPMULHW zmm1, {k}{z}, zmmV, zmm2/m512","VPMULHW zmm2/m512, zmmV, {k}{z}, = zmm1","vpmulhw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG E5 /r= ","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPMULLD xmm1, xmmV, xmm2/m128","VPMULLD xmm2/m128, xmmV, 
xmm1","vpmulld x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 40 /r","V","V","AVX","","w,r= ,r","","" +"VPMULLD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPMULLD xmm2/m128/m32bcst= , xmmV, {k}{z}, xmm1","vpmulld xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W0 40 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,= r,r,r","","" +"VPMULLD ymm1, ymmV, ymm2/m256","VPMULLD ymm2/m256, ymmV, ymm1","vpmulld y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 40 /r","V","V","AVX2","","w,= r,r","","" +"VPMULLD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPMULLD ymm2/m256/m32bcst= , ymmV, {k}{z}, ymm1","vpmulld ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W0 40 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,= r,r,r","","" +"VPMULLD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPMULLD zmm2/m512/m32bcst= , zmmV, {k}{z}, zmm1","vpmulld zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W0 40 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r",""= ,"" +"VPMULLQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMULLQ xmm2/m128/m64bcst= , xmmV, {k}{z}, xmm1","vpmullq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W1 40 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w= ,r,r,r","","" +"VPMULLQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMULLQ ymm2/m256/m64bcst= , ymmV, {k}{z}, ymm1","vpmullq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W1 40 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w= ,r,r,r","","" +"VPMULLQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMULLQ zmm2/m512/m64bcst= , zmmV, {k}{z}, zmm1","vpmullq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W1 40 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r,r","= ","" +"VPMULLW xmm1, xmmV, xmm2/m128","VPMULLW xmm2/m128, xmmV, xmm1","vpmullw x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D5 /r","V","V","AVX","","w,r,r= ","","" +"VPMULLW xmm1, {k}{z}, xmmV, xmm2/m128","VPMULLW xmm2/m128, xmmV, {k}{z}, = xmm1","vpmullw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG D5 /r= ","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPMULLW ymm1, ymmV, ymm2/m256","VPMULLW ymm2/m256, ymmV, ymm1","vpmullw y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D5 /r","V","V","AVX2","","w,r,= r","","" +"VPMULLW ymm1, {k}{z}, ymmV, ymm2/m256","VPMULLW ymm2/m256, ymmV, {k}{z}, = ymm1","vpmullw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG D5 /r= ","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPMULLW zmm1, {k}{z}, zmmV, zmm2/m512","VPMULLW zmm2/m512, zmmV, {k}{z}, = zmm1","vpmullw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG D5 /r= ","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPMULTISHIFTQB xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMULTISHIFTQB xmm= 2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpmultishiftqb xmm2/m128/m64bcst, xmmV= , {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 83 /r","V","V","AVX512_VBMI+AVX512= VL","bscale8,scale16","w,r,r,r","","" +"VPMULTISHIFTQB ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMULTISHIFTQB ymm= 2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpmultishiftqb ymm2/m256/m64bcst, ymmV= , {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 83 /r","V","V","AVX512_VBMI+AVX512= VL","bscale8,scale32","w,r,r,r","","" +"VPMULTISHIFTQB zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMULTISHIFTQB zmm= 2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpmultishiftqb zmm2/m512/m64bcst, zmmV= , {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 83 /r","V","V","AVX512_VBMI","bsca= le8,scale64","w,r,r,r","","" +"VPMULUDQ xmm1, xmmV, xmm2/m128","VPMULUDQ xmm2/m128, xmmV, xmm1","vpmulud= q xmm2/m128, 
xmmV, xmm1","VEX.NDS.128.66.0F.WIG F4 /r","V","V","AVX","","w,= r,r","","" +"VPMULUDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMULUDQ xmm2/m128/m64bc= st, xmmV, {k}{z}, xmm1","vpmuludq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","E= VEX.NDS.128.66.0F.W1 F4 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w= ,r,r,r","","" +"VPMULUDQ ymm1, ymmV, ymm2/m256","VPMULUDQ ymm2/m256, ymmV, ymm1","vpmulud= q ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F4 /r","V","V","AVX2","","w= ,r,r","","" +"VPMULUDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMULUDQ ymm2/m256/m64bc= st, ymmV, {k}{z}, ymm1","vpmuludq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","E= VEX.NDS.256.66.0F.W1 F4 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w= ,r,r,r","","" +"VPMULUDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMULUDQ zmm2/m512/m64bc= st, zmmV, {k}{z}, zmm1","vpmuludq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","E= VEX.NDS.512.66.0F.W1 F4 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","= ","" +"VPOPCNTB xmm1, {k}{z}, xmm2/m128","VPOPCNTB xmm2/m128, {k}{z}, xmm1","vpo= pcntb xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W0 54 /r","V","V","AVX512_= BITALG+AVX512VL","scale16","w,r,r","","" +"VPOPCNTB ymm1, {k}{z}, ymm2/m256","VPOPCNTB ymm2/m256, {k}{z}, ymm1","vpo= pcntb ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W0 54 /r","V","V","AVX512_= BITALG+AVX512VL","scale32","w,r,r","","" +"VPOPCNTB zmm1, {k}{z}, zmm2/m512","VPOPCNTB zmm2/m512, {k}{z}, zmm1","vpo= pcntb zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W0 54 /r","V","V","AVX512_= BITALG","scale64","w,r,r","","" +"VPOPCNTD xmm1, {k}{z}, xmm2/m128/m32bcst","VPOPCNTD xmm2/m128/m32bcst, {k= }{z}, xmm1","vpopcntd xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W0= 55 /r","V","V","AVX512_VPOPCNTDQ+AVX512VL","bscale4,scale16","w,r,r","","" +"VPOPCNTD ymm1, {k}{z}, ymm2/m256/m32bcst","VPOPCNTD ymm2/m256/m32bcst, {k= }{z}, ymm1","vpopcntd ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W0= 55 /r","V","V","AVX512_VPOPCNTDQ+AVX512VL","bscale4,scale32","w,r,r","","" +"VPOPCNTD zmm1, {k}{z}, zmm2/m512/m32bcst","VPOPCNTD zmm2/m512/m32bcst, {k= }{z}, zmm1","vpopcntd zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0= 55 /r","V","V","AVX512_VPOPCNTDQ","bscale4,scale64","w,r,r","","" +"VPOPCNTQ xmm1, {k}{z}, xmm2/m128/m64bcst","VPOPCNTQ xmm2/m128/m64bcst, {k= }{z}, xmm1","vpopcntq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W1= 55 /r","V","V","AVX512_VPOPCNTDQ+AVX512VL","bscale8,scale16","w,r,r","","" +"VPOPCNTQ ymm1, {k}{z}, ymm2/m256/m64bcst","VPOPCNTQ ymm2/m256/m64bcst, {k= }{z}, ymm1","vpopcntq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W1= 55 /r","V","V","AVX512_VPOPCNTDQ+AVX512VL","bscale8,scale32","w,r,r","","" +"VPOPCNTQ zmm1, {k}{z}, zmm2/m512/m64bcst","VPOPCNTQ zmm2/m512/m64bcst, {k= }{z}, zmm1","vpopcntq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1= 55 /r","V","V","AVX512_VPOPCNTDQ","bscale8,scale64","w,r,r","","" +"VPOPCNTW xmm1, {k}{z}, xmm2/m128","VPOPCNTW xmm2/m128, {k}{z}, xmm1","vpo= pcntw xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W1 54 /r","V","V","AVX512_= BITALG+AVX512VL","scale16","w,r,r","","" +"VPOPCNTW ymm1, {k}{z}, ymm2/m256","VPOPCNTW ymm2/m256, {k}{z}, ymm1","vpo= pcntw ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W1 54 /r","V","V","AVX512_= BITALG+AVX512VL","scale32","w,r,r","","" +"VPOPCNTW zmm1, {k}{z}, zmm2/m512","VPOPCNTW zmm2/m512, {k}{z}, zmm1","vpo= pcntw zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W1 54 /r","V","V","AVX512_= BITALG","scale64","w,r,r","","" +"VPOR xmm1, xmmV, xmm2/m128","VPOR xmm2/m128, xmmV, xmm1","vpor xmm2/m128,= 
xmmV, xmm1","VEX.NDS.128.66.0F.WIG EB /r","V","V","AVX","","w,r,r","","" +"VPOR ymm1, ymmV, ymm2/m256","VPOR ymm2/m256, ymmV, ymm1","vpor ymm2/m256,= ymmV, ymm1","VEX.NDS.256.66.0F.WIG EB /r","V","V","AVX2","","w,r,r","","" +"VPORD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPORD xmm2/m128/m32bcst, xm= mV, {k}{z}, xmm1","vpord xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.1= 28.66.0F.W0 EB /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","= ","" +"VPORD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPORD ymm2/m256/m32bcst, ym= mV, {k}{z}, ymm1","vpord ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.2= 56.66.0F.W0 EB /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","= ","" +"VPORD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPORD zmm2/m512/m32bcst, zm= mV, {k}{z}, zmm1","vpord zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.5= 12.66.0F.W0 EB /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VPORQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPORQ xmm2/m128/m64bcst, xm= mV, {k}{z}, xmm1","vporq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.1= 28.66.0F.W1 EB /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","= ","" +"VPORQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPORQ ymm2/m256/m64bcst, ym= mV, {k}{z}, ymm1","vporq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.2= 56.66.0F.W1 EB /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","= ","" +"VPORQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPORQ zmm2/m512/m64bcst, zm= mV, {k}{z}, zmm1","vporq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.5= 12.66.0F.W1 EB /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VPPERM xmm1, xmmV, xmmIH, xmm2/m128","VPPERM xmm2/m128, xmmIH, xmmV, xmm1= ","vpperm xmm2/m128, xmmIH, xmmV, xmm1","XOP.NDS.128.08.W1 A3 /r /is4","V",= "V","XOP","amd","w,r,r,r","","" +"VPPERM xmm1, xmmV, xmm2/m128, xmmIH","VPPERM xmmIH, xmm2/m128, xmmV, xmm1= ","vpperm xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 A3 /r /is4","V",= "V","XOP","amd","w,r,r,r","","" +"VPROLD xmmV, {k}{z}, xmm2/m128/m32bcst, imm8u","VPROLD imm8u, xmm2/m128/m= 32bcst, {k}{z}, xmmV","vprold imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","EVEX= .NDD.128.66.0F.W0 72 /1 ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w= ,r,r,r","","" +"VPROLD ymmV, {k}{z}, ymm2/m256/m32bcst, imm8u","VPROLD imm8u, ymm2/m256/m= 32bcst, {k}{z}, ymmV","vprold imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","EVEX= .NDD.256.66.0F.W0 72 /1 ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w= ,r,r,r","","" +"VPROLD zmmV, {k}{z}, zmm2/m512/m32bcst, imm8u","VPROLD imm8u, zmm2/m512/m= 32bcst, {k}{z}, zmmV","vprold imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","EVEX= .NDD.512.66.0F.W0 72 /1 ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","= ","" +"VPROLQ xmmV, {k}{z}, xmm2/m128/m64bcst, imm8u","VPROLQ imm8u, xmm2/m128/m= 64bcst, {k}{z}, xmmV","vprolq imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","EVEX= .NDD.128.66.0F.W1 72 /1 ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w= ,r,r,r","","" +"VPROLQ ymmV, {k}{z}, ymm2/m256/m64bcst, imm8u","VPROLQ imm8u, ymm2/m256/m= 64bcst, {k}{z}, ymmV","vprolq imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","EVEX= .NDD.256.66.0F.W1 72 /1 ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w= ,r,r,r","","" +"VPROLQ zmmV, {k}{z}, zmm2/m512/m64bcst, imm8u","VPROLQ imm8u, zmm2/m512/m= 64bcst, {k}{z}, zmmV","vprolq imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","EVEX= .NDD.512.66.0F.W1 72 /1 ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","= ","" +"VPROLVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPROLVD xmm2/m128/m32bcst= , xmmV, {k}{z}, xmm1","vprolvd 
xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W0 15 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,= r,r,r","","" +"VPROLVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPROLVD ymm2/m256/m32bcst= , ymmV, {k}{z}, ymm1","vprolvd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W0 15 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,= r,r,r","","" +"VPROLVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPROLVD zmm2/m512/m32bcst= , zmmV, {k}{z}, zmm1","vprolvd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W0 15 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r",""= ,"" +"VPROLVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPROLVQ xmm2/m128/m64bcst= , xmmV, {k}{z}, xmm1","vprolvq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W1 15 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,= r,r,r","","" +"VPROLVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPROLVQ ymm2/m256/m64bcst= , ymmV, {k}{z}, ymm1","vprolvq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W1 15 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,= r,r,r","","" +"VPROLVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPROLVQ zmm2/m512/m64bcst= , zmmV, {k}{z}, zmm1","vprolvq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W1 15 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r",""= ,"" +"VPRORD xmmV, {k}{z}, xmm2/m128/m32bcst, imm8u","VPRORD imm8u, xmm2/m128/m= 32bcst, {k}{z}, xmmV","vprord imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","EVEX= .NDD.128.66.0F.W0 72 /0 ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w= ,r,r,r","","" +"VPRORD ymmV, {k}{z}, ymm2/m256/m32bcst, imm8u","VPRORD imm8u, ymm2/m256/m= 32bcst, {k}{z}, ymmV","vprord imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","EVEX= .NDD.256.66.0F.W0 72 /0 ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w= ,r,r,r","","" +"VPRORD zmmV, {k}{z}, zmm2/m512/m32bcst, imm8u","VPRORD imm8u, zmm2/m512/m= 32bcst, {k}{z}, zmmV","vprord imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","EVEX= .NDD.512.66.0F.W0 72 /0 ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","= ","" +"VPRORQ xmmV, {k}{z}, xmm2/m128/m64bcst, imm8u","VPRORQ imm8u, xmm2/m128/m= 64bcst, {k}{z}, xmmV","vprorq imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","EVEX= .NDD.128.66.0F.W1 72 /0 ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w= ,r,r,r","","" +"VPRORQ ymmV, {k}{z}, ymm2/m256/m64bcst, imm8u","VPRORQ imm8u, ymm2/m256/m= 64bcst, {k}{z}, ymmV","vprorq imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","EVEX= .NDD.256.66.0F.W1 72 /0 ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w= ,r,r,r","","" +"VPRORQ zmmV, {k}{z}, zmm2/m512/m64bcst, imm8u","VPRORQ imm8u, zmm2/m512/m= 64bcst, {k}{z}, zmmV","vprorq imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","EVEX= .NDD.512.66.0F.W1 72 /0 ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","= ","" +"VPRORVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPRORVD xmm2/m128/m32bcst= , xmmV, {k}{z}, xmm1","vprorvd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W0 14 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,= r,r,r","","" +"VPRORVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPRORVD ymm2/m256/m32bcst= , ymmV, {k}{z}, ymm1","vprorvd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W0 14 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,= r,r,r","","" +"VPRORVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPRORVD zmm2/m512/m32bcst= , zmmV, {k}{z}, zmm1","vprorvd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W0 14 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r",""= ,"" +"VPRORVQ xmm1, {k}{z}, xmmV, 
xmm2/m128/m64bcst","VPRORVQ xmm2/m128/m64bcst= , xmmV, {k}{z}, xmm1","vprorvq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W1 14 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,= r,r,r","","" +"VPRORVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPRORVQ ymm2/m256/m64bcst= , ymmV, {k}{z}, ymm1","vprorvq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W1 14 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,= r,r,r","","" +"VPRORVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPRORVQ zmm2/m512/m64bcst= , zmmV, {k}{z}, zmm1","vprorvq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W1 14 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r",""= ,"" +"VPROTB xmm1, xmm2/m128, imm8u","VPROTB imm8u, xmm2/m128, xmm1","vprotb im= m8u, xmm2/m128, xmm1","XOP.128.08.W0 C0 /r ib","V","V","XOP","amd","w,r,r",= "","" +"VPROTB xmm1, xmmV, xmm2/m128","VPROTB xmm2/m128, xmmV, xmm1","vprotb xmm2= /m128, xmmV, xmm1","XOP.NDS.128.09.W1 90 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPROTB xmm1, xmm2/m128, xmmV","VPROTB xmmV, xmm2/m128, xmm1","vprotb xmmV= , xmm2/m128, xmm1","XOP.NDS.128.09.W0 90 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPROTD xmm1, xmm2/m128, imm8u","VPROTD imm8u, xmm2/m128, xmm1","vprotd im= m8u, xmm2/m128, xmm1","XOP.128.08.W0 C2 /r ib","V","V","XOP","amd","w,r,r",= "","" +"VPROTD xmm1, xmmV, xmm2/m128","VPROTD xmm2/m128, xmmV, xmm1","vprotd xmm2= /m128, xmmV, xmm1","XOP.NDS.128.09.W1 92 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPROTD xmm1, xmm2/m128, xmmV","VPROTD xmmV, xmm2/m128, xmm1","vprotd xmmV= , xmm2/m128, xmm1","XOP.NDS.128.09.W0 92 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPROTQ xmm1, xmm2/m128, imm8u","VPROTQ imm8u, xmm2/m128, xmm1","vprotq im= m8u, xmm2/m128, xmm1","XOP.128.08.W0 C3 /r ib","V","V","XOP","amd","w,r,r",= "","" +"VPROTQ xmm1, xmmV, xmm2/m128","VPROTQ xmm2/m128, xmmV, xmm1","vprotq xmm2= /m128, xmmV, xmm1","XOP.NDS.128.09.W1 93 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPROTQ xmm1, xmm2/m128, xmmV","VPROTQ xmmV, xmm2/m128, xmm1","vprotq xmmV= , xmm2/m128, xmm1","XOP.NDS.128.09.W0 93 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPROTW xmm1, xmm2/m128, imm8u","VPROTW imm8u, xmm2/m128, xmm1","vprotw im= m8u, xmm2/m128, xmm1","XOP.128.08.W0 C1 /r ib","V","V","XOP","amd","w,r,r",= "","" +"VPROTW xmm1, xmmV, xmm2/m128","VPROTW xmm2/m128, xmmV, xmm1","vprotw xmm2= /m128, xmmV, xmm1","XOP.NDS.128.09.W1 91 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPROTW xmm1, xmm2/m128, xmmV","VPROTW xmmV, xmm2/m128, xmm1","vprotw xmmV= , xmm2/m128, xmm1","XOP.NDS.128.09.W0 91 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSADBW xmm1, xmmV, xmm2/m128","VPSADBW xmm2/m128, xmmV, xmm1","vpsadbw x= mm2/m128, xmmV, xmm1","EVEX.NDS.128.66.0F.WIG F6 /r","V","V","AVX512BW+AVX5= 12VL","scale16","w,r,r","","" +"VPSADBW xmm1, xmmV, xmm2/m128","VPSADBW xmm2/m128, xmmV, xmm1","vpsadbw x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F6 /r","V","V","AVX","","w,r,r= ","","" +"VPSADBW ymm1, ymmV, ymm2/m256","VPSADBW ymm2/m256, ymmV, ymm1","vpsadbw y= mm2/m256, ymmV, ymm1","EVEX.NDS.256.66.0F.WIG F6 /r","V","V","AVX512BW+AVX5= 12VL","scale32","w,r,r","","" +"VPSADBW ymm1, ymmV, ymm2/m256","VPSADBW ymm2/m256, ymmV, ymm1","vpsadbw y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F6 /r","V","V","AVX2","","w,r,= r","","" +"VPSADBW zmm1, zmmV, zmm2/m512","VPSADBW zmm2/m512, zmmV, zmm1","vpsadbw z= mm2/m512, zmmV, zmm1","EVEX.NDS.512.66.0F.WIG F6 /r","V","V","AVX512BW","sc= ale64","w,r,r","","" +"VPSCATTERDD vm32x, {k1-k7}, xmm1","VPSCATTERDD xmm1, {k1-k7}, vm32x","vps= catterdd 
xmm1, {k1-k7}, vm32x","EVEX.128.66.0F38.W0 A0 /vsib","V","V","AVX5= 12F+AVX512VL","modrm_memonly,scale4","w,rw,r","","" +"VPSCATTERDD vm32y, {k1-k7}, ymm1","VPSCATTERDD ymm1, {k1-k7}, vm32y","vps= catterdd ymm1, {k1-k7}, vm32y","EVEX.256.66.0F38.W0 A0 /vsib","V","V","AVX5= 12F+AVX512VL","modrm_memonly,scale4","w,rw,r","","" +"VPSCATTERDD vm32z, {k1-k7}, zmm1","VPSCATTERDD zmm1, {k1-k7}, vm32z","vps= catterdd zmm1, {k1-k7}, vm32z","EVEX.512.66.0F38.W0 A0 /vsib","V","V","AVX5= 12F","modrm_memonly,scale4","w,rw,r","","" +"VPSCATTERDQ vm32x, {k1-k7}, xmm1","VPSCATTERDQ xmm1, {k1-k7}, vm32x","vps= catterdq xmm1, {k1-k7}, vm32x","EVEX.128.66.0F38.W1 A0 /vsib","V","V","AVX5= 12F+AVX512VL","modrm_memonly,scale8","w,rw,r","","" +"VPSCATTERDQ vm32x, {k1-k7}, ymm1","VPSCATTERDQ ymm1, {k1-k7}, vm32x","vps= catterdq ymm1, {k1-k7}, vm32x","EVEX.256.66.0F38.W1 A0 /vsib","V","V","AVX5= 12F+AVX512VL","modrm_memonly,scale8","w,rw,r","","" +"VPSCATTERDQ vm32y, {k1-k7}, zmm1","VPSCATTERDQ zmm1, {k1-k7}, vm32y","vps= catterdq zmm1, {k1-k7}, vm32y","EVEX.512.66.0F38.W1 A0 /vsib","V","V","AVX5= 12F","modrm_memonly,scale8","w,rw,r","","" +"VPSCATTERQD vm64x, {k1-k7}, xmm1","VPSCATTERQD xmm1, {k1-k7}, vm64x","vps= catterqd xmm1, {k1-k7}, vm64x","EVEX.128.66.0F38.W0 A1 /vsib","V","V","AVX5= 12F+AVX512VL","modrm_memonly,scale4","w,rw,r","","" +"VPSCATTERQD vm64y, {k1-k7}, xmm1","VPSCATTERQD xmm1, {k1-k7}, vm64y","vps= catterqd xmm1, {k1-k7}, vm64y","EVEX.256.66.0F38.W0 A1 /vsib","V","V","AVX5= 12F+AVX512VL","modrm_memonly,scale4","w,rw,r","","" +"VPSCATTERQD vm64z, {k1-k7}, ymm1","VPSCATTERQD ymm1, {k1-k7}, vm64z","vps= catterqd ymm1, {k1-k7}, vm64z","EVEX.512.66.0F38.W0 A1 /vsib","V","V","AVX5= 12F","modrm_memonly,scale4","w,rw,r","","" +"VPSCATTERQQ vm64x, {k1-k7}, xmm1","VPSCATTERQQ xmm1, {k1-k7}, vm64x","vps= catterqq xmm1, {k1-k7}, vm64x","EVEX.128.66.0F38.W1 A1 /vsib","V","V","AVX5= 12F+AVX512VL","modrm_memonly,scale8","w,rw,r","","" +"VPSCATTERQQ vm64y, {k1-k7}, ymm1","VPSCATTERQQ ymm1, {k1-k7}, vm64y","vps= catterqq ymm1, {k1-k7}, vm64y","EVEX.256.66.0F38.W1 A1 /vsib","V","V","AVX5= 12F+AVX512VL","modrm_memonly,scale8","w,rw,r","","" +"VPSCATTERQQ vm64z, {k1-k7}, zmm1","VPSCATTERQQ zmm1, {k1-k7}, vm64z","vps= catterqq zmm1, {k1-k7}, vm64z","EVEX.512.66.0F38.W1 A1 /vsib","V","V","AVX5= 12F","modrm_memonly,scale8","w,rw,r","","" +"VPSHAB xmm1, xmmV, xmm2/m128","VPSHAB xmm2/m128, xmmV, xmm1","vpshab xmm2= /m128, xmmV, xmm1","XOP.NDS.128.09.W1 98 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSHAB xmm1, xmm2/m128, xmmV","VPSHAB xmmV, xmm2/m128, xmm1","vpshab xmmV= , xmm2/m128, xmm1","XOP.NDS.128.09.W0 98 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSHAD xmm1, xmmV, xmm2/m128","VPSHAD xmm2/m128, xmmV, xmm1","vpshad xmm2= /m128, xmmV, xmm1","XOP.NDS.128.09.W1 9A /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSHAD xmm1, xmm2/m128, xmmV","VPSHAD xmmV, xmm2/m128, xmm1","vpshad xmmV= , xmm2/m128, xmm1","XOP.NDS.128.09.W0 9A /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSHAQ xmm1, xmmV, xmm2/m128","VPSHAQ xmm2/m128, xmmV, xmm1","vpshaq xmm2= /m128, xmmV, xmm1","XOP.NDS.128.09.W1 9B /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSHAQ xmm1, xmm2/m128, xmmV","VPSHAQ xmmV, xmm2/m128, xmm1","vpshaq xmmV= , xmm2/m128, xmm1","XOP.NDS.128.09.W0 9B /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSHAW xmm1, xmmV, xmm2/m128","VPSHAW xmm2/m128, xmmV, xmm1","vpshaw xmm2= /m128, xmmV, xmm1","XOP.NDS.128.09.W1 99 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSHAW xmm1, xmm2/m128, xmmV","VPSHAW xmmV, xmm2/m128, xmm1","vpshaw xmmV= , xmm2/m128, 
xmm1","XOP.NDS.128.09.W0 99 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSHLB xmm1, xmmV, xmm2/m128","VPSHLB xmm2/m128, xmmV, xmm1","vpshlb xmm2= /m128, xmmV, xmm1","XOP.NDS.128.09.W1 94 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSHLB xmm1, xmm2/m128, xmmV","VPSHLB xmmV, xmm2/m128, xmm1","vpshlb xmmV= , xmm2/m128, xmm1","XOP.NDS.128.09.W0 94 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSHLD xmm1, xmmV, xmm2/m128","VPSHLD xmm2/m128, xmmV, xmm1","vpshld xmm2= /m128, xmmV, xmm1","XOP.NDS.128.09.W1 96 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSHLD xmm1, xmm2/m128, xmmV","VPSHLD xmmV, xmm2/m128, xmm1","vpshld xmmV= , xmm2/m128, xmm1","XOP.NDS.128.09.W0 96 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSHLDD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u","VPSHLDD imm8u, xmm= 2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpshldd imm8u, xmm2/m128/m32bcst, xmmV= , {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W0 71 /r ib","V","V","AVX512_VBMI2+AV= X512VL","bscale4,scale16","w,r,r,r,r","","" +"VPSHLDD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VPSHLDD imm8u, ymm= 2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpshldd imm8u, ymm2/m256/m32bcst, ymmV= , {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 71 /r ib","V","V","AVX512_VBMI2+AV= X512VL","bscale4,scale32","w,r,r,r,r","","" +"VPSHLDD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VPSHLDD imm8u, zmm= 2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpshldd imm8u, zmm2/m512/m32bcst, zmmV= , {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 71 /r ib","V","V","AVX512_VBMI2","= bscale4,scale64","w,r,r,r,r","","" +"VPSHLDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VPSHLDQ imm8u, xmm= 2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpshldq imm8u, xmm2/m128/m64bcst, xmmV= , {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 71 /r ib","V","V","AVX512_VBMI2+AV= X512VL","bscale8,scale16","w,r,r,r,r","","" +"VPSHLDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VPSHLDQ imm8u, ymm= 2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpshldq imm8u, ymm2/m256/m64bcst, ymmV= , {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 71 /r ib","V","V","AVX512_VBMI2+AV= X512VL","bscale8,scale32","w,r,r,r,r","","" +"VPSHLDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VPSHLDQ imm8u, zmm= 2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpshldq imm8u, zmm2/m512/m64bcst, zmmV= , {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 71 /r ib","V","V","AVX512_VBMI2","= bscale8,scale64","w,r,r,r,r","","" +"VPSHLDVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPSHLDVD xmm2/m128/m32bc= st, xmmV, {k}{z}, xmm1","vpshldvd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","E= VEX.DDS.128.66.0F38.W0 71 /r","V","V","AVX512_VBMI2+AVX512VL","bscale4,scal= e16","rw,r,r,r","","" +"VPSHLDVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPSHLDVD ymm2/m256/m32bc= st, ymmV, {k}{z}, ymm1","vpshldvd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","E= VEX.DDS.256.66.0F38.W0 71 /r","V","V","AVX512_VBMI2+AVX512VL","bscale4,scal= e32","rw,r,r,r","","" +"VPSHLDVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPSHLDVD zmm2/m512/m32bc= st, zmmV, {k}{z}, zmm1","vpshldvd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","E= VEX.DDS.512.66.0F38.W0 71 /r","V","V","AVX512_VBMI2","bscale4,scale64","rw,= r,r,r","","" +"VPSHLDVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPSHLDVQ xmm2/m128/m64bc= st, xmmV, {k}{z}, xmm1","vpshldvq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","E= VEX.DDS.128.66.0F38.W1 71 /r","V","V","AVX512_VBMI2+AVX512VL","bscale8,scal= e16","rw,r,r,r","","" +"VPSHLDVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPSHLDVQ ymm2/m256/m64bc= st, ymmV, {k}{z}, ymm1","vpshldvq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","E= VEX.DDS.256.66.0F38.W1 71 
/r","V","V","AVX512_VBMI2+AVX512VL","bscale8,scal= e32","rw,r,r,r","","" +"VPSHLDVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPSHLDVQ zmm2/m512/m64bc= st, zmmV, {k}{z}, zmm1","vpshldvq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","E= VEX.DDS.512.66.0F38.W1 71 /r","V","V","AVX512_VBMI2","bscale8,scale64","rw,= r,r,r","","" +"VPSHLDVW xmm1, {k}{z}, xmmV, xmm2/m128","VPSHLDVW xmm2/m128, xmmV, {k}{z}= , xmm1","vpshldvw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 7= 0 /r","V","V","AVX512_VBMI2+AVX512VL","scale16","rw,r,r,r","","" +"VPSHLDVW ymm1, {k}{z}, ymmV, ymm2/m256","VPSHLDVW ymm2/m256, ymmV, {k}{z}= , ymm1","vpshldvw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 7= 0 /r","V","V","AVX512_VBMI2+AVX512VL","scale32","rw,r,r,r","","" +"VPSHLDVW zmm1, {k}{z}, zmmV, zmm2/m512","VPSHLDVW zmm2/m512, zmmV, {k}{z}= , zmm1","vpshldvw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 7= 0 /r","V","V","AVX512_VBMI2","scale64","rw,r,r,r","","" +"VPSHLDW xmm1, {k}{z}, xmmV, xmm2/m128, imm8u","VPSHLDW imm8u, xmm2/m128, = xmmV, {k}{z}, xmm1","vpshldw imm8u, xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F3A.W1 70 /r ib","V","V","AVX512_VBMI2+AVX512VL","scale16","w,r,r= ,r,r","","" +"VPSHLDW ymm1, {k}{z}, ymmV, ymm2/m256, imm8u","VPSHLDW imm8u, ymm2/m256, = ymmV, {k}{z}, ymm1","vpshldw imm8u, ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F3A.W1 70 /r ib","V","V","AVX512_VBMI2+AVX512VL","scale32","w,r,r= ,r,r","","" +"VPSHLDW zmm1, {k}{z}, zmmV, zmm2/m512, imm8u","VPSHLDW imm8u, zmm2/m512, = zmmV, {k}{z}, zmm1","vpshldw imm8u, zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F3A.W1 70 /r ib","V","V","AVX512_VBMI2","scale64","w,r,r,r,r","",= "" +"VPSHLQ xmm1, xmmV, xmm2/m128","VPSHLQ xmm2/m128, xmmV, xmm1","vpshlq xmm2= /m128, xmmV, xmm1","XOP.NDS.128.09.W1 97 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSHLQ xmm1, xmm2/m128, xmmV","VPSHLQ xmmV, xmm2/m128, xmm1","vpshlq xmmV= , xmm2/m128, xmm1","XOP.NDS.128.09.W0 97 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSHLW xmm1, xmmV, xmm2/m128","VPSHLW xmm2/m128, xmmV, xmm1","vpshlw xmm2= /m128, xmmV, xmm1","XOP.NDS.128.09.W1 95 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSHLW xmm1, xmm2/m128, xmmV","VPSHLW xmmV, xmm2/m128, xmm1","vpshlw xmmV= , xmm2/m128, xmm1","XOP.NDS.128.09.W0 95 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSHRDD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u","VPSHRDD imm8u, xmm= 2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpshrdd imm8u, xmm2/m128/m32bcst, xmmV= , {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W0 73 /r ib","V","V","AVX512_VBMI2+AV= X512VL","bscale4,scale16","w,r,r,r,r","","" +"VPSHRDD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VPSHRDD imm8u, ymm= 2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpshrdd imm8u, ymm2/m256/m32bcst, ymmV= , {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 73 /r ib","V","V","AVX512_VBMI2+AV= X512VL","bscale4,scale32","w,r,r,r,r","","" +"VPSHRDD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VPSHRDD imm8u, zmm= 2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpshrdd imm8u, zmm2/m512/m32bcst, zmmV= , {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 73 /r ib","V","V","AVX512_VBMI2","= bscale4,scale64","w,r,r,r,r","","" +"VPSHRDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VPSHRDQ imm8u, xmm= 2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpshrdq imm8u, xmm2/m128/m64bcst, xmmV= , {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 73 /r ib","V","V","AVX512_VBMI2+AV= X512VL","bscale8,scale16","w,r,r,r,r","","" +"VPSHRDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VPSHRDQ imm8u, ymm= 2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpshrdq imm8u, 
ymm2/m256/m64bcst, ymmV= , {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 73 /r ib","V","V","AVX512_VBMI2+AV= X512VL","bscale8,scale32","w,r,r,r,r","","" +"VPSHRDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VPSHRDQ imm8u, zmm= 2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpshrdq imm8u, zmm2/m512/m64bcst, zmmV= , {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 73 /r ib","V","V","AVX512_VBMI2","= bscale8,scale64","w,r,r,r,r","","" +"VPSHRDVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPSHRDVD xmm2/m128/m32bc= st, xmmV, {k}{z}, xmm1","vpshrdvd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","E= VEX.DDS.128.66.0F38.W0 73 /r","V","V","AVX512_VBMI2+AVX512VL","bscale4,scal= e16","rw,r,r,r","","" +"VPSHRDVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPSHRDVD ymm2/m256/m32bc= st, ymmV, {k}{z}, ymm1","vpshrdvd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","E= VEX.DDS.256.66.0F38.W0 73 /r","V","V","AVX512_VBMI2+AVX512VL","bscale4,scal= e32","rw,r,r,r","","" +"VPSHRDVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPSHRDVD zmm2/m512/m32bc= st, zmmV, {k}{z}, zmm1","vpshrdvd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","E= VEX.DDS.512.66.0F38.W0 73 /r","V","V","AVX512_VBMI2","bscale4,scale64","rw,= r,r,r","","" +"VPSHRDVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPSHRDVQ xmm2/m128/m64bc= st, xmmV, {k}{z}, xmm1","vpshrdvq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","E= VEX.DDS.128.66.0F38.W1 73 /r","V","V","AVX512_VBMI2+AVX512VL","bscale8,scal= e16","rw,r,r,r","","" +"VPSHRDVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPSHRDVQ ymm2/m256/m64bc= st, ymmV, {k}{z}, ymm1","vpshrdvq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","E= VEX.DDS.256.66.0F38.W1 73 /r","V","V","AVX512_VBMI2+AVX512VL","bscale8,scal= e32","rw,r,r,r","","" +"VPSHRDVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPSHRDVQ zmm2/m512/m64bc= st, zmmV, {k}{z}, zmm1","vpshrdvq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","E= VEX.DDS.512.66.0F38.W1 73 /r","V","V","AVX512_VBMI2","bscale8,scale64","rw,= r,r,r","","" +"VPSHRDVW xmm1, {k}{z}, xmmV, xmm2/m128","VPSHRDVW xmm2/m128, xmmV, {k}{z}= , xmm1","vpshrdvw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 7= 2 /r","V","V","AVX512_VBMI2+AVX512VL","scale16","rw,r,r,r","","" +"VPSHRDVW ymm1, {k}{z}, ymmV, ymm2/m256","VPSHRDVW ymm2/m256, ymmV, {k}{z}= , ymm1","vpshrdvw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 7= 2 /r","V","V","AVX512_VBMI2+AVX512VL","scale32","rw,r,r,r","","" +"VPSHRDVW zmm1, {k}{z}, zmmV, zmm2/m512","VPSHRDVW zmm2/m512, zmmV, {k}{z}= , zmm1","vpshrdvw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 7= 2 /r","V","V","AVX512_VBMI2","scale64","rw,r,r,r","","" +"VPSHRDW xmm1, {k}{z}, xmmV, xmm2/m128, imm8u","VPSHRDW imm8u, xmm2/m128, = xmmV, {k}{z}, xmm1","vpshrdw imm8u, xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F3A.W1 72 /r ib","V","V","AVX512_VBMI2+AVX512VL","scale16","w,r,r= ,r,r","","" +"VPSHRDW ymm1, {k}{z}, ymmV, ymm2/m256, imm8u","VPSHRDW imm8u, ymm2/m256, = ymmV, {k}{z}, ymm1","vpshrdw imm8u, ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F3A.W1 72 /r ib","V","V","AVX512_VBMI2+AVX512VL","scale32","w,r,r= ,r,r","","" +"VPSHRDW zmm1, {k}{z}, zmmV, zmm2/m512, imm8u","VPSHRDW imm8u, zmm2/m512, = zmmV, {k}{z}, zmm1","vpshrdw imm8u, zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F3A.W1 72 /r ib","V","V","AVX512_VBMI2","scale64","w,r,r,r,r","",= "" +"VPSHUFB xmm1, xmmV, xmm2/m128","VPSHUFB xmm2/m128, xmmV, xmm1","vpshufb x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 00 /r","V","V","AVX","","w,r= ,r","","" +"VPSHUFB xmm1, {k}{z}, xmmV, xmm2/m128","VPSHUFB xmm2/m128, xmmV, {k}{z}, = xmm1","vpshufb xmm2/m128, xmmV, 
{k}{z}, xmm1","EVEX.NDS.128.66.0F38.WIG 00 = /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSHUFB ymm1, ymmV, ymm2/m256","VPSHUFB ymm2/m256, ymmV, ymm1","vpshufb y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 00 /r","V","V","AVX2","","w,= r,r","","" +"VPSHUFB ymm1, {k}{z}, ymmV, ymm2/m256","VPSHUFB ymm2/m256, ymmV, {k}{z}, = ymm1","vpshufb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.WIG 00 = /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPSHUFB zmm1, {k}{z}, zmmV, zmm2/m512","VPSHUFB zmm2/m512, zmmV, {k}{z}, = zmm1","vpshufb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.WIG 00 = /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPSHUFBITQMB k1, {k}, xmmV, xmm2/m128","VPSHUFBITQMB xmm2/m128, xmmV, {k}= , k1","vpshufbitqmb xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F38.W0 8F /= r","V","V","AVX512_BITALG+AVX512VL","scale16","w,r,r,r","","" +"VPSHUFBITQMB k1, {k}, ymmV, ymm2/m256","VPSHUFBITQMB ymm2/m256, ymmV, {k}= , k1","vpshufbitqmb ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F38.W0 8F /= r","V","V","AVX512_BITALG+AVX512VL","scale32","w,r,r,r","","" +"VPSHUFBITQMB k1, {k}, zmmV, zmm2/m512","VPSHUFBITQMB zmm2/m512, zmmV, {k}= , k1","vpshufbitqmb zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F38.W0 8F /= r","V","V","AVX512_BITALG","scale64","w,r,r,r","","" +"VPSHUFD xmm1, xmm2/m128, imm8u","VPSHUFD imm8u, xmm2/m128, xmm1","vpshufd= imm8u, xmm2/m128, xmm1","VEX.128.66.0F.WIG 70 /r ib","V","V","AVX","","w,r= ,r","","" +"VPSHUFD xmm1, {k}{z}, xmm2/m128/m32bcst, imm8u","VPSHUFD imm8u, xmm2/m128= /m32bcst, {k}{z}, xmm1","vpshufd imm8u, xmm2/m128/m32bcst, {k}{z}, xmm1","E= VEX.128.66.0F.W0 70 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,= r,r,r","","" +"VPSHUFD ymm1, ymm2/m256, imm8u","VPSHUFD imm8u, ymm2/m256, ymm1","vpshufd= imm8u, ymm2/m256, ymm1","VEX.256.66.0F.WIG 70 /r ib","V","V","AVX2","","w,= r,r","","" +"VPSHUFD ymm1, {k}{z}, ymm2/m256/m32bcst, imm8u","VPSHUFD imm8u, ymm2/m256= /m32bcst, {k}{z}, ymm1","vpshufd imm8u, ymm2/m256/m32bcst, {k}{z}, ymm1","E= VEX.256.66.0F.W0 70 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,= r,r,r","","" +"VPSHUFD zmm1, {k}{z}, zmm2/m512/m32bcst, imm8u","VPSHUFD imm8u, zmm2/m512= /m32bcst, {k}{z}, zmm1","vpshufd imm8u, zmm2/m512/m32bcst, {k}{z}, zmm1","E= VEX.512.66.0F.W0 70 /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r",""= ,"" +"VPSHUFHW xmm1, xmm2/m128, imm8u","VPSHUFHW imm8u, xmm2/m128, xmm1","vpshu= fhw imm8u, xmm2/m128, xmm1","VEX.128.F3.0F.WIG 70 /r ib","V","V","AVX","","= w,r,r","","" +"VPSHUFHW xmm1, {k}{z}, xmm2/m128, imm8u","VPSHUFHW imm8u, xmm2/m128, {k}{= z}, xmm1","vpshufhw imm8u, xmm2/m128, {k}{z}, xmm1","EVEX.128.F3.0F.WIG 70 = /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSHUFHW ymm1, ymm2/m256, imm8u","VPSHUFHW imm8u, ymm2/m256, ymm1","vpshu= fhw imm8u, ymm2/m256, ymm1","VEX.256.F3.0F.WIG 70 /r ib","V","V","AVX2","",= "w,r,r","","" +"VPSHUFHW ymm1, {k}{z}, ymm2/m256, imm8u","VPSHUFHW imm8u, ymm2/m256, {k}{= z}, ymm1","vpshufhw imm8u, ymm2/m256, {k}{z}, ymm1","EVEX.256.F3.0F.WIG 70 = /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPSHUFHW zmm1, {k}{z}, zmm2/m512, imm8u","VPSHUFHW imm8u, zmm2/m512, {k}{= z}, zmm1","vpshufhw imm8u, zmm2/m512, {k}{z}, zmm1","EVEX.512.F3.0F.WIG 70 = /r ib","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPSHUFLW xmm1, xmm2/m128, imm8u","VPSHUFLW imm8u, xmm2/m128, xmm1","vpshu= flw imm8u, xmm2/m128, xmm1","VEX.128.F2.0F.WIG 70 /r ib","V","V","AVX","","= w,r,r","","" +"VPSHUFLW xmm1, {k}{z}, xmm2/m128, 
imm8u","VPSHUFLW imm8u, xmm2/m128, {k}{= z}, xmm1","vpshuflw imm8u, xmm2/m128, {k}{z}, xmm1","EVEX.128.F2.0F.WIG 70 = /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSHUFLW ymm1, ymm2/m256, imm8u","VPSHUFLW imm8u, ymm2/m256, ymm1","vpshu= flw imm8u, ymm2/m256, ymm1","VEX.256.F2.0F.WIG 70 /r ib","V","V","AVX2","",= "w,r,r","","" +"VPSHUFLW ymm1, {k}{z}, ymm2/m256, imm8u","VPSHUFLW imm8u, ymm2/m256, {k}{= z}, ymm1","vpshuflw imm8u, ymm2/m256, {k}{z}, ymm1","EVEX.256.F2.0F.WIG 70 = /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPSHUFLW zmm1, {k}{z}, zmm2/m512, imm8u","VPSHUFLW imm8u, zmm2/m512, {k}{= z}, zmm1","vpshuflw imm8u, zmm2/m512, {k}{z}, zmm1","EVEX.512.F2.0F.WIG 70 = /r ib","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPSIGNB xmm1, xmmV, xmm2/m128","VPSIGNB xmm2/m128, xmmV, xmm1","vpsignb x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 08 /r","V","V","AVX","","w,r= ,r","","" +"VPSIGNB ymm1, ymmV, ymm2/m256","VPSIGNB ymm2/m256, ymmV, ymm1","vpsignb y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 08 /r","V","V","AVX2","","w,= r,r","","" +"VPSIGND xmm1, xmmV, xmm2/m128","VPSIGND xmm2/m128, xmmV, xmm1","vpsignd x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 0A /r","V","V","AVX","","w,r= ,r","","" +"VPSIGND ymm1, ymmV, ymm2/m256","VPSIGND ymm2/m256, ymmV, ymm1","vpsignd y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 0A /r","V","V","AVX2","","w,= r,r","","" +"VPSIGNW xmm1, xmmV, xmm2/m128","VPSIGNW xmm2/m128, xmmV, xmm1","vpsignw x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 09 /r","V","V","AVX","","w,r= ,r","","" +"VPSIGNW ymm1, ymmV, ymm2/m256","VPSIGNW ymm2/m256, ymmV, ymm1","vpsignw y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 09 /r","V","V","AVX2","","w,= r,r","","" +"VPSLLD xmmV, xmm2, imm8u","VPSLLD imm8u, xmm2, xmmV","vpslld imm8u, xmm2,= xmmV","VEX.NDD.128.66.0F.WIG 72 /6 ib","V","V","AVX","modrm_regonly","w,r,= r","","" +"VPSLLD xmmV, {k}{z}, xmm2/m128/m32bcst, imm8u","VPSLLD imm8u, xmm2/m128/m= 32bcst, {k}{z}, xmmV","vpslld imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","EVEX= .NDD.128.66.0F.W0 72 /6 ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w= ,r,r,r","","" +"VPSLLD ymmV, ymm2, imm8u","VPSLLD imm8u, ymm2, ymmV","vpslld imm8u, ymm2,= ymmV","VEX.NDD.256.66.0F.WIG 72 /6 ib","V","V","AVX2","modrm_regonly","w,r= ,r","","" +"VPSLLD ymmV, {k}{z}, ymm2/m256/m32bcst, imm8u","VPSLLD imm8u, ymm2/m256/m= 32bcst, {k}{z}, ymmV","vpslld imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","EVEX= .NDD.256.66.0F.W0 72 /6 ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w= ,r,r,r","","" +"VPSLLD zmmV, {k}{z}, zmm2/m512/m32bcst, imm8u","VPSLLD imm8u, zmm2/m512/m= 32bcst, {k}{z}, zmmV","vpslld imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","EVEX= .NDD.512.66.0F.W0 72 /6 ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","= ","" +"VPSLLD xmm1, xmmV, xmm2/m128","VPSLLD xmm2/m128, xmmV, xmm1","vpslld xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F2 /r","V","V","AVX","","w,r,r","= ","" +"VPSLLD xmm1, {k}{z}, xmmV, xmm2/m128","VPSLLD xmm2/m128, xmmV, {k}{z}, xm= m1","vpslld xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 F2 /r","V= ","V","AVX512F+AVX512VL","scale16","w,r,r,r","","" +"VPSLLD ymm1, ymmV, xmm2/m128","VPSLLD xmm2/m128, ymmV, ymm1","vpslld xmm2= /m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F2 /r","V","V","AVX2","","w,r,r",= "","" +"VPSLLD ymm1, {k}{z}, ymmV, xmm2/m128","VPSLLD xmm2/m128, ymmV, {k}{z}, ym= m1","vpslld xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 F2 /r","V= ","V","AVX512F+AVX512VL","scale16","w,r,r,r","","" +"VPSLLD zmm1, 
{k}{z}, zmmV, xmm2/m128","VPSLLD xmm2/m128, zmmV, {k}{z}, zm= m1","vpslld xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 F2 /r","V= ","V","AVX512F","scale16","w,r,r,r","","" +"VPSLLDQ xmmV, xmm2, imm8u","VPSLLDQ imm8u, xmm2, xmmV","vpslldq imm8u, xm= m2, xmmV","VEX.NDD.128.66.0F.WIG 73 /7 ib","V","V","AVX","modrm_regonly","w= ,r,r","","" +"VPSLLDQ xmmV, xmm2/m128, imm8u","VPSLLDQ imm8u, xmm2/m128, xmmV","vpslldq= imm8u, xmm2/m128, xmmV","EVEX.NDD.128.66.0F.WIG 73 /7 ib","V","V","AVX512B= W+AVX512VL","scale16","w,r,r","","" +"VPSLLDQ ymmV, ymm2, imm8u","VPSLLDQ imm8u, ymm2, ymmV","vpslldq imm8u, ym= m2, ymmV","VEX.NDD.256.66.0F.WIG 73 /7 ib","V","V","AVX2","modrm_regonly","= w,r,r","","" +"VPSLLDQ ymmV, ymm2/m256, imm8u","VPSLLDQ imm8u, ymm2/m256, ymmV","vpslldq= imm8u, ymm2/m256, ymmV","EVEX.NDD.256.66.0F.WIG 73 /7 ib","V","V","AVX512B= W+AVX512VL","scale32","w,r,r","","" +"VPSLLDQ zmmV, zmm2/m512, imm8u","VPSLLDQ imm8u, zmm2/m512, zmmV","vpslldq= imm8u, zmm2/m512, zmmV","EVEX.NDD.512.66.0F.WIG 73 /7 ib","V","V","AVX512B= W","scale64","w,r,r","","" +"VPSLLQ xmmV, xmm2, imm8u","VPSLLQ imm8u, xmm2, xmmV","vpsllq imm8u, xmm2,= xmmV","VEX.NDD.128.66.0F.WIG 73 /6 ib","V","V","AVX","modrm_regonly","w,r,= r","","" +"VPSLLQ xmmV, {k}{z}, xmm2/m128/m64bcst, imm8u","VPSLLQ imm8u, xmm2/m128/m= 64bcst, {k}{z}, xmmV","vpsllq imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","EVEX= .NDD.128.66.0F.W1 73 /6 ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w= ,r,r,r","","" +"VPSLLQ ymmV, ymm2, imm8u","VPSLLQ imm8u, ymm2, ymmV","vpsllq imm8u, ymm2,= ymmV","VEX.NDD.256.66.0F.WIG 73 /6 ib","V","V","AVX2","modrm_regonly","w,r= ,r","","" +"VPSLLQ ymmV, {k}{z}, ymm2/m256/m64bcst, imm8u","VPSLLQ imm8u, ymm2/m256/m= 64bcst, {k}{z}, ymmV","vpsllq imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","EVEX= .NDD.256.66.0F.W1 73 /6 ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w= ,r,r,r","","" +"VPSLLQ zmmV, {k}{z}, zmm2/m512/m64bcst, imm8u","VPSLLQ imm8u, zmm2/m512/m= 64bcst, {k}{z}, zmmV","vpsllq imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","EVEX= .NDD.512.66.0F.W1 73 /6 ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","= ","" +"VPSLLQ xmm1, xmmV, xmm2/m128","VPSLLQ xmm2/m128, xmmV, xmm1","vpsllq xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F3 /r","V","V","AVX","","w,r,r","= ","" +"VPSLLQ xmm1, {k}{z}, xmmV, xmm2/m128","VPSLLQ xmm2/m128, xmmV, {k}{z}, xm= m1","vpsllq xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 F3 /r","V= ","V","AVX512F+AVX512VL","scale16","w,r,r,r","","" +"VPSLLQ ymm1, ymmV, xmm2/m128","VPSLLQ xmm2/m128, ymmV, ymm1","vpsllq xmm2= /m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F3 /r","V","V","AVX2","","w,r,r",= "","" +"VPSLLQ ymm1, {k}{z}, ymmV, xmm2/m128","VPSLLQ xmm2/m128, ymmV, {k}{z}, ym= m1","vpsllq xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 F3 /r","V= ","V","AVX512F+AVX512VL","scale16","w,r,r,r","","" +"VPSLLQ zmm1, {k}{z}, zmmV, xmm2/m128","VPSLLQ xmm2/m128, zmmV, {k}{z}, zm= m1","vpsllq xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 F3 /r","V= ","V","AVX512F","scale16","w,r,r,r","","" +"VPSLLVD xmm1, xmmV, xmm2/m128","VPSLLVD xmm2/m128, xmmV, xmm1","vpsllvd x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 47 /r","V","V","AVX2","","w,r= ,r","","" +"VPSLLVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPSLLVD xmm2/m128/m32bcst= , xmmV, {k}{z}, xmm1","vpsllvd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W0 47 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,= r,r,r","","" +"VPSLLVD ymm1, ymmV, ymm2/m256","VPSLLVD ymm2/m256, ymmV, ymm1","vpsllvd y= mm2/m256, ymmV, 
ymm1","VEX.NDS.256.66.0F38.W0 47 /r","V","V","AVX2","","w,r= ,r","","" +"VPSLLVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPSLLVD ymm2/m256/m32bcst= , ymmV, {k}{z}, ymm1","vpsllvd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W0 47 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,= r,r,r","","" +"VPSLLVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPSLLVD zmm2/m512/m32bcst= , zmmV, {k}{z}, zmm1","vpsllvd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W0 47 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r",""= ,"" +"VPSLLVQ xmm1, xmmV, xmm2/m128","VPSLLVQ xmm2/m128, xmmV, xmm1","vpsllvq x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W1 47 /r","V","V","AVX2","","w,r= ,r","","" +"VPSLLVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPSLLVQ xmm2/m128/m64bcst= , xmmV, {k}{z}, xmm1","vpsllvq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W1 47 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,= r,r,r","","" +"VPSLLVQ ymm1, ymmV, ymm2/m256","VPSLLVQ ymm2/m256, ymmV, ymm1","vpsllvq y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W1 47 /r","V","V","AVX2","","w,r= ,r","","" +"VPSLLVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPSLLVQ ymm2/m256/m64bcst= , ymmV, {k}{z}, ymm1","vpsllvq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W1 47 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,= r,r,r","","" +"VPSLLVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPSLLVQ zmm2/m512/m64bcst= , zmmV, {k}{z}, zmm1","vpsllvq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W1 47 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r",""= ,"" +"VPSLLVW xmm1, {k}{z}, xmmV, xmm2/m128","VPSLLVW xmm2/m128, xmmV, {k}{z}, = xmm1","vpsllvw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 12 /= r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSLLVW ymm1, {k}{z}, ymmV, ymm2/m256","VPSLLVW ymm2/m256, ymmV, {k}{z}, = ymm1","vpsllvw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 12 /= r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPSLLVW zmm1, {k}{z}, zmmV, zmm2/m512","VPSLLVW zmm2/m512, zmmV, {k}{z}, = zmm1","vpsllvw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 12 /= r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPSLLW xmmV, xmm2, imm8u","VPSLLW imm8u, xmm2, xmmV","vpsllw imm8u, xmm2,= xmmV","VEX.NDD.128.66.0F.WIG 71 /6 ib","V","V","AVX","modrm_regonly","w,r,= r","","" +"VPSLLW xmmV, {k}{z}, xmm2/m128, imm8u","VPSLLW imm8u, xmm2/m128, {k}{z}, = xmmV","vpsllw imm8u, xmm2/m128, {k}{z}, xmmV","EVEX.NDD.128.66.0F.WIG 71 /6= ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSLLW ymmV, ymm2, imm8u","VPSLLW imm8u, ymm2, ymmV","vpsllw imm8u, ymm2,= ymmV","VEX.NDD.256.66.0F.WIG 71 /6 ib","V","V","AVX2","modrm_regonly","w,r= ,r","","" +"VPSLLW ymmV, {k}{z}, ymm2/m256, imm8u","VPSLLW imm8u, ymm2/m256, {k}{z}, = ymmV","vpsllw imm8u, ymm2/m256, {k}{z}, ymmV","EVEX.NDD.256.66.0F.WIG 71 /6= ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPSLLW zmmV, {k}{z}, zmm2/m512, imm8u","VPSLLW imm8u, zmm2/m512, {k}{z}, = zmmV","vpsllw imm8u, zmm2/m512, {k}{z}, zmmV","EVEX.NDD.512.66.0F.WIG 71 /6= ib","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPSLLW xmm1, xmmV, xmm2/m128","VPSLLW xmm2/m128, xmmV, xmm1","vpsllw xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F1 /r","V","V","AVX","","w,r,r","= ","" +"VPSLLW xmm1, {k}{z}, xmmV, xmm2/m128","VPSLLW xmm2/m128, xmmV, {k}{z}, xm= m1","vpsllw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG F1 /r","= V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSLLW 
ymm1, ymmV, xmm2/m128","VPSLLW xmm2/m128, ymmV, ymm1","vpsllw xmm2= /m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F1 /r","V","V","AVX2","","w,r,r",= "","" +"VPSLLW ymm1, {k}{z}, ymmV, xmm2/m128","VPSLLW xmm2/m128, ymmV, {k}{z}, ym= m1","vpsllw xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG F1 /r","= V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSLLW zmm1, {k}{z}, zmmV, xmm2/m128","VPSLLW xmm2/m128, zmmV, {k}{z}, zm= m1","vpsllw xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG F1 /r","= V","V","AVX512BW","scale16","w,r,r,r","","" +"VPSRAD xmmV, xmm2, imm8u","VPSRAD imm8u, xmm2, xmmV","vpsrad imm8u, xmm2,= xmmV","VEX.NDD.128.66.0F.WIG 72 /4 ib","V","V","AVX","modrm_regonly","w,r,= r","","" +"VPSRAD xmmV, {k}{z}, xmm2/m128/m32bcst, imm8u","VPSRAD imm8u, xmm2/m128/m= 32bcst, {k}{z}, xmmV","vpsrad imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","EVEX= .NDD.128.66.0F.W0 72 /4 ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w= ,r,r,r","","" +"VPSRAD ymmV, ymm2, imm8u","VPSRAD imm8u, ymm2, ymmV","vpsrad imm8u, ymm2,= ymmV","VEX.NDD.256.66.0F.WIG 72 /4 ib","V","V","AVX2","modrm_regonly","w,r= ,r","","" +"VPSRAD ymmV, {k}{z}, ymm2/m256/m32bcst, imm8u","VPSRAD imm8u, ymm2/m256/m= 32bcst, {k}{z}, ymmV","vpsrad imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","EVEX= .NDD.256.66.0F.W0 72 /4 ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w= ,r,r,r","","" +"VPSRAD zmmV, {k}{z}, zmm2/m512/m32bcst, imm8u","VPSRAD imm8u, zmm2/m512/m= 32bcst, {k}{z}, zmmV","vpsrad imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","EVEX= .NDD.512.66.0F.W0 72 /4 ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","= ","" +"VPSRAD xmm1, xmmV, xmm2/m128","VPSRAD xmm2/m128, xmmV, xmm1","vpsrad xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E2 /r","V","V","AVX","","w,r,r","= ","" +"VPSRAD xmm1, {k}{z}, xmmV, xmm2/m128","VPSRAD xmm2/m128, xmmV, {k}{z}, xm= m1","vpsrad xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 E2 /r","V= ","V","AVX512F+AVX512VL","scale16","w,r,r,r","","" +"VPSRAD ymm1, ymmV, xmm2/m128","VPSRAD xmm2/m128, ymmV, ymm1","vpsrad xmm2= /m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E2 /r","V","V","AVX2","","w,r,r",= "","" +"VPSRAD ymm1, {k}{z}, ymmV, xmm2/m128","VPSRAD xmm2/m128, ymmV, {k}{z}, ym= m1","vpsrad xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 E2 /r","V= ","V","AVX512F+AVX512VL","scale16","w,r,r,r","","" +"VPSRAD zmm1, {k}{z}, zmmV, xmm2/m128","VPSRAD xmm2/m128, zmmV, {k}{z}, zm= m1","vpsrad xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 E2 /r","V= ","V","AVX512F","scale16","w,r,r,r","","" +"VPSRAQ xmmV, {k}{z}, xmm2/m128/m64bcst, imm8u","VPSRAQ imm8u, xmm2/m128/m= 64bcst, {k}{z}, xmmV","vpsraq imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","EVEX= .NDD.128.66.0F.W1 72 /4 ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w= ,r,r,r","","" +"VPSRAQ ymmV, {k}{z}, ymm2/m256/m64bcst, imm8u","VPSRAQ imm8u, ymm2/m256/m= 64bcst, {k}{z}, ymmV","vpsraq imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","EVEX= .NDD.256.66.0F.W1 72 /4 ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w= ,r,r,r","","" +"VPSRAQ zmmV, {k}{z}, zmm2/m512/m64bcst, imm8u","VPSRAQ imm8u, zmm2/m512/m= 64bcst, {k}{z}, zmmV","vpsraq imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","EVEX= .NDD.512.66.0F.W1 72 /4 ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","= ","" +"VPSRAQ xmm1, {k}{z}, xmmV, xmm2/m128","VPSRAQ xmm2/m128, xmmV, {k}{z}, xm= m1","vpsraq xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 E2 /r","V= ","V","AVX512F+AVX512VL","scale16","w,r,r,r","","" +"VPSRAQ ymm1, {k}{z}, ymmV, xmm2/m128","VPSRAQ xmm2/m128, ymmV, {k}{z}, ym= m1","vpsraq 
xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 E2 /r","V= ","V","AVX512F+AVX512VL","scale16","w,r,r,r","","" +"VPSRAQ zmm1, {k}{z}, zmmV, xmm2/m128","VPSRAQ xmm2/m128, zmmV, {k}{z}, zm= m1","vpsraq xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 E2 /r","V= ","V","AVX512F","scale16","w,r,r,r","","" +"VPSRAVD xmm1, xmmV, xmm2/m128","VPSRAVD xmm2/m128, xmmV, xmm1","vpsravd x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 46 /r","V","V","AVX2","","w,r= ,r","","" +"VPSRAVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPSRAVD xmm2/m128/m32bcst= , xmmV, {k}{z}, xmm1","vpsravd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W0 46 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,= r,r,r","","" +"VPSRAVD ymm1, ymmV, ymm2/m256","VPSRAVD ymm2/m256, ymmV, ymm1","vpsravd y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 46 /r","V","V","AVX2","","w,r= ,r","","" +"VPSRAVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPSRAVD ymm2/m256/m32bcst= , ymmV, {k}{z}, ymm1","vpsravd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W0 46 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,= r,r,r","","" +"VPSRAVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPSRAVD zmm2/m512/m32bcst= , zmmV, {k}{z}, zmm1","vpsravd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W0 46 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r",""= ,"" +"VPSRAVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPSRAVQ xmm2/m128/m64bcst= , xmmV, {k}{z}, xmm1","vpsravq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W1 46 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,= r,r,r","","" +"VPSRAVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPSRAVQ ymm2/m256/m64bcst= , ymmV, {k}{z}, ymm1","vpsravq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W1 46 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,= r,r,r","","" +"VPSRAVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPSRAVQ zmm2/m512/m64bcst= , zmmV, {k}{z}, zmm1","vpsravq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W1 46 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r",""= ,"" +"VPSRAVW xmm1, {k}{z}, xmmV, xmm2/m128","VPSRAVW xmm2/m128, xmmV, {k}{z}, = xmm1","vpsravw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 11 /= r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSRAVW ymm1, {k}{z}, ymmV, ymm2/m256","VPSRAVW ymm2/m256, ymmV, {k}{z}, = ymm1","vpsravw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 11 /= r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPSRAVW zmm1, {k}{z}, zmmV, zmm2/m512","VPSRAVW zmm2/m512, zmmV, {k}{z}, = zmm1","vpsravw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 11 /= r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPSRAW xmmV, xmm2, imm8u","VPSRAW imm8u, xmm2, xmmV","vpsraw imm8u, xmm2,= xmmV","VEX.NDD.128.66.0F.WIG 71 /4 ib","V","V","AVX","modrm_regonly","w,r,= r","","" +"VPSRAW xmmV, {k}{z}, xmm2/m128, imm8u","VPSRAW imm8u, xmm2/m128, {k}{z}, = xmmV","vpsraw imm8u, xmm2/m128, {k}{z}, xmmV","EVEX.NDD.128.66.0F.WIG 71 /4= ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSRAW ymmV, ymm2, imm8u","VPSRAW imm8u, ymm2, ymmV","vpsraw imm8u, ymm2,= ymmV","VEX.NDD.256.66.0F.WIG 71 /4 ib","V","V","AVX2","modrm_regonly","w,r= ,r","","" +"VPSRAW ymmV, {k}{z}, ymm2/m256, imm8u","VPSRAW imm8u, ymm2/m256, {k}{z}, = ymmV","vpsraw imm8u, ymm2/m256, {k}{z}, ymmV","EVEX.NDD.256.66.0F.WIG 71 /4= ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPSRAW zmmV, {k}{z}, zmm2/m512, imm8u","VPSRAW imm8u, zmm2/m512, {k}{z}, = zmmV","vpsraw 
imm8u, zmm2/m512, {k}{z}, zmmV","EVEX.NDD.512.66.0F.WIG 71 /4= ib","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPSRAW xmm1, xmmV, xmm2/m128","VPSRAW xmm2/m128, xmmV, xmm1","vpsraw xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E1 /r","V","V","AVX","","w,r,r","= ","" +"VPSRAW xmm1, {k}{z}, xmmV, xmm2/m128","VPSRAW xmm2/m128, xmmV, {k}{z}, xm= m1","vpsraw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG E1 /r","= V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSRAW ymm1, ymmV, xmm2/m128","VPSRAW xmm2/m128, ymmV, ymm1","vpsraw xmm2= /m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E1 /r","V","V","AVX2","","w,r,r",= "","" +"VPSRAW ymm1, {k}{z}, ymmV, xmm2/m128","VPSRAW xmm2/m128, ymmV, {k}{z}, ym= m1","vpsraw xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG E1 /r","= V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSRAW zmm1, {k}{z}, zmmV, xmm2/m128","VPSRAW xmm2/m128, zmmV, {k}{z}, zm= m1","vpsraw xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG E1 /r","= V","V","AVX512BW","scale16","w,r,r,r","","" +"VPSRLD xmmV, xmm2, imm8u","VPSRLD imm8u, xmm2, xmmV","vpsrld imm8u, xmm2,= xmmV","VEX.NDD.128.66.0F.WIG 72 /2 ib","V","V","AVX","modrm_regonly","w,r,= r","","" +"VPSRLD xmmV, {k}{z}, xmm2/m128/m32bcst, imm8u","VPSRLD imm8u, xmm2/m128/m= 32bcst, {k}{z}, xmmV","vpsrld imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","EVEX= .NDD.128.66.0F.W0 72 /2 ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w= ,r,r,r","","" +"VPSRLD ymmV, ymm2, imm8u","VPSRLD imm8u, ymm2, ymmV","vpsrld imm8u, ymm2,= ymmV","VEX.NDD.256.66.0F.WIG 72 /2 ib","V","V","AVX2","modrm_regonly","w,r= ,r","","" +"VPSRLD ymmV, {k}{z}, ymm2/m256/m32bcst, imm8u","VPSRLD imm8u, ymm2/m256/m= 32bcst, {k}{z}, ymmV","vpsrld imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","EVEX= .NDD.256.66.0F.W0 72 /2 ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w= ,r,r,r","","" +"VPSRLD zmmV, {k}{z}, zmm2/m512/m32bcst, imm8u","VPSRLD imm8u, zmm2/m512/m= 32bcst, {k}{z}, zmmV","vpsrld imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","EVEX= .NDD.512.66.0F.W0 72 /2 ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","= ","" +"VPSRLD xmm1, xmmV, xmm2/m128","VPSRLD xmm2/m128, xmmV, xmm1","vpsrld xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D2 /r","V","V","AVX","","w,r,r","= ","" +"VPSRLD xmm1, {k}{z}, xmmV, xmm2/m128","VPSRLD xmm2/m128, xmmV, {k}{z}, xm= m1","vpsrld xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 D2 /r","V= ","V","AVX512F+AVX512VL","scale16","w,r,r,r","","" +"VPSRLD ymm1, ymmV, xmm2/m128","VPSRLD xmm2/m128, ymmV, ymm1","vpsrld xmm2= /m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D2 /r","V","V","AVX2","","w,r,r",= "","" +"VPSRLD ymm1, {k}{z}, ymmV, xmm2/m128","VPSRLD xmm2/m128, ymmV, {k}{z}, ym= m1","vpsrld xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 D2 /r","V= ","V","AVX512F+AVX512VL","scale16","w,r,r,r","","" +"VPSRLD zmm1, {k}{z}, zmmV, xmm2/m128","VPSRLD xmm2/m128, zmmV, {k}{z}, zm= m1","vpsrld xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 D2 /r","V= ","V","AVX512F","scale16","w,r,r,r","","" +"VPSRLDQ xmmV, xmm2, imm8u","VPSRLDQ imm8u, xmm2, xmmV","vpsrldq imm8u, xm= m2, xmmV","VEX.NDD.128.66.0F.WIG 73 /3 ib","V","V","AVX","modrm_regonly","w= ,r,r","","" +"VPSRLDQ xmmV, xmm2/m128, imm8u","VPSRLDQ imm8u, xmm2/m128, xmmV","vpsrldq= imm8u, xmm2/m128, xmmV","EVEX.NDD.128.66.0F.WIG 73 /3 ib","V","V","AVX512B= W+AVX512VL","scale16","w,r,r","","" +"VPSRLDQ ymmV, ymm2, imm8u","VPSRLDQ imm8u, ymm2, ymmV","vpsrldq imm8u, ym= m2, ymmV","VEX.NDD.256.66.0F.WIG 73 /3 ib","V","V","AVX2","modrm_regonly","= w,r,r","","" +"VPSRLDQ 
ymmV, ymm2/m256, imm8u","VPSRLDQ imm8u, ymm2/m256, ymmV","vpsrldq= imm8u, ymm2/m256, ymmV","EVEX.NDD.256.66.0F.WIG 73 /3 ib","V","V","AVX512B= W+AVX512VL","scale32","w,r,r","","" +"VPSRLDQ zmmV, zmm2/m512, imm8u","VPSRLDQ imm8u, zmm2/m512, zmmV","vpsrldq= imm8u, zmm2/m512, zmmV","EVEX.NDD.512.66.0F.WIG 73 /3 ib","V","V","AVX512B= W","scale64","w,r,r","","" +"VPSRLQ xmmV, xmm2, imm8u","VPSRLQ imm8u, xmm2, xmmV","vpsrlq imm8u, xmm2,= xmmV","VEX.NDD.128.66.0F.WIG 73 /2 ib","V","V","AVX","modrm_regonly","w,r,= r","","" +"VPSRLQ xmmV, {k}{z}, xmm2/m128/m64bcst, imm8u","VPSRLQ imm8u, xmm2/m128/m= 64bcst, {k}{z}, xmmV","vpsrlq imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","EVEX= .NDD.128.66.0F.W1 73 /2 ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w= ,r,r,r","","" +"VPSRLQ ymmV, ymm2, imm8u","VPSRLQ imm8u, ymm2, ymmV","vpsrlq imm8u, ymm2,= ymmV","VEX.NDD.256.66.0F.WIG 73 /2 ib","V","V","AVX2","modrm_regonly","w,r= ,r","","" +"VPSRLQ ymmV, {k}{z}, ymm2/m256/m64bcst, imm8u","VPSRLQ imm8u, ymm2/m256/m= 64bcst, {k}{z}, ymmV","vpsrlq imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","EVEX= .NDD.256.66.0F.W1 73 /2 ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w= ,r,r,r","","" +"VPSRLQ zmmV, {k}{z}, zmm2/m512/m64bcst, imm8u","VPSRLQ imm8u, zmm2/m512/m= 64bcst, {k}{z}, zmmV","vpsrlq imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","EVEX= .NDD.512.66.0F.W1 73 /2 ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","= ","" +"VPSRLQ xmm1, xmmV, xmm2/m128","VPSRLQ xmm2/m128, xmmV, xmm1","vpsrlq xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D3 /r","V","V","AVX","","w,r,r","= ","" +"VPSRLQ xmm1, {k}{z}, xmmV, xmm2/m128","VPSRLQ xmm2/m128, xmmV, {k}{z}, xm= m1","vpsrlq xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 D3 /r","V= ","V","AVX512F+AVX512VL","scale16","w,r,r,r","","" +"VPSRLQ ymm1, ymmV, xmm2/m128","VPSRLQ xmm2/m128, ymmV, ymm1","vpsrlq xmm2= /m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D3 /r","V","V","AVX2","","w,r,r",= "","" +"VPSRLQ ymm1, {k}{z}, ymmV, xmm2/m128","VPSRLQ xmm2/m128, ymmV, {k}{z}, ym= m1","vpsrlq xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 D3 /r","V= ","V","AVX512F+AVX512VL","scale16","w,r,r,r","","" +"VPSRLQ zmm1, {k}{z}, zmmV, xmm2/m128","VPSRLQ xmm2/m128, zmmV, {k}{z}, zm= m1","vpsrlq xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 D3 /r","V= ","V","AVX512F","scale16","w,r,r,r","","" +"VPSRLVD xmm1, xmmV, xmm2/m128","VPSRLVD xmm2/m128, xmmV, xmm1","vpsrlvd x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 45 /r","V","V","AVX2","","w,r= ,r","","" +"VPSRLVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPSRLVD xmm2/m128/m32bcst= , xmmV, {k}{z}, xmm1","vpsrlvd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W0 45 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,= r,r,r","","" +"VPSRLVD ymm1, ymmV, ymm2/m256","VPSRLVD ymm2/m256, ymmV, ymm1","vpsrlvd y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 45 /r","V","V","AVX2","","w,r= ,r","","" +"VPSRLVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPSRLVD ymm2/m256/m32bcst= , ymmV, {k}{z}, ymm1","vpsrlvd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W0 45 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,= r,r,r","","" +"VPSRLVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPSRLVD zmm2/m512/m32bcst= , zmmV, {k}{z}, zmm1","vpsrlvd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W0 45 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r",""= ,"" +"VPSRLVQ xmm1, xmmV, xmm2/m128","VPSRLVQ xmm2/m128, xmmV, xmm1","vpsrlvq x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W1 45 /r","V","V","AVX2","","w,r= ,r","","" 
+"VPSRLVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPSRLVQ xmm2/m128/m64bcst= , xmmV, {k}{z}, xmm1","vpsrlvq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W1 45 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,= r,r,r","","" +"VPSRLVQ ymm1, ymmV, ymm2/m256","VPSRLVQ ymm2/m256, ymmV, ymm1","vpsrlvq y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W1 45 /r","V","V","AVX2","","w,r= ,r","","" +"VPSRLVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPSRLVQ ymm2/m256/m64bcst= , ymmV, {k}{z}, ymm1","vpsrlvq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W1 45 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,= r,r,r","","" +"VPSRLVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPSRLVQ zmm2/m512/m64bcst= , zmmV, {k}{z}, zmm1","vpsrlvq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W1 45 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r",""= ,"" +"VPSRLVW xmm1, {k}{z}, xmmV, xmm2/m128","VPSRLVW xmm2/m128, xmmV, {k}{z}, = xmm1","vpsrlvw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 10 /= r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSRLVW ymm1, {k}{z}, ymmV, ymm2/m256","VPSRLVW ymm2/m256, ymmV, {k}{z}, = ymm1","vpsrlvw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 10 /= r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPSRLVW zmm1, {k}{z}, zmmV, zmm2/m512","VPSRLVW zmm2/m512, zmmV, {k}{z}, = zmm1","vpsrlvw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 10 /= r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPSRLW xmmV, xmm2, imm8u","VPSRLW imm8u, xmm2, xmmV","vpsrlw imm8u, xmm2,= xmmV","VEX.NDD.128.66.0F.WIG 71 /2 ib","V","V","AVX","modrm_regonly","w,r,= r","","" +"VPSRLW xmmV, {k}{z}, xmm2/m128, imm8u","VPSRLW imm8u, xmm2/m128, {k}{z}, = xmmV","vpsrlw imm8u, xmm2/m128, {k}{z}, xmmV","EVEX.NDD.128.66.0F.WIG 71 /2= ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSRLW ymmV, ymm2, imm8u","VPSRLW imm8u, ymm2, ymmV","vpsrlw imm8u, ymm2,= ymmV","VEX.NDD.256.66.0F.WIG 71 /2 ib","V","V","AVX2","modrm_regonly","w,r= ,r","","" +"VPSRLW ymmV, {k}{z}, ymm2/m256, imm8u","VPSRLW imm8u, ymm2/m256, {k}{z}, = ymmV","vpsrlw imm8u, ymm2/m256, {k}{z}, ymmV","EVEX.NDD.256.66.0F.WIG 71 /2= ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPSRLW zmmV, {k}{z}, zmm2/m512, imm8u","VPSRLW imm8u, zmm2/m512, {k}{z}, = zmmV","vpsrlw imm8u, zmm2/m512, {k}{z}, zmmV","EVEX.NDD.512.66.0F.WIG 71 /2= ib","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPSRLW xmm1, xmmV, xmm2/m128","VPSRLW xmm2/m128, xmmV, xmm1","vpsrlw xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D1 /r","V","V","AVX","","w,r,r","= ","" +"VPSRLW xmm1, {k}{z}, xmmV, xmm2/m128","VPSRLW xmm2/m128, xmmV, {k}{z}, xm= m1","vpsrlw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG D1 /r","= V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSRLW ymm1, ymmV, xmm2/m128","VPSRLW xmm2/m128, ymmV, ymm1","vpsrlw xmm2= /m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D1 /r","V","V","AVX2","","w,r,r",= "","" +"VPSRLW ymm1, {k}{z}, ymmV, xmm2/m128","VPSRLW xmm2/m128, ymmV, {k}{z}, ym= m1","vpsrlw xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG D1 /r","= V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSRLW zmm1, {k}{z}, zmmV, xmm2/m128","VPSRLW xmm2/m128, zmmV, {k}{z}, zm= m1","vpsrlw xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG D1 /r","= V","V","AVX512BW","scale16","w,r,r,r","","" +"VPSUBB xmm1, xmmV, xmm2/m128","VPSUBB xmm2/m128, xmmV, xmm1","vpsubb xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F8 /r","V","V","AVX","","w,r,r","= ","" 
+"VPSUBB xmm1, {k}{z}, xmmV, xmm2/m128","VPSUBB xmm2/m128, xmmV, {k}{z}, xm= m1","vpsubb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG F8 /r","= V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSUBB ymm1, ymmV, ymm2/m256","VPSUBB ymm2/m256, ymmV, ymm1","vpsubb ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F8 /r","V","V","AVX2","","w,r,r",= "","" +"VPSUBB ymm1, {k}{z}, ymmV, ymm2/m256","VPSUBB ymm2/m256, ymmV, {k}{z}, ym= m1","vpsubb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG F8 /r","= V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPSUBB zmm1, {k}{z}, zmmV, zmm2/m512","VPSUBB zmm2/m512, zmmV, {k}{z}, zm= m1","vpsubb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG F8 /r","= V","V","AVX512BW","scale64","w,r,r,r","","" +"VPSUBD xmm1, xmmV, xmm2/m128","VPSUBD xmm2/m128, xmmV, xmm1","vpsubd xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG FA /r","V","V","AVX","","w,r,r","= ","" +"VPSUBD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPSUBD xmm2/m128/m32bcst, = xmmV, {k}{z}, xmm1","vpsubd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F.W0 FA /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r= ","","" +"VPSUBD ymm1, ymmV, ymm2/m256","VPSUBD ymm2/m256, ymmV, ymm1","vpsubd ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG FA /r","V","V","AVX2","","w,r,r",= "","" +"VPSUBD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPSUBD ymm2/m256/m32bcst, = ymmV, {k}{z}, ymm1","vpsubd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F.W0 FA /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r= ","","" +"VPSUBD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPSUBD zmm2/m512/m32bcst, = zmmV, {k}{z}, zmm1","vpsubd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F.W0 FA /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VPSUBQ xmm1, xmmV, xmm2/m128","VPSUBQ xmm2/m128, xmmV, xmm1","vpsubq xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG FB /r","V","V","AVX","","w,r,r","= ","" +"VPSUBQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPSUBQ xmm2/m128/m64bcst, = xmmV, {k}{z}, xmm1","vpsubq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F.W1 FB /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r= ","","" +"VPSUBQ ymm1, ymmV, ymm2/m256","VPSUBQ ymm2/m256, ymmV, ymm1","vpsubq ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG FB /r","V","V","AVX2","","w,r,r",= "","" +"VPSUBQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPSUBQ ymm2/m256/m64bcst, = ymmV, {k}{z}, ymm1","vpsubq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F.W1 FB /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r= ","","" +"VPSUBQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPSUBQ zmm2/m512/m64bcst, = zmmV, {k}{z}, zmm1","vpsubq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F.W1 FB /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VPSUBSB xmm1, xmmV, xmm2/m128","VPSUBSB xmm2/m128, xmmV, xmm1","vpsubsb x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E8 /r","V","V","AVX","","w,r,r= ","","" +"VPSUBSB xmm1, {k}{z}, xmmV, xmm2/m128","VPSUBSB xmm2/m128, xmmV, {k}{z}, = xmm1","vpsubsb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG E8 /r= ","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSUBSB ymm1, ymmV, ymm2/m256","VPSUBSB ymm2/m256, ymmV, ymm1","vpsubsb y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E8 /r","V","V","AVX2","","w,r,= r","","" +"VPSUBSB ymm1, {k}{z}, ymmV, ymm2/m256","VPSUBSB ymm2/m256, ymmV, {k}{z}, = ymm1","vpsubsb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG E8 /r= 
","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPSUBSB zmm1, {k}{z}, zmmV, zmm2/m512","VPSUBSB zmm2/m512, zmmV, {k}{z}, = zmm1","vpsubsb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG E8 /r= ","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPSUBSW xmm1, xmmV, xmm2/m128","VPSUBSW xmm2/m128, xmmV, xmm1","vpsubsw x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E9 /r","V","V","AVX","","w,r,r= ","","" +"VPSUBSW xmm1, {k}{z}, xmmV, xmm2/m128","VPSUBSW xmm2/m128, xmmV, {k}{z}, = xmm1","vpsubsw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG E9 /r= ","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSUBSW ymm1, ymmV, ymm2/m256","VPSUBSW ymm2/m256, ymmV, ymm1","vpsubsw y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E9 /r","V","V","AVX2","","w,r,= r","","" +"VPSUBSW ymm1, {k}{z}, ymmV, ymm2/m256","VPSUBSW ymm2/m256, ymmV, {k}{z}, = ymm1","vpsubsw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG E9 /r= ","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPSUBSW zmm1, {k}{z}, zmmV, zmm2/m512","VPSUBSW zmm2/m512, zmmV, {k}{z}, = zmm1","vpsubsw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG E9 /r= ","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPSUBUSB xmm1, xmmV, xmm2/m128","VPSUBUSB xmm2/m128, xmmV, xmm1","vpsubus= b xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D8 /r","V","V","AVX","","w,= r,r","","" +"VPSUBUSB xmm1, {k}{z}, xmmV, xmm2/m128","VPSUBUSB xmm2/m128, xmmV, {k}{z}= , xmm1","vpsubusb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG D8= /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSUBUSB ymm1, ymmV, ymm2/m256","VPSUBUSB ymm2/m256, ymmV, ymm1","vpsubus= b ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D8 /r","V","V","AVX2","","w= ,r,r","","" +"VPSUBUSB ymm1, {k}{z}, ymmV, ymm2/m256","VPSUBUSB ymm2/m256, ymmV, {k}{z}= , ymm1","vpsubusb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG D8= /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPSUBUSB zmm1, {k}{z}, zmmV, zmm2/m512","VPSUBUSB zmm2/m512, zmmV, {k}{z}= , zmm1","vpsubusb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG D8= /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPSUBUSW xmm1, xmmV, xmm2/m128","VPSUBUSW xmm2/m128, xmmV, xmm1","vpsubus= w xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D9 /r","V","V","AVX","","w,= r,r","","" +"VPSUBUSW xmm1, {k}{z}, xmmV, xmm2/m128","VPSUBUSW xmm2/m128, xmmV, {k}{z}= , xmm1","vpsubusw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG D9= /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSUBUSW ymm1, ymmV, ymm2/m256","VPSUBUSW ymm2/m256, ymmV, ymm1","vpsubus= w ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D9 /r","V","V","AVX2","","w= ,r,r","","" +"VPSUBUSW ymm1, {k}{z}, ymmV, ymm2/m256","VPSUBUSW ymm2/m256, ymmV, {k}{z}= , ymm1","vpsubusw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG D9= /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPSUBUSW zmm1, {k}{z}, zmmV, zmm2/m512","VPSUBUSW zmm2/m512, zmmV, {k}{z}= , zmm1","vpsubusw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG D9= /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPSUBW xmm1, xmmV, xmm2/m128","VPSUBW xmm2/m128, xmmV, xmm1","vpsubw xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F9 /r","V","V","AVX","","w,r,r","= ","" +"VPSUBW xmm1, {k}{z}, xmmV, xmm2/m128","VPSUBW xmm2/m128, xmmV, {k}{z}, xm= m1","vpsubw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG F9 /r","= V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSUBW ymm1, ymmV, ymm2/m256","VPSUBW ymm2/m256, ymmV, 
ymm1","vpsubw ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F9 /r","V","V","AVX2","","w,r,r",= "","" +"VPSUBW ymm1, {k}{z}, ymmV, ymm2/m256","VPSUBW ymm2/m256, ymmV, {k}{z}, ym= m1","vpsubw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG F9 /r","= V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPSUBW zmm1, {k}{z}, zmmV, zmm2/m512","VPSUBW zmm2/m512, zmmV, {k}{z}, zm= m1","vpsubw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG F9 /r","= V","V","AVX512BW","scale64","w,r,r,r","","" +"VPTERNLOGD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u","VPTERNLOGD imm8= u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpternlogd imm8u, xmm2/m128/m32b= cst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F3A.W0 25 /r ib","V","V","AVX512= F+AVX512VL","bscale4,scale16","rw,r,r,r,r","","" +"VPTERNLOGD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VPTERNLOGD imm8= u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpternlogd imm8u, ymm2/m256/m32b= cst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F3A.W0 25 /r ib","V","V","AVX512= F+AVX512VL","bscale4,scale32","rw,r,r,r,r","","" +"VPTERNLOGD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VPTERNLOGD imm8= u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpternlogd imm8u, zmm2/m512/m32b= cst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F3A.W0 25 /r ib","V","V","AVX512= F","bscale4,scale64","rw,r,r,r,r","","" +"VPTERNLOGQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VPTERNLOGQ imm8= u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpternlogq imm8u, xmm2/m128/m64b= cst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F3A.W1 25 /r ib","V","V","AVX512= F+AVX512VL","bscale8,scale16","rw,r,r,r,r","","" +"VPTERNLOGQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VPTERNLOGQ imm8= u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpternlogq imm8u, ymm2/m256/m64b= cst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F3A.W1 25 /r ib","V","V","AVX512= F+AVX512VL","bscale8,scale32","rw,r,r,r,r","","" +"VPTERNLOGQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VPTERNLOGQ imm8= u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpternlogq imm8u, zmm2/m512/m64b= cst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F3A.W1 25 /r ib","V","V","AVX512= F","bscale8,scale64","rw,r,r,r,r","","" +"VPTEST xmm1, xmm2/m128","VPTEST xmm2/m128, xmm1","vptest xmm2/m128, xmm1"= ,"VEX.128.66.0F38.WIG 17 /r","V","V","AVX","","r,r","","" +"VPTEST ymm1, ymm2/m256","VPTEST ymm2/m256, ymm1","vptest ymm2/m256, ymm1"= ,"VEX.256.66.0F38.WIG 17 /r","V","V","AVX","","r,r","","" +"VPTESTMB k1, {k}, xmmV, xmm2/m128","VPTESTMB xmm2/m128, xmmV, {k}, k1","v= ptestmb xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F38.W0 26 /r","V","V","= AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPTESTMB k1, {k}, ymmV, ymm2/m256","VPTESTMB ymm2/m256, ymmV, {k}, k1","v= ptestmb ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F38.W0 26 /r","V","V","= AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPTESTMB k1, {k}, zmmV, zmm2/m512","VPTESTMB zmm2/m512, zmmV, {k}, k1","v= ptestmb zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F38.W0 26 /r","V","V","= AVX512BW","scale64","w,r,r,r","","" +"VPTESTMD k1, {k}, xmmV, xmm2/m128/m32bcst","VPTESTMD xmm2/m128/m32bcst, x= mmV, {k}, k1","vptestmd xmm2/m128/m32bcst, xmmV, {k}, k1","EVEX.NDS.128.66.= 0F38.W0 27 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","","" +"VPTESTMD k1, {k}, ymmV, ymm2/m256/m32bcst","VPTESTMD ymm2/m256/m32bcst, y= mmV, {k}, k1","vptestmd ymm2/m256/m32bcst, ymmV, {k}, k1","EVEX.NDS.256.66.= 0F38.W0 27 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","","" +"VPTESTMD k1, {k}, zmmV, zmm2/m512/m32bcst","VPTESTMD 
zmm2/m512/m32bcst, z= mmV, {k}, k1","vptestmd zmm2/m512/m32bcst, zmmV, {k}, k1","EVEX.NDS.512.66.= 0F38.W0 27 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VPTESTMQ k1, {k}, xmmV, xmm2/m128/m64bcst","VPTESTMQ xmm2/m128/m64bcst, x= mmV, {k}, k1","vptestmq xmm2/m128/m64bcst, xmmV, {k}, k1","EVEX.NDS.128.66.= 0F38.W1 27 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","","" +"VPTESTMQ k1, {k}, ymmV, ymm2/m256/m64bcst","VPTESTMQ ymm2/m256/m64bcst, y= mmV, {k}, k1","vptestmq ymm2/m256/m64bcst, ymmV, {k}, k1","EVEX.NDS.256.66.= 0F38.W1 27 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","","" +"VPTESTMQ k1, {k}, zmmV, zmm2/m512/m64bcst","VPTESTMQ zmm2/m512/m64bcst, z= mmV, {k}, k1","vptestmq zmm2/m512/m64bcst, zmmV, {k}, k1","EVEX.NDS.512.66.= 0F38.W1 27 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VPTESTMW k1, {k}, xmmV, xmm2/m128","VPTESTMW xmm2/m128, xmmV, {k}, k1","v= ptestmw xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F38.W1 26 /r","V","V","= AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPTESTMW k1, {k}, ymmV, ymm2/m256","VPTESTMW ymm2/m256, ymmV, {k}, k1","v= ptestmw ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F38.W1 26 /r","V","V","= AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPTESTMW k1, {k}, zmmV, zmm2/m512","VPTESTMW zmm2/m512, zmmV, {k}, k1","v= ptestmw zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F38.W1 26 /r","V","V","= AVX512BW","scale64","w,r,r,r","","" +"VPTESTNMB k1, {k}, xmmV, xmm2/m128","VPTESTNMB xmm2/m128, xmmV, {k}, k1",= "vptestnmb xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.F3.0F38.W0 26 /r","V","V= ","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPTESTNMB k1, {k}, ymmV, ymm2/m256","VPTESTNMB ymm2/m256, ymmV, {k}, k1",= "vptestnmb ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.F3.0F38.W0 26 /r","V","V= ","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPTESTNMB k1, {k}, zmmV, zmm2/m512","VPTESTNMB zmm2/m512, zmmV, {k}, k1",= "vptestnmb zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.F3.0F38.W0 26 /r","V","V= ","AVX512BW","scale64","w,r,r,r","","" +"VPTESTNMD k1, {k}, xmmV, xmm2/m128/m32bcst","VPTESTNMD xmm2/m128/m32bcst,= xmmV, {k}, k1","vptestnmd xmm2/m128/m32bcst, xmmV, {k}, k1","EVEX.NDS.128.= F3.0F38.W0 27 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r",""= ,"" +"VPTESTNMD k1, {k}, ymmV, ymm2/m256/m32bcst","VPTESTNMD ymm2/m256/m32bcst,= ymmV, {k}, k1","vptestnmd ymm2/m256/m32bcst, ymmV, {k}, k1","EVEX.NDS.256.= F3.0F38.W0 27 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r",""= ,"" +"VPTESTNMD k1, {k}, zmmV, zmm2/m512/m32bcst","VPTESTNMD zmm2/m512/m32bcst,= zmmV, {k}, k1","vptestnmd zmm2/m512/m32bcst, zmmV, {k}, k1","EVEX.NDS.512.= F3.0F38.W0 27 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VPTESTNMQ k1, {k}, xmmV, xmm2/m128/m64bcst","VPTESTNMQ xmm2/m128/m64bcst,= xmmV, {k}, k1","vptestnmq xmm2/m128/m64bcst, xmmV, {k}, k1","EVEX.NDS.128.= F3.0F38.W1 27 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r",""= ,"" +"VPTESTNMQ k1, {k}, ymmV, ymm2/m256/m64bcst","VPTESTNMQ ymm2/m256/m64bcst,= ymmV, {k}, k1","vptestnmq ymm2/m256/m64bcst, ymmV, {k}, k1","EVEX.NDS.256.= F3.0F38.W1 27 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r",""= ,"" +"VPTESTNMQ k1, {k}, zmmV, zmm2/m512/m64bcst","VPTESTNMQ zmm2/m512/m64bcst,= zmmV, {k}, k1","vptestnmq zmm2/m512/m64bcst, zmmV, {k}, k1","EVEX.NDS.512.= F3.0F38.W1 27 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VPTESTNMW k1, {k}, xmmV, xmm2/m128","VPTESTNMW xmm2/m128, xmmV, {k}, k1",= "vptestnmw xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.F3.0F38.W1 
26 /r","V","V= ","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPTESTNMW k1, {k}, ymmV, ymm2/m256","VPTESTNMW ymm2/m256, ymmV, {k}, k1",= "vptestnmw ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.F3.0F38.W1 26 /r","V","V= ","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPTESTNMW k1, {k}, zmmV, zmm2/m512","VPTESTNMW zmm2/m512, zmmV, {k}, k1",= "vptestnmw zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.F3.0F38.W1 26 /r","V","V= ","AVX512BW","scale64","w,r,r,r","","" +"VPUNPCKHBW xmm1, xmmV, xmm2/m128","VPUNPCKHBW xmm2/m128, xmmV, xmm1","vpu= npckhbw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 68 /r","V","V","AVX",= "","w,r,r","","" +"VPUNPCKHBW xmm1, {k}{z}, xmmV, xmm2/m128","VPUNPCKHBW xmm2/m128, xmmV, {k= }{z}, xmm1","vpunpckhbw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.= WIG 68 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPUNPCKHBW ymm1, ymmV, ymm2/m256","VPUNPCKHBW ymm2/m256, ymmV, ymm1","vpu= npckhbw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 68 /r","V","V","AVX2"= ,"","w,r,r","","" +"VPUNPCKHBW ymm1, {k}{z}, ymmV, ymm2/m256","VPUNPCKHBW ymm2/m256, ymmV, {k= }{z}, ymm1","vpunpckhbw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.= WIG 68 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPUNPCKHBW zmm1, {k}{z}, zmmV, zmm2/m512","VPUNPCKHBW zmm2/m512, zmmV, {k= }{z}, zmm1","vpunpckhbw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.= WIG 68 /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPUNPCKHDQ xmm1, xmmV, xmm2/m128","VPUNPCKHDQ xmm2/m128, xmmV, xmm1","vpu= npckhdq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 6A /r","V","V","AVX",= "","w,r,r","","" +"VPUNPCKHDQ xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPUNPCKHDQ xmm2/m128/m= 32bcst, xmmV, {k}{z}, xmm1","vpunpckhdq xmm2/m128/m32bcst, xmmV, {k}{z}, xm= m1","EVEX.NDS.128.66.0F.W0 6A /r","V","V","AVX512F+AVX512VL","bscale4,scale= 16","w,r,r,r","","" +"VPUNPCKHDQ ymm1, ymmV, ymm2/m256","VPUNPCKHDQ ymm2/m256, ymmV, ymm1","vpu= npckhdq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 6A /r","V","V","AVX2"= ,"","w,r,r","","" +"VPUNPCKHDQ ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPUNPCKHDQ ymm2/m256/m= 32bcst, ymmV, {k}{z}, ymm1","vpunpckhdq ymm2/m256/m32bcst, ymmV, {k}{z}, ym= m1","EVEX.NDS.256.66.0F.W0 6A /r","V","V","AVX512F+AVX512VL","bscale4,scale= 32","w,r,r,r","","" +"VPUNPCKHDQ zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPUNPCKHDQ zmm2/m512/m= 32bcst, zmmV, {k}{z}, zmm1","vpunpckhdq zmm2/m512/m32bcst, zmmV, {k}{z}, zm= m1","EVEX.NDS.512.66.0F.W0 6A /r","V","V","AVX512F","bscale4,scale64","w,r,= r,r","","" +"VPUNPCKHQDQ xmm1, xmmV, xmm2/m128","VPUNPCKHQDQ xmm2/m128, xmmV, xmm1","v= punpckhqdq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 6D /r","V","V","AV= X","","w,r,r","","" +"VPUNPCKHQDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPUNPCKHQDQ xmm2/m128= /m64bcst, xmmV, {k}{z}, xmm1","vpunpckhqdq xmm2/m128/m64bcst, xmmV, {k}{z},= xmm1","EVEX.NDS.128.66.0F.W1 6D /r","V","V","AVX512F+AVX512VL","bscale8,sc= ale16","w,r,r,r","","" +"VPUNPCKHQDQ ymm1, ymmV, ymm2/m256","VPUNPCKHQDQ ymm2/m256, ymmV, ymm1","v= punpckhqdq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 6D /r","V","V","AV= X2","","w,r,r","","" +"VPUNPCKHQDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPUNPCKHQDQ ymm2/m256= /m64bcst, ymmV, {k}{z}, ymm1","vpunpckhqdq ymm2/m256/m64bcst, ymmV, {k}{z},= ymm1","EVEX.NDS.256.66.0F.W1 6D /r","V","V","AVX512F+AVX512VL","bscale8,sc= ale32","w,r,r,r","","" +"VPUNPCKHQDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPUNPCKHQDQ zmm2/m512= /m64bcst, zmmV, {k}{z}, zmm1","vpunpckhqdq zmm2/m512/m64bcst, zmmV, {k}{z},= 
zmm1","EVEX.NDS.512.66.0F.W1 6D /r","V","V","AVX512F","bscale8,scale64","w= ,r,r,r","","" +"VPUNPCKHWD xmm1, xmmV, xmm2/m128","VPUNPCKHWD xmm2/m128, xmmV, xmm1","vpu= npckhwd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 69 /r","V","V","AVX",= "","w,r,r","","" +"VPUNPCKHWD xmm1, {k}{z}, xmmV, xmm2/m128","VPUNPCKHWD xmm2/m128, xmmV, {k= }{z}, xmm1","vpunpckhwd xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.= WIG 69 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPUNPCKHWD ymm1, ymmV, ymm2/m256","VPUNPCKHWD ymm2/m256, ymmV, ymm1","vpu= npckhwd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 69 /r","V","V","AVX2"= ,"","w,r,r","","" +"VPUNPCKHWD ymm1, {k}{z}, ymmV, ymm2/m256","VPUNPCKHWD ymm2/m256, ymmV, {k= }{z}, ymm1","vpunpckhwd ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.= WIG 69 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPUNPCKHWD zmm1, {k}{z}, zmmV, zmm2/m512","VPUNPCKHWD zmm2/m512, zmmV, {k= }{z}, zmm1","vpunpckhwd zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.= WIG 69 /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPUNPCKLBW xmm1, xmmV, xmm2/m128","VPUNPCKLBW xmm2/m128, xmmV, xmm1","vpu= npcklbw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 60 /r","V","V","AVX",= "","w,r,r","","" +"VPUNPCKLBW xmm1, {k}{z}, xmmV, xmm2/m128","VPUNPCKLBW xmm2/m128, xmmV, {k= }{z}, xmm1","vpunpcklbw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.= WIG 60 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPUNPCKLBW ymm1, ymmV, ymm2/m256","VPUNPCKLBW ymm2/m256, ymmV, ymm1","vpu= npcklbw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 60 /r","V","V","AVX2"= ,"","w,r,r","","" +"VPUNPCKLBW ymm1, {k}{z}, ymmV, ymm2/m256","VPUNPCKLBW ymm2/m256, ymmV, {k= }{z}, ymm1","vpunpcklbw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.= WIG 60 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPUNPCKLBW zmm1, {k}{z}, zmmV, zmm2/m512","VPUNPCKLBW zmm2/m512, zmmV, {k= }{z}, zmm1","vpunpcklbw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.= WIG 60 /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPUNPCKLDQ xmm1, xmmV, xmm2/m128","VPUNPCKLDQ xmm2/m128, xmmV, xmm1","vpu= npckldq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 62 /r","V","V","AVX",= "","w,r,r","","" +"VPUNPCKLDQ xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPUNPCKLDQ xmm2/m128/m= 32bcst, xmmV, {k}{z}, xmm1","vpunpckldq xmm2/m128/m32bcst, xmmV, {k}{z}, xm= m1","EVEX.NDS.128.66.0F.W0 62 /r","V","V","AVX512F+AVX512VL","bscale4,scale= 16","w,r,r,r","","" +"VPUNPCKLDQ ymm1, ymmV, ymm2/m256","VPUNPCKLDQ ymm2/m256, ymmV, ymm1","vpu= npckldq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 62 /r","V","V","AVX2"= ,"","w,r,r","","" +"VPUNPCKLDQ ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPUNPCKLDQ ymm2/m256/m= 32bcst, ymmV, {k}{z}, ymm1","vpunpckldq ymm2/m256/m32bcst, ymmV, {k}{z}, ym= m1","EVEX.NDS.256.66.0F.W0 62 /r","V","V","AVX512F+AVX512VL","bscale4,scale= 32","w,r,r,r","","" +"VPUNPCKLDQ zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPUNPCKLDQ zmm2/m512/m= 32bcst, zmmV, {k}{z}, zmm1","vpunpckldq zmm2/m512/m32bcst, zmmV, {k}{z}, zm= m1","EVEX.NDS.512.66.0F.W0 62 /r","V","V","AVX512F","bscale4,scale64","w,r,= r,r","","" +"VPUNPCKLQDQ xmm1, xmmV, xmm2/m128","VPUNPCKLQDQ xmm2/m128, xmmV, xmm1","v= punpcklqdq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 6C /r","V","V","AV= X","","w,r,r","","" +"VPUNPCKLQDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPUNPCKLQDQ xmm2/m128= /m64bcst, xmmV, {k}{z}, xmm1","vpunpcklqdq xmm2/m128/m64bcst, xmmV, {k}{z},= xmm1","EVEX.NDS.128.66.0F.W1 6C 
/r","V","V","AVX512F+AVX512VL","bscale8,sc= ale16","w,r,r,r","","" +"VPUNPCKLQDQ ymm1, ymmV, ymm2/m256","VPUNPCKLQDQ ymm2/m256, ymmV, ymm1","v= punpcklqdq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 6C /r","V","V","AV= X2","","w,r,r","","" +"VPUNPCKLQDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPUNPCKLQDQ ymm2/m256= /m64bcst, ymmV, {k}{z}, ymm1","vpunpcklqdq ymm2/m256/m64bcst, ymmV, {k}{z},= ymm1","EVEX.NDS.256.66.0F.W1 6C /r","V","V","AVX512F+AVX512VL","bscale8,sc= ale32","w,r,r,r","","" +"VPUNPCKLQDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPUNPCKLQDQ zmm2/m512= /m64bcst, zmmV, {k}{z}, zmm1","vpunpcklqdq zmm2/m512/m64bcst, zmmV, {k}{z},= zmm1","EVEX.NDS.512.66.0F.W1 6C /r","V","V","AVX512F","bscale8,scale64","w= ,r,r,r","","" +"VPUNPCKLWD xmm1, xmmV, xmm2/m128","VPUNPCKLWD xmm2/m128, xmmV, xmm1","vpu= npcklwd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 61 /r","V","V","AVX",= "","w,r,r","","" +"VPUNPCKLWD xmm1, {k}{z}, xmmV, xmm2/m128","VPUNPCKLWD xmm2/m128, xmmV, {k= }{z}, xmm1","vpunpcklwd xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.= WIG 61 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPUNPCKLWD ymm1, ymmV, ymm2/m256","VPUNPCKLWD ymm2/m256, ymmV, ymm1","vpu= npcklwd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 61 /r","V","V","AVX2"= ,"","w,r,r","","" +"VPUNPCKLWD ymm1, {k}{z}, ymmV, ymm2/m256","VPUNPCKLWD ymm2/m256, ymmV, {k= }{z}, ymm1","vpunpcklwd ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.= WIG 61 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPUNPCKLWD zmm1, {k}{z}, zmmV, zmm2/m512","VPUNPCKLWD zmm2/m512, zmmV, {k= }{z}, zmm1","vpunpcklwd zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.= WIG 61 /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPXOR xmm1, xmmV, xmm2/m128","VPXOR xmm2/m128, xmmV, xmm1","vpxor xmm2/m1= 28, xmmV, xmm1","VEX.NDS.128.66.0F.WIG EF /r","V","V","AVX","","w,r,r","","" +"VPXOR ymm1, ymmV, ymm2/m256","VPXOR ymm2/m256, ymmV, ymm1","vpxor ymm2/m2= 56, ymmV, ymm1","VEX.NDS.256.66.0F.WIG EF /r","V","V","AVX2","","w,r,r","",= "" +"VPXORD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPXORD xmm2/m128/m32bcst, = xmmV, {k}{z}, xmm1","vpxord xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F.W0 EF /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r= ","","" +"VPXORD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPXORD ymm2/m256/m32bcst, = ymmV, {k}{z}, ymm1","vpxord ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F.W0 EF /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r= ","","" +"VPXORD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPXORD zmm2/m512/m32bcst, = zmmV, {k}{z}, zmm1","vpxord zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F.W0 EF /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VPXORQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPXORQ xmm2/m128/m64bcst, = xmmV, {k}{z}, xmm1","vpxorq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F.W1 EF /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r= ","","" +"VPXORQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPXORQ ymm2/m256/m64bcst, = ymmV, {k}{z}, ymm1","vpxorq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F.W1 EF /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r= ","","" +"VPXORQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPXORQ zmm2/m512/m64bcst, = zmmV, {k}{z}, zmm1","vpxorq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F.W1 EF /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VRANGEPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u:4","VRANGEPD imm8u:= 4, xmm2/m128/m64bcst, 
xmmV, {k}{z}, xmm1","vrangepd imm8u:4, xmm2/m128/m64b= cst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 50 /r ib","V","V","AVX512= DQ+AVX512VL","bscale8,scale16","w,r,r,r,r","","" +"VRANGEPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u:4","VRANGEPD imm8u:= 4, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vrangepd imm8u:4, ymm2/m256/m64b= cst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 50 /r ib","V","V","AVX512= DQ+AVX512VL","bscale8,scale32","w,r,r,r,r","","" +"VRANGEPD zmm1{sae}, {k}{z}, zmmV, zmm2, imm8u:4","VRANGEPD imm8u:4, zmm2,= zmmV, {k}{z}, zmm1{sae}","vrangepd imm8u:4, zmm2, zmmV, {k}{z}, zmm1{sae}"= ,"EVEX.NDS.512.66.0F3A.W1 50 /r ib","V","V","AVX512DQ","modrm_regonly","w,r= ,r,r,r","","" +"VRANGEPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u:4","VRANGEPD imm8u:= 4, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vrangepd imm8u:4, zmm2/m512/m64b= cst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 50 /r ib","V","V","AVX512= DQ","bscale8,scale64","w,r,r,r,r","","" +"VRANGEPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u:4","VRANGEPS imm8u:= 4, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vrangeps imm8u:4, xmm2/m128/m32b= cst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W0 50 /r ib","V","V","AVX512= DQ+AVX512VL","bscale4,scale16","w,r,r,r,r","","" +"VRANGEPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u:4","VRANGEPS imm8u:= 4, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vrangeps imm8u:4, ymm2/m256/m32b= cst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 50 /r ib","V","V","AVX512= DQ+AVX512VL","bscale4,scale32","w,r,r,r,r","","" +"VRANGEPS zmm1{sae}, {k}{z}, zmmV, zmm2, imm8u:4","VRANGEPS imm8u:4, zmm2,= zmmV, {k}{z}, zmm1{sae}","vrangeps imm8u:4, zmm2, zmmV, {k}{z}, zmm1{sae}"= ,"EVEX.NDS.512.66.0F3A.W0 50 /r ib","V","V","AVX512DQ","modrm_regonly","w,r= ,r,r,r","","" +"VRANGEPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u:4","VRANGEPS imm8u:= 4, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vrangeps imm8u:4, zmm2/m512/m32b= cst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 50 /r ib","V","V","AVX512= DQ","bscale4,scale64","w,r,r,r,r","","" +"VRANGESD xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u:4","VRANGESD imm8u:4, xmm2,= xmmV, {k}{z}, xmm1{sae}","vrangesd imm8u:4, xmm2, xmmV, {k}{z}, xmm1{sae}"= ,"EVEX.NDS.128.66.0F3A.W1 51 /r ib","V","V","AVX512DQ","modrm_regonly","w,r= ,r,r,r","","" +"VRANGESD xmm1, {k}{z}, xmmV, xmm2/m64, imm8u:4","VRANGESD imm8u:4, xmm2/m= 64, xmmV, {k}{z}, xmm1","vrangesd imm8u:4, xmm2/m64, xmmV, {k}{z}, xmm1","E= VEX.NDS.LIG.66.0F3A.W1 51 /r ib","V","V","AVX512DQ","scale8","w,r,r,r,r",""= ,"" +"VRANGESS xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u:4","VRANGESS imm8u:4, xmm2,= xmmV, {k}{z}, xmm1{sae}","vrangess imm8u:4, xmm2, xmmV, {k}{z}, xmm1{sae}"= ,"EVEX.NDS.128.66.0F3A.W0 51 /r ib","V","V","AVX512DQ","modrm_regonly","w,r= ,r,r,r","","" +"VRANGESS xmm1, {k}{z}, xmmV, xmm2/m32, imm8u:4","VRANGESS imm8u:4, xmm2/m= 32, xmmV, {k}{z}, xmm1","vrangess imm8u:4, xmm2/m32, xmmV, {k}{z}, xmm1","E= VEX.NDS.LIG.66.0F3A.W0 51 /r ib","V","V","AVX512DQ","scale4","w,r,r,r,r",""= ,"" +"VRCP14PD xmm1, {k}{z}, xmm2/m128/m64bcst","VRCP14PD xmm2/m128/m64bcst, {k= }{z}, xmm1","vrcp14pd xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W1= 4C /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","","" +"VRCP14PD ymm1, {k}{z}, ymm2/m256/m64bcst","VRCP14PD ymm2/m256/m64bcst, {k= }{z}, ymm1","vrcp14pd ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W1= 4C /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","","" +"VRCP14PD zmm1, {k}{z}, zmm2/m512/m64bcst","VRCP14PD zmm2/m512/m64bcst, {k= }{z}, zmm1","vrcp14pd 
zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1= 4C /r","V","V","AVX512F","bscale8,scale64","w,r,r","","" +"VRCP14PS xmm1, {k}{z}, xmm2/m128/m32bcst","VRCP14PS xmm2/m128/m32bcst, {k= }{z}, xmm1","vrcp14ps xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W0= 4C /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","","" +"VRCP14PS ymm1, {k}{z}, ymm2/m256/m32bcst","VRCP14PS ymm2/m256/m32bcst, {k= }{z}, ymm1","vrcp14ps ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W0= 4C /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","","" +"VRCP14PS zmm1, {k}{z}, zmm2/m512/m32bcst","VRCP14PS zmm2/m512/m32bcst, {k= }{z}, zmm1","vrcp14ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0= 4C /r","V","V","AVX512F","bscale4,scale64","w,r,r","","" +"VRCP14SD xmm1, {k}{z}, xmmV, xmm2/m64","VRCP14SD xmm2/m64, xmmV, {k}{z}, = xmm1","vrcp14sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W1 4D /= r","V","V","AVX512F","scale8","w,r,r,r","","" +"VRCP14SS xmm1, {k}{z}, xmmV, xmm2/m32","VRCP14SS xmm2/m32, xmmV, {k}{z}, = xmm1","vrcp14ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W0 4D /= r","V","V","AVX512F","scale4","w,r,r,r","","" +"VRCP28PD zmm1{sae}, {k}{z}, zmm2","VRCP28PD zmm2, {k}{z}, zmm1{sae}","vrc= p28pd zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W1 CA /r","V","V","AVX512E= R","modrm_regonly","w,r,r","","" +"VRCP28PD zmm1, {k}{z}, zmm2/m512/m64bcst","VRCP28PD zmm2/m512/m64bcst, {k= }{z}, zmm1","vrcp28pd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1= CA /r","V","V","AVX512ER","bscale8,scale64","w,r,r","","" +"VRCP28PS zmm1{sae}, {k}{z}, zmm2","VRCP28PS zmm2, {k}{z}, zmm1{sae}","vrc= p28ps zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W0 CA /r","V","V","AVX512E= R","modrm_regonly","w,r,r","","" +"VRCP28PS zmm1, {k}{z}, zmm2/m512/m32bcst","VRCP28PS zmm2/m512/m32bcst, {k= }{z}, zmm1","vrcp28ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0= CA /r","V","V","AVX512ER","bscale4,scale64","w,r,r","","" +"VRCP28SD xmm1{sae}, {k}{z}, xmmV, xmm2","VRCP28SD xmm2, xmmV, {k}{z}, xmm= 1{sae}","vrcp28sd xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F38.W1 C= B /r","V","V","AVX512ER","modrm_regonly","w,r,r,r","","" +"VRCP28SD xmm1, {k}{z}, xmmV, xmm2/m64","VRCP28SD xmm2/m64, xmmV, {k}{z}, = xmm1","vrcp28sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W1 CB /= r","V","V","AVX512ER","scale8","w,r,r,r","","" +"VRCP28SS xmm1{sae}, {k}{z}, xmmV, xmm2","VRCP28SS xmm2, xmmV, {k}{z}, xmm= 1{sae}","vrcp28ss xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F38.W0 C= B /r","V","V","AVX512ER","modrm_regonly","w,r,r,r","","" +"VRCP28SS xmm1, {k}{z}, xmmV, xmm2/m32","VRCP28SS xmm2/m32, xmmV, {k}{z}, = xmm1","vrcp28ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W0 CB /= r","V","V","AVX512ER","scale4","w,r,r,r","","" +"VRCPPS xmm1, xmm2/m128","VRCPPS xmm2/m128, xmm1","vrcpps xmm2/m128, xmm1"= ,"VEX.128.0F.WIG 53 /r","V","V","AVX","","w,r","","" +"VRCPPS ymm1, ymm2/m256","VRCPPS ymm2/m256, ymm1","vrcpps ymm2/m256, ymm1"= ,"VEX.256.0F.WIG 53 /r","V","V","AVX","","w,r","","" +"VRCPSS xmm1, xmmV, xmm2/m32","VRCPSS xmm2/m32, xmmV, xmm1","vrcpss xmm2/m= 32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 53 /r","V","V","AVX","","w,r,r","","" +"VREDUCEPD xmm1, {k}{z}, xmm2/m128/m64bcst, imm8u","VREDUCEPD imm8u, xmm2/= m128/m64bcst, {k}{z}, xmm1","vreducepd imm8u, xmm2/m128/m64bcst, {k}{z}, xm= m1","EVEX.128.66.0F3A.W1 56 /r ib","V","V","AVX512DQ+AVX512VL","bscale8,sca= le16","w,r,r,r","","" +"VREDUCEPD ymm1, {k}{z}, ymm2/m256/m64bcst, imm8u","VREDUCEPD imm8u, ymm2/= m256/m64bcst, {k}{z}, 
ymm1","vreducepd imm8u, ymm2/m256/m64bcst, {k}{z}, ym= m1","EVEX.256.66.0F3A.W1 56 /r ib","V","V","AVX512DQ+AVX512VL","bscale8,sca= le32","w,r,r,r","","" +"VREDUCEPD zmm1{sae}, {k}{z}, zmm2, imm8u","VREDUCEPD imm8u, zmm2, {k}{z},= zmm1{sae}","vreducepd imm8u, zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F3A.W1= 56 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r,r","","" +"VREDUCEPD zmm1, {k}{z}, zmm2/m512/m64bcst, imm8u","VREDUCEPD imm8u, zmm2/= m512/m64bcst, {k}{z}, zmm1","vreducepd imm8u, zmm2/m512/m64bcst, {k}{z}, zm= m1","EVEX.512.66.0F3A.W1 56 /r ib","V","V","AVX512DQ","bscale8,scale64","w,= r,r,r","","" +"VREDUCEPS xmm1, {k}{z}, xmm2/m128/m32bcst, imm8u","VREDUCEPS imm8u, xmm2/= m128/m32bcst, {k}{z}, xmm1","vreduceps imm8u, xmm2/m128/m32bcst, {k}{z}, xm= m1","EVEX.128.66.0F3A.W0 56 /r ib","V","V","AVX512DQ+AVX512VL","bscale4,sca= le16","w,r,r,r","","" +"VREDUCEPS ymm1, {k}{z}, ymm2/m256/m32bcst, imm8u","VREDUCEPS imm8u, ymm2/= m256/m32bcst, {k}{z}, ymm1","vreduceps imm8u, ymm2/m256/m32bcst, {k}{z}, ym= m1","EVEX.256.66.0F3A.W0 56 /r ib","V","V","AVX512DQ+AVX512VL","bscale4,sca= le32","w,r,r,r","","" +"VREDUCEPS zmm1{sae}, {k}{z}, zmm2, imm8u","VREDUCEPS imm8u, zmm2, {k}{z},= zmm1{sae}","vreduceps imm8u, zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F3A.W0= 56 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r,r","","" +"VREDUCEPS zmm1, {k}{z}, zmm2/m512/m32bcst, imm8u","VREDUCEPS imm8u, zmm2/= m512/m32bcst, {k}{z}, zmm1","vreduceps imm8u, zmm2/m512/m32bcst, {k}{z}, zm= m1","EVEX.512.66.0F3A.W0 56 /r ib","V","V","AVX512DQ","bscale4,scale64","w,= r,r,r","","" +"VREDUCESD xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u","VREDUCESD imm8u, xmm2, x= mmV, {k}{z}, xmm1{sae}","vreducesd imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","E= VEX.NDS.128.66.0F3A.W1 57 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r,= r,r","","" +"VREDUCESD xmm1, {k}{z}, xmmV, xmm2/m64, imm8u","VREDUCESD imm8u, xmm2/m64= , xmmV, {k}{z}, xmm1","vreducesd imm8u, xmm2/m64, xmmV, {k}{z}, xmm1","EVEX= .NDS.LIG.66.0F3A.W1 57 /r ib","V","V","AVX512DQ","scale8","w,r,r,r,r","","" +"VREDUCESS xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u","VREDUCESS imm8u, xmm2, x= mmV, {k}{z}, xmm1{sae}","vreducess imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","E= VEX.NDS.128.66.0F3A.W0 57 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r,= r,r","","" +"VREDUCESS xmm1, {k}{z}, xmmV, xmm2/m32, imm8u","VREDUCESS imm8u, xmm2/m32= , xmmV, {k}{z}, xmm1","vreducess imm8u, xmm2/m32, xmmV, {k}{z}, xmm1","EVEX= .NDS.LIG.66.0F3A.W0 57 /r ib","V","V","AVX512DQ","scale4","w,r,r,r,r","","" +"VRNDSCALEPD xmm1, {k}{z}, xmm2/m128/m64bcst, imm8u","VRNDSCALEPD imm8u, x= mm2/m128/m64bcst, {k}{z}, xmm1","vrndscalepd imm8u, xmm2/m128/m64bcst, {k}{= z}, xmm1","EVEX.128.66.0F3A.W1 09 /r ib","V","V","AVX512F+AVX512VL","bscale= 8,scale16","w,r,r,r","","" +"VRNDSCALEPD ymm1, {k}{z}, ymm2/m256/m64bcst, imm8u","VRNDSCALEPD imm8u, y= mm2/m256/m64bcst, {k}{z}, ymm1","vrndscalepd imm8u, ymm2/m256/m64bcst, {k}{= z}, ymm1","EVEX.256.66.0F3A.W1 09 /r ib","V","V","AVX512F+AVX512VL","bscale= 8,scale32","w,r,r,r","","" +"VRNDSCALEPD zmm1{sae}, {k}{z}, zmm2, imm8u","VRNDSCALEPD imm8u, zmm2, {k}= {z}, zmm1{sae}","vrndscalepd imm8u, zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0= F3A.W1 09 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VRNDSCALEPD zmm1, {k}{z}, zmm2/m512/m64bcst, imm8u","VRNDSCALEPD imm8u, z= mm2/m512/m64bcst, {k}{z}, zmm1","vrndscalepd imm8u, zmm2/m512/m64bcst, {k}{= z}, zmm1","EVEX.512.66.0F3A.W1 09 /r ib","V","V","AVX512F","bscale8,scale64= ","w,r,r,r","","" +"VRNDSCALEPS xmm1, {k}{z}, xmm2/m128/m32bcst, 
imm8u","VRNDSCALEPS imm8u, x= mm2/m128/m32bcst, {k}{z}, xmm1","vrndscaleps imm8u, xmm2/m128/m32bcst, {k}{= z}, xmm1","EVEX.128.66.0F3A.W0 08 /r ib","V","V","AVX512F+AVX512VL","bscale= 4,scale16","w,r,r,r","","" +"VRNDSCALEPS ymm1, {k}{z}, ymm2/m256/m32bcst, imm8u","VRNDSCALEPS imm8u, y= mm2/m256/m32bcst, {k}{z}, ymm1","vrndscaleps imm8u, ymm2/m256/m32bcst, {k}{= z}, ymm1","EVEX.256.66.0F3A.W0 08 /r ib","V","V","AVX512F+AVX512VL","bscale= 4,scale32","w,r,r,r","","" +"VRNDSCALEPS zmm1{sae}, {k}{z}, zmm2, imm8u","VRNDSCALEPS imm8u, zmm2, {k}= {z}, zmm1{sae}","vrndscaleps imm8u, zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0= F3A.W0 08 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VRNDSCALEPS zmm1, {k}{z}, zmm2/m512/m32bcst, imm8u","VRNDSCALEPS imm8u, z= mm2/m512/m32bcst, {k}{z}, zmm1","vrndscaleps imm8u, zmm2/m512/m32bcst, {k}{= z}, zmm1","EVEX.512.66.0F3A.W0 08 /r ib","V","V","AVX512F","bscale4,scale64= ","w,r,r,r","","" +"VRNDSCALESD xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u","VRNDSCALESD imm8u, xmm= 2, xmmV, {k}{z}, xmm1{sae}","vrndscalesd imm8u, xmm2, xmmV, {k}{z}, xmm1{sa= e}","EVEX.NDS.128.66.0F3A.W1 0B /r ib","V","V","AVX512F","modrm_regonly","w= ,r,r,r,r","","" +"VRNDSCALESD xmm1, {k}{z}, xmmV, xmm2/m64, imm8u","VRNDSCALESD imm8u, xmm2= /m64, xmmV, {k}{z}, xmm1","vrndscalesd imm8u, xmm2/m64, xmmV, {k}{z}, xmm1"= ,"EVEX.NDS.LIG.66.0F3A.W1 0B /r ib","V","V","AVX512F","scale8","w,r,r,r,r",= "","" +"VRNDSCALESS xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u","VRNDSCALESS imm8u, xmm= 2, xmmV, {k}{z}, xmm1{sae}","vrndscaless imm8u, xmm2, xmmV, {k}{z}, xmm1{sa= e}","EVEX.NDS.128.66.0F3A.W0 0A /r ib","V","V","AVX512F","modrm_regonly","w= ,r,r,r,r","","" +"VRNDSCALESS xmm1, {k}{z}, xmmV, xmm2/m32, imm8u","VRNDSCALESS imm8u, xmm2= /m32, xmmV, {k}{z}, xmm1","vrndscaless imm8u, xmm2/m32, xmmV, {k}{z}, xmm1"= ,"EVEX.NDS.LIG.66.0F3A.W0 0A /r ib","V","V","AVX512F","scale4","w,r,r,r,r",= "","" +"VROUNDPD xmm1, xmm2/m128, imm8u","VROUNDPD imm8u, xmm2/m128, xmm1","vroun= dpd imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG 09 /r ib","V","V","AVX",""= ,"w,r,r","","" +"VROUNDPD ymm1, ymm2/m256, imm8u","VROUNDPD imm8u, ymm2/m256, ymm1","vroun= dpd imm8u, ymm2/m256, ymm1","VEX.256.66.0F3A.WIG 09 /r ib","V","V","AVX",""= ,"w,r,r","","" +"VROUNDPS xmm1, xmm2/m128, imm8u","VROUNDPS imm8u, xmm2/m128, xmm1","vroun= dps imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG 08 /r ib","V","V","AVX",""= ,"w,r,r","","" +"VROUNDPS ymm1, ymm2/m256, imm8u","VROUNDPS imm8u, ymm2/m256, ymm1","vroun= dps imm8u, ymm2/m256, ymm1","VEX.256.66.0F3A.WIG 08 /r ib","V","V","AVX",""= ,"w,r,r","","" +"VROUNDSD xmm1, xmmV, xmm2/m64, imm8u","VROUNDSD imm8u, xmm2/m64, xmmV, xm= m1","vroundsd imm8u, xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.WIG 0B /r i= b","V","V","AVX","","w,r,r,r","","" +"VROUNDSS xmm1, xmmV, xmm2/m32, imm8u","VROUNDSS imm8u, xmm2/m32, xmmV, xm= m1","vroundss imm8u, xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.WIG 0A /r i= b","V","V","AVX","","w,r,r,r","","" +"VRSQRT14PD xmm1, {k}{z}, xmm2/m128/m64bcst","VRSQRT14PD xmm2/m128/m64bcst= , {k}{z}, xmm1","vrsqrt14pd xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0= F38.W1 4E /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","","" +"VRSQRT14PD ymm1, {k}{z}, ymm2/m256/m64bcst","VRSQRT14PD ymm2/m256/m64bcst= , {k}{z}, ymm1","vrsqrt14pd ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0= F38.W1 4E /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","","" +"VRSQRT14PD zmm1, {k}{z}, zmm2/m512/m64bcst","VRSQRT14PD zmm2/m512/m64bcst= , {k}{z}, zmm1","vrsqrt14pd zmm2/m512/m64bcst, {k}{z}, 
zmm1","EVEX.512.66.0= F38.W1 4E /r","V","V","AVX512F","bscale8,scale64","w,r,r","","" +"VRSQRT14PS xmm1, {k}{z}, xmm2/m128/m32bcst","VRSQRT14PS xmm2/m128/m32bcst= , {k}{z}, xmm1","vrsqrt14ps xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0= F38.W0 4E /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","","" +"VRSQRT14PS ymm1, {k}{z}, ymm2/m256/m32bcst","VRSQRT14PS ymm2/m256/m32bcst= , {k}{z}, ymm1","vrsqrt14ps ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0= F38.W0 4E /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","","" +"VRSQRT14PS zmm1, {k}{z}, zmm2/m512/m32bcst","VRSQRT14PS zmm2/m512/m32bcst= , {k}{z}, zmm1","vrsqrt14ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0= F38.W0 4E /r","V","V","AVX512F","bscale4,scale64","w,r,r","","" +"VRSQRT14SD xmm1, {k}{z}, xmmV, xmm2/m64","VRSQRT14SD xmm2/m64, xmmV, {k}{= z}, xmm1","vrsqrt14sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W= 1 4F /r","V","V","AVX512F","scale8","w,r,r,r","","" +"VRSQRT14SS xmm1, {k}{z}, xmmV, xmm2/m32","VRSQRT14SS xmm2/m32, xmmV, {k}{= z}, xmm1","vrsqrt14ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W= 0 4F /r","V","V","AVX512F","scale4","w,r,r,r","","" +"VRSQRT28PD zmm1{sae}, {k}{z}, zmm2","VRSQRT28PD zmm2, {k}{z}, zmm1{sae}",= "vrsqrt28pd zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W1 CC /r","V","V","A= VX512ER","modrm_regonly","w,r,r","","" +"VRSQRT28PD zmm1, {k}{z}, zmm2/m512/m64bcst","VRSQRT28PD zmm2/m512/m64bcst= , {k}{z}, zmm1","vrsqrt28pd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0= F38.W1 CC /r","V","V","AVX512ER","bscale8,scale64","w,r,r","","" +"VRSQRT28PS zmm1{sae}, {k}{z}, zmm2","VRSQRT28PS zmm2, {k}{z}, zmm1{sae}",= "vrsqrt28ps zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W0 CC /r","V","V","A= VX512ER","modrm_regonly","w,r,r","","" +"VRSQRT28PS zmm1, {k}{z}, zmm2/m512/m32bcst","VRSQRT28PS zmm2/m512/m32bcst= , {k}{z}, zmm1","vrsqrt28ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0= F38.W0 CC /r","V","V","AVX512ER","bscale4,scale64","w,r,r","","" +"VRSQRT28SD xmm1{sae}, {k}{z}, xmmV, xmm2","VRSQRT28SD xmm2, xmmV, {k}{z},= xmm1{sae}","vrsqrt28sd xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F3= 8.W1 CD /r","V","V","AVX512ER","modrm_regonly","w,r,r,r","","" +"VRSQRT28SD xmm1, {k}{z}, xmmV, xmm2/m64","VRSQRT28SD xmm2/m64, xmmV, {k}{= z}, xmm1","vrsqrt28sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W= 1 CD /r","V","V","AVX512ER","scale8","w,r,r,r","","" +"VRSQRT28SS xmm1{sae}, {k}{z}, xmmV, xmm2","VRSQRT28SS xmm2, xmmV, {k}{z},= xmm1{sae}","vrsqrt28ss xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F3= 8.W0 CD /r","V","V","AVX512ER","modrm_regonly","w,r,r,r","","" +"VRSQRT28SS xmm1, {k}{z}, xmmV, xmm2/m32","VRSQRT28SS xmm2/m32, xmmV, {k}{= z}, xmm1","vrsqrt28ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W= 0 CD /r","V","V","AVX512ER","scale4","w,r,r,r","","" +"VRSQRTPS xmm1, xmm2/m128","VRSQRTPS xmm2/m128, xmm1","vrsqrtps xmm2/m128,= xmm1","VEX.128.0F.WIG 52 /r","V","V","AVX","","w,r","","" +"VRSQRTPS ymm1, ymm2/m256","VRSQRTPS ymm2/m256, ymm1","vrsqrtps ymm2/m256,= ymm1","VEX.256.0F.WIG 52 /r","V","V","AVX","","w,r","","" +"VRSQRTSS xmm1, xmmV, xmm2/m32","VRSQRTSS xmm2/m32, xmmV, xmm1","vrsqrtss = xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 52 /r","V","V","AVX","","w,r,r= ","","" +"VSCALEFPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VSCALEFPD xmm2/m128/m64= bcst, xmmV, {k}{z}, xmm1","vscalefpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.NDS.128.66.0F38.W1 2C /r","V","V","AVX512F+AVX512VL","bscale8,scale1= 6","w,r,r,r","","" +"VSCALEFPD ymm1, {k}{z}, ymmV, 
ymm2/m256/m64bcst","VSCALEFPD ymm2/m256/m64= bcst, ymmV, {k}{z}, ymm1","vscalefpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.NDS.256.66.0F38.W1 2C /r","V","V","AVX512F+AVX512VL","bscale8,scale3= 2","w,r,r,r","","" +"VSCALEFPD zmm1{er}, {k}{z}, zmmV, zmm2","VSCALEFPD zmm2, zmmV, {k}{z}, zm= m1{er}","vscalefpd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.66.0F38.W1 2= C /r","V","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VSCALEFPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VSCALEFPD zmm2/m512/m64= bcst, zmmV, {k}{z}, zmm1","vscalefpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.NDS.512.66.0F38.W1 2C /r","V","V","AVX512F","bscale8,scale64","w,r,r= ,r","","" +"VSCALEFPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VSCALEFPS xmm2/m128/m32= bcst, xmmV, {k}{z}, xmm1","vscalefps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.NDS.128.66.0F38.W0 2C /r","V","V","AVX512F+AVX512VL","bscale4,scale1= 6","w,r,r,r","","" +"VSCALEFPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VSCALEFPS ymm2/m256/m32= bcst, ymmV, {k}{z}, ymm1","vscalefps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.NDS.256.66.0F38.W0 2C /r","V","V","AVX512F+AVX512VL","bscale4,scale3= 2","w,r,r,r","","" +"VSCALEFPS zmm1{er}, {k}{z}, zmmV, zmm2","VSCALEFPS zmm2, zmmV, {k}{z}, zm= m1{er}","vscalefps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.66.0F38.W0 2= C /r","V","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VSCALEFPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VSCALEFPS zmm2/m512/m32= bcst, zmmV, {k}{z}, zmm1","vscalefps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.NDS.512.66.0F38.W0 2C /r","V","V","AVX512F","bscale4,scale64","w,r,r= ,r","","" +"VSCALEFSD xmm1{er}, {k}{z}, xmmV, xmm2","VSCALEFSD xmm2, xmmV, {k}{z}, xm= m1{er}","vscalefsd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.66.0F38.W1 2= D /r","V","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VSCALEFSD xmm1, {k}{z}, xmmV, xmm2/m64","VSCALEFSD xmm2/m64, xmmV, {k}{z}= , xmm1","vscalefsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W1 2= D /r","V","V","AVX512F","scale8","w,r,r,r","","" +"VSCALEFSS xmm1{er}, {k}{z}, xmmV, xmm2","VSCALEFSS xmm2, xmmV, {k}{z}, xm= m1{er}","vscalefss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.66.0F38.W0 2= D /r","V","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VSCALEFSS xmm1, {k}{z}, xmmV, xmm2/m32","VSCALEFSS xmm2/m32, xmmV, {k}{z}= , xmm1","vscalefss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W0 2= D /r","V","V","AVX512F","scale4","w,r,r,r","","" +"VSCATTERDPD vm32x, {k1-k7}, xmm1","VSCATTERDPD xmm1, {k1-k7}, vm32x","vsc= atterdpd xmm1, {k1-k7}, vm32x","EVEX.128.66.0F38.W1 A2 /vsib","V","V","AVX5= 12F+AVX512VL","modrm_memonly,scale8","w,rw,r","","" +"VSCATTERDPD vm32x, {k1-k7}, ymm1","VSCATTERDPD ymm1, {k1-k7}, vm32x","vsc= atterdpd ymm1, {k1-k7}, vm32x","EVEX.256.66.0F38.W1 A2 /vsib","V","V","AVX5= 12F+AVX512VL","modrm_memonly,scale8","w,rw,r","","" +"VSCATTERDPD vm32y, {k1-k7}, zmm1","VSCATTERDPD zmm1, {k1-k7}, vm32y","vsc= atterdpd zmm1, {k1-k7}, vm32y","EVEX.512.66.0F38.W1 A2 /vsib","V","V","AVX5= 12F","modrm_memonly,scale8","w,rw,r","","" +"VSCATTERDPS vm32x, {k1-k7}, xmm1","VSCATTERDPS xmm1, {k1-k7}, vm32x","vsc= atterdps xmm1, {k1-k7}, vm32x","EVEX.128.66.0F38.W0 A2 /vsib","V","V","AVX5= 12F+AVX512VL","modrm_memonly,scale4","w,rw,r","","" +"VSCATTERDPS vm32y, {k1-k7}, ymm1","VSCATTERDPS ymm1, {k1-k7}, vm32y","vsc= atterdps ymm1, {k1-k7}, vm32y","EVEX.256.66.0F38.W0 A2 /vsib","V","V","AVX5= 12F+AVX512VL","modrm_memonly,scale4","w,rw,r","","" +"VSCATTERDPS vm32z, {k1-k7}, zmm1","VSCATTERDPS zmm1, {k1-k7}, vm32z","vsc= atterdps 
zmm1, {k1-k7}, vm32z","EVEX.512.66.0F38.W0 A2 /vsib","V","V","AVX5= 12F","modrm_memonly,scale4","w,rw,r","","" +"VSCATTERPF0DPD vm32y, {k1-k7}","VSCATTERPF0DPD {k1-k7}, vm32y","vscatterp= f0dpd {k1-k7}, vm32y","EVEX.512.66.0F38.W1 C6 /5","V","V","AVX512PF","modrm= _memonly,scale8","r,rw","","" +"VSCATTERPF0DPS vm32z, {k1-k7}","VSCATTERPF0DPS {k1-k7}, vm32z","vscatterp= f0dps {k1-k7}, vm32z","EVEX.512.66.0F38.W0 C6 /5","V","V","AVX512PF","modrm= _memonly,scale4","r,rw","","" +"VSCATTERPF0QPD vm64z, {k1-k7}","VSCATTERPF0QPD {k1-k7}, vm64z","vscatterp= f0qpd {k1-k7}, vm64z","EVEX.512.66.0F38.W1 C7 /5","V","V","AVX512PF","modrm= _memonly,scale8","r,rw","","" +"VSCATTERPF0QPS vm64z, {k1-k7}","VSCATTERPF0QPS {k1-k7}, vm64z","vscatterp= f0qps {k1-k7}, vm64z","EVEX.512.66.0F38.W0 C7 /5","V","V","AVX512PF","modrm= _memonly,scale4","r,rw","","" +"VSCATTERPF1DPD vm32y, {k1-k7}","VSCATTERPF1DPD {k1-k7}, vm32y","vscatterp= f1dpd {k1-k7}, vm32y","EVEX.512.66.0F38.W1 C6 /6","V","V","AVX512PF","modrm= _memonly,scale8","r,rw","","" +"VSCATTERPF1DPS vm32z, {k1-k7}","VSCATTERPF1DPS {k1-k7}, vm32z","vscatterp= f1dps {k1-k7}, vm32z","EVEX.512.66.0F38.W0 C6 /6","V","V","AVX512PF","modrm= _memonly,scale4","r,rw","","" +"VSCATTERPF1QPD vm64z, {k1-k7}","VSCATTERPF1QPD {k1-k7}, vm64z","vscatterp= f1qpd {k1-k7}, vm64z","EVEX.512.66.0F38.W1 C7 /6","V","V","AVX512PF","modrm= _memonly,scale8","r,rw","","" +"VSCATTERPF1QPS vm64z, {k1-k7}","VSCATTERPF1QPS {k1-k7}, vm64z","vscatterp= f1qps {k1-k7}, vm64z","EVEX.512.66.0F38.W0 C7 /6","V","V","AVX512PF","modrm= _memonly,scale4","r,rw","","" +"VSCATTERQPD vm64x, {k1-k7}, xmm1","VSCATTERQPD xmm1, {k1-k7}, vm64x","vsc= atterqpd xmm1, {k1-k7}, vm64x","EVEX.128.66.0F38.W1 A3 /vsib","V","V","AVX5= 12F+AVX512VL","modrm_memonly,scale8","w,rw,r","","" +"VSCATTERQPD vm64y, {k1-k7}, ymm1","VSCATTERQPD ymm1, {k1-k7}, vm64y","vsc= atterqpd ymm1, {k1-k7}, vm64y","EVEX.256.66.0F38.W1 A3 /vsib","V","V","AVX5= 12F+AVX512VL","modrm_memonly,scale8","w,rw,r","","" +"VSCATTERQPD vm64z, {k1-k7}, zmm1","VSCATTERQPD zmm1, {k1-k7}, vm64z","vsc= atterqpd zmm1, {k1-k7}, vm64z","EVEX.512.66.0F38.W1 A3 /vsib","V","V","AVX5= 12F","modrm_memonly,scale8","w,rw,r","","" +"VSCATTERQPS vm64x, {k1-k7}, xmm1","VSCATTERQPS xmm1, {k1-k7}, vm64x","vsc= atterqps xmm1, {k1-k7}, vm64x","EVEX.128.66.0F38.W0 A3 /vsib","V","V","AVX5= 12F+AVX512VL","modrm_memonly,scale4","w,rw,r","","" +"VSCATTERQPS vm64y, {k1-k7}, xmm1","VSCATTERQPS xmm1, {k1-k7}, vm64y","vsc= atterqps xmm1, {k1-k7}, vm64y","EVEX.256.66.0F38.W0 A3 /vsib","V","V","AVX5= 12F+AVX512VL","modrm_memonly,scale4","w,rw,r","","" +"VSCATTERQPS vm64z, {k1-k7}, ymm1","VSCATTERQPS ymm1, {k1-k7}, vm64z","vsc= atterqps ymm1, {k1-k7}, vm64z","EVEX.512.66.0F38.W0 A3 /vsib","V","V","AVX5= 12F","modrm_memonly,scale4","w,rw,r","","" +"VSHUFF32X4 ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VSHUFF32X4 imm8= u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vshuff32x4 imm8u, ymm2/m256/m32b= cst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 23 /r ib","V","V","AVX512= F+AVX512VL","bscale4,scale32","w,r,r,r,r","","" +"VSHUFF32X4 zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VSHUFF32X4 imm8= u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vshuff32x4 imm8u, zmm2/m512/m32b= cst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 23 /r ib","V","V","AVX512= F","bscale4,scale64","w,r,r,r,r","","" +"VSHUFF64X2 ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VSHUFF64X2 imm8= u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vshuff64x2 imm8u, ymm2/m256/m64b= cst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 
23 /r ib","V","V","AVX512= F+AVX512VL","bscale8,scale32","w,r,r,r,r","","" +"VSHUFF64X2 zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VSHUFF64X2 imm8= u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vshuff64x2 imm8u, zmm2/m512/m64b= cst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 23 /r ib","V","V","AVX512= F","bscale8,scale64","w,r,r,r,r","","" +"VSHUFI32X4 ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VSHUFI32X4 imm8= u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vshufi32x4 imm8u, ymm2/m256/m32b= cst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 43 /r ib","V","V","AVX512= F+AVX512VL","bscale4,scale32","w,r,r,r,r","","" +"VSHUFI32X4 zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VSHUFI32X4 imm8= u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vshufi32x4 imm8u, zmm2/m512/m32b= cst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 43 /r ib","V","V","AVX512= F","bscale4,scale64","w,r,r,r,r","","" +"VSHUFI64X2 ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VSHUFI64X2 imm8= u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vshufi64x2 imm8u, ymm2/m256/m64b= cst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 43 /r ib","V","V","AVX512= F+AVX512VL","bscale8,scale32","w,r,r,r,r","","" +"VSHUFI64X2 zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VSHUFI64X2 imm8= u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vshufi64x2 imm8u, zmm2/m512/m64b= cst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 43 /r ib","V","V","AVX512= F","bscale8,scale64","w,r,r,r,r","","" +"VSHUFPD xmm1, xmmV, xmm2/m128, imm8u","VSHUFPD imm8u, xmm2/m128, xmmV, xm= m1","vshufpd imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG C6 /r ib"= ,"V","V","AVX","","w,r,r,r","","" +"VSHUFPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VSHUFPD imm8u, xmm= 2/m128/m64bcst, xmmV, {k}{z}, xmm1","vshufpd imm8u, xmm2/m128/m64bcst, xmmV= , {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 C6 /r ib","V","V","AVX512F+AVX512VL"= ,"bscale8,scale16","w,r,r,r,r","","" +"VSHUFPD ymm1, ymmV, ymm2/m256, imm8u","VSHUFPD imm8u, ymm2/m256, ymmV, ym= m1","vshufpd imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG C6 /r ib"= ,"V","V","AVX","","w,r,r,r","","" +"VSHUFPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VSHUFPD imm8u, ymm= 2/m256/m64bcst, ymmV, {k}{z}, ymm1","vshufpd imm8u, ymm2/m256/m64bcst, ymmV= , {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 C6 /r ib","V","V","AVX512F+AVX512VL"= ,"bscale8,scale32","w,r,r,r,r","","" +"VSHUFPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VSHUFPD imm8u, zmm= 2/m512/m64bcst, zmmV, {k}{z}, zmm1","vshufpd imm8u, zmm2/m512/m64bcst, zmmV= , {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 C6 /r ib","V","V","AVX512F","bscale8= ,scale64","w,r,r,r,r","","" +"VSHUFPS xmm1, xmmV, xmm2/m128, imm8u","VSHUFPS imm8u, xmm2/m128, xmmV, xm= m1","vshufps imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG C6 /r ib","V= ","V","AVX","","w,r,r,r","","" +"VSHUFPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u","VSHUFPS imm8u, xmm= 2/m128/m32bcst, xmmV, {k}{z}, xmm1","vshufps imm8u, xmm2/m128/m32bcst, xmmV= , {k}{z}, xmm1","EVEX.NDS.128.0F.W0 C6 /r ib","V","V","AVX512F+AVX512VL","b= scale4,scale16","w,r,r,r,r","","" +"VSHUFPS ymm1, ymmV, ymm2/m256, imm8u","VSHUFPS imm8u, ymm2/m256, ymmV, ym= m1","vshufps imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG C6 /r ib","V= ","V","AVX","","w,r,r,r","","" +"VSHUFPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VSHUFPS imm8u, ymm= 2/m256/m32bcst, ymmV, {k}{z}, ymm1","vshufps imm8u, ymm2/m256/m32bcst, ymmV= , {k}{z}, ymm1","EVEX.NDS.256.0F.W0 C6 /r ib","V","V","AVX512F+AVX512VL","b= scale4,scale32","w,r,r,r,r","","" +"VSHUFPS zmm1, {k}{z}, zmmV, 
zmm2/m512/m32bcst, imm8u","VSHUFPS imm8u, zmm= 2/m512/m32bcst, zmmV, {k}{z}, zmm1","vshufps imm8u, zmm2/m512/m32bcst, zmmV= , {k}{z}, zmm1","EVEX.NDS.512.0F.W0 C6 /r ib","V","V","AVX512F","bscale4,sc= ale64","w,r,r,r,r","","" +"VSQRTPD xmm1, xmm2/m128","VSQRTPD xmm2/m128, xmm1","vsqrtpd xmm2/m128, xm= m1","VEX.128.66.0F.WIG 51 /r","V","V","AVX","","w,r","","" +"VSQRTPD xmm1, {k}{z}, xmm2/m128/m64bcst","VSQRTPD xmm2/m128/m64bcst, {k}{= z}, xmm1","vsqrtpd xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F.W1 51 /= r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","","" +"VSQRTPD ymm1, ymm2/m256","VSQRTPD ymm2/m256, ymm1","vsqrtpd ymm2/m256, ym= m1","VEX.256.66.0F.WIG 51 /r","V","V","AVX","","w,r","","" +"VSQRTPD ymm1, {k}{z}, ymm2/m256/m64bcst","VSQRTPD ymm2/m256/m64bcst, {k}{= z}, ymm1","vsqrtpd ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F.W1 51 /= r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","","" +"VSQRTPD zmm1{er}, {k}{z}, zmm2","VSQRTPD zmm2, {k}{z}, zmm1{er}","vsqrtpd= zmm2, {k}{z}, zmm1{er}","EVEX.512.66.0F.W1 51 /r","V","V","AVX512F","modrm= _regonly","w,r,r","","" +"VSQRTPD zmm1, {k}{z}, zmm2/m512/m64bcst","VSQRTPD zmm2/m512/m64bcst, {k}{= z}, zmm1","vsqrtpd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F.W1 51 /= r","V","V","AVX512F","bscale8,scale64","w,r,r","","" +"VSQRTPS xmm1, xmm2/m128","VSQRTPS xmm2/m128, xmm1","vsqrtps xmm2/m128, xm= m1","VEX.128.0F.WIG 51 /r","V","V","AVX","","w,r","","" +"VSQRTPS xmm1, {k}{z}, xmm2/m128/m32bcst","VSQRTPS xmm2/m128/m32bcst, {k}{= z}, xmm1","vsqrtps xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.0F.W0 51 /r",= "V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","","" +"VSQRTPS ymm1, ymm2/m256","VSQRTPS ymm2/m256, ymm1","vsqrtps ymm2/m256, ym= m1","VEX.256.0F.WIG 51 /r","V","V","AVX","","w,r","","" +"VSQRTPS ymm1, {k}{z}, ymm2/m256/m32bcst","VSQRTPS ymm2/m256/m32bcst, {k}{= z}, ymm1","vsqrtps ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.0F.W0 51 /r",= "V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","","" +"VSQRTPS zmm1{er}, {k}{z}, zmm2","VSQRTPS zmm2, {k}{z}, zmm1{er}","vsqrtps= zmm2, {k}{z}, zmm1{er}","EVEX.512.0F.W0 51 /r","V","V","AVX512F","modrm_re= gonly","w,r,r","","" +"VSQRTPS zmm1, {k}{z}, zmm2/m512/m32bcst","VSQRTPS zmm2/m512/m32bcst, {k}{= z}, zmm1","vsqrtps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.0F.W0 51 /r",= "V","V","AVX512F","bscale4,scale64","w,r,r","","" +"VSQRTSD xmm1{er}, {k}{z}, xmmV, xmm2","VSQRTSD xmm2, xmmV, {k}{z}, xmm1{e= r}","vsqrtsd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F2.0F.W1 51 /r","V= ","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VSQRTSD xmm1, xmmV, xmm2/m64","VSQRTSD xmm2/m64, xmmV, xmm1","vsqrtsd xmm= 2/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 51 /r","V","V","AVX","","w,r,r","= ","" +"VSQRTSD xmm1, {k}{z}, xmmV, xmm2/m64","VSQRTSD xmm2/m64, xmmV, {k}{z}, xm= m1","vsqrtsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 51 /r","V= ","V","AVX512F","scale8","w,r,r,r","","" +"VSQRTSS xmm1{er}, {k}{z}, xmmV, xmm2","VSQRTSS xmm2, xmmV, {k}{z}, xmm1{e= r}","vsqrtss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F3.0F.W0 51 /r","V= ","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VSQRTSS xmm1, xmmV, xmm2/m32","VSQRTSS xmm2/m32, xmmV, xmm1","vsqrtss xmm= 2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 51 /r","V","V","AVX","","w,r,r","= ","" +"VSQRTSS xmm1, {k}{z}, xmmV, xmm2/m32","VSQRTSS xmm2/m32, xmmV, {k}{z}, xm= m1","vsqrtss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 51 /r","V= ","V","AVX512F","scale4","w,r,r,r","","" +"VSTMXCSR m32","VSTMXCSR m32","vstmxcsr 
m32","VEX.128.0F.WIG AE /3","V","V= ","AVX","modrm_memonly","w","","" +"VSUBPD xmm1, xmmV, xmm2/m128","VSUBPD xmm2/m128, xmmV, xmm1","vsubpd xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 5C /r","V","V","AVX","","w,r,r","= ","" +"VSUBPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VSUBPD xmm2/m128/m64bcst, = xmmV, {k}{z}, xmm1","vsubpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F.W1 5C /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r= ","","" +"VSUBPD ymm1, ymmV, ymm2/m256","VSUBPD ymm2/m256, ymmV, ymm1","vsubpd ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 5C /r","V","V","AVX","","w,r,r","= ","" +"VSUBPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VSUBPD ymm2/m256/m64bcst, = ymmV, {k}{z}, ymm1","vsubpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F.W1 5C /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r= ","","" +"VSUBPD zmm1{er}, {k}{z}, zmmV, zmm2","VSUBPD zmm2, zmmV, {k}{z}, zmm1{er}= ","vsubpd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.66.0F.W1 5C /r","V","= V","AVX512F","modrm_regonly","w,r,r,r","","" +"VSUBPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VSUBPD zmm2/m512/m64bcst, = zmmV, {k}{z}, zmm1","vsubpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F.W1 5C /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VSUBPS xmm1, xmmV, xmm2/m128","VSUBPS xmm2/m128, xmmV, xmm1","vsubps xmm2= /m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 5C /r","V","V","AVX","","w,r,r","","" +"VSUBPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VSUBPS xmm2/m128/m32bcst, = xmmV, {k}{z}, xmm1","vsubps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.0F.W0 5C /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","= ","" +"VSUBPS ymm1, ymmV, ymm2/m256","VSUBPS ymm2/m256, ymmV, ymm1","vsubps ymm2= /m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 5C /r","V","V","AVX","","w,r,r","","" +"VSUBPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VSUBPS ymm2/m256/m32bcst, = ymmV, {k}{z}, ymm1","vsubps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.0F.W0 5C /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","= ","" +"VSUBPS zmm1{er}, {k}{z}, zmmV, zmm2","VSUBPS zmm2, zmmV, {k}{z}, zmm1{er}= ","vsubps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.0F.W0 5C /r","V","V",= "AVX512F","modrm_regonly","w,r,r,r","","" +"VSUBPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VSUBPS zmm2/m512/m32bcst, = zmmV, {k}{z}, zmm1","vsubps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.0F.W0 5C /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VSUBSD xmm1{er}, {k}{z}, xmmV, xmm2","VSUBSD xmm2, xmmV, {k}{z}, xmm1{er}= ","vsubsd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F2.0F.W1 5C /r","V","= V","AVX512F","modrm_regonly","w,r,r,r","","" +"VSUBSD xmm1, xmmV, xmm2/m64","VSUBSD xmm2/m64, xmmV, xmm1","vsubsd xmm2/m= 64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 5C /r","V","V","AVX","","w,r,r","","" +"VSUBSD xmm1, {k}{z}, xmmV, xmm2/m64","VSUBSD xmm2/m64, xmmV, {k}{z}, xmm1= ","vsubsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 5C /r","V","= V","AVX512F","scale8","w,r,r,r","","" +"VSUBSS xmm1{er}, {k}{z}, xmmV, xmm2","VSUBSS xmm2, xmmV, {k}{z}, xmm1{er}= ","vsubss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F3.0F.W0 5C /r","V","= V","AVX512F","modrm_regonly","w,r,r,r","","" +"VSUBSS xmm1, xmmV, xmm2/m32","VSUBSS xmm2/m32, xmmV, xmm1","vsubss xmm2/m= 32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 5C /r","V","V","AVX","","w,r,r","","" +"VSUBSS xmm1, {k}{z}, xmmV, xmm2/m32","VSUBSS xmm2/m32, xmmV, {k}{z}, xmm1= ","vsubss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 5C /r","V","= 
V","AVX512F","scale4","w,r,r,r","","" +"VTESTPD xmm1, xmm2/m128","VTESTPD xmm2/m128, xmm1","vtestpd xmm2/m128, xm= m1","VEX.128.66.0F38.W0 0F /r","V","V","AVX","","r,r","","" +"VTESTPD ymm1, ymm2/m256","VTESTPD ymm2/m256, ymm1","vtestpd ymm2/m256, ym= m1","VEX.256.66.0F38.W0 0F /r","V","V","AVX","","r,r","","" +"VTESTPS xmm1, xmm2/m128","VTESTPS xmm2/m128, xmm1","vtestps xmm2/m128, xm= m1","VEX.128.66.0F38.W0 0E /r","V","V","AVX","","r,r","","" +"VTESTPS ymm1, ymm2/m256","VTESTPS ymm2/m256, ymm1","vtestps ymm2/m256, ym= m1","VEX.256.66.0F38.W0 0E /r","V","V","AVX","","r,r","","" +"VUCOMISD xmm1{sae}, xmm2","VUCOMISD xmm2, xmm1{sae}","vucomisd xmm2, xmm1= {sae}","EVEX.128.66.0F.W1 2E /r","V","V","AVX512F","modrm_regonly","r,r",""= ,"" +"VUCOMISD xmm1, xmm2/m64","VUCOMISD xmm2/m64, xmm1","vucomisd xmm2/m64, xm= m1","EVEX.LIG.66.0F.W1 2E /r","V","V","AVX512F","scale8","r,r","","" +"VUCOMISD xmm1, xmm2/m64","VUCOMISD xmm2/m64, xmm1","vucomisd xmm2/m64, xm= m1","VEX.LIG.66.0F.WIG 2E /r","V","V","AVX","","r,r","","" +"VUCOMISS xmm1{sae}, xmm2","VUCOMISS xmm2, xmm1{sae}","vucomiss xmm2, xmm1= {sae}","EVEX.128.0F.W0 2E /r","V","V","AVX512F","modrm_regonly","r,r","","" +"VUCOMISS xmm1, xmm2/m32","VUCOMISS xmm2/m32, xmm1","vucomiss xmm2/m32, xm= m1","EVEX.LIG.0F.W0 2E /r","V","V","AVX512F","scale4","r,r","","" +"VUCOMISS xmm1, xmm2/m32","VUCOMISS xmm2/m32, xmm1","vucomiss xmm2/m32, xm= m1","VEX.LIG.0F.WIG 2E /r","V","V","AVX","","r,r","","" +"VUNPCKHPD xmm1, xmmV, xmm2/m128","VUNPCKHPD xmm2/m128, xmmV, xmm1","vunpc= khpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 15 /r","V","V","AVX","",= "w,r,r","","" +"VUNPCKHPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VUNPCKHPD xmm2/m128/m64= bcst, xmmV, {k}{z}, xmm1","vunpckhpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.NDS.128.66.0F.W1 15 /r","V","V","AVX512F+AVX512VL","bscale8,scale16"= ,"w,r,r,r","","" +"VUNPCKHPD ymm1, ymmV, ymm2/m256","VUNPCKHPD ymm2/m256, ymmV, ymm1","vunpc= khpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 15 /r","V","V","AVX","",= "w,r,r","","" +"VUNPCKHPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VUNPCKHPD ymm2/m256/m64= bcst, ymmV, {k}{z}, ymm1","vunpckhpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.NDS.256.66.0F.W1 15 /r","V","V","AVX512F+AVX512VL","bscale8,scale32"= ,"w,r,r,r","","" +"VUNPCKHPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VUNPCKHPD zmm2/m512/m64= bcst, zmmV, {k}{z}, zmm1","vunpckhpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.NDS.512.66.0F.W1 15 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r= ","","" +"VUNPCKHPS xmm1, xmmV, xmm2/m128","VUNPCKHPS xmm2/m128, xmmV, xmm1","vunpc= khps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 15 /r","V","V","AVX","","w,= r,r","","" +"VUNPCKHPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VUNPCKHPS xmm2/m128/m32= bcst, xmmV, {k}{z}, xmm1","vunpckhps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.NDS.128.0F.W0 15 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w= ,r,r,r","","" +"VUNPCKHPS ymm1, ymmV, ymm2/m256","VUNPCKHPS ymm2/m256, ymmV, ymm1","vunpc= khps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 15 /r","V","V","AVX","","w,= r,r","","" +"VUNPCKHPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VUNPCKHPS ymm2/m256/m32= bcst, ymmV, {k}{z}, ymm1","vunpckhps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.NDS.256.0F.W0 15 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w= ,r,r,r","","" +"VUNPCKHPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VUNPCKHPS zmm2/m512/m32= bcst, zmmV, {k}{z}, zmm1","vunpckhps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.NDS.512.0F.W0 15 
/r","V","V","AVX512F","bscale4,scale64","w,r,r,r","= ","" +"VUNPCKLPD xmm1, xmmV, xmm2/m128","VUNPCKLPD xmm2/m128, xmmV, xmm1","vunpc= klpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 14 /r","V","V","AVX","",= "w,r,r","","" +"VUNPCKLPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VUNPCKLPD xmm2/m128/m64= bcst, xmmV, {k}{z}, xmm1","vunpcklpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.NDS.128.66.0F.W1 14 /r","V","V","AVX512F+AVX512VL","bscale8,scale16"= ,"w,r,r,r","","" +"VUNPCKLPD ymm1, ymmV, ymm2/m256","VUNPCKLPD ymm2/m256, ymmV, ymm1","vunpc= klpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 14 /r","V","V","AVX","",= "w,r,r","","" +"VUNPCKLPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VUNPCKLPD ymm2/m256/m64= bcst, ymmV, {k}{z}, ymm1","vunpcklpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.NDS.256.66.0F.W1 14 /r","V","V","AVX512F+AVX512VL","bscale8,scale32"= ,"w,r,r,r","","" +"VUNPCKLPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VUNPCKLPD zmm2/m512/m64= bcst, zmmV, {k}{z}, zmm1","vunpcklpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.NDS.512.66.0F.W1 14 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r= ","","" +"VUNPCKLPS xmm1, xmmV, xmm2/m128","VUNPCKLPS xmm2/m128, xmmV, xmm1","vunpc= klps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 14 /r","V","V","AVX","","w,= r,r","","" +"VUNPCKLPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VUNPCKLPS xmm2/m128/m32= bcst, xmmV, {k}{z}, xmm1","vunpcklps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.NDS.128.0F.W0 14 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w= ,r,r,r","","" +"VUNPCKLPS ymm1, ymmV, ymm2/m256","VUNPCKLPS ymm2/m256, ymmV, ymm1","vunpc= klps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 14 /r","V","V","AVX","","w,= r,r","","" +"VUNPCKLPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VUNPCKLPS ymm2/m256/m32= bcst, ymmV, {k}{z}, ymm1","vunpcklps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.NDS.256.0F.W0 14 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w= ,r,r,r","","" +"VUNPCKLPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VUNPCKLPS zmm2/m512/m32= bcst, zmmV, {k}{z}, zmm1","vunpcklps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.NDS.512.0F.W0 14 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","= ","" +"VXORPD xmm1, xmmV, xmm2/m128","VXORPD xmm2/m128, xmmV, xmm1","vxorpd xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 57 /r","V","V","AVX","","w,r,r","= ","" +"VXORPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VXORPD xmm2/m128/m64bcst, = xmmV, {k}{z}, xmm1","vxorpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F.W1 57 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r,= r","","" +"VXORPD ymm1, ymmV, ymm2/m256","VXORPD ymm2/m256, ymmV, ymm1","vxorpd ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 57 /r","V","V","AVX","","w,r,r","= ","" +"VXORPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VXORPD ymm2/m256/m64bcst, = ymmV, {k}{z}, ymm1","vxorpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F.W1 57 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r,= r","","" +"VXORPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VXORPD zmm2/m512/m64bcst, = zmmV, {k}{z}, zmm1","vxorpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F.W1 57 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r,r","","" +"VXORPS xmm1, xmmV, xmm2/m128","VXORPS xmm2/m128, xmmV, xmm1","vxorps xmm2= /m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 57 /r","V","V","AVX","","w,r,r","","" +"VXORPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VXORPS xmm2/m128/m32bcst, = xmmV, {k}{z}, xmm1","vxorps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.0F.W0 57 
/r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r,r",= "","" +"VXORPS ymm1, ymmV, ymm2/m256","VXORPS ymm2/m256, ymmV, ymm1","vxorps ymm2= /m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 57 /r","V","V","AVX","","w,r,r","","" +"VXORPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VXORPS ymm2/m256/m32bcst, = ymmV, {k}{z}, ymm1","vxorps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.0F.W0 57 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale32","w,r,r,r",= "","" +"VXORPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VXORPS zmm2/m512/m32bcst, = zmmV, {k}{z}, zmm1","vxorps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.0F.W0 57 /r","V","V","AVX512DQ","bscale4,scale64","w,r,r,r","","" +"VZEROALL","VZEROALL","vzeroall","VEX.256.0F.WIG 77","V","V","AVX","","","= ","" +"VZEROUPPER","VZEROUPPER","vzeroupper","VEX.128.0F.WIG 77","V","V","AVX","= ","","","" +"WAIT","WAIT","wait","9B","V","V","","pseudo","","","" +"WBINVD","WBINVD","wbinvd","0F 09","V","V","486","","","","" +"WRFSBASE rmr32","WRFSBASE rmr32","wrfsbase rmr32","F3 0F AE /2","N.S.","V= ","FSGSBASE","modrm_regonly,operand16,operand32","r","Y","32" +"WRFSBASE rmr64","WRFSBASE rmr64","wrfsbase rmr64","F3 REX.W 0F AE /2","N.= S.","V","FSGSBASE","modrm_regonly","r","Y","64" +"WRGSBASE rmr32","WRGSBASE rmr32","wrgsbase rmr32","F3 0F AE /3","N.S.","V= ","FSGSBASE","modrm_regonly,operand16,operand32","r","Y","32" +"WRGSBASE rmr64","WRGSBASE rmr64","wrgsbase rmr64","F3 REX.W 0F AE /3","N.= S.","V","FSGSBASE","modrm_regonly","r","Y","64" +"WRMSR","WRMSR","wrmsr","0F 30","V","V","Pentium","","","","" +"WRPKRU","WRPKRU","wrpkru","0F 01 EF","V","V","PKU","","","","" +"WRSSD m32, r32","WRSSD r32, m32","wrssd r32, m32","0F 38 F6 /r","V","V","= CET","modrm_memonly,operand16,operand32","w,r","","" +"WRSSQ m64, r64","WRSSQ r64, m64","wrssq r64, m64","REX.W 0F 38 F6 /r","N.= S.","V","CET","modrm_memonly","w,r","","" +"WRUSSD m32, r32","WRUSSD r32, m32","wrussd r32, m32","66 0F 38 F5 /r","V"= ,"V","CET","modrm_memonly,operand16,operand32","w,r","","" +"WRUSSQ m64, r64","WRUSSQ r64, m64","wrussq r64, m64","66 REX.W 0F 38 F5 /= r","N.S.","V","CET","modrm_memonly","w,r","","" +"XABORT imm8u","XABORT imm8u","xabort imm8u","C6 F8 ib","V","V","RTM","mod= rm_regonly","r","","" +"XACQUIRE","XACQUIRE","xacquire","F2","V","V","HLE","pseudo","","","" +"XADD r/m8, r8","XADDB r8, r/m8","xaddb r8, r/m8","0F C0 /r","V","V","486"= ,"","rw,rw","Y","8" +"XADD r/m8, r8","XADDB r8, r/m8","xaddb r8, r/m8","REX 0F C0 /r","N.E.","V= ","","pseudo64","rw,w","Y","8" +"XADD r/m32, r32","XADDL r32, r/m32","xaddl r32, r/m32","0F C1 /r","V","V"= ,"486","operand32","rw,rw","Y","32" +"XADD r/m64, r64","XADDQ r64, r/m64","xaddq r64, r/m64","REX.W 0F C1 /r","= N.S.","V","486","","rw,rw","Y","64" +"XADD r/m16, r16","XADDW r16, r/m16","xaddw r16, r/m16","0F C1 /r","V","V"= ,"486","operand16","rw,rw","Y","16" +"XBEGIN rel16","XBEGIN rel16","xbegin rel16","C7 F8 cw","V","V","RTM","mod= rm_regonly,operand16","r","","" +"XBEGIN rel32","XBEGIN rel32","xbegin rel32","C7 F8 cd","V","V","RTM","mod= rm_regonly,operand32,operand64","r","","" +"XCHG r8, r/m8","XCHGB r/m8, r8","xchgb r/m8, r8","86 /r","V","V","","pseu= do","w,r","Y","8" +"XCHG r8, r/m8","XCHGB r/m8, r8","xchgb r/m8, r8","REX 86 /r","N.E.","V","= ","pseudo","w,r","Y","8" +"XCHG r/m8, r8","XCHGB r8, r/m8","xchgb r8, r/m8","86 /r","V","V","","","r= w,rw","Y","8" +"XCHG r/m8, r8","XCHGB r8, r/m8","xchgb r8, r/m8","REX 86 /r","N.E.","V","= ","pseudo64","rw,r","Y","8" +"XCHG r32op, EAX","XCHGL EAX, r32op","xchgl EAX, r32op","90+rd","V","V",""= 
,"operand32","rw,rw","Y","32" +"XCHG r32, r/m32","XCHGL r/m32, r32","xchgl r/m32, r32","87 /r","V","V",""= ,"operand32,pseudo","w,r","Y","32" +"XCHG r/m32, r32","XCHGL r32, r/m32","xchgl r32, r/m32","87 /r","V","V",""= ,"operand32","rw,rw","Y","32" +"XCHG EAX, r32op","XCHGL r32op, EAX","xchgl r32op, EAX","90+rd","V","V",""= ,"operand32,pseudo","rw,rw","Y","32" +"XCHG r64op, RAX","XCHGQ RAX, r64op","xchgq RAX, r64op","REX.W 90+ro","N.S= .","V","","","rw,rw","Y","64" +"XCHG r64, r/m64","XCHGQ r/m64, r64","xchgq r/m64, r64","REX.W 87 /r","N.E= .","V","","pseudo","w,r","Y","64" +"XCHG r/m64, r64","XCHGQ r64, r/m64","xchgq r64, r/m64","REX.W 87 /r","N.S= .","V","","","rw,rw","Y","64" +"XCHG RAX, r64op","XCHGQ r64op, RAX","xchgq r64op, RAX","REX.W 90+rd","N.E= .","V","","pseudo","rw,rw","Y","64" +"XCHG r16op, AX","XCHGW AX, r16op","xchgw AX, r16op","90+rw","V","V","","o= perand16","rw,rw","Y","16" +"XCHG r16, r/m16","XCHGW r/m16, r16","xchgw r/m16, r16","87 /r","V","V",""= ,"operand16,pseudo","w,r","Y","16" +"XCHG r/m16, r16","XCHGW r16, r/m16","xchgw r16, r/m16","87 /r","V","V",""= ,"operand16","rw,rw","Y","16" +"XCHG AX, r16op","XCHGW r16op, AX","xchgw r16op, AX","90+rw","V","V","","o= perand16,pseudo","rw,rw","Y","16" +"XEND","XEND","xend","0F 01 D5","V","V","RTM","","","","" +"XGETBV","XGETBV","xgetbv","0F 01 D0","V","V","XSAVE","","","","" +"XLATB","XLAT","xlat","D7","V","V","","","","","" +"XLATB","XLAT","xlat","REX.W D7","N.E.","V","","pseudo","","","" +"XOR r/m8, imm8","XORB imm8, r/m8","xorb imm8, r/m8","REX 80 /6 ib","N.E."= ,"V","","pseudo64","rw,r","Y","8" +"XOR AL, imm8u","XORB imm8u, AL","xorb imm8u, AL","34 ib","V","V","","","r= w,r","Y","8" +"XOR r/m8, imm8u","XORB imm8u, r/m8","xorb imm8u, r/m8","80 /6 ib","V","V"= ,"","","rw,r","Y","8" +"XOR r/m8, imm8u","XORB imm8u, r/m8","xorb imm8u, r/m8","82 /6 ib","V","N.= S.","","","rw,r","Y","8" +"XOR r8, r/m8","XORB r/m8, r8","xorb r/m8, r8","32 /r","V","V","","","rw,r= ","Y","8" +"XOR r8, r/m8","XORB r/m8, r8","xorb r/m8, r8","REX 32 /r","N.E.","V","","= pseudo64","rw,r","Y","8" +"XOR r/m8, r8","XORB r8, r/m8","xorb r8, r/m8","30 /r","V","V","","","rw,r= ","Y","8" +"XOR r/m8, r8","XORB r8, r/m8","xorb r8, r/m8","REX 30 /r","N.E.","V","","= pseudo64","rw,r","Y","8" +"XOR EAX, imm32","XORL imm32, EAX","xorl imm32, EAX","35 id","V","V","","o= perand32","rw,r","Y","32" +"XOR r/m32, imm32","XORL imm32, r/m32","xorl imm32, r/m32","81 /6 id","V",= "V","","operand32","rw,r","Y","32" +"XOR r/m32, imm8","XORL imm8, r/m32","xorl imm8, r/m32","83 /6 ib","V","V"= ,"","operand32","rw,r","Y","32" +"XOR r32, r/m32","XORL r/m32, r32","xorl r/m32, r32","33 /r","V","V","","o= perand32","rw,r","Y","32" +"XOR r/m32, r32","XORL r32, r/m32","xorl r32, r/m32","31 /r","V","V","","o= perand32","rw,r","Y","32" +"XORPD xmm1, xmm2/m128","XORPD xmm2/m128, xmm1","xorpd xmm2/m128, xmm1","6= 6 0F 57 /r","V","V","SSE2","","rw,r","","" +"XORPS xmm1, xmm2/m128","XORPS xmm2/m128, xmm1","xorps xmm2/m128, xmm1","0= F 57 /r","V","V","SSE","","rw,r","","" +"XOR RAX, imm32","XORQ imm32, RAX","xorq imm32, RAX","REX.W 35 id","N.S.",= "V","","","rw,r","Y","64" +"XOR r/m64, imm32","XORQ imm32, r/m64","xorq imm32, r/m64","REX.W 81 /6 id= ","N.S.","V","","","rw,r","Y","64" +"XOR r/m64, imm8","XORQ imm8, r/m64","xorq imm8, r/m64","REX.W 83 /6 ib","= N.S.","V","","","rw,r","Y","64" +"XOR r64, r/m64","XORQ r/m64, r64","xorq r/m64, r64","REX.W 33 /r","N.S.",= "V","","","rw,r","Y","64" +"XOR r/m64, r64","XORQ r64, r/m64","xorq r64, r/m64","REX.W 31 /r","N.S.",= "V","","","rw,r","Y","64" 
+"XOR AX, imm16","XORW imm16, AX","xorw imm16, AX","35 iw","V","V","","oper= and16","rw,r","Y","16" +"XOR r/m16, imm16","XORW imm16, r/m16","xorw imm16, r/m16","81 /6 iw","V",= "V","","operand16","rw,r","Y","16" +"XOR r/m16, imm8","XORW imm8, r/m16","xorw imm8, r/m16","83 /6 ib","V","V"= ,"","operand16","rw,r","Y","16" +"XOR r16, r/m16","XORW r/m16, r16","xorw r/m16, r16","33 /r","V","V","","o= perand16","rw,r","Y","16" +"XOR r/m16, r16","XORW r16, r/m16","xorw r16, r/m16","31 /r","V","V","","o= perand16","rw,r","Y","16" +"XRELEASE","XRELEASE","xrelease","F3","V","V","HLE","pseudo","","","" +"XRSTOR mem","XRSTOR mem","xrstor mem","0F AE /5","V","V","XSAVE","modrm_m= emonly,operand16,operand32","r","","" +"XRSTOR64 mem","XRSTOR64 mem","xrstor64 mem","REX.W 0F AE /5","N.S.","V","= XSAVE","modrm_memonly","r","","" +"XRSTORS mem","XRSTORS mem","xrstors mem","0F C7 /3","V","V","XSAVES","mod= rm_memonly,operand16,operand32","r","","" +"XRSTORS64 mem","XRSTORS64 mem","xrstors64 mem","REX.W 0F C7 /3","N.S.","V= ","XSAVES","modrm_memonly","r","","" +"XSAVE mem","XSAVE mem","xsave mem","0F AE /4","V","V","XSAVE","modrm_memo= nly,operand16,operand32","w","","" +"XSAVE64 mem","XSAVE64 mem","xsave64 mem","REX.W 0F AE /4","N.S.","V","XSA= VE","modrm_memonly","w","","" +"XSAVEC mem","XSAVEC mem","xsavec mem","0F C7 /4","V","V","XSAVEC","modrm_= memonly,operand16,operand32","w","","" +"XSAVEC64 mem","XSAVEC64 mem","xsavec64 mem","REX.W 0F C7 /4","N.S.","V","= XSAVEC","modrm_memonly","w","","" +"XSAVEOPT mem","XSAVEOPT mem","xsaveopt mem","0F AE /6","V","V","XSAVEOPT"= ,"modrm_memonly,operand16,operand32","w","","" +"XSAVEOPT64 mem","XSAVEOPT64 mem","xsaveopt64 mem","REX.W 0F AE /6","N.S."= ,"V","XSAVEOPT","modrm_memonly","w","","" +"XSAVES mem","XSAVES mem","xsaves mem","0F C7 /5","V","V","XSAVES","modrm_= memonly,operand16,operand32","w","","" +"XSAVES64 mem","XSAVES64 mem","xsaves64 mem","REX.W 0F C7 /5","N.S.","V","= XSAVES","modrm_memonly","w","","" +"XSETBV","XSETBV","xsetbv","0F 01 D1","V","V","XSAVE","","","","" +"XTEST","XTEST","xtest","0F 01 D6","V","V","HLE or RTM","","","","" --=20 2.35.2 From nobody Thu May 2 14:26:59 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1650838476332882.7521946630164; Sun, 24 Apr 2022 15:14:36 -0700 (PDT) Received: from localhost ([::1]:36204 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nikV0-0002yg-EH for importer@patchew.org; Sun, 24 Apr 2022 18:14:35 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:49352) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nikJE-000570-Mc for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:02:30 -0400 Received: from nowt.default.pbrook.uk0.bigv.io ([2001:41c8:51:832:fcff:ff:fe00:46dd]:58681) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nikJA-0001L3-LI for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:02:24 -0400 Received: from cpc91554-seac25-2-0-cust857.7-2.cable.virginm.net ([82.27.199.90] helo=wren.home) by nowt.default.pbrook.uk0.bigv.io with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1nikJ5-0001ea-Lo; Sun, 24 Apr 2022 23:02:15 +0100 From: Paul 
Brook To: Paolo Bonzini , Richard Henderson , Eduardo Habkost Subject: [PATCH v2 05/42] i386: Rework sse_op_table6/7 Date: Sun, 24 Apr 2022 23:01:27 +0100 Message-Id: <20220424220204.2493824-6-paul@nowt.org> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220418173904.3746036-1-paul@nowt.org> References: <20220418173904.3746036-1-paul@nowt.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2001:41c8:51:832:fcff:ff:fe00:46dd; envelope-from=paul@nowt.org; helo=nowt.default.pbrook.uk0.bigv.io X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "open list:All patches CC here" , Paul Brook Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZM-MESSAGEID: 1650838476705100001 Content-Type: text/plain; charset="utf-8" Add a flags field each row in sse_op_table6 and sse_op_table7. Initially this is only used as a replacement for the magic SSE41_SPECIAL pointer. The other flags will become relevant as the rest of the avx implementation is built out. Signed-off-by: Paul Brook --- target/i386/tcg/translate.c | 232 ++++++++++++++++++++---------------- 1 file changed, 132 insertions(+), 100 deletions(-) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index 7fec582358..5335b86c01 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -2977,7 +2977,6 @@ static const struct SSEOpHelper_table1 sse_op_table1[= 256] =3D { #undef SSE_SPECIAL =20 #define MMX_OP2(x) { gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm } -#define SSE_SPECIAL_FN ((void *)1) =20 static const SSEFunc_0_epp sse_op_table2[3 * 8][2] =3D { [0 + 2] =3D MMX_OP2(psrlw), @@ -3061,113 +3060,134 @@ static const SSEFunc_0_epp sse_op_table5[256] =3D= { [0xbf] =3D gen_helper_pavgb_mmx /* pavgusb */ }; =20 -struct SSEOpHelper_epp { +struct SSEOpHelper_table6 { SSEFunc_0_epp op[2]; uint32_t ext_mask; + int flags; }; =20 -struct SSEOpHelper_eppi { +struct SSEOpHelper_table7 { SSEFunc_0_eppi op[2]; uint32_t ext_mask; + int flags; }; =20 -#define SSSE3_OP(x) { MMX_OP2(x), CPUID_EXT_SSSE3 } -#define SSE41_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, CPUID_EXT_SSE41 } -#define SSE42_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, CPUID_EXT_SSE42 } -#define SSE41_SPECIAL { { NULL, SSE_SPECIAL_FN }, CPUID_EXT_SSE41 } -#define PCLMULQDQ_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, \ - CPUID_EXT_PCLMULQDQ } -#define AESNI_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, CPUID_EXT_AES } - -static const struct SSEOpHelper_epp sse_op_table6[256] =3D { - [0x00] =3D SSSE3_OP(pshufb), - [0x01] =3D SSSE3_OP(phaddw), - [0x02] =3D SSSE3_OP(phaddd), - [0x03] =3D SSSE3_OP(phaddsw), - [0x04] =3D SSSE3_OP(pmaddubsw), - [0x05] =3D SSSE3_OP(phsubw), - [0x06] =3D SSSE3_OP(phsubd), - [0x07] =3D SSSE3_OP(phsubsw), - [0x08] =3D SSSE3_OP(psignb), - [0x09] =3D SSSE3_OP(psignw), - [0x0a] =3D SSSE3_OP(psignd), - [0x0b] =3D SSSE3_OP(pmulhrsw), - [0x10] =3D SSE41_OP(pblendvb), - [0x14] =3D SSE41_OP(blendvps), - [0x15] =3D 
SSE41_OP(blendvpd), - [0x17] =3D SSE41_OP(ptest), - [0x1c] =3D SSSE3_OP(pabsb), - [0x1d] =3D SSSE3_OP(pabsw), - [0x1e] =3D SSSE3_OP(pabsd), - [0x20] =3D SSE41_OP(pmovsxbw), - [0x21] =3D SSE41_OP(pmovsxbd), - [0x22] =3D SSE41_OP(pmovsxbq), - [0x23] =3D SSE41_OP(pmovsxwd), - [0x24] =3D SSE41_OP(pmovsxwq), - [0x25] =3D SSE41_OP(pmovsxdq), - [0x28] =3D SSE41_OP(pmuldq), - [0x29] =3D SSE41_OP(pcmpeqq), - [0x2a] =3D SSE41_SPECIAL, /* movntqda */ - [0x2b] =3D SSE41_OP(packusdw), - [0x30] =3D SSE41_OP(pmovzxbw), - [0x31] =3D SSE41_OP(pmovzxbd), - [0x32] =3D SSE41_OP(pmovzxbq), - [0x33] =3D SSE41_OP(pmovzxwd), - [0x34] =3D SSE41_OP(pmovzxwq), - [0x35] =3D SSE41_OP(pmovzxdq), - [0x37] =3D SSE42_OP(pcmpgtq), - [0x38] =3D SSE41_OP(pminsb), - [0x39] =3D SSE41_OP(pminsd), - [0x3a] =3D SSE41_OP(pminuw), - [0x3b] =3D SSE41_OP(pminud), - [0x3c] =3D SSE41_OP(pmaxsb), - [0x3d] =3D SSE41_OP(pmaxsd), - [0x3e] =3D SSE41_OP(pmaxuw), - [0x3f] =3D SSE41_OP(pmaxud), - [0x40] =3D SSE41_OP(pmulld), - [0x41] =3D SSE41_OP(phminposuw), - [0xdb] =3D AESNI_OP(aesimc), - [0xdc] =3D AESNI_OP(aesenc), - [0xdd] =3D AESNI_OP(aesenclast), - [0xde] =3D AESNI_OP(aesdec), - [0xdf] =3D AESNI_OP(aesdeclast), +#define gen_helper_special_xmm NULL + +#define OP(name, op, flags, ext, mmx_name) \ + {{mmx_name, gen_helper_ ## name ## _xmm}, CPUID_EXT_ ## ext, flags} +#define BINARY_OP_MMX(name, ext) \ + OP(name, op2, SSE_OPF_MMX, ext, gen_helper_ ## name ## _mmx) +#define BINARY_OP(name, ext, flags) \ + OP(name, op2, flags, ext, NULL) +#define UNARY_OP_MMX(name, ext) \ + OP(name, op1, SSE_OPF_V0 | SSE_OPF_MMX, ext, gen_helper_ ## name ## _m= mx) +#define UNARY_OP(name, ext, flags) \ + OP(name, op1, SSE_OPF_V0 | flags, ext, NULL) +#define BLENDV_OP(name, ext, flags) OP(name, op3, SSE_OPF_BLENDV, ext, NUL= L) +#define CMP_OP(name, ext) OP(name, op1, SSE_OPF_CMP | SSE_OPF_V0, ext, NUL= L) +#define SPECIAL_OP(ext) OP(special, op1, SSE_OPF_SPECIAL, ext, NULL) + +/* prefix [66] 0f 38 */ +static const struct SSEOpHelper_table6 sse_op_table6[256] =3D { + [0x00] =3D BINARY_OP_MMX(pshufb, SSSE3), + [0x01] =3D BINARY_OP_MMX(phaddw, SSSE3), + [0x02] =3D BINARY_OP_MMX(phaddd, SSSE3), + [0x03] =3D BINARY_OP_MMX(phaddsw, SSSE3), + [0x04] =3D BINARY_OP_MMX(pmaddubsw, SSSE3), + [0x05] =3D BINARY_OP_MMX(phsubw, SSSE3), + [0x06] =3D BINARY_OP_MMX(phsubd, SSSE3), + [0x07] =3D BINARY_OP_MMX(phsubsw, SSSE3), + [0x08] =3D BINARY_OP_MMX(psignb, SSSE3), + [0x09] =3D BINARY_OP_MMX(psignw, SSSE3), + [0x0a] =3D BINARY_OP_MMX(psignd, SSSE3), + [0x0b] =3D BINARY_OP_MMX(pmulhrsw, SSSE3), + [0x10] =3D BLENDV_OP(pblendvb, SSE41, SSE_OPF_MMX), + [0x14] =3D BLENDV_OP(blendvps, SSE41, 0), + [0x15] =3D BLENDV_OP(blendvpd, SSE41, 0), + [0x17] =3D CMP_OP(ptest, SSE41), + [0x1c] =3D UNARY_OP_MMX(pabsb, SSSE3), + [0x1d] =3D UNARY_OP_MMX(pabsw, SSSE3), + [0x1e] =3D UNARY_OP_MMX(pabsd, SSSE3), + [0x20] =3D UNARY_OP(pmovsxbw, SSE41, SSE_OPF_MMX), + [0x21] =3D UNARY_OP(pmovsxbd, SSE41, SSE_OPF_MMX), + [0x22] =3D UNARY_OP(pmovsxbq, SSE41, SSE_OPF_MMX), + [0x23] =3D UNARY_OP(pmovsxwd, SSE41, SSE_OPF_MMX), + [0x24] =3D UNARY_OP(pmovsxwq, SSE41, SSE_OPF_MMX), + [0x25] =3D UNARY_OP(pmovsxdq, SSE41, SSE_OPF_MMX), + [0x28] =3D BINARY_OP(pmuldq, SSE41, SSE_OPF_MMX), + [0x29] =3D BINARY_OP(pcmpeqq, SSE41, SSE_OPF_MMX), + [0x2a] =3D SPECIAL_OP(SSE41), /* movntqda */ + [0x2b] =3D BINARY_OP(packusdw, SSE41, SSE_OPF_MMX), + [0x30] =3D UNARY_OP(pmovzxbw, SSE41, SSE_OPF_MMX), + [0x31] =3D UNARY_OP(pmovzxbd, SSE41, SSE_OPF_MMX), + [0x32] =3D UNARY_OP(pmovzxbq, SSE41, SSE_OPF_MMX), + [0x33] =3D 
UNARY_OP(pmovzxwd, SSE41, SSE_OPF_MMX), + [0x34] =3D UNARY_OP(pmovzxwq, SSE41, SSE_OPF_MMX), + [0x35] =3D UNARY_OP(pmovzxdq, SSE41, SSE_OPF_MMX), + [0x37] =3D BINARY_OP(pcmpgtq, SSE41, SSE_OPF_MMX), + [0x38] =3D BINARY_OP(pminsb, SSE41, SSE_OPF_MMX), + [0x39] =3D BINARY_OP(pminsd, SSE41, SSE_OPF_MMX), + [0x3a] =3D BINARY_OP(pminuw, SSE41, SSE_OPF_MMX), + [0x3b] =3D BINARY_OP(pminud, SSE41, SSE_OPF_MMX), + [0x3c] =3D BINARY_OP(pmaxsb, SSE41, SSE_OPF_MMX), + [0x3d] =3D BINARY_OP(pmaxsd, SSE41, SSE_OPF_MMX), + [0x3e] =3D BINARY_OP(pmaxuw, SSE41, SSE_OPF_MMX), + [0x3f] =3D BINARY_OP(pmaxud, SSE41, SSE_OPF_MMX), + [0x40] =3D BINARY_OP(pmulld, SSE41, SSE_OPF_MMX), + [0x41] =3D UNARY_OP(phminposuw, SSE41, 0), + [0xdb] =3D UNARY_OP(aesimc, AES, 0), + [0xdc] =3D BINARY_OP(aesenc, AES, 0), + [0xdd] =3D BINARY_OP(aesenclast, AES, 0), + [0xde] =3D BINARY_OP(aesdec, AES, 0), + [0xdf] =3D BINARY_OP(aesdeclast, AES, 0), }; =20 -static const struct SSEOpHelper_eppi sse_op_table7[256] =3D { - [0x08] =3D SSE41_OP(roundps), - [0x09] =3D SSE41_OP(roundpd), - [0x0a] =3D SSE41_OP(roundss), - [0x0b] =3D SSE41_OP(roundsd), - [0x0c] =3D SSE41_OP(blendps), - [0x0d] =3D SSE41_OP(blendpd), - [0x0e] =3D SSE41_OP(pblendw), - [0x0f] =3D SSSE3_OP(palignr), - [0x14] =3D SSE41_SPECIAL, /* pextrb */ - [0x15] =3D SSE41_SPECIAL, /* pextrw */ - [0x16] =3D SSE41_SPECIAL, /* pextrd/pextrq */ - [0x17] =3D SSE41_SPECIAL, /* extractps */ - [0x20] =3D SSE41_SPECIAL, /* pinsrb */ - [0x21] =3D SSE41_SPECIAL, /* insertps */ - [0x22] =3D SSE41_SPECIAL, /* pinsrd/pinsrq */ - [0x40] =3D SSE41_OP(dpps), - [0x41] =3D SSE41_OP(dppd), - [0x42] =3D SSE41_OP(mpsadbw), - [0x44] =3D PCLMULQDQ_OP(pclmulqdq), - [0x60] =3D SSE42_OP(pcmpestrm), - [0x61] =3D SSE42_OP(pcmpestri), - [0x62] =3D SSE42_OP(pcmpistrm), - [0x63] =3D SSE42_OP(pcmpistri), - [0xdf] =3D AESNI_OP(aeskeygenassist), +/* prefix [66] 0f 3a */ +static const struct SSEOpHelper_table7 sse_op_table7[256] =3D { + [0x08] =3D UNARY_OP(roundps, SSE41, 0), + [0x09] =3D UNARY_OP(roundpd, SSE41, 0), + [0x0a] =3D UNARY_OP(roundss, SSE41, SSE_OPF_SCALAR), + [0x0b] =3D UNARY_OP(roundsd, SSE41, SSE_OPF_SCALAR), + [0x0c] =3D BINARY_OP(blendps, SSE41, 0), + [0x0d] =3D BINARY_OP(blendpd, SSE41, 0), + [0x0e] =3D BINARY_OP(pblendw, SSE41, SSE_OPF_MMX), + [0x0f] =3D BINARY_OP_MMX(palignr, SSSE3), + [0x14] =3D SPECIAL_OP(SSE41), /* pextrb */ + [0x15] =3D SPECIAL_OP(SSE41), /* pextrw */ + [0x16] =3D SPECIAL_OP(SSE41), /* pextrd/pextrq */ + [0x17] =3D SPECIAL_OP(SSE41), /* extractps */ + [0x20] =3D SPECIAL_OP(SSE41), /* pinsrb */ + [0x21] =3D SPECIAL_OP(SSE41), /* insertps */ + [0x22] =3D SPECIAL_OP(SSE41), /* pinsrd/pinsrq */ + [0x40] =3D BINARY_OP(dpps, SSE41, 0), + [0x41] =3D BINARY_OP(dppd, SSE41, 0), + [0x42] =3D BINARY_OP(mpsadbw, SSE41, SSE_OPF_MMX), + [0x44] =3D BINARY_OP(pclmulqdq, PCLMULQDQ, 0), + [0x60] =3D CMP_OP(pcmpestrm, SSE42), + [0x61] =3D CMP_OP(pcmpestri, SSE42), + [0x62] =3D CMP_OP(pcmpistrm, SSE42), + [0x63] =3D CMP_OP(pcmpistri, SSE42), + [0xdf] =3D UNARY_OP(aeskeygenassist, AES, 0), }; =20 +#undef OP +#undef BINARY_OP_MMX +#undef BINARY_OP +#undef UNARY_OP_MMX +#undef UNARY_OP +#undef BLENDV_OP +#undef SPECIAL_OP + static void gen_sse(CPUX86State *env, DisasContext *s, int b, target_ulong pc_start) { int b1, op1_offset, op2_offset, is_xmm, val; int modrm, mod, rm, reg; struct SSEOpHelper_table1 sse_op; + struct SSEOpHelper_table6 op6; + struct SSEOpHelper_table7 op7; SSEFunc_0_epp sse_fn_epp; - SSEFunc_0_eppi sse_fn_eppi; SSEFunc_0_ppi sse_fn_ppi; SSEFunc_0_eppt sse_fn_eppt; MemOp ot; 
@@ -3828,12 +3848,13 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, mod =3D (modrm >> 6) & 3; =20 assert(b1 < 2); - sse_fn_epp =3D sse_op_table6[b].op[b1]; - if (!sse_fn_epp) { + op6 =3D sse_op_table6[b]; + if (op6.ext_mask =3D=3D 0) { goto unknown_op; } - if (!(s->cpuid_ext_features & sse_op_table6[b].ext_mask)) + if (!(s->cpuid_ext_features & op6.ext_mask)) { goto illegal_op; + } =20 if (b1) { op1_offset =3D offsetof(CPUX86State,xmm_regs[reg]); @@ -3870,6 +3891,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } } } else { + if ((op6.flags & SSE_OPF_MMX) =3D=3D 0) { + goto unknown_op; + } op1_offset =3D offsetof(CPUX86State,fpregs[reg].mmx); if (mod =3D=3D 3) { op2_offset =3D offsetof(CPUX86State,fpregs[rm].mmx); @@ -3879,13 +3903,13 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, gen_ldq_env_A0(s, op2_offset); } } - if (sse_fn_epp =3D=3D SSE_SPECIAL_FN) { - goto unknown_op; + if (!op6.op[b1]) { + goto illegal_op; } =20 tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); - sse_fn_epp(cpu_env, s->ptr0, s->ptr1); + op6.op[b1](cpu_env, s->ptr0, s->ptr1); =20 if (b =3D=3D 0x17) { set_cc_op(s, CC_OP_EFLAGS); @@ -4256,16 +4280,21 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, mod =3D (modrm >> 6) & 3; =20 assert(b1 < 2); - sse_fn_eppi =3D sse_op_table7[b].op[b1]; - if (!sse_fn_eppi) { + op7 =3D sse_op_table7[b]; + if (op7.ext_mask =3D=3D 0) { goto unknown_op; } - if (!(s->cpuid_ext_features & sse_op_table7[b].ext_mask)) + if (!(s->cpuid_ext_features & op7.ext_mask)) { goto illegal_op; + } =20 s->rip_offset =3D 1; =20 - if (sse_fn_eppi =3D=3D SSE_SPECIAL_FN) { + if (op7.flags & SSE_OPF_SPECIAL) { + /* None of the "special" ops are valid on mmx registers */ + if (b1 =3D=3D 0) { + goto illegal_op; + } ot =3D mo_64_32(s->dflag); rm =3D (modrm & 7) | REX_B(s); if (mod !=3D 3) @@ -4410,6 +4439,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, gen_ldo_env_A0(s, op2_offset); } } else { + if ((op7.flags & SSE_OPF_MMX) =3D=3D 0) { + goto illegal_op; + } op1_offset =3D offsetof(CPUX86State,fpregs[reg].mmx); if (mod =3D=3D 3) { op2_offset =3D offsetof(CPUX86State,fpregs[rm].mmx); @@ -4432,7 +4464,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, =20 tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); - sse_fn_eppi(cpu_env, s->ptr0, s->ptr1, tcg_const_i32(val)); + op7.op[b1](cpu_env, s->ptr0, s->ptr1, tcg_const_i32(val)); break; =20 case 0x33a: --=20 2.36.0 From nobody Thu May 2 14:26:59 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 165083809572219.19055951626683; Sun, 24 Apr 2022 15:08:15 -0700 (PDT) Received: from localhost ([::1]:54518 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nikOs-0004N3-HZ for importer@patchew.org; Sun, 24 Apr 2022 18:08:14 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:49306) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nikJC-00056i-Sk for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:02:22 -0400 Received: from nowt.default.pbrook.uk0.bigv.io 
([2001:41c8:51:832:fcff:ff:fe00:46dd]:58688) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nikJA-0001L7-JI for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:02:22 -0400 Received: from cpc91554-seac25-2-0-cust857.7-2.cable.virginm.net ([82.27.199.90] helo=wren.home) by nowt.default.pbrook.uk0.bigv.io with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1nikJ5-0001ea-SZ; Sun, 24 Apr 2022 23:02:15 +0100 From: Paul Brook To: Paolo Bonzini , Richard Henderson , Eduardo Habkost Subject: [PATCH v2 06/42] i386: Add CHECK_NO_VEX Date: Sun, 24 Apr 2022 23:01:28 +0100 Message-Id: <20220424220204.2493824-7-paul@nowt.org> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220418173904.3746036-1-paul@nowt.org> References: <20220418173904.3746036-1-paul@nowt.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2001:41c8:51:832:fcff:ff:fe00:46dd; envelope-from=paul@nowt.org; helo=nowt.default.pbrook.uk0.bigv.io X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "open list:All patches CC here" , Paul Brook Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZM-MESSAGEID: 1650838097326100001 Content-Type: text/plain; charset="utf-8" Reject invalid VEX encodings on MMX instructions. 
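
A minimal standalone C sketch of the rule this enforces (illustrative
only: DisasContext is reduced to a single field, the PREFIX_VEX flag
value is assumed, and mmx_encoding_valid is an invented helper; the real
check is the CHECK_NO_VEX macro in the diff below):

#include <stdbool.h>
#include <stdio.h>

#define PREFIX_VEX 0x20    /* assumed flag value for this sketch */

typedef struct {
    int prefix;            /* accumulated instruction prefixes */
} DisasContext;

/* MMX-only opcodes have no VEX forms, so a VEX prefix seen while
 * decoding one must fault (#UD) - the "goto illegal_op" below. */
static bool mmx_encoding_valid(const DisasContext *s)
{
    return (s->prefix & PREFIX_VEX) == 0;
}

int main(void)
{
    DisasContext legacy = { .prefix = 0 };
    DisasContext vex = { .prefix = PREFIX_VEX };

    printf("legacy movq mm, ea: %s\n",
           mmx_encoding_valid(&legacy) ? "ok" : "#UD");
    printf("VEX movq mm, ea:    %s\n",
           mmx_encoding_valid(&vex) ? "ok" : "#UD");
    return 0;
}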
Signed-off-by: Paul Brook Reviewed-by: Richard Henderson --- target/i386/tcg/translate.c | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index 5335b86c01..66ba690b7d 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -3179,6 +3179,12 @@ static const struct SSEOpHelper_table7 sse_op_table7= [256] =3D { #undef BLENDV_OP #undef SPECIAL_OP =20 +/* VEX prefix not allowed */ +#define CHECK_NO_VEX(s) do { \ + if (s->prefix & PREFIX_VEX) \ + goto illegal_op; \ + } while (0) + static void gen_sse(CPUX86State *env, DisasContext *s, int b, target_ulong pc_start) { @@ -3262,6 +3268,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, b |=3D (b1 << 8); switch(b) { case 0x0e7: /* movntq */ + CHECK_NO_VEX(s); if (mod =3D=3D 3) { goto illegal_op; } @@ -3297,6 +3304,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } break; case 0x6e: /* movd mm, ea */ + CHECK_NO_VEX(s); #ifdef TARGET_X86_64 if (s->dflag =3D=3D MO_64) { gen_ldst_modrm(env, s, modrm, MO_64, OR_TMP0, 0); @@ -3330,6 +3338,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } break; case 0x6f: /* movq mm, ea */ + CHECK_NO_VEX(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); gen_ldq_env_A0(s, offsetof(CPUX86State, fpregs[reg].mmx)); @@ -3464,6 +3473,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, break; case 0x178: case 0x378: + CHECK_NO_VEX(s); { int bit_index, field_length; =20 @@ -3484,6 +3494,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } break; case 0x7e: /* movd ea, mm */ + CHECK_NO_VEX(s); #ifdef TARGET_X86_64 if (s->dflag =3D=3D MO_64) { tcg_gen_ld_i64(s->T0, cpu_env, @@ -3524,6 +3535,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, gen_op_movq_env_0(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q= (1))); break; case 0x7f: /* movq ea, mm */ + CHECK_NO_VEX(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); gen_stq_env_A0(s, offsetof(CPUX86State, fpregs[reg].mmx)); @@ -3607,6 +3619,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, offsetof(CPUX86State, xmm_t0.ZMM_L(1))); op1_offset =3D offsetof(CPUX86State,xmm_t0); } else { + CHECK_NO_VEX(s); tcg_gen_movi_tl(s->T0, val); tcg_gen_st32_tl(s->T0, cpu_env, offsetof(CPUX86State, mmx_t0.MMX_L(0))); @@ -3648,6 +3661,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, break; case 0x02a: /* cvtpi2ps */ case 0x12a: /* cvtpi2pd */ + CHECK_NO_VEX(s); gen_helper_enter_mmx(cpu_env); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); @@ -3693,6 +3707,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, case 0x12c: /* cvttpd2pi */ case 0x02d: /* cvtps2pi */ case 0x12d: /* cvtpd2pi */ + CHECK_NO_VEX(s); gen_helper_enter_mmx(cpu_env); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); @@ -3766,6 +3781,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, tcg_gen_st16_tl(s->T0, cpu_env, offsetof(CPUX86State,xmm_regs[reg].ZMM_W(v= al))); } else { + CHECK_NO_VEX(s); val &=3D 3; tcg_gen_st16_tl(s->T0, cpu_env, offsetof(CPUX86State,fpregs[reg].mmx.MMX_W= (val))); @@ -3805,6 +3821,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } break; case 0x2d6: /* movq2dq */ + CHECK_NO_VEX(s); gen_helper_enter_mmx(cpu_env); rm =3D (modrm & 7); gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0)), @@ -3812,6 +3829,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, gen_op_movq_env_0(s, 
offsetof(CPUX86State, xmm_regs[reg].ZMM_Q= (1))); break; case 0x3d6: /* movdq2q */ + CHECK_NO_VEX(s); gen_helper_enter_mmx(cpu_env); rm =3D (modrm & 7) | REX_B(s); gen_op_movq(s, offsetof(CPUX86State, fpregs[reg & 7].mmx), @@ -3827,6 +3845,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, offsetof(CPUX86State, xmm_regs[rm])); gen_helper_pmovmskb_xmm(s->tmp2_i32, cpu_env, s->ptr0); } else { + CHECK_NO_VEX(s); rm =3D (modrm & 7); tcg_gen_addi_ptr(s->ptr0, cpu_env, offsetof(CPUX86State, fpregs[rm].mmx)); @@ -3891,6 +3910,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } } } else { + CHECK_NO_VEX(s); if ((op6.flags & SSE_OPF_MMX) =3D=3D 0) { goto unknown_op; } @@ -3928,6 +3948,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, case 0x3f0: /* crc32 Gd,Eb */ case 0x3f1: /* crc32 Gd,Ey */ do_crc32: + CHECK_NO_VEX(s); if (!(s->cpuid_ext_features & CPUID_EXT_SSE42)) { goto illegal_op; } @@ -3950,6 +3971,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, =20 case 0x1f0: /* crc32 or movbe */ case 0x1f1: + CHECK_NO_VEX(s); /* For these insns, the f3 prefix is supposed to have prio= rity over the 66 prefix, but that's not what we implement ab= ove setting b1. */ @@ -3959,6 +3981,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, /* FALLTHRU */ case 0x0f0: /* movbe Gy,My */ case 0x0f1: /* movbe My,Gy */ + CHECK_NO_VEX(s); if (!(s->cpuid_ext_features & CPUID_EXT_MOVBE)) { goto illegal_op; } @@ -4125,6 +4148,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, =20 case 0x1f6: /* adcx Gy, Ey */ case 0x2f6: /* adox Gy, Ey */ + CHECK_NO_VEX(s); if (!(s->cpuid_7_0_ebx_features & CPUID_7_0_EBX_ADX)) { goto illegal_op; } else { @@ -4439,6 +4463,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, gen_ldo_env_A0(s, op2_offset); } } else { + CHECK_NO_VEX(s); if ((op7.flags & SSE_OPF_MMX) =3D=3D 0) { goto illegal_op; } @@ -4565,6 +4590,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, op2_offset =3D offsetof(CPUX86State,xmm_regs[rm]); } } else { + CHECK_NO_VEX(s); op1_offset =3D offsetof(CPUX86State,fpregs[reg].mmx); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); --=20 2.36.0 From nobody Thu May 2 14:26:59 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1650838229381723.201918264088; Sun, 24 Apr 2022 15:10:29 -0700 (PDT) Received: from localhost ([::1]:58840 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nikR2-0007KS-0f for importer@patchew.org; Sun, 24 Apr 2022 18:10:28 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:49322) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nikJD-00056v-Hs for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:02:24 -0400 Received: from nowt.default.pbrook.uk0.bigv.io ([2001:41c8:51:832:fcff:ff:fe00:46dd]:58692) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nikJA-0001LA-Ix for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:02:23 -0400 Received: from cpc91554-seac25-2-0-cust857.7-2.cable.virginm.net ([82.27.199.90] helo=wren.home) by nowt.default.pbrook.uk0.bigv.io with esmtpsa 
(TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) id 1nikJ6-0001ea-4W; Sun, 24 Apr 2022 23:02:16 +0100
From: Paul Brook
To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
Subject: [PATCH v2 07/42] Enforce VEX encoding restrictions
Date: Sun, 24 Apr 2022 23:01:29 +0100
Message-Id: <20220424220204.2493824-8-paul@nowt.org>
X-Mailer: git-send-email 2.36.0
In-Reply-To: <20220418173904.3746036-1-paul@nowt.org>
References: <20220418173904.3746036-1-paul@nowt.org>
Content-Type: text/plain; charset="utf-8"

Add CHECK_AVX* macros, and use them to validate VEX encoded AVX
instructions.

All AVX instructions require both CPU and OS support; this is encapsulated
by HF_AVX_EN. Some also require specific values in the VEX.L and VEX.V
fields, and some (mostly integer operations) also require AVX2.

Signed-off-by: Paul Brook
---
 target/i386/tcg/translate.c | 159 +++++++++++++++++++++++++++++++++---
 1 file changed, 149 insertions(+), 10 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 66ba690b7d..2f5cc24e0c 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3185,10 +3185,54 @@ static const struct SSEOpHelper_table7 sse_op_table7[256] = {
         goto illegal_op; \
     } while (0)
 
+/*
+ * VEX encodings require AVX
+ * Allow legacy SSE encodings even if AVX not enabled
+ */
+#define CHECK_AVX(s) do { \
+        if ((s->prefix & PREFIX_VEX) \
+            && !(env->hflags & HF_AVX_EN_MASK)) \
+            goto illegal_op; \
+    } while (0)
+
+/* If a VEX prefix is used then it must have V=1111b */
+#define CHECK_AVX_V0(s) do { \
+        CHECK_AVX(s); \
+        if ((s->prefix & PREFIX_VEX) && (s->vex_v != 0)) \
+            goto illegal_op; \
+    } while (0)
+
+/* If a VEX prefix is used then it must have L=0 */
+#define CHECK_AVX_128(s) do { \
+        CHECK_AVX(s); \
+        if ((s->prefix & PREFIX_VEX) && (s->vex_l != 0)) \
+            goto illegal_op; \
+    } while (0)
+
+/* If a VEX prefix is used then it must have V=1111b and L=0 */
+#define CHECK_AVX_V0_128(s) do { \
+        CHECK_AVX(s); \
+        if ((s->prefix & PREFIX_VEX) && (s->vex_v != 0 || s->vex_l != 0)) \
+            goto illegal_op; \
+    } while (0)
+
+/* 256-bit (ymm) variants require AVX2 */
+#define CHECK_AVX2_256(s) do { \
+        if (s->vex_l && !(s->cpuid_7_0_ebx_features & CPUID_7_0_EBX_AVX2)) \
+            goto illegal_op; \
+    } while (0)
+
+/* Requires AVX2 and VEX encoding */
+#define CHECK_AVX2(s) do { \
+        if ((s->prefix & PREFIX_VEX) == 0 \
+            || !(s->cpuid_7_0_ebx_features & CPUID_7_0_EBX_AVX2)) \
+            goto illegal_op; \
+    } while (0)
+
 static void gen_sse(CPUX86State *env, DisasContext *s, int b, target_ulong
pc_start) { - int b1, op1_offset, op2_offset, is_xmm, val; + int b1, op1_offset, op2_offset, is_xmm, val, scalar_op; int modrm, mod, rm, reg; struct SSEOpHelper_table1 sse_op; struct SSEOpHelper_table6 op6; @@ -3228,15 +3272,18 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, gen_exception(s, EXCP07_PREX, pc_start - s->cs_base); return; } - if (s->flags & HF_EM_MASK) { - illegal_op: - gen_illegal_opcode(s); - return; - } - if (is_xmm - && !(s->flags & HF_OSFXSR_MASK) - && (b !=3D 0x38 && b !=3D 0x3a)) { - goto unknown_op; + /* VEX encoded instuctions ignore EM bit. See also CHECK_AVX */ + if (!(s->prefix & PREFIX_VEX)) { + if (s->flags & HF_EM_MASK) { + illegal_op: + gen_illegal_opcode(s); + return; + } + if (is_xmm + && !(s->flags & HF_OSFXSR_MASK) + && (b !=3D 0x38 && b !=3D 0x3a)) { + goto unknown_op; + } } if (b =3D=3D 0x0e) { if (!(s->cpuid_ext2_features & CPUID_EXT2_3DNOW)) { @@ -3278,12 +3325,14 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, case 0x1e7: /* movntdq */ case 0x02b: /* movntps */ case 0x12b: /* movntps */ + CHECK_AVX_V0(s); if (mod =3D=3D 3) goto illegal_op; gen_lea_modrm(env, s, modrm); gen_sto_env_A0(s, offsetof(CPUX86State, xmm_regs[reg])); break; case 0x3f0: /* lddqu */ + CHECK_AVX_V0(s); if (mod =3D=3D 3) goto illegal_op; gen_lea_modrm(env, s, modrm); @@ -3291,6 +3340,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, break; case 0x22b: /* movntss */ case 0x32b: /* movntsd */ + CHECK_AVX_V0_128(s); if (mod =3D=3D 3) goto illegal_op; gen_lea_modrm(env, s, modrm); @@ -3321,6 +3371,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } break; case 0x16e: /* movd xmm, ea */ + CHECK_AVX_V0_128(s); #ifdef TARGET_X86_64 if (s->dflag =3D=3D MO_64) { gen_ldst_modrm(env, s, modrm, MO_64, OR_TMP0, 0); @@ -3356,6 +3407,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, case 0x128: /* movapd */ case 0x16f: /* movdqa xmm, ea */ case 0x26f: /* movdqu xmm, ea */ + CHECK_AVX_V0(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg])); @@ -3367,6 +3419,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, break; case 0x210: /* movss xmm, ea */ if (mod !=3D 3) { + CHECK_AVX_V0_128(s); gen_lea_modrm(env, s, modrm); gen_op_ld_v(s, MO_32, s->T0, s->A0); tcg_gen_st32_tl(s->T0, cpu_env, @@ -3379,6 +3432,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, tcg_gen_st32_tl(s->T0, cpu_env, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(= 3))); } else { + CHECK_AVX_128(s); rm =3D (modrm & 7) | REX_B(s); gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0= )), offsetof(CPUX86State,xmm_regs[rm].ZMM_L(0))); @@ -3386,6 +3440,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, break; case 0x310: /* movsd xmm, ea */ if (mod !=3D 3) { + CHECK_AVX_V0_128(s); gen_lea_modrm(env, s, modrm); gen_ldq_env_A0(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0))); @@ -3395,6 +3450,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, tcg_gen_st32_tl(s->T0, cpu_env, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(= 3))); } else { + CHECK_AVX_128(s); rm =3D (modrm & 7) | REX_B(s); gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0= )), offsetof(CPUX86State,xmm_regs[rm].ZMM_Q(0))); @@ -3402,6 +3458,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, break; case 0x012: /* movlps */ case 0x112: /* movlpd */ + CHECK_AVX_128(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); gen_ldq_env_A0(s, 
offsetof(CPUX86State, @@ -3414,6 +3471,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } break; case 0x212: /* movsldup */ + CHECK_AVX_V0(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg])); @@ -3430,6 +3488,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, offsetof(CPUX86State,xmm_regs[reg].ZMM_L(2))); break; case 0x312: /* movddup */ + CHECK_AVX_V0(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); gen_ldq_env_A0(s, offsetof(CPUX86State, @@ -3444,6 +3503,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, break; case 0x016: /* movhps */ case 0x116: /* movhpd */ + CHECK_AVX_128(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); gen_ldq_env_A0(s, offsetof(CPUX86State, @@ -3456,6 +3516,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } break; case 0x216: /* movshdup */ + CHECK_AVX_V0(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg])); @@ -3509,6 +3570,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } break; case 0x17e: /* movd ea, xmm */ + CHECK_AVX_V0_128(s); #ifdef TARGET_X86_64 if (s->dflag =3D=3D MO_64) { tcg_gen_ld_i64(s->T0, cpu_env, @@ -3523,6 +3585,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } break; case 0x27e: /* movq xmm, ea */ + CHECK_AVX_V0_128(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); gen_ldq_env_A0(s, offsetof(CPUX86State, @@ -3551,6 +3614,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, case 0x129: /* movapd */ case 0x17f: /* movdqa ea, xmm */ case 0x27f: /* movdqu ea, xmm */ + CHECK_AVX_V0(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); gen_sto_env_A0(s, offsetof(CPUX86State, xmm_regs[reg])); @@ -3562,11 +3626,13 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, break; case 0x211: /* movss ea, xmm */ if (mod !=3D 3) { + CHECK_AVX_V0_128(s); gen_lea_modrm(env, s, modrm); tcg_gen_ld32u_tl(s->T0, cpu_env, offsetof(CPUX86State, xmm_regs[reg].ZMM_L= (0))); gen_op_st_v(s, MO_32, s->T0, s->A0); } else { + CHECK_AVX_128(s); rm =3D (modrm & 7) | REX_B(s); gen_op_movl(s, offsetof(CPUX86State, xmm_regs[rm].ZMM_L(0)= ), offsetof(CPUX86State,xmm_regs[reg].ZMM_L(0))); @@ -3574,10 +3640,12 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, break; case 0x311: /* movsd ea, xmm */ if (mod !=3D 3) { + CHECK_AVX_V0_128(s); gen_lea_modrm(env, s, modrm); gen_stq_env_A0(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0))); } else { + CHECK_AVX_128(s); rm =3D (modrm & 7) | REX_B(s); gen_op_movq(s, offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(0)= ), offsetof(CPUX86State,xmm_regs[reg].ZMM_Q(0))); @@ -3585,6 +3653,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, break; case 0x013: /* movlps */ case 0x113: /* movlpd */ + CHECK_AVX_V0_128(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); gen_stq_env_A0(s, offsetof(CPUX86State, @@ -3595,6 +3664,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, break; case 0x017: /* movhps */ case 0x117: /* movhpd */ + CHECK_AVX_V0_128(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); gen_stq_env_A0(s, offsetof(CPUX86State, @@ -3611,6 +3681,8 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, case 0x173: val =3D x86_ldub_code(env, s); if (is_xmm) { + CHECK_AVX(s); + CHECK_AVX2_256(s); tcg_gen_movi_tl(s->T0, val); tcg_gen_st32_tl(s->T0, cpu_env, offsetof(CPUX86State, xmm_t0.ZMM_L(0))); @@ -3646,6 +3718,7 @@ static void 
gen_sse(CPUX86State *env, DisasContext *s= , int b, sse_fn_epp(cpu_env, s->ptr0, s->ptr1); break; case 0x050: /* movmskps */ + CHECK_AVX_V0(s); rm =3D (modrm & 7) | REX_B(s); tcg_gen_addi_ptr(s->ptr0, cpu_env, offsetof(CPUX86State,xmm_regs[rm])); @@ -3653,6 +3726,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, tcg_gen_extu_i32_tl(cpu_regs[reg], s->tmp2_i32); break; case 0x150: /* movmskpd */ + CHECK_AVX_V0(s); rm =3D (modrm & 7) | REX_B(s); tcg_gen_addi_ptr(s->ptr0, cpu_env, offsetof(CPUX86State,xmm_regs[rm])); @@ -3686,6 +3760,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, break; case 0x22a: /* cvtsi2ss */ case 0x32a: /* cvtsi2sd */ + CHECK_AVX(s); ot =3D mo_64_32(s->dflag); gen_ldst_modrm(env, s, modrm, ot, OR_TMP0, 0); op1_offset =3D offsetof(CPUX86State,xmm_regs[reg]); @@ -3739,6 +3814,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, case 0x32c: /* cvttsd2si */ case 0x22d: /* cvtss2si */ case 0x32d: /* cvtsd2si */ + CHECK_AVX_V0(s); ot =3D mo_64_32(s->dflag); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); @@ -3773,6 +3849,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, break; case 0xc4: /* pinsrw */ case 0x1c4: + CHECK_AVX_128(s); s->rip_offset =3D 1; gen_ldst_modrm(env, s, modrm, MO_16, OR_TMP0, 0); val =3D x86_ldub_code(env, s); @@ -3789,6 +3866,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, break; case 0xc5: /* pextrw */ case 0x1c5: + CHECK_AVX_V0_128(s); if (mod !=3D 3) goto illegal_op; ot =3D mo_64_32(s->dflag); @@ -3808,6 +3886,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, gen_op_mov_reg_v(s, ot, reg, s->T0); break; case 0x1d6: /* movq ea, xmm */ + CHECK_AVX_V0_128(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); gen_stq_env_A0(s, offsetof(CPUX86State, @@ -3840,6 +3919,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, if (mod !=3D 3) goto illegal_op; if (b1) { + CHECK_AVX_V0(s); rm =3D (modrm & 7) | REX_B(s); tcg_gen_addi_ptr(s->ptr0, cpu_env, offsetof(CPUX86State, xmm_regs[rm])); @@ -3875,8 +3955,33 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, goto illegal_op; } =20 + if (op6.ext_mask =3D=3D CPUID_EXT_AVX + && (s->prefix & PREFIX_VEX) =3D=3D 0) { + goto illegal_op; + } + if (op6.flags & SSE_OPF_AVX2) { + CHECK_AVX2(s); + } + if (b1) { + if (op6.flags & SSE_OPF_V0) { + CHECK_AVX_V0(s); + } else { + CHECK_AVX(s); + } op1_offset =3D offsetof(CPUX86State,xmm_regs[reg]); + + if (op6.flags & SSE_OPF_MMX) { + CHECK_AVX2_256(s); + } + if (op6.flags & SSE_OPF_BLENDV) { + /* + * VEX encodings of the blendv opcodes are not valid + * they use a different opcode with an 0f 3a prefix + */ + CHECK_NO_VEX(s); + } + if (mod =3D=3D 3) { op2_offset =3D offsetof(CPUX86State,xmm_regs[rm | REX_= B(s)]); } else { @@ -4327,6 +4432,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, val =3D x86_ldub_code(env, s); switch (b) { case 0x14: /* pextrb */ + CHECK_AVX_V0_128(s); tcg_gen_ld8u_tl(s->T0, cpu_env, offsetof(CPUX86State, xmm_regs[reg].ZMM_B(val & 15))= ); if (mod =3D=3D 3) { @@ -4337,6 +4443,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } break; case 0x15: /* pextrw */ + CHECK_AVX_V0_128(s); tcg_gen_ld16u_tl(s->T0, cpu_env, offsetof(CPUX86State, xmm_regs[reg].ZMM_W(val & 7))); if (mod =3D=3D 3) { @@ -4347,6 +4454,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } break; case 0x16: + CHECK_AVX_V0_128(s); if (ot =3D=3D MO_32) { /* pextrd */ tcg_gen_ld_i32(s->tmp2_i32, 
cpu_env, offsetof(CPUX86State, @@ -4374,6 +4482,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } break; case 0x17: /* extractps */ + CHECK_AVX_V0_128(s); tcg_gen_ld32u_tl(s->T0, cpu_env, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(val & 3))); if (mod =3D=3D 3) { @@ -4384,6 +4493,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } break; case 0x20: /* pinsrb */ + CHECK_AVX_128(s); if (mod =3D=3D 3) { gen_op_mov_v_reg(s, MO_32, s->T0, rm); } else { @@ -4394,6 +4504,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, xmm_regs[reg].ZMM_B(val & 15))= ); break; case 0x21: /* insertps */ + CHECK_AVX_128(s); if (mod =3D=3D 3) { tcg_gen_ld_i32(s->tmp2_i32, cpu_env, offsetof(CPUX86State,xmm_regs[rm] @@ -4423,6 +4534,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, xmm_regs[reg].ZMM_L(3))); break; case 0x22: + CHECK_AVX_128(s); if (ot =3D=3D MO_32) { /* pinsrd */ if (mod =3D=3D 3) { tcg_gen_trunc_tl_i32(s->tmp2_i32, cpu_regs[rm]= ); @@ -4453,6 +4565,20 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, return; } =20 + CHECK_AVX(s); + scalar_op =3D (s->prefix & PREFIX_VEX) + && (op7.flags & SSE_OPF_SCALAR) + && !(op7.flags & SSE_OPF_CMP); + if (is_xmm && (op7.flags & SSE_OPF_MMX)) { + CHECK_AVX2_256(s); + } + if (op7.flags & SSE_OPF_AVX2) { + CHECK_AVX2(s); + } + if ((op7.flags & SSE_OPF_V0) && !scalar_op) { + CHECK_AVX_V0(s); + } + if (b1) { op1_offset =3D offsetof(CPUX86State,xmm_regs[reg]); if (mod =3D=3D 3) { @@ -4540,6 +4666,19 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, break; } if (is_xmm) { + scalar_op =3D (s->prefix & PREFIX_VEX) + && (sse_op.flags & SSE_OPF_SCALAR) + && !(sse_op.flags & SSE_OPF_CMP) + && (b1 =3D=3D 2 || b1 =3D=3D 3); + /* VEX encoded scalar ops always have 3 operands! 
*/ + if ((sse_op.flags & SSE_OPF_V0) && !scalar_op) { + CHECK_AVX_V0(s); + } else { + CHECK_AVX(s); + } + if (sse_op.flags & SSE_OPF_MMX) { + CHECK_AVX2_256(s); + } op1_offset =3D offsetof(CPUX86State,xmm_regs[reg]); if (mod !=3D 3) { int sz =3D 4; --=20 2.36.0
From nobody Thu May 2 14:26:59 2024
From: Paul Brook
To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
Subject: [PATCH v2 08/42] i386: Add ZMM_OFFSET macro
Date: Sun, 24 Apr 2022 23:01:30 +0100
Message-Id: <20220424220204.2493824-9-paul@nowt.org>
X-Mailer: git-send-email 2.36.0
In-Reply-To: <20220418173904.3746036-1-paul@nowt.org>
References: <20220418173904.3746036-1-paul@nowt.org>
Content-Type: text/plain; charset="utf-8"

Add a convenience macro to get the address of an xmm_regs element within
CPUX86State.

This was originally going to be the basis of an implementation that broke
operations into 128 bit chunks. I scrapped that idea, so this is now a
purely cosmetic change. But I think a worthwhile one - it reduces the
number of function calls that need to be split over multiple lines.

No functional changes.
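For example (taken from one of the conversions in the diff below), a call
that previously had to be split:

    tcg_gen_addi_ptr(s->ptr0, cpu_env,
                     offsetof(CPUX86State, xmm_regs[reg]));

collapses to a single line:

    tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(reg));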
Signed-off-by: Paul Brook Reviewed-by: Richard Henderson --- target/i386/tcg/translate.c | 60 +++++++++++++++++-------------------- 1 file changed, 27 insertions(+), 33 deletions(-) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index 2f5cc24e0c..e9e6062b7f 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -2777,6 +2777,8 @@ static inline void gen_op_movq_env_0(DisasContext *s,= int d_offset) tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset); } =20 +#define ZMM_OFFSET(reg) offsetof(CPUX86State, xmm_regs[reg]) + typedef void (*SSEFunc_i_ep)(TCGv_i32 val, TCGv_ptr env, TCGv_ptr reg); typedef void (*SSEFunc_l_ep)(TCGv_i64 val, TCGv_ptr env, TCGv_ptr reg); typedef void (*SSEFunc_0_epi)(TCGv_ptr env, TCGv_ptr reg, TCGv_i32 val); @@ -3329,14 +3331,14 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, if (mod =3D=3D 3) goto illegal_op; gen_lea_modrm(env, s, modrm); - gen_sto_env_A0(s, offsetof(CPUX86State, xmm_regs[reg])); + gen_sto_env_A0(s, ZMM_OFFSET(reg)); break; case 0x3f0: /* lddqu */ CHECK_AVX_V0(s); if (mod =3D=3D 3) goto illegal_op; gen_lea_modrm(env, s, modrm); - gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg])); + gen_ldo_env_A0(s, ZMM_OFFSET(reg)); break; case 0x22b: /* movntss */ case 0x32b: /* movntsd */ @@ -3375,15 +3377,13 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, #ifdef TARGET_X86_64 if (s->dflag =3D=3D MO_64) { gen_ldst_modrm(env, s, modrm, MO_64, OR_TMP0, 0); - tcg_gen_addi_ptr(s->ptr0, cpu_env, - offsetof(CPUX86State,xmm_regs[reg])); + tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(reg)); gen_helper_movq_mm_T0_xmm(s->ptr0, s->T0); } else #endif { gen_ldst_modrm(env, s, modrm, MO_32, OR_TMP0, 0); - tcg_gen_addi_ptr(s->ptr0, cpu_env, - offsetof(CPUX86State,xmm_regs[reg])); + tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(reg)); tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0); gen_helper_movl_mm_T0_xmm(s->ptr0, s->tmp2_i32); } @@ -3410,11 +3410,10 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, CHECK_AVX_V0(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); - gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg])); + gen_ldo_env_A0(s, ZMM_OFFSET(reg)); } else { rm =3D (modrm & 7) | REX_B(s); - gen_op_movo(s, offsetof(CPUX86State, xmm_regs[reg]), - offsetof(CPUX86State,xmm_regs[rm])); + gen_op_movo(s, ZMM_OFFSET(reg), ZMM_OFFSET(rm)); } break; case 0x210: /* movss xmm, ea */ @@ -3474,7 +3473,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, CHECK_AVX_V0(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); - gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg])); + gen_ldo_env_A0(s, ZMM_OFFSET(reg)); } else { rm =3D (modrm & 7) | REX_B(s); gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0= )), @@ -3519,7 +3518,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, CHECK_AVX_V0(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); - gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg])); + gen_ldo_env_A0(s, ZMM_OFFSET(reg)); } else { rm =3D (modrm & 7) | REX_B(s); gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(1= )), @@ -3542,8 +3541,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, goto illegal_op; field_length =3D x86_ldub_code(env, s) & 0x3F; bit_index =3D x86_ldub_code(env, s) & 0x3F; - tcg_gen_addi_ptr(s->ptr0, cpu_env, - offsetof(CPUX86State,xmm_regs[reg])); + tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(reg)); if (b1 =3D=3D 1) gen_helper_extrq_i(cpu_env, s->ptr0, tcg_const_i32(bit_index), @@ -3617,11 
+3615,10 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, CHECK_AVX_V0(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); - gen_sto_env_A0(s, offsetof(CPUX86State, xmm_regs[reg])); + gen_sto_env_A0(s, ZMM_OFFSET(reg)); } else { rm =3D (modrm & 7) | REX_B(s); - gen_op_movo(s, offsetof(CPUX86State, xmm_regs[rm]), - offsetof(CPUX86State,xmm_regs[reg])); + gen_op_movo(s, ZMM_OFFSET(rm), ZMM_OFFSET(reg)); } break; case 0x211: /* movss ea, xmm */ @@ -3708,7 +3705,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } if (is_xmm) { rm =3D (modrm & 7) | REX_B(s); - op2_offset =3D offsetof(CPUX86State,xmm_regs[rm]); + op2_offset =3D ZMM_OFFSET(rm); } else { rm =3D (modrm & 7); op2_offset =3D offsetof(CPUX86State,fpregs[rm].mmx); @@ -3720,16 +3717,14 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, case 0x050: /* movmskps */ CHECK_AVX_V0(s); rm =3D (modrm & 7) | REX_B(s); - tcg_gen_addi_ptr(s->ptr0, cpu_env, - offsetof(CPUX86State,xmm_regs[rm])); + tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(rm)); gen_helper_movmskps(s->tmp2_i32, cpu_env, s->ptr0); tcg_gen_extu_i32_tl(cpu_regs[reg], s->tmp2_i32); break; case 0x150: /* movmskpd */ CHECK_AVX_V0(s); rm =3D (modrm & 7) | REX_B(s); - tcg_gen_addi_ptr(s->ptr0, cpu_env, - offsetof(CPUX86State,xmm_regs[rm])); + tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(rm)); gen_helper_movmskpd(s->tmp2_i32, cpu_env, s->ptr0); tcg_gen_extu_i32_tl(cpu_regs[reg], s->tmp2_i32); break; @@ -3745,7 +3740,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, rm =3D (modrm & 7); op2_offset =3D offsetof(CPUX86State,fpregs[rm].mmx); } - op1_offset =3D offsetof(CPUX86State,xmm_regs[reg]); + op1_offset =3D ZMM_OFFSET(reg); tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); switch(b >> 8) { @@ -3763,7 +3758,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, CHECK_AVX(s); ot =3D mo_64_32(s->dflag); gen_ldst_modrm(env, s, modrm, ot, OR_TMP0, 0); - op1_offset =3D offsetof(CPUX86State,xmm_regs[reg]); + op1_offset =3D ZMM_OFFSET(reg); tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); if (ot =3D=3D MO_32) { SSEFunc_0_epi sse_fn_epi =3D sse_op_table3ai[(b >> 8) & 1]; @@ -3790,7 +3785,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, gen_ldo_env_A0(s, op2_offset); } else { rm =3D (modrm & 7) | REX_B(s); - op2_offset =3D offsetof(CPUX86State,xmm_regs[rm]); + op2_offset =3D ZMM_OFFSET(rm); } op1_offset =3D offsetof(CPUX86State,fpregs[reg & 7].mmx); tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); @@ -3828,7 +3823,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, op2_offset =3D offsetof(CPUX86State,xmm_t0); } else { rm =3D (modrm & 7) | REX_B(s); - op2_offset =3D offsetof(CPUX86State,xmm_regs[rm]); + op2_offset =3D ZMM_OFFSET(rm); } tcg_gen_addi_ptr(s->ptr0, cpu_env, op2_offset); if (ot =3D=3D MO_32) { @@ -3921,8 +3916,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, if (b1) { CHECK_AVX_V0(s); rm =3D (modrm & 7) | REX_B(s); - tcg_gen_addi_ptr(s->ptr0, cpu_env, - offsetof(CPUX86State, xmm_regs[rm])); + tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(rm)); gen_helper_pmovmskb_xmm(s->tmp2_i32, cpu_env, s->ptr0); } else { CHECK_NO_VEX(s); @@ -3969,7 +3963,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } else { CHECK_AVX(s); } - op1_offset =3D offsetof(CPUX86State,xmm_regs[reg]); + op1_offset =3D ZMM_OFFSET(reg); =20 if (op6.flags & SSE_OPF_MMX) { CHECK_AVX2_256(s); @@ -3983,7 +3977,7 @@ static 
void gen_sse(CPUX86State *env, DisasContext *s= , int b, } =20 if (mod =3D=3D 3) { - op2_offset =3D offsetof(CPUX86State,xmm_regs[rm | REX_= B(s)]); + op2_offset =3D ZMM_OFFSET(rm | REX_B(s)); } else { op2_offset =3D offsetof(CPUX86State,xmm_t0); gen_lea_modrm(env, s, modrm); @@ -4580,9 +4574,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } =20 if (b1) { - op1_offset =3D offsetof(CPUX86State,xmm_regs[reg]); + op1_offset =3D ZMM_OFFSET(reg); if (mod =3D=3D 3) { - op2_offset =3D offsetof(CPUX86State,xmm_regs[rm | REX_= B(s)]); + op2_offset =3D ZMM_OFFSET(rm | REX_B(s)); } else { op2_offset =3D offsetof(CPUX86State,xmm_t0); gen_lea_modrm(env, s, modrm); @@ -4679,7 +4673,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, if (sse_op.flags & SSE_OPF_MMX) { CHECK_AVX2_256(s); } - op1_offset =3D offsetof(CPUX86State,xmm_regs[reg]); + op1_offset =3D ZMM_OFFSET(reg); if (mod !=3D 3) { int sz =3D 4; =20 @@ -4726,7 +4720,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } } else { rm =3D (modrm & 7) | REX_B(s); - op2_offset =3D offsetof(CPUX86State,xmm_regs[rm]); + op2_offset =3D ZMM_OFFSET(rm); } } else { CHECK_NO_VEX(s); --=20 2.36.0 From nobody Thu May 2 14:26:59 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1650837879747217.74850554446095; Sun, 24 Apr 2022 15:04:39 -0700 (PDT) Received: from localhost ([::1]:47040 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nikLO-0007i2-3z for importer@patchew.org; Sun, 24 Apr 2022 18:04:38 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:49326) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nikJD-00056w-J3 for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:02:24 -0400 Received: from nowt.default.pbrook.uk0.bigv.io ([2001:41c8:51:832:fcff:ff:fe00:46dd]:58694) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nikJA-0001LC-M9 for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:02:23 -0400 Received: from cpc91554-seac25-2-0-cust857.7-2.cable.virginm.net ([82.27.199.90] helo=wren.home) by nowt.default.pbrook.uk0.bigv.io with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1nikJ6-0001ea-HQ; Sun, 24 Apr 2022 23:02:16 +0100 From: Paul Brook To: Paolo Bonzini , Richard Henderson , Eduardo Habkost Subject: [PATCH v2 09/42] i386: Helper macro for 256 bit AVX helpers Date: Sun, 24 Apr 2022 23:01:31 +0100 Message-Id: <20220424220204.2493824-10-paul@nowt.org> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220418173904.3746036-1-paul@nowt.org> References: <20220418173904.3746036-1-paul@nowt.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2001:41c8:51:832:fcff:ff:fe00:46dd; envelope-from=paul@nowt.org; helo=nowt.default.pbrook.uk0.bigv.io X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, 
T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action
Content-Type: text/plain; charset="utf-8"

Once all the code is in place, 256 bit vector helpers will be generated by
including ops_sse.h a third time with SHIFT=2.

The first bit of support for this is to define a YMM_ONLY macro for code
that only applies to 256 bit vectors. XMM_ONLY code will be executed for
both 128 and 256 bit vectors.

Signed-off-by: Paul Brook
---
 target/i386/ops_sse.h        | 8 ++++++++
 target/i386/ops_sse_header.h | 4 ++++
 2 files changed, 12 insertions(+)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index a5a48a20f6..23daab6b50 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -24,6 +24,7 @@
 #define Reg MMXReg
 #define SIZE 8
 #define XMM_ONLY(...)
+#define YMM_ONLY(...)
 #define B(n) MMX_B(n)
 #define W(n) MMX_W(n)
 #define L(n) MMX_L(n)
@@ -37,7 +38,13 @@
 #define W(n) ZMM_W(n)
 #define L(n) ZMM_L(n)
 #define Q(n) ZMM_Q(n)
+#if SHIFT == 1
 #define SUFFIX _xmm
+#define YMM_ONLY(...)
+#else
+#define SUFFIX _ymm
+#define YMM_ONLY(...) __VA_ARGS__
+#endif
 #endif
 
 /*
@@ -2337,6 +2344,7 @@ void glue(helper_aeskeygenassist, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
 
 #undef SHIFT
 #undef XMM_ONLY
+#undef YMM_ONLY
 #undef Reg
 #undef B
 #undef W
diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h
index cef28f2aae..7e7f2cee2a 100644
--- a/target/i386/ops_sse_header.h
+++ b/target/i386/ops_sse_header.h
@@ -21,7 +21,11 @@
 #define SUFFIX _mmx
 #else
 #define Reg ZMMReg
+#if SHIFT == 1
 #define SUFFIX _xmm
+#else
+#define SUFFIX _ymm
+#endif
 #endif
 
 #define dh_alias_Reg ptr
-- 
2.36.0
From nobody Thu May 2 14:26:59 2024
From: Paul Brook
To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
Subject: [PATCH v2 10/42] i386: Rewrite vector shift helper
Date: Sun, 24
Apr 2022 23:01:32 +0100
Message-Id: <20220424220204.2493824-11-paul@nowt.org>
X-Mailer: git-send-email 2.36.0
In-Reply-To: <20220418173904.3746036-1-paul@nowt.org>
References: <20220418173904.3746036-1-paul@nowt.org>
Content-Type: text/plain; charset="utf-8"

Rewrite the vector shift helpers in preparation for AVX support
(3 operand form and 256 bit vectors).

For now keep the existing two operand interface.

No functional changes to existing helpers.

Signed-off-by: Paul Brook
---
 target/i386/ops_sse.h | 250 ++++++++++++++++++++++--------------------
 1 file changed, 133 insertions(+), 117 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 23daab6b50..9297c96d04 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -63,199 +63,215 @@
 #define MOVE(d, r) memcpy(&(d).B(0), &(r).B(0), SIZE)
 #endif
 
-void glue(helper_psrlw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+#if SHIFT == 0
+#define SHIFT_HELPER_BODY(n, elem, F) do { \
+    d->elem(0) = F(s->elem(0), shift); \
+    if ((n) > 1) { \
+        d->elem(1) = F(s->elem(1), shift); \
+    } \
+    if ((n) > 2) { \
+        d->elem(2) = F(s->elem(2), shift); \
+        d->elem(3) = F(s->elem(3), shift); \
+    } \
+    if ((n) > 4) { \
+        d->elem(4) = F(s->elem(4), shift); \
+        d->elem(5) = F(s->elem(5), shift); \
+        d->elem(6) = F(s->elem(6), shift); \
+        d->elem(7) = F(s->elem(7), shift); \
+    } \
+    if ((n) > 8) { \
+        d->elem(8) = F(s->elem(8), shift); \
+        d->elem(9) = F(s->elem(9), shift); \
+        d->elem(10) = F(s->elem(10), shift); \
+        d->elem(11) = F(s->elem(11), shift); \
+        d->elem(12) = F(s->elem(12), shift); \
+        d->elem(13) = F(s->elem(13), shift); \
+        d->elem(14) = F(s->elem(14), shift); \
+        d->elem(15) = F(s->elem(15), shift); \
+    } \
+    } while (0)
+
+#define FPSRL(x, c) ((x) >> shift)
+#define FPSRAW(x, c) ((int16_t)(x) >> shift)
+#define FPSRAL(x, c) ((int32_t)(x) >> shift)
+#define FPSLL(x, c) ((x) << shift)
+#endif
+
+void glue(helper_psrlw, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
 {
+    Reg *s = d;
     int shift;
-
-    if (s->Q(0) > 15) {
+    if (c->Q(0) > 15) {
         d->Q(0) = 0;
-#if SHIFT == 1
-        d->Q(1) = 0;
-#endif
+        XMM_ONLY(d->Q(1) = 0;)
+        YMM_ONLY(
+        d->Q(2) = 0;
+        d->Q(3) = 0;
+        )
     } else {
-        shift = s->B(0);
-        d->W(0) >>= shift;
-        d->W(1) >>= shift;
-        d->W(2) >>= shift;
-        d->W(3) >>= shift;
-#if SHIFT == 1
-        d->W(4) >>= shift;
-        d->W(5) >>= shift;
-        d->W(6) >>= shift;
-        d->W(7) >>= shift;
-#endif
+        shift = c->B(0);
+        SHIFT_HELPER_BODY(4 << SHIFT, W, FPSRL);
     }
 }
 
-void glue(helper_psraw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_psllw, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) { + Reg *s =3D d; int shift; - - if (s->Q(0) > 15) { - shift =3D 15; + if (c->Q(0) > 15) { + d->Q(0) =3D 0; + XMM_ONLY(d->Q(1) =3D 0;) + YMM_ONLY( + d->Q(2) =3D 0; + d->Q(3) =3D 0; + ) } else { - shift =3D s->B(0); + shift =3D c->B(0); + SHIFT_HELPER_BODY(4 << SHIFT, W, FPSLL); } - d->W(0) =3D (int16_t)d->W(0) >> shift; - d->W(1) =3D (int16_t)d->W(1) >> shift; - d->W(2) =3D (int16_t)d->W(2) >> shift; - d->W(3) =3D (int16_t)d->W(3) >> shift; -#if SHIFT =3D=3D 1 - d->W(4) =3D (int16_t)d->W(4) >> shift; - d->W(5) =3D (int16_t)d->W(5) >> shift; - d->W(6) =3D (int16_t)d->W(6) >> shift; - d->W(7) =3D (int16_t)d->W(7) >> shift; -#endif } =20 -void glue(helper_psllw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_psraw, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) { + Reg *s =3D d; int shift; - - if (s->Q(0) > 15) { - d->Q(0) =3D 0; -#if SHIFT =3D=3D 1 - d->Q(1) =3D 0; -#endif + if (c->Q(0) > 15) { + shift =3D 15; } else { - shift =3D s->B(0); - d->W(0) <<=3D shift; - d->W(1) <<=3D shift; - d->W(2) <<=3D shift; - d->W(3) <<=3D shift; -#if SHIFT =3D=3D 1 - d->W(4) <<=3D shift; - d->W(5) <<=3D shift; - d->W(6) <<=3D shift; - d->W(7) <<=3D shift; -#endif + shift =3D c->B(0); } + SHIFT_HELPER_BODY(4 << SHIFT, W, FPSRAW); } =20 -void glue(helper_psrld, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_psrld, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) { + Reg *s =3D d; int shift; - - if (s->Q(0) > 31) { + if (c->Q(0) > 31) { d->Q(0) =3D 0; -#if SHIFT =3D=3D 1 - d->Q(1) =3D 0; -#endif + XMM_ONLY(d->Q(1) =3D 0;) + YMM_ONLY( + d->Q(2) =3D 0; + d->Q(3) =3D 0; + ) } else { - shift =3D s->B(0); - d->L(0) >>=3D shift; - d->L(1) >>=3D shift; -#if SHIFT =3D=3D 1 - d->L(2) >>=3D shift; - d->L(3) >>=3D shift; -#endif + shift =3D c->B(0); + SHIFT_HELPER_BODY(2 << SHIFT, L, FPSRL); } } =20 -void glue(helper_psrad, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_pslld, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) { + Reg *s =3D d; int shift; - - if (s->Q(0) > 31) { - shift =3D 31; + if (c->Q(0) > 31) { + d->Q(0) =3D 0; + XMM_ONLY(d->Q(1) =3D 0;) + YMM_ONLY( + d->Q(2) =3D 0; + d->Q(3) =3D 0; + ) } else { - shift =3D s->B(0); + shift =3D c->B(0); + SHIFT_HELPER_BODY(2 << SHIFT, L, FPSLL); } - d->L(0) =3D (int32_t)d->L(0) >> shift; - d->L(1) =3D (int32_t)d->L(1) >> shift; -#if SHIFT =3D=3D 1 - d->L(2) =3D (int32_t)d->L(2) >> shift; - d->L(3) =3D (int32_t)d->L(3) >> shift; -#endif } =20 -void glue(helper_pslld, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_psrad, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) { + Reg *s =3D d; int shift; - - if (s->Q(0) > 31) { - d->Q(0) =3D 0; -#if SHIFT =3D=3D 1 - d->Q(1) =3D 0; -#endif + if (c->Q(0) > 31) { + shift =3D 31; } else { - shift =3D s->B(0); - d->L(0) <<=3D shift; - d->L(1) <<=3D shift; -#if SHIFT =3D=3D 1 - d->L(2) <<=3D shift; - d->L(3) <<=3D shift; -#endif + shift =3D c->B(0); } + SHIFT_HELPER_BODY(2 << SHIFT, L, FPSRAL); } =20 -void glue(helper_psrlq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_psrlq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) { + Reg *s =3D d; int shift; - - if (s->Q(0) > 63) { + if (c->Q(0) > 63) { d->Q(0) =3D 0; -#if SHIFT =3D=3D 1 - d->Q(1) =3D 0; -#endif + XMM_ONLY(d->Q(1) =3D 0;) + YMM_ONLY( + d->Q(2) =3D 0; + d->Q(3) =3D 0; + ) } else { - shift =3D s->B(0); - d->Q(0) >>=3D shift; -#if SHIFT =3D=3D 1 - d->Q(1) >>=3D shift; -#endif + shift =3D c->B(0); + SHIFT_HELPER_BODY(1 << SHIFT, Q, FPSRL); } } =20 -void glue(helper_psllq, 
SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_psllq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) { + Reg *s =3D d; int shift; - - if (s->Q(0) > 63) { + if (c->Q(0) > 63) { d->Q(0) =3D 0; -#if SHIFT =3D=3D 1 - d->Q(1) =3D 0; -#endif + XMM_ONLY(d->Q(1) =3D 0;) + YMM_ONLY( + d->Q(2) =3D 0; + d->Q(3) =3D 0; + ) } else { - shift =3D s->B(0); - d->Q(0) <<=3D shift; -#if SHIFT =3D=3D 1 - d->Q(1) <<=3D shift; -#endif + shift =3D c->B(0); + SHIFT_HELPER_BODY(1 << SHIFT, Q, FPSLL); } } =20 -#if SHIFT =3D=3D 1 -void glue(helper_psrldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +#if SHIFT >=3D 1 +void glue(helper_psrldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) { + Reg *s =3D d; int shift, i; =20 - shift =3D s->L(0); + shift =3D c->L(0); if (shift > 16) { shift =3D 16; } for (i =3D 0; i < 16 - shift; i++) { - d->B(i) =3D d->B(i + shift); + d->B(i) =3D s->B(i + shift); } for (i =3D 16 - shift; i < 16; i++) { d->B(i) =3D 0; } +#if SHIFT =3D=3D 2 + for (i =3D 0; i < 16 - shift; i++) { + d->B(i + 16) =3D s->B(i + 16 + shift); + } + for (i =3D 16 - shift; i < 16; i++) { + d->B(i + 16) =3D 0; + } +#endif } =20 -void glue(helper_pslldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_pslldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) { + Reg *s =3D d; int shift, i; =20 - shift =3D s->L(0); + shift =3D c->L(0); if (shift > 16) { shift =3D 16; } for (i =3D 15; i >=3D shift; i--) { - d->B(i) =3D d->B(i - shift); + d->B(i) =3D s->B(i - shift); } for (i =3D 0; i < shift; i++) { d->B(i) =3D 0; } +#if SHIFT =3D=3D 2 + for (i =3D 15; i >=3D shift; i--) { + d->B(i + 16) =3D s->B(i + 16 - shift); + } + for (i =3D 0; i < shift; i++) { + d->B(i + 16) =3D 0; + } +#endif } #endif =20 --=20 2.36.0 From nobody Thu May 2 14:26:59 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1650838693667224.68187536741175; Sun, 24 Apr 2022 15:18:13 -0700 (PDT) Received: from localhost ([::1]:47456 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nikYW-0002Eu-JE for importer@patchew.org; Sun, 24 Apr 2022 18:18:12 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50602) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nikSd-0001eC-6S for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:12:08 -0400 Received: from nowt.default.pbrook.uk0.bigv.io ([2001:41c8:51:832:fcff:ff:fe00:46dd]:58797) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nikSb-0002oz-2d for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:12:06 -0400 Received: from cpc91554-seac25-2-0-cust857.7-2.cable.virginm.net ([82.27.199.90] helo=wren.home) by nowt.default.pbrook.uk0.bigv.io with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1nikJ6-0001ea-Uk; Sun, 24 Apr 2022 23:02:16 +0100 From: Paul Brook To: Paolo Bonzini , Richard Henderson , Eduardo Habkost Subject: [PATCH v2 11/42] i386: Rewrite simple integer vector helpers Date: Sun, 24 Apr 2022 23:01:33 +0100 Message-Id: <20220424220204.2493824-12-paul@nowt.org> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220418173904.3746036-1-paul@nowt.org> References: <20220418173904.3746036-1-paul@nowt.org> MIME-Version: 1.0 
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Rewrite the "simple" vector integer helpers in preparation for AVX support.

While the current code is able to use the same prototype for unary
(a = F(b)) and binary (a = F(b, c)) operations, future changes will cause
them to diverge.

No functional changes to existing helpers.

Signed-off-by: Paul Brook
---
 target/i386/ops_sse.h | 180 ++++++++++++++++++++++++++++++++++--------
 1 file changed, 137 insertions(+), 43 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 9297c96d04..bb9cbf9ead 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -275,61 +275,148 @@ void glue(helper_pslldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
 }
 #endif
 
-#define SSE_HELPER_B(name, F) \
+#define SSE_HELPER_1(name, elem, num, F) \
     void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \
     { \
-        d->B(0) = F(d->B(0), s->B(0)); \
-        d->B(1) = F(d->B(1), s->B(1)); \
-        d->B(2) = F(d->B(2), s->B(2)); \
-        d->B(3) = F(d->B(3), s->B(3)); \
-        d->B(4) = F(d->B(4), s->B(4)); \
-        d->B(5) = F(d->B(5), s->B(5)); \
-        d->B(6) = F(d->B(6), s->B(6)); \
-        d->B(7) = F(d->B(7), s->B(7)); \
+        d->elem(0) = F(s->elem(0)); \
+        d->elem(1) = F(s->elem(1)); \
+        if ((num << SHIFT) > 2) { \
+            d->elem(2) = F(s->elem(2)); \
+            d->elem(3) = F(s->elem(3)); \
+        } \
+        if ((num << SHIFT) > 4) { \
+            d->elem(4) = F(s->elem(4)); \
+            d->elem(5) = F(s->elem(5)); \
+            d->elem(6) = F(s->elem(6)); \
+            d->elem(7) = F(s->elem(7)); \
+        } \
+        if ((num << SHIFT) > 8) { \
+            d->elem(8) = F(s->elem(8)); \
+            d->elem(9) = F(s->elem(9)); \
+            d->elem(10) = F(s->elem(10)); \
+            d->elem(11) = F(s->elem(11)); \
+            d->elem(12) = F(s->elem(12)); \
+            d->elem(13) = F(s->elem(13)); \
+            d->elem(14) = F(s->elem(14)); \
+            d->elem(15) = F(s->elem(15)); \
+        } \
+        if ((num << SHIFT) > 16) { \
+            d->elem(16) = F(s->elem(16)); \
+            d->elem(17) = F(s->elem(17)); \
+            d->elem(18) = F(s->elem(18)); \
+            d->elem(19) = F(s->elem(19)); \
+            d->elem(20) = F(s->elem(20)); \
+            d->elem(21) = F(s->elem(21)); \
+            d->elem(22) = F(s->elem(22)); \
+            d->elem(23) = F(s->elem(23)); \
+            d->elem(24) = F(s->elem(24)); \
+            d->elem(25) = F(s->elem(25)); \
+            d->elem(26) = F(s->elem(26)); \
+            d->elem(27) = F(s->elem(27)); \
+            d->elem(28) = F(s->elem(28)); \
+            d->elem(29) = F(s->elem(29)); \
+            d->elem(30) = F(s->elem(30)); \
+            d->elem(31) = F(s->elem(31)); \
+        } \
+    }
+
+#define SSE_HELPER_B(name, F) \
+    void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \
+    { \
+        Reg *v = d; \
+        d->B(0) = F(v->B(0), s->B(0)); \
+
d->B(1) =3D F(v->B(1), s->B(1)); \ + d->B(2) =3D F(v->B(2), s->B(2)); \ + d->B(3) =3D F(v->B(3), s->B(3)); \ + d->B(4) =3D F(v->B(4), s->B(4)); \ + d->B(5) =3D F(v->B(5), s->B(5)); \ + d->B(6) =3D F(v->B(6), s->B(6)); \ + d->B(7) =3D F(v->B(7), s->B(7)); \ XMM_ONLY( \ - d->B(8) =3D F(d->B(8), s->B(8)); \ - d->B(9) =3D F(d->B(9), s->B(9)); \ - d->B(10) =3D F(d->B(10), s->B(10)); \ - d->B(11) =3D F(d->B(11), s->B(11)); \ - d->B(12) =3D F(d->B(12), s->B(12)); \ - d->B(13) =3D F(d->B(13), s->B(13)); \ - d->B(14) =3D F(d->B(14), s->B(14)); \ - d->B(15) =3D F(d->B(15), s->B(15)); \ + d->B(8) =3D F(v->B(8), s->B(8)); \ + d->B(9) =3D F(v->B(9), s->B(9)); \ + d->B(10) =3D F(v->B(10), s->B(10)); \ + d->B(11) =3D F(v->B(11), s->B(11)); \ + d->B(12) =3D F(v->B(12), s->B(12)); \ + d->B(13) =3D F(v->B(13), s->B(13)); \ + d->B(14) =3D F(v->B(14), s->B(14)); \ + d->B(15) =3D F(v->B(15), s->B(15)); \ + ) \ + YMM_ONLY( \ + d->B(16) =3D F(v->B(16), s->B(16)); \ + d->B(17) =3D F(v->B(17), s->B(17)); \ + d->B(18) =3D F(v->B(18), s->B(18)); \ + d->B(19) =3D F(v->B(19), s->B(19)); \ + d->B(20) =3D F(v->B(20), s->B(20)); \ + d->B(21) =3D F(v->B(21), s->B(21)); \ + d->B(22) =3D F(v->B(22), s->B(22)); \ + d->B(23) =3D F(v->B(23), s->B(23)); \ + d->B(24) =3D F(v->B(24), s->B(24)); \ + d->B(25) =3D F(v->B(25), s->B(25)); \ + d->B(26) =3D F(v->B(26), s->B(26)); \ + d->B(27) =3D F(v->B(27), s->B(27)); \ + d->B(28) =3D F(v->B(28), s->B(28)); \ + d->B(29) =3D F(v->B(29), s->B(29)); \ + d->B(30) =3D F(v->B(30), s->B(30)); \ + d->B(31) =3D F(v->B(31), s->B(31)); \ ) \ } =20 #define SSE_HELPER_W(name, F) \ - void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ + void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ { \ - d->W(0) =3D F(d->W(0), s->W(0)); \ - d->W(1) =3D F(d->W(1), s->W(1)); \ - d->W(2) =3D F(d->W(2), s->W(2)); \ - d->W(3) =3D F(d->W(3), s->W(3)); \ + Reg *v =3D d; \ + d->W(0) =3D F(v->W(0), s->W(0)); \ + d->W(1) =3D F(v->W(1), s->W(1)); \ + d->W(2) =3D F(v->W(2), s->W(2)); \ + d->W(3) =3D F(v->W(3), s->W(3)); \ XMM_ONLY( \ - d->W(4) =3D F(d->W(4), s->W(4)); \ - d->W(5) =3D F(d->W(5), s->W(5)); \ - d->W(6) =3D F(d->W(6), s->W(6)); \ - d->W(7) =3D F(d->W(7), s->W(7)); \ + d->W(4) =3D F(v->W(4), s->W(4)); \ + d->W(5) =3D F(v->W(5), s->W(5)); \ + d->W(6) =3D F(v->W(6), s->W(6)); \ + d->W(7) =3D F(v->W(7), s->W(7)); \ + ) \ + YMM_ONLY( \ + d->W(8) =3D F(v->W(8), s->W(8)); \ + d->W(9) =3D F(v->W(9), s->W(9)); \ + d->W(10) =3D F(v->W(10), s->W(10)); \ + d->W(11) =3D F(v->W(11), s->W(11)); \ + d->W(12) =3D F(v->W(12), s->W(12)); \ + d->W(13) =3D F(v->W(13), s->W(13)); \ + d->W(14) =3D F(v->W(14), s->W(14)); \ + d->W(15) =3D F(v->W(15), s->W(15)); \ ) \ } =20 #define SSE_HELPER_L(name, F) \ - void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ + void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ { \ - d->L(0) =3D F(d->L(0), s->L(0)); \ - d->L(1) =3D F(d->L(1), s->L(1)); \ + Reg *v =3D d; \ + d->L(0) =3D F(v->L(0), s->L(0)); \ + d->L(1) =3D F(v->L(1), s->L(1)); \ XMM_ONLY( \ - d->L(2) =3D F(d->L(2), s->L(2)); \ - d->L(3) =3D F(d->L(3), s->L(3)); \ + d->L(2) =3D F(v->L(2), s->L(2)); \ + d->L(3) =3D F(v->L(3), s->L(3)); \ + ) \ + YMM_ONLY( \ + d->L(4) =3D F(v->L(4), s->L(4)); \ + d->L(5) =3D F(v->L(5), s->L(5)); \ + d->L(6) =3D F(v->L(6), s->L(6)); \ + d->L(7) =3D F(v->L(7), s->L(7)); \ ) \ } =20 #define SSE_HELPER_Q(name, F) \ - void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ + void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ { \ - d->Q(0) =3D F(d->Q(0), s->Q(0)); \ 
+ Reg *v =3D d; \ + d->Q(0) =3D F(v->Q(0), s->Q(0)); \ XMM_ONLY( \ - d->Q(1) =3D F(d->Q(1), s->Q(1)); \ + d->Q(1) =3D F(v->Q(1), s->Q(1)); \ + ) \ + YMM_ONLY( \ + d->Q(2) =3D F(v->Q(2), s->Q(2)); \ + d->Q(3) =3D F(v->Q(3), s->Q(3)); \ ) \ } =20 @@ -452,12 +539,19 @@ SSE_HELPER_W(helper_pcmpeqw, FCMPEQ) SSE_HELPER_L(helper_pcmpeql, FCMPEQ) =20 SSE_HELPER_W(helper_pmullw, FMULLW) -#if SHIFT =3D=3D 0 -SSE_HELPER_W(helper_pmulhrw, FMULHRW) -#endif SSE_HELPER_W(helper_pmulhuw, FMULHUW) SSE_HELPER_W(helper_pmulhw, FMULHW) =20 +#if SHIFT =3D=3D 0 +void glue(helper_pmulhrw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +{ + d->W(0) =3D FMULHRW(d->W(0), s->W(0)); + d->W(1) =3D FMULHRW(d->W(1), s->W(1)); + d->W(2) =3D FMULHRW(d->W(2), s->W(2)); + d->W(3) =3D FMULHRW(d->W(3), s->W(3)); +} +#endif + SSE_HELPER_B(helper_pavgb, FAVG) SSE_HELPER_W(helper_pavgw, FAVG) =20 @@ -1581,12 +1675,12 @@ void glue(helper_phsubsw, SUFFIX)(CPUX86State *env,= Reg *d, Reg *s) XMM_ONLY(d->W(7) =3D satsw((int16_t)s->W(6) - (int16_t)s->W(7))); } =20 -#define FABSB(_, x) (x > INT8_MAX ? -(int8_t)x : x) -#define FABSW(_, x) (x > INT16_MAX ? -(int16_t)x : x) -#define FABSL(_, x) (x > INT32_MAX ? -(int32_t)x : x) -SSE_HELPER_B(helper_pabsb, FABSB) -SSE_HELPER_W(helper_pabsw, FABSW) -SSE_HELPER_L(helper_pabsd, FABSL) +#define FABSB(x) (x > INT8_MAX ? -(int8_t)x : x) +#define FABSW(x) (x > INT16_MAX ? -(int16_t)x : x) +#define FABSL(x) (x > INT32_MAX ? -(int32_t)x : x) +SSE_HELPER_1(helper_pabsb, B, 8, FABSB) +SSE_HELPER_1(helper_pabsw, W, 4, FABSW) +SSE_HELPER_1(helper_pabsd, L, 2, FABSL) =20 #define FMULHRSW(d, s) (((int16_t) d * (int16_t)s + 0x4000) >> 15) SSE_HELPER_W(helper_pmulhrsw, FMULHRSW) --=20 2.36.0 From nobody Thu May 2 14:26:59 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 165083993643926.88393483381958; Sun, 24 Apr 2022 15:38:56 -0700 (PDT) Received: from localhost ([::1]:47920 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1niksY-0007og-TA for importer@patchew.org; Sun, 24 Apr 2022 18:38:54 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50830) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nikTj-0002Fo-UT for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:13:20 -0400 Received: from nowt.default.pbrook.uk0.bigv.io ([2001:41c8:51:832:fcff:ff:fe00:46dd]:58843) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nikTg-0002to-3k for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:13:13 -0400 Received: from cpc91554-seac25-2-0-cust857.7-2.cable.virginm.net ([82.27.199.90] helo=wren.home) by nowt.default.pbrook.uk0.bigv.io with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1nikJ7-0001ea-5f; Sun, 24 Apr 2022 23:02:17 +0100 From: Paul Brook To: Paolo Bonzini , Richard Henderson , Eduardo Habkost Subject: [PATCH v2 12/42] i386: Misc integer AVX helper prep Date: Sun, 24 Apr 2022 23:01:34 +0100 Message-Id: <20220424220204.2493824-13-paul@nowt.org> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220418173904.3746036-1-paul@nowt.org> References: <20220418173904.3746036-1-paul@nowt.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable 
Signed-off-by: Paul Brook --- target/i386/ops_sse.h | 133 +++++++++++++++++++++++++++++++++--------- 1 file changed, 104 insertions(+), 29 deletions(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index bb9cbf9ead..d0424140d9 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -557,19 +557,25 @@ SSE_HELPER_W(helper_pavgw, FAVG) =20 void glue(helper_pmuludq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - d->Q(0) =3D (uint64_t)s->L(0) * (uint64_t)d->L(0); -#if SHIFT =3D=3D 1 - d->Q(1) =3D (uint64_t)s->L(2) * (uint64_t)d->L(2); + Reg *v =3D d; + d->Q(0) =3D (uint64_t)s->L(0) * (uint64_t)v->L(0); +#if SHIFT >=3D 1 + d->Q(1) =3D (uint64_t)s->L(2) * (uint64_t)v->L(2); +#if SHIFT =3D=3D 2 + d->Q(2) =3D (uint64_t)s->L(4) * (uint64_t)v->L(4); + d->Q(3) =3D (uint64_t)s->L(6) * (uint64_t)v->L(6); +#endif #endif } =20 void glue(helper_pmaddwd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { + Reg *v =3D d; int i; =20 for (i =3D 0; i < (2 << SHIFT); i++) { - d->L(i) =3D (int16_t)s->W(2 * i) * (int16_t)d->W(2 * i) + - (int16_t)s->W(2 * i + 1) * (int16_t)d->W(2 * i + 1); + d->L(i) =3D (int16_t)s->W(2 * i) * (int16_t)v->W(2 * i) + + (int16_t)s->W(2 * i + 1) * (int16_t)v->W(2 * i + 1); } } =20 @@ -583,31 +589,55 @@ static inline int abs1(int a) } } #endif + void glue(helper_psadbw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { + Reg *v =3D d; unsigned int val; =20 val =3D 0; - val +=3D abs1(d->B(0) - s->B(0)); - val +=3D abs1(d->B(1) - s->B(1)); - val +=3D abs1(d->B(2) - s->B(2)); - val +=3D abs1(d->B(3) - s->B(3)); - val +=3D abs1(d->B(4) - s->B(4)); - val +=3D abs1(d->B(5) - s->B(5)); - val +=3D abs1(d->B(6) - s->B(6)); - val +=3D abs1(d->B(7) - s->B(7)); + val +=3D abs1(v->B(0) - s->B(0)); + val +=3D abs1(v->B(1) - s->B(1)); + val +=3D abs1(v->B(2) - s->B(2)); + val +=3D abs1(v->B(3) - s->B(3)); + val +=3D abs1(v->B(4) - s->B(4)); + val +=3D abs1(v->B(5) - s->B(5)); + val +=3D abs1(v->B(6) - s->B(6)); + val +=3D abs1(v->B(7) - s->B(7)); d->Q(0) =3D val; -#if SHIFT =3D=3D 1 +#if SHIFT >=3D 1 val =3D 0; - val +=3D abs1(d->B(8) - s->B(8)); - val +=3D abs1(d->B(9) - s->B(9)); - val +=3D abs1(d->B(10) - s->B(10)); - val +=3D abs1(d->B(11) - s->B(11)); - val +=3D abs1(d->B(12) - s->B(12)); - val +=3D abs1(d->B(13) - s->B(13)); - val +=3D abs1(d->B(14) - s->B(14)); - val +=3D abs1(d->B(15) - s->B(15)); + val +=3D abs1(v->B(8) - s->B(8)); + val +=3D abs1(v->B(9) - s->B(9)); + val +=3D abs1(v->B(10) - s->B(10)); + val +=3D abs1(v->B(11) - s->B(11)); + val +=3D
abs1(v->B(12) - s->B(12)); + val +=3D abs1(v->B(13) - s->B(13)); + val +=3D abs1(v->B(14) - s->B(14)); + val +=3D abs1(v->B(15) - s->B(15)); d->Q(1) =3D val; +#if SHIFT =3D=3D 2 + val =3D 0; + val +=3D abs1(v->B(16) - s->B(16)); + val +=3D abs1(v->B(17) - s->B(17)); + val +=3D abs1(v->B(18) - s->B(18)); + val +=3D abs1(v->B(19) - s->B(19)); + val +=3D abs1(v->B(20) - s->B(20)); + val +=3D abs1(v->B(21) - s->B(21)); + val +=3D abs1(v->B(22) - s->B(22)); + val +=3D abs1(v->B(23) - s->B(23)); + d->Q(2) =3D val; + val =3D 0; + val +=3D abs1(v->B(24) - s->B(24)); + val +=3D abs1(v->B(25) - s->B(25)); + val +=3D abs1(v->B(26) - s->B(26)); + val +=3D abs1(v->B(27) - s->B(27)); + val +=3D abs1(v->B(28) - s->B(28)); + val +=3D abs1(v->B(29) - s->B(29)); + val +=3D abs1(v->B(30) - s->B(30)); + val +=3D abs1(v->B(31) - s->B(31)); + d->Q(3) =3D val; +#endif #endif } =20 @@ -627,8 +657,12 @@ void glue(helper_movl_mm_T0, SUFFIX)(Reg *d, uint32_t = val) { d->L(0) =3D val; d->L(1) =3D 0; -#if SHIFT =3D=3D 1 +#if SHIFT >=3D 1 d->Q(1) =3D 0; +#if SHIFT =3D=3D 2 + d->Q(2) =3D 0; + d->Q(3) =3D 0; +#endif #endif } =20 @@ -636,8 +670,12 @@ void glue(helper_movl_mm_T0, SUFFIX)(Reg *d, uint32_t = val) void glue(helper_movq_mm_T0, SUFFIX)(Reg *d, uint64_t val) { d->Q(0) =3D val; -#if SHIFT =3D=3D 1 +#if SHIFT >=3D 1 d->Q(1) =3D 0; +#if SHIFT =3D=3D 2 + d->Q(2) =3D 0; + d->Q(3) =3D 0; +#endif #endif } #endif @@ -1251,7 +1289,7 @@ uint32_t glue(helper_pmovmskb, SUFFIX)(CPUX86State *e= nv, Reg *s) val |=3D (s->B(5) >> 2) & 0x20; val |=3D (s->B(6) >> 1) & 0x40; val |=3D (s->B(7)) & 0x80; -#if SHIFT =3D=3D 1 +#if SHIFT >=3D 1 val |=3D (s->B(8) << 1) & 0x0100; val |=3D (s->B(9) << 2) & 0x0200; val |=3D (s->B(10) << 3) & 0x0400; @@ -1260,6 +1298,24 @@ uint32_t glue(helper_pmovmskb, SUFFIX)(CPUX86State *= env, Reg *s) val |=3D (s->B(13) << 6) & 0x2000; val |=3D (s->B(14) << 7) & 0x4000; val |=3D (s->B(15) << 8) & 0x8000; +#if SHIFT =3D=3D 2 + val |=3D ((uint32_t)s->B(16) << 9) & 0x00010000; + val |=3D ((uint32_t)s->B(17) << 10) & 0x00020000; + val |=3D ((uint32_t)s->B(18) << 11) & 0x00040000; + val |=3D ((uint32_t)s->B(19) << 12) & 0x00080000; + val |=3D ((uint32_t)s->B(20) << 13) & 0x00100000; + val |=3D ((uint32_t)s->B(21) << 14) & 0x00200000; + val |=3D ((uint32_t)s->B(22) << 15) & 0x00400000; + val |=3D ((uint32_t)s->B(23) << 16) & 0x00800000; + val |=3D ((uint32_t)s->B(24) << 17) & 0x01000000; + val |=3D ((uint32_t)s->B(25) << 18) & 0x02000000; + val |=3D ((uint32_t)s->B(26) << 19) & 0x04000000; + val |=3D ((uint32_t)s->B(27) << 20) & 0x08000000; + val |=3D ((uint32_t)s->B(28) << 21) & 0x10000000; + val |=3D ((uint32_t)s->B(29) << 22) & 0x20000000; + val |=3D ((uint32_t)s->B(30) << 23) & 0x40000000; + val |=3D ((uint32_t)s->B(31) << 24) & 0x80000000; +#endif #endif return val; } @@ -1799,14 +1855,28 @@ void glue(helper_ptest, SUFFIX)(CPUX86State *env, R= eg *d, Reg *s) uint64_t zf =3D (s->Q(0) & d->Q(0)) | (s->Q(1) & d->Q(1)); uint64_t cf =3D (s->Q(0) & ~d->Q(0)) | (s->Q(1) & ~d->Q(1)); =20 +#if SHIFT =3D=3D 2 + zf |=3D (s->Q(2) & d->Q(2)) | (s->Q(3) & d->Q(3)); + cf |=3D (s->Q(2) & ~d->Q(2)) | (s->Q(3) & ~d->Q(3)); +#endif CC_SRC =3D (zf ? 0 : CC_Z) | (cf ? 
0 : CC_C); } =20 #define SSE_HELPER_F(name, elem, num, F) \ void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ { \ - if (num > 2) { \ - if (num > 4) { \ + if (num * SHIFT > 2) { \ + if (num * SHIFT > 8) { \ + d->elem(15) =3D F(15); \ + d->elem(14) =3D F(14); \ + d->elem(13) =3D F(13); \ + d->elem(12) =3D F(12); \ + d->elem(11) =3D F(11); \ + d->elem(10) =3D F(10); \ + d->elem(9) =3D F(9); \ + d->elem(8) =3D F(8); \ + } \ + if (num * SHIFT > 4) { \ d->elem(7) =3D F(7); \ d->elem(6) =3D F(6); \ d->elem(5) =3D F(5); \ @@ -1834,8 +1904,13 @@ SSE_HELPER_F(helper_pmovzxdq, Q, 2, s->L) =20 void glue(helper_pmuldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - d->Q(0) =3D (int64_t)(int32_t) d->L(0) * (int32_t) s->L(0); - d->Q(1) =3D (int64_t)(int32_t) d->L(2) * (int32_t) s->L(2); + Reg *v =3D d; + d->Q(0) =3D (int64_t)(int32_t) v->L(0) * (int32_t) s->L(0); + d->Q(1) =3D (int64_t)(int32_t) v->L(2) * (int32_t) s->L(2); +#if SHIFT =3D=3D 2 + d->Q(2) =3D (int64_t)(int32_t) v->L(4) * (int32_t) s->L(4); + d->Q(3) =3D (int64_t)(int32_t) v->L(6) * (int32_t) s->L(6); +#endif } =20 #define FCMPEQQ(d, s) (d =3D=3D s ? -1 : 0) --=20 2.36.0 From nobody Thu May 2 14:26:59 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1650838924816415.84004600205503; Sun, 24 Apr 2022 15:22:04 -0700 (PDT) Received: from localhost ([::1]:55468 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nikcF-0007cs-IM for importer@patchew.org; Sun, 24 Apr 2022 18:22:03 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50676) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nikSu-0001iq-7h for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:12:31 -0400 Received: from nowt.default.pbrook.uk0.bigv.io ([2001:41c8:51:832:fcff:ff:fe00:46dd]:58809) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nikSr-0002rD-3A for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:12:23 -0400 Received: from cpc91554-seac25-2-0-cust857.7-2.cable.virginm.net ([82.27.199.90] helo=wren.home) by nowt.default.pbrook.uk0.bigv.io with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1nikJ7-0001ea-D1; Sun, 24 Apr 2022 23:02:17 +0100 From: Paul Brook To: Paolo Bonzini , Richard Henderson , Eduardo Habkost Subject: [PATCH v2 13/42] i386: Destructive vector helpers for AVX Date: Sun, 24 Apr 2022 23:01:35 +0100 Message-Id: <20220424220204.2493824-14-paul@nowt.org> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220418173904.3746036-1-paul@nowt.org> References: <20220418173904.3746036-1-paul@nowt.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2001:41c8:51:832:fcff:ff:fe00:46dd; envelope-from=paul@nowt.org; helo=nowt.default.pbrook.uk0.bigv.io X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no 
X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "open list:All patches CC here" , Paul Brook Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZM-MESSAGEID: 1650838925687100001 Content-Type: text/plain; charset="utf-8"
These helpers need to take special care to avoid overwriting source values before the whole result has been calculated. Currently they use a dummy Reg typed variable to store the result and then assign the whole register. This will cause 128 bit operations to corrupt the upper half of the register, so replace it with explicit temporaries and element assignments.
Signed-off-by: Paul Brook --- target/i386/ops_sse.h | 707 ++++++++++++++++++++++++++---------------- 1 file changed, 437 insertions(+), 270 deletions(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index d0424140d9..c645d2ddbf 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -680,71 +680,85 @@ void glue(helper_movq_mm_T0, SUFFIX)(Reg *d, uint64_t= val) } #endif =20 +#define SHUFFLE4(F, a, b, offset) do { \ + r0 =3D a->F((order & 3) + offset); \ + r1 =3D a->F(((order >> 2) & 3) + offset); \ + r2 =3D b->F(((order >> 4) & 3) + offset); \ + r3 =3D b->F(((order >> 6) & 3) + offset); \ + d->F(offset) =3D r0; \ + d->F(offset + 1) =3D r1; \ + d->F(offset + 2) =3D r2; \ + d->F(offset + 3) =3D r3; \ + } while (0) + #if SHIFT =3D=3D 0 void glue(helper_pshufw, SUFFIX)(Reg *d, Reg *s, int order) { - Reg r; + uint16_t r0, r1, r2, r3; =20 - r.W(0) =3D s->W(order & 3); - r.W(1) =3D s->W((order >> 2) & 3); - r.W(2) =3D s->W((order >> 4) & 3); - r.W(3) =3D s->W((order >> 6) & 3); - MOVE(*d, r); + SHUFFLE4(W, s, s, 0); } #else void helper_shufps(Reg *d, Reg *s, int order) { - Reg r; + Reg *v =3D d; + uint32_t r0, r1, r2, r3; =20 - r.L(0) =3D d->L(order & 3); - r.L(1) =3D d->L((order >> 2) & 3); - r.L(2) =3D s->L((order >> 4) & 3); - r.L(3) =3D s->L((order >> 6) & 3); - MOVE(*d, r); + SHUFFLE4(L, v, s, 0); +#if SHIFT =3D=3D 2 + SHUFFLE4(L, v, s, 4); +#endif } =20 void helper_shufpd(Reg *d, Reg *s, int order) { - Reg r; + Reg *v =3D d; + uint64_t r0, r1; =20 - r.Q(0) =3D d->Q(order & 1); - r.Q(1) =3D s->Q((order >> 1) & 1); - MOVE(*d, r); + r0 =3D v->Q(order & 1); + r1 =3D s->Q((order >> 1) & 1); + d->Q(0) =3D r0; + d->Q(1) =3D r1; +#if SHIFT =3D=3D 2 + r0 =3D v->Q(((order >> 2) & 1) + 2); + r1 =3D s->Q(((order >> 3) & 1) + 2); + d->Q(2) =3D r0; + d->Q(3) =3D r1; +#endif } =20 void glue(helper_pshufd, SUFFIX)(Reg *d, Reg *s, int order) { - Reg r; + uint32_t r0, r1, r2, r3; =20 - r.L(0) =3D s->L(order & 3); - r.L(1) =3D s->L((order >> 2) & 3); - r.L(2) =3D s->L((order >> 4) & 3); - r.L(3) =3D s->L((order >> 6) & 3); - MOVE(*d, r); + SHUFFLE4(L, s, s, 0); +#if SHIFT =3D=3D 2 + SHUFFLE4(L, s, s, 4); +#endif } =20 void glue(helper_pshuflw, SUFFIX)(Reg *d, Reg *s, int order) { - Reg r; + uint16_t r0, r1, r2, r3; =20 - r.W(0) =3D s->W(order & 3); - r.W(1) =3D s->W((order >> 2) & 3); - r.W(2) =3D s->W((order >> 4) & 3); - r.W(3) =3D s->W((order >> 6) & 3); - r.Q(1) =3D s->Q(1); - MOVE(*d, r); + SHUFFLE4(W, s, s, 0); + d->Q(1) =3D s->Q(1); +#if SHIFT =3D=3D 2 + SHUFFLE4(W, s, s, 8); + d->Q(3) =3D s->Q(3); +#endif } =20 void glue(helper_pshufhw, SUFFIX)(Reg *d, Reg *s, int order) { - Reg r; + uint16_t r0, r1, r2, r3; =20 - r.Q(0) =3D s->Q(0); - r.W(4) =3D s->W(4 + (order & 3)); - r.W(5) =3D s->W(4 + ((order >> 2) & 3)); - r.W(6) =3D s->W(4 +
((order >> 4) & 3)); - r.W(7) =3D s->W(4 + ((order >> 6) & 3)); - MOVE(*d, r); + d->Q(0) =3D s->Q(0); + SHUFFLE4(W, s, s, 4); +#if SHIFT =3D=3D 2 + d->Q(2) =3D s->Q(2); + SHUFFLE4(W, s, s, 12); +#endif } #endif =20 @@ -1320,156 +1334,190 @@ uint32_t glue(helper_pmovmskb, SUFFIX)(CPUX86Stat= e *env, Reg *s) return val; } =20 -void glue(helper_packsswb, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) -{ - Reg r; - - r.B(0) =3D satsb((int16_t)d->W(0)); - r.B(1) =3D satsb((int16_t)d->W(1)); - r.B(2) =3D satsb((int16_t)d->W(2)); - r.B(3) =3D satsb((int16_t)d->W(3)); -#if SHIFT =3D=3D 1 - r.B(4) =3D satsb((int16_t)d->W(4)); - r.B(5) =3D satsb((int16_t)d->W(5)); - r.B(6) =3D satsb((int16_t)d->W(6)); - r.B(7) =3D satsb((int16_t)d->W(7)); -#endif - r.B((4 << SHIFT) + 0) =3D satsb((int16_t)s->W(0)); - r.B((4 << SHIFT) + 1) =3D satsb((int16_t)s->W(1)); - r.B((4 << SHIFT) + 2) =3D satsb((int16_t)s->W(2)); - r.B((4 << SHIFT) + 3) =3D satsb((int16_t)s->W(3)); -#if SHIFT =3D=3D 1 - r.B(12) =3D satsb((int16_t)s->W(4)); - r.B(13) =3D satsb((int16_t)s->W(5)); - r.B(14) =3D satsb((int16_t)s->W(6)); - r.B(15) =3D satsb((int16_t)s->W(7)); -#endif - MOVE(*d, r); -} - -void glue(helper_packuswb, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) -{ - Reg r; - - r.B(0) =3D satub((int16_t)d->W(0)); - r.B(1) =3D satub((int16_t)d->W(1)); - r.B(2) =3D satub((int16_t)d->W(2)); - r.B(3) =3D satub((int16_t)d->W(3)); -#if SHIFT =3D=3D 1 - r.B(4) =3D satub((int16_t)d->W(4)); - r.B(5) =3D satub((int16_t)d->W(5)); - r.B(6) =3D satub((int16_t)d->W(6)); - r.B(7) =3D satub((int16_t)d->W(7)); -#endif - r.B((4 << SHIFT) + 0) =3D satub((int16_t)s->W(0)); - r.B((4 << SHIFT) + 1) =3D satub((int16_t)s->W(1)); - r.B((4 << SHIFT) + 2) =3D satub((int16_t)s->W(2)); - r.B((4 << SHIFT) + 3) =3D satub((int16_t)s->W(3)); -#if SHIFT =3D=3D 1 - r.B(12) =3D satub((int16_t)s->W(4)); - r.B(13) =3D satub((int16_t)s->W(5)); - r.B(14) =3D satub((int16_t)s->W(6)); - r.B(15) =3D satub((int16_t)s->W(7)); +#if SHIFT =3D=3D 0 +#define PACK_WIDTH 4 +#else +#define PACK_WIDTH 8 #endif - MOVE(*d, r); -} =20 void glue(helper_packssdw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - Reg r; + Reg *v =3D d; + uint16_t r[PACK_WIDTH]; + int i; =20 - r.W(0) =3D satsw(d->L(0)); - r.W(1) =3D satsw(d->L(1)); -#if SHIFT =3D=3D 1 - r.W(2) =3D satsw(d->L(2)); - r.W(3) =3D satsw(d->L(3)); + r[0] =3D satsw(v->L(0)); + r[1] =3D satsw(v->L(1)); + r[PACK_WIDTH / 2 + 0] =3D satsw(s->L(0)); + r[PACK_WIDTH / 2 + 1] =3D satsw(s->L(1)); +#if SHIFT >=3D 1 + r[2] =3D satsw(v->L(2)); + r[3] =3D satsw(v->L(3)); + r[6] =3D satsw(s->L(2)); + r[7] =3D satsw(s->L(3)); #endif - r.W((2 << SHIFT) + 0) =3D satsw(s->L(0)); - r.W((2 << SHIFT) + 1) =3D satsw(s->L(1)); -#if SHIFT =3D=3D 1 - r.W(6) =3D satsw(s->L(2)); - r.W(7) =3D satsw(s->L(3)); + for (i =3D 0; i < PACK_WIDTH; i++) { + d->W(i) =3D r[i]; + } +#if SHIFT =3D=3D 2 + r[0] =3D satsw(v->L(4)); + r[1] =3D satsw(v->L(5)); + r[2] =3D satsw(v->L(6)); + r[3] =3D satsw(v->L(7)); + r[4] =3D satsw(s->L(4)); + r[5] =3D satsw(s->L(5)); + r[6] =3D satsw(s->L(6)); + r[7] =3D satsw(s->L(7)); + for (i =3D 0; i < 8; i++) { + d->W(i + 8) =3D r[i]; + } #endif - MOVE(*d, r); } =20 #define UNPCK_OP(base_name, base) \ \ void glue(helper_punpck ## base_name ## bw, SUFFIX)(CPUX86State *env,\ - Reg *d, Reg *s) \ + Reg *d, Reg *s) \ { \ - Reg r; \ + Reg *v =3D d; \ + uint8_t r[PACK_WIDTH * 2]; \ + int i; \ \ - r.B(0) =3D d->B((base << (SHIFT + 2)) + 0); \ - r.B(1) =3D s->B((base << (SHIFT + 2)) + 0); \ - r.B(2) =3D d->B((base << (SHIFT + 2)) + 1); \ - r.B(3) =3D s->B((base 
<< (SHIFT + 2)) + 1); \ - r.B(4) =3D d->B((base << (SHIFT + 2)) + 2); \ - r.B(5) =3D s->B((base << (SHIFT + 2)) + 2); \ - r.B(6) =3D d->B((base << (SHIFT + 2)) + 3); \ - r.B(7) =3D s->B((base << (SHIFT + 2)) + 3); \ + r[0] =3D v->B((base * PACK_WIDTH) + 0); \ + r[1] =3D s->B((base * PACK_WIDTH) + 0); \ + r[2] =3D v->B((base * PACK_WIDTH) + 1); \ + r[3] =3D s->B((base * PACK_WIDTH) + 1); \ + r[4] =3D v->B((base * PACK_WIDTH) + 2); \ + r[5] =3D s->B((base * PACK_WIDTH) + 2); \ + r[6] =3D v->B((base * PACK_WIDTH) + 3); \ + r[7] =3D s->B((base * PACK_WIDTH) + 3); \ XMM_ONLY( \ - r.B(8) =3D d->B((base << (SHIFT + 2)) + 4); \ - r.B(9) =3D s->B((base << (SHIFT + 2)) + 4); \ - r.B(10) =3D d->B((base << (SHIFT + 2)) + 5); \ - r.B(11) =3D s->B((base << (SHIFT + 2)) + 5); \ - r.B(12) =3D d->B((base << (SHIFT + 2)) + 6); \ - r.B(13) =3D s->B((base << (SHIFT + 2)) + 6); \ - r.B(14) =3D d->B((base << (SHIFT + 2)) + 7); \ - r.B(15) =3D s->B((base << (SHIFT + 2)) + 7); \ + r[8] =3D v->B((base * PACK_WIDTH) + 4); \ + r[9] =3D s->B((base * PACK_WIDTH) + 4); \ + r[10] =3D v->B((base * PACK_WIDTH) + 5); \ + r[11] =3D s->B((base * PACK_WIDTH) + 5); \ + r[12] =3D v->B((base * PACK_WIDTH) + 6); \ + r[13] =3D s->B((base * PACK_WIDTH) + 6); \ + r[14] =3D v->B((base * PACK_WIDTH) + 7); \ + r[15] =3D s->B((base * PACK_WIDTH) + 7); \ + ) \ + for (i =3D 0; i < PACK_WIDTH * 2; i++) { \ + d->B(i) =3D r[i]; \ + } \ + YMM_ONLY( \ + r[0] =3D v->B((base * 8) + 16); \ + r[1] =3D s->B((base * 8) + 16); \ + r[2] =3D v->B((base * 8) + 17); \ + r[3] =3D s->B((base * 8) + 17); \ + r[4] =3D v->B((base * 8) + 18); \ + r[5] =3D s->B((base * 8) + 18); \ + r[6] =3D v->B((base * 8) + 19); \ + r[7] =3D s->B((base * 8) + 19); \ + r[8] =3D v->B((base * 8) + 20); \ + r[9] =3D s->B((base * 8) + 20); \ + r[10] =3D v->B((base * 8) + 21); \ + r[11] =3D s->B((base * 8) + 21); \ + r[12] =3D v->B((base * 8) + 22); \ + r[13] =3D s->B((base * 8) + 22); \ + r[14] =3D v->B((base * 8) + 23); \ + r[15] =3D s->B((base * 8) + 23); \ + for (i =3D 0; i < PACK_WIDTH * 2; i++) { \ + d->B(16 + i) =3D r[i]; \ + } \ ) \ - MOVE(*d, r); \ } \ \ void glue(helper_punpck ## base_name ## wd, SUFFIX)(CPUX86State *env,\ - Reg *d, Reg *s) \ + Reg *d, Reg *s) \ { \ - Reg r; \ + Reg *v =3D d; \ + uint16_t r[PACK_WIDTH]; \ + int i; \ \ - r.W(0) =3D d->W((base << (SHIFT + 1)) + 0); \ - r.W(1) =3D s->W((base << (SHIFT + 1)) + 0); \ - r.W(2) =3D d->W((base << (SHIFT + 1)) + 1); \ - r.W(3) =3D s->W((base << (SHIFT + 1)) + 1); \ + r[0] =3D v->W((base * (PACK_WIDTH / 2)) + 0); \ + r[1] =3D s->W((base * (PACK_WIDTH / 2)) + 0); \ + r[2] =3D v->W((base * (PACK_WIDTH / 2)) + 1); \ + r[3] =3D s->W((base * (PACK_WIDTH / 2)) + 1); \ XMM_ONLY( \ - r.W(4) =3D d->W((base << (SHIFT + 1)) + 2); \ - r.W(5) =3D s->W((base << (SHIFT + 1)) + 2); \ - r.W(6) =3D d->W((base << (SHIFT + 1)) + 3); \ - r.W(7) =3D s->W((base << (SHIFT + 1)) + 3); \ + r[4] =3D v->W((base * 4) + 2); \ + r[5] =3D s->W((base * 4) + 2); \ + r[6] =3D v->W((base * 4) + 3); \ + r[7] =3D s->W((base * 4) + 3); \ + ) \ + for (i =3D 0; i < PACK_WIDTH; i++) { \ + d->W(i) =3D r[i]; \ + } \ + YMM_ONLY( \ + r[0] =3D v->W((base * 4) + 8); \ + r[1] =3D s->W((base * 4) + 8); \ + r[2] =3D v->W((base * 4) + 9); \ + r[3] =3D s->W((base * 4) + 9); \ + r[4] =3D v->W((base * 4) + 10); \ + r[5] =3D s->W((base * 4) + 10); \ + r[6] =3D v->W((base * 4) + 11); \ + r[7] =3D s->W((base * 4) + 11); \ + for (i =3D 0; i < PACK_WIDTH; i++) { \ + d->W(i + 8) =3D r[i]; \ + } \ ) \ - MOVE(*d, r); \ } \ \ void glue(helper_punpck ## base_name ## dq, 
SUFFIX)(CPUX86State *env,\ - Reg *d, Reg *s) \ + Reg *d, Reg *s) \ { \ - Reg r; \ + Reg *v =3D d; \ + uint32_t r[4]; \ \ - r.L(0) =3D d->L((base << SHIFT) + 0); \ - r.L(1) =3D s->L((base << SHIFT) + 0); \ + r[0] =3D v->L((base * (PACK_WIDTH / 4)) + 0); \ + r[1] =3D s->L((base * (PACK_WIDTH / 4)) + 0); \ XMM_ONLY( \ - r.L(2) =3D d->L((base << SHIFT) + 1); \ - r.L(3) =3D s->L((base << SHIFT) + 1); \ + r[2] =3D v->L((base * 2) + 1); \ + r[3] =3D s->L((base * 2) + 1); \ + d->L(2) =3D r[2]; \ + d->L(3) =3D r[3]; \ + ) \ + d->L(0) =3D r[0]; \ + d->L(1) =3D r[1]; \ + YMM_ONLY( \ + r[0] =3D v->L((base * 2) + 4); \ + r[1] =3D s->L((base * 2) + 4); \ + r[2] =3D v->L((base * 2) + 5); \ + r[3] =3D s->L((base * 2) + 5); \ + d->L(4) =3D r[0]; \ + d->L(5) =3D r[1]; \ + d->L(6) =3D r[2]; \ + d->L(7) =3D r[3]; \ ) \ - MOVE(*d, r); \ } \ \ XMM_ONLY( \ - void glue(helper_punpck ## base_name ## qdq, SUFFIX)(CPUX86St= ate \ - *env, \ - Reg *d, \ - Reg *s) \ + void glue(helper_punpck ## base_name ## qdq, SUFFIX)( \ + CPUX86State *env, Reg *d, Reg *s) \ { \ - Reg r; \ + Reg *v =3D d; \ + uint64_t r[2]; \ \ - r.Q(0) =3D d->Q(base); \ - r.Q(1) =3D s->Q(base); \ - MOVE(*d, r); \ + r[0] =3D v->Q(base); \ + r[1] =3D s->Q(base); \ + d->Q(0) =3D r[0]; \ + d->Q(1) =3D r[1]; \ + YMM_ONLY( \ + r[0] =3D v->Q(base + 2); \ + r[1] =3D s->Q(base + 2); \ + d->Q(2) =3D r[0]; \ + d->Q(3) =3D r[1]; \ + ) \ } \ ) =20 UNPCK_OP(l, 0) UNPCK_OP(h, 1) =20 +#undef PACK_WIDTH +#undef PACK_HELPER_B +#undef PACK4 + + /* 3DNow! float ops */ #if SHIFT =3D=3D 0 void helper_pi2fd(CPUX86State *env, MMXReg *d, MMXReg *s) @@ -1622,113 +1670,172 @@ void helper_pswapd(CPUX86State *env, MMXReg *d, M= MXReg *s) /* SSSE3 op helpers */ void glue(helper_pshufb, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { + Reg *v =3D d; int i; - Reg r; +#if SHIFT =3D=3D 0 + uint8_t r[8]; =20 - for (i =3D 0; i < (8 << SHIFT); i++) { - r.B(i) =3D (s->B(i) & 0x80) ? 0 : (d->B(s->B(i) & ((8 << SHIFT) - = 1))); + for (i =3D 0; i < 8; i++) { + r[i] =3D (s->B(i) & 0x80) ? 0 : (v->B(s->B(i) & 7)); + } + for (i =3D 0; i < 8; i++) { + d->B(i) =3D r[i]; } +#else + uint8_t r[16]; =20 - MOVE(*d, r); + for (i =3D 0; i < 16; i++) { + r[i] =3D (s->B(i) & 0x80) ? 0 : (v->B(s->B(i) & 0xf)); + } + for (i =3D 0; i < 16; i++) { + d->B(i) =3D r[i]; + } +#if SHIFT =3D=3D 2 + for (i =3D 0; i < 16; i++) { + r[i] =3D (s->B(i + 16) & 0x80) ? 
0 : (v->B((s->B(i + 16) & 0xf) + = 16)); + } + for (i =3D 0; i < 16; i++) { + d->B(i + 16) =3D r[i]; + } +#endif +#endif } =20 -void glue(helper_phaddw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) -{ - - Reg r; - - r.W(0) =3D (int16_t)d->W(0) + (int16_t)d->W(1); - r.W(1) =3D (int16_t)d->W(2) + (int16_t)d->W(3); - XMM_ONLY(r.W(2) =3D (int16_t)d->W(4) + (int16_t)d->W(5)); - XMM_ONLY(r.W(3) =3D (int16_t)d->W(6) + (int16_t)d->W(7)); - r.W((2 << SHIFT) + 0) =3D (int16_t)s->W(0) + (int16_t)s->W(1); - r.W((2 << SHIFT) + 1) =3D (int16_t)s->W(2) + (int16_t)s->W(3); - XMM_ONLY(r.W(6) =3D (int16_t)s->W(4) + (int16_t)s->W(5)); - XMM_ONLY(r.W(7) =3D (int16_t)s->W(6) + (int16_t)s->W(7)); +#if SHIFT =3D=3D 0 =20 - MOVE(*d, r); +#define SSE_HELPER_HW(name, F) \ +void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ +{ \ + Reg *v =3D d; \ + uint16_t r[4]; \ + r[0] =3D F(v->W(0), v->W(1)); \ + r[1] =3D F(v->W(2), v->W(3)); \ + r[2] =3D F(s->W(0), s->W(1)); \ + r[3] =3D F(s->W(2), s->W(3)); \ + d->W(0) =3D r[0]; \ + d->W(1) =3D r[1]; \ + d->W(2) =3D r[2]; \ + d->W(3) =3D r[3]; \ +} + +#define SSE_HELPER_HL(name, F) \ +void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ +{ \ + Reg *v =3D d; \ + uint32_t r0, r1; \ + r0 =3D F(v->L(0), v->L(1)); \ + r1 =3D F(s->L(0), s->L(1)); \ + d->L(0) =3D r0; \ + d->L(1) =3D r1; \ } =20 -void glue(helper_phaddd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) -{ - Reg r; - - r.L(0) =3D (int32_t)d->L(0) + (int32_t)d->L(1); - XMM_ONLY(r.L(1) =3D (int32_t)d->L(2) + (int32_t)d->L(3)); - r.L((1 << SHIFT) + 0) =3D (int32_t)s->L(0) + (int32_t)s->L(1); - XMM_ONLY(r.L(3) =3D (int32_t)s->L(2) + (int32_t)s->L(3)); +#else =20 - MOVE(*d, r); +#define SSE_HELPER_HW(name, F) \ +void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ +{ \ + Reg *v =3D d; \ + int32_t r[8]; \ + r[0] =3D F(v->W(0), v->W(1)); \ + r[1] =3D F(v->W(2), v->W(3)); \ + r[2] =3D F(v->W(4), v->W(5)); \ + r[3] =3D F(v->W(6), v->W(7)); \ + r[4] =3D F(s->W(0), s->W(1)); \ + r[5] =3D F(s->W(2), s->W(3)); \ + r[6] =3D F(s->W(4), s->W(5)); \ + r[7] =3D F(s->W(6), s->W(7)); \ + d->W(0) =3D r[0]; \ + d->W(1) =3D r[1]; \ + d->W(2) =3D r[2]; \ + d->W(3) =3D r[3]; \ + d->W(4) =3D r[4]; \ + d->W(5) =3D r[5]; \ + d->W(6) =3D r[6]; \ + d->W(7) =3D r[7]; \ + YMM_ONLY( \ + r[0] =3D F(v->W(8), v->W(9)); \ + r[1] =3D F(v->W(10), v->W(11)); \ + r[2] =3D F(v->W(12), v->W(13)); \ + r[3] =3D F(v->W(14), v->W(15)); \ + r[4] =3D F(s->W(8), s->W(9)); \ + r[5] =3D F(s->W(10), s->W(11)); \ + r[6] =3D F(s->W(12), s->W(13)); \ + r[7] =3D F(s->W(14), s->W(15)); \ + d->W(8) =3D r[0]; \ + d->W(9) =3D r[1]; \ + d->W(10) =3D r[2]; \ + d->W(11) =3D r[3]; \ + d->W(12) =3D r[4]; \ + d->W(13) =3D r[5]; \ + d->W(14) =3D r[6]; \ + d->W(15) =3D r[7]; \ + ) \ +} + +#define SSE_HELPER_HL(name, F) \ +void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ +{ \ + Reg *v =3D d; \ + int32_t r0, r1, r2, r3; \ + r0 =3D F(v->L(0), v->L(1)); \ + r1 =3D F(v->L(2), v->L(3)); \ + r2 =3D F(s->L(0), s->L(1)); \ + r3 =3D F(s->L(2), s->L(3)); \ + d->L(0) =3D r0; \ + d->L(1) =3D r1; \ + d->L(2) =3D r2; \ + d->L(3) =3D r3; \ + YMM_ONLY( \ + r0 =3D F(v->L(4), v->L(5)); \ + r1 =3D F(v->L(6), v->L(7)); \ + r2 =3D F(s->L(4), s->L(5)); \ + r3 =3D F(s->L(6), s->L(7)); \ + d->L(4) =3D r0; \ + d->L(5) =3D r1; \ + d->L(6) =3D r2; \ + d->L(7) =3D r3; \ + ) \ } +#endif =20 -void glue(helper_phaddsw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) -{ - Reg r; - - r.W(0) =3D satsw((int16_t)d->W(0) + (int16_t)d->W(1)); - r.W(1) =3D
satsw((int16_t)d->W(2) + (int16_t)d->W(3)); - XMM_ONLY(r.W(2) =3D satsw((int16_t)d->W(4) + (int16_t)d->W(5))); - XMM_ONLY(r.W(3) =3D satsw((int16_t)d->W(6) + (int16_t)d->W(7))); - r.W((2 << SHIFT) + 0) =3D satsw((int16_t)s->W(0) + (int16_t)s->W(1)); - r.W((2 << SHIFT) + 1) =3D satsw((int16_t)s->W(2) + (int16_t)s->W(3)); - XMM_ONLY(r.W(6) =3D satsw((int16_t)s->W(4) + (int16_t)s->W(5))); - XMM_ONLY(r.W(7) =3D satsw((int16_t)s->W(6) + (int16_t)s->W(7))); +SSE_HELPER_HW(phaddw, FADD) +SSE_HELPER_HW(phsubw, FSUB) +SSE_HELPER_HW(phaddsw, FADDSW) +SSE_HELPER_HW(phsubsw, FSUBSW) +SSE_HELPER_HL(phaddd, FADD) +SSE_HELPER_HL(phsubd, FSUB) =20 - MOVE(*d, r); -} +#undef SSE_HELPER_HW +#undef SSE_HELPER_HL =20 void glue(helper_pmaddubsw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - d->W(0) =3D satsw((int8_t)s->B(0) * (uint8_t)d->B(0) + - (int8_t)s->B(1) * (uint8_t)d->B(1)); - d->W(1) =3D satsw((int8_t)s->B(2) * (uint8_t)d->B(2) + - (int8_t)s->B(3) * (uint8_t)d->B(3)); - d->W(2) =3D satsw((int8_t)s->B(4) * (uint8_t)d->B(4) + - (int8_t)s->B(5) * (uint8_t)d->B(5)); - d->W(3) =3D satsw((int8_t)s->B(6) * (uint8_t)d->B(6) + - (int8_t)s->B(7) * (uint8_t)d->B(7)); -#if SHIFT =3D=3D 1 - d->W(4) =3D satsw((int8_t)s->B(8) * (uint8_t)d->B(8) + - (int8_t)s->B(9) * (uint8_t)d->B(9)); - d->W(5) =3D satsw((int8_t)s->B(10) * (uint8_t)d->B(10) + - (int8_t)s->B(11) * (uint8_t)d->B(11)); - d->W(6) =3D satsw((int8_t)s->B(12) * (uint8_t)d->B(12) + - (int8_t)s->B(13) * (uint8_t)d->B(13)); - d->W(7) =3D satsw((int8_t)s->B(14) * (uint8_t)d->B(14) + - (int8_t)s->B(15) * (uint8_t)d->B(15)); + Reg *v =3D d; + d->W(0) =3D satsw((int8_t)s->B(0) * (uint8_t)v->B(0) + + (int8_t)s->B(1) * (uint8_t)v->B(1)); + d->W(1) =3D satsw((int8_t)s->B(2) * (uint8_t)v->B(2) + + (int8_t)s->B(3) * (uint8_t)v->B(3)); + d->W(2) =3D satsw((int8_t)s->B(4) * (uint8_t)v->B(4) + + (int8_t)s->B(5) * (uint8_t)v->B(5)); + d->W(3) =3D satsw((int8_t)s->B(6) * (uint8_t)v->B(6) + + (int8_t)s->B(7) * (uint8_t)v->B(7)); +#if SHIFT >=3D 1 + d->W(4) =3D satsw((int8_t)s->B(8) * (uint8_t)v->B(8) + + (int8_t)s->B(9) * (uint8_t)v->B(9)); + d->W(5) =3D satsw((int8_t)s->B(10) * (uint8_t)v->B(10) + + (int8_t)s->B(11) * (uint8_t)v->B(11)); + d->W(6) =3D satsw((int8_t)s->B(12) * (uint8_t)v->B(12) + + (int8_t)s->B(13) * (uint8_t)v->B(13)); + d->W(7) =3D satsw((int8_t)s->B(14) * (uint8_t)v->B(14) + + (int8_t)s->B(15) * (uint8_t)v->B(15)); +#if SHIFT =3D=3D 2 + int i; + for (i =3D 8; i < 16; i++) { + d->W(i) =3D satsw((int8_t)s->B(i * 2) * (uint8_t)v->B(i * 2) + + (int8_t)s->B(i * 2 + 1) * (uint8_t)v->B(i * 2 + 1)= ); + } +#endif #endif -} - -void glue(helper_phsubw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) -{ - d->W(0) =3D (int16_t)d->W(0) - (int16_t)d->W(1); - d->W(1) =3D (int16_t)d->W(2) - (int16_t)d->W(3); - XMM_ONLY(d->W(2) =3D (int16_t)d->W(4) - (int16_t)d->W(5)); - XMM_ONLY(d->W(3) =3D (int16_t)d->W(6) - (int16_t)d->W(7)); - d->W((2 << SHIFT) + 0) =3D (int16_t)s->W(0) - (int16_t)s->W(1); - d->W((2 << SHIFT) + 1) =3D (int16_t)s->W(2) - (int16_t)s->W(3); - XMM_ONLY(d->W(6) =3D (int16_t)s->W(4) - (int16_t)s->W(5)); - XMM_ONLY(d->W(7) =3D (int16_t)s->W(6) - (int16_t)s->W(7)); -} - -void glue(helper_phsubd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) -{ - d->L(0) =3D (int32_t)d->L(0) - (int32_t)d->L(1); - XMM_ONLY(d->L(1) =3D (int32_t)d->L(2) - (int32_t)d->L(3)); - d->L((1 << SHIFT) + 0) =3D (int32_t)s->L(0) - (int32_t)s->L(1); - XMM_ONLY(d->L(3) =3D (int32_t)s->L(2) - (int32_t)s->L(3)); -} - -void glue(helper_phsubsw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) -{ - d->W(0) =3D 
satsw((int16_t)d->W(0) - (int16_t)d->W(1)); - d->W(1) =3D satsw((int16_t)d->W(2) - (int16_t)d->W(3)); - XMM_ONLY(d->W(2) =3D satsw((int16_t)d->W(4) - (int16_t)d->W(5))); - XMM_ONLY(d->W(3) =3D satsw((int16_t)d->W(6) - (int16_t)d->W(7))); - d->W((2 << SHIFT) + 0) =3D satsw((int16_t)s->W(0) - (int16_t)s->W(1)); - d->W((2 << SHIFT) + 1) =3D satsw((int16_t)s->W(2) - (int16_t)s->W(3)); - XMM_ONLY(d->W(6) =3D satsw((int16_t)s->W(4) - (int16_t)s->W(5))); - XMM_ONLY(d->W(7) =3D satsw((int16_t)s->W(6) - (int16_t)s->W(7))); } =20 #define FABSB(x) (x > INT8_MAX ? -(int8_t)x : x) @@ -1751,32 +1858,49 @@ SSE_HELPER_L(helper_psignd, FSIGNL) void glue(helper_palignr, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, int32_t shift) { - Reg r; - + Reg *v =3D d; /* XXX could be checked during translation */ - if (shift >=3D (16 << SHIFT)) { - r.Q(0) =3D 0; - XMM_ONLY(r.Q(1) =3D 0); + if (shift >=3D (SHIFT ? 32 : 16)) { + d->Q(0) =3D 0; + XMM_ONLY(d->Q(1) =3D 0); +#if SHIFT =3D=3D 2 + d->Q(2) =3D 0; + d->Q(3) =3D 0; +#endif } else { shift <<=3D 3; #define SHR(v, i) (i < 64 && i > -64 ? i > 0 ? v >> (i) : (v << -(i)) : 0) #if SHIFT =3D=3D 0 - r.Q(0) =3D SHR(s->Q(0), shift - 0) | - SHR(d->Q(0), shift - 64); + d->Q(0) =3D SHR(s->Q(0), shift - 0) | + SHR(v->Q(0), shift - 64); #else - r.Q(0) =3D SHR(s->Q(0), shift - 0) | - SHR(s->Q(1), shift - 64) | - SHR(d->Q(0), shift - 128) | - SHR(d->Q(1), shift - 192); - r.Q(1) =3D SHR(s->Q(0), shift + 64) | - SHR(s->Q(1), shift - 0) | - SHR(d->Q(0), shift - 64) | - SHR(d->Q(1), shift - 128); + uint64_t r0, r1; + + r0 =3D SHR(s->Q(0), shift - 0) | + SHR(s->Q(1), shift - 64) | + SHR(v->Q(0), shift - 128) | + SHR(v->Q(1), shift - 192); + r1 =3D SHR(s->Q(0), shift + 64) | + SHR(s->Q(1), shift - 0) | + SHR(v->Q(0), shift - 64) | + SHR(v->Q(1), shift - 128); + d->Q(0) =3D r0; + d->Q(1) =3D r1; +#if SHIFT =3D=3D 2 + r0 =3D SHR(s->Q(2), shift - 0) | + SHR(s->Q(3), shift - 64) | + SHR(v->Q(2), shift - 128) | + SHR(v->Q(3), shift - 192); + r1 =3D SHR(s->Q(2), shift + 64) | + SHR(s->Q(3), shift - 0) | + SHR(v->Q(2), shift - 64) | + SHR(v->Q(3), shift - 128); + d->Q(2) =3D r0; + d->Q(3) =3D r1; +#endif #endif #undef SHR } - - MOVE(*d, r); } =20 #define XMM0 (env->xmm_regs[0]) @@ -1918,17 +2042,43 @@ SSE_HELPER_Q(helper_pcmpeqq, FCMPEQQ) =20 void glue(helper_packusdw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - Reg r; - - r.W(0) =3D satuw((int32_t) d->L(0)); - r.W(1) =3D satuw((int32_t) d->L(1)); - r.W(2) =3D satuw((int32_t) d->L(2)); - r.W(3) =3D satuw((int32_t) d->L(3)); - r.W(4) =3D satuw((int32_t) s->L(0)); - r.W(5) =3D satuw((int32_t) s->L(1)); - r.W(6) =3D satuw((int32_t) s->L(2)); - r.W(7) =3D satuw((int32_t) s->L(3)); - MOVE(*d, r); + Reg *v =3D d; + uint16_t r[8]; + + r[0] =3D satuw((int32_t) v->L(0)); + r[1] =3D satuw((int32_t) v->L(1)); + r[2] =3D satuw((int32_t) v->L(2)); + r[3] =3D satuw((int32_t) v->L(3)); + r[4] =3D satuw((int32_t) s->L(0)); + r[5] =3D satuw((int32_t) s->L(1)); + r[6] =3D satuw((int32_t) s->L(2)); + r[7] =3D satuw((int32_t) s->L(3)); + d->W(0) =3D r[0]; + d->W(1) =3D r[1]; + d->W(2) =3D r[2]; + d->W(3) =3D r[3]; + d->W(4) =3D r[4]; + d->W(5) =3D r[5]; + d->W(6) =3D r[6]; + d->W(7) =3D r[7]; +#if SHIFT =3D=3D 2 + r[0] =3D satuw((int32_t) v->L(4)); + r[1] =3D satuw((int32_t) v->L(5)); + r[2] =3D satuw((int32_t) v->L(6)); + r[3] =3D satuw((int32_t) v->L(7)); + r[4] =3D satuw((int32_t) s->L(4)); + r[5] =3D satuw((int32_t) s->L(5)); + r[6] =3D satuw((int32_t) s->L(6)); + r[7] =3D satuw((int32_t) s->L(7)); + d->W(8) =3D r[0]; + d->W(9) =3D r[1]; + d->W(10) =3D r[2]; 
+ d->W(11) =3D r[3]; + d->W(12) =3D r[4]; + d->W(13) =3D r[5]; + d->W(14) =3D r[6]; + d->W(15) =3D r[7]; +#endif } =20 #define FMINSB(d, s) MIN((int8_t)d, (int8_t)s) @@ -2184,20 +2334,37 @@ void glue(helper_dppd, SUFFIX)(CPUX86State *env, Re= g *d, Reg *s, uint32_t mask) void glue(helper_mpsadbw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t offset) { + Reg *v =3D d; int s0 =3D (offset & 3) << 2; int d0 =3D (offset & 4) << 0; int i; - Reg r; + uint16_t r[8]; =20 for (i =3D 0; i < 8; i++, d0++) { - r.W(i) =3D 0; - r.W(i) +=3D abs1(d->B(d0 + 0) - s->B(s0 + 0)); - r.W(i) +=3D abs1(d->B(d0 + 1) - s->B(s0 + 1)); - r.W(i) +=3D abs1(d->B(d0 + 2) - s->B(s0 + 2)); - r.W(i) +=3D abs1(d->B(d0 + 3) - s->B(s0 + 3)); + r[i] =3D 0; + r[i] +=3D abs1(v->B(d0 + 0) - s->B(s0 + 0)); + r[i] +=3D abs1(v->B(d0 + 1) - s->B(s0 + 1)); + r[i] +=3D abs1(v->B(d0 + 2) - s->B(s0 + 2)); + r[i] +=3D abs1(v->B(d0 + 3) - s->B(s0 + 3)); } + for (i =3D 0; i < 8; i++) { + d->W(i) =3D r[i]; + } +#if SHIFT =3D=3D 2 + s0 =3D ((offset & 0x18) >> 1) + 16; + d0 =3D ((offset & 0x20) >> 3) + 16; =20 - MOVE(*d, r); + for (i =3D 0; i < 8; i++, d0++) { + r[i] =3D 0; + r[i] +=3D abs1(v->B(d0 + 0) - s->B(s0 + 0)); + r[i] +=3D abs1(v->B(d0 + 1) - s->B(s0 + 1)); + r[i] +=3D abs1(v->B(d0 + 2) - s->B(s0 + 2)); + r[i] +=3D abs1(v->B(d0 + 3) - s->B(s0 + 3)); + } + for (i =3D 0; i < 8; i++) { + d->W(i + 8) =3D r[i]; + } +#endif } =20 /* SSE4.2 op helpers */ --=20 2.36.0 From nobody Thu May 2 14:26:59 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1650839692510971.5953880444423; Sun, 24 Apr 2022 15:34:52 -0700 (PDT) Received: from localhost ([::1]:37286 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nikoc-0000Wk-Hj for importer@patchew.org; Sun, 24 Apr 2022 18:34:50 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50654) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nikSp-0001iM-12 for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:12:23 -0400 Received: from nowt.default.pbrook.uk0.bigv.io ([2001:41c8:51:832:fcff:ff:fe00:46dd]:58805) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nikSm-0002pu-Cg for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:12:18 -0400 Received: from cpc91554-seac25-2-0-cust857.7-2.cable.virginm.net ([82.27.199.90] helo=wren.home) by nowt.default.pbrook.uk0.bigv.io with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1nikJ7-0001ea-LV; Sun, 24 Apr 2022 23:02:17 +0100 From: Paul Brook To: Paolo Bonzini , Richard Henderson , Eduardo Habkost Subject: [PATCH v2 14/42] i386: Add size suffix to vector FP helpers Date: Sun, 24 Apr 2022 23:01:36 +0100 Message-Id: <20220424220204.2493824-15-paul@nowt.org> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220418173904.3746036-1-paul@nowt.org> References: <20220418173904.3746036-1-paul@nowt.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass 
client-ip=2001:41c8:51:832:fcff:ff:fe00:46dd; envelope-from=paul@nowt.org; helo=nowt.default.pbrook.uk0.bigv.io X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "open list:All patches CC here" , Paul Brook Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZM-MESSAGEID: 1650839694719100001 Content-Type: text/plain; charset="utf-8" For AVX we're going to need both 128 bit (xmm) and 256 bit (ymm) variants of floating point helpers. Add the register type suffix to the existing *PS and *PD helpers (SS and SD variants are only valid on 128 bit vectors) No functional changes. Signed-off-by: Paul Brook --- target/i386/ops_sse.h | 48 ++++++++++++++++++------------------ target/i386/ops_sse_header.h | 48 ++++++++++++++++++------------------ target/i386/tcg/translate.c | 37 +++++++++++++-------------- 3 files changed, 67 insertions(+), 66 deletions(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index c645d2ddbf..fc8fd57aa5 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -699,7 +699,7 @@ void glue(helper_pshufw, SUFFIX)(Reg *d, Reg *s, int or= der) SHUFFLE4(W, s, s, 0); } #else -void helper_shufps(Reg *d, Reg *s, int order) +void glue(helper_shufps, SUFFIX)(Reg *d, Reg *s, int order) { Reg *v =3D d; uint32_t r0, r1, r2, r3; @@ -710,7 +710,7 @@ void helper_shufps(Reg *d, Reg *s, int order) #endif } =20 -void helper_shufpd(Reg *d, Reg *s, int order) +void glue(helper_shufpd, SUFFIX)(Reg *d, Reg *s, int order) { Reg *v =3D d; uint64_t r0, r1; @@ -767,7 +767,7 @@ void glue(helper_pshufhw, SUFFIX)(Reg *d, Reg *s, int o= rder) /* XXX: not accurate */ =20 #define SSE_HELPER_S(name, F) \ - void helper_ ## name ## ps(CPUX86State *env, Reg *d, Reg *s) \ + void glue(helper_ ## name ## ps, SUFFIX)(CPUX86State *env, Reg *d, Reg= *s)\ { \ d->ZMM_S(0) =3D F(32, d->ZMM_S(0), s->ZMM_S(0)); \ d->ZMM_S(1) =3D F(32, d->ZMM_S(1), s->ZMM_S(1)); \ @@ -780,7 +780,7 @@ void glue(helper_pshufhw, SUFFIX)(Reg *d, Reg *s, int o= rder) d->ZMM_S(0) =3D F(32, d->ZMM_S(0), s->ZMM_S(0)); \ } \ \ - void helper_ ## name ## pd(CPUX86State *env, Reg *d, Reg *s) \ + void glue(helper_ ## name ## pd, SUFFIX)(CPUX86State *env, Reg *d, Reg= *s)\ { \ d->ZMM_D(0) =3D F(64, d->ZMM_D(0), s->ZMM_D(0)); \ d->ZMM_D(1) =3D F(64, d->ZMM_D(1), s->ZMM_D(1)); \ @@ -816,7 +816,7 @@ SSE_HELPER_S(sqrt, FPU_SQRT) =20 =20 /* float to float conversions */ -void helper_cvtps2pd(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_cvtps2pd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { float32 s0, s1; =20 @@ -826,7 +826,7 @@ void helper_cvtps2pd(CPUX86State *env, Reg *d, Reg *s) d->ZMM_D(1) =3D float32_to_float64(s1, &env->sse_status); } =20 -void helper_cvtpd2ps(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_cvtpd2ps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { d->ZMM_S(0) =3D float64_to_float32(s->ZMM_D(0), &env->sse_status); d->ZMM_S(1) =3D float64_to_float32(s->ZMM_D(1), &env->sse_status); @@ -844,7 +844,7 @@ void helper_cvtsd2ss(CPUX86State *env, Reg *d, Reg *s) } =20 /* integer to float */ -void helper_cvtdq2ps(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_cvtdq2ps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { d->ZMM_S(0) =3D 
int32_to_float32(s->ZMM_L(0), &env->sse_status); d->ZMM_S(1) =3D int32_to_float32(s->ZMM_L(1), &env->sse_status); @@ -852,7 +852,7 @@ void helper_cvtdq2ps(CPUX86State *env, Reg *d, Reg *s) d->ZMM_S(3) =3D int32_to_float32(s->ZMM_L(3), &env->sse_status); } =20 -void helper_cvtdq2pd(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_cvtdq2pd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { int32_t l0, l1; =20 @@ -929,7 +929,7 @@ WRAP_FLOATCONV(int64_t, float32_to_int64_round_to_zero,= float32, INT64_MIN) WRAP_FLOATCONV(int64_t, float64_to_int64, float64, INT64_MIN) WRAP_FLOATCONV(int64_t, float64_to_int64_round_to_zero, float64, INT64_MIN) =20 -void helper_cvtps2dq(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_cvtps2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { d->ZMM_L(0) =3D x86_float32_to_int32(s->ZMM_S(0), &env->sse_status); d->ZMM_L(1) =3D x86_float32_to_int32(s->ZMM_S(1), &env->sse_status); @@ -937,7 +937,7 @@ void helper_cvtps2dq(CPUX86State *env, ZMMReg *d, ZMMRe= g *s) d->ZMM_L(3) =3D x86_float32_to_int32(s->ZMM_S(3), &env->sse_status); } =20 -void helper_cvtpd2dq(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_cvtpd2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { d->ZMM_L(0) =3D x86_float64_to_int32(s->ZMM_D(0), &env->sse_status); d->ZMM_L(1) =3D x86_float64_to_int32(s->ZMM_D(1), &env->sse_status); @@ -979,7 +979,7 @@ int64_t helper_cvtsd2sq(CPUX86State *env, ZMMReg *s) #endif =20 /* float to integer truncated */ -void helper_cvttps2dq(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_cvttps2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { d->ZMM_L(0) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(0), &env->= sse_status); d->ZMM_L(1) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(1), &env->= sse_status); @@ -987,7 +987,7 @@ void helper_cvttps2dq(CPUX86State *env, ZMMReg *d, ZMMR= eg *s) d->ZMM_L(3) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(3), &env->= sse_status); } =20 -void helper_cvttpd2dq(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_cvttpd2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { d->ZMM_L(0) =3D x86_float64_to_int32_round_to_zero(s->ZMM_D(0), &env->= sse_status); d->ZMM_L(1) =3D x86_float64_to_int32_round_to_zero(s->ZMM_D(1), &env->= sse_status); @@ -1028,7 +1028,7 @@ int64_t helper_cvttsd2sq(CPUX86State *env, ZMMReg *s) } #endif =20 -void helper_rsqrtps(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_rsqrtps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { uint8_t old_flags =3D get_float_exception_flags(&env->sse_status); d->ZMM_S(0) =3D float32_div(float32_one, @@ -1055,7 +1055,7 @@ void helper_rsqrtss(CPUX86State *env, ZMMReg *d, ZMMR= eg *s) set_float_exception_flags(old_flags, &env->sse_status); } =20 -void helper_rcpps(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_rcpps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { uint8_t old_flags =3D get_float_exception_flags(&env->sse_status); d->ZMM_S(0) =3D float32_div(float32_one, s->ZMM_S(0), &env->sse_status= ); @@ -1116,7 +1116,7 @@ void helper_insertq_i(CPUX86State *env, ZMMReg *d, in= t index, int length) d->ZMM_Q(0) =3D helper_insertq(d->ZMM_Q(0), index, length); } =20 -void helper_haddps(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_haddps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { ZMMReg r; =20 @@ -1127,7 +1127,7 @@ void helper_haddps(CPUX86State *env, ZMMReg *d, ZMMRe= g *s) MOVE(*d, r); } =20 -void helper_haddpd(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_haddpd, 
SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { ZMMReg r; =20 @@ -1136,7 +1136,7 @@ void helper_haddpd(CPUX86State *env, ZMMReg *d, ZMMRe= g *s) MOVE(*d, r); } =20 -void helper_hsubps(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_hsubps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { ZMMReg r; =20 @@ -1147,7 +1147,7 @@ void helper_hsubps(CPUX86State *env, ZMMReg *d, ZMMRe= g *s) MOVE(*d, r); } =20 -void helper_hsubpd(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_hsubpd, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { ZMMReg r; =20 @@ -1156,7 +1156,7 @@ void helper_hsubpd(CPUX86State *env, ZMMReg *d, ZMMRe= g *s) MOVE(*d, r); } =20 -void helper_addsubps(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_addsubps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { d->ZMM_S(0) =3D float32_sub(d->ZMM_S(0), s->ZMM_S(0), &env->sse_status= ); d->ZMM_S(1) =3D float32_add(d->ZMM_S(1), s->ZMM_S(1), &env->sse_status= ); @@ -1164,7 +1164,7 @@ void helper_addsubps(CPUX86State *env, ZMMReg *d, ZMM= Reg *s) d->ZMM_S(3) =3D float32_add(d->ZMM_S(3), s->ZMM_S(3), &env->sse_status= ); } =20 -void helper_addsubpd(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_addsubpd, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { d->ZMM_D(0) =3D float64_sub(d->ZMM_D(0), s->ZMM_D(0), &env->sse_status= ); d->ZMM_D(1) =3D float64_add(d->ZMM_D(1), s->ZMM_D(1), &env->sse_status= ); @@ -1172,7 +1172,7 @@ void helper_addsubpd(CPUX86State *env, ZMMReg *d, ZMM= Reg *s) =20 /* XXX: unordered */ #define SSE_HELPER_CMP(name, F) \ - void helper_ ## name ## ps(CPUX86State *env, Reg *d, Reg *s) \ + void glue(helper_ ## name ## ps, SUFFIX)(CPUX86State *env, Reg *d, Reg= *s)\ { \ d->ZMM_L(0) =3D F(32, d->ZMM_S(0), s->ZMM_S(0)); \ d->ZMM_L(1) =3D F(32, d->ZMM_S(1), s->ZMM_S(1)); \ @@ -1185,7 +1185,7 @@ void helper_addsubpd(CPUX86State *env, ZMMReg *d, ZMM= Reg *s) d->ZMM_L(0) =3D F(32, d->ZMM_S(0), s->ZMM_S(0)); \ } \ \ - void helper_ ## name ## pd(CPUX86State *env, Reg *d, Reg *s) \ + void glue(helper_ ## name ## pd, SUFFIX)(CPUX86State *env, Reg *d, Reg= *s)\ { \ d->ZMM_Q(0) =3D F(64, d->ZMM_D(0), s->ZMM_D(0)); \ d->ZMM_Q(1) =3D F(64, d->ZMM_D(1), s->ZMM_D(1)); \ @@ -1268,7 +1268,7 @@ void helper_comisd(CPUX86State *env, Reg *d, Reg *s) CC_SRC =3D comis_eflags[ret + 1]; } =20 -uint32_t helper_movmskps(CPUX86State *env, Reg *s) +uint32_t glue(helper_movmskps, SUFFIX)(CPUX86State *env, Reg *s) { int b0, b1, b2, b3; =20 @@ -1279,7 +1279,7 @@ uint32_t helper_movmskps(CPUX86State *env, Reg *s) return b0 | (b1 << 1) | (b2 << 2) | (b3 << 3); } =20 -uint32_t helper_movmskpd(CPUX86State *env, Reg *s) +uint32_t glue(helper_movmskpd, SUFFIX)(CPUX86State *env, Reg *s) { int b0, b1; =20 diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h index 7e7f2cee2a..b8b0666f61 100644 --- a/target/i386/ops_sse_header.h +++ b/target/i386/ops_sse_header.h @@ -126,8 +126,8 @@ DEF_HELPER_2(glue(movq_mm_T0, SUFFIX), void, Reg, i64) #if SHIFT =3D=3D 0 DEF_HELPER_3(glue(pshufw, SUFFIX), void, Reg, Reg, int) #else -DEF_HELPER_3(shufps, void, Reg, Reg, int) -DEF_HELPER_3(shufpd, void, Reg, Reg, int) +DEF_HELPER_3(glue(shufps, SUFFIX), void, Reg, Reg, int) +DEF_HELPER_3(glue(shufpd, SUFFIX), void, Reg, Reg, int) DEF_HELPER_3(glue(pshufd, SUFFIX), void, Reg, Reg, int) DEF_HELPER_3(glue(pshuflw, SUFFIX), void, Reg, Reg, int) DEF_HELPER_3(glue(pshufhw, SUFFIX), void, Reg, Reg, int) @@ -138,9 +138,9 @@ DEF_HELPER_3(glue(pshufhw, SUFFIX), void, Reg, Reg, int) /* XXX: not accurate */ =20 #define 
SSE_HELPER_S(name, F) \ - DEF_HELPER_3(name ## ps, void, env, Reg, Reg) \ + DEF_HELPER_3(glue(name ## ps, SUFFIX), void, env, Reg, Reg) \ DEF_HELPER_3(name ## ss, void, env, Reg, Reg) \ - DEF_HELPER_3(name ## pd, void, env, Reg, Reg) \ + DEF_HELPER_3(glue(name ## pd, SUFFIX), void, env, Reg, Reg) \ DEF_HELPER_3(name ## sd, void, env, Reg, Reg) =20 SSE_HELPER_S(add, FPU_ADD) @@ -152,12 +152,12 @@ SSE_HELPER_S(max, FPU_MAX) SSE_HELPER_S(sqrt, FPU_SQRT) =20 =20 -DEF_HELPER_3(cvtps2pd, void, env, Reg, Reg) -DEF_HELPER_3(cvtpd2ps, void, env, Reg, Reg) +DEF_HELPER_3(glue(cvtps2pd, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_3(glue(cvtpd2ps, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(cvtss2sd, void, env, Reg, Reg) DEF_HELPER_3(cvtsd2ss, void, env, Reg, Reg) -DEF_HELPER_3(cvtdq2ps, void, env, Reg, Reg) -DEF_HELPER_3(cvtdq2pd, void, env, Reg, Reg) +DEF_HELPER_3(glue(cvtdq2ps, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_3(glue(cvtdq2pd, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(cvtpi2ps, void, env, ZMMReg, MMXReg) DEF_HELPER_3(cvtpi2pd, void, env, ZMMReg, MMXReg) DEF_HELPER_3(cvtsi2ss, void, env, ZMMReg, i32) @@ -168,8 +168,8 @@ DEF_HELPER_3(cvtsq2ss, void, env, ZMMReg, i64) DEF_HELPER_3(cvtsq2sd, void, env, ZMMReg, i64) #endif =20 -DEF_HELPER_3(cvtps2dq, void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(cvtpd2dq, void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(cvtps2dq, SUFFIX), void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(cvtpd2dq, SUFFIX), void, env, ZMMReg, ZMMReg) DEF_HELPER_3(cvtps2pi, void, env, MMXReg, ZMMReg) DEF_HELPER_3(cvtpd2pi, void, env, MMXReg, ZMMReg) DEF_HELPER_2(cvtss2si, s32, env, ZMMReg) @@ -179,8 +179,8 @@ DEF_HELPER_2(cvtss2sq, s64, env, ZMMReg) DEF_HELPER_2(cvtsd2sq, s64, env, ZMMReg) #endif =20 -DEF_HELPER_3(cvttps2dq, void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(cvttpd2dq, void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(cvttps2dq, SUFFIX), void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(cvttpd2dq, SUFFIX), void, env, ZMMReg, ZMMReg) DEF_HELPER_3(cvttps2pi, void, env, MMXReg, ZMMReg) DEF_HELPER_3(cvttpd2pi, void, env, MMXReg, ZMMReg) DEF_HELPER_2(cvttss2si, s32, env, ZMMReg) @@ -190,25 +190,25 @@ DEF_HELPER_2(cvttss2sq, s64, env, ZMMReg) DEF_HELPER_2(cvttsd2sq, s64, env, ZMMReg) #endif =20 -DEF_HELPER_3(rsqrtps, void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(rsqrtps, SUFFIX), void, env, ZMMReg, ZMMReg) DEF_HELPER_3(rsqrtss, void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(rcpps, void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(rcpps, SUFFIX), void, env, ZMMReg, ZMMReg) DEF_HELPER_3(rcpss, void, env, ZMMReg, ZMMReg) DEF_HELPER_3(extrq_r, void, env, ZMMReg, ZMMReg) DEF_HELPER_4(extrq_i, void, env, ZMMReg, int, int) DEF_HELPER_3(insertq_r, void, env, ZMMReg, ZMMReg) DEF_HELPER_4(insertq_i, void, env, ZMMReg, int, int) -DEF_HELPER_3(haddps, void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(haddpd, void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(hsubps, void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(hsubpd, void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(addsubps, void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(addsubpd, void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(haddps, SUFFIX), void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(haddpd, SUFFIX), void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(hsubps, SUFFIX), void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(hsubpd, SUFFIX), void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(addsubps, SUFFIX), void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(addsubpd, SUFFIX), void, env, ZMMReg, ZMMReg) =20 #define SSE_HELPER_CMP(name, F) \ - DEF_HELPER_3(name ## ps, void, env, Reg, Reg) \ + DEF_HELPER_3(glue(name ## ps, SUFFIX), 
void, env, Reg, Reg) \ DEF_HELPER_3(name ## ss, void, env, Reg, Reg) \ - DEF_HELPER_3(name ## pd, void, env, Reg, Reg) \ + DEF_HELPER_3(glue(name ## pd, SUFFIX), void, env, Reg, Reg) \ DEF_HELPER_3(name ## sd, void, env, Reg, Reg) =20 SSE_HELPER_CMP(cmpeq, FPU_CMPEQ) @@ -224,8 +224,8 @@ DEF_HELPER_3(ucomiss, void, env, Reg, Reg) DEF_HELPER_3(comiss, void, env, Reg, Reg) DEF_HELPER_3(ucomisd, void, env, Reg, Reg) DEF_HELPER_3(comisd, void, env, Reg, Reg) -DEF_HELPER_2(movmskps, i32, env, Reg) -DEF_HELPER_2(movmskpd, i32, env, Reg) +DEF_HELPER_2(glue(movmskps, SUFFIX), i32, env, Reg) +DEF_HELPER_2(glue(movmskpd, SUFFIX), i32, env, Reg) #endif =20 DEF_HELPER_2(glue(pmovmskb, SUFFIX), i32, env, Reg) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index e9e6062b7f..63b32a77e3 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -2807,7 +2807,7 @@ typedef void (*SSEFunc_0_eppt)(TCGv_ptr env, TCGv_ptr= reg_a, TCGv_ptr reg_b, gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm, NULL, NULL) =20 #define SSE_FOP(name) OP(op2, SSE_OPF_SCALAR, \ - gen_helper_##name##ps, gen_helper_##name##pd, \ + gen_helper_##name##ps##_xmm, gen_helper_##name##pd##_xmm, \ gen_helper_##name##ss, gen_helper_##name##sd) #define SSE_OP(sname, dname, op, flags) OP(op, flags, \ gen_helper_##sname##_xmm, gen_helper_##dname##_xmm, NULL, NULL) @@ -2846,12 +2846,12 @@ static const struct SSEOpHelper_table1 sse_op_table= 1[256] =3D { gen_helper_comiss, gen_helper_comisd, NULL, NULL), [0x50] =3D SSE_SPECIAL, /* movmskps, movmskpd */ [0x51] =3D OP(op1, SSE_OPF_SCALAR | SSE_OPF_V0, - gen_helper_sqrtps, gen_helper_sqrtpd, + gen_helper_sqrtps_xmm, gen_helper_sqrtpd_xmm, gen_helper_sqrtss, gen_helper_sqrtsd), [0x52] =3D OP(op1, SSE_OPF_SCALAR | SSE_OPF_V0, - gen_helper_rsqrtps, NULL, gen_helper_rsqrtss, NULL), + gen_helper_rsqrtps_xmm, NULL, gen_helper_rsqrtss, NULL), [0x53] =3D OP(op1, SSE_OPF_SCALAR | SSE_OPF_V0, - gen_helper_rcpps, NULL, gen_helper_rcpss, NULL), + gen_helper_rcpps_xmm, NULL, gen_helper_rcpss, NULL), [0x54] =3D SSE_OP(pand, pand, op2, 0), /* andps, andpd */ [0x55] =3D SSE_OP(pandn, pandn, op2, 0), /* andnps, andnpd */ [0x56] =3D SSE_OP(por, por, op2, 0), /* orps, orpd */ @@ -2859,19 +2859,19 @@ static const struct SSEOpHelper_table1 sse_op_table= 1[256] =3D { [0x58] =3D SSE_FOP(add), [0x59] =3D SSE_FOP(mul), [0x5a] =3D OP(op1, SSE_OPF_SCALAR | SSE_OPF_V0, - gen_helper_cvtps2pd, gen_helper_cvtpd2ps, + gen_helper_cvtps2pd_xmm, gen_helper_cvtpd2ps_xmm, gen_helper_cvtss2sd, gen_helper_cvtsd2ss), [0x5b] =3D OP(op1, SSE_OPF_V0, - gen_helper_cvtdq2ps, gen_helper_cvtps2dq, - gen_helper_cvttps2dq, NULL), + gen_helper_cvtdq2ps_xmm, gen_helper_cvtps2dq_xmm, + gen_helper_cvttps2dq_xmm, NULL), [0x5c] =3D SSE_FOP(sub), [0x5d] =3D SSE_FOP(min), [0x5e] =3D SSE_FOP(div), [0x5f] =3D SSE_FOP(max), =20 [0xc2] =3D SSE_FOP(cmpeq), /* sse_op_table4 */ - [0xc6] =3D OP(dummy, SSE_OPF_SHUF, (SSEFunc_0_epp)gen_helper_shufps, - (SSEFunc_0_epp)gen_helper_shufpd, NULL, NULL), + [0xc6] =3D OP(dummy, SSE_OPF_SHUF, (SSEFunc_0_epp)gen_helper_shufps_xm= m, + (SSEFunc_0_epp)gen_helper_shufpd_xmm, NULL, NULL), =20 /* SSSE3, SSE4, MOVBE, CRC32, BMI1, BMI2, ADX. 
*/ [0x38] =3D SSE_SPECIAL, @@ -2912,15 +2912,15 @@ static const struct SSEOpHelper_table1 sse_op_table= 1[256] =3D { [0x79] =3D OP(op1, SSE_OPF_V0, NULL, gen_helper_extrq_r, NULL, gen_helper_insertq_r), [0x7c] =3D OP(op2, 0, - NULL, gen_helper_haddpd, NULL, gen_helper_haddps), + NULL, gen_helper_haddpd_xmm, NULL, gen_helper_haddps_xmm), [0x7d] =3D OP(op2, 0, - NULL, gen_helper_hsubpd, NULL, gen_helper_hsubps), + NULL, gen_helper_hsubpd_xmm, NULL, gen_helper_hsubps_xmm), [0x7e] =3D SSE_SPECIAL, /* movd, movd, , movq */ [0x7f] =3D SSE_SPECIAL, /* movq, movdqa, movdqu */ [0xc4] =3D SSE_SPECIAL, /* pinsrw */ [0xc5] =3D SSE_SPECIAL, /* pextrw */ [0xd0] =3D OP(op2, 0, - NULL, gen_helper_addsubpd, NULL, gen_helper_addsubps), + NULL, gen_helper_addsubpd_xmm, NULL, gen_helper_addsubps_x= mm), [0xd1] =3D MMX_OP(psrlw), [0xd2] =3D MMX_OP(psrld), [0xd3] =3D MMX_OP(psrlq), @@ -2943,8 +2943,8 @@ static const struct SSEOpHelper_table1 sse_op_table1[= 256] =3D { [0xe4] =3D MMX_OP(pmulhuw), [0xe5] =3D MMX_OP(pmulhw), [0xe6] =3D OP(op1, SSE_OPF_V0, - NULL, gen_helper_cvttpd2dq, - gen_helper_cvtdq2pd, gen_helper_cvtpd2dq), + NULL, gen_helper_cvttpd2dq_xmm, + gen_helper_cvtdq2pd_xmm, gen_helper_cvtpd2dq_xmm), [0xe7] =3D SSE_SPECIAL, /* movntq, movntq */ [0xe8] =3D MMX_OP(psubsb), [0xe9] =3D MMX_OP(psubsw), @@ -3021,8 +3021,9 @@ static const SSEFunc_l_ep sse_op_table3bq[] =3D { }; #endif =20 -#define SSE_FOP(x) { gen_helper_ ## x ## ps, gen_helper_ ## x ## pd, \ - gen_helper_ ## x ## ss, gen_helper_ ## x ## sd, } +#define SSE_FOP(x) { \ + gen_helper_ ## x ## ps ## _xmm, gen_helper_ ## x ## pd ## _xmm, \ + gen_helper_ ## x ## ss, gen_helper_ ## x ## sd} static const SSEFunc_0_epp sse_op_table4[8][4] =3D { SSE_FOP(cmpeq), SSE_FOP(cmplt), @@ -3718,14 +3719,14 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, CHECK_AVX_V0(s); rm =3D (modrm & 7) | REX_B(s); tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(rm)); - gen_helper_movmskps(s->tmp2_i32, cpu_env, s->ptr0); + gen_helper_movmskps_xmm(s->tmp2_i32, cpu_env, s->ptr0); tcg_gen_extu_i32_tl(cpu_regs[reg], s->tmp2_i32); break; case 0x150: /* movmskpd */ CHECK_AVX_V0(s); rm =3D (modrm & 7) | REX_B(s); tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(rm)); - gen_helper_movmskpd(s->tmp2_i32, cpu_env, s->ptr0); + gen_helper_movmskpd_xmm(s->tmp2_i32, cpu_env, s->ptr0); tcg_gen_extu_i32_tl(cpu_regs[reg], s->tmp2_i32); break; case 0x02a: /* cvtpi2ps */ --=20 2.36.0 From nobody Thu May 2 14:26:59 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1650839675193702.4779484128525; Sun, 24 Apr 2022 15:34:35 -0700 (PDT) Received: from localhost ([::1]:36334 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nikoL-0008Hx-Jg for importer@patchew.org; Sun, 24 Apr 2022 18:34:33 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:51112) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nikV7-0004W9-5G for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:14:41 -0400 Received: from nowt.default.pbrook.uk0.bigv.io ([2001:41c8:51:832:fcff:ff:fe00:46dd]:58898) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nikV4-00030N-RI for qemu-devel@nongnu.org; 
Sun, 24 Apr 2022 18:14:40 -0400 Received: from cpc91554-seac25-2-0-cust857.7-2.cable.virginm.net ([82.27.199.90] helo=wren.home) by nowt.default.pbrook.uk0.bigv.io with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1nikJ7-0001ea-RZ; Sun, 24 Apr 2022 23:02:17 +0100 From: Paul Brook To: Paolo Bonzini , Richard Henderson , Eduardo Habkost Subject: [PATCH v2 15/42] i386: Floating point arithmetic helper AVX prep Date: Sun, 24 Apr 2022 23:01:37 +0100 Message-Id: <20220424220204.2493824-16-paul@nowt.org> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220418173904.3746036-1-paul@nowt.org> References: <20220418173904.3746036-1-paul@nowt.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2001:41c8:51:832:fcff:ff:fe00:46dd; envelope-from=paul@nowt.org; helo=nowt.default.pbrook.uk0.bigv.io X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "open list:All patches CC here" , Paul Brook Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZM-MESSAGEID: 1650839676727100001 Content-Type: text/plain; charset="utf-8" Prepare the "easy" floating point vector helpers for AVX. No functional changes to existing helpers.
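For context: ops_sse.h is a multiple-inclusion template, compiled once per vector width, so a single helper body is stamped out as helper_*_mmx, helper_*_xmm and, with this series, helper_*_ymm. The sketch below is a simplified illustration of that pattern, not the literal include site; the SHIFT values and suffixes follow the file's own conventions.

    /* Simplified sketch of the multiple-inclusion pattern: ops_sse.h
     * derives Reg and SUFFIX from SHIFT and #undefs SHIFT when done. */
    #define SHIFT 0          /* 64-bit MMX: Reg is MMXReg, SUFFIX _mmx  */
    #include "ops_sse.h"
    #define SHIFT 1          /* 128-bit SSE: Reg is ZMMReg, SUFFIX _xmm */
    #include "ops_sse.h"
    #define SHIFT 2          /* 256-bit AVX: SUFFIX _ymm (this series)  */
    #include "ops_sse.h"

This is why the guards below change from "#if SHIFT == 1" to "#if SHIFT >= 1": the former keeps a helper xmm-only, the latter instantiates it for both the xmm and ymm builds, with YMM_ONLY() expanding its argument only when SHIFT == 2.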
Signed-off-by: Paul Brook --- target/i386/ops_sse.h | 144 ++++++++++++++++++++++++++++++++++-------- 1 file changed, 119 insertions(+), 25 deletions(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index fc8fd57aa5..d308a1ec40 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -762,40 +762,66 @@ void glue(helper_pshufhw, SUFFIX)(Reg *d, Reg *s, int= order) } #endif =20 -#if SHIFT =3D=3D 1 +#if SHIFT >=3D 1 /* FPU ops */ /* XXX: not accurate */ =20 -#define SSE_HELPER_S(name, F) \ - void glue(helper_ ## name ## ps, SUFFIX)(CPUX86State *env, Reg *d, Reg= *s)\ +#define SSE_HELPER_P(name, F) \ + void glue(helper_ ## name ## ps, SUFFIX)(CPUX86State *env, \ + Reg *d, Reg *s) \ { \ - d->ZMM_S(0) =3D F(32, d->ZMM_S(0), s->ZMM_S(0)); \ - d->ZMM_S(1) =3D F(32, d->ZMM_S(1), s->ZMM_S(1)); \ - d->ZMM_S(2) =3D F(32, d->ZMM_S(2), s->ZMM_S(2)); \ - d->ZMM_S(3) =3D F(32, d->ZMM_S(3), s->ZMM_S(3)); \ + Reg *v =3D d; \ + d->ZMM_S(0) =3D F(32, v->ZMM_S(0), s->ZMM_S(0)); \ + d->ZMM_S(1) =3D F(32, v->ZMM_S(1), s->ZMM_S(1)); \ + d->ZMM_S(2) =3D F(32, v->ZMM_S(2), s->ZMM_S(2)); \ + d->ZMM_S(3) =3D F(32, v->ZMM_S(3), s->ZMM_S(3)); \ + YMM_ONLY( \ + d->ZMM_S(4) =3D F(32, v->ZMM_S(4), s->ZMM_S(4)); \ + d->ZMM_S(5) =3D F(32, v->ZMM_S(5), s->ZMM_S(5)); \ + d->ZMM_S(6) =3D F(32, v->ZMM_S(6), s->ZMM_S(6)); \ + d->ZMM_S(7) =3D F(32, v->ZMM_S(7), s->ZMM_S(7)); \ + ) \ } \ \ - void helper_ ## name ## ss(CPUX86State *env, Reg *d, Reg *s) \ + void glue(helper_ ## name ## pd, SUFFIX)(CPUX86State *env, \ + Reg *d, Reg *s) \ { \ - d->ZMM_S(0) =3D F(32, d->ZMM_S(0), s->ZMM_S(0)); \ - } \ + Reg *v =3D d; \ + d->ZMM_D(0) =3D F(64, v->ZMM_D(0), s->ZMM_D(0)); \ + d->ZMM_D(1) =3D F(64, v->ZMM_D(1), s->ZMM_D(1)); \ + YMM_ONLY( \ + d->ZMM_D(2) =3D F(64, v->ZMM_D(2), s->ZMM_D(2)); \ + d->ZMM_D(3) =3D F(64, v->ZMM_D(3), s->ZMM_D(3)); \ + ) \ + } + +#if SHIFT =3D=3D 1 + +#define SSE_HELPER_S(name, F) \ + SSE_HELPER_P(name, F) \ \ - void glue(helper_ ## name ## pd, SUFFIX)(CPUX86State *env, Reg *d, Reg= *s)\ + void helper_ ## name ## ss(CPUX86State *env, Reg *d, Reg *s)\ { \ - d->ZMM_D(0) =3D F(64, d->ZMM_D(0), s->ZMM_D(0)); \ - d->ZMM_D(1) =3D F(64, d->ZMM_D(1), s->ZMM_D(1)); \ + Reg *v =3D d; \ + d->ZMM_S(0) =3D F(32, v->ZMM_S(0), s->ZMM_S(0)); \ } \ \ - void helper_ ## name ## sd(CPUX86State *env, Reg *d, Reg *s) \ + void helper_ ## name ## sd(CPUX86State *env, Reg *d, Reg *s)\ { \ - d->ZMM_D(0) =3D F(64, d->ZMM_D(0), s->ZMM_D(0)); \ + Reg *v =3D d; \ + d->ZMM_D(0) =3D F(64, v->ZMM_D(0), s->ZMM_D(0)); \ } =20 +#else + +#define SSE_HELPER_S(name, F) SSE_HELPER_P(name, F) + +#endif + #define FPU_ADD(size, a, b) float ## size ## _add(a, b, &env->sse_status) #define FPU_SUB(size, a, b) float ## size ## _sub(a, b, &env->sse_status) #define FPU_MUL(size, a, b) float ## size ## _mul(a, b, &env->sse_status) #define FPU_DIV(size, a, b) float ## size ## _div(a, b, &env->sse_status) -#define FPU_SQRT(size, a, b) float ## size ## _sqrt(b, &env->sse_status) =20 /* Note that the choice of comparison op here is important to get the * special cases right: for min and max Intel specifies that (-0,0), @@ -812,8 +838,42 @@ SSE_HELPER_S(mul, FPU_MUL) SSE_HELPER_S(div, FPU_DIV) SSE_HELPER_S(min, FPU_MIN) SSE_HELPER_S(max, FPU_MAX) -SSE_HELPER_S(sqrt, FPU_SQRT) =20 +void glue(helper_sqrtps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +{ + d->ZMM_S(0) =3D float32_sqrt(s->ZMM_S(0), &env->sse_status); + d->ZMM_S(1) =3D float32_sqrt(s->ZMM_S(1), &env->sse_status); + d->ZMM_S(2) =3D float32_sqrt(s->ZMM_S(2), &env->sse_status); + 
d->ZMM_S(3) =3D float32_sqrt(s->ZMM_S(3), &env->sse_status); +#if SHIFT =3D=3D 2 + d->ZMM_S(4) =3D float32_sqrt(s->ZMM_S(4), &env->sse_status); + d->ZMM_S(5) =3D float32_sqrt(s->ZMM_S(5), &env->sse_status); + d->ZMM_S(6) =3D float32_sqrt(s->ZMM_S(6), &env->sse_status); + d->ZMM_S(7) =3D float32_sqrt(s->ZMM_S(7), &env->sse_status); +#endif +} + +void glue(helper_sqrtpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +{ + d->ZMM_D(0) =3D float64_sqrt(s->ZMM_D(0), &env->sse_status); + d->ZMM_D(1) =3D float64_sqrt(s->ZMM_D(1), &env->sse_status); +#if SHIFT =3D=3D 2 + d->ZMM_D(2) =3D float64_sqrt(s->ZMM_D(2), &env->sse_status); + d->ZMM_D(3) =3D float64_sqrt(s->ZMM_D(3), &env->sse_status); +#endif +} + +#if SHIFT =3D=3D 1 +void helper_sqrtss(CPUX86State *env, Reg *d, Reg *s) +{ + d->ZMM_S(0) =3D float32_sqrt(s->ZMM_S(0), &env->sse_status); +} + +void helper_sqrtsd(CPUX86State *env, Reg *d, Reg *s) +{ + d->ZMM_D(0) =3D float64_sqrt(s->ZMM_D(0), &env->sse_status); +} +#endif =20 /* float to float conversions */ void glue(helper_cvtps2pd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) @@ -1043,6 +1103,20 @@ void glue(helper_rsqrtps, SUFFIX)(CPUX86State *env, = ZMMReg *d, ZMMReg *s) d->ZMM_S(3) =3D float32_div(float32_one, float32_sqrt(s->ZMM_S(3), &env->sse_status), &env->sse_status); +#if SHIFT =3D=3D 2 + d->ZMM_S(4) =3D float32_div(float32_one, + float32_sqrt(s->ZMM_S(4), &env->sse_status), + &env->sse_status); + d->ZMM_S(5) =3D float32_div(float32_one, + float32_sqrt(s->ZMM_S(5), &env->sse_status), + &env->sse_status); + d->ZMM_S(6) =3D float32_div(float32_one, + float32_sqrt(s->ZMM_S(6), &env->sse_status), + &env->sse_status); + d->ZMM_S(7) =3D float32_div(float32_one, + float32_sqrt(s->ZMM_S(7), &env->sse_status), + &env->sse_status); +#endif set_float_exception_flags(old_flags, &env->sse_status); } =20 @@ -1062,6 +1136,12 @@ void glue(helper_rcpps, SUFFIX)(CPUX86State *env, ZM= MReg *d, ZMMReg *s) d->ZMM_S(1) =3D float32_div(float32_one, s->ZMM_S(1), &env->sse_status= ); d->ZMM_S(2) =3D float32_div(float32_one, s->ZMM_S(2), &env->sse_status= ); d->ZMM_S(3) =3D float32_div(float32_one, s->ZMM_S(3), &env->sse_status= ); +#if SHIFT =3D=3D 2 + d->ZMM_S(4) =3D float32_div(float32_one, s->ZMM_S(4), &env->sse_status= ); + d->ZMM_S(5) =3D float32_div(float32_one, s->ZMM_S(5), &env->sse_status= ); + d->ZMM_S(6) =3D float32_div(float32_one, s->ZMM_S(6), &env->sse_status= ); + d->ZMM_S(7) =3D float32_div(float32_one, s->ZMM_S(7), &env->sse_status= ); +#endif set_float_exception_flags(old_flags, &env->sse_status); } =20 @@ -1156,18 +1236,30 @@ void glue(helper_hsubpd, SUFFIX)(CPUX86State *env, = ZMMReg *d, ZMMReg *s) MOVE(*d, r); } =20 -void glue(helper_addsubps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_addsubps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - d->ZMM_S(0) =3D float32_sub(d->ZMM_S(0), s->ZMM_S(0), &env->sse_status= ); - d->ZMM_S(1) =3D float32_add(d->ZMM_S(1), s->ZMM_S(1), &env->sse_status= ); - d->ZMM_S(2) =3D float32_sub(d->ZMM_S(2), s->ZMM_S(2), &env->sse_status= ); - d->ZMM_S(3) =3D float32_add(d->ZMM_S(3), s->ZMM_S(3), &env->sse_status= ); + Reg *v =3D d; + d->ZMM_S(0) =3D float32_sub(v->ZMM_S(0), s->ZMM_S(0), &env->sse_status= ); + d->ZMM_S(1) =3D float32_add(v->ZMM_S(1), s->ZMM_S(1), &env->sse_status= ); + d->ZMM_S(2) =3D float32_sub(v->ZMM_S(2), s->ZMM_S(2), &env->sse_status= ); + d->ZMM_S(3) =3D float32_add(v->ZMM_S(3), s->ZMM_S(3), &env->sse_status= ); +#if SHIFT =3D=3D 2 + d->ZMM_S(4) =3D float32_sub(v->ZMM_S(4), s->ZMM_S(4), &env->sse_status= ); + d->ZMM_S(5) =3D 
float32_add(v->ZMM_S(5), s->ZMM_S(5), &env->sse_status= ); + d->ZMM_S(6) =3D float32_sub(v->ZMM_S(6), s->ZMM_S(6), &env->sse_status= ); + d->ZMM_S(7) =3D float32_add(v->ZMM_S(7), s->ZMM_S(7), &env->sse_status= ); +#endif } =20 -void glue(helper_addsubpd, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_addsubpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - d->ZMM_D(0) =3D float64_sub(d->ZMM_D(0), s->ZMM_D(0), &env->sse_status= ); - d->ZMM_D(1) =3D float64_add(d->ZMM_D(1), s->ZMM_D(1), &env->sse_status= ); + Reg *v =3D d; + d->ZMM_D(0) =3D float64_sub(v->ZMM_D(0), s->ZMM_D(0), &env->sse_status= ); + d->ZMM_D(1) =3D float64_add(v->ZMM_D(1), s->ZMM_D(1), &env->sse_status= ); +#if SHIFT =3D=3D 2 + d->ZMM_D(2) =3D float64_sub(v->ZMM_D(2), s->ZMM_D(2), &env->sse_status= ); + d->ZMM_D(3) =3D float64_add(v->ZMM_D(3), s->ZMM_D(3), &env->sse_status= ); +#endif } =20 /* XXX: unordered */ @@ -2694,6 +2786,8 @@ void glue(helper_aeskeygenassist, SUFFIX)(CPUX86State= *env, Reg *d, Reg *s, } #endif =20 +#undef SSE_HELPER_S + #undef SHIFT #undef XMM_ONLY #undef YMM_ONLY --=20 2.36.0 From nobody Thu May 2 14:26:59 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1650839290470911.0535774994997; Sun, 24 Apr 2022 15:28:10 -0700 (PDT) Received: from localhost ([::1]:43014 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1niki9-0001r0-Aw for importer@patchew.org; Sun, 24 Apr 2022 18:28:09 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50532) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nikSK-0001Lv-Ly for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:11:48 -0400 Received: from nowt.default.pbrook.uk0.bigv.io ([2001:41c8:51:832:fcff:ff:fe00:46dd]:58771) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nikSI-0002ng-Ur for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:11:48 -0400 Received: from cpc91554-seac25-2-0-cust857.7-2.cable.virginm.net ([82.27.199.90] helo=wren.home) by nowt.default.pbrook.uk0.bigv.io with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1nikJ8-0001ea-1t; Sun, 24 Apr 2022 23:02:18 +0100 From: Paul Brook To: Paolo Bonzini , Richard Henderson , Eduardo Habkost Subject: [PATCH v2 16/42] i386: Dot product AVX helper prep Date: Sun, 24 Apr 2022 23:01:38 +0100 Message-Id: <20220424220204.2493824-17-paul@nowt.org> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220418173904.3746036-1-paul@nowt.org> References: <20220418173904.3746036-1-paul@nowt.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2001:41c8:51:832:fcff:ff:fe00:46dd; envelope-from=paul@nowt.org; helo=nowt.default.pbrook.uk0.bigv.io X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org 
X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "open list:All patches CC here" , Paul Brook Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZM-MESSAGEID: 1650839290884100001 Content-Type: text/plain; charset="utf-8" Make the dpps and dppd helpers AVX-ready I can't see any obvious reason why dppd shouldn't work on 256 bit ymm registers, but both AMD and Intel agree that it's xmm only. Signed-off-by: Paul Brook --- target/i386/ops_sse.h | 54 ++++++++++++++++++++++++++++++++++++------- 1 file changed, 46 insertions(+), 8 deletions(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index d308a1ec40..4137e6e1fa 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -2366,8 +2366,10 @@ SSE_HELPER_I(helper_blendps, L, 4, FBLENDP) SSE_HELPER_I(helper_blendpd, Q, 2, FBLENDP) SSE_HELPER_I(helper_pblendw, W, 8, FBLENDP) =20 -void glue(helper_dpps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t = mask) +void glue(helper_dpps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, + uint32_t mask) { + Reg *v =3D d; float32 prod, iresult, iresult2; =20 /* @@ -2375,23 +2377,23 @@ void glue(helper_dpps, SUFFIX)(CPUX86State *env, Re= g *d, Reg *s, uint32_t mask) * to correctly round the intermediate results */ if (mask & (1 << 4)) { - iresult =3D float32_mul(d->ZMM_S(0), s->ZMM_S(0), &env->sse_status= ); + iresult =3D float32_mul(v->ZMM_S(0), s->ZMM_S(0), &env->sse_status= ); } else { iresult =3D float32_zero; } if (mask & (1 << 5)) { - prod =3D float32_mul(d->ZMM_S(1), s->ZMM_S(1), &env->sse_status); + prod =3D float32_mul(v->ZMM_S(1), s->ZMM_S(1), &env->sse_status); } else { prod =3D float32_zero; } iresult =3D float32_add(iresult, prod, &env->sse_status); if (mask & (1 << 6)) { - iresult2 =3D float32_mul(d->ZMM_S(2), s->ZMM_S(2), &env->sse_statu= s); + iresult2 =3D float32_mul(v->ZMM_S(2), s->ZMM_S(2), &env->sse_statu= s); } else { iresult2 =3D float32_zero; } if (mask & (1 << 7)) { - prod =3D float32_mul(d->ZMM_S(3), s->ZMM_S(3), &env->sse_status); + prod =3D float32_mul(v->ZMM_S(3), s->ZMM_S(3), &env->sse_status); } else { prod =3D float32_zero; } @@ -2402,26 +2404,62 @@ void glue(helper_dpps, SUFFIX)(CPUX86State *env, Re= g *d, Reg *s, uint32_t mask) d->ZMM_S(1) =3D (mask & (1 << 1)) ? iresult : float32_zero; d->ZMM_S(2) =3D (mask & (1 << 2)) ? iresult : float32_zero; d->ZMM_S(3) =3D (mask & (1 << 3)) ? iresult : float32_zero; +#if SHIFT =3D=3D 2 + if (mask & (1 << 4)) { + iresult =3D float32_mul(v->ZMM_S(4), s->ZMM_S(4), &env->sse_status= ); + } else { + iresult =3D float32_zero; + } + if (mask & (1 << 5)) { + prod =3D float32_mul(v->ZMM_S(5), s->ZMM_S(5), &env->sse_status); + } else { + prod =3D float32_zero; + } + iresult =3D float32_add(iresult, prod, &env->sse_status); + if (mask & (1 << 6)) { + iresult2 =3D float32_mul(v->ZMM_S(6), s->ZMM_S(6), &env->sse_statu= s); + } else { + iresult2 =3D float32_zero; + } + if (mask & (1 << 7)) { + prod =3D float32_mul(v->ZMM_S(7), s->ZMM_S(7), &env->sse_status); + } else { + prod =3D float32_zero; + } + iresult2 =3D float32_add(iresult2, prod, &env->sse_status); + iresult =3D float32_add(iresult, iresult2, &env->sse_status); + + d->ZMM_S(4) =3D (mask & (1 << 0)) ? iresult : float32_zero; + d->ZMM_S(5) =3D (mask & (1 << 1)) ? iresult : float32_zero; + d->ZMM_S(6) =3D (mask & (1 << 2)) ? iresult : float32_zero; + d->ZMM_S(7) =3D (mask & (1 << 3)) ? 
iresult : float32_zero; +#endif } =20 -void glue(helper_dppd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t = mask) +#if SHIFT =3D=3D 1 +/* Oddly, there is no ymm version of dppd */ +void glue(helper_dppd, SUFFIX)(CPUX86State *env, + Reg *d, Reg *s, uint32_t mask) { + Reg *v =3D d; float64 iresult; =20 if (mask & (1 << 4)) { - iresult =3D float64_mul(d->ZMM_D(0), s->ZMM_D(0), &env->sse_status= ); + iresult =3D float64_mul(v->ZMM_D(0), s->ZMM_D(0), &env->sse_status= ); } else { iresult =3D float64_zero; } + if (mask & (1 << 5)) { iresult =3D float64_add(iresult, - float64_mul(d->ZMM_D(1), s->ZMM_D(1), + float64_mul(v->ZMM_D(1), s->ZMM_D(1), &env->sse_status), &env->sse_status); } d->ZMM_D(0) =3D (mask & (1 << 0)) ? iresult : float64_zero; d->ZMM_D(1) =3D (mask & (1 << 1)) ? iresult : float64_zero; } +#endif =20 void glue(helper_mpsadbw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t offset) --=20 2.36.0 From nobody Thu May 2 14:26:59 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 165083866320648.2310359124848; Sun, 24 Apr 2022 15:17:43 -0700 (PDT) Received: from localhost ([::1]:45996 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nikY1-0001G2-9M for importer@patchew.org; Sun, 24 Apr 2022 18:17:41 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50408) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nikRs-0000sc-EY for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:11:22 -0400 Received: from nowt.default.pbrook.uk0.bigv.io ([2001:41c8:51:832:fcff:ff:fe00:46dd]:58750) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nikRq-0002ln-LQ for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:11:20 -0400 Received: from cpc91554-seac25-2-0-cust857.7-2.cable.virginm.net ([82.27.199.90] helo=wren.home) by nowt.default.pbrook.uk0.bigv.io with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1nikJ8-0001ea-7q; Sun, 24 Apr 2022 23:02:18 +0100 From: Paul Brook To: Paolo Bonzini , Richard Henderson , Eduardo Habkost Subject: [PATCH v2 17/42] i386: Destructive FP helpers for AVX Date: Sun, 24 Apr 2022 23:01:39 +0100 Message-Id: <20220424220204.2493824-18-paul@nowt.org> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220418173904.3746036-1-paul@nowt.org> References: <20220418173904.3746036-1-paul@nowt.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2001:41c8:51:832:fcff:ff:fe00:46dd; envelope-from=paul@nowt.org; helo=nowt.default.pbrook.uk0.bigv.io X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "open list:All patches CC here" , Paul Brook 
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZM-MESSAGEID: 1650838663504100001 Content-Type: text/plain; charset="utf-8" Prepare the horizontal arithmetic vector helpers for AVX. These currently use a dummy Reg typed variable to store the result and then assign the whole register. This will cause 128 bit operations to corrupt the upper half of the register, so replace it with explicit temporaries and element assignments.
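Concretely, sketching the hazard (a hypothetical compressed example, not code from this patch): once Reg is a 256-bit type, the old whole-register copy writes all eight lanes, while the 128-bit form of these instructions must leave lanes 4..7 of the destination untouched.

    /* Old pattern: r's upper lanes are never initialised, yet the
     * whole-struct copy stores them into *d, clobbering lanes 4..7. */
    Reg r;
    r.ZMM_S(0) = float32_add(d->ZMM_S(0), d->ZMM_S(1), &env->sse_status);
    /* ... lanes 1..3 computed the same way ... */
    MOVE(*d, r);          /* also copies junk into d's lanes 4..7 */

    /* New pattern: scalar temporaries plus per-element stores, so only
     * the lanes this width of the instruction defines are written. */
    float32 r0 = float32_add(d->ZMM_S(0), d->ZMM_S(1), &env->sse_status);
    d->ZMM_S(0) = r0;     /* lanes 4..7 of d are left alone */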
Signed-off-by: Paul Brook --- target/i386/ops_sse.h | 96 +++++++++++++++++++++++++++++++------------ 1 file changed, 70 insertions(+), 26 deletions(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index 4137e6e1fa..d128af6cc8 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -1196,44 +1196,88 @@ void helper_insertq_i(CPUX86State *env, ZMMReg *d, = int index, int length) d->ZMM_Q(0) =3D helper_insertq(d->ZMM_Q(0), index, length); } =20 -void glue(helper_haddps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_haddps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - ZMMReg r; - - r.ZMM_S(0) =3D float32_add(d->ZMM_S(0), d->ZMM_S(1), &env->sse_status); - r.ZMM_S(1) =3D float32_add(d->ZMM_S(2), d->ZMM_S(3), &env->sse_status); - r.ZMM_S(2) =3D float32_add(s->ZMM_S(0), s->ZMM_S(1), &env->sse_status); - r.ZMM_S(3) =3D float32_add(s->ZMM_S(2), s->ZMM_S(3), &env->sse_status); - MOVE(*d, r); + Reg *v =3D d; + float32 r0, r1, r2, r3; + + r0 =3D float32_add(v->ZMM_S(0), v->ZMM_S(1), &env->sse_status); + r1 =3D float32_add(v->ZMM_S(2), v->ZMM_S(3), &env->sse_status); + r2 =3D float32_add(s->ZMM_S(0), s->ZMM_S(1), &env->sse_status); + r3 =3D float32_add(s->ZMM_S(2), s->ZMM_S(3), &env->sse_status); + d->ZMM_S(0) =3D r0; + d->ZMM_S(1) =3D r1; + d->ZMM_S(2) =3D r2; + d->ZMM_S(3) =3D r3; +#if SHIFT =3D=3D 2 + r0 =3D float32_add(v->ZMM_S(4), v->ZMM_S(5), &env->sse_status); + r1 =3D float32_add(v->ZMM_S(6), v->ZMM_S(7), &env->sse_status); + r2 =3D float32_add(s->ZMM_S(4), s->ZMM_S(5), &env->sse_status); + r3 =3D float32_add(s->ZMM_S(6), s->ZMM_S(7), &env->sse_status); + d->ZMM_S(4) =3D r0; + d->ZMM_S(5) =3D r1; + d->ZMM_S(6) =3D r2; + d->ZMM_S(7) =3D r3; +#endif } =20 -void glue(helper_haddpd, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_haddpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - ZMMReg r; + Reg *v =3D d; + float64 r0, r1; =20 - r.ZMM_D(0) =3D float64_add(d->ZMM_D(0), d->ZMM_D(1), &env->sse_status); - r.ZMM_D(1) =3D float64_add(s->ZMM_D(0), s->ZMM_D(1), &env->sse_status); - MOVE(*d, r); + r0 =3D float64_add(v->ZMM_D(0), v->ZMM_D(1), &env->sse_status); + r1 =3D float64_add(s->ZMM_D(0), s->ZMM_D(1), &env->sse_status); + d->ZMM_D(0) =3D r0; + d->ZMM_D(1) =3D r1; +#if SHIFT =3D=3D 2 + r0 =3D float64_add(v->ZMM_D(2), v->ZMM_D(3), &env->sse_status); + r1 =3D float64_add(s->ZMM_D(2), s->ZMM_D(3), &env->sse_status); + d->ZMM_D(2) =3D r0; + d->ZMM_D(3) =3D r1; +#endif } =20 -void glue(helper_hsubps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_hsubps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - ZMMReg r; - - r.ZMM_S(0) =3D float32_sub(d->ZMM_S(0), d->ZMM_S(1), &env->sse_status); - r.ZMM_S(1) =3D float32_sub(d->ZMM_S(2), d->ZMM_S(3), &env->sse_status); - r.ZMM_S(2) =3D float32_sub(s->ZMM_S(0), s->ZMM_S(1), &env->sse_status); - r.ZMM_S(3) =3D float32_sub(s->ZMM_S(2), s->ZMM_S(3), &env->sse_status); - MOVE(*d, r); + Reg *v =3D d; + float32 r0, r1, r2, r3; + + r0 =3D float32_sub(v->ZMM_S(0), v->ZMM_S(1), &env->sse_status); + r1 =3D float32_sub(v->ZMM_S(2), v->ZMM_S(3), &env->sse_status); + r2 =3D float32_sub(s->ZMM_S(0), s->ZMM_S(1), &env->sse_status); + r3 =3D float32_sub(s->ZMM_S(2), s->ZMM_S(3), &env->sse_status); + d->ZMM_S(0) =3D r0; + d->ZMM_S(1) =3D r1; + d->ZMM_S(2) =3D r2; + d->ZMM_S(3) =3D r3; +#if SHIFT =3D=3D 2 + r0 =3D float32_sub(v->ZMM_S(4), v->ZMM_S(5), &env->sse_status); + r1 =3D float32_sub(v->ZMM_S(6), v->ZMM_S(7), &env->sse_status); + r2 =3D float32_sub(s->ZMM_S(4), s->ZMM_S(5), &env->sse_status); + r3 =3D float32_sub(s->ZMM_S(6), s->ZMM_S(7), &env->sse_status); + d->ZMM_S(4) =3D r0; + d->ZMM_S(5) =3D r1; + d->ZMM_S(6) =3D r2; + d->ZMM_S(7) =3D r3; +#endif } =20 -void glue(helper_hsubpd, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_hsubpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - ZMMReg r; + Reg *v =3D d; + float64 r0, r1; =20 - r.ZMM_D(0) =3D float64_sub(d->ZMM_D(0), d->ZMM_D(1), &env->sse_status); - r.ZMM_D(1) =3D float64_sub(s->ZMM_D(0), s->ZMM_D(1), &env->sse_status); - MOVE(*d, r); + r0 =3D float64_sub(v->ZMM_D(0), v->ZMM_D(1), &env->sse_status); + r1 =3D float64_sub(s->ZMM_D(0), s->ZMM_D(1), &env->sse_status); + d->ZMM_D(0) =3D r0; + d->ZMM_D(1) =3D r1; +#if SHIFT =3D=3D 2 + r0 =3D float64_sub(v->ZMM_D(2), v->ZMM_D(3), &env->sse_status); + r1 =3D float64_sub(s->ZMM_D(2), s->ZMM_D(3), &env->sse_status); + d->ZMM_D(2) =3D r0; + d->ZMM_D(3) =3D r1; +#endif } =20 void glue(helper_addsubps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) --=20 2.36.0 From nobody Thu May 2 14:26:59 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1650839047529513.4577385578356; Sun, 24 Apr 2022 15:24:07 -0700 (PDT) Received: from localhost ([::1]:34078 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nikeE-0003ts-E4 for importer@patchew.org; Sun, 24 Apr 2022 18:24:06 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50814) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nikTT-00021w-U2 for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:12:59 -0400 Received: from nowt.default.pbrook.uk0.bigv.io ([2001:41c8:51:832:fcff:ff:fe00:46dd]:58839) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nikTQ-0002tV-Eq for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:12:58 -0400 Received: from cpc91554-seac25-2-0-cust857.7-2.cable.virginm.net ([82.27.199.90] helo=wren.home) by nowt.default.pbrook.uk0.bigv.io with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1nikJ8-0001ea-EL; Sun, 24 Apr 2022 23:02:18 +0100 From: Paul Brook To: Paolo Bonzini , Richard Henderson , Eduardo Habkost Subject: [PATCH v2 18/42] i386: Misc AVX helper prep Date: Sun, 24 Apr 2022 23:01:40 +0100 Message-Id: <20220424220204.2493824-19-paul@nowt.org> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220418173904.3746036-1-paul@nowt.org> References: <20220418173904.3746036-1-paul@nowt.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass
client-ip=2001:41c8:51:832:fcff:ff:fe00:46dd; envelope-from=paul@nowt.org; helo=nowt.default.pbrook.uk0.bigv.io X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "open list:All patches CC here" , Paul Brook Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZM-MESSAGEID: 1650839048063100001 Content-Type: text/plain; charset="utf-8" Fix up various vector helpers that either trivially extend to 256 bit, or don't have 256 bit variants. No functional changes to existing helpers. Signed-off-by: Paul Brook --- target/i386/ops_sse.h | 159 ++++++++++++++++++++++++++++++++++++------ 1 file changed, 139 insertions(+), 20 deletions(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index d128af6cc8..3202c00572 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -641,6 +641,7 @@ void glue(helper_psadbw, SUFFIX)(CPUX86State *env, Reg = *d, Reg *s) #endif } =20 +#if SHIFT < 2 void glue(helper_maskmov, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, target_ulong a0) { @@ -652,6 +653,7 @@ void glue(helper_maskmov, SUFFIX)(CPUX86State *env, Reg= *d, Reg *s, } } } +#endif =20 void glue(helper_movl_mm_T0, SUFFIX)(Reg *d, uint32_t val) { @@ -882,6 +884,13 @@ void glue(helper_cvtps2pd, SUFFIX)(CPUX86State *env, R= eg *d, Reg *s) =20 s0 =3D s->ZMM_S(0); s1 =3D s->ZMM_S(1); +#if SHIFT =3D=3D 2 + float32 s2, s3; + s2 =3D s->ZMM_S(2); + s3 =3D s->ZMM_S(3); + d->ZMM_D(2) =3D float32_to_float64(s2, &env->sse_status); + d->ZMM_D(3) =3D float32_to_float64(s3, &env->sse_status); +#endif d->ZMM_D(0) =3D float32_to_float64(s0, &env->sse_status); + d->ZMM_D(1) =3D float32_to_float64(s1, &env->sse_status); } @@ -890,9 +899,17 @@ void glue(helper_cvtpd2ps, SUFFIX)(CPUX86State *env, R= eg *d, Reg *s) { d->ZMM_S(0) =3D float64_to_float32(s->ZMM_D(0), &env->sse_status); d->ZMM_S(1) =3D float64_to_float32(s->ZMM_D(1), &env->sse_status); +#if SHIFT =3D=3D 2 + d->ZMM_S(2) =3D float64_to_float32(s->ZMM_D(2), &env->sse_status); + d->ZMM_S(3) =3D float64_to_float32(s->ZMM_D(3), &env->sse_status); + d->Q(2) =3D 0; + d->Q(3) =3D 0; +#else d->Q(1) =3D 0; +#endif } =20 +#if SHIFT =3D=3D 1 void helper_cvtss2sd(CPUX86State *env, Reg *d, Reg *s) { d->ZMM_D(0) =3D float32_to_float64(s->ZMM_S(0), &env->sse_status); } @@ -902,6 +919,7 @@ void helper_cvtsd2ss(CPUX86State *env, Reg *d, Reg *s) { d->ZMM_S(0) =3D float64_to_float32(s->ZMM_D(0), &env->sse_status); } +#endif =20 /* integer to float */ void glue(helper_cvtdq2ps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) @@ -910,6 +928,12 @@ void glue(helper_cvtdq2ps, SUFFIX)(CPUX86State *env, R= eg *d, Reg *s) d->ZMM_S(1) =3D int32_to_float32(s->ZMM_L(1), &env->sse_status); d->ZMM_S(2) =3D int32_to_float32(s->ZMM_L(2), &env->sse_status); d->ZMM_S(3) =3D int32_to_float32(s->ZMM_L(3), &env->sse_status); +#if SHIFT =3D=3D 2 + d->ZMM_S(4) =3D int32_to_float32(s->ZMM_L(4), &env->sse_status); + d->ZMM_S(5) =3D int32_to_float32(s->ZMM_L(5), &env->sse_status); + d->ZMM_S(6) =3D int32_to_float32(s->ZMM_L(6), &env->sse_status); + d->ZMM_S(7) =3D int32_to_float32(s->ZMM_L(7), &env->sse_status); +#endif } =20 void glue(helper_cvtdq2pd,
SUFFIX)(CPUX86State *env, = Reg *d, Reg *s) =20 l0 =3D (int32_t)s->ZMM_L(0); l1 =3D (int32_t)s->ZMM_L(1); +#if SHIFT =3D=3D 2 + int32_t l2, l3; + l2 =3D (int32_t)s->ZMM_L(2); + l3 =3D (int32_t)s->ZMM_L(3); + d->ZMM_D(2) =3D int32_to_float64(l2, &env->sse_status); + d->ZMM_D(3) =3D int32_to_float64(l3, &env->sse_status); +#endif d->ZMM_D(0) =3D int32_to_float64(l0, &env->sse_status); d->ZMM_D(1) =3D int32_to_float64(l1, &env->sse_status); } =20 +#if SHIFT =3D=3D 1 void helper_cvtpi2ps(CPUX86State *env, ZMMReg *d, MMXReg *s) { d->ZMM_S(0) =3D int32_to_float32(s->MMX_L(0), &env->sse_status); @@ -956,8 +988,11 @@ void helper_cvtsq2sd(CPUX86State *env, ZMMReg *d, uint= 64_t val) } #endif =20 +#endif + /* float to integer */ =20 +#if SHIFT =3D=3D 1 /* * x86 mandates that we return the indefinite integer value for the result * of any float-to-integer conversion that raises the 'invalid' exception. @@ -988,6 +1023,7 @@ WRAP_FLOATCONV(int64_t, float32_to_int64, float32, INT= 64_MIN) WRAP_FLOATCONV(int64_t, float32_to_int64_round_to_zero, float32, INT64_MIN) WRAP_FLOATCONV(int64_t, float64_to_int64, float64, INT64_MIN) WRAP_FLOATCONV(int64_t, float64_to_int64_round_to_zero, float64, INT64_MIN) +#endif =20 void glue(helper_cvtps2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { @@ -995,15 +1031,29 @@ void glue(helper_cvtps2dq, SUFFIX)(CPUX86State *env,= ZMMReg *d, ZMMReg *s) d->ZMM_L(1) =3D x86_float32_to_int32(s->ZMM_S(1), &env->sse_status); d->ZMM_L(2) =3D x86_float32_to_int32(s->ZMM_S(2), &env->sse_status); d->ZMM_L(3) =3D x86_float32_to_int32(s->ZMM_S(3), &env->sse_status); +#if SHIFT =3D=3D 2 + d->ZMM_L(4) =3D x86_float32_to_int32(s->ZMM_S(4), &env->sse_status); + d->ZMM_L(5) =3D x86_float32_to_int32(s->ZMM_S(5), &env->sse_status); + d->ZMM_L(6) =3D x86_float32_to_int32(s->ZMM_S(6), &env->sse_status); + d->ZMM_L(7) =3D x86_float32_to_int32(s->ZMM_S(7), &env->sse_status); +#endif } =20 void glue(helper_cvtpd2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { d->ZMM_L(0) =3D x86_float64_to_int32(s->ZMM_D(0), &env->sse_status); d->ZMM_L(1) =3D x86_float64_to_int32(s->ZMM_D(1), &env->sse_status); +#if SHIFT =3D=3D 2 + d->ZMM_L(2) =3D x86_float64_to_int32(s->ZMM_D(2), &env->sse_status); + d->ZMM_L(3) =3D x86_float64_to_int32(s->ZMM_D(3), &env->sse_status); + d->Q(2) =3D 0; + d->Q(3) =3D 0; +#else d->ZMM_Q(1) =3D 0; +#endif } =20 +#if SHIFT =3D=3D 1 void helper_cvtps2pi(CPUX86State *env, MMXReg *d, ZMMReg *s) { d->MMX_L(0) =3D x86_float32_to_int32(s->ZMM_S(0), &env->sse_status); @@ -1037,33 +1087,64 @@ int64_t helper_cvtsd2sq(CPUX86State *env, ZMMReg *s) return x86_float64_to_int64(s->ZMM_D(0), &env->sse_status); } #endif +#endif =20 /* float to integer truncated */ void glue(helper_cvttps2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { - d->ZMM_L(0) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(0), &env->= sse_status); - d->ZMM_L(1) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(1), &env->= sse_status); - d->ZMM_L(2) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(2), &env->= sse_status); - d->ZMM_L(3) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(3), &env->= sse_status); + d->ZMM_L(0) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(0), + &env->sse_status); + d->ZMM_L(1) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(1), + &env->sse_status); + d->ZMM_L(2) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(2), + &env->sse_status); + d->ZMM_L(3) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(3), + &env->sse_status); +#if SHIFT =3D=3D 2 + d->ZMM_L(4) =3D 
x86_float32_to_int32_round_to_zero(s->ZMM_S(4), + &env->sse_status); + d->ZMM_L(5) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(5), + &env->sse_status); + d->ZMM_L(6) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(6), + &env->sse_status); + d->ZMM_L(7) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(7), + &env->sse_status); +#endif } =20 void glue(helper_cvttpd2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { - d->ZMM_L(0) =3D x86_float64_to_int32_round_to_zero(s->ZMM_D(0), &env->= sse_status); - d->ZMM_L(1) =3D x86_float64_to_int32_round_to_zero(s->ZMM_D(1), &env->= sse_status); + d->ZMM_L(0) =3D x86_float64_to_int32_round_to_zero(s->ZMM_D(0), + &env->sse_status); + d->ZMM_L(1) =3D x86_float64_to_int32_round_to_zero(s->ZMM_D(1), + &env->sse_status); +#if SHIFT =3D=3D 2 + d->ZMM_L(2) =3D x86_float64_to_int32_round_to_zero(s->ZMM_D(2), + &env->sse_status); + d->ZMM_L(3) =3D x86_float64_to_int32_round_to_zero(s->ZMM_D(3), + &env->sse_status); + d->ZMM_Q(2) =3D 0; + d->ZMM_Q(3) =3D 0; +#else d->ZMM_Q(1) =3D 0; +#endif } =20 +#if SHIFT =3D=3D 1 void helper_cvttps2pi(CPUX86State *env, MMXReg *d, ZMMReg *s) { - d->MMX_L(0) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(0), &env->= sse_status); - d->MMX_L(1) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(1), &env->= sse_status); + d->MMX_L(0) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(0), + &env->sse_status); + d->MMX_L(1) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(1), + &env->sse_status); } =20 void helper_cvttpd2pi(CPUX86State *env, MMXReg *d, ZMMReg *s) { - d->MMX_L(0) =3D x86_float64_to_int32_round_to_zero(s->ZMM_D(0), &env->= sse_status); - d->MMX_L(1) =3D x86_float64_to_int32_round_to_zero(s->ZMM_D(1), &env->= sse_status); + d->MMX_L(0) =3D x86_float64_to_int32_round_to_zero(s->ZMM_D(0), + &env->sse_status); + d->MMX_L(1) =3D x86_float64_to_int32_round_to_zero(s->ZMM_D(1), + &env->sse_status); } =20 int32_t helper_cvttss2si(CPUX86State *env, ZMMReg *s) @@ -1087,6 +1168,7 @@ int64_t helper_cvttsd2sq(CPUX86State *env, ZMMReg *s) return x86_float64_to_int64_round_to_zero(s->ZMM_D(0), &env->sse_statu= s); } #endif +#endif =20 void glue(helper_rsqrtps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { @@ -1120,6 +1202,7 @@ void glue(helper_rsqrtps, SUFFIX)(CPUX86State *env, Z= MMReg *d, ZMMReg *s) set_float_exception_flags(old_flags, &env->sse_status); } =20 +#if SHIFT =3D=3D 1 void helper_rsqrtss(CPUX86State *env, ZMMReg *d, ZMMReg *s) { uint8_t old_flags =3D get_float_exception_flags(&env->sse_status); @@ -1128,6 +1211,7 @@ void helper_rsqrtss(CPUX86State *env, ZMMReg *d, ZMMR= eg *s) &env->sse_status); set_float_exception_flags(old_flags, &env->sse_status); } +#endif =20 void glue(helper_rcpps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { @@ -1145,13 +1229,16 @@ void glue(helper_rcpps, SUFFIX)(CPUX86State *env, Z= MMReg *d, ZMMReg *s) set_float_exception_flags(old_flags, &env->sse_status); } =20 +#if SHIFT =3D=3D 1 void helper_rcpss(CPUX86State *env, ZMMReg *d, ZMMReg *s) { uint8_t old_flags =3D get_float_exception_flags(&env->sse_status); d->ZMM_S(0) =3D float32_div(float32_one, s->ZMM_S(0), &env->sse_status= ); set_float_exception_flags(old_flags, &env->sse_status); } +#endif =20 +#if SHIFT =3D=3D 1 static inline uint64_t helper_extrq(uint64_t src, int shift, int len) { uint64_t mask; @@ -1195,6 +1282,7 @@ void helper_insertq_i(CPUX86State *env, ZMMReg *d, in= t index, int length) { d->ZMM_Q(0) =3D helper_insertq(d->ZMM_Q(0), index, length); } +#endif =20 void glue(helper_haddps, SUFFIX)(CPUX86State *env, Reg *d, 
Reg *s) { @@ -1358,6 +1446,7 @@ SSE_HELPER_CMP(cmpnlt, FPU_CMPNLT) SSE_HELPER_CMP(cmpnle, FPU_CMPNLE) SSE_HELPER_CMP(cmpord, FPU_CMPORD) =20 +#if SHIFT =3D=3D 1 static const int comis_eflags[4] =3D {CC_C, CC_Z, 0, CC_Z | CC_P | CC_C}; =20 void helper_ucomiss(CPUX86State *env, Reg *d, Reg *s) @@ -1403,25 +1492,38 @@ void helper_comisd(CPUX86State *env, Reg *d, Reg *s) ret =3D float64_compare(d0, d1, &env->sse_status); CC_SRC =3D comis_eflags[ret + 1]; } +#endif =20 uint32_t glue(helper_movmskps, SUFFIX)(CPUX86State *env, Reg *s) { - int b0, b1, b2, b3; + uint32_t mask; =20 - b0 =3D s->ZMM_L(0) >> 31; - b1 =3D s->ZMM_L(1) >> 31; - b2 =3D s->ZMM_L(2) >> 31; - b3 =3D s->ZMM_L(3) >> 31; - return b0 | (b1 << 1) | (b2 << 2) | (b3 << 3); + mask =3D 0; + mask |=3D (s->ZMM_L(0) >> (31 - 0)) & (1 << 0); + mask |=3D (s->ZMM_L(1) >> (31 - 1)) & (1 << 1); + mask |=3D (s->ZMM_L(2) >> (31 - 2)) & (1 << 2); + mask |=3D (s->ZMM_L(3) >> (31 - 3)) & (1 << 3); +#if SHIFT =3D=3D 2 + mask |=3D (s->ZMM_L(4) >> (31 - 4)) & (1 << 4); + mask |=3D (s->ZMM_L(5) >> (31 - 5)) & (1 << 5); + mask |=3D (s->ZMM_L(6) >> (31 - 6)) & (1 << 6); + mask |=3D (s->ZMM_L(7) >> (31 - 7)) & (1 << 7); +#endif + return mask; } =20 uint32_t glue(helper_movmskpd, SUFFIX)(CPUX86State *env, Reg *s) { - int b0, b1; + uint32_t mask; =20 - b0 =3D s->ZMM_L(1) >> 31; - b1 =3D s->ZMM_L(3) >> 31; - return b0 | (b1 << 1); + mask =3D 0; + mask |=3D (s->ZMM_L(1) >> (31 - 0)) & (1 << 0); + mask |=3D (s->ZMM_L(3) >> (31 - 1)) & (1 << 1); +#if SHIFT =3D=3D 2 + mask |=3D (s->ZMM_L(5) >> (31 - 2)) & (1 << 2); + mask |=3D (s->ZMM_L(7) >> (31 - 3)) & (1 << 3); +#endif + return mask; } =20 #endif @@ -2233,6 +2335,7 @@ SSE_HELPER_L(helper_pmaxud, MAX) #define FMULLD(d, s) ((int32_t)d * (int32_t)s) SSE_HELPER_L(helper_pmulld, FMULLD) =20 +#if SHIFT =3D=3D 1 void glue(helper_phminposuw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { int idx =3D 0; @@ -2264,6 +2367,7 @@ void glue(helper_phminposuw, SUFFIX)(CPUX86State *env= , Reg *d, Reg *s) d->L(1) =3D 0; d->Q(1) =3D 0; } +#endif =20 void glue(helper_roundps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t mode) @@ -2293,6 +2397,12 @@ void glue(helper_roundps, SUFFIX)(CPUX86State *env, = Reg *d, Reg *s, d->ZMM_S(1) =3D float32_round_to_int(s->ZMM_S(1), &env->sse_status); d->ZMM_S(2) =3D float32_round_to_int(s->ZMM_S(2), &env->sse_status); d->ZMM_S(3) =3D float32_round_to_int(s->ZMM_S(3), &env->sse_status); +#if SHIFT =3D=3D 2 + d->ZMM_S(4) =3D float32_round_to_int(s->ZMM_S(4), &env->sse_status); + d->ZMM_S(5) =3D float32_round_to_int(s->ZMM_S(5), &env->sse_status); + d->ZMM_S(6) =3D float32_round_to_int(s->ZMM_S(6), &env->sse_status); + d->ZMM_S(7) =3D float32_round_to_int(s->ZMM_S(7), &env->sse_status); +#endif =20 if (mode & (1 << 3) && !(old_flags & float_flag_inexact)) { set_float_exception_flags(get_float_exception_flags(&env->sse_stat= us) & @@ -2328,6 +2438,10 @@ void glue(helper_roundpd, SUFFIX)(CPUX86State *env, = Reg *d, Reg *s, =20 d->ZMM_D(0) =3D float64_round_to_int(s->ZMM_D(0), &env->sse_status); d->ZMM_D(1) =3D float64_round_to_int(s->ZMM_D(1), &env->sse_status); +#if SHIFT =3D=3D 2 + d->ZMM_D(2) =3D float64_round_to_int(s->ZMM_D(2), &env->sse_status); + d->ZMM_D(3) =3D float64_round_to_int(s->ZMM_D(3), &env->sse_status); +#endif =20 if (mode & (1 << 3) && !(old_flags & float_flag_inexact)) { set_float_exception_flags(get_float_exception_flags(&env->sse_stat= us) & @@ -2337,6 +2451,7 @@ void glue(helper_roundpd, SUFFIX)(CPUX86State *env, R= eg *d, Reg *s, env->sse_status.float_rounding_mode =3D 
prev_rounding_mode; } =20 +#if SHIFT =3D=3D 1 void glue(helper_roundss, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t mode) { @@ -2404,6 +2519,7 @@ void glue(helper_roundsd, SUFFIX)(CPUX86State *env, R= eg *d, Reg *s, } env->sse_status.float_rounding_mode =3D prev_rounding_mode; } +#endif =20 #define FBLENDP(d, s, m) (m ? s : d) SSE_HELPER_I(helper_blendps, L, 4, FBLENDP) @@ -2545,6 +2661,7 @@ void glue(helper_mpsadbw, SUFFIX)(CPUX86State *env, R= eg *d, Reg *s, #define FCMPGTQ(d, s) ((int64_t)d > (int64_t)s ? -1 : 0) SSE_HELPER_Q(helper_pcmpgtq, FCMPGTQ) =20 +#if SHIFT =3D=3D 1 static inline int pcmp_elen(CPUX86State *env, int reg, uint32_t ctrl) { target_long val, limit; @@ -2765,6 +2882,8 @@ target_ulong helper_crc32(uint32_t crc1, target_ulong= msg, uint32_t len) return crc; } =20 +#endif + void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t ctrl) { --=20 2.36.0 From nobody Thu May 2 14:26:59 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1650840109131385.0882452126822; Sun, 24 Apr 2022 15:41:49 -0700 (PDT) Received: from localhost ([::1]:52220 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nikvM-0002Nf-1S for importer@patchew.org; Sun, 24 Apr 2022 18:41:48 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50956) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nikUQ-0003Pk-UX for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:13:58 -0400 Received: from nowt.default.pbrook.uk0.bigv.io ([2001:41c8:51:832:fcff:ff:fe00:46dd]:58877) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nikUP-0002xA-0R for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:13:58 -0400 Received: from cpc91554-seac25-2-0-cust857.7-2.cable.virginm.net ([82.27.199.90] helo=wren.home) by nowt.default.pbrook.uk0.bigv.io with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1nikJ8-0001ea-LI; Sun, 24 Apr 2022 23:02:18 +0100 From: Paul Brook To: Paolo Bonzini , Richard Henderson , Eduardo Habkost Subject: [PATCH v2 19/42] i386: Rewrite blendv helpers Date: Sun, 24 Apr 2022 23:01:41 +0100 Message-Id: <20220424220204.2493824-20-paul@nowt.org> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220418173904.3746036-1-paul@nowt.org> References: <20220418173904.3746036-1-paul@nowt.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2001:41c8:51:832:fcff:ff:fe00:46dd; envelope-from=paul@nowt.org; helo=nowt.default.pbrook.uk0.bigv.io X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "open list:All patches CC here" , Paul Brook Errors-To: 
qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZM-MESSAGEID: 1650840110830100001 Content-Type: text/plain; charset="utf-8" Rewrite the blendv helpers so that they can easily be extended to support the AVX encodings, which make all 4 arguments explicit. No functional changes to the existing helpers Signed-off-by: Paul Brook --- target/i386/ops_sse.h | 119 +++++++++++++++++++++--------------------- 1 file changed, 60 insertions(+), 59 deletions(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index 3202c00572..9f388b02b9 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -2141,73 +2141,74 @@ void glue(helper_palignr, SUFFIX)(CPUX86State *env,= Reg *d, Reg *s, } } =20 -#define XMM0 (env->xmm_regs[0]) +#if SHIFT >=3D 1 + +#define BLEND_V128(elem, num, F, b) do { = \ + d->elem(b + 0) =3D F(v->elem(b + 0), s->elem(b + 0), m->elem(b + 0)); = \ + d->elem(b + 1) =3D F(v->elem(b + 1), s->elem(b + 1), m->elem(b + 1)); = \ + if (num > 2) { = \ + d->elem(b + 2) =3D F(v->elem(b + 2), s->elem(b + 2), m->elem(b + 2= )); \ + d->elem(b + 3) =3D F(v->elem(b + 3), s->elem(b + 3), m->elem(b + 3= )); \ + } = \ + if (num > 4) { = \ + d->elem(b + 4) =3D F(v->elem(b + 4), s->elem(b + 4), m->elem(b + 4= )); \ + d->elem(b + 5) =3D F(v->elem(b + 5), s->elem(b + 5), m->elem(b + 5= )); \ + d->elem(b + 6) =3D F(v->elem(b + 6), s->elem(b + 6), m->elem(b + 6= )); \ + d->elem(b + 7) =3D F(v->elem(b + 7), s->elem(b + 7), m->elem(b + 7= )); \ + } = \ + if (num > 8) { = \ + d->elem(b + 8) =3D F(v->elem(b + 8), s->elem(b + 8), m->elem(b + 8= )); \ + d->elem(b + 9) =3D F(v->elem(b + 9), s->elem(b + 9), m->elem(b + 9= )); \ + d->elem(b + 10) =3D F(v->elem(b + 10), s->elem(b + 10), m->elem(b = + 10));\ + d->elem(b + 11) =3D F(v->elem(b + 11), s->elem(b + 11), m->elem(b = + 11));\ + d->elem(b + 12) =3D F(v->elem(b + 12), s->elem(b + 12), m->elem(b = + 12));\ + d->elem(b + 13) =3D F(v->elem(b + 13), s->elem(b + 13), m->elem(b = + 13));\ + d->elem(b + 14) =3D F(v->elem(b + 14), s->elem(b + 14), m->elem(b = + 14));\ + d->elem(b + 15) =3D F(v->elem(b + 15), s->elem(b + 15), m->elem(b = + 15));\ + } \ + } while (0) =20 -#if SHIFT =3D=3D 1 #define SSE_HELPER_V(name, elem, num, F) \ - void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ + void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ { \ - d->elem(0) =3D F(d->elem(0), s->elem(0), XMM0.elem(0)); \ - d->elem(1) =3D F(d->elem(1), s->elem(1), XMM0.elem(1)); \ - if (num > 2) { \ - d->elem(2) =3D F(d->elem(2), s->elem(2), XMM0.elem(2)); \ - d->elem(3) =3D F(d->elem(3), s->elem(3), XMM0.elem(3)); \ - if (num > 4) { \ - d->elem(4) =3D F(d->elem(4), s->elem(4), XMM0.elem(4)); \ - d->elem(5) =3D F(d->elem(5), s->elem(5), XMM0.elem(5)); \ - d->elem(6) =3D F(d->elem(6), s->elem(6), XMM0.elem(6)); \ - d->elem(7) =3D F(d->elem(7), s->elem(7), XMM0.elem(7)); \ - if (num > 8) { \ - d->elem(8) =3D F(d->elem(8), s->elem(8), XMM0.elem(8))= ; \ - d->elem(9) =3D F(d->elem(9), s->elem(9), XMM0.elem(9))= ; \ - d->elem(10) =3D F(d->elem(10), s->elem(10), XMM0.elem(= 10)); \ - d->elem(11) =3D F(d->elem(11), s->elem(11), XMM0.elem(= 11)); \ - d->elem(12) =3D F(d->elem(12), s->elem(12), XMM0.elem(= 12)); \ - d->elem(13) =3D F(d->elem(13), s->elem(13), XMM0.elem(= 13)); \ - d->elem(14) =3D F(d->elem(14), s->elem(14), XMM0.elem(= 14)); \ - d->elem(15) =3D F(d->elem(15), s->elem(15), XMM0.elem(= 15)); \ - } \ - } \ - } \ - } + Reg *v =3D d; \ + Reg *m =3D &env->xmm_regs[0]; \ + BLEND_V128(elem, num, F, 0); \ + YMM_ONLY(BLEND_V128(elem, num, 
F, num);) \ + } + +#define BLEND_I128(elem, num, F, b) do { = \ + d->elem(b + 0) =3D F(v->elem(b + 0), s->elem(b + 0), ((imm >> 0) & 1))= ; \ + d->elem(b + 1) =3D F(v->elem(b + 1), s->elem(b + 1), ((imm >> 1) & 1))= ; \ + if (num > 2) { = \ + d->elem(b + 2) =3D F(v->elem(b + 2), s->elem(b + 2), ((imm >> 2) &= 1)); \ + d->elem(b + 3) =3D F(v->elem(b + 3), s->elem(b + 3), ((imm >> 3) &= 1)); \ + } = \ + if (num > 4) { = \ + d->elem(b + 4) =3D F(v->elem(b + 4), s->elem(b + 4), ((imm >> 4) &= 1)); \ + d->elem(b + 5) =3D F(v->elem(b + 5), s->elem(b + 5), ((imm >> 5) &= 1)); \ + d->elem(b + 6) =3D F(v->elem(b + 6), s->elem(b + 6), ((imm >> 6) &= 1)); \ + d->elem(b + 7) =3D F(v->elem(b + 7), s->elem(b + 7), ((imm >> 7) &= 1)); \ + } = \ + } while (0) =20 #define SSE_HELPER_I(name, elem, num, F) \ - void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t imm= ) \ + void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, \ + uint32_t imm) \ { \ - d->elem(0) =3D F(d->elem(0), s->elem(0), ((imm >> 0) & 1)); \ - d->elem(1) =3D F(d->elem(1), s->elem(1), ((imm >> 1) & 1)); \ - if (num > 2) { \ - d->elem(2) =3D F(d->elem(2), s->elem(2), ((imm >> 2) & 1)); \ - d->elem(3) =3D F(d->elem(3), s->elem(3), ((imm >> 3) & 1)); \ - if (num > 4) { \ - d->elem(4) =3D F(d->elem(4), s->elem(4), ((imm >> 4) & 1))= ; \ - d->elem(5) =3D F(d->elem(5), s->elem(5), ((imm >> 5) & 1))= ; \ - d->elem(6) =3D F(d->elem(6), s->elem(6), ((imm >> 6) & 1))= ; \ - d->elem(7) =3D F(d->elem(7), s->elem(7), ((imm >> 7) & 1))= ; \ - if (num > 8) { \ - d->elem(8) =3D F(d->elem(8), s->elem(8), ((imm >> 8) &= 1)); \ - d->elem(9) =3D F(d->elem(9), s->elem(9), ((imm >> 9) &= 1)); \ - d->elem(10) =3D F(d->elem(10), s->elem(10), \ - ((imm >> 10) & 1)); \ - d->elem(11) =3D F(d->elem(11), s->elem(11), \ - ((imm >> 11) & 1)); \ - d->elem(12) =3D F(d->elem(12), s->elem(12), \ - ((imm >> 12) & 1)); \ - d->elem(13) =3D F(d->elem(13), s->elem(13), \ - ((imm >> 13) & 1)); \ - d->elem(14) =3D F(d->elem(14), s->elem(14), \ - ((imm >> 14) & 1)); \ - d->elem(15) =3D F(d->elem(15), s->elem(15), \ - ((imm >> 15) & 1)); \ - } \ - } \ - } \ + Reg *v =3D d; \ + BLEND_I128(elem, num, F, 0); \ + YMM_ONLY( \ + if (num < 8) \ + imm >>=3D num; \ + BLEND_I128(elem, num, F, num); \ + ) \ } =20 /* SSE4.1 op helpers */ -#define FBLENDVB(d, s, m) ((m & 0x80) ? s : d) -#define FBLENDVPS(d, s, m) ((m & 0x80000000) ? s : d) -#define FBLENDVPD(d, s, m) ((m & 0x8000000000000000LL) ? s : d) +#define FBLENDVB(v, s, m) ((m & 0x80) ? s : v) +#define FBLENDVPS(v, s, m) ((m & 0x80000000) ? s : v) +#define FBLENDVPD(v, s, m) ((m & 0x8000000000000000LL) ? s : v) SSE_HELPER_V(helper_pblendvb, B, 16, FBLENDVB) SSE_HELPER_V(helper_blendvps, L, 4, FBLENDVPS) SSE_HELPER_V(helper_blendvpd, Q, 2, FBLENDVPD) @@ -2521,7 +2522,7 @@ void glue(helper_roundsd, SUFFIX)(CPUX86State *env, R= eg *d, Reg *s, } #endif =20 -#define FBLENDP(d, s, m) (m ? s : d) +#define FBLENDP(v, s, m) (m ? 
s : v) SSE_HELPER_I(helper_blendps, L, 4, FBLENDP) SSE_HELPER_I(helper_blendpd, Q, 2, FBLENDP) SSE_HELPER_I(helper_pblendw, W, 8, FBLENDP) --=20 2.36.0 From nobody Thu May 2 14:26:59 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1650839852607606.2815525319778; Sun, 24 Apr 2022 15:37:32 -0700 (PDT) Received: from localhost ([::1]:44482 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nikrD-0005SI-LD for importer@patchew.org; Sun, 24 Apr 2022 18:37:31 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50742) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nikTC-0001kZ-Jv for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:12:44 -0400 Received: from nowt.default.pbrook.uk0.bigv.io ([2001:41c8:51:832:fcff:ff:fe00:46dd]:58822) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nikTB-0002sS-3c for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:12:42 -0400 Received: from cpc91554-seac25-2-0-cust857.7-2.cable.virginm.net ([82.27.199.90] helo=wren.home) by nowt.default.pbrook.uk0.bigv.io with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1nikJ8-0001ea-Pf; Sun, 24 Apr 2022 23:02:18 +0100 From: Paul Brook To: Paolo Bonzini , Richard Henderson , Eduardo Habkost Subject: [PATCH v2 20/42] i386: AVX pclmulqdq Date: Sun, 24 Apr 2022 23:01:42 +0100 Message-Id: <20220424220204.2493824-21-paul@nowt.org> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220418173904.3746036-1-paul@nowt.org> References: <20220418173904.3746036-1-paul@nowt.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2001:41c8:51:832:fcff:ff:fe00:46dd; envelope-from=paul@nowt.org; helo=nowt.default.pbrook.uk0.bigv.io X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "open list:All patches CC here" , Paul Brook Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZM-MESSAGEID: 1650839853439100001 Content-Type: text/plain; charset="utf-8" Make the pclmulqdq helper AVX ready Signed-off-by: Paul Brook --- target/i386/ops_sse.h | 31 ++++++++++++++++++++++++------- 1 file changed, 24 insertions(+), 7 deletions(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index 9f388b02b9..b7100fdce1 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -2885,14 +2885,14 @@ target_ulong helper_crc32(uint32_t crc1, target_ulo= ng msg, uint32_t len) =20 #endif =20 -void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, - uint32_t ctrl) +#if SHIFT =3D=3D 1 +static void clmulq(uint64_t *dest_l, uint64_t *dest_h, + uint64_t a, uint64_t b) { - 
uint64_t ah, al, b, resh, resl; + uint64_t al, ah, resh, resl; =20 ah =3D 0; - al =3D d->Q((ctrl & 1) !=3D 0); - b =3D s->Q((ctrl & 16) !=3D 0); + al =3D a; resh =3D resl =3D 0; =20 while (b) { @@ -2905,8 +2905,25 @@ void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *env= , Reg *d, Reg *s, b >>=3D 1; } =20 - d->Q(0) =3D resl; - d->Q(1) =3D resh; + *dest_l =3D resl; + *dest_h =3D resh; +} +#endif + +void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, + uint32_t ctrl) +{ + Reg *v =3D d; + uint64_t a, b; + + a =3D v->Q((ctrl & 1) !=3D 0); + b =3D s->Q((ctrl & 16) !=3D 0); + clmulq(&d->Q(0), &d->Q(1), a, b); +#if SHIFT =3D=3D 2 + a =3D v->Q(((ctrl & 1) !=3D 0) + 2); + b =3D s->Q(((ctrl & 16) !=3D 0) + 2); + clmulq(&d->Q(2), &d->Q(3), a, b); +#endif } =20 void glue(helper_aesdec, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) --=20 2.36.0 From nobody Thu May 2 14:26:59 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1650839381679101.80510245571213; Sun, 24 Apr 2022 15:29:41 -0700 (PDT) Received: from localhost ([::1]:49520 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nikjc-0006GW-5t for importer@patchew.org; Sun, 24 Apr 2022 18:29:40 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50844) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nikTn-0002HO-TR for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:13:21 -0400 Received: from nowt.default.pbrook.uk0.bigv.io ([2001:41c8:51:832:fcff:ff:fe00:46dd]:58848) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nikTl-0002u8-OK for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:13:19 -0400 Received: from cpc91554-seac25-2-0-cust857.7-2.cable.virginm.net ([82.27.199.90] helo=wren.home) by nowt.default.pbrook.uk0.bigv.io with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1nikJ8-0001ea-W4; Sun, 24 Apr 2022 23:02:19 +0100 From: Paul Brook To: Paolo Bonzini , Richard Henderson , Eduardo Habkost Subject: [PATCH v2 21/42] i386: AVX+AES helpers Date: Sun, 24 Apr 2022 23:01:43 +0100 Message-Id: <20220424220204.2493824-22-paul@nowt.org> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220418173904.3746036-1-paul@nowt.org> References: <20220418173904.3746036-1-paul@nowt.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2001:41c8:51:832:fcff:ff:fe00:46dd; envelope-from=paul@nowt.org; helo=nowt.default.pbrook.uk0.bigv.io X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "open list:All patches CC here" , Paul Brook Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" 
X-ZM-MESSAGEID: 1650839383272100001 Content-Type: text/plain; charset="utf-8" Make the AES vector helpers AVX ready No functional changes to existing helpers Signed-off-by: Paul Brook --- target/i386/ops_sse.h | 63 ++++++++++++++++++++++++++---------- target/i386/ops_sse_header.h | 55 ++++++++++++++++++++++--------- 2 files changed, 85 insertions(+), 33 deletions(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index b7100fdce1..48cec40074 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -2929,64 +2929,92 @@ void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *en= v, Reg *d, Reg *s, void glue(helper_aesdec, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { int i; - Reg st =3D *d; + Reg st =3D *d; // v Reg rk =3D *s; =20 for (i =3D 0 ; i < 4 ; i++) { - d->L(i) =3D rk.L(i) ^ bswap32(AES_Td0[st.B(AES_ishifts[4*i+0])] ^ - AES_Td1[st.B(AES_ishifts[4*i+1])] ^ - AES_Td2[st.B(AES_ishifts[4*i+2])] ^ - AES_Td3[st.B(AES_ishifts[4*i+3])]); + d->L(i) =3D rk.L(i) ^ bswap32(AES_Td0[st.B(AES_ishifts[4 * i + 0])= ] ^ + AES_Td1[st.B(AES_ishifts[4 * i + 1])] ^ + AES_Td2[st.B(AES_ishifts[4 * i + 2])] ^ + AES_Td3[st.B(AES_ishifts[4 * i + 3])]); } +#if SHIFT =3D=3D 2 + for (i =3D 0 ; i < 4 ; i++) { + d->L(i + 4) =3D rk.L(i + 4) ^ bswap32( + AES_Td0[st.B(AES_ishifts[4 * i + 0] + 16)] ^ + AES_Td1[st.B(AES_ishifts[4 * i + 1] + 16)] ^ + AES_Td2[st.B(AES_ishifts[4 * i + 2] + 16)] ^ + AES_Td3[st.B(AES_ishifts[4 * i + 3] + 16)]); + } +#endif } =20 void glue(helper_aesdeclast, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { int i; - Reg st =3D *d; + Reg st =3D *d; // v Reg rk =3D *s; =20 for (i =3D 0; i < 16; i++) { d->B(i) =3D rk.B(i) ^ (AES_isbox[st.B(AES_ishifts[i])]); } +#if SHIFT =3D=3D 2 + for (i =3D 0; i < 16; i++) { + d->B(i + 16) =3D rk.B(i + 16) ^ (AES_isbox[st.B(AES_ishifts[i] + 1= 6)]); + } +#endif } =20 void glue(helper_aesenc, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { int i; - Reg st =3D *d; + Reg st =3D *d; // v Reg rk =3D *s; =20 for (i =3D 0 ; i < 4 ; i++) { - d->L(i) =3D rk.L(i) ^ bswap32(AES_Te0[st.B(AES_shifts[4*i+0])] ^ - AES_Te1[st.B(AES_shifts[4*i+1])] ^ - AES_Te2[st.B(AES_shifts[4*i+2])] ^ - AES_Te3[st.B(AES_shifts[4*i+3])]); + d->L(i) =3D rk.L(i) ^ bswap32(AES_Te0[st.B(AES_shifts[4 * i + 0])]= ^ + AES_Te1[st.B(AES_shifts[4 * i + 1])] ^ + AES_Te2[st.B(AES_shifts[4 * i + 2])] ^ + AES_Te3[st.B(AES_shifts[4 * i + 3])]); } +#if SHIFT =3D=3D 2 + for (i =3D 0 ; i < 4 ; i++) { + d->L(i + 4) =3D rk.L(i + 4) ^ bswap32( + AES_Te0[st.B(AES_shifts[4 * i + 0] + 16)] ^ + AES_Te1[st.B(AES_shifts[4 * i + 1] + 16)] ^ + AES_Te2[st.B(AES_shifts[4 * i + 2] + 16)] ^ + AES_Te3[st.B(AES_shifts[4 * i + 3] + 16)]); + } +#endif } =20 void glue(helper_aesenclast, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { int i; - Reg st =3D *d; + Reg st =3D *d; // v Reg rk =3D *s; =20 for (i =3D 0; i < 16; i++) { d->B(i) =3D rk.B(i) ^ (AES_sbox[st.B(AES_shifts[i])]); } - +#if SHIFT =3D=3D 2 + for (i =3D 0; i < 16; i++) { + d->B(i + 16) =3D rk.B(i + 16) ^ (AES_sbox[st.B(AES_shifts[i] + 16)= ]); + } +#endif } =20 +#if SHIFT =3D=3D 1 void glue(helper_aesimc, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { int i; Reg tmp =3D *s; =20 for (i =3D 0 ; i < 4 ; i++) { - d->L(i) =3D bswap32(AES_imc[tmp.B(4*i+0)][0] ^ - AES_imc[tmp.B(4*i+1)][1] ^ - AES_imc[tmp.B(4*i+2)][2] ^ - AES_imc[tmp.B(4*i+3)][3]); + d->L(i) =3D bswap32(AES_imc[tmp.B(4 * i + 0)][0] ^ + AES_imc[tmp.B(4 * i + 1)][1] ^ + AES_imc[tmp.B(4 * i + 2)][2] ^ + AES_imc[tmp.B(4 * i + 3)][3]); } } =20 @@ -3004,6 +3032,7 @@ void glue(helper_aeskeygenassist, 
SUFFIX)(CPUX86State= *env, Reg *d, Reg *s, d->L(3) =3D (d->L(2) << 24 | d->L(2) >> 8) ^ ctrl; } #endif +#endif =20 #undef SSE_HELPER_S =20 diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h index b8b0666f61..203afbb5a1 100644 --- a/target/i386/ops_sse_header.h +++ b/target/i386/ops_sse_header.h @@ -47,7 +47,7 @@ DEF_HELPER_3(glue(pslld, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(glue(psrlq, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(glue(psllq, SUFFIX), void, env, Reg, Reg) =20 -#if SHIFT =3D=3D 1 +#if SHIFT >=3D 1 DEF_HELPER_3(glue(psrldq, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(glue(pslldq, SUFFIX), void, env, Reg, Reg) #endif @@ -105,7 +105,7 @@ SSE_HELPER_L(pcmpeql, FCMPEQ) =20 SSE_HELPER_W(pmullw, FMULLW) #if SHIFT =3D=3D 0 -SSE_HELPER_W(pmulhrw, FMULHRW) +DEF_HELPER_3(glue(pmulhrw, SUFFIX), FMULHRW) #endif SSE_HELPER_W(pmulhuw, FMULHUW) SSE_HELPER_W(pmulhw, FMULHW) @@ -117,7 +117,9 @@ DEF_HELPER_3(glue(pmuludq, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(glue(pmaddwd, SUFFIX), void, env, Reg, Reg) =20 DEF_HELPER_3(glue(psadbw, SUFFIX), void, env, Reg, Reg) +#if SHIFT < 2 DEF_HELPER_4(glue(maskmov, SUFFIX), void, env, Reg, Reg, tl) +#endif DEF_HELPER_2(glue(movl_mm_T0, SUFFIX), void, Reg, i32) #ifdef TARGET_X86_64 DEF_HELPER_2(glue(movq_mm_T0, SUFFIX), void, Reg, i64) @@ -126,17 +128,18 @@ DEF_HELPER_2(glue(movq_mm_T0, SUFFIX), void, Reg, i64) #if SHIFT =3D=3D 0 DEF_HELPER_3(glue(pshufw, SUFFIX), void, Reg, Reg, int) #else -DEF_HELPER_3(glue(shufps, SUFFIX), void, Reg, Reg, int) -DEF_HELPER_3(glue(shufpd, SUFFIX), void, Reg, Reg, int) DEF_HELPER_3(glue(pshufd, SUFFIX), void, Reg, Reg, int) DEF_HELPER_3(glue(pshuflw, SUFFIX), void, Reg, Reg, int) DEF_HELPER_3(glue(pshufhw, SUFFIX), void, Reg, Reg, int) #endif =20 -#if SHIFT =3D=3D 1 +#if SHIFT >=3D 1 /* FPU ops */ /* XXX: not accurate */ =20 +DEF_HELPER_3(glue(shufps, SUFFIX), void, Reg, Reg, int) +DEF_HELPER_3(glue(shufpd, SUFFIX), void, Reg, Reg, int) + #define SSE_HELPER_S(name, F) \ DEF_HELPER_3(glue(name ## ps, SUFFIX), void, env, Reg, Reg) \ DEF_HELPER_3(name ## ss, void, env, Reg, Reg) \ @@ -154,10 +157,18 @@ SSE_HELPER_S(sqrt, FPU_SQRT) =20 DEF_HELPER_3(glue(cvtps2pd, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(glue(cvtpd2ps, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(cvtss2sd, void, env, Reg, Reg) -DEF_HELPER_3(cvtsd2ss, void, env, Reg, Reg) DEF_HELPER_3(glue(cvtdq2ps, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(glue(cvtdq2pd, SUFFIX), void, env, Reg, Reg) + +DEF_HELPER_3(glue(cvtps2dq, SUFFIX), void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(cvtpd2dq, SUFFIX), void, env, ZMMReg, ZMMReg) + +DEF_HELPER_3(glue(cvttps2dq, SUFFIX), void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(cvttpd2dq, SUFFIX), void, env, ZMMReg, ZMMReg) + +#if SHIFT =3D=3D 1 +DEF_HELPER_3(cvtss2sd, void, env, Reg, Reg) +DEF_HELPER_3(cvtsd2ss, void, env, Reg, Reg) DEF_HELPER_3(cvtpi2ps, void, env, ZMMReg, MMXReg) DEF_HELPER_3(cvtpi2pd, void, env, ZMMReg, MMXReg) DEF_HELPER_3(cvtsi2ss, void, env, ZMMReg, i32) @@ -168,8 +179,6 @@ DEF_HELPER_3(cvtsq2ss, void, env, ZMMReg, i64) DEF_HELPER_3(cvtsq2sd, void, env, ZMMReg, i64) #endif =20 -DEF_HELPER_3(glue(cvtps2dq, SUFFIX), void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(glue(cvtpd2dq, SUFFIX), void, env, ZMMReg, ZMMReg) DEF_HELPER_3(cvtps2pi, void, env, MMXReg, ZMMReg) DEF_HELPER_3(cvtpd2pi, void, env, MMXReg, ZMMReg) DEF_HELPER_2(cvtss2si, s32, env, ZMMReg) @@ -179,8 +188,6 @@ DEF_HELPER_2(cvtss2sq, s64, env, ZMMReg) DEF_HELPER_2(cvtsd2sq, s64, env, ZMMReg) #endif =20 -DEF_HELPER_3(glue(cvttps2dq, 
SUFFIX), void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(glue(cvttpd2dq, SUFFIX), void, env, ZMMReg, ZMMReg) DEF_HELPER_3(cvttps2pi, void, env, MMXReg, ZMMReg) DEF_HELPER_3(cvttpd2pi, void, env, MMXReg, ZMMReg) DEF_HELPER_2(cvttss2si, s32, env, ZMMReg) @@ -189,15 +196,18 @@ DEF_HELPER_2(cvttsd2si, s32, env, ZMMReg) DEF_HELPER_2(cvttss2sq, s64, env, ZMMReg) DEF_HELPER_2(cvttsd2sq, s64, env, ZMMReg) #endif +#endif =20 DEF_HELPER_3(glue(rsqrtps, SUFFIX), void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(rsqrtss, void, env, ZMMReg, ZMMReg) DEF_HELPER_3(glue(rcpps, SUFFIX), void, env, ZMMReg, ZMMReg) +#if SHIFT =3D=3D 1 +DEF_HELPER_3(rsqrtss, void, env, ZMMReg, ZMMReg) DEF_HELPER_3(rcpss, void, env, ZMMReg, ZMMReg) DEF_HELPER_3(extrq_r, void, env, ZMMReg, ZMMReg) DEF_HELPER_4(extrq_i, void, env, ZMMReg, int, int) DEF_HELPER_3(insertq_r, void, env, ZMMReg, ZMMReg) DEF_HELPER_4(insertq_i, void, env, ZMMReg, int, int) +#endif DEF_HELPER_3(glue(haddps, SUFFIX), void, env, ZMMReg, ZMMReg) DEF_HELPER_3(glue(haddpd, SUFFIX), void, env, ZMMReg, ZMMReg) DEF_HELPER_3(glue(hsubps, SUFFIX), void, env, ZMMReg, ZMMReg) @@ -220,10 +230,13 @@ SSE_HELPER_CMP(cmpnlt, FPU_CMPNLT) SSE_HELPER_CMP(cmpnle, FPU_CMPNLE) SSE_HELPER_CMP(cmpord, FPU_CMPORD) =20 +#if SHIFT =3D=3D 1 DEF_HELPER_3(ucomiss, void, env, Reg, Reg) DEF_HELPER_3(comiss, void, env, Reg, Reg) DEF_HELPER_3(ucomisd, void, env, Reg, Reg) DEF_HELPER_3(comisd, void, env, Reg, Reg) +#endif + DEF_HELPER_2(glue(movmskps, SUFFIX), i32, env, Reg) DEF_HELPER_2(glue(movmskpd, SUFFIX), i32, env, Reg) #endif @@ -240,7 +253,7 @@ DEF_HELPER_3(glue(packssdw, SUFFIX), void, env, Reg, Re= g) UNPCK_OP(l, 0) UNPCK_OP(h, 1) =20 -#if SHIFT =3D=3D 1 +#if SHIFT >=3D 1 DEF_HELPER_3(glue(punpcklqdq, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(glue(punpckhqdq, SUFFIX), void, env, Reg, Reg) #endif @@ -287,7 +300,7 @@ DEF_HELPER_3(glue(psignd, SUFFIX), void, env, Reg, Reg) DEF_HELPER_4(glue(palignr, SUFFIX), void, env, Reg, Reg, s32) =20 /* SSE4.1 op helpers */ -#if SHIFT =3D=3D 1 +#if SHIFT >=3D 1 DEF_HELPER_3(glue(pblendvb, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(glue(blendvps, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(glue(blendvpd, SUFFIX), void, env, Reg, Reg) @@ -316,22 +329,30 @@ DEF_HELPER_3(glue(pmaxsd, SUFFIX), void, env, Reg, Re= g) DEF_HELPER_3(glue(pmaxuw, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(glue(pmaxud, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(glue(pmulld, SUFFIX), void, env, Reg, Reg) +#if SHIFT =3D=3D 1 DEF_HELPER_3(glue(phminposuw, SUFFIX), void, env, Reg, Reg) +#endif DEF_HELPER_4(glue(roundps, SUFFIX), void, env, Reg, Reg, i32) DEF_HELPER_4(glue(roundpd, SUFFIX), void, env, Reg, Reg, i32) +#if SHIFT =3D=3D 1 DEF_HELPER_4(glue(roundss, SUFFIX), void, env, Reg, Reg, i32) DEF_HELPER_4(glue(roundsd, SUFFIX), void, env, Reg, Reg, i32) +#endif DEF_HELPER_4(glue(blendps, SUFFIX), void, env, Reg, Reg, i32) DEF_HELPER_4(glue(blendpd, SUFFIX), void, env, Reg, Reg, i32) DEF_HELPER_4(glue(pblendw, SUFFIX), void, env, Reg, Reg, i32) DEF_HELPER_4(glue(dpps, SUFFIX), void, env, Reg, Reg, i32) +#if SHIFT =3D=3D 1 DEF_HELPER_4(glue(dppd, SUFFIX), void, env, Reg, Reg, i32) +#endif DEF_HELPER_4(glue(mpsadbw, SUFFIX), void, env, Reg, Reg, i32) #endif =20 /* SSE4.2 op helpers */ -#if SHIFT =3D=3D 1 +#if SHIFT >=3D 1 DEF_HELPER_3(glue(pcmpgtq, SUFFIX), void, env, Reg, Reg) +#endif +#if SHIFT =3D=3D 1 DEF_HELPER_4(glue(pcmpestri, SUFFIX), void, env, Reg, Reg, i32) DEF_HELPER_4(glue(pcmpestrm, SUFFIX), void, env, Reg, Reg, i32) DEF_HELPER_4(glue(pcmpistri, SUFFIX), void, env, Reg, Reg, 
i32) @@ -340,13 +361,15 @@ DEF_HELPER_3(crc32, tl, i32, tl, i32) #endif =20 /* AES-NI op helpers */ -#if SHIFT =3D=3D 1 +#if SHIFT >=3D 1 DEF_HELPER_3(glue(aesdec, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(glue(aesdeclast, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(glue(aesenc, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(glue(aesenclast, SUFFIX), void, env, Reg, Reg) +#if SHIFT =3D=3D 1 DEF_HELPER_3(glue(aesimc, SUFFIX), void, env, Reg, Reg) DEF_HELPER_4(glue(aeskeygenassist, SUFFIX), void, env, Reg, Reg, i32) +#endif DEF_HELPER_4(glue(pclmulqdq, SUFFIX), void, env, Reg, R= eg, i32) #endif =20 --=20 2.36.0 From nobody Thu May 2 14:26:59 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1650839081909231.78698137206834; Sun, 24 Apr 2022 15:24:41 -0700 (PDT) Received: from localhost ([::1]:35046 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nikem-0004Zq-SY for importer@patchew.org; Sun, 24 Apr 2022 18:24:40 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50702) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nikT0-0001jH-11 for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:12:37 -0400 Received: from nowt.default.pbrook.uk0.bigv.io ([2001:41c8:51:832:fcff:ff:fe00:46dd]:58813) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nikSw-0002ra-QW for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:12:28 -0400 Received: from cpc91554-seac25-2-0-cust857.7-2.cable.virginm.net ([82.27.199.90] helo=wren.home) by nowt.default.pbrook.uk0.bigv.io with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1nikJ9-0001ea-6b; Sun, 24 Apr 2022 23:02:19 +0100 From: Paul Brook To: Paolo Bonzini , Richard Henderson , Eduardo Habkost Subject: [PATCH v2 22/42] i386: Update ops_sse_header.h ready for 256-bit AVX Date: Sun, 24 Apr 2022 23:01:44 +0100 Message-Id: <20220424220204.2493824-23-paul@nowt.org> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220418173904.3746036-1-paul@nowt.org> References: <20220418173904.3746036-1-paul@nowt.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2001:41c8:51:832:fcff:ff:fe00:46dd; envelope-from=paul@nowt.org; helo=nowt.default.pbrook.uk0.bigv.io X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "open list:All patches CC here" , Paul Brook Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZM-MESSAGEID: 1650839082135100001 Content-Type: text/plain; charset="utf-8" Update ops_sse_header.h ready for 256-bit AVX helpers Signed-off-by: Paul Brook --- target/i386/ops_sse_header.h | 67
+++++++++++++++++++++--------------- 1 file changed, 40 insertions(+), 27 deletions(-) diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h index 203afbb5a1..63b63eb532 100644 --- a/target/i386/ops_sse_header.h +++ b/target/i386/ops_sse_header.h @@ -105,7 +105,7 @@ SSE_HELPER_L(pcmpeql, FCMPEQ) =20 SSE_HELPER_W(pmullw, FMULLW) #if SHIFT =3D=3D 0 -DEF_HELPER_3(glue(pmulhrw, SUFFIX), FMULHRW) +DEF_HELPER_3(glue(pmulhrw, SUFFIX), void, env, Reg, Reg) #endif SSE_HELPER_W(pmulhuw, FMULHUW) SSE_HELPER_W(pmulhw, FMULHW) @@ -137,23 +137,39 @@ DEF_HELPER_3(glue(pshufhw, SUFFIX), void, Reg, Reg, i= nt) /* FPU ops */ /* XXX: not accurate */ =20 -DEF_HELPER_3(glue(shufps, SUFFIX), void, Reg, Reg, int) -DEF_HELPER_3(glue(shufpd, SUFFIX), void, Reg, Reg, int) +#define SSE_HELPER_P4(name) \ + DEF_HELPER_3(glue(name ## ps, SUFFIX), void, env, Reg, Reg) \ + DEF_HELPER_3(glue(name ## pd, SUFFIX), void, env, Reg, Reg) + +#define SSE_HELPER_P3(name, ...) \ + DEF_HELPER_3(glue(name ## ps, SUFFIX), void, env, Reg, Reg) \ + DEF_HELPER_3(glue(name ## pd, SUFFIX), void, env, Reg, Reg) =20 -#define SSE_HELPER_S(name, F) \ - DEF_HELPER_3(glue(name ## ps, SUFFIX), void, env, Reg, Reg) \ - DEF_HELPER_3(name ## ss, void, env, Reg, Reg) \ - DEF_HELPER_3(glue(name ## pd, SUFFIX), void, env, Reg, Reg) \ +#if SHIFT =3D=3D 1 +#define SSE_HELPER_S4(name) \ + SSE_HELPER_P4(name) \ + DEF_HELPER_3(name ## ss, void, env, Reg, Reg) \ DEF_HELPER_3(name ## sd, void, env, Reg, Reg) +#define SSE_HELPER_S3(name) \ + SSE_HELPER_P3(name) \ + DEF_HELPER_3(name ## ss, void, env, Reg, Reg) \ + DEF_HELPER_3(name ## sd, void, env, Reg, Reg) +#else +#define SSE_HELPER_S4(name, ...) SSE_HELPER_P4(name) +#define SSE_HELPER_S3(name, ...) SSE_HELPER_P3(name) +#endif + +DEF_HELPER_3(glue(shufps, SUFFIX), void, Reg, Reg, int) +DEF_HELPER_3(glue(shufpd, SUFFIX), void, Reg, Reg, int) =20 -SSE_HELPER_S(add, FPU_ADD) -SSE_HELPER_S(sub, FPU_SUB) -SSE_HELPER_S(mul, FPU_MUL) -SSE_HELPER_S(div, FPU_DIV) -SSE_HELPER_S(min, FPU_MIN) -SSE_HELPER_S(max, FPU_MAX) -SSE_HELPER_S(sqrt, FPU_SQRT) +SSE_HELPER_S4(add) +SSE_HELPER_S4(sub) +SSE_HELPER_S4(mul) +SSE_HELPER_S4(div) +SSE_HELPER_S4(min) +SSE_HELPER_S4(max) =20 +SSE_HELPER_S3(sqrt) =20 DEF_HELPER_3(glue(cvtps2pd, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(glue(cvtpd2ps, SUFFIX), void, env, Reg, Reg) @@ -208,18 +224,12 @@ DEF_HELPER_4(extrq_i, void, env, ZMMReg, int, int) DEF_HELPER_3(insertq_r, void, env, ZMMReg, ZMMReg) DEF_HELPER_4(insertq_i, void, env, ZMMReg, int, int) #endif -DEF_HELPER_3(glue(haddps, SUFFIX), void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(glue(haddpd, SUFFIX), void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(glue(hsubps, SUFFIX), void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(glue(hsubpd, SUFFIX), void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(glue(addsubps, SUFFIX), void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(glue(addsubpd, SUFFIX), void, env, ZMMReg, ZMMReg) - -#define SSE_HELPER_CMP(name, F) \ - DEF_HELPER_3(glue(name ## ps, SUFFIX), void, env, Reg, Reg) \ - DEF_HELPER_3(name ## ss, void, env, Reg, Reg) \ - DEF_HELPER_3(glue(name ## pd, SUFFIX), void, env, Reg, Reg) \ - DEF_HELPER_3(name ## sd, void, env, Reg, Reg) + +SSE_HELPER_P4(hadd) +SSE_HELPER_P4(hsub) +SSE_HELPER_P4(addsub) + +#define SSE_HELPER_CMP(name, F) SSE_HELPER_S4(name) =20 SSE_HELPER_CMP(cmpeq, FPU_CMPEQ) SSE_HELPER_CMP(cmplt, FPU_CMPLT) @@ -381,6 +391,9 @@ DEF_HELPER_4(glue(pclmulqdq, SUFFIX), void, env, Reg, R= eg, i32) #undef SSE_HELPER_W #undef SSE_HELPER_L #undef SSE_HELPER_Q -#undef SSE_HELPER_S +#undef SSE_HELPER_S3 
+#undef SSE_HELPER_S4 +#undef SSE_HELPER_P3 +#undef SSE_HELPER_P4 #undef SSE_HELPER_CMP #undef UNPCK_OP --=20 2.36.0 From nobody Thu May 2 14:26:59 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1650839275600796.3260933504802; Sun, 24 Apr 2022 15:27:55 -0700 (PDT) Received: from localhost ([::1]:42068 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nikhu-0001DX-IU for importer@patchew.org; Sun, 24 Apr 2022 18:27:54 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50858) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nikTs-0002I2-I6 for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:13:25 -0400 Received: from nowt.default.pbrook.uk0.bigv.io ([2001:41c8:51:832:fcff:ff:fe00:46dd]:58852) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nikTp-0002vW-SK for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:13:24 -0400 Received: from cpc91554-seac25-2-0-cust857.7-2.cable.virginm.net ([82.27.199.90] helo=wren.home) by nowt.default.pbrook.uk0.bigv.io with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1nikJ9-0001ea-Dp; Sun, 24 Apr 2022 23:02:19 +0100 From: Paul Brook To: Paolo Bonzini , Richard Henderson , Eduardo Habkost Subject: [PATCH v2 23/42] i386: AVX comparison helpers Date: Sun, 24 Apr 2022 23:01:45 +0100 Message-Id: <20220424220204.2493824-24-paul@nowt.org> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220418173904.3746036-1-paul@nowt.org> References: <20220418173904.3746036-1-paul@nowt.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2001:41c8:51:832:fcff:ff:fe00:46dd; envelope-from=paul@nowt.org; helo=nowt.default.pbrook.uk0.bigv.io X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "open list:All patches CC here" , Paul Brook Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZM-MESSAGEID: 1650839276907100001 Content-Type: text/plain; charset="utf-8" AVX includes a more extensive set of comparison predicates, some of which our softfloat implementation does not expose directly.
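As a rough standalone sketch of the idea (illustrative C only; the enum, function and predicate names here are invented stand-ins, not QEMU's softfloat API): every predicate reduces to a boolean test on the result of a single compare call, the quiet and signalling variants differ only in which compare function produced that result, and the unordered outcome is why e.g. NLT is not the same thing as GE:

#include <math.h>
#include <stdio.h>

/* Stand-in for the three-way-plus-unordered compare result. */
typedef enum {
    REL_LESS = -1, REL_EQUAL = 0, REL_GREATER = 1, REL_UNORDERED = 2
} Rel;

/* Quiet compare: NaN operands yield "unordered" rather than a fault. */
static Rel compare32(float a, float b)
{
    if (isnan(a) || isnan(b)) {
        return REL_UNORDERED;
    }
    return a < b ? REL_LESS : a > b ? REL_GREATER : REL_EQUAL;
}

/* Each predicate is a boolean test on one compare result; the
 * destination element is then filled with all-ones or all-zeroes.
 * NLT is true for unordered inputs, so it is not equivalent to GE. */
static int pred_lt(Rel r) { return r == REL_LESS; }
static int pred_nlt(Rel r) { return !pred_lt(r); }

int main(void)
{
    printf("lt(1,2)=%d nlt(NaN,2)=%d\n",
           pred_lt(compare32(1.0f, 2.0f)),        /* 1 */
           pred_nlt(compare32(nanf(""), 2.0f)));  /* 1: unordered */
    return 0;
}

The rewritten helpers below follow the same shape: FPU_CMPQ/FPU_CMPS select the quiet or signalling floatN_compare, and the predicate argument turns the FloatRelation result into the element mask.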
Rewrite the helpers in terms of floatN_compare Signed-off-by: Paul Brook --- target/i386/ops_sse.h | 149 ++++++++++++++++++++++++----------- target/i386/ops_sse_header.h | 47 ++++++++--- target/i386/tcg/translate.c | 49 +++++++++--- 3 files changed, 177 insertions(+), 68 deletions(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index 48cec40074..e48dfc2fc5 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -1394,57 +1394,112 @@ void glue(helper_addsubpd, SUFFIX)(CPUX86State *en= v, Reg *d, Reg *s) #endif } =20 -/* XXX: unordered */ -#define SSE_HELPER_CMP(name, F) \ - void glue(helper_ ## name ## ps, SUFFIX)(CPUX86State *env, Reg *d, Reg= *s)\ - { \ - d->ZMM_L(0) =3D F(32, d->ZMM_S(0), s->ZMM_S(0)); \ - d->ZMM_L(1) =3D F(32, d->ZMM_S(1), s->ZMM_S(1)); \ - d->ZMM_L(2) =3D F(32, d->ZMM_S(2), s->ZMM_S(2)); \ - d->ZMM_L(3) =3D F(32, d->ZMM_S(3), s->ZMM_S(3)); \ - } \ - \ - void helper_ ## name ## ss(CPUX86State *env, Reg *d, Reg *s) \ - { \ - d->ZMM_L(0) =3D F(32, d->ZMM_S(0), s->ZMM_S(0)); \ - } \ - \ - void glue(helper_ ## name ## pd, SUFFIX)(CPUX86State *env, Reg *d, Reg= *s)\ +#define SSE_HELPER_CMP_P(name, F, C) \ + void glue(helper_ ## name ## ps, SUFFIX)(CPUX86State *env, \ + Reg *d, Reg *s) \ { \ - d->ZMM_Q(0) =3D F(64, d->ZMM_D(0), s->ZMM_D(0)); \ - d->ZMM_Q(1) =3D F(64, d->ZMM_D(1), s->ZMM_D(1)); \ + Reg *v =3D d; \ + d->ZMM_L(0) =3D F(32, C, v->ZMM_S(0), s->ZMM_S(0)); \ + d->ZMM_L(1) =3D F(32, C, v->ZMM_S(1), s->ZMM_S(1)); \ + d->ZMM_L(2) =3D F(32, C, v->ZMM_S(2), s->ZMM_S(2)); \ + d->ZMM_L(3) =3D F(32, C, v->ZMM_S(3), s->ZMM_S(3)); \ + YMM_ONLY( \ + d->ZMM_L(4) =3D F(32, C, v->ZMM_S(4), s->ZMM_S(4)); \ + d->ZMM_L(5) =3D F(32, C, v->ZMM_S(5), s->ZMM_S(5)); \ + d->ZMM_L(6) =3D F(32, C, v->ZMM_S(6), s->ZMM_S(6)); \ + d->ZMM_L(7) =3D F(32, C, v->ZMM_S(7), s->ZMM_S(7)); \ + ) \ } \ \ - void helper_ ## name ## sd(CPUX86State *env, Reg *d, Reg *s) \ + void glue(helper_ ## name ## pd, SUFFIX)(CPUX86State *env, \ + Reg *d, Reg *s) \ { \ - d->ZMM_Q(0) =3D F(64, d->ZMM_D(0), s->ZMM_D(0)); \ - } - -#define FPU_CMPEQ(size, a, b) \ - (float ## size ## _eq_quiet(a, b, &env->sse_status) ? -1 : 0) -#define FPU_CMPLT(size, a, b) \ - (float ## size ## _lt(a, b, &env->sse_status) ? -1 : 0) -#define FPU_CMPLE(size, a, b) \ - (float ## size ## _le(a, b, &env->sse_status) ? -1 : 0) -#define FPU_CMPUNORD(size, a, b) \ - (float ## size ## _unordered_quiet(a, b, &env->sse_status) ? -1 : 0) -#define FPU_CMPNEQ(size, a, b) \ - (float ## size ## _eq_quiet(a, b, &env->sse_status) ? 0 : -1) -#define FPU_CMPNLT(size, a, b) \ - (float ## size ## _lt(a, b, &env->sse_status) ? 0 : -1) -#define FPU_CMPNLE(size, a, b) \ - (float ## size ## _le(a, b, &env->sse_status) ? 0 : -1) -#define FPU_CMPORD(size, a, b) \ - (float ## size ## _unordered_quiet(a, b, &env->sse_status) ? 
0 : -1) - -SSE_HELPER_CMP(cmpeq, FPU_CMPEQ) -SSE_HELPER_CMP(cmplt, FPU_CMPLT) -SSE_HELPER_CMP(cmple, FPU_CMPLE) -SSE_HELPER_CMP(cmpunord, FPU_CMPUNORD) -SSE_HELPER_CMP(cmpneq, FPU_CMPNEQ) -SSE_HELPER_CMP(cmpnlt, FPU_CMPNLT) -SSE_HELPER_CMP(cmpnle, FPU_CMPNLE) -SSE_HELPER_CMP(cmpord, FPU_CMPORD) + Reg *v =3D d; \ + d->ZMM_Q(0) =3D F(64, C, v->ZMM_D(0), s->ZMM_D(0)); \ + d->ZMM_Q(1) =3D F(64, C, v->ZMM_D(1), s->ZMM_D(1)); \ + YMM_ONLY( \ + d->ZMM_Q(2) =3D F(64, C, v->ZMM_D(2), s->ZMM_D(2)); \ + d->ZMM_Q(3) =3D F(64, C, v->ZMM_D(3), s->ZMM_D(3)); \ + ) \ + } + +#if SHIFT =3D=3D 1 +#define SSE_HELPER_CMP(name, F, C) = \ + SSE_HELPER_CMP_P(name, F, C) = \ + void helper_ ## name ## ss(CPUX86State *env, Reg *d, Reg *s) \ + { = \ + Reg *v =3D d; = \ + d->ZMM_L(0) =3D F(32, C, v->ZMM_S(0), s->ZMM_S(0)); = \ + } = \ + = \ + void helper_ ## name ## sd(CPUX86State *env, Reg *d, Reg *s) \ + { = \ + Reg *v =3D d; = \ + d->ZMM_Q(0) =3D F(64, C, v->ZMM_D(0), s->ZMM_D(0)); = \ + } + +static inline bool FPU_EQU(FloatRelation x) +{ + return (x =3D=3D float_relation_equal || x =3D=3D float_relation_unord= ered); +} +static inline bool FPU_GE(FloatRelation x) +{ + return (x =3D=3D float_relation_equal || x =3D=3D float_relation_great= er); +} +#define FPU_EQ(x) (x =3D=3D float_relation_equal) +#define FPU_LT(x) (x =3D=3D float_relation_less) +#define FPU_LE(x) (x <=3D float_relation_equal) +#define FPU_GT(x) (x =3D=3D float_relation_greater) +#define FPU_UNORD(x) (x =3D=3D float_relation_unordered) +// We must make sure we evaluate the argument in case it is a signalling N= AN +#define FPU_FALSE(x) (x =3D=3D float_relation_equal && 0) + +#define FPU_CMPQ(size, COND, a, b) \ + (COND(float ## size ## _compare_quiet(a, b, &env->sse_status)) ? -1 : = 0) +#define FPU_CMPS(size, COND, a, b) \ + (COND(float ## size ## _compare(a, b, &env->sse_status)) ? 
-1 : 0) + +#else +#define SSE_HELPER_CMP(name, F, C) SSE_HELPER_CMP_P(name, F, C) +#endif + +SSE_HELPER_CMP(cmpeq, FPU_CMPQ, FPU_EQ) +SSE_HELPER_CMP(cmplt, FPU_CMPS, FPU_LT) +SSE_HELPER_CMP(cmple, FPU_CMPS, FPU_LE) +SSE_HELPER_CMP(cmpunord, FPU_CMPQ, FPU_UNORD) +SSE_HELPER_CMP(cmpneq, FPU_CMPQ, !FPU_EQ) +SSE_HELPER_CMP(cmpnlt, FPU_CMPS, !FPU_LT) +SSE_HELPER_CMP(cmpnle, FPU_CMPS, !FPU_LE) +SSE_HELPER_CMP(cmpord, FPU_CMPQ, !FPU_UNORD) + +SSE_HELPER_CMP(cmpequ, FPU_CMPQ, FPU_EQU) +SSE_HELPER_CMP(cmpnge, FPU_CMPS, !FPU_GE) +SSE_HELPER_CMP(cmpngt, FPU_CMPS, !FPU_GT) +SSE_HELPER_CMP(cmpfalse, FPU_CMPQ, FPU_FALSE) +SSE_HELPER_CMP(cmpnequ, FPU_CMPQ, !FPU_EQU) +SSE_HELPER_CMP(cmpge, FPU_CMPS, FPU_GE) +SSE_HELPER_CMP(cmpgt, FPU_CMPS, FPU_GT) +SSE_HELPER_CMP(cmptrue, FPU_CMPQ, !FPU_FALSE) + +SSE_HELPER_CMP(cmpeqs, FPU_CMPS, FPU_EQ) +SSE_HELPER_CMP(cmpltq, FPU_CMPQ, FPU_LT) +SSE_HELPER_CMP(cmpleq, FPU_CMPQ, FPU_LE) +SSE_HELPER_CMP(cmpunords, FPU_CMPS, FPU_UNORD) +SSE_HELPER_CMP(cmpneqq, FPU_CMPS, !FPU_EQ) +SSE_HELPER_CMP(cmpnltq, FPU_CMPQ, !FPU_LT) +SSE_HELPER_CMP(cmpnleq, FPU_CMPQ, !FPU_LE) +SSE_HELPER_CMP(cmpords, FPU_CMPS, !FPU_UNORD) + +SSE_HELPER_CMP(cmpequs, FPU_CMPS, FPU_EQU) +SSE_HELPER_CMP(cmpngeq, FPU_CMPQ, !FPU_GE) +SSE_HELPER_CMP(cmpngtq, FPU_CMPQ, !FPU_GT) +SSE_HELPER_CMP(cmpfalses, FPU_CMPS, FPU_FALSE) +SSE_HELPER_CMP(cmpnequs, FPU_CMPS, !FPU_EQU) +SSE_HELPER_CMP(cmpgeq, FPU_CMPQ, FPU_GE) +SSE_HELPER_CMP(cmpgtq, FPU_CMPQ, FPU_GT) +SSE_HELPER_CMP(cmptrues, FPU_CMPS, !FPU_FALSE) + +#undef SSE_HELPER_CMP =20 #if SHIFT =3D=3D 1 static const int comis_eflags[4] =3D {CC_C, CC_Z, 0, CC_Z | CC_P | CC_C}; diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h index 63b63eb532..793e581224 100644 --- a/target/i386/ops_sse_header.h +++ b/target/i386/ops_sse_header.h @@ -229,16 +229,43 @@ SSE_HELPER_P4(hadd) SSE_HELPER_P4(hsub) SSE_HELPER_P4(addsub) =20 -#define SSE_HELPER_CMP(name, F) SSE_HELPER_S4(name) - -SSE_HELPER_CMP(cmpeq, FPU_CMPEQ) -SSE_HELPER_CMP(cmplt, FPU_CMPLT) -SSE_HELPER_CMP(cmple, FPU_CMPLE) -SSE_HELPER_CMP(cmpunord, FPU_CMPUNORD) -SSE_HELPER_CMP(cmpneq, FPU_CMPNEQ) -SSE_HELPER_CMP(cmpnlt, FPU_CMPNLT) -SSE_HELPER_CMP(cmpnle, FPU_CMPNLE) -SSE_HELPER_CMP(cmpord, FPU_CMPORD) +#define SSE_HELPER_CMP(name, F, C) SSE_HELPER_S4(name) + +SSE_HELPER_CMP(cmpeq, FPU_CMPQ, FPU_EQ) +SSE_HELPER_CMP(cmplt, FPU_CMPS, FPU_LT) +SSE_HELPER_CMP(cmple, FPU_CMPS, FPU_LE) +SSE_HELPER_CMP(cmpunord, FPU_CMPQ, FPU_UNORD) +SSE_HELPER_CMP(cmpneq, FPU_CMPQ, !FPU_EQ) +SSE_HELPER_CMP(cmpnlt, FPU_CMPS, !FPU_LT) +SSE_HELPER_CMP(cmpnle, FPU_CMPS, !FPU_LE) +SSE_HELPER_CMP(cmpord, FPU_CMPQ, !FPU_UNORD) + +SSE_HELPER_CMP(cmpequ, FPU_CMPQ, FPU_EQU) +SSE_HELPER_CMP(cmpnge, FPU_CMPS, !FPU_GE) +SSE_HELPER_CMP(cmpngt, FPU_CMPS, !FPU_GT) +SSE_HELPER_CMP(cmpfalse, FPU_CMPQ, FPU_FALSE) +SSE_HELPER_CMP(cmpnequ, FPU_CMPQ, !FPU_EQU) +SSE_HELPER_CMP(cmpge, FPU_CMPS, FPU_GE) +SSE_HELPER_CMP(cmpgt, FPU_CMPS, FPU_GT) +SSE_HELPER_CMP(cmptrue, FPU_CMPQ, !FPU_FALSE) + +SSE_HELPER_CMP(cmpeqs, FPU_CMPS, FPU_EQ) +SSE_HELPER_CMP(cmpltq, FPU_CMPQ, FPU_LT) +SSE_HELPER_CMP(cmpleq, FPU_CMPQ, FPU_LE) +SSE_HELPER_CMP(cmpunords, FPU_CMPS, FPU_UNORD) +SSE_HELPER_CMP(cmpneqq, FPU_CMPS, !FPU_EQ) +SSE_HELPER_CMP(cmpnltq, FPU_CMPQ, !FPU_LT) +SSE_HELPER_CMP(cmpnleq, FPU_CMPQ, !FPU_LE) +SSE_HELPER_CMP(cmpords, FPU_CMPS, !FPU_UNORD) + +SSE_HELPER_CMP(cmpequs, FPU_CMPS, FPU_EQU) +SSE_HELPER_CMP(cmpngeq, FPU_CMPQ, !FPU_GE) +SSE_HELPER_CMP(cmpngtq, FPU_CMPQ, !FPU_GT) +SSE_HELPER_CMP(cmpfalses, FPU_CMPS, FPU_FALSE) +SSE_HELPER_CMP(cmpnequs, 
FPU_CMPS, !FPU_EQU) +SSE_HELPER_CMP(cmpgeq, FPU_CMPQ, FPU_GE) +SSE_HELPER_CMP(cmpgtq, FPU_CMPQ, FPU_GT) +SSE_HELPER_CMP(cmptrues, FPU_CMPS, !FPU_FALSE) =20 #if SHIFT =3D=3D 1 DEF_HELPER_3(ucomiss, void, env, Reg, Reg) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index 63b32a77e3..64f026c0af 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -3021,20 +3021,47 @@ static const SSEFunc_l_ep sse_op_table3bq[] =3D { }; #endif =20 -#define SSE_FOP(x) { \ +#define SSE_CMP(x) { \ gen_helper_ ## x ## ps ## _xmm, gen_helper_ ## x ## pd ## _xmm, \ gen_helper_ ## x ## ss, gen_helper_ ## x ## sd} -static const SSEFunc_0_epp sse_op_table4[8][4] =3D { - SSE_FOP(cmpeq), - SSE_FOP(cmplt), - SSE_FOP(cmple), - SSE_FOP(cmpunord), - SSE_FOP(cmpneq), - SSE_FOP(cmpnlt), - SSE_FOP(cmpnle), - SSE_FOP(cmpord), +static const SSEFunc_0_epp sse_op_table4[32][4] =3D { + SSE_CMP(cmpeq), + SSE_CMP(cmplt), + SSE_CMP(cmple), + SSE_CMP(cmpunord), + SSE_CMP(cmpneq), + SSE_CMP(cmpnlt), + SSE_CMP(cmpnle), + SSE_CMP(cmpord), + + SSE_CMP(cmpequ), + SSE_CMP(cmpnge), + SSE_CMP(cmpngt), + SSE_CMP(cmpfalse), + SSE_CMP(cmpnequ), + SSE_CMP(cmpge), + SSE_CMP(cmpgt), + SSE_CMP(cmptrue), + + SSE_CMP(cmpeqs), + SSE_CMP(cmpltq), + SSE_CMP(cmpleq), + SSE_CMP(cmpunords), + SSE_CMP(cmpneqq), + SSE_CMP(cmpnltq), + SSE_CMP(cmpnleq), + SSE_CMP(cmpords), + + SSE_CMP(cmpequs), + SSE_CMP(cmpngeq), + SSE_CMP(cmpngtq), + SSE_CMP(cmpfalses), + SSE_CMP(cmpnequs), + SSE_CMP(cmpgeq), + SSE_CMP(cmpgtq), + SSE_CMP(cmptrues), }; -#undef SSE_FOP +#undef SSE_CMP =20 static const SSEFunc_0_epp sse_op_table5[256] =3D { [0x0c] =3D gen_helper_pi2fw, --=20 2.36.0 From nobody Thu May 2 14:26:59 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1650839234583335.40595708395495; Sun, 24 Apr 2022 15:27:14 -0700 (PDT) Received: from localhost ([::1]:41158 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nikhF-0000bH-Es for importer@patchew.org; Sun, 24 Apr 2022 18:27:13 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50756) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nikTG-0001lk-5u for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:12:47 -0400 Received: from nowt.default.pbrook.uk0.bigv.io ([2001:41c8:51:832:fcff:ff:fe00:46dd]:58826) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nikTE-0002sm-N2 for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:12:45 -0400 Received: from cpc91554-seac25-2-0-cust857.7-2.cable.virginm.net ([82.27.199.90] helo=wren.home) by nowt.default.pbrook.uk0.bigv.io with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1nikJ9-0001ea-Ka; Sun, 24 Apr 2022 23:02:19 +0100 From: Paul Brook To: Paolo Bonzini , Richard Henderson , Eduardo Habkost Subject: [PATCH v2 24/42] i386: Move 3DNOW decoder Date: Sun, 24 Apr 2022 23:01:46 +0100 Message-Id: <20220424220204.2493824-25-paul@nowt.org> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220418173904.3746036-1-paul@nowt.org> References: <20220418173904.3746036-1-paul@nowt.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of 
gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2001:41c8:51:832:fcff:ff:fe00:46dd; envelope-from=paul@nowt.org; helo=nowt.default.pbrook.uk0.bigv.io X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "open list:All patches CC here" , Paul Brook Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZM-MESSAGEID: 1650839236765100001 Content-Type: text/plain; charset="utf-8" Handle 3DNOW instructions early to avoid complicating the AVX logic. Signed-off-by: Paul Brook --- target/i386/tcg/translate.c | 30 +++++++++++++++++------------- 1 file changed, 17 insertions(+), 13 deletions(-) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index 64f026c0af..6c40df61d4 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -3297,6 +3297,11 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, is_xmm =3D 1; } } + if (sse_op.flags & SSE_OPF_3DNOW) { + if (!(s->cpuid_ext2_features & CPUID_EXT2_3DNOW)) { + goto illegal_op; + } + } /* simple MMX/SSE operation */ if (s->flags & HF_TS_MASK) { gen_exception(s, EXCP07_PREX, pc_start - s->cs_base); @@ -4761,21 +4766,20 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, rm =3D (modrm & 7); op2_offset =3D offsetof(CPUX86State,fpregs[rm].mmx); } + if (sse_op.flags & SSE_OPF_3DNOW) { + /* 3DNow! data insns */ + val =3D x86_ldub_code(env, s); + SSEFunc_0_epp op_3dnow =3D sse_op_table5[val]; + if (!op_3dnow) { + goto unknown_op; + } + tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); + tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); + op_3dnow(cpu_env, s->ptr0, s->ptr1); + return; + } } switch(b) { - case 0x0f: /* 3DNow! 
data insns */ - val =3D x86_ldub_code(env, s); - sse_fn_epp =3D sse_op_table5[val]; - if (!sse_fn_epp) { - goto unknown_op; - } - if (!(s->cpuid_ext2_features & CPUID_EXT2_3DNOW)) { - goto illegal_op; - } - tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); - tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); - sse_fn_epp(cpu_env, s->ptr0, s->ptr1); - break; case 0x70: /* pshufx insn */ case 0xc6: /* pshufx insn */ val =3D x86_ldub_code(env, s); --=20 2.36.0 From nobody Thu May 2 14:26:59 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1650839574598327.30728663085904; Sun, 24 Apr 2022 15:32:54 -0700 (PDT) Received: from localhost ([::1]:58950 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nikmj-0004Vl-GN for importer@patchew.org; Sun, 24 Apr 2022 18:32:53 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50576) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nikSZ-0001Ys-4M for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:12:03 -0400 Received: from nowt.default.pbrook.uk0.bigv.io ([2001:41c8:51:832:fcff:ff:fe00:46dd]:58784) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nikSU-0002oR-OY for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:12:02 -0400 Received: from cpc91554-seac25-2-0-cust857.7-2.cable.virginm.net ([82.27.199.90] helo=wren.home) by nowt.default.pbrook.uk0.bigv.io with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1nikJ9-0001ea-Rk; Sun, 24 Apr 2022 23:02:19 +0100 From: Paul Brook To: Paolo Bonzini , Richard Henderson , Eduardo Habkost Subject: [PATCH v2 25/42] i386: VEX.V encodings (3 operand) Date: Sun, 24 Apr 2022 23:01:47 +0100 Message-Id: <20220424220204.2493824-26-paul@nowt.org> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220418173904.3746036-1-paul@nowt.org> References: <20220418173904.3746036-1-paul@nowt.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2001:41c8:51:832:fcff:ff:fe00:46dd; envelope-from=paul@nowt.org; helo=nowt.default.pbrook.uk0.bigv.io X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "open list:All patches CC here" , Paul Brook Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZM-MESSAGEID: 1650839576196100003 Content-Type: text/plain; charset="utf-8" Enable translation of VEX-encoded AVX instructions. The big change is the addition of a new register operand in the VEX.V field. This is usually (but not always!) used to explicitly encode the first source operand.
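To make the operand change concrete, here is a simplified sketch (plain scalar-float C with invented names, not the actual Reg-based helper signatures used in this series): the legacy SSE form is destructive, the VEX form takes an explicit first source, and passing the destination as that source recovers the old behaviour:

/* Legacy SSE shape: d = d op s (the destination is also a source). */
void addps_sse(float d[4], const float s[4])
{
    for (int i = 0; i < 4; i++) {
        d[i] += s[i];
    }
}

/* VEX shape: d = v op s, with VEX.V supplying v (non-destructive). */
void addps_vex(float d[4], const float v[4], const float s[4])
{
    for (int i = 0; i < 4; i++) {
        d[i] = v[i] + s[i];
    }
}

/* One helper can serve both encodings: pass v == d for the legacy form. */
void addps_compat(float d[4], const float s[4])
{
    addps_vex(d, d, s);
}

That is roughly the pattern the mechanical changes below follow: the helpers gain an explicit first-source argument, and non-VEX encodings pass the destination register for it.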
The changes to ops_sse.h and ops_sse_header.h are purely mechanical, with previous changes ensuring that the relevant helper functions are ready to handle the non-destructive source operand. We now have a greater variety of operand patterns for the vector helper functions. The SSE_OPF_* flags we added to the opcode lookup tables are used to select between these. This includes e.g. the pshufX and cmpX instructions, which were previously overridden by opcode. One gotcha is the "scalar" vector instructions. The SSE encodings write a single element to the destination and leave the remainder of the register unchanged. The VEX encodings copy the remainder of the destination from the first source operand. If the operation only has a single source value, then VEX.V encodes an additional operand which is copied to the remainder of the destination. Signed-off-by: Paul Brook --- target/i386/ops_sse.h | 214 +++++++++---------- target/i386/ops_sse_header.h | 149 ++++++------- target/i386/tcg/translate.c | 399 +++++++++++++++++++++++++---------- 3 files changed, 463 insertions(+), 299 deletions(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index e48dfc2fc5..ad3312d353 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -97,9 +97,8 @@ #define FPSLL(x, c) ((x) << shift) #endif =20 -void glue(helper_psrlw, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) +void glue(helper_psrlw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c) { - Reg *s =3D d; int shift; if (c->Q(0) > 15) { d->Q(0) =3D 0; @@ -114,9 +113,8 @@ void glue(helper_psrlw, SUFFIX)(CPUX86State *= d, Reg *c) } } =20 -void glue(helper_psllw, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) +void glue(helper_psllw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c) { - Reg *s =3D d; int shift; if (c->Q(0) > 15) { d->Q(0) =3D 0; @@ -131,9 +129,8 @@ void glue(helper_psllw, SUFFIX)(CPUX86State *= d, Reg *c) } } =20 -void glue(helper_psraw, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) +void glue(helper_psraw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c) { - Reg *s =3D d; int shift; if (c->Q(0) > 15) { shift =3D 15; @@ -143,9 +140,8 @@ void glue(helper_psraw, SUFFIX)(CPUX86State *= d, Reg *c) SHIFT_HELPER_BODY(4 << SHIFT, W, FPSRAW); } =20 -void glue(helper_psrld, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) +void glue(helper_psrld, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c) { - Reg *s =3D d; int shift; if (c->Q(0) > 31) { d->Q(0) =3D 0; @@ -160,9 +156,8 @@ void glue(helper_psrld, SUFFIX)(CPUX86State *= d, Reg *c) } } =20 -void glue(helper_pslld, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) +void glue(helper_pslld, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c) { - Reg *s =3D d; int shift; if (c->Q(0) > 31) { d->Q(0) =3D 0; @@ -177,9 +172,8 @@ void glue(helper_pslld, SUFFIX)(CPUX86State *= d, Reg *c) } } =20 -void glue(helper_psrad, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) +void glue(helper_psrad, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c) { - Reg *s =3D d; int shift; if (c->Q(0) > 31) { shift =3D 31; @@ -189,9 +183,8 @@ void glue(helper_psrad, SUFFIX)(CPUX86State *= d, Reg *c) SHIFT_HELPER_BODY(2 << SHIFT, L, FPSRAL); } =20 -void glue(helper_psrlq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) +void glue(helper_psrlq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c) { - Reg *s =3D d; int shift; if (c->Q(0) > 63) { d->Q(0) =3D 0; @@ -206,9 +199,8 @@ void glue(helper_psrlq, SUFFIX)(CPUX86State *= d, Reg *c) } } =20 -void glue(helper_psllq,
SUFFIX)(CPUX86State *env, Reg *d, Reg *c) +void glue(helper_psllq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c) { - Reg *s =3D d; int shift; if (c->Q(0) > 63) { d->Q(0) =3D 0; @@ -224,9 +216,8 @@ void glue(helper_psllq, SUFFIX)(CPUX86State *env, Reg *= d, Reg *c) } =20 #if SHIFT >=3D 1 -void glue(helper_psrldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) +void glue(helper_psrldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c) { - Reg *s =3D d; int shift, i; =20 shift =3D c->L(0); @@ -249,9 +240,8 @@ void glue(helper_psrldq, SUFFIX)(CPUX86State *env, Reg = *d, Reg *c) #endif } =20 -void glue(helper_pslldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) +void glue(helper_pslldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c) { - Reg *s =3D d; int shift, i; =20 shift =3D c->L(0); @@ -321,9 +311,8 @@ void glue(helper_pslldq, SUFFIX)(CPUX86State *env, Reg = *d, Reg *c) } =20 #define SSE_HELPER_B(name, F) \ - void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ + void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) \ { \ - Reg *v =3D d; \ d->B(0) =3D F(v->B(0), s->B(0)); \ d->B(1) =3D F(v->B(1), s->B(1)); \ d->B(2) =3D F(v->B(2), s->B(2)); \ @@ -363,9 +352,8 @@ void glue(helper_pslldq, SUFFIX)(CPUX86State *env, Reg = *d, Reg *c) } =20 #define SSE_HELPER_W(name, F) \ - void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ + void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) \ { \ - Reg *v =3D d; \ d->W(0) =3D F(v->W(0), s->W(0)); \ d->W(1) =3D F(v->W(1), s->W(1)); \ d->W(2) =3D F(v->W(2), s->W(2)); \ @@ -389,9 +377,8 @@ void glue(helper_pslldq, SUFFIX)(CPUX86State *env, Reg = *d, Reg *c) } =20 #define SSE_HELPER_L(name, F) \ - void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ + void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) \ { \ - Reg *v =3D d; \ d->L(0) =3D F(v->L(0), s->L(0)); \ d->L(1) =3D F(v->L(1), s->L(1)); \ XMM_ONLY( \ @@ -407,9 +394,8 @@ void glue(helper_pslldq, SUFFIX)(CPUX86State *env, Reg = *d, Reg *c) } =20 #define SSE_HELPER_Q(name, F) \ - void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ + void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) \ { \ - Reg *v =3D d; \ d->Q(0) =3D F(v->Q(0), s->Q(0)); \ XMM_ONLY( \ d->Q(1) =3D F(v->Q(1), s->Q(1)); \ @@ -555,9 +541,8 @@ void glue(helper_pmulhrw, SUFFIX)(CPUX86State *env, Reg= *d, Reg *s) SSE_HELPER_B(helper_pavgb, FAVG) SSE_HELPER_W(helper_pavgw, FAVG) =20 -void glue(helper_pmuludq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_pmuludq, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) { - Reg *v =3D d; d->Q(0) =3D (uint64_t)s->L(0) * (uint64_t)v->L(0); #if SHIFT >=3D 1 d->Q(1) =3D (uint64_t)s->L(2) * (uint64_t)v->L(2); @@ -568,9 +553,8 @@ void glue(helper_pmuludq, SUFFIX)(CPUX86State *env, Reg= *d, Reg *s) #endif } =20 -void glue(helper_pmaddwd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_pmaddwd, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) { - Reg *v =3D d; int i; =20 for (i =3D 0; i < (2 << SHIFT); i++) { @@ -589,10 +573,8 @@ static inline int abs1(int a) } } #endif - -void glue(helper_psadbw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_psadbw, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) { - Reg *v =3D d; unsigned int val; =20 val =3D 0; @@ -701,9 +683,8 @@ void glue(helper_pshufw, SUFFIX)(Reg *d, Reg *s, int or= der) SHUFFLE4(W, s, s, 0); } #else -void glue(helper_shufps, SUFFIX)(Reg *d, Reg *s, int order) +void glue(helper_shufps, SUFFIX)(Reg *d, Reg *v, Reg *s, int order) { - 
Reg *v =3D d; uint32_t r0, r1, r2, r3; =20 SHUFFLE4(L, v, s, 0); @@ -712,9 +693,8 @@ void glue(helper_shufps, SUFFIX)(Reg *d, Reg *s, int or= der) #endif } =20 -void glue(helper_shufpd, SUFFIX)(Reg *d, Reg *s, int order) +void glue(helper_shufpd, SUFFIX)(Reg *d, Reg *v, Reg *s, int order) { - Reg *v =3D d; uint64_t r0, r1; =20 r0 =3D v->Q(order & 1); @@ -770,9 +750,8 @@ void glue(helper_pshufhw, SUFFIX)(Reg *d, Reg *s, int o= rder) =20 #define SSE_HELPER_P(name, F) \ void glue(helper_ ## name ## ps, SUFFIX)(CPUX86State *env, \ - Reg *d, Reg *s) \ + Reg *d, Reg *v, Reg *s) \ { \ - Reg *v =3D d; \ d->ZMM_S(0) =3D F(32, v->ZMM_S(0), s->ZMM_S(0)); \ d->ZMM_S(1) =3D F(32, v->ZMM_S(1), s->ZMM_S(1)); \ d->ZMM_S(2) =3D F(32, v->ZMM_S(2), s->ZMM_S(2)); \ @@ -786,9 +765,8 @@ void glue(helper_pshufhw, SUFFIX)(Reg *d, Reg *s, int o= rder) } \ \ void glue(helper_ ## name ## pd, SUFFIX)(CPUX86State *env, \ - Reg *d, Reg *s) \ + Reg *d, Reg *v, Reg *s) \ { \ - Reg *v =3D d; \ d->ZMM_D(0) =3D F(64, v->ZMM_D(0), s->ZMM_D(0)); \ d->ZMM_D(1) =3D F(64, v->ZMM_D(1), s->ZMM_D(1)); \ YMM_ONLY( \ @@ -802,15 +780,13 @@ void glue(helper_pshufhw, SUFFIX)(Reg *d, Reg *s, int= order) #define SSE_HELPER_S(name, F) \ SSE_HELPER_P(name, F) \ \ - void helper_ ## name ## ss(CPUX86State *env, Reg *d, Reg *s)\ + void helper_ ## name ## ss(CPUX86State *env, Reg *d, Reg *v, Reg *s)\ { \ - Reg *v =3D d; \ d->ZMM_S(0) =3D F(32, v->ZMM_S(0), s->ZMM_S(0)); \ } \ \ - void helper_ ## name ## sd(CPUX86State *env, Reg *d, Reg *s)\ + void helper_ ## name ## sd(CPUX86State *env, Reg *d, Reg *v, Reg *s)\ { \ - Reg *v =3D d; \ d->ZMM_D(0) =3D F(64, v->ZMM_D(0), s->ZMM_D(0)); \ } =20 @@ -1284,9 +1260,8 @@ void helper_insertq_i(CPUX86State *env, ZMMReg *d, in= t index, int length) } #endif =20 -void glue(helper_haddps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_haddps, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) { - Reg *v =3D d; float32 r0, r1, r2, r3; =20 r0 =3D float32_add(v->ZMM_S(0), v->ZMM_S(1), &env->sse_status); @@ -1309,9 +1284,8 @@ void glue(helper_haddps, SUFFIX)(CPUX86State *env, Re= g *d, Reg *s) #endif } =20 -void glue(helper_haddpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_haddpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) { - Reg *v =3D d; float64 r0, r1; =20 r0 =3D float64_add(v->ZMM_D(0), v->ZMM_D(1), &env->sse_status); @@ -1326,9 +1300,8 @@ void glue(helper_haddpd, SUFFIX)(CPUX86State *env, Re= g *d, Reg *s) #endif } =20 -void glue(helper_hsubps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_hsubps, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) { - Reg *v =3D d; float32 r0, r1, r2, r3; =20 r0 =3D float32_sub(v->ZMM_S(0), v->ZMM_S(1), &env->sse_status); @@ -1351,9 +1324,8 @@ void glue(helper_hsubps, SUFFIX)(CPUX86State *env, Re= g *d, Reg *s) #endif } =20 -void glue(helper_hsubpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_hsubpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) { - Reg *v =3D d; float64 r0, r1; =20 r0 =3D float64_sub(v->ZMM_D(0), v->ZMM_D(1), &env->sse_status); @@ -1368,9 +1340,8 @@ void glue(helper_hsubpd, SUFFIX)(CPUX86State *env, Re= g *d, Reg *s) #endif } =20 -void glue(helper_addsubps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_addsubps, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *= s) { - Reg *v =3D d; d->ZMM_S(0) =3D float32_sub(v->ZMM_S(0), s->ZMM_S(0), &env->sse_status= ); d->ZMM_S(1) =3D float32_add(v->ZMM_S(1), s->ZMM_S(1), &env->sse_status= ); d->ZMM_S(2) =3D float32_sub(v->ZMM_S(2), 
s->ZMM_S(2), &env->sse_status= ); @@ -1383,9 +1354,8 @@ void glue(helper_addsubps, SUFFIX)(CPUX86State *env, = Reg *d, Reg *s) #endif } =20 -void glue(helper_addsubpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_addsubpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *= s) { - Reg *v =3D d; d->ZMM_D(0) =3D float64_sub(v->ZMM_D(0), s->ZMM_D(0), &env->sse_status= ); d->ZMM_D(1) =3D float64_add(v->ZMM_D(1), s->ZMM_D(1), &env->sse_status= ); #if SHIFT =3D=3D 2 @@ -1396,9 +1366,8 @@ void glue(helper_addsubpd, SUFFIX)(CPUX86State *env, = Reg *d, Reg *s) =20 #define SSE_HELPER_CMP_P(name, F, C) \ void glue(helper_ ## name ## ps, SUFFIX)(CPUX86State *env, \ - Reg *d, Reg *s) \ + Reg *d, Reg *v, Reg *s) \ { \ - Reg *v =3D d; \ d->ZMM_L(0) =3D F(32, C, v->ZMM_S(0), s->ZMM_S(0)); \ d->ZMM_L(1) =3D F(32, C, v->ZMM_S(1), s->ZMM_S(1)); \ d->ZMM_L(2) =3D F(32, C, v->ZMM_S(2), s->ZMM_S(2)); \ @@ -1412,9 +1381,8 @@ void glue(helper_addsubpd, SUFFIX)(CPUX86State *env, = Reg *d, Reg *s) } \ \ void glue(helper_ ## name ## pd, SUFFIX)(CPUX86State *env, \ - Reg *d, Reg *s) \ + Reg *d, Reg *v, Reg *s) \ { \ - Reg *v =3D d; \ d->ZMM_Q(0) =3D F(64, C, v->ZMM_D(0), s->ZMM_D(0)); \ d->ZMM_Q(1) =3D F(64, C, v->ZMM_D(1), s->ZMM_D(1)); \ YMM_ONLY( \ @@ -1426,15 +1394,13 @@ void glue(helper_addsubpd, SUFFIX)(CPUX86State *env= , Reg *d, Reg *s) #if SHIFT =3D=3D 1 #define SSE_HELPER_CMP(name, F, C) = \ SSE_HELPER_CMP_P(name, F, C) = \ - void helper_ ## name ## ss(CPUX86State *env, Reg *d, Reg *s) \ + void helper_ ## name ## ss(CPUX86State *env, Reg *d, Reg *v, Reg *s) = \ { = \ - Reg *v =3D d; = \ d->ZMM_L(0) =3D F(32, C, v->ZMM_S(0), s->ZMM_S(0)); = \ } = \ = \ - void helper_ ## name ## sd(CPUX86State *env, Reg *d, Reg *s) \ + void helper_ ## name ## sd(CPUX86State *env, Reg *d, Reg *v, Reg *s) = \ { = \ - Reg *v =3D d; = \ d->ZMM_Q(0) =3D F(64, C, v->ZMM_D(0), s->ZMM_D(0)); = \ } =20 @@ -1633,9 +1599,44 @@ uint32_t glue(helper_pmovmskb, SUFFIX)(CPUX86State *= env, Reg *s) #define PACK_WIDTH 8 #endif =20 -void glue(helper_packssdw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +#define PACK4(F, to, reg, from) do { \ + r[to + 0] =3D F((int16_t)reg->W(from + 0)); \ + r[to + 1] =3D F((int16_t)reg->W(from + 1)); \ + r[to + 2] =3D F((int16_t)reg->W(from + 2)); \ + r[to + 3] =3D F((int16_t)reg->W(from + 3)); \ + } while (0) + +#define PACK_HELPER_B(name, F) \ +void glue(helper_pack ## name, SUFFIX)(CPUX86State *env, \ + Reg *d, Reg *v, Reg *s) \ +{ \ + uint8_t r[PACK_WIDTH * 2]; \ + int i; \ + PACK4(F, 0, v, 0); \ + PACK4(F, PACK_WIDTH, s, 0); \ + XMM_ONLY( \ + PACK4(F, 4, v, 4); \ + PACK4(F, 12, s, 4); \ + ) \ + for (i =3D 0; i < PACK_WIDTH * 2; i++) { \ + d->B(i) =3D r[i]; \ + } \ + YMM_ONLY( \ + PACK4(F, 0, v, 8); \ + PACK4(F, 4, v, 12); \ + PACK4(F, 8, s, 8); \ + PACK4(F, 12, s, 12); \ + for (i =3D 0; i < 16; i++) { \ + d->B(i + 16) =3D r[i]; \ + } \ + ) \ +} + +PACK_HELPER_B(sswb, satsb) +PACK_HELPER_B(uswb, satub) + +void glue(helper_packssdw, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *= s) { - Reg *v =3D d; uint16_t r[PACK_WIDTH]; int i; =20 @@ -1670,9 +1671,8 @@ void glue(helper_packssdw, SUFFIX)(CPUX86State *env, = Reg *d, Reg *s) #define UNPCK_OP(base_name, base) \ \ void glue(helper_punpck ## base_name ## bw, SUFFIX)(CPUX86State *env,\ - Reg *d, Reg *s) \ + Reg *d, Reg *v, Reg *s) \ { \ - Reg *v =3D d; \ uint8_t r[PACK_WIDTH * 2]; \ int i; \ \ @@ -1721,9 +1721,8 @@ void glue(helper_packssdw, SUFFIX)(CPUX86State *env, = Reg *d, Reg *s) } \ \ void glue(helper_punpck ## base_name ## wd, 
SUFFIX)(CPUX86State *env,\ - Reg *d, Reg *s) \ + Reg *d, Reg *v, Reg *s) \ { \ - Reg *v =3D d; \ uint16_t r[PACK_WIDTH]; \ int i; \ \ @@ -1756,9 +1755,8 @@ void glue(helper_packssdw, SUFFIX)(CPUX86State *env, = Reg *d, Reg *s) } \ \ void glue(helper_punpck ## base_name ## dq, SUFFIX)(CPUX86State *env,\ - Reg *d, Reg *s) \ + Reg *d, Reg *v, Reg *s) \ { \ - Reg *v =3D d; \ uint32_t r[4]; \ \ r[0] =3D v->L((base * (PACK_WIDTH / 4)) + 0); \ @@ -1785,9 +1783,8 @@ void glue(helper_packssdw, SUFFIX)(CPUX86State *env, = Reg *d, Reg *s) \ XMM_ONLY( \ void glue(helper_punpck ## base_name ## qdq, SUFFIX)( \ - CPUX86State *env, Reg *d, Reg *s) \ + CPUX86State *env, Reg *d, Reg *v, Reg *s) \ { \ - Reg *v =3D d; \ uint64_t r[2]; \ \ r[0] =3D v->Q(base); \ @@ -1961,9 +1958,8 @@ void helper_pswapd(CPUX86State *env, MMXReg *d, MMXRe= g *s) #endif =20 /* SSSE3 op helpers */ -void glue(helper_pshufb, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_pshufb, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) { - Reg *v =3D d; int i; #if SHIFT =3D=3D 0 uint8_t r[8]; @@ -1997,9 +1993,8 @@ void glue(helper_pshufb, SUFFIX)(CPUX86State *env, Re= g *d, Reg *s) #if SHIFT =3D=3D 0 =20 #define SSE_HELPER_HW(name, F) \ -void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ +void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *= s) \ { \ - Reg *v =3D d; \ uint16_t r[4]; \ r[0] =3D F(v->W(0), v->W(1)); \ r[1] =3D F(v->W(2), v->W(3)); \ @@ -2012,9 +2007,8 @@ void glue(helper_ ## name, SUFFIX)(CPUX86State *env, = Reg *d, Reg *s) \ } =20 #define SSE_HELPER_HL(name, F) \ -void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ +void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *= s) \ { \ - Reg *v =3D d; \ uint32_t r0, r1; \ r0 =3D F(v->L(0), v->L(1)); \ r1 =3D F(s->L(0), s->L(1)); \ @@ -2025,9 +2019,8 @@ void glue(helper_ ## name, SUFFIX)(CPUX86State *env, = Reg *d, Reg *s) \ #else =20 #define SSE_HELPER_HW(name, F) \ -void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ +void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *= s) \ { \ - Reg *v =3D d; \ int32_t r[8]; \ r[0] =3D F(v->W(0), v->W(1)); \ r[1] =3D F(v->W(2), v->W(3)); \ @@ -2066,9 +2059,8 @@ void glue(helper_ ## name, SUFFIX)(CPUX86State *env, = Reg *d, Reg *s) \ } =20 #define SSE_HELPER_HL(name, F) \ -void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ +void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *= s) \ { \ - Reg *v =3D d; \ int32_t r0, r1, r2, r3; \ r0 =3D F(v->L(0), v->L(1)); \ r1 =3D F(v->L(2), v->L(3)); \ @@ -2101,9 +2093,8 @@ SSE_HELPER_HL(phsubd, FSUB) #undef SSE_HELPER_HW #undef SSE_HELPER_HL =20 -void glue(helper_pmaddubsw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_pmaddubsw, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg = *s) { - Reg *v =3D d; d->W(0) =3D satsw((int8_t)s->B(0) * (uint8_t)v->B(0) + (int8_t)s->B(1) * (uint8_t)v->B(1)); d->W(1) =3D satsw((int8_t)s->B(2) * (uint8_t)v->B(2) + @@ -2148,10 +2139,9 @@ SSE_HELPER_B(helper_psignb, FSIGNB) SSE_HELPER_W(helper_psignw, FSIGNW) SSE_HELPER_L(helper_psignd, FSIGNL) =20 -void glue(helper_palignr, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, +void glue(helper_palignr, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s, int32_t shift) { - Reg *v =3D d; /* XXX could be checked during translation */ if (shift >=3D (SHIFT ? 
32 : 16)) { d->Q(0) =3D 0; @@ -2224,10 +2214,9 @@ void glue(helper_palignr, SUFFIX)(CPUX86State *env, = Reg *d, Reg *s, } while (0) =20 #define SSE_HELPER_V(name, elem, num, F) \ - void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ + void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s, \ + Reg *m) \ { \ - Reg *v =3D d; \ - Reg *m =3D &env->xmm_regs[0]; \ BLEND_V128(elem, num, F, 0); \ YMM_ONLY(BLEND_V128(elem, num, F, num);) \ } @@ -2248,10 +2237,9 @@ void glue(helper_palignr, SUFFIX)(CPUX86State *env, = Reg *d, Reg *s, } while (0) =20 #define SSE_HELPER_I(name, elem, num, F) \ - void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, \ + void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s, \ uint32_t imm) \ { \ - Reg *v =3D d; \ BLEND_I128(elem, num, F, 0); \ YMM_ONLY( \ if (num < 8) \ @@ -2320,9 +2308,8 @@ SSE_HELPER_F(helper_pmovzxwd, L, 4, s->W) SSE_HELPER_F(helper_pmovzxwq, Q, 2, s->W) SSE_HELPER_F(helper_pmovzxdq, Q, 2, s->L) =20 -void glue(helper_pmuldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_pmuldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) { - Reg *v =3D d; d->Q(0) =3D (int64_t)(int32_t) v->L(0) * (int32_t) s->L(0); d->Q(1) =3D (int64_t)(int32_t) v->L(2) * (int32_t) s->L(2); #if SHIFT =3D=3D 2 @@ -2334,9 +2321,8 @@ void glue(helper_pmuldq, SUFFIX)(CPUX86State *env, Re= g *d, Reg *s) #define FCMPEQQ(d, s) (d =3D=3D s ? -1 : 0) SSE_HELPER_Q(helper_pcmpeqq, FCMPEQQ) =20 -void glue(helper_packusdw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_packusdw, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *= s) { - Reg *v =3D d; uint16_t r[8]; =20 r[0] =3D satuw((int32_t) v->L(0)); @@ -2582,10 +2568,9 @@ SSE_HELPER_I(helper_blendps, L, 4, FBLENDP) SSE_HELPER_I(helper_blendpd, Q, 2, FBLENDP) SSE_HELPER_I(helper_pblendw, W, 8, FBLENDP) =20 -void glue(helper_dpps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, +void glue(helper_dpps, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s, uint32_t mask) { - Reg *v =3D d; float32 prod, iresult, iresult2; =20 /* @@ -2655,9 +2640,8 @@ void glue(helper_dpps, SUFFIX)(CPUX86State *env, Reg = *d, Reg *s, #if SHIFT =3D=3D 1 /* Oddly, there is no ymm version of dppd */ void glue(helper_dppd, SUFFIX)(CPUX86State *env, - Reg *d, Reg *s, uint32_t mask) + Reg *d, Reg *v, Reg *s, uint32_t mask) { - Reg *v =3D d; float64 iresult; =20 if (mask & (1 << 4)) { @@ -2677,10 +2661,9 @@ void glue(helper_dppd, SUFFIX)(CPUX86State *env, } #endif =20 -void glue(helper_mpsadbw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, +void glue(helper_mpsadbw, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s, uint32_t offset) { - Reg *v =3D d; int s0 =3D (offset & 3) << 2; int d0 =3D (offset & 4) << 0; int i; @@ -2965,10 +2948,9 @@ static void clmulq(uint64_t *dest_l, uint64_t *dest_= h, } #endif =20 -void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, +void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg = *s, uint32_t ctrl) { - Reg *v =3D d; uint64_t a, b; =20 a =3D v->Q((ctrl & 1) !=3D 0); @@ -2981,10 +2963,10 @@ void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *en= v, Reg *d, Reg *s, #endif } =20 -void glue(helper_aesdec, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_aesdec, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) { int i; - Reg st =3D *d; // v + Reg st =3D *v; Reg rk =3D *s; =20 for (i =3D 0 ; i < 4 ; i++) { @@ -3004,10 +2986,10 @@ void glue(helper_aesdec, SUFFIX)(CPUX86State *env, = Reg *d, Reg *s) #endif } =20 -void glue(helper_aesdeclast, 
SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_aesdeclast, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg= *s) { int i; - Reg st =3D *d; // v + Reg st =3D *v; Reg rk =3D *s; =20 for (i =3D 0; i < 16; i++) { @@ -3020,10 +3002,10 @@ void glue(helper_aesdeclast, SUFFIX)(CPUX86State *e= nv, Reg *d, Reg *s) #endif } =20 -void glue(helper_aesenc, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_aesenc, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s) { int i; - Reg st =3D *d; // v + Reg st =3D *v; Reg rk =3D *s; =20 for (i =3D 0 ; i < 4 ; i++) { @@ -3043,10 +3025,10 @@ void glue(helper_aesenc, SUFFIX)(CPUX86State *env, = Reg *d, Reg *s) #endif } =20 -void glue(helper_aesenclast, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_aesenclast, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg= *s) { int i; - Reg st =3D *d; // v + Reg st =3D *v; Reg rk =3D *s; =20 for (i =3D 0; i < 16; i++) { diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h index 793e581224..cfcfba154b 100644 --- a/target/i386/ops_sse_header.h +++ b/target/i386/ops_sse_header.h @@ -38,31 +38,31 @@ #define dh_typecode_ZMMReg dh_typecode_ptr #define dh_typecode_MMXReg dh_typecode_ptr =20 -DEF_HELPER_3(glue(psrlw, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(psraw, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(psllw, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(psrld, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(psrad, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pslld, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(psrlq, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(psllq, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_4(glue(psrlw, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(psraw, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(psllw, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(psrld, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(psrad, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(pslld, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(psrlq, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(psllq, SUFFIX), void, env, Reg, Reg, Reg) =20 #if SHIFT >=3D 1 -DEF_HELPER_3(glue(psrldq, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pslldq, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_4(glue(psrldq, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(pslldq, SUFFIX), void, env, Reg, Reg, Reg) #endif =20 #define SSE_HELPER_B(name, F)\ - DEF_HELPER_3(glue(name, SUFFIX), void, env, Reg, Reg) + DEF_HELPER_4(glue(name, SUFFIX), void, env, Reg, Reg, Reg) =20 #define SSE_HELPER_W(name, F)\ - DEF_HELPER_3(glue(name, SUFFIX), void, env, Reg, Reg) + DEF_HELPER_4(glue(name, SUFFIX), void, env, Reg, Reg, Reg) =20 #define SSE_HELPER_L(name, F)\ - DEF_HELPER_3(glue(name, SUFFIX), void, env, Reg, Reg) + DEF_HELPER_4(glue(name, SUFFIX), void, env, Reg, Reg, Reg) =20 #define SSE_HELPER_Q(name, F)\ - DEF_HELPER_3(glue(name, SUFFIX), void, env, Reg, Reg) + DEF_HELPER_4(glue(name, SUFFIX), void, env, Reg, Reg, Reg) =20 SSE_HELPER_B(paddb, FADD) SSE_HELPER_W(paddw, FADD) @@ -113,10 +113,10 @@ SSE_HELPER_W(pmulhw, FMULHW) SSE_HELPER_B(pavgb, FAVG) SSE_HELPER_W(pavgw, FAVG) =20 -DEF_HELPER_3(glue(pmuludq, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pmaddwd, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_4(glue(pmuludq, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(pmaddwd, SUFFIX), void, env, Reg, Reg, Reg) =20 -DEF_HELPER_3(glue(psadbw, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_4(glue(psadbw, SUFFIX), void, env, Reg, Reg, Reg) 
#if SHIFT < 2 DEF_HELPER_4(glue(maskmov, SUFFIX), void, env, Reg, Reg, tl) #endif @@ -138,8 +138,8 @@ DEF_HELPER_3(glue(pshufhw, SUFFIX), void, Reg, Reg, int) /* XXX: not accurate */ =20 #define SSE_HELPER_P4(name) \ - DEF_HELPER_3(glue(name ## ps, SUFFIX), void, env, Reg, Reg) \ - DEF_HELPER_3(glue(name ## pd, SUFFIX), void, env, Reg, Reg) + DEF_HELPER_4(glue(name ## ps, SUFFIX), void, env, Reg, Reg, Reg) \ + DEF_HELPER_4(glue(name ## pd, SUFFIX), void, env, Reg, Reg, Reg) =20 #define SSE_HELPER_P3(name, ...) \ DEF_HELPER_3(glue(name ## ps, SUFFIX), void, env, Reg, Reg) \ @@ -148,8 +148,8 @@ DEF_HELPER_3(glue(pshufhw, SUFFIX), void, Reg, Reg, int) #if SHIFT =3D=3D 1 #define SSE_HELPER_S4(name) \ SSE_HELPER_P4(name) \ - DEF_HELPER_3(name ## ss, void, env, Reg, Reg) \ - DEF_HELPER_3(name ## sd, void, env, Reg, Reg) + DEF_HELPER_4(name ## ss, void, env, Reg, Reg, Reg) \ + DEF_HELPER_4(name ## sd, void, env, Reg, Reg, Reg) #define SSE_HELPER_S3(name) \ SSE_HELPER_P3(name) \ DEF_HELPER_3(name ## ss, void, env, Reg, Reg) \ @@ -159,8 +159,8 @@ DEF_HELPER_3(glue(pshufhw, SUFFIX), void, Reg, Reg, int) #define SSE_HELPER_S3(name, ...) SSE_HELPER_P3(name) #endif =20 -DEF_HELPER_3(glue(shufps, SUFFIX), void, Reg, Reg, int) -DEF_HELPER_3(glue(shufpd, SUFFIX), void, Reg, Reg, int) +DEF_HELPER_4(glue(shufps, SUFFIX), void, Reg, Reg, Reg, int) +DEF_HELPER_4(glue(shufpd, SUFFIX), void, Reg, Reg, Reg, int) =20 SSE_HELPER_S4(add) SSE_HELPER_S4(sub) @@ -216,6 +216,7 @@ DEF_HELPER_2(cvttsd2sq, s64, env, ZMMReg) =20 DEF_HELPER_3(glue(rsqrtps, SUFFIX), void, env, ZMMReg, ZMMReg) DEF_HELPER_3(glue(rcpps, SUFFIX), void, env, ZMMReg, ZMMReg) + #if SHIFT =3D=3D 1 DEF_HELPER_3(rsqrtss, void, env, ZMMReg, ZMMReg) DEF_HELPER_3(rcpss, void, env, ZMMReg, ZMMReg) @@ -279,20 +280,20 @@ DEF_HELPER_2(glue(movmskpd, SUFFIX), i32, env, Reg) #endif =20 DEF_HELPER_2(glue(pmovmskb, SUFFIX), i32, env, Reg) -DEF_HELPER_3(glue(packsswb, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(packuswb, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(packssdw, SUFFIX), void, env, Reg, Reg) -#define UNPCK_OP(base_name, base) \ - DEF_HELPER_3(glue(punpck ## base_name ## bw, SUFFIX), void, env, Reg, = Reg) \ - DEF_HELPER_3(glue(punpck ## base_name ## wd, SUFFIX), void, env, Reg, = Reg) \ - DEF_HELPER_3(glue(punpck ## base_name ## dq, SUFFIX), void, env, Reg, = Reg) +DEF_HELPER_4(glue(packsswb, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(packuswb, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(packssdw, SUFFIX), void, env, Reg, Reg, Reg) +#define UNPCK_OP(name, base) \ + DEF_HELPER_4(glue(punpck ## name ## bw, SUFFIX), void, env, Reg, Reg, = Reg) \ + DEF_HELPER_4(glue(punpck ## name ## wd, SUFFIX), void, env, Reg, Reg, = Reg) \ + DEF_HELPER_4(glue(punpck ## name ## dq, SUFFIX), void, env, Reg, Reg, = Reg) =20 UNPCK_OP(l, 0) UNPCK_OP(h, 1) =20 #if SHIFT >=3D 1 -DEF_HELPER_3(glue(punpcklqdq, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(punpckhqdq, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_4(glue(punpcklqdq, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(punpckhqdq, SUFFIX), void, env, Reg, Reg, Reg) #endif =20 /* 3DNow! 
float ops */ @@ -319,28 +320,28 @@ DEF_HELPER_3(pswapd, void, env, MMXReg, MMXReg) #endif =20 /* SSSE3 op helpers */ -DEF_HELPER_3(glue(phaddw, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(phaddd, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(phaddsw, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(phsubw, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(phsubd, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(phsubsw, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_4(glue(phaddw, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(phaddd, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(phaddsw, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(phsubw, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(phsubd, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(phsubsw, SUFFIX), void, env, Reg, Reg, Reg) DEF_HELPER_3(glue(pabsb, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(glue(pabsw, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(glue(pabsd, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pmaddubsw, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pmulhrsw, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pshufb, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(psignb, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(psignw, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(psignd, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_4(glue(palignr, SUFFIX), void, env, Reg, Reg, s32) +DEF_HELPER_4(glue(pmaddubsw, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(pmulhrsw, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(pshufb, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(psignb, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(psignw, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(psignd, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_5(glue(palignr, SUFFIX), void, env, Reg, Reg, Reg, s32) =20 /* SSE4.1 op helpers */ #if SHIFT >=3D 1 -DEF_HELPER_3(glue(pblendvb, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(blendvps, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(blendvpd, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_5(glue(pblendvb, SUFFIX), void, env, Reg, Reg, Reg, Reg) +DEF_HELPER_5(glue(blendvps, SUFFIX), void, env, Reg, Reg, Reg, Reg) +DEF_HELPER_5(glue(blendvpd, SUFFIX), void, env, Reg, Reg, Reg, Reg) DEF_HELPER_3(glue(ptest, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(glue(pmovsxbw, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(glue(pmovsxbd, SUFFIX), void, env, Reg, Reg) @@ -354,40 +355,40 @@ DEF_HELPER_3(glue(pmovzxbq, SUFFIX), void, env, Reg, = Reg) DEF_HELPER_3(glue(pmovzxwd, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(glue(pmovzxwq, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(glue(pmovzxdq, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pmuldq, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pcmpeqq, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(packusdw, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pminsb, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pminsd, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pminuw, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pminud, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pmaxsb, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pmaxsd, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pmaxuw, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pmaxud, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(pmulld, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_4(glue(pmuldq, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(pcmpeqq, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(packusdw, 
SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(pminsb, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(pminsd, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(pminuw, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(pminud, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(pmaxsb, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(pmaxsd, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(pmaxuw, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(pmaxud, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(pmulld, SUFFIX), void, env, Reg, Reg, Reg) #if SHIFT =3D=3D 1 DEF_HELPER_3(glue(phminposuw, SUFFIX), void, env, Reg, Reg) #endif DEF_HELPER_4(glue(roundps, SUFFIX), void, env, Reg, Reg, i32) DEF_HELPER_4(glue(roundpd, SUFFIX), void, env, Reg, Reg, i32) #if SHIFT =3D=3D 1 -DEF_HELPER_4(glue(roundss, SUFFIX), void, env, Reg, Reg, i32) -DEF_HELPER_4(glue(roundsd, SUFFIX), void, env, Reg, Reg, i32) +DEF_HELPER_4(roundss_xmm, void, env, Reg, Reg, i32) +DEF_HELPER_4(roundsd_xmm, void, env, Reg, Reg, i32) #endif -DEF_HELPER_4(glue(blendps, SUFFIX), void, env, Reg, Reg, i32) -DEF_HELPER_4(glue(blendpd, SUFFIX), void, env, Reg, Reg, i32) -DEF_HELPER_4(glue(pblendw, SUFFIX), void, env, Reg, Reg, i32) -DEF_HELPER_4(glue(dpps, SUFFIX), void, env, Reg, Reg, i32) +DEF_HELPER_5(glue(blendps, SUFFIX), void, env, Reg, Reg, Reg, i32) +DEF_HELPER_5(glue(blendpd, SUFFIX), void, env, Reg, Reg, Reg, i32) +DEF_HELPER_5(glue(pblendw, SUFFIX), void, env, Reg, Reg, Reg, i32) +DEF_HELPER_5(glue(dpps, SUFFIX), void, env, Reg, Reg, Reg, i32) #if SHIFT =3D=3D 1 -DEF_HELPER_4(glue(dppd, SUFFIX), void, env, Reg, Reg, i32) +DEF_HELPER_5(glue(dppd, SUFFIX), void, env, Reg, Reg, Reg, i32) #endif -DEF_HELPER_4(glue(mpsadbw, SUFFIX), void, env, Reg, Reg, i32) +DEF_HELPER_5(glue(mpsadbw, SUFFIX), void, env, Reg, Reg, Reg, i32) #endif =20 /* SSE4.2 op helpers */ #if SHIFT >=3D 1 -DEF_HELPER_3(glue(pcmpgtq, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_4(glue(pcmpgtq, SUFFIX), void, env, Reg, Reg, Reg) #endif #if SHIFT =3D=3D 1 DEF_HELPER_4(glue(pcmpestri, SUFFIX), void, env, Reg, Reg, i32) @@ -399,15 +400,15 @@ DEF_HELPER_3(crc32, tl, i32, tl, i32) =20 /* AES-NI op helpers */ #if SHIFT >=3D 1 -DEF_HELPER_3(glue(aesdec, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(aesdeclast, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(aesenc, SUFFIX), void, env, Reg, Reg) -DEF_HELPER_3(glue(aesenclast, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_4(glue(aesdec, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(aesdeclast, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(aesenc, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(aesenclast, SUFFIX), void, env, Reg, Reg, Reg) #if SHIFT =3D=3D 1 DEF_HELPER_3(glue(aesimc, SUFFIX), void, env, Reg, Reg) DEF_HELPER_4(glue(aeskeygenassist, SUFFIX), void, env, Reg, Reg, i32) #endif -DEF_HELPER_4(glue(pclmulqdq, SUFFIX), void, env, Reg, Reg, i32) +DEF_HELPER_5(glue(pclmulqdq, SUFFIX), void, env, Reg, Reg, Reg, i32) #endif =20 #undef SHIFT diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index 6c40df61d4..d148a2319d 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -125,6 +125,7 @@ typedef struct DisasContext { TCGv tmp4; TCGv_ptr ptr0; TCGv_ptr ptr1; + TCGv_ptr ptr2; TCGv_i32 tmp2_i32; TCGv_i32 tmp3_i32; TCGv_i64 tmp1_i64; @@ -2784,11 +2785,21 @@ typedef void (*SSEFunc_l_ep)(TCGv_i64 val, TCGv_ptr= env, TCGv_ptr reg); typedef void (*SSEFunc_0_epi)(TCGv_ptr env, TCGv_ptr reg, TCGv_i32 val); typedef void 
(*SSEFunc_0_epl)(TCGv_ptr env, TCGv_ptr reg, TCGv_i64 val); typedef void (*SSEFunc_0_epp)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_b= ); +typedef void (*SSEFunc_0_eppp)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_= b, + TCGv_ptr reg_c); +typedef void (*SSEFunc_0_epppp)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg= _b, + TCGv_ptr reg_c, TCGv_ptr reg_d); typedef void (*SSEFunc_0_eppi)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_= b, TCGv_i32 val); +typedef void (*SSEFunc_0_epppi)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg= _b, + TCGv_ptr reg_c, TCGv_i32 val); typedef void (*SSEFunc_0_ppi)(TCGv_ptr reg_a, TCGv_ptr reg_b, TCGv_i32 val= ); +typedef void (*SSEFunc_0_pppi)(TCGv_ptr reg_a, TCGv_ptr reg_b, TCGv_ptr re= g_c, + TCGv_i32 val); typedef void (*SSEFunc_0_eppt)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_= b, TCGv val); +typedef void (*SSEFunc_0_epppt)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg= _b, + TCGv_ptr reg_c, TCGv val); =20 #define SSE_OPF_V0 (1 << 0) /* vex.v must be 1111b (only 2 operands= ) */ #define SSE_OPF_CMP (1 << 1) /* does not write for first operand */ @@ -2801,7 +2812,7 @@ typedef void (*SSEFunc_0_eppt)(TCGv_ptr env, TCGv_ptr= reg_a, TCGv_ptr reg_b, #define SSE_OPF_SHUF (1 << 9) /* pshufx/shufpx */ =20 #define OP(op, flags, a, b, c, d) \ - {flags, {a, b, c, d} } + {flags, {{.op =3D a}, {.op =3D b}, {.op =3D c}, {.op =3D d} } } =20 #define MMX_OP(x) OP(op2, SSE_OPF_MMX, \ gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm, NULL, NULL) @@ -2814,7 +2825,13 @@ typedef void (*SSEFunc_0_eppt)(TCGv_ptr env, TCGv_pt= r reg_a, TCGv_ptr reg_b, =20 struct SSEOpHelper_table1 { int flags; - SSEFunc_0_epp op[4]; + union { + SSEFunc_0_epp op1; + SSEFunc_0_ppi op1i; + SSEFunc_0_eppt op1t; + SSEFunc_0_eppp op2; + SSEFunc_0_pppi op2i; + } fn[4]; }; =20 #define SSE_3DNOW { SSE_OPF_3DNOW } @@ -2870,8 +2887,7 @@ static const struct SSEOpHelper_table1 sse_op_table1[= 256] =3D { [0x5f] =3D SSE_FOP(max), =20 [0xc2] =3D SSE_FOP(cmpeq), /* sse_op_table4 */ - [0xc6] =3D OP(dummy, SSE_OPF_SHUF, (SSEFunc_0_epp)gen_helper_shufps_xm= m, - (SSEFunc_0_epp)gen_helper_shufpd_xmm, NULL, NULL), + [0xc6] =3D SSE_OP(shufps, shufpd, op2i, SSE_OPF_SHUF), =20 /* SSSE3, SSE4, MOVBE, CRC32, BMI1, BMI2, ADX. 
*/ [0x38] =3D SSE_SPECIAL, @@ -2897,10 +2913,8 @@ static const struct SSEOpHelper_table1 sse_op_table1= [256] =3D { [0x6e] =3D SSE_SPECIAL, /* movd mm, ea */ [0x6f] =3D SSE_SPECIAL, /* movq, movdqa, , movqdu */ [0x70] =3D OP(op1i, SSE_OPF_SHUF | SSE_OPF_MMX | SSE_OPF_V0, - (SSEFunc_0_epp)gen_helper_pshufw_mmx, - (SSEFunc_0_epp)gen_helper_pshufd_xmm, - (SSEFunc_0_epp)gen_helper_pshufhw_xmm, - (SSEFunc_0_epp)gen_helper_pshuflw_xmm), + gen_helper_pshufw_mmx, gen_helper_pshufd_xmm, + gen_helper_pshufhw_xmm, gen_helper_pshuflw_xmm), [0x71] =3D SSE_SPECIAL, /* shiftw */ [0x72] =3D SSE_SPECIAL, /* shiftd */ [0x73] =3D SSE_SPECIAL, /* shiftq */ @@ -2962,8 +2976,7 @@ static const struct SSEOpHelper_table1 sse_op_table1[= 256] =3D { [0xf5] =3D MMX_OP(pmaddwd), [0xf6] =3D MMX_OP(psadbw), [0xf7] =3D OP(op1t, SSE_OPF_MMX | SSE_OPF_V0, - (SSEFunc_0_epp)gen_helper_maskmov_mmx, - (SSEFunc_0_epp)gen_helper_maskmov_xmm, NULL, NULL), + gen_helper_maskmov_mmx, gen_helper_maskmov_xmm, NULL, NULL= ), [0xf8] =3D MMX_OP(psubb), [0xf9] =3D MMX_OP(psubw), [0xfa] =3D MMX_OP(psubl), @@ -2980,7 +2993,7 @@ static const struct SSEOpHelper_table1 sse_op_table1[= 256] =3D { =20 #define MMX_OP2(x) { gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm } =20 -static const SSEFunc_0_epp sse_op_table2[3 * 8][2] =3D { +static const SSEFunc_0_eppp sse_op_table2[3 * 8][2] =3D { [0 + 2] =3D MMX_OP2(psrlw), [0 + 4] =3D MMX_OP2(psraw), [0 + 6] =3D MMX_OP2(psllw), @@ -2992,6 +3005,7 @@ static const SSEFunc_0_epp sse_op_table2[3 * 8][2] = =3D { [16 + 6] =3D MMX_OP2(psllq), [16 + 7] =3D { NULL, gen_helper_pslldq_xmm }, }; +#undef MMX_OP2 =20 static const SSEFunc_0_epi sse_op_table3ai[] =3D { gen_helper_cvtsi2ss, @@ -3024,7 +3038,7 @@ static const SSEFunc_l_ep sse_op_table3bq[] =3D { #define SSE_CMP(x) { \ gen_helper_ ## x ## ps ## _xmm, gen_helper_ ## x ## pd ## _xmm, \ gen_helper_ ## x ## ss, gen_helper_ ## x ## sd} -static const SSEFunc_0_epp sse_op_table4[32][4] =3D { +static const SSEFunc_0_eppp sse_op_table4[32][4] =3D { SSE_CMP(cmpeq), SSE_CMP(cmplt), SSE_CMP(cmple), @@ -3063,6 +3077,11 @@ static const SSEFunc_0_epp sse_op_table4[32][4] =3D { }; #undef SSE_CMP =20 +static void gen_helper_pavgusb(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_= b) +{ + gen_helper_pavgb_mmx(env, reg_a, reg_a, reg_b); +} + static const SSEFunc_0_epp sse_op_table5[256] =3D { [0x0c] =3D gen_helper_pi2fw, [0x0d] =3D gen_helper_pi2fd, @@ -3087,17 +3106,25 @@ static const SSEFunc_0_epp sse_op_table5[256] =3D { [0xb6] =3D gen_helper_movq, /* pfrcpit2 */ [0xb7] =3D gen_helper_pmulhrw_mmx, [0xbb] =3D gen_helper_pswapd, - [0xbf] =3D gen_helper_pavgb_mmx /* pavgusb */ + [0xbf] =3D gen_helper_pavgusb, }; =20 struct SSEOpHelper_table6 { - SSEFunc_0_epp op[2]; + union { + SSEFunc_0_epp op1; + SSEFunc_0_eppp op2; + SSEFunc_0_epppp op3; + } fn[2]; uint32_t ext_mask; int flags; }; =20 struct SSEOpHelper_table7 { - SSEFunc_0_eppi op[2]; + union { + SSEFunc_0_eppi op1; + SSEFunc_0_epppi op2; + SSEFunc_0_epppp op3; + } fn[2]; uint32_t ext_mask; int flags; }; @@ -3105,7 +3132,8 @@ struct SSEOpHelper_table7 { #define gen_helper_special_xmm NULL =20 #define OP(name, op, flags, ext, mmx_name) \ - {{mmx_name, gen_helper_ ## name ## _xmm}, CPUID_EXT_ ## ext, flags} + {{{.op =3D mmx_name}, {.op =3D gen_helper_ ## name ## _xmm} }, \ + CPUID_EXT_ ## ext, flags} #define BINARY_OP_MMX(name, ext) \ OP(name, op2, SSE_OPF_MMX, ext, gen_helper_ ## name ## _mmx) #define BINARY_OP(name, ext, flags) \ @@ -3262,14 +3290,11 @@ static const struct SSEOpHelper_table7 sse_op_table= 7[256] =3D { 
static void gen_sse(CPUX86State *env, DisasContext *s, int b, target_ulong pc_start) { - int b1, op1_offset, op2_offset, is_xmm, val, scalar_op; - int modrm, mod, rm, reg; + int b1, op1_offset, op2_offset, v_offset, is_xmm, val, scalar_op; + int modrm, mod, rm, reg, reg_v; struct SSEOpHelper_table1 sse_op; struct SSEOpHelper_table6 op6; struct SSEOpHelper_table7 op7; - SSEFunc_0_epp sse_fn_epp; - SSEFunc_0_ppi sse_fn_ppi; - SSEFunc_0_eppt sse_fn_eppt; MemOp ot; =20 b &=3D 0xff; @@ -3282,9 +3307,8 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, else b1 =3D 0; sse_op =3D sse_op_table1[b]; - sse_fn_epp =3D sse_op.op[b1]; if ((sse_op.flags & (SSE_OPF_SPECIAL | SSE_OPF_3DNOW)) =3D=3D 0 - && !sse_fn_epp) { + && !sse_op.fn[b1].op1) { goto unknown_op; } if ((b <=3D 0x5f && b >=3D 0x10) || b =3D=3D 0xc6 || b =3D=3D 0xc2) { @@ -3345,6 +3369,11 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, if (is_xmm) { reg |=3D REX_R(s); } + if (s->prefix & PREFIX_VEX) { + reg_v =3D s->vex_v; + } else { + reg_v =3D reg; + } mod =3D (modrm >> 6) & 3; if (sse_op.flags & SSE_OPF_SPECIAL) { b |=3D (b1 << 8); @@ -3466,8 +3495,13 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, } else { CHECK_AVX_128(s); rm =3D (modrm & 7) | REX_B(s); - gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0= )), - offsetof(CPUX86State,xmm_regs[rm].ZMM_L(0))); + tcg_gen_ld_i32(s->tmp2_i32, cpu_env, + offsetof(CPUX86State, xmm_regs[rm].ZMM_L(0)= )); + if (reg !=3D reg_v) { + gen_op_movo(s, ZMM_OFFSET(reg), ZMM_OFFSET(reg_v)); + } + tcg_gen_st_i32(s->tmp2_i32, cpu_env, + offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0= ))); } break; case 0x310: /* movsd xmm, ea */ @@ -3484,8 +3518,13 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, } else { CHECK_AVX_128(s); rm =3D (modrm & 7) | REX_B(s); + if (reg !=3D reg_v) { + gen_op_movq(s, + offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(1)), + offsetof(CPUX86State, xmm_regs[reg_v].ZMM_Q(1)= )); + } gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0= )), - offsetof(CPUX86State,xmm_regs[rm].ZMM_Q(0))); + offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(0))); } break; case 0x012: /* movlps */ @@ -3501,6 +3540,10 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0= )), offsetof(CPUX86State,xmm_regs[rm].ZMM_Q(1))); } + if (reg !=3D reg_v) { + gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(1= )), + offsetof(CPUX86State, xmm_regs[reg_v].ZMM_Q(1)= )); + } break; case 0x212: /* movsldup */ CHECK_AVX_V0(s); @@ -3546,6 +3589,10 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(1= )), offsetof(CPUX86State,xmm_regs[rm].ZMM_Q(0))); } + if (reg !=3D reg_v) { + gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0= )), + offsetof(CPUX86State, xmm_regs[reg_v].ZMM_Q(0)= )); + } break; case 0x216: /* movshdup */ CHECK_AVX_V0(s); @@ -3664,6 +3711,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } else { CHECK_AVX_128(s); rm =3D (modrm & 7) | REX_B(s); + if (rm !=3D reg_v) { + gen_op_movo(s, ZMM_OFFSET(rm), ZMM_OFFSET(reg_v)); + } gen_op_movl(s, offsetof(CPUX86State, xmm_regs[rm].ZMM_L(0)= ), offsetof(CPUX86State,xmm_regs[reg].ZMM_L(0))); } @@ -3677,6 +3727,11 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, } else { CHECK_AVX_128(s); rm =3D (modrm & 7) | REX_B(s); + if (rm !=3D reg_v) { + gen_op_movq(s, + offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(1)), + 
offsetof(CPUX86State, xmm_regs[reg_v].ZMM_Q(1)= )); + } gen_op_movq(s, offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(0)= ), offsetof(CPUX86State,xmm_regs[reg].ZMM_Q(0))); } @@ -3731,21 +3786,28 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, op1_offset =3D offsetof(CPUX86State,mmx_t0); } assert(b1 < 2); - sse_fn_epp =3D sse_op_table2[((b - 1) & 3) * 8 + + SSEFunc_0_eppp fn =3D sse_op_table2[((b - 1) & 3) * 8 + (((modrm >> 3)) & 7)][b1]; - if (!sse_fn_epp) { + if (!fn) { goto unknown_op; } if (is_xmm) { rm =3D (modrm & 7) | REX_B(s); op2_offset =3D ZMM_OFFSET(rm); + if (s->prefix & PREFIX_VEX) { + v_offset =3D ZMM_OFFSET(reg_v); + } else { + v_offset =3D op2_offset; + } } else { rm =3D (modrm & 7); op2_offset =3D offsetof(CPUX86State,fpregs[rm].mmx); + v_offset =3D op2_offset; } - tcg_gen_addi_ptr(s->ptr0, cpu_env, op2_offset); - tcg_gen_addi_ptr(s->ptr1, cpu_env, op1_offset); - sse_fn_epp(cpu_env, s->ptr0, s->ptr1); + tcg_gen_addi_ptr(s->ptr0, cpu_env, v_offset); + tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); + tcg_gen_addi_ptr(s->ptr2, cpu_env, op1_offset); + fn(cpu_env, s->ptr0, s->ptr1, s->ptr2); break; case 0x050: /* movmskps */ CHECK_AVX_V0(s); @@ -3792,6 +3854,10 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, ot =3D mo_64_32(s->dflag); gen_ldst_modrm(env, s, modrm, ot, OR_TMP0, 0); op1_offset =3D ZMM_OFFSET(reg); + v_offset =3D ZMM_OFFSET(reg_v); + if (op1_offset !=3D v_offset) { + gen_op_movo(s, op1_offset, v_offset); + } tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); if (ot =3D=3D MO_32) { SSEFunc_0_epi sse_fn_epi =3D sse_op_table3ai[(b >> 8) & 1]; @@ -3881,6 +3947,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, s->rip_offset =3D 1; gen_ldst_modrm(env, s, modrm, MO_16, OR_TMP0, 0); val =3D x86_ldub_code(env, s); + if (reg !=3D reg_v) { + gen_op_movo(s, ZMM_OFFSET(reg), ZMM_OFFSET(reg_v)); + } if (b1) { val &=3D 7; tcg_gen_st16_tl(s->T0, cpu_env, @@ -3972,6 +4041,11 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, rm =3D modrm & 7; reg =3D ((modrm >> 3) & 7) | REX_R(s); mod =3D (modrm >> 6) & 3; + if (s->prefix & PREFIX_VEX) { + reg_v =3D s->vex_v; + } else { + reg_v =3D reg; + } =20 assert(b1 < 2); op6 =3D sse_op_table6[b]; @@ -4041,6 +4115,27 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, gen_ldo_env_A0(s, op2_offset); } } + tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); + tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); + if (!op6.fn[b1].op1) { + goto illegal_op; + } + if (op6.flags & SSE_OPF_V0) { + op6.fn[b1].op1(cpu_env, s->ptr0, s->ptr1); + } else { + v_offset =3D ZMM_OFFSET(reg_v); + tcg_gen_addi_ptr(s->ptr2, cpu_env, v_offset); + if (op6.flags & SSE_OPF_BLENDV) { + TCGv_ptr mask =3D tcg_temp_new_ptr(); + tcg_gen_addi_ptr(mask, cpu_env, ZMM_OFFSET(0)); + op6.fn[b1].op3(cpu_env, s->ptr0, s->ptr2, s->ptr1, + mask); + tcg_temp_free_ptr(mask); + } else { + SSEFunc_0_eppp fn =3D op6.fn[b1].op2; + fn(cpu_env, s->ptr0, s->ptr2, s->ptr1); + } + } } else { CHECK_NO_VEX(s); if ((op6.flags & SSE_OPF_MMX) =3D=3D 0) { @@ -4054,16 +4149,16 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, gen_lea_modrm(env, s, modrm); gen_ldq_env_A0(s, op2_offset); } + tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); + tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); + if (op6.flags & SSE_OPF_V0) { + op6.fn[0].op1(cpu_env, s->ptr0, s->ptr1); + } else { + op6.fn[0].op2(cpu_env, s->ptr0, s->ptr0, s->ptr1); + } } - if (!op6.op[b1]) { - goto illegal_op; - } - - tcg_gen_addi_ptr(s->ptr0, cpu_env, 
op1_offset);
-        tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
-        op6.op[b1](cpu_env, s->ptr0, s->ptr1);
 
-        if (b == 0x17) {
+        if (op6.flags & SSE_OPF_CMP) {
             set_cc_op(s, CC_OP_EFLAGS);
         }
         break;
@@ -4434,6 +4529,11 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         rm = modrm & 7;
         reg = ((modrm >> 3) & 7) | REX_R(s);
         mod = (modrm >> 6) & 3;
+        if (s->prefix & PREFIX_VEX) {
+            reg_v = s->vex_v;
+        } else {
+            reg_v = reg;
+        }
 
         assert(b1 < 2);
         op7 = sse_op_table7[b];
@@ -4521,6 +4621,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             break;
         case 0x20: /* pinsrb */
             CHECK_AVX_128(s);
+            if (reg != reg_v) {
+                gen_op_movo(s, ZMM_OFFSET(reg), ZMM_OFFSET(reg_v));
+            }
             if (mod == 3) {
                 gen_op_mov_v_reg(s, MO_32, s->T0, rm);
             } else {
@@ -4540,6 +4643,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0,
                                     s->mem_index, MO_LEUL);
             }
+            if (reg != reg_v) {
+                gen_op_movo(s, ZMM_OFFSET(reg), ZMM_OFFSET(reg_v));
+            }
             tcg_gen_st_i32(s->tmp2_i32, cpu_env,
                            offsetof(CPUX86State,xmm_regs[reg]
                                     .ZMM_L((val >> 4) & 3)));
@@ -4562,6 +4668,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             break;
         case 0x22:
             CHECK_AVX_128(s);
+            if (reg != reg_v) {
+                gen_op_movo(s, ZMM_OFFSET(reg), ZMM_OFFSET(reg_v));
+            }
             if (ot == MO_32) { /* pinsrd */
                 if (mod == 3) {
                     tcg_gen_trunc_tl_i32(s->tmp2_i32, cpu_regs[rm]);
@@ -4606,17 +4715,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             CHECK_AVX_V0(s);
         }
 
-        if (b1) {
-            op1_offset = ZMM_OFFSET(reg);
-            if (mod == 3) {
-                op2_offset = ZMM_OFFSET(rm | REX_B(s));
-            } else {
-                op2_offset = offsetof(CPUX86State,xmm_t0);
-                gen_lea_modrm(env, s, modrm);
-                gen_ldo_env_A0(s, op2_offset);
-            }
-        } else {
+        if (b1 == 0) {
             CHECK_NO_VEX(s);
+            /* MMX */
             if ((op7.flags & SSE_OPF_MMX) == 0) {
                 goto illegal_op;
             }
@@ -4628,9 +4729,29 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 gen_lea_modrm(env, s, modrm);
                 gen_ldq_env_A0(s, op2_offset);
             }
+            val = x86_ldub_code(env, s);
+            tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
+            tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
+
+            /* We only actually have one MMX instruction (palignr) */
+            assert(b == 0x0f);
+
+            op7.fn[0].op2(cpu_env, s->ptr0, s->ptr0, s->ptr1,
+                          tcg_const_i32(val));
+            break;
+        }
+
+        /* SSE */
+        op1_offset = ZMM_OFFSET(reg);
+        if (mod == 3) {
+            op2_offset = ZMM_OFFSET(rm | REX_B(s));
+        } else {
+            op2_offset = offsetof(CPUX86State, xmm_t0);
+            gen_lea_modrm(env, s, modrm);
+            gen_ldo_env_A0(s, op2_offset);
         }
-        val = x86_ldub_code(env, s);
 
+        val = x86_ldub_code(env, s);
         if ((b & 0xfc) == 0x60) { /* pcmpXstrX */
             set_cc_op(s, CC_OP_EFLAGS);
 
@@ -4640,9 +4761,32 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             }
         }
 
+        v_offset = ZMM_OFFSET(reg_v);
+        /*
+         * Populate the top part of the destination register for VEX
+         * encoded scalar operations
+         */
+        if (scalar_op && op1_offset != v_offset) {
+            if (b == 0x0a) { /* roundss */
+                gen_op_movl(s,
+                        offsetof(CPUX86State, xmm_regs[reg].ZMM_L(1)),
+                        offsetof(CPUX86State, xmm_regs[reg_v].ZMM_L(1)));
+            }
+            gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(1)),
+                        offsetof(CPUX86State, xmm_regs[reg_v].ZMM_Q(1)));
+        }
         tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
         tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
-        op7.op[b1](cpu_env, s->ptr0, s->ptr1, tcg_const_i32(val));
+        if (op7.flags & SSE_OPF_V0) {
+            op7.fn[b1].op1(cpu_env, s->ptr0, s->ptr1,
tcg_const_i32(va= l)); + } else { + tcg_gen_addi_ptr(s->ptr2, cpu_env, v_offset); + op7.fn[b1].op2(cpu_env, s->ptr0, s->ptr2, s->ptr1, + tcg_const_i32(val)); + } + if (op7.flags & SSE_OPF_CMP) { + set_cc_op(s, CC_OP_EFLAGS); + } break; =20 case 0x33a: @@ -4711,28 +4855,24 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, int sz =3D 4; =20 gen_lea_modrm(env, s, modrm); - op2_offset =3D offsetof(CPUX86State,xmm_t0); + op2_offset =3D offsetof(CPUX86State, xmm_t0); =20 - switch (b) { - case 0x50 ... 0x5a: - case 0x5c ... 0x5f: - case 0xc2: - /* Most sse scalar operations. */ - if (b1 =3D=3D 2) { - sz =3D 2; - } else if (b1 =3D=3D 3) { - sz =3D 3; - } - break; - - case 0x2e: /* ucomis[sd] */ - case 0x2f: /* comis[sd] */ - if (b1 =3D=3D 0) { - sz =3D 2; + if (sse_op.flags & SSE_OPF_SCALAR) { + if (sse_op.flags & SSE_OPF_CMP) { + /* ucomis[sd], comis[sd] */ + if (b1 =3D=3D 0) { + sz =3D 2; + } else { + sz =3D 3; + } } else { - sz =3D 3; + /* Most sse scalar operations. */ + if (b1 =3D=3D 2) { + sz =3D 2; + } else if (b1 =3D=3D 3) { + sz =3D 3; + } } - break; } =20 switch (sz) { @@ -4740,13 +4880,13 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, /* 32 bit access */ gen_op_ld_v(s, MO_32, s->T0, s->A0); tcg_gen_st32_tl(s->T0, cpu_env, - offsetof(CPUX86State,xmm_t0.ZMM_L(0))); + offsetof(CPUX86State, xmm_t0.ZMM_L(0))= ); break; case 3: /* 64 bit access */ gen_ldq_env_A0(s, offsetof(CPUX86State, xmm_t0.ZMM_D(0= ))); break; - default: + case 4: /* 128 bit access */ gen_ldo_env_A0(s, op2_offset); break; @@ -4755,8 +4895,10 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, rm =3D (modrm & 7) | REX_B(s); op2_offset =3D ZMM_OFFSET(rm); } + v_offset =3D ZMM_OFFSET(reg_v); } else { CHECK_NO_VEX(s); + scalar_op =3D 0; op1_offset =3D offsetof(CPUX86State,fpregs[reg].mmx); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); @@ -4778,47 +4920,85 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, op_3dnow(cpu_env, s->ptr0, s->ptr1); return; } + v_offset =3D op1_offset; } - switch(b) { - case 0x70: /* pshufx insn */ - case 0xc6: /* pshufx insn */ - val =3D x86_ldub_code(env, s); - tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); - tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); - /* XXX: introduce a new table? 
*/
-            sse_fn_ppi = (SSEFunc_0_ppi)sse_fn_epp;
-            sse_fn_ppi(s->ptr0, s->ptr1, tcg_const_i32(val));
-            break;
-        case 0xc2:
-            /* compare insns, bits 7:3 (7:5 for AVX) are ignored */
-            val = x86_ldub_code(env, s) & 7;
-            sse_fn_epp = sse_op_table4[val][b1];
 
-            tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
-            tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
-            sse_fn_epp(cpu_env, s->ptr0, s->ptr1);
-            break;
-        case 0xf7:
-            /* maskmov : we must prepare A0 */
-            if (mod != 3)
-                goto illegal_op;
-            tcg_gen_mov_tl(s->A0, cpu_regs[R_EDI]);
-            gen_extu(s->aflag, s->A0);
-            gen_add_A0_ds_seg(s);
+        /*
+         * Populate the top part of the destination register for VEX
+         * encoded scalar operations
+         */
+        if (scalar_op && op1_offset != v_offset) {
+            if (b == 0x5a) {
+                /*
+                 * Scalar conversions are tricky because the src and dest
+                 * may be different sizes
+                 */
+                if (op1_offset == op2_offset) {
+                    /*
+                     * The second source operand overlaps the
+                     * destination, so we need to copy the value
+                     */
+                    op2_offset = offsetof(CPUX86State, xmm_t0);
+                    gen_op_movq(s, op2_offset, op1_offset);
+                }
+                gen_op_movo(s, op1_offset, v_offset);
+            } else {
+                if (b1 == 2) { /* ss */
+                    gen_op_movl(s,
+                        offsetof(CPUX86State, xmm_regs[reg].ZMM_L(1)),
+                        offsetof(CPUX86State, xmm_regs[reg_v].ZMM_L(1)));
+                }
+                gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(1)),
+                            offsetof(CPUX86State, xmm_regs[reg_v].ZMM_Q(1)));
+            }
+        }
 
-            tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
-            tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
-            /* XXX: introduce a new table? */
-            sse_fn_eppt = (SSEFunc_0_eppt)sse_fn_epp;
-            sse_fn_eppt(cpu_env, s->ptr0, s->ptr1, s->A0);
-            break;
-        default:
-            tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
-            tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
-            sse_fn_epp(cpu_env, s->ptr0, s->ptr1);
-            break;
+        tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
+        tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
+        if (sse_op.flags & SSE_OPF_V0) {
+            if (sse_op.flags & SSE_OPF_SHUF) {
+                val = x86_ldub_code(env, s);
+                sse_op.fn[b1].op1i(s->ptr0, s->ptr1, tcg_const_i32(val));
+            } else if (b == 0xf7) {
+                /* maskmov : we must prepare A0 */
+                if (mod != 3) {
+                    goto illegal_op;
+                }
+                tcg_gen_mov_tl(s->A0, cpu_regs[R_EDI]);
+                gen_extu(s->aflag, s->A0);
+                gen_add_A0_ds_seg(s);
+
+                tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
+                tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
+                sse_op.fn[b1].op1t(cpu_env, s->ptr0, s->ptr1, s->A0);
+                /* Does not write to the first operand */
+                return;
+            } else {
+                sse_op.fn[b1].op1(cpu_env, s->ptr0, s->ptr1);
+            }
+        } else {
+            tcg_gen_addi_ptr(s->ptr2, cpu_env, v_offset);
+            if (sse_op.flags & SSE_OPF_SHUF) {
+                val = x86_ldub_code(env, s);
+                sse_op.fn[b1].op2i(s->ptr0, s->ptr2, s->ptr1,
+                                   tcg_const_i32(val));
+            } else {
+                SSEFunc_0_eppp fn = sse_op.fn[b1].op2;
+                if (b == 0xc2) {
+                    /* compare insns */
+                    val = x86_ldub_code(env, s);
+                    if (s->prefix & PREFIX_VEX) {
+                        val &= 0x1f;
+                    } else {
+                        val &= 7;
+                    }
+                    fn = sse_op_table4[val][b1];
+                }
+                fn(cpu_env, s->ptr0, s->ptr2, s->ptr1);
+            }
         }
-        if (b == 0x2e || b == 0x2f) {
+
+        if (sse_op.flags & SSE_OPF_CMP) {
             set_cc_op(s, CC_OP_EFLAGS);
         }
     }
@@ -8900,6 +9080,7 @@ static void i386_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cpu)
     dc->tmp4 = tcg_temp_new();
     dc->ptr0 = tcg_temp_new_ptr();
     dc->ptr1 = tcg_temp_new_ptr();
+    dc->ptr2 = tcg_temp_new_ptr();
     dc->cc_srcT = tcg_temp_local_new();
 }
 
-- 
2.36.0

From nobody Thu May 2 14:26:59 2024
From: Paul Brook
To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
Cc: "open list:All patches CC here", Paul Brook
Subject: [PATCH v2 26/42] i386: Utility function for 128 bit AVX
Date: Sun, 24 Apr 2022 23:01:48 +0100
Message-Id: <20220424220204.2493824-27-paul@nowt.org>
In-Reply-To: <20220418173904.3746036-1-paul@nowt.org>
References: <20220418173904.3746036-1-paul@nowt.org>

VEX encoded instructions that write to a (128 bit) xmm register clear the
rest (upper half) of the corresponding (256 bit) ymm register. When legacy
SSE encodings are used, the rest of the ymm register is left unchanged.
Add a utility function so that we don't have to keep duplicating this
logic.
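The architectural rule being implemented here can be summarised in a few
lines of C. This is a minimal illustrative sketch, not QEMU code; the type
and function names are invented for the example:

#include <stdint.h>

/* Illustrative model: one 256 bit ymm register as four 64 bit lanes */
typedef struct {
    uint64_t q[4];
} YmmRegModel;

/* A VEX.128 encoded write zero-extends: bits 255:128 are cleared */
static void vex128_write(YmmRegModel *r, uint64_t lo, uint64_t hi)
{
    r->q[0] = lo;
    r->q[1] = hi;
    r->q[2] = 0;        /* the clearing that gen_clear_ymmh performs */
    r->q[3] = 0;
}

/* A legacy SSE write merges: bits 255:128 are left unchanged */
static void sse_write(YmmRegModel *r, uint64_t lo, uint64_t hi)
{
    r->q[0] = lo;
    r->q[1] = hi;
    /* q[2] and q[3] deliberately untouched */
}

The helper added below emits exactly the two upper-lane stores of the VEX
case, guarded on PREFIX_VEX so the legacy SSE path is left alone.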
Signed-off-by: Paul Brook
---
 target/i386/tcg/translate.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index d148a2319d..278ed8ed1c 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2780,6 +2780,18 @@ static inline void gen_op_movq_env_0(DisasContext *s, int d_offset)
 
 #define ZMM_OFFSET(reg) offsetof(CPUX86State, xmm_regs[reg])
 
+/*
+ * Clear the top half of the ymm register after a VEX.128 instruction
+ * This could be optimized by tracking this in env->hflags
+ */
+static void gen_clear_ymmh(DisasContext *s, int reg)
+{
+    if (s->prefix & PREFIX_VEX) {
+        gen_op_movq_env_0(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(2)));
+        gen_op_movq_env_0(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(3)));
+    }
+}
+
 typedef void (*SSEFunc_i_ep)(TCGv_i32 val, TCGv_ptr env, TCGv_ptr reg);
 typedef void (*SSEFunc_l_ep)(TCGv_i64 val, TCGv_ptr env, TCGv_ptr reg);
 typedef void (*SSEFunc_0_epi)(TCGv_ptr env, TCGv_ptr reg, TCGv_i32 val);
-- 
2.36.0

From nobody Thu May 2 14:26:59 2024
From: Paul Brook
To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
Cc: "open list:All patches CC here", Paul Brook
Subject: [PATCH v2 27/42] i386: Translate 256 bit AVX instructions
Date: Sun, 24 Apr 2022 23:01:49 +0100
Message-Id: <20220424220204.2493824-28-paul@nowt.org>
In-Reply-To: <20220418173904.3746036-1-paul@nowt.org>
References: <20220418173904.3746036-1-paul@nowt.org>

All the work for the helper functions is already done; we just need to
build them, and make a few macro tweaks to populate the lookup tables.

For sse_op_table6 and sse_op_table7 we use #defines to fill in the
entries where an opcode only supports one vector size, rather than
complicating the main table.

Several of the open-coded mov type instructions need special handling,
but most of the rest falls out from the infrastructure we already added.

Also clear the top half of the register after 128 bit VEX register
writes. In the current code this correlates with VEX.L == 0, but there
are exceptions later.
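The "build them" step relies on the width-parameterised template trick
visible in the helper.h and fpu_helper.c hunks below: the same helper
sources are included once per vector width with a different SHIFT each
time. A stripped-down sketch of the pattern, with invented file and type
names (the real templates are ops_sse.h and ops_sse_header.h):

/* vec_template.h: a stand-in for ops_sse.h (illustrative, simplified) */
#if SHIFT == 0
#define SUFFIX _mmx
#define LANES  2          /* 64 bit register, two 32 bit lanes */
#elif SHIFT == 1
#define SUFFIX _xmm
#define LANES  4          /* 128 bit */
#else
#define SUFFIX _ymm
#define LANES  8          /* 256 bit */
#endif

/* Expands to helper_paddl_mmx / _xmm / _ymm depending on SHIFT */
static void glue(helper_paddl, SUFFIX)(VecReg *d, VecReg *v, VecReg *s)
{
    for (int i = 0; i < LANES; i++) {
        d->l[i] = v->l[i] + s->l[i];
    }
}

#undef SUFFIX
#undef LANES
#undef SHIFT

/* vec_helpers.c: instantiates the template once per vector width */
#include <stdint.h>
#define xglue(x, y) x##y
#define glue(x, y)  xglue(x, y)
typedef struct { uint32_t l[8]; } VecReg;  /* sized for the widest case */

#define SHIFT 0
#include "vec_template.h"
#define SHIFT 1
#include "vec_template.h"
#define SHIFT 2            /* the new 256 bit instantiation */
#include "vec_template.h"

With SHIFT == 2 the template expands each helper over eight lanes instead
of four, which is all the patch needs to obtain the _ymm helper variants.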
Signed-off-by: Paul Brook
---
 target/i386/helper.h         |   2 +
 target/i386/tcg/fpu_helper.c |   3 +
 target/i386/tcg/translate.c  | 370 +++++++++++++++++++++++++++++------
 3 files changed, 319 insertions(+), 56 deletions(-)

diff --git a/target/i386/helper.h b/target/i386/helper.h
index ac3b4d1ee3..3da5df98b9 100644
--- a/target/i386/helper.h
+++ b/target/i386/helper.h
@@ -218,6 +218,8 @@ DEF_HELPER_3(movq, void, env, ptr, ptr)
 #include "ops_sse_header.h"
 #define SHIFT 1
 #include "ops_sse_header.h"
+#define SHIFT 2
+#include "ops_sse_header.h"
 
 DEF_HELPER_3(rclb, tl, env, tl, tl)
 DEF_HELPER_3(rclw, tl, env, tl, tl)

diff --git a/target/i386/tcg/fpu_helper.c b/target/i386/tcg/fpu_helper.c
index b391b69635..74cf86c986 100644
--- a/target/i386/tcg/fpu_helper.c
+++ b/target/i386/tcg/fpu_helper.c
@@ -3053,3 +3053,6 @@ void helper_movq(CPUX86State *env, void *d, void *s)
 
 #define SHIFT 1
 #include "ops_sse.h"
+
+#define SHIFT 2
+#include "ops_sse.h"

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 278ed8ed1c..bcd6d47fd0 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2742,6 +2742,29 @@ static inline void gen_ldo_env_A0(DisasContext *s, int offset)
     tcg_gen_st_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(1)));
 }
 
+static inline void gen_ldo_env_A0_ymmh(DisasContext *s, int offset)
+{
+    int mem_index = s->mem_index;
+    tcg_gen_qemu_ld_i64(s->tmp1_i64, s->A0, mem_index, MO_LEUQ);
+    tcg_gen_st_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(2)));
+    tcg_gen_addi_tl(s->tmp0, s->A0, 8);
+    tcg_gen_qemu_ld_i64(s->tmp1_i64, s->tmp0, mem_index, MO_LEUQ);
+    tcg_gen_st_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(3)));
+}
+
+/* Load 256-bit ymm register value */
+static inline void gen_ldy_env_A0(DisasContext *s, int offset)
+{
+    int mem_index = s->mem_index;
+    gen_ldo_env_A0(s, offset);
+    tcg_gen_addi_tl(s->tmp0, s->A0, 16);
+    tcg_gen_qemu_ld_i64(s->tmp1_i64, s->tmp0, mem_index, MO_LEUQ);
+    tcg_gen_st_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(2)));
+    tcg_gen_addi_tl(s->tmp0, s->A0, 24);
+    tcg_gen_qemu_ld_i64(s->tmp1_i64, s->tmp0, mem_index, MO_LEUQ);
+    tcg_gen_st_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(3)));
+}
+
 static inline void gen_sto_env_A0(DisasContext *s, int offset)
 {
     int mem_index = s->mem_index;
@@ -2752,6 +2775,29 @@ static inline void gen_sto_env_A0(DisasContext *s, int offset)
     tcg_gen_qemu_st_i64(s->tmp1_i64, s->tmp0, mem_index, MO_LEUQ);
 }
 
+static inline void gen_sto_env_A0_ymmh(DisasContext *s, int offset)
+{
+    int mem_index = s->mem_index;
+
tcg_gen_ld_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(2= ))); + tcg_gen_qemu_st_i64(s->tmp1_i64, s->A0, mem_index, MO_LEUQ); + tcg_gen_addi_tl(s->tmp0, s->A0, 8); + tcg_gen_ld_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(3= ))); + tcg_gen_qemu_st_i64(s->tmp1_i64, s->tmp0, mem_index, MO_LEUQ); +} + +/* Store 256-bit ymm register value */ +static inline void gen_sty_env_A0(DisasContext *s, int offset) +{ + int mem_index =3D s->mem_index; + gen_sto_env_A0(s, offset); + tcg_gen_addi_tl(s->tmp0, s->A0, 16); + tcg_gen_ld_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(2= ))); + tcg_gen_qemu_st_i64(s->tmp1_i64, s->tmp0, mem_index, MO_LEUQ); + tcg_gen_addi_tl(s->tmp0, s->A0, 24); + tcg_gen_ld_i64(s->tmp1_i64, cpu_env, offset + offsetof(ZMMReg, ZMM_Q(3= ))); + tcg_gen_qemu_st_i64(s->tmp1_i64, s->tmp0, mem_index, MO_LEUQ); +} + static inline void gen_op_movo(DisasContext *s, int d_offset, int s_offset) { tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset + offsetof(ZMMReg, ZMM_Q= (0))); @@ -2760,6 +2806,14 @@ static inline void gen_op_movo(DisasContext *s, int = d_offset, int s_offset) tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset + offsetof(ZMMReg, ZMM_Q= (1))); } =20 +static inline void gen_op_movo_ymmh(DisasContext *s, int d_offset, int s_o= ffset) +{ + tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset + offsetof(ZMMReg, ZMM_Q= (2))); + tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset + offsetof(ZMMReg, ZMM_Q= (2))); + tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset + offsetof(ZMMReg, ZMM_Q= (3))); + tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset + offsetof(ZMMReg, ZMM_Q= (3))); +} + static inline void gen_op_movq(DisasContext *s, int d_offset, int s_offset) { tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset); @@ -2823,17 +2877,21 @@ typedef void (*SSEFunc_0_epppt)(TCGv_ptr env, TCGv_= ptr reg_a, TCGv_ptr reg_b, #define SSE_OPF_AVX2 (1 << 7) /* AVX2 instruction */ #define SSE_OPF_SHUF (1 << 9) /* pshufx/shufpx */ =20 -#define OP(op, flags, a, b, c, d) \ - {flags, {{.op =3D a}, {.op =3D b}, {.op =3D c}, {.op =3D d} } } +#define OP(op, flags, a, b, c, d, e, f, g, h) \ + {flags, {{.op =3D a}, {.op =3D b}, {.op =3D c}, {.op =3D d}, \ + {.op =3D e}, {.op =3D f}, {.op =3D g}, {.op =3D h} } } =20 #define MMX_OP(x) OP(op2, SSE_OPF_MMX, \ - gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm, NULL, NULL) + gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm, NULL, NULL, \ + NULL, gen_helper_ ## x ## _ymm, NULL, NULL) =20 #define SSE_FOP(name) OP(op2, SSE_OPF_SCALAR, \ - gen_helper_##name##ps##_xmm, gen_helper_##name##pd##_xmm, \ - gen_helper_##name##ss, gen_helper_##name##sd) + gen_helper_##name##ps_xmm, gen_helper_##name##pd_xmm, \ + gen_helper_##name##ss, gen_helper_##name##sd, \ + gen_helper_##name##ps_ymm, gen_helper_##name##pd_ymm, NULL, NULL) #define SSE_OP(sname, dname, op, flags) OP(op, flags, \ - gen_helper_##sname##_xmm, gen_helper_##dname##_xmm, NULL, NULL) + gen_helper_##sname##_xmm, gen_helper_##dname##_xmm, NULL, NULL, \ + gen_helper_##sname##_ymm, gen_helper_##dname##_ymm, NULL, NULL) =20 struct SSEOpHelper_table1 { int flags; @@ -2843,7 +2901,7 @@ struct SSEOpHelper_table1 { SSEFunc_0_eppt op1t; SSEFunc_0_eppp op2; SSEFunc_0_pppi op2i; - } fn[4]; + } fn[8]; }; =20 #define SSE_3DNOW { SSE_OPF_3DNOW } @@ -2870,17 +2928,22 @@ static const struct SSEOpHelper_table1 sse_op_table= 1[256] =3D { [0x2c] =3D SSE_SPECIAL, /* cvttps2pi, cvttpd2pi, cvttsd2si, cvttss2si = */ [0x2d] =3D SSE_SPECIAL, /* cvtps2pi, cvtpd2pi, cvtsd2si, cvtss2si */ [0x2e] =3D OP(op1, SSE_OPF_CMP | 
SSE_OPF_SCALAR | SSE_OPF_V0, - gen_helper_ucomiss, gen_helper_ucomisd, NULL, NULL), + gen_helper_ucomiss, gen_helper_ucomisd, NULL, NULL, + NULL, NULL, NULL, NULL), [0x2f] =3D OP(op1, SSE_OPF_CMP | SSE_OPF_SCALAR | SSE_OPF_V0, - gen_helper_comiss, gen_helper_comisd, NULL, NULL), + gen_helper_comiss, gen_helper_comisd, NULL, NULL, + NULL, NULL, NULL, NULL), [0x50] =3D SSE_SPECIAL, /* movmskps, movmskpd */ [0x51] =3D OP(op1, SSE_OPF_SCALAR | SSE_OPF_V0, gen_helper_sqrtps_xmm, gen_helper_sqrtpd_xmm, - gen_helper_sqrtss, gen_helper_sqrtsd), + gen_helper_sqrtss, gen_helper_sqrtsd, + gen_helper_sqrtps_ymm, gen_helper_sqrtpd_ymm, NULL, NULL), [0x52] =3D OP(op1, SSE_OPF_SCALAR | SSE_OPF_V0, - gen_helper_rsqrtps_xmm, NULL, gen_helper_rsqrtss, NULL), + gen_helper_rsqrtps_xmm, NULL, gen_helper_rsqrtss, NULL, + gen_helper_rsqrtps_ymm, NULL, NULL, NULL), [0x53] =3D OP(op1, SSE_OPF_SCALAR | SSE_OPF_V0, - gen_helper_rcpps_xmm, NULL, gen_helper_rcpss, NULL), + gen_helper_rcpps_xmm, NULL, gen_helper_rcpss, NULL, + gen_helper_rcpps_ymm, NULL, NULL, NULL), [0x54] =3D SSE_OP(pand, pand, op2, 0), /* andps, andpd */ [0x55] =3D SSE_OP(pandn, pandn, op2, 0), /* andnps, andnpd */ [0x56] =3D SSE_OP(por, por, op2, 0), /* orps, orpd */ @@ -2889,10 +2952,13 @@ static const struct SSEOpHelper_table1 sse_op_table= 1[256] =3D { [0x59] =3D SSE_FOP(mul), [0x5a] =3D OP(op1, SSE_OPF_SCALAR | SSE_OPF_V0, gen_helper_cvtps2pd_xmm, gen_helper_cvtpd2ps_xmm, - gen_helper_cvtss2sd, gen_helper_cvtsd2ss), + gen_helper_cvtss2sd, gen_helper_cvtsd2ss, + gen_helper_cvtps2pd_ymm, gen_helper_cvtpd2ps_ymm, NULL, NU= LL), [0x5b] =3D OP(op1, SSE_OPF_V0, gen_helper_cvtdq2ps_xmm, gen_helper_cvtps2dq_xmm, - gen_helper_cvttps2dq_xmm, NULL), + gen_helper_cvttps2dq_xmm, NULL, + gen_helper_cvtdq2ps_ymm, gen_helper_cvtps2dq_ymm, + gen_helper_cvttps2dq_ymm, NULL), [0x5c] =3D SSE_FOP(sub), [0x5d] =3D SSE_FOP(min), [0x5e] =3D SSE_FOP(div), @@ -2919,14 +2985,18 @@ static const struct SSEOpHelper_table1 sse_op_table= 1[256] =3D { [0x6a] =3D MMX_OP(punpckhdq), [0x6b] =3D MMX_OP(packssdw), [0x6c] =3D OP(op2, SSE_OPF_MMX, - NULL, gen_helper_punpcklqdq_xmm, NULL, NULL), + NULL, gen_helper_punpcklqdq_xmm, NULL, NULL, + NULL, gen_helper_punpcklqdq_ymm, NULL, NULL), [0x6d] =3D OP(op2, SSE_OPF_MMX, - NULL, gen_helper_punpckhqdq_xmm, NULL, NULL), + NULL, gen_helper_punpckhqdq_xmm, NULL, NULL, + NULL, gen_helper_punpckhqdq_ymm, NULL, NULL), [0x6e] =3D SSE_SPECIAL, /* movd mm, ea */ [0x6f] =3D SSE_SPECIAL, /* movq, movdqa, , movqdu */ [0x70] =3D OP(op1i, SSE_OPF_SHUF | SSE_OPF_MMX | SSE_OPF_V0, gen_helper_pshufw_mmx, gen_helper_pshufd_xmm, - gen_helper_pshufhw_xmm, gen_helper_pshuflw_xmm), + gen_helper_pshufhw_xmm, gen_helper_pshuflw_xmm, + NULL, gen_helper_pshufd_ymm, + gen_helper_pshufhw_ymm, gen_helper_pshuflw_ymm), [0x71] =3D SSE_SPECIAL, /* shiftw */ [0x72] =3D SSE_SPECIAL, /* shiftd */ [0x73] =3D SSE_SPECIAL, /* shiftq */ @@ -2936,17 +3006,21 @@ static const struct SSEOpHelper_table1 sse_op_table= 1[256] =3D { [0x77] =3D SSE_SPECIAL, /* emms */ [0x78] =3D SSE_SPECIAL, /* extrq_i, insertq_i (sse4a) */ [0x79] =3D OP(op1, SSE_OPF_V0, - NULL, gen_helper_extrq_r, NULL, gen_helper_insertq_r), + NULL, gen_helper_extrq_r, NULL, gen_helper_insertq_r, + NULL, NULL, NULL, NULL), [0x7c] =3D OP(op2, 0, - NULL, gen_helper_haddpd_xmm, NULL, gen_helper_haddps_xmm), + NULL, gen_helper_haddpd_xmm, NULL, gen_helper_haddps_xmm, + NULL, gen_helper_haddpd_ymm, NULL, gen_helper_haddps_ymm), [0x7d] =3D OP(op2, 0, - NULL, gen_helper_hsubpd_xmm, NULL, gen_helper_hsubps_xmm), + NULL, 
gen_helper_hsubpd_xmm, NULL, gen_helper_hsubps_xmm, + NULL, gen_helper_hsubpd_ymm, NULL, gen_helper_hsubps_ymm), [0x7e] =3D SSE_SPECIAL, /* movd, movd, , movq */ [0x7f] =3D SSE_SPECIAL, /* movq, movdqa, movdqu */ [0xc4] =3D SSE_SPECIAL, /* pinsrw */ [0xc5] =3D SSE_SPECIAL, /* pextrw */ [0xd0] =3D OP(op2, 0, - NULL, gen_helper_addsubpd_xmm, NULL, gen_helper_addsubps_x= mm), + NULL, gen_helper_addsubpd_xmm, NULL, gen_helper_addsubps_x= mm, + NULL, gen_helper_addsubpd_ymm, NULL, gen_helper_addsubps_y= mm), [0xd1] =3D MMX_OP(psrlw), [0xd2] =3D MMX_OP(psrld), [0xd3] =3D MMX_OP(psrlq), @@ -2970,7 +3044,9 @@ static const struct SSEOpHelper_table1 sse_op_table1[= 256] =3D { [0xe5] =3D MMX_OP(pmulhw), [0xe6] =3D OP(op1, SSE_OPF_V0, NULL, gen_helper_cvttpd2dq_xmm, - gen_helper_cvtdq2pd_xmm, gen_helper_cvtpd2dq_xmm), + gen_helper_cvtdq2pd_xmm, gen_helper_cvtpd2dq_xmm, + NULL, gen_helper_cvttpd2dq_ymm, + gen_helper_cvtdq2pd_ymm, gen_helper_cvtpd2dq_ymm), [0xe7] =3D SSE_SPECIAL, /* movntq, movntq */ [0xe8] =3D MMX_OP(psubsb), [0xe9] =3D MMX_OP(psubsw), @@ -2988,7 +3064,8 @@ static const struct SSEOpHelper_table1 sse_op_table1[= 256] =3D { [0xf5] =3D MMX_OP(pmaddwd), [0xf6] =3D MMX_OP(psadbw), [0xf7] =3D OP(op1t, SSE_OPF_MMX | SSE_OPF_V0, - gen_helper_maskmov_mmx, gen_helper_maskmov_xmm, NULL, NULL= ), + gen_helper_maskmov_mmx, gen_helper_maskmov_xmm, NULL, NULL, + NULL, NULL, NULL, NULL), [0xf8] =3D MMX_OP(psubb), [0xf9] =3D MMX_OP(psubw), [0xfa] =3D MMX_OP(psubl), @@ -3003,9 +3080,9 @@ static const struct SSEOpHelper_table1 sse_op_table1[= 256] =3D { #undef SSE_OP #undef SSE_SPECIAL =20 -#define MMX_OP2(x) { gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm } - -static const SSEFunc_0_eppp sse_op_table2[3 * 8][2] =3D { +#define MMX_OP2(x) { gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm, \ + gen_helper_ ## x ## _ymm} +static const SSEFunc_0_eppp sse_op_table2[3 * 8][3] =3D { [0 + 2] =3D MMX_OP2(psrlw), [0 + 4] =3D MMX_OP2(psraw), [0 + 6] =3D MMX_OP2(psllw), @@ -3013,9 +3090,9 @@ static const SSEFunc_0_eppp sse_op_table2[3 * 8][2] = =3D { [8 + 4] =3D MMX_OP2(psrad), [8 + 6] =3D MMX_OP2(pslld), [16 + 2] =3D MMX_OP2(psrlq), - [16 + 3] =3D { NULL, gen_helper_psrldq_xmm }, + [16 + 3] =3D { NULL, gen_helper_psrldq_xmm, gen_helper_psrldq_ymm}, [16 + 6] =3D MMX_OP2(psllq), - [16 + 7] =3D { NULL, gen_helper_pslldq_xmm }, + [16 + 7] =3D { NULL, gen_helper_pslldq_xmm, gen_helper_pslldq_ymm}, }; #undef MMX_OP2 =20 @@ -3049,8 +3126,9 @@ static const SSEFunc_l_ep sse_op_table3bq[] =3D { =20 #define SSE_CMP(x) { \ gen_helper_ ## x ## ps ## _xmm, gen_helper_ ## x ## pd ## _xmm, \ - gen_helper_ ## x ## ss, gen_helper_ ## x ## sd} -static const SSEFunc_0_eppp sse_op_table4[32][4] =3D { + gen_helper_ ## x ## ss, gen_helper_ ## x ## sd, \ + gen_helper_ ## x ## ps ## _ymm, gen_helper_ ## x ## pd ## _ymm} +static const SSEFunc_0_eppp sse_op_table4[32][6] =3D { SSE_CMP(cmpeq), SSE_CMP(cmplt), SSE_CMP(cmple), @@ -3126,7 +3204,7 @@ struct SSEOpHelper_table6 { SSEFunc_0_epp op1; SSEFunc_0_eppp op2; SSEFunc_0_epppp op3; - } fn[2]; + } fn[3]; /* [0] =3D mmx, [1] =3D xmm, fn[2] =3D ymm */ uint32_t ext_mask; int flags; }; @@ -3136,16 +3214,17 @@ struct SSEOpHelper_table7 { SSEFunc_0_eppi op1; SSEFunc_0_epppi op2; SSEFunc_0_epppp op3; - } fn[2]; + } fn[3]; /* [0] =3D mmx, [1] =3D xmm, fn[2] =3D ymm */ uint32_t ext_mask; int flags; }; =20 #define gen_helper_special_xmm NULL +#define gen_helper_special_ymm NULL =20 #define OP(name, op, flags, ext, mmx_name) \ - {{{.op =3D mmx_name}, {.op =3D gen_helper_ ## name ## _xmm} }, \ - 
CPUID_EXT_ ## ext, flags} + {{{.op =3D mmx_name}, {.op =3D gen_helper_ ## name ## _xmm}, \ + {.op =3D gen_helper_ ## name ## _ymm} }, CPUID_EXT_ ## ext, flags} #define BINARY_OP_MMX(name, ext) \ OP(name, op2, SSE_OPF_MMX, ext, gen_helper_ ## name ## _mmx) #define BINARY_OP(name, ext, flags) \ @@ -3205,7 +3284,9 @@ static const struct SSEOpHelper_table6 sse_op_table6[= 256] =3D { [0x3e] =3D BINARY_OP(pmaxuw, SSE41, SSE_OPF_MMX), [0x3f] =3D BINARY_OP(pmaxud, SSE41, SSE_OPF_MMX), [0x40] =3D BINARY_OP(pmulld, SSE41, SSE_OPF_MMX), +#define gen_helper_phminposuw_ymm NULL [0x41] =3D UNARY_OP(phminposuw, SSE41, 0), +#define gen_helper_aesimc_ymm NULL [0xdb] =3D UNARY_OP(aesimc, AES, 0), [0xdc] =3D BINARY_OP(aesenc, AES, 0), [0xdd] =3D BINARY_OP(aesenclast, AES, 0), @@ -3217,7 +3298,9 @@ static const struct SSEOpHelper_table6 sse_op_table6[= 256] =3D { static const struct SSEOpHelper_table7 sse_op_table7[256] =3D { [0x08] =3D UNARY_OP(roundps, SSE41, 0), [0x09] =3D UNARY_OP(roundpd, SSE41, 0), +#define gen_helper_roundss_ymm NULL [0x0a] =3D UNARY_OP(roundss, SSE41, SSE_OPF_SCALAR), +#define gen_helper_roundsd_ymm NULL [0x0b] =3D UNARY_OP(roundsd, SSE41, SSE_OPF_SCALAR), [0x0c] =3D BINARY_OP(blendps, SSE41, 0), [0x0d] =3D BINARY_OP(blendpd, SSE41, 0), @@ -3231,13 +3314,19 @@ static const struct SSEOpHelper_table7 sse_op_table= 7[256] =3D { [0x21] =3D SPECIAL_OP(SSE41), /* insertps */ [0x22] =3D SPECIAL_OP(SSE41), /* pinsrd/pinsrq */ [0x40] =3D BINARY_OP(dpps, SSE41, 0), +#define gen_helper_dppd_ymm NULL [0x41] =3D BINARY_OP(dppd, SSE41, 0), [0x42] =3D BINARY_OP(mpsadbw, SSE41, SSE_OPF_MMX), [0x44] =3D BINARY_OP(pclmulqdq, PCLMULQDQ, 0), +#define gen_helper_pcmpestrm_ymm NULL [0x60] =3D CMP_OP(pcmpestrm, SSE42), +#define gen_helper_pcmpestri_ymm NULL [0x61] =3D CMP_OP(pcmpestri, SSE42), +#define gen_helper_pcmpistrm_ymm NULL [0x62] =3D CMP_OP(pcmpistrm, SSE42), +#define gen_helper_pcmpistri_ymm NULL [0x63] =3D CMP_OP(pcmpistri, SSE42), +#define gen_helper_aeskeygenassist_ymm NULL [0xdf] =3D UNARY_OP(aeskeygenassist, AES, 0), }; =20 @@ -3405,14 +3494,23 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, if (mod =3D=3D 3) goto illegal_op; gen_lea_modrm(env, s, modrm); - gen_sto_env_A0(s, ZMM_OFFSET(reg)); + if (s->vex_l) { + gen_sty_env_A0(s, ZMM_OFFSET(reg)); + } else { + gen_sto_env_A0(s, ZMM_OFFSET(reg)); + } break; case 0x3f0: /* lddqu */ CHECK_AVX_V0(s); if (mod =3D=3D 3) goto illegal_op; gen_lea_modrm(env, s, modrm); - gen_ldo_env_A0(s, ZMM_OFFSET(reg)); + if (s->vex_l) { + gen_ldy_env_A0(s, ZMM_OFFSET(reg)); + } else { + gen_ldo_env_A0(s, ZMM_OFFSET(reg)); + gen_clear_ymmh(s, reg); + } break; case 0x22b: /* movntss */ case 0x32b: /* movntsd */ @@ -3461,6 +3559,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0); gen_helper_movl_mm_T0_xmm(s->ptr0, s->tmp2_i32); } + gen_clear_ymmh(s, reg); break; case 0x6f: /* movq mm, ea */ CHECK_NO_VEX(s); @@ -3484,10 +3583,20 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, CHECK_AVX_V0(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); - gen_ldo_env_A0(s, ZMM_OFFSET(reg)); + if (s->vex_l) { + gen_ldy_env_A0(s, ZMM_OFFSET(reg)); + } else { + gen_ldo_env_A0(s, ZMM_OFFSET(reg)); + } } else { rm =3D (modrm & 7) | REX_B(s); gen_op_movo(s, ZMM_OFFSET(reg), ZMM_OFFSET(rm)); + if (s->vex_l) { + gen_op_movo_ymmh(s, ZMM_OFFSET(reg), ZMM_OFFSET(rm)); + } + } + if (!s->vex_l) { + gen_clear_ymmh(s, reg); } break; case 0x210: /* movss xmm, ea */ @@ -3515,6 +3624,7 @@ static void 
gen_sse(CPUX86State *env, DisasContext *s= , int b, tcg_gen_st_i32(s->tmp2_i32, cpu_env, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0= ))); } + gen_clear_ymmh(s, reg); break; case 0x310: /* movsd xmm, ea */ if (mod !=3D 3) { @@ -3538,6 +3648,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0= )), offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(0))); } + gen_clear_ymmh(s, reg); break; case 0x012: /* movlps */ case 0x112: /* movlpd */ @@ -3556,23 +3667,44 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(1= )), offsetof(CPUX86State, xmm_regs[reg_v].ZMM_Q(1)= )); } + gen_clear_ymmh(s, reg); break; case 0x212: /* movsldup */ CHECK_AVX_V0(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); - gen_ldo_env_A0(s, ZMM_OFFSET(reg)); + if (s->vex_l) { + gen_ldy_env_A0(s, ZMM_OFFSET(reg)); + } else { + gen_ldo_env_A0(s, ZMM_OFFSET(reg)); + } } else { rm =3D (modrm & 7) | REX_B(s); gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0= )), offsetof(CPUX86State,xmm_regs[rm].ZMM_L(0))); gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(2= )), offsetof(CPUX86State,xmm_regs[rm].ZMM_L(2))); + if (s->vex_l) { + gen_op_movl(s, + offsetof(CPUX86State, xmm_regs[reg].ZMM_L(= 4)), + offsetof(CPUX86State, xmm_regs[rm].ZMM_L(4= ))); + gen_op_movl(s, + offsetof(CPUX86State, xmm_regs[reg].ZMM_L(= 6)), + offsetof(CPUX86State, xmm_regs[rm].ZMM_L(6= ))); + } } gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(1)), offsetof(CPUX86State,xmm_regs[reg].ZMM_L(0))); gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(3)), offsetof(CPUX86State,xmm_regs[reg].ZMM_L(2))); + if (s->vex_l) { + gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(5= )), + offsetof(CPUX86State, xmm_regs[reg].ZMM_L(4))); + gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(7= )), + offsetof(CPUX86State, xmm_regs[reg].ZMM_L(6))); + } else { + gen_clear_ymmh(s, reg); + } break; case 0x312: /* movddup */ CHECK_AVX_V0(s); @@ -3580,13 +3712,29 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, gen_lea_modrm(env, s, modrm); gen_ldq_env_A0(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0))); + if (s->vex_l) { + tcg_gen_addi_tl(s->A0, s->A0, 16); + gen_ldq_env_A0(s, offsetof(CPUX86State, + xmm_regs[reg].ZMM_Q(2))); + } } else { rm =3D (modrm & 7) | REX_B(s); gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0= )), offsetof(CPUX86State,xmm_regs[rm].ZMM_Q(0))); + if (s->vex_l) { + gen_op_movq(s, + offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(= 2)), + offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(2= ))); + } } gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(1)), offsetof(CPUX86State,xmm_regs[reg].ZMM_Q(0))); + if (s->vex_l) { + gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(3= )), + offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(2))); + } else { + gen_clear_ymmh(s, reg); + } break; case 0x016: /* movhps */ case 0x116: /* movhpd */ @@ -3605,23 +3753,44 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0= )), offsetof(CPUX86State, xmm_regs[reg_v].ZMM_Q(0)= )); } + gen_clear_ymmh(s, reg); break; case 0x216: /* movshdup */ CHECK_AVX_V0(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); - gen_ldo_env_A0(s, ZMM_OFFSET(reg)); + if (s->vex_l) { + gen_ldy_env_A0(s, ZMM_OFFSET(reg)); + } else { + gen_ldo_env_A0(s, ZMM_OFFSET(reg)); + } } else { rm =3D (modrm & 7) | REX_B(s); gen_op_movl(s, 
offsetof(CPUX86State, xmm_regs[reg].ZMM_L(1= )), offsetof(CPUX86State,xmm_regs[rm].ZMM_L(1))); gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(3= )), offsetof(CPUX86State,xmm_regs[rm].ZMM_L(3))); + if (s->vex_l) { + gen_op_movl(s, + offsetof(CPUX86State, xmm_regs[reg].ZMM_L(= 5)), + offsetof(CPUX86State, xmm_regs[rm].ZMM_L(5= ))); + gen_op_movl(s, + offsetof(CPUX86State, xmm_regs[reg].ZMM_L(= 7)), + offsetof(CPUX86State, xmm_regs[rm].ZMM_L(7= ))); + } } gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0)), offsetof(CPUX86State,xmm_regs[reg].ZMM_L(1))); gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(2)), offsetof(CPUX86State,xmm_regs[reg].ZMM_L(3))); + if (s->vex_l) { + gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(4= )), + offsetof(CPUX86State, xmm_regs[reg].ZMM_L(5))); + gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(6= )), + offsetof(CPUX86State, xmm_regs[reg].ZMM_L(7))); + } else { + gen_clear_ymmh(s, reg); + } break; case 0x178: case 0x378: @@ -3686,6 +3855,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, offsetof(CPUX86State,xmm_regs[rm].ZMM_Q(0))); } gen_op_movq_env_0(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q= (1))); + gen_clear_ymmh(s, reg); break; case 0x7f: /* movq ea, mm */ CHECK_NO_VEX(s); @@ -3707,10 +3877,19 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, CHECK_AVX_V0(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); - gen_sto_env_A0(s, ZMM_OFFSET(reg)); + if (s->vex_l) { + gen_sty_env_A0(s, ZMM_OFFSET(reg)); + } else { + gen_sto_env_A0(s, ZMM_OFFSET(reg)); + } } else { rm =3D (modrm & 7) | REX_B(s); gen_op_movo(s, ZMM_OFFSET(rm), ZMM_OFFSET(reg)); + if (s->vex_l) { + gen_op_movo_ymmh(s, ZMM_OFFSET(rm), ZMM_OFFSET(reg)); + } else { + gen_clear_ymmh(s, rm); + } } break; case 0x211: /* movss ea, xmm */ @@ -3728,6 +3907,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } gen_op_movl(s, offsetof(CPUX86State, xmm_regs[rm].ZMM_L(0)= ), offsetof(CPUX86State,xmm_regs[reg].ZMM_L(0))); + gen_clear_ymmh(s, rm); } break; case 0x311: /* movsd ea, xmm */ @@ -3746,6 +3926,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } gen_op_movq(s, offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(0)= ), offsetof(CPUX86State,xmm_regs[reg].ZMM_Q(0))); + gen_clear_ymmh(s, rm); } break; case 0x013: /* movlps */ @@ -3798,6 +3979,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, op1_offset =3D offsetof(CPUX86State,mmx_t0); } assert(b1 < 2); + if (s->vex_l) { + b1 =3D 2; + } SSEFunc_0_eppp fn =3D sse_op_table2[((b - 1) & 3) * 8 + (((modrm >> 3)) & 7)][b1]; if (!fn) { @@ -3820,19 +4004,30 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); tcg_gen_addi_ptr(s->ptr2, cpu_env, op1_offset); fn(cpu_env, s->ptr0, s->ptr1, s->ptr2); + if (!s->vex_l) { + gen_clear_ymmh(s, reg_v); + } break; case 0x050: /* movmskps */ CHECK_AVX_V0(s); rm =3D (modrm & 7) | REX_B(s); tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(rm)); - gen_helper_movmskps_xmm(s->tmp2_i32, cpu_env, s->ptr0); + if (s->vex_l) { + gen_helper_movmskps_ymm(s->tmp2_i32, cpu_env, s->ptr0); + } else { + gen_helper_movmskps_xmm(s->tmp2_i32, cpu_env, s->ptr0); + } tcg_gen_extu_i32_tl(cpu_regs[reg], s->tmp2_i32); break; case 0x150: /* movmskpd */ CHECK_AVX_V0(s); rm =3D (modrm & 7) | REX_B(s); tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(rm)); - gen_helper_movmskpd_xmm(s->tmp2_i32, cpu_env, s->ptr0); + if (s->vex_l) { + gen_helper_movmskpd_ymm(s->tmp2_i32, 
cpu_env, s->ptr0); + } else { + gen_helper_movmskpd_xmm(s->tmp2_i32, cpu_env, s->ptr0); + } tcg_gen_extu_i32_tl(cpu_regs[reg], s->tmp2_i32); break; case 0x02a: /* cvtpi2ps */ @@ -3883,6 +4078,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, goto illegal_op; #endif } + gen_clear_ymmh(s, reg); break; case 0x02c: /* cvttps2pi */ case 0x12c: /* cvttpd2pi */ @@ -3972,6 +4168,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, tcg_gen_st16_tl(s->T0, cpu_env, offsetof(CPUX86State,fpregs[reg].mmx.MMX_W= (val))); } + gen_clear_ymmh(s, reg); break; case 0xc5: /* pextrw */ case 0x1c5: @@ -4031,7 +4228,11 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, CHECK_AVX_V0(s); rm =3D (modrm & 7) | REX_B(s); tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(rm)); - gen_helper_pmovmskb_xmm(s->tmp2_i32, cpu_env, s->ptr0); + if (s->vex_l) { + gen_helper_pmovmskb_ymm(s->tmp2_i32, cpu_env, s->ptr0); + } else { + gen_helper_pmovmskb_xmm(s->tmp2_i32, cpu_env, s->ptr0); + } } else { CHECK_NO_VEX(s); rm =3D (modrm & 7); @@ -4098,37 +4299,66 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, if (mod =3D=3D 3) { op2_offset =3D ZMM_OFFSET(rm | REX_B(s)); } else { - op2_offset =3D offsetof(CPUX86State,xmm_t0); + int size; + op2_offset =3D offsetof(CPUX86State, xmm_t0); gen_lea_modrm(env, s, modrm); switch (b) { case 0x20: case 0x30: /* pmovsxbw, pmovzxbw */ case 0x23: case 0x33: /* pmovsxwd, pmovzxwd */ case 0x25: case 0x35: /* pmovsxdq, pmovzxdq */ - gen_ldq_env_A0(s, op2_offset + - offsetof(ZMMReg, ZMM_Q(0))); + size =3D 64; break; case 0x21: case 0x31: /* pmovsxbd, pmovzxbd */ case 0x24: case 0x34: /* pmovsxwq, pmovzxwq */ - tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0, - s->mem_index, MO_LEUL); - tcg_gen_st_i32(s->tmp2_i32, cpu_env, op2_offset + - offsetof(ZMMReg, ZMM_L(0))); + size =3D 32; break; case 0x22: case 0x32: /* pmovsxbq, pmovzxbq */ + size =3D 16; + break; + case 0x2a: /* movntqda */ + if (s->vex_l) { + gen_ldy_env_A0(s, op1_offset); + } else { + gen_ldo_env_A0(s, op1_offset); + gen_clear_ymmh(s, reg); + } + return; + default: + size =3D 128; + } + if (s->vex_l) { + size *=3D 2; + } + switch (size) { + case 16: tcg_gen_qemu_ld_tl(s->tmp0, s->A0, s->mem_index, MO_LEUW); tcg_gen_st16_tl(s->tmp0, cpu_env, op2_offset + offsetof(ZMMReg, ZMM_W(0))); break; - case 0x2a: /* movntqda */ - gen_ldo_env_A0(s, op1_offset); - return; - default: + case 32: + tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0, + s->mem_index, MO_LEUL); + tcg_gen_st_i32(s->tmp2_i32, cpu_env, op2_offset + + offsetof(ZMMReg, ZMM_L(0))); + break; + case 64: + gen_ldq_env_A0(s, op2_offset + + offsetof(ZMMReg, ZMM_Q(0))); + break; + case 128: gen_ldo_env_A0(s, op2_offset); + break; + case 256: + gen_ldy_env_A0(s, op2_offset); + break; } } tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); + if (s->vex_l) { + b1 =3D 2; + } if (!op6.fn[b1].op1) { goto illegal_op; } @@ -4148,6 +4378,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, fn(cpu_env, s->ptr0, s->ptr2, s->ptr1); } } + if ((op6.flags & SSE_OPF_CMP) =3D=3D 0 && s->vex_l =3D=3D = 0) { + gen_clear_ymmh(s, reg); + } } else { CHECK_NO_VEX(s); if ((op6.flags & SSE_OPF_MMX) =3D=3D 0) { @@ -4644,6 +4877,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } tcg_gen_st8_tl(s->T0, cpu_env, offsetof(CPUX86State, xmm_regs[reg].ZMM_B(val & 15))= ); + gen_clear_ymmh(s, reg); break; case 0x21: /* insertps */ CHECK_AVX_128(s); @@ -4677,6 +4911,7 @@ static void gen_sse(CPUX86State 
*env, DisasContext *s= , int b, tcg_gen_st_i32(tcg_const_i32(0 /*float32_zero*/), cpu_env, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(3))); + gen_clear_ymmh(s, reg); break; case 0x22: CHECK_AVX_128(s); @@ -4708,6 +4943,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, goto illegal_op; #endif } + gen_clear_ymmh(s, reg); break; } return; @@ -4760,7 +4996,11 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, } else { op2_offset =3D offsetof(CPUX86State, xmm_t0); gen_lea_modrm(env, s, modrm); - gen_ldo_env_A0(s, op2_offset); + if (s->vex_l) { + gen_ldy_env_A0(s, op2_offset); + } else { + gen_ldo_env_A0(s, op2_offset); + } } =20 val =3D x86_ldub_code(env, s); @@ -4771,8 +5011,13 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, /* The helper must use entire 64-bit gp registers */ val |=3D 1 << 8; } + if ((b & 1) =3D=3D 0) /* pcmpXsrtm */ + gen_clear_ymmh(s, 0); } =20 + if (s->vex_l) { + b1 =3D 2; + } v_offset =3D ZMM_OFFSET(reg_v); /* * Populate the top part of the destination register for VEX @@ -4796,6 +5041,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, op7.fn[b1].op2(cpu_env, s->ptr0, s->ptr2, s->ptr1, tcg_const_i32(val)); } + if ((op7.flags & SSE_OPF_CMP) =3D=3D 0 && s->vex_l =3D=3D 0) { + gen_clear_ymmh(s, reg); + } if (op7.flags & SSE_OPF_CMP) { set_cc_op(s, CC_OP_EFLAGS); } @@ -4848,6 +5096,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, default: break; } + if (s->vex_l) { + b1 +=3D 4; + } if (is_xmm) { scalar_op =3D (s->prefix & PREFIX_VEX) && (sse_op.flags & SSE_OPF_SCALAR) @@ -4864,7 +5115,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } op1_offset =3D ZMM_OFFSET(reg); if (mod !=3D 3) { - int sz =3D 4; + int sz =3D s->vex_l ? 5 : 4; =20 gen_lea_modrm(env, s, modrm); op2_offset =3D offsetof(CPUX86State, xmm_t0); @@ -4902,6 +5153,10 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, /* 128 bit access */ gen_ldo_env_A0(s, op2_offset); break; + case 5: + /* 256 bit access */ + gen_ldy_env_A0(s, op2_offset); + break; } } else { rm =3D (modrm & 7) | REX_B(s); @@ -5010,6 +5265,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } } =20 + if (s->vex_l =3D=3D 0 && (sse_op.flags & SSE_OPF_CMP) =3D=3D 0) { + gen_clear_ymmh(s, reg); + } if (sse_op.flags & SSE_OPF_CMP) { set_cc_op(s, CC_OP_EFLAGS); } --=20 2.36.0 From nobody Thu May 2 14:26:59 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1650839034887969.4573135622587; Sun, 24 Apr 2022 15:23:54 -0700 (PDT) Received: from localhost ([::1]:33534 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nike1-0003X5-Px for importer@patchew.org; Sun, 24 Apr 2022 18:23:53 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50630) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nikSk-0001g9-3f for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:12:14 -0400 Received: from nowt.default.pbrook.uk0.bigv.io ([2001:41c8:51:832:fcff:ff:fe00:46dd]:58801) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nikSh-0002pR-LG for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:12:13 -0400 Received: 
from cpc91554-seac25-2-0-cust857.7-2.cable.virginm.net ([82.27.199.90]
 helo=wren.home) by nowt.default.pbrook.uk0.bigv.io with esmtpsa
 (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from )
 id 1nikJA-0001ea-Hv; Sun, 24 Apr 2022 23:02:20 +0100
From: Paul Brook
To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
Cc: "open list:All patches CC here", Paul Brook
Subject: [PATCH v2 28/42] i386: Implement VZEROALL and VZEROUPPER
Date: Sun, 24 Apr 2022 23:01:50 +0100
Message-Id: <20220424220204.2493824-29-paul@nowt.org>
In-Reply-To: <20220418173904.3746036-1-paul@nowt.org>
References: <20220418173904.3746036-1-paul@nowt.org>
Content-Type: text/plain; charset="utf-8"

They use the same opcode as EMMS, which I guess makes some sort of sense.
Fairly straightforward other than that.

If we were wanting to optimize out gen_clear_ymmh then this would be one
of the starting points.
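
For illustration only (not part of the patch, and sum8 is a made-up
example function): this is how guest code typically reaches VZEROUPPER.
Compilers emit it when leaving AVX code so that later legacy SSE code
does not pay the AVX<->SSE transition penalty, which is why it is worth
implementing even though it shares an opcode with EMMS. The intrinsics
are the standard ones from immintrin.h.

    #include <immintrin.h>

    /* Horizontal sum of 8 floats; made-up example, compile with -mavx. */
    float sum8(const float *p)
    {
        __m256 v = _mm256_loadu_ps(p);        /* dirties the ymm state */
        __m128 lo = _mm256_castps256_ps128(v);
        __m128 hi = _mm256_extractf128_ps(v, 1);
        __m128 s = _mm_add_ps(lo, hi);
        s = _mm_hadd_ps(s, s);
        s = _mm_hadd_ps(s, s);
        _mm256_zeroupper();   /* emits the VZEROUPPER implemented here */
        return _mm_cvtss_f32(s);
    }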
Signed-off-by: Paul Brook --- target/i386/ops_sse.h | 48 ++++++++++++++++++++++++++++++++++++ target/i386/ops_sse_header.h | 9 +++++++ target/i386/tcg/translate.c | 26 ++++++++++++++++--- 3 files changed, 80 insertions(+), 3 deletions(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index ad3312d353..a1f50f0c8b 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -3071,6 +3071,54 @@ void glue(helper_aeskeygenassist, SUFFIX)(CPUX86Stat= e *env, Reg *d, Reg *s, #endif #endif =20 +#if SHIFT =3D=3D 2 +void helper_vzeroall(CPUX86State *env) +{ + int i; + + for (i =3D 0; i < 8; i++) { + env->xmm_regs[i].ZMM_Q(0) =3D 0; + env->xmm_regs[i].ZMM_Q(1) =3D 0; + env->xmm_regs[i].ZMM_Q(2) =3D 0; + env->xmm_regs[i].ZMM_Q(3) =3D 0; + } +} + +void helper_vzeroupper(CPUX86State *env) +{ + int i; + + for (i =3D 0; i < 8; i++) { + env->xmm_regs[i].ZMM_Q(2) =3D 0; + env->xmm_regs[i].ZMM_Q(3) =3D 0; + } +} + +#ifdef TARGET_X86_64 +void helper_vzeroall_hi8(CPUX86State *env) +{ + int i; + + for (i =3D 8; i < 16; i++) { + env->xmm_regs[i].ZMM_Q(0) =3D 0; + env->xmm_regs[i].ZMM_Q(1) =3D 0; + env->xmm_regs[i].ZMM_Q(2) =3D 0; + env->xmm_regs[i].ZMM_Q(3) =3D 0; + } +} + +void helper_vzeroupper_hi8(CPUX86State *env) +{ + int i; + + for (i =3D 8; i < 16; i++) { + env->xmm_regs[i].ZMM_Q(2) =3D 0; + env->xmm_regs[i].ZMM_Q(3) =3D 0; + } +} +#endif +#endif + #undef SSE_HELPER_S =20 #undef SHIFT diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h index cfcfba154b..48f0945917 100644 --- a/target/i386/ops_sse_header.h +++ b/target/i386/ops_sse_header.h @@ -411,6 +411,15 @@ DEF_HELPER_4(glue(aeskeygenassist, SUFFIX), void, env,= Reg, Reg, i32) DEF_HELPER_5(glue(pclmulqdq, SUFFIX), void, env, Reg, Reg, Reg, i32) #endif =20 +#if SHIFT =3D=3D 2 +DEF_HELPER_1(vzeroall, void, env) +DEF_HELPER_1(vzeroupper, void, env) +#ifdef TARGET_X86_64 +DEF_HELPER_1(vzeroall_hi8, void, env) +DEF_HELPER_1(vzeroupper_hi8, void, env) +#endif +#endif + #undef SHIFT #undef Reg #undef SUFFIX diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index bcd6d47fd0..ba70aeb039 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -3455,9 +3455,29 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, return; } if (b =3D=3D 0x77) { - /* emms */ - gen_helper_emms(cpu_env); - return; + if (s->prefix & PREFIX_VEX) { + CHECK_AVX(s); + if (s->vex_l) { + gen_helper_vzeroall(cpu_env); +#ifdef TARGET_X86_64 + if (CODE64(s)) { + gen_helper_vzeroall_hi8(cpu_env); + } +#endif + } else { + gen_helper_vzeroupper(cpu_env); +#ifdef TARGET_X86_64 + if (CODE64(s)) { + gen_helper_vzeroupper_hi8(cpu_env); + } +#endif + } + return; + } else { + /* emms */ + gen_helper_emms(cpu_env); + return; + } } /* prepare MMX state (XXX: optimize by storing fptt and fptags in the static cpu state) */ --=20 2.36.0 From nobody Thu May 2 14:26:59 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1650838877737440.2849707890525; Sun, 24 Apr 2022 15:21:17 -0700 (PDT) Received: from localhost ([::1]:54108 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nikbU-0006jR-5y for importer@patchew.org; Sun, 24 Apr 2022 18:21:16 -0400 Received: from eggs.gnu.org 
([2001:470:142:3::10]:50478) by lists.gnu.org with esmtps
 (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from )
 id 1nikS6-00016T-3A for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:11:34 -0400
From: Paul Brook
To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
Cc: "open list:All patches CC here", Paul Brook
Subject: [PATCH v2 29/42] i386: Implement VBROADCAST
Date: Sun, 24 Apr 2022 23:01:51 +0100
Message-Id: <20220424220204.2493824-30-paul@nowt.org>
In-Reply-To: <20220418173904.3746036-1-paul@nowt.org>
References: <20220418173904.3746036-1-paul@nowt.org>
Content-Type: text/plain; charset="utf-8"

The catch here is that these are whole vector operations (not
independent 128 bit lanes). We abuse the SSE_OPF_SCALAR flag to select
the memory operand width appropriately.
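
For illustration only (not part of the patch; the function names here
are made up): the width issue is visible from guest code, where the
256-bit broadcast forms still take a memory operand the size of one
element (or one 128-bit lane), not the size of the destination vector,
which is why the load has to be sized from the element width.

    #include <immintrin.h>

    __m256 splat_float(const float *p)
    {
        return _mm256_broadcast_ss(p);   /* vbroadcastss ymm, m32 */
    }

    __m256 splat_lane(const __m128 *p)
    {
        return _mm256_broadcast_ps(p);   /* vbroadcastf128 ymm, m128 */
    }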
Signed-off-by: Paul Brook --- target/i386/ops_sse.h | 51 ++++++++++++++++++++++++++++++++++++ target/i386/ops_sse_header.h | 8 ++++++ target/i386/tcg/translate.c | 42 ++++++++++++++++++++++++++++- 3 files changed, 100 insertions(+), 1 deletion(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index a1f50f0c8b..4115c9a257 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -3071,7 +3071,57 @@ void glue(helper_aeskeygenassist, SUFFIX)(CPUX86Stat= e *env, Reg *d, Reg *s, #endif #endif =20 +#if SHIFT >=3D 1 +void glue(helper_vbroadcastb, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +{ + uint8_t val =3D s->B(0); + int i; + + for (i =3D 0; i < 16 * SHIFT; i++) { + d->B(i) =3D val; + } +} + +void glue(helper_vbroadcastw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +{ + uint16_t val =3D s->W(0); + int i; + + for (i =3D 0; i < 8 * SHIFT; i++) { + d->W(i) =3D val; + } +} + +void glue(helper_vbroadcastl, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +{ + uint32_t val =3D s->L(0); + int i; + + for (i =3D 0; i < 8 * SHIFT; i++) { + d->L(i) =3D val; + } +} + +void glue(helper_vbroadcastq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +{ + uint64_t val =3D s->Q(0); + d->Q(0) =3D val; + d->Q(1) =3D val; #if SHIFT =3D=3D 2 + d->Q(2) =3D val; + d->Q(3) =3D val; +#endif +} + +#if SHIFT =3D=3D 2 +void glue(helper_vbroadcastdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +{ + d->Q(0) =3D s->Q(0); + d->Q(1) =3D s->Q(1); + d->Q(2) =3D s->Q(0); + d->Q(3) =3D s->Q(1); +} + void helper_vzeroall(CPUX86State *env) { int i; @@ -3118,6 +3168,7 @@ void helper_vzeroupper_hi8(CPUX86State *env) } #endif #endif +#endif =20 #undef SSE_HELPER_S =20 diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h index 48f0945917..51e02cd4fa 100644 --- a/target/i386/ops_sse_header.h +++ b/target/i386/ops_sse_header.h @@ -411,7 +411,14 @@ DEF_HELPER_4(glue(aeskeygenassist, SUFFIX), void, env,= Reg, Reg, i32) DEF_HELPER_5(glue(pclmulqdq, SUFFIX), void, env, Reg, Reg, Reg, i32) #endif =20 +/* AVX helpers */ +#if SHIFT >=3D 1 +DEF_HELPER_3(glue(vbroadcastb, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_3(glue(vbroadcastw, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_3(glue(vbroadcastl, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_3(glue(vbroadcastq, SUFFIX), void, env, Reg, Reg) #if SHIFT =3D=3D 2 +DEF_HELPER_3(glue(vbroadcastdq, SUFFIX), void, env, Reg, Reg) DEF_HELPER_1(vzeroall, void, env) DEF_HELPER_1(vzeroupper, void, env) #ifdef TARGET_X86_64 @@ -419,6 +426,7 @@ DEF_HELPER_1(vzeroall_hi8, void, env) DEF_HELPER_1(vzeroupper_hi8, void, env) #endif #endif +#endif =20 #undef SHIFT #undef Reg diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index ba70aeb039..59ab1dc562 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -3255,6 +3255,11 @@ static const struct SSEOpHelper_table6 sse_op_table6= [256] =3D { [0x14] =3D BLENDV_OP(blendvps, SSE41, 0), [0x15] =3D BLENDV_OP(blendvpd, SSE41, 0), [0x17] =3D CMP_OP(ptest, SSE41), + /* TODO:Some vbroadcast variants require AVX2 */ + [0x18] =3D UNARY_OP(vbroadcastl, AVX, SSE_OPF_SCALAR), /* vbroadcastss= */ + [0x19] =3D UNARY_OP(vbroadcastq, AVX, SSE_OPF_SCALAR), /* vbroadcastsd= */ +#define gen_helper_vbroadcastdq_xmm NULL + [0x1a] =3D UNARY_OP(vbroadcastdq, AVX, SSE_OPF_SCALAR), /* vbroadcastf= 128 */ [0x1c] =3D UNARY_OP_MMX(pabsb, SSSE3), [0x1d] =3D UNARY_OP_MMX(pabsw, SSSE3), [0x1e] =3D UNARY_OP_MMX(pabsd, SSSE3), @@ -3286,6 +3291,16 @@ static const struct SSEOpHelper_table6 sse_op_table6= [256] =3D { [0x40] =3D BINARY_OP(pmulld, 
SSE41, SSE_OPF_MMX), #define gen_helper_phminposuw_ymm NULL [0x41] =3D UNARY_OP(phminposuw, SSE41, 0), + /* vpbroadcastd */ + [0x58] =3D UNARY_OP(vbroadcastl, AVX, SSE_OPF_SCALAR | SSE_OPF_MMX), + /* vpbroadcastq */ + [0x59] =3D UNARY_OP(vbroadcastq, AVX, SSE_OPF_SCALAR | SSE_OPF_MMX), + /* vbroadcasti128 */ + [0x5a] =3D UNARY_OP(vbroadcastdq, AVX, SSE_OPF_SCALAR | SSE_OPF_MMX), + /* vpbroadcastb */ + [0x78] =3D UNARY_OP(vbroadcastb, AVX, SSE_OPF_SCALAR | SSE_OPF_MMX), + /* vpbroadcastw */ + [0x79] =3D UNARY_OP(vbroadcastw, AVX, SSE_OPF_SCALAR | SSE_OPF_MMX), #define gen_helper_aesimc_ymm NULL [0xdb] =3D UNARY_OP(aesimc, AES, 0), [0xdc] =3D BINARY_OP(aesenc, AES, 0), @@ -4323,6 +4338,24 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, op2_offset =3D offsetof(CPUX86State, xmm_t0); gen_lea_modrm(env, s, modrm); switch (b) { + case 0x78: /* vpbroadcastb */ + size =3D 8; + break; + case 0x79: /* vpbroadcastw */ + size =3D 16; + break; + case 0x18: /* vbroadcastss */ + case 0x58: /* vpbroadcastd */ + size =3D 32; + break; + case 0x19: /* vbroadcastsd */ + case 0x59: /* vpbroadcastq */ + size =3D 64; + break; + case 0x1a: /* vbroadcastf128 */ + case 0x5a: /* vbroadcasti128 */ + size =3D 128; + break; case 0x20: case 0x30: /* pmovsxbw, pmovzxbw */ case 0x23: case 0x33: /* pmovsxwd, pmovzxwd */ case 0x25: case 0x35: /* pmovsxdq, pmovzxdq */ @@ -4346,10 +4379,17 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, default: size =3D 128; } - if (s->vex_l) { + /* 256 bit vbroadcast only load a single element. */ + if ((op6.flags & SSE_OPF_SCALAR) =3D=3D 0 && s->vex_l)= { size *=3D 2; } switch (size) { + case 8: + tcg_gen_qemu_ld_tl(s->tmp0, s->A0, + s->mem_index, MO_UB); + tcg_gen_st16_tl(s->tmp0, cpu_env, op2_offset + + offsetof(ZMMReg, ZMM_B(0))); + break; case 16: tcg_gen_qemu_ld_tl(s->tmp0, s->A0, s->mem_index, MO_LEUW); --=20 2.36.0 From nobody Thu May 2 14:26:59 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1650838647717577.8299408834894; Sun, 24 Apr 2022 15:17:27 -0700 (PDT) Received: from localhost ([::1]:45524 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nikXm-0000vY-JD for importer@patchew.org; Sun, 24 Apr 2022 18:17:26 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50444) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nikRx-00010u-At for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:11:26 -0400 Received: from nowt.default.pbrook.uk0.bigv.io ([2001:41c8:51:832:fcff:ff:fe00:46dd]:58755) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nikRv-0002mE-KU for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:11:25 -0400 Received: from cpc91554-seac25-2-0-cust857.7-2.cable.virginm.net ([82.27.199.90] helo=wren.home) by nowt.default.pbrook.uk0.bigv.io with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1nikJA-0001ea-Vr; Sun, 24 Apr 2022 23:02:21 +0100 From: Paul Brook To: Paolo Bonzini , Richard Henderson , Eduardo Habkost Subject: [PATCH v2 30/42] i386: Implement VPERMIL Date: Sun, 24 Apr 2022 23:01:52 +0100 Message-Id: <20220424220204.2493824-31-paul@nowt.org> X-Mailer: git-send-email 
2.36.0 In-Reply-To: <20220418173904.3746036-1-paul@nowt.org> References: <20220418173904.3746036-1-paul@nowt.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2001:41c8:51:832:fcff:ff:fe00:46dd; envelope-from=paul@nowt.org; helo=nowt.default.pbrook.uk0.bigv.io X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "open list:All patches CC here" , Paul Brook Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZM-MESSAGEID: 1650838649332100001 Content-Type: text/plain; charset="utf-8" Some potentially surprising details when comparing vpermilpd v.s. vpermilps, but overall pretty straightforward. Signed-off-by: Paul Brook --- target/i386/ops_sse.h | 82 ++++++++++++++++++++++++++++++++++++ target/i386/ops_sse_header.h | 4 ++ target/i386/tcg/translate.c | 4 ++ 3 files changed, 90 insertions(+) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index 4115c9a257..9b92b9790a 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -3113,6 +3113,88 @@ void glue(helper_vbroadcastq, SUFFIX)(CPUX86State *e= nv, Reg *d, Reg *s) #endif } =20 +void glue(helper_vpermilpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg = *s) +{ + uint64_t r0, r1; + + r0 =3D v->Q((s->Q(0) >> 1) & 1); + r1 =3D v->Q((s->Q(1) >> 1) & 1); + d->Q(0) =3D r0; + d->Q(1) =3D r1; +#if SHIFT =3D=3D 2 + r0 =3D v->Q(((s->Q(2) >> 1) & 1) + 2); + r1 =3D v->Q(((s->Q(3) >> 1) & 1) + 2); + d->Q(2) =3D r0; + d->Q(3) =3D r1; +#endif +} + +void glue(helper_vpermilps, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg = *s) +{ + uint32_t r0, r1, r2, r3; + + r0 =3D v->L(s->L(0) & 3); + r1 =3D v->L(s->L(1) & 3); + r2 =3D v->L(s->L(2) & 3); + r3 =3D v->L(s->L(3) & 3); + d->L(0) =3D r0; + d->L(1) =3D r1; + d->L(2) =3D r2; + d->L(3) =3D r3; +#if SHIFT =3D=3D 2 + r0 =3D v->L((s->L(4) & 3) + 4); + r1 =3D v->L((s->L(5) & 3) + 4); + r2 =3D v->L((s->L(6) & 3) + 4); + r3 =3D v->L((s->L(7) & 3) + 4); + d->L(4) =3D r0; + d->L(5) =3D r1; + d->L(6) =3D r2; + d->L(7) =3D r3; +#endif +} + +void glue(helper_vpermilpd_imm, SUFFIX)(CPUX86State *env, + Reg *d, Reg *s, uint32_t order) +{ + uint64_t r0, r1; + + r0 =3D s->Q((order >> 0) & 1); + r1 =3D s->Q((order >> 1) & 1); + d->Q(0) =3D r0; + d->Q(1) =3D r1; +#if SHIFT =3D=3D 2 + r0 =3D s->Q(((order >> 2) & 1) + 2); + r1 =3D s->Q(((order >> 3) & 1) + 2); + d->Q(2) =3D r0; + d->Q(3) =3D r1; +#endif +} + +void glue(helper_vpermilps_imm, SUFFIX)(CPUX86State *env, + Reg *d, Reg *s, uint32_t order) +{ + uint32_t r0, r1, r2, r3; + + r0 =3D s->L((order >> 0) & 3); + r1 =3D s->L((order >> 2) & 3); + r2 =3D s->L((order >> 4) & 3); + r3 =3D s->L((order >> 6) & 3); + d->L(0) =3D r0; + d->L(1) =3D r1; + d->L(2) =3D r2; + d->L(3) =3D r3; +#if SHIFT =3D=3D 2 + r0 =3D s->L(((order >> 0) & 3) + 4); + r1 =3D s->L(((order >> 2) & 3) + 4); + r2 =3D s->L(((order >> 4) & 3) + 4); + r3 =3D s->L(((order >> 6) & 3) + 4); + d->L(4) =3D r0; + d->L(5) =3D r1; + d->L(6) =3D r2; + d->L(7) =3D r3; +#endif +} + #if SHIFT 
=3D=3D 2 void glue(helper_vbroadcastdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h index 51e02cd4fa..c52169a030 100644 --- a/target/i386/ops_sse_header.h +++ b/target/i386/ops_sse_header.h @@ -417,6 +417,10 @@ DEF_HELPER_3(glue(vbroadcastb, SUFFIX), void, env, Reg= , Reg) DEF_HELPER_3(glue(vbroadcastw, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(glue(vbroadcastl, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(glue(vbroadcastq, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_4(glue(vpermilpd, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(vpermilps, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(vpermilpd_imm, SUFFIX), void, env, Reg, Reg, i32) +DEF_HELPER_4(glue(vpermilps_imm, SUFFIX), void, env, Reg, Reg, i32) #if SHIFT =3D=3D 2 DEF_HELPER_3(glue(vbroadcastdq, SUFFIX), void, env, Reg, Reg) DEF_HELPER_1(vzeroall, void, env) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index 59ab1dc562..358c3ecb0b 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -3251,6 +3251,8 @@ static const struct SSEOpHelper_table6 sse_op_table6[= 256] =3D { [0x09] =3D BINARY_OP_MMX(psignw, SSSE3), [0x0a] =3D BINARY_OP_MMX(psignd, SSSE3), [0x0b] =3D BINARY_OP_MMX(pmulhrsw, SSSE3), + [0x0c] =3D BINARY_OP(vpermilps, AVX, 0), + [0x0d] =3D BINARY_OP(vpermilpd, AVX, 0), [0x10] =3D BLENDV_OP(pblendvb, SSE41, SSE_OPF_MMX), [0x14] =3D BLENDV_OP(blendvps, SSE41, 0), [0x15] =3D BLENDV_OP(blendvpd, SSE41, 0), @@ -3311,6 +3313,8 @@ static const struct SSEOpHelper_table6 sse_op_table6[= 256] =3D { =20 /* prefix [66] 0f 3a */ static const struct SSEOpHelper_table7 sse_op_table7[256] =3D { + [0x04] =3D UNARY_OP(vpermilps_imm, AVX, 0), + [0x05] =3D UNARY_OP(vpermilpd_imm, AVX, 0), [0x08] =3D UNARY_OP(roundps, SSE41, 0), [0x09] =3D UNARY_OP(roundpd, SSE41, 0), #define gen_helper_roundss_ymm NULL --=20 2.36.0 From nobody Thu May 2 14:26:59 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1650840021635990.4579455316017; Sun, 24 Apr 2022 15:40:21 -0700 (PDT) Received: from localhost ([::1]:50058 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1niktw-0000ph-DK for importer@patchew.org; Sun, 24 Apr 2022 18:40:20 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50886) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nikTz-0002aA-Dj for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:13:31 -0400 Received: from nowt.default.pbrook.uk0.bigv.io ([2001:41c8:51:832:fcff:ff:fe00:46dd]:58860) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nikTx-0002w7-Ln for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:13:31 -0400 Received: from cpc91554-seac25-2-0-cust857.7-2.cable.virginm.net ([82.27.199.90] helo=wren.home) by nowt.default.pbrook.uk0.bigv.io with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1nikJB-0001ea-6F; Sun, 24 Apr 2022 23:02:21 +0100 From: Paul Brook To: Paolo Bonzini , Richard Henderson , Eduardo Habkost Subject: [PATCH v2 31/42] i386: Implement AVX variable shifts Date: Sun, 24 Apr 2022 23:01:53 +0100 Message-Id: 
<20220424220204.2493824-32-paul@nowt.org> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220418173904.3746036-1-paul@nowt.org> References: <20220418173904.3746036-1-paul@nowt.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2001:41c8:51:832:fcff:ff:fe00:46dd; envelope-from=paul@nowt.org; helo=nowt.default.pbrook.uk0.bigv.io X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "open list:All patches CC here" , Paul Brook Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZM-MESSAGEID: 1650840022659100001 Content-Type: text/plain; charset="utf-8" These use the W bit to encode the operand width, but otherwise fairly straightforward. Signed-off-by: Paul Brook --- target/i386/ops_sse.h | 17 +++++++++++++++++ target/i386/ops_sse_header.h | 6 ++++++ target/i386/tcg/translate.c | 17 +++++++++++++++++ 3 files changed, 40 insertions(+) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index 9b92b9790a..8f2bd48394 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -3195,6 +3195,23 @@ void glue(helper_vpermilps_imm, SUFFIX)(CPUX86State = *env, #endif } =20 +#if SHIFT =3D=3D 1 +#define FPSRLVD(x, c) (c < 32 ? ((x) >> c) : 0) +#define FPSRLVQ(x, c) (c < 64 ? ((x) >> c) : 0) +#define FPSRAVD(x, c) ((int32_t)(x) >> (c < 64 ? c : 31)) +#define FPSRAVQ(x, c) ((int64_t)(x) >> (c < 64 ? c : 63)) +#define FPSLLVD(x, c) (c < 32 ? ((x) << c) : 0) +#define FPSLLVQ(x, c) (c < 64 ? 
((x) << c) : 0) +#endif + +SSE_HELPER_L(helper_vpsrlvd, FPSRLVD) +SSE_HELPER_L(helper_vpsravd, FPSRAVD) +SSE_HELPER_L(helper_vpsllvd, FPSLLVD) + +SSE_HELPER_Q(helper_vpsrlvq, FPSRLVQ) +SSE_HELPER_Q(helper_vpsravq, FPSRAVQ) +SSE_HELPER_Q(helper_vpsllvq, FPSLLVQ) + #if SHIFT =3D=3D 2 void glue(helper_vbroadcastdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h index c52169a030..20db6c4240 100644 --- a/target/i386/ops_sse_header.h +++ b/target/i386/ops_sse_header.h @@ -421,6 +421,12 @@ DEF_HELPER_4(glue(vpermilpd, SUFFIX), void, env, Reg, = Reg, Reg) DEF_HELPER_4(glue(vpermilps, SUFFIX), void, env, Reg, Reg, Reg) DEF_HELPER_4(glue(vpermilpd_imm, SUFFIX), void, env, Reg, Reg, i32) DEF_HELPER_4(glue(vpermilps_imm, SUFFIX), void, env, Reg, Reg, i32) +DEF_HELPER_4(glue(vpsrlvd, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(vpsravd, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(vpsllvd, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(vpsrlvq, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(vpsravq, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(vpsllvq, SUFFIX), void, env, Reg, Reg, Reg) #if SHIFT =3D=3D 2 DEF_HELPER_3(glue(vbroadcastdq, SUFFIX), void, env, Reg, Reg) DEF_HELPER_1(vzeroall, void, env) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index 358c3ecb0b..4990470083 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -3293,6 +3293,9 @@ static const struct SSEOpHelper_table6 sse_op_table6[= 256] =3D { [0x40] =3D BINARY_OP(pmulld, SSE41, SSE_OPF_MMX), #define gen_helper_phminposuw_ymm NULL [0x41] =3D UNARY_OP(phminposuw, SSE41, 0), + [0x45] =3D BINARY_OP(vpsrlvd, AVX, SSE_OPF_AVX2), + [0x46] =3D BINARY_OP(vpsravd, AVX, SSE_OPF_AVX2), + [0x47] =3D BINARY_OP(vpsllvd, AVX, SSE_OPF_AVX2), /* vpbroadcastd */ [0x58] =3D UNARY_OP(vbroadcastl, AVX, SSE_OPF_SCALAR | SSE_OPF_MMX), /* vpbroadcastq */ @@ -3357,6 +3360,15 @@ static const struct SSEOpHelper_table7 sse_op_table7= [256] =3D { #undef BLENDV_OP #undef SPECIAL_OP =20 +#define SSE_OP(name) \ + {gen_helper_ ## name ##_xmm, gen_helper_ ## name ##_ymm} +static const SSEFunc_0_eppp sse_op_table8[3][2] =3D { + SSE_OP(vpsrlvq), + SSE_OP(vpsravq), + SSE_OP(vpsllvq), +}; +#undef SSE_OP + /* VEX prefix not allowed */ #define CHECK_NO_VEX(s) do { \ if (s->prefix & PREFIX_VEX) \ @@ -4439,6 +4451,11 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, tcg_temp_free_ptr(mask); } else { SSEFunc_0_eppp fn =3D op6.fn[b1].op2; + if (REX_W(s)) { + if (b >=3D 0x45 && b <=3D 0x47) { + fn =3D sse_op_table8[b - 0x45][b1 - 1]; + } + } fn(cpu_env, s->ptr0, s->ptr2, s->ptr1); } } --=20 2.36.0 From nobody Thu May 2 14:26:59 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1650839813964685.5318696203109; Sun, 24 Apr 2022 15:36:53 -0700 (PDT) Received: from localhost ([::1]:43264 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nikqa-0004bG-W2 for importer@patchew.org; Sun, 24 Apr 2022 18:36:53 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:51038) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nikV2-0004Kg-UR for 
qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:14:36 -0400 Received: from nowt.default.pbrook.uk0.bigv.io ([2001:41c8:51:832:fcff:ff:fe00:46dd]:58894) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nikUg-0002xm-Pu for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:14:28 -0400 Received: from cpc91554-seac25-2-0-cust857.7-2.cable.virginm.net ([82.27.199.90] helo=wren.home) by nowt.default.pbrook.uk0.bigv.io with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1nikJB-0001ea-Cp; Sun, 24 Apr 2022 23:02:21 +0100 From: Paul Brook To: Paolo Bonzini , Richard Henderson , Eduardo Habkost Subject: [PATCH v2 32/42] i386: Implement VTEST Date: Sun, 24 Apr 2022 23:01:54 +0100 Message-Id: <20220424220204.2493824-33-paul@nowt.org> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220418173904.3746036-1-paul@nowt.org> References: <20220418173904.3746036-1-paul@nowt.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2001:41c8:51:832:fcff:ff:fe00:46dd; envelope-from=paul@nowt.org; helo=nowt.default.pbrook.uk0.bigv.io X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "open list:All patches CC here" , Paul Brook Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZM-MESSAGEID: 1650839815275100001 Content-Type: text/plain; charset="utf-8" Noting special here Signed-off-by: Paul Brook --- target/i386/ops_sse.h | 28 ++++++++++++++++++++++++++++ target/i386/ops_sse_header.h | 2 ++ target/i386/tcg/translate.c | 2 ++ 3 files changed, 32 insertions(+) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index 8f2bd48394..edf14a25d7 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -3212,6 +3212,34 @@ SSE_HELPER_Q(helper_vpsrlvq, FPSRLVQ) SSE_HELPER_Q(helper_vpsravq, FPSRAVQ) SSE_HELPER_Q(helper_vpsllvq, FPSLLVQ) =20 +void glue(helper_vtestps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +{ + uint32_t zf =3D (s->L(0) & d->L(0)) | (s->L(1) & d->L(1)); + uint32_t cf =3D (s->L(0) & ~d->L(0)) | (s->L(1) & ~d->L(1)); + + zf |=3D (s->L(2) & d->L(2)) | (s->L(3) & d->L(3)); + cf |=3D (s->L(2) & ~d->L(2)) | (s->L(3) & ~d->L(3)); +#if SHIFT =3D=3D 2 + zf |=3D (s->L(4) & d->L(4)) | (s->L(5) & d->L(5)); + cf |=3D (s->L(4) & ~d->L(4)) | (s->L(5) & ~d->L(5)); + zf |=3D (s->L(6) & d->L(6)) | (s->L(7) & d->L(7)); + cf |=3D (s->L(6) & ~d->L(6)) | (s->L(7) & ~d->L(7)); +#endif + CC_SRC =3D ((zf >> 31) ? 0 : CC_Z) | ((cf >> 31) ? 0 : CC_C); +} + +void glue(helper_vtestpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +{ + uint64_t zf =3D (s->Q(0) & d->Q(0)) | (s->Q(1) & d->Q(1)); + uint64_t cf =3D (s->Q(0) & ~d->Q(0)) | (s->Q(1) & ~d->Q(1)); + +#if SHIFT =3D=3D 2 + zf |=3D (s->Q(2) & d->Q(2)) | (s->Q(3) & d->Q(3)); + cf |=3D (s->Q(2) & ~d->Q(2)) | (s->Q(3) & ~d->Q(3)); +#endif + CC_SRC =3D ((zf >> 63) ? 0 : CC_Z) | ((cf >> 63) ? 
0 : CC_C); +} + #if SHIFT =3D=3D 2 void glue(helper_vbroadcastdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h index 20db6c4240..8b93b8e6d6 100644 --- a/target/i386/ops_sse_header.h +++ b/target/i386/ops_sse_header.h @@ -427,6 +427,8 @@ DEF_HELPER_4(glue(vpsllvd, SUFFIX), void, env, Reg, Reg= , Reg) DEF_HELPER_4(glue(vpsrlvq, SUFFIX), void, env, Reg, Reg, Reg) DEF_HELPER_4(glue(vpsravq, SUFFIX), void, env, Reg, Reg, Reg) DEF_HELPER_4(glue(vpsllvq, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_3(glue(vtestps, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_3(glue(vtestpd, SUFFIX), void, env, Reg, Reg) #if SHIFT =3D=3D 2 DEF_HELPER_3(glue(vbroadcastdq, SUFFIX), void, env, Reg, Reg) DEF_HELPER_1(vzeroall, void, env) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index 4990470083..2fbb7bfcad 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -3253,6 +3253,8 @@ static const struct SSEOpHelper_table6 sse_op_table6[= 256] =3D { [0x0b] =3D BINARY_OP_MMX(pmulhrsw, SSSE3), [0x0c] =3D BINARY_OP(vpermilps, AVX, 0), [0x0d] =3D BINARY_OP(vpermilpd, AVX, 0), + [0x0e] =3D CMP_OP(vtestps, AVX), + [0x0f] =3D CMP_OP(vtestpd, AVX), [0x10] =3D BLENDV_OP(pblendvb, SSE41, SSE_OPF_MMX), [0x14] =3D BLENDV_OP(blendvps, SSE41, 0), [0x15] =3D BLENDV_OP(blendvpd, SSE41, 0), --=20 2.36.0 From nobody Thu May 2 14:26:59 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1650838537332334.99634111243006; Sun, 24 Apr 2022 15:15:37 -0700 (PDT) Received: from localhost ([::1]:39106 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nikVz-0004xe-Ka for importer@patchew.org; Sun, 24 Apr 2022 18:15:35 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50458) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nikS1-00014a-U5 for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:11:30 -0400 Received: from nowt.default.pbrook.uk0.bigv.io ([2001:41c8:51:832:fcff:ff:fe00:46dd]:58759) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nikRz-0002mj-SF for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:11:29 -0400 Received: from cpc91554-seac25-2-0-cust857.7-2.cable.virginm.net ([82.27.199.90] helo=wren.home) by nowt.default.pbrook.uk0.bigv.io with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1nikJB-0001ea-Jq; Sun, 24 Apr 2022 23:02:21 +0100 From: Paul Brook To: Paolo Bonzini , Richard Henderson , Eduardo Habkost Subject: [PATCH v2 33/42] i386: Implement VMASKMOV Date: Sun, 24 Apr 2022 23:01:55 +0100 Message-Id: <20220424220204.2493824-34-paul@nowt.org> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220418173904.3746036-1-paul@nowt.org> References: <20220418173904.3746036-1-paul@nowt.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2001:41c8:51:832:fcff:ff:fe00:46dd; envelope-from=paul@nowt.org; 
helo=nowt.default.pbrook.uk0.bigv.io X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "open list:All patches CC here" , Paul Brook Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZM-MESSAGEID: 1650838538886100001 Content-Type: text/plain; charset="utf-8" Decoding these is a bit messy, but at least the integer and float variants have the same semantics once decoded. We don't try and be clever with the load forms, instead load the whole vector then mask out the elements we want. Signed-off-by: Paul Brook --- target/i386/ops_sse.h | 48 ++++++++++++++++++++++++++++++++++++ target/i386/ops_sse_header.h | 4 +++ target/i386/tcg/translate.c | 34 +++++++++++++++++++++++++ 3 files changed, 86 insertions(+) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index edf14a25d7..ffcba3d02c 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -3240,6 +3240,54 @@ void glue(helper_vtestpd, SUFFIX)(CPUX86State *env, = Reg *d, Reg *s) CC_SRC =3D ((zf >> 63) ? 0 : CC_Z) | ((cf >> 63) ? 0 : CC_C); } =20 +void glue(helper_vpmaskmovd_st, SUFFIX)(CPUX86State *env, + Reg *s, Reg *v, target_ulong a0) +{ + int i; + + for (i =3D 0; i < (2 << SHIFT); i++) { + if (v->L(i) >> 31) { + cpu_stl_data_ra(env, a0 + i * 4, s->L(i), GETPC()); + } + } +} + +void glue(helper_vpmaskmovq_st, SUFFIX)(CPUX86State *env, + Reg *s, Reg *v, target_ulong a0) +{ + int i; + + for (i =3D 0; i < (1 << SHIFT); i++) { + if (v->Q(i) >> 63) { + cpu_stq_data_ra(env, a0 + i * 8, s->Q(i), GETPC()); + } + } +} + +void glue(helper_vpmaskmovd, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg= *s) +{ + d->L(0) =3D (v->L(0) >> 31) ? s->L(0) : 0; + d->L(1) =3D (v->L(1) >> 31) ? s->L(1) : 0; + d->L(2) =3D (v->L(2) >> 31) ? s->L(2) : 0; + d->L(3) =3D (v->L(3) >> 31) ? s->L(3) : 0; +#if SHIFT =3D=3D 2 + d->L(4) =3D (v->L(4) >> 31) ? s->L(4) : 0; + d->L(5) =3D (v->L(5) >> 31) ? s->L(5) : 0; + d->L(6) =3D (v->L(6) >> 31) ? s->L(6) : 0; + d->L(7) =3D (v->L(7) >> 31) ? s->L(7) : 0; +#endif +} + +void glue(helper_vpmaskmovq, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg= *s) +{ + d->Q(0) =3D (v->Q(0) >> 63) ? s->Q(0) : 0; + d->Q(1) =3D (v->Q(1) >> 63) ? s->Q(1) : 0; +#if SHIFT =3D=3D 2 + d->Q(2) =3D (v->Q(2) >> 63) ? s->Q(2) : 0; + d->Q(3) =3D (v->Q(3) >> 63) ? 
s->Q(3) : 0; +#endif +} + #if SHIFT =3D=3D 2 void glue(helper_vbroadcastdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h index 8b93b8e6d6..a7a6bf6b10 100644 --- a/target/i386/ops_sse_header.h +++ b/target/i386/ops_sse_header.h @@ -429,6 +429,10 @@ DEF_HELPER_4(glue(vpsravq, SUFFIX), void, env, Reg, Re= g, Reg) DEF_HELPER_4(glue(vpsllvq, SUFFIX), void, env, Reg, Reg, Reg) DEF_HELPER_3(glue(vtestps, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(glue(vtestpd, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_4(glue(vpmaskmovd_st, SUFFIX), void, env, Reg, Reg, tl) +DEF_HELPER_4(glue(vpmaskmovq_st, SUFFIX), void, env, Reg, Reg, tl) +DEF_HELPER_4(glue(vpmaskmovd, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_4(glue(vpmaskmovq, SUFFIX), void, env, Reg, Reg, Reg) #if SHIFT =3D=3D 2 DEF_HELPER_3(glue(vbroadcastdq, SUFFIX), void, env, Reg, Reg) DEF_HELPER_1(vzeroall, void, env) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index 2fbb7bfcad..e00195d301 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -3277,6 +3277,10 @@ static const struct SSEOpHelper_table6 sse_op_table6= [256] =3D { [0x29] =3D BINARY_OP(pcmpeqq, SSE41, SSE_OPF_MMX), [0x2a] =3D SPECIAL_OP(SSE41), /* movntqda */ [0x2b] =3D BINARY_OP(packusdw, SSE41, SSE_OPF_MMX), + [0x2c] =3D BINARY_OP(vpmaskmovd, AVX, 0), /* vmaskmovps */ + [0x2d] =3D BINARY_OP(vpmaskmovq, AVX, 0), /* vmaskmovpd */ + [0x2e] =3D SPECIAL_OP(AVX), /* vmaskmovps */ + [0x2f] =3D SPECIAL_OP(AVX), /* vmaskmovpd */ [0x30] =3D UNARY_OP(pmovzxbw, SSE41, SSE_OPF_MMX), [0x31] =3D UNARY_OP(pmovzxbd, SSE41, SSE_OPF_MMX), [0x32] =3D UNARY_OP(pmovzxbq, SSE41, SSE_OPF_MMX), @@ -3308,6 +3312,9 @@ static const struct SSEOpHelper_table6 sse_op_table6[= 256] =3D { [0x78] =3D UNARY_OP(vbroadcastb, AVX, SSE_OPF_SCALAR | SSE_OPF_MMX), /* vpbroadcastw */ [0x79] =3D UNARY_OP(vbroadcastw, AVX, SSE_OPF_SCALAR | SSE_OPF_MMX), + /* vpmaskmovd, vpmaskmovq */ + [0x8c] =3D BINARY_OP(vpmaskmovd, AVX, SSE_OPF_AVX2), + [0x8e] =3D SPECIAL_OP(AVX), /* vpmaskmovd, vpmaskmovq */ #define gen_helper_aesimc_ymm NULL [0xdb] =3D UNARY_OP(aesimc, AES, 0), [0xdc] =3D BINARY_OP(aesenc, AES, 0), @@ -3369,6 +3376,11 @@ static const SSEFunc_0_eppp sse_op_table8[3][2] =3D { SSE_OP(vpsravq), SSE_OP(vpsllvq), }; + +static const SSEFunc_0_eppt sse_op_table9[2][2] =3D { + SSE_OP(vpmaskmovd_st), + SSE_OP(vpmaskmovq_st), +}; #undef SSE_OP =20 /* VEX prefix not allowed */ @@ -4394,6 +4406,22 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, gen_clear_ymmh(s, reg); } return; + case 0x2e: /* maskmovpd */ + b1 =3D 0; + goto vpmaskmov; + case 0x2f: /* maskmovpd */ + b1 =3D 1; + goto vpmaskmov; + case 0x8e: /* vpmaskmovd, vpmaskmovq */ + CHECK_AVX2(s); + b1 =3D REX_W(s); + vpmaskmov: + tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); + v_offset =3D ZMM_OFFSET(reg_v); + tcg_gen_addi_ptr(s->ptr2, cpu_env, v_offset); + sse_op_table9[b1][s->vex_l](cpu_env, + s->ptr0, s->ptr2, s->A0); + return; default: size =3D 128; } @@ -4456,6 +4484,12 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, if (REX_W(s)) { if (b >=3D 0x45 && b <=3D 0x47) { fn =3D sse_op_table8[b - 0x45][b1 - 1]; + } else if (b =3D=3D 0x8c) { + if (s->vex_l) { + fn =3D gen_helper_vpmaskmovq_ymm; + } else { + fn =3D gen_helper_vpmaskmovq_xmm; + } } } fn(cpu_env, s->ptr0, s->ptr2, s->ptr1); --=20 2.36.0 From nobody Thu May 2 14:26:59 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass 
(zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted
 sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org
From: Paul Brook
To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
Cc: "open list:All patches CC here", Paul Brook
Subject: [PATCH v2 34/42] i386: Implement VGATHER
Date: Sun, 24 Apr 2022 23:01:56 +0100
Message-Id: <20220424220204.2493824-35-paul@nowt.org>
In-Reply-To: <20220418173904.3746036-1-paul@nowt.org>
References: <20220418173904.3746036-1-paul@nowt.org>
Content-Type: text/plain; charset="utf-8"

These are gather load instructions that introduce a new "Vector SIB"
encoding. Also a bit of hair to handle different index sizes and scaling
factors, but overall the combinatorial explosion doesn't end up too bad.

The other thing of note is probably that these also modify the mask
operand. Thankfully the operands may not overlap, and we do not have to
make the whole thing appear atomic.
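
For reference, a C sketch of the vpgatherdd case (not part of the
patch; all names here are made up). The detail worth noting is that the
mask operand is written back: every mask element is cleared as the
instruction completes, whether or not that element was gathered.

    #include <stdint.h>
    #include <string.h>

    /* vpgatherdd: nelem is 4 (xmm) or 8 (ymm), scale_shift is 0..3. */
    static void gather_dd(uint32_t *d, uint32_t *mask, const uint8_t *base,
                          const int32_t *index, int scale_shift, int nelem)
    {
        for (int i = 0; i < nelem; i++) {
            if (mask[i] >> 31) {            /* sign bit selects element */
                memcpy(&d[i],
                       base + ((int64_t)index[i] << scale_shift),
                       sizeof(d[i]));
            } /* unselected destination elements keep their old value */
            mask[i] = 0;                    /* mask register is consumed */
        }
    }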
Signed-off-by: Paul Brook --- target/i386/ops_sse.h | 65 +++++++++++++++++++++++++++++++ target/i386/ops_sse_header.h | 16 ++++++++ target/i386/tcg/translate.c | 74 ++++++++++++++++++++++++++++++++++++ 3 files changed, 155 insertions(+) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index ffcba3d02c..14a2d1bf78 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -3288,6 +3288,71 @@ void glue(helper_vpmaskmovq, SUFFIX)(CPUX86State *en= v, Reg *d, Reg *v, Reg *s) #endif } =20 +#define VGATHER_HELPER(scale) \ +void glue(helper_vpgatherdd ## scale, SUFFIX)(CPUX86State *env, \ + Reg *d, Reg *v, Reg *s, target_ulong a0) \ +{ \ + int i; \ + for (i =3D 0; i < (2 << SHIFT); i++) { \ + if (v->L(i) >> 31) { \ + target_ulong addr =3D a0 \ + + ((target_ulong)(int32_t)s->L(i) << scale); \ + d->L(i) =3D cpu_ldl_data_ra(env, addr, GETPC()); \ + } \ + v->L(i) =3D 0; \ + } \ +} \ +void glue(helper_vpgatherdq ## scale, SUFFIX)(CPUX86State *env, \ + Reg *d, Reg *v, Reg *s, target_ulong a0) \ +{ \ + int i; \ + for (i =3D 0; i < (1 << SHIFT); i++) { \ + if (v->Q(i) >> 63) { \ + target_ulong addr =3D a0 \ + + ((target_ulong)(int32_t)s->L(i) << scale); \ + d->Q(i) =3D cpu_ldq_data_ra(env, addr, GETPC()); \ + } \ + v->Q(i) =3D 0; \ + } \ +} \ +void glue(helper_vpgatherqd ## scale, SUFFIX)(CPUX86State *env, \ + Reg *d, Reg *v, Reg *s, target_ulong a0) \ +{ \ + int i; \ + for (i =3D 0; i < (1 << SHIFT); i++) { \ + if (v->L(i) >> 31) { \ + target_ulong addr =3D a0 \ + + ((target_ulong)(int64_t)s->Q(i) << scale); \ + d->L(i) =3D cpu_ldl_data_ra(env, addr, GETPC()); \ + } \ + v->L(i) =3D 0; \ + } \ + d->Q(SHIFT) =3D 0; \ + v->Q(SHIFT) =3D 0; \ + YMM_ONLY( \ + d->Q(3) =3D 0; \ + v->Q(3) =3D 0; \ + ) \ +} \ +void glue(helper_vpgatherqq ## scale, SUFFIX)(CPUX86State *env, \ + Reg *d, Reg *v, Reg *s, target_ulong a0) \ +{ \ + int i; \ + for (i =3D 0; i < (1 << SHIFT); i++) { \ + if (v->Q(i) >> 63) { \ + target_ulong addr =3D a0 \ + + ((target_ulong)(int64_t)s->Q(i) << scale); \ + d->Q(i) =3D cpu_ldq_data_ra(env, addr, GETPC()); \ + } \ + v->Q(i) =3D 0; \ + } \ +} + +VGATHER_HELPER(0) +VGATHER_HELPER(1) +VGATHER_HELPER(2) +VGATHER_HELPER(3) + #if SHIFT =3D=3D 2 void glue(helper_vbroadcastdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h index a7a6bf6b10..e5d8ea9bb7 100644 --- a/target/i386/ops_sse_header.h +++ b/target/i386/ops_sse_header.h @@ -433,6 +433,22 @@ DEF_HELPER_4(glue(vpmaskmovd_st, SUFFIX), void, env, R= eg, Reg, tl) DEF_HELPER_4(glue(vpmaskmovq_st, SUFFIX), void, env, Reg, Reg, tl) DEF_HELPER_4(glue(vpmaskmovd, SUFFIX), void, env, Reg, Reg, Reg) DEF_HELPER_4(glue(vpmaskmovq, SUFFIX), void, env, Reg, Reg, Reg) +DEF_HELPER_5(glue(vpgatherdd0, SUFFIX), void, env, Reg, Reg, Reg, tl) +DEF_HELPER_5(glue(vpgatherdq0, SUFFIX), void, env, Reg, Reg, Reg, tl) +DEF_HELPER_5(glue(vpgatherqd0, SUFFIX), void, env, Reg, Reg, Reg, tl) +DEF_HELPER_5(glue(vpgatherqq0, SUFFIX), void, env, Reg, Reg, Reg, tl) +DEF_HELPER_5(glue(vpgatherdd1, SUFFIX), void, env, Reg, Reg, Reg, tl) +DEF_HELPER_5(glue(vpgatherdq1, SUFFIX), void, env, Reg, Reg, Reg, tl) +DEF_HELPER_5(glue(vpgatherqd1, SUFFIX), void, env, Reg, Reg, Reg, tl) +DEF_HELPER_5(glue(vpgatherqq1, SUFFIX), void, env, Reg, Reg, Reg, tl) +DEF_HELPER_5(glue(vpgatherdd2, SUFFIX), void, env, Reg, Reg, Reg, tl) +DEF_HELPER_5(glue(vpgatherdq2, SUFFIX), void, env, Reg, Reg, Reg, tl) +DEF_HELPER_5(glue(vpgatherqd2, SUFFIX), void, env, Reg, Reg, Reg, tl) +DEF_HELPER_5(glue(vpgatherqq2, SUFFIX), 
void, env, Reg, Reg, Reg, tl) +DEF_HELPER_5(glue(vpgatherdd3, SUFFIX), void, env, Reg, Reg, Reg, tl) +DEF_HELPER_5(glue(vpgatherdq3, SUFFIX), void, env, Reg, Reg, Reg, tl) +DEF_HELPER_5(glue(vpgatherqd3, SUFFIX), void, env, Reg, Reg, Reg, tl) +DEF_HELPER_5(glue(vpgatherqq3, SUFFIX), void, env, Reg, Reg, Reg, tl) #if SHIFT =3D=3D 2 DEF_HELPER_3(glue(vbroadcastdq, SUFFIX), void, env, Reg, Reg) DEF_HELPER_1(vzeroall, void, env) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index e00195d301..fe1ab58d07 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -3315,6 +3315,10 @@ static const struct SSEOpHelper_table6 sse_op_table6= [256] =3D { /* vpmaskmovd, vpmaskmovq */ [0x8c] =3D BINARY_OP(vpmaskmovd, AVX, SSE_OPF_AVX2), [0x8e] =3D SPECIAL_OP(AVX), /* vpmaskmovd, vpmaskmovq */ + [0x90] =3D SPECIAL_OP(AVX), /* vpgatherdd, vpgatherdq */ + [0x91] =3D SPECIAL_OP(AVX), /* vpgatherqd, vpgatherqq */ + [0x92] =3D SPECIAL_OP(AVX), /* vgatherdpd, vgatherdps */ + [0x93] =3D SPECIAL_OP(AVX), /* vgatherqpd, vgatherqps */ #define gen_helper_aesimc_ymm NULL [0xdb] =3D UNARY_OP(aesimc, AES, 0), [0xdc] =3D BINARY_OP(aesenc, AES, 0), @@ -3381,6 +3385,25 @@ static const SSEFunc_0_eppt sse_op_table9[2][2] =3D { SSE_OP(vpmaskmovd_st), SSE_OP(vpmaskmovq_st), }; + +static const SSEFunc_0_epppt sse_op_table10[16][2] =3D { + SSE_OP(vpgatherdd0), + SSE_OP(vpgatherdq0), + SSE_OP(vpgatherqd0), + SSE_OP(vpgatherqq0), + SSE_OP(vpgatherdd1), + SSE_OP(vpgatherdq1), + SSE_OP(vpgatherqd1), + SSE_OP(vpgatherqq1), + SSE_OP(vpgatherdd2), + SSE_OP(vpgatherdq2), + SSE_OP(vpgatherqd2), + SSE_OP(vpgatherqq2), + SSE_OP(vpgatherdd3), + SSE_OP(vpgatherdq3), + SSE_OP(vpgatherqd3), + SSE_OP(vpgatherqq3), +}; #undef SSE_OP =20 /* VEX prefix not allowed */ @@ -4350,6 +4373,57 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, } op1_offset =3D ZMM_OFFSET(reg); =20 + if ((b & 0xfc) =3D=3D 0x90) { /* vgather */ + int scale, index, base; + target_long disp =3D 0; + CHECK_AVX2(s); + if (mod =3D=3D 3 || rm !=3D 4) { + goto illegal_op; + } + + /* Vector SIB */ + val =3D x86_ldub_code(env, s); + scale =3D (val >> 6) & 3; + index =3D ((val >> 3) & 7) | REX_X(s); + base =3D (val & 7) | REX_B(s); + switch (mod) { + case 0: + if (base =3D=3D 5) { + base =3D -1; + disp =3D (int32_t)x86_ldl_code(env, s); + } + break; + case 1: + disp =3D (int8_t)x86_ldub_code(env, s); + break; + default: + case 2: + disp =3D (int32_t)x86_ldl_code(env, s); + break; + } + + /* destination, index and mask registers must not over= lap */ + if (reg =3D=3D index || reg =3D=3D reg_v) { + goto illegal_op; + } + + tcg_gen_addi_tl(s->A0, cpu_regs[base], disp); + gen_add_A0_ds_seg(s); + op2_offset =3D ZMM_OFFSET(index); + v_offset =3D ZMM_OFFSET(reg_v); + tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); + tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); + tcg_gen_addi_ptr(s->ptr2, cpu_env, v_offset); + b1 =3D REX_W(s) | ((b & 1) << 1) | (scale << 2); + sse_op_table10[b1][s->vex_l](cpu_env, + s->ptr0, s->ptr2, s->ptr1, s->A0); + if (!s->vex_l) { + gen_clear_ymmh(s, reg); + gen_clear_ymmh(s, reg_v); + } + return; + } + if (op6.flags & SSE_OPF_MMX) { CHECK_AVX2_256(s); } --=20 2.36.0 From nobody Thu May 2 14:26:59 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by 
mx.zohomail.com with SMTPS id 1650839422000810.7381125490103; Sun, 24 Apr
 2022 15:30:22 -0700 (PDT)
From: Paul Brook
To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
Subject: [PATCH v2 35/42] i386: Implement VPERM
Date: Sun, 24 Apr 2022 23:01:57 +0100
Message-Id: <20220424220204.2493824-36-paul@nowt.org>
In-Reply-To: <20220418173904.3746036-1-paul@nowt.org>
References: <20220418173904.3746036-1-paul@nowt.org>
Cc: "open list:All patches CC here" , Paul Brook
Content-Type: text/plain; charset="utf-8"

A set of shuffle operations that operate on complete 256 bit registers.
The integer and floating point variants have identical semantics.
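In plain C the core VPERMD operation is just a lane-indexed copy; a minimal
sketch (not the QEMU helper itself, though the buffering through r[] mirrors
the pattern helper_vpermd_ymm uses below so that destination and sources may
alias):

    #include <stdint.h>

    /* Sketch of VPERMD: each result dword is the source dword selected
     * by the low three bits of the corresponding control dword. */
    static void vpermd_sketch(uint32_t d[8], const uint32_t ctl[8],
                              const uint32_t s[8])
    {
        uint32_t r[8];
        for (int i = 0; i < 8; i++) {
            r[i] = s[ctl[i] & 7];
        }
        for (int i = 0; i < 8; i++) {
            d[i] = r[i];
        }
    }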
Signed-off-by: Paul Brook --- target/i386/ops_sse.h | 73 ++++++++++++++++++++++++++++++++++++ target/i386/ops_sse_header.h | 3 ++ target/i386/tcg/translate.c | 9 +++++ 3 files changed, 85 insertions(+) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index 14a2d1bf78..04d2006cd8 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -3407,6 +3407,79 @@ void helper_vzeroupper_hi8(CPUX86State *env) } } #endif + +void helper_vpermdq_ymm(CPUX86State *env, + Reg *d, Reg *v, Reg *s, uint32_t order) +{ + uint64_t r0, r1, r2, r3; + + switch (order & 3) { + case 0: + r0 =3D v->Q(0); + r1 =3D v->Q(1); + break; + case 1: + r0 =3D v->Q(2); + r1 =3D v->Q(3); + break; + case 2: + r0 =3D s->Q(0); + r1 =3D s->Q(1); + break; + case 3: + r0 =3D s->Q(2); + r1 =3D s->Q(3); + break; + } + switch ((order >> 4) & 3) { + case 0: + r2 =3D v->Q(0); + r3 =3D v->Q(1); + break; + case 1: + r2 =3D v->Q(2); + r3 =3D v->Q(3); + break; + case 2: + r2 =3D s->Q(0); + r3 =3D s->Q(1); + break; + case 3: + r2 =3D s->Q(2); + r3 =3D s->Q(3); + break; + } + d->Q(0) =3D r0; + d->Q(1) =3D r1; + d->Q(2) =3D r2; + d->Q(3) =3D r3; +} + +void helper_vpermq_ymm(CPUX86State *env, Reg *d, Reg *s, uint32_t order) +{ + uint64_t r0, r1, r2, r3; + r0 =3D s->Q(order & 3); + r1 =3D s->Q((order >> 2) & 3); + r2 =3D s->Q((order >> 4) & 3); + r3 =3D s->Q((order >> 6) & 3); + d->Q(0) =3D r0; + d->Q(1) =3D r1; + d->Q(2) =3D r2; + d->Q(3) =3D r3; +} + +void helper_vpermd_ymm(CPUX86State *env, Reg *d, Reg *v, Reg *s) +{ + uint32_t r[8]; + int i; + + for (i =3D 0; i < 8; i++) { + r[i] =3D s->L(v->L(i) & 7); + } + for (i =3D 0; i < 8; i++) { + d->L(i) =3D r[i]; + } +} #endif #endif =20 diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h index e5d8ea9bb7..099e6e8ffc 100644 --- a/target/i386/ops_sse_header.h +++ b/target/i386/ops_sse_header.h @@ -457,6 +457,9 @@ DEF_HELPER_1(vzeroupper, void, env) DEF_HELPER_1(vzeroall_hi8, void, env) DEF_HELPER_1(vzeroupper_hi8, void, env) #endif +DEF_HELPER_5(vpermdq_ymm, void, env, Reg, Reg, Reg, i32) +DEF_HELPER_4(vpermq_ymm, void, env, Reg, Reg, i32) +DEF_HELPER_4(vpermd_ymm, void, env, Reg, Reg, Reg) #endif #endif =20 diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index fe1ab58d07..5a11d3c083 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -3258,6 +3258,8 @@ static const struct SSEOpHelper_table6 sse_op_table6[= 256] =3D { [0x10] =3D BLENDV_OP(pblendvb, SSE41, SSE_OPF_MMX), [0x14] =3D BLENDV_OP(blendvps, SSE41, 0), [0x15] =3D BLENDV_OP(blendvpd, SSE41, 0), +#define gen_helper_vpermd_xmm NULL + [0x16] =3D BINARY_OP(vpermd, AVX, SSE_OPF_AVX2), /* vpermps */ [0x17] =3D CMP_OP(ptest, SSE41), /* TODO:Some vbroadcast variants require AVX2 */ [0x18] =3D UNARY_OP(vbroadcastl, AVX, SSE_OPF_SCALAR), /* vbroadcastss= */ @@ -3287,6 +3289,7 @@ static const struct SSEOpHelper_table6 sse_op_table6[= 256] =3D { [0x33] =3D UNARY_OP(pmovzxwd, SSE41, SSE_OPF_MMX), [0x34] =3D UNARY_OP(pmovzxwq, SSE41, SSE_OPF_MMX), [0x35] =3D UNARY_OP(pmovzxdq, SSE41, SSE_OPF_MMX), + [0x36] =3D BINARY_OP(vpermd, AVX, SSE_OPF_AVX2), /* vpermd */ [0x37] =3D BINARY_OP(pcmpgtq, SSE41, SSE_OPF_MMX), [0x38] =3D BINARY_OP(pminsb, SSE41, SSE_OPF_MMX), [0x39] =3D BINARY_OP(pminsd, SSE41, SSE_OPF_MMX), @@ -3329,8 +3332,13 @@ static const struct SSEOpHelper_table6 sse_op_table6= [256] =3D { =20 /* prefix [66] 0f 3a */ static const struct SSEOpHelper_table7 sse_op_table7[256] =3D { +#define gen_helper_vpermq_xmm NULL + [0x00] =3D UNARY_OP(vpermq, AVX, SSE_OPF_AVX2), + 
[0x01] =3D UNARY_OP(vpermq, AVX, SSE_OPF_AVX2), /* vpermpd */ [0x04] =3D UNARY_OP(vpermilps_imm, AVX, 0), [0x05] =3D UNARY_OP(vpermilpd_imm, AVX, 0), +#define gen_helper_vpermdq_xmm NULL + [0x06] =3D BINARY_OP(vpermdq, AVX, 0), /* vperm2f128 */ [0x08] =3D UNARY_OP(roundps, SSE41, 0), [0x09] =3D UNARY_OP(roundpd, SSE41, 0), #define gen_helper_roundss_ymm NULL @@ -3353,6 +3361,7 @@ static const struct SSEOpHelper_table7 sse_op_table7[= 256] =3D { [0x41] =3D BINARY_OP(dppd, SSE41, 0), [0x42] =3D BINARY_OP(mpsadbw, SSE41, SSE_OPF_MMX), [0x44] =3D BINARY_OP(pclmulqdq, PCLMULQDQ, 0), + [0x46] =3D BINARY_OP(vpermdq, AVX, SSE_OPF_AVX2), /* vperm2i128 */ #define gen_helper_pcmpestrm_ymm NULL [0x60] =3D CMP_OP(pcmpestrm, SSE42), #define gen_helper_pcmpestri_ymm NULL --=20 2.36.0 From nobody Thu May 2 14:26:59 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1650839562898771.1326620833174; Sun, 24 Apr 2022 15:32:42 -0700 (PDT) Received: from localhost ([::1]:58332 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nikmW-000466-SE for importer@patchew.org; Sun, 24 Apr 2022 18:32:40 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50918) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nikUH-00039c-AK for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:13:49 -0400 Received: from nowt.default.pbrook.uk0.bigv.io ([2001:41c8:51:832:fcff:ff:fe00:46dd]:58869) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nikUF-0002wg-D7 for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:13:49 -0400 Received: from cpc91554-seac25-2-0-cust857.7-2.cable.virginm.net ([82.27.199.90] helo=wren.home) by nowt.default.pbrook.uk0.bigv.io with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1nikJC-0001ea-7c; Sun, 24 Apr 2022 23:02:22 +0100 From: Paul Brook To: Paolo Bonzini , Richard Henderson , Eduardo Habkost Subject: [PATCH v2 36/42] i386: Implement VINSERT128/VEXTRACT128 Date: Sun, 24 Apr 2022 23:01:58 +0100 Message-Id: <20220424220204.2493824-37-paul@nowt.org> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220418173904.3746036-1-paul@nowt.org> References: <20220418173904.3746036-1-paul@nowt.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2001:41c8:51:832:fcff:ff:fe00:46dd; envelope-from=paul@nowt.org; helo=nowt.default.pbrook.uk0.bigv.io X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "open list:All patches CC here" , Paul Brook Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZM-MESSAGEID: 1650839564166100001 Content-Type: text/plain; 
charset="utf-8"

128-bit vinsert/vextract instructions. The integer and floating point
variants have the same semantics.

This is where we encounter an instruction encoded with VEX.L == 1 and a
128-bit (xmm) destination operand.

Signed-off-by: Paul Brook
---
 target/i386/tcg/translate.c | 78 +++++++++++++++++++++++++++++++++++++
 1 file changed, 78 insertions(+)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 5a11d3c083..4072fa28d3 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2814,6 +2814,24 @@ static inline void gen_op_movo_ymmh(DisasContext *s, int d_offset, int s_offset)
     tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset + offsetof(ZMMReg, ZMM_Q(3)));
 }
 
+static inline void gen_op_movo_ymm_l2h(DisasContext *s,
+                                       int d_offset, int s_offset)
+{
+    tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset + offsetof(ZMMReg, ZMM_Q(0)));
+    tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset + offsetof(ZMMReg, ZMM_Q(2)));
+    tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset + offsetof(ZMMReg, ZMM_Q(1)));
+    tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset + offsetof(ZMMReg, ZMM_Q(3)));
+}
+
+static inline void gen_op_movo_ymm_h2l(DisasContext *s,
+                                       int d_offset, int s_offset)
+{
+    tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset + offsetof(ZMMReg, ZMM_Q(2)));
+    tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset + offsetof(ZMMReg, ZMM_Q(0)));
+    tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset + offsetof(ZMMReg, ZMM_Q(3)));
+    tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset + offsetof(ZMMReg, ZMM_Q(1)));
+}
+
 static inline void gen_op_movq(DisasContext *s, int d_offset, int s_offset)
 {
     tcg_gen_ld_i64(s->tmp1_i64, cpu_env, s_offset);
@@ -3353,9 +3371,13 @@ static const struct SSEOpHelper_table7 sse_op_table7[256] = {
     [0x15] = SPECIAL_OP(SSE41), /* pextrw */
     [0x16] = SPECIAL_OP(SSE41), /* pextrd/pextrq */
     [0x17] = SPECIAL_OP(SSE41), /* extractps */
+    [0x18] = SPECIAL_OP(AVX), /* vinsertf128 */
+    [0x19] = SPECIAL_OP(AVX), /* vextractf128 */
     [0x20] = SPECIAL_OP(SSE41), /* pinsrb */
     [0x21] = SPECIAL_OP(SSE41), /* insertps */
     [0x22] = SPECIAL_OP(SSE41), /* pinsrd/pinsrq */
+    [0x38] = SPECIAL_OP(AVX), /* vinserti128 */
+    [0x39] = SPECIAL_OP(AVX), /* vextracti128 */
     [0x40] = BINARY_OP(dpps, SSE41, 0),
 #define gen_helper_dppd_ymm NULL
     [0x41] = BINARY_OP(dppd, SSE41, 0),
@@ -5145,6 +5167,62 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 }
                 gen_clear_ymmh(s, reg);
                 break;
+            case 0x38: /* vinserti128 */
+                CHECK_AVX2_256(s);
+                /* fall through */
+            case 0x18: /* vinsertf128 */
+                CHECK_AVX(s);
+                if ((s->prefix & PREFIX_VEX) == 0 || s->vex_l == 0) {
+                    goto illegal_op;
+                }
+                if (mod == 3) {
+                    if (val & 1) {
+                        gen_op_movo_ymm_l2h(s, ZMM_OFFSET(reg),
+                                            ZMM_OFFSET(rm));
+                    } else {
+                        gen_op_movo(s, ZMM_OFFSET(reg), ZMM_OFFSET(rm));
+                    }
+                } else {
+                    if (val & 1) {
+                        gen_ldo_env_A0_ymmh(s, ZMM_OFFSET(reg));
+                    } else {
+                        gen_ldo_env_A0(s, ZMM_OFFSET(reg));
+                    }
+                }
+                if (reg != reg_v) {
+                    if (val & 1) {
+                        gen_op_movo(s, ZMM_OFFSET(reg), ZMM_OFFSET(reg_v));
+                    } else {
+                        gen_op_movo_ymmh(s, ZMM_OFFSET(reg),
+                                         ZMM_OFFSET(reg_v));
+                    }
+                }
+                break;
+            case 0x39: /* vextracti128 */
+                CHECK_AVX2_256(s);
+                /* fall through */
+            case 0x19: /* vextractf128 */
+                CHECK_AVX_V0(s);
+                if ((s->prefix & PREFIX_VEX) == 0 || s->vex_l == 0) {
+                    goto illegal_op;
+                }
+                if (mod == 3) {
+                    op1_offset = ZMM_OFFSET(rm);
+                    if (val & 1) {
+                        gen_op_movo_ymm_h2l(s, ZMM_OFFSET(rm),
+                                            ZMM_OFFSET(reg));
+                    } else {
+                        gen_op_movo(s, ZMM_OFFSET(rm), ZMM_OFFSET(reg));
+                    }
+                    gen_clear_ymmh(s, rm);
+                } else {
+                    if (val & 1) {
+                        gen_sto_env_A0_ymmh(s, ZMM_OFFSET(reg));
+                    } else {
+                        gen_sto_env_A0(s, ZMM_OFFSET(reg));
+                    }
+                }
+                break;
             }
             return;
         }
-- 
2.36.0

From nobody Thu May 2 14:26:59 2024
From: Paul Brook
To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
Subject: [PATCH v2 37/42] i386: Implement VBLENDV
Date: Sun, 24 Apr 2022 23:01:59 +0100
Message-Id: <20220424220204.2493824-38-paul@nowt.org>
In-Reply-To: <20220418173904.3746036-1-paul@nowt.org>
References: <20220418173904.3746036-1-paul@nowt.org>
Cc: "open list:All patches CC here" , Paul Brook
Content-Type: text/plain; charset="utf-8"

The AVX variants of the BLENDV instructions use a different opcode prefix
to support the additional operands. We already modified the helper
functions in anticipation of this.
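The encoding detail worth calling out is the fourth operand: the mask
register comes from the top four bits of the immediate ("is4") byte, which
is what the ZMM_OFFSET(val >> 4) in the translator change below implements.
The byte-wise blend itself can be sketched in plain C as (an illustration
with invented names, not the QEMU helper):

    #include <stdint.h>

    /* Sketch of (V)PBLENDVB: take each byte from the second source when
     * the top bit of the corresponding mask byte is set, otherwise from
     * the first source. */
    static void pblendvb_sketch(uint8_t *d, const uint8_t *v,
                                const uint8_t *s, const uint8_t *mask,
                                int nbytes)
    {
        for (int i = 0; i < nbytes; i++) {
            d[i] = (mask[i] & 0x80) ? s[i] : v[i];
        }
    }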
Signed-off-by: Paul Brook --- target/i386/tcg/translate.c | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index 4072fa28d3..95ecdea8fe 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -3384,6 +3384,9 @@ static const struct SSEOpHelper_table7 sse_op_table7[= 256] =3D { [0x42] =3D BINARY_OP(mpsadbw, SSE41, SSE_OPF_MMX), [0x44] =3D BINARY_OP(pclmulqdq, PCLMULQDQ, 0), [0x46] =3D BINARY_OP(vpermdq, AVX, SSE_OPF_AVX2), /* vperm2i128 */ + [0x4a] =3D BLENDV_OP(blendvps, AVX, 0), + [0x4b] =3D BLENDV_OP(blendvpd, AVX, 0), + [0x4c] =3D BLENDV_OP(pblendvb, AVX, SSE_OPF_MMX), #define gen_helper_pcmpestrm_ymm NULL [0x60] =3D CMP_OP(pcmpestrm, SSE42), #define gen_helper_pcmpestri_ymm NULL @@ -5268,6 +5271,10 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, } =20 /* SSE */ + if (op7.flags & SSE_OPF_BLENDV && !(s->prefix & PREFIX_VEX)) { + /* Only VEX encodings are valid for these blendv opcodes */ + goto illegal_op; + } op1_offset =3D ZMM_OFFSET(reg); if (mod =3D=3D 3) { op2_offset =3D ZMM_OFFSET(rm | REX_B(s)); @@ -5316,8 +5323,15 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, op7.fn[b1].op1(cpu_env, s->ptr0, s->ptr1, tcg_const_i32(va= l)); } else { tcg_gen_addi_ptr(s->ptr2, cpu_env, v_offset); - op7.fn[b1].op2(cpu_env, s->ptr0, s->ptr2, s->ptr1, - tcg_const_i32(val)); + if (op7.flags & SSE_OPF_BLENDV) { + TCGv_ptr mask =3D tcg_temp_new_ptr(); + tcg_gen_addi_ptr(mask, cpu_env, ZMM_OFFSET(val >> 4)); + op7.fn[b1].op3(cpu_env, s->ptr0, s->ptr2, s->ptr1, mas= k); + tcg_temp_free_ptr(mask); + } else { + op7.fn[b1].op2(cpu_env, s->ptr0, s->ptr2, s->ptr1, + tcg_const_i32(val)); + } } if ((op7.flags & SSE_OPF_CMP) =3D=3D 0 && s->vex_l =3D=3D 0) { gen_clear_ymmh(s, reg); --=20 2.36.0 From nobody Thu May 2 14:26:59 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1650838683775581.2849443940191; Sun, 24 Apr 2022 15:18:03 -0700 (PDT) Received: from localhost ([::1]:46872 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nikYM-0001pG-Ce for importer@patchew.org; Sun, 24 Apr 2022 18:18:02 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50718) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nikT5-0001ji-K3 for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:12:37 -0400 Received: from nowt.default.pbrook.uk0.bigv.io ([2001:41c8:51:832:fcff:ff:fe00:46dd]:58817) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nikT2-0002rr-Nn for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:12:35 -0400 Received: from cpc91554-seac25-2-0-cust857.7-2.cable.virginm.net ([82.27.199.90] helo=wren.home) by nowt.default.pbrook.uk0.bigv.io with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1nikJC-0001ea-K6; Sun, 24 Apr 2022 23:02:22 +0100 From: Paul Brook To: Paolo Bonzini , Richard Henderson , Eduardo Habkost Subject: [PATCH v2 38/42] i386: Implement VPBLENDD Date: Sun, 24 Apr 2022 23:02:00 +0100 Message-Id: <20220424220204.2493824-39-paul@nowt.org> X-Mailer: git-send-email 2.36.0 In-Reply-To: 
<20220418173904.3746036-1-paul@nowt.org>
References: <20220418173904.3746036-1-paul@nowt.org>
Cc: "open list:All patches CC here" , Paul Brook
Content-Type: text/plain; charset="utf-8"

This is semantically equivalent to VBLENDPS.

Signed-off-by: Paul Brook
---
 target/i386/tcg/translate.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 95ecdea8fe..73f3842c36 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3353,6 +3353,7 @@ static const struct SSEOpHelper_table7 sse_op_table7[256] = {
 #define gen_helper_vpermq_xmm NULL
     [0x00] = UNARY_OP(vpermq, AVX, SSE_OPF_AVX2),
     [0x01] = UNARY_OP(vpermq, AVX, SSE_OPF_AVX2), /* vpermpd */
+    [0x02] = BINARY_OP(blendps, AVX, SSE_OPF_AVX2), /* vpblendd */
     [0x04] = UNARY_OP(vpermilps_imm, AVX, 0),
     [0x05] = UNARY_OP(vpermilpd_imm, AVX, 0),
 #define gen_helper_vpermdq_xmm NULL
-- 
2.36.0

From nobody Thu May 2 14:26:59 2024
From: Paul Brook
To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
Subject: [PATCH v2 39/42] i386: Enable AVX cpuid bits when using TCG
Date: Sun, 24 Apr 2022 23:02:01 +0100
Message-Id: <20220424220204.2493824-40-paul@nowt.org>
X-Mailer: git-send-email 2.36.0
In-Reply-To:
<20220418173904.3746036-1-paul@nowt.org> References: <20220418173904.3746036-1-paul@nowt.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2001:41c8:51:832:fcff:ff:fe00:46dd; envelope-from=paul@nowt.org; helo=nowt.default.pbrook.uk0.bigv.io X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01, UPPERCASE_50_75=0.008 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "open list:All patches CC here" , Paul Brook Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZM-MESSAGEID: 1650839405273100001 Content-Type: text/plain; charset="utf-8" Include AVX and AVX2 in the guest cpuid features supported by TCG Signed-off-by: Paul Brook --- target/i386/cpu.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 99343be926..bd35233d5b 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -625,12 +625,12 @@ void x86_cpu_vendor_words2str(char *dst, uint32_t ven= dor1, CPUID_EXT_SSE41 | CPUID_EXT_SSE42 | CPUID_EXT_POPCNT | \ CPUID_EXT_XSAVE | /* CPUID_EXT_OSXSAVE is dynamic */ \ CPUID_EXT_MOVBE | CPUID_EXT_AES | CPUID_EXT_HYPERVISOR | \ - CPUID_EXT_RDRAND) + CPUID_EXT_RDRAND | CPUID_EXT_AVX) /* missing: CPUID_EXT_DTES64, CPUID_EXT_DSCPL, CPUID_EXT_VMX, CPUID_EXT_SMX, CPUID_EXT_EST, CPUID_EXT_TM2, CPUID_EXT_CID, CPUID_EXT_FMA, CPUID_EXT_XTPR, CPUID_EXT_PDCM, CPUID_EXT_PCID, CPUID_EXT_DCA, - CPUID_EXT_X2APIC, CPUID_EXT_TSC_DEADLINE_TIMER, CPUID_EXT_AVX, + CPUID_EXT_X2APIC, CPUID_EXT_TSC_DEADLINE_TIMER, CPUID_EXT_F16C */ =20 #ifdef TARGET_X86_64 @@ -653,9 +653,9 @@ void x86_cpu_vendor_words2str(char *dst, uint32_t vendo= r1, CPUID_7_0_EBX_BMI1 | CPUID_7_0_EBX_BMI2 | CPUID_7_0_EBX_ADX | \ CPUID_7_0_EBX_PCOMMIT | CPUID_7_0_EBX_CLFLUSHOPT | \ CPUID_7_0_EBX_CLWB | CPUID_7_0_EBX_MPX | CPUID_7_0_EBX_FSGSBASE = | \ - CPUID_7_0_EBX_ERMS) + CPUID_7_0_EBX_ERMS | CPUID_7_0_EBX_AVX2) /* missing: - CPUID_7_0_EBX_HLE, CPUID_7_0_EBX_AVX2, + CPUID_7_0_EBX_HLE CPUID_7_0_EBX_INVPCID, CPUID_7_0_EBX_RTM, CPUID_7_0_EBX_RDSEED */ #define TCG_7_0_ECX_FEATURES (CPUID_7_0_ECX_UMIP | CPUID_7_0_ECX_PKU | \ --=20 2.36.0 From nobody Thu May 2 14:26:59 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1650838886380934.6339786961482; Sun, 24 Apr 2022 15:21:26 -0700 (PDT) Received: from localhost ([::1]:54764 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nikbd-0007Au-8M for importer@patchew.org; Sun, 24 Apr 2022 18:21:25 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50770) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nikTK-0001oO-Hq for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:12:51 -0400 Received: from 
nowt.default.pbrook.uk0.bigv.io ([2001:41c8:51:832:fcff:ff:fe00:46dd]:58831)
 by eggs.gnu.org; Sun, 24 Apr 2022 18:12:50 -0400
From: Paul Brook
To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
Subject: [PATCH v2 40/42] Enable all x86-64 cpu features in user mode
Date: Sun, 24 Apr 2022 23:02:02 +0100
Message-Id: <20220424220204.2493824-41-paul@nowt.org>
In-Reply-To: <20220418173904.3746036-1-paul@nowt.org>
References: <20220418173904.3746036-1-paul@nowt.org>
Cc: "open list:All patches CC here" , Laurent Vivier , Paul Brook
Content-Type: text/plain; charset="utf-8"

We don't have any migration concerns for usermode emulation, so we may as
well enable all available CPU features by default.
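The effect is easy to see from the command line; for example (the binary
names here are placeholders, and -cpu remains the usual way to override the
model if the old default is wanted):

    qemu-x86_64 ./some-avx2-program          # now runs with the "max" model
    qemu-x86_64 -cpu qemu64 ./some-program   # previous default behaviour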
Signed-off-by: Paul Brook
---
 linux-user/x86_64/target_elf.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/linux-user/x86_64/target_elf.h b/linux-user/x86_64/target_elf.h
index 7b76a90de8..3f628f8d66 100644
--- a/linux-user/x86_64/target_elf.h
+++ b/linux-user/x86_64/target_elf.h
@@ -9,6 +9,6 @@
 #define X86_64_TARGET_ELF_H
 static inline const char *cpu_get_model(uint32_t eflags)
 {
-    return "qemu64";
+    return "max";
 }
 #endif
-- 
2.36.0

From nobody Thu May 2 14:26:59 2024
From: Paul Brook
To: Paolo Bonzini, Richard Henderson, Eduardo Habkost
Subject: [PATCH v2 41/42] AVX tests
Date: Sun, 24 Apr 2022 23:02:03 +0100
Message-Id: <20220424220204.2493824-42-paul@nowt.org>
In-Reply-To: <20220418173904.3746036-1-paul@nowt.org>
References: <20220418173904.3746036-1-paul@nowt.org>
Cc: "open list:All patches CC here" , Paul Brook
Content-Type: text/plain; charset="utf-8"

Tests for correct operation of most x86-64 SSE and AVX instructions.
It should cover all combinations of overlapping register and memory
operands on a set of random-ish data.
Results are bit-identical to an Intel i5-8500, with the exception of the RCPSS and RSQRT approximations where the real CPU gives less accurate results (the Intel spec allows relative errors up to 1.5 * 2^-12) Signed-off-by: Paul Brook --- tests/tcg/i386/Makefile.target | 10 +- tests/tcg/i386/README | 9 + tests/tcg/i386/test-avx.c | 347 +++ tests/tcg/i386/test-avx.py | 352 +++ tests/tcg/i386/x86.csv | 4658 ++++++++++++++++++++++++++++++++ 5 files changed, 5374 insertions(+), 2 deletions(-) create mode 100644 tests/tcg/i386/test-avx.c create mode 100755 tests/tcg/i386/test-avx.py create mode 100644 tests/tcg/i386/x86.csv diff --git a/tests/tcg/i386/Makefile.target b/tests/tcg/i386/Makefile.target index bd73c96d0d..eb06f7eb89 100644 --- a/tests/tcg/i386/Makefile.target +++ b/tests/tcg/i386/Makefile.target @@ -7,8 +7,8 @@ VPATH +=3D $(I386_SRC) =20 I386_SRCS=3D$(notdir $(wildcard $(I386_SRC)/*.c)) ALL_X86_TESTS=3D$(I386_SRCS:.c=3D) -SKIP_I386_TESTS=3Dtest-i386-ssse3 -X86_64_TESTS:=3D$(filter test-i386-ssse3, $(ALL_X86_TESTS)) +SKIP_I386_TESTS=3Dtest-i386-ssse3 test-avx +X86_64_TESTS:=3D$(filter test-i386-ssse3 test-avx, $(ALL_X86_TESTS)) =20 test-i386-sse-exceptions: CFLAGS +=3D -msse4.1 -mfpmath=3Dsse run-test-i386-sse-exceptions: QEMU_OPTS +=3D -cpu max @@ -80,3 +80,9 @@ run-sha512-sse: QEMU_OPTS+=3D-cpu max run-plugin-sha512-sse-with-%: QEMU_OPTS+=3D-cpu max =20 TESTS+=3Dsha512-sse + +test-avx.h: test-avx.py x86.csv + $(PYTHON) $(I386_SRC)/test-avx.py $(I386_SRC)/x86.csv $@ + +test-avx: CFLAGS +=3D -mavx -masm=3Dintel -O -I. +test-avx: test-avx.h diff --git a/tests/tcg/i386/README b/tests/tcg/i386/README index 09e88f30dc..403d10dad8 100644 --- a/tests/tcg/i386/README +++ b/tests/tcg/i386/README @@ -15,6 +15,15 @@ The Linux system call vm86() is used to test vm86 emulat= ion. Various exceptions are raised to test most of the x86 user space exception reporting. =20 +test-avx +-------- + +This program executes most SSE/AVX instructions and generates a text outpu= t, +for comparison with the output obtained with a real CPU or another emulato= r. 
+ +test-avx.h is generate from x86.csv by test-avx.py +x86.csv comes from https://github.com/quasilyte/avx512test + linux-test ---------- =20 diff --git a/tests/tcg/i386/test-avx.c b/tests/tcg/i386/test-avx.c new file mode 100644 index 0000000000..953e2906fe --- /dev/null +++ b/tests/tcg/i386/test-avx.c @@ -0,0 +1,347 @@ +#include +#include +#include +#include + +typedef void (*testfn)(void); + +typedef struct { + uint64_t q0, q1, q2, q3; +} __attribute__((aligned(32))) v4di; + +typedef struct { + uint64_t mm[8]; + v4di ymm[16]; + uint64_t r[16]; + uint64_t flags; + uint32_t ff; + uint64_t pad; + v4di mem[4]; + v4di mem0[4]; +} reg_state; + +typedef struct { + int n; + testfn fn; + const char *s; + reg_state *init; +} TestDef; + +reg_state initI; +reg_state initF32; +reg_state initF64; + +static void dump_ymm(const char *name, int n, const v4di *r, int ff) +{ + printf("%s%d =3D %016lx %016lx %016lx %016lx\n", + name, n, r->q3, r->q2, r->q1, r->q0); + if (ff =3D=3D 64) { + double v[4]; + memcpy(v, r, sizeof(v)); + printf(" %16g %16g %16g %16g\n", + v[3], v[2], v[1], v[0]); + } else if (ff =3D=3D 32) { + float v[8]; + memcpy(v, r, sizeof(v)); + printf(" %8g %8g %8g %8g %8g %8g %8g %8g\n", + v[7], v[6], v[5], v[4], v[3], v[2], v[1], v[0]); + } +} + +static void dump_regs(reg_state *s) +{ + int i; + + for (i =3D 0; i < 16; i++) { + dump_ymm("ymm", i, &s->ymm[i], 0); + } + for (i =3D 0; i < 4; i++) { + dump_ymm("mem", i, &s->mem0[i], 0); + } +} + +static void compare_state(const reg_state *a, const reg_state *b) +{ + int i; + for (i =3D 0; i < 8; i++) { + if (a->mm[i] !=3D b->mm[i]) { + printf("MM%d =3D %016lx\n", i, b->mm[i]); + } + } + for (i =3D 0; i < 16; i++) { + if (a->r[i] !=3D b->r[i]) { + printf("r%d =3D %016lx\n", i, b->r[i]); + } + } + for (i =3D 0; i < 16; i++) { + if (memcmp(&a->ymm[i], &b->ymm[i], 32)) { + dump_ymm("ymm", i, &b->ymm[i], a->ff); + } + } + for (i =3D 0; i < 4; i++) { + if (memcmp(&a->mem0[i], &a->mem[i], 32)) { + dump_ymm("mem", i, &a->mem[i], a->ff); + } + } + if (a->flags !=3D b->flags) { + printf("FLAGS =3D %016lx\n", b->flags); + } +} + +#define LOADMM(r, o) "movq " #r ", " #o "[%0]\n\t" +#define LOADYMM(r, o) "vmovdqa " #r ", " #o "[%0]\n\t" +#define STOREMM(r, o) "movq " #o "[%1], " #r "\n\t" +#define STOREYMM(r, o) "vmovdqa " #o "[%1], " #r "\n\t" +#define MMREG(F) \ + F(mm0, 0x00) \ + F(mm1, 0x08) \ + F(mm2, 0x10) \ + F(mm3, 0x18) \ + F(mm4, 0x20) \ + F(mm5, 0x28) \ + F(mm6, 0x30) \ + F(mm7, 0x38) +#define YMMREG(F) \ + F(ymm0, 0x040) \ + F(ymm1, 0x060) \ + F(ymm2, 0x080) \ + F(ymm3, 0x0a0) \ + F(ymm4, 0x0c0) \ + F(ymm5, 0x0e0) \ + F(ymm6, 0x100) \ + F(ymm7, 0x120) \ + F(ymm8, 0x140) \ + F(ymm9, 0x160) \ + F(ymm10, 0x180) \ + F(ymm11, 0x1a0) \ + F(ymm12, 0x1c0) \ + F(ymm13, 0x1e0) \ + F(ymm14, 0x200) \ + F(ymm15, 0x220) +#define LOADREG(r, o) "mov " #r ", " #o "[rax]\n\t" +#define STOREREG(r, o) "mov " #o "[rax], " #r "\n\t" +#define REG(F) \ + F(rbx, 0x248) \ + F(rcx, 0x250) \ + F(rdx, 0x258) \ + F(rsi, 0x260) \ + F(rdi, 0x268) \ + F(r8, 0x280) \ + F(r9, 0x288) \ + F(r10, 0x290) \ + F(r11, 0x298) \ + F(r12, 0x2a0) \ + F(r13, 0x2a8) \ + F(r14, 0x2b0) \ + F(r15, 0x2b8) \ + +static void run_test(const TestDef *t) +{ + reg_state result; + reg_state *init =3D t->init; + memcpy(init->mem, init->mem0, sizeof(init->mem)); + printf("%5d %s\n", t->n, t->s); + asm volatile( + MMREG(LOADMM) + YMMREG(LOADYMM) + "sub rsp, 128\n\t" + "push rax\n\t" + "push rbx\n\t" + "push rcx\n\t" + "push rdx\n\t" + "push %1\n\t" + "push %2\n\t" + "mov rax, %0\n\t" + "pushf\n\t" + "pop 
rbx\n\t" + "shr rbx, 8\n\t" + "shl rbx, 8\n\t" + "mov rcx, 0x2c0[rax]\n\t" + "and rcx, 0xff\n\t" + "or rbx, rcx\n\t" + "push rbx\n\t" + "popf\n\t" + REG(LOADREG) + "mov rax, 0x240[rax]\n\t" + "call [rsp]\n\t" + "mov [rsp], rax\n\t" + "mov rax, 8[rsp]\n\t" + REG(STOREREG) + "mov rbx, [rsp]\n\t" + "mov 0x240[rax], rbx\n\t" + "mov rbx, 0\n\t" + "mov 0x270[rax], rbx\n\t" + "mov 0x278[rax], rbx\n\t" + "pushf\n\t" + "pop rbx\n\t" + "and rbx, 0xff\n\t" + "mov 0x2c0[rax], rbx\n\t" + "add rsp, 16\n\t" + "pop rdx\n\t" + "pop rcx\n\t" + "pop rbx\n\t" + "pop rax\n\t" + "add rsp, 128\n\t" + MMREG(STOREMM) + YMMREG(STOREYMM) + : : "r"(init), "r"(&result), "r"(t->fn) + : "memory", "cc", + "rsi", "rdi", + "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15", + "mm0", "mm1", "mm2", "mm3", "mm4", "mm5", "mm6", "mm7", + "ymm0", "ymm1", "ymm2", "ymm3", "ymm4", "ymm5", + "ymm6", "ymm7", "ymm8", "ymm9", "ymm10", "ymm11", + "ymm12", "ymm13", "ymm14", "ymm15" + ); + compare_state(init, &result); +} + +#define TEST(n, cmd, type) \ +static void __attribute__((naked)) test_##n(void) \ +{ \ + asm volatile(cmd); \ + asm volatile("ret"); \ +} +#include "test-avx.h" + + +static const TestDef test_table[] =3D { +#define TEST(n, cmd, type) {n, test_##n, cmd, &init##type}, +#include "test-avx.h" + {-1, NULL, "", NULL} +}; + +static void run_all(void) +{ + const TestDef *t; + for (t =3D test_table; t->fn; t++) { + run_test(t); + } +} + +#define ARRAY_LEN(x) (sizeof(x) / sizeof(x[0])) + +float val_f32[] =3D {2.0, -1.0, 4.8, 0.8, 3, -42.0, 5e6, 7.5, 8.3}; +double val_f64[] =3D {2.0, -1.0, 4.8, 0.8, 3, -42.0, 5e6, 7.5}; +v4di val_i64[] =3D { + {0x3d6b3b6a9e4118f2lu, 0x355ae76d2774d78clu, + 0xac3ff76c4daa4b28lu, 0xe7fabd204cb54083lu}, + {0xd851c54a56bf1f29lu, 0x4a84d1d50bf4c4fflu, + 0x56621e553d52b56clu, 0xd0069553da8f584alu}, + {0x5826475e2c5fd799lu, 0xfd32edc01243f5e9lu, + 0x738ba2c66d3fe126lu, 0x5707219c6e6c26b4lu}, +}; + +v4di deadbeef =3D {0xa5a5a5a5deadbeefull, 0xa5a5a5a5deadbeefull, + 0xa5a5a5a5deadbeefull, 0xa5a5a5a5deadbeefull}; +v4di indexq =3D {0x000000000000001full, 0x000000000000008full, + 0xffffffffffffffffull, 0xffffffffffffff5full}; +v4di indexd =3D {0x00000002000000efull, 0xfffffff500000010ull, + 0x0000000afffffff0ull, 0x000000000000000eull}; + +v4di gather_mem[0x20]; + +void init_f32reg(v4di *r) +{ + static int n; + float v[8]; + int i; + for (i =3D 0; i < 8; i++) { + v[i] =3D val_f32[n++]; + if (n =3D=3D ARRAY_LEN(val_f32)) { + n =3D 0; + } + } + memcpy(r, v, sizeof(*r)); +} + +void init_f64reg(v4di *r) +{ + static int n; + double v[4]; + int i; + for (i =3D 0; i < 4; i++) { + v[i] =3D val_f64[n++]; + if (n =3D=3D ARRAY_LEN(val_f64)) { + n =3D 0; + } + } + memcpy(r, v, sizeof(*r)); +} + +void init_intreg(v4di *r) +{ + static uint64_t mask; + static int n; + + r->q0 =3D val_i64[n].q0 ^ mask; + r->q1 =3D val_i64[n].q1 ^ mask; + r->q2 =3D val_i64[n].q2 ^ mask; + r->q3 =3D val_i64[n].q3 ^ mask; + n++; + if (n =3D=3D ARRAY_LEN(val_i64)) { + n =3D 0; + mask *=3D 0x104C11DB7; + } +} + +static void init_all(reg_state *s) +{ + int i; + + s->r[3] =3D (uint64_t)&s->mem[0]; /* rdx */ + s->r[4] =3D (uint64_t)&gather_mem[ARRAY_LEN(gather_mem) / 2]; /* rsi */ + s->r[5] =3D (uint64_t)&s->mem[2]; /* rdi */ + s->flags =3D 2; + for (i =3D 0; i < 16; i++) { + s->ymm[i] =3D deadbeef; + } + s->ymm[13] =3D indexd; + s->ymm[14] =3D indexq; + for (i =3D 0; i < 4; i++) { + s->mem0[i] =3D deadbeef; + } +} + +int main(int argc, char *argv[]) +{ + int i; + + init_all(&initI); + init_intreg(&initI.ymm[10]); + init_intreg(&initI.ymm[11]); 
+ init_intreg(&initI.ymm[12]); + init_intreg(&initI.mem0[1]); + printf("Int:\n"); + dump_regs(&initI); + + init_all(&initF32); + init_f32reg(&initF32.ymm[10]); + init_f32reg(&initF32.ymm[11]); + init_f32reg(&initF32.ymm[12]); + init_f32reg(&initF32.mem0[1]); + initF32.ff =3D 32; + printf("F32:\n"); + dump_regs(&initF32); + + init_all(&initF64); + init_f64reg(&initF64.ymm[10]); + init_f64reg(&initF64.ymm[11]); + init_f64reg(&initF64.ymm[12]); + init_f64reg(&initF64.mem0[1]); + initF64.ff =3D 64; + printf("F64:\n"); + dump_regs(&initF64); + + for (i =3D 0; i < ARRAY_LEN(gather_mem); i++) { + init_intreg(&gather_mem[i]); + } + + if (argc > 1) { + int n =3D atoi(argv[1]); + run_test(&test_table[n]); + } else { + run_all(); + } + return 0; +} diff --git a/tests/tcg/i386/test-avx.py b/tests/tcg/i386/test-avx.py new file mode 100755 index 0000000000..0b2d799c5c --- /dev/null +++ b/tests/tcg/i386/test-avx.py @@ -0,0 +1,352 @@ +#! /usr/bin/env python3 + +# Generate test-avx.h from x86.csv + +import csv +import sys +from fnmatch import fnmatch + +archs =3D [ + # TODO: MMX? + "SSE", "SSE2", "SSE3", "SSSE3", "SSE4_1", "SSE4_2", + "AVX", "AVX2", "AES+AVX", # "VAES+AVX", +] + +ignore =3D set(["FISTTP", + "LDMXCSR", "VLDMXCSR", "STMXCSR", "VSTMXCSR"]) + +imask =3D { + 'vBLENDPD': 0xff, + 'vBLENDPS': 0x0f, + 'CMP[PS][SD]': 0x07, + 'VCMP[PS][SD]': 0x1f, + 'vDPPD': 0x33, + 'vDPPS': 0xff, + 'vEXTRACTPS': 0x03, + 'vINSERTPS': 0xff, + 'MPSADBW': 0x7, + 'VMPSADBW': 0x3f, + 'vPALIGNR': 0x3f, + 'vPBLENDW': 0xff, + 'vPCMP[EI]STR*': 0x0f, + 'vPEXTRB': 0x0f, + 'vPEXTRW': 0x07, + 'vPEXTRD': 0x03, + 'vPEXTRQ': 0x01, + 'vPINSRB': 0x0f, + 'vPINSRW': 0x07, + 'vPINSRD': 0x03, + 'vPINSRQ': 0x01, + 'vPSHUF[DW]': 0xff, + 'vPSHUF[LH]W': 0xff, + 'vPS[LR][AL][WDQ]': 0x3f, + 'vPS[RL]LDQ': 0x1f, + 'vROUND[PS][SD]': 0x7, + 'vSHUFPD': 0x0f, + 'vSHUFPS': 0xff, + 'vAESKEYGENASSIST': 0, + 'VEXTRACT[FI]128': 0x01, + 'VINSERT[FI]128': 0x01, + 'VPBLENDD': 0xff, + 'VPERM2[FI]128': 0x33, + 'VPERMPD': 0xff, + 'VPERMQ': 0xff, + 'VPERMILPS': 0xff, + 'VPERMILPD': 0x0f, + } + +def strip_comments(x): + for l in x: + if l !=3D '' and l[0] !=3D '#': + yield l + +def reg_w(w): + if w =3D=3D 8: + return 'al' + elif w =3D=3D 16: + return 'ax' + elif w =3D=3D 32: + return 'eax' + elif w =3D=3D 64: + return 'rax' + raise Exception("bad reg_w %d" % w) + +def mem_w(w): + if w =3D=3D 8: + t =3D "BYTE" + elif w =3D=3D 16: + t =3D "WORD" + elif w =3D=3D 32: + t =3D "DWORD" + elif w =3D=3D 64: + t =3D "QWORD" + elif w =3D=3D 128: + t =3D "XMMWORD" + elif w =3D=3D 256: + t =3D "YMMWORD" + else: + raise Exception() + + return t + " PTR 32[rdx]" + +class XMMArg(): + isxmm =3D True + def __init__(self, reg, mw): + if mw not in [0, 8, 16, 32, 64, 128, 256]: + raise Exception("Bad /m width: %s" % w) + self.reg =3D reg + self.mw =3D mw + self.ismem =3D mw !=3D 0 + def regstr(self, n): + if n < 0: + return mem_w(self.mw) + else: + return "%smm%d" % (self.reg, n) + +class MMArg(): + isxmm =3D True + ismem =3D False # TODO + def regstr(self, n): + return "mm%d" % (n & 7) + +def match(op, pattern): + if pattern[0] =3D=3D 'v': + return fnmatch(op, pattern[1:]) or fnmatch(op, 'V'+pattern[1:]) + return fnmatch(op, pattern) + +class ArgVSIB(): + isxmm =3D True + ismem =3D False + def __init__(self, reg, w): + if w not in [32, 64]: + raise Exception("Bad vsib width: %s" % w) + self.w =3D w + self.reg =3D reg + def regstr(self, n): + reg =3D "%smm%d" % (self.reg, n >> 2) + return "[rsi + %s * %d]" % (reg, 1 << (n & 3)) + +class ArgImm8u(): + isxmm =3D False + ismem =3D False 
+ def __init__(self, op): + for k, v in imask.items(): + if match(op, k): + self.mask =3D imask[k]; + return + raise Exception("Unknown immediate") + def vals(self): + mask =3D self.mask + yield 0 + n =3D 0 + while n !=3D mask: + n +=3D 1 + while (n & ~mask) !=3D 0: + n +=3D (n & ~mask) + yield n + +class ArgRM(): + isxmm =3D False + def __init__(self, rw, mw): + if rw not in [8, 16, 32, 64]: + raise Exception("Bad r/w width: %s" % w) + if mw not in [0, 8, 16, 32, 64]: + raise Exception("Bad r/w width: %s" % w) + self.rw =3D rw + self.mw =3D mw + self.ismem =3D mw !=3D 0 + def regstr(self, n): + if n < 0: + return mem_w(self.mw) + else: + return reg_w(self.rw) + +class ArgMem(): + isxmm =3D False + ismem =3D True + def __init__(self, w): + if w not in [8, 16, 32, 64, 128, 256]: + raise Exception("Bad mem width: %s" % w) + self.w =3D w + def regstr(self, n): + return mem_w(self.w) + +def ArgGenerator(arg, op): + if arg[:3] =3D=3D 'xmm' or arg[:3] =3D=3D "ymm": + if "/" in arg: + r, m =3D arg.split('/') + if (m[0] !=3D 'm'): + raise Exception("Expected /m: %s", arg) + return XMMArg(arg[0], int(m[1:])); + else: + return XMMArg(arg[0], 0); + elif arg[:2] =3D=3D 'mm': + return MMArg(); + elif arg[:4] =3D=3D 'imm8': + return ArgImm8u(op); + elif arg =3D=3D '': + return None + elif arg[0] =3D=3D 'r': + if '/m' in arg: + r, m =3D arg.split('/') + if (m[0] !=3D 'm'): + raise Exception("Expected /m: %s", arg) + mw =3D int(m[1:]) + if r =3D=3D 'r': + rw =3D mw + else: + rw =3D int(r[1:]) + return ArgRM(rw, mw) + + return ArgRM(int(arg[1:]), 0); + elif arg[0] =3D=3D 'm': + return ArgMem(int(arg[1:])) + elif arg[:2] =3D=3D 'vm': + return ArgVSIB(arg[-1], int(arg[2:-1])) + else: + raise Exception("Unrecognised arg: %s", arg) + +class InsnGenerator: + def __init__(self, op, args): + self.op =3D op + if op[-2:] in ["PS", "PD", "SS", "SD"]: + if op[-1] =3D=3D 'S': + self.optype =3D 'F32' + else: + self.optype =3D 'F64' + else: + self.optype =3D 'I' + + try: + self.args =3D list(ArgGenerator(a, op) for a in args) + if len(self.args) > 0 and self.args[-1] is None: + self.args =3D self.args[:-1] + except Exception as e: + raise Exception("Bad arg %s: %s" % (op, e)) + + def gen(self): + regs =3D (10, 11, 12) + dest =3D 9 + + nreg =3D len(self.args) + if nreg =3D=3D 0: + yield self.op + return + if isinstance(self.args[-1], ArgImm8u): + nreg -=3D 1 + immarg =3D self.args[-1] + else: + immarg =3D None + memarg =3D -1 + for n, arg in enumerate(self.args): + if arg.ismem: + memarg =3D n + + if (self.op.startswith("VGATHER") or self.op.startswith("VPGATHER"= )): + if "GATHERD" in self.op: + ireg =3D 13 << 2 + else: + ireg =3D 14 << 2 + regset =3D [ + (dest, ireg | 0, regs[0]), + (dest, ireg | 1, regs[0]), + (dest, ireg | 2, regs[0]), + (dest, ireg | 3, regs[0]), + ] + if memarg >=3D 0: + raise Exception("vsib with memory: %s" % self.op) + elif nreg =3D=3D 1: + regset =3D [(regs[0],)] + if memarg =3D=3D 0: + regset +=3D [(-1,)] + elif nreg =3D=3D 2: + regset =3D [ + (regs[0], regs[1]), + (regs[0], regs[0]), + ] + if memarg =3D=3D 0: + regset +=3D [(-1, regs[0])] + elif memarg =3D=3D 1: + regset +=3D [(dest, -1)] + elif nreg =3D=3D 3: + regset =3D [ + (dest, regs[0], regs[1]), + (dest, regs[0], regs[0]), + (regs[0], regs[0], regs[1]), + (regs[0], regs[1], regs[0]), + (regs[0], regs[0], regs[0]), + ] + if memarg =3D=3D 2: + regset +=3D [ + (dest, regs[0], -1), + (regs[0], regs[0], -1), + ] + elif memarg > 0: + raise Exception("Memarg %d" % memarg) + elif nreg =3D=3D 4: + regset =3D [ + (dest, regs[0], regs[1], 
regs[2]), + (dest, regs[0], regs[0], regs[1]), + (dest, regs[0], regs[1], regs[0]), + (dest, regs[1], regs[0], regs[0]), + (dest, regs[0], regs[0], regs[0]), + (regs[0], regs[0], regs[1], regs[2]), + (regs[0], regs[1], regs[0], regs[2]), + (regs[0], regs[1], regs[2], regs[0]), + (regs[0], regs[0], regs[0], regs[1]), + (regs[0], regs[0], regs[1], regs[0]), + (regs[0], regs[1], regs[0], regs[0]), + (regs[0], regs[0], regs[0], regs[0]), + ] + if memarg =3D=3D 2: + regset +=3D [ + (dest, regs[0], -1, regs[1]), + (dest, regs[0], -1, regs[0]), + (regs[0], regs[0], -1, regs[1]), + (regs[0], regs[1], -1, regs[0]), + (regs[0], regs[0], -1, regs[0]), + ] + elif memarg > 0: + raise Exception("Memarg4 %d" % memarg) + else: + raise Exception("Too many regs: %s(%d)" % (self.op, nreg)) + + for regv in regset: + argstr =3D [] + for i in range(nreg): + arg =3D self.args[i] + argstr.append(arg.regstr(regv[i])) + if immarg is None: + yield self.op + ' ' + ','.join(argstr) + else: + for immval in immarg.vals(): + yield self.op + ' ' + ','.join(argstr) + ',' + str(imm= val) + +def split0(s): + if s =3D=3D '': + return [] + return s.split(',') + +def main(): + n =3D 0 + if len(sys.argv) !=3D 3: + print("Usage: test-avx.py x86.csv test-avx.h") + exit(1) + csvfile =3D open(sys.argv[1], 'r', newline=3D'') + with open(sys.argv[2], "w") as outf: + outf.write("// Generated by test-avx.py. Do not edit.\n") + for row in csv.reader(strip_comments(csvfile)): + insn =3D row[0].replace(',', '').split() + if insn[0] in ignore: + continue + cpuid =3D row[6] + if cpuid in archs: + g =3D InsnGenerator(insn[0], insn[1:]) + for insn in g.gen(): + outf.write('TEST(%d, "%s", %s)\n' % (n, insn, g.optype= )) + n +=3D 1 + outf.write("#undef TEST\n") + csvfile.close() + +if __name__ =3D=3D "__main__": + main() diff --git a/tests/tcg/i386/x86.csv b/tests/tcg/i386/x86.csv new file mode 100644 index 0000000000..d5d0c17f1b --- /dev/null +++ b/tests/tcg/i386/x86.csv @@ -0,0 +1,4658 @@ +# x86 instruction set description version 0.2x, 2018-05-08 +# +# https://golang.org/x/arch/x86 +# +# The latest version of the CSV file is +# available online at https://golang.org/s/x86.csv. +# +# This file contains a block of comment lines, each beginning with #, +# followed by entries in CSV format. All the # comments are at the top +# of the file, so a reader can skip past the comments and hand the +# rest of the file to a standard CSV reader. +# Each CSV line contains these fields: +# +# 1. The Intel manual instruction mnemonic. For example, "SHR r/m32, imm8". +# +# 2. The Go assembler instruction mnemonic. For example, "SHRL imm8, r/m32= ". +# +# 3. The GNU binutils instruction mnemonic. For example, "shrl imm8, r/m32= ". +# +# 4. The instruction encoding. For example, "C1 /4 ib". +# +# 5. The validity of the instruction in 32-bit (aka compatiblity, legacy) = mode. +# +# 6. The validity of the instruction in 64-bit mode. +# +# 7. The CPUID feature flags that signal support for the instruction. +# +# 8. Additional comma-separated tags containing hints about the instructio= n. +# +# 9. The read/write actions of the instruction on the arguments used in +# the Intel mnemonic. For example, "rw,r" to denote that "SHR r/m32, imm8" +# reads and writes its first argument but only reads its second argument. +# +# 10. Whether the opcode used in the Intel mnemonic has encoding forms +# distinguished only by operand size, like most arithmetic instructions. +# The string "Y" indicates yes, the string "" indicates no. +# +# 11. The data size of the operation in bits. 
+#
+# Mnemonics (the opcode string)
+#
+# The instruction mnemonics are as used in the Intel manual, with a few exceptions.
+#
+# Mnemonics claiming general memory forms but that really require fixed addressing modes
+# are omitted in favor of their equivalents with implicit arguments.
+# For example, "CMPS m16, m16" (really CMPS [SI], [DI]) is omitted in favor of "CMPSW".
+#
+# Instruction forms with an explicit REP, REPE, or REPNE prefix are also omitted.
+# Encoders and decoders are expected to handle those prefixes separately.
+#
+# Perhaps most significantly, the argument syntaxes used in the mnemonic indicate
+# exactly how to derive the argument from the instruction encoding, or vice versa.
+#
+# Immediate values: imm8, imm8u, imm16, imm16u, imm32, imm64.
+# Immediates are signed by default; the u suffix indicates an unsigned value.
+# Immediates may have a bitfield-like modifier that specifies how many bits
+# are used. For example, imm8u:4 is encoded like an 8-bit immediate,
+# but only 4 bits are meaningful while the others are ignored or must be 0.
+#
+# Memory operands. The forms m, m128, m14/28byte, m16, m16&16, m16&32, m16&64, m16:16, m16:32,
+# m16:64, m16int, m256, m2byte, m32, m32&32, m32fp, m32int, m512byte, m64, m64fp, m64int,
+# m8, m80bcd, m80dec, m80fp, m94/108byte. These operands always correspond to the
+# memory address specified by the r/m half of the modrm encoding.
+#
+# Integer registers.
+# The forms r8, r16, r32, r64 indicate a register selected by the modrm reg encoding.
+# The forms rmr16, rmr32, rmr64 indicate a register (never memory) selected by the modrm r/m encoding.
+# The forms r/m8, r/m16, r/m32, and r/m64 indicate a register or memory selected by the modrm r/m encoding.
+# Forms with two sizes, like r32/m16, also indicate a register or memory selected by the modrm r/m encoding,
+# but the size for a register argument differs from the size of a memory argument.
+# The forms r8V, r16V, r32V, r64V indicate a register selected by the VEX.vvvv bits.
+#
+# Multimedia registers.
+# The forms mm1, xmm1, and ymm1 indicate a multimedia register selected by the
+# modrm reg encoding.
+# The forms mm2, xmm2, and ymm2 indicate a register (never memory) selected by
+# the modrm r/m encoding.
+# The forms mm2/m64, xmm2/m128, and so on indicate a register or memory
+# selected by the modrm r/m encoding.
+# The forms xmmV and ymmV indicate a register selected by the VEX.vvvv bits.
+# The forms xmmI and ymmI indicate a register selected by the top four bits of an /is4 immediate byte.
+#
+# Bound registers.
+# The form bnd1 indicates a bound register selected by the modrm reg encoding.
+# The form bnd2 indicates a bound register (never memory) selected by the modrm r/m encoding.
+# The forms bnd2/m64 and bnd2/m128 indicate a register or memory selected by the modrm r/m encoding.
+# TODO: Describe mib.
+#
+# One-of-a-kind operands: rel8, rel16, rel32, ptr16:16, ptr16:32,
+# moffs8, moffs16, moffs32, moffs64, vm32x, vm32y, vm64x, and vm64y
+# are all as in the Intel manual.
+#
+# Encodings
+#
+# The encodings are also as used in the Intel manual, with automated corrections.
+# For example, the Intel manual sometimes omits the modrm /r indicator or other trailing bytes,
+# and it also contains typographical errors.
+# These problems are corrected so that the CSV data may be used to generate
+# tools for processing x86 machine code.
+# See https://golang.org/x/arch/x86/x86map for one such generator.
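+#
+# To illustrate the notation with the "SHR r/m32, imm8" example above: in
+# the encoding "C1 /4 ib", C1 is the opcode byte, /4 means the modrm byte's
+# reg field carries the opcode-extension value 4, and ib denotes a trailing
+# 8-bit immediate.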
+#
+# Valid32 and Valid64
+#
+# These columns hold validity abbreviations as defined in the Intel manual:
+# V, I, N.E., N.P., N.S., or N.I.
+# Tools processing the data are typically only concerned with whether the
+# column is "V" (valid) or not.
+# This data is also corrected compared to the manual.
+# For example, the manual lists many instruction forms using REX bytes
+# with an incorrect "V" in the Valid32 column.
+#
+# CPUID Feature Flags
+#
+# This column specifies CPUID feature flags that must be present in order
+# to use the instruction. If multiple flags are required,
+# they are listed separated by plus signs, as in PCLMULQDQ+AVX.
+# The column can also list one of the values 486, Pentium, PentiumII, and P6,
+# indicating that the instruction was introduced on that architecture version.
+#
+# Tags
+#
+# The tag column does not correspond to a traditional column in the Intel manual tables.
+# Instead, it is itself a comma-separated list of tags or hints derived by analysis
+# of the instruction set or the instruction encodings.
+#
+# The tags address16, address32, and address64 indicate that the instruction form
+# applies when using the specified addressing size. It may therefore be necessary to use an
+# address size prefix byte to access the instruction.
+# If two address tags are listed, the instruction can be used with either of those
+# address sizes. An instruction will never list all three address sizes.
+# (In fact, today, no instruction lists two address sizes, but that may change.)
+#
+# The tags operand16, operand32, and operand64 indicate that the instruction form
+# applies when using the specified operand size. It may therefore be necessary to use an
+# operand size prefix byte to access the instruction.
+# If two operand tags are listed, the instruction can be used with either of those
+# operand sizes. An instruction will never list all three operand sizes.
+# For some instructions, default64 is used instead of operand64;
+# it specifies data promotion to 64-bit. For instructions with different
+# possible data sizes, it also indicates that the default data size is
+# 64-bit instead of 32-bit; using a refining prefix such as 0x66 leads
+# to a 32-bit operation (if supported).
+#
+# The tags modrm_regonly or modrm_memonly indicate that the modrm byte's
+# r/m encoding must specify a register or memory, respectively.
+# Especially in newer instructions, the modrm constraint may be the only way
+# to distinguish two instruction forms. For example, the MOVHLPS and MOVLPS
+# instructions share the same encoding, except that the former requires the
+# modrm byte's r/m to indicate a register, while the latter requires it to indicate memory.
+#
+# The tags pseudo and pseudo64 indicate that this instruction form is redundant
+# with others listed in the table and should be ignored when generating disassembly
+# or instruction scanning programs. The pseudo64 tag is reserved for the case where
+# the manual lists an instruction twice, once with the optional 64-bit mode REX byte.
+# Since most decoders will handle the REX byte separately, the form with the
+# unnecessary REX is tagged pseudo64.
+#
+# The amd tag marks AMD-specific instructions.
+# As an example, all instructions of SSE4a have such a tag.
+#
+# The AVX512-specific tags: scaleX and bscaleX.
+# scale1, scale2, scale4, scale8, scale16, scale32, scale64 specify
+# the compressed displacement multiplier (scaling).
+# For example, if the displacement is 128 and scale32 is set,
+# the disp8 value should be calculated as 128/32.
+# bscale4 and bscale8 have the same meaning, but are used
+# when the instruction uses the embedded broadcast feature.
+# If an instruction does not have a bscaleX tag, it does not support EVEX broadcasting.
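+# Working the example in the other direction, a decoder multiplies the
+# stored disp8 by the scale, so a stored disp8 of 4 under scale32 denotes
+# a byte displacement of 4*32 = 128.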
+#
+# Related packages (can be a good source of additional documentation):
+# x86csv - read and manipulate x86.csv
+# x86spec - x86.csv generator
+# x86map - x86asm table generator based on x86.csv
+# x86avxgen - cmd/internal/obj/x86 optab generator based on x86.csv
+# All listed packages are located at golang.org/x/arch/x86/.
+"PUSH imm32","-/PUSHL/PUSHQ imm32","-/pushl/pushq imm32","68 id","V","N.S.","","operand32","r","Y",""
+"PUSH imm32","-/PUSHL/PUSHQ imm32","-/pushl/pushq imm32","68 id","N.S.","V","","default64","r","Y",""
+"AAA","AAA","aaa","37","V","N.S.","","","","",""
+"AAD","AAD","aad","D5 0A","V","I","","pseudo","","",""
+"AAD imm8u","AAD imm8u","aad imm8u","D5 ib","V","N.S.","","","r","",""
+"AAM","AAM","aam","D4 0A","V","I","","pseudo","","",""
+"AAM imm8u","AAM imm8u","aam imm8u","D4 ib","V","N.S.","","","r","",""
+"AAS","AAS","aas","3F","V","N.S.","","","","",""
+"ADC AL, imm8","ADCB imm8, AL","adcb imm8, AL","14 ib","V","V","","","rw,r","Y","8"
+"ADC r/m8, imm8","ADCB imm8, r/m8","adcb imm8, r/m8","80 /2 ib","V","V","","","rw,r","Y","8"
+"ADC r/m8, imm8","ADCB imm8, r/m8","adcb imm8, r/m8","82 /2 ib","V","N.S.","","","rw,r","Y","8"
+"ADC r/m8, imm8","ADCB imm8, r/m8","adcb imm8, r/m8","REX 80 /2 ib","N.E.","V","","pseudo64","rw,r","Y","8"
+"ADC r8, r/m8","ADCB r/m8, r8","adcb r/m8, r8","12 /r","V","V","","","rw,r","Y","8"
+"ADC r8, r/m8","ADCB r/m8, r8","adcb r/m8, r8","REX 12 /r","N.E.","V","","pseudo64","rw,r","Y","8"
+"ADC r/m8, r8","ADCB r8, r/m8","adcb r8, r/m8","10 /r","V","V","","","rw,r","Y","8"
+"ADC r/m8, r8","ADCB r8, r/m8","adcb r8, r/m8","REX 10 /r","N.E.","V","","pseudo64","rw,r","Y","8"
+"ADC EAX, imm32","ADCL imm32, EAX","adcl imm32, EAX","15 id","V","V","","operand32","rw,r","Y","32"
+"ADC r/m32, imm32","ADCL imm32, r/m32","adcl imm32, r/m32","81 /2 id","V","V","","operand32","rw,r","Y","32"
+"ADC r/m32, imm8","ADCL imm8, r/m32","adcl imm8, r/m32","83 /2 ib","V","V","","operand32","rw,r","Y","32"
+"ADC r32, r/m32","ADCL r/m32, r32","adcl r/m32, r32","13 /r","V","V","","operand32","rw,r","Y","32"
+"ADC r/m32, r32","ADCL r32, r/m32","adcl r32, r/m32","11 /r","V","V","","operand32","rw,r","Y","32"
+"ADC RAX, imm32","ADCQ imm32, RAX","adcq imm32, RAX","REX.W 15 id","N.S.","V","","","rw,r","Y","64"
+"ADC r/m64, imm32","ADCQ imm32, r/m64","adcq imm32, r/m64","REX.W 81 /2 id","N.S.","V","","","rw,r","Y","64"
+"ADC r/m64, imm8","ADCQ imm8, r/m64","adcq imm8, r/m64","REX.W 83 /2 ib","N.S.","V","","","rw,r","Y","64"
+"ADC r64, r/m64","ADCQ r/m64, r64","adcq r/m64, r64","REX.W 13 /r","N.S.","V","","","rw,r","Y","64"
+"ADC r/m64, r64","ADCQ r64, r/m64","adcq r64, r/m64","REX.W 11 /r","N.S.","V","","","rw,r","Y","64"
+"ADC AX, imm16","ADCW imm16, AX","adcw imm16, AX","15 iw","V","V","","operand16","rw,r","Y","16"
+"ADC r/m16, imm16","ADCW imm16, r/m16","adcw imm16, r/m16","81 /2 iw","V","V","","operand16","rw,r","Y","16"
+"ADC r/m16, imm8","ADCW imm8, r/m16","adcw imm8, r/m16","83 /2 ib","V","V","","operand16","rw,r","Y","16"
+"ADC r16, r/m16","ADCW r/m16, r16","adcw r/m16, r16","13 /r","V","V","","operand16","rw,r","Y","16"
+"ADC r/m16, r16","ADCW r16, r/m16","adcw r16, r/m16","11 /r","V","V","","operand16","rw,r","Y","16"
+"ADCX r32, r/m32","ADCXL r/m32, r32","adcxl r/m32, r32","66 0F 38 F6 /r","= V","V","ADX","operand16,operand32","rw,r","Y","32" +"ADCX r64, r/m64","ADCXQ r/m64, r64","adcxq r/m64, r64","66 REX.W 0F 38 F6= /r","N.S.","V","ADX","","rw,r","Y","64" +"ADD AL, imm8","ADDB imm8, AL","addb imm8, AL","04 ib","V","V","","","rw,r= ","Y","8" +"ADD r/m8, imm8","ADDB imm8, r/m8","addb imm8, r/m8","80 /0 ib","V","V",""= ,"","rw,r","Y","8" +"ADD r/m8, imm8","ADDB imm8, r/m8","addb imm8, r/m8","82 /0 ib","V","N.S."= ,"","","rw,r","Y","8" +"ADD r/m8, imm8","ADDB imm8, r/m8","addb imm8, r/m8","REX 80 /0 ib","N.E."= ,"V","","pseudo64","rw,r","Y","8" +"ADD r8, r/m8","ADDB r/m8, r8","addb r/m8, r8","02 /r","V","V","","","rw,r= ","Y","8" +"ADD r8, r/m8","ADDB r/m8, r8","addb r/m8, r8","REX 02 /r","N.E.","V","","= pseudo64","rw,r","Y","8" +"ADD r/m8, r8","ADDB r8, r/m8","addb r8, r/m8","00 /r","V","V","","","rw,r= ","Y","8" +"ADD r/m8, r8","ADDB r8, r/m8","addb r8, r/m8","REX 00 /r","N.E.","V","","= pseudo64","rw,r","Y","8" +"ADD EAX, imm32","ADDL imm32, EAX","addl imm32, EAX","05 id","V","V","","o= perand32","rw,r","Y","32" +"ADD r/m32, imm32","ADDL imm32, r/m32","addl imm32, r/m32","81 /0 id","V",= "V","","operand32","rw,r","Y","32" +"ADD r/m32, imm8","ADDL imm8, r/m32","addl imm8, r/m32","83 /0 ib","V","V"= ,"","operand32","rw,r","Y","32" +"ADD r32, r/m32","ADDL r/m32, r32","addl r/m32, r32","03 /r","V","V","","o= perand32","rw,r","Y","32" +"ADD r/m32, r32","ADDL r32, r/m32","addl r32, r/m32","01 /r","V","V","","o= perand32","rw,r","Y","32" +"ADDPD xmm1, xmm2/m128","ADDPD xmm2/m128, xmm1","addpd xmm2/m128, xmm1","6= 6 0F 58 /r","V","V","SSE2","","rw,r","","" +"ADDPS xmm1, xmm2/m128","ADDPS xmm2/m128, xmm1","addps xmm2/m128, xmm1","0= F 58 /r","V","V","SSE","","rw,r","","" +"ADD RAX, imm32","ADDQ imm32, RAX","addq imm32, RAX","REX.W 05 id","N.S.",= "V","","","rw,r","Y","64" +"ADD r/m64, imm32","ADDQ imm32, r/m64","addq imm32, r/m64","REX.W 81 /0 id= ","N.S.","V","","","rw,r","Y","64" +"ADD r/m64, imm8","ADDQ imm8, r/m64","addq imm8, r/m64","REX.W 83 /0 ib","= N.S.","V","","","rw,r","Y","64" +"ADD r64, r/m64","ADDQ r/m64, r64","addq r/m64, r64","REX.W 03 /r","N.S.",= "V","","","rw,r","Y","64" +"ADD r/m64, r64","ADDQ r64, r/m64","addq r64, r/m64","REX.W 01 /r","N.S.",= "V","","","rw,r","Y","64" +"ADDSD xmm1, xmm2/m64","ADDSD xmm2/m64, xmm1","addsd xmm2/m64, xmm1","F2 0= F 58 /r","V","V","SSE2","","rw,r","","" +"ADDSS xmm1, xmm2/m32","ADDSS xmm2/m32, xmm1","addss xmm2/m32, xmm1","F3 0= F 58 /r","V","V","SSE","","rw,r","","" +"ADDSUBPD xmm1, xmm2/m128","ADDSUBPD xmm2/m128, xmm1","addsubpd xmm2/m128,= xmm1","66 0F D0 /r","V","V","SSE3","","rw,r","","" +"ADDSUBPS xmm1, xmm2/m128","ADDSUBPS xmm2/m128, xmm1","addsubps xmm2/m128,= xmm1","F2 0F D0 /r","V","V","SSE3","","rw,r","","" +"ADD AX, imm16","ADDW imm16, AX","addw imm16, AX","05 iw","V","V","","oper= and16","rw,r","Y","16" +"ADD r/m16, imm16","ADDW imm16, r/m16","addw imm16, r/m16","81 /0 iw","V",= "V","","operand16","rw,r","Y","16" +"ADD r/m16, imm8","ADDW imm8, r/m16","addw imm8, r/m16","83 /0 ib","V","V"= ,"","operand16","rw,r","Y","16" +"ADD r16, r/m16","ADDW r/m16, r16","addw r/m16, r16","03 /r","V","V","","o= perand16","rw,r","Y","16" +"ADD r/m16, r16","ADDW r16, r/m16","addw r16, r/m16","01 /r","V","V","","o= perand16","rw,r","Y","16" +"ADOX r32, r/m32","ADOXL r/m32, r32","adoxl r/m32, r32","F3 0F 38 F6 /r","= V","V","ADX","operand16,operand32","rw,r","Y","32" +"ADOX r64, r/m64","ADOXQ r/m64, r64","adoxq r/m64, r64","F3 REX.W 0F 38 F6= 
/r","N.S.","V","ADX","","rw,r","Y","64" +"AESDEC xmm1, xmm2/m128","AESDEC xmm2/m128, xmm1","aesdec xmm2/m128, xmm1"= ,"66 0F 38 DE /r","V","V","AES","","rw,r","","" +"AESDECLAST xmm1, xmm2/m128","AESDECLAST xmm2/m128, xmm1","aesdeclast xmm2= /m128, xmm1","66 0F 38 DF /r","V","V","AES","","rw,r","","" +"AESENC xmm1, xmm2/m128","AESENC xmm2/m128, xmm1","aesenc xmm2/m128, xmm1"= ,"66 0F 38 DC /r","V","V","AES","","rw,r","","" +"AESENCLAST xmm1, xmm2/m128","AESENCLAST xmm2/m128, xmm1","aesenclast xmm2= /m128, xmm1","66 0F 38 DD /r","V","V","AES","","rw,r","","" +"AESIMC xmm1, xmm2/m128","AESIMC xmm2/m128, xmm1","aesimc xmm2/m128, xmm1"= ,"66 0F 38 DB /r","V","V","AES","","w,r","","" +"AESKEYGENASSIST xmm1, xmm2/m128, imm8u","AESKEYGENASSIST imm8u, xmm2/m128= , xmm1","aeskeygenassist imm8u, xmm2/m128, xmm1","66 0F 3A DF /r ib","V","V= ","AES","","w,r,r","","" +"AND AL, imm8","ANDB imm8, AL","andb imm8, AL","24 ib","V","V","","","rw,r= ","Y","8" +"AND r/m8, imm8","ANDB imm8, r/m8","andb imm8, r/m8","REX 80 /4 ib","N.E."= ,"V","","pseudo64","rw,r","Y","8" +"AND r/m8, imm8u","ANDB imm8u, r/m8","andb imm8u, r/m8","80 /4 ib","V","V"= ,"","","rw,r","Y","8" +"AND r/m8, imm8u","ANDB imm8u, r/m8","andb imm8u, r/m8","82 /4 ib","V","N.= S.","","","rw,r","Y","8" +"AND r8, r/m8","ANDB r/m8, r8","andb r/m8, r8","22 /r","V","V","","","rw,r= ","Y","8" +"AND r8, r/m8","ANDB r/m8, r8","andb r/m8, r8","REX 22 /r","N.E.","V","","= pseudo64","rw,r","Y","8" +"AND r/m8, r8","ANDB r8, r/m8","andb r8, r/m8","20 /r","V","V","","","rw,r= ","Y","8" +"AND r/m8, r8","ANDB r8, r/m8","andb r8, r/m8","REX 20 /r","N.E.","V","","= pseudo64","rw,r","Y","8" +"AND EAX, imm32","ANDL imm32, EAX","andl imm32, EAX","25 id","V","V","","o= perand32","rw,r","Y","32" +"AND r/m32, imm32","ANDL imm32, r/m32","andl imm32, r/m32","81 /4 id","V",= "V","","operand32","rw,r","Y","32" +"AND r/m32, imm8","ANDL imm8, r/m32","andl imm8, r/m32","83 /4 ib","V","V"= ,"","operand32","rw,r","Y","32" +"AND r32, r/m32","ANDL r/m32, r32","andl r/m32, r32","23 /r","V","V","","o= perand32","rw,r","Y","32" +"AND r/m32, r32","ANDL r32, r/m32","andl r32, r/m32","21 /r","V","V","","o= perand32","rw,r","Y","32" +"ANDN r32, r32V, r/m32","ANDNL r/m32, r32V, r32","andnl r/m32, r32V, r32",= "VEX.DDS.128.0F38.W0 F2 /r","V","V","BMI1","","rw,r,r","Y","32" +"ANDNPD xmm1, xmm2/m128","ANDNPD xmm2/m128, xmm1","andnpd xmm2/m128, xmm1"= ,"66 0F 55 /r","V","V","SSE2","","rw,r","","" +"ANDNPS xmm1, xmm2/m128","ANDNPS xmm2/m128, xmm1","andnps xmm2/m128, xmm1"= ,"0F 55 /r","V","V","SSE","","rw,r","","" +"ANDN r64, r64V, r/m64","ANDNQ r/m64, r64V, r64","andnq r/m64, r64V, r64",= "VEX.DDS.128.0F38.W1 F2 /r","N.S.","V","BMI1","","rw,r,r","Y","64" +"ANDPD xmm1, xmm2/m128","ANDPD xmm2/m128, xmm1","andpd xmm2/m128, xmm1","6= 6 0F 54 /r","V","V","SSE2","","rw,r","","" +"ANDPS xmm1, xmm2/m128","ANDPS xmm2/m128, xmm1","andps xmm2/m128, xmm1","0= F 54 /r","V","V","SSE","","rw,r","","" +"AND RAX, imm32","ANDQ imm32, RAX","andq imm32, RAX","REX.W 25 id","N.S.",= "V","","","rw,r","Y","64" +"AND r/m64, imm32","ANDQ imm32, r/m64","andq imm32, r/m64","REX.W 81 /4 id= ","N.S.","V","","","rw,r","Y","64" +"AND r/m64, imm8","ANDQ imm8, r/m64","andq imm8, r/m64","REX.W 83 /4 ib","= N.S.","V","","","rw,r","Y","64" +"AND r64, r/m64","ANDQ r/m64, r64","andq r/m64, r64","REX.W 23 /r","N.S.",= "V","","","rw,r","Y","64" +"AND r/m64, r64","ANDQ r64, r/m64","andq r64, r/m64","REX.W 21 /r","N.S.",= "V","","","rw,r","Y","64" +"AND AX, imm16","ANDW imm16, AX","andw imm16, AX","25 iw","V","V","","oper= 
and16","rw,r","Y","16" +"AND r/m16, imm16","ANDW imm16, r/m16","andw imm16, r/m16","81 /4 iw","V",= "V","","operand16","rw,r","Y","16" +"AND r/m16, imm8","ANDW imm8, r/m16","andw imm8, r/m16","83 /4 ib","V","V"= ,"","operand16","rw,r","Y","16" +"AND r16, r/m16","ANDW r/m16, r16","andw r/m16, r16","23 /r","V","V","","o= perand16","rw,r","Y","16" +"AND r/m16, r16","ANDW r16, r/m16","andw r16, r/m16","21 /r","V","V","","o= perand16","rw,r","Y","16" +"ARPL r/m16, r16","ARPL r16, r/m16","arpl r16, r/m16","63 /r","V","N.S.","= ","","rw,r","","" +"BEXTR r32, r/m32, r32V","BEXTRL r32V, r/m32, r32","bextrl r32V, r/m32, r3= 2","VEX.NDS.128.0F38.W0 F7 /r","V","V","BMI1","","w,r,r","Y","32" +"BEXTR r64, r/m64, r64V","BEXTRQ r64V, r/m64, r64","bextrq r64V, r/m64, r6= 4","VEX.NDS.128.0F38.W1 F7 /r","N.S.","V","BMI1","","w,r,r","Y","64" +"BEXTR_XOP r32, r/m32, imm32u","BEXTR_XOPL imm32u, r/m32, r32","bextr_xopl= imm32u, r/m32, r32","XOP.128.0A.WIG 10 /r","V","V","TBM","amd,operand16,op= erand32","w,r,r","Y","32" +"BEXTR_XOP r64, r/m64, imm32u","BEXTR_XOPQ imm32u, r/m64, r64","bextr_xopq= imm32u, r/m64, r64","XOP.128.0A.WIG 10 /r","N.S.","V","TBM","amd,operand64= ","w,r,r","Y","64" +"BLCFILL r32V, r/m32","BLCFILLL r/m32, r32V","blcfill r/m32, r32V","XOP.ND= D.128.09.WIG 01 /1","V","V","TBM","amd,operand16,operand32","w,r","Y","32" +"BLCFILL r64V, r/m64","BLCFILLQ r/m64, r64V","blcfill r/m64, r64V","XOP.ND= D.128.09.W1 01 /1","N.S.","V","TBM","amd,operand64","w,r","Y","64" +"BLCIC r32V, r/m32","BLCICL r/m32, r32V","blcicl r/m32, r32V","XOP.NDD.128= .09.WIG 01 /5","V","V","TBM","amd,operand16,operand32","w,r","Y","32" +"BLCIC r64V, r/m64","BLCICQ r/m64, r64V","blcicq r/m64, r64V","XOP.NDD.128= .09.WIG 01 /5","N.S.","V","TBM","amd,operand64","w,r","Y","64" +"BLCI r32V, r/m32","BLCIL r/m32, r32V","blcil r/m32, r32V","XOP.NDD.128.09= .WIG 02 /6","V","V","TBM","amd,operand16,operand32","w,r","Y","32" +"BLCI r64V, r/m64","BLCIQ r/m64, r64V","blciq r/m64, r64V","XOP.NDD.128.09= .WIG 02 /6","N.S.","V","TBM","amd,operand64","w,r","Y","64" +"BLCMSK r32V, r/m32","BLCMSKL r/m32, r32V","blcmskl r/m32, r32V","XOP.NDD.= 128.09.WIG 02 /1","V","V","TBM","amd,operand16,operand32","w,r","Y","32" +"BLCMSK r64V, r/m64","BLCMSKQ r/m64, r64V","blcmskq r/m64, r64V","XOP.NDD.= 128.09.WIG 02 /1","N.S.","V","TBM","amd,operand64","w,r","Y","64" +"BLCS r32V, r/m32","BLCSL r/m32, r32V","blcsl r/m32, r32V","XOP.NDD.128.09= .WIG 01 /3","V","V","TBM","amd,operand16,operand32","w,r","Y","32" +"BLCS r64V, r/m64","BLCSQ r/m64, r64V","blcsq r/m64, r64V","XOP.NDD.128.09= .WIG 01 /3","N.S.","V","TBM","amd,operand64","w,r","Y","64" +"BLENDPD xmm1, xmm2/m128, imm8u","BLENDPD imm8u, xmm2/m128, xmm1","blendpd= imm8u, xmm2/m128, xmm1","66 0F 3A 0D /r ib","V","V","SSE4_1","","rw,r,r","= ","" +"BLENDPS xmm1, xmm2/m128, imm8u","BLENDPS imm8u, xmm2/m128, xmm1","blendps= imm8u, xmm2/m128, xmm1","66 0F 3A 0C /r ib","V","V","SSE4_1","","rw,r,r","= ","" +"BLENDVPD xmm1, xmm2/m128, ","BLENDVPD , xmm2/m128, xmm1","ble= ndvpd , xmm2/m128, xmm1","66 0F 38 15 /r","V","V","SSE4_1","","rw,r,r= ","","" +"BLENDVPS xmm1, xmm2/m128, ","BLENDVPS , xmm2/m128, xmm1","ble= ndvps , xmm2/m128, xmm1","66 0F 38 14 /r","V","V","SSE4_1","","rw,r,r= ","","" +"BLSFILL r32V, r/m32","BLSFILLL r/m32, r32V","blsfill r/m32, r32V","XOP.ND= D.128.09.WIG 01 /2","V","V","TBM","amd,operand16,operand32","w,r","Y","32" +"BLSFILL r64V, r/m64","BLSFILLQ r/m64, r64V","blsfill r/m64, r64V","XOP.ND= D.128.09.W1 01 /2","N.S.","V","TBM","amd,operand64","w,r","Y","64" +"BLSIC r32V, r/m32","BLSICL 
r/m32, r32V","blsicl r/m32, r32V","XOP.NDD.128= .09.WIG 01 /6","V","V","TBM","amd,operand16,operand32","w,r","Y","32" +"BLSIC r64V, r/m64","BLSICQ r/m64, r64V","blsicq r/m64, r64V","XOP.NDD.128= .09.WIG 01 /6","N.S.","V","TBM","amd,operand64","w,r","Y","64" +"BLSI r32V, r/m32","BLSIL r/m32, r32V","blsil r/m32, r32V","VEX.NDD.128.0F= 38.W0 F3 /3","V","V","BMI1","","w,r","Y","32" +"BLSI r64V, r/m64","BLSIQ r/m64, r64V","blsiq r/m64, r64V","VEX.NDD.128.0F= 38.W1 F3 /3","N.S.","V","BMI1","","w,r","Y","64" +"BLSMSK r32V, r/m32","BLSMSKL r/m32, r32V","blsmskl r/m32, r32V","VEX.NDD.= 128.0F38.W0 F3 /2","V","V","BMI1","","w,r","Y","32" +"BLSMSK r64V, r/m64","BLSMSKQ r/m64, r64V","blsmskq r/m64, r64V","VEX.NDD.= 128.0F38.W1 F3 /2","N.S.","V","BMI1","","w,r","Y","64" +"BLSR r32V, r/m32","BLSRL r/m32, r32V","blsrl r/m32, r32V","VEX.NDD.128.0F= 38.W0 F3 /1","V","V","BMI1","","w,r","Y","32" +"BLSR r64V, r/m64","BLSRQ r/m64, r64V","blsrq r/m64, r64V","VEX.NDD.128.0F= 38.W1 F3 /1","N.S.","V","BMI1","","w,r","Y","64" +"BNDCL bnd1, r/m32","BNDCL r/m32, bnd1","bndcl r/m32, bnd1","F3 0F 1A /r",= "V","N.S.","MPX","","r,r","","" +"BNDCL bnd1, r/m64","BNDCL r/m64, bnd1","bndcl r/m64, bnd1","F3 0F 1A /r",= "N.S.","V","MPX","","r,r","","" +"BNDCN bnd1, r/m32","BNDCN r/m32, bnd1","bndcn r/m32, bnd1","F2 0F 1B /r",= "V","N.S.","MPX","","r,r","","" +"BNDCN bnd1, r/m64","BNDCN r/m64, bnd1","bndcn r/m64, bnd1","F2 0F 1B /r",= "N.S.","V","MPX","","r,r","","" +"BNDCU bnd1, r/m32","BNDCU r/m32, bnd1","bndcu r/m32, bnd1","F2 0F 1A /r",= "V","N.S.","MPX","","r,r","","" +"BNDCU bnd1, r/m64","BNDCU r/m64, bnd1","bndcu r/m64, bnd1","F2 0F 1A /r",= "N.S.","V","MPX","","r,r","","" +"BNDLDX bnd1, mib","BNDLDX mib, bnd1","bndldx mib, bnd1","0F 1A /r","V","V= ","MPX","modrm_memonly","w,r","","" +"BNDMK bnd1, m32","BNDMK m32, bnd1","bndmk m32, bnd1","F3 0F 1B /r","V","N= .S.","MPX","modrm_memonly","w,r","","" +"BNDMK bnd1, m64","BNDMK m64, bnd1","bndmk m64, bnd1","F3 0F 1B /r","N.S."= ,"V","MPX","modrm_memonly","w,r","","" +"BNDMOV bnd2/m128, bnd1","BNDMOV bnd1, bnd2/m128","bndmov bnd1, bnd2/m128"= ,"66 0F 1B /r","N.S.","V","MPX","","w,r","","" +"BNDMOV bnd2/m64, bnd1","BNDMOV bnd1, bnd2/m64","bndmov bnd1, bnd2/m64","6= 6 0F 1B /r","V","N.S.","MPX","","w,r","","" +"BNDMOV bnd1, bnd2/m128","BNDMOV bnd2/m128, bnd1","bndmov bnd2/m128, bnd1"= ,"66 0F 1A /r","N.S.","V","MPX","","w,r","","" +"BNDMOV bnd1, bnd2/m64","BNDMOV bnd2/m64, bnd1","bndmov bnd2/m64, bnd1","6= 6 0F 1A /r","V","N.S.","MPX","","w,r","","" +"BNDSTX mib, bnd1","BNDSTX bnd1, mib","bndstx bnd1, mib","0F 1B /r","V","V= ","MPX","modrm_memonly","w,r","","" +"BOUND r32, m32&32","BOUNDL m32&32, r32","boundl r32, m32&32","62 /r","V",= "N.S.","","modrm_memonly,operand32","r,r","Y","32" +"BOUND r16, m16&16","BOUNDW m16&16, r16","boundw r16, m16&16","62 /r","V",= "N.S.","","modrm_memonly,operand16","r,r","Y","16" +"BSF r32, r/m32","BSFL r/m32, r32","bsfl r/m32, r32","0F BC /r","V","V",""= ,"operand32","rw,r","Y","32" +"BSF r32, r/m32","BSFL r/m32, r32","bsfl r/m32, r32","F3 0F BC /r","V","V"= ,"","operand32","rw,r","Y","32" +"BSF r64, r/m64","BSFQ r/m64, r64","bsfq r/m64, r64","F3 REX.W 0F BC /r","= N.S.","V","","","rw,r","Y","64" +"BSF r64, r/m64","BSFQ r/m64, r64","bsfq r/m64, r64","REX.W 0F BC /r","N.S= .","V","","","rw,r","Y","64" +"BSF r16, r/m16","BSFW r/m16, r16","bsfw r/m16, r16","0F BC /r","V","V",""= ,"operand16","rw,r","Y","16" +"BSF r16, r/m16","BSFW r/m16, r16","bsfw r/m16, r16","F3 0F BC /r","V","V"= ,"","operand16","rw,r","Y","16" +"BSR r32, r/m32","BSRL r/m32, 
r32","bsrl r/m32, r32","0F BD /r","V","V",""= ,"operand32","rw,r","Y","32" +"BSR r32, r/m32","BSRL r/m32, r32","bsrl r/m32, r32","F3 0F BD /r","V","V"= ,"","operand32","rw,r","Y","32" +"BSR r64, r/m64","BSRQ r/m64, r64","bsrq r/m64, r64","F3 REX.W 0F BD /r","= N.S.","V","","","rw,r","Y","64" +"BSR r64, r/m64","BSRQ r/m64, r64","bsrq r/m64, r64","REX.W 0F BD /r","N.S= .","V","","","rw,r","Y","64" +"BSR r16, r/m16","BSRW r/m16, r16","bsrw r/m16, r16","0F BD /r","V","V",""= ,"operand16","rw,r","Y","16" +"BSR r16, r/m16","BSRW r/m16, r16","bsrw r/m16, r16","F3 0F BD /r","V","V"= ,"","operand16","rw,r","Y","16" +"BSWAP r32op","BSWAPL r32op","bswap r32op","0F C8+rd","V","V","486","opera= nd32","rw","Y","32" +"BSWAP r64op","BSWAPQ r64op","bswap r64op","REX.W 0F C8+ro","N.S.","V","48= 6","","rw","Y","64" +"BSWAP r16op","BSWAPW r16op","bswap r16op","0F C8+rw","V","V","486","opera= nd16","rw","Y","16" +"BTC r/m32, imm8u","BTCL imm8u, r/m32","btcl imm8u, r/m32","0F BA /7 ib","= V","V","","operand32","rw,r","Y","32" +"BTC r/m32, r32","BTCL r32, r/m32","btcl r32, r/m32","0F BB /r","V","V",""= ,"operand32","rw,r","Y","32" +"BTC r/m64, imm8u","BTCQ imm8u, r/m64","btcq imm8u, r/m64","REX.W 0F BA /7= ib","N.S.","V","","","rw,r","Y","64" +"BTC r/m64, r64","BTCQ r64, r/m64","btcq r64, r/m64","REX.W 0F BB /r","N.S= .","V","","","rw,r","Y","64" +"BTC r/m16, imm8u","BTCW imm8u, r/m16","btcw imm8u, r/m16","0F BA /7 ib","= V","V","","operand16","rw,r","Y","16" +"BTC r/m16, r16","BTCW r16, r/m16","btcw r16, r/m16","0F BB /r","V","V",""= ,"operand16","rw,r","Y","16" +"BT r/m32, imm8u","BTL imm8u, r/m32","btl imm8u, r/m32","0F BA /4 ib","V",= "V","","operand32","r,r","Y","32" +"BT r/m32, r32","BTL r32, r/m32","btl r32, r/m32","0F A3 /r","V","V","","o= perand32","r,r","Y","32" +"BT r/m64, imm8u","BTQ imm8u, r/m64","btq imm8u, r/m64","REX.W 0F BA /4 ib= ","N.S.","V","","","r,r","Y","64" +"BT r/m64, r64","BTQ r64, r/m64","btq r64, r/m64","REX.W 0F A3 /r","N.S.",= "V","","","r,r","Y","64" +"BTR r/m32, imm8u","BTRL imm8u, r/m32","btrl imm8u, r/m32","0F BA /6 ib","= V","V","","operand32","rw,r","Y","32" +"BTR r/m32, r32","BTRL r32, r/m32","btrl r32, r/m32","0F B3 /r","V","V",""= ,"operand32","rw,r","Y","32" +"BTR r/m64, imm8u","BTRQ imm8u, r/m64","btrq imm8u, r/m64","REX.W 0F BA /6= ib","N.S.","V","","","rw,r","Y","64" +"BTR r/m64, r64","BTRQ r64, r/m64","btrq r64, r/m64","REX.W 0F B3 /r","N.S= .","V","","","rw,r","Y","64" +"BTR r/m16, imm8u","BTRW imm8u, r/m16","btrw imm8u, r/m16","0F BA /6 ib","= V","V","","operand16","rw,r","Y","16" +"BTR r/m16, r16","BTRW r16, r/m16","btrw r16, r/m16","0F B3 /r","V","V",""= ,"operand16","rw,r","Y","16" +"BTS r/m32, imm8u","BTSL imm8u, r/m32","btsl imm8u, r/m32","0F BA /5 ib","= V","V","","operand32","rw,r","Y","32" +"BTS r/m32, r32","BTSL r32, r/m32","btsl r32, r/m32","0F AB /r","V","V",""= ,"operand32","rw,r","Y","32" +"BTS r/m64, imm8u","BTSQ imm8u, r/m64","btsq imm8u, r/m64","REX.W 0F BA /5= ib","N.S.","V","","","rw,r","Y","64" +"BTS r/m64, r64","BTSQ r64, r/m64","btsq r64, r/m64","REX.W 0F AB /r","N.S= .","V","","","rw,r","Y","64" +"BTS r/m16, imm8u","BTSW imm8u, r/m16","btsw imm8u, r/m16","0F BA /5 ib","= V","V","","operand16","rw,r","Y","16" +"BTS r/m16, r16","BTSW r16, r/m16","btsw r16, r/m16","0F AB /r","V","V",""= ,"operand16","rw,r","Y","16" +"BT r/m16, imm8u","BTW imm8u, r/m16","btw imm8u, r/m16","0F BA /4 ib","V",= "V","","operand16","r,r","Y","16" +"BT r/m16, r16","BTW r16, r/m16","btw r16, r/m16","0F A3 /r","V","V","","o= perand16","r,r","Y","16" +"BZHI r32, r/m32, 
r32V","BZHIL r32V, r/m32, r32","bzhil r32V, r/m32, r32",= "VEX.NDS.128.0F38.W0 F5 /r","V","V","BMI2","","w,r,r","Y","32" +"BZHI r64, r/m64, r64V","BZHIQ r64V, r/m64, r64","bzhiq r64V, r/m64, r64",= "VEX.NDS.128.0F38.W1 F5 /r","N.S.","V","BMI2","","w,r,r","Y","64" +"CALL rel16","CALL rel16","call rel16","E8 cw","V","N.S.","","operand16","= r","Y","" +"CALL rel32","CALL rel32","call rel32","E8 cd","V","N.S.","","operand32","= r","Y","" +"CALL rel32","CALL rel32","call rel32","E8 cd","N.S.","V","","default64","= r","Y","" +"CALL r/m32","CALLL* r/m32","calll* r/m32","FF /2","V","N.S.","","operand3= 2","r","Y","32" +"CALL r/m64","CALLQ* r/m64","callq* r/m64","FF /2","N.S.","V","","default6= 4","r","Y","64" +"CALL r/m16","CALLW* r/m16","callw* r/m16","FF /2","V","N.S.","","operand1= 6","r","Y","16" +"CBW","CBW","cbtw","98","V","V","","operand16","","","" +"CDQ","CDQ","cltd","99","V","V","","operand32","","","" +"CDQE","CDQE","cltq","REX.W 98","N.S.","V","","","","","" +"CLAC","CLAC","clac","0F 01 CA","V","V","","","","","" +"CLC","CLC","clc","F8","V","V","","","","","" +"CLD","CLD","cld","FC","V","V","","","","","" +"CLFLUSH m8","CLFLUSH m8","clflush m8","0F AE /7","V","V","","modrm_memonl= y","r","","" +"CLFLUSHOPT m8","CLFLUSHOPT m8","clflushopt m8","66 0F AE /7","V","V","","= modrm_memonly","r","","" +"CLGI","CLGI","clgi","0F 01 DD","V","V","SVM","amd","","","" +"CLI","CLI","cli","FA","V","V","","","","","" +"CLRSSBSY m64","CLRSSBSY m64","clrssbsy m64","F3 0F AE /6","V","V","CET","= modrm_memonly","w","","" +"CLTS","CLTS","clts","0F 06","V","V","","","","","" +"CLWB m8","CLWB m8","clwb m8","66 0F AE /6","V","V","CLWB","modrm_memonly"= ,"r","","" +"CLZERO EAX","CLZEROL EAX","clzerol EAX","0F 01 FC","V","V","CLZERO","amd,= modrm_regonly,operand32","r","Y","32" +"CLZERO RAX","CLZEROQ RAX","clzeroq RAX","REX.W 0F 01 FC","N.S.","V","CLZE= RO","amd,modrm_regonly","r","Y","64" +"CLZERO AX","CLZEROW AX","clzerow AX","0F 01 FC","V","V","CLZERO","amd,mod= rm_regonly,operand16","r","Y","16" +"CMC","CMC","cmc","F5","V","V","","","","","" +"CMOVC r16, r/m16","CMOVC r/m16, r16","cmovc r/m16, r16","0F 42 /r","V","V= ","","P6,operand16,pseudo","rw,r","","" +"CMOVC r32, r/m32","CMOVC r/m32, r32","cmovc r/m32, r32","0F 42 /r","V","V= ","","P6,operand32,pseudo","rw,r","","" +"CMOVC r64, r/m64","CMOVC r/m64, r64","cmovc r/m64, r64","REX.W 0F 42 /r",= "N.E.","V","","pseudo","rw,r","","" +"CMOVAE r32, r/m32","CMOVLCC r/m32, r32","cmovael r/m32, r32","0F 43 /r","= V","V","","P6,operand32","rw,r","Y","32" +"CMOVB r32, r/m32","CMOVLCS r/m32, r32","cmovbl r/m32, r32","0F 42 /r","V"= ,"V","","P6,operand32","rw,r","Y","32" +"CMOVE r32, r/m32","CMOVLEQ r/m32, r32","cmovel r/m32, r32","0F 44 /r","V"= ,"V","","P6,operand32","rw,r","Y","32" +"CMOVGE r32, r/m32","CMOVLGE r/m32, r32","cmovgel r/m32, r32","0F 4D /r","= V","V","","P6,operand32","rw,r","Y","32" +"CMOVG r32, r/m32","CMOVLGT r/m32, r32","cmovgl r/m32, r32","0F 4F /r","V"= ,"V","","P6,operand32","rw,r","Y","32" +"CMOVA r32, r/m32","CMOVLHI r/m32, r32","cmoval r/m32, r32","0F 47 /r","V"= ,"V","","P6,operand32","rw,r","Y","32" +"CMOVLE r32, r/m32","CMOVLLE r/m32, r32","cmovlel r/m32, r32","0F 4E /r","= V","V","","P6,operand32","rw,r","Y","32" +"CMOVBE r32, r/m32","CMOVLLS r/m32, r32","cmovbel r/m32, r32","0F 46 /r","= V","V","","P6,operand32","rw,r","Y","32" +"CMOVL r32, r/m32","CMOVLLT r/m32, r32","cmovll r/m32, r32","0F 4C /r","V"= ,"V","","P6,operand32","rw,r","Y","32" +"CMOVS r32, r/m32","CMOVLMI r/m32, r32","cmovsl r/m32, r32","0F 48 /r","V"= 
,"V","","P6,operand32","rw,r","Y","32" +"CMOVNE r32, r/m32","CMOVLNE r/m32, r32","cmovnel r/m32, r32","0F 45 /r","= V","V","","P6,operand32","rw,r","Y","32" +"CMOVNO r32, r/m32","CMOVLOC r/m32, r32","cmovnol r/m32, r32","0F 41 /r","= V","V","","P6,operand32","rw,r","Y","32" +"CMOVO r32, r/m32","CMOVLOS r/m32, r32","cmovol r/m32, r32","0F 40 /r","V"= ,"V","","P6,operand32","rw,r","Y","32" +"CMOVNP r32, r/m32","CMOVLPC r/m32, r32","cmovnpl r/m32, r32","0F 4B /r","= V","V","","P6,operand32","rw,r","Y","32" +"CMOVNS r32, r/m32","CMOVLPL r/m32, r32","cmovnsl r/m32, r32","0F 49 /r","= V","V","","P6,operand32","rw,r","Y","32" +"CMOVP r32, r/m32","CMOVLPS r/m32, r32","cmovpl r/m32, r32","0F 4A /r","V"= ,"V","","P6,operand32","rw,r","Y","32" +"CMOVNA r16, r/m16","CMOVNA r/m16, r16","cmovna r/m16, r16","0F 46 /r","V"= ,"V","","P6,operand16,pseudo","rw,r","","" +"CMOVNA r32, r/m32","CMOVNA r/m32, r32","cmovna r/m32, r32","0F 46 /r","V"= ,"V","","P6,operand32,pseudo","rw,r","","" +"CMOVNA r64, r/m64","CMOVNA r/m64, r64","cmovna r/m64, r64","REX.W 0F 46 /= r","N.E.","V","","pseudo","rw,r","","" +"CMOVNAE r16, r/m16","CMOVNAE r/m16, r16","cmovnae r/m16, r16","0F 42 /r",= "V","V","","P6,operand16,pseudo","rw,r","","" +"CMOVNAE r32, r/m32","CMOVNAE r/m32, r32","cmovnae r/m32, r32","0F 42 /r",= "V","V","","P6,operand32,pseudo","rw,r","","" +"CMOVNAE r64, r/m64","CMOVNAE r/m64, r64","cmovnae r/m64, r64","REX.W 0F 4= 2 /r","N.E.","V","","pseudo","rw,r","","" +"CMOVNB r16, r/m16","CMOVNB r/m16, r16","cmovnb r/m16, r16","0F 43 /r","V"= ,"V","","P6,operand16,pseudo","rw,r","","" +"CMOVNB r32, r/m32","CMOVNB r/m32, r32","cmovnb r/m32, r32","0F 43 /r","V"= ,"V","","P6,operand32,pseudo","rw,r","","" +"CMOVNB r64, r/m64","CMOVNB r/m64, r64","cmovnb r/m64, r64","REX.W 0F 43 /= r","N.E.","V","","pseudo","rw,r","","" +"CMOVNBE r16, r/m16","CMOVNBE r/m16, r16","cmovnbe r/m16, r16","0F 47 /r",= "V","V","","P6,operand16,pseudo","rw,r","","" +"CMOVNBE r32, r/m32","CMOVNBE r/m32, r32","cmovnbe r/m32, r32","0F 47 /r",= "V","V","","P6,operand32,pseudo","rw,r","","" +"CMOVNBE r64, r/m64","CMOVNBE r/m64, r64","cmovnbe r/m64, r64","REX.W 0F 4= 7 /r","N.E.","V","","pseudo","rw,r","","" +"CMOVNC r16, r/m16","CMOVNC r/m16, r16","cmovnc r/m16, r16","0F 43 /r","V"= ,"V","","P6,operand16,pseudo","rw,r","","" +"CMOVNC r32, r/m32","CMOVNC r/m32, r32","cmovnc r/m32, r32","0F 43 /r","V"= ,"V","","P6,operand32,pseudo","rw,r","","" +"CMOVNC r64, r/m64","CMOVNC r/m64, r64","cmovnc r/m64, r64","REX.W 0F 43 /= r","N.E.","V","","pseudo","rw,r","","" +"CMOVNG r16, r/m16","CMOVNG r/m16, r16","cmovng r/m16, r16","0F 4E /r","V"= ,"V","","P6,operand16,pseudo","rw,r","","" +"CMOVNG r32, r/m32","CMOVNG r/m32, r32","cmovng r/m32, r32","0F 4E /r","V"= ,"V","","P6,operand32,pseudo","rw,r","","" +"CMOVNG r64, r/m64","CMOVNG r/m64, r64","cmovng r/m64, r64","REX.W 0F 4E /= r","N.E.","V","","pseudo","rw,r","","" +"CMOVNGE r16, r/m16","CMOVNGE r/m16, r16","cmovnge r/m16, r16","0F 4C /r",= "V","V","","P6,operand16,pseudo","rw,r","","" +"CMOVNGE r32, r/m32","CMOVNGE r/m32, r32","cmovnge r/m32, r32","0F 4C /r",= "V","V","","P6,operand32,pseudo","rw,r","","" +"CMOVNGE r64, r/m64","CMOVNGE r/m64, r64","cmovnge r/m64, r64","REX.W 0F 4= C /r","N.E.","V","","pseudo","rw,r","","" +"CMOVNL r16, r/m16","CMOVNL r/m16, r16","cmovnl r/m16, r16","0F 4D /r","V"= ,"V","","P6,operand16,pseudo","rw,r","","" +"CMOVNL r32, r/m32","CMOVNL r/m32, r32","cmovnl r/m32, r32","0F 4D /r","V"= ,"V","","P6,operand32,pseudo","rw,r","","" +"CMOVNL r64, r/m64","CMOVNL r/m64, r64","cmovnl 
r/m64, r64","REX.W 0F 4D /= r","N.E.","V","","pseudo","rw,r","","" +"CMOVNLE r16, r/m16","CMOVNLE r/m16, r16","cmovnle r/m16, r16","0F 4F /r",= "V","V","","P6,operand16,pseudo","rw,r","","" +"CMOVNLE r32, r/m32","CMOVNLE r/m32, r32","cmovnle r/m32, r32","0F 4F /r",= "V","V","","P6,operand32,pseudo","rw,r","","" +"CMOVNLE r64, r/m64","CMOVNLE r/m64, r64","cmovnle r/m64, r64","REX.W 0F 4= F /r","N.E.","V","","pseudo","rw,r","","" +"CMOVNZ r16, r/m16","CMOVNZ r/m16, r16","cmovnz r/m16, r16","0F 45 /r","V"= ,"V","","P6,operand16,pseudo","rw,r","","" +"CMOVNZ r32, r/m32","CMOVNZ r/m32, r32","cmovnz r/m32, r32","0F 45 /r","V"= ,"V","","P6,operand32,pseudo","rw,r","","" +"CMOVNZ r64, r/m64","CMOVNZ r/m64, r64","cmovnz r/m64, r64","REX.W 0F 45 /= r","N.E.","V","","pseudo","rw,r","","" +"CMOVPE r16, r/m16","CMOVPE r/m16, r16","cmovpe r/m16, r16","0F 4A /r","V"= ,"V","","P6,operand16,pseudo","rw,r","","" +"CMOVPE r32, r/m32","CMOVPE r/m32, r32","cmovpe r/m32, r32","0F 4A /r","V"= ,"V","","P6,operand32,pseudo","rw,r","","" +"CMOVPE r64, r/m64","CMOVPE r/m64, r64","cmovpe r/m64, r64","REX.W 0F 4A /= r","N.E.","V","","pseudo","rw,r","","" +"CMOVPO r16, r/m16","CMOVPO r/m16, r16","cmovpo r/m16, r16","0F 4B /r","V"= ,"V","","P6,operand16,pseudo","rw,r","","" +"CMOVPO r32, r/m32","CMOVPO r/m32, r32","cmovpo r/m32, r32","0F 4B /r","V"= ,"V","","P6,operand32,pseudo","rw,r","","" +"CMOVPO r64, r/m64","CMOVPO r/m64, r64","cmovpo r/m64, r64","REX.W 0F 4B /= r","N.E.","V","","pseudo","rw,r","","" +"CMOVAE r64, r/m64","CMOVQCC r/m64, r64","cmovaeq r/m64, r64","REX.W 0F 43= /r","N.S.","V","","","rw,r","Y","64" +"CMOVB r64, r/m64","CMOVQCS r/m64, r64","cmovbq r/m64, r64","REX.W 0F 42 /= r","N.S.","V","","","rw,r","Y","64" +"CMOVE r64, r/m64","CMOVQEQ r/m64, r64","cmoveq r/m64, r64","REX.W 0F 44 /= r","N.S.","V","","","rw,r","Y","64" +"CMOVGE r64, r/m64","CMOVQGE r/m64, r64","cmovgeq r/m64, r64","REX.W 0F 4D= /r","N.S.","V","","","rw,r","Y","64" +"CMOVG r64, r/m64","CMOVQGT r/m64, r64","cmovgq r/m64, r64","REX.W 0F 4F /= r","N.S.","V","","","rw,r","Y","64" +"CMOVA r64, r/m64","CMOVQHI r/m64, r64","cmovaq r/m64, r64","REX.W 0F 47 /= r","N.S.","V","","","rw,r","Y","64" +"CMOVLE r64, r/m64","CMOVQLE r/m64, r64","cmovleq r/m64, r64","REX.W 0F 4E= /r","N.S.","V","","","rw,r","Y","64" +"CMOVBE r64, r/m64","CMOVQLS r/m64, r64","cmovbeq r/m64, r64","REX.W 0F 46= /r","N.S.","V","","","rw,r","Y","64" +"CMOVL r64, r/m64","CMOVQLT r/m64, r64","cmovlq r/m64, r64","REX.W 0F 4C /= r","N.S.","V","","","rw,r","Y","64" +"CMOVS r64, r/m64","CMOVQMI r/m64, r64","cmovsq r/m64, r64","REX.W 0F 48 /= r","N.S.","V","","","rw,r","Y","64" +"CMOVNE r64, r/m64","CMOVQNE r/m64, r64","cmovneq r/m64, r64","REX.W 0F 45= /r","N.S.","V","","","rw,r","Y","64" +"CMOVNO r64, r/m64","CMOVQOC r/m64, r64","cmovnoq r/m64, r64","REX.W 0F 41= /r","N.S.","V","","","rw,r","Y","64" +"CMOVO r64, r/m64","CMOVQOS r/m64, r64","cmovoq r/m64, r64","REX.W 0F 40 /= r","N.S.","V","","","rw,r","Y","64" +"CMOVNP r64, r/m64","CMOVQPC r/m64, r64","cmovnpq r/m64, r64","REX.W 0F 4B= /r","N.S.","V","","","rw,r","Y","64" +"CMOVNS r64, r/m64","CMOVQPL r/m64, r64","cmovnsq r/m64, r64","REX.W 0F 49= /r","N.S.","V","","","rw,r","Y","64" +"CMOVP r64, r/m64","CMOVQPS r/m64, r64","cmovpq r/m64, r64","REX.W 0F 4A /= r","N.S.","V","","","rw,r","Y","64" +"CMOVAE r16, r/m16","CMOVWCC r/m16, r16","cmovaew r/m16, r16","0F 43 /r","= V","V","","P6,operand16","rw,r","Y","16" +"CMOVB r16, r/m16","CMOVWCS r/m16, r16","cmovbw r/m16, r16","0F 42 /r","V"= ,"V","","P6,operand16","rw,r","Y","16" 
+"CMOVE r16, r/m16","CMOVWEQ r/m16, r16","cmovew r/m16, r16","0F 44 /r","V"= ,"V","","P6,operand16","rw,r","Y","16" +"CMOVGE r16, r/m16","CMOVWGE r/m16, r16","cmovgew r/m16, r16","0F 4D /r","= V","V","","P6,operand16","rw,r","Y","16" +"CMOVG r16, r/m16","CMOVWGT r/m16, r16","cmovgw r/m16, r16","0F 4F /r","V"= ,"V","","P6,operand16","rw,r","Y","16" +"CMOVA r16, r/m16","CMOVWHI r/m16, r16","cmovaw r/m16, r16","0F 47 /r","V"= ,"V","","P6,operand16","rw,r","Y","16" +"CMOVLE r16, r/m16","CMOVWLE r/m16, r16","cmovlew r/m16, r16","0F 4E /r","= V","V","","P6,operand16","rw,r","Y","16" +"CMOVBE r16, r/m16","CMOVWLS r/m16, r16","cmovbew r/m16, r16","0F 46 /r","= V","V","","P6,operand16","rw,r","Y","16" +"CMOVL r16, r/m16","CMOVWLT r/m16, r16","cmovlw r/m16, r16","0F 4C /r","V"= ,"V","","P6,operand16","rw,r","Y","16" +"CMOVS r16, r/m16","CMOVWMI r/m16, r16","cmovsw r/m16, r16","0F 48 /r","V"= ,"V","","P6,operand16","rw,r","Y","16" +"CMOVNE r16, r/m16","CMOVWNE r/m16, r16","cmovnew r/m16, r16","0F 45 /r","= V","V","","P6,operand16","rw,r","Y","16" +"CMOVNO r16, r/m16","CMOVWOC r/m16, r16","cmovnow r/m16, r16","0F 41 /r","= V","V","","P6,operand16","rw,r","Y","16" +"CMOVO r16, r/m16","CMOVWOS r/m16, r16","cmovow r/m16, r16","0F 40 /r","V"= ,"V","","P6,operand16","rw,r","Y","16" +"CMOVNP r16, r/m16","CMOVWPC r/m16, r16","cmovnpw r/m16, r16","0F 4B /r","= V","V","","P6,operand16","rw,r","Y","16" +"CMOVNS r16, r/m16","CMOVWPL r/m16, r16","cmovnsw r/m16, r16","0F 49 /r","= V","V","","P6,operand16","rw,r","Y","16" +"CMOVP r16, r/m16","CMOVWPS r/m16, r16","cmovpw r/m16, r16","0F 4A /r","V"= ,"V","","P6,operand16","rw,r","Y","16" +"CMOVZ r16, r/m16","CMOVZ r/m16, r16","cmovz r/m16, r16","0F 44 /r","V","V= ","","P6,operand16,pseudo","rw,r","","" +"CMOVZ r32, r/m32","CMOVZ r/m32, r32","cmovz r/m32, r32","0F 44 /r","V","V= ","","P6,operand32,pseudo","rw,r","","" +"CMOVZ r64, r/m64","CMOVZ r/m64, r64","cmovz r/m64, r64","REX.W 0F 44 /r",= "N.E.","V","","pseudo","rw,r","","" +"CMP AL, imm8","CMPB AL, imm8","cmpb imm8, AL","3C ib","V","V","","","r,r"= ,"Y","8" +"CMP r/m8, imm8","CMPB r/m8, imm8","cmpb imm8, r/m8","80 /7 ib","V","V",""= ,"","r,r","Y","8" +"CMP r/m8, imm8","CMPB r/m8, imm8","cmpb imm8, r/m8","82 /7 ib","V","N.S."= ,"","","r,r","Y","8" +"CMP r/m8, imm8","CMPB r/m8, imm8","cmpb imm8, r/m8","REX 80 /7 ib","N.E."= ,"V","","pseudo64","r,r","Y","8" +"CMP r/m8, r8","CMPB r/m8, r8","cmpb r8, r/m8","38 /r","V","V","","","r,r"= ,"Y","8" +"CMP r/m8, r8","CMPB r/m8, r8","cmpb r8, r/m8","REX 38 /r","N.E.","V","","= pseudo64","r,r","Y","8" +"CMP r8, r/m8","CMPB r8, r/m8","cmpb r/m8, r8","3A /r","V","V","","","r,r"= ,"Y","8" +"CMP r8, r/m8","CMPB r8, r/m8","cmpb r/m8, r8","REX 3A /r","N.E.","V","","= pseudo64","r,r","Y","8" +"CMP EAX, imm32","CMPL EAX, imm32","cmpl imm32, EAX","3D id","V","V","","o= perand32","r,r","Y","32" +"CMP r/m32, imm32","CMPL r/m32, imm32","cmpl imm32, r/m32","81 /7 id","V",= "V","","operand32","r,r","Y","32" +"CMP r/m32, imm8","CMPL r/m32, imm8","cmpl imm8, r/m32","83 /7 ib","V","V"= ,"","operand32","r,r","Y","32" +"CMP r/m32, r32","CMPL r/m32, r32","cmpl r32, r/m32","39 /r","V","V","","o= perand32","r,r","Y","32" +"CMP r32, r/m32","CMPL r32, r/m32","cmpl r/m32, r32","3B /r","V","V","","o= perand32","r,r","Y","32" +"CMPPD xmm1, xmm2/m128, imm8u","CMPPD imm8u, xmm1, xmm2/m128","cmppd imm8u= , xmm2/m128, xmm1","66 0F C2 /r ib","V","V","SSE2","","rw,r,r","","" +"CMPPS xmm1, xmm2/m128, imm8u","CMPPS imm8u, xmm1, xmm2/m128","cmpps imm8u= , xmm2/m128, xmm1","0F C2 /r 
ib","V","V","SSE","","rw,r,r","","" +"CMP RAX, imm32","CMPQ RAX, imm32","cmpq imm32, RAX","REX.W 3D id","N.S.",= "V","","","r,r","Y","64" +"CMP r/m64, imm32","CMPQ r/m64, imm32","cmpq imm32, r/m64","REX.W 81 /7 id= ","N.S.","V","","","r,r","Y","64" +"CMP r/m64, imm8","CMPQ r/m64, imm8","cmpq imm8, r/m64","REX.W 83 /7 ib","= N.S.","V","","","r,r","Y","64" +"CMP r/m64, r64","CMPQ r/m64, r64","cmpq r64, r/m64","REX.W 39 /r","N.S.",= "V","","","r,r","Y","64" +"CMP r64, r/m64","CMPQ r64, r/m64","cmpq r/m64, r64","REX.W 3B /r","N.S.",= "V","","","r,r","Y","64" +"CMPSB","CMPSB","cmpsb","A6","V","V","","","","","" +"CMPSD xmm1, xmm2/m64, imm8u","CMPSD imm8u, xmm1, xmm2/m64","cmpsd imm8u, = xmm2/m64, xmm1","F2 0F C2 /r ib","V","V","SSE2","","rw,r,r","","" +"CMPSD","CMPSL","cmpsl","A7","V","V","","operand32","","","" +"CMPSQ","CMPSQ","cmpsq","REX.W A7","N.S.","V","","","","","" +"CMPSS xmm1, xmm2/m32, imm8u","CMPSS imm8u, xmm1, xmm2/m32","cmpss imm8u, = xmm2/m32, xmm1","F3 0F C2 /r ib","V","V","SSE","","rw,r,r","","" +"CMPSW","CMPSW","cmpsw","A7","V","V","","operand16","","","" +"CMP AX, imm16","CMPW AX, imm16","cmpw imm16, AX","3D iw","V","V","","oper= and16","r,r","Y","16" +"CMP r/m16, imm16","CMPW r/m16, imm16","cmpw imm16, r/m16","81 /7 iw","V",= "V","","operand16","r,r","Y","16" +"CMP r/m16, imm8","CMPW r/m16, imm8","cmpw imm8, r/m16","83 /7 ib","V","V"= ,"","operand16","r,r","Y","16" +"CMP r/m16, r16","CMPW r/m16, r16","cmpw r16, r/m16","39 /r","V","V","","o= perand16","r,r","Y","16" +"CMP r16, r/m16","CMPW r16, r/m16","cmpw r/m16, r16","3B /r","V","V","","o= perand16","r,r","Y","16" +"CMPXCHG16B m128","CMPXCHG16B m128","cmpxchg16b m128","REX.W 0F C7 /1","N.= S.","V","","modrm_memonly","rw","","" +"CMPXCHG8B m64","CMPXCHG8B m64","cmpxchg8b m64","0F C7 /1","V","V","Pentiu= m","modrm_memonly,operand16,operand32","rw","","" +"CMPXCHG r/m8, r8","CMPXCHGB r8, r/m8","cmpxchgb r8, r/m8","0F B0 /r","V",= "V","486","","rw,r","Y","8" +"CMPXCHG r/m8, r8","CMPXCHGB r8, r/m8","cmpxchgb r8, r/m8","REX 0F B0 /r",= "N.E.","V","","pseudo64","rw,r","Y","8" +"CMPXCHG r/m32, r32","CMPXCHGL r32, r/m32","cmpxchgl r32, r/m32","0F B1 /r= ","V","V","486","operand32","rw,r","Y","32" +"CMPXCHG r/m64, r64","CMPXCHGQ r64, r/m64","cmpxchgq r64, r/m64","REX.W 0F= B1 /r","N.S.","V","486","","rw,r","Y","64" +"CMPXCHG r/m16, r16","CMPXCHGW r16, r/m16","cmpxchgw r16, r/m16","0F B1 /r= ","V","V","486","operand16","rw,r","Y","16" +"COMISD xmm1, xmm2/m64","COMISD xmm2/m64, xmm1","comisd xmm2/m64, xmm1","6= 6 0F 2F /r","V","V","SSE2","","r,r","","" +"COMISS xmm1, xmm2/m32","COMISS xmm2/m32, xmm1","comiss xmm2/m32, xmm1","0= F 2F /r","V","V","SSE","","r,r","","" +"CPUID","CPUID","cpuid","0F A2","V","V","486","","","","" +"CQO","CQO","cqto","REX.W 99","N.S.","V","","","","","" +"CRC32 r32, r/m8","CRC32B r/m8, r32","crc32b r/m8, r32","F2 0F 38 F0 /r","= V","V","SSE4_2","operand16,operand32","rw,r","Y","8" +"CRC32 r32, r/m8","CRC32B r/m8, r32","crc32b r/m8, r32","F2 REX 0F 38 F0 /= r","N.E.","V","","pseudo64","rw,r","Y","8" +"CRC32 r64, r/m8","CRC32B r/m8, r64","crc32b r/m8, r64","F2 REX.W 0F 38 F0= /r","N.S.","V","SSE4_2","","rw,r","Y","8" +"CRC32 r32, r/m32","CRC32L r/m32, r32","crc32l r/m32, r32","F2 0F 38 F1 /r= ","V","V","SSE4_2","operand32","rw,r","Y","32" +"CRC32 r64, r/m64","CRC32Q r/m64, r64","crc32q r/m64, r64","F2 REX.W 0F 38= F1 /r","N.S.","V","SSE4_2","","rw,r","Y","64" +"CRC32 r32, r/m16","CRC32W r/m16, r32","crc32w r/m16, r32","F2 0F 38 F1 /r= ","V","V","SSE4_2","operand16","rw,r","Y","16" +"CVTPD2PI mm1, 
xmm2/m128","CVTPD2PI xmm2/m128, mm1","cvtpd2pi xmm2/m128, m= m1","66 0F 2D /r","V","V","SSE2","","w,r","","" +"CVTPD2DQ xmm1, xmm2/m128","CVTPD2PL xmm2/m128, xmm1","cvtpd2dq xmm2/m128,= xmm1","F2 0F E6 /r","V","V","SSE2","","w,r","","" +"CVTPD2PS xmm1, xmm2/m128","CVTPD2PS xmm2/m128, xmm1","cvtpd2ps xmm2/m128,= xmm1","66 0F 5A /r","V","V","SSE2","","w,r","","" +"CVTPI2PD xmm1, mm2/m64","CVTPI2PD mm2/m64, xmm1","cvtpi2pd mm2/m64, xmm1"= ,"66 0F 2A /r","V","V","SSE2","","w,r","","" +"CVTPI2PS xmm1, mm2/m64","CVTPI2PS mm2/m64, xmm1","cvtpi2ps mm2/m64, xmm1"= ,"0F 2A /r","V","V","SSE","","w,r","","" +"CVTDQ2PD xmm1, xmm2/m64","CVTPL2PD xmm2/m64, xmm1","cvtdq2pd xmm2/m64, xm= m1","F3 0F E6 /r","V","V","SSE2","","w,r","","" +"CVTDQ2PS xmm1, xmm2/m128","CVTPL2PS xmm2/m128, xmm1","cvtdq2ps xmm2/m128,= xmm1","0F 5B /r","V","V","SSE2","","w,r","","" +"CVTPS2PD xmm1, xmm2/m64","CVTPS2PD xmm2/m64, xmm1","cvtps2pd xmm2/m64, xm= m1","0F 5A /r","V","V","SSE2","","w,r","","" +"CVTPS2PI mm1, xmm2/m64","CVTPS2PI xmm2/m64, mm1","cvtps2pi xmm2/m64, mm1"= ,"0F 2D /r","V","V","SSE","","w,r","","" +"CVTPS2DQ xmm1, xmm2/m128","CVTPS2PL xmm2/m128, xmm1","cvtps2dq xmm2/m128,= xmm1","66 0F 5B /r","V","V","SSE2","","w,r","","" +"CVTSD2SI r32, xmm2/m64","CVTSD2SL xmm2/m64, r32","cvtsd2si xmm2/m64, r32"= ,"F2 0F 2D /r","V","V","SSE2","operand16,operand32","w,r","Y","32" +"CVTSD2SI r64, xmm2/m64","CVTSD2SL xmm2/m64, r64","cvtsd2siq xmm2/m64, r64= ","F2 REX.W 0F 2D /r","N.S.","V","SSE2","","w,r","Y","64" +"CVTSD2SS xmm1, xmm2/m64","CVTSD2SS xmm2/m64, xmm1","cvtsd2ss xmm2/m64, xm= m1","F2 0F 5A /r","V","V","SSE2","","w,r","","" +"CVTSI2SD xmm1, r/m32","CVTSL2SD r/m32, xmm1","cvtsi2sdl r/m32, xmm1","F2 = 0F 2A /r","V","V","SSE2","operand16,operand32","w,r","Y","32" +"CVTSI2SS xmm1, r/m32","CVTSL2SS r/m32, xmm1","cvtsi2ssl r/m32, xmm1","F3 = 0F 2A /r","V","V","SSE","operand16,operand32","w,r","Y","32" +"CVTSI2SD xmm1, r/m64","CVTSQ2SD r/m64, xmm1","cvtsi2sdq r/m64, xmm1","F2 = REX.W 0F 2A /r","N.S.","V","SSE2","","w,r","Y","64" +"CVTSI2SS xmm1, r/m64","CVTSQ2SS r/m64, xmm1","cvtsi2ssq r/m64, xmm1","F3 = REX.W 0F 2A /r","N.S.","V","SSE","","w,r","Y","64" +"CVTSS2SD xmm1, xmm2/m32","CVTSS2SD xmm2/m32, xmm1","cvtss2sd xmm2/m32, xm= m1","F3 0F 5A /r","V","V","SSE2","","w,r","","" +"CVTSS2SI r32, xmm2/m32","CVTSS2SL xmm2/m32, r32","cvtss2si xmm2/m32, r32"= ,"F3 0F 2D /r","V","V","SSE","operand16,operand32","w,r","Y","32" +"CVTSS2SI r64, xmm2/m32","CVTSS2SL xmm2/m32, r64","cvtss2siq xmm2/m32, r64= ","F3 REX.W 0F 2D /r","N.S.","V","SSE","","w,r","Y","64" +"CVTTPD2PI mm1, xmm2/m128","CVTTPD2PI xmm2/m128, mm1","cvttpd2pi xmm2/m128= , mm1","66 0F 2C /r","V","V","SSE2","","w,r","","" +"CVTTPD2DQ xmm1, xmm2/m128","CVTTPD2PL xmm2/m128, xmm1","cvttpd2dq xmm2/m1= 28, xmm1","66 0F E6 /r","V","V","SSE2","","w,r","","" +"CVTTPS2PI mm1, xmm2/m64","CVTTPS2PI xmm2/m64, mm1","cvttps2pi xmm2/m64, m= m1","0F 2C /r","V","V","SSE","","w,r","","" +"CVTTPS2DQ xmm1, xmm2/m128","CVTTPS2PL xmm2/m128, xmm1","cvttps2dq xmm2/m1= 28, xmm1","F3 0F 5B /r","V","V","SSE2","","w,r","","" +"CVTTSD2SI r32, xmm2/m64","CVTTSD2SL xmm2/m64, r32","cvttsd2si xmm2/m64, r= 32","F2 0F 2C /r","V","V","SSE2","operand16,operand32","w,r","Y","32" +"CVTTSD2SI r64, xmm2/m64","CVTTSD2SL xmm2/m64, r64","cvttsd2siq xmm2/m64, = r64","F2 REX.W 0F 2C /r","N.S.","V","SSE2","","w,r","Y","64" +"CVTTSS2SI r32, xmm2/m32","CVTTSS2SL xmm2/m32, r32","cvttss2si xmm2/m32, r= 32","F3 0F 2C /r","V","V","SSE","operand16,operand32","w,r","Y","32" +"CVTTSS2SI r64, xmm2/m32","CVTTSS2SL xmm2/m32, 
r64","cvttss2siq xmm2/m32, = r64","F3 REX.W 0F 2C /r","N.S.","V","SSE","","w,r","Y","64" +"CWD","CWD","cwtd","99","V","V","","operand16","","","" +"CWDE","CWDE","cwtl","98","V","V","","operand32","","","" +"DAA","DAA","daa","27","V","N.S.","","","","","" +"DAS","DAS","das","2F","V","N.S.","","","","","" +"DEC r/m8","DECB r/m8","decb r/m8","FE /1","V","V","","","rw","Y","8" +"DEC r/m8","DECB r/m8","decb r/m8","REX FE /1","N.E.","V","","pseudo64","r= w","Y","8" +"DEC r/m32","DECL r/m32","decl r/m32","FF /1","V","V","","operand32","rw",= "Y","32" +"DEC r32op","DECL r32op","decl r32op","48+rd","V","N.S.","","operand32","r= w","Y","32" +"DEC r/m64","DECQ r/m64","decq r/m64","REX.W FF /1","N.S.","V","","","rw",= "Y","64" +"DEC r/m16","DECW r/m16","decw r/m16","FF /1","V","V","","operand16","rw",= "Y","16" +"DEC r16op","DECW r16op","decw r16op","48+rw","V","N.S.","","operand16","r= w","Y","16" +"DIV r/m8","DIVB r/m8","divb r/m8","F6 /6","V","V","","","r","Y","8" +"DIV r/m8","DIVB r/m8","divb r/m8","REX F6 /6","N.E.","V","","pseudo64","w= ","Y","8" +"DIV r/m32","DIVL r/m32","divl r/m32","F7 /6","V","V","","operand32","r","= Y","32" +"DIVPD xmm1, xmm2/m128","DIVPD xmm2/m128, xmm1","divpd xmm2/m128, xmm1","6= 6 0F 5E /r","V","V","SSE2","","rw,r","","" +"DIVPS xmm1, xmm2/m128","DIVPS xmm2/m128, xmm1","divps xmm2/m128, xmm1","0= F 5E /r","V","V","SSE","","rw,r","","" +"DIV r/m64","DIVQ r/m64","divq r/m64","REX.W F7 /6","N.S.","V","","","r","= Y","64" +"DIVSD xmm1, xmm2/m64","DIVSD xmm2/m64, xmm1","divsd xmm2/m64, xmm1","F2 0= F 5E /r","V","V","SSE2","","rw,r","","" +"DIVSS xmm1, xmm2/m32","DIVSS xmm2/m32, xmm1","divss xmm2/m32, xmm1","F3 0= F 5E /r","V","V","SSE","","rw,r","","" +"DIV r/m16","DIVW r/m16","divw r/m16","F7 /6","V","V","","operand16","r","= Y","16" +"DPPD xmm1, xmm2/m128, imm8u","DPPD imm8u, xmm2/m128, xmm1","dppd imm8u, x= mm2/m128, xmm1","66 0F 3A 41 /r ib","V","V","SSE4_1","","rw,r,r","","" +"DPPS xmm1, xmm2/m128, imm8u","DPPS imm8u, xmm2/m128, xmm1","dpps imm8u, x= mm2/m128, xmm1","66 0F 3A 40 /r ib","V","V","SSE4_1","","rw,r,r","","" +"EMMS","EMMS","emms","0F 77","V","V","MMX","","","","" +"ENCLS","ENCLS","encls","0F 01 CF","V","V","","","","","" +"ENCLU","ENCLU","enclu","0F 01 D7","V","V","","","","","" +"ENDBR32","ENDBR32","endbr32","F3 0F 1E FB","V","V","CET","","","","" +"ENDBR64","ENDBR64","endbr64","F3 0F 1E FA","V","V","CET","","","Y","" +"ENTER imm16, 0","ENTER 0, imm16","enter imm16, 0","C8 iw 00","V","V","","= pseudo","r,r","","" +"ENTER imm16, 1","ENTER 1, imm16","enter imm16, 1","C8 iw 01","V","V","","= pseudo","r,r","","" +"ENTER imm16, imm8b","ENTERW/ENTERL/ENTERQ imm8b, imm16","enterw/enterl/en= terq imm16, imm8b","C8 iw ib","V","V","","","r,r","","" +"EXTRACTPS r/m32, xmm1, imm8u:2","EXTRACTPS imm8u:2, xmm1, r/m32","extract= ps imm8u:2, xmm1, r/m32","66 0F 3A 17 /r ib","V","V","SSE4_1","","w,r,r",""= ,"" +"EXTRQ xmm1, imm8u, imm8u","EXTRQ imm8u, imm8u, xmm1","extrq imm8u, imm8u,= xmm1","66 0F 78 /0 ib ib","V","V","SSE4a","amd,modrm_regonly","w,r,r","","" +"EXTRQ xmm1, xmm2","EXTRQ xmm2, xmm1","extrq xmm2, xmm1","66 0F 79 /r","V"= ,"V","SSE4a","amd,modrm_regonly","w,r","","" +"F2XM1","F2XM1","f2xm1","D9 F0","V","V","","","","","" +"FABS","FABS","fabs","D9 E1","V","V","","","","","" +"FADD ST(i), ST(0)","FADDD ST(0), ST(i)","fadd ST(0), ST(i)","DC C0+i","V"= ,"V","","","rw,r","Y","" +"FADD ST(0), ST(i)","FADDD ST(i), ST(0)","fadd ST(i), ST(0)","D8 C0+i","V"= ,"V","","","rw,r","Y","" +"FADD ST(0), m32fp","FADDD m32fp, ST(0)","fadds m32fp, ST(0)","D8 /0","V",= 
"V","","","rw,r","Y","32" +"FADD ST(0), m64fp","FADDD m64fp, ST(0)","faddl m64fp, ST(0)","DC /0","V",= "V","","","rw,r","Y","64" +"FADDP","FADDDP","faddp","DE C1","V","V","","pseudo","","","" +"FADDP ST(i), ST(0)","FADDDP ST(0), ST(i)","faddp ST(0), ST(i)","DE C0+i",= "V","V","","","rw,r","","" +"FBLD ST(0), m80dec","FBLD m80dec, ST(0)","fbld m80dec, ST(0)","DF /4","V"= ,"V","","","w,r","","" +"FBSTP m80dec, ST(0)","FBSTP ST(0), m80dec","fbstp ST(0), m80dec","DF /6",= "V","V","","","w,r","","" +"FCHS","FCHS","fchs","D9 E0","V","V","","","","","" +"FCLEX","FCLEX","fclex","9B DB E2","V","V","","pseudo","","","" +"FCMOVB ST(0), ST(i)","FCMOVB ST(i), ST(0)","fcmovb ST(i), ST(0)","DA C0+i= ","V","V","","P6","rw,r","","" +"FCMOVBE ST(0), ST(i)","FCMOVBE ST(i), ST(0)","fcmovbe ST(i), ST(0)","DA D= 0+i","V","V","","P6","rw,r","","" +"FCMOVE ST(0), ST(i)","FCMOVE ST(i), ST(0)","fcmove ST(i), ST(0)","DA C8+i= ","V","V","","P6","rw,r","","" +"FCMOVNB ST(0), ST(i)","FCMOVNB ST(i), ST(0)","fcmovnb ST(i), ST(0)","DB C= 0+i","V","V","","P6","rw,r","","" +"FCMOVNBE ST(0), ST(i)","FCMOVNBE ST(i), ST(0)","fcmovnbe ST(i), ST(0)","D= B D0+i","V","V","","P6","rw,r","","" +"FCMOVNE ST(0), ST(i)","FCMOVNE ST(i), ST(0)","fcmovne ST(i), ST(0)","DB C= 8+i","V","V","","P6","rw,r","","" +"FCMOVNU ST(0), ST(i)","FCMOVNU ST(i), ST(0)","fcmovnu ST(i), ST(0)","DB D= 8+i","V","V","","P6","rw,r","","" +"FCMOVU ST(0), ST(i)","FCMOVU ST(i), ST(0)","fcmovu ST(i), ST(0)","DA D8+i= ","V","V","","P6","rw,r","","" +"FCOM","FCOMD","fcom","D8 D1","V","V","","pseudo","","Y","" +"FCOM ST(0), ST(i)","FCOMD ST(i), ST(0)","fcom ST(i), ST(0)","D8 D0+i","V"= ,"V","","","r,r","Y","" +"FCOM ST(0), ST(i)","FCOMD ST(i), ST(0)","fcom ST(i), ST(0)","DC D0+i","V"= ,"V","","","r,r","Y","" +"FCOM ST(0), m32fp","FCOMD m32fp, ST(0)","fcoms m32fp, ST(0)","D8 /2","V",= "V","","","r,r","Y","32" +"FCOM ST(0), m64fp","FCOMD m64fp, ST(0)","fcoml m64fp, ST(0)","DC /2","V",= "V","","","r,r","Y","64" +"FCOMP ST(0), m32fp","FCOMFP m32fp, ST(0)","fcomps m32fp, ST(0)","D8 /3","= V","V","","","r,r","Y","32" +"FCOMI ST(0), ST(i)","FCOMI ST(i), ST(0)","fcomi ST(i), ST(0)","DB F0+i","= V","V","PPRO","P6","r,r","","" +"FCOMIP ST(0), ST(i)","FCOMIP ST(i), ST(0)","fcomip ST(i), ST(0)","DF F0+i= ","V","V","PPRO","P6","r,r","","" +"FCOMP","FCOMP","fcomp","D8 D9","V","V","","pseudo","","Y","" +"FCOMP ST(0), ST(i)","FCOMP ST(i), ST(0)","fcomp ST(i), ST(0)","D8 D8+i","= V","V","","","r,r","Y","" +"FCOMP ST(0), ST(i)","FCOMP ST(i), ST(0)","fcomp ST(i), ST(0)","DC D8+i","= V","V","","","r,r","Y","" +"FCOMP ST(0), ST(i)","FCOMP ST(i), ST(0)","fcomp ST(i), ST(0)","DE D0+i","= V","V","","","r,r","Y","" +"FCOMP ST(0), m64fp","FCOMPL m64fp, ST(0)","fcompl m64fp, ST(0)","DC /3","= V","V","","","r,r","Y","64" +"FCOMPP","FCOMPP","fcompp","DE D9","V","V","","","","","" +"FCOS","FCOS","fcos","D9 FF","V","V","","","","","" +"FDECSTP","FDECSTP","fdecstp","D9 F6","V","V","","","","","" +"FDISI8087_NOP","FDISI8087_NOP","fdisi8087_nop","DB E1","V","V","","","","= ","" +"FDIVR ST(i), ST(0)","FDIVD ST(0), ST(i)","fdiv ST(0), ST(i)","DC F0+i","V= ","V","","","rw,r","Y","" +"FDIV ST(i), ST(0)","FDIVD ST(0), ST(i)","fdivr ST(0), ST(i)","DC F8+i","V= ","V","","","rw,r","Y","" +"FDIV ST(0), ST(i)","FDIVD ST(i), ST(0)","fdiv ST(i), ST(0)","D8 F0+i","V"= ,"V","","","rw,r","Y","" +"FDIV ST(0), m32fp","FDIVD m32fp, ST(0)","fdivs m32fp, ST(0)","D8 /6","V",= "V","","","rw,r","Y","32" +"FDIV ST(0), m64fp","FDIVD m64fp, ST(0)","fdivl m64fp, ST(0)","DC /6","V",= "V","","","rw,r","Y","64" +"FDIVR ST(0), 
m32fp","FDIVFR m32fp, ST(0)","fdivrs m32fp, ST(0)","D8 /7","= V","V","","","rw,r","Y","32" +"FDIVP","FDIVP","fdivp","DE F9","V","V","","pseudo","","","" +"FDIVRP ST(i), ST(0)","FDIVP ST(0), ST(i)","fdivp ST(0), ST(i)","DE F0+i",= "V","V","","","rw,r","","" +"FDIVR ST(0), ST(i)","FDIVR ST(i), ST(0)","fdivr ST(i), ST(0)","D8 F8+i","= V","V","","","rw,r","Y","" +"FDIVR ST(0), m64fp","FDIVRL m64fp, ST(0)","fdivrl m64fp, ST(0)","DC /7","= V","V","","","rw,r","Y","64" +"FDIVRP","FDIVRP","fdivrp","DE F1","V","V","","pseudo","","","" +"FDIVP ST(i), ST(0)","FDIVRP ST(0), ST(i)","fdivrp ST(0), ST(i)","DE F8+i"= ,"V","V","","","rw,r","","" +"FEMMS","FEMMS","femms","0F 0E","V","V","3DNOW","amd","","","" +"FENI8087_NOP","FENI8087_NOP","feni8087_nop","DB E0","V","V","","","","","" +"FFREE ST(i)","FFREE ST(i)","ffree ST(i)","DD C0+i","V","V","","","r","","" +"FFREEP ST(i)","FFREEP ST(i)","ffreep ST(i)","DF C0+i","V","V","","","r","= ","" +"FIADD ST(0), m16int","FIADD m16int, ST(0)","fiadd m16int, ST(0)","DE /0",= "V","V","","","rw,r","Y","" +"FIADD ST(0), m32int","FIADDL m32int, ST(0)","fiaddl m32int, ST(0)","DA /0= ","V","V","","","rw,r","Y","32" +"FICOM ST(0), m16int","FICOM m16int, ST(0)","ficom m16int, ST(0)","DE /2",= "V","V","","","r,r","Y","" +"FICOM ST(0), m32int","FICOML m32int, ST(0)","ficoml m32int, ST(0)","DA /2= ","V","V","","","r,r","Y","32" +"FICOMP ST(0), m16int","FICOMP m16int, ST(0)","ficomp m16int, ST(0)","DE /= 3","V","V","","","r,r","Y","" +"FICOMP ST(0), m32int","FICOMPL m32int, ST(0)","ficompl m32int, ST(0)","DA= /3","V","V","","","r,r","Y","32" +"FIDIV ST(0), m16int","FIDIV m16int, ST(0)","fidiv m16int, ST(0)","DE /6",= "V","V","","","rw,r","Y","" +"FIDIV ST(0), m32int","FIDIVL m32int, ST(0)","fidivl m32int, ST(0)","DA /6= ","V","V","","","rw,r","Y","32" +"FIDIVR ST(0), m16int","FIDIVR m16int, ST(0)","fidivr m16int, ST(0)","DE /= 7","V","V","","","rw,r","Y","" +"FIDIVR ST(0), m32int","FIDIVRL m32int, ST(0)","fidivrl m32int, ST(0)","DA= /7","V","V","","","rw,r","Y","32" +"FILD ST(0), m16int","FILD m16int, ST(0)","fild m16int, ST(0)","DF /0","V"= ,"V","","","w,r","Y","" +"FILD ST(0), m32int","FILDL m32int, ST(0)","fildl m32int, ST(0)","DB /0","= V","V","","","w,r","Y","32" +"FILD ST(0), m64int","FILDLL m64int, ST(0)","fildll m64int, ST(0)","DF /5"= ,"V","V","","","w,r","Y","64" +"FIMUL ST(0), m16int","FIMUL m16int, ST(0)","fimul m16int, ST(0)","DE /1",= "V","V","","","rw,r","Y","" +"FIMUL ST(0), m32int","FIMULL m32int, ST(0)","fimull m32int, ST(0)","DA /1= ","V","V","","","rw,r","Y","32" +"FINCSTP","FINCSTP","fincstp","D9 F7","V","V","","","","","" +"FINIT","FINIT","finit","9B DB E3","V","V","","pseudo","","","" +"FIST m16int, ST(0)","FIST ST(0), m16int","fist ST(0), m16int","DF /2","V"= ,"V","","","w,r","Y","" +"FIST m32int, ST(0)","FISTL ST(0), m32int","fistl ST(0), m32int","DB /2","= V","V","","","w,r","Y","32" +"FISTP m16int, ST(0)","FISTP ST(0), m16int","fistp ST(0), m16int","DF /3",= "V","V","","","w,r","Y","" +"FISTP m32int, ST(0)","FISTPL ST(0), m32int","fistpl ST(0), m32int","DB /3= ","V","V","","","w,r","Y","32" +"FISTP m64int, ST(0)","FISTPLL ST(0), m64int","fistpll ST(0), m64int","DF = /7","V","V","","","w,r","Y","64" +"FISTTP m16int, ST(0)","FISTTP ST(0), m16int","fisttp ST(0), m16int","DF /= 1","V","V","SSE3","modrm_memonly","w,r","Y","" +"FISTTP m32int, ST(0)","FISTTPL ST(0), m32int","fisttpl ST(0), m32int","DB= /1","V","V","SSE3","modrm_memonly","w,r","Y","32" +"FISTTP m64int, ST(0)","FISTTPLL ST(0), m64int","fisttpll ST(0), m64int","= DD 
/1","V","V","SSE3","modrm_memonly","w,r","Y","64" +"FISUB ST(0), m16int","FISUB m16int, ST(0)","fisub m16int, ST(0)","DE /4",= "V","V","","","rw,r","Y","" +"FISUB ST(0), m32int","FISUBL m32int, ST(0)","fisubl m32int, ST(0)","DA /4= ","V","V","","","rw,r","Y","32" +"FISUBR ST(0), m16int","FISUBR m16int, ST(0)","fisubr m16int, ST(0)","DE /= 5","V","V","","","rw,r","Y","" +"FISUBR ST(0), m32int","FISUBRL m32int, ST(0)","fisubrl m32int, ST(0)","DA= /5","V","V","","","rw,r","Y","32" +"FLD ST(0), ST(i)","FLD ST(i), ST(0)","fld ST(i), ST(0)","D9 C0+i","V","V"= ,"","","w,r","Y","" +"FLD1","FLD1","fld1","D9 E8","V","V","","","","","" +"FLDCW m2byte","FLDCW m2byte","fldcw m2byte","D9 /5","V","V","","","r","",= "" +"FLDENV m28byte","FLDENV m28byte","fldenv m28byte","D9 /4","V","V","","ope= rand32,operand64","r","","" +"FLDENV m14byte","FLDENVS m14byte","fldenv m14byte","D9 /4","V","V","","op= erand16","r","","" +"FLD ST(0), m64fp","FLDL m64fp, ST(0)","fldl m64fp, ST(0)","DD /0","V","V"= ,"","","w,r","Y","64" +"FLDL2E","FLDL2E","fldl2e","D9 EA","V","V","","","","","" +"FLDL2T","FLDL2T","fldl2t","D9 E9","V","V","","","","","" +"FLDLG2","FLDLG2","fldlg2","D9 EC","V","V","","","","","" +"FLDLN2","FLDLN2","fldln2","D9 ED","V","V","","","","","" +"FLDPI","FLDPI","fldpi","D9 EB","V","V","","","","","" +"FLD ST(0), m32fp","FLDS m32fp, ST(0)","flds m32fp, ST(0)","D9 /0","V","V"= ,"","","w,r","Y","32" +"FLD ST(0), m80fp","FLDT m80fp, ST(0)","fldt m80fp, ST(0)","DB /5","V","V"= ,"","","w,r","Y","80" +"FLDZ","FLDZ","fldz","D9 EE","V","V","","","","","" +"FMUL ST(i), ST(0)","FMUL ST(0), ST(i)","fmul ST(0), ST(i)","DC C8+i","V",= "V","","","rw,r","Y","" +"FMUL ST(0), ST(i)","FMUL ST(i), ST(0)","fmul ST(i), ST(0)","D8 C8+i","V",= "V","","","rw,r","Y","" +"FMUL ST(0), m64fp","FMULL m64fp, ST(0)","fmull m64fp, ST(0)","DC /1","V",= "V","","","rw,r","Y","64" +"FMULP","FMULP","fmulp","DE C9","V","V","","pseudo","","","" +"FMULP ST(i), ST(0)","FMULP ST(0), ST(i)","fmulp ST(0), ST(i)","DE C8+i","= V","V","","","rw,r","","" +"FMUL ST(0), m32fp","FMULS m32fp, ST(0)","fmuls m32fp, ST(0)","D8 /1","V",= "V","","","rw,r","Y","32" +"FNCLEX","FNCLEX","fnclex","DB E2","V","V","","","","","" +"FNINIT","FNINIT","fninit","DB E3","V","V","","","","","" +"FNOP","FNOP","fnop","D9 D0","V","V","","","","","" +"FNSAVE m108byte","FNSAVE m108byte","fnsave m108byte","DD /6","V","V","","= operand32,operand64","w","","" +"FNSAVE m94byte","FNSAVES m94byte","fnsave m94byte","DD /6","V","V","","op= erand16","w","","" +"FNSTCW m2byte","FNSTCW m2byte","fnstcw m2byte","D9 /7","V","V","","","w",= "","" +"FNSTENV m28byte","FNSTENV m28byte","fnstenv m28byte","D9 /6","V","V","","= operand32,operand64","w","","" +"FNSTENV m14byte","FNSTENVS m14byte","fnstenv m14byte","D9 /6","V","V","",= "operand16","w","","" +"FNSTSW AX","FNSTSW AX","fnstsw AX","DF E0","V","V","","","w","","" +"FNSTSW m2byte","FNSTSW m2byte","fnstsw m2byte","DD /7","V","V","","","w",= "","" +"FPATAN","FPATAN","fpatan","D9 F3","V","V","","","","","" +"FPREM","FPREM","fprem","D9 F8","V","V","","","","","" +"FPREM1","FPREM1","fprem1","D9 F5","V","V","","","","","" +"FPTAN","FPTAN","fptan","D9 F2","V","V","","","","","" +"FRNDINT","FRNDINT","frndint","D9 FC","V","V","","","","","" +"FRSTOR m108byte","FRSTOR m108byte","frstor m108byte","DD /4","V","V","","= operand32,operand64","r","","" +"FRSTOR m94byte","FRSTORS m94byte","frstor m94byte","DD /4","V","V","","op= erand16","r","","" +"FSAVE m94/108byte","FSAVE m94/108byte","fsave m94/108byte","9B DD /6","V"= ,"V","","pseudo","w","","" 
+"FSCALE","FSCALE","fscale","D9 FD","V","V","","","","","" +"FSETPM287_NOP","FSETPM287_NOP","fsetpm287_nop","DB E4","V","V","","","","= ","" +"FSIN","FSIN","fsin","D9 FE","V","V","","","","","" +"FSINCOS","FSINCOS","fsincos","D9 FB","V","V","","","","","" +"FSQRT","FSQRT","fsqrt","D9 FA","V","V","","","","","" +"FST ST(i), ST(0)","FST ST(0), ST(i)","fst ST(0), ST(i)","DD D0+i","V","V"= ,"","","w,r","Y","" +"FSTCW m2byte","FSTCW m2byte","fstcw m2byte","9B D9 /7","V","V","","pseudo= ","w","","" +"FSTENV m14/28byte","FSTENV m14/28byte","fstenv m14/28byte","9B D9 /6","V"= ,"V","","pseudo","w","","" +"FST m64fp, ST(0)","FSTL ST(0), m64fp","fstl ST(0), m64fp","DD /2","V","V"= ,"","","w,r","Y","64" +"FSTP ST(i), ST(0)","FSTP ST(0), ST(i)","fstp ST(0), ST(i)","DD D8+i","V",= "V","","","w,r","Y","" +"FSTP ST(i), ST(0)","FSTP ST(0), ST(i)","fstp ST(0), ST(i)","DF D0+i","V",= "V","","","w,r","Y","" +"FSTP ST(i), ST(0)","FSTP ST(0), ST(i)","fstp ST(0), ST(i)","DF D8+i","V",= "V","","","w,r","Y","" +"FSTP m64fp, ST(0)","FSTPL ST(0), m64fp","fstpl ST(0), m64fp","DD /3","V",= "V","","","w,r","Y","64" +"FSTPNCE ST(i), ST(0)","FSTPNCE ST(0), ST(i)","fstpnce ST(0), ST(i)","D9 D= 8+i","V","V","","","w,r","","" +"FSTP m32fp, ST(0)","FSTPS ST(0), m32fp","fstps ST(0), m32fp","D9 /3","V",= "V","","","w,r","Y","32" +"FSTP m80fp, ST(0)","FSTPT ST(0), m80fp","fstpt ST(0), m80fp","DB /7","V",= "V","","","w,r","Y","80" +"FST m32fp, ST(0)","FSTS ST(0), m32fp","fsts ST(0), m32fp","D9 /2","V","V"= ,"","","w,r","Y","32" +"FSTSW AX","FSTSW AX","fstsw AX","9B DF E0","V","V","","pseudo","w","","" +"FSTSW m2byte","FSTSW m2byte","fstsw m2byte","9B DD /7","V","V","","pseudo= ","w","","" +"FSUBR ST(i), ST(0)","FSUB ST(0), ST(i)","fsub ST(0), ST(i)","DC E0+i","V"= ,"V","","","rw,r","Y","" +"FSUB ST(0), ST(i)","FSUB ST(i), ST(0)","fsub ST(i), ST(0)","D8 E0+i","V",= "V","","","rw,r","Y","" +"FSUB ST(0), m64fp","FSUBL m64fp, ST(0)","fsubl m64fp, ST(0)","DC /4","V",= "V","","","rw,r","Y","64" +"FSUBP","FSUBP","fsubp","DE E9","V","V","","pseudo","","","" +"FSUBRP ST(i), ST(0)","FSUBP ST(0), ST(i)","fsubp ST(0), ST(i)","DE E0+i",= "V","V","","","rw,r","","" +"FSUB ST(i), ST(0)","FSUBR ST(0), ST(i)","fsubr ST(0), ST(i)","DC E8+i","V= ","V","","","rw,r","Y","" +"FSUBR ST(0), ST(i)","FSUBR ST(i), ST(0)","fsubr ST(i), ST(0)","D8 E8+i","= V","V","","","rw,r","Y","" +"FSUBR ST(0), m64fp","FSUBRL m64fp, ST(0)","fsubrl m64fp, ST(0)","DC /5","= V","V","","","rw,r","Y","64" +"FSUBRP","FSUBRP","fsubrp","DE E1","V","V","","pseudo","","","" +"FSUBP ST(i), ST(0)","FSUBRP ST(0), ST(i)","fsubrp ST(0), ST(i)","DE E8+i"= ,"V","V","","","rw,r","","" +"FSUBR ST(0), m32fp","FSUBRS m32fp, ST(0)","fsubrs m32fp, ST(0)","D8 /5","= V","V","","","rw,r","Y","32" +"FSUB ST(0), m32fp","FSUBS m32fp, ST(0)","fsubs m32fp, ST(0)","D8 /4","V",= "V","","","rw,r","Y","32" +"FTST","FTST","ftst","D9 E4","V","V","","","","","" +"FUCOM","FUCOM","fucom","DD E1","V","V","","pseudo","","","" +"FUCOM ST(0), ST(i)","FUCOM ST(i), ST(0)","fucom ST(i), ST(0)","DD E0+i","= V","V","","","r,r","","" +"FUCOMI ST(0), ST(i)","FUCOMI ST(i), ST(0)","fucomi ST(i), ST(0)","DB E8+i= ","V","V","PPRO","P6","r,r","","" +"FUCOMIP ST(0), ST(i)","FUCOMIP ST(i), ST(0)","fucomip ST(i), ST(0)","DF E= 8+i","V","V","PPRO","P6","r,r","","" +"FUCOMP","FUCOMP","fucomp","DD E9","V","V","","pseudo","","","" +"FUCOMP ST(0), ST(i)","FUCOMP ST(i), ST(0)","fucomp ST(i), ST(0)","DD E8+i= ","V","V","","","r,r","","" +"FUCOMPP","FUCOMPP","fucompp","DA E9","V","V","","","","","" 
+"FWAIT","FWAIT","fwait","9B","V","V","","","","","" +"FXAM","FXAM","fxam","D9 E5","V","V","","","","","" +"FXCH","FXCH","fxch","D9 C9","V","V","","pseudo","","","" +"FXCH ST(0), ST(i)","FXCH ST(i), ST(0)","fxch ST(i), ST(0)","D9 C8+i","V",= "V","","","rw,rw","","" +"FXCH_ALIAS1 ST(0), ST(i)","FXCH_ALIAS1 ST(i), ST(0)","fxch_alias1 ST(i), = ST(0)","DD C8+i","V","V","","","rw,rw","","" +"FXCH_ALIAS2 ST(0), ST(i)","FXCH_ALIAS2 ST(i), ST(0)","fxch_alias2 ST(i), = ST(0)","DF C8+i","V","V","","","rw,rw","","" +"FXRSTOR m512byte","FXRSTOR m512byte","fxrstor m512byte","0F AE /1","V","V= ","","modrm_memonly,operand16,operand32","r","","" +"FXRSTOR64 m512byte","FXRSTOR64 m512byte","fxrstor64 m512byte","REX.W 0F A= E /1","N.S.","V","","modrm_memonly","r","","" +"FXSAVE m512byte","FXSAVE m512byte","fxsave m512byte","0F AE /0","V","V","= ","modrm_memonly,operand16,operand32","w","","" +"FXSAVE64 m512byte","FXSAVE64 m512byte","fxsave64 m512byte","REX.W 0F AE /= 0","N.S.","V","","modrm_memonly","w","","" +"FXTRACT","FXTRACT","fxtract","D9 F4","V","V","","","","","" +"FYL2X","FYL2X","fyl2x","D9 F1","V","V","","","","","" +"FYL2XP1","FYL2XP1","fyl2xp1","D9 F9","V","V","","","","","" +"GETSEC","GETSEC","getsec","0F 37","V","V","SMX","","","","" +"GF2P8AFFINEINVQB xmm1, xmm2/m128, imm8u","GF2P8AFFINEINVQB imm8u, xmm2/m1= 28, xmm1","gf2p8affineinvqb imm8u, xmm2/m128, xmm1","66 0F 3A CF /r ib","V"= ,"V","GFNI","","rw,r,r","","" +"GF2P8AFFINEQB xmm1, xmm2/m128, imm8u","GF2P8AFFINEQB imm8u, xmm2/m128, xm= m1","gf2p8affineqb imm8u, xmm2/m128, xmm1","66 0F 3A CE /r ib","V","V","GFN= I","","rw,r,r","","" +"GF2P8MULB xmm1, xmm2/m128","GF2P8MULB xmm2/m128, xmm1","gf2p8mulb xmm2/m1= 28, xmm1","66 0F 38 CF /r","V","V","GFNI","","rw,r","","" +"HADDPD xmm1, xmm2/m128","HADDPD xmm2/m128, xmm1","haddpd xmm2/m128, xmm1"= ,"66 0F 7C /r","V","V","SSE3","","rw,r","","" +"HADDPS xmm1, xmm2/m128","HADDPS xmm2/m128, xmm1","haddps xmm2/m128, xmm1"= ,"F2 0F 7C /r","V","V","SSE3","","rw,r","","" +"HLT","HLT","hlt","F4","V","V","","","","","" +"HSUBPD xmm1, xmm2/m128","HSUBPD xmm2/m128, xmm1","hsubpd xmm2/m128, xmm1"= ,"66 0F 7D /r","V","V","SSE3","","rw,r","","" +"HSUBPS xmm1, xmm2/m128","HSUBPS xmm2/m128, xmm1","hsubps xmm2/m128, xmm1"= ,"F2 0F 7D /r","V","V","SSE3","","rw,r","","" +"ICEBP","ICEBP","icebp","F1","V","V","","","","","" +"IDIV r/m8","IDIVB r/m8","idivb r/m8","F6 /7","V","V","","","r","Y","8" +"IDIV r/m8","IDIVB r/m8","idivb r/m8","REX F6 /7","N.E.","V","","pseudo64"= ,"r","Y","8" +"IDIV r/m32","IDIVL r/m32","idivl r/m32","F7 /7","V","V","","operand32","r= ","Y","32" +"IDIV r/m64","IDIVQ r/m64","idivq r/m64","REX.W F7 /7","N.S.","V","","","r= ","Y","64" +"IDIV r/m16","IDIVW r/m16","idivw r/m16","F7 /7","V","V","","operand16","r= ","Y","16" +"IMUL r32, r/m32, imm32","IMUL3 imm32, r/m32, r32","imull imm32, r/m32, r3= 2","69 /r id","V","V","","operand32","w,r,r","Y","32" +"IMUL r64, r/m64, imm32","IMUL3 imm32, r/m64, r64","imulq imm32, r/m64, r6= 4","REX.W 69 /r id","N.S.","V","","","w,r,r","Y","64" +"IMUL r16, r/m16, imm8","IMUL3 imm8, r/m16, r16","imulw imm8, r/m16, r16",= "6B /r ib","V","V","","operand16","w,r,r","Y","16" +"IMUL r32, r/m32, imm8","IMUL3 imm8, r/m32, r32","imull imm8, r/m32, r32",= "6B /r ib","V","V","","operand32","w,r,r","Y","32" +"IMUL r64, r/m64, imm8","IMUL3 imm8, r/m64, r64","imulq imm8, r/m64, r64",= "REX.W 6B /r ib","N.S.","V","","","w,r,r","Y","64" +"IMUL r/m8","IMULB r/m8","imulb r/m8","F6 /5","V","V","","","r","Y","8" +"IMUL r/m32","IMULL r/m32","imull r/m32","F7 
/5","V","V","","operand32","r= ","Y","32" +"IMUL r32, r/m32","IMULL r/m32, r32","imull r/m32, r32","0F AF /r","V","V"= ,"","operand32","rw,r","Y","32" +"IMUL r/m64","IMULQ r/m64","imulq r/m64","REX.W F7 /5","N.S.","V","","","r= ","Y","64" +"IMUL r64, r/m64","IMULQ r/m64, r64","imulq r/m64, r64","REX.W 0F AF /r","= N.S.","V","","","rw,r","Y","64" +"IMUL r16, r/m16, imm16","IMULW imm16, r/m16, r16","imulw imm16, r/m16, r1= 6","69 /r iw","V","V","","operand16","w,r,r","Y","16" +"IMUL r/m16","IMULW r/m16","imulw r/m16","F7 /5","V","V","","operand16","r= ","Y","16" +"IMUL r16, r/m16","IMULW r/m16, r16","imulw r/m16, r16","0F AF /r","V","V"= ,"","operand16","rw,r","Y","16" +"IN AL, DX","INB DX, AL","inb DX, AL","EC","V","V","","","w,r","Y","8" +"IN AL, imm8u","INB imm8u, AL","inb imm8u, AL","E4 ib","V","V","","","w,r"= ,"Y","8" +"INC r/m8","INCB r/m8","incb r/m8","FE /0","V","V","","","rw","Y","8" +"INC r/m8","INCB r/m8","incb r/m8","REX FE /0","N.E.","V","","pseudo64","r= w","Y","8" +"INC r/m32","INCL r/m32","incl r/m32","FF /0","V","V","","operand32","rw",= "Y","32" +"INC r32op","INCL r32op","incl r32op","40+rd","V","N.S.","","operand32","r= w","Y","32" +"INC r/m64","INCQ r/m64","incq r/m64","REX.W FF /0","N.S.","V","","","rw",= "Y","64" +"INCSSPD rmr32","INCSSPD rmr32","incsspd rmr32","F3 0F AE /5","V","V","CET= ","modrm_regonly,operand16,operand32","r","","" +"INCSSPQ rmr64","INCSSPQ rmr64","incsspq rmr64","F3 REX.W 0F AE /5","N.S."= ,"V","CET","modrm_regonly","r","","" +"INC r/m16","INCW r/m16","incw r/m16","FF /0","V","V","","operand16","rw",= "Y","16" +"INC r16op","INCW r16op","incw r16op","40+rw","V","N.S.","","operand16","r= w","Y","16" +"IN EAX, DX","INL DX, EAX","inl DX, EAX","ED","V","V","","operand32,operan= d64","w,r","Y","32" +"IN EAX, imm8u","INL imm8u, EAX","inl imm8u, EAX","E5 ib","V","V","","oper= and32,operand64","w,r","Y","32" +"INSB","INSB","insb","6C","V","V","","","","","" +"INSERTPS xmm1, xmm2/m32, imm8u","INSERTPS imm8u, xmm2/m32, xmm1","insertp= s imm8u, xmm2/m32, xmm1","66 0F 3A 21 /r ib","V","V","SSE4_1","","rw,r,r","= ","" +"INSERTQ xmm1, xmm2, imm8u, imm8u","INSERTQ imm8u, imm8u, xmm2, xmm1","ins= ertq imm8u, imm8u, xmm2, xmm1","F2 0F 78 /r ib ib","V","V","SSE4a","amd,mod= rm_regonly","w,r,r,r","","" +"INSERTQ xmm1, xmm2","INSERTQ xmm2, xmm1","insertq xmm2, xmm1","F2 0F 79 /= r","V","V","SSE4a","amd,modrm_regonly","w,r","","" +"INSD","INSL","insl","6D","V","V","","operand32,operand64","","","" +"INSW","INSW","insw","6D","V","V","","operand16","","","" +"INT 3","INT 3","int 3","CC","V","V","","","r","","" +"INT imm8u","INT imm8u","int imm8u","CD ib","V","V","","","r","","" +"INTO","INTO","into","CE","V","N.S.","","","","","" +"INVD","INVD","invd","0F 08","V","V","486","","","","" +"INVEPT r32, m128","INVEPT m128, r32","invept m128, r32","66 0F 38 80 /r",= "V","N.S.","VTX","modrm_memonly","r,r","","" +"INVEPT r64, m128","INVEPT m128, r64","invept m128, r64","66 0F 38 80 /r",= "N.S.","V","VTX","default64,modrm_memonly","r,r","","" +"INVLPG m","INVLPG m","invlpg m","0F 01 /7","V","V","486","modrm_memonly",= "r","","" +"INVLPGA EAX, ECX","INVLPGAL ECX, EAX","invlpgal ECX, EAX","0F 01 DF","V",= "V","SVM","amd,modrm_regonly,operand32","r,r","Y","32" +"INVLPGA RAX, ECX","INVLPGAQ ECX, RAX","invlpgaq ECX, RAX","REX.W 0F 01 DF= ","N.S.","V","SVM","amd,modrm_regonly","r,r","Y","64" +"INVLPGA AX, ECX","INVLPGAW ECX, AX","invlpgaw ECX, AX","0F 01 DF","V","V"= ,"SVM","amd,modrm_regonly,operand16","r,r","Y","16" +"INVPCID r32, m128","INVPCID m128, r32","invpcid m128, r32","66 0F 
38 82 /= r","V","N.S.","INVPCID","modrm_memonly","r,r","","" +"INVPCID r64, m128","INVPCID m128, r64","invpcid m128, r64","66 0F 38 82 /= r","N.S.","V","INVPCID","default64,modrm_memonly","r,r","","" +"INVVPID r32, m128","INVVPID m128, r32","invvpid m128, r32","66 0F 38 81 /= r","V","N.S.","VTX","modrm_memonly","r,r","","" +"INVVPID r64, m128","INVVPID m128, r64","invvpid m128, r64","66 0F 38 81 /= r","N.S.","V","VTX","default64,modrm_memonly","r,r","","" +"IN AX, DX","INW DX, AX","inw DX, AX","ED","V","V","","operand16","w,r","Y= ","16" +"IN AX, imm8u","INW imm8u, AX","inw imm8u, AX","E5 ib","V","V","","operand= 16","w,r","Y","16" +"IRETD","IRETL","iretl","CF","V","V","","operand32","","","" +"IRETQ","IRETQ","iretq","REX.W CF","N.S.","V","","","","","" +"IRET","IRETW","iretw","CF","V","V","","operand16","","","" +"JA rel16","JA rel16","ja rel16","0F 87 cw","V","N.S.","","operand16","r",= "","" +"JA rel32","JA rel32","ja rel32","0F 87 cd","V","N.S.","","operand32","r",= "","" +"JA rel32","JA rel32","ja rel32","0F 87 cd","N.S.","V","","default64","r",= "","" +"JA rel8","JA rel8","ja rel8","77 cb","N.S.","V","","default64","r","","" +"JA rel8","JA rel8","ja rel8","77 cb","V","N.S.","","","r","","" +"JAE rel16","JAE rel16","jae rel16","0F 83 cw","V","N.S.","","operand16","= r","","" +"JAE rel32","JAE rel32","jae rel32","0F 83 cd","N.S.","V","","default64","= r","","" +"JAE rel32","JAE rel32","jae rel32","0F 83 cd","V","N.S.","","operand32","= r","","" +"JAE rel8","JAE rel8","jae rel8","73 cb","V","N.S.","","","r","","" +"JAE rel8","JAE rel8","jae rel8","73 cb","N.S.","V","","default64","r","",= "" +"JB rel16","JB rel16","jb rel16","0F 82 cw","V","N.S.","","operand16","r",= "","" +"JB rel32","JB rel32","jb rel32","0F 82 cd","V","N.S.","","operand32","r",= "","" +"JB rel32","JB rel32","jb rel32","0F 82 cd","N.S.","V","","default64","r",= "","" +"JB rel8","JB rel8","jb rel8","72 cb","N.S.","V","","default64","r","","" +"JB rel8","JB rel8","jb rel8","72 cb","V","N.S.","","","r","","" +"JBE rel16","JBE rel16","jbe rel16","0F 86 cw","V","N.S.","","operand16","= r","","" +"JBE rel32","JBE rel32","jbe rel32","0F 86 cd","V","N.S.","","operand32","= r","","" +"JBE rel32","JBE rel32","jbe rel32","0F 86 cd","N.S.","V","","default64","= r","","" +"JBE rel8","JBE rel8","jbe rel8","76 cb","V","N.S.","","","r","","" +"JBE rel8","JBE rel8","jbe rel8","76 cb","N.S.","V","","default64","r","",= "" +"JC rel16","JC rel16","jc rel16","0F 82 cw","V","N.S.","","pseudo","r","",= "" +"JC rel32","JC rel32","jc rel32","0F 82 cd","V","V","","pseudo","r","","" +"JC rel8","JC rel8","jc rel8","72 cb","V","V","","pseudo","r","","" +"JCXZ rel8","JCXZ rel8","jcxz rel8","E3 cb","V","N.S.","","address16","r",= "","" +"JE rel16","JE rel16","je rel16","0F 84 cw","V","N.S.","","operand16","r",= "","" +"JE rel32","JE rel32","je rel32","0F 84 cd","V","N.S.","","operand32","r",= "","" +"JE rel32","JE rel32","je rel32","0F 84 cd","N.S.","V","","default64","r",= "","" +"JE rel8","JE rel8","je rel8","74 cb","N.S.","V","","default64","r","","" +"JE rel8","JE rel8","je rel8","74 cb","V","N.S.","","","r","","" +"JECXZ rel8","JECXZ rel8","jecxz rel8","E3 cb","V","V","","address32","r",= "","" +"JG rel16","JG rel16","jg rel16","0F 8F cw","V","N.S.","","operand16","r",= "","" +"JG rel32","JG rel32","jg rel32","0F 8F cd","N.S.","V","","default64","r",= "","" +"JG rel32","JG rel32","jg rel32","0F 8F cd","V","N.S.","","operand32","r",= "","" +"JG rel8","JG rel8","jg rel8","7F cb","V","N.S.","","","r","","" +"JG rel8","JG rel8","jg rel8","7F 
cb","N.S.","V","","default64","r","","" +"JGE rel16","JGE rel16","jge rel16","0F 8D cw","V","N.S.","","operand16","= r","","" +"JGE rel32","JGE rel32","jge rel32","0F 8D cd","V","N.S.","","operand32","= r","","" +"JGE rel32","JGE rel32","jge rel32","0F 8D cd","N.S.","V","","default64","= r","","" +"JGE rel8","JGE rel8","jge rel8","7D cb","N.S.","V","","default64","r","",= "" +"JGE rel8","JGE rel8","jge rel8","7D cb","V","N.S.","","","r","","" +"JL rel16","JL rel16","jl rel16","0F 8C cw","V","N.S.","","operand16","r",= "","" +"JL rel32","JL rel32","jl rel32","0F 8C cd","V","N.S.","","operand32","r",= "","" +"JL rel32","JL rel32","jl rel32","0F 8C cd","N.S.","V","","default64","r",= "","" +"JL rel8","JL rel8","jl rel8","7C cb","V","N.S.","","","r","","" +"JL rel8","JL rel8","jl rel8","7C cb","N.S.","V","","default64","r","","" +"JLE rel16","JLE rel16","jle rel16","0F 8E cw","V","N.S.","","operand16","= r","","" +"JLE rel32","JLE rel32","jle rel32","0F 8E cd","V","N.S.","","operand32","= r","","" +"JLE rel32","JLE rel32","jle rel32","0F 8E cd","N.S.","V","","default64","= r","","" +"JLE rel8","JLE rel8","jle rel8","7E cb","N.S.","V","","default64","r","",= "" +"JLE rel8","JLE rel8","jle rel8","7E cb","V","N.S.","","","r","","" +"JMP rel16","JMP rel16","jmp rel16","E9 cw","V","N.S.","","operand16","r",= "Y","" +"JMP rel32","JMP rel32","jmp rel32","E9 cd","N.S.","V","","default64","r",= "Y","" +"JMP rel32","JMP rel32","jmp rel32","E9 cd","V","N.S.","","operand32","r",= "Y","" +"JMP rel8","JMP rel8","jmp rel8","EB cb","N.S.","V","","default64","r","Y"= ,"" +"JMP rel8","JMP rel8","jmp rel8","EB cb","V","N.S.","","","r","Y","" +"JMP r/m32","JMPL* r/m32","jmpl* r/m32","FF /4","V","N.S.","","operand32",= "r","Y","32" +"JMP r/m64","JMPQ* r/m64","jmpq* r/m64","FF /4","N.S.","V","","","r","Y","= 64" +"JMP r/m16","JMPW* r/m16","jmpw* r/m16","FF /4","V","N.S.","","operand16",= "r","Y","16" +"JNA rel16","JNA rel16","jna rel16","0F 86 cw","V","N.S.","","pseudo","r",= "","" +"JNA rel32","JNA rel32","jna rel32","0F 86 cd","V","V","","pseudo","r","",= "" +"JNA rel8","JNA rel8","jna rel8","76 cb","V","V","","pseudo","r","","" +"JNAE rel16","JNAE rel16","jnae rel16","0F 82 cw","V","N.S.","","pseudo","= r","","" +"JNAE rel32","JNAE rel32","jnae rel32","0F 82 cd","V","V","","pseudo","r",= "","" +"JNAE rel8","JNAE rel8","jnae rel8","72 cb","V","V","","pseudo","r","","" +"JNB rel16","JNB rel16","jnb rel16","0F 83 cw","V","N.S.","","pseudo","r",= "","" +"JNB rel32","JNB rel32","jnb rel32","0F 83 cd","V","V","","pseudo","r","",= "" +"JNB rel8","JNB rel8","jnb rel8","73 cb","V","V","","pseudo","r","","" +"JNBE rel16","JNBE rel16","jnbe rel16","0F 87 cw","V","N.S.","","pseudo","= r","","" +"JNBE rel32","JNBE rel32","jnbe rel32","0F 87 cd","V","V","","pseudo","r",= "","" +"JNBE rel8","JNBE rel8","jnbe rel8","77 cb","V","V","","pseudo","r","","" +"JNC rel16","JNC rel16","jnc rel16","0F 83 cw","V","N.S.","","pseudo","r",= "","" +"JNC rel32","JNC rel32","jnc rel32","0F 83 cd","V","V","","pseudo","r","",= "" +"JNC rel8","JNC rel8","jnc rel8","73 cb","V","V","","pseudo","r","","" +"JNE rel16","JNE rel16","jne rel16","0F 85 cw","V","N.S.","","operand16","= r","","" +"JNE rel32","JNE rel32","jne rel32","0F 85 cd","N.S.","V","","default64","= r","","" +"JNE rel32","JNE rel32","jne rel32","0F 85 cd","V","N.S.","","operand32","= r","","" +"JNE rel8","JNE rel8","jne rel8","75 cb","V","N.S.","","","r","","" +"JNE rel8","JNE rel8","jne rel8","75 cb","N.S.","V","","default64","r","",= "" +"JNG rel16","JNG rel16","jng rel16","0F 8E 
cw","V","N.S.","","pseudo","r",= "","" +"JNG rel32","JNG rel32","jng rel32","0F 8E cd","V","V","","pseudo","r","",= "" +"JNG rel8","JNG rel8","jng rel8","7E cb","V","V","","pseudo","r","","" +"JNGE rel16","JNGE rel16","jnge rel16","0F 8C cw","V","N.S.","","pseudo","= r","","" +"JNGE rel32","JNGE rel32","jnge rel32","0F 8C cd","V","V","","pseudo","r",= "","" +"JNGE rel8","JNGE rel8","jnge rel8","7C cb","V","V","","pseudo","r","","" +"JNL rel16","JNL rel16","jnl rel16","0F 8D cw","V","N.S.","","pseudo","r",= "","" +"JNL rel32","JNL rel32","jnl rel32","0F 8D cd","V","V","","pseudo","r","",= "" +"JNL rel8","JNL rel8","jnl rel8","7D cb","V","V","","pseudo","r","","" +"JNLE rel16","JNLE rel16","jnle rel16","0F 8F cw","V","N.S.","","pseudo","= r","","" +"JNLE rel32","JNLE rel32","jnle rel32","0F 8F cd","V","V","","pseudo","r",= "","" +"JNLE rel8","JNLE rel8","jnle rel8","7F cb","V","V","","pseudo","r","","" +"JNO rel16","JNO rel16","jno rel16","0F 81 cw","V","N.S.","","operand16","= r","","" +"JNO rel32","JNO rel32","jno rel32","0F 81 cd","V","N.S.","","operand32","= r","","" +"JNO rel32","JNO rel32","jno rel32","0F 81 cd","N.S.","V","","default64","= r","","" +"JNO rel8","JNO rel8","jno rel8","71 cb","V","N.S.","","","r","","" +"JNO rel8","JNO rel8","jno rel8","71 cb","N.S.","V","","default64","r","",= "" +"JNP rel16","JNP rel16","jnp rel16","0F 8B cw","V","N.S.","","operand16","= r","","" +"JNP rel32","JNP rel32","jnp rel32","0F 8B cd","V","N.S.","","operand32","= r","","" +"JNP rel32","JNP rel32","jnp rel32","0F 8B cd","N.S.","V","","default64","= r","","" +"JNP rel8","JNP rel8","jnp rel8","7B cb","N.S.","V","","default64","r","",= "" +"JNP rel8","JNP rel8","jnp rel8","7B cb","V","N.S.","","","r","","" +"JNS rel16","JNS rel16","jns rel16","0F 89 cw","V","N.S.","","operand16","= r","","" +"JNS rel32","JNS rel32","jns rel32","0F 89 cd","N.S.","V","","default64","= r","","" +"JNS rel32","JNS rel32","jns rel32","0F 89 cd","V","N.S.","","operand32","= r","","" +"JNS rel8","JNS rel8","jns rel8","79 cb","V","N.S.","","","r","","" +"JNS rel8","JNS rel8","jns rel8","79 cb","N.S.","V","","default64","r","",= "" +"JNZ rel16","JNZ rel16","jnz rel16","0F 85 cw","V","N.S.","","pseudo","r",= "","" +"JNZ rel32","JNZ rel32","jnz rel32","0F 85 cd","V","V","","pseudo","r","",= "" +"JNZ rel8","JNZ rel8","jnz rel8","75 cb","V","V","","pseudo","r","","" +"JO rel16","JO rel16","jo rel16","0F 80 cw","V","N.S.","","operand16","r",= "","" +"JO rel32","JO rel32","jo rel32","0F 80 cd","V","N.S.","","operand32","r",= "","" +"JO rel32","JO rel32","jo rel32","0F 80 cd","N.S.","V","","default64","r",= "","" +"JO rel8","JO rel8","jo rel8","70 cb","V","N.S.","","","r","","" +"JO rel8","JO rel8","jo rel8","70 cb","N.S.","V","","default64","r","","" +"JP rel16","JP rel16","jp rel16","0F 8A cw","V","N.S.","","operand16","r",= "","" +"JP rel32","JP rel32","jp rel32","0F 8A cd","N.S.","V","","default64","r",= "","" +"JP rel32","JP rel32","jp rel32","0F 8A cd","V","N.S.","","operand32","r",= "","" +"JP rel8","JP rel8","jp rel8","7A cb","N.S.","V","","default64","r","","" +"JP rel8","JP rel8","jp rel8","7A cb","V","N.S.","","","r","","" +"JPE rel16","JPE rel16","jpe rel16","0F 8A cw","V","N.S.","","pseudo","r",= "","" +"JPE rel32","JPE rel32","jpe rel32","0F 8A cd","V","V","","pseudo","r","",= "" +"JPE rel8","JPE rel8","jpe rel8","7A cb","V","V","","pseudo","r","","" +"JPO rel16","JPO rel16","jpo rel16","0F 8B cw","V","N.S.","","pseudo","r",= "","" +"JPO rel32","JPO rel32","jpo rel32","0F 8B cd","V","V","","pseudo","r","",= "" +"JPO 
rel8","JPO rel8","jpo rel8","7B cb","V","V","","pseudo","r","","" +"JRCXZ rel8","JRCXZ rel8","jrcxz rel8","E3 cb","N.S.","V","","address64","= r","","" +"JS rel16","JS rel16","js rel16","0F 88 cw","V","N.S.","","operand16","r",= "","" +"JS rel32","JS rel32","js rel32","0F 88 cd","V","N.S.","","operand32","r",= "","" +"JS rel32","JS rel32","js rel32","0F 88 cd","N.S.","V","","default64","r",= "","" +"JS rel8","JS rel8","js rel8","78 cb","V","N.S.","","","r","","" +"JS rel8","JS rel8","js rel8","78 cb","N.S.","V","","default64","r","","" +"JZ rel16","JZ rel16","jz rel16","0F 84 cw","V","N.S.","","operand16,pseud= o","r","","" +"JZ rel32","JZ rel32","jz rel32","0F 84 cd","V","V","","operand32,pseudo",= "r","","" +"JZ rel8","JZ rel8","jz rel8","74 cb","V","V","","pseudo","r","","" +"KADDB k1, kV, k2","KADDB k2, kV, k1","kaddb k2, kV, k1","VEX.NDS.256.66.0= F.W0 4A /r","V","V","AVX512DQ","modrm_regonly","w,r,r","","" +"KADDD k1, kV, k2","KADDD k2, kV, k1","kaddd k2, kV, k1","VEX.NDS.256.66.0= F.W1 4A /r","V","V","AVX512BW","modrm_regonly","w,r,r","","" +"KADDQ k1, kV, k2","KADDQ k2, kV, k1","kaddq k2, kV, k1","VEX.NDS.256.0F.W= 1 4A /r","V","V","AVX512BW","modrm_regonly","w,r,r","","" +"KADDW k1, kV, k2","KADDW k2, kV, k1","kaddw k2, kV, k1","VEX.NDS.256.0F.W= 0 4A /r","V","V","AVX512DQ","modrm_regonly","w,r,r","","" +"KANDB k1, kV, k2","KANDB k2, kV, k1","kandb k2, kV, k1","VEX.NDS.256.66.0= F.W0 41 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","","" +"KANDD k1, kV, k2","KANDD k2, kV, k1","kandd k2, kV, k1","VEX.NDS.256.66.0= F.W1 41 /r","V","V","AVX512BW","modrm_regonly","w,r,r","","" +"KANDNB k1, kV, k2","KANDNB k2, kV, k1","kandnb k2, kV, k1","VEX.NDS.256.6= 6.0F.W0 42 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","","" +"KANDND k1, kV, k2","KANDND k2, kV, k1","kandnd k2, kV, k1","VEX.NDS.256.6= 6.0F.W1 42 /r","V","V","AVX512BW","modrm_regonly","w,r,r","","" +"KANDNQ k1, kV, k2","KANDNQ k2, kV, k1","kandnq k2, kV, k1","VEX.NDS.256.0= F.W1 42 /r","V","V","AVX512BW","modrm_regonly","w,r,r","","" +"KANDNW k1, kV, k2","KANDNW k2, kV, k1","kandnw k2, kV, k1","VEX.NDS.256.0= F.W0 42 /r","V","V","AVX512F","modrm_regonly","w,r,r","","" +"KANDQ k1, kV, k2","KANDQ k2, kV, k1","kandq k2, kV, k1","VEX.NDS.256.0F.W= 1 41 /r","V","V","AVX512BW","modrm_regonly","w,r,r","","" +"KANDW k1, kV, k2","KANDW k2, kV, k1","kandw k2, kV, k1","VEX.NDS.256.0F.W= 0 41 /r","V","V","AVX512F","modrm_regonly","w,r,r","","" +"KMOVB m8, k1","KMOVB k1, m8","kmovb k1, m8","VEX.128.66.0F.W0 91 /r","V",= "V","AVX512DQ","modrm_memonly","w,r","","" +"KMOVB r32, k2","KMOVB k2, r32","kmovb k2, r32","VEX.128.66.0F.W0 93 /r","= V","V","AVX512DQ","modrm_regonly","w,r","","" +"KMOVB k1, k2/m8","KMOVB k2/m8, k1","kmovb k2/m8, k1","VEX.128.66.0F.W0 90= /r","V","V","AVX512DQ","","w,r","","" +"KMOVB k1, rmr32","KMOVB rmr32, k1","kmovb rmr32, k1","VEX.128.66.0F.W0 92= /r","V","V","AVX512DQ","modrm_regonly","w,r","","" +"KMOVD m32, k1","KMOVD k1, m32","kmovd k1, m32","VEX.128.66.0F.W1 91 /r","= V","V","AVX512BW","modrm_memonly","w,r","","" +"KMOVD r32, k2","KMOVD k2, r32","kmovd k2, r32","VEX.128.F2.0F.W0 93 /r","= V","V","AVX512BW","modrm_regonly","w,r","","" +"KMOVD k1, k2/m32","KMOVD k2/m32, k1","kmovd k2/m32, k1","VEX.128.66.0F.W1= 90 /r","V","V","AVX512BW","","w,r","","" +"KMOVD k1, rmr32","KMOVD rmr32, k1","kmovd rmr32, k1","VEX.128.F2.0F.W0 92= /r","V","V","AVX512BW","modrm_regonly","w,r","","" +"KMOVQ m64, k1","KMOVQ k1, m64","kmovq k1, m64","VEX.128.0F.W1 91 /r","V",= "V","AVX512BW","modrm_memonly","w,r","","" +"KMOVQ r64, 
k2","KMOVQ k2, r64","kmovq k2, r64","VEX.128.F2.0F.W1 93 /r","= N.S.","V","AVX512BW","modrm_regonly","w,r","","" +"KMOVQ k1, k2/m64","KMOVQ k2/m64, k1","kmovq k2/m64, k1","VEX.128.0F.W1 90= /r","V","V","AVX512BW","","w,r","","" +"KMOVQ k1, rmr64","KMOVQ rmr64, k1","kmovq rmr64, k1","VEX.128.F2.0F.W1 92= /r","N.S.","V","AVX512BW","modrm_regonly","w,r","","" +"KMOVW m16, k1","KMOVW k1, m16","kmovw k1, m16","VEX.128.0F.W0 91 /r","V",= "V","AVX512F","modrm_memonly","w,r","","" +"KMOVW r32, k2","KMOVW k2, r32","kmovw k2, r32","VEX.128.0F.W0 93 /r","V",= "V","AVX512F","modrm_regonly","w,r","","" +"KMOVW k1, k2/m16","KMOVW k2/m16, k1","kmovw k2/m16, k1","VEX.128.0F.W0 90= /r","V","V","AVX512F","","w,r","","" +"KMOVW k1, rmr32","KMOVW rmr32, k1","kmovw rmr32, k1","VEX.128.0F.W0 92 /r= ","V","V","AVX512F","modrm_regonly","w,r","","" +"KNOTB k1, k2","KNOTB k2, k1","knotb k2, k1","VEX.128.66.0F.W0 44 /r","V",= "V","AVX512DQ","modrm_regonly","w,r","","" +"KNOTD k1, k2","KNOTD k2, k1","knotd k2, k1","VEX.128.66.0F.W1 44 /r","V",= "V","AVX512BW","modrm_regonly","w,r","","" +"KNOTQ k1, k2","KNOTQ k2, k1","knotq k2, k1","VEX.128.0F.W1 44 /r","V","V"= ,"AVX512BW","modrm_regonly","w,r","","" +"KNOTW k1, k2","KNOTW k2, k1","knotw k2, k1","VEX.128.0F.W0 44 /r","V","V"= ,"AVX512F","modrm_regonly","w,r","","" +"KORB k1, kV, k2","KORB k2, kV, k1","korb k2, kV, k1","VEX.NDS.256.66.0F.W= 0 45 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","","" +"KORD k1, kV, k2","KORD k2, kV, k1","kord k2, kV, k1","VEX.NDS.256.66.0F.W= 1 45 /r","V","V","AVX512BW","modrm_regonly","w,r,r","","" +"KORQ k1, kV, k2","KORQ k2, kV, k1","korq k2, kV, k1","VEX.NDS.256.0F.W1 4= 5 /r","V","V","AVX512BW","modrm_regonly","w,r,r","","" +"KORTESTB k1, k2","KORTESTB k2, k1","kortestb k2, k1","VEX.128.66.0F.W0 98= /r","V","V","AVX512DQ","modrm_regonly","r,r","","" +"KORTESTD k1, k2","KORTESTD k2, k1","kortestd k2, k1","VEX.128.66.0F.W1 98= /r","V","V","AVX512BW","modrm_regonly","r,r","","" +"KORTESTQ k1, k2","KORTESTQ k2, k1","kortestq k2, k1","VEX.128.0F.W1 98 /r= ","V","V","AVX512BW","modrm_regonly","r,r","","" +"KORTESTW k1, k2","KORTESTW k2, k1","kortestw k2, k1","VEX.128.0F.W0 98 /r= ","V","V","AVX512F","modrm_regonly","r,r","","" +"KORW k1, kV, k2","KORW k2, kV, k1","korw k2, kV, k1","VEX.NDS.256.0F.W0 4= 5 /r","V","V","AVX512F","modrm_regonly","w,r,r","","" +"KSHIFTLB k1, k2, imm8u","KSHIFTLB imm8u, k2, k1","kshiftlb imm8u, k2, k1"= ,"VEX.128.66.0F3A.W0 32 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r","= ","" +"KSHIFTLD k1, k2, imm8u","KSHIFTLD imm8u, k2, k1","kshiftld imm8u, k2, k1"= ,"VEX.128.66.0F3A.W0 33 /r ib","V","V","AVX512BW","modrm_regonly","w,r,r","= ","" +"KSHIFTLQ k1, k2, imm8u","KSHIFTLQ imm8u, k2, k1","kshiftlq imm8u, k2, k1"= ,"VEX.128.66.0F3A.W1 33 /r ib","V","V","AVX512BW","modrm_regonly","w,r,r","= ","" +"KSHIFTLW k1, k2, imm8u","KSHIFTLW imm8u, k2, k1","kshiftlw imm8u, k2, k1"= ,"VEX.128.66.0F3A.W1 32 /r ib","V","V","AVX512F","modrm_regonly","w,r,r",""= ,"" +"KSHIFTRB k1, k2, imm8u","KSHIFTRB imm8u, k2, k1","kshiftrb imm8u, k2, k1"= ,"VEX.128.66.0F3A.W0 30 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r","= ","" +"KSHIFTRD k1, k2, imm8u","KSHIFTRD imm8u, k2, k1","kshiftrd imm8u, k2, k1"= ,"VEX.128.66.0F3A.W0 31 /r ib","V","V","AVX512BW","modrm_regonly","w,r,r","= ","" +"KSHIFTRQ k1, k2, imm8u","KSHIFTRQ imm8u, k2, k1","kshiftrq imm8u, k2, k1"= ,"VEX.128.66.0F3A.W1 31 /r ib","V","V","AVX512BW","modrm_regonly","w,r,r","= ","" +"KSHIFTRW k1, k2, imm8u","KSHIFTRW imm8u, k2, k1","kshiftrw imm8u, k2, k1"= 
,"VEX.128.66.0F3A.W1 30 /r ib","V","V","AVX512F","modrm_regonly","w,r,r",""= ,"" +"KTESTB k1, k2","KTESTB k2, k1","ktestb k2, k1","VEX.128.66.0F.W0 99 /r","= V","V","AVX512DQ","modrm_regonly","r,r","","" +"KTESTD k1, k2","KTESTD k2, k1","ktestd k2, k1","VEX.128.66.0F.W1 99 /r","= V","V","AVX512BW","modrm_regonly","r,r","","" +"KTESTQ k1, k2","KTESTQ k2, k1","ktestq k2, k1","VEX.128.0F.W1 99 /r","V",= "V","AVX512BW","modrm_regonly","r,r","","" +"KTESTW k1, k2","KTESTW k2, k1","ktestw k2, k1","VEX.128.0F.W0 99 /r","V",= "V","AVX512DQ","modrm_regonly","r,r","","" +"KUNPCKBW k1, kV, k2","KUNPCKBW k2, kV, k1","kunpckbw k2, kV, k1","VEX.NDS= .256.66.0F.W0 4B /r","V","V","AVX512F","modrm_regonly","w,r,r","","" +"KUNPCKDQ k1, kV, k2","KUNPCKDQ k2, kV, k1","kunpckdq k2, kV, k1","VEX.NDS= .256.0F.W1 4B /r","V","V","AVX512BW","modrm_regonly","w,r,r","","" +"KUNPCKWD k1, kV, k2","KUNPCKWD k2, kV, k1","kunpckwd k2, kV, k1","VEX.NDS= .256.0F.W0 4B /r","V","V","AVX512BW","modrm_regonly","w,r,r","","" +"KXNORB k1, kV, k2","KXNORB k2, kV, k1","kxnorb k2, kV, k1","VEX.NDS.256.6= 6.0F.W0 46 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","","" +"KXNORD k1, kV, k2","KXNORD k2, kV, k1","kxnord k2, kV, k1","VEX.NDS.256.6= 6.0F.W1 46 /r","V","V","AVX512BW","modrm_regonly","w,r,r","","" +"KXNORQ k1, kV, k2","KXNORQ k2, kV, k1","kxnorq k2, kV, k1","VEX.NDS.256.0= F.W1 46 /r","V","V","AVX512BW","modrm_regonly","w,r,r","","" +"KXNORW k1, kV, k2","KXNORW k2, kV, k1","kxnorw k2, kV, k1","VEX.NDS.256.0= F.W0 46 /r","V","V","AVX512F","modrm_regonly","w,r,r","","" +"KXORB k1, kV, k2","KXORB k2, kV, k1","kxorb k2, kV, k1","VEX.NDS.256.66.0= F.W0 47 /r","V","V","AVX512DQ","modrm_regonly","w,r,r","","" +"KXORD k1, kV, k2","KXORD k2, kV, k1","kxord k2, kV, k1","VEX.NDS.256.66.0= F.W1 47 /r","V","V","AVX512BW","modrm_regonly","w,r,r","","" +"KXORQ k1, kV, k2","KXORQ k2, kV, k1","kxorq k2, kV, k1","VEX.NDS.256.0F.W= 1 47 /r","V","V","AVX512BW","modrm_regonly","w,r,r","","" +"KXORW k1, kV, k2","KXORW k2, kV, k1","kxorw k2, kV, k1","VEX.NDS.256.0F.W= 0 47 /r","V","V","AVX512F","modrm_regonly","w,r,r","","" +"LAHF","LAHF","lahf","9F","V","V","LAHFSAHF","","","","" +"LAR r32, r32/m16","LARL r32/m16, r32","larl r32/m16, r32","0F 02 /r","V",= "V","","operand32","rw,r","Y","32" +"LAR r64, r64/m16","LARQ r64/m16, r64","larq r64/m16, r64","REX.W 0F 02 /r= ","N.S.","V","","","rw,r","Y","64" +"LAR r16, r/m16","LARW r/m16, r16","larw r/m16, r16","0F 02 /r","V","V",""= ,"operand16","rw,r","Y","16" +"CALL_FAR ptr16:32","LCALLL ptr16:32","lcalll ptr16:32","9A cd iw","V","N.= S.","","operand32","r","Y","" +"CALL_FAR m16:32","LCALLL* m16:32","lcalll* m16:32","FF /3","V","V","","mo= drm_memonly,operand32","r","Y","" +"CALL_FAR m16:64","LCALLQ* m16:64","lcallq* m16:64","REX.W FF /3","N.S.","= V","","modrm_memonly","r","Y","" +"CALL_FAR ptr16:16","LCALLW ptr16:16","lcallw ptr16:16","9A cw iw","V","N.= S.","","operand16","r","Y","" +"CALL_FAR m16:16","LCALLW* m16:16","lcallw* m16:16","FF /3","V","V","","mo= drm_memonly,operand16","r","Y","" +"LDDQU xmm1, m128","LDDQU m128, xmm1","lddqu m128, xmm1","F2 0F F0 /r","V"= ,"V","SSE3","modrm_memonly","w,r","","" +"LDMXCSR m32","LDMXCSR m32","ldmxcsr m32","0F AE /2","V","V","SSE","modrm_= memonly","r","","" +"LDS r32, m16:32","LDSL m16:32, r32","ldsl m16:32, r32","C5 /r","V","N.S."= ,"","modrm_memonly,operand32","w,r","Y","32" +"LDS r16, m16:16","LDSW m16:16, r16","ldsw m16:16, r16","C5 /r","V","N.S."= ,"","modrm_memonly,operand16","w,r","Y","16" +"LEA r32, m","LEAL m, r32","leal m, r32","8D 
/r","V","V","","modrm_memonly= ,operand32","w,r","Y","32" +"LEA r64, m","LEAQ m, r64","leaq m, r64","REX.W 8D /r","N.S.","V","","modr= m_memonly","w,r","Y","64" +"LEAVE","LEAVEW/LEAVEL/LEAVEQ","leavew/leavel/leaveq","C9","N.S.","V","","= default64","","Y","" +"LEAVE","LEAVEW/LEAVEL/LEAVEQ","leavew/leavel/leaveq","C9","V","N.S.","","= operand32","","Y","" +"LEAVE","LEAVEW/LEAVEL/LEAVEQ","leavew/leavel/leaveq","C9","V","V","","ope= rand16","","Y","" +"LEA r16, m","LEAW m, r16","leaw m, r16","8D /r","V","V","","modrm_memonly= ,operand16","w,r","Y","16" +"LES r32, m16:32","LESL m16:32, r32","lesl m16:32, r32","C4 /r","V","N.S."= ,"","modrm_memonly,operand32","w,r","Y","32" +"LES r16, m16:16","LESW m16:16, r16","lesw m16:16, r16","C4 /r","V","N.S."= ,"","modrm_memonly,operand16","w,r","Y","16" +"LFENCE","LFENCE","lfence","0F AE /5","V","V","SSE2","","","","" +"LFS r32, m16:32","LFSL m16:32, r32","lfsl m16:32, r32","0F B4 /r","V","V"= ,"","modrm_memonly,operand32","w,r","Y","32" +"LFS r64, m16:64","LFSQ m16:64, r64","lfsq m16:64, r64","REX.W 0F B4 /r","= N.S.","V","","modrm_memonly","w,r","Y","64" +"LFS r16, m16:16","LFSW m16:16, r16","lfsw m16:16, r16","0F B4 /r","V","V"= ,"","modrm_memonly,operand16","w,r","Y","16" +"LGDT m16&64","LGDT m16&64","lgdt m16&64","0F 01 /2","N.S.","V","","defaul= t64,modrm_memonly","r","","" +"LGDT m16&32","LGDTW/LGDTL m16&32","lgdtw/lgdtl m16&32","0F 01 /2","V","N.= S.","","modrm_memonly","r","","" +"LGS r32, m16:32","LGSL m16:32, r32","lgsl m16:32, r32","0F B5 /r","V","V"= ,"","modrm_memonly,operand32","w,r","Y","32" +"LGS r64, m16:64","LGSQ m16:64, r64","lgsq m16:64, r64","REX.W 0F B5 /r","= N.S.","V","","modrm_memonly","w,r","Y","64" +"LGS r16, m16:16","LGSW m16:16, r16","lgsw m16:16, r16","0F B5 /r","V","V"= ,"","modrm_memonly,operand16","w,r","Y","16" +"LIDT m16&64","LIDT m16&64","lidt m16&64","0F 01 /3","N.S.","V","","defaul= t64,modrm_memonly","r","","" +"LIDT m16&32","LIDTW/LIDTL m16&32","lidtw/lidtl m16&32","0F 01 /3","V","N.= S.","","modrm_memonly","r","","" +"JMP_FAR ptr16:32","LJMPL ptr16:32","ljmpl ptr16:32","EA cd iw","V","N.S."= ,"","operand32","r","Y","" +"JMP_FAR m16:32","LJMPL* m16:32","ljmpl* m16:32","FF /5","V","V","","modrm= _memonly,operand32","r","Y","" +"JMP_FAR m16:64","LJMPQ* m16:64","ljmpq* m16:64","REX.W FF /5","N.S.","V",= "","modrm_memonly","r","Y","" +"JMP_FAR ptr16:16","LJMPW ptr16:16","ljmpw ptr16:16","EA cw iw","V","N.S."= ,"","operand16","r","Y","" +"JMP_FAR m16:16","LJMPW* m16:16","ljmpw* m16:16","FF /5","V","V","","modrm= _memonly,operand16","r","Y","" +"LLDT r/m16","LLDT r/m16","lldt r/m16","0F 00 /2","V","V","","","r","","" +"LLWPCB rmr32","LLWPCBL rmr32","llwpcbl rmr32","XOP.128.09.W0 12 /0","V","= V","XOP","amd,modrm_regonly,operand16,operand32","w","Y","32" +"LLWPCB rmr64","LLWPCBQ rmr64","llwpcbq rmr64","XOP.128.09.W0 12 /0","N.S.= ","V","XOP","amd,modrm_regonly,operand64","w","Y","64" +"LMSW r/m16","LMSW r/m16","lmsw r/m16","0F 01 /6","V","V","","","r","","" +"LOCK","LOCK","lock","F0","V","V","","pseudo","","","" +"LODSB","LODSB","lodsb","AC","V","V","","","","","" +"LODSD","LODSL","lodsl","AD","V","V","","operand32","","","" +"LODSQ","LODSQ","lodsq","REX.W AD","N.S.","V","","","","","" +"LODSW","LODSW","lodsw","AD","V","V","","operand16","","","" +"LOOP rel8","LOOP rel8","loop rel8","E2 cb","V","V","","","r","","" +"LOOPE rel8","LOOPEQ rel8","loope rel8","E1 cb","V","V","","","r","","" +"LOOPNE rel8","LOOPNE rel8","loopne rel8","E0 cb","V","V","","","r","","" +"LSL r32, r32/m16","LSLL r32/m16, r32","lsll r32/m16, 
r32","0F 03 /r","V",= "V","","operand32","rw,r","Y","32" +"LSL r64, r32/m16","LSLQ r32/m16, r64","lslq r32/m16, r64","REX.W 0F 03 /r= ","N.S.","V","","","rw,r","Y","64" +"LSL r16, r/m16","LSLW r/m16, r16","lslw r/m16, r16","0F 03 /r","V","V",""= ,"operand16","rw,r","Y","16" +"LSS r32, m16:32","LSSL m16:32, r32","lssl m16:32, r32","0F B2 /r","V","V"= ,"","modrm_memonly,operand32","w,r","Y","32" +"LSS r64, m16:64","LSSQ m16:64, r64","lssq m16:64, r64","REX.W 0F B2 /r","= N.S.","V","","modrm_memonly","w,r","Y","64" +"LSS r16, m16:16","LSSW m16:16, r16","lssw m16:16, r16","0F B2 /r","V","V"= ,"","modrm_memonly,operand16","w,r","Y","16" +"LTR r/m16","LTR r/m16","ltr r/m16","0F 00 /3","V","V","","","r","","" +"LWPINS r32V, r/m32, imm32u","LWPINS imm32u, r/m32, r32V","lwpins imm32u, = r/m32, r32V","XOP.NDD.128.0A.W0 12 /0","V","V","XOP","amd,operand16,operand= 32","w,r,r","","" +"LWPINS r64V, r64/m32, imm32u","LWPINS imm32u, r64/m32, r64V","lwpins imm3= 2u, r64/m32, r64V","XOP.NDD.128.0A.W0 12 /0","N.S.","V","XOP","amd,operand6= 4","w,r,r","","" +"LWPVAL r32V, r/m32, imm32u","LWPVAL imm32u, r/m32, r32V","lwpval imm32u, = r/m32, r32V","XOP.NDD.128.0A.W0 12 /1","V","V","XOP","amd,operand16,operand= 32","w,r,r","","" +"LWPVAL r64V, r64/m32, imm32u","LWPVAL imm32u, r64/m32, r64V","lwpval imm3= 2u, r64/m32, r64V","XOP.NDD.128.0A.W0 12 /1","N.S.","V","XOP","amd,operand6= 4","w,r,r","","" +"LZCNT r32, r/m32","LZCNTL r/m32, r32","lzcntl r/m32, r32","F3 0F BD /r","= V","V","LZCNT","operand32","w,r","Y","32" +"LZCNT r32, r/m32","LZCNTL r/m32, r32","lzcntl r/m32, r32","F3 0F BD /r","= V","V","AMD","amd,operand32","w,r","Y","32" +"LZCNT r64, r/m64","LZCNTQ r/m64, r64","lzcntq r/m64, r64","F3 REX.W 0F BD= /r","N.S.","V","AMD","amd","w,r","Y","64" +"LZCNT r64, r/m64","LZCNTQ r/m64, r64","lzcntq r/m64, r64","F3 REX.W 0F BD= /r","N.S.","V","LZCNT","","w,r","Y","64" +"LZCNT r16, r/m16","LZCNTW r/m16, r16","lzcntw r/m16, r16","F3 0F BD /r","= V","V","AMD","amd,operand16","w,r","Y","16" +"LZCNT r16, r/m16","LZCNTW r/m16, r16","lzcntw r/m16, r16","F3 0F BD /r","= V","V","LZCNT","operand16","w,r","Y","16" +"MASKMOVDQU xmm1, xmm2","MASKMOVOU xmm2, xmm1","maskmovdqu xmm2, xmm1","66= 0F F7 /r","V","V","SSE2","modrm_regonly","r,r","","" +"MASKMOVQ mm1, mm2","MASKMOVQ mm2, mm1","maskmovq mm2, mm1","0F F7 /r","V"= ,"V","MMX","modrm_regonly","r,r","","" +"MAXPD xmm1, xmm2/m128","MAXPD xmm2/m128, xmm1","maxpd xmm2/m128, xmm1","6= 6 0F 5F /r","V","V","SSE2","","rw,r","","" +"MAXPS xmm1, xmm2/m128","MAXPS xmm2/m128, xmm1","maxps xmm2/m128, xmm1","0= F 5F /r","V","V","SSE","","rw,r","","" +"MAXSD xmm1, xmm2/m64","MAXSD xmm2/m64, xmm1","maxsd xmm2/m64, xmm1","F2 0= F 5F /r","V","V","SSE2","","rw,r","","" +"MAXSS xmm1, xmm2/m32","MAXSS xmm2/m32, xmm1","maxss xmm2/m32, xmm1","F3 0= F 5F /r","V","V","SSE","","rw,r","","" +"MFENCE","MFENCE","mfence","0F AE /6","V","V","SSE2","","","","" +"MINPD xmm1, xmm2/m128","MINPD xmm2/m128, xmm1","minpd xmm2/m128, xmm1","6= 6 0F 5D /r","V","V","SSE2","","rw,r","","" +"MINPS xmm1, xmm2/m128","MINPS xmm2/m128, xmm1","minps xmm2/m128, xmm1","0= F 5D /r","V","V","SSE","","rw,r","","" +"MINSD xmm1, xmm2/m64","MINSD xmm2/m64, xmm1","minsd xmm2/m64, xmm1","F2 0= F 5D /r","V","V","SSE2","","rw,r","","" +"MINSS xmm1, xmm2/m32","MINSS xmm2/m32, xmm1","minss xmm2/m32, xmm1","F3 0= F 5D /r","V","V","SSE","","rw,r","","" +"MONITOR","MONITOR","monitor","0F 01 C8","V","V","MONITOR","","","","" +"MOVAPD xmm2/m128, xmm1","MOVAPD xmm1, xmm2/m128","movapd xmm1, xmm2/m128"= ,"66 0F 29 
/r","V","V","SSE2","","w,r","","" +"MOVAPD xmm1, xmm2/m128","MOVAPD xmm2/m128, xmm1","movapd xmm2/m128, xmm1"= ,"66 0F 28 /r","V","V","SSE2","","w,r","","" +"MOVAPS xmm2/m128, xmm1","MOVAPS xmm1, xmm2/m128","movaps xmm1, xmm2/m128"= ,"0F 29 /r","V","V","SSE","","w,r","","" +"MOVAPS xmm1, xmm2/m128","MOVAPS xmm2/m128, xmm1","movaps xmm2/m128, xmm1"= ,"0F 28 /r","V","V","SSE","","w,r","","" +"MOV r/m8, imm8u","MOVB imm8u, r/m8","movb imm8u, r/m8","C6 /0 ib","V","V"= ,"","","w,r","Y","8" +"MOV r/m8, imm8u","MOVB imm8u, r/m8","movb imm8u, r/m8","REX C6 /0 ib","N.= E.","V","","pseudo64","w,r","Y","8" +"MOV r8op, imm8u","MOVB imm8u, r8op","movb imm8u, r8op","B0+rb ib","V","V"= ,"","","w,r","Y","8" +"MOV r8op, imm8u","MOVB imm8u, r8op","movb imm8u, r8op","REX B0+rb ib","N.= E.","V","","pseudo64","w,r","Y","8" +"MOV r8, r/m8","MOVB r/m8, r8","movb r/m8, r8","8A /r","V","V","","","w,r"= ,"Y","8" +"MOV r8, r/m8","MOVB r/m8, r8","movb r/m8, r8","REX 8A /r","N.E.","V","","= pseudo64","w,r","Y","8" +"MOV r/m8, r8","MOVB r8, r/m8","movb r8, r/m8","88 /r","V","V","","","w,r"= ,"Y","8" +"MOV r/m8, r8","MOVB r8, r/m8","movb r8, r/m8","REX 88 /r","N.E.","V","","= pseudo64","w,r","Y","8" +"MOV moffs8, AL","MOVB/MOVB/MOVABSB AL, moffs8","movb/movb/movabsb AL, mof= fs8","A2 cm","V","V","","","w,r","Y","8" +"MOV moffs8, AL","MOVB/MOVB/MOVABSB AL, moffs8","movb/movb/movabsb AL, mof= fs8","REX.W A2 cm","N.E.","V","","pseudo","w,r","Y","8" +"MOV AL, moffs8","MOVB/MOVB/MOVABSB moffs8, AL","movb/movb/movabsb moffs8,= AL","A0 cm","V","V","","","w,r","Y","8" +"MOV AL, moffs8","MOVB/MOVB/MOVABSB moffs8, AL","movb/movb/movabsb moffs8,= AL","REX.W A0 cm","N.E.","V","","pseudo","w,r","Y","8" +"MOVBE r32, m32","MOVBELL m32, r32","movbell m32, r32","0F 38 F0 /r","V","= V","MOVBE","modrm_memonly,operand32","w,r","Y","32" +"MOVBE m32, r32","MOVBELL r32, m32","movbell r32, m32","0F 38 F1 /r","V","= V","MOVBE","modrm_memonly,operand32","w,r","Y","32" +"MOVBE r64, m64","MOVBEQQ m64, r64","movbeqq m64, r64","REX.W 0F 38 F0 /r"= ,"N.S.","V","MOVBE","modrm_memonly","w,r","Y","64" +"MOVBE m64, r64","MOVBEQQ r64, m64","movbeqq r64, m64","REX.W 0F 38 F1 /r"= ,"N.S.","V","MOVBE","modrm_memonly","w,r","Y","64" +"MOVBE r16, m16","MOVBEWW m16, r16","movbeww m16, r16","0F 38 F0 /r","V","= V","MOVBE","modrm_memonly,operand16","w,r","Y","16" +"MOVBE m16, r16","MOVBEWW r16, m16","movbeww r16, m16","0F 38 F1 /r","V","= V","MOVBE","modrm_memonly,operand16","w,r","Y","16" +"MOVSX r32, r/m8","MOVBLSX r/m8, r32","movsbl r/m8, r32","0F BE /r","V","V= ","","operand32","w,r","Y","32" +"MOVZX r32, r/m8","MOVBLZX r/m8, r32","movzbl r/m8, r32","0F B6 /r","V","V= ","","operand32","w,r","Y","32" +"MOVSX r64, r/m8","MOVBQSX r/m8, r64","movsbq r/m8, r64","REX.W 0F BE /r",= "N.S.","V","","","w,r","Y","64" +"MOVZX r64, r/m8","MOVBQZX r/m8, r64","movzbq r/m8, r64","REX.W 0F B6 /r",= "N.S.","V","","","w,r","Y","64" +"MOVSX r16, r/m8","MOVBWSX r/m8, r16","movsbw r/m8, r16","0F BE /r","V","V= ","","operand16","w,r","Y","16" +"MOVZX r16, r/m8","MOVBWZX r/m8, r16","movzbw r/m8, r16","0F B6 /r","V","V= ","","operand16","w,r","Y","16" +"MOVD r/m32, mm1","MOVD mm1, r/m32","movd mm1, r/m32","0F 7E /r","V","V","= MMX","operand16,operand32","w,r","","" +"MOVD mm1, r/m32","MOVD r/m32, mm1","movd r/m32, mm1","0F 6E /r","V","V","= MMX","operand16,operand32","w,r","","" +"MOVD xmm1, r/m32","MOVD r/m32, xmm1","movd r/m32, xmm1","66 0F 6E /r","V"= ,"V","SSE2","operand16,operand32","w,r","","" +"MOVD r/m32, xmm1","MOVD xmm1, r/m32","movd xmm1, r/m32","66 0F 7E /r","V"= 
,"V","SSE2","operand16,operand32","w,r","","" +"MOVDDUP xmm1, xmm2/m64","MOVDDUP xmm2/m64, xmm1","movddup xmm2/m64, xmm1"= ,"F2 0F 12 /r","V","V","SSE3","","w,r","","" +"MOVHLPS xmm1, xmm2","MOVHLPS xmm2, xmm1","movhlps xmm2, xmm1","0F 12 /r",= "V","V","SSE","modrm_regonly","w,r","","" +"MOVHPD xmm1, m64","MOVHPD m64, xmm1","movhpd m64, xmm1","66 0F 16 /r","V"= ,"V","SSE2","modrm_memonly","w,r","","" +"MOVHPD m64, xmm1","MOVHPD xmm1, m64","movhpd xmm1, m64","66 0F 17 /r","V"= ,"V","SSE2","modrm_memonly","w,r","","" +"MOVHPS xmm1, m64","MOVHPS m64, xmm1","movhps m64, xmm1","0F 16 /r","V","V= ","SSE","modrm_memonly","w,r","","" +"MOVHPS m64, xmm1","MOVHPS xmm1, m64","movhps xmm1, m64","0F 17 /r","V","V= ","SSE","modrm_memonly","w,r","","" +"MOV rmr32, CR0-CR7","MOVL CR0-CR7, rmr32","movl CR0-CR7, rmr32","0F 20 /r= ","V","N.S.","","","w,r","Y","32" +"MOV rmr32, DR0-DR7","MOVL DR0-DR7, rmr32","movl DR0-DR7, rmr32","0F 21 /r= ","V","N.S.","","","w,r","Y","32" +"MOV moffs32, EAX","MOVL EAX, moffs32","movl EAX, moffs32","A3 cm","V","V"= ,"","operand32","w,r","Y","32" +"MOV r/m32, imm32","MOVL imm32, r/m32","movl imm32, r/m32","C7 /0 id","V",= "V","","operand32","w,r","Y","32" +"MOV r32op, imm32u","MOVL imm32u, r32op","movl imm32u, r32op","B8+rd id","= V","V","","operand32","w,r","Y","32" +"MOV EAX, moffs32","MOVL moffs32, EAX","movl moffs32, EAX","A1 cm","V","V"= ,"","operand32","w,r","Y","32" +"MOV r32, r/m32","MOVL r/m32, r32","movl r/m32, r32","8B /r","V","V","","o= perand32","w,r","Y","32" +"MOV r/m32, r32","MOVL r32, r/m32","movl r32, r/m32","89 /r","V","V","","o= perand32","w,r","Y","32" +"MOV CR0-CR7, rmr32","MOVL rmr32, CR0-CR7","movl rmr32, CR0-CR7","0F 22 /r= ","V","N.S.","","","w,r","Y","32" +"MOV DR0-DR7, rmr32","MOVL rmr32, DR0-DR7","movl rmr32, DR0-DR7","0F 23 /r= ","V","N.S.","","","w,r","Y","32" +"MOVLHPS xmm1, xmm2","MOVLHPS xmm2, xmm1","movlhps xmm2, xmm1","0F 16 /r",= "V","V","SSE","modrm_regonly","w,r","","" +"MOVLPD xmm1, m64","MOVLPD m64, xmm1","movlpd m64, xmm1","66 0F 12 /r","V"= ,"V","SSE2","modrm_memonly","w,r","","" +"MOVLPD m64, xmm1","MOVLPD xmm1, m64","movlpd xmm1, m64","66 0F 13 /r","V"= ,"V","SSE2","modrm_memonly","w,r","","" +"MOVLPS xmm1, m64","MOVLPS m64, xmm1","movlps m64, xmm1","0F 12 /r","V","V= ","SSE","modrm_memonly","w,r","","" +"MOVLPS m64, xmm1","MOVLPS xmm1, m64","movlps xmm1, m64","0F 13 /r","V","V= ","SSE","modrm_memonly","w,r","","" +"MOVSXD r32, r/m32","MOVLQSX r/m32, r32","movsxdl r/m32, r32","63 /r","N.S= .","V","","operand32","w,r","Y","32" +"MOVSXD r64, r/m32","MOVLQSX r/m32, r64","movslq r/m32, r64","REX.W 63 /r"= ,"N.S.","V","","","w,r","Y","64" +"MOVMSKPD r32, xmm2","MOVMSKPD xmm2, r32","movmskpd xmm2, r32","66 0F 50 /= r","V","V","SSE2","modrm_regonly","w,r","","" +"MOVMSKPS r32, xmm2","MOVMSKPS xmm2, r32","movmskps xmm2, r32","0F 50 /r",= "V","V","SSE","modrm_regonly","w,r","","" +"MOVNTDQA xmm1, m128","MOVNTDQA m128, xmm1","movntdqa m128, xmm1","66 0F 3= 8 2A /r","V","V","SSE4_1","modrm_memonly","w,r","","" +"MOVNTI m32, r32","MOVNTIL r32, m32","movntil r32, m32","0F C3 /r","V","V"= ,"SSE2","modrm_memonly,operand16,operand32","w,r","Y","32" +"MOVNTI m64, r64","MOVNTIQ r64, m64","movntiq r64, m64","REX.W 0F C3 /r","= N.S.","V","SSE2","modrm_memonly","w,r","Y","64" +"MOVNTDQ m128, xmm1","MOVNTO xmm1, m128","movntdq xmm1, m128","66 0F E7 /r= ","V","V","SSE2","modrm_memonly","w,r","","" +"MOVNTPD m128, xmm1","MOVNTPD xmm1, m128","movntpd xmm1, m128","66 0F 2B /= r","V","V","SSE2","modrm_memonly","w,r","","" +"MOVNTPS m128, xmm1","MOVNTPS xmm1, 
m128","movntps xmm1, m128","0F 2B /r",= "V","V","SSE","modrm_memonly","w,r","","" +"MOVNTQ m64, mm1","MOVNTQ mm1, m64","movntq mm1, m64","0F E7 /r","V","V","= MMX","modrm_memonly","w,r","","" +"MOVNTSD m64, xmm1","MOVNTSD xmm1, m64","movntsd xmm1, m64","F2 0F 2B /r",= "V","V","SSE4a","amd,modrm_memonly","w,r","","" +"MOVNTSS m32, xmm1","MOVNTSS xmm1, m32","movntss xmm1, m32","F3 0F 2B /r",= "V","V","SSE4a","amd,modrm_memonly","w,r","","" +"MOVDQA xmm2/m128, xmm1","MOVO xmm1, xmm2/m128","movdqa xmm1, xmm2/m128","= 66 0F 7F /r","V","V","SSE2","","w,r","","" +"MOVDQA xmm1, xmm2/m128","MOVO xmm2/m128, xmm1","movdqa xmm2/m128, xmm1","= 66 0F 6F /r","V","V","SSE2","","w,r","","" +"MOVDQU xmm2/m128, xmm1","MOVOU xmm1, xmm2/m128","movdqu xmm1, xmm2/m128",= "F3 0F 7F /r","V","V","SSE2","","w,r","","" +"MOVDQU xmm1, xmm2/m128","MOVOU xmm2/m128, xmm1","movdqu xmm2/m128, xmm1",= "F3 0F 6F /r","V","V","SSE2","","w,r","","" +"MOV rmr64, CR0-CR7","MOVQ CR0-CR7, rmr64","movq CR0-CR7, rmr64","0F 20 /r= ","N.S.","V","","default64","w,r","Y","64" +"MOV rmr64, CR8","MOVQ CR8, rmr64","movq CR8, rmr64","REX.R + 0F 20 /0","N= .E.","V","","modrm_regonly,pseudo","w,r","Y","64" +"MOV rmr64, DR0-DR7","MOVQ DR0-DR7, rmr64","movq DR0-DR7, rmr64","0F 21 /r= ","N.S.","V","","default64","w,r","Y","64" +"MOV moffs64, RAX","MOVQ RAX, moffs64","movabsq RAX, moffs64","REX.W A3 cm= ","N.S.","V","","","w,r","Y","64" +"MOV r/m64, imm32","MOVQ imm32, r/m64","movq imm32, r/m64","REX.W C7 /0 id= ","N.S.","V","","","w,r","Y","64" +"MOV r64op, imm64u","MOVQ imm64u, r64op","movq imm64u, r64op","REX.W B8+ro= io","N.S.","V","","","w,r","Y","64" +"MOVQ mm2/m64, mm1","MOVQ mm1, mm2/m64","movq mm1, mm2/m64","0F 7F /r","V"= ,"V","MMX","","w,r","","" +"MOVQ r/m64, mm1","MOVQ mm1, r/m64","movq mm1, r/m64","REX.W 0F 7E /r","N.= S.","V","MMX","","w,r","","" +"MOVQ mm1, mm2/m64","MOVQ mm2/m64, mm1","movq mm2/m64, mm1","0F 6F /r","V"= ,"V","MMX","","w,r","","" +"MOV RAX, moffs64","MOVQ moffs64, RAX","movabsq moffs64, RAX","REX.W A1 cm= ","N.S.","V","","","w,r","Y","64" +"MOVQ mm1, r/m64","MOVQ r/m64, mm1","movq r/m64, mm1","REX.W 0F 6E /r","N.= S.","V","MMX","","w,r","","" +"MOV r64, r/m64","MOVQ r/m64, r64","movq r/m64, r64","REX.W 8B /r","N.S.",= "V","","","w,r","Y","64" +"MOVQ xmm1, r/m64","MOVQ r/m64, xmm1","movq r/m64, xmm1","66 REX.W 0F 6E /= r","N.S.","V","SSE2","","w,r","","" +"MOV r/m64, r64","MOVQ r64, r/m64","movq r64, r/m64","REX.W 89 /r","N.S.",= "V","","","w,r","Y","64" +"MOV CR0-CR7, rmr64","MOVQ rmr64, CR0-CR7","movq rmr64, CR0-CR7","0F 22 /r= ","N.S.","V","","default64","w,r","Y","64" +"MOV CR8, rmr64","MOVQ rmr64, CR8","movq rmr64, CR8","REX.R + 0F 22 /0","N= .E.","V","","modrm_regonly,pseudo","w,r","Y","64" +"MOV DR0-DR7, rmr64","MOVQ rmr64, DR0-DR7","movq rmr64, DR0-DR7","0F 23 /r= ","N.S.","V","","default64","w,r","Y","64" +"MOVQ r/m64, xmm1","MOVQ xmm1, r/m64","movq xmm1, r/m64","66 REX.W 0F 7E /= r","N.S.","V","SSE2","","w,r","","" +"MOVQ xmm2/m64, xmm1","MOVQ xmm1, xmm2/m64","movq xmm1, xmm2/m64","66 0F D= 6 /r","V","V","SSE2","","w,r","","" +"MOVDQ2Q mm1, xmm2","MOVQ xmm2, mm1","movdq2q xmm2, mm1","F2 0F D6 /r","V"= ,"V","SSE2","modrm_regonly","w,r","","" +"MOVQ xmm1, xmm2/m64","MOVQ xmm2/m64, xmm1","movq xmm2/m64, xmm1","F3 0F 7= E /r","V","V","SSE2","","w,r","","" +"MOVQ2DQ xmm1, mm2","MOVQOZX mm2, xmm1","movq2dq mm2, xmm1","F3 0F D6 /r",= "V","V","SSE2","modrm_regonly","w,r","","" +"MOVSB","MOVSB","movsb","A4","V","V","","","","","" +"MOVSD xmm2/m64, xmm1","MOVSD xmm1, xmm2/m64","movsd xmm1, xmm2/m64","F2 0= F 11 
/r","V","V","SSE2","","w,r","","" +"MOVSD xmm1, xmm2/m64","MOVSD xmm2/m64, xmm1","movsd xmm2/m64, xmm1","F2 0= F 10 /r","V","V","SSE2","","w,r","","" +"MOVSHDUP xmm1, xmm2/m128","MOVSHDUP xmm2/m128, xmm1","movshdup xmm2/m128,= xmm1","F3 0F 16 /r","V","V","SSE3","","w,r","","" +"MOVSD","MOVSL","movsl","A5","V","V","","operand32","","","" +"MOVSLDUP xmm1, xmm2/m128","MOVSLDUP xmm2/m128, xmm1","movsldup xmm2/m128,= xmm1","F3 0F 12 /r","V","V","SSE3","","w,r","","" +"MOVSQ","MOVSQ","movsq","REX.W A5","N.S.","V","","","","","" +"MOVSS xmm2/m32, xmm1","MOVSS xmm1, xmm2/m32","movss xmm1, xmm2/m32","F3 0= F 11 /r","V","V","SSE","","w,r","","" +"MOVSS xmm1, xmm2/m32","MOVSS xmm2/m32, xmm1","movss xmm2/m32, xmm1","F3 0= F 10 /r","V","V","SSE","","w,r","","" +"MOVSW","MOVSW","movsw","A5","V","V","","operand16","","","" +"MOVSX r16, r/m16","MOVSWW r/m16, r16","movsww r/m16, r16","0F BF /r","V",= "V","","operand16","w,r","Y","16" +"MOVUPD xmm2/m128, xmm1","MOVUPD xmm1, xmm2/m128","movupd xmm1, xmm2/m128"= ,"66 0F 11 /r","V","V","SSE2","","w,r","","" +"MOVUPD xmm1, xmm2/m128","MOVUPD xmm2/m128, xmm1","movupd xmm2/m128, xmm1"= ,"66 0F 10 /r","V","V","SSE2","","w,r","","" +"MOVUPS xmm2/m128, xmm1","MOVUPS xmm1, xmm2/m128","movups xmm1, xmm2/m128"= ,"0F 11 /r","V","V","SSE","","w,r","","" +"MOVUPS xmm1, xmm2/m128","MOVUPS xmm2/m128, xmm1","movups xmm2/m128, xmm1"= ,"0F 10 /r","V","V","SSE","","w,r","","" +"MOV moffs16, AX","MOVW AX, moffs16","movw AX, moffs16","A3 cm","V","V",""= ,"operand16","w,r","Y","16" +"MOV r/m16, Sreg","MOVW Sreg, r/m16","movw Sreg, r/m16","8C /r","V","V",""= ,"operand16","w,r","Y","16" +"MOV r/m16, imm16","MOVW imm16, r/m16","movw imm16, r/m16","C7 /0 iw","V",= "V","","operand16","w,r","Y","16" +"MOV r16op, imm16u","MOVW imm16u, r16op","movw imm16u, r16op","B8+rw iw","= V","V","","operand16","w,r","Y","16" +"MOV AX, moffs16","MOVW moffs16, AX","movw moffs16, AX","A1 cm","V","V",""= ,"operand16","w,r","Y","16" +"MOV Sreg, r/m16","MOVW r/m16, Sreg","movw r/m16, Sreg","8E /r","V","V",""= ,"","w,r","Y","16" +"MOV r16, r/m16","MOVW r/m16, r16","movw r/m16, r16","8B /r","V","V","","o= perand16","w,r","Y","16" +"MOV r/m16, r16","MOVW r16, r/m16","movw r16, r/m16","89 /r","V","V","","o= perand16","w,r","Y","16" +"MOVSX r32, r/m16","MOVWLSX r/m16, r32","movswl r/m16, r32","0F BF /r","V"= ,"V","","operand32","w,r","Y","32" +"MOVZX r32, r/m16","MOVWLZX r/m16, r32","movzwl r/m16, r32","0F B7 /r","V"= ,"V","","operand32","w,r","Y","32" +"MOVSX r64, r/m16","MOVWQSX r/m16, r64","movswq r/m16, r64","REX.W 0F BF /= r","N.S.","V","","","w,r","Y","64" +"MOVSXD r16, r/m32","MOVWQSX r/m32, r16","movsxdw r/m32, r16","63 /r","N.S= .","V","","operand16","w,r","Y","16" +"MOVZX r64, r/m16","MOVWQZX r/m16, r64","movzwq r/m16, r64","REX.W 0F B7 /= r","N.S.","V","","","w,r","Y","64" +"MOVZX r16, r/m16","MOVZWW r/m16, r16","movzww r/m16, r16","0F B7 /r","V",= "V","","operand16","w,r","Y","16" +"MOV r32/m16, Sreg","MOV{L/W} Sreg, r32/m16","mov{l/w} Sreg, r32/m16","8C = /r","V","V","","operand32","w,r","Y","" +"MOV r64/m16, Sreg","MOV{Q/W} Sreg, r64/m16","mov{q/w} Sreg, r64/m16","REX= .W 8C /r","N.S.","V","","","w,r","Y","" +"MPSADBW xmm1, xmm2/m128, imm8u","MPSADBW imm8u, xmm2/m128, xmm1","mpsadbw= imm8u, xmm2/m128, xmm1","66 0F 3A 42 /r ib","V","V","SSE4_1","","rw,r,r","= ","" +"MUL r/m8","MULB r/m8","mulb r/m8","F6 /4","V","V","","","r","Y","8" +"MUL r/m8","MULB r/m8","mulb r/m8","REX F6 /4","N.E.","V","","pseudo64","r= ","Y","8" +"MUL r/m32","MULL r/m32","mull r/m32","F7 /4","V","V","","operand32","r","= 
Y","32" +"MULPD xmm1, xmm2/m128","MULPD xmm2/m128, xmm1","mulpd xmm2/m128, xmm1","6= 6 0F 59 /r","V","V","SSE2","","rw,r","","" +"MULPS xmm1, xmm2/m128","MULPS xmm2/m128, xmm1","mulps xmm2/m128, xmm1","0= F 59 /r","V","V","SSE","","rw,r","","" +"MUL r/m64","MULQ r/m64","mulq r/m64","REX.W F7 /4","N.S.","V","","","r","= Y","64" +"MULSD xmm1, xmm2/m64","MULSD xmm2/m64, xmm1","mulsd xmm2/m64, xmm1","F2 0= F 59 /r","V","V","SSE2","","rw,r","","" +"MULSS xmm1, xmm2/m32","MULSS xmm2/m32, xmm1","mulss xmm2/m32, xmm1","F3 0= F 59 /r","V","V","SSE","","rw,r","","" +"MUL r/m16","MULW r/m16","mulw r/m16","F7 /4","V","V","","operand16","r","= Y","16" +"MULX r32, r32V, r/m32","MULXL r/m32, r32V, r32","mulxl r/m32, r32V, r32",= "VEX.NDD.128.F2.0F38.W0 F6 /r","V","V","BMI2","","w,w,r","Y","32" +"MULX r64, r64V, r/m64","MULXQ r/m64, r64V, r64","mulxq r/m64, r64V, r64",= "VEX.NDD.128.F2.0F38.W1 F6 /r","N.S.","V","BMI2","","w,w,r","Y","64" +"MWAIT","MWAIT","mwait","0F 01 C9","V","V","MONITOR","","","","" +"NEG r/m8","NEGB r/m8","negb r/m8","F6 /3","V","V","","","rw","Y","8" +"NEG r/m8","NEGB r/m8","negb r/m8","REX F6 /3","N.E.","V","","pseudo64","r= w","Y","8" +"NEG r/m32","NEGL r/m32","negl r/m32","F7 /3","V","V","","operand32","rw",= "Y","32" +"NEG r/m64","NEGQ r/m64","negq r/m64","REX.W F7 /3","N.S.","V","","","rw",= "Y","64" +"NEG r/m16","NEGW r/m16","negw r/m16","F7 /3","V","V","","operand16","rw",= "Y","16" +"NOP","NOP","nop","90","V","V","","pseudo","","Y","" +"NOP","NOP","nop","90+rd","V","V","","operand32,operand64","","Y","" +"NOP","NOP","nop","90+rw","V","V","","operand16,operand64","","Y","" +"NOP","NOP","nop","F3 90+rd","V","V","","operand32","","Y","" +"NOP","NOP","nop","F3 90+rw","V","V","","operand16","","Y","" +"NOP r/m32","NOPL r/m32","nopl r/m32","0F 18 /4","V","V","","operand32","r= ","Y","32" +"NOP r/m32","NOPL r/m32","nopl r/m32","0F 18 /5","V","V","","operand32","r= ","Y","32" +"NOP r/m32","NOPL r/m32","nopl r/m32","0F 18 /6","V","V","","operand32","r= ","Y","32" +"NOP r/m32","NOPL r/m32","nopl r/m32","0F 18 /7","V","V","","operand32","r= ","Y","32" +"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 19 /r","V","V",""= ,"operand32","r,r","Y","32" +"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1A /r","V","V",""= ,"operand32","r,r","Y","32" +"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1B /r","V","V",""= ,"operand32","r,r","Y","32" +"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1C /r","V","V",""= ,"operand32","r,r","Y","32" +"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1D /r","V","V",""= ,"operand32","r,r","Y","32" +"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1E /r","V","V","P= PRO","operand32","r,r","Y","32" +"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1E /r","V","V",""= ,"operand32","r,r","Y","32" +"NOP r/m32, r32","NOPL r32, r/m32","nopl r32, r/m32","0F 1F /r","V","V",""= ,"operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","0F 0D /r","V","V","P= RFCHW","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","0F 1A /r","V","V","P= PRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","0F 1B /r","V","V","P= PRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","66 0F 1E /r","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F2 0F 1E /r","V","V"= 
,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1B /r","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /0","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /1","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /2","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /3","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /4","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /5","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E /6","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E F8","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E F9","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E FA","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E FB","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E FC","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E FD","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E FE","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32, r32","NOPL r32, rmr32","nopl r32, rmr32","F3 0F 1E FF","V","V"= ,"PPRO","modrm_regonly,operand32","r,r","Y","32" +"NOP rmr32","NOPL rmr32","nopl rmr32","0F 18 /0","V","V","","modrm_regonly= ,operand32","r","Y","32" +"NOP rmr32","NOPL rmr32","nopl rmr32","0F 18 /1","V","V","","modrm_regonly= ,operand32","r","Y","32" +"NOP rmr32","NOPL rmr32","nopl rmr32","0F 18 /2","V","V","","modrm_regonly= ,operand32","r","Y","32" +"NOP rmr32","NOPL rmr32","nopl rmr32","0F 18 /3","V","V","","modrm_regonly= ,operand32","r","Y","32" +"NOP r/m64","NOPQ r/m64","nopq r/m64","REX.W 0F 18 /4","N.S.","V","","","r= ","Y","64" +"NOP r/m64","NOPQ r/m64","nopq r/m64","REX.W 0F 18 /5","N.S.","V","","","r= ","Y","64" +"NOP r/m64","NOPQ r/m64","nopq r/m64","REX.W 0F 18 /6","N.S.","V","","","r= ","Y","64" +"NOP r/m64","NOPQ r/m64","nopq r/m64","REX.W 0F 18 /7","N.S.","V","","","r= ","Y","64" +"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 19 /r","N.S= .","V","","","r,r","Y","64" +"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1A /r","N.S= .","V","","","r,r","Y","64" +"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1B /r","N.S= .","V","","","r,r","Y","64" +"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1C /r","N.S= .","V","","","r,r","Y","64" +"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1D /r","N.S= .","V","","","r,r","Y","64" +"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1E /r","N.S= .","V","","","r,r","Y","64" +"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1E /r","N.S= 
.","V","PPRO","","r,r","Y","64" +"NOP r/m64, r64","NOPQ r64, r/m64","nopq r64, r/m64","REX.W 0F 1F /r","N.S= .","V","","","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","66 REX.W 0F 1E /r","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F2 REX.W 0F 1E /r","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1B /r","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /0","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /1","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /2","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /3","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /4","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /5","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E /6","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E F8","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E F9","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E FA","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E FB","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E FC","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E FD","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E FE","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","F3 REX.W 0F 1E FF","= N.S.","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","REX.W 0F 0D /r","N.S= .","V","PRFCHW","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","REX.W 0F 1A /r","N.S= .","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64, r64","NOPQ r64, rmr64","nopq r64, rmr64","REX.W 0F 1B /r","N.S= .","V","PPRO","modrm_regonly","r,r","Y","64" +"NOP rmr64","NOPQ rmr64","nopq rmr64","REX.W 0F 18 /0","N.S.","V","","modr= m_regonly","r","Y","64" +"NOP rmr64","NOPQ rmr64","nopq rmr64","REX.W 0F 18 /1","N.S.","V","","modr= m_regonly","r","Y","64" +"NOP rmr64","NOPQ rmr64","nopq rmr64","REX.W 0F 18 /2","N.S.","V","","modr= m_regonly","r","Y","64" +"NOP rmr64","NOPQ rmr64","nopq rmr64","REX.W 0F 18 /3","N.S.","V","","modr= m_regonly","r","Y","64" +"NOP r/m16","NOPW r/m16","nopw r/m16","0F 18 /4","V","V","","operand16","r= ","Y","16" +"NOP r/m16","NOPW r/m16","nopw r/m16","0F 18 /5","V","V","","operand16","r= ","Y","16" +"NOP r/m16","NOPW r/m16","nopw r/m16","0F 18 /6","V","V","","operand16","r= ","Y","16" +"NOP r/m16","NOPW r/m16","nopw r/m16","0F 18 /7","V","V","","operand16","r= ","Y","16" +"NOP r/m16, r16","NOPW r16, r/m16","nopw 
r16, r/m16","0F 19 /r","V","V",""= ,"operand16","r,r","Y","16" +"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1A /r","V","V",""= ,"operand16","r,r","Y","16" +"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1B /r","V","V",""= ,"operand16","r,r","Y","16" +"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1C /r","V","V",""= ,"operand16","r,r","Y","16" +"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1D /r","V","V",""= ,"operand16","r,r","Y","16" +"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1E /r","V","V",""= ,"operand16","r,r","Y","16" +"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1E /r","V","V","P= PRO","operand16","r,r","Y","16" +"NOP r/m16, r16","NOPW r16, r/m16","nopw r16, r/m16","0F 1F /r","V","V",""= ,"operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","0F 0D /r","V","V","P= RFCHW","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","0F 1A /r","V","V","P= PRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","0F 1B /r","V","V","P= PRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","66 0F 1E /r","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F2 0F 1E /r","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1B /r","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /0","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /1","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /2","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /3","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /4","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /5","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E /6","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E F8","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E F9","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E FA","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E FB","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E FC","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E FD","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E FE","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16, r16","NOPW r16, rmr16","nopw r16, rmr16","F3 0F 1E FF","V","V"= ,"PPRO","modrm_regonly,operand16","r,r","Y","16" +"NOP rmr16","NOPW rmr16","nopw rmr16","0F 18 /0","V","V","","modrm_regonly= ,operand16","r","Y","16" +"NOP 
rmr16","NOPW rmr16","nopw rmr16","0F 18 /1","V","V","","modrm_regonly= ,operand16","r","Y","16" +"NOP rmr16","NOPW rmr16","nopw rmr16","0F 18 /2","V","V","","modrm_regonly= ,operand16","r","Y","16" +"NOP rmr16","NOPW rmr16","nopw rmr16","0F 18 /3","V","V","","modrm_regonly= ,operand16","r","Y","16" +"NOT r/m8","NOTB r/m8","notb r/m8","F6 /2","V","V","","","rw","Y","8" +"NOT r/m8","NOTB r/m8","notb r/m8","REX F6 /2","N.E.","V","","pseudo64","r= w","Y","8" +"NOT r/m32","NOTL r/m32","notl r/m32","F7 /2","V","V","","operand32","rw",= "Y","32" +"NOT r/m64","NOTQ r/m64","notq r/m64","REX.W F7 /2","N.S.","V","","","rw",= "Y","64" +"NOT r/m16","NOTW r/m16","notw r/m16","F7 /2","V","V","","operand16","rw",= "Y","16" +"OR r/m8, imm8","ORB imm8, r/m8","orb imm8, r/m8","80 /1 ib","V","V","",""= ,"rw,r","Y","8" +"OR r/m8, imm8","ORB imm8, r/m8","orb imm8, r/m8","82 /1 ib","V","N.S.",""= ,"","rw,r","Y","8" +"OR r/m8, imm8","ORB imm8, r/m8","orb imm8, r/m8","REX 80 /1 ib","N.E.","V= ","","pseudo64","rw,r","Y","8" +"OR AL, imm8u","ORB imm8u, AL","orb imm8u, AL","0C ib","V","V","","","rw,r= ","Y","8" +"OR r8, r/m8","ORB r/m8, r8","orb r/m8, r8","0A /r","V","V","","","rw,r","= Y","8" +"OR r8, r/m8","ORB r/m8, r8","orb r/m8, r8","REX 0A /r","N.E.","V","","pse= udo64","rw,r","Y","8" +"OR r/m8, r8","ORB r8, r/m8","orb r8, r/m8","08 /r","V","V","","","rw,r","= Y","8" +"OR r/m8, r8","ORB r8, r/m8","orb r8, r/m8","REX 08 /r","N.E.","V","","pse= udo64","rw,r","Y","8" +"OR EAX, imm32","ORL imm32, EAX","orl imm32, EAX","0D id","V","V","","oper= and32","rw,r","Y","32" +"OR r/m32, imm32","ORL imm32, r/m32","orl imm32, r/m32","81 /1 id","V","V"= ,"","operand32","rw,r","Y","32" +"OR r/m32, imm8","ORL imm8, r/m32","orl imm8, r/m32","83 /1 ib","V","V",""= ,"operand32","rw,r","Y","32" +"OR r32, r/m32","ORL r/m32, r32","orl r/m32, r32","0B /r","V","V","","oper= and32","rw,r","Y","32" +"OR r/m32, r32","ORL r32, r/m32","orl r32, r/m32","09 /r","V","V","","oper= and32","rw,r","Y","32" +"ORPD xmm1, xmm2/m128","ORPD xmm2/m128, xmm1","orpd xmm2/m128, xmm1","66 0= F 56 /r","V","V","SSE2","","rw,r","","" +"ORPS xmm1, xmm2/m128","ORPS xmm2/m128, xmm1","orps xmm2/m128, xmm1","0F 5= 6 /r","V","V","SSE","","rw,r","","" +"OR RAX, imm32","ORQ imm32, RAX","orq imm32, RAX","REX.W 0D id","N.S.","V"= ,"","","rw,r","Y","64" +"OR r/m64, imm32","ORQ imm32, r/m64","orq imm32, r/m64","REX.W 81 /1 id","= N.S.","V","","","rw,r","Y","64" +"OR r/m64, imm8","ORQ imm8, r/m64","orq imm8, r/m64","REX.W 83 /1 ib","N.S= .","V","","","rw,r","Y","64" +"OR r64, r/m64","ORQ r/m64, r64","orq r/m64, r64","REX.W 0B /r","N.S.","V"= ,"","","rw,r","Y","64" +"OR r/m64, r64","ORQ r64, r/m64","orq r64, r/m64","REX.W 09 /r","N.S.","V"= ,"","","rw,r","Y","64" +"OR AX, imm16","ORW imm16, AX","orw imm16, AX","0D iw","V","V","","operand= 16","rw,r","Y","16" +"OR r/m16, imm16","ORW imm16, r/m16","orw imm16, r/m16","81 /1 iw","V","V"= ,"","operand16","rw,r","Y","16" +"OR r/m16, imm8","ORW imm8, r/m16","orw imm8, r/m16","83 /1 ib","V","V",""= ,"operand16","rw,r","Y","16" +"OR r16, r/m16","ORW r/m16, r16","orw r/m16, r16","0B /r","V","V","","oper= and16","rw,r","Y","16" +"OR r/m16, r16","ORW r16, r/m16","orw r16, r/m16","09 /r","V","V","","oper= and16","rw,r","Y","16" +"OUT DX, AL","OUTB AL, DX","outb AL, DX","EE","V","V","","","r,r","Y","8" +"OUT imm8u, AL","OUTB AL, imm8u","outb AL, imm8u","E6 ib","V","V","","","r= ,r","Y","8" +"OUT DX, EAX","OUTL EAX, DX","outl EAX, DX","EF","V","V","","operand32,ope= rand64","r,r","Y","32" +"OUT imm8u, EAX","OUTL EAX, imm8u","outl EAX, 
imm8u","E7 ib","V","V","","o= perand32,operand64","r,r","Y","32" +"OUTSB","OUTSB","outsb","6E","V","V","","","","","" +"OUTSD","OUTSL","outsl","6F","V","V","","operand32,operand64","","","" +"OUTSW","OUTSW","outsw","6F","V","V","","operand16","","","" +"OUT DX, AX","OUTW AX, DX","outw AX, DX","EF","V","V","","operand16","r,r"= ,"Y","16" +"OUT imm8u, AX","OUTW AX, imm8u","outw AX, imm8u","E7 ib","V","V","","oper= and16","r,r","Y","16" +"PABSB mm1, mm2/m64","PABSB mm2/m64, mm1","pabsb mm2/m64, mm1","0F 38 1C /= r","V","V","SSSE3","","w,r","","" +"PABSB xmm1, xmm2/m128","PABSB xmm2/m128, xmm1","pabsb xmm2/m128, xmm1","6= 6 0F 38 1C /r","V","V","SSSE3","","w,r","","" +"PABSD mm1, mm2/m64","PABSD mm2/m64, mm1","pabsd mm2/m64, mm1","0F 38 1E /= r","V","V","SSSE3","","w,r","","" +"PABSD xmm1, xmm2/m128","PABSD xmm2/m128, xmm1","pabsd xmm2/m128, xmm1","6= 6 0F 38 1E /r","V","V","SSSE3","","w,r","","" +"PABSW mm1, mm2/m64","PABSW mm2/m64, mm1","pabsw mm2/m64, mm1","0F 38 1D /= r","V","V","SSSE3","","w,r","","" +"PABSW xmm1, xmm2/m128","PABSW xmm2/m128, xmm1","pabsw xmm2/m128, xmm1","6= 6 0F 38 1D /r","V","V","SSSE3","","w,r","","" +"PACKSSDW mm1, mm2/m64","PACKSSLW mm2/m64, mm1","packssdw mm2/m64, mm1","0= F 6B /r","V","V","MMX","","rw,r","","" +"PACKSSDW xmm1, xmm2/m128","PACKSSLW xmm2/m128, xmm1","packssdw xmm2/m128,= xmm1","66 0F 6B /r","V","V","SSE2","","rw,r","","" +"PACKSSWB mm1, mm2/m64","PACKSSWB mm2/m64, mm1","packsswb mm2/m64, mm1","0= F 63 /r","V","V","MMX","","rw,r","","" +"PACKSSWB xmm1, xmm2/m128","PACKSSWB xmm2/m128, xmm1","packsswb xmm2/m128,= xmm1","66 0F 63 /r","V","V","SSE2","","rw,r","","" +"PACKUSDW xmm1, xmm2/m128","PACKUSDW xmm2/m128, xmm1","packusdw xmm2/m128,= xmm1","66 0F 38 2B /r","V","V","SSE4_1","","rw,r","","" +"PACKUSWB mm1, mm2/m64","PACKUSWB mm2/m64, mm1","packuswb mm2/m64, mm1","0= F 67 /r","V","V","MMX","","rw,r","","" +"PACKUSWB xmm1, xmm2/m128","PACKUSWB xmm2/m128, xmm1","packuswb xmm2/m128,= xmm1","66 0F 67 /r","V","V","SSE2","","rw,r","","" +"PADDB mm1, mm2/m64","PADDB mm2/m64, mm1","paddb mm2/m64, mm1","0F FC /r",= "V","V","MMX","","rw,r","","" +"PADDB xmm1, xmm2/m128","PADDB xmm2/m128, xmm1","paddb xmm2/m128, xmm1","6= 6 0F FC /r","V","V","SSE2","","rw,r","","" +"PADDD mm1, mm2/m64","PADDL mm2/m64, mm1","paddd mm2/m64, mm1","0F FE /r",= "V","V","MMX","","rw,r","","" +"PADDD xmm1, xmm2/m128","PADDL xmm2/m128, xmm1","paddd xmm2/m128, xmm1","6= 6 0F FE /r","V","V","SSE2","","rw,r","","" +"PADDQ mm1, mm2/m64","PADDQ mm2/m64, mm1","paddq mm2/m64, mm1","0F D4 /r",= "V","V","SSE2","","rw,r","","" +"PADDQ xmm1, xmm2/m128","PADDQ xmm2/m128, xmm1","paddq xmm2/m128, xmm1","6= 6 0F D4 /r","V","V","SSE2","","rw,r","","" +"PADDSB mm1, mm2/m64","PADDSB mm2/m64, mm1","paddsb mm2/m64, mm1","0F EC /= r","V","V","MMX","","rw,r","","" +"PADDSB xmm1, xmm2/m128","PADDSB xmm2/m128, xmm1","paddsb xmm2/m128, xmm1"= ,"66 0F EC /r","V","V","SSE2","","rw,r","","" +"PADDSW mm1, mm2/m64","PADDSW mm2/m64, mm1","paddsw mm2/m64, mm1","0F ED /= r","V","V","MMX","","rw,r","","" +"PADDSW xmm1, xmm2/m128","PADDSW xmm2/m128, xmm1","paddsw xmm2/m128, xmm1"= ,"66 0F ED /r","V","V","SSE2","","rw,r","","" +"PADDUSB mm1, mm2/m64","PADDUSB mm2/m64, mm1","paddusb mm2/m64, mm1","0F D= C /r","V","V","MMX","","rw,r","","" +"PADDUSB xmm1, xmm2/m128","PADDUSB xmm2/m128, xmm1","paddusb xmm2/m128, xm= m1","66 0F DC /r","V","V","SSE2","","rw,r","","" +"PADDUSW mm1, mm2/m64","PADDUSW mm2/m64, mm1","paddusw mm2/m64, mm1","0F D= D /r","V","V","MMX","","rw,r","","" +"PADDUSW xmm1, xmm2/m128","PADDUSW xmm2/m128, 
xmm1","paddusw xmm2/m128, xm= m1","66 0F DD /r","V","V","SSE2","","rw,r","","" +"PADDW mm1, mm2/m64","PADDW mm2/m64, mm1","paddw mm2/m64, mm1","0F FD /r",= "V","V","MMX","","rw,r","","" +"PADDW xmm1, xmm2/m128","PADDW xmm2/m128, xmm1","paddw xmm2/m128, xmm1","6= 6 0F FD /r","V","V","SSE2","","rw,r","","" +"PALIGNR mm1, mm2/m64, imm8u","PALIGNR imm8u, mm2/m64, mm1","palignr imm8u= , mm2/m64, mm1","0F 3A 0F /r ib","V","V","SSSE3","","rw,r,r","","" +"PALIGNR xmm1, xmm2/m128, imm8u","PALIGNR imm8u, xmm2/m128, xmm1","palignr= imm8u, xmm2/m128, xmm1","66 0F 3A 0F /r ib","V","V","SSSE3","","rw,r,r",""= ,"" +"PAND mm1, mm2/m64","PAND mm2/m64, mm1","pand mm2/m64, mm1","0F DB /r","V"= ,"V","MMX","","rw,r","","" +"PAND xmm1, xmm2/m128","PAND xmm2/m128, xmm1","pand xmm2/m128, xmm1","66 0= F DB /r","V","V","SSE2","","rw,r","","" +"PANDN mm1, mm2/m64","PANDN mm2/m64, mm1","pandn mm2/m64, mm1","0F DF /r",= "V","V","MMX","","rw,r","","" +"PANDN xmm1, xmm2/m128","PANDN xmm2/m128, xmm1","pandn xmm2/m128, xmm1","6= 6 0F DF /r","V","V","SSE2","","rw,r","","" +"PAUSE","PAUSE","pause","F3 90","V","V","","pseudo","","","" +"PAUSE","PAUSE","pause","F3 90+rd","V","V","","operand32","","Y","" +"PAUSE","PAUSE","pause","F3 90+rw","V","V","","operand16,operand64","","Y"= ,"" +"PAVGB mm1, mm2/m64","PAVGB mm2/m64, mm1","pavgb mm2/m64, mm1","0F E0 /r",= "V","V","MMX","","rw,r","","" +"PAVGB xmm1, xmm2/m128","PAVGB xmm2/m128, xmm1","pavgb xmm2/m128, xmm1","6= 6 0F E0 /r","V","V","SSE2","","rw,r","","" +"PAVGUSB mm1, mm2/m64","PAVGUSB mm2/m64, mm1","pavgusb mm2/m64, mm1","0F 0= F BF /r","V","V","3DNOW","amd","rw,r","","" +"PAVGW mm1, mm2/m64","PAVGW mm2/m64, mm1","pavgw mm2/m64, mm1","0F E3 /r",= "V","V","MMX","","rw,r","","" +"PAVGW xmm1, xmm2/m128","PAVGW xmm2/m128, xmm1","pavgw xmm2/m128, xmm1","6= 6 0F E3 /r","V","V","SSE2","","rw,r","","" +"PBLENDVB xmm1, xmm2/m128, ","PBLENDVB , xmm2/m128, xmm1","pbl= endvb , xmm2/m128, xmm1","66 0F 38 10 /r","V","V","SSE4_1","","rw,r,r= ","","" +"PBLENDW xmm1, xmm2/m128, imm8u","PBLENDW imm8u, xmm2/m128, xmm1","pblendw= imm8u, xmm2/m128, xmm1","66 0F 3A 0E /r ib","V","V","SSE4_1","","rw,r,r","= ","" +"PCLMULQDQ xmm1, xmm2/m128, imm8u","PCLMULQDQ imm8u, xmm2/m128, xmm1","pcl= mulqdq imm8u, xmm2/m128, xmm1","66 0F 3A 44 /r ib","V","V","PCLMULQDQ","","= rw,r,r","","" +"PCMPEQB mm1, mm2/m64","PCMPEQB mm2/m64, mm1","pcmpeqb mm2/m64, mm1","0F 7= 4 /r","V","V","MMX","","rw,r","","" +"PCMPEQB xmm1, xmm2/m128","PCMPEQB xmm2/m128, xmm1","pcmpeqb xmm2/m128, xm= m1","66 0F 74 /r","V","V","SSE2","","rw,r","","" +"PCMPEQD mm1, mm2/m64","PCMPEQL mm2/m64, mm1","pcmpeqd mm2/m64, mm1","0F 7= 6 /r","V","V","MMX","","rw,r","","" +"PCMPEQD xmm1, xmm2/m128","PCMPEQL xmm2/m128, xmm1","pcmpeqd xmm2/m128, xm= m1","66 0F 76 /r","V","V","SSE2","","rw,r","","" +"PCMPEQQ xmm1, xmm2/m128","PCMPEQQ xmm2/m128, xmm1","pcmpeqq xmm2/m128, xm= m1","66 0F 38 29 /r","V","V","SSE4_1","","rw,r","","" +"PCMPEQW mm1, mm2/m64","PCMPEQW mm2/m64, mm1","pcmpeqw mm2/m64, mm1","0F 7= 5 /r","V","V","MMX","","rw,r","","" +"PCMPEQW xmm1, xmm2/m128","PCMPEQW xmm2/m128, xmm1","pcmpeqw xmm2/m128, xm= m1","66 0F 75 /r","V","V","SSE2","","rw,r","","" +"PCMPESTRI xmm1, xmm2/m128, imm8u","PCMPESTRI imm8u, xmm2/m128, xmm1","pcm= pestri imm8u, xmm2/m128, xmm1","66 0F 3A 61 /r ib","V","V","SSE4_2","","r,r= ,r","","" +"PCMPESTRM xmm1, xmm2/m128, imm8u","PCMPESTRM imm8u, xmm2/m128, xmm1","pcm= pestrm imm8u, xmm2/m128, xmm1","66 0F 3A 60 /r ib","V","V","SSE4_2","","r,r= ,r","","" +"PCMPGTB mm1, mm2/m64","PCMPGTB mm2/m64, mm1","pcmpgtb mm2/m64, 
mm1","0F 6= 4 /r","V","V","MMX","","rw,r","","" +"PCMPGTB xmm1, xmm2/m128","PCMPGTB xmm2/m128, xmm1","pcmpgtb xmm2/m128, xm= m1","66 0F 64 /r","V","V","SSE2","","rw,r","","" +"PCMPGTD mm1, mm2/m64","PCMPGTL mm2/m64, mm1","pcmpgtd mm2/m64, mm1","0F 6= 6 /r","V","V","MMX","","rw,r","","" +"PCMPGTD xmm1, xmm2/m128","PCMPGTL xmm2/m128, xmm1","pcmpgtd xmm2/m128, xm= m1","66 0F 66 /r","V","V","SSE2","","rw,r","","" +"PCMPGTQ xmm1, xmm2/m128","PCMPGTQ xmm2/m128, xmm1","pcmpgtq xmm2/m128, xm= m1","66 0F 38 37 /r","V","V","SSE4_2","","rw,r","","" +"PCMPGTW mm1, mm2/m64","PCMPGTW mm2/m64, mm1","pcmpgtw mm2/m64, mm1","0F 6= 5 /r","V","V","MMX","","rw,r","","" +"PCMPGTW xmm1, xmm2/m128","PCMPGTW xmm2/m128, xmm1","pcmpgtw xmm2/m128, xm= m1","66 0F 65 /r","V","V","SSE2","","rw,r","","" +"PCMPISTRI xmm1, xmm2/m128, imm8u","PCMPISTRI imm8u, xmm2/m128, xmm1","pcm= pistri imm8u, xmm2/m128, xmm1","66 0F 3A 63 /r ib","V","V","SSE4_2","","r,r= ,r","","" +"PCMPISTRM xmm1, xmm2/m128, imm8u","PCMPISTRM imm8u, xmm2/m128, xmm1","pcm= pistrm imm8u, xmm2/m128, xmm1","66 0F 3A 62 /r ib","V","V","SSE4_2","","r,r= ,r","","" +"PDEP r32, r32V, r/m32","PDEPL r/m32, r32V, r32","pdepl r/m32, r32V, r32",= "VEX.DDS.128.F2.0F38.W0 F5 /r","V","V","BMI2","","rw,r,r","Y","32" +"PDEP r64, r64V, r/m64","PDEPQ r/m64, r64V, r64","pdepq r/m64, r64V, r64",= "VEX.DDS.128.F2.0F38.W1 F5 /r","N.S.","V","BMI2","","rw,r,r","Y","64" +"PEXT r32, r32V, r/m32","PEXTL r/m32, r32V, r32","pextl r/m32, r32V, r32",= "VEX.DDS.128.F3.0F38.W0 F5 /r","V","V","BMI2","","rw,r,r","Y","32" +"PEXT r64, r64V, r/m64","PEXTQ r/m64, r64V, r64","pextq r/m64, r64V, r64",= "VEX.DDS.128.F3.0F38.W1 F5 /r","N.S.","V","BMI2","","rw,r,r","Y","64" +"PEXTRB r32/m8, xmm1, imm8u","PEXTRB imm8u, xmm1, r32/m8","pextrb imm8u, x= mm1, r32/m8","66 0F 3A 14 /r ib","V","V","SSE4_1","","w,r,r","","" +"PEXTRD r/m32, xmm1, imm8u","PEXTRD imm8u, xmm1, r/m32","pextrd imm8u, xmm= 1, r/m32","66 0F 3A 16 /r ib","V","V","SSE4_1","operand16,operand32","w,r,r= ","","" +"PEXTRQ r/m64, xmm1, imm8u","PEXTRQ imm8u, xmm1, r/m64","pextrq imm8u, xmm= 1, r/m64","66 REX.W 0F 3A 16 /r ib","N.S.","V","SSE4_1","","w,r,r","","" +"PEXTRW r32, mm2, imm8u","PEXTRW imm8u, mm2, r32","pextrw imm8u, mm2, r32"= ,"0F C5 /r ib","V","V","MMX","modrm_regonly","w,r,r","","" +"PEXTRW r32/m16, xmm1, imm8u","PEXTRW imm8u, xmm1, r32/m16","pextrw imm8u,= xmm1, r32/m16","66 0F 3A 15 /r ib","V","V","SSE4_1","","w,r,r","","" +"PEXTRW r32, xmm2, imm8u","PEXTRW imm8u, xmm2, r32","pextrw imm8u, xmm2, r= 32","66 0F C5 /r ib","V","V","SSE2","modrm_regonly","w,r,r","","" +"PF2ID mm1, mm2/m64","PF2ID mm2/m64, mm1","pf2id mm2/m64, mm1","0F 0F 1D /= r","V","V","3DNOW","amd","rw,r","","" +"PF2IW mm1, mm2/m64","PF2IW mm2/m64, mm1","pf2iw mm2/m64, mm1","0F 0F 1C /= r","V","V","3DNOW","amd","rw,r","","" +"PFACC mm1, mm2/m64","PFACC mm2/m64, mm1","pfacc mm2/m64, mm1","0F 0F AE /= r","V","V","3DNOW","amd","rw,r","","" +"PFADD mm1, mm2/m64","PFADD mm2/m64, mm1","pfadd mm2/m64, mm1","0F 0F 9E /= r","V","V","3DNOW","amd","rw,r","","" +"PFCMPEQ mm1, mm2/m64","PFCMPEQ mm2/m64, mm1","pfcmpeq mm2/m64, mm1","0F 0= F B0 /r","V","V","3DNOW","amd","rw,r","","" +"PFCMPGE mm1, mm2/m64","PFCMPGE mm2/m64, mm1","pfcmpge mm2/m64, mm1","0F 0= F 90 /r","V","V","3DNOW","amd","rw,r","","" +"PFCMPGT mm1, mm2/m64","PFCMPGT mm2/m64, mm1","pfcmpgt mm2/m64, mm1","0F 0= F A0 /r","V","V","3DNOW","amd","rw,r","","" +"PFCPIT1 mm1, mm2/m64","PFCPIT1 mm2/m64, mm1","pfcpit1 mm2/m64, mm1","0F 0= F A6 /r","V","V","3DNOW","amd","rw,r","","" +"PFMAX mm1, mm2/m64","PFMAX 
mm2/m64, mm1","pfmax mm2/m64, mm1","0F 0F A4 /= r","V","V","3DNOW","amd","rw,r","","" +"PFMIN mm1, mm2/m64","PFMIN mm2/m64, mm1","pfmin mm2/m64, mm1","0F 0F 94 /= r","V","V","3DNOW","amd","rw,r","","" +"PFMUL mm1, mm2/m64","PFMUL mm2/m64, mm1","pfmul mm2/m64, mm1","0F 0F B4 /= r","V","V","3DNOW","amd","rw,r","","" +"PFNACC mm1, mm2/m64","PFNACC mm2/m64, mm1","pfnacc mm2/m64, mm1","0F 0F 8= A /r","V","V","3DNOW","amd","rw,r","","" +"PFPNACC mm1, mm2/m64","PFPNACC mm2/m64, mm1","pfpnacc mm2/m64, mm1","0F 0= F 8E /r","V","V","3DNOW","amd","rw,r","","" +"PFRCP mm1, mm2/m64","PFRCP mm2/m64, mm1","pfrcp mm2/m64, mm1","0F 0F 96 /= r","V","V","3DNOW","amd","rw,r","","" +"PFRCPIT2 mm1, mm2/m64","PFRCPIT2 mm2/m64, mm1","pfrcpit2 mm2/m64, mm1","0= F 0F B6 /r","V","V","3DNOW","amd","rw,r","","" +"PFRSQIT1 mm1, mm2/m64","PFRSQIT1 mm2/m64, mm1","pfrsqit1 mm2/m64, mm1","0= F 0F A7 /r","V","V","3DNOW","amd","rw,r","","" +"PFSQRT mm1, mm2/m64","PFSQRT mm2/m64, mm1","pfsqrt mm2/m64, mm1","0F 0F 9= 7 /r","V","V","3DNOW","amd","rw,r","","" +"PFSUB mm1, mm2/m64","PFSUB mm2/m64, mm1","pfsub mm2/m64, mm1","0F 0F 9A /= r","V","V","3DNOW","amd","rw,r","","" +"PFSUBR mm1, mm2/m64","PFSUBR mm2/m64, mm1","pfsubr mm2/m64, mm1","0F 0F A= A /r","V","V","3DNOW","amd","rw,r","","" +"PHADDD mm1, mm2/m64","PHADDD mm2/m64, mm1","phaddd mm2/m64, mm1","0F 38 0= 2 /r","V","V","SSSE3","","rw,r","","" +"PHADDD xmm1, xmm2/m128","PHADDD xmm2/m128, xmm1","phaddd xmm2/m128, xmm1"= ,"66 0F 38 02 /r","V","V","SSSE3","","rw,r","","" +"PHADDSW mm1, mm2/m64","PHADDSW mm2/m64, mm1","phaddsw mm2/m64, mm1","0F 3= 8 03 /r","V","V","SSSE3","","rw,r","","" +"PHADDSW xmm1, xmm2/m128","PHADDSW xmm2/m128, xmm1","phaddsw xmm2/m128, xm= m1","66 0F 38 03 /r","V","V","SSSE3","","rw,r","","" +"PHADDW mm1, mm2/m64","PHADDW mm2/m64, mm1","phaddw mm2/m64, mm1","0F 38 0= 1 /r","V","V","SSSE3","","rw,r","","" +"PHADDW xmm1, xmm2/m128","PHADDW xmm2/m128, xmm1","phaddw xmm2/m128, xmm1"= ,"66 0F 38 01 /r","V","V","SSSE3","","rw,r","","" +"PHMINPOSUW xmm1, xmm2/m128","PHMINPOSUW xmm2/m128, xmm1","phminposuw xmm2= /m128, xmm1","66 0F 38 41 /r","V","V","SSE4_1","","w,r","","" +"PHSUBD mm1, mm2/m64","PHSUBD mm2/m64, mm1","phsubd mm2/m64, mm1","0F 38 0= 6 /r","V","V","SSSE3","","rw,r","","" +"PHSUBD xmm1, xmm2/m128","PHSUBD xmm2/m128, xmm1","phsubd xmm2/m128, xmm1"= ,"66 0F 38 06 /r","V","V","SSSE3","","rw,r","","" +"PHSUBSW mm1, mm2/m64","PHSUBSW mm2/m64, mm1","phsubsw mm2/m64, mm1","0F 3= 8 07 /r","V","V","SSSE3","","rw,r","","" +"PHSUBSW xmm1, xmm2/m128","PHSUBSW xmm2/m128, xmm1","phsubsw xmm2/m128, xm= m1","66 0F 38 07 /r","V","V","SSSE3","","rw,r","","" +"PHSUBW mm1, mm2/m64","PHSUBW mm2/m64, mm1","phsubw mm2/m64, mm1","0F 38 0= 5 /r","V","V","SSSE3","","rw,r","","" +"PHSUBW xmm1, xmm2/m128","PHSUBW xmm2/m128, xmm1","phsubw xmm2/m128, xmm1"= ,"66 0F 38 05 /r","V","V","SSSE3","","rw,r","","" +"PI2FD mm1, mm2/m64","PI2FD mm2/m64, mm1","pi2fd mm2/m64, mm1","0F 0F 0D /= r","V","V","3DNOW","amd","rw,r","","" +"PI2FW mm1, mm2/m64","PI2FW mm2/m64, mm1","pi2fw mm2/m64, mm1","0F 0F 0C /= r","V","V","3DNOW","amd","rw,r","","" +"PINSRB xmm1, r32/m8, imm8u","PINSRB imm8u, r32/m8, xmm1","pinsrb imm8u, r= 32/m8, xmm1","66 0F 3A 20 /r ib","V","V","SSE4_1","","rw,r,r","","" +"PINSRD xmm1, r/m32, imm8u","PINSRD imm8u, r/m32, xmm1","pinsrd imm8u, r/m= 32, xmm1","66 0F 3A 22 /r ib","V","V","SSE4_1","operand16,operand32","rw,r,= r","","" +"PINSRQ xmm1, r/m64, imm8u","PINSRQ imm8u, r/m64, xmm1","pinsrq imm8u, r/m= 64, xmm1","66 REX.W 0F 3A 22 /r 
ib","N.S.","V","SSE4_1","","rw,r,r","","" +"PINSRW mm1, r32/m16, imm8u","PINSRW imm8u, r32/m16, mm1","pinsrw imm8u, r= 32/m16, mm1","0F C4 /r ib","V","V","MMX","","rw,r,r","","" +"PINSRW xmm1, r32/m16, imm8u","PINSRW imm8u, r32/m16, xmm1","pinsrw imm8u,= r32/m16, xmm1","66 0F C4 /r ib","V","V","SSE2","","rw,r,r","","" +"PMADDUBSW mm1, mm2/m64","PMADDUBSW mm2/m64, mm1","pmaddubsw mm2/m64, mm1"= ,"0F 38 04 /r","V","V","SSSE3","","rw,r","","" +"PMADDUBSW xmm1, xmm2/m128","PMADDUBSW xmm2/m128, xmm1","pmaddubsw xmm2/m1= 28, xmm1","66 0F 38 04 /r","V","V","SSSE3","","rw,r","","" +"PMADDWD mm1, mm2/m64","PMADDWL mm2/m64, mm1","pmaddwd mm2/m64, mm1","0F F= 5 /r","V","V","MMX","","rw,r","","" +"PMADDWD xmm1, xmm2/m128","PMADDWL xmm2/m128, xmm1","pmaddwd xmm2/m128, xm= m1","66 0F F5 /r","V","V","SSE2","","rw,r","","" +"PMAXSB xmm1, xmm2/m128","PMAXSB xmm2/m128, xmm1","pmaxsb xmm2/m128, xmm1"= ,"66 0F 38 3C /r","V","V","SSE4_1","","rw,r","","" +"PMAXSD xmm1, xmm2/m128","PMAXSD xmm2/m128, xmm1","pmaxsd xmm2/m128, xmm1"= ,"66 0F 38 3D /r","V","V","SSE4_1","","rw,r","","" +"PMAXSW mm1, mm2/m64","PMAXSW mm2/m64, mm1","pmaxsw mm2/m64, mm1","0F EE /= r","V","V","MMX","","rw,r","","" +"PMAXSW xmm1, xmm2/m128","PMAXSW xmm2/m128, xmm1","pmaxsw xmm2/m128, xmm1"= ,"66 0F EE /r","V","V","SSE2","","rw,r","","" +"PMAXUB mm1, mm2/m64","PMAXUB mm2/m64, mm1","pmaxub mm2/m64, mm1","0F DE /= r","V","V","MMX","","rw,r","","" +"PMAXUB xmm1, xmm2/m128","PMAXUB xmm2/m128, xmm1","pmaxub xmm2/m128, xmm1"= ,"66 0F DE /r","V","V","SSE2","","rw,r","","" +"PMAXUD xmm1, xmm2/m128","PMAXUD xmm2/m128, xmm1","pmaxud xmm2/m128, xmm1"= ,"66 0F 38 3F /r","V","V","SSE4_1","","rw,r","","" +"PMAXUW xmm1, xmm2/m128","PMAXUW xmm2/m128, xmm1","pmaxuw xmm2/m128, xmm1"= ,"66 0F 38 3E /r","V","V","SSE4_1","","rw,r","","" +"PMINSB xmm1, xmm2/m128","PMINSB xmm2/m128, xmm1","pminsb xmm2/m128, xmm1"= ,"66 0F 38 38 /r","V","V","SSE4_1","","rw,r","","" +"PMINSD xmm1, xmm2/m128","PMINSD xmm2/m128, xmm1","pminsd xmm2/m128, xmm1"= ,"66 0F 38 39 /r","V","V","SSE4_1","","rw,r","","" +"PMINSW mm1, mm2/m64","PMINSW mm2/m64, mm1","pminsw mm2/m64, mm1","0F EA /= r","V","V","MMX","","rw,r","","" +"PMINSW xmm1, xmm2/m128","PMINSW xmm2/m128, xmm1","pminsw xmm2/m128, xmm1"= ,"66 0F EA /r","V","V","SSE2","","rw,r","","" +"PMINUB mm1, mm2/m64","PMINUB mm2/m64, mm1","pminub mm2/m64, mm1","0F DA /= r","V","V","MMX","","rw,r","","" +"PMINUB xmm1, xmm2/m128","PMINUB xmm2/m128, xmm1","pminub xmm2/m128, xmm1"= ,"66 0F DA /r","V","V","SSE2","","rw,r","","" +"PMINUD xmm1, xmm2/m128","PMINUD xmm2/m128, xmm1","pminud xmm2/m128, xmm1"= ,"66 0F 38 3B /r","V","V","SSE4_1","","rw,r","","" +"PMINUW xmm1, xmm2/m128","PMINUW xmm2/m128, xmm1","pminuw xmm2/m128, xmm1"= ,"66 0F 38 3A /r","V","V","SSE4_1","","rw,r","","" +"PMOVMSKB r32, mm2","PMOVMSKB mm2, r32","pmovmskb mm2, r32","0F D7 /r","V"= ,"V","SSE","modrm_regonly","w,r","","" +"PMOVMSKB r32, xmm2","PMOVMSKB xmm2, r32","pmovmskb xmm2, r32","66 0F D7 /= r","V","V","SSE2","modrm_regonly","w,r","","" +"PMOVSXBD xmm1, xmm2/m32","PMOVSXBD xmm2/m32, xmm1","pmovsxbd xmm2/m32, xm= m1","66 0F 38 21 /r","V","V","SSE4_1","","w,r","","" +"PMOVSXBQ xmm1, xmm2/m16","PMOVSXBQ xmm2/m16, xmm1","pmovsxbq xmm2/m16, xm= m1","66 0F 38 22 /r","V","V","SSE4_1","","w,r","","" +"PMOVSXBW xmm1, xmm2/m64","PMOVSXBW xmm2/m64, xmm1","pmovsxbw xmm2/m64, xm= m1","66 0F 38 20 /r","V","V","SSE4_1","","w,r","","" +"PMOVSXDQ xmm1, xmm2/m64","PMOVSXDQ xmm2/m64, xmm1","pmovsxdq xmm2/m64, xm= m1","66 0F 38 25 /r","V","V","SSE4_1","","w,r","","" +"PMOVSXWD xmm1, 
xmm2/m64","PMOVSXWD xmm2/m64, xmm1","pmovsxwd xmm2/m64, xm= m1","66 0F 38 23 /r","V","V","SSE4_1","","w,r","","" +"PMOVSXWQ xmm1, xmm2/m32","PMOVSXWQ xmm2/m32, xmm1","pmovsxwq xmm2/m32, xm= m1","66 0F 38 24 /r","V","V","SSE4_1","","w,r","","" +"PMOVZXBD xmm1, xmm2/m32","PMOVZXBD xmm2/m32, xmm1","pmovzxbd xmm2/m32, xm= m1","66 0F 38 31 /r","V","V","SSE4_1","","w,r","","" +"PMOVZXBQ xmm1, xmm2/m16","PMOVZXBQ xmm2/m16, xmm1","pmovzxbq xmm2/m16, xm= m1","66 0F 38 32 /r","V","V","SSE4_1","","w,r","","" +"PMOVZXBW xmm1, xmm2/m64","PMOVZXBW xmm2/m64, xmm1","pmovzxbw xmm2/m64, xm= m1","66 0F 38 30 /r","V","V","SSE4_1","","w,r","","" +"PMOVZXDQ xmm1, xmm2/m64","PMOVZXDQ xmm2/m64, xmm1","pmovzxdq xmm2/m64, xm= m1","66 0F 38 35 /r","V","V","SSE4_1","","w,r","","" +"PMOVZXWD xmm1, xmm2/m64","PMOVZXWD xmm2/m64, xmm1","pmovzxwd xmm2/m64, xm= m1","66 0F 38 33 /r","V","V","SSE4_1","","w,r","","" +"PMOVZXWQ xmm1, xmm2/m32","PMOVZXWQ xmm2/m32, xmm1","pmovzxwq xmm2/m32, xm= m1","66 0F 38 34 /r","V","V","SSE4_1","","w,r","","" +"PMULDQ xmm1, xmm2/m128","PMULDQ xmm2/m128, xmm1","pmuldq xmm2/m128, xmm1"= ,"66 0F 38 28 /r","V","V","SSE4_1","","rw,r","","" +"PMULHRSW mm1, mm2/m64","PMULHRSW mm2/m64, mm1","pmulhrsw mm2/m64, mm1","0= F 38 0B /r","V","V","SSSE3","","rw,r","","" +"PMULHRSW xmm1, xmm2/m128","PMULHRSW xmm2/m128, xmm1","pmulhrsw xmm2/m128,= xmm1","66 0F 38 0B /r","V","V","SSSE3","","rw,r","","" +"PMULHRW mm1, mm2/m64","PMULHRW mm2/m64, mm1","pmulhrw mm2/m64, mm1","0F 0= F B7 /r","V","V","3DNOW","amd","rw,r","","" +"PMULHUW mm1, mm2/m64","PMULHUW mm2/m64, mm1","pmulhuw mm2/m64, mm1","0F E= 4 /r","V","V","MMX","","rw,r","","" +"PMULHUW xmm1, xmm2/m128","PMULHUW xmm2/m128, xmm1","pmulhuw xmm2/m128, xm= m1","66 0F E4 /r","V","V","SSE2","","rw,r","","" +"PMULHW mm1, mm2/m64","PMULHW mm2/m64, mm1","pmulhw mm2/m64, mm1","0F E5 /= r","V","V","MMX","","rw,r","","" +"PMULHW xmm1, xmm2/m128","PMULHW xmm2/m128, xmm1","pmulhw xmm2/m128, xmm1"= ,"66 0F E5 /r","V","V","SSE2","","rw,r","","" +"PMULLD xmm1, xmm2/m128","PMULLD xmm2/m128, xmm1","pmulld xmm2/m128, xmm1"= ,"66 0F 38 40 /r","V","V","SSE4_1","","rw,r","","" +"PMULLW mm1, mm2/m64","PMULLW mm2/m64, mm1","pmullw mm2/m64, mm1","0F D5 /= r","V","V","MMX","","rw,r","","" +"PMULLW xmm1, xmm2/m128","PMULLW xmm2/m128, xmm1","pmullw xmm2/m128, xmm1"= ,"66 0F D5 /r","V","V","SSE2","","rw,r","","" +"PMULUDQ mm1, mm2/m64","PMULULQ mm2/m64, mm1","pmuludq mm2/m64, mm1","0F F= 4 /r","V","V","SSE2","","rw,r","","" +"PMULUDQ xmm1, xmm2/m128","PMULULQ xmm2/m128, xmm1","pmuludq xmm2/m128, xm= m1","66 0F F4 /r","V","V","SSE2","","rw,r","","" +"POPAD","POPAL","popal","61","V","N.S.","","operand32","","","" +"POPA","POPAW","popaw","61","V","N.S.","","operand16","","","" +"POPCNT r32, r/m32","POPCNTL r/m32, r32","popcntl r/m32, r32","F3 0F B8 /r= ","V","V","POPCNT","operand32","w,r","Y","32" +"POPCNT r64, r/m64","POPCNTQ r/m64, r64","popcntq r/m64, r64","F3 REX.W 0F= B8 /r","N.S.","V","POPCNT","","w,r","Y","64" +"POPCNT r16, r/m16","POPCNTW r/m16, r16","popcntw r/m16, r16","F3 0F B8 /r= ","V","V","POPCNT","operand16","w,r","Y","16" +"POPFD","POPFL","popfl","9D","V","N.S.","","operand32","","","" +"POPFQ","POPFQ","popfq","9D","N.S.","V","","default64","","","" +"POPF","POPFW","popfw","9D","V","V","","operand16","","","" +"POP r/m32","POPL r/m32","popl r/m32","8F /0","V","N.S.","","operand32","w= ","Y","32" +"POP r32op","POPL r32op","popl r32op","58+rd","V","N.S.","","operand32","w= ","Y","32" +"POP r/m64","POPQ r/m64","popq r/m64","8F /0","N.S.","V","","default64","w= ","Y","64" 
+"POP r64op","POPQ r64op","popq r64op","58+ro","N.S.","V","","default64","w= ","Y","64" +"POP r/m16","POPW r/m16","popw r/m16","8F /0","V","V","","operand16","w","= Y","16" +"POP r16op","POPW r16op","popw r16op","58+rw","V","V","","operand16","w","= Y","16" +"POP DS","POPW/POPL/POPQ DS","popw/popl/popq DS","1F","V","N.S.","","","w"= ,"Y","" +"POP ES","POPW/POPL/POPQ ES","popw/popl/popq ES","07","V","N.S.","","","w"= ,"Y","" +"POP FS","POPW/POPL/POPQ FS","popw/popl/popq FS","0F A1","N.S.","V","","de= fault64","w","Y","" +"POP FS","POPW/POPL/POPQ FS","popw/popl/popq FS","0F A1","V","N.S.","","op= erand32","w","Y","" +"POP FS","POPW/POPL/POPQ FS","popw/popl/popq FS","0F A1","V","V","","opera= nd16","w","Y","" +"POP GS","POPW/POPL/POPQ GS","popw/popl/popq GS","0F A9","N.S.","V","","de= fault64","w","Y","" +"POP GS","POPW/POPL/POPQ GS","popw/popl/popq GS","0F A9","V","V","","opera= nd16","w","Y","" +"POP GS","POPW/POPL/POPQ GS","popw/popl/popq GS","0F A9","V","N.S.","","op= erand32","w","Y","" +"POP SS","POPW/POPL/POPQ SS","popw/popl/popq SS","17","V","N.S.","","","w"= ,"Y","" +"POR mm1, mm2/m64","POR mm2/m64, mm1","por mm2/m64, mm1","0F EB /r","V","V= ","MMX","","rw,r","","" +"POR xmm1, xmm2/m128","POR xmm2/m128, xmm1","por xmm2/m128, xmm1","66 0F E= B /r","V","V","SSE2","","rw,r","","" +"PREFETCHNTA m8","PREFETCHNTA m8","prefetchnta m8","0F 18 /0","V","V","","= modrm_memonly","r","","" +"PREFETCHT0 m8","PREFETCHT0 m8","prefetcht0 m8","0F 18 /1","V","V","","mod= rm_memonly","r","","" +"PREFETCHT1 m8","PREFETCHT1 m8","prefetcht1 m8","0F 18 /2","V","V","","mod= rm_memonly","r","","" +"PREFETCHT2 m8","PREFETCHT2 m8","prefetcht2 m8","0F 18 /3","V","V","","mod= rm_memonly","r","","" +"PREFETCHW m8","PREFETCHW m8","prefetchw m8","0F 0D /1","V","V","PRFCHW","= modrm_memonly","r","","" +"PREFETCHWT1 m8","PREFETCHWT1 m8","prefetchwt1 m8","0F 0D /2","V","V","PRE= FETCHWT1","modrm_memonly","r","","" +"PREFETCHW_ALIAS m8","PREFETCHW_ALIAS m8","prefetchw_alias m8","0F 0D /3",= "V","V","PRFCHW","modrm_memonly","r","","" +"PREFETCH_EXCLUSIVE m8","PREFETCH_EXCLUSIVE m8","prefetch_exclusive m8","0= F 0D /0","V","V","PRFCHW","modrm_memonly","r","","" +"PREFETCH_RESERVED m8","PREFETCH_RESERVED m8","prefetch_reserved m8","0F 0= D /2","V","V","PRFCHW","modrm_memonly","r","Y","" +"PREFETCH_RESERVED m8","PREFETCH_RESERVED m8","prefetch_reserved m8","0F 0= D /4","V","V","PRFCHW","modrm_memonly","r","Y","" +"PREFETCH_RESERVED m8","PREFETCH_RESERVED m8","prefetch_reserved m8","0F 0= D /5","V","V","PRFCHW","modrm_memonly","r","Y","" +"PREFETCH_RESERVED m8","PREFETCH_RESERVED m8","prefetch_reserved m8","0F 0= D /6","V","V","PRFCHW","modrm_memonly","r","Y","" +"PREFETCH_RESERVED m8","PREFETCH_RESERVED m8","prefetch_reserved m8","0F 0= D /7","V","V","PRFCHW","modrm_memonly","r","Y","" +"PSADBW mm1, mm2/m64","PSADBW mm2/m64, mm1","psadbw mm2/m64, mm1","0F F6 /= r","V","V","MMX","","rw,r","","" +"PSADBW xmm1, xmm2/m128","PSADBW xmm2/m128, xmm1","psadbw xmm2/m128, xmm1"= ,"66 0F F6 /r","V","V","SSE2","","rw,r","","" +"PSHUFB mm1, mm2/m64","PSHUFB mm2/m64, mm1","pshufb mm2/m64, mm1","0F 38 0= 0 /r","V","V","SSSE3","","rw,r","","" +"PSHUFB xmm1, xmm2/m128","PSHUFB xmm2/m128, xmm1","pshufb xmm2/m128, xmm1"= ,"66 0F 38 00 /r","V","V","SSSE3","","rw,r","","" +"PSHUFD xmm1, xmm2/m128, imm8u","PSHUFD imm8u, xmm2/m128, xmm1","pshufd im= m8u, xmm2/m128, xmm1","66 0F 70 /r ib","V","V","SSE2","","w,r,r","","" +"PSHUFHW xmm1, xmm2/m128, imm8u","PSHUFHW imm8u, xmm2/m128, xmm1","pshufhw= imm8u, xmm2/m128, xmm1","F3 0F 70 /r 
ib","V","V","SSE2","","w,r,r","","" +"PSHUFLW xmm1, xmm2/m128, imm8u","PSHUFLW imm8u, xmm2/m128, xmm1","pshuflw= imm8u, xmm2/m128, xmm1","F2 0F 70 /r ib","V","V","SSE2","","w,r,r","","" +"PSHUFW mm1, mm2/m64, imm8u","PSHUFW imm8u, mm2/m64, mm1","pshufw imm8u, m= m2/m64, mm1","0F 70 /r ib","V","V","MMX","","w,r,r","","" +"PSIGNB mm1, mm2/m64","PSIGNB mm2/m64, mm1","psignb mm2/m64, mm1","0F 38 0= 8 /r","V","V","SSSE3","","rw,r","","" +"PSIGNB xmm1, xmm2/m128","PSIGNB xmm2/m128, xmm1","psignb xmm2/m128, xmm1"= ,"66 0F 38 08 /r","V","V","SSSE3","","rw,r","","" +"PSIGND mm1, mm2/m64","PSIGND mm2/m64, mm1","psignd mm2/m64, mm1","0F 38 0= A /r","V","V","SSSE3","","rw,r","","" +"PSIGND xmm1, xmm2/m128","PSIGND xmm2/m128, xmm1","psignd xmm2/m128, xmm1"= ,"66 0F 38 0A /r","V","V","SSSE3","","rw,r","","" +"PSIGNW mm1, mm2/m64","PSIGNW mm2/m64, mm1","psignw mm2/m64, mm1","0F 38 0= 9 /r","V","V","SSSE3","","rw,r","","" +"PSIGNW xmm1, xmm2/m128","PSIGNW xmm2/m128, xmm1","psignw xmm2/m128, xmm1"= ,"66 0F 38 09 /r","V","V","SSSE3","","rw,r","","" +"PSLLD mm2, imm8u","PSLLL imm8u, mm2","pslld imm8u, mm2","0F 72 /6 ib","V"= ,"V","MMX","modrm_regonly","rw,r","","" +"PSLLD xmm2, imm8u","PSLLL imm8u, xmm2","pslld imm8u, xmm2","66 0F 72 /6 i= b","V","V","SSE2","modrm_regonly","rw,r","","" +"PSLLD mm1, mm2/m64","PSLLL mm2/m64, mm1","pslld mm2/m64, mm1","0F F2 /r",= "V","V","MMX","","rw,r","","" +"PSLLD xmm1, xmm2/m128","PSLLL xmm2/m128, xmm1","pslld xmm2/m128, xmm1","6= 6 0F F2 /r","V","V","SSE2","","rw,r","","" +"PSLLDQ xmm2, imm8u","PSLLO imm8u, xmm2","pslldq imm8u, xmm2","66 0F 73 /7= ib","V","V","SSE2","modrm_regonly","rw,r","","" +"PSLLQ mm2, imm8u","PSLLQ imm8u, mm2","psllq imm8u, mm2","0F 73 /6 ib","V"= ,"V","MMX","modrm_regonly","rw,r","","" +"PSLLQ xmm2, imm8u","PSLLQ imm8u, xmm2","psllq imm8u, xmm2","66 0F 73 /6 i= b","V","V","SSE2","modrm_regonly","rw,r","","" +"PSLLQ mm1, mm2/m64","PSLLQ mm2/m64, mm1","psllq mm2/m64, mm1","0F F3 /r",= "V","V","MMX","","rw,r","","" +"PSLLQ xmm1, xmm2/m128","PSLLQ xmm2/m128, xmm1","psllq xmm2/m128, xmm1","6= 6 0F F3 /r","V","V","SSE2","","rw,r","","" +"PSLLW mm2, imm8u","PSLLW imm8u, mm2","psllw imm8u, mm2","0F 71 /6 ib","V"= ,"V","MMX","modrm_regonly","rw,r","","" +"PSLLW xmm2, imm8u","PSLLW imm8u, xmm2","psllw imm8u, xmm2","66 0F 71 /6 i= b","V","V","SSE2","modrm_regonly","rw,r","","" +"PSLLW mm1, mm2/m64","PSLLW mm2/m64, mm1","psllw mm2/m64, mm1","0F F1 /r",= "V","V","MMX","","rw,r","","" +"PSLLW xmm1, xmm2/m128","PSLLW xmm2/m128, xmm1","psllw xmm2/m128, xmm1","6= 6 0F F1 /r","V","V","SSE2","","rw,r","","" +"PSRAD mm2, imm8u","PSRAL imm8u, mm2","psrad imm8u, mm2","0F 72 /4 ib","V"= ,"V","MMX","modrm_regonly","rw,r","","" +"PSRAD xmm2, imm8u","PSRAL imm8u, xmm2","psrad imm8u, xmm2","66 0F 72 /4 i= b","V","V","SSE2","modrm_regonly","rw,r","","" +"PSRAD mm1, mm2/m64","PSRAL mm2/m64, mm1","psrad mm2/m64, mm1","0F E2 /r",= "V","V","MMX","","rw,r","","" +"PSRAD xmm1, xmm2/m128","PSRAL xmm2/m128, xmm1","psrad xmm2/m128, xmm1","6= 6 0F E2 /r","V","V","SSE2","","rw,r","","" +"PSRAW mm2, imm8u","PSRAW imm8u, mm2","psraw imm8u, mm2","0F 71 /4 ib","V"= ,"V","MMX","modrm_regonly","rw,r","","" +"PSRAW xmm2, imm8u","PSRAW imm8u, xmm2","psraw imm8u, xmm2","66 0F 71 /4 i= b","V","V","SSE2","modrm_regonly","rw,r","","" +"PSRAW mm1, mm2/m64","PSRAW mm2/m64, mm1","psraw mm2/m64, mm1","0F E1 /r",= "V","V","MMX","","rw,r","","" +"PSRAW xmm1, xmm2/m128","PSRAW xmm2/m128, xmm1","psraw xmm2/m128, xmm1","6= 6 0F E1 /r","V","V","SSE2","","rw,r","","" +"PSRLD mm2, imm8u","PSRLL imm8u, 
mm2","psrld imm8u, mm2","0F 72 /2 ib","V"= ,"V","MMX","modrm_regonly","rw,r","","" +"PSRLD xmm2, imm8u","PSRLL imm8u, xmm2","psrld imm8u, xmm2","66 0F 72 /2 i= b","V","V","SSE2","modrm_regonly","rw,r","","" +"PSRLD mm1, mm2/m64","PSRLL mm2/m64, mm1","psrld mm2/m64, mm1","0F D2 /r",= "V","V","MMX","","rw,r","","" +"PSRLD xmm1, xmm2/m128","PSRLL xmm2/m128, xmm1","psrld xmm2/m128, xmm1","6= 6 0F D2 /r","V","V","SSE2","","rw,r","","" +"PSRLDQ xmm2, imm8u","PSRLO imm8u, xmm2","psrldq imm8u, xmm2","66 0F 73 /3= ib","V","V","SSE2","modrm_regonly","rw,r","","" +"PSRLQ mm2, imm8u","PSRLQ imm8u, mm2","psrlq imm8u, mm2","0F 73 /2 ib","V"= ,"V","MMX","modrm_regonly","rw,r","","" +"PSRLQ xmm2, imm8u","PSRLQ imm8u, xmm2","psrlq imm8u, xmm2","66 0F 73 /2 i= b","V","V","SSE2","modrm_regonly","rw,r","","" +"PSRLQ mm1, mm2/m64","PSRLQ mm2/m64, mm1","psrlq mm2/m64, mm1","0F D3 /r",= "V","V","MMX","","rw,r","","" +"PSRLQ xmm1, xmm2/m128","PSRLQ xmm2/m128, xmm1","psrlq xmm2/m128, xmm1","6= 6 0F D3 /r","V","V","SSE2","","rw,r","","" +"PSRLW mm2, imm8u","PSRLW imm8u, mm2","psrlw imm8u, mm2","0F 71 /2 ib","V"= ,"V","MMX","modrm_regonly","rw,r","","" +"PSRLW xmm2, imm8u","PSRLW imm8u, xmm2","psrlw imm8u, xmm2","66 0F 71 /2 i= b","V","V","SSE2","modrm_regonly","rw,r","","" +"PSRLW mm1, mm2/m64","PSRLW mm2/m64, mm1","psrlw mm2/m64, mm1","0F D1 /r",= "V","V","MMX","","rw,r","","" +"PSRLW xmm1, xmm2/m128","PSRLW xmm2/m128, xmm1","psrlw xmm2/m128, xmm1","6= 6 0F D1 /r","V","V","SSE2","","rw,r","","" +"PSUBB mm1, mm2/m64","PSUBB mm2/m64, mm1","psubb mm2/m64, mm1","0F F8 /r",= "V","V","MMX","","rw,r","","" +"PSUBB xmm1, xmm2/m128","PSUBB xmm2/m128, xmm1","psubb xmm2/m128, xmm1","6= 6 0F F8 /r","V","V","SSE2","","rw,r","","" +"PSUBD mm1, mm2/m64","PSUBL mm2/m64, mm1","psubd mm2/m64, mm1","0F FA /r",= "V","V","MMX","","rw,r","","" +"PSUBD xmm1, xmm2/m128","PSUBL xmm2/m128, xmm1","psubd xmm2/m128, xmm1","6= 6 0F FA /r","V","V","SSE2","","rw,r","","" +"PSUBQ mm1, mm2/m64","PSUBQ mm2/m64, mm1","psubq mm2/m64, mm1","0F FB /r",= "V","V","SSE2","","rw,r","","" +"PSUBQ xmm1, xmm2/m128","PSUBQ xmm2/m128, xmm1","psubq xmm2/m128, xmm1","6= 6 0F FB /r","V","V","SSE2","","rw,r","","" +"PSUBSB mm1, mm2/m64","PSUBSB mm2/m64, mm1","psubsb mm2/m64, mm1","0F E8 /= r","V","V","MMX","","rw,r","","" +"PSUBSB xmm1, xmm2/m128","PSUBSB xmm2/m128, xmm1","psubsb xmm2/m128, xmm1"= ,"66 0F E8 /r","V","V","SSE2","","rw,r","","" +"PSUBSW mm1, mm2/m64","PSUBSW mm2/m64, mm1","psubsw mm2/m64, mm1","0F E9 /= r","V","V","MMX","","rw,r","","" +"PSUBSW xmm1, xmm2/m128","PSUBSW xmm2/m128, xmm1","psubsw xmm2/m128, xmm1"= ,"66 0F E9 /r","V","V","SSE2","","rw,r","","" +"PSUBUSB mm1, mm2/m64","PSUBUSB mm2/m64, mm1","psubusb mm2/m64, mm1","0F D= 8 /r","V","V","MMX","","rw,r","","" +"PSUBUSB xmm1, xmm2/m128","PSUBUSB xmm2/m128, xmm1","psubusb xmm2/m128, xm= m1","66 0F D8 /r","V","V","SSE2","","rw,r","","" +"PSUBUSW mm1, mm2/m64","PSUBUSW mm2/m64, mm1","psubusw mm2/m64, mm1","0F D= 9 /r","V","V","MMX","","rw,r","","" +"PSUBUSW xmm1, xmm2/m128","PSUBUSW xmm2/m128, xmm1","psubusw xmm2/m128, xm= m1","66 0F D9 /r","V","V","SSE2","","rw,r","","" +"PSUBW mm1, mm2/m64","PSUBW mm2/m64, mm1","psubw mm2/m64, mm1","0F F9 /r",= "V","V","MMX","","rw,r","","" +"PSUBW xmm1, xmm2/m128","PSUBW xmm2/m128, xmm1","psubw xmm2/m128, xmm1","6= 6 0F F9 /r","V","V","SSE2","","rw,r","","" +"PSWAPD mm1, mm2/m64","PSWAPD mm2/m64, mm1","pswapd mm2/m64, mm1","0F 0F B= B /r","V","V","3DNOW","amd","rw,r","","" +"PTEST xmm1, xmm2/m128","PTEST xmm2/m128, xmm1","ptest xmm2/m128, xmm1","6= 6 0F 38 17 
/r","V","V","SSE4_1","","r,r","","" +"PTWRITE r/m32","PTWRITEL r/m32","ptwritel r/m32","F3 0F AE /4","V","V",""= ,"operand16,operand32","r","Y","32" +"PTWRITE r/m64","PTWRITEQ r/m64","ptwriteq r/m64","F3 REX.W 0F AE /4","N.S= .","V","","","r","Y","64" +"PUNPCKHBW mm1, mm2/m64","PUNPCKHBW mm2/m64, mm1","punpckhbw mm2/m64, mm1"= ,"0F 68 /r","V","V","MMX","","rw,r","","" +"PUNPCKHBW xmm1, xmm2/m128","PUNPCKHBW xmm2/m128, xmm1","punpckhbw xmm2/m1= 28, xmm1","66 0F 68 /r","V","V","SSE2","","rw,r","","" +"PUNPCKHDQ mm1, mm2/m64","PUNPCKHLQ mm2/m64, mm1","punpckhdq mm2/m64, mm1"= ,"0F 6A /r","V","V","MMX","","rw,r","","" +"PUNPCKHDQ xmm1, xmm2/m128","PUNPCKHLQ xmm2/m128, xmm1","punpckhdq xmm2/m1= 28, xmm1","66 0F 6A /r","V","V","SSE2","","rw,r","","" +"PUNPCKHQDQ xmm1, xmm2/m128","PUNPCKHQDQ xmm2/m128, xmm1","punpckhqdq xmm2= /m128, xmm1","66 0F 6D /r","V","V","SSE2","","rw,r","","" +"PUNPCKHWD mm1, mm2/m64","PUNPCKHWL mm2/m64, mm1","punpckhwd mm2/m64, mm1"= ,"0F 69 /r","V","V","MMX","","rw,r","","" +"PUNPCKHWD xmm1, xmm2/m128","PUNPCKHWL xmm2/m128, xmm1","punpckhwd xmm2/m1= 28, xmm1","66 0F 69 /r","V","V","SSE2","","rw,r","","" +"PUNPCKLBW mm1, mm2/m32","PUNPCKLBW mm2/m32, mm1","punpcklbw mm2/m32, mm1"= ,"0F 60 /r","V","V","MMX","","rw,r","","" +"PUNPCKLBW xmm1, xmm2/m128","PUNPCKLBW xmm2/m128, xmm1","punpcklbw xmm2/m1= 28, xmm1","66 0F 60 /r","V","V","SSE2","","rw,r","","" +"PUNPCKLDQ mm1, mm2/m32","PUNPCKLLQ mm2/m32, mm1","punpckldq mm2/m32, mm1"= ,"0F 62 /r","V","V","MMX","","rw,r","","" +"PUNPCKLDQ xmm1, xmm2/m128","PUNPCKLLQ xmm2/m128, xmm1","punpckldq xmm2/m1= 28, xmm1","66 0F 62 /r","V","V","SSE2","","rw,r","","" +"PUNPCKLQDQ xmm1, xmm2/m128","PUNPCKLQDQ xmm2/m128, xmm1","punpcklqdq xmm2= /m128, xmm1","66 0F 6C /r","V","V","SSE2","","rw,r","","" +"PUNPCKLWD mm1, mm2/m32","PUNPCKLWL mm2/m32, mm1","punpcklwd mm2/m32, mm1"= ,"0F 61 /r","V","V","MMX","","rw,r","","" +"PUNPCKLWD xmm1, xmm2/m128","PUNPCKLWL xmm2/m128, xmm1","punpcklwd xmm2/m1= 28, xmm1","66 0F 61 /r","V","V","SSE2","","rw,r","","" +"PUSHAD","PUSHAL","pushal","60","V","N.S.","","operand32","","","" +"PUSHA","PUSHAW","pushaw","60","V","N.S.","","operand16","","","" +"PUSHFD","PUSHFL","pushfl","9C","V","N.S.","","operand32","","","" +"PUSHFQ","PUSHFQ","pushfq","9C","N.S.","V","","default64","","","" +"PUSHF","PUSHFW","pushfw","9C","V","V","","operand16","","","" +"PUSH r/m32","PUSHL r/m32","pushl r/m32","FF /6","V","N.S.","","operand32"= ,"r","Y","32" +"PUSH r32op","PUSHL r32op","pushl r32op","50+rd","V","N.S.","","operand32"= ,"r","Y","32" +"PUSH r/m64","PUSHQ r/m64","pushq r/m64","FF /6","N.S.","V","","default64"= ,"r","Y","64" +"PUSH r64op","PUSHQ r64op","pushq r64op","50+ro","N.S.","V","","default64"= ,"r","Y","64" +"PUSH imm16","PUSHW imm16","pushw imm16","68 iw","V","V","","operand16","r= ","Y","" +"PUSH r/m16","PUSHW r/m16","pushw r/m16","FF /6","V","V","","operand16","r= ","Y","16" +"PUSH r16op","PUSHW r16op","pushw r16op","50+rw","V","V","","operand16","r= ","Y","16" +"PUSH CS","PUSHW/PUSHL/PUSHQ CS","pushw/pushl/pushq CS","0E","V","N.S.",""= ,"","r","Y","" +"PUSH DS","PUSHW/PUSHL/PUSHQ DS","pushw/pushl/pushq DS","1E","V","N.S.",""= ,"","r","Y","" +"PUSH ES","PUSHW/PUSHL/PUSHQ ES","pushw/pushl/pushq ES","06","V","N.S.",""= ,"","r","Y","" +"PUSH FS","PUSHW/PUSHL/PUSHQ FS","pushw/pushl/pushq FS","0F A0","V","V",""= ,"operand16","r","Y","" +"PUSH FS","PUSHW/PUSHL/PUSHQ FS","pushw/pushl/pushq FS","0F A0","N.S.","V"= ,"","default64","r","Y","" +"PUSH FS","PUSHW/PUSHL/PUSHQ FS","pushw/pushl/pushq FS","0F A0","V","N.S."= 
,"","operand32","r","Y","" +"PUSH GS","PUSHW/PUSHL/PUSHQ GS","pushw/pushl/pushq GS","0F A8","N.S.","V"= ,"","default64","r","Y","" +"PUSH GS","PUSHW/PUSHL/PUSHQ GS","pushw/pushl/pushq GS","0F A8","V","N.S."= ,"","operand32","r","Y","" +"PUSH GS","PUSHW/PUSHL/PUSHQ GS","pushw/pushl/pushq GS","0F A8","V","V",""= ,"operand16","r","Y","" +"PUSH SS","PUSHW/PUSHL/PUSHQ SS","pushw/pushl/pushq SS","16","V","N.S.",""= ,"","r","Y","" +"PUSH imm8","PUSHW/PUSHL/PUSHQ imm8","pushw/pushl/pushq imm8","6A ib","V",= "N.S.","","operand32","r","Y","" +"PUSH imm8","PUSHW/PUSHL/PUSHQ imm8","pushw/pushl/pushq imm8","6A ib","N.S= .","V","","default64","r","Y","" +"PUSH imm8","PUSHW/PUSHL/PUSHQ imm8","pushw/pushl/pushq imm8","6A ib","V",= "V","","operand16","r","Y","" +"PXOR mm1, mm2/m64","PXOR mm2/m64, mm1","pxor mm2/m64, mm1","0F EF /r","V"= ,"V","MMX","","rw,r","","" +"PXOR xmm1, xmm2/m128","PXOR xmm2/m128, xmm1","pxor xmm2/m128, xmm1","66 0= F EF /r","V","V","SSE2","","rw,r","","" +"RCL r/m8, 1","RCLB 1, r/m8","rclb 1, r/m8","D0 /2","V","V","","","rw,r","= Y","8" +"RCL r/m8, 1","RCLB 1, r/m8","rclb 1, r/m8","REX D0 /2","N.E.","V","","pse= udo64","w,r","Y","8" +"RCL r/m8, CL","RCLB CL, r/m8","rclb CL, r/m8","D2 /2","V","V","","","rw,r= ","Y","8" +"RCL r/m8, CL","RCLB CL, r/m8","rclb CL, r/m8","REX D2 /2","N.E.","V","","= pseudo64","w,r","Y","8" +"RCL r/m8, imm8","RCLB imm8, r/m8","rclb imm8, r/m8","REX C0 /2 ib","N.E."= ,"V","","pseudo64","w,r","Y","8" +"RCL r/m8, imm8u","RCLB imm8u, r/m8","rclb imm8u, r/m8","C0 /2 ib","V","V"= ,"","","rw,r","Y","8" +"RCL r/m32, 1","RCLL 1, r/m32","rcll 1, r/m32","D1 /2","V","V","","operand= 32","rw,r","Y","32" +"RCL r/m32, CL","RCLL CL, r/m32","rcll CL, r/m32","D3 /2","V","V","","oper= and32","rw,r","Y","32" +"RCL r/m32, imm8u","RCLL imm8u, r/m32","rcll imm8u, r/m32","C1 /2 ib","V",= "V","","operand32","rw,r","Y","32" +"RCL r/m64, 1","RCLQ 1, r/m64","rclq 1, r/m64","REX.W D1 /2","N.S.","V",""= ,"","rw,r","Y","64" +"RCL r/m64, CL","RCLQ CL, r/m64","rclq CL, r/m64","REX.W D3 /2","N.S.","V"= ,"","","rw,r","Y","64" +"RCL r/m64, imm8u","RCLQ imm8u, r/m64","rclq imm8u, r/m64","REX.W C1 /2 ib= ","N.S.","V","","","rw,r","Y","64" +"RCL r/m16, 1","RCLW 1, r/m16","rclw 1, r/m16","D1 /2","V","V","","operand= 16","rw,r","Y","16" +"RCL r/m16, CL","RCLW CL, r/m16","rclw CL, r/m16","D3 /2","V","V","","oper= and16","rw,r","Y","16" +"RCL r/m16, imm8u","RCLW imm8u, r/m16","rclw imm8u, r/m16","C1 /2 ib","V",= "V","","operand16","rw,r","Y","16" +"RCPPS xmm1, xmm2/m128","RCPPS xmm2/m128, xmm1","rcpps xmm2/m128, xmm1","0= F 53 /r","V","V","SSE","","w,r","","" +"RCPSS xmm1, xmm2/m32","RCPSS xmm2/m32, xmm1","rcpss xmm2/m32, xmm1","F3 0= F 53 /r","V","V","SSE","","w,r","","" +"RCR r/m8, 1","RCRB 1, r/m8","rcrb 1, r/m8","D0 /3","V","V","","","rw,r","= Y","8" +"RCR r/m8, 1","RCRB 1, r/m8","rcrb 1, r/m8","REX D0 /3","N.E.","V","","pse= udo64","w,r","Y","8" +"RCR r/m8, CL","RCRB CL, r/m8","rcrb CL, r/m8","D2 /3","V","V","","","rw,r= ","Y","8" +"RCR r/m8, CL","RCRB CL, r/m8","rcrb CL, r/m8","REX D2 /3","N.E.","V","","= pseudo64","w,r","Y","8" +"RCR r/m8, imm8","RCRB imm8, r/m8","rcrb imm8, r/m8","REX C0 /3 ib","N.E."= ,"V","","pseudo64","w,r","Y","8" +"RCR r/m8, imm8u","RCRB imm8u, r/m8","rcrb imm8u, r/m8","C0 /3 ib","V","V"= ,"","","rw,r","Y","8" +"RCR r/m32, 1","RCRL 1, r/m32","rcrl 1, r/m32","D1 /3","V","V","","operand= 32","rw,r","Y","32" +"RCR r/m32, CL","RCRL CL, r/m32","rcrl CL, r/m32","D3 /3","V","V","","oper= and32","rw,r","Y","32" +"RCR r/m32, imm8u","RCRL imm8u, r/m32","rcrl imm8u, r/m32","C1 /3 
ib","V",= "V","","operand32","rw,r","Y","32" +"RCR r/m64, 1","RCRQ 1, r/m64","rcrq 1, r/m64","REX.W D1 /3","N.S.","V",""= ,"","rw,r","Y","64" +"RCR r/m64, CL","RCRQ CL, r/m64","rcrq CL, r/m64","REX.W D3 /3","N.S.","V"= ,"","","rw,r","Y","64" +"RCR r/m64, imm8u","RCRQ imm8u, r/m64","rcrq imm8u, r/m64","REX.W C1 /3 ib= ","N.S.","V","","","rw,r","Y","64" +"RCR r/m16, 1","RCRW 1, r/m16","rcrw 1, r/m16","D1 /3","V","V","","operand= 16","rw,r","Y","16" +"RCR r/m16, CL","RCRW CL, r/m16","rcrw CL, r/m16","D3 /3","V","V","","oper= and16","rw,r","Y","16" +"RCR r/m16, imm8u","RCRW imm8u, r/m16","rcrw imm8u, r/m16","C1 /3 ib","V",= "V","","operand16","rw,r","Y","16" +"RDFSBASE rmr32","RDFSBASEL rmr32","rdfsbase rmr32","F3 0F AE /0","N.S.","= V","FSGSBASE","modrm_regonly,operand16,operand32","w","Y","32" +"RDFSBASE rmr64","RDFSBASEQ rmr64","rdfsbase rmr64","F3 REX.W 0F AE /0","N= .S.","V","FSGSBASE","modrm_regonly","w","Y","64" +"RDGSBASE rmr32","RDGSBASEL rmr32","rdgsbase rmr32","F3 0F AE /1","N.S.","= V","FSGSBASE","modrm_regonly,operand16,operand32","w","Y","32" +"RDGSBASE rmr64","RDGSBASEQ rmr64","rdgsbase rmr64","F3 REX.W 0F AE /1","N= .S.","V","FSGSBASE","modrm_regonly","w","Y","64" +"RDMSR","RDMSR","rdmsr","0F 32","V","V","Pentium","","","","" +"RDPKRU","RDPKRU","rdpkru","0F 01 EE","V","V","PKU","","","","" +"RDPMC","RDPMC","rdpmc","0F 33","V","V","","","","","" +"RDRAND rmr32","RDRANDL rmr32","rdrand rmr32","0F C7 /6","V","V","RDRAND",= "modrm_regonly,operand32","w","Y","32" +"RDRAND rmr64","RDRANDQ rmr64","rdrand rmr64","REX.W 0F C7 /6","N.S.","V",= "RDRAND","modrm_regonly","w","Y","64" +"RDRAND rmr16","RDRANDW rmr16","rdrand rmr16","0F C7 /6","V","V","RDRAND",= "modrm_regonly,operand16","w","Y","16" +"RDSEED rmr32","RDSEEDL rmr32","rdseed rmr32","0F C7 /7","V","V","RDSEED",= "modrm_regonly,operand32","w","Y","32" +"RDSEED rmr64","RDSEEDQ rmr64","rdseed rmr64","REX.W 0F C7 /7","N.S.","V",= "RDSEED","modrm_regonly","w","Y","64" +"RDSEED rmr16","RDSEEDW rmr16","rdseed rmr16","0F C7 /7","V","V","RDSEED",= "modrm_regonly,operand16","w","Y","16" +"RDSSPD rmr32","RDSSPD rmr32","rdsspd rmr32","F3 0F 1E /1","V","V","CET","= modrm_regonly,operand16,operand32","w","","" +"RDSSPQ rmr64","RDSSPQ rmr64","rdsspq rmr64","F3 REX.W 0F 1E /1","N.S.","V= ","CET","modrm_regonly","w","","" +"RDTSC","RDTSC","rdtsc","0F 31","V","V","Pentium","","","","" +"RDTSCP","RDTSCP","rdtscp","0F 01 F9","V","V","RDTSCP","","","","" +"RET_FAR","RETFW/RETFL/RETFQ","lretw/lretl/lretl","CB","V","V","","","",""= ,"" +"RET_FAR imm16u","RETFW/RETFL/RETFQ imm16u","lretw/lretl/lretl imm16u","CA= iw","V","V","","","r","","" +"RET","RETW/RETL/RETQ","retw/retl/retq","C3","N.S.","V","","default64","",= "","" +"RET","RETW/RETL/RETQ","retw/retl/retq","C3","V","N.S.","","","","","" +"RET imm16u","RETW/RETL/RETQ imm16u","retw/retl/retq imm16u","C2 iw","N.S.= ","V","","default64","r","","" +"RET imm16u","RETW/RETL/RETQ imm16u","retw/retl/retq imm16u","C2 iw","V","= N.S.","","","r","","" +"ROL r/m8, 1","ROLB 1, r/m8","rolb 1, r/m8","D0 /0","V","V","","","rw,r","= Y","8" +"ROL r/m8, 1","ROLB 1, r/m8","rolb 1, r/m8","REX D0 /0","N.E.","V","","pse= udo64","w,r","Y","8" +"ROL r/m8, CL","ROLB CL, r/m8","rolb CL, r/m8","D2 /0","V","V","","","rw,r= ","Y","8" +"ROL r/m8, CL","ROLB CL, r/m8","rolb CL, r/m8","REX D2 /0","N.E.","V","","= pseudo64","w,r","Y","8" +"ROL r/m8, imm8","ROLB imm8, r/m8","rolb imm8, r/m8","REX C0 /0 ib","N.E."= ,"V","","pseudo64","w,r","Y","8" +"ROL r/m8, imm8u","ROLB imm8u, r/m8","rolb imm8u, r/m8","C0 /0 ib","V","V"= 
,"","","rw,r","Y","8" +"ROL r/m32, 1","ROLL 1, r/m32","roll 1, r/m32","D1 /0","V","V","","operand= 32","rw,r","Y","32" +"ROL r/m32, CL","ROLL CL, r/m32","roll CL, r/m32","D3 /0","V","V","","oper= and32","rw,r","Y","32" +"ROL r/m32, imm8u","ROLL imm8u, r/m32","roll imm8u, r/m32","C1 /0 ib","V",= "V","","operand32","rw,r","Y","32" +"ROL r/m64, 1","ROLQ 1, r/m64","rolq 1, r/m64","REX.W D1 /0","N.S.","V",""= ,"","rw,r","Y","64" +"ROL r/m64, CL","ROLQ CL, r/m64","rolq CL, r/m64","REX.W D3 /0","N.S.","V"= ,"","","rw,r","Y","64" +"ROL r/m64, imm8u","ROLQ imm8u, r/m64","rolq imm8u, r/m64","REX.W C1 /0 ib= ","N.S.","V","","","rw,r","Y","64" +"ROL r/m16, 1","ROLW 1, r/m16","rolw 1, r/m16","D1 /0","V","V","","operand= 16","rw,r","Y","16" +"ROL r/m16, CL","ROLW CL, r/m16","rolw CL, r/m16","D3 /0","V","V","","oper= and16","rw,r","Y","16" +"ROL r/m16, imm8u","ROLW imm8u, r/m16","rolw imm8u, r/m16","C1 /0 ib","V",= "V","","operand16","rw,r","Y","16" +"ROR r/m8, 1","RORB 1, r/m8","rorb 1, r/m8","D0 /1","V","V","","","rw,r","= Y","8" +"ROR r/m8, 1","RORB 1, r/m8","rorb 1, r/m8","REX D0 /1","N.E.","V","","pse= udo64","w,r","Y","8" +"ROR r/m8, CL","RORB CL, r/m8","rorb CL, r/m8","D2 /1","V","V","","","rw,r= ","Y","8" +"ROR r/m8, CL","RORB CL, r/m8","rorb CL, r/m8","REX D2 /1","N.E.","V","","= pseudo64","w,r","Y","8" +"ROR r/m8, imm8","RORB imm8, r/m8","rorb imm8, r/m8","REX C0 /1 ib","N.E."= ,"V","","pseudo64","w,r","Y","8" +"ROR r/m8, imm8u","RORB imm8u, r/m8","rorb imm8u, r/m8","C0 /1 ib","V","V"= ,"","","rw,r","Y","8" +"ROR r/m32, 1","RORL 1, r/m32","rorl 1, r/m32","D1 /1","V","V","","operand= 32","rw,r","Y","32" +"ROR r/m32, CL","RORL CL, r/m32","rorl CL, r/m32","D3 /1","V","V","","oper= and32","rw,r","Y","32" +"ROR r/m32, imm8u","RORL imm8u, r/m32","rorl imm8u, r/m32","C1 /1 ib","V",= "V","","operand32","rw,r","Y","32" +"ROR r/m64, 1","RORQ 1, r/m64","rorq 1, r/m64","REX.W D1 /1","N.S.","V",""= ,"","rw,r","Y","64" +"ROR r/m64, CL","RORQ CL, r/m64","rorq CL, r/m64","REX.W D3 /1","N.S.","V"= ,"","","rw,r","Y","64" +"ROR r/m64, imm8u","RORQ imm8u, r/m64","rorq imm8u, r/m64","REX.W C1 /1 ib= ","N.S.","V","","","rw,r","Y","64" +"ROR r/m16, 1","RORW 1, r/m16","rorw 1, r/m16","D1 /1","V","V","","operand= 16","rw,r","Y","16" +"ROR r/m16, CL","RORW CL, r/m16","rorw CL, r/m16","D3 /1","V","V","","oper= and16","rw,r","Y","16" +"ROR r/m16, imm8u","RORW imm8u, r/m16","rorw imm8u, r/m16","C1 /1 ib","V",= "V","","operand16","rw,r","Y","16" +"RORX r32, r/m32, imm8u","RORXL imm8u, r/m32, r32","rorxl imm8u, r/m32, r3= 2","VEX.128.F2.0F3A.W0 F0 /r ib","V","V","BMI2","","w,r,r","Y","32" +"RORX r64, r/m64, imm8u","RORXQ imm8u, r/m64, r64","rorxq imm8u, r/m64, r6= 4","VEX.128.F2.0F3A.W1 F0 /r ib","N.S.","V","BMI2","","w,r,r","Y","64" +"ROUNDPD xmm1, xmm2/m128, imm8u","ROUNDPD imm8u, xmm2/m128, xmm1","roundpd= imm8u, xmm2/m128, xmm1","66 0F 3A 09 /r ib","V","V","SSE4_1","","w,r,r",""= ,"" +"ROUNDPS xmm1, xmm2/m128, imm8u","ROUNDPS imm8u, xmm2/m128, xmm1","roundps= imm8u, xmm2/m128, xmm1","66 0F 3A 08 /r ib","V","V","SSE4_1","","w,r,r",""= ,"" +"ROUNDSD xmm1, xmm2/m64, imm8u","ROUNDSD imm8u, xmm2/m64, xmm1","roundsd i= mm8u, xmm2/m64, xmm1","66 0F 3A 0B /r ib","V","V","SSE4_1","","w,r,r","","" +"ROUNDSS xmm1, xmm2/m32, imm8u","ROUNDSS imm8u, xmm2/m32, xmm1","roundss i= mm8u, xmm2/m32, xmm1","66 0F 3A 0A /r ib","V","V","SSE4_1","","w,r,r","","" +"RSM","RSM","rsm","0F AA","V","V","","","","","" +"RSQRTPS xmm1, xmm2/m128","RSQRTPS xmm2/m128, xmm1","rsqrtps xmm2/m128, xm= m1","0F 52 /r","V","V","SSE","","w,r","","" +"RSQRTSS xmm1, 
xmm2/m32","RSQRTSS xmm2/m32, xmm1","rsqrtss xmm2/m32, xmm1"= ,"F3 0F 52 /r","V","V","SSE","","w,r","","" +"RSTORSSP m64","RSTORSSP m64","rstorssp m64","F3 0F 01 /5","V","V","CET","= modrm_memonly","rw","","" +"SAHF","SAHF","sahf","9E","V","V","LAHFSAHF","","","","" +"SAL r/m8, 1","SALB 1, r/m8","salb 1, r/m8","D0 /4","V","V","","pseudo","r= w,r","Y","8" +"SAL r/m8, 1","SALB 1, r/m8","salb 1, r/m8","REX D0 /4","N.E.","V","","pse= udo","rw,r","Y","8" +"SAL r/m8, CL","SALB CL, r/m8","salb CL, r/m8","D2 /4","V","V","","pseudo"= ,"rw,r","Y","8" +"SAL r/m8, CL","SALB CL, r/m8","salb CL, r/m8","REX D2 /4","N.E.","V","","= pseudo","rw,r","Y","8" +"SAL r/m8, imm8","SALB imm8, r/m8","salb imm8, r/m8","C0 /4 ib","V","V",""= ,"pseudo","rw,r","Y","8" +"SAL r/m8, imm8","SALB imm8, r/m8","salb imm8, r/m8","REX C0 /4 ib","N.E."= ,"V","","pseudo","rw,r","Y","8" +"SALC","SALC","salc","D6","V","N.S.","","","","","" +"SAL r/m32, 1","SALL 1, r/m32","sall 1, r/m32","D1 /4","V","V","","operand= 32,pseudo","rw,r","Y","32" +"SAL r/m32, CL","SALL CL, r/m32","sall CL, r/m32","D3 /4","V","V","","oper= and32,pseudo","rw,r","Y","32" +"SAL r/m32, imm8","SALL imm8, r/m32","sall imm8, r/m32","C1 /4 ib","V","V"= ,"","operand32,pseudo","rw,r","Y","32" +"SAL r/m64, 1","SALQ 1, r/m64","salq 1, r/m64","REX.W D1 /4","N.E.","V",""= ,"pseudo","rw,r","Y","64" +"SAL r/m64, CL","SALQ CL, r/m64","salq CL, r/m64","REX.W D3 /4","N.E.","V"= ,"","pseudo","rw,r","Y","64" +"SAL r/m64, imm8","SALQ imm8, r/m64","salq imm8, r/m64","REX.W C1 /4 ib","= N.E.","V","","pseudo","rw,r","Y","64" +"SAL r/m16, 1","SALW 1, r/m16","salw 1, r/m16","D1 /4","V","V","","operand= 16,pseudo","rw,r","Y","16" +"SAL r/m16, CL","SALW CL, r/m16","salw CL, r/m16","D3 /4","V","V","","oper= and16,pseudo","rw,r","Y","16" +"SAL r/m16, imm8","SALW imm8, r/m16","salw imm8, r/m16","C1 /4 ib","V","V"= ,"","operand16,pseudo","rw,r","Y","16" +"SAR r/m8, 1","SARB 1, r/m8","sarb 1, r/m8","D0 /7","V","V","","","rw,r","= Y","8" +"SAR r/m8, 1","SARB 1, r/m8","sarb 1, r/m8","REX D0 /7","N.E.","V","","pse= udo64","rw,r","Y","8" +"SAR r/m8, CL","SARB CL, r/m8","sarb CL, r/m8","D2 /7","V","V","","","rw,r= ","Y","8" +"SAR r/m8, CL","SARB CL, r/m8","sarb CL, r/m8","REX D2 /7","N.E.","V","","= pseudo64","rw,r","Y","8" +"SAR r/m8, imm8","SARB imm8, r/m8","sarb imm8, r/m8","REX C0 /7 ib","N.E."= ,"V","","pseudo64","rw,r","Y","8" +"SAR r/m8, imm8u","SARB imm8u, r/m8","sarb imm8u, r/m8","C0 /7 ib","V","V"= ,"","","rw,r","Y","8" +"SAR r/m32, 1","SARL 1, r/m32","sarl 1, r/m32","D1 /7","V","V","","operand= 32","rw,r","Y","32" +"SAR r/m32, CL","SARL CL, r/m32","sarl CL, r/m32","D3 /7","V","V","","oper= and32","rw,r","Y","32" +"SAR r/m32, imm8u","SARL imm8u, r/m32","sarl imm8u, r/m32","C1 /7 ib","V",= "V","","operand32","rw,r","Y","32" +"SAR r/m64, 1","SARQ 1, r/m64","sarq 1, r/m64","REX.W D1 /7","N.S.","V",""= ,"","rw,r","Y","64" +"SAR r/m64, CL","SARQ CL, r/m64","sarq CL, r/m64","REX.W D3 /7","N.S.","V"= ,"","","rw,r","Y","64" +"SAR r/m64, imm8u","SARQ imm8u, r/m64","sarq imm8u, r/m64","REX.W C1 /7 ib= ","N.S.","V","","","rw,r","Y","64" +"SAR r/m16, 1","SARW 1, r/m16","sarw 1, r/m16","D1 /7","V","V","","operand= 16","rw,r","Y","16" +"SAR r/m16, CL","SARW CL, r/m16","sarw CL, r/m16","D3 /7","V","V","","oper= and16","rw,r","Y","16" +"SAR r/m16, imm8u","SARW imm8u, r/m16","sarw imm8u, r/m16","C1 /7 ib","V",= "V","","operand16","rw,r","Y","16" +"SARX r32, r/m32, r32V","SARXL r32V, r/m32, r32","sarxl r32V, r/m32, r32",= "VEX.NDS.128.F3.0F38.W0 F7 /r","V","V","BMI2","","w,r,r","Y","32" +"SARX r64, 
r/m64, r64V","SARXQ r64V, r/m64, r64","sarxq r64V, r/m64, r64",= "VEX.NDS.128.F3.0F38.W1 F7 /r","N.S.","V","BMI2","","w,r,r","Y","64" +"SAVESSP","SAVESSP","savessp","F3 0F 01 EA","V","V","CET","","","","" +"SBB AL, imm8","SBBB imm8, AL","sbbb imm8, AL","1C ib","V","V","","","rw,r= ","Y","8" +"SBB r/m8, imm8","SBBB imm8, r/m8","sbbb imm8, r/m8","80 /3 ib","V","V",""= ,"","rw,r","Y","8" +"SBB r/m8, imm8","SBBB imm8, r/m8","sbbb imm8, r/m8","82 /3 ib","V","N.S."= ,"","","rw,r","Y","8" +"SBB r/m8, imm8","SBBB imm8, r/m8","sbbb imm8, r/m8","REX 80 /3 ib","N.E."= ,"V","","pseudo64","w,r","Y","8" +"SBB r8, r/m8","SBBB r/m8, r8","sbbb r/m8, r8","1A /r","V","V","","","rw,r= ","Y","8" +"SBB r8, r/m8","SBBB r/m8, r8","sbbb r/m8, r8","REX 1A /r","N.E.","V","","= pseudo64","w,r","Y","8" +"SBB r/m8, r8","SBBB r8, r/m8","sbbb r8, r/m8","18 /r","V","V","","","rw,r= ","Y","8" +"SBB r/m8, r8","SBBB r8, r/m8","sbbb r8, r/m8","REX 18 /r","N.E.","V","","= pseudo64","w,r","Y","8" +"SBB EAX, imm32","SBBL imm32, EAX","sbbl imm32, EAX","1D id","V","V","","o= perand32","rw,r","Y","32" +"SBB r/m32, imm32","SBBL imm32, r/m32","sbbl imm32, r/m32","81 /3 id","V",= "V","","operand32","rw,r","Y","32" +"SBB r/m32, imm8","SBBL imm8, r/m32","sbbl imm8, r/m32","83 /3 ib","V","V"= ,"","operand32","rw,r","Y","32" +"SBB r32, r/m32","SBBL r/m32, r32","sbbl r/m32, r32","1B /r","V","V","","o= perand32","rw,r","Y","32" +"SBB r/m32, r32","SBBL r32, r/m32","sbbl r32, r/m32","19 /r","V","V","","o= perand32","rw,r","Y","32" +"SBB RAX, imm32","SBBQ imm32, RAX","sbbq imm32, RAX","REX.W 1D id","N.S.",= "V","","","rw,r","Y","64" +"SBB r/m64, imm32","SBBQ imm32, r/m64","sbbq imm32, r/m64","REX.W 81 /3 id= ","N.S.","V","","","rw,r","Y","64" +"SBB r/m64, imm8","SBBQ imm8, r/m64","sbbq imm8, r/m64","REX.W 83 /3 ib","= N.S.","V","","","rw,r","Y","64" +"SBB r64, r/m64","SBBQ r/m64, r64","sbbq r/m64, r64","REX.W 1B /r","N.S.",= "V","","","rw,r","Y","64" +"SBB r/m64, r64","SBBQ r64, r/m64","sbbq r64, r/m64","REX.W 19 /r","N.S.",= "V","","","rw,r","Y","64" +"SBB AX, imm16","SBBW imm16, AX","sbbw imm16, AX","1D iw","V","V","","oper= and16","rw,r","Y","16" +"SBB r/m16, imm16","SBBW imm16, r/m16","sbbw imm16, r/m16","81 /3 iw","V",= "V","","operand16","rw,r","Y","16" +"SBB r/m16, imm8","SBBW imm8, r/m16","sbbw imm8, r/m16","83 /3 ib","V","V"= ,"","operand16","rw,r","Y","16" +"SBB r16, r/m16","SBBW r/m16, r16","sbbw r/m16, r16","1B /r","V","V","","o= perand16","rw,r","Y","16" +"SBB r/m16, r16","SBBW r16, r/m16","sbbw r16, r/m16","19 /r","V","V","","o= perand16","rw,r","Y","16" +"SCASB","SCASB","scasb","AE","V","V","","","","","" +"SCASD","SCASL","scasl","AF","V","V","","operand32","","","" +"SCASQ","SCASQ","scasq","REX.W AF","N.S.","V","","","","","" +"SCASW","SCASW","scasw","AF","V","V","","operand16","","","" +"SETAE r/m8","SETCC r/m8","setae r/m8","0F 93 /r","V","V","","","w","","" +"SETNB r/m8","SETCC r/m8","setnb r/m8","0F 93 /r","V","V","","pseudo","r",= "","" +"SETNC r/m8","SETCC r/m8","setnc r/m8","0F 93 /r","V","V","","pseudo","r",= "","" +"SETAE r/m8","SETCC r/m8","setae r/m8","REX 0F 93 /r","N.E.","V","","pseud= o64","r","","" +"SETNB r/m8","SETCC r/m8","setnb r/m8","REX 0F 93 /r","N.E.","V","","pseud= o","r","","" +"SETNC r/m8","SETCC r/m8","setnc r/m8","REX 0F 93 /r","N.E.","V","","pseud= o","r","","" +"SETB r/m8","SETCS r/m8","setb r/m8","0F 92 /r","V","V","","","w","","" +"SETC r/m8","SETCS r/m8","setc r/m8","0F 92 /r","V","V","","pseudo","r",""= ,"" +"SETNAE r/m8","SETCS r/m8","setnae r/m8","0F 92 /r","V","V","","pseudo","r= ","","" 
+"SETB r/m8","SETCS r/m8","setb r/m8","REX 0F 92 /r","N.E.","V","","pseudo6= 4","r","","" +"SETC r/m8","SETCS r/m8","setc r/m8","REX 0F 92 /r","N.E.","V","","pseudo"= ,"r","","" +"SETNAE r/m8","SETCS r/m8","setnae r/m8","REX 0F 92 /r","N.E.","V","","pse= udo","r","","" +"SETE r/m8","SETEQ r/m8","sete r/m8","0F 94 /r","V","V","","","w","","" +"SETZ r/m8","SETEQ r/m8","setz r/m8","0F 94 /r","V","V","","pseudo","r",""= ,"" +"SETE r/m8","SETEQ r/m8","sete r/m8","REX 0F 94 /r","N.E.","V","","pseudo6= 4","r","","" +"SETZ r/m8","SETEQ r/m8","setz r/m8","REX 0F 94 /r","N.E.","V","","pseudo"= ,"r","","" +"SETGE r/m8","SETGE r/m8","setge r/m8","0F 9D /r","V","V","","","w","","" +"SETNL r/m8","SETGE r/m8","setnl r/m8","0F 9D /r","V","V","","pseudo","r",= "","" +"SETGE r/m8","SETGE r/m8","setge r/m8","REX 0F 9D /r","N.E.","V","","pseud= o64","r","","" +"SETNL r/m8","SETGE r/m8","setnl r/m8","REX 0F 9D /r","N.E.","V","","pseud= o","r","","" +"SETG r/m8","SETGT r/m8","setg r/m8","0F 9F /r","V","V","","","w","","" +"SETNLE r/m8","SETGT r/m8","setnle r/m8","0F 9F /r","V","V","","pseudo","r= ","","" +"SETG r/m8","SETGT r/m8","setg r/m8","REX 0F 9F /r","N.E.","V","","pseudo6= 4","r","","" +"SETNLE r/m8","SETGT r/m8","setnle r/m8","REX 0F 9F /r","N.E.","V","","pse= udo","r","","" +"SETA r/m8","SETHI r/m8","seta r/m8","0F 97 /r","V","V","","","w","","" +"SETNBE r/m8","SETHI r/m8","setnbe r/m8","0F 97 /r","V","V","","pseudo","r= ","","" +"SETA r/m8","SETHI r/m8","seta r/m8","REX 0F 97 /r","N.E.","V","","pseudo6= 4","r","","" +"SETNBE r/m8","SETHI r/m8","setnbe r/m8","REX 0F 97 /r","N.E.","V","","pse= udo","r","","" +"SETLE r/m8","SETLE r/m8","setle r/m8","0F 9E /r","V","V","","","w","","" +"SETNG r/m8","SETLE r/m8","setng r/m8","0F 9E /r","V","V","","pseudo","r",= "","" +"SETLE r/m8","SETLE r/m8","setle r/m8","REX 0F 9E /r","N.E.","V","","pseud= o64","r","","" +"SETNG r/m8","SETLE r/m8","setng r/m8","REX 0F 9E /r","N.E.","V","","pseud= o","r","","" +"SETBE r/m8","SETLS r/m8","setbe r/m8","0F 96 /r","V","V","","","w","","" +"SETNA r/m8","SETLS r/m8","setna r/m8","0F 96 /r","V","V","","pseudo","r",= "","" +"SETBE r/m8","SETLS r/m8","setbe r/m8","REX 0F 96 /r","N.E.","V","","pseud= o64","r","","" +"SETNA r/m8","SETLS r/m8","setna r/m8","REX 0F 96 /r","N.E.","V","","pseud= o","r","","" +"SETL r/m8","SETLT r/m8","setl r/m8","0F 9C /r","V","V","","","w","","" +"SETNGE r/m8","SETLT r/m8","setnge r/m8","0F 9C /r","V","V","","pseudo","r= ","","" +"SETL r/m8","SETLT r/m8","setl r/m8","REX 0F 9C /r","N.E.","V","","pseudo6= 4","r","","" +"SETNGE r/m8","SETLT r/m8","setnge r/m8","REX 0F 9C /r","N.E.","V","","pse= udo","r","","" +"SETS r/m8","SETMI r/m8","sets r/m8","0F 98 /r","V","V","","","w","","" +"SETS r/m8","SETMI r/m8","sets r/m8","REX 0F 98 /r","N.E.","V","","pseudo6= 4","r","","" +"SETNE r/m8","SETNE r/m8","setne r/m8","0F 95 /r","V","V","","","w","","" +"SETNZ r/m8","SETNE r/m8","setnz r/m8","0F 95 /r","V","V","","pseudo","r",= "","" +"SETNE r/m8","SETNE r/m8","setne r/m8","REX 0F 95 /r","N.E.","V","","pseud= o64","r","","" +"SETNZ r/m8","SETNE r/m8","setnz r/m8","REX 0F 95 /r","N.E.","V","","pseud= o","r","","" +"SETNO r/m8","SETOC r/m8","setno r/m8","0F 91 /r","V","V","","","w","","" +"SETNO r/m8","SETOC r/m8","setno r/m8","REX 0F 91 /r","N.E.","V","","pseud= o64","r","","" +"SETO r/m8","SETOS r/m8","seto r/m8","0F 90 /r","V","V","","","w","","" +"SETO r/m8","SETOS r/m8","seto r/m8","REX 0F 90 /r","N.E.","V","","pseudo6= 4","r","","" +"SETNP r/m8","SETPC r/m8","setnp r/m8","0F 9B /r","V","V","","","w","","" 
+"SETPO r/m8","SETPC r/m8","setpo r/m8","0F 9B /r","V","V","","pseudo","r",= "","" +"SETNP r/m8","SETPC r/m8","setnp r/m8","REX 0F 9B /r","N.E.","V","","pseud= o64","r","","" +"SETPO r/m8","SETPC r/m8","setpo r/m8","REX 0F 9B /r","N.E.","V","","pseud= o","r","","" +"SETNS r/m8","SETPL r/m8","setns r/m8","0F 99 /r","V","V","","","w","","" +"SETNS r/m8","SETPL r/m8","setns r/m8","REX 0F 99 /r","N.E.","V","","pseud= o64","r","","" +"SETP r/m8","SETPS r/m8","setp r/m8","0F 9A /r","V","V","","","w","","" +"SETPE r/m8","SETPS r/m8","setpe r/m8","0F 9A /r","V","V","","pseudo","r",= "","" +"SETP r/m8","SETPS r/m8","setp r/m8","REX 0F 9A /r","N.E.","V","","pseudo6= 4","r","","" +"SETPE r/m8","SETPS r/m8","setpe r/m8","REX 0F 9A /r","N.E.","V","","pseud= o","r","","" +"SETSSBSY","SETSSBSY","setssbsy","F3 0F 01 E8","V","V","CET","","","","" +"SFENCE","SFENCE","sfence","0F AE /7","V","V","SSE","","","","" +"SGDT m16&32","SGDT m16&32","sgdt m16&32","0F 01 /0","V","N.S.","","modrm_= memonly","w","","" +"SGDT m16&64","SGDT m16&64","sgdt m16&64","0F 01 /0","N.S.","V","","defaul= t64,modrm_memonly","w","","" +"SHA1MSG1 xmm1, xmm2/m128","SHA1MSG1 xmm2/m128, xmm1","sha1msg1 xmm2/m128,= xmm1","0F 38 C9 /r","V","V","SHA","","rw,r","","" +"SHA1MSG2 xmm1, xmm2/m128","SHA1MSG2 xmm2/m128, xmm1","sha1msg2 xmm2/m128,= xmm1","0F 38 CA /r","V","V","SHA","","rw,r","","" +"SHA1NEXTE xmm1, xmm2/m128","SHA1NEXTE xmm2/m128, xmm1","sha1nexte xmm2/m1= 28, xmm1","0F 38 C8 /r","V","V","SHA","","rw,r","","" +"SHA1RNDS4 xmm1, xmm2/m128, imm8u:2","SHA1RNDS4 imm8u:2, xmm2/m128, xmm1",= "sha1rnds4 imm8u:2, xmm2/m128, xmm1","0F 3A CC /r ib","V","V","SHA","","rw,= r,r","","" +"SHA256MSG1 xmm1, xmm2/m128","SHA256MSG1 xmm2/m128, xmm1","sha256msg1 xmm2= /m128, xmm1","0F 38 CC /r","V","V","SHA","","rw,r","","" +"SHA256MSG2 xmm1, xmm2/m128","SHA256MSG2 xmm2/m128, xmm1","sha256msg2 xmm2= /m128, xmm1","0F 38 CD /r","V","V","SHA","","rw,r","","" +"SHA256RNDS2 xmm1, xmm2/m128, ","SHA256RNDS2 , xmm2/m128, xmm1= ","sha256rnds2 , xmm2/m128, xmm1","0F 38 CB /r","V","V","SHA","","rw,= r,r","","" +"SHL r/m8, 1","SHLB 1, r/m8","shlb 1, r/m8","D0 /4","V","V","","","rw,r","= Y","8" +"SHL r/m8, 1","SHLB 1, r/m8","shlb 1, r/m8","D0 /6","V","V","","","rw,r","= Y","8" +"SHL r/m8, 1","SHLB 1, r/m8","shlb 1, r/m8","REX D0 /4","N.E.","V","","pse= udo64","rw,r","Y","8" +"SHL r/m8, CL","SHLB CL, r/m8","shlb CL, r/m8","D2 /4","V","V","","","rw,r= ","Y","8" +"SHL r/m8, CL","SHLB CL, r/m8","shlb CL, r/m8","D2 /6","V","V","","","rw,r= ","Y","8" +"SHL r/m8, CL","SHLB CL, r/m8","shlb CL, r/m8","REX D2 /4","N.E.","V","","= pseudo64","rw,r","Y","8" +"SHL r/m8, imm8","SHLB imm8, r/m8","shlb imm8, r/m8","REX C0 /4 ib","N.E."= ,"V","","pseudo64","rw,r","Y","8" +"SHL r/m8, imm8u","SHLB imm8u, r/m8","shlb imm8u, r/m8","C0 /4 ib","V","V"= ,"","","rw,r","Y","8" +"SHL r/m8, imm8u","SHLB imm8u, r/m8","shlb imm8u, r/m8","C0 /6 ib","V","V"= ,"","","rw,r","Y","8" +"SHL r/m32, 1","SHLL 1, r/m32","shll 1, r/m32","D1 /4","V","V","","operand= 32","rw,r","Y","32" +"SHL r/m32, 1","SHLL 1, r/m32","shll 1, r/m32","D1 /6","V","V","","operand= 32","rw,r","Y","32" +"SHL r/m32, CL","SHLL CL, r/m32","shll CL, r/m32","D3 /4","V","V","","oper= and32","rw,r","Y","32" +"SHL r/m32, CL","SHLL CL, r/m32","shll CL, r/m32","D3 /6","V","V","","oper= and32","rw,r","Y","32" +"SHLD r/m32, r32, CL","SHLL CL, r32, r/m32","shldl CL, r32, r/m32","0F A5 = /r","V","V","","operand32","rw,r,r","Y","32" +"SHL r/m32, imm8u","SHLL imm8u, r/m32","shll imm8u, r/m32","C1 /4 ib","V",= "V","","operand32","rw,r","Y","32" 
+"SHL r/m32, imm8u","SHLL imm8u, r/m32","shll imm8u, r/m32","C1 /6 ib","V",= "V","","operand32","rw,r","Y","32" +"SHLD r/m32, r32, imm8u","SHLL imm8u, r32, r/m32","shldl imm8u, r32, r/m32= ","0F A4 /r ib","V","V","","operand32","rw,r,r","Y","32" +"SHL r/m64, 1","SHLQ 1, r/m64","shlq 1, r/m64","REX.W D1 /4","N.S.","V",""= ,"","rw,r","Y","64" +"SHL r/m64, 1","SHLQ 1, r/m64","shlq 1, r/m64","REX.W D1 /6","N.S.","V",""= ,"","rw,r","Y","64" +"SHL r/m64, CL","SHLQ CL, r/m64","shlq CL, r/m64","REX.W D3 /4","N.S.","V"= ,"","","rw,r","Y","64" +"SHL r/m64, CL","SHLQ CL, r/m64","shlq CL, r/m64","REX.W D3 /6","N.S.","V"= ,"","","rw,r","Y","64" +"SHLD r/m64, r64, CL","SHLQ CL, r64, r/m64","shldq CL, r64, r/m64","REX.W = 0F A5 /r","N.S.","V","","","rw,r,r","Y","64" +"SHL r/m64, imm8u","SHLQ imm8u, r/m64","shlq imm8u, r/m64","REX.W C1 /4 ib= ","N.S.","V","","","rw,r","Y","64" +"SHL r/m64, imm8u","SHLQ imm8u, r/m64","shlq imm8u, r/m64","REX.W C1 /6 ib= ","N.S.","V","","","rw,r","Y","64" +"SHLD r/m64, r64, imm8u","SHLQ imm8u, r64, r/m64","shldq imm8u, r64, r/m64= ","REX.W 0F A4 /r ib","N.S.","V","","","rw,r,r","Y","64" +"SHL r/m16, 1","SHLW 1, r/m16","shlw 1, r/m16","D1 /4","V","V","","operand= 16","rw,r","Y","16" +"SHL r/m16, 1","SHLW 1, r/m16","shlw 1, r/m16","D1 /6","V","V","","operand= 16","rw,r","Y","16" +"SHL r/m16, CL","SHLW CL, r/m16","shlw CL, r/m16","D3 /4","V","V","","oper= and16","rw,r","Y","16" +"SHL r/m16, CL","SHLW CL, r/m16","shlw CL, r/m16","D3 /6","V","V","","oper= and16","rw,r","Y","16" +"SHLD r/m16, r16, CL","SHLW CL, r16, r/m16","shldw CL, r16, r/m16","0F A5 = /r","V","V","","operand16","rw,r,r","Y","16" +"SHL r/m16, imm8u","SHLW imm8u, r/m16","shlw imm8u, r/m16","C1 /4 ib","V",= "V","","operand16","rw,r","Y","16" +"SHL r/m16, imm8u","SHLW imm8u, r/m16","shlw imm8u, r/m16","C1 /6 ib","V",= "V","","operand16","rw,r","Y","16" +"SHLD r/m16, r16, imm8u","SHLW imm8u, r16, r/m16","shldw imm8u, r16, r/m16= ","0F A4 /r ib","V","V","","operand16","rw,r,r","Y","16" +"SHLX r32, r/m32, r32V","SHLXL r32V, r/m32, r32","shlxl r32V, r/m32, r32",= "VEX.NDS.128.66.0F38.W0 F7 /r","V","V","BMI2","","w,r,r","Y","32" +"SHLX r64, r/m64, r64V","SHLXQ r64V, r/m64, r64","shlxq r64V, r/m64, r64",= "VEX.NDS.128.66.0F38.W1 F7 /r","N.S.","V","BMI2","","w,r,r","Y","64" +"SHR r/m8, 1","SHRB 1, r/m8","shrb 1, r/m8","D0 /5","V","V","","","rw,r","= Y","8" +"SHR r/m8, 1","SHRB 1, r/m8","shrb 1, r/m8","REX D0 /5","N.E.","V","","pse= udo64","rw,r","Y","8" +"SHR r/m8, CL","SHRB CL, r/m8","shrb CL, r/m8","D2 /5","V","V","","","rw,r= ","Y","8" +"SHR r/m8, CL","SHRB CL, r/m8","shrb CL, r/m8","REX D2 /5","N.E.","V","","= pseudo64","rw,r","Y","8" +"SHR r/m8, imm8","SHRB imm8, r/m8","shrb imm8, r/m8","REX C0 /5 ib","N.E."= ,"V","","pseudo64","rw,r","Y","8" +"SHR r/m8, imm8u","SHRB imm8u, r/m8","shrb imm8u, r/m8","C0 /5 ib","V","V"= ,"","","rw,r","Y","8" +"SHR r/m32, 1","SHRL 1, r/m32","shrl 1, r/m32","D1 /5","V","V","","operand= 32","rw,r","Y","32" +"SHR r/m32, CL","SHRL CL, r/m32","shrl CL, r/m32","D3 /5","V","V","","oper= and32","rw,r","Y","32" +"SHRD r/m32, r32, CL","SHRL CL, r32, r/m32","shrdl CL, r32, r/m32","0F AD = /r","V","V","","operand32","rw,r,r","Y","32" +"SHR r/m32, imm8u","SHRL imm8u, r/m32","shrl imm8u, r/m32","C1 /5 ib","V",= "V","","operand32","rw,r","Y","32" +"SHRD r/m32, r32, imm8u","SHRL imm8u, r32, r/m32","shrdl imm8u, r32, r/m32= ","0F AC /r ib","V","V","","operand32","rw,r,r","Y","32" +"SHR r/m64, 1","SHRQ 1, r/m64","shrq 1, r/m64","REX.W D1 /5","N.S.","V",""= ,"","rw,r","Y","64" +"SHR r/m64, CL","SHRQ CL, 
r/m64","shrq CL, r/m64","REX.W D3 /5","N.S.","V"= ,"","","rw,r","Y","64" +"SHRD r/m64, r64, CL","SHRQ CL, r64, r/m64","shrdq CL, r64, r/m64","REX.W = 0F AD /r","N.S.","V","","","rw,r,r","Y","64" +"SHR r/m64, imm8u","SHRQ imm8u, r/m64","shrq imm8u, r/m64","REX.W C1 /5 ib= ","N.S.","V","","","rw,r","Y","64" +"SHRD r/m64, r64, imm8u","SHRQ imm8u, r64, r/m64","shrdq imm8u, r64, r/m64= ","REX.W 0F AC /r ib","N.S.","V","","","rw,r,r","Y","64" +"SHR r/m16, 1","SHRW 1, r/m16","shrw 1, r/m16","D1 /5","V","V","","operand= 16","rw,r","Y","16" +"SHR r/m16, CL","SHRW CL, r/m16","shrw CL, r/m16","D3 /5","V","V","","oper= and16","rw,r","Y","16" +"SHRD r/m16, r16, CL","SHRW CL, r16, r/m16","shrdw CL, r16, r/m16","0F AD = /r","V","V","","operand16","rw,r,r","Y","16" +"SHR r/m16, imm8u","SHRW imm8u, r/m16","shrw imm8u, r/m16","C1 /5 ib","V",= "V","","operand16","rw,r","Y","16" +"SHRD r/m16, r16, imm8u","SHRW imm8u, r16, r/m16","shrdw imm8u, r16, r/m16= ","0F AC /r ib","V","V","","operand16","rw,r,r","Y","16" +"SHRX r32, r/m32, r32V","SHRXL r32V, r/m32, r32","shrxl r32V, r/m32, r32",= "VEX.NDS.128.F2.0F38.W0 F7 /r","V","V","BMI2","","w,r,r","Y","32" +"SHRX r64, r/m64, r64V","SHRXQ r64V, r/m64, r64","shrxq r64V, r/m64, r64",= "VEX.NDS.128.F2.0F38.W1 F7 /r","N.S.","V","BMI2","","w,r,r","Y","64" +"SHUFPD xmm1, xmm2/m128, imm8u","SHUFPD imm8u, xmm2/m128, xmm1","shufpd im= m8u, xmm2/m128, xmm1","66 0F C6 /r ib","V","V","SSE2","","rw,r,r","","" +"SHUFPS xmm1, xmm2/m128, imm8u","SHUFPS imm8u, xmm2/m128, xmm1","shufps im= m8u, xmm2/m128, xmm1","0F C6 /r ib","V","V","SSE","","rw,r,r","","" +"SIDT m16&32","SIDT m16&32","sidt m16&32","0F 01 /1","V","N.S.","","modrm_= memonly","w","","" +"SIDT m16&64","SIDT m16&64","sidt m16&64","0F 01 /1","N.S.","V","","defaul= t64,modrm_memonly","w","","" +"SKINIT EAX","SKINIT EAX","skinit EAX","0F 01 DE","V","V","SVM","amd,modrm= _regonly","r","","" +"SLDT r/m16","SLDTW r/m16","sldtw r/m16","0F 00 /0","V","V","","operand16"= ,"w","Y","16" +"SLDT r32/m16","SLDT{L/W} r32/m16","sldt{l/w} r32/m16","0F 00 /0","V","V",= "","operand32","w","Y","" +"SLDT r64/m16","SLDT{Q/W} r64/m16","sldt{q/w} r64/m16","REX.W 0F 00 /0","N= .S.","V","","","w","Y","" +"SLWPCB rmr32","SLWPCBL rmr32","slwpcbl rmr32","XOP.128.09.W0 12 /1","V","= V","XOP","amd,modrm_regonly,operand16,operand32","w","Y","32" +"SLWPCB rmr64","SLWPCBQ rmr64","slwpcbq rmr64","XOP.128.09.W0 12 /1","N.S.= ","V","XOP","amd,modrm_regonly,operand64","w","Y","64" +"SMSW r/m16","SMSWW r/m16","smsww r/m16","0F 01 /4","V","V","","operand16"= ,"w","Y","16" +"SMSW r32/m16","SMSW{L/W} r32/m16","smsw{l/w} r32/m16","0F 01 /4","V","V",= "","operand32","w","Y","" +"SMSW r64/m16","SMSW{Q/W} r64/m16","smsw{q/w} r64/m16","REX.W 0F 01 /4","N= .S.","V","","","w","Y","" +"SQRTPD xmm1, xmm2/m128","SQRTPD xmm2/m128, xmm1","sqrtpd xmm2/m128, xmm1"= ,"66 0F 51 /r","V","V","SSE2","","w,r","","" +"SQRTPS xmm1, xmm2/m128","SQRTPS xmm2/m128, xmm1","sqrtps xmm2/m128, xmm1"= ,"0F 51 /r","V","V","SSE","","w,r","","" +"SQRTSD xmm1, xmm2/m64","SQRTSD xmm2/m64, xmm1","sqrtsd xmm2/m64, xmm1","F= 2 0F 51 /r","V","V","SSE2","","w,r","","" +"SQRTSS xmm1, xmm2/m32","SQRTSS xmm2/m32, xmm1","sqrtss xmm2/m32, xmm1","F= 3 0F 51 /r","V","V","SSE","","w,r","","" +"STAC","STAC","stac","0F 01 CB","V","V","","","","","" +"STC","STC","stc","F9","V","V","","","","","" +"STD","STD","std","FD","V","V","","","","","" +"STGI","STGI","stgi","0F 01 DC","V","V","SVM","amd","","","" +"STI","STI","sti","FB","V","V","","","","","" +"STMXCSR m32","STMXCSR m32","stmxcsr m32","0F AE 
/3","V","V","SSE","modrm_= memonly","w","","" +"STOSB","STOSB","stosb","AA","V","V","","","","","" +"STOSD","STOSL","stosl","AB","V","V","","operand32","","","" +"STOSQ","STOSQ","stosq","REX.W AB","N.S.","V","","","","","" +"STOSW","STOSW","stosw","AB","V","V","","operand16","","","" +"STR r/m16","STRW r/m16","strw r/m16","0F 00 /1","V","V","","operand16","w= ","Y","16" +"STR r32/m16","STR{L/W} r32/m16","str{l/w} r32/m16","0F 00 /1","V","V","",= "operand32","w","Y","" +"STR r64/m16","STR{Q/W} r64/m16","str{q/w} r64/m16","REX.W 0F 00 /1","N.S.= ","V","","","w","Y","" +"SUB AL, imm8","SUBB imm8, AL","subb imm8, AL","2C ib","V","V","","","rw,r= ","Y","8" +"SUB r/m8, imm8","SUBB imm8, r/m8","subb imm8, r/m8","80 /5 ib","V","V",""= ,"","rw,r","Y","8" +"SUB r/m8, imm8","SUBB imm8, r/m8","subb imm8, r/m8","82 /5 ib","V","N.S."= ,"","","rw,r","Y","8" +"SUB r/m8, imm8","SUBB imm8, r/m8","subb imm8, r/m8","REX 80 /5 ib","N.E."= ,"V","","pseudo64","rw,r","Y","8" +"SUB r8, r/m8","SUBB r/m8, r8","subb r/m8, r8","2A /r","V","V","","","rw,r= ","Y","8" +"SUB r8, r/m8","SUBB r/m8, r8","subb r/m8, r8","REX 2A /r","N.E.","V","","= pseudo64","rw,r","Y","8" +"SUB r/m8, r8","SUBB r8, r/m8","subb r8, r/m8","28 /r","V","V","","","rw,r= ","Y","8" +"SUB r/m8, r8","SUBB r8, r/m8","subb r8, r/m8","REX 28 /r","N.E.","V","","= pseudo64","rw,r","Y","8" +"SUB EAX, imm32","SUBL imm32, EAX","subl imm32, EAX","2D id","V","V","","o= perand32","rw,r","Y","32" +"SUB r/m32, imm32","SUBL imm32, r/m32","subl imm32, r/m32","81 /5 id","V",= "V","","operand32","rw,r","Y","32" +"SUB r/m32, imm8","SUBL imm8, r/m32","subl imm8, r/m32","83 /5 ib","V","V"= ,"","operand32","rw,r","Y","32" +"SUB r32, r/m32","SUBL r/m32, r32","subl r/m32, r32","2B /r","V","V","","o= perand32","rw,r","Y","32" +"SUB r/m32, r32","SUBL r32, r/m32","subl r32, r/m32","29 /r","V","V","","o= perand32","rw,r","Y","32" +"SUBPD xmm1, xmm2/m128","SUBPD xmm2/m128, xmm1","subpd xmm2/m128, xmm1","6= 6 0F 5C /r","V","V","SSE2","","rw,r","","" +"SUBPS xmm1, xmm2/m128","SUBPS xmm2/m128, xmm1","subps xmm2/m128, xmm1","0= F 5C /r","V","V","SSE","","rw,r","","" +"SUB RAX, imm32","SUBQ imm32, RAX","subq imm32, RAX","REX.W 2D id","N.S.",= "V","","","rw,r","Y","64" +"SUB r/m64, imm32","SUBQ imm32, r/m64","subq imm32, r/m64","REX.W 81 /5 id= ","N.S.","V","","","rw,r","Y","64" +"SUB r/m64, imm8","SUBQ imm8, r/m64","subq imm8, r/m64","REX.W 83 /5 ib","= N.S.","V","","","rw,r","Y","64" +"SUB r64, r/m64","SUBQ r/m64, r64","subq r/m64, r64","REX.W 2B /r","N.S.",= "V","","","rw,r","Y","64" +"SUB r/m64, r64","SUBQ r64, r/m64","subq r64, r/m64","REX.W 29 /r","N.S.",= "V","","","rw,r","Y","64" +"SUBSD xmm1, xmm2/m64","SUBSD xmm2/m64, xmm1","subsd xmm2/m64, xmm1","F2 0= F 5C /r","V","V","SSE2","","rw,r","","" +"SUBSS xmm1, xmm2/m32","SUBSS xmm2/m32, xmm1","subss xmm2/m32, xmm1","F3 0= F 5C /r","V","V","SSE","","rw,r","","" +"SUB AX, imm16","SUBW imm16, AX","subw imm16, AX","2D iw","V","V","","oper= and16","rw,r","Y","16" +"SUB r/m16, imm16","SUBW imm16, r/m16","subw imm16, r/m16","81 /5 iw","V",= "V","","operand16","rw,r","Y","16" +"SUB r/m16, imm8","SUBW imm8, r/m16","subw imm8, r/m16","83 /5 ib","V","V"= ,"","operand16","rw,r","Y","16" +"SUB r16, r/m16","SUBW r/m16, r16","subw r/m16, r16","2B /r","V","V","","o= perand16","rw,r","Y","16" +"SUB r/m16, r16","SUBW r16, r/m16","subw r16, r/m16","29 /r","V","V","","o= perand16","rw,r","Y","16" +"SWAPGS","SWAPGS","swapgs","0F 01 F8","N.S.","V","","","","","" +"SYSCALL","SYSCALL","syscall","0F 05","N.S.","V","","default64","","","" 
+"SYSCALL","SYSCALL","syscall","0F 05","V","N.S.","AMD","amd","","","" +"SYSENTER","SYSENTER","sysenter","0F 34","V","V","PPRO","","","","" +"SYSEXIT","SYSEXIT","sysexit","0F 35","V","V","PPRO","","","","" +"SYSEXIT","SYSEXIT","sysexit","REX.W 0F 35","N.E.","V","","pseudo","","","" +"SYSRET","SYSRET","sysretw/sysretl/sysretl","0F 07","V","N.S.","AMD","amd"= ,"","","" +"SYSRET","SYSRET","sysretw/sysretl/sysretl","0F 07","N.S.","V","","operand= 32,operand64","","","" +"SYSRET","SYSRET","sysretw/sysretl/sysretl","REX.W 0F 07","I","V","","pseu= do","","","" +"T1MSKC r32V, r/m32","T1MSKCL r/m32, r32V","t1mskcl r/m32, r32V","XOP.NDD.= 128.09.WIG 01 /7","V","V","TBM","amd,operand16,operand32","w,r","Y","32" +"T1MSKC r64V, r/m64","T1MSKCQ r/m64, r64V","t1mskcq r/m64, r64V","XOP.NDD.= 128.09.WIG 01 /7","N.S.","V","TBM","amd,operand64","w,r","Y","64" +"TEST AL, imm8","TESTB imm8, AL","testb imm8, AL","A8 ib","V","V","","","r= ,r","Y","8" +"TEST r/m8, imm8","TESTB imm8, r/m8","testb imm8, r/m8","F6 /0 ib","V","V"= ,"","","r,r","Y","8" +"TEST r/m8, imm8","TESTB imm8, r/m8","testb imm8, r/m8","F6 /1 ib","V","V"= ,"","","r,r","Y","8" +"TEST r/m8, imm8","TESTB imm8, r/m8","testb imm8, r/m8","REX F6 /0 ib","N.= E.","V","","pseudo64","r,r","Y","8" +"TEST r/m8, r8","TESTB r8, r/m8","testb r8, r/m8","84 /r","V","V","","","r= ,r","Y","8" +"TEST r/m8, r8","TESTB r8, r/m8","testb r8, r/m8","REX 84 /r","N.E.","V","= ","pseudo64","r,r","Y","8" +"TEST EAX, imm32","TESTL imm32, EAX","testl imm32, EAX","A9 id","V","V",""= ,"operand32","r,r","Y","32" +"TEST r/m32, imm32","TESTL imm32, r/m32","testl imm32, r/m32","F7 /0 id","= V","V","","operand32","r,r","Y","32" +"TEST r/m32, imm32","TESTL imm32, r/m32","testl imm32, r/m32","F7 /1 id","= V","V","","operand32","r,r","Y","32" +"TEST r/m32, r32","TESTL r32, r/m32","testl r32, r/m32","85 /r","V","V",""= ,"operand32","r,r","Y","32" +"TEST RAX, imm32","TESTQ imm32, RAX","testq imm32, RAX","REX.W A9 id","N.S= .","V","","","r,r","Y","64" +"TEST r/m64, imm32","TESTQ imm32, r/m64","testq imm32, r/m64","REX.W F7 /0= id","N.S.","V","","","r,r","Y","64" +"TEST r/m64, imm32","TESTQ imm32, r/m64","testq imm32, r/m64","REX.W F7 /1= id","N.S.","V","","","r,r","Y","64" +"TEST r/m64, r64","TESTQ r64, r/m64","testq r64, r/m64","REX.W 85 /r","N.S= .","V","","","r,r","Y","64" +"TEST AX, imm16","TESTW imm16, AX","testw imm16, AX","A9 iw","V","V","","o= perand16","r,r","Y","16" +"TEST r/m16, imm16","TESTW imm16, r/m16","testw imm16, r/m16","F7 /0 iw","= V","V","","operand16","r,r","Y","16" +"TEST r/m16, imm16","TESTW imm16, r/m16","testw imm16, r/m16","F7 /1 iw","= V","V","","operand16","r,r","Y","16" +"TEST r/m16, r16","TESTW r16, r/m16","testw r16, r/m16","85 /r","V","V",""= ,"operand16","r,r","Y","16" +"TZCNT r32, r/m32","TZCNTL r/m32, r32","tzcntl r/m32, r32","F3 0F BC /r","= V","V","BMI1","operand32","w,r","Y","32" +"TZCNT r64, r/m64","TZCNTQ r/m64, r64","tzcntq r/m64, r64","F3 REX.W 0F BC= /r","N.S.","V","BMI1","","w,r","Y","64" +"TZCNT r16, r/m16","TZCNTW r/m16, r16","tzcntw r/m16, r16","F3 0F BC /r","= V","V","BMI1","operand16","w,r","Y","16" +"TZMSK r32V, r/m32","TZMSKL r/m32, r32V","tzmskl r/m32, r32V","XOP.NDD.128= .09.WIG 01 /4","V","V","TBM","amd,operand16,operand32","w,r","Y","32" +"TZMSK r64V, r/m64","TZMSKQ r/m64, r64V","tzmskq r/m64, r64V","XOP.NDD.128= .09.WIG 01 /4","N.S.","V","TBM","amd,operand64","w,r","Y","64" +"UCOMISD xmm1, xmm2/m64","UCOMISD xmm2/m64, xmm1","ucomisd xmm2/m64, xmm1"= ,"66 0F 2E /r","V","V","SSE2","","r,r","","" +"UCOMISS xmm1, xmm2/m32","UCOMISS 
xmm2/m32, xmm1","ucomiss xmm2/m32, xmm1"= ,"0F 2E /r","V","V","SSE","","r,r","","" +"UD0 r32, r/m32","UD0 r/m32, r32","ud0 r/m32, r32","0F FF /r","V","V","PPR= O","","r,r","","" +"UD1 r32, r/m32","UD1 r/m32, r32","ud1 r/m32, r32","0F B9 /r","V","V","PPR= O","","r,r","","" +"UD2","UD2","ud2","0F 0B","V","V","PPRO","","","","" +"UNPCKHPD xmm1, xmm2/m128","UNPCKHPD xmm2/m128, xmm1","unpckhpd xmm2/m128,= xmm1","66 0F 15 /r","V","V","SSE2","","rw,r","","" +"UNPCKHPS xmm1, xmm2/m128","UNPCKHPS xmm2/m128, xmm1","unpckhps xmm2/m128,= xmm1","0F 15 /r","V","V","SSE","","rw,r","","" +"UNPCKLPD xmm1, xmm2/m128","UNPCKLPD xmm2/m128, xmm1","unpcklpd xmm2/m128,= xmm1","66 0F 14 /r","V","V","SSE2","","rw,r","","" +"UNPCKLPS xmm1, xmm2/m128","UNPCKLPS xmm2/m128, xmm1","unpcklps xmm2/m128,= xmm1","0F 14 /r","V","V","SSE","","rw,r","","" +"V4FMADDPS zmm1, {k}{z}, zmmV+3, m128","V4FMADDPS m128, zmmV+3, {k}{z}, zm= m1","v4fmaddps m128, zmmV+3, {k}{z}, zmm1","EVEX.DDS.512.F2.0F38.W0 9A /r",= "V","V","AVX512_4FMAPS","modrm_memonly,scale16","rw,r,r,r","","" +"V4FMADDSS xmm1, {k}{z}, xmmV+3, m128","V4FMADDSS m128, xmmV+3, {k}{z}, xm= m1","v4fmaddss m128, xmmV+3, {k}{z}, xmm1","EVEX.DDS.LIG.F2.0F38.W0 9B /r",= "V","V","AVX512_4FMAPS","modrm_memonly,scale16","rw,r,r,r","","" +"V4FNMADDPS zmm1, {k}{z}, zmmV+3, m128","V4FNMADDPS m128, zmmV+3, {k}{z}, = zmm1","v4fnmaddps m128, zmmV+3, {k}{z}, zmm1","EVEX.DDS.512.F2.0F38.W0 AA /= r","V","V","AVX512_4FMAPS","modrm_memonly,scale16","rw,r,r,r","","" +"V4FNMADDSS xmm1, {k}{z}, xmmV+3, m128","V4FNMADDSS m128, xmmV+3, {k}{z}, = xmm1","v4fnmaddss m128, xmmV+3, {k}{z}, xmm1","EVEX.DDS.LIG.F2.0F38.W0 AB /= r","V","V","AVX512_4FMAPS","modrm_memonly,scale16","rw,r,r,r","","" +"VADDPD xmm1, xmmV, xmm2/m128","VADDPD xmm2/m128, xmmV, xmm1","vaddpd xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 58 /r","V","V","AVX","","w,r,r","= ","" +"VADDPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VADDPD xmm2/m128/m64bcst, = xmmV, {k}{z}, xmm1","vaddpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F.W1 58 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r= ","","" +"VADDPD ymm1, ymmV, ymm2/m256","VADDPD ymm2/m256, ymmV, ymm1","vaddpd ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 58 /r","V","V","AVX","","w,r,r","= ","" +"VADDPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VADDPD ymm2/m256/m64bcst, = ymmV, {k}{z}, ymm1","vaddpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F.W1 58 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r= ","","" +"VADDPD zmm1{er}, {k}{z}, zmmV, zmm2","VADDPD zmm2, zmmV, {k}{z}, zmm1{er}= ","vaddpd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.66.0F.W1 58 /r","V","= V","AVX512F","modrm_regonly","w,r,r,r","","" +"VADDPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VADDPD zmm2/m512/m64bcst, = zmmV, {k}{z}, zmm1","vaddpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F.W1 58 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VADDPS xmm1, xmmV, xmm2/m128","VADDPS xmm2/m128, xmmV, xmm1","vaddps xmm2= /m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 58 /r","V","V","AVX","","w,r,r","","" +"VADDPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VADDPS xmm2/m128/m32bcst, = xmmV, {k}{z}, xmm1","vaddps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.0F.W0 58 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","= ","" +"VADDPS ymm1, ymmV, ymm2/m256","VADDPS ymm2/m256, ymmV, ymm1","vaddps ymm2= /m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 58 /r","V","V","AVX","","w,r,r","","" +"VADDPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VADDPS ymm2/m256/m32bcst, 
= ymmV, {k}{z}, ymm1","vaddps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.0F.W0 58 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","= ","" +"VADDPS zmm1{er}, {k}{z}, zmmV, zmm2","VADDPS zmm2, zmmV, {k}{z}, zmm1{er}= ","vaddps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.0F.W0 58 /r","V","V",= "AVX512F","modrm_regonly","w,r,r,r","","" +"VADDPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VADDPS zmm2/m512/m32bcst, = zmmV, {k}{z}, zmm1","vaddps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.0F.W0 58 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VADDSD xmm1{er}, {k}{z}, xmmV, xmm2","VADDSD xmm2, xmmV, {k}{z}, xmm1{er}= ","vaddsd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F2.0F.W1 58 /r","V","= V","AVX512F","modrm_regonly","w,r,r,r","","" +"VADDSD xmm1, xmmV, xmm2/m64","VADDSD xmm2/m64, xmmV, xmm1","vaddsd xmm2/m= 64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 58 /r","V","V","AVX","","w,r,r","","" +"VADDSD xmm1, {k}{z}, xmmV, xmm2/m64","VADDSD xmm2/m64, xmmV, {k}{z}, xmm1= ","vaddsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 58 /r","V","= V","AVX512F","scale8","w,r,r,r","","" +"VADDSS xmm1{er}, {k}{z}, xmmV, xmm2","VADDSS xmm2, xmmV, {k}{z}, xmm1{er}= ","vaddss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F3.0F.W0 58 /r","V","= V","AVX512F","modrm_regonly","w,r,r,r","","" +"VADDSS xmm1, xmmV, xmm2/m32","VADDSS xmm2/m32, xmmV, xmm1","vaddss xmm2/m= 32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 58 /r","V","V","AVX","","w,r,r","","" +"VADDSS xmm1, {k}{z}, xmmV, xmm2/m32","VADDSS xmm2/m32, xmmV, {k}{z}, xmm1= ","vaddss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 58 /r","V","= V","AVX512F","scale4","w,r,r,r","","" +"VADDSUBPD xmm1, xmmV, xmm2/m128","VADDSUBPD xmm2/m128, xmmV, xmm1","vadds= ubpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D0 /r","V","V","AVX","",= "w,r,r","","" +"VADDSUBPD ymm1, ymmV, ymm2/m256","VADDSUBPD ymm2/m256, ymmV, ymm1","vadds= ubpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D0 /r","V","V","AVX","",= "w,r,r","","" +"VADDSUBPS xmm1, xmmV, xmm2/m128","VADDSUBPS xmm2/m128, xmmV, xmm1","vadds= ubps xmm2/m128, xmmV, xmm1","VEX.NDS.128.F2.0F.WIG D0 /r","V","V","AVX","",= "w,r,r","","" +"VADDSUBPS ymm1, ymmV, ymm2/m256","VADDSUBPS ymm2/m256, ymmV, ymm1","vadds= ubps ymm2/m256, ymmV, ymm1","VEX.NDS.256.F2.0F.WIG D0 /r","V","V","AVX","",= "w,r,r","","" +"VAESDEC xmm1, xmmV, xmm2/m128","VAESDEC xmm2/m128, xmmV, xmm1","vaesdec x= mm2/m128, xmmV, xmm1","EVEX.NDS.128.66.0F38.WIG DE /r","V","V","AES+AVX512V= L","scale16","w,r,r","","" +"VAESDEC xmm1, xmmV, xmm2/m128","VAESDEC xmm2/m128, xmmV, xmm1","vaesdec x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG DE /r","V","V","AES+AVX","",= "w,r,r","","" +"VAESDEC ymm1, ymmV, ymm2/m256","VAESDEC ymm2/m256, ymmV, ymm1","vaesdec y= mm2/m256, ymmV, ymm1","EVEX.NDS.256.66.0F38.WIG DE /r","V","V","AES+AVX512V= L","scale32","w,r,r","","" +"VAESDEC ymm1, ymmV, ymm2/m256","VAESDEC ymm2/m256, ymmV, ymm1","vaesdec y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG DE /r","V","V","VAES+AVX",""= ,"w,r,r","","" +"VAESDEC zmm1, zmmV, zmm2/m512","VAESDEC zmm2/m512, zmmV, zmm1","vaesdec z= mm2/m512, zmmV, zmm1","EVEX.NDS.512.66.0F38.WIG DE /r","V","V","AES+AVX512F= ","scale64","w,r,r","","" +"VAESDECLAST xmm1, xmmV, xmm2/m128","VAESDECLAST xmm2/m128, xmmV, xmm1","v= aesdeclast xmm2/m128, xmmV, xmm1","EVEX.NDS.128.66.0F38.WIG DF /r","V","V",= "AES+AVX512VL","scale16","w,r,r","","" +"VAESDECLAST xmm1, xmmV, xmm2/m128","VAESDECLAST xmm2/m128, xmmV, xmm1","v= aesdeclast xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG DF 
/r","V","V","= AES+AVX","","w,r,r","","" +"VAESDECLAST ymm1, ymmV, ymm2/m256","VAESDECLAST ymm2/m256, ymmV, ymm1","v= aesdeclast ymm2/m256, ymmV, ymm1","EVEX.NDS.256.66.0F38.WIG DF /r","V","V",= "AES+AVX512VL","scale32","w,r,r","","" +"VAESDECLAST ymm1, ymmV, ymm2/m256","VAESDECLAST ymm2/m256, ymmV, ymm1","v= aesdeclast ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG DF /r","V","V","= VAES+AVX","","w,r,r","","" +"VAESDECLAST zmm1, zmmV, zmm2/m512","VAESDECLAST zmm2/m512, zmmV, zmm1","v= aesdeclast zmm2/m512, zmmV, zmm1","EVEX.NDS.512.66.0F38.WIG DF /r","V","V",= "AES+AVX512F","scale64","w,r,r","","" +"VAESENC xmm1, xmmV, xmm2/m128","VAESENC xmm2/m128, xmmV, xmm1","vaesenc x= mm2/m128, xmmV, xmm1","EVEX.NDS.128.66.0F38.WIG DC /r","V","V","AES+AVX512V= L","scale16","w,r,r","","" +"VAESENC xmm1, xmmV, xmm2/m128","VAESENC xmm2/m128, xmmV, xmm1","vaesenc x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG DC /r","V","V","AES+AVX","",= "w,r,r","","" +"VAESENC ymm1, ymmV, ymm2/m256","VAESENC ymm2/m256, ymmV, ymm1","vaesenc y= mm2/m256, ymmV, ymm1","EVEX.NDS.256.66.0F38.WIG DC /r","V","V","AES+AVX512V= L","scale32","w,r,r","","" +"VAESENC ymm1, ymmV, ymm2/m256","VAESENC ymm2/m256, ymmV, ymm1","vaesenc y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG DC /r","V","V","VAES+AVX",""= ,"w,r,r","","" +"VAESENC zmm1, zmmV, zmm2/m512","VAESENC zmm2/m512, zmmV, zmm1","vaesenc z= mm2/m512, zmmV, zmm1","EVEX.NDS.512.66.0F38.WIG DC /r","V","V","AES+AVX512F= ","scale64","w,r,r","","" +"VAESENCLAST xmm1, xmmV, xmm2/m128","VAESENCLAST xmm2/m128, xmmV, xmm1","v= aesenclast xmm2/m128, xmmV, xmm1","EVEX.NDS.128.66.0F38.WIG DD /r","V","V",= "AES+AVX512VL","scale16","w,r,r","","" +"VAESENCLAST xmm1, xmmV, xmm2/m128","VAESENCLAST xmm2/m128, xmmV, xmm1","v= aesenclast xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG DD /r","V","V","= AES+AVX","","w,r,r","","" +"VAESENCLAST ymm1, ymmV, ymm2/m256","VAESENCLAST ymm2/m256, ymmV, ymm1","v= aesenclast ymm2/m256, ymmV, ymm1","EVEX.NDS.256.66.0F38.WIG DD /r","V","V",= "AES+AVX512VL","scale32","w,r,r","","" +"VAESENCLAST ymm1, ymmV, ymm2/m256","VAESENCLAST ymm2/m256, ymmV, ymm1","v= aesenclast ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG DD /r","V","V","= VAES+AVX","","w,r,r","","" +"VAESENCLAST zmm1, zmmV, zmm2/m512","VAESENCLAST zmm2/m512, zmmV, zmm1","v= aesenclast zmm2/m512, zmmV, zmm1","EVEX.NDS.512.66.0F38.WIG DD /r","V","V",= "AES+AVX512F","scale64","w,r,r","","" +"VAESIMC xmm1, xmm2/m128","VAESIMC xmm2/m128, xmm1","vaesimc xmm2/m128, xm= m1","VEX.128.66.0F38.WIG DB /r","V","V","AES+AVX","","w,r","","" +"VAESKEYGENASSIST xmm1, xmm2/m128, imm8u","VAESKEYGENASSIST imm8u, xmm2/m1= 28, xmm1","vaeskeygenassist imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG DF= /r ib","V","V","AES+AVX","","w,r,r","","" +"VALIGND xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u","VALIGND imm8u, xmm= 2/m128/m32bcst, xmmV, {k}{z}, xmm1","valignd imm8u, xmm2/m128/m32bcst, xmmV= , {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W0 03 /r ib","V","V","AVX512F+AVX512V= L","bscale4,scale16","w,r,r,r,r","","" +"VALIGND ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VALIGND imm8u, ymm= 2/m256/m32bcst, ymmV, {k}{z}, ymm1","valignd imm8u, ymm2/m256/m32bcst, ymmV= , {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 03 /r ib","V","V","AVX512F+AVX512V= L","bscale4,scale32","w,r,r,r,r","","" +"VALIGND zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VALIGND imm8u, zmm= 2/m512/m32bcst, zmmV, {k}{z}, zmm1","valignd imm8u, zmm2/m512/m32bcst, zmmV= , {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 03 /r ib","V","V","AVX512F","bscal= 
e4,scale64","w,r,r,r,r","","" +"VALIGNQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VALIGNQ imm8u, xmm= 2/m128/m64bcst, xmmV, {k}{z}, xmm1","valignq imm8u, xmm2/m128/m64bcst, xmmV= , {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 03 /r ib","V","V","AVX512F+AVX512V= L","bscale8,scale16","w,r,r,r,r","","" +"VALIGNQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VALIGNQ imm8u, ymm= 2/m256/m64bcst, ymmV, {k}{z}, ymm1","valignq imm8u, ymm2/m256/m64bcst, ymmV= , {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 03 /r ib","V","V","AVX512F+AVX512V= L","bscale8,scale32","w,r,r,r,r","","" +"VALIGNQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VALIGNQ imm8u, zmm= 2/m512/m64bcst, zmmV, {k}{z}, zmm1","valignq imm8u, zmm2/m512/m64bcst, zmmV= , {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 03 /r ib","V","V","AVX512F","bscal= e8,scale64","w,r,r,r,r","","" +"VANDNPD xmm1, xmmV, xmm2/m128","VANDNPD xmm2/m128, xmmV, xmm1","vandnpd x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 55 /r","V","V","AVX","","w,r,r= ","","" +"VANDNPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VANDNPD xmm2/m128/m64bcst= , xmmV, {k}{z}, xmm1","vandnpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F.W1 55 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r= ,r,r","","" +"VANDNPD ymm1, ymmV, ymm2/m256","VANDNPD ymm2/m256, ymmV, ymm1","vandnpd y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 55 /r","V","V","AVX","","w,r,r= ","","" +"VANDNPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VANDNPD ymm2/m256/m64bcst= , ymmV, {k}{z}, ymm1","vandnpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F.W1 55 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r= ,r,r","","" +"VANDNPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VANDNPD zmm2/m512/m64bcst= , zmmV, {k}{z}, zmm1","vandnpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F.W1 55 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r,r","",= "" +"VANDNPS xmm1, xmmV, xmm2/m128","VANDNPS xmm2/m128, xmmV, xmm1","vandnps x= mm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 55 /r","V","V","AVX","","w,r,r","= ","" +"VANDNPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VANDNPS xmm2/m128/m32bcst= , xmmV, {k}{z}, xmm1","vandnps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.0F.W0 55 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r,= r","","" +"VANDNPS ymm1, ymmV, ymm2/m256","VANDNPS ymm2/m256, ymmV, ymm1","vandnps y= mm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 55 /r","V","V","AVX","","w,r,r","= ","" +"VANDNPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VANDNPS ymm2/m256/m32bcst= , ymmV, {k}{z}, ymm1","vandnps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.0F.W0 55 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale32","w,r,r,= r","","" +"VANDNPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VANDNPS zmm2/m512/m32bcst= , zmmV, {k}{z}, zmm1","vandnps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.0F.W0 55 /r","V","V","AVX512DQ","bscale4,scale64","w,r,r,r","","" +"VANDPD xmm1, xmmV, xmm2/m128","VANDPD xmm2/m128, xmmV, xmm1","vandpd xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 54 /r","V","V","AVX","","w,r,r","= ","" +"VANDPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VANDPD xmm2/m128/m64bcst, = xmmV, {k}{z}, xmm1","vandpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F.W1 54 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r,= r","","" +"VANDPD ymm1, ymmV, ymm2/m256","VANDPD ymm2/m256, ymmV, ymm1","vandpd ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 54 /r","V","V","AVX","","w,r,r","= ","" +"VANDPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VANDPD ymm2/m256/m64bcst, = ymmV, {k}{z}, 
ymm1","vandpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F.W1 54 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r,= r","","" +"VANDPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VANDPD zmm2/m512/m64bcst, = zmmV, {k}{z}, zmm1","vandpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F.W1 54 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r,r","","" +"VANDPS xmm1, xmmV, xmm2/m128","VANDPS xmm2/m128, xmmV, xmm1","vandps xmm2= /m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 54 /r","V","V","AVX","","w,r,r","","" +"VANDPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VANDPS xmm2/m128/m32bcst, = xmmV, {k}{z}, xmm1","vandps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.0F.W0 54 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r,r",= "","" +"VANDPS ymm1, ymmV, ymm2/m256","VANDPS ymm2/m256, ymmV, ymm1","vandps ymm2= /m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 54 /r","V","V","AVX","","w,r,r","","" +"VANDPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VANDPS ymm2/m256/m32bcst, = ymmV, {k}{z}, ymm1","vandps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.0F.W0 54 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale32","w,r,r,r",= "","" +"VANDPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VANDPS zmm2/m512/m32bcst, = zmmV, {k}{z}, zmm1","vandps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.0F.W0 54 /r","V","V","AVX512DQ","bscale4,scale64","w,r,r,r","","" +"VBLENDMPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VBLENDMPD xmm2/m128/m64= bcst, xmmV, {k}{z}, xmm1","vblendmpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.NDS.128.66.0F38.W1 65 /r","V","V","AVX512F+AVX512VL","bscale8,scale1= 6","w,r,r,r","","" +"VBLENDMPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VBLENDMPD ymm2/m256/m64= bcst, ymmV, {k}{z}, ymm1","vblendmpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.NDS.256.66.0F38.W1 65 /r","V","V","AVX512F+AVX512VL","bscale8,scale3= 2","w,r,r,r","","" +"VBLENDMPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VBLENDMPD zmm2/m512/m64= bcst, zmmV, {k}{z}, zmm1","vblendmpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.NDS.512.66.0F38.W1 65 /r","V","V","AVX512F","bscale8,scale64","w,r,r= ,r","","" +"VBLENDMPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VBLENDMPS xmm2/m128/m32= bcst, xmmV, {k}{z}, xmm1","vblendmps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.NDS.128.66.0F38.W0 65 /r","V","V","AVX512F+AVX512VL","bscale4,scale1= 6","w,r,r,r","","" +"VBLENDMPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VBLENDMPS ymm2/m256/m32= bcst, ymmV, {k}{z}, ymm1","vblendmps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.NDS.256.66.0F38.W0 65 /r","V","V","AVX512F+AVX512VL","bscale4,scale3= 2","w,r,r,r","","" +"VBLENDMPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VBLENDMPS zmm2/m512/m32= bcst, zmmV, {k}{z}, zmm1","vblendmps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.NDS.512.66.0F38.W0 65 /r","V","V","AVX512F","bscale4,scale64","w,r,r= ,r","","" +"VBLENDPD xmm1, xmmV, xmm2/m128, imm8u","VBLENDPD imm8u, xmm2/m128, xmmV, = xmm1","vblendpd imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 0D /= r ib","V","V","AVX","","w,r,r,r","","" +"VBLENDPD ymm1, ymmV, ymm2/m256, imm8u","VBLENDPD imm8u, ymm2/m256, ymmV, = ymm1","vblendpd imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WIG 0D /= r ib","V","V","AVX","","w,r,r,r","","" +"VBLENDPS xmm1, xmmV, xmm2/m128, imm8u","VBLENDPS imm8u, xmm2/m128, xmmV, = xmm1","vblendps imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 0C /= r ib","V","V","AVX","","w,r,r,r","","" +"VBLENDPS ymm1, ymmV, ymm2/m256, imm8u","VBLENDPS imm8u, ymm2/m256, ymmV, = ymm1","vblendps imm8u, 
ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WIG 0C /= r ib","V","V","AVX","","w,r,r,r","","" +"VBLENDVPD xmm1, xmmV, xmm2/m128, xmmIH","VBLENDVPD xmmIH, xmm2/m128, xmmV= , xmm1","vblendvpd xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 4B= /r /is4","V","V","AVX","","w,r,r,r","","" +"VBLENDVPD ymm1, ymmV, ymm2/m256, ymmIH","VBLENDVPD ymmIH, ymm2/m256, ymmV= , ymm1","vblendvpd ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 4B= /r /is4","V","V","AVX","","w,r,r,r","","" +"VBLENDVPS xmm1, xmmV, xmm2/m128, xmmIH","VBLENDVPS xmmIH, xmm2/m128, xmmV= , xmm1","vblendvps xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 4A= /r /is4","V","V","AVX","","w,r,r,r","","" +"VBLENDVPS ymm1, ymmV, ymm2/m256, ymmIH","VBLENDVPS ymmIH, ymm2/m256, ymmV= , ymm1","vblendvps ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 4A= /r /is4","V","V","AVX","","w,r,r,r","","" +"VBROADCASTF128 ymm1, m128","VBROADCASTF128 m128, ymm1","vbroadcastf128 m1= 28, ymm1","VEX.256.66.0F38.W0 1A /r","V","V","AVX","modrm_memonly","w,r",""= ,"" +"VBROADCASTF32X2 ymm1, {k}{z}, xmm2/m64","VBROADCASTF32X2 xmm2/m64, {k}{z}= , ymm1","vbroadcastf32x2 xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.W0 19 /r= ","V","V","AVX512DQ+AVX512VL","scale8","w,r,r","","" +"VBROADCASTF32X2 zmm1, {k}{z}, xmm2/m64","VBROADCASTF32X2 xmm2/m64, {k}{z}= , zmm1","vbroadcastf32x2 xmm2/m64, {k}{z}, zmm1","EVEX.512.66.0F38.W0 19 /r= ","V","V","AVX512DQ","scale8","w,r,r","","" +"VBROADCASTF32X4 ymm1, {k}{z}, m128","VBROADCASTF32X4 m128, {k}{z}, ymm1",= "vbroadcastf32x4 m128, {k}{z}, ymm1","EVEX.256.66.0F38.W0 1A /r","V","V","A= VX512F+AVX512VL","modrm_memonly,scale16","w,r,r","","" +"VBROADCASTF32X4 zmm1, {k}{z}, m128","VBROADCASTF32X4 m128, {k}{z}, zmm1",= "vbroadcastf32x4 m128, {k}{z}, zmm1","EVEX.512.66.0F38.W0 1A /r","V","V","A= VX512F","modrm_memonly,scale16","w,r,r","","" +"VBROADCASTF32X8 zmm1, {k}{z}, m256","VBROADCASTF32X8 m256, {k}{z}, zmm1",= "vbroadcastf32x8 m256, {k}{z}, zmm1","EVEX.512.66.0F38.W0 1B /r","V","V","A= VX512DQ","modrm_memonly,scale32","w,r,r","","" +"VBROADCASTF64X2 ymm1, {k}{z}, m128","VBROADCASTF64X2 m128, {k}{z}, ymm1",= "vbroadcastf64x2 m128, {k}{z}, ymm1","EVEX.256.66.0F38.W1 1A /r","V","V","A= VX512DQ+AVX512VL","modrm_memonly,scale16","w,r,r","","" +"VBROADCASTF64X2 zmm1, {k}{z}, m128","VBROADCASTF64X2 m128, {k}{z}, zmm1",= "vbroadcastf64x2 m128, {k}{z}, zmm1","EVEX.512.66.0F38.W1 1A /r","V","V","A= VX512DQ","modrm_memonly,scale16","w,r,r","","" +"VBROADCASTF64X4 zmm1, {k}{z}, m256","VBROADCASTF64X4 m256, {k}{z}, zmm1",= "vbroadcastf64x4 m256, {k}{z}, zmm1","EVEX.512.66.0F38.W1 1B /r","V","V","A= VX512F","modrm_memonly,scale32","w,r,r","","" +"VBROADCASTI128 ymm1, m128","VBROADCASTI128 m128, ymm1","vbroadcasti128 m1= 28, ymm1","VEX.256.66.0F38.W0 5A /r","V","V","AVX2","modrm_memonly","w,r","= ","" +"VBROADCASTI32X2 xmm1, {k}{z}, xmm2/m64","VBROADCASTI32X2 xmm2/m64, {k}{z}= , xmm1","vbroadcasti32x2 xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.W0 59 /r= ","V","V","AVX512DQ+AVX512VL","scale8","w,r,r","","" +"VBROADCASTI32X2 ymm1, {k}{z}, xmm2/m64","VBROADCASTI32X2 xmm2/m64, {k}{z}= , ymm1","vbroadcasti32x2 xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.W0 59 /r= ","V","V","AVX512DQ+AVX512VL","scale8","w,r,r","","" +"VBROADCASTI32X2 zmm1, {k}{z}, xmm2/m64","VBROADCASTI32X2 xmm2/m64, {k}{z}= , zmm1","vbroadcasti32x2 xmm2/m64, {k}{z}, zmm1","EVEX.512.66.0F38.W0 59 /r= ","V","V","AVX512DQ","scale8","w,r,r","","" +"VBROADCASTI32X4 ymm1, {k}{z}, m128","VBROADCASTI32X4 m128, {k}{z}, ymm1",= "vbroadcasti32x4 m128, {k}{z}, 
ymm1","EVEX.256.66.0F38.W0 5A /r","V","V","A= VX512F+AVX512VL","modrm_memonly,scale16","w,r,r","","" +"VBROADCASTI32X4 zmm1, {k}{z}, m128","VBROADCASTI32X4 m128, {k}{z}, zmm1",= "vbroadcasti32x4 m128, {k}{z}, zmm1","EVEX.512.66.0F38.W0 5A /r","V","V","A= VX512F","modrm_memonly,scale16","w,r,r","","" +"VBROADCASTI32X8 zmm1, {k}{z}, m256","VBROADCASTI32X8 m256, {k}{z}, zmm1",= "vbroadcasti32x8 m256, {k}{z}, zmm1","EVEX.512.66.0F38.W0 5B /r","V","V","A= VX512DQ","modrm_memonly,scale32","w,r,r","","" +"VBROADCASTI64X2 ymm1, {k}{z}, m128","VBROADCASTI64X2 m128, {k}{z}, ymm1",= "vbroadcasti64x2 m128, {k}{z}, ymm1","EVEX.256.66.0F38.W1 5A /r","V","V","A= VX512DQ+AVX512VL","modrm_memonly,scale16","w,r,r","","" +"VBROADCASTI64X2 zmm1, {k}{z}, m128","VBROADCASTI64X2 m128, {k}{z}, zmm1",= "vbroadcasti64x2 m128, {k}{z}, zmm1","EVEX.512.66.0F38.W1 5A /r","V","V","A= VX512DQ","modrm_memonly,scale16","w,r,r","","" +"VBROADCASTI64X4 zmm1, {k}{z}, m256","VBROADCASTI64X4 m256, {k}{z}, zmm1",= "vbroadcasti64x4 m256, {k}{z}, zmm1","EVEX.512.66.0F38.W1 5B /r","V","V","A= VX512F","modrm_memonly,scale32","w,r,r","","" +"VBROADCASTSD ymm1, m64","VBROADCASTSD m64, ymm1","vbroadcastsd m64, ymm1"= ,"VEX.256.66.0F38.W0 19 /r","V","V","AVX","modrm_memonly","w,r","","" +"VBROADCASTSD ymm1, xmm2","VBROADCASTSD xmm2, ymm1","vbroadcastsd xmm2, ym= m1","VEX.256.66.0F38.W0 19 /r","V","V","AVX2","modrm_regonly","w,r","","" +"VBROADCASTSD ymm1, {k}{z}, xmm2/m64","VBROADCASTSD xmm2/m64, {k}{z}, ymm1= ","vbroadcastsd xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.W1 19 /r","V","V"= ,"AVX512F+AVX512VL","scale8","w,r,r","","" +"VBROADCASTSD zmm1, {k}{z}, xmm2/m64","VBROADCASTSD xmm2/m64, {k}{z}, zmm1= ","vbroadcastsd xmm2/m64, {k}{z}, zmm1","EVEX.512.66.0F38.W1 19 /r","V","V"= ,"AVX512F","scale8","w,r,r","","" +"VBROADCASTSS xmm1, m32","VBROADCASTSS m32, xmm1","vbroadcastss m32, xmm1"= ,"VEX.128.66.0F38.W0 18 /r","V","V","AVX","modrm_memonly","w,r","","" +"VBROADCASTSS ymm1, m32","VBROADCASTSS m32, ymm1","vbroadcastss m32, ymm1"= ,"VEX.256.66.0F38.W0 18 /r","V","V","AVX","modrm_memonly","w,r","","" +"VBROADCASTSS xmm1, xmm2","VBROADCASTSS xmm2, xmm1","vbroadcastss xmm2, xm= m1","VEX.128.66.0F38.W0 18 /r","V","V","AVX2","modrm_regonly","w,r","","" +"VBROADCASTSS ymm1, xmm2","VBROADCASTSS xmm2, ymm1","vbroadcastss xmm2, ym= m1","VEX.256.66.0F38.W0 18 /r","V","V","AVX2","modrm_regonly","w,r","","" +"VBROADCASTSS xmm1, {k}{z}, xmm2/m32","VBROADCASTSS xmm2/m32, {k}{z}, xmm1= ","vbroadcastss xmm2/m32, {k}{z}, xmm1","EVEX.128.66.0F38.W0 18 /r","V","V"= ,"AVX512F+AVX512VL","scale4","w,r,r","","" +"VBROADCASTSS ymm1, {k}{z}, xmm2/m32","VBROADCASTSS xmm2/m32, {k}{z}, ymm1= ","vbroadcastss xmm2/m32, {k}{z}, ymm1","EVEX.256.66.0F38.W0 18 /r","V","V"= ,"AVX512F+AVX512VL","scale4","w,r,r","","" +"VBROADCASTSS zmm1, {k}{z}, xmm2/m32","VBROADCASTSS xmm2/m32, {k}{z}, zmm1= ","vbroadcastss xmm2/m32, {k}{z}, zmm1","EVEX.512.66.0F38.W0 18 /r","V","V"= ,"AVX512F","scale4","w,r,r","","" +"VCMPPD xmm1, xmmV, xmm2/m128, imm8u","VCMPPD imm8u, xmm2/m128, xmmV, xmm1= ","vcmppd imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG C2 /r ib","V= ","V","AVX","","w,r,r,r","","" +"VCMPPD k1, {k}, xmmV, xmm2/m128/m64bcst, imm8u","VCMPPD imm8u, xmm2/m128/= m64bcst, xmmV, {k}, k1","vcmppd imm8u, xmm2/m128/m64bcst, xmmV, {k}, k1","E= VEX.NDS.128.66.0F.W1 C2 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale16"= ,"w,r,r,r,r","","" +"VCMPPD ymm1, ymmV, ymm2/m256, imm8u","VCMPPD imm8u, ymm2/m256, ymmV, ymm1= ","vcmppd imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG C2 /r 
ib","V= ","V","AVX","","w,r,r,r","","" +"VCMPPD k1, {k}, ymmV, ymm2/m256/m64bcst, imm8u","VCMPPD imm8u, ymm2/m256/= m64bcst, ymmV, {k}, k1","vcmppd imm8u, ymm2/m256/m64bcst, ymmV, {k}, k1","E= VEX.NDS.256.66.0F.W1 C2 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32"= ,"w,r,r,r,r","","" +"VCMPPD k1{sae}, {k}, zmmV, zmm2, imm8u","VCMPPD imm8u, zmm2, zmmV, {k}, k= 1{sae}","vcmppd imm8u, zmm2, zmmV, {k}, k1{sae}","EVEX.NDS.512.66.0F.W1 C2 = /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r,r","","" +"VCMPPD k1, {k}, zmmV, zmm2/m512/m64bcst, imm8u","VCMPPD imm8u, zmm2/m512/= m64bcst, zmmV, {k}, k1","vcmppd imm8u, zmm2/m512/m64bcst, zmmV, {k}, k1","E= VEX.NDS.512.66.0F.W1 C2 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r= ,r","","" +"VCMPPS xmm1, xmmV, xmm2/m128, imm8u","VCMPPS imm8u, xmm2/m128, xmmV, xmm1= ","vcmpps imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG C2 /r ib","V","= V","AVX","","w,r,r,r","","" +"VCMPPS k1, {k}, xmmV, xmm2/m128/m32bcst, imm8u","VCMPPS imm8u, xmm2/m128/= m32bcst, xmmV, {k}, k1","vcmpps imm8u, xmm2/m128/m32bcst, xmmV, {k}, k1","E= VEX.NDS.128.0F.W0 C2 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w= ,r,r,r,r","","" +"VCMPPS ymm1, ymmV, ymm2/m256, imm8u","VCMPPS imm8u, ymm2/m256, ymmV, ymm1= ","vcmpps imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG C2 /r ib","V","= V","AVX","","w,r,r,r","","" +"VCMPPS k1, {k}, ymmV, ymm2/m256/m32bcst, imm8u","VCMPPS imm8u, ymm2/m256/= m32bcst, ymmV, {k}, k1","vcmpps imm8u, ymm2/m256/m32bcst, ymmV, {k}, k1","E= VEX.NDS.256.0F.W0 C2 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w= ,r,r,r,r","","" +"VCMPPS k1{sae}, {k}, zmmV, zmm2, imm8u","VCMPPS imm8u, zmm2, zmmV, {k}, k= 1{sae}","vcmpps imm8u, zmm2, zmmV, {k}, k1{sae}","EVEX.NDS.512.0F.W0 C2 /r = ib","V","V","AVX512F","modrm_regonly","w,r,r,r,r","","" +"VCMPPS k1, {k}, zmmV, zmm2/m512/m32bcst, imm8u","VCMPPS imm8u, zmm2/m512/= m32bcst, zmmV, {k}, k1","vcmpps imm8u, zmm2/m512/m32bcst, zmmV, {k}, k1","E= VEX.NDS.512.0F.W0 C2 /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r,r"= ,"","" +"VCMPSD k1{sae}, {k}, xmmV, xmm2, imm8u","VCMPSD imm8u, xmm2, xmmV, {k}, k= 1{sae}","vcmpsd imm8u, xmm2, xmmV, {k}, k1{sae}","EVEX.NDS.128.F2.0F.W1 C2 = /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r,r","","" +"VCMPSD xmm1, xmmV, xmm2/m64, imm8u","VCMPSD imm8u, xmm2/m64, xmmV, xmm1",= "vcmpsd imm8u, xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG C2 /r ib","V","= V","AVX","","w,r,r,r","","" +"VCMPSD k1, {k}, xmmV, xmm2/m64, imm8u","VCMPSD imm8u, xmm2/m64, xmmV, {k}= , k1","vcmpsd imm8u, xmm2/m64, xmmV, {k}, k1","EVEX.NDS.LIG.F2.0F.W1 C2 /r = ib","V","V","AVX512F","scale8","w,r,r,r,r","","" +"VCMPSS k1{sae}, {k}, xmmV, xmm2, imm8u","VCMPSS imm8u, xmm2, xmmV, {k}, k= 1{sae}","vcmpss imm8u, xmm2, xmmV, {k}, k1{sae}","EVEX.NDS.128.F3.0F.W0 C2 = /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r,r","","" +"VCMPSS xmm1, xmmV, xmm2/m32, imm8u","VCMPSS imm8u, xmm2/m32, xmmV, xmm1",= "vcmpss imm8u, xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG C2 /r ib","V","= V","AVX","","w,r,r,r","","" +"VCMPSS k1, {k}, xmmV, xmm2/m32, imm8u","VCMPSS imm8u, xmm2/m32, xmmV, {k}= , k1","vcmpss imm8u, xmm2/m32, xmmV, {k}, k1","EVEX.NDS.LIG.F3.0F.W0 C2 /r = ib","V","V","AVX512F","scale4","w,r,r,r,r","","" +"VCOMISD xmm1{sae}, xmm2","VCOMISD xmm2, xmm1{sae}","vcomisd xmm2, xmm1{sa= e}","EVEX.128.66.0F.W1 2F /r","V","V","AVX512F","modrm_regonly","r,r","","" +"VCOMISD xmm1, xmm2/m64","VCOMISD xmm2/m64, xmm1","vcomisd xmm2/m64, xmm1"= ,"EVEX.LIG.66.0F.W1 2F /r","V","V","AVX512F","scale8","r,r","","" +"VCOMISD 
xmm1, xmm2/m64","VCOMISD xmm2/m64, xmm1","vcomisd xmm2/m64, xmm1"= ,"VEX.LIG.66.0F.WIG 2F /r","V","V","AVX","","r,r","","" +"VCOMISS xmm1{sae}, xmm2","VCOMISS xmm2, xmm1{sae}","vcomiss xmm2, xmm1{sa= e}","EVEX.128.0F.W0 2F /r","V","V","AVX512F","modrm_regonly","r,r","","" +"VCOMISS xmm1, xmm2/m32","VCOMISS xmm2/m32, xmm1","vcomiss xmm2/m32, xmm1"= ,"EVEX.LIG.0F.W0 2F /r","V","V","AVX512F","scale4","r,r","","" +"VCOMISS xmm1, xmm2/m32","VCOMISS xmm2/m32, xmm1","vcomiss xmm2/m32, xmm1"= ,"VEX.LIG.0F.WIG 2F /r","V","V","AVX","","r,r","","" +"VCOMPRESSPD xmm2/m128, {k}{z}, xmm1","VCOMPRESSPD xmm1, {k}{z}, xmm2/m128= ","vcompresspd xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F38.W1 8A /r","V","V"= ,"AVX512F+AVX512VL","scale8","w,r,r","","" +"VCOMPRESSPD ymm2/m256, {k}{z}, ymm1","VCOMPRESSPD ymm1, {k}{z}, ymm2/m256= ","vcompresspd ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F38.W1 8A /r","V","V"= ,"AVX512F+AVX512VL","scale8","w,r,r","","" +"VCOMPRESSPD zmm2/m512, {k}{z}, zmm1","VCOMPRESSPD zmm1, {k}{z}, zmm2/m512= ","vcompresspd zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F38.W1 8A /r","V","V"= ,"AVX512F","scale8","w,r,r","","" +"VCOMPRESSPS xmm2/m128, {k}{z}, xmm1","VCOMPRESSPS xmm1, {k}{z}, xmm2/m128= ","vcompressps xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F38.W0 8A /r","V","V"= ,"AVX512F+AVX512VL","scale4","w,r,r","","" +"VCOMPRESSPS ymm2/m256, {k}{z}, ymm1","VCOMPRESSPS ymm1, {k}{z}, ymm2/m256= ","vcompressps ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F38.W0 8A /r","V","V"= ,"AVX512F+AVX512VL","scale4","w,r,r","","" +"VCOMPRESSPS zmm2/m512, {k}{z}, zmm1","VCOMPRESSPS zmm1, {k}{z}, zmm2/m512= ","vcompressps zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F38.W0 8A /r","V","V"= ,"AVX512F","scale4","w,r,r","","" +"VCVTDQ2PD ymm1, xmm2/m128","VCVTDQ2PD xmm2/m128, ymm1","vcvtdq2pd xmm2/m1= 28, ymm1","VEX.256.F3.0F.WIG E6 /r","V","V","AVX","","w,r","","" +"VCVTDQ2PD xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTDQ2PD xmm2/m128/m32bcst, = {k}{z}, xmm1","vcvtdq2pd xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.F3.0F.W= 0 E6 /r","V","V","AVX512F+AVX512VL","bscale4,scale8","w,r,r","","" +"VCVTDQ2PD ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTDQ2PD xmm2/m256/m32bcst, = {k}{z}, ymm1","vcvtdq2pd xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.F3.0F.W= 0 E6 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","","" +"VCVTDQ2PD xmm1, xmm2/m64","VCVTDQ2PD xmm2/m64, xmm1","vcvtdq2pd xmm2/m64,= xmm1","VEX.128.F3.0F.WIG E6 /r","V","V","AVX","","w,r","","" +"VCVTDQ2PD zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTDQ2PD ymm2/m512/m32bcst, = {k}{z}, zmm1","vcvtdq2pd ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.F3.0F.W= 0 E6 /r","V","V","AVX512F","bscale4,scale32","w,r,r","","" +"VCVTDQ2PS xmm1, xmm2/m128","VCVTDQ2PS xmm2/m128, xmm1","vcvtdq2ps xmm2/m1= 28, xmm1","VEX.128.0F.WIG 5B /r","V","V","AVX","","w,r","","" +"VCVTDQ2PS xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTDQ2PS xmm2/m128/m32bcst, = {k}{z}, xmm1","vcvtdq2ps xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.0F.W0 5= B /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","","" +"VCVTDQ2PS ymm1, ymm2/m256","VCVTDQ2PS ymm2/m256, ymm1","vcvtdq2ps ymm2/m2= 56, ymm1","VEX.256.0F.WIG 5B /r","V","V","AVX","","w,r","","" +"VCVTDQ2PS ymm1, {k}{z}, ymm2/m256/m32bcst","VCVTDQ2PS ymm2/m256/m32bcst, = {k}{z}, ymm1","vcvtdq2ps ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.0F.W0 5= B /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","","" +"VCVTDQ2PS zmm1{er}, {k}{z}, zmm2","VCVTDQ2PS zmm2, {k}{z}, zmm1{er}","vcv= tdq2ps zmm2, {k}{z}, zmm1{er}","EVEX.512.0F.W0 5B /r","V","V","AVX512F","mo= drm_regonly","w,r,r","","" +"VCVTDQ2PS 
zmm1, {k}{z}, zmm2/m512/m32bcst","VCVTDQ2PS zmm2/m512/m32bcst, = {k}{z}, zmm1","vcvtdq2ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.0F.W0 5= B /r","V","V","AVX512F","bscale4,scale64","w,r,r","","" +"VCVTPD2DQ ymm1{er}, {k}{z}, zmm2","VCVTPD2DQ zmm2, {k}{z}, ymm1{er}","vcv= tpd2dq zmm2, {k}{z}, ymm1{er}","EVEX.512.F2.0F.W1 E6 /r","V","V","AVX512F",= "modrm_regonly","w,r,r","Y","" +"VCVTPD2DQ ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTPD2DQ zmm2/m512/m64bcst, = {k}{z}, ymm1","vcvtpd2dq zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.F2.0F.W= 1 E6 /r","V","V","AVX512F","bscale8,scale64","w,r,r","Y","512" +"VCVTPD2DQ xmm1, xmm2/m128","VCVTPD2DQX xmm2/m128, xmm1","vcvtpd2dqx xmm2/= m128, xmm1","VEX.128.F2.0F.WIG E6 /r","V","V","AVX","","w,r","Y","128" +"VCVTPD2DQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTPD2DQX xmm2/m128/m64bcst,= {k}{z}, xmm1","vcvtpd2dqx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.F2.0F= .W1 E6 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","Y","128" +"VCVTPD2DQ xmm1, ymm2/m256","VCVTPD2DQY ymm2/m256, xmm1","vcvtpd2dqy ymm2/= m256, xmm1","VEX.256.F2.0F.WIG E6 /r","V","V","AVX","","w,r","Y","256" +"VCVTPD2DQ xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTPD2DQY ymm2/m256/m64bcst,= {k}{z}, xmm1","vcvtpd2dqy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256.F2.0F= .W1 E6 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","Y","256" +"VCVTPD2PS ymm1{er}, {k}{z}, zmm2","VCVTPD2PS zmm2, {k}{z}, ymm1{er}","vcv= tpd2ps zmm2, {k}{z}, ymm1{er}","EVEX.512.66.0F.W1 5A /r","V","V","AVX512F",= "modrm_regonly","w,r,r","Y","" +"VCVTPD2PS ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTPD2PS zmm2/m512/m64bcst, = {k}{z}, ymm1","vcvtpd2ps zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.66.0F.W= 1 5A /r","V","V","AVX512F","bscale8,scale64","w,r,r","Y","512" +"VCVTPD2PS xmm1, xmm2/m128","VCVTPD2PSX xmm2/m128, xmm1","vcvtpd2psx xmm2/= m128, xmm1","VEX.128.66.0F.WIG 5A /r","V","V","AVX","","w,r","Y","128" +"VCVTPD2PS xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTPD2PSX xmm2/m128/m64bcst,= {k}{z}, xmm1","vcvtpd2psx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F= .W1 5A /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","Y","128" +"VCVTPD2PS xmm1, ymm2/m256","VCVTPD2PSY ymm2/m256, xmm1","vcvtpd2psy ymm2/= m256, xmm1","VEX.256.66.0F.WIG 5A /r","V","V","AVX","","w,r","Y","256" +"VCVTPD2PS xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTPD2PSY ymm2/m256/m64bcst,= {k}{z}, xmm1","vcvtpd2psy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256.66.0F= .W1 5A /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","Y","256" +"VCVTPD2QQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTPD2QQ xmm2/m128/m64bcst, = {k}{z}, xmm1","vcvtpd2qq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F.W= 1 7B /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","","" +"VCVTPD2QQ ymm1, {k}{z}, ymm2/m256/m64bcst","VCVTPD2QQ ymm2/m256/m64bcst, = {k}{z}, ymm1","vcvtpd2qq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F.W= 1 7B /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","","" +"VCVTPD2QQ zmm1{er}, {k}{z}, zmm2","VCVTPD2QQ zmm2, {k}{z}, zmm1{er}","vcv= tpd2qq zmm2, {k}{z}, zmm1{er}","EVEX.512.66.0F.W1 7B /r","V","V","AVX512DQ"= ,"modrm_regonly","w,r,r","","" +"VCVTPD2QQ zmm1, {k}{z}, zmm2/m512/m64bcst","VCVTPD2QQ zmm2/m512/m64bcst, = {k}{z}, zmm1","vcvtpd2qq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F.W= 1 7B /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","","" +"VCVTPD2UDQ ymm1{er}, {k}{z}, zmm2","VCVTPD2UDQ zmm2, {k}{z}, ymm1{er}","v= cvtpd2udq zmm2, {k}{z}, ymm1{er}","EVEX.512.0F.W1 79 /r","V","V","AVX512F",= "modrm_regonly","w,r,r","Y","" +"VCVTPD2UDQ ymm1, 
{k}{z}, zmm2/m512/m64bcst","VCVTPD2UDQ zmm2/m512/m64bcst= , {k}{z}, ymm1","vcvtpd2udq zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.0F.W= 1 79 /r","V","V","AVX512F","bscale8,scale64","w,r,r","Y","512" +"VCVTPD2UDQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTPD2UDQX xmm2/m128/m64bcs= t, {k}{z}, xmm1","vcvtpd2udqx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.0F= .W1 79 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","Y","128" +"VCVTPD2UDQ xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTPD2UDQY ymm2/m256/m64bcs= t, {k}{z}, xmm1","vcvtpd2udqy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256.0F= .W1 79 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","Y","256" +"VCVTPD2UQQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTPD2UQQ xmm2/m128/m64bcst= , {k}{z}, xmm1","vcvtpd2uqq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0= F.W1 79 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","","" +"VCVTPD2UQQ ymm1, {k}{z}, ymm2/m256/m64bcst","VCVTPD2UQQ ymm2/m256/m64bcst= , {k}{z}, ymm1","vcvtpd2uqq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0= F.W1 79 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","","" +"VCVTPD2UQQ zmm1{er}, {k}{z}, zmm2","VCVTPD2UQQ zmm2, {k}{z}, zmm1{er}","v= cvtpd2uqq zmm2, {k}{z}, zmm1{er}","EVEX.512.66.0F.W1 79 /r","V","V","AVX512= DQ","modrm_regonly","w,r,r","","" +"VCVTPD2UQQ zmm1, {k}{z}, zmm2/m512/m64bcst","VCVTPD2UQQ zmm2/m512/m64bcst= , {k}{z}, zmm1","vcvtpd2uqq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0= F.W1 79 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","","" +"VCVTPH2PS ymm1, xmm2/m128","VCVTPH2PS xmm2/m128, ymm1","vcvtph2ps xmm2/m1= 28, ymm1","VEX.256.66.0F38.W0 13 /r","V","V","F16C","","w,r","","" +"VCVTPH2PS ymm1, {k}{z}, xmm2/m128","VCVTPH2PS xmm2/m128, {k}{z}, ymm1","v= cvtph2ps xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.W0 13 /r","V","V","AVX5= 12F+AVX512VL","scale16","w,r,r","","" +"VCVTPH2PS xmm1, xmm2/m64","VCVTPH2PS xmm2/m64, xmm1","vcvtph2ps xmm2/m64,= xmm1","VEX.128.66.0F38.W0 13 /r","V","V","F16C","","w,r","","" +"VCVTPH2PS xmm1, {k}{z}, xmm2/m64","VCVTPH2PS xmm2/m64, {k}{z}, xmm1","vcv= tph2ps xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.W0 13 /r","V","V","AVX512F= +AVX512VL","scale8","w,r,r","","" +"VCVTPH2PS zmm1{sae}, {k}{z}, ymm2","VCVTPH2PS ymm2, {k}{z}, zmm1{sae}","v= cvtph2ps ymm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W0 13 /r","V","V","AVX5= 12F","modrm_regonly","w,r,r","","" +"VCVTPH2PS zmm1, {k}{z}, ymm2/m256","VCVTPH2PS ymm2/m256, {k}{z}, zmm1","v= cvtph2ps ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.W0 13 /r","V","V","AVX5= 12F","scale32","w,r,r","","" +"VCVTPS2DQ xmm1, xmm2/m128","VCVTPS2DQ xmm2/m128, xmm1","vcvtps2dq xmm2/m1= 28, xmm1","VEX.128.66.0F.WIG 5B /r","V","V","AVX","","w,r","","" +"VCVTPS2DQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTPS2DQ xmm2/m128/m32bcst, = {k}{z}, xmm1","vcvtps2dq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F.W= 0 5B /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","","" +"VCVTPS2DQ ymm1, ymm2/m256","VCVTPS2DQ ymm2/m256, ymm1","vcvtps2dq ymm2/m2= 56, ymm1","VEX.256.66.0F.WIG 5B /r","V","V","AVX","","w,r","","" +"VCVTPS2DQ ymm1, {k}{z}, ymm2/m256/m32bcst","VCVTPS2DQ ymm2/m256/m32bcst, = {k}{z}, ymm1","vcvtps2dq ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F.W= 0 5B /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","","" +"VCVTPS2DQ zmm1{er}, {k}{z}, zmm2","VCVTPS2DQ zmm2, {k}{z}, zmm1{er}","vcv= tps2dq zmm2, {k}{z}, zmm1{er}","EVEX.512.66.0F.W0 5B /r","V","V","AVX512F",= "modrm_regonly","w,r,r","","" +"VCVTPS2DQ zmm1, {k}{z}, zmm2/m512/m32bcst","VCVTPS2DQ zmm2/m512/m32bcst, = {k}{z}, 
zmm1","vcvtps2dq zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F.W= 0 5B /r","V","V","AVX512F","bscale4,scale64","w,r,r","","" +"VCVTPS2PD ymm1, xmm2/m128","VCVTPS2PD xmm2/m128, ymm1","vcvtps2pd xmm2/m1= 28, ymm1","VEX.256.0F.WIG 5A /r","V","V","AVX","","w,r","","" +"VCVTPS2PD xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTPS2PD xmm2/m128/m32bcst, = {k}{z}, xmm1","vcvtps2pd xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.0F.W0 5= A /r","V","V","AVX512F+AVX512VL","bscale4,scale8","w,r,r","","" +"VCVTPS2PD ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTPS2PD xmm2/m256/m32bcst, = {k}{z}, ymm1","vcvtps2pd xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.0F.W0 5= A /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","","" +"VCVTPS2PD xmm1, xmm2/m64","VCVTPS2PD xmm2/m64, xmm1","vcvtps2pd xmm2/m64,= xmm1","VEX.128.0F.WIG 5A /r","V","V","AVX","","w,r","","" +"VCVTPS2PD zmm1{sae}, {k}{z}, ymm2","VCVTPS2PD ymm2, {k}{z}, zmm1{sae}","v= cvtps2pd ymm2, {k}{z}, zmm1{sae}","EVEX.512.0F.W0 5A /r","V","V","AVX512F",= "modrm_regonly","w,r,r","","" +"VCVTPS2PD zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTPS2PD ymm2/m512/m32bcst, = {k}{z}, zmm1","vcvtps2pd ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.0F.W0 5= A /r","V","V","AVX512F","bscale4,scale32","w,r,r","","" +"VCVTPS2PH xmm2/m64, xmm1, imm8u","VCVTPS2PH imm8u, xmm1, xmm2/m64","vcvtp= s2ph imm8u, xmm1, xmm2/m64","VEX.128.66.0F3A.W0 1D /r ib","V","V","F16C",""= ,"w,r,r","","" +"VCVTPS2PH xmm2/m64, {k}{z}, xmm1, imm8u","VCVTPS2PH imm8u, xmm1, {k}{z}, = xmm2/m64","vcvtps2ph imm8u, xmm1, {k}{z}, xmm2/m64","EVEX.128.66.0F3A.W0 1D= /r ib","V","V","AVX512F+AVX512VL","scale8","w,r,r,r","","" +"VCVTPS2PH xmm2/m128, ymm1, imm8u","VCVTPS2PH imm8u, ymm1, xmm2/m128","vcv= tps2ph imm8u, ymm1, xmm2/m128","VEX.256.66.0F3A.W0 1D /r ib","V","V","F16C"= ,"","w,r,r","","" +"VCVTPS2PH xmm2/m128, {k}{z}, ymm1, imm8u","VCVTPS2PH imm8u, ymm1, {k}{z},= xmm2/m128","vcvtps2ph imm8u, ymm1, {k}{z}, xmm2/m128","EVEX.256.66.0F3A.W0= 1D /r ib","V","V","AVX512F+AVX512VL","scale16","w,r,r,r","","" +"VCVTPS2PH ymm2/m256, {k}{z}, zmm1, imm8u","VCVTPS2PH imm8u, zmm1, {k}{z},= ymm2/m256","vcvtps2ph imm8u, zmm1, {k}{z}, ymm2/m256","EVEX.512.66.0F3A.W0= 1D /r ib","V","V","AVX512F","scale32","w,r,r,r","","" +"VCVTPS2PH ymm2{sae}, {k}{z}, zmm1, imm8u","VCVTPS2PH imm8u, zmm1, {k}{z},= ymm2{sae}","vcvtps2ph imm8u, zmm1, {k}{z}, ymm2{sae}","EVEX.512.66.0F3A.W0= 1D /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VCVTPS2QQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTPS2QQ xmm2/m128/m32bcst, = {k}{z}, xmm1","vcvtps2qq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F.W= 0 7B /r","V","V","AVX512DQ+AVX512VL","bscale4,scale8","w,r,r","","" +"VCVTPS2QQ ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTPS2QQ xmm2/m256/m32bcst, = {k}{z}, ymm1","vcvtps2qq xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F.W= 0 7B /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r","","" +"VCVTPS2QQ zmm1{er}, {k}{z}, ymm2","VCVTPS2QQ ymm2, {k}{z}, zmm1{er}","vcv= tps2qq ymm2, {k}{z}, zmm1{er}","EVEX.512.66.0F.W0 7B /r","V","V","AVX512DQ"= ,"modrm_regonly","w,r,r","","" +"VCVTPS2QQ zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTPS2QQ ymm2/m512/m32bcst, = {k}{z}, zmm1","vcvtps2qq ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F.W= 0 7B /r","V","V","AVX512DQ","bscale4,scale32","w,r,r","","" +"VCVTPS2UDQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTPS2UDQ xmm2/m128/m32bcst= , {k}{z}, xmm1","vcvtps2udq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.0F.W= 0 79 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","","" +"VCVTPS2UDQ ymm1, {k}{z}, ymm2/m256/m32bcst","VCVTPS2UDQ 
ymm2/m256/m32bcst= , {k}{z}, ymm1","vcvtps2udq ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.0F.W= 0 79 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","","" +"VCVTPS2UDQ zmm1{er}, {k}{z}, zmm2","VCVTPS2UDQ zmm2, {k}{z}, zmm1{er}","v= cvtps2udq zmm2, {k}{z}, zmm1{er}","EVEX.512.0F.W0 79 /r","V","V","AVX512F",= "modrm_regonly","w,r,r","","" +"VCVTPS2UDQ zmm1, {k}{z}, zmm2/m512/m32bcst","VCVTPS2UDQ zmm2/m512/m32bcst= , {k}{z}, zmm1","vcvtps2udq zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.0F.W= 0 79 /r","V","V","AVX512F","bscale4,scale64","w,r,r","","" +"VCVTPS2UQQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTPS2UQQ xmm2/m128/m32bcst= , {k}{z}, xmm1","vcvtps2uqq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0= F.W0 79 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale8","w,r,r","","" +"VCVTPS2UQQ ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTPS2UQQ xmm2/m256/m32bcst= , {k}{z}, ymm1","vcvtps2uqq xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0= F.W0 79 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r","","" +"VCVTPS2UQQ zmm1{er}, {k}{z}, ymm2","VCVTPS2UQQ ymm2, {k}{z}, zmm1{er}","v= cvtps2uqq ymm2, {k}{z}, zmm1{er}","EVEX.512.66.0F.W0 79 /r","V","V","AVX512= DQ","modrm_regonly","w,r,r","","" +"VCVTPS2UQQ zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTPS2UQQ ymm2/m512/m32bcst= , {k}{z}, zmm1","vcvtps2uqq ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0= F.W0 79 /r","V","V","AVX512DQ","bscale4,scale32","w,r,r","","" +"VCVTQQ2PD xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTQQ2PD xmm2/m128/m64bcst, = {k}{z}, xmm1","vcvtqq2pd xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.F3.0F.W= 1 E6 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","","" +"VCVTQQ2PD ymm1, {k}{z}, ymm2/m256/m64bcst","VCVTQQ2PD ymm2/m256/m64bcst, = {k}{z}, ymm1","vcvtqq2pd ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.F3.0F.W= 1 E6 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","","" +"VCVTQQ2PD zmm1{er}, {k}{z}, zmm2","VCVTQQ2PD zmm2, {k}{z}, zmm1{er}","vcv= tqq2pd zmm2, {k}{z}, zmm1{er}","EVEX.512.F3.0F.W1 E6 /r","V","V","AVX512DQ"= ,"modrm_regonly","w,r,r","","" +"VCVTQQ2PD zmm1, {k}{z}, zmm2/m512/m64bcst","VCVTQQ2PD zmm2/m512/m64bcst, = {k}{z}, zmm1","vcvtqq2pd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.F3.0F.W= 1 E6 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","","" +"VCVTQQ2PS ymm1{er}, {k}{z}, zmm2","VCVTQQ2PS zmm2, {k}{z}, ymm1{er}","vcv= tqq2ps zmm2, {k}{z}, ymm1{er}","EVEX.512.0F.W1 5B /r","V","V","AVX512DQ","m= odrm_regonly","w,r,r","Y","" +"VCVTQQ2PS ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTQQ2PS zmm2/m512/m64bcst, = {k}{z}, ymm1","vcvtqq2ps zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.0F.W1 5= B /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","Y","512" +"VCVTQQ2PS xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTQQ2PSX xmm2/m128/m64bcst,= {k}{z}, xmm1","vcvtqq2psx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.0F.W1= 5B /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","Y","128" +"VCVTQQ2PS xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTQQ2PSY ymm2/m256/m64bcst,= {k}{z}, xmm1","vcvtqq2psy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256.0F.W1= 5B /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","Y","256" +"VCVTSD2SI r32{er}, xmm2","VCVTSD2SI xmm2, r32{er}","vcvtsd2si xmm2, r32{e= r}","EVEX.128.F2.0F.W0 2D /r","V","V","AVX512F","modrm_regonly","w,r","Y","= 32" +"VCVTSD2SI r32, xmm2/m64","VCVTSD2SI xmm2/m64, r32","vcvtsd2si xmm2/m64, r= 32","EVEX.LIG.F2.0F.W0 2D /r","V","V","AVX512F","scale8","w,r","Y","32" +"VCVTSD2SI r32, xmm2/m64","VCVTSD2SI xmm2/m64, r32","vcvtsd2si xmm2/m64, r= 32","VEX.LIG.F2.0F.W0 2D /r","V","V","AVX","","w,r","Y","32" 
+"VCVTSD2SI r64{er}, xmm2","VCVTSD2SIQ xmm2, r64{er}","vcvtsd2siq xmm2, r64= {er}","EVEX.128.F2.0F.W1 2D /r","N.S.","V","AVX512F","modrm_regonly","w,r",= "Y","64" +"VCVTSD2SI r64, xmm2/m64","VCVTSD2SIQ xmm2/m64, r64","vcvtsd2siq xmm2/m64,= r64","EVEX.LIG.F2.0F.W1 2D /r","N.S.","V","AVX512F","scale8","w,r","Y","64" +"VCVTSD2SI r64, xmm2/m64","VCVTSD2SIQ xmm2/m64, r64","vcvtsd2siq xmm2/m64,= r64","VEX.LIG.F2.0F.W1 2D /r","N.S.","V","AVX","","w,r","Y","64" +"VCVTSD2SS xmm1{er}, {k}{z}, xmmV, xmm2","VCVTSD2SS xmm2, xmmV, {k}{z}, xm= m1{er}","vcvtsd2ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F2.0F.W1 5A = /r","V","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VCVTSD2SS xmm1, xmmV, xmm2/m64","VCVTSD2SS xmm2/m64, xmmV, xmm1","vcvtsd2= ss xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 5A /r","V","V","AVX","","w,= r,r","","" +"VCVTSD2SS xmm1, {k}{z}, xmmV, xmm2/m64","VCVTSD2SS xmm2/m64, xmmV, {k}{z}= , xmm1","vcvtsd2ss xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 5A = /r","V","V","AVX512F","scale8","w,r,r,r","","" +"VCVTSD2USI r32{er}, xmm2","VCVTSD2USIL xmm2, r32{er}","vcvtsd2usi xmm2, r= 32{er}","EVEX.128.F2.0F.W0 79 /r","V","V","AVX512F","modrm_regonly","w,r","= Y","32" +"VCVTSD2USI r32, xmm2/m64","VCVTSD2USIL xmm2/m64, r32","vcvtsd2usi xmm2/m6= 4, r32","EVEX.LIG.F2.0F.W0 79 /r","V","V","AVX512F","scale8","w,r","Y","32" +"VCVTSD2USI r64{er}, xmm2","VCVTSD2USIQ xmm2, r64{er}","vcvtsd2usi xmm2, r= 64{er}","EVEX.128.F2.0F.W1 79 /r","N.S.","V","AVX512F","modrm_regonly","w,r= ","Y","64" +"VCVTSD2USI r64, xmm2/m64","VCVTSD2USIQ xmm2/m64, r64","vcvtsd2usi xmm2/m6= 4, r64","EVEX.LIG.F2.0F.W1 79 /r","N.S.","V","AVX512F","scale8","w,r","Y","= 64" +"VCVTSI2SD xmm1, xmmV, r/m32","VCVTSI2SDL r/m32, xmmV, xmm1","vcvtsi2sdl r= /m32, xmmV, xmm1","EVEX.NDS.LIG.F2.0F.W0 2A /r","V","V","AVX512F","scale4",= "w,r,r","Y","32" +"VCVTSI2SD xmm1, xmmV, r/m32","VCVTSI2SDL r/m32, xmmV, xmm1","vcvtsi2sdl r= /m32, xmmV, xmm1","VEX.NDS.LIG.F2.0F.W0 2A /r","V","V","AVX","","w,r,r","Y"= ,"32" +"VCVTSI2SD xmm1, xmmV, r/m64","VCVTSI2SDQ r/m64, xmmV, xmm1","vcvtsi2sdq r= /m64, xmmV, xmm1","EVEX.NDS.LIG.F2.0F.W1 2A /r","N.S.","V","AVX512F","scale= 8","w,r,r","Y","64" +"VCVTSI2SD xmm1, xmmV, r/m64","VCVTSI2SDQ r/m64, xmmV, xmm1","vcvtsi2sdq r= /m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.W1 2A /r","N.S.","V","AVX","","w,r,r",= "Y","64" +"VCVTSI2SD xmm1{er}, xmmV, rmr64","VCVTSI2SDQ rmr64, xmmV, xmm1{er}","vcvt= si2sdq rmr64, xmmV, xmm1{er}","EVEX.NDS.128.F2.0F.W1 2A /r","N.S.","V","AVX= 512F","modrm_regonly","w,r,r","Y","64" +"VCVTSI2SS xmm1, xmmV, r/m32","VCVTSI2SSL r/m32, xmmV, xmm1","vcvtsi2ssl r= /m32, xmmV, xmm1","EVEX.NDS.LIG.F3.0F.W0 2A /r","V","V","AVX512F","scale4",= "w,r,r","Y","32" +"VCVTSI2SS xmm1, xmmV, r/m32","VCVTSI2SSL r/m32, xmmV, xmm1","vcvtsi2ssl r= /m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.W0 2A /r","V","V","AVX","","w,r,r","Y"= ,"32" +"VCVTSI2SS xmm1{er}, xmmV, rmr32","VCVTSI2SSL rmr32, xmmV, xmm1{er}","vcvt= si2ssl rmr32, xmmV, xmm1{er}","EVEX.NDS.128.F3.0F.W0 2A /r","V","V","AVX512= F","modrm_regonly","w,r,r","Y","32" +"VCVTSI2SS xmm1, xmmV, r/m64","VCVTSI2SSQ r/m64, xmmV, xmm1","vcvtsi2ssq r= /m64, xmmV, xmm1","EVEX.NDS.LIG.F3.0F.W1 2A /r","N.S.","V","AVX512F","scale= 8","w,r,r","Y","64" +"VCVTSI2SS xmm1, xmmV, r/m64","VCVTSI2SSQ r/m64, xmmV, xmm1","vcvtsi2ssq r= /m64, xmmV, xmm1","VEX.NDS.LIG.F3.0F.W1 2A /r","N.S.","V","AVX","","w,r,r",= "Y","64" +"VCVTSI2SS xmm1{er}, xmmV, rmr64","VCVTSI2SSQ rmr64, xmmV, xmm1{er}","vcvt= si2ssq rmr64, xmmV, xmm1{er}","EVEX.NDS.128.F3.0F.W1 2A /r","N.S.","V","AVX= 
512F","modrm_regonly","w,r,r","Y","64" +"VCVTSS2SD xmm1{sae}, {k}{z}, xmmV, xmm2","VCVTSS2SD xmm2, xmmV, {k}{z}, x= mm1{sae}","vcvtss2sd xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.F3.0F.W0 = 5A /r","V","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VCVTSS2SD xmm1, xmmV, xmm2/m32","VCVTSS2SD xmm2/m32, xmmV, xmm1","vcvtss2= sd xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 5A /r","V","V","AVX","","w,= r,r","","" +"VCVTSS2SD xmm1, {k}{z}, xmmV, xmm2/m32","VCVTSS2SD xmm2/m32, xmmV, {k}{z}= , xmm1","vcvtss2sd xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 5A = /r","V","V","AVX512F","scale4","w,r,r,r","","" +"VCVTSS2SI r32{er}, xmm2","VCVTSS2SI xmm2, r32{er}","vcvtss2si xmm2, r32{e= r}","EVEX.128.F3.0F.W0 2D /r","V","V","AVX512F","modrm_regonly","w,r","Y","= 32" +"VCVTSS2SI r32, xmm2/m32","VCVTSS2SI xmm2/m32, r32","vcvtss2si xmm2/m32, r= 32","EVEX.LIG.F3.0F.W0 2D /r","V","V","AVX512F","scale4","w,r","Y","32" +"VCVTSS2SI r32, xmm2/m32","VCVTSS2SI xmm2/m32, r32","vcvtss2si xmm2/m32, r= 32","VEX.LIG.F3.0F.W0 2D /r","V","V","AVX","","w,r","Y","32" +"VCVTSS2SI r64{er}, xmm2","VCVTSS2SIQ xmm2, r64{er}","vcvtss2siq xmm2, r64= {er}","EVEX.128.F3.0F.W1 2D /r","N.S.","V","AVX512F","modrm_regonly","w,r",= "Y","64" +"VCVTSS2SI r64, xmm2/m32","VCVTSS2SIQ xmm2/m32, r64","vcvtss2siq xmm2/m32,= r64","EVEX.LIG.F3.0F.W1 2D /r","N.S.","V","AVX512F","scale4","w,r","Y","64" +"VCVTSS2SI r64, xmm2/m32","VCVTSS2SIQ xmm2/m32, r64","vcvtss2siq xmm2/m32,= r64","VEX.LIG.F3.0F.W1 2D /r","N.S.","V","AVX","","w,r","Y","64" +"VCVTSS2USI r32{er}, xmm2","VCVTSS2USIL xmm2, r32{er}","vcvtss2usil xmm2, = r32{er}","EVEX.128.F3.0F.W0 79 /r","V","V","AVX512F","modrm_regonly","w,r",= "Y","32" +"VCVTSS2USI r32, xmm2/m32","VCVTSS2USIL xmm2/m32, r32","vcvtss2usil xmm2/m= 32, r32","EVEX.LIG.F3.0F.W0 79 /r","V","V","AVX512F","scale4","w,r","Y","32" +"VCVTSS2USI r64{er}, xmm2","VCVTSS2USIQ xmm2, r64{er}","vcvtss2usiq xmm2, = r64{er}","EVEX.128.F3.0F.W1 79 /r","N.S.","V","AVX512F","modrm_regonly","w,= r","Y","64" +"VCVTSS2USI r64, xmm2/m32","VCVTSS2USIQ xmm2/m32, r64","vcvtss2usiq xmm2/m= 32, r64","EVEX.LIG.F3.0F.W1 79 /r","N.S.","V","AVX512F","scale4","w,r","Y",= "64" +"VCVTTPD2DQ ymm1{sae}, {k}{z}, zmm2","VCVTTPD2DQ zmm2, {k}{z}, ymm1{sae}",= "vcvttpd2dq zmm2, {k}{z}, ymm1{sae}","EVEX.512.66.0F.W1 E6 /r","V","V","AVX= 512F","modrm_regonly","w,r,r","Y","" +"VCVTTPD2DQ ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTTPD2DQ zmm2/m512/m64bcst= , {k}{z}, ymm1","vcvttpd2dq zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.66.0= F.W1 E6 /r","V","V","AVX512F","bscale8,scale64","w,r,r","Y","512" +"VCVTTPD2DQ xmm1, xmm2/m128","VCVTTPD2DQX xmm2/m128, xmm1","vcvttpd2dqx xm= m2/m128, xmm1","VEX.128.66.0F.WIG E6 /r","V","V","AVX","","w,r","Y","128" +"VCVTTPD2DQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTTPD2DQX xmm2/m128/m64bcs= t, {k}{z}, xmm1","vcvttpd2dqx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66= .0F.W1 E6 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","Y","128" +"VCVTTPD2DQ xmm1, ymm2/m256","VCVTTPD2DQY ymm2/m256, xmm1","vcvttpd2dqy ym= m2/m256, xmm1","VEX.256.66.0F.WIG E6 /r","V","V","AVX","","w,r","Y","256" +"VCVTTPD2DQ xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTTPD2DQY ymm2/m256/m64bcs= t, {k}{z}, xmm1","vcvttpd2dqy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256.66= .0F.W1 E6 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","Y","256" +"VCVTTPD2QQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTTPD2QQ xmm2/m128/m64bcst= , {k}{z}, xmm1","vcvttpd2qq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0= F.W1 7A 
/r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","","" +"VCVTTPD2QQ ymm1, {k}{z}, ymm2/m256/m64bcst","VCVTTPD2QQ ymm2/m256/m64bcst= , {k}{z}, ymm1","vcvttpd2qq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0= F.W1 7A /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","","" +"VCVTTPD2QQ zmm1{sae}, {k}{z}, zmm2","VCVTTPD2QQ zmm2, {k}{z}, zmm1{sae}",= "vcvttpd2qq zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F.W1 7A /r","V","V","AVX= 512DQ","modrm_regonly","w,r,r","","" +"VCVTTPD2QQ zmm1, {k}{z}, zmm2/m512/m64bcst","VCVTTPD2QQ zmm2/m512/m64bcst= , {k}{z}, zmm1","vcvttpd2qq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0= F.W1 7A /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","","" +"VCVTTPD2UDQ ymm1{sae}, {k}{z}, zmm2","VCVTTPD2UDQ zmm2, {k}{z}, ymm1{sae}= ","vcvttpd2udq zmm2, {k}{z}, ymm1{sae}","EVEX.512.0F.W1 78 /r","V","V","AVX= 512F","modrm_regonly","w,r,r","Y","" +"VCVTTPD2UDQ ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTTPD2UDQ zmm2/m512/m64bc= st, {k}{z}, ymm1","vcvttpd2udq zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.0= F.W1 78 /r","V","V","AVX512F","bscale8,scale64","w,r,r","Y","512" +"VCVTTPD2UDQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTTPD2UDQX xmm2/m128/m64b= cst, {k}{z}, xmm1","vcvttpd2udqx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128= .0F.W1 78 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","Y","128" +"VCVTTPD2UDQ xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTTPD2UDQY ymm2/m256/m64b= cst, {k}{z}, xmm1","vcvttpd2udqy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256= .0F.W1 78 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","Y","256" +"VCVTTPD2UQQ xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTTPD2UQQ xmm2/m128/m64bc= st, {k}{z}, xmm1","vcvttpd2uqq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.6= 6.0F.W1 78 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","","" +"VCVTTPD2UQQ ymm1, {k}{z}, ymm2/m256/m64bcst","VCVTTPD2UQQ ymm2/m256/m64bc= st, {k}{z}, ymm1","vcvttpd2uqq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.6= 6.0F.W1 78 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","","" +"VCVTTPD2UQQ zmm1{sae}, {k}{z}, zmm2","VCVTTPD2UQQ zmm2, {k}{z}, zmm1{sae}= ","vcvttpd2uqq zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F.W1 78 /r","V","V","= AVX512DQ","modrm_regonly","w,r,r","","" +"VCVTTPD2UQQ zmm1, {k}{z}, zmm2/m512/m64bcst","VCVTTPD2UQQ zmm2/m512/m64bc= st, {k}{z}, zmm1","vcvttpd2uqq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.6= 6.0F.W1 78 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","","" +"VCVTTPS2DQ xmm1, xmm2/m128","VCVTTPS2DQ xmm2/m128, xmm1","vcvttps2dq xmm2= /m128, xmm1","VEX.128.F3.0F.WIG 5B /r","V","V","AVX","","w,r","","" +"VCVTTPS2DQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTTPS2DQ xmm2/m128/m32bcst= , {k}{z}, xmm1","vcvttps2dq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.F3.0= F.W0 5B /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","","" +"VCVTTPS2DQ ymm1, ymm2/m256","VCVTTPS2DQ ymm2/m256, ymm1","vcvttps2dq ymm2= /m256, ymm1","VEX.256.F3.0F.WIG 5B /r","V","V","AVX","","w,r","","" +"VCVTTPS2DQ ymm1, {k}{z}, ymm2/m256/m32bcst","VCVTTPS2DQ ymm2/m256/m32bcst= , {k}{z}, ymm1","vcvttps2dq ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.F3.0= F.W0 5B /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","","" +"VCVTTPS2DQ zmm1{sae}, {k}{z}, zmm2","VCVTTPS2DQ zmm2, {k}{z}, zmm1{sae}",= "vcvttps2dq zmm2, {k}{z}, zmm1{sae}","EVEX.512.F3.0F.W0 5B /r","V","V","AVX= 512F","modrm_regonly","w,r,r","","" +"VCVTTPS2DQ zmm1, {k}{z}, zmm2/m512/m32bcst","VCVTTPS2DQ zmm2/m512/m32bcst= , {k}{z}, zmm1","vcvttps2dq zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.F3.0= F.W0 5B 
/r","V","V","AVX512F","bscale4,scale64","w,r,r","","" +"VCVTTPS2QQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTTPS2QQ xmm2/m128/m32bcst= , {k}{z}, xmm1","vcvttps2qq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0= F.W0 7A /r","V","V","AVX512DQ+AVX512VL","bscale4,scale8","w,r,r","","" +"VCVTTPS2QQ ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTTPS2QQ xmm2/m256/m32bcst= , {k}{z}, ymm1","vcvttps2qq xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0= F.W0 7A /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r","","" +"VCVTTPS2QQ zmm1{sae}, {k}{z}, ymm2","VCVTTPS2QQ ymm2, {k}{z}, zmm1{sae}",= "vcvttps2qq ymm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F.W0 7A /r","V","V","AVX= 512DQ","modrm_regonly","w,r,r","","" +"VCVTTPS2QQ zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTTPS2QQ ymm2/m512/m32bcst= , {k}{z}, zmm1","vcvttps2qq ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0= F.W0 7A /r","V","V","AVX512DQ","bscale4,scale32","w,r,r","","" +"VCVTTPS2UDQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTTPS2UDQ xmm2/m128/m32bc= st, {k}{z}, xmm1","vcvttps2udq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.0= F.W0 78 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","","" +"VCVTTPS2UDQ ymm1, {k}{z}, ymm2/m256/m32bcst","VCVTTPS2UDQ ymm2/m256/m32bc= st, {k}{z}, ymm1","vcvttps2udq ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.0= F.W0 78 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","","" +"VCVTTPS2UDQ zmm1{sae}, {k}{z}, zmm2","VCVTTPS2UDQ zmm2, {k}{z}, zmm1{sae}= ","vcvttps2udq zmm2, {k}{z}, zmm1{sae}","EVEX.512.0F.W0 78 /r","V","V","AVX= 512F","modrm_regonly","w,r,r","","" +"VCVTTPS2UDQ zmm1, {k}{z}, zmm2/m512/m32bcst","VCVTTPS2UDQ zmm2/m512/m32bc= st, {k}{z}, zmm1","vcvttps2udq zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.0= F.W0 78 /r","V","V","AVX512F","bscale4,scale64","w,r,r","","" +"VCVTTPS2UQQ xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTTPS2UQQ xmm2/m128/m32bc= st, {k}{z}, xmm1","vcvttps2uqq xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.6= 6.0F.W0 78 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale8","w,r,r","","" +"VCVTTPS2UQQ ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTTPS2UQQ xmm2/m256/m32bc= st, {k}{z}, ymm1","vcvttps2uqq xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.6= 6.0F.W0 78 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r","","" +"VCVTTPS2UQQ zmm1{sae}, {k}{z}, ymm2","VCVTTPS2UQQ ymm2, {k}{z}, zmm1{sae}= ","vcvttps2uqq ymm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F.W0 78 /r","V","V","= AVX512DQ","modrm_regonly","w,r,r","","" +"VCVTTPS2UQQ zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTTPS2UQQ ymm2/m512/m32bc= st, {k}{z}, zmm1","vcvttps2uqq ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.6= 6.0F.W0 78 /r","V","V","AVX512DQ","bscale4,scale32","w,r,r","","" +"VCVTTSD2SI r32{sae}, xmm2","VCVTTSD2SI xmm2, r32{sae}","vcvttsd2si xmm2, = r32{sae}","EVEX.128.F2.0F.W0 2C /r","V","V","AVX512F","modrm_regonly","w,r"= ,"Y","32" +"VCVTTSD2SI r32, xmm2/m64","VCVTTSD2SI xmm2/m64, r32","vcvttsd2si xmm2/m64= , r32","EVEX.LIG.F2.0F.W0 2C /r","V","V","AVX512F","scale8","w,r","Y","32" +"VCVTTSD2SI r32, xmm2/m64","VCVTTSD2SI xmm2/m64, r32","vcvttsd2si xmm2/m64= , r32","VEX.LIG.F2.0F.W0 2C /r","V","V","AVX","","w,r","Y","32" +"VCVTTSD2SI r64{sae}, xmm2","VCVTTSD2SIQ xmm2, r64{sae}","vcvttsd2siq xmm2= , r64{sae}","EVEX.128.F2.0F.W1 2C /r","N.S.","V","AVX512F","modrm_regonly",= "w,r","Y","64" +"VCVTTSD2SI r64, xmm2/m64","VCVTTSD2SIQ xmm2/m64, r64","vcvttsd2siq xmm2/m= 64, r64","EVEX.LIG.F2.0F.W1 2C /r","N.S.","V","AVX512F","scale8","w,r","Y",= "64" +"VCVTTSD2SI r64, xmm2/m64","VCVTTSD2SIQ xmm2/m64, r64","vcvttsd2siq xmm2/m= 64, r64","VEX.LIG.F2.0F.W1 2C 
/r","N.S.","V","AVX","","w,r","Y","64" +"VCVTTSD2USI r32{sae}, xmm2","VCVTTSD2USIL xmm2, r32{sae}","vcvttsd2usil x= mm2, r32{sae}","EVEX.128.F2.0F.W0 78 /r","V","V","AVX512F","modrm_regonly",= "w,r","Y","32" +"VCVTTSD2USI r32, xmm2/m64","VCVTTSD2USIL xmm2/m64, r32","vcvttsd2usil xmm= 2/m64, r32","EVEX.LIG.F2.0F.W0 78 /r","V","V","AVX512F","scale8","w,r","Y",= "32" +"VCVTTSD2USI r64{sae}, xmm2","VCVTTSD2USIQ xmm2, r64{sae}","vcvttsd2usiq x= mm2, r64{sae}","EVEX.128.F2.0F.W1 78 /r","N.S.","V","AVX512F","modrm_regonl= y","w,r","Y","64" +"VCVTTSD2USI r64, xmm2/m64","VCVTTSD2USIQ xmm2/m64, r64","vcvttsd2usiq xmm= 2/m64, r64","EVEX.LIG.F2.0F.W1 78 /r","N.S.","V","AVX512F","scale8","w,r","= Y","64" +"VCVTTSS2SI r32{sae}, xmm2","VCVTTSS2SI xmm2, r32{sae}","vcvttss2si xmm2, = r32{sae}","EVEX.128.F3.0F.W0 2C /r","V","V","AVX512F","modrm_regonly","w,r"= ,"Y","32" +"VCVTTSS2SI r32, xmm2/m32","VCVTTSS2SI xmm2/m32, r32","vcvttss2si xmm2/m32= , r32","EVEX.LIG.F3.0F.W0 2C /r","V","V","AVX512F","scale4","w,r","Y","32" +"VCVTTSS2SI r32, xmm2/m32","VCVTTSS2SI xmm2/m32, r32","vcvttss2si xmm2/m32= , r32","VEX.LIG.F3.0F.W0 2C /r","V","V","AVX","","w,r","Y","32" +"VCVTTSS2SI r64{sae}, xmm2","VCVTTSS2SIQ xmm2, r64{sae}","vcvttss2siq xmm2= , r64{sae}","EVEX.128.F3.0F.W1 2C /r","N.S.","V","AVX512F","modrm_regonly",= "w,r","Y","64" +"VCVTTSS2SI r64, xmm2/m32","VCVTTSS2SIQ xmm2/m32, r64","vcvttss2siq xmm2/m= 32, r64","EVEX.LIG.F3.0F.W1 2C /r","N.S.","V","AVX512F","scale4","w,r","Y",= "64" +"VCVTTSS2SI r64, xmm2/m32","VCVTTSS2SIQ xmm2/m32, r64","vcvttss2siq xmm2/m= 32, r64","VEX.LIG.F3.0F.W1 2C /r","N.S.","V","AVX","","w,r","Y","64" +"VCVTTSS2USI r32{sae}, xmm2","VCVTTSS2USIL xmm2, r32{sae}","vcvttss2usil x= mm2, r32{sae}","EVEX.128.F3.0F.W0 78 /r","V","V","AVX512F","modrm_regonly",= "w,r","Y","32" +"VCVTTSS2USI r32, xmm2/m32","VCVTTSS2USIL xmm2/m32, r32","vcvttss2usil xmm= 2/m32, r32","EVEX.LIG.F3.0F.W0 78 /r","V","V","AVX512F","scale4","w,r","Y",= "32" +"VCVTTSS2USI r64{sae}, xmm2","VCVTTSS2USIQ xmm2, r64{sae}","vcvttss2usiq x= mm2, r64{sae}","EVEX.128.F3.0F.W1 78 /r","N.S.","V","AVX512F","modrm_regonl= y","w,r","Y","64" +"VCVTTSS2USI r64, xmm2/m32","VCVTTSS2USIQ xmm2/m32, r64","vcvttss2usiq xmm= 2/m32, r64","EVEX.LIG.F3.0F.W1 78 /r","N.S.","V","AVX512F","scale4","w,r","= Y","64" +"VCVTUDQ2PD xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTUDQ2PD xmm2/m128/m32bcst= , {k}{z}, xmm1","vcvtudq2pd xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.F3.0= F.W0 7A /r","V","V","AVX512F+AVX512VL","bscale4,scale8","w,r,r","","" +"VCVTUDQ2PD ymm1, {k}{z}, xmm2/m256/m32bcst","VCVTUDQ2PD xmm2/m256/m32bcst= , {k}{z}, ymm1","vcvtudq2pd xmm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.F3.0= F.W0 7A /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","","" +"VCVTUDQ2PD zmm1, {k}{z}, ymm2/m512/m32bcst","VCVTUDQ2PD ymm2/m512/m32bcst= , {k}{z}, zmm1","vcvtudq2pd ymm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.F3.0= F.W0 7A /r","V","V","AVX512F","bscale4,scale32","w,r,r","","" +"VCVTUDQ2PS xmm1, {k}{z}, xmm2/m128/m32bcst","VCVTUDQ2PS xmm2/m128/m32bcst= , {k}{z}, xmm1","vcvtudq2ps xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.F2.0= F.W0 7A /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","","" +"VCVTUDQ2PS ymm1, {k}{z}, ymm2/m256/m32bcst","VCVTUDQ2PS ymm2/m256/m32bcst= , {k}{z}, ymm1","vcvtudq2ps ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.F2.0= F.W0 7A /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","","" +"VCVTUDQ2PS zmm1{er}, {k}{z}, zmm2","VCVTUDQ2PS zmm2, {k}{z}, zmm1{er}","v= cvtudq2ps zmm2, {k}{z}, zmm1{er}","EVEX.512.F2.0F.W0 7A 
/r","V","V","AVX512= F","modrm_regonly","w,r,r","","" +"VCVTUDQ2PS zmm1, {k}{z}, zmm2/m512/m32bcst","VCVTUDQ2PS zmm2/m512/m32bcst= , {k}{z}, zmm1","vcvtudq2ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.F2.0= F.W0 7A /r","V","V","AVX512F","bscale4,scale64","w,r,r","","" +"VCVTUQQ2PD xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTUQQ2PD xmm2/m128/m64bcst= , {k}{z}, xmm1","vcvtuqq2pd xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.F3.0= F.W1 7A /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","","" +"VCVTUQQ2PD ymm1, {k}{z}, ymm2/m256/m64bcst","VCVTUQQ2PD ymm2/m256/m64bcst= , {k}{z}, ymm1","vcvtuqq2pd ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.F3.0= F.W1 7A /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","","" +"VCVTUQQ2PD zmm1{er}, {k}{z}, zmm2","VCVTUQQ2PD zmm2, {k}{z}, zmm1{er}","v= cvtuqq2pd zmm2, {k}{z}, zmm1{er}","EVEX.512.F3.0F.W1 7A /r","V","V","AVX512= DQ","modrm_regonly","w,r,r","","" +"VCVTUQQ2PD zmm1, {k}{z}, zmm2/m512/m64bcst","VCVTUQQ2PD zmm2/m512/m64bcst= , {k}{z}, zmm1","vcvtuqq2pd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.F3.0= F.W1 7A /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","","" +"VCVTUQQ2PS ymm1{er}, {k}{z}, zmm2","VCVTUQQ2PS zmm2, {k}{z}, ymm1{er}","v= cvtuqq2ps zmm2, {k}{z}, ymm1{er}","EVEX.512.F2.0F.W1 7A /r","V","V","AVX512= DQ","modrm_regonly","w,r,r","Y","" +"VCVTUQQ2PS ymm1, {k}{z}, zmm2/m512/m64bcst","VCVTUQQ2PS zmm2/m512/m64bcst= , {k}{z}, ymm1","vcvtuqq2ps zmm2/m512/m64bcst, {k}{z}, ymm1","EVEX.512.F2.0= F.W1 7A /r","V","V","AVX512DQ","bscale8,scale64","w,r,r","Y","512" +"VCVTUQQ2PS xmm1, {k}{z}, xmm2/m128/m64bcst","VCVTUQQ2PSX xmm2/m128/m64bcs= t, {k}{z}, xmm1","vcvtuqq2psx xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.F2= .0F.W1 7A /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r","Y","12= 8" +"VCVTUQQ2PS xmm1, {k}{z}, ymm2/m256/m64bcst","VCVTUQQ2PSY ymm2/m256/m64bcs= t, {k}{z}, xmm1","vcvtuqq2psy ymm2/m256/m64bcst, {k}{z}, xmm1","EVEX.256.F2= .0F.W1 7A /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r","Y","25= 6" +"VCVTUSI2SD xmm1, xmmV, r/m32","VCVTUSI2SDL r/m32, xmmV, xmm1","vcvtusi2sd= r/m32, xmmV, xmm1","EVEX.NDS.LIG.F2.0F.W0 7B /r","V","V","AVX512F","scale4= ","w,r,r","Y","32" +"VCVTUSI2SD xmm1, xmmV, r/m64","VCVTUSI2SDQ r/m64, xmmV, xmm1","vcvtusi2sd= r/m64, xmmV, xmm1","EVEX.NDS.LIG.F2.0F.W1 7B /r","N.S.","V","AVX512F","sca= le8","w,r,r","Y","64" +"VCVTUSI2SD xmm1{er}, xmmV, rmr64","VCVTUSI2SDQ rmr64, xmmV, xmm1{er}","vc= vtusi2sd rmr64, xmmV, xmm1{er}","EVEX.NDS.128.F2.0F.W1 7B /r","N.S.","V","A= VX512F","modrm_regonly","w,r,r","Y","64" +"VCVTUSI2SS xmm1, xmmV, r/m32","VCVTUSI2SSL r/m32, xmmV, xmm1","vcvtusi2ss= l r/m32, xmmV, xmm1","EVEX.NDS.LIG.F3.0F.W0 7B /r","V","V","AVX512F","scale= 4","w,r,r","Y","32" +"VCVTUSI2SS xmm1{er}, xmmV, rmr32","VCVTUSI2SSL rmr32, xmmV, xmm1{er}","vc= vtusi2ssl rmr32, xmmV, xmm1{er}","EVEX.NDS.128.F3.0F.W0 7B /r","V","V","AVX= 512F","modrm_regonly","w,r,r","Y","32" +"VCVTUSI2SS xmm1, xmmV, r/m64","VCVTUSI2SSQ r/m64, xmmV, xmm1","vcvtusi2ss= q r/m64, xmmV, xmm1","EVEX.NDS.LIG.F3.0F.W1 7B /r","N.S.","V","AVX512F","sc= ale8","w,r,r","Y","64" +"VCVTUSI2SS xmm1{er}, xmmV, rmr64","VCVTUSI2SSQ rmr64, xmmV, xmm1{er}","vc= vtusi2ssq rmr64, xmmV, xmm1{er}","EVEX.NDS.128.F3.0F.W1 7B /r","N.S.","V","= AVX512F","modrm_regonly","w,r,r","Y","64" +"VDBPSADBW xmm1, {k}{z}, xmmV, xmm2/m128, imm8u","VDBPSADBW imm8u, xmm2/m1= 28, xmmV, {k}{z}, xmm1","vdbpsadbw imm8u, xmm2/m128, xmmV, {k}{z}, xmm1","E= VEX.NDS.128.66.0F3A.W0 42 /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r= ,r,r,r","","" +"VDBPSADBW ymm1, 
{k}{z}, ymmV, ymm2/m256, imm8u","VDBPSADBW imm8u, ymm2/m2= 56, ymmV, {k}{z}, ymm1","vdbpsadbw imm8u, ymm2/m256, ymmV, {k}{z}, ymm1","E= VEX.NDS.256.66.0F3A.W0 42 /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r= ,r,r,r","","" +"VDBPSADBW zmm1, {k}{z}, zmmV, zmm2/m512, imm8u","VDBPSADBW imm8u, zmm2/m5= 12, zmmV, {k}{z}, zmm1","vdbpsadbw imm8u, zmm2/m512, zmmV, {k}{z}, zmm1","E= VEX.NDS.512.66.0F3A.W0 42 /r ib","V","V","AVX512BW","scale64","w,r,r,r,r","= ","" +"VDIVPD xmm1, xmmV, xmm2/m128","VDIVPD xmm2/m128, xmmV, xmm1","vdivpd xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 5E /r","V","V","AVX","","w,r,r","= ","" +"VDIVPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VDIVPD xmm2/m128/m64bcst, = xmmV, {k}{z}, xmm1","vdivpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F.W1 5E /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r= ","","" +"VDIVPD ymm1, ymmV, ymm2/m256","VDIVPD ymm2/m256, ymmV, ymm1","vdivpd ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 5E /r","V","V","AVX","","w,r,r","= ","" +"VDIVPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VDIVPD ymm2/m256/m64bcst, = ymmV, {k}{z}, ymm1","vdivpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F.W1 5E /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r= ","","" +"VDIVPD zmm1{er}, {k}{z}, zmmV, zmm2","VDIVPD zmm2, zmmV, {k}{z}, zmm1{er}= ","vdivpd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.66.0F.W1 5E /r","V","= V","AVX512F","modrm_regonly","w,r,r,r","","" +"VDIVPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VDIVPD zmm2/m512/m64bcst, = zmmV, {k}{z}, zmm1","vdivpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F.W1 5E /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VDIVPS xmm1, xmmV, xmm2/m128","VDIVPS xmm2/m128, xmmV, xmm1","vdivps xmm2= /m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 5E /r","V","V","AVX","","w,r,r","","" +"VDIVPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VDIVPS xmm2/m128/m32bcst, = xmmV, {k}{z}, xmm1","vdivps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.0F.W0 5E /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","= ","" +"VDIVPS ymm1, ymmV, ymm2/m256","VDIVPS ymm2/m256, ymmV, ymm1","vdivps ymm2= /m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 5E /r","V","V","AVX","","w,r,r","","" +"VDIVPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VDIVPS ymm2/m256/m32bcst, = ymmV, {k}{z}, ymm1","vdivps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.0F.W0 5E /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","= ","" +"VDIVPS zmm1{er}, {k}{z}, zmmV, zmm2","VDIVPS zmm2, zmmV, {k}{z}, zmm1{er}= ","vdivps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.0F.W0 5E /r","V","V",= "AVX512F","modrm_regonly","w,r,r,r","","" +"VDIVPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VDIVPS zmm2/m512/m32bcst, = zmmV, {k}{z}, zmm1","vdivps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.0F.W0 5E /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VDIVSD xmm1{er}, {k}{z}, xmmV, xmm2","VDIVSD xmm2, xmmV, {k}{z}, xmm1{er}= ","vdivsd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F2.0F.W1 5E /r","V","= V","AVX512F","modrm_regonly","w,r,r,r","","" +"VDIVSD xmm1, xmmV, xmm2/m64","VDIVSD xmm2/m64, xmmV, xmm1","vdivsd xmm2/m= 64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 5E /r","V","V","AVX","","w,r,r","","" +"VDIVSD xmm1, {k}{z}, xmmV, xmm2/m64","VDIVSD xmm2/m64, xmmV, {k}{z}, xmm1= ","vdivsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 5E /r","V","= V","AVX512F","scale8","w,r,r,r","","" +"VDIVSS xmm1{er}, {k}{z}, xmmV, xmm2","VDIVSS xmm2, xmmV, {k}{z}, xmm1{er}= ","vdivss xmm2, xmmV, {k}{z}, 
xmm1{er}","EVEX.NDS.128.F3.0F.W0 5E /r","V","= V","AVX512F","modrm_regonly","w,r,r,r","","" +"VDIVSS xmm1, xmmV, xmm2/m32","VDIVSS xmm2/m32, xmmV, xmm1","vdivss xmm2/m= 32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 5E /r","V","V","AVX","","w,r,r","","" +"VDIVSS xmm1, {k}{z}, xmmV, xmm2/m32","VDIVSS xmm2/m32, xmmV, {k}{z}, xmm1= ","vdivss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 5E /r","V","= V","AVX512F","scale4","w,r,r,r","","" +"VDPPD xmm1, xmmV, xmm2/m128, imm8u","VDPPD imm8u, xmm2/m128, xmmV, xmm1",= "vdppd imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 41 /r ib","V"= ,"V","AVX","","w,r,r,r","","" +"VDPPS xmm1, xmmV, xmm2/m128, imm8u","VDPPS imm8u, xmm2/m128, xmmV, xmm1",= "vdpps imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 40 /r ib","V"= ,"V","AVX","","w,r,r,r","","" +"VDPPS ymm1, ymmV, ymm2/m256, imm8u","VDPPS imm8u, ymm2/m256, ymmV, ymm1",= "vdpps imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WIG 40 /r ib","V"= ,"V","AVX","","w,r,r,r","","" +"VERR r/m16","VERR r/m16","verr r/m16","0F 00 /4","V","V","","","r","","" +"VERW r/m16","VERW r/m16","verw r/m16","0F 00 /5","V","V","","","r","","" +"VEXP2PD zmm1{sae}, {k}{z}, zmm2","VEXP2PD zmm2, {k}{z}, zmm1{sae}","vexp2= pd zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W1 C8 /r","V","V","AVX512ER",= "modrm_regonly","w,r,r","","" +"VEXP2PD zmm1, {k}{z}, zmm2/m512/m64bcst","VEXP2PD zmm2/m512/m64bcst, {k}{= z}, zmm1","vexp2pd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1 C8= /r","V","V","AVX512ER","bscale8,scale64","w,r,r","","" +"VEXP2PS zmm1{sae}, {k}{z}, zmm2","VEXP2PS zmm2, {k}{z}, zmm1{sae}","vexp2= ps zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W0 C8 /r","V","V","AVX512ER",= "modrm_regonly","w,r,r","","" +"VEXP2PS zmm1, {k}{z}, zmm2/m512/m32bcst","VEXP2PS zmm2/m512/m32bcst, {k}{= z}, zmm1","vexp2ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0 C8= /r","V","V","AVX512ER","bscale4,scale64","w,r,r","","" +"VEXPANDPD xmm1, {k}{z}, xmm2/m128","VEXPANDPD xmm2/m128, {k}{z}, xmm1","v= expandpd xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W1 88 /r","V","V","AVX5= 12F+AVX512VL","scale8","w,r,r","","" +"VEXPANDPD ymm1, {k}{z}, ymm2/m256","VEXPANDPD ymm2/m256, {k}{z}, ymm1","v= expandpd ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W1 88 /r","V","V","AVX5= 12F+AVX512VL","scale8","w,r,r","","" +"VEXPANDPD zmm1, {k}{z}, zmm2/m512","VEXPANDPD zmm2/m512, {k}{z}, zmm1","v= expandpd zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W1 88 /r","V","V","AVX5= 12F","scale8","w,r,r","","" +"VEXPANDPS xmm1, {k}{z}, xmm2/m128","VEXPANDPS xmm2/m128, {k}{z}, xmm1","v= expandps xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W0 88 /r","V","V","AVX5= 12F+AVX512VL","scale4","w,r,r","","" +"VEXPANDPS ymm1, {k}{z}, ymm2/m256","VEXPANDPS ymm2/m256, {k}{z}, ymm1","v= expandps ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W0 88 /r","V","V","AVX5= 12F+AVX512VL","scale4","w,r,r","","" +"VEXPANDPS zmm1, {k}{z}, zmm2/m512","VEXPANDPS zmm2/m512, {k}{z}, zmm1","v= expandps zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W0 88 /r","V","V","AVX5= 12F","scale4","w,r,r","","" +"VEXTRACTF128 xmm2/m128, ymm1, imm8u:1","VEXTRACTF128 imm8u:1, ymm1, xmm2/= m128","vextractf128 imm8u:1, ymm1, xmm2/m128","VEX.256.66.0F3A.W0 19 /r ib"= ,"V","V","AVX","","w,r,r","","" +"VEXTRACTF32X4 xmm2/m128, {k}{z}, ymm1, imm8u:1","VEXTRACTF32X4 imm8u:1, y= mm1, {k}{z}, xmm2/m128","vextractf32x4 imm8u:1, ymm1, {k}{z}, xmm2/m128","E= VEX.256.66.0F3A.W0 19 /r ib","V","V","AVX512F+AVX512VL","scale16","w,r,r,r"= ,"","" +"VEXTRACTF32X4 xmm2/m128, {k}{z}, zmm1, imm8u:2","VEXTRACTF32X4 imm8u:2, z= 
mm1, {k}{z}, xmm2/m128","vextractf32x4 imm8u:2, zmm1, {k}{z}, xmm2/m128","E= VEX.512.66.0F3A.W0 19 /r ib","V","V","AVX512F","scale16","w,r,r,r","","" +"VEXTRACTF32X8 ymm2/m256, {k}{z}, zmm1, imm8u:1","VEXTRACTF32X8 imm8u:1, z= mm1, {k}{z}, ymm2/m256","vextractf32x8 imm8u:1, zmm1, {k}{z}, ymm2/m256","E= VEX.512.66.0F3A.W0 1B /r ib","V","V","AVX512DQ","scale32","w,r,r,r","","" +"VEXTRACTF64X2 xmm2/m128, {k}{z}, ymm1, imm8u:1","VEXTRACTF64X2 imm8u:1, y= mm1, {k}{z}, xmm2/m128","vextractf64x2 imm8u:1, ymm1, {k}{z}, xmm2/m128","E= VEX.256.66.0F3A.W1 19 /r ib","V","V","AVX512DQ+AVX512VL","scale16","w,r,r,r= ","","" +"VEXTRACTF64X2 xmm2/m128, {k}{z}, zmm1, imm8u:2","VEXTRACTF64X2 imm8u:2, z= mm1, {k}{z}, xmm2/m128","vextractf64x2 imm8u:2, zmm1, {k}{z}, xmm2/m128","E= VEX.512.66.0F3A.W1 19 /r ib","V","V","AVX512DQ","scale16","w,r,r,r","","" +"VEXTRACTF64X4 ymm2/m256, {k}{z}, zmm1, imm8u","VEXTRACTF64X4 imm8u, zmm1,= {k}{z}, ymm2/m256","vextractf64x4 imm8u, zmm1, {k}{z}, ymm2/m256","EVEX.51= 2.66.0F3A.W1 1B /r ib","V","V","AVX512F","scale32","w,r,r,r","","" +"VEXTRACTI128 xmm2/m128, ymm1, imm8u:1","VEXTRACTI128 imm8u:1, ymm1, xmm2/= m128","vextracti128 imm8u:1, ymm1, xmm2/m128","VEX.256.66.0F3A.W0 39 /r ib"= ,"V","V","AVX2","","w,r,r","","" +"VEXTRACTI32X4 xmm2/m128, {k}{z}, ymm1, imm8u:1","VEXTRACTI32X4 imm8u:1, y= mm1, {k}{z}, xmm2/m128","vextracti32x4 imm8u:1, ymm1, {k}{z}, xmm2/m128","E= VEX.256.66.0F3A.W0 39 /r ib","V","V","AVX512F+AVX512VL","scale16","w,r,r,r"= ,"","" +"VEXTRACTI32X4 xmm2/m128, {k}{z}, zmm1, imm8u:2","VEXTRACTI32X4 imm8u:2, z= mm1, {k}{z}, xmm2/m128","vextracti32x4 imm8u:2, zmm1, {k}{z}, xmm2/m128","E= VEX.512.66.0F3A.W0 39 /r ib","V","V","AVX512F","scale16","w,r,r,r","","" +"VEXTRACTI32X8 ymm2/m256, {k}{z}, zmm1, imm8u:1","VEXTRACTI32X8 imm8u:1, z= mm1, {k}{z}, ymm2/m256","vextracti32x8 imm8u:1, zmm1, {k}{z}, ymm2/m256","E= VEX.512.66.0F3A.W0 3B /r ib","V","V","AVX512DQ","scale32","w,r,r,r","","" +"VEXTRACTI64X2 xmm2/m128, {k}{z}, ymm1, imm8u:1","VEXTRACTI64X2 imm8u:1, y= mm1, {k}{z}, xmm2/m128","vextracti64x2 imm8u:1, ymm1, {k}{z}, xmm2/m128","E= VEX.256.66.0F3A.W1 39 /r ib","V","V","AVX512DQ+AVX512VL","scale16","w,r,r,r= ","","" +"VEXTRACTI64X2 xmm2/m128, {k}{z}, zmm1, imm8u:2","VEXTRACTI64X2 imm8u:2, z= mm1, {k}{z}, xmm2/m128","vextracti64x2 imm8u:2, zmm1, {k}{z}, xmm2/m128","E= VEX.512.66.0F3A.W1 39 /r ib","V","V","AVX512DQ","scale16","w,r,r,r","","" +"VEXTRACTI64X4 ymm2/m256, {k}{z}, zmm1, imm8u:1","VEXTRACTI64X4 imm8u:1, z= mm1, {k}{z}, ymm2/m256","vextracti64x4 imm8u:1, zmm1, {k}{z}, ymm2/m256","E= VEX.512.66.0F3A.W1 3B /r ib","V","V","AVX512F","scale32","w,r,r,r","","" +"VEXTRACTPS r/m32, xmm1, imm8u:2","VEXTRACTPS imm8u:2, xmm1, r/m32","vextr= actps imm8u:2, xmm1, r/m32","EVEX.128.66.0F3A.WIG 17 /r ib","V","V","AVX512= F+AVX512VL","scale4","w,r,r","","" +"VEXTRACTPS r/m32, xmm1, imm8u:2","VEXTRACTPS imm8u:2, xmm1, r/m32","vextr= actps imm8u:2, xmm1, r/m32","VEX.128.66.0F3A.WIG 17 /r ib","V","V","AVX",""= ,"w,r,r","","" +"VFIXUPIMMPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VFIXUPIMMPD im= m8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfixupimmpd imm8u, xmm2/m128/m= 64bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F3A.W1 54 /r ib","V","V","AVX= 512F+AVX512VL","bscale8,scale16","rw,r,r,r,r","","" +"VFIXUPIMMPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VFIXUPIMMPD im= m8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfixupimmpd imm8u, ymm2/m256/m= 64bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F3A.W1 54 /r ib","V","V","AVX= 
512F+AVX512VL","bscale8,scale32","rw,r,r,r,r","","" +"VFIXUPIMMPD zmm1{sae}, {k}{z}, zmmV, zmm2, imm8u","VFIXUPIMMPD imm8u, zmm= 2, zmmV, {k}{z}, zmm1{sae}","vfixupimmpd imm8u, zmm2, zmmV, {k}{z}, zmm1{sa= e}","EVEX.DDS.512.66.0F3A.W1 54 /r ib","V","V","AVX512F","modrm_regonly","r= w,r,r,r,r","","" +"VFIXUPIMMPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VFIXUPIMMPD im= m8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfixupimmpd imm8u, zmm2/m512/m= 64bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F3A.W1 54 /r ib","V","V","AVX= 512F","bscale8,scale64","rw,r,r,r,r","","" +"VFIXUPIMMPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u","VFIXUPIMMPS im= m8u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfixupimmps imm8u, xmm2/m128/m= 32bcst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F3A.W0 54 /r ib","V","V","AVX= 512F+AVX512VL","bscale4,scale16","rw,r,r,r,r","","" +"VFIXUPIMMPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VFIXUPIMMPS im= m8u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfixupimmps imm8u, ymm2/m256/m= 32bcst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F3A.W0 54 /r ib","V","V","AVX= 512F+AVX512VL","bscale4,scale32","rw,r,r,r,r","","" +"VFIXUPIMMPS zmm1{sae}, {k}{z}, zmmV, zmm2, imm8u","VFIXUPIMMPS imm8u, zmm= 2, zmmV, {k}{z}, zmm1{sae}","vfixupimmps imm8u, zmm2, zmmV, {k}{z}, zmm1{sa= e}","EVEX.DDS.512.66.0F3A.W0 54 /r ib","V","V","AVX512F","modrm_regonly","r= w,r,r,r,r","","" +"VFIXUPIMMPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VFIXUPIMMPS im= m8u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfixupimmps imm8u, zmm2/m512/m= 32bcst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F3A.W0 54 /r ib","V","V","AVX= 512F","bscale4,scale64","rw,r,r,r,r","","" +"VFIXUPIMMSD xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u","VFIXUPIMMSD imm8u, xmm= 2, xmmV, {k}{z}, xmm1{sae}","vfixupimmsd imm8u, xmm2, xmmV, {k}{z}, xmm1{sa= e}","EVEX.DDS.128.66.0F3A.W1 55 /r ib","V","V","AVX512F","modrm_regonly","r= w,r,r,r,r","","" +"VFIXUPIMMSD xmm1, {k}{z}, xmmV, xmm2/m64, imm8u","VFIXUPIMMSD imm8u, xmm2= /m64, xmmV, {k}{z}, xmm1","vfixupimmsd imm8u, xmm2/m64, xmmV, {k}{z}, xmm1"= ,"EVEX.DDS.LIG.66.0F3A.W1 55 /r ib","V","V","AVX512F","scale8","rw,r,r,r,r"= ,"","" +"VFIXUPIMMSS xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u","VFIXUPIMMSS imm8u, xmm= 2, xmmV, {k}{z}, xmm1{sae}","vfixupimmss imm8u, xmm2, xmmV, {k}{z}, xmm1{sa= e}","EVEX.DDS.128.66.0F3A.W0 55 /r ib","V","V","AVX512F","modrm_regonly","r= w,r,r,r,r","","" +"VFIXUPIMMSS xmm1, {k}{z}, xmmV, xmm2/m32, imm8u","VFIXUPIMMSS imm8u, xmm2= /m32, xmmV, {k}{z}, xmm1","vfixupimmss imm8u, xmm2/m32, xmmV, {k}{z}, xmm1"= ,"EVEX.DDS.LIG.66.0F3A.W0 55 /r ib","V","V","AVX512F","scale4","rw,r,r,r,r"= ,"","" +"VFMADD132PD xmm1, xmmV, xmm2/m128","VFMADD132PD xmm2/m128, xmmV, xmm1","v= fmadd132pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 98 /r","V","V","F= MA","","rw,r,r","","" +"VFMADD132PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMADD132PD xmm2/m128= /m64bcst, xmmV, {k}{z}, xmm1","vfmadd132pd xmm2/m128/m64bcst, xmmV, {k}{z},= xmm1","EVEX.DDS.128.66.0F38.W1 98 /r","V","V","AVX512F+AVX512VL","bscale8,= scale16","rw,r,r,r","","" +"VFMADD132PD ymm1, ymmV, ymm2/m256","VFMADD132PD ymm2/m256, ymmV, ymm1","v= fmadd132pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 98 /r","V","V","F= MA","","rw,r,r","","" +"VFMADD132PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMADD132PD ymm2/m256= /m64bcst, ymmV, {k}{z}, ymm1","vfmadd132pd ymm2/m256/m64bcst, ymmV, {k}{z},= ymm1","EVEX.DDS.256.66.0F38.W1 98 /r","V","V","AVX512F+AVX512VL","bscale8,= scale32","rw,r,r,r","","" +"VFMADD132PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMADD132PD 
zmm2, zmmV, {k}{z}= , zmm1{er}","vfmadd132pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F3= 8.W1 98 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADD132PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMADD132PD zmm2/m512= /m64bcst, zmmV, {k}{z}, zmm1","vfmadd132pd zmm2/m512/m64bcst, zmmV, {k}{z},= zmm1","EVEX.DDS.512.66.0F38.W1 98 /r","V","V","AVX512F","bscale8,scale64",= "rw,r,r,r","","" +"VFMADD132PS xmm1, xmmV, xmm2/m128","VFMADD132PS xmm2/m128, xmmV, xmm1","v= fmadd132ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 98 /r","V","V","F= MA","","rw,r,r","","" +"VFMADD132PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMADD132PS xmm2/m128= /m32bcst, xmmV, {k}{z}, xmm1","vfmadd132ps xmm2/m128/m32bcst, xmmV, {k}{z},= xmm1","EVEX.DDS.128.66.0F38.W0 98 /r","V","V","AVX512F+AVX512VL","bscale4,= scale16","rw,r,r,r","","" +"VFMADD132PS ymm1, ymmV, ymm2/m256","VFMADD132PS ymm2/m256, ymmV, ymm1","v= fmadd132ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 98 /r","V","V","F= MA","","rw,r,r","","" +"VFMADD132PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMADD132PS ymm2/m256= /m32bcst, ymmV, {k}{z}, ymm1","vfmadd132ps ymm2/m256/m32bcst, ymmV, {k}{z},= ymm1","EVEX.DDS.256.66.0F38.W0 98 /r","V","V","AVX512F+AVX512VL","bscale4,= scale32","rw,r,r,r","","" +"VFMADD132PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMADD132PS zmm2, zmmV, {k}{z}= , zmm1{er}","vfmadd132ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F3= 8.W0 98 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADD132PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMADD132PS zmm2/m512= /m32bcst, zmmV, {k}{z}, zmm1","vfmadd132ps zmm2/m512/m32bcst, zmmV, {k}{z},= zmm1","EVEX.DDS.512.66.0F38.W0 98 /r","V","V","AVX512F","bscale4,scale64",= "rw,r,r,r","","" +"VFMADD132SD xmm1{er}, {k}{z}, xmmV, xmm2","VFMADD132SD xmm2, xmmV, {k}{z}= , xmm1{er}","vfmadd132sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F3= 8.W1 99 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADD132SD xmm1, xmmV, xmm2/m64","VFMADD132SD xmm2/m64, xmmV, xmm1","vfm= add132sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 99 /r","V","V","FMA"= ,"","rw,r,r","","" +"VFMADD132SD xmm1, {k}{z}, xmmV, xmm2/m64","VFMADD132SD xmm2/m64, xmmV, {k= }{z}, xmm1","vfmadd132sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F3= 8.W1 99 /r","V","V","AVX512F","scale8","rw,r,r,r","","" +"VFMADD132SS xmm1{er}, {k}{z}, xmmV, xmm2","VFMADD132SS xmm2, xmmV, {k}{z}= , xmm1{er}","vfmadd132ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F3= 8.W0 99 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADD132SS xmm1, xmmV, xmm2/m32","VFMADD132SS xmm2/m32, xmmV, xmm1","vfm= add132ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 99 /r","V","V","FMA"= ,"","rw,r,r","","" +"VFMADD132SS xmm1, {k}{z}, xmmV, xmm2/m32","VFMADD132SS xmm2/m32, xmmV, {k= }{z}, xmm1","vfmadd132ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F3= 8.W0 99 /r","V","V","AVX512F","scale4","rw,r,r,r","","" +"VFMADD213PD xmm1, xmmV, xmm2/m128","VFMADD213PD xmm2/m128, xmmV, xmm1","v= fmadd213pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 A8 /r","V","V","F= MA","","rw,r,r","","" +"VFMADD213PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMADD213PD xmm2/m128= /m64bcst, xmmV, {k}{z}, xmm1","vfmadd213pd xmm2/m128/m64bcst, xmmV, {k}{z},= xmm1","EVEX.DDS.128.66.0F38.W1 A8 /r","V","V","AVX512F+AVX512VL","bscale8,= scale16","rw,r,r,r","","" +"VFMADD213PD ymm1, ymmV, ymm2/m256","VFMADD213PD ymm2/m256, ymmV, ymm1","v= fmadd213pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 A8 /r","V","V","F= MA","","rw,r,r","","" 
+"VFMADD213PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMADD213PD ymm2/m256= /m64bcst, ymmV, {k}{z}, ymm1","vfmadd213pd ymm2/m256/m64bcst, ymmV, {k}{z},= ymm1","EVEX.DDS.256.66.0F38.W1 A8 /r","V","V","AVX512F+AVX512VL","bscale8,= scale32","rw,r,r,r","","" +"VFMADD213PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMADD213PD zmm2, zmmV, {k}{z}= , zmm1{er}","vfmadd213pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F3= 8.W1 A8 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADD213PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMADD213PD zmm2/m512= /m64bcst, zmmV, {k}{z}, zmm1","vfmadd213pd zmm2/m512/m64bcst, zmmV, {k}{z},= zmm1","EVEX.DDS.512.66.0F38.W1 A8 /r","V","V","AVX512F","bscale8,scale64",= "rw,r,r,r","","" +"VFMADD213PS xmm1, xmmV, xmm2/m128","VFMADD213PS xmm2/m128, xmmV, xmm1","v= fmadd213ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 A8 /r","V","V","F= MA","","rw,r,r","","" +"VFMADD213PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMADD213PS xmm2/m128= /m32bcst, xmmV, {k}{z}, xmm1","vfmadd213ps xmm2/m128/m32bcst, xmmV, {k}{z},= xmm1","EVEX.DDS.128.66.0F38.W0 A8 /r","V","V","AVX512F+AVX512VL","bscale4,= scale16","rw,r,r,r","","" +"VFMADD213PS ymm1, ymmV, ymm2/m256","VFMADD213PS ymm2/m256, ymmV, ymm1","v= fmadd213ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 A8 /r","V","V","F= MA","","rw,r,r","","" +"VFMADD213PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMADD213PS ymm2/m256= /m32bcst, ymmV, {k}{z}, ymm1","vfmadd213ps ymm2/m256/m32bcst, ymmV, {k}{z},= ymm1","EVEX.DDS.256.66.0F38.W0 A8 /r","V","V","AVX512F+AVX512VL","bscale4,= scale32","rw,r,r,r","","" +"VFMADD213PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMADD213PS zmm2, zmmV, {k}{z}= , zmm1{er}","vfmadd213ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F3= 8.W0 A8 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADD213PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMADD213PS zmm2/m512= /m32bcst, zmmV, {k}{z}, zmm1","vfmadd213ps zmm2/m512/m32bcst, zmmV, {k}{z},= zmm1","EVEX.DDS.512.66.0F38.W0 A8 /r","V","V","AVX512F","bscale4,scale64",= "rw,r,r,r","","" +"VFMADD213SD xmm1{er}, {k}{z}, xmmV, xmm2","VFMADD213SD xmm2, xmmV, {k}{z}= , xmm1{er}","vfmadd213sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F3= 8.W1 A9 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADD213SD xmm1, xmmV, xmm2/m64","VFMADD213SD xmm2/m64, xmmV, xmm1","vfm= add213sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 A9 /r","V","V","FMA"= ,"","rw,r,r","","" +"VFMADD213SD xmm1, {k}{z}, xmmV, xmm2/m64","VFMADD213SD xmm2/m64, xmmV, {k= }{z}, xmm1","vfmadd213sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F3= 8.W1 A9 /r","V","V","AVX512F","scale8","rw,r,r,r","","" +"VFMADD213SS xmm1{er}, {k}{z}, xmmV, xmm2","VFMADD213SS xmm2, xmmV, {k}{z}= , xmm1{er}","vfmadd213ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F3= 8.W0 A9 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADD213SS xmm1, xmmV, xmm2/m32","VFMADD213SS xmm2/m32, xmmV, xmm1","vfm= add213ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 A9 /r","V","V","FMA"= ,"","rw,r,r","","" +"VFMADD213SS xmm1, {k}{z}, xmmV, xmm2/m32","VFMADD213SS xmm2/m32, xmmV, {k= }{z}, xmm1","vfmadd213ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F3= 8.W0 A9 /r","V","V","AVX512F","scale4","rw,r,r,r","","" +"VFMADD231PD xmm1, xmmV, xmm2/m128","VFMADD231PD xmm2/m128, xmmV, xmm1","v= fmadd231pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 B8 /r","V","V","F= MA","","rw,r,r","","" +"VFMADD231PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMADD231PD xmm2/m128= /m64bcst, xmmV, {k}{z}, xmm1","vfmadd231pd 
xmm2/m128/m64bcst, xmmV, {k}{z},= xmm1","EVEX.DDS.128.66.0F38.W1 B8 /r","V","V","AVX512F+AVX512VL","bscale8,= scale16","rw,r,r,r","","" +"VFMADD231PD ymm1, ymmV, ymm2/m256","VFMADD231PD ymm2/m256, ymmV, ymm1","v= fmadd231pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 B8 /r","V","V","F= MA","","rw,r,r","","" +"VFMADD231PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMADD231PD ymm2/m256= /m64bcst, ymmV, {k}{z}, ymm1","vfmadd231pd ymm2/m256/m64bcst, ymmV, {k}{z},= ymm1","EVEX.DDS.256.66.0F38.W1 B8 /r","V","V","AVX512F+AVX512VL","bscale8,= scale32","rw,r,r,r","","" +"VFMADD231PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMADD231PD zmm2, zmmV, {k}{z}= , zmm1{er}","vfmadd231pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F3= 8.W1 B8 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADD231PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMADD231PD zmm2/m512= /m64bcst, zmmV, {k}{z}, zmm1","vfmadd231pd zmm2/m512/m64bcst, zmmV, {k}{z},= zmm1","EVEX.DDS.512.66.0F38.W1 B8 /r","V","V","AVX512F","bscale8,scale64",= "rw,r,r,r","","" +"VFMADD231PS xmm1, xmmV, xmm2/m128","VFMADD231PS xmm2/m128, xmmV, xmm1","v= fmadd231ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 B8 /r","V","V","F= MA","","rw,r,r","","" +"VFMADD231PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMADD231PS xmm2/m128= /m32bcst, xmmV, {k}{z}, xmm1","vfmadd231ps xmm2/m128/m32bcst, xmmV, {k}{z},= xmm1","EVEX.DDS.128.66.0F38.W0 B8 /r","V","V","AVX512F+AVX512VL","bscale4,= scale16","rw,r,r,r","","" +"VFMADD231PS ymm1, ymmV, ymm2/m256","VFMADD231PS ymm2/m256, ymmV, ymm1","v= fmadd231ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 B8 /r","V","V","F= MA","","rw,r,r","","" +"VFMADD231PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMADD231PS ymm2/m256= /m32bcst, ymmV, {k}{z}, ymm1","vfmadd231ps ymm2/m256/m32bcst, ymmV, {k}{z},= ymm1","EVEX.DDS.256.66.0F38.W0 B8 /r","V","V","AVX512F+AVX512VL","bscale4,= scale32","rw,r,r,r","","" +"VFMADD231PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMADD231PS zmm2, zmmV, {k}{z}= , zmm1{er}","vfmadd231ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F3= 8.W0 B8 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADD231PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMADD231PS zmm2/m512= /m32bcst, zmmV, {k}{z}, zmm1","vfmadd231ps zmm2/m512/m32bcst, zmmV, {k}{z},= zmm1","EVEX.DDS.512.66.0F38.W0 B8 /r","V","V","AVX512F","bscale4,scale64",= "rw,r,r,r","","" +"VFMADD231SD xmm1{er}, {k}{z}, xmmV, xmm2","VFMADD231SD xmm2, xmmV, {k}{z}= , xmm1{er}","vfmadd231sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F3= 8.W1 B9 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADD231SD xmm1, xmmV, xmm2/m64","VFMADD231SD xmm2/m64, xmmV, xmm1","vfm= add231sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 B9 /r","V","V","FMA"= ,"","rw,r,r","","" +"VFMADD231SD xmm1, {k}{z}, xmmV, xmm2/m64","VFMADD231SD xmm2/m64, xmmV, {k= }{z}, xmm1","vfmadd231sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F3= 8.W1 B9 /r","V","V","AVX512F","scale8","rw,r,r,r","","" +"VFMADD231SS xmm1{er}, {k}{z}, xmmV, xmm2","VFMADD231SS xmm2, xmmV, {k}{z}= , xmm1{er}","vfmadd231ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F3= 8.W0 B9 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADD231SS xmm1, xmmV, xmm2/m32","VFMADD231SS xmm2/m32, xmmV, xmm1","vfm= add231ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 B9 /r","V","V","FMA"= ,"","rw,r,r","","" +"VFMADD231SS xmm1, {k}{z}, xmmV, xmm2/m32","VFMADD231SS xmm2/m32, xmmV, {k= }{z}, xmm1","vfmadd231ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F3= 8.W0 B9 
/r","V","V","AVX512F","scale4","rw,r,r,r","","" +"VFMADDPD xmm1, xmmV, xmmIH, xmm2/m128","VFMADDPD xmm2/m128, xmmIH, xmmV, = xmm1","vfmaddpd xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 69 /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDPD xmm1, xmmV, xmm2/m128, xmmIH","VFMADDPD xmmIH, xmm2/m128, xmmV, = xmm1","vfmaddpd xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 69 /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDPD ymm1, ymmV, ymmIH, ymm2/m256","VFMADDPD ymm2/m256, ymmIH, ymmV, = ymm1","vfmaddpd ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 69 /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDPD ymm1, ymmV, ymm2/m256, ymmIH","VFMADDPD ymmIH, ymm2/m256, ymmV, = ymm1","vfmaddpd ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 69 /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDPS xmm1, xmmV, xmmIH, xmm2/m128","VFMADDPS xmm2/m128, xmmIH, xmmV, = xmm1","vfmaddps xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 68 /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDPS xmm1, xmmV, xmm2/m128, xmmIH","VFMADDPS xmmIH, xmm2/m128, xmmV, = xmm1","vfmaddps xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 68 /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDPS ymm1, ymmV, ymmIH, ymm2/m256","VFMADDPS ymm2/m256, ymmIH, ymmV, = ymm1","vfmaddps ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 68 /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDPS ymm1, ymmV, ymm2/m256, ymmIH","VFMADDPS ymmIH, ymm2/m256, ymmV, = ymm1","vfmaddps ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 68 /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDSD xmm1, xmmV, xmmIH, xmm2/m64","VFMADDSD xmm2/m64, xmmIH, xmmV, xm= m1","vfmaddsd xmm2/m64, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 6B /r /i= s4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDSD xmm1, xmmV, xmm2/m64, xmmIH","VFMADDSD xmmIH, xmm2/m64, xmmV, xm= m1","vfmaddsd xmmIH, xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 6B /r /i= s4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDSS xmm1, xmmV, xmmIH, xmm2/m32","VFMADDSS xmm2/m32, xmmIH, xmmV, xm= m1","vfmaddss xmm2/m32, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 6A /r /i= s4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDSS xmm1, xmmV, xmm2/m32, xmmIH","VFMADDSS xmmIH, xmm2/m32, xmmV, xm= m1","vfmaddss xmmIH, xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 6A /r /i= s4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDSUB132PD xmm1, xmmV, xmm2/m128","VFMADDSUB132PD xmm2/m128, xmmV, xm= m1","vfmaddsub132pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 96 /r","= V","V","FMA","","rw,r,r","","" +"VFMADDSUB132PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMADDSUB132PD xmm= 2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmaddsub132pd xmm2/m128/m64bcst, xmmV= , {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 96 /r","V","V","AVX512F+AVX512VL",= "bscale8,scale16","rw,r,r,r","","" +"VFMADDSUB132PD ymm1, ymmV, ymm2/m256","VFMADDSUB132PD ymm2/m256, ymmV, ym= m1","vfmaddsub132pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 96 /r","= V","V","FMA","","rw,r,r","","" +"VFMADDSUB132PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMADDSUB132PD ymm= 2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmaddsub132pd ymm2/m256/m64bcst, ymmV= , {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 96 /r","V","V","AVX512F+AVX512VL",= "bscale8,scale32","rw,r,r,r","","" +"VFMADDSUB132PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMADDSUB132PD zmm2, zmmV, = {k}{z}, zmm1{er}","vfmaddsub132pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.5= 12.66.0F38.W1 96 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADDSUB132PD zmm1, 
{k}{z}, zmmV, zmm2/m512/m64bcst","VFMADDSUB132PD zmm= 2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmaddsub132pd zmm2/m512/m64bcst, zmmV= , {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 96 /r","V","V","AVX512F","bscale8,= scale64","rw,r,r,r","","" +"VFMADDSUB132PS xmm1, xmmV, xmm2/m128","VFMADDSUB132PS xmm2/m128, xmmV, xm= m1","vfmaddsub132ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 96 /r","= V","V","FMA","","rw,r,r","","" +"VFMADDSUB132PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMADDSUB132PS xmm= 2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmaddsub132ps xmm2/m128/m32bcst, xmmV= , {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 96 /r","V","V","AVX512F+AVX512VL",= "bscale4,scale16","rw,r,r,r","","" +"VFMADDSUB132PS ymm1, ymmV, ymm2/m256","VFMADDSUB132PS ymm2/m256, ymmV, ym= m1","vfmaddsub132ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 96 /r","= V","V","FMA","","rw,r,r","","" +"VFMADDSUB132PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMADDSUB132PS ymm= 2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmaddsub132ps ymm2/m256/m32bcst, ymmV= , {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 96 /r","V","V","AVX512F+AVX512VL",= "bscale4,scale32","rw,r,r,r","","" +"VFMADDSUB132PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMADDSUB132PS zmm2, zmmV, = {k}{z}, zmm1{er}","vfmaddsub132ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.5= 12.66.0F38.W0 96 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADDSUB132PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMADDSUB132PS zmm= 2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmaddsub132ps zmm2/m512/m32bcst, zmmV= , {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 96 /r","V","V","AVX512F","bscale4,= scale64","rw,r,r,r","","" +"VFMADDSUB213PD xmm1, xmmV, xmm2/m128","VFMADDSUB213PD xmm2/m128, xmmV, xm= m1","vfmaddsub213pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 A6 /r","= V","V","FMA","","rw,r,r","","" +"VFMADDSUB213PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMADDSUB213PD xmm= 2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmaddsub213pd xmm2/m128/m64bcst, xmmV= , {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 A6 /r","V","V","AVX512F+AVX512VL",= "bscale8,scale16","rw,r,r,r","","" +"VFMADDSUB213PD ymm1, ymmV, ymm2/m256","VFMADDSUB213PD ymm2/m256, ymmV, ym= m1","vfmaddsub213pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 A6 /r","= V","V","FMA","","rw,r,r","","" +"VFMADDSUB213PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMADDSUB213PD ymm= 2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmaddsub213pd ymm2/m256/m64bcst, ymmV= , {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 A6 /r","V","V","AVX512F+AVX512VL",= "bscale8,scale32","rw,r,r,r","","" +"VFMADDSUB213PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMADDSUB213PD zmm2, zmmV, = {k}{z}, zmm1{er}","vfmaddsub213pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.5= 12.66.0F38.W1 A6 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADDSUB213PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMADDSUB213PD zmm= 2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmaddsub213pd zmm2/m512/m64bcst, zmmV= , {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 A6 /r","V","V","AVX512F","bscale8,= scale64","rw,r,r,r","","" +"VFMADDSUB213PS xmm1, xmmV, xmm2/m128","VFMADDSUB213PS xmm2/m128, xmmV, xm= m1","vfmaddsub213ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 A6 /r","= V","V","FMA","","rw,r,r","","" +"VFMADDSUB213PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMADDSUB213PS xmm= 2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmaddsub213ps xmm2/m128/m32bcst, xmmV= , {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 A6 /r","V","V","AVX512F+AVX512VL",= "bscale4,scale16","rw,r,r,r","","" +"VFMADDSUB213PS ymm1, ymmV, ymm2/m256","VFMADDSUB213PS ymm2/m256, ymmV, ym= m1","vfmaddsub213ps 
ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 A6 /r","= V","V","FMA","","rw,r,r","","" +"VFMADDSUB213PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMADDSUB213PS ymm= 2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmaddsub213ps ymm2/m256/m32bcst, ymmV= , {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 A6 /r","V","V","AVX512F+AVX512VL",= "bscale4,scale32","rw,r,r,r","","" +"VFMADDSUB213PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMADDSUB213PS zmm2, zmmV, = {k}{z}, zmm1{er}","vfmaddsub213ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.5= 12.66.0F38.W0 A6 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADDSUB213PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMADDSUB213PS zmm= 2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmaddsub213ps zmm2/m512/m32bcst, zmmV= , {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 A6 /r","V","V","AVX512F","bscale4,= scale64","rw,r,r,r","","" +"VFMADDSUB231PD xmm1, xmmV, xmm2/m128","VFMADDSUB231PD xmm2/m128, xmmV, xm= m1","vfmaddsub231pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 B6 /r","= V","V","FMA","","rw,r,r","","" +"VFMADDSUB231PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMADDSUB231PD xmm= 2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmaddsub231pd xmm2/m128/m64bcst, xmmV= , {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 B6 /r","V","V","AVX512F+AVX512VL",= "bscale8,scale16","rw,r,r,r","","" +"VFMADDSUB231PD ymm1, ymmV, ymm2/m256","VFMADDSUB231PD ymm2/m256, ymmV, ym= m1","vfmaddsub231pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 B6 /r","= V","V","FMA","","rw,r,r","","" +"VFMADDSUB231PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMADDSUB231PD ymm= 2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmaddsub231pd ymm2/m256/m64bcst, ymmV= , {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 B6 /r","V","V","AVX512F+AVX512VL",= "bscale8,scale32","rw,r,r,r","","" +"VFMADDSUB231PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMADDSUB231PD zmm2, zmmV, = {k}{z}, zmm1{er}","vfmaddsub231pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.5= 12.66.0F38.W1 B6 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADDSUB231PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMADDSUB231PD zmm= 2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmaddsub231pd zmm2/m512/m64bcst, zmmV= , {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 B6 /r","V","V","AVX512F","bscale8,= scale64","rw,r,r,r","","" +"VFMADDSUB231PS xmm1, xmmV, xmm2/m128","VFMADDSUB231PS xmm2/m128, xmmV, xm= m1","vfmaddsub231ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 B6 /r","= V","V","FMA","","rw,r,r","","" +"VFMADDSUB231PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMADDSUB231PS xmm= 2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmaddsub231ps xmm2/m128/m32bcst, xmmV= , {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 B6 /r","V","V","AVX512F+AVX512VL",= "bscale4,scale16","rw,r,r,r","","" +"VFMADDSUB231PS ymm1, ymmV, ymm2/m256","VFMADDSUB231PS ymm2/m256, ymmV, ym= m1","vfmaddsub231ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 B6 /r","= V","V","FMA","","rw,r,r","","" +"VFMADDSUB231PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMADDSUB231PS ymm= 2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmaddsub231ps ymm2/m256/m32bcst, ymmV= , {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 B6 /r","V","V","AVX512F+AVX512VL",= "bscale4,scale32","rw,r,r,r","","" +"VFMADDSUB231PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMADDSUB231PS zmm2, zmmV, = {k}{z}, zmm1{er}","vfmaddsub231ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.5= 12.66.0F38.W0 B6 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMADDSUB231PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMADDSUB231PS zmm= 2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmaddsub231ps zmm2/m512/m32bcst, zmmV= , {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 B6 
/r","V","V","AVX512F","bscale4,= scale64","rw,r,r,r","","" +"VFMADDSUBPD xmm1, xmmV, xmmIH, xmm2/m128","VFMADDSUBPD xmm2/m128, xmmIH, = xmmV, xmm1","vfmaddsubpd xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A= .W1 5D /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDSUBPD xmm1, xmmV, xmm2/m128, xmmIH","VFMADDSUBPD xmmIH, xmm2/m128, = xmmV, xmm1","vfmaddsubpd xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A= .W0 5D /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDSUBPD ymm1, ymmV, ymmIH, ymm2/m256","VFMADDSUBPD ymm2/m256, ymmIH, = ymmV, ymm1","vfmaddsubpd ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A= .W1 5D /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDSUBPD ymm1, ymmV, ymm2/m256, ymmIH","VFMADDSUBPD ymmIH, ymm2/m256, = ymmV, ymm1","vfmaddsubpd ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A= .W0 5D /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDSUBPS xmm1, xmmV, xmmIH, xmm2/m128","VFMADDSUBPS xmm2/m128, xmmIH, = xmmV, xmm1","vfmaddsubps xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A= .W1 5C /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDSUBPS xmm1, xmmV, xmm2/m128, xmmIH","VFMADDSUBPS xmmIH, xmm2/m128, = xmmV, xmm1","vfmaddsubps xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A= .W0 5C /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDSUBPS ymm1, ymmV, ymmIH, ymm2/m256","VFMADDSUBPS ymm2/m256, ymmIH, = ymmV, ymm1","vfmaddsubps ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A= .W1 5C /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMADDSUBPS ymm1, ymmV, ymm2/m256, ymmIH","VFMADDSUBPS ymmIH, ymm2/m256, = ymmV, ymm1","vfmaddsubps ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A= .W0 5C /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUB132PD xmm1, xmmV, xmm2/m128","VFMSUB132PD xmm2/m128, xmmV, xmm1","v= fmsub132pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 9A /r","V","V","F= MA","","rw,r,r","","" +"VFMSUB132PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMSUB132PD xmm2/m128= /m64bcst, xmmV, {k}{z}, xmm1","vfmsub132pd xmm2/m128/m64bcst, xmmV, {k}{z},= xmm1","EVEX.DDS.128.66.0F38.W1 9A /r","V","V","AVX512F+AVX512VL","bscale8,= scale16","rw,r,r,r","","" +"VFMSUB132PD ymm1, ymmV, ymm2/m256","VFMSUB132PD ymm2/m256, ymmV, ymm1","v= fmsub132pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 9A /r","V","V","F= MA","","rw,r,r","","" +"VFMSUB132PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMSUB132PD ymm2/m256= /m64bcst, ymmV, {k}{z}, ymm1","vfmsub132pd ymm2/m256/m64bcst, ymmV, {k}{z},= ymm1","EVEX.DDS.256.66.0F38.W1 9A /r","V","V","AVX512F+AVX512VL","bscale8,= scale32","rw,r,r,r","","" +"VFMSUB132PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUB132PD zmm2, zmmV, {k}{z}= , zmm1{er}","vfmsub132pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F3= 8.W1 9A /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUB132PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMSUB132PD zmm2/m512= /m64bcst, zmmV, {k}{z}, zmm1","vfmsub132pd zmm2/m512/m64bcst, zmmV, {k}{z},= zmm1","EVEX.DDS.512.66.0F38.W1 9A /r","V","V","AVX512F","bscale8,scale64",= "rw,r,r,r","","" +"VFMSUB132PS xmm1, xmmV, xmm2/m128","VFMSUB132PS xmm2/m128, xmmV, xmm1","v= fmsub132ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 9A /r","V","V","F= MA","","rw,r,r","","" +"VFMSUB132PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMSUB132PS xmm2/m128= /m32bcst, xmmV, {k}{z}, xmm1","vfmsub132ps xmm2/m128/m32bcst, xmmV, {k}{z},= xmm1","EVEX.DDS.128.66.0F38.W0 9A /r","V","V","AVX512F+AVX512VL","bscale4,= scale16","rw,r,r,r","","" +"VFMSUB132PS ymm1, ymmV, ymm2/m256","VFMSUB132PS ymm2/m256, ymmV, ymm1","v= 
fmsub132ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 9A /r","V","V","F= MA","","rw,r,r","","" +"VFMSUB132PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMSUB132PS ymm2/m256= /m32bcst, ymmV, {k}{z}, ymm1","vfmsub132ps ymm2/m256/m32bcst, ymmV, {k}{z},= ymm1","EVEX.DDS.256.66.0F38.W0 9A /r","V","V","AVX512F+AVX512VL","bscale4,= scale32","rw,r,r,r","","" +"VFMSUB132PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUB132PS zmm2, zmmV, {k}{z}= , zmm1{er}","vfmsub132ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F3= 8.W0 9A /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUB132PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMSUB132PS zmm2/m512= /m32bcst, zmmV, {k}{z}, zmm1","vfmsub132ps zmm2/m512/m32bcst, zmmV, {k}{z},= zmm1","EVEX.DDS.512.66.0F38.W0 9A /r","V","V","AVX512F","bscale4,scale64",= "rw,r,r,r","","" +"VFMSUB132SD xmm1{er}, {k}{z}, xmmV, xmm2","VFMSUB132SD xmm2, xmmV, {k}{z}= , xmm1{er}","vfmsub132sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F3= 8.W1 9B /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUB132SD xmm1, xmmV, xmm2/m64","VFMSUB132SD xmm2/m64, xmmV, xmm1","vfm= sub132sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 9B /r","V","V","FMA"= ,"","rw,r,r","","" +"VFMSUB132SD xmm1, {k}{z}, xmmV, xmm2/m64","VFMSUB132SD xmm2/m64, xmmV, {k= }{z}, xmm1","vfmsub132sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F3= 8.W1 9B /r","V","V","AVX512F","scale8","rw,r,r,r","","" +"VFMSUB132SS xmm1{er}, {k}{z}, xmmV, xmm2","VFMSUB132SS xmm2, xmmV, {k}{z}= , xmm1{er}","vfmsub132ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F3= 8.W0 9B /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUB132SS xmm1, xmmV, xmm2/m32","VFMSUB132SS xmm2/m32, xmmV, xmm1","vfm= sub132ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 9B /r","V","V","FMA"= ,"","rw,r,r","","" +"VFMSUB132SS xmm1, {k}{z}, xmmV, xmm2/m32","VFMSUB132SS xmm2/m32, xmmV, {k= }{z}, xmm1","vfmsub132ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F3= 8.W0 9B /r","V","V","AVX512F","scale4","rw,r,r,r","","" +"VFMSUB213PD xmm1, xmmV, xmm2/m128","VFMSUB213PD xmm2/m128, xmmV, xmm1","v= fmsub213pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 AA /r","V","V","F= MA","","rw,r,r","","" +"VFMSUB213PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMSUB213PD xmm2/m128= /m64bcst, xmmV, {k}{z}, xmm1","vfmsub213pd xmm2/m128/m64bcst, xmmV, {k}{z},= xmm1","EVEX.DDS.128.66.0F38.W1 AA /r","V","V","AVX512F+AVX512VL","bscale8,= scale16","rw,r,r,r","","" +"VFMSUB213PD ymm1, ymmV, ymm2/m256","VFMSUB213PD ymm2/m256, ymmV, ymm1","v= fmsub213pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 AA /r","V","V","F= MA","","rw,r,r","","" +"VFMSUB213PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMSUB213PD ymm2/m256= /m64bcst, ymmV, {k}{z}, ymm1","vfmsub213pd ymm2/m256/m64bcst, ymmV, {k}{z},= ymm1","EVEX.DDS.256.66.0F38.W1 AA /r","V","V","AVX512F+AVX512VL","bscale8,= scale32","rw,r,r,r","","" +"VFMSUB213PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUB213PD zmm2, zmmV, {k}{z}= , zmm1{er}","vfmsub213pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F3= 8.W1 AA /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUB213PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMSUB213PD zmm2/m512= /m64bcst, zmmV, {k}{z}, zmm1","vfmsub213pd zmm2/m512/m64bcst, zmmV, {k}{z},= zmm1","EVEX.DDS.512.66.0F38.W1 AA /r","V","V","AVX512F","bscale8,scale64",= "rw,r,r,r","","" +"VFMSUB213PS xmm1, xmmV, xmm2/m128","VFMSUB213PS xmm2/m128, xmmV, xmm1","v= fmsub213ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 AA /r","V","V","F= MA","","rw,r,r","","" +"VFMSUB213PS xmm1, {k}{z}, 
xmmV, xmm2/m128/m32bcst","VFMSUB213PS xmm2/m128= /m32bcst, xmmV, {k}{z}, xmm1","vfmsub213ps xmm2/m128/m32bcst, xmmV, {k}{z},= xmm1","EVEX.DDS.128.66.0F38.W0 AA /r","V","V","AVX512F+AVX512VL","bscale4,= scale16","rw,r,r,r","","" +"VFMSUB213PS ymm1, ymmV, ymm2/m256","VFMSUB213PS ymm2/m256, ymmV, ymm1","v= fmsub213ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 AA /r","V","V","F= MA","","rw,r,r","","" +"VFMSUB213PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMSUB213PS ymm2/m256= /m32bcst, ymmV, {k}{z}, ymm1","vfmsub213ps ymm2/m256/m32bcst, ymmV, {k}{z},= ymm1","EVEX.DDS.256.66.0F38.W0 AA /r","V","V","AVX512F+AVX512VL","bscale4,= scale32","rw,r,r,r","","" +"VFMSUB213PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUB213PS zmm2, zmmV, {k}{z}= , zmm1{er}","vfmsub213ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F3= 8.W0 AA /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUB213PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMSUB213PS zmm2/m512= /m32bcst, zmmV, {k}{z}, zmm1","vfmsub213ps zmm2/m512/m32bcst, zmmV, {k}{z},= zmm1","EVEX.DDS.512.66.0F38.W0 AA /r","V","V","AVX512F","bscale4,scale64",= "rw,r,r,r","","" +"VFMSUB213SD xmm1{er}, {k}{z}, xmmV, xmm2","VFMSUB213SD xmm2, xmmV, {k}{z}= , xmm1{er}","vfmsub213sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F3= 8.W1 AB /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUB213SD xmm1, xmmV, xmm2/m64","VFMSUB213SD xmm2/m64, xmmV, xmm1","vfm= sub213sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 AB /r","V","V","FMA"= ,"","rw,r,r","","" +"VFMSUB213SD xmm1, {k}{z}, xmmV, xmm2/m64","VFMSUB213SD xmm2/m64, xmmV, {k= }{z}, xmm1","vfmsub213sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F3= 8.W1 AB /r","V","V","AVX512F","scale8","rw,r,r,r","","" +"VFMSUB213SS xmm1{er}, {k}{z}, xmmV, xmm2","VFMSUB213SS xmm2, xmmV, {k}{z}= , xmm1{er}","vfmsub213ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F3= 8.W0 AB /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUB213SS xmm1, xmmV, xmm2/m32","VFMSUB213SS xmm2/m32, xmmV, xmm1","vfm= sub213ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 AB /r","V","V","FMA"= ,"","rw,r,r","","" +"VFMSUB213SS xmm1, {k}{z}, xmmV, xmm2/m32","VFMSUB213SS xmm2/m32, xmmV, {k= }{z}, xmm1","vfmsub213ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F3= 8.W0 AB /r","V","V","AVX512F","scale4","rw,r,r,r","","" +"VFMSUB231PD xmm1, xmmV, xmm2/m128","VFMSUB231PD xmm2/m128, xmmV, xmm1","v= fmsub231pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 BA /r","V","V","F= MA","","rw,r,r","","" +"VFMSUB231PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMSUB231PD xmm2/m128= /m64bcst, xmmV, {k}{z}, xmm1","vfmsub231pd xmm2/m128/m64bcst, xmmV, {k}{z},= xmm1","EVEX.DDS.128.66.0F38.W1 BA /r","V","V","AVX512F+AVX512VL","bscale8,= scale16","rw,r,r,r","","" +"VFMSUB231PD ymm1, ymmV, ymm2/m256","VFMSUB231PD ymm2/m256, ymmV, ymm1","v= fmsub231pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 BA /r","V","V","F= MA","","rw,r,r","","" +"VFMSUB231PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMSUB231PD ymm2/m256= /m64bcst, ymmV, {k}{z}, ymm1","vfmsub231pd ymm2/m256/m64bcst, ymmV, {k}{z},= ymm1","EVEX.DDS.256.66.0F38.W1 BA /r","V","V","AVX512F+AVX512VL","bscale8,= scale32","rw,r,r,r","","" +"VFMSUB231PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUB231PD zmm2, zmmV, {k}{z}= , zmm1{er}","vfmsub231pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F3= 8.W1 BA /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUB231PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMSUB231PD zmm2/m512= /m64bcst, zmmV, {k}{z}, zmm1","vfmsub231pd zmm2/m512/m64bcst, zmmV, 
{k}{z},= zmm1","EVEX.DDS.512.66.0F38.W1 BA /r","V","V","AVX512F","bscale8,scale64",= "rw,r,r,r","","" +"VFMSUB231PS xmm1, xmmV, xmm2/m128","VFMSUB231PS xmm2/m128, xmmV, xmm1","v= fmsub231ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 BA /r","V","V","F= MA","","rw,r,r","","" +"VFMSUB231PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMSUB231PS xmm2/m128= /m32bcst, xmmV, {k}{z}, xmm1","vfmsub231ps xmm2/m128/m32bcst, xmmV, {k}{z},= xmm1","EVEX.DDS.128.66.0F38.W0 BA /r","V","V","AVX512F+AVX512VL","bscale4,= scale16","rw,r,r,r","","" +"VFMSUB231PS ymm1, ymmV, ymm2/m256","VFMSUB231PS ymm2/m256, ymmV, ymm1","v= fmsub231ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 BA /r","V","V","F= MA","","rw,r,r","","" +"VFMSUB231PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMSUB231PS ymm2/m256= /m32bcst, ymmV, {k}{z}, ymm1","vfmsub231ps ymm2/m256/m32bcst, ymmV, {k}{z},= ymm1","EVEX.DDS.256.66.0F38.W0 BA /r","V","V","AVX512F+AVX512VL","bscale4,= scale32","rw,r,r,r","","" +"VFMSUB231PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUB231PS zmm2, zmmV, {k}{z}= , zmm1{er}","vfmsub231ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.0F3= 8.W0 BA /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUB231PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMSUB231PS zmm2/m512= /m32bcst, zmmV, {k}{z}, zmm1","vfmsub231ps zmm2/m512/m32bcst, zmmV, {k}{z},= zmm1","EVEX.DDS.512.66.0F38.W0 BA /r","V","V","AVX512F","bscale4,scale64",= "rw,r,r,r","","" +"VFMSUB231SD xmm1{er}, {k}{z}, xmmV, xmm2","VFMSUB231SD xmm2, xmmV, {k}{z}= , xmm1{er}","vfmsub231sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F3= 8.W1 BB /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUB231SD xmm1, xmmV, xmm2/m64","VFMSUB231SD xmm2/m64, xmmV, xmm1","vfm= sub231sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 BB /r","V","V","FMA"= ,"","rw,r,r","","" +"VFMSUB231SD xmm1, {k}{z}, xmmV, xmm2/m64","VFMSUB231SD xmm2/m64, xmmV, {k= }{z}, xmm1","vfmsub231sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F3= 8.W1 BB /r","V","V","AVX512F","scale8","rw,r,r,r","","" +"VFMSUB231SS xmm1{er}, {k}{z}, xmmV, xmm2","VFMSUB231SS xmm2, xmmV, {k}{z}= , xmm1{er}","vfmsub231ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.0F3= 8.W0 BB /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUB231SS xmm1, xmmV, xmm2/m32","VFMSUB231SS xmm2/m32, xmmV, xmm1","vfm= sub231ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 BB /r","V","V","FMA"= ,"","rw,r,r","","" +"VFMSUB231SS xmm1, {k}{z}, xmmV, xmm2/m32","VFMSUB231SS xmm2/m32, xmmV, {k= }{z}, xmm1","vfmsub231ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.0F3= 8.W0 BB /r","V","V","AVX512F","scale4","rw,r,r,r","","" +"VFMSUBADD132PD xmm1, xmmV, xmm2/m128","VFMSUBADD132PD xmm2/m128, xmmV, xm= m1","vfmsubadd132pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 97 /r","= V","V","FMA","","rw,r,r","","" +"VFMSUBADD132PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMSUBADD132PD xmm= 2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmsubadd132pd xmm2/m128/m64bcst, xmmV= , {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 97 /r","V","V","AVX512F+AVX512VL",= "bscale8,scale16","rw,r,r,r","","" +"VFMSUBADD132PD ymm1, ymmV, ymm2/m256","VFMSUBADD132PD ymm2/m256, ymmV, ym= m1","vfmsubadd132pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 97 /r","= V","V","FMA","","rw,r,r","","" +"VFMSUBADD132PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMSUBADD132PD ymm= 2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmsubadd132pd ymm2/m256/m64bcst, ymmV= , {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 97 /r","V","V","AVX512F+AVX512VL",= "bscale8,scale32","rw,r,r,r","","" +"VFMSUBADD132PD 
zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUBADD132PD zmm2, zmmV, = {k}{z}, zmm1{er}","vfmsubadd132pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.5= 12.66.0F38.W1 97 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUBADD132PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMSUBADD132PD zmm= 2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmsubadd132pd zmm2/m512/m64bcst, zmmV= , {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 97 /r","V","V","AVX512F","bscale8,= scale64","rw,r,r,r","","" +"VFMSUBADD132PS xmm1, xmmV, xmm2/m128","VFMSUBADD132PS xmm2/m128, xmmV, xm= m1","vfmsubadd132ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 97 /r","= V","V","FMA","","rw,r,r","","" +"VFMSUBADD132PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMSUBADD132PS xmm= 2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmsubadd132ps xmm2/m128/m32bcst, xmmV= , {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 97 /r","V","V","AVX512F+AVX512VL",= "bscale4,scale16","rw,r,r,r","","" +"VFMSUBADD132PS ymm1, ymmV, ymm2/m256","VFMSUBADD132PS ymm2/m256, ymmV, ym= m1","vfmsubadd132ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 97 /r","= V","V","FMA","","rw,r,r","","" +"VFMSUBADD132PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMSUBADD132PS ymm= 2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmsubadd132ps ymm2/m256/m32bcst, ymmV= , {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 97 /r","V","V","AVX512F+AVX512VL",= "bscale4,scale32","rw,r,r,r","","" +"VFMSUBADD132PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUBADD132PS zmm2, zmmV, = {k}{z}, zmm1{er}","vfmsubadd132ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.5= 12.66.0F38.W0 97 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUBADD132PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMSUBADD132PS zmm= 2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmsubadd132ps zmm2/m512/m32bcst, zmmV= , {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 97 /r","V","V","AVX512F","bscale4,= scale64","rw,r,r,r","","" +"VFMSUBADD213PD xmm1, xmmV, xmm2/m128","VFMSUBADD213PD xmm2/m128, xmmV, xm= m1","vfmsubadd213pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 A7 /r","= V","V","FMA","","rw,r,r","","" +"VFMSUBADD213PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMSUBADD213PD xmm= 2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmsubadd213pd xmm2/m128/m64bcst, xmmV= , {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 A7 /r","V","V","AVX512F+AVX512VL",= "bscale8,scale16","rw,r,r,r","","" +"VFMSUBADD213PD ymm1, ymmV, ymm2/m256","VFMSUBADD213PD ymm2/m256, ymmV, ym= m1","vfmsubadd213pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 A7 /r","= V","V","FMA","","rw,r,r","","" +"VFMSUBADD213PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMSUBADD213PD ymm= 2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmsubadd213pd ymm2/m256/m64bcst, ymmV= , {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 A7 /r","V","V","AVX512F+AVX512VL",= "bscale8,scale32","rw,r,r,r","","" +"VFMSUBADD213PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUBADD213PD zmm2, zmmV, = {k}{z}, zmm1{er}","vfmsubadd213pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.5= 12.66.0F38.W1 A7 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUBADD213PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMSUBADD213PD zmm= 2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmsubadd213pd zmm2/m512/m64bcst, zmmV= , {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 A7 /r","V","V","AVX512F","bscale8,= scale64","rw,r,r,r","","" +"VFMSUBADD213PS xmm1, xmmV, xmm2/m128","VFMSUBADD213PS xmm2/m128, xmmV, xm= m1","vfmsubadd213ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 A7 /r","= V","V","FMA","","rw,r,r","","" +"VFMSUBADD213PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMSUBADD213PS xmm= 2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmsubadd213ps 
xmm2/m128/m32bcst, xmmV= , {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 A7 /r","V","V","AVX512F+AVX512VL",= "bscale4,scale16","rw,r,r,r","","" +"VFMSUBADD213PS ymm1, ymmV, ymm2/m256","VFMSUBADD213PS ymm2/m256, ymmV, ym= m1","vfmsubadd213ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 A7 /r","= V","V","FMA","","rw,r,r","","" +"VFMSUBADD213PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMSUBADD213PS ymm= 2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmsubadd213ps ymm2/m256/m32bcst, ymmV= , {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 A7 /r","V","V","AVX512F+AVX512VL",= "bscale4,scale32","rw,r,r,r","","" +"VFMSUBADD213PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUBADD213PS zmm2, zmmV, = {k}{z}, zmm1{er}","vfmsubadd213ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.5= 12.66.0F38.W0 A7 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUBADD213PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMSUBADD213PS zmm= 2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmsubadd213ps zmm2/m512/m32bcst, zmmV= , {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 A7 /r","V","V","AVX512F","bscale4,= scale64","rw,r,r,r","","" +"VFMSUBADD231PD xmm1, xmmV, xmm2/m128","VFMSUBADD231PD xmm2/m128, xmmV, xm= m1","vfmsubadd231pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 B7 /r","= V","V","FMA","","rw,r,r","","" +"VFMSUBADD231PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFMSUBADD231PD xmm= 2/m128/m64bcst, xmmV, {k}{z}, xmm1","vfmsubadd231pd xmm2/m128/m64bcst, xmmV= , {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 B7 /r","V","V","AVX512F+AVX512VL",= "bscale8,scale16","rw,r,r,r","","" +"VFMSUBADD231PD ymm1, ymmV, ymm2/m256","VFMSUBADD231PD ymm2/m256, ymmV, ym= m1","vfmsubadd231pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 B7 /r","= V","V","FMA","","rw,r,r","","" +"VFMSUBADD231PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFMSUBADD231PD ymm= 2/m256/m64bcst, ymmV, {k}{z}, ymm1","vfmsubadd231pd ymm2/m256/m64bcst, ymmV= , {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 B7 /r","V","V","AVX512F+AVX512VL",= "bscale8,scale32","rw,r,r,r","","" +"VFMSUBADD231PD zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUBADD231PD zmm2, zmmV, = {k}{z}, zmm1{er}","vfmsubadd231pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.5= 12.66.0F38.W1 B7 /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUBADD231PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFMSUBADD231PD zmm= 2/m512/m64bcst, zmmV, {k}{z}, zmm1","vfmsubadd231pd zmm2/m512/m64bcst, zmmV= , {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 B7 /r","V","V","AVX512F","bscale8,= scale64","rw,r,r,r","","" +"VFMSUBADD231PS xmm1, xmmV, xmm2/m128","VFMSUBADD231PS xmm2/m128, xmmV, xm= m1","vfmsubadd231ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 B7 /r","= V","V","FMA","","rw,r,r","","" +"VFMSUBADD231PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFMSUBADD231PS xmm= 2/m128/m32bcst, xmmV, {k}{z}, xmm1","vfmsubadd231ps xmm2/m128/m32bcst, xmmV= , {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 B7 /r","V","V","AVX512F+AVX512VL",= "bscale4,scale16","rw,r,r,r","","" +"VFMSUBADD231PS ymm1, ymmV, ymm2/m256","VFMSUBADD231PS ymm2/m256, ymmV, ym= m1","vfmsubadd231ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 B7 /r","= V","V","FMA","","rw,r,r","","" +"VFMSUBADD231PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFMSUBADD231PS ymm= 2/m256/m32bcst, ymmV, {k}{z}, ymm1","vfmsubadd231ps ymm2/m256/m32bcst, ymmV= , {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 B7 /r","V","V","AVX512F+AVX512VL",= "bscale4,scale32","rw,r,r,r","","" +"VFMSUBADD231PS zmm1{er}, {k}{z}, zmmV, zmm2","VFMSUBADD231PS zmm2, zmmV, = {k}{z}, zmm1{er}","vfmsubadd231ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.5= 12.66.0F38.W0 B7 
/r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFMSUBADD231PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFMSUBADD231PS zmm= 2/m512/m32bcst, zmmV, {k}{z}, zmm1","vfmsubadd231ps zmm2/m512/m32bcst, zmmV= , {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 B7 /r","V","V","AVX512F","bscale4,= scale64","rw,r,r,r","","" +"VFMSUBADDPD xmm1, xmmV, xmmIH, xmm2/m128","VFMSUBADDPD xmm2/m128, xmmIH, = xmmV, xmm1","vfmsubaddpd xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A= .W1 5F /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBADDPD xmm1, xmmV, xmm2/m128, xmmIH","VFMSUBADDPD xmmIH, xmm2/m128, = xmmV, xmm1","vfmsubaddpd xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A= .W0 5F /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBADDPD ymm1, ymmV, ymmIH, ymm2/m256","VFMSUBADDPD ymm2/m256, ymmIH, = ymmV, ymm1","vfmsubaddpd ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A= .W1 5F /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBADDPD ymm1, ymmV, ymm2/m256, ymmIH","VFMSUBADDPD ymmIH, ymm2/m256, = ymmV, ymm1","vfmsubaddpd ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A= .W0 5F /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBADDPS xmm1, xmmV, xmmIH, xmm2/m128","VFMSUBADDPS xmm2/m128, xmmIH, = xmmV, xmm1","vfmsubaddps xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A= .W1 5E /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBADDPS xmm1, xmmV, xmm2/m128, xmmIH","VFMSUBADDPS xmmIH, xmm2/m128, = xmmV, xmm1","vfmsubaddps xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A= .W0 5E /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBADDPS ymm1, ymmV, ymmIH, ymm2/m256","VFMSUBADDPS ymm2/m256, ymmIH, = ymmV, ymm1","vfmsubaddps ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A= .W1 5E /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBADDPS ymm1, ymmV, ymm2/m256, ymmIH","VFMSUBADDPS ymmIH, ymm2/m256, = ymmV, ymm1","vfmsubaddps ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A= .W0 5E /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBPD xmm1, xmmV, xmmIH, xmm2/m128","VFMSUBPD xmm2/m128, xmmIH, xmmV, = xmm1","vfmsubpd xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 6D /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBPD xmm1, xmmV, xmm2/m128, xmmIH","VFMSUBPD xmmIH, xmm2/m128, xmmV, = xmm1","vfmsubpd xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 6D /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBPD ymm1, ymmV, ymmIH, ymm2/m256","VFMSUBPD ymm2/m256, ymmIH, ymmV, = ymm1","vfmsubpd ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 6D /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBPD ymm1, ymmV, ymm2/m256, ymmIH","VFMSUBPD ymmIH, ymm2/m256, ymmV, = ymm1","vfmsubpd ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 6D /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBPS xmm1, xmmV, xmmIH, xmm2/m128","VFMSUBPS xmm2/m128, xmmIH, xmmV, = xmm1","vfmsubps xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 6C /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBPS xmm1, xmmV, xmm2/m128, xmmIH","VFMSUBPS xmmIH, xmm2/m128, xmmV, = xmm1","vfmsubps xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 6C /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBPS ymm1, ymmV, ymmIH, ymm2/m256","VFMSUBPS ymm2/m256, ymmIH, ymmV, = ymm1","vfmsubps ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 6C /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBPS ymm1, ymmV, ymm2/m256, ymmIH","VFMSUBPS ymmIH, ymm2/m256, ymmV, = ymm1","vfmsubps ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 6C /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBSD 
xmm1, xmmV, xmmIH, xmm2/m64","VFMSUBSD xmm2/m64, xmmIH, xmmV, xm= m1","vfmsubsd xmm2/m64, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 6F /r /i= s4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBSD xmm1, xmmV, xmm2/m64, xmmIH","VFMSUBSD xmmIH, xmm2/m64, xmmV, xm= m1","vfmsubsd xmmIH, xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 6F /r /i= s4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBSS xmm1, xmmV, xmmIH, xmm2/m32","VFMSUBSS xmm2/m32, xmmIH, xmmV, xm= m1","vfmsubss xmm2/m32, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 6E /r /i= s4","V","V","FMA4","amd","w,r,r,r","","" +"VFMSUBSS xmm1, xmmV, xmm2/m32, xmmIH","VFMSUBSS xmmIH, xmm2/m32, xmmV, xm= m1","vfmsubss xmmIH, xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 6E /r /i= s4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMADD132PD xmm1, xmmV, xmm2/m128","VFNMADD132PD xmm2/m128, xmmV, xmm1",= "vfnmadd132pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 9C /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMADD132PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFNMADD132PD xmm2/m1= 28/m64bcst, xmmV, {k}{z}, xmm1","vfnmadd132pd xmm2/m128/m64bcst, xmmV, {k}{= z}, xmm1","EVEX.DDS.128.66.0F38.W1 9C /r","V","V","AVX512F+AVX512VL","bscal= e8,scale16","rw,r,r,r","","" +"VFNMADD132PD ymm1, ymmV, ymm2/m256","VFNMADD132PD ymm2/m256, ymmV, ymm1",= "vfnmadd132pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 9C /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMADD132PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFNMADD132PD ymm2/m2= 56/m64bcst, ymmV, {k}{z}, ymm1","vfnmadd132pd ymm2/m256/m64bcst, ymmV, {k}{= z}, ymm1","EVEX.DDS.256.66.0F38.W1 9C /r","V","V","AVX512F+AVX512VL","bscal= e8,scale32","rw,r,r,r","","" +"VFNMADD132PD zmm1{er}, {k}{z}, zmmV, zmm2","VFNMADD132PD zmm2, zmmV, {k}{= z}, zmm1{er}","vfnmadd132pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.= 0F38.W1 9C /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMADD132PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFNMADD132PD zmm2/m5= 12/m64bcst, zmmV, {k}{z}, zmm1","vfnmadd132pd zmm2/m512/m64bcst, zmmV, {k}{= z}, zmm1","EVEX.DDS.512.66.0F38.W1 9C /r","V","V","AVX512F","bscale8,scale6= 4","rw,r,r,r","","" +"VFNMADD132PS xmm1, xmmV, xmm2/m128","VFNMADD132PS xmm2/m128, xmmV, xmm1",= "vfnmadd132ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 9C /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMADD132PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFNMADD132PS xmm2/m1= 28/m32bcst, xmmV, {k}{z}, xmm1","vfnmadd132ps xmm2/m128/m32bcst, xmmV, {k}{= z}, xmm1","EVEX.DDS.128.66.0F38.W0 9C /r","V","V","AVX512F+AVX512VL","bscal= e4,scale16","rw,r,r,r","","" +"VFNMADD132PS ymm1, ymmV, ymm2/m256","VFNMADD132PS ymm2/m256, ymmV, ymm1",= "vfnmadd132ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 9C /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMADD132PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFNMADD132PS ymm2/m2= 56/m32bcst, ymmV, {k}{z}, ymm1","vfnmadd132ps ymm2/m256/m32bcst, ymmV, {k}{= z}, ymm1","EVEX.DDS.256.66.0F38.W0 9C /r","V","V","AVX512F+AVX512VL","bscal= e4,scale32","rw,r,r,r","","" +"VFNMADD132PS zmm1{er}, {k}{z}, zmmV, zmm2","VFNMADD132PS zmm2, zmmV, {k}{= z}, zmm1{er}","vfnmadd132ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.= 0F38.W0 9C /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMADD132PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFNMADD132PS zmm2/m5= 12/m32bcst, zmmV, {k}{z}, zmm1","vfnmadd132ps zmm2/m512/m32bcst, zmmV, {k}{= z}, zmm1","EVEX.DDS.512.66.0F38.W0 9C /r","V","V","AVX512F","bscale4,scale6= 4","rw,r,r,r","","" +"VFNMADD132SD xmm1{er}, {k}{z}, xmmV, xmm2","VFNMADD132SD xmm2, xmmV, {k}{= z}, 
xmm1{er}","vfnmadd132sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.= 0F38.W1 9D /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMADD132SD xmm1, xmmV, xmm2/m64","VFNMADD132SD xmm2/m64, xmmV, xmm1","v= fnmadd132sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 9D /r","V","V","F= MA","","rw,r,r","","" +"VFNMADD132SD xmm1, {k}{z}, xmmV, xmm2/m64","VFNMADD132SD xmm2/m64, xmmV, = {k}{z}, xmm1","vfnmadd132sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.= 0F38.W1 9D /r","V","V","AVX512F","scale8","rw,r,r,r","","" +"VFNMADD132SS xmm1{er}, {k}{z}, xmmV, xmm2","VFNMADD132SS xmm2, xmmV, {k}{= z}, xmm1{er}","vfnmadd132ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.= 0F38.W0 9D /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMADD132SS xmm1, xmmV, xmm2/m32","VFNMADD132SS xmm2/m32, xmmV, xmm1","v= fnmadd132ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 9D /r","V","V","F= MA","","rw,r,r","","" +"VFNMADD132SS xmm1, {k}{z}, xmmV, xmm2/m32","VFNMADD132SS xmm2/m32, xmmV, = {k}{z}, xmm1","vfnmadd132ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.= 0F38.W0 9D /r","V","V","AVX512F","scale4","rw,r,r,r","","" +"VFNMADD213PD xmm1, xmmV, xmm2/m128","VFNMADD213PD xmm2/m128, xmmV, xmm1",= "vfnmadd213pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 AC /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMADD213PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFNMADD213PD xmm2/m1= 28/m64bcst, xmmV, {k}{z}, xmm1","vfnmadd213pd xmm2/m128/m64bcst, xmmV, {k}{= z}, xmm1","EVEX.DDS.128.66.0F38.W1 AC /r","V","V","AVX512F+AVX512VL","bscal= e8,scale16","rw,r,r,r","","" +"VFNMADD213PD ymm1, ymmV, ymm2/m256","VFNMADD213PD ymm2/m256, ymmV, ymm1",= "vfnmadd213pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 AC /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMADD213PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFNMADD213PD ymm2/m2= 56/m64bcst, ymmV, {k}{z}, ymm1","vfnmadd213pd ymm2/m256/m64bcst, ymmV, {k}{= z}, ymm1","EVEX.DDS.256.66.0F38.W1 AC /r","V","V","AVX512F+AVX512VL","bscal= e8,scale32","rw,r,r,r","","" +"VFNMADD213PD zmm1{er}, {k}{z}, zmmV, zmm2","VFNMADD213PD zmm2, zmmV, {k}{= z}, zmm1{er}","vfnmadd213pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.= 0F38.W1 AC /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMADD213PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFNMADD213PD zmm2/m5= 12/m64bcst, zmmV, {k}{z}, zmm1","vfnmadd213pd zmm2/m512/m64bcst, zmmV, {k}{= z}, zmm1","EVEX.DDS.512.66.0F38.W1 AC /r","V","V","AVX512F","bscale8,scale6= 4","rw,r,r,r","","" +"VFNMADD213PS xmm1, xmmV, xmm2/m128","VFNMADD213PS xmm2/m128, xmmV, xmm1",= "vfnmadd213ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 AC /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMADD213PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFNMADD213PS xmm2/m1= 28/m32bcst, xmmV, {k}{z}, xmm1","vfnmadd213ps xmm2/m128/m32bcst, xmmV, {k}{= z}, xmm1","EVEX.DDS.128.66.0F38.W0 AC /r","V","V","AVX512F+AVX512VL","bscal= e4,scale16","rw,r,r,r","","" +"VFNMADD213PS ymm1, ymmV, ymm2/m256","VFNMADD213PS ymm2/m256, ymmV, ymm1",= "vfnmadd213ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 AC /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMADD213PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFNMADD213PS ymm2/m2= 56/m32bcst, ymmV, {k}{z}, ymm1","vfnmadd213ps ymm2/m256/m32bcst, ymmV, {k}{= z}, ymm1","EVEX.DDS.256.66.0F38.W0 AC /r","V","V","AVX512F+AVX512VL","bscal= e4,scale32","rw,r,r,r","","" +"VFNMADD213PS zmm1{er}, {k}{z}, zmmV, zmm2","VFNMADD213PS zmm2, zmmV, {k}{= z}, zmm1{er}","vfnmadd213ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.= 0F38.W0 AC 
/r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMADD213PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFNMADD213PS zmm2/m5= 12/m32bcst, zmmV, {k}{z}, zmm1","vfnmadd213ps zmm2/m512/m32bcst, zmmV, {k}{= z}, zmm1","EVEX.DDS.512.66.0F38.W0 AC /r","V","V","AVX512F","bscale4,scale6= 4","rw,r,r,r","","" +"VFNMADD213SD xmm1{er}, {k}{z}, xmmV, xmm2","VFNMADD213SD xmm2, xmmV, {k}{= z}, xmm1{er}","vfnmadd213sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.= 0F38.W1 AD /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMADD213SD xmm1, xmmV, xmm2/m64","VFNMADD213SD xmm2/m64, xmmV, xmm1","v= fnmadd213sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 AD /r","V","V","F= MA","","rw,r,r","","" +"VFNMADD213SD xmm1, {k}{z}, xmmV, xmm2/m64","VFNMADD213SD xmm2/m64, xmmV, = {k}{z}, xmm1","vfnmadd213sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.= 0F38.W1 AD /r","V","V","AVX512F","scale8","rw,r,r,r","","" +"VFNMADD213SS xmm1{er}, {k}{z}, xmmV, xmm2","VFNMADD213SS xmm2, xmmV, {k}{= z}, xmm1{er}","vfnmadd213ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.= 0F38.W0 AD /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMADD213SS xmm1, xmmV, xmm2/m32","VFNMADD213SS xmm2/m32, xmmV, xmm1","v= fnmadd213ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 AD /r","V","V","F= MA","","rw,r,r","","" +"VFNMADD213SS xmm1, {k}{z}, xmmV, xmm2/m32","VFNMADD213SS xmm2/m32, xmmV, = {k}{z}, xmm1","vfnmadd213ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.= 0F38.W0 AD /r","V","V","AVX512F","scale4","rw,r,r,r","","" +"VFNMADD231PD xmm1, xmmV, xmm2/m128","VFNMADD231PD xmm2/m128, xmmV, xmm1",= "vfnmadd231pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 BC /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMADD231PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFNMADD231PD xmm2/m1= 28/m64bcst, xmmV, {k}{z}, xmm1","vfnmadd231pd xmm2/m128/m64bcst, xmmV, {k}{= z}, xmm1","EVEX.DDS.128.66.0F38.W1 BC /r","V","V","AVX512F+AVX512VL","bscal= e8,scale16","rw,r,r,r","","" +"VFNMADD231PD ymm1, ymmV, ymm2/m256","VFNMADD231PD ymm2/m256, ymmV, ymm1",= "vfnmadd231pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 BC /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMADD231PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFNMADD231PD ymm2/m2= 56/m64bcst, ymmV, {k}{z}, ymm1","vfnmadd231pd ymm2/m256/m64bcst, ymmV, {k}{= z}, ymm1","EVEX.DDS.256.66.0F38.W1 BC /r","V","V","AVX512F+AVX512VL","bscal= e8,scale32","rw,r,r,r","","" +"VFNMADD231PD zmm1{er}, {k}{z}, zmmV, zmm2","VFNMADD231PD zmm2, zmmV, {k}{= z}, zmm1{er}","vfnmadd231pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.= 0F38.W1 BC /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMADD231PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFNMADD231PD zmm2/m5= 12/m64bcst, zmmV, {k}{z}, zmm1","vfnmadd231pd zmm2/m512/m64bcst, zmmV, {k}{= z}, zmm1","EVEX.DDS.512.66.0F38.W1 BC /r","V","V","AVX512F","bscale8,scale6= 4","rw,r,r,r","","" +"VFNMADD231PS xmm1, xmmV, xmm2/m128","VFNMADD231PS xmm2/m128, xmmV, xmm1",= "vfnmadd231ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 BC /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMADD231PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFNMADD231PS xmm2/m1= 28/m32bcst, xmmV, {k}{z}, xmm1","vfnmadd231ps xmm2/m128/m32bcst, xmmV, {k}{= z}, xmm1","EVEX.DDS.128.66.0F38.W0 BC /r","V","V","AVX512F+AVX512VL","bscal= e4,scale16","rw,r,r,r","","" +"VFNMADD231PS ymm1, ymmV, ymm2/m256","VFNMADD231PS ymm2/m256, ymmV, ymm1",= "vfnmadd231ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 BC /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMADD231PS ymm1, {k}{z}, ymmV, 
ymm2/m256/m32bcst","VFNMADD231PS ymm2/m2= 56/m32bcst, ymmV, {k}{z}, ymm1","vfnmadd231ps ymm2/m256/m32bcst, ymmV, {k}{= z}, ymm1","EVEX.DDS.256.66.0F38.W0 BC /r","V","V","AVX512F+AVX512VL","bscal= e4,scale32","rw,r,r,r","","" +"VFNMADD231PS zmm1{er}, {k}{z}, zmmV, zmm2","VFNMADD231PS zmm2, zmmV, {k}{= z}, zmm1{er}","vfnmadd231ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.= 0F38.W0 BC /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMADD231PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFNMADD231PS zmm2/m5= 12/m32bcst, zmmV, {k}{z}, zmm1","vfnmadd231ps zmm2/m512/m32bcst, zmmV, {k}{= z}, zmm1","EVEX.DDS.512.66.0F38.W0 BC /r","V","V","AVX512F","bscale4,scale6= 4","rw,r,r,r","","" +"VFNMADD231SD xmm1{er}, {k}{z}, xmmV, xmm2","VFNMADD231SD xmm2, xmmV, {k}{= z}, xmm1{er}","vfnmadd231sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.= 0F38.W1 BD /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMADD231SD xmm1, xmmV, xmm2/m64","VFNMADD231SD xmm2/m64, xmmV, xmm1","v= fnmadd231sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 BD /r","V","V","F= MA","","rw,r,r","","" +"VFNMADD231SD xmm1, {k}{z}, xmmV, xmm2/m64","VFNMADD231SD xmm2/m64, xmmV, = {k}{z}, xmm1","vfnmadd231sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.= 0F38.W1 BD /r","V","V","AVX512F","scale8","rw,r,r,r","","" +"VFNMADD231SS xmm1{er}, {k}{z}, xmmV, xmm2","VFNMADD231SS xmm2, xmmV, {k}{= z}, xmm1{er}","vfnmadd231ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.= 0F38.W0 BD /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMADD231SS xmm1, xmmV, xmm2/m32","VFNMADD231SS xmm2/m32, xmmV, xmm1","v= fnmadd231ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 BD /r","V","V","F= MA","","rw,r,r","","" +"VFNMADD231SS xmm1, {k}{z}, xmmV, xmm2/m32","VFNMADD231SS xmm2/m32, xmmV, = {k}{z}, xmm1","vfnmadd231ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.= 0F38.W0 BD /r","V","V","AVX512F","scale4","rw,r,r,r","","" +"VFNMADDPD xmm1, xmmV, xmmIH, xmm2/m128","VFNMADDPD xmm2/m128, xmmIH, xmmV= , xmm1","vfnmaddpd xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 79= /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMADDPD xmm1, xmmV, xmm2/m128, xmmIH","VFNMADDPD xmmIH, xmm2/m128, xmmV= , xmm1","vfnmaddpd xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 79= /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMADDPD ymm1, ymmV, ymmIH, ymm2/m256","VFNMADDPD ymm2/m256, ymmIH, ymmV= , ymm1","vfnmaddpd ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 79= /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMADDPD ymm1, ymmV, ymm2/m256, ymmIH","VFNMADDPD ymmIH, ymm2/m256, ymmV= , ymm1","vfnmaddpd ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 79= /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMADDPS xmm1, xmmV, xmmIH, xmm2/m128","VFNMADDPS xmm2/m128, xmmIH, xmmV= , xmm1","vfnmaddps xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 78= /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMADDPS xmm1, xmmV, xmm2/m128, xmmIH","VFNMADDPS xmmIH, xmm2/m128, xmmV= , xmm1","vfnmaddps xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 78= /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMADDPS ymm1, ymmV, ymmIH, ymm2/m256","VFNMADDPS ymm2/m256, ymmIH, ymmV= , ymm1","vfnmaddps ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 78= /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMADDPS ymm1, ymmV, ymm2/m256, ymmIH","VFNMADDPS ymmIH, ymm2/m256, ymmV= , ymm1","vfnmaddps ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 78= /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMADDSD xmm1, xmmV, xmmIH, 
xmm2/m64","VFNMADDSD xmm2/m64, xmmIH, xmmV, = xmm1","vfnmaddsd xmm2/m64, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 7B /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMADDSD xmm1, xmmV, xmm2/m64, xmmIH","VFNMADDSD xmmIH, xmm2/m64, xmmV, = xmm1","vfnmaddsd xmmIH, xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 7B /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMADDSS xmm1, xmmV, xmmIH, xmm2/m32","VFNMADDSS xmm2/m32, xmmIH, xmmV, = xmm1","vfnmaddss xmm2/m32, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 7A /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMADDSS xmm1, xmmV, xmm2/m32, xmmIH","VFNMADDSS xmmIH, xmm2/m32, xmmV, = xmm1","vfnmaddss xmmIH, xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 7A /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMSUB132PD xmm1, xmmV, xmm2/m128","VFNMSUB132PD xmm2/m128, xmmV, xmm1",= "vfnmsub132pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 9E /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMSUB132PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFNMSUB132PD xmm2/m1= 28/m64bcst, xmmV, {k}{z}, xmm1","vfnmsub132pd xmm2/m128/m64bcst, xmmV, {k}{= z}, xmm1","EVEX.DDS.128.66.0F38.W1 9E /r","V","V","AVX512F+AVX512VL","bscal= e8,scale16","rw,r,r,r","","" +"VFNMSUB132PD ymm1, ymmV, ymm2/m256","VFNMSUB132PD ymm2/m256, ymmV, ymm1",= "vfnmsub132pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 9E /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMSUB132PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFNMSUB132PD ymm2/m2= 56/m64bcst, ymmV, {k}{z}, ymm1","vfnmsub132pd ymm2/m256/m64bcst, ymmV, {k}{= z}, ymm1","EVEX.DDS.256.66.0F38.W1 9E /r","V","V","AVX512F+AVX512VL","bscal= e8,scale32","rw,r,r,r","","" +"VFNMSUB132PD zmm1{er}, {k}{z}, zmmV, zmm2","VFNMSUB132PD zmm2, zmmV, {k}{= z}, zmm1{er}","vfnmsub132pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.= 0F38.W1 9E /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMSUB132PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFNMSUB132PD zmm2/m5= 12/m64bcst, zmmV, {k}{z}, zmm1","vfnmsub132pd zmm2/m512/m64bcst, zmmV, {k}{= z}, zmm1","EVEX.DDS.512.66.0F38.W1 9E /r","V","V","AVX512F","bscale8,scale6= 4","rw,r,r,r","","" +"VFNMSUB132PS xmm1, xmmV, xmm2/m128","VFNMSUB132PS xmm2/m128, xmmV, xmm1",= "vfnmsub132ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 9E /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMSUB132PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFNMSUB132PS xmm2/m1= 28/m32bcst, xmmV, {k}{z}, xmm1","vfnmsub132ps xmm2/m128/m32bcst, xmmV, {k}{= z}, xmm1","EVEX.DDS.128.66.0F38.W0 9E /r","V","V","AVX512F+AVX512VL","bscal= e4,scale16","rw,r,r,r","","" +"VFNMSUB132PS ymm1, ymmV, ymm2/m256","VFNMSUB132PS ymm2/m256, ymmV, ymm1",= "vfnmsub132ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 9E /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMSUB132PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFNMSUB132PS ymm2/m2= 56/m32bcst, ymmV, {k}{z}, ymm1","vfnmsub132ps ymm2/m256/m32bcst, ymmV, {k}{= z}, ymm1","EVEX.DDS.256.66.0F38.W0 9E /r","V","V","AVX512F+AVX512VL","bscal= e4,scale32","rw,r,r,r","","" +"VFNMSUB132PS zmm1{er}, {k}{z}, zmmV, zmm2","VFNMSUB132PS zmm2, zmmV, {k}{= z}, zmm1{er}","vfnmsub132ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.= 0F38.W0 9E /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMSUB132PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFNMSUB132PS zmm2/m5= 12/m32bcst, zmmV, {k}{z}, zmm1","vfnmsub132ps zmm2/m512/m32bcst, zmmV, {k}{= z}, zmm1","EVEX.DDS.512.66.0F38.W0 9E /r","V","V","AVX512F","bscale4,scale6= 4","rw,r,r,r","","" +"VFNMSUB132SD xmm1{er}, {k}{z}, xmmV, xmm2","VFNMSUB132SD xmm2, xmmV, {k}{= z}, 
xmm1{er}","vfnmsub132sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.= 0F38.W1 9F /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMSUB132SD xmm1, xmmV, xmm2/m64","VFNMSUB132SD xmm2/m64, xmmV, xmm1","v= fnmsub132sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 9F /r","V","V","F= MA","","rw,r,r","","" +"VFNMSUB132SD xmm1, {k}{z}, xmmV, xmm2/m64","VFNMSUB132SD xmm2/m64, xmmV, = {k}{z}, xmm1","vfnmsub132sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.= 0F38.W1 9F /r","V","V","AVX512F","scale8","rw,r,r,r","","" +"VFNMSUB132SS xmm1{er}, {k}{z}, xmmV, xmm2","VFNMSUB132SS xmm2, xmmV, {k}{= z}, xmm1{er}","vfnmsub132ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.= 0F38.W0 9F /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMSUB132SS xmm1, xmmV, xmm2/m32","VFNMSUB132SS xmm2/m32, xmmV, xmm1","v= fnmsub132ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 9F /r","V","V","F= MA","","rw,r,r","","" +"VFNMSUB132SS xmm1, {k}{z}, xmmV, xmm2/m32","VFNMSUB132SS xmm2/m32, xmmV, = {k}{z}, xmm1","vfnmsub132ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.= 0F38.W0 9F /r","V","V","AVX512F","scale4","rw,r,r,r","","" +"VFNMSUB213PD xmm1, xmmV, xmm2/m128","VFNMSUB213PD xmm2/m128, xmmV, xmm1",= "vfnmsub213pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 AE /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMSUB213PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFNMSUB213PD xmm2/m1= 28/m64bcst, xmmV, {k}{z}, xmm1","vfnmsub213pd xmm2/m128/m64bcst, xmmV, {k}{= z}, xmm1","EVEX.DDS.128.66.0F38.W1 AE /r","V","V","AVX512F+AVX512VL","bscal= e8,scale16","rw,r,r,r","","" +"VFNMSUB213PD ymm1, ymmV, ymm2/m256","VFNMSUB213PD ymm2/m256, ymmV, ymm1",= "vfnmsub213pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 AE /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMSUB213PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFNMSUB213PD ymm2/m2= 56/m64bcst, ymmV, {k}{z}, ymm1","vfnmsub213pd ymm2/m256/m64bcst, ymmV, {k}{= z}, ymm1","EVEX.DDS.256.66.0F38.W1 AE /r","V","V","AVX512F+AVX512VL","bscal= e8,scale32","rw,r,r,r","","" +"VFNMSUB213PD zmm1{er}, {k}{z}, zmmV, zmm2","VFNMSUB213PD zmm2, zmmV, {k}{= z}, zmm1{er}","vfnmsub213pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.= 0F38.W1 AE /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMSUB213PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFNMSUB213PD zmm2/m5= 12/m64bcst, zmmV, {k}{z}, zmm1","vfnmsub213pd zmm2/m512/m64bcst, zmmV, {k}{= z}, zmm1","EVEX.DDS.512.66.0F38.W1 AE /r","V","V","AVX512F","bscale8,scale6= 4","rw,r,r,r","","" +"VFNMSUB213PS xmm1, xmmV, xmm2/m128","VFNMSUB213PS xmm2/m128, xmmV, xmm1",= "vfnmsub213ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 AE /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMSUB213PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFNMSUB213PS xmm2/m1= 28/m32bcst, xmmV, {k}{z}, xmm1","vfnmsub213ps xmm2/m128/m32bcst, xmmV, {k}{= z}, xmm1","EVEX.DDS.128.66.0F38.W0 AE /r","V","V","AVX512F+AVX512VL","bscal= e4,scale16","rw,r,r,r","","" +"VFNMSUB213PS ymm1, ymmV, ymm2/m256","VFNMSUB213PS ymm2/m256, ymmV, ymm1",= "vfnmsub213ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 AE /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMSUB213PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VFNMSUB213PS ymm2/m2= 56/m32bcst, ymmV, {k}{z}, ymm1","vfnmsub213ps ymm2/m256/m32bcst, ymmV, {k}{= z}, ymm1","EVEX.DDS.256.66.0F38.W0 AE /r","V","V","AVX512F+AVX512VL","bscal= e4,scale32","rw,r,r,r","","" +"VFNMSUB213PS zmm1{er}, {k}{z}, zmmV, zmm2","VFNMSUB213PS zmm2, zmmV, {k}{= z}, zmm1{er}","vfnmsub213ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.= 0F38.W0 AE 
/r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMSUB213PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFNMSUB213PS zmm2/m5= 12/m32bcst, zmmV, {k}{z}, zmm1","vfnmsub213ps zmm2/m512/m32bcst, zmmV, {k}{= z}, zmm1","EVEX.DDS.512.66.0F38.W0 AE /r","V","V","AVX512F","bscale4,scale6= 4","rw,r,r,r","","" +"VFNMSUB213SD xmm1{er}, {k}{z}, xmmV, xmm2","VFNMSUB213SD xmm2, xmmV, {k}{= z}, xmm1{er}","vfnmsub213sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.= 0F38.W1 AF /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMSUB213SD xmm1, xmmV, xmm2/m64","VFNMSUB213SD xmm2/m64, xmmV, xmm1","v= fnmsub213sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 AF /r","V","V","F= MA","","rw,r,r","","" +"VFNMSUB213SD xmm1, {k}{z}, xmmV, xmm2/m64","VFNMSUB213SD xmm2/m64, xmmV, = {k}{z}, xmm1","vfnmsub213sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.= 0F38.W1 AF /r","V","V","AVX512F","scale8","rw,r,r,r","","" +"VFNMSUB213SS xmm1{er}, {k}{z}, xmmV, xmm2","VFNMSUB213SS xmm2, xmmV, {k}{= z}, xmm1{er}","vfnmsub213ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.= 0F38.W0 AF /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMSUB213SS xmm1, xmmV, xmm2/m32","VFNMSUB213SS xmm2/m32, xmmV, xmm1","v= fnmsub213ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 AF /r","V","V","F= MA","","rw,r,r","","" +"VFNMSUB213SS xmm1, {k}{z}, xmmV, xmm2/m32","VFNMSUB213SS xmm2/m32, xmmV, = {k}{z}, xmm1","vfnmsub213ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.= 0F38.W0 AF /r","V","V","AVX512F","scale4","rw,r,r,r","","" +"VFNMSUB231PD xmm1, xmmV, xmm2/m128","VFNMSUB231PD xmm2/m128, xmmV, xmm1",= "vfnmsub231pd xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W1 BE /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMSUB231PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VFNMSUB231PD xmm2/m1= 28/m64bcst, xmmV, {k}{z}, xmm1","vfnmsub231pd xmm2/m128/m64bcst, xmmV, {k}{= z}, xmm1","EVEX.DDS.128.66.0F38.W1 BE /r","V","V","AVX512F+AVX512VL","bscal= e8,scale16","rw,r,r,r","","" +"VFNMSUB231PD ymm1, ymmV, ymm2/m256","VFNMSUB231PD ymm2/m256, ymmV, ymm1",= "vfnmsub231pd ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W1 BE /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMSUB231PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VFNMSUB231PD ymm2/m2= 56/m64bcst, ymmV, {k}{z}, ymm1","vfnmsub231pd ymm2/m256/m64bcst, ymmV, {k}{= z}, ymm1","EVEX.DDS.256.66.0F38.W1 BE /r","V","V","AVX512F+AVX512VL","bscal= e8,scale32","rw,r,r,r","","" +"VFNMSUB231PD zmm1{er}, {k}{z}, zmmV, zmm2","VFNMSUB231PD zmm2, zmmV, {k}{= z}, zmm1{er}","vfnmsub231pd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.= 0F38.W1 BE /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMSUB231PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VFNMSUB231PD zmm2/m5= 12/m64bcst, zmmV, {k}{z}, zmm1","vfnmsub231pd zmm2/m512/m64bcst, zmmV, {k}{= z}, zmm1","EVEX.DDS.512.66.0F38.W1 BE /r","V","V","AVX512F","bscale8,scale6= 4","rw,r,r,r","","" +"VFNMSUB231PS xmm1, xmmV, xmm2/m128","VFNMSUB231PS xmm2/m128, xmmV, xmm1",= "vfnmsub231ps xmm2/m128, xmmV, xmm1","VEX.DDS.128.66.0F38.W0 BE /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMSUB231PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VFNMSUB231PS xmm2/m1= 28/m32bcst, xmmV, {k}{z}, xmm1","vfnmsub231ps xmm2/m128/m32bcst, xmmV, {k}{= z}, xmm1","EVEX.DDS.128.66.0F38.W0 BE /r","V","V","AVX512F+AVX512VL","bscal= e4,scale16","rw,r,r,r","","" +"VFNMSUB231PS ymm1, ymmV, ymm2/m256","VFNMSUB231PS ymm2/m256, ymmV, ymm1",= "vfnmsub231ps ymm2/m256, ymmV, ymm1","VEX.DDS.256.66.0F38.W0 BE /r","V","V"= ,"FMA","","rw,r,r","","" +"VFNMSUB231PS ymm1, {k}{z}, ymmV, 
ymm2/m256/m32bcst","VFNMSUB231PS ymm2/m2= 56/m32bcst, ymmV, {k}{z}, ymm1","vfnmsub231ps ymm2/m256/m32bcst, ymmV, {k}{= z}, ymm1","EVEX.DDS.256.66.0F38.W0 BE /r","V","V","AVX512F+AVX512VL","bscal= e4,scale32","rw,r,r,r","","" +"VFNMSUB231PS zmm1{er}, {k}{z}, zmmV, zmm2","VFNMSUB231PS zmm2, zmmV, {k}{= z}, zmm1{er}","vfnmsub231ps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.DDS.512.66.= 0F38.W0 BE /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMSUB231PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VFNMSUB231PS zmm2/m5= 12/m32bcst, zmmV, {k}{z}, zmm1","vfnmsub231ps zmm2/m512/m32bcst, zmmV, {k}{= z}, zmm1","EVEX.DDS.512.66.0F38.W0 BE /r","V","V","AVX512F","bscale4,scale6= 4","rw,r,r,r","","" +"VFNMSUB231SD xmm1{er}, {k}{z}, xmmV, xmm2","VFNMSUB231SD xmm2, xmmV, {k}{= z}, xmm1{er}","vfnmsub231sd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.= 0F38.W1 BF /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMSUB231SD xmm1, xmmV, xmm2/m64","VFNMSUB231SD xmm2/m64, xmmV, xmm1","v= fnmsub231sd xmm2/m64, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W1 BF /r","V","V","F= MA","","rw,r,r","","" +"VFNMSUB231SD xmm1, {k}{z}, xmmV, xmm2/m64","VFNMSUB231SD xmm2/m64, xmmV, = {k}{z}, xmm1","vfnmsub231sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.= 0F38.W1 BF /r","V","V","AVX512F","scale8","rw,r,r,r","","" +"VFNMSUB231SS xmm1{er}, {k}{z}, xmmV, xmm2","VFNMSUB231SS xmm2, xmmV, {k}{= z}, xmm1{er}","vfnmsub231ss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.DDS.128.66.= 0F38.W0 BF /r","V","V","AVX512F","modrm_regonly","rw,r,r,r","","" +"VFNMSUB231SS xmm1, xmmV, xmm2/m32","VFNMSUB231SS xmm2/m32, xmmV, xmm1","v= fnmsub231ss xmm2/m32, xmmV, xmm1","VEX.DDS.LIG.66.0F38.W0 BF /r","V","V","F= MA","","rw,r,r","","" +"VFNMSUB231SS xmm1, {k}{z}, xmmV, xmm2/m32","VFNMSUB231SS xmm2/m32, xmmV, = {k}{z}, xmm1","vfnmsub231ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.DDS.LIG.66.= 0F38.W0 BF /r","V","V","AVX512F","scale4","rw,r,r,r","","" +"VFNMSUBPD xmm1, xmmV, xmmIH, xmm2/m128","VFNMSUBPD xmm2/m128, xmmIH, xmmV= , xmm1","vfnmsubpd xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 7D= /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMSUBPD xmm1, xmmV, xmm2/m128, xmmIH","VFNMSUBPD xmmIH, xmm2/m128, xmmV= , xmm1","vfnmsubpd xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 7D= /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMSUBPD ymm1, ymmV, ymmIH, ymm2/m256","VFNMSUBPD ymm2/m256, ymmIH, ymmV= , ymm1","vfnmsubpd ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 7D= /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMSUBPD ymm1, ymmV, ymm2/m256, ymmIH","VFNMSUBPD ymmIH, ymm2/m256, ymmV= , ymm1","vfnmsubpd ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 7D= /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMSUBPS xmm1, xmmV, xmmIH, xmm2/m128","VFNMSUBPS xmm2/m128, xmmIH, xmmV= , xmm1","vfnmsubps xmm2/m128, xmmIH, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 7C= /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMSUBPS xmm1, xmmV, xmm2/m128, xmmIH","VFNMSUBPS xmmIH, xmm2/m128, xmmV= , xmm1","vfnmsubps xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 7C= /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMSUBPS ymm1, ymmV, ymmIH, ymm2/m256","VFNMSUBPS ymm2/m256, ymmIH, ymmV= , ymm1","vfnmsubps ymm2/m256, ymmIH, ymmV, ymm1","VEX.NDS.256.66.0F3A.W1 7C= /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMSUBPS ymm1, ymmV, ymm2/m256, ymmIH","VFNMSUBPS ymmIH, ymm2/m256, ymmV= , ymm1","vfnmsubps ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 7C= /r /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMSUBSD xmm1, xmmV, xmmIH, 
xmm2/m64","VFNMSUBSD xmm2/m64, xmmIH, xmmV, = xmm1","vfnmsubsd xmm2/m64, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 7F /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMSUBSD xmm1, xmmV, xmm2/m64, xmmIH","VFNMSUBSD xmmIH, xmm2/m64, xmmV, = xmm1","vfnmsubsd xmmIH, xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 7F /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMSUBSS xmm1, xmmV, xmmIH, xmm2/m32","VFNMSUBSS xmm2/m32, xmmIH, xmmV, = xmm1","vfnmsubss xmm2/m32, xmmIH, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W1 7E /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFNMSUBSS xmm1, xmmV, xmm2/m32, xmmIH","VFNMSUBSS xmmIH, xmm2/m32, xmmV, = xmm1","vfnmsubss xmmIH, xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.W0 7E /r= /is4","V","V","FMA4","amd","w,r,r,r","","" +"VFPCLASSPD k1, {k}, xmm2/m128/m64bcst, imm8u","VFPCLASSPDX imm8u, xmm2/m1= 28/m64bcst, {k}, k1","vfpclasspdx imm8u, xmm2/m128/m64bcst, {k}, k1","EVEX.= 128.66.0F3A.W1 66 /r ib","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r= ,r,r","Y","128" +"VFPCLASSPD k1, {k}, ymm2/m256/m64bcst, imm8u","VFPCLASSPDY imm8u, ymm2/m2= 56/m64bcst, {k}, k1","vfpclasspdy imm8u, ymm2/m256/m64bcst, {k}, k1","EVEX.= 256.66.0F3A.W1 66 /r ib","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r= ,r,r","Y","256" +"VFPCLASSPD k1, {k}, zmm2/m512/m64bcst, imm8u","VFPCLASSPDZ imm8u, zmm2/m5= 12/m64bcst, {k}, k1","vfpclasspdz imm8u, zmm2/m512/m64bcst, {k}, k1","EVEX.= 512.66.0F3A.W1 66 /r ib","V","V","AVX512DQ","bscale8,scale64","w,r,r,r","Y"= ,"512" +"VFPCLASSPS k1, {k}, xmm2/m128/m32bcst, imm8u","VFPCLASSPSX imm8u, xmm2/m1= 28/m32bcst, {k}, k1","vfpclasspsx imm8u, xmm2/m128/m32bcst, {k}, k1","EVEX.= 128.66.0F3A.W0 66 /r ib","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r= ,r,r","Y","128" +"VFPCLASSPS k1, {k}, ymm2/m256/m32bcst, imm8u","VFPCLASSPSY imm8u, ymm2/m2= 56/m32bcst, {k}, k1","vfpclasspsy imm8u, ymm2/m256/m32bcst, {k}, k1","EVEX.= 256.66.0F3A.W0 66 /r ib","V","V","AVX512DQ+AVX512VL","bscale4,scale32","w,r= ,r,r","Y","256" +"VFPCLASSPS k1, {k}, zmm2/m512/m32bcst, imm8u","VFPCLASSPSZ imm8u, zmm2/m5= 12/m32bcst, {k}, k1","vfpclasspsz imm8u, zmm2/m512/m32bcst, {k}, k1","EVEX.= 512.66.0F3A.W0 66 /r ib","V","V","AVX512DQ","bscale4,scale64","w,r,r,r","Y"= ,"512" +"VFPCLASSSD k1, {k}, xmm2/m64, imm8u","VFPCLASSSD imm8u, xmm2/m64, {k}, k1= ","vfpclasssd imm8u, xmm2/m64, {k}, k1","EVEX.LIG.66.0F3A.W1 67 /r ib","V",= "V","AVX512DQ","scale8","w,r,r,r","","" +"VFPCLASSSS k1, {k}, xmm2/m32, imm8u","VFPCLASSSS imm8u, xmm2/m32, {k}, k1= ","vfpclassss imm8u, xmm2/m32, {k}, k1","EVEX.LIG.66.0F3A.W0 67 /r ib","V",= "V","AVX512DQ","scale4","w,r,r,r","","" +"VFRCZPD xmm1, xmm2/m128","VFRCZPD xmm2/m128, xmm1","vfrczpd xmm2/m128, xm= m1","XOP.128.09.W0 81 /r","V","V","XOP","amd","w,r","","" +"VFRCZPD ymm1, ymm2/m256","VFRCZPD ymm2/m256, ymm1","vfrczpd ymm2/m256, ym= m1","XOP.256.09.W0 81 /r","V","V","XOP","amd","w,r","","" +"VFRCZPS xmm1, xmm2/m128","VFRCZPS xmm2/m128, xmm1","vfrczps xmm2/m128, xm= m1","XOP.128.09.W0 80 /r","V","V","XOP","amd","w,r","","" +"VFRCZPS ymm1, ymm2/m256","VFRCZPS ymm2/m256, ymm1","vfrczps ymm2/m256, ym= m1","XOP.256.09.W0 80 /r","V","V","XOP","amd","w,r","","" +"VFRCZSD xmm1, xmm2/m64","VFRCZSD xmm2/m64, xmm1","vfrczsd xmm2/m64, xmm1"= ,"XOP.128.09.W0 83 /r","V","V","XOP","amd","w,r","","" +"VFRCZSS xmm1, xmm2/m32","VFRCZSS xmm2/m32, xmm1","vfrczss xmm2/m32, xmm1"= ,"XOP.128.09.W0 82 /r","V","V","XOP","amd","w,r","","" +"VGATHERDPD xmm1, {k1-k7}, vm32x","VGATHERDPD vm32x, {k1-k7}, xmm1","vgath= erdpd vm32x, {k1-k7}, xmm1","EVEX.128.66.0F38.W1 92 
/vsib","V","V","AVX512F= +AVX512VL","modrm_memonly,scale8","w,rw,r","","" +"VGATHERDPD ymm1, {k1-k7}, vm32x","VGATHERDPD vm32x, {k1-k7}, ymm1","vgath= erdpd vm32x, {k1-k7}, ymm1","EVEX.256.66.0F38.W1 92 /vsib","V","V","AVX512F= +AVX512VL","modrm_memonly,scale8","w,rw,r","","" +"VGATHERDPD zmm1, {k1-k7}, vm32y","VGATHERDPD vm32y, {k1-k7}, zmm1","vgath= erdpd vm32y, {k1-k7}, zmm1","EVEX.512.66.0F38.W1 92 /vsib","V","V","AVX512F= ","modrm_memonly,scale8","w,rw,r","","" +"VGATHERDPD xmm1, vm32x, xmmV","VGATHERDPD xmmV, vm32x, xmm1","vgatherdpd = xmmV, vm32x, xmm1","VEX.DDS.128.66.0F38.W1 92 /r","V","V","AVX2","modrm_mem= only","rw,r,rw","","" +"VGATHERDPD ymm1, vm32x, ymmV","VGATHERDPD ymmV, vm32x, ymm1","vgatherdpd = ymmV, vm32x, ymm1","VEX.DDS.256.66.0F38.W1 92 /r","V","V","AVX2","modrm_mem= only","rw,r,rw","","" +"VGATHERDPS xmm1, {k1-k7}, vm32x","VGATHERDPS vm32x, {k1-k7}, xmm1","vgath= erdps vm32x, {k1-k7}, xmm1","EVEX.128.66.0F38.W0 92 /vsib","V","V","AVX512F= +AVX512VL","modrm_memonly,scale4","w,rw,r","","" +"VGATHERDPS ymm1, {k1-k7}, vm32y","VGATHERDPS vm32y, {k1-k7}, ymm1","vgath= erdps vm32y, {k1-k7}, ymm1","EVEX.256.66.0F38.W0 92 /vsib","V","V","AVX512F= +AVX512VL","modrm_memonly,scale4","w,rw,r","","" +"VGATHERDPS zmm1, {k1-k7}, vm32z","VGATHERDPS vm32z, {k1-k7}, zmm1","vgath= erdps vm32z, {k1-k7}, zmm1","EVEX.512.66.0F38.W0 92 /vsib","V","V","AVX512F= ","modrm_memonly,scale4","w,rw,r","","" +"VGATHERDPS xmm1, vm32x, xmmV","VGATHERDPS xmmV, vm32x, xmm1","vgatherdps = xmmV, vm32x, xmm1","VEX.DDS.128.66.0F38.W0 92 /r","V","V","AVX2","modrm_mem= only","rw,r,rw","","" +"VGATHERDPS ymm1, vm32y, ymmV","VGATHERDPS ymmV, vm32y, ymm1","vgatherdps = ymmV, vm32y, ymm1","VEX.DDS.256.66.0F38.W0 92 /r","V","V","AVX2","modrm_mem= only","rw,r,rw","","" +"VGATHERPF0DPD vm32y, {k1-k7}","VGATHERPF0DPD {k1-k7}, vm32y","vgatherpf0d= pd {k1-k7}, vm32y","EVEX.512.66.0F38.W1 C6 /1","V","V","AVX512PF","modrm_me= monly,scale8","r,rw","","" +"VGATHERPF0DPS vm32z, {k1-k7}","VGATHERPF0DPS {k1-k7}, vm32z","vgatherpf0d= ps {k1-k7}, vm32z","EVEX.512.66.0F38.W0 C6 /1","V","V","AVX512PF","modrm_me= monly,scale4","r,rw","","" +"VGATHERPF0QPD vm64z, {k1-k7}","VGATHERPF0QPD {k1-k7}, vm64z","vgatherpf0q= pd {k1-k7}, vm64z","EVEX.512.66.0F38.W1 C7 /1","V","V","AVX512PF","modrm_me= monly,scale8","r,rw","","" +"VGATHERPF0QPS vm64z, {k1-k7}","VGATHERPF0QPS {k1-k7}, vm64z","vgatherpf0q= ps {k1-k7}, vm64z","EVEX.512.66.0F38.W0 C7 /1","V","V","AVX512PF","modrm_me= monly,scale4","r,rw","","" +"VGATHERPF1DPD vm32y, {k1-k7}","VGATHERPF1DPD {k1-k7}, vm32y","vgatherpf1d= pd {k1-k7}, vm32y","EVEX.512.66.0F38.W1 C6 /2","V","V","AVX512PF","modrm_me= monly,scale8","r,rw","","" +"VGATHERPF1DPS vm32z, {k1-k7}","VGATHERPF1DPS {k1-k7}, vm32z","vgatherpf1d= ps {k1-k7}, vm32z","EVEX.512.66.0F38.W0 C6 /2","V","V","AVX512PF","modrm_me= monly,scale4","r,rw","","" +"VGATHERPF1QPD vm64z, {k1-k7}","VGATHERPF1QPD {k1-k7}, vm64z","vgatherpf1q= pd {k1-k7}, vm64z","EVEX.512.66.0F38.W1 C7 /2","V","V","AVX512PF","modrm_me= monly,scale8","r,rw","","" +"VGATHERPF1QPS vm64z, {k1-k7}","VGATHERPF1QPS {k1-k7}, vm64z","vgatherpf1q= ps {k1-k7}, vm64z","EVEX.512.66.0F38.W0 C7 /2","V","V","AVX512PF","modrm_me= monly,scale4","r,rw","","" +"VGATHERQPD xmm1, {k1-k7}, vm64x","VGATHERQPD vm64x, {k1-k7}, xmm1","vgath= erqpd vm64x, {k1-k7}, xmm1","EVEX.128.66.0F38.W1 93 /vsib","V","V","AVX512F= +AVX512VL","modrm_memonly,scale8","w,rw,r","","" +"VGATHERQPD ymm1, {k1-k7}, vm64y","VGATHERQPD vm64y, {k1-k7}, ymm1","vgath= erqpd vm64y, {k1-k7}, ymm1","EVEX.256.66.0F38.W1 93 
/vsib","V","V","AVX512F= +AVX512VL","modrm_memonly,scale8","w,rw,r","","" +"VGATHERQPD zmm1, {k1-k7}, vm64z","VGATHERQPD vm64z, {k1-k7}, zmm1","vgath= erqpd vm64z, {k1-k7}, zmm1","EVEX.512.66.0F38.W1 93 /vsib","V","V","AVX512F= ","modrm_memonly,scale8","w,rw,r","","" +"VGATHERQPD xmm1, vm64x, xmmV","VGATHERQPD xmmV, vm64x, xmm1","vgatherqpd = xmmV, vm64x, xmm1","VEX.DDS.128.66.0F38.W1 93 /r","V","V","AVX2","modrm_mem= only","rw,r,rw","","" +"VGATHERQPD ymm1, vm64y, ymmV","VGATHERQPD ymmV, vm64y, ymm1","vgatherqpd = ymmV, vm64y, ymm1","VEX.DDS.256.66.0F38.W1 93 /r","V","V","AVX2","modrm_mem= only","rw,r,rw","","" +"VGATHERQPS xmm1, {k1-k7}, vm64x","VGATHERQPS vm64x, {k1-k7}, xmm1","vgath= erqps vm64x, {k1-k7}, xmm1","EVEX.128.66.0F38.W0 93 /vsib","V","V","AVX512F= +AVX512VL","modrm_memonly,scale4","w,rw,r","","" +"VGATHERQPS xmm1, {k1-k7}, vm64y","VGATHERQPS vm64y, {k1-k7}, xmm1","vgath= erqps vm64y, {k1-k7}, xmm1","EVEX.256.66.0F38.W0 93 /vsib","V","V","AVX512F= +AVX512VL","modrm_memonly,scale4","w,rw,r","","" +"VGATHERQPS ymm1, {k1-k7}, vm64z","VGATHERQPS vm64z, {k1-k7}, ymm1","vgath= erqps vm64z, {k1-k7}, ymm1","EVEX.512.66.0F38.W0 93 /vsib","V","V","AVX512F= ","modrm_memonly,scale4","w,rw,r","","" +"VGATHERQPS xmm1, vm64x, xmmV","VGATHERQPS xmmV, vm64x, xmm1","vgatherqps = xmmV, vm64x, xmm1","VEX.DDS.128.66.0F38.W0 93 /r","V","V","AVX2","modrm_mem= only","rw,r,rw","","" +"VGATHERQPS xmm1, vm64y, xmmV","VGATHERQPS xmmV, vm64y, xmm1","vgatherqps = xmmV, vm64y, xmm1","VEX.DDS.256.66.0F38.W0 93 /r","V","V","AVX2","modrm_mem= only","rw,r,rw","","" +"VGETEXPPD xmm1, {k}{z}, xmm2/m128/m64bcst","VGETEXPPD xmm2/m128/m64bcst, = {k}{z}, xmm1","vgetexppd xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F38= .W1 42 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","","" +"VGETEXPPD ymm1, {k}{z}, ymm2/m256/m64bcst","VGETEXPPD ymm2/m256/m64bcst, = {k}{z}, ymm1","vgetexppd ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F38= .W1 42 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","","" +"VGETEXPPD zmm1{sae}, {k}{z}, zmm2","VGETEXPPD zmm2, {k}{z}, zmm1{sae}","v= getexppd zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W1 42 /r","V","V","AVX5= 12F","modrm_regonly","w,r,r","","" +"VGETEXPPD zmm1, {k}{z}, zmm2/m512/m64bcst","VGETEXPPD zmm2/m512/m64bcst, = {k}{z}, zmm1","vgetexppd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38= .W1 42 /r","V","V","AVX512F","bscale8,scale64","w,r,r","","" +"VGETEXPPS xmm1, {k}{z}, xmm2/m128/m32bcst","VGETEXPPS xmm2/m128/m32bcst, = {k}{z}, xmm1","vgetexpps xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F38= .W0 42 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","","" +"VGETEXPPS ymm1, {k}{z}, ymm2/m256/m32bcst","VGETEXPPS ymm2/m256/m32bcst, = {k}{z}, ymm1","vgetexpps ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F38= .W0 42 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","","" +"VGETEXPPS zmm1{sae}, {k}{z}, zmm2","VGETEXPPS zmm2, {k}{z}, zmm1{sae}","v= getexpps zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W0 42 /r","V","V","AVX5= 12F","modrm_regonly","w,r,r","","" +"VGETEXPPS zmm1, {k}{z}, zmm2/m512/m32bcst","VGETEXPPS zmm2/m512/m32bcst, = {k}{z}, zmm1","vgetexpps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38= .W0 42 /r","V","V","AVX512F","bscale4,scale64","w,r,r","","" +"VGETEXPSD xmm1{sae}, {k}{z}, xmmV, xmm2","VGETEXPSD xmm2, xmmV, {k}{z}, x= mm1{sae}","vgetexpsd xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F38.W= 1 43 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VGETEXPSD xmm1, {k}{z}, xmmV, xmm2/m64","VGETEXPSD xmm2/m64, 
xmmV, {k}{z}= , xmm1","vgetexpsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W1 4= 3 /r","V","V","AVX512F","scale8","w,r,r,r","","" +"VGETEXPSS xmm1{sae}, {k}{z}, xmmV, xmm2","VGETEXPSS xmm2, xmmV, {k}{z}, x= mm1{sae}","vgetexpss xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F38.W= 0 43 /r","V","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VGETEXPSS xmm1, {k}{z}, xmmV, xmm2/m32","VGETEXPSS xmm2/m32, xmmV, {k}{z}= , xmm1","vgetexpss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W0 4= 3 /r","V","V","AVX512F","scale4","w,r,r,r","","" +"VGETMANTPD xmm1, {k}{z}, xmm2/m128/m64bcst, imm8u:4","VGETMANTPD imm8u:4,= xmm2/m128/m64bcst, {k}{z}, xmm1","vgetmantpd imm8u:4, xmm2/m128/m64bcst, {= k}{z}, xmm1","EVEX.128.66.0F3A.W1 26 /r ib","V","V","AVX512F+AVX512VL","bsc= ale8,scale16","w,r,r,r","","" +"VGETMANTPD ymm1, {k}{z}, ymm2/m256/m64bcst, imm8u:4","VGETMANTPD imm8u:4,= ymm2/m256/m64bcst, {k}{z}, ymm1","vgetmantpd imm8u:4, ymm2/m256/m64bcst, {= k}{z}, ymm1","EVEX.256.66.0F3A.W1 26 /r ib","V","V","AVX512F+AVX512VL","bsc= ale8,scale32","w,r,r,r","","" +"VGETMANTPD zmm1{sae}, {k}{z}, zmm2, imm8u:4","VGETMANTPD imm8u:4, zmm2, {= k}{z}, zmm1{sae}","vgetmantpd imm8u:4, zmm2, {k}{z}, zmm1{sae}","EVEX.512.6= 6.0F3A.W1 26 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VGETMANTPD zmm1, {k}{z}, zmm2/m512/m64bcst, imm8u:4","VGETMANTPD imm8u:4,= zmm2/m512/m64bcst, {k}{z}, zmm1","vgetmantpd imm8u:4, zmm2/m512/m64bcst, {= k}{z}, zmm1","EVEX.512.66.0F3A.W1 26 /r ib","V","V","AVX512F","bscale8,scal= e64","w,r,r,r","","" +"VGETMANTPS xmm1, {k}{z}, xmm2/m128/m32bcst, imm8u:4","VGETMANTPS imm8u:4,= xmm2/m128/m32bcst, {k}{z}, xmm1","vgetmantps imm8u:4, xmm2/m128/m32bcst, {= k}{z}, xmm1","EVEX.128.66.0F3A.W0 26 /r ib","V","V","AVX512F+AVX512VL","bsc= ale4,scale16","w,r,r,r","","" +"VGETMANTPS ymm1, {k}{z}, ymm2/m256/m32bcst, imm8u:4","VGETMANTPS imm8u:4,= ymm2/m256/m32bcst, {k}{z}, ymm1","vgetmantps imm8u:4, ymm2/m256/m32bcst, {= k}{z}, ymm1","EVEX.256.66.0F3A.W0 26 /r ib","V","V","AVX512F+AVX512VL","bsc= ale4,scale32","w,r,r,r","","" +"VGETMANTPS zmm1{sae}, {k}{z}, zmm2, imm8u:4","VGETMANTPS imm8u:4, zmm2, {= k}{z}, zmm1{sae}","vgetmantps imm8u:4, zmm2, {k}{z}, zmm1{sae}","EVEX.512.6= 6.0F3A.W0 26 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VGETMANTPS zmm1, {k}{z}, zmm2/m512/m32bcst, imm8u:4","VGETMANTPS imm8u:4,= zmm2/m512/m32bcst, {k}{z}, zmm1","vgetmantps imm8u:4, zmm2/m512/m32bcst, {= k}{z}, zmm1","EVEX.512.66.0F3A.W0 26 /r ib","V","V","AVX512F","bscale4,scal= e64","w,r,r,r","","" +"VGETMANTSD xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u:4","VGETMANTSD imm8u:4, x= mm2, xmmV, {k}{z}, xmm1{sae}","vgetmantsd imm8u:4, xmm2, xmmV, {k}{z}, xmm1= {sae}","EVEX.NDS.128.66.0F3A.W1 27 /r ib","V","V","AVX512F","modrm_regonly"= ,"w,r,r,r,r","","" +"VGETMANTSD xmm1, {k}{z}, xmmV, xmm2/m64, imm8u:4","VGETMANTSD imm8u:4, xm= m2/m64, xmmV, {k}{z}, xmm1","vgetmantsd imm8u:4, xmm2/m64, xmmV, {k}{z}, xm= m1","EVEX.NDS.LIG.66.0F3A.W1 27 /r ib","V","V","AVX512F","scale8","w,r,r,r,= r","","" +"VGETMANTSS xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u:4","VGETMANTSS imm8u:4, x= mm2, xmmV, {k}{z}, xmm1{sae}","vgetmantss imm8u:4, xmm2, xmmV, {k}{z}, xmm1= {sae}","EVEX.NDS.128.66.0F3A.W0 27 /r ib","V","V","AVX512F","modrm_regonly"= ,"w,r,r,r,r","","" +"VGETMANTSS xmm1, {k}{z}, xmmV, xmm2/m32, imm8u:4","VGETMANTSS imm8u:4, xm= m2/m32, xmmV, {k}{z}, xmm1","vgetmantss imm8u:4, xmm2/m32, xmmV, {k}{z}, xm= m1","EVEX.NDS.LIG.66.0F3A.W0 27 /r ib","V","V","AVX512F","scale4","w,r,r,r,= r","","" +"VGF2P8AFFINEINVQB xmm1, 
xmmV, xmm2/m128, imm8u","VGF2P8AFFINEINVQB imm8u,= xmm2/m128, xmmV, xmm1","vgf2p8affineinvqb imm8u, xmm2/m128, xmmV, xmm1","V= EX.NDS.128.66.0F3A.W1 CF /r ib","V","V","GFNI+AVX","","w,r,r,r","","" +"VGF2P8AFFINEINVQB xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VGF2P8AF= FINEINVQB imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vgf2p8affineinvqb = imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 CF /= r ib","V","V","GFNI+AVX512VL","bscale8,scale16","w,r,r,r,r","","" +"VGF2P8AFFINEINVQB ymm1, ymmV, ymm2/m256, imm8u","VGF2P8AFFINEINVQB imm8u,= ymm2/m256, ymmV, ymm1","vgf2p8affineinvqb imm8u, ymm2/m256, ymmV, ymm1","V= EX.NDS.256.66.0F3A.W1 CF /r ib","V","V","GFNI+AVX","","w,r,r,r","","" +"VGF2P8AFFINEINVQB ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VGF2P8AF= FINEINVQB imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vgf2p8affineinvqb = imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 CF /= r ib","V","V","GFNI+AVX512VL","bscale8,scale32","w,r,r,r,r","","" +"VGF2P8AFFINEINVQB zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VGF2P8AF= FINEINVQB imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vgf2p8affineinvqb = imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 CF /= r ib","V","V","GFNI+AVX512F","bscale8,scale64","w,r,r,r,r","","" +"VGF2P8AFFINEQB xmm1, xmmV, xmm2/m128, imm8u","VGF2P8AFFINEQB imm8u, xmm2/= m128, xmmV, xmm1","vgf2p8affineqb imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.12= 8.66.0F3A.W1 CE /r ib","V","V","GFNI+AVX","","w,r,r,r","","" +"VGF2P8AFFINEQB xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VGF2P8AFFIN= EQB imm8u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vgf2p8affineqb imm8u, xm= m2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 CE /r ib","V"= ,"V","GFNI+AVX512VL","bscale8,scale16","w,r,r,r,r","","" +"VGF2P8AFFINEQB ymm1, ymmV, ymm2/m256, imm8u","VGF2P8AFFINEQB imm8u, ymm2/= m256, ymmV, ymm1","vgf2p8affineqb imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.25= 6.66.0F3A.W1 CE /r ib","V","V","GFNI+AVX","","w,r,r,r","","" +"VGF2P8AFFINEQB ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VGF2P8AFFIN= EQB imm8u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vgf2p8affineqb imm8u, ym= m2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 CE /r ib","V"= ,"V","GFNI+AVX512VL","bscale8,scale32","w,r,r,r,r","","" +"VGF2P8AFFINEQB zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VGF2P8AFFIN= EQB imm8u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vgf2p8affineqb imm8u, zm= m2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 CE /r ib","V"= ,"V","GFNI+AVX512F","bscale8,scale64","w,r,r,r,r","","" +"VGF2P8MULB xmm1, xmmV, xmm2/m128","VGF2P8MULB xmm2/m128, xmmV, xmm1","vgf= 2p8mulb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 CF /r","V","V","GFNI= +AVX","","w,r,r","","" +"VGF2P8MULB xmm1, {k}{z}, xmmV, xmm2/m128","VGF2P8MULB xmm2/m128, xmmV, {k= }{z}, xmm1","vgf2p8mulb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3= 8.W0 CF /r","V","V","GFNI+AVX512VL","scale16","w,r,r,r","","" +"VGF2P8MULB ymm1, ymmV, ymm2/m256","VGF2P8MULB ymm2/m256, ymmV, ymm1","vgf= 2p8mulb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 CF /r","V","V","GFNI= +AVX","","w,r,r","","" +"VGF2P8MULB ymm1, {k}{z}, ymmV, ymm2/m256","VGF2P8MULB ymm2/m256, ymmV, {k= }{z}, ymm1","vgf2p8mulb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3= 8.W0 CF /r","V","V","GFNI+AVX512VL","scale32","w,r,r,r","","" +"VGF2P8MULB zmm1, {k}{z}, zmmV, zmm2/m512","VGF2P8MULB zmm2/m512, zmmV, {k= }{z}, zmm1","vgf2p8mulb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3= 8.W0 CF 
/r","V","V","GFNI+AVX512F","scale64","w,r,r,r","","" +"VHADDPD xmm1, xmmV, xmm2/m128","VHADDPD xmm2/m128, xmmV, xmm1","vhaddpd x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 7C /r","V","V","AVX","","w,r,r= ","","" +"VHADDPD ymm1, ymmV, ymm2/m256","VHADDPD ymm2/m256, ymmV, ymm1","vhaddpd y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 7C /r","V","V","AVX","","w,r,r= ","","" +"VHADDPS xmm1, xmmV, xmm2/m128","VHADDPS xmm2/m128, xmmV, xmm1","vhaddps x= mm2/m128, xmmV, xmm1","VEX.NDS.128.F2.0F.WIG 7C /r","V","V","AVX","","w,r,r= ","","" +"VHADDPS ymm1, ymmV, ymm2/m256","VHADDPS ymm2/m256, ymmV, ymm1","vhaddps y= mm2/m256, ymmV, ymm1","VEX.NDS.256.F2.0F.WIG 7C /r","V","V","AVX","","w,r,r= ","","" +"VHSUBPD xmm1, xmmV, xmm2/m128","VHSUBPD xmm2/m128, xmmV, xmm1","vhsubpd x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 7D /r","V","V","AVX","","w,r,r= ","","" +"VHSUBPD ymm1, ymmV, ymm2/m256","VHSUBPD ymm2/m256, ymmV, ymm1","vhsubpd y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 7D /r","V","V","AVX","","w,r,r= ","","" +"VHSUBPS xmm1, xmmV, xmm2/m128","VHSUBPS xmm2/m128, xmmV, xmm1","vhsubps x= mm2/m128, xmmV, xmm1","VEX.NDS.128.F2.0F.WIG 7D /r","V","V","AVX","","w,r,r= ","","" +"VHSUBPS ymm1, ymmV, ymm2/m256","VHSUBPS ymm2/m256, ymmV, ymm1","vhsubps y= mm2/m256, ymmV, ymm1","VEX.NDS.256.F2.0F.WIG 7D /r","V","V","AVX","","w,r,r= ","","" +"VINSERTF128 ymm1, ymmV, xmm2/m128, imm8u:1","VINSERTF128 imm8u:1, xmm2/m1= 28, ymmV, ymm1","vinsertf128 imm8u:1, xmm2/m128, ymmV, ymm1","VEX.NDS.256.6= 6.0F3A.W0 18 /r ib","V","V","AVX","","w,r,r,r","","" +"VINSERTF32X4 ymm1, {k}{z}, ymmV, xmm2/m128, imm8u:1","VINSERTF32X4 imm8u:= 1, xmm2/m128, ymmV, {k}{z}, ymm1","vinsertf32x4 imm8u:1, xmm2/m128, ymmV, {= k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 18 /r ib","V","V","AVX512F+AVX512VL",= "scale16","w,r,r,r,r","","" +"VINSERTF32X4 zmm1, {k}{z}, zmmV, xmm2/m128, imm8u:2","VINSERTF32X4 imm8u:= 2, xmm2/m128, zmmV, {k}{z}, zmm1","vinsertf32x4 imm8u:2, xmm2/m128, zmmV, {= k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 18 /r ib","V","V","AVX512F","scale16"= ,"w,r,r,r,r","","" +"VINSERTF32X8 zmm1, {k}{z}, zmmV, ymm2/m256, imm8u:1","VINSERTF32X8 imm8u:= 1, ymm2/m256, zmmV, {k}{z}, zmm1","vinsertf32x8 imm8u:1, ymm2/m256, zmmV, {= k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 1A /r ib","V","V","AVX512DQ","scale32= ","w,r,r,r,r","","" +"VINSERTF64X2 ymm1, {k}{z}, ymmV, xmm2/m128, imm8u:1","VINSERTF64X2 imm8u:= 1, xmm2/m128, ymmV, {k}{z}, ymm1","vinsertf64x2 imm8u:1, xmm2/m128, ymmV, {= k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 18 /r ib","V","V","AVX512DQ+AVX512VL"= ,"scale16","w,r,r,r,r","","" +"VINSERTF64X2 zmm1, {k}{z}, zmmV, xmm2/m128, imm8u:2","VINSERTF64X2 imm8u:= 2, xmm2/m128, zmmV, {k}{z}, zmm1","vinsertf64x2 imm8u:2, xmm2/m128, zmmV, {= k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 18 /r ib","V","V","AVX512DQ","scale16= ","w,r,r,r,r","","" +"VINSERTF64X4 zmm1, {k}{z}, zmmV, ymm2/m256, imm8u:1","VINSERTF64X4 imm8u:= 1, ymm2/m256, zmmV, {k}{z}, zmm1","vinsertf64x4 imm8u:1, ymm2/m256, zmmV, {= k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 1A /r ib","V","V","AVX512F","scale32"= ,"w,r,r,r,r","","" +"VINSERTI128 ymm1, ymmV, xmm2/m128, imm8u:1","VINSERTI128 imm8u:1, xmm2/m1= 28, ymmV, ymm1","vinserti128 imm8u:1, xmm2/m128, ymmV, ymm1","VEX.NDS.256.6= 6.0F3A.W0 38 /r ib","V","V","AVX2","","w,r,r,r","","" +"VINSERTI32X4 ymm1, {k}{z}, ymmV, xmm2/m128, imm8u:1","VINSERTI32X4 imm8u:= 1, xmm2/m128, ymmV, {k}{z}, ymm1","vinserti32x4 imm8u:1, xmm2/m128, ymmV, {= k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 38 /r ib","V","V","AVX512F+AVX512VL",= "scale16","w,r,r,r,r","","" 
+"VINSERTI32X4 zmm1, {k}{z}, zmmV, xmm2/m128, imm8u:2","VINSERTI32X4 imm8u:= 2, xmm2/m128, zmmV, {k}{z}, zmm1","vinserti32x4 imm8u:2, xmm2/m128, zmmV, {= k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 38 /r ib","V","V","AVX512F","scale16"= ,"w,r,r,r,r","","" +"VINSERTI32X8 zmm1, {k}{z}, zmmV, ymm2/m256, imm8u:1","VINSERTI32X8 imm8u:= 1, ymm2/m256, zmmV, {k}{z}, zmm1","vinserti32x8 imm8u:1, ymm2/m256, zmmV, {= k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 3A /r ib","V","V","AVX512DQ","scale32= ","w,r,r,r,r","","" +"VINSERTI64X2 ymm1, {k}{z}, ymmV, xmm2/m128, imm8u:1","VINSERTI64X2 imm8u:= 1, xmm2/m128, ymmV, {k}{z}, ymm1","vinserti64x2 imm8u:1, xmm2/m128, ymmV, {= k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 38 /r ib","V","V","AVX512DQ+AVX512VL"= ,"scale16","w,r,r,r,r","","" +"VINSERTI64X2 zmm1, {k}{z}, zmmV, xmm2/m128, imm8u:2","VINSERTI64X2 imm8u:= 2, xmm2/m128, zmmV, {k}{z}, zmm1","vinserti64x2 imm8u:2, xmm2/m128, zmmV, {= k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 38 /r ib","V","V","AVX512DQ","scale16= ","w,r,r,r,r","","" +"VINSERTI64X4 zmm1, {k}{z}, zmmV, ymm2/m256, imm8u:1","VINSERTI64X4 imm8u:= 1, ymm2/m256, zmmV, {k}{z}, zmm1","vinserti64x4 imm8u:1, ymm2/m256, zmmV, {= k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 3A /r ib","V","V","AVX512F","scale32"= ,"w,r,r,r,r","","" +"VINSERTPS xmm1, xmmV, xmm2/m32, imm8u","VINSERTPS imm8u, xmm2/m32, xmmV, = xmm1","vinsertps imm8u, xmm2/m32, xmmV, xmm1","EVEX.NDS.128.66.0F3A.W0 21 /= r ib","V","V","AVX512F+AVX512VL","scale4","w,r,r,r","","" +"VINSERTPS xmm1, xmmV, xmm2/m32, imm8u","VINSERTPS imm8u, xmm2/m32, xmmV, = xmm1","vinsertps imm8u, xmm2/m32, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 21 /= r ib","V","V","AVX","","w,r,r,r","","" +"VLDDQU xmm1, m128","VLDDQU m128, xmm1","vlddqu m128, xmm1","VEX.128.F2.0F= .WIG F0 /r","V","V","AVX","modrm_memonly","w,r","","" +"VLDDQU ymm1, m256","VLDDQU m256, ymm1","vlddqu m256, ymm1","VEX.256.F2.0F= .WIG F0 /r","V","V","AVX","modrm_memonly","w,r","","" +"VLDMXCSR m32","VLDMXCSR m32","vldmxcsr m32","VEX.128.0F.WIG AE /2","V","V= ","AVX","modrm_memonly","r","","" +"VMASKMOVDQU xmm1, xmm2","VMASKMOVDQU xmm2, xmm1","vmaskmovdqu xmm2, xmm1"= ,"VEX.128.66.0F.WIG F7 /r","V","V","AVX","modrm_regonly","r,r","","" +"VMASKMOVPD xmm1, xmmV, m128","VMASKMOVPD m128, xmmV, xmm1","vmaskmovpd m1= 28, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 2D /r","V","V","AVX","modrm_memonly= ","w,r,r","","" +"VMASKMOVPD ymm1, ymmV, m256","VMASKMOVPD m256, ymmV, ymm1","vmaskmovpd m2= 56, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 2D /r","V","V","AVX","modrm_memonly= ","w,r,r","","" +"VMASKMOVPD m128, xmmV, xmm1","VMASKMOVPD xmm1, xmmV, m128","vmaskmovpd xm= m1, xmmV, m128","VEX.NDS.128.66.0F38.W0 2F /r","V","V","AVX","modrm_memonly= ","w,r,r","","" +"VMASKMOVPD m256, ymmV, ymm1","VMASKMOVPD ymm1, ymmV, m256","vmaskmovpd ym= m1, ymmV, m256","VEX.NDS.256.66.0F38.W0 2F /r","V","V","AVX","modrm_memonly= ","w,r,r","","" +"VMASKMOVPS xmm1, xmmV, m128","VMASKMOVPS m128, xmmV, xmm1","vmaskmovps m1= 28, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 2C /r","V","V","AVX","modrm_memonly= ","w,r,r","","" +"VMASKMOVPS ymm1, ymmV, m256","VMASKMOVPS m256, ymmV, ymm1","vmaskmovps m2= 56, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 2C /r","V","V","AVX","modrm_memonly= ","w,r,r","","" +"VMASKMOVPS m128, xmmV, xmm1","VMASKMOVPS xmm1, xmmV, m128","vmaskmovps xm= m1, xmmV, m128","VEX.NDS.128.66.0F38.W0 2E /r","V","V","AVX","modrm_memonly= ","w,r,r","","" +"VMASKMOVPS m256, ymmV, ymm1","VMASKMOVPS ymm1, ymmV, m256","vmaskmovps ym= m1, ymmV, m256","VEX.NDS.256.66.0F38.W0 2E /r","V","V","AVX","modrm_memonly= ","w,r,r","","" +"VMAXPD 
xmm1, xmmV, xmm2/m128","VMAXPD xmm2/m128, xmmV, xmm1","vmaxpd xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 5F /r","V","V","AVX","","w,r,r","= ","" +"VMAXPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VMAXPD xmm2/m128/m64bcst, = xmmV, {k}{z}, xmm1","vmaxpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F.W1 5F /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r= ","","" +"VMAXPD ymm1, ymmV, ymm2/m256","VMAXPD ymm2/m256, ymmV, ymm1","vmaxpd ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 5F /r","V","V","AVX","","w,r,r","= ","" +"VMAXPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VMAXPD ymm2/m256/m64bcst, = ymmV, {k}{z}, ymm1","vmaxpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F.W1 5F /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r= ","","" +"VMAXPD zmm1{sae}, {k}{z}, zmmV, zmm2","VMAXPD zmm2, zmmV, {k}{z}, zmm1{sa= e}","vmaxpd zmm2, zmmV, {k}{z}, zmm1{sae}","EVEX.NDS.512.66.0F.W1 5F /r","V= ","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VMAXPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VMAXPD zmm2/m512/m64bcst, = zmmV, {k}{z}, zmm1","vmaxpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F.W1 5F /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VMAXPS xmm1, xmmV, xmm2/m128","VMAXPS xmm2/m128, xmmV, xmm1","vmaxps xmm2= /m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 5F /r","V","V","AVX","","w,r,r","","" +"VMAXPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VMAXPS xmm2/m128/m32bcst, = xmmV, {k}{z}, xmm1","vmaxps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.0F.W0 5F /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","= ","" +"VMAXPS ymm1, ymmV, ymm2/m256","VMAXPS ymm2/m256, ymmV, ymm1","vmaxps ymm2= /m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 5F /r","V","V","AVX","","w,r,r","","" +"VMAXPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VMAXPS ymm2/m256/m32bcst, = ymmV, {k}{z}, ymm1","vmaxps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.0F.W0 5F /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","= ","" +"VMAXPS zmm1{sae}, {k}{z}, zmmV, zmm2","VMAXPS zmm2, zmmV, {k}{z}, zmm1{sa= e}","vmaxps zmm2, zmmV, {k}{z}, zmm1{sae}","EVEX.NDS.512.0F.W0 5F /r","V","= V","AVX512F","modrm_regonly","w,r,r,r","","" +"VMAXPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VMAXPS zmm2/m512/m32bcst, = zmmV, {k}{z}, zmm1","vmaxps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.0F.W0 5F /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VMAXSD xmm1{sae}, {k}{z}, xmmV, xmm2","VMAXSD xmm2, xmmV, {k}{z}, xmm1{sa= e}","vmaxsd xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.F2.0F.W1 5F /r","V= ","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VMAXSD xmm1, xmmV, xmm2/m64","VMAXSD xmm2/m64, xmmV, xmm1","vmaxsd xmm2/m= 64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 5F /r","V","V","AVX","","w,r,r","","" +"VMAXSD xmm1, {k}{z}, xmmV, xmm2/m64","VMAXSD xmm2/m64, xmmV, {k}{z}, xmm1= ","vmaxsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 5F /r","V","= V","AVX512F","scale8","w,r,r,r","","" +"VMAXSS xmm1{sae}, {k}{z}, xmmV, xmm2","VMAXSS xmm2, xmmV, {k}{z}, xmm1{sa= e}","vmaxss xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.F3.0F.W0 5F /r","V= ","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VMAXSS xmm1, xmmV, xmm2/m32","VMAXSS xmm2/m32, xmmV, xmm1","vmaxss xmm2/m= 32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 5F /r","V","V","AVX","","w,r,r","","" +"VMAXSS xmm1, {k}{z}, xmmV, xmm2/m32","VMAXSS xmm2/m32, xmmV, {k}{z}, xmm1= ","vmaxss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 5F /r","V","= V","AVX512F","scale4","w,r,r,r","","" +"VMCALL","VMCALL","vmcall","0F 01 
C1","V","V","VTX","","","","" +"VMCLEAR m64","VMCLEAR m64","vmclear m64","66 0F C7 /6","V","V","VTX","mod= rm_memonly","r","","" +"VMFUNC","VMFUNC","vmfunc","0F 01 D4","V","V","","","","","" +"VMINPD xmm1, xmmV, xmm2/m128","VMINPD xmm2/m128, xmmV, xmm1","vminpd xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 5D /r","V","V","AVX","","w,r,r","= ","" +"VMINPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VMINPD xmm2/m128/m64bcst, = xmmV, {k}{z}, xmm1","vminpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F.W1 5D /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r= ","","" +"VMINPD ymm1, ymmV, ymm2/m256","VMINPD ymm2/m256, ymmV, ymm1","vminpd ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 5D /r","V","V","AVX","","w,r,r","= ","" +"VMINPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VMINPD ymm2/m256/m64bcst, = ymmV, {k}{z}, ymm1","vminpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F.W1 5D /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r= ","","" +"VMINPD zmm1{sae}, {k}{z}, zmmV, zmm2","VMINPD zmm2, zmmV, {k}{z}, zmm1{sa= e}","vminpd zmm2, zmmV, {k}{z}, zmm1{sae}","EVEX.NDS.512.66.0F.W1 5D /r","V= ","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VMINPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VMINPD zmm2/m512/m64bcst, = zmmV, {k}{z}, zmm1","vminpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F.W1 5D /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VMINPS xmm1, xmmV, xmm2/m128","VMINPS xmm2/m128, xmmV, xmm1","vminps xmm2= /m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 5D /r","V","V","AVX","","w,r,r","","" +"VMINPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VMINPS xmm2/m128/m32bcst, = xmmV, {k}{z}, xmm1","vminps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.0F.W0 5D /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","= ","" +"VMINPS ymm1, ymmV, ymm2/m256","VMINPS ymm2/m256, ymmV, ymm1","vminps ymm2= /m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 5D /r","V","V","AVX","","w,r,r","","" +"VMINPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VMINPS ymm2/m256/m32bcst, = ymmV, {k}{z}, ymm1","vminps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.0F.W0 5D /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","= ","" +"VMINPS zmm1{sae}, {k}{z}, zmmV, zmm2","VMINPS zmm2, zmmV, {k}{z}, zmm1{sa= e}","vminps zmm2, zmmV, {k}{z}, zmm1{sae}","EVEX.NDS.512.0F.W0 5D /r","V","= V","AVX512F","modrm_regonly","w,r,r,r","","" +"VMINPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VMINPS zmm2/m512/m32bcst, = zmmV, {k}{z}, zmm1","vminps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.0F.W0 5D /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VMINSD xmm1{sae}, {k}{z}, xmmV, xmm2","VMINSD xmm2, xmmV, {k}{z}, xmm1{sa= e}","vminsd xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.F2.0F.W1 5D /r","V= ","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VMINSD xmm1, xmmV, xmm2/m64","VMINSD xmm2/m64, xmmV, xmm1","vminsd xmm2/m= 64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 5D /r","V","V","AVX","","w,r,r","","" +"VMINSD xmm1, {k}{z}, xmmV, xmm2/m64","VMINSD xmm2/m64, xmmV, {k}{z}, xmm1= ","vminsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 5D /r","V","= V","AVX512F","scale8","w,r,r,r","","" +"VMINSS xmm1{sae}, {k}{z}, xmmV, xmm2","VMINSS xmm2, xmmV, {k}{z}, xmm1{sa= e}","vminss xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.F3.0F.W0 5D /r","V= ","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VMINSS xmm1, xmmV, xmm2/m32","VMINSS xmm2/m32, xmmV, xmm1","vminss xmm2/m= 32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 5D /r","V","V","AVX","","w,r,r","","" +"VMINSS xmm1, {k}{z}, xmmV, 
xmm2/m32","VMINSS xmm2/m32, xmmV, {k}{z}, xmm1= ","vminss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 5D /r","V","= V","AVX512F","scale4","w,r,r,r","","" +"VMLAUNCH","VMLAUNCH","vmlaunch","0F 01 C2","V","V","VTX","","","","" +"VMLOAD EAX","VMLOADL EAX","vmloadl EAX","0F 01 DA","V","V","SVM","amd,mod= rm_regonly,operand32","r","Y","32" +"VMLOAD RAX","VMLOADQ RAX","vmloadq RAX","REX.W 0F 01 DA","N.S.","V","SVM"= ,"amd,modrm_regonly","r","Y","64" +"VMLOAD AX","VMLOADW AX","vmloadw AX","0F 01 DA","V","V","SVM","amd,modrm_= regonly,operand16","r","Y","16" +"VMMCALL","VMMCALL","vmmcall","0F 01 D9","V","V","SVM","amd","","","" +"VMOVAPD xmm2/m128, xmm1","VMOVAPD xmm1, xmm2/m128","vmovapd xmm1, xmm2/m1= 28","VEX.128.66.0F.WIG 29 /r","V","V","AVX","","w,r","","" +"VMOVAPD xmm2/m128, {k}{z}, xmm1","VMOVAPD xmm1, {k}{z}, xmm2/m128","vmova= pd xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F.W1 29 /r","V","V","AVX512F+AVX5= 12VL","scale16","w,r,r","","" +"VMOVAPD xmm1, xmm2/m128","VMOVAPD xmm2/m128, xmm1","vmovapd xmm2/m128, xm= m1","VEX.128.66.0F.WIG 28 /r","V","V","AVX","","w,r","","" +"VMOVAPD xmm1, {k}{z}, xmm2/m128","VMOVAPD xmm2/m128, {k}{z}, xmm1","vmova= pd xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F.W1 28 /r","V","V","AVX512F+AVX5= 12VL","scale16","w,r,r","","" +"VMOVAPD ymm2/m256, ymm1","VMOVAPD ymm1, ymm2/m256","vmovapd ymm1, ymm2/m2= 56","VEX.256.66.0F.WIG 29 /r","V","V","AVX","","w,r","","" +"VMOVAPD ymm2/m256, {k}{z}, ymm1","VMOVAPD ymm1, {k}{z}, ymm2/m256","vmova= pd ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F.W1 29 /r","V","V","AVX512F+AVX5= 12VL","scale32","w,r,r","","" +"VMOVAPD ymm1, ymm2/m256","VMOVAPD ymm2/m256, ymm1","vmovapd ymm2/m256, ym= m1","VEX.256.66.0F.WIG 28 /r","V","V","AVX","","w,r","","" +"VMOVAPD ymm1, {k}{z}, ymm2/m256","VMOVAPD ymm2/m256, {k}{z}, ymm1","vmova= pd ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F.W1 28 /r","V","V","AVX512F+AVX5= 12VL","scale32","w,r,r","","" +"VMOVAPD zmm2/m512, {k}{z}, zmm1","VMOVAPD zmm1, {k}{z}, zmm2/m512","vmova= pd zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F.W1 29 /r","V","V","AVX512F","sc= ale64","w,r,r","","" +"VMOVAPD zmm1, {k}{z}, zmm2/m512","VMOVAPD zmm2/m512, {k}{z}, zmm1","vmova= pd zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F.W1 28 /r","V","V","AVX512F","sc= ale64","w,r,r","","" +"VMOVAPS xmm2/m128, xmm1","VMOVAPS xmm1, xmm2/m128","vmovaps xmm1, xmm2/m1= 28","VEX.128.0F.WIG 29 /r","V","V","AVX","","w,r","","" +"VMOVAPS xmm2/m128, {k}{z}, xmm1","VMOVAPS xmm1, {k}{z}, xmm2/m128","vmova= ps xmm1, {k}{z}, xmm2/m128","EVEX.128.0F.W0 29 /r","V","V","AVX512F+AVX512V= L","scale16","w,r,r","","" +"VMOVAPS xmm1, xmm2/m128","VMOVAPS xmm2/m128, xmm1","vmovaps xmm2/m128, xm= m1","VEX.128.0F.WIG 28 /r","V","V","AVX","","w,r","","" +"VMOVAPS xmm1, {k}{z}, xmm2/m128","VMOVAPS xmm2/m128, {k}{z}, xmm1","vmova= ps xmm2/m128, {k}{z}, xmm1","EVEX.128.0F.W0 28 /r","V","V","AVX512F+AVX512V= L","scale16","w,r,r","","" +"VMOVAPS ymm2/m256, ymm1","VMOVAPS ymm1, ymm2/m256","vmovaps ymm1, ymm2/m2= 56","VEX.256.0F.WIG 29 /r","V","V","AVX","","w,r","","" +"VMOVAPS ymm2/m256, {k}{z}, ymm1","VMOVAPS ymm1, {k}{z}, ymm2/m256","vmova= ps ymm1, {k}{z}, ymm2/m256","EVEX.256.0F.W0 29 /r","V","V","AVX512F+AVX512V= L","scale32","w,r,r","","" +"VMOVAPS ymm1, ymm2/m256","VMOVAPS ymm2/m256, ymm1","vmovaps ymm2/m256, ym= m1","VEX.256.0F.WIG 28 /r","V","V","AVX","","w,r","","" +"VMOVAPS ymm1, {k}{z}, ymm2/m256","VMOVAPS ymm2/m256, {k}{z}, ymm1","vmova= ps ymm2/m256, {k}{z}, ymm1","EVEX.256.0F.W0 28 /r","V","V","AVX512F+AVX512V= L","scale32","w,r,r","","" +"VMOVAPS zmm2/m512, {k}{z}, 
zmm1","VMOVAPS zmm1, {k}{z}, zmm2/m512","vmova= ps zmm1, {k}{z}, zmm2/m512","EVEX.512.0F.W0 29 /r","V","V","AVX512F","scale= 64","w,r,r","","" +"VMOVAPS zmm1, {k}{z}, zmm2/m512","VMOVAPS zmm2/m512, {k}{z}, zmm1","vmova= ps zmm2/m512, {k}{z}, zmm1","EVEX.512.0F.W0 28 /r","V","V","AVX512F","scale= 64","w,r,r","","" +"VMOVD xmm1, r/m32","VMOVD r/m32, xmm1","vmovd r/m32, xmm1","EVEX.128.66.0= F.W0 6E /r","V","V","AVX512F+AVX512VL","scale4","w,r","","" +"VMOVD xmm1, r/m32","VMOVD r/m32, xmm1","vmovd r/m32, xmm1","VEX.128.66.0F= .W0 6E /r","V","V","AVX","","w,r","","" +"VMOVD r/m32, xmm1","VMOVD xmm1, r/m32","vmovd xmm1, r/m32","EVEX.128.66.0= F.W0 7E /r","V","V","AVX512F+AVX512VL","scale4","w,r","","" +"VMOVD r/m32, xmm1","VMOVD xmm1, r/m32","vmovd xmm1, r/m32","VEX.128.66.0F= .W0 7E /r","V","V","AVX","","w,r","","" +"VMOVDDUP xmm1, xmm2/m64","VMOVDDUP xmm2/m64, xmm1","vmovddup xmm2/m64, xm= m1","VEX.128.F2.0F.WIG 12 /r","V","V","AVX","","w,r","","" +"VMOVDDUP xmm1, {k}{z}, xmm2/m64","VMOVDDUP xmm2/m64, {k}{z}, xmm1","vmovd= dup xmm2/m64, {k}{z}, xmm1","EVEX.128.F2.0F.W1 12 /r","V","V","AVX512F+AVX5= 12VL","scale8","w,r,r","","" +"VMOVDDUP ymm1, ymm2/m256","VMOVDDUP ymm2/m256, ymm1","vmovddup ymm2/m256,= ymm1","VEX.256.F2.0F.WIG 12 /r","V","V","AVX","","w,r","","" +"VMOVDDUP ymm1, {k}{z}, ymm2/m256","VMOVDDUP ymm2/m256, {k}{z}, ymm1","vmo= vddup ymm2/m256, {k}{z}, ymm1","EVEX.256.F2.0F.W1 12 /r","V","V","AVX512F+A= VX512VL","scale32","w,r,r","","" +"VMOVDDUP zmm1, {k}{z}, zmm2/m512","VMOVDDUP zmm2/m512, {k}{z}, zmm1","vmo= vddup zmm2/m512, {k}{z}, zmm1","EVEX.512.F2.0F.W1 12 /r","V","V","AVX512F",= "scale64","w,r,r","","" +"VMOVDQA xmm2/m128, xmm1","VMOVDQA xmm1, xmm2/m128","vmovdqa xmm1, xmm2/m1= 28","VEX.128.66.0F.WIG 7F /r","V","V","AVX","","w,r","","" +"VMOVDQA xmm1, xmm2/m128","VMOVDQA xmm2/m128, xmm1","vmovdqa xmm2/m128, xm= m1","VEX.128.66.0F.WIG 6F /r","V","V","AVX","","w,r","","" +"VMOVDQA ymm2/m256, ymm1","VMOVDQA ymm1, ymm2/m256","vmovdqa ymm1, ymm2/m2= 56","VEX.256.66.0F.WIG 7F /r","V","V","AVX","","w,r","","" +"VMOVDQA ymm1, ymm2/m256","VMOVDQA ymm2/m256, ymm1","vmovdqa ymm2/m256, ym= m1","VEX.256.66.0F.WIG 6F /r","V","V","AVX","","w,r","","" +"VMOVDQA32 xmm2/m128, {k}{z}, xmm1","VMOVDQA32 xmm1, {k}{z}, xmm2/m128","v= movdqa32 xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F.W0 7F /r","V","V","AVX512= F+AVX512VL","scale16","w,r,r","","" +"VMOVDQA32 xmm1, {k}{z}, xmm2/m128","VMOVDQA32 xmm2/m128, {k}{z}, xmm1","v= movdqa32 xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F.W0 6F /r","V","V","AVX512= F+AVX512VL","scale16","w,r,r","","" +"VMOVDQA32 ymm2/m256, {k}{z}, ymm1","VMOVDQA32 ymm1, {k}{z}, ymm2/m256","v= movdqa32 ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F.W0 7F /r","V","V","AVX512= F+AVX512VL","scale32","w,r,r","","" +"VMOVDQA32 ymm1, {k}{z}, ymm2/m256","VMOVDQA32 ymm2/m256, {k}{z}, ymm1","v= movdqa32 ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F.W0 6F /r","V","V","AVX512= F+AVX512VL","scale32","w,r,r","","" +"VMOVDQA32 zmm2/m512, {k}{z}, zmm1","VMOVDQA32 zmm1, {k}{z}, zmm2/m512","v= movdqa32 zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F.W0 7F /r","V","V","AVX512= F","scale64","w,r,r","","" +"VMOVDQA32 zmm1, {k}{z}, zmm2/m512","VMOVDQA32 zmm2/m512, {k}{z}, zmm1","v= movdqa32 zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F.W0 6F /r","V","V","AVX512= F","scale64","w,r,r","","" +"VMOVDQA64 xmm2/m128, {k}{z}, xmm1","VMOVDQA64 xmm1, {k}{z}, xmm2/m128","v= movdqa64 xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F.W1 7F /r","V","V","AVX512= F+AVX512VL","scale16","w,r,r","Y","128" +"VMOVDQA64 xmm1, {k}{z}, xmm2/m128","VMOVDQA64 
xmm2/m128, {k}{z}, xmm1","v= movdqa64 xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F.W1 6F /r","V","V","AVX512= F+AVX512VL","scale16","w,r,r","Y","128" +"VMOVDQA64 ymm2/m256, {k}{z}, ymm1","VMOVDQA64 ymm1, {k}{z}, ymm2/m256","v= movdqa64 ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F.W1 7F /r","V","V","AVX512= F+AVX512VL","scale32","w,r,r","Y","256" +"VMOVDQA64 ymm1, {k}{z}, ymm2/m256","VMOVDQA64 ymm2/m256, {k}{z}, ymm1","v= movdqa64 ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F.W1 6F /r","V","V","AVX512= F+AVX512VL","scale32","w,r,r","Y","256" +"VMOVDQA64 zmm2/m512, {k}{z}, zmm1","VMOVDQA64 zmm1, {k}{z}, zmm2/m512","v= movdqa64 zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F.W1 7F /r","V","V","AVX512= F","scale64","w,r,r","Y","512" +"VMOVDQA64 zmm1, {k}{z}, zmm2/m512","VMOVDQA64 zmm2/m512, {k}{z}, zmm1","v= movdqa64 zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F.W1 6F /r","V","V","AVX512= F","scale64","w,r,r","Y","512" +"VMOVDQU xmm2/m128, xmm1","VMOVDQU xmm1, xmm2/m128","vmovdqu xmm1, xmm2/m1= 28","VEX.128.F3.0F.WIG 7F /r","V","V","AVX","","w,r","","" +"VMOVDQU xmm1, xmm2/m128","VMOVDQU xmm2/m128, xmm1","vmovdqu xmm2/m128, xm= m1","VEX.128.F3.0F.WIG 6F /r","V","V","AVX","","w,r","","" +"VMOVDQU ymm2/m256, ymm1","VMOVDQU ymm1, ymm2/m256","vmovdqu ymm1, ymm2/m2= 56","VEX.256.F3.0F.WIG 7F /r","V","V","AVX","","w,r","","" +"VMOVDQU ymm1, ymm2/m256","VMOVDQU ymm2/m256, ymm1","vmovdqu ymm2/m256, ym= m1","VEX.256.F3.0F.WIG 6F /r","V","V","AVX","","w,r","","" +"VMOVDQU16 xmm2/m128, {k}{z}, xmm1","VMOVDQU16 xmm1, {k}{z}, xmm2/m128","v= movdqu16 xmm1, {k}{z}, xmm2/m128","EVEX.128.F2.0F.W1 7F /r","V","V","AVX512= BW+AVX512VL","scale16","w,r,r","","" +"VMOVDQU16 xmm1, {k}{z}, xmm2/m128","VMOVDQU16 xmm2/m128, {k}{z}, xmm1","v= movdqu16 xmm2/m128, {k}{z}, xmm1","EVEX.128.F2.0F.W1 6F /r","V","V","AVX512= BW+AVX512VL","scale16","w,r,r","","" +"VMOVDQU16 ymm2/m256, {k}{z}, ymm1","VMOVDQU16 ymm1, {k}{z}, ymm2/m256","v= movdqu16 ymm1, {k}{z}, ymm2/m256","EVEX.256.F2.0F.W1 7F /r","V","V","AVX512= BW+AVX512VL","scale32","w,r,r","","" +"VMOVDQU16 ymm1, {k}{z}, ymm2/m256","VMOVDQU16 ymm2/m256, {k}{z}, ymm1","v= movdqu16 ymm2/m256, {k}{z}, ymm1","EVEX.256.F2.0F.W1 6F /r","V","V","AVX512= BW+AVX512VL","scale32","w,r,r","","" +"VMOVDQU16 zmm2/m512, {k}{z}, zmm1","VMOVDQU16 zmm1, {k}{z}, zmm2/m512","v= movdqu16 zmm1, {k}{z}, zmm2/m512","EVEX.512.F2.0F.W1 7F /r","V","V","AVX512= BW","scale64","w,r,r","","" +"VMOVDQU16 zmm1, {k}{z}, zmm2/m512","VMOVDQU16 zmm2/m512, {k}{z}, zmm1","v= movdqu16 zmm2/m512, {k}{z}, zmm1","EVEX.512.F2.0F.W1 6F /r","V","V","AVX512= BW","scale64","w,r,r","","" +"VMOVDQU32 xmm2/m128, {k}{z}, xmm1","VMOVDQU32 xmm1, {k}{z}, xmm2/m128","v= movdqu32 xmm1, {k}{z}, xmm2/m128","EVEX.128.F3.0F.W0 7F /r","V","V","AVX512= F+AVX512VL","scale16","w,r,r","Y","128" +"VMOVDQU32 xmm1, {k}{z}, xmm2/m128","VMOVDQU32 xmm2/m128, {k}{z}, xmm1","v= movdqu32 xmm2/m128, {k}{z}, xmm1","EVEX.128.F3.0F.W0 6F /r","V","V","AVX512= F+AVX512VL","scale16","w,r,r","Y","128" +"VMOVDQU32 ymm2/m256, {k}{z}, ymm1","VMOVDQU32 ymm1, {k}{z}, ymm2/m256","v= movdqu32 ymm1, {k}{z}, ymm2/m256","EVEX.256.F3.0F.W0 7F /r","V","V","AVX512= F+AVX512VL","scale32","w,r,r","Y","256" +"VMOVDQU32 ymm1, {k}{z}, ymm2/m256","VMOVDQU32 ymm2/m256, {k}{z}, ymm1","v= movdqu32 ymm2/m256, {k}{z}, ymm1","EVEX.256.F3.0F.W0 6F /r","V","V","AVX512= F+AVX512VL","scale32","w,r,r","Y","256" +"VMOVDQU32 zmm2/m512, {k}{z}, zmm1","VMOVDQU32 zmm1, {k}{z}, zmm2/m512","v= movdqu32 zmm1, {k}{z}, zmm2/m512","EVEX.512.F3.0F.W0 7F /r","V","V","AVX512= F","scale64","w,r,r","Y","512" +"VMOVDQU32 zmm1, 
{k}{z}, zmm2/m512","VMOVDQU32 zmm2/m512, {k}{z}, zmm1","v= movdqu32 zmm2/m512, {k}{z}, zmm1","EVEX.512.F3.0F.W0 6F /r","V","V","AVX512= F","scale64","w,r,r","Y","512" +"VMOVDQU64 xmm2/m128, {k}{z}, xmm1","VMOVDQU64 xmm1, {k}{z}, xmm2/m128","v= movdqu64 xmm1, {k}{z}, xmm2/m128","EVEX.128.F3.0F.W1 7F /r","V","V","AVX512= F+AVX512VL","scale16","w,r,r","Y","128" +"VMOVDQU64 xmm1, {k}{z}, xmm2/m128","VMOVDQU64 xmm2/m128, {k}{z}, xmm1","v= movdqu64 xmm2/m128, {k}{z}, xmm1","EVEX.128.F3.0F.W1 6F /r","V","V","AVX512= F+AVX512VL","scale16","w,r,r","Y","128" +"VMOVDQU64 ymm2/m256, {k}{z}, ymm1","VMOVDQU64 ymm1, {k}{z}, ymm2/m256","v= movdqu64 ymm1, {k}{z}, ymm2/m256","EVEX.256.F3.0F.W1 7F /r","V","V","AVX512= F+AVX512VL","scale32","w,r,r","Y","256" +"VMOVDQU64 ymm1, {k}{z}, ymm2/m256","VMOVDQU64 ymm2/m256, {k}{z}, ymm1","v= movdqu64 ymm2/m256, {k}{z}, ymm1","EVEX.256.F3.0F.W1 6F /r","V","V","AVX512= F+AVX512VL","scale32","w,r,r","Y","256" +"VMOVDQU64 zmm2/m512, {k}{z}, zmm1","VMOVDQU64 zmm1, {k}{z}, zmm2/m512","v= movdqu64 zmm1, {k}{z}, zmm2/m512","EVEX.512.F3.0F.W1 7F /r","V","V","AVX512= F","scale64","w,r,r","Y","512" +"VMOVDQU64 zmm1, {k}{z}, zmm2/m512","VMOVDQU64 zmm2/m512, {k}{z}, zmm1","v= movdqu64 zmm2/m512, {k}{z}, zmm1","EVEX.512.F3.0F.W1 6F /r","V","V","AVX512= F","scale64","w,r,r","Y","512" +"VMOVDQU8 xmm2/m128, {k}{z}, xmm1","VMOVDQU8 xmm1, {k}{z}, xmm2/m128","vmo= vdqu8 xmm1, {k}{z}, xmm2/m128","EVEX.128.F2.0F.W0 7F /r","V","V","AVX512BW+= AVX512VL","scale16","w,r,r","Y","128" +"VMOVDQU8 xmm1, {k}{z}, xmm2/m128","VMOVDQU8 xmm2/m128, {k}{z}, xmm1","vmo= vdqu8 xmm2/m128, {k}{z}, xmm1","EVEX.128.F2.0F.W0 6F /r","V","V","AVX512BW+= AVX512VL","scale16","w,r,r","Y","128" +"VMOVDQU8 ymm2/m256, {k}{z}, ymm1","VMOVDQU8 ymm1, {k}{z}, ymm2/m256","vmo= vdqu8 ymm1, {k}{z}, ymm2/m256","EVEX.256.F2.0F.W0 7F /r","V","V","AVX512BW+= AVX512VL","scale32","w,r,r","Y","256" +"VMOVDQU8 ymm1, {k}{z}, ymm2/m256","VMOVDQU8 ymm2/m256, {k}{z}, ymm1","vmo= vdqu8 ymm2/m256, {k}{z}, ymm1","EVEX.256.F2.0F.W0 6F /r","V","V","AVX512BW+= AVX512VL","scale32","w,r,r","Y","256" +"VMOVDQU8 zmm2/m512, {k}{z}, zmm1","VMOVDQU8 zmm1, {k}{z}, zmm2/m512","vmo= vdqu8 zmm1, {k}{z}, zmm2/m512","EVEX.512.F2.0F.W0 7F /r","V","V","AVX512BW"= ,"scale64","w,r,r","Y","512" +"VMOVDQU8 zmm1, {k}{z}, zmm2/m512","VMOVDQU8 zmm2/m512, {k}{z}, zmm1","vmo= vdqu8 zmm2/m512, {k}{z}, zmm1","EVEX.512.F2.0F.W0 6F /r","V","V","AVX512BW"= ,"scale64","w,r,r","Y","512" +"VMOVHLPS xmm1, xmmV, xmm2","VMOVHLPS xmm2, xmmV, xmm1","vmovhlps xmm2, xm= mV, xmm1","EVEX.NDS.128.0F.W0 12 /r","V","V","AVX512F+AVX512VL","modrm_rego= nly","w,r,r","","" +"VMOVHLPS xmm1, xmmV, xmm2","VMOVHLPS xmm2, xmmV, xmm1","vmovhlps xmm2, xm= mV, xmm1","VEX.NDS.128.0F.WIG 12 /r","V","V","AVX","modrm_regonly","w,r,r",= "","" +"VMOVHPD xmm1, xmmV, m64","VMOVHPD m64, xmmV, xmm1","vmovhpd m64, xmmV, xm= m1","EVEX.NDS.LIG.66.0F.W1 16 /r","V","V","AVX512F+AVX512VL","modrm_memonly= ,scale8","w,r,r","","" +"VMOVHPD xmm1, xmmV, m64","VMOVHPD m64, xmmV, xmm1","vmovhpd m64, xmmV, xm= m1","VEX.NDS.128.66.0F.WIG 16 /r","V","V","AVX","modrm_memonly","w,r,r","",= "" +"VMOVHPD m64, xmm1","VMOVHPD xmm1, m64","vmovhpd xmm1, m64","EVEX.LIG.66.0= F.W1 17 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,r","","" +"VMOVHPD m64, xmm1","VMOVHPD xmm1, m64","vmovhpd xmm1, m64","VEX.128.66.0F= .WIG 17 /r","V","V","AVX","modrm_memonly","w,r","","" +"VMOVHPS xmm1, xmmV, m64","VMOVHPS m64, xmmV, xmm1","vmovhps m64, xmmV, xm= m1","EVEX.NDS.128.0F.W0 16 /r","V","V","AVX512F+AVX512VL","modrm_memonly,sc= 
ale8","w,r,r","","" +"VMOVHPS xmm1, xmmV, m64","VMOVHPS m64, xmmV, xmm1","vmovhps m64, xmmV, xm= m1","VEX.NDS.128.0F.WIG 16 /r","V","V","AVX","modrm_memonly","w,r,r","","" +"VMOVHPS m64, xmm1","VMOVHPS xmm1, m64","vmovhps xmm1, m64","EVEX.128.0F.W= 0 17 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,r","","" +"VMOVHPS m64, xmm1","VMOVHPS xmm1, m64","vmovhps xmm1, m64","VEX.128.0F.WI= G 17 /r","V","V","AVX","modrm_memonly","w,r","","" +"VMOVLHPS xmm1, xmmV, xmm2","VMOVLHPS xmm2, xmmV, xmm1","vmovlhps xmm2, xm= mV, xmm1","EVEX.NDS.128.0F.W0 16 /r","V","V","AVX512F+AVX512VL","modrm_rego= nly","w,r,r","","" +"VMOVLHPS xmm1, xmmV, xmm2","VMOVLHPS xmm2, xmmV, xmm1","vmovlhps xmm2, xm= mV, xmm1","VEX.NDS.128.0F.WIG 16 /r","V","V","AVX","modrm_regonly","w,r,r",= "","" +"VMOVLPD xmm1, xmmV, m64","VMOVLPD m64, xmmV, xmm1","vmovlpd m64, xmmV, xm= m1","EVEX.NDS.LIG.66.0F.W1 12 /r","V","V","AVX512F+AVX512VL","modrm_memonly= ,scale8","w,r,r","","" +"VMOVLPD xmm1, xmmV, m64","VMOVLPD m64, xmmV, xmm1","vmovlpd m64, xmmV, xm= m1","VEX.NDS.128.66.0F.WIG 12 /r","V","V","AVX","modrm_memonly","w,r,r","",= "" +"VMOVLPD m64, xmm1","VMOVLPD xmm1, m64","vmovlpd xmm1, m64","EVEX.LIG.66.0= F.W1 13 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,r","","" +"VMOVLPD m64, xmm1","VMOVLPD xmm1, m64","vmovlpd xmm1, m64","VEX.128.66.0F= .WIG 13 /r","V","V","AVX","modrm_memonly","w,r","","" +"VMOVLPS xmm1, xmmV, m64","VMOVLPS m64, xmmV, xmm1","vmovlps m64, xmmV, xm= m1","EVEX.NDS.128.0F.W0 12 /r","V","V","AVX512F+AVX512VL","modrm_memonly,sc= ale8","w,r,r","","" +"VMOVLPS xmm1, xmmV, m64","VMOVLPS m64, xmmV, xmm1","vmovlps m64, xmmV, xm= m1","VEX.NDS.128.0F.WIG 12 /r","V","V","AVX","modrm_memonly","w,r,r","","" +"VMOVLPS m64, xmm1","VMOVLPS xmm1, m64","vmovlps xmm1, m64","EVEX.128.0F.W= 0 13 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale8","w,r","","" +"VMOVLPS m64, xmm1","VMOVLPS xmm1, m64","vmovlps xmm1, m64","VEX.128.0F.WI= G 13 /r","V","V","AVX","modrm_memonly","w,r","","" +"VMOVMSKPD r32, xmm2","VMOVMSKPD xmm2, r32","vmovmskpd xmm2, r32","VEX.128= .66.0F.WIG 50 /r","V","V","AVX","modrm_regonly","w,r","","" +"VMOVMSKPD r32, ymm2","VMOVMSKPD ymm2, r32","vmovmskpd ymm2, r32","VEX.256= .66.0F.WIG 50 /r","V","V","AVX","modrm_regonly","w,r","","" +"VMOVMSKPS r32, xmm2","VMOVMSKPS xmm2, r32","vmovmskps xmm2, r32","VEX.128= .0F.WIG 50 /r","V","V","AVX","modrm_regonly","w,r","","" +"VMOVMSKPS r32, ymm2","VMOVMSKPS ymm2, r32","vmovmskps ymm2, r32","VEX.256= .0F.WIG 50 /r","V","V","AVX","modrm_regonly","w,r","","" +"VMOVNTDQ m128, xmm1","VMOVNTDQ xmm1, m128","vmovntdq xmm1, m128","EVEX.12= 8.66.0F.W0 E7 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale16","w,r",= "","" +"VMOVNTDQ m128, xmm1","VMOVNTDQ xmm1, m128","vmovntdq xmm1, m128","VEX.128= .66.0F.WIG E7 /r","V","V","AVX","modrm_memonly","w,r","","" +"VMOVNTDQ m256, ymm1","VMOVNTDQ ymm1, m256","vmovntdq ymm1, m256","EVEX.25= 6.66.0F.W0 E7 /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale32","w,r",= "","" +"VMOVNTDQ m256, ymm1","VMOVNTDQ ymm1, m256","vmovntdq ymm1, m256","VEX.256= .66.0F.WIG E7 /r","V","V","AVX","modrm_memonly","w,r","","" +"VMOVNTDQ m512, zmm1","VMOVNTDQ zmm1, m512","vmovntdq zmm1, m512","EVEX.51= 2.66.0F.W0 E7 /r","V","V","AVX512F","modrm_memonly,scale64","w,r","","" +"VMOVNTDQA xmm1, m128","VMOVNTDQA m128, xmm1","vmovntdqa m128, xmm1","EVEX= .128.66.0F38.W0 2A /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale16","= w,r","","" +"VMOVNTDQA xmm1, m128","VMOVNTDQA m128, xmm1","vmovntdqa m128, xmm1","VEX.= 128.66.0F38.WIG 2A 
/r","V","V","AVX","modrm_memonly","w,r","","" +"VMOVNTDQA ymm1, m256","VMOVNTDQA m256, ymm1","vmovntdqa m256, ymm1","EVEX= .256.66.0F38.W0 2A /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale32","= w,r","","" +"VMOVNTDQA ymm1, m256","VMOVNTDQA m256, ymm1","vmovntdqa m256, ymm1","VEX.= 256.66.0F38.WIG 2A /r","V","V","AVX2","modrm_memonly","w,r","","" +"VMOVNTDQA zmm1, m512","VMOVNTDQA m512, zmm1","vmovntdqa m512, zmm1","EVEX= .512.66.0F38.W0 2A /r","V","V","AVX512F","modrm_memonly,scale64","w,r","","" +"VMOVNTPD m128, xmm1","VMOVNTPD xmm1, m128","vmovntpd xmm1, m128","EVEX.12= 8.66.0F.W1 2B /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale16","w,r",= "","" +"VMOVNTPD m128, xmm1","VMOVNTPD xmm1, m128","vmovntpd xmm1, m128","VEX.128= .66.0F.WIG 2B /r","V","V","AVX","modrm_memonly","w,r","","" +"VMOVNTPD m256, ymm1","VMOVNTPD ymm1, m256","vmovntpd ymm1, m256","EVEX.25= 6.66.0F.W1 2B /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale32","w,r",= "","" +"VMOVNTPD m256, ymm1","VMOVNTPD ymm1, m256","vmovntpd ymm1, m256","VEX.256= .66.0F.WIG 2B /r","V","V","AVX","modrm_memonly","w,r","","" +"VMOVNTPD m512, zmm1","VMOVNTPD zmm1, m512","vmovntpd zmm1, m512","EVEX.51= 2.66.0F.W1 2B /r","V","V","AVX512F","modrm_memonly,scale64","w,r","","" +"VMOVNTPS m128, xmm1","VMOVNTPS xmm1, m128","vmovntps xmm1, m128","EVEX.12= 8.0F.W0 2B /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale16","w,r","",= "" +"VMOVNTPS m128, xmm1","VMOVNTPS xmm1, m128","vmovntps xmm1, m128","VEX.128= .0F.WIG 2B /r","V","V","AVX","modrm_memonly","w,r","","" +"VMOVNTPS m256, ymm1","VMOVNTPS ymm1, m256","vmovntps ymm1, m256","EVEX.25= 6.0F.W0 2B /r","V","V","AVX512F+AVX512VL","modrm_memonly,scale32","w,r","",= "" +"VMOVNTPS m256, ymm1","VMOVNTPS ymm1, m256","vmovntps ymm1, m256","VEX.256= .0F.WIG 2B /r","V","V","AVX","modrm_memonly","w,r","","" +"VMOVNTPS m512, zmm1","VMOVNTPS zmm1, m512","vmovntps zmm1, m512","EVEX.51= 2.0F.W0 2B /r","V","V","AVX512F","modrm_memonly,scale64","w,r","","" +"VMOVQ xmm1, r/m64","VMOVQ r/m64, xmm1","vmovq r/m64, xmm1","EVEX.128.66.0= F.W1 6E /r","N.S.","V","AVX512F+AVX512VL","scale8","w,r","","" +"VMOVQ xmm1, r/m64","VMOVQ r/m64, xmm1","vmovq r/m64, xmm1","VEX.128.66.0F= .W1 6E /r","N.S.","V","AVX","","w,r","","" +"VMOVQ r/m64, xmm1","VMOVQ xmm1, r/m64","vmovq xmm1, r/m64","EVEX.128.66.0= F.W1 7E /r","N.S.","V","AVX512F+AVX512VL","scale8","w,r","","" +"VMOVQ r/m64, xmm1","VMOVQ xmm1, r/m64","vmovq xmm1, r/m64","VEX.128.66.0F= .W1 7E /r","N.S.","V","AVX","","w,r","","" +"VMOVQ xmm2/m64, xmm1","VMOVQ xmm1, xmm2/m64","vmovq xmm1, xmm2/m64","EVEX= .LIG.66.0F.W1 D6 /r","V","V","AVX512F+AVX512VL","scale8","w,r","","" +"VMOVQ xmm2/m64, xmm1","VMOVQ xmm1, xmm2/m64","vmovq xmm1, xmm2/m64","VEX.= 128.66.0F.WIG D6 /r","V","V","AVX","","w,r","","" +"VMOVQ xmm1, xmm2/m64","VMOVQ xmm2/m64, xmm1","vmovq xmm2/m64, xmm1","EVEX= .LIG.F3.0F.W1 7E /r","V","V","AVX512F+AVX512VL","scale8","w,r","","" +"VMOVQ xmm1, xmm2/m64","VMOVQ xmm2/m64, xmm1","vmovq xmm2/m64, xmm1","VEX.= 128.F3.0F.WIG 7E /r","V","V","AVX","","w,r","","" +"VMOVSD xmm1, m64","VMOVSD m64, xmm1","vmovsd m64, xmm1","VEX.LIG.F2.0F.WI= G 10 /r","V","V","AVX","modrm_memonly","w,r","","" +"VMOVSD xmm1, {k}{z}, m64","VMOVSD m64, {k}{z}, xmm1","vmovsd m64, {k}{z},= xmm1","EVEX.LIG.F2.0F.W1 10 /r","V","V","AVX512F","modrm_memonly,scale8","= w,r,r","","" +"VMOVSD m64, xmm1","VMOVSD xmm1, m64","vmovsd xmm1, m64","VEX.LIG.F2.0F.WI= G 11 /r","V","V","AVX","modrm_memonly","w,r","","" +"VMOVSD xmm2, xmmV, xmm1","VMOVSD xmm1, xmmV, xmm2","vmovsd xmm1, xmmV, xm= 
m2","VEX.NDS.LIG.F2.0F.WIG 11 /r","V","V","AVX","modrm_regonly","w,r,r","",= "" +"VMOVSD xmm2, {k}{z}, xmmV, xmm1","VMOVSD xmm1, xmmV, {k}{z}, xmm2","vmovs= d xmm1, xmmV, {k}{z}, xmm2","EVEX.NDS.LIG.F2.0F.W1 11 /r","V","V","AVX512F"= ,"modrm_regonly","w,r,r,r","","" +"VMOVSD m64, {k}, xmm1","VMOVSD xmm1, {k}, m64","vmovsd xmm1, {k}, m64","E= VEX.LIG.F2.0F.W1 11 /r","V","V","AVX512F","modrm_memonly,scale8","w,r,r",""= ,"" +"VMOVSD xmm1, xmmV, xmm2","VMOVSD xmm2, xmmV, xmm1","vmovsd xmm2, xmmV, xm= m1","VEX.NDS.LIG.F2.0F.WIG 10 /r","V","V","AVX","modrm_regonly","w,r,r","",= "" +"VMOVSD xmm1, {k}{z}, xmmV, xmm2","VMOVSD xmm2, xmmV, {k}{z}, xmm1","vmovs= d xmm2, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 10 /r","V","V","AVX512F"= ,"modrm_regonly","w,r,r,r","","" +"VMOVSHDUP xmm1, xmm2/m128","VMOVSHDUP xmm2/m128, xmm1","vmovshdup xmm2/m1= 28, xmm1","VEX.128.F3.0F.WIG 16 /r","V","V","AVX","","w,r","","" +"VMOVSHDUP xmm1, {k}{z}, xmm2/m128","VMOVSHDUP xmm2/m128, {k}{z}, xmm1","v= movshdup xmm2/m128, {k}{z}, xmm1","EVEX.128.F3.0F.W0 16 /r","V","V","AVX512= F+AVX512VL","scale16","w,r,r","","" +"VMOVSHDUP ymm1, ymm2/m256","VMOVSHDUP ymm2/m256, ymm1","vmovshdup ymm2/m2= 56, ymm1","VEX.256.F3.0F.WIG 16 /r","V","V","AVX","","w,r","","" +"VMOVSHDUP ymm1, {k}{z}, ymm2/m256","VMOVSHDUP ymm2/m256, {k}{z}, ymm1","v= movshdup ymm2/m256, {k}{z}, ymm1","EVEX.256.F3.0F.W0 16 /r","V","V","AVX512= F+AVX512VL","scale32","w,r,r","","" +"VMOVSHDUP zmm1, {k}{z}, zmm2/m512","VMOVSHDUP zmm2/m512, {k}{z}, zmm1","v= movshdup zmm2/m512, {k}{z}, zmm1","EVEX.512.F3.0F.W0 16 /r","V","V","AVX512= F","scale64","w,r,r","","" +"VMOVSLDUP xmm1, xmm2/m128","VMOVSLDUP xmm2/m128, xmm1","vmovsldup xmm2/m1= 28, xmm1","VEX.128.F3.0F.WIG 12 /r","V","V","AVX","","w,r","","" +"VMOVSLDUP xmm1, {k}{z}, xmm2/m128","VMOVSLDUP xmm2/m128, {k}{z}, xmm1","v= movsldup xmm2/m128, {k}{z}, xmm1","EVEX.128.F3.0F.W0 12 /r","V","V","AVX512= F+AVX512VL","scale16","w,r,r","","" +"VMOVSLDUP ymm1, ymm2/m256","VMOVSLDUP ymm2/m256, ymm1","vmovsldup ymm2/m2= 56, ymm1","VEX.256.F3.0F.WIG 12 /r","V","V","AVX","","w,r","","" +"VMOVSLDUP ymm1, {k}{z}, ymm2/m256","VMOVSLDUP ymm2/m256, {k}{z}, ymm1","v= movsldup ymm2/m256, {k}{z}, ymm1","EVEX.256.F3.0F.W0 12 /r","V","V","AVX512= F+AVX512VL","scale32","w,r,r","","" +"VMOVSLDUP zmm1, {k}{z}, zmm2/m512","VMOVSLDUP zmm2/m512, {k}{z}, zmm1","v= movsldup zmm2/m512, {k}{z}, zmm1","EVEX.512.F3.0F.W0 12 /r","V","V","AVX512= F","scale64","w,r,r","","" +"VMOVSS xmm1, m32","VMOVSS m32, xmm1","vmovss m32, xmm1","VEX.LIG.F3.0F.WI= G 10 /r","V","V","AVX","modrm_memonly","w,r","","" +"VMOVSS xmm1, {k}{z}, m32","VMOVSS m32, {k}{z}, xmm1","vmovss m32, {k}{z},= xmm1","EVEX.LIG.F3.0F.W0 10 /r","V","V","AVX512F","modrm_memonly,scale4","= w,r,r","","" +"VMOVSS m32, xmm1","VMOVSS xmm1, m32","vmovss xmm1, m32","VEX.LIG.F3.0F.WI= G 11 /r","V","V","AVX","modrm_memonly","w,r","","" +"VMOVSS xmm2, xmmV, xmm1","VMOVSS xmm1, xmmV, xmm2","vmovss xmm1, xmmV, xm= m2","VEX.NDS.LIG.F3.0F.WIG 11 /r","V","V","AVX","modrm_regonly","w,r,r","",= "" +"VMOVSS xmm2, {k}{z}, xmmV, xmm1","VMOVSS xmm1, xmmV, {k}{z}, xmm2","vmovs= s xmm1, xmmV, {k}{z}, xmm2","EVEX.NDS.LIG.F3.0F.W0 11 /r","V","V","AVX512F"= ,"modrm_regonly","w,r,r,r","","" +"VMOVSS m32, {k}, xmm1","VMOVSS xmm1, {k}, m32","vmovss xmm1, {k}, m32","E= VEX.LIG.F3.0F.W0 11 /r","V","V","AVX512F","modrm_memonly,scale4","w,r,r",""= ,"" +"VMOVSS xmm1, xmmV, xmm2","VMOVSS xmm2, xmmV, xmm1","vmovss xmm2, xmmV, xm= m1","VEX.NDS.LIG.F3.0F.WIG 10 /r","V","V","AVX","modrm_regonly","w,r,r","",= "" +"VMOVSS xmm1, 
{k}{z}, xmmV, xmm2","VMOVSS xmm2, xmmV, {k}{z}, xmm1","vmovs= s xmm2, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 10 /r","V","V","AVX512F"= ,"modrm_regonly","w,r,r,r","","" +"VMOVUPD xmm2/m128, xmm1","VMOVUPD xmm1, xmm2/m128","vmovupd xmm1, xmm2/m1= 28","VEX.128.66.0F.WIG 11 /r","V","V","AVX","","w,r","","" +"VMOVUPD xmm2/m128, {k}{z}, xmm1","VMOVUPD xmm1, {k}{z}, xmm2/m128","vmovu= pd xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F.W1 11 /r","V","V","AVX512F+AVX5= 12VL","scale16","w,r,r","","" +"VMOVUPD xmm1, xmm2/m128","VMOVUPD xmm2/m128, xmm1","vmovupd xmm2/m128, xm= m1","VEX.128.66.0F.WIG 10 /r","V","V","AVX","","w,r","","" +"VMOVUPD xmm1, {k}{z}, xmm2/m128","VMOVUPD xmm2/m128, {k}{z}, xmm1","vmovu= pd xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F.W1 10 /r","V","V","AVX512F+AVX5= 12VL","scale16","w,r,r","","" +"VMOVUPD ymm2/m256, ymm1","VMOVUPD ymm1, ymm2/m256","vmovupd ymm1, ymm2/m2= 56","VEX.256.66.0F.WIG 11 /r","V","V","AVX","","w,r","","" +"VMOVUPD ymm2/m256, {k}{z}, ymm1","VMOVUPD ymm1, {k}{z}, ymm2/m256","vmovu= pd ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F.W1 11 /r","V","V","AVX512F+AVX5= 12VL","scale32","w,r,r","","" +"VMOVUPD ymm1, ymm2/m256","VMOVUPD ymm2/m256, ymm1","vmovupd ymm2/m256, ym= m1","VEX.256.66.0F.WIG 10 /r","V","V","AVX","","w,r","","" +"VMOVUPD ymm1, {k}{z}, ymm2/m256","VMOVUPD ymm2/m256, {k}{z}, ymm1","vmovu= pd ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F.W1 10 /r","V","V","AVX512F+AVX5= 12VL","scale32","w,r,r","","" +"VMOVUPD zmm2/m512, {k}{z}, zmm1","VMOVUPD zmm1, {k}{z}, zmm2/m512","vmovu= pd zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F.W1 11 /r","V","V","AVX512F","sc= ale64","w,r,r","","" +"VMOVUPD zmm1, {k}{z}, zmm2/m512","VMOVUPD zmm2/m512, {k}{z}, zmm1","vmovu= pd zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F.W1 10 /r","V","V","AVX512F","sc= ale64","w,r,r","","" +"VMOVUPS xmm2/m128, xmm1","VMOVUPS xmm1, xmm2/m128","vmovups xmm1, xmm2/m1= 28","VEX.128.0F.WIG 11 /r","V","V","AVX","","w,r","","" +"VMOVUPS xmm2/m128, {k}{z}, xmm1","VMOVUPS xmm1, {k}{z}, xmm2/m128","vmovu= ps xmm1, {k}{z}, xmm2/m128","EVEX.128.0F.W0 11 /r","V","V","AVX512F+AVX512V= L","scale16","w,r,r","","" +"VMOVUPS xmm1, xmm2/m128","VMOVUPS xmm2/m128, xmm1","vmovups xmm2/m128, xm= m1","VEX.128.0F.WIG 10 /r","V","V","AVX","","w,r","","" +"VMOVUPS xmm1, {k}{z}, xmm2/m128","VMOVUPS xmm2/m128, {k}{z}, xmm1","vmovu= ps xmm2/m128, {k}{z}, xmm1","EVEX.128.0F.W0 10 /r","V","V","AVX512F+AVX512V= L","scale16","w,r,r","","" +"VMOVUPS ymm2/m256, ymm1","VMOVUPS ymm1, ymm2/m256","vmovups ymm1, ymm2/m2= 56","VEX.256.0F.WIG 11 /r","V","V","AVX","","w,r","","" +"VMOVUPS ymm2/m256, {k}{z}, ymm1","VMOVUPS ymm1, {k}{z}, ymm2/m256","vmovu= ps ymm1, {k}{z}, ymm2/m256","EVEX.256.0F.W0 11 /r","V","V","AVX512F+AVX512V= L","scale32","w,r,r","","" +"VMOVUPS ymm1, ymm2/m256","VMOVUPS ymm2/m256, ymm1","vmovups ymm2/m256, ym= m1","VEX.256.0F.WIG 10 /r","V","V","AVX","","w,r","","" +"VMOVUPS ymm1, {k}{z}, ymm2/m256","VMOVUPS ymm2/m256, {k}{z}, ymm1","vmovu= ps ymm2/m256, {k}{z}, ymm1","EVEX.256.0F.W0 10 /r","V","V","AVX512F+AVX512V= L","scale32","w,r,r","","" +"VMOVUPS zmm2/m512, {k}{z}, zmm1","VMOVUPS zmm1, {k}{z}, zmm2/m512","vmovu= ps zmm1, {k}{z}, zmm2/m512","EVEX.512.0F.W0 11 /r","V","V","AVX512F","scale= 64","w,r,r","","" +"VMOVUPS zmm1, {k}{z}, zmm2/m512","VMOVUPS zmm2/m512, {k}{z}, zmm1","vmovu= ps zmm2/m512, {k}{z}, zmm1","EVEX.512.0F.W0 10 /r","V","V","AVX512F","scale= 64","w,r,r","","" +"VMPSADBW xmm1, xmmV, xmm2/m128, imm8u","VMPSADBW imm8u, xmm2/m128, xmmV, = xmm1","vmpsadbw imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 42 
/= r ib","V","V","AVX","","w,r,r,r","","" +"VMPSADBW ymm1, ymmV, ymm2/m256, imm8u","VMPSADBW imm8u, ymm2/m256, ymmV, = ymm1","vmpsadbw imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WIG 42 /= r ib","V","V","AVX2","","w,r,r,r","","" +"VMPTRLD m64","VMPTRLD m64","vmptrld m64","0F C7 /6","V","V","VTX","modrm_= memonly","r","","" +"VMPTRST m64","VMPTRST m64","vmptrst m64","0F C7 /7","V","V","VTX","modrm_= memonly","w","","" +"VMREAD r/m32, r32","VMREAD r32, r/m32","vmread r32, r/m32","0F 78 /r","V"= ,"N.S.","VTX","","rw,r","","" +"VMREAD r/m64, r64","VMREAD r64, r/m64","vmread r64, r/m64","0F 78 /r","N.= S.","V","VTX","default64","rw,r","","" +"VMRESUME","VMRESUME","vmresume","0F 01 C3","V","V","VTX","","","","" +"VMRUN EAX","VMRUNL EAX","vmrunl EAX","0F 01 D8","V","V","SVM","amd,modrm_= regonly,operand32","r","Y","32" +"VMRUN RAX","VMRUNQ RAX","vmrunq RAX","REX.W 0F 01 D8","N.S.","V","SVM","a= md,modrm_regonly","r","Y","64" +"VMRUN AX","VMRUNW AX","vmrunw AX","0F 01 D8","V","V","SVM","amd,modrm_reg= only,operand16","r","Y","16" +"VMSAVE","VMSAVE","vmsave","0F 01 DB","V","V","SVM","amd","","","" +"VMULPD xmm1, xmmV, xmm2/m128","VMULPD xmm2/m128, xmmV, xmm1","vmulpd xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 59 /r","V","V","AVX","","w,r,r","= ","" +"VMULPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VMULPD xmm2/m128/m64bcst, = xmmV, {k}{z}, xmm1","vmulpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F.W1 59 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r= ","","" +"VMULPD ymm1, ymmV, ymm2/m256","VMULPD ymm2/m256, ymmV, ymm1","vmulpd ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 59 /r","V","V","AVX","","w,r,r","= ","" +"VMULPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VMULPD ymm2/m256/m64bcst, = ymmV, {k}{z}, ymm1","vmulpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F.W1 59 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r= ","","" +"VMULPD zmm1{er}, {k}{z}, zmmV, zmm2","VMULPD zmm2, zmmV, {k}{z}, zmm1{er}= ","vmulpd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.66.0F.W1 59 /r","V","= V","AVX512F","modrm_regonly","w,r,r,r","","" +"VMULPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VMULPD zmm2/m512/m64bcst, = zmmV, {k}{z}, zmm1","vmulpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F.W1 59 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VMULPS xmm1, xmmV, xmm2/m128","VMULPS xmm2/m128, xmmV, xmm1","vmulps xmm2= /m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 59 /r","V","V","AVX","","w,r,r","","" +"VMULPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VMULPS xmm2/m128/m32bcst, = xmmV, {k}{z}, xmm1","vmulps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.0F.W0 59 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","= ","" +"VMULPS ymm1, ymmV, ymm2/m256","VMULPS ymm2/m256, ymmV, ymm1","vmulps ymm2= /m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 59 /r","V","V","AVX","","w,r,r","","" +"VMULPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VMULPS ymm2/m256/m32bcst, = ymmV, {k}{z}, ymm1","vmulps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.0F.W0 59 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","= ","" +"VMULPS zmm1{er}, {k}{z}, zmmV, zmm2","VMULPS zmm2, zmmV, {k}{z}, zmm1{er}= ","vmulps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.0F.W0 59 /r","V","V",= "AVX512F","modrm_regonly","w,r,r,r","","" +"VMULPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VMULPS zmm2/m512/m32bcst, = zmmV, {k}{z}, zmm1","vmulps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.0F.W0 59 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VMULSD 
xmm1{er}, {k}{z}, xmmV, xmm2","VMULSD xmm2, xmmV, {k}{z}, xmm1{er}= ","vmulsd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F2.0F.W1 59 /r","V","= V","AVX512F","modrm_regonly","w,r,r,r","","" +"VMULSD xmm1, xmmV, xmm2/m64","VMULSD xmm2/m64, xmmV, xmm1","vmulsd xmm2/m= 64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 59 /r","V","V","AVX","","w,r,r","","" +"VMULSD xmm1, {k}{z}, xmmV, xmm2/m64","VMULSD xmm2/m64, xmmV, {k}{z}, xmm1= ","vmulsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 59 /r","V","= V","AVX512F","scale8","w,r,r,r","","" +"VMULSS xmm1{er}, {k}{z}, xmmV, xmm2","VMULSS xmm2, xmmV, {k}{z}, xmm1{er}= ","vmulss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F3.0F.W0 59 /r","V","= V","AVX512F","modrm_regonly","w,r,r,r","","" +"VMULSS xmm1, xmmV, xmm2/m32","VMULSS xmm2/m32, xmmV, xmm1","vmulss xmm2/m= 32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 59 /r","V","V","AVX","","w,r,r","","" +"VMULSS xmm1, {k}{z}, xmmV, xmm2/m32","VMULSS xmm2/m32, xmmV, {k}{z}, xmm1= ","vmulss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 59 /r","V","= V","AVX512F","scale4","w,r,r,r","","" +"VMWRITE r32, r/m32","VMWRITE r/m32, r32","vmwrite r/m32, r32","0F 79 /r",= "V","N.S.","VTX","","r,r","","" +"VMWRITE r64, r/m64","VMWRITE r/m64, r64","vmwrite r/m64, r64","0F 79 /r",= "N.S.","V","VTX","default64","r,r","","" +"VMXOFF","VMXOFF","vmxoff","0F 01 C4","V","V","VTX","","","","" +"VMXON m64","VMXON m64","vmxon m64","F3 0F C7 /6","V","V","VTX","modrm_mem= only","r","","" +"VORPD xmm1, xmmV, xmm2/m128","VORPD xmm2/m128, xmmV, xmm1","vorpd xmm2/m1= 28, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 56 /r","V","V","AVX","","w,r,r","","" +"VORPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VORPD xmm2/m128/m64bcst, xm= mV, {k}{z}, xmm1","vorpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.1= 28.66.0F.W1 56 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r,r",= "","" +"VORPD ymm1, ymmV, ymm2/m256","VORPD ymm2/m256, ymmV, ymm1","vorpd ymm2/m2= 56, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 56 /r","V","V","AVX","","w,r,r","","" +"VORPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VORPD ymm2/m256/m64bcst, ym= mV, {k}{z}, ymm1","vorpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.2= 56.66.0F.W1 56 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r,r",= "","" +"VORPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VORPD zmm2/m512/m64bcst, zm= mV, {k}{z}, zmm1","vorpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.5= 12.66.0F.W1 56 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r,r","","" +"VORPS xmm1, xmmV, xmm2/m128","VORPS xmm2/m128, xmmV, xmm1","vorps xmm2/m1= 28, xmmV, xmm1","VEX.NDS.128.0F.WIG 56 /r","V","V","AVX","","w,r,r","","" +"VORPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VORPS xmm2/m128/m32bcst, xm= mV, {k}{z}, xmm1","vorps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.1= 28.0F.W0 56 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r,r","",= "" +"VORPS ymm1, ymmV, ymm2/m256","VORPS ymm2/m256, ymmV, ymm1","vorps ymm2/m2= 56, ymmV, ymm1","VEX.NDS.256.0F.WIG 56 /r","V","V","AVX","","w,r,r","","" +"VORPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VORPS ymm2/m256/m32bcst, ym= mV, {k}{z}, ymm1","vorps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.2= 56.0F.W0 56 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale32","w,r,r,r","",= "" +"VORPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VORPS zmm2/m512/m32bcst, zm= mV, {k}{z}, zmm1","vorps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.5= 12.0F.W0 56 /r","V","V","AVX512DQ","bscale4,scale64","w,r,r,r","","" +"VP4DPWSSD zmm1, {k}{z}, zmmV+3, m128","VP4DPWSSD m128, zmmV+3, {k}{z}, zm= 
m1","vp4dpwssd m128, zmmV+3, {k}{z}, zmm1","EVEX.DDS.512.F2.0F38.W0 52 /r",= "V","V","AVX512_4VNNIW","modrm_memonly,scale16","rw,r,r,r","","" +"VP4DPWSSDS zmm1, {k}{z}, zmmV+3, m128","VP4DPWSSDS m128, zmmV+3, {k}{z}, = zmm1","vp4dpwssds m128, zmmV+3, {k}{z}, zmm1","EVEX.DDS.512.F2.0F38.W0 53 /= r","V","V","AVX512_4VNNIW","modrm_memonly,scale16","rw,r,r,r","","" +"VPABSB xmm1, xmm2/m128","VPABSB xmm2/m128, xmm1","vpabsb xmm2/m128, xmm1"= ,"VEX.128.66.0F38.WIG 1C /r","V","V","AVX","","w,r","","" +"VPABSB xmm1, {k}{z}, xmm2/m128","VPABSB xmm2/m128, {k}{z}, xmm1","vpabsb = xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 1C /r","V","V","AVX512BW+AVX= 512VL","scale16","w,r,r","","" +"VPABSB ymm1, ymm2/m256","VPABSB ymm2/m256, ymm1","vpabsb ymm2/m256, ymm1"= ,"VEX.256.66.0F38.WIG 1C /r","V","V","AVX2","","w,r","","" +"VPABSB ymm1, {k}{z}, ymm2/m256","VPABSB ymm2/m256, {k}{z}, ymm1","vpabsb = ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 1C /r","V","V","AVX512BW+AVX= 512VL","scale32","w,r,r","","" +"VPABSB zmm1, {k}{z}, zmm2/m512","VPABSB zmm2/m512, {k}{z}, zmm1","vpabsb = zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 1C /r","V","V","AVX512BW","s= cale64","w,r,r","","" +"VPABSD xmm1, xmm2/m128","VPABSD xmm2/m128, xmm1","vpabsd xmm2/m128, xmm1"= ,"VEX.128.66.0F38.WIG 1E /r","V","V","AVX","","w,r","","" +"VPABSD xmm1, {k}{z}, xmm2/m128/m32bcst","VPABSD xmm2/m128/m32bcst, {k}{z}= , xmm1","vpabsd xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W0 1E /r= ","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","","" +"VPABSD ymm1, ymm2/m256","VPABSD ymm2/m256, ymm1","vpabsd ymm2/m256, ymm1"= ,"VEX.256.66.0F38.WIG 1E /r","V","V","AVX2","","w,r","","" +"VPABSD ymm1, {k}{z}, ymm2/m256/m32bcst","VPABSD ymm2/m256/m32bcst, {k}{z}= , ymm1","vpabsd ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W0 1E /r= ","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","","" +"VPABSD zmm1, {k}{z}, zmm2/m512/m32bcst","VPABSD zmm2/m512/m32bcst, {k}{z}= , zmm1","vpabsd zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0 1E /r= ","V","V","AVX512F","bscale4,scale64","w,r,r","","" +"VPABSQ xmm1, {k}{z}, xmm2/m128/m64bcst","VPABSQ xmm2/m128/m64bcst, {k}{z}= , xmm1","vpabsq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W1 1F /r= ","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","","" +"VPABSQ ymm1, {k}{z}, ymm2/m256/m64bcst","VPABSQ ymm2/m256/m64bcst, {k}{z}= , ymm1","vpabsq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W1 1F /r= ","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","","" +"VPABSQ zmm1, {k}{z}, zmm2/m512/m64bcst","VPABSQ zmm2/m512/m64bcst, {k}{z}= , zmm1","vpabsq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1 1F /r= ","V","V","AVX512F","bscale8,scale64","w,r,r","","" +"VPABSW xmm1, xmm2/m128","VPABSW xmm2/m128, xmm1","vpabsw xmm2/m128, xmm1"= ,"VEX.128.66.0F38.WIG 1D /r","V","V","AVX","","w,r","","" +"VPABSW xmm1, {k}{z}, xmm2/m128","VPABSW xmm2/m128, {k}{z}, xmm1","vpabsw = xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 1D /r","V","V","AVX512BW+AVX= 512VL","scale16","w,r,r","","" +"VPABSW ymm1, ymm2/m256","VPABSW ymm2/m256, ymm1","vpabsw ymm2/m256, ymm1"= ,"VEX.256.66.0F38.WIG 1D /r","V","V","AVX2","","w,r","","" +"VPABSW ymm1, {k}{z}, ymm2/m256","VPABSW ymm2/m256, {k}{z}, ymm1","vpabsw = ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 1D /r","V","V","AVX512BW+AVX= 512VL","scale32","w,r,r","","" +"VPABSW zmm1, {k}{z}, zmm2/m512","VPABSW zmm2/m512, {k}{z}, zmm1","vpabsw = zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 1D /r","V","V","AVX512BW","s= cale64","w,r,r","","" +"VPACKSSDW 
xmm1, xmmV, xmm2/m128","VPACKSSDW xmm2/m128, xmmV, xmm1","vpack= ssdw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 6B /r","V","V","AVX","",= "w,r,r","","" +"VPACKSSDW xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPACKSSDW xmm2/m128/m32= bcst, xmmV, {k}{z}, xmm1","vpackssdw xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.NDS.128.66.0F.W0 6B /r","V","V","AVX512BW+AVX512VL","bscale4,scale16= ","w,r,r,r","","" +"VPACKSSDW ymm1, ymmV, ymm2/m256","VPACKSSDW ymm2/m256, ymmV, ymm1","vpack= ssdw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 6B /r","V","V","AVX2",""= ,"w,r,r","","" +"VPACKSSDW ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPACKSSDW ymm2/m256/m32= bcst, ymmV, {k}{z}, ymm1","vpackssdw ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.NDS.256.66.0F.W0 6B /r","V","V","AVX512BW+AVX512VL","bscale4,scale32= ","w,r,r,r","","" +"VPACKSSDW zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPACKSSDW zmm2/m512/m32= bcst, zmmV, {k}{z}, zmm1","vpackssdw zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.NDS.512.66.0F.W0 6B /r","V","V","AVX512BW","bscale4,scale64","w,r,r,= r","","" +"VPACKSSWB xmm1, xmmV, xmm2/m128","VPACKSSWB xmm2/m128, xmmV, xmm1","vpack= sswb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 63 /r","V","V","AVX","",= "w,r,r","","" +"VPACKSSWB xmm1, {k}{z}, xmmV, xmm2/m128","VPACKSSWB xmm2/m128, xmmV, {k}{= z}, xmm1","vpacksswb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG= 63 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPACKSSWB ymm1, ymmV, ymm2/m256","VPACKSSWB ymm2/m256, ymmV, ymm1","vpack= sswb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 63 /r","V","V","AVX2",""= ,"w,r,r","","" +"VPACKSSWB ymm1, {k}{z}, ymmV, ymm2/m256","VPACKSSWB ymm2/m256, ymmV, {k}{= z}, ymm1","vpacksswb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG= 63 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPACKSSWB zmm1, {k}{z}, zmmV, zmm2/m512","VPACKSSWB zmm2/m512, zmmV, {k}{= z}, zmm1","vpacksswb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG= 63 /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPACKUSDW xmm1, xmmV, xmm2/m128","VPACKUSDW xmm2/m128, xmmV, xmm1","vpack= usdw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 2B /r","V","V","AVX","= ","w,r,r","","" +"VPACKUSDW xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPACKUSDW xmm2/m128/m32= bcst, xmmV, {k}{z}, xmm1","vpackusdw xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.NDS.128.66.0F38.W0 2B /r","V","V","AVX512BW+AVX512VL","bscale4,scale= 16","w,r,r,r","","" +"VPACKUSDW ymm1, ymmV, ymm2/m256","VPACKUSDW ymm2/m256, ymmV, ymm1","vpack= usdw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 2B /r","V","V","AVX2",= "","w,r,r","","" +"VPACKUSDW ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPACKUSDW ymm2/m256/m32= bcst, ymmV, {k}{z}, ymm1","vpackusdw ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.NDS.256.66.0F38.W0 2B /r","V","V","AVX512BW+AVX512VL","bscale4,scale= 32","w,r,r,r","","" +"VPACKUSDW zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPACKUSDW zmm2/m512/m32= bcst, zmmV, {k}{z}, zmm1","vpackusdw zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.NDS.512.66.0F38.W0 2B /r","V","V","AVX512BW","bscale4,scale64","w,r,= r,r","","" +"VPACKUSWB xmm1, xmmV, xmm2/m128","VPACKUSWB xmm2/m128, xmmV, xmm1","vpack= uswb xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 67 /r","V","V","AVX","",= "w,r,r","","" +"VPACKUSWB xmm1, {k}{z}, xmmV, xmm2/m128","VPACKUSWB xmm2/m128, xmmV, {k}{= z}, xmm1","vpackuswb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG= 67 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPACKUSWB ymm1, ymmV, ymm2/m256","VPACKUSWB 
ymm2/m256, ymmV, ymm1","vpack= uswb ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 67 /r","V","V","AVX2",""= ,"w,r,r","","" +"VPACKUSWB ymm1, {k}{z}, ymmV, ymm2/m256","VPACKUSWB ymm2/m256, ymmV, {k}{= z}, ymm1","vpackuswb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG= 67 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPACKUSWB zmm1, {k}{z}, zmmV, zmm2/m512","VPACKUSWB zmm2/m512, zmmV, {k}{= z}, zmm1","vpackuswb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG= 67 /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPADDB xmm1, xmmV, xmm2/m128","VPADDB xmm2/m128, xmmV, xmm1","vpaddb xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG FC /r","V","V","AVX","","w,r,r","= ","" +"VPADDB xmm1, {k}{z}, xmmV, xmm2/m128","VPADDB xmm2/m128, xmmV, {k}{z}, xm= m1","vpaddb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG FC /r","= V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPADDB ymm1, ymmV, ymm2/m256","VPADDB ymm2/m256, ymmV, ymm1","vpaddb ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG FC /r","V","V","AVX2","","w,r,r",= "","" +"VPADDB ymm1, {k}{z}, ymmV, ymm2/m256","VPADDB ymm2/m256, ymmV, {k}{z}, ym= m1","vpaddb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG FC /r","= V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPADDB zmm1, {k}{z}, zmmV, zmm2/m512","VPADDB zmm2/m512, zmmV, {k}{z}, zm= m1","vpaddb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG FC /r","= V","V","AVX512BW","scale64","w,r,r,r","","" +"VPADDD xmm1, xmmV, xmm2/m128","VPADDD xmm2/m128, xmmV, xmm1","vpaddd xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG FE /r","V","V","AVX","","w,r,r","= ","" +"VPADDD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPADDD xmm2/m128/m32bcst, = xmmV, {k}{z}, xmm1","vpaddd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F.W0 FE /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r= ","","" +"VPADDD ymm1, ymmV, ymm2/m256","VPADDD ymm2/m256, ymmV, ymm1","vpaddd ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG FE /r","V","V","AVX2","","w,r,r",= "","" +"VPADDD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPADDD ymm2/m256/m32bcst, = ymmV, {k}{z}, ymm1","vpaddd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F.W0 FE /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r= ","","" +"VPADDD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPADDD zmm2/m512/m32bcst, = zmmV, {k}{z}, zmm1","vpaddd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F.W0 FE /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VPADDQ xmm1, xmmV, xmm2/m128","VPADDQ xmm2/m128, xmmV, xmm1","vpaddq xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D4 /r","V","V","AVX","","w,r,r","= ","" +"VPADDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPADDQ xmm2/m128/m64bcst, = xmmV, {k}{z}, xmm1","vpaddq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F.W1 D4 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r= ","","" +"VPADDQ ymm1, ymmV, ymm2/m256","VPADDQ ymm2/m256, ymmV, ymm1","vpaddq ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D4 /r","V","V","AVX2","","w,r,r",= "","" +"VPADDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPADDQ ymm2/m256/m64bcst, = ymmV, {k}{z}, ymm1","vpaddq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F.W1 D4 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r= ","","" +"VPADDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPADDQ zmm2/m512/m64bcst, = zmmV, {k}{z}, zmm1","vpaddq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F.W1 D4 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VPADDSB xmm1, xmmV, 
xmm2/m128","VPADDSB xmm2/m128, xmmV, xmm1","vpaddsb x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG EC /r","V","V","AVX","","w,r,r= ","","" +"VPADDSB xmm1, {k}{z}, xmmV, xmm2/m128","VPADDSB xmm2/m128, xmmV, {k}{z}, = xmm1","vpaddsb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG EC /r= ","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPADDSB ymm1, ymmV, ymm2/m256","VPADDSB ymm2/m256, ymmV, ymm1","vpaddsb y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG EC /r","V","V","AVX2","","w,r,= r","","" +"VPADDSB ymm1, {k}{z}, ymmV, ymm2/m256","VPADDSB ymm2/m256, ymmV, {k}{z}, = ymm1","vpaddsb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG EC /r= ","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPADDSB zmm1, {k}{z}, zmmV, zmm2/m512","VPADDSB zmm2/m512, zmmV, {k}{z}, = zmm1","vpaddsb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG EC /r= ","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPADDSW xmm1, xmmV, xmm2/m128","VPADDSW xmm2/m128, xmmV, xmm1","vpaddsw x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG ED /r","V","V","AVX","","w,r,r= ","","" +"VPADDSW xmm1, {k}{z}, xmmV, xmm2/m128","VPADDSW xmm2/m128, xmmV, {k}{z}, = xmm1","vpaddsw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG ED /r= ","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPADDSW ymm1, ymmV, ymm2/m256","VPADDSW ymm2/m256, ymmV, ymm1","vpaddsw y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG ED /r","V","V","AVX2","","w,r,= r","","" +"VPADDSW ymm1, {k}{z}, ymmV, ymm2/m256","VPADDSW ymm2/m256, ymmV, {k}{z}, = ymm1","vpaddsw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG ED /r= ","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPADDSW zmm1, {k}{z}, zmmV, zmm2/m512","VPADDSW zmm2/m512, zmmV, {k}{z}, = zmm1","vpaddsw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG ED /r= ","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPADDUSB xmm1, xmmV, xmm2/m128","VPADDUSB xmm2/m128, xmmV, xmm1","vpaddus= b xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG DC /r","V","V","AVX","","w,= r,r","","" +"VPADDUSB xmm1, {k}{z}, xmmV, xmm2/m128","VPADDUSB xmm2/m128, xmmV, {k}{z}= , xmm1","vpaddusb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG DC= /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPADDUSB ymm1, ymmV, ymm2/m256","VPADDUSB ymm2/m256, ymmV, ymm1","vpaddus= b ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG DC /r","V","V","AVX2","","w= ,r,r","","" +"VPADDUSB ymm1, {k}{z}, ymmV, ymm2/m256","VPADDUSB ymm2/m256, ymmV, {k}{z}= , ymm1","vpaddusb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG DC= /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPADDUSB zmm1, {k}{z}, zmmV, zmm2/m512","VPADDUSB zmm2/m512, zmmV, {k}{z}= , zmm1","vpaddusb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG DC= /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPADDUSW xmm1, xmmV, xmm2/m128","VPADDUSW xmm2/m128, xmmV, xmm1","vpaddus= w xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG DD /r","V","V","AVX","","w,= r,r","","" +"VPADDUSW xmm1, {k}{z}, xmmV, xmm2/m128","VPADDUSW xmm2/m128, xmmV, {k}{z}= , xmm1","vpaddusw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG DD= /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPADDUSW ymm1, ymmV, ymm2/m256","VPADDUSW ymm2/m256, ymmV, ymm1","vpaddus= w ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG DD /r","V","V","AVX2","","w= ,r,r","","" +"VPADDUSW ymm1, {k}{z}, ymmV, ymm2/m256","VPADDUSW ymm2/m256, ymmV, {k}{z}= , ymm1","vpaddusw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG DD= 
/r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPADDUSW zmm1, {k}{z}, zmmV, zmm2/m512","VPADDUSW zmm2/m512, zmmV, {k}{z}= , zmm1","vpaddusw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG DD= /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPADDW xmm1, xmmV, xmm2/m128","VPADDW xmm2/m128, xmmV, xmm1","vpaddw xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG FD /r","V","V","AVX","","w,r,r","= ","" +"VPADDW xmm1, {k}{z}, xmmV, xmm2/m128","VPADDW xmm2/m128, xmmV, {k}{z}, xm= m1","vpaddw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG FD /r","= V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPADDW ymm1, ymmV, ymm2/m256","VPADDW ymm2/m256, ymmV, ymm1","vpaddw ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG FD /r","V","V","AVX2","","w,r,r",= "","" +"VPADDW ymm1, {k}{z}, ymmV, ymm2/m256","VPADDW ymm2/m256, ymmV, {k}{z}, ym= m1","vpaddw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG FD /r","= V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPADDW zmm1, {k}{z}, zmmV, zmm2/m512","VPADDW zmm2/m512, zmmV, {k}{z}, zm= m1","vpaddw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG FD /r","= V","V","AVX512BW","scale64","w,r,r,r","","" +"VPALIGNR xmm1, xmmV, xmm2/m128, imm8u","VPALIGNR imm8u, xmm2/m128, xmmV, = xmm1","vpalignr imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 0F /= r ib","V","V","AVX","","w,r,r,r","","" +"VPALIGNR xmm1, {k}{z}, xmmV, xmm2/m128, imm8u","VPALIGNR imm8u, xmm2/m128= , xmmV, {k}{z}, xmm1","vpalignr imm8u, xmm2/m128, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F3A.WIG 0F /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r= ,r,r","","" +"VPALIGNR ymm1, ymmV, ymm2/m256, imm8u","VPALIGNR imm8u, ymm2/m256, ymmV, = ymm1","vpalignr imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WIG 0F /= r ib","V","V","AVX2","","w,r,r,r","","" +"VPALIGNR ymm1, {k}{z}, ymmV, ymm2/m256, imm8u","VPALIGNR imm8u, ymm2/m256= , ymmV, {k}{z}, ymm1","vpalignr imm8u, ymm2/m256, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F3A.WIG 0F /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r= ,r,r","","" +"VPALIGNR zmm1, {k}{z}, zmmV, zmm2/m512, imm8u","VPALIGNR imm8u, zmm2/m512= , zmmV, {k}{z}, zmm1","vpalignr imm8u, zmm2/m512, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F3A.WIG 0F /r ib","V","V","AVX512BW","scale64","w,r,r,r,r","",= "" +"VPAND xmm1, xmmV, xmm2/m128","VPAND xmm2/m128, xmmV, xmm1","vpand xmm2/m1= 28, xmmV, xmm1","VEX.NDS.128.66.0F.WIG DB /r","V","V","AVX","","w,r,r","","" +"VPAND ymm1, ymmV, ymm2/m256","VPAND ymm2/m256, ymmV, ymm1","vpand ymm2/m2= 56, ymmV, ymm1","VEX.NDS.256.66.0F.WIG DB /r","V","V","AVX2","","w,r,r","",= "" +"VPANDD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPANDD xmm2/m128/m32bcst, = xmmV, {k}{z}, xmm1","vpandd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F.W0 DB /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r= ","","" +"VPANDD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPANDD ymm2/m256/m32bcst, = ymmV, {k}{z}, ymm1","vpandd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F.W0 DB /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r= ","","" +"VPANDD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPANDD zmm2/m512/m32bcst, = zmmV, {k}{z}, zmm1","vpandd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F.W0 DB /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VPANDN xmm1, xmmV, xmm2/m128","VPANDN xmm2/m128, xmmV, xmm1","vpandn xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG DF /r","V","V","AVX","","w,r,r","= ","" +"VPANDN ymm1, ymmV, ymm2/m256","VPANDN ymm2/m256, ymmV, ymm1","vpandn ymm2= 
/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG DF /r","V","V","AVX2","","w,r,r",= "","" +"VPANDND xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPANDND xmm2/m128/m32bcst= , xmmV, {k}{z}, xmm1","vpandnd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F.W0 DF /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,= r,r","","" +"VPANDND ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPANDND ymm2/m256/m32bcst= , ymmV, {k}{z}, ymm1","vpandnd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F.W0 DF /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,= r,r","","" +"VPANDND zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPANDND zmm2/m512/m32bcst= , zmmV, {k}{z}, zmm1","vpandnd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F.W0 DF /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VPANDNQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPANDNQ xmm2/m128/m64bcst= , xmmV, {k}{z}, xmm1","vpandnq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F.W1 DF /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,= r,r","","" +"VPANDNQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPANDNQ ymm2/m256/m64bcst= , ymmV, {k}{z}, ymm1","vpandnq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F.W1 DF /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,= r,r","","" +"VPANDNQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPANDNQ zmm2/m512/m64bcst= , zmmV, {k}{z}, zmm1","vpandnq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F.W1 DF /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VPANDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPANDQ xmm2/m128/m64bcst, = xmmV, {k}{z}, xmm1","vpandq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F.W1 DB /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r= ","","" +"VPANDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPANDQ ymm2/m256/m64bcst, = ymmV, {k}{z}, ymm1","vpandq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F.W1 DB /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r= ","","" +"VPANDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPANDQ zmm2/m512/m64bcst, = zmmV, {k}{z}, zmm1","vpandq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F.W1 DB /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VPAVGB xmm1, xmmV, xmm2/m128","VPAVGB xmm2/m128, xmmV, xmm1","vpavgb xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E0 /r","V","V","AVX","","w,r,r","= ","" +"VPAVGB xmm1, {k}{z}, xmmV, xmm2/m128","VPAVGB xmm2/m128, xmmV, {k}{z}, xm= m1","vpavgb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG E0 /r","= V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPAVGB ymm1, ymmV, ymm2/m256","VPAVGB ymm2/m256, ymmV, ymm1","vpavgb ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E0 /r","V","V","AVX2","","w,r,r",= "","" +"VPAVGB ymm1, {k}{z}, ymmV, ymm2/m256","VPAVGB ymm2/m256, ymmV, {k}{z}, ym= m1","vpavgb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG E0 /r","= V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPAVGB zmm1, {k}{z}, zmmV, zmm2/m512","VPAVGB zmm2/m512, zmmV, {k}{z}, zm= m1","vpavgb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG E0 /r","= V","V","AVX512BW","scale64","w,r,r,r","","" +"VPAVGW xmm1, xmmV, xmm2/m128","VPAVGW xmm2/m128, xmmV, xmm1","vpavgw xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E3 /r","V","V","AVX","","w,r,r","= ","" +"VPAVGW xmm1, {k}{z}, xmmV, xmm2/m128","VPAVGW xmm2/m128, xmmV, {k}{z}, xm= m1","vpavgw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG E3 /r","= V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPAVGW ymm1, ymmV, 
ymm2/m256","VPAVGW ymm2/m256, ymmV, ymm1","vpavgw ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E3 /r","V","V","AVX2","","w,r,r",= "","" +"VPAVGW ymm1, {k}{z}, ymmV, ymm2/m256","VPAVGW ymm2/m256, ymmV, {k}{z}, ym= m1","vpavgw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG E3 /r","= V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPAVGW zmm1, {k}{z}, zmmV, zmm2/m512","VPAVGW zmm2/m512, zmmV, {k}{z}, zm= m1","vpavgw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG E3 /r","= V","V","AVX512BW","scale64","w,r,r,r","","" +"VPBLENDD xmm1, xmmV, xmm2/m128, imm8u","VPBLENDD imm8u, xmm2/m128, xmmV, = xmm1","vpblendd imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 02 /r= ib","V","V","AVX2","","w,r,r,r","","" +"VPBLENDD ymm1, ymmV, ymm2/m256, imm8u","VPBLENDD imm8u, ymm2/m256, ymmV, = ymm1","vpblendd imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 02 /r= ib","V","V","AVX2","","w,r,r,r","","" +"VPBLENDMB xmm1, {k}{z}, xmmV, xmm2/m128","VPBLENDMB xmm2/m128, xmmV, {k}{= z}, xmm1","vpblendmb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W= 0 66 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPBLENDMB ymm1, {k}{z}, ymmV, ymm2/m256","VPBLENDMB ymm2/m256, ymmV, {k}{= z}, ymm1","vpblendmb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W= 0 66 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPBLENDMB zmm1, {k}{z}, zmmV, zmm2/m512","VPBLENDMB zmm2/m512, zmmV, {k}{= z}, zmm1","vpblendmb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W= 0 66 /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPBLENDMD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPBLENDMD xmm2/m128/m32= bcst, xmmV, {k}{z}, xmm1","vpblendmd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.NDS.128.66.0F38.W0 64 /r","V","V","AVX512F+AVX512VL","bscale4,scale1= 6","w,r,r,r","","" +"VPBLENDMD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPBLENDMD ymm2/m256/m32= bcst, ymmV, {k}{z}, ymm1","vpblendmd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.NDS.256.66.0F38.W0 64 /r","V","V","AVX512F+AVX512VL","bscale4,scale3= 2","w,r,r,r","","" +"VPBLENDMD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPBLENDMD zmm2/m512/m32= bcst, zmmV, {k}{z}, zmm1","vpblendmd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.NDS.512.66.0F38.W0 64 /r","V","V","AVX512F","bscale4,scale64","w,r,r= ,r","","" +"VPBLENDMQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPBLENDMQ xmm2/m128/m64= bcst, xmmV, {k}{z}, xmm1","vpblendmq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.NDS.128.66.0F38.W1 64 /r","V","V","AVX512F+AVX512VL","bscale8,scale1= 6","w,r,r,r","","" +"VPBLENDMQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPBLENDMQ ymm2/m256/m64= bcst, ymmV, {k}{z}, ymm1","vpblendmq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.NDS.256.66.0F38.W1 64 /r","V","V","AVX512F+AVX512VL","bscale8,scale3= 2","w,r,r,r","","" +"VPBLENDMQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPBLENDMQ zmm2/m512/m64= bcst, zmmV, {k}{z}, zmm1","vpblendmq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.NDS.512.66.0F38.W1 64 /r","V","V","AVX512F","bscale8,scale64","w,r,r= ,r","","" +"VPBLENDMW xmm1, {k}{z}, xmmV, xmm2/m128","VPBLENDMW xmm2/m128, xmmV, {k}{= z}, xmm1","vpblendmw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W= 1 66 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPBLENDMW ymm1, {k}{z}, ymmV, ymm2/m256","VPBLENDMW ymm2/m256, ymmV, {k}{= z}, ymm1","vpblendmw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W= 1 66 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPBLENDMW zmm1, {k}{z}, zmmV, zmm2/m512","VPBLENDMW zmm2/m512, 
zmmV, {k}{= z}, zmm1","vpblendmw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W= 1 66 /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPBLENDVB xmm1, xmmV, xmm2/m128, xmmIH","VPBLENDVB xmmIH, xmm2/m128, xmmV= , xmm1","vpblendvb xmmIH, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 4C= /r /is4","V","V","AVX","","w,r,r,r","","" +"VPBLENDVB ymm1, ymmV, ymm2/m256, ymmIH","VPBLENDVB ymmIH, ymm2/m256, ymmV= , ymm1","vpblendvb ymmIH, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0 4C= /r /is4","V","V","AVX2","","w,r,r,r","","" +"VPBLENDW xmm1, xmmV, xmm2/m128, imm8u","VPBLENDW imm8u, xmm2/m128, xmmV, = xmm1","vpblendw imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 0E /= r ib","V","V","AVX","","w,r,r,r","","" +"VPBLENDW ymm1, ymmV, ymm2/m256, imm8u","VPBLENDW imm8u, ymm2/m256, ymmV, = ymm1","vpblendw imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WIG 0E /= r ib","V","V","AVX2","","w,r,r,r","","" +"VPBROADCASTB xmm1, {k}{z}, rmr32","VPBROADCASTB rmr32, {k}{z}, xmm1","vpb= roadcastb rmr32, {k}{z}, xmm1","EVEX.128.66.0F38.W0 7A /r","V","V","AVX512B= W+AVX512VL","modrm_regonly","w,r,r","","" +"VPBROADCASTB ymm1, {k}{z}, rmr32","VPBROADCASTB rmr32, {k}{z}, ymm1","vpb= roadcastb rmr32, {k}{z}, ymm1","EVEX.256.66.0F38.W0 7A /r","V","V","AVX512B= W+AVX512VL","modrm_regonly","w,r,r","","" +"VPBROADCASTB zmm1, {k}{z}, rmr32","VPBROADCASTB rmr32, {k}{z}, zmm1","vpb= roadcastb rmr32, {k}{z}, zmm1","EVEX.512.66.0F38.W0 7A /r","V","V","AVX512B= W","modrm_regonly","w,r,r","","" +"VPBROADCASTB xmm1, xmm2/m8","VPBROADCASTB xmm2/m8, xmm1","vpbroadcastb xm= m2/m8, xmm1","VEX.128.66.0F38.W0 78 /r","V","V","AVX2","","w,r","","" +"VPBROADCASTB ymm1, xmm2/m8","VPBROADCASTB xmm2/m8, ymm1","vpbroadcastb xm= m2/m8, ymm1","VEX.256.66.0F38.W0 78 /r","V","V","AVX2","","w,r","","" +"VPBROADCASTB xmm1, {k}{z}, xmm2/m8","VPBROADCASTB xmm2/m8, {k}{z}, xmm1",= "vpbroadcastb xmm2/m8, {k}{z}, xmm1","EVEX.128.66.0F38.W0 78 /r","V","V","A= VX512BW+AVX512VL","scale1","w,r,r","","" +"VPBROADCASTB ymm1, {k}{z}, xmm2/m8","VPBROADCASTB xmm2/m8, {k}{z}, ymm1",= "vpbroadcastb xmm2/m8, {k}{z}, ymm1","EVEX.256.66.0F38.W0 78 /r","V","V","A= VX512BW+AVX512VL","scale1","w,r,r","","" +"VPBROADCASTB zmm1, {k}{z}, xmm2/m8","VPBROADCASTB xmm2/m8, {k}{z}, zmm1",= "vpbroadcastb xmm2/m8, {k}{z}, zmm1","EVEX.512.66.0F38.W0 78 /r","V","V","A= VX512BW","scale1","w,r,r","","" +"VPBROADCASTD xmm1, {k}{z}, rmr32","VPBROADCASTD rmr32, {k}{z}, xmm1","vpb= roadcastd rmr32, {k}{z}, xmm1","EVEX.128.66.0F38.W0 7C /r","V","V","AVX512F= +AVX512VL","modrm_regonly","w,r,r","","" +"VPBROADCASTD ymm1, {k}{z}, rmr32","VPBROADCASTD rmr32, {k}{z}, ymm1","vpb= roadcastd rmr32, {k}{z}, ymm1","EVEX.256.66.0F38.W0 7C /r","V","V","AVX512F= +AVX512VL","modrm_regonly","w,r,r","","" +"VPBROADCASTD zmm1, {k}{z}, rmr32","VPBROADCASTD rmr32, {k}{z}, zmm1","vpb= roadcastd rmr32, {k}{z}, zmm1","EVEX.512.66.0F38.W0 7C /r","V","V","AVX512F= ","modrm_regonly","w,r,r","","" +"VPBROADCASTD xmm1, xmm2/m32","VPBROADCASTD xmm2/m32, xmm1","vpbroadcastd = xmm2/m32, xmm1","VEX.128.66.0F38.W0 58 /r","V","V","AVX2","","w,r","","" +"VPBROADCASTD ymm1, xmm2/m32","VPBROADCASTD xmm2/m32, ymm1","vpbroadcastd = xmm2/m32, ymm1","VEX.256.66.0F38.W0 58 /r","V","V","AVX2","","w,r","","" +"VPBROADCASTD xmm1, {k}{z}, xmm2/m32","VPBROADCASTD xmm2/m32, {k}{z}, xmm1= ","vpbroadcastd xmm2/m32, {k}{z}, xmm1","EVEX.128.66.0F38.W0 58 /r","V","V"= ,"AVX512F+AVX512VL","scale4","w,r,r","","" +"VPBROADCASTD ymm1, {k}{z}, xmm2/m32","VPBROADCASTD xmm2/m32, {k}{z}, ymm1= ","vpbroadcastd xmm2/m32, 
{k}{z}, ymm1","EVEX.256.66.0F38.W0 58 /r","V","V"= ,"AVX512F+AVX512VL","scale4","w,r,r","","" +"VPBROADCASTD zmm1, {k}{z}, xmm2/m32","VPBROADCASTD xmm2/m32, {k}{z}, zmm1= ","vpbroadcastd xmm2/m32, {k}{z}, zmm1","EVEX.512.66.0F38.W0 58 /r","V","V"= ,"AVX512F","scale4","w,r,r","","" +"VPBROADCASTMB2Q xmm1, k2","VPBROADCASTMB2Q k2, xmm1","vpbroadcastmb2q k2,= xmm1","EVEX.128.F3.0F38.W1 2A /r","V","V","AVX512CD+AVX512VL","modrm_regon= ly","w,r","","" +"VPBROADCASTMB2Q ymm1, k2","VPBROADCASTMB2Q k2, ymm1","vpbroadcastmb2q k2,= ymm1","EVEX.256.F3.0F38.W1 2A /r","V","V","AVX512CD+AVX512VL","modrm_regon= ly","w,r","","" +"VPBROADCASTMB2Q zmm1, k2","VPBROADCASTMB2Q k2, zmm1","vpbroadcastmb2q k2,= zmm1","EVEX.512.F3.0F38.W1 2A /r","V","V","AVX512CD","modrm_regonly","w,r"= ,"","" +"VPBROADCASTMW2D xmm1, k2","VPBROADCASTMW2D k2, xmm1","vpbroadcastmw2d k2,= xmm1","EVEX.128.F3.0F38.W0 3A /r","V","V","AVX512CD+AVX512VL","modrm_regon= ly","w,r","","" +"VPBROADCASTMW2D ymm1, k2","VPBROADCASTMW2D k2, ymm1","vpbroadcastmw2d k2,= ymm1","EVEX.256.F3.0F38.W0 3A /r","V","V","AVX512CD+AVX512VL","modrm_regon= ly","w,r","","" +"VPBROADCASTMW2D zmm1, k2","VPBROADCASTMW2D k2, zmm1","vpbroadcastmw2d k2,= zmm1","EVEX.512.F3.0F38.W0 3A /r","V","V","AVX512CD","modrm_regonly","w,r"= ,"","" +"VPBROADCASTQ xmm1, {k}{z}, rmr64","VPBROADCASTQ rmr64, {k}{z}, xmm1","vpb= roadcastq rmr64, {k}{z}, xmm1","EVEX.128.66.0F38.W1 7C /r","N.S.","V","AVX5= 12F+AVX512VL","modrm_regonly","w,r,r","","" +"VPBROADCASTQ ymm1, {k}{z}, rmr64","VPBROADCASTQ rmr64, {k}{z}, ymm1","vpb= roadcastq rmr64, {k}{z}, ymm1","EVEX.256.66.0F38.W1 7C /r","N.S.","V","AVX5= 12F+AVX512VL","modrm_regonly","w,r,r","","" +"VPBROADCASTQ zmm1, {k}{z}, rmr64","VPBROADCASTQ rmr64, {k}{z}, zmm1","vpb= roadcastq rmr64, {k}{z}, zmm1","EVEX.512.66.0F38.W1 7C /r","N.S.","V","AVX5= 12F","modrm_regonly","w,r,r","","" +"VPBROADCASTQ xmm1, xmm2/m64","VPBROADCASTQ xmm2/m64, xmm1","vpbroadcastq = xmm2/m64, xmm1","VEX.128.66.0F38.W0 59 /r","V","V","AVX2","","w,r","","" +"VPBROADCASTQ ymm1, xmm2/m64","VPBROADCASTQ xmm2/m64, ymm1","vpbroadcastq = xmm2/m64, ymm1","VEX.256.66.0F38.W0 59 /r","V","V","AVX2","","w,r","","" +"VPBROADCASTQ xmm1, {k}{z}, xmm2/m64","VPBROADCASTQ xmm2/m64, {k}{z}, xmm1= ","vpbroadcastq xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.W1 59 /r","V","V"= ,"AVX512F+AVX512VL","scale8","w,r,r","","" +"VPBROADCASTQ ymm1, {k}{z}, xmm2/m64","VPBROADCASTQ xmm2/m64, {k}{z}, ymm1= ","vpbroadcastq xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.W1 59 /r","V","V"= ,"AVX512F+AVX512VL","scale8","w,r,r","","" +"VPBROADCASTQ zmm1, {k}{z}, xmm2/m64","VPBROADCASTQ xmm2/m64, {k}{z}, zmm1= ","vpbroadcastq xmm2/m64, {k}{z}, zmm1","EVEX.512.66.0F38.W1 59 /r","V","V"= ,"AVX512F","scale8","w,r,r","","" +"VPBROADCASTW xmm1, {k}{z}, rmr32","VPBROADCASTW rmr32, {k}{z}, xmm1","vpb= roadcastw rmr32, {k}{z}, xmm1","EVEX.128.66.0F38.W0 7B /r","V","V","AVX512B= W+AVX512VL","modrm_regonly","w,r,r","","" +"VPBROADCASTW ymm1, {k}{z}, rmr32","VPBROADCASTW rmr32, {k}{z}, ymm1","vpb= roadcastw rmr32, {k}{z}, ymm1","EVEX.256.66.0F38.W0 7B /r","V","V","AVX512B= W+AVX512VL","modrm_regonly","w,r,r","","" +"VPBROADCASTW zmm1, {k}{z}, rmr32","VPBROADCASTW rmr32, {k}{z}, zmm1","vpb= roadcastw rmr32, {k}{z}, zmm1","EVEX.512.66.0F38.W0 7B /r","V","V","AVX512B= W","modrm_regonly","w,r,r","","" +"VPBROADCASTW xmm1, xmm2/m16","VPBROADCASTW xmm2/m16, xmm1","vpbroadcastw = xmm2/m16, xmm1","VEX.128.66.0F38.W0 79 /r","V","V","AVX2","","w,r","","" +"VPBROADCASTW ymm1, xmm2/m16","VPBROADCASTW xmm2/m16, ymm1","vpbroadcastw = 
xmm2/m16, ymm1","VEX.256.66.0F38.W0 79 /r","V","V","AVX2","","w,r","","" +"VPBROADCASTW xmm1, {k}{z}, xmm2/m16","VPBROADCASTW xmm2/m16, {k}{z}, xmm1= ","vpbroadcastw xmm2/m16, {k}{z}, xmm1","EVEX.128.66.0F38.W0 79 /r","V","V"= ,"AVX512BW+AVX512VL","scale2","w,r,r","","" +"VPBROADCASTW ymm1, {k}{z}, xmm2/m16","VPBROADCASTW xmm2/m16, {k}{z}, ymm1= ","vpbroadcastw xmm2/m16, {k}{z}, ymm1","EVEX.256.66.0F38.W0 79 /r","V","V"= ,"AVX512BW+AVX512VL","scale2","w,r,r","","" +"VPBROADCASTW zmm1, {k}{z}, xmm2/m16","VPBROADCASTW xmm2/m16, {k}{z}, zmm1= ","vpbroadcastw xmm2/m16, {k}{z}, zmm1","EVEX.512.66.0F38.W0 79 /r","V","V"= ,"AVX512BW","scale2","w,r,r","","" +"VPCLMULQDQ xmm1, xmmV, xmm2/m128, imm8u","VPCLMULQDQ imm8u, xmm2/m128, xm= mV, xmm1","vpclmulqdq imm8u, xmm2/m128, xmmV, xmm1","EVEX.NDS.128.66.0F3A.W= IG 44 /r ib","V","V","VPCLMULQDQ+AVX512VL","scale16","w,r,r,r","","" +"VPCLMULQDQ xmm1, xmmV, xmm2/m128, imm8u","VPCLMULQDQ imm8u, xmm2/m128, xm= mV, xmm1","vpclmulqdq imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F3A.WI= G 44 /r ib","V","V","PCLMULQDQ+AVX","","w,r,r,r","","" +"VPCLMULQDQ ymm1, ymmV, ymm2/m256, imm8u","VPCLMULQDQ imm8u, ymm2/m256, ym= mV, ymm1","vpclmulqdq imm8u, ymm2/m256, ymmV, ymm1","EVEX.NDS.256.66.0F3A.W= IG 44 /r ib","V","V","VPCLMULQDQ+AVX512VL","scale32","w,r,r,r","","" +"VPCLMULQDQ ymm1, ymmV, ymm2/m256, imm8u","VPCLMULQDQ imm8u, ymm2/m256, ym= mV, ymm1","vpclmulqdq imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.WI= G 44 /r ib","V","V","VPCLMULQDQ","","w,r,r,r","","" +"VPCLMULQDQ zmm1, zmmV, zmm2/m512, imm8u","VPCLMULQDQ imm8u, zmm2/m512, zm= mV, zmm1","vpclmulqdq imm8u, zmm2/m512, zmmV, zmm1","EVEX.NDS.512.66.0F3A.W= IG 44 /r ib","V","V","VPCLMULQDQ+AVX512F","scale64","w,r,r,r","","" +"VPCMOV xmm1, xmmV, xmmIH, xmm2/m128","VPCMOV xmm2/m128, xmmIH, xmmV, xmm1= ","vpcmov xmm2/m128, xmmIH, xmmV, xmm1","XOP.NDS.128.08.W1 A2 /r /is4","V",= "V","XOP","amd","w,r,r,r","","" +"VPCMOV xmm1, xmmV, xmm2/m128, xmmIH","VPCMOV xmmIH, xmm2/m128, xmmV, xmm1= ","vpcmov xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 A2 /r /is4","V",= "V","XOP","amd","w,r,r,r","","" +"VPCMOV ymm1, ymmV, ymmIH, ymm2/m256","VPCMOV ymm2/m256, ymmIH, ymmV, ymm1= ","vpcmov ymm2/m256, ymmIH, ymmV, ymm1","XOP.NDS.256.08.W1 A2 /r /is4","V",= "V","XOP","amd","w,r,r,r","","" +"VPCMOV ymm1, ymmV, ymm2/m256, ymmIH","VPCMOV ymmIH, ymm2/m256, ymmV, ymm1= ","vpcmov ymmIH, ymm2/m256, ymmV, ymm1","XOP.NDS.256.08.W0 A2 /r /is4","V",= "V","XOP","amd","w,r,r,r","","" +"VPCMPB k1, {k}, xmmV, xmm2/m128, imm8u","VPCMPB imm8u, xmm2/m128, xmmV, {= k}, k1","vpcmpb imm8u, xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F3A.W0 3= F /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r,r","","" +"VPCMPB k1, {k}, ymmV, ymm2/m256, imm8u","VPCMPB imm8u, ymm2/m256, ymmV, {= k}, k1","vpcmpb imm8u, ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F3A.W0 3= F /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r,r","","" +"VPCMPB k1, {k}, zmmV, zmm2/m512, imm8u","VPCMPB imm8u, zmm2/m512, zmmV, {= k}, k1","vpcmpb imm8u, zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F3A.W0 3= F /r ib","V","V","AVX512BW","scale64","w,r,r,r,r","","" +"VPCMPD k1, {k}, xmmV, xmm2/m128/m32bcst, imm8u","VPCMPD imm8u, xmm2/m128/= m32bcst, xmmV, {k}, k1","vpcmpd imm8u, xmm2/m128/m32bcst, xmmV, {k}, k1","E= VEX.NDS.128.66.0F3A.W0 1F /r ib","V","V","AVX512F+AVX512VL","bscale4,scale1= 6","w,r,r,r,r","","" +"VPCMPD k1, {k}, ymmV, ymm2/m256/m32bcst, imm8u","VPCMPD imm8u, ymm2/m256/= m32bcst, ymmV, {k}, k1","vpcmpd imm8u, ymm2/m256/m32bcst, ymmV, {k}, k1","E= 
VEX.NDS.256.66.0F3A.W0 1F /r ib","V","V","AVX512F+AVX512VL","bscale4,scale3= 2","w,r,r,r,r","","" +"VPCMPD k1, {k}, zmmV, zmm2/m512/m32bcst, imm8u","VPCMPD imm8u, zmm2/m512/= m32bcst, zmmV, {k}, k1","vpcmpd imm8u, zmm2/m512/m32bcst, zmmV, {k}, k1","E= VEX.NDS.512.66.0F3A.W0 1F /r ib","V","V","AVX512F","bscale4,scale64","w,r,r= ,r,r","","" +"VPCMPEQB xmm1, xmmV, xmm2/m128","VPCMPEQB xmm2/m128, xmmV, xmm1","vpcmpeq= b xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 74 /r","V","V","AVX","","w,= r,r","","" +"VPCMPEQB k1, {k}, xmmV, xmm2/m128","VPCMPEQB xmm2/m128, xmmV, {k}, k1","v= pcmpeqb xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F.WIG 74 /r","V","V","A= VX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPCMPEQB ymm1, ymmV, ymm2/m256","VPCMPEQB ymm2/m256, ymmV, ymm1","vpcmpeq= b ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 74 /r","V","V","AVX2","","w= ,r,r","","" +"VPCMPEQB k1, {k}, ymmV, ymm2/m256","VPCMPEQB ymm2/m256, ymmV, {k}, k1","v= pcmpeqb ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F.WIG 74 /r","V","V","A= VX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPCMPEQB k1, {k}, zmmV, zmm2/m512","VPCMPEQB zmm2/m512, zmmV, {k}, k1","v= pcmpeqb zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F.WIG 74 /r","V","V","A= VX512BW","scale64","w,r,r,r","","" +"VPCMPEQD xmm1, xmmV, xmm2/m128","VPCMPEQD xmm2/m128, xmmV, xmm1","vpcmpeq= d xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 76 /r","V","V","AVX","","w,= r,r","","" +"VPCMPEQD k1, {k}, xmmV, xmm2/m128/m32bcst","VPCMPEQD xmm2/m128/m32bcst, x= mmV, {k}, k1","vpcmpeqd xmm2/m128/m32bcst, xmmV, {k}, k1","EVEX.NDS.128.66.= 0F.W0 76 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","","" +"VPCMPEQD ymm1, ymmV, ymm2/m256","VPCMPEQD ymm2/m256, ymmV, ymm1","vpcmpeq= d ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 76 /r","V","V","AVX2","","w= ,r,r","","" +"VPCMPEQD k1, {k}, ymmV, ymm2/m256/m32bcst","VPCMPEQD ymm2/m256/m32bcst, y= mmV, {k}, k1","vpcmpeqd ymm2/m256/m32bcst, ymmV, {k}, k1","EVEX.NDS.256.66.= 0F.W0 76 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","","" +"VPCMPEQD k1, {k}, zmmV, zmm2/m512/m32bcst","VPCMPEQD zmm2/m512/m32bcst, z= mmV, {k}, k1","vpcmpeqd zmm2/m512/m32bcst, zmmV, {k}, k1","EVEX.NDS.512.66.= 0F.W0 76 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VPCMPEQQ xmm1, xmmV, xmm2/m128","VPCMPEQQ xmm2/m128, xmmV, xmm1","vpcmpeq= q xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 29 /r","V","V","AVX","","= w,r,r","","" +"VPCMPEQQ k1, {k}, xmmV, xmm2/m128/m64bcst","VPCMPEQQ xmm2/m128/m64bcst, x= mmV, {k}, k1","vpcmpeqq xmm2/m128/m64bcst, xmmV, {k}, k1","EVEX.NDS.128.66.= 0F38.W1 29 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","","" +"VPCMPEQQ ymm1, ymmV, ymm2/m256","VPCMPEQQ ymm2/m256, ymmV, ymm1","vpcmpeq= q ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 29 /r","V","V","AVX2","",= "w,r,r","","" +"VPCMPEQQ k1, {k}, ymmV, ymm2/m256/m64bcst","VPCMPEQQ ymm2/m256/m64bcst, y= mmV, {k}, k1","vpcmpeqq ymm2/m256/m64bcst, ymmV, {k}, k1","EVEX.NDS.256.66.= 0F38.W1 29 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","","" +"VPCMPEQQ k1, {k}, zmmV, zmm2/m512/m64bcst","VPCMPEQQ zmm2/m512/m64bcst, z= mmV, {k}, k1","vpcmpeqq zmm2/m512/m64bcst, zmmV, {k}, k1","EVEX.NDS.512.66.= 0F38.W1 29 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VPCMPEQW xmm1, xmmV, xmm2/m128","VPCMPEQW xmm2/m128, xmmV, xmm1","vpcmpeq= w xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 75 /r","V","V","AVX","","w,= r,r","","" +"VPCMPEQW k1, {k}, xmmV, xmm2/m128","VPCMPEQW xmm2/m128, xmmV, {k}, k1","v= pcmpeqw xmm2/m128, xmmV, {k}, 
k1","EVEX.NDS.128.66.0F.WIG 75 /r","V","V","A= VX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPCMPEQW ymm1, ymmV, ymm2/m256","VPCMPEQW ymm2/m256, ymmV, ymm1","vpcmpeq= w ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 75 /r","V","V","AVX2","","w= ,r,r","","" +"VPCMPEQW k1, {k}, ymmV, ymm2/m256","VPCMPEQW ymm2/m256, ymmV, {k}, k1","v= pcmpeqw ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F.WIG 75 /r","V","V","A= VX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPCMPEQW k1, {k}, zmmV, zmm2/m512","VPCMPEQW zmm2/m512, zmmV, {k}, k1","v= pcmpeqw zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F.WIG 75 /r","V","V","A= VX512BW","scale64","w,r,r,r","","" +"VPCMPESTRI xmm1, xmm2/m128, imm8u","VPCMPESTRI imm8u, xmm2/m128, xmm1","v= pcmpestri imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG 61 /r ib","V","V","A= VX","","r,r,r","","" +"VPCMPESTRM xmm1, xmm2/m128, imm8u","VPCMPESTRM imm8u, xmm2/m128, xmm1","v= pcmpestrm imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG 60 /r ib","V","V","A= VX","","r,r,r","","" +"VPCMPGTB xmm1, xmmV, xmm2/m128","VPCMPGTB xmm2/m128, xmmV, xmm1","vpcmpgt= b xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 64 /r","V","V","AVX","","w,= r,r","","" +"VPCMPGTB k1, {k}, xmmV, xmm2/m128","VPCMPGTB xmm2/m128, xmmV, {k}, k1","v= pcmpgtb xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F.WIG 64 /r","V","V","A= VX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPCMPGTB ymm1, ymmV, ymm2/m256","VPCMPGTB ymm2/m256, ymmV, ymm1","vpcmpgt= b ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 64 /r","V","V","AVX2","","w= ,r,r","","" +"VPCMPGTB k1, {k}, ymmV, ymm2/m256","VPCMPGTB ymm2/m256, ymmV, {k}, k1","v= pcmpgtb ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F.WIG 64 /r","V","V","A= VX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPCMPGTB k1, {k}, zmmV, zmm2/m512","VPCMPGTB zmm2/m512, zmmV, {k}, k1","v= pcmpgtb zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F.WIG 64 /r","V","V","A= VX512BW","scale64","w,r,r,r","","" +"VPCMPGTD xmm1, xmmV, xmm2/m128","VPCMPGTD xmm2/m128, xmmV, xmm1","vpcmpgt= d xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 66 /r","V","V","AVX","","w,= r,r","","" +"VPCMPGTD k1, {k}, xmmV, xmm2/m128/m32bcst","VPCMPGTD xmm2/m128/m32bcst, x= mmV, {k}, k1","vpcmpgtd xmm2/m128/m32bcst, xmmV, {k}, k1","EVEX.NDS.128.66.= 0F.W0 66 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","","" +"VPCMPGTD ymm1, ymmV, ymm2/m256","VPCMPGTD ymm2/m256, ymmV, ymm1","vpcmpgt= d ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 66 /r","V","V","AVX2","","w= ,r,r","","" +"VPCMPGTD k1, {k}, ymmV, ymm2/m256/m32bcst","VPCMPGTD ymm2/m256/m32bcst, y= mmV, {k}, k1","vpcmpgtd ymm2/m256/m32bcst, ymmV, {k}, k1","EVEX.NDS.256.66.= 0F.W0 66 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","","" +"VPCMPGTD k1, {k}, zmmV, zmm2/m512/m32bcst","VPCMPGTD zmm2/m512/m32bcst, z= mmV, {k}, k1","vpcmpgtd zmm2/m512/m32bcst, zmmV, {k}, k1","EVEX.NDS.512.66.= 0F.W0 66 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VPCMPGTQ xmm1, xmmV, xmm2/m128","VPCMPGTQ xmm2/m128, xmmV, xmm1","vpcmpgt= q xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 37 /r","V","V","AVX","","= w,r,r","","" +"VPCMPGTQ k1, {k}, xmmV, xmm2/m128/m64bcst","VPCMPGTQ xmm2/m128/m64bcst, x= mmV, {k}, k1","vpcmpgtq xmm2/m128/m64bcst, xmmV, {k}, k1","EVEX.NDS.128.66.= 0F38.W1 37 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","","" +"VPCMPGTQ ymm1, ymmV, ymm2/m256","VPCMPGTQ ymm2/m256, ymmV, ymm1","vpcmpgt= q ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 37 /r","V","V","AVX2","",= "w,r,r","","" +"VPCMPGTQ k1, {k}, ymmV, ymm2/m256/m64bcst","VPCMPGTQ ymm2/m256/m64bcst, 
y= mmV, {k}, k1","vpcmpgtq ymm2/m256/m64bcst, ymmV, {k}, k1","EVEX.NDS.256.66.= 0F38.W1 37 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","","" +"VPCMPGTQ k1, {k}, zmmV, zmm2/m512/m64bcst","VPCMPGTQ zmm2/m512/m64bcst, z= mmV, {k}, k1","vpcmpgtq zmm2/m512/m64bcst, zmmV, {k}, k1","EVEX.NDS.512.66.= 0F38.W1 37 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VPCMPGTW xmm1, xmmV, xmm2/m128","VPCMPGTW xmm2/m128, xmmV, xmm1","vpcmpgt= w xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 65 /r","V","V","AVX","","w,= r,r","","" +"VPCMPGTW k1, {k}, xmmV, xmm2/m128","VPCMPGTW xmm2/m128, xmmV, {k}, k1","v= pcmpgtw xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F.WIG 65 /r","V","V","A= VX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPCMPGTW ymm1, ymmV, ymm2/m256","VPCMPGTW ymm2/m256, ymmV, ymm1","vpcmpgt= w ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 65 /r","V","V","AVX2","","w= ,r,r","","" +"VPCMPGTW k1, {k}, ymmV, ymm2/m256","VPCMPGTW ymm2/m256, ymmV, {k}, k1","v= pcmpgtw ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F.WIG 65 /r","V","V","A= VX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPCMPGTW k1, {k}, zmmV, zmm2/m512","VPCMPGTW zmm2/m512, zmmV, {k}, k1","v= pcmpgtw zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F.WIG 65 /r","V","V","A= VX512BW","scale64","w,r,r,r","","" +"VPCMPISTRI xmm1, xmm2/m128, imm8u","VPCMPISTRI imm8u, xmm2/m128, xmm1","v= pcmpistri imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG 63 /r ib","V","V","A= VX","","r,r,r","","" +"VPCMPISTRM xmm1, xmm2/m128, imm8u","VPCMPISTRM imm8u, xmm2/m128, xmm1","v= pcmpistrm imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG 62 /r ib","V","V","A= VX","","r,r,r","","" +"VPCMPQ k1, {k}, xmmV, xmm2/m128/m64bcst, imm8u","VPCMPQ imm8u, xmm2/m128/= m64bcst, xmmV, {k}, k1","vpcmpq imm8u, xmm2/m128/m64bcst, xmmV, {k}, k1","E= VEX.NDS.128.66.0F3A.W1 1F /r ib","V","V","AVX512F+AVX512VL","bscale8,scale1= 6","w,r,r,r,r","","" +"VPCMPQ k1, {k}, ymmV, ymm2/m256/m64bcst, imm8u","VPCMPQ imm8u, ymm2/m256/= m64bcst, ymmV, {k}, k1","vpcmpq imm8u, ymm2/m256/m64bcst, ymmV, {k}, k1","E= VEX.NDS.256.66.0F3A.W1 1F /r ib","V","V","AVX512F+AVX512VL","bscale8,scale3= 2","w,r,r,r,r","","" +"VPCMPQ k1, {k}, zmmV, zmm2/m512/m64bcst, imm8u","VPCMPQ imm8u, zmm2/m512/= m64bcst, zmmV, {k}, k1","vpcmpq imm8u, zmm2/m512/m64bcst, zmmV, {k}, k1","E= VEX.NDS.512.66.0F3A.W1 1F /r ib","V","V","AVX512F","bscale8,scale64","w,r,r= ,r,r","","" +"VPCMPUB k1, {k}, xmmV, xmm2/m128, imm8u","VPCMPUB imm8u, xmm2/m128, xmmV,= {k}, k1","vpcmpub imm8u, xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F3A.W= 0 3E /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r,r","","" +"VPCMPUB k1, {k}, ymmV, ymm2/m256, imm8u","VPCMPUB imm8u, ymm2/m256, ymmV,= {k}, k1","vpcmpub imm8u, ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F3A.W= 0 3E /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r,r","","" +"VPCMPUB k1, {k}, zmmV, zmm2/m512, imm8u","VPCMPUB imm8u, zmm2/m512, zmmV,= {k}, k1","vpcmpub imm8u, zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F3A.W= 0 3E /r ib","V","V","AVX512BW","scale64","w,r,r,r,r","","" +"VPCMPUD k1, {k}, xmmV, xmm2/m128/m32bcst, imm8u","VPCMPUD imm8u, xmm2/m12= 8/m32bcst, xmmV, {k}, k1","vpcmpud imm8u, xmm2/m128/m32bcst, xmmV, {k}, k1"= ,"EVEX.NDS.128.66.0F3A.W0 1E /r ib","V","V","AVX512F+AVX512VL","bscale4,sca= le16","w,r,r,r,r","","" +"VPCMPUD k1, {k}, ymmV, ymm2/m256/m32bcst, imm8u","VPCMPUD imm8u, ymm2/m25= 6/m32bcst, ymmV, {k}, k1","vpcmpud imm8u, ymm2/m256/m32bcst, ymmV, {k}, k1"= ,"EVEX.NDS.256.66.0F3A.W0 1E /r ib","V","V","AVX512F+AVX512VL","bscale4,sca= le32","w,r,r,r,r","","" 
+"VPCMPUD k1, {k}, zmmV, zmm2/m512/m32bcst, imm8u","VPCMPUD imm8u, zmm2/m51= 2/m32bcst, zmmV, {k}, k1","vpcmpud imm8u, zmm2/m512/m32bcst, zmmV, {k}, k1"= ,"EVEX.NDS.512.66.0F3A.W0 1E /r ib","V","V","AVX512F","bscale4,scale64","w,= r,r,r,r","","" +"VPCMPUQ k1, {k}, xmmV, xmm2/m128/m64bcst, imm8u","VPCMPUQ imm8u, xmm2/m12= 8/m64bcst, xmmV, {k}, k1","vpcmpuq imm8u, xmm2/m128/m64bcst, xmmV, {k}, k1"= ,"EVEX.NDS.128.66.0F3A.W1 1E /r ib","V","V","AVX512F+AVX512VL","bscale8,sca= le16","w,r,r,r,r","","" +"VPCMPUQ k1, {k}, ymmV, ymm2/m256/m64bcst, imm8u","VPCMPUQ imm8u, ymm2/m25= 6/m64bcst, ymmV, {k}, k1","vpcmpuq imm8u, ymm2/m256/m64bcst, ymmV, {k}, k1"= ,"EVEX.NDS.256.66.0F3A.W1 1E /r ib","V","V","AVX512F+AVX512VL","bscale8,sca= le32","w,r,r,r,r","","" +"VPCMPUQ k1, {k}, zmmV, zmm2/m512/m64bcst, imm8u","VPCMPUQ imm8u, zmm2/m51= 2/m64bcst, zmmV, {k}, k1","vpcmpuq imm8u, zmm2/m512/m64bcst, zmmV, {k}, k1"= ,"EVEX.NDS.512.66.0F3A.W1 1E /r ib","V","V","AVX512F","bscale8,scale64","w,= r,r,r,r","","" +"VPCMPUW k1, {k}, xmmV, xmm2/m128, imm8u","VPCMPUW imm8u, xmm2/m128, xmmV,= {k}, k1","vpcmpuw imm8u, xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F3A.W= 1 3E /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r,r","","" +"VPCMPUW k1, {k}, ymmV, ymm2/m256, imm8u","VPCMPUW imm8u, ymm2/m256, ymmV,= {k}, k1","vpcmpuw imm8u, ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F3A.W= 1 3E /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r,r","","" +"VPCMPUW k1, {k}, zmmV, zmm2/m512, imm8u","VPCMPUW imm8u, zmm2/m512, zmmV,= {k}, k1","vpcmpuw imm8u, zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F3A.W= 1 3E /r ib","V","V","AVX512BW","scale64","w,r,r,r,r","","" +"VPCMPW k1, {k}, xmmV, xmm2/m128, imm8u","VPCMPW imm8u, xmm2/m128, xmmV, {= k}, k1","vpcmpw imm8u, xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F3A.W1 3= F /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r,r","","" +"VPCMPW k1, {k}, ymmV, ymm2/m256, imm8u","VPCMPW imm8u, ymm2/m256, ymmV, {= k}, k1","vpcmpw imm8u, ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F3A.W1 3= F /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r,r","","" +"VPCMPW k1, {k}, zmmV, zmm2/m512, imm8u","VPCMPW imm8u, zmm2/m512, zmmV, {= k}, k1","vpcmpw imm8u, zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F3A.W1 3= F /r ib","V","V","AVX512BW","scale64","w,r,r,r,r","","" +"VPCOMB xmm1, xmmV, xmm2/m128, imm8u","VPCOMB imm8u, xmm2/m128, xmmV, xmm1= ","vpcomb imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 CC /r ib","V","V= ","XOP","amd","w,r,r,r","","" +"VPCOMD xmm1, xmmV, xmm2/m128, imm8u","VPCOMD imm8u, xmm2/m128, xmmV, xmm1= ","vpcomd imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 CE /r ib","V","V= ","XOP","amd","w,r,r,r","","" +"VPCOMPRESSB xmm2/m128, {k}{z}, xmm1","VPCOMPRESSB xmm1, {k}{z}, xmm2/m128= ","vpcompressb xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F38.W0 63 /r","V","V"= ,"AVX512_VBMI2+AVX512VL","scale1","w,r,r","","" +"VPCOMPRESSB ymm2/m256, {k}{z}, ymm1","VPCOMPRESSB ymm1, {k}{z}, ymm2/m256= ","vpcompressb ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F38.W0 63 /r","V","V"= ,"AVX512_VBMI2+AVX512VL","scale1","w,r,r","","" +"VPCOMPRESSB zmm2/m512, {k}{z}, zmm1","VPCOMPRESSB zmm1, {k}{z}, zmm2/m512= ","vpcompressb zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F38.W0 63 /r","V","V"= ,"AVX512_VBMI2","scale1","w,r,r","","" +"VPCOMPRESSD xmm2/m128, {k}{z}, xmm1","VPCOMPRESSD xmm1, {k}{z}, xmm2/m128= ","vpcompressd xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F38.W0 8B /r","V","V"= ,"AVX512F+AVX512VL","scale4","w,r,r","","" +"VPCOMPRESSD ymm2/m256, {k}{z}, ymm1","VPCOMPRESSD ymm1, {k}{z}, ymm2/m256= 
","vpcompressd ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F38.W0 8B /r","V","V"= ,"AVX512F+AVX512VL","scale4","w,r,r","","" +"VPCOMPRESSD zmm2/m512, {k}{z}, zmm1","VPCOMPRESSD zmm1, {k}{z}, zmm2/m512= ","vpcompressd zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F38.W0 8B /r","V","V"= ,"AVX512F","scale4","w,r,r","","" +"VPCOMPRESSQ xmm2/m128, {k}{z}, xmm1","VPCOMPRESSQ xmm1, {k}{z}, xmm2/m128= ","vpcompressq xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F38.W1 8B /r","V","V"= ,"AVX512F+AVX512VL","scale8","w,r,r","","" +"VPCOMPRESSQ ymm2/m256, {k}{z}, ymm1","VPCOMPRESSQ ymm1, {k}{z}, ymm2/m256= ","vpcompressq ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F38.W1 8B /r","V","V"= ,"AVX512F+AVX512VL","scale8","w,r,r","","" +"VPCOMPRESSQ zmm2/m512, {k}{z}, zmm1","VPCOMPRESSQ zmm1, {k}{z}, zmm2/m512= ","vpcompressq zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F38.W1 8B /r","V","V"= ,"AVX512F","scale8","w,r,r","","" +"VPCOMPRESSW xmm2/m128, {k}{z}, xmm1","VPCOMPRESSW xmm1, {k}{z}, xmm2/m128= ","vpcompressw xmm1, {k}{z}, xmm2/m128","EVEX.128.66.0F38.W1 63 /r","V","V"= ,"AVX512_VBMI2+AVX512VL","scale2","w,r,r","","" +"VPCOMPRESSW ymm2/m256, {k}{z}, ymm1","VPCOMPRESSW ymm1, {k}{z}, ymm2/m256= ","vpcompressw ymm1, {k}{z}, ymm2/m256","EVEX.256.66.0F38.W1 63 /r","V","V"= ,"AVX512_VBMI2+AVX512VL","scale2","w,r,r","","" +"VPCOMPRESSW zmm2/m512, {k}{z}, zmm1","VPCOMPRESSW zmm1, {k}{z}, zmm2/m512= ","vpcompressw zmm1, {k}{z}, zmm2/m512","EVEX.512.66.0F38.W1 63 /r","V","V"= ,"AVX512_VBMI2","scale2","w,r,r","","" +"VPCOMQ xmm1, xmmV, xmm2/m128, imm8u","VPCOMQ imm8u, xmm2/m128, xmmV, xmm1= ","vpcomq imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 CF /r ib","V","V= ","XOP","amd","w,r,r,r","","" +"VPCOMUB xmm1, xmmV, xmm2/m128, imm8u","VPCOMUB imm8u, xmm2/m128, xmmV, xm= m1","vpcomub imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 EC /r ib","V"= ,"V","XOP","amd","w,r,r,r","","" +"VPCOMUD xmm1, xmmV, xmm2/m128, imm8u","VPCOMUD imm8u, xmm2/m128, xmmV, xm= m1","vpcomud imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 EE /r ib","V"= ,"V","XOP","amd","w,r,r,r","","" +"VPCOMUQ xmm1, xmmV, xmm2/m128, imm8u","VPCOMUQ imm8u, xmm2/m128, xmmV, xm= m1","vpcomuq imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 EF /r ib","V"= ,"V","XOP","amd","w,r,r,r","","" +"VPCOMUW xmm1, xmmV, xmm2/m128, imm8u","VPCOMUW imm8u, xmm2/m128, xmmV, xm= m1","vpcomuw imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 ED /r ib","V"= ,"V","XOP","amd","w,r,r,r","","" +"VPCOMW xmm1, xmmV, xmm2/m128, imm8u","VPCOMW imm8u, xmm2/m128, xmmV, xmm1= ","vpcomw imm8u, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 CD /r ib","V","V= ","XOP","amd","w,r,r,r","","" +"VPCONFLICTD xmm1, {k}{z}, xmm2/m128/m32bcst","VPCONFLICTD xmm2/m128/m32bc= st, {k}{z}, xmm1","vpconflictd xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.6= 6.0F38.W0 C4 /r","V","V","AVX512CD+AVX512VL","bscale4,scale16","w,r,r","","" +"VPCONFLICTD ymm1, {k}{z}, ymm2/m256/m32bcst","VPCONFLICTD ymm2/m256/m32bc= st, {k}{z}, ymm1","vpconflictd ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.6= 6.0F38.W0 C4 /r","V","V","AVX512CD+AVX512VL","bscale4,scale32","w,r,r","","" +"VPCONFLICTD zmm1, {k}{z}, zmm2/m512/m32bcst","VPCONFLICTD zmm2/m512/m32bc= st, {k}{z}, zmm1","vpconflictd zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.6= 6.0F38.W0 C4 /r","V","V","AVX512CD","bscale4,scale64","w,r,r","","" +"VPCONFLICTQ xmm1, {k}{z}, xmm2/m128/m64bcst","VPCONFLICTQ xmm2/m128/m64bc= st, {k}{z}, xmm1","vpconflictq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.6= 6.0F38.W1 C4 /r","V","V","AVX512CD+AVX512VL","bscale8,scale16","w,r,r","","" +"VPCONFLICTQ ymm1, {k}{z}, 
ymm2/m256/m64bcst","VPCONFLICTQ ymm2/m256/m64bc= st, {k}{z}, ymm1","vpconflictq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.6= 6.0F38.W1 C4 /r","V","V","AVX512CD+AVX512VL","bscale8,scale32","w,r,r","","" +"VPCONFLICTQ zmm1, {k}{z}, zmm2/m512/m64bcst","VPCONFLICTQ zmm2/m512/m64bc= st, {k}{z}, zmm1","vpconflictq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.6= 6.0F38.W1 C4 /r","V","V","AVX512CD","bscale8,scale64","w,r,r","","" +"VPDPBUSD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPDPBUSD xmm2/m128/m32bc= st, xmmV, {k}{z}, xmm1","vpdpbusd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","E= VEX.DDS.128.66.0F38.W0 50 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,scale= 16","rw,r,r,r","","" +"VPDPBUSD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPDPBUSD ymm2/m256/m32bc= st, ymmV, {k}{z}, ymm1","vpdpbusd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","E= VEX.DDS.256.66.0F38.W0 50 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,scale= 32","rw,r,r,r","","" +"VPDPBUSD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPDPBUSD zmm2/m512/m32bc= st, zmmV, {k}{z}, zmm1","vpdpbusd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","E= VEX.DDS.512.66.0F38.W0 50 /r","V","V","AVX512_VNNI","bscale4,scale64","rw,r= ,r,r","","" +"VPDPBUSDS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPDPBUSDS xmm2/m128/m32= bcst, xmmV, {k}{z}, xmm1","vpdpbusds xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.DDS.128.66.0F38.W0 51 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,sc= ale16","rw,r,r,r","","" +"VPDPBUSDS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPDPBUSDS ymm2/m256/m32= bcst, ymmV, {k}{z}, ymm1","vpdpbusds ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.DDS.256.66.0F38.W0 51 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,sc= ale32","rw,r,r,r","","" +"VPDPBUSDS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPDPBUSDS zmm2/m512/m32= bcst, zmmV, {k}{z}, zmm1","vpdpbusds zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.DDS.512.66.0F38.W0 51 /r","V","V","AVX512_VNNI","bscale4,scale64","r= w,r,r,r","","" +"VPDPWSSD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPDPWSSD xmm2/m128/m32bc= st, xmmV, {k}{z}, xmm1","vpdpwssd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","E= VEX.DDS.128.66.0F38.W0 52 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,scale= 16","rw,r,r,r","","" +"VPDPWSSD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPDPWSSD ymm2/m256/m32bc= st, ymmV, {k}{z}, ymm1","vpdpwssd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","E= VEX.DDS.256.66.0F38.W0 52 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,scale= 32","rw,r,r,r","","" +"VPDPWSSD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPDPWSSD zmm2/m512/m32bc= st, zmmV, {k}{z}, zmm1","vpdpwssd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","E= VEX.DDS.512.66.0F38.W0 52 /r","V","V","AVX512_VNNI","bscale4,scale64","rw,r= ,r,r","","" +"VPDPWSSDS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPDPWSSDS xmm2/m128/m32= bcst, xmmV, {k}{z}, xmm1","vpdpwssds xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.DDS.128.66.0F38.W0 53 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,sc= ale16","rw,r,r,r","","" +"VPDPWSSDS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPDPWSSDS ymm2/m256/m32= bcst, ymmV, {k}{z}, ymm1","vpdpwssds ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.DDS.256.66.0F38.W0 53 /r","V","V","AVX512_VNNI+AVX512VL","bscale4,sc= ale32","rw,r,r,r","","" +"VPDPWSSDS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPDPWSSDS zmm2/m512/m32= bcst, zmmV, {k}{z}, zmm1","vpdpwssds zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.DDS.512.66.0F38.W0 53 /r","V","V","AVX512_VNNI","bscale4,scale64","r= w,r,r,r","","" +"VPERM2F128 ymm1, ymmV, ymm2/m256, imm8u","VPERM2F128 imm8u, ymm2/m256, ym= mV, ymm1","vperm2f128 imm8u, ymm2/m256, ymmV, 
ymm1","VEX.NDS.256.66.0F3A.W0= 06 /r ib","V","V","AVX","","w,r,r,r","","" +"VPERM2I128 ymm1, ymmV, ymm2/m256, imm8u","VPERM2I128 imm8u, ymm2/m256, ym= mV, ymm1","vperm2i128 imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F3A.W0= 46 /r ib","V","V","AVX2","","w,r,r,r","","" +"VPERMB xmm1, {k}{z}, xmmV, xmm2/m128","VPERMB xmm2/m128, xmmV, {k}{z}, xm= m1","vpermb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W0 8D /r",= "V","V","AVX512_VBMI+AVX512VL","scale16","w,r,r,r","","" +"VPERMB ymm1, {k}{z}, ymmV, ymm2/m256","VPERMB ymm2/m256, ymmV, {k}{z}, ym= m1","vpermb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W0 8D /r",= "V","V","AVX512_VBMI+AVX512VL","scale32","w,r,r,r","","" +"VPERMB zmm1, {k}{z}, zmmV, zmm2/m512","VPERMB zmm2/m512, zmmV, {k}{z}, zm= m1","vpermb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W0 8D /r",= "V","V","AVX512_VBMI","scale64","w,r,r,r","","" +"VPERMD ymm1, ymmV, ymm2/m256","VPERMD ymm2/m256, ymmV, ymm1","vpermd ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 36 /r","V","V","AVX2","","w,r,r"= ,"","" +"VPERMD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMD ymm2/m256/m32bcst, = ymmV, {k}{z}, ymm1","vpermd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F38.W0 36 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r= ,r","","" +"VPERMD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMD zmm2/m512/m32bcst, = zmmV, {k}{z}, zmm1","vpermd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F38.W0 36 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VPERMI2B xmm1, {k}{z}, xmmV, xmm2/m128","VPERMI2B xmm2/m128, xmmV, {k}{z}= , xmm1","vpermi2b xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 7= 5 /r","V","V","AVX512_VBMI+AVX512VL","scale16","rw,r,r,r","","" +"VPERMI2B ymm1, {k}{z}, ymmV, ymm2/m256","VPERMI2B ymm2/m256, ymmV, {k}{z}= , ymm1","vpermi2b ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 7= 5 /r","V","V","AVX512_VBMI+AVX512VL","scale32","rw,r,r,r","","" +"VPERMI2B zmm1, {k}{z}, zmmV, zmm2/m512","VPERMI2B zmm2/m512, zmmV, {k}{z}= , zmm1","vpermi2b zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 7= 5 /r","V","V","AVX512_VBMI","scale64","rw,r,r,r","","" +"VPERMI2D xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPERMI2D xmm2/m128/m32bc= st, xmmV, {k}{z}, xmm1","vpermi2d xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","E= VEX.DDS.128.66.0F38.W0 76 /r","V","V","AVX512F+AVX512VL","bscale4,scale16",= "rw,r,r,r","","" +"VPERMI2D ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMI2D ymm2/m256/m32bc= st, ymmV, {k}{z}, ymm1","vpermi2d ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","E= VEX.DDS.256.66.0F38.W0 76 /r","V","V","AVX512F+AVX512VL","bscale4,scale32",= "rw,r,r,r","","" +"VPERMI2D zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMI2D zmm2/m512/m32bc= st, zmmV, {k}{z}, zmm1","vpermi2d zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","E= VEX.DDS.512.66.0F38.W0 76 /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r= ","","" +"VPERMI2PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPERMI2PD xmm2/m128/m64= bcst, xmmV, {k}{z}, xmm1","vpermi2pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.DDS.128.66.0F38.W1 77 /r","V","V","AVX512F+AVX512VL","bscale8,scale1= 6","rw,r,r,r","","" +"VPERMI2PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMI2PD ymm2/m256/m64= bcst, ymmV, {k}{z}, ymm1","vpermi2pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.DDS.256.66.0F38.W1 77 /r","V","V","AVX512F+AVX512VL","bscale8,scale3= 2","rw,r,r,r","","" +"VPERMI2PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMI2PD zmm2/m512/m64= bcst, zmmV, {k}{z}, zmm1","vpermi2pd zmm2/m512/m64bcst, zmmV, 
{k}{z}, zmm1"= ,"EVEX.DDS.512.66.0F38.W1 77 /r","V","V","AVX512F","bscale8,scale64","rw,r,= r,r","","" +"VPERMI2PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPERMI2PS xmm2/m128/m32= bcst, xmmV, {k}{z}, xmm1","vpermi2ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.DDS.128.66.0F38.W0 77 /r","V","V","AVX512F+AVX512VL","bscale4,scale1= 6","rw,r,r,r","","" +"VPERMI2PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMI2PS ymm2/m256/m32= bcst, ymmV, {k}{z}, ymm1","vpermi2ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.DDS.256.66.0F38.W0 77 /r","V","V","AVX512F+AVX512VL","bscale4,scale3= 2","rw,r,r,r","","" +"VPERMI2PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMI2PS zmm2/m512/m32= bcst, zmmV, {k}{z}, zmm1","vpermi2ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.DDS.512.66.0F38.W0 77 /r","V","V","AVX512F","bscale4,scale64","rw,r,= r,r","","" +"VPERMI2Q xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPERMI2Q xmm2/m128/m64bc= st, xmmV, {k}{z}, xmm1","vpermi2q xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","E= VEX.DDS.128.66.0F38.W1 76 /r","V","V","AVX512F+AVX512VL","bscale8,scale16",= "rw,r,r,r","","" +"VPERMI2Q ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMI2Q ymm2/m256/m64bc= st, ymmV, {k}{z}, ymm1","vpermi2q ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","E= VEX.DDS.256.66.0F38.W1 76 /r","V","V","AVX512F+AVX512VL","bscale8,scale32",= "rw,r,r,r","","" +"VPERMI2Q zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMI2Q zmm2/m512/m64bc= st, zmmV, {k}{z}, zmm1","vpermi2q zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","E= VEX.DDS.512.66.0F38.W1 76 /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r= ","","" +"VPERMI2W xmm1, {k}{z}, xmmV, xmm2/m128","VPERMI2W xmm2/m128, xmmV, {k}{z}= , xmm1","vpermi2w xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 7= 5 /r","V","V","AVX512BW+AVX512VL","scale16","rw,r,r,r","","" +"VPERMI2W ymm1, {k}{z}, ymmV, ymm2/m256","VPERMI2W ymm2/m256, ymmV, {k}{z}= , ymm1","vpermi2w ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 7= 5 /r","V","V","AVX512BW+AVX512VL","scale32","rw,r,r,r","","" +"VPERMI2W zmm1, {k}{z}, zmmV, zmm2/m512","VPERMI2W zmm2/m512, zmmV, {k}{z}= , zmm1","vpermi2w zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 7= 5 /r","V","V","AVX512BW","scale64","rw,r,r,r","","" +"VPERMIL2PD xmm1, xmmV, xmmIH, xmm2/m128, imm8u","VPERMIL2PD imm8u, xmm2/m= 128, xmmIH, xmmV, xmm1","vpermil2pd imm8u, xmm2/m128, xmmIH, xmmV, xmm1","V= EX.NDS.128.66.0F3A.W1 49 /r /is4","V","V","XOP","amd","w,r,r,r,r","","" +"VPERMIL2PD xmm1, xmmV, xmm2/m128, xmmIH, imm8u","VPERMIL2PD imm8u, xmmIH,= xmm2/m128, xmmV, xmm1","vpermil2pd imm8u, xmmIH, xmm2/m128, xmmV, xmm1","V= EX.NDS.128.66.0F3A.W0 49 /r /is4","V","V","XOP","amd","w,r,r,r,r","","" +"VPERMIL2PD ymm1, ymmV, ymmIH, ymm2/m256, imm8u","VPERMIL2PD imm8u, ymm2/m= 256, ymmIH, ymmV, ymm1","vpermil2pd imm8u, ymm2/m256, ymmIH, ymmV, ymm1","V= EX.NDS.256.66.0F3A.W1 49 /r /is4","V","V","XOP","amd","w,r,r,r,r","","" +"VPERMIL2PD ymm1, ymmV, ymm2/m256, ymmIH, imm8u","VPERMIL2PD imm8u, ymmIH,= ymm2/m256, ymmV, ymm1","vpermil2pd imm8u, ymmIH, ymm2/m256, ymmV, ymm1","V= EX.NDS.256.66.0F3A.W0 49 /r /is4","V","V","XOP","amd","w,r,r,r,r","","" +"VPERMIL2PS xmm1, xmmV, xmmIH, xmm2/m128, imm8u","VPERMIL2PS imm8u, xmm2/m= 128, xmmIH, xmmV, xmm1","vpermil2ps imm8u, xmm2/m128, xmmIH, xmmV, xmm1","V= EX.NDS.128.66.0F3A.W1 48 /r /is4","V","V","XOP","amd","w,r,r,r,r","","" +"VPERMIL2PS xmm1, xmmV, xmm2/m128, xmmIH, imm8u","VPERMIL2PS imm8u, xmmIH,= xmm2/m128, xmmV, xmm1","vpermil2ps imm8u, xmmIH, xmm2/m128, xmmV, xmm1","V= EX.NDS.128.66.0F3A.W0 48 /r 
/is4","V","V","XOP","amd","w,r,r,r,r","","" +"VPERMIL2PS ymm1, ymmV, ymmIH, ymm2/m256, imm8u","VPERMIL2PS imm8u, ymm2/m= 256, ymmIH, ymmV, ymm1","vpermil2ps imm8u, ymm2/m256, ymmIH, ymmV, ymm1","V= EX.NDS.256.66.0F3A.W1 48 /r /is4","V","V","XOP","amd","w,r,r,r,r","","" +"VPERMIL2PS ymm1, ymmV, ymm2/m256, ymmIH, imm8u","VPERMIL2PS imm8u, ymmIH,= ymm2/m256, ymmV, ymm1","vpermil2ps imm8u, ymmIH, ymm2/m256, ymmV, ymm1","V= EX.NDS.256.66.0F3A.W0 48 /r /is4","V","V","XOP","amd","w,r,r,r,r","","" +"VPERMILPD xmm1, xmm2/m128, imm8u","VPERMILPD imm8u, xmm2/m128, xmm1","vpe= rmilpd imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.W0 05 /r ib","V","V","AVX",= "","w,r,r","","" +"VPERMILPD xmm1, {k}{z}, xmm2/m128/m64bcst, imm8u","VPERMILPD imm8u, xmm2/= m128/m64bcst, {k}{z}, xmm1","vpermilpd imm8u, xmm2/m128/m64bcst, {k}{z}, xm= m1","EVEX.128.66.0F3A.W1 05 /r ib","V","V","AVX512F+AVX512VL","bscale8,scal= e16","w,r,r,r","","" +"VPERMILPD ymm1, ymm2/m256, imm8u","VPERMILPD imm8u, ymm2/m256, ymm1","vpe= rmilpd imm8u, ymm2/m256, ymm1","VEX.256.66.0F3A.W0 05 /r ib","V","V","AVX",= "","w,r,r","","" +"VPERMILPD ymm1, {k}{z}, ymm2/m256/m64bcst, imm8u","VPERMILPD imm8u, ymm2/= m256/m64bcst, {k}{z}, ymm1","vpermilpd imm8u, ymm2/m256/m64bcst, {k}{z}, ym= m1","EVEX.256.66.0F3A.W1 05 /r ib","V","V","AVX512F+AVX512VL","bscale8,scal= e32","w,r,r,r","","" +"VPERMILPD zmm1, {k}{z}, zmm2/m512/m64bcst, imm8u","VPERMILPD imm8u, zmm2/= m512/m64bcst, {k}{z}, zmm1","vpermilpd imm8u, zmm2/m512/m64bcst, {k}{z}, zm= m1","EVEX.512.66.0F3A.W1 05 /r ib","V","V","AVX512F","bscale8,scale64","w,r= ,r,r","","" +"VPERMILPD xmm1, xmmV, xmm2/m128","VPERMILPD xmm2/m128, xmmV, xmm1","vperm= ilpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 0D /r","V","V","AVX",""= ,"w,r,r","","" +"VPERMILPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPERMILPD xmm2/m128/m64= bcst, xmmV, {k}{z}, xmm1","vpermilpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.NDS.128.66.0F38.W1 0D /r","V","V","AVX512F+AVX512VL","bscale8,scale1= 6","w,r,r,r","","" +"VPERMILPD ymm1, ymmV, ymm2/m256","VPERMILPD ymm2/m256, ymmV, ymm1","vperm= ilpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 0D /r","V","V","AVX",""= ,"w,r,r","","" +"VPERMILPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMILPD ymm2/m256/m64= bcst, ymmV, {k}{z}, ymm1","vpermilpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.NDS.256.66.0F38.W1 0D /r","V","V","AVX512F+AVX512VL","bscale8,scale3= 2","w,r,r,r","","" +"VPERMILPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMILPD zmm2/m512/m64= bcst, zmmV, {k}{z}, zmm1","vpermilpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.NDS.512.66.0F38.W1 0D /r","V","V","AVX512F","bscale8,scale64","w,r,r= ,r","","" +"VPERMILPS xmm1, xmm2/m128, imm8u","VPERMILPS imm8u, xmm2/m128, xmm1","vpe= rmilps imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.W0 04 /r ib","V","V","AVX",= "","w,r,r","","" +"VPERMILPS xmm1, {k}{z}, xmm2/m128/m32bcst, imm8u","VPERMILPS imm8u, xmm2/= m128/m32bcst, {k}{z}, xmm1","vpermilps imm8u, xmm2/m128/m32bcst, {k}{z}, xm= m1","EVEX.128.66.0F3A.W0 04 /r ib","V","V","AVX512F+AVX512VL","bscale4,scal= e16","w,r,r,r","","" +"VPERMILPS ymm1, ymm2/m256, imm8u","VPERMILPS imm8u, ymm2/m256, ymm1","vpe= rmilps imm8u, ymm2/m256, ymm1","VEX.256.66.0F3A.W0 04 /r ib","V","V","AVX",= "","w,r,r","","" +"VPERMILPS ymm1, {k}{z}, ymm2/m256/m32bcst, imm8u","VPERMILPS imm8u, ymm2/= m256/m32bcst, {k}{z}, ymm1","vpermilps imm8u, ymm2/m256/m32bcst, {k}{z}, ym= m1","EVEX.256.66.0F3A.W0 04 /r ib","V","V","AVX512F+AVX512VL","bscale4,scal= e32","w,r,r,r","","" +"VPERMILPS zmm1, {k}{z}, zmm2/m512/m32bcst, 
imm8u","VPERMILPS imm8u, zmm2/= m512/m32bcst, {k}{z}, zmm1","vpermilps imm8u, zmm2/m512/m32bcst, {k}{z}, zm= m1","EVEX.512.66.0F3A.W0 04 /r ib","V","V","AVX512F","bscale4,scale64","w,r= ,r,r","","" +"VPERMILPS xmm1, xmmV, xmm2/m128","VPERMILPS xmm2/m128, xmmV, xmm1","vperm= ilps xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 0C /r","V","V","AVX",""= ,"w,r,r","","" +"VPERMILPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPERMILPS xmm2/m128/m32= bcst, xmmV, {k}{z}, xmm1","vpermilps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.NDS.128.66.0F38.W0 0C /r","V","V","AVX512F+AVX512VL","bscale4,scale1= 6","w,r,r,r","","" +"VPERMILPS ymm1, ymmV, ymm2/m256","VPERMILPS ymm2/m256, ymmV, ymm1","vperm= ilps ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 0C /r","V","V","AVX",""= ,"w,r,r","","" +"VPERMILPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMILPS ymm2/m256/m32= bcst, ymmV, {k}{z}, ymm1","vpermilps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.NDS.256.66.0F38.W0 0C /r","V","V","AVX512F+AVX512VL","bscale4,scale3= 2","w,r,r,r","","" +"VPERMILPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMILPS zmm2/m512/m32= bcst, zmmV, {k}{z}, zmm1","vpermilps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.NDS.512.66.0F38.W0 0C /r","V","V","AVX512F","bscale4,scale64","w,r,r= ,r","","" +"VPERMPD ymm1, ymm2/m256, imm8u","VPERMPD imm8u, ymm2/m256, ymm1","vpermpd= imm8u, ymm2/m256, ymm1","VEX.256.66.0F3A.W1 01 /r ib","V","V","AVX2","","w= ,r,r","","" +"VPERMPD ymm1, {k}{z}, ymm2/m256/m64bcst, imm8u","VPERMPD imm8u, ymm2/m256= /m64bcst, {k}{z}, ymm1","vpermpd imm8u, ymm2/m256/m64bcst, {k}{z}, ymm1","E= VEX.256.66.0F3A.W1 01 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","= w,r,r,r","","" +"VPERMPD zmm1, {k}{z}, zmm2/m512/m64bcst, imm8u","VPERMPD imm8u, zmm2/m512= /m64bcst, {k}{z}, zmm1","vpermpd imm8u, zmm2/m512/m64bcst, {k}{z}, zmm1","E= VEX.512.66.0F3A.W1 01 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r",= "","" +"VPERMPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMPD ymm2/m256/m64bcst= , ymmV, {k}{z}, ymm1","vpermpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W1 16 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,= r,r,r","","" +"VPERMPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMPD zmm2/m512/m64bcst= , zmmV, {k}{z}, zmm1","vpermpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W1 16 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r",""= ,"" +"VPERMPS ymm1, ymmV, ymm2/m256","VPERMPS ymm2/m256, ymmV, ymm1","vpermps y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 16 /r","V","V","AVX2","","w,r= ,r","","" +"VPERMPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMPS ymm2/m256/m32bcst= , ymmV, {k}{z}, ymm1","vpermps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W0 16 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,= r,r,r","","" +"VPERMPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMPS zmm2/m512/m32bcst= , zmmV, {k}{z}, zmm1","vpermps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W0 16 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r",""= ,"" +"VPERMQ ymm1, ymm2/m256, imm8u","VPERMQ imm8u, ymm2/m256, ymm1","vpermq im= m8u, ymm2/m256, ymm1","VEX.256.66.0F3A.W1 00 /r ib","V","V","AVX2","","w,r,= r","","" +"VPERMQ ymm1, {k}{z}, ymm2/m256/m64bcst, imm8u","VPERMQ imm8u, ymm2/m256/m= 64bcst, {k}{z}, ymm1","vpermq imm8u, ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX= .256.66.0F3A.W1 00 /r ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r= ,r,r","","" +"VPERMQ zmm1, {k}{z}, zmm2/m512/m64bcst, imm8u","VPERMQ imm8u, zmm2/m512/m= 64bcst, {k}{z}, 
zmm1","vpermq imm8u, zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX= .512.66.0F3A.W1 00 /r ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","",= "" +"VPERMQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMQ ymm2/m256/m64bcst, = ymmV, {k}{z}, ymm1","vpermq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F38.W1 36 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r= ,r","","" +"VPERMQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMQ zmm2/m512/m64bcst, = zmmV, {k}{z}, zmm1","vpermq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F38.W1 36 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VPERMT2B xmm1, {k}{z}, xmmV, xmm2/m128","VPERMT2B xmm2/m128, xmmV, {k}{z}= , xmm1","vpermt2b xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W0 7= D /r","V","V","AVX512_VBMI+AVX512VL","scale16","rw,r,r,r","","" +"VPERMT2B ymm1, {k}{z}, ymmV, ymm2/m256","VPERMT2B ymm2/m256, ymmV, {k}{z}= , ymm1","vpermt2b ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W0 7= D /r","V","V","AVX512_VBMI+AVX512VL","scale32","rw,r,r,r","","" +"VPERMT2B zmm1, {k}{z}, zmmV, zmm2/m512","VPERMT2B zmm2/m512, zmmV, {k}{z}= , zmm1","vpermt2b zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W0 7= D /r","V","V","AVX512_VBMI","scale64","rw,r,r,r","","" +"VPERMT2D xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPERMT2D xmm2/m128/m32bc= st, xmmV, {k}{z}, xmm1","vpermt2d xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","E= VEX.DDS.128.66.0F38.W0 7E /r","V","V","AVX512F+AVX512VL","bscale4,scale16",= "rw,r,r,r","","" +"VPERMT2D ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMT2D ymm2/m256/m32bc= st, ymmV, {k}{z}, ymm1","vpermt2d ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","E= VEX.DDS.256.66.0F38.W0 7E /r","V","V","AVX512F+AVX512VL","bscale4,scale32",= "rw,r,r,r","","" +"VPERMT2D zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMT2D zmm2/m512/m32bc= st, zmmV, {k}{z}, zmm1","vpermt2d zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","E= VEX.DDS.512.66.0F38.W0 7E /r","V","V","AVX512F","bscale4,scale64","rw,r,r,r= ","","" +"VPERMT2PD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPERMT2PD xmm2/m128/m64= bcst, xmmV, {k}{z}, xmm1","vpermt2pd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.DDS.128.66.0F38.W1 7F /r","V","V","AVX512F+AVX512VL","bscale8,scale1= 6","rw,r,r,r","","" +"VPERMT2PD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMT2PD ymm2/m256/m64= bcst, ymmV, {k}{z}, ymm1","vpermt2pd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.DDS.256.66.0F38.W1 7F /r","V","V","AVX512F+AVX512VL","bscale8,scale3= 2","rw,r,r,r","","" +"VPERMT2PD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMT2PD zmm2/m512/m64= bcst, zmmV, {k}{z}, zmm1","vpermt2pd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.DDS.512.66.0F38.W1 7F /r","V","V","AVX512F","bscale8,scale64","rw,r,= r,r","","" +"VPERMT2PS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPERMT2PS xmm2/m128/m32= bcst, xmmV, {k}{z}, xmm1","vpermt2ps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.DDS.128.66.0F38.W0 7F /r","V","V","AVX512F+AVX512VL","bscale4,scale1= 6","rw,r,r,r","","" +"VPERMT2PS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPERMT2PS ymm2/m256/m32= bcst, ymmV, {k}{z}, ymm1","vpermt2ps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.DDS.256.66.0F38.W0 7F /r","V","V","AVX512F+AVX512VL","bscale4,scale3= 2","rw,r,r,r","","" +"VPERMT2PS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPERMT2PS zmm2/m512/m32= bcst, zmmV, {k}{z}, zmm1","vpermt2ps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.DDS.512.66.0F38.W0 7F /r","V","V","AVX512F","bscale4,scale64","rw,r,= r,r","","" +"VPERMT2Q xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPERMT2Q 
xmm2/m128/m64bc= st, xmmV, {k}{z}, xmm1","vpermt2q xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","E= VEX.DDS.128.66.0F38.W1 7E /r","V","V","AVX512F+AVX512VL","bscale8,scale16",= "rw,r,r,r","","" +"VPERMT2Q ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPERMT2Q ymm2/m256/m64bc= st, ymmV, {k}{z}, ymm1","vpermt2q ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","E= VEX.DDS.256.66.0F38.W1 7E /r","V","V","AVX512F+AVX512VL","bscale8,scale32",= "rw,r,r,r","","" +"VPERMT2Q zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPERMT2Q zmm2/m512/m64bc= st, zmmV, {k}{z}, zmm1","vpermt2q zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","E= VEX.DDS.512.66.0F38.W1 7E /r","V","V","AVX512F","bscale8,scale64","rw,r,r,r= ","","" +"VPERMT2W xmm1, {k}{z}, xmmV, xmm2/m128","VPERMT2W xmm2/m128, xmmV, {k}{z}= , xmm1","vpermt2w xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 7= D /r","V","V","AVX512BW+AVX512VL","scale16","rw,r,r,r","","" +"VPERMT2W ymm1, {k}{z}, ymmV, ymm2/m256","VPERMT2W ymm2/m256, ymmV, {k}{z}= , ymm1","vpermt2w ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 7= D /r","V","V","AVX512BW+AVX512VL","scale32","rw,r,r,r","","" +"VPERMT2W zmm1, {k}{z}, zmmV, zmm2/m512","VPERMT2W zmm2/m512, zmmV, {k}{z}= , zmm1","vpermt2w zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 7= D /r","V","V","AVX512BW","scale64","rw,r,r,r","","" +"VPERMW xmm1, {k}{z}, xmmV, xmm2/m128","VPERMW xmm2/m128, xmmV, {k}{z}, xm= m1","vpermw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 8D /r",= "V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPERMW ymm1, {k}{z}, ymmV, ymm2/m256","VPERMW ymm2/m256, ymmV, {k}{z}, ym= m1","vpermw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 8D /r",= "V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPERMW zmm1, {k}{z}, zmmV, zmm2/m512","VPERMW zmm2/m512, zmmV, {k}{z}, zm= m1","vpermw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 8D /r",= "V","V","AVX512BW","scale64","w,r,r,r","","" +"VPEXPANDB xmm1, {k}{z}, xmm2/m128","VPEXPANDB xmm2/m128, {k}{z}, xmm1","v= pexpandb xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W0 62 /r","V","V","AVX5= 12_VBMI2+AVX512VL","scale1","w,r,r","","" +"VPEXPANDB ymm1, {k}{z}, ymm2/m256","VPEXPANDB ymm2/m256, {k}{z}, ymm1","v= pexpandb ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W0 62 /r","V","V","AVX5= 12_VBMI2+AVX512VL","scale1","w,r,r","","" +"VPEXPANDB zmm1, {k}{z}, zmm2/m512","VPEXPANDB zmm2/m512, {k}{z}, zmm1","v= pexpandb zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W0 62 /r","V","V","AVX5= 12_VBMI2","scale1","w,r,r","","" +"VPEXPANDD xmm1, {k}{z}, xmm2/m128","VPEXPANDD xmm2/m128, {k}{z}, xmm1","v= pexpandd xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W0 89 /r","V","V","AVX5= 12F+AVX512VL","scale4","w,r,r","","" +"VPEXPANDD ymm1, {k}{z}, ymm2/m256","VPEXPANDD ymm2/m256, {k}{z}, ymm1","v= pexpandd ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W0 89 /r","V","V","AVX5= 12F+AVX512VL","scale4","w,r,r","","" +"VPEXPANDD zmm1, {k}{z}, zmm2/m512","VPEXPANDD zmm2/m512, {k}{z}, zmm1","v= pexpandd zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W0 89 /r","V","V","AVX5= 12F","scale4","w,r,r","","" +"VPEXPANDQ xmm1, {k}{z}, xmm2/m128","VPEXPANDQ xmm2/m128, {k}{z}, xmm1","v= pexpandq xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W1 89 /r","V","V","AVX5= 12F+AVX512VL","scale8","w,r,r","","" +"VPEXPANDQ ymm1, {k}{z}, ymm2/m256","VPEXPANDQ ymm2/m256, {k}{z}, ymm1","v= pexpandq ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W1 89 /r","V","V","AVX5= 12F+AVX512VL","scale8","w,r,r","","" +"VPEXPANDQ zmm1, {k}{z}, zmm2/m512","VPEXPANDQ zmm2/m512, {k}{z}, zmm1","v= pexpandq zmm2/m512, 
{k}{z}, zmm1","EVEX.512.66.0F38.W1 89 /r","V","V","AVX5= 12F","scale8","w,r,r","","" +"VPEXPANDW xmm1, {k}{z}, xmm2/m128","VPEXPANDW xmm2/m128, {k}{z}, xmm1","v= pexpandw xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W1 62 /r","V","V","AVX5= 12_VBMI2+AVX512VL","scale2","w,r,r","","" +"VPEXPANDW ymm1, {k}{z}, ymm2/m256","VPEXPANDW ymm2/m256, {k}{z}, ymm1","v= pexpandw ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W1 62 /r","V","V","AVX5= 12_VBMI2+AVX512VL","scale2","w,r,r","","" +"VPEXPANDW zmm1, {k}{z}, zmm2/m512","VPEXPANDW zmm2/m512, {k}{z}, zmm1","v= pexpandw zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W1 62 /r","V","V","AVX5= 12_VBMI2","scale2","w,r,r","","" +"VPEXTRB r32/m8, xmm1, imm8u","VPEXTRB imm8u, xmm1, r32/m8","vpextrb imm8u= , xmm1, r32/m8","EVEX.128.66.0F3A.WIG 14 /r ib","V","V","AVX512BW+AVX512VL"= ,"scale1","w,r,r","","" +"VPEXTRB r32/m8, xmm1, imm8u","VPEXTRB imm8u, xmm1, r32/m8","vpextrb imm8u= , xmm1, r32/m8","VEX.128.66.0F3A.WIG 14 /r ib","V","V","AVX","","w,r,r","",= "" +"VPEXTRD r/m32, xmm1, imm8u","VPEXTRD imm8u, xmm1, r/m32","vpextrd imm8u, = xmm1, r/m32","EVEX.128.66.0F3A.W0 16 /r ib","V","V","AVX512DQ+AVX512VL","sc= ale4","w,r,r","","" +"VPEXTRD r/m32, xmm1, imm8u","VPEXTRD imm8u, xmm1, r/m32","vpextrd imm8u, = xmm1, r/m32","VEX.128.66.0F3A.W0 16 /r ib","V","V","AVX","","w,r,r","","" +"VPEXTRQ r/m64, xmm1, imm8u","VPEXTRQ imm8u, xmm1, r/m64","vpextrq imm8u, = xmm1, r/m64","EVEX.128.66.0F3A.W1 16 /r ib","N.S.","V","AVX512DQ+AVX512VL",= "scale8","w,r,r","","" +"VPEXTRQ r/m64, xmm1, imm8u","VPEXTRQ imm8u, xmm1, r/m64","vpextrq imm8u, = xmm1, r/m64","VEX.128.66.0F3A.W1 16 /r ib","N.S.","V","AVX","","w,r,r","","" +"VPEXTRW r32/m16, xmm1, imm8u","VPEXTRW imm8u, xmm1, r32/m16","vpextrw imm= 8u, xmm1, r32/m16","EVEX.128.66.0F3A.WIG 15 /r ib","V","V","AVX512BW+AVX512= VL","scale2","w,r,r","","" +"VPEXTRW r32/m16, xmm1, imm8u","VPEXTRW imm8u, xmm1, r32/m16","vpextrw imm= 8u, xmm1, r32/m16","VEX.128.66.0F3A.WIG 15 /r ib","V","V","AVX","","w,r,r",= "","" +"VPEXTRW r32, xmm2, imm8u","VPEXTRW imm8u, xmm2, r32","vpextrw imm8u, xmm2= , r32","EVEX.128.66.0F.WIG C5 /r ib","V","V","AVX512BW+AVX512VL","modrm_reg= only","w,r,r","","" +"VPEXTRW r32, xmm2, imm8u","VPEXTRW imm8u, xmm2, r32","vpextrw imm8u, xmm2= , r32","VEX.128.66.0F.WIG C5 /r ib","V","V","AVX","modrm_regonly","w,r,r","= ","" +"VPGATHERDD xmm1, {k1-k7}, vm32x","VPGATHERDD vm32x, {k1-k7}, xmm1","vpgat= herdd vm32x, {k1-k7}, xmm1","EVEX.128.66.0F38.W0 90 /vsib","V","V","AVX512F= +AVX512VL","modrm_memonly,scale4","w,rw,r","","" +"VPGATHERDD ymm1, {k1-k7}, vm32y","VPGATHERDD vm32y, {k1-k7}, ymm1","vpgat= herdd vm32y, {k1-k7}, ymm1","EVEX.256.66.0F38.W0 90 /vsib","V","V","AVX512F= +AVX512VL","modrm_memonly,scale4","w,rw,r","","" +"VPGATHERDD zmm1, {k1-k7}, vm32z","VPGATHERDD vm32z, {k1-k7}, zmm1","vpgat= herdd vm32z, {k1-k7}, zmm1","EVEX.512.66.0F38.W0 90 /vsib","V","V","AVX512F= ","modrm_memonly,scale4","w,rw,r","","" +"VPGATHERDD xmm1, vm32x, xmmV","VPGATHERDD xmmV, vm32x, xmm1","vpgatherdd = xmmV, vm32x, xmm1","VEX.DDS.128.66.0F38.W0 90 /r","V","V","AVX2","modrm_mem= only","rw,r,rw","","" +"VPGATHERDD ymm1, vm32y, ymmV","VPGATHERDD ymmV, vm32y, ymm1","vpgatherdd = ymmV, vm32y, ymm1","VEX.DDS.256.66.0F38.W0 90 /r","V","V","AVX2","modrm_mem= only","rw,r,rw","","" +"VPGATHERDQ xmm1, {k1-k7}, vm32x","VPGATHERDQ vm32x, {k1-k7}, xmm1","vpgat= herdq vm32x, {k1-k7}, xmm1","EVEX.128.66.0F38.W1 90 /vsib","V","V","AVX512F= +AVX512VL","modrm_memonly,scale8","w,rw,r","","" +"VPGATHERDQ ymm1, {k1-k7}, vm32x","VPGATHERDQ vm32x, {k1-k7}, 
ymm1","vpgat= herdq vm32x, {k1-k7}, ymm1","EVEX.256.66.0F38.W1 90 /vsib","V","V","AVX512F= +AVX512VL","modrm_memonly,scale8","w,rw,r","","" +"VPGATHERDQ zmm1, {k1-k7}, vm32y","VPGATHERDQ vm32y, {k1-k7}, zmm1","vpgat= herdq vm32y, {k1-k7}, zmm1","EVEX.512.66.0F38.W1 90 /vsib","V","V","AVX512F= ","modrm_memonly,scale8","w,rw,r","","" +"VPGATHERDQ xmm1, vm32x, xmmV","VPGATHERDQ xmmV, vm32x, xmm1","vpgatherdq = xmmV, vm32x, xmm1","VEX.DDS.128.66.0F38.W1 90 /r","V","V","AVX2","modrm_mem= only","rw,r,rw","","" +"VPGATHERDQ ymm1, vm32x, ymmV","VPGATHERDQ ymmV, vm32x, ymm1","vpgatherdq = ymmV, vm32x, ymm1","VEX.DDS.256.66.0F38.W1 90 /r","V","V","AVX2","modrm_mem= only","rw,r,rw","","" +"VPGATHERQD xmm1, {k1-k7}, vm64x","VPGATHERQD vm64x, {k1-k7}, xmm1","vpgat= herqd vm64x, {k1-k7}, xmm1","EVEX.128.66.0F38.W0 91 /vsib","V","V","AVX512F= +AVX512VL","modrm_memonly,scale4","w,rw,r","","" +"VPGATHERQD xmm1, {k1-k7}, vm64y","VPGATHERQD vm64y, {k1-k7}, xmm1","vpgat= herqd vm64y, {k1-k7}, xmm1","EVEX.256.66.0F38.W0 91 /vsib","V","V","AVX512F= +AVX512VL","modrm_memonly,scale4","w,rw,r","","" +"VPGATHERQD ymm1, {k1-k7}, vm64z","VPGATHERQD vm64z, {k1-k7}, ymm1","vpgat= herqd vm64z, {k1-k7}, ymm1","EVEX.512.66.0F38.W0 91 /vsib","V","V","AVX512F= ","modrm_memonly,scale4","w,rw,r","","" +"VPGATHERQD xmm1, vm64x, xmmV","VPGATHERQD xmmV, vm64x, xmm1","vpgatherqd = xmmV, vm64x, xmm1","VEX.DDS.128.66.0F38.W0 91 /r","V","V","AVX2","modrm_mem= only","rw,r,rw","","" +"VPGATHERQD xmm1, vm64y, xmmV","VPGATHERQD xmmV, vm64y, xmm1","vpgatherqd = xmmV, vm64y, xmm1","VEX.DDS.256.66.0F38.W0 91 /r","V","V","AVX2","modrm_mem= only","rw,r,rw","","" +"VPGATHERQQ xmm1, {k1-k7}, vm64x","VPGATHERQQ vm64x, {k1-k7}, xmm1","vpgat= herqq vm64x, {k1-k7}, xmm1","EVEX.128.66.0F38.W1 91 /vsib","V","V","AVX512F= +AVX512VL","modrm_memonly,scale8","w,rw,r","","" +"VPGATHERQQ ymm1, {k1-k7}, vm64y","VPGATHERQQ vm64y, {k1-k7}, ymm1","vpgat= herqq vm64y, {k1-k7}, ymm1","EVEX.256.66.0F38.W1 91 /vsib","V","V","AVX512F= +AVX512VL","modrm_memonly,scale8","w,rw,r","","" +"VPGATHERQQ zmm1, {k1-k7}, vm64z","VPGATHERQQ vm64z, {k1-k7}, zmm1","vpgat= herqq vm64z, {k1-k7}, zmm1","EVEX.512.66.0F38.W1 91 /vsib","V","V","AVX512F= ","modrm_memonly,scale8","w,rw,r","","" +"VPGATHERQQ xmm1, vm64x, xmmV","VPGATHERQQ xmmV, vm64x, xmm1","vpgatherqq = xmmV, vm64x, xmm1","VEX.DDS.128.66.0F38.W1 91 /r","V","V","AVX2","modrm_mem= only","rw,r,rw","","" +"VPGATHERQQ ymm1, vm64y, ymmV","VPGATHERQQ ymmV, vm64y, ymm1","vpgatherqq = ymmV, vm64y, ymm1","VEX.DDS.256.66.0F38.W1 91 /r","V","V","AVX2","modrm_mem= only","rw,r,rw","","" +"VPHADDBD xmm1, xmm2/m128","VPHADDBD xmm2/m128, xmm1","vphaddbd xmm2/m128,= xmm1","XOP.128.09.W0 C2 /r","V","V","XOP","amd","w,r","","" +"VPHADDBQ xmm1, xmm2/m128","VPHADDBQ xmm2/m128, xmm1","vphaddbq xmm2/m128,= xmm1","XOP.128.09.W0 C3 /r","V","V","XOP","amd","w,r","","" +"VPHADDBW xmm1, xmm2/m128","VPHADDBW xmm2/m128, xmm1","vphaddbw xmm2/m128,= xmm1","XOP.128.09.W0 C1 /r","V","V","XOP","amd","w,r","","" +"VPHADDD xmm1, xmmV, xmm2/m128","VPHADDD xmm2/m128, xmmV, xmm1","vphaddd x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 02 /r","V","V","AVX","","w,r= ,r","","" +"VPHADDD ymm1, ymmV, ymm2/m256","VPHADDD ymm2/m256, ymmV, ymm1","vphaddd y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 02 /r","V","V","AVX2","","w,= r,r","","" +"VPHADDDQ xmm1, xmm2/m128","VPHADDDQ xmm2/m128, xmm1","vphadddq xmm2/m128,= xmm1","XOP.128.09.W0 CB /r","V","V","XOP","amd","w,r","","" +"VPHADDSW xmm1, xmmV, xmm2/m128","VPHADDSW xmm2/m128, xmmV, xmm1","vphadds= w 
xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 03 /r","V","V","AVX","","= w,r,r","","" +"VPHADDSW ymm1, ymmV, ymm2/m256","VPHADDSW ymm2/m256, ymmV, ymm1","vphadds= w ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 03 /r","V","V","AVX2","",= "w,r,r","","" +"VPHADDUBD xmm1, xmm2/m128","VPHADDUBD xmm2/m128, xmm1","vphaddubd xmm2/m1= 28, xmm1","XOP.128.09.W0 D2 /r","V","V","XOP","amd","w,r","","" +"VPHADDUBQ xmm1, xmm2/m128","VPHADDUBQ xmm2/m128, xmm1","vphaddubq xmm2/m1= 28, xmm1","XOP.128.09.W0 D3 /r","V","V","XOP","amd","w,r","","" +"VPHADDUBW xmm1, xmm2/m128","VPHADDUBW xmm2/m128, xmm1","vphaddubw xmm2/m1= 28, xmm1","XOP.128.09.W0 D1 /r","V","V","XOP","amd","w,r","","" +"VPHADDUDQ xmm1, xmm2/m128","VPHADDUDQ xmm2/m128, xmm1","vphaddudq xmm2/m1= 28, xmm1","XOP.128.09.W0 DB /r","V","V","XOP","amd","w,r","","" +"VPHADDUWD xmm1, xmm2/m128","VPHADDUWD xmm2/m128, xmm1","vphadduwd xmm2/m1= 28, xmm1","XOP.128.09.W0 D6 /r","V","V","XOP","amd","w,r","","" +"VPHADDUWQ xmm1, xmm2/m128","VPHADDUWQ xmm2/m128, xmm1","vphadduwq xmm2/m1= 28, xmm1","XOP.128.09.W0 D7 /r","V","V","XOP","amd","w,r","","" +"VPHADDW xmm1, xmmV, xmm2/m128","VPHADDW xmm2/m128, xmmV, xmm1","vphaddw x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 01 /r","V","V","AVX","","w,r= ,r","","" +"VPHADDW ymm1, ymmV, ymm2/m256","VPHADDW ymm2/m256, ymmV, ymm1","vphaddw y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 01 /r","V","V","AVX2","","w,= r,r","","" +"VPHADDWD xmm1, xmm2/m128","VPHADDWD xmm2/m128, xmm1","vphaddwd xmm2/m128,= xmm1","XOP.128.09.W0 C6 /r","V","V","XOP","amd","w,r","","" +"VPHADDWQ xmm1, xmm2/m128","VPHADDWQ xmm2/m128, xmm1","vphaddwq xmm2/m128,= xmm1","XOP.128.09.W0 C7 /r","V","V","XOP","amd","w,r","","" +"VPHMINPOSUW xmm1, xmm2/m128","VPHMINPOSUW xmm2/m128, xmm1","vphminposuw x= mm2/m128, xmm1","VEX.128.66.0F38.WIG 41 /r","V","V","AVX","","w,r","","" +"VPHSUBBW xmm1, xmm2/m128","VPHSUBBW xmm2/m128, xmm1","vphsubbw xmm2/m128,= xmm1","XOP.128.09.W0 E1 /r","V","V","XOP","amd","w,r","","" +"VPHSUBD xmm1, xmmV, xmm2/m128","VPHSUBD xmm2/m128, xmmV, xmm1","vphsubd x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 06 /r","V","V","AVX","","w,r= ,r","","" +"VPHSUBD ymm1, ymmV, ymm2/m256","VPHSUBD ymm2/m256, ymmV, ymm1","vphsubd y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 06 /r","V","V","AVX2","","w,= r,r","","" +"VPHSUBDQ xmm1, xmm2/m128","VPHSUBDQ xmm2/m128, xmm1","vphsubdq xmm2/m128,= xmm1","XOP.128.09.W0 E3 /r","V","V","XOP","amd","w,r","","" +"VPHSUBSW xmm1, xmmV, xmm2/m128","VPHSUBSW xmm2/m128, xmmV, xmm1","vphsubs= w xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 07 /r","V","V","AVX","","= w,r,r","","" +"VPHSUBSW ymm1, ymmV, ymm2/m256","VPHSUBSW ymm2/m256, ymmV, ymm1","vphsubs= w ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 07 /r","V","V","AVX2","",= "w,r,r","","" +"VPHSUBW xmm1, xmmV, xmm2/m128","VPHSUBW xmm2/m128, xmmV, xmm1","vphsubw x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 05 /r","V","V","AVX","","w,r= ,r","","" +"VPHSUBW ymm1, ymmV, ymm2/m256","VPHSUBW ymm2/m256, ymmV, ymm1","vphsubw y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 05 /r","V","V","AVX2","","w,= r,r","","" +"VPHSUBWD xmm1, xmm2/m128","VPHSUBWD xmm2/m128, xmm1","vphsubwd xmm2/m128,= xmm1","XOP.128.09.W0 E2 /r","V","V","XOP","amd","w,r","","" +"VPINSRB xmm1, xmmV, r32/m8, imm8u","VPINSRB imm8u, r32/m8, xmmV, xmm1","v= pinsrb imm8u, r32/m8, xmmV, xmm1","EVEX.NDS.128.66.0F3A.WIG 20 /r ib","V","= V","AVX512BW+AVX512VL","scale1","w,r,r,r","","" +"VPINSRB xmm1, xmmV, r32/m8, imm8u","VPINSRB imm8u, r32/m8, xmmV, xmm1","v= pinsrb imm8u, 
r32/m8, xmmV, xmm1","VEX.NDS.128.66.0F3A.WIG 20 /r ib","V","V= ","AVX","","w,r,r,r","","" +"VPINSRD xmm1, xmmV, r/m32, imm8u","VPINSRD imm8u, r/m32, xmmV, xmm1","vpi= nsrd imm8u, r/m32, xmmV, xmm1","EVEX.NDS.128.66.0F3A.W0 22 /r ib","V","V","= AVX512DQ+AVX512VL","scale4","w,r,r,r","","" +"VPINSRD xmm1, xmmV, r/m32, imm8u","VPINSRD imm8u, r/m32, xmmV, xmm1","vpi= nsrd imm8u, r/m32, xmmV, xmm1","VEX.NDS.128.66.0F3A.W0 22 /r ib","V","V","A= VX","","w,r,r,r","","" +"VPINSRQ xmm1, xmmV, r/m64, imm8u","VPINSRQ imm8u, r/m64, xmmV, xmm1","vpi= nsrq imm8u, r/m64, xmmV, xmm1","EVEX.NDS.128.66.0F3A.W1 22 /r ib","N.S.","V= ","AVX512DQ+AVX512VL","scale8","w,r,r,r","","" +"VPINSRQ xmm1, xmmV, r/m64, imm8u","VPINSRQ imm8u, r/m64, xmmV, xmm1","vpi= nsrq imm8u, r/m64, xmmV, xmm1","VEX.NDS.128.66.0F3A.W1 22 /r ib","N.S.","V"= ,"AVX","","w,r,r,r","","" +"VPINSRW xmm1, xmmV, r32/m16, imm8u","VPINSRW imm8u, r32/m16, xmmV, xmm1",= "vpinsrw imm8u, r32/m16, xmmV, xmm1","EVEX.NDS.128.66.0F.WIG C4 /r ib","V",= "V","AVX512BW+AVX512VL","scale2","w,r,r,r","","" +"VPINSRW xmm1, xmmV, r32/m16, imm8u","VPINSRW imm8u, r32/m16, xmmV, xmm1",= "vpinsrw imm8u, r32/m16, xmmV, xmm1","VEX.NDS.128.66.0F.WIG C4 /r ib","V","= V","AVX","","w,r,r,r","","" +"VPLZCNTD xmm1, {k}{z}, xmm2/m128/m32bcst","VPLZCNTD xmm2/m128/m32bcst, {k= }{z}, xmm1","vplzcntd xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W0= 44 /r","V","V","AVX512CD+AVX512VL","bscale4,scale16","w,r,r","","" +"VPLZCNTD ymm1, {k}{z}, ymm2/m256/m32bcst","VPLZCNTD ymm2/m256/m32bcst, {k= }{z}, ymm1","vplzcntd ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W0= 44 /r","V","V","AVX512CD+AVX512VL","bscale4,scale32","w,r,r","","" +"VPLZCNTD zmm1, {k}{z}, zmm2/m512/m32bcst","VPLZCNTD zmm2/m512/m32bcst, {k= }{z}, zmm1","vplzcntd zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0= 44 /r","V","V","AVX512CD","bscale4,scale64","w,r,r","","" +"VPLZCNTQ xmm1, {k}{z}, xmm2/m128/m64bcst","VPLZCNTQ xmm2/m128/m64bcst, {k= }{z}, xmm1","vplzcntq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W1= 44 /r","V","V","AVX512CD+AVX512VL","bscale8,scale16","w,r,r","","" +"VPLZCNTQ ymm1, {k}{z}, ymm2/m256/m64bcst","VPLZCNTQ ymm2/m256/m64bcst, {k= }{z}, ymm1","vplzcntq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W1= 44 /r","V","V","AVX512CD+AVX512VL","bscale8,scale32","w,r,r","","" +"VPLZCNTQ zmm1, {k}{z}, zmm2/m512/m64bcst","VPLZCNTQ zmm2/m512/m64bcst, {k= }{z}, zmm1","vplzcntq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1= 44 /r","V","V","AVX512CD","bscale8,scale64","w,r,r","","" +"VPMACSDD xmm1, xmmV, xmm2/m128, xmmIH","VPMACSDD xmmIH, xmm2/m128, xmmV, = xmm1","vpmacsdd xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 9E /r /is4= ","V","V","XOP","amd","w,r,r,r","","" +"VPMACSDQH xmm1, xmmV, xmm2/m128, xmmIH","VPMACSDQH xmmIH, xmm2/m128, xmmV= , xmm1","vpmacsdqh xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 9F /r /= is4","V","V","XOP","amd","w,r,r,r","","" +"VPMACSDQL xmm1, xmmV, xmm2/m128, xmmIH","VPMACSDQL xmmIH, xmm2/m128, xmmV= , xmm1","vpmacsdql xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 97 /r /= is4","V","V","XOP","amd","w,r,r,r","","" +"VPMACSSDD xmm1, xmmV, xmm2/m128, xmmIH","VPMACSSDD xmmIH, xmm2/m128, xmmV= , xmm1","vpmacssdd xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 8E /r /= is4","V","V","XOP","amd","w,r,r,r","","" +"VPMACSSDQH xmm1, xmmV, xmm2/m128, xmmIH","VPMACSSDQH xmmIH, xmm2/m128, xm= mV, xmm1","vpmacssdqh xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 8F /= r /is4","V","V","XOP","amd","w,r,r,r","","" +"VPMACSSDQL xmm1, xmmV, xmm2/m128, 
xmmIH","VPMACSSDQL xmmIH, xmm2/m128, xm= mV, xmm1","vpmacssdql xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 87 /= r /is4","V","V","XOP","amd","w,r,r,r","","" +"VPMACSSWD xmm1, xmmV, xmm2/m128, xmmIH","VPMACSSWD xmmIH, xmm2/m128, xmmV= , xmm1","vpmacsswd xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 86 /r /= is4","V","V","XOP","amd","w,r,r,r","","" +"VPMACSSWW xmm1, xmmV, xmm2/m128, xmmIH","VPMACSSWW xmmIH, xmm2/m128, xmmV= , xmm1","vpmacssww xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 85 /r /= is4","V","V","XOP","amd","w,r,r,r","","" +"VPMACSWD xmm1, xmmV, xmm2/m128, xmmIH","VPMACSWD xmmIH, xmm2/m128, xmmV, = xmm1","vpmacswd xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 96 /r /is4= ","V","V","XOP","amd","w,r,r,r","","" +"VPMACSWW xmm1, xmmV, xmm2/m128, xmmIH","VPMACSWW xmmIH, xmm2/m128, xmmV, = xmm1","vpmacsww xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 95 /r /is4= ","V","V","XOP","amd","w,r,r,r","","" +"VPMADCSSWD xmm1, xmmV, xmm2/m128, xmmIH","VPMADCSSWD xmmIH, xmm2/m128, xm= mV, xmm1","vpmadcsswd xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 A6 /= r /is4","V","V","XOP","amd","w,r,r,r","","" +"VPMADCSWD xmm1, xmmV, xmm2/m128, xmmIH","VPMADCSWD xmmIH, xmm2/m128, xmmV= , xmm1","vpmadcswd xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 B6 /r /= is4","V","V","XOP","amd","w,r,r,r","","" +"VPMADD52HUQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMADD52HUQ xmm2/m128= /m64bcst, xmmV, {k}{z}, xmm1","vpmadd52huq xmm2/m128/m64bcst, xmmV, {k}{z},= xmm1","EVEX.DDS.128.66.0F38.W1 B5 /r","V","V","AVX512_IFMA+AVX512VL","bsca= le8,scale16","rw,r,r,r","","" +"VPMADD52HUQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMADD52HUQ ymm2/m256= /m64bcst, ymmV, {k}{z}, ymm1","vpmadd52huq ymm2/m256/m64bcst, ymmV, {k}{z},= ymm1","EVEX.DDS.256.66.0F38.W1 B5 /r","V","V","AVX512_IFMA+AVX512VL","bsca= le8,scale32","rw,r,r,r","","" +"VPMADD52HUQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMADD52HUQ zmm2/m512= /m64bcst, zmmV, {k}{z}, zmm1","vpmadd52huq zmm2/m512/m64bcst, zmmV, {k}{z},= zmm1","EVEX.DDS.512.66.0F38.W1 B5 /r","V","V","AVX512_IFMA","bscale8,scale= 64","rw,r,r,r","","" +"VPMADD52LUQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMADD52LUQ xmm2/m128= /m64bcst, xmmV, {k}{z}, xmm1","vpmadd52luq xmm2/m128/m64bcst, xmmV, {k}{z},= xmm1","EVEX.DDS.128.66.0F38.W1 B4 /r","V","V","AVX512_IFMA+AVX512VL","bsca= le8,scale16","rw,r,r,r","","" +"VPMADD52LUQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMADD52LUQ ymm2/m256= /m64bcst, ymmV, {k}{z}, ymm1","vpmadd52luq ymm2/m256/m64bcst, ymmV, {k}{z},= ymm1","EVEX.DDS.256.66.0F38.W1 B4 /r","V","V","AVX512_IFMA+AVX512VL","bsca= le8,scale32","rw,r,r,r","","" +"VPMADD52LUQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMADD52LUQ zmm2/m512= /m64bcst, zmmV, {k}{z}, zmm1","vpmadd52luq zmm2/m512/m64bcst, zmmV, {k}{z},= zmm1","EVEX.DDS.512.66.0F38.W1 B4 /r","V","V","AVX512_IFMA","bscale8,scale= 64","rw,r,r,r","","" +"VPMADDUBSW xmm1, xmmV, xmm2/m128","VPMADDUBSW xmm2/m128, xmmV, xmm1","vpm= addubsw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 04 /r","V","V","AVX= ","","w,r,r","","" +"VPMADDUBSW xmm1, {k}{z}, xmmV, xmm2/m128","VPMADDUBSW xmm2/m128, xmmV, {k= }{z}, xmm1","vpmaddubsw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3= 8.WIG 04 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPMADDUBSW ymm1, ymmV, ymm2/m256","VPMADDUBSW ymm2/m256, ymmV, ymm1","vpm= addubsw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 04 /r","V","V","AVX= 2","","w,r,r","","" +"VPMADDUBSW ymm1, {k}{z}, ymmV, ymm2/m256","VPMADDUBSW ymm2/m256, ymmV, {k= }{z}, ymm1","vpmaddubsw ymm2/m256, 
ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3= 8.WIG 04 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPMADDUBSW zmm1, {k}{z}, zmmV, zmm2/m512","VPMADDUBSW zmm2/m512, zmmV, {k= }{z}, zmm1","vpmaddubsw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3= 8.WIG 04 /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPMADDWD xmm1, xmmV, xmm2/m128","VPMADDWD xmm2/m128, xmmV, xmm1","vpmaddw= d xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F5 /r","V","V","AVX","","w,= r,r","","" +"VPMADDWD xmm1, {k}{z}, xmmV, xmm2/m128","VPMADDWD xmm2/m128, xmmV, {k}{z}= , xmm1","vpmaddwd xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG F5= /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPMADDWD ymm1, ymmV, ymm2/m256","VPMADDWD ymm2/m256, ymmV, ymm1","vpmaddw= d ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F5 /r","V","V","AVX2","","w= ,r,r","","" +"VPMADDWD ymm1, {k}{z}, ymmV, ymm2/m256","VPMADDWD ymm2/m256, ymmV, {k}{z}= , ymm1","vpmaddwd ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG F5= /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPMADDWD zmm1, {k}{z}, zmmV, zmm2/m512","VPMADDWD zmm2/m512, zmmV, {k}{z}= , zmm1","vpmaddwd zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG F5= /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPMASKMOVD xmm1, xmmV, m128","VPMASKMOVD m128, xmmV, xmm1","vpmaskmovd m1= 28, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 8C /r","V","V","AVX2","modrm_memonl= y","w,r,r","","" +"VPMASKMOVD ymm1, ymmV, m256","VPMASKMOVD m256, ymmV, ymm1","vpmaskmovd m2= 56, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 8C /r","V","V","AVX2","modrm_memonl= y","w,r,r","","" +"VPMASKMOVD m128, xmmV, xmm1","VPMASKMOVD xmm1, xmmV, m128","vpmaskmovd xm= m1, xmmV, m128","VEX.NDS.128.66.0F38.W0 8E /r","V","V","AVX2","modrm_memonl= y","w,r,r","","" +"VPMASKMOVD m256, ymmV, ymm1","VPMASKMOVD ymm1, ymmV, m256","vpmaskmovd ym= m1, ymmV, m256","VEX.NDS.256.66.0F38.W0 8E /r","V","V","AVX2","modrm_memonl= y","w,r,r","","" +"VPMASKMOVQ xmm1, xmmV, m128","VPMASKMOVQ m128, xmmV, xmm1","vpmaskmovq m1= 28, xmmV, xmm1","VEX.NDS.128.66.0F38.W1 8C /r","V","V","AVX2","modrm_memonl= y","w,r,r","","" +"VPMASKMOVQ ymm1, ymmV, m256","VPMASKMOVQ m256, ymmV, ymm1","vpmaskmovq m2= 56, ymmV, ymm1","VEX.NDS.256.66.0F38.W1 8C /r","V","V","AVX2","modrm_memonl= y","w,r,r","","" +"VPMASKMOVQ m128, xmmV, xmm1","VPMASKMOVQ xmm1, xmmV, m128","vpmaskmovq xm= m1, xmmV, m128","VEX.NDS.128.66.0F38.W1 8E /r","V","V","AVX2","modrm_memonl= y","w,r,r","","" +"VPMASKMOVQ m256, ymmV, ymm1","VPMASKMOVQ ymm1, ymmV, m256","vpmaskmovq ym= m1, ymmV, m256","VEX.NDS.256.66.0F38.W1 8E /r","V","V","AVX2","modrm_memonl= y","w,r,r","","" +"VPMAXSB xmm1, xmmV, xmm2/m128","VPMAXSB xmm2/m128, xmmV, xmm1","vpmaxsb x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 3C /r","V","V","AVX","","w,r= ,r","","" +"VPMAXSB xmm1, {k}{z}, xmmV, xmm2/m128","VPMAXSB xmm2/m128, xmmV, {k}{z}, = xmm1","vpmaxsb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.WIG 3C = /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPMAXSB ymm1, ymmV, ymm2/m256","VPMAXSB ymm2/m256, ymmV, ymm1","vpmaxsb y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 3C /r","V","V","AVX2","","w,= r,r","","" +"VPMAXSB ymm1, {k}{z}, ymmV, ymm2/m256","VPMAXSB ymm2/m256, ymmV, {k}{z}, = ymm1","vpmaxsb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.WIG 3C = /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPMAXSB zmm1, {k}{z}, zmmV, zmm2/m512","VPMAXSB zmm2/m512, zmmV, {k}{z}, = zmm1","vpmaxsb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.WIG 3C = 
/r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPMAXSD xmm1, xmmV, xmm2/m128","VPMAXSD xmm2/m128, xmmV, xmm1","vpmaxsd x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 3D /r","V","V","AVX","","w,r= ,r","","" +"VPMAXSD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPMAXSD xmm2/m128/m32bcst= , xmmV, {k}{z}, xmm1","vpmaxsd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W0 3D /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,= r,r,r","","" +"VPMAXSD ymm1, ymmV, ymm2/m256","VPMAXSD ymm2/m256, ymmV, ymm1","vpmaxsd y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 3D /r","V","V","AVX2","","w,= r,r","","" +"VPMAXSD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPMAXSD ymm2/m256/m32bcst= , ymmV, {k}{z}, ymm1","vpmaxsd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W0 3D /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,= r,r,r","","" +"VPMAXSD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPMAXSD zmm2/m512/m32bcst= , zmmV, {k}{z}, zmm1","vpmaxsd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W0 3D /r","V","V","AVX512F","bscale4,scale64","w,r,r,r",""= ,"" +"VPMAXSQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMAXSQ xmm2/m128/m64bcst= , xmmV, {k}{z}, xmm1","vpmaxsq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W1 3D /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,= r,r,r","","" +"VPMAXSQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMAXSQ ymm2/m256/m64bcst= , ymmV, {k}{z}, ymm1","vpmaxsq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W1 3D /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,= r,r,r","","" +"VPMAXSQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMAXSQ zmm2/m512/m64bcst= , zmmV, {k}{z}, zmm1","vpmaxsq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W1 3D /r","V","V","AVX512F","bscale8,scale64","w,r,r,r",""= ,"" +"VPMAXSW xmm1, xmmV, xmm2/m128","VPMAXSW xmm2/m128, xmmV, xmm1","vpmaxsw x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG EE /r","V","V","AVX","","w,r,r= ","","" +"VPMAXSW xmm1, {k}{z}, xmmV, xmm2/m128","VPMAXSW xmm2/m128, xmmV, {k}{z}, = xmm1","vpmaxsw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG EE /r= ","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPMAXSW ymm1, ymmV, ymm2/m256","VPMAXSW ymm2/m256, ymmV, ymm1","vpmaxsw y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG EE /r","V","V","AVX2","","w,r,= r","","" +"VPMAXSW ymm1, {k}{z}, ymmV, ymm2/m256","VPMAXSW ymm2/m256, ymmV, {k}{z}, = ymm1","vpmaxsw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG EE /r= ","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPMAXSW zmm1, {k}{z}, zmmV, zmm2/m512","VPMAXSW zmm2/m512, zmmV, {k}{z}, = zmm1","vpmaxsw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG EE /r= ","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPMAXUB xmm1, xmmV, xmm2/m128","VPMAXUB xmm2/m128, xmmV, xmm1","vpmaxub x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG DE /r","V","V","AVX","","w,r,r= ","","" +"VPMAXUB xmm1, {k}{z}, xmmV, xmm2/m128","VPMAXUB xmm2/m128, xmmV, {k}{z}, = xmm1","vpmaxub xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG DE /r= ","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPMAXUB ymm1, ymmV, ymm2/m256","VPMAXUB ymm2/m256, ymmV, ymm1","vpmaxub y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG DE /r","V","V","AVX2","","w,r,= r","","" +"VPMAXUB ymm1, {k}{z}, ymmV, ymm2/m256","VPMAXUB ymm2/m256, ymmV, {k}{z}, = ymm1","vpmaxub ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG DE /r= ","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPMAXUB zmm1, {k}{z}, zmmV, 
zmm2/m512","VPMAXUB zmm2/m512, zmmV, {k}{z}, = zmm1","vpmaxub zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG DE /r= ","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPMAXUD xmm1, xmmV, xmm2/m128","VPMAXUD xmm2/m128, xmmV, xmm1","vpmaxud x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 3F /r","V","V","AVX","","w,r= ,r","","" +"VPMAXUD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPMAXUD xmm2/m128/m32bcst= , xmmV, {k}{z}, xmm1","vpmaxud xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W0 3F /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,= r,r,r","","" +"VPMAXUD ymm1, ymmV, ymm2/m256","VPMAXUD ymm2/m256, ymmV, ymm1","vpmaxud y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 3F /r","V","V","AVX2","","w,= r,r","","" +"VPMAXUD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPMAXUD ymm2/m256/m32bcst= , ymmV, {k}{z}, ymm1","vpmaxud ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W0 3F /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,= r,r,r","","" +"VPMAXUD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPMAXUD zmm2/m512/m32bcst= , zmmV, {k}{z}, zmm1","vpmaxud zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W0 3F /r","V","V","AVX512F","bscale4,scale64","w,r,r,r",""= ,"" +"VPMAXUQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMAXUQ xmm2/m128/m64bcst= , xmmV, {k}{z}, xmm1","vpmaxuq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W1 3F /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,= r,r,r","","" +"VPMAXUQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMAXUQ ymm2/m256/m64bcst= , ymmV, {k}{z}, ymm1","vpmaxuq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W1 3F /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,= r,r,r","","" +"VPMAXUQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMAXUQ zmm2/m512/m64bcst= , zmmV, {k}{z}, zmm1","vpmaxuq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W1 3F /r","V","V","AVX512F","bscale8,scale64","w,r,r,r",""= ,"" +"VPMAXUW xmm1, xmmV, xmm2/m128","VPMAXUW xmm2/m128, xmmV, xmm1","vpmaxuw x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 3E /r","V","V","AVX","","w,r= ,r","","" +"VPMAXUW xmm1, {k}{z}, xmmV, xmm2/m128","VPMAXUW xmm2/m128, xmmV, {k}{z}, = xmm1","vpmaxuw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.WIG 3E = /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPMAXUW ymm1, ymmV, ymm2/m256","VPMAXUW ymm2/m256, ymmV, ymm1","vpmaxuw y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 3E /r","V","V","AVX2","","w,= r,r","","" +"VPMAXUW ymm1, {k}{z}, ymmV, ymm2/m256","VPMAXUW ymm2/m256, ymmV, {k}{z}, = ymm1","vpmaxuw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.WIG 3E = /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPMAXUW zmm1, {k}{z}, zmmV, zmm2/m512","VPMAXUW zmm2/m512, zmmV, {k}{z}, = zmm1","vpmaxuw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.WIG 3E = /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPMINSB xmm1, xmmV, xmm2/m128","VPMINSB xmm2/m128, xmmV, xmm1","vpminsb x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 38 /r","V","V","AVX","","w,r= ,r","","" +"VPMINSB xmm1, {k}{z}, xmmV, xmm2/m128","VPMINSB xmm2/m128, xmmV, {k}{z}, = xmm1","vpminsb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.WIG 38 = /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPMINSB ymm1, ymmV, ymm2/m256","VPMINSB ymm2/m256, ymmV, ymm1","vpminsb y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 38 /r","V","V","AVX2","","w,= r,r","","" +"VPMINSB ymm1, {k}{z}, ymmV, ymm2/m256","VPMINSB ymm2/m256, ymmV, {k}{z}, = ymm1","vpminsb ymm2/m256, ymmV, 
{k}{z}, ymm1","EVEX.NDS.256.66.0F38.WIG 38 = /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPMINSB zmm1, {k}{z}, zmmV, zmm2/m512","VPMINSB zmm2/m512, zmmV, {k}{z}, = zmm1","vpminsb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.WIG 38 = /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPMINSD xmm1, xmmV, xmm2/m128","VPMINSD xmm2/m128, xmmV, xmm1","vpminsd x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 39 /r","V","V","AVX","","w,r= ,r","","" +"VPMINSD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPMINSD xmm2/m128/m32bcst= , xmmV, {k}{z}, xmm1","vpminsd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W0 39 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,= r,r,r","","" +"VPMINSD ymm1, ymmV, ymm2/m256","VPMINSD ymm2/m256, ymmV, ymm1","vpminsd y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 39 /r","V","V","AVX2","","w,= r,r","","" +"VPMINSD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPMINSD ymm2/m256/m32bcst= , ymmV, {k}{z}, ymm1","vpminsd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W0 39 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,= r,r,r","","" +"VPMINSD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPMINSD zmm2/m512/m32bcst= , zmmV, {k}{z}, zmm1","vpminsd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W0 39 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r",""= ,"" +"VPMINSQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMINSQ xmm2/m128/m64bcst= , xmmV, {k}{z}, xmm1","vpminsq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W1 39 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,= r,r,r","","" +"VPMINSQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMINSQ ymm2/m256/m64bcst= , ymmV, {k}{z}, ymm1","vpminsq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W1 39 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,= r,r,r","","" +"VPMINSQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMINSQ zmm2/m512/m64bcst= , zmmV, {k}{z}, zmm1","vpminsq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W1 39 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r",""= ,"" +"VPMINSW xmm1, xmmV, xmm2/m128","VPMINSW xmm2/m128, xmmV, xmm1","vpminsw x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG EA /r","V","V","AVX","","w,r,r= ","","" +"VPMINSW xmm1, {k}{z}, xmmV, xmm2/m128","VPMINSW xmm2/m128, xmmV, {k}{z}, = xmm1","vpminsw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG EA /r= ","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPMINSW ymm1, ymmV, ymm2/m256","VPMINSW ymm2/m256, ymmV, ymm1","vpminsw y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG EA /r","V","V","AVX2","","w,r,= r","","" +"VPMINSW ymm1, {k}{z}, ymmV, ymm2/m256","VPMINSW ymm2/m256, ymmV, {k}{z}, = ymm1","vpminsw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG EA /r= ","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPMINSW zmm1, {k}{z}, zmmV, zmm2/m512","VPMINSW zmm2/m512, zmmV, {k}{z}, = zmm1","vpminsw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG EA /r= ","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPMINUB xmm1, xmmV, xmm2/m128","VPMINUB xmm2/m128, xmmV, xmm1","vpminub x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG DA /r","V","V","AVX","","w,r,r= ","","" +"VPMINUB xmm1, {k}{z}, xmmV, xmm2/m128","VPMINUB xmm2/m128, xmmV, {k}{z}, = xmm1","vpminub xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG DA /r= ","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPMINUB ymm1, ymmV, ymm2/m256","VPMINUB ymm2/m256, ymmV, ymm1","vpminub y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG DA /r","V","V","AVX2","","w,r,= 
r","","" +"VPMINUB ymm1, {k}{z}, ymmV, ymm2/m256","VPMINUB ymm2/m256, ymmV, {k}{z}, = ymm1","vpminub ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG DA /r= ","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPMINUB zmm1, {k}{z}, zmmV, zmm2/m512","VPMINUB zmm2/m512, zmmV, {k}{z}, = zmm1","vpminub zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG DA /r= ","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPMINUD xmm1, xmmV, xmm2/m128","VPMINUD xmm2/m128, xmmV, xmm1","vpminud x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 3B /r","V","V","AVX","","w,r= ,r","","" +"VPMINUD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPMINUD xmm2/m128/m32bcst= , xmmV, {k}{z}, xmm1","vpminud xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W0 3B /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,= r,r,r","","" +"VPMINUD ymm1, ymmV, ymm2/m256","VPMINUD ymm2/m256, ymmV, ymm1","vpminud y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 3B /r","V","V","AVX2","","w,= r,r","","" +"VPMINUD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPMINUD ymm2/m256/m32bcst= , ymmV, {k}{z}, ymm1","vpminud ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W0 3B /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,= r,r,r","","" +"VPMINUD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPMINUD zmm2/m512/m32bcst= , zmmV, {k}{z}, zmm1","vpminud zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W0 3B /r","V","V","AVX512F","bscale4,scale64","w,r,r,r",""= ,"" +"VPMINUQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMINUQ xmm2/m128/m64bcst= , xmmV, {k}{z}, xmm1","vpminuq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W1 3B /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,= r,r,r","","" +"VPMINUQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMINUQ ymm2/m256/m64bcst= , ymmV, {k}{z}, ymm1","vpminuq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W1 3B /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,= r,r,r","","" +"VPMINUQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMINUQ zmm2/m512/m64bcst= , zmmV, {k}{z}, zmm1","vpminuq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W1 3B /r","V","V","AVX512F","bscale8,scale64","w,r,r,r",""= ,"" +"VPMINUW xmm1, xmmV, xmm2/m128","VPMINUW xmm2/m128, xmmV, xmm1","vpminuw x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 3A /r","V","V","AVX","","w,r= ,r","","" +"VPMINUW xmm1, {k}{z}, xmmV, xmm2/m128","VPMINUW xmm2/m128, xmmV, {k}{z}, = xmm1","vpminuw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.WIG 3A = /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPMINUW ymm1, ymmV, ymm2/m256","VPMINUW ymm2/m256, ymmV, ymm1","vpminuw y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 3A /r","V","V","AVX2","","w,= r,r","","" +"VPMINUW ymm1, {k}{z}, ymmV, ymm2/m256","VPMINUW ymm2/m256, ymmV, {k}{z}, = ymm1","vpminuw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.WIG 3A = /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPMINUW zmm1, {k}{z}, zmmV, zmm2/m512","VPMINUW zmm2/m512, zmmV, {k}{z}, = zmm1","vpminuw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.WIG 3A = /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPMOVB2M k1, xmm2","VPMOVB2M xmm2, k1","vpmovb2m xmm2, k1","EVEX.128.F3.0= F38.W0 29 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","","" +"VPMOVB2M k1, ymm2","VPMOVB2M ymm2, k1","vpmovb2m ymm2, k1","EVEX.256.F3.0= F38.W0 29 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","","" +"VPMOVB2M k1, zmm2","VPMOVB2M zmm2, k1","vpmovb2m zmm2, k1","EVEX.512.F3.0= F38.W0 29 
/r","V","V","AVX512BW","modrm_regonly","w,r","","" +"VPMOVD2M k1, xmm2","VPMOVD2M xmm2, k1","vpmovd2m xmm2, k1","EVEX.128.F3.0= F38.W0 39 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","","" +"VPMOVD2M k1, ymm2","VPMOVD2M ymm2, k1","vpmovd2m ymm2, k1","EVEX.256.F3.0= F38.W0 39 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","","" +"VPMOVD2M k1, zmm2","VPMOVD2M zmm2, k1","vpmovd2m zmm2, k1","EVEX.512.F3.0= F38.W0 39 /r","V","V","AVX512DQ","modrm_regonly","w,r","","" +"VPMOVDB xmm2/m32, {k}{z}, xmm1","VPMOVDB xmm1, {k}{z}, xmm2/m32","vpmovdb= xmm1, {k}{z}, xmm2/m32","EVEX.128.F3.0F38.W0 31 /r","V","V","AVX512F+AVX51= 2VL","scale4","w,r,r","","" +"VPMOVDB xmm2/m64, {k}{z}, ymm1","VPMOVDB ymm1, {k}{z}, xmm2/m64","vpmovdb= ymm1, {k}{z}, xmm2/m64","EVEX.256.F3.0F38.W0 31 /r","V","V","AVX512F+AVX51= 2VL","scale8","w,r,r","","" +"VPMOVDB xmm2/m128, {k}{z}, zmm1","VPMOVDB zmm1, {k}{z}, xmm2/m128","vpmov= db zmm1, {k}{z}, xmm2/m128","EVEX.512.F3.0F38.W0 31 /r","V","V","AVX512F","= scale16","w,r,r","","" +"VPMOVDW xmm2/m64, {k}{z}, xmm1","VPMOVDW xmm1, {k}{z}, xmm2/m64","vpmovdw= xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 33 /r","V","V","AVX512F+AVX51= 2VL","scale8","w,r,r","","" +"VPMOVDW xmm2/m128, {k}{z}, ymm1","VPMOVDW ymm1, {k}{z}, xmm2/m128","vpmov= dw ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 33 /r","V","V","AVX512F+AV= X512VL","scale16","w,r,r","","" +"VPMOVDW ymm2/m256, {k}{z}, zmm1","VPMOVDW zmm1, {k}{z}, ymm2/m256","vpmov= dw zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 33 /r","V","V","AVX512F","= scale32","w,r,r","","" +"VPMOVM2B xmm1, k2","VPMOVM2B k2, xmm1","vpmovm2b k2, xmm1","EVEX.128.F3.0= F38.W0 28 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","","" +"VPMOVM2B ymm1, k2","VPMOVM2B k2, ymm1","vpmovm2b k2, ymm1","EVEX.256.F3.0= F38.W0 28 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","","" +"VPMOVM2B zmm1, k2","VPMOVM2B k2, zmm1","vpmovm2b k2, zmm1","EVEX.512.F3.0= F38.W0 28 /r","V","V","AVX512BW","modrm_regonly","w,r","","" +"VPMOVM2D xmm1, k2","VPMOVM2D k2, xmm1","vpmovm2d k2, xmm1","EVEX.128.F3.0= F38.W0 38 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","","" +"VPMOVM2D ymm1, k2","VPMOVM2D k2, ymm1","vpmovm2d k2, ymm1","EVEX.256.F3.0= F38.W0 38 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","","" +"VPMOVM2D zmm1, k2","VPMOVM2D k2, zmm1","vpmovm2d k2, zmm1","EVEX.512.F3.0= F38.W0 38 /r","V","V","AVX512DQ","modrm_regonly","w,r","","" +"VPMOVM2Q xmm1, k2","VPMOVM2Q k2, xmm1","vpmovm2q k2, xmm1","EVEX.128.F3.0= F38.W1 38 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","","" +"VPMOVM2Q ymm1, k2","VPMOVM2Q k2, ymm1","vpmovm2q k2, ymm1","EVEX.256.F3.0= F38.W1 38 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","","" +"VPMOVM2Q zmm1, k2","VPMOVM2Q k2, zmm1","vpmovm2q k2, zmm1","EVEX.512.F3.0= F38.W1 38 /r","V","V","AVX512DQ","modrm_regonly","w,r","","" +"VPMOVM2W xmm1, k2","VPMOVM2W k2, xmm1","vpmovm2w k2, xmm1","EVEX.128.F3.0= F38.W1 28 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","","" +"VPMOVM2W ymm1, k2","VPMOVM2W k2, ymm1","vpmovm2w k2, ymm1","EVEX.256.F3.0= F38.W1 28 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","","" +"VPMOVM2W zmm1, k2","VPMOVM2W k2, zmm1","vpmovm2w k2, zmm1","EVEX.512.F3.0= F38.W1 28 /r","V","V","AVX512BW","modrm_regonly","w,r","","" +"VPMOVMSKB r32, xmm2","VPMOVMSKB xmm2, r32","vpmovmskb xmm2, r32","VEX.128= .66.0F.WIG D7 /r","V","V","AVX","modrm_regonly","w,r","","" +"VPMOVMSKB r32, ymm2","VPMOVMSKB ymm2, r32","vpmovmskb ymm2, r32","VEX.256= .66.0F.WIG D7 
/r","V","V","AVX2","modrm_regonly","w,r","","" +"VPMOVQ2M k1, xmm2","VPMOVQ2M xmm2, k1","vpmovq2m xmm2, k1","EVEX.128.F3.0= F38.W1 39 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","","" +"VPMOVQ2M k1, ymm2","VPMOVQ2M ymm2, k1","vpmovq2m ymm2, k1","EVEX.256.F3.0= F38.W1 39 /r","V","V","AVX512DQ+AVX512VL","modrm_regonly","w,r","","" +"VPMOVQ2M k1, zmm2","VPMOVQ2M zmm2, k1","vpmovq2m zmm2, k1","EVEX.512.F3.0= F38.W1 39 /r","V","V","AVX512DQ","modrm_regonly","w,r","","" +"VPMOVQB xmm2/m16, {k}{z}, xmm1","VPMOVQB xmm1, {k}{z}, xmm2/m16","vpmovqb= xmm1, {k}{z}, xmm2/m16","EVEX.128.F3.0F38.W0 32 /r","V","V","AVX512F+AVX51= 2VL","scale2","w,r,r","","" +"VPMOVQB xmm2/m32, {k}{z}, ymm1","VPMOVQB ymm1, {k}{z}, xmm2/m32","vpmovqb= ymm1, {k}{z}, xmm2/m32","EVEX.256.F3.0F38.W0 32 /r","V","V","AVX512F+AVX51= 2VL","scale4","w,r,r","","" +"VPMOVQB xmm2/m64, {k}{z}, zmm1","VPMOVQB zmm1, {k}{z}, xmm2/m64","vpmovqb= zmm1, {k}{z}, xmm2/m64","EVEX.512.F3.0F38.W0 32 /r","V","V","AVX512F","sca= le8","w,r,r","","" +"VPMOVQD xmm2/m64, {k}{z}, xmm1","VPMOVQD xmm1, {k}{z}, xmm2/m64","vpmovqd= xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 35 /r","V","V","AVX512F+AVX51= 2VL","scale8","w,r,r","","" +"VPMOVQD xmm2/m128, {k}{z}, ymm1","VPMOVQD ymm1, {k}{z}, xmm2/m128","vpmov= qd ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 35 /r","V","V","AVX512F+AV= X512VL","scale16","w,r,r","","" +"VPMOVQD ymm2/m256, {k}{z}, zmm1","VPMOVQD zmm1, {k}{z}, ymm2/m256","vpmov= qd zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 35 /r","V","V","AVX512F","= scale32","w,r,r","","" +"VPMOVQW xmm2/m32, {k}{z}, xmm1","VPMOVQW xmm1, {k}{z}, xmm2/m32","vpmovqw= xmm1, {k}{z}, xmm2/m32","EVEX.128.F3.0F38.W0 34 /r","V","V","AVX512F+AVX51= 2VL","scale4","w,r,r","","" +"VPMOVQW xmm2/m64, {k}{z}, ymm1","VPMOVQW ymm1, {k}{z}, xmm2/m64","vpmovqw= ymm1, {k}{z}, xmm2/m64","EVEX.256.F3.0F38.W0 34 /r","V","V","AVX512F+AVX51= 2VL","scale8","w,r,r","","" +"VPMOVQW xmm2/m128, {k}{z}, zmm1","VPMOVQW zmm1, {k}{z}, xmm2/m128","vpmov= qw zmm1, {k}{z}, xmm2/m128","EVEX.512.F3.0F38.W0 34 /r","V","V","AVX512F","= scale16","w,r,r","","" +"VPMOVSDB xmm2/m32, {k}{z}, xmm1","VPMOVSDB xmm1, {k}{z}, xmm2/m32","vpmov= sdb xmm1, {k}{z}, xmm2/m32","EVEX.128.F3.0F38.W0 21 /r","V","V","AVX512F+AV= X512VL","scale4","w,r,r","","" +"VPMOVSDB xmm2/m64, {k}{z}, ymm1","VPMOVSDB ymm1, {k}{z}, xmm2/m64","vpmov= sdb ymm1, {k}{z}, xmm2/m64","EVEX.256.F3.0F38.W0 21 /r","V","V","AVX512F+AV= X512VL","scale8","w,r,r","","" +"VPMOVSDB xmm2/m128, {k}{z}, zmm1","VPMOVSDB zmm1, {k}{z}, xmm2/m128","vpm= ovsdb zmm1, {k}{z}, xmm2/m128","EVEX.512.F3.0F38.W0 21 /r","V","V","AVX512F= ","scale16","w,r,r","","" +"VPMOVSDW xmm2/m64, {k}{z}, xmm1","VPMOVSDW xmm1, {k}{z}, xmm2/m64","vpmov= sdw xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 23 /r","V","V","AVX512F+AV= X512VL","scale8","w,r,r","","" +"VPMOVSDW xmm2/m128, {k}{z}, ymm1","VPMOVSDW ymm1, {k}{z}, xmm2/m128","vpm= ovsdw ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 23 /r","V","V","AVX512F= +AVX512VL","scale16","w,r,r","","" +"VPMOVSDW ymm2/m256, {k}{z}, zmm1","VPMOVSDW zmm1, {k}{z}, ymm2/m256","vpm= ovsdw zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 23 /r","V","V","AVX512F= ","scale32","w,r,r","","" +"VPMOVSQB xmm2/m16, {k}{z}, xmm1","VPMOVSQB xmm1, {k}{z}, xmm2/m16","vpmov= sqb xmm1, {k}{z}, xmm2/m16","EVEX.128.F3.0F38.W0 22 /r","V","V","AVX512F+AV= X512VL","scale2","w,r,r","","" +"VPMOVSQB xmm2/m32, {k}{z}, ymm1","VPMOVSQB ymm1, {k}{z}, xmm2/m32","vpmov= sqb ymm1, {k}{z}, xmm2/m32","EVEX.256.F3.0F38.W0 22 /r","V","V","AVX512F+AV= 
X512VL","scale4","w,r,r","","" +"VPMOVSQB xmm2/m64, {k}{z}, zmm1","VPMOVSQB zmm1, {k}{z}, xmm2/m64","vpmov= sqb zmm1, {k}{z}, xmm2/m64","EVEX.512.F3.0F38.W0 22 /r","V","V","AVX512F","= scale8","w,r,r","","" +"VPMOVSQD xmm2/m64, {k}{z}, xmm1","VPMOVSQD xmm1, {k}{z}, xmm2/m64","vpmov= sqd xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 25 /r","V","V","AVX512F+AV= X512VL","scale8","w,r,r","","" +"VPMOVSQD xmm2/m128, {k}{z}, ymm1","VPMOVSQD ymm1, {k}{z}, xmm2/m128","vpm= ovsqd ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 25 /r","V","V","AVX512F= +AVX512VL","scale16","w,r,r","","" +"VPMOVSQD ymm2/m256, {k}{z}, zmm1","VPMOVSQD zmm1, {k}{z}, ymm2/m256","vpm= ovsqd zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 25 /r","V","V","AVX512F= ","scale32","w,r,r","","" +"VPMOVSQW xmm2/m32, {k}{z}, xmm1","VPMOVSQW xmm1, {k}{z}, xmm2/m32","vpmov= sqw xmm1, {k}{z}, xmm2/m32","EVEX.128.F3.0F38.W0 24 /r","V","V","AVX512F+AV= X512VL","scale4","w,r,r","","" +"VPMOVSQW xmm2/m64, {k}{z}, ymm1","VPMOVSQW ymm1, {k}{z}, xmm2/m64","vpmov= sqw ymm1, {k}{z}, xmm2/m64","EVEX.256.F3.0F38.W0 24 /r","V","V","AVX512F+AV= X512VL","scale8","w,r,r","","" +"VPMOVSQW xmm2/m128, {k}{z}, zmm1","VPMOVSQW zmm1, {k}{z}, xmm2/m128","vpm= ovsqw zmm1, {k}{z}, xmm2/m128","EVEX.512.F3.0F38.W0 24 /r","V","V","AVX512F= ","scale16","w,r,r","","" +"VPMOVSWB xmm2/m64, {k}{z}, xmm1","VPMOVSWB xmm1, {k}{z}, xmm2/m64","vpmov= swb xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 20 /r","V","V","AVX512BW+A= VX512VL","scale8","w,r,r","","" +"VPMOVSWB xmm2/m128, {k}{z}, ymm1","VPMOVSWB ymm1, {k}{z}, xmm2/m128","vpm= ovswb ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 20 /r","V","V","AVX512B= W+AVX512VL","scale16","w,r,r","","" +"VPMOVSWB ymm2/m256, {k}{z}, zmm1","VPMOVSWB zmm1, {k}{z}, ymm2/m256","vpm= ovswb zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 20 /r","V","V","AVX512B= W","scale32","w,r,r","","" +"VPMOVSXBD zmm1, {k}{z}, xmm2/m128","VPMOVSXBD xmm2/m128, {k}{z}, zmm1","v= pmovsxbd xmm2/m128, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 21 /r","V","V","AVX= 512F","scale16","w,r,r","","" +"VPMOVSXBD xmm1, xmm2/m32","VPMOVSXBD xmm2/m32, xmm1","vpmovsxbd xmm2/m32,= xmm1","VEX.128.66.0F38.WIG 21 /r","V","V","AVX","","w,r","","" +"VPMOVSXBD xmm1, {k}{z}, xmm2/m32","VPMOVSXBD xmm2/m32, {k}{z}, xmm1","vpm= ovsxbd xmm2/m32, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 21 /r","V","V","AVX512= F+AVX512VL","scale4","w,r,r","","" +"VPMOVSXBD ymm1, xmm2/m64","VPMOVSXBD xmm2/m64, ymm1","vpmovsxbd xmm2/m64,= ymm1","VEX.256.66.0F38.WIG 21 /r","V","V","AVX2","","w,r","","" +"VPMOVSXBD ymm1, {k}{z}, xmm2/m64","VPMOVSXBD xmm2/m64, {k}{z}, ymm1","vpm= ovsxbd xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 21 /r","V","V","AVX512= F+AVX512VL","scale8","w,r,r","","" +"VPMOVSXBQ xmm1, xmm2/m16","VPMOVSXBQ xmm2/m16, xmm1","vpmovsxbq xmm2/m16,= xmm1","VEX.128.66.0F38.WIG 22 /r","V","V","AVX","","w,r","","" +"VPMOVSXBQ xmm1, {k}{z}, xmm2/m16","VPMOVSXBQ xmm2/m16, {k}{z}, xmm1","vpm= ovsxbq xmm2/m16, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 22 /r","V","V","AVX512= F+AVX512VL","scale2","w,r,r","","" +"VPMOVSXBQ ymm1, xmm2/m32","VPMOVSXBQ xmm2/m32, ymm1","vpmovsxbq xmm2/m32,= ymm1","VEX.256.66.0F38.WIG 22 /r","V","V","AVX2","","w,r","","" +"VPMOVSXBQ ymm1, {k}{z}, xmm2/m32","VPMOVSXBQ xmm2/m32, {k}{z}, ymm1","vpm= ovsxbq xmm2/m32, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 22 /r","V","V","AVX512= F+AVX512VL","scale4","w,r,r","","" +"VPMOVSXBQ zmm1, {k}{z}, xmm2/m64","VPMOVSXBQ xmm2/m64, {k}{z}, zmm1","vpm= ovsxbq xmm2/m64, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 22 /r","V","V","AVX512= F","scale8","w,r,r","","" 
+"VPMOVSXBW ymm1, xmm2/m128","VPMOVSXBW xmm2/m128, ymm1","vpmovsxbw xmm2/m1= 28, ymm1","VEX.256.66.0F38.WIG 20 /r","V","V","AVX2","","w,r","","" +"VPMOVSXBW ymm1, {k}{z}, xmm2/m128","VPMOVSXBW xmm2/m128, {k}{z}, ymm1","v= pmovsxbw xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 20 /r","V","V","AVX= 512BW+AVX512VL","scale16","w,r,r","","" +"VPMOVSXBW xmm1, xmm2/m64","VPMOVSXBW xmm2/m64, xmm1","vpmovsxbw xmm2/m64,= xmm1","VEX.128.66.0F38.WIG 20 /r","V","V","AVX","","w,r","","" +"VPMOVSXBW xmm1, {k}{z}, xmm2/m64","VPMOVSXBW xmm2/m64, {k}{z}, xmm1","vpm= ovsxbw xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 20 /r","V","V","AVX512= BW+AVX512VL","scale8","w,r,r","","" +"VPMOVSXBW zmm1, {k}{z}, ymm2/m256","VPMOVSXBW ymm2/m256, {k}{z}, zmm1","v= pmovsxbw ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 20 /r","V","V","AVX= 512BW","scale32","w,r,r","","" +"VPMOVSXDQ ymm1, xmm2/m128","VPMOVSXDQ xmm2/m128, ymm1","vpmovsxdq xmm2/m1= 28, ymm1","VEX.256.66.0F38.WIG 25 /r","V","V","AVX2","","w,r","","" +"VPMOVSXDQ ymm1, {k}{z}, xmm2/m128","VPMOVSXDQ xmm2/m128, {k}{z}, ymm1","v= pmovsxdq xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.W0 25 /r","V","V","AVX5= 12F+AVX512VL","scale16","w,r,r","","" +"VPMOVSXDQ xmm1, xmm2/m64","VPMOVSXDQ xmm2/m64, xmm1","vpmovsxdq xmm2/m64,= xmm1","VEX.128.66.0F38.WIG 25 /r","V","V","AVX","","w,r","","" +"VPMOVSXDQ xmm1, {k}{z}, xmm2/m64","VPMOVSXDQ xmm2/m64, {k}{z}, xmm1","vpm= ovsxdq xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.W0 25 /r","V","V","AVX512F= +AVX512VL","scale8","w,r,r","","" +"VPMOVSXDQ zmm1, {k}{z}, ymm2/m256","VPMOVSXDQ ymm2/m256, {k}{z}, zmm1","v= pmovsxdq ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.W0 25 /r","V","V","AVX5= 12F","scale32","w,r,r","","" +"VPMOVSXWD ymm1, xmm2/m128","VPMOVSXWD xmm2/m128, ymm1","vpmovsxwd xmm2/m1= 28, ymm1","VEX.256.66.0F38.WIG 23 /r","V","V","AVX2","","w,r","","" +"VPMOVSXWD ymm1, {k}{z}, xmm2/m128","VPMOVSXWD xmm2/m128, {k}{z}, ymm1","v= pmovsxwd xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 23 /r","V","V","AVX= 512F+AVX512VL","scale16","w,r,r","","" +"VPMOVSXWD xmm1, xmm2/m64","VPMOVSXWD xmm2/m64, xmm1","vpmovsxwd xmm2/m64,= xmm1","VEX.128.66.0F38.WIG 23 /r","V","V","AVX","","w,r","","" +"VPMOVSXWD xmm1, {k}{z}, xmm2/m64","VPMOVSXWD xmm2/m64, {k}{z}, xmm1","vpm= ovsxwd xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 23 /r","V","V","AVX512= F+AVX512VL","scale8","w,r,r","","" +"VPMOVSXWD zmm1, {k}{z}, ymm2/m256","VPMOVSXWD ymm2/m256, {k}{z}, zmm1","v= pmovsxwd ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 23 /r","V","V","AVX= 512F","scale32","w,r,r","","" +"VPMOVSXWQ zmm1, {k}{z}, xmm2/m128","VPMOVSXWQ xmm2/m128, {k}{z}, zmm1","v= pmovsxwq xmm2/m128, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 24 /r","V","V","AVX= 512F","scale16","w,r,r","","" +"VPMOVSXWQ xmm1, xmm2/m32","VPMOVSXWQ xmm2/m32, xmm1","vpmovsxwq xmm2/m32,= xmm1","VEX.128.66.0F38.WIG 24 /r","V","V","AVX","","w,r","","" +"VPMOVSXWQ xmm1, {k}{z}, xmm2/m32","VPMOVSXWQ xmm2/m32, {k}{z}, xmm1","vpm= ovsxwq xmm2/m32, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 24 /r","V","V","AVX512= F+AVX512VL","scale4","w,r,r","","" +"VPMOVSXWQ ymm1, xmm2/m64","VPMOVSXWQ xmm2/m64, ymm1","vpmovsxwq xmm2/m64,= ymm1","VEX.256.66.0F38.WIG 24 /r","V","V","AVX2","","w,r","","" +"VPMOVSXWQ ymm1, {k}{z}, xmm2/m64","VPMOVSXWQ xmm2/m64, {k}{z}, ymm1","vpm= ovsxwq xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 24 /r","V","V","AVX512= F+AVX512VL","scale8","w,r,r","","" +"VPMOVUSDB xmm2/m32, {k}{z}, xmm1","VPMOVUSDB xmm1, {k}{z}, xmm2/m32","vpm= ovusdb xmm1, {k}{z}, xmm2/m32","EVEX.128.F3.0F38.W0 11 /r","V","V","AVX512F= 
+AVX512VL","scale4","w,r,r","","" +"VPMOVUSDB xmm2/m64, {k}{z}, ymm1","VPMOVUSDB ymm1, {k}{z}, xmm2/m64","vpm= ovusdb ymm1, {k}{z}, xmm2/m64","EVEX.256.F3.0F38.W0 11 /r","V","V","AVX512F= +AVX512VL","scale8","w,r,r","","" +"VPMOVUSDB xmm2/m128, {k}{z}, zmm1","VPMOVUSDB zmm1, {k}{z}, xmm2/m128","v= pmovusdb zmm1, {k}{z}, xmm2/m128","EVEX.512.F3.0F38.W0 11 /r","V","V","AVX5= 12F","scale16","w,r,r","","" +"VPMOVUSDW xmm2/m64, {k}{z}, xmm1","VPMOVUSDW xmm1, {k}{z}, xmm2/m64","vpm= ovusdw xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 13 /r","V","V","AVX512F= +AVX512VL","scale8","w,r,r","","" +"VPMOVUSDW xmm2/m128, {k}{z}, ymm1","VPMOVUSDW ymm1, {k}{z}, xmm2/m128","v= pmovusdw ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 13 /r","V","V","AVX5= 12F+AVX512VL","scale16","w,r,r","","" +"VPMOVUSDW ymm2/m256, {k}{z}, zmm1","VPMOVUSDW zmm1, {k}{z}, ymm2/m256","v= pmovusdw zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 13 /r","V","V","AVX5= 12F","scale32","w,r,r","","" +"VPMOVUSQB xmm2/m16, {k}{z}, xmm1","VPMOVUSQB xmm1, {k}{z}, xmm2/m16","vpm= ovusqb xmm1, {k}{z}, xmm2/m16","EVEX.128.F3.0F38.W0 12 /r","V","V","AVX512F= +AVX512VL","scale2","w,r,r","","" +"VPMOVUSQB xmm2/m32, {k}{z}, ymm1","VPMOVUSQB ymm1, {k}{z}, xmm2/m32","vpm= ovusqb ymm1, {k}{z}, xmm2/m32","EVEX.256.F3.0F38.W0 12 /r","V","V","AVX512F= +AVX512VL","scale4","w,r,r","","" +"VPMOVUSQB xmm2/m64, {k}{z}, zmm1","VPMOVUSQB zmm1, {k}{z}, xmm2/m64","vpm= ovusqb zmm1, {k}{z}, xmm2/m64","EVEX.512.F3.0F38.W0 12 /r","V","V","AVX512F= ","scale8","w,r,r","","" +"VPMOVUSQD xmm2/m64, {k}{z}, xmm1","VPMOVUSQD xmm1, {k}{z}, xmm2/m64","vpm= ovusqd xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 15 /r","V","V","AVX512F= +AVX512VL","scale8","w,r,r","","" +"VPMOVUSQD xmm2/m128, {k}{z}, ymm1","VPMOVUSQD ymm1, {k}{z}, xmm2/m128","v= pmovusqd ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 15 /r","V","V","AVX5= 12F+AVX512VL","scale16","w,r,r","","" +"VPMOVUSQD ymm2/m256, {k}{z}, zmm1","VPMOVUSQD zmm1, {k}{z}, ymm2/m256","v= pmovusqd zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 15 /r","V","V","AVX5= 12F","scale32","w,r,r","","" +"VPMOVUSQW xmm2/m32, {k}{z}, xmm1","VPMOVUSQW xmm1, {k}{z}, xmm2/m32","vpm= ovusqw xmm1, {k}{z}, xmm2/m32","EVEX.128.F3.0F38.W0 14 /r","V","V","AVX512F= +AVX512VL","scale4","w,r,r","","" +"VPMOVUSQW xmm2/m64, {k}{z}, ymm1","VPMOVUSQW ymm1, {k}{z}, xmm2/m64","vpm= ovusqw ymm1, {k}{z}, xmm2/m64","EVEX.256.F3.0F38.W0 14 /r","V","V","AVX512F= +AVX512VL","scale8","w,r,r","","" +"VPMOVUSQW xmm2/m128, {k}{z}, zmm1","VPMOVUSQW zmm1, {k}{z}, xmm2/m128","v= pmovusqw zmm1, {k}{z}, xmm2/m128","EVEX.512.F3.0F38.W0 14 /r","V","V","AVX5= 12F","scale16","w,r,r","","" +"VPMOVUSWB xmm2/m64, {k}{z}, xmm1","VPMOVUSWB xmm1, {k}{z}, xmm2/m64","vpm= ovuswb xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 10 /r","V","V","AVX512B= W+AVX512VL","scale8","w,r,r","","" +"VPMOVUSWB xmm2/m128, {k}{z}, ymm1","VPMOVUSWB ymm1, {k}{z}, xmm2/m128","v= pmovuswb ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 10 /r","V","V","AVX5= 12BW+AVX512VL","scale16","w,r,r","","" +"VPMOVUSWB ymm2/m256, {k}{z}, zmm1","VPMOVUSWB zmm1, {k}{z}, ymm2/m256","v= pmovuswb zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 10 /r","V","V","AVX5= 12BW","scale32","w,r,r","","" +"VPMOVW2M k1, xmm2","VPMOVW2M xmm2, k1","vpmovw2m xmm2, k1","EVEX.128.F3.0= F38.W1 29 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","","" +"VPMOVW2M k1, ymm2","VPMOVW2M ymm2, k1","vpmovw2m ymm2, k1","EVEX.256.F3.0= F38.W1 29 /r","V","V","AVX512BW+AVX512VL","modrm_regonly","w,r","","" +"VPMOVW2M k1, zmm2","VPMOVW2M zmm2, 
k1","vpmovw2m zmm2, k1","EVEX.512.F3.0= F38.W1 29 /r","V","V","AVX512BW","modrm_regonly","w,r","","" +"VPMOVWB xmm2/m64, {k}{z}, xmm1","VPMOVWB xmm1, {k}{z}, xmm2/m64","vpmovwb= xmm1, {k}{z}, xmm2/m64","EVEX.128.F3.0F38.W0 30 /r","V","V","AVX512BW+AVX5= 12VL","scale8","w,r,r","","" +"VPMOVWB xmm2/m128, {k}{z}, ymm1","VPMOVWB ymm1, {k}{z}, xmm2/m128","vpmov= wb ymm1, {k}{z}, xmm2/m128","EVEX.256.F3.0F38.W0 30 /r","V","V","AVX512BW+A= VX512VL","scale16","w,r,r","","" +"VPMOVWB ymm2/m256, {k}{z}, zmm1","VPMOVWB zmm1, {k}{z}, ymm2/m256","vpmov= wb zmm1, {k}{z}, ymm2/m256","EVEX.512.F3.0F38.W0 30 /r","V","V","AVX512BW",= "scale32","w,r,r","","" +"VPMOVZXBD zmm1, {k}{z}, xmm2/m128","VPMOVZXBD xmm2/m128, {k}{z}, zmm1","v= pmovzxbd xmm2/m128, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 31 /r","V","V","AVX= 512F","scale16","w,r,r","","" +"VPMOVZXBD xmm1, xmm2/m32","VPMOVZXBD xmm2/m32, xmm1","vpmovzxbd xmm2/m32,= xmm1","VEX.128.66.0F38.WIG 31 /r","V","V","AVX","","w,r","","" +"VPMOVZXBD xmm1, {k}{z}, xmm2/m32","VPMOVZXBD xmm2/m32, {k}{z}, xmm1","vpm= ovzxbd xmm2/m32, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 31 /r","V","V","AVX512= F+AVX512VL","scale4","w,r,r","","" +"VPMOVZXBD ymm1, xmm2/m64","VPMOVZXBD xmm2/m64, ymm1","vpmovzxbd xmm2/m64,= ymm1","VEX.256.66.0F38.WIG 31 /r","V","V","AVX2","","w,r","","" +"VPMOVZXBD ymm1, {k}{z}, xmm2/m64","VPMOVZXBD xmm2/m64, {k}{z}, ymm1","vpm= ovzxbd xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 31 /r","V","V","AVX512= F+AVX512VL","scale8","w,r,r","","" +"VPMOVZXBQ xmm1, xmm2/m16","VPMOVZXBQ xmm2/m16, xmm1","vpmovzxbq xmm2/m16,= xmm1","VEX.128.66.0F38.WIG 32 /r","V","V","AVX","","w,r","","" +"VPMOVZXBQ xmm1, {k}{z}, xmm2/m16","VPMOVZXBQ xmm2/m16, {k}{z}, xmm1","vpm= ovzxbq xmm2/m16, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 32 /r","V","V","AVX512= F+AVX512VL","scale2","w,r,r","","" +"VPMOVZXBQ ymm1, xmm2/m32","VPMOVZXBQ xmm2/m32, ymm1","vpmovzxbq xmm2/m32,= ymm1","VEX.256.66.0F38.WIG 32 /r","V","V","AVX2","","w,r","","" +"VPMOVZXBQ ymm1, {k}{z}, xmm2/m32","VPMOVZXBQ xmm2/m32, {k}{z}, ymm1","vpm= ovzxbq xmm2/m32, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 32 /r","V","V","AVX512= F+AVX512VL","scale4","w,r,r","","" +"VPMOVZXBQ zmm1, {k}{z}, xmm2/m64","VPMOVZXBQ xmm2/m64, {k}{z}, zmm1","vpm= ovzxbq xmm2/m64, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 32 /r","V","V","AVX512= F","scale8","w,r,r","","" +"VPMOVZXBW ymm1, xmm2/m128","VPMOVZXBW xmm2/m128, ymm1","vpmovzxbw xmm2/m1= 28, ymm1","VEX.256.66.0F38.WIG 30 /r","V","V","AVX2","","w,r","","" +"VPMOVZXBW ymm1, {k}{z}, xmm2/m128","VPMOVZXBW xmm2/m128, {k}{z}, ymm1","v= pmovzxbw xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 30 /r","V","V","AVX= 512BW+AVX512VL","scale16","w,r,r","","" +"VPMOVZXBW xmm1, xmm2/m64","VPMOVZXBW xmm2/m64, xmm1","vpmovzxbw xmm2/m64,= xmm1","VEX.128.66.0F38.WIG 30 /r","V","V","AVX","","w,r","","" +"VPMOVZXBW xmm1, {k}{z}, xmm2/m64","VPMOVZXBW xmm2/m64, {k}{z}, xmm1","vpm= ovzxbw xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 30 /r","V","V","AVX512= BW+AVX512VL","scale8","w,r,r","","" +"VPMOVZXBW zmm1, {k}{z}, ymm2/m256","VPMOVZXBW ymm2/m256, {k}{z}, zmm1","v= pmovzxbw ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 30 /r","V","V","AVX= 512BW","scale32","w,r,r","","" +"VPMOVZXDQ ymm1, xmm2/m128","VPMOVZXDQ xmm2/m128, ymm1","vpmovzxdq xmm2/m1= 28, ymm1","VEX.256.66.0F38.WIG 35 /r","V","V","AVX2","","w,r","","" +"VPMOVZXDQ ymm1, {k}{z}, xmm2/m128","VPMOVZXDQ xmm2/m128, {k}{z}, ymm1","v= pmovzxdq xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.W0 35 /r","V","V","AVX5= 12F+AVX512VL","scale16","w,r,r","","" +"VPMOVZXDQ xmm1, 
xmm2/m64","VPMOVZXDQ xmm2/m64, xmm1","vpmovzxdq xmm2/m64,= xmm1","VEX.128.66.0F38.WIG 35 /r","V","V","AVX","","w,r","","" +"VPMOVZXDQ xmm1, {k}{z}, xmm2/m64","VPMOVZXDQ xmm2/m64, {k}{z}, xmm1","vpm= ovzxdq xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.W0 35 /r","V","V","AVX512F= +AVX512VL","scale8","w,r,r","","" +"VPMOVZXDQ zmm1, {k}{z}, ymm2/m256","VPMOVZXDQ ymm2/m256, {k}{z}, zmm1","v= pmovzxdq ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.W0 35 /r","V","V","AVX5= 12F","scale32","w,r,r","","" +"VPMOVZXWD ymm1, xmm2/m128","VPMOVZXWD xmm2/m128, ymm1","vpmovzxwd xmm2/m1= 28, ymm1","VEX.256.66.0F38.WIG 33 /r","V","V","AVX2","","w,r","","" +"VPMOVZXWD ymm1, {k}{z}, xmm2/m128","VPMOVZXWD xmm2/m128, {k}{z}, ymm1","v= pmovzxwd xmm2/m128, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 33 /r","V","V","AVX= 512F+AVX512VL","scale16","w,r,r","","" +"VPMOVZXWD xmm1, xmm2/m64","VPMOVZXWD xmm2/m64, xmm1","vpmovzxwd xmm2/m64,= xmm1","VEX.128.66.0F38.WIG 33 /r","V","V","AVX","","w,r","","" +"VPMOVZXWD xmm1, {k}{z}, xmm2/m64","VPMOVZXWD xmm2/m64, {k}{z}, xmm1","vpm= ovzxwd xmm2/m64, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 33 /r","V","V","AVX512= F+AVX512VL","scale8","w,r,r","","" +"VPMOVZXWD zmm1, {k}{z}, ymm2/m256","VPMOVZXWD ymm2/m256, {k}{z}, zmm1","v= pmovzxwd ymm2/m256, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 33 /r","V","V","AVX= 512F","scale32","w,r,r","","" +"VPMOVZXWQ zmm1, {k}{z}, xmm2/m128","VPMOVZXWQ xmm2/m128, {k}{z}, zmm1","v= pmovzxwq xmm2/m128, {k}{z}, zmm1","EVEX.512.66.0F38.WIG 34 /r","V","V","AVX= 512F","scale16","w,r,r","","" +"VPMOVZXWQ xmm1, xmm2/m32","VPMOVZXWQ xmm2/m32, xmm1","vpmovzxwq xmm2/m32,= xmm1","VEX.128.66.0F38.WIG 34 /r","V","V","AVX","","w,r","","" +"VPMOVZXWQ xmm1, {k}{z}, xmm2/m32","VPMOVZXWQ xmm2/m32, {k}{z}, xmm1","vpm= ovzxwq xmm2/m32, {k}{z}, xmm1","EVEX.128.66.0F38.WIG 34 /r","V","V","AVX512= F+AVX512VL","scale4","w,r,r","","" +"VPMOVZXWQ ymm1, xmm2/m64","VPMOVZXWQ xmm2/m64, ymm1","vpmovzxwq xmm2/m64,= ymm1","VEX.256.66.0F38.WIG 34 /r","V","V","AVX2","","w,r","","" +"VPMOVZXWQ ymm1, {k}{z}, xmm2/m64","VPMOVZXWQ xmm2/m64, {k}{z}, ymm1","vpm= ovzxwq xmm2/m64, {k}{z}, ymm1","EVEX.256.66.0F38.WIG 34 /r","V","V","AVX512= F+AVX512VL","scale8","w,r,r","","" +"VPMULDQ xmm1, xmmV, xmm2/m128","VPMULDQ xmm2/m128, xmmV, xmm1","vpmuldq x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 28 /r","V","V","AVX","","w,r= ,r","","" +"VPMULDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMULDQ xmm2/m128/m64bcst= , xmmV, {k}{z}, xmm1","vpmuldq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W1 28 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,= r,r,r","","" +"VPMULDQ ymm1, ymmV, ymm2/m256","VPMULDQ ymm2/m256, ymmV, ymm1","vpmuldq y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 28 /r","V","V","AVX2","","w,= r,r","","" +"VPMULDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMULDQ ymm2/m256/m64bcst= , ymmV, {k}{z}, ymm1","vpmuldq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W1 28 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,= r,r,r","","" +"VPMULDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMULDQ zmm2/m512/m64bcst= , zmmV, {k}{z}, zmm1","vpmuldq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W1 28 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r",""= ,"" +"VPMULHRSW xmm1, xmmV, xmm2/m128","VPMULHRSW xmm2/m128, xmmV, xmm1","vpmul= hrsw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 0B /r","V","V","AVX","= ","w,r,r","","" +"VPMULHRSW xmm1, {k}{z}, xmmV, xmm2/m128","VPMULHRSW xmm2/m128, xmmV, {k}{= z}, xmm1","vpmulhrsw xmm2/m128, xmmV, {k}{z}, 
xmm1","EVEX.NDS.128.66.0F38.W= IG 0B /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPMULHRSW ymm1, ymmV, ymm2/m256","VPMULHRSW ymm2/m256, ymmV, ymm1","vpmul= hrsw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 0B /r","V","V","AVX2",= "","w,r,r","","" +"VPMULHRSW ymm1, {k}{z}, ymmV, ymm2/m256","VPMULHRSW ymm2/m256, ymmV, {k}{= z}, ymm1","vpmulhrsw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W= IG 0B /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPMULHRSW zmm1, {k}{z}, zmmV, zmm2/m512","VPMULHRSW zmm2/m512, zmmV, {k}{= z}, zmm1","vpmulhrsw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W= IG 0B /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPMULHUW xmm1, xmmV, xmm2/m128","VPMULHUW xmm2/m128, xmmV, xmm1","vpmulhu= w xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E4 /r","V","V","AVX","","w,= r,r","","" +"VPMULHUW xmm1, {k}{z}, xmmV, xmm2/m128","VPMULHUW xmm2/m128, xmmV, {k}{z}= , xmm1","vpmulhuw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG E4= /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPMULHUW ymm1, ymmV, ymm2/m256","VPMULHUW ymm2/m256, ymmV, ymm1","vpmulhu= w ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E4 /r","V","V","AVX2","","w= ,r,r","","" +"VPMULHUW ymm1, {k}{z}, ymmV, ymm2/m256","VPMULHUW ymm2/m256, ymmV, {k}{z}= , ymm1","vpmulhuw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG E4= /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPMULHUW zmm1, {k}{z}, zmmV, zmm2/m512","VPMULHUW zmm2/m512, zmmV, {k}{z}= , zmm1","vpmulhuw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG E4= /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPMULHW xmm1, xmmV, xmm2/m128","VPMULHW xmm2/m128, xmmV, xmm1","vpmulhw x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E5 /r","V","V","AVX","","w,r,r= ","","" +"VPMULHW xmm1, {k}{z}, xmmV, xmm2/m128","VPMULHW xmm2/m128, xmmV, {k}{z}, = xmm1","vpmulhw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG E5 /r= ","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPMULHW ymm1, ymmV, ymm2/m256","VPMULHW ymm2/m256, ymmV, ymm1","vpmulhw y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E5 /r","V","V","AVX2","","w,r,= r","","" +"VPMULHW ymm1, {k}{z}, ymmV, ymm2/m256","VPMULHW ymm2/m256, ymmV, {k}{z}, = ymm1","vpmulhw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG E5 /r= ","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPMULHW zmm1, {k}{z}, zmmV, zmm2/m512","VPMULHW zmm2/m512, zmmV, {k}{z}, = zmm1","vpmulhw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG E5 /r= ","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPMULLD xmm1, xmmV, xmm2/m128","VPMULLD xmm2/m128, xmmV, xmm1","vpmulld x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 40 /r","V","V","AVX","","w,r= ,r","","" +"VPMULLD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPMULLD xmm2/m128/m32bcst= , xmmV, {k}{z}, xmm1","vpmulld xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W0 40 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,= r,r,r","","" +"VPMULLD ymm1, ymmV, ymm2/m256","VPMULLD ymm2/m256, ymmV, ymm1","vpmulld y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 40 /r","V","V","AVX2","","w,= r,r","","" +"VPMULLD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPMULLD ymm2/m256/m32bcst= , ymmV, {k}{z}, ymm1","vpmulld ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W0 40 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,= r,r,r","","" +"VPMULLD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPMULLD zmm2/m512/m32bcst= , zmmV, {k}{z}, zmm1","vpmulld zmm2/m512/m32bcst, zmmV, {k}{z}, 
zmm1","EVEX= .NDS.512.66.0F38.W0 40 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r",""= ,"" +"VPMULLQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMULLQ xmm2/m128/m64bcst= , xmmV, {k}{z}, xmm1","vpmullq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W1 40 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w= ,r,r,r","","" +"VPMULLQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMULLQ ymm2/m256/m64bcst= , ymmV, {k}{z}, ymm1","vpmullq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W1 40 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w= ,r,r,r","","" +"VPMULLQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMULLQ zmm2/m512/m64bcst= , zmmV, {k}{z}, zmm1","vpmullq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W1 40 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r,r","= ","" +"VPMULLW xmm1, xmmV, xmm2/m128","VPMULLW xmm2/m128, xmmV, xmm1","vpmullw x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D5 /r","V","V","AVX","","w,r,r= ","","" +"VPMULLW xmm1, {k}{z}, xmmV, xmm2/m128","VPMULLW xmm2/m128, xmmV, {k}{z}, = xmm1","vpmullw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG D5 /r= ","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPMULLW ymm1, ymmV, ymm2/m256","VPMULLW ymm2/m256, ymmV, ymm1","vpmullw y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D5 /r","V","V","AVX2","","w,r,= r","","" +"VPMULLW ymm1, {k}{z}, ymmV, ymm2/m256","VPMULLW ymm2/m256, ymmV, {k}{z}, = ymm1","vpmullw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG D5 /r= ","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPMULLW zmm1, {k}{z}, zmmV, zmm2/m512","VPMULLW zmm2/m512, zmmV, {k}{z}, = zmm1","vpmullw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG D5 /r= ","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPMULTISHIFTQB xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMULTISHIFTQB xmm= 2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpmultishiftqb xmm2/m128/m64bcst, xmmV= , {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 83 /r","V","V","AVX512_VBMI+AVX512= VL","bscale8,scale16","w,r,r,r","","" +"VPMULTISHIFTQB ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMULTISHIFTQB ymm= 2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpmultishiftqb ymm2/m256/m64bcst, ymmV= , {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 83 /r","V","V","AVX512_VBMI+AVX512= VL","bscale8,scale32","w,r,r,r","","" +"VPMULTISHIFTQB zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMULTISHIFTQB zmm= 2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpmultishiftqb zmm2/m512/m64bcst, zmmV= , {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 83 /r","V","V","AVX512_VBMI","bsca= le8,scale64","w,r,r,r","","" +"VPMULUDQ xmm1, xmmV, xmm2/m128","VPMULUDQ xmm2/m128, xmmV, xmm1","vpmulud= q xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F4 /r","V","V","AVX","","w,= r,r","","" +"VPMULUDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPMULUDQ xmm2/m128/m64bc= st, xmmV, {k}{z}, xmm1","vpmuludq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","E= VEX.NDS.128.66.0F.W1 F4 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w= ,r,r,r","","" +"VPMULUDQ ymm1, ymmV, ymm2/m256","VPMULUDQ ymm2/m256, ymmV, ymm1","vpmulud= q ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F4 /r","V","V","AVX2","","w= ,r,r","","" +"VPMULUDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPMULUDQ ymm2/m256/m64bc= st, ymmV, {k}{z}, ymm1","vpmuludq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","E= VEX.NDS.256.66.0F.W1 F4 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w= ,r,r,r","","" +"VPMULUDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPMULUDQ zmm2/m512/m64bc= st, zmmV, {k}{z}, zmm1","vpmuludq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","E= 
VEX.NDS.512.66.0F.W1 F4 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","= ","" +"VPOPCNTB xmm1, {k}{z}, xmm2/m128","VPOPCNTB xmm2/m128, {k}{z}, xmm1","vpo= pcntb xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W0 54 /r","V","V","AVX512_= BITALG+AVX512VL","scale16","w,r,r","","" +"VPOPCNTB ymm1, {k}{z}, ymm2/m256","VPOPCNTB ymm2/m256, {k}{z}, ymm1","vpo= pcntb ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W0 54 /r","V","V","AVX512_= BITALG+AVX512VL","scale32","w,r,r","","" +"VPOPCNTB zmm1, {k}{z}, zmm2/m512","VPOPCNTB zmm2/m512, {k}{z}, zmm1","vpo= pcntb zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W0 54 /r","V","V","AVX512_= BITALG","scale64","w,r,r","","" +"VPOPCNTD xmm1, {k}{z}, xmm2/m128/m32bcst","VPOPCNTD xmm2/m128/m32bcst, {k= }{z}, xmm1","vpopcntd xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W0= 55 /r","V","V","AVX512_VPOPCNTDQ+AVX512VL","bscale4,scale16","w,r,r","","" +"VPOPCNTD ymm1, {k}{z}, ymm2/m256/m32bcst","VPOPCNTD ymm2/m256/m32bcst, {k= }{z}, ymm1","vpopcntd ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W0= 55 /r","V","V","AVX512_VPOPCNTDQ+AVX512VL","bscale4,scale32","w,r,r","","" +"VPOPCNTD zmm1, {k}{z}, zmm2/m512/m32bcst","VPOPCNTD zmm2/m512/m32bcst, {k= }{z}, zmm1","vpopcntd zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0= 55 /r","V","V","AVX512_VPOPCNTDQ","bscale4,scale64","w,r,r","","" +"VPOPCNTQ xmm1, {k}{z}, xmm2/m128/m64bcst","VPOPCNTQ xmm2/m128/m64bcst, {k= }{z}, xmm1","vpopcntq xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W1= 55 /r","V","V","AVX512_VPOPCNTDQ+AVX512VL","bscale8,scale16","w,r,r","","" +"VPOPCNTQ ymm1, {k}{z}, ymm2/m256/m64bcst","VPOPCNTQ ymm2/m256/m64bcst, {k= }{z}, ymm1","vpopcntq ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W1= 55 /r","V","V","AVX512_VPOPCNTDQ+AVX512VL","bscale8,scale32","w,r,r","","" +"VPOPCNTQ zmm1, {k}{z}, zmm2/m512/m64bcst","VPOPCNTQ zmm2/m512/m64bcst, {k= }{z}, zmm1","vpopcntq zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1= 55 /r","V","V","AVX512_VPOPCNTDQ","bscale8,scale64","w,r,r","","" +"VPOPCNTW xmm1, {k}{z}, xmm2/m128","VPOPCNTW xmm2/m128, {k}{z}, xmm1","vpo= pcntw xmm2/m128, {k}{z}, xmm1","EVEX.128.66.0F38.W1 54 /r","V","V","AVX512_= BITALG+AVX512VL","scale16","w,r,r","","" +"VPOPCNTW ymm1, {k}{z}, ymm2/m256","VPOPCNTW ymm2/m256, {k}{z}, ymm1","vpo= pcntw ymm2/m256, {k}{z}, ymm1","EVEX.256.66.0F38.W1 54 /r","V","V","AVX512_= BITALG+AVX512VL","scale32","w,r,r","","" +"VPOPCNTW zmm1, {k}{z}, zmm2/m512","VPOPCNTW zmm2/m512, {k}{z}, zmm1","vpo= pcntw zmm2/m512, {k}{z}, zmm1","EVEX.512.66.0F38.W1 54 /r","V","V","AVX512_= BITALG","scale64","w,r,r","","" +"VPOR xmm1, xmmV, xmm2/m128","VPOR xmm2/m128, xmmV, xmm1","vpor xmm2/m128,= xmmV, xmm1","VEX.NDS.128.66.0F.WIG EB /r","V","V","AVX","","w,r,r","","" +"VPOR ymm1, ymmV, ymm2/m256","VPOR ymm2/m256, ymmV, ymm1","vpor ymm2/m256,= ymmV, ymm1","VEX.NDS.256.66.0F.WIG EB /r","V","V","AVX2","","w,r,r","","" +"VPORD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPORD xmm2/m128/m32bcst, xm= mV, {k}{z}, xmm1","vpord xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.1= 28.66.0F.W0 EB /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","= ","" +"VPORD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPORD ymm2/m256/m32bcst, ym= mV, {k}{z}, ymm1","vpord ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.2= 56.66.0F.W0 EB /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","= ","" +"VPORD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPORD zmm2/m512/m32bcst, zm= mV, {k}{z}, zmm1","vpord zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.5= 12.66.0F.W0 EB 
/r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VPORQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPORQ xmm2/m128/m64bcst, xm= mV, {k}{z}, xmm1","vporq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.NDS.1= 28.66.0F.W1 EB /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","= ","" +"VPORQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPORQ ymm2/m256/m64bcst, ym= mV, {k}{z}, ymm1","vporq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.NDS.2= 56.66.0F.W1 EB /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","= ","" +"VPORQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPORQ zmm2/m512/m64bcst, zm= mV, {k}{z}, zmm1","vporq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.NDS.5= 12.66.0F.W1 EB /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VPPERM xmm1, xmmV, xmmIH, xmm2/m128","VPPERM xmm2/m128, xmmIH, xmmV, xmm1= ","vpperm xmm2/m128, xmmIH, xmmV, xmm1","XOP.NDS.128.08.W1 A3 /r /is4","V",= "V","XOP","amd","w,r,r,r","","" +"VPPERM xmm1, xmmV, xmm2/m128, xmmIH","VPPERM xmmIH, xmm2/m128, xmmV, xmm1= ","vpperm xmmIH, xmm2/m128, xmmV, xmm1","XOP.NDS.128.08.W0 A3 /r /is4","V",= "V","XOP","amd","w,r,r,r","","" +"VPROLD xmmV, {k}{z}, xmm2/m128/m32bcst, imm8u","VPROLD imm8u, xmm2/m128/m= 32bcst, {k}{z}, xmmV","vprold imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","EVEX= .NDD.128.66.0F.W0 72 /1 ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w= ,r,r,r","","" +"VPROLD ymmV, {k}{z}, ymm2/m256/m32bcst, imm8u","VPROLD imm8u, ymm2/m256/m= 32bcst, {k}{z}, ymmV","vprold imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","EVEX= .NDD.256.66.0F.W0 72 /1 ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w= ,r,r,r","","" +"VPROLD zmmV, {k}{z}, zmm2/m512/m32bcst, imm8u","VPROLD imm8u, zmm2/m512/m= 32bcst, {k}{z}, zmmV","vprold imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","EVEX= .NDD.512.66.0F.W0 72 /1 ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","= ","" +"VPROLQ xmmV, {k}{z}, xmm2/m128/m64bcst, imm8u","VPROLQ imm8u, xmm2/m128/m= 64bcst, {k}{z}, xmmV","vprolq imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","EVEX= .NDD.128.66.0F.W1 72 /1 ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w= ,r,r,r","","" +"VPROLQ ymmV, {k}{z}, ymm2/m256/m64bcst, imm8u","VPROLQ imm8u, ymm2/m256/m= 64bcst, {k}{z}, ymmV","vprolq imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","EVEX= .NDD.256.66.0F.W1 72 /1 ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w= ,r,r,r","","" +"VPROLQ zmmV, {k}{z}, zmm2/m512/m64bcst, imm8u","VPROLQ imm8u, zmm2/m512/m= 64bcst, {k}{z}, zmmV","vprolq imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","EVEX= .NDD.512.66.0F.W1 72 /1 ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","= ","" +"VPROLVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPROLVD xmm2/m128/m32bcst= , xmmV, {k}{z}, xmm1","vprolvd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W0 15 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,= r,r,r","","" +"VPROLVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPROLVD ymm2/m256/m32bcst= , ymmV, {k}{z}, ymm1","vprolvd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W0 15 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,= r,r,r","","" +"VPROLVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPROLVD zmm2/m512/m32bcst= , zmmV, {k}{z}, zmm1","vprolvd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W0 15 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r",""= ,"" +"VPROLVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPROLVQ xmm2/m128/m64bcst= , xmmV, {k}{z}, xmm1","vprolvq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W1 15 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,= r,r,r","","" +"VPROLVQ 
ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPROLVQ ymm2/m256/m64bcst= , ymmV, {k}{z}, ymm1","vprolvq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W1 15 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,= r,r,r","","" +"VPROLVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPROLVQ zmm2/m512/m64bcst= , zmmV, {k}{z}, zmm1","vprolvq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W1 15 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r",""= ,"" +"VPRORD xmmV, {k}{z}, xmm2/m128/m32bcst, imm8u","VPRORD imm8u, xmm2/m128/m= 32bcst, {k}{z}, xmmV","vprord imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","EVEX= .NDD.128.66.0F.W0 72 /0 ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w= ,r,r,r","","" +"VPRORD ymmV, {k}{z}, ymm2/m256/m32bcst, imm8u","VPRORD imm8u, ymm2/m256/m= 32bcst, {k}{z}, ymmV","vprord imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","EVEX= .NDD.256.66.0F.W0 72 /0 ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w= ,r,r,r","","" +"VPRORD zmmV, {k}{z}, zmm2/m512/m32bcst, imm8u","VPRORD imm8u, zmm2/m512/m= 32bcst, {k}{z}, zmmV","vprord imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","EVEX= .NDD.512.66.0F.W0 72 /0 ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","= ","" +"VPRORQ xmmV, {k}{z}, xmm2/m128/m64bcst, imm8u","VPRORQ imm8u, xmm2/m128/m= 64bcst, {k}{z}, xmmV","vprorq imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","EVEX= .NDD.128.66.0F.W1 72 /0 ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w= ,r,r,r","","" +"VPRORQ ymmV, {k}{z}, ymm2/m256/m64bcst, imm8u","VPRORQ imm8u, ymm2/m256/m= 64bcst, {k}{z}, ymmV","vprorq imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","EVEX= .NDD.256.66.0F.W1 72 /0 ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w= ,r,r,r","","" +"VPRORQ zmmV, {k}{z}, zmm2/m512/m64bcst, imm8u","VPRORQ imm8u, zmm2/m512/m= 64bcst, {k}{z}, zmmV","vprorq imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","EVEX= .NDD.512.66.0F.W1 72 /0 ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","= ","" +"VPRORVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPRORVD xmm2/m128/m32bcst= , xmmV, {k}{z}, xmm1","vprorvd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W0 14 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,= r,r,r","","" +"VPRORVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPRORVD ymm2/m256/m32bcst= , ymmV, {k}{z}, ymm1","vprorvd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W0 14 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,= r,r,r","","" +"VPRORVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPRORVD zmm2/m512/m32bcst= , zmmV, {k}{z}, zmm1","vprorvd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W0 14 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r",""= ,"" +"VPRORVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPRORVQ xmm2/m128/m64bcst= , xmmV, {k}{z}, xmm1","vprorvq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W1 14 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,= r,r,r","","" +"VPRORVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPRORVQ ymm2/m256/m64bcst= , ymmV, {k}{z}, ymm1","vprorvq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W1 14 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,= r,r,r","","" +"VPRORVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPRORVQ zmm2/m512/m64bcst= , zmmV, {k}{z}, zmm1","vprorvq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W1 14 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r",""= ,"" +"VPROTB xmm1, xmm2/m128, imm8u","VPROTB imm8u, xmm2/m128, xmm1","vprotb im= m8u, xmm2/m128, xmm1","XOP.128.08.W0 C0 /r ib","V","V","XOP","amd","w,r,r",= "","" +"VPROTB xmm1, xmmV, 
xmm2/m128","VPROTB xmm2/m128, xmmV, xmm1","vprotb xmm2= /m128, xmmV, xmm1","XOP.NDS.128.09.W1 90 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPROTB xmm1, xmm2/m128, xmmV","VPROTB xmmV, xmm2/m128, xmm1","vprotb xmmV= , xmm2/m128, xmm1","XOP.NDS.128.09.W0 90 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPROTD xmm1, xmm2/m128, imm8u","VPROTD imm8u, xmm2/m128, xmm1","vprotd im= m8u, xmm2/m128, xmm1","XOP.128.08.W0 C2 /r ib","V","V","XOP","amd","w,r,r",= "","" +"VPROTD xmm1, xmmV, xmm2/m128","VPROTD xmm2/m128, xmmV, xmm1","vprotd xmm2= /m128, xmmV, xmm1","XOP.NDS.128.09.W1 92 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPROTD xmm1, xmm2/m128, xmmV","VPROTD xmmV, xmm2/m128, xmm1","vprotd xmmV= , xmm2/m128, xmm1","XOP.NDS.128.09.W0 92 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPROTQ xmm1, xmm2/m128, imm8u","VPROTQ imm8u, xmm2/m128, xmm1","vprotq im= m8u, xmm2/m128, xmm1","XOP.128.08.W0 C3 /r ib","V","V","XOP","amd","w,r,r",= "","" +"VPROTQ xmm1, xmmV, xmm2/m128","VPROTQ xmm2/m128, xmmV, xmm1","vprotq xmm2= /m128, xmmV, xmm1","XOP.NDS.128.09.W1 93 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPROTQ xmm1, xmm2/m128, xmmV","VPROTQ xmmV, xmm2/m128, xmm1","vprotq xmmV= , xmm2/m128, xmm1","XOP.NDS.128.09.W0 93 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPROTW xmm1, xmm2/m128, imm8u","VPROTW imm8u, xmm2/m128, xmm1","vprotw im= m8u, xmm2/m128, xmm1","XOP.128.08.W0 C1 /r ib","V","V","XOP","amd","w,r,r",= "","" +"VPROTW xmm1, xmmV, xmm2/m128","VPROTW xmm2/m128, xmmV, xmm1","vprotw xmm2= /m128, xmmV, xmm1","XOP.NDS.128.09.W1 91 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPROTW xmm1, xmm2/m128, xmmV","VPROTW xmmV, xmm2/m128, xmm1","vprotw xmmV= , xmm2/m128, xmm1","XOP.NDS.128.09.W0 91 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSADBW xmm1, xmmV, xmm2/m128","VPSADBW xmm2/m128, xmmV, xmm1","vpsadbw x= mm2/m128, xmmV, xmm1","EVEX.NDS.128.66.0F.WIG F6 /r","V","V","AVX512BW+AVX5= 12VL","scale16","w,r,r","","" +"VPSADBW xmm1, xmmV, xmm2/m128","VPSADBW xmm2/m128, xmmV, xmm1","vpsadbw x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F6 /r","V","V","AVX","","w,r,r= ","","" +"VPSADBW ymm1, ymmV, ymm2/m256","VPSADBW ymm2/m256, ymmV, ymm1","vpsadbw y= mm2/m256, ymmV, ymm1","EVEX.NDS.256.66.0F.WIG F6 /r","V","V","AVX512BW+AVX5= 12VL","scale32","w,r,r","","" +"VPSADBW ymm1, ymmV, ymm2/m256","VPSADBW ymm2/m256, ymmV, ymm1","vpsadbw y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F6 /r","V","V","AVX2","","w,r,= r","","" +"VPSADBW zmm1, zmmV, zmm2/m512","VPSADBW zmm2/m512, zmmV, zmm1","vpsadbw z= mm2/m512, zmmV, zmm1","EVEX.NDS.512.66.0F.WIG F6 /r","V","V","AVX512BW","sc= ale64","w,r,r","","" +"VPSCATTERDD vm32x, {k1-k7}, xmm1","VPSCATTERDD xmm1, {k1-k7}, vm32x","vps= catterdd xmm1, {k1-k7}, vm32x","EVEX.128.66.0F38.W0 A0 /vsib","V","V","AVX5= 12F+AVX512VL","modrm_memonly,scale4","w,rw,r","","" +"VPSCATTERDD vm32y, {k1-k7}, ymm1","VPSCATTERDD ymm1, {k1-k7}, vm32y","vps= catterdd ymm1, {k1-k7}, vm32y","EVEX.256.66.0F38.W0 A0 /vsib","V","V","AVX5= 12F+AVX512VL","modrm_memonly,scale4","w,rw,r","","" +"VPSCATTERDD vm32z, {k1-k7}, zmm1","VPSCATTERDD zmm1, {k1-k7}, vm32z","vps= catterdd zmm1, {k1-k7}, vm32z","EVEX.512.66.0F38.W0 A0 /vsib","V","V","AVX5= 12F","modrm_memonly,scale4","w,rw,r","","" +"VPSCATTERDQ vm32x, {k1-k7}, xmm1","VPSCATTERDQ xmm1, {k1-k7}, vm32x","vps= catterdq xmm1, {k1-k7}, vm32x","EVEX.128.66.0F38.W1 A0 /vsib","V","V","AVX5= 12F+AVX512VL","modrm_memonly,scale8","w,rw,r","","" +"VPSCATTERDQ vm32x, {k1-k7}, ymm1","VPSCATTERDQ ymm1, {k1-k7}, vm32x","vps= catterdq ymm1, {k1-k7}, vm32x","EVEX.256.66.0F38.W1 A0 
/vsib","V","V","AVX5= 12F+AVX512VL","modrm_memonly,scale8","w,rw,r","","" +"VPSCATTERDQ vm32y, {k1-k7}, zmm1","VPSCATTERDQ zmm1, {k1-k7}, vm32y","vps= catterdq zmm1, {k1-k7}, vm32y","EVEX.512.66.0F38.W1 A0 /vsib","V","V","AVX5= 12F","modrm_memonly,scale8","w,rw,r","","" +"VPSCATTERQD vm64x, {k1-k7}, xmm1","VPSCATTERQD xmm1, {k1-k7}, vm64x","vps= catterqd xmm1, {k1-k7}, vm64x","EVEX.128.66.0F38.W0 A1 /vsib","V","V","AVX5= 12F+AVX512VL","modrm_memonly,scale4","w,rw,r","","" +"VPSCATTERQD vm64y, {k1-k7}, xmm1","VPSCATTERQD xmm1, {k1-k7}, vm64y","vps= catterqd xmm1, {k1-k7}, vm64y","EVEX.256.66.0F38.W0 A1 /vsib","V","V","AVX5= 12F+AVX512VL","modrm_memonly,scale4","w,rw,r","","" +"VPSCATTERQD vm64z, {k1-k7}, ymm1","VPSCATTERQD ymm1, {k1-k7}, vm64z","vps= catterqd ymm1, {k1-k7}, vm64z","EVEX.512.66.0F38.W0 A1 /vsib","V","V","AVX5= 12F","modrm_memonly,scale4","w,rw,r","","" +"VPSCATTERQQ vm64x, {k1-k7}, xmm1","VPSCATTERQQ xmm1, {k1-k7}, vm64x","vps= catterqq xmm1, {k1-k7}, vm64x","EVEX.128.66.0F38.W1 A1 /vsib","V","V","AVX5= 12F+AVX512VL","modrm_memonly,scale8","w,rw,r","","" +"VPSCATTERQQ vm64y, {k1-k7}, ymm1","VPSCATTERQQ ymm1, {k1-k7}, vm64y","vps= catterqq ymm1, {k1-k7}, vm64y","EVEX.256.66.0F38.W1 A1 /vsib","V","V","AVX5= 12F+AVX512VL","modrm_memonly,scale8","w,rw,r","","" +"VPSCATTERQQ vm64z, {k1-k7}, zmm1","VPSCATTERQQ zmm1, {k1-k7}, vm64z","vps= catterqq zmm1, {k1-k7}, vm64z","EVEX.512.66.0F38.W1 A1 /vsib","V","V","AVX5= 12F","modrm_memonly,scale8","w,rw,r","","" +"VPSHAB xmm1, xmmV, xmm2/m128","VPSHAB xmm2/m128, xmmV, xmm1","vpshab xmm2= /m128, xmmV, xmm1","XOP.NDS.128.09.W1 98 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSHAB xmm1, xmm2/m128, xmmV","VPSHAB xmmV, xmm2/m128, xmm1","vpshab xmmV= , xmm2/m128, xmm1","XOP.NDS.128.09.W0 98 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSHAD xmm1, xmmV, xmm2/m128","VPSHAD xmm2/m128, xmmV, xmm1","vpshad xmm2= /m128, xmmV, xmm1","XOP.NDS.128.09.W1 9A /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSHAD xmm1, xmm2/m128, xmmV","VPSHAD xmmV, xmm2/m128, xmm1","vpshad xmmV= , xmm2/m128, xmm1","XOP.NDS.128.09.W0 9A /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSHAQ xmm1, xmmV, xmm2/m128","VPSHAQ xmm2/m128, xmmV, xmm1","vpshaq xmm2= /m128, xmmV, xmm1","XOP.NDS.128.09.W1 9B /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSHAQ xmm1, xmm2/m128, xmmV","VPSHAQ xmmV, xmm2/m128, xmm1","vpshaq xmmV= , xmm2/m128, xmm1","XOP.NDS.128.09.W0 9B /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSHAW xmm1, xmmV, xmm2/m128","VPSHAW xmm2/m128, xmmV, xmm1","vpshaw xmm2= /m128, xmmV, xmm1","XOP.NDS.128.09.W1 99 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSHAW xmm1, xmm2/m128, xmmV","VPSHAW xmmV, xmm2/m128, xmm1","vpshaw xmmV= , xmm2/m128, xmm1","XOP.NDS.128.09.W0 99 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSHLB xmm1, xmmV, xmm2/m128","VPSHLB xmm2/m128, xmmV, xmm1","vpshlb xmm2= /m128, xmmV, xmm1","XOP.NDS.128.09.W1 94 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSHLB xmm1, xmm2/m128, xmmV","VPSHLB xmmV, xmm2/m128, xmm1","vpshlb xmmV= , xmm2/m128, xmm1","XOP.NDS.128.09.W0 94 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSHLD xmm1, xmmV, xmm2/m128","VPSHLD xmm2/m128, xmmV, xmm1","vpshld xmm2= /m128, xmmV, xmm1","XOP.NDS.128.09.W1 96 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSHLD xmm1, xmm2/m128, xmmV","VPSHLD xmmV, xmm2/m128, xmm1","vpshld xmmV= , xmm2/m128, xmm1","XOP.NDS.128.09.W0 96 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSHLDD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u","VPSHLDD imm8u, xmm= 2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpshldd imm8u, xmm2/m128/m32bcst, xmmV= , {k}{z}, 
xmm1","EVEX.NDS.128.66.0F3A.W0 71 /r ib","V","V","AVX512_VBMI2+AV= X512VL","bscale4,scale16","w,r,r,r,r","","" +"VPSHLDD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VPSHLDD imm8u, ymm= 2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpshldd imm8u, ymm2/m256/m32bcst, ymmV= , {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 71 /r ib","V","V","AVX512_VBMI2+AV= X512VL","bscale4,scale32","w,r,r,r,r","","" +"VPSHLDD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VPSHLDD imm8u, zmm= 2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpshldd imm8u, zmm2/m512/m32bcst, zmmV= , {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 71 /r ib","V","V","AVX512_VBMI2","= bscale4,scale64","w,r,r,r,r","","" +"VPSHLDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VPSHLDQ imm8u, xmm= 2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpshldq imm8u, xmm2/m128/m64bcst, xmmV= , {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 71 /r ib","V","V","AVX512_VBMI2+AV= X512VL","bscale8,scale16","w,r,r,r,r","","" +"VPSHLDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VPSHLDQ imm8u, ymm= 2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpshldq imm8u, ymm2/m256/m64bcst, ymmV= , {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 71 /r ib","V","V","AVX512_VBMI2+AV= X512VL","bscale8,scale32","w,r,r,r,r","","" +"VPSHLDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VPSHLDQ imm8u, zmm= 2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpshldq imm8u, zmm2/m512/m64bcst, zmmV= , {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 71 /r ib","V","V","AVX512_VBMI2","= bscale8,scale64","w,r,r,r,r","","" +"VPSHLDVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPSHLDVD xmm2/m128/m32bc= st, xmmV, {k}{z}, xmm1","vpshldvd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","E= VEX.DDS.128.66.0F38.W0 71 /r","V","V","AVX512_VBMI2+AVX512VL","bscale4,scal= e16","rw,r,r,r","","" +"VPSHLDVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPSHLDVD ymm2/m256/m32bc= st, ymmV, {k}{z}, ymm1","vpshldvd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","E= VEX.DDS.256.66.0F38.W0 71 /r","V","V","AVX512_VBMI2+AVX512VL","bscale4,scal= e32","rw,r,r,r","","" +"VPSHLDVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPSHLDVD zmm2/m512/m32bc= st, zmmV, {k}{z}, zmm1","vpshldvd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","E= VEX.DDS.512.66.0F38.W0 71 /r","V","V","AVX512_VBMI2","bscale4,scale64","rw,= r,r,r","","" +"VPSHLDVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPSHLDVQ xmm2/m128/m64bc= st, xmmV, {k}{z}, xmm1","vpshldvq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","E= VEX.DDS.128.66.0F38.W1 71 /r","V","V","AVX512_VBMI2+AVX512VL","bscale8,scal= e16","rw,r,r,r","","" +"VPSHLDVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPSHLDVQ ymm2/m256/m64bc= st, ymmV, {k}{z}, ymm1","vpshldvq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","E= VEX.DDS.256.66.0F38.W1 71 /r","V","V","AVX512_VBMI2+AVX512VL","bscale8,scal= e32","rw,r,r,r","","" +"VPSHLDVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPSHLDVQ zmm2/m512/m64bc= st, zmmV, {k}{z}, zmm1","vpshldvq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","E= VEX.DDS.512.66.0F38.W1 71 /r","V","V","AVX512_VBMI2","bscale8,scale64","rw,= r,r,r","","" +"VPSHLDVW xmm1, {k}{z}, xmmV, xmm2/m128","VPSHLDVW xmm2/m128, xmmV, {k}{z}= , xmm1","vpshldvw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 7= 0 /r","V","V","AVX512_VBMI2+AVX512VL","scale16","rw,r,r,r","","" +"VPSHLDVW ymm1, {k}{z}, ymmV, ymm2/m256","VPSHLDVW ymm2/m256, ymmV, {k}{z}= , ymm1","vpshldvw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 7= 0 /r","V","V","AVX512_VBMI2+AVX512VL","scale32","rw,r,r,r","","" +"VPSHLDVW zmm1, {k}{z}, zmmV, zmm2/m512","VPSHLDVW zmm2/m512, zmmV, {k}{z}= , zmm1","vpshldvw zmm2/m512, zmmV, {k}{z}, 
zmm1","EVEX.DDS.512.66.0F38.W1 7= 0 /r","V","V","AVX512_VBMI2","scale64","rw,r,r,r","","" +"VPSHLDW xmm1, {k}{z}, xmmV, xmm2/m128, imm8u","VPSHLDW imm8u, xmm2/m128, = xmmV, {k}{z}, xmm1","vpshldw imm8u, xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F3A.W1 70 /r ib","V","V","AVX512_VBMI2+AVX512VL","scale16","w,r,r= ,r,r","","" +"VPSHLDW ymm1, {k}{z}, ymmV, ymm2/m256, imm8u","VPSHLDW imm8u, ymm2/m256, = ymmV, {k}{z}, ymm1","vpshldw imm8u, ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F3A.W1 70 /r ib","V","V","AVX512_VBMI2+AVX512VL","scale32","w,r,r= ,r,r","","" +"VPSHLDW zmm1, {k}{z}, zmmV, zmm2/m512, imm8u","VPSHLDW imm8u, zmm2/m512, = zmmV, {k}{z}, zmm1","vpshldw imm8u, zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F3A.W1 70 /r ib","V","V","AVX512_VBMI2","scale64","w,r,r,r,r","",= "" +"VPSHLQ xmm1, xmmV, xmm2/m128","VPSHLQ xmm2/m128, xmmV, xmm1","vpshlq xmm2= /m128, xmmV, xmm1","XOP.NDS.128.09.W1 97 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSHLQ xmm1, xmm2/m128, xmmV","VPSHLQ xmmV, xmm2/m128, xmm1","vpshlq xmmV= , xmm2/m128, xmm1","XOP.NDS.128.09.W0 97 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSHLW xmm1, xmmV, xmm2/m128","VPSHLW xmm2/m128, xmmV, xmm1","vpshlw xmm2= /m128, xmmV, xmm1","XOP.NDS.128.09.W1 95 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSHLW xmm1, xmm2/m128, xmmV","VPSHLW xmmV, xmm2/m128, xmm1","vpshlw xmmV= , xmm2/m128, xmm1","XOP.NDS.128.09.W0 95 /r","V","V","XOP","amd","w,r,r",""= ,"" +"VPSHRDD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u","VPSHRDD imm8u, xmm= 2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpshrdd imm8u, xmm2/m128/m32bcst, xmmV= , {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W0 73 /r ib","V","V","AVX512_VBMI2+AV= X512VL","bscale4,scale16","w,r,r,r,r","","" +"VPSHRDD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VPSHRDD imm8u, ymm= 2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpshrdd imm8u, ymm2/m256/m32bcst, ymmV= , {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 73 /r ib","V","V","AVX512_VBMI2+AV= X512VL","bscale4,scale32","w,r,r,r,r","","" +"VPSHRDD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VPSHRDD imm8u, zmm= 2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpshrdd imm8u, zmm2/m512/m32bcst, zmmV= , {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 73 /r ib","V","V","AVX512_VBMI2","= bscale4,scale64","w,r,r,r,r","","" +"VPSHRDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VPSHRDQ imm8u, xmm= 2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpshrdq imm8u, xmm2/m128/m64bcst, xmmV= , {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 73 /r ib","V","V","AVX512_VBMI2+AV= X512VL","bscale8,scale16","w,r,r,r,r","","" +"VPSHRDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VPSHRDQ imm8u, ymm= 2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpshrdq imm8u, ymm2/m256/m64bcst, ymmV= , {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 73 /r ib","V","V","AVX512_VBMI2+AV= X512VL","bscale8,scale32","w,r,r,r,r","","" +"VPSHRDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VPSHRDQ imm8u, zmm= 2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpshrdq imm8u, zmm2/m512/m64bcst, zmmV= , {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 73 /r ib","V","V","AVX512_VBMI2","= bscale8,scale64","w,r,r,r,r","","" +"VPSHRDVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPSHRDVD xmm2/m128/m32bc= st, xmmV, {k}{z}, xmm1","vpshrdvd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","E= VEX.DDS.128.66.0F38.W0 73 /r","V","V","AVX512_VBMI2+AVX512VL","bscale4,scal= e16","rw,r,r,r","","" +"VPSHRDVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPSHRDVD ymm2/m256/m32bc= st, ymmV, {k}{z}, ymm1","vpshrdvd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","E= VEX.DDS.256.66.0F38.W0 73 
/r","V","V","AVX512_VBMI2+AVX512VL","bscale4,scal= e32","rw,r,r,r","","" +"VPSHRDVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPSHRDVD zmm2/m512/m32bc= st, zmmV, {k}{z}, zmm1","vpshrdvd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","E= VEX.DDS.512.66.0F38.W0 73 /r","V","V","AVX512_VBMI2","bscale4,scale64","rw,= r,r,r","","" +"VPSHRDVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPSHRDVQ xmm2/m128/m64bc= st, xmmV, {k}{z}, xmm1","vpshrdvq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","E= VEX.DDS.128.66.0F38.W1 73 /r","V","V","AVX512_VBMI2+AVX512VL","bscale8,scal= e16","rw,r,r,r","","" +"VPSHRDVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPSHRDVQ ymm2/m256/m64bc= st, ymmV, {k}{z}, ymm1","vpshrdvq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","E= VEX.DDS.256.66.0F38.W1 73 /r","V","V","AVX512_VBMI2+AVX512VL","bscale8,scal= e32","rw,r,r,r","","" +"VPSHRDVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPSHRDVQ zmm2/m512/m64bc= st, zmmV, {k}{z}, zmm1","vpshrdvq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","E= VEX.DDS.512.66.0F38.W1 73 /r","V","V","AVX512_VBMI2","bscale8,scale64","rw,= r,r,r","","" +"VPSHRDVW xmm1, {k}{z}, xmmV, xmm2/m128","VPSHRDVW xmm2/m128, xmmV, {k}{z}= , xmm1","vpshrdvw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F38.W1 7= 2 /r","V","V","AVX512_VBMI2+AVX512VL","scale16","rw,r,r,r","","" +"VPSHRDVW ymm1, {k}{z}, ymmV, ymm2/m256","VPSHRDVW ymm2/m256, ymmV, {k}{z}= , ymm1","vpshrdvw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F38.W1 7= 2 /r","V","V","AVX512_VBMI2+AVX512VL","scale32","rw,r,r,r","","" +"VPSHRDVW zmm1, {k}{z}, zmmV, zmm2/m512","VPSHRDVW zmm2/m512, zmmV, {k}{z}= , zmm1","vpshrdvw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F38.W1 7= 2 /r","V","V","AVX512_VBMI2","scale64","rw,r,r,r","","" +"VPSHRDW xmm1, {k}{z}, xmmV, xmm2/m128, imm8u","VPSHRDW imm8u, xmm2/m128, = xmmV, {k}{z}, xmm1","vpshrdw imm8u, xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F3A.W1 72 /r ib","V","V","AVX512_VBMI2+AVX512VL","scale16","w,r,r= ,r,r","","" +"VPSHRDW ymm1, {k}{z}, ymmV, ymm2/m256, imm8u","VPSHRDW imm8u, ymm2/m256, = ymmV, {k}{z}, ymm1","vpshrdw imm8u, ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F3A.W1 72 /r ib","V","V","AVX512_VBMI2+AVX512VL","scale32","w,r,r= ,r,r","","" +"VPSHRDW zmm1, {k}{z}, zmmV, zmm2/m512, imm8u","VPSHRDW imm8u, zmm2/m512, = zmmV, {k}{z}, zmm1","vpshrdw imm8u, zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F3A.W1 72 /r ib","V","V","AVX512_VBMI2","scale64","w,r,r,r,r","",= "" +"VPSHUFB xmm1, xmmV, xmm2/m128","VPSHUFB xmm2/m128, xmmV, xmm1","vpshufb x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 00 /r","V","V","AVX","","w,r= ,r","","" +"VPSHUFB xmm1, {k}{z}, xmmV, xmm2/m128","VPSHUFB xmm2/m128, xmmV, {k}{z}, = xmm1","vpshufb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.WIG 00 = /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSHUFB ymm1, ymmV, ymm2/m256","VPSHUFB ymm2/m256, ymmV, ymm1","vpshufb y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 00 /r","V","V","AVX2","","w,= r,r","","" +"VPSHUFB ymm1, {k}{z}, ymmV, ymm2/m256","VPSHUFB ymm2/m256, ymmV, {k}{z}, = ymm1","vpshufb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.WIG 00 = /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPSHUFB zmm1, {k}{z}, zmmV, zmm2/m512","VPSHUFB zmm2/m512, zmmV, {k}{z}, = zmm1","vpshufb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.WIG 00 = /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPSHUFBITQMB k1, {k}, xmmV, xmm2/m128","VPSHUFBITQMB xmm2/m128, xmmV, {k}= , k1","vpshufbitqmb xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F38.W0 8F /= 
r","V","V","AVX512_BITALG+AVX512VL","scale16","w,r,r,r","","" +"VPSHUFBITQMB k1, {k}, ymmV, ymm2/m256","VPSHUFBITQMB ymm2/m256, ymmV, {k}= , k1","vpshufbitqmb ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F38.W0 8F /= r","V","V","AVX512_BITALG+AVX512VL","scale32","w,r,r,r","","" +"VPSHUFBITQMB k1, {k}, zmmV, zmm2/m512","VPSHUFBITQMB zmm2/m512, zmmV, {k}= , k1","vpshufbitqmb zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F38.W0 8F /= r","V","V","AVX512_BITALG","scale64","w,r,r,r","","" +"VPSHUFD xmm1, xmm2/m128, imm8u","VPSHUFD imm8u, xmm2/m128, xmm1","vpshufd= imm8u, xmm2/m128, xmm1","VEX.128.66.0F.WIG 70 /r ib","V","V","AVX","","w,r= ,r","","" +"VPSHUFD xmm1, {k}{z}, xmm2/m128/m32bcst, imm8u","VPSHUFD imm8u, xmm2/m128= /m32bcst, {k}{z}, xmm1","vpshufd imm8u, xmm2/m128/m32bcst, {k}{z}, xmm1","E= VEX.128.66.0F.W0 70 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w,= r,r,r","","" +"VPSHUFD ymm1, ymm2/m256, imm8u","VPSHUFD imm8u, ymm2/m256, ymm1","vpshufd= imm8u, ymm2/m256, ymm1","VEX.256.66.0F.WIG 70 /r ib","V","V","AVX2","","w,= r,r","","" +"VPSHUFD ymm1, {k}{z}, ymm2/m256/m32bcst, imm8u","VPSHUFD imm8u, ymm2/m256= /m32bcst, {k}{z}, ymm1","vpshufd imm8u, ymm2/m256/m32bcst, {k}{z}, ymm1","E= VEX.256.66.0F.W0 70 /r ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w,= r,r,r","","" +"VPSHUFD zmm1, {k}{z}, zmm2/m512/m32bcst, imm8u","VPSHUFD imm8u, zmm2/m512= /m32bcst, {k}{z}, zmm1","vpshufd imm8u, zmm2/m512/m32bcst, {k}{z}, zmm1","E= VEX.512.66.0F.W0 70 /r ib","V","V","AVX512F","bscale4,scale64","w,r,r,r",""= ,"" +"VPSHUFHW xmm1, xmm2/m128, imm8u","VPSHUFHW imm8u, xmm2/m128, xmm1","vpshu= fhw imm8u, xmm2/m128, xmm1","VEX.128.F3.0F.WIG 70 /r ib","V","V","AVX","","= w,r,r","","" +"VPSHUFHW xmm1, {k}{z}, xmm2/m128, imm8u","VPSHUFHW imm8u, xmm2/m128, {k}{= z}, xmm1","vpshufhw imm8u, xmm2/m128, {k}{z}, xmm1","EVEX.128.F3.0F.WIG 70 = /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSHUFHW ymm1, ymm2/m256, imm8u","VPSHUFHW imm8u, ymm2/m256, ymm1","vpshu= fhw imm8u, ymm2/m256, ymm1","VEX.256.F3.0F.WIG 70 /r ib","V","V","AVX2","",= "w,r,r","","" +"VPSHUFHW ymm1, {k}{z}, ymm2/m256, imm8u","VPSHUFHW imm8u, ymm2/m256, {k}{= z}, ymm1","vpshufhw imm8u, ymm2/m256, {k}{z}, ymm1","EVEX.256.F3.0F.WIG 70 = /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPSHUFHW zmm1, {k}{z}, zmm2/m512, imm8u","VPSHUFHW imm8u, zmm2/m512, {k}{= z}, zmm1","vpshufhw imm8u, zmm2/m512, {k}{z}, zmm1","EVEX.512.F3.0F.WIG 70 = /r ib","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPSHUFLW xmm1, xmm2/m128, imm8u","VPSHUFLW imm8u, xmm2/m128, xmm1","vpshu= flw imm8u, xmm2/m128, xmm1","VEX.128.F2.0F.WIG 70 /r ib","V","V","AVX","","= w,r,r","","" +"VPSHUFLW xmm1, {k}{z}, xmm2/m128, imm8u","VPSHUFLW imm8u, xmm2/m128, {k}{= z}, xmm1","vpshuflw imm8u, xmm2/m128, {k}{z}, xmm1","EVEX.128.F2.0F.WIG 70 = /r ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSHUFLW ymm1, ymm2/m256, imm8u","VPSHUFLW imm8u, ymm2/m256, ymm1","vpshu= flw imm8u, ymm2/m256, ymm1","VEX.256.F2.0F.WIG 70 /r ib","V","V","AVX2","",= "w,r,r","","" +"VPSHUFLW ymm1, {k}{z}, ymm2/m256, imm8u","VPSHUFLW imm8u, ymm2/m256, {k}{= z}, ymm1","vpshuflw imm8u, ymm2/m256, {k}{z}, ymm1","EVEX.256.F2.0F.WIG 70 = /r ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPSHUFLW zmm1, {k}{z}, zmm2/m512, imm8u","VPSHUFLW imm8u, zmm2/m512, {k}{= z}, zmm1","vpshuflw imm8u, zmm2/m512, {k}{z}, zmm1","EVEX.512.F2.0F.WIG 70 = /r ib","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPSIGNB xmm1, xmmV, xmm2/m128","VPSIGNB xmm2/m128, xmmV, xmm1","vpsignb x= 
mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 08 /r","V","V","AVX","","w,r= ,r","","" +"VPSIGNB ymm1, ymmV, ymm2/m256","VPSIGNB ymm2/m256, ymmV, ymm1","vpsignb y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 08 /r","V","V","AVX2","","w,= r,r","","" +"VPSIGND xmm1, xmmV, xmm2/m128","VPSIGND xmm2/m128, xmmV, xmm1","vpsignd x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 0A /r","V","V","AVX","","w,r= ,r","","" +"VPSIGND ymm1, ymmV, ymm2/m256","VPSIGND ymm2/m256, ymmV, ymm1","vpsignd y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 0A /r","V","V","AVX2","","w,= r,r","","" +"VPSIGNW xmm1, xmmV, xmm2/m128","VPSIGNW xmm2/m128, xmmV, xmm1","vpsignw x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.WIG 09 /r","V","V","AVX","","w,r= ,r","","" +"VPSIGNW ymm1, ymmV, ymm2/m256","VPSIGNW ymm2/m256, ymmV, ymm1","vpsignw y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.WIG 09 /r","V","V","AVX2","","w,= r,r","","" +"VPSLLD xmmV, xmm2, imm8u","VPSLLD imm8u, xmm2, xmmV","vpslld imm8u, xmm2,= xmmV","VEX.NDD.128.66.0F.WIG 72 /6 ib","V","V","AVX","modrm_regonly","w,r,= r","","" +"VPSLLD xmmV, {k}{z}, xmm2/m128/m32bcst, imm8u","VPSLLD imm8u, xmm2/m128/m= 32bcst, {k}{z}, xmmV","vpslld imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","EVEX= .NDD.128.66.0F.W0 72 /6 ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w= ,r,r,r","","" +"VPSLLD ymmV, ymm2, imm8u","VPSLLD imm8u, ymm2, ymmV","vpslld imm8u, ymm2,= ymmV","VEX.NDD.256.66.0F.WIG 72 /6 ib","V","V","AVX2","modrm_regonly","w,r= ,r","","" +"VPSLLD ymmV, {k}{z}, ymm2/m256/m32bcst, imm8u","VPSLLD imm8u, ymm2/m256/m= 32bcst, {k}{z}, ymmV","vpslld imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","EVEX= .NDD.256.66.0F.W0 72 /6 ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w= ,r,r,r","","" +"VPSLLD zmmV, {k}{z}, zmm2/m512/m32bcst, imm8u","VPSLLD imm8u, zmm2/m512/m= 32bcst, {k}{z}, zmmV","vpslld imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","EVEX= .NDD.512.66.0F.W0 72 /6 ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","= ","" +"VPSLLD xmm1, xmmV, xmm2/m128","VPSLLD xmm2/m128, xmmV, xmm1","vpslld xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F2 /r","V","V","AVX","","w,r,r","= ","" +"VPSLLD xmm1, {k}{z}, xmmV, xmm2/m128","VPSLLD xmm2/m128, xmmV, {k}{z}, xm= m1","vpslld xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 F2 /r","V= ","V","AVX512F+AVX512VL","scale16","w,r,r,r","","" +"VPSLLD ymm1, ymmV, xmm2/m128","VPSLLD xmm2/m128, ymmV, ymm1","vpslld xmm2= /m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F2 /r","V","V","AVX2","","w,r,r",= "","" +"VPSLLD ymm1, {k}{z}, ymmV, xmm2/m128","VPSLLD xmm2/m128, ymmV, {k}{z}, ym= m1","vpslld xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 F2 /r","V= ","V","AVX512F+AVX512VL","scale16","w,r,r,r","","" +"VPSLLD zmm1, {k}{z}, zmmV, xmm2/m128","VPSLLD xmm2/m128, zmmV, {k}{z}, zm= m1","vpslld xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 F2 /r","V= ","V","AVX512F","scale16","w,r,r,r","","" +"VPSLLDQ xmmV, xmm2, imm8u","VPSLLDQ imm8u, xmm2, xmmV","vpslldq imm8u, xm= m2, xmmV","VEX.NDD.128.66.0F.WIG 73 /7 ib","V","V","AVX","modrm_regonly","w= ,r,r","","" +"VPSLLDQ xmmV, xmm2/m128, imm8u","VPSLLDQ imm8u, xmm2/m128, xmmV","vpslldq= imm8u, xmm2/m128, xmmV","EVEX.NDD.128.66.0F.WIG 73 /7 ib","V","V","AVX512B= W+AVX512VL","scale16","w,r,r","","" +"VPSLLDQ ymmV, ymm2, imm8u","VPSLLDQ imm8u, ymm2, ymmV","vpslldq imm8u, ym= m2, ymmV","VEX.NDD.256.66.0F.WIG 73 /7 ib","V","V","AVX2","modrm_regonly","= w,r,r","","" +"VPSLLDQ ymmV, ymm2/m256, imm8u","VPSLLDQ imm8u, ymm2/m256, ymmV","vpslldq= imm8u, ymm2/m256, ymmV","EVEX.NDD.256.66.0F.WIG 73 /7 
ib","V","V","AVX512B= W+AVX512VL","scale32","w,r,r","","" +"VPSLLDQ zmmV, zmm2/m512, imm8u","VPSLLDQ imm8u, zmm2/m512, zmmV","vpslldq= imm8u, zmm2/m512, zmmV","EVEX.NDD.512.66.0F.WIG 73 /7 ib","V","V","AVX512B= W","scale64","w,r,r","","" +"VPSLLQ xmmV, xmm2, imm8u","VPSLLQ imm8u, xmm2, xmmV","vpsllq imm8u, xmm2,= xmmV","VEX.NDD.128.66.0F.WIG 73 /6 ib","V","V","AVX","modrm_regonly","w,r,= r","","" +"VPSLLQ xmmV, {k}{z}, xmm2/m128/m64bcst, imm8u","VPSLLQ imm8u, xmm2/m128/m= 64bcst, {k}{z}, xmmV","vpsllq imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","EVEX= .NDD.128.66.0F.W1 73 /6 ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w= ,r,r,r","","" +"VPSLLQ ymmV, ymm2, imm8u","VPSLLQ imm8u, ymm2, ymmV","vpsllq imm8u, ymm2,= ymmV","VEX.NDD.256.66.0F.WIG 73 /6 ib","V","V","AVX2","modrm_regonly","w,r= ,r","","" +"VPSLLQ ymmV, {k}{z}, ymm2/m256/m64bcst, imm8u","VPSLLQ imm8u, ymm2/m256/m= 64bcst, {k}{z}, ymmV","vpsllq imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","EVEX= .NDD.256.66.0F.W1 73 /6 ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w= ,r,r,r","","" +"VPSLLQ zmmV, {k}{z}, zmm2/m512/m64bcst, imm8u","VPSLLQ imm8u, zmm2/m512/m= 64bcst, {k}{z}, zmmV","vpsllq imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","EVEX= .NDD.512.66.0F.W1 73 /6 ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","= ","" +"VPSLLQ xmm1, xmmV, xmm2/m128","VPSLLQ xmm2/m128, xmmV, xmm1","vpsllq xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F3 /r","V","V","AVX","","w,r,r","= ","" +"VPSLLQ xmm1, {k}{z}, xmmV, xmm2/m128","VPSLLQ xmm2/m128, xmmV, {k}{z}, xm= m1","vpsllq xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 F3 /r","V= ","V","AVX512F+AVX512VL","scale16","w,r,r,r","","" +"VPSLLQ ymm1, ymmV, xmm2/m128","VPSLLQ xmm2/m128, ymmV, ymm1","vpsllq xmm2= /m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F3 /r","V","V","AVX2","","w,r,r",= "","" +"VPSLLQ ymm1, {k}{z}, ymmV, xmm2/m128","VPSLLQ xmm2/m128, ymmV, {k}{z}, ym= m1","vpsllq xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 F3 /r","V= ","V","AVX512F+AVX512VL","scale16","w,r,r,r","","" +"VPSLLQ zmm1, {k}{z}, zmmV, xmm2/m128","VPSLLQ xmm2/m128, zmmV, {k}{z}, zm= m1","vpsllq xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 F3 /r","V= ","V","AVX512F","scale16","w,r,r,r","","" +"VPSLLVD xmm1, xmmV, xmm2/m128","VPSLLVD xmm2/m128, xmmV, xmm1","vpsllvd x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 47 /r","V","V","AVX2","","w,r= ,r","","" +"VPSLLVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPSLLVD xmm2/m128/m32bcst= , xmmV, {k}{z}, xmm1","vpsllvd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W0 47 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,= r,r,r","","" +"VPSLLVD ymm1, ymmV, ymm2/m256","VPSLLVD ymm2/m256, ymmV, ymm1","vpsllvd y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 47 /r","V","V","AVX2","","w,r= ,r","","" +"VPSLLVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPSLLVD ymm2/m256/m32bcst= , ymmV, {k}{z}, ymm1","vpsllvd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W0 47 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,= r,r,r","","" +"VPSLLVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPSLLVD zmm2/m512/m32bcst= , zmmV, {k}{z}, zmm1","vpsllvd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W0 47 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r",""= ,"" +"VPSLLVQ xmm1, xmmV, xmm2/m128","VPSLLVQ xmm2/m128, xmmV, xmm1","vpsllvq x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W1 47 /r","V","V","AVX2","","w,r= ,r","","" +"VPSLLVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPSLLVQ xmm2/m128/m64bcst= , xmmV, {k}{z}, xmm1","vpsllvq 
xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W1 47 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,= r,r,r","","" +"VPSLLVQ ymm1, ymmV, ymm2/m256","VPSLLVQ ymm2/m256, ymmV, ymm1","vpsllvq y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W1 47 /r","V","V","AVX2","","w,r= ,r","","" +"VPSLLVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPSLLVQ ymm2/m256/m64bcst= , ymmV, {k}{z}, ymm1","vpsllvq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W1 47 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,= r,r,r","","" +"VPSLLVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPSLLVQ zmm2/m512/m64bcst= , zmmV, {k}{z}, zmm1","vpsllvq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W1 47 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r",""= ,"" +"VPSLLVW xmm1, {k}{z}, xmmV, xmm2/m128","VPSLLVW xmm2/m128, xmmV, {k}{z}, = xmm1","vpsllvw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 12 /= r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSLLVW ymm1, {k}{z}, ymmV, ymm2/m256","VPSLLVW ymm2/m256, ymmV, {k}{z}, = ymm1","vpsllvw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 12 /= r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPSLLVW zmm1, {k}{z}, zmmV, zmm2/m512","VPSLLVW zmm2/m512, zmmV, {k}{z}, = zmm1","vpsllvw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 12 /= r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPSLLW xmmV, xmm2, imm8u","VPSLLW imm8u, xmm2, xmmV","vpsllw imm8u, xmm2,= xmmV","VEX.NDD.128.66.0F.WIG 71 /6 ib","V","V","AVX","modrm_regonly","w,r,= r","","" +"VPSLLW xmmV, {k}{z}, xmm2/m128, imm8u","VPSLLW imm8u, xmm2/m128, {k}{z}, = xmmV","vpsllw imm8u, xmm2/m128, {k}{z}, xmmV","EVEX.NDD.128.66.0F.WIG 71 /6= ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSLLW ymmV, ymm2, imm8u","VPSLLW imm8u, ymm2, ymmV","vpsllw imm8u, ymm2,= ymmV","VEX.NDD.256.66.0F.WIG 71 /6 ib","V","V","AVX2","modrm_regonly","w,r= ,r","","" +"VPSLLW ymmV, {k}{z}, ymm2/m256, imm8u","VPSLLW imm8u, ymm2/m256, {k}{z}, = ymmV","vpsllw imm8u, ymm2/m256, {k}{z}, ymmV","EVEX.NDD.256.66.0F.WIG 71 /6= ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPSLLW zmmV, {k}{z}, zmm2/m512, imm8u","VPSLLW imm8u, zmm2/m512, {k}{z}, = zmmV","vpsllw imm8u, zmm2/m512, {k}{z}, zmmV","EVEX.NDD.512.66.0F.WIG 71 /6= ib","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPSLLW xmm1, xmmV, xmm2/m128","VPSLLW xmm2/m128, xmmV, xmm1","vpsllw xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F1 /r","V","V","AVX","","w,r,r","= ","" +"VPSLLW xmm1, {k}{z}, xmmV, xmm2/m128","VPSLLW xmm2/m128, xmmV, {k}{z}, xm= m1","vpsllw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG F1 /r","= V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSLLW ymm1, ymmV, xmm2/m128","VPSLLW xmm2/m128, ymmV, ymm1","vpsllw xmm2= /m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F1 /r","V","V","AVX2","","w,r,r",= "","" +"VPSLLW ymm1, {k}{z}, ymmV, xmm2/m128","VPSLLW xmm2/m128, ymmV, {k}{z}, ym= m1","vpsllw xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG F1 /r","= V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSLLW zmm1, {k}{z}, zmmV, xmm2/m128","VPSLLW xmm2/m128, zmmV, {k}{z}, zm= m1","vpsllw xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG F1 /r","= V","V","AVX512BW","scale16","w,r,r,r","","" +"VPSRAD xmmV, xmm2, imm8u","VPSRAD imm8u, xmm2, xmmV","vpsrad imm8u, xmm2,= xmmV","VEX.NDD.128.66.0F.WIG 72 /4 ib","V","V","AVX","modrm_regonly","w,r,= r","","" +"VPSRAD xmmV, {k}{z}, xmm2/m128/m32bcst, imm8u","VPSRAD imm8u, xmm2/m128/m= 32bcst, {k}{z}, xmmV","vpsrad 
imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","EVEX= .NDD.128.66.0F.W0 72 /4 ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w= ,r,r,r","","" +"VPSRAD ymmV, ymm2, imm8u","VPSRAD imm8u, ymm2, ymmV","vpsrad imm8u, ymm2,= ymmV","VEX.NDD.256.66.0F.WIG 72 /4 ib","V","V","AVX2","modrm_regonly","w,r= ,r","","" +"VPSRAD ymmV, {k}{z}, ymm2/m256/m32bcst, imm8u","VPSRAD imm8u, ymm2/m256/m= 32bcst, {k}{z}, ymmV","vpsrad imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","EVEX= .NDD.256.66.0F.W0 72 /4 ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w= ,r,r,r","","" +"VPSRAD zmmV, {k}{z}, zmm2/m512/m32bcst, imm8u","VPSRAD imm8u, zmm2/m512/m= 32bcst, {k}{z}, zmmV","vpsrad imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","EVEX= .NDD.512.66.0F.W0 72 /4 ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","= ","" +"VPSRAD xmm1, xmmV, xmm2/m128","VPSRAD xmm2/m128, xmmV, xmm1","vpsrad xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E2 /r","V","V","AVX","","w,r,r","= ","" +"VPSRAD xmm1, {k}{z}, xmmV, xmm2/m128","VPSRAD xmm2/m128, xmmV, {k}{z}, xm= m1","vpsrad xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 E2 /r","V= ","V","AVX512F+AVX512VL","scale16","w,r,r,r","","" +"VPSRAD ymm1, ymmV, xmm2/m128","VPSRAD xmm2/m128, ymmV, ymm1","vpsrad xmm2= /m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E2 /r","V","V","AVX2","","w,r,r",= "","" +"VPSRAD ymm1, {k}{z}, ymmV, xmm2/m128","VPSRAD xmm2/m128, ymmV, {k}{z}, ym= m1","vpsrad xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 E2 /r","V= ","V","AVX512F+AVX512VL","scale16","w,r,r,r","","" +"VPSRAD zmm1, {k}{z}, zmmV, xmm2/m128","VPSRAD xmm2/m128, zmmV, {k}{z}, zm= m1","vpsrad xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 E2 /r","V= ","V","AVX512F","scale16","w,r,r,r","","" +"VPSRAQ xmmV, {k}{z}, xmm2/m128/m64bcst, imm8u","VPSRAQ imm8u, xmm2/m128/m= 64bcst, {k}{z}, xmmV","vpsraq imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","EVEX= .NDD.128.66.0F.W1 72 /4 ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w= ,r,r,r","","" +"VPSRAQ ymmV, {k}{z}, ymm2/m256/m64bcst, imm8u","VPSRAQ imm8u, ymm2/m256/m= 64bcst, {k}{z}, ymmV","vpsraq imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","EVEX= .NDD.256.66.0F.W1 72 /4 ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w= ,r,r,r","","" +"VPSRAQ zmmV, {k}{z}, zmm2/m512/m64bcst, imm8u","VPSRAQ imm8u, zmm2/m512/m= 64bcst, {k}{z}, zmmV","vpsraq imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","EVEX= .NDD.512.66.0F.W1 72 /4 ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","= ","" +"VPSRAQ xmm1, {k}{z}, xmmV, xmm2/m128","VPSRAQ xmm2/m128, xmmV, {k}{z}, xm= m1","vpsraq xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 E2 /r","V= ","V","AVX512F+AVX512VL","scale16","w,r,r,r","","" +"VPSRAQ ymm1, {k}{z}, ymmV, xmm2/m128","VPSRAQ xmm2/m128, ymmV, {k}{z}, ym= m1","vpsraq xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 E2 /r","V= ","V","AVX512F+AVX512VL","scale16","w,r,r,r","","" +"VPSRAQ zmm1, {k}{z}, zmmV, xmm2/m128","VPSRAQ xmm2/m128, zmmV, {k}{z}, zm= m1","vpsraq xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 E2 /r","V= ","V","AVX512F","scale16","w,r,r,r","","" +"VPSRAVD xmm1, xmmV, xmm2/m128","VPSRAVD xmm2/m128, xmmV, xmm1","vpsravd x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 46 /r","V","V","AVX2","","w,r= ,r","","" +"VPSRAVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPSRAVD xmm2/m128/m32bcst= , xmmV, {k}{z}, xmm1","vpsravd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W0 46 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,= r,r,r","","" +"VPSRAVD ymm1, ymmV, ymm2/m256","VPSRAVD ymm2/m256, ymmV, ymm1","vpsravd y= mm2/m256, ymmV, 
ymm1","VEX.NDS.256.66.0F38.W0 46 /r","V","V","AVX2","","w,r= ,r","","" +"VPSRAVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPSRAVD ymm2/m256/m32bcst= , ymmV, {k}{z}, ymm1","vpsravd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W0 46 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,= r,r,r","","" +"VPSRAVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPSRAVD zmm2/m512/m32bcst= , zmmV, {k}{z}, zmm1","vpsravd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W0 46 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r",""= ,"" +"VPSRAVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPSRAVQ xmm2/m128/m64bcst= , xmmV, {k}{z}, xmm1","vpsravq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W1 46 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,= r,r,r","","" +"VPSRAVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPSRAVQ ymm2/m256/m64bcst= , ymmV, {k}{z}, ymm1","vpsravq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W1 46 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,= r,r,r","","" +"VPSRAVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPSRAVQ zmm2/m512/m64bcst= , zmmV, {k}{z}, zmm1","vpsravq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W1 46 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r",""= ,"" +"VPSRAVW xmm1, {k}{z}, xmmV, xmm2/m128","VPSRAVW xmm2/m128, xmmV, {k}{z}, = xmm1","vpsravw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 11 /= r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSRAVW ymm1, {k}{z}, ymmV, ymm2/m256","VPSRAVW ymm2/m256, ymmV, {k}{z}, = ymm1","vpsravw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 11 /= r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPSRAVW zmm1, {k}{z}, zmmV, zmm2/m512","VPSRAVW zmm2/m512, zmmV, {k}{z}, = zmm1","vpsravw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 11 /= r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPSRAW xmmV, xmm2, imm8u","VPSRAW imm8u, xmm2, xmmV","vpsraw imm8u, xmm2,= xmmV","VEX.NDD.128.66.0F.WIG 71 /4 ib","V","V","AVX","modrm_regonly","w,r,= r","","" +"VPSRAW xmmV, {k}{z}, xmm2/m128, imm8u","VPSRAW imm8u, xmm2/m128, {k}{z}, = xmmV","vpsraw imm8u, xmm2/m128, {k}{z}, xmmV","EVEX.NDD.128.66.0F.WIG 71 /4= ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSRAW ymmV, ymm2, imm8u","VPSRAW imm8u, ymm2, ymmV","vpsraw imm8u, ymm2,= ymmV","VEX.NDD.256.66.0F.WIG 71 /4 ib","V","V","AVX2","modrm_regonly","w,r= ,r","","" +"VPSRAW ymmV, {k}{z}, ymm2/m256, imm8u","VPSRAW imm8u, ymm2/m256, {k}{z}, = ymmV","vpsraw imm8u, ymm2/m256, {k}{z}, ymmV","EVEX.NDD.256.66.0F.WIG 71 /4= ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPSRAW zmmV, {k}{z}, zmm2/m512, imm8u","VPSRAW imm8u, zmm2/m512, {k}{z}, = zmmV","vpsraw imm8u, zmm2/m512, {k}{z}, zmmV","EVEX.NDD.512.66.0F.WIG 71 /4= ib","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPSRAW xmm1, xmmV, xmm2/m128","VPSRAW xmm2/m128, xmmV, xmm1","vpsraw xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E1 /r","V","V","AVX","","w,r,r","= ","" +"VPSRAW xmm1, {k}{z}, xmmV, xmm2/m128","VPSRAW xmm2/m128, xmmV, {k}{z}, xm= m1","vpsraw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG E1 /r","= V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSRAW ymm1, ymmV, xmm2/m128","VPSRAW xmm2/m128, ymmV, ymm1","vpsraw xmm2= /m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E1 /r","V","V","AVX2","","w,r,r",= "","" +"VPSRAW ymm1, {k}{z}, ymmV, xmm2/m128","VPSRAW xmm2/m128, ymmV, {k}{z}, ym= m1","vpsraw xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG E1 /r","= 
V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSRAW zmm1, {k}{z}, zmmV, xmm2/m128","VPSRAW xmm2/m128, zmmV, {k}{z}, zm= m1","vpsraw xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG E1 /r","= V","V","AVX512BW","scale16","w,r,r,r","","" +"VPSRLD xmmV, xmm2, imm8u","VPSRLD imm8u, xmm2, xmmV","vpsrld imm8u, xmm2,= xmmV","VEX.NDD.128.66.0F.WIG 72 /2 ib","V","V","AVX","modrm_regonly","w,r,= r","","" +"VPSRLD xmmV, {k}{z}, xmm2/m128/m32bcst, imm8u","VPSRLD imm8u, xmm2/m128/m= 32bcst, {k}{z}, xmmV","vpsrld imm8u, xmm2/m128/m32bcst, {k}{z}, xmmV","EVEX= .NDD.128.66.0F.W0 72 /2 ib","V","V","AVX512F+AVX512VL","bscale4,scale16","w= ,r,r,r","","" +"VPSRLD ymmV, ymm2, imm8u","VPSRLD imm8u, ymm2, ymmV","vpsrld imm8u, ymm2,= ymmV","VEX.NDD.256.66.0F.WIG 72 /2 ib","V","V","AVX2","modrm_regonly","w,r= ,r","","" +"VPSRLD ymmV, {k}{z}, ymm2/m256/m32bcst, imm8u","VPSRLD imm8u, ymm2/m256/m= 32bcst, {k}{z}, ymmV","vpsrld imm8u, ymm2/m256/m32bcst, {k}{z}, ymmV","EVEX= .NDD.256.66.0F.W0 72 /2 ib","V","V","AVX512F+AVX512VL","bscale4,scale32","w= ,r,r,r","","" +"VPSRLD zmmV, {k}{z}, zmm2/m512/m32bcst, imm8u","VPSRLD imm8u, zmm2/m512/m= 32bcst, {k}{z}, zmmV","vpsrld imm8u, zmm2/m512/m32bcst, {k}{z}, zmmV","EVEX= .NDD.512.66.0F.W0 72 /2 ib","V","V","AVX512F","bscale4,scale64","w,r,r,r","= ","" +"VPSRLD xmm1, xmmV, xmm2/m128","VPSRLD xmm2/m128, xmmV, xmm1","vpsrld xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D2 /r","V","V","AVX","","w,r,r","= ","" +"VPSRLD xmm1, {k}{z}, xmmV, xmm2/m128","VPSRLD xmm2/m128, xmmV, {k}{z}, xm= m1","vpsrld xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W0 D2 /r","V= ","V","AVX512F+AVX512VL","scale16","w,r,r,r","","" +"VPSRLD ymm1, ymmV, xmm2/m128","VPSRLD xmm2/m128, ymmV, ymm1","vpsrld xmm2= /m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D2 /r","V","V","AVX2","","w,r,r",= "","" +"VPSRLD ymm1, {k}{z}, ymmV, xmm2/m128","VPSRLD xmm2/m128, ymmV, {k}{z}, ym= m1","vpsrld xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W0 D2 /r","V= ","V","AVX512F+AVX512VL","scale16","w,r,r,r","","" +"VPSRLD zmm1, {k}{z}, zmmV, xmm2/m128","VPSRLD xmm2/m128, zmmV, {k}{z}, zm= m1","vpsrld xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W0 D2 /r","V= ","V","AVX512F","scale16","w,r,r,r","","" +"VPSRLDQ xmmV, xmm2, imm8u","VPSRLDQ imm8u, xmm2, xmmV","vpsrldq imm8u, xm= m2, xmmV","VEX.NDD.128.66.0F.WIG 73 /3 ib","V","V","AVX","modrm_regonly","w= ,r,r","","" +"VPSRLDQ xmmV, xmm2/m128, imm8u","VPSRLDQ imm8u, xmm2/m128, xmmV","vpsrldq= imm8u, xmm2/m128, xmmV","EVEX.NDD.128.66.0F.WIG 73 /3 ib","V","V","AVX512B= W+AVX512VL","scale16","w,r,r","","" +"VPSRLDQ ymmV, ymm2, imm8u","VPSRLDQ imm8u, ymm2, ymmV","vpsrldq imm8u, ym= m2, ymmV","VEX.NDD.256.66.0F.WIG 73 /3 ib","V","V","AVX2","modrm_regonly","= w,r,r","","" +"VPSRLDQ ymmV, ymm2/m256, imm8u","VPSRLDQ imm8u, ymm2/m256, ymmV","vpsrldq= imm8u, ymm2/m256, ymmV","EVEX.NDD.256.66.0F.WIG 73 /3 ib","V","V","AVX512B= W+AVX512VL","scale32","w,r,r","","" +"VPSRLDQ zmmV, zmm2/m512, imm8u","VPSRLDQ imm8u, zmm2/m512, zmmV","vpsrldq= imm8u, zmm2/m512, zmmV","EVEX.NDD.512.66.0F.WIG 73 /3 ib","V","V","AVX512B= W","scale64","w,r,r","","" +"VPSRLQ xmmV, xmm2, imm8u","VPSRLQ imm8u, xmm2, xmmV","vpsrlq imm8u, xmm2,= xmmV","VEX.NDD.128.66.0F.WIG 73 /2 ib","V","V","AVX","modrm_regonly","w,r,= r","","" +"VPSRLQ xmmV, {k}{z}, xmm2/m128/m64bcst, imm8u","VPSRLQ imm8u, xmm2/m128/m= 64bcst, {k}{z}, xmmV","vpsrlq imm8u, xmm2/m128/m64bcst, {k}{z}, xmmV","EVEX= .NDD.128.66.0F.W1 73 /2 ib","V","V","AVX512F+AVX512VL","bscale8,scale16","w= ,r,r,r","","" +"VPSRLQ ymmV, ymm2, 
imm8u","VPSRLQ imm8u, ymm2, ymmV","vpsrlq imm8u, ymm2,= ymmV","VEX.NDD.256.66.0F.WIG 73 /2 ib","V","V","AVX2","modrm_regonly","w,r= ,r","","" +"VPSRLQ ymmV, {k}{z}, ymm2/m256/m64bcst, imm8u","VPSRLQ imm8u, ymm2/m256/m= 64bcst, {k}{z}, ymmV","vpsrlq imm8u, ymm2/m256/m64bcst, {k}{z}, ymmV","EVEX= .NDD.256.66.0F.W1 73 /2 ib","V","V","AVX512F+AVX512VL","bscale8,scale32","w= ,r,r,r","","" +"VPSRLQ zmmV, {k}{z}, zmm2/m512/m64bcst, imm8u","VPSRLQ imm8u, zmm2/m512/m= 64bcst, {k}{z}, zmmV","vpsrlq imm8u, zmm2/m512/m64bcst, {k}{z}, zmmV","EVEX= .NDD.512.66.0F.W1 73 /2 ib","V","V","AVX512F","bscale8,scale64","w,r,r,r","= ","" +"VPSRLQ xmm1, xmmV, xmm2/m128","VPSRLQ xmm2/m128, xmmV, xmm1","vpsrlq xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D3 /r","V","V","AVX","","w,r,r","= ","" +"VPSRLQ xmm1, {k}{z}, xmmV, xmm2/m128","VPSRLQ xmm2/m128, xmmV, {k}{z}, xm= m1","vpsrlq xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 D3 /r","V= ","V","AVX512F+AVX512VL","scale16","w,r,r,r","","" +"VPSRLQ ymm1, ymmV, xmm2/m128","VPSRLQ xmm2/m128, ymmV, ymm1","vpsrlq xmm2= /m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D3 /r","V","V","AVX2","","w,r,r",= "","" +"VPSRLQ ymm1, {k}{z}, ymmV, xmm2/m128","VPSRLQ xmm2/m128, ymmV, {k}{z}, ym= m1","vpsrlq xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 D3 /r","V= ","V","AVX512F+AVX512VL","scale16","w,r,r,r","","" +"VPSRLQ zmm1, {k}{z}, zmmV, xmm2/m128","VPSRLQ xmm2/m128, zmmV, {k}{z}, zm= m1","vpsrlq xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 D3 /r","V= ","V","AVX512F","scale16","w,r,r,r","","" +"VPSRLVD xmm1, xmmV, xmm2/m128","VPSRLVD xmm2/m128, xmmV, xmm1","vpsrlvd x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W0 45 /r","V","V","AVX2","","w,r= ,r","","" +"VPSRLVD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPSRLVD xmm2/m128/m32bcst= , xmmV, {k}{z}, xmm1","vpsrlvd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W0 45 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,= r,r,r","","" +"VPSRLVD ymm1, ymmV, ymm2/m256","VPSRLVD ymm2/m256, ymmV, ymm1","vpsrlvd y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W0 45 /r","V","V","AVX2","","w,r= ,r","","" +"VPSRLVD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPSRLVD ymm2/m256/m32bcst= , ymmV, {k}{z}, ymm1","vpsrlvd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W0 45 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,= r,r,r","","" +"VPSRLVD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPSRLVD zmm2/m512/m32bcst= , zmmV, {k}{z}, zmm1","vpsrlvd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W0 45 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r",""= ,"" +"VPSRLVQ xmm1, xmmV, xmm2/m128","VPSRLVQ xmm2/m128, xmmV, xmm1","vpsrlvq x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F38.W1 45 /r","V","V","AVX2","","w,r= ,r","","" +"VPSRLVQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPSRLVQ xmm2/m128/m64bcst= , xmmV, {k}{z}, xmm1","vpsrlvq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX= .NDS.128.66.0F38.W1 45 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,= r,r,r","","" +"VPSRLVQ ymm1, ymmV, ymm2/m256","VPSRLVQ ymm2/m256, ymmV, ymm1","vpsrlvq y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F38.W1 45 /r","V","V","AVX2","","w,r= ,r","","" +"VPSRLVQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPSRLVQ ymm2/m256/m64bcst= , ymmV, {k}{z}, ymm1","vpsrlvq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX= .NDS.256.66.0F38.W1 45 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,= r,r,r","","" +"VPSRLVQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPSRLVQ zmm2/m512/m64bcst= , zmmV, {k}{z}, zmm1","vpsrlvq zmm2/m512/m64bcst, zmmV, 
{k}{z}, zmm1","EVEX= .NDS.512.66.0F38.W1 45 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r",""= ,"" +"VPSRLVW xmm1, {k}{z}, xmmV, xmm2/m128","VPSRLVW xmm2/m128, xmmV, {k}{z}, = xmm1","vpsrlvw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F38.W1 10 /= r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSRLVW ymm1, {k}{z}, ymmV, ymm2/m256","VPSRLVW ymm2/m256, ymmV, {k}{z}, = ymm1","vpsrlvw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F38.W1 10 /= r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPSRLVW zmm1, {k}{z}, zmmV, zmm2/m512","VPSRLVW zmm2/m512, zmmV, {k}{z}, = zmm1","vpsrlvw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F38.W1 10 /= r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPSRLW xmmV, xmm2, imm8u","VPSRLW imm8u, xmm2, xmmV","vpsrlw imm8u, xmm2,= xmmV","VEX.NDD.128.66.0F.WIG 71 /2 ib","V","V","AVX","modrm_regonly","w,r,= r","","" +"VPSRLW xmmV, {k}{z}, xmm2/m128, imm8u","VPSRLW imm8u, xmm2/m128, {k}{z}, = xmmV","vpsrlw imm8u, xmm2/m128, {k}{z}, xmmV","EVEX.NDD.128.66.0F.WIG 71 /2= ib","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSRLW ymmV, ymm2, imm8u","VPSRLW imm8u, ymm2, ymmV","vpsrlw imm8u, ymm2,= ymmV","VEX.NDD.256.66.0F.WIG 71 /2 ib","V","V","AVX2","modrm_regonly","w,r= ,r","","" +"VPSRLW ymmV, {k}{z}, ymm2/m256, imm8u","VPSRLW imm8u, ymm2/m256, {k}{z}, = ymmV","vpsrlw imm8u, ymm2/m256, {k}{z}, ymmV","EVEX.NDD.256.66.0F.WIG 71 /2= ib","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPSRLW zmmV, {k}{z}, zmm2/m512, imm8u","VPSRLW imm8u, zmm2/m512, {k}{z}, = zmmV","vpsrlw imm8u, zmm2/m512, {k}{z}, zmmV","EVEX.NDD.512.66.0F.WIG 71 /2= ib","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPSRLW xmm1, xmmV, xmm2/m128","VPSRLW xmm2/m128, xmmV, xmm1","vpsrlw xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D1 /r","V","V","AVX","","w,r,r","= ","" +"VPSRLW xmm1, {k}{z}, xmmV, xmm2/m128","VPSRLW xmm2/m128, xmmV, {k}{z}, xm= m1","vpsrlw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG D1 /r","= V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSRLW ymm1, ymmV, xmm2/m128","VPSRLW xmm2/m128, ymmV, ymm1","vpsrlw xmm2= /m128, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D1 /r","V","V","AVX2","","w,r,r",= "","" +"VPSRLW ymm1, {k}{z}, ymmV, xmm2/m128","VPSRLW xmm2/m128, ymmV, {k}{z}, ym= m1","vpsrlw xmm2/m128, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG D1 /r","= V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSRLW zmm1, {k}{z}, zmmV, xmm2/m128","VPSRLW xmm2/m128, zmmV, {k}{z}, zm= m1","vpsrlw xmm2/m128, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG D1 /r","= V","V","AVX512BW","scale16","w,r,r,r","","" +"VPSUBB xmm1, xmmV, xmm2/m128","VPSUBB xmm2/m128, xmmV, xmm1","vpsubb xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F8 /r","V","V","AVX","","w,r,r","= ","" +"VPSUBB xmm1, {k}{z}, xmmV, xmm2/m128","VPSUBB xmm2/m128, xmmV, {k}{z}, xm= m1","vpsubb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG F8 /r","= V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSUBB ymm1, ymmV, ymm2/m256","VPSUBB ymm2/m256, ymmV, ymm1","vpsubb ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F8 /r","V","V","AVX2","","w,r,r",= "","" +"VPSUBB ymm1, {k}{z}, ymmV, ymm2/m256","VPSUBB ymm2/m256, ymmV, {k}{z}, ym= m1","vpsubb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG F8 /r","= V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPSUBB zmm1, {k}{z}, zmmV, zmm2/m512","VPSUBB zmm2/m512, zmmV, {k}{z}, zm= m1","vpsubb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG F8 /r","= V","V","AVX512BW","scale64","w,r,r,r","","" +"VPSUBD xmm1, 
xmmV, xmm2/m128","VPSUBD xmm2/m128, xmmV, xmm1","vpsubd xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG FA /r","V","V","AVX","","w,r,r","= ","" +"VPSUBD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPSUBD xmm2/m128/m32bcst, = xmmV, {k}{z}, xmm1","vpsubd xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F.W0 FA /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r= ","","" +"VPSUBD ymm1, ymmV, ymm2/m256","VPSUBD ymm2/m256, ymmV, ymm1","vpsubd ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG FA /r","V","V","AVX2","","w,r,r",= "","" +"VPSUBD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPSUBD ymm2/m256/m32bcst, = ymmV, {k}{z}, ymm1","vpsubd ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F.W0 FA /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r= ","","" +"VPSUBD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPSUBD zmm2/m512/m32bcst, = zmmV, {k}{z}, zmm1","vpsubd zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F.W0 FA /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VPSUBQ xmm1, xmmV, xmm2/m128","VPSUBQ xmm2/m128, xmmV, xmm1","vpsubq xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG FB /r","V","V","AVX","","w,r,r","= ","" +"VPSUBQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPSUBQ xmm2/m128/m64bcst, = xmmV, {k}{z}, xmm1","vpsubq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F.W1 FB /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r= ","","" +"VPSUBQ ymm1, ymmV, ymm2/m256","VPSUBQ ymm2/m256, ymmV, ymm1","vpsubq ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG FB /r","V","V","AVX2","","w,r,r",= "","" +"VPSUBQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPSUBQ ymm2/m256/m64bcst, = ymmV, {k}{z}, ymm1","vpsubq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F.W1 FB /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r= ","","" +"VPSUBQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPSUBQ zmm2/m512/m64bcst, = zmmV, {k}{z}, zmm1","vpsubq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F.W1 FB /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VPSUBSB xmm1, xmmV, xmm2/m128","VPSUBSB xmm2/m128, xmmV, xmm1","vpsubsb x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E8 /r","V","V","AVX","","w,r,r= ","","" +"VPSUBSB xmm1, {k}{z}, xmmV, xmm2/m128","VPSUBSB xmm2/m128, xmmV, {k}{z}, = xmm1","vpsubsb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG E8 /r= ","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSUBSB ymm1, ymmV, ymm2/m256","VPSUBSB ymm2/m256, ymmV, ymm1","vpsubsb y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E8 /r","V","V","AVX2","","w,r,= r","","" +"VPSUBSB ymm1, {k}{z}, ymmV, ymm2/m256","VPSUBSB ymm2/m256, ymmV, {k}{z}, = ymm1","vpsubsb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG E8 /r= ","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPSUBSB zmm1, {k}{z}, zmmV, zmm2/m512","VPSUBSB zmm2/m512, zmmV, {k}{z}, = zmm1","vpsubsb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG E8 /r= ","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPSUBSW xmm1, xmmV, xmm2/m128","VPSUBSW xmm2/m128, xmmV, xmm1","vpsubsw x= mm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG E9 /r","V","V","AVX","","w,r,r= ","","" +"VPSUBSW xmm1, {k}{z}, xmmV, xmm2/m128","VPSUBSW xmm2/m128, xmmV, {k}{z}, = xmm1","vpsubsw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG E9 /r= ","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSUBSW ymm1, ymmV, ymm2/m256","VPSUBSW ymm2/m256, ymmV, ymm1","vpsubsw y= mm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG E9 /r","V","V","AVX2","","w,r,= r","","" +"VPSUBSW ymm1, {k}{z}, ymmV, 
ymm2/m256","VPSUBSW ymm2/m256, ymmV, {k}{z}, = ymm1","vpsubsw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG E9 /r= ","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPSUBSW zmm1, {k}{z}, zmmV, zmm2/m512","VPSUBSW zmm2/m512, zmmV, {k}{z}, = zmm1","vpsubsw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG E9 /r= ","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPSUBUSB xmm1, xmmV, xmm2/m128","VPSUBUSB xmm2/m128, xmmV, xmm1","vpsubus= b xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D8 /r","V","V","AVX","","w,= r,r","","" +"VPSUBUSB xmm1, {k}{z}, xmmV, xmm2/m128","VPSUBUSB xmm2/m128, xmmV, {k}{z}= , xmm1","vpsubusb xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG D8= /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSUBUSB ymm1, ymmV, ymm2/m256","VPSUBUSB ymm2/m256, ymmV, ymm1","vpsubus= b ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D8 /r","V","V","AVX2","","w= ,r,r","","" +"VPSUBUSB ymm1, {k}{z}, ymmV, ymm2/m256","VPSUBUSB ymm2/m256, ymmV, {k}{z}= , ymm1","vpsubusb ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG D8= /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPSUBUSB zmm1, {k}{z}, zmmV, zmm2/m512","VPSUBUSB zmm2/m512, zmmV, {k}{z}= , zmm1","vpsubusb zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG D8= /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPSUBUSW xmm1, xmmV, xmm2/m128","VPSUBUSW xmm2/m128, xmmV, xmm1","vpsubus= w xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG D9 /r","V","V","AVX","","w,= r,r","","" +"VPSUBUSW xmm1, {k}{z}, xmmV, xmm2/m128","VPSUBUSW xmm2/m128, xmmV, {k}{z}= , xmm1","vpsubusw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG D9= /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSUBUSW ymm1, ymmV, ymm2/m256","VPSUBUSW ymm2/m256, ymmV, ymm1","vpsubus= w ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG D9 /r","V","V","AVX2","","w= ,r,r","","" +"VPSUBUSW ymm1, {k}{z}, ymmV, ymm2/m256","VPSUBUSW ymm2/m256, ymmV, {k}{z}= , ymm1","vpsubusw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG D9= /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPSUBUSW zmm1, {k}{z}, zmmV, zmm2/m512","VPSUBUSW zmm2/m512, zmmV, {k}{z}= , zmm1","vpsubusw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG D9= /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPSUBW xmm1, xmmV, xmm2/m128","VPSUBW xmm2/m128, xmmV, xmm1","vpsubw xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG F9 /r","V","V","AVX","","w,r,r","= ","" +"VPSUBW xmm1, {k}{z}, xmmV, xmm2/m128","VPSUBW xmm2/m128, xmmV, {k}{z}, xm= m1","vpsubw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.WIG F9 /r","= V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPSUBW ymm1, ymmV, ymm2/m256","VPSUBW ymm2/m256, ymmV, ymm1","vpsubw ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG F9 /r","V","V","AVX2","","w,r,r",= "","" +"VPSUBW ymm1, {k}{z}, ymmV, ymm2/m256","VPSUBW ymm2/m256, ymmV, {k}{z}, ym= m1","vpsubw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.WIG F9 /r","= V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPSUBW zmm1, {k}{z}, zmmV, zmm2/m512","VPSUBW zmm2/m512, zmmV, {k}{z}, zm= m1","vpsubw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.WIG F9 /r","= V","V","AVX512BW","scale64","w,r,r,r","","" +"VPTERNLOGD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u","VPTERNLOGD imm8= u, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vpternlogd imm8u, xmm2/m128/m32b= cst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F3A.W0 25 /r ib","V","V","AVX512= F+AVX512VL","bscale4,scale16","rw,r,r,r,r","","" +"VPTERNLOGD ymm1, {k}{z}, ymmV, 
ymm2/m256/m32bcst, imm8u","VPTERNLOGD imm8= u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vpternlogd imm8u, ymm2/m256/m32b= cst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F3A.W0 25 /r ib","V","V","AVX512= F+AVX512VL","bscale4,scale32","rw,r,r,r,r","","" +"VPTERNLOGD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VPTERNLOGD imm8= u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vpternlogd imm8u, zmm2/m512/m32b= cst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F3A.W0 25 /r ib","V","V","AVX512= F","bscale4,scale64","rw,r,r,r,r","","" +"VPTERNLOGQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VPTERNLOGQ imm8= u, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vpternlogq imm8u, xmm2/m128/m64b= cst, xmmV, {k}{z}, xmm1","EVEX.DDS.128.66.0F3A.W1 25 /r ib","V","V","AVX512= F+AVX512VL","bscale8,scale16","rw,r,r,r,r","","" +"VPTERNLOGQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VPTERNLOGQ imm8= u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vpternlogq imm8u, ymm2/m256/m64b= cst, ymmV, {k}{z}, ymm1","EVEX.DDS.256.66.0F3A.W1 25 /r ib","V","V","AVX512= F+AVX512VL","bscale8,scale32","rw,r,r,r,r","","" +"VPTERNLOGQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VPTERNLOGQ imm8= u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vpternlogq imm8u, zmm2/m512/m64b= cst, zmmV, {k}{z}, zmm1","EVEX.DDS.512.66.0F3A.W1 25 /r ib","V","V","AVX512= F","bscale8,scale64","rw,r,r,r,r","","" +"VPTEST xmm1, xmm2/m128","VPTEST xmm2/m128, xmm1","vptest xmm2/m128, xmm1"= ,"VEX.128.66.0F38.WIG 17 /r","V","V","AVX","","r,r","","" +"VPTEST ymm1, ymm2/m256","VPTEST ymm2/m256, ymm1","vptest ymm2/m256, ymm1"= ,"VEX.256.66.0F38.WIG 17 /r","V","V","AVX","","r,r","","" +"VPTESTMB k1, {k}, xmmV, xmm2/m128","VPTESTMB xmm2/m128, xmmV, {k}, k1","v= ptestmb xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F38.W0 26 /r","V","V","= AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPTESTMB k1, {k}, ymmV, ymm2/m256","VPTESTMB ymm2/m256, ymmV, {k}, k1","v= ptestmb ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F38.W0 26 /r","V","V","= AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPTESTMB k1, {k}, zmmV, zmm2/m512","VPTESTMB zmm2/m512, zmmV, {k}, k1","v= ptestmb zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F38.W0 26 /r","V","V","= AVX512BW","scale64","w,r,r,r","","" +"VPTESTMD k1, {k}, xmmV, xmm2/m128/m32bcst","VPTESTMD xmm2/m128/m32bcst, x= mmV, {k}, k1","vptestmd xmm2/m128/m32bcst, xmmV, {k}, k1","EVEX.NDS.128.66.= 0F38.W0 27 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","","" +"VPTESTMD k1, {k}, ymmV, ymm2/m256/m32bcst","VPTESTMD ymm2/m256/m32bcst, y= mmV, {k}, k1","vptestmd ymm2/m256/m32bcst, ymmV, {k}, k1","EVEX.NDS.256.66.= 0F38.W0 27 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","","" +"VPTESTMD k1, {k}, zmmV, zmm2/m512/m32bcst","VPTESTMD zmm2/m512/m32bcst, z= mmV, {k}, k1","vptestmd zmm2/m512/m32bcst, zmmV, {k}, k1","EVEX.NDS.512.66.= 0F38.W0 27 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VPTESTMQ k1, {k}, xmmV, xmm2/m128/m64bcst","VPTESTMQ xmm2/m128/m64bcst, x= mmV, {k}, k1","vptestmq xmm2/m128/m64bcst, xmmV, {k}, k1","EVEX.NDS.128.66.= 0F38.W1 27 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r","","" +"VPTESTMQ k1, {k}, ymmV, ymm2/m256/m64bcst","VPTESTMQ ymm2/m256/m64bcst, y= mmV, {k}, k1","vptestmq ymm2/m256/m64bcst, ymmV, {k}, k1","EVEX.NDS.256.66.= 0F38.W1 27 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r","","" +"VPTESTMQ k1, {k}, zmmV, zmm2/m512/m64bcst","VPTESTMQ zmm2/m512/m64bcst, z= mmV, {k}, k1","vptestmq zmm2/m512/m64bcst, zmmV, {k}, k1","EVEX.NDS.512.66.= 0F38.W1 27 
/r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VPTESTMW k1, {k}, xmmV, xmm2/m128","VPTESTMW xmm2/m128, xmmV, {k}, k1","v= ptestmw xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.66.0F38.W1 26 /r","V","V","= AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPTESTMW k1, {k}, ymmV, ymm2/m256","VPTESTMW ymm2/m256, ymmV, {k}, k1","v= ptestmw ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.66.0F38.W1 26 /r","V","V","= AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPTESTMW k1, {k}, zmmV, zmm2/m512","VPTESTMW zmm2/m512, zmmV, {k}, k1","v= ptestmw zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.66.0F38.W1 26 /r","V","V","= AVX512BW","scale64","w,r,r,r","","" +"VPTESTNMB k1, {k}, xmmV, xmm2/m128","VPTESTNMB xmm2/m128, xmmV, {k}, k1",= "vptestnmb xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.F3.0F38.W0 26 /r","V","V= ","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPTESTNMB k1, {k}, ymmV, ymm2/m256","VPTESTNMB ymm2/m256, ymmV, {k}, k1",= "vptestnmb ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.F3.0F38.W0 26 /r","V","V= ","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPTESTNMB k1, {k}, zmmV, zmm2/m512","VPTESTNMB zmm2/m512, zmmV, {k}, k1",= "vptestnmb zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.F3.0F38.W0 26 /r","V","V= ","AVX512BW","scale64","w,r,r,r","","" +"VPTESTNMD k1, {k}, xmmV, xmm2/m128/m32bcst","VPTESTNMD xmm2/m128/m32bcst,= xmmV, {k}, k1","vptestnmd xmm2/m128/m32bcst, xmmV, {k}, k1","EVEX.NDS.128.= F3.0F38.W0 27 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r",""= ,"" +"VPTESTNMD k1, {k}, ymmV, ymm2/m256/m32bcst","VPTESTNMD ymm2/m256/m32bcst,= ymmV, {k}, k1","vptestnmd ymm2/m256/m32bcst, ymmV, {k}, k1","EVEX.NDS.256.= F3.0F38.W0 27 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r",""= ,"" +"VPTESTNMD k1, {k}, zmmV, zmm2/m512/m32bcst","VPTESTNMD zmm2/m512/m32bcst,= zmmV, {k}, k1","vptestnmd zmm2/m512/m32bcst, zmmV, {k}, k1","EVEX.NDS.512.= F3.0F38.W0 27 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VPTESTNMQ k1, {k}, xmmV, xmm2/m128/m64bcst","VPTESTNMQ xmm2/m128/m64bcst,= xmmV, {k}, k1","vptestnmq xmm2/m128/m64bcst, xmmV, {k}, k1","EVEX.NDS.128.= F3.0F38.W1 27 /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r",""= ,"" +"VPTESTNMQ k1, {k}, ymmV, ymm2/m256/m64bcst","VPTESTNMQ ymm2/m256/m64bcst,= ymmV, {k}, k1","vptestnmq ymm2/m256/m64bcst, ymmV, {k}, k1","EVEX.NDS.256.= F3.0F38.W1 27 /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r",""= ,"" +"VPTESTNMQ k1, {k}, zmmV, zmm2/m512/m64bcst","VPTESTNMQ zmm2/m512/m64bcst,= zmmV, {k}, k1","vptestnmq zmm2/m512/m64bcst, zmmV, {k}, k1","EVEX.NDS.512.= F3.0F38.W1 27 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VPTESTNMW k1, {k}, xmmV, xmm2/m128","VPTESTNMW xmm2/m128, xmmV, {k}, k1",= "vptestnmw xmm2/m128, xmmV, {k}, k1","EVEX.NDS.128.F3.0F38.W1 26 /r","V","V= ","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPTESTNMW k1, {k}, ymmV, ymm2/m256","VPTESTNMW ymm2/m256, ymmV, {k}, k1",= "vptestnmw ymm2/m256, ymmV, {k}, k1","EVEX.NDS.256.F3.0F38.W1 26 /r","V","V= ","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPTESTNMW k1, {k}, zmmV, zmm2/m512","VPTESTNMW zmm2/m512, zmmV, {k}, k1",= "vptestnmw zmm2/m512, zmmV, {k}, k1","EVEX.NDS.512.F3.0F38.W1 26 /r","V","V= ","AVX512BW","scale64","w,r,r,r","","" +"VPUNPCKHBW xmm1, xmmV, xmm2/m128","VPUNPCKHBW xmm2/m128, xmmV, xmm1","vpu= npckhbw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 68 /r","V","V","AVX",= "","w,r,r","","" +"VPUNPCKHBW xmm1, {k}{z}, xmmV, xmm2/m128","VPUNPCKHBW xmm2/m128, xmmV, {k= }{z}, xmm1","vpunpckhbw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.= WIG 68 
/r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPUNPCKHBW ymm1, ymmV, ymm2/m256","VPUNPCKHBW ymm2/m256, ymmV, ymm1","vpu= npckhbw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 68 /r","V","V","AVX2"= ,"","w,r,r","","" +"VPUNPCKHBW ymm1, {k}{z}, ymmV, ymm2/m256","VPUNPCKHBW ymm2/m256, ymmV, {k= }{z}, ymm1","vpunpckhbw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.= WIG 68 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPUNPCKHBW zmm1, {k}{z}, zmmV, zmm2/m512","VPUNPCKHBW zmm2/m512, zmmV, {k= }{z}, zmm1","vpunpckhbw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.= WIG 68 /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPUNPCKHDQ xmm1, xmmV, xmm2/m128","VPUNPCKHDQ xmm2/m128, xmmV, xmm1","vpu= npckhdq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 6A /r","V","V","AVX",= "","w,r,r","","" +"VPUNPCKHDQ xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPUNPCKHDQ xmm2/m128/m= 32bcst, xmmV, {k}{z}, xmm1","vpunpckhdq xmm2/m128/m32bcst, xmmV, {k}{z}, xm= m1","EVEX.NDS.128.66.0F.W0 6A /r","V","V","AVX512F+AVX512VL","bscale4,scale= 16","w,r,r,r","","" +"VPUNPCKHDQ ymm1, ymmV, ymm2/m256","VPUNPCKHDQ ymm2/m256, ymmV, ymm1","vpu= npckhdq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 6A /r","V","V","AVX2"= ,"","w,r,r","","" +"VPUNPCKHDQ ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPUNPCKHDQ ymm2/m256/m= 32bcst, ymmV, {k}{z}, ymm1","vpunpckhdq ymm2/m256/m32bcst, ymmV, {k}{z}, ym= m1","EVEX.NDS.256.66.0F.W0 6A /r","V","V","AVX512F+AVX512VL","bscale4,scale= 32","w,r,r,r","","" +"VPUNPCKHDQ zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPUNPCKHDQ zmm2/m512/m= 32bcst, zmmV, {k}{z}, zmm1","vpunpckhdq zmm2/m512/m32bcst, zmmV, {k}{z}, zm= m1","EVEX.NDS.512.66.0F.W0 6A /r","V","V","AVX512F","bscale4,scale64","w,r,= r,r","","" +"VPUNPCKHQDQ xmm1, xmmV, xmm2/m128","VPUNPCKHQDQ xmm2/m128, xmmV, xmm1","v= punpckhqdq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 6D /r","V","V","AV= X","","w,r,r","","" +"VPUNPCKHQDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPUNPCKHQDQ xmm2/m128= /m64bcst, xmmV, {k}{z}, xmm1","vpunpckhqdq xmm2/m128/m64bcst, xmmV, {k}{z},= xmm1","EVEX.NDS.128.66.0F.W1 6D /r","V","V","AVX512F+AVX512VL","bscale8,sc= ale16","w,r,r,r","","" +"VPUNPCKHQDQ ymm1, ymmV, ymm2/m256","VPUNPCKHQDQ ymm2/m256, ymmV, ymm1","v= punpckhqdq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 6D /r","V","V","AV= X2","","w,r,r","","" +"VPUNPCKHQDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPUNPCKHQDQ ymm2/m256= /m64bcst, ymmV, {k}{z}, ymm1","vpunpckhqdq ymm2/m256/m64bcst, ymmV, {k}{z},= ymm1","EVEX.NDS.256.66.0F.W1 6D /r","V","V","AVX512F+AVX512VL","bscale8,sc= ale32","w,r,r,r","","" +"VPUNPCKHQDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPUNPCKHQDQ zmm2/m512= /m64bcst, zmmV, {k}{z}, zmm1","vpunpckhqdq zmm2/m512/m64bcst, zmmV, {k}{z},= zmm1","EVEX.NDS.512.66.0F.W1 6D /r","V","V","AVX512F","bscale8,scale64","w= ,r,r,r","","" +"VPUNPCKHWD xmm1, xmmV, xmm2/m128","VPUNPCKHWD xmm2/m128, xmmV, xmm1","vpu= npckhwd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 69 /r","V","V","AVX",= "","w,r,r","","" +"VPUNPCKHWD xmm1, {k}{z}, xmmV, xmm2/m128","VPUNPCKHWD xmm2/m128, xmmV, {k= }{z}, xmm1","vpunpckhwd xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.= WIG 69 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPUNPCKHWD ymm1, ymmV, ymm2/m256","VPUNPCKHWD ymm2/m256, ymmV, ymm1","vpu= npckhwd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 69 /r","V","V","AVX2"= ,"","w,r,r","","" +"VPUNPCKHWD ymm1, {k}{z}, ymmV, ymm2/m256","VPUNPCKHWD ymm2/m256, ymmV, {k= }{z}, ymm1","vpunpckhwd ymm2/m256, ymmV, {k}{z}, 
ymm1","EVEX.NDS.256.66.0F.= WIG 69 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPUNPCKHWD zmm1, {k}{z}, zmmV, zmm2/m512","VPUNPCKHWD zmm2/m512, zmmV, {k= }{z}, zmm1","vpunpckhwd zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.= WIG 69 /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPUNPCKLBW xmm1, xmmV, xmm2/m128","VPUNPCKLBW xmm2/m128, xmmV, xmm1","vpu= npcklbw xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 60 /r","V","V","AVX",= "","w,r,r","","" +"VPUNPCKLBW xmm1, {k}{z}, xmmV, xmm2/m128","VPUNPCKLBW xmm2/m128, xmmV, {k= }{z}, xmm1","vpunpcklbw xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.= WIG 60 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPUNPCKLBW ymm1, ymmV, ymm2/m256","VPUNPCKLBW ymm2/m256, ymmV, ymm1","vpu= npcklbw ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 60 /r","V","V","AVX2"= ,"","w,r,r","","" +"VPUNPCKLBW ymm1, {k}{z}, ymmV, ymm2/m256","VPUNPCKLBW ymm2/m256, ymmV, {k= }{z}, ymm1","vpunpcklbw ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.= WIG 60 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPUNPCKLBW zmm1, {k}{z}, zmmV, zmm2/m512","VPUNPCKLBW zmm2/m512, zmmV, {k= }{z}, zmm1","vpunpcklbw zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.= WIG 60 /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPUNPCKLDQ xmm1, xmmV, xmm2/m128","VPUNPCKLDQ xmm2/m128, xmmV, xmm1","vpu= npckldq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 62 /r","V","V","AVX",= "","w,r,r","","" +"VPUNPCKLDQ xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPUNPCKLDQ xmm2/m128/m= 32bcst, xmmV, {k}{z}, xmm1","vpunpckldq xmm2/m128/m32bcst, xmmV, {k}{z}, xm= m1","EVEX.NDS.128.66.0F.W0 62 /r","V","V","AVX512F+AVX512VL","bscale4,scale= 16","w,r,r,r","","" +"VPUNPCKLDQ ymm1, ymmV, ymm2/m256","VPUNPCKLDQ ymm2/m256, ymmV, ymm1","vpu= npckldq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 62 /r","V","V","AVX2"= ,"","w,r,r","","" +"VPUNPCKLDQ ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPUNPCKLDQ ymm2/m256/m= 32bcst, ymmV, {k}{z}, ymm1","vpunpckldq ymm2/m256/m32bcst, ymmV, {k}{z}, ym= m1","EVEX.NDS.256.66.0F.W0 62 /r","V","V","AVX512F+AVX512VL","bscale4,scale= 32","w,r,r,r","","" +"VPUNPCKLDQ zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPUNPCKLDQ zmm2/m512/m= 32bcst, zmmV, {k}{z}, zmm1","vpunpckldq zmm2/m512/m32bcst, zmmV, {k}{z}, zm= m1","EVEX.NDS.512.66.0F.W0 62 /r","V","V","AVX512F","bscale4,scale64","w,r,= r,r","","" +"VPUNPCKLQDQ xmm1, xmmV, xmm2/m128","VPUNPCKLQDQ xmm2/m128, xmmV, xmm1","v= punpcklqdq xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 6C /r","V","V","AV= X","","w,r,r","","" +"VPUNPCKLQDQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPUNPCKLQDQ xmm2/m128= /m64bcst, xmmV, {k}{z}, xmm1","vpunpcklqdq xmm2/m128/m64bcst, xmmV, {k}{z},= xmm1","EVEX.NDS.128.66.0F.W1 6C /r","V","V","AVX512F+AVX512VL","bscale8,sc= ale16","w,r,r,r","","" +"VPUNPCKLQDQ ymm1, ymmV, ymm2/m256","VPUNPCKLQDQ ymm2/m256, ymmV, ymm1","v= punpcklqdq ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 6C /r","V","V","AV= X2","","w,r,r","","" +"VPUNPCKLQDQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPUNPCKLQDQ ymm2/m256= /m64bcst, ymmV, {k}{z}, ymm1","vpunpcklqdq ymm2/m256/m64bcst, ymmV, {k}{z},= ymm1","EVEX.NDS.256.66.0F.W1 6C /r","V","V","AVX512F+AVX512VL","bscale8,sc= ale32","w,r,r,r","","" +"VPUNPCKLQDQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPUNPCKLQDQ zmm2/m512= /m64bcst, zmmV, {k}{z}, zmm1","vpunpcklqdq zmm2/m512/m64bcst, zmmV, {k}{z},= zmm1","EVEX.NDS.512.66.0F.W1 6C /r","V","V","AVX512F","bscale8,scale64","w= ,r,r,r","","" +"VPUNPCKLWD xmm1, xmmV, xmm2/m128","VPUNPCKLWD xmm2/m128, xmmV, 
xmm1","vpu= npcklwd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 61 /r","V","V","AVX",= "","w,r,r","","" +"VPUNPCKLWD xmm1, {k}{z}, xmmV, xmm2/m128","VPUNPCKLWD xmm2/m128, xmmV, {k= }{z}, xmm1","vpunpcklwd xmm2/m128, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F.= WIG 61 /r","V","V","AVX512BW+AVX512VL","scale16","w,r,r,r","","" +"VPUNPCKLWD ymm1, ymmV, ymm2/m256","VPUNPCKLWD ymm2/m256, ymmV, ymm1","vpu= npcklwd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 61 /r","V","V","AVX2"= ,"","w,r,r","","" +"VPUNPCKLWD ymm1, {k}{z}, ymmV, ymm2/m256","VPUNPCKLWD ymm2/m256, ymmV, {k= }{z}, ymm1","vpunpcklwd ymm2/m256, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F.= WIG 61 /r","V","V","AVX512BW+AVX512VL","scale32","w,r,r,r","","" +"VPUNPCKLWD zmm1, {k}{z}, zmmV, zmm2/m512","VPUNPCKLWD zmm2/m512, zmmV, {k= }{z}, zmm1","vpunpcklwd zmm2/m512, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F.= WIG 61 /r","V","V","AVX512BW","scale64","w,r,r,r","","" +"VPXOR xmm1, xmmV, xmm2/m128","VPXOR xmm2/m128, xmmV, xmm1","vpxor xmm2/m1= 28, xmmV, xmm1","VEX.NDS.128.66.0F.WIG EF /r","V","V","AVX","","w,r,r","","" +"VPXOR ymm1, ymmV, ymm2/m256","VPXOR ymm2/m256, ymmV, ymm1","vpxor ymm2/m2= 56, ymmV, ymm1","VEX.NDS.256.66.0F.WIG EF /r","V","V","AVX2","","w,r,r","",= "" +"VPXORD xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VPXORD xmm2/m128/m32bcst, = xmmV, {k}{z}, xmm1","vpxord xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F.W0 EF /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r= ","","" +"VPXORD ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VPXORD ymm2/m256/m32bcst, = ymmV, {k}{z}, ymm1","vpxord ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F.W0 EF /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r= ","","" +"VPXORD zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VPXORD zmm2/m512/m32bcst, = zmmV, {k}{z}, zmm1","vpxord zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F.W0 EF /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VPXORQ xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VPXORQ xmm2/m128/m64bcst, = xmmV, {k}{z}, xmm1","vpxorq xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F.W1 EF /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r= ","","" +"VPXORQ ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VPXORQ ymm2/m256/m64bcst, = ymmV, {k}{z}, ymm1","vpxorq ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F.W1 EF /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r= ","","" +"VPXORQ zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VPXORQ zmm2/m512/m64bcst, = zmmV, {k}{z}, zmm1","vpxorq zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F.W1 EF /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VRANGEPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u:4","VRANGEPD imm8u:= 4, xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","vrangepd imm8u:4, xmm2/m128/m64b= cst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W1 50 /r ib","V","V","AVX512= DQ+AVX512VL","bscale8,scale16","w,r,r,r,r","","" +"VRANGEPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u:4","VRANGEPD imm8u:= 4, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vrangepd imm8u:4, ymm2/m256/m64b= cst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 50 /r ib","V","V","AVX512= DQ+AVX512VL","bscale8,scale32","w,r,r,r,r","","" +"VRANGEPD zmm1{sae}, {k}{z}, zmmV, zmm2, imm8u:4","VRANGEPD imm8u:4, zmm2,= zmmV, {k}{z}, zmm1{sae}","vrangepd imm8u:4, zmm2, zmmV, {k}{z}, zmm1{sae}"= ,"EVEX.NDS.512.66.0F3A.W1 50 /r ib","V","V","AVX512DQ","modrm_regonly","w,r= ,r,r,r","","" +"VRANGEPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u:4","VRANGEPD imm8u:= 4, zmm2/m512/m64bcst, 
zmmV, {k}{z}, zmm1","vrangepd imm8u:4, zmm2/m512/m64b= cst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 50 /r ib","V","V","AVX512= DQ","bscale8,scale64","w,r,r,r,r","","" +"VRANGEPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u:4","VRANGEPS imm8u:= 4, xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","vrangeps imm8u:4, xmm2/m128/m32b= cst, xmmV, {k}{z}, xmm1","EVEX.NDS.128.66.0F3A.W0 50 /r ib","V","V","AVX512= DQ+AVX512VL","bscale4,scale16","w,r,r,r,r","","" +"VRANGEPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u:4","VRANGEPS imm8u:= 4, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vrangeps imm8u:4, ymm2/m256/m32b= cst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 50 /r ib","V","V","AVX512= DQ+AVX512VL","bscale4,scale32","w,r,r,r,r","","" +"VRANGEPS zmm1{sae}, {k}{z}, zmmV, zmm2, imm8u:4","VRANGEPS imm8u:4, zmm2,= zmmV, {k}{z}, zmm1{sae}","vrangeps imm8u:4, zmm2, zmmV, {k}{z}, zmm1{sae}"= ,"EVEX.NDS.512.66.0F3A.W0 50 /r ib","V","V","AVX512DQ","modrm_regonly","w,r= ,r,r,r","","" +"VRANGEPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u:4","VRANGEPS imm8u:= 4, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vrangeps imm8u:4, zmm2/m512/m32b= cst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 50 /r ib","V","V","AVX512= DQ","bscale4,scale64","w,r,r,r,r","","" +"VRANGESD xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u:4","VRANGESD imm8u:4, xmm2,= xmmV, {k}{z}, xmm1{sae}","vrangesd imm8u:4, xmm2, xmmV, {k}{z}, xmm1{sae}"= ,"EVEX.NDS.128.66.0F3A.W1 51 /r ib","V","V","AVX512DQ","modrm_regonly","w,r= ,r,r,r","","" +"VRANGESD xmm1, {k}{z}, xmmV, xmm2/m64, imm8u:4","VRANGESD imm8u:4, xmm2/m= 64, xmmV, {k}{z}, xmm1","vrangesd imm8u:4, xmm2/m64, xmmV, {k}{z}, xmm1","E= VEX.NDS.LIG.66.0F3A.W1 51 /r ib","V","V","AVX512DQ","scale8","w,r,r,r,r",""= ,"" +"VRANGESS xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u:4","VRANGESS imm8u:4, xmm2,= xmmV, {k}{z}, xmm1{sae}","vrangess imm8u:4, xmm2, xmmV, {k}{z}, xmm1{sae}"= ,"EVEX.NDS.128.66.0F3A.W0 51 /r ib","V","V","AVX512DQ","modrm_regonly","w,r= ,r,r,r","","" +"VRANGESS xmm1, {k}{z}, xmmV, xmm2/m32, imm8u:4","VRANGESS imm8u:4, xmm2/m= 32, xmmV, {k}{z}, xmm1","vrangess imm8u:4, xmm2/m32, xmmV, {k}{z}, xmm1","E= VEX.NDS.LIG.66.0F3A.W0 51 /r ib","V","V","AVX512DQ","scale4","w,r,r,r,r",""= ,"" +"VRCP14PD xmm1, {k}{z}, xmm2/m128/m64bcst","VRCP14PD xmm2/m128/m64bcst, {k= }{z}, xmm1","vrcp14pd xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W1= 4C /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","","" +"VRCP14PD ymm1, {k}{z}, ymm2/m256/m64bcst","VRCP14PD ymm2/m256/m64bcst, {k= }{z}, ymm1","vrcp14pd ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W1= 4C /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","","" +"VRCP14PD zmm1, {k}{z}, zmm2/m512/m64bcst","VRCP14PD zmm2/m512/m64bcst, {k= }{z}, zmm1","vrcp14pd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1= 4C /r","V","V","AVX512F","bscale8,scale64","w,r,r","","" +"VRCP14PS xmm1, {k}{z}, xmm2/m128/m32bcst","VRCP14PS xmm2/m128/m32bcst, {k= }{z}, xmm1","vrcp14ps xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0F38.W0= 4C /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","","" +"VRCP14PS ymm1, {k}{z}, ymm2/m256/m32bcst","VRCP14PS ymm2/m256/m32bcst, {k= }{z}, ymm1","vrcp14ps ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0F38.W0= 4C /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","","" +"VRCP14PS zmm1, {k}{z}, zmm2/m512/m32bcst","VRCP14PS zmm2/m512/m32bcst, {k= }{z}, zmm1","vrcp14ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0= 4C /r","V","V","AVX512F","bscale4,scale64","w,r,r","","" +"VRCP14SD xmm1, {k}{z}, xmmV, 
xmm2/m64","VRCP14SD xmm2/m64, xmmV, {k}{z}, = xmm1","vrcp14sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W1 4D /= r","V","V","AVX512F","scale8","w,r,r,r","","" +"VRCP14SS xmm1, {k}{z}, xmmV, xmm2/m32","VRCP14SS xmm2/m32, xmmV, {k}{z}, = xmm1","vrcp14ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W0 4D /= r","V","V","AVX512F","scale4","w,r,r,r","","" +"VRCP28PD zmm1{sae}, {k}{z}, zmm2","VRCP28PD zmm2, {k}{z}, zmm1{sae}","vrc= p28pd zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W1 CA /r","V","V","AVX512E= R","modrm_regonly","w,r,r","","" +"VRCP28PD zmm1, {k}{z}, zmm2/m512/m64bcst","VRCP28PD zmm2/m512/m64bcst, {k= }{z}, zmm1","vrcp28pd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W1= CA /r","V","V","AVX512ER","bscale8,scale64","w,r,r","","" +"VRCP28PS zmm1{sae}, {k}{z}, zmm2","VRCP28PS zmm2, {k}{z}, zmm1{sae}","vrc= p28ps zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W0 CA /r","V","V","AVX512E= R","modrm_regonly","w,r,r","","" +"VRCP28PS zmm1, {k}{z}, zmm2/m512/m32bcst","VRCP28PS zmm2/m512/m32bcst, {k= }{z}, zmm1","vrcp28ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0F38.W0= CA /r","V","V","AVX512ER","bscale4,scale64","w,r,r","","" +"VRCP28SD xmm1{sae}, {k}{z}, xmmV, xmm2","VRCP28SD xmm2, xmmV, {k}{z}, xmm= 1{sae}","vrcp28sd xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F38.W1 C= B /r","V","V","AVX512ER","modrm_regonly","w,r,r,r","","" +"VRCP28SD xmm1, {k}{z}, xmmV, xmm2/m64","VRCP28SD xmm2/m64, xmmV, {k}{z}, = xmm1","vrcp28sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W1 CB /= r","V","V","AVX512ER","scale8","w,r,r,r","","" +"VRCP28SS xmm1{sae}, {k}{z}, xmmV, xmm2","VRCP28SS xmm2, xmmV, {k}{z}, xmm= 1{sae}","vrcp28ss xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F38.W0 C= B /r","V","V","AVX512ER","modrm_regonly","w,r,r,r","","" +"VRCP28SS xmm1, {k}{z}, xmmV, xmm2/m32","VRCP28SS xmm2/m32, xmmV, {k}{z}, = xmm1","vrcp28ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W0 CB /= r","V","V","AVX512ER","scale4","w,r,r,r","","" +"VRCPPS xmm1, xmm2/m128","VRCPPS xmm2/m128, xmm1","vrcpps xmm2/m128, xmm1"= ,"VEX.128.0F.WIG 53 /r","V","V","AVX","","w,r","","" +"VRCPPS ymm1, ymm2/m256","VRCPPS ymm2/m256, ymm1","vrcpps ymm2/m256, ymm1"= ,"VEX.256.0F.WIG 53 /r","V","V","AVX","","w,r","","" +"VRCPSS xmm1, xmmV, xmm2/m32","VRCPSS xmm2/m32, xmmV, xmm1","vrcpss xmm2/m= 32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 53 /r","V","V","AVX","","w,r,r","","" +"VREDUCEPD xmm1, {k}{z}, xmm2/m128/m64bcst, imm8u","VREDUCEPD imm8u, xmm2/= m128/m64bcst, {k}{z}, xmm1","vreducepd imm8u, xmm2/m128/m64bcst, {k}{z}, xm= m1","EVEX.128.66.0F3A.W1 56 /r ib","V","V","AVX512DQ+AVX512VL","bscale8,sca= le16","w,r,r,r","","" +"VREDUCEPD ymm1, {k}{z}, ymm2/m256/m64bcst, imm8u","VREDUCEPD imm8u, ymm2/= m256/m64bcst, {k}{z}, ymm1","vreducepd imm8u, ymm2/m256/m64bcst, {k}{z}, ym= m1","EVEX.256.66.0F3A.W1 56 /r ib","V","V","AVX512DQ+AVX512VL","bscale8,sca= le32","w,r,r,r","","" +"VREDUCEPD zmm1{sae}, {k}{z}, zmm2, imm8u","VREDUCEPD imm8u, zmm2, {k}{z},= zmm1{sae}","vreducepd imm8u, zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F3A.W1= 56 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r,r","","" +"VREDUCEPD zmm1, {k}{z}, zmm2/m512/m64bcst, imm8u","VREDUCEPD imm8u, zmm2/= m512/m64bcst, {k}{z}, zmm1","vreducepd imm8u, zmm2/m512/m64bcst, {k}{z}, zm= m1","EVEX.512.66.0F3A.W1 56 /r ib","V","V","AVX512DQ","bscale8,scale64","w,= r,r,r","","" +"VREDUCEPS xmm1, {k}{z}, xmm2/m128/m32bcst, imm8u","VREDUCEPS imm8u, xmm2/= m128/m32bcst, {k}{z}, xmm1","vreduceps imm8u, xmm2/m128/m32bcst, {k}{z}, xm= m1","EVEX.128.66.0F3A.W0 56 /r 
ib","V","V","AVX512DQ+AVX512VL","bscale4,sca= le16","w,r,r,r","","" +"VREDUCEPS ymm1, {k}{z}, ymm2/m256/m32bcst, imm8u","VREDUCEPS imm8u, ymm2/= m256/m32bcst, {k}{z}, ymm1","vreduceps imm8u, ymm2/m256/m32bcst, {k}{z}, ym= m1","EVEX.256.66.0F3A.W0 56 /r ib","V","V","AVX512DQ+AVX512VL","bscale4,sca= le32","w,r,r,r","","" +"VREDUCEPS zmm1{sae}, {k}{z}, zmm2, imm8u","VREDUCEPS imm8u, zmm2, {k}{z},= zmm1{sae}","vreduceps imm8u, zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F3A.W0= 56 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r,r","","" +"VREDUCEPS zmm1, {k}{z}, zmm2/m512/m32bcst, imm8u","VREDUCEPS imm8u, zmm2/= m512/m32bcst, {k}{z}, zmm1","vreduceps imm8u, zmm2/m512/m32bcst, {k}{z}, zm= m1","EVEX.512.66.0F3A.W0 56 /r ib","V","V","AVX512DQ","bscale4,scale64","w,= r,r,r","","" +"VREDUCESD xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u","VREDUCESD imm8u, xmm2, x= mmV, {k}{z}, xmm1{sae}","vreducesd imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","E= VEX.NDS.128.66.0F3A.W1 57 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r,= r,r","","" +"VREDUCESD xmm1, {k}{z}, xmmV, xmm2/m64, imm8u","VREDUCESD imm8u, xmm2/m64= , xmmV, {k}{z}, xmm1","vreducesd imm8u, xmm2/m64, xmmV, {k}{z}, xmm1","EVEX= .NDS.LIG.66.0F3A.W1 57 /r ib","V","V","AVX512DQ","scale8","w,r,r,r,r","","" +"VREDUCESS xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u","VREDUCESS imm8u, xmm2, x= mmV, {k}{z}, xmm1{sae}","vreducess imm8u, xmm2, xmmV, {k}{z}, xmm1{sae}","E= VEX.NDS.128.66.0F3A.W0 57 /r ib","V","V","AVX512DQ","modrm_regonly","w,r,r,= r,r","","" +"VREDUCESS xmm1, {k}{z}, xmmV, xmm2/m32, imm8u","VREDUCESS imm8u, xmm2/m32= , xmmV, {k}{z}, xmm1","vreducess imm8u, xmm2/m32, xmmV, {k}{z}, xmm1","EVEX= .NDS.LIG.66.0F3A.W0 57 /r ib","V","V","AVX512DQ","scale4","w,r,r,r,r","","" +"VRNDSCALEPD xmm1, {k}{z}, xmm2/m128/m64bcst, imm8u","VRNDSCALEPD imm8u, x= mm2/m128/m64bcst, {k}{z}, xmm1","vrndscalepd imm8u, xmm2/m128/m64bcst, {k}{= z}, xmm1","EVEX.128.66.0F3A.W1 09 /r ib","V","V","AVX512F+AVX512VL","bscale= 8,scale16","w,r,r,r","","" +"VRNDSCALEPD ymm1, {k}{z}, ymm2/m256/m64bcst, imm8u","VRNDSCALEPD imm8u, y= mm2/m256/m64bcst, {k}{z}, ymm1","vrndscalepd imm8u, ymm2/m256/m64bcst, {k}{= z}, ymm1","EVEX.256.66.0F3A.W1 09 /r ib","V","V","AVX512F+AVX512VL","bscale= 8,scale32","w,r,r,r","","" +"VRNDSCALEPD zmm1{sae}, {k}{z}, zmm2, imm8u","VRNDSCALEPD imm8u, zmm2, {k}= {z}, zmm1{sae}","vrndscalepd imm8u, zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0= F3A.W1 09 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VRNDSCALEPD zmm1, {k}{z}, zmm2/m512/m64bcst, imm8u","VRNDSCALEPD imm8u, z= mm2/m512/m64bcst, {k}{z}, zmm1","vrndscalepd imm8u, zmm2/m512/m64bcst, {k}{= z}, zmm1","EVEX.512.66.0F3A.W1 09 /r ib","V","V","AVX512F","bscale8,scale64= ","w,r,r,r","","" +"VRNDSCALEPS xmm1, {k}{z}, xmm2/m128/m32bcst, imm8u","VRNDSCALEPS imm8u, x= mm2/m128/m32bcst, {k}{z}, xmm1","vrndscaleps imm8u, xmm2/m128/m32bcst, {k}{= z}, xmm1","EVEX.128.66.0F3A.W0 08 /r ib","V","V","AVX512F+AVX512VL","bscale= 4,scale16","w,r,r,r","","" +"VRNDSCALEPS ymm1, {k}{z}, ymm2/m256/m32bcst, imm8u","VRNDSCALEPS imm8u, y= mm2/m256/m32bcst, {k}{z}, ymm1","vrndscaleps imm8u, ymm2/m256/m32bcst, {k}{= z}, ymm1","EVEX.256.66.0F3A.W0 08 /r ib","V","V","AVX512F+AVX512VL","bscale= 4,scale32","w,r,r,r","","" +"VRNDSCALEPS zmm1{sae}, {k}{z}, zmm2, imm8u","VRNDSCALEPS imm8u, zmm2, {k}= {z}, zmm1{sae}","vrndscaleps imm8u, zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0= F3A.W0 08 /r ib","V","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VRNDSCALEPS zmm1, {k}{z}, zmm2/m512/m32bcst, imm8u","VRNDSCALEPS imm8u, z= mm2/m512/m32bcst, {k}{z}, 
zmm1","vrndscaleps imm8u, zmm2/m512/m32bcst, {k}{= z}, zmm1","EVEX.512.66.0F3A.W0 08 /r ib","V","V","AVX512F","bscale4,scale64= ","w,r,r,r","","" +"VRNDSCALESD xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u","VRNDSCALESD imm8u, xmm= 2, xmmV, {k}{z}, xmm1{sae}","vrndscalesd imm8u, xmm2, xmmV, {k}{z}, xmm1{sa= e}","EVEX.NDS.128.66.0F3A.W1 0B /r ib","V","V","AVX512F","modrm_regonly","w= ,r,r,r,r","","" +"VRNDSCALESD xmm1, {k}{z}, xmmV, xmm2/m64, imm8u","VRNDSCALESD imm8u, xmm2= /m64, xmmV, {k}{z}, xmm1","vrndscalesd imm8u, xmm2/m64, xmmV, {k}{z}, xmm1"= ,"EVEX.NDS.LIG.66.0F3A.W1 0B /r ib","V","V","AVX512F","scale8","w,r,r,r,r",= "","" +"VRNDSCALESS xmm1{sae}, {k}{z}, xmmV, xmm2, imm8u","VRNDSCALESS imm8u, xmm= 2, xmmV, {k}{z}, xmm1{sae}","vrndscaless imm8u, xmm2, xmmV, {k}{z}, xmm1{sa= e}","EVEX.NDS.128.66.0F3A.W0 0A /r ib","V","V","AVX512F","modrm_regonly","w= ,r,r,r,r","","" +"VRNDSCALESS xmm1, {k}{z}, xmmV, xmm2/m32, imm8u","VRNDSCALESS imm8u, xmm2= /m32, xmmV, {k}{z}, xmm1","vrndscaless imm8u, xmm2/m32, xmmV, {k}{z}, xmm1"= ,"EVEX.NDS.LIG.66.0F3A.W0 0A /r ib","V","V","AVX512F","scale4","w,r,r,r,r",= "","" +"VROUNDPD xmm1, xmm2/m128, imm8u","VROUNDPD imm8u, xmm2/m128, xmm1","vroun= dpd imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG 09 /r ib","V","V","AVX",""= ,"w,r,r","","" +"VROUNDPD ymm1, ymm2/m256, imm8u","VROUNDPD imm8u, ymm2/m256, ymm1","vroun= dpd imm8u, ymm2/m256, ymm1","VEX.256.66.0F3A.WIG 09 /r ib","V","V","AVX",""= ,"w,r,r","","" +"VROUNDPS xmm1, xmm2/m128, imm8u","VROUNDPS imm8u, xmm2/m128, xmm1","vroun= dps imm8u, xmm2/m128, xmm1","VEX.128.66.0F3A.WIG 08 /r ib","V","V","AVX",""= ,"w,r,r","","" +"VROUNDPS ymm1, ymm2/m256, imm8u","VROUNDPS imm8u, ymm2/m256, ymm1","vroun= dps imm8u, ymm2/m256, ymm1","VEX.256.66.0F3A.WIG 08 /r ib","V","V","AVX",""= ,"w,r,r","","" +"VROUNDSD xmm1, xmmV, xmm2/m64, imm8u","VROUNDSD imm8u, xmm2/m64, xmmV, xm= m1","vroundsd imm8u, xmm2/m64, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.WIG 0B /r i= b","V","V","AVX","","w,r,r,r","","" +"VROUNDSS xmm1, xmmV, xmm2/m32, imm8u","VROUNDSS imm8u, xmm2/m32, xmmV, xm= m1","vroundss imm8u, xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.66.0F3A.WIG 0A /r i= b","V","V","AVX","","w,r,r,r","","" +"VRSQRT14PD xmm1, {k}{z}, xmm2/m128/m64bcst","VRSQRT14PD xmm2/m128/m64bcst= , {k}{z}, xmm1","vrsqrt14pd xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0= F38.W1 4E /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","","" +"VRSQRT14PD ymm1, {k}{z}, ymm2/m256/m64bcst","VRSQRT14PD ymm2/m256/m64bcst= , {k}{z}, ymm1","vrsqrt14pd ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0= F38.W1 4E /r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","","" +"VRSQRT14PD zmm1, {k}{z}, zmm2/m512/m64bcst","VRSQRT14PD zmm2/m512/m64bcst= , {k}{z}, zmm1","vrsqrt14pd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0= F38.W1 4E /r","V","V","AVX512F","bscale8,scale64","w,r,r","","" +"VRSQRT14PS xmm1, {k}{z}, xmm2/m128/m32bcst","VRSQRT14PS xmm2/m128/m32bcst= , {k}{z}, xmm1","vrsqrt14ps xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.66.0= F38.W0 4E /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","","" +"VRSQRT14PS ymm1, {k}{z}, ymm2/m256/m32bcst","VRSQRT14PS ymm2/m256/m32bcst= , {k}{z}, ymm1","vrsqrt14ps ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.66.0= F38.W0 4E /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","","" +"VRSQRT14PS zmm1, {k}{z}, zmm2/m512/m32bcst","VRSQRT14PS zmm2/m512/m32bcst= , {k}{z}, zmm1","vrsqrt14ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0= F38.W0 4E /r","V","V","AVX512F","bscale4,scale64","w,r,r","","" +"VRSQRT14SD xmm1, {k}{z}, xmmV, 
xmm2/m64","VRSQRT14SD xmm2/m64, xmmV, {k}{= z}, xmm1","vrsqrt14sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W= 1 4F /r","V","V","AVX512F","scale8","w,r,r,r","","" +"VRSQRT14SS xmm1, {k}{z}, xmmV, xmm2/m32","VRSQRT14SS xmm2/m32, xmmV, {k}{= z}, xmm1","vrsqrt14ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W= 0 4F /r","V","V","AVX512F","scale4","w,r,r,r","","" +"VRSQRT28PD zmm1{sae}, {k}{z}, zmm2","VRSQRT28PD zmm2, {k}{z}, zmm1{sae}",= "vrsqrt28pd zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W1 CC /r","V","V","A= VX512ER","modrm_regonly","w,r,r","","" +"VRSQRT28PD zmm1, {k}{z}, zmm2/m512/m64bcst","VRSQRT28PD zmm2/m512/m64bcst= , {k}{z}, zmm1","vrsqrt28pd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0= F38.W1 CC /r","V","V","AVX512ER","bscale8,scale64","w,r,r","","" +"VRSQRT28PS zmm1{sae}, {k}{z}, zmm2","VRSQRT28PS zmm2, {k}{z}, zmm1{sae}",= "vrsqrt28ps zmm2, {k}{z}, zmm1{sae}","EVEX.512.66.0F38.W0 CC /r","V","V","A= VX512ER","modrm_regonly","w,r,r","","" +"VRSQRT28PS zmm1, {k}{z}, zmm2/m512/m32bcst","VRSQRT28PS zmm2/m512/m32bcst= , {k}{z}, zmm1","vrsqrt28ps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.66.0= F38.W0 CC /r","V","V","AVX512ER","bscale4,scale64","w,r,r","","" +"VRSQRT28SD xmm1{sae}, {k}{z}, xmmV, xmm2","VRSQRT28SD xmm2, xmmV, {k}{z},= xmm1{sae}","vrsqrt28sd xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F3= 8.W1 CD /r","V","V","AVX512ER","modrm_regonly","w,r,r,r","","" +"VRSQRT28SD xmm1, {k}{z}, xmmV, xmm2/m64","VRSQRT28SD xmm2/m64, xmmV, {k}{= z}, xmm1","vrsqrt28sd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W= 1 CD /r","V","V","AVX512ER","scale8","w,r,r,r","","" +"VRSQRT28SS xmm1{sae}, {k}{z}, xmmV, xmm2","VRSQRT28SS xmm2, xmmV, {k}{z},= xmm1{sae}","vrsqrt28ss xmm2, xmmV, {k}{z}, xmm1{sae}","EVEX.NDS.128.66.0F3= 8.W0 CD /r","V","V","AVX512ER","modrm_regonly","w,r,r,r","","" +"VRSQRT28SS xmm1, {k}{z}, xmmV, xmm2/m32","VRSQRT28SS xmm2/m32, xmmV, {k}{= z}, xmm1","vrsqrt28ss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W= 0 CD /r","V","V","AVX512ER","scale4","w,r,r,r","","" +"VRSQRTPS xmm1, xmm2/m128","VRSQRTPS xmm2/m128, xmm1","vrsqrtps xmm2/m128,= xmm1","VEX.128.0F.WIG 52 /r","V","V","AVX","","w,r","","" +"VRSQRTPS ymm1, ymm2/m256","VRSQRTPS ymm2/m256, ymm1","vrsqrtps ymm2/m256,= ymm1","VEX.256.0F.WIG 52 /r","V","V","AVX","","w,r","","" +"VRSQRTSS xmm1, xmmV, xmm2/m32","VRSQRTSS xmm2/m32, xmmV, xmm1","vrsqrtss = xmm2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 52 /r","V","V","AVX","","w,r,r= ","","" +"VSCALEFPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VSCALEFPD xmm2/m128/m64= bcst, xmmV, {k}{z}, xmm1","vscalefpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.NDS.128.66.0F38.W1 2C /r","V","V","AVX512F+AVX512VL","bscale8,scale1= 6","w,r,r,r","","" +"VSCALEFPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VSCALEFPD ymm2/m256/m64= bcst, ymmV, {k}{z}, ymm1","vscalefpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.NDS.256.66.0F38.W1 2C /r","V","V","AVX512F+AVX512VL","bscale8,scale3= 2","w,r,r,r","","" +"VSCALEFPD zmm1{er}, {k}{z}, zmmV, zmm2","VSCALEFPD zmm2, zmmV, {k}{z}, zm= m1{er}","vscalefpd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.66.0F38.W1 2= C /r","V","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VSCALEFPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VSCALEFPD zmm2/m512/m64= bcst, zmmV, {k}{z}, zmm1","vscalefpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.NDS.512.66.0F38.W1 2C /r","V","V","AVX512F","bscale8,scale64","w,r,r= ,r","","" +"VSCALEFPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VSCALEFPS xmm2/m128/m32= bcst, xmmV, {k}{z}, xmm1","vscalefps 
xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.NDS.128.66.0F38.W0 2C /r","V","V","AVX512F+AVX512VL","bscale4,scale1= 6","w,r,r,r","","" +"VSCALEFPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VSCALEFPS ymm2/m256/m32= bcst, ymmV, {k}{z}, ymm1","vscalefps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.NDS.256.66.0F38.W0 2C /r","V","V","AVX512F+AVX512VL","bscale4,scale3= 2","w,r,r,r","","" +"VSCALEFPS zmm1{er}, {k}{z}, zmmV, zmm2","VSCALEFPS zmm2, zmmV, {k}{z}, zm= m1{er}","vscalefps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.66.0F38.W0 2= C /r","V","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VSCALEFPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VSCALEFPS zmm2/m512/m32= bcst, zmmV, {k}{z}, zmm1","vscalefps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.NDS.512.66.0F38.W0 2C /r","V","V","AVX512F","bscale4,scale64","w,r,r= ,r","","" +"VSCALEFSD xmm1{er}, {k}{z}, xmmV, xmm2","VSCALEFSD xmm2, xmmV, {k}{z}, xm= m1{er}","vscalefsd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.66.0F38.W1 2= D /r","V","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VSCALEFSD xmm1, {k}{z}, xmmV, xmm2/m64","VSCALEFSD xmm2/m64, xmmV, {k}{z}= , xmm1","vscalefsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W1 2= D /r","V","V","AVX512F","scale8","w,r,r,r","","" +"VSCALEFSS xmm1{er}, {k}{z}, xmmV, xmm2","VSCALEFSS xmm2, xmmV, {k}{z}, xm= m1{er}","vscalefss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.66.0F38.W0 2= D /r","V","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VSCALEFSS xmm1, {k}{z}, xmmV, xmm2/m32","VSCALEFSS xmm2/m32, xmmV, {k}{z}= , xmm1","vscalefss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.66.0F38.W0 2= D /r","V","V","AVX512F","scale4","w,r,r,r","","" +"VSCATTERDPD vm32x, {k1-k7}, xmm1","VSCATTERDPD xmm1, {k1-k7}, vm32x","vsc= atterdpd xmm1, {k1-k7}, vm32x","EVEX.128.66.0F38.W1 A2 /vsib","V","V","AVX5= 12F+AVX512VL","modrm_memonly,scale8","w,rw,r","","" +"VSCATTERDPD vm32x, {k1-k7}, ymm1","VSCATTERDPD ymm1, {k1-k7}, vm32x","vsc= atterdpd ymm1, {k1-k7}, vm32x","EVEX.256.66.0F38.W1 A2 /vsib","V","V","AVX5= 12F+AVX512VL","modrm_memonly,scale8","w,rw,r","","" +"VSCATTERDPD vm32y, {k1-k7}, zmm1","VSCATTERDPD zmm1, {k1-k7}, vm32y","vsc= atterdpd zmm1, {k1-k7}, vm32y","EVEX.512.66.0F38.W1 A2 /vsib","V","V","AVX5= 12F","modrm_memonly,scale8","w,rw,r","","" +"VSCATTERDPS vm32x, {k1-k7}, xmm1","VSCATTERDPS xmm1, {k1-k7}, vm32x","vsc= atterdps xmm1, {k1-k7}, vm32x","EVEX.128.66.0F38.W0 A2 /vsib","V","V","AVX5= 12F+AVX512VL","modrm_memonly,scale4","w,rw,r","","" +"VSCATTERDPS vm32y, {k1-k7}, ymm1","VSCATTERDPS ymm1, {k1-k7}, vm32y","vsc= atterdps ymm1, {k1-k7}, vm32y","EVEX.256.66.0F38.W0 A2 /vsib","V","V","AVX5= 12F+AVX512VL","modrm_memonly,scale4","w,rw,r","","" +"VSCATTERDPS vm32z, {k1-k7}, zmm1","VSCATTERDPS zmm1, {k1-k7}, vm32z","vsc= atterdps zmm1, {k1-k7}, vm32z","EVEX.512.66.0F38.W0 A2 /vsib","V","V","AVX5= 12F","modrm_memonly,scale4","w,rw,r","","" +"VSCATTERPF0DPD vm32y, {k1-k7}","VSCATTERPF0DPD {k1-k7}, vm32y","vscatterp= f0dpd {k1-k7}, vm32y","EVEX.512.66.0F38.W1 C6 /5","V","V","AVX512PF","modrm= _memonly,scale8","r,rw","","" +"VSCATTERPF0DPS vm32z, {k1-k7}","VSCATTERPF0DPS {k1-k7}, vm32z","vscatterp= f0dps {k1-k7}, vm32z","EVEX.512.66.0F38.W0 C6 /5","V","V","AVX512PF","modrm= _memonly,scale4","r,rw","","" +"VSCATTERPF0QPD vm64z, {k1-k7}","VSCATTERPF0QPD {k1-k7}, vm64z","vscatterp= f0qpd {k1-k7}, vm64z","EVEX.512.66.0F38.W1 C7 /5","V","V","AVX512PF","modrm= _memonly,scale8","r,rw","","" +"VSCATTERPF0QPS vm64z, {k1-k7}","VSCATTERPF0QPS {k1-k7}, vm64z","vscatterp= f0qps {k1-k7}, vm64z","EVEX.512.66.0F38.W0 
C7 /5","V","V","AVX512PF","modrm= _memonly,scale4","r,rw","","" +"VSCATTERPF1DPD vm32y, {k1-k7}","VSCATTERPF1DPD {k1-k7}, vm32y","vscatterp= f1dpd {k1-k7}, vm32y","EVEX.512.66.0F38.W1 C6 /6","V","V","AVX512PF","modrm= _memonly,scale8","r,rw","","" +"VSCATTERPF1DPS vm32z, {k1-k7}","VSCATTERPF1DPS {k1-k7}, vm32z","vscatterp= f1dps {k1-k7}, vm32z","EVEX.512.66.0F38.W0 C6 /6","V","V","AVX512PF","modrm= _memonly,scale4","r,rw","","" +"VSCATTERPF1QPD vm64z, {k1-k7}","VSCATTERPF1QPD {k1-k7}, vm64z","vscatterp= f1qpd {k1-k7}, vm64z","EVEX.512.66.0F38.W1 C7 /6","V","V","AVX512PF","modrm= _memonly,scale8","r,rw","","" +"VSCATTERPF1QPS vm64z, {k1-k7}","VSCATTERPF1QPS {k1-k7}, vm64z","vscatterp= f1qps {k1-k7}, vm64z","EVEX.512.66.0F38.W0 C7 /6","V","V","AVX512PF","modrm= _memonly,scale4","r,rw","","" +"VSCATTERQPD vm64x, {k1-k7}, xmm1","VSCATTERQPD xmm1, {k1-k7}, vm64x","vsc= atterqpd xmm1, {k1-k7}, vm64x","EVEX.128.66.0F38.W1 A3 /vsib","V","V","AVX5= 12F+AVX512VL","modrm_memonly,scale8","w,rw,r","","" +"VSCATTERQPD vm64y, {k1-k7}, ymm1","VSCATTERQPD ymm1, {k1-k7}, vm64y","vsc= atterqpd ymm1, {k1-k7}, vm64y","EVEX.256.66.0F38.W1 A3 /vsib","V","V","AVX5= 12F+AVX512VL","modrm_memonly,scale8","w,rw,r","","" +"VSCATTERQPD vm64z, {k1-k7}, zmm1","VSCATTERQPD zmm1, {k1-k7}, vm64z","vsc= atterqpd zmm1, {k1-k7}, vm64z","EVEX.512.66.0F38.W1 A3 /vsib","V","V","AVX5= 12F","modrm_memonly,scale8","w,rw,r","","" +"VSCATTERQPS vm64x, {k1-k7}, xmm1","VSCATTERQPS xmm1, {k1-k7}, vm64x","vsc= atterqps xmm1, {k1-k7}, vm64x","EVEX.128.66.0F38.W0 A3 /vsib","V","V","AVX5= 12F+AVX512VL","modrm_memonly,scale4","w,rw,r","","" +"VSCATTERQPS vm64y, {k1-k7}, xmm1","VSCATTERQPS xmm1, {k1-k7}, vm64y","vsc= atterqps xmm1, {k1-k7}, vm64y","EVEX.256.66.0F38.W0 A3 /vsib","V","V","AVX5= 12F+AVX512VL","modrm_memonly,scale4","w,rw,r","","" +"VSCATTERQPS vm64z, {k1-k7}, ymm1","VSCATTERQPS ymm1, {k1-k7}, vm64z","vsc= atterqps ymm1, {k1-k7}, vm64z","EVEX.512.66.0F38.W0 A3 /vsib","V","V","AVX5= 12F","modrm_memonly,scale4","w,rw,r","","" +"VSHUFF32X4 ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VSHUFF32X4 imm8= u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vshuff32x4 imm8u, ymm2/m256/m32b= cst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 23 /r ib","V","V","AVX512= F+AVX512VL","bscale4,scale32","w,r,r,r,r","","" +"VSHUFF32X4 zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VSHUFF32X4 imm8= u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vshuff32x4 imm8u, zmm2/m512/m32b= cst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 23 /r ib","V","V","AVX512= F","bscale4,scale64","w,r,r,r,r","","" +"VSHUFF64X2 ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VSHUFF64X2 imm8= u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vshuff64x2 imm8u, ymm2/m256/m64b= cst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 23 /r ib","V","V","AVX512= F+AVX512VL","bscale8,scale32","w,r,r,r,r","","" +"VSHUFF64X2 zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VSHUFF64X2 imm8= u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vshuff64x2 imm8u, zmm2/m512/m64b= cst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 23 /r ib","V","V","AVX512= F","bscale8,scale64","w,r,r,r,r","","" +"VSHUFI32X4 ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VSHUFI32X4 imm8= u, ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","vshufi32x4 imm8u, ymm2/m256/m32b= cst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W0 43 /r ib","V","V","AVX512= F+AVX512VL","bscale4,scale32","w,r,r,r,r","","" +"VSHUFI32X4 zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VSHUFI32X4 imm8= u, zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","vshufi32x4 imm8u, zmm2/m512/m32b= cst, 
zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W0 43 /r ib","V","V","AVX512= F","bscale4,scale64","w,r,r,r,r","","" +"VSHUFI64X2 ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VSHUFI64X2 imm8= u, ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","vshufi64x2 imm8u, ymm2/m256/m64b= cst, ymmV, {k}{z}, ymm1","EVEX.NDS.256.66.0F3A.W1 43 /r ib","V","V","AVX512= F+AVX512VL","bscale8,scale32","w,r,r,r,r","","" +"VSHUFI64X2 zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VSHUFI64X2 imm8= u, zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","vshufi64x2 imm8u, zmm2/m512/m64b= cst, zmmV, {k}{z}, zmm1","EVEX.NDS.512.66.0F3A.W1 43 /r ib","V","V","AVX512= F","bscale8,scale64","w,r,r,r,r","","" +"VSHUFPD xmm1, xmmV, xmm2/m128, imm8u","VSHUFPD imm8u, xmm2/m128, xmmV, xm= m1","vshufpd imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG C6 /r ib"= ,"V","V","AVX","","w,r,r,r","","" +"VSHUFPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst, imm8u","VSHUFPD imm8u, xmm= 2/m128/m64bcst, xmmV, {k}{z}, xmm1","vshufpd imm8u, xmm2/m128/m64bcst, xmmV= , {k}{z}, xmm1","EVEX.NDS.128.66.0F.W1 C6 /r ib","V","V","AVX512F+AVX512VL"= ,"bscale8,scale16","w,r,r,r,r","","" +"VSHUFPD ymm1, ymmV, ymm2/m256, imm8u","VSHUFPD imm8u, ymm2/m256, ymmV, ym= m1","vshufpd imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG C6 /r ib"= ,"V","V","AVX","","w,r,r,r","","" +"VSHUFPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst, imm8u","VSHUFPD imm8u, ymm= 2/m256/m64bcst, ymmV, {k}{z}, ymm1","vshufpd imm8u, ymm2/m256/m64bcst, ymmV= , {k}{z}, ymm1","EVEX.NDS.256.66.0F.W1 C6 /r ib","V","V","AVX512F+AVX512VL"= ,"bscale8,scale32","w,r,r,r,r","","" +"VSHUFPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst, imm8u","VSHUFPD imm8u, zmm= 2/m512/m64bcst, zmmV, {k}{z}, zmm1","vshufpd imm8u, zmm2/m512/m64bcst, zmmV= , {k}{z}, zmm1","EVEX.NDS.512.66.0F.W1 C6 /r ib","V","V","AVX512F","bscale8= ,scale64","w,r,r,r,r","","" +"VSHUFPS xmm1, xmmV, xmm2/m128, imm8u","VSHUFPS imm8u, xmm2/m128, xmmV, xm= m1","vshufps imm8u, xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG C6 /r ib","V= ","V","AVX","","w,r,r,r","","" +"VSHUFPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst, imm8u","VSHUFPS imm8u, xmm= 2/m128/m32bcst, xmmV, {k}{z}, xmm1","vshufps imm8u, xmm2/m128/m32bcst, xmmV= , {k}{z}, xmm1","EVEX.NDS.128.0F.W0 C6 /r ib","V","V","AVX512F+AVX512VL","b= scale4,scale16","w,r,r,r,r","","" +"VSHUFPS ymm1, ymmV, ymm2/m256, imm8u","VSHUFPS imm8u, ymm2/m256, ymmV, ym= m1","vshufps imm8u, ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG C6 /r ib","V= ","V","AVX","","w,r,r,r","","" +"VSHUFPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst, imm8u","VSHUFPS imm8u, ymm= 2/m256/m32bcst, ymmV, {k}{z}, ymm1","vshufps imm8u, ymm2/m256/m32bcst, ymmV= , {k}{z}, ymm1","EVEX.NDS.256.0F.W0 C6 /r ib","V","V","AVX512F+AVX512VL","b= scale4,scale32","w,r,r,r,r","","" +"VSHUFPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst, imm8u","VSHUFPS imm8u, zmm= 2/m512/m32bcst, zmmV, {k}{z}, zmm1","vshufps imm8u, zmm2/m512/m32bcst, zmmV= , {k}{z}, zmm1","EVEX.NDS.512.0F.W0 C6 /r ib","V","V","AVX512F","bscale4,sc= ale64","w,r,r,r,r","","" +"VSQRTPD xmm1, xmm2/m128","VSQRTPD xmm2/m128, xmm1","vsqrtpd xmm2/m128, xm= m1","VEX.128.66.0F.WIG 51 /r","V","V","AVX","","w,r","","" +"VSQRTPD xmm1, {k}{z}, xmm2/m128/m64bcst","VSQRTPD xmm2/m128/m64bcst, {k}{= z}, xmm1","vsqrtpd xmm2/m128/m64bcst, {k}{z}, xmm1","EVEX.128.66.0F.W1 51 /= r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r","","" +"VSQRTPD ymm1, ymm2/m256","VSQRTPD ymm2/m256, ymm1","vsqrtpd ymm2/m256, ym= m1","VEX.256.66.0F.WIG 51 /r","V","V","AVX","","w,r","","" +"VSQRTPD ymm1, {k}{z}, ymm2/m256/m64bcst","VSQRTPD ymm2/m256/m64bcst, {k}{= 
z}, ymm1","vsqrtpd ymm2/m256/m64bcst, {k}{z}, ymm1","EVEX.256.66.0F.W1 51 /= r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r","","" +"VSQRTPD zmm1{er}, {k}{z}, zmm2","VSQRTPD zmm2, {k}{z}, zmm1{er}","vsqrtpd= zmm2, {k}{z}, zmm1{er}","EVEX.512.66.0F.W1 51 /r","V","V","AVX512F","modrm= _regonly","w,r,r","","" +"VSQRTPD zmm1, {k}{z}, zmm2/m512/m64bcst","VSQRTPD zmm2/m512/m64bcst, {k}{= z}, zmm1","vsqrtpd zmm2/m512/m64bcst, {k}{z}, zmm1","EVEX.512.66.0F.W1 51 /= r","V","V","AVX512F","bscale8,scale64","w,r,r","","" +"VSQRTPS xmm1, xmm2/m128","VSQRTPS xmm2/m128, xmm1","vsqrtps xmm2/m128, xm= m1","VEX.128.0F.WIG 51 /r","V","V","AVX","","w,r","","" +"VSQRTPS xmm1, {k}{z}, xmm2/m128/m32bcst","VSQRTPS xmm2/m128/m32bcst, {k}{= z}, xmm1","vsqrtps xmm2/m128/m32bcst, {k}{z}, xmm1","EVEX.128.0F.W0 51 /r",= "V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r","","" +"VSQRTPS ymm1, ymm2/m256","VSQRTPS ymm2/m256, ymm1","vsqrtps ymm2/m256, ym= m1","VEX.256.0F.WIG 51 /r","V","V","AVX","","w,r","","" +"VSQRTPS ymm1, {k}{z}, ymm2/m256/m32bcst","VSQRTPS ymm2/m256/m32bcst, {k}{= z}, ymm1","vsqrtps ymm2/m256/m32bcst, {k}{z}, ymm1","EVEX.256.0F.W0 51 /r",= "V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r","","" +"VSQRTPS zmm1{er}, {k}{z}, zmm2","VSQRTPS zmm2, {k}{z}, zmm1{er}","vsqrtps= zmm2, {k}{z}, zmm1{er}","EVEX.512.0F.W0 51 /r","V","V","AVX512F","modrm_re= gonly","w,r,r","","" +"VSQRTPS zmm1, {k}{z}, zmm2/m512/m32bcst","VSQRTPS zmm2/m512/m32bcst, {k}{= z}, zmm1","vsqrtps zmm2/m512/m32bcst, {k}{z}, zmm1","EVEX.512.0F.W0 51 /r",= "V","V","AVX512F","bscale4,scale64","w,r,r","","" +"VSQRTSD xmm1{er}, {k}{z}, xmmV, xmm2","VSQRTSD xmm2, xmmV, {k}{z}, xmm1{e= r}","vsqrtsd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F2.0F.W1 51 /r","V= ","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VSQRTSD xmm1, xmmV, xmm2/m64","VSQRTSD xmm2/m64, xmmV, xmm1","vsqrtsd xmm= 2/m64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 51 /r","V","V","AVX","","w,r,r","= ","" +"VSQRTSD xmm1, {k}{z}, xmmV, xmm2/m64","VSQRTSD xmm2/m64, xmmV, {k}{z}, xm= m1","vsqrtsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 51 /r","V= ","V","AVX512F","scale8","w,r,r,r","","" +"VSQRTSS xmm1{er}, {k}{z}, xmmV, xmm2","VSQRTSS xmm2, xmmV, {k}{z}, xmm1{e= r}","vsqrtss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F3.0F.W0 51 /r","V= ","V","AVX512F","modrm_regonly","w,r,r,r","","" +"VSQRTSS xmm1, xmmV, xmm2/m32","VSQRTSS xmm2/m32, xmmV, xmm1","vsqrtss xmm= 2/m32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 51 /r","V","V","AVX","","w,r,r","= ","" +"VSQRTSS xmm1, {k}{z}, xmmV, xmm2/m32","VSQRTSS xmm2/m32, xmmV, {k}{z}, xm= m1","vsqrtss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 51 /r","V= ","V","AVX512F","scale4","w,r,r,r","","" +"VSTMXCSR m32","VSTMXCSR m32","vstmxcsr m32","VEX.128.0F.WIG AE /3","V","V= ","AVX","modrm_memonly","w","","" +"VSUBPD xmm1, xmmV, xmm2/m128","VSUBPD xmm2/m128, xmmV, xmm1","vsubpd xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 5C /r","V","V","AVX","","w,r,r","= ","" +"VSUBPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VSUBPD xmm2/m128/m64bcst, = xmmV, {k}{z}, xmm1","vsubpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F.W1 5C /r","V","V","AVX512F+AVX512VL","bscale8,scale16","w,r,r,r= ","","" +"VSUBPD ymm1, ymmV, ymm2/m256","VSUBPD ymm2/m256, ymmV, ymm1","vsubpd ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 5C /r","V","V","AVX","","w,r,r","= ","" +"VSUBPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VSUBPD ymm2/m256/m64bcst, = ymmV, {k}{z}, ymm1","vsubpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F.W1 5C 
/r","V","V","AVX512F+AVX512VL","bscale8,scale32","w,r,r,r= ","","" +"VSUBPD zmm1{er}, {k}{z}, zmmV, zmm2","VSUBPD zmm2, zmmV, {k}{z}, zmm1{er}= ","vsubpd zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.66.0F.W1 5C /r","V","= V","AVX512F","modrm_regonly","w,r,r,r","","" +"VSUBPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VSUBPD zmm2/m512/m64bcst, = zmmV, {k}{z}, zmm1","vsubpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F.W1 5C /r","V","V","AVX512F","bscale8,scale64","w,r,r,r","","" +"VSUBPS xmm1, xmmV, xmm2/m128","VSUBPS xmm2/m128, xmmV, xmm1","vsubps xmm2= /m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 5C /r","V","V","AVX","","w,r,r","","" +"VSUBPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VSUBPS xmm2/m128/m32bcst, = xmmV, {k}{z}, xmm1","vsubps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.0F.W0 5C /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w,r,r,r","= ","" +"VSUBPS ymm1, ymmV, ymm2/m256","VSUBPS ymm2/m256, ymmV, ymm1","vsubps ymm2= /m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 5C /r","V","V","AVX","","w,r,r","","" +"VSUBPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VSUBPS ymm2/m256/m32bcst, = ymmV, {k}{z}, ymm1","vsubps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.0F.W0 5C /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w,r,r,r","= ","" +"VSUBPS zmm1{er}, {k}{z}, zmmV, zmm2","VSUBPS zmm2, zmmV, {k}{z}, zmm1{er}= ","vsubps zmm2, zmmV, {k}{z}, zmm1{er}","EVEX.NDS.512.0F.W0 5C /r","V","V",= "AVX512F","modrm_regonly","w,r,r,r","","" +"VSUBPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VSUBPS zmm2/m512/m32bcst, = zmmV, {k}{z}, zmm1","vsubps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.0F.W0 5C /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","","" +"VSUBSD xmm1{er}, {k}{z}, xmmV, xmm2","VSUBSD xmm2, xmmV, {k}{z}, xmm1{er}= ","vsubsd xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F2.0F.W1 5C /r","V","= V","AVX512F","modrm_regonly","w,r,r,r","","" +"VSUBSD xmm1, xmmV, xmm2/m64","VSUBSD xmm2/m64, xmmV, xmm1","vsubsd xmm2/m= 64, xmmV, xmm1","VEX.NDS.LIG.F2.0F.WIG 5C /r","V","V","AVX","","w,r,r","","" +"VSUBSD xmm1, {k}{z}, xmmV, xmm2/m64","VSUBSD xmm2/m64, xmmV, {k}{z}, xmm1= ","vsubsd xmm2/m64, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F2.0F.W1 5C /r","V","= V","AVX512F","scale8","w,r,r,r","","" +"VSUBSS xmm1{er}, {k}{z}, xmmV, xmm2","VSUBSS xmm2, xmmV, {k}{z}, xmm1{er}= ","vsubss xmm2, xmmV, {k}{z}, xmm1{er}","EVEX.NDS.128.F3.0F.W0 5C /r","V","= V","AVX512F","modrm_regonly","w,r,r,r","","" +"VSUBSS xmm1, xmmV, xmm2/m32","VSUBSS xmm2/m32, xmmV, xmm1","vsubss xmm2/m= 32, xmmV, xmm1","VEX.NDS.LIG.F3.0F.WIG 5C /r","V","V","AVX","","w,r,r","","" +"VSUBSS xmm1, {k}{z}, xmmV, xmm2/m32","VSUBSS xmm2/m32, xmmV, {k}{z}, xmm1= ","vsubss xmm2/m32, xmmV, {k}{z}, xmm1","EVEX.NDS.LIG.F3.0F.W0 5C /r","V","= V","AVX512F","scale4","w,r,r,r","","" +"VTESTPD xmm1, xmm2/m128","VTESTPD xmm2/m128, xmm1","vtestpd xmm2/m128, xm= m1","VEX.128.66.0F38.W0 0F /r","V","V","AVX","","r,r","","" +"VTESTPD ymm1, ymm2/m256","VTESTPD ymm2/m256, ymm1","vtestpd ymm2/m256, ym= m1","VEX.256.66.0F38.W0 0F /r","V","V","AVX","","r,r","","" +"VTESTPS xmm1, xmm2/m128","VTESTPS xmm2/m128, xmm1","vtestps xmm2/m128, xm= m1","VEX.128.66.0F38.W0 0E /r","V","V","AVX","","r,r","","" +"VTESTPS ymm1, ymm2/m256","VTESTPS ymm2/m256, ymm1","vtestps ymm2/m256, ym= m1","VEX.256.66.0F38.W0 0E /r","V","V","AVX","","r,r","","" +"VUCOMISD xmm1{sae}, xmm2","VUCOMISD xmm2, xmm1{sae}","vucomisd xmm2, xmm1= {sae}","EVEX.128.66.0F.W1 2E /r","V","V","AVX512F","modrm_regonly","r,r",""= ,"" +"VUCOMISD xmm1, xmm2/m64","VUCOMISD xmm2/m64, xmm1","vucomisd 
xmm2/m64, xm= m1","EVEX.LIG.66.0F.W1 2E /r","V","V","AVX512F","scale8","r,r","","" +"VUCOMISD xmm1, xmm2/m64","VUCOMISD xmm2/m64, xmm1","vucomisd xmm2/m64, xm= m1","VEX.LIG.66.0F.WIG 2E /r","V","V","AVX","","r,r","","" +"VUCOMISS xmm1{sae}, xmm2","VUCOMISS xmm2, xmm1{sae}","vucomiss xmm2, xmm1= {sae}","EVEX.128.0F.W0 2E /r","V","V","AVX512F","modrm_regonly","r,r","","" +"VUCOMISS xmm1, xmm2/m32","VUCOMISS xmm2/m32, xmm1","vucomiss xmm2/m32, xm= m1","EVEX.LIG.0F.W0 2E /r","V","V","AVX512F","scale4","r,r","","" +"VUCOMISS xmm1, xmm2/m32","VUCOMISS xmm2/m32, xmm1","vucomiss xmm2/m32, xm= m1","VEX.LIG.0F.WIG 2E /r","V","V","AVX","","r,r","","" +"VUNPCKHPD xmm1, xmmV, xmm2/m128","VUNPCKHPD xmm2/m128, xmmV, xmm1","vunpc= khpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 15 /r","V","V","AVX","",= "w,r,r","","" +"VUNPCKHPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VUNPCKHPD xmm2/m128/m64= bcst, xmmV, {k}{z}, xmm1","vunpckhpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.NDS.128.66.0F.W1 15 /r","V","V","AVX512F+AVX512VL","bscale8,scale16"= ,"w,r,r,r","","" +"VUNPCKHPD ymm1, ymmV, ymm2/m256","VUNPCKHPD ymm2/m256, ymmV, ymm1","vunpc= khpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 15 /r","V","V","AVX","",= "w,r,r","","" +"VUNPCKHPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VUNPCKHPD ymm2/m256/m64= bcst, ymmV, {k}{z}, ymm1","vunpckhpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.NDS.256.66.0F.W1 15 /r","V","V","AVX512F+AVX512VL","bscale8,scale32"= ,"w,r,r,r","","" +"VUNPCKHPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VUNPCKHPD zmm2/m512/m64= bcst, zmmV, {k}{z}, zmm1","vunpckhpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.NDS.512.66.0F.W1 15 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r= ","","" +"VUNPCKHPS xmm1, xmmV, xmm2/m128","VUNPCKHPS xmm2/m128, xmmV, xmm1","vunpc= khps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 15 /r","V","V","AVX","","w,= r,r","","" +"VUNPCKHPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VUNPCKHPS xmm2/m128/m32= bcst, xmmV, {k}{z}, xmm1","vunpckhps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.NDS.128.0F.W0 15 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w= ,r,r,r","","" +"VUNPCKHPS ymm1, ymmV, ymm2/m256","VUNPCKHPS ymm2/m256, ymmV, ymm1","vunpc= khps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 15 /r","V","V","AVX","","w,= r,r","","" +"VUNPCKHPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VUNPCKHPS ymm2/m256/m32= bcst, ymmV, {k}{z}, ymm1","vunpckhps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.NDS.256.0F.W0 15 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w= ,r,r,r","","" +"VUNPCKHPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VUNPCKHPS zmm2/m512/m32= bcst, zmmV, {k}{z}, zmm1","vunpckhps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.NDS.512.0F.W0 15 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","= ","" +"VUNPCKLPD xmm1, xmmV, xmm2/m128","VUNPCKLPD xmm2/m128, xmmV, xmm1","vunpc= klpd xmm2/m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 14 /r","V","V","AVX","",= "w,r,r","","" +"VUNPCKLPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VUNPCKLPD xmm2/m128/m64= bcst, xmmV, {k}{z}, xmm1","vunpcklpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.NDS.128.66.0F.W1 14 /r","V","V","AVX512F+AVX512VL","bscale8,scale16"= ,"w,r,r,r","","" +"VUNPCKLPD ymm1, ymmV, ymm2/m256","VUNPCKLPD ymm2/m256, ymmV, ymm1","vunpc= klpd ymm2/m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 14 /r","V","V","AVX","",= "w,r,r","","" +"VUNPCKLPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VUNPCKLPD ymm2/m256/m64= bcst, ymmV, {k}{z}, ymm1","vunpcklpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.NDS.256.66.0F.W1 14 
/r","V","V","AVX512F+AVX512VL","bscale8,scale32"= ,"w,r,r,r","","" +"VUNPCKLPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VUNPCKLPD zmm2/m512/m64= bcst, zmmV, {k}{z}, zmm1","vunpcklpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.NDS.512.66.0F.W1 14 /r","V","V","AVX512F","bscale8,scale64","w,r,r,r= ","","" +"VUNPCKLPS xmm1, xmmV, xmm2/m128","VUNPCKLPS xmm2/m128, xmmV, xmm1","vunpc= klps xmm2/m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 14 /r","V","V","AVX","","w,= r,r","","" +"VUNPCKLPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VUNPCKLPS xmm2/m128/m32= bcst, xmmV, {k}{z}, xmm1","vunpcklps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1"= ,"EVEX.NDS.128.0F.W0 14 /r","V","V","AVX512F+AVX512VL","bscale4,scale16","w= ,r,r,r","","" +"VUNPCKLPS ymm1, ymmV, ymm2/m256","VUNPCKLPS ymm2/m256, ymmV, ymm1","vunpc= klps ymm2/m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 14 /r","V","V","AVX","","w,= r,r","","" +"VUNPCKLPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VUNPCKLPS ymm2/m256/m32= bcst, ymmV, {k}{z}, ymm1","vunpcklps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1"= ,"EVEX.NDS.256.0F.W0 14 /r","V","V","AVX512F+AVX512VL","bscale4,scale32","w= ,r,r,r","","" +"VUNPCKLPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VUNPCKLPS zmm2/m512/m32= bcst, zmmV, {k}{z}, zmm1","vunpcklps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1"= ,"EVEX.NDS.512.0F.W0 14 /r","V","V","AVX512F","bscale4,scale64","w,r,r,r","= ","" +"VXORPD xmm1, xmmV, xmm2/m128","VXORPD xmm2/m128, xmmV, xmm1","vxorpd xmm2= /m128, xmmV, xmm1","VEX.NDS.128.66.0F.WIG 57 /r","V","V","AVX","","w,r,r","= ","" +"VXORPD xmm1, {k}{z}, xmmV, xmm2/m128/m64bcst","VXORPD xmm2/m128/m64bcst, = xmmV, {k}{z}, xmm1","vxorpd xmm2/m128/m64bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.66.0F.W1 57 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale16","w,r,r,= r","","" +"VXORPD ymm1, ymmV, ymm2/m256","VXORPD ymm2/m256, ymmV, ymm1","vxorpd ymm2= /m256, ymmV, ymm1","VEX.NDS.256.66.0F.WIG 57 /r","V","V","AVX","","w,r,r","= ","" +"VXORPD ymm1, {k}{z}, ymmV, ymm2/m256/m64bcst","VXORPD ymm2/m256/m64bcst, = ymmV, {k}{z}, ymm1","vxorpd ymm2/m256/m64bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.66.0F.W1 57 /r","V","V","AVX512DQ+AVX512VL","bscale8,scale32","w,r,r,= r","","" +"VXORPD zmm1, {k}{z}, zmmV, zmm2/m512/m64bcst","VXORPD zmm2/m512/m64bcst, = zmmV, {k}{z}, zmm1","vxorpd zmm2/m512/m64bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.66.0F.W1 57 /r","V","V","AVX512DQ","bscale8,scale64","w,r,r,r","","" +"VXORPS xmm1, xmmV, xmm2/m128","VXORPS xmm2/m128, xmmV, xmm1","vxorps xmm2= /m128, xmmV, xmm1","VEX.NDS.128.0F.WIG 57 /r","V","V","AVX","","w,r,r","","" +"VXORPS xmm1, {k}{z}, xmmV, xmm2/m128/m32bcst","VXORPS xmm2/m128/m32bcst, = xmmV, {k}{z}, xmm1","vxorps xmm2/m128/m32bcst, xmmV, {k}{z}, xmm1","EVEX.ND= S.128.0F.W0 57 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale16","w,r,r,r",= "","" +"VXORPS ymm1, ymmV, ymm2/m256","VXORPS ymm2/m256, ymmV, ymm1","vxorps ymm2= /m256, ymmV, ymm1","VEX.NDS.256.0F.WIG 57 /r","V","V","AVX","","w,r,r","","" +"VXORPS ymm1, {k}{z}, ymmV, ymm2/m256/m32bcst","VXORPS ymm2/m256/m32bcst, = ymmV, {k}{z}, ymm1","vxorps ymm2/m256/m32bcst, ymmV, {k}{z}, ymm1","EVEX.ND= S.256.0F.W0 57 /r","V","V","AVX512DQ+AVX512VL","bscale4,scale32","w,r,r,r",= "","" +"VXORPS zmm1, {k}{z}, zmmV, zmm2/m512/m32bcst","VXORPS zmm2/m512/m32bcst, = zmmV, {k}{z}, zmm1","vxorps zmm2/m512/m32bcst, zmmV, {k}{z}, zmm1","EVEX.ND= S.512.0F.W0 57 /r","V","V","AVX512DQ","bscale4,scale64","w,r,r,r","","" +"VZEROALL","VZEROALL","vzeroall","VEX.256.0F.WIG 77","V","V","AVX","","","= ","" +"VZEROUPPER","VZEROUPPER","vzeroupper","VEX.128.0F.WIG 
77","V","V","AVX","= ","","","" +"WAIT","WAIT","wait","9B","V","V","","pseudo","","","" +"WBINVD","WBINVD","wbinvd","0F 09","V","V","486","","","","" +"WRFSBASE rmr32","WRFSBASE rmr32","wrfsbase rmr32","F3 0F AE /2","N.S.","V= ","FSGSBASE","modrm_regonly,operand16,operand32","r","Y","32" +"WRFSBASE rmr64","WRFSBASE rmr64","wrfsbase rmr64","F3 REX.W 0F AE /2","N.= S.","V","FSGSBASE","modrm_regonly","r","Y","64" +"WRGSBASE rmr32","WRGSBASE rmr32","wrgsbase rmr32","F3 0F AE /3","N.S.","V= ","FSGSBASE","modrm_regonly,operand16,operand32","r","Y","32" +"WRGSBASE rmr64","WRGSBASE rmr64","wrgsbase rmr64","F3 REX.W 0F AE /3","N.= S.","V","FSGSBASE","modrm_regonly","r","Y","64" +"WRMSR","WRMSR","wrmsr","0F 30","V","V","Pentium","","","","" +"WRPKRU","WRPKRU","wrpkru","0F 01 EF","V","V","PKU","","","","" +"WRSSD m32, r32","WRSSD r32, m32","wrssd r32, m32","0F 38 F6 /r","V","V","= CET","modrm_memonly,operand16,operand32","w,r","","" +"WRSSQ m64, r64","WRSSQ r64, m64","wrssq r64, m64","REX.W 0F 38 F6 /r","N.= S.","V","CET","modrm_memonly","w,r","","" +"WRUSSD m32, r32","WRUSSD r32, m32","wrussd r32, m32","66 0F 38 F5 /r","V"= ,"V","CET","modrm_memonly,operand16,operand32","w,r","","" +"WRUSSQ m64, r64","WRUSSQ r64, m64","wrussq r64, m64","66 REX.W 0F 38 F5 /= r","N.S.","V","CET","modrm_memonly","w,r","","" +"XABORT imm8u","XABORT imm8u","xabort imm8u","C6 F8 ib","V","V","RTM","mod= rm_regonly","r","","" +"XACQUIRE","XACQUIRE","xacquire","F2","V","V","HLE","pseudo","","","" +"XADD r/m8, r8","XADDB r8, r/m8","xaddb r8, r/m8","0F C0 /r","V","V","486"= ,"","rw,rw","Y","8" +"XADD r/m8, r8","XADDB r8, r/m8","xaddb r8, r/m8","REX 0F C0 /r","N.E.","V= ","","pseudo64","rw,w","Y","8" +"XADD r/m32, r32","XADDL r32, r/m32","xaddl r32, r/m32","0F C1 /r","V","V"= ,"486","operand32","rw,rw","Y","32" +"XADD r/m64, r64","XADDQ r64, r/m64","xaddq r64, r/m64","REX.W 0F C1 /r","= N.S.","V","486","","rw,rw","Y","64" +"XADD r/m16, r16","XADDW r16, r/m16","xaddw r16, r/m16","0F C1 /r","V","V"= ,"486","operand16","rw,rw","Y","16" +"XBEGIN rel16","XBEGIN rel16","xbegin rel16","C7 F8 cw","V","V","RTM","mod= rm_regonly,operand16","r","","" +"XBEGIN rel32","XBEGIN rel32","xbegin rel32","C7 F8 cd","V","V","RTM","mod= rm_regonly,operand32,operand64","r","","" +"XCHG r8, r/m8","XCHGB r/m8, r8","xchgb r/m8, r8","86 /r","V","V","","pseu= do","w,r","Y","8" +"XCHG r8, r/m8","XCHGB r/m8, r8","xchgb r/m8, r8","REX 86 /r","N.E.","V","= ","pseudo","w,r","Y","8" +"XCHG r/m8, r8","XCHGB r8, r/m8","xchgb r8, r/m8","86 /r","V","V","","","r= w,rw","Y","8" +"XCHG r/m8, r8","XCHGB r8, r/m8","xchgb r8, r/m8","REX 86 /r","N.E.","V","= ","pseudo64","rw,r","Y","8" +"XCHG r32op, EAX","XCHGL EAX, r32op","xchgl EAX, r32op","90+rd","V","V",""= ,"operand32","rw,rw","Y","32" +"XCHG r32, r/m32","XCHGL r/m32, r32","xchgl r/m32, r32","87 /r","V","V",""= ,"operand32,pseudo","w,r","Y","32" +"XCHG r/m32, r32","XCHGL r32, r/m32","xchgl r32, r/m32","87 /r","V","V",""= ,"operand32","rw,rw","Y","32" +"XCHG EAX, r32op","XCHGL r32op, EAX","xchgl r32op, EAX","90+rd","V","V",""= ,"operand32,pseudo","rw,rw","Y","32" +"XCHG r64op, RAX","XCHGQ RAX, r64op","xchgq RAX, r64op","REX.W 90+ro","N.S= .","V","","","rw,rw","Y","64" +"XCHG r64, r/m64","XCHGQ r/m64, r64","xchgq r/m64, r64","REX.W 87 /r","N.E= .","V","","pseudo","w,r","Y","64" +"XCHG r/m64, r64","XCHGQ r64, r/m64","xchgq r64, r/m64","REX.W 87 /r","N.S= .","V","","","rw,rw","Y","64" +"XCHG RAX, r64op","XCHGQ r64op, RAX","xchgq r64op, RAX","REX.W 90+rd","N.E= .","V","","pseudo","rw,rw","Y","64" +"XCHG r16op, AX","XCHGW 
AX, r16op","xchgw AX, r16op","90+rw","V","V","","o= perand16","rw,rw","Y","16" +"XCHG r16, r/m16","XCHGW r/m16, r16","xchgw r/m16, r16","87 /r","V","V",""= ,"operand16,pseudo","w,r","Y","16" +"XCHG r/m16, r16","XCHGW r16, r/m16","xchgw r16, r/m16","87 /r","V","V",""= ,"operand16","rw,rw","Y","16" +"XCHG AX, r16op","XCHGW r16op, AX","xchgw r16op, AX","90+rw","V","V","","o= perand16,pseudo","rw,rw","Y","16" +"XEND","XEND","xend","0F 01 D5","V","V","RTM","","","","" +"XGETBV","XGETBV","xgetbv","0F 01 D0","V","V","XSAVE","","","","" +"XLATB","XLAT","xlat","D7","V","V","","","","","" +"XLATB","XLAT","xlat","REX.W D7","N.E.","V","","pseudo","","","" +"XOR r/m8, imm8","XORB imm8, r/m8","xorb imm8, r/m8","REX 80 /6 ib","N.E."= ,"V","","pseudo64","rw,r","Y","8" +"XOR AL, imm8u","XORB imm8u, AL","xorb imm8u, AL","34 ib","V","V","","","r= w,r","Y","8" +"XOR r/m8, imm8u","XORB imm8u, r/m8","xorb imm8u, r/m8","80 /6 ib","V","V"= ,"","","rw,r","Y","8" +"XOR r/m8, imm8u","XORB imm8u, r/m8","xorb imm8u, r/m8","82 /6 ib","V","N.= S.","","","rw,r","Y","8" +"XOR r8, r/m8","XORB r/m8, r8","xorb r/m8, r8","32 /r","V","V","","","rw,r= ","Y","8" +"XOR r8, r/m8","XORB r/m8, r8","xorb r/m8, r8","REX 32 /r","N.E.","V","","= pseudo64","rw,r","Y","8" +"XOR r/m8, r8","XORB r8, r/m8","xorb r8, r/m8","30 /r","V","V","","","rw,r= ","Y","8" +"XOR r/m8, r8","XORB r8, r/m8","xorb r8, r/m8","REX 30 /r","N.E.","V","","= pseudo64","rw,r","Y","8" +"XOR EAX, imm32","XORL imm32, EAX","xorl imm32, EAX","35 id","V","V","","o= perand32","rw,r","Y","32" +"XOR r/m32, imm32","XORL imm32, r/m32","xorl imm32, r/m32","81 /6 id","V",= "V","","operand32","rw,r","Y","32" +"XOR r/m32, imm8","XORL imm8, r/m32","xorl imm8, r/m32","83 /6 ib","V","V"= ,"","operand32","rw,r","Y","32" +"XOR r32, r/m32","XORL r/m32, r32","xorl r/m32, r32","33 /r","V","V","","o= perand32","rw,r","Y","32" +"XOR r/m32, r32","XORL r32, r/m32","xorl r32, r/m32","31 /r","V","V","","o= perand32","rw,r","Y","32" +"XORPD xmm1, xmm2/m128","XORPD xmm2/m128, xmm1","xorpd xmm2/m128, xmm1","6= 6 0F 57 /r","V","V","SSE2","","rw,r","","" +"XORPS xmm1, xmm2/m128","XORPS xmm2/m128, xmm1","xorps xmm2/m128, xmm1","0= F 57 /r","V","V","SSE","","rw,r","","" +"XOR RAX, imm32","XORQ imm32, RAX","xorq imm32, RAX","REX.W 35 id","N.S.",= "V","","","rw,r","Y","64" +"XOR r/m64, imm32","XORQ imm32, r/m64","xorq imm32, r/m64","REX.W 81 /6 id= ","N.S.","V","","","rw,r","Y","64" +"XOR r/m64, imm8","XORQ imm8, r/m64","xorq imm8, r/m64","REX.W 83 /6 ib","= N.S.","V","","","rw,r","Y","64" +"XOR r64, r/m64","XORQ r/m64, r64","xorq r/m64, r64","REX.W 33 /r","N.S.",= "V","","","rw,r","Y","64" +"XOR r/m64, r64","XORQ r64, r/m64","xorq r64, r/m64","REX.W 31 /r","N.S.",= "V","","","rw,r","Y","64" +"XOR AX, imm16","XORW imm16, AX","xorw imm16, AX","35 iw","V","V","","oper= and16","rw,r","Y","16" +"XOR r/m16, imm16","XORW imm16, r/m16","xorw imm16, r/m16","81 /6 iw","V",= "V","","operand16","rw,r","Y","16" +"XOR r/m16, imm8","XORW imm8, r/m16","xorw imm8, r/m16","83 /6 ib","V","V"= ,"","operand16","rw,r","Y","16" +"XOR r16, r/m16","XORW r/m16, r16","xorw r/m16, r16","33 /r","V","V","","o= perand16","rw,r","Y","16" +"XOR r/m16, r16","XORW r16, r/m16","xorw r16, r/m16","31 /r","V","V","","o= perand16","rw,r","Y","16" +"XRELEASE","XRELEASE","xrelease","F3","V","V","HLE","pseudo","","","" +"XRSTOR mem","XRSTOR mem","xrstor mem","0F AE /5","V","V","XSAVE","modrm_m= emonly,operand16,operand32","r","","" +"XRSTOR64 mem","XRSTOR64 mem","xrstor64 mem","REX.W 0F AE /5","N.S.","V","= XSAVE","modrm_memonly","r","","" 
+"XRSTORS mem","XRSTORS mem","xrstors mem","0F C7 /3","V","V","XSAVES","mod= rm_memonly,operand16,operand32","r","","" +"XRSTORS64 mem","XRSTORS64 mem","xrstors64 mem","REX.W 0F C7 /3","N.S.","V= ","XSAVES","modrm_memonly","r","","" +"XSAVE mem","XSAVE mem","xsave mem","0F AE /4","V","V","XSAVE","modrm_memo= nly,operand16,operand32","w","","" +"XSAVE64 mem","XSAVE64 mem","xsave64 mem","REX.W 0F AE /4","N.S.","V","XSA= VE","modrm_memonly","w","","" +"XSAVEC mem","XSAVEC mem","xsavec mem","0F C7 /4","V","V","XSAVEC","modrm_= memonly,operand16,operand32","w","","" +"XSAVEC64 mem","XSAVEC64 mem","xsavec64 mem","REX.W 0F C7 /4","N.S.","V","= XSAVEC","modrm_memonly","w","","" +"XSAVEOPT mem","XSAVEOPT mem","xsaveopt mem","0F AE /6","V","V","XSAVEOPT"= ,"modrm_memonly,operand16,operand32","w","","" +"XSAVEOPT64 mem","XSAVEOPT64 mem","xsaveopt64 mem","REX.W 0F AE /6","N.S."= ,"V","XSAVEOPT","modrm_memonly","w","","" +"XSAVES mem","XSAVES mem","xsaves mem","0F C7 /5","V","V","XSAVES","modrm_= memonly,operand16,operand32","w","","" +"XSAVES64 mem","XSAVES64 mem","xsaves64 mem","REX.W 0F C7 /5","N.S.","V","= XSAVES","modrm_memonly","w","","" +"XSETBV","XSETBV","xsetbv","0F 01 D1","V","V","XSAVE","","","","" +"XTEST","XTEST","xtest","0F 01 D6","V","V","HLE or RTM","","","","" --=20 2.36.0 From nobody Thu May 2 14:26:59 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1650839394406221.12979583081017; Sun, 24 Apr 2022 15:29:54 -0700 (PDT) Received: from localhost ([::1]:50530 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nikjp-0006y6-3E for importer@patchew.org; Sun, 24 Apr 2022 18:29:53 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50872) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nikTv-0002PM-S7 for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:13:27 -0400 Received: from nowt.default.pbrook.uk0.bigv.io ([2001:41c8:51:832:fcff:ff:fe00:46dd]:58856) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1nikTu-0002vq-Cd for qemu-devel@nongnu.org; Sun, 24 Apr 2022 18:13:27 -0400 Received: from cpc91554-seac25-2-0-cust857.7-2.cable.virginm.net ([82.27.199.90] helo=wren.home) by nowt.default.pbrook.uk0.bigv.io with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1nikJE-0001ea-Cg; Sun, 24 Apr 2022 23:02:24 +0100 From: Paul Brook To: Paolo Bonzini , Richard Henderson , Eduardo Habkost Subject: [PATCH v2 42/42] i386: Add sha512-avx test Date: Sun, 24 Apr 2022 23:02:04 +0100 Message-Id: <20220424220204.2493824-43-paul@nowt.org> X-Mailer: git-send-email 2.36.0 In-Reply-To: <20220418173904.3746036-1-paul@nowt.org> References: <20220418173904.3746036-1-paul@nowt.org> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2001:41c8:51:832:fcff:ff:fe00:46dd; envelope-from=paul@nowt.org; helo=nowt.default.pbrook.uk0.bigv.io X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 
Cc: "open list:All patches CC here" <qemu-devel@nongnu.org>, Paul Brook <paul@nowt.org>
Content-Type: text/plain; charset="utf-8"

Include sha512 built with avx[2] in the tcg tests.

Signed-off-by: Paul Brook <paul@nowt.org>
---
 tests/tcg/i386/Makefile.target | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/tests/tcg/i386/Makefile.target b/tests/tcg/i386/Makefile.target
index eb06f7eb89..a0335fff6d 100644
--- a/tests/tcg/i386/Makefile.target
+++ b/tests/tcg/i386/Makefile.target
@@ -79,7 +79,14 @@ sha512-sse: sha512.c
 run-sha512-sse: QEMU_OPTS+=-cpu max
 run-plugin-sha512-sse-with-%: QEMU_OPTS+=-cpu max
 
-TESTS+=sha512-sse
+sha512-avx: CFLAGS=-mavx2 -mavx -O3
+sha512-avx: sha512.c
+	$(CC) $(CFLAGS) $(EXTRA_CFLAGS) $< -o $@ $(LDFLAGS)
+
+run-sha512-avx: QEMU_OPTS+=-cpu max
+run-plugin-sha512-avx-with-%: QEMU_OPTS+=-cpu max
+
+TESTS+=sha512-sse sha512-avx
 
 test-avx.h: test-avx.py x86.csv
 	$(PYTHON) $(I386_SRC)/test-avx.py $(I386_SRC)/x86.csv $@
-- 
2.36.0
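A quick note on exercising the new test (a sketch, not part of the patch itself): QEMU's tcg tests are normally driven through the check-tcg harness, so from a configured build tree something like the following should build and run sha512-avx. The per-target directory in the second command is an assumption and depends on which targets were configured:

  # Run the whole tcg test suite for all configured targets; this now
  # includes sha512-avx (needs a host compiler that accepts -mavx/-mavx2;
  # the run-* rules above pass "-cpu max" so the guest CPU reports AVX2).
  make check-tcg

  # Or, hypothetically, run just this test from the per-target build
  # directory (path is an assumption; adjust to your configuration):
  make -C tests/tcg/x86_64-linux-user run-sha512-avx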