From: Paolo Bonzini <pbonzini@redhat.com>
To: qemu-devel@nongnu.org
Subject: [PATCH 22/22] target/i386: implement CMPccXADD
Date: Fri, 22 Dec 2023 19:16:03 +0100
Message-ID: <20231222181603.174137-23-pbonzini@redhat.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20231222181603.174137-1-pbonzini@redhat.com>
References: <20231222181603.174137-1-pbonzini@redhat.com>

The main difficulty here is that a page fault when writing to the destination
must not overwrite the flags.  Therefore, the compute-flags helper must be
called with a temporary destination instead of using gen_jcc1*.

For simplicity, I am using an unconditional cmpxchg operation, that becomes
a NOP if the comparison fails.

Signed-off-by: Paolo Bonzini
Reviewed-by: Richard Henderson
---
 target/i386/cpu.c                |   2 +-
 target/i386/tcg/decode-new.c.inc |  25 ++++++++
 target/i386/tcg/decode-new.h     |   1 +
 target/i386/tcg/emit.c.inc       | 104 +++++++++++++++++++++++++++++++
 target/i386/tcg/translate.c      |   2 +
 5 files changed, 133 insertions(+), 1 deletion(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 95d5f16cd5e..fd47ee7defb 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -738,7 +738,7 @@ void x86_cpu_vendor_words2str(char *dst, uint32_t vendor1,
 #define TCG_7_0_EDX_FEATURES (CPUID_7_0_EDX_FSRM | CPUID_7_0_EDX_KERNEL_FEATURES)
 
 #define TCG_7_1_EAX_FEATURES (CPUID_7_1_EAX_FZRM | CPUID_7_1_EAX_FSRS | \
-          CPUID_7_1_EAX_FSRC)
+          CPUID_7_1_EAX_FSRC | CPUID_7_1_EAX_CMPCCXADD)
 #define TCG_7_1_EDX_FEATURES 0
 #define TCG_7_2_EDX_FEATURES 0
 #define TCG_APM_FEATURES 0
diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 717d7307722..426c4594120 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -538,6 +538,28 @@ static const X86OpEntry opcodes_0F38_00toEF[240] = {
     [0xdd] = X86_OP_ENTRY3(VAESENCLAST, V,x, H,x, W,x, vex4 cpuid(AES) p_66),
     [0xde] = X86_OP_ENTRY3(VAESDEC, V,x, H,x, W,x, vex4 cpuid(AES) p_66),
     [0xdf] = X86_OP_ENTRY3(VAESDECLAST, V,x, H,x, W,x, vex4 cpuid(AES) p_66),
+
+    /*
+     * REG selects srcdest2 operand, VEX.vvvv selects src3.  VEX class not found
+     * in manual, assumed to be 13 from the VEX.L0 constraint.
+     */
+    [0xe0] = X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o64) cpuid(CMPCCXADD) p_66),
+    [0xe1] = X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o64) cpuid(CMPCCXADD) p_66),
+    [0xe2] = X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o64) cpuid(CMPCCXADD) p_66),
+    [0xe3] = X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o64) cpuid(CMPCCXADD) p_66),
+    [0xe4] = X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o64) cpuid(CMPCCXADD) p_66),
+    [0xe5] = X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o64) cpuid(CMPCCXADD) p_66),
+    [0xe6] = X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o64) cpuid(CMPCCXADD) p_66),
+    [0xe7] = X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o64) cpuid(CMPCCXADD) p_66),
+
+    [0xe8] = X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o64) cpuid(CMPCCXADD) p_66),
+    [0xe9] = X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o64) cpuid(CMPCCXADD) p_66),
+    [0xea] = X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o64) cpuid(CMPCCXADD) p_66),
+    [0xeb] = X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o64) cpuid(CMPCCXADD) p_66),
+    [0xec] = X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o64) cpuid(CMPCCXADD) p_66),
+    [0xed] = X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o64) cpuid(CMPCCXADD) p_66),
+    [0xee] = X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o64) cpuid(CMPCCXADD) p_66),
+    [0xef] = X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o64) cpuid(CMPCCXADD) p_66),
 };
 
 /* five rows for no prefix, 66, F3, F2, 66+F2  */
@@ -1503,6 +1525,9 @@ static bool has_cpuid_feature(DisasContext *s, X86CPUIDFeature cpuid)
         return (s->cpuid_7_0_ebx_features & CPUID_7_0_EBX_AVX2);
     case X86_FEAT_SHA_NI:
         return (s->cpuid_7_0_ebx_features & CPUID_7_0_EBX_SHA_NI);
+
+    case X86_FEAT_CMPCCXADD:
+        return (s->cpuid_7_1_eax_features & CPUID_7_1_EAX_CMPCCXADD);
     }
     g_assert_not_reached();
 }
diff --git a/target/i386/tcg/decode-new.h b/target/i386/tcg/decode-new.h
index 25220fc4362..15e6bfef4b1 100644
--- a/target/i386/tcg/decode-new.h
+++ b/target/i386/tcg/decode-new.h
@@ -104,6 +104,7 @@ typedef enum X86CPUIDFeature {
     X86_FEAT_AVX2,
     X86_FEAT_BMI1,
     X86_FEAT_BMI2,
+    X86_FEAT_CMPCCXADD,
     X86_FEAT_F16C,
     X86_FEAT_FMA,
     X86_FEAT_MOVBE,
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index fd120e7b9b4..f05f79e1f62 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -1190,6 +1190,109 @@ static void gen_BZHI(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
     prepare_update2_cc(decode, s, CC_OP_BMILGB + ot);
 }
 
+static void gen_CMPccXADD(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
+{
+    TCGLabel *label_top = gen_new_label();
+    TCGLabel *label_bottom = gen_new_label();
+    TCGv oldv = tcg_temp_new();
+    TCGv newv = tcg_temp_new();
+    TCGv cmpv = tcg_temp_new();
+    TCGCond cond;
+
+    TCGv cmp_lhs, cmp_rhs;
+    MemOp ot, ot_full;
+
+    int jcc_op = (decode->b >> 1) & 7;
+    static const TCGCond cond_table[8] = {
+        [JCC_O] = TCG_COND_LT,  /* test sign bit by comparing against 0 */
+        [JCC_B] = TCG_COND_LTU,
+        [JCC_Z] = TCG_COND_EQ,
+        [JCC_BE] = TCG_COND_LEU,
+        [JCC_S] = TCG_COND_LT,  /* test sign bit by comparing against 0 */
+        [JCC_P] = TCG_COND_EQ,  /* even parity - tests low bit of popcount */
+        [JCC_L] = TCG_COND_LT,
+        [JCC_LE] = TCG_COND_LE,
+    };
+
+    cond = cond_table[jcc_op];
+    if (decode->b & 1) {
+        cond = tcg_invert_cond(cond);
+    }
+
+    ot = decode->op[0].ot;
+    ot_full = ot | MO_LE;
+    if (jcc_op >= JCC_S) {
+        /*
+         * Sign-extend values before subtracting for S, P (zero/sign extension
+         * does not matter there) L, LE and their inverses.
+         */
+        ot_full |= MO_SIGN;
+    }
+
+    /*
+     * cmpv will be moved to cc_src *after* cpu_regs[] is written back, so use
+     * tcg_gen_ext_tl instead of gen_ext_tl.
+     */
+    tcg_gen_ext_tl(cmpv, cpu_regs[decode->op[1].n], ot_full);
+
+    /*
+     * Cmpxchg loop starts here.
+     * - s->T1: addition operand (from decoder)
+     * - s->A0: dest address (from decoder)
+     * - s->cc_srcT: memory operand (lhs for comparison)
+     * - cmpv: rhs for comparison
+     */
+    gen_set_label(label_top);
+    gen_op_ld_v(s, ot_full, s->cc_srcT, s->A0);
+    tcg_gen_sub_tl(s->T0, s->cc_srcT, cmpv);
+
+    /* Compute the comparison result by hand, to avoid clobbering cc_*.  */
+    switch (jcc_op) {
+    case JCC_O:
+        /* (src1 ^ src2) & (src1 ^ dst). newv is only used here for a moment */
+        tcg_gen_xor_tl(newv, s->cc_srcT, s->T0);
+        tcg_gen_xor_tl(s->tmp0, s->cc_srcT, cmpv);
+        tcg_gen_and_tl(s->tmp0, s->tmp0, newv);
+        tcg_gen_sextract_tl(s->tmp0, s->tmp0, 0, 8 << ot);
+        cmp_lhs = s->tmp0, cmp_rhs = tcg_constant_tl(0);
+        break;
+
+    case JCC_P:
+        tcg_gen_ext8u_tl(s->tmp0, s->T0);
+        tcg_gen_ctpop_tl(s->tmp0, s->tmp0);
+        tcg_gen_andi_tl(s->tmp0, s->tmp0, 1);
+        cmp_lhs = s->tmp0, cmp_rhs = tcg_constant_tl(0);
+        break;
+
+    case JCC_S:
+        cmp_lhs = s->T0, cmp_rhs = tcg_constant_tl(0);
+        break;
+
+    default:
+        cmp_lhs = s->cc_srcT, cmp_rhs = cmpv;
+        break;
+    }
+
+    /* Compute new value: if condition does not hold, just store back s->cc_srcT */
+    tcg_gen_add_tl(newv, s->cc_srcT, s->T1);
+    tcg_gen_movcond_tl(cond, newv, cmp_lhs, cmp_rhs, newv, s->cc_srcT);
+    tcg_gen_atomic_cmpxchg_tl(oldv, s->A0, s->cc_srcT, newv, s->mem_index, ot_full);
+
+    /* Exit unconditionally if cmpxchg succeeded.  */
+    tcg_gen_brcond_tl(TCG_COND_EQ, oldv, s->cc_srcT, label_bottom);
+
+    /* Try again if there was actually a store to make.  */
+    tcg_gen_brcond_tl(cond, cmp_lhs, cmp_rhs, label_top);
+    gen_set_label(label_bottom);
+
+    /* Store old value to registers only after a successful store.  */
+    gen_writeback(s, decode, 1, s->cc_srcT);
+
+    decode->cc_dst = s->T0;
+    decode->cc_src = cmpv;
+    decode->cc_op = CC_OP_SUBB + ot;
+}
+
 static void gen_CRC32(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
 {
     MemOp ot = decode->op[2].ot;
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 53b98d5e6ac..c6bb215d68a 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -122,6 +122,7 @@ typedef struct DisasContext {
     int cpuid_ext3_features;
     int cpuid_7_0_ebx_features;
     int cpuid_7_0_ecx_features;
+    int cpuid_7_1_eax_features;
     int cpuid_xsave_features;
 
     /* TCG local temps */
@@ -6973,6 +6974,7 @@ static void i386_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cpu)
     dc->cpuid_ext3_features = env->features[FEAT_8000_0001_ECX];
     dc->cpuid_7_0_ebx_features = env->features[FEAT_7_0_EBX];
     dc->cpuid_7_0_ecx_features = env->features[FEAT_7_0_ECX];
+    dc->cpuid_7_1_eax_features = env->features[FEAT_7_1_EAX];
     dc->cpuid_xsave_features = env->features[FEAT_XSAVE];
     dc->jmp_opt = !((cflags & CF_NO_GOTO_TB) ||
                     (flags & (HF_TF_MASK | HF_INHIBIT_IRQ_MASK)));
-- 
2.43.0
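
Illustration only, not part of the patch: the retry logic described in the
commit message (an unconditional cmpxchg that stores the old value back when
the comparison fails) can be modelled in plain C11 roughly as follows. This
is a sketch under stated assumptions, not QEMU code. The names
cmpccxadd32_model, cond_fn and cond_equal are hypothetical, only 32-bit
operands and a single condition are modelled, and the flags computation is
left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical predicate type: does the condition hold for (memory, cmp)? */
typedef bool (*cond_fn)(uint32_t mem_val, uint32_t cmp_val);

/* Hypothetical model of the "Z" condition only: memory equals comparand. */
static bool cond_equal(uint32_t mem_val, uint32_t cmp_val)
{
    return mem_val == cmp_val;
}

/*
 * Model of the generated loop: load, pick the value to store, then always
 * issue a compare-and-swap.  If the condition does not hold the old value
 * is stored back, so the cmpxchg behaves as a NOP.  Returns the value that
 * was loaded from memory (what the guest register would receive).
 */
static uint32_t cmpccxadd32_model(_Atomic uint32_t *mem, uint32_t cmp_val,
                                  uint32_t addend, cond_fn cond)
{
    assert(mem != NULL);

    for (;;) {
        uint32_t old = atomic_load(mem);
        bool take = cond(old, cmp_val);
        uint32_t newv = take ? old + addend : old;
        uint32_t expected = old;

        /* Exit unconditionally if the cmpxchg succeeded. */
        if (atomic_compare_exchange_strong(mem, &expected, newv)) {
            return old;
        }
        /* Retry only if there was actually a store to make. */
        if (!take) {
            return old;
        }
    }
}
```

In this model the return value is produced only after the loop exits,
mirroring how the patch writes back the register after the store has gone
through.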