From nobody Tue Nov 26 22:20:13 2024 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1704012728; cv=none; d=zohomail.com; s=zohoarc; b=d0slOYxDEt+wWEwAKFeuBMjoI+xg8nl7R8fMRcx3/iFh2Eh/XCbr3c23AqBIDYmUMAGwF1MZ+qCsiD0UkwCtZf99Tdrwtf2uwXJocRPh4FB4pfBM+aMDsgJpD5HoVhHOMbcEiP2QAGvoPMGzSn8jhv5bfSbQKp0MNe90zZJsSTQ= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1704012728; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=QrgSQf0vPXxBEO3kmxsTrPnNKx7sufkUNSl4oDzUU84=; b=LgvJCsKpA1otH58JLNDlhnN5x7rBFr+msS5pUKqF0ZESqbT0amJp9BUTRwHPwMnJ870hQ0IXw4WWuUjwBYg94MCVxsYvfVaEmQTBEwXqUGSLpJH6SZ5ukVNe601SZwe9klCpc4fR0PfMwXnHPQc7Ec7rIwE086oMPYYWkE1ldak= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1704012728848877.8443434858153; Sun, 31 Dec 2023 00:52:08 -0800 (PST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rJrUn-0007rg-8l; Sun, 31 Dec 2023 03:48:33 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rJrSe-0004xc-92 for qemu-devel@nongnu.org; Sun, 31 Dec 2023 03:46:27 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rJrSc-0000o5-CI for qemu-devel@nongnu.org; Sun, 31 Dec 2023 03:46:20 -0500 Received: from mail-wm1-f71.google.com (mail-wm1-f71.google.com [209.85.128.71]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-12-N9Yhf8RlPoyrUFVIblobwQ-1; Sun, 31 Dec 2023 03:46:10 -0500 Received: by mail-wm1-f71.google.com with SMTP id 5b1f17b1804b1-40d86184891so2567645e9.1 for ; Sun, 31 Dec 2023 00:46:10 -0800 (PST) Received: from [192.168.10.118] ([2001:b07:6468:f312:1c09:f536:3de6:228c]) by smtp.gmail.com with ESMTPSA id r14-20020a05600c458e00b0040d724896cbsm9028379wmo.18.2023.12.31.00.46.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 31 Dec 2023 00:46:07 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1704012377; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=QrgSQf0vPXxBEO3kmxsTrPnNKx7sufkUNSl4oDzUU84=; b=FHSCsvbONlMKEgGycv3Tz9KwkhBY5yLjHM19Nq45Zyzu4/bbzSkSuT3o0ApY69O4xcEWFn dlpKVlw7yZRik4CwbuWIli1Sxwi2Y7luOmYv+4OjbKvZGG/jS1+G0PyE5H3lSo9LBkwWKH qHsM/qNgFHsJVM20BZQyqqci/hK4O1Y= X-MC-Unique: N9Yhf8RlPoyrUFVIblobwQ-1 X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1704012368; x=1704617168; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=QrgSQf0vPXxBEO3kmxsTrPnNKx7sufkUNSl4oDzUU84=; b=Z9+gcIsZ2lndqMzsASOj1JT73zHbpxFwIcrB6OQLy5fodCaktsw/aManQbCrgTkRTu bLRjy/4aWpnfRqS4LAFLz0lO9ywE6HSfoFRxNtACy+z0eKLvktdLdn5FGvOgk2LeFe+J ra8juZ9K6vogGz80LWE1jFwqv/d/XgbqBi87/TTsBr3gVuv9l3ZK+m7bFrmXJpSHoCg8 xMmSUlig8Zkt8mwgNshMW2TNP2za9ASmPt4QfqMLddv3XQ86JdDj0HLBdI6UZx5Bt1qh 81GDsZeeCiI5Lx6K9MwQMJO8jaAcwDVv0HBGGqNWA7A0f8OkCiynM99ShWuczt0X+lR8 rxqg== X-Gm-Message-State: AOJu0YxFpXZRVLDFvrNiP4Z1mQJGaPUvVUpq0Ld01upVcBLCiEEbefcR WCMl7XV0UUTd8aVrQghXbfyOZZykoFsYS/DHq5DhelTCT0JjwuD5R6QcmqvG+bWcW80M4XL+Qxv UncqjPex8SVTKjzAH4XKWDJOgbApOiZTpHGfhXKrTs601v8ON5IRBHxXMvbtkTJvFHhI6PhNgzp xRpx+0BBA= X-Received: by 2002:a05:600c:1c91:b0:40d:3ad5:7622 with SMTP id k17-20020a05600c1c9100b0040d3ad57622mr7610023wms.124.1704012368699; Sun, 31 Dec 2023 00:46:08 -0800 (PST) X-Google-Smtp-Source: AGHT+IFJPzHlOlwrIEuM6WW8NKwgu/clufBtKd+cvXvKbegbTXh7Njvt8/v+8NJB7bwmP/pR1TatMA== X-Received: by 2002:a05:600c:1c91:b0:40d:3ad5:7622 with SMTP id k17-20020a05600c1c9100b0040d3ad57622mr7610016wms.124.1704012368264; Sun, 31 Dec 2023 00:46:08 -0800 (PST) From: Paolo Bonzini To: qemu-devel@nongnu.org Cc: Richard Henderson Subject: [PULL 22/46] target/i386: implement CMPccXADD Date: Sun, 31 Dec 2023 09:44:38 +0100 Message-ID: <20231231084502.235366-23-pbonzini@redhat.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20231231084502.235366-1-pbonzini@redhat.com> References: <20231231084502.235366-1-pbonzini@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=pbonzini@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -47 X-Spam_score: -4.8 X-Spam_bar: ---- X-Spam_report: (-4.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-2.667, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1704012729256100001 Content-Type: text/plain; charset="utf-8" The main difficulty here is that a page fault when writing to the destinati= on must not overwrite the flags. Therefore, the flags computation must be inlined instead of using gen_jcc1*. For simplicity, I am using an unconditional cmpxchg operation, that becomes a NOP if the comparison fails. Reviewed-by: Richard Henderson Signed-off-by: Paolo Bonzini --- target/i386/cpu.c | 2 +- target/i386/tcg/decode-new.c.inc | 25 ++++++++ target/i386/tcg/decode-new.h | 1 + target/i386/tcg/emit.c.inc | 104 +++++++++++++++++++++++++++++++ target/i386/tcg/translate.c | 2 + 5 files changed, 133 insertions(+), 1 deletion(-) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 95d5f16cd5e..fd47ee7defb 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -738,7 +738,7 @@ void x86_cpu_vendor_words2str(char *dst, uint32_t vendo= r1, #define TCG_7_0_EDX_FEATURES (CPUID_7_0_EDX_FSRM | CPUID_7_0_EDX_KERNEL_FE= ATURES) =20 #define TCG_7_1_EAX_FEATURES (CPUID_7_1_EAX_FZRM | CPUID_7_1_EAX_FSRS | \ - CPUID_7_1_EAX_FSRC) + CPUID_7_1_EAX_FSRC | CPUID_7_1_EAX_CMPCCXADD) #define TCG_7_1_EDX_FEATURES 0 #define TCG_7_2_EDX_FEATURES 0 #define TCG_APM_FEATURES 0 diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.= c.inc index 717d7307722..426c4594120 100644 --- a/target/i386/tcg/decode-new.c.inc +++ b/target/i386/tcg/decode-new.c.inc @@ -538,6 +538,28 @@ static const X86OpEntry opcodes_0F38_00toEF[240] =3D { [0xdd] =3D X86_OP_ENTRY3(VAESENCLAST, V,x, H,x, W,x, vex4 cpui= d(AES) p_66), [0xde] =3D X86_OP_ENTRY3(VAESDEC, V,x, H,x, W,x, vex4 cpui= d(AES) p_66), [0xdf] =3D X86_OP_ENTRY3(VAESDECLAST, V,x, H,x, W,x, vex4 cpui= d(AES) p_66), + + /* + * REG selects srcdest2 operand, VEX.vvvv selects src3. VEX class not= found + * in manual, assumed to be 13 from the VEX.L0 constraint. + */ + [0xe0] =3D X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o= 64) cpuid(CMPCCXADD) p_66), + [0xe1] =3D X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o= 64) cpuid(CMPCCXADD) p_66), + [0xe2] =3D X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o= 64) cpuid(CMPCCXADD) p_66), + [0xe3] =3D X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o= 64) cpuid(CMPCCXADD) p_66), + [0xe4] =3D X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o= 64) cpuid(CMPCCXADD) p_66), + [0xe5] =3D X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o= 64) cpuid(CMPCCXADD) p_66), + [0xe6] =3D X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o= 64) cpuid(CMPCCXADD) p_66), + [0xe7] =3D X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o= 64) cpuid(CMPCCXADD) p_66), + + [0xe8] =3D X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o= 64) cpuid(CMPCCXADD) p_66), + [0xe9] =3D X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o= 64) cpuid(CMPCCXADD) p_66), + [0xea] =3D X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o= 64) cpuid(CMPCCXADD) p_66), + [0xeb] =3D X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o= 64) cpuid(CMPCCXADD) p_66), + [0xec] =3D X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o= 64) cpuid(CMPCCXADD) p_66), + [0xed] =3D X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o= 64) cpuid(CMPCCXADD) p_66), + [0xee] =3D X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o= 64) cpuid(CMPCCXADD) p_66), + [0xef] =3D X86_OP_ENTRY3(CMPccXADD, M,y, G,y, B,y, vex13 xchg chk(o= 64) cpuid(CMPCCXADD) p_66), }; =20 /* five rows for no prefix, 66, F3, F2, 66+F2 */ @@ -1503,6 +1525,9 @@ static bool has_cpuid_feature(DisasContext *s, X86CPU= IDFeature cpuid) return (s->cpuid_7_0_ebx_features & CPUID_7_0_EBX_AVX2); case X86_FEAT_SHA_NI: return (s->cpuid_7_0_ebx_features & CPUID_7_0_EBX_SHA_NI); + + case X86_FEAT_CMPCCXADD: + return (s->cpuid_7_1_eax_features & CPUID_7_1_EAX_CMPCCXADD); } g_assert_not_reached(); } diff --git a/target/i386/tcg/decode-new.h b/target/i386/tcg/decode-new.h index 25220fc4362..15e6bfef4b1 100644 --- a/target/i386/tcg/decode-new.h +++ b/target/i386/tcg/decode-new.h @@ -104,6 +104,7 @@ typedef enum X86CPUIDFeature { X86_FEAT_AVX2, X86_FEAT_BMI1, X86_FEAT_BMI2, + X86_FEAT_CMPCCXADD, X86_FEAT_F16C, X86_FEAT_FMA, X86_FEAT_MOVBE, diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc index fd120e7b9b4..6bcf88ecd71 100644 --- a/target/i386/tcg/emit.c.inc +++ b/target/i386/tcg/emit.c.inc @@ -1190,6 +1190,110 @@ static void gen_BZHI(DisasContext *s, CPUX86State *= env, X86DecodedInsn *decode) prepare_update2_cc(decode, s, CC_OP_BMILGB + ot); } =20 +static void gen_CMPccXADD(DisasContext *s, CPUX86State *env, X86DecodedIns= n *decode) +{ + TCGLabel *label_top =3D gen_new_label(); + TCGLabel *label_bottom =3D gen_new_label(); + TCGv oldv =3D tcg_temp_new(); + TCGv newv =3D tcg_temp_new(); + TCGv cmpv =3D tcg_temp_new(); + TCGCond cond; + + TCGv cmp_lhs, cmp_rhs; + MemOp ot, ot_full; + + int jcc_op =3D (decode->b >> 1) & 7; + static const TCGCond cond_table[8] =3D { + [JCC_O] =3D TCG_COND_LT, /* test sign bit by comparing against 0 = */ + [JCC_B] =3D TCG_COND_LTU, + [JCC_Z] =3D TCG_COND_EQ, + [JCC_BE] =3D TCG_COND_LEU, + [JCC_S] =3D TCG_COND_LT, /* test sign bit by comparing against 0 = */ + [JCC_P] =3D TCG_COND_EQ, /* even parity - tests low bit of popcou= nt */ + [JCC_L] =3D TCG_COND_LT, + [JCC_LE] =3D TCG_COND_LE, + }; + + cond =3D cond_table[jcc_op]; + if (decode->b & 1) { + cond =3D tcg_invert_cond(cond); + } + + ot =3D decode->op[0].ot; + ot_full =3D ot | MO_LE; + if (jcc_op >=3D JCC_S) { + /* + * Sign-extend values before subtracting for S, P (zero/sign exten= sion + * does not matter there) L, LE and their inverses. + */ + ot_full |=3D MO_SIGN; + } + + /* + * cmpv will be moved to cc_src *after* cpu_regs[] is written back, so= use + * tcg_gen_ext_tl instead of gen_ext_tl. + */ + tcg_gen_ext_tl(cmpv, cpu_regs[decode->op[1].n], ot_full); + + /* + * Cmpxchg loop starts here. + * - s->T1: addition operand (from decoder) + * - s->A0: dest address (from decoder) + * - s->cc_srcT: memory operand (lhs for comparison) + * - cmpv: rhs for comparison + */ + gen_set_label(label_top); + gen_op_ld_v(s, ot_full, s->cc_srcT, s->A0); + tcg_gen_sub_tl(s->T0, s->cc_srcT, cmpv); + + /* Compute the comparison result by hand, to avoid clobbering cc_*. */ + switch (jcc_op) { + case JCC_O: + /* (src1 ^ src2) & (src1 ^ dst). newv is only used here for a mome= nt */ + tcg_gen_xor_tl(newv, s->cc_srcT, s->T0); + tcg_gen_xor_tl(s->tmp0, s->cc_srcT, cmpv); + tcg_gen_and_tl(s->tmp0, s->tmp0, newv); + tcg_gen_sextract_tl(s->tmp0, s->tmp0, 0, 8 << ot); + cmp_lhs =3D s->tmp0, cmp_rhs =3D tcg_constant_tl(0); + break; + + case JCC_P: + tcg_gen_ext8u_tl(s->tmp0, s->T0); + tcg_gen_ctpop_tl(s->tmp0, s->tmp0); + tcg_gen_andi_tl(s->tmp0, s->tmp0, 1); + cmp_lhs =3D s->tmp0, cmp_rhs =3D tcg_constant_tl(0); + break; + + case JCC_S: + tcg_gen_sextract_tl(s->tmp0, s->T0, 0, 8 << ot); + cmp_lhs =3D s->tmp0, cmp_rhs =3D tcg_constant_tl(0); + break; + + default: + cmp_lhs =3D s->cc_srcT, cmp_rhs =3D cmpv; + break; + } + + /* Compute new value: if condition does not hold, just store back s->c= c_srcT */ + tcg_gen_add_tl(newv, s->cc_srcT, s->T1); + tcg_gen_movcond_tl(cond, newv, cmp_lhs, cmp_rhs, newv, s->cc_srcT); + tcg_gen_atomic_cmpxchg_tl(oldv, s->A0, s->cc_srcT, newv, s->mem_index,= ot_full); + + /* Exit unconditionally if cmpxchg succeeded. */ + tcg_gen_brcond_tl(TCG_COND_EQ, oldv, s->cc_srcT, label_bottom); + + /* Try again if there was actually a store to make. */ + tcg_gen_brcond_tl(cond, cmp_lhs, cmp_rhs, label_top); + gen_set_label(label_bottom); + + /* Store old value to registers only after a successful store. */ + gen_writeback(s, decode, 1, s->cc_srcT); + + decode->cc_dst =3D s->T0; + decode->cc_src =3D cmpv; + decode->cc_op =3D CC_OP_SUBB + ot; +} + static void gen_CRC32(DisasContext *s, CPUX86State *env, X86DecodedInsn *d= ecode) { MemOp ot =3D decode->op[2].ot; diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index fe82d421576..e1eb82a5c68 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -122,6 +122,7 @@ typedef struct DisasContext { int cpuid_ext3_features; int cpuid_7_0_ebx_features; int cpuid_7_0_ecx_features; + int cpuid_7_1_eax_features; int cpuid_xsave_features; =20 /* TCG local temps */ @@ -6963,6 +6964,7 @@ static void i386_tr_init_disas_context(DisasContextBa= se *dcbase, CPUState *cpu) dc->cpuid_ext3_features =3D env->features[FEAT_8000_0001_ECX]; dc->cpuid_7_0_ebx_features =3D env->features[FEAT_7_0_EBX]; dc->cpuid_7_0_ecx_features =3D env->features[FEAT_7_0_ECX]; + dc->cpuid_7_1_eax_features =3D env->features[FEAT_7_1_EAX]; dc->cpuid_xsave_features =3D env->features[FEAT_XSAVE]; dc->jmp_opt =3D !((cflags & CF_NO_GOTO_TB) || (flags & (HF_TF_MASK | HF_INHIBIT_IRQ_MASK))); --=20 2.43.0