From nobody Mon Feb 9 20:36:49 2026
From: Paolo Bonzini
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org, paul@nowt.org
Subject: [PATCH 01/23] i386: do not use MOVL to move data between SSE registers
Date: Sat, 27 Aug 2022 01:11:42 +0200
Message-Id: <20220826231204.201395-2-pbonzini@redhat.com>
X-Mailer: git-send-email 2.37.1
In-Reply-To: <20220826231204.201395-1-pbonzini@redhat.com>
References: <20220826231204.201395-1-pbonzini@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Write down explicitly the load/store sequence.

Extracted from a patch by Paul Brook.

Signed-off-by: Paolo Bonzini
Reviewed-by: Richard Henderson
---
 target/i386/tcg/translate.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index b7972f0ff5..3237c1d8f9 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3295,8 +3295,10 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                                 offsetof(CPUX86State, xmm_regs[reg].ZMM_L(3)));
             } else {
                 rm = (modrm & 7) | REX_B(s);
-                gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0)),
-                            offsetof(CPUX86State,xmm_regs[rm].ZMM_L(0)));
+                tcg_gen_ld_i32(s->tmp2_i32, cpu_env,
+                               offsetof(CPUX86State, xmm_regs[rm].ZMM_L(0)));
+                tcg_gen_st_i32(s->tmp2_i32, cpu_env,
+                               offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0)));
             }
             break;
         case 0x310: /* movsd xmm, ea */
-- 
2.37.1

From nobody Mon Feb 9 20:36:49 2026
From: Paolo Bonzini
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org, paul@nowt.org
Subject: [PATCH 02/23] i386: formatting fixes
Date: Sat, 27 Aug 2022 01:11:43 +0200
Message-Id: <20220826231204.201395-3-pbonzini@redhat.com>
X-Mailer: git-send-email 2.37.1
In-Reply-To: <20220826231204.201395-1-pbonzini@redhat.com>
References: <20220826231204.201395-1-pbonzini@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Extracted from a patch by Paul Brook.

Signed-off-by: Paolo Bonzini
Reviewed-by: Richard Henderson
---
 target/i386/tcg/translate.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 3237c1d8f9..25a2539d59 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3314,7 +3314,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             } else {
                 rm = (modrm & 7) | REX_B(s);
                 gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0)),
-                            offsetof(CPUX86State,xmm_regs[rm].ZMM_Q(0)));
+                            offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(0)));
             }
             break;
         case 0x012: /* movlps */
@@ -4463,7 +4463,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                         /* 32 bit access */
                         gen_op_ld_v(s, MO_32, s->T0, s->A0);
                         tcg_gen_st32_tl(s->T0, cpu_env,
-                                        offsetof(CPUX86State,xmm_t0.ZMM_L(0)));
+                                        offsetof(CPUX86State, xmm_t0.ZMM_L(0)));
                         break;
                     case 3:
                         /* 64 bit access */
@@ -4523,8 +4523,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             break;
         case 0xf7:
             /* maskmov : we must prepare A0 */
-            if (mod != 3)
+            if (mod != 3) {
                 goto illegal_op;
+            }
             tcg_gen_mov_tl(s->A0, cpu_regs[R_EDI]);
             gen_extu(s->aflag, s->A0);
             gen_add_A0_ds_seg(s);
-- 
2.37.1

From nobody Mon Feb 9 20:36:49 2026
From: Paolo Bonzini
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org, paul@nowt.org
Subject: [PATCH 03/23] i386: Add ZMM_OFFSET macro
Date: Sat, 27 Aug 2022 01:11:44 +0200
Message-Id: <20220826231204.201395-4-pbonzini@redhat.com>
X-Mailer: git-send-email 2.37.1
In-Reply-To: <20220826231204.201395-1-pbonzini@redhat.com>
References: <20220826231204.201395-1-pbonzini@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Paul Brook

Add a convenience macro to get the address of an xmm_regs element within
CPUX86State.

This was originally going to be the basis of an implementation that broke
operations into 128 bit chunks. I scrapped that idea, so this is now a purely
cosmetic change. But I think a worthwhile one - it reduces the number of
function calls that need to be split over multiple lines.

No functional changes.

Signed-off-by: Paul Brook
Reviewed-by: Richard Henderson
Message-Id: <20220424220204.2493824-9-paul@nowt.org>
Signed-off-by: Paolo Bonzini
---
 target/i386/tcg/translate.c | 60 +++++++++++++++++--------------------
 1 file changed, 27 insertions(+), 33 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 25a2539d59..cba862746b 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2777,6 +2777,8 @@ static inline void gen_op_movq_env_0(DisasContext *s, int d_offset)
     tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset);
 }
 
+#define ZMM_OFFSET(reg) offsetof(CPUX86State, xmm_regs[reg])
+
 typedef void (*SSEFunc_i_ep)(TCGv_i32 val, TCGv_ptr env, TCGv_ptr reg);
 typedef void (*SSEFunc_l_ep)(TCGv_i64 val, TCGv_ptr env, TCGv_ptr reg);
 typedef void (*SSEFunc_0_epi)(TCGv_ptr env, TCGv_ptr reg, TCGv_i32 val);
@@ -3198,13 +3200,13 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             if (mod == 3)
                 goto illegal_op;
             gen_lea_modrm(env, s, modrm);
-            gen_sto_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
+            gen_sto_env_A0(s, ZMM_OFFSET(reg));
             break;
         case 0x3f0: /* lddqu */
             if (mod == 3)
                 goto illegal_op;
             gen_lea_modrm(env, s, modrm);
-            gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
+            gen_ldo_env_A0(s, ZMM_OFFSET(reg));
             break;
         case 0x22b: /* movntss */
         case 0x32b: /* movntsd */
@@ -3240,15 +3242,13 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
 #ifdef TARGET_X86_64
             if (s->dflag == MO_64) {
                 gen_ldst_modrm(env, s, modrm, MO_64, OR_TMP0, 0);
-                tcg_gen_addi_ptr(s->ptr0, cpu_env,
-                                 offsetof(CPUX86State,xmm_regs[reg]));
+                tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(reg));
                 gen_helper_movq_mm_T0_xmm(s->ptr0, s->T0);
             } else
 #endif
             {
                 gen_ldst_modrm(env, s, modrm, MO_32, OR_TMP0, 0);
-                tcg_gen_addi_ptr(s->ptr0, cpu_env,
-                                 offsetof(CPUX86State,xmm_regs[reg]));
+                tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(reg));
                 tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
                 gen_helper_movl_mm_T0_xmm(s->ptr0, s->tmp2_i32);
             }
@@ -3273,11 +3273,10 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         case 0x26f: /* movdqu xmm, ea */
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
-                gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
+                gen_ldo_env_A0(s, ZMM_OFFSET(reg));
             } else {
                 rm = (modrm & 7) | REX_B(s);
-                gen_op_movo(s, offsetof(CPUX86State, xmm_regs[reg]),
-                            offsetof(CPUX86State,xmm_regs[rm]));
+                gen_op_movo(s, ZMM_OFFSET(reg), ZMM_OFFSET(rm));
             }
             break;
         case 0x210: /* movss xmm, ea */
@@ -3333,7 +3332,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         case 0x212: /* movsldup */
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
-                gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
+                gen_ldo_env_A0(s, ZMM_OFFSET(reg));
             } else {
                 rm = (modrm & 7) | REX_B(s);
                 gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0)),
@@ -3375,7 +3374,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         case 0x216: /* movshdup */
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
-                gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
+                gen_ldo_env_A0(s, ZMM_OFFSET(reg));
             } else {
                 rm = (modrm & 7) | REX_B(s);
                 gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(1)),
@@ -3397,8 +3396,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     goto illegal_op;
                 field_length = x86_ldub_code(env, s) & 0x3F;
                 bit_index = x86_ldub_code(env, s) & 0x3F;
-                tcg_gen_addi_ptr(s->ptr0, cpu_env,
-                    offsetof(CPUX86State,xmm_regs[reg]));
+                tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(reg));
                 if (b1 == 1)
                     gen_helper_extrq_i(cpu_env, s->ptr0,
                                        tcg_const_i32(bit_index),
@@ -3467,11 +3465,10 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         case 0x27f: /* movdqu ea, xmm */
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
-                gen_sto_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
+                gen_sto_env_A0(s, ZMM_OFFSET(reg));
             } else {
                 rm = (modrm & 7) | REX_B(s);
-                gen_op_movo(s, offsetof(CPUX86State, xmm_regs[rm]),
-                            offsetof(CPUX86State,xmm_regs[reg]));
+                gen_op_movo(s, ZMM_OFFSET(rm), ZMM_OFFSET(reg));
             }
             break;
         case 0x211: /* movss ea, xmm */
@@ -3549,7 +3546,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             }
             if (is_xmm) {
                 rm = (modrm & 7) | REX_B(s);
-                op2_offset = offsetof(CPUX86State,xmm_regs[rm]);
+                op2_offset = ZMM_OFFSET(rm);
             } else {
                 rm = (modrm & 7);
                 op2_offset = offsetof(CPUX86State,fpregs[rm].mmx);
@@ -3560,15 +3557,13 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             break;
         case 0x050: /* movmskps */
             rm = (modrm & 7) | REX_B(s);
-            tcg_gen_addi_ptr(s->ptr0, cpu_env,
-                             offsetof(CPUX86State,xmm_regs[rm]));
+            tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(rm));
             gen_helper_movmskps(s->tmp2_i32, cpu_env, s->ptr0);
             tcg_gen_extu_i32_tl(cpu_regs[reg], s->tmp2_i32);
             break;
         case 0x150: /* movmskpd */
             rm = (modrm & 7) | REX_B(s);
-            tcg_gen_addi_ptr(s->ptr0, cpu_env,
-                             offsetof(CPUX86State,xmm_regs[rm]));
+            tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(rm));
             gen_helper_movmskpd(s->tmp2_i32, cpu_env, s->ptr0);
             tcg_gen_extu_i32_tl(cpu_regs[reg], s->tmp2_i32);
             break;
@@ -3583,7 +3578,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 rm = (modrm & 7);
                 op2_offset = offsetof(CPUX86State,fpregs[rm].mmx);
             }
-            op1_offset = offsetof(CPUX86State,xmm_regs[reg]);
+            op1_offset = ZMM_OFFSET(reg);
             tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
             tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
             switch(b >> 8) {
@@ -3600,7 +3595,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         case 0x32a: /* cvtsi2sd */
             ot = mo_64_32(s->dflag);
             gen_ldst_modrm(env, s, modrm, ot, OR_TMP0, 0);
-            op1_offset = offsetof(CPUX86State,xmm_regs[reg]);
+            op1_offset = ZMM_OFFSET(reg);
             tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
             if (ot == MO_32) {
                 SSEFunc_0_epi sse_fn_epi = sse_op_table3ai[(b >> 8) & 1];
@@ -3626,7 +3621,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 gen_ldo_env_A0(s, op2_offset);
             } else {
                 rm = (modrm & 7) | REX_B(s);
-                op2_offset = offsetof(CPUX86State,xmm_regs[rm]);
+                op2_offset = ZMM_OFFSET(rm);
             }
             op1_offset = offsetof(CPUX86State,fpregs[reg & 7].mmx);
             tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
@@ -3663,7 +3658,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 op2_offset = offsetof(CPUX86State,xmm_t0);
             } else {
                 rm = (modrm & 7) | REX_B(s);
-                op2_offset = offsetof(CPUX86State,xmm_regs[rm]);
+                op2_offset = ZMM_OFFSET(rm);
             }
             tcg_gen_addi_ptr(s->ptr0, cpu_env, op2_offset);
             if (ot == MO_32) {
@@ -3749,8 +3744,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 goto illegal_op;
             if (b1) {
                 rm = (modrm & 7) | REX_B(s);
-                tcg_gen_addi_ptr(s->ptr0, cpu_env,
-                                 offsetof(CPUX86State, xmm_regs[rm]));
+                tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(rm));
                 gen_helper_pmovmskb_xmm(s->tmp2_i32, cpu_env, s->ptr0);
             } else {
                 rm = (modrm & 7);
@@ -3782,9 +3776,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 goto illegal_op;
 
             if (b1) {
-                op1_offset = offsetof(CPUX86State,xmm_regs[reg]);
+                op1_offset = ZMM_OFFSET(reg);
                 if (mod == 3) {
-                    op2_offset = offsetof(CPUX86State,xmm_regs[rm | REX_B(s)]);
+                    op2_offset = ZMM_OFFSET(rm | REX_B(s));
                 } else {
                     op2_offset = offsetof(CPUX86State,xmm_t0);
                     gen_lea_modrm(env, s, modrm);
@@ -4347,9 +4341,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             }
 
             if (b1) {
-                op1_offset = offsetof(CPUX86State,xmm_regs[reg]);
+                op1_offset = ZMM_OFFSET(reg);
                 if (mod == 3) {
-                    op2_offset = offsetof(CPUX86State,xmm_regs[rm | REX_B(s)]);
+                    op2_offset = ZMM_OFFSET(rm | REX_B(s));
                 } else {
                     op2_offset = offsetof(CPUX86State,xmm_t0);
                     gen_lea_modrm(env, s, modrm);
@@ -4429,7 +4423,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             break;
         }
         if (is_xmm) {
-            op1_offset = offsetof(CPUX86State,xmm_regs[reg]);
+            op1_offset = ZMM_OFFSET(reg);
             if (mod != 3) {
                 int sz = 4;
 
@@ -4476,7 +4470,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 }
             } else {
                 rm = (modrm & 7) | REX_B(s);
-                op2_offset = offsetof(CPUX86State,xmm_regs[rm]);
+                op2_offset = ZMM_OFFSET(rm);
             }
         } else {
             op1_offset = offsetof(CPUX86State,fpregs[reg].mmx);
-- 
2.37.1

From nobody Mon Feb 9 20:36:49 2026
from localhost ([::1]:51588 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oRiXI-0000qc-9R for importer@patchew.org; Fri, 26 Aug 2022 19:14:48 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:38310) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oRiUu-0006SZ-8n for qemu-devel@nongnu.org; Fri, 26 Aug 2022 19:12:20 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:34270) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oRiUr-0007ov-QF for qemu-devel@nongnu.org; Fri, 26 Aug 2022 19:12:19 -0400 Received: from mail-ed1-f72.google.com (mail-ed1-f72.google.com [209.85.208.72]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-93-tAESeklWPLS67rOFRY2hjQ-1; Fri, 26 Aug 2022 19:12:15 -0400 Received: by mail-ed1-f72.google.com with SMTP id q32-20020a05640224a000b004462f105fa9so1849054eda.4 for ; Fri, 26 Aug 2022 16:12:15 -0700 (PDT) Received: from goa-sendmail ([2001:b07:6468:f312:1c09:f536:3de6:228c]) by smtp.gmail.com with ESMTPSA id 3-20020a170906300300b0073100dfa7b0sm1412429ejz.8.2022.08.26.16.12.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 26 Aug 2022 16:12:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1661555537; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ce+JjCIhuae2cOI1iW4snBFPD80zf3M3cjwIFHb65b0=; b=VKLRRknWnF/nsr0eUpeIS7QMX2ws1xuRxbDCMTG4ribYfU95eGIZ7RpXaIA0FF8eA2Gcf8 UNi6/guDWGlpfb3K79fIImqQxEy1YaoxhK+uml91O+Sj/ZFS7Bu2c4Mr0xJ5391/TuTXVo HaVJJFkPAg3SR7r0Ej+jBexo3MA/KEY= X-MC-Unique: tAESeklWPLS67rOFRY2hjQ-1 X-Google-DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc; bh=ce+JjCIhuae2cOI1iW4snBFPD80zf3M3cjwIFHb65b0=; b=XfpP9S9HPHinp9wV/94EQmCXbMgFrVr9vzWyET/4/RQ4yui3NdRZPXk6lV/aHCRoNg F0jAhbR+iVLygeE0/PeEGXrpU2Z8LH1yFRgWXH9nk3TMVHp4aF3AClDS+D4Q4Mb3RrEd mD/sCurFVMjHjtPhcHRRmbU6fWZY8wogzBVN+qmhUXdV8LgXW68sJfi7WbHBe+682+Lo hEDDUDT3AOiZGBxzHZwp6Nrg65hnSQUDTtUzAysGVTSZEy12p4H6hWYjHcdnGYi1gPUs Vlu8saAvsFphJ2ZKvzAYJlHXybrq8dDcgsfGjnBFAqEvfQeXLFpxtwbhl7CYb0qQfrBt fJGw== X-Gm-Message-State: ACgBeo2LRanqCn0U1yW+WNMeDTv3mDHPhDzaxDEiMLMopGWvE0QcUe76 OklqUNbEYfAah3RSTyFhdFrXFjkzIJfbI3zNHxw2tXpWQeNUzotHin2ceDFSqlRQF2uMFPjR+n3 9nwtYRD6d9cBgm03aL3qaaJvbvLKihZKW5kegI8bJFS9BU4V8BYjvbQgjafmrz0kc1IY= X-Received: by 2002:a05:6402:4407:b0:447:1026:7537 with SMTP id y7-20020a056402440700b0044710267537mr8126216eda.312.1661555534291; Fri, 26 Aug 2022 16:12:14 -0700 (PDT) X-Google-Smtp-Source: AA6agR6bU5pHE3AkcyghBAbC4xRMTTnTgUtoaQk59VjZg+SpXcUNgNorql4TSok5e+tEhPWIel/1MA== X-Received: by 2002:a05:6402:4407:b0:447:1026:7537 with SMTP id y7-20020a056402440700b0044710267537mr8126199eda.312.1661555533890; Fri, 26 Aug 2022 16:12:13 -0700 (PDT) From: Paolo Bonzini To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, paul@nowt.org Subject: [PATCH 04/23] i386: Rework sse_op_table1 Date: Sat, 27 Aug 2022 01:11:45 +0200 Message-Id: <20220826231204.201395-5-pbonzini@redhat.com> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20220826231204.201395-1-pbonzini@redhat.com> References: <20220826231204.201395-1-pbonzini@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=pbonzini@redhat.com; 
helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1661555691595100001 Content-Type: text/plain; charset="utf-8" From: Paul Brook Add a flags field to each row in sse_op_table1. Initially this is only used as a replacement for the magic SSE_SPECIAL and SSE_DUMMY pointers; the other flags are mostly relevant for the AVX implementation but can be applied to SSE as well. Signed-off-by: Paul Brook Message-Id: <20220424220204.2493824-5-paul@nowt.org> Reviewed-by: Richard Henderson Signed-off-by: Paolo Bonzini --- target/i386/tcg/translate.c | 311 +++++++++++++++++++++--------------- 1 file changed, 182 insertions(+), 129 deletions(-) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index cba862746b..7332bbcf44 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -2790,146 +2790,193 @@ typedef void (*SSEFunc_0_ppi)(TCGv_ptr reg_a, TCG= v_ptr reg_b, TCGv_i32 val); typedef void (*SSEFunc_0_eppt)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_= b, TCGv val); =20 -#define SSE_SPECIAL ((void *)1) -#define SSE_DUMMY ((void *)2) +#define SSE_OPF_CMP (1 << 1) /* does not write for first operand */ +#define SSE_OPF_SPECIAL (1 << 3) /* magic */ +#define SSE_OPF_3DNOW (1 << 4) /* 3DNow! 
instruction */ +#define SSE_OPF_MMX (1 << 5) /* MMX/integer/AVX2 instruction */ +#define SSE_OPF_SCALAR (1 << 6) /* Has SSE scalar variants */ +#define SSE_OPF_SHUF (1 << 9) /* pshufx/shufpx */ =20 -#define MMX_OP2(x) { gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm } -#define SSE_FOP(x) { gen_helper_ ## x ## ps, gen_helper_ ## x ## pd, \ - gen_helper_ ## x ## ss, gen_helper_ ## x ## sd, } +#define OP(op, flags, a, b, c, d) \ + {flags, {a, b, c, d} } =20 -static const SSEFunc_0_epp sse_op_table1[256][4] =3D { +#define MMX_OP(x) OP(op1, SSE_OPF_MMX, \ + gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm, NULL, NULL) + +#define SSE_FOP(name) OP(op1, SSE_OPF_SCALAR, \ + gen_helper_##name##ps, gen_helper_##name##pd, \ + gen_helper_##name##ss, gen_helper_##name##sd) +#define SSE_OP(sname, dname, op, flags) OP(op, flags, \ + gen_helper_##sname##_xmm, gen_helper_##dname##_xmm, NULL, NULL) + +struct SSEOpHelper_table1 { + int flags; + SSEFunc_0_epp op[4]; +}; + +#define SSE_3DNOW { SSE_OPF_3DNOW } +#define SSE_SPECIAL { SSE_OPF_SPECIAL } + +static const struct SSEOpHelper_table1 sse_op_table1[256] =3D { /* 3DNow! extensions */ - [0x0e] =3D { SSE_DUMMY }, /* femms */ - [0x0f] =3D { SSE_DUMMY }, /* pf... */ + [0x0e] =3D SSE_SPECIAL, /* femms */ + [0x0f] =3D SSE_3DNOW, /* pf... 
(sse_op_table5) */ /* pure SSE operations */ - [0x10] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* = movups, movupd, movss, movsd */ - [0x11] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* = movups, movupd, movss, movsd */ - [0x12] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* = movlps, movlpd, movsldup, movddup */ - [0x13] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* movlps, movlpd */ - [0x14] =3D { gen_helper_punpckldq_xmm, gen_helper_punpcklqdq_xmm }, - [0x15] =3D { gen_helper_punpckhdq_xmm, gen_helper_punpckhqdq_xmm }, - [0x16] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* movhps, movh= pd, movshdup */ - [0x17] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* movhps, movhpd */ + [0x10] =3D SSE_SPECIAL, /* movups, movupd, movss, movsd */ + [0x11] =3D SSE_SPECIAL, /* movups, movupd, movss, movsd */ + [0x12] =3D SSE_SPECIAL, /* movlps, movlpd, movsldup, movddup */ + [0x13] =3D SSE_SPECIAL, /* movlps, movlpd */ + [0x14] =3D SSE_OP(punpckldq, punpcklqdq, op1, 0), /* unpcklps, unpcklp= d */ + [0x15] =3D SSE_OP(punpckhdq, punpckhqdq, op1, 0), /* unpckhps, unpckhp= d */ + [0x16] =3D SSE_SPECIAL, /* movhps, movhpd, movshdup */ + [0x17] =3D SSE_SPECIAL, /* movhps, movhpd */ =20 - [0x28] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* movaps, movapd */ - [0x29] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* movaps, movapd */ - [0x2a] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* = cvtpi2ps, cvtpi2pd, cvtsi2ss, cvtsi2sd */ - [0x2b] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* = movntps, movntpd, movntss, movntsd */ - [0x2c] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* = cvttps2pi, cvttpd2pi, cvttsd2si, cvttss2si */ - [0x2d] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* = cvtps2pi, cvtpd2pi, cvtsd2si, cvtss2si */ - [0x2e] =3D { gen_helper_ucomiss, gen_helper_ucomisd }, - [0x2f] =3D { gen_helper_comiss, gen_helper_comisd }, - [0x50] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* movmskps, movmskpd */ - 
[0x51] =3D SSE_FOP(sqrt), - [0x52] =3D { gen_helper_rsqrtps, NULL, gen_helper_rsqrtss, NULL }, - [0x53] =3D { gen_helper_rcpps, NULL, gen_helper_rcpss, NULL }, - [0x54] =3D { gen_helper_pand_xmm, gen_helper_pand_xmm }, /* andps, and= pd */ - [0x55] =3D { gen_helper_pandn_xmm, gen_helper_pandn_xmm }, /* andnps, = andnpd */ - [0x56] =3D { gen_helper_por_xmm, gen_helper_por_xmm }, /* orps, orpd */ - [0x57] =3D { gen_helper_pxor_xmm, gen_helper_pxor_xmm }, /* xorps, xor= pd */ + [0x28] =3D SSE_SPECIAL, /* movaps, movapd */ + [0x29] =3D SSE_SPECIAL, /* movaps, movapd */ + [0x2a] =3D SSE_SPECIAL, /* cvtpi2ps, cvtpi2pd, cvtsi2ss, cvtsi2sd */ + [0x2b] =3D SSE_SPECIAL, /* movntps, movntpd, movntss, movntsd */ + [0x2c] =3D SSE_SPECIAL, /* cvttps2pi, cvttpd2pi, cvttsd2si, cvttss2si = */ + [0x2d] =3D SSE_SPECIAL, /* cvtps2pi, cvtpd2pi, cvtsd2si, cvtss2si */ + [0x2e] =3D OP(op1, SSE_OPF_CMP | SSE_OPF_SCALAR, + gen_helper_ucomiss, gen_helper_ucomisd, NULL, NULL), + [0x2f] =3D OP(op1, SSE_OPF_CMP | SSE_OPF_SCALAR, + gen_helper_comiss, gen_helper_comisd, NULL, NULL), + [0x50] =3D SSE_SPECIAL, /* movmskps, movmskpd */ + [0x51] =3D OP(op1, SSE_OPF_SCALAR, + gen_helper_sqrtps, gen_helper_sqrtpd, + gen_helper_sqrtss, gen_helper_sqrtsd), + [0x52] =3D OP(op1, SSE_OPF_SCALAR, + gen_helper_rsqrtps, NULL, gen_helper_rsqrtss, NULL), + [0x53] =3D OP(op1, SSE_OPF_SCALAR, + gen_helper_rcpps, NULL, gen_helper_rcpss, NULL), + [0x54] =3D SSE_OP(pand, pand, op1, 0), /* andps, andpd */ + [0x55] =3D SSE_OP(pandn, pandn, op1, 0), /* andnps, andnpd */ + [0x56] =3D SSE_OP(por, por, op1, 0), /* orps, orpd */ + [0x57] =3D SSE_OP(pxor, pxor, op1, 0), /* xorps, xorpd */ [0x58] =3D SSE_FOP(add), [0x59] =3D SSE_FOP(mul), - [0x5a] =3D { gen_helper_cvtps2pd, gen_helper_cvtpd2ps, - gen_helper_cvtss2sd, gen_helper_cvtsd2ss }, - [0x5b] =3D { gen_helper_cvtdq2ps, gen_helper_cvtps2dq, gen_helper_cvtt= ps2dq }, + [0x5a] =3D OP(op1, SSE_OPF_SCALAR, + gen_helper_cvtps2pd, gen_helper_cvtpd2ps, + gen_helper_cvtss2sd, 
gen_helper_cvtsd2ss), + [0x5b] =3D OP(op1, 0, + gen_helper_cvtdq2ps, gen_helper_cvtps2dq, + gen_helper_cvttps2dq, NULL), [0x5c] =3D SSE_FOP(sub), [0x5d] =3D SSE_FOP(min), [0x5e] =3D SSE_FOP(div), [0x5f] =3D SSE_FOP(max), =20 - [0xc2] =3D SSE_FOP(cmpeq), - [0xc6] =3D { (SSEFunc_0_epp)gen_helper_shufps, - (SSEFunc_0_epp)gen_helper_shufpd }, /* XXX: casts */ + [0xc2] =3D SSE_FOP(cmpeq), /* sse_op_table4 */ + [0xc6] =3D OP(dummy, SSE_OPF_SHUF, (SSEFunc_0_epp)gen_helper_shufps, + (SSEFunc_0_epp)gen_helper_shufpd, NULL, NULL), =20 /* SSSE3, SSE4, MOVBE, CRC32, BMI1, BMI2, ADX. */ - [0x38] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, - [0x3a] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, + [0x38] =3D SSE_SPECIAL, + [0x3a] =3D SSE_SPECIAL, =20 /* MMX ops and their SSE extensions */ - [0x60] =3D MMX_OP2(punpcklbw), - [0x61] =3D MMX_OP2(punpcklwd), - [0x62] =3D MMX_OP2(punpckldq), - [0x63] =3D MMX_OP2(packsswb), - [0x64] =3D MMX_OP2(pcmpgtb), - [0x65] =3D MMX_OP2(pcmpgtw), - [0x66] =3D MMX_OP2(pcmpgtl), - [0x67] =3D MMX_OP2(packuswb), - [0x68] =3D MMX_OP2(punpckhbw), - [0x69] =3D MMX_OP2(punpckhwd), - [0x6a] =3D MMX_OP2(punpckhdq), - [0x6b] =3D MMX_OP2(packssdw), - [0x6c] =3D { NULL, gen_helper_punpcklqdq_xmm }, - [0x6d] =3D { NULL, gen_helper_punpckhqdq_xmm }, - [0x6e] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* movd mm, ea */ - [0x6f] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* movq, movdqa,= , movqdu */ - [0x70] =3D { (SSEFunc_0_epp)gen_helper_pshufw_mmx, - (SSEFunc_0_epp)gen_helper_pshufd_xmm, - (SSEFunc_0_epp)gen_helper_pshufhw_xmm, - (SSEFunc_0_epp)gen_helper_pshuflw_xmm }, /* XXX: casts */ - [0x71] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* shiftw */ - [0x72] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* shiftd */ - [0x73] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* shiftq */ - [0x74] =3D MMX_OP2(pcmpeqb), - [0x75] =3D MMX_OP2(pcmpeqw), - [0x76] =3D MMX_OP2(pcmpeql), - [0x77] =3D { SSE_DUMMY }, /* emms */ - [0x78] =3D { NULL, SSE_SPECIAL, NULL, SSE_SPECIAL }, /* 
extrq_i, inser= tq_i */ - [0x79] =3D { NULL, gen_helper_extrq_r, NULL, gen_helper_insertq_r }, - [0x7c] =3D { NULL, gen_helper_haddpd, NULL, gen_helper_haddps }, - [0x7d] =3D { NULL, gen_helper_hsubpd, NULL, gen_helper_hsubps }, - [0x7e] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* movd, movd, ,= movq */ - [0x7f] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* movq, movdqa,= movdqu */ - [0xc4] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* pinsrw */ - [0xc5] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* pextrw */ - [0xd0] =3D { NULL, gen_helper_addsubpd, NULL, gen_helper_addsubps }, - [0xd1] =3D MMX_OP2(psrlw), - [0xd2] =3D MMX_OP2(psrld), - [0xd3] =3D MMX_OP2(psrlq), - [0xd4] =3D MMX_OP2(paddq), - [0xd5] =3D MMX_OP2(pmullw), - [0xd6] =3D { NULL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, - [0xd7] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* pmovmskb */ - [0xd8] =3D MMX_OP2(psubusb), - [0xd9] =3D MMX_OP2(psubusw), - [0xda] =3D MMX_OP2(pminub), - [0xdb] =3D MMX_OP2(pand), - [0xdc] =3D MMX_OP2(paddusb), - [0xdd] =3D MMX_OP2(paddusw), - [0xde] =3D MMX_OP2(pmaxub), - [0xdf] =3D MMX_OP2(pandn), - [0xe0] =3D MMX_OP2(pavgb), - [0xe1] =3D MMX_OP2(psraw), - [0xe2] =3D MMX_OP2(psrad), - [0xe3] =3D MMX_OP2(pavgw), - [0xe4] =3D MMX_OP2(pmulhuw), - [0xe5] =3D MMX_OP2(pmulhw), - [0xe6] =3D { NULL, gen_helper_cvttpd2dq, gen_helper_cvtdq2pd, gen_help= er_cvtpd2dq }, - [0xe7] =3D { SSE_SPECIAL , SSE_SPECIAL }, /* movntq, movntq */ - [0xe8] =3D MMX_OP2(psubsb), - [0xe9] =3D MMX_OP2(psubsw), - [0xea] =3D MMX_OP2(pminsw), - [0xeb] =3D MMX_OP2(por), - [0xec] =3D MMX_OP2(paddsb), - [0xed] =3D MMX_OP2(paddsw), - [0xee] =3D MMX_OP2(pmaxsw), - [0xef] =3D MMX_OP2(pxor), - [0xf0] =3D { NULL, NULL, NULL, SSE_SPECIAL }, /* lddqu */ - [0xf1] =3D MMX_OP2(psllw), - [0xf2] =3D MMX_OP2(pslld), - [0xf3] =3D MMX_OP2(psllq), - [0xf4] =3D MMX_OP2(pmuludq), - [0xf5] =3D MMX_OP2(pmaddwd), - [0xf6] =3D MMX_OP2(psadbw), - [0xf7] =3D { (SSEFunc_0_epp)gen_helper_maskmov_mmx, - (SSEFunc_0_epp)gen_helper_maskmov_xmm }, /* XXX: 
casts */ - [0xf8] =3D MMX_OP2(psubb), - [0xf9] =3D MMX_OP2(psubw), - [0xfa] =3D MMX_OP2(psubl), - [0xfb] =3D MMX_OP2(psubq), - [0xfc] =3D MMX_OP2(paddb), - [0xfd] =3D MMX_OP2(paddw), - [0xfe] =3D MMX_OP2(paddl), + [0x60] =3D MMX_OP(punpcklbw), + [0x61] =3D MMX_OP(punpcklwd), + [0x62] =3D MMX_OP(punpckldq), + [0x63] =3D MMX_OP(packsswb), + [0x64] =3D MMX_OP(pcmpgtb), + [0x65] =3D MMX_OP(pcmpgtw), + [0x66] =3D MMX_OP(pcmpgtl), + [0x67] =3D MMX_OP(packuswb), + [0x68] =3D MMX_OP(punpckhbw), + [0x69] =3D MMX_OP(punpckhwd), + [0x6a] =3D MMX_OP(punpckhdq), + [0x6b] =3D MMX_OP(packssdw), + [0x6c] =3D OP(op1, SSE_OPF_MMX, + NULL, gen_helper_punpcklqdq_xmm, NULL, NULL), + [0x6d] =3D OP(op1, SSE_OPF_MMX, + NULL, gen_helper_punpckhqdq_xmm, NULL, NULL), + [0x6e] =3D SSE_SPECIAL, /* movd mm, ea */ + [0x6f] =3D SSE_SPECIAL, /* movq, movdqa, , movqdu */ + [0x70] =3D OP(op1i, SSE_OPF_SHUF | SSE_OPF_MMX, + (SSEFunc_0_epp)gen_helper_pshufw_mmx, + (SSEFunc_0_epp)gen_helper_pshufd_xmm, + (SSEFunc_0_epp)gen_helper_pshufhw_xmm, + (SSEFunc_0_epp)gen_helper_pshuflw_xmm), + [0x71] =3D SSE_SPECIAL, /* shiftw */ + [0x72] =3D SSE_SPECIAL, /* shiftd */ + [0x73] =3D SSE_SPECIAL, /* shiftq */ + [0x74] =3D MMX_OP(pcmpeqb), + [0x75] =3D MMX_OP(pcmpeqw), + [0x76] =3D MMX_OP(pcmpeql), + [0x77] =3D SSE_SPECIAL, /* emms */ + [0x78] =3D SSE_SPECIAL, /* extrq_i, insertq_i (sse4a) */ + [0x79] =3D OP(op1, 0, + NULL, gen_helper_extrq_r, NULL, gen_helper_insertq_r), + [0x7c] =3D OP(op1, 0, + NULL, gen_helper_haddpd, NULL, gen_helper_haddps), + [0x7d] =3D OP(op1, 0, + NULL, gen_helper_hsubpd, NULL, gen_helper_hsubps), + [0x7e] =3D SSE_SPECIAL, /* movd, movd, , movq */ + [0x7f] =3D SSE_SPECIAL, /* movq, movdqa, movdqu */ + [0xc4] =3D SSE_SPECIAL, /* pinsrw */ + [0xc5] =3D SSE_SPECIAL, /* pextrw */ + [0xd0] =3D OP(op1, 0, + NULL, gen_helper_addsubpd, NULL, gen_helper_addsubps), + [0xd1] =3D MMX_OP(psrlw), + [0xd2] =3D MMX_OP(psrld), + [0xd3] =3D MMX_OP(psrlq), + [0xd4] =3D MMX_OP(paddq), + [0xd5] =3D 
MMX_OP(pmullw), + [0xd6] =3D SSE_SPECIAL, + [0xd7] =3D SSE_SPECIAL, /* pmovmskb */ + [0xd8] =3D MMX_OP(psubusb), + [0xd9] =3D MMX_OP(psubusw), + [0xda] =3D MMX_OP(pminub), + [0xdb] =3D MMX_OP(pand), + [0xdc] =3D MMX_OP(paddusb), + [0xdd] =3D MMX_OP(paddusw), + [0xde] =3D MMX_OP(pmaxub), + [0xdf] =3D MMX_OP(pandn), + [0xe0] =3D MMX_OP(pavgb), + [0xe1] =3D MMX_OP(psraw), + [0xe2] =3D MMX_OP(psrad), + [0xe3] =3D MMX_OP(pavgw), + [0xe4] =3D MMX_OP(pmulhuw), + [0xe5] =3D MMX_OP(pmulhw), + [0xe6] =3D OP(op1, 0, + NULL, gen_helper_cvttpd2dq, + gen_helper_cvtdq2pd, gen_helper_cvtpd2dq), + [0xe7] =3D SSE_SPECIAL, /* movntq, movntq */ + [0xe8] =3D MMX_OP(psubsb), + [0xe9] =3D MMX_OP(psubsw), + [0xea] =3D MMX_OP(pminsw), + [0xeb] =3D MMX_OP(por), + [0xec] =3D MMX_OP(paddsb), + [0xed] =3D MMX_OP(paddsw), + [0xee] =3D MMX_OP(pmaxsw), + [0xef] =3D MMX_OP(pxor), + [0xf0] =3D SSE_SPECIAL, /* lddqu */ + [0xf1] =3D MMX_OP(psllw), + [0xf2] =3D MMX_OP(pslld), + [0xf3] =3D MMX_OP(psllq), + [0xf4] =3D MMX_OP(pmuludq), + [0xf5] =3D MMX_OP(pmaddwd), + [0xf6] =3D MMX_OP(psadbw), + [0xf7] =3D OP(op1t, SSE_OPF_MMX, + (SSEFunc_0_epp)gen_helper_maskmov_mmx, + (SSEFunc_0_epp)gen_helper_maskmov_xmm, NULL, NULL), + [0xf8] =3D MMX_OP(psubb), + [0xf9] =3D MMX_OP(psubw), + [0xfa] =3D MMX_OP(psubl), + [0xfb] =3D MMX_OP(psubq), + [0xfc] =3D MMX_OP(paddb), + [0xfd] =3D MMX_OP(paddw), + [0xfe] =3D MMX_OP(paddl), }; +#undef MMX_OP +#undef OP +#undef SSE_FOP +#undef SSE_OP +#undef SSE_SPECIAL + +#define MMX_OP2(x) { gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm } +#define SSE_SPECIAL_FN ((void *)1) =20 static const SSEFunc_0_epp sse_op_table2[3 * 8][2] =3D { [0 + 2] =3D MMX_OP2(psrlw), @@ -2972,6 +3019,8 @@ static const SSEFunc_l_ep sse_op_table3bq[] =3D { }; #endif =20 +#define SSE_FOP(x) { gen_helper_ ## x ## ps, gen_helper_ ## x ## pd, \ + gen_helper_ ## x ## ss, gen_helper_ ## x ## sd, } static const SSEFunc_0_epp sse_op_table4[8][4] =3D { SSE_FOP(cmpeq), SSE_FOP(cmplt), @@ -2982,6 +3031,7 @@ 
static const SSEFunc_0_epp sse_op_table4[8][4] =3D { SSE_FOP(cmpnle), SSE_FOP(cmpord), }; +#undef SSE_FOP =20 static const SSEFunc_0_epp sse_op_table5[256] =3D { [0x0c] =3D gen_helper_pi2fw, @@ -3023,7 +3073,7 @@ struct SSEOpHelper_eppi { #define SSSE3_OP(x) { MMX_OP2(x), CPUID_EXT_SSSE3 } #define SSE41_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, CPUID_EXT_SSE41 } #define SSE42_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, CPUID_EXT_SSE42 } -#define SSE41_SPECIAL { { NULL, SSE_SPECIAL }, CPUID_EXT_SSE41 } +#define SSE41_SPECIAL { { NULL, SSE_SPECIAL_FN }, CPUID_EXT_SSE41 } #define PCLMULQDQ_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, \ CPUID_EXT_PCLMULQDQ } #define AESNI_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, CPUID_EXT_AES } @@ -3114,6 +3164,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, { int b1, op1_offset, op2_offset, is_xmm, val; int modrm, mod, rm, reg; + int sse_op_flags; SSEFunc_0_epp sse_fn_epp; SSEFunc_0_eppi sse_fn_eppi; SSEFunc_0_ppi sse_fn_ppi; @@ -3129,8 +3180,10 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, b1 =3D 3; else b1 =3D 0; - sse_fn_epp =3D sse_op_table1[b][b1]; - if (!sse_fn_epp) { + sse_op_flags =3D sse_op_table1[b].flags; + sse_fn_epp =3D sse_op_table1[b].op[b1]; + if ((sse_op_flags & (SSE_OPF_SPECIAL | SSE_OPF_3DNOW)) =3D=3D 0 + && !sse_fn_epp) { goto unknown_op; } if ((b <=3D 0x5f && b >=3D 0x10) || b =3D=3D 0xc6 || b =3D=3D 0xc2) { @@ -3184,7 +3237,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, reg |=3D REX_R(s); } mod =3D (modrm >> 6) & 3; - if (sse_fn_epp =3D=3D SSE_SPECIAL) { + if (sse_op_flags & SSE_OPF_SPECIAL) { b |=3D (b1 << 8); switch(b) { case 0x0e7: /* movntq */ @@ -3819,7 +3872,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, gen_ldq_env_A0(s, op2_offset); } } - if (sse_fn_epp =3D=3D SSE_SPECIAL) { + if (sse_fn_epp =3D=3D SSE_SPECIAL_FN) { goto unknown_op; } =20 @@ -4205,7 +4258,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, =20 s->rip_offset =3D 1; =20 - if (sse_fn_eppi =3D=3D SSE_SPECIAL) { + if (sse_fn_eppi =3D=3D SSE_SPECIAL_FN) { ot =3D mo_64_32(s->dflag); rm =3D (modrm & 7) | REX_B(s); if (mod !=3D 3) --=20 2.37.1 From nobody Mon Feb 9 20:36:49 2026 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of 
gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1661556030; cv=none; d=zohomail.com; s=zohoarc; b=ODk97EpABYt1is8eByCtApa2hTFF1K+xr72rmcYEOWd7K3OnARSaAUHk2KQ1PGTVJB5ltL1NzF22Y65OlbHrW27g7yifEjdAZ9CajzhLqSqO7nEE7ZdCccoi3R3FX4SNSHrWhGUeA8sfAgXMQ8+v8INEZk4z+qPrPJjqXLrH/pE= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1661556030; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=6TOl7y6czP5SaerSawoN35k7HXgcRExU/bqStI0DmHQ=; b=MnGv/JtdYcAnzmOBMY06g0Gxg4u6DXHZmCb/Hr4Sjg949NwyYTP/cYIdC679c4L3gyGEL3VP5RflEI//Kw1uhTMlZVN4a8zSE9bcDRE8LTfphybRraa2pW+YSQfMPq50bTcTrQKdyHgmJIHVM9O8UI7Yka8sE7BAnD4WzKWsdl4= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1661556030659999.905066893379; Fri, 26 Aug 2022 16:20:30 -0700 (PDT) Received: from localhost ([::1]:43868 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oRicn-0006Fg-6w for importer@patchew.org; Fri, 26 Aug 2022 19:20:29 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:38312) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oRiUv-0006Vx-M6 for qemu-devel@nongnu.org; Fri, 26 Aug 2022 19:12:21 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:21255) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 
1oRiUt-0007p1-Dz for qemu-devel@nongnu.org; Fri, 26 Aug 2022 19:12:21 -0400 Received: from mail-ed1-f69.google.com (mail-ed1-f69.google.com [209.85.208.69]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-452-57i1xGR3PD6qz9ncB6MK6A-1; Fri, 26 Aug 2022 19:12:17 -0400 Received: by mail-ed1-f69.google.com with SMTP id m16-20020a056402431000b0044662a0ba2cso1842592edc.13 for ; Fri, 26 Aug 2022 16:12:17 -0700 (PDT) Received: from goa-sendmail ([93.56.160.208]) by smtp.gmail.com with ESMTPSA id lv4-20020a170906bc8400b007262a5e2204sm1398187ejb.153.2022.08.26.16.12.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 26 Aug 2022 16:12:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1661555538; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=6TOl7y6czP5SaerSawoN35k7HXgcRExU/bqStI0DmHQ=; b=hnpdc2jihi20fihILYk10v0l9OjOOh6hVubc41YN5ufeaoS+6gVcSzqbK+FAdGm49EIAuu hf00c0P5s0scHy444dYSZRFNY2+2PWT14JBIs8o6a2ju+lADGMfJEfH2MmAKP1O/kON2Ge IgcQfUBmvTnUXiLtdvEWY4y3izW0DIE= X-MC-Unique: 57i1xGR3PD6qz9ncB6MK6A-1 X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc; bh=6TOl7y6czP5SaerSawoN35k7HXgcRExU/bqStI0DmHQ=; b=Bc63xY+uxvB9dFq6Q4dVUNN0LaSdKief74qTEDiIUc+znEQP4frJkLUPHVwmMGcvZh o5FNKd8EvhzsaYlGme7pp6l1R2o0J4Th6tslylf4lAn8aaFa0dTtHXsv0/v1dC+JX4SN Sk6YxUpzLxhZIAoUkkRTu90VsxwlLfzWcsNZu4nZprKSJZCjbcOk0ffWmQ/BUOYB1+Ca jvG099LqL30XH+6I6cn9S6tnY9LrU8NDOdGNcSU8oGLN6wYvkpFA7WwTSV5kFMLuIBwX MRJyC2AbR67j6CBN7WP3vZRWhlQK8On3pLPqOA5WawPrMD1DITsPmJbWzjmtlCxqCsx0 QqCQ== X-Gm-Message-State: 
ACgBeo298MIytr3t5UTCRBKr77QmDA8lpdIMCDypqiBG+TNGzRQxZFxP PTIpj/L1Z+/sy+W4FDNN9WrKQe4lgQPctOPJtdo6APl44BFiza5W3sTPpT8GHJYD9/9+e81T2Tf +m9NCzOHLWIbUQbwPs96iG8HK18yHz4+aqvNBpWchn/DiAopD1+C6OreX/hVS6yx/wfA= X-Received: by 2002:a17:907:929:b0:731:3bb6:d454 with SMTP id au9-20020a170907092900b007313bb6d454mr6647648ejc.96.1661555536102; Fri, 26 Aug 2022 16:12:16 -0700 (PDT) X-Google-Smtp-Source: AA6agR5e0N6aYejdMWFFXtrvNgKmcIeUvHDumnBWolH2faYwhkReny0FviQ6ilpsRgBIx907wua3HA== X-Received: by 2002:a17:907:929:b0:731:3bb6:d454 with SMTP id au9-20020a170907092900b007313bb6d454mr6647636ejc.96.1661555535694; Fri, 26 Aug 2022 16:12:15 -0700 (PDT) From: Paolo Bonzini To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, paul@nowt.org Subject: [PATCH 05/23] i386: Rework sse_op_table6/7 Date: Sat, 27 Aug 2022 01:11:46 +0200 Message-Id: <20220826231204.201395-6-pbonzini@redhat.com> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20220826231204.201395-1-pbonzini@redhat.com> References: <20220826231204.201395-1-pbonzini@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=pbonzini@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org 
Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1661556033019100002 Content-Type: text/plain; charset="utf-8" From: Paul Brook Add a flags field to each row in sse_op_table6 and sse_op_table7. Initially this is only used as a replacement for the magic SSE41_SPECIAL pointer. The other flags are mostly relevant for the AVX implementation but can be applied to SSE as well. Signed-off-by: Paul Brook Message-Id: <20220424220204.2493824-6-paul@nowt.org> Reviewed-by: Richard Henderson Signed-off-by: Paolo Bonzini --- target/i386/tcg/translate.c | 230 ++++++++++++++++++++---------------- 1 file changed, 131 insertions(+), 99 deletions(-) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index 7332bbcf44..b7321b7588 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -2976,7 +2976,6 @@ static const struct SSEOpHelper_table1 sse_op_table1[= 256] =3D { #undef SSE_SPECIAL =20 #define MMX_OP2(x) { gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm } -#define SSE_SPECIAL_FN ((void *)1) =20 static const SSEFunc_0_epp sse_op_table2[3 * 8][2] =3D { [0 + 2] =3D MMX_OP2(psrlw), @@ -3060,113 +3059,134 @@ static const SSEFunc_0_epp sse_op_table5[256] =3D= { [0xbf] =3D gen_helper_pavgb_mmx /* pavgusb */ }; =20 -struct SSEOpHelper_epp { +struct SSEOpHelper_table6 { SSEFunc_0_epp op[2]; uint32_t ext_mask; + int flags; }; =20 -struct SSEOpHelper_eppi { +struct SSEOpHelper_table7 { SSEFunc_0_eppi op[2]; uint32_t ext_mask; + int flags; }; =20 -#define SSSE3_OP(x) { MMX_OP2(x), CPUID_EXT_SSSE3 } -#define SSE41_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, CPUID_EXT_SSE41 } -#define SSE42_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, CPUID_EXT_SSE42 } -#define SSE41_SPECIAL { { NULL, SSE_SPECIAL_FN }, CPUID_EXT_SSE41 } -#define PCLMULQDQ_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, \ - CPUID_EXT_PCLMULQDQ } -#define AESNI_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, CPUID_EXT_AES } +#define gen_helper_special_xmm NULL =20 
-static const struct SSEOpHelper_epp sse_op_table6[256] =3D { - [0x00] =3D SSSE3_OP(pshufb), - [0x01] =3D SSSE3_OP(phaddw), - [0x02] =3D SSSE3_OP(phaddd), - [0x03] =3D SSSE3_OP(phaddsw), - [0x04] =3D SSSE3_OP(pmaddubsw), - [0x05] =3D SSSE3_OP(phsubw), - [0x06] =3D SSSE3_OP(phsubd), - [0x07] =3D SSSE3_OP(phsubsw), - [0x08] =3D SSSE3_OP(psignb), - [0x09] =3D SSSE3_OP(psignw), - [0x0a] =3D SSSE3_OP(psignd), - [0x0b] =3D SSSE3_OP(pmulhrsw), - [0x10] =3D SSE41_OP(pblendvb), - [0x14] =3D SSE41_OP(blendvps), - [0x15] =3D SSE41_OP(blendvpd), - [0x17] =3D SSE41_OP(ptest), - [0x1c] =3D SSSE3_OP(pabsb), - [0x1d] =3D SSSE3_OP(pabsw), - [0x1e] =3D SSSE3_OP(pabsd), - [0x20] =3D SSE41_OP(pmovsxbw), - [0x21] =3D SSE41_OP(pmovsxbd), - [0x22] =3D SSE41_OP(pmovsxbq), - [0x23] =3D SSE41_OP(pmovsxwd), - [0x24] =3D SSE41_OP(pmovsxwq), - [0x25] =3D SSE41_OP(pmovsxdq), - [0x28] =3D SSE41_OP(pmuldq), - [0x29] =3D SSE41_OP(pcmpeqq), - [0x2a] =3D SSE41_SPECIAL, /* movntqda */ - [0x2b] =3D SSE41_OP(packusdw), - [0x30] =3D SSE41_OP(pmovzxbw), - [0x31] =3D SSE41_OP(pmovzxbd), - [0x32] =3D SSE41_OP(pmovzxbq), - [0x33] =3D SSE41_OP(pmovzxwd), - [0x34] =3D SSE41_OP(pmovzxwq), - [0x35] =3D SSE41_OP(pmovzxdq), - [0x37] =3D SSE42_OP(pcmpgtq), - [0x38] =3D SSE41_OP(pminsb), - [0x39] =3D SSE41_OP(pminsd), - [0x3a] =3D SSE41_OP(pminuw), - [0x3b] =3D SSE41_OP(pminud), - [0x3c] =3D SSE41_OP(pmaxsb), - [0x3d] =3D SSE41_OP(pmaxsd), - [0x3e] =3D SSE41_OP(pmaxuw), - [0x3f] =3D SSE41_OP(pmaxud), - [0x40] =3D SSE41_OP(pmulld), - [0x41] =3D SSE41_OP(phminposuw), - [0xdb] =3D AESNI_OP(aesimc), - [0xdc] =3D AESNI_OP(aesenc), - [0xdd] =3D AESNI_OP(aesenclast), - [0xde] =3D AESNI_OP(aesdec), - [0xdf] =3D AESNI_OP(aesdeclast), +#define OP(name, op, flags, ext, mmx_name) \ + {{mmx_name, gen_helper_ ## name ## _xmm}, CPUID_EXT_ ## ext, flags} +#define BINARY_OP_MMX(name, ext) \ + OP(name, op1, SSE_OPF_MMX, ext, gen_helper_ ## name ## _mmx) +#define BINARY_OP(name, ext, flags) \ + OP(name, op1, flags, ext, NULL) 
+#define UNARY_OP_MMX(name, ext) \ + OP(name, op1, SSE_OPF_MMX, ext, gen_helper_ ## name ## _mmx) +#define UNARY_OP(name, ext, flags) \ + OP(name, op1, flags, ext, NULL) +#define BLENDV_OP(name, ext, flags) OP(name, op1, 0, ext, NULL) +#define CMP_OP(name, ext) OP(name, op1, SSE_OPF_CMP, ext, NULL) +#define SPECIAL_OP(ext) OP(special, op1, SSE_OPF_SPECIAL, ext, NULL) + +/* prefix [66] 0f 38 */ +static const struct SSEOpHelper_table6 sse_op_table6[256] =3D { + [0x00] =3D BINARY_OP_MMX(pshufb, SSSE3), + [0x01] =3D BINARY_OP_MMX(phaddw, SSSE3), + [0x02] =3D BINARY_OP_MMX(phaddd, SSSE3), + [0x03] =3D BINARY_OP_MMX(phaddsw, SSSE3), + [0x04] =3D BINARY_OP_MMX(pmaddubsw, SSSE3), + [0x05] =3D BINARY_OP_MMX(phsubw, SSSE3), + [0x06] =3D BINARY_OP_MMX(phsubd, SSSE3), + [0x07] =3D BINARY_OP_MMX(phsubsw, SSSE3), + [0x08] =3D BINARY_OP_MMX(psignb, SSSE3), + [0x09] =3D BINARY_OP_MMX(psignw, SSSE3), + [0x0a] =3D BINARY_OP_MMX(psignd, SSSE3), + [0x0b] =3D BINARY_OP_MMX(pmulhrsw, SSSE3), + [0x10] =3D BLENDV_OP(pblendvb, SSE41, SSE_OPF_MMX), + [0x14] =3D BLENDV_OP(blendvps, SSE41, 0), + [0x15] =3D BLENDV_OP(blendvpd, SSE41, 0), + [0x17] =3D CMP_OP(ptest, SSE41), + [0x1c] =3D UNARY_OP_MMX(pabsb, SSSE3), + [0x1d] =3D UNARY_OP_MMX(pabsw, SSSE3), + [0x1e] =3D UNARY_OP_MMX(pabsd, SSSE3), + [0x20] =3D UNARY_OP(pmovsxbw, SSE41, SSE_OPF_MMX), + [0x21] =3D UNARY_OP(pmovsxbd, SSE41, SSE_OPF_MMX), + [0x22] =3D UNARY_OP(pmovsxbq, SSE41, SSE_OPF_MMX), + [0x23] =3D UNARY_OP(pmovsxwd, SSE41, SSE_OPF_MMX), + [0x24] =3D UNARY_OP(pmovsxwq, SSE41, SSE_OPF_MMX), + [0x25] =3D UNARY_OP(pmovsxdq, SSE41, SSE_OPF_MMX), + [0x28] =3D BINARY_OP(pmuldq, SSE41, SSE_OPF_MMX), + [0x29] =3D BINARY_OP(pcmpeqq, SSE41, SSE_OPF_MMX), + [0x2a] =3D SPECIAL_OP(SSE41), /* movntqda */ + [0x2b] =3D BINARY_OP(packusdw, SSE41, SSE_OPF_MMX), + [0x30] =3D UNARY_OP(pmovzxbw, SSE41, SSE_OPF_MMX), + [0x31] =3D UNARY_OP(pmovzxbd, SSE41, SSE_OPF_MMX), + [0x32] =3D UNARY_OP(pmovzxbq, SSE41, SSE_OPF_MMX), + [0x33] =3D UNARY_OP(pmovzxwd, 
SSE41, SSE_OPF_MMX), + [0x34] =3D UNARY_OP(pmovzxwq, SSE41, SSE_OPF_MMX), + [0x35] =3D UNARY_OP(pmovzxdq, SSE41, SSE_OPF_MMX), + [0x37] =3D BINARY_OP(pcmpgtq, SSE41, SSE_OPF_MMX), + [0x38] =3D BINARY_OP(pminsb, SSE41, SSE_OPF_MMX), + [0x39] =3D BINARY_OP(pminsd, SSE41, SSE_OPF_MMX), + [0x3a] =3D BINARY_OP(pminuw, SSE41, SSE_OPF_MMX), + [0x3b] =3D BINARY_OP(pminud, SSE41, SSE_OPF_MMX), + [0x3c] =3D BINARY_OP(pmaxsb, SSE41, SSE_OPF_MMX), + [0x3d] =3D BINARY_OP(pmaxsd, SSE41, SSE_OPF_MMX), + [0x3e] =3D BINARY_OP(pmaxuw, SSE41, SSE_OPF_MMX), + [0x3f] =3D BINARY_OP(pmaxud, SSE41, SSE_OPF_MMX), + [0x40] =3D BINARY_OP(pmulld, SSE41, SSE_OPF_MMX), + [0x41] =3D UNARY_OP(phminposuw, SSE41, 0), + [0xdb] =3D UNARY_OP(aesimc, AES, 0), + [0xdc] =3D BINARY_OP(aesenc, AES, 0), + [0xdd] =3D BINARY_OP(aesenclast, AES, 0), + [0xde] =3D BINARY_OP(aesdec, AES, 0), + [0xdf] =3D BINARY_OP(aesdeclast, AES, 0), }; =20 -static const struct SSEOpHelper_eppi sse_op_table7[256] =3D { - [0x08] =3D SSE41_OP(roundps), - [0x09] =3D SSE41_OP(roundpd), - [0x0a] =3D SSE41_OP(roundss), - [0x0b] =3D SSE41_OP(roundsd), - [0x0c] =3D SSE41_OP(blendps), - [0x0d] =3D SSE41_OP(blendpd), - [0x0e] =3D SSE41_OP(pblendw), - [0x0f] =3D SSSE3_OP(palignr), - [0x14] =3D SSE41_SPECIAL, /* pextrb */ - [0x15] =3D SSE41_SPECIAL, /* pextrw */ - [0x16] =3D SSE41_SPECIAL, /* pextrd/pextrq */ - [0x17] =3D SSE41_SPECIAL, /* extractps */ - [0x20] =3D SSE41_SPECIAL, /* pinsrb */ - [0x21] =3D SSE41_SPECIAL, /* insertps */ - [0x22] =3D SSE41_SPECIAL, /* pinsrd/pinsrq */ - [0x40] =3D SSE41_OP(dpps), - [0x41] =3D SSE41_OP(dppd), - [0x42] =3D SSE41_OP(mpsadbw), - [0x44] =3D PCLMULQDQ_OP(pclmulqdq), - [0x60] =3D SSE42_OP(pcmpestrm), - [0x61] =3D SSE42_OP(pcmpestri), - [0x62] =3D SSE42_OP(pcmpistrm), - [0x63] =3D SSE42_OP(pcmpistri), - [0xdf] =3D AESNI_OP(aeskeygenassist), +/* prefix [66] 0f 3a */ +static const struct SSEOpHelper_table7 sse_op_table7[256] =3D { + [0x08] =3D UNARY_OP(roundps, SSE41, 0), + [0x09] =3D UNARY_OP(roundpd, 
SSE41, 0), + [0x0a] =3D UNARY_OP(roundss, SSE41, SSE_OPF_SCALAR), + [0x0b] =3D UNARY_OP(roundsd, SSE41, SSE_OPF_SCALAR), + [0x0c] =3D BINARY_OP(blendps, SSE41, 0), + [0x0d] =3D BINARY_OP(blendpd, SSE41, 0), + [0x0e] =3D BINARY_OP(pblendw, SSE41, SSE_OPF_MMX), + [0x0f] =3D BINARY_OP_MMX(palignr, SSSE3), + [0x14] =3D SPECIAL_OP(SSE41), /* pextrb */ + [0x15] =3D SPECIAL_OP(SSE41), /* pextrw */ + [0x16] =3D SPECIAL_OP(SSE41), /* pextrd/pextrq */ + [0x17] =3D SPECIAL_OP(SSE41), /* extractps */ + [0x20] =3D SPECIAL_OP(SSE41), /* pinsrb */ + [0x21] =3D SPECIAL_OP(SSE41), /* insertps */ + [0x22] =3D SPECIAL_OP(SSE41), /* pinsrd/pinsrq */ + [0x40] =3D BINARY_OP(dpps, SSE41, 0), + [0x41] =3D BINARY_OP(dppd, SSE41, 0), + [0x42] =3D BINARY_OP(mpsadbw, SSE41, SSE_OPF_MMX), + [0x44] =3D BINARY_OP(pclmulqdq, PCLMULQDQ, 0), + [0x60] =3D CMP_OP(pcmpestrm, SSE42), + [0x61] =3D CMP_OP(pcmpestri, SSE42), + [0x62] =3D CMP_OP(pcmpistrm, SSE42), + [0x63] =3D CMP_OP(pcmpistri, SSE42), + [0xdf] =3D UNARY_OP(aeskeygenassist, AES, 0), }; =20 +#undef OP +#undef BINARY_OP_MMX +#undef BINARY_OP +#undef UNARY_OP_MMX +#undef UNARY_OP +#undef BLENDV_OP +#undef SPECIAL_OP + static void gen_sse(CPUX86State *env, DisasContext *s, int b, target_ulong pc_start) { int b1, op1_offset, op2_offset, is_xmm, val; int modrm, mod, rm, reg; int sse_op_flags; + const struct SSEOpHelper_table6 *op6; + const struct SSEOpHelper_table7 *op7; SSEFunc_0_epp sse_fn_epp; - SSEFunc_0_eppi sse_fn_eppi; SSEFunc_0_ppi sse_fn_ppi; SSEFunc_0_eppt sse_fn_eppt; MemOp ot; @@ -3821,12 +3841,13 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, mod =3D (modrm >> 6) & 3; =20 assert(b1 < 2); - sse_fn_epp =3D sse_op_table6[b].op[b1]; - if (!sse_fn_epp) { + op6 =3D &sse_op_table6[b]; + if (op6->ext_mask =3D=3D 0) { goto unknown_op; } - if (!(s->cpuid_ext_features & sse_op_table6[b].ext_mask)) + if (!(s->cpuid_ext_features & op6->ext_mask)) { goto illegal_op; + } =20 if (b1) { op1_offset =3D ZMM_OFFSET(reg); @@ -3863,6 
+3884,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } } } else { + if ((op6->flags & SSE_OPF_MMX) =3D=3D 0) { + goto unknown_op; + } op1_offset =3D offsetof(CPUX86State,fpregs[reg].mmx); if (mod =3D=3D 3) { op2_offset =3D offsetof(CPUX86State,fpregs[rm].mmx); @@ -3872,13 +3896,13 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, gen_ldq_env_A0(s, op2_offset); } } - if (sse_fn_epp =3D=3D SSE_SPECIAL_FN) { - goto unknown_op; + if (!op6->op[b1]) { + goto illegal_op; } =20 tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); - sse_fn_epp(cpu_env, s->ptr0, s->ptr1); + op6->op[b1](cpu_env, s->ptr0, s->ptr1); =20 if (b =3D=3D 0x17) { set_cc_op(s, CC_OP_EFLAGS); @@ -4249,16 +4273,21 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, mod =3D (modrm >> 6) & 3; =20 assert(b1 < 2); - sse_fn_eppi =3D sse_op_table7[b].op[b1]; - if (!sse_fn_eppi) { + op7 =3D &sse_op_table7[b]; + if (op7->ext_mask =3D=3D 0) { goto unknown_op; } - if (!(s->cpuid_ext_features & sse_op_table7[b].ext_mask)) + if (!(s->cpuid_ext_features & op7->ext_mask)) { goto illegal_op; + } =20 s->rip_offset =3D 1; =20 - if (sse_fn_eppi =3D=3D SSE_SPECIAL_FN) { + if (op7->flags & SSE_OPF_SPECIAL) { + /* None of the "special" ops are valid on mmx registers */ + if (b1 =3D=3D 0) { + goto illegal_op; + } ot =3D mo_64_32(s->dflag); rm =3D (modrm & 7) | REX_B(s); if (mod !=3D 3) @@ -4403,6 +4432,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, gen_ldo_env_A0(s, op2_offset); } } else { + if ((op7->flags & SSE_OPF_MMX) =3D=3D 0) { + goto illegal_op; + } op1_offset =3D offsetof(CPUX86State,fpregs[reg].mmx); if (mod =3D=3D 3) { op2_offset =3D offsetof(CPUX86State,fpregs[rm].mmx); @@ -4425,7 +4457,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, =20 tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); - sse_fn_eppi(cpu_env, s->ptr0, s->ptr1, 
tcg_const_i32(val)); + op7->op[b1](cpu_env, s->ptr0, s->ptr1, tcg_const_i32(val)); break; =20 case 0x33a: --=20 2.37.1
From nobody Mon Feb 9 20:36:49 2026
From: Paolo Bonzini To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, paul@nowt.org Subject: [PATCH 06/23] i386: Move 3DNOW decoder Date: Sat, 27 Aug 2022 01:11:47 +0200 Message-Id: <20220826231204.201395-7-pbonzini@redhat.com> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20220826231204.201395-1-pbonzini@redhat.com> References: <20220826231204.201395-1-pbonzini@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"
From: Paul Brook Handle 3DNOW instructions early to avoid complicating the MMX/SSE logic. Signed-off-by: Paul Brook Message-Id: <20220424220204.2493824-25-paul@nowt.org> Reviewed-by: Richard Henderson Signed-off-by: Paolo Bonzini --- target/i386/tcg/translate.c | 30 +++++++++++++++++------------- 1 file changed, 17 insertions(+), 13 deletions(-) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index b7321b7588..c76f6dba11 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -3216,6 +3216,11 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, is_xmm =3D 1; } } + if (sse_op_flags & SSE_OPF_3DNOW) { + if (!(s->cpuid_ext2_features & CPUID_EXT2_3DNOW)) { + goto illegal_op; + } + } /* simple MMX/SSE operation */ if (s->flags & HF_TS_MASK) { gen_exception(s, EXCP07_PREX, pc_start - s->cs_base); @@ -4567,21 +4572,20 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, rm =3D (modrm & 7); op2_offset =3D offsetof(CPUX86State,fpregs[rm].mmx); } + if (sse_op_flags & SSE_OPF_3DNOW) { + /* 3DNow! data insns */ + val =3D x86_ldub_code(env, s); + SSEFunc_0_epp op_3dnow =3D sse_op_table5[val]; + if (!op_3dnow) { + goto unknown_op; + } + tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); + tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); + op_3dnow(cpu_env, s->ptr0, s->ptr1); + return; + } } switch(b) { - case 0x0f: /* 3DNow!
data insns */ - val =3D x86_ldub_code(env, s); - sse_fn_epp =3D sse_op_table5[val]; - if (!sse_fn_epp) { - goto unknown_op; - } - if (!(s->cpuid_ext2_features & CPUID_EXT2_3DNOW)) { - goto illegal_op; - } - tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); - tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); - sse_fn_epp(cpu_env, s->ptr0, s->ptr1); - break; case 0x70: /* pshufx insn */ case 0xc6: /* pshufx insn */ val =3D x86_ldub_code(env, s); --=20 2.37.1
From nobody Mon Feb 9 20:36:49 2026
From: Paolo Bonzini To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, paul@nowt.org Subject: [PATCH 07/23] i386: check SSE table flags instead of hardcoding opcodes Date: Sat, 27 Aug 2022 01:11:48 +0200 Message-Id: <20220826231204.201395-8-pbonzini@redhat.com> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20220826231204.201395-1-pbonzini@redhat.com> References: <20220826231204.201395-1-pbonzini@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"
Put more flags to work to avoid hardcoding lists of opcodes. The op7 case for SSE_OPF_CMP is included for homogeneity and because AVX needs it, but it is never used by SSE or MMX. Extracted from a patch by Paul Brook . Signed-off-by: Paolo Bonzini Reviewed-by: Richard Henderson --- target/i386/tcg/translate.c | 75 +++++++++++++++---------------------- 1 file changed, 31 insertions(+), 44 deletions(-) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index c76f6dba11..849c40b685 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -3909,7 +3909,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); op6->op[b1](cpu_env, s->ptr0, s->ptr1); =20 - if (b =3D=3D 0x17) { + if (op6->flags & SSE_OPF_CMP) { set_cc_op(s, CC_OP_EFLAGS); } break; @@ -4463,6 +4463,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); op7->op[b1](cpu_env, s->ptr0, s->ptr1, tcg_const_i32(val)); + if (op7->flags & SSE_OPF_CMP) { + set_cc_op(s, CC_OP_EFLAGS); + } break; =20 case 0x33a: @@ -4518,28 +4521,24 @@ static void gen_sse(CPUX86State
*env, DisasContext = *s, int b, int sz =3D 4; =20 gen_lea_modrm(env, s, modrm); - op2_offset =3D offsetof(CPUX86State,xmm_t0); + op2_offset =3D offsetof(CPUX86State, xmm_t0); =20 - switch (b) { - case 0x50 ... 0x5a: - case 0x5c ... 0x5f: - case 0xc2: - /* Most sse scalar operations. */ - if (b1 =3D=3D 2) { - sz =3D 2; - } else if (b1 =3D=3D 3) { - sz =3D 3; - } - break; - - case 0x2e: /* ucomis[sd] */ - case 0x2f: /* comis[sd] */ - if (b1 =3D=3D 0) { - sz =3D 2; + if (sse_op_flags & SSE_OPF_SCALAR) { + if (sse_op_flags & SSE_OPF_CMP) { + /* ucomis[sd], comis[sd] */ + if (b1 =3D=3D 0) { + sz =3D 2; + } else { + sz =3D 3; + } } else { - sz =3D 3; + /* Most sse scalar operations. */ + if (b1 =3D=3D 2) { + sz =3D 2; + } else if (b1 =3D=3D 3) { + sz =3D 3; + } } - break; } =20 switch (sz) { @@ -4585,26 +4584,14 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, return; } } - switch(b) { - case 0x70: /* pshufx insn */ - case 0xc6: /* pshufx insn */ + tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); + tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); + if (sse_op_flags & SSE_OPF_SHUF) { val =3D x86_ldub_code(env, s); - tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); - tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); /* XXX: introduce a new table? 
*/ sse_fn_ppi =3D (SSEFunc_0_ppi)sse_fn_epp; sse_fn_ppi(s->ptr0, s->ptr1, tcg_const_i32(val)); - break; - case 0xc2: - /* compare insns, bits 7:3 (7:5 for AVX) are ignored */ - val =3D x86_ldub_code(env, s) & 7; - sse_fn_epp =3D sse_op_table4[val][b1]; - - tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); - tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); - sse_fn_epp(cpu_env, s->ptr0, s->ptr1); - break; - case 0xf7: + } else if (b =3D=3D 0xf7) { /* maskmov : we must prepare A0 */ if (mod !=3D 3) { goto illegal_op; @@ -4613,19 +4600,19 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, gen_extu(s->aflag, s->A0); gen_add_A0_ds_seg(s); =20 - tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); - tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); /* XXX: introduce a new table? */ sse_fn_eppt =3D (SSEFunc_0_eppt)sse_fn_epp; sse_fn_eppt(cpu_env, s->ptr0, s->ptr1, s->A0); - break; - default: - tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); - tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); + } else if (b =3D=3D 0xc2) { + /* compare insns, bits 7:3 (7:5 for AVX) are ignored */ + val =3D x86_ldub_code(env, s) & 7; + sse_fn_epp =3D sse_op_table4[val][b1]; + sse_fn_epp(cpu_env, s->ptr0, s->ptr1); + } else { sse_fn_epp(cpu_env, s->ptr0, s->ptr1); - break; } - if (b =3D=3D 0x2e || b =3D=3D 0x2f) { + + if (sse_op_flags & SSE_OPF_CMP) { set_cc_op(s, CC_OP_EFLAGS); } } --=20 2.37.1
From nobody Mon Feb 9 20:36:49 2026
From: Paolo Bonzini To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, paul@nowt.org Subject: [PATCH 08/23] i386: isolate MMX code more Date: Sat, 27 Aug 2022 01:11:49 +0200 Message-Id: <20220826231204.201395-9-pbonzini@redhat.com> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20220826231204.201395-1-pbonzini@redhat.com> References: <20220826231204.201395-1-pbonzini@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"
Signed-off-by: Paolo Bonzini Reviewed-by: Richard Henderson --- target/i386/tcg/translate.c | 52 +++++++++++++++++++++++-------------- 1 file changed, 33 insertions(+), 19 deletions(-) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 849c40b685..f174b1d986 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -3888,6 +3888,12 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, gen_ldo_env_A0(s, op2_offset); } } + tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); + tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); + if (!op6->op[b1]) { + goto illegal_op; + } + op6->op[b1](cpu_env, s->ptr0, s->ptr1); } else { if ((op6->flags & SSE_OPF_MMX) =3D=3D 0) { goto unknown_op; @@ -3900,14 +3906,10 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, gen_lea_modrm(env, s, modrm); gen_ldq_env_A0(s, op2_offset); } + tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); + tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); + op6->op[0](cpu_env, s->ptr0, s->ptr1); } - if (!op6->op[b1]) { - goto illegal_op; - } - - tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); - tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); - op6->op[b1](cpu_env, s->ptr0, s->ptr1); =20 if (op6->flags & SSE_OPF_CMP) { set_cc_op(s, CC_OP_EFLAGS); @@ -4427,16 +4429,8 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, return; } =20 - if (b1) { - op1_offset =3D ZMM_OFFSET(reg); - if (mod =3D=3D 3) { - op2_offset =3D ZMM_OFFSET(rm | REX_B(s)); - } else { - op2_offset =3D offsetof(CPUX86State,xmm_t0); - gen_lea_modrm(env, s, modrm); - gen_ldo_env_A0(s, op2_offset); - } - } else { + if (b1 =3D=3D 0) { + /* MMX */ if ((op7->flags & SSE_OPF_MMX) =3D=3D 0) { goto illegal_op; } @@ -4448,9 +4442,29 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, gen_lea_modrm(env, s, modrm); gen_ldq_env_A0(s, op2_offset); } - } - val =3D x86_ldub_code(env, s); + val =3D x86_ldub_code(env, s); + tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); + tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); =20 + /* We only actually have one MMX instruction (palignr) */ + assert(b =3D=3D 0x0f); + + op7->op[0](cpu_env, s->ptr0, s->ptr1, + tcg_const_i32(val)); + break; + } + + /* SSE */ + op1_offset =3D
ZMM_OFFSET(reg); + if (mod =3D=3D 3) { + op2_offset =3D ZMM_OFFSET(rm | REX_B(s)); + } else { + op2_offset =3D offsetof(CPUX86State, xmm_t0); + gen_lea_modrm(env, s, modrm); + gen_ldo_env_A0(s, op2_offset); + } + + val =3D x86_ldub_code(env, s); if ((b & 0xfc) =3D=3D 0x60) { /* pcmpXstrX */ set_cc_op(s, CC_OP_EFLAGS); =20 --=20 2.37.1
From nobody Mon Feb 9 20:36:49 2026
From: Paolo Bonzini
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org, paul@nowt.org
Subject: [PATCH 09/23] i386: Add size suffix to vector FP helpers
Date: Sat, 27 Aug 2022 01:11:50 +0200
Message-Id: <20220826231204.201395-10-pbonzini@redhat.com>
X-Mailer: git-send-email 2.37.1
In-Reply-To: <20220826231204.201395-1-pbonzini@redhat.com>
References: <20220826231204.201395-1-pbonzini@redhat.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

For AVX we're going to need both 128 bit (xmm) and 256 bit (ymm) variants of floating point helpers. Add the register type suffix to the existing *PS and *PD helpers (SS and SD variants are only valid on 128 bit vectors) No functional changes. Signed-off-by: Paul Brook Message-Id: <20220424220204.2493824-15-paul@nowt.org> Reviewed-by: Richard Henderson Signed-off-by: Paolo Bonzini --- target/i386/ops_sse.h | 48 ++++++++++++++++++------------------ target/i386/ops_sse_header.h | 48 ++++++++++++++++++------------------ target/i386/tcg/translate.c | 37 +++++++++++++-------------- 3 files changed, 67 insertions(+), 66 deletions(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index b12b271fcd..f603981ab8 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -537,7 +537,7 @@ void glue(helper_pshufw, SUFFIX)(Reg *d, Reg *s, int or= der) MOVE(*d, r); } #else -void helper_shufps(Reg *d, Reg *s, int order) +void glue(helper_shufps, SUFFIX)(Reg *d, Reg *s, int order) { Reg r; =20 @@ -548,7 +548,7 @@ void helper_shufps(Reg *d, Reg *s, int order) MOVE(*d, r); } =20 -void helper_shufpd(Reg *d, Reg *s, int order) +void glue(helper_shufpd, SUFFIX)(Reg *d, Reg *s, int order) { Reg r; =20 @@ -598,7 +598,7 @@ void glue(helper_pshufhw, SUFFIX)(Reg *d, Reg *s, int o= rder) /* XXX: not accurate */ =20 #define SSE_HELPER_S(name, 
F) \ - void helper_ ## name ## ps(CPUX86State *env, Reg *d, Reg *s) \ + void glue(helper_ ## name ## ps, SUFFIX)(CPUX86State *env, Reg *d, Reg= *s)\ { \ d->ZMM_S(0) =3D F(32, d->ZMM_S(0), s->ZMM_S(0)); \ d->ZMM_S(1) =3D F(32, d->ZMM_S(1), s->ZMM_S(1)); \ @@ -611,7 +611,7 @@ void glue(helper_pshufhw, SUFFIX)(Reg *d, Reg *s, int o= rder) d->ZMM_S(0) =3D F(32, d->ZMM_S(0), s->ZMM_S(0)); \ } \ \ - void helper_ ## name ## pd(CPUX86State *env, Reg *d, Reg *s) \ + void glue(helper_ ## name ## pd, SUFFIX)(CPUX86State *env, Reg *d, Reg= *s)\ { \ d->ZMM_D(0) =3D F(64, d->ZMM_D(0), s->ZMM_D(0)); \ d->ZMM_D(1) =3D F(64, d->ZMM_D(1), s->ZMM_D(1)); \ @@ -647,7 +647,7 @@ SSE_HELPER_S(sqrt, FPU_SQRT) =20 =20 /* float to float conversions */ -void helper_cvtps2pd(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_cvtps2pd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { float32 s0, s1; =20 @@ -657,7 +657,7 @@ void helper_cvtps2pd(CPUX86State *env, Reg *d, Reg *s) d->ZMM_D(1) =3D float32_to_float64(s1, &env->sse_status); } =20 -void helper_cvtpd2ps(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_cvtpd2ps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { d->ZMM_S(0) =3D float64_to_float32(s->ZMM_D(0), &env->sse_status); d->ZMM_S(1) =3D float64_to_float32(s->ZMM_D(1), &env->sse_status); @@ -675,7 +675,7 @@ void helper_cvtsd2ss(CPUX86State *env, Reg *d, Reg *s) } =20 /* integer to float */ -void helper_cvtdq2ps(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_cvtdq2ps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { d->ZMM_S(0) =3D int32_to_float32(s->ZMM_L(0), &env->sse_status); d->ZMM_S(1) =3D int32_to_float32(s->ZMM_L(1), &env->sse_status); @@ -683,7 +683,7 @@ void helper_cvtdq2ps(CPUX86State *env, Reg *d, Reg *s) d->ZMM_S(3) =3D int32_to_float32(s->ZMM_L(3), &env->sse_status); } =20 -void helper_cvtdq2pd(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_cvtdq2pd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { int32_t l0, l1; =20 @@ -760,7 +760,7 @@ WRAP_FLOATCONV(int64_t, 
float32_to_int64_round_to_zero,= float32, INT64_MIN) WRAP_FLOATCONV(int64_t, float64_to_int64, float64, INT64_MIN) WRAP_FLOATCONV(int64_t, float64_to_int64_round_to_zero, float64, INT64_MIN) =20 -void helper_cvtps2dq(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_cvtps2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { d->ZMM_L(0) =3D x86_float32_to_int32(s->ZMM_S(0), &env->sse_status); d->ZMM_L(1) =3D x86_float32_to_int32(s->ZMM_S(1), &env->sse_status); @@ -768,7 +768,7 @@ void helper_cvtps2dq(CPUX86State *env, ZMMReg *d, ZMMRe= g *s) d->ZMM_L(3) =3D x86_float32_to_int32(s->ZMM_S(3), &env->sse_status); } =20 -void helper_cvtpd2dq(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_cvtpd2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { d->ZMM_L(0) =3D x86_float64_to_int32(s->ZMM_D(0), &env->sse_status); d->ZMM_L(1) =3D x86_float64_to_int32(s->ZMM_D(1), &env->sse_status); @@ -810,7 +810,7 @@ int64_t helper_cvtsd2sq(CPUX86State *env, ZMMReg *s) #endif =20 /* float to integer truncated */ -void helper_cvttps2dq(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_cvttps2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { d->ZMM_L(0) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(0), &env->= sse_status); d->ZMM_L(1) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(1), &env->= sse_status); @@ -818,7 +818,7 @@ void helper_cvttps2dq(CPUX86State *env, ZMMReg *d, ZMMR= eg *s) d->ZMM_L(3) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(3), &env->= sse_status); } =20 -void helper_cvttpd2dq(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_cvttpd2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { d->ZMM_L(0) =3D x86_float64_to_int32_round_to_zero(s->ZMM_D(0), &env->= sse_status); d->ZMM_L(1) =3D x86_float64_to_int32_round_to_zero(s->ZMM_D(1), &env->= sse_status); @@ -859,7 +859,7 @@ int64_t helper_cvttsd2sq(CPUX86State *env, ZMMReg *s) } #endif =20 -void helper_rsqrtps(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_rsqrtps, 
SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { uint8_t old_flags =3D get_float_exception_flags(&env->sse_status); d->ZMM_S(0) =3D float32_div(float32_one, @@ -886,7 +886,7 @@ void helper_rsqrtss(CPUX86State *env, ZMMReg *d, ZMMReg= *s) set_float_exception_flags(old_flags, &env->sse_status); } =20 -void helper_rcpps(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_rcpps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { uint8_t old_flags =3D get_float_exception_flags(&env->sse_status); d->ZMM_S(0) =3D float32_div(float32_one, s->ZMM_S(0), &env->sse_status= ); @@ -947,7 +947,7 @@ void helper_insertq_i(CPUX86State *env, ZMMReg *d, int = index, int length) d->ZMM_Q(0) =3D helper_insertq(d->ZMM_Q(0), index, length); } =20 -void helper_haddps(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_haddps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { ZMMReg r; =20 @@ -958,7 +958,7 @@ void helper_haddps(CPUX86State *env, ZMMReg *d, ZMMReg = *s) MOVE(*d, r); } =20 -void helper_haddpd(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_haddpd, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { ZMMReg r; =20 @@ -967,7 +967,7 @@ void helper_haddpd(CPUX86State *env, ZMMReg *d, ZMMReg = *s) MOVE(*d, r); } =20 -void helper_hsubps(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_hsubps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { ZMMReg r; =20 @@ -978,7 +978,7 @@ void helper_hsubps(CPUX86State *env, ZMMReg *d, ZMMReg = *s) MOVE(*d, r); } =20 -void helper_hsubpd(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_hsubpd, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { ZMMReg r; =20 @@ -987,7 +987,7 @@ void helper_hsubpd(CPUX86State *env, ZMMReg *d, ZMMReg = *s) MOVE(*d, r); } =20 -void helper_addsubps(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_addsubps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { d->ZMM_S(0) =3D float32_sub(d->ZMM_S(0), s->ZMM_S(0), &env->sse_status= ); d->ZMM_S(1) =3D float32_add(d->ZMM_S(1), 
s->ZMM_S(1), &env->sse_status= ); @@ -995,7 +995,7 @@ void helper_addsubps(CPUX86State *env, ZMMReg *d, ZMMRe= g *s) d->ZMM_S(3) =3D float32_add(d->ZMM_S(3), s->ZMM_S(3), &env->sse_status= ); } =20 -void helper_addsubpd(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_addsubpd, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { d->ZMM_D(0) =3D float64_sub(d->ZMM_D(0), s->ZMM_D(0), &env->sse_status= ); d->ZMM_D(1) =3D float64_add(d->ZMM_D(1), s->ZMM_D(1), &env->sse_status= ); @@ -1003,7 +1003,7 @@ void helper_addsubpd(CPUX86State *env, ZMMReg *d, ZMM= Reg *s) =20 /* XXX: unordered */ #define SSE_HELPER_CMP(name, F) \ - void helper_ ## name ## ps(CPUX86State *env, Reg *d, Reg *s) \ + void glue(helper_ ## name ## ps, SUFFIX)(CPUX86State *env, Reg *d, Reg= *s)\ { \ d->ZMM_L(0) =3D F(32, d->ZMM_S(0), s->ZMM_S(0)); \ d->ZMM_L(1) =3D F(32, d->ZMM_S(1), s->ZMM_S(1)); \ @@ -1016,7 +1016,7 @@ void helper_addsubpd(CPUX86State *env, ZMMReg *d, ZMM= Reg *s) d->ZMM_L(0) =3D F(32, d->ZMM_S(0), s->ZMM_S(0)); \ } \ \ - void helper_ ## name ## pd(CPUX86State *env, Reg *d, Reg *s) \ + void glue(helper_ ## name ## pd, SUFFIX)(CPUX86State *env, Reg *d, Reg= *s)\ { \ d->ZMM_Q(0) =3D F(64, d->ZMM_D(0), s->ZMM_D(0)); \ d->ZMM_Q(1) =3D F(64, d->ZMM_D(1), s->ZMM_D(1)); \ @@ -1099,7 +1099,7 @@ void helper_comisd(CPUX86State *env, Reg *d, Reg *s) CC_SRC =3D comis_eflags[ret + 1]; } =20 -uint32_t helper_movmskps(CPUX86State *env, Reg *s) +uint32_t glue(helper_movmskps, SUFFIX)(CPUX86State *env, Reg *s) { int b0, b1, b2, b3; =20 @@ -1110,7 +1110,7 @@ uint32_t helper_movmskps(CPUX86State *env, Reg *s) return b0 | (b1 << 1) | (b2 << 2) | (b3 << 3); } =20 -uint32_t helper_movmskpd(CPUX86State *env, Reg *s) +uint32_t glue(helper_movmskpd, SUFFIX)(CPUX86State *env, Reg *s) { int b0, b1; =20 diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h index cef28f2aae..fc697536a0 100644 --- a/target/i386/ops_sse_header.h +++ b/target/i386/ops_sse_header.h @@ -122,8 +122,8 @@ 
DEF_HELPER_2(glue(movq_mm_T0, SUFFIX), void, Reg, i64) #if SHIFT =3D=3D 0 DEF_HELPER_3(glue(pshufw, SUFFIX), void, Reg, Reg, int) #else -DEF_HELPER_3(shufps, void, Reg, Reg, int) -DEF_HELPER_3(shufpd, void, Reg, Reg, int) +DEF_HELPER_3(glue(shufps, SUFFIX), void, Reg, Reg, int) +DEF_HELPER_3(glue(shufpd, SUFFIX), void, Reg, Reg, int) DEF_HELPER_3(glue(pshufd, SUFFIX), void, Reg, Reg, int) DEF_HELPER_3(glue(pshuflw, SUFFIX), void, Reg, Reg, int) DEF_HELPER_3(glue(pshufhw, SUFFIX), void, Reg, Reg, int) @@ -134,9 +134,9 @@ DEF_HELPER_3(glue(pshufhw, SUFFIX), void, Reg, Reg, int) /* XXX: not accurate */ =20 #define SSE_HELPER_S(name, F) \ - DEF_HELPER_3(name ## ps, void, env, Reg, Reg) \ + DEF_HELPER_3(glue(name ## ps, SUFFIX), void, env, Reg, Reg) \ DEF_HELPER_3(name ## ss, void, env, Reg, Reg) \ - DEF_HELPER_3(name ## pd, void, env, Reg, Reg) \ + DEF_HELPER_3(glue(name ## pd, SUFFIX), void, env, Reg, Reg) \ DEF_HELPER_3(name ## sd, void, env, Reg, Reg) =20 SSE_HELPER_S(add, FPU_ADD) @@ -148,12 +148,12 @@ SSE_HELPER_S(max, FPU_MAX) SSE_HELPER_S(sqrt, FPU_SQRT) =20 =20 -DEF_HELPER_3(cvtps2pd, void, env, Reg, Reg) -DEF_HELPER_3(cvtpd2ps, void, env, Reg, Reg) +DEF_HELPER_3(glue(cvtps2pd, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_3(glue(cvtpd2ps, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(cvtss2sd, void, env, Reg, Reg) DEF_HELPER_3(cvtsd2ss, void, env, Reg, Reg) -DEF_HELPER_3(cvtdq2ps, void, env, Reg, Reg) -DEF_HELPER_3(cvtdq2pd, void, env, Reg, Reg) +DEF_HELPER_3(glue(cvtdq2ps, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_3(glue(cvtdq2pd, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(cvtpi2ps, void, env, ZMMReg, MMXReg) DEF_HELPER_3(cvtpi2pd, void, env, ZMMReg, MMXReg) DEF_HELPER_3(cvtsi2ss, void, env, ZMMReg, i32) @@ -164,8 +164,8 @@ DEF_HELPER_3(cvtsq2ss, void, env, ZMMReg, i64) DEF_HELPER_3(cvtsq2sd, void, env, ZMMReg, i64) #endif =20 -DEF_HELPER_3(cvtps2dq, void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(cvtpd2dq, void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(cvtps2dq, SUFFIX), 
void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(cvtpd2dq, SUFFIX), void, env, ZMMReg, ZMMReg) DEF_HELPER_3(cvtps2pi, void, env, MMXReg, ZMMReg) DEF_HELPER_3(cvtpd2pi, void, env, MMXReg, ZMMReg) DEF_HELPER_2(cvtss2si, s32, env, ZMMReg) @@ -175,8 +175,8 @@ DEF_HELPER_2(cvtss2sq, s64, env, ZMMReg) DEF_HELPER_2(cvtsd2sq, s64, env, ZMMReg) #endif =20 -DEF_HELPER_3(cvttps2dq, void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(cvttpd2dq, void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(cvttps2dq, SUFFIX), void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(cvttpd2dq, SUFFIX), void, env, ZMMReg, ZMMReg) DEF_HELPER_3(cvttps2pi, void, env, MMXReg, ZMMReg) DEF_HELPER_3(cvttpd2pi, void, env, MMXReg, ZMMReg) DEF_HELPER_2(cvttss2si, s32, env, ZMMReg) @@ -186,25 +186,25 @@ DEF_HELPER_2(cvttss2sq, s64, env, ZMMReg) DEF_HELPER_2(cvttsd2sq, s64, env, ZMMReg) #endif =20 -DEF_HELPER_3(rsqrtps, void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(rsqrtps, SUFFIX), void, env, ZMMReg, ZMMReg) DEF_HELPER_3(rsqrtss, void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(rcpps, void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(rcpps, SUFFIX), void, env, ZMMReg, ZMMReg) DEF_HELPER_3(rcpss, void, env, ZMMReg, ZMMReg) DEF_HELPER_3(extrq_r, void, env, ZMMReg, ZMMReg) DEF_HELPER_4(extrq_i, void, env, ZMMReg, int, int) DEF_HELPER_3(insertq_r, void, env, ZMMReg, ZMMReg) DEF_HELPER_4(insertq_i, void, env, ZMMReg, int, int) -DEF_HELPER_3(haddps, void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(haddpd, void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(hsubps, void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(hsubpd, void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(addsubps, void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(addsubpd, void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(haddps, SUFFIX), void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(haddpd, SUFFIX), void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(hsubps, SUFFIX), void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(hsubpd, SUFFIX), void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(addsubps, SUFFIX), void, env, ZMMReg, ZMMReg) 
+DEF_HELPER_3(glue(addsubpd, SUFFIX), void, env, ZMMReg, ZMMReg) =20 #define SSE_HELPER_CMP(name, F) \ - DEF_HELPER_3(name ## ps, void, env, Reg, Reg) \ + DEF_HELPER_3(glue(name ## ps, SUFFIX), void, env, Reg, Reg) \ DEF_HELPER_3(name ## ss, void, env, Reg, Reg) \ - DEF_HELPER_3(name ## pd, void, env, Reg, Reg) \ + DEF_HELPER_3(glue(name ## pd, SUFFIX), void, env, Reg, Reg) \ DEF_HELPER_3(name ## sd, void, env, Reg, Reg) =20 SSE_HELPER_CMP(cmpeq, FPU_CMPEQ) @@ -220,8 +220,8 @@ DEF_HELPER_3(ucomiss, void, env, Reg, Reg) DEF_HELPER_3(comiss, void, env, Reg, Reg) DEF_HELPER_3(ucomisd, void, env, Reg, Reg) DEF_HELPER_3(comisd, void, env, Reg, Reg) -DEF_HELPER_2(movmskps, i32, env, Reg) -DEF_HELPER_2(movmskpd, i32, env, Reg) +DEF_HELPER_2(glue(movmskps, SUFFIX), i32, env, Reg) +DEF_HELPER_2(glue(movmskpd, SUFFIX), i32, env, Reg) #endif =20 DEF_HELPER_2(glue(pmovmskb, SUFFIX), i32, efine SSE_FOP(name) OP(op1, SSE_= OPF_SCALAR, \ - gen_helper_##name##ps, gen_helper_##name##pd, \ + gen_helper_##name##ps##_xmm, gen_helper_##name##pd##_xmm, \ gen_helper_##name##ss, gen_helper_##name##sd) #define SSE_OP(sname, dname, op, flags) OP(op, flags, \ gen_helper_##sname##_xmm, gen_helper_##dname##_xmm, NULL, NULL) @@ -2843,12 +2843,12 @@ static const struct SSEOpHelper_table1 sse_op_table= 1[256] =3D { gen_helper_comiss, gen_helper_comisd, NULL, NULL), [0x50] =3D SSE_SPECIAL, /* movmskps, movmskpd */ [0x51] =3D OP(op1, SSE_OPF_SCALAR, - gen_helper_sqrtps, gen_helper_sqrtpd, + gen_helper_sqrtps_xmm, gen_helper_sqrtpd_xmm, gen_helper_sqrtss, gen_helper_sqrtsd), [0x52] =3D OP(op1, SSE_OPF_SCALAR, - gen_helper_rsqrtps, NULL, gen_helper_rsqrtss, NULL), + gen_helper_rsqrtps_xmm, NULL, gen_helper_rsqrtss, NULL), [0x53] =3D OP(op1, SSE_OPF_SCALAR, - gen_helper_rcpps, NULL, gen_helper_rcpss, NULL), + gen_helper_rcpps_xmm, NULL, gen_helper_rcpss, NULL), [0x54] =3D SSE_OP(pand, pand, op1, 0), /* andps, andpd */ [0x55] =3D SSE_OP(pandn, pandn, op1, 0), /* andnps, andnpd */ [0x56] =3D SSE_OP(por, 
por, op1, 0), /* orps, orpd */ @@ -2856,19 +2856,19 @@ static const struct SSEOpHelper_table1 sse_op_table= 1[256] =3D { [0x58] =3D SSE_FOP(add), [0x59] =3D SSE_FOP(mul), [0x5a] =3D OP(op1, SSE_OPF_SCALAR, - gen_helper_cvtps2pd, gen_helper_cvtpd2ps, + gen_helper_cvtps2pd_xmm, gen_helper_cvtpd2ps_xmm, gen_helper_cvtss2sd, gen_helper_cvtsd2ss), [0x5b] =3D OP(op1, 0, - gen_helper_cvtdq2ps, gen_helper_cvtps2dq, - gen_helper_cvttps2dq, NULL), + gen_helper_cvtdq2ps_xmm, gen_helper_cvtps2dq_xmm, + gen_helper_cvttps2dq_xmm, NULL), [0x5c] =3D SSE_FOP(sub), [0x5d] =3D SSE_FOP(min), [0x5e] =3D SSE_FOP(div), [0x5f] =3D SSE_FOP(max), =20 [0xc2] =3D SSE_FOP(cmpeq), /* sse_op_table4 */ - [0xc6] =3D OP(dummy, SSE_OPF_SHUF, (SSEFunc_0_epp)gen_helper_shufps, - (SSEFunc_0_epp)gen_helper_shufpd, NULL, NULL), + [0xc6] =3D OP(dummy, SSE_OPF_SHUF, (SSEFunc_0_epp)gen_helper_shufps_xm= m, + (SSEFunc_0_epp)gen_helper_shufpd_xmm, NULL, NULL), =20 /* SSSE3, SSE4, MOVBE, CRC32, BMI1, BMI2, ADX. */ [0x38] =3D SSE_SPECIAL, @@ -2909,15 +2909,15 @@ static const struct SSEOpHelper_table1 sse_op_table= 1[256] =3D { [0x79] =3D OP(op1, 0, NULL, gen_helper_extrq_r, NULL, gen_helper_insertq_r), [0x7c] =3D OP(op1, 0, - NULL, gen_helper_haddpd, NULL, gen_helper_haddps), + NULL, gen_helper_haddpd_xmm, NULL, gen_helper_haddps_xmm), [0x7d] =3D OP(op1, 0, - NULL, gen_helper_hsubpd, NULL, gen_helper_hsubps), + NULL, gen_helper_hsubpd_xmm, NULL, gen_helper_hsubps_xmm), [0x7e] =3D SSE_SPECIAL, /* movd, movd, , movq */ [0x7f] =3D SSE_SPECIAL, /* movq, movdqa, movdqu */ [0xc4] =3D SSE_SPECIAL, /* pinsrw */ [0xc5] =3D SSE_SPECIAL, /* pextrw */ [0xd0] =3D OP(op1, 0, - NULL, gen_helper_addsubpd, NULL, gen_helper_addsubps), + NULL, gen_helper_addsubpd_xmm, NULL, gen_helper_addsubps_x= mm), [0xd1] =3D MMX_OP(psrlw), [0xd2] =3D MMX_OP(psrld), [0xd3] =3D MMX_OP(psrlq), @@ -2940,8 +2940,8 @@ static const struct SSEOpHelper_table1 sse_op_table1[= 256] =3D { [0xe4] =3D MMX_OP(pmulhuw), [0xe5] =3D MMX_OP(pmulhw), [0xe6] =3D 
OP(op1, 0, - NULL, gen_helper_cvttpd2dq, - gen_helper_cvtdq2pd, gen_helper_cvtpd2dq), + NULL, gen_helper_cvttpd2dq_xmm, + gen_helper_cvtdq2pd_xmm, gen_helper_cvtpd2dq_xmm), [0xe7] =3D SSE_SPECIAL, /* movntq, movntq */ [0xe8] =3D MMX_OP(psubsb), [0xe9] =3D MMX_OP(psubsw), @@ -3018,8 +3018,9 @@ static const SSEFunc_l_ep sse_op_table3bq[] =3D { }; #endif =20 -#define SSE_FOP(x) { gen_helper_ ## x ## ps, gen_helper_ ## x ## pd, \ - gen_helper_ ## x ## ss, gen_helper_ ## x ## sd, } +#define SSE_FOP(x) { \ + gen_helper_ ## x ## ps ## _xmm, gen_helper_ ## x ## pd ## _xmm, \ + gen_helper_ ## x ## ss, gen_helper_ ## x ## sd} static const SSEFunc_0_epp sse_op_table4[8][4] =3D { SSE_FOP(cmpeq), SSE_FOP(cmplt), @@ -3636,13 +3637,13 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, case 0x050: /* movmskps */ rm =3D (modrm & 7) | REX_B(s); tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(rm)); - gen_helper_movmskps(s->tmp2_i32, cpu_env, s->ptr0); + gen_helper_movmskps_xmm(s->tmp2_i32, cpu_env, s->ptr0); tcg_gen_extu_i32_tl(cpu_regs[reg], s->tmp2_i32); break; case 0x150: /* movmskpd */ rm =3D (modrm & 7) | REX_B(s); tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(rm)); - gen_helper_movmskpd(s->tmp2_i32, cpu_env, s->ptr0); + gen_helper_movmskpd_xmm(s->tmp2_i32, cpu_env, s->ptr0); tcg_gen_extu_i32_tl(cpu_regs[reg], s->tmp2_i32); break; case 0x02a: /* cvtpi2ps */ --=20 2.37.1
From nobody Mon Feb 9 20:36:49 2026
From: Paolo Bonzini
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org, paul@nowt.org
Subject: [PATCH 10/23] i386: do not cast gen_helper_* function pointers
Date: Sat, 27 Aug 2022 01:11:51 +0200
Message-Id: <20220826231204.201395-11-pbonzini@redhat.com>
X-Mailer: git-send-email 2.37.1
In-Reply-To: <20220826231204.201395-1-pbonzini@redhat.com>
References: <20220826231204.201395-1-pbonzini@redhat.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Use a union to store the various possible kinds of function pointers, and access the correct one based on the flags. SSEOpHelper_table6 and SSEOpHelper_table7 right now only have one case, but this would change with AVX's 3- and 4-argument operations. 
Use unions there too, to keep the code more similar for the three tables. Extracted from a patch by Paul Brook. Signed-off-by: Paolo Bonzini Reviewed-by: Richard Henderson --- target/i386/tcg/translate.c | 75 ++++++++++++++++++------------------- 1 file changed, 37 insertions(+), 38 deletions(-) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index aab04839c8..f7e8cab52d 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -2784,6 +2784,8 @@ typedef void (*SSEFunc_l_ep)(TCGv_i64 val, TCGv_ptr e= nv, TCGv_ptr reg); typedef void (*SSEFunc_0_epi)(TCGv_ptr env, TCGv_ptr reg, TCGv_i32 val); typedef void (*SSEFunc_0_epl)(TCGv_ptr env, TCGv_ptr reg, TCGv_i64 val); typedef void (*SSEFunc_0_epp)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_b= ); +typedef void (*SSEFunc_0_eppp)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_= b, + TCGv_ptr reg_c); typedef void (*SSEFunc_0_eppi)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_= b, TCGv_i32 val); typedef void (*SSEFunc_0_ppi)(TCGv_ptr reg_a, TCGv_ptr reg_b, TCGv_i32 val= ); @@ -2798,7 +2800,7 @@ typedef void (*SSEFunc_0_eppt)(TCGv_ptr env, TCGv_ptr= reg_a, TCGv_ptr reg_b, #define SSE_OPF_SHUF (1 << 9) /* pshufx/shufpx */ =20 #define OP(op, flags, a, b, c, d) \ - {flags, {a, b, c, d} } + {flags, {{.op =3D a}, {.op =3D b}, {.op =3D c}, {.op =3D d} } } =20 #define MMX_OP(x) OP(op1, SSE_OPF_MMX, \ gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm, NULL, NULL) @@ -2809,9 +2811,15 @@ typedef void (*SSEFunc_0_eppt)(TCGv_ptr env, TCGv_pt= r reg_a, TCGv_ptr reg_b, #define SSE_OP(sname, dname, op, flags) OP(op, flags, \ gen_helper_##sname##_xmm, gen_helper_##dname##_xmm, NULL, NULL) =20 +typedef union SSEFuncs { + SSEFunc_0_epp op1; + SSEFunc_0_ppi op1i; + SSEFunc_0_eppt op1t; +} SSEFuncs; + struct SSEOpHelper_table1 { int flags; - SSEFunc_0_epp op[4]; + SSEFuncs fn[4]; }; =20 #define SSE_3DNOW { SSE_OPF_3DNOW } @@ -2867,8 +2875,7 @@ static const struct SSEOpHelper_table1 sse_op_table1[= 256] =3D { 
[0x5f] =3D SSE_FOP(max), =20 [0xc2] =3D SSE_FOP(cmpeq), /* sse_op_table4 */ - [0xc6] =3D OP(dummy, SSE_OPF_SHUF, (SSEFunc_0_epp)gen_helper_shufps_xm= m, - (SSEFunc_0_epp)gen_helper_shufpd_xmm, NULL, NULL), + [0xc6] =3D SSE_OP(shufps, shufpd, op1i, SSE_OPF_SHUF), =20 /* SSSE3, SSE4, MOVBE, CRC32, BMI1, BMI2, ADX. */ [0x38] =3D SSE_SPECIAL, @@ -2894,10 +2901,8 @@ static const struct SSEOpHelper_table1 sse_op_table1= [256] =3D { [0x6e] =3D SSE_SPECIAL, /* movd mm, ea */ [0x6f] =3D SSE_SPECIAL, /* movq, movdqa, , movqdu */ [0x70] =3D OP(op1i, SSE_OPF_SHUF | SSE_OPF_MMX, - (SSEFunc_0_epp)gen_helper_pshufw_mmx, - (SSEFunc_0_epp)gen_helper_pshufd_xmm, - (SSEFunc_0_epp)gen_helper_pshufhw_xmm, - (SSEFunc_0_epp)gen_helper_pshuflw_xmm), + gen_helper_pshufw_mmx, gen_helper_pshufd_xmm, + gen_helper_pshufhw_xmm, gen_helper_pshuflw_xmm), [0x71] =3D SSE_SPECIAL, /* shiftw */ [0x72] =3D SSE_SPECIAL, /* shiftd */ [0x73] =3D SSE_SPECIAL, /* shiftq */ @@ -2959,8 +2964,7 @@ static const struct SSEOpHelper_table1 sse_op_table1[= 256] =3D { [0xf5] =3D MMX_OP(pmaddwd), [0xf6] =3D MMX_OP(psadbw), [0xf7] =3D OP(op1t, SSE_OPF_MMX, - (SSEFunc_0_epp)gen_helper_maskmov_mmx, - (SSEFunc_0_epp)gen_helper_maskmov_xmm, NULL, NULL), + gen_helper_maskmov_mmx, gen_helper_maskmov_xmm, NULL, NULL= ), [0xf8] =3D MMX_OP(psubb), [0xf9] =3D MMX_OP(psubw), [0xfa] =3D MMX_OP(psubl), @@ -3057,17 +3061,19 @@ static const SSEFunc_0_epp sse_op_table5[256] =3D { [0xb6] =3D gen_helper_movq, /* pfrcpit2 */ [0xb7] =3D gen_helper_pmulhrw_mmx, [0xbb] =3D gen_helper_pswapd, - [0xbf] =3D gen_helper_pavgb_mmx /* pavgusb */ + [0xbf] =3D gen_helper_pavgb_mmx, }; =20 struct SSEOpHelper_table6 { - SSEFunc_0_epp op[2]; + SSEFuncs fn[2]; uint32_t ext_mask; int flags; }; =20 struct SSEOpHelper_table7 { - SSEFunc_0_eppi op[2]; + union { + SSEFunc_0_eppi op1; + } fn[2]; uint32_t ext_mask; int flags; }; @@ -3075,7 +3081,8 @@ struct SSEOpHelper_table7 { #define gen_helper_special_xmm NULL =20 #define OP(name, op, flags, ext, 
mmx_name) \ - {{mmx_name, gen_helper_ ## name ## _xmm}, CPUID_EXT_ ## ext, flags} + {{{.op =3D mmx_name}, {.op =3D gen_helper_ ## name ## _xmm} }, \ + CPUID_EXT_ ## ext, flags} #define BINARY_OP_MMX(name, ext) \ OP(name, op1, SSE_OPF_MMX, ext, gen_helper_ ## name ## _mmx) #define BINARY_OP(name, ext, flags) \ @@ -3185,11 +3192,9 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, int b1, op1_offset, op2_offset, is_xmm, val; int modrm, mod, rm, reg; int sse_op_flags; + SSEFuncs sse_op_fn; const struct SSEOpHelper_table6 *op6; const struct SSEOpHelper_table7 *op7; - SSEFunc_0_epp sse_fn_epp; - SSEFunc_0_ppi sse_fn_ppi; - SSEFunc_0_eppt sse_fn_eppt; MemOp ot; =20 b &=3D 0xff; @@ -3202,9 +3207,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, else b1 =3D 0; sse_op_flags =3D sse_op_table1[b].flags; - sse_fn_epp =3D sse_op_table1[b].op[b1]; + sse_op_fn =3D sse_op_table1[b].fn[b1]; if ((sse_op_flags & (SSE_OPF_SPECIAL | SSE_OPF_3DNOW)) =3D=3D 0 - && !sse_fn_epp) { + && !sse_op_fn.op1) { goto unknown_op; } if ((b <=3D 0x5f && b >=3D 0x10) || b =3D=3D 0xc6 || b =3D=3D 0xc2) { @@ -3618,9 +3623,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, op1_offset =3D offsetof(CPUX86State,mmx_t0); } assert(b1 < 2); - sse_fn_epp =3D sse_op_table2[((b - 1) & 3) * 8 + + SSEFunc_0_epp fn =3D sse_op_table2[((b - 1) & 3) * 8 + (((modrm >> 3)) & 7)][b1]; - if (!sse_fn_epp) { + if (!fn) { goto unknown_op; } if (is_xmm) { @@ -3632,7 +3637,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } tcg_gen_addi_ptr(s->ptr0, cpu_env, op2_offset); tcg_gen_addi_ptr(s->ptr1, cpu_env, op1_offset); - sse_fn_epp(cpu_env, s->ptr0, s->ptr1); + fn(cpu_env, s->ptr0, s->ptr1); break; case 0x050: /* movmskps */ rm =3D (modrm & 7) | REX_B(s); @@ -3891,10 +3896,10 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, } tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); - if (!op6->op[b1]) { 
+ if (!op6->fn[b1].op1) { goto illegal_op; } - op6->op[b1](cpu_env, s->ptr0, s->ptr1); + op6->fn[b1].op1(cpu_env, s->ptr0, s->ptr1); } else { if ((op6->flags & SSE_OPF_MMX) =3D=3D 0) { goto unknown_op; @@ -3909,7 +3914,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); - op6->op[0](cpu_env, s->ptr0, s->ptr1); + op6->fn[0].op1(cpu_env, s->ptr0, s->ptr1); } =20 if (op6->flags & SSE_OPF_CMP) { @@ -4450,8 +4455,8 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, /* We only actually have one MMX instuction (palignr) */ assert(b =3D=3D 0x0f); =20 - op7->op[0](cpu_env, s->ptr0, s->ptr1, - tcg_const_i32(val)); + op7->fn[0].op1(cpu_env, s->ptr0, s->ptr1, + tcg_const_i32(val)); break; } =20 @@ -4477,7 +4482,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, =20 tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); - op7->op[b1](cpu_env, s->ptr0, s->ptr1, tcg_const_i32(val)); + op7->fn[b1].op1(cpu_env, s->ptr0, s->ptr1, tcg_const_i32(val)); if (op7->flags & SSE_OPF_CMP) { set_cc_op(s, CC_OP_EFLAGS); } @@ -4603,9 +4608,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); if (sse_op_flags & SSE_OPF_SHUF) { val =3D x86_ldub_code(env, s); - /* XXX: introduce a new table? */ - sse_fn_ppi =3D (SSEFunc_0_ppi)sse_fn_epp; - sse_fn_ppi(s->ptr0, s->ptr1, tcg_const_i32(val)); + sse_op_fn.op1i(s->ptr0, s->ptr1, tcg_const_i32(val)); } else if (b =3D=3D 0xf7) { /* maskmov : we must prepare A0 */ if (mod !=3D 3) { @@ -4614,17 +4617,13 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, tcg_gen_mov_tl(s->A0, cpu_regs[R_EDI]); gen_extu(s->aflag, s->A0); gen_add_A0_ds_seg(s); - - /* XXX: introduce a new table? 
*/ - sse_fn_eppt =3D (SSEFunc_0_eppt)sse_fn_epp; - sse_fn_eppt(cpu_env, s->ptr0, s->ptr1, s->A0); + sse_op_fn.op1t(cpu_env, s->ptr0, s->ptr1, s->A0); } else if (b =3D=3D 0xc2) { /* compare insns, bits 7:3 (7:5 for AVX) are ignored */ val =3D x86_ldub_code(env, s) & 7; - sse_fn_epp =3D sse_op_table4[val][b1]; - sse_fn_epp(cpu_env, s->ptr0, s->ptr1); + sse_op_table4[val][b1](cpu_env, s->ptr0, s->ptr1); } else { - sse_fn_epp(cpu_env, s->ptr0, s->ptr1); + sse_op_fn.op1(cpu_env, s->ptr0, s->ptr1); } =20 if (sse_op_flags & SSE_OPF_CMP) { --=20 2.37.1 From nobody Mon Feb 9 20:36:49 2026
From: Paolo Bonzini To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, paul@nowt.org Subject: [PATCH 11/23] i386: Add CHECK_NO_VEX Date: Sat, 27 Aug 2022 01:11:52 +0200 Message-Id: <20220826231204.201395-12-pbonzini@redhat.com> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20220826231204.201395-1-pbonzini@redhat.com> References: <20220826231204.201395-1-pbonzini@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8" From: Paul Brook Reject invalid VEX encodings on MMX instructions. Signed-off-by: Paul Brook Reviewed-by: Richard Henderson Message-Id: <20220424220204.2493824-7-paul@nowt.org> Signed-off-by: Paolo Bonzini --- target/i386/tcg/translate.c | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index f7e8cab52d..f155cbb667 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -3186,6 +3186,12 @@ static const struct SSEOpHelper_table7 sse_op_table7= [256] =3D { #undef BLENDV_OP #undef SPECIAL_OP =20 +/* VEX prefix not allowed */ +#define CHECK_NO_VEX(s) do { \ + if (s->prefix & PREFIX_VEX) \ + goto illegal_op; \ + } while (0) + static void gen_sse(CPUX86State *env, DisasContext *s, int b, target_ulong pc_start) { @@ -3272,6 +3278,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, b |=3D (b1 << 8); switch(b) { case 0x0e7: /* movntq */ + CHECK_NO_VEX(s); if (mod =3D=3D 3) { goto illegal_op; } @@ -3307,6 +3314,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } break; case 0x6e: /* movd mm,
ea */ + CHECK_NO_VEX(s); #ifdef TARGET_X86_64 if (s->dflag =3D=3D MO_64) { gen_ldst_modrm(env, s, modrm, MO_64, OR_TMP0, 0); @@ -3338,6 +3346,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } break; case 0x6f: /* movq mm, ea */ + CHECK_NO_VEX(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); gen_ldq_env_A0(s, offsetof(CPUX86State, fpregs[reg].mmx)); @@ -3473,6 +3482,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, break; case 0x178: case 0x378: + CHECK_NO_VEX(s); { int bit_index, field_length; =20 @@ -3492,6 +3502,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } break; case 0x7e: /* movd ea, mm */ + CHECK_NO_VEX(s); #ifdef TARGET_X86_64 if (s->dflag =3D=3D MO_64) { tcg_gen_ld_i64(s->T0, cpu_env, @@ -3532,6 +3543,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, gen_op_movq_env_0(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q= (1))); break; case 0x7f: /* movq ea, mm */ + CHECK_NO_VEX(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); gen_stq_env_A0(s, offsetof(CPUX86State, fpregs[reg].mmx)); @@ -3614,6 +3626,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, offsetof(CPUX86State, xmm_t0.ZMM_L(1))); op1_offset =3D offsetof(CPUX86State,xmm_t0); } else { + CHECK_NO_VEX(s); tcg_gen_movi_tl(s->T0, val); tcg_gen_st32_tl(s->T0, cpu_env, offsetof(CPUX86State, mmx_t0.MMX_L(0))); @@ -3653,6 +3666,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, break; case 0x02a: /* cvtpi2ps */ case 0x12a: /* cvtpi2pd */ + CHECK_NO_VEX(s); gen_helper_enter_mmx(cpu_env); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); @@ -3698,6 +3712,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, case 0x12c: /* cvttpd2pi */ case 0x02d: /* cvtps2pi */ case 0x12d: /* cvtpd2pi */ + CHECK_NO_VEX(s); gen_helper_enter_mmx(cpu_env); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); @@ -3771,6 +3786,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, 
tcg_gen_st16_tl(s->T0, cpu_env, offsetof(CPUX86State,xmm_regs[reg].ZMM_W(v= al))); } else { + CHECK_NO_VEX(s); val &=3D 3; tcg_gen_st16_tl(s->T0, cpu_env, offsetof(CPUX86State,fpregs[reg].mmx.MMX_W= (val))); @@ -3810,6 +3826,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } break; case 0x2d6: /* movq2dq */ + CHECK_NO_VEX(s); gen_helper_enter_mmx(cpu_env); rm =3D (modrm & 7); gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0)), @@ -3817,6 +3834,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, gen_op_movq_env_0(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q= (1))); break; case 0x3d6: /* movdq2q */ + CHECK_NO_VEX(s); gen_helper_enter_mmx(cpu_env); rm =3D (modrm & 7) | REX_B(s); gen_op_movq(s, offsetof(CPUX86State, fpregs[reg & 7].mmx), @@ -3831,6 +3849,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(rm)); gen_helper_pmovmskb_xmm(s->tmp2_i32, cpu_env, s->ptr0); } else { + CHECK_NO_VEX(s); rm =3D (modrm & 7); tcg_gen_addi_ptr(s->ptr0, cpu_env, offsetof(CPUX86State, fpregs[rm].mmx)); @@ -3901,6 +3920,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } op6->fn[b1].op1(cpu_env, s->ptr0, s->ptr1); } else { + CHECK_NO_VEX(s); if ((op6->flags & SSE_OPF_MMX) =3D=3D 0) { goto unknown_op; } @@ -3934,6 +3954,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, case 0x3f0: /* crc32 Gd,Eb */ case 0x3f1: /* crc32 Gd,Ey */ do_crc32: + CHECK_NO_VEX(s); if (!(s->cpuid_ext_features & CPUID_EXT_SSE42)) { goto illegal_op; } @@ -3956,6 +3977,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, =20 case 0x1f0: /* crc32 or movbe */ case 0x1f1: + CHECK_NO_VEX(s); /* For these insns, the f3 prefix is supposed to have prio= rity over the 66 prefix, but that's not what we implement ab= ove setting b1. 
*/ @@ -3965,6 +3987,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, /* FALLTHRU */ case 0x0f0: /* movbe Gy,My */ case 0x0f1: /* movbe My,Gy */ + CHECK_NO_VEX(s); if (!(s->cpuid_ext_features & CPUID_EXT_MOVBE)) { goto illegal_op; } @@ -4131,6 +4154,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, =20 case 0x1f6: /* adcx Gy, Ey */ case 0x2f6: /* adox Gy, Ey */ + CHECK_NO_VEX(s); if (!(s->cpuid_7_0_ebx_features & CPUID_7_0_EBX_ADX)) { goto illegal_op; } else { @@ -4436,6 +4460,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } =20 if (b1 =3D=3D 0) { + CHECK_NO_VEX(s); /* MMX */ if ((op7->flags & SSE_OPF_MMX) =3D=3D 0) { goto illegal_op; @@ -4582,6 +4607,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, op2_offset =3D ZMM_OFFSET(rm); } } else { + CHECK_NO_VEX(s); op1_offset =3D offsetof(CPUX86State,fpregs[reg].mmx); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); --=20 2.37.1 From nobody Mon Feb 9 20:36:49 2026
From: Paolo Bonzini To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, paul@nowt.org Subject: [PATCH 12/23] i386: Rewrite
vector shift helper Date: Sat, 27 Aug 2022 01:11:53 +0200 Message-Id: <20220826231204.201395-13-pbonzini@redhat.com> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20220826231204.201395-1-pbonzini@redhat.com> References: <20220826231204.201395-1-pbonzini@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Paul Brook Rewrite the vector shift helpers in preparation for AVX support (3-operand form and 256-bit vectors). For now, keep the existing two-operand interface. No functional changes to existing helpers.
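The loop-based form described in this message can be sketched standalone as follows. This is only an illustration under stated assumptions, not the patch's code: Vec128, psrlw_sketch and psraw_sketch are hypothetical names, the register width is fixed at 128 bits (the real helpers are generated per width through SHIFT and glue()), and the source is passed as a separate pointer to show why a distinct source operand makes a later 3-operand form straightforward.

```c
#include <stdint.h>

/* Hypothetical stand-in for QEMU's Reg: one 128-bit register viewed as
 * eight 16-bit words or two 64-bit quadwords. */
typedef union {
    uint16_t w[8];
    uint64_t q[2];
} Vec128;

/* Logical right shift of every 16-bit lane of s by the count in c->q[0].
 * A count above 15 clears the whole destination. d may alias s. */
static void psrlw_sketch(Vec128 *d, const Vec128 *s, const Vec128 *c)
{
    if (c->q[0] > 15) {
        for (int i = 0; i < 2; i++) {
            d->q[i] = 0;
        }
    } else {
        int shift = (int)c->q[0];
        for (int i = 0; i < 8; i++) {
            d->w[i] = (uint16_t)(s->w[i] >> shift);
        }
    }
}

/* Arithmetic right shift: a count above 15 saturates to 15 instead of
 * clearing, so each lane ends up filled with its sign bit.
 * Assumes the compiler shifts negative values arithmetically, which is
 * implementation-defined in ISO C but holds for mainstream compilers. */
static void psraw_sketch(Vec128 *d, const Vec128 *s, const Vec128 *c)
{
    int shift = c->q[0] > 15 ? 15 : (int)c->q[0];
    for (int i = 0; i < 8; i++) {
        d->w[i] = (uint16_t)((int16_t)s->w[i] >> shift);
    }
}
```

Calling psrlw_sketch(&v, &v, &c) reproduces the two-operand behaviour that the helpers keep for now by setting s = d.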
Signed-off-by: Paul Brook Message-Id: <20220424220204.2493824-11-paul@nowt.org> Reviewed-by: Richard Henderson Signed-off-by: Paolo Bonzini --- target/i386/ops_sse.h | 221 ++++++++++++++++++------------------------ 1 file changed, 96 insertions(+), 125 deletions(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index f603981ab8..8c745f5cab 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -56,195 +56,166 @@ #define MOVE(d, r) memcpy(&(d).B(0), &(r).B(0), SIZE) #endif =20 -void glue(helper_psrlw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) -{ - int shift; +#if SHIFT =3D=3D 0 +#define FPSRL(x, c) ((x) >> shift) +#define FPSRAW(x, c) ((int16_t)(x) >> shift) +#define FPSRAL(x, c) ((int32_t)(x) >> shift) +#define FPSLL(x, c) ((x) << shift) +#endif =20 - if (s->Q(0) > 15) { - d->Q(0) =3D 0; -#if SHIFT =3D=3D 1 - d->Q(1) =3D 0; -#endif +void glue(helper_psrlw, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) +{ + Reg *s =3D d; + int shift; + if (c->Q(0) > 15) { + for (int i =3D 0; i < 1 << SHIFT; i++) { + d->Q(i) =3D 0; + } } else { - shift =3D s->B(0); - d->W(0) >>=3D shift; - d->W(1) >>=3D shift; - d->W(2) >>=3D shift; - d->W(3) >>=3D shift; -#if SHIFT =3D=3D 1 - d->W(4) >>=3D shift; - d->W(5) >>=3D shift; - d->W(6) >>=3D shift; - d->W(7) >>=3D shift; -#endif + shift =3D c->B(0); + for (int i =3D 0; i < 4 << SHIFT; i++) { + d->W(i) =3D FPSRL(s->W(i), shift); + } } } =20 -void glue(helper_psraw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_psllw, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) { + Reg *s =3D d; int shift; + if (c->Q(0) > 15) { + for (int i =3D 0; i < 1 << SHIFT; i++) { + d->Q(i) =3D 0; + } + } else { + shift =3D c->B(0); + for (int i =3D 0; i < 4 << SHIFT; i++) { + d->W(i) =3D FPSLL(s->W(i), shift); + } + } +} =20 - if (s->Q(0) > 15) { +void glue(helper_psraw, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) +{ + Reg *s =3D d; + int shift; + if (c->Q(0) > 15) { shift =3D 15; } else { - shift =3D s->B(0); + shift =3D c->B(0); + } + 
for (int i =3D 0; i < 4 << SHIFT; i++) { + d->W(i) =3D FPSRAW(s->W(i), shift); } - d->W(0) =3D (int16_t)d->W(0) >> shift; - d->W(1) =3D (int16_t)d->W(1) >> shift; - d->W(2) =3D (int16_t)d->W(2) >> shift; - d->W(3) =3D (int16_t)d->W(3) >> shift; -#if SHIFT =3D=3D 1 - d->W(4) =3D (int16_t)d->W(4) >> shift; - d->W(5) =3D (int16_t)d->W(5) >> shift; - d->W(6) =3D (int16_t)d->W(6) >> shift; - d->W(7) =3D (int16_t)d->W(7) >> shift; -#endif } =20 -void glue(helper_psllw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_psrld, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) { + Reg *s =3D d; int shift; - - if (s->Q(0) > 15) { - d->Q(0) =3D 0; -#if SHIFT =3D=3D 1 - d->Q(1) =3D 0; -#endif + if (c->Q(0) > 31) { + for (int i =3D 0; i < 1 << SHIFT; i++) { + d->Q(i) =3D 0; + } } else { - shift =3D s->B(0); - d->W(0) <<=3D shift; - d->W(1) <<=3D shift; - d->W(2) <<=3D shift; - d->W(3) <<=3D shift; -#if SHIFT =3D=3D 1 - d->W(4) <<=3D shift; - d->W(5) <<=3D shift; - d->W(6) <<=3D shift; - d->W(7) <<=3D shift; -#endif + shift =3D c->B(0); + for (int i =3D 0; i < 2 << SHIFT; i++) { + d->L(i) =3D FPSRL(s->L(i), shift); + } } } =20 -void glue(helper_psrld, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_pslld, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) { + Reg *s =3D d; int shift; - - if (s->Q(0) > 31) { - d->Q(0) =3D 0; -#if SHIFT =3D=3D 1 - d->Q(1) =3D 0; -#endif + if (c->Q(0) > 31) { + for (int i =3D 0; i < 1 << SHIFT; i++) { + d->Q(i) =3D 0; + } } else { - shift =3D s->B(0); - d->L(0) >>=3D shift; - d->L(1) >>=3D shift; -#if SHIFT =3D=3D 1 - d->L(2) >>=3D shift; - d->L(3) >>=3D shift; -#endif + shift =3D c->B(0); + for (int i =3D 0; i < 2 << SHIFT; i++) { + d->L(i) =3D FPSLL(s->L(i), shift); + } } } =20 -void glue(helper_psrad, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_psrad, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) { + Reg *s =3D d; int shift; - - if (s->Q(0) > 31) { + if (c->Q(0) > 31) { shift =3D 31; } else { - shift =3D s->B(0); + shift =3D 
c->B(0); + } + for (int i =3D 0; i < 2 << SHIFT; i++) { + d->L(i) =3D FPSRAL(s->L(i), shift); } - d->L(0) =3D (int32_t)d->L(0) >> shift; - d->L(1) =3D (int32_t)d->L(1) >> shift; -#if SHIFT =3D=3D 1 - d->L(2) =3D (int32_t)d->L(2) >> shift; - d->L(3) =3D (int32_t)d->L(3) >> shift; -#endif } =20 -void glue(helper_pslld, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_psrlq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) { + Reg *s =3D d; int shift; - - if (s->Q(0) > 31) { - d->Q(0) =3D 0; -#if SHIFT =3D=3D 1 - d->Q(1) =3D 0; -#endif + if (c->Q(0) > 63) { + for (int i =3D 0; i < 1 << SHIFT; i++) { + d->Q(i) =3D 0; + } } else { - shift =3D s->B(0); - d->L(0) <<=3D shift; - d->L(1) <<=3D shift; -#if SHIFT =3D=3D 1 - d->L(2) <<=3D shift; - d->L(3) <<=3D shift; -#endif + shift =3D c->B(0); + for (int i =3D 0; i < 1 << SHIFT; i++) { + d->Q(i) =3D FPSRL(s->Q(i), shift); + } } } =20 -void glue(helper_psrlq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_psllq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) { + Reg *s =3D d; int shift; - - if (s->Q(0) > 63) { - d->Q(0) =3D 0; -#if SHIFT =3D=3D 1 - d->Q(1) =3D 0; -#endif + if (c->Q(0) > 63) { + for (int i =3D 0; i < 1 << SHIFT; i++) { + d->Q(i) =3D 0; + } } else { - shift =3D s->B(0); - d->Q(0) >>=3D shift; -#if SHIFT =3D=3D 1 - d->Q(1) >>=3D shift; -#endif + shift =3D c->B(0); + for (int i =3D 0; i < 1 << SHIFT; i++) { + d->Q(i) =3D FPSLL(s->Q(i), shift); + } } } =20 -void glue(helper_psllq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) -{ - int shift; - - if (s->Q(0) > 63) { - d->Q(0) =3D 0; -#if SHIFT =3D=3D 1 - d->Q(1) =3D 0; -#endif - } else { - shift =3D s->B(0); - d->Q(0) <<=3D shift; -#if SHIFT =3D=3D 1 - d->Q(1) <<=3D shift; -#endif - } -} - -#if SHIFT =3D=3D 1 -void glue(helper_psrldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +#if SHIFT >=3D 1 +void glue(helper_psrldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) { + Reg *s =3D d; int shift, i; =20 - shift =3D s->L(0); + shift =3D c->L(0); if (shift > 16) { 
shift =3D 16; } for (i =3D 0; i < 16 - shift; i++) { - d->B(i) =3D d->B(i + shift); + d->B(i) =3D s->B(i + shift); } for (i =3D 16 - shift; i < 16; i++) { d->B(i) =3D 0; } } =20 -void glue(helper_pslldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_pslldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) { + Reg *s =3D d; int shift, i; =20 - shift =3D s->L(0); + shift =3D c->L(0); if (shift > 16) { shift =3D 16; } for (i =3D 15; i >=3D shift; i--) { - d->B(i) =3D d->B(i - shift); + d->B(i) =3D s->B(i - shift); } for (i =3D 0; i < shift; i++) { d->B(i) =3D 0; --=20 2.37.1 From nobody Mon Feb 9 20:36:49 2026
5JQahjwIw4T4jwfjOkNVZt98msp98dE= X-MC-Unique: 4kMN76ZmO0K_d6xYtrfsHg-1 X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc; bh=vLmD0zmrzApy3raDGv/uUpAiWJQdUPxyPzWTBxnFFXI=; b=4qJXBYoFiu2ZU/Jm4b2CTWHYlWxME5h86cK64LK2I4zDR+qZNGA4TSyFJ4EdRBCEHQ /6/AMxMKOQjSxzcmJmtmakm5vA2zG47TyTJKCLtkn1Q+s5+z29lXIDK4VCwJmXFFBtN1 gT6Tt/t1a4otlJK6TQBy34GdfmDYsmxQk/UxRBYw+DpbL9nMkBIOCZW9DxKVkkgyRuD1 ZNgnha0EJrSVlDW0UeqkKqM2EFoVKQw4nZ8oWwpr5MEEjOZxnLpecYW9pz6f4lMQbNcH btqVIAJWPoAKZbeDdLcsFFb0rdU/SwDeh3on1HyHc798XnAfWF+Kuw8R/E6PDvb+uCgD al9g== X-Gm-Message-State: ACgBeo1P84zpzHaay7wT0Sp4SkMDg8C11E59FILuBFBHyXEIrdgrJIB5 5SC7W65mshVdXvar1UPNoX33YI5dSs9ijY0b1bZ78YGj6kNFWO78D1QWgD9J5ajFJXdW/4eKOda ktwod1tv8KXf6xK8SFvv0E3KY8Q0Sf21t6bW8J0uBc49w2LvyU+gg8kgxjq+zYHxSf60= X-Received: by 2002:a05:6402:5cd:b0:446:5965:f4af with SMTP id n13-20020a05640205cd00b004465965f4afmr8310324edx.12.1661555550910; Fri, 26 Aug 2022 16:12:30 -0700 (PDT) X-Google-Smtp-Source: AA6agR6wg1uBEmKVv/jrvOVhg2pF6L3yIoy/aoH31LeyYmTWQx0EFQbMQSnlYxA7J/UjHCa/SwI8YA== X-Received: by 2002:a05:6402:5cd:b0:446:5965:f4af with SMTP id n13-20020a05640205cd00b004465965f4afmr8310306edx.12.1661555550531; Fri, 26 Aug 2022 16:12:30 -0700 (PDT) From: Paolo Bonzini To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, paul@nowt.org Subject: [PATCH 13/23] i386: Rewrite simple integer vector helpers Date: Sat, 27 Aug 2022 01:11:54 +0200 Message-Id: <20220826231204.201395-14-pbonzini@redhat.com> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20220826231204.201395-1-pbonzini@redhat.com> References: <20220826231204.201395-1-pbonzini@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; 
envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=pbonzini@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01, T_SPF_HELO_TEMPERROR=0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1661556525751100001 Content-Type: text/plain; charset="utf-8" From: Paul Brook Rewrite the "simple" vector integer helpers in preparation for AVX support. While the current code is able to use the same prototype for unary (a =3D F(b)) and binary (a =3D F(b, c)) operations, future changes will cause them to diverge. 
No functional changes to existing helpers Signed-off-by: Paul Brook Message-Id: <20220424220204.2493824-12-paul@nowt.org> Reviewed-by: Richard Henderson Signed-off-by: Paolo Bonzini --- target/i386/ops_sse.h | 83 +++++++++++++++---------------------------- 1 file changed, 28 insertions(+), 55 deletions(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index 8c745f5cab..8395733432 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -223,63 +223,36 @@ void glue(helper_pslldq, SUFFIX)(CPUX86State *env, Re= g *d, Reg *c) } #endif =20 -#define SSE_HELPER_B(name, F) \ +#define SSE_HELPER_1(name, elem, num, F) \ void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ { \ - d->B(0) =3D F(d->B(0), s->B(0)); \ - d->B(1) =3D F(d->B(1), s->B(1)); \ - d->B(2) =3D F(d->B(2), s->B(2)); \ - d->B(3) =3D F(d->B(3), s->B(3)); \ - d->B(4) =3D F(d->B(4), s->B(4)); \ - d->B(5) =3D F(d->B(5), s->B(5)); \ - d->B(6) =3D F(d->B(6), s->B(6)); \ - d->B(7) =3D F(d->B(7), s->B(7)); \ - XMM_ONLY( \ - d->B(8) =3D F(d->B(8), s->B(8)); \ - d->B(9) =3D F(d->B(9), s->B(9)); \ - d->B(10) =3D F(d->B(10), s->B(10)); \ - d->B(11) =3D F(d->B(11), s->B(11)); \ - d->B(12) =3D F(d->B(12), s->B(12)); \ - d->B(13) =3D F(d->B(13), s->B(13)); \ - d->B(14) =3D F(d->B(14), s->B(14)); \ - d->B(15) =3D F(d->B(15), s->B(15)); \ - ) \ - } + int n =3D num; \ + for (int i =3D 0; i < n; i++) { \ + d->elem(i) =3D F(s->elem(i)); \ + } \ + } + +#define SSE_HELPER_2(name, elem, num, F) \ + void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ + { \ + Reg *v =3D d; \ + int n =3D num; \ + for (int i =3D 0; i < n; i++) { \ + d->elem(i) =3D F(v->elem(i), s->elem(i)); \ + } \ + } + +#define SSE_HELPER_B(name, F) \ + SSE_HELPER_2(name, B, 8 << SHIFT, F) =20 #define SSE_HELPER_W(name, F) \ - void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ - { \ - d->W(0) =3D F(d->W(0), s->W(0)); \ - d->W(1) =3D F(d->W(1), s->W(1)); \ - d->W(2) =3D F(d->W(2), s->W(2)); \ - d->W(3) =3D F(d->W(3), 
s->W(3)); \ - XMM_ONLY( \ - d->W(4) =3D F(d->W(4), s->W(4)); \ - d->W(5) =3D F(d->W(5), s->W(5)); \ - d->W(6) =3D F(d->W(6), s->W(6)); \ - d->W(7) =3D F(d->W(7), s->W(7)); \ - ) \ - } + SSE_HELPER_2(name, W, 4 << SHIFT, F) =20 #define SSE_HELPER_L(name, F) \ - void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ - { \ - d->L(0) =3D F(d->L(0), s->L(0)); \ - d->L(1) =3D F(d->L(1), s->L(1)); \ - XMM_ONLY( \ - d->L(2) =3D F(d->L(2), s->L(2)); \ - d->L(3) =3D F(d->L(3), s->L(3)); \ - ) \ - } + SSE_HELPER_2(name, L, 2 << SHIFT, F) =20 #define SSE_HELPER_Q(name, F) \ - void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ - { \ - d->Q(0) =3D F(d->Q(0), s->Q(0)); \ - XMM_ONLY( \ - d->Q(1) =3D F(d->Q(1), s->Q(1)); \ - ) \ - } + SSE_HELPER_2(name, Q, 1 << SHIFT, F) =20 #if SHIFT =3D=3D 0 static inline int satub(int x) @@ -1538,12 +1511,12 @@ void glue(helper_phsubsw, SUFFIX)(CPUX86State *env,= Reg *d, Reg *s) MOVE(*d, r); } =20 -#define FABSB(_, x) (x > INT8_MAX ? -(int8_t)x : x) -#define FABSW(_, x) (x > INT16_MAX ? -(int16_t)x : x) -#define FABSL(_, x) (x > INT32_MAX ? -(int32_t)x : x) -SSE_HELPER_B(helper_pabsb, FABSB) -SSE_HELPER_W(helper_pabsw, FABSW) -SSE_HELPER_L(helper_pabsd, FABSL) +#define FABSB(x) (x > INT8_MAX ? -(int8_t)x : x) +#define FABSW(x) (x > INT16_MAX ? -(int16_t)x : x) +#define FABSL(x) (x > INT32_MAX ? 
-(int32_t)x : x) +SSE_HELPER_1(helper_pabsb, B, 8 << SHIFT, FABSB) +SSE_HELPER_1(helper_pabsw, W, 4 << SHIFT, FABSW) +SSE_HELPER_1(helper_pabsd, L, 2 << SHIFT, FABSL) =20 #define FMULHRSW(d, s) (((int16_t) d * (int16_t)s + 0x4000) >> 15) SSE_HELPER_W(helper_pmulhrsw, FMULHRSW) --=20 2.37.1 From nobody Mon Feb 9 20:36:49 2026 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1661556391; cv=none; d=zohomail.com; s=zohoarc; b=fyaYlrcKR6gSl4HKWiWCs9VHIr9lBP9PyvzsvhrSPzY5VflkIQ/ryhD33dbvSKhnmtxGAO0Ls+T6ABcOzdxG13qVWZ80yG+1f7UliSgdgfkrvANTgpgSPPz3pj0uVmhbh3pRzJzcDCNzRZVgXT8CAdrbh45Rn78cePbn4Wl2aCY= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1661556391; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=OoHj0LaqAZEbzfPId1kM7cw3m7eW7SNwOHuk0WbTBQ8=; b=l6RW0xBCnOf9OXmjcY3cdeOqeo+9Z34vob7cNvDkY4mSO9key9CC9h7DSEm8K8Seclk54UgHrqFRdL2ChtfRxu1Rmosw0Ed3uoUyNudlee4xITAPK4Q3t3IIYs8F6wxXObl6zwRmQZ4C5RE6LC7wvQQLYGsRr8SkxUEskVhcxCk= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 166155639176442.85145930565409; Fri, 26 Aug 2022 16:26:31 -0700 (PDT) Received: from localhost ([::1]:60720 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oRiib-0001N1-Qc for importer@patchew.org; Fri, 26 
Aug 2022 19:26:29 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:42900) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oRiVj-0007GD-Ui for qemu-devel@nongnu.org; Fri, 26 Aug 2022 19:13:12 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:21414) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oRiVP-0007qe-KP for qemu-devel@nongnu.org; Fri, 26 Aug 2022 19:13:11 -0400 Received: from mail-ed1-f69.google.com (mail-ed1-f69.google.com [209.85.208.69]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-673-xOufiQrvMIeRRCbc-5FRMg-1; Fri, 26 Aug 2022 19:12:39 -0400 Received: by mail-ed1-f69.google.com with SMTP id y20-20020a056402359400b00447a871c48fso1863206edc.3 for ; Fri, 26 Aug 2022 16:12:38 -0700 (PDT) Received: from goa-sendmail ([2001:b07:6468:f312:1c09:f536:3de6:228c]) by smtp.gmail.com with ESMTPSA id q18-20020a17090676d200b00730860b6c43sm1387051ejn.173.2022.08.26.16.12.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 26 Aug 2022 16:12:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1661555560; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=OoHj0LaqAZEbzfPId1kM7cw3m7eW7SNwOHuk0WbTBQ8=; b=JJGT/EjJG+Y7J57Z3T9JbyboSV1hyX8llf0gxgLg5zEy83bOKc5UK1as73RJL9oIl6YcSH b+J1hDx8Zctyy+VwbJb47YEr5oJMtJhitHCxjEWkhOxlZs+9Uw0VhhBkYGW0QPX6LPSC0k cic070fa8UoFkE+L6IcuFL3uWoQ3mC4= X-MC-Unique: xOufiQrvMIeRRCbc-5FRMg-1 X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to 
:message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc; bh=OoHj0LaqAZEbzfPId1kM7cw3m7eW7SNwOHuk0WbTBQ8=; b=srrKpRAyKkoTI/wkFZKBiLsdXHptu6mS17pEJ1cMapDt/1ndlFMzPquwMGnybTocvd Ey/QZrHrSuHq1HB1oZ42hx3GVvhQ/5fXvkplGvfOHIYTLdymj3t8GYmMArkr4pKzwvHM +eA1nxYb6h+qGzX+PzaT4vRax8oekBFxZQZ7VuhXs+T7/wYROJH2ANywSM5ODDhBQBPK hG3XjC/PpzQAWvP8CCEKjV2SmvHdhfCjRhB9yX6/i1Tk0LhzFXIEsIVWkC0nbqhptlp1 lj2TvBCRJo6xc+u7sivOgfV8y1hq6OI9Z4SeesyW+Rp1YRNdlRRCm/7aDKZq3Ou8Pm9v 5VTg== X-Gm-Message-State: ACgBeo0vDj0UYJOQJs+iD6sX+gARufuoU2MRPx+7whYKWQp4Qb+C3R7p Tl54li+K6F/pkLuQDc1rSRy4amqRVdO9nz1p6a62XOfhwlveh/6jzKZFOboJ7gd8Bueutdzsn+j YbgZdLgamirV/bdf5AdN1YunNSaKkMw5t8ZvfE3cFxm00xDV3j0enIT15mkbVOpnI4Pw= X-Received: by 2002:a05:6402:1745:b0:448:116b:12fb with SMTP id v5-20020a056402174500b00448116b12fbmr1446997edx.421.1661555557279; Fri, 26 Aug 2022 16:12:37 -0700 (PDT) X-Google-Smtp-Source: AA6agR5cEcdw3ybKm0akVerCDOiGkZBlyMWPJoANj+o762T1vKPs6Yf2HTAK1eDikPYKryv+VE0wUA== X-Received: by 2002:a05:6402:1745:b0:448:116b:12fb with SMTP id v5-20020a056402174500b00448116b12fbmr1446978edx.421.1661555556872; Fri, 26 Aug 2022 16:12:36 -0700 (PDT) From: Paolo Bonzini To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, paul@nowt.org Subject: [PATCH 14/23] i386: Misc integer AVX helper prep Date: Sat, 27 Aug 2022 01:11:55 +0200 Message-Id: <20220826231204.201395-15-pbonzini@redhat.com> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20220826231204.201395-1-pbonzini@redhat.com> References: <20220826231204.201395-1-pbonzini@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=pbonzini@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 
5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, T_SCC_BODY_TEXT_LINE=-0.01, T_SPF_TEMPERROR=0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1661556393105100001 Content-Type: text/plain; charset="utf-8" From: Paul Brook More preparatory work for AVX support in various integer vector helpers No functional changes to existing helpers. Signed-off-by: Paul Brook Message-Id: <20220424220204.2493824-13-paul@nowt.org> Reviewed-by: Richard Henderson Signed-off-by: Paolo Bonzini --- target/i386/ops_sse.h | 164 +++++++++++++++++++++--------------------- 1 file changed, 80 insertions(+), 84 deletions(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index 8395733432..9ea763cad2 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -384,19 +384,22 @@ SSE_HELPER_W(helper_pavgw, FAVG) =20 void glue(helper_pmuludq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - d->Q(0) =3D (uint64_t)s->L(0) * (uint64_t)d->L(0); -#if SHIFT =3D=3D 1 - d->Q(1) =3D (uint64_t)s->L(2) * (uint64_t)d->L(2); -#endif + Reg *v =3D d; + int i; + + for (i =3D 0; i < (1 << SHIFT); i++) { + d->Q(i) =3D (uint64_t)s->L(i * 2) * (uint64_t)v->L(i * 2); + } } =20 void glue(helper_pmaddwd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { + Reg *v =3D d; int i; =20 for (i =3D 0; i < (2 << SHIFT); i++) { - d->L(i) =3D (int16_t)s->W(2 * i) * (int16_t)d->W(2 * i) + - (int16_t)s->W(2 * i + 1) * (int16_t)d->W(2 * i + 1); + d->L(i) =3D (int16_t)s->W(2 * i) * (int16_t)v->W(2 * i) + + (int16_t)s->W(2 * i + 1) * (int16_t)v->W(2 * i + 1); } } =20 @@ -410,32 +413,24 @@ static inline int abs1(int a) } } 
#endif + void glue(helper_psadbw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - unsigned int val; + Reg *v =3D d; + int i; =20 - val =3D 0; - val +=3D abs1(d->B(0) - s->B(0)); - val +=3D abs1(d->B(1) - s->B(1)); - val +=3D abs1(d->B(2) - s->B(2)); - val +=3D abs1(d->B(3) - s->B(3)); - val +=3D abs1(d->B(4) - s->B(4)); - val +=3D abs1(d->B(5) - s->B(5)); - val +=3D abs1(d->B(6) - s->B(6)); - val +=3D abs1(d->B(7) - s->B(7)); - d->Q(0) =3D val; -#if SHIFT =3D=3D 1 - val =3D 0; - val +=3D abs1(d->B(8) - s->B(8)); - val +=3D abs1(d->B(9) - s->B(9)); - val +=3D abs1(d->B(10) - s->B(10)); - val +=3D abs1(d->B(11) - s->B(11)); - val +=3D abs1(d->B(12) - s->B(12)); - val +=3D abs1(d->B(13) - s->B(13)); - val +=3D abs1(d->B(14) - s->B(14)); - val +=3D abs1(d->B(15) - s->B(15)); - d->Q(1) =3D val; -#endif + for (i =3D 0; i < (1 << SHIFT); i++) { + unsigned int val =3D 0; + val +=3D abs1(v->B(8 * i + 0) - s->B(8 * i + 0)); + val +=3D abs1(v->B(8 * i + 1) - s->B(8 * i + 1)); + val +=3D abs1(v->B(8 * i + 2) - s->B(8 * i + 2)); + val +=3D abs1(v->B(8 * i + 3) - s->B(8 * i + 3)); + val +=3D abs1(v->B(8 * i + 4) - s->B(8 * i + 4)); + val +=3D abs1(v->B(8 * i + 5) - s->B(8 * i + 5)); + val +=3D abs1(v->B(8 * i + 6) - s->B(8 * i + 6)); + val +=3D abs1(v->B(8 * i + 7) - s->B(8 * i + 7)); + d->Q(i) =3D val; + } } =20 void glue(helper_maskmov, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, @@ -452,20 +447,24 @@ void glue(helper_maskmov, SUFFIX)(CPUX86State *env, R= eg *d, Reg *s, =20 void glue(helper_movl_mm_T0, SUFFIX)(Reg *d, uint32_t val) { + int i; + d->L(0) =3D val; d->L(1) =3D 0; -#if SHIFT =3D=3D 1 - d->Q(1) =3D 0; -#endif + for (i =3D 1; i < (1 << SHIFT); i++) { + d->Q(i) =3D 0; + } } =20 #ifdef TARGET_X86_64 void glue(helper_movq_mm_T0, SUFFIX)(Reg *d, uint64_t val) { + int i; + d->Q(0) =3D val; -#if SHIFT =3D=3D 1 - d->Q(1) =3D 0; -#endif + for (i =3D 1; i < (1 << SHIFT); i++) { + d->Q(i) =3D 0; + } } #endif =20 @@ -1068,26 +1067,21 @@ uint32_t glue(helper_movmskpd, 
SUFFIX)(CPUX86State = *env, Reg *s) uint32_t glue(helper_pmovmskb, SUFFIX)(CPUX86State *env, Reg *s) { uint32_t val; + int i; =20 val =3D 0; - val |=3D (s->B(0) >> 7); - val |=3D (s->B(1) >> 6) & 0x02; - val |=3D (s->B(2) >> 5) & 0x04; - val |=3D (s->B(3) >> 4) & 0x08; - val |=3D (s->B(4) >> 3) & 0x10; - val |=3D (s->B(5) >> 2) & 0x20; - val |=3D (s->B(6) >> 1) & 0x40; - val |=3D (s->B(7)) & 0x80; -#if SHIFT =3D=3D 1 - val |=3D (s->B(8) << 1) & 0x0100; - val |=3D (s->B(9) << 2) & 0x0200; - val |=3D (s->B(10) << 3) & 0x0400; - val |=3D (s->B(11) << 4) & 0x0800; - val |=3D (s->B(12) << 5) & 0x1000; - val |=3D (s->B(13) << 6) & 0x2000; - val |=3D (s->B(14) << 7) & 0x4000; - val |=3D (s->B(15) << 8) & 0x8000; -#endif + for (i =3D 0; i < (1 << SHIFT); i++) { + uint8_t byte =3D 0; + byte |=3D (s->B(8 * i + 0) >> 7); + byte |=3D (s->B(8 * i + 1) >> 6) & 0x02; + byte |=3D (s->B(8 * i + 2) >> 5) & 0x04; + byte |=3D (s->B(8 * i + 3) >> 4) & 0x08; + byte |=3D (s->B(8 * i + 4) >> 3) & 0x10; + byte |=3D (s->B(8 * i + 5) >> 2) & 0x20; + byte |=3D (s->B(8 * i + 6) >> 1) & 0x40; + byte |=3D (s->B(8 * i + 7)) & 0x80; + val |=3D byte << (8 * i); + } return val; } =20 @@ -1632,46 +1626,48 @@ SSE_HELPER_V(helper_blendvpd, Q, 2, FBLENDVPD) =20 void glue(helper_ptest, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - uint64_t zf =3D (s->Q(0) & d->Q(0)) | (s->Q(1) & d->Q(1)); - uint64_t cf =3D (s->Q(0) & ~d->Q(0)) | (s->Q(1) & ~d->Q(1)); + uint64_t zf =3D 0, cf =3D 0; + int i; =20 + for (i =3D 0; i < 1 << SHIFT; i++) { + zf |=3D (s->Q(i) & d->Q(i)); + cf |=3D (s->Q(i) & ~d->Q(i)); + } CC_SRC =3D (zf ? 0 : CC_Z) | (cf ? 
0 : CC_C); } =20 -#define SSE_HELPER_F(name, elem, num, F) \ - void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ - { \ - if (num > 2) { \ - if (num > 4) { \ - d->elem(7) =3D F(7); \ - d->elem(6) =3D F(6); \ - d->elem(5) =3D F(5); \ - d->elem(4) =3D F(4); \ - } \ - d->elem(3) =3D F(3); \ - d->elem(2) =3D F(2); \ - } \ - d->elem(1) =3D F(1); \ - d->elem(0) =3D F(0); \ +#define SSE_HELPER_F(name, elem, num, F) \ + void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ + { \ + int n =3D num; \ + for (int i =3D n; --i >=3D 0; ) { \ + d->elem(i) =3D F(i); \ + } \ } =20 -SSE_HELPER_F(helper_pmovsxbw, W, 8, (int8_t) s->B) -SSE_HELPER_F(helper_pmovsxbd, L, 4, (int8_t) s->B) -SSE_HELPER_F(helper_pmovsxbq, Q, 2, (int8_t) s->B) -SSE_HELPER_F(helper_pmovsxwd, L, 4, (int16_t) s->W) -SSE_HELPER_F(helper_pmovsxwq, Q, 2, (int16_t) s->W) -SSE_HELPER_F(helper_pmovsxdq, Q, 2, (int32_t) s->L) -SSE_HELPER_F(helper_pmovzxbw, W, 8, s->B) -SSE_HELPER_F(helper_pmovzxbd, L, 4, s->B) -SSE_HELPER_F(helper_pmovzxbq, Q, 2, s->B) -SSE_HELPER_F(helper_pmovzxwd, L, 4, s->W) -SSE_HELPER_F(helper_pmovzxwq, Q, 2, s->W) -SSE_HELPER_F(helper_pmovzxdq, Q, 2, s->L) +#if SHIFT > 0 +SSE_HELPER_F(helper_pmovsxbw, W, 4 << SHIFT, (int8_t) s->B) +SSE_HELPER_F(helper_pmovsxbd, L, 2 << SHIFT, (int8_t) s->B) +SSE_HELPER_F(helper_pmovsxbq, Q, 1 << SHIFT, (int8_t) s->B) +SSE_HELPER_F(helper_pmovsxwd, L, 2 << SHIFT, (int16_t) s->W) +SSE_HELPER_F(helper_pmovsxwq, Q, 1 << SHIFT, (int16_t) s->W) +SSE_HELPER_F(helper_pmovsxdq, Q, 1 << SHIFT, (int32_t) s->L) +SSE_HELPER_F(helper_pmovzxbw, W, 4 << SHIFT, s->B) +SSE_HELPER_F(helper_pmovzxbd, L, 2 << SHIFT, s->B) +SSE_HELPER_F(helper_pmovzxbq, Q, 1 << SHIFT, s->B) +SSE_HELPER_F(helper_pmovzxwd, L, 2 << SHIFT, s->W) +SSE_HELPER_F(helper_pmovzxwq, Q, 1 << SHIFT, s->W) +SSE_HELPER_F(helper_pmovzxdq, Q, 1 << SHIFT, s->L) +#endif =20 void glue(helper_pmuldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - d->Q(0) =3D (int64_t)(int32_t) d->L(0) * (int32_t) s->L(0); - 
d->Q(1) =3D (int64_t)(int32_t) d->L(2) * (int32_t) s->L(2); + Reg *v =3D d; + int i; + + for (i =3D 0; i < 1 << SHIFT; i++) { + d->Q(i) =3D (int64_t)(int32_t) v->L(2 * i) * (int32_t) s->L(2 * i); + } } =20 #define FCMPEQQ(d, s) (d =3D=3D s ? -1 : 0) --=20 2.37.1 From nobody Mon Feb 9 20:36:49 2026 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1661556814; cv=none; d=zohomail.com; s=zohoarc; b=fbuFdK6AJQeAqgUKGNB5t8g3PdYG3Z6Vb9GsRz9vSflvIR/qMkeGmjcwAVxAPSCkMFC2zHp1TE0UUAGMOhDDARZrxdTrLE+xwwMAdGXWWpcJE8+47eto+aaYJ59my8K/hf0wbozMfhmcD+X4eu3nqg7Jno5/6tQiSxlvlsdmYck= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1661556814; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=wMFaujhfGUVRTaOpHWl5IcvAxPSahlt6RD3nhGBR4F4=; b=ENA18yLLbcTSu27ZOf/TLiibH2Ln4i5EHJv+uiyfXQx6Ufe1nB2iLPMjEeYX0/MHB3ngv5iq+UXnXhTz5CDn38ByIMHqguPIRn0Nz4QL4tlLUVVWfcXAW9a3RzoO5LLvnBXUXI1YQxbilxIf04nkJlGdd/Sug93MHgAD26I6wZA= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1661556814157605.1415317586823; Fri, 26 Aug 2022 16:33:34 -0700 (PDT) Received: from localhost ([::1]:33476 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oRipQ-0003jn-KQ for importer@patchew.org; Fri, 26 Aug 2022 19:33:32 -0400 
Received: from eggs.gnu.org ([2001:470:142:3::10]:42906) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oRiVk-0007JV-UD for qemu-devel@nongnu.org; Fri, 26 Aug 2022 19:13:12 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:38683) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oRiVV-0007qn-Hc for qemu-devel@nongnu.org; Fri, 26 Aug 2022 19:13:12 -0400 Received: from mail-ed1-f72.google.com (mail-ed1-f72.google.com [209.85.208.72]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-530-Ycccb2CYMzuf3uvRJ-wlMw-1; Fri, 26 Aug 2022 19:12:42 -0400 Received: by mail-ed1-f72.google.com with SMTP id y20-20020a056402359400b00447a871c48fso1863278edc.3 for ; Fri, 26 Aug 2022 16:12:42 -0700 (PDT) Received: from goa-sendmail ([93.56.160.208]) by smtp.gmail.com with ESMTPSA id l3-20020a1709065a8300b0073c9d68ca0dsm1406211ejq.133.2022.08.26.16.12.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 26 Aug 2022 16:12:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1661555563; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=wMFaujhfGUVRTaOpHWl5IcvAxPSahlt6RD3nhGBR4F4=; b=TenYnjCb58lSIb4VNFZguf/2axuNP/PgFZ0HsyNXz9C572DTqHcpvWLusUd7JQkkYbA6Lb kXtCo618xPNVISWX8BAVeQ2FyRbkHiVIRIvuAuA1KcV0rEp7ydS4ig/KO77mUz1m7mosgT sY8Z+WETuo7mXwxUyG5iTxQTx+ZtDY4= X-MC-Unique: Ycccb2CYMzuf3uvRJ-wlMw-1 X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc; 
bh=wMFaujhfGUVRTaOpHWl5IcvAxPSahlt6RD3nhGBR4F4=; b=p4dU6mnOW/mwcPAhOqnhF+SD0OTHMYmwnFvkhGpTFvsqkj5zLux+4k3TZs+RaiqOPc 3yq8YsyFLYMSFdW4EXYAQZ8eJgmb0oEgiigFPs7hKDzwIYwOqzcg25OpdRZ6mSCytFtA 7d2jDp50wknmxiOc/H5SsSIYvgMcPICwvqkHYpQxKfQXaArDMMAQ80MtM43BnMfyl0M3 dk5JYuuOEnEBhgA+iyOVGWVHjzRNHzIZcARQYL9O7q1wmITHCFHFgM+hUwRdN49BvpH3 XE8mqJelb52hBG5K7XmoB7fWA8J9CUODZX4X2g/brpbbmMQPLWPDFhQXcY4ADLQDcqfp 7NOA== X-Gm-Message-State: ACgBeo1aUKy8psCOhVp0wGVwPujf6n+n1Q502rWxhT74he1LltNPLCzP MLn/T1Sa20+KL3z20zV0y23ReDy3tqmxBPl5wZpG4o8+SoHV7/NIbCQ6ELisB9ttejRpwqZKb8d 0r5OmH+jsAqNhWDYMz6NgxbVc1QXLPBFyI3SjzvbF618cOd5oVnCT4w6QM1bfna4KQYk= X-Received: by 2002:a17:907:3f11:b0:73d:94a6:944b with SMTP id hq17-20020a1709073f1100b0073d94a6944bmr6729940ejc.228.1661555560293; Fri, 26 Aug 2022 16:12:40 -0700 (PDT) X-Google-Smtp-Source: AA6agR73XT6c+WLijZNzzAmRvAvcH/CX5so3rkGpKeFrwsArWkxdjtWTSLU/yATQ2TJIElrw48IHVw== X-Received: by 2002:a17:907:3f11:b0:73d:94a6:944b with SMTP id hq17-20020a1709073f1100b0073d94a6944bmr6729916ejc.228.1661555559590; Fri, 26 Aug 2022 16:12:39 -0700 (PDT) From: Paolo Bonzini To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, paul@nowt.org Subject: [PATCH 15/23] i386: Destructive vector helpers for AVX Date: Sat, 27 Aug 2022 01:11:56 +0200 Message-Id: <20220826231204.201395-16-pbonzini@redhat.com> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20220826231204.201395-1-pbonzini@redhat.com> References: <20220826231204.201395-1-pbonzini@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=pbonzini@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, 
DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1661556815573100001 Content-Type: text/plain; charset="utf-8" From: Paul Brook These helpers need to take special care to avoid overwriting source values before the whole result has been calculated. Currently they use a dummy Reg typed variable to store the result then assign the whole register. This will cause 128-bit operations to corrupt the upper half of the register, so replace it with explicit temporaries and element assignments. Signed-off-by: Paul Brook Message-Id: <20220424220204.2493824-14-paul@nowt.org> Signed-off-by: Paolo Bonzini --- target/i386/ops_sse.h | 582 +++++++++++++++++++++--------------------- 1 file changed, 289 insertions(+), 293 deletions(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index 9ea763cad2..09dabfcbd5 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -40,6 +40,8 @@ #define SUFFIX _xmm #endif =20 +#define PACK_WIDTH (4 << SHIFT) + /* * Copy the relevant parts of a Reg value around. 
In the case where * sizeof(Reg) > SIZE, these helpers operate only on the lower bytes of @@ -468,71 +470,81 @@ void glue(helper_movq_mm_T0, SUFFIX)(Reg *d, uint64_t= val) } #endif =20 +#define SHUFFLE4(F, a, b, offset) do { \ + r0 =3D a->F((order & 3) + offset); \ + r1 =3D a->F(((order >> 2) & 3) + offset); \ + r2 =3D b->F(((order >> 4) & 3) + offset); \ + r3 =3D b->F(((order >> 6) & 3) + offset); \ + d->F(offset) =3D r0; \ + d->F(offset + 1) =3D r1; \ + d->F(offset + 2) =3D r2; \ + d->F(offset + 3) =3D r3; \ + } while (0) + #if SHIFT =3D=3D 0 void glue(helper_pshufw, SUFFIX)(Reg *d, Reg *s, int order) { - Reg r; + uint16_t r0, r1, r2, r3; =20 - r.W(0) =3D s->W(order & 3); - r.W(1) =3D s->W((order >> 2) & 3); - r.W(2) =3D s->W((order >> 4) & 3); - r.W(3) =3D s->W((order >> 6) & 3); - MOVE(*d, r); + SHUFFLE4(W, s, s, 0); } #else void glue(helper_shufps, SUFFIX)(Reg *d, Reg *s, int order) { - Reg r; + Reg *v =3D d; + uint32_t r0, r1, r2, r3; + int i; =20 - r.L(0) =3D d->L(order & 3); - r.L(1) =3D d->L((order >> 2) & 3); - r.L(2) =3D s->L((order >> 4) & 3); - r.L(3) =3D s->L((order >> 6) & 3); - MOVE(*d, r); + for (i =3D 0; i < 2 << SHIFT; i +=3D 4) { + SHUFFLE4(L, v, s, i); + } } =20 void glue(helper_shufpd, SUFFIX)(Reg *d, Reg *s, int order) { - Reg r; + Reg *v =3D d; + uint64_t r0, r1; + int i; =20 - r.Q(0) =3D d->Q(order & 1); - r.Q(1) =3D s->Q((order >> 1) & 1); - MOVE(*d, r); + for (i =3D 0; i < 1 << SHIFT; i +=3D 2) { + r0 =3D v->Q(((order & 1) & 1) + i); + r1 =3D s->Q(((order >> 1) & 1) + i); + d->Q(i) =3D r0; + d->Q(i + 1) =3D r1; + order >>=3D 2; + } } =20 void glue(helper_pshufd, SUFFIX)(Reg *d, Reg *s, int order) { - Reg r; + uint32_t r0, r1, r2, r3; + int i; =20 - r.L(0) =3D s->L(order & 3); - r.L(1) =3D s->L((order >> 2) & 3); - r.L(2) =3D s->L((order >> 4) & 3); - r.L(3) =3D s->L((order >> 6) & 3); - MOVE(*d, r); + for (i =3D 0; i < 2 << SHIFT; i +=3D 4) { + SHUFFLE4(L, s, s, i); + } } =20 void glue(helper_pshuflw, SUFFIX)(Reg *d, Reg *s, int order) { - 
Reg r; + uint16_t r0, r1, r2, r3; + int i, j; =20 - r.W(0) =3D s->W(order & 3); - r.W(1) =3D s->W((order >> 2) & 3); - r.W(2) =3D s->W((order >> 4) & 3); - r.W(3) =3D s->W((order >> 6) & 3); - r.Q(1) =3D s->Q(1); - MOVE(*d, r); + for (i =3D 0, j =3D 1; j < 1 << SHIFT; i +=3D 8, j +=3D 2) { + SHUFFLE4(W, s, s, i); + d->Q(j) =3D s->Q(j); + } } =20 void glue(helper_pshufhw, SUFFIX)(Reg *d, Reg *s, int order) { - Reg r; + uint16_t r0, r1, r2, r3; + int i, j; =20 - r.Q(0) =3D s->Q(0); - r.W(4) =3D s->W(4 + (order & 3)); - r.W(5) =3D s->W(4 + ((order >> 2) & 3)); - r.W(6) =3D s->W(4 + ((order >> 4) & 3)); - r.W(7) =3D s->W(4 + ((order >> 6) & 3)); - MOVE(*d, r); + for (i =3D 4, j =3D 0; j < 1 << SHIFT; i +=3D 8, j +=3D 2) { + d->Q(j) =3D s->Q(j); + SHUFFLE4(W, s, s, i); + } } #endif =20 @@ -1085,156 +1097,132 @@ uint32_t glue(helper_pmovmskb, SUFFIX)(CPUX86Stat= e *env, Reg *s) return val; } =20 -void glue(helper_packsswb, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) -{ - Reg r; - - r.B(0) =3D satsb((int16_t)d->W(0)); - r.B(1) =3D satsb((int16_t)d->W(1)); - r.B(2) =3D satsb((int16_t)d->W(2)); - r.B(3) =3D satsb((int16_t)d->W(3)); -#if SHIFT =3D=3D 1 - r.B(4) =3D satsb((int16_t)d->W(4)); - r.B(5) =3D satsb((int16_t)d->W(5)); - r.B(6) =3D satsb((int16_t)d->W(6)); - r.B(7) =3D satsb((int16_t)d->W(7)); -#endif - r.B((4 << SHIFT) + 0) =3D satsb((int16_t)s->W(0)); - r.B((4 << SHIFT) + 1) =3D satsb((int16_t)s->W(1)); - r.B((4 << SHIFT) + 2) =3D satsb((int16_t)s->W(2)); - r.B((4 << SHIFT) + 3) =3D satsb((int16_t)s->W(3)); -#if SHIFT =3D=3D 1 - r.B(12) =3D satsb((int16_t)s->W(4)); - r.B(13) =3D satsb((int16_t)s->W(5)); - r.B(14) =3D satsb((int16_t)s->W(6)); - r.B(15) =3D satsb((int16_t)s->W(7)); -#endif - MOVE(*d, r); +#define PACK_HELPER_B(name, F) \ +void glue(helper_pack ## name, SUFFIX)(CPUX86State *env, \ + Reg *d, Reg *s) \ +{ \ + Reg *v =3D d; \ + uint8_t r[PACK_WIDTH * 2]; \ + int j, k; \ + for (j =3D 0; j < 4 << SHIFT; j +=3D PACK_WIDTH) { \ + for (k =3D 0; k < PACK_WIDTH; 
k++) { \ + r[k] =3D F((int16_t)v->W(j + k)); \ + } \ + for (k =3D 0; k < PACK_WIDTH; k++) { \ + r[PACK_WIDTH + k] =3D F((int16_t)s->W(j + k)); \ + } \ + for (k =3D 0; k < PACK_WIDTH * 2; k++) { \ + d->B(2 * j + k) =3D r[k]; \ + } \ + } \ } =20 -void glue(helper_packuswb, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) -{ - Reg r; - - r.B(0) =3D satub((int16_t)d->W(0)); - r.B(1) =3D satub((int16_t)d->W(1)); - r.B(2) =3D satub((int16_t)d->W(2)); - r.B(3) =3D satub((int16_t)d->W(3)); -#if SHIFT =3D=3D 1 - r.B(4) =3D satub((int16_t)d->W(4)); - r.B(5) =3D satub((int16_t)d->W(5)); - r.B(6) =3D satub((int16_t)d->W(6)); - r.B(7) =3D satub((int16_t)d->W(7)); -#endif - r.B((4 << SHIFT) + 0) =3D satub((int16_t)s->W(0)); - r.B((4 << SHIFT) + 1) =3D satub((int16_t)s->W(1)); - r.B((4 << SHIFT) + 2) =3D satub((int16_t)s->W(2)); - r.B((4 << SHIFT) + 3) =3D satub((int16_t)s->W(3)); -#if SHIFT =3D=3D 1 - r.B(12) =3D satub((int16_t)s->W(4)); - r.B(13) =3D satub((int16_t)s->W(5)); - r.B(14) =3D satub((int16_t)s->W(6)); - r.B(15) =3D satub((int16_t)s->W(7)); -#endif - MOVE(*d, r); -} +PACK_HELPER_B(sswb, satsb) +PACK_HELPER_B(uswb, satub) =20 void glue(helper_packssdw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - Reg r; + Reg *v =3D d; + uint16_t r[PACK_WIDTH]; + int j, k; =20 - r.W(0) =3D satsw(d->L(0)); - r.W(1) =3D satsw(d->L(1)); -#if SHIFT =3D=3D 1 - r.W(2) =3D satsw(d->L(2)); - r.W(3) =3D satsw(d->L(3)); -#endif - r.W((2 << SHIFT) + 0) =3D satsw(s->L(0)); - r.W((2 << SHIFT) + 1) =3D satsw(s->L(1)); -#if SHIFT =3D=3D 1 - r.W(6) =3D satsw(s->L(2)); - r.W(7) =3D satsw(s->L(3)); -#endif - MOVE(*d, r); + for (j =3D 0; j < 2 << SHIFT; j +=3D PACK_WIDTH / 2) { + for (k =3D 0; k < PACK_WIDTH / 2; k++) { + r[k] =3D satsw(v->L(j + k)); + } + for (k =3D 0; k < PACK_WIDTH / 2; k++) { + r[PACK_WIDTH / 2 + k] =3D satsw(s->L(j + k)); + } + for (k =3D 0; k < PACK_WIDTH; k++) { + d->W(2 * j + k) =3D r[k]; + } + } } =20 #define UNPCK_OP(base_name, base) \ \ void glue(helper_punpck ## base_name ## bw, 
SUFFIX)(CPUX86State *env,\ - Reg *d, Reg *s) \ + Reg *d, Reg *s) \ { \ - Reg r; \ + Reg *v =3D d; \ + uint8_t r[PACK_WIDTH * 2]; \ + int j, i; \ \ - r.B(0) =3D d->B((base << (SHIFT + 2)) + 0); \ - r.B(1) =3D s->B((base << (SHIFT + 2)) + 0); \ - r.B(2) =3D d->B((base << (SHIFT + 2)) + 1); \ - r.B(3) =3D s->B((base << (SHIFT + 2)) + 1); \ - r.B(4) =3D d->B((base << (SHIFT + 2)) + 2); \ - r.B(5) =3D s->B((base << (SHIFT + 2)) + 2); \ - r.B(6) =3D d->B((base << (SHIFT + 2)) + 3); \ - r.B(7) =3D s->B((base << (SHIFT + 2)) + 3); \ - XMM_ONLY( \ - r.B(8) =3D d->B((base << (SHIFT + 2)) + 4); \ - r.B(9) =3D s->B((base << (SHIFT + 2)) + 4); \ - r.B(10) =3D d->B((base << (SHIFT + 2)) + 5); \ - r.B(11) =3D s->B((base << (SHIFT + 2)) + 5); \ - r.B(12) =3D d->B((base << (SHIFT + 2)) + 6); \ - r.B(13) =3D s->B((base << (SHIFT + 2)) + 6); \ - r.B(14) =3D d->B((base << (SHIFT + 2)) + 7); \ - r.B(15) =3D s->B((base << (SHIFT + 2)) + 7); \ - ) \ - MOVE(*d, r); \ + for (j =3D 0; j < 8 << SHIFT; ) { \ + int k =3D j + base * PACK_WIDTH; \ + for (i =3D 0; i < PACK_WIDTH; i++) { \ + r[2 * i] =3D v->B(k + i); \ + r[2 * i + 1] =3D s->B(k + i); \ + } \ + for (i =3D 0; i < PACK_WIDTH * 2; i++, j++) { \ + d->B(j) =3D r[i]; \ + } \ + } \ } \ \ void glue(helper_punpck ## base_name ## wd, SUFFIX)(CPUX86State *env,\ - Reg *d, Reg *s) \ + Reg *d, Reg *s) \ { \ - Reg r; \ + Reg *v =3D d; \ + uint16_t r[PACK_WIDTH]; \ + int j, i; \ \ - r.W(0) =3D d->W((base << (SHIFT + 1)) + 0); \ - r.W(1) =3D s->W((base << (SHIFT + 1)) + 0); \ - r.W(2) =3D d->W((base << (SHIFT + 1)) + 1); \ - r.W(3) =3D s->W((base << (SHIFT + 1)) + 1); \ - XMM_ONLY( \ - r.W(4) =3D d->W((base << (SHIFT + 1)) + 2); \ - r.W(5) =3D s->W((base << (SHIFT + 1)) + 2); \ - r.W(6) =3D d->W((base << (SHIFT + 1)) + 3); \ - r.W(7) =3D s->W((base << (SHIFT + 1)) + 3); \ - ) \ - MOVE(*d, r); \ + for (j =3D 0; j < 4 << SHIFT; ) { \ + int k =3D j + base * PACK_WIDTH / 2; \ + for (i =3D 0; i < PACK_WIDTH / 2; i++) { \ + r[2 * i] =3D v->W(k + i); \ + 
r[2 * i + 1] =3D s->W(k + i); \ + } \ + for (i =3D 0; i < PACK_WIDTH; i++, j++) { \ + d->W(j) =3D r[i]; \ + } \ + } \ } \ \ void glue(helper_punpck ## base_name ## dq, SUFFIX)(CPUX86State *env,\ - Reg *d, Reg *s) \ + Reg *d, Reg *s) \ { \ - Reg r; \ + Reg *v =3D d; \ + uint32_t r[PACK_WIDTH / 2]; \ + int j, i; \ \ - r.L(0) =3D d->L((base << SHIFT) + 0); \ - r.L(1) =3D s->L((base << SHIFT) + 0); \ - XMM_ONLY( \ - r.L(2) =3D d->L((base << SHIFT) + 1); \ - r.L(3) =3D s->L((base << SHIFT) + 1); \ - ) \ - MOVE(*d, r); \ + for (j =3D 0; j < 2 << SHIFT; ) { \ + int k =3D j + base * PACK_WIDTH / 4; \ + for (i =3D 0; i < PACK_WIDTH / 4; i++) { \ + r[2 * i] =3D v->L(k + i); \ + r[2 * i + 1] =3D s->L(k + i); \ + } \ + for (i =3D 0; i < PACK_WIDTH / 2; i++, j++) { \ + d->L(j) =3D r[i]; \ + } \ + } \ } \ \ XMM_ON + void glue(helper_punpck ## base_name ## qdq, SUFFIX)( \ + CPUX86State *env, Reg *d, Reg *s) \ { \ - Reg r; \ + Reg *v =3D d; \ + uint64_t r[2]; \ + int i; \ \ - r.Q(0) =3D d->Q(base); \ - r.Q(1) =3D s->Q(base); \ - MOVE(*d, r); \ + for (i =3D 0; i < 1 << SHIFT; i +=3D 2) { = \ + r[0] =3D v->Q(base + i); \ + r[1] =3D s->Q(base + i); \ + d->Q(i) =3D r[0]; \ + d->Q(i + 1) =3D r[1]; \ + } \ } \ ) =20 UNPCK_OP(l, 0) UNPCK_OP(h, 1) =20 +#undef PACK_WIDTH +#undef PACK_HELPER_B +#undef UNPCK_OP + + /* 3DNow! float ops */ #if SHIFT =3D=3D 0 void helper_pi2fd(CPUX86State *env, MMXReg *d, MMXReg *s) @@ -1387,122 +1375,113 @@ void helper_pswapd(CPUX86State *env, MMXReg *d, M= MXReg *s) /* SSSE3 op helpers */ void glue(helper_pshufb, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { + Reg *v =3D d; int i; - Reg r; +#if SHIFT =3D=3D 0 + uint8_t r[8]; =20 - for (i =3D 0; i < (8 << SHIFT); i++) { - r.B(i) =3D (s->B(i) & 0x80) ? 0 : (d->B(s->B(i) & ((8 << SHIFT) - = 1))); + for (i =3D 0; i < 8; i++) { + r[i] =3D (s->B(i) & 0x80) ? 
0 : (v->B(s->B(i) & 7)); } + for (i =3D 0; i < 8; i++) { + d->B(i) =3D r[i]; + } +#else + uint8_t r[8 << SHIFT]; =20 - MOVE(*d, r); -} - -void glue(helper_phaddw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) -{ - - Reg r; - - r.W(0) =3D (int16_t)d->W(0) + (int16_t)d->W(1); - r.W(1) =3D (int16_t)d->W(2) + (int16_t)d->W(3); - XMM_ONLY(r.W(2) =3D (int16_t)d->W(4) + (int16_t)d->W(5)); - XMM_ONLY(r.W(3) =3D (int16_t)d->W(6) + (int16_t)d->W(7)); - r.W((2 << SHIFT) + 0) =3D (int16_t)s->W(0) + (int16_t)s->W(1); - r.W((2 << SHIFT) + 1) =3D (int16_t)s->W(2) + (int16_t)s->W(3); - XMM_ONLY(r.W(6) =3D (int16_t)s->W(4) + (int16_t)s->W(5)); - XMM_ONLY(r.W(7) =3D (int16_t)s->W(6) + (int16_t)s->W(7)); - - MOVE(*d, r); -} - -void glue(helper_phaddd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) -{ - Reg r; - - r.L(0) =3D (int32_t)d->L(0) + (int32_t)d->L(1); - XMM_ONLY(r.L(1) =3D (int32_t)d->L(2) + (int32_t)d->L(3)); - r.L((1 << SHIFT) + 0) =3D (int32_t)s->L(0) + (int32_t)s->L(1); - XMM_ONLY(r.L(3) =3D (int32_t)s->L(2) + (int32_t)s->L(3)); - - MOVE(*d, r); -} - -void glue(helper_phaddsw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) -{ - Reg r; - - r.W(0) =3D satsw((int16_t)d->W(0) + (int16_t)d->W(1)); - r.W(1) =3D satsw((int16_t)d->W(2) + (int16_t)d->W(3)); - XMM_ONLY(r.W(2) =3D satsw((int16_t)d->W(4) + (int16_t)d->W(5))); - XMM_ONLY(r.W(3) =3D satsw((int16_t)d->W(6) + (int16_t)d->W(7))); - r.W((2 << SHIFT) + 0) =3D satsw((int16_t)s->W(0) + (int16_t)s->W(1)); - r.W((2 << SHIFT) + 1) =3D satsw((int16_t)s->W(2) + (int16_t)s->W(3)); - XMM_ONLY(r.W(6) =3D satsw((int16_t)s->W(4) + (int16_t)s->W(5))); - XMM_ONLY(r.W(7) =3D satsw((int16_t)s->W(6) + (int16_t)s->W(7))); - - MOVE(*d, r); -} - -void glue(helper_pmaddubsw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) -{ - d->W(0) =3D satsw((int8_t)s->B(0) * (uint8_t)d->B(0) + - (int8_t)s->B(1) * (uint8_t)d->B(1)); - d->W(1) =3D satsw((int8_t)s->B(2) * (uint8_t)d->B(2) + - (int8_t)s->B(3) * (uint8_t)d->B(3)); - d->W(2) =3D satsw((int8_t)s->B(4) * 
(uint8_t)d->B(4) + - (int8_t)s->B(5) * (uint8_t)d->B(5)); - d->W(3) =3D satsw((int8_t)s->B(6) * (uint8_t)d->B(6) + - (int8_t)s->B(7) * (uint8_t)d->B(7)); -#if SHIFT =3D=3D 1 - d->W(4) =3D satsw((int8_t)s->B(8) * (uint8_t)d->B(8) + - (int8_t)s->B(9) * (uint8_t)d->B(9)); - d->W(5) =3D satsw((int8_t)s->B(10) * (uint8_t)d->B(10) + - (int8_t)s->B(11) * (uint8_t)d->B(11)); - d->W(6) =3D satsw((int8_t)s->B(12) * (uint8_t)d->B(12) + - (int8_t)s->B(13) * (uint8_t)d->B(13)); - d->W(7) =3D satsw((int8_t)s->B(14) * (uint8_t)d->B(14) + - (int8_t)s->B(15) * (uint8_t)d->B(15)); + for (i =3D 0; i < 8 << SHIFT; i++) { + int j =3D i & ~0xf; + r[i] =3D (s->B(i) & 0x80) ? 0 : v->B(j | (s->B(i) & 0xf)); + } + for (i =3D 0; i < 8 << SHIFT; i++) { + d->B(i) =3D r[i]; + } #endif } =20 -void glue(helper_phsubw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) -{ - Reg r; +#if SHIFT =3D=3D 0 =20 - r.W(0) =3D (int16_t)d->W(0) - (int16_t)d->W(1); - r.W(1) =3D (int16_t)d->W(2) - (int16_t)d->W(3); - XMM_ONLY(r.W(2) =3D (int16_t)d->W(4) - (int16_t)d->W(5)); - XMM_ONLY(r.W(3) =3D (int16_t)d->W(6) - (int16_t)d->W(7)); - r.W((2 << SHIFT) + 0) =3D (int16_t)s->W(0) - (int16_t)s->W(1); - r.W((2 << SHIFT) + 1) =3D (int16_t)s->W(2) - (int16_t)s->W(3); - XMM_ONLY(r.W(6) =3D (int16_t)s->W(4) - (int16_t)s->W(5)); - XMM_ONLY(r.W(7) =3D (int16_t)s->W(6) - (int16_t)s->W(7)); - MOVE(*d, r); +#define SSE_HELPER_HW(name, F) \ +void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ +{ \ + Reg *v =3D d; \ + uint16_t r[4]; \ + r[0] =3D F(v->W(0), v->W(1)); \ + r[1] =3D F(v->W(2), v->W(3)); \ + r[2] =3D F(s->W(0), s->W(1)); \ + r[3] =3D F(s->W(2), s->W(3)); \ + d->W(0) =3D r[0]; \ + d->W(1) =3D r[1]; \ + d->W(2) =3D r[2]; \ + d->W(3) =3D r[3]; \ } =20 -void glue(helper_phsubd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) -{ - Reg r; - - r.L(0) =3D (int32_t)d->L(0) - (int32_t)d->L(1); - XMM_ONLY(r.L(1) =3D (int32_t)d->L(2) - (int32_t)d->L(3)); - r.L((1 << SHIFT) + 0) =3D (int32_t)s->L(0) - (int32_t)s->L(1); - 
XMM_ONLY(r.L(3) =3D (int32_t)s->L(2) - (int32_t)s->L(3)); - MOVE(*d, r); +#define SSE_HELPER_HL(name, F) \ +void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ +{ \ + Reg *v =3D d; \ + uint32_t r0, r1; \ + r0 =3D F(v->L(0), v->L(1)); \ + r1 =3D F(s->L(0), s->L(1)); \ + d->L(0) =3D r0; \ + d->L(1) =3D r1; \ } =20 -void glue(helper_phsubsw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) -{ - Reg r; +#else =20 - r.W(0) =3D satsw((int16_t)d->W(0) - (int16_t)d->W(1)); - r.W(1) =3D satsw((int16_t)d->W(2) - (int16_t)d->W(3)); - XMM_ONLY(r.W(2) =3D satsw((int16_t)d->W(4) - (int16_t)d->W(5))); - XMM_ONLY(r.W(3) =3D satsw((int16_t)d->W(6) - (int16_t)d->W(7))); - r.W((2 << SHIFT) + 0) =3D satsw((int16_t)s->W(0) - (int16_t)s->W(1)); - r.W((2 << SHIFT) + 1) =3D satsw((int16_t)s->W(2) - (int16_t)s->W(3)); - XMM_ONLY(r.W(6) =3D satsw((int16_t)s->W(4) - (int16_t)s->W(5))); - XMM_ONLY(r.W(7) =3D satsw((int16_t)s->W(6) - (int16_t)s->W(7))); - MOVE(*d, r); +#define SSE_HELPER_HW(name, F) \ +void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ +{ \ + Reg *v =3D d; \ + int16_t r[4 << SHIFT]; \ + int i, j; \ + for (i =3D j =3D 0; j < 8; i++, j +=3D 2) { \ + r[i] =3D F(v->W(j), v->W(j + 1)); \ + } \ + for (j =3D 0; j < 8; i++, j +=3D 2) { \ + r[i] =3D F(s->W(j), s->W(j + 1)); \ + } \ + for (i =3D 0; i < 4 << SHIFT; i++) { \ + d->W(i) =3D r[i]; \ + } \ +} + +#define SSE_HELPER_HL(name, F) \ +void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ +{ \ + Reg *v =3D d; \ + int32_t r[2 << SHIFT]; \ + int i, j; \ + for (i =3D j =3D 0; j < 4; i++, j +=3D 2) { \ + r[i] =3D F(v->L(j), v->L(j + 1)); \ + } \ + for (j =3D 0; j < 4; i++, j +=3D 2) { \ + r[i] =3D F(s->L(j), s->L(j + 1)); \ + } \ + for (i =3D 0; i < 2 << SHIFT; i++) { \ + d->L(i) =3D r[i]; \ + } \ +} +#endif + +SSE_HELPER_HW(phaddw, FADD) +SSE_HELPER_HW(phsubw, FSUB) +SSE_HELPER_HW(phaddsw, FADDSW) +SSE_HELPER_HW(phsubsw, FSUBSW) +SSE_HELPER_HL(phaddd, FADD) +SSE_HELPER_HL(phsubd, FSUB) + 
+#undef SSE_HELPER_HW +#undef SSE_HELPER_HL + +void glue(helper_pmaddubsw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +{ + Reg *v =3D d; + int i; + for (i =3D 0; i < 4 << SHIFT; i++) { + d->W(i) =3D satsw((int8_t)s->B(i * 2) * (uint8_t)v->B(i * 2) + + (int8_t)s->B(i * 2 + 1) * (uint8_t)v->B(i * 2 + 1)= ); + } } =20 #define FABSB(x) (x > INT8_MAX ? -(int8_t)x : x) @@ -1525,32 +1504,38 @@ SSE_HELPER_L(helper_psignd, FSIGNL) void glue(helper_palignr, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, int32_t shift) { - Reg r; + Reg *v =3D d; + int i; =20 /* XXX could be checked during translation */ - if (shift >=3D (16 << SHIFT)) { - r.Q(0) =3D 0; - XMM_ONLY(r.Q(1) =3D 0); + if (shift >=3D (SHIFT ? 32 : 16)) { + for (i =3D 0; i < (1 << SHIFT); i++) { + d->Q(i) =3D 0; + } } else { shift <<=3D 3; #define SHR(v, i) (i < 64 && i > -64 ? i > 0 ? v >> (i) : (v << -(i)) : 0) #if SHIFT =3D=3D 0 - r.Q(0) =3D SHR(s->Q(0), shift - 0) | - SHR(d->Q(0), shift - 64); + d->Q(0) =3D SHR(s->Q(0), shift - 0) | + SHR(v->Q(0), shift - 64); #else - r.Q(0) =3D SHR(s->Q(0), shift - 0) | - SHR(s->Q(1), shift - 64) | - SHR(d->Q(0), shift - 128) | - SHR(d->Q(1), shift - 192); - r.Q(1) =3D SHR(s->Q(0), shift + 64) | - SHR(s->Q(1), shift - 0) | - SHR(d->Q(0), shift - 64) | - SHR(d->Q(1), shift - 128); + for (i =3D 0; i < (1 << SHIFT); i +=3D 2) { + uint64_t r0, r1; + + r0 =3D SHR(s->Q(i), shift - 0) | + SHR(s->Q(i + 1), shift - 64) | + SHR(v->Q(i), shift - 128) | + SHR(v->Q(i + 1), shift - 192); + r1 =3D SHR(s->Q(i), shift + 64) | + SHR(s->Q(i + 1), shift - 0) | + SHR(v->Q(i), shift - 64) | + SHR(v->Q(i + 1), shift - 128); + d->Q(i) =3D r0; + d->Q(i + 1) =3D r1; + } #endif #undef SHR } - - MOVE(*d, r); } =20 #define XMM0 (env->xmm_regs[0]) @@ -1675,17 +1660,23 @@ SSE_HELPER_Q(helper_pcmpeqq, FCMPEQQ) =20 void glue(helper_packusdw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - Reg r; + Reg *v =3D d; + uint16_t r[8]; + int i, j, k; =20 - r.W(0) =3D satuw((int32_t) d->L(0)); - r.W(1) =3D satuw((int32_t) 
d->L(1)); - r.W(2) =3D satuw((int32_t) d->L(2)); - r.W(3) =3D satuw((int32_t) d->L(3)); - r.W(4) =3D satuw((int32_t) s->L(0)); - r.W(5) =3D satuw((int32_t) s->L(1)); - r.W(6) =3D satuw((int32_t) s->L(2)); - r.W(7) =3D satuw((int32_t) s->L(3)); - MOVE(*d, r); + for (i =3D 0, j =3D 0; i <=3D 2 << SHIFT; i +=3D 8, j +=3D 4) { + r[0] =3D satuw(v->L(j)); + r[1] =3D satuw(v->L(j + 1)); + r[2] =3D satuw(v->L(j + 2)); + r[3] =3D satuw(v->L(j + 3)); + r[4] =3D satuw(s->L(j)); + r[5] =3D satuw(s->L(j + 1)); + r[6] =3D satuw(s->L(j + 2)); + r[7] =3D satuw(s->L(j + 3)); + for (k =3D 0; k < 8; k++) { + d->W(i + k) =3D r[k]; + } + } } =20 #define FMINSB(d, s) MIN((int8_t)d, (int8_t)s) @@ -1941,20 +1932,25 @@ void glue(helper_dppd, SUFFIX)(CPUX86State *env, Re= g *d, Reg *s, uint32_t mask) void glue(helper_mpsadbw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t offset) { - int s0 =3D (offset & 3) << 2; - int d0 =3D (offset & 4) << 0; - int i; - Reg r; + Reg *v =3D d; + int i, j; + uint16_t r[8]; =20 - for (i =3D 0; i < 8; i++, d0++) { - r.W(i) =3D 0; - r.W(i) +=3D abs1(d->B(d0 + 0) - s->B(s0 + 0)); - r.W(i) +=3D abs1(d->B(d0 + 1) - s->B(s0 + 1)); - r.W(i) +=3D abs1(d->B(d0 + 2) - s->B(s0 + 2)); - r.W(i) +=3D abs1(d->B(d0 + 3) - s->B(s0 + 3)); + for (j =3D 0; j < 4 << SHIFT; ) { + int s0 =3D (j * 2) + ((offset & 3) << 2); + int d0 =3D (j * 2) + ((offset & 4) << 0); + for (i =3D 0; i < 8; i++, d0++) { + r[i] =3D 0; + r[i] +=3D abs1(v->B(d0 + 0) - s->B(s0 + 0)); + r[i] +=3D abs1(v->B(d0 + 1) - s->B(s0 + 1)); + r[i] +=3D abs1(v->B(d0 + 2) - s->B(s0 + 2)); + r[i] +=3D abs1(v->B(d0 + 3) - s->B(s0 + 3)); + } + for (i =3D 0; i < 8; i++, j++) { + d->W(j) =3D r[i]; + } + offset >>=3D 3; } - - MOVE(*d, r); } =20 /* SSE4.2 op helpers */ --=20 2.37.1 From nobody Mon Feb 9 20:36:49 2026 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) 
smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com From: Paolo Bonzini To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, paul@nowt.org Subject: [PATCH 16/23] i386: Floating point arithmetic helper AVX prep Date: Sat, 27 Aug 2022 01:11:57 +0200 Message-Id: <20220826231204.201395-17-pbonzini@redhat.com> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20220826231204.201395-1-pbonzini@redhat.com> References: <20220826231204.201395-1-pbonzini@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-ZohoMail-DKIM: pass 
(identity @redhat.com) X-ZM-MESSAGEID: 1661556634401100001 Content-Type: text/plain; charset="utf-8" From: Paul Brook Prepare the "easy" floating point vector helpers for AVX. No functional changes to existing helpers. Signed-off-by: Paul Brook Message-Id: <20220424220204.2493824-16-paul@nowt.org> Reviewed-by: Richard Henderson Signed-off-by: Paolo Bonzini --- target/i386/ops_sse.h | 138 ++++++++++++++++++++++++++++-------------- 1 file changed, 92 insertions(+), 46 deletions(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index 09dabfcbd5..1d05e42a45 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -548,40 +548,58 @@ void glue(helper_pshufhw, SUFFIX)(Reg *d, Reg *s, int= order) } #endif =20 -#if SHIFT =3D=3D 1 +#if SHIFT >=3D 1 /* FPU ops */ /* XXX: not accurate */ =20 -#define SSE_HELPER_S(name, F) \ - void glue(helper_ ## name ## ps, SUFFIX)(CPUX86State *env, Reg *d, Reg= *s)\ +#define SSE_HELPER_P(name, F) \ + void glue(helper_ ## name ## ps, SUFFIX)(CPUX86State *env, \ + Reg *d, Reg *s) \ { \ - d->ZMM_S(0) =3D F(32, d->ZMM_S(0), s->ZMM_S(0)); \ - d->ZMM_S(1) =3D F(32, d->ZMM_S(1), s->ZMM_S(1)); \ - d->ZMM_S(2) =3D F(32, d->ZMM_S(2), s->ZMM_S(2)); \ - d->ZMM_S(3) =3D F(32, d->ZMM_S(3), s->ZMM_S(3)); \ + Reg *v =3D d; \ + int i; \ + for (i =3D 0; i < 2 << SHIFT; i++) { \ + d->ZMM_S(i) =3D F(32, v->ZMM_S(i), s->ZMM_S(i)); \ + } \ } \ \ - void helper_ ## name ## ss(CPUX86State *env, Reg *d, Reg *s) \ + void glue(helper_ ## name ## pd, SUFFIX)(CPUX86State *env, \ + Reg *d, Reg *s) \ { \ - d->ZMM_S(0) =3D F(32, d->ZMM_S(0), s->ZMM_S(0)); \ - } \ - \ - void glue(helper_ ## name ## pd, SUFFIX)(CPUX86State *env, Reg *d, Reg= *s)\ - { \ - d->ZMM_D(0) =3D F(64, d->ZMM_D(0), s->ZMM_D(0)); \ - d->ZMM_D(1) =3D F(64, d->ZMM_D(1), s->ZMM_D(1)); \ - } \ - \ - void helper_ ## name ## sd(CPUX86State *env, Reg *d, Reg *s) \ - { \ - d->ZMM_D(0) =3D F(64, d->ZMM_D(0), s->ZMM_D(0)); \ + Reg *v =3D d; \ + int i; \ + for (i =3D 0; i < 1 << SHIFT; i++) { \ 
+ d->ZMM_D(i) =3D F(64, v->ZMM_D(i), s->ZMM_D(i)); \ + } \ } =20 +#if SHIFT =3D=3D 1 + +#define SSE_HELPER_S(name, F) \ + SSE_HELPER_P(name, F) \ + \ + void helper_ ## name ## ss(CPUX86State *env, Reg *d, Reg *s)\ + { \ + Reg *v =3D d; \ + d->ZMM_S(0) =3D F(32, v->ZMM_S(0), s->ZMM_S(0)); \ + } \ + \ + void helper_ ## name ## sd(CPUX86State *env, Reg *d, Reg *s)\ + { \ + Reg *v =3D d; \ + d->ZMM_D(0) =3D F(64, v->ZMM_D(0), s->ZMM_D(0)); \ + } + +#else + +#define SSE_HELPER_S(name, F) SSE_HELPER_P(name, F) + +#endif + #define FPU_ADD(size, a, b) float ## size ## _add(a, b, &env->sse_status) #define FPU_SUB(size, a, b) float ## size ## _sub(a, b, &env->sse_status) #define FPU_MUL(size, a, b) float ## size ## _mul(a, b, &env->sse_status) #define FPU_DIV(size, a, b) float ## size ## _div(a, b, &env->sse_status) -#define FPU_SQRT(size, a, b) float ## size ## _sqrt(b, &env->sse_status) =20 /* Note that the choice of comparison op here is important to get the * special cases right: for min and max Intel specifies that (-0,0), @@ -598,8 +616,34 @@ SSE_HELPER_S(mul, FPU_MUL) SSE_HELPER_S(div, FPU_DIV) SSE_HELPER_S(min, FPU_MIN) SSE_HELPER_S(max, FPU_MAX) -SSE_HELPER_S(sqrt, FPU_SQRT) =20 +void glue(helper_sqrtps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +{ + int i; + for (i =3D 0; i < 2 << SHIFT; i++) { + d->ZMM_S(i) =3D float32_sqrt(s->ZMM_S(i), &env->sse_status); + } +} + +void glue(helper_sqrtpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +{ + int i; + for (i =3D 0; i < 1 << SHIFT; i++) { + d->ZMM_D(i) =3D float64_sqrt(s->ZMM_D(i), &env->sse_status); + } +} + +#if SHIFT =3D=3D 1 +void helper_sqrtss(CPUX86State *env, Reg *d, Reg *s) +{ + d->ZMM_S(0) =3D float32_sqrt(s->ZMM_S(0), &env->sse_status); +} + +void helper_sqrtsd(CPUX86State *env, Reg *d, Reg *s) +{ + d->ZMM_D(0) =3D float64_sqrt(s->ZMM_D(0), &env->sse_status); +} +#endif =20 /* float to float conversions */ void glue(helper_cvtps2pd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) @@ -817,18 +861,12 @@ int64_t 
helper_cvttsd2sq(CPUX86State *env, ZMMReg *s) void glue(helper_rsqrtps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { uint8_t old_flags =3D get_float_exception_flags(&env->sse_status); - d->ZMM_S(0) =3D float32_div(float32_one, - float32_sqrt(s->ZMM_S(0), &env->sse_status), - &env->sse_status); - d->ZMM_S(1) =3D float32_div(float32_one, - float32_sqrt(s->ZMM_S(1), &env->sse_status), - &env->sse_status); - d->ZMM_S(2) =3D float32_div(float32_one, - float32_sqrt(s->ZMM_S(2), &env->sse_status), - &env->sse_status); - d->ZMM_S(3) =3D float32_div(float32_one, - float32_sqrt(s->ZMM_S(3), &env->sse_status), - &env->sse_status); + int i; + for (i =3D 0; i < 2 << SHIFT; i++) { + d->ZMM_S(i) =3D float32_div(float32_one, + float32_sqrt(s->ZMM_S(i), &env->sse_stat= us), + &env->sse_status); + } set_float_exception_flags(old_flags, &env->sse_status); } =20 @@ -844,10 +882,10 @@ void helper_rsqrtss(CPUX86State *env, ZMMReg *d, ZMMR= eg *s) void glue(helper_rcpps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { uint8_t old_flags =3D get_float_exception_flags(&env->sse_status); - d->ZMM_S(0) =3D float32_div(float32_one, s->ZMM_S(0), &env->sse_status= ); - d->ZMM_S(1) =3D float32_div(float32_one, s->ZMM_S(1), &env->sse_status= ); - d->ZMM_S(2) =3D float32_div(float32_one, s->ZMM_S(2), &env->sse_status= ); - d->ZMM_S(3) =3D float32_div(float32_one, s->ZMM_S(3), &env->sse_status= ); + int i; + for (i =3D 0; i < 2 << SHIFT; i++) { + d->ZMM_S(i) =3D float32_div(float32_one, s->ZMM_S(i), &env->sse_st= atus); + } set_float_exception_flags(old_flags, &env->sse_status); } =20 @@ -942,18 +980,24 @@ void glue(helper_hsubpd, SUFFIX)(CPUX86State *env, ZM= MReg *d, ZMMReg *s) MOVE(*d, r); } =20 -void glue(helper_addsubps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_addsubps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - d->ZMM_S(0) =3D float32_sub(d->ZMM_S(0), s->ZMM_S(0), &env->sse_status= ); - d->ZMM_S(1) =3D float32_add(d->ZMM_S(1), s->ZMM_S(1), &env->sse_status= 
); - d->ZMM_S(2) =3D float32_sub(d->ZMM_S(2), s->ZMM_S(2), &env->sse_status= ); - d->ZMM_S(3) =3D float32_add(d->ZMM_S(3), s->ZMM_S(3), &env->sse_status= ); + Reg *v =3D d; + int i; + for (i =3D 0; i < 2 << SHIFT; i +=3D 2) { + d->ZMM_S(i) =3D float32_sub(v->ZMM_S(i), s->ZMM_S(i), &env->sse_st= atus); + d->ZMM_S(i+1) =3D float32_add(v->ZMM_S(i+1), s->ZMM_S(i+1), &env->= sse_status); + } } =20 -void glue(helper_addsubpd, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_addsubpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - d->ZMM_D(0) =3D float64_sub(d->ZMM_D(0), s->ZMM_D(0), &env->sse_status= ); - d->ZMM_D(1) =3D float64_add(d->ZMM_D(1), s->ZMM_D(1), &env->sse_status= ); + Reg *v =3D d; + int i; + for (i =3D 0; i < 1 << SHIFT; i +=3D 2) { + d->ZMM_D(i) =3D float64_sub(v->ZMM_D(i), s->ZMM_D(i), &env->sse_st= atus); + d->ZMM_D(i+1) =3D float64_add(v->ZMM_D(i+1), s->ZMM_D(i+1), &env->= sse_status); + } } =20 /* XXX: unordered */ @@ -2280,6 +2324,8 @@ void glue(helper_aeskeygenassist, SUFFIX)(CPUX86State= *env, Reg *d, Reg *s, } #endif =20 +#undef SSE_HELPER_S + #undef SHIFT #undef XMM_ONLY #undef Reg --=20 2.37.1 From nobody Mon Feb 9 20:36:49 2026 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1661556941; cv=none; d=zohomail.com; s=zohoarc; b=MAxFRaNqmLf7t6anfg+810pB4I19YIwsmsKKlq1OEERtvHexroig86BGoZxhc9jyEMbnGena8f1IXfN3ZSLSQcuJfglKlKSglz4Qbdz7kVK8W7AbbsGk2XBcHTpM4GiCDhSDdhstuxsNvRbe0cUml4fSTPVijStQ/tcUvJH7C3E= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1661556941; 
h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=uPn+ZzRNo2tzX1jbVQcEIneeyEJpTdW44n3kld+3PnI=; b=fvA2SvPHWpQR+nx4naoMShLoENq/Tom7UWrZMS5vhqEhHTNueQ2ZITt0qSgpoohw5cY+LuZTsUtzoGCAP+qggGBa8k7qOuhrU/o3OXKQdPNJ7gyJlSiDqct6ROEdPiGR7kfzDxMhkwqY3IV7PLmNx/+AtfZeZwh4aYXEOhII2vw= From: Paolo Bonzini To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, paul@nowt.org Subject: [PATCH 17/23] i386: reimplement AVX comparison helpers Date: Sat, 27 Aug 2022 01:11:58 +0200 Message-Id: <20220826231204.201395-18-pbonzini@redhat.com> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20220826231204.201395-1-pbonzini@redhat.com> References: <20220826231204.201395-1-pbonzini@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Paul Brook AVX includes an additional set of comparison predicates, some of which our softfloat implementation does not expose as separate functions. Rewrite the helpers in terms of floatN_compare for future extensibility. 
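For readers following along, here is a minimal standalone sketch of the idea described above. It is not the code in this patch. One comparison primitive returns a four-way relation, and each predicate is a small test on that result. The names Relation, compare_f32 and cmp_mask_f32 are hypothetical, and isunordered() from <math.h> stands in for softfloat's floatN_compare. The predicate numbering assumes the SSE CMPPS immediate encoding (0 to 7).

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Hypothetical four-way result, modelled on a softfloat-style compare. */
typedef enum {
    REL_LESS = -1,
    REL_EQUAL = 0,
    REL_GREATER = 1,
    REL_UNORDERED = 2,
} Relation;

/* Predicate numbers follow the SSE CMPPS immediate encoding (0..7). */
enum { P_EQ, P_LT, P_LE, P_UNORD, P_NEQ, P_NLT, P_NLE, P_ORD };

/* Single comparison primitive; any NaN operand gives REL_UNORDERED. */
static Relation compare_f32(float a, float b)
{
    if (isunordered(a, b)) {
        return REL_UNORDERED;
    }
    if (a < b) {
        return REL_LESS;
    }
    return a > b ? REL_GREATER : REL_EQUAL;
}

/* Evaluate one predicate and return an all-ones or all-zeroes lane mask. */
static uint32_t cmp_mask_f32(float a, float b, int pred)
{
    Relation r = compare_f32(a, b);
    int lt = (r == REL_LESS);
    int eq = (r == REL_EQUAL);
    int un = (r == REL_UNORDERED);
    int res;

    assert(pred >= P_EQ && pred <= P_ORD);
    switch (pred & 3) {
    case P_EQ: res = eq;       break;
    case P_LT: res = lt;       break;
    case P_LE: res = lt || eq; break;
    default:   res = un;       break;   /* P_UNORD */
    }
    if (pred & 4) {
        res = !res;             /* NEQ, NLT, NLE and ORD are complements */
    }
    return res ? UINT32_MAX : 0;
}
```

Negated predicates such as NLT hold for unordered operands, which is why the sketch derives them by complementing LT instead of testing for greater-or-equal.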
Signed-off-by: Paul Brook Reviewed-by: Richard Henderson Message-Id: <20220424220204.2493824-24-paul@nowt.org> Signed-off-by: Paolo Bonzini --- target/i386/ops_sse.h | 97 ++++++++++++++++++++---------------- target/i386/ops_sse_header.h | 24 ++++----- target/i386/tcg/translate.c | 20 ++++---- 3 files changed, 75 insertions(+), 66 deletions(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index 1d05e42a45..f0bb30ba53 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -1000,57 +1000,66 @@ void glue(helper_addsubpd, SUFFIX)(CPUX86State *env= , Reg *d, Reg *s) } } =20 -/* XXX: unordered */ -#define SSE_HELPER_CMP(name, F) \ - void glue(helper_ ## name ## ps, SUFFIX)(CPUX86State *env, Reg *d, Reg= *s)\ +#define SSE_HELPER_CMP_P(name, F, C) \ + void glue(helper_ ## name ## ps, SUFFIX)(CPUX86State *env, \ + Reg *d, Reg *s) \ { \ - d->ZMM_L(0) =3D F(32, d->ZMM_S(0), s->ZMM_S(0)); \ - d->ZMM_L(1) =3D F(32, d->ZMM_S(1), s->ZMM_S(1)); \ - d->ZMM_L(2) =3D F(32, d->ZMM_S(2), s->ZMM_S(2)); \ - d->ZMM_L(3) =3D F(32, d->ZMM_S(3), s->ZMM_S(3)); \ + Reg *v =3D d; \ + int i; \ + for (i =3D 0; i < 2 << SHIFT; i++) { \ + d->ZMM_L(i) =3D C(F(32, v->ZMM_S(i), s->ZMM_S(i))) ? -1 : 0; \ + } \ } \ \ - void helper_ ## name ## ss(CPUX86State *env, Reg *d, Reg *s) \ + void glue(helper_ ## name ## pd, SUFFIX)(CPUX86State *env, \ + Reg *d, Reg *s) \ { \ - d->ZMM_L(0) =3D F(32, d->ZMM_S(0), s->ZMM_S(0)); \ - } \ - \ - void glue(helper_ ## name ## pd, SUFFIX)(CPUX86State *env, Reg *d, Reg= *s)\ - { \ - d->ZMM_Q(0) =3D F(64, d->ZMM_D(0), s->ZMM_D(0)); \ - d->ZMM_Q(1) =3D F(64, d->ZMM_D(1), s->ZMM_D(1)); \ - } \ - \ - void helper_ ## name ## sd(CPUX86State *env, Reg *d, Reg *s) \ - { \ - d->ZMM_Q(0) =3D F(64, d->ZMM_D(0), s->ZMM_D(0)); \ + Reg *v =3D d; \ + int i; \ + for (i =3D 0; i < 1 << SHIFT; i++) { \ + d->ZMM_Q(i) =3D C(F(64, v->ZMM_D(i), s->ZMM_D(i))) ? -1 : 0; \ + } \ } =20 -#define FPU_CMPEQ(size, a, b) \ - (float ## size ## _eq_quiet(a, b, &env->sse_status) ? 
-1 : 0) -#define FPU_CMPLT(size, a, b) \ - (float ## size ## _lt(a, b, &env->sse_status) ? -1 : 0) -#define FPU_CMPLE(size, a, b) \ - (float ## size ## _le(a, b, &env->sse_status) ? -1 : 0) -#define FPU_CMPUNORD(size, a, b) \ - (float ## size ## _unordered_quiet(a, b, &env->sse_status) ? -1 : 0) -#define FPU_CMPNEQ(size, a, b) \ - (float ## size ## _eq_quiet(a, b, &env->sse_status) ? 0 : -1) -#define FPU_CMPNLT(size, a, b) \ - (float ## size ## _lt(a, b, &env->sse_status) ? 0 : -1) -#define FPU_CMPNLE(size, a, b) \ - (float ## size ## _le(a, b, &env->sse_status) ? 0 : -1) -#define FPU_CMPORD(size, a, b) \ - (float ## size ## _unordered_quiet(a, b, &env->sse_status) ? 0 : -1) +#if SHIFT =3D=3D 1 +#define SSE_HELPER_CMP(name, F, C) = \ + SSE_HELPER_CMP_P(name, F, C) = \ + void helper_ ## name ## ss(CPUX86State *env, Reg *d, Reg *s) \ + { = \ + Reg *v =3D d; = \ + d->ZMM_L(0) =3D C(F(32, v->ZMM_S(0), s->ZMM_S(0))) ? -1 : 0; = \ + } = \ + = \ + void helper_ ## name ## sd(CPUX86State *env, Reg *d, Reg *s) \ + { = \ + Reg *v =3D d; = \ + d->ZMM_Q(0) =3D C(F(64, v->ZMM_D(0), s->ZMM_D(0))) ? 
-1 : 0; = \ + } =20 -SSE_HELPER_CMP(cmpeq, FPU_CMPEQ) -SSE_HELPER_CMP(cmplt, FPU_CMPLT) -SSE_HELPER_CMP(cmple, FPU_CMPLE) -SSE_HELPER_CMP(cmpunord, FPU_CMPUNORD) -SSE_HELPER_CMP(cmpneq, FPU_CMPNEQ) -SSE_HELPER_CMP(cmpnlt, FPU_CMPNLT) -SSE_HELPER_CMP(cmpnle, FPU_CMPNLE) -SSE_HELPER_CMP(cmpord, FPU_CMPORD) +#define FPU_EQ(x) (x =3D=3D float_relation_equal) +#define FPU_LT(x) (x =3D=3D float_relation_less) +#define FPU_LE(x) (x <=3D float_relation_equal) +#define FPU_UNORD(x) (x =3D=3D float_relation_unordered) + +#define FPU_CMPQ(size, a, b) \ + float ## size ## _compare_quiet(a, b, &env->sse_status) +#define FPU_CMPS(size, a, b) \ + float ## size ## _compare(a, b, &env->sse_status) + +#else +#define SSE_HELPER_CMP(name, F, C) SSE_HELPER_CMP_P(name, F, C) +#endif + +SSE_HELPER_CMP(cmpeq, FPU_CMPQ, FPU_EQ) +SSE_HELPER_CMP(cmplt, FPU_CMPS, FPU_LT) +SSE_HELPER_CMP(cmple, FPU_CMPS, FPU_LE) +SSE_HELPER_CMP(cmpunord, FPU_CMPQ, FPU_UNORD) +SSE_HELPER_CMP(cmpneq, FPU_CMPQ, !FPU_EQ) +SSE_HELPER_CMP(cmpnlt, FPU_CMPS, !FPU_LT) +SSE_HELPER_CMP(cmpnle, FPU_CMPS, !FPU_LE) +SSE_HELPER_CMP(cmpord, FPU_CMPQ, !FPU_UNORD) + +#undef SSE_HELPER_CMP =20 static const int comis_eflags[4] =3D {CC_C, CC_Z, 0, CC_Z | CC_P | CC_C}; =20 diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h index fc697536a0..d99464afb0 100644 --- a/target/i386/ops_sse_header.h +++ b/target/i386/ops_sse_header.h @@ -201,20 +201,20 @@ DEF_HELPER_3(glue(hsubpd, SUFFIX), void, env, ZMMReg,= ZMMReg) DEF_HELPER_3(glue(addsubps, SUFFIX), void, env, ZMMReg, ZMMReg) DEF_HELPER_3(glue(addsubpd, SUFFIX), void, env, ZMMReg, ZMMReg) =20 -#define SSE_HELPER_CMP(name, F) \ - DEF_HELPER_3(glue(name ## ps, SUFFIX), void, env, Reg, Reg) \ - DEF_HELPER_3(name ## ss, void, env, Reg, Reg) \ - DEF_HELPER_3(glue(name ## pd, SUFFIX), void, env, Reg, Reg) \ +#define SSE_HELPER_CMP(name, F, C) \ + DEF_HELPER_3(glue(name ## ps, SUFFIX), void, env, Reg, Reg) \ + DEF_HELPER_3(name ## ss, void, env, Reg, Reg) \ + 
DEF_HELPER_3(glue(name ## pd, SUFFIX), void, env, Reg, Reg) \ DEF_HELPER_3(name ## sd, void, env, Reg, Reg) =20 -SSE_HELPER_CMP(cmpeq, FPU_CMPEQ) -SSE_HELPER_CMP(cmplt, FPU_CMPLT) -SSE_HELPER_CMP(cmple, FPU_CMPLE) -SSE_HELPER_CMP(cmpunord, FPU_CMPUNORD) -SSE_HELPER_CMP(cmpneq, FPU_CMPNEQ) -SSE_HELPER_CMP(cmpnlt, FPU_CMPNLT) -SSE_HELPER_CMP(cmpnle, FPU_CMPNLE) -SSE_HELPER_CMP(cmpord, FPU_CMPORD) +SSE_HELPER_CMP(cmpeq, FPU_CMPQ, FPU_EQ) +SSE_HELPER_CMP(cmplt, FPU_CMPS, FPU_LT) +SSE_HELPER_CMP(cmple, FPU_CMPS, FPU_LE) +SSE_HELPER_CMP(cmpunord, FPU_CMPQ, FPU_UNORD) +SSE_HELPER_CMP(cmpneq, FPU_CMPQ, !FPU_EQ) +SSE_HELPER_CMP(cmpnlt, FPU_CMPS, !FPU_LT) +SSE_HELPER_CMP(cmpnle, FPU_CMPS, !FPU_LE) +SSE_HELPER_CMP(cmpord, FPU_CMPQ, !FPU_UNORD) =20 DEF_HELPER_3(ucomiss, void, env, Reg, Reg) DEF_HELPER_3(comiss, void, env, Reg, Reg) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index f155cbb667..fdbc78c0c9 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -3022,20 +3022,20 @@ static const SSEFunc_l_ep sse_op_table3bq[] =3D { }; #endif =20 -#define SSE_FOP(x) { \ +#define SSE_CMP(x) { \ gen_helper_ ## x ## ps ## _xmm, gen_helper_ ## x ## pd ## _xmm, \ gen_helper_ ## x ## ss, gen_helper_ ## x ## sd} static const SSEFunc_0_epp sse_op_table4[8][4] =3D { - SSE_FOP(cmpeq), - SSE_FOP(cmplt), - SSE_FOP(cmple), - SSE_FOP(cmpunord), - SSE_FOP(cmpneq), - SSE_FOP(cmpnlt), - SSE_FOP(cmpnle), - SSE_FOP(cmpord), + SSE_CMP(cmpeq), + SSE_CMP(cmplt), + SSE_CMP(cmple), + SSE_CMP(cmpunord), + SSE_CMP(cmpneq), + SSE_CMP(cmpnlt), + SSE_CMP(cmpnle), + SSE_CMP(cmpord), }; -#undef SSE_FOP +#undef SSE_CMP =20 static const SSEFunc_0_epp sse_op_table5[256] =3D { [0x0c] =3D gen_helper_pi2fw, --=20 2.37.1 From nobody Mon Feb 9 20:36:49 2026 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) 
smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1661557080; cv=none; d=zohomail.com; s=zohoarc; b=dVHMMhMjaNcFuT1TzrVaheHSoQPMe2vM6bxGgf4fMVKv7Fip5L7w9Xz01VJ/zEuvGzatmKeFTaSkkvCzx3KaeyJtt/svbHgLDhC5AsTKKDKV0aJQMkfNd/S+3caODPWnzH5/UWSgCHXjgcthw3f0/KTYxJUYuAZBQ/BsRRIW28k= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1661557080; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=dXDG0T4lgQtoZWVDW9qu9uiTkBCwz2icaBi/0SFHNjk=; b=DlfqJrd1np+l74/v7gy3cuU52V6/zkzSSQxBPd0O0QZIFyUBv2fBNNuWtCYG1q27mPXSKa4C2BoOADZELWvpq4IBDNhlRUjyyL6XzDk7Kytp6xwLpIQtK82vNE6K0c8uVzVqMMlUD24lFwMvhgAAOnCbQd3Uq2NXXI+qTFfPJII= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1661557080707701.2986573067795; Fri, 26 Aug 2022 16:38:00 -0700 (PDT) Received: from localhost ([::1]:37474 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oRitj-0001Ug-ON for importer@patchew.org; Fri, 26 Aug 2022 19:37:59 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:42914) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oRiVl-0007NE-Vx for qemu-devel@nongnu.org; Fri, 26 Aug 2022 19:13:14 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:44361) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oRiVX-0007r4-Kx for qemu-devel@nongnu.org; Fri, 26 Aug 2022 
19:13:13 -0400 Received: from mail-ed1-f72.google.com (mail-ed1-f72.google.com [209.85.208.72]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-610-aRutDFsVOhCcGDVlWfd1BA-1; Fri, 26 Aug 2022 19:12:47 -0400 Received: by mail-ed1-f72.google.com with SMTP id q18-20020a056402519200b0043dd2ff50feso1845825edd.9 for ; Fri, 26 Aug 2022 16:12:47 -0700 (PDT) Received: from goa-sendmail ([2001:b07:6468:f312:1c09:f536:3de6:228c]) by smtp.gmail.com with ESMTPSA id g30-20020a056402321e00b00445bda73fbesm1925079eda.33.2022.08.26.16.12.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 26 Aug 2022 16:12:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1661555568; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=dXDG0T4lgQtoZWVDW9qu9uiTkBCwz2icaBi/0SFHNjk=; b=CKiz/FPg5InstVVhlS6KhEQZavNySvuXiJ3rO3lBuwVveCpYKjafvf2W0kgQF0vNW93XW5 xSuvMEGO/efZMm0Xiz2gcRxG9Y/Ptzs3RQAjiunp1omHcZfyVnN9gKCmXkylqJGfPLHpu/ dQedeKihEl4vE2VjWpk4wA8WOOl1g4A= X-MC-Unique: aRutDFsVOhCcGDVlWfd1BA-1 X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc; bh=dXDG0T4lgQtoZWVDW9qu9uiTkBCwz2icaBi/0SFHNjk=; b=VRZJysH9947CTZWRLH9fIKltQswcJlV+Rfbs9wbeF74NcQjAxe0YyojcaFkmv6i5GK PuyQNS8CiX/YxkAJsfGeWCRUSF3qM1CJacTKaWTqJSR3f7TA9Ei1aZSHSs3XcE4TP1+9 DGLjs1LqiKN3qTRpcCjMFgJxLOrbs74pwVvAstg95B1rO4d7eymulXprFK7Uc9li8fZm CM26S1hH4km2J31kRlsTalDLpvOPOiejVLnKzc/1iJjthVzc4mwlG7Nad2pzjZ/yI6b9 BvjBkKUV39l2z0GsidiwaohlVHVhw9FKnNDe6ZyUdF+2DswwVx82Du2WrIKsQp2/SPEE tXpA== X-Gm-Message-State: ACgBeo2rc7yjrWd1lf3TGHBhuEnCliZ5g5dKVOquLuVzRkRQmY9UFtFq 
bpd2FVcYcuwSV3D55inpU5SQJMQ0N8t7itAyStV5Vwjn/t11xAozVb4v9e2aIvilSJIJ5NGoi7U /GbHatkhHAkb0FXmHGWjG2cxubuLQf+pjLAmRzI/7AOkT0oVT7CqDWyTpvIhXqy7U6yw= X-Received: by 2002:a17:907:6eaa:b0:741:44e3:d9e4 with SMTP id sh42-20020a1709076eaa00b0074144e3d9e4mr637013ejc.424.1661555565786; Fri, 26 Aug 2022 16:12:45 -0700 (PDT) X-Google-Smtp-Source: AA6agR4wHj2yZz1mGMc9JuOBJVgaz1FYoWHeQYuwR5lH5TGorYOg05UsFBGeMtqfMKjdcfHlVbCb0A== X-Received: by 2002:a17:907:6eaa:b0:741:44e3:d9e4 with SMTP id sh42-20020a1709076eaa00b0074144e3d9e4mr636996ejc.424.1661555565527; Fri, 26 Aug 2022 16:12:45 -0700 (PDT) From: Paolo Bonzini To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, paul@nowt.org Subject: [PATCH 18/23] i386: Dot product AVX helper prep Date: Sat, 27 Aug 2022 01:11:59 +0200 Message-Id: <20220826231204.201395-19-pbonzini@redhat.com> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20220826231204.201395-1-pbonzini@redhat.com> References: <20220826231204.201395-1-pbonzini@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=pbonzini@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass 
(identity @redhat.com) X-ZM-MESSAGEID: 1661557082745100001 Content-Type: text/plain; charset="utf-8" From: Paul Brook Make the dpps and dppd helpers AVX-ready I can't see any obvious reason why dppd shouldn't work on 256 bit ymm registers, but both AMD and Intel agree that it's xmm only. Signed-off-by: Paul Brook Message-Id: <20220424220204.2493824-17-paul@nowt.org> Reviewed-by: Richard Henderson Signed-off-by: Paolo Bonzini --- target/i386/ops_sse.h | 80 ++++++++++++++++++++++++------------------- 1 file changed, 45 insertions(+), 35 deletions(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index f0bb30ba53..17d04888c5 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -1925,55 +1925,64 @@ SSE_HELPER_I(helper_blendps, L, 4, FBLENDP) SSE_HELPER_I(helper_blendpd, Q, 2, FBLENDP) SSE_HELPER_I(helper_pblendw, W, 8, FBLENDP) =20 -void glue(helper_dpps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t = mask) +void glue(helper_dpps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, + uint32_t mask) { + Reg *v =3D d; float32 prod1, prod2, temp2, temp3, temp4; + int i; =20 - /* - * We must evaluate (A+B)+(C+D), not ((A+B)+C)+D - * to correctly round the intermediate results - */ - if (mask & (1 << 4)) { - prod1 =3D float32_mul(d->ZMM_S(0), s->ZMM_S(0), &env->sse_status); - } else { - prod1 =3D float32_zero; - } - if (mask & (1 << 5)) { - prod2 =3D float32_mul(d->ZMM_S(1), s->ZMM_S(1), &env->sse_status); - } else { - prod2 =3D float32_zero; - } - temp2 =3D float32_add(prod1, prod2, &env->sse_status); - if (mask & (1 << 6)) { - prod1 =3D float32_mul(d->ZMM_S(2), s->ZMM_S(2), &env->sse_status); - } else { - prod1 =3D float32_zero; - } - if (mask & (1 << 7)) { - prod2 =3D float32_mul(d->ZMM_S(3), s->ZMM_S(3), &env->sse_status); - } else { - prod2 =3D float32_zero; - } - temp3 =3D float32_add(prod1, prod2, &env->sse_status); - temp4 =3D float32_add(temp2, temp3, &env->sse_status); + for (i =3D 0; i < 2 << SHIFT; i +=3D 4) { + /* + * We must evaluate 
(A+B)+(C+D), not ((A+B)+C)+D + * to correctly round the intermediate results + */ + if (mask & (1 << 4)) { + prod1 =3D float32_mul(v->ZMM_S(i), s->ZMM_S(i), &env->sse_stat= us); + } else { + prod1 =3D float32_zero; + } + if (mask & (1 << 5)) { + prod2 =3D float32_mul(v->ZMM_S(i+1), s->ZMM_S(i+1), &env->sse_= status); + } else { + prod2 =3D float32_zero; + } + temp2 =3D float32_add(prod1, prod2, &env->sse_status); + if (mask & (1 << 6)) { + prod1 =3D float32_mul(v->ZMM_S(i+2), s->ZMM_S(i+2), &env->sse_= status); + } else { + prod1 =3D float32_zero; + } + if (mask & (1 << 7)) { + prod2 =3D float32_mul(v->ZMM_S(i+3), s->ZMM_S(i+3), &env->sse_= status); + } else { + prod2 =3D float32_zero; + } + temp3 =3D float32_add(prod1, prod2, &env->sse_status); + temp4 =3D float32_add(temp2, temp3, &env->sse_status); =20 - d->ZMM_S(0) =3D (mask & (1 << 0)) ? temp4 : float32_zero; - d->ZMM_S(1) =3D (mask & (1 << 1)) ? temp4 : float32_zero; - d->ZMM_S(2) =3D (mask & (1 << 2)) ? temp4 : float32_zero; - d->ZMM_S(3) =3D (mask & (1 << 3)) ? temp4 : float32_zero; + d->ZMM_S(i) =3D (mask & (1 << 0)) ? temp4 : float32_zero; + d->ZMM_S(i+1) =3D (mask & (1 << 1)) ? temp4 : float32_zero; + d->ZMM_S(i+2) =3D (mask & (1 << 2)) ? temp4 : float32_zero; + d->ZMM_S(i+3) =3D (mask & (1 << 3)) ? 
temp4 : float32_zero; + } } =20 -void glue(helper_dppd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t = mask) +#if SHIFT =3D=3D 1 +/* Oddly, there is no ymm version of dppd */ +void glue(helper_dppd, SUFFIX)(CPUX86State *env, + Reg *d, Reg *s, uint32_t mask) { + Reg *v =3D d; float64 prod1, prod2, temp2; =20 if (mask & (1 << 4)) { - prod1 =3D float64_mul(d->ZMM_D(0), s->ZMM_D(0), &env->sse_status); + prod1 =3D float64_mul(v->ZMM_D(0), s->ZMM_D(0), &env->sse_status); } else { prod1 =3D float64_zero; } if (mask & (1 << 5)) { - prod2 =3D float64_mul(d->ZMM_D(1), s->ZMM_D(1), &env->sse_status); + prod2 =3D float64_mul(v->ZMM_D(1), s->ZMM_D(1), &env->sse_status); } else { prod2 =3D float64_zero; } @@ -1981,6 +1990,7 @@ void glue(helper_dppd, SUFFIX)(CPUX86State *env, Reg = *d, Reg *s, uint32_t mask) d->ZMM_D(0) =3D (mask & (1 << 0)) ? temp2 : float64_zero; d->ZMM_D(1) =3D (mask & (1 << 1)) ? temp2 : float64_zero; } +#endif =20 void glue(helper_mpsadbw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t offset) --=20 2.37.1 From nobody Mon Feb 9 20:36:49 2026 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1661556551; cv=none; d=zohomail.com; s=zohoarc; b=BObUb6Sp/uF1058XgpB1DcZMBwuGjJzc7/XaH9WNfpGeGvAi/H5/ftZTBgHwyledk0tGwmKAn5LEd/MieqkyOXg8kq8HEMCQjPwA2K7lH6h4rriFWMfQfAG8wnmDkdx8JPZbx8sFmgOc5MNoWHAJcQhp9tzSVsWd2deG0tLFXGM= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1661556551; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=mhoWDleWv/dGnNNeR4qdNvgqCfSBeJZS8kKwET5x6P0=; 
b=h7x0NKr5jLglpLulPw6pxHYphItqO3IFOeNuRsNhcJ4+wCCxyfRi7pyH/nnKZMgxDK2o1H6fWTu5IeeB/PYB5KmqxACZDFvVIYiiWh3h71ejFgJJbKqCnMMK3QPiqLhsl+ccUu2U4YtOwaOU3ZqlK6sme3pT1naTAFFldzN57Cw= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1661556551322164.0144750745385; Fri, 26 Aug 2022 16:29:11 -0700 (PDT) Received: from localhost ([::1]:40004 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oRilC-00071t-1H for importer@patchew.org; Fri, 26 Aug 2022 19:29:10 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:42902) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oRiVk-0007GO-33 for qemu-devel@nongnu.org; Fri, 26 Aug 2022 19:13:12 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:34966) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oRiVV-0007sY-GI for qemu-devel@nongnu.org; Fri, 26 Aug 2022 19:13:11 -0400 Received: from mail-ed1-f72.google.com (mail-ed1-f72.google.com [209.85.208.72]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-552-DpHsaKGUOASL-1N7PAl5Ag-1; Fri, 26 Aug 2022 19:12:48 -0400 Received: by mail-ed1-f72.google.com with SMTP id v7-20020a056402348700b004481d536c7bso310359edc.10 for ; Fri, 26 Aug 2022 16:12:48 -0700 (PDT) Received: from goa-sendmail ([2001:b07:6468:f312:1c09:f536:3de6:228c]) by smtp.gmail.com with ESMTPSA id 1-20020a170906328100b0073d70df6e56sm1397203ejw.138.2022.08.26.16.12.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 26 Aug 2022 16:12:46 -0700 (PDT) DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1661555575; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=mhoWDleWv/dGnNNeR4qdNvgqCfSBeJZS8kKwET5x6P0=; b=Raqqd/5cqEvBwyLRvYoVBEAGzojDvW3p07Cn/WWU9WR3P9LW1QMevRUZKEtxDRvAK+tWBF t/SjA3dCcHl2BZYtTQREANBn3eH6tHFP1MgyIrv4XM5qdrAYOL3Shr0IhwbbF/DYKVkCC3 FNkwmAb6WDwLk7E5xtW48yI5w1esF6g= X-MC-Unique: DpHsaKGUOASL-1N7PAl5Ag-1 X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc; bh=mhoWDleWv/dGnNNeR4qdNvgqCfSBeJZS8kKwET5x6P0=; b=bfdL4KrQyL4lnS+IoFC92N+fkK2CpHKNvtr31OKeAUWmy98EQ/sAVwcupQeNQSQv5+ wWAPtYyIMASA+QT3dVhpjJbrMwocmFXEARYbCOvsaJXNc0JdyGffqWooTTujpptMnOmo VNySXZV8O8m4O24rrFv3iPChZcu7juJlfvSzsNsJwYRrgmkg4LQq5+QvyN4imbDv0O0U 2tM4BjYw+CCU84xF7ka+j+GDRgqd8ElPfwflGcrtxJPYYDi30yHE82/VXEOb6By05je2 cpKGKYA0f0x6+xfwDBCBXqziThgKchCfH4gNZQdMaM6LmmwL+dsKp6WIRRMEkb1j0JKJ lSOA== X-Gm-Message-State: ACgBeo1RiyHyQ3JpfPM2Xf3D+5MZzT/6xNpUkh/ar2cfbwIathXG6Xg5 AE750Al8hgM8njS4dd21iKCCRXj6ujUFAyBqpB+/W6KGxqiAkTIG1lsqvLfucIzPZmuARcu0yRT mwQbfsHnN5jf/Kxm5r2HhbcKDmhynt6i8N9xnLNRXLzInov3JIQZVXM944Es9kPYphCc= X-Received: by 2002:a17:907:7389:b0:73d:81a1:d562 with SMTP id er9-20020a170907738900b0073d81a1d562mr7011830ejc.27.1661555567247; Fri, 26 Aug 2022 16:12:47 -0700 (PDT) X-Google-Smtp-Source: AA6agR5bg0m/K18cBFQtpQFQl34NDWClzyjCbH2DkwUwirOnTC3eeSs/O1JIHhL+CDLFtT7sD5l+3w== X-Received: by 2002:a17:907:7389:b0:73d:81a1:d562 with SMTP id er9-20020a170907738900b0073d81a1d562mr7011815ejc.27.1661555566914; Fri, 26 Aug 2022 16:12:46 -0700 (PDT) From: Paolo Bonzini To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, paul@nowt.org Subject: [PATCH 19/23] i386: Destructive FP 
helpers for AVX Date: Sat, 27 Aug 2022 01:12:00 +0200 Message-Id: <20220826231204.201395-20-pbonzini@redhat.com> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20220826231204.201395-1-pbonzini@redhat.com> References: <20220826231204.201395-1-pbonzini@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=pbonzini@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1661556551891100001 Content-Type: text/plain; charset="utf-8" From: Paul Brook Prepare the horizontal arithmetic vector helpers for AVX. These currently use a dummy Reg typed variable to store the result, then assign the whole register. This will cause 128 bit operations to corrupt the upper half of the register, so replace it with explicit temporaries and element assignments.
Signed-off-by: Paul Brook Message-Id: <20220424220204.2493824-18-paul@nowt.org> Reviewed-by: Richard Henderson Signed-off-by: Paolo Bonzini --- target/i386/ops_sse.h | 70 +++++++++++++++++++++---------------------- 1 file changed, 35 insertions(+), 35 deletions(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index 17d04888c5..ed2f04ded5 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -940,45 +940,45 @@ void helper_insertq_i(CPUX86State *env, ZMMReg *d, in= t index, int length) d->ZMM_Q(0) =3D helper_insertq(d->ZMM_Q(0), index, length); } =20 -void glue(helper_haddps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) -{ - ZMMReg r; - - r.ZMM_S(0) =3D float32_add(d->ZMM_S(0), d->ZMM_S(1), &env->sse_status); - r.ZMM_S(1) =3D float32_add(d->ZMM_S(2), d->ZMM_S(3), &env->sse_status); - r.ZMM_S(2) =3D float32_add(s->ZMM_S(0), s->ZMM_S(1), &env->sse_status); - r.ZMM_S(3) =3D float32_add(s->ZMM_S(2), s->ZMM_S(3), &env->sse_status); - MOVE(*d, r); +#define SSE_HELPER_HPS(name, F) \ +void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ +{ \ + Reg *v =3D d; \ + float32 r[2 << SHIFT]; \ + int i, j; \ + for (i =3D j =3D 0; j < 4; i++, j +=3D 2) { \ + r[i] =3D F(v->ZMM_S(j), v->ZMM_S(j + 1), &env->sse_status); \ + } \ + for (j =3D 0; j < 4; i++, j +=3D 2) { \ + r[i] =3D F(s->ZMM_S(j), s->ZMM_S(j + 1), &env->sse_status); \ + } \ + for (i =3D 0; i < 2 << SHIFT; i++) { \ + d->ZMM_S(i) =3D r[i]; \ + } \ } =20 -void glue(helper_haddpd, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) -{ - ZMMReg r; +SSE_HELPER_HPS(haddps, float32_add) +SSE_HELPER_HPS(hsubps, float32_sub) =20 - r.ZMM_D(0) =3D float64_add(d->ZMM_D(0), d->ZMM_D(1), &env->sse_status); - r.ZMM_D(1) =3D float64_add(s->ZMM_D(0), s->ZMM_D(1), &env->sse_status); - MOVE(*d, r); +#define SSE_HELPER_HPD(name, F) \ +void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ +{ \ + Reg *v =3D d; \ + float64 r[2 << SHIFT]; \ + int i, j; \ + for (i =3D j =3D 0; j < 2; i++, j 
+=3D 2) { \ + r[i] =3D F(v->ZMM_D(j), v->ZMM_D(j + 1), &env->sse_status); \ + } \ + for (j =3D 0; j < 2; i++, j +=3D 2) { \ + r[i] =3D F(s->ZMM_D(j), s->ZMM_D(j + 1), &env->sse_status); \ + } \ + for (i =3D 0; i < 1 << SHIFT; i++) { \ + d->ZMM_D(i) =3D r[i]; \ + } \ } =20 -void glue(helper_hsubps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) -{ - ZMMReg r; - - r.ZMM_S(0) =3D float32_sub(d->ZMM_S(0), d->ZMM_S(1), &env->sse_status); - r.ZMM_S(1) =3D float32_sub(d->ZMM_S(2), d->ZMM_S(3), &env->sse_status); - r.ZMM_S(2) =3D float32_sub(s->ZMM_S(0), s->ZMM_S(1), &env->sse_status); - r.ZMM_S(3) =3D float32_sub(s->ZMM_S(2), s->ZMM_S(3), &env->sse_status); - MOVE(*d, r); -} - -void glue(helper_hsubpd, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) -{ - ZMMReg r; - - r.ZMM_D(0) =3D float64_sub(d->ZMM_D(0), d->ZMM_D(1), &env->sse_status); - r.ZMM_D(1) =3D float64_sub(s->ZMM_D(0), s->ZMM_D(1), &env->sse_status); - MOVE(*d, r); -} +SSE_HELPER_HPD(haddpd, float64_add) +SSE_HELPER_HPD(hsubpd, float64_sub) =20 void glue(helper_addsubps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { @@ -1999,7 +1999,7 @@ void glue(helper_mpsadbw, SUFFIX)(CPUX86State *env, R= eg *d, Reg *s, int i, j; uint16_t r[8]; =20 - for (j =3D 0; j < 4 << SHIFT; j++) { + for (j =3D 0; j < 4 << SHIFT; ) { int s0 =3D (j * 2) + ((offset & 3) << 2); int d0 =3D (j * 2) + ((offset & 4) << 0); for (i =3D 0; i < 8; i++, d0++) { --=20 2.37.1 From nobody Mon Feb 9 20:36:49 2026 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1661556448; cv=none; d=zohomail.com; s=zohoarc; b=JDfV2/P1i5pGlY3+eY3l7TTpwxSB9APgkKnW2Bth/JV3KRVtw6TA5y16tZUsnvS3AMc3NsaGePWrvOI8EQwkK3LzKX06As52NcH/dsa09H+WHmmDHuVW9uibwGBilqlklBpfbnAcIcvupt3bd3p63Zwq1YZ6HfflZwPJCUHP0LQ= 
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1661556448; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=pcth6vLe+jdsIMzsYi1JPCoKZ2tqufvhWn2ddGNKh4I=; b=Dm6nu9R5scwzvdKAUN0H+tvojs3cm77C9raKD86/ISMW1swC/CrUyY705dtuPbpw20O1YszqYaJxZihliJzzIkQG+BXtJXsIkncbYxnBDKocKlkID86e84qGKG82FkVAtmPf18mzM+XwF/NbVzaodAgtB5JjhGf2ulo1NEBoA9k= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1661556448032602.4964180566174; Fri, 26 Aug 2022 16:27:28 -0700 (PDT) Received: from localhost ([::1]:36690 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oRijV-0003by-UQ for importer@patchew.org; Fri, 26 Aug 2022 19:27:25 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:42904) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oRiVk-0007IT-IK for qemu-devel@nongnu.org; Fri, 26 Aug 2022 19:13:12 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:24791) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oRiVV-0007rE-HO for qemu-devel@nongnu.org; Fri, 26 Aug 2022 19:13:12 -0400 Received: from mail-ed1-f70.google.com (mail-ed1-f70.google.com [209.85.208.70]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-624-tWUzo9JFM8GkXhnQmLxedA-1; Fri, 26 Aug 2022 19:12:50 -0400 Received: by mail-ed1-f70.google.com with SMTP id q32-20020a05640224a000b004462f105fa9so1849734eda.4 for ; 
Fri, 26 Aug 2022 16:12:50 -0700 (PDT) Received: from goa-sendmail ([93.56.160.208]) by smtp.gmail.com with ESMTPSA id n24-20020a170906841800b0073cf6ec3276sm1360806ejx.207.2022.08.26.16.12.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 26 Aug 2022 16:12:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1661555571; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=pcth6vLe+jdsIMzsYi1JPCoKZ2tqufvhWn2ddGNKh4I=; b=AzvdKCLTJWRE0xzVaegLfXvkb1Audh7aDSiDel7cWV6FBzvr67Lm7lViMl75OOfW3YPShI zIMvmL+GlWx0Y4eYmTKritr40/SAY80sHBsmLURERDt8X0sNJIAHbGTenvOxtIA3kZ5QaM gmBKxPaQaQZ1+Ig0N7NpVVnJIBnTct4= X-MC-Unique: tWUzo9JFM8GkXhnQmLxedA-1 X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc; bh=pcth6vLe+jdsIMzsYi1JPCoKZ2tqufvhWn2ddGNKh4I=; b=B53BM8AlqswMaElyqbc4W0ID3EJ/7nqkcEbPxqlXDnWg7II/yxtdOwcfwUk06mpATL 2F4pFLwHLsGNPqsEVvdWBjKQ2A7I9dn/GVunm1EHzgqMRnqGfu1y2ZvU8plGw1YHtBAL x+K72xxo8PCPZSwZ0g0TYMLTXexCQjktGDG7NdtgtnEOEqnyce2h0dbHhKP2klnEPCFQ i7mf0u9gdPgyLN8NG12/CPMqJZYZLfEjvxcnkKbE8+BG9foxi/43jMY2JMx7Cal7BF78 nXhpek2ro6AlrSQSexIZZZ8iUk7J5v0zXgISVEpu8j1tCyXmPhbmEVYm2ts/oCyYL97F y/0A== X-Gm-Message-State: ACgBeo3llDJelUYbRO85nZkaZYd3yGLRgA0485/RGGt5VVfFrDEuoUXX FS5lar+bORgqWJGUp7mT9P6x03UBsCWrhkSoQttbviK38UMqek/uOc64tm2IB6yq4RAZFDgcnNp ie6YAMIVHrocopUiAqDdHirKk6h0/reHbZTnB46jZjZmDR5U/C7WnyAQnN5PvAT+IMD8= X-Received: by 2002:a17:907:7f20:b0:73d:d54f:6571 with SMTP id qf32-20020a1709077f2000b0073dd54f6571mr5735448ejc.315.1661555569147; Fri, 26 Aug 2022 16:12:49 -0700 (PDT) X-Google-Smtp-Source: 
AA6agR7AZPi/QEI4UkQiAz3MmFr6JHxia3aeJUJsc08x3mnI2OkUtKF/9ww5jHq11d3FsGjLLSKu/g== X-Received: by 2002:a17:907:7f20:b0:73d:d54f:6571 with SMTP id qf32-20020a1709077f2000b0073dd54f6571mr5735437ejc.315.1661555568725; Fri, 26 Aug 2022 16:12:48 -0700 (PDT) From: Paolo Bonzini To: qemu-devel@nongnu.org Cc: richard.henderson@linaro.org, paul@nowt.org Subject: [PATCH 20/23] i386: Misc AVX helper prep Date: Sat, 27 Aug 2022 01:12:01 +0200 Message-Id: <20220826231204.201395-21-pbonzini@redhat.com> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20220826231204.201395-1-pbonzini@redhat.com> References: <20220826231204.201395-1-pbonzini@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=pbonzini@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1661556449315100001 Content-Type: text/plain; charset="utf-8" From: Paul Brook Fix up various vector helpers that either trivially extend to 256 bit, or don't have 256 bit variants.
No functional changes to existing helpers Signed-off-by: Paul Brook Message-Id: <20220424220204.2493824-19-paul@nowt.org> Reviewed-by: Richard Henderson Signed-off-by: Paolo Bonzini --- target/i386/ops_sse.h | 143 +++++++++++++++++++++++++++--------------- 1 file changed, 94 insertions(+), 49 deletions(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index ed2f04ded5..84bbae4b9a 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -435,6 +435,7 @@ void glue(helper_psadbw, SUFFIX)(CPUX86State *env, Reg = *d, Reg *s) } } =20 +#if SHIFT < 2 void glue(helper_maskmov, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, target_ulong a0) { @@ -446,6 +447,7 @@ void glue(helper_maskmov, SUFFIX)(CPUX86State *env, Reg= *d, Reg *s, } } } +#endif =20 void glue(helper_movl_mm_T0, SUFFIX)(Reg *d, uint32_t val) { @@ -648,21 +650,24 @@ void helper_sqrtsd(CPUX86State *env, Reg *d, Reg *s) /* float to float conversions */ void glue(helper_cvtps2pd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - float32 s0, s1; - - s0 =3D s->ZMM_S(0); - s1 =3D s->ZMM_S(1); - d->ZMM_D(0) =3D float32_to_float64(s0, &env->sse_status); - d->ZMM_D(1) =3D float32_to_float64(s1, &env->sse_status); + int i; + for (i =3D 1 << SHIFT; --i >=3D 0; ) { + d->ZMM_D(i) =3D float32_to_float64(s->ZMM_S(i), &env->sse_status); + } } =20 void glue(helper_cvtpd2ps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - d->ZMM_S(0) =3D float64_to_float32(s->ZMM_D(0), &env->sse_status); - d->ZMM_S(1) =3D float64_to_float32(s->ZMM_D(1), &env->sse_status); - d->Q(1) =3D 0; + int i; + for (i =3D 0; i < 1 << SHIFT; i++) { + d->ZMM_S(i) =3D float64_to_float32(s->ZMM_D(i), &env->sse_status); + } + for (i >>=3D 1; i < 1 << SHIFT; i++) { + d->Q(i) =3D 0; + } } =20 +#if SHIFT =3D=3D 1 void helper_cvtss2sd(CPUX86State *env, Reg *d, Reg *s) { d->ZMM_D(0) =3D float32_to_float64(s->ZMM_S(0), &env->sse_status); @@ -672,26 +677,27 @@ void helper_cvtsd2ss(CPUX86State *env, Reg *d, Reg *s) { d->ZMM_S(0) =3D 
float64_to_float32(s->ZMM_D(0), &env->sse_status); } +#endif =20 /* integer to float */ void glue(helper_cvtdq2ps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - d->ZMM_S(0) =3D int32_to_float32(s->ZMM_L(0), &env->sse_status); - d->ZMM_S(1) =3D int32_to_float32(s->ZMM_L(1), &env->sse_status); - d->ZMM_S(2) =3D int32_to_float32(s->ZMM_L(2), &env->sse_status); - d->ZMM_S(3) =3D int32_to_float32(s->ZMM_L(3), &env->sse_status); + int i; + for (i =3D 0; i < 2 << SHIFT; i++) { + d->ZMM_S(i) =3D int32_to_float32(s->ZMM_L(i), &env->sse_status); + } } =20 void glue(helper_cvtdq2pd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - int32_t l0, l1; - - l0 =3D (int32_t)s->ZMM_L(0); - l1 =3D (int32_t)s->ZMM_L(1); - d->ZMM_D(0) =3D int32_to_float64(l0, &env->sse_status); - d->ZMM_D(1) =3D int32_to_float64(l1, &env->sse_status); + int i; + for (i =3D 1 << SHIFT; --i >=3D 0; ) { + int32_t l =3D s->ZMM_L(i); + d->ZMM_D(i) =3D int32_to_float64(l, &env->sse_status); + } } =20 +#if SHIFT =3D=3D 1 void helper_cvtpi2ps(CPUX86State *env, ZMMReg *d, MMXReg *s) { d->ZMM_S(0) =3D int32_to_float32(s->MMX_L(0), &env->sse_status); @@ -726,8 +732,11 @@ void helper_cvtsq2sd(CPUX86State *env, ZMMReg *d, uint= 64_t val) } #endif =20 +#endif + /* float to integer */ =20 +#if SHIFT =3D=3D 1 /* * x86 mandates that we return the indefinite integer value for the result * of any float-to-integer conversion that raises the 'invalid' exception. 
@@ -758,22 +767,28 @@ WRAP_FLOATCONV(int64_t, float32_to_int64, float32, IN= T64_MIN) WRAP_FLOATCONV(int64_t, float32_to_int64_round_to_zero, float32, INT64_MIN) WRAP_FLOATCONV(int64_t, float64_to_int64, float64, INT64_MIN) WRAP_FLOATCONV(int64_t, float64_to_int64_round_to_zero, float64, INT64_MIN) +#endif =20 void glue(helper_cvtps2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { - d->ZMM_L(0) =3D x86_float32_to_int32(s->ZMM_S(0), &env->sse_status); - d->ZMM_L(1) =3D x86_float32_to_int32(s->ZMM_S(1), &env->sse_status); - d->ZMM_L(2) =3D x86_float32_to_int32(s->ZMM_S(2), &env->sse_status); - d->ZMM_L(3) =3D x86_float32_to_int32(s->ZMM_S(3), &env->sse_status); + int i; + for (i =3D 0; i < 2 << SHIFT; i++) { + d->ZMM_L(i) =3D x86_float32_to_int32(s->ZMM_S(i), &env->sse_status= ); + } } =20 void glue(helper_cvtpd2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { - d->ZMM_L(0) =3D x86_float64_to_int32(s->ZMM_D(0), &env->sse_status); - d->ZMM_L(1) =3D x86_float64_to_int32(s->ZMM_D(1), &env->sse_status); - d->ZMM_Q(1) =3D 0; + int i; + for (i =3D 0; i < 1 << SHIFT; i++) { + d->ZMM_L(i) =3D x86_float64_to_int32(s->ZMM_D(i), &env->sse_status= ); + } + for (i >>=3D 1; i < 1 << SHIFT; i++) { + d->Q(i) =3D 0; + } } =20 +#if SHIFT =3D=3D 1 void helper_cvtps2pi(CPUX86State *env, MMXReg *d, ZMMReg *s) { d->MMX_L(0) =3D x86_float32_to_int32(s->ZMM_S(0), &env->sse_status); @@ -807,23 +822,31 @@ int64_t helper_cvtsd2sq(CPUX86State *env, ZMMReg *s) return x86_float64_to_int64(s->ZMM_D(0), &env->sse_status); } #endif +#endif =20 /* float to integer truncated */ void glue(helper_cvttps2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { - d->ZMM_L(0) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(0), &env->= sse_status); - d->ZMM_L(1) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(1), &env->= sse_status); - d->ZMM_L(2) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(2), &env->= sse_status); - d->ZMM_L(3) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(3), &env->= 
sse_status); + int i; + for (i =3D 0; i < 2 << SHIFT; i++) { + d->ZMM_L(i) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(i), + &env->sse_status); + } } =20 void glue(helper_cvttpd2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { - d->ZMM_L(0) =3D x86_float64_to_int32_round_to_zero(s->ZMM_D(0), &env->= sse_status); - d->ZMM_L(1) =3D x86_float64_to_int32_round_to_zero(s->ZMM_D(1), &env->= sse_status); - d->ZMM_Q(1) =3D 0; + int i; + for (i =3D 0; i < 1 << SHIFT; i++) { + d->ZMM_L(i) =3D x86_float64_to_int32_round_to_zero(s->ZMM_D(i), + &env->sse_status); + } + for (i >>=3D 1; i < 1 << SHIFT; i++) { + d->Q(i) =3D 0; + } } =20 +#if SHIFT =3D=3D 1 void helper_cvttps2pi(CPUX86State *env, MMXReg *d, ZMMReg *s) { d->MMX_L(0) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(0), &env->= sse_status); @@ -857,6 +880,7 @@ int64_t helper_cvttsd2sq(CPUX86State *env, ZMMReg *s) return x86_float64_to_int64_round_to_zero(s->ZMM_D(0), &env->sse_statu= s); } #endif +#endif =20 void glue(helper_rsqrtps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { @@ -870,6 +894,7 @@ void glue(helper_rsqrtps, SUFFIX)(CPUX86State *env, ZMM= Reg *d, ZMMReg *s) set_float_exception_flags(old_flags, &env->sse_status); } =20 +#if SHIFT =3D=3D 1 void helper_rsqrtss(CPUX86State *env, ZMMReg *d, ZMMReg *s) { uint8_t old_flags =3D get_float_exception_flags(&env->sse_status); @@ -878,6 +903,7 @@ void helper_rsqrtss(CPUX86State *env, ZMMReg *d, ZMMReg= *s) &env->sse_status); set_float_exception_flags(old_flags, &env->sse_status); } +#endif =20 void glue(helper_rcpps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { @@ -889,13 +915,16 @@ void glue(helper_rcpps, SUFFIX)(CPUX86State *env, ZMM= Reg *d, ZMMReg *s) set_float_exception_flags(old_flags, &env->sse_status); } =20 +#if SHIFT =3D=3D 1 void helper_rcpss(CPUX86State *env, ZMMReg *d, ZMMReg *s) { uint8_t old_flags =3D get_float_exception_flags(&env->sse_status); d->ZMM_S(0) =3D float32_div(float32_one, s->ZMM_S(0), &env->sse_status= ); 
set_float_exception_flags(old_flags, &env->sse_status); } +#endif =20 +#if SHIFT =3D=3D 1 static inline uint64_t helper_extrq(uint64_t src, int shift, int len) { uint64_t mask; @@ -939,6 +968,7 @@ void helper_insertq_i(CPUX86State *env, ZMMReg *d, int = index, int length) { d->ZMM_Q(0) =3D helper_insertq(d->ZMM_Q(0), index, length); } +#endif =20 #define SSE_HELPER_HPS(name, F) \ void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ @@ -1061,6 +1091,7 @@ SSE_HELPER_CMP(cmpord, FPU_CMPQ, !FPU_UNORD) =20 #undef SSE_HELPER_CMP =20 +#if SHIFT =3D=3D 1 static const int comis_eflags[4] =3D {CC_C, CC_Z, 0, CC_Z | CC_P | CC_C}; =20 void helper_ucomiss(CPUX86State *env, Reg *d, Reg *s) @@ -1106,25 +1137,30 @@ void helper_comisd(CPUX86State *env, Reg *d, Reg *s) ret =3D float64_compare(d0, d1, &env->sse_status); CC_SRC =3D comis_eflags[ret + 1]; } +#endif =20 uint32_t glue(helper_movmskps, SUFFIX)(CPUX86State *env, Reg *s) { - int b0, b1, b2, b3; + uint32_t mask; + int i; =20 - b0 =3D s->ZMM_L(0) >> 31; - b1 =3D s->ZMM_L(1) >> 31; - b2 =3D s->ZMM_L(2) >> 31; - b3 =3D s->ZMM_L(3) >> 31; - return b0 | (b1 << 1) | (b2 << 2) | (b3 << 3); + mask =3D 0; + for (i =3D 0; i < 2 << SHIFT; i++) { + mask |=3D (s->ZMM_L(i) >> (31 - i)) & (1 << i); + } + return mask; } =20 uint32_t glue(helper_movmskpd, SUFFIX)(CPUX86State *env, Reg *s) { - int b0, b1; + uint32_t mask; + int i; =20 - b0 =3D s->ZMM_L(1) >> 31; - b1 =3D s->ZMM_L(3) >> 31; - return b0 | (b1 << 1); + mask =3D 0; + for (i =3D 0; i < 1 << SHIFT; i++) { + mask |=3D (s->ZMM_Q(i) >> (63 - i)) & (1 << i); + } + return mask; } =20 #endif @@ -1748,6 +1784,7 @@ SSE_HELPER_L(helper_pmaxud, MAX) #define FMULLD(d, s) ((int32_t)d * (int32_t)s) SSE_HELPER_L(helper_pmulld, FMULLD) =20 +#if SHIFT =3D=3D 1 void glue(helper_phminposuw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { int idx =3D 0; @@ -1779,12 +1816,14 @@ void glue(helper_phminposuw, SUFFIX)(CPUX86State *e= nv, Reg *d, Reg *s) d->L(1) =3D 0; d->Q(1) =3D 0; } 
+#endif =20 void glue(helper_roundps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t mode) { uint8_t old_flags =3D get_float_exception_flags(&env->sse_status); signed char prev_rounding_mode; + int i; =20 prev_rounding_mode =3D env->sse_status.float_rounding_mode; if (!(mode & (1 << 2))) { @@ -1804,10 +1843,9 @@ void glue(helper_roundps, SUFFIX)(CPUX86State *env, = Reg *d, Reg *s, } } =20 - d->ZMM_S(0) =3D float32_round_to_int(s->ZMM_S(0), &env->sse_status); - d->ZMM_S(1) =3D float32_round_to_int(s->ZMM_S(1), &env->sse_status); - d->ZMM_S(2) =3D float32_round_to_int(s->ZMM_S(2), &env->sse_status); - d->ZMM_S(3) =3D float32_round_to_int(s->ZMM_S(3), &env->sse_status); + for (i =3D 0; i < 2 << SHIFT; i++) { + d->ZMM_S(i) =3D float32_round_to_int(s->ZMM_S(i), &env->sse_status= ); + } =20 if (mode & (1 << 3) && !(old_flags & float_flag_inexact)) { set_float_exception_flags(get_float_exception_flags(&env->sse_stat= us) & @@ -1822,6 +1860,7 @@ void glue(helper_roundpd, SUFFIX)(CPUX86State *env, R= eg *d, Reg *s, { uint8_t old_flags =3D get_float_exception_flags(&env->sse_status); signed char prev_rounding_mode; + int i; =20 prev_rounding_mode =3D env->sse_status.float_rounding_mode; if (!(mode & (1 << 2))) { @@ -1841,8 +1880,9 @@ void glue(helper_roundpd, SUFFIX)(CPUX86State *env, R= eg *d, Reg *s, } } =20 - d->ZMM_D(0) =3D float64_round_to_int(s->ZMM_D(0), &env->sse_status); - d->ZMM_D(1) =3D float64_round_to_int(s->ZMM_D(1), &env->sse_status); + for (i =3D 0; i < 1 << SHIFT; i++) { + d->ZMM_D(i) =3D float64_round_to_int(s->ZMM_D(i), &env->sse_status= ); + } =20 if (mode & (1 << 3) && !(old_flags & float_flag_inexact)) { set_float_exception_flags(get_float_exception_flags(&env->sse_stat= us) & @@ -1852,6 +1892,7 @@ void glue(helper_roundpd, SUFFIX)(CPUX86State *env, R= eg *d, Reg *s, env->sse_status.float_rounding_mode =3D prev_rounding_mode; } =20 +#if SHIFT =3D=3D 1 void glue(helper_roundss, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t mode) { @@ -1919,6 
+1960,7 @@ void glue(helper_roundsd, SUFFIX)(CPUX86State *env, R= eg *d, Reg *s, } env->sse_status.float_rounding_mode =3D prev_rounding_mode; } +#endif =20 #define FBLENDP(d, s, m) (m ? s : d) SSE_HELPER_I(helper_blendps, L, 4, FBLENDP) @@ -2020,6 +2062,7 @@ void glue(helper_mpsadbw, SUFFIX)(CPUX86State *env, R= eg *d, Reg *s, #define FCMPGTQ(d, s) ((int64_t)d > (int64_t)s ? -1 : 0) SSE_HELPER_Q(helper_pcmpgtq, FCMPGTQ) =20 +#if SHIFT =3D=3D 1 static inline int pcmp_elen(CPUX86State *env, int reg, uint32_t ctrl) { target_long val, limit; @@ -2240,6 +2283,8 @@ target_ulong helper_crc32(uint32_t crc1, target_ulong= msg, uint32_t len) return crc; } =20 +#endif + void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t ctrl) { --=20 2.37.1
From nobody Mon Feb 9 20:36:49 2026
From: Paolo Bonzini
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org, paul@nowt.org
Subject: [PATCH 21/23] i386: Rewrite blendv helpers
Date: Sat, 27 Aug 2022 01:12:02 +0200
Message-Id: <20220826231204.201395-22-pbonzini@redhat.com>
X-Mailer: git-send-email 2.37.1
In-Reply-To: <20220826231204.201395-1-pbonzini@redhat.com>
References: <20220826231204.201395-1-pbonzini@redhat.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: Paul Brook

Rewrite the blendv helpers so that they can easily be extended to support the AVX encodings, which make all 4 arguments explicit.
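To make the "all 4 arguments explicit" point concrete, here is a small standalone sketch (not code from this patch). It assumes a hypothetical `VecReg` union, hypothetical function names, and a runtime `shift` argument in place of the build-time SHIFT (1 for 128-bit, 2 for 256-bit). The legacy SSE4.1 form is then just the special case where the first source is the destination itself and the mask comes from XMM0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 256-bit vector register, viewed as lanes. */
typedef union {
    uint16_t w[16];
    uint32_t l[8];
} VecReg;

/*
 * Variable blend on 32-bit lanes with destination, both sources and the
 * mask passed separately: take s where the mask lane's sign bit is set,
 * otherwise take v.
 */
static void blendvps(VecReg *d, const VecReg *v, const VecReg *s,
                     const VecReg *m, int shift)
{
    assert(shift == 1 || shift == 2);
    for (int i = 0; i < (2 << shift); i++) {
        d->l[i] = (m->l[i] & 0x80000000u) ? s->l[i] : v->l[i];
    }
}

/* Legacy two-operand form: v aliases d and the mask is implicit (XMM0). */
static void blendvps_sse(VecReg *d, const VecReg *s, const VecReg *xmm0)
{
    blendvps(d, d, s, xmm0, 1);
}

/* Immediate blend on 16-bit lanes: bit (i & 7) of imm selects lane i. */
static void pblendw(VecReg *d, const VecReg *v, const VecReg *s,
                    uint32_t imm, int shift)
{
    assert(shift == 1 || shift == 2);
    for (int i = 0; i < (4 << shift); i++) {
        int j = i & 7; /* the 8 immediate bits repeat for each 128-bit half */
        d->w[i] = ((imm >> j) & 1) ? s->w[i] : v->w[i];
    }
}
```

Reading `v` and writing `d` lane by lane keeps the loop safe when the two pointers alias, which is the case for the legacy encoding.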
No functional changes to the existing helpers Signed-off-by: Paul Brook Message-Id: <20220424220204.2493824-20-paul@nowt.org> Reviewed-by: Richard Henderson Signed-off-by: Paolo Bonzini --- target/i386/ops_sse.h | 86 ++++++++++++------------------------------- 1 file changed, 24 insertions(+), 62 deletions(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index 84bbae4b9a..f9cc1d7623 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -1627,76 +1627,38 @@ void glue(helper_palignr, SUFFIX)(CPUX86State *env,= Reg *d, Reg *s, } } =20 -#define XMM0 (env->xmm_regs[0]) +#if SHIFT >=3D 1 =20 -#if SHIFT =3D=3D 1 #define SSE_HELPER_V(name, elem, num, F) \ - void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ + void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ { \ - d->elem(0) =3D F(d->elem(0), s->elem(0), XMM0.elem(0)); \ - d->elem(1) =3D F(d->elem(1), s->elem(1), XMM0.elem(1)); \ - if (num > 2) { \ - d->elem(2) =3D F(d->elem(2), s->elem(2), XMM0.elem(2)); \ - d->elem(3) =3D F(d->elem(3), s->elem(3), XMM0.elem(3)); \ - if (num > 4) { \ - d->elem(4) =3D F(d->elem(4), s->elem(4), XMM0.elem(4)); \ - d->elem(5) =3D F(d->elem(5), s->elem(5), XMM0.elem(5)); \ - d->elem(6) =3D F(d->elem(6), s->elem(6), XMM0.elem(6)); \ - d->elem(7) =3D F(d->elem(7), s->elem(7), XMM0.elem(7)); \ - if (num > 8) { \ - d->elem(8) =3D F(d->elem(8), s->elem(8), XMM0.elem(8))= ; \ - d->elem(9) =3D F(d->elem(9), s->elem(9), XMM0.elem(9))= ; \ - d->elem(10) =3D F(d->elem(10), s->elem(10), XMM0.elem(= 10)); \ - d->elem(11) =3D F(d->elem(11), s->elem(11), XMM0.elem(= 11)); \ - d->elem(12) =3D F(d->elem(12), s->elem(12), XMM0.elem(= 12)); \ - d->elem(13) =3D F(d->elem(13), s->elem(13), XMM0.elem(= 13)); \ - d->elem(14) =3D F(d->elem(14), s->elem(14), XMM0.elem(= 14)); \ - d->elem(15) =3D F(d->elem(15), s->elem(15), XMM0.elem(= 15)); \ - } \ - } \ + Reg *v =3D d; \ + Reg *m =3D &env->xmm_regs[0]; \ + int i; \ + for (i =3D 0; i < num; i++) { \ + d->elem(i) =3D 
F(v->elem(i), s->elem(i), m->elem(i)); \ } \ } =20 #define SSE_HELPER_I(name, elem, num, F) \ - void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t imm= ) \ + void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, \ + uint32_t imm) \ { \ - d->elem(0) =3D F(d->elem(0), s->elem(0), ((imm >> 0) & 1)); \ - d->elem(1) =3D F(d->elem(1), s->elem(1), ((imm >> 1) & 1)); \ - if (num > 2) { \ - d->elem(2) =3D F(d->elem(2), s->elem(2), ((imm >> 2) & 1)); \ - d->elem(3) =3D F(d->elem(3), s->elem(3), ((imm >> 3) & 1)); \ - if (num > 4) { \ - d->elem(4) =3D F(d->elem(4), s->elem(4), ((imm >> 4) & 1))= ; \ - d->elem(5) =3D F(d->elem(5), s->elem(5), ((imm >> 5) & 1))= ; \ - d->elem(6) =3D F(d->elem(6), s->elem(6), ((imm >> 6) & 1))= ; \ - d->elem(7) =3D F(d->elem(7), s->elem(7), ((imm >> 7) & 1))= ; \ - if (num > 8) { \ - d->elem(8) =3D F(d->elem(8), s->elem(8), ((imm >> 8) &= 1)); \ - d->elem(9) =3D F(d->elem(9), s->elem(9), ((imm >> 9) &= 1)); \ - d->elem(10) =3D F(d->elem(10), s->elem(10), \ - ((imm >> 10) & 1)); \ - d->elem(11) =3D F(d->elem(11), s->elem(11), \ - ((imm >> 11) & 1)); \ - d->elem(12) =3D F(d->elem(12), s->elem(12), \ - ((imm >> 12) & 1)); \ - d->elem(13) =3D F(d->elem(13), s->elem(13), \ - ((imm >> 13) & 1)); \ - d->elem(14) =3D F(d->elem(14), s->elem(14), \ - ((imm >> 14) & 1)); \ - d->elem(15) =3D F(d->elem(15), s->elem(15), \ - ((imm >> 15) & 1)); \ - } \ - } \ + Reg *v =3D d; \ + int i; \ + for (i =3D 0; i < num; i++) { \ + int j =3D i & 7; \ + d->elem(i) =3D F(v->elem(i), s->elem(i), (imm >> j) & 1); \ } \ } =20 /* SSE4.1 op helpers */ -#define FBLENDVB(d, s, m) ((m & 0x80) ? s : d) -#define FBLENDVPS(d, s, m) ((m & 0x80000000) ? s : d) -#define FBLENDVPD(d, s, m) ((m & 0x8000000000000000LL) ? s : d) -SSE_HELPER_V(helper_pblendvb, B, 16, FBLENDVB) -SSE_HELPER_V(helper_blendvps, L, 4, FBLENDVPS) -SSE_HELPER_V(helper_blendvpd, Q, 2, FBLENDVPD) +#define FBLENDVB(v, s, m) ((m & 0x80) ? s : v) +#define FBLENDVPS(v, s, m) ((m & 0x80000000) ? 
s : v) +#define FBLENDVPD(v, s, m) ((m & 0x8000000000000000LL) ? s : v) +SSE_HELPER_V(helper_pblendvb, B, 8 << SHIFT, FBLENDVB) +SSE_HELPER_V(helper_blendvps, L, 2 << SHIFT, FBLENDVPS) +SSE_HELPER_V(helper_blendvpd, Q, 1 << SHIFT, FBLENDVPD) =20 void glue(helper_ptest, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { @@ -1962,10 +1924,10 @@ void glue(helper_roundsd, SUFFIX)(CPUX86State *env,= Reg *d, Reg *s, } #endif =20 -#define FBLENDP(d, s, m) (m ? s : d) -SSE_HELPER_I(helper_blendps, L, 4, FBLENDP) -SSE_HELPER_I(helper_blendpd, Q, 2, FBLENDP) -SSE_HELPER_I(helper_pblendw, W, 8, FBLENDP) +#define FBLENDP(v, s, m) (m ? s : v) +SSE_HELPER_I(helper_blendps, L, 2 << SHIFT, FBLENDP) +SSE_HELPER_I(helper_blendpd, Q, 1 << SHIFT, FBLENDP) +SSE_HELPER_I(helper_pblendw, W, 4 << SHIFT, FBLENDP) =20 void glue(helper_dpps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t mask) --=20 2.37.1
From nobody Mon Feb 9 20:36:49 2026
From: Paolo Bonzini
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org, paul@nowt.org
Subject: [PATCH 22/23] i386: AVX pclmulqdq prep
Date: Sat, 27 Aug 2022 01:12:03 +0200
Message-Id: <20220826231204.201395-23-pbonzini@redhat.com>
X-Mailer: git-send-email 2.37.1
In-Reply-To: <20220826231204.201395-1-pbonzini@redhat.com>
References: <20220826231204.201395-1-pbonzini@redhat.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: Paul Brook

Make the pclmulqdq helper AVX ready

Signed-off-by: Paul Brook
Message-Id: <20220424220204.2493824-21-paul@nowt.org>
Reviewed-by: Richard Henderson
Signed-off-by: Paolo Bonzini
---
target/i386/ops_sse.h | 29 ++++++++++++++++++++++-------
1 file changed, 22 insertions(+), 7 deletions(-)
diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index f9cc1d7623..2b35b6e533 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -2247,14 +2247,14 @@ target_ulong helper_crc32(uint32_t crc1, target_ulo= ng msg, uint32_t len) =20 #endif =20 -void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, - uint32_t ctrl) +#if SHIFT =3D=3D 1 +static void clmulq(uint64_t *dest_l, uint64_t *dest_h, + uint64_t a, uint64_t b) { - uint64_t ah, al, b,
resh, resl; + uint64_t al, ah, resh, resl; =20 ah =3D 0; - al =3D d->Q((ctrl & 1) !=3D 0); - b =3D s->Q((ctrl & 16) !=3D 0); + al =3D a; resh =3D resl =3D 0; =20 while (b) { @@ -2267,8 +2267,23 @@ void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *env= , Reg *d, Reg *s, b >>=3D 1; } =20 - d->Q(0) =3D resl; - d->Q(1) =3D resh; + *dest_l =3D resl; + *dest_h =3D resh; +} +#endif + +void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, + uint32_t ctrl) +{ + Reg *v =3D d; + uint64_t a, b; + int i; + + for (i =3D 0; i < 1 << SHIFT; i +=3D 2) { + a =3D v->Q(((ctrl & 1) !=3D 0) + i); + b =3D s->Q(((ctrl & 16) !=3D 0) + i); + clmulq(&d->Q(i), &d->Q(i + 1), a, b); + } } =20 void glue(helper_aesdec, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) --=20 2.37.1
From nobody Mon Feb 9 20:36:49 2026
From: Paolo Bonzini
To: qemu-devel@nongnu.org
Cc: richard.henderson@linaro.org, paul@nowt.org
Subject: [PATCH 23/23] i386: AVX+AES helpers prep
Date: Sat, 27 Aug 2022 01:12:04 +0200
Message-Id: <20220826231204.201395-24-pbonzini@redhat.com>
X-Mailer: git-send-email 2.37.1
In-Reply-To: <20220826231204.201395-1-pbonzini@redhat.com>
References: <20220826231204.201395-1-pbonzini@redhat.com>
MIME-Version: 1.0
Content-Transfer-Encoding:
quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=pbonzini@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1661556253718100001 Content-Type: text/plain; charset="utf-8" From: Paul Brook Make the AES vector helpers AVX ready No functional changes to existing helpers Signed-off-by: Paul Brook Message-Id: <20220424220204.2493824-22-paul@nowt.org> Signed-off-by: Paolo Bonzini Reviewed-by: Richard Henderson --- target/i386/ops_sse.h | 49 +++++++++++++++++++++++-------------------- 1 file changed, 26 insertions(+), 23 deletions(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index 2b35b6e533..a42d7b26ba 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -2289,64 +2289,66 @@ void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *en= v, Reg *d, Reg *s, void glue(helper_aesdec, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { int i; - Reg st =3D *d; + Reg st =3D *d; // v Reg rk =3D *s; =20 - for (i =3D 0 ; i < 4 ; i++) { - d->L(i) =3D rk.L(i) ^ bswap32(AES_Td0[st.B(AES_ishifts[4*i+0])] ^ - AES_Td1[st.B(AES_ishifts[4*i+1])] ^ - AES_Td2[st.B(AES_ishifts[4*i+2])] ^ - 
AES_Td3[st.B(AES_ishifts[4*i+3])]); + for (i =3D 0 ; i < 2 << SHIFT ; i++) { + int j =3D i & 3; + d->L(i) =3D rk.L(i) ^ bswap32(AES_Td0[st.B(AES_ishifts[4 * j + 0])= ] ^ + AES_Td1[st.B(AES_ishifts[4 * j + 1])] ^ + AES_Td2[st.B(AES_ishifts[4 * j + 2])] ^ + AES_Td3[st.B(AES_ishifts[4 * j + 3])]); } } =20 void glue(helper_aesdeclast, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { int i; - Reg st =3D *d; + Reg st =3D *d; // v Reg rk =3D *s; =20 - for (i =3D 0; i < 16; i++) { - d->B(i) =3D rk.B(i) ^ (AES_isbox[st.B(AES_ishifts[i])]); + for (i =3D 0; i < 8 << SHIFT; i++) { + d->B(i) =3D rk.B(i) ^ (AES_isbox[st.B(AES_ishifts[i & 15] + (i & ~= 15))]); } } =20 void glue(helper_aesenc, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { int i; - Reg st =3D *d; + Reg st =3D *d; // v Reg rk =3D *s; =20 - for (i =3D 0 ; i < 4 ; i++) { - d->L(i) =3D rk.L(i) ^ bswap32(AES_Te0[st.B(AES_shifts[4*i+0])] ^ - AES_Te1[st.B(AES_shifts[4*i+1])] ^ - AES_Te2[st.B(AES_shifts[4*i+2])] ^ - AES_Te3[st.B(AES_shifts[4*i+3])]); + for (i =3D 0 ; i < 2 << SHIFT ; i++) { + int j =3D i & 3; + d->L(i) =3D rk.L(i) ^ bswap32(AES_Te0[st.B(AES_shifts[4 * j + 0])]= ^ + AES_Te1[st.B(AES_shifts[4 * j + 1])] ^ + AES_Te2[st.B(AES_shifts[4 * j + 2])] ^ + AES_Te3[st.B(AES_shifts[4 * j + 3])]); } } =20 void glue(helper_aesenclast, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { int i; - Reg st =3D *d; + Reg st =3D *d; // v Reg rk =3D *s; =20 - for (i =3D 0; i < 16; i++) { - d->B(i) =3D rk.B(i) ^ (AES_sbox[st.B(AES_shifts[i])]); + for (i =3D 0; i < 8 << SHIFT; i++) { + d->B(i) =3D rk.B(i) ^ (AES_sbox[st.B(AES_shifts[i & 15] + (i & ~15= ))]); } - } =20 +#if SHIFT =3D=3D 1 void glue(helper_aesimc, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { int i; Reg tmp =3D *s; =20 for (i =3D 0 ; i < 4 ; i++) { - d->L(i) =3D bswap32(AES_imc[tmp.B(4*i+0)][0] ^ - AES_imc[tmp.B(4*i+1)][1] ^ - AES_imc[tmp.B(4*i+2)][2] ^ - AES_imc[tmp.B(4*i+3)][3]); + d->L(i) =3D bswap32(AES_imc[tmp.B(4 * i + 0)][0] ^ + AES_imc[tmp.B(4 * i + 1)][1] ^ + 
AES_imc[tmp.B(4 * i + 2)][2] ^ + AES_imc[tmp.B(4 * i + 3)][3]); } } =20 @@ -2364,6 +2366,7 @@ void glue(helper_aeskeygenassist, SUFFIX)(CPUX86State= *env, Reg *d, Reg *s, d->L(3) =3D (d->L(2) << 24 | d->L(2) >> 8) ^ ctrl; } #endif +#endif =20 #undef SSE_HELPER_S =20 --=20 2.37.1