From nobody Mon Feb 9 15:46:37 2026
From: Paolo Bonzini
To: qemu-devel@nongnu.org
Cc: paul@nowt.org, richard.henderson@linaro.org
Subject: [PATCH v3 01/23] i386: do not use MOVL to move data between SSE registers
Date: Thu, 1 Sep 2022 09:48:20 +0200
Message-Id: <20220901074842.57424-2-pbonzini@redhat.com>
In-Reply-To: <20220901074842.57424-1-pbonzini@redhat.com>
References: <20220901074842.57424-1-pbonzini@redhat.com>

Write down explicitly the load/store sequence.

Extracted from a patch by Paul Brook.

Reviewed-by: Richard Henderson
Signed-off-by: Paolo Bonzini
---
 target/i386/tcg/translate.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index b7972f0ff5..3237c1d8f9 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3295,8 +3295,10 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                                 offsetof(CPUX86State, xmm_regs[reg].ZMM_L(3)));
             } else {
                 rm = (modrm & 7) | REX_B(s);
-                gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0)),
-                            offsetof(CPUX86State,xmm_regs[rm].ZMM_L(0)));
+                tcg_gen_ld_i32(s->tmp2_i32, cpu_env,
+                               offsetof(CPUX86State, xmm_regs[rm].ZMM_L(0)));
+                tcg_gen_st_i32(s->tmp2_i32, cpu_env,
+                               offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0)));
             }
             break;
         case 0x310: /* movsd xmm, ea */
-- 
2.37.1

From nobody Mon Feb 9 15:46:37 2026
From: Paolo Bonzini
To: qemu-devel@nongnu.org
Cc: paul@nowt.org, richard.henderson@linaro.org
Subject: [PATCH v3 02/23] i386: formatting fixes
Date: Thu, 1 Sep 2022 09:48:21 +0200
Message-Id: <20220901074842.57424-3-pbonzini@redhat.com>
In-Reply-To: <20220901074842.57424-1-pbonzini@redhat.com>
References: <20220901074842.57424-1-pbonzini@redhat.com>

Extracted from a patch by Paul Brook.

Reviewed-by: Richard Henderson
Signed-off-by: Paolo Bonzini
---
 target/i386/tcg/translate.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 3237c1d8f9..25a2539d59 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3314,7 +3314,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             } else {
                 rm = (modrm & 7) | REX_B(s);
                 gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0)),
-                            offsetof(CPUX86State,xmm_regs[rm].ZMM_Q(0)));
+                            offsetof(CPUX86State, xmm_regs[rm].ZMM_Q(0)));
             }
             break;
         case 0x012: /* movlps */
@@ -4463,7 +4463,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     /* 32 bit access */
                     gen_op_ld_v(s, MO_32, s->T0, s->A0);
                     tcg_gen_st32_tl(s->T0, cpu_env,
-                                    offsetof(CPUX86State,xmm_t0.ZMM_L(0)));
+                                    offsetof(CPUX86State, xmm_t0.ZMM_L(0)));
                     break;
                 case 3:
                     /* 64 bit access */
@@ -4523,8 +4523,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             break;
         case 0xf7:
             /* maskmov : we must prepare A0 */
-            if (mod != 3)
+            if (mod != 3) {
                 goto illegal_op;
+            }
             tcg_gen_mov_tl(s->A0, cpu_regs[R_EDI]);
             gen_extu(s->aflag, s->A0);
             gen_add_A0_ds_seg(s);
-- 
2.37.1

From nobody Mon Feb 9 15:46:37 2026
From: Paolo Bonzini
To: qemu-devel@nongnu.org
Cc: paul@nowt.org, richard.henderson@linaro.org
Subject: [PATCH v3 03/23] i386: Add ZMM_OFFSET macro
Date: Thu, 1 Sep 2022 09:48:22 +0200
Message-Id: <20220901074842.57424-4-pbonzini@redhat.com>
In-Reply-To: <20220901074842.57424-1-pbonzini@redhat.com>
References: <20220901074842.57424-1-pbonzini@redhat.com>

From: Paul Brook

Add a convenience macro to get the address of an xmm_regs element within
CPUX86State.

This was originally going to be the basis of an implementation that broke
operations into 128 bit chunks. I scrapped that idea, so this is now a purely
cosmetic change. But I think a worthwhile one - it reduces the number of
function calls that need to be split over multiple lines.

No functional changes.

Signed-off-by: Paul Brook
Reviewed-by: Richard Henderson
Message-Id: <20220424220204.2493824-9-paul@nowt.org>
Signed-off-by: Paolo Bonzini
---
 target/i386/tcg/translate.c | 60 +++++++++++++++++--------------------
 1 file changed, 27 insertions(+), 33 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 25a2539d59..cba862746b 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2777,6 +2777,8 @@ static inline void gen_op_movq_env_0(DisasContext *s, int d_offset)
     tcg_gen_st_i64(s->tmp1_i64, cpu_env, d_offset);
 }
 
+#define ZMM_OFFSET(reg) offsetof(CPUX86State, xmm_regs[reg])
+
 typedef void (*SSEFunc_i_ep)(TCGv_i32 val, TCGv_ptr env, TCGv_ptr reg);
 typedef void (*SSEFunc_l_ep)(TCGv_i64 val, TCGv_ptr env, TCGv_ptr reg);
 typedef void (*SSEFunc_0_epi)(TCGv_ptr env, TCGv_ptr reg, TCGv_i32 val);
@@ -3198,13 +3200,13 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             if (mod == 3)
                 goto illegal_op;
             gen_lea_modrm(env, s, modrm);
-            gen_sto_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
+            gen_sto_env_A0(s, ZMM_OFFSET(reg));
             break;
         case 0x3f0: /* lddqu */
             if (mod == 3)
                 goto illegal_op;
             gen_lea_modrm(env, s, modrm);
-            gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
+            gen_ldo_env_A0(s, ZMM_OFFSET(reg));
             break;
         case 0x22b: /* movntss */
         case 0x32b: /* movntsd */
@@ -3240,15 +3242,13 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
 #ifdef TARGET_X86_64
             if (s->dflag == MO_64) {
                 gen_ldst_modrm(env, s, modrm, MO_64, OR_TMP0, 0);
-                tcg_gen_addi_ptr(s->ptr0, cpu_env,
-                                 offsetof(CPUX86State,xmm_regs[reg]));
+                tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(reg));
                 gen_helper_movq_mm_T0_xmm(s->ptr0, s->T0);
             } else
 #endif
             {
                 gen_ldst_modrm(env, s, modrm, MO_32, OR_TMP0, 0);
-                tcg_gen_addi_ptr(s->ptr0, cpu_env,
-                                 offsetof(CPUX86State,xmm_regs[reg]));
+                tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(reg));
                 tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
                 gen_helper_movl_mm_T0_xmm(s->ptr0, s->tmp2_i32);
             }
@@ -3273,11 +3273,10 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         case 0x26f: /* movdqu xmm, ea */
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
-                gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
+                gen_ldo_env_A0(s, ZMM_OFFSET(reg));
             } else {
                 rm = (modrm & 7) | REX_B(s);
-                gen_op_movo(s, offsetof(CPUX86State, xmm_regs[reg]),
-                            offsetof(CPUX86State,xmm_regs[rm]));
+                gen_op_movo(s, ZMM_OFFSET(reg), ZMM_OFFSET(rm));
             }
             break;
         case 0x210: /* movss xmm, ea */
@@ -3333,7 +3332,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         case 0x212: /* movsldup */
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
-                gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
+                gen_ldo_env_A0(s, ZMM_OFFSET(reg));
             } else {
                 rm = (modrm & 7) | REX_B(s);
                 gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(0)),
@@ -3375,7 +3374,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         case 0x216: /* movshdup */
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
-                gen_ldo_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
+                gen_ldo_env_A0(s, ZMM_OFFSET(reg));
             } else {
                 rm = (modrm & 7) | REX_B(s);
                 gen_op_movl(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_L(1)),
@@ -3397,8 +3396,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     goto illegal_op;
                 field_length = x86_ldub_code(env, s) & 0x3F;
                 bit_index = x86_ldub_code(env, s) & 0x3F;
-                tcg_gen_addi_ptr(s->ptr0, cpu_env,
-                                 offsetof(CPUX86State,xmm_regs[reg]));
+                tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(reg));
                 if (b1 == 1)
                     gen_helper_extrq_i(cpu_env, s->ptr0,
                                        tcg_const_i32(bit_index),
@@ -3467,11 +3465,10 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         case 0x27f: /* movdqu ea, xmm */
             if (mod != 3) {
                 gen_lea_modrm(env, s, modrm);
-                gen_sto_env_A0(s, offsetof(CPUX86State, xmm_regs[reg]));
+                gen_sto_env_A0(s, ZMM_OFFSET(reg));
             } else {
                 rm = (modrm & 7) | REX_B(s);
-                gen_op_movo(s, offsetof(CPUX86State, xmm_regs[rm]),
-                            offsetof(CPUX86State,xmm_regs[reg]));
+                gen_op_movo(s, ZMM_OFFSET(rm), ZMM_OFFSET(reg));
             }
             break;
         case 0x211: /* movss ea, xmm */
@@ -3549,7 +3546,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             }
             if (is_xmm) {
                 rm = (modrm & 7) | REX_B(s);
-                op2_offset = offsetof(CPUX86State,xmm_regs[rm]);
+                op2_offset = ZMM_OFFSET(rm);
             } else {
                 rm = (modrm & 7);
                 op2_offset = offsetof(CPUX86State,fpregs[rm].mmx);
@@ -3560,15 +3557,13 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             break;
         case 0x050: /* movmskps */
             rm = (modrm & 7) | REX_B(s);
-            tcg_gen_addi_ptr(s->ptr0, cpu_env,
-                             offsetof(CPUX86State,xmm_regs[rm]));
+            tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(rm));
             gen_helper_movmskps(s->tmp2_i32, cpu_env, s->ptr0);
             tcg_gen_extu_i32_tl(cpu_regs[reg], s->tmp2_i32);
             break;
         case 0x150: /* movmskpd */
             rm = (modrm & 7) | REX_B(s);
-            tcg_gen_addi_ptr(s->ptr0, cpu_env,
-                             offsetof(CPUX86State,xmm_regs[rm]));
+            tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(rm));
             gen_helper_movmskpd(s->tmp2_i32, cpu_env, s->ptr0);
             tcg_gen_extu_i32_tl(cpu_regs[reg], s->tmp2_i32);
             break;
@@ -3583,7 +3578,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 rm = (modrm & 7);
                 op2_offset = offsetof(CPUX86State,fpregs[rm].mmx);
             }
-            op1_offset = offsetof(CPUX86State,xmm_regs[reg]);
+            op1_offset = ZMM_OFFSET(reg);
             tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
             tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
             switch(b >> 8) {
@@ -3600,7 +3595,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         case 0x32a: /* cvtsi2sd */
             ot = mo_64_32(s->dflag);
             gen_ldst_modrm(env, s, modrm, ot, OR_TMP0, 0);
-            op1_offset = offsetof(CPUX86State,xmm_regs[reg]);
+            op1_offset = ZMM_OFFSET(reg);
             tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
             if (ot == MO_32) {
                 SSEFunc_0_epi sse_fn_epi = sse_op_table3ai[(b >> 8) & 1];
@@ -3626,7 +3621,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 gen_ldo_env_A0(s, op2_offset);
             } else {
                 rm = (modrm & 7) | REX_B(s);
-                op2_offset = offsetof(CPUX86State,xmm_regs[rm]);
+                op2_offset = ZMM_OFFSET(rm);
             }
             op1_offset = offsetof(CPUX86State,fpregs[reg & 7].mmx);
             tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
@@ -3663,7 +3658,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 op2_offset = offsetof(CPUX86State,xmm_t0);
             } else {
                 rm = (modrm & 7) | REX_B(s);
-                op2_offset = offsetof(CPUX86State,xmm_regs[rm]);
+                op2_offset = ZMM_OFFSET(rm);
             }
             tcg_gen_addi_ptr(s->ptr0, cpu_env, op2_offset);
             if (ot == MO_32) {
@@ -3749,8 +3744,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 goto illegal_op;
             if (b1) {
                 rm = (modrm & 7) | REX_B(s);
-                tcg_gen_addi_ptr(s->ptr0, cpu_env,
-                                 offsetof(CPUX86State, xmm_regs[rm]));
+                tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(rm));
                 gen_helper_pmovmskb_xmm(s->tmp2_i32, cpu_env, s->ptr0);
             } else {
                 rm = (modrm & 7);
@@ -3782,9 +3776,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 goto illegal_op;
 
             if (b1) {
-                op1_offset = offsetof(CPUX86State,xmm_regs[reg]);
+                op1_offset = ZMM_OFFSET(reg);
                 if (mod == 3) {
-                    op2_offset = offsetof(CPUX86State,xmm_regs[rm | REX_B(s)]);
+                    op2_offset = ZMM_OFFSET(rm | REX_B(s));
                 } else {
                     op2_offset = offsetof(CPUX86State,xmm_t0);
                     gen_lea_modrm(env, s, modrm);
@@ -4347,9 +4341,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             }
 
             if (b1) {
-                op1_offset = offsetof(CPUX86State,xmm_regs[reg]);
+                op1_offset = ZMM_OFFSET(reg);
                 if (mod == 3) {
-                    op2_offset = offsetof(CPUX86State,xmm_regs[rm | REX_B(s)]);
+                    op2_offset = ZMM_OFFSET(rm | REX_B(s));
                 } else {
                     op2_offset = offsetof(CPUX86State,xmm_t0);
                     gen_lea_modrm(env, s, modrm);
@@ -4429,7 +4423,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             break;
         }
         if (is_xmm) {
-            op1_offset = offsetof(CPUX86State,xmm_regs[reg]);
+            op1_offset = ZMM_OFFSET(reg);
             if (mod != 3) {
                 int sz = 4;
 
@@ -4476,7 +4470,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                 }
             } else {
                 rm = (modrm & 7) | REX_B(s);
-                op2_offset = offsetof(CPUX86State,xmm_regs[rm]);
+                op2_offset = ZMM_OFFSET(rm);
             }
         } else {
             op1_offset = offsetof(CPUX86State,fpregs[reg].mmx);
-- 
2.37.1

From nobody Mon Feb 9 15:46:37 2026
From: Paolo Bonzini
To: qemu-devel@nongnu.org
Cc: paul@nowt.org, richard.henderson@linaro.org
Subject: [PATCH v3 04/23] i386: Rework sse_op_table1
Date: Thu, 1 Sep 2022 09:48:23 +0200
Message-Id: <20220901074842.57424-5-pbonzini@redhat.com>
In-Reply-To: <20220901074842.57424-1-pbonzini@redhat.com>
References: <20220901074842.57424-1-pbonzini@redhat.com>

From: Paul Brook

Add a flags field to each row in sse_op_table1.

Initially this is only used as a replacement for the magic
SSE_SPECIAL and SSE_DUMMY pointers; the other flags are mostly
relevant for the AVX implementation but can be applied to SSE as well.

Signed-off-by: Paul Brook
Message-Id: <20220424220204.2493824-5-paul@nowt.org>
Reviewed-by: Richard Henderson
Signed-off-by: Paolo Bonzini
---
 target/i386/tcg/translate.c | 311 +++++++++++++++++++++---------------
 1 file changed, 182 insertions(+), 129 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index cba862746b..7332bbcf44 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2790,146 +2790,193 @@ typedef void (*SSEFunc_0_ppi)(TCGv_ptr reg_a, TCGv_ptr reg_b, TCGv_i32 val);
 typedef void (*SSEFunc_0_eppt)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_b,
                                TCGv val);
 
-#define SSE_SPECIAL ((void *)1)
-#define SSE_DUMMY ((void *)2)
+#define SSE_OPF_CMP       (1 << 1) /* does not write for first operand */
+#define SSE_OPF_SPECIAL   (1 << 3) /* magic */
+#define SSE_OPF_3DNOW     (1 << 4) /* 3DNow!
instruction */ +#define SSE_OPF_MMX (1 << 5) /* MMX/integer/AVX2 instruction */ +#define SSE_OPF_SCALAR (1 << 6) /* Has SSE scalar variants */ +#define SSE_OPF_SHUF (1 << 9) /* pshufx/shufpx */ =20 -#define MMX_OP2(x) { gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm } -#define SSE_FOP(x) { gen_helper_ ## x ## ps, gen_helper_ ## x ## pd, \ - gen_helper_ ## x ## ss, gen_helper_ ## x ## sd, } +#define OP(op, flags, a, b, c, d) \ + {flags, {a, b, c, d} } =20 -static const SSEFunc_0_epp sse_op_table1[256][4] =3D { +#define MMX_OP(x) OP(op1, SSE_OPF_MMX, \ + gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm, NULL, NULL) + +#define SSE_FOP(name) OP(op1, SSE_OPF_SCALAR, \ + gen_helper_##name##ps, gen_helper_##name##pd, \ + gen_helper_##name##ss, gen_helper_##name##sd) +#define SSE_OP(sname, dname, op, flags) OP(op, flags, \ + gen_helper_##sname##_xmm, gen_helper_##dname##_xmm, NULL, NULL) + +struct SSEOpHelper_table1 { + int flags; + SSEFunc_0_epp op[4]; +}; + +#define SSE_3DNOW { SSE_OPF_3DNOW } +#define SSE_SPECIAL { SSE_OPF_SPECIAL } + +static const struct SSEOpHelper_table1 sse_op_table1[256] =3D { /* 3DNow! extensions */ - [0x0e] =3D { SSE_DUMMY }, /* femms */ - [0x0f] =3D { SSE_DUMMY }, /* pf... */ + [0x0e] =3D SSE_SPECIAL, /* femms */ + [0x0f] =3D SSE_3DNOW, /* pf... 
(sse_op_table5) */ /* pure SSE operations */ - [0x10] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* = movups, movupd, movss, movsd */ - [0x11] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* = movups, movupd, movss, movsd */ - [0x12] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* = movlps, movlpd, movsldup, movddup */ - [0x13] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* movlps, movlpd */ - [0x14] =3D { gen_helper_punpckldq_xmm, gen_helper_punpcklqdq_xmm }, - [0x15] =3D { gen_helper_punpckhdq_xmm, gen_helper_punpckhqdq_xmm }, - [0x16] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* movhps, movh= pd, movshdup */ - [0x17] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* movhps, movhpd */ + [0x10] =3D SSE_SPECIAL, /* movups, movupd, movss, movsd */ + [0x11] =3D SSE_SPECIAL, /* movups, movupd, movss, movsd */ + [0x12] =3D SSE_SPECIAL, /* movlps, movlpd, movsldup, movddup */ + [0x13] =3D SSE_SPECIAL, /* movlps, movlpd */ + [0x14] =3D SSE_OP(punpckldq, punpcklqdq, op1, 0), /* unpcklps, unpcklp= d */ + [0x15] =3D SSE_OP(punpckhdq, punpckhqdq, op1, 0), /* unpckhps, unpckhp= d */ + [0x16] =3D SSE_SPECIAL, /* movhps, movhpd, movshdup */ + [0x17] =3D SSE_SPECIAL, /* movhps, movhpd */ =20 - [0x28] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* movaps, movapd */ - [0x29] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* movaps, movapd */ - [0x2a] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* = cvtpi2ps, cvtpi2pd, cvtsi2ss, cvtsi2sd */ - [0x2b] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* = movntps, movntpd, movntss, movntsd */ - [0x2c] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* = cvttps2pi, cvttpd2pi, cvttsd2si, cvttss2si */ - [0x2d] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* = cvtps2pi, cvtpd2pi, cvtsd2si, cvtss2si */ - [0x2e] =3D { gen_helper_ucomiss, gen_helper_ucomisd }, - [0x2f] =3D { gen_helper_comiss, gen_helper_comisd }, - [0x50] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* movmskps, movmskpd */ - 
[0x51] =3D SSE_FOP(sqrt), - [0x52] =3D { gen_helper_rsqrtps, NULL, gen_helper_rsqrtss, NULL }, - [0x53] =3D { gen_helper_rcpps, NULL, gen_helper_rcpss, NULL }, - [0x54] =3D { gen_helper_pand_xmm, gen_helper_pand_xmm }, /* andps, and= pd */ - [0x55] =3D { gen_helper_pandn_xmm, gen_helper_pandn_xmm }, /* andnps, = andnpd */ - [0x56] =3D { gen_helper_por_xmm, gen_helper_por_xmm }, /* orps, orpd */ - [0x57] =3D { gen_helper_pxor_xmm, gen_helper_pxor_xmm }, /* xorps, xor= pd */ + [0x28] =3D SSE_SPECIAL, /* movaps, movapd */ + [0x29] =3D SSE_SPECIAL, /* movaps, movapd */ + [0x2a] =3D SSE_SPECIAL, /* cvtpi2ps, cvtpi2pd, cvtsi2ss, cvtsi2sd */ + [0x2b] =3D SSE_SPECIAL, /* movntps, movntpd, movntss, movntsd */ + [0x2c] =3D SSE_SPECIAL, /* cvttps2pi, cvttpd2pi, cvttsd2si, cvttss2si = */ + [0x2d] =3D SSE_SPECIAL, /* cvtps2pi, cvtpd2pi, cvtsd2si, cvtss2si */ + [0x2e] =3D OP(op1, SSE_OPF_CMP | SSE_OPF_SCALAR, + gen_helper_ucomiss, gen_helper_ucomisd, NULL, NULL), + [0x2f] =3D OP(op1, SSE_OPF_CMP | SSE_OPF_SCALAR, + gen_helper_comiss, gen_helper_comisd, NULL, NULL), + [0x50] =3D SSE_SPECIAL, /* movmskps, movmskpd */ + [0x51] =3D OP(op1, SSE_OPF_SCALAR, + gen_helper_sqrtps, gen_helper_sqrtpd, + gen_helper_sqrtss, gen_helper_sqrtsd), + [0x52] =3D OP(op1, SSE_OPF_SCALAR, + gen_helper_rsqrtps, NULL, gen_helper_rsqrtss, NULL), + [0x53] =3D OP(op1, SSE_OPF_SCALAR, + gen_helper_rcpps, NULL, gen_helper_rcpss, NULL), + [0x54] =3D SSE_OP(pand, pand, op1, 0), /* andps, andpd */ + [0x55] =3D SSE_OP(pandn, pandn, op1, 0), /* andnps, andnpd */ + [0x56] =3D SSE_OP(por, por, op1, 0), /* orps, orpd */ + [0x57] =3D SSE_OP(pxor, pxor, op1, 0), /* xorps, xorpd */ [0x58] =3D SSE_FOP(add), [0x59] =3D SSE_FOP(mul), - [0x5a] =3D { gen_helper_cvtps2pd, gen_helper_cvtpd2ps, - gen_helper_cvtss2sd, gen_helper_cvtsd2ss }, - [0x5b] =3D { gen_helper_cvtdq2ps, gen_helper_cvtps2dq, gen_helper_cvtt= ps2dq }, + [0x5a] =3D OP(op1, SSE_OPF_SCALAR, + gen_helper_cvtps2pd, gen_helper_cvtpd2ps, + gen_helper_cvtss2sd, 
gen_helper_cvtsd2ss), + [0x5b] =3D OP(op1, 0, + gen_helper_cvtdq2ps, gen_helper_cvtps2dq, + gen_helper_cvttps2dq, NULL), [0x5c] =3D SSE_FOP(sub), [0x5d] =3D SSE_FOP(min), [0x5e] =3D SSE_FOP(div), [0x5f] =3D SSE_FOP(max), =20 - [0xc2] =3D SSE_FOP(cmpeq), - [0xc6] =3D { (SSEFunc_0_epp)gen_helper_shufps, - (SSEFunc_0_epp)gen_helper_shufpd }, /* XXX: casts */ + [0xc2] =3D SSE_FOP(cmpeq), /* sse_op_table4 */ + [0xc6] =3D OP(dummy, SSE_OPF_SHUF, (SSEFunc_0_epp)gen_helper_shufps, + (SSEFunc_0_epp)gen_helper_shufpd, NULL, NULL), =20 /* SSSE3, SSE4, MOVBE, CRC32, BMI1, BMI2, ADX. */ - [0x38] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, - [0x3a] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, + [0x38] =3D SSE_SPECIAL, + [0x3a] =3D SSE_SPECIAL, =20 /* MMX ops and their SSE extensions */ - [0x60] =3D MMX_OP2(punpcklbw), - [0x61] =3D MMX_OP2(punpcklwd), - [0x62] =3D MMX_OP2(punpckldq), - [0x63] =3D MMX_OP2(packsswb), - [0x64] =3D MMX_OP2(pcmpgtb), - [0x65] =3D MMX_OP2(pcmpgtw), - [0x66] =3D MMX_OP2(pcmpgtl), - [0x67] =3D MMX_OP2(packuswb), - [0x68] =3D MMX_OP2(punpckhbw), - [0x69] =3D MMX_OP2(punpckhwd), - [0x6a] =3D MMX_OP2(punpckhdq), - [0x6b] =3D MMX_OP2(packssdw), - [0x6c] =3D { NULL, gen_helper_punpcklqdq_xmm }, - [0x6d] =3D { NULL, gen_helper_punpckhqdq_xmm }, - [0x6e] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* movd mm, ea */ - [0x6f] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* movq, movdqa,= , movqdu */ - [0x70] =3D { (SSEFunc_0_epp)gen_helper_pshufw_mmx, - (SSEFunc_0_epp)gen_helper_pshufd_xmm, - (SSEFunc_0_epp)gen_helper_pshufhw_xmm, - (SSEFunc_0_epp)gen_helper_pshuflw_xmm }, /* XXX: casts */ - [0x71] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* shiftw */ - [0x72] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* shiftd */ - [0x73] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* shiftq */ - [0x74] =3D MMX_OP2(pcmpeqb), - [0x75] =3D MMX_OP2(pcmpeqw), - [0x76] =3D MMX_OP2(pcmpeql), - [0x77] =3D { SSE_DUMMY }, /* emms */ - [0x78] =3D { NULL, SSE_SPECIAL, NULL, SSE_SPECIAL }, /* 
extrq_i, inser= tq_i */ - [0x79] =3D { NULL, gen_helper_extrq_r, NULL, gen_helper_insertq_r }, - [0x7c] =3D { NULL, gen_helper_haddpd, NULL, gen_helper_haddps }, - [0x7d] =3D { NULL, gen_helper_hsubpd, NULL, gen_helper_hsubps }, - [0x7e] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* movd, movd, ,= movq */ - [0x7f] =3D { SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, /* movq, movdqa,= movdqu */ - [0xc4] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* pinsrw */ - [0xc5] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* pextrw */ - [0xd0] =3D { NULL, gen_helper_addsubpd, NULL, gen_helper_addsubps }, - [0xd1] =3D MMX_OP2(psrlw), - [0xd2] =3D MMX_OP2(psrld), - [0xd3] =3D MMX_OP2(psrlq), - [0xd4] =3D MMX_OP2(paddq), - [0xd5] =3D MMX_OP2(pmullw), - [0xd6] =3D { NULL, SSE_SPECIAL, SSE_SPECIAL, SSE_SPECIAL }, - [0xd7] =3D { SSE_SPECIAL, SSE_SPECIAL }, /* pmovmskb */ - [0xd8] =3D MMX_OP2(psubusb), - [0xd9] =3D MMX_OP2(psubusw), - [0xda] =3D MMX_OP2(pminub), - [0xdb] =3D MMX_OP2(pand), - [0xdc] =3D MMX_OP2(paddusb), - [0xdd] =3D MMX_OP2(paddusw), - [0xde] =3D MMX_OP2(pmaxub), - [0xdf] =3D MMX_OP2(pandn), - [0xe0] =3D MMX_OP2(pavgb), - [0xe1] =3D MMX_OP2(psraw), - [0xe2] =3D MMX_OP2(psrad), - [0xe3] =3D MMX_OP2(pavgw), - [0xe4] =3D MMX_OP2(pmulhuw), - [0xe5] =3D MMX_OP2(pmulhw), - [0xe6] =3D { NULL, gen_helper_cvttpd2dq, gen_helper_cvtdq2pd, gen_help= er_cvtpd2dq }, - [0xe7] =3D { SSE_SPECIAL , SSE_SPECIAL }, /* movntq, movntq */ - [0xe8] =3D MMX_OP2(psubsb), - [0xe9] =3D MMX_OP2(psubsw), - [0xea] =3D MMX_OP2(pminsw), - [0xeb] =3D MMX_OP2(por), - [0xec] =3D MMX_OP2(paddsb), - [0xed] =3D MMX_OP2(paddsw), - [0xee] =3D MMX_OP2(pmaxsw), - [0xef] =3D MMX_OP2(pxor), - [0xf0] =3D { NULL, NULL, NULL, SSE_SPECIAL }, /* lddqu */ - [0xf1] =3D MMX_OP2(psllw), - [0xf2] =3D MMX_OP2(pslld), - [0xf3] =3D MMX_OP2(psllq), - [0xf4] =3D MMX_OP2(pmuludq), - [0xf5] =3D MMX_OP2(pmaddwd), - [0xf6] =3D MMX_OP2(psadbw), - [0xf7] =3D { (SSEFunc_0_epp)gen_helper_maskmov_mmx, - (SSEFunc_0_epp)gen_helper_maskmov_xmm }, /* XXX: 
casts */ - [0xf8] =3D MMX_OP2(psubb), - [0xf9] =3D MMX_OP2(psubw), - [0xfa] =3D MMX_OP2(psubl), - [0xfb] =3D MMX_OP2(psubq), - [0xfc] =3D MMX_OP2(paddb), - [0xfd] =3D MMX_OP2(paddw), - [0xfe] =3D MMX_OP2(paddl), + [0x60] =3D MMX_OP(punpcklbw), + [0x61] =3D MMX_OP(punpcklwd), + [0x62] =3D MMX_OP(punpckldq), + [0x63] =3D MMX_OP(packsswb), + [0x64] =3D MMX_OP(pcmpgtb), + [0x65] =3D MMX_OP(pcmpgtw), + [0x66] =3D MMX_OP(pcmpgtl), + [0x67] =3D MMX_OP(packuswb), + [0x68] =3D MMX_OP(punpckhbw), + [0x69] =3D MMX_OP(punpckhwd), + [0x6a] =3D MMX_OP(punpckhdq), + [0x6b] =3D MMX_OP(packssdw), + [0x6c] =3D OP(op1, SSE_OPF_MMX, + NULL, gen_helper_punpcklqdq_xmm, NULL, NULL), + [0x6d] =3D OP(op1, SSE_OPF_MMX, + NULL, gen_helper_punpckhqdq_xmm, NULL, NULL), + [0x6e] =3D SSE_SPECIAL, /* movd mm, ea */ + [0x6f] =3D SSE_SPECIAL, /* movq, movdqa, , movqdu */ + [0x70] =3D OP(op1i, SSE_OPF_SHUF | SSE_OPF_MMX, + (SSEFunc_0_epp)gen_helper_pshufw_mmx, + (SSEFunc_0_epp)gen_helper_pshufd_xmm, + (SSEFunc_0_epp)gen_helper_pshufhw_xmm, + (SSEFunc_0_epp)gen_helper_pshuflw_xmm), + [0x71] =3D SSE_SPECIAL, /* shiftw */ + [0x72] =3D SSE_SPECIAL, /* shiftd */ + [0x73] =3D SSE_SPECIAL, /* shiftq */ + [0x74] =3D MMX_OP(pcmpeqb), + [0x75] =3D MMX_OP(pcmpeqw), + [0x76] =3D MMX_OP(pcmpeql), + [0x77] =3D SSE_SPECIAL, /* emms */ + [0x78] =3D SSE_SPECIAL, /* extrq_i, insertq_i (sse4a) */ + [0x79] =3D OP(op1, 0, + NULL, gen_helper_extrq_r, NULL, gen_helper_insertq_r), + [0x7c] =3D OP(op1, 0, + NULL, gen_helper_haddpd, NULL, gen_helper_haddps), + [0x7d] =3D OP(op1, 0, + NULL, gen_helper_hsubpd, NULL, gen_helper_hsubps), + [0x7e] =3D SSE_SPECIAL, /* movd, movd, , movq */ + [0x7f] =3D SSE_SPECIAL, /* movq, movdqa, movdqu */ + [0xc4] =3D SSE_SPECIAL, /* pinsrw */ + [0xc5] =3D SSE_SPECIAL, /* pextrw */ + [0xd0] =3D OP(op1, 0, + NULL, gen_helper_addsubpd, NULL, gen_helper_addsubps), + [0xd1] =3D MMX_OP(psrlw), + [0xd2] =3D MMX_OP(psrld), + [0xd3] =3D MMX_OP(psrlq), + [0xd4] =3D MMX_OP(paddq), + [0xd5] =3D 
MMX_OP(pmullw), + [0xd6] =3D SSE_SPECIAL, + [0xd7] =3D SSE_SPECIAL, /* pmovmskb */ + [0xd8] =3D MMX_OP(psubusb), + [0xd9] =3D MMX_OP(psubusw), + [0xda] =3D MMX_OP(pminub), + [0xdb] =3D MMX_OP(pand), + [0xdc] =3D MMX_OP(paddusb), + [0xdd] =3D MMX_OP(paddusw), + [0xde] =3D MMX_OP(pmaxub), + [0xdf] =3D MMX_OP(pandn), + [0xe0] =3D MMX_OP(pavgb), + [0xe1] =3D MMX_OP(psraw), + [0xe2] =3D MMX_OP(psrad), + [0xe3] =3D MMX_OP(pavgw), + [0xe4] =3D MMX_OP(pmulhuw), + [0xe5] =3D MMX_OP(pmulhw), + [0xe6] =3D OP(op1, 0, + NULL, gen_helper_cvttpd2dq, + gen_helper_cvtdq2pd, gen_helper_cvtpd2dq), + [0xe7] =3D SSE_SPECIAL, /* movntq, movntq */ + [0xe8] =3D MMX_OP(psubsb), + [0xe9] =3D MMX_OP(psubsw), + [0xea] =3D MMX_OP(pminsw), + [0xeb] =3D MMX_OP(por), + [0xec] =3D MMX_OP(paddsb), + [0xed] =3D MMX_OP(paddsw), + [0xee] =3D MMX_OP(pmaxsw), + [0xef] =3D MMX_OP(pxor), + [0xf0] =3D SSE_SPECIAL, /* lddqu */ + [0xf1] =3D MMX_OP(psllw), + [0xf2] =3D MMX_OP(pslld), + [0xf3] =3D MMX_OP(psllq), + [0xf4] =3D MMX_OP(pmuludq), + [0xf5] =3D MMX_OP(pmaddwd), + [0xf6] =3D MMX_OP(psadbw), + [0xf7] =3D OP(op1t, SSE_OPF_MMX, + (SSEFunc_0_epp)gen_helper_maskmov_mmx, + (SSEFunc_0_epp)gen_helper_maskmov_xmm, NULL, NULL), + [0xf8] =3D MMX_OP(psubb), + [0xf9] =3D MMX_OP(psubw), + [0xfa] =3D MMX_OP(psubl), + [0xfb] =3D MMX_OP(psubq), + [0xfc] =3D MMX_OP(paddb), + [0xfd] =3D MMX_OP(paddw), + [0xfe] =3D MMX_OP(paddl), }; +#undef MMX_OP +#undef OP +#undef SSE_FOP +#undef SSE_OP +#undef SSE_SPECIAL + +#define MMX_OP2(x) { gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm } +#define SSE_SPECIAL_FN ((void *)1) =20 static const SSEFunc_0_epp sse_op_table2[3 * 8][2] =3D { [0 + 2] =3D MMX_OP2(psrlw), @@ -2972,6 +3019,8 @@ static const SSEFunc_l_ep sse_op_table3bq[] =3D { }; #endif =20 +#define SSE_FOP(x) { gen_helper_ ## x ## ps, gen_helper_ ## x ## pd, \ + gen_helper_ ## x ## ss, gen_helper_ ## x ## sd, } static const SSEFunc_0_epp sse_op_table4[8][4] =3D { SSE_FOP(cmpeq), SSE_FOP(cmplt), @@ -2982,6 +3031,7 @@ 
static const SSEFunc_0_epp sse_op_table4[8][4] = {
     SSE_FOP(cmpnle),
     SSE_FOP(cmpord),
 };
+#undef SSE_FOP
 
 static const SSEFunc_0_epp sse_op_table5[256] = {
     [0x0c] = gen_helper_pi2fw,
@@ -3023,7 +3073,7 @@ struct SSEOpHelper_eppi {
 #define SSSE3_OP(x) { MMX_OP2(x), CPUID_EXT_SSSE3 }
 #define SSE41_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, CPUID_EXT_SSE41 }
 #define SSE42_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, CPUID_EXT_SSE42 }
-#define SSE41_SPECIAL { { NULL, SSE_SPECIAL }, CPUID_EXT_SSE41 }
+#define SSE41_SPECIAL { { NULL, SSE_SPECIAL_FN }, CPUID_EXT_SSE41 }
 #define PCLMULQDQ_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, \
         CPUID_EXT_PCLMULQDQ }
 #define AESNI_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, CPUID_EXT_AES }
@@ -3114,6 +3164,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
 {
     int b1, op1_offset, op2_offset, is_xmm, val;
     int modrm, mod, rm, reg;
+    int sse_op_flags;
     SSEFunc_0_epp sse_fn_epp;
     SSEFunc_0_eppi sse_fn_eppi;
     SSEFunc_0_ppi sse_fn_ppi;
@@ -3129,8 +3180,10 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         b1 = 3;
     else
         b1 = 0;
-    sse_fn_epp = sse_op_table1[b][b1];
-    if (!sse_fn_epp) {
+    sse_op_flags = sse_op_table1[b].flags;
+    sse_fn_epp = sse_op_table1[b].op[b1];
+    if ((sse_op_flags & (SSE_OPF_SPECIAL | SSE_OPF_3DNOW)) == 0
+            && !sse_fn_epp) {
         goto unknown_op;
     }
     if ((b <= 0x5f && b >= 0x10) || b == 0xc6 || b == 0xc2) {
@@ -3184,7 +3237,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
         reg |= REX_R(s);
     }
     mod = (modrm >> 6) & 3;
-    if (sse_fn_epp == SSE_SPECIAL) {
+    if (sse_op_flags & SSE_OPF_SPECIAL) {
         b |= (b1 << 8);
         switch(b) {
         case 0x0e7: /* movntq */
@@ -3819,7 +3872,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     gen_ldq_env_A0(s, op2_offset);
                 }
             }
-            if (sse_fn_epp == SSE_SPECIAL) {
+            if (sse_fn_epp == SSE_SPECIAL_FN) {
                 goto unknown_op;
             }
 
@@ -4205,7 +4258,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
 
             s->rip_offset = 1;
 
-            if (sse_fn_eppi == SSE_SPECIAL) {
+            if (sse_fn_eppi == SSE_SPECIAL_FN) {
                 ot = mo_64_32(s->dflag);
                 rm = (modrm & 7) | REX_B(s);
                 if (mod != 3)
-- 
2.37.1

From nobody Mon Feb 9 15:46:37 2026
From: Paolo Bonzini
To: qemu-devel@nongnu.org
Cc: paul@nowt.org, richard.henderson@linaro.org
Subject: [PATCH v3 05/23] i386: Rework sse_op_table6/7
Date: Thu, 1 Sep 2022 09:48:24 +0200
Message-Id: <20220901074842.57424-6-pbonzini@redhat.com>
In-Reply-To: <20220901074842.57424-1-pbonzini@redhat.com>
References: <20220901074842.57424-1-pbonzini@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Paul Brook

Add a flags field to each row in sse_op_table6 and sse_op_table7.

Initially this is only used as a replacement for the magic
SSE41_SPECIAL pointer. The other flags are mostly relevant for the AVX
implementation but can be applied to SSE as well.

Signed-off-by: Paul Brook
Message-Id: <20220424220204.2493824-6-paul@nowt.org>
Reviewed-by: Richard Henderson
Signed-off-by: Paolo Bonzini
---
 target/i386/tcg/translate.c | 230 ++++++++++++++++++++----------------
 1 file changed, 131 insertions(+), 99 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 7332bbcf44..b7321b7588 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2976,7 +2976,6 @@ static const struct SSEOpHelper_table1 sse_op_table1[256] = {
 #undef SSE_SPECIAL
 
 #define MMX_OP2(x) { gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm }
-#define SSE_SPECIAL_FN ((void *)1)
 
 static const SSEFunc_0_epp sse_op_table2[3 * 8][2] = {
     [0 + 2] = MMX_OP2(psrlw),
@@ -3060,113 +3059,134 @@ static const SSEFunc_0_epp sse_op_table5[256] = {
     [0xbf] = gen_helper_pavgb_mmx /* pavgusb */
 };
 
-struct SSEOpHelper_epp {
+struct SSEOpHelper_table6 {
     SSEFunc_0_epp op[2];
     uint32_t ext_mask;
+    int flags;
 };
 
-struct SSEOpHelper_eppi {
+struct SSEOpHelper_table7 {
     SSEFunc_0_eppi op[2];
     uint32_t ext_mask;
+    int flags;
 };
 
-#define SSSE3_OP(x) { MMX_OP2(x), CPUID_EXT_SSSE3 }
-#define SSE41_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, CPUID_EXT_SSE41 }
-#define SSE42_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, CPUID_EXT_SSE42 }
-#define SSE41_SPECIAL { { NULL, SSE_SPECIAL_FN }, CPUID_EXT_SSE41 }
-#define PCLMULQDQ_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, \
-        CPUID_EXT_PCLMULQDQ }
-#define AESNI_OP(x) { { NULL, gen_helper_ ## x ## _xmm }, CPUID_EXT_AES }
+#define gen_helper_special_xmm NULL
 
-static const struct SSEOpHelper_epp sse_op_table6[256] = {
-    [0x00] = SSSE3_OP(pshufb),
-    [0x01] = SSSE3_OP(phaddw),
-    [0x02] = SSSE3_OP(phaddd),
-    [0x03] = SSSE3_OP(phaddsw),
-    [0x04] = SSSE3_OP(pmaddubsw),
-    [0x05] = SSSE3_OP(phsubw),
-    [0x06] = SSSE3_OP(phsubd),
-    [0x07] = SSSE3_OP(phsubsw),
-    [0x08] = SSSE3_OP(psignb),
-    [0x09] = SSSE3_OP(psignw),
-    [0x0a] = SSSE3_OP(psignd),
-    [0x0b] = SSSE3_OP(pmulhrsw),
-    [0x10] = SSE41_OP(pblendvb),
-    [0x14] = SSE41_OP(blendvps),
-    [0x15] = SSE41_OP(blendvpd),
-    [0x17] = SSE41_OP(ptest),
-    [0x1c] = SSSE3_OP(pabsb),
-    [0x1d] = SSSE3_OP(pabsw),
-    [0x1e] = SSSE3_OP(pabsd),
-    [0x20] = SSE41_OP(pmovsxbw),
-    [0x21] = SSE41_OP(pmovsxbd),
-    [0x22] = SSE41_OP(pmovsxbq),
-    [0x23] = SSE41_OP(pmovsxwd),
-    [0x24] = SSE41_OP(pmovsxwq),
-    [0x25] = SSE41_OP(pmovsxdq),
-    [0x28] = SSE41_OP(pmuldq),
-    [0x29] = SSE41_OP(pcmpeqq),
-    [0x2a] = SSE41_SPECIAL, /* movntqda */
-    [0x2b] = SSE41_OP(packusdw),
-    [0x30] = SSE41_OP(pmovzxbw),
-    [0x31] = SSE41_OP(pmovzxbd),
-    [0x32] = SSE41_OP(pmovzxbq),
-    [0x33] = SSE41_OP(pmovzxwd),
-    [0x34] = SSE41_OP(pmovzxwq),
-    [0x35] = SSE41_OP(pmovzxdq),
-    [0x37] = SSE42_OP(pcmpgtq),
-    [0x38] = SSE41_OP(pminsb),
-    [0x39] = SSE41_OP(pminsd),
-    [0x3a] = SSE41_OP(pminuw),
-    [0x3b] = SSE41_OP(pminud),
-    [0x3c] = SSE41_OP(pmaxsb),
-    [0x3d] = SSE41_OP(pmaxsd),
-    [0x3e] = SSE41_OP(pmaxuw),
-    [0x3f] = SSE41_OP(pmaxud),
-    [0x40] = SSE41_OP(pmulld),
-    [0x41] = SSE41_OP(phminposuw),
-    [0xdb] = AESNI_OP(aesimc),
-    [0xdc] = AESNI_OP(aesenc),
-    [0xdd] = AESNI_OP(aesenclast),
-    [0xde] = AESNI_OP(aesdec),
-    [0xdf] = AESNI_OP(aesdeclast),
+#define OP(name, op, flags, ext, mmx_name) \
+    {{mmx_name, gen_helper_ ## name ## _xmm}, CPUID_EXT_ ## ext, flags}
+#define BINARY_OP_MMX(name, ext) \
+    OP(name, op1, SSE_OPF_MMX, ext, gen_helper_ ## name ## _mmx)
+#define BINARY_OP(name, ext, flags) \
+    OP(name, op1, flags, ext, NULL)
+#define UNARY_OP_MMX(name, ext) \
+    OP(name, op1, SSE_OPF_MMX, ext, gen_helper_ ## name ## _mmx)
+#define UNARY_OP(name, ext, flags) \
+    OP(name, op1, flags, ext, NULL)
+#define BLENDV_OP(name, ext, flags) OP(name, op1, 0, ext, NULL)
+#define CMP_OP(name, ext) OP(name, op1, SSE_OPF_CMP, ext, NULL)
+#define SPECIAL_OP(ext) OP(special, op1, SSE_OPF_SPECIAL, ext, NULL)
+
+/* prefix [66] 0f 38 */
+static const struct SSEOpHelper_table6 sse_op_table6[256] = {
+    [0x00] = BINARY_OP_MMX(pshufb, SSSE3),
+    [0x01] = BINARY_OP_MMX(phaddw, SSSE3),
+    [0x02] = BINARY_OP_MMX(phaddd, SSSE3),
+    [0x03] = BINARY_OP_MMX(phaddsw, SSSE3),
+    [0x04] = BINARY_OP_MMX(pmaddubsw, SSSE3),
+    [0x05] = BINARY_OP_MMX(phsubw, SSSE3),
+    [0x06] = BINARY_OP_MMX(phsubd, SSSE3),
+    [0x07] = BINARY_OP_MMX(phsubsw, SSSE3),
+    [0x08] = BINARY_OP_MMX(psignb, SSSE3),
+    [0x09] = BINARY_OP_MMX(psignw, SSSE3),
+    [0x0a] = BINARY_OP_MMX(psignd, SSSE3),
+    [0x0b] = BINARY_OP_MMX(pmulhrsw, SSSE3),
+    [0x10] = BLENDV_OP(pblendvb, SSE41, SSE_OPF_MMX),
+    [0x14] = BLENDV_OP(blendvps, SSE41, 0),
+    [0x15] = BLENDV_OP(blendvpd, SSE41, 0),
+    [0x17] = CMP_OP(ptest, SSE41),
+    [0x1c] = UNARY_OP_MMX(pabsb, SSSE3),
+    [0x1d] = UNARY_OP_MMX(pabsw, SSSE3),
+    [0x1e] = UNARY_OP_MMX(pabsd, SSSE3),
+    [0x20] = UNARY_OP(pmovsxbw, SSE41, SSE_OPF_MMX),
+    [0x21] = UNARY_OP(pmovsxbd, SSE41, SSE_OPF_MMX),
+    [0x22] = UNARY_OP(pmovsxbq, SSE41, SSE_OPF_MMX),
+    [0x23] = UNARY_OP(pmovsxwd, SSE41, SSE_OPF_MMX),
+    [0x24] = UNARY_OP(pmovsxwq, SSE41, SSE_OPF_MMX),
+    [0x25] = UNARY_OP(pmovsxdq, SSE41, SSE_OPF_MMX),
+    [0x28] = BINARY_OP(pmuldq, SSE41, SSE_OPF_MMX),
+    [0x29] = BINARY_OP(pcmpeqq, SSE41, SSE_OPF_MMX),
+    [0x2a] = SPECIAL_OP(SSE41), /* movntqda */
+    [0x2b] = BINARY_OP(packusdw, SSE41, SSE_OPF_MMX),
+    [0x30] = UNARY_OP(pmovzxbw, SSE41, SSE_OPF_MMX),
+    [0x31] = UNARY_OP(pmovzxbd, SSE41, SSE_OPF_MMX),
+    [0x32] = UNARY_OP(pmovzxbq, SSE41, SSE_OPF_MMX),
+    [0x33] = UNARY_OP(pmovzxwd, SSE41, SSE_OPF_MMX),
+    [0x34] = UNARY_OP(pmovzxwq, SSE41, SSE_OPF_MMX),
+    [0x35] = UNARY_OP(pmovzxdq, SSE41, SSE_OPF_MMX),
+    [0x37] = BINARY_OP(pcmpgtq, SSE41, SSE_OPF_MMX),
+    [0x38] = BINARY_OP(pminsb, SSE41, SSE_OPF_MMX),
+    [0x39] = BINARY_OP(pminsd, SSE41, SSE_OPF_MMX),
+    [0x3a] = BINARY_OP(pminuw, SSE41, SSE_OPF_MMX),
+    [0x3b] = BINARY_OP(pminud, SSE41, SSE_OPF_MMX),
+    [0x3c] = BINARY_OP(pmaxsb, SSE41, SSE_OPF_MMX),
+    [0x3d] = BINARY_OP(pmaxsd, SSE41, SSE_OPF_MMX),
+    [0x3e] = BINARY_OP(pmaxuw, SSE41, SSE_OPF_MMX),
+    [0x3f] = BINARY_OP(pmaxud, SSE41, SSE_OPF_MMX),
+    [0x40] = BINARY_OP(pmulld, SSE41, SSE_OPF_MMX),
+    [0x41] = UNARY_OP(phminposuw, SSE41, 0),
+    [0xdb] = UNARY_OP(aesimc, AES, 0),
+    [0xdc] = BINARY_OP(aesenc, AES, 0),
+    [0xdd] = BINARY_OP(aesenclast, AES, 0),
+    [0xde] = BINARY_OP(aesdec, AES, 0),
+    [0xdf] = BINARY_OP(aesdeclast, AES, 0),
 };
 
-static const struct SSEOpHelper_eppi sse_op_table7[256] = {
-    [0x08] = SSE41_OP(roundps),
-    [0x09] = SSE41_OP(roundpd),
-    [0x0a] = SSE41_OP(roundss),
-    [0x0b] = SSE41_OP(roundsd),
-    [0x0c] = SSE41_OP(blendps),
-    [0x0d] = SSE41_OP(blendpd),
-    [0x0e] = SSE41_OP(pblendw),
-    [0x0f] = SSSE3_OP(palignr),
-    [0x14] = SSE41_SPECIAL, /* pextrb */
-    [0x15] = SSE41_SPECIAL, /* pextrw */
-    [0x16] = SSE41_SPECIAL, /* pextrd/pextrq */
-    [0x17] = SSE41_SPECIAL, /* extractps */
-    [0x20] = SSE41_SPECIAL, /* pinsrb */
-    [0x21] = SSE41_SPECIAL, /* insertps */
-    [0x22] = SSE41_SPECIAL, /* pinsrd/pinsrq */
-    [0x40] = SSE41_OP(dpps),
-    [0x41] = SSE41_OP(dppd),
-    [0x42] = SSE41_OP(mpsadbw),
-    [0x44] = PCLMULQDQ_OP(pclmulqdq),
-    [0x60] = SSE42_OP(pcmpestrm),
-    [0x61] = SSE42_OP(pcmpestri),
-    [0x62] = SSE42_OP(pcmpistrm),
-    [0x63] = SSE42_OP(pcmpistri),
-    [0xdf] = AESNI_OP(aeskeygenassist),
+/* prefix [66] 0f 3a */
+static const struct SSEOpHelper_table7 sse_op_table7[256] = {
+    [0x08] = UNARY_OP(roundps, SSE41, 0),
+    [0x09] = UNARY_OP(roundpd, SSE41, 0),
+    [0x0a] = UNARY_OP(roundss, SSE41, SSE_OPF_SCALAR),
+    [0x0b] = UNARY_OP(roundsd, SSE41, SSE_OPF_SCALAR),
+    [0x0c] = BINARY_OP(blendps, SSE41, 0),
+    [0x0d] = BINARY_OP(blendpd, SSE41, 0),
+    [0x0e] = BINARY_OP(pblendw, SSE41, SSE_OPF_MMX),
+    [0x0f] = BINARY_OP_MMX(palignr, SSSE3),
+    [0x14] = SPECIAL_OP(SSE41), /* pextrb */
+    [0x15] = SPECIAL_OP(SSE41), /* pextrw */
+    [0x16] = SPECIAL_OP(SSE41), /* pextrd/pextrq */
+    [0x17] = SPECIAL_OP(SSE41), /* extractps */
+    [0x20] = SPECIAL_OP(SSE41), /* pinsrb */
+    [0x21] = SPECIAL_OP(SSE41), /* insertps */
+    [0x22] = SPECIAL_OP(SSE41), /* pinsrd/pinsrq */
+    [0x40] = BINARY_OP(dpps, SSE41, 0),
+    [0x41] = BINARY_OP(dppd, SSE41, 0),
+    [0x42] = BINARY_OP(mpsadbw, SSE41, SSE_OPF_MMX),
+    [0x44] = BINARY_OP(pclmulqdq, PCLMULQDQ, 0),
+    [0x60] = CMP_OP(pcmpestrm, SSE42),
+    [0x61] = CMP_OP(pcmpestri, SSE42),
+    [0x62] = CMP_OP(pcmpistrm, SSE42),
+    [0x63] = CMP_OP(pcmpistri, SSE42),
+    [0xdf] = UNARY_OP(aeskeygenassist, AES, 0),
 };
 
+#undef OP
+#undef BINARY_OP_MMX
+#undef BINARY_OP
+#undef UNARY_OP_MMX
+#undef UNARY_OP
+#undef BLENDV_OP
+#undef SPECIAL_OP
+
 static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     target_ulong pc_start)
 {
     int b1, op1_offset, op2_offset, is_xmm, val;
     int modrm, mod, rm, reg;
     int sse_op_flags;
+    const struct SSEOpHelper_table6 *op6;
+    const struct SSEOpHelper_table7 *op7;
     SSEFunc_0_epp sse_fn_epp;
-    SSEFunc_0_eppi sse_fn_eppi;
     SSEFunc_0_ppi sse_fn_ppi;
     SSEFunc_0_eppt sse_fn_eppt;
     MemOp ot;
@@ -3821,12 +3841,13 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             mod = (modrm >> 6) & 3;
 
             assert(b1 < 2);
-            sse_fn_epp = sse_op_table6[b].op[b1];
-            if (!sse_fn_epp) {
+            op6 = &sse_op_table6[b];
+            if (op6->ext_mask == 0) {
                 goto unknown_op;
             }
-            if (!(s->cpuid_ext_features & sse_op_table6[b].ext_mask))
+            if (!(s->cpuid_ext_features & op6->ext_mask)) {
                 goto illegal_op;
+            }
 
             if (b1) {
                 op1_offset = ZMM_OFFSET(reg);
@@ -3863,6 +3884,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     }
                 }
             } else {
+                if ((op6->flags & SSE_OPF_MMX) == 0) {
+                    goto unknown_op;
+                }
                 op1_offset = offsetof(CPUX86State,fpregs[reg].mmx);
                 if (mod == 3) {
                     op2_offset = offsetof(CPUX86State,fpregs[rm].mmx);
@@ -3872,13 +3896,13 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     gen_ldq_env_A0(s, op2_offset);
                 }
             }
-            if (sse_fn_epp == SSE_SPECIAL_FN) {
-                goto unknown_op;
+            if (!op6->op[b1]) {
+                goto illegal_op;
             }
 
             tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
             tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset);
-            sse_fn_epp(cpu_env, s->ptr0, s->ptr1);
+            op6->op[b1](cpu_env, s->ptr0, s->ptr1);
 
             if (b == 0x17) {
                 set_cc_op(s, CC_OP_EFLAGS);
@@ -4249,16 +4273,21 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
             mod = (modrm >> 6) & 3;
 
             assert(b1 < 2);
-            sse_fn_eppi = sse_op_table7[b].op[b1];
-            if (!sse_fn_eppi) {
+            op7 = &sse_op_table7[b];
+            if (op7->ext_mask == 0) {
                 goto unknown_op;
             }
-            if (!(s->cpuid_ext_features & sse_op_table7[b].ext_mask))
+            if (!(s->cpuid_ext_features & op7->ext_mask)) {
                 goto illegal_op;
+            }
 
             s->rip_offset = 1;
 
-            if (sse_fn_eppi == SSE_SPECIAL_FN) {
+            if (op7->flags & SSE_OPF_SPECIAL) {
+                /* None of the "special" ops are valid on mmx registers */
+                if (b1 == 0) {
+                    goto illegal_op;
+                }
                 ot = mo_64_32(s->dflag);
                 rm = (modrm & 7) | REX_B(s);
                 if (mod != 3)
@@ -4403,6 +4432,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
                     gen_ldo_env_A0(s, op2_offset);
                 }
             } else {
+                if ((op7->flags & SSE_OPF_MMX) == 0) {
+                    goto illegal_op;
+                }
                 op1_offset = offsetof(CPUX86State,fpregs[reg].mmx);
                 if (mod == 3) {
                     op2_offset = offsetof(CPUX86State,fpregs[rm].mmx);
@@ -4425,7 +4457,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s, int b,
 
             tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset);
tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); - sse_fn_eppi(cpu_env, s->ptr0, s->ptr1, tcg_const_i32(val)); + op7->op[b1](cpu_env, s->ptr0, s->ptr1, tcg_const_i32(val)); break; =20 case 0x33a:
--=20
2.37.1

From nobody Mon Feb 9 15:46:37 2026
From: Paolo Bonzini
To: qemu-devel@nongnu.org
Cc: paul@nowt.org, richard.henderson@linaro.org
Subject: [PATCH v3 06/23] i386: Move 3DNOW decoder
Date: Thu, 1 Sep 2022 09:48:25 +0200
Message-Id: <20220901074842.57424-7-pbonzini@redhat.com>
X-Mailer: git-send-email 2.37.1
In-Reply-To: <20220901074842.57424-1-pbonzini@redhat.com>
References: <20220901074842.57424-1-pbonzini@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

From: Paul Brook

Handle 3DNOW instructions early to avoid complicating the MMX/SSE logic.

Signed-off-by: Paul Brook
Message-Id: <20220424220204.2493824-25-paul@nowt.org>
Reviewed-by: Richard Henderson
Signed-off-by: Paolo Bonzini
---
 target/i386/tcg/translate.c | 30 +++++++++++++++++-------------
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index b7321b7588..c76f6dba11 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3216,6 +3216,11 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, is_xmm =3D 1; } } + if (sse_op_flags & SSE_OPF_3DNOW) { + if (!(s->cpuid_ext2_features & CPUID_EXT2_3DNOW)) { + goto illegal_op; + } + } /* simple MMX/SSE operation */ if (s->flags & HF_TS_MASK) { gen_exception(s, EXCP07_PREX, pc_start - s->cs_base);
@@ -4567,21 +4572,20 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, rm =3D (modrm & 7); op2_offset =3D offsetof(CPUX86State,fpregs[rm].mmx); } + if (sse_op_flags & SSE_OPF_3DNOW) { + /* 3DNow! data insns */ + val =3D x86_ldub_code(env, s); + SSEFunc_0_epp op_3dnow =3D sse_op_table5[val]; + if (!op_3dnow) { + goto unknown_op; + } + tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); + tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); + op_3dnow(cpu_env, s->ptr0, s->ptr1); + return; + } } switch(b) { - case 0x0f: /* 3DNow!
data insns */ - val =3D x86_ldub_code(env, s); - sse_fn_epp =3D sse_op_table5[val]; - if (!sse_fn_epp) { - goto unknown_op; - } - if (!(s->cpuid_ext2_features & CPUID_EXT2_3DNOW)) { - goto illegal_op; - } - tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); - tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); - sse_fn_epp(cpu_env, s->ptr0, s->ptr1); - break; case 0x70: /* pshufx insn */ case 0xc6: /* pshufx insn */ val =3D x86_ldub_code(env, s);
--=20
2.37.1

From nobody Mon Feb 9 15:46:37 2026
From: Paolo Bonzini
To: qemu-devel@nongnu.org
Cc: paul@nowt.org, richard.henderson@linaro.org
Subject: [PATCH v3 07/23] i386: check SSE table flags instead of hardcoding opcodes
Date: Thu, 1 Sep 2022 09:48:26 +0200
Message-Id: <20220901074842.57424-8-pbonzini@redhat.com>
X-Mailer: git-send-email 2.37.1
In-Reply-To: <20220901074842.57424-1-pbonzini@redhat.com>
References: <20220901074842.57424-1-pbonzini@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

Put more flags to work to avoid hardcoding lists of opcodes. The op7 case
for SSE_OPF_CMP is included for homogeneity and because AVX needs it, but
it is never used by SSE or MMX.

Extracted from a patch by Paul Brook.
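
As an illustration only, the idea of replacing an opcode list with a table
flag can be sketched in standalone C. This is not QEMU code: the flag values,
the table contents and every function name below are hypothetical, chosen
to echo the names used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits: the names echo the patch, the values are made up. */
enum {
    OPF_CMP    = 1 << 0,   /* instruction updates EFLAGS */
    OPF_SCALAR = 1 << 1,   /* operates on the low element only */
    OPF_SHUF   = 1 << 2,   /* takes an immediate shuffle byte */
};

/* Illustrative table indexed by opcode byte; unlisted entries are 0. */
static const uint8_t op_flags[256] = {
    [0x2e] = OPF_SCALAR | OPF_CMP,   /* ucomis[sd] */
    [0x2f] = OPF_SCALAR | OPF_CMP,   /* comis[sd] */
    [0x58] = OPF_SCALAR,             /* an ordinary scalar arithmetic op */
    [0x70] = OPF_SHUF,               /* pshufx */
    [0xc6] = OPF_SHUF,               /* pshufx */
};

/* Old style: the decoder carries its own list of opcodes. */
static bool sets_eflags_hardcoded(uint8_t b)
{
    return b == 0x2e || b == 0x2f;
}

/* New style: the decoder asks the table. */
static bool sets_eflags_table(uint8_t b)
{
    return (op_flags[b] & OPF_CMP) != 0;
}

/* Check that both predicates agree for every opcode byte. */
static void self_check(void)
{
    for (int b = 0; b < 256; b++) {
        assert(sets_eflags_hardcoded((uint8_t)b) ==
               sets_eflags_table((uint8_t)b));
    }
}
```

With this toy table, self_check() passes. The point of the table form is
that a further EFLAGS-writing opcode would need only a new table entry
rather than an edit at each place that lists opcodes.
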
Signed-off-by: Paolo Bonzini
Reviewed-by: Richard Henderson
---
 target/i386/tcg/translate.c | 75 +++++++++++++++----------------------
 1 file changed, 31 insertions(+), 44 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index c76f6dba11..849c40b685 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3909,7 +3909,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); op6->op[b1](cpu_env, s->ptr0, s->ptr1); =20 - if (b =3D=3D 0x17) { + if (op6->flags & SSE_OPF_CMP) { set_cc_op(s, CC_OP_EFLAGS); } break;
@@ -4463,6 +4463,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); op7->op[b1](cpu_env, s->ptr0, s->ptr1, tcg_const_i32(val)); + if (op7->flags & SSE_OPF_CMP) { + set_cc_op(s, CC_OP_EFLAGS); + } break; =20 case 0x33a:
@@ -4518,28 +4521,24 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, int sz =3D 4; =20 gen_lea_modrm(env, s, modrm); - op2_offset =3D offsetof(CPUX86State,xmm_t0); + op2_offset =3D offsetof(CPUX86State, xmm_t0); =20 - switch (b) { - case 0x50 ... 0x5a: - case 0x5c ... 0x5f: - case 0xc2: - /* Most sse scalar operations. */ - if (b1 =3D=3D 2) { - sz =3D 2; - } else if (b1 =3D=3D 3) { - sz =3D 3; - } - break; - - case 0x2e: /* ucomis[sd] */ - case 0x2f: /* comis[sd] */ - if (b1 =3D=3D 0) { - sz =3D 2; + if (sse_op_flags & SSE_OPF_SCALAR) { + if (sse_op_flags & SSE_OPF_CMP) { + /* ucomis[sd], comis[sd] */ + if (b1 =3D=3D 0) { + sz =3D 2; + } else { + sz =3D 3; + } } else { - sz =3D 3; + /* Most sse scalar operations.
*/ + if (b1 =3D=3D 2) { + sz =3D 2; + } else if (b1 =3D=3D 3) { + sz =3D 3; + } } - break; } =20 switch (sz) { @@ -4585,26 +4584,14 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, return; } } - switch(b) { - case 0x70: /* pshufx insn */ - case 0xc6: /* pshufx insn */ + tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); + tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); + if (sse_op_flags & SSE_OPF_SHUF) { val =3D x86_ldub_code(env, s); - tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); - tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); /* XXX: introduce a new table? */ sse_fn_ppi =3D (SSEFunc_0_ppi)sse_fn_epp; sse_fn_ppi(s->ptr0, s->ptr1, tcg_const_i32(val)); - break; - case 0xc2: - /* compare insns, bits 7:3 (7:5 for AVX) are ignored */ - val =3D x86_ldub_code(env, s) & 7; - sse_fn_epp =3D sse_op_table4[val][b1]; - - tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); - tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); - sse_fn_epp(cpu_env, s->ptr0, s->ptr1); - break; - case 0xf7: + } else if (b =3D=3D 0xf7) { /* maskmov : we must prepare A0 */ if (mod !=3D 3) { goto illegal_op; @@ -4613,19 +4600,19 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, gen_extu(s->aflag, s->A0); gen_add_A0_ds_seg(s); =20 - tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); - tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); /* XXX: introduce a new table? 
*/ sse_fn_eppt =3D (SSEFunc_0_eppt)sse_fn_epp; sse_fn_eppt(cpu_env, s->ptr0, s->ptr1, s->A0); - break; - default: - tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); - tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); + } else if (b =3D=3D 0xc2) { + /* compare insns, bits 7:3 (7:5 for AVX) are ignored */ + val =3D x86_ldub_code(env, s) & 7; + sse_fn_epp =3D sse_op_table4[val][b1]; + sse_fn_epp(cpu_env, s->ptr0, s->ptr1); + } else { sse_fn_epp(cpu_env, s->ptr0, s->ptr1); - break; } - if (b =3D=3D 0x2e || b =3D=3D 0x2f) { + + if (sse_op_flags & SSE_OPF_CMP) { set_cc_op(s, CC_OP_EFLAGS); } }
--=20
2.37.1

From nobody Mon Feb 9 15:46:37 2026
From: Paolo Bonzini
To: qemu-devel@nongnu.org
Cc: paul@nowt.org, richard.henderson@linaro.org
Subject: [PATCH v3 08/23] i386: isolate MMX code more
Date: Thu, 1 Sep 2022 09:48:27 +0200
Message-Id: <20220901074842.57424-9-pbonzini@redhat.com>
X-Mailer: git-send-email 2.37.1
In-Reply-To: <20220901074842.57424-1-pbonzini@redhat.com>
References: <20220901074842.57424-1-pbonzini@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

Extracted from a patch by Paul Brook.

Reviewed-by: Richard Henderson
Signed-off-by: Paolo Bonzini
---
 target/i386/tcg/translate.c | 52 +++++++++++++++++++++++--------------
 1 file changed, 33 insertions(+), 19 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 849c40b685..097c895ef1 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3888,6 +3888,12 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, gen_ldo_env_A0(s, op2_offset); } } + if (!op6->op[b1]) { + goto illegal_op; + } + tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); + tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); + op6->op[b1](cpu_env, s->ptr0, s->ptr1); } else { if ((op6->flags & SSE_OPF_MMX) =3D=3D 0) { goto unknown_op;
@@ -3900,14 +3906,10 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, gen_lea_modrm(env, s, modrm); gen_ldq_env_A0(s, op2_offset); } + tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); + tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); + op6->op[0](cpu_env,
s->ptr0, s->ptr1); } - if (!op6->op[b1]) { - goto illegal_op; - } - - tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); - tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); - op6->op[b1](cpu_env, s->ptr0, s->ptr1); =20 if (op6->flags & SSE_OPF_CMP) { set_cc_op(s, CC_OP_EFLAGS);
@@ -4427,16 +4429,8 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, return; } =20 - if (b1) { - op1_offset =3D ZMM_OFFSET(reg); - if (mod =3D=3D 3) { - op2_offset =3D ZMM_OFFSET(rm | REX_B(s)); - } else { - op2_offset =3D offsetof(CPUX86State,xmm_t0); - gen_lea_modrm(env, s, modrm); - gen_ldo_env_A0(s, op2_offset); - } - } else { + if (b1 =3D=3D 0) { + /* MMX */ if ((op7->flags & SSE_OPF_MMX) =3D=3D 0) { goto illegal_op; }
@@ -4448,9 +4442,29 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, gen_lea_modrm(env, s, modrm); gen_ldq_env_A0(s, op2_offset); } - } - val =3D x86_ldub_code(env, s); + val =3D x86_ldub_code(env, s); + tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); + tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); =20 + /* We only actually have one MMX instruction (palignr) */ + assert(b =3D=3D 0x0f); + + op7->op[0](cpu_env, s->ptr0, s->ptr1, + tcg_const_i32(val)); + break; + } + + /* SSE */ + op1_offset =3D ZMM_OFFSET(reg); + if (mod =3D=3D 3) { + op2_offset =3D ZMM_OFFSET(rm | REX_B(s)); + } else { + op2_offset =3D offsetof(CPUX86State, xmm_t0); + gen_lea_modrm(env, s, modrm); + gen_ldo_env_A0(s, op2_offset); + } + + val =3D x86_ldub_code(env, s); if ((b & 0xfc) =3D=3D 0x60) { /* pcmpXstrX */ set_cc_op(s, CC_OP_EFLAGS); =20
--=20
2.37.1

From nobody Mon Feb 9 15:46:37 2026
From: Paolo Bonzini
To: qemu-devel@nongnu.org
Cc: paul@nowt.org, richard.henderson@linaro.org
Subject: [PATCH v3 09/23] i386: Add size suffix to vector FP helpers
Date: Thu, 1 Sep 2022 09:48:28 +0200
Message-Id: <20220901074842.57424-10-pbonzini@redhat.com>
X-Mailer: git-send-email 2.37.1
In-Reply-To: <20220901074842.57424-1-pbonzini@redhat.com>
References: <20220901074842.57424-1-pbonzini@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

For AVX we're going to need both 128 bit (xmm) and 256 bit (ymm) variants of floating point helpers.
Add the register type suffix to the existing *PS and *PD helpers (SS and SD variants are only valid on 128 bit vectors) No functional changes. Signed-off-by: Paul Brook Message-Id: <20220424220204.2493824-15-paul@nowt.org> Reviewed-by: Richard Henderson Signed-off-by: Paolo Bonzini --- target/i386/ops_sse.h | 48 ++++++++++++++++++------------------ target/i386/ops_sse_header.h | 48 ++++++++++++++++++------------------ target/i386/tcg/translate.c | 37 +++++++++++++-------------- 3 files changed, 67 insertions(+), 66 deletions(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index 8845b6d4cb..2c0090a647 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -537,7 +537,7 @@ void glue(helper_pshufw, SUFFIX)(Reg *d, Reg *s, int or= der) MOVE(*d, r); } #else -void helper_shufps(Reg *d, Reg *s, int order) +void glue(helper_shufps, SUFFIX)(Reg *d, Reg *s, int order) { Reg r; =20 @@ -548,7 +548,7 @@ void helper_shufps(Reg *d, Reg *s, int order) MOVE(*d, r); } =20 -void helper_shufpd(Reg *d, Reg *s, int order) +void glue(helper_shufpd, SUFFIX)(Reg *d, Reg *s, int order) { Reg r; =20 @@ -598,7 +598,7 @@ void glue(helper_pshufhw, SUFFIX)(Reg *d, Reg *s, int o= rder) /* XXX: not accurate */ =20 #define SSE_HELPER_S(name, F) \ - void helper_ ## name ## ps(CPUX86State *env, Reg *d, Reg *s) \ + void glue(helper_ ## name ## ps, SUFFIX)(CPUX86State *env, Reg *d, Reg= *s)\ { \ d->ZMM_S(0) =3D F(32, d->ZMM_S(0), s->ZMM_S(0)); \ d->ZMM_S(1) =3D F(32, d->ZMM_S(1), s->ZMM_S(1)); \ @@ -611,7 +611,7 @@ void glue(helper_pshufhw, SUFFIX)(Reg *d, Reg *s, int o= rder) d->ZMM_S(0) =3D F(32, d->ZMM_S(0), s->ZMM_S(0)); \ } \ \ - void helper_ ## name ## pd(CPUX86State *env, Reg *d, Reg *s) \ + void glue(helper_ ## name ## pd, SUFFIX)(CPUX86State *env, Reg *d, Reg= *s)\ { \ d->ZMM_D(0) =3D F(64, d->ZMM_D(0), s->ZMM_D(0)); \ d->ZMM_D(1) =3D F(64, d->ZMM_D(1), s->ZMM_D(1)); \ @@ -647,7 +647,7 @@ SSE_HELPER_S(sqrt, FPU_SQRT) =20 =20 /* float to float conversions */ -void 
helper_cvtps2pd(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_cvtps2pd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { float32 s0, s1; =20 @@ -657,7 +657,7 @@ void helper_cvtps2pd(CPUX86State *env, Reg *d, Reg *s) d->ZMM_D(1) =3D float32_to_float64(s1, &env->sse_status); } =20 -void helper_cvtpd2ps(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_cvtpd2ps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { d->ZMM_S(0) =3D float64_to_float32(s->ZMM_D(0), &env->sse_status); d->ZMM_S(1) =3D float64_to_float32(s->ZMM_D(1), &env->sse_status); @@ -675,7 +675,7 @@ void helper_cvtsd2ss(CPUX86State *env, Reg *d, Reg *s) } =20 /* integer to float */ -void helper_cvtdq2ps(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_cvtdq2ps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { d->ZMM_S(0) =3D int32_to_float32(s->ZMM_L(0), &env->sse_status); d->ZMM_S(1) =3D int32_to_float32(s->ZMM_L(1), &env->sse_status); @@ -683,7 +683,7 @@ void helper_cvtdq2ps(CPUX86State *env, Reg *d, Reg *s) d->ZMM_S(3) =3D int32_to_float32(s->ZMM_L(3), &env->sse_status); } =20 -void helper_cvtdq2pd(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_cvtdq2pd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { int32_t l0, l1; =20 @@ -760,7 +760,7 @@ WRAP_FLOATCONV(int64_t, float32_to_int64_round_to_zero,= float32, INT64_MIN) WRAP_FLOATCONV(int64_t, float64_to_int64, float64, INT64_MIN) WRAP_FLOATCONV(int64_t, float64_to_int64_round_to_zero, float64, INT64_MIN) =20 -void helper_cvtps2dq(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_cvtps2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { d->ZMM_L(0) =3D x86_float32_to_int32(s->ZMM_S(0), &env->sse_status); d->ZMM_L(1) =3D x86_float32_to_int32(s->ZMM_S(1), &env->sse_status); @@ -768,7 +768,7 @@ void helper_cvtps2dq(CPUX86State *env, ZMMReg *d, ZMMRe= g *s) d->ZMM_L(3) =3D x86_float32_to_int32(s->ZMM_S(3), &env->sse_status); } =20 -void helper_cvtpd2dq(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_cvtpd2dq, SUFFIX)(CPUX86State *env, ZMMReg 
*d, ZMMReg *s) { d->ZMM_L(0) =3D x86_float64_to_int32(s->ZMM_D(0), &env->sse_status); d->ZMM_L(1) =3D x86_float64_to_int32(s->ZMM_D(1), &env->sse_status); @@ -810,7 +810,7 @@ int64_t helper_cvtsd2sq(CPUX86State *env, ZMMReg *s) #endif =20 /* float to integer truncated */ -void helper_cvttps2dq(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_cvttps2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { d->ZMM_L(0) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(0), &env->= sse_status); d->ZMM_L(1) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(1), &env->= sse_status); @@ -818,7 +818,7 @@ void helper_cvttps2dq(CPUX86State *env, ZMMReg *d, ZMMR= eg *s) d->ZMM_L(3) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(3), &env->= sse_status); } =20 -void helper_cvttpd2dq(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_cvttpd2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { d->ZMM_L(0) =3D x86_float64_to_int32_round_to_zero(s->ZMM_D(0), &env->= sse_status); d->ZMM_L(1) =3D x86_float64_to_int32_round_to_zero(s->ZMM_D(1), &env->= sse_status); @@ -859,7 +859,7 @@ int64_t helper_cvttsd2sq(CPUX86State *env, ZMMReg *s) } #endif =20 -void helper_rsqrtps(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_rsqrtps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { uint8_t old_flags =3D get_float_exception_flags(&env->sse_status); d->ZMM_S(0) =3D float32_div(float32_one, @@ -886,7 +886,7 @@ void helper_rsqrtss(CPUX86State *env, ZMMReg *d, ZMMReg= *s) set_float_exception_flags(old_flags, &env->sse_status); } =20 -void helper_rcpps(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_rcpps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { uint8_t old_flags =3D get_float_exception_flags(&env->sse_status); d->ZMM_S(0) =3D float32_div(float32_one, s->ZMM_S(0), &env->sse_status= ); @@ -947,7 +947,7 @@ void helper_insertq_i(CPUX86State *env, ZMMReg *d, int = index, int length) d->ZMM_Q(0) =3D helper_insertq(d->ZMM_Q(0), index, length); } =20 -void 
helper_haddps(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_haddps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { ZMMReg r; =20 @@ -958,7 +958,7 @@ void helper_haddps(CPUX86State *env, ZMMReg *d, ZMMReg = *s) MOVE(*d, r); } =20 -void helper_haddpd(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_haddpd, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { ZMMReg r; =20 @@ -967,7 +967,7 @@ void helper_haddpd(CPUX86State *env, ZMMReg *d, ZMMReg = *s) MOVE(*d, r); } =20 -void helper_hsubps(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_hsubps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { ZMMReg r; =20 @@ -978,7 +978,7 @@ void helper_hsubps(CPUX86State *env, ZMMReg *d, ZMMReg = *s) MOVE(*d, r); } =20 -void helper_hsubpd(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_hsubpd, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { ZMMReg r; =20 @@ -987,7 +987,7 @@ void helper_hsubpd(CPUX86State *env, ZMMReg *d, ZMMReg = *s) MOVE(*d, r); } =20 -void helper_addsubps(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_addsubps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { d->ZMM_S(0) =3D float32_sub(d->ZMM_S(0), s->ZMM_S(0), &env->sse_status= ); d->ZMM_S(1) =3D float32_add(d->ZMM_S(1), s->ZMM_S(1), &env->sse_status= ); @@ -995,7 +995,7 @@ void helper_addsubps(CPUX86State *env, ZMMReg *d, ZMMRe= g *s) d->ZMM_S(3) =3D float32_add(d->ZMM_S(3), s->ZMM_S(3), &env->sse_status= ); } =20 -void helper_addsubpd(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_addsubpd, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { d->ZMM_D(0) =3D float64_sub(d->ZMM_D(0), s->ZMM_D(0), &env->sse_status= ); d->ZMM_D(1) =3D float64_add(d->ZMM_D(1), s->ZMM_D(1), &env->sse_status= ); @@ -1003,7 +1003,7 @@ void helper_addsubpd(CPUX86State *env, ZMMReg *d, ZMM= Reg *s) =20 /* XXX: unordered */ #define SSE_HELPER_CMP(name, F) \ - void helper_ ## name ## ps(CPUX86State *env, Reg *d, Reg *s) \ + void glue(helper_ ## name ## ps, 
SUFFIX)(CPUX86State *env, Reg *d, Reg= *s)\ { \ d->ZMM_L(0) =3D F(32, d->ZMM_S(0), s->ZMM_S(0)); \ d->ZMM_L(1) =3D F(32, d->ZMM_S(1), s->ZMM_S(1)); \ @@ -1016,7 +1016,7 @@ void helper_addsubpd(CPUX86State *env, ZMMReg *d, ZMM= Reg *s) d->ZMM_L(0) =3D F(32, d->ZMM_S(0), s->ZMM_S(0)); \ } \ \ - void helper_ ## name ## pd(CPUX86State *env, Reg *d, Reg *s) \ + void glue(helper_ ## name ## pd, SUFFIX)(CPUX86State *env, Reg *d, Reg= *s)\ { \ d->ZMM_Q(0) =3D F(64, d->ZMM_D(0), s->ZMM_D(0)); \ d->ZMM_Q(1) =3D F(64, d->ZMM_D(1), s->ZMM_D(1)); \ @@ -1099,7 +1099,7 @@ void helper_comisd(CPUX86State *env, Reg *d, Reg *s) CC_SRC =3D comis_eflags[ret + 1]; } =20 -uint32_t helper_movmskps(CPUX86State *env, Reg *s) +uint32_t glue(helper_movmskps, SUFFIX)(CPUX86State *env, Reg *s) { int b0, b1, b2, b3; =20 @@ -1110,7 +1110,7 @@ uint32_t helper_movmskps(CPUX86State *env, Reg *s) return b0 | (b1 << 1) | (b2 << 2) | (b3 << 3); } =20 -uint32_t helper_movmskpd(CPUX86State *env, Reg *s) +uint32_t glue(helper_movmskpd, SUFFIX)(CPUX86State *env, Reg *s) { int b0, b1; =20 diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h index cef28f2aae..fc697536a0 100644 --- a/target/i386/ops_sse_header.h +++ b/target/i386/ops_sse_header.h @@ -122,8 +122,8 @@ DEF_HELPER_2(glue(movq_mm_T0, SUFFIX), void, Reg, i64) #if SHIFT =3D=3D 0 DEF_HELPER_3(glue(pshufw, SUFFIX), void, Reg, Reg, int) #else -DEF_HELPER_3(shufps, void, Reg, Reg, int) -DEF_HELPER_3(shufpd, void, Reg, Reg, int) +DEF_HELPER_3(glue(shufps, SUFFIX), void, Reg, Reg, int) +DEF_HELPER_3(glue(shufpd, SUFFIX), void, Reg, Reg, int) DEF_HELPER_3(glue(pshufd, SUFFIX), void, Reg, Reg, int) DEF_HELPER_3(glue(pshuflw, SUFFIX), void, Reg, Reg, int) DEF_HELPER_3(glue(pshufhw, SUFFIX), void, Reg, Reg, int) @@ -134,9 +134,9 @@ DEF_HELPER_3(glue(pshufhw, SUFFIX), void, Reg, Reg, int) /* XXX: not accurate */ =20 #define SSE_HELPER_S(name, F) \ - DEF_HELPER_3(name ## ps, void, env, Reg, Reg) \ + DEF_HELPER_3(glue(name ## ps, SUFFIX), 
void, env, Reg, Reg) \ DEF_HELPER_3(name ## ss, void, env, Reg, Reg) \ - DEF_HELPER_3(name ## pd, void, env, Reg, Reg) \ + DEF_HELPER_3(glue(name ## pd, SUFFIX), void, env, Reg, Reg) \ DEF_HELPER_3(name ## sd, void, env, Reg, Reg) =20 SSE_HELPER_S(add, FPU_ADD) @@ -148,12 +148,12 @@ SSE_HELPER_S(max, FPU_MAX) SSE_HELPER_S(sqrt, FPU_SQRT) =20 =20 -DEF_HELPER_3(cvtps2pd, void, env, Reg, Reg) -DEF_HELPER_3(cvtpd2ps, void, env, Reg, Reg) +DEF_HELPER_3(glue(cvtps2pd, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_3(glue(cvtpd2ps, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(cvtss2sd, void, env, Reg, Reg) DEF_HELPER_3(cvtsd2ss, void, env, Reg, Reg) -DEF_HELPER_3(cvtdq2ps, void, env, Reg, Reg) -DEF_HELPER_3(cvtdq2pd, void, env, Reg, Reg) +DEF_HELPER_3(glue(cvtdq2ps, SUFFIX), void, env, Reg, Reg) +DEF_HELPER_3(glue(cvtdq2pd, SUFFIX), void, env, Reg, Reg) DEF_HELPER_3(cvtpi2ps, void, env, ZMMReg, MMXReg) DEF_HELPER_3(cvtpi2pd, void, env, ZMMReg, MMXReg) DEF_HELPER_3(cvtsi2ss, void, env, ZMMReg, i32) @@ -164,8 +164,8 @@ DEF_HELPER_3(cvtsq2ss, void, env, ZMMReg, i64) DEF_HELPER_3(cvtsq2sd, void, env, ZMMReg, i64) #endif =20 -DEF_HELPER_3(cvtps2dq, void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(cvtpd2dq, void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(cvtps2dq, SUFFIX), void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(cvtpd2dq, SUFFIX), void, env, ZMMReg, ZMMReg) DEF_HELPER_3(cvtps2pi, void, env, MMXReg, ZMMReg) DEF_HELPER_3(cvtpd2pi, void, env, MMXReg, ZMMReg) DEF_HELPER_2(cvtss2si, s32, env, ZMMReg) @@ -175,8 +175,8 @@ DEF_HELPER_2(cvtss2sq, s64, env, ZMMReg) DEF_HELPER_2(cvtsd2sq, s64, env, ZMMReg) #endif =20 -DEF_HELPER_3(cvttps2dq, void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(cvttpd2dq, void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(cvttps2dq, SUFFIX), void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(cvttpd2dq, SUFFIX), void, env, ZMMReg, ZMMReg) DEF_HELPER_3(cvttps2pi, void, env, MMXReg, ZMMReg) DEF_HELPER_3(cvttpd2pi, void, env, MMXReg, ZMMReg) DEF_HELPER_2(cvttss2si, s32, env, ZMMReg) @@ 
-186,25 +186,25 @@ DEF_HELPER_2(cvttss2sq, s64, env, ZMMReg) DEF_HELPER_2(cvttsd2sq, s64, env, ZMMReg) #endif =20 -DEF_HELPER_3(rsqrtps, void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(rsqrtps, SUFFIX), void, env, ZMMReg, ZMMReg) DEF_HELPER_3(rsqrtss, void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(rcpps, void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(rcpps, SUFFIX), void, env, ZMMReg, ZMMReg) DEF_HELPER_3(rcpss, void, env, ZMMReg, ZMMReg) DEF_HELPER_3(extrq_r, void, env, ZMMReg, ZMMReg) DEF_HELPER_4(extrq_i, void, env, ZMMReg, int, int) DEF_HELPER_3(insertq_r, void, env, ZMMReg, ZMMReg) DEF_HELPER_4(insertq_i, void, env, ZMMReg, int, int) -DEF_HELPER_3(haddps, void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(haddpd, void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(hsubps, void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(hsubpd, void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(addsubps, void, env, ZMMReg, ZMMReg) -DEF_HELPER_3(addsubpd, void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(haddps, SUFFIX), void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(haddpd, SUFFIX), void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(hsubps, SUFFIX), void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(hsubpd, SUFFIX), void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(addsubps, SUFFIX), void, env, ZMMReg, ZMMReg) +DEF_HELPER_3(glue(addsubpd, SUFFIX), void, env, ZMMReg, ZMMReg) =20 #define SSE_HELPER_CMP(name, F) \ - DEF_HELPER_3(name ## ps, void, env, Reg, Reg) \ + DEF_HELPER_3(glue(name ## ps, SUFFIX), void, env, Reg, Reg) \ DEF_HELPER_3(name ## ss, void, env, Reg, Reg) \ - DEF_HELPER_3(name ## pd, void, env, Reg, Reg) \ + DEF_HELPER_3(glue(name ## pd, SUFFIX), void, env, Reg, Reg) \ DEF_HELPER_3(name ## sd, void, env, Reg, Reg) =20 SSE_HELPER_CMP(cmpeq, FPU_CMPEQ) @@ -220,8 +220,8 @@ DEF_HELPER_3(ucomiss, void, env, Reg, Reg) DEF_HELPER_3(comiss, void, env, Reg, Reg) DEF_HELPER_3(ucomisd, void, env, Reg, Reg) DEF_HELPER_3(comisd, void, env, Reg, Reg) -DEF_HELPER_2(movmskps, i32, env, Reg) -DEF_HELPER_2(movmskpd, i32, env, Reg) 
+DEF_HELPER_2(glue(movmskps, SUFFIX), i32, env, Reg) +DEF_HELPER_2(glue(movmskpd, SUFFIX), i32, env, Reg) #endif =20 DEF_HELPER_2(glue(pmovmskb, SUFFIX), i32, efine SSE_FOP(name) OP(op1, SSE_= OPF_SCALAR, \ - gen_helper_##name##ps, gen_helper_##name##pd, \ + gen_helper_##name##ps##_xmm, gen_helper_##name##pd##_xmm, \ gen_helper_##name##ss, gen_helper_##name##sd) #define SSE_OP(sname, dname, op, flags) OP(op, flags, \ gen_helper_##sname##_xmm, gen_helper_##dname##_xmm, NULL, NULL) @@ -2843,12 +2843,12 @@ static const struct SSEOpHelper_table1 sse_op_table= 1[256] =3D { gen_helper_comiss, gen_helper_comisd, NULL, NULL), [0x50] =3D SSE_SPECIAL, /* movmskps, movmskpd */ [0x51] =3D OP(op1, SSE_OPF_SCALAR, - gen_helper_sqrtps, gen_helper_sqrtpd, + gen_helper_sqrtps_xmm, gen_helper_sqrtpd_xmm, gen_helper_sqrtss, gen_helper_sqrtsd), [0x52] =3D OP(op1, SSE_OPF_SCALAR, - gen_helper_rsqrtps, NULL, gen_helper_rsqrtss, NULL), + gen_helper_rsqrtps_xmm, NULL, gen_helper_rsqrtss, NULL), [0x53] =3D OP(op1, SSE_OPF_SCALAR, - gen_helper_rcpps, NULL, gen_helper_rcpss, NULL), + gen_helper_rcpps_xmm, NULL, gen_helper_rcpss, NULL), [0x54] =3D SSE_OP(pand, pand, op1, 0), /* andps, andpd */ [0x55] =3D SSE_OP(pandn, pandn, op1, 0), /* andnps, andnpd */ [0x56] =3D SSE_OP(por, por, op1, 0), /* orps, orpd */ @@ -2856,19 +2856,19 @@ static const struct SSEOpHelper_table1 sse_op_table= 1[256] =3D { [0x58] =3D SSE_FOP(add), [0x59] =3D SSE_FOP(mul), [0x5a] =3D OP(op1, SSE_OPF_SCALAR, - gen_helper_cvtps2pd, gen_helper_cvtpd2ps, + gen_helper_cvtps2pd_xmm, gen_helper_cvtpd2ps_xmm, gen_helper_cvtss2sd, gen_helper_cvtsd2ss), [0x5b] =3D OP(op1, 0, - gen_helper_cvtdq2ps, gen_helper_cvtps2dq, - gen_helper_cvttps2dq, NULL), + gen_helper_cvtdq2ps_xmm, gen_helper_cvtps2dq_xmm, + gen_helper_cvttps2dq_xmm, NULL), [0x5c] =3D SSE_FOP(sub), [0x5d] =3D SSE_FOP(min), [0x5e] =3D SSE_FOP(div), [0x5f] =3D SSE_FOP(max), =20 [0xc2] =3D SSE_FOP(cmpeq), /* sse_op_table4 */ - [0xc6] =3D OP(dummy, SSE_OPF_SHUF, 
(SSEFunc_0_epp)gen_helper_shufps, - (SSEFunc_0_epp)gen_helper_shufpd, NULL, NULL), + [0xc6] =3D OP(dummy, SSE_OPF_SHUF, (SSEFunc_0_epp)gen_helper_shufps_xm= m, + (SSEFunc_0_epp)gen_helper_shufpd_xmm, NULL, NULL), =20 /* SSSE3, SSE4, MOVBE, CRC32, BMI1, BMI2, ADX. */ [0x38] =3D SSE_SPECIAL, @@ -2909,15 +2909,15 @@ static const struct SSEOpHelper_table1 sse_op_table= 1[256] =3D { [0x79] =3D OP(op1, 0, NULL, gen_helper_extrq_r, NULL, gen_helper_insertq_r), [0x7c] =3D OP(op1, 0, - NULL, gen_helper_haddpd, NULL, gen_helper_haddps), + NULL, gen_helper_haddpd_xmm, NULL, gen_helper_haddps_xmm), [0x7d] =3D OP(op1, 0, - NULL, gen_helper_hsubpd, NULL, gen_helper_hsubps), + NULL, gen_helper_hsubpd_xmm, NULL, gen_helper_hsubps_xmm), [0x7e] =3D SSE_SPECIAL, /* movd, movd, , movq */ [0x7f] =3D SSE_SPECIAL, /* movq, movdqa, movdqu */ [0xc4] =3D SSE_SPECIAL, /* pinsrw */ [0xc5] =3D SSE_SPECIAL, /* pextrw */ [0xd0] =3D OP(op1, 0, - NULL, gen_helper_addsubpd, NULL, gen_helper_addsubps), + NULL, gen_helper_addsubpd_xmm, NULL, gen_helper_addsubps_x= mm), [0xd1] =3D MMX_OP(psrlw), [0xd2] =3D MMX_OP(psrld), [0xd3] =3D MMX_OP(psrlq), @@ -2940,8 +2940,8 @@ static const struct SSEOpHelper_table1 sse_op_table1[= 256] =3D { [0xe4] =3D MMX_OP(pmulhuw), [0xe5] =3D MMX_OP(pmulhw), [0xe6] =3D OP(op1, 0, - NULL, gen_helper_cvttpd2dq, - gen_helper_cvtdq2pd, gen_helper_cvtpd2dq), + NULL, gen_helper_cvttpd2dq_xmm, + gen_helper_cvtdq2pd_xmm, gen_helper_cvtpd2dq_xmm), [0xe7] =3D SSE_SPECIAL, /* movntq, movntq */ [0xe8] =3D MMX_OP(psubsb), [0xe9] =3D MMX_OP(psubsw), @@ -3018,8 +3018,9 @@ static const SSEFunc_l_ep sse_op_table3bq[] =3D { }; #endif =20 -#define SSE_FOP(x) { gen_helper_ ## x ## ps, gen_helper_ ## x ## pd, \ - gen_helper_ ## x ## ss, gen_helper_ ## x ## sd, } +#define SSE_FOP(x) { \ + gen_helper_ ## x ## ps ## _xmm, gen_helper_ ## x ## pd ## _xmm, \ + gen_helper_ ## x ## ss, gen_helper_ ## x ## sd} static const SSEFunc_0_epp sse_op_table4[8][4] =3D { SSE_FOP(cmpeq), SSE_FOP(cmplt), @@ 
-3636,13 +3637,13 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, case 0x050: /* movmskps */ rm =3D (modrm & 7) | REX_B(s); tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(rm)); - gen_helper_movmskps(s->tmp2_i32, cpu_env, s->ptr0); + gen_helper_movmskps_xmm(s->tmp2_i32, cpu_env, s->ptr0); tcg_gen_extu_i32_tl(cpu_regs[reg], s->tmp2_i32); break; case 0x150: /* movmskpd */ rm =3D (modrm & 7) | REX_B(s); tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(rm)); - gen_helper_movmskpd(s->tmp2_i32, cpu_env, s->ptr0); + gen_helper_movmskpd_xmm(s->tmp2_i32, cpu_env, s->ptr0); tcg_gen_extu_i32_tl(cpu_regs[reg], s->tmp2_i32); break; case 0x02a: /* cvtpi2ps */ --=20 2.37.1 From nobody Mon Feb 9 15:46:37 2026 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1662021399; cv=none; d=zohomail.com; s=zohoarc; b=XlRd+8J71fQHPoR4ertpIWAxDwJ0c4lAmfvtzd/kRCoFh21xWtc+u8EvbrWEHX3YLDXNbphsVKWYQuVmsSv4evfPW1xDWf7RdD30rMi1HBbY/HlquSoiTnAIwB9/9kUDmffZaeyv2EQnN2u+ddtc9GMD3DsXGK51AkjpLmhtDXU= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1662021399; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=jiLcyNFadFmE+CsaVrOfMKBIDJnZ/246U5aUFQtItY4=; b=cg1Sc035WqSBrvDDqL5B2eLsiWD0WOaYZEZ6yNLa5exWCk5kRN7NPa5bT+x6cx0XD8iC3rd+dOvJq3XIsaNpPToZGFnEdWN5v9t/U1+VPtvlRfs5Mg3I2Srst5gL7EeaJlQGuGMYDB5+2p0yre5EHT1rk/mwSNhN95P69HrTEqs= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; 
dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 16620213994511013.733094170085; Thu, 1 Sep 2022 01:36:39 -0700 (PDT) Received: from localhost ([::1]:55706 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oTfgk-0004N9-3L for importer@patchew.org; Thu, 01 Sep 2022 04:36:38 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48478) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oTewp-0001Hs-99 for qemu-devel@nongnu.org; Thu, 01 Sep 2022 03:49:11 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:29905) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oTewm-00037Z-CP for qemu-devel@nongnu.org; Thu, 01 Sep 2022 03:49:10 -0400 Received: from mail-wm1-f70.google.com (mail-wm1-f70.google.com [209.85.128.70]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-307-yy6JMethPSmo3zxwNNLvYA-1; Thu, 01 Sep 2022 03:49:06 -0400 Received: by mail-wm1-f70.google.com with SMTP id h82-20020a1c2155000000b003a64d0510d9so9467779wmh.8 for ; Thu, 01 Sep 2022 00:49:06 -0700 (PDT) Received: from goa-sendmail ([2001:b07:6468:f312:5e2c:eb9a:a8b6:fd3e]) by smtp.gmail.com with ESMTPSA id d18-20020adff2d2000000b0022542581800sm16257170wrp.45.2022.09.01.00.49.03 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 01 Sep 2022 00:49:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1662018547; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=jiLcyNFadFmE+CsaVrOfMKBIDJnZ/246U5aUFQtItY4=; 
b=bL4JxLcgfLBvyywzhJZ9yxUivixw4FNMz6vt/4jcONK2o7a/+yyjEjqc7KCFsgc/VJpeHW 58eOPhCp4PHobOmIrFQavVZsPM6BrDOItu28G2tv8CJXpQqD3ADqOdct5l7nkXVaBYcfV9 9y8lzvqJk3BpH72aEcbe/mG4e0hEX6M= X-MC-Unique: yy6JMethPSmo3zxwNNLvYA-1 X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date; bh=jiLcyNFadFmE+CsaVrOfMKBIDJnZ/246U5aUFQtItY4=; b=57lp/T1IP35rpTdGHhxHGGLQCAZXODkJHWOa1l08Cgw4cLJ5elHQ54ZIEes8Yz/3KY VxKWQ8qnGcAmT2uoQrDiMJOQegZNjd2sW20s0mDg4r+QUeR3IMjE5YlVGNRVwgzpY3KJ AFciO/EeQJDShqF2TmfBadzax8v0qYRy0LBZLZd6FpgWjOcYj0tL+MxswSrpFjO+fulI YOpUg7SMUh7c6B010qbM1g1AjUzlHNGeklvpMq/5jJLREoHgPJyUgMqUVuzP1YInRYg+ GytiWEu+lzxShfejVg0BCdfHuSZ/4o9V7wXiEN3NrwSg+rWuI5Hq3vvYT3PM/GDk8KWB yipQ== X-Gm-Message-State: ACgBeo1M4E1ilzm2RC4Z57+MOskWIuWunoKHxGULw1kkGXTbcCKMXvFz 7vPQH0qizazLlUW/EuUySuTEPg4WayyWRTMecIo5v7MQ5iA1Wo5DVA9MvGKa91PbhbkFyilLxla 9MfekOES4yNl1LaGQ9qeZlYHkT4uR469qhSTi1/FOrXvBTL0KTxSGGp5wa7pjpcPxLkg= X-Received: by 2002:a05:6000:1563:b0:222:c70e:b2a5 with SMTP id 3-20020a056000156300b00222c70eb2a5mr14104040wrz.492.1662018545095; Thu, 01 Sep 2022 00:49:05 -0700 (PDT) X-Google-Smtp-Source: AA6agR72mFg4sv6ofuOvfmMfMhvAMT2gCSWYZ0h+Q9y5yRiIGYTcrKquWaeM7+bbpAyNVkbqNxV+oQ== X-Received: by 2002:a05:6000:1563:b0:222:c70e:b2a5 with SMTP id 3-20020a056000156300b00222c70eb2a5mr14104022wrz.492.1662018544754; Thu, 01 Sep 2022 00:49:04 -0700 (PDT) From: Paolo Bonzini To: qemu-devel@nongnu.org Cc: paul@nowt.org, richard.henderson@linaro.org Subject: [PATCH v3 10/23] i386: do not cast gen_helper_* function pointers Date: Thu, 1 Sep 2022 09:48:29 +0200 Message-Id: <20220901074842.57424-11-pbonzini@redhat.com> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20220901074842.57424-1-pbonzini@redhat.com> References: <20220901074842.57424-1-pbonzini@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: 
pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=pbonzini@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1662021400246100001 Content-Type: text/plain; charset="utf-8" Use a union to store the various possible kinds of function pointers, and access the correct one based on the flags. SSEOpHelper_table6 and SSEOpHelper_table7 right now only have one case, but this would change with AVX's 3- and 4-argument operations. Use unions there too, to keep the code more similar for the three tables. Extracted from a patch by Paul Brook . 
Signed-off-by: Paolo Bonzini Reviewed-by: Richard Henderson --- target/i386/tcg/translate.c | 75 ++++++++++++++++++------------------- 1 file changed, 37 insertions(+), 38 deletions(-) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index 16db155c94..c6a9a5b1d4 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -2784,6 +2784,8 @@ typedef void (*SSEFunc_l_ep)(TCGv_i64 val, TCGv_ptr e= nv, TCGv_ptr reg); typedef void (*SSEFunc_0_epi)(TCGv_ptr env, TCGv_ptr reg, TCGv_i32 val); typedef void (*SSEFunc_0_epl)(TCGv_ptr env, TCGv_ptr reg, TCGv_i64 val); typedef void (*SSEFunc_0_epp)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_b= ); +typedef void (*SSEFunc_0_eppp)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_= b, + TCGv_ptr reg_c); typedef void (*SSEFunc_0_eppi)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_= b, TCGv_i32 val); typedef void (*SSEFunc_0_ppi)(TCGv_ptr reg_a, TCGv_ptr reg_b, TCGv_i32 val= ); @@ -2798,7 +2800,7 @@ typedef void (*SSEFunc_0_eppt)(TCGv_ptr env, TCGv_ptr= reg_a, TCGv_ptr reg_b, #define SSE_OPF_SHUF (1 << 9) /* pshufx/shufpx */ =20 #define OP(op, flags, a, b, c, d) \ - {flags, {a, b, c, d} } + {flags, {{.op =3D a}, {.op =3D b}, {.op =3D c}, {.op =3D d} } } =20 #define MMX_OP(x) OP(op1, SSE_OPF_MMX, \ gen_helper_ ## x ## _mmx, gen_helper_ ## x ## _xmm, NULL, NULL) @@ -2809,9 +2811,15 @@ typedef void (*SSEFunc_0_eppt)(TCGv_ptr env, TCGv_pt= r reg_a, TCGv_ptr reg_b, #define SSE_OP(sname, dname, op, flags) OP(op, flags, \ gen_helper_##sname##_xmm, gen_helper_##dname##_xmm, NULL, NULL) =20 +typedef union SSEFuncs { + SSEFunc_0_epp op1; + SSEFunc_0_ppi op1i; + SSEFunc_0_eppt op1t; +} SSEFuncs; + struct SSEOpHelper_table1 { int flags; - SSEFunc_0_epp op[4]; + SSEFuncs fn[4]; }; =20 #define SSE_3DNOW { SSE_OPF_3DNOW } @@ -2867,8 +2875,7 @@ static const struct SSEOpHelper_table1 sse_op_table1[= 256] =3D { [0x5f] =3D SSE_FOP(max), =20 [0xc2] =3D SSE_FOP(cmpeq), /* sse_op_table4 */ - [0xc6] =3D OP(dummy, SSE_OPF_SHUF, 
(SSEFunc_0_epp)gen_helper_shufps_xm= m, - (SSEFunc_0_epp)gen_helper_shufpd_xmm, NULL, NULL), + [0xc6] =3D SSE_OP(shufps, shufpd, op1i, SSE_OPF_SHUF), =20 /* SSSE3, SSE4, MOVBE, CRC32, BMI1, BMI2, ADX. */ [0x38] =3D SSE_SPECIAL, @@ -2894,10 +2901,8 @@ static const struct SSEOpHelper_table1 sse_op_table1= [256] =3D { [0x6e] =3D SSE_SPECIAL, /* movd mm, ea */ [0x6f] =3D SSE_SPECIAL, /* movq, movdqa, , movqdu */ [0x70] =3D OP(op1i, SSE_OPF_SHUF | SSE_OPF_MMX, - (SSEFunc_0_epp)gen_helper_pshufw_mmx, - (SSEFunc_0_epp)gen_helper_pshufd_xmm, - (SSEFunc_0_epp)gen_helper_pshufhw_xmm, - (SSEFunc_0_epp)gen_helper_pshuflw_xmm), + gen_helper_pshufw_mmx, gen_helper_pshufd_xmm, + gen_helper_pshufhw_xmm, gen_helper_pshuflw_xmm), [0x71] =3D SSE_SPECIAL, /* shiftw */ [0x72] =3D SSE_SPECIAL, /* shiftd */ [0x73] =3D SSE_SPECIAL, /* shiftq */ @@ -2959,8 +2964,7 @@ static const struct SSEOpHelper_table1 sse_op_table1[= 256] =3D { [0xf5] =3D MMX_OP(pmaddwd), [0xf6] =3D MMX_OP(psadbw), [0xf7] =3D OP(op1t, SSE_OPF_MMX, - (SSEFunc_0_epp)gen_helper_maskmov_mmx, - (SSEFunc_0_epp)gen_helper_maskmov_xmm, NULL, NULL), + gen_helper_maskmov_mmx, gen_helper_maskmov_xmm, NULL, NULL= ), [0xf8] =3D MMX_OP(psubb), [0xf9] =3D MMX_OP(psubw), [0xfa] =3D MMX_OP(psubl), @@ -3057,17 +3061,19 @@ static const SSEFunc_0_epp sse_op_table5[256] =3D { [0xb6] =3D gen_helper_movq, /* pfrcpit2 */ [0xb7] =3D gen_helper_pmulhrw_mmx, [0xbb] =3D gen_helper_pswapd, - [0xbf] =3D gen_helper_pavgb_mmx /* pavgusb */ + [0xbf] =3D gen_helper_pavgb_mmx, }; =20 struct SSEOpHelper_table6 { - SSEFunc_0_epp op[2]; + SSEFuncs fn[2]; uint32_t ext_mask; int flags; }; =20 struct SSEOpHelper_table7 { - SSEFunc_0_eppi op[2]; + union { + SSEFunc_0_eppi op1; + } fn[2]; uint32_t ext_mask; int flags; }; @@ -3075,7 +3081,8 @@ struct SSEOpHelper_table7 { #define gen_helper_special_xmm NULL =20 #define OP(name, op, flags, ext, mmx_name) \ - {{mmx_name, gen_helper_ ## name ## _xmm}, CPUID_EXT_ ## ext, flags} + {{{.op =3D mmx_name}, {.op =3D 
gen_helper_ ## name ## _xmm} }, \ + CPUID_EXT_ ## ext, flags} #define BINARY_OP_MMX(name, ext) \ OP(name, op1, SSE_OPF_MMX, ext, gen_helper_ ## name ## _mmx) #define BINARY_OP(name, ext, flags) \ @@ -3185,11 +3192,9 @@ static void gen_sse(CPUX86State *env, DisasContext *= s, int b, int b1, op1_offset, op2_offset, is_xmm, val; int modrm, mod, rm, reg; int sse_op_flags; + SSEFuncs sse_op_fn; const struct SSEOpHelper_table6 *op6; const struct SSEOpHelper_table7 *op7; - SSEFunc_0_epp sse_fn_epp; - SSEFunc_0_ppi sse_fn_ppi; - SSEFunc_0_eppt sse_fn_eppt; MemOp ot; =20 b &=3D 0xff; @@ -3202,9 +3207,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, else b1 =3D 0; sse_op_flags =3D sse_op_table1[b].flags; - sse_fn_epp =3D sse_op_table1[b].op[b1]; + sse_op_fn =3D sse_op_table1[b].fn[b1]; if ((sse_op_flags & (SSE_OPF_SPECIAL | SSE_OPF_3DNOW)) =3D=3D 0 - && !sse_fn_epp) { + && !sse_op_fn.op1) { goto unknown_op; } if ((b <=3D 0x5f && b >=3D 0x10) || b =3D=3D 0xc6 || b =3D=3D 0xc2) { @@ -3618,9 +3623,9 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, op1_offset =3D offsetof(CPUX86State,mmx_t0); } assert(b1 < 2); - sse_fn_epp =3D sse_op_table2[((b - 1) & 3) * 8 + + SSEFunc_0_epp fn =3D sse_op_table2[((b - 1) & 3) * 8 + (((modrm >> 3)) & 7)][b1]; - if (!sse_fn_epp) { + if (!fn) { goto unknown_op; } if (is_xmm) { @@ -3632,7 +3637,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } tcg_gen_addi_ptr(s->ptr0, cpu_env, op2_offset); tcg_gen_addi_ptr(s->ptr1, cpu_env, op1_offset); - sse_fn_epp(cpu_env, s->ptr0, s->ptr1); + fn(cpu_env, s->ptr0, s->ptr1); break; case 0x050: /* movmskps */ rm =3D (modrm & 7) | REX_B(s); @@ -3889,12 +3894,12 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, gen_ldo_env_A0(s, op2_offset); } } - if (!op6->op[b1]) { + if (!op6->fn[b1].op1) { goto illegal_op; } tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); - op6->op[b1](cpu_env, s->ptr0, 
s->ptr1); + op6->fn[b1].op1(cpu_env, s->ptr0, s->ptr1); } else { if ((op6->flags & SSE_OPF_MMX) =3D=3D 0) { goto unknown_op; @@ -3909,7 +3914,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); - op6->op[0](cpu_env, s->ptr0, s->ptr1); + op6->fn[0].op1(cpu_env, s->ptr0, s->ptr1); } =20 if (op6->flags & SSE_OPF_CMP) { @@ -4450,8 +4455,8 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, /* We only actually have one MMX instuction (palignr) */ assert(b =3D=3D 0x0f); =20 - op7->op[0](cpu_env, s->ptr0, s->ptr1, - tcg_const_i32(val)); + op7->fn[0].op1(cpu_env, s->ptr0, s->ptr1, + tcg_const_i32(val)); break; } =20 @@ -4477,7 +4482,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, =20 tcg_gen_addi_ptr(s->ptr0, cpu_env, op1_offset); tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); - op7->op[b1](cpu_env, s->ptr0, s->ptr1, tcg_const_i32(val)); + op7->fn[b1].op1(cpu_env, s->ptr0, s->ptr1, tcg_const_i32(val)); if (op7->flags & SSE_OPF_CMP) { set_cc_op(s, CC_OP_EFLAGS); } @@ -4603,9 +4608,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); if (sse_op_flags & SSE_OPF_SHUF) { val =3D x86_ldub_code(env, s); - /* XXX: introduce a new table? */ - sse_fn_ppi =3D (SSEFunc_0_ppi)sse_fn_epp; - sse_fn_ppi(s->ptr0, s->ptr1, tcg_const_i32(val)); + sse_op_fn.op1i(s->ptr0, s->ptr1, tcg_const_i32(val)); } else if (b =3D=3D 0xf7) { /* maskmov : we must prepare A0 */ if (mod !=3D 3) { @@ -4614,17 +4617,13 @@ static void gen_sse(CPUX86State *env, DisasContext = *s, int b, tcg_gen_mov_tl(s->A0, cpu_regs[R_EDI]); gen_extu(s->aflag, s->A0); gen_add_A0_ds_seg(s); - - /* XXX: introduce a new table? 
*/ - sse_fn_eppt =3D (SSEFunc_0_eppt)sse_fn_epp; - sse_fn_eppt(cpu_env, s->ptr0, s->ptr1, s->A0); + sse_op_fn.op1t(cpu_env, s->ptr0, s->ptr1, s->A0); } else if (b =3D=3D 0xc2) { /* compare insns, bits 7:3 (7:5 for AVX) are ignored */ val =3D x86_ldub_code(env, s) & 7; - sse_fn_epp =3D sse_op_table4[val][b1]; - sse_fn_epp(cpu_env, s->ptr0, s->ptr1); + sse_op_table4[val][b1](cpu_env, s->ptr0, s->ptr1); } else { - sse_fn_epp(cpu_env, s->ptr0, s->ptr1); + sse_op_fn.op1(cpu_env, s->ptr0, s->ptr1); } =20 if (sse_op_flags & SSE_OPF_CMP) { --=20 2.37.1 From nobody Mon Feb 9 15:46:37 2026 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1662020657; cv=none; d=zohomail.com; s=zohoarc; b=ZDCIUJwjDbYEtLMSO4WpEt6oP2A8oDOesUlMqdVNrve7ro3tyxm9QlVjFJcW0KFdO6gxZyKIyIKj8nDduHcxCbaZdUHnSeerr3WG5Mh4EAVUZrFpW5l5k7jP8FlQb3PPZHXGsLMZB4Ur/yrR12urQvErXgXzDVAzDnCr81eEcSw= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1662020657; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=iT5cL3WTVeQZflXZw+M1LECEHW0Yw44IK7kf+h+EFn4=; b=MDsEFh4A5UDcHQi+vUqRoPGLY7ldBim8mDLsIDMT7LeaVvL5kx11RQshnSj/oV7XzzDIfAW+Xw/VzuDViXfzExRtPQ9EOECwwenKHTXemRNjojFd44fXmG8HehmcWYlEl3XXVMhawDx6qWJ7B/06VqqLlfO3iNqSV14dP+Hbo5M= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by 
mx.zohomail.com with SMTPS id 1662020657455614.3860860500084; Thu, 1 Sep 2022 01:24:17 -0700 (PDT) Received: from localhost ([::1]:39738 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oTfUm-0008JE-7i for importer@patchew.org; Thu, 01 Sep 2022 04:24:16 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48480) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oTewp-0001JP-Pf for qemu-devel@nongnu.org; Thu, 01 Sep 2022 03:49:11 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:40926) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oTewn-00037n-V3 for qemu-devel@nongnu.org; Thu, 01 Sep 2022 03:49:11 -0400 Received: from mail-wr1-f72.google.com (mail-wr1-f72.google.com [209.85.221.72]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-645--kJweAh-O06eX6fFnjP9bQ-1; Thu, 01 Sep 2022 03:49:08 -0400 Received: by mail-wr1-f72.google.com with SMTP id r7-20020adfbb07000000b00225b9579132so2822702wrg.6 for ; Thu, 01 Sep 2022 00:49:07 -0700 (PDT) Received: from goa-sendmail ([2001:b07:6468:f312:5e2c:eb9a:a8b6:fd3e]) by smtp.gmail.com with ESMTPSA id cc2-20020a5d5c02000000b0021e4bc9edbfsm14799787wrb.112.2022.09.01.00.49.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 01 Sep 2022 00:49:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1662018549; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=iT5cL3WTVeQZflXZw+M1LECEHW0Yw44IK7kf+h+EFn4=; b=eeHUz6N1FsK082myuWv8VX0N/EDh8AOM7Jl/3e+Kxn+rjM3zzXY036eOUDd6PydZM68/dS M1wYXtnEz4x1ggaiZJ2f2yr1vbgTqOcK0Xer1msWUsfwEicdzgGxADAIARyrp6MZbCHuw+ 
RMMcQFg8WFNyzvWEj2wbOWbqksZJloM= X-MC-Unique: -kJweAh-O06eX6fFnjP9bQ-1 X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date; bh=iT5cL3WTVeQZflXZw+M1LECEHW0Yw44IK7kf+h+EFn4=; b=pZLCkiPNHfKtHnnlKaB+ecQyw157H6NNcU5fCN7CDUD+FDRVXAypYoC2CvPSXhsaQ+ oVHC0DPptDdXdZ2wPXjLBS8LhPgoeJF/EINRTYmqczUmePRQvRsEIzAD++sDt40f3Tzd 7c1gH2ivUmZ2hVqXuWnZm3wNDwX0y1rw0FR+Ae+KTRExyy1/+HYmq5Y9OK3tVinPqUG1 NH15H7rV+Avzt4EITMMoxaKG0GnM7k6a3+/vghRLGIT6yyj7JilfA9ZGFYDiKiYToOQS TosgcAps7h4YFZbTpAjvhJxRzjF7ztsi7F1u+kAtROUu1FTJp37MigrwMZLulDYENtPj TM+g== X-Gm-Message-State: ACgBeo1jtuCeTwS816spqNWeNKgimE8CZxXe0Nb8xHexOE25uCbRMrQ9 JwJLMLEFoX9bXLLxFWZ+riKXqB5op/nRMtUbGhvwzmvVBD46A6Xx/AI0DwfEEf3vnLhDrukRUid QfGPDXEGR8PqlvzPbxASmk43GmZVBlhSpsHDbuqUqmyAEdsmau/awelRLzn0keRonYwc= X-Received: by 2002:a05:6000:713:b0:226:ea6c:2d7d with SMTP id bs19-20020a056000071300b00226ea6c2d7dmr4224888wrb.293.1662018546707; Thu, 01 Sep 2022 00:49:06 -0700 (PDT) X-Google-Smtp-Source: AA6agR6iDxTsZbp4IK5wpmPplvsKwrfF87FGP4Kafz62WlQsk7LoNdWn5zOKc1sorrTotVwsfj/Plg== X-Received: by 2002:a05:6000:713:b0:226:ea6c:2d7d with SMTP id bs19-20020a056000071300b00226ea6c2d7dmr4224867wrb.293.1662018546374; Thu, 01 Sep 2022 00:49:06 -0700 (PDT) From: Paolo Bonzini To: qemu-devel@nongnu.org Cc: paul@nowt.org, richard.henderson@linaro.org Subject: [PATCH v3 11/23] i386: Add CHECK_NO_VEX Date: Thu, 1 Sep 2022 09:48:30 +0200 Message-Id: <20220901074842.57424-12-pbonzini@redhat.com> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20220901074842.57424-1-pbonzini@redhat.com> References: <20220901074842.57424-1-pbonzini@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; 
envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=pbonzini@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1662020658175100001 Content-Type: text/plain; charset="utf-8" From: Paul Brook Reject invalid VEX encodings on MMX instructions. Signed-off-by: Paul Brook Reviewed-by: Richard Henderson Message-Id: <20220424220204.2493824-7-paul@nowt.org> Signed-off-by: Paolo Bonzini --- target/i386/tcg/translate.c | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c index c6a9a5b1d4..99c84473f4 100644 --- a/target/i386/tcg/translate.c +++ b/target/i386/tcg/translate.c @@ -3186,6 +3186,12 @@ static const struct SSEOpHelper_table7 sse_op_table7= [256] =3D { #undef BLENDV_OP #undef SPECIAL_OP =20 +/* VEX prefix not allowed */ +#define CHECK_NO_VEX(s) do { \ + if (s->prefix & PREFIX_VEX) \ + goto illegal_op; \ + } while (0) + static void gen_sse(CPUX86State *env, DisasContext *s, int b, target_ulong pc_start) { @@ -3272,6 +3278,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, b |=3D (b1 << 8); switch(b) { case 0x0e7: /* movntq */ + CHECK_NO_VEX(s); if (mod =3D=3D 3) { goto illegal_op; } @@ -3307,6 +3314,7 @@ static void gen_sse(CPUX86State 
*env, DisasContext *s= , int b, } break; case 0x6e: /* movd mm, ea */ + CHECK_NO_VEX(s); #ifdef TARGET_X86_64 if (s->dflag =3D=3D MO_64) { gen_ldst_modrm(env, s, modrm, MO_64, OR_TMP0, 0); @@ -3338,6 +3346,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } break; case 0x6f: /* movq mm, ea */ + CHECK_NO_VEX(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); gen_ldq_env_A0(s, offsetof(CPUX86State, fpregs[reg].mmx)); @@ -3473,6 +3482,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, break; case 0x178: case 0x378: + CHECK_NO_VEX(s); { int bit_index, field_length; =20 @@ -3492,6 +3502,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } break; case 0x7e: /* movd ea, mm */ + CHECK_NO_VEX(s); #ifdef TARGET_X86_64 if (s->dflag =3D=3D MO_64) { tcg_gen_ld_i64(s->T0, cpu_env, @@ -3532,6 +3543,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, gen_op_movq_env_0(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q= (1))); break; case 0x7f: /* movq ea, mm */ + CHECK_NO_VEX(s); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); gen_stq_env_A0(s, offsetof(CPUX86State, fpregs[reg].mmx)); @@ -3614,6 +3626,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, offsetof(CPUX86State, xmm_t0.ZMM_L(1))); op1_offset =3D offsetof(CPUX86State,xmm_t0); } else { + CHECK_NO_VEX(s); tcg_gen_movi_tl(s->T0, val); tcg_gen_st32_tl(s->T0, cpu_env, offsetof(CPUX86State, mmx_t0.MMX_L(0))); @@ -3653,6 +3666,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, break; case 0x02a: /* cvtpi2ps */ case 0x12a: /* cvtpi2pd */ + CHECK_NO_VEX(s); gen_helper_enter_mmx(cpu_env); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); @@ -3698,6 +3712,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, case 0x12c: /* cvttpd2pi */ case 0x02d: /* cvtps2pi */ case 0x12d: /* cvtpd2pi */ + CHECK_NO_VEX(s); gen_helper_enter_mmx(cpu_env); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); @@ -3771,6 +3786,7 @@ static void 
gen_sse(CPUX86State *env, DisasContext *s= , int b, tcg_gen_st16_tl(s->T0, cpu_env, offsetof(CPUX86State,xmm_regs[reg].ZMM_W(v= al))); } else { + CHECK_NO_VEX(s); val &=3D 3; tcg_gen_st16_tl(s->T0, cpu_env, offsetof(CPUX86State,fpregs[reg].mmx.MMX_W= (val))); @@ -3810,6 +3826,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } break; case 0x2d6: /* movq2dq */ + CHECK_NO_VEX(s); gen_helper_enter_mmx(cpu_env); rm =3D (modrm & 7); gen_op_movq(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q(0)), @@ -3817,6 +3834,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, gen_op_movq_env_0(s, offsetof(CPUX86State, xmm_regs[reg].ZMM_Q= (1))); break; case 0x3d6: /* movdq2q */ + CHECK_NO_VEX(s); gen_helper_enter_mmx(cpu_env); rm =3D (modrm & 7) | REX_B(s); gen_op_movq(s, offsetof(CPUX86State, fpregs[reg & 7].mmx), @@ -3831,6 +3849,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, tcg_gen_addi_ptr(s->ptr0, cpu_env, ZMM_OFFSET(rm)); gen_helper_pmovmskb_xmm(s->tmp2_i32, cpu_env, s->ptr0); } else { + CHECK_NO_VEX(s); rm =3D (modrm & 7); tcg_gen_addi_ptr(s->ptr0, cpu_env, offsetof(CPUX86State, fpregs[rm].mmx)); @@ -3901,6 +3920,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, tcg_gen_addi_ptr(s->ptr1, cpu_env, op2_offset); op6->fn[b1].op1(cpu_env, s->ptr0, s->ptr1); } else { + CHECK_NO_VEX(s); if ((op6->flags & SSE_OPF_MMX) =3D=3D 0) { goto unknown_op; } @@ -3934,6 +3954,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, case 0x3f0: /* crc32 Gd,Eb */ case 0x3f1: /* crc32 Gd,Ey */ do_crc32: + CHECK_NO_VEX(s); if (!(s->cpuid_ext_features & CPUID_EXT_SSE42)) { goto illegal_op; } @@ -3956,6 +3977,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, =20 case 0x1f0: /* crc32 or movbe */ case 0x1f1: + CHECK_NO_VEX(s); /* For these insns, the f3 prefix is supposed to have prio= rity over the 66 prefix, but that's not what we implement ab= ove setting b1. 
*/ @@ -3965,6 +3987,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, /* FALLTHRU */ case 0x0f0: /* movbe Gy,My */ case 0x0f1: /* movbe My,Gy */ + CHECK_NO_VEX(s); if (!(s->cpuid_ext_features & CPUID_EXT_MOVBE)) { goto illegal_op; } @@ -4131,6 +4154,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, =20 case 0x1f6: /* adcx Gy, Ey */ case 0x2f6: /* adox Gy, Ey */ + CHECK_NO_VEX(s); if (!(s->cpuid_7_0_ebx_features & CPUID_7_0_EBX_ADX)) { goto illegal_op; } else { @@ -4436,6 +4460,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, } =20 if (b1 =3D=3D 0) { + CHECK_NO_VEX(s); /* MMX */ if ((op7->flags & SSE_OPF_MMX) =3D=3D 0) { goto illegal_op; @@ -4582,6 +4607,7 @@ static void gen_sse(CPUX86State *env, DisasContext *s= , int b, op2_offset =3D ZMM_OFFSET(rm); } } else { + CHECK_NO_VEX(s); op1_offset =3D offsetof(CPUX86State,fpregs[reg].mmx); if (mod !=3D 3) { gen_lea_modrm(env, s, modrm); --=20 2.37.1 From nobody Mon Feb 9 15:46:37 2026 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1662021581; cv=none; d=zohomail.com; s=zohoarc; b=iXMOnR4t8wNI56SJuZNZE93kfuetV+2sPE6MbHicsw9UgxOeHdyqdNBCiPgg7Hd/5OGIz//m6VOJSZKHGsOEShCc0IE87ffyLzgUtafigJSHVHnzCcVs36v7vLGqgalDxbEoh+Km4UlCQ0KN/TJaSUk/h1W9Iqg0+mdU0fP9/DM= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1662021581; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=RnbqrRyb+Pgqqj81ijJJCqtS+n3s+WDNawbq5Qm6zEY=; 
b=XYXmq2KzNb2QeXYV7TeouPQh7IodpExKHdU/oxdmOcKItqfT6bXSVo6HHC+9JiHKQXfxG6hT/Bc/DbhcJAYE7/vjBRmDuS8Co1Ah5Kzof88yavKPwtyH5sDFcZbWUOdKwcVQzR34QQJpiMo/zdwl+5jSY4bO45TxO9/9yWZ/m2Q= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1662021581929724.6015025653089; Thu, 1 Sep 2022 01:39:41 -0700 (PDT) Received: from localhost ([::1]:47438 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oTfjg-0001mh-Od for importer@patchew.org; Thu, 01 Sep 2022 04:39:40 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:60876) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oTeww-0001SA-VO for qemu-devel@nongnu.org; Thu, 01 Sep 2022 03:49:24 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:59159) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oTews-00039O-Po for qemu-devel@nongnu.org; Thu, 01 Sep 2022 03:49:17 -0400 Received: from mail-wr1-f69.google.com (mail-wr1-f69.google.com [209.85.221.69]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-639-SjCkVOZgPUCXjuTHzcxF7A-1; Thu, 01 Sep 2022 03:49:11 -0400 Received: by mail-wr1-f69.google.com with SMTP id j12-20020adfff8c000000b002265dcdfad7so2781067wrr.2 for ; Thu, 01 Sep 2022 00:49:10 -0700 (PDT) Received: from goa-sendmail ([93.56.160.208]) by smtp.gmail.com with ESMTPSA id k36-20020a05600c1ca400b003a5f3de6fddsm5272964wms.25.2022.09.01.00.49.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 01 Sep 2022 00:49:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=redhat.com; s=mimecast20190719; t=1662018553; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=RnbqrRyb+Pgqqj81ijJJCqtS+n3s+WDNawbq5Qm6zEY=; b=BUebbpsNYBdma9uciRCOuQtttm/KWDjcYtteMZNjhohQhfPXDD+fGPvclW9BJqYxqJQoCa j52r9FKWyamtWsrmGhEBI5G4B0NQGA9mztq9BEP0dRFY/vdjNp4jxniuSVwwra1gnenSns 3H/Jod9TZCzNlQivU9aGKXJaCHoeasA= X-MC-Unique: SjCkVOZgPUCXjuTHzcxF7A-1 X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date; bh=RnbqrRyb+Pgqqj81ijJJCqtS+n3s+WDNawbq5Qm6zEY=; b=UTXj6/dwp7wxfRTkbJSiA5g6T3wgPlMvb9p+qML1eYcvKNrkIPZrjUpMA8jItCsYob n/9EoPN+LLTW2DdHT6jORactTD8J5zgUzuL2NCKq75X+dhySbmCk603Mq1YnSDJ1VcSL Oov7C1F6SBvAaxJtp2ye9jhIacUo5NZ6SkyV/BR9grdDgcHyy9W4SVDALN7C+unvdwnl 7fJYftBCkGNuAFjEZNzkJuC4lS8yffSKl9kYOVhrAlcl8MmC5p6lJO9mMk00IJHw1XgI KHq0YSy1oZkxm1iTfVs+j9QptK7FjflsUG3gtnHGzFwJlCjzKQRGOM4FuU2K/I1xfQZp E9DQ== X-Gm-Message-State: ACgBeo0zCw1aeJ7lOG0G5SsGsd9nwWJCcjn58ftnsyT/syg7QqzzyQLT zGumhPMKF21kRXXfpQxUslIANh5ZTg2lr9MqAnZP7Qxyxbl8VOf+a5sciWe6UhHL6KVGFOx1/Dw UD5o+fDM7uQt2dHNhj0z+59OOFjJ//rUnC477Kk7RVl5veHZ/e6ORQ6Mjf++cBS9q/4I= X-Received: by 2002:a05:600c:40d5:b0:3a5:3d9f:6e7f with SMTP id m21-20020a05600c40d500b003a53d9f6e7fmr4204932wmh.21.1662018548913; Thu, 01 Sep 2022 00:49:08 -0700 (PDT) X-Google-Smtp-Source: AA6agR5Tt8ZjYP+7uE5NyfoOWiEKIXiUsbr/Dbg4k18hMeaY1sW0/IBB2pO0C2ydDU0W4gKtJOx5YA== X-Received: by 2002:a05:600c:40d5:b0:3a5:3d9f:6e7f with SMTP id m21-20020a05600c40d500b003a53d9f6e7fmr4204917wmh.21.1662018548539; Thu, 01 Sep 2022 00:49:08 -0700 (PDT) From: Paolo Bonzini To: qemu-devel@nongnu.org Cc: paul@nowt.org, richard.henderson@linaro.org Subject: [PATCH v3 12/23] i386: Rewrite vector shift helper 
Date: Thu, 1 Sep 2022 09:48:31 +0200 Message-Id: <20220901074842.57424-13-pbonzini@redhat.com> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20220901074842.57424-1-pbonzini@redhat.com> References: <20220901074842.57424-1-pbonzini@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=pbonzini@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1662021583129100001 Content-Type: text/plain; charset="utf-8" From: Paul Brook Rewrite the vector shift helpers in preparation for AVX support (3-operand form and 256-bit vectors). For now keep the existing two-operand interface. No functional changes to existing helpers.
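The loop-based shape described above can be sketched in standalone C. This is only a minimal sketch under stated assumptions: `Vec128`, `psrlw_sketch` and `psraw_sketch` are hypothetical names for illustration, not QEMU's `Reg` type or its `helper_*` functions, and the vector is fixed at 128 bits instead of being selected by `SHIFT`. It shows a separate source operand (which the caller may alias to the destination, as the two-operand interface does) and the handling of out-of-range counts: logical shifts zero the register, arithmetic shifts clamp the count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit vector register. */
typedef union {
    uint16_t w[8];
    uint64_t q[2];
} Vec128;

/* Logical right shift of every 16-bit lane. d may alias s.
 * A count above 15 clears the whole register. */
static void psrlw_sketch(Vec128 *d, const Vec128 *s, const Vec128 *c)
{
    assert(d && s && c);
    if (c->q[0] > 15) {
        for (size_t i = 0; i < 2; i++) {
            d->q[i] = 0;
        }
    } else {
        unsigned shift = (unsigned)c->q[0];
        for (size_t i = 0; i < 8; i++) {
            d->w[i] = (uint16_t)(s->w[i] >> shift);
        }
    }
}

/* Arithmetic right shift of every 16-bit lane. d may alias s.
 * A count above 15 is clamped to 15, so each lane becomes 0 or 0xffff.
 * Assumes the compiler shifts negative values arithmetically
 * (implementation-defined in C; true for GCC and Clang). */
static void psraw_sketch(Vec128 *d, const Vec128 *s, const Vec128 *c)
{
    assert(d && s && c);
    unsigned shift = c->q[0] > 15 ? 15 : (unsigned)c->q[0];
    for (size_t i = 0; i < 8; i++) {
        d->w[i] = (uint16_t)((int16_t)s->w[i] >> shift);
    }
}
```

Passing the same pointer for `d` and `s` gives the existing two-operand behaviour; passing distinct pointers gives the three-operand form that AVX needs.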
Signed-off-by: Paul Brook Message-Id: <20220424220204.2493824-11-paul@nowt.org> Reviewed-by: Richard Henderson Signed-off-by: Paolo Bonzini --- target/i386/ops_sse.h | 247 +++++++++++++++++++----------------------- 1 file changed, 112 insertions(+), 135 deletions(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index 2c0090a647..a4a09226e3 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -40,6 +40,8 @@ #define SUFFIX _xmm #endif =20 +#define LANE_WIDTH (SHIFT ? 16 : 8) + /* * Copy the relevant parts of a Reg value around. In the case where * sizeof(Reg) > SIZE, these helpers operate only on the lower bytes of @@ -56,198 +58,173 @@ #define MOVE(d, r) memcpy(&(d).B(0), &(r).B(0), SIZE) #endif =20 -void glue(helper_psrlw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) -{ - int shift; +#if SHIFT =3D=3D 0 +#define FPSRL(x, c) ((x) >> shift) +#define FPSRAW(x, c) ((int16_t)(x) >> shift) +#define FPSRAL(x, c) ((int32_t)(x) >> shift) +#define FPSLL(x, c) ((x) << shift) +#endif =20 - if (s->Q(0) > 15) { - d->Q(0) =3D 0; -#if SHIFT =3D=3D 1 - d->Q(1) =3D 0; -#endif +void glue(helper_psrlw, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) +{ + Reg *s =3D d; + int shift; + if (c->Q(0) > 15) { + for (int i =3D 0; i < 1 << SHIFT; i++) { + d->Q(i) =3D 0; + } } else { - shift =3D s->B(0); - d->W(0) >>=3D shift; - d->W(1) >>=3D shift; - d->W(2) >>=3D shift; - d->W(3) >>=3D shift; -#if SHIFT =3D=3D 1 - d->W(4) >>=3D shift; - d->W(5) >>=3D shift; - d->W(6) >>=3D shift; - d->W(7) >>=3D shift; -#endif + shift =3D c->B(0); + for (int i =3D 0; i < 4 << SHIFT; i++) { + d->W(i) =3D FPSRL(s->W(i), shift); + } } } =20 -void glue(helper_psraw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_psllw, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) { + Reg *s =3D d; int shift; + if (c->Q(0) > 15) { + for (int i =3D 0; i < 1 << SHIFT; i++) { + d->Q(i) =3D 0; + } + } else { + shift =3D c->B(0); + for (int i =3D 0; i < 4 << SHIFT; i++) { + d->W(i) =3D FPSLL(s->W(i), shift); + 
} + } +} =20 - if (s->Q(0) > 15) { +void glue(helper_psraw, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) +{ + Reg *s =3D d; + int shift; + if (c->Q(0) > 15) { shift =3D 15; } else { - shift =3D s->B(0); + shift =3D c->B(0); + } + for (int i =3D 0; i < 4 << SHIFT; i++) { + d->W(i) =3D FPSRAW(s->W(i), shift); } - d->W(0) =3D (int16_t)d->W(0) >> shift; - d->W(1) =3D (int16_t)d->W(1) >> shift; - d->W(2) =3D (int16_t)d->W(2) >> shift; - d->W(3) =3D (int16_t)d->W(3) >> shift; -#if SHIFT =3D=3D 1 - d->W(4) =3D (int16_t)d->W(4) >> shift; - d->W(5) =3D (int16_t)d->W(5) >> shift; - d->W(6) =3D (int16_t)d->W(6) >> shift; - d->W(7) =3D (int16_t)d->W(7) >> shift; -#endif } =20 -void glue(helper_psllw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_psrld, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) { + Reg *s =3D d; int shift; - - if (s->Q(0) > 15) { - d->Q(0) =3D 0; -#if SHIFT =3D=3D 1 - d->Q(1) =3D 0; -#endif + if (c->Q(0) > 31) { + for (int i =3D 0; i < 1 << SHIFT; i++) { + d->Q(i) =3D 0; + } } else { - shift =3D s->B(0); - d->W(0) <<=3D shift; - d->W(1) <<=3D shift; - d->W(2) <<=3D shift; - d->W(3) <<=3D shift; -#if SHIFT =3D=3D 1 - d->W(4) <<=3D shift; - d->W(5) <<=3D shift; - d->W(6) <<=3D shift; - d->W(7) <<=3D shift; -#endif + shift =3D c->B(0); + for (int i =3D 0; i < 2 << SHIFT; i++) { + d->L(i) =3D FPSRL(s->L(i), shift); + } } } =20 -void glue(helper_psrld, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_pslld, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) { + Reg *s =3D d; int shift; - - if (s->Q(0) > 31) { - d->Q(0) =3D 0; -#if SHIFT =3D=3D 1 - d->Q(1) =3D 0; -#endif + if (c->Q(0) > 31) { + for (int i =3D 0; i < 1 << SHIFT; i++) { + d->Q(i) =3D 0; + } } else { - shift =3D s->B(0); - d->L(0) >>=3D shift; - d->L(1) >>=3D shift; -#if SHIFT =3D=3D 1 - d->L(2) >>=3D shift; - d->L(3) >>=3D shift; -#endif + shift =3D c->B(0); + for (int i =3D 0; i < 2 << SHIFT; i++) { + d->L(i) =3D FPSLL(s->L(i), shift); + } } } =20 -void glue(helper_psrad, 
SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_psrad, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) { + Reg *s =3D d; int shift; - - if (s->Q(0) > 31) { + if (c->Q(0) > 31) { shift =3D 31; } else { - shift =3D s->B(0); + shift =3D c->B(0); + } + for (int i =3D 0; i < 2 << SHIFT; i++) { + d->L(i) =3D FPSRAL(s->L(i), shift); } - d->L(0) =3D (int32_t)d->L(0) >> shift; - d->L(1) =3D (int32_t)d->L(1) >> shift; -#if SHIFT =3D=3D 1 - d->L(2) =3D (int32_t)d->L(2) >> shift; - d->L(3) =3D (int32_t)d->L(3) >> shift; -#endif } =20 -void glue(helper_pslld, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_psrlq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) { + Reg *s =3D d; int shift; - - if (s->Q(0) > 31) { - d->Q(0) =3D 0; -#if SHIFT =3D=3D 1 - d->Q(1) =3D 0; -#endif + if (c->Q(0) > 63) { + for (int i =3D 0; i < 1 << SHIFT; i++) { + d->Q(i) =3D 0; + } } else { - shift =3D s->B(0); - d->L(0) <<=3D shift; - d->L(1) <<=3D shift; -#if SHIFT =3D=3D 1 - d->L(2) <<=3D shift; - d->L(3) <<=3D shift; -#endif + shift =3D c->B(0); + for (int i =3D 0; i < 1 << SHIFT; i++) { + d->Q(i) =3D FPSRL(s->Q(i), shift); + } } } =20 -void glue(helper_psrlq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_psllq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) { + Reg *s =3D d; int shift; - - if (s->Q(0) > 63) { - d->Q(0) =3D 0; -#if SHIFT =3D=3D 1 - d->Q(1) =3D 0; -#endif + if (c->Q(0) > 63) { + for (int i =3D 0; i < 1 << SHIFT; i++) { + d->Q(i) =3D 0; + } } else { - shift =3D s->B(0); - d->Q(0) >>=3D shift; -#if SHIFT =3D=3D 1 - d->Q(1) >>=3D shift; -#endif + shift =3D c->B(0); + for (int i =3D 0; i < 1 << SHIFT; i++) { + d->Q(i) =3D FPSLL(s->Q(i), shift); + } } } =20 -void glue(helper_psllq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +#if SHIFT >=3D 1 +void glue(helper_psrldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) { - int shift; + Reg *s =3D d; + int shift, i, j; =20 - if (s->Q(0) > 63) { - d->Q(0) =3D 0; -#if SHIFT =3D=3D 1 - d->Q(1) =3D 0; -#endif - } else { - shift =3D 
s->B(0); - d->Q(0) <<=3D shift; -#if SHIFT =3D=3D 1 - d->Q(1) <<=3D shift; -#endif - } -} - -#if SHIFT =3D=3D 1 -void glue(helper_psrldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) -{ - int shift, i; - - shift =3D s->L(0); + shift =3D c->L(0); if (shift > 16) { shift =3D 16; } - for (i =3D 0; i < 16 - shift; i++) { - d->B(i) =3D d->B(i + shift); - } - for (i =3D 16 - shift; i < 16; i++) { - d->B(i) =3D 0; + for (j =3D 0; j < 8 << SHIFT; j +=3D LANE_WIDTH) { + for (i =3D 0; i < 16 - shift; i++) { + d->B(j + i) =3D s->B(j + i + shift); + } + for (i =3D 16 - shift; i < 16; i++) { + d->B(j + i) =3D 0; + } } } =20 -void glue(helper_pslldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +void glue(helper_pslldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c) { - int shift, i; + Reg *s =3D d; + int shift, i, j; =20 - shift =3D s->L(0); + shift =3D c->L(0); if (shift > 16) { shift =3D 16; } - for (i =3D 15; i >=3D shift; i--) { - d->B(i) =3D d->B(i - shift); - } - for (i =3D 0; i < shift; i++) { - d->B(i) =3D 0; + for (j =3D 0; j < 8 << SHIFT; j +=3D LANE_WIDTH) { + for (i =3D 15; i >=3D shift; i--) { + d->B(j + i) =3D s->B(j + i - shift); + } + for (i =3D 0; i < shift; i++) { + d->B(j + i) =3D 0; + } } } #endif --=20 2.37.1 From nobody Mon Feb 9 15:46:37 2026 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1662020809; cv=none; d=zohomail.com; s=zohoarc; b=kimB4dqc2IBuRwI+OcyLtgF5wVqwU6w4rkUJ8hI8F9hQzHC31v3Nt2YsrdLFraLNFGSgqJjfvGJ9728LA0i1phrU1r/7CKMVC/GDBJwPi9nWVij1lEfUKez2P1wPuL0v7D0/Rvrol0i7BR/tRJhLzs+jaGZcgptpzFkhyWWCVn4= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1662020809; 
h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=YVDqfQsbPjz2PJzedAtKt9Ie3sZVvmyyqv7yLJZVb7Q=; b=crYfx3cgmtmoY0mbgQjF9KUf8VBvhoEUMPTWj4ddkkXb1pYPQ1YSTOzFUHi5PjdRmwZB+ZNV7W9JZqpFQ1AZtJylrCJnVYcz9Y720LtXPUUwTFTJ0UghKvUdhdX3DXpTmesIKELg4gcydhyFJgZkjCcF5IDyYut/EqU7My/CY0s= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1662020809548704.9785095348869; Thu, 1 Sep 2022 01:26:49 -0700 (PDT) Received: from localhost ([::1]:33992 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oTfXC-0002Dx-Lj for importer@patchew.org; Thu, 01 Sep 2022 04:26:46 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:60880) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oTewz-0001SK-RO for qemu-devel@nongnu.org; Thu, 01 Sep 2022 03:49:24 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:58639) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oTeww-00039a-MF for qemu-devel@nongnu.org; Thu, 01 Sep 2022 03:49:21 -0400 Received: from mail-wm1-f71.google.com (mail-wm1-f71.google.com [209.85.128.71]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-266-NsFmGk8TP3SVVYoAys-zJg-1; Thu, 01 Sep 2022 03:49:14 -0400 Received: by mail-wm1-f71.google.com with SMTP id n7-20020a1c2707000000b003a638356355so9486156wmn.2 for ; Thu, 01 Sep 2022 00:49:13 -0700 (PDT) Received: from goa-sendmail 
([2001:b07:6468:f312:5e2c:eb9a:a8b6:fd3e]) by smtp.gmail.com with ESMTPSA id e20-20020a05600c4e5400b003a5bd5ea215sm4415013wmq.37.2022.09.01.00.49.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 01 Sep 2022 00:49:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1662018555; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=YVDqfQsbPjz2PJzedAtKt9Ie3sZVvmyyqv7yLJZVb7Q=; b=Pyt4Y98D/TWRJZ9LZ5CXONHHARuEqWlZUBX5B+ZDXWM1W9kmG2upCsIKAOShCh9o/KF2e3 l/U0ggtDP9M9fzQU3apwXQLcjj6BZuSy2/HPNKOQQSdGaLf35OOHsEuxqgH3EWVC9f1sP5 +4p07xelA/8yNSd1x69QtQd7hfIiyHU= X-MC-Unique: NsFmGk8TP3SVVYoAys-zJg-1 X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date; bh=YVDqfQsbPjz2PJzedAtKt9Ie3sZVvmyyqv7yLJZVb7Q=; b=SELfyzSwTKWPyeNLvJSgoLV2DUUZMWsWL121i5aZ6RxNgMcn70VFpWJEmeepzWiADR LDgW7pRaM8Q/hxGjZnc/8NrOjKDKZgh7pRRH94GLRZ6jXJ5qMVaUXCAn9aZSCg7DzAVy DeMFa3Dkvl9/EFfegCPnHnA3XLIFisYztnK+tAIupHhAt+NuiY102+uN1TqvvyQ0YG2x 4wHvIfmD9suaiT1dHP/F5gRIyvTCk10zKV7tbP2NXEOdywjIyRlpFBZGlujrJsoFQsRm dku1bu7KTYWZq+UJWPUR3UCW8GNbn3Zv5qAVmNwS7bqQbMYjTkJEEcoesEqKrNzHQgJW ei5Q== X-Gm-Message-State: ACgBeo3B+Fr0i/WI5D35N98SggOyLJYquOhUoXDMwIsnPypCk2NOw8Ub VNpTc1l3xge7frlnvpajSkn8PsnUQBa4ehAwJrwgrW8NvqZP1N+pLITZERFebgtOFFaG3zUvmAi wJ6L4sEa61Iup6Or2g6TgkajGAloNYppMdSNDVUWJT3R/4VyqdXYmIntf2BFiV0PP0XM= X-Received: by 2002:a5d:424f:0:b0:226:d206:cd6e with SMTP id s15-20020a5d424f000000b00226d206cd6emr13013659wrr.554.1662018552495; Thu, 01 Sep 2022 00:49:12 -0700 (PDT) X-Google-Smtp-Source: AA6agR7OiT0WAA2pXLEaNV0vCtCAHvwXJGIB9e5rzr13DIJtUs3li9WfJ0K3hr/ITPhnc0FNz5midA== X-Received: by 
2002:a5d:424f:0:b0:226:d206:cd6e with SMTP id s15-20020a5d424f000000b00226d206cd6emr13013648wrr.554.1662018552150; Thu, 01 Sep 2022 00:49:12 -0700 (PDT) From: Paolo Bonzini To: qemu-devel@nongnu.org Cc: paul@nowt.org, richard.henderson@linaro.org Subject: [PATCH v3 13/23] i386: Rewrite simple integer vector helpers Date: Thu, 1 Sep 2022 09:48:32 +0200 Message-Id: <20220901074842.57424-14-pbonzini@redhat.com> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20220901074842.57424-1-pbonzini@redhat.com> References: <20220901074842.57424-1-pbonzini@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=pbonzini@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1662020810811100001 Content-Type: text/plain; charset="utf-8" From: Paul Brook Rewrite the "simple" vector integer helpers in preparation for AVX support. While the current code is able to use the same prototype for unary (a = F(b)) and binary (a = F(b, c)) operations, future changes will cause them to diverge. 
No functional changes to existing helpers Signed-off-by: Paul Brook Message-Id: <20220424220204.2493824-12-paul@nowt.org> Reviewed-by: Richard Henderson Signed-off-by: Paolo Bonzini --- target/i386/ops_sse.h | 83 +++++++++++++++---------------------------- 1 file changed, 28 insertions(+), 55 deletions(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index a4a09226e3..ce03362810 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -229,63 +229,36 @@ void glue(helper_pslldq, SUFFIX)(CPUX86State *env, Re= g *d, Reg *c) } #endif =20 -#define SSE_HELPER_B(name, F) \ +#define SSE_HELPER_1(name, elem, num, F) \ void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ { \ - d->B(0) =3D F(d->B(0), s->B(0)); \ - d->B(1) =3D F(d->B(1), s->B(1)); \ - d->B(2) =3D F(d->B(2), s->B(2)); \ - d->B(3) =3D F(d->B(3), s->B(3)); \ - d->B(4) =3D F(d->B(4), s->B(4)); \ - d->B(5) =3D F(d->B(5), s->B(5)); \ - d->B(6) =3D F(d->B(6), s->B(6)); \ - d->B(7) =3D F(d->B(7), s->B(7)); \ - XMM_ONLY( \ - d->B(8) =3D F(d->B(8), s->B(8)); \ - d->B(9) =3D F(d->B(9), s->B(9)); \ - d->B(10) =3D F(d->B(10), s->B(10)); \ - d->B(11) =3D F(d->B(11), s->B(11)); \ - d->B(12) =3D F(d->B(12), s->B(12)); \ - d->B(13) =3D F(d->B(13), s->B(13)); \ - d->B(14) =3D F(d->B(14), s->B(14)); \ - d->B(15) =3D F(d->B(15), s->B(15)); \ - ) \ - } + int n =3D num; \ + for (int i =3D 0; i < n; i++) { \ + d->elem(i) =3D F(s->elem(i)); \ + } \ + } + +#define SSE_HELPER_2(name, elem, num, F) \ + void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ + { \ + Reg *v =3D d; \ + int n =3D num; \ + for (int i =3D 0; i < n; i++) { \ + d->elem(i) =3D F(v->elem(i), s->elem(i)); \ + } \ + } + +#define SSE_HELPER_B(name, F) \ + SSE_HELPER_2(name, B, 8 << SHIFT, F) =20 #define SSE_HELPER_W(name, F) \ - void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ - { \ - d->W(0) =3D F(d->W(0), s->W(0)); \ - d->W(1) =3D F(d->W(1), s->W(1)); \ - d->W(2) =3D F(d->W(2), s->W(2)); \ - d->W(3) =3D F(d->W(3), 
s->W(3)); \ - XMM_ONLY( \ - d->W(4) =3D F(d->W(4), s->W(4)); \ - d->W(5) =3D F(d->W(5), s->W(5)); \ - d->W(6) =3D F(d->W(6), s->W(6)); \ - d->W(7) =3D F(d->W(7), s->W(7)); \ - ) \ - } + SSE_HELPER_2(name, W, 4 << SHIFT, F) =20 #define SSE_HELPER_L(name, F) \ - void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ - { \ - d->L(0) =3D F(d->L(0), s->L(0)); \ - d->L(1) =3D F(d->L(1), s->L(1)); \ - XMM_ONLY( \ - d->L(2) =3D F(d->L(2), s->L(2)); \ - d->L(3) =3D F(d->L(3), s->L(3)); \ - ) \ - } + SSE_HELPER_2(name, L, 2 << SHIFT, F) =20 #define SSE_HELPER_Q(name, F) \ - void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ - { \ - d->Q(0) =3D F(d->Q(0), s->Q(0)); \ - XMM_ONLY( \ - d->Q(1) =3D F(d->Q(1), s->Q(1)); \ - ) \ - } + SSE_HELPER_2(name, Q, 1 << SHIFT, F) =20 #if SHIFT =3D=3D 0 static inline int satub(int x) @@ -1544,12 +1517,12 @@ void glue(helper_phsubsw, SUFFIX)(CPUX86State *env,= Reg *d, Reg *s) MOVE(*d, r); } =20 -#define FABSB(_, x) (x > INT8_MAX ? -(int8_t)x : x) -#define FABSW(_, x) (x > INT16_MAX ? -(int16_t)x : x) -#define FABSL(_, x) (x > INT32_MAX ? -(int32_t)x : x) -SSE_HELPER_B(helper_pabsb, FABSB) -SSE_HELPER_W(helper_pabsw, FABSW) -SSE_HELPER_L(helper_pabsd, FABSL) +#define FABSB(x) (x > INT8_MAX ? -(int8_t)x : x) +#define FABSW(x) (x > INT16_MAX ? -(int16_t)x : x) +#define FABSL(x) (x > INT32_MAX ? 
-(int32_t)x : x) +SSE_HELPER_1(helper_pabsb, B, 8 << SHIFT, FABSB) +SSE_HELPER_1(helper_pabsw, W, 4 << SHIFT, FABSW) +SSE_HELPER_1(helper_pabsd, L, 2 << SHIFT, FABSL) =20 #define FMULHRSW(d, s) (((int16_t) d * (int16_t)s + 0x4000) >> 15) SSE_HELPER_W(helper_pmulhrsw, FMULHRSW) --=20 2.37.1 From nobody Mon Feb 9 15:46:37 2026 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1662022059; cv=none; d=zohomail.com; s=zohoarc; b=B/HrP4EAwwnjsCVPsGzts7mQujdi0K6abkh8hXhFsTb2ul4vyVq1zabAIq5n2gueu8e5HLLlbkHWE3BKryeWmeyk2mby3s+e4pQYfVnA/S4dzkyPyAxE7FCURBDx8R3RitmSeO7B2lqrLpfX/L4AKQbUP0tPAUDCWOzuCGGrfqI= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1662022059; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=yxZseeCDARY4xTal4F7lWpeHRxt8V5rzn3z7VPT/LB4=; b=U8uW/f0H/oS+g+NNlzYCltsScAWuKJjCTv2vdo0Arv7tiySEpK5euyB+08vuhu3oKCLqkNkkFqE66nhAwxg/i7f0mOvuvTcz2dVrdHyWhPI605n3zYmyhpKi4LHTnVctTSJ5AHOwXLruintamqdhTSYoHKed5jrCqBg/PDoTxh8= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1662022059544768.2348067135615; Thu, 1 Sep 2022 01:47:39 -0700 (PDT) Received: from localhost ([::1]:52456 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oTfrO-0007nH-81 for importer@patchew.org; Thu, 01 
Sep 2022 04:47:38 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:60882) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oTex1-0001Sb-Fa for qemu-devel@nongnu.org; Thu, 01 Sep 2022 03:49:25 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:39553) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oTeww-00039o-NP for qemu-devel@nongnu.org; Thu, 01 Sep 2022 03:49:23 -0400 Received: from mail-wm1-f70.google.com (mail-wm1-f70.google.com [209.85.128.70]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-163-xrcxumcOPpKyxVzSLnYlLw-1; Thu, 01 Sep 2022 03:49:15 -0400 Received: by mail-wm1-f70.google.com with SMTP id v3-20020a1cac03000000b003a7012c430dso893262wme.3 for ; Thu, 01 Sep 2022 00:49:15 -0700 (PDT) Received: from goa-sendmail ([93.56.160.208]) by smtp.gmail.com with ESMTPSA id v5-20020a5d59c5000000b002257fd37877sm14822064wry.6.2022.09.01.00.49.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 01 Sep 2022 00:49:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1662018556; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=yxZseeCDARY4xTal4F7lWpeHRxt8V5rzn3z7VPT/LB4=; b=eU2LPeLsP2JNy6KEt9DnbpVSmXGRtsiOgN4wY6tMeCzxTqMfZudhlwbT6+SXFHtO93DgCA B4fSvrEebi7e7ItATioz2fF+pdpl9gFT5f6qEA0x8odm2Yl0SOIwQlvlPNu2cnBQ+n0pCE HU+hn5k259FHNWihz0ZNY89JrzQ8ueA= X-MC-Unique: xrcxumcOPpKyxVzSLnYlLw-1 X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date; 
bh=yxZseeCDARY4xTal4F7lWpeHRxt8V5rzn3z7VPT/LB4=; b=T9byV3vT++XlbuN+T4PbNcd7Md+OUsQ+8/tq9RQWwOvZWGge4tQ4psOvogcMyiiceK sZqb+zCAb8MUsa+KwJ8BRVvDqV4M8MMcznffRTLoUWiQlJvEB1ZCZktfcAtPDUkHEfrX NbelV5+QGq+84UjyvfgqNU8GxkPmWD/vpS2EFnRKvMubJXwbqR9GVM6s4t1DD2AgvOKU y3DTfddJuaxoKXhqk2E48hDzTC4QsaIi4vTCaItrd03rJFNKbPNcL8cK06AnhszRBvuC Ynvm0UCLPHpNvH6ck3jKepK8DNha6v2NREu4jXKqTkiXWg8tr2ZEIfXDYj+JIg6kETJO 3WUQ== X-Gm-Message-State: ACgBeo3FTmS7OJYgMizDVQXpsRh5On5GE5lGVib1sFqq73pedVu/rJm8 Law2VoHqA+oXO4IRpfqBuYE0WauL5vtNNA6jveapQI7Mj1a9DxCme3eJztqCIEaIeNHAX9orLYC fLlart7iTNGmEObnCPRxcUIkGFXfaq7PYQZR4GjKDGQwznMGuXD1PRulqgR2dCd3iXXI= X-Received: by 2002:adf:d203:0:b0:226:d4e7:511e with SMTP id j3-20020adfd203000000b00226d4e7511emr12153094wrh.13.1662018554305; Thu, 01 Sep 2022 00:49:14 -0700 (PDT) X-Google-Smtp-Source: AA6agR6nEykDcoHh8BX4Is/dB4b7N4lnLgDfh8t/kx/z3PO8CvYayjtdUO0UgJPh3dboKJkSdWCMdQ== X-Received: by 2002:adf:d203:0:b0:226:d4e7:511e with SMTP id j3-20020adfd203000000b00226d4e7511emr12153081wrh.13.1662018553941; Thu, 01 Sep 2022 00:49:13 -0700 (PDT) From: Paolo Bonzini To: qemu-devel@nongnu.org Cc: paul@nowt.org, richard.henderson@linaro.org Subject: [PATCH v3 14/23] i386: Misc integer AVX helper prep Date: Thu, 1 Sep 2022 09:48:33 +0200 Message-Id: <20220901074842.57424-15-pbonzini@redhat.com> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20220901074842.57424-1-pbonzini@redhat.com> References: <20220901074842.57424-1-pbonzini@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=pbonzini@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, 
DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1662022060318100001 Content-Type: text/plain; charset="utf-8" From: Paul Brook More preparatory work for AVX support in various integer vector helpers No functional changes to existing helpers. Signed-off-by: Paul Brook Message-Id: <20220424220204.2493824-13-paul@nowt.org> Reviewed-by: Richard Henderson Signed-off-by: Paolo Bonzini --- target/i386/ops_sse.h | 164 +++++++++++++++++++++--------------------- 1 file changed, 80 insertions(+), 84 deletions(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index ce03362810..557cc7ce7d 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -390,19 +390,22 @@ SSE_HELPER_W(helper_pavgw, FAVG) =20 void glue(helper_pmuludq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - d->Q(0) =3D (uint64_t)s->L(0) * (uint64_t)d->L(0); -#if SHIFT =3D=3D 1 - d->Q(1) =3D (uint64_t)s->L(2) * (uint64_t)d->L(2); -#endif + Reg *v =3D d; + int i; + + for (i =3D 0; i < (1 << SHIFT); i++) { + d->Q(i) =3D (uint64_t)s->L(i * 2) * (uint64_t)v->L(i * 2); + } } =20 void glue(helper_pmaddwd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { + Reg *v =3D d; int i; =20 for (i =3D 0; i < (2 << SHIFT); i++) { - d->L(i) =3D (int16_t)s->W(2 * i) * (int16_t)d->W(2 * i) + - (int16_t)s->W(2 * i + 1) * (int16_t)d->W(2 * i + 1); + d->L(i) =3D (int16_t)s->W(2 * i) * (int16_t)v->W(2 * i) + + (int16_t)s->W(2 * i + 1) * (int16_t)v->W(2 * i + 1); } } =20 @@ -416,32 +419,24 @@ static inline int abs1(int a) } } #endif + void glue(helper_psadbw, 
SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - unsigned int val; + Reg *v =3D d; + int i; =20 - val =3D 0; - val +=3D abs1(d->B(0) - s->B(0)); - val +=3D abs1(d->B(1) - s->B(1)); - val +=3D abs1(d->B(2) - s->B(2)); - val +=3D abs1(d->B(3) - s->B(3)); - val +=3D abs1(d->B(4) - s->B(4)); - val +=3D abs1(d->B(5) - s->B(5)); - val +=3D abs1(d->B(6) - s->B(6)); - val +=3D abs1(d->B(7) - s->B(7)); - d->Q(0) =3D val; -#if SHIFT =3D=3D 1 - val =3D 0; - val +=3D abs1(d->B(8) - s->B(8)); - val +=3D abs1(d->B(9) - s->B(9)); - val +=3D abs1(d->B(10) - s->B(10)); - val +=3D abs1(d->B(11) - s->B(11)); - val +=3D abs1(d->B(12) - s->B(12)); - val +=3D abs1(d->B(13) - s->B(13)); - val +=3D abs1(d->B(14) - s->B(14)); - val +=3D abs1(d->B(15) - s->B(15)); - d->Q(1) =3D val; -#endif + for (i =3D 0; i < (1 << SHIFT); i++) { + unsigned int val =3D 0; + val +=3D abs1(v->B(8 * i + 0) - s->B(8 * i + 0)); + val +=3D abs1(v->B(8 * i + 1) - s->B(8 * i + 1)); + val +=3D abs1(v->B(8 * i + 2) - s->B(8 * i + 2)); + val +=3D abs1(v->B(8 * i + 3) - s->B(8 * i + 3)); + val +=3D abs1(v->B(8 * i + 4) - s->B(8 * i + 4)); + val +=3D abs1(v->B(8 * i + 5) - s->B(8 * i + 5)); + val +=3D abs1(v->B(8 * i + 6) - s->B(8 * i + 6)); + val +=3D abs1(v->B(8 * i + 7) - s->B(8 * i + 7)); + d->Q(i) =3D val; + } } =20 void glue(helper_maskmov, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, @@ -458,20 +453,24 @@ void glue(helper_maskmov, SUFFIX)(CPUX86State *env, R= eg *d, Reg *s, =20 void glue(helper_movl_mm_T0, SUFFIX)(Reg *d, uint32_t val) { + int i; + d->L(0) =3D val; d->L(1) =3D 0; -#if SHIFT =3D=3D 1 - d->Q(1) =3D 0; -#endif + for (i =3D 1; i < (1 << SHIFT); i++) { + d->Q(i) =3D 0; + } } =20 #ifdef TARGET_X86_64 void glue(helper_movq_mm_T0, SUFFIX)(Reg *d, uint64_t val) { + int i; + d->Q(0) =3D val; -#if SHIFT =3D=3D 1 - d->Q(1) =3D 0; -#endif + for (i =3D 1; i < (1 << SHIFT); i++) { + d->Q(i) =3D 0; + } } #endif =20 @@ -1074,26 +1073,21 @@ uint32_t glue(helper_movmskpd, SUFFIX)(CPUX86State = *env, Reg *s) uint32_t 
glue(helper_pmovmskb, SUFFIX)(CPUX86State *env, Reg *s) { uint32_t val; + int i; =20 val =3D 0; - val |=3D (s->B(0) >> 7); - val |=3D (s->B(1) >> 6) & 0x02; - val |=3D (s->B(2) >> 5) & 0x04; - val |=3D (s->B(3) >> 4) & 0x08; - val |=3D (s->B(4) >> 3) & 0x10; - val |=3D (s->B(5) >> 2) & 0x20; - val |=3D (s->B(6) >> 1) & 0x40; - val |=3D (s->B(7)) & 0x80; -#if SHIFT =3D=3D 1 - val |=3D (s->B(8) << 1) & 0x0100; - val |=3D (s->B(9) << 2) & 0x0200; - val |=3D (s->B(10) << 3) & 0x0400; - val |=3D (s->B(11) << 4) & 0x0800; - val |=3D (s->B(12) << 5) & 0x1000; - val |=3D (s->B(13) << 6) & 0x2000; - val |=3D (s->B(14) << 7) & 0x4000; - val |=3D (s->B(15) << 8) & 0x8000; -#endif + for (i =3D 0; i < (1 << SHIFT); i++) { + uint8_t byte =3D 0; + byte |=3D (s->B(8 * i + 0) >> 7); + byte |=3D (s->B(8 * i + 1) >> 6) & 0x02; + byte |=3D (s->B(8 * i + 2) >> 5) & 0x04; + byte |=3D (s->B(8 * i + 3) >> 4) & 0x08; + byte |=3D (s->B(8 * i + 4) >> 3) & 0x10; + byte |=3D (s->B(8 * i + 5) >> 2) & 0x20; + byte |=3D (s->B(8 * i + 6) >> 1) & 0x40; + byte |=3D (s->B(8 * i + 7)) & 0x80; + val |=3D byte << (8 * i); + } return val; } =20 @@ -1638,46 +1632,48 @@ SSE_HELPER_V(helper_blendvpd, Q, 2, FBLENDVPD) =20 void glue(helper_ptest, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - uint64_t zf =3D (s->Q(0) & d->Q(0)) | (s->Q(1) & d->Q(1)); - uint64_t cf =3D (s->Q(0) & ~d->Q(0)) | (s->Q(1) & ~d->Q(1)); + uint64_t zf =3D 0, cf =3D 0; + int i; =20 + for (i =3D 0; i < 1 << SHIFT; i++) { + zf |=3D (s->Q(i) & d->Q(i)); + cf |=3D (s->Q(i) & ~d->Q(i)); + } CC_SRC =3D (zf ? 0 : CC_Z) | (cf ? 
0 : CC_C); } =20 -#define SSE_HELPER_F(name, elem, num, F) \ - void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ - { \ - if (num > 2) { \ - if (num > 4) { \ - d->elem(7) =3D F(7); \ - d->elem(6) =3D F(6); \ - d->elem(5) =3D F(5); \ - d->elem(4) =3D F(4); \ - } \ - d->elem(3) =3D F(3); \ - d->elem(2) =3D F(2); \ - } \ - d->elem(1) =3D F(1); \ - d->elem(0) =3D F(0); \ +#define SSE_HELPER_F(name, elem, num, F) \ + void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ + { \ + int n =3D num; \ + for (int i =3D n; --i >=3D 0; ) { \ + d->elem(i) =3D F(i); \ + } \ } =20 -SSE_HELPER_F(helper_pmovsxbw, W, 8, (int8_t) s->B) -SSE_HELPER_F(helper_pmovsxbd, L, 4, (int8_t) s->B) -SSE_HELPER_F(helper_pmovsxbq, Q, 2, (int8_t) s->B) -SSE_HELPER_F(helper_pmovsxwd, L, 4, (int16_t) s->W) -SSE_HELPER_F(helper_pmovsxwq, Q, 2, (int16_t) s->W) -SSE_HELPER_F(helper_pmovsxdq, Q, 2, (int32_t) s->L) -SSE_HELPER_F(helper_pmovzxbw, W, 8, s->B) -SSE_HELPER_F(helper_pmovzxbd, L, 4, s->B) -SSE_HELPER_F(helper_pmovzxbq, Q, 2, s->B) -SSE_HELPER_F(helper_pmovzxwd, L, 4, s->W) -SSE_HELPER_F(helper_pmovzxwq, Q, 2, s->W) -SSE_HELPER_F(helper_pmovzxdq, Q, 2, s->L) +#if SHIFT > 0 +SSE_HELPER_F(helper_pmovsxbw, W, 4 << SHIFT, (int8_t) s->B) +SSE_HELPER_F(helper_pmovsxbd, L, 2 << SHIFT, (int8_t) s->B) +SSE_HELPER_F(helper_pmovsxbq, Q, 1 << SHIFT, (int8_t) s->B) +SSE_HELPER_F(helper_pmovsxwd, L, 2 << SHIFT, (int16_t) s->W) +SSE_HELPER_F(helper_pmovsxwq, Q, 1 << SHIFT, (int16_t) s->W) +SSE_HELPER_F(helper_pmovsxdq, Q, 1 << SHIFT, (int32_t) s->L) +SSE_HELPER_F(helper_pmovzxbw, W, 4 << SHIFT, s->B) +SSE_HELPER_F(helper_pmovzxbd, L, 2 << SHIFT, s->B) +SSE_HELPER_F(helper_pmovzxbq, Q, 1 << SHIFT, s->B) +SSE_HELPER_F(helper_pmovzxwd, L, 2 << SHIFT, s->W) +SSE_HELPER_F(helper_pmovzxwq, Q, 1 << SHIFT, s->W) +SSE_HELPER_F(helper_pmovzxdq, Q, 1 << SHIFT, s->L) +#endif =20 void glue(helper_pmuldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - d->Q(0) =3D (int64_t)(int32_t) d->L(0) * (int32_t) s->L(0); - 
d->Q(1) =3D (int64_t)(int32_t) d->L(2) * (int32_t) s->L(2); + Reg *v =3D d; + int i; + + for (i =3D 0; i < 1 << SHIFT; i++) { + d->Q(i) =3D (int64_t)(int32_t) v->L(2 * i) * (int32_t) s->L(2 * i); + } } =20 #define FCMPEQQ(d, s) (d =3D=3D s ? -1 : 0) --=20 2.37.1 From nobody Mon Feb 9 15:46:37 2026 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1662021225; cv=none; d=zohomail.com; s=zohoarc; b=eyfqaW+uy0/eIIrOhMu22a70E5DfACtGROIbiJfJIa7XbWR40ZcnrmXrzfmN24NO5isDLCo4V8UoUyw75WOZ3EOgHClPxvV1MNQwWs2bUkHZ5M1Pv7aPnB1E5u6S1PaR/R2EpA0HkTIA1hFjbWAJ0YVUsyrOiPQOMEBED4FaKwU= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1662021225; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=3REAHuapFJUYclfbIdPQbtZkfTTxzjiTaVwLoWZYQGg=; b=StYLOdJreNRRnf+nP1nNNu/2CpHcngVpNFy6jmTULpmRMskeKoWK/oaVDpTDPZBkqvhUiVNLNjBq2ObvzyHHfhJCcJRYI02AZFw26XUIC4kkMp/zhUjtog3MG30ApdiG18C5rR21hQ+SIS0AD+fJ4sRbKtqi6wCbegUH0pUmb+s= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1662021225889850.4009547369794; Thu, 1 Sep 2022 01:33:45 -0700 (PDT) Received: from localhost ([::1]:37398 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oTfdw-000267-Hm for importer@patchew.org; Thu, 01 Sep 2022 04:33:44 -0400 
Received: from eggs.gnu.org ([2001:470:142:3::10]:42656) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oTex7-0001T6-EW for qemu-devel@nongnu.org; Thu, 01 Sep 2022 03:49:39 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:46597) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oTex0-0003AO-Ru for qemu-devel@nongnu.org; Thu, 01 Sep 2022 03:49:25 -0400 Received: from mail-wm1-f70.google.com (mail-wm1-f70.google.com [209.85.128.70]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-257-0-taQgKKNbmtUr14U24tag-1; Thu, 01 Sep 2022 03:49:21 -0400 Received: by mail-wm1-f70.google.com with SMTP id ay21-20020a05600c1e1500b003a6271a9718so9498139wmb.0 for ; Thu, 01 Sep 2022 00:49:20 -0700 (PDT) Received: from goa-sendmail ([2001:b07:6468:f312:5e2c:eb9a:a8b6:fd3e]) by smtp.gmail.com with ESMTPSA id d5-20020a5d4f85000000b0021e6c52c921sm16388484wru.54.2022.09.01.00.49.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 01 Sep 2022 00:49:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1662018562; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=3REAHuapFJUYclfbIdPQbtZkfTTxzjiTaVwLoWZYQGg=; b=TmDQ2ZLFLWlhj7ALeEE5xSnBYcoJajGVHMCOYZWIk4s+TSQum9ZD4Nt20qefQt0M3tJxsK BKjWwHRNcZ9Yevnw9gl6UHhpx+LvzOka6eh2ZERIAWYZsQQW+Xj70gXp/G9jrsKZG5avsQ v2k0CTFmpIO6N4FnY63Gzj8Y/h3bJbw= X-MC-Unique: 0-taQgKKNbmtUr14U24tag-1 X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date; 
bh=3REAHuapFJUYclfbIdPQbtZkfTTxzjiTaVwLoWZYQGg=; b=3nz0VLhoMM7GvOH7qFa5yH8NxBD7V3SrRe3CmDIkheuORo8ebnvIInmkSjEJm54KrG rCEGctewSSGAZZsrmNRGoOeuw3J/DUSbzVW1N1buI2hYDHwS33bFbIvfk2Os0vJN7R72 LrIVxbbk9t+Cf8a7abOkssj52rjNpNJs5oka1ETT2ouGQSzTe44sPgEjU54tGFkwtcs0 lehsQ/2z9dSe9akJubSON2m5aZEcNX/qFtwEWGPPY6goccOfT59G350Xy9oprgsh7Zrt W99SiUYHTZ4i67iWVkA+JjW8YbcoNdbEifd69bYj/8TnzgkEl8hPfRVlP7t/uk/SnIG+ tm5A== X-Gm-Message-State: ACgBeo3bf2FRv6zKRghhwehqEUTxWWQsy0BThBjfQoXvG6V+MMr0bWMk 8dTwnDsrMivQQX65erbj1R+1DU/xAsriuKqQcWvrbhjWsUYXSN7a3uoJvTjbaewmfQom8tG/c2d wkR76W/nEroy6EAoOdpTe28NLylhzB3f8m/GMCi25fvYD6veYZkP1tx+aJnZARLJTpGc= X-Received: by 2002:a05:600c:3b1b:b0:3a8:4044:9a9c with SMTP id m27-20020a05600c3b1b00b003a840449a9cmr4149455wms.69.1662018559310; Thu, 01 Sep 2022 00:49:19 -0700 (PDT) X-Google-Smtp-Source: AA6agR7HojsFYlkQ4mbFHF2467fqcxZXjQS/nd1pQsK1Pd6hCqetjH0B8eM94Zb/yja40AwAJ1WACg== X-Received: by 2002:a05:600c:3b1b:b0:3a8:4044:9a9c with SMTP id m27-20020a05600c3b1b00b003a840449a9cmr4149420wms.69.1662018558655; Thu, 01 Sep 2022 00:49:18 -0700 (PDT) From: Paolo Bonzini To: qemu-devel@nongnu.org Cc: paul@nowt.org, richard.henderson@linaro.org Subject: [PATCH v3 15/23] i386: Destructive vector helpers for AVX Date: Thu, 1 Sep 2022 09:48:34 +0200 Message-Id: <20220901074842.57424-16-pbonzini@redhat.com> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20220901074842.57424-1-pbonzini@redhat.com> References: <20220901074842.57424-1-pbonzini@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=pbonzini@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, 
DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1662021226458100001 Content-Type: text/plain; charset="utf-8" From: Paul Brook These helpers need to take special care to avoid overwriting source values before the whole result has been calculated. Currently they use a dummy Reg-typed variable to store the result, then assign the whole register. This will cause 128-bit operations to corrupt the upper half of the register, so replace it with explicit temporaries and element assignments. Signed-off-by: Paul Brook Message-Id: <20220424220204.2493824-14-paul@nowt.org> Signed-off-by: Paolo Bonzini Reviewed-by: Richard Henderson --- target/i386/ops_sse.h | 556 ++++++++++++++++++++---------------------- 1 file changed, 262 insertions(+), 294 deletions(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index 557cc7ce7d..7d48c05693 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -41,6 +41,7 @@ #endif =20 #define LANE_WIDTH (SHIFT ? 16 : 8) +#define PACK_WIDTH (LANE_WIDTH / 2) =20 /* * Copy the relevant parts of a Reg value around. 
In the case where @@ -474,71 +475,81 @@ void glue(helper_movq_mm_T0, SUFFIX)(Reg *d, uint64_t= val) } #endif =20 +#define SHUFFLE4(F, a, b, offset) do { \ + r0 =3D a->F((order & 3) + offset); \ + r1 =3D a->F(((order >> 2) & 3) + offset); \ + r2 =3D b->F(((order >> 4) & 3) + offset); \ + r3 =3D b->F(((order >> 6) & 3) + offset); \ + d->F(offset) =3D r0; \ + d->F(offset + 1) =3D r1; \ + d->F(offset + 2) =3D r2; \ + d->F(offset + 3) =3D r3; \ + } while (0) + #if SHIFT =3D=3D 0 void glue(helper_pshufw, SUFFIX)(Reg *d, Reg *s, int order) { - Reg r; + uint16_t r0, r1, r2, r3; =20 - r.W(0) =3D s->W(order & 3); - r.W(1) =3D s->W((order >> 2) & 3); - r.W(2) =3D s->W((order >> 4) & 3); - r.W(3) =3D s->W((order >> 6) & 3); - MOVE(*d, r); + SHUFFLE4(W, s, s, 0); } #else void glue(helper_shufps, SUFFIX)(Reg *d, Reg *s, int order) { - Reg r; + Reg *v =3D d; + uint32_t r0, r1, r2, r3; + int i; =20 - r.L(0) =3D d->L(order & 3); - r.L(1) =3D d->L((order >> 2) & 3); - r.L(2) =3D s->L((order >> 4) & 3); - r.L(3) =3D s->L((order >> 6) & 3); - MOVE(*d, r); + for (i =3D 0; i < 2 << SHIFT; i +=3D 4) { + SHUFFLE4(L, v, s, i); + } } =20 void glue(helper_shufpd, SUFFIX)(Reg *d, Reg *s, int order) { - Reg r; + Reg *v =3D d; + uint64_t r0, r1; + int i; =20 - r.Q(0) =3D d->Q(order & 1); - r.Q(1) =3D s->Q((order >> 1) & 1); - MOVE(*d, r); + for (i =3D 0; i < 1 << SHIFT; i +=3D 2) { + r0 =3D v->Q(((order & 1) & 1) + i); + r1 =3D s->Q(((order >> 1) & 1) + i); + d->Q(i) =3D r0; + d->Q(i + 1) =3D r1; + order >>=3D 2; + } } =20 void glue(helper_pshufd, SUFFIX)(Reg *d, Reg *s, int order) { - Reg r; + uint32_t r0, r1, r2, r3; + int i; =20 - r.L(0) =3D s->L(order & 3); - r.L(1) =3D s->L((order >> 2) & 3); - r.L(2) =3D s->L((order >> 4) & 3); - r.L(3) =3D s->L((order >> 6) & 3); - MOVE(*d, r); + for (i =3D 0; i < 2 << SHIFT; i +=3D 4) { + SHUFFLE4(L, s, s, i); + } } =20 void glue(helper_pshuflw, SUFFIX)(Reg *d, Reg *s, int order) { - Reg r; + uint16_t r0, r1, r2, r3; + int i, j; =20 - r.W(0) =3D 
s->W(order & 3); - r.W(1) =3D s->W((order >> 2) & 3); - r.W(2) =3D s->W((order >> 4) & 3); - r.W(3) =3D s->W((order >> 6) & 3); - r.Q(1) =3D s->Q(1); - MOVE(*d, r); + for (i =3D 0, j =3D 1; j < 1 << SHIFT; i +=3D 8, j +=3D 2) { + SHUFFLE4(W, s, s, i); + d->Q(j) =3D s->Q(j); + } } =20 void glue(helper_pshufhw, SUFFIX)(Reg *d, Reg *s, int order) { - Reg r; + uint16_t r0, r1, r2, r3; + int i, j; =20 - r.Q(0) =3D s->Q(0); - r.W(4) =3D s->W(4 + (order & 3)); - r.W(5) =3D s->W(4 + ((order >> 2) & 3)); - r.W(6) =3D s->W(4 + ((order >> 4) & 3)); - r.W(7) =3D s->W(4 + ((order >> 6) & 3)); - MOVE(*d, r); + for (i =3D 4, j =3D 0; j < 1 << SHIFT; i +=3D 8, j +=3D 2) { + d->Q(j) =3D s->Q(j); + SHUFFLE4(W, s, s, i); + } } #endif =20 @@ -1091,156 +1102,132 @@ uint32_t glue(helper_pmovmskb, SUFFIX)(CPUX86Stat= e *env, Reg *s) return val; } =20 -void glue(helper_packsswb, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) -{ - Reg r; - - r.B(0) =3D satsb((int16_t)d->W(0)); - r.B(1) =3D satsb((int16_t)d->W(1)); - r.B(2) =3D satsb((int16_t)d->W(2)); - r.B(3) =3D satsb((int16_t)d->W(3)); -#if SHIFT =3D=3D 1 - r.B(4) =3D satsb((int16_t)d->W(4)); - r.B(5) =3D satsb((int16_t)d->W(5)); - r.B(6) =3D satsb((int16_t)d->W(6)); - r.B(7) =3D satsb((int16_t)d->W(7)); -#endif - r.B((4 << SHIFT) + 0) =3D satsb((int16_t)s->W(0)); - r.B((4 << SHIFT) + 1) =3D satsb((int16_t)s->W(1)); - r.B((4 << SHIFT) + 2) =3D satsb((int16_t)s->W(2)); - r.B((4 << SHIFT) + 3) =3D satsb((int16_t)s->W(3)); -#if SHIFT =3D=3D 1 - r.B(12) =3D satsb((int16_t)s->W(4)); - r.B(13) =3D satsb((int16_t)s->W(5)); - r.B(14) =3D satsb((int16_t)s->W(6)); - r.B(15) =3D satsb((int16_t)s->W(7)); -#endif - MOVE(*d, r); +#define PACK_HELPER_B(name, F) \ +void glue(helper_pack ## name, SUFFIX)(CPUX86State *env, \ + Reg *d, Reg *s) \ +{ \ + Reg *v =3D d; \ + uint8_t r[PACK_WIDTH * 2]; \ + int j, k; \ + for (j =3D 0; j < 4 << SHIFT; j +=3D PACK_WIDTH) { \ + for (k =3D 0; k < PACK_WIDTH; k++) { \ + r[k] =3D F((int16_t)v->W(j + k)); \ + } \ + for (k 
=3D 0; k < PACK_WIDTH; k++) { \ + r[PACK_WIDTH + k] =3D F((int16_t)s->W(j + k)); \ + } \ + for (k =3D 0; k < PACK_WIDTH * 2; k++) { \ + d->B(2 * j + k) =3D r[k]; \ + } \ + } \ } =20 -void glue(helper_packuswb, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) -{ - Reg r; - - r.B(0) =3D satub((int16_t)d->W(0)); - r.B(1) =3D satub((int16_t)d->W(1)); - r.B(2) =3D satub((int16_t)d->W(2)); - r.B(3) =3D satub((int16_t)d->W(3)); -#if SHIFT =3D=3D 1 - r.B(4) =3D satub((int16_t)d->W(4)); - r.B(5) =3D satub((int16_t)d->W(5)); - r.B(6) =3D satub((int16_t)d->W(6)); - r.B(7) =3D satub((int16_t)d->W(7)); -#endif - r.B((4 << SHIFT) + 0) =3D satub((int16_t)s->W(0)); - r.B((4 << SHIFT) + 1) =3D satub((int16_t)s->W(1)); - r.B((4 << SHIFT) + 2) =3D satub((int16_t)s->W(2)); - r.B((4 << SHIFT) + 3) =3D satub((int16_t)s->W(3)); -#if SHIFT =3D=3D 1 - r.B(12) =3D satub((int16_t)s->W(4)); - r.B(13) =3D satub((int16_t)s->W(5)); - r.B(14) =3D satub((int16_t)s->W(6)); - r.B(15) =3D satub((int16_t)s->W(7)); -#endif - MOVE(*d, r); -} +PACK_HELPER_B(sswb, satsb) +PACK_HELPER_B(uswb, satub) =20 void glue(helper_packssdw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - Reg r; + Reg *v =3D d; + uint16_t r[PACK_WIDTH]; + int j, k; =20 - r.W(0) =3D satsw(d->L(0)); - r.W(1) =3D satsw(d->L(1)); -#if SHIFT =3D=3D 1 - r.W(2) =3D satsw(d->L(2)); - r.W(3) =3D satsw(d->L(3)); -#endif - r.W((2 << SHIFT) + 0) =3D satsw(s->L(0)); - r.W((2 << SHIFT) + 1) =3D satsw(s->L(1)); -#if SHIFT =3D=3D 1 - r.W(6) =3D satsw(s->L(2)); - r.W(7) =3D satsw(s->L(3)); -#endif - MOVE(*d, r); + for (j =3D 0; j < 2 << SHIFT; j +=3D PACK_WIDTH / 2) { + for (k =3D 0; k < PACK_WIDTH / 2; k++) { + r[k] =3D satsw(v->L(j + k)); + } + for (k =3D 0; k < PACK_WIDTH / 2; k++) { + r[PACK_WIDTH / 2 + k] =3D satsw(s->L(j + k)); + } + for (k =3D 0; k < PACK_WIDTH; k++) { + d->W(2 * j + k) =3D r[k]; + } + } } =20 #define UNPCK_OP(base_name, base) \ \ void glue(helper_punpck ## base_name ## bw, SUFFIX)(CPUX86State *env,\ - Reg *d, Reg *s) \ + Reg *d, Reg 
*s) \ { \ - Reg r; \ + Reg *v =3D d; \ + uint8_t r[PACK_WIDTH * 2]; \ + int j, i; \ \ - r.B(0) =3D d->B((base << (SHIFT + 2)) + 0); \ - r.B(1) =3D s->B((base << (SHIFT + 2)) + 0); \ - r.B(2) =3D d->B((base << (SHIFT + 2)) + 1); \ - r.B(3) =3D s->B((base << (SHIFT + 2)) + 1); \ - r.B(4) =3D d->B((base << (SHIFT + 2)) + 2); \ - r.B(5) =3D s->B((base << (SHIFT + 2)) + 2); \ - r.B(6) =3D d->B((base << (SHIFT + 2)) + 3); \ - r.B(7) =3D s->B((base << (SHIFT + 2)) + 3); \ - XMM_ONLY( \ - r.B(8) =3D d->B((base << (SHIFT + 2)) + 4); \ - r.B(9) =3D s->B((base << (SHIFT + 2)) + 4); \ - r.B(10) =3D d->B((base << (SHIFT + 2)) + 5); \ - r.B(11) =3D s->B((base << (SHIFT + 2)) + 5); \ - r.B(12) =3D d->B((base << (SHIFT + 2)) + 6); \ - r.B(13) =3D s->B((base << (SHIFT + 2)) + 6); \ - r.B(14) =3D d->B((base << (SHIFT + 2)) + 7); \ - r.B(15) =3D s->B((base << (SHIFT + 2)) + 7); \ - ) \ - MOVE(*d, r); \ + for (j =3D 0; j < 8 << SHIFT; ) { \ + int k =3D j + base * PACK_WIDTH; \ + for (i =3D 0; i < PACK_WIDTH; i++) { \ + r[2 * i] =3D v->B(k + i); \ + r[2 * i + 1] =3D s->B(k + i); \ + } \ + for (i =3D 0; i < PACK_WIDTH * 2; i++, j++) { \ + d->B(j) =3D r[i]; \ + } \ + } \ } \ \ void glue(helper_punpck ## base_name ## wd, SUFFIX)(CPUX86State *env,\ - Reg *d, Reg *s) \ + Reg *d, Reg *s) \ { \ - Reg r; \ + Reg *v =3D d; \ + uint16_t r[PACK_WIDTH]; \ + int j, i; \ \ - r.W(0) =3D d->W((base << (SHIFT + 1)) + 0); \ - r.W(1) =3D s->W((base << (SHIFT + 1)) + 0); \ - r.W(2) =3D d->W((base << (SHIFT + 1)) + 1); \ - r.W(3) =3D s->W((base << (SHIFT + 1)) + 1); \ - XMM_ONLY( \ - r.W(4) =3D d->W((base << (SHIFT + 1)) + 2); \ - r.W(5) =3D s->W((base << (SHIFT + 1)) + 2); \ - r.W(6) =3D d->W((base << (SHIFT + 1)) + 3); \ - r.W(7) =3D s->W((base << (SHIFT + 1)) + 3); \ - ) \ - MOVE(*d, r); \ + for (j =3D 0; j < 4 << SHIFT; ) { \ + int k =3D j + base * PACK_WIDTH / 2; \ + for (i =3D 0; i < PACK_WIDTH / 2; i++) { \ + r[2 * i] =3D v->W(k + i); \ + r[2 * i + 1] =3D s->W(k + i); \ + } \ + for (i =3D 0; i < 
PACK_WIDTH; i++, j++) { \ + d->W(j) =3D r[i]; \ + } \ + } \ } \ \ void glue(helper_punpck ## base_name ## dq, SUFFIX)(CPUX86State *env,\ - Reg *d, Reg *s) \ + Reg *d, Reg *s) \ { \ - Reg r; \ + Reg *v =3D d; \ + uint32_t r[PACK_WIDTH / 2]; \ + int j, i; \ \ - r.L(0) =3D d->L((base << SHIFT) + 0); \ - r.L(1) =3D s->L((base << SHIFT) + 0); \ - XMM_ONLY( \ - r.L(2) =3D d->L((base << SHIFT) + 1); \ - r.L(3) =3D s->L((base << SHIFT) + 1); \ - ) \ - MOVE(*d, r); \ + for (j =3D 0; j < 2 << SHIFT; ) { \ + int k =3D j + base * PACK_WIDTH / 4; \ + for (i =3D 0; i < PACK_WIDTH / 4; i++) { \ + r[2 * i] =3D v->L(k + i); \ + r[2 * i + 1] =3D s->L(k + i); \ + } \ + for (i =3D 0; i < PACK_WIDTH / 2; i++, j++) { \ + d->L(j) =3D r[i]; \ + } \ + } \ } \ \ XMM_ONLY( \ + void glue(helper_punpck ## base_name ## qdq, SUF= FIX)( \ + CPUX86State *env, Reg *d, Reg *s) \ { \ - Reg r; \ + Reg *v =3D d; \ + uint64_t r[2]; \ + int i; \ \ - r.Q(0) =3D d->Q(base); \ - r.Q(1) =3D s->Q(base); \ - MOVE(*d, r); \ + for (i =3D 0; i < 1 << SHIFT; i +=3D 2) { = \ + r[0] =3D v->Q(base + i); \ + r[1] =3D s->Q(base + i); \ + d->Q(i) =3D r[0]; \ + d->Q(i + 1) =3D r[1]; \ + } \ } \ ) =20 UNPCK_OP(l, 0) UNPCK_OP(h, 1) =20 +#undef PACK_WIDTH +#undef PACK_HELPER_B +#undef UNPCK_OP + + /* 3DNow! float ops */ #if SHIFT =3D=3D 0 void helper_pi2fd(CPUX86State *env, MMXReg *d, MMXReg *s) @@ -1393,122 +1380,86 @@ void helper_pswapd(CPUX86State *env, MMXReg *d, MM= XReg *s) /* SSSE3 op helpers */ void glue(helper_pshufb, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { + Reg *v =3D d; int i; - Reg r; +#if SHIFT =3D=3D 0 + uint8_t r[8]; =20 - for (i =3D 0; i < (8 << SHIFT); i++) { - r.B(i) =3D (s->B(i) & 0x80) ? 0 : (d->B(s->B(i) & ((8 << SHIFT) - = 1))); + for (i =3D 0; i < 8; i++) { + r[i] =3D (s->B(i) & 0x80) ?
0 : (v->B(s->B(i) & 7)); } + for (i =3D 0; i < 8; i++) { + d->B(i) =3D r[i]; + } +#else + uint8_t r[8 << SHIFT]; =20 - MOVE(*d, r); -} - -void glue(helper_phaddw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) -{ - - Reg r; - - r.W(0) =3D (int16_t)d->W(0) + (int16_t)d->W(1); - r.W(1) =3D (int16_t)d->W(2) + (int16_t)d->W(3); - XMM_ONLY(r.W(2) =3D (int16_t)d->W(4) + (int16_t)d->W(5)); - XMM_ONLY(r.W(3) =3D (int16_t)d->W(6) + (int16_t)d->W(7)); - r.W((2 << SHIFT) + 0) =3D (int16_t)s->W(0) + (int16_t)s->W(1); - r.W((2 << SHIFT) + 1) =3D (int16_t)s->W(2) + (int16_t)s->W(3); - XMM_ONLY(r.W(6) =3D (int16_t)s->W(4) + (int16_t)s->W(5)); - XMM_ONLY(r.W(7) =3D (int16_t)s->W(6) + (int16_t)s->W(7)); - - MOVE(*d, r); -} - -void glue(helper_phaddd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) -{ - Reg r; - - r.L(0) =3D (int32_t)d->L(0) + (int32_t)d->L(1); - XMM_ONLY(r.L(1) =3D (int32_t)d->L(2) + (int32_t)d->L(3)); - r.L((1 << SHIFT) + 0) =3D (int32_t)s->L(0) + (int32_t)s->L(1); - XMM_ONLY(r.L(3) =3D (int32_t)s->L(2) + (int32_t)s->L(3)); - - MOVE(*d, r); -} - -void glue(helper_phaddsw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) -{ - Reg r; - - r.W(0) =3D satsw((int16_t)d->W(0) + (int16_t)d->W(1)); - r.W(1) =3D satsw((int16_t)d->W(2) + (int16_t)d->W(3)); - XMM_ONLY(r.W(2) =3D satsw((int16_t)d->W(4) + (int16_t)d->W(5))); - XMM_ONLY(r.W(3) =3D satsw((int16_t)d->W(6) + (int16_t)d->W(7))); - r.W((2 << SHIFT) + 0) =3D satsw((int16_t)s->W(0) + (int16_t)s->W(1)); - r.W((2 << SHIFT) + 1) =3D satsw((int16_t)s->W(2) + (int16_t)s->W(3)); - XMM_ONLY(r.W(6) =3D satsw((int16_t)s->W(4) + (int16_t)s->W(5))); - XMM_ONLY(r.W(7) =3D satsw((int16_t)s->W(6) + (int16_t)s->W(7))); - - MOVE(*d, r); -} - -void glue(helper_pmaddubsw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) -{ - d->W(0) =3D satsw((int8_t)s->B(0) * (uint8_t)d->B(0) + - (int8_t)s->B(1) * (uint8_t)d->B(1)); - d->W(1) =3D satsw((int8_t)s->B(2) * (uint8_t)d->B(2) + - (int8_t)s->B(3) * (uint8_t)d->B(3)); - d->W(2) =3D satsw((int8_t)s->B(4) * 
(uint8_t)d->B(4) + - (int8_t)s->B(5) * (uint8_t)d->B(5)); - d->W(3) =3D satsw((int8_t)s->B(6) * (uint8_t)d->B(6) + - (int8_t)s->B(7) * (uint8_t)d->B(7)); -#if SHIFT =3D=3D 1 - d->W(4) =3D satsw((int8_t)s->B(8) * (uint8_t)d->B(8) + - (int8_t)s->B(9) * (uint8_t)d->B(9)); - d->W(5) =3D satsw((int8_t)s->B(10) * (uint8_t)d->B(10) + - (int8_t)s->B(11) * (uint8_t)d->B(11)); - d->W(6) =3D satsw((int8_t)s->B(12) * (uint8_t)d->B(12) + - (int8_t)s->B(13) * (uint8_t)d->B(13)); - d->W(7) =3D satsw((int8_t)s->B(14) * (uint8_t)d->B(14) + - (int8_t)s->B(15) * (uint8_t)d->B(15)); + for (i =3D 0; i < 8 << SHIFT; i++) { + int j =3D i & ~0xf; + r[i] =3D (s->B(i) & 0x80) ? 0 : v->B(j | (s->B(i) & 0xf)); + } + for (i =3D 0; i < 8 << SHIFT; i++) { + d->B(i) =3D r[i]; + } #endif } =20 -void glue(helper_phsubw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) -{ - Reg r; - - r.W(0) =3D (int16_t)d->W(0) - (int16_t)d->W(1); - r.W(1) =3D (int16_t)d->W(2) - (int16_t)d->W(3); - XMM_ONLY(r.W(2) =3D (int16_t)d->W(4) - (int16_t)d->W(5)); - XMM_ONLY(r.W(3) =3D (int16_t)d->W(6) - (int16_t)d->W(7)); - r.W((2 << SHIFT) + 0) =3D (int16_t)s->W(0) - (int16_t)s->W(1); - r.W((2 << SHIFT) + 1) =3D (int16_t)s->W(2) - (int16_t)s->W(3); - XMM_ONLY(r.W(6) =3D (int16_t)s->W(4) - (int16_t)s->W(5)); - XMM_ONLY(r.W(7) =3D (int16_t)s->W(6) - (int16_t)s->W(7)); - MOVE(*d, r); +#define SSE_HELPER_HW(name, F) \ +void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ +{ \ + Reg *v =3D d; \ + uint16_t r[4 << SHIFT]; \ + int i, j, k; \ + for (k =3D 0; k < 4 << SHIFT; k +=3D LANE_WIDTH / 2) { \ + for (i =3D j =3D 0; j < LANE_WIDTH / 2; i++, j +=3D 2) { \ + r[i + k] =3D F(v->W(j + k), v->W(j + k + 1)); \ + } \ + for (j =3D 0; j < LANE_WIDTH / 2; i++, j +=3D 2) { \ + r[i + k] =3D F(s->W(j + k), s->W(j + k + 1)); \ + } \ + } \ + for (i =3D 0; i < 4 << SHIFT; i++) { \ + d->W(i) =3D r[i]; \ + } \ } =20 -void glue(helper_phsubd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) -{ - Reg r; - - r.L(0) =3D (int32_t)d->L(0) - 
(int32_t)d->L(1); - XMM_ONLY(r.L(1) =3D (int32_t)d->L(2) - (int32_t)d->L(3)); - r.L((1 << SHIFT) + 0) =3D (int32_t)s->L(0) - (int32_t)s->L(1); - XMM_ONLY(r.L(3) =3D (int32_t)s->L(2) - (int32_t)s->L(3)); - MOVE(*d, r); +#define SSE_HELPER_HL(name, F) \ +void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ +{ \ + Reg *v =3D d; \ + uint32_t r[2 << SHIFT]; \ + int i, j, k; \ + for (k =3D 0; k < 2 << SHIFT; k +=3D LANE_WIDTH / 4) { \ + for (i =3D j =3D 0; j < LANE_WIDTH / 4; i++, j +=3D 2) { \ + r[i + k] =3D F(v->L(j + k), v->L(j + k + 1)); \ + } \ + for (j =3D 0; j < LANE_WIDTH / 4; i++, j +=3D 2) { \ + r[i + k] =3D F(s->L(j + k), s->L(j + k + 1)); \ + } \ + } \ + for (i =3D 0; i < 2 << SHIFT; i++) { \ + d->L(i) =3D r[i]; \ + } \ } =20 -void glue(helper_phsubsw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) -{ - Reg r; +SSE_HELPER_HW(phaddw, FADD) +SSE_HELPER_HW(phsubw, FSUB) +SSE_HELPER_HW(phaddsw, FADDSW) +SSE_HELPER_HW(phsubsw, FSUBSW) +SSE_HELPER_HL(phaddd, FADD) +SSE_HELPER_HL(phsubd, FSUB) =20 - r.W(0) =3D satsw((int16_t)d->W(0) - (int16_t)d->W(1)); - r.W(1) =3D satsw((int16_t)d->W(2) - (int16_t)d->W(3)); - XMM_ONLY(r.W(2) =3D satsw((int16_t)d->W(4) - (int16_t)d->W(5))); - XMM_ONLY(r.W(3) =3D satsw((int16_t)d->W(6) - (int16_t)d->W(7))); - r.W((2 << SHIFT) + 0) =3D satsw((int16_t)s->W(0) - (int16_t)s->W(1)); - r.W((2 << SHIFT) + 1) =3D satsw((int16_t)s->W(2) - (int16_t)s->W(3)); - XMM_ONLY(r.W(6) =3D satsw((int16_t)s->W(4) - (int16_t)s->W(5))); - XMM_ONLY(r.W(7) =3D satsw((int16_t)s->W(6) - (int16_t)s->W(7))); - MOVE(*d, r); +#undef SSE_HELPER_HW +#undef SSE_HELPER_HL + +void glue(helper_pmaddubsw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +{ + Reg *v =3D d; + int i; + for (i =3D 0; i < 4 << SHIFT; i++) { + d->W(i) =3D satsw((int8_t)s->B(i * 2) * (uint8_t)v->B(i * 2) + + (int8_t)s->B(i * 2 + 1) * (uint8_t)v->B(i * 2 + 1)= ); + } } =20 #define FABSB(x) (x > INT8_MAX ? 
-(int8_t)x : x) @@ -1531,32 +1482,38 @@ SSE_HELPER_L(helper_psignd, FSIGNL) void glue(helper_palignr, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, int32_t shift) { - Reg r; + Reg *v =3D d; + int i; =20 /* XXX could be checked during translation */ - if (shift >=3D (16 << SHIFT)) { - r.Q(0) =3D 0; - XMM_ONLY(r.Q(1) =3D 0); + if (shift >=3D (SHIFT ? 32 : 16)) { + for (i =3D 0; i < (1 << SHIFT); i++) { + d->Q(i) =3D 0; + } } else { shift <<=3D 3; #define SHR(v, i) (i < 64 && i > -64 ? i > 0 ? v >> (i) : (v << -(i)) : 0) #if SHIFT =3D=3D 0 - r.Q(0) =3D SHR(s->Q(0), shift - 0) | - SHR(d->Q(0), shift - 64); + d->Q(0) =3D SHR(s->Q(0), shift - 0) | + SHR(v->Q(0), shift - 64); #else - r.Q(0) =3D SHR(s->Q(0), shift - 0) | - SHR(s->Q(1), shift - 64) | - SHR(d->Q(0), shift - 128) | - SHR(d->Q(1), shift - 192); - r.Q(1) =3D SHR(s->Q(0), shift + 64) | - SHR(s->Q(1), shift - 0) | - SHR(d->Q(0), shift - 64) | - SHR(d->Q(1), shift - 128); + for (i =3D 0; i < (1 << SHIFT); i +=3D 2) { + uint64_t r0, r1; + + r0 =3D SHR(s->Q(i), shift - 0) | + SHR(s->Q(i + 1), shift - 64) | + SHR(v->Q(i), shift - 128) | + SHR(v->Q(i + 1), shift - 192); + r1 =3D SHR(s->Q(i), shift + 64) | + SHR(s->Q(i + 1), shift - 0) | + SHR(v->Q(i), shift - 64) | + SHR(v->Q(i + 1), shift - 128); + d->Q(i) =3D r0; + d->Q(i + 1) =3D r1; + } #endif #undef SHR } - - MOVE(*d, r); } =20 #define XMM0 (env->xmm_regs[0]) @@ -1681,17 +1638,23 @@ SSE_HELPER_Q(helper_pcmpeqq, FCMPEQQ) =20 void glue(helper_packusdw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - Reg r; + Reg *v =3D d; + uint16_t r[8]; + int i, j, k; =20 - r.W(0) =3D satuw((int32_t) d->L(0)); - r.W(1) =3D satuw((int32_t) d->L(1)); - r.W(2) =3D satuw((int32_t) d->L(2)); - r.W(3) =3D satuw((int32_t) d->L(3)); - r.W(4) =3D satuw((int32_t) s->L(0)); - r.W(5) =3D satuw((int32_t) s->L(1)); - r.W(6) =3D satuw((int32_t) s->L(2)); - r.W(7) =3D satuw((int32_t) s->L(3)); - MOVE(*d, r); + for (i =3D 0, j =3D 0; i <=3D 2 << SHIFT; i +=3D 8, j +=3D 4) { + r[0] =3D satuw(v->L(j)); + 
r[1] =3D satuw(v->L(j + 1)); + r[2] =3D satuw(v->L(j + 2)); + r[3] =3D satuw(v->L(j + 3)); + r[4] =3D satuw(s->L(j)); + r[5] =3D satuw(s->L(j + 1)); + r[6] =3D satuw(s->L(j + 2)); + r[7] =3D satuw(s->L(j + 3)); + for (k =3D 0; k < 8; k++) { + d->W(i + k) =3D r[k]; + } + } } =20 #define FMINSB(d, s) MIN((int8_t)d, (int8_t)s) @@ -1947,20 +1910,25 @@ void glue(helper_dppd, SUFFIX)(CPUX86State *env, Re= g *d, Reg *s, uint32_t mask) void glue(helper_mpsadbw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t offset) { - int s0 =3D (offset & 3) << 2; - int d0 =3D (offset & 4) << 0; - int i; - Reg r; + Reg *v =3D d; + int i, j; + uint16_t r[8]; =20 - for (i =3D 0; i < 8; i++, d0++) { - r.W(i) =3D 0; - r.W(i) +=3D abs1(d->B(d0 + 0) - s->B(s0 + 0)); - r.W(i) +=3D abs1(d->B(d0 + 1) - s->B(s0 + 1)); - r.W(i) +=3D abs1(d->B(d0 + 2) - s->B(s0 + 2)); - r.W(i) +=3D abs1(d->B(d0 + 3) - s->B(s0 + 3)); + for (j =3D 0; j < 4 << SHIFT; ) { + int s0 =3D (j * 2) + ((offset & 3) << 2); + int d0 =3D (j * 2) + ((offset & 4) << 0); + for (i =3D 0; i < LANE_WIDTH / 2; i++, d0++) { + r[i] =3D 0; + r[i] +=3D abs1(v->B(d0 + 0) - s->B(s0 + 0)); + r[i] +=3D abs1(v->B(d0 + 1) - s->B(s0 + 1)); + r[i] +=3D abs1(v->B(d0 + 2) - s->B(s0 + 2)); + r[i] +=3D abs1(v->B(d0 + 3) - s->B(s0 + 3)); + } + for (i =3D 0; i < LANE_WIDTH / 2; i++, j++) { + d->W(j) =3D r[i]; + } + offset >>=3D 3; } - - MOVE(*d, r); } =20 /* SSE4.2 op helpers */ --=20 2.37.1
From nobody Mon Feb 9 15:46:37 2026
From: Paolo Bonzini
To: qemu-devel@nongnu.org
Cc: paul@nowt.org, richard.henderson@linaro.org
Subject: [PATCH v3 16/23] i386: Floating point arithmetic helper AVX prep
Date: Thu, 1 Sep 2022 09:48:35 +0200
Message-Id: <20220901074842.57424-17-pbonzini@redhat.com>
In-Reply-To: <20220901074842.57424-1-pbonzini@redhat.com>
References: <20220901074842.57424-1-pbonzini@redhat.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: Paul Brook

Prepare the "easy" floating point vector helpers for AVX

No functional changes to
existing helpers. Signed-off-by: Paul Brook Message-Id: <20220424220204.2493824-16-paul@nowt.org> Reviewed-by: Richard Henderson Signed-off-by: Paolo Bonzini --- target/i386/ops_sse.h | 138 ++++++++++++++++++++++++++++-------------- 1 file changed, 92 insertions(+), 46 deletions(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index 7d48c05693..d881d03228 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -553,40 +553,58 @@ void glue(helper_pshufhw, SUFFIX)(Reg *d, Reg *s, int= order) } #endif =20 -#if SHIFT =3D=3D 1 +#if SHIFT >=3D 1 /* FPU ops */ /* XXX: not accurate */ =20 -#define SSE_HELPER_S(name, F) \ - void glue(helper_ ## name ## ps, SUFFIX)(CPUX86State *env, Reg *d, Reg= *s)\ +#define SSE_HELPER_P(name, F) \ + void glue(helper_ ## name ## ps, SUFFIX)(CPUX86State *env, \ + Reg *d, Reg *s) \ { \ - d->ZMM_S(0) =3D F(32, d->ZMM_S(0), s->ZMM_S(0)); \ - d->ZMM_S(1) =3D F(32, d->ZMM_S(1), s->ZMM_S(1)); \ - d->ZMM_S(2) =3D F(32, d->ZMM_S(2), s->ZMM_S(2)); \ - d->ZMM_S(3) =3D F(32, d->ZMM_S(3), s->ZMM_S(3)); \ + Reg *v =3D d; \ + int i; \ + for (i =3D 0; i < 2 << SHIFT; i++) { \ + d->ZMM_S(i) =3D F(32, v->ZMM_S(i), s->ZMM_S(i)); \ + } \ } \ \ - void helper_ ## name ## ss(CPUX86State *env, Reg *d, Reg *s) \ + void glue(helper_ ## name ## pd, SUFFIX)(CPUX86State *env, \ + Reg *d, Reg *s) \ { \ - d->ZMM_S(0) =3D F(32, d->ZMM_S(0), s->ZMM_S(0)); \ - } \ - \ - void glue(helper_ ## name ## pd, SUFFIX)(CPUX86State *env, Reg *d, Reg= *s)\ - { \ - d->ZMM_D(0) =3D F(64, d->ZMM_D(0), s->ZMM_D(0)); \ - d->ZMM_D(1) =3D F(64, d->ZMM_D(1), s->ZMM_D(1)); \ - } \ - \ - void helper_ ## name ## sd(CPUX86State *env, Reg *d, Reg *s) \ - { \ - d->ZMM_D(0) =3D F(64, d->ZMM_D(0), s->ZMM_D(0)); \ + Reg *v =3D d; \ + int i; \ + for (i =3D 0; i < 1 << SHIFT; i++) { \ + d->ZMM_D(i) =3D F(64, v->ZMM_D(i), s->ZMM_D(i)); \ + } \ } =20 +#if SHIFT =3D=3D 1 + +#define SSE_HELPER_S(name, F) \ + SSE_HELPER_P(name, F) \ + \ + void helper_ ## name ## ss(CPUX86State *env, 
Reg *d, Reg *s)\ + { \ + Reg *v =3D d; \ + d->ZMM_S(0) =3D F(32, v->ZMM_S(0), s->ZMM_S(0)); \ + } \ + \ + void helper_ ## name ## sd(CPUX86State *env, Reg *d, Reg *s)\ + { \ + Reg *v =3D d; \ + d->ZMM_D(0) =3D F(64, v->ZMM_D(0), s->ZMM_D(0)); \ + } + +#else + +#define SSE_HELPER_S(name, F) SSE_HELPER_P(name, F) + +#endif + #define FPU_ADD(size, a, b) float ## size ## _add(a, b, &env->sse_status) #define FPU_SUB(size, a, b) float ## size ## _sub(a, b, &env->sse_status) #define FPU_MUL(size, a, b) float ## size ## _mul(a, b, &env->sse_status) #define FPU_DIV(size, a, b) float ## size ## _div(a, b, &env->sse_status) -#define FPU_SQRT(size, a, b) float ## size ## _sqrt(b, &env->sse_status) =20 /* Note that the choice of comparison op here is important to get the * special cases right: for min and max Intel specifies that (-0,0), @@ -603,8 +621,34 @@ SSE_HELPER_S(mul, FPU_MUL) SSE_HELPER_S(div, FPU_DIV) SSE_HELPER_S(min, FPU_MIN) SSE_HELPER_S(max, FPU_MAX) -SSE_HELPER_S(sqrt, FPU_SQRT) =20 +void glue(helper_sqrtps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +{ + int i; + for (i =3D 0; i < 2 << SHIFT; i++) { + d->ZMM_S(i) =3D float32_sqrt(s->ZMM_S(i), &env->sse_status); + } +} + +void glue(helper_sqrtpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) +{ + int i; + for (i =3D 0; i < 1 << SHIFT; i++) { + d->ZMM_D(i) =3D float64_sqrt(s->ZMM_D(i), &env->sse_status); + } +} + +#if SHIFT =3D=3D 1 +void helper_sqrtss(CPUX86State *env, Reg *d, Reg *s) +{ + d->ZMM_S(0) =3D float32_sqrt(s->ZMM_S(0), &env->sse_status); +} + +void helper_sqrtsd(CPUX86State *env, Reg *d, Reg *s) +{ + d->ZMM_D(0) =3D float64_sqrt(s->ZMM_D(0), &env->sse_status); +} +#endif =20 /* float to float conversions */ void glue(helper_cvtps2pd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) @@ -822,18 +866,12 @@ int64_t helper_cvttsd2sq(CPUX86State *env, ZMMReg *s) void glue(helper_rsqrtps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { uint8_t old_flags =3D get_float_exception_flags(&env->sse_status); - d->ZMM_S(0) 
=3D float32_div(float32_one, - float32_sqrt(s->ZMM_S(0), &env->sse_status), - &env->sse_status); - d->ZMM_S(1) =3D float32_div(float32_one, - float32_sqrt(s->ZMM_S(1), &env->sse_status), - &env->sse_status); - d->ZMM_S(2) =3D float32_div(float32_one, - float32_sqrt(s->ZMM_S(2), &env->sse_status), - &env->sse_status); - d->ZMM_S(3) =3D float32_div(float32_one, - float32_sqrt(s->ZMM_S(3), &env->sse_status), - &env->sse_status); + int i; + for (i =3D 0; i < 2 << SHIFT; i++) { + d->ZMM_S(i) =3D float32_div(float32_one, + float32_sqrt(s->ZMM_S(i), &env->sse_stat= us), + &env->sse_status); + } set_float_exception_flags(old_flags, &env->sse_status); } =20 @@ -849,10 +887,10 @@ void helper_rsqrtss(CPUX86State *env, ZMMReg *d, ZMMR= eg *s) void glue(helper_rcpps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { uint8_t old_flags =3D get_float_exception_flags(&env->sse_status); - d->ZMM_S(0) =3D float32_div(float32_one, s->ZMM_S(0), &env->sse_status= ); - d->ZMM_S(1) =3D float32_div(float32_one, s->ZMM_S(1), &env->sse_status= ); - d->ZMM_S(2) =3D float32_div(float32_one, s->ZMM_S(2), &env->sse_status= ); - d->ZMM_S(3) =3D float32_div(float32_one, s->ZMM_S(3), &env->sse_status= ); + int i; + for (i =3D 0; i < 2 << SHIFT; i++) { + d->ZMM_S(i) =3D float32_div(float32_one, s->ZMM_S(i), &env->sse_st= atus); + } set_float_exception_flags(old_flags, &env->sse_status); } =20 @@ -947,18 +985,24 @@ void glue(helper_hsubpd, SUFFIX)(CPUX86State *env, ZM= MReg *d, ZMMReg *s) MOVE(*d, r); } =20 -void glue(helper_addsubps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_addsubps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - d->ZMM_S(0) =3D float32_sub(d->ZMM_S(0), s->ZMM_S(0), &env->sse_status= ); - d->ZMM_S(1) =3D float32_add(d->ZMM_S(1), s->ZMM_S(1), &env->sse_status= ); - d->ZMM_S(2) =3D float32_sub(d->ZMM_S(2), s->ZMM_S(2), &env->sse_status= ); - d->ZMM_S(3) =3D float32_add(d->ZMM_S(3), s->ZMM_S(3), &env->sse_status= ); + Reg *v =3D d; + int i; + for (i =3D 0; i < 2 
<< SHIFT; i +=3D 2) { + d->ZMM_S(i) =3D float32_sub(v->ZMM_S(i), s->ZMM_S(i), &env->sse_st= atus); + d->ZMM_S(i+1) =3D float32_add(v->ZMM_S(i+1), s->ZMM_S(i+1), &env->= sse_status); + } } =20 -void glue(helper_addsubpd, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) +void glue(helper_addsubpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - d->ZMM_D(0) =3D float64_sub(d->ZMM_D(0), s->ZMM_D(0), &env->sse_status= ); - d->ZMM_D(1) =3D float64_add(d->ZMM_D(1), s->ZMM_D(1), &env->sse_status= ); + Reg *v =3D d; + int i; + for (i =3D 0; i < 1 << SHIFT; i +=3D 2) { + d->ZMM_D(i) =3D float64_sub(v->ZMM_D(i), s->ZMM_D(i), &env->sse_st= atus); + d->ZMM_D(i+1) =3D float64_add(v->ZMM_D(i+1), s->ZMM_D(i+1), &env->= sse_status); + } } =20 /* XXX: unordered */ @@ -2258,6 +2302,8 @@ void glue(helper_aeskeygenassist, SUFFIX)(CPUX86State= *env, Reg *d, Reg *s, } #endif =20 +#undef SSE_HELPER_S + #undef SHIFT #undef XMM_ONLY #undef Reg --=20 2.37.1 From nobody Mon Feb 9 15:46:37 2026 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1662020979; cv=none; d=zohomail.com; s=zohoarc; b=oJhxB/qzhRrC4WxP41hHpAXRmXF4QWd+hbmspq54/OjSeSZsfeQiS+RlEhT3U6/ZDWmd1blOjACt9160ObU7t1CVq5IrkMkCTUmblS42q6ky72J9Jly/qlk0YZ8H9LNSg0/kH+aqTcESTm2HNhHguHUZRPInZfw/dxmiTCd6hu0= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1662020979; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=ft71QaJ3LzYCrx6idmpkX1A3Ap34QJJWk7lqbXj4fu8=; 
b=Gvch1iOxYhSDUsgaxPlvH9olLE34UUzDZEfQEpMJeDA/DmqooWSs59HYzF8dvoGgFfToPVPwZ19DUoknztZxue3WF9u4tYmDaxLJWx0ASWTqje5E0e+1INf2WobXIg51b55tP1sWBut351EQvj9Fsa4wiPAdyxxatwrJi3d5CbQ= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1662020979754987.7105856995277; Thu, 1 Sep 2022 01:29:39 -0700 (PDT) Received: from localhost ([::1]:35252 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oTfZw-0005q6-1x for importer@patchew.org; Thu, 01 Sep 2022 04:29:36 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:42658) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oTex9-0001TM-Rt for qemu-devel@nongnu.org; Thu, 01 Sep 2022 03:49:34 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:38196) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oTex7-0003Aw-60 for qemu-devel@nongnu.org; Thu, 01 Sep 2022 03:49:30 -0400 Received: from mail-wm1-f70.google.com (mail-wm1-f70.google.com [209.85.128.70]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-487-fuCPznBlNouv7Osurzd1pA-1; Thu, 01 Sep 2022 03:49:24 -0400 Received: by mail-wm1-f70.google.com with SMTP id n7-20020a1c2707000000b003a638356355so9486368wmn.2 for ; Thu, 01 Sep 2022 00:49:24 -0700 (PDT) Received: from goa-sendmail ([2001:b07:6468:f312:5e2c:eb9a:a8b6:fd3e]) by smtp.gmail.com with ESMTPSA id u20-20020a05600c19d400b003a6a3595edasm5261388wmq.27.2022.09.01.00.49.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 01 Sep 2022 00:49:22 -0700 (PDT) DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1662018566; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ft71QaJ3LzYCrx6idmpkX1A3Ap34QJJWk7lqbXj4fu8=; b=cV5lsTyVgMhjOBKi/zjO9SJN0eS8y3PiK+mHsLJyhccHKezI11rEC3BBuDh2Nnq8crnySb SqXS+lQ5d22hKz187I8TFHjNyqeVyrc8z5Bg1leMvcjjutN12syBn8jUYtQEjwwJwch6YQ 1aeBEa9G3EZcS0sjr5ncnKs9XLARvDM= X-MC-Unique: fuCPznBlNouv7Osurzd1pA-1 X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date; bh=ft71QaJ3LzYCrx6idmpkX1A3Ap34QJJWk7lqbXj4fu8=; b=0ZLvxLmY+3UjtFE0YkpM/xH/bbX3ePmiRToNii+ykIgEpsBpXX+7VgaX4x3jPMdq9v laXw/xGYpbvZkdNTdt6jUfrTrrLuHek8BDQG04S9LDWnejkMSkZyj1bRizALPzhp9Exp x09IR1Q9kPEH/xqrKkEnn/ohhofeQF+CoDyKfJ39tufA1afD91XckyMSajeDw4VOnqE9 pxO9qxvsjOzxdc3opyHNg3wEWIy0KhNPKNISABK6b6HuNxRmkG8yGDFbK7ZbaoFvqEox rF5OrZOZF2kpdaWttYoZXljXUdfJ/Dmv+AYtwYpCQq9MIJNdNhhms4HWGDtJJ8Vv0pN/ fLZg== X-Gm-Message-State: ACgBeo1/bwIjDRfi4g14VPiC8W/IvhdN7N0hBy+P1ff8ytqp83qsda8K w1KcpfyqPksxy1XblE3nTRc8xIyiCj6gsid37ZXuTGUCmYdpmRAqSe914qxGMMHchleK3bcY6t9 qrCK+ljlxGjir7UjxVehGAPfdxKzcU23eKVqdXJi1vZ8zFPAcAxTULz7Cc1RJ62zsL8s= X-Received: by 2002:adf:fc83:0:b0:226:d2d4:bc27 with SMTP id g3-20020adffc83000000b00226d2d4bc27mr11912319wrr.606.1662018563249; Thu, 01 Sep 2022 00:49:23 -0700 (PDT) X-Google-Smtp-Source: AA6agR7IPqpaUSq4NaOsblBfEHhW+hBX6PfNO7qkniU0uUvaUr5DV6p2uiY0C2HsdeQD0usIjKv98w== X-Received: by 2002:adf:fc83:0:b0:226:d2d4:bc27 with SMTP id g3-20020adffc83000000b00226d2d4bc27mr11912297wrr.606.1662018562866; Thu, 01 Sep 2022 00:49:22 -0700 (PDT) From: Paolo Bonzini To: qemu-devel@nongnu.org Cc: paul@nowt.org, richard.henderson@linaro.org Subject: [PATCH v3 17/23] i386: 
reimplement AVX comparison helpers Date: Thu, 1 Sep 2022 09:48:36 +0200 Message-Id: <20220901074842.57424-18-pbonzini@redhat.com> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20220901074842.57424-1-pbonzini@redhat.com> References: <20220901074842.57424-1-pbonzini@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=pbonzini@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1662020981924100001 Content-Type: text/plain; charset="utf-8" From: Paul Brook AVX includes an additional set of comparison predicates, some of which our softfloat implementation does not expose as separate functions. Rewrite the helpers in terms of floatN_compare for future extensibility. 
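For illustration only (this is not part of the patch): a minimal sketch of how all eight predicates can be derived from a single three-way compare result. It assumes host floats instead of softfloat. `Relation`, `compare_f32`, the `PRED_*` macros and `CMP_LANE` are hypothetical names, and the numeric ordering of the enum is an assumption chosen so that "less or equal" can be tested with `<=`, as FPU_LE does in the patch. The quiet/signalling distinction and `env->sse_status` are not modelled.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Stand-in for softfloat's relation result; names and values are assumptions. */
typedef enum {
    REL_LESS = -1,
    REL_EQUAL = 0,
    REL_GREATER = 1,
    REL_UNORDERED = 2
} Relation;

/* Hypothetical quiet compare on host floats; no exception flags are tracked. */
Relation compare_f32(float a, float b)
{
    if (isnan(a) || isnan(b)) {
        return REL_UNORDERED;
    }
    if (a < b) {
        return REL_LESS;
    }
    return a > b ? REL_GREATER : REL_EQUAL;
}

/* Predicates on the relation, mirroring the shape of FPU_EQ, FPU_LT, ... */
#define PRED_EQ(x)    ((x) == REL_EQUAL)
#define PRED_LT(x)    ((x) == REL_LESS)
#define PRED_LE(x)    ((x) <= REL_EQUAL)
#define PRED_UNORD(x) ((x) == REL_UNORDERED)

/* One result lane: all ones if predicate C holds for (a, b), else zero. */
#define CMP_LANE(C, a, b) (C(compare_f32((a), (b))) ? UINT32_MAX : 0u)

uint32_t cmpeq(float a, float b)    { return CMP_LANE(PRED_EQ, a, b); }
uint32_t cmplt(float a, float b)    { return CMP_LANE(PRED_LT, a, b); }
uint32_t cmple(float a, float b)    { return CMP_LANE(PRED_LE, a, b); }
uint32_t cmpunord(float a, float b) { return CMP_LANE(PRED_UNORD, a, b); }
/* Negated forms pass "!PRED" so the macro expands to !PRED(...). */
uint32_t cmpneq(float a, float b)   { return CMP_LANE(!PRED_EQ, a, b); }
uint32_t cmpnlt(float a, float b)   { return CMP_LANE(!PRED_LT, a, b); }
uint32_t cmpnle(float a, float b)   { return CMP_LANE(!PRED_LE, a, b); }
uint32_t cmpord(float a, float b)   { return CMP_LANE(!PRED_UNORD, a, b); }

/* Small self-check: a NaN operand makes every negated ordering predicate true. */
void cmp_self_check(void)
{
    assert(cmplt(1.0f, 2.0f) == UINT32_MAX);
    assert(cmplt(NAN, 2.0f) == 0);
    assert(cmpnlt(NAN, 2.0f) == UINT32_MAX);
}
```

With this shape, a further predicate would only need a new `PRED_*` test on the relation rather than a new comparison routine, which seems to be the extensibility the commit message refers to.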
Signed-off-by: Paul Brook
Reviewed-by: Richard Henderson
Message-Id: <20220424220204.2493824-24-paul@nowt.org>
Signed-off-by: Paolo Bonzini
---
 target/i386/ops_sse.h        | 97 ++++++++++++++++++++----------------
 target/i386/ops_sse_header.h | 24 ++++-----
 target/i386/tcg/translate.c  | 20 ++++----
 3 files changed, 75 insertions(+), 66 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index d881d03228..de874e136f 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -1005,57 +1005,66 @@ void glue(helper_addsubpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
     }
 }
 
-/* XXX: unordered */
-#define SSE_HELPER_CMP(name, F) \
-    void glue(helper_ ## name ## ps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)\
+#define SSE_HELPER_CMP_P(name, F, C) \
+    void glue(helper_ ## name ## ps, SUFFIX)(CPUX86State *env, \
+                                             Reg *d, Reg *s) \
     { \
-        d->ZMM_L(0) = F(32, d->ZMM_S(0), s->ZMM_S(0)); \
-        d->ZMM_L(1) = F(32, d->ZMM_S(1), s->ZMM_S(1)); \
-        d->ZMM_L(2) = F(32, d->ZMM_S(2), s->ZMM_S(2)); \
-        d->ZMM_L(3) = F(32, d->ZMM_S(3), s->ZMM_S(3)); \
+        Reg *v = d; \
+        int i; \
+        for (i = 0; i < 2 << SHIFT; i++) { \
+            d->ZMM_L(i) = C(F(32, v->ZMM_S(i), s->ZMM_S(i))) ? -1 : 0; \
+        } \
     } \
     \
-    void helper_ ## name ## ss(CPUX86State *env, Reg *d, Reg *s) \
+    void glue(helper_ ## name ## pd, SUFFIX)(CPUX86State *env, \
+                                             Reg *d, Reg *s) \
     { \
-        d->ZMM_L(0) = F(32, d->ZMM_S(0), s->ZMM_S(0)); \
-    } \
-    \
-    void glue(helper_ ## name ## pd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)\
-    { \
-        d->ZMM_Q(0) = F(64, d->ZMM_D(0), s->ZMM_D(0)); \
-        d->ZMM_Q(1) = F(64, d->ZMM_D(1), s->ZMM_D(1)); \
-    } \
-    \
-    void helper_ ## name ## sd(CPUX86State *env, Reg *d, Reg *s) \
-    { \
-        d->ZMM_Q(0) = F(64, d->ZMM_D(0), s->ZMM_D(0)); \
+        Reg *v = d; \
+        int i; \
+        for (i = 0; i < 1 << SHIFT; i++) { \
+            d->ZMM_Q(i) = C(F(64, v->ZMM_D(i), s->ZMM_D(i))) ? -1 : 0; \
+        } \
     }
 
-#define FPU_CMPEQ(size, a, b) \
-    (float ## size ## _eq_quiet(a, b, &env->sse_status) ? -1 : 0)
-#define FPU_CMPLT(size, a, b) \
-    (float ## size ## _lt(a, b, &env->sse_status) ? -1 : 0)
-#define FPU_CMPLE(size, a, b) \
-    (float ## size ## _le(a, b, &env->sse_status) ? -1 : 0)
-#define FPU_CMPUNORD(size, a, b) \
-    (float ## size ## _unordered_quiet(a, b, &env->sse_status) ? -1 : 0)
-#define FPU_CMPNEQ(size, a, b) \
-    (float ## size ## _eq_quiet(a, b, &env->sse_status) ? 0 : -1)
-#define FPU_CMPNLT(size, a, b) \
-    (float ## size ## _lt(a, b, &env->sse_status) ? 0 : -1)
-#define FPU_CMPNLE(size, a, b) \
-    (float ## size ## _le(a, b, &env->sse_status) ? 0 : -1)
-#define FPU_CMPORD(size, a, b) \
-    (float ## size ## _unordered_quiet(a, b, &env->sse_status) ? 0 : -1)
+#if SHIFT == 1
+#define SSE_HELPER_CMP(name, F, C) \
+    SSE_HELPER_CMP_P(name, F, C) \
+    void helper_ ## name ## ss(CPUX86State *env, Reg *d, Reg *s) \
+    { \
+        Reg *v = d; \
+        d->ZMM_L(0) = C(F(32, v->ZMM_S(0), s->ZMM_S(0))) ? -1 : 0; \
+    } \
+    \
+    void helper_ ## name ## sd(CPUX86State *env, Reg *d, Reg *s) \
+    { \
+        Reg *v = d; \
+        d->ZMM_Q(0) = C(F(64, v->ZMM_D(0), s->ZMM_D(0))) ? -1 : 0; \
+    }
 
-SSE_HELPER_CMP(cmpeq, FPU_CMPEQ)
-SSE_HELPER_CMP(cmplt, FPU_CMPLT)
-SSE_HELPER_CMP(cmple, FPU_CMPLE)
-SSE_HELPER_CMP(cmpunord, FPU_CMPUNORD)
-SSE_HELPER_CMP(cmpneq, FPU_CMPNEQ)
-SSE_HELPER_CMP(cmpnlt, FPU_CMPNLT)
-SSE_HELPER_CMP(cmpnle, FPU_CMPNLE)
-SSE_HELPER_CMP(cmpord, FPU_CMPORD)
+#define FPU_EQ(x) (x == float_relation_equal)
+#define FPU_LT(x) (x == float_relation_less)
+#define FPU_LE(x) (x <= float_relation_equal)
+#define FPU_UNORD(x) (x == float_relation_unordered)
+
+#define FPU_CMPQ(size, a, b) \
+    float ## size ## _compare_quiet(a, b, &env->sse_status)
+#define FPU_CMPS(size, a, b) \
+    float ## size ## _compare(a, b, &env->sse_status)
+
+#else
+#define SSE_HELPER_CMP(name, F, C) SSE_HELPER_CMP_P(name, F, C)
+#endif
+
+SSE_HELPER_CMP(cmpeq, FPU_CMPQ, FPU_EQ)
+SSE_HELPER_CMP(cmplt, FPU_CMPS, FPU_LT)
+SSE_HELPER_CMP(cmple, FPU_CMPS, FPU_LE)
+SSE_HELPER_CMP(cmpunord, FPU_CMPQ, FPU_UNORD)
+SSE_HELPER_CMP(cmpneq, FPU_CMPQ, !FPU_EQ)
+SSE_HELPER_CMP(cmpnlt, FPU_CMPS, !FPU_LT)
+SSE_HELPER_CMP(cmpnle, FPU_CMPS, !FPU_LE)
+SSE_HELPER_CMP(cmpord, FPU_CMPQ, !FPU_UNORD)
+
+#undef SSE_HELPER_CMP
 
 static const int comis_eflags[4] = {CC_C, CC_Z, 0, CC_Z | CC_P | CC_C};
 
diff --git a/target/i386/ops_sse_header.h b/target/i386/ops_sse_header.h
index fc697536a0..d99464afb0 100644
--- a/target/i386/ops_sse_header.h
+++ b/target/i386/ops_sse_header.h
@@ -201,20 +201,20 @@ DEF_HELPER_3(glue(hsubpd, SUFFIX), void, env, ZMMReg, ZMMReg)
 DEF_HELPER_3(glue(addsubps, SUFFIX), void, env, ZMMReg, ZMMReg)
 DEF_HELPER_3(glue(addsubpd, SUFFIX), void, env, ZMMReg, ZMMReg)
 
-#define SSE_HELPER_CMP(name, F) \
-    DEF_HELPER_3(glue(name ## ps, SUFFIX), void, env, Reg, Reg) \
-    DEF_HELPER_3(name ## ss, void, env, Reg, Reg) \
-    DEF_HELPER_3(glue(name ## pd, SUFFIX), void, env, Reg, Reg) \
+#define SSE_HELPER_CMP(name, F, C) \
+    DEF_HELPER_3(glue(name ## ps, SUFFIX), void, env, Reg, Reg) \
+    DEF_HELPER_3(name ## ss, void, env, Reg, Reg) \
+    DEF_HELPER_3(glue(name ## pd, SUFFIX), void, env, Reg, Reg) \
     DEF_HELPER_3(name ## sd, void, env, Reg, Reg)
 
-SSE_HELPER_CMP(cmpeq, FPU_CMPEQ)
-SSE_HELPER_CMP(cmplt, FPU_CMPLT)
-SSE_HELPER_CMP(cmple, FPU_CMPLE)
-SSE_HELPER_CMP(cmpunord, FPU_CMPUNORD)
-SSE_HELPER_CMP(cmpneq, FPU_CMPNEQ)
-SSE_HELPER_CMP(cmpnlt, FPU_CMPNLT)
-SSE_HELPER_CMP(cmpnle, FPU_CMPNLE)
-SSE_HELPER_CMP(cmpord, FPU_CMPORD)
+SSE_HELPER_CMP(cmpeq, FPU_CMPQ, FPU_EQ)
+SSE_HELPER_CMP(cmplt, FPU_CMPS, FPU_LT)
+SSE_HELPER_CMP(cmple, FPU_CMPS, FPU_LE)
+SSE_HELPER_CMP(cmpunord, FPU_CMPQ, FPU_UNORD)
+SSE_HELPER_CMP(cmpneq, FPU_CMPQ, !FPU_EQ)
+SSE_HELPER_CMP(cmpnlt, FPU_CMPS, !FPU_LT)
+SSE_HELPER_CMP(cmpnle, FPU_CMPS, !FPU_LE)
+SSE_HELPER_CMP(cmpord, FPU_CMPQ, !FPU_UNORD)
 
 DEF_HELPER_3(ucomiss, void, env, Reg, Reg)
 DEF_HELPER_3(comiss, void, env, Reg, Reg)
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 99c84473f4..fc081e6ad6 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3022,20 +3022,20 @@ static const SSEFunc_l_ep sse_op_table3bq[] = {
 };
 #endif
 
-#define SSE_FOP(x) { \
+#define SSE_CMP(x) { \
     gen_helper_ ## x ## ps ## _xmm, gen_helper_ ## x ## pd ## _xmm, \
     gen_helper_ ## x ## ss, gen_helper_ ## x ## sd}
 static const SSEFunc_0_epp sse_op_table4[8][4] = {
-    SSE_FOP(cmpeq),
-    SSE_FOP(cmplt),
-    SSE_FOP(cmple),
-    SSE_FOP(cmpunord),
-    SSE_FOP(cmpneq),
-    SSE_FOP(cmpnlt),
-    SSE_FOP(cmpnle),
-    SSE_FOP(cmpord),
+    SSE_CMP(cmpeq),
+    SSE_CMP(cmplt),
+    SSE_CMP(cmple),
+    SSE_CMP(cmpunord),
+    SSE_CMP(cmpneq),
+    SSE_CMP(cmpnlt),
+    SSE_CMP(cmpnle),
+    SSE_CMP(cmpord),
 };
-#undef SSE_FOP
+#undef SSE_CMP
 
 static const SSEFunc_0_epp sse_op_table5[256] = {
     [0x0c] = gen_helper_pi2fw,
-- 
2.37.1

From nobody Mon Feb 9 15:46:37 2026 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) 
smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1662022678; cv=none; d=zohomail.com; s=zohoarc; b=i2lGqlD75YjlHJfeiBj780z6Jw4hkQXswgWs+CaJydE9qCdj+I+7cvxd/PW7lX72nIuQkI3WtYuY2Nez1faZYihoTfFyqkNxEKzTa1ybj1WoAuOBU0W1cpAnCbCSvU9yJzZBc3zoW/un1cnCVqJSOMm+6KgJpG81J936Y+tVNFY= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1662022678; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=dT+Cm0N8qaLNhVPC331ez8mhNQ20GiFjLqzVTX8b194=; b=iZORHQKaeCqzK/iI2Yy9djhagFIGe9Dpt+f81A5vkpl8Bz06my7Nkh+JLNYIewrOPqjrTEcasrbydqs8UpFk1Ku3Lxml1IIrQnv3FkEp/v0YKv/9YByshUSuwf1ywOAsm9rW5n8orqCYMcuyMkrnchy3rm03HPiOPckEEhmwzuA= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1662022678632869.850582484817; Thu, 1 Sep 2022 01:57:58 -0700 (PDT) Received: from localhost ([::1]:35630 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oTg1M-0005dn-Bk for importer@patchew.org; Thu, 01 Sep 2022 04:57:56 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:42662) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oTexA-0001TV-GT for qemu-devel@nongnu.org; Thu, 01 Sep 2022 03:49:39 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:51390) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oTex8-0003BA-GW for qemu-devel@nongnu.org; Thu, 01 Sep 2022 
03:49:32 -0400 Received: from mail-wm1-f69.google.com (mail-wm1-f69.google.com [209.85.128.69]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-513-yjn1Q9vuOoChF09bw0Snbg-1; Thu, 01 Sep 2022 03:49:26 -0400 Received: by mail-wm1-f69.google.com with SMTP id j36-20020a05600c1c2400b003a540d88677so9479620wms.1 for ; Thu, 01 Sep 2022 00:49:26 -0700 (PDT) Received: from goa-sendmail ([93.56.160.208]) by smtp.gmail.com with ESMTPSA id p6-20020a5d48c6000000b002252884cc91sm13699840wrs.43.2022.09.01.00.49.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 01 Sep 2022 00:49:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1662018569; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=dT+Cm0N8qaLNhVPC331ez8mhNQ20GiFjLqzVTX8b194=; b=CCM/CFATP8/FN/wrb677NydjXa65iGcgdx3zVf39Rnsobb8BWPRYGF3aG6pmv4no4HY1Bd nJStvJMtyKZ6IZex+4xtpA2JFRlZeGyqBUBsK7hMHEKbJWZ7Xcl4tT77XLBtcPfJcoMPst mFITR7EIEIxeAg+WWtZf9ru5YX3dijA= X-MC-Unique: yjn1Q9vuOoChF09bw0Snbg-1 X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date; bh=dT+Cm0N8qaLNhVPC331ez8mhNQ20GiFjLqzVTX8b194=; b=CW9FwCOw+QuPkOsxWUX6RdnA5Dp5PuVTWbokDFA6ZZ0vGU7YSIYwp8tCHQrWphsnkq aZq5yKZtyTwBxg5+MzEDLevZmxcs/Klg+iUJmAhUQdjVQ8vX+U7UOcKFtESF8nesdjBE Ubj1cejnQVnks4YDQtRK0bRXDCdmKuUtDxfKI8/k3lzLchSmje41Vems8T/z4wqAVhVO HWmBH3g8GtsYN6BNzos9Nu3mjVLvfhv3lD8lqaSdH/quBoPJtgtRVSUrsIZy8x4ELnyB dKvdDLMUAWxF4lB5a11iezGCuWN4K6vA3LHkQL+aCIMCexAmclwKs7l1jdfZVBc/I0b/ 0T5g== X-Gm-Message-State: ACgBeo17YCbFzMg6C5GnWgmfrIxoDQhMZQqGvyqH5FBgFEx9B/MgW+l4 
Eu5K2BPmq62EtOcszQVO/gSufc4pMZsQr3Ln2EN6hpkH7S14rDM4U0o/eDkHvOEFbvOpbkGJPtH dQbdlzV9A/LulnztRWmalTVXg8MwlIXnWblAXduagdro7b0eE5IGBRkMKNpdWvqjBkNo= X-Received: by 2002:a05:600c:4e52:b0:3a6:d89:4d1b with SMTP id e18-20020a05600c4e5200b003a60d894d1bmr4360719wmq.150.1662018564799; Thu, 01 Sep 2022 00:49:24 -0700 (PDT) X-Google-Smtp-Source: AA6agR65/i8+miWFhL5mdsvqODo+bGnsD9xQQL4HJViQzFPkrkXUtztTfM8/RF2M7ro8L+pblSFyWA== X-Received: by 2002:a05:600c:4e52:b0:3a6:d89:4d1b with SMTP id e18-20020a05600c4e5200b003a60d894d1bmr4360708wmq.150.1662018564536; Thu, 01 Sep 2022 00:49:24 -0700 (PDT) From: Paolo Bonzini To: qemu-devel@nongnu.org Cc: paul@nowt.org, richard.henderson@linaro.org Subject: [PATCH v3 18/23] i386: Dot product AVX helper prep Date: Thu, 1 Sep 2022 09:48:37 +0200 Message-Id: <20220901074842.57424-19-pbonzini@redhat.com> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20220901074842.57424-1-pbonzini@redhat.com> References: <20220901074842.57424-1-pbonzini@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.129.124; envelope-from=pbonzini@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity 
@redhat.com) X-ZM-MESSAGEID: 1662022680328100001 Content-Type: text/plain; charset="utf-8"
From: Paul Brook

Make the dpps and dppd helpers AVX-ready.

I can't see any obvious reason why dppd shouldn't work on 256 bit ymm
registers, but both AMD and Intel agree that it's xmm only.

Signed-off-by: Paul Brook
Message-Id: <20220424220204.2493824-17-paul@nowt.org>
Reviewed-by: Richard Henderson
Signed-off-by: Paolo Bonzini
---
 target/i386/ops_sse.h | 80 ++++++++++++++++++++++++-------------------
 1 file changed, 45 insertions(+), 35 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index de874e136f..59ed30071e 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -1903,55 +1903,64 @@ SSE_HELPER_I(helper_blendps, L, 4, FBLENDP)
 SSE_HELPER_I(helper_blendpd, Q, 2, FBLENDP)
 SSE_HELPER_I(helper_pblendw, W, 8, FBLENDP)
 
-void glue(helper_dpps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t mask)
+void glue(helper_dpps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
+                               uint32_t mask)
 {
+    Reg *v = d;
     float32 prod1, prod2, temp2, temp3, temp4;
+    int i;
 
-    /*
-     * We must evaluate (A+B)+(C+D), not ((A+B)+C)+D
-     * to correctly round the intermediate results
-     */
-    if (mask & (1 << 4)) {
-        prod1 = float32_mul(d->ZMM_S(0), s->ZMM_S(0), &env->sse_status);
-    } else {
-        prod1 = float32_zero;
-    }
-    if (mask & (1 << 5)) {
-        prod2 = float32_mul(d->ZMM_S(1), s->ZMM_S(1), &env->sse_status);
-    } else {
-        prod2 = float32_zero;
-    }
-    temp2 = float32_add(prod1, prod2, &env->sse_status);
-    if (mask & (1 << 6)) {
-        prod1 = float32_mul(d->ZMM_S(2), s->ZMM_S(2), &env->sse_status);
-    } else {
-        prod1 = float32_zero;
-    }
-    if (mask & (1 << 7)) {
-        prod2 = float32_mul(d->ZMM_S(3), s->ZMM_S(3), &env->sse_status);
-    } else {
-        prod2 = float32_zero;
-    }
-    temp3 = float32_add(prod1, prod2, &env->sse_status);
-    temp4 = float32_add(temp2, temp3, &env->sse_status);
+    for (i = 0; i < 2 << SHIFT; i += 4) {
+        /*
+         * We must evaluate (A+B)+(C+D), not ((A+B)+C)+D
+         * to correctly round the intermediate results
+         */
+        if (mask & (1 << 4)) {
+            prod1 = float32_mul(v->ZMM_S(i), s->ZMM_S(i), &env->sse_status);
+        } else {
+            prod1 = float32_zero;
+        }
+        if (mask & (1 << 5)) {
+            prod2 = float32_mul(v->ZMM_S(i+1), s->ZMM_S(i+1), &env->sse_status);
+        } else {
+            prod2 = float32_zero;
+        }
+        temp2 = float32_add(prod1, prod2, &env->sse_status);
+        if (mask & (1 << 6)) {
+            prod1 = float32_mul(v->ZMM_S(i+2), s->ZMM_S(i+2), &env->sse_status);
+        } else {
+            prod1 = float32_zero;
+        }
+        if (mask & (1 << 7)) {
+            prod2 = float32_mul(v->ZMM_S(i+3), s->ZMM_S(i+3), &env->sse_status);
+        } else {
+            prod2 = float32_zero;
+        }
+        temp3 = float32_add(prod1, prod2, &env->sse_status);
+        temp4 = float32_add(temp2, temp3, &env->sse_status);
 
-    d->ZMM_S(0) = (mask & (1 << 0)) ? temp4 : float32_zero;
-    d->ZMM_S(1) = (mask & (1 << 1)) ? temp4 : float32_zero;
-    d->ZMM_S(2) = (mask & (1 << 2)) ? temp4 : float32_zero;
-    d->ZMM_S(3) = (mask & (1 << 3)) ? temp4 : float32_zero;
+        d->ZMM_S(i) = (mask & (1 << 0)) ? temp4 : float32_zero;
+        d->ZMM_S(i+1) = (mask & (1 << 1)) ? temp4 : float32_zero;
+        d->ZMM_S(i+2) = (mask & (1 << 2)) ? temp4 : float32_zero;
+        d->ZMM_S(i+3) = (mask & (1 << 3)) ? temp4 : float32_zero;
+    }
 }
 
-void glue(helper_dppd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t mask)
+#if SHIFT == 1
+/* Oddly, there is no ymm version of dppd */
+void glue(helper_dppd, SUFFIX)(CPUX86State *env,
+                               Reg *d, Reg *s, uint32_t mask)
 {
+    Reg *v = d;
     float64 prod1, prod2, temp2;
 
     if (mask & (1 << 4)) {
-        prod1 = float64_mul(d->ZMM_D(0), s->ZMM_D(0), &env->sse_status);
+        prod1 = float64_mul(v->ZMM_D(0), s->ZMM_D(0), &env->sse_status);
     } else {
         prod1 = float64_zero;
     }
     if (mask & (1 << 5)) {
-        prod2 = float64_mul(d->ZMM_D(1), s->ZMM_D(1), &env->sse_status);
+        prod2 = float64_mul(v->ZMM_D(1), s->ZMM_D(1), &env->sse_status);
     } else {
         prod2 = float64_zero;
     }
@@ -1959,6 +1968,7 @@ void glue(helper_dppd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t mask)
     d->ZMM_D(0) = (mask & (1 << 0)) ? temp2 : float64_zero;
     d->ZMM_D(1) = (mask & (1 << 1)) ? temp2 : float64_zero;
 }
+#endif
 
 void glue(helper_mpsadbw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
                                   uint32_t offset)
-- 
2.37.1

From nobody Mon Feb 9 15:46:37 2026 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1662021571; cv=none; d=zohomail.com; s=zohoarc; b=ZtE6TcoQurFNBwU4dkhueff85+7VjjbT5s0ao2MYj4wQXyF1kK+NAmrRereC0vpxfTBR7eTajw/+Gt0WuPM3NVbWKI5wpyARsXDs5c070Z9SWm9lgmUL6/y96qGeU8oQdWh0lJXyr9Nvn3INpr2DomQ5EvaWMd4KH7JcNJJhmEE= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1662021571; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=cvRdYXqxKilYPGCf8ByPEEOUIBIjFXJU4or0s3GS58o=; 
b=Vh+riFOqu2bD3ImEa0W2Ut0ro6PU1f0en1SCRuSRM2/EGWEZW+Aniiml2GHiQ40baFQ62DAL5TpGoBuhc3qKIxMhGTxGotSpRkPynWg/kNPq07XMXWvhfaHJHj9POIOhPz83VzteQG5sy82b7+T7WTWfTXSDFZ7e9Da6ay9gb9o= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1662021571106412.6693293988898; Thu, 1 Sep 2022 01:39:31 -0700 (PDT) Received: from localhost ([::1]:51740 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oTfjW-0001LC-4v for importer@patchew.org; Thu, 01 Sep 2022 04:39:30 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:42660) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oTexA-0001TU-Bz for qemu-devel@nongnu.org; Thu, 01 Sep 2022 03:49:39 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:34107) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oTex7-0003B1-HF for qemu-devel@nongnu.org; Thu, 01 Sep 2022 03:49:32 -0400 Received: from mail-wr1-f70.google.com (mail-wr1-f70.google.com [209.85.221.70]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-586-6udRPbS5PVeKPnWyoWAO8w-1; Thu, 01 Sep 2022 03:49:27 -0400 Received: by mail-wr1-f70.google.com with SMTP id r20-20020adfb1d4000000b002258c581ba2so2831887wra.1 for ; Thu, 01 Sep 2022 00:49:27 -0700 (PDT) Received: from goa-sendmail ([93.56.160.208]) by smtp.gmail.com with ESMTPSA id j38-20020a05600c1c2600b003a5c75bd36fsm4731660wms.10.2022.09.01.00.49.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 01 Sep 2022 00:49:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=redhat.com; s=mimecast20190719; t=1662018569; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=cvRdYXqxKilYPGCf8ByPEEOUIBIjFXJU4or0s3GS58o=; b=AiLbSpBLJntM5nOQNVT7aRpeZwFTxx83s2BIObZtFomG37N+MfHx3WgoqtK+Wn98/PfQqS DUH0Wq15CK1f9mGPyxtnhxfO5hNT9W0xeoCGvCHx9vNRya0M8LICjF7iWsAOBDLYzTGFtK eaJ/TwbHMqSDyB6LOKyaVFsHfI0Pdvc= X-MC-Unique: 6udRPbS5PVeKPnWyoWAO8w-1 X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date; bh=cvRdYXqxKilYPGCf8ByPEEOUIBIjFXJU4or0s3GS58o=; b=GN64axDb1dT3vVZ1jSqAU2faQMeZ7iOXZQm/Y6FF9PedHS6ygEUoOTmjZ0CegsRVEM yb5+oJnNj617KdirFDkcgAVWq27vYgCGbvbo/yh+wfbGO9EEtVBTp1KdB2F5P4WYycDa 7l3JFIhHgf1IeY9/KrAaQJDErQtasARdIxvM0Vnn5EJkLjBA3aPx18xPqPgF9rWkBeBg xcTanPMXacXZrnYURL+uoh0WdCbJCeZ1VZwKMQcKFpklA6t9gRAf6fEitCHRQUgt2Bbu iP/vjfS/+7iG/cRGS/tT48DQV+U8jxdeUHPm94TNMh0y7NL1rgJeM1ZWl86o0K6+ZiNL eBUg== X-Gm-Message-State: ACgBeo3o+J8faIk6B/Lbu9XxTWuJ7SEXbTxNVKJ0It27OC2azqHHHo3o N/0CbX+OjFHTmsX+ct8/Lx1G79B6PAYFK7t9OnDob8BwuTnBXQWSDWm+PCTdZF2WXU5gFmNvWxC qYs3T86cRobfRD36e25HqOfkOh+Am4vX6UyKnZ0ZjI16xrVQ8+bjqQy0KwyP5/5w4Ywo= X-Received: by 2002:a05:600c:4e8c:b0:3a6:11e:cc08 with SMTP id f12-20020a05600c4e8c00b003a6011ecc08mr4247308wmq.198.1662018566610; Thu, 01 Sep 2022 00:49:26 -0700 (PDT) X-Google-Smtp-Source: AA6agR7IYi4LshRzlAbT755GY6I5+VeivcH+2+8NX195yN2qNkiOytJnNLSZpzl9oQOoVIKEAvc1KA== X-Received: by 2002:a05:600c:4e8c:b0:3a6:11e:cc08 with SMTP id f12-20020a05600c4e8c00b003a6011ecc08mr4247292wmq.198.1662018566380; Thu, 01 Sep 2022 00:49:26 -0700 (PDT) From: Paolo Bonzini To: qemu-devel@nongnu.org Cc: paul@nowt.org, richard.henderson@linaro.org Subject: [PATCH v3 19/23] i386: Destructive FP helpers for 
AVX Date: Thu, 1 Sep 2022 09:48:38 +0200 Message-Id: <20220901074842.57424-20-pbonzini@redhat.com> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20220901074842.57424-1-pbonzini@redhat.com> References: <20220901074842.57424-1-pbonzini@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=pbonzini@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1662021572955100001 Content-Type: text/plain; charset="utf-8"
From: Paul Brook

Prepare the horizontal arithmetic vector helpers for AVX.

These currently use a dummy Reg-typed variable to store the result, then
assign the whole register. This will cause 128 bit operations to corrupt
the upper half of the register, so replace it with explicit temporaries
and element assignments.
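For illustration only (not part of the patch): a minimal sketch of the problem described above, assuming a hypothetical eight-float `Reg` whose elements 4..7 stand for the upper half of a 256 bit register, and host `float` addition in place of float32_add with `env->sse_status`. The function names are made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 256 bit register: eight floats, s[4..7] is the upper half. */
typedef struct {
    float s[8];
} Reg;

/* Old pattern: build the result in a whole-register temporary, then copy it. */
void hadd128_whole(Reg *d, const Reg *s)
{
    Reg r = { {0} };   /* zeroed here only to keep the sketch well defined */

    r.s[0] = d->s[0] + d->s[1];
    r.s[1] = d->s[2] + d->s[3];
    r.s[2] = s->s[0] + s->s[1];
    r.s[3] = s->s[2] + s->s[3];
    *d = r;            /* whole-register copy also overwrites d->s[4..7] */
}

/* New pattern: explicit temporaries, then per-element assignment. */
void hadd128_elems(Reg *d, const Reg *s)
{
    float r[4];
    size_t i;

    /* Read every input first, since d is both a source and the destination. */
    r[0] = d->s[0] + d->s[1];
    r[1] = d->s[2] + d->s[3];
    r[2] = s->s[0] + s->s[1];
    r[3] = s->s[2] + s->s[3];
    for (i = 0; i < 4; i++) {
        d->s[i] = r[i];    /* only the low 128 bits are touched */
    }
}

/* Small self-check: the element-wise variant leaves the upper half alone. */
void hadd_self_check(void)
{
    Reg d = { {1, 2, 3, 4, 5, 6, 7, 8} };
    Reg s = { {10, 20, 30, 40, 50, 60, 70, 80} };

    hadd128_elems(&d, &s);
    assert(d.s[0] == 3.0f && d.s[3] == 70.0f);
    assert(d.s[4] == 5.0f && d.s[7] == 8.0f);
}
```

The temporaries are still needed in the second variant because the destination is also an input; only the final whole-register copy is dropped.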
Signed-off-by: Paul Brook
Message-Id: <20220424220204.2493824-18-paul@nowt.org>
Reviewed-by: Richard Henderson
Signed-off-by: Paolo Bonzini
---
 target/i386/ops_sse.h | 93 ++++++++++++++++++-------------------------
 1 file changed, 39 insertions(+), 54 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 59ed30071e..61722fe4a2 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -945,45 +927,49 @@ void helper_insertq_i(CPUX86State *env, ZMMReg *d, int index, int length)
     d->ZMM_Q(0) = helper_insertq(d->ZMM_Q(0), index, length);
 }
 
-void glue(helper_haddps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
-{
-    ZMMReg r;
-
-    r.ZMM_S(0) = float32_add(d->ZMM_S(0), d->ZMM_S(1), &env->sse_status);
-    r.ZMM_S(1) = float32_add(d->ZMM_S(2), d->ZMM_S(3), &env->sse_status);
-    r.ZMM_S(2) = float32_add(s->ZMM_S(0), s->ZMM_S(1), &env->sse_status);
-    r.ZMM_S(3) = float32_add(s->ZMM_S(2), s->ZMM_S(3), &env->sse_status);
-    MOVE(*d, r);
+#define SSE_HELPER_HPS(name, F) \
+void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \
+{ \
+    Reg *v = d; \
+    float32 r[2 << SHIFT]; \
+    int i, j, k; \
+    for (k = 0; k < 2 << SHIFT; k += LANE_WIDTH / 4) { \
+        for (i = j = 0; j < 4; i++, j += 2) { \
+            r[i + k] = F(v->ZMM_S(j + k), v->ZMM_S(j + k + 1), &env->sse_status); \
+        } \
+        for (j = 0; j < 4; i++, j += 2) { \
+            r[i + k] = F(s->ZMM_S(j + k), s->ZMM_S(j + k + 1), &env->sse_status); \
+        } \
+    } \
+    for (i = 0; i < 2 << SHIFT; i++) { \
+        d->ZMM_S(i) = r[i]; \
+    } \
 }
 
-void glue(helper_haddpd, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
-{
-    ZMMReg r;
+SSE_HELPER_HPS(haddps, float32_add)
+SSE_HELPER_HPS(hsubps, float32_sub)
 
-    r.ZMM_D(0) = float64_add(d->ZMM_D(0), d->ZMM_D(1), &env->sse_status);
-    r.ZMM_D(1) = float64_add(s->ZMM_D(0), s->ZMM_D(1), &env->sse_status);
-    MOVE(*d, r);
+#define SSE_HELPER_HPD(name, F) \
+void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \
+{ \
+    Reg *v = d; \
+    float64 r[1 << SHIFT]; \
+    int i, j, k; \
+    for (k = 0; k < 1 << SHIFT; k += LANE_WIDTH / 8) { \
+        for (i = j = 0; j < 2; i++, j += 2) { \
+            r[i + k] = F(v->ZMM_D(j + k), v->ZMM_D(j + k + 1), &env->sse_status); \
+        } \
+        for (j = 0; j < 2; i++, j += 2) { \
+            r[i + k] = F(s->ZMM_D(j + k), s->ZMM_D(j + k + 1), &env->sse_status); \
+        } \
+    } \
+    for (i = 0; i < 1 << SHIFT; i++) { \
+        d->ZMM_D(i) = r[i]; \
+    } \
 }
 
-void glue(helper_hsubps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
-{
-    ZMMReg r;
-
-    r.ZMM_S(0) = float32_sub(d->ZMM_S(0), d->ZMM_S(1), &env->sse_status);
-    r.ZMM_S(1) = float32_sub(d->ZMM_S(2), d->ZMM_S(3), &env->sse_status);
-    r.ZMM_S(2) = float32_sub(s->ZMM_S(0), s->ZMM_S(1), &env->sse_status);
-    r.ZMM_S(3) = float32_sub(s->ZMM_S(2), s->ZMM_S(3), &env->sse_status);
-    MOVE(*d, r);
-}
-
-void glue(helper_hsubpd, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s)
-{
-    ZMMReg r;
-
-    r.ZMM_D(0) = float64_sub(d->ZMM_D(0), d->ZMM_D(1), &env->sse_status);
-    r.ZMM_D(1) = float64_sub(s->ZMM_D(0), s->ZMM_D(1), &env->sse_status);
-    MOVE(*d, r);
-}
+SSE_HELPER_HPD(haddpd, float64_add)
+SSE_HELPER_HPD(hsubpd, float64_sub)
 
 void glue(helper_addsubps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
-- 
2.37.1

From nobody Mon Feb 9 15:46:37 2026 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1662021346; cv=none; d=zohomail.com; s=zohoarc; b=Srv2SfiFiSNOTUFZBQTRGm6ncNJPxTZUMoMrOVUIUOSUwsl90W72Ka9DacNH94VEdz3AauJG+iBbOSke3SPKlgKQuZOo3CtSk2EBcU1krZV9oOcTWYVhF3XuDXZ6onWtJ94EdcHEzwDSD+BNSBZV1nUCeSNsUKdeaI/vnhDovag= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1662021346; 
h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=JT9nUfXaAwH3lAfSMWeiHOzyGcwJZ/vNY0rng0C6F/Q=; b=MJHzzqdRVthf74Y0CfeC2I7Uum9H4dwvaLFNCrkcWQCJ7s+sKwB/Tc8qkJA+zzLQW+NfTyoA+lt5jSGrcWKxEgVKSjxNWaUyL50zEArAIKNEaZ70nQtWoRF5QfA/RBgUOwLKFzdAUl8/5YXbHQvgAT7aIDIUCAg7OSlQyfwufNQ= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1662021346455890.0842552809679; Thu, 1 Sep 2022 01:35:46 -0700 (PDT) Received: from localhost ([::1]:35388 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oTfft-0003fc-3h for importer@patchew.org; Thu, 01 Sep 2022 04:35:45 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:42666) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oTexB-0001Tn-On for qemu-devel@nongnu.org; Thu, 01 Sep 2022 03:49:39 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:47597) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oTex9-0003BN-Hm for qemu-devel@nongnu.org; Thu, 01 Sep 2022 03:49:33 -0400 Received: from mail-wm1-f72.google.com (mail-wm1-f72.google.com [209.85.128.72]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-494-BovkWOYGOT6WL2EXmF-R8g-1; Thu, 01 Sep 2022 03:49:30 -0400 Received: by mail-wm1-f72.google.com with SMTP id j36-20020a05600c1c2400b003a540d88677so9479710wms.1 for ; Thu, 01 Sep 2022 00:49:29 -0700 (PDT) Received: from goa-sendmail ([93.56.160.208]) by smtp.gmail.com 
with ESMTPSA id l42-20020a05600c1d2a00b003a3170a7af9sm4911883wms.4.2022.09.01.00.49.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 01 Sep 2022 00:49:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1662018571; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=JT9nUfXaAwH3lAfSMWeiHOzyGcwJZ/vNY0rng0C6F/Q=; b=Htg/TmVpMk/VQXvE3r9O2LWXUCS52X+1AX+t1vzaOwXFLwRFohk0cRX4+pT9op4RfY5ipH sRuKLUyjqyPf4HfSLeCQxj6023OT5VrlkFtokoo7ZurQU2cgK8xu08eMEEuETv9L8Wulwk 9RsbJEREjFt445EZ+Rkh/kCQqJ+91gk= X-MC-Unique: BovkWOYGOT6WL2EXmF-R8g-1 X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date; bh=JT9nUfXaAwH3lAfSMWeiHOzyGcwJZ/vNY0rng0C6F/Q=; b=Qyu8u+yeDAbmEcHK0Pf50ng2vK9PNM0AI3zPyFqbZV8NXdqwgwhpveQC32asRbbp0T 8YUVzEVGzwj4TzBcql1ddLgYw8Uz2RGYG+GsK5H8hbWzcaYbmFtbu1pXzBpDfHr4ItWN JTo4aY1uc9MVxCkLM5Xhy5h9SfjNbg0/3/Wv4wy5YnB1DqP/tT1hZ5vbbhbe84WumS5W pV1eSt/mZAIcrEUhHQf3DCTfIQVQ1RHPZ+6duB6qszAVBzKqMglwLu+6slmVFsK1YkEV MuWXdRSzfv7SfjvRgbdoxBmMQwtCTTyLzanZuqQEPXY3dH1rUZHFtf/9BCECEcRqC93h bCCg== X-Gm-Message-State: ACgBeo1lE7s44T9kJUO5Mw2oduFMxjia9Iyl/rHQtsXyzacXV+RAjXWR AZqvSTtbw/R9dLWJecA4rOjkOB34yj3fzpgUsBMcRH2jCN15WfH24xhEI7E5ssVQnUFF+I+R1Yb V+vS/duyJnjhIL1cZIboOa30IZugYsll1xogJ2dHdwVKvCFuobaZcqpiYsxxQAOgwFbs= X-Received: by 2002:adf:f081:0:b0:225:7209:4bd7 with SMTP id n1-20020adff081000000b0022572094bd7mr14526112wro.36.1662018568542; Thu, 01 Sep 2022 00:49:28 -0700 (PDT) X-Google-Smtp-Source: AA6agR6nUv/1cWI8/4JY6+PxaKrBbyTmbevJcZStvuVzEW1/lV6ea7eaQW2HAym45oxQDB+7CjXtCQ== X-Received: by 2002:adf:f081:0:b0:225:7209:4bd7 with SMTP id 
n1-20020adff081000000b0022572094bd7mr14526087wro.36.1662018568150; Thu, 01 Sep 2022 00:49:28 -0700 (PDT) From: Paolo Bonzini To: qemu-devel@nongnu.org Cc: paul@nowt.org, richard.henderson@linaro.org Subject: [PATCH v3 20/23] i386: Misc AVX helper prep Date: Thu, 1 Sep 2022 09:48:39 +0200 Message-Id: <20220901074842.57424-21-pbonzini@redhat.com> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20220901074842.57424-1-pbonzini@redhat.com> References: <20220901074842.57424-1-pbonzini@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=pbonzini@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1662021347897100001 Content-Type: text/plain; charset="utf-8" From: Paul Brook Fixup various vector helpers that either trivially extend to 256 bit, or don't have 256 bit variants.
No functional changes to existing helpers Signed-off-by: Paul Brook Message-Id: <20220424220204.2493824-19-paul@nowt.org> Reviewed-by: Richard Henderson Signed-off-by: Paolo Bonzini --- target/i386/ops_sse.h | 143 +++++++++++++++++++++++++++--------------- 1 file changed, 94 insertions(+), 49 deletions(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index 61722fe4a2..7cfbcce49f 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -422,6 +422,7 @@ void glue(helper_psadbw, SUFFIX)(CPUX86State *env, Reg = *d, Reg *s) } } =20 +#if SHIFT < 2 void glue(helper_maskmov, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, target_ulong a0) { @@ -433,6 +434,7 @@ void glue(helper_maskmov, SUFFIX)(CPUX86State *env, Reg= *d, Reg *s, } } } +#endif =20 void glue(helper_movl_mm_T0, SUFFIX)(Reg *d, uint32_t val) { @@ -635,21 +637,24 @@ void helper_sqrtsd(CPUX86State *env, Reg *d, Reg *s) /* float to float conversions */ void glue(helper_cvtps2pd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - float32 s0, s1; - - s0 =3D s->ZMM_S(0); - s1 =3D s->ZMM_S(1); - d->ZMM_D(0) =3D float32_to_float64(s0, &env->sse_status); - d->ZMM_D(1) =3D float32_to_float64(s1, &env->sse_status); + int i; + for (i =3D 1 << SHIFT; --i >=3D 0; ) { + d->ZMM_D(i) =3D float32_to_float64(s->ZMM_S(i), &env->sse_status); + } } =20 void glue(helper_cvtpd2ps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - d->ZMM_S(0) =3D float64_to_float32(s->ZMM_D(0), &env->sse_status); - d->ZMM_S(1) =3D float64_to_float32(s->ZMM_D(1), &env->sse_status); - d->Q(1) =3D 0; + int i; + for (i =3D 0; i < 1 << SHIFT; i++) { + d->ZMM_S(i) =3D float64_to_float32(s->ZMM_D(i), &env->sse_status); + } + for (i >>=3D 1; i < 1 << SHIFT; i++) { + d->Q(i) =3D 0; + } } =20 +#if SHIFT =3D=3D 1 void helper_cvtss2sd(CPUX86State *env, Reg *d, Reg *s) { d->ZMM_D(0) =3D float32_to_float64(s->ZMM_S(0), &env->sse_status); @@ -659,26 +664,27 @@ void helper_cvtsd2ss(CPUX86State *env, Reg *d, Reg *s) { d->ZMM_S(0) =3D 
float64_to_float32(s->ZMM_D(0), &env->sse_status); } +#endif =20 /* integer to float */ void glue(helper_cvtdq2ps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - d->ZMM_S(0) =3D int32_to_float32(s->ZMM_L(0), &env->sse_status); - d->ZMM_S(1) =3D int32_to_float32(s->ZMM_L(1), &env->sse_status); - d->ZMM_S(2) =3D int32_to_float32(s->ZMM_L(2), &env->sse_status); - d->ZMM_S(3) =3D int32_to_float32(s->ZMM_L(3), &env->sse_status); + int i; + for (i =3D 0; i < 2 << SHIFT; i++) { + d->ZMM_S(i) =3D int32_to_float32(s->ZMM_L(i), &env->sse_status); + } } =20 void glue(helper_cvtdq2pd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { - int32_t l0, l1; - - l0 =3D (int32_t)s->ZMM_L(0); - l1 =3D (int32_t)s->ZMM_L(1); - d->ZMM_D(0) =3D int32_to_float64(l0, &env->sse_status); - d->ZMM_D(1) =3D int32_to_float64(l1, &env->sse_status); + int i; + for (i =3D 1 << SHIFT; --i >=3D 0; ) { + int32_t l =3D s->ZMM_L(i); + d->ZMM_D(i) =3D int32_to_float64(l, &env->sse_status); + } } =20 +#if SHIFT =3D=3D 1 void helper_cvtpi2ps(CPUX86State *env, ZMMReg *d, MMXReg *s) { d->ZMM_S(0) =3D int32_to_float32(s->MMX_L(0), &env->sse_status); @@ -713,8 +719,11 @@ void helper_cvtsq2sd(CPUX86State *env, ZMMReg *d, uint= 64_t val) } #endif =20 +#endif + /* float to integer */ =20 +#if SHIFT =3D=3D 1 /* * x86 mandates that we return the indefinite integer value for the result * of any float-to-integer conversion that raises the 'invalid' exception. 
@@ -745,22 +754,28 @@ WRAP_FLOATCONV(int64_t, float32_to_int64, float32, IN= T64_MIN) WRAP_FLOATCONV(int64_t, float32_to_int64_round_to_zero, float32, INT64_MIN) WRAP_FLOATCONV(int64_t, float64_to_int64, float64, INT64_MIN) WRAP_FLOATCONV(int64_t, float64_to_int64_round_to_zero, float64, INT64_MIN) +#endif =20 void glue(helper_cvtps2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { - d->ZMM_L(0) =3D x86_float32_to_int32(s->ZMM_S(0), &env->sse_status); - d->ZMM_L(1) =3D x86_float32_to_int32(s->ZMM_S(1), &env->sse_status); - d->ZMM_L(2) =3D x86_float32_to_int32(s->ZMM_S(2), &env->sse_status); - d->ZMM_L(3) =3D x86_float32_to_int32(s->ZMM_S(3), &env->sse_status); + int i; + for (i =3D 0; i < 2 << SHIFT; i++) { + d->ZMM_L(i) =3D x86_float32_to_int32(s->ZMM_S(i), &env->sse_status= ); + } } =20 void glue(helper_cvtpd2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { - d->ZMM_L(0) =3D x86_float64_to_int32(s->ZMM_D(0), &env->sse_status); - d->ZMM_L(1) =3D x86_float64_to_int32(s->ZMM_D(1), &env->sse_status); - d->ZMM_Q(1) =3D 0; + int i; + for (i =3D 0; i < 1 << SHIFT; i++) { + d->ZMM_L(i) =3D x86_float64_to_int32(s->ZMM_D(i), &env->sse_status= ); + } + for (i >>=3D 1; i < 1 << SHIFT; i++) { + d->Q(i) =3D 0; + } } =20 +#if SHIFT =3D=3D 1 void helper_cvtps2pi(CPUX86State *env, MMXReg *d, ZMMReg *s) { d->MMX_L(0) =3D x86_float32_to_int32(s->ZMM_S(0), &env->sse_status); @@ -794,23 +809,31 @@ int64_t helper_cvtsd2sq(CPUX86State *env, ZMMReg *s) return x86_float64_to_int64(s->ZMM_D(0), &env->sse_status); } #endif +#endif =20 /* float to integer truncated */ void glue(helper_cvttps2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { - d->ZMM_L(0) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(0), &env->= sse_status); - d->ZMM_L(1) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(1), &env->= sse_status); - d->ZMM_L(2) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(2), &env->= sse_status); - d->ZMM_L(3) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(3), &env->= 
sse_status); + int i; + for (i =3D 0; i < 2 << SHIFT; i++) { + d->ZMM_L(i) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(i), + &env->sse_status); + } } =20 void glue(helper_cvttpd2dq, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { - d->ZMM_L(0) =3D x86_float64_to_int32_round_to_zero(s->ZMM_D(0), &env->= sse_status); - d->ZMM_L(1) =3D x86_float64_to_int32_round_to_zero(s->ZMM_D(1), &env->= sse_status); - d->ZMM_Q(1) =3D 0; + int i; + for (i =3D 0; i < 1 << SHIFT; i++) { + d->ZMM_L(i) =3D x86_float64_to_int32_round_to_zero(s->ZMM_D(i), + &env->sse_status); + } + for (i >>=3D 1; i < 1 << SHIFT; i++) { + d->Q(i) =3D 0; + } } =20 +#if SHIFT =3D=3D 1 void helper_cvttps2pi(CPUX86State *env, MMXReg *d, ZMMReg *s) { d->MMX_L(0) =3D x86_float32_to_int32_round_to_zero(s->ZMM_S(0), &env->= sse_status); @@ -844,6 +867,7 @@ int64_t helper_cvttsd2sq(CPUX86State *env, ZMMReg *s) return x86_float64_to_int64_round_to_zero(s->ZMM_D(0), &env->sse_statu= s); } #endif +#endif =20 void glue(helper_rsqrtps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { @@ -857,6 +881,7 @@ void glue(helper_rsqrtps, SUFFIX)(CPUX86State *env, ZMM= Reg *d, ZMMReg *s) set_float_exception_flags(old_flags, &env->sse_status); } =20 +#if SHIFT =3D=3D 1 void helper_rsqrtss(CPUX86State *env, ZMMReg *d, ZMMReg *s) { uint8_t old_flags =3D get_float_exception_flags(&env->sse_status); @@ -865,6 +890,7 @@ void helper_rsqrtss(CPUX86State *env, ZMMReg *d, ZMMReg= *s) &env->sse_status); set_float_exception_flags(old_flags, &env->sse_status); } +#endif =20 void glue(helper_rcpps, SUFFIX)(CPUX86State *env, ZMMReg *d, ZMMReg *s) { @@ -876,13 +902,16 @@ void glue(helper_rcpps, SUFFIX)(CPUX86State *env, ZMM= Reg *d, ZMMReg *s) set_float_exception_flags(old_flags, &env->sse_status); } =20 +#if SHIFT =3D=3D 1 void helper_rcpss(CPUX86State *env, ZMMReg *d, ZMMReg *s) { uint8_t old_flags =3D get_float_exception_flags(&env->sse_status); d->ZMM_S(0) =3D float32_div(float32_one, s->ZMM_S(0), &env->sse_status= ); 
set_float_exception_flags(old_flags, &env->sse_status); } +#endif =20 +#if SHIFT =3D=3D 1 static inline uint64_t helper_extrq(uint64_t src, int shift, int len) { uint64_t mask; @@ -926,6 +955,7 @@ void helper_insertq_i(CPUX86State *env, ZMMReg *d, int = index, int length) { d->ZMM_Q(0) =3D helper_insertq(d->ZMM_Q(0), index, length); } +#endif =20 #define SSE_HELPER_HPS(name, F) \ void glue(helper_ ## name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ @@ -1052,6 +1082,7 @@ SSE_HELPER_CMP(cmpord, FPU_CMPQ, !FPU_UNORD) =20 #undef SSE_HELPER_CMP =20 +#if SHIFT =3D=3D 1 static const int comis_eflags[4] =3D {CC_C, CC_Z, 0, CC_Z | CC_P | CC_C}; =20 void helper_ucomiss(CPUX86State *env, Reg *d, Reg *s) @@ -1097,25 +1128,30 @@ void helper_comisd(CPUX86State *env, Reg *d, Reg *s) ret =3D float64_compare(d0, d1, &env->sse_status); CC_SRC =3D comis_eflags[ret + 1]; } +#endif =20 uint32_t glue(helper_movmskps, SUFFIX)(CPUX86State *env, Reg *s) { - int b0, b1, b2, b3; + uint32_t mask; + int i; =20 - b0 =3D s->ZMM_L(0) >> 31; - b1 =3D s->ZMM_L(1) >> 31; - b2 =3D s->ZMM_L(2) >> 31; - b3 =3D s->ZMM_L(3) >> 31; - return b0 | (b1 << 1) | (b2 << 2) | (b3 << 3); + mask =3D 0; + for (i =3D 0; i < 2 << SHIFT; i++) { + mask |=3D (s->ZMM_L(i) >> (31 - i)) & (1 << i); + } + return mask; } =20 uint32_t glue(helper_movmskpd, SUFFIX)(CPUX86State *env, Reg *s) { - int b0, b1; + uint32_t mask; + int i; =20 - b0 =3D s->ZMM_L(1) >> 31; - b1 =3D s->ZMM_L(3) >> 31; - return b0 | (b1 << 1); + mask =3D 0; + for (i =3D 0; i < 1 << SHIFT; i++) { + mask |=3D (s->ZMM_Q(i) >> (63 - i)) & (1 << i); + } + return mask; } =20 #endif @@ -1712,6 +1748,7 @@ SSE_HELPER_L(helper_pmaxud, MAX) #define FMULLD(d, s) ((int32_t)d * (int32_t)s) SSE_HELPER_L(helper_pmulld, FMULLD) =20 +#if SHIFT =3D=3D 1 void glue(helper_phminposuw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { int idx =3D 0; @@ -1743,12 +1780,14 @@ void glue(helper_phminposuw, SUFFIX)(CPUX86State *e= nv, Reg *d, Reg *s) d->L(1) =3D 0; d->Q(1) =3D 0; } 
+#endif =20 void glue(helper_roundps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t mode) { uint8_t old_flags =3D get_float_exception_flags(&env->sse_status); signed char prev_rounding_mode; + int i; =20 prev_rounding_mode =3D env->sse_status.float_rounding_mode; if (!(mode & (1 << 2))) { @@ -1768,10 +1807,9 @@ void glue(helper_roundps, SUFFIX)(CPUX86State *env, = Reg *d, Reg *s, } } =20 - d->ZMM_S(0) =3D float32_round_to_int(s->ZMM_S(0), &env->sse_status); - d->ZMM_S(1) =3D float32_round_to_int(s->ZMM_S(1), &env->sse_status); - d->ZMM_S(2) =3D float32_round_to_int(s->ZMM_S(2), &env->sse_status); - d->ZMM_S(3) =3D float32_round_to_int(s->ZMM_S(3), &env->sse_status); + for (i =3D 0; i < 2 << SHIFT; i++) { + d->ZMM_S(i) =3D float32_round_to_int(s->ZMM_S(i), &env->sse_status= ); + } =20 if (mode & (1 << 3) && !(old_flags & float_flag_inexact)) { set_float_exception_flags(get_float_exception_flags(&env->sse_stat= us) & @@ -1786,6 +1824,7 @@ void glue(helper_roundpd, SUFFIX)(CPUX86State *env, R= eg *d, Reg *s, { uint8_t old_flags =3D get_float_exception_flags(&env->sse_status); signed char prev_rounding_mode; + int i; =20 prev_rounding_mode =3D env->sse_status.float_rounding_mode; if (!(mode & (1 << 2))) { @@ -1805,8 +1844,9 @@ void glue(helper_roundpd, SUFFIX)(CPUX86State *env, R= eg *d, Reg *s, } } =20 - d->ZMM_D(0) =3D float64_round_to_int(s->ZMM_D(0), &env->sse_status); - d->ZMM_D(1) =3D float64_round_to_int(s->ZMM_D(1), &env->sse_status); + for (i =3D 0; i < 1 << SHIFT; i++) { + d->ZMM_D(i) =3D float64_round_to_int(s->ZMM_D(i), &env->sse_status= ); + } =20 if (mode & (1 << 3) && !(old_flags & float_flag_inexact)) { set_float_exception_flags(get_float_exception_flags(&env->sse_stat= us) & @@ -1816,6 +1856,7 @@ void glue(helper_roundpd, SUFFIX)(CPUX86State *env, R= eg *d, Reg *s, env->sse_status.float_rounding_mode =3D prev_rounding_mode; } =20 +#if SHIFT =3D=3D 1 void glue(helper_roundss, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t mode) { @@ -1883,6 
+1924,7 @@ void glue(helper_roundsd, SUFFIX)(CPUX86State *env, R= eg *d, Reg *s, } env->sse_status.float_rounding_mode =3D prev_rounding_mode; } +#endif =20 #define FBLENDP(d, s, m) (m ? s : d) SSE_HELPER_I(helper_blendps, L, 4, FBLENDP) @@ -1984,6 +2026,7 @@ void glue(helper_mpsadbw, SUFFIX)(CPUX86State *env, R= eg *d, Reg *s, #define FCMPGTQ(d, s) ((int64_t)d > (int64_t)s ? -1 : 0) SSE_HELPER_Q(helper_pcmpgtq, FCMPGTQ) =20 +#if SHIFT =3D=3D 1 static inline int pcmp_elen(CPUX86State *env, int reg, uint32_t ctrl) { target_long val, limit; @@ -2204,6 +2247,8 @@ target_ulong helper_crc32(uint32_t crc1, target_ulong= msg, uint32_t len) return crc; } =20 +#endif + void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t ctrl) { --=20 2.37.1 From nobody Mon Feb 9 15:46:37 2026 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1662023023; cv=none; d=zohomail.com; s=zohoarc; b=ju9UAFRtpg8cAQ+L/P8H1Cazb/Jka180rdlNu/sJnKOUITEwP1wd+SJO0QhSad4eUxAZ7DXgMDgjiJEypkRDNw/v3jBwACETYKB6F3a0AtKdCQ+tXKwOgnIQGNxesqK6HsPisOxgeq+/CF7MLXOwueu8HjXpHMQwRvar0iSCacg= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1662023023; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=BybYTNikrXupXntL2EKojQjO6GO3TVFAcNn2eVVB5uY=; b=P9E23xtcjT4rSHlaAjZHhKjQEHxJy8M/Et39G5w38Kb3woxqiloisPlNdzs+sRAvFf64Mg8A1YUncrxVE9hfom7c/i3eh+H++i48C/iIstq8s+V8JX8k3v/oXHZA3mW0qFTZq+FQGDtmF/WneHz6vkdK3rU0ZaKvWQOOzZPdhT8= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 
as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1662023023575384.69004953726824; Thu, 1 Sep 2022 02:03:43 -0700 (PDT) Received: from localhost ([::1]:43654 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oTg6w-0003Hg-8U for importer@patchew.org; Thu, 01 Sep 2022 05:03:42 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:42668) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oTexC-0001Tt-Qf for qemu-devel@nongnu.org; Thu, 01 Sep 2022 03:49:39 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:36946) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oTexB-0003Bn-16 for qemu-devel@nongnu.org; Thu, 01 Sep 2022 03:49:34 -0400 Received: from mail-wm1-f69.google.com (mail-wm1-f69.google.com [209.85.128.69]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-398-EPABywsOOY-ph9_5Fr911g-1; Thu, 01 Sep 2022 03:49:31 -0400 Received: by mail-wm1-f69.google.com with SMTP id r10-20020a1c440a000000b003a538a648a9so9471367wma.5 for ; Thu, 01 Sep 2022 00:49:31 -0700 (PDT) Received: from goa-sendmail ([2001:b07:6468:f312:5e2c:eb9a:a8b6:fd3e]) by smtp.gmail.com with ESMTPSA id bh11-20020a05600c3d0b00b003a53731f273sm4707928wmb.31.2022.09.01.00.49.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 01 Sep 2022 00:49:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1662018572; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: 
in-reply-to:in-reply-to:references:references; bh=BybYTNikrXupXntL2EKojQjO6GO3TVFAcNn2eVVB5uY=; b=OHjrd+0Q9WjSnj4eJNLMe13Bv3VoWUKQCD4DRJPHuTt9Y4+CQBwm+arRIvidQQb/bNxy8F +b2tquxyW1c34NS1ecf5vgvotGGdAJPs9CcOYaRwzbWxzwyLSj3t37PHNCEU8wf9ZdFEIq z4EsvCSrYtAxIQYtVQzuWIE6Wfq5FEs= X-MC-Unique: EPABywsOOY-ph9_5Fr911g-1 X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date; bh=BybYTNikrXupXntL2EKojQjO6GO3TVFAcNn2eVVB5uY=; b=GB9VXbNz5Gdnq8wY63m6HL4zgfGOxqpK/PSmJW2FzBKU72FQanlDn/OSzbk3hTZsNP QA0amBSKR3YZ4aNQbRixgumUlguwmqLM93OSmWQt0FvD5UjZ4fuedas65+qak0CG0kOf JZUT5SQ4HNe+SfoQGNYafqwLog1EwiBI//Cg4Zi5h2TwF/mUtg8hJmavTF/SPdHvcFeO 8H3NwM+eXLOLgTsi+kFSgbXEg7qTyb4qeOV4+nvD0YD46PnyPDOj9UGcMfA87hH+irtx z3rEJE7g09eZ/t1Yhf5UoDjA2WlLf+YFcEnUN0KEXqT0MsiOC2bpvgFXEfu9U1QaZ8SE N20Q== X-Gm-Message-State: ACgBeo3RYV9v2P9Va3oT6qH5oQnwCumIA5pMarngHhgOwnNDKnkFA2/s 1RnWuLr44azBn65x+gBHGE/xsbOmrMhYfvXC0CSfA3G1LDWvAkHFRDWff0/J4/AW743GmSB/Ctz TEWyeqriQDRY7T5+R80plddeY+8ph+h4isLfoRY9IW27oMKTaUOkTTMLmozdw/wL0aoM= X-Received: by 2002:a7b:c7d8:0:b0:3a6:34d2:1705 with SMTP id z24-20020a7bc7d8000000b003a634d21705mr4188151wmk.206.1662018570033; Thu, 01 Sep 2022 00:49:30 -0700 (PDT) X-Google-Smtp-Source: AA6agR5v7zs4fa2+G8/zvdsCqXAPRZlisulrGpRShVi1M8t5InxnfLTGqZQoCC1X9w595ZojntQqdQ== X-Received: by 2002:a7b:c7d8:0:b0:3a6:34d2:1705 with SMTP id z24-20020a7bc7d8000000b003a634d21705mr4188138wmk.206.1662018569726; Thu, 01 Sep 2022 00:49:29 -0700 (PDT) From: Paolo Bonzini To: qemu-devel@nongnu.org Cc: paul@nowt.org, richard.henderson@linaro.org Subject: [PATCH v3 21/23] i386: Rewrite blendv helpers Date: Thu, 1 Sep 2022 09:48:40 +0200 Message-Id: <20220901074842.57424-22-pbonzini@redhat.com> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20220901074842.57424-1-pbonzini@redhat.com> References: <20220901074842.57424-1-pbonzini@redhat.com> 
MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=170.10.133.124; envelope-from=pbonzini@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @redhat.com) X-ZM-MESSAGEID: 1662023025026100001 Content-Type: text/plain; charset="utf-8" From: Paul Brook Rewrite the blendv helpers so that they can easily be extended to support the AVX encodings, which make all 4 arguments explicit. 
No functional changes to the existing helpers Signed-off-by: Paul Brook Message-Id: <20220424220204.2493824-20-paul@nowt.org> Reviewed-by: Richard Henderson Signed-off-by: Paolo Bonzini --- target/i386/ops_sse.h | 86 ++++++++++++------------------------------- 1 file changed, 24 insertions(+), 62 deletions(-) diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h index 7cfbcce49f..a11a0143bf 100644 --- a/target/i386/ops_sse.h +++ b/target/i386/ops_sse.h @@ -1591,76 +1591,38 @@ void glue(helper_palignr, SUFFIX)(CPUX86State *env,= Reg *d, Reg *s, } } =20 -#define XMM0 (env->xmm_regs[0]) +#if SHIFT >=3D 1 =20 -#if SHIFT =3D=3D 1 #define SSE_HELPER_V(name, elem, num, F) \ - void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ + void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) \ { \ - d->elem(0) =3D F(d->elem(0), s->elem(0), XMM0.elem(0)); \ - d->elem(1) =3D F(d->elem(1), s->elem(1), XMM0.elem(1)); \ - if (num > 2) { \ - d->elem(2) =3D F(d->elem(2), s->elem(2), XMM0.elem(2)); \ - d->elem(3) =3D F(d->elem(3), s->elem(3), XMM0.elem(3)); \ - if (num > 4) { \ - d->elem(4) =3D F(d->elem(4), s->elem(4), XMM0.elem(4)); \ - d->elem(5) =3D F(d->elem(5), s->elem(5), XMM0.elem(5)); \ - d->elem(6) =3D F(d->elem(6), s->elem(6), XMM0.elem(6)); \ - d->elem(7) =3D F(d->elem(7), s->elem(7), XMM0.elem(7)); \ - if (num > 8) { \ - d->elem(8) =3D F(d->elem(8), s->elem(8), XMM0.elem(8))= ; \ - d->elem(9) =3D F(d->elem(9), s->elem(9), XMM0.elem(9))= ; \ - d->elem(10) =3D F(d->elem(10), s->elem(10), XMM0.elem(= 10)); \ - d->elem(11) =3D F(d->elem(11), s->elem(11), XMM0.elem(= 11)); \ - d->elem(12) =3D F(d->elem(12), s->elem(12), XMM0.elem(= 12)); \ - d->elem(13) =3D F(d->elem(13), s->elem(13), XMM0.elem(= 13)); \ - d->elem(14) =3D F(d->elem(14), s->elem(14), XMM0.elem(= 14)); \ - d->elem(15) =3D F(d->elem(15), s->elem(15), XMM0.elem(= 15)); \ - } \ - } \ + Reg *v =3D d; \ + Reg *m =3D &env->xmm_regs[0]; \ + int i; \ + for (i =3D 0; i < num; i++) { \ + d->elem(i) =3D 
F(v->elem(i), s->elem(i), m->elem(i)); \ } \ } =20 #define SSE_HELPER_I(name, elem, num, F) \ - void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t imm= ) \ + void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, \ + uint32_t imm) \ { \ - d->elem(0) =3D F(d->elem(0), s->elem(0), ((imm >> 0) & 1)); \ - d->elem(1) =3D F(d->elem(1), s->elem(1), ((imm >> 1) & 1)); \ - if (num > 2) { \ - d->elem(2) =3D F(d->elem(2), s->elem(2), ((imm >> 2) & 1)); \ - d->elem(3) =3D F(d->elem(3), s->elem(3), ((imm >> 3) & 1)); \ - if (num > 4) { \ - d->elem(4) =3D F(d->elem(4), s->elem(4), ((imm >> 4) & 1))= ; \ - d->elem(5) =3D F(d->elem(5), s->elem(5), ((imm >> 5) & 1))= ; \ - d->elem(6) =3D F(d->elem(6), s->elem(6), ((imm >> 6) & 1))= ; \ - d->elem(7) =3D F(d->elem(7), s->elem(7), ((imm >> 7) & 1))= ; \ - if (num > 8) { \ - d->elem(8) =3D F(d->elem(8), s->elem(8), ((imm >> 8) &= 1)); \ - d->elem(9) =3D F(d->elem(9), s->elem(9), ((imm >> 9) &= 1)); \ - d->elem(10) =3D F(d->elem(10), s->elem(10), \ - ((imm >> 10) & 1)); \ - d->elem(11) =3D F(d->elem(11), s->elem(11), \ - ((imm >> 11) & 1)); \ - d->elem(12) =3D F(d->elem(12), s->elem(12), \ - ((imm >> 12) & 1)); \ - d->elem(13) =3D F(d->elem(13), s->elem(13), \ - ((imm >> 13) & 1)); \ - d->elem(14) =3D F(d->elem(14), s->elem(14), \ - ((imm >> 14) & 1)); \ - d->elem(15) =3D F(d->elem(15), s->elem(15), \ - ((imm >> 15) & 1)); \ - } \ - } \ + Reg *v =3D d; \ + int i; \ + for (i =3D 0; i < num; i++) { \ + int j =3D i & 7; \ + d->elem(i) =3D F(v->elem(i), s->elem(i), (imm >> j) & 1); \ } \ } =20 /* SSE4.1 op helpers */ -#define FBLENDVB(d, s, m) ((m & 0x80) ? s : d) -#define FBLENDVPS(d, s, m) ((m & 0x80000000) ? s : d) -#define FBLENDVPD(d, s, m) ((m & 0x8000000000000000LL) ? s : d) -SSE_HELPER_V(helper_pblendvb, B, 16, FBLENDVB) -SSE_HELPER_V(helper_blendvps, L, 4, FBLENDVPS) -SSE_HELPER_V(helper_blendvpd, Q, 2, FBLENDVPD) +#define FBLENDVB(v, s, m) ((m & 0x80) ? s : v) +#define FBLENDVPS(v, s, m) ((m & 0x80000000) ? 
s : v) +#define FBLENDVPD(v, s, m) ((m & 0x8000000000000000LL) ? s : v) +SSE_HELPER_V(helper_pblendvb, B, 8 << SHIFT, FBLENDVB) +SSE_HELPER_V(helper_blendvps, L, 2 << SHIFT, FBLENDVPS) +SSE_HELPER_V(helper_blendvpd, Q, 1 << SHIFT, FBLENDVPD) =20 void glue(helper_ptest, SUFFIX)(CPUX86State *env, Reg *d, Reg *s) { @@ -1926,10 +1888,10 @@ void glue(helper_roundsd, SUFFIX)(CPUX86State *env,= Reg *d, Reg *s, } #endif =20 -#define FBLENDP(d, s, m) (m ? s : d) -SSE_HELPER_I(helper_blendps, L, 4, FBLENDP) -SSE_HELPER_I(helper_blendpd, Q, 2, FBLENDP) -SSE_HELPER_I(helper_pblendw, W, 8, FBLENDP) +#define FBLENDP(v, s, m) (m ? s : v) +SSE_HELPER_I(helper_blendps, L, 2 << SHIFT, FBLENDP) +SSE_HELPER_I(helper_blendpd, Q, 1 << SHIFT, FBLENDP) +SSE_HELPER_I(helper_pblendw, W, 4 << SHIFT, FBLENDP) =20 void glue(helper_dpps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, uint32_t mask) --=20 2.37.1 From nobody Mon Feb 9 15:46:37 2026 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com ARC-Seal: i=1; a=rsa-sha256; t=1662021540; cv=none; d=zohomail.com; s=zohoarc; b=AhdmfWX/jPuQG5rPL1UpVC4hJNtS5LlXkIsyMZaLoC0P4gY5evcTdv7hTJvNsCFv0IcuM/pQo0JK0DW9WxQ0k0gxY3pcfIdv4gtWG3MMR6uujq0JJ2fdP3mSB2Oo7UbyUh0Vdpa4PCALiUHMpLT8kQ7Ebi4qac2GOVbq/y8+y9g= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1662021540; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=9IC2Tite3HzkQmKsgB8f7REAQrQ4k7ITc5m5MXNySxs=; b=cezmG20PiOufiF6II6X2T2tNjzuKGE7KNcRzdJzaJb4IAb86NW9QtpUovuqe0+oaytxWnGbhwAwIQEmudg/EqU0B1usVCSHU0eOtcEn0fnxO+ifwie98mHstKFjgNbkUjhKdSRIt58AzU9o6yAkTlQ2u3Z+DdWQWw3K2cYkAb5c= 
ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1662021540162549.3391480028929; Thu, 1 Sep 2022 01:39:00 -0700 (PDT) Received: from localhost ([::1]:51164 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oTfj0-0008Bv-33 for importer@patchew.org; Thu, 01 Sep 2022 04:38:58 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:51054) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oTexH-0001UL-9i for qemu-devel@nongnu.org; Thu, 01 Sep 2022 03:49:39 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:28102) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oTexC-0003C6-HM for qemu-devel@nongnu.org; Thu, 01 Sep 2022 03:49:35 -0400 Received: from mail-wr1-f69.google.com (mail-wr1-f69.google.com [209.85.221.69]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-517-utm8OatfOYqb8Y1h9cRfNw-1; Thu, 01 Sep 2022 03:49:33 -0400 Received: by mail-wr1-f69.google.com with SMTP id j4-20020adfa544000000b002255264474bso2851938wrb.17 for ; Thu, 01 Sep 2022 00:49:32 -0700 (PDT) Received: from goa-sendmail ([93.56.160.208]) by smtp.gmail.com with ESMTPSA id l14-20020a5d410e000000b002238a1f6b74sm13906987wrp.37.2022.09.01.00.49.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 01 Sep 2022 00:49:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1662018574; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: 
to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=9IC2Tite3HzkQmKsgB8f7REAQrQ4k7ITc5m5MXNySxs=; b=fcDWuM8h1KCNDZ6cjqIZwyfD2L1OwHZqmjT3cew5Hgqk4dLv95iZFDnYBVxJkQxC18Y0iK OD6Jecrq4vxUdlnMAHv7CcWAdw8vie150tfiZ/h3NKU5rdtqyfEk7sMi5oktXKPwNQkLT2 dFT/Pv6K0zaIX0UyxLum1RRyBo46vw8= X-MC-Unique: utm8OatfOYqb8Y1h9cRfNw-1 X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date; bh=9IC2Tite3HzkQmKsgB8f7REAQrQ4k7ITc5m5MXNySxs=; b=zCvK/n6Im6agLkYeAo19BWweILBZVht23e5xZfivtbMFla6GsPctkwMnS8z6cno0si FiVj8ZnIBgXt63ihCyZvS1uq5KJC+oaTCl/IzxcoAVmUQZqPN4dhSOhi/umJiYOJM5/w 0FMe86NpNomt3bYjq/9sritKH0hklUY7o38NF3cys1Nl8b0irzuWAZ8neXZvwniu5QaK NU/Sldd/chJdlzx8Hi066m6uqr89tGzkfsy8kG31EP7eXreUdzz1aRfelOAfDsFn/26V Cs/Z6bNEay6Neh1wPqtwqdTrscF6FvHnm0Rf2HW95XFLcJQljLdnSo0fVbC9UCAIinP6 IQ2A== X-Gm-Message-State: ACgBeo3iolY5Q/fZg0ogpXAUdivU3iT14EdPZS9S/X5W42ClhieFTxM7 h7y0a5BxoR1osecd2c/LHVyH1p/ZgYm4uIAQo85ez9kdtdMtitBwF2OzBFc+E8lldstRFb36w08 fuHdx5giJOKAg3BNwkOMv64cD5xJJFaQH0T9UoNQkkiEMh26UFyAwsd9rcD3b4fySCzk= X-Received: by 2002:adf:e112:0:b0:21d:7195:3a8d with SMTP id t18-20020adfe112000000b0021d71953a8dmr14377871wrz.371.1662018571688; Thu, 01 Sep 2022 00:49:31 -0700 (PDT) X-Google-Smtp-Source: AA6agR5d97Eps6E1yKE7QmKPefzYFRb24jNsMQ+5lq5MGgnYQD9jRz5/2HO5KP8E5+2flI2VoVCWcg== X-Received: by 2002:adf:e112:0:b0:21d:7195:3a8d with SMTP id t18-20020adfe112000000b0021d71953a8dmr14377855wrz.371.1662018571423; Thu, 01 Sep 2022 00:49:31 -0700 (PDT) From: Paolo Bonzini To: qemu-devel@nongnu.org Cc: paul@nowt.org, richard.henderson@linaro.org Subject: [PATCH v3 22/23] i386: AVX pclmulqdq prep Date: Thu, 1 Sep 2022 09:48:41 +0200 Message-Id: <20220901074842.57424-23-pbonzini@redhat.com> X-Mailer: git-send-email 2.37.1 
In-Reply-To: <20220901074842.57424-1-pbonzini@redhat.com>
References: <20220901074842.57424-1-pbonzini@redhat.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org;
Received-SPF: pass client-ip=170.10.129.124; envelope-from=pbonzini@redhat.com; helo=us-smtp-delivery-124.mimecast.com
X-Spam_score_int: -27
X-Spam_score: -2.8
X-Spam_bar: --
X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail-DKIM: pass (identity @redhat.com)
X-ZM-MESSAGEID: 1662021540917100001
Content-Type: text/plain; charset="utf-8"

From: Paul Brook

Make the pclmulqdq helper AVX ready

Signed-off-by: Paul Brook
Message-Id: <20220424220204.2493824-21-paul@nowt.org>
Reviewed-by: Richard Henderson
Signed-off-by: Paolo Bonzini
---
 target/i386/ops_sse.h | 29 ++++++++++++++++++++++-------
 1 file changed, 22 insertions(+), 7 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index a11a0143bf..4135623ad8 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -2211,14 +2211,14 @@ target_ulong helper_crc32(uint32_t crc1, target_ulong msg, uint32_t len)
 
 #endif
 
-void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
-                                    uint32_t ctrl)
+#if SHIFT == 1
+static void clmulq(uint64_t *dest_l, uint64_t *dest_h,
+                   uint64_t a, uint64_t b)
 {
-    uint64_t ah, al, b, resh, resl;
+    uint64_t al, ah, resh, resl;
 
     ah = 0;
-    al = d->Q((ctrl & 1) != 0);
-    b = s->Q((ctrl & 16) != 0);
+    al = a;
     resh = resl = 0;
 
     while (b) {
@@ -2231,8 +2231,23 @@ void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
         b >>= 1;
     }
 
-    d->Q(0) = resl;
-    d->Q(1) = resh;
+    *dest_l = resl;
+    *dest_h = resh;
+}
+#endif
+
+void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
+                                    uint32_t ctrl)
+{
+    Reg *v = d;
+    uint64_t a, b;
+    int i;
+
+    for (i = 0; i < 1 << SHIFT; i += 2) {
+        a = v->Q(((ctrl & 1) != 0) + i);
+        b = s->Q(((ctrl & 16) != 0) + i);
+        clmulq(&d->Q(i), &d->Q(i + 1), a, b);
+    }
 }
 
 void glue(helper_aesdec, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-- 
2.37.1

From nobody Mon Feb 9 15:46:37 2026
Delivered-To: importer@patchew.org
Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=redhat.com
ARC-Seal: i=1; a=rsa-sha256; t=1662021997; cv=none; d=zohomail.com; s=zohoarc;
 b=n3+agFFpDL/KiUVbN7Mszt6rYC2KxRcj5ykAj6IC5mp8qElq9JowogrjoB8F46YxR6YihIIRKlp+3g015bde/bdE/PsANZaa6B6vUUEQcIy+ByH2C+PCBPSRvIN/0BbATDt4W9YI+IhJpdsYJCN3GV9m0Ad0MoRsIhcGjfm0Jg0=
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1662021997;
 h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To;
 bh=KJQUU+VddMwgot1TSEksOqwCThrJaNjx4Q2C2mhLmXc=;
 b=QV0aFLYLFbsKw/QwOSkgJNRLpo7fvEkPv7J8lYRpW19DUbuHkSQGiITSS6mGTB/yoBJCGZV+lgFPCp3Frntj7AO9Z34ea8VrxTNFVU9gvxGb6AMEkzZPNiLOR5Frl4vDTK/Tis37vFsNsD5vdpV5u/XXrMKmLKEtUGI7JxEfmm8=
ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none)
Return-Path:
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1662021997166531.8265367612378; Thu, 1 Sep 2022 01:46:37 -0700 (PDT)
Received: from localhost ([::1]:33780 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1oTfqO-0007AL-0N for importer@patchew.org; Thu, 01 Sep 2022 04:46:36 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]:51056) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oTexI-0001Y1-NP for qemu-devel@nongnu.org; Thu, 01 Sep 2022 03:49:40 -0400
Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:50932) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1oTexH-0003CX-0g for qemu-devel@nongnu.org; Thu, 01 Sep 2022 03:49:40 -0400
Received: from mail-wm1-f69.google.com (mail-wm1-f69.google.com [209.85.128.69]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-62-CRjnFhUWNBG7-ton_GVSnA-1; Thu, 01 Sep 2022 03:49:35 -0400
Received: by mail-wm1-f69.google.com with SMTP id ay27-20020a05600c1e1b00b003a5bff0df8dso2534121wmb.0 for ; Thu, 01 Sep 2022 00:49:35 -0700 (PDT)
Received: from goa-sendmail ([93.56.160.208]) by smtp.gmail.com with ESMTPSA id m3-20020a05600c3b0300b003a83ca67f73sm4568546wms.3.2022.09.01.00.49.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 01 Sep 2022 00:49:32 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1662018576;
 h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
 to:to:cc:cc:mime-version:mime-version:content-type:content-type:
 content-transfer-encoding:content-transfer-encoding:
 in-reply-to:in-reply-to:references:references;
 bh=KJQUU+VddMwgot1TSEksOqwCThrJaNjx4Q2C2mhLmXc=;
 b=gz93rKm8G2h2qMP0t7Pyq9Eh0P25AtUh9GNDeOHcffGcXh+UBVtiXJw2AoKAGwqHKld6SA
 0tU+ddp03v/3I+haZx9/b0oRZIirbemLJLNPJ7rBibii3upAYXy2aQVkkp46bNjJvA8747
 e7/8O515JjRnBAj4A3cPJ6ruEbDzmgo=
X-MC-Unique: CRjnFhUWNBG7-ton_GVSnA-1
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112;
 h=content-transfer-encoding:mime-version:references:in-reply-to
 :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc
 :subject:date;
 bh=KJQUU+VddMwgot1TSEksOqwCThrJaNjx4Q2C2mhLmXc=;
 b=qZ6PKjQZTnq70mQ8qDeFdV7K3noQb+/aXLY/7VD2SrthCv+TQ/w2foCFoIjLBBa9w/
 40oNpQ7uzRX1liSljWZrmJziLg0kdS6E5kdFJHErmHHdYb6SwH6xdMeERXyM3TGwPqQk
 rLlVuzcGW3U1WxAulvpWkj6MdBEH/kkD6Q74lSoRa5UxheL78Z6J2yEtdz4ylDoGn9l/
 6fsSCyi+PzYBBD/GZ4YmWHW+mFs46Xxeq8Ym49usxFrm+ircBOAsv/QRPDmOUamlrJtG
 mqsWhPwyoN7/d4TZPPIzzhXDpuQtWeH9yrfzXt/uUwlzu3SJ+lu9TtgIw3Ob2yrhWWUz
 SJfg==
X-Gm-Message-State: ACgBeo1KNf175NaSKRExyWWHvzbUh/8FA75n9pSDI4/prI4n6CsmIvby
 f0LN749mBqeXlrffzhqzOkYhiRv39CaCpWX8ZF3RbmK/yF6eHs4pxa5UKqz0x8U7JEzLVbe7Qbb
 1ru93vZDrqLZ5EwJfWdVzi6cwfWyf4LmrsdJH6Q0orxyeGG1XtobF6hwVOJwQ4KIf8RY=
X-Received: by 2002:a7b:c045:0:b0:3a5:ff4e:5528 with SMTP id u5-20020a7bc045000000b003a5ff4e5528mr4374386wmc.150.1662018573867; Thu, 01 Sep 2022 00:49:33 -0700 (PDT)
X-Google-Smtp-Source: AA6agR6DFIWJFH1PRyeno0fv60MtolRZkw8SeI+hGdzN4RMGcCfyLWB4J6erfXulOxGrC5PAVh6p1w==
X-Received: by 2002:a7b:c045:0:b0:3a5:ff4e:5528 with SMTP id u5-20020a7bc045000000b003a5ff4e5528mr4374365wmc.150.1662018573509; Thu, 01 Sep 2022 00:49:33 -0700 (PDT)
From: Paolo Bonzini
To: qemu-devel@nongnu.org
Cc: paul@nowt.org, richard.henderson@linaro.org
Subject: [PATCH v3 23/23] i386: AVX+AES helpers prep
Date: Thu, 1 Sep 2022 09:48:42 +0200
Message-Id: <20220901074842.57424-24-pbonzini@redhat.com>
X-Mailer: git-send-email 2.37.1
In-Reply-To: <20220901074842.57424-1-pbonzini@redhat.com>
References: <20220901074842.57424-1-pbonzini@redhat.com>
MIME-Version: 1.0
Content-Transfer-Encoding:
 quoted-printable
Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org;
Received-SPF: pass client-ip=170.10.133.124; envelope-from=pbonzini@redhat.com; helo=us-smtp-delivery-124.mimecast.com
X-Spam_score_int: -20
X-Spam_score: -2.1
X-Spam_bar: --
X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail-DKIM: pass (identity @redhat.com)
X-ZM-MESSAGEID: 1662021997721100001
Content-Type: text/plain; charset="utf-8"

From: Paul Brook

Make the AES vector helpers AVX ready

No functional changes to existing helpers

Signed-off-by: Paul Brook
Message-Id: <20220424220204.2493824-22-paul@nowt.org>
Signed-off-by: Paolo Bonzini
Reviewed-by: Richard Henderson
---
 target/i386/ops_sse.h | 41 ++++++++++++++++++++++-------------------
 1 file changed, 22 insertions(+), 19 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 4135623ad8..f208253161 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -2256,11 +2256,12 @@ void glue(helper_aesdec, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
     Reg st = *d;
     Reg rk = *s;
 
-    for (i = 0 ; i < 4 ; i++) {
-        d->L(i) = rk.L(i) ^ bswap32(AES_Td0[st.B(AES_ishifts[4*i+0])] ^
-                                    AES_Td1[st.B(AES_ishifts[4*i+1])] ^
-                                    AES_Td2[st.B(AES_ishifts[4*i+2])] ^
-                                    AES_Td3[st.B(AES_ishifts[4*i+3])]);
+    for (i = 0 ; i < 2 << SHIFT ; i++) {
+        int j = i & 3;
+        d->L(i) = rk.L(i) ^ bswap32(AES_Td0[st.B(AES_ishifts[4 * j + 0])] ^
+                                    AES_Td1[st.B(AES_ishifts[4 * j + 1])] ^
+                                    AES_Td2[st.B(AES_ishifts[4 * j + 2])] ^
+                                    AES_Td3[st.B(AES_ishifts[4 * j + 3])]);
     }
 }
 
@@ -2270,8 +2271,8 @@ void glue(helper_aesdeclast, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
     Reg st = *d;
     Reg rk = *s;
 
-    for (i = 0; i < 16; i++) {
-        d->B(i) = rk.B(i) ^ (AES_isbox[st.B(AES_ishifts[i])]);
+    for (i = 0; i < 8 << SHIFT; i++) {
+        d->B(i) = rk.B(i) ^ (AES_isbox[st.B(AES_ishifts[i & 15] + (i & ~15))]);
     }
 }
 
@@ -2281,11 +2282,12 @@ void glue(helper_aesenc, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
     Reg st = *d;
     Reg rk = *s;
 
-    for (i = 0 ; i < 4 ; i++) {
-        d->L(i) = rk.L(i) ^ bswap32(AES_Te0[st.B(AES_shifts[4*i+0])] ^
-                                    AES_Te1[st.B(AES_shifts[4*i+1])] ^
-                                    AES_Te2[st.B(AES_shifts[4*i+2])] ^
-                                    AES_Te3[st.B(AES_shifts[4*i+3])]);
+    for (i = 0 ; i < 2 << SHIFT ; i++) {
+        int j = i & 3;
+        d->L(i) = rk.L(i) ^ bswap32(AES_Te0[st.B(AES_shifts[4 * j + 0])] ^
+                                    AES_Te1[st.B(AES_shifts[4 * j + 1])] ^
+                                    AES_Te2[st.B(AES_shifts[4 * j + 2])] ^
+                                    AES_Te3[st.B(AES_shifts[4 * j + 3])]);
     }
 }
 
@@ -2295,22 +2297,22 @@ void glue(helper_aesenclast, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
     Reg st = *d;
     Reg rk = *s;
 
-    for (i = 0; i < 16; i++) {
-        d->B(i) = rk.B(i) ^ (AES_sbox[st.B(AES_shifts[i])]);
+    for (i = 0; i < 8 << SHIFT; i++) {
+        d->B(i) = rk.B(i) ^ (AES_sbox[st.B(AES_shifts[i & 15] + (i & ~15))]);
     }
-
 }
 
+#if SHIFT == 1
 void glue(helper_aesimc, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
     int i;
     Reg tmp = *s;
 
     for (i = 0 ; i < 4 ; i++) {
-        d->L(i) = bswap32(AES_imc[tmp.B(4*i+0)][0] ^
-                          AES_imc[tmp.B(4*i+1)][1] ^
-                          AES_imc[tmp.B(4*i+2)][2] ^
-                          AES_imc[tmp.B(4*i+3)][3]);
+        d->L(i) = bswap32(AES_imc[tmp.B(4 * i + 0)][0] ^
+                          AES_imc[tmp.B(4 * i + 1)][1] ^
+                          AES_imc[tmp.B(4 * i + 2)][2] ^
+                          AES_imc[tmp.B(4 * i + 3)][3]);
     }
 }
 
@@ -2328,6 +2330,7 @@ void glue(helper_aeskeygenassist, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
     d->L(3) = (d->L(2) << 24 | d->L(2) >> 8) ^ ctrl;
 }
 #endif
+#endif
 
 #undef SSE_HELPER_S
 
-- 
2.37.1
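
Note for readers of the series: the per-lane indexing used by the pclmulqdq and AES "last round" helpers can be tried outside QEMU. What follows is a minimal standalone sketch, not the code from these patches. The names shift_rows, lane_src_byte, clmul64 and clmul_lanes are hypothetical, flat uint64_t arrays stand in for the Reg accessors, and the multiply loop body is the usual shift-and-XOR carry-less multiply, assumed here because the diff context does not show it. The table is the standard AES ShiftRows byte order for one 16-byte block.

```c
#include <assert.h>
#include <stdint.h>

/* Standard AES ShiftRows byte permutation for one 16-byte block
 * (assumed to match the table the helpers index; not taken from the patch). */
static const uint8_t shift_rows[16] = {
    0, 5, 10, 15, 4, 9, 14, 3, 8, 13, 2, 7, 12, 1, 6, 11
};

/* Source byte for destination byte i: permute inside the 16-byte lane
 * (i & 15) and add back the lane base (i & ~15), so lanes never mix. */
static unsigned lane_src_byte(unsigned i)
{
    return shift_rows[i & 15] + (i & ~15u);
}

/* Carry-less (GF(2) polynomial) multiply of two 64-bit values.
 * The 128-bit product is returned as low and high 64-bit halves. */
static void clmul64(uint64_t *lo, uint64_t *hi, uint64_t a, uint64_t b)
{
    uint64_t al = a, ah = 0, rl = 0, rh = 0;

    while (b) {
        if (b & 1) {            /* XOR in the shifted multiplicand */
            rl ^= al;
            rh ^= ah;
        }
        ah = (ah << 1) | (al >> 63);   /* 128-bit shift left by one */
        al <<= 1;
        b >>= 1;
    }
    *lo = rl;
    *hi = rh;
}

/* Apply the multiply to each 128-bit lane of d, in place.
 * nq is the number of 64-bit elements (2 for 128-bit, 4 for 256-bit).
 * ctrl bit 0 picks the low/high quadword of d, bit 4 that of s. */
static void clmul_lanes(uint64_t *d, const uint64_t *s,
                        unsigned ctrl, unsigned nq)
{
    assert(nq % 2 == 0);
    for (unsigned i = 0; i < nq; i += 2) {
        /* Read both operands before overwriting the lane in d. */
        uint64_t a = d[((ctrl & 1) != 0) + i];
        uint64_t b = s[((ctrl & 16) != 0) + i];
        clmul64(&d[i], &d[i + 1], a, b);
    }
}
```

With nq = 2 the loop body runs once, which is the 128-bit case; with nq = 4 it runs twice and each pair of quadwords is multiplied independently, mirroring the `1 << SHIFT` bound in the pclmulqdq helper. Similarly, lane_src_byte(17) stays inside the second 16-byte lane, which is the effect of the `(i & ~15)` term in the aesenclast and aesdeclast loops.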