From: Paolo Bonzini
To: qemu-devel@nongnu.org
Cc: paul@nowt.org, richard.henderson@linaro.org
Subject: [PATCH v3 12/23] i386: Rewrite vector shift helper
Date: Thu, 1 Sep 2022 09:48:31 +0200
Message-Id: <20220901074842.57424-13-pbonzini@redhat.com>
X-Mailer: git-send-email 2.37.1
In-Reply-To: <20220901074842.57424-1-pbonzini@redhat.com>
References: <20220901074842.57424-1-pbonzini@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Paul Brook

Rewrite the vector shift helpers in preparation for AVX support (3-operand
form and 256-bit vectors).

For now keep the existing two-operand interface.

No functional changes to existing helpers.

Signed-off-by: Paul Brook
Message-Id: <20220424220204.2493824-11-paul@nowt.org>
Reviewed-by: Richard Henderson
Signed-off-by: Paolo Bonzini
---
 target/i386/ops_sse.h | 247 +++++++++++++++++++-----------------------
 1 file changed, 112 insertions(+), 135 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 2c0090a647..a4a09226e3 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -40,6 +40,8 @@
 #define SUFFIX _xmm
 #endif
 
+#define LANE_WIDTH (SHIFT ? 16 : 8)
+
 /*
  * Copy the relevant parts of a Reg value around. In the case where
  * sizeof(Reg) > SIZE, these helpers operate only on the lower bytes of
@@ -56,198 +58,173 @@
 #define MOVE(d, r) memcpy(&(d).B(0), &(r).B(0), SIZE)
 #endif
 
-void glue(helper_psrlw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-{
-    int shift;
+#if SHIFT == 0
+#define FPSRL(x, c) ((x) >> shift)
+#define FPSRAW(x, c) ((int16_t)(x) >> shift)
+#define FPSRAL(x, c) ((int32_t)(x) >> shift)
+#define FPSLL(x, c) ((x) << shift)
+#endif
 
-    if (s->Q(0) > 15) {
-        d->Q(0) = 0;
-#if SHIFT == 1
-        d->Q(1) = 0;
-#endif
+void glue(helper_psrlw, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
+{
+    Reg *s = d;
+    int shift;
+    if (c->Q(0) > 15) {
+        for (int i = 0; i < 1 << SHIFT; i++) {
+            d->Q(i) = 0;
+        }
     } else {
-        shift = s->B(0);
-        d->W(0) >>= shift;
-        d->W(1) >>= shift;
-        d->W(2) >>= shift;
-        d->W(3) >>= shift;
-#if SHIFT == 1
-        d->W(4) >>= shift;
-        d->W(5) >>= shift;
-        d->W(6) >>= shift;
-        d->W(7) >>= shift;
-#endif
+        shift = c->B(0);
+        for (int i = 0; i < 4 << SHIFT; i++) {
+            d->W(i) = FPSRL(s->W(i), shift);
+        }
     }
 }
 
-void glue(helper_psraw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_psllw, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
 {
+    Reg *s = d;
     int shift;
+    if (c->Q(0) > 15) {
+        for (int i = 0; i < 1 << SHIFT; i++) {
+            d->Q(i) = 0;
+        }
+    } else {
+        shift = c->B(0);
+        for (int i = 0; i < 4 << SHIFT; i++) {
+            d->W(i) = FPSLL(s->W(i), shift);
+        }
+    }
+}
 
-    if (s->Q(0) > 15) {
+void glue(helper_psraw, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
+{
+    Reg *s = d;
+    int shift;
+    if (c->Q(0) > 15) {
         shift = 15;
     } else {
-        shift = s->B(0);
+        shift = c->B(0);
+    }
+    for (int i = 0; i < 4 << SHIFT; i++) {
+        d->W(i) = FPSRAW(s->W(i), shift);
     }
-    d->W(0) = (int16_t)d->W(0) >> shift;
-    d->W(1) = (int16_t)d->W(1) >> shift;
-    d->W(2) = (int16_t)d->W(2) >> shift;
-    d->W(3) = (int16_t)d->W(3) >> shift;
-#if SHIFT == 1
-    d->W(4) = (int16_t)d->W(4) >> shift;
-    d->W(5) = (int16_t)d->W(5) >> shift;
-    d->W(6) = (int16_t)d->W(6) >> shift;
-    d->W(7) = (int16_t)d->W(7) >> shift;
-#endif
 }
 
-void glue(helper_psllw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_psrld, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
 {
+    Reg *s = d;
     int shift;
-
-    if (s->Q(0) > 15) {
-        d->Q(0) = 0;
-#if SHIFT == 1
-        d->Q(1) = 0;
-#endif
+    if (c->Q(0) > 31) {
+        for (int i = 0; i < 1 << SHIFT; i++) {
+            d->Q(i) = 0;
+        }
     } else {
-        shift = s->B(0);
-        d->W(0) <<= shift;
-        d->W(1) <<= shift;
-        d->W(2) <<= shift;
-        d->W(3) <<= shift;
-#if SHIFT == 1
-        d->W(4) <<= shift;
-        d->W(5) <<= shift;
-        d->W(6) <<= shift;
-        d->W(7) <<= shift;
-#endif
+        shift = c->B(0);
+        for (int i = 0; i < 2 << SHIFT; i++) {
+            d->L(i) = FPSRL(s->L(i), shift);
+        }
     }
 }
 
-void glue(helper_psrld, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_pslld, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
 {
+    Reg *s = d;
     int shift;
-
-    if (s->Q(0) > 31) {
-        d->Q(0) = 0;
-#if SHIFT == 1
-        d->Q(1) = 0;
-#endif
+    if (c->Q(0) > 31) {
+        for (int i = 0; i < 1 << SHIFT; i++) {
+            d->Q(i) = 0;
+        }
     } else {
-        shift = s->B(0);
-        d->L(0) >>= shift;
-        d->L(1) >>= shift;
-#if SHIFT == 1
-        d->L(2) >>= shift;
-        d->L(3) >>= shift;
-#endif
+        shift = c->B(0);
+        for (int i = 0; i < 2 << SHIFT; i++) {
+            d->L(i) = FPSLL(s->L(i), shift);
+        }
     }
 }
 
-void glue(helper_psrad, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_psrad, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
 {
+    Reg *s = d;
     int shift;
-
-    if (s->Q(0) > 31) {
+    if (c->Q(0) > 31) {
         shift = 31;
     } else {
-        shift = s->B(0);
+        shift = c->B(0);
+    }
+    for (int i = 0; i < 2 << SHIFT; i++) {
+        d->L(i) = FPSRAL(s->L(i), shift);
     }
-    d->L(0) = (int32_t)d->L(0) >> shift;
-    d->L(1) = (int32_t)d->L(1) >> shift;
-#if SHIFT == 1
-    d->L(2) = (int32_t)d->L(2) >> shift;
-    d->L(3) = (int32_t)d->L(3) >> shift;
-#endif
 }
 
-void glue(helper_pslld, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_psrlq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
 {
+    Reg *s = d;
     int shift;
-
-    if (s->Q(0) > 31) {
-        d->Q(0) = 0;
-#if SHIFT == 1
-        d->Q(1) = 0;
-#endif
+    if (c->Q(0) > 63) {
+        for (int i = 0; i < 1 << SHIFT; i++) {
+            d->Q(i) = 0;
+        }
     } else {
-        shift = s->B(0);
-        d->L(0) <<= shift;
-        d->L(1) <<= shift;
-#if SHIFT == 1
-        d->L(2) <<= shift;
-        d->L(3) <<= shift;
-#endif
+        shift = c->B(0);
+        for (int i = 0; i < 1 << SHIFT; i++) {
+            d->Q(i) = FPSRL(s->Q(i), shift);
+        }
     }
 }
 
-void glue(helper_psrlq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_psllq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
 {
+    Reg *s = d;
     int shift;
-
-    if (s->Q(0) > 63) {
-        d->Q(0) = 0;
-#if SHIFT == 1
-        d->Q(1) = 0;
-#endif
+    if (c->Q(0) > 63) {
+        for (int i = 0; i < 1 << SHIFT; i++) {
+            d->Q(i) = 0;
+        }
     } else {
-        shift = s->B(0);
-        d->Q(0) >>= shift;
-#if SHIFT == 1
-        d->Q(1) >>= shift;
-#endif
+        shift = c->B(0);
+        for (int i = 0; i < 1 << SHIFT; i++) {
+            d->Q(i) = FPSLL(s->Q(i), shift);
+        }
     }
 }
 
-void glue(helper_psllq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+#if SHIFT >= 1
+void glue(helper_psrldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
 {
-    int shift;
+    Reg *s = d;
+    int shift, i, j;
 
-    if (s->Q(0) > 63) {
-        d->Q(0) = 0;
-#if SHIFT == 1
-        d->Q(1) = 0;
-#endif
-    } else {
-        shift = s->B(0);
-        d->Q(0) <<= shift;
-#if SHIFT == 1
-        d->Q(1) <<= shift;
-#endif
-    }
-}
-
-#if SHIFT == 1
-void glue(helper_psrldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
-{
-    int shift, i;
-
-    shift = s->L(0);
+    shift = c->L(0);
     if (shift > 16) {
         shift = 16;
     }
-    for (i = 0; i < 16 - shift; i++) {
-        d->B(i) = d->B(i + shift);
-    }
-    for (i = 16 - shift; i < 16; i++) {
-        d->B(i) = 0;
+    for (j = 0; j < 8 << SHIFT; j += LANE_WIDTH) {
+        for (i = 0; i < 16 - shift; i++) {
+            d->B(j + i) = s->B(j + i + shift);
+        }
+        for (i = 16 - shift; i < 16; i++) {
+            d->B(j + i) = 0;
+        }
     }
 }
 
-void glue(helper_pslldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+void glue(helper_pslldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
 {
-    int shift, i;
+    Reg *s = d;
+    int shift, i, j;
 
-    shift = s->L(0);
+    shift = c->L(0);
     if (shift > 16) {
         shift = 16;
     }
-    for (i = 15; i >= shift; i--) {
-        d->B(i) = d->B(i - shift);
-    }
-    for (i = 0; i < shift; i++) {
-        d->B(i) = 0;
+    for (j = 0; j < 8 << SHIFT; j += LANE_WIDTH) {
+        for (i = 15; i >= shift; i--) {
+            d->B(j + i) = s->B(j + i - shift);
+        }
+        for (i = 0; i < shift; i++) {
+            d->B(j + i) = 0;
+        }
     }
 }
 #endif
-- 
2.37.1
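
Illustrative note, not part of the patch: the helpers above handle an
out-of-range shift count in two different ways. The logical shifts
(psrlw, psllw and the wider variants) clear the destination, while the
arithmetic shifts (psraw, psrad) clamp the count to the element width minus
one. The standalone sketch below shows that behaviour for eight 16-bit
elements. The names psrlw_sketch, psraw_sketch and NUM_WORDS are hypothetical
and do not exist in QEMU; plain arrays stand in for the Reg type and its
W()/Q() accessors.

```c
#include <stdint.h>

/* Hypothetical: eight 16-bit elements, i.e. one 128-bit register. */
#define NUM_WORDS 8

/*
 * Logical right shift of each 16-bit element.
 * A count above 15 clears every element, mirroring the
 * "c->Q(0) > 15" branch of helper_psrlw in the patch.
 */
static void psrlw_sketch(uint16_t v[NUM_WORDS], uint64_t count)
{
    for (int i = 0; i < NUM_WORDS; i++) {
        v[i] = count > 15 ? 0 : (uint16_t)(v[i] >> count);
    }
}

/*
 * Arithmetic right shift of each 16-bit element.
 * A count above 15 is clamped to 15, so each element ends up filled
 * with copies of its sign bit, as in helper_psraw.
 * Assumption: the compiler implements >> on a negative signed value as
 * an arithmetic shift (true for GCC and Clang; the C standard leaves
 * it implementation-defined).
 */
static void psraw_sketch(uint16_t v[NUM_WORDS], uint64_t count)
{
    int shift = count > 15 ? 15 : (int)count;

    for (int i = 0; i < NUM_WORDS; i++) {
        v[i] = (uint16_t)((int16_t)v[i] >> shift);
    }
}
```

With a count of 4, an element holding 0x8000 becomes 0x0800 under
psrlw_sketch and 0xf800 under psraw_sketch. With a count of 16 or more,
psrlw_sketch yields 0 for every element, and psraw_sketch yields 0xffff for
negative elements and 0 for non-negative ones.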