From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PULL 2/9] tcg/i386: Simplify immediate 8-bit logical vector shifts
Date: Tue, 7 May 2024 07:33:02 -0700
Message-Id: <20240507143309.5528-3-richard.henderson@linaro.org>
In-Reply-To: <20240507143309.5528-1-richard.henderson@linaro.org>
References: <20240507143309.5528-1-richard.henderson@linaro.org>

The x86 ISA does not have this operation, so we need an expansion.
Use the same algorithm that we use for expanding this vector
operation with integers: perform the shift with a wider type
and then mask the bits that must be zero.

This reduces the instruction count from 5 to 2.

Signed-off-by: Richard Henderson
---
 tcg/i386/tcg-target.c.inc | 61 +++++++++------------------------------
 1 file changed, 14 insertions(+), 47 deletions(-)

diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index c6ba498623..6837c519b0 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -3769,49 +3769,20 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece)
     }
 }
 
-static void expand_vec_shi(TCGType type, unsigned vece, TCGOpcode opc,
+static void expand_vec_shi(TCGType type, unsigned vece, bool right,
                            TCGv_vec v0, TCGv_vec v1, TCGArg imm)
 {
-    TCGv_vec t1, t2;
+    uint8_t mask;
 
     tcg_debug_assert(vece == MO_8);
-
-    t1 = tcg_temp_new_vec(type);
-    t2 = tcg_temp_new_vec(type);
-
-    /*
-     * Unpack to W, shift, and repack. Tricky bits:
-     * (1) Use punpck*bw x,x to produce DDCCBBAA,
-     *     i.e. duplicate in other half of the 16-bit lane.
-     * (2) For right-shift, add 8 so that the high half of the lane
-     *     becomes zero. For left-shift, and left-rotate, we must
-     *     shift up and down again.
-     * (3) Step 2 leaves high half zero such that PACKUSWB
-     *     (pack with unsigned saturation) does not modify
-     *     the quantity.
-     */
-    vec_gen_3(INDEX_op_x86_punpckl_vec, type, MO_8,
-              tcgv_vec_arg(t1), tcgv_vec_arg(v1), tcgv_vec_arg(v1));
-    vec_gen_3(INDEX_op_x86_punpckh_vec, type, MO_8,
-              tcgv_vec_arg(t2), tcgv_vec_arg(v1), tcgv_vec_arg(v1));
-
-    if (opc != INDEX_op_rotli_vec) {
-        imm += 8;
-    }
-    if (opc == INDEX_op_shri_vec) {
-        tcg_gen_shri_vec(MO_16, t1, t1, imm);
-        tcg_gen_shri_vec(MO_16, t2, t2, imm);
+    if (right) {
+        mask = 0xff >> imm;
+        tcg_gen_shri_vec(MO_16, v0, v1, imm);
     } else {
-        tcg_gen_shli_vec(MO_16, t1, t1, imm);
-        tcg_gen_shli_vec(MO_16, t2, t2, imm);
-        tcg_gen_shri_vec(MO_16, t1, t1, 8);
-        tcg_gen_shri_vec(MO_16, t2, t2, 8);
+        mask = 0xff << imm;
+        tcg_gen_shli_vec(MO_16, v0, v1, imm);
     }
-
-    vec_gen_3(INDEX_op_x86_packus_vec, type, MO_8,
-              tcgv_vec_arg(v0), tcgv_vec_arg(t1), tcgv_vec_arg(t2));
-    tcg_temp_free_vec(t1);
-    tcg_temp_free_vec(t2);
+    tcg_gen_and_vec(MO_8, v0, v0, tcg_constant_vec(type, MO_8, mask));
 }
 
 static void expand_vec_sari(TCGType type, unsigned vece,
@@ -3821,7 +3792,7 @@ static void expand_vec_sari(TCGType type, unsigned vece,
 
     switch (vece) {
     case MO_8:
-        /* Unpack to W, shift, and repack, as in expand_vec_shi. */
+        /* Unpack to 16-bit, shift, and repack. */
         t1 = tcg_temp_new_vec(type);
         t2 = tcg_temp_new_vec(type);
         vec_gen_3(INDEX_op_x86_punpckl_vec, type, MO_8,
@@ -3874,12 +3845,7 @@ static void expand_vec_rotli(TCGType type, unsigned vece,
 {
     TCGv_vec t;
 
-    if (vece == MO_8) {
-        expand_vec_shi(type, vece, INDEX_op_rotli_vec, v0, v1, imm);
-        return;
-    }
-
-    if (have_avx512vbmi2) {
+    if (vece != MO_8 && have_avx512vbmi2) {
         vec_gen_4(INDEX_op_x86_vpshldi_vec, type, vece,
                   tcgv_vec_arg(v0), tcgv_vec_arg(v1), tcgv_vec_arg(v1), imm);
         return;
@@ -4155,10 +4121,11 @@ void tcg_expand_vec_op(TCGOpcode opc, TCGType type, unsigned vece,
 
     switch (opc) {
     case INDEX_op_shli_vec:
-    case INDEX_op_shri_vec:
-        expand_vec_shi(type, vece, opc, v0, v1, a2);
+        expand_vec_shi(type, vece, false, v0, v1, a2);
+        break;
+    case INDEX_op_shri_vec:
+        expand_vec_shi(type, vece, true, v0, v1, a2);
         break;
-
     case INDEX_op_sari_vec:
         expand_vec_sari(type, vece, v0, v1, a2);
         break;
-- 
2.34.1
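
Illustration only, not part of the patch: the commit message describes
shifting in a wider type and then masking the bits that must be zero.
The scalar sketch below models one 16-bit lane holding two bytes and
compares that approach against shifting each byte on its own. It does
not use any TCG API; the function names are hypothetical, and the
0x0101 multiply is just an assumed stand-in for replicating the byte
mask across an MO_8 constant vector.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Reference model: shift each byte of a 16-bit lane independently. */
uint16_t shift8_reference(uint16_t lane, unsigned imm, bool right)
{
    uint8_t lo = lane & 0xff;
    uint8_t hi = lane >> 8;

    if (right) {
        lo >>= imm;
        hi >>= imm;
    } else {
        lo = (uint8_t)(lo << imm);
        hi = (uint8_t)(hi << imm);
    }
    return (uint16_t)((hi << 8) | lo);
}

/* Shift the whole 16-bit lane, then clear bits that crossed a byte boundary. */
uint16_t shift8_wide_and_mask(uint16_t lane, unsigned imm, bool right)
{
    uint8_t mask;
    uint16_t wide;

    if (right) {
        mask = 0xff >> imm;
        wide = lane >> imm;
    } else {
        mask = (uint8_t)(0xff << imm);
        wide = (uint16_t)(lane << imm);
    }
    /* Replicate the byte mask into both halves of the lane (assumption). */
    return wide & (uint16_t)(mask * 0x0101u);
}

/* Exhaustively compare both models; returns the number of mismatches. */
unsigned shift8_count_mismatches(void)
{
    unsigned bad = 0;

    for (unsigned imm = 0; imm < 8; imm++) {
        for (unsigned lane = 0; lane <= 0xffff; lane++) {
            for (int right = 0; right <= 1; right++) {
                if (shift8_reference(lane, imm, right) !=
                    shift8_wide_and_mask(lane, imm, right)) {
                    bad++;
                }
            }
        }
    }
    return bad;
}
```

For shift counts 0 through 7 the two models should agree on every lane
value, so shift8_count_mismatches() is expected to report no
mismatches. The mask is what makes the single wide shift safe: a right
shift leaks the low bits of the upper byte into the top of the lower
byte, and a left shift leaks the other way.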