From nobody Sun May 5 14:24:39 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=listsout.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from listsout.gnu.org (listsout.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1546642500538104.35097947917086; Fri, 4 Jan 2019 14:55:00 -0800 (PST) Received: from localhost ([127.0.0.1]:57047 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gfY34-0005ZH-4L for importer@patchew.org; Fri, 04 Jan 2019 17:34:38 -0500 Received: from eggsout.gnu.org ([209.51.188.92]:54790 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gfY02-0002dW-ME for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:31 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gfY01-0001Dj-QI for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:30 -0500 Received: from mail-it1-x141.google.com ([2607:f8b0:4864:20::141]:33664) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1gfY01-0001DX-MK for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:29 -0500 Received: by mail-it1-x141.google.com with SMTP id m8so2327292itk.0 for ; Fri, 04 Jan 2019 14:31:29 -0800 (PST) Received: from cloudburst.twiddle.net ([172.56.12.23]) by smtp.gmail.com with ESMTPSA id t6sm27793259ioc.87.2019.01.04.14.31.26 for (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 04 Jan 2019 14:31:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; 
h=from:to:subject:date:message-id:in-reply-to:references; bh=po3fGKZcEsraXkqBMSX5FPnIKtuOJRDnAJ2/AeZX7VI=; b=faTdPAb1TBGQ2kdTyUf/ne3KM0vGFU78c0h9NFNH2AvMRGln2paiWhJ8uPDMSUjsYW dBBmHAliCHujluhQsepIPzePD3m0tHL6tO12EusGKAiwqtjpjWyAUugtx30s/mLJsgEE GRQAhsZl+Bkrkg2+ssLHIJZ+28n4zUuAJFvgg= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references; bh=po3fGKZcEsraXkqBMSX5FPnIKtuOJRDnAJ2/AeZX7VI=; b=myv3NahVC6v1O/0JSKbrHFCtrzNuZFMjqG2TKfHdxa5TkMy3eW8PnU8salI/FXBxj4 Vxcqynvlv22U95wPnyYX3v2FIkoAAGxdftLhWC06S/G6+fTnHXxyEJMIKtBcE507Ttbt s4OIY7dtXAUGOFqgJLQUT6IfnY32poe66R4iJTfTUa6zEJeX/n+lRbfNAk1nhJZICQw8 GRVn1+xOLQqFmPv7XMsqLFAkJim3u7vPwjWM4j+xGc+r2QWdCvADZXWozfKJ9vTK0rQw j8/GOkcuv7fiXP1cBx/+zl7BsjSTQa4TIyatYkVuBfpN0MAES4jP7pCYrI67LEwS420u ysqg== X-Gm-Message-State: AJcUukcPkLtcFilHVVenY0Hlkm4rk5GgH4I2LtCVprY9zyAISzUH/yRr yYJ2wYSrgb3bECvnJ7GtUbKJCAYo5q0= X-Google-Smtp-Source: ALg8bN4SmdkhVUF/Cf8NiqW8F7bRKJlmuBCdmRI/MYwisY8mYfyEneQiy5DaYncoMRv3ZD8Av+mU6g== X-Received: by 2002:a24:57c5:: with SMTP id u188mr2367371ita.54.1546641088647; Fri, 04 Jan 2019 14:31:28 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 5 Jan 2019 08:31:07 +1000 Message-Id: <20190104223116.14037-2-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20190104223116.14037-1-richard.henderson@linaro.org> References: <20190104223116.14037-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:4864:20::141
Subject: [Qemu-devel] [PATCH v2 01/10] tcg: Add logical simplifications during gvec expand
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail-DKIM: fail (Header signature does not verify)
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

We handle many of these during integer expansion, and the rest of them
during integer optimization.

Reviewed-by: David Gibson
Signed-off-by: Richard Henderson
---
 tcg/tcg-op-gvec.c | 35 ++++++++++++++++++++++++++++++-----
 1 file changed, 30 insertions(+), 5 deletions(-)

diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c
index 61c25f5784..ec231b78fb 100644
--- a/tcg/tcg-op-gvec.c
+++ b/tcg/tcg-op-gvec.c
@@ -1840,7 +1840,12 @@ void tcg_gen_gvec_and(unsigned vece, uint32_t dofs, uint32_t aofs,
         .opc = INDEX_op_and_vec,
         .prefer_i64 = TCG_TARGET_REG_BITS == 64,
     };
-    tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g);
+
+    if (aofs == bofs) {
+        tcg_gen_gvec_mov(vece, dofs, aofs, oprsz, maxsz);
+    } else {
+        tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g);
+    }
 }
 
 void tcg_gen_gvec_or(unsigned vece, uint32_t dofs, uint32_t aofs,
@@ -1853,7 +1858,12 @@ void tcg_gen_gvec_or(unsigned vece, uint32_t dofs, uint32_t aofs,
         .opc = INDEX_op_or_vec,
         .prefer_i64 = TCG_TARGET_REG_BITS == 64,
     };
-    tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g);
+
+    if (aofs == bofs) {
+        tcg_gen_gvec_mov(vece, dofs, aofs, oprsz, maxsz);
+    } else {
+        tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g);
+    }
 }
 
 void tcg_gen_gvec_xor(unsigned vece, uint32_t dofs, uint32_t aofs,
@@ -1866,7 +1876,12 @@ void tcg_gen_gvec_xor(unsigned vece, uint32_t dofs, uint32_t aofs,
         .opc = INDEX_op_xor_vec,
         .prefer_i64 = TCG_TARGET_REG_BITS == 64,
     };
-    tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g);
+
+    if (aofs == bofs) {
+        tcg_gen_gvec_dup8i(dofs, oprsz, maxsz, 0);
+    } else {
+        tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g);
+    }
 }
 
 void tcg_gen_gvec_andc(unsigned vece, uint32_t dofs, uint32_t aofs,
@@ -1879,7 +1894,12 @@ void tcg_gen_gvec_andc(unsigned vece, uint32_t dofs, uint32_t aofs,
         .opc = INDEX_op_andc_vec,
         .prefer_i64 = TCG_TARGET_REG_BITS == 64,
     };
-    tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g);
+
+    if (aofs == bofs) {
+        tcg_gen_gvec_dup8i(dofs, oprsz, maxsz, 0);
+    } else {
+        tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g);
+    }
 }
 
 void tcg_gen_gvec_orc(unsigned vece, uint32_t dofs, uint32_t aofs,
@@ -1892,7 +1912,12 @@ void tcg_gen_gvec_orc(unsigned vece, uint32_t dofs, uint32_t aofs,
         .opc = INDEX_op_orc_vec,
         .prefer_i64 = TCG_TARGET_REG_BITS == 64,
     };
-    tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g);
+
+    if (aofs == bofs) {
+        tcg_gen_gvec_dup8i(dofs, oprsz, maxsz, -1);
+    } else {
+        tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g);
+    }
 }
 
 static const GVecGen2s gop_ands = {
-- 
2.17.2

From nobody Sun May 5 14:24:39 2024
Delivered-To: importer@patchew.org
Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=listsout.gnu.org;
Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org
Return-Path:
Received: from listsout.gnu.org (208.118.235.17 [208.118.235.17]) by mx.zohomail.com with SMTPS id 1546641534976604.8455259741098; Fri, 4 Jan 2019 14:38:54 -0800 (PST)
Received: from localhost ([127.0.0.1]:57913 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gfY74-0002yF-HP for
importer@patchew.org; Fri, 04 Jan 2019 17:38:46 -0500 Received: from eggsout.gnu.org ([209.51.188.92]:54816 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gfY05-0002es-Mh for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:34 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gfY04-0001F7-F8 for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:33 -0500 Received: from mail-it1-x141.google.com ([2607:f8b0:4864:20::141]:50458) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1gfY04-0001Et-9t for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:32 -0500 Received: by mail-it1-x141.google.com with SMTP id z7so3664843iti.0 for ; Fri, 04 Jan 2019 14:31:32 -0800 (PST) Received: from cloudburst.twiddle.net ([172.56.12.23]) by smtp.gmail.com with ESMTPSA id t6sm27793259ioc.87.2019.01.04.14.31.29 for (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 04 Jan 2019 14:31:30 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references; bh=DCNcv2k0t88K5ivbES4v5MrwSiEW1I9azwgO1d1vYhU=; b=JNQ5IkZJSyIVUWbg6RZuVaeG4z5YD3wLR+h/CpnI4J87zN4jPikCfwoiADgPl7i00V QeFfG1EVJrhDj2iNHxEHeLUKAnlvYXQlcM7QKLP2p03EcHaTCna9oWmrKE9A7gg3AzZs 3cuYw8ObC+1DOG9laAMA8PacV3HBKepwlPCjU= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references; bh=DCNcv2k0t88K5ivbES4v5MrwSiEW1I9azwgO1d1vYhU=; b=K7Rump+oTPdxyV85pAq0VppmzeRQXnbLI02G1ryEp6aH2/FZkQcMkbZ1vTRaKEXPaY k9ZUJqg580HFq8c6CbdtjqgZd0u3FjU8srCNqufqXYe72JoqU5FKjGZwR40N7uDuVsDc VnEFUd67vfpwh6xL+D2kxWyikrQWQ4WgrR1N+XOEv+/9kaPqS6eUMBVLpcF9VaE67e6q 7Ux2QlJbJJwkZ57qCQOx0gf2gCtdxNhWIf3kr61aeOCsxxA6Zli61+O2C9NNyXb6LTwI F2gtXCzfDUSn4c15eKPo26kENnti5TtO8JJGoLTk/ixDP79dNG0MMPDHp4ziC2RVLkiP A8bQ== X-Gm-Message-State: 
AJcUukdWeqJRynPf0OOlL5s7JoXNdKVlTrJmzEsyWjJ1V5AaxvDzqItl HfZ8bbSEqFunQLBVFSde4jWyZpjR+To=
X-Google-Smtp-Source: ALg8bN4SWLtMdswr4WM4onNQpjunWAZzTr6/apORbl1XMF/8fxAxr6bR6nNuWC8q2aZiHqgyrUN3WA==
X-Received: by 2002:a24:fa4b:: with SMTP id v72mr2014813ith.20.1546641091308; Fri, 04 Jan 2019 14:31:31 -0800 (PST)
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 5 Jan 2019 08:31:08 +1000
Message-Id: <20190104223116.14037-3-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.17.2
In-Reply-To: <20190104223116.14037-1-richard.henderson@linaro.org>
References: <20190104223116.14037-1-richard.henderson@linaro.org>
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 2607:f8b0:4864:20::141
Subject: [Qemu-devel] [PATCH v2 02/10] tcg: Add gvec expanders for nand, nor, eqv
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail-DKIM: fail (Header signature does not verify)
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Reviewed-by: David Gibson
Signed-off-by: Richard Henderson
---
 accel/tcg/tcg-runtime.h      |  3 +++
 tcg/tcg-op-gvec.h            |  6 +++++
 tcg/tcg-op.h                 |  3 +++
 accel/tcg/tcg-runtime-gvec.c | 33 +++++++++++++++++++++++
 tcg/tcg-op-gvec.c            | 51 ++++++++++++++++++++++++++++++++++++
 tcg/tcg-op-vec.c             | 21 +++++++++++++++
 6 files changed, 117 insertions(+)

diff --git a/accel/tcg/tcg-runtime.h b/accel/tcg/tcg-runtime.h
index 1bd39d136d..835ddfebb2 100644
--- a/accel/tcg/tcg-runtime.h
+++ b/accel/tcg/tcg-runtime.h
@@ -211,6 +211,9 @@ DEF_HELPER_FLAGS_4(gvec_or, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_xor, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_andc, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_orc, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_nand, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_nor, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_eqv, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
 DEF_HELPER_FLAGS_4(gvec_ands, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 DEF_HELPER_FLAGS_4(gvec_xors, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
diff --git a/tcg/tcg-op-gvec.h b/tcg/tcg-op-gvec.h
index ff43a29a0b..d65b9d9d4c 100644
--- a/tcg/tcg-op-gvec.h
+++ b/tcg/tcg-op-gvec.h
@@ -242,6 +242,12 @@ void tcg_gen_gvec_andc(unsigned vece, uint32_t dofs, uint32_t aofs,
                        uint32_t bofs, uint32_t oprsz, uint32_t maxsz);
 void tcg_gen_gvec_orc(unsigned vece, uint32_t dofs, uint32_t aofs,
                       uint32_t bofs, uint32_t oprsz, uint32_t maxsz);
+void tcg_gen_gvec_nand(unsigned vece, uint32_t dofs, uint32_t aofs,
+                       uint32_t bofs, uint32_t oprsz, uint32_t maxsz);
+void tcg_gen_gvec_nor(unsigned vece, uint32_t dofs, uint32_t aofs,
+                      uint32_t bofs, uint32_t oprsz, uint32_t maxsz);
+void tcg_gen_gvec_eqv(unsigned vece, uint32_t dofs, uint32_t aofs,
+                      uint32_t bofs, uint32_t oprsz, uint32_t maxsz);
 
 void tcg_gen_gvec_andi(unsigned vece, uint32_t dofs, uint32_t aofs,
                        int64_t c, uint32_t oprsz, uint32_t maxsz);
diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h
index 7007ec0d4d..f6ef1cd690 100644
--- a/tcg/tcg-op.h
+++ b/tcg/tcg-op.h
@@ -962,6 +962,9 @@ void tcg_gen_or_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b);
 void tcg_gen_xor_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b);
 void tcg_gen_andc_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b);
 void tcg_gen_orc_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b);
+void tcg_gen_nand_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b);
+void tcg_gen_nor_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b);
+void tcg_gen_eqv_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b);
 void tcg_gen_not_vec(unsigned vece, TCGv_vec r, TCGv_vec a);
 void tcg_gen_neg_vec(unsigned vece, TCGv_vec r, TCGv_vec a);
 
diff --git a/accel/tcg/tcg-runtime-gvec.c b/accel/tcg/tcg-runtime-gvec.c
index 90340e56e0..d1802467d5 100644
--- a/accel/tcg/tcg-runtime-gvec.c
+++ b/accel/tcg/tcg-runtime-gvec.c
@@ -512,6 +512,39 @@ void HELPER(gvec_orc)(void *d, void *a, void *b, uint32_t desc)
     clear_high(d, oprsz, desc);
 }
 
+void HELPER(gvec_nand)(void *d, void *a, void *b, uint32_t desc)
+{
+    intptr_t oprsz = simd_oprsz(desc);
+    intptr_t i;
+
+    for (i = 0; i < oprsz; i += sizeof(vec64)) {
+        *(vec64 *)(d + i) = ~(*(vec64 *)(a + i) & *(vec64 *)(b + i));
+    }
+    clear_high(d, oprsz, desc);
+}
+
+void HELPER(gvec_nor)(void *d, void *a, void *b, uint32_t desc)
+{
+    intptr_t oprsz = simd_oprsz(desc);
+    intptr_t i;
+
+    for (i = 0; i < oprsz; i += sizeof(vec64)) {
+        *(vec64 *)(d + i) = ~(*(vec64 *)(a + i) | *(vec64 *)(b + i));
+    }
+    clear_high(d, oprsz, desc);
+}
+
+void HELPER(gvec_eqv)(void *d, void *a, void *b, uint32_t desc)
+{
+    intptr_t oprsz = simd_oprsz(desc);
+    intptr_t i;
+
+    for (i = 0; i < oprsz; i += sizeof(vec64)) {
+        *(vec64 *)(d + i) = ~(*(vec64 *)(a + i) ^ *(vec64 *)(b + i));
+    }
+    clear_high(d, oprsz, desc);
+}
+
 void HELPER(gvec_ands)(void *d, void *a, uint64_t b, uint32_t desc)
 {
     intptr_t oprsz = simd_oprsz(desc);
diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c
index ec231b78fb..81689d02f7 100644
--- a/tcg/tcg-op-gvec.c
+++ b/tcg/tcg-op-gvec.c
@@ -1920,6 +1920,57 @@ void tcg_gen_gvec_orc(unsigned vece, uint32_t dofs, uint32_t aofs,
     }
 }
 
+void tcg_gen_gvec_nand(unsigned vece, uint32_t dofs, uint32_t aofs,
+                       uint32_t bofs, uint32_t oprsz, uint32_t maxsz)
+{
+    static const GVecGen3 g = {
+        .fni8 = tcg_gen_nand_i64,
+        .fniv = tcg_gen_nand_vec,
+        .fno = gen_helper_gvec_nand,
+        .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+    };
+
+    if (aofs == bofs) {
+        tcg_gen_gvec_not(vece, dofs, aofs, oprsz, maxsz);
+    } else {
+        tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g);
+    }
+}
+
+void tcg_gen_gvec_nor(unsigned vece, uint32_t dofs, uint32_t aofs,
+                      uint32_t bofs, uint32_t oprsz, uint32_t maxsz)
+{
+    static const GVecGen3 g = {
+        .fni8 = tcg_gen_nor_i64,
+        .fniv = tcg_gen_nor_vec,
+        .fno = gen_helper_gvec_nor,
+        .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+    };
+
+    if (aofs == bofs) {
+        tcg_gen_gvec_not(vece, dofs, aofs, oprsz, maxsz);
+    } else {
+        tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g);
+    }
+}
+
+void tcg_gen_gvec_eqv(unsigned vece, uint32_t dofs, uint32_t aofs,
+                      uint32_t bofs, uint32_t oprsz, uint32_t maxsz)
+{
+    static const GVecGen3 g = {
+        .fni8 = tcg_gen_eqv_i64,
+        .fniv = tcg_gen_eqv_vec,
+        .fno = gen_helper_gvec_eqv,
+        .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+    };
+
+    if (aofs == bofs) {
+        tcg_gen_gvec_dup8i(dofs, oprsz, maxsz, -1);
+    } else {
+        tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g);
+    }
+}
+
 static const GVecGen2s gop_ands = {
     .fni8 = tcg_gen_and_i64,
     .fniv = tcg_gen_and_vec,
diff --git a/tcg/tcg-op-vec.c b/tcg/tcg-op-vec.c
index cefba3d185..d77fdf7c1d 100644
--- a/tcg/tcg-op-vec.c
+++ b/tcg/tcg-op-vec.c
@@ -275,6 +275,27 @@ void tcg_gen_orc_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b)
     }
 }
 
+void tcg_gen_nand_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b)
+{
+    /* TODO: Add TCG_TARGET_HAS_nand_vec when adding a backend supports it. */
+    tcg_gen_and_vec(0, r, a, b);
+    tcg_gen_not_vec(0, r, r);
+}
+
+void tcg_gen_nor_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b)
+{
+    /* TODO: Add TCG_TARGET_HAS_nor_vec when adding a backend supports it. */
+    tcg_gen_or_vec(0, r, a, b);
+    tcg_gen_not_vec(0, r, r);
+}
+
+void tcg_gen_eqv_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b)
+{
+    /* TODO: Add TCG_TARGET_HAS_eqv_vec when adding a backend supports it. */
+    tcg_gen_xor_vec(0, r, a, b);
+    tcg_gen_not_vec(0, r, r);
+}
+
 void tcg_gen_not_vec(unsigned vece, TCGv_vec r, TCGv_vec a)
 {
     if (TCG_TARGET_HAS_not_vec) {
-- 
2.17.2

From nobody Sun May 5 14:24:39 2024
Delivered-To: importer@patchew.org
Received-SPF: pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org;
Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org
Return-Path:
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1546641292508408.2318980664278; Fri, 4 Jan 2019 14:34:52 -0800 (PST)
Received: from localhost ([127.0.0.1]:57101 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gfY3H-0005uL-Bo for importer@patchew.org; Fri, 04 Jan 2019 17:34:51 -0500
Received: from eggsout.gnu.org ([209.51.188.92]:54833 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gfY07-0002hZ-Tc for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:38 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gfY06-0001Qa-VI for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:35 -0500
Received: from mail-it1-x143.google.com ([2607:f8b0:4864:20::143]:33666) by eggs.gnu.org with esmtps
(TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1gfY06-0001NY-Qp for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:34 -0500 Received: by mail-it1-x143.google.com with SMTP id m8so2327407itk.0 for ; Fri, 04 Jan 2019 14:31:34 -0800 (PST) Received: from cloudburst.twiddle.net ([172.56.12.23]) by smtp.gmail.com with ESMTPSA id t6sm27793259ioc.87.2019.01.04.14.31.31 for (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 04 Jan 2019 14:31:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references; bh=13baR0fYl6duiBYqIpKDqOz4EjDaN2cM4w2ffhmehW8=; b=TqrIethis2KNSQswGFi4CJFpenX8Tljjgqs4j9LEAclSCN+/s0U+HOrOwGCXjzqxid 2IwreALKGvdw3wL4SPn9qqyiZq91epnPJA11PgiKngwYLnAsRUEic0JpHqWVsbx8CXlX 4hMDspAsOeEjhRfo4WBSKl8pnd7w2R+neZBHE= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references; bh=13baR0fYl6duiBYqIpKDqOz4EjDaN2cM4w2ffhmehW8=; b=HeBB6DVzdI/JEEubAkQeUbBNFjbFHDNnwLl4ffFAUYc3uvkqbaF469+5c3pSQfNa2V 85t20zZ7UYkFs5yIwGh83XYfRbDhI4IZBAQHzZKQ0YKt+w+Di6+nRRejZ1rZTak/q3/1 yviTqPJvFFLRqUY4sTpLKY4x9iz8OIA63zCAhURx433vN8wf0/c1dkJyAAf9MOCAv+BU Z0MNFejedL8hqKb2BTiRZ8SizmGgLFx5S5GooNA4YjOvIBCifSTaeY3Wjhq53Z0OG9WB H7KEtr+NLELux64o3Pm8HcSxKvCQWuzifRlZJM2RFPP+Avh7SZBy7wZgRI/rwHG8ncfu 3VPA== X-Gm-Message-State: AJcUukertxSVuNEZgAu8RQXLmvpUtBQ2DM+9xqBdoxLuv37WwsUsLweI vX6WjZHCLVWm94Lqwh8hVN5LL2CtoUg= X-Google-Smtp-Source: ALg8bN4gw8VI7mgCSmqILPQR2SIAL2tNIyK/h3LJf41Az2dsemfgTWw1CiuFC8OxqiM6x7mkJ0L35Q== X-Received: by 2002:a24:185:: with SMTP id 127mr2184486itk.55.1546641093768; Fri, 04 Jan 2019 14:31:33 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 5 Jan 2019 08:31:09 +1000 Message-Id: <20190104223116.14037-4-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20190104223116.14037-1-richard.henderson@linaro.org> 
References: <20190104223116.14037-1-richard.henderson@linaro.org>
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 2607:f8b0:4864:20::143
Subject: [Qemu-devel] [PATCH v2 03/10] tcg: Add write_aofs to GVecGen4
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail-DKIM: fail (Header signature does not verify)
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

This allows writing 2 output, 3 input operations.

Signed-off-by: Richard Henderson
---
 tcg/tcg-op-gvec.h |  2 ++
 tcg/tcg-op-gvec.c | 27 +++++++++++++++++++--------
 2 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/tcg/tcg-op-gvec.h b/tcg/tcg-op-gvec.h
index d65b9d9d4c..2cb447112e 100644
--- a/tcg/tcg-op-gvec.h
+++ b/tcg/tcg-op-gvec.h
@@ -181,6 +181,8 @@ typedef struct {
     uint8_t vece;
     /* Prefer i64 to v64. */
     bool prefer_i64;
+    /* Write aofs as a 2nd dest operand. */
+    bool write_aofs;
 } GVecGen4;
 
 void tcg_gen_gvec_2(uint32_t dofs, uint32_t aofs,
diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c
index 81689d02f7..c10d3d7b26 100644
--- a/tcg/tcg-op-gvec.c
+++ b/tcg/tcg-op-gvec.c
@@ -665,7 +665,7 @@ static void expand_3_i32(uint32_t dofs, uint32_t aofs,
 
 /* Expand OPSZ bytes worth of three-operand operations using i32 elements. */
 static void expand_4_i32(uint32_t dofs, uint32_t aofs, uint32_t bofs,
-                         uint32_t cofs, uint32_t oprsz,
+                         uint32_t cofs, uint32_t oprsz, bool write_aofs,
                          void (*fni)(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_i32))
 {
     TCGv_i32 t0 = tcg_temp_new_i32();
@@ -680,6 +680,9 @@ static void expand_4_i32(uint32_t dofs, uint32_t aofs, uint32_t bofs,
         tcg_gen_ld_i32(t3, cpu_env, cofs + i);
         fni(t0, t1, t2, t3);
         tcg_gen_st_i32(t0, cpu_env, dofs + i);
+        if (write_aofs) {
+            tcg_gen_st_i32(t1, cpu_env, aofs + i);
+        }
     }
     tcg_temp_free_i32(t3);
     tcg_temp_free_i32(t2);
@@ -769,7 +772,7 @@ static void expand_3_i64(uint32_t dofs, uint32_t aofs,
 
 /* Expand OPSZ bytes worth of three-operand operations using i64 elements. */
 static void expand_4_i64(uint32_t dofs, uint32_t aofs, uint32_t bofs,
-                         uint32_t cofs, uint32_t oprsz,
+                         uint32_t cofs, uint32_t oprsz, bool write_aofs,
                          void (*fni)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_i64))
 {
     TCGv_i64 t0 = tcg_temp_new_i64();
@@ -784,6 +787,9 @@ static void expand_4_i64(uint32_t dofs, uint32_t aofs, uint32_t bofs,
         tcg_gen_ld_i64(t3, cpu_env, cofs + i);
         fni(t0, t1, t2, t3);
         tcg_gen_st_i64(t0, cpu_env, dofs + i);
+        if (write_aofs) {
+            tcg_gen_st_i64(t1, cpu_env, aofs + i);
+        }
     }
     tcg_temp_free_i64(t3);
     tcg_temp_free_i64(t2);
@@ -880,7 +886,7 @@ static void expand_3_vec(unsigned vece, uint32_t dofs, uint32_t aofs,
 /* Expand OPSZ bytes worth of four-operand operations using host vectors. */
 static void expand_4_vec(unsigned vece, uint32_t dofs, uint32_t aofs,
                          uint32_t bofs, uint32_t cofs, uint32_t oprsz,
-                         uint32_t tysz, TCGType type,
+                         uint32_t tysz, TCGType type, bool write_aofs,
                          void (*fni)(unsigned, TCGv_vec, TCGv_vec,
                                      TCGv_vec, TCGv_vec))
 {
@@ -896,6 +902,9 @@ static void expand_4_vec(unsigned vece, uint32_t dofs, uint32_t aofs,
         tcg_gen_ld_vec(t3, cpu_env, cofs + i);
         fni(vece, t0, t1, t2, t3);
         tcg_gen_st_vec(t0, cpu_env, dofs + i);
+        if (write_aofs) {
+            tcg_gen_st_vec(t1, cpu_env, aofs + i);
+        }
     }
     tcg_temp_free_vec(t3);
     tcg_temp_free_vec(t2);
@@ -1187,7 +1196,7 @@ void tcg_gen_gvec_4(uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t cofs,
          */
         some = QEMU_ALIGN_DOWN(oprsz, 32);
         expand_4_vec(g->vece, dofs, aofs, bofs, cofs, some,
-                     32, TCG_TYPE_V256, g->fniv);
+                     32, TCG_TYPE_V256, g->write_aofs, g->fniv);
         if (some == oprsz) {
             break;
         }
@@ -1200,18 +1209,20 @@ void tcg_gen_gvec_4(uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t cofs,
         /* fallthru */
     case TCG_TYPE_V128:
         expand_4_vec(g->vece, dofs, aofs, bofs, cofs, oprsz,
-                     16, TCG_TYPE_V128, g->fniv);
+                     16, TCG_TYPE_V128, g->write_aofs, g->fniv);
         break;
     case TCG_TYPE_V64:
         expand_4_vec(g->vece, dofs, aofs, bofs, cofs, oprsz,
-                     8, TCG_TYPE_V64, g->fniv);
+                     8, TCG_TYPE_V64, g->write_aofs, g->fniv);
         break;
 
     case 0:
         if (g->fni8 && check_size_impl(oprsz, 8)) {
-            expand_4_i64(dofs, aofs, bofs, cofs, oprsz, g->fni8);
+            expand_4_i64(dofs, aofs, bofs, cofs, oprsz,
+                         g->write_aofs, g->fni8);
         } else if (g->fni4 && check_size_impl(oprsz, 4)) {
-            expand_4_i32(dofs, aofs, bofs, cofs, oprsz, g->fni4);
+            expand_4_i32(dofs, aofs, bofs, cofs, oprsz,
+                         g->write_aofs, g->fni4);
         } else {
             assert(g->fno != NULL);
             tcg_gen_gvec_4_ool(dofs, aofs, bofs, cofs,
-- 
2.17.2

From nobody Sun May 5 14:24:39 2024
Delivered-To: importer@patchew.org
Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17;
envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=listsout.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from listsout.gnu.org (listsout.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1546641927966356.766203415718; Fri, 4 Jan 2019 14:45:27 -0800 (PST) Received: from localhost ([127.0.0.1]:57090 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gfY3F-0005rJ-Vq for importer@patchew.org; Fri, 04 Jan 2019 17:34:50 -0500 Received: from eggsout.gnu.org ([209.51.188.92]:54861 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gfY0D-0002le-9X for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:42 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gfY09-0001g5-SN for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:41 -0500 Received: from mail-it1-x144.google.com ([2607:f8b0:4864:20::144]:34867) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1gfY09-0001dw-Mm for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:37 -0500 Received: by mail-it1-x144.google.com with SMTP id p197so3464526itp.0 for ; Fri, 04 Jan 2019 14:31:37 -0800 (PST) Received: from cloudburst.twiddle.net ([172.56.12.23]) by smtp.gmail.com with ESMTPSA id t6sm27793259ioc.87.2019.01.04.14.31.34 for (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 04 Jan 2019 14:31:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references; bh=+6X0SjI2cU8UHG6qE+NK8sKOtTuvkXLBDx285LoFEAA=; b=jk/lpdHhZyPx3Qnt3VxQaa0GMwm5Pn//IdSFX7zmzhfUC2yOA67q/MNUV3orL5GHyK 
uCpVSU2Hi4fvQIOL7Ei97JH6ygfQMXxSrB4h/BduBD9bcHNWhXZ/RIvkL1bgDZ5hIVq6 fa+4zO4Sc270rs49qeRIqhAinQl0qxSt1spoI= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references; bh=+6X0SjI2cU8UHG6qE+NK8sKOtTuvkXLBDx285LoFEAA=; b=ThYTDU31sZVqEZ5aI/CahncpwoCbbSPBfs9MuPJXu+ss9ql1yOTZSIPlRdilMPPfZe OwVbTZCSX/BXcnJRBHWWViYvR9u3QsZihLxpfoTTyN4fcGNQDROdQ+qCMJiS4LPYNbFe Ss9GvmsbRJH9IjGE1jcNTflYQXfbY9Jq4H76jdg8kBldzYnpKDQ+KELBI+4u/fQXabnc pqu175FuC4Q6v8hCW7jC2t+nbhJRi6GjUEujisBQmL+hiRWB175Im1BdQTKsG3FbRHGs 3RpFuRZcAFk8OTM/X/qRP3NUQOJk2AcQjwiXIVobnlF1d/oDqDWlpYevVB232fOWbz2T VVgQ== X-Gm-Message-State: AJcUuke7ehThcIROLg1Favwheb/KVcYZv367Yo8BwSA8nvaMQEcGc4xT 8/XbzsPbvS8nXLJj7h+1gYZQfvHmLOo= X-Google-Smtp-Source: ALg8bN56qpvWSxlyaPDYRuEXpG+pk6EkzcttHb/ojRP9kZEfE1h7/b4fKs6hzfozcgZh7F+DSbDWxA== X-Received: by 2002:a24:25ce:: with SMTP id g197mr2209083itg.61.1546641096623; Fri, 04 Jan 2019 14:31:36 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 5 Jan 2019 08:31:10 +1000 Message-Id: <20190104223116.14037-5-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20190104223116.14037-1-richard.henderson@linaro.org> References: <20190104223116.14037-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:4864:20::144
Subject: [Qemu-devel] [PATCH v2 04/10] tcg: Add opcodes for vector saturated arithmetic
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail-DKIM: fail (Header signature does not verify)
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Signed-off-by: Richard Henderson
---
 tcg/aarch64/tcg-target.h |  1 +
 tcg/i386/tcg-target.h    |  1 +
 tcg/tcg-op.h             |  4 ++
 tcg/tcg-opc.h            |  4 ++
 tcg/tcg.h                |  1 +
 tcg/tcg-op-gvec.c        | 84 ++++++++++++++++++++++++++++++----------
 tcg/tcg-op-vec.c         | 34 ++++++++++++++--
 tcg/tcg.c                |  5 +++
 8 files changed, 110 insertions(+), 24 deletions(-)

diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
index f966a4fcb3..98556bcf22 100644
--- a/tcg/aarch64/tcg-target.h
+++ b/tcg/aarch64/tcg-target.h
@@ -135,6 +135,7 @@ typedef enum {
 #define TCG_TARGET_HAS_shv_vec 0
 #define TCG_TARGET_HAS_cmp_vec 1
 #define TCG_TARGET_HAS_mul_vec 1
+#define TCG_TARGET_HAS_sat_vec 0
 
 #define TCG_TARGET_DEFAULT_MO (0)
 #define TCG_TARGET_HAS_MEMORY_BSWAP 1
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index f378d29568..44381062e6 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -185,6 +185,7 @@ extern bool have_avx2;
 #define TCG_TARGET_HAS_shv_vec 0
 #define TCG_TARGET_HAS_cmp_vec 1
 #define TCG_TARGET_HAS_mul_vec 1
+#define TCG_TARGET_HAS_sat_vec 0
 
 #define TCG_TARGET_deposit_i32_valid(ofs, len) \
     (((ofs) == 0 && (len) == 8) || ((ofs) == 8 && (len) == 8) || \
diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h
index f6ef1cd690..4a93d730e8 100644
--- a/tcg/tcg-op.h
+++ b/tcg/tcg-op.h
@@ -967,6 +967,10 @@ void tcg_gen_nor_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b);
 void tcg_gen_eqv_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b);
 void tcg_gen_not_vec(unsigned vece, TCGv_vec r, TCGv_vec a);
 void tcg_gen_neg_vec(unsigned vece, TCGv_vec r, TCGv_vec a);
+void tcg_gen_ssadd_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b);
+void tcg_gen_usadd_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b);
+void tcg_gen_sssub_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b);
+void tcg_gen_ussub_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b);
 
 void tcg_gen_shli_vec(unsigned vece, TCGv_vec r, TCGv_vec a, int64_t i);
 void tcg_gen_shri_vec(unsigned vece, TCGv_vec r, TCGv_vec a, int64_t i);
diff --git a/tcg/tcg-opc.h b/tcg/tcg-opc.h
index 7a8a3edb5b..94b2ed80af 100644
--- a/tcg/tcg-opc.h
+++ b/tcg/tcg-opc.h
@@ -222,6 +222,10 @@ DEF(add_vec, 1, 2, 0, IMPLVEC)
 DEF(sub_vec, 1, 2, 0, IMPLVEC)
 DEF(mul_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_mul_vec))
 DEF(neg_vec, 1, 1, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_neg_vec))
+DEF(ssadd_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_sat_vec))
+DEF(usadd_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_sat_vec))
+DEF(sssub_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_sat_vec))
+DEF(ussub_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_sat_vec))
 
 DEF(and_vec, 1, 2, 0, IMPLVEC)
 DEF(or_vec, 1, 2, 0, IMPLVEC)
diff --git a/tcg/tcg.h b/tcg/tcg.h
index 3a629991ca..df24afa425 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -183,6 +183,7 @@ typedef uint64_t TCGRegSet;
 #define TCG_TARGET_HAS_shs_vec 0
 #define TCG_TARGET_HAS_shv_vec 0
 #define TCG_TARGET_HAS_mul_vec 0
+#define TCG_TARGET_HAS_sat_vec 0
 #else
 #define TCG_TARGET_MAYBE_vec 1
 #endif
diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c
index c10d3d7b26..0a33f51065 100644
--- a/tcg/tcg-op-gvec.c
+++ b/tcg/tcg-op-gvec.c
@@ -1678,10 +1678,22 @@ void tcg_gen_gvec_ssadd(unsigned vece, uint32_t dofs, uint32_t aofs,
                         uint32_t bofs, uint32_t oprsz, uint32_t maxsz)
 {
     static const GVecGen3 g[4] = {
-        { .fno = gen_helper_gvec_ssadd8, .vece = MO_8 },
-        { .fno = gen_helper_gvec_ssadd16, .vece = MO_16 },
-        { .fno = gen_helper_gvec_ssadd32, .vece = MO_32 },
-        { .fno = gen_helper_gvec_ssadd64, .vece = MO_64 }
+        { .fniv = tcg_gen_ssadd_vec,
+          .fno = gen_helper_gvec_ssadd8,
+          .opc = INDEX_op_ssadd_vec,
+          .vece = MO_8 },
+        { .fniv = tcg_gen_ssadd_vec,
+          .fno = gen_helper_gvec_ssadd16,
+          .opc = INDEX_op_ssadd_vec,
+          .vece = MO_16 },
+        { .fniv = tcg_gen_ssadd_vec,
+          .fno = gen_helper_gvec_ssadd32,
+          .opc = INDEX_op_ssadd_vec,
+          .vece = MO_32 },
+        { .fniv = tcg_gen_ssadd_vec,
+          .fno = gen_helper_gvec_ssadd64,
+          .opc = INDEX_op_ssadd_vec,
+          .vece = MO_64 },
     };
     tcg_debug_assert(vece <= MO_64);
     tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g[vece]);
@@ -1691,16 +1703,28 @@ void tcg_gen_gvec_sssub(unsigned vece, uint32_t dofs, uint32_t aofs,
                         uint32_t bofs, uint32_t oprsz, uint32_t maxsz)
 {
     static const GVecGen3 g[4] = {
-        { .fno = gen_helper_gvec_sssub8, .vece = MO_8 },
-        { .fno = gen_helper_gvec_sssub16, .vece = MO_16 },
-        { .fno = gen_helper_gvec_sssub32, .vece = MO_32 },
-        { .fno = gen_helper_gvec_sssub64, .vece = MO_64 }
+        { .fniv = tcg_gen_sssub_vec,
+          .fno = gen_helper_gvec_sssub8,
+          .opc = INDEX_op_sssub_vec,
+          .vece = MO_8 },
+        { .fniv = tcg_gen_sssub_vec,
+          .fno = gen_helper_gvec_sssub16,
+          .opc = INDEX_op_sssub_vec,
+          .vece = MO_16 },
+        { .fniv = tcg_gen_sssub_vec,
+          .fno = gen_helper_gvec_sssub32,
+          .opc = INDEX_op_sssub_vec,
+          .vece = MO_32 },
+        { .fniv = tcg_gen_sssub_vec,
+          .fno = gen_helper_gvec_sssub64,
+          .opc = INDEX_op_sssub_vec,
+          .vece = MO_64 },
     };
     tcg_debug_assert(vece <= MO_64);
     tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g[vece]);
 }
 
-static void tcg_gen_vec_usadd32_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+static void tcg_gen_usadd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
 {
     TCGv_i32 max = tcg_const_i32(-1);
     tcg_gen_add_i32(d, a, b);
@@ -1708,7 +1732,7 @@ static void tcg_gen_vec_usadd32_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
     tcg_temp_free_i32(max);
 }
 
-static void tcg_gen_vec_usadd32_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+static void tcg_gen_usadd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
 {
     TCGv_i64 max = tcg_const_i64(-1);
     tcg_gen_add_i64(d, a, b);
@@ -1720,20 +1744,30 @@ void tcg_gen_gvec_usadd(unsigned vece, uint32_t dofs, uint32_t aofs,
                         uint32_t bofs, uint32_t oprsz, uint32_t maxsz)
 {
     static const GVecGen3 g[4] = {
-        { .fno = gen_helper_gvec_usadd8, .vece = MO_8 },
-        { .fno = gen_helper_gvec_usadd16, .vece = MO_16 },
-        { .fni4 = tcg_gen_vec_usadd32_i32,
+        { .fniv = tcg_gen_usadd_vec,
+          .fno = gen_helper_gvec_usadd8,
+          .opc = INDEX_op_usadd_vec,
+          .vece = MO_8 },
+        { .fniv = tcg_gen_usadd_vec,
+          .fno = gen_helper_gvec_usadd16,
+          .opc = INDEX_op_usadd_vec,
+          .vece = MO_16 },
+        { .fni4 = tcg_gen_usadd_i32,
+          .fniv = tcg_gen_usadd_vec,
           .fno = gen_helper_gvec_usadd32,
+          .opc = INDEX_op_usadd_vec,
           .vece = MO_32 },
-        { .fni8 = tcg_gen_vec_usadd32_i64,
+        { .fni8 = tcg_gen_usadd_i64,
+          .fniv = tcg_gen_usadd_vec,
           .fno = gen_helper_gvec_usadd64,
+          .opc = INDEX_op_usadd_vec,
           .vece = MO_64 }
     };
     tcg_debug_assert(vece <= MO_64);
     tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g[vece]);
 }
 
-static void tcg_gen_vec_ussub32_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+static void tcg_gen_ussub_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
 {
     TCGv_i32 min = tcg_const_i32(0);
     tcg_gen_sub_i32(d, a, b);
@@ -1741,7 +1775,7 @@ static void tcg_gen_vec_ussub32_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
     tcg_temp_free_i32(min);
 }
 
-static void tcg_gen_vec_ussub32_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+static void tcg_gen_ussub_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
 {
     TCGv_i64 min = tcg_const_i64(0);
     tcg_gen_sub_i64(d, a, b);
@@ -1753,13 +1787,23 @@ void tcg_gen_gvec_ussub(unsigned vece, uint32_t dofs, uint32_t aofs,
                         uint32_t bofs, uint32_t oprsz, uint32_t maxsz)
 {
     static const GVecGen3 g[4] = {
-        { .fno = gen_helper_gvec_ussub8, .vece = MO_8 },
-        { .fno = gen_helper_gvec_ussub16, .vece = MO_16 },
-        { .fni4 = tcg_gen_vec_ussub32_i32,
+        { .fniv = tcg_gen_ussub_vec,
+          .fno = gen_helper_gvec_ussub8,
+          .opc = INDEX_op_ussub_vec,
+          .vece = MO_8 },
+        { .fniv = tcg_gen_ussub_vec,
+          .fno = gen_helper_gvec_ussub16,
+          .opc = INDEX_op_ussub_vec,
+          .vece = MO_16 },
+        { .fni4 = tcg_gen_ussub_i32,
+          .fniv = tcg_gen_ussub_vec,
           .fno = gen_helper_gvec_ussub32,
+          .opc = INDEX_op_ussub_vec,
           .vece = MO_32 },
-        { .fni8 = tcg_gen_vec_ussub32_i64,
+        { .fni8 = tcg_gen_ussub_i64,
+          .fniv = tcg_gen_ussub_vec,
           .fno = gen_helper_gvec_ussub64,
+          .opc = INDEX_op_ussub_vec,
           .vece = MO_64 }
     };
     tcg_debug_assert(vece <= MO_64);
diff --git a/tcg/tcg-op-vec.c b/tcg/tcg-op-vec.c
index d77fdf7c1d..675aa09258 100644
--- a/tcg/tcg-op-vec.c
+++ b/tcg/tcg-op-vec.c
@@ -386,7 +386,8 @@ void tcg_gen_cmp_vec(TCGCond cond, unsigned vece,
     }
 }
 
-void tcg_gen_mul_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b)
+static void do_op3(unsigned vece, TCGv_vec r, TCGv_vec a,
+                   TCGv_vec b, TCGOpcode opc)
 {
     TCGTemp *rt = tcgv_vec_temp(r);
     TCGTemp *at = tcgv_vec_temp(a);
@@ -399,11 +400,36 @@ void tcg_gen_mul_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b)
 
     tcg_debug_assert(at->base_type >= type);
     tcg_debug_assert(bt->base_type >= type);
-    can = tcg_can_emit_vec_op(INDEX_op_mul_vec, type, vece);
+    can = tcg_can_emit_vec_op(opc, type, vece);
     if (can > 0) {
-        vec_gen_3(INDEX_op_mul_vec, type, vece, ri, ai, bi);
+        vec_gen_3(opc, type, vece, ri, ai, bi);
     } else {
         tcg_debug_assert(can < 0);
-        tcg_expand_vec_op(INDEX_op_mul_vec, type, vece, ri, ai, bi);
+        tcg_expand_vec_op(opc, type, vece, ri, ai, bi);
     }
 }
+
+void tcg_gen_mul_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b)
+{
+    do_op3(vece, r, a, b, INDEX_op_mul_vec);
+}
+
+void tcg_gen_ssadd_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b)
+{
+    do_op3(vece, r, a, b, INDEX_op_ssadd_vec);
+}
+
+void tcg_gen_usadd_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b)
+{
+    do_op3(vece, r, a, b, INDEX_op_usadd_vec);
+}
+
+void tcg_gen_sssub_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b)
+{
+    do_op3(vece, r, a, b, INDEX_op_sssub_vec);
+}
+
+void tcg_gen_ussub_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b)
+{
+    do_op3(vece, r, a, b, INDEX_op_ussub_vec);
+}
diff --git a/tcg/tcg.c b/tcg/tcg.c
index c54b119020..15ed5af007 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -1607,6 +1607,11 @@ bool tcg_op_supported(TCGOpcode op)
     case INDEX_op_shrv_vec:
     case INDEX_op_sarv_vec:
         return have_vec && TCG_TARGET_HAS_shv_vec;
+    case INDEX_op_ssadd_vec:
+    case INDEX_op_usadd_vec:
+    case INDEX_op_sssub_vec:
+    case INDEX_op_ussub_vec:
+        return have_vec && TCG_TARGET_HAS_sat_vec;
 
     default:
         tcg_debug_assert(op > INDEX_op_last_generic && op < NB_OPS);
-- 
2.17.2

From nobody Sun May 5 14:24:39 2024
Delivered-To: importer@patchew.org
Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=listsout.gnu.org;
Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org
Return-Path:
Received: from listsout.gnu.org (listsout.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1546641543340190.06655204480023; Fri, 4 Jan 2019 14:39:03 -0800 (PST)
Received: from localhost ([127.0.0.1]:57951 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gfY7C-0003BG-3Q for importer@patchew.org; Fri, 04 Jan 2019 17:38:54 -0500
Received: from eggsout.gnu.org ([209.51.188.92]:54877 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gfY0F-0002oD-4a for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:45 -0500
Received: from Debian-exim by eggs.gnu.org with
 spam-scanned (Exim 4.71) (envelope-from ) id 1gfY0D-0001to-4b for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:43 -0500
Received: from mail-io1-xd44.google.com ([2607:f8b0:4864:20::d44]:36634) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1gfY0C-0001r7-UR for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:41 -0500
Received: by mail-io1-xd44.google.com with SMTP id m19so30707930ioh.3 for ; Fri, 04 Jan 2019 14:31:40 -0800 (PST)
Received: from cloudburst.twiddle.net ([172.56.12.23]) by smtp.gmail.com with ESMTPSA id t6sm27793259ioc.87.2019.01.04.14.31.36 for (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 04 Jan 2019 14:31:38 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references; bh=oc2/ydQBmZXr7b3KRnpHwkAcfEDqqil/zgo+zZnn/do=;
 b=j/4dtI7Hgf6ABlYoR0XjnUC1FiGsRJsH2Fc3d/VgAGgDjtogw2QJKAR0tAbTdN+m+X
 EvZ9+GCUuoXvtckifrRiTHF5DRHSlYS+9+N8jkIwukeEBRhfuWi4DRvl0/8zCQ9gZ+aN
 YKwQ3vGKNaVP143HsCTgRwdXjXleJgMGzKr+g=
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references; bh=oc2/ydQBmZXr7b3KRnpHwkAcfEDqqil/zgo+zZnn/do=;
 b=DmhaekliemdY2R0XYYoRpnH2v06+wJP67pqcsryc4JiBhYlunDPYC5z5dmta5CakVp
 kqSo+A8wtkKnRsWZTTk+k3ZzDPkArKO9MD/G9wBCgj/6iGGOmvAwwghYHSXnegx1od5p
 RiSM2nuObif+XshWfGjaMWGtzzU7lP7p6U45LRBUCqabNbWrbqJfW9YJffkflCmAZhZD
 MWJzpdFzu2QHL5V4q+7lClf5aQgCVbqCvNG1ZSRT9DpPTTootcql821jP6BQXHp2JmI/
 r/ClgzNNgxN1OVBuRwJODjQPVlCKs9pOuiVsyHMebWL61mkFErBTZyDgau/Q6bstwkxE
 Uobw==
X-Gm-Message-State: AJcUukcRsBnax7rFJEMnK0qepBWjfMgUu9028u0fBxzxqz17HTgNpufj
 YlBkfnHAcmSypQaG+nbErBTy0Yun3Ro=
X-Google-Smtp-Source: ALg8bN5oWtrsfGmgGIq2P68l0317LFbLWSzwV7552ROGoYeV9hzxmtS9furiUC2kvixyYKD/KmiUqw==
X-Received: by 2002:a6b:d803:: with SMTP id y3mr39385073iob.247.1546641099522; Fri, 04 Jan 2019 14:31:39 -0800 (PST)
From: Richard Henderson
To:
 qemu-devel@nongnu.org
Date: Sat, 5 Jan 2019 08:31:11 +1000
Message-Id: <20190104223116.14037-6-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.17.2
In-Reply-To: <20190104223116.14037-1-richard.henderson@linaro.org>
References: <20190104223116.14037-1-richard.henderson@linaro.org>
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 2607:f8b0:4864:20::d44
Subject: [Qemu-devel] [PATCH v2 05/10] tcg: Add opcodes for vector minmax arithmetic
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail-DKIM: fail (Header signature does not verify)
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Signed-off-by: Richard Henderson
---
 accel/tcg/tcg-runtime.h      |  20 ++++
 tcg/aarch64/tcg-target.h     |   1 +
 tcg/i386/tcg-target.h        |   1 +
 tcg/tcg-op-gvec.h            |  10 ++
 tcg/tcg-op.h                 |   4 +
 tcg/tcg-opc.h                |   4 +
 tcg/tcg.h                    |   1 +
 accel/tcg/tcg-runtime-gvec.c | 224 +++++++++++++++++++++++++++++++++++
 tcg/tcg-op-gvec.c            | 108 +++++++++++++++++
 tcg/tcg-op-vec.c             |  20 ++++
 tcg/tcg.c                    |   5 +
 11 files changed, 398 insertions(+)

diff --git a/accel/tcg/tcg-runtime.h b/accel/tcg/tcg-runtime.h
index 835ddfebb2..dfe325625c 100644
--- a/accel/tcg/tcg-runtime.h
+++ b/accel/tcg/tcg-runtime.h
@@ -200,6 +200,26 @@ DEF_HELPER_FLAGS_4(gvec_ussub16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_ussub32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_ussub64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_4(gvec_smin8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_smin16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_smin32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_smin64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(gvec_smax8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_smax16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_smax32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_smax64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(gvec_umin8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_umin16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_umin32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_umin64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(gvec_umax8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_umax16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_umax32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_umax64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_3(gvec_neg8, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(gvec_neg16, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(gvec_neg32, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
index 98556bcf22..545a6eec75 100644
--- a/tcg/aarch64/tcg-target.h
+++ b/tcg/aarch64/tcg-target.h
@@ -136,6 +136,7 @@ typedef enum {
 #define TCG_TARGET_HAS_cmp_vec 1
 #define TCG_TARGET_HAS_mul_vec 1
 #define TCG_TARGET_HAS_sat_vec 0
+#define TCG_TARGET_HAS_minmax_vec 0
 
 #define TCG_TARGET_DEFAULT_MO (0)
 #define TCG_TARGET_HAS_MEMORY_BSWAP 1
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index 44381062e6..7bd7eae672 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -186,6 +186,7 @@ extern bool have_avx2;
 #define TCG_TARGET_HAS_cmp_vec 1
 #define TCG_TARGET_HAS_mul_vec 1
 #define TCG_TARGET_HAS_sat_vec 0
+#define TCG_TARGET_HAS_minmax_vec 0
 
 #define TCG_TARGET_deposit_i32_valid(ofs, len) \
     (((ofs) == 0 && (len) == 8) || ((ofs) == 8 && (len) == 8) || \
diff --git a/tcg/tcg-op-gvec.h b/tcg/tcg-op-gvec.h
index 2cb447112e..4734eef7de 100644
--- a/tcg/tcg-op-gvec.h
+++ b/tcg/tcg-op-gvec.h
@@ -234,6 +234,16 @@ void tcg_gen_gvec_usadd(unsigned vece, uint32_t dofs, uint32_t aofs,
 void tcg_gen_gvec_ussub(unsigned vece, uint32_t dofs, uint32_t aofs,
                         uint32_t bofs, uint32_t oprsz, uint32_t maxsz);
 
+/* Min/max. */
+void tcg_gen_gvec_smin(unsigned vece, uint32_t dofs, uint32_t aofs,
+                       uint32_t bofs, uint32_t oprsz, uint32_t maxsz);
+void tcg_gen_gvec_umin(unsigned vece, uint32_t dofs, uint32_t aofs,
+                       uint32_t bofs, uint32_t oprsz, uint32_t maxsz);
+void tcg_gen_gvec_smax(unsigned vece, uint32_t dofs, uint32_t aofs,
+                       uint32_t bofs, uint32_t oprsz, uint32_t maxsz);
+void tcg_gen_gvec_umax(unsigned vece, uint32_t dofs, uint32_t aofs,
+                       uint32_t bofs, uint32_t oprsz, uint32_t maxsz);
+
 void tcg_gen_gvec_and(unsigned vece, uint32_t dofs, uint32_t aofs,
                       uint32_t bofs, uint32_t oprsz, uint32_t maxsz);
 void tcg_gen_gvec_or(unsigned vece, uint32_t dofs, uint32_t aofs,
diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h
index 4a93d730e8..2d98868d8f 100644
--- a/tcg/tcg-op.h
+++ b/tcg/tcg-op.h
@@ -971,6 +971,10 @@ void tcg_gen_ssadd_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b);
 void tcg_gen_usadd_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b);
 void tcg_gen_sssub_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b);
 void tcg_gen_ussub_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b);
+void tcg_gen_smin_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b);
+void tcg_gen_umin_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b);
+void tcg_gen_smax_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b);
+void tcg_gen_umax_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b);
 
 void tcg_gen_shli_vec(unsigned vece, TCGv_vec r, TCGv_vec a, int64_t i);
 void tcg_gen_shri_vec(unsigned vece, TCGv_vec r, TCGv_vec a, int64_t i);
diff --git a/tcg/tcg-opc.h b/tcg/tcg-opc.h
index 94b2ed80af..4e0238ad1a 100644
--- a/tcg/tcg-opc.h
+++ b/tcg/tcg-opc.h
@@ -226,6 +226,10 @@ DEF(ssadd_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_sat_vec))
 DEF(usadd_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_sat_vec))
 DEF(sssub_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_sat_vec))
 DEF(ussub_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_sat_vec))
+DEF(smin_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_minmax_vec))
+DEF(umin_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_minmax_vec))
+DEF(smax_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_minmax_vec))
+DEF(umax_vec, 1, 2, 0, IMPLVEC | IMPL(TCG_TARGET_HAS_minmax_vec))
 
 DEF(and_vec, 1, 2, 0, IMPLVEC)
 DEF(or_vec, 1, 2, 0, IMPLVEC)
diff --git a/tcg/tcg.h b/tcg/tcg.h
index df24afa425..1c3579077d 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -184,6 +184,7 @@ typedef uint64_t TCGRegSet;
 #define TCG_TARGET_HAS_shv_vec 0
 #define TCG_TARGET_HAS_mul_vec 0
 #define TCG_TARGET_HAS_sat_vec 0
+#define TCG_TARGET_HAS_minmax_vec 0
 #else
 #define TCG_TARGET_MAYBE_vec 1
 #endif
diff --git a/accel/tcg/tcg-runtime-gvec.c b/accel/tcg/tcg-runtime-gvec.c
index d1802467d5..9358749741 100644
--- a/accel/tcg/tcg-runtime-gvec.c
+++ b/accel/tcg/tcg-runtime-gvec.c
@@ -1028,3 +1028,227 @@ void HELPER(gvec_ussub64)(void *d, void *a, void *b, uint32_t desc)
     }
     clear_high(d, oprsz, desc);
 }
+
+void HELPER(gvec_smin8)(void *d, void *a, void *b, uint32_t desc)
+{
+    intptr_t oprsz = simd_oprsz(desc);
+    intptr_t i;
+
+    for (i = 0; i < oprsz; i += sizeof(int8_t)) {
+        int8_t aa = *(int8_t *)(a + i);
+        int8_t bb = *(int8_t *)(b + i);
+        int8_t dd = aa < bb ? aa : bb;
+        *(int8_t *)(d + i) = dd;
+    }
+    clear_high(d, oprsz, desc);
+}
+
+void HELPER(gvec_smin16)(void *d, void *a, void *b, uint32_t desc)
+{
+    intptr_t oprsz = simd_oprsz(desc);
+    intptr_t i;
+
+    for (i = 0; i < oprsz; i += sizeof(int16_t)) {
+        int16_t aa = *(int16_t *)(a + i);
+        int16_t bb = *(int16_t *)(b + i);
+        int16_t dd = aa < bb ? aa : bb;
+        *(int16_t *)(d + i) = dd;
+    }
+    clear_high(d, oprsz, desc);
+}
+
+void HELPER(gvec_smin32)(void *d, void *a, void *b, uint32_t desc)
+{
+    intptr_t oprsz = simd_oprsz(desc);
+    intptr_t i;
+
+    for (i = 0; i < oprsz; i += sizeof(int32_t)) {
+        int32_t aa = *(int32_t *)(a + i);
+        int32_t bb = *(int32_t *)(b + i);
+        int32_t dd = aa < bb ? aa : bb;
+        *(int32_t *)(d + i) = dd;
+    }
+    clear_high(d, oprsz, desc);
+}
+
+void HELPER(gvec_smin64)(void *d, void *a, void *b, uint32_t desc)
+{
+    intptr_t oprsz = simd_oprsz(desc);
+    intptr_t i;
+
+    for (i = 0; i < oprsz; i += sizeof(int64_t)) {
+        int64_t aa = *(int64_t *)(a + i);
+        int64_t bb = *(int64_t *)(b + i);
+        int64_t dd = aa < bb ? aa : bb;
+        *(int64_t *)(d + i) = dd;
+    }
+    clear_high(d, oprsz, desc);
+}
+
+void HELPER(gvec_smax8)(void *d, void *a, void *b, uint32_t desc)
+{
+    intptr_t oprsz = simd_oprsz(desc);
+    intptr_t i;
+
+    for (i = 0; i < oprsz; i += sizeof(int8_t)) {
+        int8_t aa = *(int8_t *)(a + i);
+        int8_t bb = *(int8_t *)(b + i);
+        int8_t dd = aa > bb ? aa : bb;
+        *(int8_t *)(d + i) = dd;
+    }
+    clear_high(d, oprsz, desc);
+}
+
+void HELPER(gvec_smax16)(void *d, void *a, void *b, uint32_t desc)
+{
+    intptr_t oprsz = simd_oprsz(desc);
+    intptr_t i;
+
+    for (i = 0; i < oprsz; i += sizeof(int16_t)) {
+        int16_t aa = *(int16_t *)(a + i);
+        int16_t bb = *(int16_t *)(b + i);
+        int16_t dd = aa > bb ? aa : bb;
+        *(int16_t *)(d + i) = dd;
+    }
+    clear_high(d, oprsz, desc);
+}
+
+void HELPER(gvec_smax32)(void *d, void *a, void *b, uint32_t desc)
+{
+    intptr_t oprsz = simd_oprsz(desc);
+    intptr_t i;
+
+    for (i = 0; i < oprsz; i += sizeof(int32_t)) {
+        int32_t aa = *(int32_t *)(a + i);
+        int32_t bb = *(int32_t *)(b + i);
+        int32_t dd = aa > bb ? aa : bb;
+        *(int32_t *)(d + i) = dd;
+    }
+    clear_high(d, oprsz, desc);
+}
+
+void HELPER(gvec_smax64)(void *d, void *a, void *b, uint32_t desc)
+{
+    intptr_t oprsz = simd_oprsz(desc);
+    intptr_t i;
+
+    for (i = 0; i < oprsz; i += sizeof(int64_t)) {
+        int64_t aa = *(int64_t *)(a + i);
+        int64_t bb = *(int64_t *)(b + i);
+        int64_t dd = aa > bb ? aa : bb;
+        *(int64_t *)(d + i) = dd;
+    }
+    clear_high(d, oprsz, desc);
+}
+
+void HELPER(gvec_umin8)(void *d, void *a, void *b, uint32_t desc)
+{
+    intptr_t oprsz = simd_oprsz(desc);
+    intptr_t i;
+
+    for (i = 0; i < oprsz; i += sizeof(uint8_t)) {
+        uint8_t aa = *(uint8_t *)(a + i);
+        uint8_t bb = *(uint8_t *)(b + i);
+        uint8_t dd = aa < bb ? aa : bb;
+        *(uint8_t *)(d + i) = dd;
+    }
+    clear_high(d, oprsz, desc);
+}
+
+void HELPER(gvec_umin16)(void *d, void *a, void *b, uint32_t desc)
+{
+    intptr_t oprsz = simd_oprsz(desc);
+    intptr_t i;
+
+    for (i = 0; i < oprsz; i += sizeof(uint16_t)) {
+        uint16_t aa = *(uint16_t *)(a + i);
+        uint16_t bb = *(uint16_t *)(b + i);
+        uint16_t dd = aa < bb ? aa : bb;
+        *(uint16_t *)(d + i) = dd;
+    }
+    clear_high(d, oprsz, desc);
+}
+
+void HELPER(gvec_umin32)(void *d, void *a, void *b, uint32_t desc)
+{
+    intptr_t oprsz = simd_oprsz(desc);
+    intptr_t i;
+
+    for (i = 0; i < oprsz; i += sizeof(uint32_t)) {
+        uint32_t aa = *(uint32_t *)(a + i);
+        uint32_t bb = *(uint32_t *)(b + i);
+        uint32_t dd = aa < bb ? aa : bb;
+        *(uint32_t *)(d + i) = dd;
+    }
+    clear_high(d, oprsz, desc);
+}
+
+void HELPER(gvec_umin64)(void *d, void *a, void *b, uint32_t desc)
+{
+    intptr_t oprsz = simd_oprsz(desc);
+    intptr_t i;
+
+    for (i = 0; i < oprsz; i += sizeof(uint64_t)) {
+        uint64_t aa = *(uint64_t *)(a + i);
+        uint64_t bb = *(uint64_t *)(b + i);
+        uint64_t dd = aa < bb ? aa : bb;
+        *(uint64_t *)(d + i) = dd;
+    }
+    clear_high(d, oprsz, desc);
+}
+
+void HELPER(gvec_umax8)(void *d, void *a, void *b, uint32_t desc)
+{
+    intptr_t oprsz = simd_oprsz(desc);
+    intptr_t i;
+
+    for (i = 0; i < oprsz; i += sizeof(uint8_t)) {
+        uint8_t aa = *(uint8_t *)(a + i);
+        uint8_t bb = *(uint8_t *)(b + i);
+        uint8_t dd = aa > bb ? aa : bb;
+        *(uint8_t *)(d + i) = dd;
+    }
+    clear_high(d, oprsz, desc);
+}
+
+void HELPER(gvec_umax16)(void *d, void *a, void *b, uint32_t desc)
+{
+    intptr_t oprsz = simd_oprsz(desc);
+    intptr_t i;
+
+    for (i = 0; i < oprsz; i += sizeof(uint16_t)) {
+        uint16_t aa = *(uint16_t *)(a + i);
+        uint16_t bb = *(uint16_t *)(b + i);
+        uint16_t dd = aa > bb ? aa : bb;
+        *(uint16_t *)(d + i) = dd;
+    }
+    clear_high(d, oprsz, desc);
+}
+
+void HELPER(gvec_umax32)(void *d, void *a, void *b, uint32_t desc)
+{
+    intptr_t oprsz = simd_oprsz(desc);
+    intptr_t i;
+
+    for (i = 0; i < oprsz; i += sizeof(uint32_t)) {
+        uint32_t aa = *(uint32_t *)(a + i);
+        uint32_t bb = *(uint32_t *)(b + i);
+        uint32_t dd = aa > bb ? aa : bb;
+        *(uint32_t *)(d + i) = dd;
+    }
+    clear_high(d, oprsz, desc);
+}
+
+void HELPER(gvec_umax64)(void *d, void *a, void *b, uint32_t desc)
+{
+    intptr_t oprsz = simd_oprsz(desc);
+    intptr_t i;
+
+    for (i = 0; i < oprsz; i += sizeof(uint64_t)) {
+        uint64_t aa = *(uint64_t *)(a + i);
+        uint64_t bb = *(uint64_t *)(b + i);
+        uint64_t dd = aa > bb ? aa : bb;
+        *(uint64_t *)(d + i) = dd;
+    }
+    clear_high(d, oprsz, desc);
+}
diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c
index 0a33f51065..3ee44fcb75 100644
--- a/tcg/tcg-op-gvec.c
+++ b/tcg/tcg-op-gvec.c
@@ -1810,6 +1810,114 @@ void tcg_gen_gvec_ussub(unsigned vece, uint32_t dofs, uint32_t aofs,
     tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g[vece]);
 }
 
+void tcg_gen_gvec_smin(unsigned vece, uint32_t dofs, uint32_t aofs,
+                       uint32_t bofs, uint32_t oprsz, uint32_t maxsz)
+{
+    static const GVecGen3 g[4] = {
+        { .fniv = tcg_gen_smin_vec,
+          .fno = gen_helper_gvec_smin8,
+          .opc = INDEX_op_smin_vec,
+          .vece = MO_8 },
+        { .fniv = tcg_gen_smin_vec,
+          .fno = gen_helper_gvec_smin16,
+          .opc = INDEX_op_smin_vec,
+          .vece = MO_16 },
+        { .fni4 = tcg_gen_smin_i32,
+          .fniv = tcg_gen_smin_vec,
+          .fno = gen_helper_gvec_smin32,
+          .opc = INDEX_op_smin_vec,
+          .vece = MO_32 },
+        { .fni8 = tcg_gen_smin_i64,
+          .fniv = tcg_gen_smin_vec,
+          .fno = gen_helper_gvec_smin64,
+          .opc = INDEX_op_smin_vec,
+          .vece = MO_64 }
+    };
+    tcg_debug_assert(vece <= MO_64);
+    tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g[vece]);
+}
+
+void tcg_gen_gvec_umin(unsigned vece, uint32_t dofs, uint32_t aofs,
+                       uint32_t bofs, uint32_t oprsz, uint32_t maxsz)
+{
+    static const GVecGen3 g[4] = {
+        { .fniv = tcg_gen_umin_vec,
+          .fno = gen_helper_gvec_umin8,
+          .opc = INDEX_op_umin_vec,
+          .vece = MO_8 },
+        { .fniv = tcg_gen_umin_vec,
+          .fno = gen_helper_gvec_umin16,
+          .opc = INDEX_op_umin_vec,
+          .vece = MO_16 },
+        { .fni4 = tcg_gen_umin_i32,
+          .fniv = tcg_gen_umin_vec,
+          .fno = gen_helper_gvec_umin32,
+          .opc = INDEX_op_umin_vec,
+          .vece = MO_32 },
+        { .fni8 = tcg_gen_umin_i64,
+          .fniv = tcg_gen_umin_vec,
+          .fno = gen_helper_gvec_umin64,
+          .opc = INDEX_op_umin_vec,
+          .vece = MO_64 }
+    };
+    tcg_debug_assert(vece <= MO_64);
+    tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g[vece]);
+}
+
+void tcg_gen_gvec_smax(unsigned vece, uint32_t dofs, uint32_t aofs,
+                       uint32_t bofs, uint32_t oprsz, uint32_t maxsz)
+{
+    static const GVecGen3 g[4] = {
+        { .fniv = tcg_gen_smax_vec,
+          .fno = gen_helper_gvec_smax8,
+          .opc = INDEX_op_smax_vec,
+          .vece = MO_8 },
+        { .fniv = tcg_gen_smax_vec,
+          .fno = gen_helper_gvec_smax16,
+          .opc = INDEX_op_smax_vec,
+          .vece = MO_16 },
+        { .fni4 = tcg_gen_smax_i32,
+          .fniv = tcg_gen_smax_vec,
+          .fno = gen_helper_gvec_smax32,
+          .opc = INDEX_op_smax_vec,
+          .vece = MO_32 },
+        { .fni8 = tcg_gen_smax_i64,
+          .fniv = tcg_gen_smax_vec,
+          .fno = gen_helper_gvec_smax64,
+          .opc = INDEX_op_smax_vec,
+          .vece = MO_64 }
+    };
+    tcg_debug_assert(vece <= MO_64);
+    tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g[vece]);
+}
+
+void tcg_gen_gvec_umax(unsigned vece, uint32_t dofs, uint32_t aofs,
+                       uint32_t bofs, uint32_t oprsz, uint32_t maxsz)
+{
+    static const GVecGen3 g[4] = {
+        { .fniv = tcg_gen_umax_vec,
+          .fno = gen_helper_gvec_umax8,
+          .opc = INDEX_op_umax_vec,
+          .vece = MO_8 },
+        { .fniv = tcg_gen_umax_vec,
+          .fno = gen_helper_gvec_umax16,
+          .opc = INDEX_op_umax_vec,
+          .vece = MO_16 },
+        { .fni4 = tcg_gen_umax_i32,
+          .fniv = tcg_gen_umax_vec,
+          .fno = gen_helper_gvec_umax32,
+          .opc = INDEX_op_umax_vec,
+          .vece = MO_32 },
+        { .fni8 = tcg_gen_umax_i64,
+          .fniv = tcg_gen_umax_vec,
+          .fno = gen_helper_gvec_umax64,
+          .opc = INDEX_op_umax_vec,
+          .vece = MO_64 }
+    };
+    tcg_debug_assert(vece <= MO_64);
+    tcg_gen_gvec_3(dofs, aofs, bofs, oprsz, maxsz, &g[vece]);
+}
+
 /* Perform a vector negation using normal negation and a mask.
    Compare gen_subv_mask above. */
 static void gen_negv_mask(TCGv_i64 d, TCGv_i64 b, TCGv_i64 m)
diff --git a/tcg/tcg-op-vec.c b/tcg/tcg-op-vec.c
index 675aa09258..36f35022ac 100644
--- a/tcg/tcg-op-vec.c
+++ b/tcg/tcg-op-vec.c
@@ -433,3 +433,23 @@ void tcg_gen_ussub_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b)
 {
     do_op3(vece, r, a, b, INDEX_op_ussub_vec);
 }
+
+void tcg_gen_smin_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b)
+{
+    do_op3(vece, r, a, b, INDEX_op_smin_vec);
+}
+
+void tcg_gen_umin_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b)
+{
+    do_op3(vece, r, a, b, INDEX_op_umin_vec);
+}
+
+void tcg_gen_smax_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b)
+{
+    do_op3(vece, r, a, b, INDEX_op_smax_vec);
+}
+
+void tcg_gen_umax_vec(unsigned vece, TCGv_vec r, TCGv_vec a, TCGv_vec b)
+{
+    do_op3(vece, r, a, b, INDEX_op_umax_vec);
+}
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 15ed5af007..1ae1e788f6 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -1612,6 +1612,11 @@ bool tcg_op_supported(TCGOpcode op)
     case INDEX_op_sssub_vec:
     case INDEX_op_ussub_vec:
         return have_vec && TCG_TARGET_HAS_sat_vec;
+    case INDEX_op_smin_vec:
+    case INDEX_op_umin_vec:
+    case INDEX_op_smax_vec:
+    case INDEX_op_umax_vec:
+        return have_vec && TCG_TARGET_HAS_minmax_vec;
 
     default:
         tcg_debug_assert(op > INDEX_op_last_generic && op < NB_OPS);
-- 
2.17.2

From nobody Sun May 5 14:24:39 2024
Delivered-To: importer@patchew.org
Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=listsout.gnu.org;
Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org
Return-Path:
Received: from listsout.gnu.org (listsout.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id
 1546641543370591.5607583543555; Fri, 4 Jan 2019 14:39:03 -0800 (PST)
Received: from localhost ([127.0.0.1]:57955 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gfY7D-0003Bp-Q3 for importer@patchew.org; Fri, 04 Jan 2019 17:38:55 -0500
Received: from eggsout.gnu.org ([209.51.188.92]:54972 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gfY0I-0002tu-P6 for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:50 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gfY0G-0001y0-9e for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:46 -0500
Received: from mail-io1-xd34.google.com ([2607:f8b0:4864:20::d34]:43030) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1gfY0G-0001xZ-1H for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:44 -0500
Received: by mail-io1-xd34.google.com with SMTP id b23so9777766ios.10 for ; Fri, 04 Jan 2019 14:31:43 -0800 (PST)
Received: from cloudburst.twiddle.net ([172.56.12.23]) by smtp.gmail.com with ESMTPSA id t6sm27793259ioc.87.2019.01.04.14.31.39 for (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 04 Jan 2019 14:31:42 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references; bh=Z1wlPA+4A6/CnEqRLT3YSogLwPxAp1hyT1AeUsX3gRw=;
 b=WGJa5YpPuRzdfAXf2/dNXGypAOMPXgBuOmJrq4uICsWAbsroW3q/LewVqIE2gtGFa0
 czxKTwbBQ9J5FZ4CDY+0nRzLVuG/7tAy5UYfjZwzw6J7jgnEk6TDg6Gv/GxD5VAQtUhi
 zQBLP+73MxNdvRH5RGQeCDSlZFw3M8CwLFzsQ=
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references; bh=Z1wlPA+4A6/CnEqRLT3YSogLwPxAp1hyT1AeUsX3gRw=;
 b=CoU8t5RbPn43TR29PS5JLxTnM9cG+VoiQ2hdN4SAsYgY8H2aNABZeYtliakUwR39ix
 fn/9/Ew2/8bVskN3YB1/RbODKoqGtU9qDx5m1jN6VO7GHRJ6IVZwIfEQqkh5Lpus0t1S
 LMMTPssU5uz0CX6hjR/xZvwOti5pRAjBzXQgz6g8rUr8b6TOwCBSZnsNwiUGAshnDzX0
 HNe9jUu3aa6VOPDV1LLMzX1j4N5/4MFINCJkYYe/5mrgBXcabL4FYBYt5xp2/sx/o6l5
 n7Qpg80VF4ODkXF0lE6js+y/fE4u+Ss4GpQhDe65wHQjAV8eGOxWHx90Zxh/LfpQj8zC
 tQQA==
X-Gm-Message-State: AJcUukeCRPrcGxCiT1CbEIgdM5ITbpKAdiC0m9Lj+7NzdDCSVor+iPd7
 MGaJEeJYAxKiFA8ipisYgO8nk9CQacE=
X-Google-Smtp-Source: ALg8bN6izqH7rUwgGly2C2XhV/3GFI7I3jLr6cWIitrUQPViIXvET5GHHCi2+NLgWCSqQK+o4G9jwQ==
X-Received: by 2002:a5d:8597:: with SMTP id f23mr39641872ioj.238.1546641102860; Fri, 04 Jan 2019 14:31:42 -0800 (PST)
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 5 Jan 2019 08:31:12 +1000
Message-Id: <20190104223116.14037-7-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.17.2
In-Reply-To: <20190104223116.14037-1-richard.henderson@linaro.org>
References: <20190104223116.14037-1-richard.henderson@linaro.org>
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 2607:f8b0:4864:20::d34
Subject: [Qemu-devel] [PATCH v2 06/10] tcg/i386: Split subroutines out of tcg_expand_vec_op
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail-DKIM: fail (Header signature does not verify)
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

This routine was becoming too large.

Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.inc.c | 459 +++++++++++++++++++------------------- 1 file changed, 232 insertions(+), 227 deletions(-) diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c index c21c3272f2..ad97386d06 100644 --- a/tcg/i386/tcg-target.inc.c +++ b/tcg/i386/tcg-target.inc.c @@ -3079,253 +3079,258 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType ty= pe, unsigned vece) } } =20 +static void expand_vec_shi(TCGType type, unsigned vece, bool shr, + TCGv_vec v0, TCGv_vec v1, TCGArg imm) +{ + TCGv_vec t1, t2; + + tcg_debug_assert(vece =3D=3D MO_8); + + t1 =3D tcg_temp_new_vec(type); + t2 =3D tcg_temp_new_vec(type); + + /* Unpack to W, shift, and repack. Tricky bits: + (1) Use punpck*bw x,x to produce DDCCBBAA, + i.e. duplicate in other half of the 16-bit lane. + (2) For right-shift, add 8 so that the high half of + the lane becomes zero. For left-shift, we must + shift up and down again. + (3) Step 2 leaves high half zero such that PACKUSWB + (pack with unsigned saturation) does not modify + the quantity. */ + vec_gen_3(INDEX_op_x86_punpckl_vec, type, MO_8, + tcgv_vec_arg(t1), tcgv_vec_arg(v1), tcgv_vec_arg(v1)); + vec_gen_3(INDEX_op_x86_punpckh_vec, type, MO_8, + tcgv_vec_arg(t2), tcgv_vec_arg(v1), tcgv_vec_arg(v1)); + + if (shr) { + tcg_gen_shri_vec(MO_16, t1, t1, imm + 8); + tcg_gen_shri_vec(MO_16, t2, t2, imm + 8); + } else { + tcg_gen_shli_vec(MO_16, t1, t1, imm + 8); + tcg_gen_shli_vec(MO_16, t2, t2, imm + 8); + tcg_gen_shri_vec(MO_16, t1, t1, 8); + tcg_gen_shri_vec(MO_16, t2, t2, 8); + } + + vec_gen_3(INDEX_op_x86_packus_vec, type, MO_8, + tcgv_vec_arg(v0), tcgv_vec_arg(t1), tcgv_vec_arg(t2)); + tcg_temp_free_vec(t1); + tcg_temp_free_vec(t2); +} + +static void expand_vec_sari(TCGType type, unsigned vece, + TCGv_vec v0, TCGv_vec v1, TCGArg imm) +{ + TCGv_vec t1, t2; + + switch (vece) { + case MO_8: + /* Unpack to W, shift, and repack, as in expand_vec_shi. 
*/ + t1 =3D tcg_temp_new_vec(type); + t2 =3D tcg_temp_new_vec(type); + vec_gen_3(INDEX_op_x86_punpckl_vec, type, MO_8, + tcgv_vec_arg(t1), tcgv_vec_arg(v1), tcgv_vec_arg(v1)); + vec_gen_3(INDEX_op_x86_punpckh_vec, type, MO_8, + tcgv_vec_arg(t2), tcgv_vec_arg(v1), tcgv_vec_arg(v1)); + tcg_gen_sari_vec(MO_16, t1, t1, imm + 8); + tcg_gen_sari_vec(MO_16, t2, t2, imm + 8); + vec_gen_3(INDEX_op_x86_packss_vec, type, MO_8, + tcgv_vec_arg(v0), tcgv_vec_arg(t1), tcgv_vec_arg(t2)); + tcg_temp_free_vec(t1); + tcg_temp_free_vec(t2); + break; + + case MO_64: + if (imm <=3D 32) { + /* We can emulate a small sign extend by performing an arithme= tic + * 32-bit shift and overwriting the high half of a 64-bit logi= cal + * shift (note that the ISA says shift of 32 is valid). + */ + t1 =3D tcg_temp_new_vec(type); + tcg_gen_sari_vec(MO_32, t1, v1, imm); + tcg_gen_shri_vec(MO_64, v0, v1, imm); + vec_gen_4(INDEX_op_x86_blend_vec, type, MO_32, + tcgv_vec_arg(v0), tcgv_vec_arg(v0), + tcgv_vec_arg(t1), 0xaa); + tcg_temp_free_vec(t1); + } else { + /* Otherwise we will need to use a compare vs 0 to produce + * the sign-extend, shift and merge. + */ + t1 =3D tcg_const_zeros_vec(type); + tcg_gen_cmp_vec(TCG_COND_GT, MO_64, t1, t1, v1); + tcg_gen_shri_vec(MO_64, v0, v1, imm); + tcg_gen_shli_vec(MO_64, t1, t1, 64 - imm); + tcg_gen_or_vec(MO_64, v0, v0, t1); + tcg_temp_free_vec(t1); + } + break; + + default: + g_assert_not_reached(); + } +} + +static void expand_vec_mul(TCGType type, unsigned vece, + TCGv_vec v0, TCGv_vec v1, TCGv_vec v2) +{ + TCGv_vec t1, t2, t3, t4; + + tcg_debug_assert(vece =3D=3D MO_8); + + /* + * Unpack v1 bytes to words, 0 | x. + * Unpack v2 bytes to words, y | 0. + * This leaves the 8-bit result, x * y, with 8 bits of right padding. + * Shift logical right by 8 bits to clear the high 8 bits before + * using an unsigned saturated pack. + * + * The difference between the V64, V128 and V256 cases is merely how + * we distribute the expansion between temporaries.
+ */ + switch (type) { + case TCG_TYPE_V64: + t1 =3D tcg_temp_new_vec(TCG_TYPE_V128); + t2 =3D tcg_temp_new_vec(TCG_TYPE_V128); + tcg_gen_dup16i_vec(t2, 0); + vec_gen_3(INDEX_op_x86_punpckl_vec, TCG_TYPE_V128, MO_8, + tcgv_vec_arg(t1), tcgv_vec_arg(v1), tcgv_vec_arg(t2)); + vec_gen_3(INDEX_op_x86_punpckl_vec, TCG_TYPE_V128, MO_8, + tcgv_vec_arg(t2), tcgv_vec_arg(t2), tcgv_vec_arg(v2)); + tcg_gen_mul_vec(MO_16, t1, t1, t2); + tcg_gen_shri_vec(MO_16, t1, t1, 8); + vec_gen_3(INDEX_op_x86_packus_vec, TCG_TYPE_V128, MO_8, + tcgv_vec_arg(v0), tcgv_vec_arg(t1), tcgv_vec_arg(t1)); + tcg_temp_free_vec(t1); + tcg_temp_free_vec(t2); + break; + + case TCG_TYPE_V128: + case TCG_TYPE_V256: + t1 =3D tcg_temp_new_vec(type); + t2 =3D tcg_temp_new_vec(type); + t3 =3D tcg_temp_new_vec(type); + t4 =3D tcg_temp_new_vec(type); + tcg_gen_dup16i_vec(t4, 0); + vec_gen_3(INDEX_op_x86_punpckl_vec, type, MO_8, + tcgv_vec_arg(t1), tcgv_vec_arg(v1), tcgv_vec_arg(t4)); + vec_gen_3(INDEX_op_x86_punpckl_vec, type, MO_8, + tcgv_vec_arg(t2), tcgv_vec_arg(t4), tcgv_vec_arg(v2)); + vec_gen_3(INDEX_op_x86_punpckh_vec, type, MO_8, + tcgv_vec_arg(t3), tcgv_vec_arg(v1), tcgv_vec_arg(t4)); + vec_gen_3(INDEX_op_x86_punpckh_vec, type, MO_8, + tcgv_vec_arg(t4), tcgv_vec_arg(t4), tcgv_vec_arg(v2)); + tcg_gen_mul_vec(MO_16, t1, t1, t2); + tcg_gen_mul_vec(MO_16, t3, t3, t4); + tcg_gen_shri_vec(MO_16, t1, t1, 8); + tcg_gen_shri_vec(MO_16, t3, t3, 8); + vec_gen_3(INDEX_op_x86_packus_vec, type, MO_8, + tcgv_vec_arg(v0), tcgv_vec_arg(t1), tcgv_vec_arg(t3)); + tcg_temp_free_vec(t1); + tcg_temp_free_vec(t2); + tcg_temp_free_vec(t3); + tcg_temp_free_vec(t4); + break; + + default: + g_assert_not_reached(); + } +} + +static void expand_vec_cmp(TCGType type, unsigned vece, TCGv_vec v0, + TCGv_vec v1, TCGv_vec v2, TCGCond cond) +{ + enum { + NEED_SWAP =3D 1, + NEED_INV =3D 2, + NEED_BIAS =3D 4 + }; + static const uint8_t fixups[16] =3D { + [0 ... 
15] =3D -1, + [TCG_COND_EQ] =3D 0, + [TCG_COND_NE] =3D NEED_INV, + [TCG_COND_GT] =3D 0, + [TCG_COND_LT] =3D NEED_SWAP, + [TCG_COND_LE] =3D NEED_INV, + [TCG_COND_GE] =3D NEED_SWAP | NEED_INV, + [TCG_COND_GTU] =3D NEED_BIAS, + [TCG_COND_LTU] =3D NEED_BIAS | NEED_SWAP, + [TCG_COND_LEU] =3D NEED_BIAS | NEED_INV, + [TCG_COND_GEU] =3D NEED_BIAS | NEED_SWAP | NEED_INV, + }; + TCGv_vec t1, t2; + uint8_t fixup; + + fixup =3D fixups[cond & 15]; + tcg_debug_assert(fixup !=3D 0xff); + + if (fixup & NEED_INV) { + cond =3D tcg_invert_cond(cond); + } + if (fixup & NEED_SWAP) { + t1 =3D v1, v1 =3D v2, v2 =3D t1; + cond =3D tcg_swap_cond(cond); + } + + t1 =3D t2 =3D NULL; + if (fixup & NEED_BIAS) { + t1 =3D tcg_temp_new_vec(type); + t2 =3D tcg_temp_new_vec(type); + tcg_gen_dupi_vec(vece, t2, 1ull << ((8 << vece) - 1)); + tcg_gen_sub_vec(vece, t1, v1, t2); + tcg_gen_sub_vec(vece, t2, v2, t2); + v1 =3D t1; + v2 =3D t2; + cond =3D tcg_signed_cond(cond); + } + + tcg_debug_assert(cond =3D=3D TCG_COND_EQ || cond =3D=3D TCG_COND_GT); + /* Expand directly; do not recurse. */ + vec_gen_4(INDEX_op_cmp_vec, type, vece, + tcgv_vec_arg(v0), tcgv_vec_arg(v1), tcgv_vec_arg(v2), cond); + + if (t1) { + tcg_temp_free_vec(t1); + if (t2) { + tcg_temp_free_vec(t2); + } + } + if (fixup & NEED_INV) { + tcg_gen_not_vec(vece, v0, v0); + } +} + void tcg_expand_vec_op(TCGOpcode opc, TCGType type, unsigned vece, TCGArg a0, ...) { va_list va; - TCGArg a1, a2; - TCGv_vec v0, t1, t2, t3, t4; + TCGArg a2; + TCGv_vec v0, v1, v2; =20 va_start(va, a0); v0 =3D temp_tcgv_vec(arg_temp(a0)); + v1 =3D temp_tcgv_vec(arg_temp(va_arg(va, TCGArg))); + a2 =3D va_arg(va, TCGArg); =20 switch (opc) { case INDEX_op_shli_vec: case INDEX_op_shri_vec: - tcg_debug_assert(vece =3D=3D MO_8); - a1 =3D va_arg(va, TCGArg); - a2 =3D va_arg(va, TCGArg); - /* Unpack to W, shift, and repack. Tricky bits: - (1) Use punpck*bw x,x to produce DDCCBBAA, - i.e. duplicate in other half of the 16-bit lane. 
- (2) For right-shift, add 8 so that the high half of - the lane becomes zero. For left-shift, we must - shift up and down again. - (3) Step 2 leaves high half zero such that PACKUSWB - (pack with unsigned saturation) does not modify - the quantity. */ - t1 =3D tcg_temp_new_vec(type); - t2 =3D tcg_temp_new_vec(type); - vec_gen_3(INDEX_op_x86_punpckl_vec, type, MO_8, - tcgv_vec_arg(t1), a1, a1); - vec_gen_3(INDEX_op_x86_punpckh_vec, type, MO_8, - tcgv_vec_arg(t2), a1, a1); - if (opc =3D=3D INDEX_op_shri_vec) { - vec_gen_3(INDEX_op_shri_vec, type, MO_16, - tcgv_vec_arg(t1), tcgv_vec_arg(t1), a2 + 8); - vec_gen_3(INDEX_op_shri_vec, type, MO_16, - tcgv_vec_arg(t2), tcgv_vec_arg(t2), a2 + 8); - } else { - vec_gen_3(INDEX_op_shli_vec, type, MO_16, - tcgv_vec_arg(t1), tcgv_vec_arg(t1), a2 + 8); - vec_gen_3(INDEX_op_shli_vec, type, MO_16, - tcgv_vec_arg(t2), tcgv_vec_arg(t2), a2 + 8); - vec_gen_3(INDEX_op_shri_vec, type, MO_16, - tcgv_vec_arg(t1), tcgv_vec_arg(t1), 8); - vec_gen_3(INDEX_op_shri_vec, type, MO_16, - tcgv_vec_arg(t2), tcgv_vec_arg(t2), 8); - } - vec_gen_3(INDEX_op_x86_packus_vec, type, MO_8, - a0, tcgv_vec_arg(t1), tcgv_vec_arg(t2)); - tcg_temp_free_vec(t1); - tcg_temp_free_vec(t2); + expand_vec_shi(type, vece, opc =3D=3D INDEX_op_shri_vec, v0, v1, a= 2); break; =20 case INDEX_op_sari_vec: - a1 =3D va_arg(va, TCGArg); - a2 =3D va_arg(va, TCGArg); - if (vece =3D=3D MO_8) { - /* Unpack to W, shift, and repack, as above. 
*/ - t1 =3D tcg_temp_new_vec(type); - t2 =3D tcg_temp_new_vec(type); - vec_gen_3(INDEX_op_x86_punpckl_vec, type, MO_8, - tcgv_vec_arg(t1), a1, a1); - vec_gen_3(INDEX_op_x86_punpckh_vec, type, MO_8, - tcgv_vec_arg(t2), a1, a1); - vec_gen_3(INDEX_op_sari_vec, type, MO_16, - tcgv_vec_arg(t1), tcgv_vec_arg(t1), a2 + 8); - vec_gen_3(INDEX_op_sari_vec, type, MO_16, - tcgv_vec_arg(t2), tcgv_vec_arg(t2), a2 + 8); - vec_gen_3(INDEX_op_x86_packss_vec, type, MO_8, - a0, tcgv_vec_arg(t1), tcgv_vec_arg(t2)); - tcg_temp_free_vec(t1); - tcg_temp_free_vec(t2); - break; - } - tcg_debug_assert(vece =3D=3D MO_64); - /* MO_64: If the shift is <=3D 32, we can emulate the sign extend = by - performing an arithmetic 32-bit shift and overwriting the high - half of the result (note that the ISA says shift of 32 is valid= ). */ - if (a2 <=3D 32) { - t1 =3D tcg_temp_new_vec(type); - vec_gen_3(INDEX_op_sari_vec, type, MO_32, tcgv_vec_arg(t1), a1= , a2); - vec_gen_3(INDEX_op_shri_vec, type, MO_64, a0, a1, a2); - vec_gen_4(INDEX_op_x86_blend_vec, type, MO_32, - a0, a0, tcgv_vec_arg(t1), 0xaa); - tcg_temp_free_vec(t1); - break; - } - /* Otherwise we will need to use a compare vs 0 to produce the - sign-extend, shift and merge. 
*/ - t1 =3D tcg_temp_new_vec(type); - t2 =3D tcg_const_zeros_vec(type); - vec_gen_4(INDEX_op_cmp_vec, type, MO_64, - tcgv_vec_arg(t1), tcgv_vec_arg(t2), a1, TCG_COND_GT); - tcg_temp_free_vec(t2); - vec_gen_3(INDEX_op_shri_vec, type, MO_64, a0, a1, a2); - vec_gen_3(INDEX_op_shli_vec, type, MO_64, - tcgv_vec_arg(t1), tcgv_vec_arg(t1), 64 - a2); - vec_gen_3(INDEX_op_or_vec, type, MO_64, a0, a0, tcgv_vec_arg(t1)); - tcg_temp_free_vec(t1); + expand_vec_sari(type, vece, v0, v1, a2); break; =20 case INDEX_op_mul_vec: - tcg_debug_assert(vece =3D=3D MO_8); - a1 =3D va_arg(va, TCGArg); - a2 =3D va_arg(va, TCGArg); - switch (type) { - case TCG_TYPE_V64: - t1 =3D tcg_temp_new_vec(TCG_TYPE_V128); - t2 =3D tcg_temp_new_vec(TCG_TYPE_V128); - tcg_gen_dup16i_vec(t2, 0); - vec_gen_3(INDEX_op_x86_punpckl_vec, TCG_TYPE_V128, MO_8, - tcgv_vec_arg(t1), a1, tcgv_vec_arg(t2)); - vec_gen_3(INDEX_op_x86_punpckl_vec, TCG_TYPE_V128, MO_8, - tcgv_vec_arg(t2), tcgv_vec_arg(t2), a2); - tcg_gen_mul_vec(MO_16, t1, t1, t2); - tcg_gen_shri_vec(MO_16, t1, t1, 8); - vec_gen_3(INDEX_op_x86_packus_vec, TCG_TYPE_V128, MO_8, - a0, tcgv_vec_arg(t1), tcgv_vec_arg(t1)); - tcg_temp_free_vec(t1); - tcg_temp_free_vec(t2); - break; - - case TCG_TYPE_V128: - t1 =3D tcg_temp_new_vec(TCG_TYPE_V128); - t2 =3D tcg_temp_new_vec(TCG_TYPE_V128); - t3 =3D tcg_temp_new_vec(TCG_TYPE_V128); - t4 =3D tcg_temp_new_vec(TCG_TYPE_V128); - tcg_gen_dup16i_vec(t4, 0); - vec_gen_3(INDEX_op_x86_punpckl_vec, TCG_TYPE_V128, MO_8, - tcgv_vec_arg(t1), a1, tcgv_vec_arg(t4)); - vec_gen_3(INDEX_op_x86_punpckl_vec, TCG_TYPE_V128, MO_8, - tcgv_vec_arg(t2), tcgv_vec_arg(t4), a2); - vec_gen_3(INDEX_op_x86_punpckh_vec, TCG_TYPE_V128, MO_8, - tcgv_vec_arg(t3), a1, tcgv_vec_arg(t4)); - vec_gen_3(INDEX_op_x86_punpckh_vec, TCG_TYPE_V128, MO_8, - tcgv_vec_arg(t4), tcgv_vec_arg(t4), a2); - tcg_gen_mul_vec(MO_16, t1, t1, t2); - tcg_gen_mul_vec(MO_16, t3, t3, t4); - tcg_gen_shri_vec(MO_16, t1, t1, 8); - tcg_gen_shri_vec(MO_16, t3, t3, 8); - 
vec_gen_3(INDEX_op_x86_packus_vec, TCG_TYPE_V128, MO_8, - a0, tcgv_vec_arg(t1), tcgv_vec_arg(t3)); - tcg_temp_free_vec(t1); - tcg_temp_free_vec(t2); - tcg_temp_free_vec(t3); - tcg_temp_free_vec(t4); - break; - - case TCG_TYPE_V256: - t1 =3D tcg_temp_new_vec(TCG_TYPE_V256); - t2 =3D tcg_temp_new_vec(TCG_TYPE_V256); - t3 =3D tcg_temp_new_vec(TCG_TYPE_V256); - t4 =3D tcg_temp_new_vec(TCG_TYPE_V256); - tcg_gen_dup16i_vec(t4, 0); - /* a1: A[0-7] ... D[0-7]; a2: W[0-7] ... Z[0-7] - t1: extends of B[0-7], D[0-7] - t2: extends of X[0-7], Z[0-7] - t3: extends of A[0-7], C[0-7] - t4: extends of W[0-7], Y[0-7]. */ - vec_gen_3(INDEX_op_x86_punpckl_vec, TCG_TYPE_V256, MO_8, - tcgv_vec_arg(t1), a1, tcgv_vec_arg(t4)); - vec_gen_3(INDEX_op_x86_punpckl_vec, TCG_TYPE_V256, MO_8, - tcgv_vec_arg(t2), tcgv_vec_arg(t4), a2); - vec_gen_3(INDEX_op_x86_punpckh_vec, TCG_TYPE_V256, MO_8, - tcgv_vec_arg(t3), a1, tcgv_vec_arg(t4)); - vec_gen_3(INDEX_op_x86_punpckh_vec, TCG_TYPE_V256, MO_8, - tcgv_vec_arg(t4), tcgv_vec_arg(t4), a2); - /* t1: BX DZ; t2: AW CY. */ - tcg_gen_mul_vec(MO_16, t1, t1, t2); - tcg_gen_mul_vec(MO_16, t3, t3, t4); - tcg_gen_shri_vec(MO_16, t1, t1, 8); - tcg_gen_shri_vec(MO_16, t3, t3, 8); - /* a0: AW BX CY DZ. */ - vec_gen_3(INDEX_op_x86_packus_vec, TCG_TYPE_V256, MO_8, - a0, tcgv_vec_arg(t1), tcgv_vec_arg(t3)); - tcg_temp_free_vec(t1); - tcg_temp_free_vec(t2); - tcg_temp_free_vec(t3); - tcg_temp_free_vec(t4); - break; - - default: - g_assert_not_reached(); - } + v2 =3D temp_tcgv_vec(arg_temp(a2)); + expand_vec_mul(type, vece, v0, v1, v2); break; =20 case INDEX_op_cmp_vec: - { - enum { - NEED_SWAP =3D 1, - NEED_INV =3D 2, - NEED_BIAS =3D 4 - }; - static const uint8_t fixups[16] =3D { - [0 ... 
15] =3D -1, - [TCG_COND_EQ] =3D 0, - [TCG_COND_NE] =3D NEED_INV, - [TCG_COND_GT] =3D 0, - [TCG_COND_LT] =3D NEED_SWAP, - [TCG_COND_LE] =3D NEED_INV, - [TCG_COND_GE] =3D NEED_SWAP | NEED_INV, - [TCG_COND_GTU] =3D NEED_BIAS, - [TCG_COND_LTU] =3D NEED_BIAS | NEED_SWAP, - [TCG_COND_LEU] =3D NEED_BIAS | NEED_INV, - [TCG_COND_GEU] =3D NEED_BIAS | NEED_SWAP | NEED_INV, - }; - - TCGCond cond; - uint8_t fixup; - - a1 =3D va_arg(va, TCGArg); - a2 =3D va_arg(va, TCGArg); - cond =3D va_arg(va, TCGArg); - fixup =3D fixups[cond & 15]; - tcg_debug_assert(fixup !=3D 0xff); - - if (fixup & NEED_INV) { - cond =3D tcg_invert_cond(cond); - } - if (fixup & NEED_SWAP) { - TCGArg t; - t =3D a1, a1 =3D a2, a2 =3D t; - cond =3D tcg_swap_cond(cond); - } - - t1 =3D t2 =3D NULL; - if (fixup & NEED_BIAS) { - t1 =3D tcg_temp_new_vec(type); - t2 =3D tcg_temp_new_vec(type); - tcg_gen_dupi_vec(vece, t2, 1ull << ((8 << vece) - 1)); - tcg_gen_sub_vec(vece, t1, temp_tcgv_vec(arg_temp(a1)), t2); - tcg_gen_sub_vec(vece, t2, temp_tcgv_vec(arg_temp(a2)), t2); - a1 =3D tcgv_vec_arg(t1); - a2 =3D tcgv_vec_arg(t2); - cond =3D tcg_signed_cond(cond); - } - - tcg_debug_assert(cond =3D=3D TCG_COND_EQ || cond =3D=3D TCG_CO= ND_GT); - vec_gen_4(INDEX_op_cmp_vec, type, vece, a0, a1, a2, cond); - - if (fixup & NEED_BIAS) { - tcg_temp_free_vec(t1); - tcg_temp_free_vec(t2); - } - if (fixup & NEED_INV) { - tcg_gen_not_vec(vece, v0, v0); - } - } + v2 =3D temp_tcgv_vec(arg_temp(a2)); + expand_vec_cmp(type, vece, v0, v1, v2, va_arg(va, TCGArg)); break; =20 default: --=20 2.17.2 From nobody Sun May 5 14:24:39 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) 
smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from lists.gnu.org (209.51.188.17 [209.51.188.17]) by mx.zohomail.com with SMTPS id 1546641618656679.9211607585012; Fri, 4 Jan 2019 14:40:18 -0800 (PST) Received: from localhost ([127.0.0.1]:58255 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gfY8Q-0004sz-H4 for importer@patchew.org; Fri, 04 Jan 2019 17:40:10 -0500 Received: from eggsout.gnu.org ([209.51.188.92]:55059 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gfY0L-0002yZ-MH for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:50 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gfY0I-00020G-HS for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:49 -0500 Received: from mail-io1-xd42.google.com ([2607:f8b0:4864:20::d42]:36633) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1gfY0I-0001zd-D2 for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:46 -0500 Received: by mail-io1-xd42.google.com with SMTP id m19so30708059ioh.3 for ; Fri, 04 Jan 2019 14:31:46 -0800 (PST) Received: from cloudburst.twiddle.net ([172.56.12.23]) by smtp.gmail.com with ESMTPSA id t6sm27793259ioc.87.2019.01.04.14.31.43 for (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 04 Jan 2019 14:31:44 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references; bh=SN+MeOnTNXBfnXP6XYPFDEbbdv3QTmQaTEHi4q44Zmk=; b=cIYtx4KHUoKmQJ7otCLJr5/C916lAi6j2e/W/uaXgQwde9d0gr9Gluhrw8uhRjSMvE ro+w/8vxDD96TOvmLWJhcvGZmpUJwEqwoEO54P8yHjGy4OPkoa3f89PR7A0rzChA+qv9 TDYWqEEjfTtHro97430vc10zwf0qMEdh7TTL4= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references; 
bh=SN+MeOnTNXBfnXP6XYPFDEbbdv3QTmQaTEHi4q44Zmk=; b=eHQF2+iRDvfx/YauKKMY/XUCp/ZZ5qVv6dXUxN3dsaI1DjkDXYdJg/S6ppdmhtJoNd ufn7DLEZC5mB9gPhmnoXkMvTpghieDfyS5vI0zznT9jeAG3kV64i9VBu0WSnZFn5Mx+M YHtvRLOseL7eDQDDBnkY+d/2tPaUSuC018k/LbFZcSoknLkRd2QoOKfnm47Eaqqr83tW dR2cRku5gc9GJo6fIScSrHOF//fTH69xNTKvXPXwJw+Jy0icfyRgdaZd2Eis3ANZ4txn CIDKE14Qc3fFRrHK63313D9vpyRl4uUrCNEkCRPAaPR6QC50CVdtTCCq0xCugYNXjd6l OjkQ== X-Gm-Message-State: AJcUukfZg+EmB2jjN26ne0VRUrGj3DbCEzACNiiTFIwnDipeY2Vehob0 ryOF/nrhLsc3f6qwp0bhh5wQlFK2FMw= X-Google-Smtp-Source: ALg8bN5vjB79EYFyXeilGhH0Z3r2SXOHxkLIHQfhqyj7i/90CztvrBQze18t+OeR0P0idaegCStxgQ== X-Received: by 2002:a6b:700a:: with SMTP id l10mr10152331ioc.138.1546641105407; Fri, 04 Jan 2019 14:31:45 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 5 Jan 2019 08:31:13 +1000 Message-Id: <20190104223116.14037-8-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20190104223116.14037-1-richard.henderson@linaro.org> References: <20190104223116.14037-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::d42 Subject: [Qemu-devel] [PATCH v2 07/10] tcg/i386: Implement vector saturating arithmetic X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Only MO_8 and MO_16 are implemented, since that's all the instruction set provides. 
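As a reference for what the new opcodes are expected to compute, here is a scalar sketch of one MO_8 lane of each operation. It only illustrates saturating semantics and is not code from this series. The function names are hypothetical, and the MO_16 variants behave the same way with 16-bit bounds.

```c
#include <stdint.h>

/* Clamp a wider intermediate result to the signed 8-bit range. */
static int8_t clamp_s8(int v)
{
    return v > INT8_MAX ? INT8_MAX : v < INT8_MIN ? INT8_MIN : (int8_t)v;
}

/* Clamp a wider intermediate result to the unsigned 8-bit range. */
static uint8_t clamp_u8(int v)
{
    return v > UINT8_MAX ? UINT8_MAX : v < 0 ? 0 : (uint8_t)v;
}

/* Hypothetical lane models: ssadd/sssub saturate at -128 and 127,
 * usadd/ussub saturate at 0 and 255. */
static int8_t ssadd8(int8_t a, int8_t b)    { return clamp_s8(a + b); }
static int8_t sssub8(int8_t a, int8_t b)    { return clamp_s8(a - b); }
static uint8_t usadd8(uint8_t a, uint8_t b) { return clamp_u8(a + b); }
static uint8_t ussub8(uint8_t a, uint8_t b) { return clamp_u8(a - b); }
```

The signed and unsigned forms differ only in the clamp bounds, which is why each of the four TCG opcodes needs its own instruction table rather than sharing one.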
Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.h | 2 +- tcg/i386/tcg-target.inc.c | 42 +++++++++++++++++++++++++++++++++++++++ 2 files changed, 43 insertions(+), 1 deletion(-) diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h index 7bd7eae672..efbd5a6fc9 100644 --- a/tcg/i386/tcg-target.h +++ b/tcg/i386/tcg-target.h @@ -185,7 +185,7 @@ extern bool have_avx2; #define TCG_TARGET_HAS_shv_vec 0 #define TCG_TARGET_HAS_cmp_vec 1 #define TCG_TARGET_HAS_mul_vec 1 -#define TCG_TARGET_HAS_sat_vec 0 +#define TCG_TARGET_HAS_sat_vec 1 #define TCG_TARGET_HAS_minmax_vec 0 =20 #define TCG_TARGET_deposit_i32_valid(ofs, len) \ diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c index ad97386d06..feec40a412 100644 --- a/tcg/i386/tcg-target.inc.c +++ b/tcg/i386/tcg-target.inc.c @@ -377,6 +377,10 @@ static inline int tcg_target_const_match(tcg_target_lo= ng val, TCGType type, #define OPC_PADDW (0xfd | P_EXT | P_DATA16) #define OPC_PADDD (0xfe | P_EXT | P_DATA16) #define OPC_PADDQ (0xd4 | P_EXT | P_DATA16) +#define OPC_PADDSB (0xec | P_EXT | P_DATA16) +#define OPC_PADDSW (0xed | P_EXT | P_DATA16) +#define OPC_PADDUB (0xdc | P_EXT | P_DATA16) +#define OPC_PADDUW (0xdd | P_EXT | P_DATA16) #define OPC_PAND (0xdb | P_EXT | P_DATA16) #define OPC_PANDN (0xdf | P_EXT | P_DATA16) #define OPC_PBLENDW (0x0e | P_EXT3A | P_DATA16) @@ -408,6 +412,10 @@ static inline int tcg_target_const_match(tcg_target_lo= ng val, TCGType type, #define OPC_PSUBW (0xf9 | P_EXT | P_DATA16) #define OPC_PSUBD (0xfa | P_EXT | P_DATA16) #define OPC_PSUBQ (0xfb | P_EXT | P_DATA16) +#define OPC_PSUBSB (0xe8 | P_EXT | P_DATA16) +#define OPC_PSUBSW (0xe9 | P_EXT | P_DATA16) +#define OPC_PSUBUB (0xd8 | P_EXT | P_DATA16) +#define OPC_PSUBUW (0xd9 | P_EXT | P_DATA16) #define OPC_PUNPCKLBW (0x60 | P_EXT | P_DATA16) #define OPC_PUNPCKLWD (0x61 | P_EXT | P_DATA16) #define OPC_PUNPCKLDQ (0x62 | P_EXT | P_DATA16) @@ -2591,9 +2599,21 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode = opc, 
static int const add_insn[4] =3D { OPC_PADDB, OPC_PADDW, OPC_PADDD, OPC_PADDQ }; + static int const ssadd_insn[4] =3D { + OPC_PADDSB, OPC_PADDSW, OPC_UD2, OPC_UD2 + }; + static int const usadd_insn[4] =3D { + OPC_PADDUB, OPC_PADDUW, OPC_UD2, OPC_UD2 + }; static int const sub_insn[4] =3D { OPC_PSUBB, OPC_PSUBW, OPC_PSUBD, OPC_PSUBQ }; + static int const sssub_insn[4] =3D { + OPC_PSUBSB, OPC_PSUBSW, OPC_UD2, OPC_UD2 + }; + static int const ussub_insn[4] =3D { + OPC_PSUBUB, OPC_PSUBUW, OPC_UD2, OPC_UD2 + }; static int const mul_insn[4] =3D { OPC_UD2, OPC_PMULLW, OPC_PMULLD, OPC_UD2 }; @@ -2631,9 +2651,21 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode = opc, case INDEX_op_add_vec: insn =3D add_insn[vece]; goto gen_simd; + case INDEX_op_ssadd_vec: + insn =3D ssadd_insn[vece]; + goto gen_simd; + case INDEX_op_usadd_vec: + insn =3D usadd_insn[vece]; + goto gen_simd; case INDEX_op_sub_vec: insn =3D sub_insn[vece]; goto gen_simd; + case INDEX_op_sssub_vec: + insn =3D sssub_insn[vece]; + goto gen_simd; + case INDEX_op_ussub_vec: + insn =3D ussub_insn[vece]; + goto gen_simd; case INDEX_op_mul_vec: insn =3D mul_insn[vece]; goto gen_simd; @@ -3007,6 +3039,10 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOp= code op) case INDEX_op_or_vec: case INDEX_op_xor_vec: case INDEX_op_andc_vec: + case INDEX_op_ssadd_vec: + case INDEX_op_usadd_vec: + case INDEX_op_sssub_vec: + case INDEX_op_ussub_vec: case INDEX_op_cmp_vec: case INDEX_op_x86_shufps_vec: case INDEX_op_x86_blend_vec: @@ -3074,6 +3110,12 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type,= unsigned vece) } return 1; =20 + case INDEX_op_ssadd_vec: + case INDEX_op_usadd_vec: + case INDEX_op_sssub_vec: + case INDEX_op_ussub_vec: + return vece <=3D MO_16; + default: return 0; } --=20 2.17.2 From nobody Sun May 5 14:24:39 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17;
envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=listsout.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: Received: from listsout.gnu.org (listsout.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1546641732252675.6300811672824; Fri, 4 Jan 2019 14:42:12 -0800 (PST) Received: from localhost ([127.0.0.1]:58729 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gfYAJ-0007OJ-UK for importer@patchew.org; Fri, 04 Jan 2019 17:42:07 -0500 Received: from eggsout.gnu.org ([209.51.188.92]:55066 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gfY0M-0002yf-7m for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:51 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gfY0L-00026o-2y for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:50 -0500 Received: from mail-io1-xd41.google.com ([2607:f8b0:4864:20::d41]:41916) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1gfY0K-00024L-Tw for qemu-devel@nongnu.org; Fri, 04 Jan 2019 17:31:49 -0500 Received: by mail-io1-xd41.google.com with SMTP id s22so30699537ioc.8 for ; Fri, 04 Jan 2019 14:31:48 -0800 (PST) Received: from cloudburst.twiddle.net ([172.56.12.23]) by smtp.gmail.com with ESMTPSA id t6sm27793259ioc.87.2019.01.04.14.31.45 for (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 04 Jan 2019 14:31:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:subject:date:message-id:in-reply-to:references; bh=hAwJsayMvDe4O8fXPQhl9uUYFAm/hD4Vqu6cubinGYk=; b=dzVRsv61gb/RrEUgFVJ2EdjqU6w8PMeXiQafEDjbqNB8BxjoFbIIscKPovPAKX9HOE 
gyF25eG1GSoHYqME0sbImL/zd2hB8fkt5aC9JJRAtWHuX82fzLsHHG4RJVS/9JWxhvya xXaGY3pi14BUilxRklo+HakxjOS/jX2dtFEJg= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references; bh=hAwJsayMvDe4O8fXPQhl9uUYFAm/hD4Vqu6cubinGYk=; b=cuoUWbDpuHA43PfHTXZGr7guBmnvW5JbIxJ3XqLtA654b/PQWkUp0HyYAgC/AGPUWs hwXnGGGf47QNyJGLDR6er1xGYzggqjdAKsNNG72RyDRu3Ixc51eTEXYqeO2yHpC+JIAG H29GeX0NF71rctoPc9NMpRXfmuPo1ikKI87C1gWHba1nkhBV5pa6fsHPB3bN87LIxhAT Np3yde1lF/TNQ6dsxUMZ3zky1GCbn0PgeQKGP3qzK5vzOqHPvmM5H4RnVTqPeA1CJwil FRVpqE2OYeeaVHgqcR/NVQJUUUIpDjkKpVuMz/NEheX5J40H3VWXl/M7AV1jcC4ExZbg +6sw== X-Gm-Message-State: AJcUukchpLjWSFV9VwC3Z25UEYC8FYXA8H+cQ5zUPKiyDr8X/9QATvCF I1U32FqyZCMsMWBrIER5WiQ30KmzvwU= X-Google-Smtp-Source: ALg8bN59BdTX0EOMXbAaR0r3QT3ZtDSfVxh4J6TYNN6d7jw0SOkY0VJA6gJQ6GhagWloWKvbqnvmMg== X-Received: by 2002:a6b:5902:: with SMTP id n2mr17552308iob.16.1546641107899; Fri, 04 Jan 2019 14:31:47 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 5 Jan 2019 08:31:14 +1000 Message-Id: <20190104223116.14037-9-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.2 In-Reply-To: <20190104223116.14037-1-richard.henderson@linaro.org> References: <20190104223116.14037-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::d41 Subject: [Qemu-devel] [PATCH v2 08/10] tcg/i386: Implement vector minmax arithmetic X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" The avx instruction set does not directly provide MO_64. 
We can still implement 64-bit with comparison and vpblendvb. Signed-off-by: Richard Henderson --- tcg/i386/tcg-target.h | 2 +- tcg/i386/tcg-target.inc.c | 81 +++++++++++++++++++++++++++++++++++++++ 2 files changed, 82 insertions(+), 1 deletion(-) diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h index efbd5a6fc9..7995fe3eab 100644 --- a/tcg/i386/tcg-target.h +++ b/tcg/i386/tcg-target.h @@ -186,7 +186,7 @@ extern bool have_avx2; #define TCG_TARGET_HAS_cmp_vec 1 #define TCG_TARGET_HAS_mul_vec 1 #define TCG_TARGET_HAS_sat_vec 1 -#define TCG_TARGET_HAS_minmax_vec 0 +#define TCG_TARGET_HAS_minmax_vec 1 =20 #define TCG_TARGET_deposit_i32_valid(ofs, len) \ (((ofs) =3D=3D 0 && (len) =3D=3D 8) || ((ofs) =3D=3D 8 && (len) =3D=3D= 8) || \ diff --git a/tcg/i386/tcg-target.inc.c b/tcg/i386/tcg-target.inc.c index feec40a412..94007c7aa5 100644 --- a/tcg/i386/tcg-target.inc.c +++ b/tcg/i386/tcg-target.inc.c @@ -392,6 +392,18 @@ static inline int tcg_target_const_match(tcg_target_lo= ng val, TCGType type, #define OPC_PCMPGTW (0x65 | P_EXT | P_DATA16) #define OPC_PCMPGTD (0x66 | P_EXT | P_DATA16) #define OPC_PCMPGTQ (0x37 | P_EXT38 | P_DATA16) +#define OPC_PMAXSB (0x3c | P_EXT38 | P_DATA16) +#define OPC_PMAXSW (0xee | P_EXT | P_DATA16) +#define OPC_PMAXSD (0x3d | P_EXT38 | P_DATA16) +#define OPC_PMAXUB (0xde | P_EXT | P_DATA16) +#define OPC_PMAXUW (0x3e | P_EXT38 | P_DATA16) +#define OPC_PMAXUD (0x3f | P_EXT38 | P_DATA16) +#define OPC_PMINSB (0x38 | P_EXT38 | P_DATA16) +#define OPC_PMINSW (0xea | P_EXT | P_DATA16) +#define OPC_PMINSD (0x39 | P_EXT38 | P_DATA16) +#define OPC_PMINUB (0xda | P_EXT | P_DATA16) +#define OPC_PMINUW (0x3a | P_EXT38 | P_DATA16) +#define OPC_PMINUD (0x3b | P_EXT38 | P_DATA16) #define OPC_PMOVSXBW (0x20 | P_EXT38 | P_DATA16) #define OPC_PMOVSXWD (0x23 | P_EXT38 | P_DATA16) #define OPC_PMOVSXDQ (0x25 | P_EXT38 | P_DATA16) @@ -2638,6 +2650,18 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode = opc, static int const packus_insn[4] =3D { 
OPC_PACKUSWB, OPC_PACKUSDW, OPC_UD2, OPC_UD2 }; + static int const smin_insn[4] =3D { + OPC_PMINSB, OPC_PMINSW, OPC_PMINSD, OPC_UD2 + }; + static int const smax_insn[4] =3D { + OPC_PMAXSB, OPC_PMAXSW, OPC_PMAXSD, OPC_UD2 + }; + static int const umin_insn[4] =3D { + OPC_PMINUB, OPC_PMINUW, OPC_PMINUD, OPC_UD2 + }; + static int const umax_insn[4] =3D { + OPC_PMAXUB, OPC_PMAXUW, OPC_PMAXUD, OPC_UD2 + }; =20 TCGType type =3D vecl + TCG_TYPE_V64; int insn, sub; @@ -2678,6 +2702,18 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode = opc, case INDEX_op_xor_vec: insn =3D OPC_PXOR; goto gen_simd; + case INDEX_op_smin_vec: + insn =3D smin_insn[vece]; + goto gen_simd; + case INDEX_op_umin_vec: + insn =3D umin_insn[vece]; + goto gen_simd; + case INDEX_op_smax_vec: + insn =3D smax_insn[vece]; + goto gen_simd; + case INDEX_op_umax_vec: + insn =3D umax_insn[vece]; + goto gen_simd; case INDEX_op_x86_punpckl_vec: insn =3D punpckl_insn[vece]; goto gen_simd; @@ -3043,6 +3079,10 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOp= code op) case INDEX_op_usadd_vec: case INDEX_op_sssub_vec: case INDEX_op_ussub_vec: + case INDEX_op_smin_vec: + case INDEX_op_umin_vec: + case INDEX_op_smax_vec: + case INDEX_op_umax_vec: case INDEX_op_cmp_vec: case INDEX_op_x86_shufps_vec: case INDEX_op_x86_blend_vec: @@ -3115,6 +3155,11 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type,= unsigned vece) case INDEX_op_sssub_vec: case INDEX_op_ussub_vec: return vece <=3D MO_16; + case INDEX_op_smin_vec: + case INDEX_op_smax_vec: + case INDEX_op_umin_vec: + case INDEX_op_umax_vec: + return vece <=3D MO_32 ? 
1 : -1; =20 default: return 0; @@ -3343,6 +3388,25 @@ static void expand_vec_cmp(TCGType type, unsigned ve= ce, TCGv_vec v0, } } =20 +static void expand_vec_minmax(TCGType type, unsigned vece, + TCGCond cond, bool min, + TCGv_vec v0, TCGv_vec v1, TCGv_vec v2) +{ + TCGv_vec t1 =3D tcg_temp_new_vec(type); + + tcg_debug_assert(vece =3D=3D MO_64); + + tcg_gen_cmp_vec(cond, vece, t1, v1, v2); + if (min) { + TCGv_vec t2; + t2 =3D v1, v1 =3D v2, v2 =3D t2; + } + vec_gen_4(INDEX_op_x86_vpblendvb_vec, type, vece, + tcgv_vec_arg(v0), tcgv_vec_arg(v1), + tcgv_vec_arg(v2), tcgv_vec_arg(t1)); + tcg_temp_free_vec(t1); +} + void tcg_expand_vec_op(TCGOpcode opc, TCGType type, unsigned vece, TCGArg a0, ...) { @@ -3375,6 +3439,23 @@ void tcg_expand_vec_op(TCGOpcode opc, TCGType type, = unsigned vece, expand_vec_cmp(type, vece, v0, v1, v2, va_arg(va, TCGArg)); break; =20 + case INDEX_op_smin_vec: + v2 =3D temp_tcgv_vec(arg_temp(a2)); + expand_vec_minmax(type, vece, TCG_COND_GT, true, v0, v1, v2); + break; + case INDEX_op_smax_vec: + v2 =3D temp_tcgv_vec(arg_temp(a2)); + expand_vec_minmax(type, vece, TCG_COND_GT, false, v0, v1, v2); + break; + case INDEX_op_umin_vec: + v2 =3D temp_tcgv_vec(arg_temp(a2)); + expand_vec_minmax(type, vece, TCG_COND_GTU, true, v0, v1, v2); + break; + case INDEX_op_umax_vec: + v2 =3D temp_tcgv_vec(arg_temp(a2)); + expand_vec_minmax(type, vece, TCG_COND_GTU, false, v0, v1, v2); + break; + default: break; } --=20 2.17.2 From nobody Sun May 5 14:24:39 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=listsout.gnu.org; Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org Return-Path: 
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 5 Jan 2019 08:31:15 +1000
Message-Id: <20190104223116.14037-10-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.17.2
In-Reply-To: <20190104223116.14037-1-richard.henderson@linaro.org>
References: <20190104223116.14037-1-richard.henderson@linaro.org>
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 2607:f8b0:4864:20::d43
Subject: [Qemu-devel] [PATCH v2 09/10] tcg/aarch64: Implement vector saturating arithmetic
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail-DKIM: fail (Header signature does not verify)
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Signed-off-by: Richard Henderson
---
 tcg/aarch64/tcg-target.h     |  2 +-
 tcg/aarch64/tcg-target.inc.c | 24 ++++++++++++++++++++++++
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
index 545a6eec75..a1884543d0 100644
--- a/tcg/aarch64/tcg-target.h
+++ b/tcg/aarch64/tcg-target.h
@@ -135,7 +135,7 @@ typedef enum {
 #define TCG_TARGET_HAS_shv_vec 0
 #define TCG_TARGET_HAS_cmp_vec 1
 #define TCG_TARGET_HAS_mul_vec 1
-#define TCG_TARGET_HAS_sat_vec 0
+#define TCG_TARGET_HAS_sat_vec 1
 #define TCG_TARGET_HAS_minmax_vec 0
 
 #define TCG_TARGET_DEFAULT_MO (0)
diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c
index 0562e0aa40..b2b011f130 100644
--- a/tcg/aarch64/tcg-target.inc.c
+++ b/tcg/aarch64/tcg-target.inc.c
@@ -528,6 +528,10 @@ typedef enum {
     I3616_CMHI = 0x2e203400,
     I3616_CMHS = 0x2e203c00,
     I3616_CMEQ = 0x2e208c00,
+    I3616_SQADD = 0x0e200c00,
+    I3616_SQSUB = 0x0e202c00,
+    I3616_UQADD = 0x2e200c00,
+    I3616_UQSUB = 0x2e202c00,
 
     /* AdvSIMD two-reg misc. */
     I3617_CMGT0 = 0x0e208800,
@@ -2137,6 +2141,18 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
     case INDEX_op_orc_vec:
         tcg_out_insn(s, 3616, ORN, is_q, 0, a0, a1, a2);
         break;
+    case INDEX_op_ssadd_vec:
+        tcg_out_insn(s, 3616, SQADD, is_q, vece, a0, a1, a2);
+        break;
+    case INDEX_op_sssub_vec:
+        tcg_out_insn(s, 3616, SQSUB, is_q, vece, a0, a1, a2);
+        break;
+    case INDEX_op_usadd_vec:
+        tcg_out_insn(s, 3616, UQADD, is_q, vece, a0, a1, a2);
+        break;
+    case INDEX_op_ussub_vec:
+        tcg_out_insn(s, 3616, UQSUB, is_q, vece, a0, a1, a2);
+        break;
     case INDEX_op_not_vec:
         tcg_out_insn(s, 3617, NOT, is_q, 0, a0, a1);
         break;
@@ -2207,6 +2223,10 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece)
     case INDEX_op_shli_vec:
     case INDEX_op_shri_vec:
     case INDEX_op_sari_vec:
+    case INDEX_op_ssadd_vec:
+    case INDEX_op_sssub_vec:
+    case INDEX_op_usadd_vec:
+    case INDEX_op_ussub_vec:
         return 1;
     case INDEX_op_mul_vec:
         return vece < MO_64;
@@ -2386,6 +2406,10 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op)
     case INDEX_op_xor_vec:
     case INDEX_op_andc_vec:
     case INDEX_op_orc_vec:
+    case INDEX_op_ssadd_vec:
+    case INDEX_op_sssub_vec:
+    case INDEX_op_usadd_vec:
+    case INDEX_op_ussub_vec:
         return &w_w_w;
     case INDEX_op_not_vec:
     case INDEX_op_neg_vec:
-- 
2.17.2

From nobody Sun May 5 14:24:39 2024
Delivered-To: importer@patchew.org
Received-SPF: pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) client-ip=208.118.235.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=listsout.gnu.org;
Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zoho.com: domain of gnu.org designates 208.118.235.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org
Return-Path:
Received: from listsout.gnu.org (listsout.gnu.org [208.118.235.17]) by mx.zohomail.com with SMTPS id 1546641928020346.20462338158677; Fri, 4 Jan 2019
 14:45:28 -0800 (PST)
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 5 Jan 2019 08:31:16 +1000
Message-Id: <20190104223116.14037-11-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.17.2
In-Reply-To: <20190104223116.14037-1-richard.henderson@linaro.org>
References: <20190104223116.14037-1-richard.henderson@linaro.org>
X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 2607:f8b0:4864:20::d42
Subject: [Qemu-devel] [PATCH v2 10/10] tcg/aarch64: Implement vector minmax arithmetic
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: "Qemu-devel"
X-ZohoMail-DKIM: fail (Header signature does not verify)
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Signed-off-by: Richard Henderson
---
 tcg/aarch64/tcg-target.h     |  2 +-
 tcg/aarch64/tcg-target.inc.c | 24 ++++++++++++++++++++++++
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
index a1884543d0..2d93cf404e 100644
--- a/tcg/aarch64/tcg-target.h
+++ b/tcg/aarch64/tcg-target.h
@@ -136,7 +136,7 @@ typedef enum {
 #define TCG_TARGET_HAS_cmp_vec 1
 #define TCG_TARGET_HAS_mul_vec 1
 #define TCG_TARGET_HAS_sat_vec 1
-#define TCG_TARGET_HAS_minmax_vec 0
+#define TCG_TARGET_HAS_minmax_vec 1
 
 #define TCG_TARGET_DEFAULT_MO (0)
 #define TCG_TARGET_HAS_MEMORY_BSWAP 1
diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c
index b2b011f130..ee0d5819af 100644
--- a/tcg/aarch64/tcg-target.inc.c
+++ b/tcg/aarch64/tcg-target.inc.c
@@ -528,8 +528,12 @@ typedef enum {
     I3616_CMHI = 0x2e203400,
     I3616_CMHS = 0x2e203c00,
     I3616_CMEQ = 0x2e208c00,
+    I3616_SMAX = 0x0e206400,
+    I3616_SMIN = 0x0e206c00,
     I3616_SQADD = 0x0e200c00,
     I3616_SQSUB = 0x0e202c00,
+    I3616_UMAX = 0x2e206400,
+    I3616_UMIN = 0x2e206c00,
     I3616_UQADD = 0x2e200c00,
     I3616_UQSUB = 0x2e202c00,
 
@@ -2153,6 +2157,18 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
     case INDEX_op_ussub_vec:
         tcg_out_insn(s, 3616, UQSUB, is_q, vece, a0, a1, a2);
         break;
+    case INDEX_op_smax_vec:
+        tcg_out_insn(s, 3616, SMAX, is_q, vece, a0, a1, a2);
+        break;
+    case INDEX_op_smin_vec:
+        tcg_out_insn(s, 3616, SMIN, is_q, vece, a0, a1, a2);
+        break;
+    case INDEX_op_umax_vec:
+        tcg_out_insn(s, 3616, UMAX, is_q, vece, a0, a1, a2);
+        break;
+    case INDEX_op_umin_vec:
+        tcg_out_insn(s, 3616, UMIN, is_q, vece, a0, a1, a2);
+        break;
     case INDEX_op_not_vec:
         tcg_out_insn(s, 3617, NOT, is_q, 0, a0, a1);
         break;
@@ -2227,6 +2243,10 @@ int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece)
     case INDEX_op_sssub_vec:
     case INDEX_op_usadd_vec:
     case INDEX_op_ussub_vec:
+    case INDEX_op_smax_vec:
+    case INDEX_op_smin_vec:
+    case INDEX_op_umax_vec:
+    case INDEX_op_umin_vec:
         return 1;
     case INDEX_op_mul_vec:
         return vece < MO_64;
@@ -2410,6 +2430,10 @@ static const TCGTargetOpDef *tcg_target_op_def(TCGOpcode op)
     case INDEX_op_sssub_vec:
     case INDEX_op_usadd_vec:
     case INDEX_op_ussub_vec:
+    case INDEX_op_smax_vec:
+    case INDEX_op_smin_vec:
+    case INDEX_op_umax_vec:
+    case INDEX_op_umin_vec:
         return &w_w_w;
     case INDEX_op_not_vec:
     case INDEX_op_neg_vec:
-- 
2.17.2