From nobody Fri Nov 14 13:42:48 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=linaro.org ARC-Seal: i=1; a=rsa-sha256; t=1588620326; cv=none; d=zohomail.com; s=zohoarc; b=F1VzdgwZnq4zhC4AE+VgADzfbAQWKggHQMISiIpX/NzDNqr1C+eBUqyNdgbXr2C+do3cOGh4+zBT54K3TXJC5bQqQF2+wmr69FOWtJX70EJj9j5f+G2GRdvDfptScgl6ZRCLugKowOEnHfC4z0atPjoPsDgbxcEHzK4F9y4U3t8= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1588620326; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=46o/ONBElt4xz1c1CIj6X0D90fesls/AxPAELovCv6M=; b=UgfIriNgNCTJBoL/9hVvYlmhQdoKgtv0it7/xKgZl+F6iNmdzQDWeMgwDuCLizVxSKtF7GNw1dXSrJyOSa0glLfsYNJcXqImBtWQkAgBY+x0KSsbFXBNoA4ybbuTrkAYYIvxLlpr1H/t6KEDhNVECpMTAY6zM5ByN/scMop/Uzg= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=fail; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail header.from= (p=none dis=none) header.from= Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1588620326827418.0135226248059; Mon, 4 May 2020 12:25:26 -0700 (PDT) Received: from localhost ([::1]:42774 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jVgiT-0003CA-3b for importer@patchew.org; Mon, 04 May 2020 15:25:25 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:46232) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jVggy-0001VC-QR for qemu-devel@nongnu.org; Mon, 04 May 2020 15:23:52 -0400 Received: from 
mail-pj1-x1042.google.com ([2607:f8b0:4864:20::1042]:36167) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jVggw-0005yB-G9 for qemu-devel@nongnu.org; Mon, 04 May 2020 15:23:52 -0400 Received: by mail-pj1-x1042.google.com with SMTP id a31so383204pje.1 for ; Mon, 04 May 2020 12:23:49 -0700 (PDT) Received: from localhost.localdomain (174-21-149-226.tukw.qwest.net. [174.21.149.226]) by smtp.gmail.com with ESMTPSA id b9sm9407364pfp.12.2020.05.04.12.23.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 04 May 2020 12:23:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=46o/ONBElt4xz1c1CIj6X0D90fesls/AxPAELovCv6M=; b=BoH+IGKo0kFLitkk3gtUKKCQphwVNE8e+1PZu2YW8pzC5o1YI8t7LbKu0uaioLl9LG +cvNx+57ewFVtTgo1WS+h0+BDZ2kQS1FK1EQnKgdkZ8DhS77Hr4Efn14FhqhVlVzEzEu rZAweWTNqLs1yBknvz2nz04Obb0NK8t2kklPWAbDN/j/Rv5a3Gn2saVB06plzQCUlph2 YfeEHA/3igX6CEQSeF02V4EWDicWm30bY1M1dZ+Y00neo4/I6mh8bu9wI1sAWfIXe3mz bzbzliy2NyRdPh4S67nRFETHEZNeKuCvG9W4e7n/Z/VEaSH6O2QdioraJ5HHvi8FQCnj O3XQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=46o/ONBElt4xz1c1CIj6X0D90fesls/AxPAELovCv6M=; b=JeGnwERIRi4pa7SqCnMGTR4e+baNxhISq58fHUJx02O2NCKWNiGrGS9JNSt90+jQxu hBzTC5CWlgyJAEHS18L1V3mLqu2GxGgvN6Hx9cgUX8u6f3rmgXKQBCy1+oZxhJMAqkuW rm+FnNN1XlG+p0tkArSGjVDvlVSGGb91QkBCVrWC9itmtQkqrjEQ+UwQ9bdJqy3fBK7W vPUoaR9W/RU4eIIccRQIaOLu7ekH7X3dcXGLROyfOG8PnV6csTRnI7B2NObluRKgoRB9 hO8Pbut2otiQ2kTuy9+R0TyQArY43UBrkkTLWd6Tjeys2c5OqkGqKh4gXGBhNEdcbpvY fVIg== X-Gm-Message-State: AGi0PuaDXKTQNkabXPfdrMQdL1W1CLm8UQxN/StrajLz1zUz/qyuyG59 6SxPsh7zzeH+F5tyzbHuVJuDpFYonb8= X-Google-Smtp-Source: 
APiQypLdsZAZ1iubi2VOYaaJs12085vFrtdIGhHqXIgxG0VXh/hWHa3IgbspeTIUgd/eVlR9XDQscw== X-Received: by 2002:a17:90a:d711:: with SMTP id y17mr567882pju.11.1588620227554; Mon, 04 May 2020 12:23:47 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 1/3] target/arm: Use tcg_gen_gvec_5_ptr for sve FMLA/FCMLA Date: Mon, 4 May 2020 12:23:42 -0700 Message-Id: <20200504192344.13404-2-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200504192344.13404-1-richard.henderson@linaro.org> References: <20200504192344.13404-1-richard.henderson@linaro.org> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2607:f8b0:4864:20::1042; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1042.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, Taylor Simpson , =?UTF-8?q?Alex=20Benn=C3=A9e?= Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: fail (Header signature does not verify) Now that we can pass 7 parameters, do not encode register operands within simd_data. 
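For readers unfamiliar with the encoding being removed, the following is a minimal sketch of how four 5-bit register numbers fit into a single 32-bit data word, mirroring the deposit32/extract32 sequence deleted by this patch. The sketch_* names are hypothetical stand-ins written for illustration only; they are not QEMU's bitops, and the simd_desc() wrapping and SIMD_DATA_SHIFT offset are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-in for a bit-field extract; not QEMU's extract32(). */
static uint32_t sketch_extract32(uint32_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length < 32 && start + length <= 32);
    return (value >> start) & ((1u << length) - 1);
}

/* Illustrative stand-in for a bit-field insert; not QEMU's deposit32(). */
static uint32_t sketch_deposit32(uint32_t value, int start, int length,
                                 uint32_t field)
{
    assert(start >= 0 && length > 0 && length < 32 && start + length <= 32);
    uint32_t mask = ((1u << length) - 1) << start;
    return (value & ~mask) | ((field << start) & mask);
}

/*
 * Old scheme (assumed layout, taken from the removed lines):
 * rd in bits [4:0], rn in [9:5], rm in [14:10], ra in [19:15].
 */
static uint32_t sketch_pack_regs(unsigned rd, unsigned rn,
                                 unsigned rm, unsigned ra)
{
    uint32_t data = sketch_deposit32(rd, 5, 5, rn);
    data = sketch_deposit32(data, 10, 5, rm);
    data = sketch_deposit32(data, 15, 5, ra);
    return data;
}
```

With the register operands passed as pointers instead, the data field no longer carries register numbers: the diff passes 0 for FMLA and a->rot for FCMLA.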
Reviewed-by: Alex Benn=C3=A9e Reviewed-by: Taylor Simpson Signed-off-by: Richard Henderson --- v2: Remove gen_helper_sve_fmla typedef (phil). --- target/arm/helper-sve.h | 45 +++++++---- target/arm/sve_helper.c | 157 ++++++++++++++----------------------- target/arm/translate-sve.c | 70 ++++++----------- 3 files changed, 114 insertions(+), 158 deletions(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 2f47279155..7a200755ac 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -1099,25 +1099,40 @@ DEF_HELPER_FLAGS_6(sve_fcadd_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_6(sve_fcadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) =20 -DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) -DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) -DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_7(sve_fmla_zpzzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_7(sve_fmla_zpzzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_7(sve_fmla_zpzzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, ptr, i32) =20 -DEF_HELPER_FLAGS_3(sve_fmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) -DEF_HELPER_FLAGS_3(sve_fmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) -DEF_HELPER_FLAGS_3(sve_fmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_7(sve_fmls_zpzzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_7(sve_fmls_zpzzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_7(sve_fmls_zpzzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, ptr, i32) =20 -DEF_HELPER_FLAGS_3(sve_fnmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) -DEF_HELPER_FLAGS_3(sve_fnmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) -DEF_HELPER_FLAGS_3(sve_fnmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) 
+DEF_HELPER_FLAGS_7(sve_fnmla_zpzzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_7(sve_fnmla_zpzzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_7(sve_fnmla_zpzzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, ptr, i32) =20 -DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) -DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) -DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_7(sve_fnmls_zpzzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_7(sve_fnmls_zpzzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_7(sve_fnmls_zpzzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, ptr, i32) =20 -DEF_HELPER_FLAGS_3(sve_fcmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) -DEF_HELPER_FLAGS_3(sve_fcmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) -DEF_HELPER_FLAGS_3(sve_fcmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_7(sve_fcmla_zpzzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_7(sve_fcmla_zpzzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_7(sve_fcmla_zpzzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, ptr, i32) =20 DEF_HELPER_FLAGS_5(sve_ftmad_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr,= i32) DEF_HELPER_FLAGS_5(sve_ftmad_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr,= i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index fdfa652094..33b5a54a47 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3372,23 +3372,11 @@ DO_ZPZ_FP(sve_ucvt_dd, uint64_t, , uint64_to_fl= oat64) =20 #undef DO_ZPZ_FP =20 -/* 4-operand predicated multiply-add. This requires 7 operands to pass - * "properly", so we need to encode some of the registers into DESC. 
- */ -QEMU_BUILD_BUG_ON(SIMD_DATA_SHIFT + 20 > 32); - -static void do_fmla_zpzzz_h(CPUARMState *env, void *vg, uint32_t desc, +static void do_fmla_zpzzz_h(void *vd, void *vn, void *vm, void *va, void *= vg, + float_status *status, uint32_t desc, uint16_t neg1, uint16_t neg3) { intptr_t i =3D simd_oprsz(desc); - unsigned rd =3D extract32(desc, SIMD_DATA_SHIFT, 5); - unsigned rn =3D extract32(desc, SIMD_DATA_SHIFT + 5, 5); - unsigned rm =3D extract32(desc, SIMD_DATA_SHIFT + 10, 5); - unsigned ra =3D extract32(desc, SIMD_DATA_SHIFT + 15, 5); - void *vd =3D &env->vfp.zregs[rd]; - void *vn =3D &env->vfp.zregs[rn]; - void *vm =3D &env->vfp.zregs[rm]; - void *va =3D &env->vfp.zregs[ra]; uint64_t *g =3D vg; =20 do { @@ -3401,45 +3389,42 @@ static void do_fmla_zpzzz_h(CPUARMState *env, void = *vg, uint32_t desc, e1 =3D *(uint16_t *)(vn + H1_2(i)) ^ neg1; e2 =3D *(uint16_t *)(vm + H1_2(i)); e3 =3D *(uint16_t *)(va + H1_2(i)) ^ neg3; - r =3D float16_muladd(e1, e2, e3, 0, &env->vfp.fp_status_f1= 6); + r =3D float16_muladd(e1, e2, e3, 0, status); *(uint16_t *)(vd + H1_2(i)) =3D r; } } while (i & 63); } while (i !=3D 0); } =20 -void HELPER(sve_fmla_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc) +void HELPER(sve_fmla_zpzzz_h)(void *vd, void *vn, void *vm, void *va, + void *vg, void *status, uint32_t desc) { - do_fmla_zpzzz_h(env, vg, desc, 0, 0); + do_fmla_zpzzz_h(vd, vn, vm, va, vg, status, desc, 0, 0); } =20 -void HELPER(sve_fmls_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc) +void HELPER(sve_fmls_zpzzz_h)(void *vd, void *vn, void *vm, void *va, + void *vg, void *status, uint32_t desc) { - do_fmla_zpzzz_h(env, vg, desc, 0x8000, 0); + do_fmla_zpzzz_h(vd, vn, vm, va, vg, status, desc, 0x8000, 0); } =20 -void HELPER(sve_fnmla_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc) +void HELPER(sve_fnmla_zpzzz_h)(void *vd, void *vn, void *vm, void *va, + void *vg, void *status, uint32_t desc) { - do_fmla_zpzzz_h(env, vg, desc, 0x8000, 0x8000); + do_fmla_zpzzz_h(vd, vn, vm, 
va, vg, status, desc, 0x8000, 0x8000); } =20 -void HELPER(sve_fnmls_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc) +void HELPER(sve_fnmls_zpzzz_h)(void *vd, void *vn, void *vm, void *va, + void *vg, void *status, uint32_t desc) { - do_fmla_zpzzz_h(env, vg, desc, 0, 0x8000); + do_fmla_zpzzz_h(vd, vn, vm, va, vg, status, desc, 0, 0x8000); } =20 -static void do_fmla_zpzzz_s(CPUARMState *env, void *vg, uint32_t desc, +static void do_fmla_zpzzz_s(void *vd, void *vn, void *vm, void *va, void *= vg, + float_status *status, uint32_t desc, uint32_t neg1, uint32_t neg3) { intptr_t i =3D simd_oprsz(desc); - unsigned rd =3D extract32(desc, SIMD_DATA_SHIFT, 5); - unsigned rn =3D extract32(desc, SIMD_DATA_SHIFT + 5, 5); - unsigned rm =3D extract32(desc, SIMD_DATA_SHIFT + 10, 5); - unsigned ra =3D extract32(desc, SIMD_DATA_SHIFT + 15, 5); - void *vd =3D &env->vfp.zregs[rd]; - void *vn =3D &env->vfp.zregs[rn]; - void *vm =3D &env->vfp.zregs[rm]; - void *va =3D &env->vfp.zregs[ra]; uint64_t *g =3D vg; =20 do { @@ -3452,45 +3437,42 @@ static void do_fmla_zpzzz_s(CPUARMState *env, void = *vg, uint32_t desc, e1 =3D *(uint32_t *)(vn + H1_4(i)) ^ neg1; e2 =3D *(uint32_t *)(vm + H1_4(i)); e3 =3D *(uint32_t *)(va + H1_4(i)) ^ neg3; - r =3D float32_muladd(e1, e2, e3, 0, &env->vfp.fp_status); + r =3D float32_muladd(e1, e2, e3, 0, status); *(uint32_t *)(vd + H1_4(i)) =3D r; } } while (i & 63); } while (i !=3D 0); } =20 -void HELPER(sve_fmla_zpzzz_s)(CPUARMState *env, void *vg, uint32_t desc) +void HELPER(sve_fmla_zpzzz_s)(void *vd, void *vn, void *vm, void *va, + void *vg, void *status, uint32_t desc) { - do_fmla_zpzzz_s(env, vg, desc, 0, 0); + do_fmla_zpzzz_s(vd, vn, vm, va, vg, status, desc, 0, 0); } =20 -void HELPER(sve_fmls_zpzzz_s)(CPUARMState *env, void *vg, uint32_t desc) +void HELPER(sve_fmls_zpzzz_s)(void *vd, void *vn, void *vm, void *va, + void *vg, void *status, uint32_t desc) { - do_fmla_zpzzz_s(env, vg, desc, 0x80000000, 0); + do_fmla_zpzzz_s(vd, vn, vm, va, vg, status, 
desc, 0x80000000, 0); } =20 -void HELPER(sve_fnmla_zpzzz_s)(CPUARMState *env, void *vg, uint32_t desc) +void HELPER(sve_fnmla_zpzzz_s)(void *vd, void *vn, void *vm, void *va, + void *vg, void *status, uint32_t desc) { - do_fmla_zpzzz_s(env, vg, desc, 0x80000000, 0x80000000); + do_fmla_zpzzz_s(vd, vn, vm, va, vg, status, desc, 0x80000000, 0x800000= 00); } =20 -void HELPER(sve_fnmls_zpzzz_s)(CPUARMState *env, void *vg, uint32_t desc) +void HELPER(sve_fnmls_zpzzz_s)(void *vd, void *vn, void *vm, void *va, + void *vg, void *status, uint32_t desc) { - do_fmla_zpzzz_s(env, vg, desc, 0, 0x80000000); + do_fmla_zpzzz_s(vd, vn, vm, va, vg, status, desc, 0, 0x80000000); } =20 -static void do_fmla_zpzzz_d(CPUARMState *env, void *vg, uint32_t desc, +static void do_fmla_zpzzz_d(void *vd, void *vn, void *vm, void *va, void *= vg, + float_status *status, uint32_t desc, uint64_t neg1, uint64_t neg3) { intptr_t i =3D simd_oprsz(desc); - unsigned rd =3D extract32(desc, SIMD_DATA_SHIFT, 5); - unsigned rn =3D extract32(desc, SIMD_DATA_SHIFT + 5, 5); - unsigned rm =3D extract32(desc, SIMD_DATA_SHIFT + 10, 5); - unsigned ra =3D extract32(desc, SIMD_DATA_SHIFT + 15, 5); - void *vd =3D &env->vfp.zregs[rd]; - void *vn =3D &env->vfp.zregs[rn]; - void *vm =3D &env->vfp.zregs[rm]; - void *va =3D &env->vfp.zregs[ra]; uint64_t *g =3D vg; =20 do { @@ -3503,31 +3485,35 @@ static void do_fmla_zpzzz_d(CPUARMState *env, void = *vg, uint32_t desc, e1 =3D *(uint64_t *)(vn + i) ^ neg1; e2 =3D *(uint64_t *)(vm + i); e3 =3D *(uint64_t *)(va + i) ^ neg3; - r =3D float64_muladd(e1, e2, e3, 0, &env->vfp.fp_status); + r =3D float64_muladd(e1, e2, e3, 0, status); *(uint64_t *)(vd + i) =3D r; } } while (i & 63); } while (i !=3D 0); } =20 -void HELPER(sve_fmla_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc) +void HELPER(sve_fmla_zpzzz_d)(void *vd, void *vn, void *vm, void *va, + void *vg, void *status, uint32_t desc) { - do_fmla_zpzzz_d(env, vg, desc, 0, 0); + do_fmla_zpzzz_d(vd, vn, vm, va, vg, status, 
desc, 0, 0); } =20 -void HELPER(sve_fmls_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc) +void HELPER(sve_fmls_zpzzz_d)(void *vd, void *vn, void *vm, void *va, + void *vg, void *status, uint32_t desc) { - do_fmla_zpzzz_d(env, vg, desc, INT64_MIN, 0); + do_fmla_zpzzz_d(vd, vn, vm, va, vg, status, desc, INT64_MIN, 0); } =20 -void HELPER(sve_fnmla_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc) +void HELPER(sve_fnmla_zpzzz_d)(void *vd, void *vn, void *vm, void *va, + void *vg, void *status, uint32_t desc) { - do_fmla_zpzzz_d(env, vg, desc, INT64_MIN, INT64_MIN); + do_fmla_zpzzz_d(vd, vn, vm, va, vg, status, desc, INT64_MIN, INT64_MIN= ); } =20 -void HELPER(sve_fnmls_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc) +void HELPER(sve_fnmls_zpzzz_d)(void *vd, void *vn, void *vm, void *va, + void *vg, void *status, uint32_t desc) { - do_fmla_zpzzz_d(env, vg, desc, 0, INT64_MIN); + do_fmla_zpzzz_d(vd, vn, vm, va, vg, status, desc, 0, INT64_MIN); } =20 /* Two operand floating-point comparison controlled by a predicate. 
@@ -3809,22 +3795,13 @@ void HELPER(sve_fcadd_d)(void *vd, void *vn, void *= vm, void *vg, * FP Complex Multiply */ =20 -QEMU_BUILD_BUG_ON(SIMD_DATA_SHIFT + 22 > 32); - -void HELPER(sve_fcmla_zpzzz_h)(CPUARMState *env, void *vg, uint32_t desc) +void HELPER(sve_fcmla_zpzzz_h)(void *vd, void *vn, void *vm, void *va, + void *vg, void *status, uint32_t desc) { intptr_t j, i =3D simd_oprsz(desc); - unsigned rd =3D extract32(desc, SIMD_DATA_SHIFT, 5); - unsigned rn =3D extract32(desc, SIMD_DATA_SHIFT + 5, 5); - unsigned rm =3D extract32(desc, SIMD_DATA_SHIFT + 10, 5); - unsigned ra =3D extract32(desc, SIMD_DATA_SHIFT + 15, 5); - unsigned rot =3D extract32(desc, SIMD_DATA_SHIFT + 20, 2); + unsigned rot =3D simd_data(desc); bool flip =3D rot & 1; float16 neg_imag, neg_real; - void *vd =3D &env->vfp.zregs[rd]; - void *vn =3D &env->vfp.zregs[rn]; - void *vm =3D &env->vfp.zregs[rm]; - void *va =3D &env->vfp.zregs[ra]; uint64_t *g =3D vg; =20 neg_imag =3D float16_set_sign(0, (rot & 2) !=3D 0); @@ -3851,32 +3828,25 @@ void HELPER(sve_fcmla_zpzzz_h)(CPUARMState *env, vo= id *vg, uint32_t desc) =20 if (likely((pg >> (i & 63)) & 1)) { d =3D *(float16 *)(va + H1_2(i)); - d =3D float16_muladd(e2, e1, d, 0, &env->vfp.fp_status_f16= ); + d =3D float16_muladd(e2, e1, d, 0, status); *(float16 *)(vd + H1_2(i)) =3D d; } if (likely((pg >> (j & 63)) & 1)) { d =3D *(float16 *)(va + H1_2(j)); - d =3D float16_muladd(e4, e3, d, 0, &env->vfp.fp_status_f16= ); + d =3D float16_muladd(e4, e3, d, 0, status); *(float16 *)(vd + H1_2(j)) =3D d; } } while (i & 63); } while (i !=3D 0); } =20 -void HELPER(sve_fcmla_zpzzz_s)(CPUARMState *env, void *vg, uint32_t desc) +void HELPER(sve_fcmla_zpzzz_s)(void *vd, void *vn, void *vm, void *va, + void *vg, void *status, uint32_t desc) { intptr_t j, i =3D simd_oprsz(desc); - unsigned rd =3D extract32(desc, SIMD_DATA_SHIFT, 5); - unsigned rn =3D extract32(desc, SIMD_DATA_SHIFT + 5, 5); - unsigned rm =3D extract32(desc, SIMD_DATA_SHIFT + 10, 5); - unsigned ra =3D 
extract32(desc, SIMD_DATA_SHIFT + 15, 5); - unsigned rot =3D extract32(desc, SIMD_DATA_SHIFT + 20, 2); + unsigned rot =3D simd_data(desc); bool flip =3D rot & 1; float32 neg_imag, neg_real; - void *vd =3D &env->vfp.zregs[rd]; - void *vn =3D &env->vfp.zregs[rn]; - void *vm =3D &env->vfp.zregs[rm]; - void *va =3D &env->vfp.zregs[ra]; uint64_t *g =3D vg; =20 neg_imag =3D float32_set_sign(0, (rot & 2) !=3D 0); @@ -3903,32 +3873,25 @@ void HELPER(sve_fcmla_zpzzz_s)(CPUARMState *env, vo= id *vg, uint32_t desc) =20 if (likely((pg >> (i & 63)) & 1)) { d =3D *(float32 *)(va + H1_2(i)); - d =3D float32_muladd(e2, e1, d, 0, &env->vfp.fp_status); + d =3D float32_muladd(e2, e1, d, 0, status); *(float32 *)(vd + H1_2(i)) =3D d; } if (likely((pg >> (j & 63)) & 1)) { d =3D *(float32 *)(va + H1_2(j)); - d =3D float32_muladd(e4, e3, d, 0, &env->vfp.fp_status); + d =3D float32_muladd(e4, e3, d, 0, status); *(float32 *)(vd + H1_2(j)) =3D d; } } while (i & 63); } while (i !=3D 0); } =20 -void HELPER(sve_fcmla_zpzzz_d)(CPUARMState *env, void *vg, uint32_t desc) +void HELPER(sve_fcmla_zpzzz_d)(void *vd, void *vn, void *vm, void *va, + void *vg, void *status, uint32_t desc) { intptr_t j, i =3D simd_oprsz(desc); - unsigned rd =3D extract32(desc, SIMD_DATA_SHIFT, 5); - unsigned rn =3D extract32(desc, SIMD_DATA_SHIFT + 5, 5); - unsigned rm =3D extract32(desc, SIMD_DATA_SHIFT + 10, 5); - unsigned ra =3D extract32(desc, SIMD_DATA_SHIFT + 15, 5); - unsigned rot =3D extract32(desc, SIMD_DATA_SHIFT + 20, 2); + unsigned rot =3D simd_data(desc); bool flip =3D rot & 1; float64 neg_imag, neg_real; - void *vd =3D &env->vfp.zregs[rd]; - void *vn =3D &env->vfp.zregs[rn]; - void *vm =3D &env->vfp.zregs[rm]; - void *va =3D &env->vfp.zregs[ra]; uint64_t *g =3D vg; =20 neg_imag =3D float64_set_sign(0, (rot & 2) !=3D 0); @@ -3955,12 +3918,12 @@ void HELPER(sve_fcmla_zpzzz_d)(CPUARMState *env, vo= id *vg, uint32_t desc) =20 if (likely((pg >> (i & 63)) & 1)) { d =3D *(float64 *)(va + H1_2(i)); - d =3D 
float64_muladd(e2, e1, d, 0, &env->vfp.fp_status); + d =3D float64_muladd(e2, e1, d, 0, status); *(float64 *)(vd + H1_2(i)) =3D d; } if (likely((pg >> (j & 63)) & 1)) { d =3D *(float64 *)(va + H1_2(j)); - d =3D float64_muladd(e4, e3, d, 0, &env->vfp.fp_status); + d =3D float64_muladd(e4, e3, d, 0, status); *(float64 *)(vd + H1_2(j)) =3D d; } } while (i & 63); diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index b35bad245e..8d6b971d50 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3948,42 +3948,30 @@ static bool trans_FCADD(DisasContext *s, arg_FCADD = *a) return true; } =20 -typedef void gen_helper_sve_fmla(TCGv_env, TCGv_ptr, TCGv_i32); - -static bool do_fmla(DisasContext *s, arg_rprrr_esz *a, gen_helper_sve_fmla= *fn) +static bool do_fmla(DisasContext *s, arg_rprrr_esz *a, + gen_helper_gvec_5_ptr *fn) { - if (fn =3D=3D NULL) { + if (a->esz =3D=3D 0) { return false; } - if (!sve_access_check(s)) { - return true; + if (sve_access_check(s)) { + unsigned vsz =3D vec_full_reg_size(s); + TCGv_ptr status =3D get_fpstatus_ptr(a->esz =3D=3D MO_16); + tcg_gen_gvec_5_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vec_full_reg_offset(s, a->ra), + pred_full_reg_offset(s, a->pg), + status, vsz, vsz, 0, fn); + tcg_temp_free_ptr(status); } - - unsigned vsz =3D vec_full_reg_size(s); - unsigned desc; - TCGv_i32 t_desc; - TCGv_ptr pg =3D tcg_temp_new_ptr(); - - /* We would need 7 operands to pass these arguments "properly". - * So we encode all the register numbers into the descriptor. 
- */ - desc =3D deposit32(a->rd, 5, 5, a->rn); - desc =3D deposit32(desc, 10, 5, a->rm); - desc =3D deposit32(desc, 15, 5, a->ra); - desc =3D simd_desc(vsz, vsz, desc); - - t_desc =3D tcg_const_i32(desc); - tcg_gen_addi_ptr(pg, cpu_env, pred_full_reg_offset(s, a->pg)); - fn(cpu_env, pg, t_desc); - tcg_temp_free_i32(t_desc); - tcg_temp_free_ptr(pg); return true; } =20 #define DO_FMLA(NAME, name) \ static bool trans_##NAME(DisasContext *s, arg_rprrr_esz *a) \ { \ - static gen_helper_sve_fmla * const fns[4] =3D { \ + static gen_helper_gvec_5_ptr * const fns[4] =3D { \ NULL, gen_helper_sve_##name##_h, \ gen_helper_sve_##name##_s, gen_helper_sve_##name##_d \ }; \ @@ -3999,7 +3987,8 @@ DO_FMLA(FNMLS_zpzzz, fnmls_zpzzz) =20 static bool trans_FCMLA_zpzzz(DisasContext *s, arg_FCMLA_zpzzz *a) { - static gen_helper_sve_fmla * const fns[3] =3D { + static gen_helper_gvec_5_ptr * const fns[4] =3D { + NULL, gen_helper_sve_fcmla_zpzzz_h, gen_helper_sve_fcmla_zpzzz_s, gen_helper_sve_fcmla_zpzzz_d, @@ -4010,25 +3999,14 @@ static bool trans_FCMLA_zpzzz(DisasContext *s, arg_= FCMLA_zpzzz *a) } if (sve_access_check(s)) { unsigned vsz =3D vec_full_reg_size(s); - unsigned desc; - TCGv_i32 t_desc; - TCGv_ptr pg =3D tcg_temp_new_ptr(); - - /* We would need 7 operands to pass these arguments "properly". - * So we encode all the register numbers into the descriptor. 
- */ - desc =3D deposit32(a->rd, 5, 5, a->rn); - desc =3D deposit32(desc, 10, 5, a->rm); - desc =3D deposit32(desc, 15, 5, a->ra); - desc =3D deposit32(desc, 20, 2, a->rot); - desc =3D sextract32(desc, 0, 22); - desc =3D simd_desc(vsz, vsz, desc); - - t_desc =3D tcg_const_i32(desc); - tcg_gen_addi_ptr(pg, cpu_env, pred_full_reg_offset(s, a->pg)); - fns[a->esz - 1](cpu_env, pg, t_desc); - tcg_temp_free_i32(t_desc); - tcg_temp_free_ptr(pg); + TCGv_ptr status =3D get_fpstatus_ptr(a->esz =3D=3D MO_16); + tcg_gen_gvec_5_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vec_full_reg_offset(s, a->ra), + pred_full_reg_offset(s, a->pg), + status, vsz, vsz, a->rot, fns[a->esz]); + tcg_temp_free_ptr(status); } return true; } --=20 2.20.1 From nobody Fri Nov 14 13:42:48 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linaro.org ARC-Seal: i=1; a=rsa-sha256; t=1588620325; cv=none; d=zohomail.com; s=zohoarc; b=bLEkdlmYnDjQ1HGxr43VZYDQ5b6Tt+jyFfdGCZo0Mwib6l4lNDx9gBM0Bk6OWbtgU6r4TrsGQibqTsOsMPNCxibaRaCWHgsgk53ztedbT5FAf839NrKNPpRGlCq4M1wUysQGmg5/HUzB40zaMFkBGA6AUj2mrcV3ciuc5/eiK+Q= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1588620325; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=XhxyK+kkO2Y8hUw/ZAxBXnrBF6bTSYi2MZ+Sv+Ux6CU=; b=UlmI5YyQQEEP7/T85TqOdMjfp1UdGR4vpby/x/T23x7+Xa1M44LmAzpQFF8AZfS/hq2v20OgHht2Y2RtlBRozEPsDCEhfEwf3ynRnVfDo5LwQYj+r+pn6duqMSj3Rw0aynwXxCnKUdVlQFm+AlKePXNtKT4TbU497NeeIe4roYY= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org 
designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) header.from= Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 158862032543258.21665613676009; Mon, 4 May 2020 12:25:25 -0700 (PDT) Received: from localhost ([::1]:42730 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jVgiR-0003Az-Si for importer@patchew.org; Mon, 04 May 2020 15:25:23 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:46238) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jVgh0-0001W6-MH for qemu-devel@nongnu.org; Mon, 04 May 2020 15:23:54 -0400 Received: from mail-pf1-x442.google.com ([2607:f8b0:4864:20::442]:39969) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jVggx-0005yM-A9 for qemu-devel@nongnu.org; Mon, 04 May 2020 15:23:54 -0400 Received: by mail-pf1-x442.google.com with SMTP id x2so6029956pfx.7 for ; Mon, 04 May 2020 12:23:50 -0700 (PDT) Received: from localhost.localdomain (174-21-149-226.tukw.qwest.net. 
[174.21.149.226]) by smtp.gmail.com with ESMTPSA id b9sm9407364pfp.12.2020.05.04.12.23.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 04 May 2020 12:23:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=XhxyK+kkO2Y8hUw/ZAxBXnrBF6bTSYi2MZ+Sv+Ux6CU=; b=n6hVWoV2XE3Zct0JzbqbToZJUuMcwwHOspHkh5EbtXXix57aW+YMKjqKusbz38H+C0 jB3ecQJEEBlyXIBjW3RHIpsdcCbaQKFTK50dEyhhuw+7oOi5GuvkThAkKld1LWQJJWJV 8QSdp9z0IlRP0B3sIfBysJ131SAyeKIwpBle3m92NThb445hF9zWl6F/R+vE0G8uggq+ a7Wcg4FR5zhk9D0Qmuy2NJuHzuFSkp+TlLJfAs/kW8wsbsvvQ0SifPfJ6mtoRNdvdStx VftGgwDf536F1dMGbrkBd4ryp0iEEjtLLv/TJ7ovEbSrwWXEMeim6yE1iWbknVKrJLxn ZjCw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=XhxyK+kkO2Y8hUw/ZAxBXnrBF6bTSYi2MZ+Sv+Ux6CU=; b=hq8SUUu3SPpwKLhDD5FI4/9aFjEPR+zDVBwaIf3cD6Zt8b+1Rl4TD31bNdYfIhEktj LBNbS9n7Y/7DDX5Wt5HV3Ed4V7c7CvCR8A23KlHvu8zJj2W2cl5vHYhbavOwgQ4u1ype 1gwFaLS3polsFU2LqbQ4CF5oXSi/kxLvNOfhLIjHzc0NFRt5NHp402C0d076EaVqb8K/ hSwEwxYQvyTdfeBOAGGJO2/WcD+SSmJIVIw5GQk6MUkR0sswoqhmLwtvIttd6Vh5YowD bXmBc+9XvtbnFkYpmMvvhEuHYZG7a6P8MmL0O77BciZ3IvZJBOmHugR2iFRnRhXHZCLq UkBA== X-Gm-Message-State: AGi0Pub5Ct0v5t158LYRShNqxATqUrQexCAgKC+1lJXvEmlox9fyVrhU 6VxFBebKa/Y9E+LPt5yvuT8Be6Chy+I= X-Google-Smtp-Source: APiQypL9g3ME3tvyCdPuPuQI5PyMFWIYOrZgfy3u/DnhUzKVaQKGoapSjGdThVGywEi3dxzdc6E/Ig== X-Received: by 2002:a63:4ce:: with SMTP id 197mr462934pge.240.1588620228563; Mon, 04 May 2020 12:23:48 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 2/3] target/arm: Use tcg_gen_gvec_mov for clear_vec_high Date: Mon, 4 May 2020 12:23:43 -0700 Message-Id: <20200504192344.13404-3-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: 
<20200504192344.13404-1-richard.henderson@linaro.org> References: <20200504192344.13404-1-richard.henderson@linaro.org> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2607:f8b0:4864:20::442; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x442.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, =?UTF-8?q?Alex=20Benn=C3=A9e?= Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @linaro.org) The 8-byte store for the end of a !is_q operation can be merged with the other stores. Use a no-op vector move to trigger the expand_clr portion of tcg_gen_gvec_mov.
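As a rough illustration of the idea, the sketch below models a vector move on a plain host byte buffer: when source and destination offsets are equal the copy does nothing, and only the clearing of bytes from oprsz up to maxsz remains. The sketch_* functions are hypothetical and written for this note; they do not emit TCG ops and are not the tcg_gen_gvec_mov implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical host-side model of a gvec move: copy oprsz bytes from
 * aofs to dofs, then zero the tail [oprsz, maxsz) of the destination.
 */
static void sketch_gvec_mov(uint8_t *base, size_t dofs, size_t aofs,
                            size_t oprsz, size_t maxsz)
{
    assert(oprsz <= maxsz);
    if (dofs != aofs) {
        memmove(base + dofs, base + aofs, oprsz);
    }
    if (oprsz < maxsz) {
        /* This is the part the no-op move relies on. */
        memset(base + dofs + oprsz, 0, maxsz - oprsz);
    }
}

/* Model of clear_vec_high: keep the low 8 or 16 bytes, clear the rest. */
static void sketch_clear_vec_high(uint8_t *reg, int is_q, size_t vsz)
{
    sketch_gvec_mov(reg, 0, 0, is_q ? 16 : 8, vsz);
}
```

In this model a single call covers both the !is_q case (bytes 8..15) and the tail beyond 16 bytes, which the old code handled with two separate stores.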
Reviewed-by: Alex Benn=C3=A9e Signed-off-by: Richard Henderson --- target/arm/translate-a64.c | 10 ++-------- 1 file changed, 2 insertions(+), 8 deletions(-) diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index a896f9c4b8..729e746e25 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -496,14 +496,8 @@ static void clear_vec_high(DisasContext *s, bool is_q,= int rd) unsigned ofs =3D fp_reg_offset(s, rd, MO_64); unsigned vsz =3D vec_full_reg_size(s); =20 - if (!is_q) { - TCGv_i64 tcg_zero =3D tcg_const_i64(0); - tcg_gen_st_i64(tcg_zero, cpu_env, ofs + 8); - tcg_temp_free_i64(tcg_zero); - } - if (vsz > 16) { - tcg_gen_gvec_dup8i(ofs + 16, vsz - 16, vsz - 16, 0); - } + /* Nop move, with side effect of clearing the tail. */ + tcg_gen_gvec_mov(MO_64, ofs, ofs, is_q ? 16 : 8, vsz); } =20 void write_fp_dreg(DisasContext *s, int reg, TCGv_i64 v) --=20 2.20.1 From nobody Fri Nov 14 13:42:48 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=linaro.org ARC-Seal: i=1; a=rsa-sha256; t=1588620323; cv=none; d=zohomail.com; s=zohoarc; b=iTk3bJBlX7oajeUCep/4dQj80fjs2FoPTl7iMhBKdo5n4dr36IY/5H4w47eeoMvnuZwhHc779xthZKQzaNhKKhhaqiJYbu71cH/dnP2HmFoZ9zs6weak9Crxq9UBZmemzNSsHK0WKdtaRaaSziQrBk1KFNSdZ/KL9tyb6qGIXxE= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1588620323; h=Content-Type:Content-Transfer-Encoding:Cc:Date:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:To; bh=c3OBmdqjWHi93YqiLmZoZH4s0E/AFDrL/k2Mh8jPufY=; 
b=PQECfDaFtSOllKlBKdqCgR9MusnU3VG0E1Nvh5KTtOWAl0msQ754YeRHXKBPSyw4eOvPAfpSADeZ0q/J86bivlsgZIvK+CxhGMU+A5YLJD+4HtJQIHR4FodUVpPwROavxt6ex7kvzyTUYlc8M++937hhqOpig8V229CGlvuDElI= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) header.from= Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1588620323969801.5531222537769; Mon, 4 May 2020 12:25:23 -0700 (PDT) Received: from localhost ([::1]:42570 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jVgiQ-00037F-N0 for importer@patchew.org; Mon, 04 May 2020 15:25:22 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:46234) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jVggz-0001VN-Mk for qemu-devel@nongnu.org; Mon, 04 May 2020 15:23:54 -0400 Received: from mail-pl1-x643.google.com ([2607:f8b0:4864:20::643]:37238) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1jVggy-0005yX-GN for qemu-devel@nongnu.org; Mon, 04 May 2020 15:23:53 -0400 Received: by mail-pl1-x643.google.com with SMTP id x10so158681plr.4 for ; Mon, 04 May 2020 12:23:51 -0700 (PDT) Received: from localhost.localdomain (174-21-149-226.tukw.qwest.net. 
[174.21.149.226]) by smtp.gmail.com with ESMTPSA id b9sm9407364pfp.12.2020.05.04.12.23.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 04 May 2020 12:23:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=c3OBmdqjWHi93YqiLmZoZH4s0E/AFDrL/k2Mh8jPufY=; b=q7W5FY2C0QYokknteHr8slty/pobMtz7m5eY2mf/XP/RzNOirkHn9tNLjRIAn4lNt7 jYP8AS8+7OmOB1yQJLeC6lTL0RQDdNRu/gIBg+e1n4ONt7nL3pDXXmVqhBhQLDGn7UH6 8uKlfIZEPh2VW4jLevXD6SMheiJabk8XgTWM+lWGuKz/YGr13LgbmKlm0KoYPjUe9K+1 ZT1HO9BUNQezmuWc+8+VwzFwOAN73OI6YqByL9zoQzZfT3ABLK49ON0XWk0U/dFPIK5a vlAAOFhjuq5D4kFd29TtTjL+ubBvO73BYZjWWU8ZApTOmkfE8j1sGnQ4S9X3py+RzJQG PSrQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=c3OBmdqjWHi93YqiLmZoZH4s0E/AFDrL/k2Mh8jPufY=; b=SnUs+4snttP6VZbdqflXJYtI1gw9lrwOQPWLyweBZG71TVJIGyGmIgXGXiBn598gh/ dxppnavKBX6xeEhjIrwa9e9iY24mkEJX3GLOLSDSeihY6z+vQEgil92a1LLBH5ryCRBi Et2LkWVZIixI0kTkCvW7BZTCBdBA7Ep5A3S7MG2ZrJHYD9SVuwlEnZ2utbxrnkW7bSw2 i6fHcqDe7+IWZ+Qc+dkemDzWu3FbRvQMsnLu8W4F8Fh53X+r7PtVe3B+DtBJ0DfKz+hp ajzz7aBMvy8RjBvYt2uOJH+VtMnXIsas+weJnNM9Oz9H0y/uA0mRUs/8WWozYFl2ka8e ZHwg== X-Gm-Message-State: AGi0PubO6NKM0zQ2lIerhNNeaDSaEZM59kR/tMSXSbUFsz95a637dnxW Ul3JNfXcUd5FnLtjKaOnci3ggnnnq7I= X-Google-Smtp-Source: APiQypI0URcFGyTqybBj/pg/0B3CnTtdsk+GZLZpn0QO89FjQkCH1yYsaZ92T5YIm/rrspr1lUv9zQ== X-Received: by 2002:a17:90a:fd94:: with SMTP id cx20mr552247pjb.157.1588620229754; Mon, 04 May 2020 12:23:49 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 3/3] target/arm: Use clear_vec_high more effectively Date: Mon, 4 May 2020 12:23:44 -0700 Message-Id: <20200504192344.13404-4-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: 
<20200504192344.13404-1-richard.henderson@linaro.org> References: <20200504192344.13404-1-richard.henderson@linaro.org> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2607:f8b0:4864:20::643; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x643.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, =?UTF-8?q?Alex=20Benn=C3=A9e?= Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: "Qemu-devel" X-ZohoMail-DKIM: pass (identity @linaro.org)

Do not explicitly store zero to the NEON high part
when we can pass !is_q to clear_vec_high.

Reviewed-by: Alex Bennée
Signed-off-by: Richard Henderson
---
 target/arm/translate-a64.c | 59 +++++++++++++++++++++++---------------
 1 file changed, 36 insertions(+), 23 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 729e746e25..d1c9150c4f 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -939,11 +939,10 @@ static void do_fp_ld(DisasContext *s, int destidx, TCGv_i64 tcg_addr, int size)
 {
     /* This always zero-extends and writes to a full 128 bit wide vector */
     TCGv_i64 tmplo = tcg_temp_new_i64();
-    TCGv_i64 tmphi;
+    TCGv_i64 tmphi = NULL;
 
     if (size < 4) {
         MemOp memop = s->be_data + size;
-        tmphi = tcg_const_i64(0);
         tcg_gen_qemu_ld_i64(tmplo, tcg_addr, get_mem_index(s), memop);
     } else {
         bool be = s->be_data == MO_BE;
@@ -961,12 +960,13 @@ static void do_fp_ld(DisasContext *s, int destidx, TCGv_i64 tcg_addr, int size)
     }
 
     tcg_gen_st_i64(tmplo, cpu_env, fp_reg_offset(s, destidx, MO_64));
-    tcg_gen_st_i64(tmphi, cpu_env, fp_reg_hi_offset(s, destidx));
-
     tcg_temp_free_i64(tmplo);
-    tcg_temp_free_i64(tmphi);
 
-    clear_vec_high(s, true, destidx);
+    if (tmphi) {
+        tcg_gen_st_i64(tmphi, cpu_env, fp_reg_hi_offset(s, destidx));
+        tcg_temp_free_i64(tmphi);
+    }
+    clear_vec_high(s, tmphi != NULL, destidx);
 }
 
 /*
@@ -6960,8 +6960,8 @@ static void disas_simd_ext(DisasContext *s, uint32_t insn)
         return;
     }
 
-    tcg_resh = tcg_temp_new_i64();
     tcg_resl = tcg_temp_new_i64();
+    tcg_resh = NULL;
 
     /* Vd gets bits starting at pos bits into Vm:Vn. This is
      * either extracting 128 bits from a 128:128 concatenation, or
@@ -6973,7 +6973,6 @@ static void disas_simd_ext(DisasContext *s, uint32_t insn)
             read_vec_element(s, tcg_resh, rm, 0, MO_64);
             do_ext64(s, tcg_resh, tcg_resl, pos);
         }
-        tcg_gen_movi_i64(tcg_resh, 0);
     } else {
         TCGv_i64 tcg_hh;
         typedef struct {
@@ -6988,6 +6987,7 @@ static void disas_simd_ext(DisasContext *s, uint32_t insn)
             pos -= 64;
         }
 
+        tcg_resh = tcg_temp_new_i64();
         read_vec_element(s, tcg_resl, elt->reg, elt->elt, MO_64);
         elt++;
         read_vec_element(s, tcg_resh, elt->reg, elt->elt, MO_64);
@@ -7003,9 +7003,12 @@ static void disas_simd_ext(DisasContext *s, uint32_t insn)
 
     write_vec_element(s, tcg_resl, rd, 0, MO_64);
     tcg_temp_free_i64(tcg_resl);
-    write_vec_element(s, tcg_resh, rd, 1, MO_64);
-    tcg_temp_free_i64(tcg_resh);
-    clear_vec_high(s, true, rd);
+
+    if (is_q) {
+        write_vec_element(s, tcg_resh, rd, 1, MO_64);
+        tcg_temp_free_i64(tcg_resh);
+    }
+    clear_vec_high(s, is_q, rd);
 }
 
 /* TBL/TBX
@@ -7042,17 +7045,21 @@ static void disas_simd_tb(DisasContext *s, uint32_t insn)
      * the input.
      */
     tcg_resl = tcg_temp_new_i64();
-    tcg_resh = tcg_temp_new_i64();
+    tcg_resh = NULL;
 
     if (is_tblx) {
         read_vec_element(s, tcg_resl, rd, 0, MO_64);
     } else {
         tcg_gen_movi_i64(tcg_resl, 0);
     }
-    if (is_tblx && is_q) {
-        read_vec_element(s, tcg_resh, rd, 1, MO_64);
-    } else {
-        tcg_gen_movi_i64(tcg_resh, 0);
+
+    if (is_q) {
+        tcg_resh = tcg_temp_new_i64();
+        if (is_tblx) {
+            read_vec_element(s, tcg_resh, rd, 1, MO_64);
+        } else {
+            tcg_gen_movi_i64(tcg_resh, 0);
+        }
     }
 
     tcg_idx = tcg_temp_new_i64();
@@ -7072,9 +7079,12 @@ static void disas_simd_tb(DisasContext *s, uint32_t insn)
 
     write_vec_element(s, tcg_resl, rd, 0, MO_64);
     tcg_temp_free_i64(tcg_resl);
-    write_vec_element(s, tcg_resh, rd, 1, MO_64);
-    tcg_temp_free_i64(tcg_resh);
-    clear_vec_high(s, true, rd);
+
+    if (is_q) {
+        write_vec_element(s, tcg_resh, rd, 1, MO_64);
+        tcg_temp_free_i64(tcg_resh);
+    }
+    clear_vec_high(s, is_q, rd);
 }
 
 /* ZIP/UZP/TRN
@@ -7111,7 +7121,7 @@ static void disas_simd_zip_trn(DisasContext *s, uint32_t insn)
     }
 
     tcg_resl = tcg_const_i64(0);
-    tcg_resh = tcg_const_i64(0);
+    tcg_resh = is_q ? tcg_const_i64(0) : NULL;
     tcg_res = tcg_temp_new_i64();
 
     for (i = 0; i < elements; i++) {
@@ -7162,9 +7172,12 @@ static void disas_simd_zip_trn(DisasContext *s, uint32_t insn)
 
     write_vec_element(s, tcg_resl, rd, 0, MO_64);
     tcg_temp_free_i64(tcg_resl);
-    write_vec_element(s, tcg_resh, rd, 1, MO_64);
-    tcg_temp_free_i64(tcg_resh);
-    clear_vec_high(s, true, rd);
+
+    if (is_q) {
+        write_vec_element(s, tcg_resh, rd, 1, MO_64);
+        tcg_temp_free_i64(tcg_resh);
+    }
+    clear_vec_high(s, is_q, rd);
 }
 
 /*
-- 
2.20.1
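
Editorial note, not part of the series: the effect that both patches rely on can be illustrated with a small host-side model. This is only a sketch under stated assumptions. It does not use the TCG API; the names model_gvec_mov and model_clear_vec_high and the 32-byte register size VSZ are hypothetical, chosen for illustration. It assumes, as the 2/3 commit message describes, that a vector move whose operation size is smaller than its maximum size zeroes the bytes in between, even when source and destination are the same.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model, not QEMU code: a vector register is a byte array. */
enum { VSZ = 32 };  /* assumed full register size in bytes (e.g. 256-bit SVE) */

/*
 * Models a vector move with oprsz <= maxsz: copy the first oprsz bytes,
 * then zero the tail [oprsz, maxsz). With dst == src the copy changes
 * nothing and only the tail clearing remains.
 */
static void model_gvec_mov(uint8_t *dst, const uint8_t *src,
                           unsigned oprsz, unsigned maxsz)
{
    memmove(dst, src, oprsz);
    memset(dst + oprsz, 0, maxsz - oprsz);
}

/*
 * Models clear_vec_high after patch 2/3: keep 16 bytes for a Q-form
 * operation, 8 bytes otherwise, and zero everything above that.
 */
static void model_clear_vec_high(uint8_t *reg, bool is_q)
{
    model_gvec_mov(reg, reg, is_q ? 16 : 8, VSZ);
}
```

In this model a caller that has written only the low 8 bytes of a result can pass is_q = false and get bytes 8 and up zeroed in one step, which is why patch 3/3 can drop the explicit zero stores to the high half.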