From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org
Subject: [PATCH 22/22] target/arm: Use flags for AH negation in do_fmla_zpzzz_*
Date: Mon, 27 Jan 2025 15:26:04 -0800
Message-ID: <20250127232604.20386-23-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20250127232604.20386-1-richard.henderson@linaro.org>
References: <20250127232604.20386-1-richard.henderson@linaro.org>

The float*_muladd functions have a flags argument that can
perform optional negation of various operands.  We don't use
that for "normal" arm fmla, because the muladd flags are not
applied when an input is a NaN.  But since FEAT_AFP does not
negate NaNs, this behaviour is exactly what we need.

Since we have separate helper entry points for the various
fmla, fmls, fnmla, fnmls instructions, it's easy to just
pass down the exact values required so that no conditional
branch is required within the inner loop.

Signed-off-by: Richard Henderson
---
 target/arm/tcg/sve_helper.c | 93 +++++++++++++++++--------------------
 1 file changed, 42 insertions(+), 51 deletions(-)

diff --git a/target/arm/tcg/sve_helper.c b/target/arm/tcg/sve_helper.c
index 6a5bfa0e59..4fdc98cd98 100644
--- a/target/arm/tcg/sve_helper.c
+++ b/target/arm/tcg/sve_helper.c
@@ -4814,7 +4814,7 @@ DO_ZPZ_FP(flogb_d, float64, H1_8, do_float64_logb_as_int)
 
 static void do_fmla_zpzzz_h(void *vd, void *vn, void *vm, void *va, void *vg,
                             float_status *status, uint32_t desc,
-                            uint16_t neg1, uint16_t neg3, bool fpcr_ah)
+                            uint16_t neg1, uint16_t neg3, int flags)
 {
     intptr_t i = simd_oprsz(desc);
     uint64_t *g = vg;
@@ -4826,16 +4826,10 @@ static void do_fmla_zpzzz_h(void *vd, void *vn, void *vm, void *va, void *vg,
             if (likely((pg >> (i & 63)) & 1)) {
                 float16 e1, e2, e3, r;
 
-                e1 = *(uint16_t *)(vn + H1_2(i));
+                e1 = *(uint16_t *)(vn + H1_2(i)) ^ neg1;
                 e2 = *(uint16_t *)(vm + H1_2(i));
-                e3 = *(uint16_t *)(va + H1_2(i));
-                if (neg1 && !(fpcr_ah && float16_is_any_nan(e1))) {
-                    e1 ^= neg1;
-                }
-                if (neg3 && !(fpcr_ah && float16_is_any_nan(e3))) {
-                    e3 ^= neg3;
-                }
-                r = float16_muladd(e1, e2, e3, 0, status);
+                e3 = *(uint16_t *)(va + H1_2(i)) ^ neg3;
+                r = float16_muladd(e1, e2, e3, flags, status);
                 *(uint16_t *)(vd + H1_2(i)) = r;
             }
         } while (i & 63);
@@ -4845,48 +4839,51 @@ static void do_fmla_zpzzz_h(void *vd, void *vn, void *vm, void *va, void *vg,
 void HELPER(sve_fmla_zpzzz_h)(void *vd, void *vn, void *vm, void *va,
                               void *vg, float_status *status, uint32_t desc)
 {
-    do_fmla_zpzzz_h(vd, vn, vm, va, vg, status, desc, 0, 0, false);
+    do_fmla_zpzzz_h(vd, vn, vm, va, vg, status, desc, 0, 0, 0);
 }
 
 void HELPER(sve_fmls_zpzzz_h)(void *vd, void *vn, void *vm, void *va,
                               void *vg, float_status *status, uint32_t desc)
 {
-    do_fmla_zpzzz_h(vd, vn, vm, va, vg, status, desc, 0x8000, 0, false);
+    do_fmla_zpzzz_h(vd, vn, vm, va, vg, status, desc, 0x8000, 0, 0);
 }
 
 void HELPER(sve_fnmla_zpzzz_h)(void *vd, void *vn, void *vm, void *va,
                                void *vg, float_status *status, uint32_t desc)
 {
-    do_fmla_zpzzz_h(vd, vn, vm, va, vg, status, desc, 0x8000, 0x8000, false);
+    do_fmla_zpzzz_h(vd, vn, vm, va, vg, status, desc, 0x8000, 0x8000, 0);
 }
 
 void HELPER(sve_fnmls_zpzzz_h)(void *vd, void *vn, void *vm, void *va,
                                void *vg, float_status *status, uint32_t desc)
 {
-    do_fmla_zpzzz_h(vd, vn, vm, va, vg, status, desc, 0, 0x8000, false);
+    do_fmla_zpzzz_h(vd, vn, vm, va, vg, status, desc, 0, 0x8000, 0);
 }
 
 void HELPER(sve_ah_fmls_zpzzz_h)(void *vd, void *vn, void *vm, void *va,
                               void *vg, float_status *status, uint32_t desc)
 {
-    do_fmla_zpzzz_h(vd, vn, vm, va, vg, status, desc, 0x8000, 0, true);
+    do_fmla_zpzzz_h(vd, vn, vm, va, vg, status, desc, 0, 0,
+                    float_muladd_negate_product);
 }
 
 void HELPER(sve_ah_fnmla_zpzzz_h)(void *vd, void *vn, void *vm, void *va,
                                void *vg, float_status *status, uint32_t desc)
 {
-    do_fmla_zpzzz_h(vd, vn, vm, va, vg, status, desc, 0x8000, 0x8000, true);
+    do_fmla_zpzzz_h(vd, vn, vm, va, vg, status, desc, 0, 0,
+                    float_muladd_negate_product | float_muladd_negate_c);
 }
 
 void HELPER(sve_ah_fnmls_zpzzz_h)(void *vd, void *vn, void *vm, void *va,
                                void *vg, float_status *status, uint32_t desc)
 {
-    do_fmla_zpzzz_h(vd, vn, vm, va, vg, status, desc, 0, 0x8000, true);
+    do_fmla_zpzzz_h(vd, vn, vm, va, vg, status, desc, 0, 0,
+                    float_muladd_negate_c);
 }
 
 static void do_fmla_zpzzz_s(void *vd, void *vn, void *vm, void *va, void *vg,
                             float_status *status, uint32_t desc,
-                            uint32_t neg1, uint32_t neg3, bool fpcr_ah)
+                            uint32_t neg1, uint32_t neg3, int flags)
 {
     intptr_t i = simd_oprsz(desc);
     uint64_t *g = vg;
@@ -4898,16 +4895,10 @@ static void do_fmla_zpzzz_s(void *vd, void *vn, void *vm, void *va, void *vg,
             if (likely((pg >> (i & 63)) & 1)) {
                 float32 e1, e2, e3, r;
 
-                e1 = *(uint32_t *)(vn + H1_4(i));
+                e1 = *(uint32_t *)(vn + H1_4(i)) ^ neg1;
                 e2 = *(uint32_t *)(vm + H1_4(i));
-                e3 = *(uint32_t *)(va + H1_4(i));
-                if (neg1 && !(fpcr_ah && float32_is_any_nan(e1))) {
-                    e1 ^= neg1;
-                }
-                if (neg3 && !(fpcr_ah && float32_is_any_nan(e3))) {
-                    e3 ^= neg3;
-                }
-                r = float32_muladd(e1, e2, e3, 0, status);
+                e3 = *(uint32_t *)(va + H1_4(i)) ^ neg3;
+                r = float32_muladd(e1, e2, e3, flags, status);
                 *(uint32_t *)(vd + H1_4(i)) = r;
             }
         } while (i & 63);
@@ -4917,48 +4908,51 @@ static void do_fmla_zpzzz_s(void *vd, void *vn, void *vm, void *va, void *vg,
 void HELPER(sve_fmla_zpzzz_s)(void *vd, void *vn, void *vm, void *va,
                               void *vg, float_status *status, uint32_t desc)
 {
-    do_fmla_zpzzz_s(vd, vn, vm, va, vg, status, desc, 0, 0, false);
+    do_fmla_zpzzz_s(vd, vn, vm, va, vg, status, desc, 0, 0, 0);
 }
 
 void HELPER(sve_fmls_zpzzz_s)(void *vd, void *vn, void *vm, void *va,
                               void *vg, float_status *status, uint32_t desc)
 {
-    do_fmla_zpzzz_s(vd, vn, vm, va, vg, status, desc, 0x80000000, 0, false);
+    do_fmla_zpzzz_s(vd, vn, vm, va, vg, status, desc, 0x80000000, 0, 0);
 }
 
 void HELPER(sve_fnmla_zpzzz_s)(void *vd, void *vn, void *vm, void *va,
                                void *vg, float_status *status, uint32_t desc)
 {
-    do_fmla_zpzzz_s(vd, vn, vm, va, vg, status, desc, 0x80000000, 0x80000000, false);
+    do_fmla_zpzzz_s(vd, vn, vm, va, vg, status, desc, 0x80000000, 0x80000000, 0);
 }
 
 void HELPER(sve_fnmls_zpzzz_s)(void *vd, void *vn, void *vm, void *va,
                                void *vg, float_status *status, uint32_t desc)
 {
-    do_fmla_zpzzz_s(vd, vn, vm, va, vg, status, desc, 0, 0x80000000, false);
+    do_fmla_zpzzz_s(vd, vn, vm, va, vg, status, desc, 0, 0x80000000, 0);
 }
 
 void HELPER(sve_ah_fmls_zpzzz_s)(void *vd, void *vn, void *vm, void *va,
                               void *vg, float_status *status, uint32_t desc)
 {
-    do_fmla_zpzzz_s(vd, vn, vm, va, vg, status, desc, 0x80000000, 0, true);
+    do_fmla_zpzzz_s(vd, vn, vm, va, vg, status, desc, 0, 0,
+                    float_muladd_negate_product);
 }
 
 void HELPER(sve_ah_fnmla_zpzzz_s)(void *vd, void *vn, void *vm, void *va,
                                void *vg, float_status *status, uint32_t desc)
 {
-    do_fmla_zpzzz_s(vd, vn, vm, va, vg, status, desc, 0x80000000, 0x80000000, true);
+    do_fmla_zpzzz_s(vd, vn, vm, va, vg, status, desc, 0, 0,
+                    float_muladd_negate_product | float_muladd_negate_c);
 }
 
 void HELPER(sve_ah_fnmls_zpzzz_s)(void *vd, void *vn, void *vm, void *va,
                                void *vg, float_status *status, uint32_t desc)
 {
-    do_fmla_zpzzz_s(vd, vn, vm, va, vg, status, desc, 0, 0x80000000, true);
+    do_fmla_zpzzz_s(vd, vn, vm, va, vg, status, desc, 0, 0,
+                    float_muladd_negate_c);
 }
 
 static void do_fmla_zpzzz_d(void *vd, void *vn, void *vm, void *va, void *vg,
                             float_status *status, uint32_t desc,
-                            uint64_t neg1, uint64_t neg3, bool fpcr_ah)
+                            uint64_t neg1, uint64_t neg3, int flags)
 {
     intptr_t i = simd_oprsz(desc);
     uint64_t *g = vg;
@@ -4970,16 +4964,10 @@ static void do_fmla_zpzzz_d(void *vd, void *vn, void *vm, void *va, void *vg,
             if (likely((pg >> (i & 63)) & 1)) {
                 float64 e1, e2, e3, r;
 
-                e1 = *(uint64_t *)(vn + i);
+                e1 = *(uint64_t *)(vn + i) ^ neg1;
                 e2 = *(uint64_t *)(vm + i);
-                e3 = *(uint64_t *)(va + i);
-                if (neg1 && !(fpcr_ah && float64_is_any_nan(e1))) {
-                    e1 ^= neg1;
-                }
-                if (neg3 && !(fpcr_ah && float64_is_any_nan(e3))) {
-                    e3 ^= neg3;
-                }
-                r = float64_muladd(e1, e2, e3, 0, status);
+                e3 = *(uint64_t *)(va + i) ^ neg3;
+                r = float64_muladd(e1, e2, e3, flags, status);
                 *(uint64_t *)(vd + i) = r;
             }
         } while (i & 63);
@@ -4989,43 +4977,46 @@ static void do_fmla_zpzzz_d(void *vd, void *vn, void *vm, void *va, void *vg,
 void HELPER(sve_fmla_zpzzz_d)(void *vd, void *vn, void *vm, void *va,
                               void *vg, float_status *status, uint32_t desc)
 {
-    do_fmla_zpzzz_d(vd, vn, vm, va, vg, status, desc, 0, 0, false);
+    do_fmla_zpzzz_d(vd, vn, vm, va, vg, status, desc, 0, 0, 0);
 }
 
 void HELPER(sve_fmls_zpzzz_d)(void *vd, void *vn, void *vm, void *va,
                               void *vg, float_status *status, uint32_t desc)
 {
-    do_fmla_zpzzz_d(vd, vn, vm, va, vg, status, desc, INT64_MIN, 0, false);
+    do_fmla_zpzzz_d(vd, vn, vm, va, vg, status, desc, INT64_MIN, 0, 0);
 }
 
 void HELPER(sve_fnmla_zpzzz_d)(void *vd, void *vn, void *vm, void *va,
                                void *vg, float_status *status, uint32_t desc)
 {
-    do_fmla_zpzzz_d(vd, vn, vm, va, vg, status, desc, INT64_MIN, INT64_MIN, false);
+    do_fmla_zpzzz_d(vd, vn, vm, va, vg, status, desc, INT64_MIN, INT64_MIN, 0);
 }
 
 void HELPER(sve_fnmls_zpzzz_d)(void *vd, void *vn, void *vm, void *va,
                                void *vg, float_status *status, uint32_t desc)
 {
-    do_fmla_zpzzz_d(vd, vn, vm, va, vg, status, desc, 0, INT64_MIN, false);
+    do_fmla_zpzzz_d(vd, vn, vm, va, vg, status, desc, 0, INT64_MIN, 0);
 }
 
 void HELPER(sve_ah_fmls_zpzzz_d)(void *vd, void *vn, void *vm, void *va,
                               void *vg, float_status *status, uint32_t desc)
 {
-    do_fmla_zpzzz_d(vd, vn, vm, va, vg, status, desc, INT64_MIN, 0, true);
+    do_fmla_zpzzz_d(vd, vn, vm, va, vg, status, desc, 0, 0,
+                    float_muladd_negate_product);
 }
 
 void HELPER(sve_ah_fnmla_zpzzz_d)(void *vd, void *vn, void *vm, void *va,
                                void *vg, float_status *status, uint32_t desc)
 {
-    do_fmla_zpzzz_d(vd, vn, vm, va, vg, status, desc, INT64_MIN, INT64_MIN, true);
+    do_fmla_zpzzz_d(vd, vn, vm, va, vg, status, desc, 0, 0,
+                    float_muladd_negate_product | float_muladd_negate_c);
 }
 
 void HELPER(sve_ah_fnmls_zpzzz_d)(void *vd, void *vn, void *vm, void *va,
                                void *vg, float_status *status, uint32_t desc)
 {
-    do_fmla_zpzzz_d(vd, vn, vm, va, vg, status, desc, 0, INT64_MIN, true);
+    do_fmla_zpzzz_d(vd, vn, vm, va, vg, status, desc, 0, 0,
+                    float_muladd_negate_c);
 }
 
 /* Two operand floating-point comparison controlled by a predicate.
-- 
2.43.0
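
For readers who want to see why the two negation styles differ only on NaN inputs, here is a minimal standalone sketch. It does not use QEMU's softfloat: toy_muladd, the TOY_NEGATE_* constants, fmls_xor and fmls_flags are all hypothetical names invented for this illustration, the arithmetic is a plain multiply and add rather than a true fused operation, and the NaN handling (return the first NaN input untouched) is a deliberate simplification. The only property it tries to mirror is the one stated in the commit message, namely that the muladd flags are not applied when an input is a NaN.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the softfloat muladd flags; values are arbitrary. */
enum {
    TOY_NEGATE_C       = 1,
    TOY_NEGATE_PRODUCT = 2,
};

/* Reinterpret a float as its raw 32-bit pattern and back. */
static uint32_t f32_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof(u));
    return u;
}

static float f32_from_bits(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof(f));
    return f;
}

/*
 * Toy model of a flag-taking multiply-add: (+/-)(a * b) + (+/-)c.
 * Assumption: if any input is a NaN, the first NaN is returned as-is
 * and the flags are ignored. Not fused; rounding is beside the point.
 */
static float toy_muladd(float a, float b, float c, int flags)
{
    assert((flags & ~(TOY_NEGATE_C | TOY_NEGATE_PRODUCT)) == 0);

    if (isnan(a)) {
        return a;
    }
    if (isnan(b)) {
        return b;
    }
    if (isnan(c)) {
        return c;
    }
    if (flags & TOY_NEGATE_PRODUCT) {
        a = -a;
    }
    if (flags & TOY_NEGATE_C) {
        c = -c;
    }
    return a * b + c;
}

/* Non-AH style: flip the sign bit of the first operand, NaN or not. */
static float fmls_xor(float n, float m, float a)
{
    float e1 = f32_from_bits(f32_bits(n) ^ 0x80000000u);
    return toy_muladd(e1, m, a, 0);
}

/* AH style: leave the operand alone and ask muladd to negate the product. */
static float fmls_flags(float n, float m, float a)
{
    return toy_muladd(n, m, a, TOY_NEGATE_PRODUCT);
}
```

With ordinary numbers the two wrappers agree. With a quiet NaN as the first operand, fmls_xor hands back the NaN with its sign bit flipped, while fmls_flags returns it unchanged, which is the AH behaviour the patch gets from the flags without a per-element branch.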