From nobody Tue Feb 10 05:45:34 2026
Delivered-To: importer@patchew.org
Authentication-Results: mx.zohomail.com;
	spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as
 permitted sender)
  smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org
Return-Path: <qemu-devel-bounces+importer=patchew.org@nongnu.org>
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by
 mx.zohomail.com
	with SMTPS id 1666812535806335.8824267978523;
 Wed, 26 Oct 2022 12:28:55 -0700 (PDT)
Received: from localhost ([::1] helo=lists1p.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.90_1)
	(envelope-from <qemu-devel-bounces@nongnu.org>)
	id 1onm3B-0005fC-Vf; Wed, 26 Oct 2022 15:26:54 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10])
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <victor.colombo@eldorado.org.br>)
 id 1onm39-0005RI-R3; Wed, 26 Oct 2022 15:26:51 -0400
Received: from [200.168.210.66] (helo=outlook.eldorado.org.br)
 by eggs.gnu.org with esmtp (Exim 4.90_1)
 (envelope-from <victor.colombo@eldorado.org.br>)
 id 1onm35-0003oy-U6; Wed, 26 Oct 2022 15:26:51 -0400
Received: from p9ibm ([10.10.71.235]) by outlook.eldorado.org.br over TLS
 secured channel with Microsoft SMTPSVC(8.5.9600.16384);
 Wed, 26 Oct 2022 16:26:35 -0300
Received: from eldorado.org.br (unknown [10.10.70.45])
 by p9ibm (Postfix) with ESMTP id 7BF0380023A;
 Wed, 26 Oct 2022 16:26:35 -0300 (-03)
From: =?UTF-8?q?V=C3=ADctor=20Colombo?= <victor.colombo@eldorado.org.br>
To: qemu-devel@nongnu.org,
	qemu-ppc@nongnu.org
Cc: clg@kaod.org, danielhb413@gmail.com, david@gibson.dropbear.id.au,
 groug@kaod.org, richard.henderson@linaro.org, aurelien@aurel32.net,
 peter.maydell@linaro.org, alex.bennee@linaro.org, balaton@eik.bme.hu,
 victor.colombo@eldorado.org.br, matheus.ferst@eldorado.org.br,
 lucas.araujo@eldorado.org.br, leandro.lupori@eldorado.org.br,
 lucas.coutinho@eldorado.org.br
Subject: [RFC PATCH v2 2/5] target/ppc: Implement instruction caching for
 fsqrt
Date: Wed, 26 Oct 2022 16:25:45 -0300
Message-Id: <20221026192548.67303-3-victor.colombo@eldorado.org.br>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20221026192548.67303-1-victor.colombo@eldorado.org.br>
References: <20221026192548.67303-1-victor.colombo@eldorado.org.br>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
X-OriginalArrivalTime: 26 Oct 2022 19:26:35.0937 (UTC)
 FILETIME=[DCEA2110:01D8E970]
X-Host-Lookup-Failed: Reverse DNS lookup failed for 200.168.210.66 (failed)
Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17
 as permitted sender) client-ip=209.51.188.17;
 envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org;
 helo=lists.gnu.org;
Received-SPF: pass client-ip=200.168.210.66;
 envelope-from=victor.colombo@eldorado.org.br; helo=outlook.eldorado.org.br
X-Spam_score_int: -10
X-Spam_score: -1.1
X-Spam_bar: -
X-Spam_report: (-1.1 / 5.0 requ) BAYES_00=-1.9, RDNS_NONE=0.793,
 SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=no autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <https://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Sender: "Qemu-devel" <qemu-devel-bounces@nongnu.org>
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
X-ZM-MESSAGEID: 1666812537634100001

This patch adds the code necessary to cache fsqrt for use with
hardfpu on Power. fsqrt is the first instruction to use the new
instruction caching system.

The fsqrt operation receives two arguments, one f64 and one status,
and returns an f64. The function pointer and both arguments are
cached inside a new union in env, which will grow as instructions
with other signatures are added.
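
As a rough illustration of the layout described above, here is a
minimal standalone sketch of a tagged union that records one call
signature and replays it later. It is not the QEMU code: f64, fstatus,
toy_half, cache_store and cache_replay are hypothetical names invented
for this example, and the real float64 and float_status types are
replaced by simple stand-ins.

```c
#include <assert.h>

/* Hypothetical stand-ins for QEMU's float64 and float_status. */
typedef double f64;
typedef struct { int exception_flags; } fstatus;

/* Tag saying which member of the union below is valid. */
enum cached_type { CACHED_NONE, CACHED_F64_F64_FSTATUS };

/* One struct per cached signature: function pointer plus arguments. */
struct cached_f64_f64_fstatus {
    f64 (*fn)(f64, fstatus *);
    f64 arg1;
    fstatus arg2;               /* stored by value, i.e. a snapshot */
};

struct cache {
    enum cached_type type;
    union {
        struct cached_f64_f64_fstatus f64_f64_fstatus;
        /* structs for other signatures would be added here */
    } u;
};

/* Toy operation standing in for a real floating-point routine. */
static f64 toy_half(f64 x, fstatus *st)
{
    (void)st;
    return x / 2.0;
}

/* Record the call so that it can be replayed later. */
static void cache_store(struct cache *c, f64 (*fn)(f64, fstatus *),
                        f64 arg1, fstatus arg2)
{
    c->type = CACHED_F64_F64_FSTATUS;
    c->u.f64_f64_fstatus.fn = fn;
    c->u.f64_f64_fstatus.arg1 = arg1;
    c->u.f64_f64_fstatus.arg2 = arg2;
}

/* Replay the cached call; returns 1 if something was replayed. */
static int cache_replay(struct cache *c, f64 *out)
{
    switch (c->type) {
    case CACHED_F64_F64_FSTATUS:
        *out = c->u.f64_f64_fstatus.fn(c->u.f64_f64_fstatus.arg1,
                                       &c->u.f64_f64_fstatus.arg2);
        return 1;
    default:
        return 0;
    }
}
```

Supporting a new signature in this sketch means adding one enum value,
one struct and one union member, plus a matching case in the replay
switch.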

Hardfpu in QEMU is only used when the inexact flag is already set.
So, CACHE_FN_3 checks whether FP_XX is set and, if so, sets
float_flag_inexact to enable the hardfpu path. When the instruction
is later re-executed, float_flag_inexact is clear, which forces
softfloat and updates the relevant flags correctly, as is done today.
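
The flag handling described above can be modelled with a small
standalone sketch. It is a toy under stated assumptions, not the QEMU
implementation: FLAG_INEXACT, STICKY_XX and RESULT_FI are hypothetical
stand-ins for float_flag_inexact, FP_XX and FP_FI, and toy_half is an
invented integer operation whose "fast path" skips flag computation in
the same way hardfpu does once inexact is already set.

```c
#include <assert.h>

#define FLAG_INEXACT 1  /* stand-in for float_flag_inexact */
#define STICKY_XX    2  /* stand-in for FP_XX */
#define RESULT_FI    4  /* stand-in for FP_FI */

typedef struct { int flags; } fstatus;

struct cpu {
    int fpscr;
    fstatus fp_status;
    int cached;         /* non-zero when a call has been recorded */
    long arg1;
    fstatus arg2;       /* status snapshot taken before the fast path */
};

/* Toy op: halve x. The result is "inexact" when x is odd. */
static long toy_half(long x, fstatus *st)
{
    if (!(st->flags & FLAG_INEXACT) && (x % 2 != 0)) {
        /* slow path: flags are computed exactly */
        st->flags |= FLAG_INEXACT;
    }
    /* fast path (inexact already set): flags are left untouched */
    return x / 2;
}

/* Helper: record the call, then take the fast path if allowed. */
static long helper_half(struct cpu *c, long x)
{
    if (c->fpscr & STICKY_XX) {
        c->cached = 1;
        c->arg1 = x;
        c->arg2 = c->fp_status;             /* inexact still clear */
        c->fp_status.flags |= FLAG_INEXACT; /* enable the fast path */
    } else {
        c->cached = 0;
    }
    return toy_half(x, &c->fp_status);
}

/* Replay on the slow path to recover the exact FI value. */
static void execute_cached(struct cpu *c)
{
    if (!c->cached) {
        return;
    }
    assert((c->arg2.flags & FLAG_INEXACT) == 0);
    toy_half(c->arg1, &c->arg2);
    c->fpscr &= ~RESULT_FI;
    if (c->arg2.flags & FLAG_INEXACT) {
        c->fpscr |= RESULT_FI | STICKY_XX;
    }
}
```

In this model the replay only happens when something asks for the
exact flags, so the common case pays for the fast path alone.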

This implementation only works in linux-user. No testing or effort
was made in this patch to make it work for softmmu. Future work
will be required to make it work correctly in that scenario.

Signed-off-by: V=C3=ADctor Colombo <victor.colombo@eldorado.org.br>
---
 target/ppc/cpu.h        | 11 +++++++++++
 target/ppc/fpu_helper.c | 40 +++++++++++++++++++++++++++++++++++++++-
 2 files changed, 50 insertions(+), 1 deletion(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 116ee639ff..e55c10b0db 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -1082,6 +1082,14 @@ struct ppc_radix_page_info {
=20
 enum {
     CACHED_FN_TYPE_NONE,
+    CACHED_FN_TYPE_F64_F64_FSTATUS,
+
+};
+
+struct cached_fn_f64_f64_fstatus {
+    float64 (*fn)(float64, float_status*);
+    float64 arg1;
+    float_status arg2;
 };
=20
 struct CPUArchState {
@@ -1162,6 +1170,9 @@ struct CPUArchState {
     target_ulong fpscr;     /* Floating point status and control register =
*/
=20
     int cached_fn_type;
+    union {
+        struct cached_fn_f64_f64_fstatus f64_f64_fstatus;
+    } cached_fn;
=20
     /* Internal devices resources */
     ppc_tb_t *tb_env;      /* Time base and decrementer */
diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 34b242c025..1756719664 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -30,8 +30,24 @@
                  float_flag_inexact));                                    =
    \
         env->cached_fn_type =3D CACHED_FN_TYPE_NONE;                      =
      \
     } while (0)
+
+#define CACHE_FN_3(env, FN, ARG1, ARG2, FIELD, TYPE)                      =
    \
+    do {                                                                  =
    \
+        if (env->fpscr & FP_XX) {                                         =
    \
+            env->cached_fn_type =3D TYPE;                                 =
      \
+            env->cached_fn.FIELD.fn =3D FN;                               =
      \
+            env->cached_fn.FIELD.arg1 =3D ARG1;                           =
      \
+            env->cached_fn.FIELD.arg2 =3D ARG2;                           =
      \
+            env->fp_status.float_exception_flags |=3D float_flag_inexact; =
      \
+        } else {                                                          =
    \
+            assert(!(env->fp_status.float_exception_flags &               =
    \
+                     float_flag_inexact));                                =
    \
+            env->cached_fn_type =3D CACHED_FN_TYPE_NONE;                  =
      \
+        }                                                                 =
    \
+    } while (0)
 #else
 #define CACHE_FN_NONE(env)
+#define CACHE_FN_3(env, FN, ARG1, ARG2, FIELD, TYPE)
 #endif
=20
 static inline float128 float128_snan_to_qnan(float128 x)
@@ -535,6 +551,27 @@ void helper_execute_fp_cached(CPUPPCState *env)
          * so no need to execute it again
          */
         break;
+    case CACHED_FN_TYPE_F64_F64_FSTATUS:
+        /*
+         * execute the cached insn. At this point, float_exception_flags
+         * should have FI not set, otherwise the result will not be correct
+         */
+        assert((env->cached_fn.f64_f64_fstatus.arg2.float_exception_flags &
+               float_flag_inexact) =3D=3D 0);
+        env->cached_fn.f64_f64_fstatus.fn(
+            env->cached_fn.f64_f64_fstatus.arg1,
+            &env->cached_fn.f64_f64_fstatus.arg2);
+
+        env->fpscr &=3D ~FP_FI;
+        /*
+         * if the cached instruction resulted in FI being set
+         * then we update fpscr with this value
+         */
+        if (env->cached_fn.f64_f64_fstatus.arg2.float_exception_flags &
+            float_flag_inexact) {
+            env->fpscr |=3D FP_FI | FP_XX;
+        }
+        break;
     default:
         g_assert_not_reached();
     }
@@ -878,7 +915,8 @@ static void float_invalid_op_sqrt(CPUPPCState *env, int=
 flags,
 #define FPU_FSQRT(name, op)                                               =
    \
 float64 helper_##name(CPUPPCState *env, float64 arg)                      =
    \
 {                                                                         =
    \
-    CACHE_FN_NONE(env);                                                   =
    \
+    CACHE_FN_3(env, op, arg, env->fp_status, f64_f64_fstatus,             =
    \
+        CACHED_FN_TYPE_F64_F64_FSTATUS);                                  =
    \
     float64 ret =3D op(arg, &env->fp_status);                             =
      \
     int flags =3D get_float_exception_flags(&env->fp_status);             =
      \
                                                                           =
    \
--=20
2.25.1