[PULL 03/18] target/arm: Store FPSR cumulative exception bits in env->vfp.fpsr

There is a newer version of this series
[PULL 03/18] target/arm: Store FPSR cumulative exception bits in env->vfp.fpsr
Posted by Peter Maydell 3 weeks, 4 days ago
Currently we store the FPSR cumulative exception bits in the
float_status fields, and use env->vfp.fpsr only for the NZCV bits.
(The QC bit is stored in env->vfp.qc[].)

This works for TCG, but if QEMU was built without CONFIG_TCG (i.e.
with KVM support only) then we use the stub versions of
vfp_get_fpsr_from_host() and vfp_set_fpsr_to_host() which do nothing,
throwing away the cumulative exception bit state.  The effect is that
if the FPSR state is round-tripped from KVM to QEMU then we lose the
cumulative exception bits.  In particular, this will happen if the VM
is migrated.  There is no user-visible bug when using KVM with a QEMU
binary that was built with CONFIG_TCG.

Fix this by always storing the cumulative exception bits in
env->vfp.fpsr.  If we are using TCG then we may also keep pending
cumulative exception information in the float_status fields, so we
continue to fold that in on reads.

This change will also be helpful for implementing FEAT_AFP later,
because that includes a feature where in some situations we want to
cause input denormals to be flushed to zero without affecting the
existing state of the FPSR.IDC bit, so we need a place to store IDC
which is distinct from the various float_status fields.

(Note for stable backports: the bug goes back to 4a15527c9fee but
this code was refactored in commits ea8618382aba..a8ab8706d4cc461, so
fixing it in branches without those refactorings will mean either
backporting the refactor or else implementing a conceptually similar
fix for the old code.)

Cc: qemu-stable@nongnu.org
Fixes: 4a15527c9fee ("target/arm/vfp_helper: Restrict the SoftFloat use to TCG")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241011162401.3672735-1-peter.maydell@linaro.org
---
 target/arm/vfp_helper.c | 56 ++++++++++++-----------------------------
 1 file changed, 16 insertions(+), 40 deletions(-)

diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c
index 203d37303bd..62638d2b1f9 100644
--- a/target/arm/vfp_helper.c
+++ b/target/arm/vfp_helper.c
@@ -59,32 +59,6 @@ static inline int vfp_exceptbits_from_host(int host_bits)
     return target_bits;
 }
 
-/* Convert vfp exception flags to target form.  */
-static inline int vfp_exceptbits_to_host(int target_bits)
-{
-    int host_bits = 0;
-
-    if (target_bits & 1) {
-        host_bits |= float_flag_invalid;
-    }
-    if (target_bits & 2) {
-        host_bits |= float_flag_divbyzero;
-    }
-    if (target_bits & 4) {
-        host_bits |= float_flag_overflow;
-    }
-    if (target_bits & 8) {
-        host_bits |= float_flag_underflow;
-    }
-    if (target_bits & 0x10) {
-        host_bits |= float_flag_inexact;
-    }
-    if (target_bits & 0x80) {
-        host_bits |= float_flag_input_denormal;
-    }
-    return host_bits;
-}
-
 static uint32_t vfp_get_fpsr_from_host(CPUARMState *env)
 {
     uint32_t i;
@@ -99,15 +73,14 @@ static uint32_t vfp_get_fpsr_from_host(CPUARMState *env)
     return vfp_exceptbits_from_host(i);
 }
 
-static void vfp_set_fpsr_to_host(CPUARMState *env, uint32_t val)
+static void vfp_clear_float_status_exc_flags(CPUARMState *env)
 {
     /*
-     * The exception flags are ORed together when we read fpscr so we
-     * only need to preserve the current state in one of our
-     * float_status values.
+     * Clear out all the exception-flag information in the float_status
+     * values. The caller should have arranged for env->vfp.fpsr to
+     * be the architecturally up-to-date exception flag information first.
      */
-    int i = vfp_exceptbits_to_host(val);
-    set_float_exception_flags(i, &env->vfp.fp_status);
+    set_float_exception_flags(0, &env->vfp.fp_status);
     set_float_exception_flags(0, &env->vfp.fp_status_f16);
     set_float_exception_flags(0, &env->vfp.standard_fp_status);
     set_float_exception_flags(0, &env->vfp.standard_fp_status_f16);
@@ -164,7 +137,7 @@ static uint32_t vfp_get_fpsr_from_host(CPUARMState *env)
     return 0;
 }
 
-static void vfp_set_fpsr_to_host(CPUARMState *env, uint32_t val)
+static void vfp_clear_float_status_exc_flags(CPUARMState *env)
 {
 }
 
@@ -216,8 +189,6 @@ void vfp_set_fpsr(CPUARMState *env, uint32_t val)
 {
     ARMCPU *cpu = env_archcpu(env);
 
-    vfp_set_fpsr_to_host(env, val);
-
     if (arm_feature(env, ARM_FEATURE_NEON) ||
         cpu_isar_feature(aa32_mve, cpu)) {
         /*
@@ -231,13 +202,18 @@ void vfp_set_fpsr(CPUARMState *env, uint32_t val)
     }
 
     /*
-     * The only FPSR bits we keep in vfp.fpsr are NZCV:
-     * the exception flags IOC|DZC|OFC|UFC|IXC|IDC are stored in
-     * fp_status, and QC is in vfp.qc[]. Store the NZCV bits there,
-     * and zero any of the other FPSR bits.
+     * NZCV lives only in env->vfp.fpsr. The cumulative exception flags
+     * IOC|DZC|OFC|UFC|IXC|IDC also live in env->vfp.fpsr, with possible
+     * extra pending exception information that hasn't yet been folded in
+     * living in the float_status values (for TCG).
+     * Since this FPSR write gives us the up to date values of the exception
+     * flags, we want to store into vfp.fpsr the NZCV and CEXC bits, zeroing
+     * anything else. We also need to clear out the float_status exception
+     * information so that the next vfp_get_fpsr does not fold in stale data.
      */
-    val &= FPSR_NZCV_MASK;
+    val &= FPSR_NZCV_MASK | FPSR_CEXC_MASK;
     env->vfp.fpsr = val;
+    vfp_clear_float_status_exc_flags(env);
 }
 
 static void vfp_set_fpcr_masked(CPUARMState *env, uint32_t val, uint32_t mask)
-- 
2.34.1
Re: [PULL 03/18] target/arm: Store FPSR cumulative exception bits in env->vfp.fpsr
Posted by Michael Tokarev 3 weeks, 2 days ago
29.10.2024 18:10, Peter Maydell wrote:
...
> (Note for stable backports: the bug goes back to 4a15527c9fee but
> this code was refactored in commits ea8618382aba..a8ab8706d4cc461, so
> fixing it in branches without those refactorings will mean either
> backporting the refactor or else implementing a conceptually similar
> fix for the old code.)

What do you think is the better way here -- pick up the refactoring
changes (to 9.0 and earlier), do a backport of this single fix, or
do nothing?  Note for 7.2 branch it probably requires quite a bit
more work.

Thanks,

/mjt
Re: [PULL 03/18] target/arm: Store FPSR cumulative exception bits in env->vfp.fpsr
Posted by Peter Maydell 3 weeks, 2 days ago
On Thu, 31 Oct 2024 at 16:55, Michael Tokarev <mjt@tls.msk.ru> wrote:
>
> 29.10.2024 18:10, Peter Maydell wrote:
> ...
> > (Note for stable backports: the bug goes back to 4a15527c9fee but
> > this code was refactored in commits ea8618382aba..a8ab8706d4cc461, so
> > fixing it in branches without those refactorings will mean either
> > backporting the refactor or else implementing a conceptually similar
> > fix for the old code.)
>
> What do you think is the better way here -- pick up the refactoring
> changes (to 9.0 and earlier), do a backport of this single fix, or
> do nothing?  Note for 7.2 branch it probably requires quite a bit
> more work.

The bug being fixed here is a bit niche (I don't think many
people build with KVM support only) so I would not worry
about backporting it all the way to 7.2. I'm not sure about
the best approach for 9.0 (which is why I left that note rather
than actively suggestion one or the other) -- it depends whether
you value more "backport the change that's actually been tested
by being in mainline" or "make the minimal set of changes in the
stable branch"...

-- PMM
Re: [PULL 03/18] target/arm: Store FPSR cumulative exception bits in env->vfp.fpsr
Posted by Michael Tokarev 3 weeks, 2 days ago
31.10.2024 20:01, Peter Maydell wrote:
> On Thu, 31 Oct 2024 at 16:55, Michael Tokarev <mjt@tls.msk.ru> wrote:
>>
>> 29.10.2024 18:10, Peter Maydell wrote:
>> ...
>>> (Note for stable backports: the bug goes back to 4a15527c9fee but
>>> this code was refactored in commits ea8618382aba..a8ab8706d4cc461, so
>>> fixing it in branches without those refactorings will mean either
>>> backporting the refactor or else implementing a conceptually similar
>>> fix for the old code.)
>>
>> What do you think is the better way here -- pick up the refactoring
>> changes (to 9.0 and earlier), do a backport of this single fix, or
>> do nothing?  Note for 7.2 branch it probably requires quite a bit
>> more work.
> 
> The bug being fixed here is a bit niche (I don't think many
> people build with KVM support only) so I would not worry
> about backporting it all the way to 7.2. I'm not sure about
> the best approach for 9.0 (which is why I left that note rather
> than actively suggestion one or the other) -- it depends whether
> you value more "backport the change that's actually been tested
> by being in mainline" or "make the minimal set of changes in the
> stable branch"...

Aha.

Well.

What I see is that the whole thing - plus one more commit right
before the mentioned refactoring, ea8618382aba, "target/arm: Correct
comments about M-profile FPSCR" - applies cleanly to 9.0 and 8.2.
For 7.2 though, this is definitely too much because there, tcg code
has different rules for the temps, - in particular, it frees all
temps explicitly, so new code wont fit there anyway, resulting in
leaks and requiring major re-review.

Being niche, I'd not bother with that even in 8.2, but the fact
this refactoring applies nicely.. :)

It's probably the first time when the difference between many
changes but tested in master, vs smaller possible change, is quite
significant.  I prefer for it to just work ;))

8.2 is in ubuntu noble (LTS), so it might be with us a bit longer
than some other series, but in ubuntu they build with KVM support,
so the bug isn't there.

For 7.2 there's no question anymore.

For 9.0, the next release will be the last one in the series
(unless something changes).

For being really niche, let's pick this single change to 9.1 only,
keeping other series without the fix, for now.  Unless there are
other arguments to change this.

I especially thank you for the excellent summary in the patch itself!

/mjt
Re: [PULL 03/18] target/arm: Store FPSR cumulative exception bits in env->vfp.fpsr
Posted by Peter Maydell 3 weeks, 2 days ago
On Thu, 31 Oct 2024 at 17:31, Michael Tokarev <mjt@tls.msk.ru> wrote:
> What I see is that the whole thing - plus one more commit right
> before the mentioned refactoring, ea8618382aba, "target/arm: Correct
> comments about M-profile FPSCR" - applies cleanly to 9.0 and 8.2.
> For 7.2 though, this is definitely too much because there, tcg code
> has different rules for the temps, - in particular, it frees all
> temps explicitly, so new code wont fit there anyway, resulting in
> leaks and requiring major re-review.
>
> Being niche, I'd not bother with that even in 8.2, but the fact
> this refactoring applies nicely.. :)
>
> It's probably the first time when the difference between many
> changes but tested in master, vs smaller possible change, is quite
> significant.  I prefer for it to just work ;))
>
> 8.2 is in ubuntu noble (LTS), so it might be with us a bit longer
> than some other series, but in ubuntu they build with KVM support,
> so the bug isn't there.

Note that the bug is not in "TCG + KVM" or "TCG only"
configs, but specifically "KVM without TCG" (i.e
you configured with '--disable-tcg' on an aarch64 host).

-- PMM
Re: [PULL 03/18] target/arm: Store FPSR cumulative exception bits in env->vfp.fpsr
Posted by Michael Tokarev 3 weeks, 2 days ago
31.10.2024 20:40, Peter Maydell:

>> 8.2 is in ubuntu noble (LTS), so it might be with us a bit longer
>> than some other series, but in ubuntu they build with KVM support,
>> so the bug isn't there.
> 
> Note that the bug is not in "TCG + KVM" or "TCG only"
> configs, but specifically "KVM without TCG" (i.e
> you configured with '--disable-tcg' on an aarch64 host).

Yeah, what I meant is "they build with tcg *AND* KVM", -- they enable
KVM *too*", so the bug doesn't exist in their config, so we can ignore
ubuntu in this context.

Thanks,

/mjt