[Qemu-devel] [PATCH v6 26/26] RISC-V: Workaround for critical mstatus.FS MTTCG bug

Michael Clark posted 26 patches 7 years, 10 months ago
Only 14 patches received!
[Qemu-devel] [PATCH v6 26/26] RISC-V: Workaround for critical mstatus.FS MTTCG bug
Posted by Michael Clark 7 years, 10 months ago
This change is a workaround for a bug where mstatus.FS
is not correctly reporting dirty when MTTCG and SMP are
enabled which results in the floating point register file
not being saved during context switches. This a critical
bug for RISC-V in QEMU as it results in floating point
register file corruption when running SMP Linux in the
RISC-V 'virt' machine.

This workaround will return dirty if mstatus.FS is
switched from off to initial or clean. We have checked
the specification and it is legal for an implementation
to return either off, or dirty, if set to initial or clean.

This workaround will result in unnecessary floating point
save restore. When mstatus.FS is off, floating point
instruction trap to indicate the process is using the FPU.
The OS can then save floating-point state of the previous
process using the FPU and set mstatus.FS to initial or
clean. With this workaround, mstatus.FS will always return
dirty if set to a non-zero value, indicating floating point
save restore is necessary, versus misreporting mstatus.FS
resulting in floating point register file corruption.

Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Sagar Karandikar <sagark@eecs.berkeley.edu>
Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Cc: Richard W.M. Jones <rjones@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Clark <mjc@sifive.com>
---
 target/riscv/op_helper.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/target/riscv/op_helper.c b/target/riscv/op_helper.c
index 1fdde90..d345688 100644
--- a/target/riscv/op_helper.c
+++ b/target/riscv/op_helper.c
@@ -144,8 +144,23 @@ void csr_write_helper(CPURISCVState *env, target_ulong val_to_write,
         }
 
         mstatus = (mstatus & ~mask) | (val_to_write & mask);
-        int dirty = (mstatus & MSTATUS_FS) == MSTATUS_FS;
-        dirty |= (mstatus & MSTATUS_XS) == MSTATUS_XS;
+
+        /* Note: this is a workaround for an issue where mstatus.FS
+           does not report dirty when SMP and MTTCG is enabled. This
+           workaround is technically compliant with the RISC-V Privileged
+           specification as it is legal to return only off, or dirty,
+           however this may cause unnecessary saves of floating point state.
+           Without this workaround, floating point state is not saved and
+           restored coorectly when SMP and MTTCG is enabled, */
+        if (qemu_tcg_mttcg_enabled()) {
+            /* FP is always dirty or off */
+            if (mstatus & MSTATUS_FS) {
+                mstatus |= MSTATUS_FS;
+            }
+        }
+
+        int dirty = ((mstatus & MSTATUS_FS) == MSTATUS_FS) |
+                    ((mstatus & MSTATUS_XS) == MSTATUS_XS);
         mstatus = set_field(mstatus, MSTATUS_SD, dirty);
         env->mstatus = mstatus;
         break;
-- 
2.7.0


Re: [Qemu-devel] [PATCH v6 26/26] RISC-V: Workaround for critical mstatus.FS MTTCG bug
Posted by Michael Clark 7 years, 10 months ago
On Sat, Mar 24, 2018 at 11:13 AM, Michael Clark <mjc@sifive.com> wrote:

> This change is a workaround for a bug where mstatus.FS
> is not correctly reporting dirty when MTTCG and SMP are
> enabled which results in the floating point register file
> not being saved during context switches. This a critical
> bug for RISC-V in QEMU as it results in floating point
> register file corruption when running SMP Linux in the
> RISC-V 'virt' machine.
>
> This workaround will return dirty if mstatus.FS is
> switched from off to initial or clean. We have checked
> the specification and it is legal for an implementation
> to return either off, or dirty, if set to initial or clean.
>
> This workaround will result in unnecessary floating point
> save restore. When mstatus.FS is off, floating point
> instruction trap to indicate the process is using the FPU.
> The OS can then save floating-point state of the previous
> process using the FPU and set mstatus.FS to initial or
> clean. With this workaround, mstatus.FS will always return
> dirty if set to a non-zero value, indicating floating point
> save restore is necessary, versus misreporting mstatus.FS
> resulting in floating point register file corruption.
>
> Cc: Palmer Dabbelt <palmer@sifive.com>
> Cc: Sagar Karandikar <sagark@eecs.berkeley.edu>
> Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
> Cc: Richard W.M. Jones <rjones@redhat.com>
> Cc: Peter Maydell <peter.maydell@linaro.org>
> Signed-off-by: Michael Clark <mjc@sifive.com>
> ---
>  target/riscv/op_helper.c | 19 +++++++++++++++++--
>  1 file changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/target/riscv/op_helper.c b/target/riscv/op_helper.c
> index 1fdde90..d345688 100644
> --- a/target/riscv/op_helper.c
> +++ b/target/riscv/op_helper.c
> @@ -144,8 +144,23 @@ void csr_write_helper(CPURISCVState *env,
> target_ulong val_to_write,
>          }
>
>          mstatus = (mstatus & ~mask) | (val_to_write & mask);
> -        int dirty = (mstatus & MSTATUS_FS) == MSTATUS_FS;
> -        dirty |= (mstatus & MSTATUS_XS) == MSTATUS_XS;
> +
> +        /* Note: this is a workaround for an issue where mstatus.FS
> +           does not report dirty when SMP and MTTCG is enabled. This
> +           workaround is technically compliant with the RISC-V Privileged
> +           specification as it is legal to return only off, or dirty,
> +           however this may cause unnecessary saves of floating point
> state.
> +           Without this workaround, floating point state is not saved and
> +           restored coorectly when SMP and MTTCG is enabled, */
>

typo in "correctly". I will fix this...


> +        if (qemu_tcg_mttcg_enabled()) {
> +            /* FP is always dirty or off */
> +            if (mstatus & MSTATUS_FS) {
> +                mstatus |= MSTATUS_FS;
> +            }
> +        }
> +
> +        int dirty = ((mstatus & MSTATUS_FS) == MSTATUS_FS) |
> +                    ((mstatus & MSTATUS_XS) == MSTATUS_XS);
>          mstatus = set_field(mstatus, MSTATUS_SD, dirty);
>          env->mstatus = mstatus;
>          break;
> --
> 2.7.0
>
>
Re: [Qemu-devel] [PATCH v6 26/26] RISC-V: Workaround for critical mstatus.FS MTTCG bug
Posted by Richard W.M. Jones 7 years, 10 months ago
On Sat, Mar 24, 2018 at 11:13:40AM -0700, Michael Clark wrote:
> This change is a workaround for a bug where mstatus.FS
> is not correctly reporting dirty when MTTCG and SMP are
> enabled which results in the floating point register file
> not being saved during context switches. This a critical
> bug for RISC-V in QEMU as it results in floating point
> register file corruption when running SMP Linux in the
> RISC-V 'virt' machine.
> 
> This workaround will return dirty if mstatus.FS is
> switched from off to initial or clean. We have checked
> the specification and it is legal for an implementation
> to return either off, or dirty, if set to initial or clean.
> 
> This workaround will result in unnecessary floating point
> save restore. When mstatus.FS is off, floating point
> instruction trap to indicate the process is using the FPU.
> The OS can then save floating-point state of the previous
> process using the FPU and set mstatus.FS to initial or
> clean. With this workaround, mstatus.FS will always return
> dirty if set to a non-zero value, indicating floating point
> save restore is necessary, versus misreporting mstatus.FS
> resulting in floating point register file corruption.
> 
> Cc: Palmer Dabbelt <palmer@sifive.com>
> Cc: Sagar Karandikar <sagark@eecs.berkeley.edu>
> Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
> Cc: Richard W.M. Jones <rjones@redhat.com>
> Cc: Peter Maydell <peter.maydell@linaro.org>
> Signed-off-by: Michael Clark <mjc@sifive.com>

I tested this by running qemu from git with and without this patch,
both times compiling and running the “sched.c” test program:

http://oirase.annexia.org/tmp/sched.c

In my tests it fixes the problem, and therefore:

Tested-by: Richard W.M. Jones <rjones@redhat.com>

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v