target/i386: fix operand size for VCOMI/VUCOMI instructions

[PATCH] target/i386: fix operand size for VCOMI/VUCOMI instructions

Posted by Paolo Bonzini 11 months, 3 weeks ago

Compared to other SSE instructions, VUCOMISx and VCOMISx are different:
the single and double precision versions are distinguished through a
prefix, however they use no-prefix and 0x66 for SS and SD respectively.
Scalar values usually are associated with 0xF2 and 0xF3.

Because of these, they incorrectly perform a 128-bit memory load instead
of a 32- or 64-bit load.  Fix this by writing a custom decoding function.

I tested that the reproducer is fixed and the test-avx output does not
change.

Reported-by: Gabriele Svelto <gsvelto@mozilla.com>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1637
Fixes: f8d19eec0d53 ("target/i386: reimplement 0x0f 0x28-0x2f, add AVX", 2022-10-18)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/tcg/decode-new.c.inc | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 4fdd87750bea..48fefaffdf63 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -783,6 +783,17 @@ static void decode_0F2D(DisasContext *s, CPUX86State *env, X86OpEntry *entry, ui
     *entry = *decode_by_prefix(s, opcodes_0F2D);
 }
 
+static void decode_VxCOMISx(DisasContext *s, CPUX86State *env, X86OpEntry *entry, uint8_t *b)
+{
+    /*
+     * VUCOMISx and VCOMISx are different and use no-prefix and 0x66 for SS and SD
+     * respectively.  Scalar values usually are associated with 0xF2 and 0xF3, for
+     * which X86_VEX_REPScalar exists, but here it has to be decoded by hand.
+     */
+    entry->s1 = entry->s2 = (s->prefix & PREFIX_DATA ? X86_SIZE_sd : X86_SIZE_ss);
+    entry->gen = (*b == 0x2E ? gen_VUCOMI : gen_VCOMI);
+}
+
 static void decode_sse_unary(DisasContext *s, CPUX86State *env, X86OpEntry *entry, uint8_t *b)
 {
     if (!(s->prefix & (PREFIX_REPZ | PREFIX_REPNZ))) {
@@ -871,8 +882,8 @@ static const X86OpEntry opcodes_0F[256] = {
     [0x2B] = X86_OP_GROUP0(0F2B),
     [0x2C] = X86_OP_GROUP0(0F2C),
     [0x2D] = X86_OP_GROUP0(0F2D),
-    [0x2E] = X86_OP_ENTRY3(VUCOMI,     None,None, V,x, W,x,  vex4 p_00_66),
-    [0x2F] = X86_OP_ENTRY3(VCOMI,      None,None, V,x, W,x,  vex4 p_00_66),
+    [0x2E] = X86_OP_GROUP3(VxCOMISx,   None,None, V,x, W,x,  vex3 p_00_66), /* VUCOMISS/SD */
+    [0x2F] = X86_OP_GROUP3(VxCOMISx,   None,None, V,x, W,x,  vex3 p_00_66), /* VCOMISS/SD */
 
     [0x38] = X86_OP_GROUP0(0F38),
     [0x3a] = X86_OP_GROUP0(0F3A),
-- 
2.40.1

Re: [PATCH] target/i386: fix operand size for VCOMI/VUCOMI instructions

Posted by Peter Maydell 11 months, 3 weeks ago

On Tue, 9 May 2023 at 15:27, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> Compared to other SSE instructions, VUCOMISx and VCOMISx are different:
> the single and double precision versions are distinguished through a
> prefix, however they use no-prefix and 0x66 for SS and SD respectively.
> Scalar values usually are associated with 0xF2 and 0xF3.
>
> Because of these, they incorrectly perform a 128-bit memory load instead
> of a 32- or 64-bit load.  Fix this by writing a custom decoding function.
>
> I tested that the reproducer is fixed and the test-avx output does not
> change.
>
> Reported-by: Gabriele Svelto <gsvelto@mozilla.com>
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1637
> Fixes: f8d19eec0d53 ("target/i386: reimplement 0x0f 0x28-0x2f, add AVX", 2022-10-18)
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Worth
Cc: qemu-stable@nongnu.org
also? We have real-world reports of guests falling over on this.

thanks
-- PMM

Re: [PATCH] target/i386: fix operand size for VCOMI/VUCOMI instructions

Posted by Richard Henderson 11 months, 3 weeks ago

On 5/9/23 15:26, Paolo Bonzini wrote:
> Compared to other SSE instructions, VUCOMISx and VCOMISx are different:
> the single and double precision versions are distinguished through a
> prefix, however they use no-prefix and 0x66 for SS and SD respectively.
> Scalar values usually are associated with 0xF2 and 0xF3.
> 
> Because of these, they incorrectly perform a 128-bit memory load instead
> of a 32- or 64-bit load.  Fix this by writing a custom decoding function.
> 
> I tested that the reproducer is fixed and the test-avx output does not
> change.
> 
> Reported-by: Gabriele Svelto<gsvelto@mozilla.com>
> Resolves:https://gitlab.com/qemu-project/qemu/-/issues/1637
> Fixes: f8d19eec0d53 ("target/i386: reimplement 0x0f 0x28-0x2f, add AVX", 2022-10-18)
> Signed-off-by: Paolo Bonzini<pbonzini@redhat.com>
> ---
>   target/i386/tcg/decode-new.c.inc | 15 +++++++++++++--
>   1 file changed, 13 insertions(+), 2 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~