target/i386: fix width of third operand of VINSERTx128

[PATCH] target/i386: fix width of third operand of VINSERTx128

Posted by Paolo Bonzini 3 months, 3 weeks ago

Table A-5 of the Intel manual incorrectly lists the third operand of
VINSERTx128 as Wqq, but it is actually a 128-bit value.  This is
visible when W is a memory operand close to the end of the page.

Fixes the recently-added poly1305_kunit test in linux-next.

(No testcase yet, but I plan to modify test-avx2 to use memory
close to the end of the page.  This would work because the test
vectors correctly have the memory operand as xmm2/m128).

Reported-by: Eric Biggers <ebiggers@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/tcg/decode-new.c.inc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 853b1c8bf95..51038657f0f 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -878,10 +878,10 @@ static const X86OpEntry opcodes_0F3A[256] = {
     [0x0e] = X86_OP_ENTRY4(VPBLENDW,   V,x,  H,x,  W,x,  vex4 cpuid(SSE41) avx2_256 p_66),
     [0x0f] = X86_OP_ENTRY4(PALIGNR,    V,x,  H,x,  W,x,  vex4 cpuid(SSSE3) mmx avx2_256 p_00_66),
 
-    [0x18] = X86_OP_ENTRY4(VINSERTx128,  V,qq, H,qq, W,qq, vex6 chk(W0) cpuid(AVX) p_66),
+    [0x18] = X86_OP_ENTRY4(VINSERTx128,  V,qq, H,qq, W,dq, vex6 chk(W0) cpuid(AVX) p_66),
     [0x19] = X86_OP_ENTRY3(VEXTRACTx128, W,dq, V,qq, I,b,  vex6 chk(W0) cpuid(AVX) p_66),
 
-    [0x38] = X86_OP_ENTRY4(VINSERTx128,  V,qq, H,qq, W,qq, vex6 chk(W0) cpuid(AVX2) p_66),
+    [0x38] = X86_OP_ENTRY4(VINSERTx128,  V,qq, H,qq, W,dq, vex6 chk(W0) cpuid(AVX2) p_66),
     [0x39] = X86_OP_ENTRY3(VEXTRACTx128, W,dq, V,qq, I,b,  vex6 chk(W0) cpuid(AVX2) p_66),
 
     /* Listed incorrectly as type 4 */
-- 
2.50.1
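
To illustrate why the operand width matters near a page boundary, here is a
minimal sketch in plain C. It is not QEMU code: crosses_page(), PAGE_SIZE
(assumed to be 4 KiB) and the addresses used are hypothetical names and values
chosen for illustration. It only shows that a 16-byte (dq) access placed in the
last 16 bytes of a page stays within that page, while a 32-byte (qq) access at
the same address would spill into the next page, which may be unmapped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as on typical x86 configurations. */
#define PAGE_SIZE 4096u

/*
 * Hypothetical helper: returns true if an access of `width` bytes
 * starting at `addr` touches the page following the one containing `addr`.
 */
static bool crosses_page(uint64_t addr, unsigned width)
{
    assert(width > 0 && width <= PAGE_SIZE);

    /* Offset of the first accessed byte within its page. */
    uint64_t offset = addr & (PAGE_SIZE - 1);

    /* The access ends at offset + width; anything past PAGE_SIZE spills over. */
    return offset + width > PAGE_SIZE;
}
```

For an operand at 0x7ff0 (the last 16 bytes of the page starting at 0x7000),
crosses_page(0x7ff0, 16) is false and crosses_page(0x7ff0, 32) is true. Under
these assumptions, decoding the operand as qq instead of dq would turn a valid
access into one that can fault on the following page.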

Re: [PATCH] target/i386: fix width of third operand of VINSERTx128

Posted by Philippe Mathieu-Daudé 3 months, 2 weeks ago

On 25/7/25 08:17, Paolo Bonzini wrote:
> Table A-5 of the Intel manual incorrectly lists the third operand of
> VINSERTx128 as Wqq, but it is actually a 128-bit value.  This is
> visible when W is a memory operand close to the end of the page.
> 
> Fixes the recently-added poly1305_kunit test in linux-next.
> 
> (No testcase yet, but I plan to modify test-avx2 to use memory
> close to the end of the page.  This would work because the test
> vectors correctly have the memory operand as xmm2/m128).
> 

Fixes: 79068477686 ("target/i386: reimplement 0x0f 0x3a, add AVX")

> Reported-by: Eric Biggers <ebiggers@kernel.org>
> Cc: Ard Biesheuvel <ardb@kernel.org>
> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
> Cc: Guenter Roeck <linux@roeck-us.net>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   target/i386/tcg/decode-new.c.inc | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)

Re: [PATCH] target/i386: fix width of third operand of VINSERTx128

Posted by Richard Henderson 3 months, 3 weeks ago

On 7/24/25 20:17, Paolo Bonzini wrote:
> Table A-5 of the Intel manual incorrectly lists the third operand of
> VINSERTx128 as Wqq, but it is actually a 128-bit value.  This is
> visible when W is a memory operand close to the end of the page.
> 
> Fixes the recently-added poly1305_kunit test in linux-next.
> 
> (No testcase yet, but I plan to modify test-avx2 to use memory
> close to the end of the page.  This would work because the test
> vectors correctly have the memory operand as xmm2/m128).
> 
> Reported-by: Eric Biggers <ebiggers@kernel.org>
> Cc: Ard Biesheuvel <ardb@kernel.org>
> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
> Cc: Guenter Roeck <linux@roeck-us.net>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   target/i386/tcg/decode-new.c.inc | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~

Re: [PATCH] target/i386: fix width of third operand of VINSERTx128

Posted by Guenter Roeck 3 months, 3 weeks ago

On Fri, Jul 25, 2025 at 08:17:36AM +0200, Paolo Bonzini wrote:
> Table A-5 of the Intel manual incorrectly lists the third operand of
> VINSERTx128 as Wqq, but it is actually a 128-bit value.  This is
> visible when W is a memory operand close to the end of the page.
> 
> Fixes the recently-added poly1305_kunit test in linux-next.
> 
> (No testcase yet, but I plan to modify test-avx2 to use memory
> close to the end of the page.  This would work because the test
> vectors correctly have the memory operand as xmm2/m128).
> 
> Reported-by: Eric Biggers <ebiggers@kernel.org>
> Cc: Ard Biesheuvel <ardb@kernel.org>
> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
> Cc: Guenter Roeck <linux@roeck-us.net>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Tested-by: Guenter Roeck <linux@roeck-us.net>

Thanks a lot for the quick fix!

Guenter

Re: [PATCH] target/i386: fix width of third operand of VINSERTx128

Posted by Eric Biggers 3 months, 3 weeks ago

On Fri, Jul 25, 2025 at 08:17:36AM +0200, Paolo Bonzini wrote:
> Table A-5 of the Intel manual incorrectly lists the third operand of
> VINSERTx128 as Wqq, but it is actually a 128-bit value.

That's annoying.  I wonder what the way to report that is.

FWIW, AMD's manual gets it right.

> This is
> visible when W is a memory operand close to the end of the page.
> 
> Fixes the recently-added poly1305_kunit test in linux-next.
> 
> (No testcase yet, but I plan to modify test-avx2 to use memory
> close to the end of the page.  This would work because the test
> vectors correctly have the memory operand as xmm2/m128).
> 
> Reported-by: Eric Biggers <ebiggers@kernel.org>
> Cc: Ard Biesheuvel <ardb@kernel.org>
> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
> Cc: Guenter Roeck <linux@roeck-us.net>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  target/i386/tcg/decode-new.c.inc | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
> index 853b1c8bf95..51038657f0f 100644
> --- a/target/i386/tcg/decode-new.c.inc
> +++ b/target/i386/tcg/decode-new.c.inc
> @@ -878,10 +878,10 @@ static const X86OpEntry opcodes_0F3A[256] = {
>      [0x0e] = X86_OP_ENTRY4(VPBLENDW,   V,x,  H,x,  W,x,  vex4 cpuid(SSE41) avx2_256 p_66),
>      [0x0f] = X86_OP_ENTRY4(PALIGNR,    V,x,  H,x,  W,x,  vex4 cpuid(SSSE3) mmx avx2_256 p_00_66),
>  
> -    [0x18] = X86_OP_ENTRY4(VINSERTx128,  V,qq, H,qq, W,qq, vex6 chk(W0) cpuid(AVX) p_66),
> +    [0x18] = X86_OP_ENTRY4(VINSERTx128,  V,qq, H,qq, W,dq, vex6 chk(W0) cpuid(AVX) p_66),
>      [0x19] = X86_OP_ENTRY3(VEXTRACTx128, W,dq, V,qq, I,b,  vex6 chk(W0) cpuid(AVX) p_66),
>  
> -    [0x38] = X86_OP_ENTRY4(VINSERTx128,  V,qq, H,qq, W,qq, vex6 chk(W0) cpuid(AVX2) p_66),
> +    [0x38] = X86_OP_ENTRY4(VINSERTx128,  V,qq, H,qq, W,dq, vex6 chk(W0) cpuid(AVX2) p_66),
>      [0x39] = X86_OP_ENTRY3(VEXTRACTx128, W,dq, V,qq, I,b,  vex6 chk(W0) cpuid(AVX2) p_66),
>  
>      /* Listed incorrectly as type 4 */

Tested-by: Eric Biggers <ebiggers@kernel.org>

Thanks,

- Eric