Table A-5 of the Intel manual incorrectly lists the third operand of
VINSERTx128 as Wqq, but it is actually a 128-bit value. This is
visible when W is a memory operand close to the end of the page.
Fixes the recently-added poly1305_kunit test in linux-next.
(No testcase yet, but I plan to modify test-avx2 to use memory
close to the end of the page. This would work because the test
vectors correctly have the memory operand as xmm2/m128).
Reported-by: Eric Biggers <ebiggers@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
target/i386/tcg/decode-new.c.inc | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 853b1c8bf95..51038657f0f 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -878,10 +878,10 @@ static const X86OpEntry opcodes_0F3A[256] = {
[0x0e] = X86_OP_ENTRY4(VPBLENDW, V,x, H,x, W,x, vex4 cpuid(SSE41) avx2_256 p_66),
[0x0f] = X86_OP_ENTRY4(PALIGNR, V,x, H,x, W,x, vex4 cpuid(SSSE3) mmx avx2_256 p_00_66),
- [0x18] = X86_OP_ENTRY4(VINSERTx128, V,qq, H,qq, W,qq, vex6 chk(W0) cpuid(AVX) p_66),
+ [0x18] = X86_OP_ENTRY4(VINSERTx128, V,qq, H,qq, W,dq, vex6 chk(W0) cpuid(AVX) p_66),
[0x19] = X86_OP_ENTRY3(VEXTRACTx128, W,dq, V,qq, I,b, vex6 chk(W0) cpuid(AVX) p_66),
- [0x38] = X86_OP_ENTRY4(VINSERTx128, V,qq, H,qq, W,qq, vex6 chk(W0) cpuid(AVX2) p_66),
+ [0x38] = X86_OP_ENTRY4(VINSERTx128, V,qq, H,qq, W,dq, vex6 chk(W0) cpuid(AVX2) p_66),
[0x39] = X86_OP_ENTRY3(VEXTRACTx128, W,dq, V,qq, I,b, vex6 chk(W0) cpuid(AVX2) p_66),
/* Listed incorrectly as type 4 */
--
2.50.1
On 25/7/25 08:17, Paolo Bonzini wrote:
> Table A-5 of the Intel manual incorrectly lists the third operand of
> VINSERTx128 as Wqq, but it is actually a 128-bit value. This is
> visible when W is a memory operand close to the end of the page.
>
> Fixes the recently-added poly1305_kunit test in linux-next.
>
> (No testcase yet, but I plan to modify test-avx2 to use memory
> close to the end of the page. This would work because the test
> vectors correctly have the memory operand as xmm2/m128).
>
Fixes: 79068477686 ("target/i386: reimplement 0x0f 0x3a, add AVX")
> Reported-by: Eric Biggers <ebiggers@kernel.org>
> Cc: Ard Biesheuvel <ardb@kernel.org>
> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
> Cc: Guenter Roeck <linux@roeck-us.net>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> target/i386/tcg/decode-new.c.inc | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
On 7/24/25 20:17, Paolo Bonzini wrote:
> Table A-5 of the Intel manual incorrectly lists the third operand of
> VINSERTx128 as Wqq, but it is actually a 128-bit value. This is
> visible when W is a memory operand close to the end of the page.
>
> Fixes the recently-added poly1305_kunit test in linux-next.
>
> (No testcase yet, but I plan to modify test-avx2 to use memory
> close to the end of the page. This would work because the test
> vectors correctly have the memory operand as xmm2/m128).
>
> Reported-by: Eric Biggers <ebiggers@kernel.org>
> Cc: Ard Biesheuvel <ardb@kernel.org>
> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
> Cc: Guenter Roeck <linux@roeck-us.net>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> target/i386/tcg/decode-new.c.inc | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
On Fri, Jul 25, 2025 at 08:17:36AM +0200, Paolo Bonzini wrote:
> Table A-5 of the Intel manual incorrectly lists the third operand of
> VINSERTx128 as Wqq, but it is actually a 128-bit value. This is
> visible when W is a memory operand close to the end of the page.
>
> Fixes the recently-added poly1305_kunit test in linux-next.
>
> (No testcase yet, but I plan to modify test-avx2 to use memory
> close to the end of the page. This would work because the test
> vectors correctly have the memory operand as xmm2/m128).
>
> Reported-by: Eric Biggers <ebiggers@kernel.org>
> Cc: Ard Biesheuvel <ardb@kernel.org>
> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
> Cc: Guenter Roeck <linux@roeck-us.net>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Thanks a lot for the quick fix!
Guenter
On Fri, Jul 25, 2025 at 08:17:36AM +0200, Paolo Bonzini wrote:
> Table A-5 of the Intel manual incorrectly lists the third operand of
> VINSERTx128 as Wqq, but it is actually a 128-bit value.
That's annoying. I wonder what the way to report that is.
FWIW, AMD's manual gets it right.
> This is
> visible when W is a memory operand close to the end of the page.
>
> Fixes the recently-added poly1305_kunit test in linux-next.
>
> (No testcase yet, but I plan to modify test-avx2 to use memory
> close to the end of the page. This would work because the test
> vectors correctly have the memory operand as xmm2/m128).
>
> Reported-by: Eric Biggers <ebiggers@kernel.org>
> Cc: Ard Biesheuvel <ardb@kernel.org>
> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
> Cc: Guenter Roeck <linux@roeck-us.net>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> target/i386/tcg/decode-new.c.inc | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
> index 853b1c8bf95..51038657f0f 100644
> --- a/target/i386/tcg/decode-new.c.inc
> +++ b/target/i386/tcg/decode-new.c.inc
> @@ -878,10 +878,10 @@ static const X86OpEntry opcodes_0F3A[256] = {
> [0x0e] = X86_OP_ENTRY4(VPBLENDW, V,x, H,x, W,x, vex4 cpuid(SSE41) avx2_256 p_66),
> [0x0f] = X86_OP_ENTRY4(PALIGNR, V,x, H,x, W,x, vex4 cpuid(SSSE3) mmx avx2_256 p_00_66),
>
> - [0x18] = X86_OP_ENTRY4(VINSERTx128, V,qq, H,qq, W,qq, vex6 chk(W0) cpuid(AVX) p_66),
> + [0x18] = X86_OP_ENTRY4(VINSERTx128, V,qq, H,qq, W,dq, vex6 chk(W0) cpuid(AVX) p_66),
> [0x19] = X86_OP_ENTRY3(VEXTRACTx128, W,dq, V,qq, I,b, vex6 chk(W0) cpuid(AVX) p_66),
>
> - [0x38] = X86_OP_ENTRY4(VINSERTx128, V,qq, H,qq, W,qq, vex6 chk(W0) cpuid(AVX2) p_66),
> + [0x38] = X86_OP_ENTRY4(VINSERTx128, V,qq, H,qq, W,dq, vex6 chk(W0) cpuid(AVX2) p_66),
> [0x39] = X86_OP_ENTRY3(VEXTRACTx128, W,dq, V,qq, I,b, vex6 chk(W0) cpuid(AVX2) p_66),
>
> /* Listed incorrectly as type 4 */
Tested-by: Eric Biggers <ebiggers@kernel.org>
Thanks,
- Eric