[v2] x86: support AVX512-FP16

[PATCH v2 00/10] x86: support AVX512-FP16

Posted by Jan Beulich 1 year ago

While I (quite obviously) don't have any suitable hardware, Intel's
SDE allows testing the implementation. And since there's no new
state (registers) associated with this ISA extension, this should
suffice for integration.

01: handle AVX512-FP16 insns encoded in 0f3a opcode map
02: handle AVX512-FP16 Map5 arithmetic insns
03: handle AVX512-FP16 move insns
04: handle AVX512-FP16 fma-like insns
05: handle AVX512-FP16 Map6 misc insns
06: handle AVX512-FP16 complex multiplication insns
07: handle AVX512-FP16 conversion to/from (packed) int16 insns
08: handle AVX512-FP16 floating point conversion insns
09: handle AVX512-FP16 conversion to/from (packed) int{32,64} insns
10: AVX512-FP16 testing

I've re-based this ahead of the also pending AMX series (and,
obviously, ahead of the not yet even submitted KeyLocker one), in
the hope that this may find its way in sooner than that other series.

Jan

Re: [PATCH v2 00/10] x86: support AVX512-FP16

Posted by Andrew Cooper 11 months, 1 week ago

On 03/04/2023 3:56 pm, Jan Beulich wrote:
> While I (quite obviously) don't have any suitable hardware, Intel's
> SDE allows testing the implementation. And since there's no new
> state (registers) associated with this ISA extension, this should
> suffice for integration.

I've given this a spin on a Sapphire Rapids system.

Relevant (AFAICT) bits of the log:

Testing vfpclasspsz $0x46,64(%edx),%k2...okay
Testing vfpclassphz $0x46,128(%ecx),%k3...okay
...
Testing avx512_fp16/all disp8 handling...okay
Testing avx512_fp16/128 disp8 handling...okay
...
Testing AVX512_FP16 f16 scal native execution...okay
Testing AVX512_FP16 f16 scal 64-bit code sequence...okay
Testing AVX512_FP16 f16 scal 32-bit code sequence...okay
Testing AVX512_FP16 f16x32 native execution...okay
Testing AVX512_FP16 f16x32 64-bit code sequence...okay
Testing AVX512_FP16 f16x32 32-bit code sequence...okay
Testing AVX512_FP16+VL f16x8 native execution...okay
Testing AVX512_FP16+VL f16x8 64-bit code sequence...okay
Testing AVX512_FP16+VL f16x8 32-bit code sequence...okay
Testing AVX512_FP16+VL f16x16 native execution...okay
Testing AVX512_FP16+VL f16x16 64-bit code sequence...okay
Testing AVX512_FP16+VL f16x16 32-bit code sequence...okay

and it exits zero, so everything seems fine.
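
For anyone wanting to double-check a full log rather than rely on the exit status alone, a minimal sketch in Python follows. It assumes every harness line has the shape "Testing <name>...<result>" as in the excerpt above, and it treats any result other than "okay" as worth reporting; that interpretation, and the function name `not_okay`, are assumptions for illustration and not part of the harness.

```python
import re

# Assumed line shape, taken from the excerpt above: "Testing <name>...<result>".
LINE = re.compile(r"^Testing (?P<name>.+?)\.\.\.(?P<result>\S.*)$")


def not_okay(log_text):
    """Return the names of tests whose result is anything other than "okay".

    Lines that do not match the assumed shape (for example a bare "...")
    are ignored.
    """
    names = []
    for line in log_text.splitlines():
        match = LINE.match(line.strip())
        if match and match.group("result") != "okay":
            names.append(match.group("name"))
    return names
```

An empty list from `not_okay(open("harness.log").read())` (the file name is hypothetical) would agree with the zero exit status.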

One thing however, this series ups the minimum GCC version required to
build the emulator at all:

make: Entering directory '/local/xen.git/tools/tests/x86_emulator'
gcc: error: unrecognized command-line option ‘-mavx512fp16’; did you mean ‘-mavx512bf16’?
Makefile:121: Test harness not built, use newer compiler than "gcc" (version 10) and an "{evex}" capable assembler

and I'm not sure we want to do this.  Upping the version of GCC but
leaving binutils as-was does lead to a build of the harness without
AVX512-FP16 active, which is the preferred behaviour here.
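
To illustrate the preferred behaviour (drop an unsupported ISA option instead of refusing to build), here is a minimal sketch of per-flag probing in Python. It is not how the harness Makefile works; the helper names `compiler_accepts` and `usable_flags` are hypothetical, and the probe simply asks the compiler to build an empty C translation unit with the flag in question.

```python
import subprocess


def compiler_accepts(cc, flag):
    """Return True if `cc` compiles an empty C translation unit with `flag`.

    A missing or non-executable compiler counts as "does not accept".
    """
    try:
        result = subprocess.run(
            [cc, flag, "-x", "c", "-c", "-o", "/dev/null", "-"],
            input=b"",                    # empty source on stdin
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except OSError:
        return False
    return result.returncode == 0


def usable_flags(candidates, probe):
    """Keep only the flags for which probe(flag) is true, preserving order."""
    return [flag for flag in candidates if probe(flag)]
```

With GCC 10, `usable_flags(["-mavx512bf16", "-mavx512fp16"], lambda f: compiler_accepts("gcc", f))` would be expected to keep only the first flag, going by the error message above, so the rest of the harness could still be built.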

~Andrew

Re: [PATCH v2 00/10] x86: support AVX512-FP16

Posted by Jan Beulich 11 months, 1 week ago

On 22.05.2023 18:25, Andrew Cooper wrote:
> On 03/04/2023 3:56 pm, Jan Beulich wrote:
>> While I (quite obviously) don't have any suitable hardware, Intel's
>> SDE allows testing the implementation. And since there's no new
>> state (registers) associated with this ISA extension, this should
>> suffice for integration.
> 
> I've given this a spin on a Sapphire Rapids system.
> 
> Relevant (AFAICT) bits of the log:
> 
> Testing vfpclasspsz $0x46,64(%edx),%k2...okay
> Testing vfpclassphz $0x46,128(%ecx),%k3...okay
> ...
> Testing avx512_fp16/all disp8 handling...okay
> Testing avx512_fp16/128 disp8 handling...okay
> ...
> Testing AVX512_FP16 f16 scal native execution...okay
> Testing AVX512_FP16 f16 scal 64-bit code sequence...okay
> Testing AVX512_FP16 f16 scal 32-bit code sequence...okay
> Testing AVX512_FP16 f16x32 native execution...okay
> Testing AVX512_FP16 f16x32 64-bit code sequence...okay
> Testing AVX512_FP16 f16x32 32-bit code sequence...okay
> Testing AVX512_FP16+VL f16x8 native execution...okay
> Testing AVX512_FP16+VL f16x8 64-bit code sequence...okay
> Testing AVX512_FP16+VL f16x8 32-bit code sequence...okay
> Testing AVX512_FP16+VL f16x16 native execution...okay
> Testing AVX512_FP16+VL f16x16 64-bit code sequence...okay
> Testing AVX512_FP16+VL f16x16 32-bit code sequence...okay
> 
> and it exits zero, so everything seems fine.
> 
> 
> One thing however, this series ups the minimum GCC version required to
> build the emulator at all:
> 
> make: Entering directory '/local/xen.git/tools/tests/x86_emulator'
> gcc: error: unrecognized command-line option ‘-mavx512fp16’; did you
> mean ‘-mavx512bf16’?
> Makefile:121: Test harness not built, use newer compiler than "gcc"
> (version 10) and an "{evex}" capable assembler
> 
> and I'm not sure we want to do this.  Upping the version of GCC but
> leaving binutils as-was does lead to a build of the harness without
> AVX512-FP16 active, which is the preferred behaviour here.

Well, this series on its own does, but I did notice the issue already.
Hence "x86emul: rework compiler probing in the test harness" [1].

Jan

[1] https://lists.xen.org/archives/html/xen-devel/2023-03/msg00123.html

Re: [PATCH v2 00/10] x86: support AVX512-FP16

Posted by Andrew Cooper 10 months, 4 weeks ago

On 23/05/2023 7:35 am, Jan Beulich wrote:
> On 22.05.2023 18:25, Andrew Cooper wrote:
>> On 03/04/2023 3:56 pm, Jan Beulich wrote:
>>> While I (quite obviously) don't have any suitable hardware, Intel's
>>> SDE allows testing the implementation. And since there's no new
>>> state (registers) associated with this ISA extension, this should
>>> suffice for integration.
>> I've given this a spin on a Sapphire Rapids system.
>>
>> Relevant (AFAICT) bits of the log:
>>
>> Testing vfpclasspsz $0x46,64(%edx),%k2...okay
>> Testing vfpclassphz $0x46,128(%ecx),%k3...okay
>> ...
>> Testing avx512_fp16/all disp8 handling...okay
>> Testing avx512_fp16/128 disp8 handling...okay
>> ...
>> Testing AVX512_FP16 f16 scal native execution...okay
>> Testing AVX512_FP16 f16 scal 64-bit code sequence...okay
>> Testing AVX512_FP16 f16 scal 32-bit code sequence...okay
>> Testing AVX512_FP16 f16x32 native execution...okay
>> Testing AVX512_FP16 f16x32 64-bit code sequence...okay
>> Testing AVX512_FP16 f16x32 32-bit code sequence...okay
>> Testing AVX512_FP16+VL f16x8 native execution...okay
>> Testing AVX512_FP16+VL f16x8 64-bit code sequence...okay
>> Testing AVX512_FP16+VL f16x8 32-bit code sequence...okay
>> Testing AVX512_FP16+VL f16x16 native execution...okay
>> Testing AVX512_FP16+VL f16x16 64-bit code sequence...okay
>> Testing AVX512_FP16+VL f16x16 32-bit code sequence...okay
>>
>> and it exits zero, so everything seems fine.
>>
>>
>> One thing however, this series ups the minimum GCC version required to
>> build the emulator at all:
>>
>> make: Entering directory '/local/xen.git/tools/tests/x86_emulator'
>> gcc: error: unrecognized command-line option ‘-mavx512fp16’; did you
>> mean ‘-mavx512bf16’?
>> Makefile:121: Test harness not built, use newer compiler than "gcc"
>> (version 10) and an "{evex}" capable assembler
>>
>> and I'm not sure we want to do this.  Upping the version of GCC but
>> leaving binutils as-was does lead to a build of the harness without
>> AVX512-FP16 active, which is the preferred behaviour here.
> Well, this series on its own does, but I did notice the issue already.
> Hence "x86emul: rework compiler probing in the test harness" [1].
>
> Jan
>
> [1] https://lists.xen.org/archives/html/xen-devel/2023-03/msg00123.html

Ok.  Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>