[Qemu-devel] [RISU v2 00/11] Support for i386/x86_64 with vector extensions

Jan Bobek posted 11 patches 6 years, 5 months ago
Failed in applying to current master (apply log)
There is a newer version of this series
configure           |  10 +-
Makefile            |   5 +-
risu_reginfo_i386.h |  49 ++++++
risu_i386.c         | 142 ++--------------
risu_reginfo_i386.c | 392 ++++++++++++++++++++++++++++++++++++++++++++
test_i386.S         |  80 +++++++++
test_i386.s         |  27 ---
7 files changed, 548 insertions(+), 157 deletions(-)
create mode 100644 risu_reginfo_i386.h
create mode 100644 risu_reginfo_i386.c
create mode 100644 test_i386.S
delete mode 100644 test_i386.s
[Qemu-devel] [RISU v2 00/11] Support for i386/x86_64 with vector extensions
Posted by Jan Bobek 6 years, 5 months ago
This patch series adds support for i386 and x86_64 architectures to
RISU.  Notably, vector registers (SSE, AVX, AVX-512) are supported for
verification of the apprentice. This is V2 of the series posted in
[1].

I decided not to drop the register definitions from the second patch
as suggested by Alex Bennée [4], but replaced them in the fourth patch
instead. This keeps the second and third patches code-motion only.

I wasn't 100% sure how to acknowledge Richard's contributions in some
of the patches, and eventually decided to include a Suggested-by:
line. Let me know if that's (not) acceptable.

 -Jan

Changes in V2:
  - included Richard Henderson's fix-ups [2] and vector register
    support [3]
  - replaced the magic numbers for XSAVE feature sets with symbolic
    constants
  - symbolic names ("sse", "avx", "avx512") as well as numbers are
    accepted for the parameter --xfeatures

References:
  1. https://lists.nongnu.org/archive/html/qemu-devel/2019-04/msg01294.html
  2. https://lists.nongnu.org/archive/html/qemu-devel/2019-04/msg01338.html
  3. https://lists.nongnu.org/archive/html/qemu-devel/2019-04/msg01371.html
  4. https://lists.nongnu.org/archive/html/qemu-devel/2019-04/msg04307.html

Jan Bobek (10):
  Makefile: undefine the arch name symbol
  risu_i386: move reginfo_t and related defines to risu_reginfo_i386.h
  risu_i386: move reginfo-related code to risu_reginfo_i386.c
  risu_reginfo_i386: implement arch-specific reginfo interface
  risu_i386: implement missing CPU-specific functions
  risu_i386: remove old unused code
  test_i386: change syntax from nasm to gas
  configure: add i386/x86_64 architectures
  risu_reginfo_i386: replace xfeature constants with symbolic names
  risu_reginfo_i386: accept named feature sets for --xfeature

Richard Henderson (1):
  i386: Add avx512 state to reginfo_t

 configure           |  10 +-
 Makefile            |   5 +-
 risu_reginfo_i386.h |  49 ++++++
 risu_i386.c         | 142 ++--------------
 risu_reginfo_i386.c | 392 ++++++++++++++++++++++++++++++++++++++++++++
 test_i386.S         |  80 +++++++++
 test_i386.s         |  27 ---
 7 files changed, 548 insertions(+), 157 deletions(-)
 create mode 100644 risu_reginfo_i386.h
 create mode 100644 risu_reginfo_i386.c
 create mode 100644 test_i386.S
 delete mode 100644 test_i386.s

-- 
2.20.1


Re: [Qemu-devel] [RISU v2 00/11] Support for i386/x86_64 with vector extensions
Posted by Alex Bennée 6 years, 5 months ago
Jan Bobek <jan.bobek@gmail.com> writes:

> This patch series adds support for i386 and x86_64 architectures to
> RISU.  Notably, vector registers (SSE, AVX, AVX-512) are supported for
> verification of the apprentice. This is V2 of the series posted in
> [1].
>
> I decided not to drop the register definitions from the second patch
> as suggested by Alex Bennée [4], but replaced them in the fourth patch
> instead. This keeps the second and third patches code-motion only.
>
> I wasn't 100% sure how to acknowledge Richard's contributions in some
> of the patches, and eventually decided to include a Suggested-by:
> line. Let me know if that's (not) acceptable.

Suggested-by: is a common tag for this sort of thing ;-)

--
Alex Bennée

Re: [Qemu-devel] [RISU v2 00/11] Support for i386/x86_64 with vector extensions
Posted by Alex Bennée 6 years, 5 months ago
Jan Bobek <jan.bobek@gmail.com> writes:

> This patch series adds support for i386 and x86_64 architectures to
> RISU.  Notably, vector registers (SSE, AVX, AVX-512) are supported for
> verification of the apprentice. This is V2 of the series posted in
> [1].
>
> I decided not to drop the register definitions from the second patch
> as suggested by Alex Bennée [4], but replaced them in the fourth patch
> instead. This keeps the second and third patches code-motion only.
>
> I wasn't 100% sure how to acknowledge Richard's contributions in some
> of the patches, and eventually decided to include a Suggested-by:
> line. Let me know if that's (not) acceptable.
>
>  -Jan
>
> Changes in V2:
>   - included Richard Henderson's fix-ups [2] and vector register
>     support [3]
>   - replaced the magic numbers for XSAVE feature sets with symbolic
>     constants
>   - symbolic names ("sse", "avx", "avx512") as well as numbers are
>     accepted for the parameter --xfeatures

I'm not sure where my test went wrong but I guess it's around xfeatures.
The code says required argument but risu doesn't seem to stop me not
specifying it. I suspect we should default to the most minimal x86_64 we
can and explicitly enable extra features.

Storing xfeat in the stream is a good idea so people don't mix up their
dumps but we probably need more validation when running in master mode
that the feature you have enabled is actually available on the
processor. Otherwise you'll potentially end up generating test streams
on HW with no support and just get a bunch of undef noise ;-)

However the series is looking pretty good so far. Looking forward to the
next iteration.

--
Alex Bennée

Re: [Qemu-devel] [RISU v2 00/11] Support for i386/x86_64 with vector extensions
Posted by Jan Bobek 6 years, 5 months ago
On 5/20/19 8:30 AM, Alex Bennée wrote:
> 
> I'm not sure where my test went wrong but I guess it's around xfeatures.
> The code says required argument but risu doesn't seem to stop me not
> specifying it. I suspect we should default to the most minimal x86_64 we
> can and explicitly enable extra features.

The argument is indeed required, that's taken care of by getopt: to
test that, one can simply specify --xfeatures as the last option on
the command-line. However, we don't check if the value successfully
parses into an integer; is it at all possible that --xfeatures
inadvertently swallowed the next part of your command-line? I shall
add this check in v3.

In any case, we currently default to SSE; this seems reasonable given
that it's an extension dating back some 20 years and pre-dates x86_64
by 4 years (1999 vs. 2003). Opinions?

> Storing xfeat in the stream is a good idea so people don't mix up their
> dumps but we probably need more validation when running in master mode
> that the feature you have enabled is actually available on the
> processor. Otherwise you'll potentially end up generating test streams
> on HW with no support and just get a bunch of undef noise ;-)

Correct me if I'm mistaken, but I believe this should be enforced by
xsave_feature_buf. There's a call to __get_cpuid_count to retrieve
location of a given XSAVE feature in memory, which is asserted to
complete successfully. I assume if the feature were not present, the
assertion would fail. I guess there's a point to be made about
release builds, in which the assert may have been optimized out; shall
I turn it into an error message instead?

> However the series is looking pretty good so far. Looking forward to the
> next iteration.

Once again, thanks a lot for the review, Alex!

-Jan

Re: [Qemu-devel] [RISU v2 00/11] Support for i386/x86_64 with vector extensions
Posted by Richard Henderson 6 years, 5 months ago
On 5/21/19 11:28 AM, Jan Bobek wrote:
> On 5/20/19 8:30 AM, Alex Bennée wrote:
>>
>> I'm not sure where my test went wrong but I guess it's around xfeatures.
>> The code says required argument but risu doesn't seem to stop me not
>> specifying it. I suspect we should default to the most minimal x86_64 we
>> can and explicitly enable extra features.
> 
> The argument is indeed required, that's taken care of by getopt: to
> test that, one can simply specify --xfeatures as the last option on
> the command-line. However, we don't check if the value successfully
> parses into an integer; is it at all possible that --xfeatures
> inadvertently swallowed the next part of your command-line? I shall
> add this check in v3.
> 
> In any case, we currently default to SSE; this seems reasonable given
> that it's an extension dating back some 20 years and pre-dates x86_64
> by 4 years (1999 vs. 2003). Opinions?

SSE2 is a mandatory part of the x86_64 ABI.

I sincerely doubt we care about testing 32-bit that does not have SSE, but even
then this patch set will not fail, as the kernel will not include the SSE
registers into the signal frame.  It would be the actual test cases for SSE
instructions that would SIGILL when run on a 32-bit guest w/o SSE.

>> Storing xfeat in the stream is a good idea so people don't mix up their
>> dumps but we probably need more validation when running in master mode
>> that the feature you have enabled is actually available on the
>> processor. Otherwise you'll potentially end up generating test streams
>> on HW with no support and just get a bunch of undef noise ;-)
> 
> Correct me if I'm mistaken, but I believe this should be enforced by
> xsave_feature_buf. There's a call to __get_cpuid_count to retrieve
> location of a given XSAVE feature in memory, which is asserted to
> complete successfully. I assume if the feature were not present, the
> assertion would fail. I guess there's a point to be made about
> release builds, in which the assert may have been optimized out; shall
> I turn it into an error message instead?

No, the assert is really an assert, because we have also masked the --xfeatures
value against the set of features stored in the signal frame.  If the kernel
reports a feature in the signal frame for which there is no cpuid leaf, then
something is very confused somewhere.

I am not sure that we can validate that the features in the signal frame match
the --xfeatures value, because I *think* that features are omitted from the
XSAVE data structure when they are in init state.  E.g. when we have not yet
exercised the feature.

This caveat is definitely true of ARM SVE, and I found that if I asserted that
all of the SVE state was in the signal frame that the generated RISU test which
uses memory would fail the 1st checkpoint, because no SVE instructions had yet
been executed.

A careful reading of the XSAVE documentation, plus some experimentation, is
definitely required.  Maybe hand-craft a test case using XRSTOR, giving it a
save area that specifies all features to be reset to init state.


r~

Re: [Qemu-devel] [RISU v2 00/11] Support for i386/x86_64 with vector extensions
Posted by Jan Bobek 6 years, 5 months ago
On 5/21/19 12:49 PM, Richard Henderson wrote:
> SSE2 is a mandatory part of the x86_64 ABI.
> 
> I sincerely doubt we care about testing 32-bit that does not have SSE, but even
> then this patch set will not fail, as the kernel will not include the SSE
> registers into the signal frame.  It would be the actual test cases for SSE
> instructions that would SIGILL when run on a 32-bit guest w/o SSE.

Fair enough. I had no idea that SSE2 is mandatory in x86_64, that's
good to know. Let's keep the SSE default then.

> No, the assert is really an assert, because we have also masked the --xfeatures
> value against the set of features stored in the signal frame.  If the kernel
> reports a feature in the signal frame for which there is no cpuid leaf, then
> something is very confused somewhere.

Ah, I see. Makes complete sense.

> I am not sure that we can validate that the features in the signal frame match
> the --xfeatures value, because I *think* that features are omitted from the
> XSAVE data structure when they are in init state.  E.g. when we have not yet
> exercised the feature.
> 
> This caveat is definitely true of ARM SVE, and I found that if I asserted that
> all of the SVE state was in the signal frame that the generated RISU test which
> uses memory would fail the 1st checkpoint, because no SVE instructions had yet
> been executed.
> 
> A careful reading of the XSAVE documentation, plus some experimentation, is
> definitely required.  Maybe hand-craft a test case using XRSTOR, giving it a
> save area that specifies all features to be reset to init state.

tl;dr Richard is exactly right; a component may be missing from the
XSAVE region if it's in the "initial configuration." I'd just leave it
as it is now: it appears everything more advanced than the SSE state
is in the initial configuration when the test image starts
executing. It won't show up until there's instructions which actually
touch it, but by then we get a SIGILL if the HW doesn't support it.

Long story: The Intel manual definitely has a notion of "init
optimization," in addition to "modified optimization." Vol. 1, Section
13.6 "Processor Tracking of XSAVE Managed State" says:

  The XSAVEOPT, XSAVEC, and XSAVES instructions use two optimizations
  to reduce the amount of data that they write to memory. They avoid
  writing data for any state component known to be in its initial
  configuration (the init optimization). In addition, if either
  XSAVEOPT or XSAVES is using the same XSAVE area as that used by the
  most recent execution of XRSTOR or XRSTORS, it may avoid writing
  data for any state component whose configuration is known not to
  have been modified since then (the modified optimization). (XSAVE
  does not use these optimizations, and XSAVEC does not use the
  modified optimization.)

So, XSAVE does not use any optimizations, whereas all other XSAVE
variants use at least the init optimization. Furthermore,

  The following notation describes the state of the init and modified
  optimizations:

  * XINUSE denotes the state-component bitmap corresponding to the
    init optimization. If XINUSE[i] = 0, state component i is known to
    be in its initial configuration; otherwise XINUSE[i] = 1. It is
    possible for XINUSE[i] to be 1 even when state component i is in
    its initial configuration. On a processor that does not support
    the init optimization, XINUSE[i] is always 1 for every value of i.

  [...]

The processor does not need to detect "return" to the initial
configuration; this makes more sense once it's clear what the initial
configuration is:

  The following items specify the initial configuration each state
  component (for the purposes of defining the XINUSE bitmap):

  * SSE state. In 64-bit mode, SSE state is in its initial
    configuration if each of XMM0–XMM15 is 0. Outside 64-bit mode, SSE
    state is in its initial configuration if each of XMM0–XMM7 is
    0. [...]

  * AVX state. In 64-bit mode, AVX state is in its initial
    configuration if each of YMM0_H–YMM15_H is 0. Outside 64-bit mode,
    AVX state is in its initial configuration if each of YMM0_H–YMM7_H
    is 0. [...]

  [...]

No surprise here; the initial configuration is just all zeros.

I ran some experiments on my laptop's Intel(R) Core(TM) i5-4210U to
find out what actually happens. This CPU supports AVX (but not
AVX-512) and doesn't support XSAVEC, so I only looked at XSAVE and
XSAVEOPT (XSAVES is kernel-mode only). I found out that:

1. both XSAVE and XSAVEOPT do not include the AVX state if it is in
   the initial configuration (not only XSAVEOPT),
2. return to initial configuration is not detected, i.e. the AVX state
   is included even though it has been set to all zeros via vxorps,
3. the AVX state can be brought back to the initial configuration via
   XRSTOR with the AVX component removed.

Cheers,
-Jan

Re: [Qemu-devel] [RISU v2 00/11] Support for i386/x86_64 with vector extensions
Posted by Richard Henderson 6 years, 5 months ago
On 5/23/19 2:03 PM, Jan Bobek wrote:
> I ran some experiments on my laptop's Intel(R) Core(TM) i5-4210U to
> find out what actually happens. This CPU supports AVX (but not
> AVX-512) and doesn't support XSAVEC, so I only looked at XSAVE and
> XSAVEOPT (XSAVES is kernel-mode only). I found out that:
> 
> 1. both XSAVE and XSAVEOPT do not include the AVX state if it is in
>    the initial configuration (not only XSAVEOPT),
> 2. return to initial configuration is not detected, i.e. the AVX state
>    is included even though it has been set to all zeros via vxorps,
> 3. the AVX state can be brought back to the initial configuration via
>    XRSTOR with the AVX component removed.

Thanks, for doing the tests.  I'm glad what I understood from my reading
matches experimental results.  :-)


r~