• Subject: [Qemu-devel] [RISU PATCH v3 00/22] SVE support and various misc fixes
  • Author: Alex Bennée
  • Date: June 13, 2018, 12:55 p.m.
  • Patches: 22 / 22
Changeset
Makefile                |   5 +-
build-all-archs         |  12 +-
comms.c                 |   1 +
contrib/generate_all.sh |  14 +-
reginfo.c               |   6 +-
risu.c                  |  51 ++++---
risu.h                  |  12 +-
risu_reginfo_aarch64.c  | 288 ++++++++++++++++++++++++++++++++++++----
risu_reginfo_aarch64.h  |  31 ++++-
risu_reginfo_arm.c      |  22 +++
risu_reginfo_m68k.c     |  14 ++
risu_reginfo_ppc64.c    |  14 ++
risugen                 |   3 +
risugen_arm.pm          | 243 ++++++++++++++++++++++++++++++---
14 files changed, 639 insertions(+), 77 deletions(-)
Git apply log
Switched to a new branch '20180613125601.14371-1-alex.bennee@linaro.org'
Applying: risu_reginfo_aarch64: include signal.h for FPSIMD_MAGIC
fatal: sha1 information is lacking or useless (risu_reginfo_aarch64.c).
error: could not build fake ancestor
Patch failed at 0001 risu_reginfo_aarch64: include signal.h for FPSIMD_MAGIC
The copy of the patch that failed is found in: .git/rebase-apply/patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".
Failed to apply patch:
[Qemu-devel] [RISU PATCH v3 01/22] risu_reginfo_aarch64: include signal.h for FPSIMD_MAGIC
[Qemu-devel] [RISU PATCH v3 00/22] SVE support and various misc fixes
Posted by Alex Bennée, 1 week ago
Hi,

The core concept of the code hasn't changed much but I've done some
restructuring so we can still use an SVE enabled binary to run our
existing tests without having to regenerate everything. I've folded a
bunch of Richard's fixes into the relevant patches and also split some
patches up that make sense on their own. The code is also now smarter
about only worrying about SVE or SIMD to save on duplicating effort.

Thee are also a smattering of misc fixes for the build system, headers
and adding more feedback into help.

There are perhaps more #ifdef SVE_MAGIC's than I'd like but overall
I'm pretty happy with it. My main problem now is making sure I
generate patterns for an exhaustive coverage of the SVE instruction
set that we can fully exercise everything. Manually adding stuff to
aarch64.risu is more than a little error prone.

Alex Bennée (16):
  risu_reginfo_aarch64: include signal.h for FPSIMD_MAGIC
  comms: include header for writev
  build-all-arches: expand the range of docker images
  build-all-arches: do a distclean $(SRC) configured
  risu: add zlib indication to help text
  Makefile: include risu_reginfo_$(ARCH) in HDRS
  risugen: add --sve support
  contrib/generate_all.sh: allow passing of arguments to risugen
  risu: move optional args to each architecture
  risu_reginfo_aarch64: drop stray ;
  risu_reginfo_aarch64: unionify VFP regs
  risu_reginfo: introduce reginfo_size()
  risu_reginfo_aarch64: left justify regnums and drop masks
  risu_reginfo_aarch64: add support for copying SVE register state
  risu_reginfo_aarch64: add SVE support to reginfo_dump_mismatch
  risu_reginfo_aarch64: handle variable VQ

Richard Henderson (6):
  risugen: Initialize sve predicates with random data
  risugen: use fewer insns for aarch64 immediate load
  risugen: add reg_plus_imm_pl and reg_plus_imm_vl address helpers
  risugen: add dtype_msz address helper
  risu: add process_arch_opt
  risu_reginfo_aarch64: limit SVE_VQ_MAX to current architecture

 Makefile                |   5 +-
 build-all-archs         |  12 +-
 comms.c                 |   1 +
 contrib/generate_all.sh |  14 +-
 reginfo.c               |   6 +-
 risu.c                  |  51 ++++---
 risu.h                  |  12 +-
 risu_reginfo_aarch64.c  | 288 ++++++++++++++++++++++++++++++++++++----
 risu_reginfo_aarch64.h  |  31 ++++-
 risu_reginfo_arm.c      |  22 +++
 risu_reginfo_m68k.c     |  14 ++
 risu_reginfo_ppc64.c    |  14 ++
 risugen                 |   3 +
 risugen_arm.pm          | 243 ++++++++++++++++++++++++++++++---
 14 files changed, 639 insertions(+), 77 deletions(-)

-- 
2.17.1


[Qemu-devel] [RISU PATCH v3 01/22] risu_reginfo_aarch64: include signal.h for FPSIMD_MAGIC
Posted by Alex Bennée, 1 week ago
Building with the Bionic Beaver cross compilers fails which probably
means we were getting this as a side effect before. Include the
correct header to get at these bits.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 risu_reginfo_aarch64.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/risu_reginfo_aarch64.c b/risu_reginfo_aarch64.c
index e3fadde..3bb4339 100644
--- a/risu_reginfo_aarch64.c
+++ b/risu_reginfo_aarch64.c
@@ -13,6 +13,7 @@
 #include <stdio.h>
 #include <ucontext.h>
 #include <string.h>
+#include <signal.h> /* for FPSIMD_MAGIC */
 
 #include "risu.h"
 #include "risu_reginfo_aarch64.h"
-- 
2.17.1


Re: [Qemu-devel] [RISU PATCH v3 01/22] risu_reginfo_aarch64: include signal.h for FPSIMD_MAGIC
Posted by Richard Henderson, 1 week ago
On 06/13/2018 02:55 AM, Alex Bennée wrote:
> Building with the Bionic Beaver cross compilers fails which probably
> means we were getting this as a side effect before. Include the
> correct header to get at these bits.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
>  risu_reginfo_aarch64.c | 1 +
>  1 file changed, 1 insertion(+)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~

[Qemu-devel] [RISU PATCH v3 02/22] comms: include header for writev
Posted by Alex Bennée, 1 week ago
The compiler complains about implicit declarations otherwise.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 comms.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/comms.c b/comms.c
index 2900c33..6946fd9 100644
--- a/comms.c
+++ b/comms.c
@@ -17,6 +17,7 @@
 #include <errno.h>
 #include <sys/socket.h>
 #include <sys/types.h>
+#include <sys/uio.h>
 #include <netinet/in.h>
 #include <netdb.h>
 
-- 
2.17.1


Re: [Qemu-devel] [RISU PATCH v3 02/22] comms: include header for writev
Posted by Richard Henderson, 1 week ago
On 06/13/2018 02:55 AM, Alex Bennée wrote:
> The compiler complains about implicit declarations otherwise.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
>  comms.c | 1 +
>  1 file changed, 1 insertion(+)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~


[Qemu-devel] [RISU PATCH v3 03/22] build-all-arches: expand the range of docker images
Posted by Alex Bennée, 1 week ago
We won't actually want power, we want ppc64el for the 64 bit version.
Also we will soon have m68k so include that as well.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 build-all-archs | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/build-all-archs b/build-all-archs
index fa2ac90..a2f5cff 100755
--- a/build-all-archs
+++ b/build-all-archs
@@ -39,7 +39,7 @@ while [[ "$1" = -* ]]; do
             ;;
         --use-docker)
             if [ -z "$arg" ]; then
-                default_tags=$(docker images qemu --format "{{.Repository}}:{{.Tag}}" | grep "\(arm\|power\).*cross$")
+                default_tags=$(docker images qemu --format "{{.Repository}}:{{.Tag}}" | grep "\(arm\|ppc64el\|m68k\).*cross$")
                 docker_tags=$(echo $default_tags | sed 's/\n/\s/g' )
             else
                 docker_tags="$arg"
-- 
2.17.1


Re: [Qemu-devel] [RISU PATCH v3 03/22] build-all-arches: expand the range of docker images
Posted by Richard Henderson, 1 week ago
On 06/13/2018 02:55 AM, Alex Bennée wrote:
> We won't actually want power, we want ppc64el for the 64 bit version.
> Also we will soon have m68k so include that as well.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
>  build-all-archs | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~


[Qemu-devel] [RISU PATCH v3 04/22] build-all-arches: do a distclean $(SRC) configured
Posted by Alex Bennée, 1 week ago
This can cause much confusion when you have been building in your
source tree. I've added a distclean so we don't unexpectedly drop the
config for normal make clean invocations.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 Makefile        |  3 +++
 build-all-archs | 10 ++++++++++
 2 files changed, 13 insertions(+)

diff --git a/Makefile b/Makefile
index ca80eef..16e48a0 100644
--- a/Makefile
+++ b/Makefile
@@ -51,3 +51,6 @@ $(PROG): $(OBJS)
 
 clean:
 	rm -f $(PROG) $(OBJS) $(BINS)
+
+distclean: clean
+	rm -f config.h Makefile.in
diff --git a/build-all-archs b/build-all-archs
index a2f5cff..a7cd7c2 100755
--- a/build-all-archs
+++ b/build-all-archs
@@ -54,6 +54,16 @@ while [[ "$1" = -* ]]; do
     esac
 done
 
+#
+# If you are developing your primary architecture directly out of the
+# source tree you can confuse any out-of-tree builds thanks to random
+# crap in your VPATH. Let's detect that and save some hair pulling.
+#
+if [ -e Makefile.in ]; then
+    echo "Cleaning in-src-tree build"
+    make distclean
+fi
+
 # Debian stretch and Ubuntu Xenial have cross compiler packages for
 # all of these:
 #   gcc-arm-linux-gnueabihf gcc-aarch64-linux-gnu gcc-m68k-linux-gnu
-- 
2.17.1


Re: [Qemu-devel] [RISU PATCH v3 04/22] build-all-arches: do a distclean $(SRC) configured
Posted by Richard Henderson, 1 week ago
On 06/13/2018 02:55 AM, Alex Bennée wrote:
> This can cause much confusion when you have been building in your
> source tree. I've added a distclean so we don't unexpectedly drop the
> config for normal make clean invocations.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
>  Makefile        |  3 +++
>  build-all-archs | 10 ++++++++++
>  2 files changed, 13 insertions(+)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~


[Qemu-devel] [RISU PATCH v3 05/22] risu: add zlib indication to help text
Posted by Alex Bennée, 1 week ago
This is a simple aide-memoir as it can be tricky to determine this
with a simple statically compiled binary.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 risu.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/risu.c b/risu.c
index 616fc33..c684c90 100644
--- a/risu.c
+++ b/risu.c
@@ -39,6 +39,9 @@ size_t signal_count;
 #ifdef HAVE_ZLIB
 #include <zlib.h>
 gzFile gz_trace_file;
+#define TRACE_TYPE "compressed"
+#else
+#define TRACE_TYPE "uncompressed"
 #endif
 
 sigjmp_buf jmpbuf;
@@ -273,7 +276,7 @@ void usage(void)
     fprintf(stderr, "between master and apprentice risu processes.\n\n");
     fprintf(stderr, "Options:\n");
     fprintf(stderr, "  --master          Be the master (server)\n");
-    fprintf(stderr, "  -t, --trace=FILE  Record/playback trace file\n");
+    fprintf(stderr, "  -t, --trace=FILE  Record/playback " TRACE_TYPE " trace file\n");
     fprintf(stderr,
             "  -h, --host=HOST   Specify master host machine (apprentice only)"
             "\n");
-- 
2.17.1


Re: [Qemu-devel] [RISU PATCH v3 05/22] risu: add zlib indication to help text
Posted by Richard Henderson, 1 week ago
On 06/13/2018 02:55 AM, Alex Bennée wrote:
> This is a simple aide-memoir as it can be tricky to determine this
> with a simple statically compiled binary.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
>  risu.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)

Eh, sure.  Or we could simply require zlib...

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~


Re: [Qemu-devel] [RISU PATCH v3 05/22] risu: add zlib indication to help text
Posted by Alex Bennée, 1 week ago
Richard Henderson <richard.henderson@linaro.org> writes:

> On 06/13/2018 02:55 AM, Alex Bennée wrote:
>> This is a simple aide-memoir as it can be tricky to determine this
>> with a simple statically compiled binary.
>>
>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>> ---
>>  risu.c | 5 ++++-
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> Eh, sure.  Or we could simply require zlib...
>
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

I would be some cross compilers (or more correctly multiarch setups) are
broken in this regard. There was a time before I moved to 18.04 that I
couldn't have multiple zlib1g-dev packages installed due to clashing
files.

As risu accepts stdin/stdout for trace files you can work around this
with pipes and native gzip so a non-zlib enabled risu is still usable.
But I'm happy to defer to Peter on this one.

--
Alex Bennée

[Qemu-devel] [RISU PATCH v3 06/22] Makefile: include risu_reginfo_$(ARCH) in HDRS
Posted by Alex Bennée, 1 week ago
Otherwise changes to reginfo don't cause the whole thing to be
re-built causing much confusion when bisecting.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 16e48a0..4aad448 100644
--- a/Makefile
+++ b/Makefile
@@ -21,7 +21,7 @@ ALL_CFLAGS = -Wall -D_GNU_SOURCE -DARCH=$(ARCH) $(BUILD_INC) $(CFLAGS) $(EXTRA_C
 
 PROG=risu
 SRCS=risu.c comms.c reginfo.c risu_$(ARCH).c risu_reginfo_$(ARCH).c
-HDRS=risu.h
+HDRS=risu.h risu_reginfo_$(ARCH).h
 BINS=test_$(ARCH).bin
 
 # For dumping test patterns
-- 
2.17.1


Re: [Qemu-devel] [RISU PATCH v3 06/22] Makefile: include risu_reginfo_$(ARCH) in HDRS
Posted by Richard Henderson, 1 week ago
On 06/13/2018 02:55 AM, Alex Bennée wrote:
> Otherwise changes to reginfo don't cause the whole thing to be
> re-built causing much confusion when bisecting.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
>  Makefile | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~


[Qemu-devel] [RISU PATCH v3 07/22] risugen: add --sve support
Posted by Alex Bennée, 1 week ago
This is similar to the approach used by the FP/simd data in so far as
we generate a block of random data and then load into it. The loading
is actually done by the current vector length but that is implicit in
the run anyway.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

---
v2
  - only one SIMD/FP/SVE block as they alias
  - move efficient loading as suggested by Dave
---
 risugen        |  3 +++
 risugen_arm.pm | 56 ++++++++++++++++++++++++++++++++++++++++++--------
 2 files changed, 50 insertions(+), 9 deletions(-)

diff --git a/risugen b/risugen
index 488d804..de3f5ea 100755
--- a/risugen
+++ b/risugen
@@ -319,6 +319,7 @@ sub main()
     my $condprob = 0;
     my $fpscr = 0;
     my $fp_enabled = 1;
+    my $sve_enabled = 1;
     my $big_endian = 0;
     my ($infile, $outfile);
 
@@ -336,6 +337,7 @@ sub main()
                 },
                 "be" => sub { $big_endian = 1; },
                 "no-fp" => sub { $fp_enabled = 0; },
+                "sve" => sub { $sve_enabled = 1; },
         ) or return 1;
     # allow "--pattern re,re" and "--pattern re --pattern re"
     @pattern_re = split(/,/,join(',',@pattern_re));
@@ -363,6 +365,7 @@ sub main()
         'fpscr' => $fpscr,
         'numinsns' => $numinsns,
         'fp_enabled' => $fp_enabled,
+        'sve_enabled' => $sve_enabled,
         'outfile' => $outfile,
         'details' => \%insn_details,
         'keys' => \@insn_keys,
diff --git a/risugen_arm.pm b/risugen_arm.pm
index 2f10d58..bb3ee90 100644
--- a/risugen_arm.pm
+++ b/risugen_arm.pm
@@ -472,14 +472,51 @@ sub write_random_aarch64_fpdata()
     }
 }
 
-sub write_random_aarch64_regdata($)
+sub write_random_aarch64_svedata()
 {
-    my ($fp_enabled) = @_;
+    # Load SVE registers
+    my $align = 16;
+    my $vq = 16;                             # quadwords per vector
+    my $datalen = (32 * $vq * 16) + $align;
+
+    write_pc_adr(0, (3 * 4) + ($align - 1)); # insn 1
+    write_align_reg(0, $align);              # insn 2
+    write_jump_fwd($datalen);                # insn 3
+
+    # align safety
+    for (my $i = 0; $i < ($align / 4); $i++) {
+        # align with nops
+        insn32(0xd503201f);
+    };
+
+    for (my $rt = 0; $rt <= 31; $rt++) {
+        for (my $q = 0; $q < $vq; $q++) {
+            write_random_fpreg_var(4); # quad
+        }
+    }
+
+    # Reset all the predicate registers to all true
+    for (my $p = 0; $p < 16; $p++) {
+        insn32(0x2518e3e0 | $p);
+    }
+
+    for (my $rt = 0; $rt <= 31; $rt++) {
+        # ldr z$rt, [x0, #$rt, mul vl]
+        insn32(0x85804000 + $rt + (($rt & 7) << 10) + (($rt & 0x18) << 13));
+    }
+}
+
+sub write_random_aarch64_regdata($$)
+{
+    my ($fp_enabled, $sve_enabled) = @_;
     # clear flags
     insn32(0xd51b421f);     # msr nzcv, xzr
 
-    if ($fp_enabled) {
-        # load floating point / SIMD registers
+    # Load floating point / SIMD registers
+    # (one or the other as they overlap)
+    if ($sve_enabled) {
+        write_random_aarch64_svedata();
+    } elsif ($fp_enabled) {
         write_random_aarch64_fpdata();
     }
 
@@ -490,12 +527,12 @@ sub write_random_aarch64_regdata($)
     }
 }
 
-sub write_random_register_data($)
+sub write_random_register_data($$)
 {
-    my ($fp_enabled) = @_;
+    my ($fp_enabled, $sve_enabled) = @_;
 
     if ($is_aarch64) {
-        write_random_aarch64_regdata($fp_enabled);
+        write_random_aarch64_regdata($fp_enabled, $sve_enabled);
     } else {
         write_random_arm_regdata($fp_enabled);
     }
@@ -893,6 +930,7 @@ sub write_test_code($$$$$$$$)
     my $fpscr = $params->{ 'fpscr' };
     my $numinsns = $params->{ 'numinsns' };
     my $fp_enabled = $params->{ 'fp_enabled' };
+    my $sve_enabled = $params->{ 'sve_enabled' };
     my $outfile = $params->{ 'outfile' };
 
     my %insn_details = %{ $params->{ 'details' } };
@@ -918,7 +956,7 @@ sub write_test_code($$$$$$$$)
         write_memblock_setup();
     }
     # memblock setup doesn't clean its registers, so this must come afterwards.
-    write_random_register_data($fp_enabled);
+    write_random_register_data($fp_enabled, $sve_enabled);
     write_switch_to_test_mode();
 
     for my $i (1..$numinsns) {
@@ -930,7 +968,7 @@ sub write_test_code($$$$$$$$)
         # Rewrite the registers periodically. This avoids the tendency
         # for the VFP registers to decay to NaNs and zeroes.
         if ($periodic_reg_random && ($i % 100) == 0) {
-            write_random_register_data($fp_enabled);
+            write_random_register_data($fp_enabled, $sve_enabled);
             write_switch_to_test_mode();
         }
         progress_update($i);
-- 
2.17.1


[Qemu-devel] [RISU PATCH v3 08/22] risugen: Initialize sve predicates with random data
Posted by Alex Bennée, 1 week ago
From: Richard Henderson <richard.henderson@linaro.org>

Using ptrue makes most of the uses of predicates trivial.
Therefore, initialize them to something interesting.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 risugen_arm.pm | 48 ++++++++++++++++++++++++++++++++++--------------
 1 file changed, 34 insertions(+), 14 deletions(-)

diff --git a/risugen_arm.pm b/risugen_arm.pm
index bb3ee90..83e521d 100644
--- a/risugen_arm.pm
+++ b/risugen_arm.pm
@@ -174,6 +174,24 @@ sub write_sxt32($$)
     insn32(0x93407c00 | $rn << 5 | $rd);
 }
 
+sub write_add_rri($$$)
+{
+    my ($rd, $rn, $i) = @_;
+    my $sh;
+
+    die "write_add_rri: invalid operation for this arch.\n" if (!$is_aarch64);
+
+    if ($i >= 0 && $i < 0x1000) {
+        $sh = 0;
+    } elsif (($i & 0xfff) || $i >= 0x1000000) {
+        die "invalid immediate for this arch,\n";
+    } else {
+        $sh = 1;
+        $i >>= 12;
+    }
+    insn32(0x91000000 | ($rd << 0) | ($rn << 5) | ($i << 10) | ($sh << 22));
+}
+
 sub write_sub_rrr($$$)
 {
     my ($rd, $rn, $rm) = @_;
@@ -477,33 +495,35 @@ sub write_random_aarch64_svedata()
     # Load SVE registers
     my $align = 16;
     my $vq = 16;                             # quadwords per vector
-    my $datalen = (32 * $vq * 16) + $align;
-
-    write_pc_adr(0, (3 * 4) + ($align - 1)); # insn 1
-    write_align_reg(0, $align);              # insn 2
-    write_jump_fwd($datalen);                # insn 3
+    my $veclen = 32 * $vq * 16;
+    my $predlen = 16 * $vq * 2;
+    my $datalen = $veclen + $predlen;
 
-    # align safety
-    for (my $i = 0; $i < ($align / 4); $i++) {
-        # align with nops
-        insn32(0xd503201f);
-    };
+    write_pc_adr(0, 2 * 4);     # insn 1
+    write_jump_fwd($datalen);   # insn 2
 
     for (my $rt = 0; $rt <= 31; $rt++) {
         for (my $q = 0; $q < $vq; $q++) {
             write_random_fpreg_var(4); # quad
         }
     }
-
-    # Reset all the predicate registers to all true
-    for (my $p = 0; $p < 16; $p++) {
-        insn32(0x2518e3e0 | $p);
+    for (my $rt = 0; $rt <= 15; $rt++) {
+        for (my $q = 0; $q < $vq; $q++) {
+            insn16(rand(0xffff));
+        }
     }
 
     for (my $rt = 0; $rt <= 31; $rt++) {
         # ldr z$rt, [x0, #$rt, mul vl]
         insn32(0x85804000 + $rt + (($rt & 7) << 10) + (($rt & 0x18) << 13));
     }
+
+    write_add_rri(0, 0, $veclen);
+
+    for (my $rt = 0; $rt <= 15; $rt++) {
+        # ldr p$rt, [x0, #$pt, mul vl]
+        insn32(0x85800000 + $rt + (($rt & 7) << 10) + (($rt & 0x18) << 13));
+    }
 }
 
 sub write_random_aarch64_regdata($$)
-- 
2.17.1


[Qemu-devel] [RISU PATCH v3 09/22] risugen: use fewer insns for aarch64 immediate load
Posted by Alex Bennée, 1 week ago
From: Richard Henderson <richard.henderson@linaro.org>

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 risugen_arm.pm | 32 +++++++++++++++++++-------------
 1 file changed, 19 insertions(+), 13 deletions(-)

diff --git a/risugen_arm.pm b/risugen_arm.pm
index 83e521d..485e94e 100644
--- a/risugen_arm.pm
+++ b/risugen_arm.pm
@@ -261,15 +261,13 @@ sub write_mov_rr($$)
 
 sub write_mov_ri16($$$)
 {
-    # Write 16 bits of immediate to register, using either MOVW or MOVT
+    # Write 16 bits of immediate to register.
     my ($rd, $imm, $is_movt) = @_;
-    die "write_mov_ri16: immediate $imm out of range\n" if (($imm & 0xffff0000) != 0);
 
-    if ($is_aarch64) {
-        # Use MOVZ 0x52800000. is_movt means MOVK of high bits */
-        insn32(0xd2800000 | ($is_movt << 29) | ($is_movt ? 16 << 17 : 0) | ($imm << 5) | $rd);
+    die "write_mov_ri16: invalid operation for this arch.\n" if ($is_aarch64);
+    die "write_mov_ri16: immediate $imm out of range\n" if (($imm & 0xffff0000) != 0);
 
-    } elsif ($is_thumb) {
+    if ($is_thumb) {
         # enc T3
         my ($imm4, $i, $imm3, $imm8) = (($imm & 0xf000) >> 12,
                                         ($imm & 0x0800) >> 11,
@@ -287,16 +285,24 @@ sub write_mov_ri16($$$)
 
 sub write_mov_ri($$)
 {
-    # We always use a MOVW/MOVT pair, for simplicity.
-    # on aarch64, we use a MOVZ/MOVK pair.
     my ($rd, $imm) = @_;
-    write_mov_ri16($rd, ($imm & 0xffff), 0);
     my $highhalf = ($imm >> 16) & 0xffff;
-    write_mov_ri16($rd, $highhalf, 1) if $highhalf;
 
-    if ($is_aarch64 && $imm < 0) {
-        # sign extend to allow small negative imm constants
-        write_sxt32($rd, $rd);
+    if ($is_aarch64) {
+        if ($imm < 0) {
+            # MOVN
+            insn32(0x92800000 | ((~$imm & 0xffff) << 5) | $rd);
+            # MOVK, LSL 16
+            insn32(0xf2a00000 | ($highhalf << 5) | $rd) if $highhalf != 0xffff;
+        } else {
+            # MOVZ
+            insn32(0x52800000 | (($imm & 0xffff) << 5) | $rd);
+            # MOVK, LSL 16
+            insn32(0xf2a00000 | ($highhalf << 5) | $rd) if $highhalf != 0;
+        }
+    } else {
+        write_mov_ri16($rd, ($imm & 0xffff), 0);
+        write_mov_ri16($rd, $highhalf, 1) if $highhalf;
     }
 }
 
-- 
2.17.1


[Qemu-devel] [RISU PATCH v3 10/22] risugen: add reg_plus_imm_pl and reg_plus_imm_vl address helpers
Posted by Alex Bennée, 1 week ago
From: Richard Henderson <richard.henderson@linaro.org>

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 risugen_arm.pm | 126 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 126 insertions(+)

diff --git a/risugen_arm.pm b/risugen_arm.pm
index 485e94e..696bf5f 100644
--- a/risugen_arm.pm
+++ b/risugen_arm.pm
@@ -306,6 +306,52 @@ sub write_mov_ri($$)
     }
 }
 
+sub write_addpl_rri($$$)
+{
+    my ($rd, $rn, $imm) = @_;
+    die "write_addpl: invalid operation for this arch.\n" if (!$is_aarch64);
+
+    insn32(0x04605000 | ($rn << 16) | (($imm & 0x3f) << 5) | $rd);
+}
+
+sub write_addvl_rri($$$)
+{
+    my ($rd, $rn, $imm) = @_;
+    die "write_addvl: invalid operation for this arch.\n" if (!$is_aarch64);
+
+    insn32(0x04205000 | ($rn << 16) | (($imm & 0x3f) << 5) | $rd);
+}
+
+sub write_rdvl_ri($$)
+{
+    my ($rd, $imm) = @_;
+    die "write_rdvl: invalid operation for this arch.\n" if (!$is_aarch64);
+
+    insn32(0x04bf5000 | (($imm & 0x3f) << 5) | $rd);
+}
+
+sub write_madd_rrrr($$$$)
+{
+    my ($rd, $rn, $rm, $ra) = @_;
+    die "write_madd: invalid operation for this arch.\n" if (!$is_aarch64);
+
+    insn32(0x9b000000 | ($rm << 16) | ($ra << 10) | ($rn << 5) | $rd);
+}
+
+sub write_msub_rrrr($$$$)
+{
+    my ($rd, $rn, $rm, $ra) = @_;
+    die "write_msub: invalid operation for this arch.\n" if (!$is_aarch64);
+
+    insn32(0x9b008000 | ($rm << 16) | ($ra << 10) | ($rn << 5) | $rd);
+}
+
+sub write_mul_rrr($$$)
+{
+    my ($rd, $rn, $rm) = @_;
+    write_madd_rrrr($rd, $rn, $rm, 31);
+}
+
 # write random fp value of passed precision (1=single, 2=double, 4=quad)
 sub write_random_fpreg_var($)
 {
@@ -767,6 +813,86 @@ sub reg_plus_imm($$@)
     return $base;
 }
 
+sub reg_plus_imm_pl($$@)
+{
+    # Handle reg + immediate addressing mode
+    my ($base, $imm, @trashed) = @_;
+    if ($imm == 0) {
+        return reg($base, @trashed);
+    }
+    write_get_offset();
+
+    # Now r0 is the address we want to do the access to,
+    # so set the basereg by doing the inverse of the
+    # addressing mode calculation, ie base = r0 - imm
+    #
+    # Note that addpl has a 6-bit immediate, but ldr has a 9-bit
+    # immediate, so we need to be able to support larger immediates.
+
+    if (-$imm >= -32 && -$imm <= 31) {
+        write_addpl_rri($base, 0, -$imm);
+    } else {
+        # We borrow r1 and r2 as a temporaries (not a problem
+        # as long as we don't leave anything in a register
+        # which depends on the location of the memory block)
+        write_mov_ri(1, 0);
+        write_mov_ri(2, $imm);
+        write_addpl_rri(1, 1, 1);
+        write_msub_rrrr($base, 1, 2, 0);
+    }
+    if (grep $_ == $base, @trashed) {
+        return -1;
+    }
+    return $base;
+}
+
+sub reg_plus_imm_vl($$@)
+{
+    # The usual address formulation is
+    #   elements = VL DIV esize
+    #   mbytes = msize DIV 8
+    #   addr = base + imm * elements * mbytes
+    # Here we compute
+    #   scale = log2(esize / msize)
+    #   base + (imm * VL) >> scale
+    my ($base, $imm, $scale, @trashed) = @_;
+    if ($imm == 0) {
+        return reg($base, @trashed);
+    }
+    write_get_offset();
+
+    # Now r0 is the address we want to do the access to,
+    # so set the basereg by doing the inverse of the
+    # addressing mode calculation, ie base = r0 - imm
+    #
+    # Note that rdvl/addvl have a 6-bit immediate, but ldr has a 9-bit
+    # immediate, so we need to be able to support larger immediates.
+
+    use integer;
+    my $mul = 1 << $scale;
+    my $imm_div = $imm / $mul;
+
+    if ($imm == $imm_div * $mul && -$imm_div >= -32 && -$imm_div <= 31) {
+        write_addvl_rri($base, 0, -$imm_div);
+    } elsif ($imm >= -32 && $imm <= 31) {
+        write_rdvl_ri(1, $imm);
+        write_sub_rrrs($base, 0, 1, $SHIFT_ASR, $scale);
+    } else {
+        write_rdvl_ri(1, 1);
+        write_mov_ri(2, $imm);
+        if ($scale == 0) {
+            write_msub_rrrr($base, 1, 2, 0);
+        } else {
+            write_mul_rrr(1, 1, 2);
+            write_sub_rrrs($base, 0, 1, $SHIFT_ASR, $scale);
+        }
+    }
+    if (grep $_ == $base, @trashed) {
+        return -1;
+    }
+    return $base;
+}
+
 sub reg_minus_imm($$@)
 {
     my ($base, $imm, @trashed) = @_;
-- 
2.17.1


[Qemu-devel] [RISU PATCH v3 11/22] risugen: add dtype_msz address helper
Posted by Alex Bennée, 1 week ago
From: Richard Henderson <richard.henderson@linaro.org>

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 risugen_arm.pm | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/risugen_arm.pm b/risugen_arm.pm
index 696bf5f..8d423b1 100644
--- a/risugen_arm.pm
+++ b/risugen_arm.pm
@@ -765,6 +765,15 @@ sub write_get_offset()
     write_risuop($OP_GETMEMBLOCK);
 }
 
+# Return the log2 of the memory size of an operation described by dtype.
+sub dtype_msz($)
+{
+    my ($dtype) = @_;
+    my $dth = $dtype >> 2;
+    my $dtl = $dtype & 3;
+    return $dtl >= $dth ? $dth : 3 - $dth;
+}
+
 sub reg($@)
 {
     my ($base, @trashed) = @_;
-- 
2.17.1


[Qemu-devel] [RISU PATCH v3 12/22] contrib/generate_all.sh: allow passing of arguments to risugen
Posted by Alex Bennée, 1 week ago
This allows us to use any new risugen options when generating all our
test patterns.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 contrib/generate_all.sh | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/contrib/generate_all.sh b/contrib/generate_all.sh
index 1e6b847..651cb23 100755
--- a/contrib/generate_all.sh
+++ b/contrib/generate_all.sh
@@ -12,11 +12,11 @@
 #     Alex Bennée <alex.bennee@linaro.org> - initial implementation
 #
 # Usage:
-#   ./contrib/generate_all.sh <arch.risu> <target directory>
+#   ./contrib/generate_all.sh <arch.risu> <target directory> -- risugen args
 
 set -e
 
-USAGE="Usage: `basename $0` [-h] [-n x] <risufile> <target dir>"
+USAGE="Usage: `basename $0` [-h] [-n x] <risufile> <target dir> -- [risugen args]"
 SPLIT=4
 RISUGEN=$(CDPATH= cd -- "$(dirname -- "$0")/.." && pwd -P)/risugen
 
@@ -41,18 +41,24 @@ done
 # Remove the switches we parsed above.
 shift `expr $OPTIND - 1`
 
-while [ $# -ne 0 ]; do
+# Parse up to and including any --
+RISUGEN_ARGS=""
+while [ $# -ne 0 ] && [ -z "$RISUGEN_ARGS" ]; do
 
     if [ -f $1 ]; then
         RISU_FILE=$1;
     elif [ -d $1 ]; then
         TARGET_DIR=$1;
+    elif [ "$1" = "--" ]; then
+        RISUGEN_ARGS=$1
     elif [ ! -e $1 ]; then
         TARGET_DIR=$1
     fi
 
     shift
 done
+# anything left is for RISUGEN
+RISUGEN_ARGS=$@
 
 if test -z "${RISUGEN}" || test ! -x "${RISUGEN}";  then
     echo "Couldn't find risugen (${RISUGEN})"
@@ -90,7 +96,7 @@ while test $# -gt 0 ; do
         fi
     done
     I_FILE="${I_FILE}_INC.risu.bin"
-    CMD="${RISUGEN} ${INSN_PATTERNS} ${RISU_FILE} ${I_FILE}"
+    CMD="${RISUGEN} ${RISUGEN_ARGS} ${INSN_PATTERNS} ${RISU_FILE} ${I_FILE}"
     echo "Running: $CMD"
     $CMD
 done
-- 
2.17.1


[Qemu-devel] [RISU PATCH v3 13/22] risu: move optional args to each architecture
Posted by Alex Bennée, 1 week ago
The key variables here are: *arch_long_opts and *arch_extra_help. If
they are not NULL then we concatenate the extra options to appropriate
structure to enable the support. Adding architecture short options is
not supported.

This also includes moving the ARM specific test_fp_exc/test-fp-exc
into ARM specific code.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

---
v2
  - better concatenation of default and extra opts
---
 risu.c                 | 29 +++++++++++++++++++++++++----
 risu.h                 |  6 ++++--
 risu_reginfo_aarch64.c |  3 +++
 risu_reginfo_arm.c     | 11 +++++++++++
 risu_reginfo_m68k.c    |  3 +++
 risu_reginfo_ppc64.c   |  3 +++
 6 files changed, 49 insertions(+), 6 deletions(-)

diff --git a/risu.c b/risu.c
index c684c90..18de351 100644
--- a/risu.c
+++ b/risu.c
@@ -46,8 +46,7 @@ gzFile gz_trace_file;
 
 sigjmp_buf jmpbuf;
 
-/* Should we test for FP exception status bits? */
-int test_fp_exc;
+#define ARRAY_SIZE(x)	(sizeof(x) / sizeof((x)[0]))
 
 /* Master functions */
 
@@ -283,6 +282,9 @@ void usage(void)
     fprintf(stderr,
             "  -p, --port=PORT   Specify the port to connect to/listen on "
             "(default 9191)\n");
+    if (arch_extra_help) {
+        fprintf(stderr, "%s", arch_extra_help);
+    }
 }
 
 struct option * setup_options(char **short_opts)
@@ -293,12 +295,31 @@ struct option * setup_options(char **short_opts)
         {"host", required_argument, 0, 'h'},
         {"port", required_argument, 0, 'p'},
         {"trace", required_argument, 0, 't'},
-        {"test-fp-exc", no_argument, &test_fp_exc, 1},
         {0, 0, 0, 0}
     };
+    struct option *lopts = &default_longopts[0];
 
     *short_opts = "h:p:t:";
-    return default_longopts;
+
+    if (arch_long_opts) {
+        const size_t osize = sizeof(struct option);
+        const int default_count = ARRAY_SIZE(default_longopts);
+        struct option *dptr;
+        int extra_count = 0;
+
+        /* count additional opts */
+        dptr = arch_long_opts;
+        do {} while (dptr[extra_count++].name);
+
+        lopts = calloc(default_count + extra_count, osize);
+
+        /* Copy default opts + extra opts */
+        memcpy(lopts, default_longopts, default_count * osize);
+        dptr = &lopts[default_count - 1];
+        memcpy(dptr, arch_long_opts, extra_count * osize);
+    }
+
+    return lopts;
 }
 
 int main(int argc, char **argv)
diff --git a/risu.h b/risu.h
index 1c8ecee..89811f4 100644
--- a/risu.h
+++ b/risu.h
@@ -17,6 +17,10 @@
 #include <ucontext.h>
 #include <stdio.h>
 
+/* Extra option processing for architectures */
+extern void *arch_long_opts;
+extern char *arch_extra_help;
+
 /* GCC computed include to pull in the correct risu_reginfo_*.h for
  * the architecture.
  */
@@ -36,8 +40,6 @@ void send_response_byte(int sock, int resp);
 extern uintptr_t image_start_address;
 extern void *memblock;
 
-extern int test_fp_exc;
-
 /* Ops code under test can request from risu: */
 #define OP_COMPARE 0
 #define OP_TESTEND 1
diff --git a/risu_reginfo_aarch64.c b/risu_reginfo_aarch64.c
index 3bb4339..ab12270 100644
--- a/risu_reginfo_aarch64.c
+++ b/risu_reginfo_aarch64.c
@@ -18,6 +18,9 @@
 #include "risu.h"
 #include "risu_reginfo_aarch64.h"
 
+void *arch_long_opts;
+char *arch_extra_help;
+
 /* reginfo_init: initialize with a ucontext */
 void reginfo_init(struct reginfo *ri, ucontext_t *uc)
 {
diff --git a/risu_reginfo_arm.c b/risu_reginfo_arm.c
index 6b9ee7b..5acad02 100644
--- a/risu_reginfo_arm.c
+++ b/risu_reginfo_arm.c
@@ -13,12 +13,23 @@
 #include <stdio.h>
 #include <ucontext.h>
 #include <string.h>
+#include <getopt.h>
 
 #include "risu.h"
 #include "risu_reginfo_arm.h"
 
 extern int insnsize(ucontext_t *uc);
 
+/* Should we test for FP exception status bits? */
+static int test_fp_exc;
+static struct option extra_opts[] = {
+    {"test-fp-exc", no_argument, &test_fp_exc, 1},
+    {0, 0, 0, 0}
+};
+
+void *arch_long_opts = &extra_opts[0];
+char *arch_extra_help = "  --test-fp-exc     Check FP exception bits when comparing\n";
+
 static void reginfo_init_vfp(struct reginfo *ri, ucontext_t *uc)
 {
     /* Read VFP registers. These live in uc->uc_regspace, which is
diff --git a/risu_reginfo_m68k.c b/risu_reginfo_m68k.c
index 4ff0aa8..d429502 100644
--- a/risu_reginfo_m68k.c
+++ b/risu_reginfo_m68k.c
@@ -14,6 +14,9 @@
 #include "risu.h"
 #include "risu_reginfo_m68k.h"
 
+void *arch_long_opts;
+char *arch_extra_help;
+
 /* reginfo_init: initialize with a ucontext */
 void reginfo_init(struct reginfo *ri, ucontext_t *uc)
 {
diff --git a/risu_reginfo_ppc64.c b/risu_reginfo_ppc64.c
index 5f33648..395111c 100644
--- a/risu_reginfo_ppc64.c
+++ b/risu_reginfo_ppc64.c
@@ -22,6 +22,9 @@
 #define XER 37
 #define CCR 38
 
+void *arch_long_opts;
+char *arch_extra_help;
+
 /* reginfo_init: initialize with a ucontext */
 void reginfo_init(struct reginfo *ri, ucontext_t *uc)
 {
-- 
2.17.1


Re: [Qemu-devel] [RISU PATCH v3 13/22] risu: move optional args to each architecture
Posted by Richard Henderson, 1 week ago
On 06/13/2018 02:55 AM, Alex Bennée wrote:
> The key variables here are: *arch_long_opts and *arch_extra_help. If
> they are not NULL then we concatenate the extra options to appropriate
> structure to enable the support. Adding architecture short options is
> not supported.
> 
> This also includes moving the ARM specific test_fp_exc/test-fp-exc
> into ARM specific code.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> 
> ---
> v2
>   - better concatenation of default and extra opts

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~


[Qemu-devel] [RISU PATCH v3 14/22] risu: add process_arch_opt
Posted by Alex Bennée, 1 week ago
From: Richard Henderson <richard.henderson@linaro.org>

Allows the backend to do more that just set a flag when it comes to
processing options.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
[AJB: tweaked order, added stdlib includes]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 risu.c                 | 33 +++++++++++----------------------
 risu.h                 |  7 +++++--
 risu_reginfo_aarch64.c | 10 ++++++++--
 risu_reginfo_arm.c     | 14 ++++++++++----
 risu_reginfo_m68k.c    | 10 ++++++++--
 risu_reginfo_ppc64.c   | 10 ++++++++--
 6 files changed, 50 insertions(+), 34 deletions(-)

diff --git a/risu.c b/risu.c
index 18de351..01525d2 100644
--- a/risu.c
+++ b/risu.c
@@ -18,7 +18,6 @@
 #include <errno.h>
 #include <signal.h>
 #include <ucontext.h>
-#include <getopt.h>
 #include <setjmp.h>
 #include <assert.h>
 #include <sys/stat.h>
@@ -27,7 +26,6 @@
 #include <string.h>
 
 #include "config.h"
-
 #include "risu.h"
 
 void *memblock;
@@ -287,7 +285,7 @@ void usage(void)
     }
 }
 
-struct option * setup_options(char **short_opts)
+static struct option * setup_options(char **short_opts)
 {
     static struct option default_longopts[] = {
         {"help", no_argument, 0, '?'},
@@ -303,20 +301,19 @@ struct option * setup_options(char **short_opts)
 
     if (arch_long_opts) {
         const size_t osize = sizeof(struct option);
-        const int default_count = ARRAY_SIZE(default_longopts);
-        struct option *dptr;
-        int extra_count = 0;
+        const int default_count = ARRAY_SIZE(default_longopts) - 1;
+        int arch_count;
 
         /* count additional opts */
-        dptr = arch_long_opts;
-        do {} while (dptr[extra_count++].name);
+        for (arch_count = 0; arch_long_opts[arch_count].name; arch_count++) {
+            continue;
+        }
 
-        lopts = calloc(default_count + extra_count, osize);
+        lopts = calloc(default_count + arch_count + 1, osize);
 
         /* Copy default opts + extra opts */
         memcpy(lopts, default_longopts, default_count * osize);
-        dptr = &lopts[default_count - 1];
-        memcpy(dptr, arch_long_opts, extra_count * osize);
+        memcpy(lopts + default_count, arch_long_opts, arch_count * osize);
     }
 
     return lopts;
@@ -343,34 +340,26 @@ int main(int argc, char **argv)
 
         switch (c) {
         case 0:
-        {
             /* flag set by getopt_long, do nothing */
             break;
-        }
         case 't':
-        {
             trace_fn = optarg;
             trace = 1;
             break;
-        }
         case 'h':
-        {
             hostname = optarg;
             break;
-        }
         case 'p':
-        {
             /* FIXME err handling */
             port = strtol(optarg, 0, 10);
             break;
-        }
         case '?':
-        {
             usage();
             exit(1);
-        }
         default:
-            abort();
+            assert(c >= FIRST_ARCH_OPT);
+            process_arch_opt(c, optarg);
+            break;
         }
     }
 
diff --git a/risu.h b/risu.h
index 89811f4..48c50d9 100644
--- a/risu.h
+++ b/risu.h
@@ -16,10 +16,13 @@
 #include <stdint.h>
 #include <ucontext.h>
 #include <stdio.h>
+#include <getopt.h>
 
 /* Extra option processing for architectures */
-extern void *arch_long_opts;
-extern char *arch_extra_help;
+extern const struct option * const arch_long_opts;
+extern const char * const arch_extra_help;
+void process_arch_opt(int opt, const char *arg);
+#define FIRST_ARCH_OPT   0x100
 
 /* GCC computed include to pull in the correct risu_reginfo_*.h for
  * the architecture.
diff --git a/risu_reginfo_aarch64.c b/risu_reginfo_aarch64.c
index ab12270..c270e0b 100644
--- a/risu_reginfo_aarch64.c
+++ b/risu_reginfo_aarch64.c
@@ -14,12 +14,18 @@
 #include <ucontext.h>
 #include <string.h>
 #include <signal.h> /* for FPSIMD_MAGIC */
+#include <stdlib.h>
 
 #include "risu.h"
 #include "risu_reginfo_aarch64.h"
 
-void *arch_long_opts;
-char *arch_extra_help;
+const struct option * const arch_long_opts;
+const char * const arch_extra_help;
+
+void process_arch_opt(int opt, const char *arg)
+{
+    abort();
+}
 
 /* reginfo_init: initialize with a ucontext */
 void reginfo_init(struct reginfo *ri, ucontext_t *uc)
diff --git a/risu_reginfo_arm.c b/risu_reginfo_arm.c
index 5acad02..12ad0ef 100644
--- a/risu_reginfo_arm.c
+++ b/risu_reginfo_arm.c
@@ -13,7 +13,7 @@
 #include <stdio.h>
 #include <ucontext.h>
 #include <string.h>
-#include <getopt.h>
+#include <stdlib.h>
 
 #include "risu.h"
 #include "risu_reginfo_arm.h"
@@ -22,13 +22,19 @@ extern int insnsize(ucontext_t *uc);
 
 /* Should we test for FP exception status bits? */
 static int test_fp_exc;
-static struct option extra_opts[] = {
+static const struct option extra_opts[] = {
     {"test-fp-exc", no_argument, &test_fp_exc, 1},
     {0, 0, 0, 0}
 };
 
-void *arch_long_opts = &extra_opts[0];
-char *arch_extra_help = "  --test-fp-exc     Check FP exception bits when comparing\n";
+const struct option * const arch_long_opts = &extra_opts[0];
+const char * const arch_extra_help =
+    "  --test-fp-exc     Check FP exception bits when comparing\n";
+
+void process_arch_opt(int opt, const char *arg)
+{
+    abort();
+}
 
 static void reginfo_init_vfp(struct reginfo *ri, ucontext_t *uc)
 {
diff --git a/risu_reginfo_m68k.c b/risu_reginfo_m68k.c
index d429502..7a1c5a9 100644
--- a/risu_reginfo_m68k.c
+++ b/risu_reginfo_m68k.c
@@ -10,12 +10,18 @@
 #include <ucontext.h>
 #include <string.h>
 #include <math.h>
+#include <stdlib.h>
 
 #include "risu.h"
 #include "risu_reginfo_m68k.h"
 
-void *arch_long_opts;
-char *arch_extra_help;
+const struct option * const arch_long_opts;
+const char * const arch_extra_help;
+
+void process_arch_opt(int opt, const char *arg)
+{
+    abort();
+}
 
 /* reginfo_init: initialize with a ucontext */
 void reginfo_init(struct reginfo *ri, ucontext_t *uc)
diff --git a/risu_reginfo_ppc64.c b/risu_reginfo_ppc64.c
index 395111c..4b70460 100644
--- a/risu_reginfo_ppc64.c
+++ b/risu_reginfo_ppc64.c
@@ -15,6 +15,7 @@
 #include <ucontext.h>
 #include <string.h>
 #include <math.h>
+#include <stdlib.h>
 
 #include "risu.h"
 #include "risu_reginfo_ppc64.h"
@@ -22,8 +23,13 @@
 #define XER 37
 #define CCR 38
 
-void *arch_long_opts;
-char *arch_extra_help;
+const struct option * const arch_long_opts;
+const char * const arch_extra_help;
+
+void process_arch_opt(int opt, const char *arg)
+{
+    abort();
+}
 
 /* reginfo_init: initialize with a ucontext */
 void reginfo_init(struct reginfo *ri, ucontext_t *uc)
-- 
2.17.1


[Qemu-devel] [RISU PATCH v3 15/22] risu_reginfo_aarch64: drop stray ;
Posted by Alex Bennée, 1 week ago
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 risu_reginfo_aarch64.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/risu_reginfo_aarch64.c b/risu_reginfo_aarch64.c
index c270e0b..34dd9af 100644
--- a/risu_reginfo_aarch64.c
+++ b/risu_reginfo_aarch64.c
@@ -66,7 +66,7 @@ void reginfo_init(struct reginfo *ri, ucontext_t *uc)
     for (i = 0; i < 32; i++) {
         ri->vregs[i] = fp->vregs[i];
     }
-};
+}
 
 /* reginfo_is_eq: compare the reginfo structs, returns nonzero if equal */
 int reginfo_is_eq(struct reginfo *r1, struct reginfo *r2)
-- 
2.17.1


Re: [Qemu-devel] [RISU PATCH v3 15/22] risu_reginfo_aarch64: drop stray ;
Posted by Richard Henderson, 1 week ago
On 06/13/2018 02:55 AM, Alex Bennée wrote:
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
>  risu_reginfo_aarch64.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~


[Qemu-devel] [RISU PATCH v3 16/22] risu_reginfo_aarch64: unionify VFP regs
Posted by Alex Bennée, 1 week ago
This is preparation for the SVE work as we won't want to be carrying
around both VFP and SVE registers at the same time as they overlap.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 risu_reginfo_aarch64.c | 16 ++++++++--------
 risu_reginfo_aarch64.h |  9 ++++++++-
 2 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/risu_reginfo_aarch64.c b/risu_reginfo_aarch64.c
index 34dd9af..62a5599 100644
--- a/risu_reginfo_aarch64.c
+++ b/risu_reginfo_aarch64.c
@@ -64,7 +64,7 @@ void reginfo_init(struct reginfo *ri, ucontext_t *uc)
     ri->fpcr = fp->fpcr;
 
     for (i = 0; i < 32; i++) {
-        ri->vregs[i] = fp->vregs[i];
+        ri->simd.vregs[i] = fp->vregs[i];
     }
 }
 
@@ -92,8 +92,8 @@ int reginfo_dump(struct reginfo *ri, FILE * f)
 
     for (i = 0; i < 32; i++) {
         fprintf(f, "  V%2d   : %016" PRIx64 "%016" PRIx64 "\n", i,
-                (uint64_t) (ri->vregs[i] >> 64),
-                (uint64_t) (ri->vregs[i] & 0xffffffffffffffff));
+                (uint64_t) (ri->simd.vregs[i] >> 64),
+                (uint64_t) (ri->simd.vregs[i] & 0xffffffffffffffff));
     }
 
     return !ferror(f);
@@ -138,14 +138,14 @@ int reginfo_dump_mismatch(struct reginfo *m, struct reginfo *a, FILE * f)
     }
 
     for (i = 0; i < 32; i++) {
-        if (m->vregs[i] != a->vregs[i]) {
+        if (m->simd.vregs[i] != a->simd.vregs[i]) {
             fprintf(f, "  V%2d   : "
                     "%016" PRIx64 "%016" PRIx64 " vs "
                     "%016" PRIx64 "%016" PRIx64 "\n", i,
-                    (uint64_t) (m->vregs[i] >> 64),
-                    (uint64_t) (m->vregs[i] & 0xffffffffffffffff),
-                    (uint64_t) (a->vregs[i] >> 64),
-                    (uint64_t) (a->vregs[i] & 0xffffffffffffffff));
+                    (uint64_t) (m->simd.vregs[i] >> 64),
+                    (uint64_t) (m->simd.vregs[i] & 0xffffffffffffffff),
+                    (uint64_t) (a->simd.vregs[i] >> 64),
+                    (uint64_t) (a->simd.vregs[i] & 0xffffffffffffffff));
         }
     }
 
diff --git a/risu_reginfo_aarch64.h b/risu_reginfo_aarch64.h
index a05fb4e..a1c708b 100644
--- a/risu_reginfo_aarch64.h
+++ b/risu_reginfo_aarch64.h
@@ -13,6 +13,10 @@
 #ifndef RISU_REGINFO_AARCH64_H
 #define RISU_REGINFO_AARCH64_H
 
+struct simd_reginfo {
+    __uint128_t vregs[32];
+};
+
 struct reginfo {
     uint64_t fault_address;
     uint64_t regs[31];
@@ -24,7 +28,10 @@ struct reginfo {
     /* FP/SIMD */
     uint32_t fpsr;
     uint32_t fpcr;
-    __uint128_t vregs[32];
+
+    union {
+        struct simd_reginfo simd;
+    };
 };
 
 #endif /* RISU_REGINFO_AARCH64_H */
-- 
2.17.1


Re: [Qemu-devel] [RISU PATCH v3 16/22] risu_reginfo_aarch64: unionify VFP regs
Posted by Richard Henderson, 1 week ago
On 06/13/2018 02:55 AM, Alex Bennée wrote:
> This is preparation for the SVE work as we won't want to be carrying
> around both VFP and SVE registers at the same time as they overlap.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
>  risu_reginfo_aarch64.c | 16 ++++++++--------
>  risu_reginfo_aarch64.h |  9 ++++++++-
>  2 files changed, 16 insertions(+), 9 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~


[Qemu-devel] [RISU PATCH v3 17/22] risu_reginfo: introduce reginfo_size()
Posted by Alex Bennée, 1 week ago
In preparation for conditionally supporting SVE we need to be able to
have different sized reginfos. This introduces reginfo_size() to
abstract the size away to the code the actually knows. For aarch64 we
also use this while initialising the block.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 reginfo.c              |  6 +++---
 risu.h                 |  3 +++
 risu_reginfo_aarch64.c | 11 ++++++++++-
 risu_reginfo_aarch64.h |  1 +
 risu_reginfo_arm.c     |  5 +++++
 risu_reginfo_m68k.c    |  5 +++++
 risu_reginfo_ppc64.c   |  5 +++++
 7 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/reginfo.c b/reginfo.c
index 1f55d06..dd42ae2 100644
--- a/reginfo.c
+++ b/reginfo.c
@@ -39,7 +39,7 @@ int send_register_info(write_fn write_fn, void *uc)
 
     switch (op) {
     case OP_TESTEND:
-        write_fn(&ri, sizeof(ri));
+        write_fn(&ri, reginfo_size());
         /* if we are tracing write_fn will return 0 unlike a remote
            end, hence we force return of 1 here */
         return 1;
@@ -58,7 +58,7 @@ int send_register_info(write_fn write_fn, void *uc)
         /* Do a simple register compare on (a) explicit request
          * (b) end of test (c) a non-risuop UNDEF
          */
-        return write_fn(&ri, sizeof(ri));
+        return write_fn(&ri, reginfo_size());
     }
     return 0;
 }
@@ -101,7 +101,7 @@ int recv_and_compare_register_info(read_fn read_fn,
         /* Do a simple register compare on (a) explicit request
          * (b) end of test (c) a non-risuop UNDEF
          */
-        if (read_fn(&apprentice_ri, sizeof(apprentice_ri))) {
+        if (read_fn(&apprentice_ri, reginfo_size())) {
             packet_mismatch = 1;
             resp = 2;
         } else if (!reginfo_is_eq(&master_ri, &apprentice_ri)) {
diff --git a/risu.h b/risu.h
index 48c50d9..8d2d646 100644
--- a/risu.h
+++ b/risu.h
@@ -133,4 +133,7 @@ int reginfo_dump(struct reginfo *ri, FILE * f);
 /* reginfo_dump_mismatch: print mismatch details to a stream, ret nonzero=ok */
 int reginfo_dump_mismatch(struct reginfo *m, struct reginfo *a, FILE *f);
 
+/* return size of reginfo */
+const int reginfo_size(void);
+
 #endif /* RISU_H */
diff --git a/risu_reginfo_aarch64.c b/risu_reginfo_aarch64.c
index 62a5599..5da9e39 100644
--- a/risu_reginfo_aarch64.c
+++ b/risu_reginfo_aarch64.c
@@ -15,6 +15,8 @@
 #include <string.h>
 #include <signal.h> /* for FPSIMD_MAGIC */
 #include <stdlib.h>
+#include <stddef.h>
+#include <assert.h>
 
 #include "risu.h"
 #include "risu_reginfo_aarch64.h"
@@ -27,6 +29,13 @@ void process_arch_opt(int opt, const char *arg)
     abort();
 }
 
+const int reginfo_size(void)
+{
+    const int size = offsetof(struct reginfo, simd.end);
+    assert(sizeof(struct reginfo)==size);
+    return size;
+}
+
 /* reginfo_init: initialize with a ucontext */
 void reginfo_init(struct reginfo *ri, ucontext_t *uc)
 {
@@ -71,7 +80,7 @@ void reginfo_init(struct reginfo *ri, ucontext_t *uc)
 /* reginfo_is_eq: compare the reginfo structs, returns nonzero if equal */
 int reginfo_is_eq(struct reginfo *r1, struct reginfo *r2)
 {
-    return memcmp(r1, r2, sizeof(*r1)) == 0;
+    return memcmp(r1, r2, reginfo_size()) == 0;
 }
 
 /* reginfo_dump: print state to a stream, returns nonzero on success */
diff --git a/risu_reginfo_aarch64.h b/risu_reginfo_aarch64.h
index a1c708b..ef97622 100644
--- a/risu_reginfo_aarch64.h
+++ b/risu_reginfo_aarch64.h
@@ -15,6 +15,7 @@
 
 struct simd_reginfo {
     __uint128_t vregs[32];
+    char end[0];
 };
 
 struct reginfo {
diff --git a/risu_reginfo_arm.c b/risu_reginfo_arm.c
index 12ad0ef..3662f12 100644
--- a/risu_reginfo_arm.c
+++ b/risu_reginfo_arm.c
@@ -36,6 +36,11 @@ void process_arch_opt(int opt, const char *arg)
     abort();
 }
 
+const int reginfo_size(void)
+{
+    return sizeof(struct reginfo);
+}
+
 static void reginfo_init_vfp(struct reginfo *ri, ucontext_t *uc)
 {
     /* Read VFP registers. These live in uc->uc_regspace, which is
diff --git a/risu_reginfo_m68k.c b/risu_reginfo_m68k.c
index 7a1c5a9..32b28c8 100644
--- a/risu_reginfo_m68k.c
+++ b/risu_reginfo_m68k.c
@@ -23,6 +23,11 @@ void process_arch_opt(int opt, const char *arg)
     abort();
 }
 
+const int reginfo_size(void)
+{
+    return sizeof(struct reginfo);
+}
+
 /* reginfo_init: initialize with a ucontext */
 void reginfo_init(struct reginfo *ri, ucontext_t *uc)
 {
diff --git a/risu_reginfo_ppc64.c b/risu_reginfo_ppc64.c
index 4b70460..f9d2f0d 100644
--- a/risu_reginfo_ppc64.c
+++ b/risu_reginfo_ppc64.c
@@ -31,6 +31,11 @@ void process_arch_opt(int opt, const char *arg)
     abort();
 }
 
+const int reginfo_size(void)
+{
+    return sizeof(struct reginfo);
+}
+
 /* reginfo_init: initialize with a ucontext */
 void reginfo_init(struct reginfo *ri, ucontext_t *uc)
 {
-- 
2.17.1


Re: [Qemu-devel] [RISU PATCH v3 17/22] risu_reginfo: introduce reginfo_size()
Posted by Richard Henderson, 1 week ago
On 06/13/2018 02:55 AM, Alex Bennée wrote:
> In preparation for conditionally supporting SVE we need to be able to
> have different sized reginfos. This introduces reginfo_size() to
> abstract the size away to the code the actually knows. For aarch64 we
> also use this while initialising the block.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
>  reginfo.c              |  6 +++---
>  risu.h                 |  3 +++
>  risu_reginfo_aarch64.c | 11 ++++++++++-
>  risu_reginfo_aarch64.h |  1 +
>  risu_reginfo_arm.c     |  5 +++++
>  risu_reginfo_m68k.c    |  5 +++++
>  risu_reginfo_ppc64.c   |  5 +++++
>  7 files changed, 32 insertions(+), 4 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~


[Qemu-devel] [RISU PATCH v3 18/22] risu_reginfo_aarch64: left justify regnums and drop masks
Posted by Alex Bennée, 1 week ago
Left justification is more pleasing to the eye than the default. We
also drop the masking which isn't needed as we are casting to a
smaller size anyway.

This was split out of Richard's re-factoring work for SVE.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 risu_reginfo_aarch64.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/risu_reginfo_aarch64.c b/risu_reginfo_aarch64.c
index 5da9e39..3ccaf0e 100644
--- a/risu_reginfo_aarch64.c
+++ b/risu_reginfo_aarch64.c
@@ -90,7 +90,7 @@ int reginfo_dump(struct reginfo *ri, FILE * f)
     fprintf(f, "  faulting insn %08x\n", ri->faulting_insn);
 
     for (i = 0; i < 31; i++) {
-        fprintf(f, "  X%2d   : %016" PRIx64 "\n", i, ri->regs[i]);
+        fprintf(f, "  X%-2d   : %016" PRIx64 "\n", i, ri->regs[i]);
     }
 
     fprintf(f, "  sp    : %016" PRIx64 "\n", ri->sp);
@@ -100,9 +100,9 @@ int reginfo_dump(struct reginfo *ri, FILE * f)
     fprintf(f, "  fpcr  : %08x\n", ri->fpcr);
 
     for (i = 0; i < 32; i++) {
-        fprintf(f, "  V%2d   : %016" PRIx64 "%016" PRIx64 "\n", i,
+        fprintf(f, "  V%-2d   : %016" PRIx64 "%016" PRIx64 "\n", i,
                 (uint64_t) (ri->simd.vregs[i] >> 64),
-                (uint64_t) (ri->simd.vregs[i] & 0xffffffffffffffff));
+                (uint64_t) (ri->simd.vregs[i]));
     }
 
     return !ferror(f);
@@ -119,7 +119,7 @@ int reginfo_dump_mismatch(struct reginfo *m, struct reginfo *a, FILE * f)
     }
     for (i = 0; i < 31; i++) {
         if (m->regs[i] != a->regs[i]) {
-            fprintf(f, "  X%2d   : %016" PRIx64 " vs %016" PRIx64 "\n",
+            fprintf(f, "  X%-2d   : %016" PRIx64 " vs %016" PRIx64 "\n",
                     i, m->regs[i], a->regs[i]);
         }
     }
@@ -148,13 +148,13 @@ int reginfo_dump_mismatch(struct reginfo *m, struct reginfo *a, FILE * f)
 
     for (i = 0; i < 32; i++) {
         if (m->simd.vregs[i] != a->simd.vregs[i]) {
-            fprintf(f, "  V%2d   : "
+            fprintf(f, "  V%-2d   : "
                     "%016" PRIx64 "%016" PRIx64 " vs "
                     "%016" PRIx64 "%016" PRIx64 "\n", i,
                     (uint64_t) (m->simd.vregs[i] >> 64),
-                    (uint64_t) (m->simd.vregs[i] & 0xffffffffffffffff),
+                    (uint64_t) m->simd.vregs[i],
                     (uint64_t) (a->simd.vregs[i] >> 64),
-                    (uint64_t) (a->simd.vregs[i] & 0xffffffffffffffff));
+                    (uint64_t) a->simd.vregs[i]);
         }
     }
 
-- 
2.17.1


Re: [Qemu-devel] [RISU PATCH v3 18/22] risu_reginfo_aarch64: left justify regnums and drop masks
Posted by Richard Henderson, 1 week ago
On 06/13/2018 02:55 AM, Alex Bennée wrote:
> Left justification is more pleasing to the eye than the default. We
> also drop the masking which isn't needed as we are casting to a
> smaller size anyway.
> 
> This was split out of Richard's re-factoring work for SVE.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
>  risu_reginfo_aarch64.c | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~


[Qemu-devel] [RISU PATCH v3 19/22] risu_reginfo_aarch64: add support for copying SVE register state
Posted by Alex Bennée, 1 week ago
Add the ability to save SVE registers from the signal context. This is
controlled with an optional flag --test-sve. The whole thing is
conditionally compiled when SVE support is in the sigcontext headers.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

---
v2
  - support EXTRA_MAGIC contexts
v3
  - handle conditional bits
  - include <signal.h> in reginfo.h
  - move from helper function to main init function
  - (void *) cast for memcpy
  - additional ifdef SVE_MAGIC stuff
---
 risu_reginfo_aarch64.c | 107 ++++++++++++++++++++++++++++++++++++-----
 risu_reginfo_aarch64.h |  16 ++++++
 2 files changed, 110 insertions(+), 13 deletions(-)

diff --git a/risu_reginfo_aarch64.c b/risu_reginfo_aarch64.c
index 3ccaf0e..79db5dd 100644
--- a/risu_reginfo_aarch64.c
+++ b/risu_reginfo_aarch64.c
@@ -16,13 +16,26 @@
 #include <signal.h> /* for FPSIMD_MAGIC */
 #include <stdlib.h>
 #include <stddef.h>
-#include <assert.h>
+#include <stdbool.h>
 
 #include "risu.h"
 #include "risu_reginfo_aarch64.h"
 
+#ifndef SVE_MAGIC
 const struct option * const arch_long_opts;
 const char * const arch_extra_help;
+#else
+
+/* Should we test SVE register state */
+static int test_sve;
+static const struct option extra_opts[] = {
+    {"test-sve", no_argument, &test_sve, 1},
+    {0, 0, 0, 0}
+};
+
+const struct option * const arch_long_opts = &extra_opts[0];
+const char * const arch_extra_help = "  --test-sve        Compare SVE registers\n";
+#endif
 
 void process_arch_opt(int opt, const char *arg)
 {
@@ -31,8 +44,12 @@ void process_arch_opt(int opt, const char *arg)
 
 const int reginfo_size(void)
 {
-    const int size = offsetof(struct reginfo, simd.end);
-    assert(sizeof(struct reginfo)==size);
+    int size = offsetof(struct reginfo, simd.end);
+#ifdef SVE_MAGIC
+    if (test_sve) {
+        size = offsetof(struct reginfo, sve.end);
+    }
+#endif
     return size;
 }
 
@@ -40,8 +57,12 @@ const int reginfo_size(void)
 void reginfo_init(struct reginfo *ri, ucontext_t *uc)
 {
     int i;
-    struct _aarch64_ctx *ctx;
-    struct fpsimd_context *fp;
+    struct _aarch64_ctx *ctx, *extra = NULL;
+    struct fpsimd_context *fp = NULL;
+#ifdef SVE_MAGIC
+    struct sve_context *sve = NULL;
+#endif
+
     /* necessary to be able to compare with memcmp later */
     memset(ri, 0, sizeof(*ri));
 
@@ -57,21 +78,81 @@ void reginfo_init(struct reginfo *ri, ucontext_t *uc)
     ri->faulting_insn = *((uint32_t *) uc->uc_mcontext.pc);
 
     ctx = (struct _aarch64_ctx *) &uc->uc_mcontext.__reserved[0];
-
-    while (ctx->magic != FPSIMD_MAGIC && ctx->size != 0) {
-        ctx += (ctx->size + sizeof(*ctx) - 1) / sizeof(*ctx);
+    while (ctx) {
+        switch (ctx->magic) {
+        case FPSIMD_MAGIC:
+            fp = (void *)ctx;
+            break;
+#ifdef SVE_MAGIC
+        case SVE_MAGIC:
+            sve = (void *)ctx;
+            break;
+        case EXTRA_MAGIC:
+            extra = (void *)((struct extra_context *)(ctx))->datap;
+            break;
+#endif
+        case 0:
+            /* End of list.  */
+            ctx = extra;
+            extra = NULL;
+            continue;
+        default:
+            /* Unknown record -- skip it.  */
+            break;
+        }
+        ctx = (void *)ctx + ctx->size;
     }
 
-    if (ctx->magic != FPSIMD_MAGIC || ctx->size != sizeof(*fp)) {
-        fprintf(stderr,
-                "risu_reginfo_aarch64: failed to get FP/SIMD state\n");
+    if (!fp || fp->head.size != sizeof(*fp)) {
+        fprintf(stderr, "risu_reginfo_aarch64: failed to get FP/SIMD state\n");
         return;
     }
-
-    fp = (struct fpsimd_context *) ctx;
     ri->fpsr = fp->fpsr;
     ri->fpcr = fp->fpcr;
 
+#ifdef SVE_MAGIC
+    if (test_sve) {
+        int vq = sve_vq_from_vl(sve->vl); /* number of quads for whole vl */
+
+        if (sve == NULL) {
+            fprintf(stderr, "risu_reginfo_aarch64: failed to get SVE state\n");
+            return;
+        }
+
+        ri->sve.vl = sve->vl;
+
+        if (sve->head.size < SVE_SIG_CONTEXT_SIZE(vq)) {
+            if (sve->head.size == sizeof(*sve)) {
+                /* SVE state is empty -- not an error.  */
+            } else {
+                fprintf(stderr, "risu_reginfo_aarch64: "
+                        "failed to get complete SVE state\n");
+            }
+            return;
+        }
+
+        /* Copy ZREG's one at a time */
+        for (i = 0; i < SVE_NUM_ZREGS; i++) {
+            memcpy(&ri->sve.zregs[i],
+                   (void *)sve + SVE_SIG_ZREG_OFFSET(vq, i),
+                   SVE_SIG_ZREG_SIZE(vq));
+        }
+
+        /* Copy PREG's one at a time */
+        for (i = 0; i < SVE_NUM_PREGS; i++) {
+            memcpy(&ri->sve.pregs[i],
+                   (void *)sve + SVE_SIG_PREG_OFFSET(vq, i),
+                   SVE_SIG_PREG_SIZE(vq));
+        }
+
+        /* Finally the FFR */
+        memcpy(&ri->sve.ffr,(void *)sve + SVE_SIG_FFR_OFFSET(vq),
+               SVE_SIG_FFR_SIZE(vq));
+
+        return;
+    }
+#endif
+
     for (i = 0; i < 32; i++) {
         ri->simd.vregs[i] = fp->vregs[i];
     }
diff --git a/risu_reginfo_aarch64.h b/risu_reginfo_aarch64.h
index ef97622..b3701b3 100644
--- a/risu_reginfo_aarch64.h
+++ b/risu_reginfo_aarch64.h
@@ -13,11 +13,24 @@
 #ifndef RISU_REGINFO_AARCH64_H
 #define RISU_REGINFO_AARCH64_H
 
+#include <signal.h> /* for SVE_MAGIC */
+
 struct simd_reginfo {
     __uint128_t vregs[32];
     char end[0];
 };
 
+#ifdef SVE_MAGIC
+struct sve_reginfo {
+    /* SVE */
+    uint16_t    vl; /* current VL */
+    __uint128_t zregs[SVE_NUM_ZREGS][SVE_VQ_MAX];
+    uint16_t    pregs[SVE_NUM_PREGS][SVE_VQ_MAX];
+    uint16_t    ffr[SVE_VQ_MAX];
+    char end[0];
+};
+#endif
+
 struct reginfo {
     uint64_t fault_address;
     uint64_t regs[31];
@@ -32,6 +45,9 @@ struct reginfo {
 
     union {
         struct simd_reginfo simd;
+#ifdef SVE_MAGIC
+        struct sve_reginfo sve;
+#endif
     };
 };
 
-- 
2.17.1


Re: [Qemu-devel] [RISU PATCH v3 19/22] risu_reginfo_aarch64: add support for copying SVE register state
Posted by Richard Henderson, 1 week ago
On 06/13/2018 02:55 AM, Alex Bennée wrote:
> Add the ability to save SVE registers from the signal context. This is
> controlled with an optional flag --test-sve. The whole thing is
> conditionally compiled when SVE support is in the sigcontext headers.
> 
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> 
> ---
> v2
>   - support EXTRA_MAGIC contexts
> v3
>   - handle conditional bits
>   - include <signal.h> in reginfo.h
>   - move from helper function to main init function
>   - (void *) cast for memcpy
>   - additional ifdef SVE_MAGIC stuff
> ---

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~

[Qemu-devel] [RISU PATCH v3 20/22] risu_reginfo_aarch64: add SVE support to reginfo_dump_mismatch
Posted by Alex Bennée, 1 week ago
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

---
v2
  - include ffr in comparison
  - mild re-factor of preg cmp/diff
v3
  - re-factoring
---
 risu_reginfo_aarch64.c | 74 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 74 insertions(+)

diff --git a/risu_reginfo_aarch64.c b/risu_reginfo_aarch64.c
index 79db5dd..a352b4c 100644
--- a/risu_reginfo_aarch64.c
+++ b/risu_reginfo_aarch64.c
@@ -17,6 +17,7 @@
 #include <stdlib.h>
 #include <stddef.h>
 #include <stdbool.h>
+#include <inttypes.h>
 
 #include "risu.h"
 #include "risu_reginfo_aarch64.h"
@@ -164,6 +165,35 @@ int reginfo_is_eq(struct reginfo *r1, struct reginfo *r2)
     return memcmp(r1, r2, reginfo_size()) == 0;
 }
 
+#ifdef SVE_MAGIC
+static int sve_zreg_is_eq(struct reginfo *r1, struct reginfo *r2, int z)
+{
+    return memcmp(r1->sve.zregs[z], r2->sve.zregs[z], sizeof(*r1->sve.zregs[z])) == 0;
+}
+
+static int sve_preg_is_eq(uint16_t const (*p1)[SVE_VQ_MAX],
+                          uint16_t const (*p2)[SVE_VQ_MAX])
+{
+    return memcmp(p1, p2, sizeof *p1) == 0;
+}
+
+static void sve_dump_preg_diff(FILE *f, int vq,
+                               uint16_t const (*p1)[SVE_VQ_MAX],
+                               uint16_t const (*p2)[SVE_VQ_MAX])
+{
+    int q;
+
+    for (q = 0; q < vq; q++) {
+       fprintf(f, "%#04x", *p1[q]);
+    }
+    fprintf(f, " vs ");
+    for (q = 0; q < vq; q++) {
+       fprintf(f, "%#04x", *p2[q]);
+    }
+    fprintf(f, "\n");
+}
+#endif
+
 /* reginfo_dump: print state to a stream, returns nonzero on success */
 int reginfo_dump(struct reginfo *ri, FILE * f)
 {
@@ -227,6 +257,50 @@ int reginfo_dump_mismatch(struct reginfo *m, struct reginfo *a, FILE * f)
         fprintf(f, "  fpcr  : %08x vs %08x\n", m->fpcr, a->fpcr);
     }
 
+#ifdef SVE_MAGIC
+    if (test_sve) {
+        struct sve_reginfo *ms = &m->sve;
+        struct sve_reginfo *as = &a->sve;
+
+        if (ms->vl != as->vl) {
+            fprintf(f, "  SVE VL  : %d vs %d\n", ms->vl, as->vl);
+        }
+
+        if (!sve_preg_is_eq(&ms->ffr, &as->ffr)) {
+           fprintf(f, "  FFR   : ");
+           sve_dump_preg_diff(f, sve_vq_from_vl(ms->vl),
+                              &ms->pregs[i], &as->pregs[i]);
+        }
+        for (i = 0; i < SVE_NUM_PREGS; i++) {
+           if (!sve_preg_is_eq(&ms->pregs[i], &as->pregs[i])) {
+              fprintf(f, "  P%2d   : ", i);
+              sve_dump_preg_diff(f, sve_vq_from_vl(ms->vl),
+                                 &ms->pregs[i], &as->pregs[i]);
+           }
+        }
+        for (i = 0; i < SVE_NUM_ZREGS; i++) {
+           if (!sve_zreg_is_eq(m, a, i)) {
+              int q;
+              char *pad="";
+              fprintf(f, "  Z%2d   : ", i);
+              for (q = 0; q < sve_vq_from_vl(ms->vl); q++) {
+                 if (ms->zregs[i][q] != as->zregs[i][q]) {
+                    fprintf(f, "%sq%02d: %016" PRIx64 "%016" PRIx64
+                            " vs %016" PRIx64 "%016" PRIx64"\n", pad, q,
+                            (uint64_t) (ms->zregs[i][q] >> 64),
+                            (uint64_t) ms->zregs[i][q],
+                            (uint64_t) (as->zregs[i][q] >> 64),
+                            (uint64_t) as->zregs[i][q]);
+                    pad = "          ";
+                 }
+              }
+           }
+        }
+
+        return !ferror(f);
+    }
+#endif
+
     for (i = 0; i < 32; i++) {
         if (m->simd.vregs[i] != a->simd.vregs[i]) {
             fprintf(f, "  V%-2d   : "
-- 
2.17.1


Re: [Qemu-devel] [RISU PATCH v3 20/22] risu_reginfo_aarch64: add SVE support to reginfo_dump_mismatch
Posted by Richard Henderson, 1 week ago
On 06/13/2018 02:55 AM, Alex Bennée wrote:
> +static void sve_dump_preg_diff(FILE *f, int vq,
> +                               uint16_t const (*p1)[SVE_VQ_MAX],
> +                               uint16_t const (*p2)[SVE_VQ_MAX])
> +{
> +    int q;
> +
> +    for (q = 0; q < vq; q++) {
> +       fprintf(f, "%#04x", *p1[q]);
> +    }
> +    fprintf(f, " vs ");
> +    for (q = 0; q < vq; q++) {
> +       fprintf(f, "%#04x", *p2[q]);

%# adds 0x into every 16-bit unit, so for vq=2 we get

  0xffff0xffff

Emit the 0x separately to start?

> +        for (i = 0; i < SVE_NUM_ZREGS; i++) {
> +           if (!sve_zreg_is_eq(m, a, i)) {
> +              int q;
> +              char *pad="";
> +              fprintf(f, "  Z%2d   : ", i);

%-2d?  %02d?

> +              for (q = 0; q < sve_vq_from_vl(ms->vl); q++) {
> +                 if (ms->zregs[i][q] != as->zregs[i][q]) {
> +                    fprintf(f, "%sq%02d: %016" PRIx64 "%016" PRIx64
> +                            " vs %016" PRIx64 "%016" PRIx64"\n", pad, q,

Actually, another thing that has annoyed me in the past,
but apparently not quite enough to actually fix, is the
fact that reginfo_dump and reginfo_dump_mismatch have a
slightly different format for Zregs.

It's probably worth splitting those bits out to helper
functions so that they must match.


r~

[Qemu-devel] [RISU PATCH v3 21/22] risu_reginfo_aarch64: limit SVE_VQ_MAX to current architecture
Posted by Alex Bennée, 1 week ago
From: Richard Henderson <richard.henderson@linaro.org>

The kernel headers optimistically assume it's going to grow but as we
are never going to use that many on current hardware we limit
SVE_VQ_MAX to what we will.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 risu_reginfo_aarch64.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/risu_reginfo_aarch64.h b/risu_reginfo_aarch64.h
index b3701b3..c33b86f 100644
--- a/risu_reginfo_aarch64.h
+++ b/risu_reginfo_aarch64.h
@@ -31,6 +31,11 @@ struct sve_reginfo {
 };
 #endif
 
+/* The kernel headers set this based on future arch extensions.
+   The current arch maximum is 16.  Save space below.  */
+#undef SVE_VQ_MAX
+#define SVE_VQ_MAX 16
+
 struct reginfo {
     uint64_t fault_address;
     uint64_t regs[31];
-- 
2.17.1


[Qemu-devel] [RISU PATCH v3 22/22] risu_reginfo_aarch64: handle variable VQ
Posted by Alex Bennée, 1 week ago
This involves parsing the command line parameter and calling the
kernel to set the VQ limit. We also add dumping of the register state
in the main register dump.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 risu_reginfo_aarch64.c | 164 +++++++++++++++++++++++++++++------------
 1 file changed, 115 insertions(+), 49 deletions(-)

diff --git a/risu_reginfo_aarch64.c b/risu_reginfo_aarch64.c
index a352b4c..71c57b9 100644
--- a/risu_reginfo_aarch64.c
+++ b/risu_reginfo_aarch64.c
@@ -18,6 +18,8 @@
 #include <stddef.h>
 #include <stdbool.h>
 #include <inttypes.h>
+#include <assert.h>
+#include <sys/prctl.h>
 
 #include "risu.h"
 #include "risu_reginfo_aarch64.h"
@@ -30,17 +32,41 @@ const char * const arch_extra_help;
 /* Should we test SVE register state */
 static int test_sve;
 static const struct option extra_opts[] = {
-    {"test-sve", no_argument, &test_sve, 1},
+    {"test-sve", required_argument, NULL, FIRST_ARCH_OPT },
     {0, 0, 0, 0}
 };
 
 const struct option * const arch_long_opts = &extra_opts[0];
-const char * const arch_extra_help = "  --test-sve        Compare SVE registers\n";
+const char * const arch_extra_help
+    = "  --test-sve=<vq>        Compare SVE registers with VQ\n";
 #endif
 
 void process_arch_opt(int opt, const char *arg)
 {
+#ifdef SVE_MAGIC
+    long want, got;
+
+    assert(opt == FIRST_ARCH_OPT);
+    test_sve = strtol(arg, 0, 10);
+
+    if (test_sve <= 0 || test_sve > SVE_VQ_MAX) {
+        fprintf(stderr, "Invalid value for VQ (1-%d)\n", SVE_VQ_MAX);
+        exit(1);
+    }
+    want = sve_vl_from_vq(test_sve);
+    got = prctl(PR_SVE_SET_VL, want);
+    if (want != got) {
+        if (got < 0) {
+            perror("prctl PR_SVE_SET_VL");
+        } else {
+            fprintf(stderr, "Unsupported value for VQ (%d != %d)\n",
+                    test_sve, (int)sve_vq_from_vl(got));
+        }
+        exit(1);
+    }
+#else
     abort();
+#endif
 }
 
 const int reginfo_size(void)
@@ -113,12 +139,18 @@ void reginfo_init(struct reginfo *ri, ucontext_t *uc)
 
 #ifdef SVE_MAGIC
     if (test_sve) {
-        int vq = sve_vq_from_vl(sve->vl); /* number of quads for whole vl */
+        int vq = test_sve;
 
         if (sve == NULL) {
             fprintf(stderr, "risu_reginfo_aarch64: failed to get SVE state\n");
             return;
         }
+        if (sve->vl != sve_vl_from_vq(vq)) {
+            fprintf(stderr, "risu_reginfo_aarch64: "
+                    "unexpected SVE state: %d != %d\n",
+                    sve->vl, sve_vl_from_vq(vq));
+            return;
+        }
 
         ri->sve.vl = sve->vl;
 
@@ -147,12 +179,12 @@ void reginfo_init(struct reginfo *ri, ucontext_t *uc)
         }
 
         /* Finally the FFR */
-        memcpy(&ri->sve.ffr,(void *)sve + SVE_SIG_FFR_OFFSET(vq),
+        memcpy(&ri->sve.ffr, (void *)sve + SVE_SIG_FFR_OFFSET(vq),
                SVE_SIG_FFR_SIZE(vq));
 
         return;
     }
-#endif
+#endif /* SVE_MAGIC */
 
     for (i = 0; i < 32; i++) {
         ri->simd.vregs[i] = fp->vregs[i];
@@ -166,32 +198,49 @@ int reginfo_is_eq(struct reginfo *r1, struct reginfo *r2)
 }
 
 #ifdef SVE_MAGIC
-static int sve_zreg_is_eq(struct reginfo *r1, struct reginfo *r2, int z)
+static int sve_zreg_is_eq(int vq, const void *z1, const void *z2)
 {
-    return memcmp(r1->sve.zregs[z], r2->sve.zregs[z], sizeof(*r1->sve.zregs[z])) == 0;
+    return memcmp(z1, z2, vq * 16) == 0;
 }
 
-static int sve_preg_is_eq(uint16_t const (*p1)[SVE_VQ_MAX],
-                          uint16_t const (*p2)[SVE_VQ_MAX])
+static int sve_preg_is_eq(int vq, const void *p1, const void *p2)
 {
-    return memcmp(p1, p2, sizeof *p1) == 0;
+    return memcmp(p1, p2, vq * 2) == 0;
 }
 
-static void sve_dump_preg_diff(FILE *f, int vq,
-                               uint16_t const (*p1)[SVE_VQ_MAX],
-                               uint16_t const (*p2)[SVE_VQ_MAX])
+static void sve_dump_preg(FILE *f, int vq, const uint16_t *p)
 {
     int q;
-
-    for (q = 0; q < vq; q++) {
-       fprintf(f, "%#04x", *p1[q]);
+    for (q = vq - 1; q >= 0; q--) {
+        fprintf(f, "%04x", p[q]);
     }
+}
+
+static void sve_dump_preg_diff(FILE *f, int vq, const uint16_t *p1,
+                               const uint16_t *p2)
+{
+    sve_dump_preg(f, vq, p1);
     fprintf(f, " vs ");
-    for (q = 0; q < vq; q++) {
-       fprintf(f, "%#04x", *p2[q]);
-    }
+    sve_dump_preg(f, vq, p2);
     fprintf(f, "\n");
 }
+
+static void sve_dump_zreg_diff(FILE *f, int vq, const __uint128_t *z1,
+                               const __uint128_t *z2)
+{
+    const char *pad = "";
+    int q;
+
+    for (q = 0; q < vq; ++q) {
+        if (z1[q] != z2[q]) {
+            fprintf(f, "%sq%02d: %016" PRIx64 "%016" PRIx64
+                    " vs %016" PRIx64 "%016" PRIx64"\n", pad, q,
+                    (uint64_t)(z1[q] >> 64), (uint64_t)z1[q],
+                    (uint64_t)(z2[q] >> 64), (uint64_t)z2[q]);
+            pad = "          ";
+        }
+    }
+}
 #endif
 
 /* reginfo_dump: print state to a stream, returns nonzero on success */
@@ -210,6 +259,36 @@ int reginfo_dump(struct reginfo *ri, FILE * f)
     fprintf(f, "  fpsr  : %08x\n", ri->fpsr);
     fprintf(f, "  fpcr  : %08x\n", ri->fpcr);
 
+#ifdef SVE_MAGIC
+    if (test_sve) {
+        int q, vq = test_sve;
+
+        fprintf(f, "  vl    : %d\n", ri->sve.vl);
+
+        for (i = 0; i < 32; i++) {
+            fprintf(f, "  Z%-2d q%-2d: %016" PRIx64 "%016" PRIx64 "\n", i, 0,
+                    (uint64_t)(ri->sve.zregs[i][0] >> 64),
+                    (uint64_t)ri->sve.zregs[i][0]);
+            for (q = 1; q < vq; ++q) {
+                fprintf(f, "      q%-2d: %016" PRIx64 "%016" PRIx64 "\n", q,
+                        (uint64_t)(ri->sve.zregs[i][q] >> 64),
+                        (uint64_t)ri->sve.zregs[i][q]);
+            }
+        }
+
+        for (i = 0; i < 16; i++) {
+            fprintf(f, "  P%-2d    : ", i);
+            sve_dump_preg(f, vq, &ri->sve.pregs[i][0]);
+            fprintf(f, "\n");
+        }
+        fprintf(f, "  FFR    : ");
+        sve_dump_preg(f, vq, &ri->sve.ffr[0]);
+        fprintf(f, "\n");
+
+        return !ferror(f);
+    }
+#endif
+
     for (i = 0; i < 32; i++) {
         fprintf(f, "  V%-2d   : %016" PRIx64 "%016" PRIx64 "\n", i,
                 (uint64_t) (ri->simd.vregs[i] >> 64),
@@ -259,42 +338,29 @@ int reginfo_dump_mismatch(struct reginfo *m, struct reginfo *a, FILE * f)
 
 #ifdef SVE_MAGIC
     if (test_sve) {
-        struct sve_reginfo *ms = &m->sve;
-        struct sve_reginfo *as = &a->sve;
+        int vq = sve_vq_from_vl(m->sve.vl);
 
-        if (ms->vl != as->vl) {
-            fprintf(f, "  SVE VL  : %d vs %d\n", ms->vl, as->vl);
+        if (m->sve.vl != a->sve.vl) {
+            fprintf(f, "  vl    : %d vs %d\n", m->sve.vl, a->sve.vl);
         }
 
-        if (!sve_preg_is_eq(&ms->ffr, &as->ffr)) {
-           fprintf(f, "  FFR   : ");
-           sve_dump_preg_diff(f, sve_vq_from_vl(ms->vl),
-                              &ms->pregs[i], &as->pregs[i]);
+        for (i = 0; i < SVE_NUM_ZREGS; i++) {
+            if (!sve_zreg_is_eq(vq, &m->sve.zregs[i], &a->sve.zregs[i])) {
+                fprintf(f, "  Z%-2d   : ", i);
+                sve_dump_zreg_diff(f, vq, &m->sve.zregs[i][0],
+                                   &a->sve.zregs[i][0]);
+            }
         }
         for (i = 0; i < SVE_NUM_PREGS; i++) {
-           if (!sve_preg_is_eq(&ms->pregs[i], &as->pregs[i])) {
-              fprintf(f, "  P%2d   : ", i);
-              sve_dump_preg_diff(f, sve_vq_from_vl(ms->vl),
-                                 &ms->pregs[i], &as->pregs[i]);
-           }
+            if (!sve_preg_is_eq(vq, &m->sve.pregs[i], &a->sve.pregs[i])) {
+                fprintf(f, "  P%-2d   : ", i);
+                sve_dump_preg_diff(f, vq, &m->sve.pregs[i][0],
+                                   &a->sve.pregs[i][0]);
+            }
         }
-        for (i = 0; i < SVE_NUM_ZREGS; i++) {
-           if (!sve_zreg_is_eq(m, a, i)) {
-              int q;
-              char *pad="";
-              fprintf(f, "  Z%2d   : ", i);
-              for (q = 0; q < sve_vq_from_vl(ms->vl); q++) {
-                 if (ms->zregs[i][q] != as->zregs[i][q]) {
-                    fprintf(f, "%sq%02d: %016" PRIx64 "%016" PRIx64
-                            " vs %016" PRIx64 "%016" PRIx64"\n", pad, q,
-                            (uint64_t) (ms->zregs[i][q] >> 64),
-                            (uint64_t) ms->zregs[i][q],
-                            (uint64_t) (as->zregs[i][q] >> 64),
-                            (uint64_t) as->zregs[i][q]);
-                    pad = "          ";
-                 }
-              }
-           }
+        if (!sve_preg_is_eq(vq, &m->sve.ffr, &a->sve.ffr)) {
+            fprintf(f, "  FFR   : ");
+            sve_dump_preg_diff(f, vq, &m->sve.pregs[i][0], &a->sve.pregs[i][0]);
         }
 
         return !ferror(f);
-- 
2.17.1


Re: [Qemu-devel] [RISU PATCH v3 22/22] risu_reginfo_aarch64: handle variable VQ
Posted by Richard Henderson, 1 week ago
On 06/13/2018 02:56 AM, Alex Bennée wrote:
> @@ -147,12 +179,12 @@ void reginfo_init(struct reginfo *ri, ucontext_t *uc)
>          }
>  
>          /* Finally the FFR */
> -        memcpy(&ri->sve.ffr,(void *)sve + SVE_SIG_FFR_OFFSET(vq),
> +        memcpy(&ri->sve.ffr, (void *)sve + SVE_SIG_FFR_OFFSET(vq),
>                 SVE_SIG_FFR_SIZE(vq));
>  
>          return;
>      }
> -#endif
> +#endif /* SVE_MAGIC */

It appears that most of the rest of this patch should be folded back earlier in
the patch set.


r~

>  
>      for (i = 0; i < 32; i++) {
>          ri->simd.vregs[i] = fp->vregs[i];
> @@ -166,32 +198,49 @@ int reginfo_is_eq(struct reginfo *r1, struct reginfo *r2)
>  }
>  
>  #ifdef SVE_MAGIC
> -static int sve_zreg_is_eq(struct reginfo *r1, struct reginfo *r2, int z)
> +static int sve_zreg_is_eq(int vq, const void *z1, const void *z2)
>  {
> -    return memcmp(r1->sve.zregs[z], r2->sve.zregs[z], sizeof(*r1->sve.zregs[z])) == 0;
> +    return memcmp(z1, z2, vq * 16) == 0;
>  }
>  
> -static int sve_preg_is_eq(uint16_t const (*p1)[SVE_VQ_MAX],
> -                          uint16_t const (*p2)[SVE_VQ_MAX])
> +static int sve_preg_is_eq(int vq, const void *p1, const void *p2)
>  {
> -    return memcmp(p1, p2, sizeof *p1) == 0;
> +    return memcmp(p1, p2, vq * 2) == 0;
>  }
>  
> -static void sve_dump_preg_diff(FILE *f, int vq,
> -                               uint16_t const (*p1)[SVE_VQ_MAX],
> -                               uint16_t const (*p2)[SVE_VQ_MAX])
> +static void sve_dump_preg(FILE *f, int vq, const uint16_t *p)
>  {
>      int q;
> -
> -    for (q = 0; q < vq; q++) {
> -       fprintf(f, "%#04x", *p1[q]);
> +    for (q = vq - 1; q >= 0; q--) {
> +        fprintf(f, "%04x", p[q]);
>      }
> +}
> +
> +static void sve_dump_preg_diff(FILE *f, int vq, const uint16_t *p1,
> +                               const uint16_t *p2)
> +{
> +    sve_dump_preg(f, vq, p1);
>      fprintf(f, " vs ");
> -    for (q = 0; q < vq; q++) {
> -       fprintf(f, "%#04x", *p2[q]);
> -    }
> +    sve_dump_preg(f, vq, p2);
>      fprintf(f, "\n");
>  }
> +
> +static void sve_dump_zreg_diff(FILE *f, int vq, const __uint128_t *z1,
> +                               const __uint128_t *z2)
> +{
> +    const char *pad = "";
> +    int q;
> +
> +    for (q = 0; q < vq; ++q) {
> +        if (z1[q] != z2[q]) {
> +            fprintf(f, "%sq%02d: %016" PRIx64 "%016" PRIx64
> +                    " vs %016" PRIx64 "%016" PRIx64"\n", pad, q,
> +                    (uint64_t)(z1[q] >> 64), (uint64_t)z1[q],
> +                    (uint64_t)(z2[q] >> 64), (uint64_t)z2[q]);
> +            pad = "          ";
> +        }
> +    }
> +}
>  #endif
>  
>  /* reginfo_dump: print state to a stream, returns nonzero on success */
> @@ -210,6 +259,36 @@ int reginfo_dump(struct reginfo *ri, FILE * f)
>      fprintf(f, "  fpsr  : %08x\n", ri->fpsr);
>      fprintf(f, "  fpcr  : %08x\n", ri->fpcr);
>  
> +#ifdef SVE_MAGIC
> +    if (test_sve) {
> +        int q, vq = test_sve;
> +
> +        fprintf(f, "  vl    : %d\n", ri->sve.vl);
> +
> +        for (i = 0; i < 32; i++) {
> +            fprintf(f, "  Z%-2d q%-2d: %016" PRIx64 "%016" PRIx64 "\n", i, 0,
> +                    (uint64_t)(ri->sve.zregs[i][0] >> 64),
> +                    (uint64_t)ri->sve.zregs[i][0]);
> +            for (q = 1; q < vq; ++q) {
> +                fprintf(f, "      q%-2d: %016" PRIx64 "%016" PRIx64 "\n", q,
> +                        (uint64_t)(ri->sve.zregs[i][q] >> 64),
> +                        (uint64_t)ri->sve.zregs[i][q]);
> +            }
> +        }
> +
> +        for (i = 0; i < 16; i++) {
> +            fprintf(f, "  P%-2d    : ", i);
> +            sve_dump_preg(f, vq, &ri->sve.pregs[i][0]);
> +            fprintf(f, "\n");
> +        }
> +        fprintf(f, "  FFR    : ");
> +        sve_dump_preg(f, vq, &ri->sve.ffr[0]);
> +        fprintf(f, "\n");
> +
> +        return !ferror(f);
> +    }
> +#endif
> +
>      for (i = 0; i < 32; i++) {
>          fprintf(f, "  V%-2d   : %016" PRIx64 "%016" PRIx64 "\n", i,
>                  (uint64_t) (ri->simd.vregs[i] >> 64),
> @@ -259,42 +338,29 @@ int reginfo_dump_mismatch(struct reginfo *m, struct reginfo *a, FILE * f)
>  
>  #ifdef SVE_MAGIC
>      if (test_sve) {
> -        struct sve_reginfo *ms = &m->sve;
> -        struct sve_reginfo *as = &a->sve;
> +        int vq = sve_vq_from_vl(m->sve.vl);
>  
> -        if (ms->vl != as->vl) {
> -            fprintf(f, "  SVE VL  : %d vs %d\n", ms->vl, as->vl);
> +        if (m->sve.vl != a->sve.vl) {
> +            fprintf(f, "  vl    : %d vs %d\n", m->sve.vl, a->sve.vl);
>          }
>  
> -        if (!sve_preg_is_eq(&ms->ffr, &as->ffr)) {
> -           fprintf(f, "  FFR   : ");
> -           sve_dump_preg_diff(f, sve_vq_from_vl(ms->vl),
> -                              &ms->pregs[i], &as->pregs[i]);
> +        for (i = 0; i < SVE_NUM_ZREGS; i++) {
> +            if (!sve_zreg_is_eq(vq, &m->sve.zregs[i], &a->sve.zregs[i])) {
> +                fprintf(f, "  Z%-2d   : ", i);
> +                sve_dump_zreg_diff(f, vq, &m->sve.zregs[i][0],
> +                                   &a->sve.zregs[i][0]);
> +            }
>          }
>          for (i = 0; i < SVE_NUM_PREGS; i++) {
> -           if (!sve_preg_is_eq(&ms->pregs[i], &as->pregs[i])) {
> -              fprintf(f, "  P%2d   : ", i);
> -              sve_dump_preg_diff(f, sve_vq_from_vl(ms->vl),
> -                                 &ms->pregs[i], &as->pregs[i]);
> -           }
> +            if (!sve_preg_is_eq(vq, &m->sve.pregs[i], &a->sve.pregs[i])) {
> +                fprintf(f, "  P%-2d   : ", i);
> +                sve_dump_preg_diff(f, vq, &m->sve.pregs[i][0],
> +                                   &a->sve.pregs[i][0]);
> +            }
>          }
> -        for (i = 0; i < SVE_NUM_ZREGS; i++) {
> -           if (!sve_zreg_is_eq(m, a, i)) {
> -              int q;
> -              char *pad="";
> -              fprintf(f, "  Z%2d   : ", i);
> -              for (q = 0; q < sve_vq_from_vl(ms->vl); q++) {
> -                 if (ms->zregs[i][q] != as->zregs[i][q]) {
> -                    fprintf(f, "%sq%02d: %016" PRIx64 "%016" PRIx64
> -                            " vs %016" PRIx64 "%016" PRIx64"\n", pad, q,
> -                            (uint64_t) (ms->zregs[i][q] >> 64),
> -                            (uint64_t) ms->zregs[i][q],
> -                            (uint64_t) (as->zregs[i][q] >> 64),
> -                            (uint64_t) as->zregs[i][q]);
> -                    pad = "          ";
> -                 }
> -              }
> -           }
> +        if (!sve_preg_is_eq(vq, &m->sve.ffr, &a->sve.ffr)) {
> +            fprintf(f, "  FFR   : ");
> +            sve_dump_preg_diff(f, vq, &m->sve.pregs[i][0], &a->sve.pregs[i][0]);
>          }
>  
>          return !ferror(f);
>