[Qemu-devel] [PATCH 00/38] tcg vector improvements

Richard Henderson posted 38 patches 4 years, 12 months ago
Test checkpatch failed
Test asan passed
Test docker-clang@ubuntu passed
Test docker-mingw@fedora passed
Patches applied successfully
git fetch https://github.com/patchew-project/qemu tags/patchew/20190420073442.7488-1-richard.henderson@linaro.org
Maintainers: Cornelia Huck <cohuck@redhat.com>, Alistair Francis <Alistair.Francis@wdc.com>, David Gibson <david@gibson.dropbear.id.au>, Stefan Weil <sw@weilnetz.de>, Paolo Bonzini <pbonzini@redhat.com>, Aleksandar Rikalo <arikalo@wavecomp.com>, Peter Maydell <peter.maydell@linaro.org>, Aurelien Jarno <aurelien@aurel32.net>, Richard Henderson <rth@twiddle.net>, Palmer Dabbelt <palmer@sifive.com>, Claudio Fontana <claudio.fontana@huawei.com>, "Edgar E. Iglesias" <edgar.iglesias@gmail.com>, David Hildenbrand <david@redhat.com>, Max Filippov <jcmvbkbc@gmail.com>, Andrzej Zaborowski <balrogg@gmail.com>
There is a newer version of this series
[Qemu-devel] [PATCH 00/38] tcg vector improvements
Posted by Richard Henderson 4 years, 12 months ago
Based-on: tcg-next, which at present is only tcg_gen_extract2.

The dupm patches have been on list before, with a larger context
of supporting tcg/ppc.  The rest of the set was written to support
David's s390 vector patches.  In particular:

(1) Add vector absolute value.
(2) Add vector shift by non-constant scalar.
(3) Add vector shift by vector.
(4) Add vector select.
(5) Be more precise in handling target-specific vector expansions.
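
For readers unfamiliar with the operations in (1) through (3), a minimal sketch of their per-element semantics in plain C follows. It is a reference model only, not TCG code: the function names, the fixed four-lane layout, and the requirement that shift counts stay below the element width are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: a 128-bit vector of four 32-bit elements. */
#define LANES 4

/*
 * (1) Absolute value of each element.  The negation is done in
 * unsigned arithmetic so that INT32_MIN maps back to itself on the
 * usual two's-complement compilers instead of overflowing.
 */
void ref_abs32(int32_t *d, const int32_t *a)
{
    for (size_t i = 0; i < LANES; i++) {
        uint32_t u = (uint32_t)a[i];
        d[i] = (int32_t)(a[i] < 0 ? 0u - u : u);
    }
}

/* (2) Shift by a non-constant scalar: one count applies to all lanes. */
void ref_shls32(uint32_t *d, const uint32_t *a, unsigned shift)
{
    assert(shift < 32);             /* assumption of this sketch */
    for (size_t i = 0; i < LANES; i++) {
        d[i] = a[i] << shift;
    }
}

/* (3) Shift by vector: each lane carries its own count. */
void ref_shlv32(uint32_t *d, const uint32_t *a, const uint32_t *b)
{
    for (size_t i = 0; i < LANES; i++) {
        assert(b[i] < 32);          /* assumption of this sketch */
        d[i] = a[i] << b[i];
    }
}
```

The difference between (2) and (3) is only where the count comes from: a single run-time value in the scalar form, a second vector operand in the vector form.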

And then there's a set of bugs that I encountered while working
on this across x86, aa64, and ppc hosts.  Tested primarily with
aa64 as the guest, via RISU.


r~


David Hildenbrand (1):
  tcg: Implement tcg_gen_gvec_3i()

Richard Henderson (37):
  target/arm: Fill in .opc for cmtst_op
  tcg: Assert fixed_reg is read-only
  tcg: Return bool success from tcg_out_mov
  tcg: Support cross-class moves without instruction support
  tcg: Allow add_vec, sub_vec, neg_vec, not_vec to be expanded
  tcg: Promote tcg_out_{dup,dupi}_vec to backend interface
  tcg: Manually expand INDEX_op_dup_vec
  tcg: Add tcg_out_dupm_vec to the backend interface
  tcg/i386: Implement tcg_out_dupm_vec
  tcg/aarch64: Implement tcg_out_dupm_vec
  tcg: Add INDEX_op_dup_mem_vec
  tcg: Add gvec expanders for variable shift
  tcg/i386: Support vector variable shift opcodes
  tcg/aarch64: Support vector variable shift opcodes
  tcg: Specify optional vector requirements with a list
  tcg: Add gvec expanders for vector shift by scalar
  tcg/i386: Support vector scalar shift opcodes
  tcg: Add support for integer absolute value
  tcg: Add support for vector absolute value
  target/arm: Use tcg_gen_abs_i64 and tcg_gen_gvec_abs
  target/cris: Use tcg_gen_abs_tl
  target/ppc: Use tcg_gen_abs_tl
  target/s390x: Use tcg_gen_abs_i64
  target/xtensa: Use tcg_gen_abs_i32
  tcg/i386: Support vector absolute value
  tcg/aarch64: Support vector absolute value
  tcg: Add support for vector comparison select
  tcg/i386: Support vector comparison select value
  tcg/aarch64: Support vector comparison select value
  target/ppc: Use vector variable shifts for VS{L,R,RA}{B,H,W,D}
  target/arm: Vectorize USHL and SSHL
  tcg/aarch64: Do not advertise minmax for MO_64
  tcg: Do not recreate INDEX_op_neg_vec unless supported
  tcg: Introduce do_op3_nofail for vector expansion
  tcg: Expand vector minmax using cmp+cmpsel
  tcg/aarch64: Use MVNI for expansion of dupi
  tcg/aarch64: Use ORRI and BICI for vector logical operations

 accel/tcg/tcg-runtime.h             |  20 +
 target/arm/helper.h                 |  17 +-
 target/arm/translate.h              |   6 +
 target/ppc/helper.h                 |  24 +-
 tcg/aarch64/tcg-target.h            |   4 +-
 tcg/aarch64/tcg-target.opc.h        |   2 +
 tcg/i386/tcg-target.h               |   6 +-
 tcg/i386/tcg-target.opc.h           |   1 -
 tcg/tcg-op-gvec.h                   |  60 +-
 tcg/tcg-op.h                        |  16 +
 tcg/tcg-opc.h                       |   3 +
 tcg/tcg.h                           |  20 +
 accel/tcg/tcg-runtime-gvec.c        | 180 ++++++
 target/arm/neon_helper.c            |  38 --
 target/arm/translate-a64.c          |  59 +-
 target/arm/translate-sve.c          |   9 +-
 target/arm/translate.c              | 432 ++++++++++---
 target/arm/vec_helper.c             | 176 ++++++
 target/cris/translate.c             |   9 +-
 target/ppc/int_helper.c             |   6 +-
 target/ppc/translate.c              |  80 +--
 target/ppc/translate/vmx-impl.inc.c | 175 +++++-
 target/s390x/translate.c            |   8 +-
 target/xtensa/translate.c           |   9 +-
 tcg/aarch64/tcg-target.inc.c        | 227 ++++++-
 tcg/arm/tcg-target.inc.c            |   7 +-
 tcg/i386/tcg-target.inc.c           | 176 +++++-
 tcg/mips/tcg-target.inc.c           |   3 +-
 tcg/optimize.c                      |   8 +-
 tcg/ppc/tcg-target.inc.c            |   3 +-
 tcg/riscv/tcg-target.inc.c          |   5 +-
 tcg/s390/tcg-target.inc.c           |   3 +-
 tcg/sparc/tcg-target.inc.c          |   3 +-
 tcg/tcg-op-gvec.c                   | 917 +++++++++++++++++++++++-----
 tcg/tcg-op-vec.c                    | 259 +++++++-
 tcg/tcg-op.c                        |  20 +
 tcg/tcg.c                           | 256 ++++++--
 tcg/tci/tcg-target.inc.c            |   3 +-
 tcg/README                          |  16 +
 39 files changed, 2699 insertions(+), 567 deletions(-)

-- 
2.17.1


Re: [Qemu-devel] [PATCH 00/38] tcg vector improvements
Posted by David Hildenbrand 4 years, 12 months ago
On 20.04.19 09:34, Richard Henderson wrote:
> Based-on: tcg-next, which at present is only tcg_gen_extract2.
> 
> The dupm patches have been on list before, with a larger context
> of supporting tcg/ppc.  The rest of the set was written to support
> David's s390 vector patches.  In particular:
> 
> (1) Add vector absolute value.
> (2) Add vector shift by non-constant scalar.
> (3) Add vector shift by vector.
> (4) Add vector select.

Remind me, is this for VECTOR SELECT on s390x where we already added a
vector variant? At least VECTOR SELECT on s390x works on bit, not
element granularity.

> (5) Be more precise in handling target-specific vector expansions.
> 
> And then there's a set of bugs that I encountered while working
> on this across x86, aa64, and ppc hosts.  Tested primarily with
> aa64 as the guest, via RISU.
> 
> 
> r~


-- 

Thanks,

David / dhildenb

Re: [Qemu-devel] [PATCH 00/38] tcg vector improvements
Posted by Richard Henderson 4 years, 12 months ago
On 4/23/19 12:15 PM, David Hildenbrand wrote:
> On 20.04.19 09:34, Richard Henderson wrote:
>> Based-on: tcg-next, which at present is only tcg_gen_extract2.
>>
>> The dupm patches have been on list before, with a larger context
>> of supporting tcg/ppc.  The rest of the set was written to support
>> David's s390 vector patches.  In particular:
>>
>> (1) Add vector absolute value.
>> (2) Add vector shift by non-constant scalar.
>> (3) Add vector shift by vector.
>> (4) Add vector select.
> 
> Remind me, is this for VECTOR SELECT on s390x where we already added a
> vector variant? At least VECTOR SELECT on s390x works on bit, not
> element granularity.


No, this was more for implementing _vec helpers, where we can
reasonably use element granularity (since that's all x86 has).

I thought about adding a bitsel alongside cmpsel, to work on
bits like this, but haven't yet.
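
To make the granularity distinction concrete, here is a minimal reference sketch in plain C. It is not TCG code: the function names, the four-lane layout, and the choice of a signed less-than comparison are assumptions for illustration. The first function picks whole elements based on a comparison, which is the cmpsel style; the second picks individual bits based on a mask, which is what a bitsel would do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: a 128-bit vector of four 32-bit elements. */
#define LANES 4

/*
 * Element granularity: for each lane, compare a with b and take the
 * whole lane from c (condition true) or from d (condition false).
 */
void ref_cmpsel_lt32(int32_t *r, const int32_t *a, const int32_t *b,
                     const int32_t *c, const int32_t *d)
{
    for (size_t i = 0; i < LANES; i++) {
        r[i] = a[i] < b[i] ? c[i] : d[i];
    }
}

/*
 * Bit granularity: every result bit is chosen on its own, from c
 * where the mask bit is 1 and from d where the mask bit is 0.
 */
void ref_bitsel32(uint32_t *r, const uint32_t *mask,
                  const uint32_t *c, const uint32_t *d)
{
    for (size_t i = 0; i < LANES; i++) {
        r[i] = (c[i] & mask[i]) | (d[i] & ~mask[i]);
    }
}
```

With a mask whose lanes are all-ones or all-zeros the two forms agree; they diverge as soon as a mask lane mixes 0 and 1 bits, which is the case a bit-granular select has to handle.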


r~

Re: [Qemu-devel] [PATCH 00/38] tcg vector improvements
Posted by David Hildenbrand 4 years, 12 months ago
On 23.04.19 22:26, Richard Henderson wrote:
> On 4/23/19 12:15 PM, David Hildenbrand wrote:
>> On 20.04.19 09:34, Richard Henderson wrote:
>>> Based-on: tcg-next, which at present is only tcg_gen_extract2.
>>>
>>> The dupm patches have been on list before, with a larger context
>>> of supporting tcg/ppc.  The rest of the set was written to support
>>> David's s390 vector patches.  In particular:
>>>
>>> (1) Add vector absolute value.
>>> (2) Add vector shift by non-constant scalar.
>>> (3) Add vector shift by vector.
>>> (4) Add vector select.
>>
>> Remind me, is this for VECTOR SELECT on s390x where we already added a
>> vector variant? At least VECTOR SELECT on s390x works on bit, not
>> element granularity.
> 
> 
> No, this was more for implementing _vec helpers, where we can
> reasonably use element granularity (since that's all x86 has).
> 
> I thought about adding a bitsel alongside cmpsel, to work on
> bits like this, but haven't yet.
> 

Makes sense, thanks!


-- 

Thanks,

David / dhildenb

Re: [Qemu-devel] [PATCH 00/38] tcg vector improvements
Posted by no-reply@patchew.org 4 years, 12 months ago
Patchew URL: https://patchew.org/QEMU/20190420073442.7488-1-richard.henderson@linaro.org/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20190420073442.7488-1-richard.henderson@linaro.org
Subject: [Qemu-devel] [PATCH 00/38] tcg vector improvements

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]               patchew/20190420073442.7488-1-richard.henderson@linaro.org -> patchew/20190420073442.7488-1-richard.henderson@linaro.org
Switched to a new branch 'test'
eaace97609 tcg/aarch64: Use ORRI and BICI for vector logical operations
a701168d9c tcg/aarch64: Use MVNI for expansion of dupi
508dc19d39 tcg: Expand vector minmax using cmp+cmpsel
6976ef828e tcg: Introduce do_op3_nofail for vector expansion
2d6f5f7050 tcg: Do not recreate INDEX_op_neg_vec unless supported
7425b741da tcg/aarch64: Do not advertise minmax for MO_64
54ed8f1e33 target/arm: Vectorize USHL and SSHL
23ef118db8 target/ppc: Use vector variable shifts for VS{L, R, RA}{B, H, W, D}
65cae51501 tcg/aarch64: Support vector comparison select value
28392fb9a2 tcg/i386: Support vector comparison select value
a7efba49a2 tcg: Add support for vector comparison select
2aded2eb78 tcg/aarch64: Support vector absolute value
4537dd40fc tcg/i386: Support vector absolute value
f81e912312 target/xtensa: Use tcg_gen_abs_i32
3c3292af32 target/s390x: Use tcg_gen_abs_i64
e81e04de28 target/ppc: Use tcg_gen_abs_tl
1869b46302 target/cris: Use tcg_gen_abs_tl
74420abd9d target/arm: Use tcg_gen_abs_i64 and tcg_gen_gvec_abs
72e6fb61b1 tcg: Add support for vector absolute value
b6885d0400 tcg: Add support for integer absolute value
9c178a385e tcg/i386: Support vector scalar shift opcodes
8a79bb2407 tcg: Add gvec expanders for vector shift by scalar
c611d5ab1d tcg: Specify optional vector requirements with a list
2d22c62f7d tcg: Implement tcg_gen_gvec_3i()
3e06422a97 tcg/aarch64: Support vector variable shift opcodes
1d95e80f8b tcg/i386: Support vector variable shift opcodes
8f6ed1c661 tcg: Add gvec expanders for variable shift
8c573e5a9c tcg: Add INDEX_op_dup_mem_vec
ca9a67767b tcg/aarch64: Implement tcg_out_dupm_vec
24cd3652da tcg/i386: Implement tcg_out_dupm_vec
9a14d7cf98 tcg: Add tcg_out_dupm_vec to the backend interface
504eb1c2ce tcg: Manually expand INDEX_op_dup_vec
71309aeb54 tcg: Promote tcg_out_{dup, dupi}_vec to backend interface
24aee13663 tcg: Allow add_vec, sub_vec, neg_vec, not_vec to be expanded
0ccf0219e9 tcg: Support cross-class moves without instruction support
476aacd9e7 tcg: Return bool success from tcg_out_mov
41a8017efa tcg: Assert fixed_reg is read-only
d2482ee256 target/arm: Fill in .opc for cmtst_op

=== OUTPUT BEGIN ===
1/38 Checking commit d2482ee25652 (target/arm: Fill in .opc for cmtst_op)
2/38 Checking commit 41a8017efa97 (tcg: Assert fixed_reg is read-only)
WARNING: Block comments use a leading /* on a separate line
#103: FILE: tcg/tcg.c:3529:
+            /* temp value is modified, so the value kept in memory is

WARNING: Block comments use * on subsequent lines
#104: FILE: tcg/tcg.c:3530:
+            /* temp value is modified, so the value kept in memory is
+               potentially not the same */

WARNING: Block comments use a trailing */ on a separate line
#104: FILE: tcg/tcg.c:3530:
+               potentially not the same */

total: 0 errors, 3 warnings, 140 lines checked

Patch 2/38 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
3/38 Checking commit 476aacd9e75c (tcg: Return bool success from tcg_out_mov)
4/38 Checking commit 0ccf0219e9e3 (tcg: Support cross-class moves without instruction support)
WARNING: Block comments use a leading /* on a separate line
#24: FILE: tcg/tcg.c:3372:
+                /* Cross register class move not supported.

WARNING: Block comments use * on subsequent lines
#25: FILE: tcg/tcg.c:3373:
+                /* Cross register class move not supported.
+                   Store the source register into the destination slot

WARNING: Block comments use a trailing */ on a separate line
#26: FILE: tcg/tcg.c:3374:
+                   and leave the destination temp as TEMP_VAL_MEM.  */

WARNING: Block comments use a leading /* on a separate line
#44: FILE: tcg/tcg.c:3485:
+                /* Cross register class move not supported.  Sync the

WARNING: Block comments use * on subsequent lines
#45: FILE: tcg/tcg.c:3486:
+                /* Cross register class move not supported.  Sync the
+                   temp back to its slot and load from there.  */

WARNING: Block comments use a trailing */ on a separate line
#45: FILE: tcg/tcg.c:3486:
+                   temp back to its slot and load from there.  */

WARNING: Block comments use a leading /* on a separate line
#57: FILE: tcg/tcg.c:3648:
+                        /* Cross register class move not supported.  Sync the

WARNING: Block comments use * on subsequent lines
#58: FILE: tcg/tcg.c:3649:
+                        /* Cross register class move not supported.  Sync the
+                           temp back to its slot and load from there.  */

WARNING: Block comments use a trailing */ on a separate line
#58: FILE: tcg/tcg.c:3649:
+                           temp back to its slot and load from there.  */

total: 0 errors, 9 warnings, 43 lines checked

Patch 4/38 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
5/38 Checking commit 24aee1366381 (tcg: Allow add_vec, sub_vec, neg_vec, not_vec to be expanded)
6/38 Checking commit 71309aeb54f3 (tcg: Promote tcg_out_{dup, dupi}_vec to backend interface)
7/38 Checking commit 504eb1c2ce3e (tcg: Manually expand INDEX_op_dup_vec)
8/38 Checking commit 9a14d7cf98a8 (tcg: Add tcg_out_dupm_vec to the backend interface)
9/38 Checking commit 24cd3652da7f (tcg/i386: Implement tcg_out_dupm_vec)
10/38 Checking commit ca9a67767b3f (tcg/aarch64: Implement tcg_out_dupm_vec)
11/38 Checking commit 8c573e5a9cf7 (tcg: Add INDEX_op_dup_mem_vec)
WARNING: Block comments use a leading /* on a separate line
#96: FILE: tcg/tcg-op-gvec.c:400:
+        /* Recall that ARM SVE allows vector sizes that are not a

total: 0 errors, 1 warnings, 178 lines checked

Patch 11/38 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
12/38 Checking commit 8f6ed1c661ad (tcg: Add gvec expanders for variable shift)
13/38 Checking commit 1d95e80f8b9b (tcg/i386: Support vector variable shift opcodes)
14/38 Checking commit 3e06422a97e0 (tcg/aarch64: Support vector variable shift opcodes)
15/38 Checking commit 2d22c62f7d91 (tcg: Implement tcg_gen_gvec_3i())
16/38 Checking commit c611d5ab1dd7 (tcg: Specify optional vector requirements with a list)
17/38 Checking commit 8a79bb240796 (tcg: Add gvec expanders for vector shift by scalar)
18/38 Checking commit 9c178a385e74 (tcg/i386: Support vector scalar shift opcodes)
19/38 Checking commit b6885d0400cb (tcg: Add support for integer absolute value)
20/38 Checking commit 72e6fb61b122 (tcg: Add support for vector absolute value)
21/38 Checking commit 74420abd9dc1 (target/arm: Use tcg_gen_abs_i64 and tcg_gen_gvec_abs)
22/38 Checking commit 1869b46302aa (target/cris: Use tcg_gen_abs_tl)
23/38 Checking commit e81e04de2818 (target/ppc: Use tcg_gen_abs_tl)
24/38 Checking commit 3c3292af3271 (target/s390x: Use tcg_gen_abs_i64)
25/38 Checking commit f81e912312f6 (target/xtensa: Use tcg_gen_abs_i32)
26/38 Checking commit 4537dd40fc11 (tcg/i386: Support vector absolute value)
27/38 Checking commit 2aded2eb782d (tcg/aarch64: Support vector absolute value)
28/38 Checking commit a7efba49a2b5 (tcg: Add support for vector comparison select)
29/38 Checking commit 28392fb9a2f4 (tcg/i386: Support vector comparison select value)
30/38 Checking commit 65cae5150118 (tcg/aarch64: Support vector comparison select value)
31/38 Checking commit 23ef118db83a (target/ppc: Use vector variable shifts for VS{L, R, RA}{B, H, W, D})
32/38 Checking commit 54ed8f1e33a8 (target/arm: Vectorize USHL and SSHL)
ERROR: trailing statements should be on next line
#161: FILE: target/arm/translate.c:5288:
+            case 2: gen_ushl_i32(var, var, shift); break;

ERROR: trailing statements should be on next line
#168: FILE: target/arm/translate.c:5294:
+            case 2: gen_sshl_i32(var, var, shift); break;

total: 2 errors, 0 warnings, 650 lines checked

Patch 32/38 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

33/38 Checking commit 7425b741da09 (tcg/aarch64: Do not advertise minmax for MO_64)
34/38 Checking commit 2d6f5f70509e (tcg: Do not recreate INDEX_op_neg_vec unless supported)
35/38 Checking commit 6976ef828e8b (tcg: Introduce do_op3_nofail for vector expansion)
36/38 Checking commit 508dc19d3978 (tcg: Expand vector minmax using cmp+cmpsel)
37/38 Checking commit a701168d9cbe (tcg/aarch64: Use MVNI for expansion of dupi)
38/38 Checking commit eaace9760937 (tcg/aarch64: Use ORRI and BICI for vector logical operations)
=== OUTPUT END ===

Test command exited with code: 1
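
As a rough illustration of what the warnings and errors above ask for, the following sketch shows a block comment and a switch case laid out in the form the messages describe. The function demo_shift and its body are hypothetical stand-ins, not the code from the patches.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the flagged code; only the layout matters. */
uint32_t demo_shift(int size, uint32_t var, uint32_t shift)
{
    /*
     * Block comments open with the comment marker on a line of its
     * own, start each subsequent line with an asterisk, and close
     * with the end marker on a separate line.
     */
    switch (size) {
    case 2:
        /* The statement and the break each get their own line. */
        var <<= shift & 31;
        break;
    default:
        break;
    }
    return var;
}
```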


The full log is available at
http://patchew.org/logs/20190420073442.7488-1-richard.henderson@linaro.org/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

Re: [Qemu-devel] [PATCH 00/38] tcg vector improvements
Posted by David Hildenbrand 4 years, 11 months ago
On 20.04.19 09:34, Richard Henderson wrote:
> Based-on: tcg-next, which at present is only tcg_gen_extract2.
> 
> The dupm patches have been on list before, with a larger context
> of supporting tcg/ppc.  The rest of the set was written to support
> David's s390 vector patches.  In particular:
> 
> (1) Add vector absolute value.
> (2) Add vector shift by non-constant scalar.
> (3) Add vector shift by vector.
> (4) Add vector select.
> (5) Be more precise in handling target-specific vector expansions.
> 
> And then there's a set of bugs that I encountered while working
> on this across x86, aa64, and ppc hosts.  Tested primarily with
> aa64 as the guest, via RISU.
> 
> 
> r~

Hi Richard,

what are your plans with this series? (and shlv and friends?)

-- 

Thanks,

David / dhildenb

Re: [Qemu-devel] [PATCH 00/38] tcg vector improvements
Posted by Richard Henderson 4 years, 11 months ago
On 4/29/19 12:28 PM, David Hildenbrand wrote:
> Hi Richard,
> 
> what are your plans with this series? (and shlv and friends?)
> 

I expect to submit them this week, barring any other comment on the patches
themselves.

r~