[RFC PATCH v4 00/44] Add LoongArch LSX instructions

Song Gao posted 44 patches 1 year ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20230425070248.2550028-1-gaosong@loongson.cn
Maintainers: Laurent Vivier <laurent@vivier.eu>, Song Gao <gaosong@loongson.cn>, Xiaojuan Yang <yangxiaojuan@loongson.cn>
There is a newer version of this series
[RFC PATCH v4 00/44] Add LoongArch LSX instructions
Posted by Song Gao 1 year ago
Hi,

This series adds LoongArch LSX instructions. Since LoongArch Vol2
is not yet public, we use the 'RFC' tag in the title.

I'm not sure when the manual will be made public.
Once these patches are reviewed, how about merging them?

About testing:
As of V2, we use RISU to test the LoongArch LSX instructions.

QEMU:
    https://github.com/loongson/qemu/tree/tcg-old-abi-support-lsx
RISU:
    https://github.com/loongson/risu/tree/loongarch-suport-lsx

Build test:
make docker-test-build@fedora-i386-cross

The following patches need to be reviewed:
  0001-target-loongarch-Add-LSX-data-type-VReg.patch
  0014-target-loongarch-Implement-vmul-vmuh-vmulw-ev-od.patch
  0030-target-loongarch-Implement-vpcnt.patch
  0034-target-loongarch-Implement-LSX-fpu-fcvt-instructions.patch
  0037-target-loongarch-Implement-vbitsel-vset.patch
  0041-target-loongarch-Implement-vld-vst.patch

V4:
  - R-b and rebase;
  - Migrate the upper half lsx regs;
  - Remove tcg_gen_mulus2_*;
  - vsetallnez uses !do_match2;
  - Use tcg_gen_concat_i64_i128/tcg_gen_extr_i128_i64 to replace
    TCGV128_LOW(val)/TCGV128_HIGH(val);
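
For readers unfamiliar with the last item, the idea of joining two 64-bit
halves into one 128-bit value and splitting it back can be modelled in plain
C. This is only an illustrative sketch: `u128_pair`, `concat_u64` and
`extract_u64` are hypothetical names, not QEMU's TCG API, and the sketch makes
no claim about how the TCG ops themselves are implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit value: two 64-bit halves. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} u128_pair;

/* Join a low and a high 64-bit half into one 128-bit value. */
static u128_pair concat_u64(uint64_t lo, uint64_t hi)
{
    u128_pair v = { .lo = lo, .hi = hi };
    return v;
}

/* Split a 128-bit value back into its two 64-bit halves. */
static void extract_u64(uint64_t *lo, uint64_t *hi, u128_pair v)
{
    assert(lo != NULL && hi != NULL);
    *lo = v.lo;
    *hi = v.hi;
}
```

Splitting a value produced by `concat_u64(lo, hi)` returns the same two halves,
which is the round-trip property the explicit concat/extract pair expresses.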

V3:
  - R-b;
  - Add unsigned data type in vreg;
  - Add ctx->vl;
  - Use tcg_constant_vec_matching instead of dupi;
  - Use __typeof(Vd->E(0)) instead of the output type;
  - Tcg integer expansion;
  - Use tcg_gen_qemu_ld/st_i128 to implement vld/vst;
  - Fix some typos;
  - Optimize code based on Richard's comments.
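
To illustrate why a vector register benefits from both signed and unsigned
element views, here is a minimal plain-C sketch. The union `vreg128` and the
helper `vsadd_bu_model` are hypothetical names chosen for this example; they
are not the series' actual VReg definition or helper code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 128-bit vector register with signed and unsigned lane views. */
typedef union {
    int8_t   B[16];
    int16_t  H[8];
    int32_t  W[4];
    int64_t  D[2];
    uint8_t  UB[16];
    uint16_t UH[8];
    uint32_t UW[4];
    uint64_t UD[2];
} vreg128;

/*
 * Unsigned saturating add on sixteen 8-bit lanes.
 * The unsigned view (UB) lets each lane be read without sign extension.
 */
static void vsadd_bu_model(vreg128 *vd, const vreg128 *vj, const vreg128 *vk)
{
    assert(vd != NULL && vj != NULL && vk != NULL);
    for (int i = 0; i < 16; i++) {
        unsigned sum = (unsigned)vj->UB[i] + (unsigned)vk->UB[i];
        vd->UB[i] = sum > UINT8_MAX ? UINT8_MAX : (uint8_t)sum;
    }
}
```

With only a signed view, each lane would need an explicit cast before the
addition; the unsigned array keeps that conversion in the type.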

V2:
  - Use gvec;
  - Fix instructions bugs;
  - Add set_fpr()/get_fpr() to replace cpu_fpr.


Song Gao (44):
  target/loongarch: Add LSX data type VReg
  target/loongarch: meson.build support build LSX
  target/loongarch: Add CHECK_SXE maccro for check LSX enable
  target/loongarch: Implement vadd/vsub
  target/loongarch: Implement vaddi/vsubi
  target/loongarch: Implement vneg
  target/loongarch: Implement vsadd/vssub
  target/loongarch: Implement vhaddw/vhsubw
  target/loongarch: Implement vaddw/vsubw
  target/loongarch: Implement vavg/vavgr
  target/loongarch: Implement vabsd
  target/loongarch: Implement vadda
  target/loongarch: Implement vmax/vmin
  target/loongarch: Implement vmul/vmuh/vmulw{ev/od}
  target/loongarch: Implement vmadd/vmsub/vmaddw{ev/od}
  target/loongarch: Implement vdiv/vmod
  target/loongarch: Implement vsat
  target/loongarch: Implement vexth
  target/loongarch: Implement vsigncov
  target/loongarch: Implement vmskltz/vmskgez/vmsknz
  target/loongarch: Implement LSX logic instructions
  target/loongarch: Implement vsll vsrl vsra vrotr
  target/loongarch: Implement vsllwil vextl
  target/loongarch: Implement vsrlr vsrar
  target/loongarch: Implement vsrln vsran
  target/loongarch: Implement vsrlrn vsrarn
  target/loongarch: Implement vssrln vssran
  target/loongarch: Implement vssrlrn vssrarn
  target/loongarch: Implement vclo vclz
  target/loongarch: Implement vpcnt
  target/loongarch: Implement vbitclr vbitset vbitrev
  target/loongarch: Implement vfrstp
  target/loongarch: Implement LSX fpu arith instructions
  target/loongarch: Implement LSX fpu fcvt instructions
  target/loongarch: Implement vseq vsle vslt
  target/loongarch: Implement vfcmp
  target/loongarch: Implement vbitsel vset
  target/loongarch: Implement vinsgr2vr vpickve2gr vreplgr2vr
  target/loongarch: Implement vreplve vpack vpick
  target/loongarch: Implement vilvl vilvh vextrins vshuf
  target/loongarch: Implement vld vst
  target/loongarch: Implement vldi
  target/loongarch: Use {set/get}_gpr replace to cpu_fpr
  target/loongarch: CPUCFG support LSX

 linux-user/loongarch64/signal.c               |    4 +-
 target/loongarch/cpu.c                        |    5 +-
 target/loongarch/cpu.h                        |   27 +-
 target/loongarch/disas.c                      |  911 ++++
 target/loongarch/fpu_helper.c                 |    2 +-
 target/loongarch/gdbstub.c                    |    4 +-
 target/loongarch/helper.h                     |  566 +++
 .../loongarch/insn_trans/trans_farith.c.inc   |   72 +-
 target/loongarch/insn_trans/trans_fcmp.c.inc  |   12 +-
 .../loongarch/insn_trans/trans_fmemory.c.inc  |   37 +-
 target/loongarch/insn_trans/trans_fmov.c.inc  |   31 +-
 target/loongarch/insn_trans/trans_lsx.c.inc   | 4400 +++++++++++++++++
 target/loongarch/insns.decode                 |  811 +++
 target/loongarch/internals.h                  |   23 +
 target/loongarch/lsx_helper.c                 | 3004 +++++++++++
 target/loongarch/machine.c                    |   79 +-
 target/loongarch/meson.build                  |    1 +
 target/loongarch/translate.c                  |   55 +-
 target/loongarch/translate.h                  |    1 +
 19 files changed, 9988 insertions(+), 57 deletions(-)
 create mode 100644 target/loongarch/insn_trans/trans_lsx.c.inc
 create mode 100644 target/loongarch/lsx_helper.c

-- 
2.31.1
Re: [RFC PATCH v4 00/44] Add LoongArch LSX instructions
Posted by Richard Henderson 1 year ago
On 4/25/23 08:02, Song Gao wrote:
> Hi,
> 
> This series adds LoongArch LSX instructions, Since the LoongArch
> Vol2 is not open, So we use 'RFC' title.
> 
> I'm not sure when the manual will be open.
> After these patches are reviewed, how about merging them?
> 
> About test:
> V2 we use RISU test the LoongArch LSX instructions.
> 
> QEMU:
>      https://github.com/loongson/qemu/tree/tcg-old-abi-support-lsx
> RISU:
>      https://github.com/loongson/risu/tree/loongarch-suport-lsx
> 
> Build test:
> make docker-test-build@fedora-i386-cross
> 
> The following patches need to be reviewed:
>    0001-target-loongarch-Add-LSX-data-type-VReg.patch
>    0014-target-loongarch-Implement-vmul-vmuh-vmulw-ev-od.patch
>    0030-target-loongarch-Implement-vpcnt.patch
>    0034-target-loongarch-Implement-LSX-fpu-fcvt-instructions.patch
>    0037-target-loongarch-Implement-vbitsel-vset.patch
>    0041-target-loongarch-Implement-vld-vst.patch
> 
> V4:
>    - R-b and rebase;
>    - Migrate the upper half lsx regs;
>    - Remove tcg_gen_mulus2_*;
>    - Vsetallnez use !do_match2;
>    - Use tcg_gen_concat_i64_i128/tcg_gen_extr_i128_i64 to replace
>      TCGV128_LOW(val)/TCGV128_High(val);

One minor nit, everything reviewed!  Congratulations.


r~
Re: [RFC PATCH v4 00/44] Add LoongArch LSX instructions
Posted by Song Gao 1 year ago
On 2023/5/2 2:43 AM, Richard Henderson wrote:
> On 4/25/23 08:02, Song Gao wrote:
>> Hi,
>>
>> This series adds LoongArch LSX instructions, Since the LoongArch
>> Vol2 is not open, So we use 'RFC' title.
>>
>> I'm not sure when the manual will be open.
>> After these patches are reviewed, how about merging them?
>>
>> About test:
>> V2 we use RISU test the LoongArch LSX instructions.
>>
>> QEMU:
>>      https://github.com/loongson/qemu/tree/tcg-old-abi-support-lsx
>> RISU:
>>      https://github.com/loongson/risu/tree/loongarch-suport-lsx
>>
>> Build test:
>> make docker-test-build@fedora-i386-cross
>>
>> The following patches need to be reviewed:
>>    0001-target-loongarch-Add-LSX-data-type-VReg.patch
>>    0014-target-loongarch-Implement-vmul-vmuh-vmulw-ev-od.patch
>>    0030-target-loongarch-Implement-vpcnt.patch
>>    0034-target-loongarch-Implement-LSX-fpu-fcvt-instructions.patch
>>    0037-target-loongarch-Implement-vbitsel-vset.patch
>>    0041-target-loongarch-Implement-vld-vst.patch
>>
>> V4:
>>    - R-b and rebase;
>>    - Migrate the upper half lsx regs;
>>    - Remove tcg_gen_mulus2_*;
>>    - Vsetallnez use !do_match2;
>>    - Use tcg_gen_concat_i64_i128/tcg_gen_extr_i128_i64 to replace
>>      TCGV128_LOW(val)/TCGV128_High(val);
>
> One minor nit, everything reviewed!  Congratulations.
>
Thank you for your guidance and review.

Since all patches are reviewed, how about dropping 'RFC' in v5?
I am really not sure when Vol2 will be made public.

Thanks.
Song Gao


Re: [RFC PATCH v4 00/44] Add LoongArch LSX instructions
Posted by Richard Henderson 1 year ago
On 5/4/23 02:26, Song Gao wrote:
>>> V4:
>>>    - R-b and rebase;
>>>    - Migrate the upper half lsx regs;
>>>    - Remove tcg_gen_mulus2_*;
>>>    - Vsetallnez use !do_match2;
>>>    - Use tcg_gen_concat_i64_i128/tcg_gen_extr_i128_i64 to replace
>>>      TCGV128_LOW(val)/TCGV128_High(val);
>>
>> One minor nit, everything reviewed!  Congratulations.
>>
> Thank you for your guidance and review.
> 
> Since all patches are reviewed, how about drop 'RFC' on v5?

Sure.

> I am  really not sure When the Vol2 will be open.

That is unfortunate.

I think the timing of the merge of this code is now up to you as maintainer of the 
Loongson target.  As an employee of the company you have more insight into the status of 
the extension, whether it is already present in shipping silicon, or completed final 
draft, or still beta, etc.

Even if the extension is finalized, I see no reason why you should not be able to merge, 
so long as the extension is off by default, to be enabled by a cpu property from the 
command-line.  We regularly do this for unratified RISCV extensions.


r~

Re: [RFC PATCH v4 00/44] Add LoongArch LSX instructions
Posted by Richard Henderson 1 year ago
On 5/4/23 09:25, Richard Henderson wrote:
> On 5/4/23 02:26, Song Gao wrote:
>>>> V4:
>>>>    - R-b and rebase;
>>>>    - Migrate the upper half lsx regs;
>>>>    - Remove tcg_gen_mulus2_*;
>>>>    - Vsetallnez use !do_match2;
>>>>    - Use tcg_gen_concat_i64_i128/tcg_gen_extr_i128_i64 to replace
>>>>      TCGV128_LOW(val)/TCGV128_High(val);
>>>
>>> One minor nit, everything reviewed!  Congratulations.
>>>
>> Thank you for your guidance and review.
>>
>> Since all patches are reviewed, how about drop 'RFC' on v5?
> 
> Sure.
> 
>> I am  really not sure When the Vol2 will be open.
> 
> That is unfortunate.
> 
> I think the timing of the merge of this code is now up to you as maintainer of the 
> Loongson target.  As an employee of the company you have more insight into the status of 
> the extension, whether it is already present in shipping silicon, or completed final 
> draft, or still beta, etc.
> 
> Even if the extension is finalized, I see no reason why you should not be able to merge, 

Oops.  "Even if the extension is *not* finalized..."
I.e., a beta extension can be merged if it is off by default.

r~

Re: [RFC PATCH v4 00/44] Add LoongArch LSX instructions
Posted by Song Gao 1 year ago
On 2023/5/4 5:41 PM, Richard Henderson wrote:
> On 5/4/23 09:25, Richard Henderson wrote:
>> On 5/4/23 02:26, Song Gao wrote:
>>>>> V4:
>>>>>    - R-b and rebase;
>>>>>    - Migrate the upper half lsx regs;
>>>>>    - Remove tcg_gen_mulus2_*;
>>>>>    - Vsetallnez use !do_match2;
>>>>>    - Use tcg_gen_concat_i64_i128/tcg_gen_extr_i128_i64 to replace
>>>>>      TCGV128_LOW(val)/TCGV128_High(val);
>>>>
>>>> One minor nit, everything reviewed!  Congratulations.
>>>>
>>> Thank you for your guidance and review.
>>>
>>> Since all patches are reviewed, how about drop 'RFC' on v5?
>>
>> Sure.
>>
>>> I am  really not sure When the Vol2 will be open.
>>
>> That is unfortunate.
>>
>> I think the timing of the merge of this code is now up to you as 
>> maintainer of the Loongson target.  As an employee of the company you 
>> have more insight into the status of the extension, whether it is 
>> already present in shipping silicon, or completed final draft, or 
>> still beta, etc.
>>
>> Even if the extension is finalized, I see no reason why you should 
>> not be able to merge, 
>
> Oops.  "Even if the extension is *not* finalized..."
> I.e., a beta extension can be merged if it is off by default. 
Thanks for your explanation.

The 'la464' supports LSX and LASX by default. The 'la364' supports LSX
by default.
Neither of these is a beta extension. They are already present in
shipping silicon and have been finalized.

The 'la664' also supports LSX and LASX by default.

I will send the v5 series.

Thanks.
Song Gao