:p
atchew
Login
v2: https://lists.nongnu.org/archive/html/qemu-devel/2022-01/msg02488.html v2 -> v3: Unify CC_OP_SLA_32 and CC_OP_SLA_64. Add underscores to test macro parameters. Shift CC in test asm. Add a second SLAG test. Add tags to commit messages. v1: https://lists.nongnu.org/archive/html/qemu-devel/2022-01/msg02035.html v1 -> v2: Fix cc_calc_sla_32(). Fix cc_calc_sla_64(). Fix SLDA sign bit index. Inline help_l2_shift(). Fix wout_r1_D32(). Add all shift instructions to the test. Split the series. Ilya Leoshkevich (5): target/s390x: Fix SLDA sign bit index target/s390x: Fix SRDA CC calculation target/s390x: Fix cc_calc_sla_64() missing overflows target/s390x: Fix shifting 32-bit values for more than 31 bits tests/tcg/s390x: Test shift instructions target/s390x/cpu-dump.c | 3 +- target/s390x/s390x-internal.h | 3 +- target/s390x/tcg/cc_helper.c | 39 +---- target/s390x/tcg/insn-data.def | 36 ++--- target/s390x/tcg/translate.c | 49 +++--- tests/tcg/s390x/Makefile.target | 1 + tests/tcg/s390x/shift.c | 270 ++++++++++++++++++++++++++++++++ 7 files changed, 318 insertions(+), 83 deletions(-) create mode 100644 tests/tcg/s390x/shift.c -- 2.31.1
SLDA operates on 64-bit values, so its sign bit index should be 63, not 31. Fixes: a79ba3398a0a ("target-s390: Convert SHIFT DOUBLE") Reported-by: David Hildenbrand <david@redhat.com> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> --- target/s390x/tcg/insn-data.def | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/target/s390x/tcg/insn-data.def b/target/s390x/tcg/insn-data.def index XXXXXXX..XXXXXXX 100644 --- a/target/s390x/tcg/insn-data.def +++ b/target/s390x/tcg/insn-data.def @@ -XXX,XX +XXX,XX @@ C(0xebde, SRLK, RSY_a, DO, r3_32u, sh32, new, r1_32, srl, 0) C(0xeb0c, SRLG, RSY_a, Z, r3_o, sh64, r1, 0, srl, 0) /* SHIFT LEFT DOUBLE */ - D(0x8f00, SLDA, RS_a, Z, r1_D32, sh64, new, r1_D32, sla, 0, 31) + D(0x8f00, SLDA, RS_a, Z, r1_D32, sh64, new, r1_D32, sla, 0, 63) /* SHIFT LEFT DOUBLE LOGICAL */ C(0x8d00, SLDL, RS_a, Z, r1_D32, sh64, new, r1_D32, sll, 0) /* SHIFT RIGHT DOUBLE */ -- 2.31.1
SRDA uses r1_D32 for binding the first operand and s64 for setting CC. cout_s64() relies on o->out being the shift result, however, wout_r1_D32() clobbers it. Fix by using a temporary. Fixes: a79ba3398a0a ("target-s390: Convert SHIFT DOUBLE") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> --- target/s390x/tcg/translate.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c index XXXXXXX..XXXXXXX 100644 --- a/target/s390x/tcg/translate.c +++ b/target/s390x/tcg/translate.c @@ -XXX,XX +XXX,XX @@ static void wout_r1_P32(DisasContext *s, DisasOps *o) static void wout_r1_D32(DisasContext *s, DisasOps *o) { int r1 = get_field(s, r1); + TCGv_i64 t = tcg_temp_new_i64(); store_reg32_i64(r1 + 1, o->out); - tcg_gen_shri_i64(o->out, o->out, 32); - store_reg32_i64(r1, o->out); + tcg_gen_shri_i64(t, o->out, 32); + store_reg32_i64(r1, t); } #define SPEC_wout_r1_D32 SPEC_r1_even -- 2.31.1
An overflow occurs for SLAG when at least one shifted bit is not equal to sign bit. Therefore, we need to check that `shift + 1` bits are neither all 0s nor all 1s. The current code checks only `shift` bits, missing some overflows. Fixes: cbe24bfa91d2 ("target-s390: Convert SHIFT, ROTATE SINGLE") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> --- target/s390x/tcg/cc_helper.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/target/s390x/tcg/cc_helper.c b/target/s390x/tcg/cc_helper.c index XXXXXXX..XXXXXXX 100644 --- a/target/s390x/tcg/cc_helper.c +++ b/target/s390x/tcg/cc_helper.c @@ -XXX,XX +XXX,XX @@ static uint32_t cc_calc_sla_32(uint32_t src, int shift) static uint32_t cc_calc_sla_64(uint64_t src, int shift) { - uint64_t mask = ((1ULL << shift) - 1ULL) << (64 - shift); + /* Do not use (1ULL << (shift + 1)): it triggers UB when shift is 63. */ + uint64_t mask = ((((1ULL << shift) - 1) << 1) + 1) << (64 - (shift + 1)); uint64_t sign = 1ULL << 63; uint64_t match; int64_t r; -- 2.31.1
According to PoP, both 32- and 64-bit shifts use lowest 6 address bits. The current code special-cases 32-bit shifts to use only 5 bits, which is not correct. For example, shifting by 32 bits currently preserves the initial value, however, it's supposed zero it out instead. Fix by merging sh32 and sh64 and adapting CC calculation to shift values greater than 31. Fixes: cbe24bfa91d2 ("target-s390: Convert SHIFT, ROTATE SINGLE") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> --- target/s390x/cpu-dump.c | 3 +-- target/s390x/s390x-internal.h | 3 +-- target/s390x/tcg/cc_helper.c | 36 +++------------------------- target/s390x/tcg/insn-data.def | 36 ++++++++++++++-------------- target/s390x/tcg/translate.c | 44 +++++++++++++++------------------- 5 files changed, 42 insertions(+), 80 deletions(-) diff --git a/target/s390x/cpu-dump.c b/target/s390x/cpu-dump.c index XXXXXXX..XXXXXXX 100644 --- a/target/s390x/cpu-dump.c +++ b/target/s390x/cpu-dump.c @@ -XXX,XX +XXX,XX @@ const char *cc_name(enum cc_op cc_op) [CC_OP_NZ_F64] = "CC_OP_NZ_F64", [CC_OP_NZ_F128] = "CC_OP_NZ_F128", [CC_OP_ICM] = "CC_OP_ICM", - [CC_OP_SLA_32] = "CC_OP_SLA_32", - [CC_OP_SLA_64] = "CC_OP_SLA_64", + [CC_OP_SLA] = "CC_OP_SLA", [CC_OP_FLOGR] = "CC_OP_FLOGR", [CC_OP_LCBB] = "CC_OP_LCBB", [CC_OP_VC] = "CC_OP_VC", diff --git a/target/s390x/s390x-internal.h b/target/s390x/s390x-internal.h index XXXXXXX..XXXXXXX 100644 --- a/target/s390x/s390x-internal.h +++ b/target/s390x/s390x-internal.h @@ -XXX,XX +XXX,XX @@ enum cc_op { CC_OP_NZ_F128, /* FP dst != 0 (128bit) */ CC_OP_ICM, /* insert characters under mask */ - CC_OP_SLA_32, /* Calculate shift left signed (32bit) */ - CC_OP_SLA_64, /* Calculate shift left signed (64bit) */ + CC_OP_SLA, /* Calculate shift left signed */ CC_OP_FLOGR, /* find leftmost one */ CC_OP_LCBB, /* load count to block boundary */ CC_OP_VC, /* vector compare result */ diff --git a/target/s390x/tcg/cc_helper.c b/target/s390x/tcg/cc_helper.c index XXXXXXX..XXXXXXX 100644 --- a/target/s390x/tcg/cc_helper.c +++ b/target/s390x/tcg/cc_helper.c @@ -XXX,XX +XXX,XX @@ static uint32_t cc_calc_icm(uint64_t mask, uint64_t val) } } -static uint32_t cc_calc_sla_32(uint32_t src, int shift) -{ - uint32_t mask = ((1U << shift) - 1U) << (32 - shift); - uint32_t sign = 1U << 31; - uint32_t match; - int32_t r; - - /* Check if the sign bit stays the same. */ - if (src & sign) { - match = mask; - } else { - match = 0; - } - if ((src & mask) != match) { - /* Overflow. */ - return 3; - } - - r = ((src << shift) & ~sign) | (src & sign); - if (r == 0) { - return 0; - } else if (r < 0) { - return 1; - } - return 2; -} - -static uint32_t cc_calc_sla_64(uint64_t src, int shift) +static uint32_t cc_calc_sla(uint64_t src, int shift) { /* Do not use (1ULL << (shift + 1)): it triggers UB when shift is 63. */ uint64_t mask = ((((1ULL << shift) - 1) << 1) + 1) << (64 - (shift + 1)); @@ -XXX,XX +XXX,XX @@ static uint32_t do_calc_cc(CPUS390XState *env, uint32_t cc_op, case CC_OP_ICM: r = cc_calc_icm(src, dst); break; - case CC_OP_SLA_32: - r = cc_calc_sla_32(src, dst); - break; - case CC_OP_SLA_64: - r = cc_calc_sla_64(src, dst); + case CC_OP_SLA: + r = cc_calc_sla(src, dst); break; case CC_OP_FLOGR: r = cc_calc_flogr(dst); diff --git a/target/s390x/tcg/insn-data.def b/target/s390x/tcg/insn-data.def index XXXXXXX..XXXXXXX 100644 --- a/target/s390x/tcg/insn-data.def +++ b/target/s390x/tcg/insn-data.def @@ -XXX,XX +XXX,XX @@ C(0xb9e1, POPCNT, RRE, PC, 0, r2_o, r1, 0, popcnt, nz64) /* ROTATE LEFT SINGLE LOGICAL */ - C(0xeb1d, RLL, RSY_a, Z, r3_o, sh32, new, r1_32, rll32, 0) - C(0xeb1c, RLLG, RSY_a, Z, r3_o, sh64, r1, 0, rll64, 0) + C(0xeb1d, RLL, RSY_a, Z, r3_o, sh, new, r1_32, rll32, 0) + C(0xeb1c, RLLG, RSY_a, Z, r3_o, sh, r1, 0, rll64, 0) /* ROTATE THEN INSERT SELECTED BITS */ C(0xec55, RISBG, RIE_f, GIE, 0, r2, r1, 0, risbg, s64) @@ -XXX,XX +XXX,XX @@ C(0x0400, SPM, RR_a, Z, r1, 0, 0, 0, spm, 0) /* SHIFT LEFT SINGLE */ - D(0x8b00, SLA, RS_a, Z, r1, sh32, new, r1_32, sla, 0, 31) - D(0xebdd, SLAK, RSY_a, DO, r3, sh32, new, r1_32, sla, 0, 31) - D(0xeb0b, SLAG, RSY_a, Z, r3, sh64, r1, 0, sla, 0, 63) + D(0x8b00, SLA, RS_a, Z, r1, sh, new, r1_32, sla, 0, 31) + D(0xebdd, SLAK, RSY_a, DO, r3, sh, new, r1_32, sla, 0, 31) + D(0xeb0b, SLAG, RSY_a, Z, r3, sh, r1, 0, sla, 0, 63) /* SHIFT LEFT SINGLE LOGICAL */ - C(0x8900, SLL, RS_a, Z, r1_o, sh32, new, r1_32, sll, 0) - C(0xebdf, SLLK, RSY_a, DO, r3_o, sh32, new, r1_32, sll, 0) - C(0xeb0d, SLLG, RSY_a, Z, r3_o, sh64, r1, 0, sll, 0) + C(0x8900, SLL, RS_a, Z, r1_o, sh, new, r1_32, sll, 0) + C(0xebdf, SLLK, RSY_a, DO, r3_o, sh, new, r1_32, sll, 0) + C(0xeb0d, SLLG, RSY_a, Z, r3_o, sh, r1, 0, sll, 0) /* SHIFT RIGHT SINGLE */ - C(0x8a00, SRA, RS_a, Z, r1_32s, sh32, new, r1_32, sra, s32) - C(0xebdc, SRAK, RSY_a, DO, r3_32s, sh32, new, r1_32, sra, s32) - C(0xeb0a, SRAG, RSY_a, Z, r3_o, sh64, r1, 0, sra, s64) + C(0x8a00, SRA, RS_a, Z, r1_32s, sh, new, r1_32, sra, s32) + C(0xebdc, SRAK, RSY_a, DO, r3_32s, sh, new, r1_32, sra, s32) + C(0xeb0a, SRAG, RSY_a, Z, r3_o, sh, r1, 0, sra, s64) /* SHIFT RIGHT SINGLE LOGICAL */ - C(0x8800, SRL, RS_a, Z, r1_32u, sh32, new, r1_32, srl, 0) - C(0xebde, SRLK, RSY_a, DO, r3_32u, sh32, new, r1_32, srl, 0) - C(0xeb0c, SRLG, RSY_a, Z, r3_o, sh64, r1, 0, srl, 0) + C(0x8800, SRL, RS_a, Z, r1_32u, sh, new, r1_32, srl, 0) + C(0xebde, SRLK, RSY_a, DO, r3_32u, sh, new, r1_32, srl, 0) + C(0xeb0c, SRLG, RSY_a, Z, r3_o, sh, r1, 0, srl, 0) /* SHIFT LEFT DOUBLE */ - D(0x8f00, SLDA, RS_a, Z, r1_D32, sh64, new, r1_D32, sla, 0, 63) + D(0x8f00, SLDA, RS_a, Z, r1_D32, sh, new, r1_D32, sla, 0, 63) /* SHIFT LEFT DOUBLE LOGICAL */ - C(0x8d00, SLDL, RS_a, Z, r1_D32, sh64, new, r1_D32, sll, 0) + C(0x8d00, SLDL, RS_a, Z, r1_D32, sh, new, r1_D32, sll, 0) /* SHIFT RIGHT DOUBLE */ - C(0x8e00, SRDA, RS_a, Z, r1_D32, sh64, new, r1_D32, sra, s64) + C(0x8e00, SRDA, RS_a, Z, r1_D32, sh, new, r1_D32, sra, s64) /* SHIFT RIGHT DOUBLE LOGICAL */ - C(0x8c00, SRDL, RS_a, Z, r1_D32, sh64, new, r1_D32, srl, 0) + C(0x8c00, SRDL, RS_a, Z, r1_D32, sh, new, r1_D32, srl, 0) /* SQUARE ROOT */ F(0xb314, SQEBR, RRE, Z, 0, e2, new, e1, sqeb, 0, IF_BFP) diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c index XXXXXXX..XXXXXXX 100644 --- a/target/s390x/tcg/translate.c +++ b/target/s390x/tcg/translate.c @@ -XXX,XX +XXX,XX @@ static void gen_op_calc_cc(DisasContext *s) case CC_OP_LTUGTU_64: case CC_OP_TM_32: case CC_OP_TM_64: - case CC_OP_SLA_32: - case CC_OP_SLA_64: + case CC_OP_SLA: case CC_OP_SUBU: case CC_OP_NZ_F128: case CC_OP_VC: @@ -XXX,XX +XXX,XX @@ struct DisasInsn { /* ====================================================================== */ /* Miscellaneous helpers, used by several operations. */ -static void help_l2_shift(DisasContext *s, DisasOps *o, int mask) -{ - int b2 = get_field(s, b2); - int d2 = get_field(s, d2); - - if (b2 == 0) { - o->in2 = tcg_const_i64(d2 & mask); - } else { - o->in2 = get_address(s, 0, b2, d2); - tcg_gen_andi_i64(o->in2, o->in2, mask); - } -} - static DisasJumpType help_goto_direct(DisasContext *s, uint64_t dest) { if (dest == s->pc_tmp) { @@ -XXX,XX +XXX,XX @@ static DisasJumpType op_soc(DisasContext *s, DisasOps *o) static DisasJumpType op_sla(DisasContext *s, DisasOps *o) { + TCGv_i64 t; uint64_t sign = 1ull << s->insn->data; - enum cc_op cco = s->insn->data == 31 ? CC_OP_SLA_32 : CC_OP_SLA_64; - gen_op_update2_cc_i64(s, cco, o->in1, o->in2); + if (s->insn->data == 31) { + t = tcg_temp_new_i64(); + tcg_gen_shli_i64(t, o->in1, 32); + } else { + t = o->in1; + } + gen_op_update2_cc_i64(s, CC_OP_SLA, t, o->in2); tcg_gen_shl_i64(o->out, o->in1, o->in2); /* The arithmetic left shift is curious in that it does not affect the sign bit. Copy that over from the source unchanged. */ @@ -XXX,XX +XXX,XX @@ static void in2_ri2(DisasContext *s, DisasOps *o) } #define SPEC_in2_ri2 0 -static void in2_sh32(DisasContext *s, DisasOps *o) +static void in2_sh(DisasContext *s, DisasOps *o) { - help_l2_shift(s, o, 31); -} -#define SPEC_in2_sh32 0 + int b2 = get_field(s, b2); + int d2 = get_field(s, d2); -static void in2_sh64(DisasContext *s, DisasOps *o) -{ - help_l2_shift(s, o, 63); + if (b2 == 0) { + o->in2 = tcg_const_i64(d2 & 0x3f); + } else { + o->in2 = get_address(s, 0, b2, d2); + tcg_gen_andi_i64(o->in2, o->in2, 0x3f); + } } -#define SPEC_in2_sh64 0 +#define SPEC_in2_sh 0 static void in2_m2_8u(DisasContext *s, DisasOps *o) { -- 2.31.1
Add a test for each shift instruction in order to to prevent regressions. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> --- tests/tcg/s390x/Makefile.target | 1 + tests/tcg/s390x/shift.c | 270 ++++++++++++++++++++++++++++++++ 2 files changed, 271 insertions(+) create mode 100644 tests/tcg/s390x/shift.c diff --git a/tests/tcg/s390x/Makefile.target b/tests/tcg/s390x/Makefile.target index XXXXXXX..XXXXXXX 100644 --- a/tests/tcg/s390x/Makefile.target +++ b/tests/tcg/s390x/Makefile.target @@ -XXX,XX +XXX,XX @@ TESTS+=exrl-trtr TESTS+=pack TESTS+=mvo TESTS+=mvc +TESTS+=shift TESTS+=trap TESTS+=signals-s390x diff --git a/tests/tcg/s390x/shift.c b/tests/tcg/s390x/shift.c new file mode 100644 index XXXXXXX..XXXXXXX --- /dev/null +++ b/tests/tcg/s390x/shift.c @@ -XXX,XX +XXX,XX @@ +#include <inttypes.h> +#include <stdint.h> +#include <stdio.h> + +#define DEFINE_SHIFT_SINGLE_COMMON(_name, _insn_str) \ + static uint64_t _name(uint64_t op1, uint64_t op2, uint64_t *cc) \ + { \ + asm(" sll %[cc],28\n" \ + " spm %[cc]\n" \ + " " _insn_str "\n" \ + " ipm %[cc]\n" \ + " srl %[cc],28" \ + : [op1] "+&r" (op1), \ + [cc] "+&r" (*cc) \ + : [op2] "r" (op2) \ + : "cc"); \ + return op1; \ + } +#define DEFINE_SHIFT_SINGLE_2(_insn, _offset) \ + DEFINE_SHIFT_SINGLE_COMMON(_insn ## _ ## _offset, \ + #_insn " %[op1]," #_offset "(%[op2])") +#define DEFINE_SHIFT_SINGLE_3(_insn, _offset) \ + DEFINE_SHIFT_SINGLE_COMMON(_insn ## _ ## _offset, \ + #_insn " %[op1],%[op1]," #_offset "(%[op2])") +#define DEFINE_SHIFT_DOUBLE(_insn, _offset) \ + static uint64_t _insn ## _ ## _offset(uint64_t op1, uint64_t op2, \ + uint64_t *cc) \ + { \ + uint32_t op1h = op1 >> 32; \ + uint32_t op1l = op1 & 0xffffffff; \ + register uint32_t r2 asm("2") = op1h; \ + register uint32_t r3 asm("3") = op1l; \ + \ + asm(" sll %[cc],28\n" \ + " spm %[cc]\n" \ + " " #_insn " %[r2]," #_offset "(%[op2])\n" \ + " ipm %[cc]\n" \ + " srl %[cc],28" \ + : [r2] "+&r" (r2), \ + [r3] "+&r" (r3), \ + [cc] "+&r" (*cc) \ + : [op2] "r" (op2) \ + : "cc"); \ + op1h = r2; \ + op1l = r3; \ + return (((uint64_t)op1h) << 32) | op1l; \ + } + +DEFINE_SHIFT_SINGLE_3(rll, 0x4cf3b); +DEFINE_SHIFT_SINGLE_3(rllg, 0x697c9); +DEFINE_SHIFT_SINGLE_2(sla, 0x4b0); +DEFINE_SHIFT_SINGLE_2(sla, 0xd54); +DEFINE_SHIFT_SINGLE_3(slak, 0x2832c); +DEFINE_SHIFT_SINGLE_3(slag, 0x66cc4); +DEFINE_SHIFT_SINGLE_3(slag, 0xd54); +DEFINE_SHIFT_SINGLE_2(sll, 0xd04); +DEFINE_SHIFT_SINGLE_3(sllk, 0x2699f); +DEFINE_SHIFT_SINGLE_3(sllg, 0x59df9); +DEFINE_SHIFT_SINGLE_2(sra, 0x67e); +DEFINE_SHIFT_SINGLE_3(srak, 0x60943); +DEFINE_SHIFT_SINGLE_3(srag, 0x6b048); +DEFINE_SHIFT_SINGLE_2(srl, 0x035); +DEFINE_SHIFT_SINGLE_3(srlk, 0x43dfc); +DEFINE_SHIFT_SINGLE_3(srlg, 0x27227); +DEFINE_SHIFT_DOUBLE(slda, 0x38b); +DEFINE_SHIFT_DOUBLE(sldl, 0x031); +DEFINE_SHIFT_DOUBLE(srda, 0x36f); +DEFINE_SHIFT_DOUBLE(srdl, 0x99a); + +struct shift_test { + const char *name; + uint64_t (*insn)(uint64_t, uint64_t, uint64_t *); + uint64_t op1; + uint64_t op2; + uint64_t exp_result; + uint64_t exp_cc; +}; + +static const struct shift_test tests[] = { + { + .name = "rll", + .insn = rll_0x4cf3b, + .op1 = 0xecbd589a45c248f5ull, + .op2 = 0x62e5508ccb4c99fdull, + .exp_result = 0xecbd589af545c248ull, + .exp_cc = 0, + }, + { + .name = "rllg", + .insn = rllg_0x697c9, + .op1 = 0xaa2d54c1b729f7f4ull, + .op2 = 0x5ffcf7465f5cd71full, + .exp_result = 0x29f7f4aa2d54c1b7ull, + .exp_cc = 0, + }, + { + .name = "sla-1", + .insn = sla_0x4b0, + .op1 = 0x8bf21fb67cca0e96ull, + .op2 = 0x3ddf2f53347d3030ull, + .exp_result = 0x8bf21fb600000000ull, + .exp_cc = 3, + }, + { + .name = "sla-2", + .insn = sla_0xd54, + .op1 = 0xe4faaed5def0e926ull, + .op2 = 0x18d586fab239cbeeull, + .exp_result = 0xe4faaed5fbc3a498ull, + .exp_cc = 3, + }, + { + .name = "slak", + .insn = slak_0x2832c, + .op1 = 0x7300bf78707f09f9ull, + .op2 = 0x4d193b85bb5cb39bull, + .exp_result = 0x7300bf783f84fc80ull, + .exp_cc = 3, + }, + { + .name = "slag-1", + .insn = slag_0x66cc4, + .op1 = 0xe805966de1a77762ull, + .op2 = 0x0e92953f6aa91c6bull, + .exp_result = 0xbbb1000000000000ull, + .exp_cc = 3, + }, + { + .name = "slag-2", + .insn = slag_0xd54, + .op1 = 0xdef0e92600000000ull, + .op2 = 0x18d586fab239cbeeull, + .exp_result = 0xfbc3a49800000000ull, + .exp_cc = 3, + }, + { + .name = "sll", + .insn = sll_0xd04, + .op1 = 0xb90281a3105939dfull, + .op2 = 0xb5e4df7e082e4c5eull, + .exp_result = 0xb90281a300000000ull, + .exp_cc = 0, + }, + { + .name = "sllk", + .insn = sllk_0x2699f, + .op1 = 0x777c6cf116f99557ull, + .op2 = 0xe0556cf112e5a458ull, + .exp_result = 0x777c6cf100000000ull, + .exp_cc = 0, + }, + { + .name = "sllg", + .insn = sllg_0x59df9, + .op1 = 0xcdf86cbfbc0f3557ull, + .op2 = 0x325a45acf99c6d3dull, + .exp_result = 0x55c0000000000000ull, + .exp_cc = 0, + }, + { + .name = "sra", + .insn = sra_0x67e, + .op1 = 0xb878f048d5354183ull, + .op2 = 0x9e27d13195931f79ull, + .exp_result = 0xb878f048ffffffffull, + .exp_cc = 1, + }, + { + .name = "srak", + .insn = srak_0x60943, + .op1 = 0xb6ceb5a429cedb35ull, + .op2 = 0x352354900ae34d7aull, + .exp_result = 0xb6ceb5a400000000ull, + .exp_cc = 0, + }, + { + .name = "srag", + .insn = srag_0x6b048, + .op1 = 0xd54dd4468676c63bull, + .op2 = 0x84d026db7b4dca28ull, + .exp_result = 0xffffffffffffd54dull, + .exp_cc = 1, + }, + { + .name = "srl", + .insn = srl_0x035, + .op1 = 0x09be503ef826815full, + .op2 = 0xbba8d1a0e542d5c1ull, + .exp_result = 0x9be503e00000000ull, + .exp_cc = 0, + }, + { + .name = "srlk", + .insn = srlk_0x43dfc, + .op1 = 0x540d6c8de71aee2aull, + .op2 = 0x0000000000000000ull, + .exp_result = 0x540d6c8d00000000ull, + .exp_cc = 0, + }, + { + .name = "srlg", + .insn = srlg_0x27227, + .op1 = 0x26f7123c1c447a34ull, + .op2 = 0x0000000000000000ull, + .exp_result = 0x00000000004dee24ull, + .exp_cc = 0, + }, + { + .name = "slda", + .insn = slda_0x38b, + .op1 = 0x7988f722dd5bbe7cull, + .op2 = 0x9aed3f95b4d78cc2ull, + .exp_result = 0x1ee45bab77cf8000ull, + .exp_cc = 3, + }, + { + .name = "sldl", + .insn = sldl_0x031, + .op1 = 0xaae2918dce2b049aull, + .op2 = 0x0000000000000000ull, + .exp_result = 0x0934000000000000ull, + .exp_cc = 0, + }, + { + .name = "srda", + .insn = srda_0x36f, + .op1 = 0x0cd4ed9228a50978ull, + .op2 = 0x72b046f0848b8cc9ull, + .exp_result = 0x000000000000000cull, + .exp_cc = 2, + }, + { + .name = "srdl", + .insn = srdl_0x99a, + .op1 = 0x1018611c41689a1dull, + .op2 = 0x2907e150c50ba319ull, + .exp_result = 0x0000000000000203ull, + .exp_cc = 0, + }, +}; + +int main(void) +{ + int ret = 0; + size_t i; + + for (i = 0; i < sizeof(tests) / sizeof(tests[0]); i++) { + uint64_t result; + uint64_t cc = 0; + + result = tests[i].insn(tests[i].op1, tests[i].op2, &cc); + if (result != tests[i].exp_result) { + fprintf(stderr, + "bad %s result:\n" + "actual = 0x%" PRIx64 "\n" + "expected = 0x%" PRIx64 "\n", + tests[i].name, result, tests[i].exp_result); + ret = 1; + } + if (cc != tests[i].exp_cc) { + fprintf(stderr, + "bad %s cc:\n" + "actual = %" PRIu64 "\n" + "expected = %" PRIu64 "\n", + tests[i].name, cc, tests[i].exp_cc); + ret = 1; + } + } + return ret; +} -- 2.31.1
v3: https://lists.nongnu.org/archive/html/qemu-devel/2022-01/msg02680.html v3 -> v4: Simplify cc_calc_sla(). Free temporaries. v2: https://lists.nongnu.org/archive/html/qemu-devel/2022-01/msg02488.html v2 -> v3: Unify CC_OP_SLA_32 and CC_OP_SLA_64. Add underscores to test macro parameters. Shift CC in test asm. Add a second SLAG test. Add tags to commit messages. v1: https://lists.nongnu.org/archive/html/qemu-devel/2022-01/msg02035.html v1 -> v2: Fix cc_calc_sla_32(). Fix cc_calc_sla_64(). Fix SLDA sign bit index. Inline help_l2_shift(). Fix wout_r1_D32(). Add all shift instructions to the test. Split the series. Ilya Leoshkevich (5): target/s390x: Fix SLDA sign bit index target/s390x: Fix SRDA CC calculation target/s390x: Fix cc_calc_sla_64() missing overflows target/s390x: Fix shifting 32-bit values for more than 31 bits tests/tcg/s390x: Test shift instructions target/s390x/cpu-dump.c | 3 +- target/s390x/s390x-internal.h | 3 +- target/s390x/tcg/cc_helper.c | 38 +---- target/s390x/tcg/insn-data.def | 36 ++--- target/s390x/tcg/translate.c | 53 +++---- tests/tcg/s390x/Makefile.target | 1 + tests/tcg/s390x/shift.c | 270 ++++++++++++++++++++++++++++++++ 7 files changed, 321 insertions(+), 83 deletions(-) create mode 100644 tests/tcg/s390x/shift.c -- 2.31.1
SLDA operates on 64-bit values, so its sign bit index should be 63, not 31. Fixes: a79ba3398a0a ("target-s390: Convert SHIFT DOUBLE") Reported-by: David Hildenbrand <david@redhat.com> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> --- target/s390x/tcg/insn-data.def | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/target/s390x/tcg/insn-data.def b/target/s390x/tcg/insn-data.def index XXXXXXX..XXXXXXX 100644 --- a/target/s390x/tcg/insn-data.def +++ b/target/s390x/tcg/insn-data.def @@ -XXX,XX +XXX,XX @@ C(0xebde, SRLK, RSY_a, DO, r3_32u, sh32, new, r1_32, srl, 0) C(0xeb0c, SRLG, RSY_a, Z, r3_o, sh64, r1, 0, srl, 0) /* SHIFT LEFT DOUBLE */ - D(0x8f00, SLDA, RS_a, Z, r1_D32, sh64, new, r1_D32, sla, 0, 31) + D(0x8f00, SLDA, RS_a, Z, r1_D32, sh64, new, r1_D32, sla, 0, 63) /* SHIFT LEFT DOUBLE LOGICAL */ C(0x8d00, SLDL, RS_a, Z, r1_D32, sh64, new, r1_D32, sll, 0) /* SHIFT RIGHT DOUBLE */ -- 2.31.1
SRDA uses r1_D32 for binding the first operand and s64 for setting CC. cout_s64() relies on o->out being the shift result, however, wout_r1_D32() clobbers it. Fix by using a temporary. Fixes: a79ba3398a0a ("target-s390: Convert SHIFT DOUBLE") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> --- target/s390x/tcg/translate.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c index XXXXXXX..XXXXXXX 100644 --- a/target/s390x/tcg/translate.c +++ b/target/s390x/tcg/translate.c @@ -XXX,XX +XXX,XX @@ static void wout_r1_P32(DisasContext *s, DisasOps *o) static void wout_r1_D32(DisasContext *s, DisasOps *o) { int r1 = get_field(s, r1); + TCGv_i64 t = tcg_temp_new_i64(); store_reg32_i64(r1 + 1, o->out); - tcg_gen_shri_i64(o->out, o->out, 32); - store_reg32_i64(r1, o->out); + tcg_gen_shri_i64(t, o->out, 32); + store_reg32_i64(r1, t); + tcg_temp_free_i64(t); } #define SPEC_wout_r1_D32 SPEC_r1_even -- 2.31.1
An overflow occurs for SLAG when at least one shifted bit is not equal to sign bit. Therefore, we need to check that `shift + 1` bits are neither all 0s nor all 1s. The current code checks only `shift` bits, missing some overflows. Fixes: cbe24bfa91d2 ("target-s390: Convert SHIFT, ROTATE SINGLE") Co-developed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> --- target/s390x/tcg/cc_helper.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/target/s390x/tcg/cc_helper.c b/target/s390x/tcg/cc_helper.c index XXXXXXX..XXXXXXX 100644 --- a/target/s390x/tcg/cc_helper.c +++ b/target/s390x/tcg/cc_helper.c @@ -XXX,XX +XXX,XX @@ static uint32_t cc_calc_sla_32(uint32_t src, int shift) static uint32_t cc_calc_sla_64(uint64_t src, int shift) { - uint64_t mask = ((1ULL << shift) - 1ULL) << (64 - shift); + uint64_t mask = -1ULL << (63 - shift); uint64_t sign = 1ULL << 63; uint64_t match; int64_t r; -- 2.31.1
According to PoP, both 32- and 64-bit shifts use lowest 6 address bits. The current code special-cases 32-bit shifts to use only 5 bits, which is not correct. For example, shifting by 32 bits currently preserves the initial value, however, it's supposed zero it out instead. Fix by merging sh32 and sh64 and adapting CC calculation to shift values greater than 31. Fixes: cbe24bfa91d2 ("target-s390: Convert SHIFT, ROTATE SINGLE") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> --- target/s390x/cpu-dump.c | 3 +-- target/s390x/s390x-internal.h | 3 +-- target/s390x/tcg/cc_helper.c | 36 +++----------------------- target/s390x/tcg/insn-data.def | 36 +++++++++++++------------- target/s390x/tcg/translate.c | 47 ++++++++++++++++------------------ 5 files changed, 45 insertions(+), 80 deletions(-) diff --git a/target/s390x/cpu-dump.c b/target/s390x/cpu-dump.c index XXXXXXX..XXXXXXX 100644 --- a/target/s390x/cpu-dump.c +++ b/target/s390x/cpu-dump.c @@ -XXX,XX +XXX,XX @@ const char *cc_name(enum cc_op cc_op) [CC_OP_NZ_F64] = "CC_OP_NZ_F64", [CC_OP_NZ_F128] = "CC_OP_NZ_F128", [CC_OP_ICM] = "CC_OP_ICM", - [CC_OP_SLA_32] = "CC_OP_SLA_32", - [CC_OP_SLA_64] = "CC_OP_SLA_64", + [CC_OP_SLA] = "CC_OP_SLA", [CC_OP_FLOGR] = "CC_OP_FLOGR", [CC_OP_LCBB] = "CC_OP_LCBB", [CC_OP_VC] = "CC_OP_VC", diff --git a/target/s390x/s390x-internal.h b/target/s390x/s390x-internal.h index XXXXXXX..XXXXXXX 100644 --- a/target/s390x/s390x-internal.h +++ b/target/s390x/s390x-internal.h @@ -XXX,XX +XXX,XX @@ enum cc_op { CC_OP_NZ_F128, /* FP dst != 0 (128bit) */ CC_OP_ICM, /* insert characters under mask */ - CC_OP_SLA_32, /* Calculate shift left signed (32bit) */ - CC_OP_SLA_64, /* Calculate shift left signed (64bit) */ + CC_OP_SLA, /* Calculate shift left signed */ CC_OP_FLOGR, /* find leftmost one */ CC_OP_LCBB, /* load count to block boundary */ CC_OP_VC, /* vector compare result */ diff --git a/target/s390x/tcg/cc_helper.c b/target/s390x/tcg/cc_helper.c index XXXXXXX..XXXXXXX 100644 --- a/target/s390x/tcg/cc_helper.c +++ b/target/s390x/tcg/cc_helper.c @@ -XXX,XX +XXX,XX @@ static uint32_t cc_calc_icm(uint64_t mask, uint64_t val) } } -static uint32_t cc_calc_sla_32(uint32_t src, int shift) -{ - uint32_t mask = ((1U << shift) - 1U) << (32 - shift); - uint32_t sign = 1U << 31; - uint32_t match; - int32_t r; - - /* Check if the sign bit stays the same. */ - if (src & sign) { - match = mask; - } else { - match = 0; - } - if ((src & mask) != match) { - /* Overflow. */ - return 3; - } - - r = ((src << shift) & ~sign) | (src & sign); - if (r == 0) { - return 0; - } else if (r < 0) { - return 1; - } - return 2; -} - -static uint32_t cc_calc_sla_64(uint64_t src, int shift) +static uint32_t cc_calc_sla(uint64_t src, int shift) { uint64_t mask = -1ULL << (63 - shift); uint64_t sign = 1ULL << 63; @@ -XXX,XX +XXX,XX @@ static uint32_t do_calc_cc(CPUS390XState *env, uint32_t cc_op, case CC_OP_ICM: r = cc_calc_icm(src, dst); break; - case CC_OP_SLA_32: - r = cc_calc_sla_32(src, dst); - break; - case CC_OP_SLA_64: - r = cc_calc_sla_64(src, dst); + case CC_OP_SLA: + r = cc_calc_sla(src, dst); break; case CC_OP_FLOGR: r = cc_calc_flogr(dst); diff --git a/target/s390x/tcg/insn-data.def b/target/s390x/tcg/insn-data.def index XXXXXXX..XXXXXXX 100644 --- a/target/s390x/tcg/insn-data.def +++ b/target/s390x/tcg/insn-data.def @@ -XXX,XX +XXX,XX @@ C(0xb9e1, POPCNT, RRE, PC, 0, r2_o, r1, 0, popcnt, nz64) /* ROTATE LEFT SINGLE LOGICAL */ - C(0xeb1d, RLL, RSY_a, Z, r3_o, sh32, new, r1_32, rll32, 0) - C(0xeb1c, RLLG, RSY_a, Z, r3_o, sh64, r1, 0, rll64, 0) + C(0xeb1d, RLL, RSY_a, Z, r3_o, sh, new, r1_32, rll32, 0) + C(0xeb1c, RLLG, RSY_a, Z, r3_o, sh, r1, 0, rll64, 0) /* ROTATE THEN INSERT SELECTED BITS */ C(0xec55, RISBG, RIE_f, GIE, 0, r2, r1, 0, risbg, s64) @@ -XXX,XX +XXX,XX @@ C(0x0400, SPM, RR_a, Z, r1, 0, 0, 0, spm, 0) /* SHIFT LEFT SINGLE */ - D(0x8b00, SLA, RS_a, Z, r1, sh32, new, r1_32, sla, 0, 31) - D(0xebdd, SLAK, RSY_a, DO, r3, sh32, new, r1_32, sla, 0, 31) - D(0xeb0b, SLAG, RSY_a, Z, r3, sh64, r1, 0, sla, 0, 63) + D(0x8b00, SLA, RS_a, Z, r1, sh, new, r1_32, sla, 0, 31) + D(0xebdd, SLAK, RSY_a, DO, r3, sh, new, r1_32, sla, 0, 31) + D(0xeb0b, SLAG, RSY_a, Z, r3, sh, r1, 0, sla, 0, 63) /* SHIFT LEFT SINGLE LOGICAL */ - C(0x8900, SLL, RS_a, Z, r1_o, sh32, new, r1_32, sll, 0) - C(0xebdf, SLLK, RSY_a, DO, r3_o, sh32, new, r1_32, sll, 0) - C(0xeb0d, SLLG, RSY_a, Z, r3_o, sh64, r1, 0, sll, 0) + C(0x8900, SLL, RS_a, Z, r1_o, sh, new, r1_32, sll, 0) + C(0xebdf, SLLK, RSY_a, DO, r3_o, sh, new, r1_32, sll, 0) + C(0xeb0d, SLLG, RSY_a, Z, r3_o, sh, r1, 0, sll, 0) /* SHIFT RIGHT SINGLE */ - C(0x8a00, SRA, RS_a, Z, r1_32s, sh32, new, r1_32, sra, s32) - C(0xebdc, SRAK, RSY_a, DO, r3_32s, sh32, new, r1_32, sra, s32) - C(0xeb0a, SRAG, RSY_a, Z, r3_o, sh64, r1, 0, sra, s64) + C(0x8a00, SRA, RS_a, Z, r1_32s, sh, new, r1_32, sra, s32) + C(0xebdc, SRAK, RSY_a, DO, r3_32s, sh, new, r1_32, sra, s32) + C(0xeb0a, SRAG, RSY_a, Z, r3_o, sh, r1, 0, sra, s64) /* SHIFT RIGHT SINGLE LOGICAL */ - C(0x8800, SRL, RS_a, Z, r1_32u, sh32, new, r1_32, srl, 0) - C(0xebde, SRLK, RSY_a, DO, r3_32u, sh32, new, r1_32, srl, 0) - C(0xeb0c, SRLG, RSY_a, Z, r3_o, sh64, r1, 0, srl, 0) + C(0x8800, SRL, RS_a, Z, r1_32u, sh, new, r1_32, srl, 0) + C(0xebde, SRLK, RSY_a, DO, r3_32u, sh, new, r1_32, srl, 0) + C(0xeb0c, SRLG, RSY_a, Z, r3_o, sh, r1, 0, srl, 0) /* SHIFT LEFT DOUBLE */ - D(0x8f00, SLDA, RS_a, Z, r1_D32, sh64, new, r1_D32, sla, 0, 63) + D(0x8f00, SLDA, RS_a, Z, r1_D32, sh, new, r1_D32, sla, 0, 63) /* SHIFT LEFT DOUBLE LOGICAL */ - C(0x8d00, SLDL, RS_a, Z, r1_D32, sh64, new, r1_D32, sll, 0) + C(0x8d00, SLDL, RS_a, Z, r1_D32, sh, new, r1_D32, sll, 0) /* SHIFT RIGHT DOUBLE */ - C(0x8e00, SRDA, RS_a, Z, r1_D32, sh64, new, r1_D32, sra, s64) + C(0x8e00, SRDA, RS_a, Z, r1_D32, sh, new, r1_D32, sra, s64) /* SHIFT RIGHT DOUBLE LOGICAL */ - C(0x8c00, SRDL, RS_a, Z, r1_D32, sh64, new, r1_D32, srl, 0) + C(0x8c00, SRDL, RS_a, Z, r1_D32, sh, new, r1_D32, srl, 0) /* SQUARE ROOT */ F(0xb314, SQEBR, RRE, Z, 0, e2, new, e1, sqeb, 0, IF_BFP) diff --git a/target/s390x/tcg/translate.c b/target/s390x/tcg/translate.c index XXXXXXX..XXXXXXX 100644 --- a/target/s390x/tcg/translate.c +++ b/target/s390x/tcg/translate.c @@ -XXX,XX +XXX,XX @@ static void gen_op_calc_cc(DisasContext *s) case CC_OP_LTUGTU_64: case CC_OP_TM_32: case CC_OP_TM_64: - case CC_OP_SLA_32: - case CC_OP_SLA_64: + case CC_OP_SLA: case CC_OP_SUBU: case CC_OP_NZ_F128: case CC_OP_VC: @@ -XXX,XX +XXX,XX @@ struct DisasInsn { /* ====================================================================== */ /* Miscellaneous helpers, used by several operations. */ -static void help_l2_shift(DisasContext *s, DisasOps *o, int mask) -{ - int b2 = get_field(s, b2); - int d2 = get_field(s, d2); - - if (b2 == 0) { - o->in2 = tcg_const_i64(d2 & mask); - } else { - o->in2 = get_address(s, 0, b2, d2); - tcg_gen_andi_i64(o->in2, o->in2, mask); - } -} - static DisasJumpType help_goto_direct(DisasContext *s, uint64_t dest) { if (dest == s->pc_tmp) { @@ -XXX,XX +XXX,XX @@ static DisasJumpType op_soc(DisasContext *s, DisasOps *o) static DisasJumpType op_sla(DisasContext *s, DisasOps *o) { + TCGv_i64 t; uint64_t sign = 1ull << s->insn->data; - enum cc_op cco = s->insn->data == 31 ? CC_OP_SLA_32 : CC_OP_SLA_64; - gen_op_update2_cc_i64(s, cco, o->in1, o->in2); + if (s->insn->data == 31) { + t = tcg_temp_new_i64(); + tcg_gen_shli_i64(t, o->in1, 32); + } else { + t = o->in1; + } + gen_op_update2_cc_i64(s, CC_OP_SLA, t, o->in2); + if (s->insn->data == 31) { + tcg_temp_free_i64(t); + } tcg_gen_shl_i64(o->out, o->in1, o->in2); /* The arithmetic left shift is curious in that it does not affect the sign bit. Copy that over from the source unchanged. */ @@ -XXX,XX +XXX,XX @@ static void in2_ri2(DisasContext *s, DisasOps *o) } #define SPEC_in2_ri2 0 -static void in2_sh32(DisasContext *s, DisasOps *o) +static void in2_sh(DisasContext *s, DisasOps *o) { - help_l2_shift(s, o, 31); -} -#define SPEC_in2_sh32 0 + int b2 = get_field(s, b2); + int d2 = get_field(s, d2); -static void in2_sh64(DisasContext *s, DisasOps *o) -{ - help_l2_shift(s, o, 63); + if (b2 == 0) { + o->in2 = tcg_const_i64(d2 & 0x3f); + } else { + o->in2 = get_address(s, 0, b2, d2); + tcg_gen_andi_i64(o->in2, o->in2, 0x3f); + } } -#define SPEC_in2_sh64 0 +#define SPEC_in2_sh 0 static void in2_m2_8u(DisasContext *s, DisasOps *o) { -- 2.31.1
Add a test for each shift instruction in order to to prevent regressions. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> --- tests/tcg/s390x/Makefile.target | 1 + tests/tcg/s390x/shift.c | 270 ++++++++++++++++++++++++++++++++ 2 files changed, 271 insertions(+) create mode 100644 tests/tcg/s390x/shift.c diff --git a/tests/tcg/s390x/Makefile.target b/tests/tcg/s390x/Makefile.target index XXXXXXX..XXXXXXX 100644 --- a/tests/tcg/s390x/Makefile.target +++ b/tests/tcg/s390x/Makefile.target @@ -XXX,XX +XXX,XX @@ TESTS+=exrl-trtr TESTS+=pack TESTS+=mvo TESTS+=mvc +TESTS+=shift TESTS+=trap TESTS+=signals-s390x diff --git a/tests/tcg/s390x/shift.c b/tests/tcg/s390x/shift.c new file mode 100644 index XXXXXXX..XXXXXXX --- /dev/null +++ b/tests/tcg/s390x/shift.c @@ -XXX,XX +XXX,XX @@ +#include <inttypes.h> +#include <stdint.h> +#include <stdio.h> + +#define DEFINE_SHIFT_SINGLE_COMMON(_name, _insn_str) \ + static uint64_t _name(uint64_t op1, uint64_t op2, uint64_t *cc) \ + { \ + asm(" sll %[cc],28\n" \ + " spm %[cc]\n" \ + " " _insn_str "\n" \ + " ipm %[cc]\n" \ + " srl %[cc],28" \ + : [op1] "+&r" (op1), \ + [cc] "+&r" (*cc) \ + : [op2] "r" (op2) \ + : "cc"); \ + return op1; \ + } +#define DEFINE_SHIFT_SINGLE_2(_insn, _offset) \ + DEFINE_SHIFT_SINGLE_COMMON(_insn ## _ ## _offset, \ + #_insn " %[op1]," #_offset "(%[op2])") +#define DEFINE_SHIFT_SINGLE_3(_insn, _offset) \ + DEFINE_SHIFT_SINGLE_COMMON(_insn ## _ ## _offset, \ + #_insn " %[op1],%[op1]," #_offset "(%[op2])") +#define DEFINE_SHIFT_DOUBLE(_insn, _offset) \ + static uint64_t _insn ## _ ## _offset(uint64_t op1, uint64_t op2, \ + uint64_t *cc) \ + { \ + uint32_t op1h = op1 >> 32; \ + uint32_t op1l = op1 & 0xffffffff; \ + register uint32_t r2 asm("2") = op1h; \ + register uint32_t r3 asm("3") = op1l; \ + \ + asm(" sll %[cc],28\n" \ + " spm %[cc]\n" \ + " " #_insn " %[r2]," #_offset "(%[op2])\n" \ + " ipm %[cc]\n" \ + " srl %[cc],28" \ + : [r2] "+&r" (r2), \ + [r3] "+&r" (r3), \ + [cc] "+&r" (*cc) \ + : [op2] "r" (op2) \ + : "cc"); \ + op1h = r2; \ + op1l = r3; \ + return (((uint64_t)op1h) << 32) | op1l; \ + } + +DEFINE_SHIFT_SINGLE_3(rll, 0x4cf3b); +DEFINE_SHIFT_SINGLE_3(rllg, 0x697c9); +DEFINE_SHIFT_SINGLE_2(sla, 0x4b0); +DEFINE_SHIFT_SINGLE_2(sla, 0xd54); +DEFINE_SHIFT_SINGLE_3(slak, 0x2832c); +DEFINE_SHIFT_SINGLE_3(slag, 0x66cc4); +DEFINE_SHIFT_SINGLE_3(slag, 0xd54); +DEFINE_SHIFT_SINGLE_2(sll, 0xd04); +DEFINE_SHIFT_SINGLE_3(sllk, 0x2699f); +DEFINE_SHIFT_SINGLE_3(sllg, 0x59df9); +DEFINE_SHIFT_SINGLE_2(sra, 0x67e); +DEFINE_SHIFT_SINGLE_3(srak, 0x60943); +DEFINE_SHIFT_SINGLE_3(srag, 0x6b048); +DEFINE_SHIFT_SINGLE_2(srl, 0x035); +DEFINE_SHIFT_SINGLE_3(srlk, 0x43dfc); +DEFINE_SHIFT_SINGLE_3(srlg, 0x27227); +DEFINE_SHIFT_DOUBLE(slda, 0x38b); +DEFINE_SHIFT_DOUBLE(sldl, 0x031); +DEFINE_SHIFT_DOUBLE(srda, 0x36f); +DEFINE_SHIFT_DOUBLE(srdl, 0x99a); + +struct shift_test { + const char *name; + uint64_t (*insn)(uint64_t, uint64_t, uint64_t *); + uint64_t op1; + uint64_t op2; + uint64_t exp_result; + uint64_t exp_cc; +}; + +static const struct shift_test tests[] = { + { + .name = "rll", + .insn = rll_0x4cf3b, + .op1 = 0xecbd589a45c248f5ull, + .op2 = 0x62e5508ccb4c99fdull, + .exp_result = 0xecbd589af545c248ull, + .exp_cc = 0, + }, + { + .name = "rllg", + .insn = rllg_0x697c9, + .op1 = 0xaa2d54c1b729f7f4ull, + .op2 = 0x5ffcf7465f5cd71full, + .exp_result = 0x29f7f4aa2d54c1b7ull, + .exp_cc = 0, + }, + { + .name = "sla-1", + .insn = sla_0x4b0, + .op1 = 0x8bf21fb67cca0e96ull, + .op2 = 0x3ddf2f53347d3030ull, + .exp_result = 0x8bf21fb600000000ull, + .exp_cc = 3, + }, + { + .name = "sla-2", + .insn = sla_0xd54, + .op1 = 0xe4faaed5def0e926ull, + .op2 = 0x18d586fab239cbeeull, + .exp_result = 0xe4faaed5fbc3a498ull, + .exp_cc = 3, + }, + { + .name = "slak", + .insn = slak_0x2832c, + .op1 = 0x7300bf78707f09f9ull, + .op2 = 0x4d193b85bb5cb39bull, + .exp_result = 0x7300bf783f84fc80ull, + .exp_cc = 3, + }, + { + .name = "slag-1", + .insn = slag_0x66cc4, + .op1 = 0xe805966de1a77762ull, + .op2 = 0x0e92953f6aa91c6bull, + .exp_result = 0xbbb1000000000000ull, + .exp_cc = 3, + }, + { + .name = "slag-2", + .insn = slag_0xd54, + .op1 = 0xdef0e92600000000ull, + .op2 = 0x18d586fab239cbeeull, + .exp_result = 0xfbc3a49800000000ull, + .exp_cc = 3, + }, + { + .name = "sll", + .insn = sll_0xd04, + .op1 = 0xb90281a3105939dfull, + .op2 = 0xb5e4df7e082e4c5eull, + .exp_result = 0xb90281a300000000ull, + .exp_cc = 0, + }, + { + .name = "sllk", + .insn = sllk_0x2699f, + .op1 = 0x777c6cf116f99557ull, + .op2 = 0xe0556cf112e5a458ull, + .exp_result = 0x777c6cf100000000ull, + .exp_cc = 0, + }, + { + .name = "sllg", + .insn = sllg_0x59df9, + .op1 = 0xcdf86cbfbc0f3557ull, + .op2 = 0x325a45acf99c6d3dull, + .exp_result = 0x55c0000000000000ull, + .exp_cc = 0, + }, + { + .name = "sra", + .insn = sra_0x67e, + .op1 = 0xb878f048d5354183ull, + .op2 = 0x9e27d13195931f79ull, + .exp_result = 0xb878f048ffffffffull, + .exp_cc = 1, + }, + { + .name = "srak", + .insn = srak_0x60943, + .op1 = 0xb6ceb5a429cedb35ull, + .op2 = 0x352354900ae34d7aull, + .exp_result = 0xb6ceb5a400000000ull, + .exp_cc = 0, + }, + { + .name = "srag", + .insn = srag_0x6b048, + .op1 = 0xd54dd4468676c63bull, + .op2 = 0x84d026db7b4dca28ull, + .exp_result = 0xffffffffffffd54dull, + .exp_cc = 1, + }, + { + .name = "srl", + .insn = srl_0x035, + .op1 = 0x09be503ef826815full, + .op2 = 0xbba8d1a0e542d5c1ull, + .exp_result = 0x9be503e00000000ull, + .exp_cc = 0, + }, + { + .name = "srlk", + .insn = srlk_0x43dfc, + .op1 = 0x540d6c8de71aee2aull, + .op2 = 0x0000000000000000ull, + .exp_result = 0x540d6c8d00000000ull, + .exp_cc = 0, + }, + { + .name = "srlg", + .insn = srlg_0x27227, + .op1 = 0x26f7123c1c447a34ull, + .op2 = 0x0000000000000000ull, + .exp_result = 0x00000000004dee24ull, + .exp_cc = 0, + }, + { + .name = "slda", + .insn = slda_0x38b, + .op1 = 0x7988f722dd5bbe7cull, + .op2 = 0x9aed3f95b4d78cc2ull, + .exp_result = 0x1ee45bab77cf8000ull, + .exp_cc = 3, + }, + { + .name = "sldl", + .insn = sldl_0x031, + .op1 = 0xaae2918dce2b049aull, + .op2 = 0x0000000000000000ull, + .exp_result = 0x0934000000000000ull, + .exp_cc = 0, + }, + { + .name = "srda", + .insn = srda_0x36f, + .op1 = 0x0cd4ed9228a50978ull, + .op2 = 0x72b046f0848b8cc9ull, + .exp_result = 0x000000000000000cull, + .exp_cc = 2, + }, + { + .name = "srdl", + .insn = srdl_0x99a, + .op1 = 0x1018611c41689a1dull, + .op2 = 0x2907e150c50ba319ull, + .exp_result = 0x0000000000000203ull, + .exp_cc = 0, + }, +}; + +int main(void) +{ + int ret = 0; + size_t i; + + for (i = 0; i < sizeof(tests) / sizeof(tests[0]); i++) { + uint64_t result; + uint64_t cc = 0; + + result = tests[i].insn(tests[i].op1, tests[i].op2, &cc); + if (result != tests[i].exp_result) { + fprintf(stderr, + "bad %s result:\n" + "actual = 0x%" PRIx64 "\n" + "expected = 0x%" PRIx64 "\n", + tests[i].name, result, tests[i].exp_result); + ret = 1; + } + if (cc != tests[i].exp_cc) { + fprintf(stderr, + "bad %s cc:\n" + "actual = %" PRIu64 "\n" + "expected = %" PRIu64 "\n", + tests[i].name, cc, tests[i].exp_cc); + ret = 1; + } + } + return ret; +} -- 2.31.1