Message ID: 20190518191934.21887-3-richard.henderson@linaro.org
State: New
Series: target/arm: make use of new gvec expanders
On Sat, 18 May 2019 at 20:19, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> This replaces 3 target-specific implementations for BIT, BIF, and BSL.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> @@ -10916,13 +10925,13 @@ static void disas_simd_3same_logic(DisasContext *s, uint32_t insn)
>         return;
>
>     case 5: /* BSL bitwise select */
> -        gen_gvec_op3(s, is_q, rd, rn, rm, &bsl_op);
> +        gen_gvec_fn4(s, is_q, rd, rd, rn, rm, tcg_gen_gvec_bitsel, 0);
>         return;
>     case 6: /* BIT, bitwise insert if true */
> -        gen_gvec_op3(s, is_q, rd, rn, rm, &bit_op);
> +        gen_gvec_fn4(s, is_q, rd, rm, rn, rd, tcg_gen_gvec_bitsel, 0);
>         return;
>     case 7: /* BIF, bitwise insert if false */
> -        gen_gvec_op3(s, is_q, rd, rn, rm, &bif_op);
> +        gen_gvec_fn4(s, is_q, rd, rm, rd, rn, tcg_gen_gvec_bitsel, 0);
>         return;

We were previously doing different operations for these three
different instructions. Now we seem to always be doing the same
thing but with randomly reshuffled register arguments. How does
this work?

thanks
-- PMM
On 5/23/19 8:46 AM, Peter Maydell wrote:
> On Sat, 18 May 2019 at 20:19, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>>
>> This replaces 3 target-specific implementations for BIT, BIF, and BSL.
>>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> @@ -10916,13 +10925,13 @@ static void disas_simd_3same_logic(DisasContext *s, uint32_t insn)
>>         return;
>>
>>     case 5: /* BSL bitwise select */
>> -        gen_gvec_op3(s, is_q, rd, rn, rm, &bsl_op);
>> +        gen_gvec_fn4(s, is_q, rd, rd, rn, rm, tcg_gen_gvec_bitsel, 0);
>>         return;
>>     case 6: /* BIT, bitwise insert if true */
>> -        gen_gvec_op3(s, is_q, rd, rn, rm, &bit_op);
>> +        gen_gvec_fn4(s, is_q, rd, rm, rn, rd, tcg_gen_gvec_bitsel, 0);
>>         return;
>>     case 7: /* BIF, bitwise insert if false */
>> -        gen_gvec_op3(s, is_q, rd, rn, rm, &bif_op);
>> +        gen_gvec_fn4(s, is_q, rd, rm, rd, rn, tcg_gen_gvec_bitsel, 0);
>>         return;
>
> We were previously doing different operations for these three
> different instructions. Now we seem to always be doing the same
> thing but with randomly reshuffled register arguments. How
> does this work ?

Because the three different instructions perform the same operation with
reshuffled register arguments.

I'm not sure why we were using different operations in the beginning.
Possibly because those formulations do not require a temporary?
Possibly that's how they're written in the ARM ARM pseudocode?

r~
On Thu, 23 May 2019 at 14:02, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 5/23/19 8:46 AM, Peter Maydell wrote:
> > On Sat, 18 May 2019 at 20:19, Richard Henderson
> > <richard.henderson@linaro.org> wrote:
> >>
> >> This replaces 3 target-specific implementations for BIT, BIF, and BSL.
> >>
> >> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> >> @@ -10916,13 +10925,13 @@ static void disas_simd_3same_logic(DisasContext *s, uint32_t insn)
> >>         return;
> >>
> >>     case 5: /* BSL bitwise select */
> >> -        gen_gvec_op3(s, is_q, rd, rn, rm, &bsl_op);
> >> +        gen_gvec_fn4(s, is_q, rd, rd, rn, rm, tcg_gen_gvec_bitsel, 0);
> >>         return;
> >>     case 6: /* BIT, bitwise insert if true */
> >> -        gen_gvec_op3(s, is_q, rd, rn, rm, &bit_op);
> >> +        gen_gvec_fn4(s, is_q, rd, rm, rn, rd, tcg_gen_gvec_bitsel, 0);
> >>         return;
> >>     case 7: /* BIF, bitwise insert if false */
> >> -        gen_gvec_op3(s, is_q, rd, rn, rm, &bif_op);
> >> +        gen_gvec_fn4(s, is_q, rd, rm, rd, rn, tcg_gen_gvec_bitsel, 0);
> >>         return;
> >
> > We were previously doing different operations for these three
> > different instructions. Now we seem to always be doing the same
> > thing but with randomly reshuffled register arguments. How
> > does this work ?
>
> Because the three different instructions perform the same operation with
> reshuffled register arguments.

Ah, so they do. Next question, how do I find out what the
order of arguments in the above code means so I can compare
it against the pseudocode expression we're implementing?

thanks
-- PMM
On 5/23/19 9:08 AM, Peter Maydell wrote:
>> Because the three different instructions perform the same operation with
>> reshuffled register arguments.
>
> Ah, so they do. Next question, how do I find out what the
> order of arguments in the above code means so I can compare
> it against the pseudocode expression we're implementing?

From tcg/README:

  * bitsel_vec v0, v1, v2, v3

  Bitwise select, v0 = (v2 & v1) | (v3 & ~v1), across the entire vector.

The "selector" is second, the first input operand.

r~
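[Editor's note: to make the operand mapping above concrete, here is a small scalar sanity check. It is plain Python modelling a single byte lane, not QEMU code; the `bsl_ref`/`bit_ref`/`bif_ref` reference functions are transcribed from the Arm ARM descriptions of BSL/BIT/BIF, and the helper names are invented for this sketch.]

```python
import random

MASK = 0xFF  # model a single byte lane

def bitsel(sel, a, b):
    # tcg/README: bitsel_vec v0, v1, v2, v3 -> v0 = (v2 & v1) | (v3 & ~v1),
    # i.e. the selector is the first input operand.
    return (a & sel) | (b & ~sel & MASK)

# Reference semantics per the Arm ARM, computed from the *old* register
# values (Vd is both destination and an input):
def bsl_ref(rd, rn, rm):   # BSL: selector is the old Vd
    return (rn & rd) | (rm & ~rd & MASK)

def bit_ref(rd, rn, rm):   # BIT: take Vn bits where Vm is 1
    return (rn & rm) | (rd & ~rm & MASK)

def bif_ref(rd, rn, rm):   # BIF: take Vn bits where Vm is 0
    return (rn & ~rm & MASK) | (rd & rm)

random.seed(0)
for _ in range(1000):
    rd, rn, rm = (random.randrange(256) for _ in range(3))
    # Operand orders from the patch: gen_gvec_fn4(s, is_q, dest, v1, v2, v3, ...)
    assert bsl_ref(rd, rn, rm) == bitsel(rd, rn, rm)  # BSL: (rd, rd, rn, rm)
    assert bit_ref(rd, rn, rm) == bitsel(rm, rn, rd)  # BIT: (rd, rm, rn, rd)
    assert bif_ref(rd, rn, rm) == bitsel(rm, rd, rn)  # BIF: (rd, rm, rd, rn)
print("all three operand mappings agree")
```

The only difference between the three instructions is which register supplies the selector (old Vd for BSL, Vm for BIT and BIF) and, for BIF, which input sits on the "selected" side.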
On Thu, 23 May 2019 at 14:16, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 5/23/19 9:08 AM, Peter Maydell wrote:
> >> Because the three different instructions perform the same operation with
> >> reshuffled register arguments.
> >
> > Ah, so they do. Next question, how do I find out what the
> > order of arguments in the above code means so I can compare
> > it against the pseudocode expression we're implementing?
>
> From tcg/README:
>
>   * bitsel_vec v0, v1, v2, v3
>
>   Bitwise select, v0 = (v2 & v1) | (v3 & ~v1), across the entire vector.
>
> The "selector" is second, the first input operand.

Oh, this series is based on another patchset.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM
On 5/23/19 8:30 AM, Peter Maydell wrote:
> On Thu, 23 May 2019 at 14:16, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>>
>> On 5/23/19 9:08 AM, Peter Maydell wrote:
>>>> Because the three different instructions perform the same operation with
>>>> reshuffled register arguments.
>>>
>>> Ah, so they do. Next question, how do I find out what the
>>> order of arguments in the above code means so I can compare
>>> it against the pseudocode expression we're implementing?
>>
>> From tcg/README:
>>
>>   * bitsel_vec v0, v1, v2, v3
>>
>>   Bitwise select, v0 = (v2 & v1) | (v3 & ~v1), across the entire vector.
>>
>> The "selector" is second, the first input operand.
>
> Oh, this series is based on another patchset.
>
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>

The prerequisite for this patch is now in master.

r~
diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h
index 63d958cf50..9569bc5963 100644
--- a/target/arm/translate-a64.h
+++ b/target/arm/translate-a64.h
@@ -122,5 +122,7 @@ typedef void GVecGen2iFn(unsigned, uint32_t, uint32_t,
                          int64_t, uint32_t, uint32_t);
 typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t,
                         uint32_t, uint32_t, uint32_t);
+typedef void GVecGen4Fn(unsigned, uint32_t, uint32_t, uint32_t,
+                        uint32_t, uint32_t, uint32_t);
 
 #endif /* TARGET_ARM_TRANSLATE_A64_H */
diff --git a/target/arm/translate.h b/target/arm/translate.h
index f357b767cb..01ae454dcf 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -238,9 +238,6 @@ static inline void gen_ss_advance(DisasContext *s)
 }
 
 /* Vector operations shared between ARM and AArch64. */
-extern const GVecGen3 bsl_op;
-extern const GVecGen3 bit_op;
-extern const GVecGen3 bif_op;
 extern const GVecGen3 mla_op[4];
 extern const GVecGen3 mls_op[4];
 extern const GVecGen3 cmtst_op[4];
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 2c280243a9..955ab63ff8 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -704,6 +704,15 @@ static void gen_gvec_fn3(DisasContext *s, bool is_q, int rd, int rn, int rm,
                 vec_full_reg_offset(s, rm), is_q ? 16 : 8, vec_full_reg_size(s));
 }
 
+/* Expand a 4-operand AdvSIMD vector operation using an expander function. */
+static void gen_gvec_fn4(DisasContext *s, bool is_q, int rd, int rn, int rm,
+                         int rx, GVecGen4Fn *gvec_fn, int vece)
+{
+    gvec_fn(vece, vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn),
+            vec_full_reg_offset(s, rm), vec_full_reg_offset(s, rx),
+            is_q ? 16 : 8, vec_full_reg_size(s));
+}
+
 /* Expand a 2-operand + immediate AdvSIMD vector operation using
  * an op descriptor.
  */
@@ -10916,13 +10925,13 @@ static void disas_simd_3same_logic(DisasContext *s, uint32_t insn)
         return;
 
     case 5: /* BSL bitwise select */
-        gen_gvec_op3(s, is_q, rd, rn, rm, &bsl_op);
+        gen_gvec_fn4(s, is_q, rd, rd, rn, rm, tcg_gen_gvec_bitsel, 0);
         return;
     case 6: /* BIT, bitwise insert if true */
-        gen_gvec_op3(s, is_q, rd, rn, rm, &bit_op);
+        gen_gvec_fn4(s, is_q, rd, rm, rn, rd, tcg_gen_gvec_bitsel, 0);
         return;
     case 7: /* BIF, bitwise insert if false */
-        gen_gvec_op3(s, is_q, rd, rn, rm, &bif_op);
+        gen_gvec_fn4(s, is_q, rd, rm, rd, rn, tcg_gen_gvec_bitsel, 0);
         return;
 
     default:
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 49dfcdc90d..3abcae3a50 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -5755,72 +5755,6 @@ static int do_v81_helper(DisasContext *s, gen_helper_gvec_3_ptr *fn,
     return 1;
 }
 
-/*
- * Expanders for VBitOps_VBIF, VBIT, VBSL.
- */
-static void gen_bsl_i64(TCGv_i64 rd, TCGv_i64 rn, TCGv_i64 rm)
-{
-    tcg_gen_xor_i64(rn, rn, rm);
-    tcg_gen_and_i64(rn, rn, rd);
-    tcg_gen_xor_i64(rd, rm, rn);
-}
-
-static void gen_bit_i64(TCGv_i64 rd, TCGv_i64 rn, TCGv_i64 rm)
-{
-    tcg_gen_xor_i64(rn, rn, rd);
-    tcg_gen_and_i64(rn, rn, rm);
-    tcg_gen_xor_i64(rd, rd, rn);
-}
-
-static void gen_bif_i64(TCGv_i64 rd, TCGv_i64 rn, TCGv_i64 rm)
-{
-    tcg_gen_xor_i64(rn, rn, rd);
-    tcg_gen_andc_i64(rn, rn, rm);
-    tcg_gen_xor_i64(rd, rd, rn);
-}
-
-static void gen_bsl_vec(unsigned vece, TCGv_vec rd, TCGv_vec rn, TCGv_vec rm)
-{
-    tcg_gen_xor_vec(vece, rn, rn, rm);
-    tcg_gen_and_vec(vece, rn, rn, rd);
-    tcg_gen_xor_vec(vece, rd, rm, rn);
-}
-
-static void gen_bit_vec(unsigned vece, TCGv_vec rd, TCGv_vec rn, TCGv_vec rm)
-{
-    tcg_gen_xor_vec(vece, rn, rn, rd);
-    tcg_gen_and_vec(vece, rn, rn, rm);
-    tcg_gen_xor_vec(vece, rd, rd, rn);
-}
-
-static void gen_bif_vec(unsigned vece, TCGv_vec rd, TCGv_vec rn, TCGv_vec rm)
-{
-    tcg_gen_xor_vec(vece, rn, rn, rd);
-    tcg_gen_andc_vec(vece, rn, rn, rm);
-    tcg_gen_xor_vec(vece, rd, rd, rn);
-}
-
-const GVecGen3 bsl_op = {
-    .fni8 = gen_bsl_i64,
-    .fniv = gen_bsl_vec,
-    .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-    .load_dest = true
-};
-
-const GVecGen3 bit_op = {
-    .fni8 = gen_bit_i64,
-    .fniv = gen_bit_vec,
-    .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-    .load_dest = true
-};
-
-const GVecGen3 bif_op = {
-    .fni8 = gen_bif_i64,
-    .fniv = gen_bif_vec,
-    .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-    .load_dest = true
-};
-
 static void gen_ssra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 {
     tcg_gen_vec_sar8i_i64(a, a, shift);
@@ -6830,16 +6764,16 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                            vec_size, vec_size);
             break;
         case 5: /* VBSL */
-            tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,
-                           vec_size, vec_size, &bsl_op);
+            tcg_gen_gvec_bitsel(MO_8, rd_ofs, rd_ofs, rn_ofs, rm_ofs,
+                                vec_size, vec_size);
             break;
        case 6: /* VBIT */
-            tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,
-                           vec_size, vec_size, &bit_op);
+            tcg_gen_gvec_bitsel(MO_8, rd_ofs, rm_ofs, rn_ofs, rd_ofs,
+                                vec_size, vec_size);
             break;
         case 7: /* VBIF */
-            tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs,
-                           vec_size, vec_size, &bif_op);
+            tcg_gen_gvec_bitsel(MO_8, rd_ofs, rm_ofs, rd_ofs, rn_ofs,
+                                vec_size, vec_size);
             break;
         }
         return 0;
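[Editor's note: the expanders removed above compute exactly `tcg_gen_gvec_bitsel` with the operand orders used in the patch, via the classic xor trick that needs no temporary register. A quick sanity check of that equivalence, transcribing the removed `_i64` expanders from TCG ops into plain Python integer ops (function names match the removed QEMU helpers; everything else is a model for this sketch):]

```python
import random

MASK = (1 << 64) - 1  # model one 64-bit element

def bitsel(sel, a, b):
    # tcg_gen_gvec_bitsel semantics: result = (a & sel) | (b & ~sel)
    return (a & sel) | (b & ~sel & MASK)

# Removed expanders, one Python statement per TCG op; each clobbers rn
# and writes rd without needing a scratch register.

def gen_bsl_i64(rd, rn, rm):
    rn = rn ^ rm              # tcg_gen_xor_i64(rn, rn, rm)
    rn = rn & rd              # tcg_gen_and_i64(rn, rn, rd)
    return rm ^ rn            # tcg_gen_xor_i64(rd, rm, rn)

def gen_bit_i64(rd, rn, rm):
    rn = rn ^ rd              # tcg_gen_xor_i64(rn, rn, rd)
    rn = rn & rm              # tcg_gen_and_i64(rn, rn, rm)
    return rd ^ rn            # tcg_gen_xor_i64(rd, rd, rn)

def gen_bif_i64(rd, rn, rm):
    rn = rn ^ rd              # tcg_gen_xor_i64(rn, rn, rd)
    rn = rn & ~rm & MASK      # tcg_gen_andc_i64(rn, rn, rm)
    return rd ^ rn            # tcg_gen_xor_i64(rd, rd, rn)

random.seed(1)
for _ in range(1000):
    rd, rn, rm = (random.getrandbits(64) for _ in range(3))
    assert gen_bsl_i64(rd, rn, rm) == bitsel(rd, rn, rm)  # BSL: selector rd
    assert gen_bit_i64(rd, rn, rm) == bitsel(rm, rn, rd)  # BIT: selector rm
    assert gen_bif_i64(rd, rn, rm) == bitsel(rm, rd, rn)  # BIF: inverted rm
print("xor-trick expanders match bitsel")
```

The trick is that `x ^ ((y ^ x) & sel)` selects `y` where `sel` bits are 1 and `x` where they are 0, which is why all three removed formulations collapse into one generic bitsel expander.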
This replaces 3 target-specific implementations for BIT, BIF, and BSL.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-a64.h |  2 +
 target/arm/translate.h     |  3 --
 target/arm/translate-a64.c | 15 ++++++--
 target/arm/translate.c     | 78 +++-----------------------------------
 4 files changed, 20 insertions(+), 78 deletions(-)

-- 
2.17.1