| Message ID | 20231028194522.245170-1-richard.henderson@linaro.org |
| --- | --- |
| Series | tcg: Introduce TCG_COND_TST{EQ,NE} |
Ping.  Running out of days before soft-freeze. :-)

r~

On 10/28/23 12:44, Richard Henderson wrote:
> Expose a pair of comparison operators that map to the "test"
> comparison that is available on many architectures.
>
> Changes for v2:
>   * Add TCGCond to tcg_target_const_match.
>     This fixes a long-standing issue with ppc and s390x backends,
>     in that CMPI for signed comparisons has signed immediate and
>     CMPLI for unsigned comparisons has unsigned immediate.
>     But now allows different immediates for the TST comparisons.
>   * tcg/i386: Generate TEST x,x for power-of-two in {7,15,31,63}.
>   * tcg/i386: Generate BT n,x for other power-of-two.
>   * tcg/ppc: tcg_target_const_match improvements
>   * tcg/s390x: tcg_target_const_match improvements
>   * target/m68k: Use TST{EQ,NE} for gen_fcc_cond.
>
> I wanted more testing, so I went looking for low-hanging fruit.
> I didn't see any within target/arm, probably due to how we represent NZCV,
> and I didn't want to overlap anything that Paolo might be doing with
> target/i386, so I landed on target/m68k.  Tested via our float_convd.
>
> r~
>
> Richard Henderson (35):
>   tcg: Introduce TCG_COND_TST{EQ,NE}
>   tcg/optimize: Split out arg_is_const_val
>   tcg/optimize: Split out do_constant_folding_cond1
>   tcg/optimize: Do swap_commutative2 in do_constant_folding_cond2
>   tcg/optimize: Split out arg_new_constant
>   tcg/optimize: Handle TCG_COND_TST{EQ,NE}
>   tcg: Add TCGConst argument to tcg_target_const_match
>   tcg/aarch64: Support TCG_COND_TST{EQ,NE}
>   tcg/aarch64: Generate TBZ, TBNZ
>   tcg/aarch64: Generate CBNZ for TSTNE of UINT32_MAX
>   tcg/arm: Support TCG_COND_TST{EQ,NE}
>   tcg/i386: Pass x86 condition codes to tcg_out_cmov
>   tcg/i386: Move tcg_cond_to_jcc[] into tcg_out_cmp
>   tcg/i386: Support TCG_COND_TST{EQ,NE}
>   tcg/i386: Improve TSTNE/TESTEQ vs powers of two
>   tcg/loongarch64: Support TCG_COND_TST{EQ,NE}
>   tcg/mips: Support TCG_COND_TST{EQ,NE}
>   tcg/riscv: Support TCG_COND_TST{EQ,NE}
>   tcg/sparc64: Implement tcg_out_extrl_i64_i32
>   tcg/sparc64: Hoist read of tcg_cond_to_rcond
>   tcg/sparc64: Pass TCGCond to tcg_out_cmp
>   tcg/sparc64: Support TCG_COND_TST{EQ,NE}
>   tcg/ppc: Sink tcg_to_bc usage into tcg_out_bc
>   tcg/ppc: Use cr0 in tcg_to_bc and tcg_to_isel
>   tcg/ppc: Tidy up tcg_target_const_match
>   tcg/ppc: Add TCG_CT_CONST_CMP
>   tcg/ppc: Support TCG_COND_TST{EQ,NE}
>   tcg/s390x: Split constraint A into J+U
>   tcg/s390x: Add TCG_CT_CONST_CMP
>   tcg/s390x: Support TCG_COND_TST{EQ,NE}
>   tcg/tci: Support TCG_COND_TST{EQ,NE}
>   target/alpha: Use TCG_COND_TST{EQ,NE} for BLB{C,S}
>   target/alpha: Use TCG_COND_TST{EQ,NE} for CMOVLB{C,S}
>   target/alpha: Use TCG_COND_TSTNE for gen_fold_mzero
>   target/m68k: Use TCG_COND_TST{EQ,NE} in gen_fcc_cond
>
>  docs/devel/tcg-ops.rst           |   2 +
>  include/tcg/tcg-cond.h           |  49 +++-
>  tcg/aarch64/tcg-target-con-set.h |   5 +-
>  tcg/aarch64/tcg-target-con-str.h |   1 +
>  tcg/i386/tcg-target-con-set.h    |   6 +-
>  tcg/i386/tcg-target-con-str.h    |   1 +
>  tcg/ppc/tcg-target-con-set.h     |   5 +-
>  tcg/ppc/tcg-target-con-str.h     |   1 +
>  tcg/s390x/tcg-target-con-set.h   |   8 +-
>  tcg/s390x/tcg-target-con-str.h   |   3 +-
>  target/alpha/translate.c         |  94 +++----
>  target/m68k/translate.c          |  74 +++--
>  tcg/optimize.c                   | 464 +++++++++++++++++++++++--------
>  tcg/tcg.c                        |  38 ++-
>  tcg/tci.c                        |  14 +
>  tcg/aarch64/tcg-target.c.inc     | 165 ++++++++---
>  tcg/arm/tcg-target.c.inc         |  62 +++--
>  tcg/i386/tcg-target.c.inc        | 178 ++++++++----
>  tcg/loongarch64/tcg-target.c.inc |  59 ++--
>  tcg/mips/tcg-target.c.inc        |  44 ++-
>  tcg/ppc/tcg-target.c.inc         | 277 +++++++++++++-----
>  tcg/riscv/tcg-target.c.inc       |  23 +-
>  tcg/s390x/tcg-target.c.inc       | 246 ++++++++++------
>  tcg/sparc64/tcg-target.c.inc     |  70 +++--
>  tcg/tci/tcg-target.c.inc         |   3 +-
>  25 files changed, 1342 insertions(+), 550 deletions(-)
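[Editor's note: a minimal sketch of the tcg_target_const_match point made in the cover letter above, using ppc-style encodings purely as an assumed example. CMPI takes a sign-extended 16-bit immediate, CMPLI a zero-extended one, and a TST-style comparison would instead go through an ANDI./ANDIS.-style mask confined to the low or high 16 bits. The enum and function names are invented for illustration; this is not the backend code from the series.]

```c
#include <stdbool.h>
#include <stdint.h>

enum cmp_kind { CMP_SIGNED, CMP_UNSIGNED, CMP_TST };

/* Can this immediate be encoded directly for the given comparison kind? */
static bool const_ok_for_cmp(int64_t imm, enum cmp_kind kind)
{
    switch (kind) {
    case CMP_SIGNED:    /* CMPI: signed 16-bit immediate */
        return imm == (int16_t)imm;
    case CMP_UNSIGNED:  /* CMPLI: unsigned 16-bit immediate */
        return imm == (uint16_t)imm;
    case CMP_TST:       /* mask entirely in the low or the high 16 bits */
        return (imm & ~(int64_t)0xffff) == 0 ||
               (imm & ~(int64_t)0xffff0000) == 0;
    }
    return false;
}
```

Without the comparison kind, a single notion of "valid constant" has to be the intersection of these sets, which is the long-standing restriction the cover letter mentions.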
On Sat, Oct 28, 2023 at 9:45 PM Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Expose a pair of comparison operators that map to the "test"
> comparison that is available on many architectures.
>
> Changes for v2:
>   * Add TCGCond to tcg_target_const_match.
>     This fixes a long-standing issue with ppc and s390x backends,
>     in that CMPI for signed comparisons has signed immediate and
>     CMPLI for unsigned comparisons has unsigned immediate.
>     But now allows different immediates for the TST comparisons.
>   * tcg/i386: Generate TEST x,x for power-of-two in {7,15,31,63}.
>   * tcg/i386: Generate BT n,x for other power-of-two.
>   * tcg/ppc: tcg_target_const_match improvements
>   * tcg/s390x: tcg_target_const_match improvements
>   * target/m68k: Use TST{EQ,NE} for gen_fcc_cond.

I updated the MIPS backend (untested though) and pushed the result to
branch i386 of https://gitlab.com/bonzini/qemu/.

However, I was thinking: a lot of RISC targets simply do AND/ANDI
followed by the sequence used for TCG_COND_NE.  Would it make sense to
have a TCG_TARGET_SUPPORTS_TST bit and, if absent, lower TSTEQ/TSTNE
to AND+EQ/NE directly in the optimizer?  And for brcond2/setcond2,
always using AND/AND/OR may work just as well as any backend-specific
trick, and will give more freedom to the register allocator.

This would also allow reordering the target patches before the TCG
patches, thus: 1) providing more exposure to the lowering code, and
2) making the "minimum viable series" smaller.  (2) is not a huge deal
since you have already written the backend code, but (1) is interesting.

Paolo
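[Editor's note: a minimal sketch of the lowering Paolo proposes, written against stand-in types rather than QEMU's real tcg/optimize.c structures. TARGET_SUPPORTS_TST here stands in for the suggested TCG_TARGET_SUPPORTS_TST bit; all names are illustrative, not the code that eventually landed.]

```c
#include <stdbool.h>
#include <stdint.h>

#define TARGET_SUPPORTS_TST 0   /* hypothetical backend capability bit */

enum cond { COND_EQ, COND_NE, COND_TSTEQ, COND_TSTNE };

/* Stand-in for a comparison op: condition plus two constant-folded args. */
struct cmp { enum cond c; uint64_t a, b; };

/*
 * If the backend lacks test comparisons, rewrite
 *     cond TSTEQ/TSTNE  a, b
 * as
 *     and  t, a, b
 *     cond EQ/NE  t, 0
 * Returns true if the op was rewritten (i.e. an AND must be emitted).
 */
static bool lower_tst(struct cmp *op)
{
    if (TARGET_SUPPORTS_TST ||
        (op->c != COND_TSTEQ && op->c != COND_TSTNE)) {
        return false;
    }
    op->a = op->a & op->b;   /* result of the AND the optimizer would emit */
    op->b = 0;
    op->c = (op->c == COND_TSTEQ) ? COND_EQ : COND_NE;
    return true;
}
```

The point of doing this in the optimizer rather than in each backend is exactly the one made above: backends without a native "test" instruction never see TSTEQ/TSTNE at all.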
On 1/7/24 04:43, Paolo Bonzini wrote:
> On Sat, Oct 28, 2023 at 9:45 PM Richard Henderson
> <richard.henderson@linaro.org> wrote:
>>
>> Expose a pair of comparison operators that map to the "test"
>> comparison that is available on many architectures.
>>
>> Changes for v2:
>>   * Add TCGCond to tcg_target_const_match.
>>     This fixes a long-standing issue with ppc and s390x backends,
>>     in that CMPI for signed comparisons has signed immediate and
>>     CMPLI for unsigned comparisons has unsigned immediate.
>>     But now allows different immediates for the TST comparisons.
>>   * tcg/i386: Generate TEST x,x for power-of-two in {7,15,31,63}.
>>   * tcg/i386: Generate BT n,x for other power-of-two.
>>   * tcg/ppc: tcg_target_const_match improvements
>>   * tcg/s390x: tcg_target_const_match improvements
>>   * target/m68k: Use TST{EQ,NE} for gen_fcc_cond.
>
> I updated the MIPS backend (untested though) and pushed the result to
> branch i386 of https://gitlab.com/bonzini/qemu/.

Thanks.

> However I was thinking: a lot of RISC targets simply do AND/ANDI
> followed by the sequence used for TCG_COND_NE. Would it make sense to
> have a TCG_TARGET_SUPPORTS_TST bit and, if absent, lower TSTEQ/TSTNE
> to AND+EQ/NE directly in the optimizer?

Probably best, yes.

> And for brcond2/setcond2, always using AND/AND/OR may work just as
> well as any backend-specific trick, and will give more freedom to the
> register allocator.

    test   a,b
    testeq c,e

for Arm32.  So I'll leave it to the backends.


r~
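[Editor's note: a minimal sketch of the generic AND/AND/OR expansion Paolo suggests for brcond2/setcond2, i.e. a double-word TSTNE evaluated on a 32-bit host with no flag-chaining trick. Function and variable names are illustrative only.]

```c
#include <stdbool.h>
#include <stdint.h>

/* TSTNE on a 64-bit value split into 32-bit halves (al/ah vs bl/bh). */
static bool tstne_i64_on_i32_host(uint32_t al, uint32_t ah,
                                  uint32_t bl, uint32_t bh)
{
    uint32_t t0 = al & bl;    /* and t0, al, bl */
    uint32_t t1 = ah & bh;    /* and t1, ah, bh */
    return (t0 | t1) != 0;    /* or  t0, t0, t1 ; setcond ne, t0, 0 */
}
```

On Arm32 the same predicate collapses into the two flag-setting instructions Richard shows above, which is why he prefers to leave the choice to the backends.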
On Mon, Jan 8, 2024 at 22:45, Richard Henderson
<richard.henderson@linaro.org> wrote:
> > I was thinking: a lot of RISC targets simply do AND/ANDI
> > followed by the sequence used for TCG_COND_NE. Would it make sense to
> > have a TCG_TARGET_SUPPORTS_TST bit and, if absent, lower TSTEQ/TSTNE
> > to AND+EQ/NE directly in the optimizer?
>
> Probably best, yes.

Ok, I will give it a shot.

> > And for brcond2/setcond2, always using AND/AND/OR may work just as
> > well as any backend-specific trick, and will give more freedom to the
> > register allocator.
>
>     test   a,b
>     testeq c,e
>
> for Arm32.  So I'll leave it to the backends.

Nice. :)

Paolo

> r~
On 1/9/24 09:55, Paolo Bonzini wrote:
> On Mon, Jan 8, 2024 at 22:45, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>
>>> I was thinking: a lot of RISC targets simply do AND/ANDI
>>> followed by the sequence used for TCG_COND_NE. Would it make sense to
>>> have a TCG_TARGET_SUPPORTS_TST bit and, if absent, lower TSTEQ/TSTNE
>>> to AND+EQ/NE directly in the optimizer?
>>
>> Probably best, yes.
>
> Ok, I will give it a shot.

I'm in the process now...


r~