Message ID | 20231101041132.174501-1-richard.henderson@linaro.org |
---|---|
Series | target/sparc: Cleanup condition codes etc |
On 01/11/2023 04:11, Richard Henderson wrote:

> This was part of my guess for some of the performance problems.
>
> I saw compute_all_sub quite high in the profile at some point, and I
> believe that the test case has a partially rotated loop such that "cmp"
> is in a delay slot, and so the gen_compare fast path for CC_OP_SUB is
> not visible to the conditional branch that uses the output of the compare.
> Which means that helper_compute_psr gets called more often than we'd like.
>
> Since almost all Sparc instructions that set cc also have a version of
> the instruction that does not set cc, we can trust that the compiler
> has only used the cc-setting version when it is actually required.
> Thus, unlike CISC processors, there is very little scope for optimization
> of the flags -- we might as well compute them immediately.
>
> Move away from CC_OP to explicit computation of conditions. This is
> modeled on target/arm for the (mostly) separate representation of the bits.
> We can pack icc.[NV] and xcc.[NV] into the same target_ulong, but Z and C
> cannot share. (For "normal" setting of Z, we could share, but it is
> possible to set xcc.Z and !icc.Z via explicit write to %ccr, and for
> that we have to have two variables.)
>
> After removing CC_OP, clean up the handling of conditions so that we can
> minimize additional setcond required for env->cond.
>
> Finally, inline some division, which can make use of the new out-of-line
> exception path, which means we can expand UDIVX and SDIVX with very few
> host insns. The 64/32 UDIV insn needs only a few more. Leave UDIVcc and
> SDIV* out of line, as the overflow and saturation computation in these
> cases is really too large to inline.
>
> r~

I've tested this series by running through my OpenBIOS boot tests for SPARC32
and SPARC64 and didn't spot any obvious regressions, so:

Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>

> Richard Henderson (21):
>   target/sparc: Introduce cpu_put_psr_icc
>   target/sparc: Split psr and xcc into components
>   target/sparc: Remove CC_OP_LOGIC
>   target/sparc: Remove CC_OP_DIV
>   target/sparc: Remove CC_OP_ADD, CC_OP_ADDX, CC_OP_TADD
>   target/sparc: Remove CC_OP_SUB, CC_OP_SUBX, CC_OP_TSUB
>   target/sparc: Remove CC_OP_TADDTV, CC_OP_TSUBTV
>   target/sparc: Remove CC_OP leftovers
>   target/sparc: Remove DisasCompare.is_bool
>   target/sparc: Change DisasCompare.c2 to int
>   target/sparc: Always copy conditions into a new temporary
>   target/sparc: Do flush_cond in advance_jump_cond
>   target/sparc: Merge gen_branch2 into advance_pc
>   target/sparc: Merge advance_jump_uncond_{never,always} into
>     advance_jump_cond
>   target/sparc: Pass displacement to advance_jump_cond
>   target/sparc: Merge gen_op_next_insn into only caller
>   target/sparc: Record entire jump condition in DisasContext
>   target/sparc: Discard cpu_cond at the end of each insn
>   target/sparc: Implement UDIVX and SDIVX inline
>   target/sparc: Implement UDIV inline
>   target/sparc: Check for invalid cond in gen_compare_reg
>
>  linux-user/sparc/target_cpu.h |   17 +-
>  target/sparc/cpu.h            |   58 +-
>  target/sparc/helper.h         |   12 +-
>  target/sparc/insns.decode     |    7 +-
>  linux-user/sparc/cpu_loop.c   |   11 +-
>  linux-user/sparc/signal.c     |    2 +-
>  target/sparc/cc_helper.c      |  471 ------------
>  target/sparc/cpu.c            |    1 -
>  target/sparc/helper.c         |  171 ++---
>  target/sparc/int32_helper.c   |    5 -
>  target/sparc/int64_helper.c   |    5 -
>  target/sparc/machine.c        |   45 +-
>  target/sparc/translate.c      | 1333 ++++++++++++++-------------------
>  target/sparc/win_helper.c     |   56 +-
>  target/sparc/meson.build      |    1 -
>  15 files changed, 789 insertions(+), 1406 deletions(-)
>  delete mode 100644 target/sparc/cc_helper.c
>

ATB,

Mark.
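
For readers following the cover letter's point about packing icc.[NV] and xcc.[NV] while keeping Z and C separate, here is a minimal standalone C sketch of that representation. It is not code from the series: the `Flags` struct, its field names, and the `flags_*` and `bit` helpers are hypothetical, and a plain `uint64_t` stands in for target_ulong on a 64-bit target. It computes the flags eagerly for a SUBcc-style subtract and shows why an explicit %ccr write needs two Z variables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag state; names are illustrative, not taken from the series. */
typedef struct {
    uint64_t cc_N;   /* bit 63 is xcc.N, bit 31 is icc.N */
    uint64_t cc_V;   /* bit 63 is xcc.V, bit 31 is icc.V */
    uint64_t xcc_Z;  /* xcc.Z is set iff this value is zero */
    uint32_t icc_Z;  /* icc.Z is set iff this value is zero */
    bool xcc_C;      /* borrow/carry out of bit 63 */
    bool icc_C;      /* borrow/carry out of bit 31 */
} Flags;

/* Extract bit n of v as 0 or 1. */
static unsigned bit(uint64_t v, unsigned n)
{
    return (unsigned)((v >> n) & 1);
}

/* Compute every flag immediately for a SUBcc-style subtract (a - b). */
static void flags_subcc(Flags *f, uint64_t a, uint64_t b)
{
    uint64_t r = a - b;

    f->cc_N = r;                           /* both sign bits in one word */
    f->cc_V = (a ^ b) & (a ^ r);           /* signed overflow at bits 63 and 31 */
    f->xcc_Z = r;
    f->icc_Z = (uint32_t)r;
    f->xcc_C = a < b;                      /* 64-bit borrow */
    f->icc_C = (uint32_t)a < (uint32_t)b;  /* 32-bit borrow */
}

/* Explicit %ccr write: xcc in bits 7:4, icc in bits 3:0, each ordered N,Z,V,C. */
static void flags_put_ccr(Flags *f, unsigned ccr)
{
    assert(ccr <= 0xff);
    f->cc_N = ((uint64_t)bit(ccr, 7) << 63) | ((uint64_t)bit(ccr, 3) << 31);
    f->cc_V = ((uint64_t)bit(ccr, 5) << 63) | ((uint64_t)bit(ccr, 1) << 31);
    f->xcc_Z = !bit(ccr, 6);  /* xcc.Z and icc.Z may be set independently, */
    f->icc_Z = !bit(ccr, 2);  /* which is why they cannot share a variable */
    f->xcc_C = bit(ccr, 4);
    f->icc_C = bit(ccr, 0);
}

/* Reassemble the 8-bit %ccr value from the separate components. */
static unsigned flags_get_ccr(const Flags *f)
{
    unsigned xcc = (bit(f->cc_N, 63) << 3) | ((unsigned)(f->xcc_Z == 0) << 2)
                 | (bit(f->cc_V, 63) << 1) | (unsigned)f->xcc_C;
    unsigned icc = (bit(f->cc_N, 31) << 3) | ((unsigned)(f->icc_Z == 0) << 2)
                 | (bit(f->cc_V, 31) << 1) | (unsigned)f->icc_C;

    return (xcc << 4) | icc;
}
```

With this layout, a subtract that leaves only the low 32 bits zero sets icc.Z but not xcc.Z, which a single shared result word could also express. A %ccr write of 0x40 (xcc.Z set, icc.Z clear) could not be expressed that way, and that case is what forces the separate `xcc_Z` and `icc_Z` fields in the sketch.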